Assessment of the Artificial Intelligence–Generated Fibromyalgia Information: Beyond the Hype
Mert Zure1, Ahmet Kıvanç Menekşeoğlu2
1 Department of Physical Medicine and Rehabilitation, University of Health Sciences İstanbul Kanuni Sultan Süleyman Training and Research Hospital, İstanbul, Türkiye
2 Department of Physical Medicine and Rehabilitation, MVZ Berlinomed, Berlin, Germany
Keywords: Artificial intelligence, fibromyalgia, health misinformation, supplementary resources, trends
Abstract
Background/Aims: Individuals increasingly turn to artificial intelligence (AI) chatbots for health-related information; however, the accuracy and usability of their responses remain uncertain. This study assessed the quality, comprehensiveness, and readability of responses to the most commonly searched fibromyalgia-related queries from 6 AI chatbots: ChatGPT-3.5, ChatGPT-4o (OpenAI), Copilot AI (Microsoft), Perplexity AI (Perplexity.AI), Gemini AI (Google), and ChatSonic AI (Writesonic).
Materials and Methods: The top 10 most frequently searched fibromyalgia-related questions from the past 2 years were retrieved from the Google Trends database. Each chatbot was queried separately, and a total of 60 responses (10 per chatbot) were assessed both qualitatively and quantitatively by 2 reviewers, focusing on content quality, accuracy, readability, and alignment with evidence-based guidelines.
Results: ChatGPT-3.5 had the lowest Ensuring Quality Information for Patients (EQIP) score (20.6 ± 4.5), indicating very low-quality information, while Gemini achieved the highest (40.5 ± 5), which was still classified as low quality. Understandability was moderate for Copilot, Gemini, and Perplexity (67.2) and lowest for ChatGPT-3.5 (43.2 ± 10.2). Actionability was weak, and the misinformation assessment revealed a moderate level of misinformation across all chatbots. Readability scores indicated university-level complexity, with ChatGPT-4o having the lowest Reading Ease score (11.3 ± 11.2) and Copilot the highest (30.3 ± 13.2).
Conclusion: While AI chatbots provide accessible health information, their accuracy and depth vary. Gemini, Copilot, and Perplexity AI showed better quality, but citation inconsistencies, readability challenges, and misinformation risks highlight the need for refinement beyond the hype. Clinicians should guide fibromyalgia patients in critically assessing AI-generated health content. Future research should explore improvements in AI chatbot applicability for medical inquiries.
Cite this article as: Zure M, Kıvanç Menekşeoğlu A. Assessment of the artificial intelligence–generated fibromyalgia information: Beyond the hype. Arch Rheumatol. 2025;40(3):358-364.