Published in Vol 12 (2024)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/55627.
Evaluating ChatGPT-4’s Diagnostic Accuracy: Impact of Visual Data Integration

Journals

  1. Hirosawa T, Harada Y, Mizuta K, Sakamoto T, Tokumasu K, Shimizu T. Diagnostic performance of generative artificial intelligences for a series of complex case reports. DIGITAL HEALTH 2024;10
  2. Hirosawa T, Harada Y, Tokumasu K, Ito T, Suzuki T, Shimizu T. Comparative Study to Evaluate the Accuracy of Differential Diagnosis Lists Generated by Gemini Advanced, Gemini, and Bard for a Case Report Series Analysis: An Experimental Study (Preprint). JMIR Medical Informatics 2024
  3. Liu C, Ho C, Wu T. Custom GPTs Enhancing Performance and Evidence Compared with GPT-3.5, GPT-4, and GPT-4o? A Study on the Emergency Medicine Specialist Examination. Healthcare 2024;12(17):1726
  4. Diniz‐Freitas M, Lago‐Méndez L, Limeres‐Posse J, Diz‐Dios P. Challenging ChatGPT‐4V for the Diagnosis of Oral Diseases and Conditions. Oral Diseases 2025;31(2):701
  5. Sun S, Chen K, Anavim S, Phillipi M, Yeh L, Huynh K, Cortes G, Tran J, Tran M, Yaghmai V, Houshyar R. Large Language Models with Vision on Diagnostic Radiology Board Exam Style Questions. Academic Radiology 2025;32(5):3096
  6. Hiredesai A, Martinez C, Anderson M, Howlett C, Unadkat K, Noland S. Is Artificial Intelligence the Future of Radiology? Accuracy of ChatGPT in Radiologic Diagnosis of Upper Extremity Bony Pathology. HAND 2024
  7. Yang X, Li T, Su Q, Liu Y, Kang C, Lyu Y, Zhao L, Nie Y, Pan Y. Application of large language models in disease diagnosis and treatment. Chinese Medical Journal 2025;138(2):130
  8. Saraiva M, Ribeiro T, Agudo B, Afonso J, Mendes F, Martins M, Cardoso P, Mota J, Almeida M, Costa A, Gonzalez Haba Ruiz M, Widmer J, Moura E, Javed A, Manzione T, Nadal S, Barroso L, de Parades V, Ferreira J, Macedo G. Evaluating ChatGPT-4 for the Interpretation of Images from Several Diagnostic Techniques in Gastroenterology. Journal of Clinical Medicine 2025;14(2):572
  9. Noda M, Takahara S, Hayashi S, Inui A, Oe K, Matsushita T. Evaluating ChatGPT’s Performance in Classifying Pertrochanteric Fractures Based on Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association (AO/OTA) Standards. Cureus 2025
  10. Nguyen H, Dang H, Nguyen T, Hoang V, Nguyen V, Wu J. Accuracy of latest large language models in answering multiple choice questions in dentistry: A comparative study. PLOS ONE 2025;20(1):e0317423
  11. Yang X, Li T, Wang H, Zhang R, Ni Z, Liu N, Zhai H, Zhao J, Meng F, Zhou Z, Tang S, Wang L, Wang X, Luo H, Ren G, Zhang L, Kang X, Wang J, Bo N, Yang X, Xue W, Zhang X, Chen N, Guo R, Li B, Li Y, Liu Y, Zhang T, Liang S, Lv Y, Nie Y, Fan D, Zhao L, Pan Y. Multiple large language models versus experienced physicians in diagnosing challenging cases with gastrointestinal symptoms. npj Digital Medicine 2025;8(1)
  12. Chiesa-Estomba C, Andueza-Guembe M, Maniaci A, Mayo-Yanez M, Betances-Reinoso F, Vaira L, Saibene A, Lechien J. Accuracy of ChatGPT-4o in Text and Video Analysis of Laryngeal Malignant and Premalignant Diseases. Journal of Voice 2025
  13. Aşar E, İpek İ, Bilge K. Customized GPT-4V(ision) for radiographic diagnosis: can large language model detect supernumerary teeth? BMC Oral Health 2025;25(1)
  14. Alyanak B, Çakar İ, Dede B, Yıldızgören M, Bağcıer F. Artificial intelligence vs human expertise: A comparison of plantar fascia thickness measurements through MRI imaging. International Journal of Medical Informatics 2025;203:105999
  15. Peng W, Cheng X, Deng J, Zhang X. ChatGPT Applications in Nursing: Current Status and Future Perspectives. Nursing Open 2025;12(6)
  16. Nguyen D, Kim G, Bedayat A. Evaluating ChatGPT's performance across radiology subspecialties: A meta-analysis of board-style examination accuracy and variability. Clinical Imaging 2025;125:110551
  17. Fukataki Y, Hayashi W, Nishimoto N, Ito Y, Kuo P. Developing artificial intelligence tools for institutional review board pre-review: A pilot study on ChatGPT’s accuracy and reproducibility. PLOS Digital Health 2025;4(6):e0000695
  18. Altinbilek E, Az A, Sogut O, Dogan Y, Akdemir T, Belen E, Biter H, Saricicek T, Ozcomlekci M, Kilic N. Task-specific versus general-purpose AI models in ECG analysis: A comparative study with emergency medicine specialists. The American Journal of Emergency Medicine 2025;95:220
  19. Kramer R. Comparing ChatGPT with human perceptions of illusory faces. Visual Cognition 2025;33(2):119
  20. Tai H, Kovarik C. ChatGPT-4’s Level of Dermatological Knowledge Based on Board Examination Review Questions and Bloom’s Taxonomy. JMIR Dermatology 2025;8:e74085
  21. Liu H, Ma C, Yang Y, Liao W, Wang Y. Strategies for enhancing PHC accessibility through mobile and capsule clinics: a spatial location allocation study in China. BMC Health Services Research 2025;25(1)
  22. Liu J, Zhang T, Ma Y, Hu T, Lin F, Liang H, Yang D, Pan Y, Gao D, Qiu L, Gao T. Generative artificial intelligence perspectives on typical landscape types: Can ChatGPT compete with human insight? Landscape and Urban Planning 2025;264:105479