Published in Vol 12 (2024)
This is a member publication of University of Pittsburgh
Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/55318.

Journals
- Fang Y, Ryan P, Weng C. Knowledge-guided generative artificial intelligence for automated taxonomy learning from drug labels. Journal of the American Medical Informatics Association 2024;31(9):2065
- Nwachukwu B, Varady N, Allen A, Dines J, Altchek D, Williams R, Kunze K. Currently Available Large Language Models Do Not Provide Musculoskeletal Treatment Recommendations That Are Concordant With Evidence-Based Clinical Practice Guidelines. Arthroscopy: The Journal of Arthroscopic & Related Surgery 2025;41(2):263
- Shahriar S, Lund B, Mannuru N, Arshad M, Hayawi K, Bevara R, Mannuru A, Batool L. Putting GPT-4o to the Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency. Applied Sciences 2024;14(17):7782
- Zaghir J, Naguib M, Bjelogrlic M, Névéol A, Tannier X, Lovis C. Prompt Engineering Paradigms for Medical Applications: Scoping Review. Journal of Medical Internet Research 2024;26:e60501
- Tong L, Zhang C, Liu R, Yang J, Sun Z. Comparative performance analysis of large language models: ChatGPT-3.5, ChatGPT-4 and Google Gemini in glucocorticoid-induced osteoporosis. Journal of Orthopaedic Surgery and Research 2024;19(1)
- Tam T, Sivarajkumar S, Kapoor S, Stolyar A, Polanska K, McCarthy K, Osterhoudt H, Wu X, Visweswaran S, Fu S, Mathur P, Cacciamani G, Sun C, Peng Y, Wang Y. A framework for human evaluation of large language models in healthcare derived from literature review. npj Digital Medicine 2024;7(1)
- Ronquillo J, Ye J, Gorman D, Lemeshow A, Watt S. Practical Aspects of Using Large Language Models to Screen Abstracts for Cardiovascular Drug Development: Cross-Sectional Study. JMIR Medical Informatics 2024;12:e64143
- Workman T, Ahmed A, Sheriff H, Raman V, Zhang S, Shao Y, Faselis C, Fonarow G, Zeng-Treitler Q. ChatGPT-4 extraction of heart failure symptoms and signs from electronic health records. Progress in Cardiovascular Diseases 2024;87:44
- Das M, Senapati A. Co-reference Resolution in Prompt Engineering. Procedia Computer Science 2024;244:194
- Othman A, Chemnad K, Tlili A, Da T, Wang H, Huang R. Comparative analysis of GPT-4, Gemini, and Ernie as gloss sign language translators in special education. Discover Global Society 2024;2(1)
- Acut D, Malabago N, Malicoban E, Galamiton N, Garcia M. “ChatGPT 4.0 Ghosted Us While Conducting Literature Search:” Modeling the Chatbot’s Generated Non-Existent References Using Regression Analysis. Internet Reference Services Quarterly 2025;29(1):27
- Cardamone N, Olfson M, Schmutte T, Ungar L, Liu T, Cullen S, Williams N, Marcus S. Classifying Unstructured Text in Electronic Health Records for Mental Health Prediction Models: Large Language Model Evaluation Study. JMIR Medical Informatics 2025;13:e65454
- Tarris G, Martin L. Performance assessment of ChatGPT 4, ChatGPT 3.5, Gemini Advanced Pro 1.5 and Bard 2.0 to problem solving in pathology in French language. DIGITAL HEALTH 2025;11
- Kuerbanjiang W, Peng S, Jiamaliding Y, Yi Y. Performance Evaluation of Large Language Models in Cervical Cancer Management Based on a Standardized Questionnaire: Comparative Study. Journal of Medical Internet Research 2025;27:e63626
- Geevarghese R, Solomon S, Alexander E, Marinelli B, Chatterjee S, Jain P, Cadley J, Hollingsworth A, Chatterjee A, Ziv E. Utility of a Large Language Model for Extraction of Clinical Findings from Healthcare Data following Lung Ablation: A Feasibility Study. Journal of Vascular and Interventional Radiology 2025;36(4):704
- Kim S, Schramm S, Adams L, Braren R, Bressem K, Keicher M, Platzek P, Paprottka K, Zimmer C, Hedderich D, Wiestler B. Benchmarking the diagnostic performance of open source LLMs in 1933 Eurorad case reports. npj Digital Medicine 2025;8(1)
- Fung M, Tang E, Wu T, Luk Y, Au I, Liu X, Lee V, Wong C, Wei Z, Cheng W, Tai I, Ho J, Wong J, Lang B, Leung K, Wong Z, Wu J, Wong C. Developing a named entity framework for thyroid cancer staging and risk level classification using large language models. npj Digital Medicine 2025;8(1)
- Valadez-de la Paz N, Vazquez-Lopez J, Hernandez-Lopez A, Aviles-Viñas J, Navarro-Gonzalez J, Reyes-Acosta A, Lopez-Juarez I. Automation Applied to the Collection and Generation of Scientific Literature. Publications 2025;13(1):11
- Burstein R, Mafuta E, Proctor J. Large language models for analyzing open text in global health surveys: why children are not accessing vaccine services in the Democratic Republic of the Congo. International Health 2025;17(5):843
- Talay L, Lagesen L, Yip A, Vickers M, Ahuja N. ChatGPT-4o and 4o1 Preview as Dietary Support Tools in a Real-World Medicated Obesity Program: A Prospective Comparative Analysis. Healthcare 2025;13(6):647
- Cao Y, Hu L, Cao X, Peng J. Can large language models facilitate the effective implementation of nursing processes in clinical settings?. BMC Nursing 2025;24(1)
- Lauderdale S, Schmitt R, Wuckovich B, Dalal N, Desai H, Tomlinson S. Effectiveness of generative AI-large language models’ recognition of veteran suicide risk: a comparison with human mental health providers using a risk stratification model. Frontiers in Psychiatry 2025;16
- Güvel M, Kıyak Y, Varan H, Sezenöz B, Coşkun Ö, Uluoğlu C. Generative AI vs. human expertise: a comparative analysis of case-based rational pharmacotherapy question generation. European Journal of Clinical Pharmacology 2025;81(6):875
- Lauderdale S, Griffin S, Lahman K, Mbaba E, Tomlinson S. Unveiling Public Stigma for Borderline Personality Disorder: A Comparative Study of Artificial Intelligence and Mental Health Care Providers. Personality and Mental Health 2025;19(2)
- Shen M, Shen Y, Liu F, Jin J. Prompts, privacy, and personalized learning: integrating AI into nursing education—a qualitative study. BMC Nursing 2025;24(1)
- Sumner J, Wang Y, Tan S, Chew E, Wenjun Yip A. Perspectives and Experiences With Large Language Models in Health Care: Survey Study. Journal of Medical Internet Research 2025;27:e67383
- Hickman C, Pridgen K, Hughes D, Pair L, Holland A. The Role of Artificial Intelligence in Increasing Efficiency, Reducing Errors, and Improving Patient Outcomes in Clinical Practice. Clinical Journal for Nurse Practitioners in Women's Health 2025;2(2):101
- Elabd N, Rahman Z, Abu Alinnin S, Jahan S, Campos L, Baltatu O. Designing Personalized Multimodal Mnemonics With AI: A Medical Student’s Implementation Tutorial. JMIR Medical Education 2025;11:e67926
- Hein D, Christie A, Holcomb M, Xie B, Jain A, Vento J, Rakheja N, Shakur A, Christley S, Cowell L, Brugarolas J, Jamieson A, Kapur P. Iterative refinement and goal articulation to optimize large language models for clinical information extraction. npj Digital Medicine 2025;8(1)
- Radi M, Omar N, Kaur W. Syntactic-Guided Chain of Thought for Iterative Implicit and Explicit Target Detection in Aspect-Based Sentiment Analysis. IEEE Access 2025;13:84738
- Thota D, Alt D, Cole J, Tring V. Prompting Pro Tips! Best Practices for Generating Clinical Narrative Summaries. Military Medicine 2025
- Miller K, Bedrick S, Lu Q, Wen A, Hersh W, Roberts K, Liu H. Dynamic few-shot prompting for clinical note section classification using lightweight, open-source large language models. Journal of the American Medical Informatics Association 2025;32(7):1164
- Fleurence R, Wang X, Bian J, Higashi M, Ayer T, Xu H, Dawoud D, Chhatwal J. A Taxonomy of Generative Artificial Intelligence in Health Economics and Outcomes Research: An ISPOR Working Group Report. Value in Health 2025;28(11):1601
- Boie S, Glastetter E, Lux M, Balzer F, von Kalle C, Lenz C, Müller U. Evaluating a Chatbot as a Companion for Patients With Breast Cancer: Collaborative Pilot Study. JMIR Cancer 2025;11:e68426
- Hwang M, Lee K, Lee H. A word to the wise: Crafting impactful prompts for ChatGPT. System 2025;133:103756
- Hassanein F, El Barbary A, Hussein R, Ahmed Y, El‐Guindy J, Sarhan S, Abou‐Bakr A. Diagnostic Performance of ChatGPT‐4o and DeepSeek‐3 Differential Diagnosis of Complex Oral Lesions: A Multimodal Imaging and Case Difficulty Analysis. Oral Diseases 2025
- Chen H, Alfred M, Cohen E. Efficient Detection of Stigmatizing Language in Electronic Health Records via In-Context Learning: Comparative Analysis and Validation Study. JMIR Medical Informatics 2025;13:e68955
- Pulari S, Umadevi M, Vasudevan S. Optimizing multimodal personalized disease prediction accuracy using generated prompts and large language models. Image and Vision Computing 2025;161:105649
- Bartels S, Carus J. From text to data: Open-source large language models in extracting cancer related medical attributes from German pathology reports. International Journal of Medical Informatics 2025;203:106022
- Kantor J. Generative Artificial Intelligence in Dermatology. Dermatologic Clinics 2025;43(4):603
- Garcia-Carmona A, Prieto M, Puertas E, Beunza J. Leveraging Large Language Models for Accurate Retrieval of Patient Information From Medical Reports: Systematic Evaluation Study. JMIR AI 2025;4:e68776
- Liu J, Liu F, Wang C, Liu S. Prompt Engineering in Clinical Practice: Tutorial for Clinicians. Journal of Medical Internet Research 2025;27:e72644
- Yao M, Chae A, Saraiya P, Kahn C, Witschey W, Gee J, Sagreiya H, Bastani O. Evaluating acute image ordering for real-world patient cases via language model alignment with radiological guidelines. Communications Medicine 2025;5(1)
- Qian Y. Prompt Engineering in Education: A Systematic Review of Approaches and Educational Applications. Journal of Educational Computing Research 2025;63(7-8):1782
- Bahng J. The Potential and Applications of Artificial Intelligence in the Field of Audiology. Audiology and Speech Research 2025;21(3):209
- Bandeira A, Gonçalves L, Holl F, Shaibu J, Gonçalves M, Payinda R, Paudel S, Berionni A, Purnat T, Mackey T. Viewpoint on the Intersection Among Health Information, Misinformation, and Generative AI Technologies. JMIR Infodemiology 2025;5:e69474
- Çakar M, Avcı A, Düzgün S, Aslan T, Hekimoğlu K. Assessment of the Accuracy of Modern Artificial Intelligence Chatbots in Responding to Endodontic Queries. Australian Endodontic Journal 2025
- Vieira-Vieira C, Kulkarni S, Zalewski A, Löffler J, Münch J, Kreuchwig A. From data silos to insights: the PRINCE multi-agent knowledge engine for preclinical drug development. Frontiers in Artificial Intelligence 2025;8
- Wang H, Bai X, Cui X, Chen G, Fan G, Wei G, Zheng Y, Wu J, Gao S. Symptom Recognition in Medical Conversations Via Multi-Instance Learning and Prompt. Journal of Medical Systems 2025;49(1)
- Li K, Nguyen T, Moss H. Performance of vision language models for optic disc swelling identification on fundus photographs. Frontiers in Digital Health 2025;7
- Feyijimi T, Aliu J, Oke A, Aghimien D. ChatGPT’s Expanding Horizons and Transformative Impact Across Domains: A Critical Review of Capabilities, Challenges, and Future Directions. Computers 2025;14(9):366
- Emilova Doneva S, de Viragh S, Hubarava H, Schandelmaier S, Briel M, Ineichen B. StudyTypeTeller—Large language models to automatically classify research study types for systematic reviews. Research Synthesis Methods 2025:1
- Raghavendran A, Musunuri B, Rajpurohit S, C. G, Shetty S, Kumari P, Shetty R, Shetty A, Bhat G. Evaluation of artificial intelligence-based patient education models for irritable bowel syndrome. Indian Journal of Gastroenterology 2025
- Dilbaz O, Ozates M, Bolat B, Gunduz-Demir C, Kulac I. Systematic comparison of GPT models for the analysis of pathology reports in a low-resource language: A case study for Turkish. American Journal of Clinical Pathology 2025
- Alter I, Chan K, Andreadis K, Rameau A. Generative Artificial Intelligence Methodology Reporting in Otolaryngology: A Scoping Review. The Laryngoscope 2025
- Le A, Shvekher T, Nguyen L, Krylov S. A Conversational Large‐Language‐Model Tutor that Accelerates Machine‐Learning Method Development in Routine Bioanalytical Workflows. ChemBioChem 2025;26(21)
- Jin R, Zhao M, Niu C, Xia Y, Zhou H, Liu N. Evaluating the performance of ChatGPT and Claude in automated writing scoring: Insights from the Many-facet Rasch model. Education and Information Technologies 2025
- Haupt F, Rödig T, Liersch P. Evaluating ChatGPT-4o as an Educational Support Tool for the Emergency Management of Dental Trauma: A Randomized Controlled Study among Students (Preprint). JMIR Medical Education 2025
- Daulat S, Dholaria N, Burnet G, Patil S, Manne B, Choudhary A, Mitha R, Zeeshan Q, Hamilton D, Agarwal N. Prompt Engineering and Follow-Up Questioning Improves the Readability of Spine Surgery Questions in Large Language Models. World Neurosurgery 2025;203:124423
- Ardila C, Pineda-Vélez E, Vivares-Builes A. Artificial Intelligence in Endodontic Education: A Systematic Review with Frequentist and Bayesian Meta-Analysis of Student-Based Evidence. Dentistry Journal 2025;13(11):489
- Ordak M, Adamczyk J, Oskroba A, Majewski M, Nasierowski T. Evaluation of the Accuracy and Reliability of Responses Generated by Artificial Intelligence Related to Clinical Pharmacology. Journal of Clinical Medicine 2025;14(21):7563
- Duque A, Araujo L, Martinez-Romo J, Esteban-Vasallo M, Domínguez-Berjón M, Malillos Perez D. An integrated approach for rare disease detection and classification in Spanish pediatric medical reports. Scientific Reports 2025;15(1)
- Asif S, Hadi F, Qurrat-ul-ain, Yan Y, Wang V, Xu D. The impact of large language models on medical research and patient care: A systematic review of current trends, challenges, and future innovations. Computer Science Review 2026;59:100847
- Di Maio F, Gozzi M. Degradation of Multi-Task Prompting Across Six NLP Tasks and LLM Families. Electronics 2025;14(21):4349
- Vasilev Y, Vladzymyrskyy A, Omelyanskaya O, Alymova Y, Akhmedzyanova D, Shumskaya Y, Kodenko M, Blokhin I, Reshetnikov R. Development and Validation of a Questionnaire to Evaluate AI-Generated Summaries for Radiologists: ELEGANCE (Expert-Led Evaluation of Generative AI Competence and ExcelleNCE). AI 2025;6(11):287
- Shawi R, Jamel L. Leveraging ChatGPT and explainable AI for enhancing clinical decision support. Scientific Reports 2025;15(1)
- Albosaif W, Aljughaiman A, Alsayed A. The role of diversity in enrichment programs in shaping the career paths of gifted individuals: An analysis of influential factors and emerging trends. International Journal of ADVANCED AND APPLIED SCIENCES 2025;12(11):82
- Koo S, Choi K. Unstructured Medical Data Entry System using Gaussian Probabilities and Large Language Models. The Journal of Korean Institute of Information Technology 2025;23(10):23
Conference Proceedings
- García-Barragán Á, Calatayud A, Prieto-Santamaría L, Robles V, Menasalvas E, Rodríguez A. 2024 IEEE 37th International Symposium on Computer-Based Medical Systems (CBMS). Step-forward structuring disease phenotypic entities with LLMs for disease understanding
- Teng S, Zhang T, D'Alfonso S, Kostakos V. Companion of the 2024 on ACM International Joint Conference on Pervasive and Ubiquitous Computing. Predicting Affective States from Screen Text Sentiment
- Maceda L. 2024 International Conference on Computer and Applications (ICCA). Enhanced Sentiment Classification in Code-Mixed Texts Using Hybrid Embeddings and Synthetic Data Generation
- Weerathunge T, Jayalal S, Wijayasiriwardhane K. 2025 5th International Conference on Advanced Research in Computing (ICARC). Optimizing Response Consistency of Large Language Models in Medical Education through Prompt Engineering
- Arabzadeh N, Bagheri E. Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval. VAP3: Variation-Aware Prompt Performance Prediction
