Published in Vol 12 (2024)
This is a member publication of University of Pittsburgh
Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/55318.

Journals
- Fang Y, Ryan P, Weng C. Knowledge-guided generative artificial intelligence for automated taxonomy learning from drug labels. Journal of the American Medical Informatics Association 2024;31(9):2065
- Nwachukwu B, Varady N, Allen A, Dines J, Altchek D, Williams R, Kunze K. Currently Available Large Language Models Do Not Provide Musculoskeletal Treatment Recommendations That Are Concordant With Evidence-Based Clinical Practice Guidelines. Arthroscopy: The Journal of Arthroscopic & Related Surgery 2025;41(2):263
- Shahriar S, Lund B, Mannuru N, Arshad M, Hayawi K, Bevara R, Mannuru A, Batool L. Putting GPT-4o to the Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency. Applied Sciences 2024;14(17):7782
- Zaghir J, Naguib M, Bjelogrlic M, Névéol A, Tannier X, Lovis C. Prompt Engineering Paradigms for Medical Applications: Scoping Review. Journal of Medical Internet Research 2024;26:e60501
- Tong L, Zhang C, Liu R, Yang J, Sun Z. Comparative performance analysis of large language models: ChatGPT-3.5, ChatGPT-4 and Google Gemini in glucocorticoid-induced osteoporosis. Journal of Orthopaedic Surgery and Research 2024;19(1)
- Tam T, Sivarajkumar S, Kapoor S, Stolyar A, Polanska K, McCarthy K, Osterhoudt H, Wu X, Visweswaran S, Fu S, Mathur P, Cacciamani G, Sun C, Peng Y, Wang Y. A framework for human evaluation of large language models in healthcare derived from literature review. npj Digital Medicine 2024;7(1)
- Ronquillo J, Ye J, Gorman D, Lemeshow A, Watt S. Practical Aspects of Using Large Language Models to Screen Abstracts for Cardiovascular Drug Development: Cross-Sectional Study. JMIR Medical Informatics 2024;12:e64143
- Workman T, Ahmed A, Sheriff H, Raman V, Zhang S, Shao Y, Faselis C, Fonarow G, Zeng-Treitler Q. ChatGPT-4 extraction of heart failure symptoms and signs from electronic health records. Progress in Cardiovascular Diseases 2024;87:44
- Das M, Senapati A. Co-reference Resolution in Prompt Engineering. Procedia Computer Science 2024;244:194
- Othman A, Chemnad K, Tlili A, Da T, Wang H, Huang R. Comparative analysis of GPT-4, Gemini, and Ernie as gloss sign language translators in special education. Discover Global Society 2024;2(1)
- Acut D, Malabago N, Malicoban E, Galamiton N, Garcia M. “ChatGPT 4.0 Ghosted Us While Conducting Literature Search:” Modeling the Chatbot’s Generated Non-Existent References Using Regression Analysis. Internet Reference Services Quarterly 2025;29(1):27
- Cardamone N, Olfson M, Schmutte T, Ungar L, Liu T, Cullen S, Williams N, Marcus S. Classifying Unstructured Text in Electronic Health Records for Mental Health Prediction Models: Large Language Model Evaluation Study. JMIR Medical Informatics 2025;13:e65454
- Tarris G, Martin L. Performance assessment of ChatGPT 4, ChatGPT 3.5, Gemini Advanced Pro 1.5 and Bard 2.0 to problem solving in pathology in French language. DIGITAL HEALTH 2025;11
- Kuerbanjiang W, Peng S, Jiamaliding Y, Yi Y. Performance Evaluation of Large Language Models in Cervical Cancer Management Based on a Standardized Questionnaire: Comparative Study. Journal of Medical Internet Research 2025;27:e63626
- Geevarghese R, Solomon S, Alexander E, Marinelli B, Chatterjee S, Jain P, Cadley J, Hollingsworth A, Chatterjee A, Ziv E. Utility of a Large Language Model for Extraction of Clinical Findings from Healthcare Data following Lung Ablation: A Feasibility Study. Journal of Vascular and Interventional Radiology 2025;36(4):704
- Kim S, Schramm S, Adams L, Braren R, Bressem K, Keicher M, Platzek P, Paprottka K, Zimmer C, Hedderich D, Wiestler B. Benchmarking the diagnostic performance of open source LLMs in 1933 Eurorad case reports. npj Digital Medicine 2025;8(1)
- Fung M, Tang E, Wu T, Luk Y, Au I, Liu X, Lee V, Wong C, Wei Z, Cheng W, Tai I, Ho J, Wong J, Lang B, Leung K, Wong Z, Wu J, Wong C. Developing a named entity framework for thyroid cancer staging and risk level classification using large language models. npj Digital Medicine 2025;8(1)
- Valadez-de la Paz N, Vazquez-Lopez J, Hernandez-Lopez A, Aviles-Viñas J, Navarro-Gonzalez J, Reyes-Acosta A, Lopez-Juarez I. Automation Applied to the Collection and Generation of Scientific Literature. Publications 2025;13(1):11
- Burstein R, Mafuta E, Proctor J. Large language models for analyzing open text in global health surveys: why children are not accessing vaccine services in the Democratic Republic of the Congo. International Health 2025
- Talay L, Lagesen L, Yip A, Vickers M, Ahuja N. ChatGPT-4o and 4o1 Preview as Dietary Support Tools in a Real-World Medicated Obesity Program: A Prospective Comparative Analysis. Healthcare 2025;13(6):647
- Cao Y, Hu L, Cao X, Peng J. Can large language models facilitate the effective implementation of nursing processes in clinical settings? BMC Nursing 2025;24(1)
- Lauderdale S, Schmitt R, Wuckovich B, Dalal N, Desai H, Tomlinson S. Effectiveness of generative AI-large language models’ recognition of veteran suicide risk: a comparison with human mental health providers using a risk stratification model. Frontiers in Psychiatry 2025;16
- Güvel M, Kıyak Y, Varan H, Sezenöz B, Coşkun Ö, Uluoğlu C. Generative AI vs. human expertise: a comparative analysis of case-based rational pharmacotherapy question generation. European Journal of Clinical Pharmacology 2025;81(6):875
- Lauderdale S, Griffin S, Lahman K, Mbaba E, Tomlinson S. Unveiling Public Stigma for Borderline Personality Disorder: A Comparative Study of Artificial Intelligence and Mental Health Care Providers. Personality and Mental Health 2025;19(2)
- Shen M, Shen Y, Liu F, Jin J. Prompts, privacy, and personalized learning: integrating AI into nursing education—a qualitative study. BMC Nursing 2025;24(1)
- Sumner J, Wang Y, Tan S, Chew E, Wenjun Yip A. Perspectives and Experiences With Large Language Models in Health Care: Survey Study. Journal of Medical Internet Research 2025;27:e67383
- Hickman C, Pridgen K, Hughes D, Pair L, Holland A. The Role of Artificial Intelligence in Increasing Efficiency, Reducing Errors, and Improving Patient Outcomes in Clinical Practice. Clinical Journal for Nurse Practitioners in Women's Health 2025;2(2):101
- Elabd N, Rahman Z, Abu Alinnin S, Jahan S, Campos L, Baltatu O. Designing Personalized Multimodal Mnemonics With AI: A Medical Student’s Implementation Tutorial. JMIR Medical Education 2025;11:e67926
- Hein D, Christie A, Holcomb M, Xie B, Jain A, Vento J, Rakheja N, Shakur A, Christley S, Cowell L, Brugarolas J, Jamieson A, Kapur P. Iterative refinement and goal articulation to optimize large language models for clinical information extraction. npj Digital Medicine 2025;8(1)
- Radi M, Omar N, Kaur W. Syntactic-Guided Chain of Thought for Iterative Implicit and Explicit Target Detection in Aspect-Based Sentiment Analysis. IEEE Access 2025;13:84738
- Thota D, Alt D, Cole J, Tring V. Prompting Pro Tips! Best Practices for Generating Clinical Narrative Summaries. Military Medicine 2025
- Miller K, Bedrick S, Lu Q, Wen A, Hersh W, Roberts K, Liu H. Dynamic few-shot prompting for clinical note section classification using lightweight, open-source large language models. Journal of the American Medical Informatics Association 2025;32(7):1164
- Fleurence R, Wang X, Bian J, Higashi M, Ayer T, Xu H, Dawoud D, Chhatwal J. A Taxonomy of Generative Artificial Intelligence in Health Economics and Outcomes Research: An ISPOR Working Group Report. Value in Health 2025
- Boie S, Glastetter E, Lux M, Balzer F, von Kalle C, Lenz C, Müller U. Evaluating a Chatbot as a Companion for Patients With Breast Cancer: Collaborative Pilot Study. JMIR Cancer 2025;11:e68426
- Hwang M, Lee K, Lee H. A word to the wise: Crafting impactful prompts for ChatGPT. System 2025;133:103756
- Hassanein F, El Barbary A, Hussein R, Ahmed Y, El‐Guindy J, Sarhan S, Abou‐Bakr A. Diagnostic Performance of ChatGPT‐4o and DeepSeek‐3 Differential Diagnosis of Complex Oral Lesions: A Multimodal Imaging and Case Difficulty Analysis. Oral Diseases 2025
- Chen H, Alfred M, Cohen E. Efficient Detection of Stigmatizing Language in Electronic Health Records via In-Context Learning: Comparative Analysis and Validation Study. JMIR Medical Informatics 2025;13:e68955
- Pulari S, Umadevi M, Vasudevan S. Optimizing multimodal personalized disease prediction accuracy using generated prompts and large language models. Image and Vision Computing 2025;161:105649
- Bartels S, Carus J. From text to data: Open-source large language models in extracting cancer related medical attributes from German pathology reports. International Journal of Medical Informatics 2025;203:106022
- Kantor J. Generative Artificial Intelligence in Dermatology. Dermatologic Clinics 2025
- Garcia-Carmona A, Prieto M, Puertas E, Beunza J. Leveraging Large Language Models for Accurate Retrieval of Patient Information From Medical Reports: Systematic Evaluation Study. JMIR AI 2025;4:e68776
- Liu J, Wang C, Liu S. Prompt Engineering in Clinical Practice: A Tutorial for Clinicians (Preprint). Journal of Medical Internet Research 2025
- Yao M, Chae A, Saraiya P, Kahn C, Witschey W, Gee J, Sagreiya H, Bastani O. Evaluating acute image ordering for real-world patient cases via language model alignment with radiological guidelines. Communications Medicine 2025;5(1)
- Qian Y. Prompt Engineering in Education: A Systematic Review of Approaches and Educational Applications. Journal of Educational Computing Research 2025
- Bahng J. The Potential and Applications of Artificial Intelligence in the Field of Audiology. Audiology and Speech Research 2025;21(3):209
- Bandeira A, Gonçalves L, Holl F, Shaibu J, Gonçalves M, Payinda R, Paudel S, Berionni A, Purnat T, Mackey T. Viewpoint on the Intersection Between Health Information, Misinformation, and Generative AI Technologies (Preprint). JMIR Infodemiology 2024
- Çakar M, Avcı A, Düzgün S, Aslan T, Hekimoğlu K. Assessment of the Accuracy of Modern Artificial Intelligence Chatbots in Responding to Endodontic Queries. Australian Endodontic Journal 2025
- Vieira-Vieira C, Kulkarni S, Zalewski A, Löffler J, Münch J, Kreuchwig A. From data silos to insights: the PRINCE multi-agent knowledge engine for preclinical drug development. Frontiers in Artificial Intelligence 2025;8
- Wang H, Bai X, Cui X, Chen G, Fan G, Wei G, Zheng Y, Wu J, Gao S. Symptom Recognition in Medical Conversations Via multi-Instance Learning and Prompt. Journal of Medical Systems 2025;49(1)
- Li K, Nguyen T, Moss H. Performance of vision language models for optic disc swelling identification on fundus photographs. Frontiers in Digital Health 2025;7
Books/Policy Documents
Conference Proceedings
- García-Barragán Á, Calatayud A, Prieto-Santamaría L, Robles V, Menasalvas E, Rodríguez A. 2024 IEEE 37th International Symposium on Computer-Based Medical Systems (CBMS). Step-forward structuring disease phenotypic entities with LLMs for disease understanding
- Teng S, Zhang T, D'Alfonso S, Kostakos V. Companion of the 2024 on ACM International Joint Conference on Pervasive and Ubiquitous Computing. Predicting Affective States from Screen Text Sentiment
- Maceda L. 2024 International Conference on Computer and Applications (ICCA). Enhanced Sentiment Classification in Code-Mixed Texts Using Hybrid Embeddings and Synthetic Data Generation
- Weerathunge T, Jayalal S, Wijayasiriwardhane K. 2025 5th International Conference on Advanced Research in Computing (ICARC). Optimizing Response Consistency of Large Language Models in Medical Education through Prompt Engineering
- Arabzadeh N, Bagheri E. Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval. VAP3: Variation-Aware Prompt Performance Prediction