Published on 16.May.2025 in Vol 13 (2025)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/66917, first published 26.Sep.2024.

Benchmarking the Confidence of Large Language Models in Answering Clinical Questions: Cross-Sectional Evaluation Study

Mahmud Omar¹*, MD; Reem Agbareia², MD; Benjamin S Glicksberg¹, MD; Girish N Nadkarni¹*, MD; Eyal Klang¹*, MD

¹ Division of Data-Driven and Digital Medicine (D3M), Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States

² Ophthalmology Department, Hadassah Medical Center, Jerusalem, Israel

*These authors contributed equally

Corresponding Author:

Mahmud Omar, MD
Division of Data-Driven and Digital Medicine (D3M), Department of Medicine
Icahn School of Medicine at Mount Sinai
Gustave L. Levy Place New York
New York, NY 10029
United States
Phone: 1 212 241 6500
Email: mahmudomar70@gmail.com