Published in Vol 13 (2025)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/66917.
Benchmarking the Confidence of Large Language Models in Answering Clinical Questions: Cross-Sectional Evaluation Study

Mahmud Omar 1*, MD; Reem Agbareia 2, MD; Benjamin S Glicksberg 1, MD; Girish N Nadkarni 1*, MD; Eyal Klang 1*, MD

1 Division of Data-Driven and Digital Medicine (D3M), Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States

2 Ophthalmology Department, Hadassah Medical Center, Jerusalem, Israel

*these authors contributed equally

Corresponding Author:

  • Mahmud Omar, MD
  • Division of Data-Driven and Digital Medicine (D3M), Department of Medicine
  • Icahn School of Medicine at Mount Sinai
  • Gustave L. Levy Place New York
  • New York, NY 10029
  • United States
  • Phone: 1 212 241 6500
  • Email: mahmudomar70@gmail.com