%0 Journal Article
%@ 2291-9694
%I JMIR Publications
%V 13
%N 
%P e64682
%T GPT-3.5 Turbo and GPT-4 Turbo in Title and Abstract Screening for Systematic Reviews
%A Oami,Takehiko
%A Okada,Yohei
%A Nakada,Taka-aki
%K large language models
%K citation screening
%K systematic review
%K clinical practice guidelines
%K artificial intelligence
%K sepsis
%K AI
%K review
%K GPT
%K screening
%K citations
%K critical care
%K Japan
%K Japanese
%K accuracy
%K efficiency
%K reliability
%K LLM
%D 2025
%7 12.3.2025
%9 
%J JMIR Med Inform
%G English
%X In citation screening for systematic reviews, GPT-4 Turbo showed higher specificity than GPT-3.5 Turbo (0.98 vs 0.51) and comparable sensitivity (0.85 vs 0.83), whereas GPT-3.5 Turbo processed 100 studies faster (0.9 min vs 1.6 min). These findings suggest that GPT-4 Turbo may be the more suitable model because of its higher specificity, and they highlight the potential of large language models to optimize literature selection.
%R 10.2196/64682
%U https://medinform.jmir.org/2025/1/e64682
%U https://doi.org/10.2196/64682