Search Results (1 to 10 of 1370 Results)


Evaluation of Large Language Models in Tailoring Educational Content for Cancer Survivors and Their Caregivers: Quality Analysis

requirements for the LLMs, to produce educational content on 30 selected cancer care topics written at a sixth-grade reading level, limited to 250 words, and translated into Spanish and Chinese; (2) generating tailored educational content using 3 GPT models (GPT-3.5 Turbo, GPT-4, and GPT-4 Turbo) with 2 prompt styles (textual and bulleted); (3) expert evaluation of the generated content’s adherence to word count, reading level, and 5 quality criteria; and (4) statistical analyses (ANOVA [analysis of variance] and chi-square

Darren Liu, Xiao Hu, Canhua Xiao, Jinbing Bai, Zahra A Barandouzi, Stephanie Lee, Caitlin Webster, La-Urshalar Brock, Lindsay Lee, Delgersuren Bold, Yufen Lin

JMIR Cancer 2025;11:e67914

Large Language Models in Summarizing Radiology Report Impressions for Lung Cancer in Chinese: Evaluation Study

For instance, Liu et al [22] assessed 29 different LLMs for generating impressions from radiological reports, using open-source datasets such as the MIMIC-CXR and Open I datasets. However, their study was limited to English x-ray reports and relied solely on automatic quantitative evaluation.

Danqing Hu, Shanyuan Zhang, Qing Liu, Xiaofeng Zhu, Bing Liu

J Med Internet Res 2025;27:e65547

A Remote Intervention Based on mHealth and Community Health Workers for Antiretroviral Therapy Adherence in People With HIV: Pilot Randomized Controlled Trial

Statistical significance for differences between the control and intervention groups was assessed using the Mann-Whitney U test for continuous variables, such as age, and the chi-square test for the remaining categorical variables. A 2-sample z test for equality of proportions, with a continuity correction, was used to assess whether the differences in retention rates between the intervention and control groups were statistically significant.

Shivesh Shourya, Jianfang Liu, Sophia McInerney, Trinity Casimir, James Kenniff, Trace Kershaw, David Batey, Rebecca Schnall

JMIR Form Res 2025;9:e67997

Impact of Digital Engagement on Weight Loss Outcomes in Obesity Management Among Individuals Using GLP-1 and Dual GLP-1/GIP Receptor Agonist Therapy: Retrospective Cohort Service Evaluation Study

Chi-square tests were used to compare the proportion of participants achieving clinically significant weight loss (≥5%, ≥10%, and ≥15% of baseline weight) between the engaged and nonengaged groups. In cases where small sample sizes or low expected frequencies were present (ie, n […] To assess the time-to-event data for achieving clinically significant weight loss, Kaplan-Meier survival analysis was performed, comparing the time to reach ≥5%, ≥10%, and ≥15% weight loss between the engaged and nonengaged groups.

Hans Johnson, David Huang, Vivian Liu, Mahmoud Al Ammouri, Christopher Jacobs, Austen El-Osta

J Med Internet Res 2025;27:e69466

Ability of ChatGPT to Replace Doctors in Patient Education: Cross-Sectional Comparative Analysis of Inflammatory Bowel Disease

We conducted descriptive analysis and assessed evaluators’ preference ratios for ChatGPT using a chi-square goodness-of-fit test. A 2-tailed Welch t test was used to compare the mean values of the 2 responses. We defined a threshold score of 3 (acceptable) and calculated the proportions exceeding or falling below this threshold, comparing them using prevalence ratios.

Zelin Yan, Jingwen Liu, Yihong Fan, Shiyuan Lu, Dingting Xu, Yun Yang, Honggang Wang, Jie Mao, Hou-Chiang Tseng, Tao-Hsing Chang, Yan Chen

J Med Internet Res 2025;27:e62857