TY - JOUR AU - Jia, Shafeng AU - Zhu, Naifeng AU - Liu, Jia AU - Cheng, Niankai AU - Jiang, Ling AU - Yang, Jing PY - 2025/4/7 TI - Construction and Application of an Information Closed-Loop Management System for Maternal and Neonatal Access and Exit Rooms: Non Randomized Controlled Trial JO - JMIR Med Inform SP - e66451 VL - 13 KW - mother-infant same-room management KW - information-based identity verification KW - closed-loop management system KW - newborn safety management N2 - Background: Traditional management methods can no longer meet the demand for efficient and accurate neonatal care. There is a need for an information-based and intelligent management system. Objective: This study aimed to construct an information closed-loop management system to improve the accuracy of identification in mother-infant rooming-in care units and enhance the efficiency of infant admission and discharge management. Methods: Mothers who delivered between January 2023 and June 2023 were assigned to the control group (n=200), while those who delivered between July 2023 and May 2024 were assigned to the research group (n=200). The control group adopted traditional management methods, whereas the research group implemented closed-loop management. Barcode technology, a wireless network, mobile terminals, and other information technology equipment were used to complete the closed loop of newborn exit and entry management. Data on the satisfaction of mothers and their families, the monthly average qualification rate of infant identity verification, and the qualification rate of infant consultation time were collected and statistically analyzed before and after the closed-loop process was implemented. Results: After the closed-loop process was implemented, the monthly average qualification rate of infant identity verification increased to 99.45 (SD 1.34), significantly higher than that of the control group before implementation, 83.58 (SD 1.92) (P=.02).
The satisfaction of mothers and their families was 96.45 (SD 3.32), higher than that of the control group before the closed-loop process was implemented, 92.82 (SD 4.73) (P=.01). Additionally, the separation time between infants and mothers was restricted to under 1 hour. Conclusion: The construction and application of the information closed-loop management system significantly improved the accuracy and efficiency of maternal and infant identity verification, enhancing the safety of newborns. UR - https://medinform.jmir.org/2025/1/e66451 UR - http://dx.doi.org/10.2196/66451 ID - info:doi/10.2196/66451 ER - TY - JOUR AU - Zheng, Rui AU - Jiang, Xiao AU - Shen, Li AU - He, Tianrui AU - Ji, Mengting AU - Li, Xingyi AU - Yu, Guangjun PY - 2025/4/7 TI - Investigating Clinicians' Intentions and Influencing Factors for Using an Intelligence-Enabled Diagnostic Clinical Decision Support System in Health Care Systems: Cross-Sectional Survey JO - J Med Internet Res SP - e62732 VL - 27 KW - artificial intelligence KW - clinical decision support systems KW - task-technology fit KW - technology acceptance model KW - perceived risk KW - performance expectations KW - intention to use N2 - Background: An intelligence-enabled clinical decision support system (CDSS) is a computerized system that integrates medical knowledge, patient data, and clinical guidelines to assist health care providers in making clinical decisions. Research studies have shown that CDSS utilization rates have not met expectations. Clinicians' intentions and their attitudes determine the use and promotion of CDSS in clinical practice. Objective: The aim of this study was to enhance the successful utilization of CDSS by analyzing the pivotal factors that influence clinicians' intentions to adopt it and by putting forward targeted management recommendations.
Methods: This study proposed a research model grounded in the task-technology fit model and the technology acceptance model, which was then tested through a cross-sectional survey. The measurement instrument comprised demographic characteristics, multi-item scales, and an open-ended query regarding areas where clinicians perceived the system required improvement. We leveraged structural equation modeling to assess the direct and indirect effects of "task-technology fit" and "perceived ease of use" on clinicians' intentions to use the CDSS when mediated by "performance expectation" and "perceived risk." We collated and analyzed the responses to the open-ended question. Results: We collected a total of 247 questionnaires. The model explained 65.8% of the variance in use intention. Performance expectations (β=0.228; P<.001) and perceived risk (β=−0.579; P<.001) were both significant predictors of use intention. Task-technology fit (β=−0.281; P<.001) and perceived ease of use (β=−0.377; P<.001) negatively affected perceived risk. Perceived risk (β=−0.308; P<.001) negatively affected performance expectations. Task-technology fit positively affected perceived ease of use (β=0.692; P<.001) and performance expectations (β=0.508; P<.001). Task characteristics (β=0.168; P<.001) and technology characteristics (β=0.749; P<.001) positively affected task-technology fit. Contrary to expectations, perceived ease of use (β=0.108; P=.07) did not have a significant impact on use intention. From the open-ended question, 3 main themes emerged regarding clinicians' perceived deficiencies in CDSS: system security risks, personalized interaction, and seamless integration. Conclusions: Perceived risk and performance expectations were direct determinants of clinicians' adoption of CDSS, significantly influenced by task-technology fit and perceived ease of use. In the future, increasing transparency within CDSS and fostering trust between clinicians and technology should be prioritized.
Furthermore, focusing on personalized interactions and ensuring seamless integration into clinical workflows are crucial steps moving forward. UR - https://www.jmir.org/2025/1/e62732 UR - http://dx.doi.org/10.2196/62732 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/62732 ER - TY - JOUR AU - Amidei, Jacopo AU - Nieto, Rubén AU - Kaltenbrunner, Andreas AU - Ferreira De Sá, Gregorio Jose AU - Serrat, Mayte AU - Albajes, Klara PY - 2025/3/31 TI - Exploring the Capacity of Large Language Models to Assess the Chronic Pain Experience: Algorithm Development and Validation JO - J Med Internet Res SP - e65903 VL - 27 KW - large language models KW - fibromyalgia KW - chronic pain KW - written narratives KW - pain narratives KW - automated assessment KW - pain severity KW - pain disability N2 - Background: Chronic pain, affecting more than 20% of the global population, has an enormous pernicious impact on individuals as well as economic ramifications at both the health and social levels. Accordingly, tools that enhance pain assessment can considerably impact people suffering from pain and society at large. In this context, assessment methods based on individuals' personal experiences, such as written narratives (WNs), offer relevant insights into understanding pain from a personal perspective. This approach can uncover subjective, intricate, and multifaceted aspects that standardized questionnaires can overlook. However, WNs can be time-consuming for clinicians. Therefore, a tool that uses WNs while reducing the time required for their evaluation could have a significantly beneficial impact on people's pain assessment. Objective: This study is the first evaluation of the potential of applying large language models (LLMs) to assist clinicians in assessing patients' pain expressed through WNs. Methods: We performed an experiment based on 43 WNs made by people with fibromyalgia and qualitatively evaluated in a prior study.
Focusing on pain severity and disability, we prompt GPT-4 (with temperature parameter settings 0 or 1) to assign scores and scores' explanations to these WNs. Then, we quantitatively compare GPT-4 scores with experts' scores of the same narratives, using statistical measures such as Pearson correlations, root mean squared error, the weighted version of the Gwet agreement coefficient, and Krippendorff α. Additionally, 2 experts specialized in chronic pain conducted a qualitative analysis of the scores' explanation to assess their accuracy and potential applicability of GPT's analysis for future pain narrative evaluations. Results: Our analysis reveals that GPT-4's performance in assessing pain narratives yielded promising results. GPT-4 was comparable in terms of agreement with experts (with a weighted percentage agreement higher than 0.95), correlations with standardized measurements (for example, in the range of 0.43 to 0.49 between the Revised Fibromyalgia Impact Questionnaire and GPT-4 with temperature 1), and low error rates (root mean squared error of 1.20 for severity and 1.44 for disability). Moreover, experts generally deemed the ratings provided by GPT-4, as well as the scores' explanation, to be adequate. However, we observe that GPT has a slight tendency to overestimate pain severity and disability with a lower SD than expert estimates. Conclusions: These findings underline the potential of LLMs in facilitating the assessment of WNs of people with fibromyalgia, offering a novel approach to understanding and evaluating patient pain experiences. Integrating automated assessments through LLMs presents opportunities for streamlining and enhancing the assessment process, paving the way for improved patient care and tailored interventions in the chronic pain management field.
UR - https://www.jmir.org/2025/1/e65903 UR - http://dx.doi.org/10.2196/65903 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/65903 ER - TY - JOUR AU - Hasegawa, Tatsuki AU - Kizaki, Hayato AU - Ikegami, Keisho AU - Imai, Shungo AU - Yanagisawa, Yuki AU - Yada, Shuntaro AU - Aramaki, Eiji AU - Hori, Satoko PY - 2025/3/27 TI - Improving Systematic Review Updates With Natural Language Processing Through Abstract Component Classification and Selection: Algorithm Development and Validation JO - JMIR Med Inform SP - e65371 VL - 13 KW - systematic review KW - natural language processing KW - guideline updates KW - bidirectional encoder representations from transformer KW - screening model KW - literature KW - efficiency KW - updating systematic reviews KW - language model N2 - Background: A challenge in updating systematic reviews is the workload in screening the articles. Many screening models using natural language processing technology have been implemented to scrutinize articles based on titles and abstracts. While these approaches show promise, traditional models typically treat abstracts as uniform text. We hypothesize that selective training on specific abstract components could enhance model performance for systematic review screening. Objective: We evaluated the efficacy of a novel screening model that selects specific components from abstracts to improve performance and developed an automatic systematic review update model using an abstract component classifier to categorize abstracts based on their components. Methods: A screening model was created based on the included and excluded articles in the existing systematic review and used as the scheme for the automatic update of the systematic review. A prior publication was selected for the systematic review, and articles included or excluded in the articles screening process were used as training data. 
The titles and abstracts were classified into 5 categories (Title, Introduction, Methods, Results, and Conclusion). Thirty-one component-composition datasets were created by combining 5 component datasets. We implemented 31 screening models using the component-composition datasets and compared their performances. Comparisons were conducted using 3 pretrained models: Bidirectional Encoder Representations from Transformer (BERT), BioLinkBERT, and BioM- Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Moreover, to automate the component selection of abstracts, we developed the Abstract Component Classifier Model and created component datasets using this classifier model classification. Using the component datasets classified using the Abstract Component Classifier Model, we created 10 component-composition datasets used by the top 10 screening models with the highest performance when implementing screening models using the component datasets that were classified manually. Ten screening models were implemented using these datasets, and their performances were compared with those of models developed using manually classified component-composition datasets. The primary evaluation metric was the F10-Score weighted by the recall. Results: A total of 256 included articles and 1261 excluded articles were extracted from the selected systematic review. In the screening models implemented using manually classified datasets, the performance of some surpassed that of models trained on all components (BERT: 9 models, BioLinkBERT: 6 models, and BioM-ELECTRA: 21 models). In models implemented using datasets classified by the Abstract Component Classifier Model, the performances of some models (BERT: 7 models and BioM-ELECTRA: 9 models) surpassed that of the models trained on all components. These models achieved an 88.6% reduction in manual screening workload while maintaining high recall (0.93). 
Conclusions: Component selection from the title and abstract can improve the performance of screening models and substantially reduce the manual screening workload in systematic review updates. Future research should focus on validating this approach across different systematic review domains. UR - https://medinform.jmir.org/2025/1/e65371 UR - http://dx.doi.org/10.2196/65371 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/65371 ER - TY - JOUR AU - Wickramasekera, Nyantara AU - Shackley, Phil AU - Rowen, Donna PY - 2025/3/21 TI - Embedding a Choice Experiment in an Online Decision Aid or Tool: Scoping Review JO - J Med Internet Res SP - e59209 VL - 27 KW - decision aid KW - decision tool KW - discrete choice experiment KW - conjoint analysis KW - value clarification KW - scoping review KW - choice experiment KW - database KW - study KW - article KW - data charting KW - narrative synthesis N2 - Background: Decision aids empower patients to understand how treatment options match their preferences. Choice experiments, a method to clarify values used within decision aids, present patients with hypothetical scenarios to reveal their preferences for treatment characteristics. Given the rise in research embedding choice experiments in decision tools and the emergence of novel developments in embedding methodology, a scoping review is warranted. Objective: This scoping review examines how choice experiments are embedded into decision tools and how these tools are evaluated, to identify best practices. Methods: This scoping review followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. Searches were conducted on MEDLINE, PsycInfo, and Web of Science. The methodology, development and evaluation details of decision aids were extracted and summarized using narrative synthesis. Results: Overall, 33 papers reporting 22 tools were included in the scoping review. 
These tools were developed for various health conditions, including musculoskeletal (7/22, 32%), oncological (8/22, 36%), and chronic conditions (7/22, 32%). Most decision tools (17/22, 77%) were developed in the United States, with the remaining tools originating in the Netherlands, United Kingdom, Canada, and Australia. The number of publications increased, with 73% (16/22) published since 2015, peaking at 4 publications in 2019. The primary purpose of these tools (20/22, 91%) was to help patients compare or choose treatments. Adaptive conjoint analysis was the most frequently used design type (10/22, 45%), followed by conjoint analysis and discrete choice experiments (DCEs; both 4/22, 18%), modified adaptive conjoint analysis (3/22, 14%), and adaptive best-worst conjoint analysis (1/22, 5%). The number of tasks varied depending on the design (6-12 for DCEs and adaptive conjoint vs 16-20 for conjoint analysis designs). Sawtooth software was commonly used (14/22, 64%) to embed choice tasks. Four proof-of-concept embedding methods were identified: scenario analysis, known preference phenotypes, Bayesian collaborative filtering, and penalized multinomial logit model. After completing the choice tasks, patients received tailored information: 73% (16/22) of tools provided attribute importance scores, and 23% (5/22) presented a "best match" treatment ranking. To convey probabilistic attributes, most tools (13/22, 59%) used a combination of approaches, including percentages, natural frequencies, icon arrays, narratives, and videos. The tools were evaluated across diverse study designs (randomized controlled trials, mixed methods, and cohort studies), with sample sizes ranging from 23 to 743 participants. Over 40 different outcomes were included in the evaluations, with the decisional conflict scale being the most frequently used in 6 tools. Conclusions: This scoping review provides an overview of how choice experiments are embedded into decision tools.
It highlights the lack of established best practices for embedding methods, with only 4 proof-of-concept methods identified. Furthermore, the review reveals a lack of consensus on outcome measures, emphasizing the need for standardized outcome selection for future evaluations. UR - https://www.jmir.org/2025/1/e59209 UR - http://dx.doi.org/10.2196/59209 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/59209 ER - TY - JOUR AU - Hao, Jie AU - Chen, Zhenli AU - Peng, Qinglong AU - Zhao, Liang AU - Zhao, Wanqing AU - Cong, Shan AU - Li, Junlian AU - Li, Jiao AU - Qian, Qing AU - Sun, Haixia PY - 2025/3/18 TI - Prompt Framework for Extracting Scale-Related Knowledge Entities from Chinese Medical Literature: Development and Evaluation Study JO - J Med Internet Res SP - e67033 VL - 27 KW - prompt engineering KW - named entity recognition KW - in-context learning KW - large language model KW - Chinese medical literature KW - measurement-based care KW - framework KW - prompt KW - prompt framework KW - scale KW - China KW - medical literature KW - MBC KW - LLM KW - MedScaleNER KW - retrieval KW - information retrieval KW - dataset KW - artificial intelligence KW - AI N2 - Background: Measurement-based care improves patient outcomes by using standardized scales, but its widespread adoption is hindered by the lack of accessible and structured knowledge, particularly in unstructured Chinese medical literature. Extracting scale-related knowledge entities from these texts is challenging due to limited annotated data. While large language models (LLMs) show promise in named entity recognition (NER), specialized prompting strategies are needed to accurately recognize medical scale-related entities, especially in low-resource settings. Objective: This study aims to develop and evaluate MedScaleNER, a task-oriented prompt framework designed to optimize LLM performance in recognizing medical scale-related entities from Chinese medical literature. 
Methods: MedScaleNER incorporates demonstration retrieval within in-context learning, chain-of-thought prompting, and self-verification strategies to improve performance. The framework dynamically retrieves optimal examples using a k-nearest neighbors approach and decomposes the NER task into two subtasks: entity type identification and entity labeling. Self-verification ensures the reliability of the final output. A dataset of manually annotated Chinese medical journal papers was constructed, focusing on three key entity types: scale names, measurement concepts, and measurement items. Experiments were conducted by varying the number of examples and the proportion of training data to evaluate performance in low-resource settings. Additionally, MedScaleNER's performance was compared with locally fine-tuned models. Results: The CMedS-NER (Chinese Medical Scale Corpus for Named Entity Recognition) dataset, containing 720 papers with 27,499 manually annotated scale-related knowledge entities, was used for evaluation. Initial experiments identified GLM-4-0520 as the best-performing LLM among six tested models. When applied with GLM-4-0520, MedScaleNER significantly improved NER performance for scale-related entities, achieving a macro F1-score of 59.64% in an exact string match with the full training dataset. The highest performance was achieved with 20-shot demonstrations. Under low-resource scenarios (eg, 1% of the training data), MedScaleNER outperformed all tested locally fine-tuned models. Ablation studies highlighted the importance of demonstration retrieval and self-verification in improving model reliability. Error analysis revealed four main types of mistakes: identification errors, type errors, boundary errors, and missing entities, indicating areas for further improvement. Conclusions: MedScaleNER advances the application of LLMs and prompt engineering for specialized NER tasks in Chinese medical literature.
By addressing the challenges of unstructured texts and limited annotated data, MedScaleNER's adaptability to various biomedical contexts supports more efficient and reliable knowledge extraction, contributing to broader measurement-based care implementation and improved clinical and research outcomes. UR - https://www.jmir.org/2025/1/e67033 UR - http://dx.doi.org/10.2196/67033 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/67033 ER - TY - JOUR AU - Ehrig, Molly AU - Bullock, S. Garrett AU - Leng, Iris Xiaoyan AU - Pajewski, M. Nicholas AU - Speiser, Lynn Jaime PY - 2025/3/13 TI - Imputation and Missing Indicators for Handling Missing Longitudinal Data: Data Simulation Analysis Based on Electronic Health Record Data JO - JMIR Med Inform SP - e64354 VL - 13 KW - missing indicator method KW - missing data KW - imputation KW - longitudinal data KW - electronic health record data KW - electronic health records KW - EHR KW - simulation study KW - clinical prediction model KW - prediction model KW - older adults KW - falls KW - logistic regression KW - prediction modeling N2 - Background: Missing data in electronic health records are highly prevalent and result in analytical concerns such as heterogeneous sources of bias and loss of statistical power. One simple analytic method for addressing missing or unknown covariate values is to treat missingness for a particular variable as a category unto itself, which we refer to as the missing indicator method. For cross-sectional analyses, recent work suggested that there was minimal benefit to the missing indicator method; however, it is unclear how this approach performs in the setting of longitudinal data, in which correlation among clustered repeated measures may be leveraged for potentially improved model performance.
Objectives: This study aims to conduct a simulation study to evaluate whether the missing indicator method improved model performance and imputation accuracy for longitudinal data mimicking an application of developing a clinical prediction model for falls in older adults based on electronic health record data. Methods: We simulated a longitudinal binary outcome using mixed effects logistic regression that emulated a falls assessment at annual follow-up visits. Using multivariate imputation by chained equations, we simulated time-invariant predictors such as sex and medical history, as well as dynamic predictors such as physical function, BMI, and medication use. We induced missing data in predictors under scenarios that had both random (missing at random) and dependent missingness (missing not at random). We evaluated aggregate performance using the area under the receiver operating characteristic curve (AUROC) for models with and without missing indicators as predictors, as well as complete case analysis, across simulation replicates. We evaluated imputation quality using normalized root-mean-square error for continuous variables and percent falsely classified for categorical variables. Results: Independent of the mechanism used to simulate missing data (missing at random or missing not at random), overall model performance via AUROC was similar regardless of whether missing indicators were included in the model. The root-mean-square error and percent falsely classified measures were similar for models including missing indicators versus those without missing indicators. Model performance and imputation quality were similar regardless of whether the outcome was related to missingness. Imputation with or without missing indicators had similar mean values of AUROC compared with complete case analysis, although complete case analysis had the largest range of values.
Conclusions: The results of this study suggest that the inclusion of missing indicators in longitudinal data modeling neither improves nor worsens overall performance or imputation accuracy. Future research is needed to address whether the inclusion of missing indicators is useful in prediction modeling with longitudinal data in different settings, such as high dimensional data analysis. UR - https://medinform.jmir.org/2025/1/e64354 UR - http://dx.doi.org/10.2196/64354 ID - info:doi/10.2196/64354 ER - TY - JOUR AU - Oami, Takehiko AU - Okada, Yohei AU - Nakada, Taka-aki PY - 2025/3/12 TI - GPT-3.5 Turbo and GPT-4 Turbo in Title and Abstract Screening for Systematic Reviews JO - JMIR Med Inform SP - e64682 VL - 13 KW - large language models KW - citation screening KW - systematic review KW - clinical practice guidelines KW - artificial intelligence KW - sepsis KW - AI KW - review KW - GPT KW - screening KW - citations KW - critical care KW - Japan KW - Japanese KW - accuracy KW - efficiency KW - reliability KW - LLM UR - https://medinform.jmir.org/2025/1/e64682 UR - http://dx.doi.org/10.2196/64682 ID - info:doi/10.2196/64682 ER - TY - JOUR AU - Dai, Zhang-Yi AU - Wang, Fu-Qiang AU - Shen, Cheng AU - Ji, Yan-Li AU - Li, Zhi-Yang AU - Wang, Yun AU - Pu, Qiang PY - 2025/3/11 TI - Accuracy of Large Language Models for Literature Screening in Thoracic Surgery: Diagnostic Study JO - J Med Internet Res SP - e67488 VL - 27 KW - accuracy KW - large language models KW - meta-analysis KW - literature screening KW - thoracic surgery N2 - Background: Systematic reviews and meta-analyses rely on labor-intensive literature screening. While machine learning offers potential automation, its accuracy remains suboptimal. This raises the question of whether emerging large language models (LLMs) can provide a more accurate and efficient approach. 
Objective: This paper evaluates the sensitivity, specificity, and summary receiver operating characteristic (SROC) curve of LLM-assisted literature screening. Methods: We conducted a diagnostic study comparing the accuracy of LLM-assisted screening versus manual literature screening across 6 thoracic surgery meta-analyses. Manual screening by 2 investigators served as the reference standard. LLM-assisted screening was performed using ChatGPT-4o (OpenAI) and Claude-3.5 (Anthropic) sonnet, with discrepancies resolved by Gemini-1.5 pro (Google). In addition, 2 open-source, machine learning-based screening tools, ASReview (Utrecht University) and Abstrackr (Center for Evidence Synthesis in Health, Brown University School of Public Health), were also evaluated. We calculated sensitivity, specificity, and 95% CIs for the title and abstract, as well as full-text screening, generating pooled estimates and SROC curves. LLM prompts were revised based on a post hoc error analysis. Results: LLM-assisted full-text screening demonstrated high pooled sensitivity (0.87, 95% CI 0.77-0.99) and specificity (0.96, 95% CI 0.91-0.98), with the area under the curve (AUC) of 0.96 (95% CI 0.94-0.97). Title and abstract screening achieved a pooled sensitivity of 0.73 (95% CI 0.57-0.85) and specificity of 0.99 (95% CI 0.97-0.99), with an AUC of 0.97 (95% CI 0.96-0.99). Post hoc revisions improved sensitivity to 0.98 (95% CI 0.74-1.00) while maintaining high specificity (0.98, 95% CI 0.94-0.99). In comparison, the pooled sensitivity and specificity of ASReview tool-assisted screening were 0.58 (95% CI 0.53-0.64) and 0.97 (95% CI 0.91-0.99), respectively, with an AUC of 0.66 (95% CI 0.62-0.70). The pooled sensitivity and specificity of Abstrackr tool-assisted screening were 0.48 (95% CI 0.35-0.62) and 0.96 (95% CI 0.88-0.99), respectively, with an AUC of 0.78 (95% CI 0.74-0.82). A post hoc meta-analysis revealed comparable effect sizes between LLM-assisted and conventional screening.
Conclusions: LLMs hold significant potential for streamlining literature screening in systematic reviews, reducing workload without sacrificing quality. Importantly, LLMs outperformed traditional machine learning-based tools (ASReview and Abstrackr) in both sensitivity and AUC values, suggesting that LLMs offer a more accurate and efficient approach to literature screening. UR - https://www.jmir.org/2025/1/e67488 UR - http://dx.doi.org/10.2196/67488 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/67488 ER - TY - JOUR AU - Tan, Jiaxing AU - Yang, Rongxin AU - Xiao, Liyin AU - Dong, Lingqiu AU - Zhong, Zhengxia AU - Zhou, Ling AU - Qin, Wei PY - 2025/3/10 TI - Risk Stratification in Immunoglobulin A Nephropathy Using Network Biomarkers: Development and Validation Study JO - J Med Internet Res SP - e65563 VL - 27 KW - IgA nephropathy KW - unsupervised learning KW - network biomarker KW - metabolomics KW - gut microbiota KW - biomarkers KW - risk stratification KW - IgA KW - immunoglobulin A KW - renal biopsy KW - renal KW - prospective cohort KW - Berger disease KW - synpharyngitic glomerulonephritis KW - kidney KW - immune system KW - glomerulonephritis KW - kidney inflammation KW - chronic kidney disease KW - renal disease KW - nephropathy KW - nephritis N2 - Background: Traditional risk models for immunoglobulin A nephropathy (IgAN), which primarily rely on renal indicators, lack comprehensive assessment and therapeutic guidance, necessitating more refined and integrative approaches. Objective: This study integrated network biomarkers with unsupervised learning clustering (k-means clustering based on network biomarkers [KMN]) to refine risk stratification in IgAN and explore its clinical value. Methods: Involving a multicenter prospective cohort, we analyzed 1460 patients and validated the approach externally with 200 additional patients. 
Deeper metabolic and microbiomic insights were gained from 2 distinct cohorts: 63 patients underwent ultraperformance liquid chromatography-mass spectrometry, while another 45 underwent fecal 16S RNA sequencing. Our approach used hierarchical clustering and k-means methods, using 3 sets of indicators: demographic and renal indicators, renal and extrarenal indicators, and network biomarkers derived from all indicators. Results: Among 6 clustering methods tested, the KMN scheme was the most effective, accurately reflecting patient severity and prognosis with a prognostic accuracy area under the curve (AUC) of 0.77, achieved solely through cluster grouping without additional indicators. The KMN stratification significantly outperformed the existing International IgA Nephropathy Prediction Tool (AUC of 0.72) and renal function-renal histology grading schemes (AUC of 0.69). Clinically, this stratification facilitated personalized treatment, recommending angiotensin-converting enzyme inhibitors or angiotensin receptor blockers for lower-risk groups and considering immunosuppressive therapy for higher-risk groups. Preliminary findings also indicated a correlation between IgAN progression and alterations in serum metabolites and gut microbiota, although further research is needed to establish causality. Conclusions: The effectiveness and applicability of the KMN scheme indicate its substantial potential for clinical application in IgAN management. UR - https://www.jmir.org/2025/1/e65563 UR - http://dx.doi.org/10.2196/65563 UR - http://www.ncbi.nlm.nih.gov/pubmed/40063072 ID - info:doi/10.2196/65563 ER - TY - JOUR AU - Malik, Salma AU - Dorothea, Pana Zoi AU - Argyropoulos, D. Christos AU - Themistocleous, Sophia AU - Macken, J. Alan AU - Valdenmaiier, Olena AU - Scheckenbach, Frank AU - Bardach, Elena AU - Pfeiffer, Andrea AU - Loens, Katherine AU - Ochando, Cano Jordi AU - Cornely, A.
Oliver AU - Demotes-Mainard, Jacques AU - Contrino, Sergio AU - Felder, Gerd PY - 2025/3/7 TI - Data Interoperability in COVID-19 Vaccine Trials: Methodological Approach in the VACCELERATE Project JO - JMIR Med Inform SP - e65590 VL - 13 KW - interoperability KW - metadata KW - data management KW - clinical trials KW - protocol KW - harmonization KW - adult KW - pediatric KW - systems KW - standards N2 - Background: Data standards are not only key to making data processing efficient but also fundamental to ensuring data interoperability. When clinical trial data are structured according to international standards, they become significantly easier to analyze, reducing the efforts required for data cleaning, preprocessing, and secondary use. A common language and a shared set of expectations facilitate interoperability between systems and devices. Objective: The main objectives of this study were to identify commonalities and differences in clinical trial metadata, protocols, and data collection systems/items within the VACCELERATE project. Methods: To assess the degree of interoperability achieved in the project and suggest methodological improvements, interoperable points were identified based on the core outcome areas: immunogenicity, safety, and efficacy (clinical/physiological). These points were emphasized in the development of the master protocol template and were manually compared in the following ways: (1) summaries, objectives, and end points in the protocols of 3 VACCELERATE clinical trials (EU-COVAT-1_AGED, EU-COVAT-2_BOOSTAVAC, and EU-COVPT-1_CoVacc) against the master protocol template; (2) metadata of all 3 clinical trials; and (3) evaluations from a questionnaire survey regarding differences in data management systems and structures that enabled data exchange within the VACCELERATE network. 
Results: The noncommonalities identified in the protocols and metadata were attributed to differences in populations, variations in protocol design, and vaccination patterns. The detailed metadata released for all 3 vaccine trials were clearly structured using internal standards, terminology, and the general approach of Clinical Data Acquisition Standards Harmonisation (CDASH) for data collection (eg, on electronic case report forms). VACCELERATE benefited significantly from the selection of the Clinical Trials Centre Cologne as the sole data management provider. With system database development coordinated by a single individual and no need for coordination among different trial units, a high degree of uniformity was achieved automatically. The harmonized transfer of data to all sites, using well-established methods, enabled quick exchanges and provided a relatively secure means of data transfer. Conclusions: This study demonstrated that using master protocols can significantly enhance trial operational efficiency and data interoperability, provided that similar infrastructure and data management procedures are adopted across multiple trials. To further improve data interoperability and facilitate interpretation and analysis, shared data should be structured, described, formatted, and stored using widely recognized data and metadata standards. 
Trial Registration: EudraCT 2021-004526-29; https://www.clinicaltrialsregister.eu/ctr-search/trial/2021-004526-29/DE/; 2021-004889-35; https://www.clinicaltrialsregister.eu/ctr-search/search?query=eudract_number:2021-004889-35; and 2021-004526-29; https://www.clinicaltrialsregister.eu/ctr-search/search?query=eudract_number:2021-004526-29 UR - https://medinform.jmir.org/2025/1/e65590 UR - http://dx.doi.org/10.2196/65590 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/65590 ER - TY - JOUR AU - Fekete, Tibor János AU - Győrffy, Balázs PY - 2025/3/6 TI - MetaAnalysisOnline.com: Web-Based Tool for the Rapid Meta-Analysis of Clinical and Epidemiological Studies JO - J Med Internet Res SP - e64016 VL - 27 KW - statistics KW - pharmacology KW - treatment KW - epidemiology KW - fixed effect model KW - random effect model KW - hazard rate KW - response rate KW - clinical trial KW - funnel plot KW - z score plot N2 - Background: A meta-analysis is a quantitative, formal study design in epidemiology and clinical medicine that systematically integrates and quantitatively synthesizes findings from multiple independent studies. This approach not only enhances statistical power but also enables the exploration of effects across diverse populations and helps resolve controversies arising from conflicting studies. Objective: This study aims to develop and implement a user-friendly tool for conducting meta-analyses, addressing the need for an accessible platform that simplifies the complex statistical procedures required for evidence synthesis while maintaining methodological rigor. Methods: The platform available at MetaAnalysisOnline.com enables comprehensive meta-analyses through an intuitive web interface, requiring no programming expertise or command-line operations. 
The system accommodates diverse data types including binary (total and event numbers), continuous (mean and SD), and time-to-event data (hazard rates with CIs), while implementing both fixed-effect and random-effect models using established statistical approaches such as DerSimonian-Laird, Mantel-Haenszel, and inverse variance methods for effect size estimation and heterogeneity assessment. Results: In addition to statistical tests, graphical representations including the forest plot, the funnel plot, and the z score plot can be drawn. A forest plot is highly effective in illustrating heterogeneity and pooled results. The risk of publication bias can be revealed by a funnel plot. A z score plot provides a visual assessment of whether more research is needed to establish a reliable conclusion. All the discussed models and visualization options are integrated into the registration-free web-based portal. Leveraging MetaAnalysisOnline.com's capabilities, we examined treatment-related adverse events in patients with cancer receiving perioperative anti–PD-1 immunotherapy through a systematic review encompassing 10 studies with 8099 total participants. Meta-analysis revealed that anti–PD-1 therapy doubled the risk of adverse events (risk ratio 2.15, 95% CI 1.39-3.32), with significant between-study heterogeneity (I²=95%) and publication bias detected through the Egger test (P=.02). While these findings suggest increased toxicity associated with anti–PD-1 treatment, the z score analysis indicated that additional studies are needed for definitive conclusions. Conclusions: In summary, the web-based tool aims to bridge the void for clinical and life science researchers by offering a user-friendly alternative for the swift and reproducible meta-analysis of clinical and epidemiological trials. 
UR - https://www.jmir.org/2025/1/e64016 UR - http://dx.doi.org/10.2196/64016 UR - http://www.ncbi.nlm.nih.gov/pubmed/39928123 ID - info:doi/10.2196/64016 ER - TY - JOUR AU - Ohno, Yukiko AU - Aomori, Tohru AU - Nishiyama, Tomohiro AU - Kato, Riri AU - Fujiki, Reina AU - Ishikawa, Haruki AU - Kiyomiya, Keisuke AU - Isawa, Minae AU - Mochizuki, Mayumi AU - Aramaki, Eiji AU - Ohtani, Hisakazu PY - 2025/3/4 TI - Performance Improvement of a Natural Language Processing Tool for Extracting Patient Narratives Related to Medical States From Japanese Pharmaceutical Care Records by Increasing the Amount of Training Data: Natural Language Processing Analysis and Validation Study JO - JMIR Med Inform SP - e68863 VL - 13 KW - natural language processing KW - NLP KW - named entity recognition KW - NER KW - deep learning KW - pharmaceutical care record KW - electronic medical record KW - EMR KW - Japanese N2 - Background: Patients' oral expressions serve as valuable sources of clinical information to improve pharmacotherapy. Natural language processing (NLP) is a useful approach for analyzing unstructured text data, such as patient narratives. However, few studies have focused on using NLP for narratives in the Japanese language. Objective: We aimed to develop a high-performance NLP system for extracting clinical information from patient narratives by examining the performance progression with a gradual increase in the amount of training data. Methods: We used subjective texts from the pharmaceutical care records of Keio University Hospital from April 1, 2018, to March 31, 2019, comprising 12,004 records from 6559 cases. After preprocessing, we annotated diseases and symptoms within the texts. We then trained and evaluated a deep learning model (bidirectional encoder representations from transformers combined with a conditional random field [BERT-CRF]) through 10-fold cross-validation. 
The annotated data were divided into 10 subsets, and the amount of training data was progressively increased over 10 steps. We also analyzed the causes of errors. Finally, we applied the developed system to the analysis of case report texts to evaluate its usability for texts from other sources. Results: The F1-score of the system improved from 0.67 to 0.82 as the amount of training data increased from 1200 to 12,004 records. The F1-score reached 0.78 with 3600 records and was largely similar thereafter. As performance improved, errors from incorrect extractions decreased significantly, which resulted in an increase in precision. For case reports, the F1-score also increased from 0.34 to 0.41 as the training dataset expanded from 1200 to 12,004 records. Performance was lower for extracting symptoms from case report texts compared with pharmaceutical care records, suggesting that this system is more specialized for analyzing subjective data from pharmaceutical care records. Conclusions: We successfully developed a high-performance system specialized in analyzing subjective data from pharmaceutical care records by training on a large dataset, with near-complete saturation of system performance at about 3600 training records. This system will be useful for monitoring symptoms, offering benefits for both clinical practice and research. 
UR - https://medinform.jmir.org/2025/1/e68863 UR - http://dx.doi.org/10.2196/68863 UR - http://www.ncbi.nlm.nih.gov/pubmed/40053805 ID - info:doi/10.2196/68863 ER - TY - JOUR AU - Zhang, Subo AU - Zhu, Zhitao AU - Yu, Zhenfei AU - Sun, Haifeng AU - Sun, Yi AU - Huang, Hai AU - Xu, Lei AU - Wan, Jinxin PY - 2025/2/27 TI - Effectiveness of AI for Enhancing Computed Tomography Image Quality and Radiation Protection in Radiology: Systematic Review and Meta-Analysis JO - J Med Internet Res SP - e66622 VL - 27 KW - artificial intelligence KW - computed tomography KW - image quality KW - radiation protection KW - meta-analysis N2 - Background: Artificial intelligence (AI) presents a promising approach to balancing high image quality with reduced radiation exposure in computed tomography (CT) imaging. Objective: This meta-analysis evaluates the effectiveness of AI in enhancing CT image quality and lowering radiation doses. Methods: A thorough literature search was performed across several databases, including PubMed, Embase, Web of Science, Science Direct, and Cochrane Library, with the final update in 2024. We included studies that compared AI-based interventions to conventional CT techniques. The quality of these studies was assessed using the Newcastle-Ottawa Scale. Random effect models were used to pool results, and heterogeneity was measured using the I² statistic. Primary outcomes included image quality, CT dose index, and diagnostic accuracy. Results: This meta-analysis incorporated 5 clinical validation studies published between 2022 and 2024, totaling 929 participants. Results indicated that AI-based interventions significantly improved image quality (mean difference 0.70, 95% CI 0.43-0.96; P<.001) and showed a positive trend in reducing the CT dose index, though not statistically significant (mean difference 0.47, 95% CI −0.21 to 1.15; P=.18). 
AI also enhanced image analysis efficiency (odds ratio 1.57, 95% CI 1.08-2.27; P=.02) and demonstrated high accuracy and sensitivity in detecting intracranial aneurysms, with low-dose CT using AI reconstruction showing noninferiority for liver lesion detection. Conclusions: The findings suggest that AI-based interventions can significantly enhance CT imaging practices by improving image quality and potentially reducing radiation doses, which may lead to better diagnostic accuracy and patient safety. However, these results should be interpreted with caution due to the limited number of studies and the variability in AI algorithms. Further research is needed to clarify AI's impact on radiation reduction and to establish clinical standards. UR - https://www.jmir.org/2025/1/e66622 UR - http://dx.doi.org/10.2196/66622 UR - http://www.ncbi.nlm.nih.gov/pubmed/40053787 ID - info:doi/10.2196/66622 ER - TY - JOUR AU - Steele, Brian AU - Fairie, Paul AU - Kemp, Kyle AU - D'Souza, G. Adam AU - Wilms, Matthias AU - Santana, Jose Maria PY - 2025/2/24 TI - Identifying Patient-Reported Care Experiences in Free-Text Survey Comments: Topic Modeling Study JO - JMIR Med Inform SP - e63466 VL - 13 KW - natural language processing KW - patient-reported experience KW - topic models KW - inpatient KW - artificial intelligence KW - AI KW - patient reported KW - feedback KW - survey KW - patient experiences KW - bidirectional encoder representations from transformers KW - BERT KW - sentiment analysis KW - pediatric caregivers KW - patient safety KW - safety N2 - Background: Patient-reported experience surveys allow administrators, clinicians, and researchers to quantify and improve health care by receiving feedback directly from patients. Existing research has focused primarily on quantitative analysis of survey items, but these measures may collect optional free-text comments. 
These comments can provide insights for health systems but may not be analyzed due to limited resources and the complexity of traditional textual analysis. However, advances in machine learning–based natural language processing provide opportunities to learn from this traditionally underused data source. Objective: This study aimed to apply natural language processing to model topics found in free-text comments of patient-reported experience surveys. Methods: Consumer Assessment of Healthcare Providers and Systems–derived patient experience surveys were collected and linked to administrative inpatient records by the provincial health services organization responsible for inpatient care. Unsupervised topic modeling with automated labeling was performed with BERTopic. Sentiment analysis was performed to further assist in topic description. Results: Between April 2016 and February 2020, 43.4% (43,522/100,272) adult patients and 46.9% (3501/7464) pediatric caregivers included free-text responses on completed patient experience surveys. Topic models identified 86 topics among adult survey responses and 35 topics among pediatric responses that included elements of care not currently surveyed by existing questionnaires. Frequent topics were generally positive. Conclusions: We found that with limited tuning, BERTopic identified care experience topics with interpretable automated labeling. Results are discussed in the context of person-centered care, patient safety, and health care quality improvement. Furthermore, we note the opportunity for the identification of temporal and site-specific trends as a method to identify patient care and safety concerns. As the use of patient experience measurement increases in health care, we discuss how machine learning can be leveraged to provide additional insight on patient experiences. 
UR - https://medinform.jmir.org/2025/1/e63466 UR - http://dx.doi.org/10.2196/63466 ID - info:doi/10.2196/63466 ER - TY - JOUR AU - Park, ChulHyoung AU - Lee, Hee So AU - Lee, Yun Da AU - Choi, Seoyoon AU - You, Chan Seng AU - Jeon, Young Ja AU - Park, Jun Sang AU - Park, Woong Rae PY - 2025/2/21 TI - Analysis of Retinal Thickness in Patients With Chronic Diseases Using Standardized Optical Coherence Tomography Data: Database Study Based on the Radiology Common Data Model JO - JMIR Med Inform SP - e64422 VL - 13 KW - data standardization KW - ophthalmology KW - radiology KW - optical coherence tomography KW - retinal thickness N2 - Background: The Observational Medical Outcome Partners-Common Data Model (OMOP-CDM) is an international standard for harmonizing electronic medical record (EMR) data. However, since it does not standardize unstructured data, such as medical imaging, using this data in multi-institutional collaborative research becomes challenging. To overcome this limitation, extensions such as the Radiology Common Data Model (R-CDM) have emerged to include and standardize these data types. Objective: This work aims to demonstrate that by standardizing optical coherence tomography (OCT) data into an R-CDM format, multi-institutional collaborative studies analyzing changes in retinal thickness in patients with long-standing chronic diseases can be performed efficiently. Methods: We standardized OCT images collected from two tertiary hospitals for research purposes using the R-CDM. As a proof of concept, we conducted a comparative analysis of retinal thickness between patients who have chronic diseases and those who have not. Patients diagnosed or treated for retinal and choroidal diseases, which could affect retinal thickness, were excluded from the analysis. Using the existing OMOP-CDM at each institution, we extracted cohorts of patients with chronic diseases and control groups, performing large-scale 1:2 propensity score matching (PSM). 
Subsequently, we linked the OMOP-CDM and R-CDM to extract the OCT image data of these cohorts and analyzed central macular thickness (CMT) and retinal nerve fiber layer (RNFL) thickness using a linear mixed model. Results: OCT data of 261,874 images from Ajou University Medical Center (AUMC) and 475,626 images from Seoul National University Bundang Hospital (SNUBH) were standardized in the R-CDM format. The R-CDM databases established at each institution were linked with the OMOP-CDM database. Following 1:2 PSM, the type 2 diabetes mellitus (T2DM) cohort included 957 patients, and the control cohort had 1603 patients. During the follow-up period, significant reductions in CMT were observed in the T2DM cohorts at AUMC (P=.04) and SNUBH (P=.007), without significant changes in RNFL thickness (AUMC: P=.56; SNUBH: P=.39). Notably, a significant reduction in CMT during the follow-up was observed only at AUMC in the hypertension cohort, compared to the control group (P=.04); no other significant differences in retinal thickness were found in the remaining analyses. Conclusions: The significance of our study lies in demonstrating the efficiency of multi-institutional collaborative research that simultaneously uses clinical data and medical imaging data by leveraging the OMOP-CDM for standardizing EMR data and the R-CDM for standardizing medical imaging data. 
UR - https://medinform.jmir.org/2025/1/e64422 UR - http://dx.doi.org/10.2196/64422 ID - info:doi/10.2196/64422 ER - TY - JOUR AU - Wang, Jianli AU - Orpana, Heather AU - Carrington, André AU - Kephart, George AU - Vasiliadis, Helen-Maria AU - Leikin, Benjamin PY - 2025/2/19 TI - Development and Validation of Prediction Models for Perceived and Unmet Mental Health Needs in the Canadian General Population: Model-Based Synthetic Estimation Study JO - JMIR Public Health Surveill SP - e66056 VL - 11 KW - population risk prediction KW - development KW - validation KW - perceived mental health need KW - unmet mental health need N2 - Background: Research has shown that perceptions of a mental health need are closely associated with service demands and are an important dimension in needs assessment. Perceived and unmet mental health needs are important factors in the decision-making process regarding mental health services planning and resources allocation. However, few prediction tools are available to be used by policy and decision makers to forecast perceived and unmet mental health needs at the population level. Objective: We aim to develop prediction models to forecast perceived and unmet mental health needs at the provincial and health regional levels in Canada. Methods: Data from 2018, 2019, and 2020 Canadian Community Health Survey and Canadian Urban Environment were used (n=65,000 each year). Perceived and unmet mental health needs were measured by the Perceived Needs for Care Questionnaire. Using the 2018 dataset, we developed the prediction models through the application of regression synthetic estimation for the Atlantic, Central, and Western regions. The models were validated in the 2019 and 2020 datasets at the provincial level and in 10 randomly selected health regions by comparing the observed and predicted proportions of the outcomes. 
Results: In 2018, a total of 17.82% of the participants reported perceived mental health need and 3.81% reported unmet mental health need. The proportions were similar in 2019 (18.04% and 3.91%) and in 2020 (18.1% and 3.92%). Sex, age, self-reported mental health, physician diagnosed mood and anxiety disorders, self-reported life stress and life satisfaction were the predictors in the 3 regional models. The individual based models had good discriminative power with C statistics over 0.83 and good calibration. Applying the synthetic models in 2019 and 2020 data, the models had the best performance in Ontario, Quebec, and British Columbia; the absolute differences between observed and predicted proportions were less than 1%. The absolute differences between the predicted and observed proportion of perceived mental health needs in Newfoundland and Labrador (−4.16% in 2020) and Prince Edward Island (4.58% in 2019) were larger than those in other provinces. When applying the models in the 10 selected health regions, the models calibrated well in the health regions in Ontario and in Quebec; the absolute differences in perceived mental health needs ranged from 0.23% to 2.34%. Conclusions: Predicting perceived and unmet mental health at the population level is feasible. There are common factors that contribute to perceived and unmet mental health needs across regions, at different magnitudes, due to different population characteristics. Therefore, predicting perceived and unmet mental health needs should be region specific. The performance of the models at the provincial and health regional levels may be affected by population size. UR - https://publichealth.jmir.org/2025/1/e66056 UR - http://dx.doi.org/10.2196/66056 ID - info:doi/10.2196/66056 ER - TY - JOUR AU - Draucker, Burke Claire AU - Carrión, Andrés AU - Ott, A. Mary AU - Hicks, I. 
Ariel AU - Knopf, Amelia PY - 2025/2/13 TI - A 4-Site Public Deliberation Project on the Acceptability of Youth Self-Consent in Biomedical HIV Prevention Trials: Assessment of Facilitator Fidelity to Key Principles JO - JMIR Form Res SP - e58451 VL - 9 KW - public deliberation KW - deliberative democracy KW - bioethics KW - ethical conflict KW - biomedical KW - HIV prevention KW - HIV research KW - group facilitation KW - fidelity assessment KW - content analysis N2 - Background: Public deliberation is an approach used to engage persons with diverse perspectives in discussions and decision-making about issues affecting the public that are controversial or value laden. Because experts have identified the need to evaluate facilitator performance, our research team developed a framework to assess the fidelity of facilitator remarks to key principles of public deliberation. Objective: This report describes how the framework was used to assess facilitator fidelity in a 4-site public deliberation project on the acceptability of minor self-consent in biomedical HIV prevention research. Methods: A total of 88 individuals participated in 4 deliberation sessions held in 4 cities throughout the United States. The sessions, facilitated by 18 team members, were recorded and transcribed verbatim. Facilitator remarks were highlighted, and predetermined coding rules were used to code the remarks to 1 of 6 principles of quality deliberations. A variety of display tables were used to organize the codes and calculate the number of facilitator remarks that were consistent or inconsistent with each principle during each session across all sites. A content analysis was conducted on the remarks to describe how facilitator remarks aligned or failed to align with each principle. Results: In total, 735 remarks were coded to one of the principles; 516 (70.2%) were coded as consistent with a principle, and 219 (29.8%) were coded as inconsistent. 
A total of 185 remarks were coded to the principle of equal participation (n=138, 74.6% as consistent; n=47, 25.4% as inconsistent), 158 were coded to expression of diverse opinions (n=110, 69.6% as consistent; n=48, 30.4% as inconsistent), 27 were coded to respect for others (n=27, 100% as consistent), 24 were coded to adoption of a societal perspective (n=11, 46% as consistent; n=13, 54% as inconsistent), 99 were coded to reasoned justification of ideas (n=81, 82% as consistent; n=18, 18% as inconsistent), and 242 were coded to compromise or movement toward consensus (n=149, 61.6% as consistent; n=93, 38.4% as inconsistent). Therefore, the counts provided affirmation that most of the facilitator remarks were aligned with the principles of deliberation, suggesting good facilitator fidelity. By considering how the remarks aligned or failed to align with the principles, areas where facilitator fidelity can be strengthened were identified. The results indicated that facilitators should focus more on encouraging quieter members to participate, refraining from expressing personal opinions, promoting the adoption of a societal perspective and reasoned justification of opinions, and inviting deliberants to articulate their areas of common ground. Conclusions: The results provide an example of how a framework for assessing facilitator fidelity was used in a 4-site deliberation project. The framework will be refined to better address issues related to balancing personal and public perspectives, managing plurality, and mitigating social inequalities. 
UR - https://formative.jmir.org/2025/1/e58451 UR - http://dx.doi.org/10.2196/58451 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/58451 ER - TY - JOUR AU - Marino, Antonio Carlos AU - Diaz Paz, Claudia PY - 2025/1/31 TI - Smart Contracts and Shared Platforms in Sustainable Health Care: Systematic Review JO - JMIR Med Inform SP - e58575 VL - 13 KW - health care KW - smart contracts KW - blockchain KW - security KW - privacy KW - supply chain KW - patient centricity KW - system trust KW - stakeholders N2 - Background: The benefits of smart contracts (SCs) for sustainable health care are a relatively recent topic that has gathered attention given its relationship with trust and the advantages of decentralization, immutability, and traceability introduced in health care. Nevertheless, more studies need to explore the role of SCs in this sector based on the frameworks propounded in the literature that reflect business logic that has been customized, automatized, and prioritized, as well as system trust. This study addressed this lacuna. Objective: This study aimed to provide a comprehensive understanding of SCs in health care based on reviewing the frameworks propounded in the literature. Methods: A structured literature review was performed based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) principles. One database, Web of Science (WoS), was selected to avoid bias generated by database differences and data wrangling. A quantitative assessment of the studies based on machine learning and data reduction methodologies was complemented with a qualitative, in-depth, detailed review of the frameworks propounded in the literature. Results: A total of 70 studies, which constituted 18.7% (70/374) of the studies on this subject, met the selection criteria and were analyzed. A multiple correspondence analysis, with 74.44% of the inertia, produced 3 factors describing the advances in the topic. 
Two of them referred to the leading roles of SCs: (1) health care process enhancement and (2) assurance of patients' privacy protection. The first role included 6 themes, and the second one included 3 themes. The third factor encompassed the technical features that improve system efficiency. The in-depth review of these 3 factors and the identification of stakeholders allowed us to characterize the system trust in health care SCs. We assessed the risk of coverage bias, and good percentages of overlap were obtained: 66% (49/74) of PubMed articles were also in WoS, and 88.3% (181/205) of WoS articles also appeared in Scopus. Conclusions: This comprehensive review allows us to understand the relevance of SCs and the potentiality of their use in patient-centric health care that considers more than technical aspects. It also provides insights for further research based on specific stakeholders, locations, and behaviors. UR - https://medinform.jmir.org/2025/1/e58575 UR - http://dx.doi.org/10.2196/58575 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/58575 ER - TY - JOUR AU - Xu, Qian AU - Cai, Xue AU - Yu, Ruicong AU - Zheng, Yueyue AU - Chen, Guanjie AU - Sun, Hui AU - Gao, Tianyun AU - Xu, Cuirong AU - Sun, Jing PY - 2025/1/31 TI - Machine Learning–Based Risk Factor Analysis and Prediction Model Construction for the Occurrence of Chronic Heart Failure: Health Ecologic Study JO - JMIR Med Inform SP - e64972 VL - 13 KW - machine learning, chronic heart failure, risk of occurrence KW - prediction model, health ecology N2 - Background: Chronic heart failure (CHF) is a serious threat to human health, with high morbidity and mortality rates, imposing a heavy burden on the health care system and society. With the abundance of medical data and the rapid development of machine learning (ML) technologies, new opportunities are provided for in-depth investigation of the mechanisms of CHF and the construction of predictive models. 
The introduction of health ecology research methodology enables a comprehensive dissection of CHF risk factors from a wider range of environmental, social, and individual factors. This not only helps to identify high-risk groups at an early stage but also provides a scientific basis for the development of precise prevention and intervention strategies. Objective: This study aims to use ML to construct a predictive model of the risk of occurrence of CHF and analyze the risk of CHF from a health ecology perspective. Methods: This study sourced data from the Jackson Heart Study database. Stringent data preprocessing procedures were implemented, which included meticulous management of missing values and the standardization of data. Principal component analysis and random forest (RF) were used as feature selection techniques. Subsequently, several ML models, namely decision tree, RF, extreme gradient boosting, adaptive boosting (AdaBoost), support vector machine, naive Bayes model, multilayer perceptron, and bootstrap forest, were constructed, and their performance was evaluated. The effectiveness of the models was validated through internal validation using a 10-fold cross-validation approach on the training and validation sets. In addition, the performance metrics of each model, including accuracy, precision, sensitivity, F1-score, and area under the curve (AUC), were compared. After selecting the best model, we used hyperparameter optimization to construct a better model. Results: RF-selected features (21 in total) had an average root mean square error of 0.30, outperforming principal component analysis. Synthetic Minority Oversampling Technique and Edited Nearest Neighbors showed better accuracy in data balancing. The AdaBoost model was most effective with an AUC of 0.86, accuracy of 75.30%, precision of 0.86, sensitivity of 0.69, and F1-score of 0.76. 
Validation on the training and validation sets through 10-fold cross-validation gave an AUC of 0.97, an accuracy of 91.27%, a precision of 0.94, a sensitivity of 0.92, and an F1-score of 0.94. After random search processing, the accuracy and AUC of AdaBoost improved. Its accuracy was 77.68% and its AUC was 0.86. Conclusions: This study offered insights into CHF risk prediction. Future research should focus on prospective studies, diverse data, advanced techniques, longitudinal studies, and exploring factor interactions for better CHF prevention and management. UR - https://medinform.jmir.org/2025/1/e64972 UR - http://dx.doi.org/10.2196/64972 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/64972 ER - TY - JOUR AU - Figueroa, A. Caroline AU - Torkamaan, Helma AU - Bhattacharjee, Ananya AU - Hauptmann, Hanna AU - Guan, W. Kathleen AU - Sedrakyan, Gayane PY - 2025/1/30 TI - Designing Health Recommender Systems to Promote Health Equity: A Socioecological Perspective JO - J Med Internet Res SP - e60138 VL - 27 KW - digital health KW - health promotion KW - health recommender systems KW - artificial intelligence KW - health equity KW - AI KW - digital devices KW - socioecological KW - health inequities KW - health behavior KW - health behaviors KW - patient centric KW - digital health intervention UR - https://www.jmir.org/2025/1/e60138 UR - http://dx.doi.org/10.2196/60138 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/60138 ER - TY - JOUR AU - Wang, Jiao AU - Chen, Jianrong AU - Liu, Ying AU - Xu, Jixiong PY - 2025/1/28 TI - Use of the FHTHWA Index as a Novel Approach for Predicting the Incidence of Diabetes in a Japanese Population Without Diabetes: Data Analysis Study JO - JMIR Med Inform SP - e64992 VL - 13 KW - prediction KW - diabetes KW - risk KW - index KW - population without diabetes N2 - Background: Many tools have been developed to predict the risk of diabetes in a population without diabetes; however, these tools have 
shortcomings that include the omission of race, inclusion of variables that are not readily available to patients, and low sensitivity or specificity. Objective: We aimed to develop and validate an easy, systematic index for predicting diabetes risk in the Asian population. Methods: We collected the data from the NAGALA (NAfld [nonalcoholic fatty liver disease] in the Gifu Area, Longitudinal Analysis) database. The least absolute shrinkage and selection operator model was used to select potentially relevant features. Multiple Cox proportional hazard analysis was used to develop a model based on the training set. Results: The final study population of 15464 participants had a mean age of 42 (range 18-79) years; 54.5% (8430) were men. The mean follow-up duration was 6.05 (SD 3.78) years. A total of 373 (2.41%) participants showed progression to diabetes during the follow-up period. Then, we established a novel parameter (the FHTHWA index), to evaluate the incidence of diabetes in a population without diabetes, comprising 6 parameters based on the training set. After multivariable adjustment, individuals in tertile 3 had a significantly higher rate of diabetes compared with those in tertile 1 (hazard ratio 32.141, 95% CI 11.545–89.476). Time receiver operating characteristic curve analyses showed that the FHTHWA index had high accuracy, with the area under the curve value being around 0.9 during the more than 12 years of follow-up. Conclusions: This research successfully developed a diabetes risk assessment index tailored for the Japanese population by utilizing an extensive dataset and a wide range of indices. By categorizing the diabetes risk levels among Japanese individuals, this study offers a novel predictive tool for identifying potential patients, while also delivering valuable insights into diabetes prevention strategies for the healthy Japanese populace. 
UR - https://medinform.jmir.org/2025/1/e64992 UR - http://dx.doi.org/10.2196/64992 ID - info:doi/10.2196/64992 ER - TY - JOUR AU - Alharbi, A. Abdullah AU - Aljerian, A. Nawfal AU - Binhotan, S. Meshary AU - Alghamdi, A. Hani AU - Alsultan, K. Ali AU - Arafat, S. Mohammed AU - Aldhabib, Abdulrahman AU - Alaska, A. Yasser AU - Alwahbi, B. Eid AU - Muaddi, A. Mohammed AU - Alqassim, Y. Ahmad AU - Horner, D. Ronnie PY - 2025/1/24 TI - Digital Surveillance of Mental Health Care Services in Saudi Arabia: Cross-Sectional Study of National e-Referral System Data JO - JMIR Public Health Surveill SP - e64257 VL - 11 KW - digital health KW - mental health KW - health policy KW - epidemiology KW - Saudi Arabia KW - SMARC KW - health care transformation KW - e-referral KW - Saudi Medical Appointments and Referrals Centre N2 - Background: Mental illness affects an estimated 25% of the global population, with treatment gaps persisting worldwide. The COVID-19 pandemic has exacerbated these challenges, leading to a significant increase in mental health issues globally. In Saudi Arabia, the lifetime prevalence of mental disorders is estimated at 34.2%, yet 86.1% of those with a 12-month mental disorder report no service use. To address these challenges, digital health solutions, particularly electronic referral (e-referral) systems, have emerged as powerful tools to improve care coordination and access. Saudi Arabia has pioneered the nationwide Saudi Medical Appointments and Referrals Centre (SMARC), a centralized e-referral system using artificial intelligence and predictive analytics. Objectives: This study aims to analyze Saudi Arabia's novel nationwide e-referral system for mental health services, using SMARC platform data to examine referral patterns, and service accessibility. This study also aims to demonstrate how digital health technology can inform and improve mental health care delivery and policy making. 
Methods: This retrospective, cross-sectional study used secondary data from SMARC on 10,033 psychiatric e-referrals in Saudi Arabia during 2020–2021. Referrals were assessed by patient sociodemographic variables, geographic data, and e-referral characteristics including date, type, bed type, and reason for e-referral. Descriptive statistical analyses identified referral patterns, while regression modeling determined predictors of external referrals to other regions. Results: Analysis of 10,033 psychiatric e-referrals revealed that 58.99% (n=5918) were for patients aged 18–44 years, 63.93% (n=6414) were for men, and 87.10% (n=8739) were for Saudi nationals. The Western Business Unit generated 45.17% (n=4532) of all e-referral requests. Emergency cases were the most common type of referral overall, followed by routine inpatient and routine outpatient department referrals. However, in the Northern Business Unit, routine inpatient referrals were most frequent. Two-thirds of requests were for ward beds, while critical beds were rarely requested. "Unavailable subspecialty" was the primary reason for referrals across all regions. The utilization of the mental health e-referral system varied across regions, with the Northern Border and Albaha regions showing the highest rates, while Madinah, Eastern, and Riyadh regions demonstrated lower use. Temporal analysis showed almost similar monthly patterns in 2020 and 2021. There was an overall increase in referrals in 2021 compared with 2020. Conclusions: This pioneering study of mental health e-referrals in Saudi Arabia demonstrates how digital health transformation, particularly through an e-referral system, has significantly enhanced access to mental health services nationwide in Saudi Arabia. The success of this digital initiative demonstrates how digital health solutions can transform health care access, particularly in mental health services, offering a valuable model for other health care systems. 
UR - https://publichealth.jmir.org/2025/1/e64257 UR - http://dx.doi.org/10.2196/64257 ID - info:doi/10.2196/64257 ER - TY - JOUR AU - Fukushima, Takuya AU - Manabe, Masae AU - Yada, Shuntaro AU - Wakamiya, Shoko AU - Yoshida, Akiko AU - Urakawa, Yusaku AU - Maeda, Akiko AU - Kan, Shigeyuki AU - Takahashi, Masayo AU - Aramaki, Eiji PY - 2025/1/16 TI - Evaluating and Enhancing Japanese Large Language Models for Genetic Counseling Support: Comparative Study of Domain Adaptation and the Development of an Expert-Evaluated Dataset JO - JMIR Med Inform SP - e65047 VL - 13 KW - large language models KW - genetic counseling KW - medical KW - health KW - artificial intelligence KW - machine learning KW - domain adaptation KW - retrieval-augmented generation KW - instruction tuning KW - prompt engineering KW - question-answer KW - dialogue KW - ethics KW - safety KW - low-rank adaptation KW - Japanese KW - expert evaluation N2 - Background: Advances in genetics have underscored a strong association between genetic factors and health outcomes, leading to an increased demand for genetic counseling services. However, a shortage of qualified genetic counselors poses a significant challenge. Large language models (LLMs) have emerged as a potential solution for augmenting support in genetic counseling tasks. Despite the potential, Japanese genetic counseling LLMs (JGCLLMs) are underexplored. To advance a JGCLLM-based dialogue system for genetic counseling, effective domain adaptation methods require investigation. Objective: This study aims to evaluate the current capabilities and identify challenges in developing a JGCLLM-based dialogue system for genetic counseling. The primary focus is to assess the effectiveness of prompt engineering, retrieval-augmented generation (RAG), and instruction tuning within the context of genetic counseling. 
Furthermore, we will establish an expert-evaluated dataset of responses generated by LLMs adapted to Japanese genetic counseling for the future development of JGCLLMs. Methods: Two primary datasets were used in this study: (1) a question-answer (QA) dataset for LLM adaptation and (2) a genetic counseling question dataset for evaluation. The QA dataset included 899 QA pairs covering medical and genetic counseling topics, while the evaluation dataset contained 120 curated questions across 6 genetic counseling categories. Three enhancement techniques for LLMs (instruction tuning, RAG, and prompt engineering) were applied to a lightweight Japanese LLM to enhance its ability for genetic counseling. The performance of the adapted LLM was evaluated on the 120-question dataset by 2 certified genetic counselors and 1 ophthalmologist (SK, YU, and AY). Evaluation focused on four metrics: (1) inappropriateness of information, (2) sufficiency of information, (3) severity of harm, and (4) alignment with medical consensus. Results: The evaluation by certified genetic counselors and an ophthalmologist revealed varied outcomes across different methods. RAG showed potential, particularly in enhancing critical aspects of genetic counseling. In contrast, instruction tuning and prompt engineering produced less favorable outcomes. This evaluation process facilitated the creation of an expert-evaluated dataset of responses generated by LLMs adapted with different combinations of these methods. Error analysis identified key ethical concerns, including inappropriate promotion of prenatal testing, criticism of relatives, and inaccurate probability statements. Conclusions: RAG demonstrated notable improvements across all evaluation metrics, suggesting potential for further enhancement through the expansion of RAG data. The expert-evaluated dataset developed in this study provides valuable insights for future optimization efforts. 
However, the ethical issues observed in JGCLLM responses underscore the critical need for ongoing refinement and thorough ethical evaluation before these systems can be implemented in health care settings. UR - https://medinform.jmir.org/2025/1/e65047 UR - http://dx.doi.org/10.2196/65047 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/65047 ER - TY - JOUR AU - Mumtaz, Shahzad AU - McMinn, Megan AU - Cole, Christian AU - Gao, Chuang AU - Hall, Christopher AU - Guignard-Duff, Magalie AU - Huang, Huayi AU - McAllister, A. David AU - Morales, R. Daniel AU - Jefferson, Emily AU - Guthrie, Bruce PY - 2025/1/16 TI - A Digital Tool for Clinical Evidence–Driven Guideline Development by Studying Properties of Trial Eligible and Ineligible Populations: Development and Usability Study JO - J Med Internet Res SP - e52385 VL - 27 KW - multimorbidity KW - clinical practice guideline KW - gout KW - Trusted Research Environment KW - National Institute for Health and Care Excellence KW - Scottish Intercollegiate Guidelines Network KW - clinical practice KW - development KW - efficacy KW - validity KW - epidemiological data KW - epidemiology KW - epidemiological KW - digital tool KW - tool KW - age KW - gender KW - ethnicity KW - mortality KW - feedback KW - availability N2 - Background: Clinical guideline development preferentially relies on evidence from randomized controlled trials (RCTs). RCTs are gold-standard methods to evaluate the efficacy of treatments with the highest internal validity but limited external validity, in the sense that their findings may not always be applicable to or generalizable to clinical populations or population characteristics. The external validity of RCTs for the clinical population is constrained by the lack of tailored epidemiological data analysis designed for this purpose due to data governance, consistency of disease or condition definitions, and reduplicated effort in analysis code. 
Objective: This study aims to develop a digital tool that characterizes the overall population and differences between clinical trial eligible and ineligible populations from the clinical populations of a disease or condition regarding demography (eg, age, gender, ethnicity), comorbidity, coprescription, hospitalization, and mortality. Currently, the process is complex, onerous, and time-consuming, whereas a real-time tool may be used to rapidly inform a guideline developer's judgment about the applicability of evidence. Methods: The National Institute for Health and Care Excellence (particularly the gout guideline development group) and the Scottish Intercollegiate Guidelines Network guideline developers were consulted to gather their requirements and evidential data needs when developing guidelines. An R Shiny (R Foundation for Statistical Computing) tool was designed and developed using electronic primary health care data linked with hospitalization and mortality data built upon an optimized data architecture. Disclosure control mechanisms were built into the tool to ensure data confidentiality. The tool was deployed within a Trusted Research Environment, allowing only trusted preapproved researchers to conduct analysis. Results: The tool supports 128 chronic health conditions as index conditions and 161 conditions as comorbidities (33 in addition to the 128 index conditions). It enables 2 types of analyses via the graphic interface: overall population and stratified by user-defined eligibility criteria. The analyses produce an overview of statistical tables (eg, age, gender) of the index condition population and, within the overview groupings, produce details on, for example, electronic frailty index, comorbidities, and coprescriptions. The disclosure control mechanism is integral to the tool, limiting tabular counts to meet local governance needs. An exemplary result for gout as an index condition is presented to demonstrate the tool's functionality. 
Guideline developers from the National Institute for Health and Care Excellence and the Scottish Intercollegiate Guidelines Network provided positive feedback on the tool. Conclusions: The tool is a proof-of-concept, and the user feedback has demonstrated that this is a step toward computer-interpretable guideline development. Using the digital tool can potentially improve evidence-driven guideline development through the availability of real-world data in real time. UR - https://www.jmir.org/2025/1/e52385 UR - http://dx.doi.org/10.2196/52385 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/52385 ER - TY - JOUR AU - Oudbier, J. Susan AU - Smets, A. Ellen M. AU - Nieuwkerk, T. Pythia AU - Neal, P. David AU - Nurmohamed, Azam S. AU - Meij, J. Hans AU - Dusseljee-Peute, W. Linda PY - 2025/1/8 TI - Patients' Experienced Usability and Satisfaction With Digital Health Solutions in a Home Setting: Instrument Validation Study JO - JMIR Med Inform SP - e63703 VL - 13 KW - digital health solutions KW - questionnaire development KW - usability instruments KW - self-management KW - home setting KW - validation KW - reliability N2 - Background: The field of digital health solutions (DHS) has grown tremendously over the past years. DHS include tools for self-management, which support individuals to take charge of their own health. The usability of DHS, as experienced by patients, is pivotal to adoption. However, well-known questionnaires that evaluate usability and satisfaction use complex terminology derived from human-computer interaction and are therefore not well suited to assess experienced usability of patients using DHS in a home setting. Objective: This study aimed to develop, validate, and assess an instrument that measures experienced usability and satisfaction of patients using DHS in a home setting. Methods: The development of the "Experienced Usability and Satisfaction with Self-monitoring in the Home Setting" 
(GEMS) questionnaire followed several steps. Step I consisted of assessing the content validity, by conducting a literature review on current usability and satisfaction questionnaires, collecting statements and discussing these in an expert meeting, and translating each statement and adjusting it to the language level of the general population. This phase resulted in a draft version of the GEMS. Step II comprised assessing its face validity by pilot testing with Amsterdam University Medical Center's patient panel. In step III, psychometric analysis was conducted and the GEMS was assessed for reliability. Results: A total of 14 items were included for psychometric analysis and resulted in 4 reliable scales: convenience of use, perceived value, efficiency of use, and satisfaction. Conclusions: Overall, the GEMS questionnaire demonstrated its reliability and validity in assessing experienced usability and satisfaction of DHS in a home setting. Further refinement of the instrument is necessary to confirm its applicability in other patient populations in order to promote the development of a steering mechanism that can be applied longitudinally throughout implementation, and can be used as a benchmarking instrument. UR - https://medinform.jmir.org/2025/1/e63703 UR - http://dx.doi.org/10.2196/63703 ID - info:doi/10.2196/63703 ER - TY - JOUR AU - Zhao, Ziwei AU - Zhang, Weiyi AU - Chen, Xiaolan AU - Song, Fan AU - Gunasegaram, James AU - Huang, Wenyong AU - Shi, Danli AU - He, Mingguang AU - Liu, Na PY - 2024/12/30 TI - Slit Lamp Report Generation and Question Answering: Development and Validation of a Multimodal Transformer Model with Large Language Model Integration JO - J Med Internet Res SP - e54047 VL - 26 KW - large language model KW - slit lamp KW - medical report generation KW - question answering N2 - Background: Large language models have shown remarkable efficacy in various medical research and clinical applications. 
However, their skills in medical image recognition and subsequent report generation or question answering (QA) remain limited. Objective: We aim to finetune a multimodal, transformer-based model for generating medical reports from slit lamp images and develop a QA system using Llama2. We term this entire process slit lamp–GPT. Methods: Our research used a dataset of 25,051 slit lamp images from 3409 participants, paired with their corresponding physician-created medical reports. We used these data, split into training, validation, and test sets, to finetune the Bootstrapping Language-Image Pre-training framework toward report generation. The generated text reports and human-posed questions were then input into Llama2 for subsequent QA. We evaluated performance using qualitative metrics (including BLEU [bilingual evaluation understudy], CIDEr [consensus-based image description evaluation], ROUGE-L [Recall-Oriented Understudy for Gisting Evaluation–Longest Common Subsequence], SPICE [Semantic Propositional Image Caption Evaluation], accuracy, sensitivity, specificity, precision, and F1-score) and the subjective assessments of two experienced ophthalmologists on a 1-3 scale (1 referring to high quality). Results: We identified 50 conditions related to diseases or postoperative complications through keyword matching in initial reports. The refined slit lamp–GPT model demonstrated BLEU scores (1-4) of 0.67, 0.66, 0.65, and 0.65, respectively, with a CIDEr score of 3.24, a ROUGE (Recall-Oriented Understudy for Gisting Evaluation) score of 0.61, and a Semantic Propositional Image Caption Evaluation score of 0.37. The most frequently identified conditions were cataracts (22.95%), age-related cataracts (22.03%), and conjunctival concretion (13.13%). 
Disease classification metrics demonstrated an overall accuracy of 0.82 and an F1-score of 0.64, with high accuracies (≥0.9) observed for intraocular lens, conjunctivitis, and chronic conjunctivitis, and high F1-scores (≥0.9) observed for cataract and age-related cataract. For both report generation and QA components, the two evaluating ophthalmologists reached substantial agreement, with κ scores between 0.71 and 0.84. In assessing 100 generated reports, they awarded scores of 1.36 for both completeness and correctness; 64% (64/100) were considered "entirely good," and 93% (93/100) were "acceptable." In the evaluation of 300 generated answers to questions, the scores were 1.33 for completeness, 1.14 for correctness, and 1.15 for possible harm, with 66.3% (199/300) rated as "entirely good" and 91.3% (274/300) as "acceptable." Conclusions: This study introduces the slit lamp–GPT model for report generation and subsequent QA, highlighting the potential of large language models to assist ophthalmologists and patients. UR - https://www.jmir.org/2024/1/e54047 UR - http://dx.doi.org/10.2196/54047 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/54047 ER - TY - JOUR AU - Lin, Cheng-Fu AU - Chang, Pei-Jung AU - Chang, Hui-Min AU - Chen, Ching-Tsung AU - Hsu, Pi-Shan AU - Wu, Chieh-Liang AU - Lin, Shih-Yi PY - 2024/12/26 TI - Evaluation of a Telemonitoring System Using Electronic National Early Warning Scores for Patients Receiving Medical Home Care: Pilot Implementation Study JO - JMIR Med Inform SP - e63425 VL - 12 KW - aging in place KW - early warning score KW - home hospitalization KW - remote monitoring KW - telemonitoring N2 - Background: Telehealth programs and wearable sensors that enable patients to monitor their vital signs have expanded due to the COVID-19 pandemic. The electronic National Early Warning Score (e-NEWS) system helps identify and respond to acute illness. 
Objective: This study aimed to implement and evaluate a comprehensive telehealth system to monitor vital signs using e-NEWS for patients receiving integrated home-based medical care (iHBMC). The goal was to improve the early detection of patient deterioration and enhance care delivery in home settings. The system was deployed to optimize remote monitoring in iHBMC and reduce emergency visits and hospitalizations. Methods: The study was conducted at a medical center and its affiliated home health agency in central Taiwan from November 1, 2022, to October 31, 2023. Patients eligible for iHBMC were enrolled, and sensor data from devices such as blood pressure monitors, thermometers, and pulse oximeters were transmitted to a cloud-based server for e-NEWS calculations at least twice per day over a 2-week period. Patients with e-NEWSs up to 4 received nursing or physician recommendations and interventions based on abnormal physiological data, with reassessment occurring after 2 hours. Implementation (Results): A total of 28 participants were enrolled, with a median age of 84.5 (IQR 79.3–90.8) years, and 32% (n=9) were male. All participants had caregivers, with only 5 out of 28 (18%) able to make decisions independently. The system was implemented across one medical center and its affiliated home health agency. Of the 28 participants, 27 completed the study, while 1 exited early due to low blood pressure and shortness of breath. The median e-NEWS value was 4 (IQR 3–6), with 397 abnormal readings recorded. Of the remaining 27 participants, 8 participants had earlier home visits due to abnormal readings, 6 required hypertension medication adjustments, and 9 received advice on oxygen supplementation. Overall, 24 out of 28 (86%) participants reported being satisfied with the system. 
Conclusions: This study demonstrated the feasibility of implementing a telehealth system integrated with e-NEWS in iHBMC settings, potentially aiding in the early detection of clinical deterioration. Although caregivers receive training and resources for their tasks, the system may increase their workload, which could lead to higher stress levels. The small sample size, short monitoring duration, and regional focus in central Taiwan may further limit the applicability of the findings to areas with differing countries, regions, and health care infrastructures. Further research is required to confirm its impact. UR - https://medinform.jmir.org/2024/1/e63425 UR - http://dx.doi.org/10.2196/63425 ID - info:doi/10.2196/63425 ER - TY - JOUR AU - Iino, Haru AU - Kizaki, Hayato AU - Imai, Shungo AU - Hori, Satoko PY - 2024/12/23 TI - Identifying the Relative Importance of Factors Influencing Medication Compliance in General Patients Using Regularized Logistic Regression and LightGBM: Web-Based Survey Analysis JO - JMIR Form Res SP - e65882 VL - 8 KW - medication adherence KW - pharmacological management KW - medication compliance KW - Japan KW - drugs KW - dose KW - psychological KW - questionnaire survey KW - LightGBM KW - logistic regression model KW - regularization KW - machine learning KW - AI KW - artificial intelligence N2 - Background: Medication compliance, which refers to the extent to which patients correctly adhere to prescribed regimens, is influenced by various psychological, behavioral, and demographic factors. When analyzing these factors, challenges such as multicollinearity and variable selection often arise, complicating the interpretation of results. To address the issue of multicollinearity and better analyze the importance of each factor, machine learning methods are considered to be useful. Objective: This study aimed to identify key factors influencing medication compliance by applying regularized logistic regression and LightGBM. 
Methods: A questionnaire survey was conducted among 638 adult patients in Japan who had been continuously taking medications for at least 3 months. The survey collected data on demographics, medication habits, psychological adherence factors, and compliance. Logistic regression with regularization was used to handle multicollinearity, while LightGBM was used to calculate feature importance. Results: The regularized logistic regression model identified significant predictors, including "using the drug at approximately the same time each day" (coefficient 0.479; P=.02), "taking meals at approximately the same time each day" (coefficient 0.407; P=.02), and "I would like to have my medication reduced" (coefficient −0.410; P=.01). The top 5 variables with the highest feature importance scores in the LightGBM results were "Age" (feature importance 179.1), "Using the drug at approximately the same time each day" (feature importance 148.4), "Taking meals at approximately the same time each day" (feature importance 109.0), "I would like to have my medication reduced" (feature importance 77.48), and "I think I want to take my medicine" (feature importance 70.85). Additionally, the feature importance scores for the groups of medication adherence–related factors were 77.92 for lifestyle-related items, 52.04 for awareness of medication, 20.30 for relationships with health care professionals, and 5.05 for others. Conclusions: The most significant factors for medication compliance were the consistency of medication and meal timing (mean of feature importance), followed by the number of medications and patient attitudes toward their treatment. This study is the first to use a machine learning model to calculate and compare the relative importance of factors affecting medication adherence. Our findings demonstrate that, in terms of relative importance, lifestyle habits are the most significant contributors to medication compliance among the general patient population. 
The findings suggest that regularization and machine learning methods, such as LightGBM, are useful for better understanding the numerous adherence factors affected by multicollinearity. UR - https://formative.jmir.org/2024/1/e65882 UR - http://dx.doi.org/10.2196/65882 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/65882 ER - TY - JOUR AU - Hans, Patricius Felix AU - Kleinekort, Jan AU - Boerries, Melanie AU - Nieters, Alexandra AU - Kindle, Gerhard AU - Rautenberg, Micha AU - Bühler, Laura AU - Weiser, Gerda AU - Röttger, Clemens Michael AU - Neufischer, Carolin AU - Kühn, Matthias AU - Wehrle, Julius AU - Slagman, Anna AU - Fischer-Rosinsky, Antje AU - Eienbröker, Larissa AU - Hanses, Frank AU - Teepe, Wilhelm Gisbert AU - Busch, Hans-Jörg AU - Benning, Leo PY - 2024/12/17 TI - Information Mode–Dependent Success Rates of Obtaining German Medical Informatics Initiative–Compliant Broad Consent in the Emergency Department: Single-Center Prospective Observational Study JO - JMIR Med Inform SP - e65646 VL - 12 KW - biomedical research KW - delivery of health care KW - informed consent KW - medical informatics KW - digital health KW - emergency medical services KW - routinely collected health data KW - data science KW - secondary data analysis KW - data analysis KW - biomedical KW - emergency KW - Germany KW - Europe KW - prospective observational study KW - broad consent KW - inpatient stay KW - logistic regression analysis KW - health care delivery KW - inpatients N2 - Background: The broad consent (BC) developed by the German Medical Informatics Initiative is a pivotal national strategy for obtaining patient consent to use routinely collected data from electronic health records, insurance companies, contact information, and biomaterials for research. Emergency departments (EDs) are ideal for enrolling diverse patient populations in research activities. 
Despite regulatory and ethical challenges, obtaining BC from patients in ED with varying demographic, socioeconomic, and disease characteristics presents a promising opportunity to expand the availability of ED data. Objective: This study aimed to evaluate the success rate of obtaining BC through different consenting approaches in a tertiary ED and to explore factors influencing consent and dropout rates. Methods: A single-center prospective observational study was conducted in a German tertiary ED from September to December 2022. Every 30th patient was screened for eligibility. Eligible patients were informed via one of three modalities: (1) directly in the ED, (2) during their inpatient stay on the ward, or (3) via telephone after discharge. The primary outcome was the success rate of obtaining BC within 30 days of ED presentation. Secondary outcomes included analyzing potential influences on the success and dropout rates based on patient characteristics, information mode, and the interaction time required for patients to make an informed decision. Results: Of 11,842 ED visits, 419 patients were screened for BC eligibility, with 151 meeting the inclusion criteria. Of these, 68 (45%) consented to at least 1 BC module, while 24 (15.9%) refused participation. The dropout rate was 39.1% (n=59) and was highest in the telephone-based group (57/109, 52.3%) and lowest in the ED group (1/14, 7.1%). Patients informed face-to-face during their inpatient stay following the ED treatment had the highest consent rate (23/27, 85.2%), while those approached in the ED or by telephone had consent rates of 69.2% (9/13 and 36/52). Logistic regression analysis indicated that longer interaction time significantly improved consent rates (P=.03), while female sex was associated with higher dropout rates (P=.02). Age, triage category, billing details (inpatient treatment), or diagnosis did not significantly influence the primary outcome (all P>.05). 
Conclusions: Obtaining BC in an ED environment is feasible, enabling representative inclusion of ED populations. However, discharge from the ED and female sex negatively affected consent rates to the BC. Face-to-face interaction proved most effective, particularly for inpatients, while telephone-based approaches resulted in higher dropout rates despite comparable consent rates to direct consenting in the ED. The findings underscore the importance of tailored consent strategies and maintaining consenting staff in EDs and on the wards to enhance BC information delivery and consent processes for eligible patients. Trial Registration: German Clinical Trials Register DRKS00028753; https://drks.de/search/de/trial/DRKS00028753 UR - https://medinform.jmir.org/2024/1/e65646 UR - http://dx.doi.org/10.2196/65646 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/65646 ER - TY - JOUR AU - Silvey, Scott AU - Liu, Jinze PY - 2024/12/17 TI - Sample Size Requirements for Popular Classification Algorithms in Tabular Clinical Data: Empirical Study JO - J Med Internet Res SP - e60231 VL - 26 KW - medical informatics KW - machine learning KW - sample size KW - research design KW - decision trees KW - classification algorithm KW - clinical research KW - learning-curve analysis KW - analysis KW - analyses KW - guidelines KW - ML KW - decision making KW - algorithm KW - curve analysis KW - dataset N2 - Background: The performance of a classification algorithm eventually reaches a point of diminishing returns, where the additional sample added does not improve the results. Thus, there is a need to determine an optimal sample size that maximizes performance while accounting for computational burden or budgetary concerns. Objective: This study aimed to determine optimal sample sizes and the relationships between sample size and dataset-level characteristics over a variety of binary classification algorithms. 
Methods: A total of 16 large open-source datasets were collected, each containing a binary clinical outcome. Furthermore, 4 machine learning algorithms were assessed: XGBoost (XGB), random forest (RF), logistic regression (LR), and neural networks (NNs). For each dataset, the cross-validated area under the curve (AUC) was calculated at increasing sample sizes, and learning curves were fit. Sample sizes needed to reach the observed full-dataset AUC minus 2 points (0.02) were calculated from the fitted learning curves and compared across the datasets and algorithms. Dataset-level characteristics (minority class proportion, full-dataset AUC, number of features, type of features, and degree of nonlinearity) were examined. Negative binomial regression models were used to quantify relationships between these characteristics and expected sample sizes within each algorithm. A total of 4 multivariable models were constructed, which selected the best-fitting combination of dataset-level characteristics. Results: Among the 16 datasets (full-dataset sample sizes ranging from 70,000-1,000,000), median sample sizes were 9960 (XGB), 3404 (RF), 696 (LR), and 12,298 (NN) to reach AUC stability. For all 4 algorithms, more balanced classes (multiplier: 0.93-0.96 for a 1% increase in minority class proportion) were associated with decreased sample size. Other characteristics varied in importance across algorithms; in general, more features, weaker features, and more complex relationships between the predictors and the response increased expected sample sizes. In multivariable analysis, the top selected predictors were minority class proportion among all 4 algorithms assessed, full-dataset AUC (XGB, RF, and NN), and dataset nonlinearity (XGB, RF, and NN). For LR, the top predictors were minority class proportion, percentage of strong linear features, and number of features. 
Final multivariable sample size models had high goodness-of-fit, with dataset-level predictors explaining a majority (66.5%-84.5%) of the total deviance in the data among all 4 models. Conclusions: The sample sizes needed to reach AUC stability among 4 popular classification algorithms vary by dataset and method and are associated with dataset-level characteristics that can be influenced or estimated before the start of a research study. UR - https://www.jmir.org/2024/1/e60231 UR - http://dx.doi.org/10.2196/60231 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/60231 ER - TY - JOUR AU - Brice, N. Syaribah AU - Boutilier, J. Justin AU - Palmer, Geraint AU - Harper, R. Paul AU - Knight, Vincent AU - Tuson, Mark AU - Gartner, Daniel PY - 2024/12/13 TI - Close-Up on Ambulance Service Estimation in Indonesia: Monte Carlo Simulation Study JO - Interact J Med Res SP - e54240 VL - 13 KW - emergency medical services KW - ambulance services KW - hospital emergency services KW - Southeast Asian countries KW - low-and-middle-income countries KW - EMS KW - survey N2 - Background: Emergency medical services have a pivotal role in giving timely and appropriate responses to emergency events caused by medical, natural, or human-caused disasters. To provide adequate resources for the emergency services, such as ambulances, it is necessary to understand the demand for such services. In Indonesia, estimates of demand for emergency services cannot be obtained easily due to a lack of published literature or official reports concerning the matter. Objective: This study aimed to ascertain an estimate of the annual volume of hospital emergency visits and the corresponding demand for ambulance services in the city of Jakarta. Methods: In this study, we addressed the problem of emergency services demand estimation when aggregated detailed data are not available or are not part of the routine data collection. 
We used survey data together with the local Office of National Statistics reports and sample data from hospital emergency departments to establish parameter estimation. This involved estimating 4 parameters: the population of each area per period (day and night), the annual per capita hospital emergency visits, the probability of an emergency taking place in each period, and the rate of ambulance need per area. Monte Carlo simulation and naïve methods were used to generate an estimation for the mean ambulance needs per area in Jakarta. Results: The results estimated that the total annual ambulance need in Jakarta is between 83,000 and 241,000. Assuming the rate of ambulance usage in Jakarta at 9.3%, we estimated the total annual hospital emergency visits in Jakarta at around 0.9-2.6 million. The study also found that the estimation from using the simulation method was smaller than the average (naïve) methods (P<.001). Conclusions: The results provide an estimation of the annual emergency services needed for the city of Jakarta. In the absence of aggregated routinely collected data on emergency medical service usage in Jakarta, our results provide insights into whether the current emergency services, such as ambulances, have been adequately provided. 
UR - https://www.i-jmr.org/2024/1/e54240 UR - http://dx.doi.org/10.2196/54240 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/54240 ER - TY - JOUR AU - Walsh, Julia AU - Cave, Jonathan AU - Griffiths, Frances PY - 2024/12/11 TI - Combining Topic Modeling, Sentiment Analysis, and Corpus Linguistics to Analyze Unstructured Web-Based Patient Experience Data: Case Study of Modafinil Experiences JO - J Med Internet Res SP - e54321 VL - 26 KW - unstructured text KW - natural language processing KW - NLP KW - topic modeling KW - sentiment analysis KW - corpus linguistics KW - social media data KW - patient experience KW - unsupervised KW - modafinil N2 - Background: Patient experience data from social media offer patient-centered perspectives on disease, treatments, and health service delivery. Current guidelines typically rely on systematic reviews, while qualitative health studies are often seen as anecdotal and nongeneralizable. This study explores combining personal health experiences from multiple sources to create generalizable evidence. Objective: The study aims to (1) investigate how combining unsupervised natural language processing (NLP) and corpus linguistics can explore patient perspectives from a large unstructured dataset of modafinil experiences, (2) compare findings with Cochrane meta-analyses on modafinil's effectiveness, and (3) develop a methodology for analyzing such data. Methods: Using 69,022 posts from 790 sources, we used a variety of NLP and corpus techniques to analyze the data, including data cleaning techniques to maximize post context, Python for NLP techniques, and Sketch Engine for linguistic analysis. We used multiple topic mining approaches, such as latent Dirichlet allocation, nonnegative matrix factorization, and word-embedding methods. Sentiment analysis used TextBlob and Valence Aware Dictionary and Sentiment Reasoner, while corpus methods included collocation, concordance, and n-gram generation. 
Previous work had mapped topic mining to themes, such as health conditions, reasons for taking modafinil, symptom impacts, dosage, side effects, effectiveness, and treatment comparisons. Results: Key findings of the study included modafinil use across 166 health conditions, most frequently narcolepsy, multiple sclerosis, attention-deficit disorder, anxiety, sleep apnea, depression, bipolar disorder, chronic fatigue syndrome, fibromyalgia, and chronic disease. Word-embedding topic modeling mapped 70% of posts to predefined themes, while sentiment analysis revealed 65% positive responses, 6% neutral responses, and 28% negative responses. Notably, the perceived effectiveness of modafinil for various conditions strongly contrasts with the findings of existing randomized controlled trials and systematic reviews, which conclude insufficient or low-quality evidence of effectiveness. Conclusions: This study demonstrated the value of combining NLP with linguistic techniques for analyzing large unstructured text datasets. Despite varying opinions, findings were methodologically consistent and challenged existing clinical evidence. This suggests that patient-generated data could potentially provide valuable insights into treatment outcomes, potentially improving clinical understanding and patient care. UR - https://www.jmir.org/2024/1/e54321 UR - http://dx.doi.org/10.2196/54321 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/54321 ER - TY - JOUR AU - Helminski, Danielle AU - Sussman, B. Jeremy AU - Pfeiffer, N. Paul AU - Kokaly, N. Alex AU - Ranusch, Allison AU - Renji, Deep Anjana AU - Damschroder, J. Laura AU - Landis-Lewis, Zach AU - Kurlander, E. 
Jacob PY - 2024/12/10 TI - Development, Implementation, and Evaluation Methods for Dashboards in Health Care: Scoping Review JO - JMIR Med Inform SP - e59828 VL - 12 KW - dashboard KW - medical informatics KW - quality improvement KW - electronic health record KW - scoping review KW - monitoring KW - health care system KW - patient care KW - clinical research KW - emergency department KW - inpatient KW - clinical management N2 - Background: Dashboards have become ubiquitous in health care settings, but to achieve their goals, they must be developed, implemented, and evaluated using methods that help ensure they meet the needs of end users and are suited to the barriers and facilitators of the local context. Objective: This scoping review aimed to explore published literature on health care dashboards to characterize the methods used to identify factors affecting uptake, strategies used to increase dashboard uptake, and evaluation methods, as well as dashboard characteristics and context. Methods: MEDLINE, Embase, Web of Science, and the Cochrane Library were searched from inception through July 2020. Studies were included if they described the development or evaluation of a health care dashboard with publication from 2018-2020. Clinical setting, purpose (categorized as clinical, administrative, or both), end user, design characteristics, methods used to identify factors affecting uptake, strategies to increase uptake, and evaluation methods were extracted. Results: From 116 publications, we extracted data for 118 dashboards. Inpatient (45/118, 38.1%) and outpatient (42/118, 35.6%) settings were most common. Most dashboards had ≥2 stated purposes (84/118, 71.2%); of these, 54 of 118 (45.8%) were administrative, 43 of 118 (36.4%) were clinical, and 20 of 118 (16.9%) had both purposes. Most dashboards included frontline clinical staff as end users (97/118, 82.2%). 
To identify factors affecting dashboard uptake, half involved end users in the design process (59/118, 50%); fewer described formative usability testing (26/118, 22%) or use of any theory or framework to guide development, implementation, or evaluation (24/118, 20.3%). The most common strategies used to increase uptake included education (60/118, 50.8%); audit and feedback (59/118, 50%); and advisory boards (54/118, 45.8%). Evaluations of dashboards (84/118, 71.2%) were mostly quantitative (60/118, 50.8%), with fewer using only qualitative methods (6/118, 5.1%) or a combination of quantitative and qualitative methods (18/118, 15.2%). Conclusions: Most dashboards forgo steps during development to ensure they suit the needs of end users and the clinical context; qualitative evaluation, which can provide insight into ways to improve dashboard effectiveness, is uncommon. Education and audit and feedback are frequently used to increase uptake. These findings illustrate the need for promulgation of best practices in dashboard development and will be useful to dashboard planners. International Registered Report Identifier (IRRID): RR2-10.2196/34894 UR - https://medinform.jmir.org/2024/1/e59828 UR - http://dx.doi.org/10.2196/59828 ID - info:doi/10.2196/59828 ER - TY - JOUR AU - AboArab, A. Mohammed AU - Potsika, T. Vassiliki AU - Theodorou, Alexis AU - Vagena, Sylvia AU - Gravanis, Miltiadis AU - Sigala, Fragiska AU - Fotiadis, I. 
Dimitrios PY - 2024/12/9 TI - Advancing Progressive Web Applications to Leverage Medical Imaging for Visualization of Digital Imaging and Communications in Medicine and Multiplanar Reconstruction: Software Development and Validation Study JO - JMIR Med Inform SP - e63834 VL - 12 KW - medical image visualization KW - peripheral artery computed tomography imaging KW - multiplanar reconstruction KW - progressive web applications N2 - Background: In medical imaging, 3D visualization is vital for displaying volumetric organs, enhancing diagnosis and analysis. Multiplanar reconstruction (MPR) improves visual and diagnostic capabilities by transforming 2D images from computed tomography (CT) and magnetic resonance imaging into 3D representations. Web-based Digital Imaging and Communications in Medicine (DICOM) viewers integrated into picture archiving and communication systems facilitate access to pictures and interaction with remote data. However, the adoption of progressive web applications (PWAs) for web-based DICOM and MPR visualization remains limited. This paper addresses this gap by leveraging PWAs for their offline access and enhanced performance. Objective: This study aims to evaluate the integration of DICOM and MPR visualization into the web using PWAs, addressing challenges related to cross-platform compatibility, integration capabilities, and high-resolution image reconstruction for medical image visualization. Methods: Our paper introduces a PWA that uses a modular design for enhancing DICOM and MPR visualization in web-based medical imaging. By integrating React.js and Cornerstone.js, the application offers seamless DICOM image processing, ensures cross-browser compatibility, and delivers a responsive user experience across multiple devices. It uses advanced interpolation techniques to make volume reconstructions more accurate. 
This makes MPR analysis and visualization better in a web environment, thus promising a substantial advance in medical imaging analysis. Results: In our approach, the performance of DICOM- and MPR-based PWAs for medical image visualization and reconstruction was evaluated through comprehensive experiments. The application excelled in terms of loading time and volume reconstruction, particularly in Google Chrome, whereas Firefox showed superior performance in viewing slices. This study uses a dataset comprising 22 CT scans of peripheral artery patients to demonstrate the application's robust performance, with Google Chrome outperforming other browsers in both the local area network and wide area network settings. In addition, the application's accuracy in MPR reconstructions was validated with an error margin of <0.05 mm and outperformed the state-of-the-art methods by 84% to 98% in loading and volume rendering time. Conclusions: This paper highlights advancements in DICOM and MPR visualization using PWAs, addressing the gaps in web-based medical imaging. By exploiting PWA features such as offline access and improved performance, we have significantly advanced medical imaging technology, focusing on cross-platform compatibility, integration efficiency, and speed. Our application outperforms existing platforms for handling complex MPR analyses and accurate analysis of medical imaging as validated through peripheral artery CT imaging. UR - https://medinform.jmir.org/2024/1/e63834 UR - http://dx.doi.org/10.2196/63834 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/63834 ER - TY - JOUR AU - Sugiura, Ayaka AU - Saegusa, Satoshi AU - Jin, Yingzi AU - Yoshimoto, Riki AU - Smith, D. 
Nicholas AU - Dohi, Koji AU - Higuchi, Tadashi AU - Kozu, Tomotake PY - 2024/12/9 TI - Evaluation of RMES, an Automated Software Tool Utilizing AI, for Literature Screening with Reference to Published Systematic Reviews as Case-Studies: Development and Usability Study JO - JMIR Form Res SP - e55827 VL - 8 KW - artificial intelligence KW - automated literature screening KW - natural language processing KW - randomized controlled trials KW - Rapid Medical Evidence Synthesis KW - RMES KW - systematic reviews KW - text mining N2 - Background: Systematic reviews and meta-analyses are important to evidence-based medicine, but the information retrieval and literature screening procedures are burdensome tasks. Rapid Medical Evidence Synthesis (RMES; Deloitte Tohmatsu Risk Advisory LLC) is a software designed to support information retrieval, literature screening, and data extraction for evidence-based medicine. Objective: This study aimed to evaluate the accuracy of RMES for literature screening with reference to published systematic reviews. Methods: We used RMES to automatically screen the titles and abstracts of PubMed-indexed articles included in 12 systematic reviews across 6 medical fields, by applying 4 filters: (1) study type; (2) study type + disease; (3) study type + intervention; and (4) study type + disease + intervention. We determined the numbers of articles correctly included by each filter relative to those included by the authors of each systematic review. Only PubMed-indexed articles were assessed. Results: Across the 12 reviews, the number of articles analyzed by RMES ranged from 46 to 5612. The number of PubMed-cited articles included in the reviews ranged from 4 to 47. The median (range) percentage of articles correctly labeled by RMES using filters 1-4 were: 80.9% (57.1%-100%), 65.2% (34.1%-81.8%), 70.5% (0%-100%), and 58.6% (0%-81.8%), respectively. 
Conclusions: This study demonstrated good performance and accuracy of RMES for the initial screening of the titles and abstracts of articles for use in systematic reviews. RMES has the potential to reduce the workload involved in the initial screening of published studies. UR - https://formative.jmir.org/2024/1/e55827 UR - http://dx.doi.org/10.2196/55827 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/55827 ER - TY - JOUR AU - Grechuta, Klaudia AU - Shokouh, Pedram AU - Alhussein, Ahmad AU - Müller-Wieland, Dirk AU - Meyerhoff, Juliane AU - Gilbert, Jeremy AU - Purushotham, Sneha AU - Rolland, Catherine PY - 2024/11/27 TI - Benefits of Clinical Decision Support Systems for the Management of Noncommunicable Chronic Diseases: Targeted Literature Review JO - Interact J Med Res SP - e58036 VL - 13 KW - clinical decision support system KW - digital health KW - chronic disease management KW - electronic health records KW - noncommunicable diseases KW - targeted literature review KW - mobile phone N2 - Background: Clinical decision support systems (CDSSs) are designed to assist in health care delivery by supporting medical practice with clinical knowledge, patient information, and other relevant types of health information. CDSSs are integral parts of health care technologies assisting in disease management, including diagnosis, treatment, and monitoring. While electronic medical records (EMRs) serve as data repositories, CDSSs are used to assist clinicians in providing personalized, context-specific recommendations derived by comparing individual patient data to evidence-based guidelines. Objective: This targeted literature review (TLR) aimed to identify characteristics and features of both stand-alone and EMR-integrated CDSSs that influence their outcomes and benefits based on published scientific literature. Methods: A TLR was conducted using the Embase, MEDLINE, and Cochrane databases to identify data on CDSSs published in a 10-year frame (2012-2022). 
Studies on computerized, guideline-based CDSSs used by health care practitioners with a focus on chronic disease areas and reporting outcomes for CDSS utilization were eligible for inclusion. Results: A total of 49 publications were included in the TLR. Studies predominantly reported on EMR-integrated CDSSs (ie, connected to an EMR database; n=32, 65%). The implementation of CDSSs varied globally, with substantial utilization in the United States and within the domain of cardio-renal-metabolic diseases. CDSSs were found to positively impact "quality assurance" (n=35, 69%) and provide "clinical benefits" (n=20, 41%), compared to usual care. Among CDSS features, treatment guidance and flagging were consistently reported as the most frequent elements for enhancing health care, followed by risk level estimation, diagnosis, education, and data export. The effectiveness of a CDSS was evaluated most frequently in primary care settings (n=34, 69%) across cardio-renal-metabolic disease areas (n=32, 65%), especially in diabetes (n=13, 26%). Studies reported CDSSs to be commonly used by a mixed group (n=27, 55%) of users including physicians, specialists, nurses or nurse practitioners, and allied health care professionals. Conclusions: Overall, both EMR-integrated and stand-alone CDSSs showed positive results, suggesting their benefits to health care providers and potential for successful adoption. Flagging and treatment recommendation features were commonly used in CDSSs to improve patient care; other features such as risk level estimation, diagnosis, education, and data export were tailored to specific requirements and collectively contributed to the effectiveness of health care delivery. 
While this TLR demonstrated that both stand-alone and EMR-integrated CDSSs were successful in achieving clinical outcomes, the heterogeneity of included studies reflects the evolving nature of this research area, underscoring the need for further longitudinal studies to elucidate aspects that may impact their adoption in real-world scenarios. UR - https://www.i-jmr.org/2024/1/e58036 UR - http://dx.doi.org/10.2196/58036 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/58036 ER - TY - JOUR AU - Meng, Jian AU - Niu, Xiaoyu AU - Luo, Can AU - Chen, Yueyue AU - Li, Qiao AU - Wei, Dongmei PY - 2024/11/22 TI - Development and Validation of a Machine Learning–Based Early Warning Model for Lichenoid Vulvar Disease: Prediction Model Development Study JO - J Med Internet Res SP - e55734 VL - 26 KW - female KW - lichenoid vulvar disease KW - risk factors KW - evidence-based medicine KW - early warning model N2 - Background: Given the complexity and diversity of lichenoid vulvar disease (LVD) risk factors, it is crucial to actively explore these factors and construct personalized warning models using relevant clinical variables to assess disease risk in patients. Yet, to date, there has been insufficient research, both nationwide and internationally, on risk factors and warning models for LVD. In light of these gaps, this study represents the first systematic exploration of the risk factors associated with LVD. Objective: The risk factors of LVD in women were explored and a medically evidence-based warning model was constructed to provide an early alert tool for the high-risk target population. The model can be applied in the clinic to identify high-risk patients and evaluate its accuracy and practicality in predicting LVD in women. Simultaneously, it can also enhance the diagnostic and treatment proficiency of medical personnel in primary community health service centers, which is of great significance in reducing overall health care spending and disease burden. 
Methods: A total of 2990 patients who attended West China Second Hospital of Sichuan University from January 2013 to December 2017 were selected as the study candidates and were divided into 1218 cases in the normal vulvovagina group (group 0) and 1772 cases in the lichenoid vulvar disease group (group 1) according to the results of the case examination. We investigated and collected routine examination data from patients for intergroup comparisons, included factors with significant differences in multifactorial analysis, and constructed logistic regression, random forests, gradient boosting machine (GBM), AdaBoost, eXtreme Gradient Boosting, and Categorical Boosting analysis models. The predictive efficacy of these six models was evaluated using receiver operating characteristic curve and area under the curve. Results: Univariate analysis revealed that vaginitis, urinary incontinence, humidity of the long-term residential environment, spicy dietary habits, regular intake of coffee or caffeinated beverages, daily sleep duration, diabetes mellitus, smoking history, presence of autoimmune diseases, menopausal status, and hypertension were all significant risk factors affecting female LVD. Furthermore, the area under the receiver operating characteristic curve, accuracy, sensitivity, and F1-score of the GBM warning model were notably higher than the other 5 predictive analysis models. The GBM analysis model indicated that menopausal status had the strongest impact on female LVD, showing a positive correlation, followed by the presence of autoimmune diseases, which also displayed a positive dependency. Conclusions: In accordance with evidence-based medicine, the construction of a predictive warning model for female LVD can be used to identify high-risk populations at an early stage, aiding in the formulation of effective preventive measures, which is of paramount importance for reducing the incidence of LVD in women. 
UR - https://www.jmir.org/2024/1/e55734 UR - http://dx.doi.org/10.2196/55734 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/55734 ER - TY - JOUR AU - Cho, Na Ha AU - Jun, Joon Tae AU - Kim, Young-Hak AU - Kang, Heejun AU - Ahn, Imjin AU - Gwon, Hansle AU - Kim, Yunha AU - Seo, Jiahn AU - Choi, Heejung AU - Kim, Minkyoung AU - Han, Jiye AU - Kee, Gaeun AU - Park, Seohyun AU - Ko, Soyoung PY - 2024/11/18 TI - Task-Specific Transformer-Based Language Models in Health Care: Scoping Review JO - JMIR Med Inform SP - e49724 VL - 12 KW - transformer-based language models KW - medicine KW - health care KW - medical language model N2 - Background: Transformer-based language models have shown great potential to revolutionize health care by advancing clinical decision support, patient interaction, and disease prediction. However, despite their rapid development, the implementation of transformer-based language models in health care settings remains limited. This is partly due to the lack of a comprehensive review, which hinders a systematic understanding of their applications and limitations. Without clear guidelines and consolidated information, both researchers and physicians face difficulties in using these models effectively, resulting in inefficient research efforts and slow integration into clinical workflows. Objective: This scoping review addresses this gap by examining studies on medical transformer-based language models and categorizing them into 6 tasks: dialogue generation, question answering, summarization, text classification, sentiment analysis, and named entity recognition. Methods: We conducted a scoping review following the Cochrane scoping review protocol. A comprehensive literature search was performed across databases, including Google Scholar and PubMed, covering publications from January 2017 to September 2024. Studies involving transformer-derived models in medical tasks were included. Data were categorized into 6 key tasks. 
Results: Our key findings revealed both advancements and critical challenges in applying transformer-based models to health care tasks. For example, models like MedPIR involving dialogue generation show promise but face privacy and ethical concerns, while question-answering models like BioBERT improve accuracy but struggle with the complexity of medical terminology. The BioBERTSum summarization model aids clinicians by condensing medical texts but needs better handling of long sequences. Conclusions: This review attempted to provide a consolidated understanding of the role of transformer-based language models in health care and to guide future research directions. By addressing current challenges and exploring the potential for real-world applications, we envision significant improvements in health care informatics. Addressing the identified challenges and implementing proposed solutions can enable transformer-based language models to significantly improve health care delivery and patient outcomes. Our review provides valuable insights for future research and practical applications, setting the stage for transformative advancements in medical informatics. UR - https://medinform.jmir.org/2024/1/e49724 UR - http://dx.doi.org/10.2196/49724 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/49724 ER - TY - JOUR AU - Bogale, Binyam AU - Vesinurm, Märt AU - Lillrank, Paul AU - Celius, Gulowsen Elisabeth AU - Halvorsrud, Ragnhild PY - 2024/11/15 TI - Visual Modeling Languages in Patient Pathways: Scoping Review JO - Interact J Med Res SP - e55865 VL - 13 KW - patient pathways KW - visual modeling languages KW - business process model and notation KW - BPMN KW - unified modeling language KW - UML KW - domain-specific modeling languages KW - scoping review N2 - Background: Patient pathways (PPs) are presented as a panacea solution to enhance health system functions. It is a complex concept that needs to be described and communicated well. 
Modeling plays a crucial role in promoting communication, fostering a shared understanding, and streamlining processes. Only a few existing systematic reviews have focused on modeling methods and standardized modeling languages. There remains a gap in consolidated knowledge regarding the use of diverse visual modeling languages. Objective: This scoping review aimed to compile visual modeling languages used to represent PPs, including the justifications and the context in which a modeling language was adopted, adapted, combined, or developed. Methods: After initial experimentation with the keywords used to describe the concepts of PPs and visual modeling languages, we developed a search strategy that was further refined and customized to the major databases identified as topically relevant. In addition, we consulted gray literature and conducted hand searches of the referenced articles. Two reviewers independently screened the articles in 2 stages using preset inclusion criteria, and a third reviewer voted on the discordance. Data charting was done using an iteratively developed form in the Covidence software. Descriptive and thematic summaries were presented following rounds of discussion to produce the final report. Results: Of 1838 articles retrieved after deduplication, 22 satisfied our inclusion criteria. Clinical pathway is the most used phrase to represent the PP concept, and most papers discussed the concept without providing their operational definition. We categorized the visual modeling languages into five categories: (1) general-purpose modeling language (GPML) adopted without major extension or modification, (2) GPML used with formal extension recommendations, (3) combination of 2 or more modeling languages, (4) a developed domain-specific modeling language (DSML), and (5) ontological modeling languages. 
The justifications for adopting, adapting, combining, and developing visual modeling languages varied accordingly and ranged from versatility, expressiveness, tool support, and extensibility of a language to domain needs, integration, and simplification. Conclusions: Various visual modeling languages were used in PP modeling, each with varying levels of abstraction and granularity. The categorization we made could aid in a better understanding of the complex combination of PP and modeling languages. Standardized GPMLs were used with or without any modifications. The rationale to propose any modification to GPMLs evolved as more evidence was presented following requirement analyses to support domain constructs. DSMLs are infrequently used due to their resource-intensive development, often initiated at a project level. The justifications provided and the context where DSMLs were created are paramount. Future studies should assess the merits and demerits of using a visual modeling language to facilitate PP communications among stakeholders and use evaluation frameworks to identify, modify, or develop them, depending on the scope and goal of the modeling need. UR - https://www.i-jmr.org/2024/1/e55865 UR - http://dx.doi.org/10.2196/55865 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/55865 ER - TY - JOUR AU - Drummond, David AU - Gonsard, Apolline PY - 2024/11/13 TI - Definitions and Characteristics of Patient Digital Twins Being Developed for Clinical Use: Scoping Review JO - J Med Internet Res SP - e58504 VL - 26 KW - patient simulation KW - cyber-physical systems KW - telemonitoring KW - personalized medicine KW - precision medicine KW - digital twin N2 - Background: The concept of digital twins, widely adopted in industry, is entering health care. However, there is a lack of consensus on what constitutes the digital twin of a patient. 
Objective: The objective of this scoping review was to analyze definitions and characteristics of patient digital twins being developed for clinical use, as reported in the scientific literature. Methods: We searched PubMed, Scopus, Embase, IEEE, and Google Scholar for studies claiming digital twin development or evaluation until August 2023. Data on definitions, characteristics, and development phase were extracted. Unsupervised classification of claimed digital twins was performed. Results: We identified 86 papers representing 80 unique claimed digital twins, with 98% (78/80) in preclinical phases. Among the 55 papers defining "digital twin," 76% (42/55) described a digital replica, 42% (23/55) mentioned real-time updates, 24% (13/55) emphasized patient specificity, and 15% (8/55) included 2-way communication. Among claimed digital twins, 60% (48/80) represented specific organs (primarily heart: 15/48, 31%; bones or joints: 10/48, 21%; lung: 6/48, 12%; and arteries: 5/48, 10%); 14% (11/80) embodied biological systems such as the immune system; and 26% (21/80) corresponded to other products (prediction models, etc). The patient data used to develop and run the claimed digital twins encompassed medical imaging examinations (35/80, 44% of publications), clinical notes (15/80, 19% of publications), laboratory test results (13/80, 16% of publications), wearable device data (12/80, 15% of publications), and other modalities (32/80, 40% of publications). Regarding data flow between patients and their virtual counterparts, 16% (13/80) claimed that digital twins involved no flow from patient to digital twin, 73% (58/80) used 1-way flow from patient to digital twin, and 11% (9/80) enabled 2-way data flow between patient and digital twin. 
Based on these characteristics, unsupervised classification revealed 3 clusters: simulation patient digital twins in 54% (43/80) of publications, monitoring patient digital twins in 28% (22/80) of publications, and research-oriented models unlinked to specific patients in 19% (15/80) of publications. Simulation patient digital twins used computational modeling for personalized predictions and therapy evaluations, mostly for one-time assessments, and monitoring digital twins harnessed aggregated patient data for continuous risk or outcome forecasting and care optimization. Conclusions: We propose defining a patient digital twin as "a viewable digital replica of a patient, organ, or biological system that contains multidimensional, patient-specific information and informs decisions" and to distinguish simulation and monitoring digital twins. These proposed definitions and subtypes offer a framework to guide research into realizing the potential of these personalized, integrative technologies to advance clinical care. UR - https://www.jmir.org/2024/1/e58504 UR - http://dx.doi.org/10.2196/58504 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/58504 ER - TY - JOUR AU - Liao, Wenmin AU - He, Rong AU - He, Zhonglian AU - Shi, Nan AU - Li, Dan AU - Zhuang, Aihua AU - Gan, Feng AU - Sun, Ying AU - Li, Chaofeng PY - 2024/11/12 TI - Influence of Blood Sampling Service Process Reengineering on Medical Services Supply: Quasi-Experimental Study JO - J Med Internet Res SP - e51412 VL - 26 KW - process reengineering KW - blood sampling KW - hospital administration KW - medical informatics KW - digital health KW - patient experience N2 - Background: Tertiary hospitals in China are confronted with significant challenges due to limited spatial capacity and workforce constraints, leading to saturated allocation of medical resources and restricted growth in medical service provision. 
The incorporation of digital health into medical service process reengineering (MSPR) marks a pivotal transformation and restructuring of conventional health service delivery models. Specifically, the application of MSPR to blood sampling services processes reengineering (BSSPR) holds promise for substantially enhancing the efficiency and quality of medical services through streamlining and optimizing these procedures. However, the comprehensive impact of BSSPR has been infrequently quantified in existing research. Objective: This study aims to investigate the influence of BSSPR on the efficiency and quality of medical services and to elucidate the key informative technological support points underpinning BSSPR. Methods: Data were collected from both the new and old laboratory information systems from August 1, 2019, to December 31, 2021. A combination of statistical description, chi-square test, and t test was used to compare check-in time and waiting time of outpatients before and after the implementation of BSSPR. An interrupted time-series design was used to analyze the impact of BSSPR on medical service efficiency and quality, enabling the control of confounding variables, including changes in medical human resources and both long- and short-term temporal trends. Results: BSSPR had an impact on the efficiency and quality of medical services. Notably, there was a significant increase in the number of patients receiving blood sampling services, with a daily service volume increase of ~150 individuals (P=.04). The average waiting time for patients decreased substantially from 29 (SD 36) to 11 (SD 11) minutes, indicating a marked improvement in patient experience. During the peak period, the number of patients receiving blood sampling services per working hour statistically increased from 9.56 to 16.77 (P<.001). The interrupted time-series model results demonstrated a reduction in patients' 
waiting time by an average of 26.1 (SD 3.8; 95% CI -33.64 to -18.57) minutes. Although there was an initial decline in the number of outpatients admitted following BSSPR implementation, an upward trend was observed over time (β=1.13, 95% CI 0.91-1.36). Conclusions: BSSPR implementation for outpatients not only reduced waiting time and improved patients' experience but also augmented the hospital's capacity to provide medical services. This study's findings offer valuable insights into the potential advantages of BSSPR and underscore the significance of harnessing digital technologies to optimize medical service processes. This research serves as a foundational basis and provides scientific support for the promotion and application of BSSPR in other health care contexts. By continuing to explore and refine the integration of digital technologies in health care, we can further enhance patient outcomes and elevate the overall quality of medical services. UR - https://www.jmir.org/2024/1/e51412 UR - http://dx.doi.org/10.2196/51412 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/51412 ER - TY - JOUR AU - Cho, Jaeso AU - Han, Yeon Ji AU - Cho, Anna AU - Yoo, Sooyoung AU - Lee, Ho-Young AU - Kim, Hunmin PY - 2024/11/8 TI - Enhancing Clinical History Taking Through the Implementation of a Streamlined Electronic Questionnaire System at a Pediatric Headache Clinic: Development and Evaluation Study JO - JMIR Med Inform SP - e54415 VL - 12 KW - electronic questionnaire system KW - electronic questionnaire KW - history taking KW - medical history KW - headache KW - migraine KW - neuralgia KW - pediatric KW - paediatric KW - infant KW - neonatal KW - toddler KW - child KW - youth KW - adolescent N2 - Background: Accurate history taking is essential for diagnosis, treatment, and patient care, yet miscommunications and time constraints often lead to incomplete information. 
Consequently, there has been a pressing need to establish a system whereby the questionnaire is duly completed before the medical appointment, entered into the electronic health record (EHR), and stored in a structured format within a database. Objective: This study aimed to develop and evaluate a streamlined electronic questionnaire system, BEST-Survey (Bundang Hospital Electronic System for Total Care-Survey), integrated with the EHR, to enhance history taking and data management for patients with pediatric headaches. Methods: An electronic questionnaire system was developed at Seoul National University Bundang Hospital, allowing patients to complete previsit questionnaires on a tablet PC. The information is automatically integrated into the EHR and stored in a structured database for further analysis. A retrospective analysis compared clinical information acquired from patients aged <18 years visiting the pediatric neurology outpatient clinic for headaches, before and after implementing the BEST-Survey system. The study included 365 patients before and 452 patients after system implementation. Answer rates and positive rates of key headache characteristics were compared between the 2 groups to evaluate the system's clinical utility. Results: Implementation of the BEST-Survey system significantly increased the mean data acquisition rate from 54.6% to 99.3% (P<.001). Essential clinical features such as onset, location, duration, severity, nature, and frequency were obtained in over 98.7% (>446/452) of patients after implementation, compared with a range of 53.7% (196/365) to 85.2% (311/365) before. The electronic system facilitated comprehensive data collection, enabling detailed analysis of headache characteristics in the patient population. Most patients (280/452, 61.9%) reported headache onset less than 1 year prior, with the temporal region being the most common pain location (261/703, 37.1%). 
Over half (232/452, 51.3%) experienced headaches lasting less than 2 hours, with nausea and vomiting as the most commonly associated symptoms (231/1036, 22.3%). Conclusions: The BEST-Survey system markedly improved the completeness and accuracy of essential history items for patients with pediatric headaches. The system also streamlined data extraction and analysis for clinical and research purposes. While the electronic questionnaire cannot replace physician-led history taking, it serves as a valuable adjunctive tool to enhance patient care. UR - https://medinform.jmir.org/2024/1/e54415 UR - http://dx.doi.org/10.2196/54415 ID - info:doi/10.2196/54415 ER - TY - JOUR AU - An, Jinghui AU - Shi, Fengwu AU - Wang, Huajun AU - Zhang, Hang AU - Liu, Su PY - 2024/11/8 TI - Evaluating the Sensitivity of Wearable Devices in Posttranscatheter Aortic Valve Implantation Functional Assessment JO - JMIR Mhealth Uhealth SP - e65277 VL - 12 KW - aortic valve KW - implantation functional KW - wearable devices UR - https://mhealth.jmir.org/2024/1/e65277 UR - http://dx.doi.org/10.2196/65277 ID - info:doi/10.2196/65277 ER - TY - JOUR AU - Ashraf, Reza Amir AU - Mackey, Ken Tim AU - Vida, György Róbert AU - Kulcsár, Győző AU - Schmidt, János AU - Balázs, Orsolya AU - Domián, Márk Bálint AU - Li, Jiawei AU - Csákó, Ibolya AU - Fittler, András PY - 2024/11/7 TI - Multifactor Quality and Safety Analysis of Semaglutide Products Sold by Online Sellers Without a Prescription: Market Surveillance, Content Analysis, and Product Purchase Evaluation Study JO - J Med Internet Res SP - e65440 VL - 26 KW - semaglutide KW - Ozempic KW - Wegovy KW - search engines KW - online pharmacies KW - patient safety KW - medication safety KW - nondelivery schemes KW - counterfeit KW - substandard and falsified medical products N2 - Background: Over the past 4 decades, obesity has escalated into a global epidemic, with its worldwide prevalence nearly tripling. 
Pharmacological treatments have evolved with the recent development of glucagon-like peptide 1 agonists, such as semaglutide. However, off-label use of drugs such as Ozempic for cosmetic weight loss has surged in popularity, raising concerns about potential misuse and the emergence of substandard and falsified products in the unregulated supply chain. Objective: This study aims to conduct a multifactor investigation of product quality and patient safety risks associated with the unregulated online sale of semaglutide by examining product availability and vendor characteristics and assessing product quality through test purchases. Methods: We used a complex risk and quality assessment methodology combining online market surveillance, search engine results page analysis, website content assessment, domain traffic analytics, conducting targeted product test purchases, visual quality inspection of product packaging, microbiological sterility and endotoxin contamination evaluation, and quantitative sample analysis using liquid chromatography coupled with mass spectrometry. Results: We collected and evaluated 1080 links from search engine results pages and identified 317 (29.35%) links belonging to online pharmacies, of which 183 (57.7%) led to legal pharmacies and 134 (42.3%) directed users to 59 unique illegal online pharmacy websites. Web traffic data for the period between July and September 2023 revealed that the top 30 domains directly or indirectly affiliated with illegal online pharmacies accumulated over 4.7 million visits. Test purchases were completed from 6 illegal online pharmacies with the highest number of links offering semaglutide products for sale without prescription at the lowest price range. Three injection vial purchases were delivered; none of the 3 Ozempic prefilled injection pens were received due to nondelivery e-commerce scams. 
All purchased vials were considered probable substandard and falsified products, as visual inspection indicated noncompliance in more than half (59%-63%) of the evaluated criteria. The semaglutide content of samples substantially exceeded labeled amounts by 28.56%-38.69%, although no peptide-like impurities were identified. The lyophilized peptide samples were devoid of viable microorganisms at the time of testing; however, endotoxin was detected in all samples with levels ranging between 2.1645 EU/mg and 8.9511 EU/mg. Furthermore, the measured semaglutide purity was significantly low, ranging between 7.7% and 14.37% and deviating from the 99% claimed on product labels by manufacturers. Conclusions: Glucagon-like peptide 1 agonist drugs promoted for weight loss, similar to erectile dysfunction medications more than 2 decades ago, are becoming the new blockbuster lifestyle medications for the illegal online pharmacy market. Protecting the pharmaceutical supply chain from substandard and falsified weight loss products and raising awareness regarding online medication safety must be a public health priority for regulators and technology platforms alike. UR - https://www.jmir.org/2024/1/e65440 UR - http://dx.doi.org/10.2196/65440 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/65440 ER - TY - JOUR AU - Penev, P. Yordan AU - Buchanan, R. Timothy AU - Ruppert, M. Matthew AU - Liu, Michelle AU - Shekouhi, Ramin AU - Guan, Ziyuan AU - Balch, Jeremy AU - Ozrazgat-Baslanti, Tezcan AU - Shickel, Benjamin AU - Loftus, J. 
Tyler AU - Bihorac, Azra PY - 2024/11/6 TI - Electronic Health Record Data Quality and Performance Assessments: Scoping Review JO - JMIR Med Inform SP - e58130 VL - 12 KW - electronic health record KW - EHR KW - record KW - data quality KW - data performance KW - clinical informatics KW - performance KW - data science KW - synthesis KW - review methods KW - review methodology KW - search KW - scoping N2 - Background: Electronic health records (EHRs) have an enormous potential to advance medical research and practice through easily accessible and interpretable EHR-derived databases. Attainability of this potential is limited by issues with data quality (DQ) and performance assessment. Objective: This review aims to streamline the current best practices on EHR DQ and performance assessments as a replicable standard for researchers in the field. Methods: PubMed was systematically searched for original research articles assessing EHR DQ and performance from inception until May 7, 2023. Results: Our search yielded 26 original research articles. Most articles had 1 or more significant limitations, including incomplete or inconsistent reporting (n=6, 30%), poor replicability (n=5, 25%), and limited generalizability of results (n=5, 25%). Completeness (n=21, 81%), conformance (n=18, 69%), and plausibility (n=16, 62%) were the most cited indicators of DQ, while correctness or accuracy (n=14, 54%) was most cited for data performance, with context-specific supplementation by recency (n=7, 27%), fairness (n=6, 23%), stability (n=4, 15%), and shareability (n=2, 8%) assessments. Artificial intelligence–based techniques, including natural language data extraction, data imputation, and fairness algorithms, were demonstrated to play a rising role in improving both dataset quality and performance. Conclusions: This review highlights the need for incentivizing DQ and performance assessments and their standardization. 
The results suggest the usefulness of artificial intelligence–based techniques for enhancing DQ and performance to unlock the full potential of EHRs to improve medical research and practice. UR - https://medinform.jmir.org/2024/1/e58130 UR - http://dx.doi.org/10.2196/58130 ID - info:doi/10.2196/58130 ER - TY - JOUR AU - Abbott, E. Ethan AU - Apakama, Donald AU - Richardson, D. Lynne AU - Chan, Lili AU - Nadkarni, N. Girish PY - 2024/10/30 TI - Leveraging Artificial Intelligence and Data Science for Integration of Social Determinants of Health in Emergency Medicine: Scoping Review JO - JMIR Med Inform SP - e57124 VL - 12 KW - data science KW - social determinants of health KW - natural language processing KW - artificial intelligence KW - NLP KW - machine learning KW - review methods KW - review methodology KW - scoping review KW - emergency medicine KW - PRISMA N2 - Background: Social determinants of health (SDOH) are critical drivers of health disparities and patient outcomes. However, accessing and collecting patient-level SDOH data can be operationally challenging in the emergency department (ED) clinical setting, requiring innovative approaches. Objective: This scoping review examines the potential of AI and data science for modeling, extraction, and incorporation of SDOH data specifically within EDs, further identifying areas for advancement and investigation. Methods: We conducted a standardized search for studies published between 2015 and 2022, across Medline (Ovid), Embase (Ovid), CINAHL, Web of Science, and ERIC databases. We focused on identifying studies using AI or data science related to SDOH within emergency care contexts or conditions. Two specialized reviewers in emergency medicine (EM) and clinical informatics independently assessed each article, resolving discrepancies through iterative reviews and discussion. We then extracted data covering study details, methodologies, patient demographics, care settings, and principal outcomes. 
Results: Of the 1047 studies screened, 26 met the inclusion criteria. Notably, 9 out of 26 (35%) studies were solely concentrated on ED patients. Conditions studied spanned broad EM complaints and included sepsis, acute myocardial infarction, and asthma. The majority of studies (n=16) explored multiple SDOH domains, with homelessness/housing insecurity and neighborhood/built environment predominating. Machine learning (ML) techniques were used in 23 of 26 studies, with natural language processing (NLP) being the most commonly used approach (n=11). Rule-based NLP (n=5), deep learning (n=2), and pattern matching (n=4) were the most commonly used NLP techniques. NLP models in the reviewed studies displayed significant predictive performance with outcomes, with F1-scores ranging between 0.40 and 0.75 and specificities nearing 95.9%. Conclusions: Although in its infancy, the convergence of AI and data science techniques, especially ML and NLP, with SDOH in EM offers transformative possibilities for better usage and integration of social data into clinical care and research. With a significant focus on the ED and notable NLP model performance, there is an imperative to standardize SDOH data collection, refine algorithms for diverse patient groups, and champion interdisciplinary synergies. These efforts aim to harness SDOH data optimally, enhancing patient care and mitigating health disparities. Our research underscores the vital need for continued investigation in this domain. 
UR - https://medinform.jmir.org/2024/1/e57124 UR - http://dx.doi.org/10.2196/57124 ID - info:doi/10.2196/57124 ER - TY - JOUR AU - Kim, Kyungmo AU - Park, Seongkeun AU - Min, Jeongwon AU - Park, Sumin AU - Kim, Yeon Ju AU - Eun, Jinsu AU - Jung, Kyuha AU - Park, Elyson Yoobin AU - Kim, Esther AU - Lee, Young Eun AU - Lee, Joonhwan AU - Choi, Jinwook PY - 2024/10/30 TI - Multifaceted Natural Language Processing Task–Based Evaluation of Bidirectional Encoder Representations From Transformers Models for Bilingual (Korean and English) Clinical Notes: Algorithm Development and Validation JO - JMIR Med Inform SP - e52897 VL - 12 KW - natural language processing KW - NLP KW - natural language inference KW - reading comprehension KW - large language models KW - transformer N2 - Background: The bidirectional encoder representations from transformers (BERT) model has attracted considerable attention in clinical applications, such as patient classification and disease prediction. However, current studies have typically progressed to application development without a thorough assessment of the model's comprehension of clinical context. Furthermore, limited comparative studies have been conducted on BERT models using medical documents from non–English-speaking countries. Therefore, the applicability of BERT models trained on English clinical notes to non-English contexts is yet to be confirmed. To address these gaps in literature, this study focused on identifying the most effective BERT model for non-English clinical notes. Objective: In this study, we evaluated the contextual understanding abilities of various BERT models applied to mixed Korean and English clinical notes. The objective of this study was to identify the BERT model that excels in understanding the context of such documents. 
Methods: Using data from 164,460 patients in a South Korean tertiary hospital, we pretrained BERT-base, BERT for Biomedical Text Mining (BioBERT), Korean BERT (KoBERT), and Multilingual BERT (M-BERT) to improve their contextual comprehension capabilities and subsequently compared their performances in 7 fine-tuning tasks. Results: The model performance varied based on the task and token usage. First, BERT-base and BioBERT excelled in tasks using classification ([CLS]) token embeddings, such as document classification. BioBERT achieved the highest F1-score of 89.32. Both BERT-base and BioBERT demonstrated their effectiveness in document pattern recognition, even with limited Korean tokens in the dictionary. Second, M-BERT exhibited a superior performance in reading comprehension tasks, achieving an F1-score of 93.77. Better results were obtained when fewer words were replaced with unknown ([UNK]) tokens. Third, M-BERT excelled in the knowledge inference task in which correct disease names were inferred from 63 candidate disease names in a document with disease names replaced with [MASK] tokens. M-BERT achieved the highest hit@10 score of 95.41. Conclusions: This study highlighted the effectiveness of various BERT models in a multilingual clinical domain. The findings can be used as a reference in clinical and language-based applications. 
UR - https://medinform.jmir.org/2024/1/e52897 UR - http://dx.doi.org/10.2196/52897 ID - info:doi/10.2196/52897 ER - TY - JOUR AU - Accorsi, Duenhas Tarso Augusto AU - Eduardo, Aires Anderson AU - Baptista, Guilherme Carlos AU - Moreira, Tocci Flavio AU - Morbeck, Albaladejo Renata AU - Köhler, Francine Karen AU - Lima, Amicis Karine de AU - Pedrotti, Sartorato Carlos Henrique PY - 2024/10/25 TI - The Impact of International Classification of Disease–Triggered Prescription Support on Telemedicine: Observational Analysis of Efficiency and Guideline Adherence JO - JMIR Med Inform SP - e56681 VL - 12 KW - telemedicine KW - clinical decision support systems KW - electronic prescriptions KW - guideline adherence KW - consultation efficiency KW - International Classification of Disease–coded prescriptions KW - telehealth KW - eHealth N2 - Background: Integrating decision support systems into telemedicine may optimize consultation efficiency and adherence to clinical guidelines; however, the extent of such effects remains underexplored. Objective: This study aims to evaluate the use of ICD (International Classification of Disease)-coded prescription decision support systems (PDSSs) and the effects of these systems on consultation duration and guideline adherence during telemedicine encounters. Methods: In this retrospective, single-center, observational study conducted from October 2021 to March 2022, adult patients who sought urgent digital care via direct-to-consumer video consultations were included. Physicians had access to current guidelines and could use an ICD-triggered PDSS (which was introduced in January 2022 after a preliminary test in the preceding month) for 26 guideline-based conditions. This study analyzed the impact of implementing automated prescription systems and compared these systems to manual prescription processes in terms of consultation duration and guideline adherence. 
Results: This study included 10,485 telemedicine encounters involving 9644 patients, with 12,346 prescriptions issued by 290 physicians. Automated prescriptions were used in 5022 (40.67%) of the consultations following system integration. Before introducing decision support, 4497 (36.42%) prescriptions were issued, which increased to 7849 (63.57%) postimplementation. The physician's average consultation time decreased significantly to 9.5 (SD 5.5) minutes from 11.2 (SD 5.9) minutes after PDSS implementation (P<.001). Of the 12,346 prescriptions, 8683 (70.34%) were aligned with disease-specific international guidelines tailored for telemedicine encounters. Primary medication adherence in accordance with existing guidelines was significantly greater in the decision support group than in the manual group (n=4697, 93.53% vs n=1389, 49.14%; P<.001). Conclusions: Most of the physicians adopted the PDSS, and the results demonstrated the use of the ICD-code system in reducing consultation times and increasing guideline adherence. These systems appear to be valuable for enhancing the efficiency and quality of telemedicine consultations by supporting evidence-based clinical decision-making. UR - https://medinform.jmir.org/2024/1/e56681 UR - http://dx.doi.org/10.2196/56681 UR - http://www.ncbi.nlm.nih.gov/pubmed/39453703 ID - info:doi/10.2196/56681 ER - TY - JOUR AU - Lee, Heather Younga AU - Zhang, Yingzhe AU - Kennedy, J. Chris AU - Mallard, T. Travis AU - Liu, Zhaowen AU - Vu, Linh Phuong AU - Feng, Anne Yen-Chen AU - Ge, Tian AU - Petukhova, V. Maria AU - Kessler, C. Ronald AU - Nock, K. Matthew AU - Smoller, W. 
Jordan PY - 2024/10/23 TI - Enhancing Suicide Risk Prediction With Polygenic Scores in Psychiatric Emergency Settings: Prospective Study JO - JMIR Bioinform Biotech SP - e58357 VL - 5 KW - polygenic risk score KW - suicide risk prediction KW - suicide attempt KW - predictive algorithms KW - genomics KW - genotypes KW - electronic health record KW - machine learning N2 - Background: Despite growing interest in the clinical translation of polygenic risk scores (PRSs), it remains uncertain to what extent genomic information can enhance the prediction of psychiatric outcomes beyond the data collected during clinical visits alone. Objective: This study aimed to assess the clinical utility of incorporating PRSs into a suicide risk prediction model trained on electronic health records (EHRs) and patient-reported surveys among patients admitted to the emergency department. Methods: Study participants were recruited from the psychiatric emergency department at Massachusetts General Hospital. There were 333 adult patients of European ancestry who had high-quality genotype data available through their participation in the Mass General Brigham Biobank. Multiple neuropsychiatric PRSs were added to a previously validated suicide prediction model in a prospective cohort enrolled between February 4, 2015, and March 13, 2017. Data analysis was performed from July 11, 2022, to August 31, 2023. Suicide attempt was defined using diagnostic codes from longitudinal EHRs combined with 6-month follow-up surveys. The clinical risk score for suicide attempt was calculated from an ensemble model trained using an EHR-based suicide risk score and a brief survey, and it was subsequently used to define the baseline model. We generated PRSs for depression, bipolar disorder, schizophrenia, suicide attempt, and externalizing traits using a Bayesian polygenic scoring method for European ancestry participants. 
Model performance was evaluated using area under the receiver operating characteristic curve (AUC), area under the precision-recall curve, and positive predictive values. Results: Of the 333 patients (n=178, 53.5% male; mean age 36.8, SD 13.6 years; n=333, 100% non-Hispanic and n=324, 97.3% self-reported White), 28 (8.4%) had a suicide attempt within 6 months. Adding either the schizophrenia PRS or all PRSs to the baseline model resulted in the numerically highest discrimination (AUC 0.86, 95% CI 0.73-0.99) compared to the baseline model (AUC 0.84, 95% CI 0.70-0.98). However, the improvement in model performance was not statistically significant. Conclusions: In this study, incorporating genomic information into clinical prediction models for suicide attempt did not improve patient risk stratification. Larger studies that include more diverse participants are required to validate whether the inclusion of psychiatric PRSs in clinical prediction models can enhance the stratification of patients at risk of suicide attempts. UR - https://bioinform.jmir.org/2024/1/e58357 UR - http://dx.doi.org/10.2196/58357 UR - http://www.ncbi.nlm.nih.gov/pubmed/39442166 ID - info:doi/10.2196/58357 ER - TY - JOUR AU - Manion, J. Frank AU - Du, Jingcheng AU - Wang, Dong AU - He, Long AU - Lin, Bin AU - Wang, Jingqi AU - Wang, Siwei AU - Eckels, David AU - Cervenka, Jan AU - Fiduccia, C. 
Peter AU - Cossrow, Nicole AU - Yao, Lixia PY - 2024/10/23 TI - Accelerating Evidence Synthesis in Observational Studies: Development of a Living Natural Language Processing-Assisted Intelligent Systematic Literature Review System JO - JMIR Med Inform SP - e54653 VL - 12 KW - machine learning KW - deep learning KW - natural language processing KW - systematic literature review KW - artificial intelligence KW - software development KW - data extraction KW - epidemiology N2 - Background: Systematic literature review (SLR), a robust method to identify and summarize evidence from published sources, is considered to be a complex, time-consuming, labor-intensive, and expensive task. Objective: This study aimed to present a solution based on natural language processing (NLP) that accelerates and streamlines the SLR process for observational studies using real-world data. Methods: We followed an agile software development and iterative software engineering methodology to build a customized intelligent end-to-end living NLP-assisted solution for observational SLR tasks. Multiple machine learning-based NLP algorithms were adopted to automate article screening and data element extraction processes. The NLP prediction results can be further reviewed and verified by domain experts, following the human-in-the-loop design. The system integrates explainable artificial intelligence to provide evidence for NLP algorithms and add transparency to extracted literature data elements. The system was developed based on 3 existing SLR projects of observational studies, including the epidemiology studies of human papillomavirus-associated diseases, the disease burden of pneumococcal diseases, and cost-effectiveness studies on pneumococcal vaccines. 
Results: Our Intelligent SLR Platform covers major SLR steps, including study protocol setting, literature retrieval, abstract screening, full-text screening, data element extraction from full-text articles, results summary, and data visualization. The NLP algorithms achieved accuracy scores of 0.86-0.90 on article screening tasks (framed as text classification tasks) and macroaverage F1 scores of 0.57-0.89 on data element extraction tasks (framed as named entity recognition tasks). Conclusions: Cutting-edge NLP algorithms expedite SLR for observational studies, thus allowing scientists to have more time to focus on the quality of data and the synthesis of evidence in observational studies. Aligning the living SLR concept, the system has the potential to update literature data and enable scientists to easily stay current with the literature related to observational studies prospectively and continuously. UR - https://medinform.jmir.org/2024/1/e54653 UR - http://dx.doi.org/10.2196/54653 ID - info:doi/10.2196/54653 ER - TY - JOUR AU - Mollalo, Abolfazl AU - Hamidi, Bashir AU - Lenert, A. Leslie AU - Alekseyenko, V. Alexander PY - 2024/10/15 TI - Application of Spatial Analysis on Electronic Health Records to Characterize Patient Phenotypes: Systematic Review JO - JMIR Med Inform SP - e56343 VL - 12 KW - clinical phenotypes KW - electronic health records KW - geocoding KW - geographic information systems KW - patient phenotypes KW - spatial analysis N2 - Background: Electronic health records (EHRs) commonly contain patient addresses that provide valuable data for geocoding and spatial analysis, enabling more comprehensive descriptions of individual patients for clinical purposes. Despite the widespread use of EHRs in clinical decision support and interventions, no systematic review has examined the extent to which spatial analysis is used to characterize patient phenotypes. 
Objective: This study reviews advanced spatial analyses that used individual-level health data from EHRs within the United States to characterize patient phenotypes. Methods: We systematically evaluated English-language, peer-reviewed studies from the PubMed/MEDLINE, Scopus, Web of Science, and Google Scholar databases from inception to August 20, 2023, without imposing constraints on study design or specific health domains. Results: A substantial proportion of studies (>85%) were limited to geocoding or basic mapping without implementing advanced spatial statistical analysis, leaving only 49 studies that met the eligibility criteria. These studies used diverse spatial methods, with a predominant focus on clustering techniques, while spatiotemporal analysis (frequentist and Bayesian) and modeling were less common. A noteworthy surge (n=42, 86%) in publications was observed after 2017. The publications investigated a variety of adult and pediatric clinical areas, including infectious disease, endocrinology, and cardiology, using phenotypes defined over a range of data domains such as demographics, diagnoses, and visits. The primary health outcomes investigated were asthma, hypertension, and diabetes. Notably, patient phenotypes involving genomics, imaging, and notes were limited. Conclusions: This review underscores the growing interest in spatial analysis of EHR-derived data and highlights knowledge gaps in clinical health, phenotype domains, and spatial methodologies. We suggest that future research should focus on addressing these gaps and harnessing spatial analysis to enhance individual patient contexts and clinical decision support. UR - https://medinform.jmir.org/2024/1/e56343 UR - http://dx.doi.org/10.2196/56343 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/56343 ER - TY - JOUR AU - Rosenau, Lorenz AU - Gruendner, Julian AU - Kiel, Alexander AU - Köhler, Thomas AU - Schaffer, Bastian AU - Majeed, W. 
Raphael PY - 2024/10/14 TI - Bridging Data Models in Health Care With a Novel Intermediate Query Format for Feasibility Queries: Mixed Methods Study JO - JMIR Med Inform SP - e58541 VL - 12 KW - feasibility KW - FHIR KW - CQL KW - eligibility criteria KW - clinical research KW - intermediate query format KW - healthcare interoperability KW - cohort definition KW - query KW - queries KW - interoperability KW - interoperable KW - informatics KW - portal KW - portals KW - implementation KW - develop KW - development KW - ontology KW - ontologies KW - JSON N2 - Background: To advance research with clinical data, it is essential to make access to the available data as fast and easy as possible for researchers, which is especially challenging for data from different source systems within and across institutions. Over the years, many research repositories and data standards have been created. One of these is the Fast Healthcare Interoperability Resources (FHIR) standard, used by the German Medical Informatics Initiative (MII) to harmonize and standardize data across university hospitals in Germany. One of the first steps to make these data available is to allow researchers to create feasibility queries to determine the data availability for a specific research question. Given the heterogeneity of different query languages to access different data across and even within standards such as FHIR (eg, CQL and FHIR Search), creating an intermediate query syntax for feasibility queries reduces the complexity of query translation and improves interoperability across different research repositories and query languages. Objective: This study describes the creation and implementation of an intermediate query syntax for feasibility queries and how it integrates into the federated German health research portal (Forschungsdatenportal Gesundheit) and the MII. 
Methods: We analyzed the requirements for feasibility queries and the feasibility tools that are currently available in research repositories. Based on this analysis, we developed an intermediate query syntax that can be easily translated into different research repository-specific query languages. Results: The resulting Clinical Cohort Definition Language (CCDL) for feasibility queries combines inclusion criteria in a conjunctive normal form and exclusion criteria in a disjunctive normal form, allowing for additional filters like time or numerical restrictions. The inclusion and exclusion results are combined via an expression to specify feasibility queries. We defined a JSON schema for the CCDL, generated an ontology, and demonstrated the use and translatability of the CCDL across multiple studies and real-world use cases. Conclusions: We developed and evaluated a structured query syntax for feasibility queries and demonstrated its use in a real-world example as part of a research platform across 39 German university hospitals. UR - https://medinform.jmir.org/2024/1/e58541 UR - http://dx.doi.org/10.2196/58541 ID - info:doi/10.2196/58541 ER - TY - JOUR AU - Chen, Tingting AU - Tang, Xiaofen AU - Xu, Min AU - Jiang, Yue AU - Zheng, Fengyan PY - 2024/10/14 TI - Application of Information Link Control in Surgical Specimen Near-Miss Events in a South China Hospital: Nonrandomized Controlled Study JO - JMIR Med Inform SP - e52722 VL - 12 KW - near misses KW - technical barriers KW - process barriers KW - surgical specimens KW - information N2 - Background: Information control is a promising approach for managing surgical specimens. However, there is limited research evidence on surgical near misses. This is particularly true in the closed loop of information control for each link. 
Objective: A new model of surgical specimen process management is further constructed, and a safe operating room nursing practice environment is created by intercepting specimen near-miss events through information safety barriers. Methods: In a large hospital in China, 84,289 surgical specimens collected in the conventional information specimen management mode from January to December 2021 were selected as the control group, and 99,998 surgical specimens collected in the information safety barrier control surgical specimen management mode from January to December 2022 were selected as the improvement group. The incidence of near misses, the qualified rate of pathological specimen fixation, and the average time required for specimen fixation were compared under the 2 management modes. The causes of near misses in the 2 groups were analyzed, and the near misses of information safety barrier control surgical specimens were studied. Results: Under the information-based safety barrier control surgical specimen management model, the incidence of adverse events in surgical specimens was reduced, the reporting of near-miss events in surgical specimens was improved by 100%, the quality control management of surgical specimens was effectively improved, the pass rate of surgical pathology specimen fixation was improved, and the mean time for surgical specimen fixation was shortened, with differences considered statistically significant at P<.05. Conclusions: Our research has developed a new mode of managing the surgical specimen process. This mode can prevent errors in approaching specimens by implementing information security barriers, thereby enhancing the quality of specimen management, ensuring the safety of medical procedures, and improving the quality of hospital services. 
UR - https://medinform.jmir.org/2024/1/e52722 UR - http://dx.doi.org/10.2196/52722 ID - info:doi/10.2196/52722 ER - TY - JOUR AU - Nishiyama, Tomohiro AU - Yamaguchi, Ayane AU - Han, Peitao AU - Pereira, Kanashiro Lis Weiji AU - Otsuki, Yuka AU - Andrade, Bernardim Gabriel Herman AU - Kudo, Noriko AU - Yada, Shuntaro AU - Wakamiya, Shoko AU - Aramaki, Eiji AU - Takada, Masahiro AU - Toi, Masakazu PY - 2024/9/24 TI - Automated System to Capture Patient Symptoms From Multitype Japanese Clinical Texts: Retrospective Study JO - JMIR Med Inform SP - e58977 VL - 12 KW - natural language processing KW - named entity recognition KW - adverse drug reaction KW - adverse event KW - peripheral neuropathy KW - NLP KW - symptoms KW - symptom KW - machine learning KW - ML KW - drug KW - drugs KW - pharmacology KW - pharmacotherapy KW - pharmaceutic KW - pharmaceutics KW - pharmaceuticals KW - pharmaceutical KW - medication KW - medications KW - adverse KW - neuropathy KW - cancer KW - oncology KW - text KW - texts KW - textual KW - note KW - notes KW - report KW - reports KW - EHR KW - EHRs KW - record KW - records KW - detect KW - detection KW - detecting N2 - Background: Natural language processing (NLP) techniques can be used to analyze large amounts of electronic health record texts, which encompass various types of patient information such as quality of life, effectiveness of treatments, and adverse drug event (ADE) signals. As different aspects of a patient's status are stored in different types of documents, we propose an NLP system capable of processing 6 types of documents: physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes. Objective: This study aimed to investigate the system's performance in detecting ADEs by evaluating the results from multitype texts. The main objective is to detect adverse events accurately using an NLP system. 
Methods: We used data written in Japanese from 2289 patients with breast cancer, including medication data, physician progress notes, discharge summaries, radiology reports, radioisotope reports, nursing records, and pharmacist progress notes. Our system performs 3 processes: named entity recognition, normalization of symptoms, and aggregation of multiple types of documents from multiple patients. Among all patients with breast cancer, 103 and 112 with peripheral neuropathy (PN) received paclitaxel or docetaxel, respectively. We evaluate the utility of using multiple types of documents by correlation coefficient and regression analysis to compare their performance with each single type of document. All evaluations of detection rates with our system are performed 30 days after drug administration. Results: Our system underestimates by 13.3 percentage points (74.0%-60.7%), as the incidence of paclitaxel-induced PN was 60.7%, compared with 74.0% in the previous research based on manual extraction. The Pearson correlation coefficient between the manual extraction and system results was 0.87. Although the pharmacist progress notes had the highest detection rate among each type of document, the rate did not match the performance using all documents. The estimated median duration of PN with paclitaxel was 92 days, whereas the previously reported median duration of PN with paclitaxel was 727 days. The number of events detected in each document was highest in the physician's progress notes, followed by the pharmacist's and nursing records. Conclusions: Considering the inherent cost that requires constant monitoring of the patient's condition, such as the treatment of PN, our system has a significant advantage in that it can immediately estimate the treatment duration without fine-tuning a new NLP model. Leveraging multitype documents is better than using single-type documents to improve detection performance. 
Although the onset time estimation was relatively accurate, the duration might have been influenced by the length of the data follow-up period. The results suggest that our method using various types of data can detect more ADEs from clinical documents. UR - https://medinform.jmir.org/2024/1/e58977 UR - http://dx.doi.org/10.2196/58977 UR - http://www.ncbi.nlm.nih.gov/pubmed/39316418 ID - info:doi/10.2196/58977 ER - TY - JOUR AU - Tesfaye, Wubshet AU - Jordan, Margaret AU - Chen, F. Timothy AU - Castelino, Lynel Ronald AU - Sud, Kamal AU - Dabliz, Racha AU - Aslani, Parisa PY - 2024/9/12 TI - Usability Evaluation Methods Used in Electronic Discharge Summaries: Literature Review JO - J Med Internet Res SP - e55247 VL - 26 KW - electronic discharge summaries KW - usability testing KW - heuristic evaluation KW - heuristics, think-aloud KW - adoption KW - digital health KW - usability KW - electronic KW - discharge summary KW - end users KW - evaluation KW - user-centered N2 - Background: With the widespread adoption of digital health records, including electronic discharge summaries (eDS), it is important to assess their usability in order to understand whether they meet the needs of the end users. While there are established approaches for evaluating the usability of electronic health records, there is a lack of knowledge regarding suitable evaluation methods specifically for eDS. Objective: This literature review aims to identify the usability evaluation approaches used in eDS. Methods: We conducted a comprehensive search of PubMed, CINAHL, Web of Science, ACM Digital Library, MEDLINE, and ProQuest databases from their inception until July 2023. The study information was extracted and reported in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses). We included studies that assessed the usability of eDS, and the systems used to display eDS. 
Results: A total of 12 records, including 11 studies and 1 thesis, met the inclusion criteria. The included studies used qualitative, quantitative, or mixed methods approaches and reported the use of various usability evaluation methods. Heuristic evaluation was the most used method to assess the usability of eDS systems (n=7), followed by the think-aloud approach (n=5) and laboratory testing (n=3). These methods were used either individually or in combination with usability questionnaires (n=3) and qualitative semistructured interviews (n=4) for evaluating eDS usability issues. The evaluation processes incorporated usability metrics such as user performance, satisfaction, efficiency, and impact rating. Conclusions: There are a limited number of studies focusing on usability evaluations of eDS. The identified studies used expert-based and user-centered approaches, which can be used either individually or in combination to identify usability issues. However, further research is needed to determine the most appropriate evaluation method which can assess the fitness for purpose of discharge summaries. UR - https://www.jmir.org/2024/1/e55247 UR - http://dx.doi.org/10.2196/55247 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/55247 ER - TY - JOUR AU - Liu, Chao AU - Jiao, Yuanshi AU - Su, Licong AU - Liu, Wenna AU - Zhang, Haiping AU - Nie, Sheng AU - Gong, Mengchun PY - 2024/8/20 TI - Effective Privacy Protection Strategies for Pregnancy and Gestation Information From Electronic Medical Records: Retrospective Study in a National Health Care Data Network in China JO - J Med Internet Res SP - e46455 VL - 26 KW - pregnancy KW - electronic medical record KW - privacy protection KW - risk stratification KW - rule-based N2 - Background: Pregnancy and gestation information is routinely recorded in electronic medical record (EMR) systems across China in various data sets. 
The combination of data on the number of pregnancies and gestations can imply occurrences of abortions and other pregnancy-related issues, which is important for clinical decision-making and personal privacy protection. However, the distribution of this information inside EMR is variable due to inconsistent IT structures across different EMR systems. A large-scale quantitative evaluation of the potential exposure of this sensitive information has not been previously performed. Ensuring the protection of personal information is a priority, as emphasized in Chinese laws and regulations. Objective: This study aims to perform the first nationwide quantitative analysis of the identification sites and exposure frequency of sensitive pregnancy and gestation information. The goal is to propose strategies for effective information extraction and privacy protection related to women's health. Methods: This study was conducted in a national health care data network. Rule-based protocols for extracting pregnancy and gestation information were developed by a committee of experts. A total of 6 different sub-data sets of EMRs were used as schemas for data analysis and strategy proposal. The identification sites and frequencies of identification in different sub-data sets were calculated. Manual quality inspections of the extraction process were performed by 2 independent groups of reviewers on 1000 randomly selected records. Based on these statistics, strategies for effective information extraction and privacy protection were proposed. Results: The data network covered hospitalized patients from 19 hospitals in 10 provinces of China, encompassing 15,245,055 patients over an 11-year period (January 1, 2010-December 12, 2020). Among women aged 14-50 years, 70% were randomly selected from each hospital, resulting in a total of 1,110,053 patients. Of these, 688,268 female patients with sensitive reproductive information were identified. 
The frequencies of identification were variable, with the marriage history in admission medical records being the most frequent at 63.24%. Notably, more than 50% of female patients were identified with pregnancy and gestation history in nursing records, which is not generally considered a sub-data set rich in reproductive information. During the manual curation and review process, 1000 cases were randomly selected, and the precision and recall rates of the information extraction method both exceeded 99.5%. The privacy-protection strategies were designed with clear technical directions. Conclusions: Significant amounts of critical information related to women's health are recorded in Chinese routine EMR systems and are distributed in various parts of the records with different frequencies. This requires a comprehensive protocol for extracting and protecting the information, which has been demonstrated to be technically feasible. Implementing a data-based strategy will enhance the protection of women's privacy and improve the accessibility of health care services. UR - https://www.jmir.org/2024/1/e46455 UR - http://dx.doi.org/10.2196/46455 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/46455 ER - TY - JOUR AU - Shah-Mohammadi, Fatemeh AU - Finkelstein, Joseph PY - 2024/8/19 TI - Extraction of Substance Use Information From Clinical Notes: Generative Pretrained Transformer-Based Investigation JO - JMIR Med Inform SP - e56243 VL - 12 KW - substance use KW - natural language processing KW - GPT KW - prompt engineering KW - zero-shot learning KW - few-shot learning N2 - Background: Understanding the multifaceted nature of health outcomes requires a comprehensive examination of the social, economic, and environmental determinants that shape individual well-being. Among these determinants, behavioral factors play a crucial role, particularly the consumption patterns of psychoactive substances, which have important implications on public health. 
The Global Burden of Disease Study shows a growing impact in disability-adjusted life years due to substance use. The successful identification of patients' substance use information equips clinical care teams to address substance-related issues more effectively, enabling targeted support and ultimately improving patient outcomes. Objective: Traditional natural language processing methods face limitations in accurately parsing diverse clinical language associated with substance use. Large language models offer promise in overcoming these challenges by adapting to diverse language patterns. This study investigates the application of the generative pretrained transformer (GPT) model, specifically GPT-3.5, for extracting tobacco, alcohol, and substance use information from patient discharge summaries in zero-shot and few-shot learning settings. This study contributes to the evolving landscape of health care informatics by showcasing the potential of advanced language models in extracting nuanced information critical for enhancing patient care. Methods: The main data source for analysis in this paper is the Medical Information Mart for Intensive Care III data set. Among all notes in this data set, we focused on discharge summaries. Prompt engineering was undertaken, involving an iterative exploration of diverse prompts. Leveraging carefully curated examples and refined prompts, we investigate the model's proficiency through zero-shot as well as few-shot prompting strategies. Results: Results show GPT's varying effectiveness in identifying mentions of tobacco, alcohol, and substance use across learning scenarios. Zero-shot learning showed high accuracy in identifying substance use, whereas few-shot learning reduced accuracy but improved in identifying substance use status, enhancing recall and F1-score at the expense of lower precision. 
Conclusions: Excellence of zero-shot learning in precisely extracting text span mentioning substance use demonstrates its effectiveness in situations in which comprehensive recall is important. Conversely, few-shot learning offers advantages when accurately determining the status of substance use is the primary focus, even if it involves a trade-off in precision. The results contribute to the enhancement of early detection and intervention strategies, the tailoring of treatment plans with greater precision, and, ultimately, a holistic understanding of patient health profiles. By integrating these artificial intelligence-driven methods into electronic health record systems, clinicians can gain immediate, comprehensive insights into substance use that results in shaping interventions that are not only timely but also more personalized and effective. UR - https://medinform.jmir.org/2024/1/e56243 UR - http://dx.doi.org/10.2196/56243 UR - http://www.ncbi.nlm.nih.gov/pubmed/39037700 ID - info:doi/10.2196/56243 ER - TY - JOUR AU - Naseem, Usman AU - Thapa, Surendrabikram AU - Masood, Anum PY - 2024/8/5 TI - Advancing Accuracy in Multimodal Medical Tasks Through Bootstrapped Language-Image Pretraining (BioMedBLIP): Performance Evaluation Study JO - JMIR Med Inform SP - e56627 VL - 12 KW - biomedical text mining KW - BioNLP KW - vision-language pretraining KW - multimodal models KW - medical image analysis N2 - Background: Medical image analysis, particularly in the context of visual question answering (VQA) and image captioning, is crucial for accurate diagnosis and educational purposes. Objective: Our study aims to introduce BioMedBLIP models, fine-tuned for VQA tasks using specialized medical data sets such as Radiology Objects in Context and Medical Information Mart for Intensive Care-Chest X-ray, and evaluate their performance in comparison to the state of the art (SOTA) original Bootstrapping Language-Image Pretraining (BLIP) model. 
Methods: We present 9 versions of BioMedBLIP across 3 downstream tasks in various data sets. The models are trained on a varying number of epochs. The findings indicate the strong overall performance of our models. We proposed BioMedBLIP for the VQA generation model, VQA classification model, and BioMedBLIP image caption model. We conducted pretraining in BLIP using medical data sets, producing an adapted BLIP model tailored for medical applications. Results: In VQA generation tasks, BioMedBLIP models outperformed the SOTA on the Semantically-Labeled Knowledge-Enhanced (SLAKE) data set, VQA in Radiology (VQA-RAD), and Image Cross-Language Evaluation Forum data sets. In VQA classification, our models consistently surpassed the SOTA on the SLAKE data set. Our models also showed competitive performance on the VQA-RAD and PathVQA data sets. Similarly, in image captioning tasks, our model beat the SOTA, suggesting the importance of pretraining with medical data sets. Overall, in 20 different data sets and task combinations, our BioMedBLIP excelled in 15 (75%) out of 20 tasks. BioMedBLIP represents a new SOTA in 15 (75%) out of 20 tasks, and our responses were rated higher in all 20 tasks (P<.005) in comparison to SOTA models. Conclusions: Our BioMedBLIP models show promising performance and suggest that incorporating medical knowledge through pretraining with domain-specific medical data sets helps models achieve higher performance. Our models thus demonstrate their potential to advance medical image analysis, impacting diagnosis, medical education, and research. However, data quality, task-specific variability, computational resources, and ethical considerations should be carefully addressed. In conclusion, our models represent a contribution toward the synergy of artificial intelligence and medicine. We have made BioMedBLIP freely available, which will help in further advancing research in multimodal medical tasks. 
UR - https://medinform.jmir.org/2024/1/e56627 UR - http://dx.doi.org/10.2196/56627 UR - http://www.ncbi.nlm.nih.gov/pubmed/39102281 ID - info:doi/10.2196/56627 ER - TY - JOUR AU - Tung, Min Joshua Yi AU - Gill, Ravinder Sunil AU - Sng, Ren Gerald Gui AU - Lim, Zheng Daniel Yan AU - Ke, Yuhe AU - Tan, Fang Ting AU - Jin, Liyuan AU - Elangovan, Kabilan AU - Ong, Ling Jasmine Chiat AU - Abdullah, Rizal Hairil AU - Ting, Wei Daniel Shu AU - Chong, Wen Tsung PY - 2024/7/24 TI - Comparison of the Quality of Discharge Letters Written by Large Language Models and Junior Clinicians: Single-Blinded Study JO - J Med Internet Res SP - e57721 VL - 26 KW - artificial intelligence KW - AI KW - discharge summaries KW - continuity of care KW - large language model KW - LLM KW - junior clinician KW - letter writing KW - single-blinded KW - ChatGPT KW - urology KW - primary care KW - fictional electronic record KW - consultation note KW - referral letter KW - simulated environment N2 - Background: Discharge letters are a critical component in the continuity of care between specialists and primary care providers. However, these letters are time-consuming to write, underprioritized in comparison to direct clinical care, and are often tasked to junior doctors. Prior studies assessing the quality of discharge summaries written for inpatient hospital admissions show inadequacies in many domains. Large language models such as GPT have the ability to summarize large volumes of unstructured free text such as electronic medical records and have the potential to automate such tasks, providing time savings and consistency in quality. Objective: The aim of this study was to assess the performance of GPT-4 in generating discharge letters written from urology specialist outpatient clinics to primary care providers and to compare their quality against letters written by junior clinicians. 
Methods: Fictional electronic records were written by physicians simulating 5 common urology outpatient cases with long-term follow-up. Records comprised simulated consultation notes, referral letters and replies, and relevant discharge summaries from inpatient admissions. GPT-4 was tasked to write discharge letters for these cases with a specified target audience of primary care providers who would be continuing the patient's care. Prompts were written for safety, content, and style. Concurrently, junior clinicians were provided with the same case records and instructional prompts. GPT-4 output was assessed for instances of hallucination. A blinded panel of primary care physicians then evaluated the letters using a standardized questionnaire tool. Results: GPT-4 outperformed human counterparts in information provision (mean 4.32, SD 0.95 vs 3.70, SD 1.27; P=.03) and had no instances of hallucination. There were no statistically significant differences in the mean clarity (4.16, SD 0.95 vs 3.68, SD 1.24; P=.12), collegiality (4.36, SD 1.00 vs 3.84, SD 1.22; P=.05), conciseness (3.60, SD 1.12 vs 3.64, SD 1.27; P=.71), follow-up recommendations (4.16, SD 1.03 vs 3.72, SD 1.13; P=.08), and overall satisfaction (3.96, SD 1.14 vs 3.62, SD 1.34; P=.36) between the letters generated by GPT-4 and humans, respectively. Conclusions: Discharge letters written by GPT-4 had equivalent quality to those written by junior clinicians, without any hallucinations. This study provides a proof of concept that large language models can be useful and safe tools in clinical documentation. 
UR - https://www.jmir.org/2024/1/e57721 UR - http://dx.doi.org/10.2196/57721 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/57721 ER - TY - JOUR AU - Bellmann, Louis AU - Wiederhold, Johannes Alexander AU - Trübe, Leona AU - Twerenbold, Raphael AU - Ückert, Frank AU - Gottfried, Karl PY - 2024/7/24 TI - Introducing Attribute Association Graphs to Facilitate Medical Data Exploration: Development and Evaluation Using Epidemiological Study Data JO - JMIR Med Inform SP - e49865 VL - 12 KW - data exploration KW - cohort studies KW - data visualization KW - big data KW - statistical models KW - medical knowledge KW - data analysis KW - cardiovascular diseases KW - usability N2 - Background: Interpretability and intuitive visualization facilitate medical knowledge generation through big data. In addition, robustness to high-dimensional and missing data is a requirement for statistical approaches in the medical domain. A method tailored to the needs of physicians must meet all the abovementioned criteria. Objective: This study aims to develop an accessible tool for visual data exploration without the need for programming knowledge, adjusting complex parameterizations, or handling missing data. We sought to use statistical analysis using the setting of disease and control cohorts familiar to clinical researchers. We aimed to guide the user by identifying and highlighting data patterns associated with disease and reveal relations between attributes within the data set. Methods: We introduce the attribute association graph, a novel graph structure designed for visual data exploration using robust statistical metrics. The nodes capture frequencies of participant attributes in disease and control cohorts as well as deviations between groups. The edges represent conditional relations between attributes. The graph is visualized using the Neo4j (Neo4j, Inc) data platform and can be interactively explored without the need for technical knowledge. 
Nodes with high deviations between cohorts and edges of noticeable conditional relationship are highlighted to guide the user during the exploration. The graph is accompanied by a dashboard visualizing variable distributions. For evaluation, we applied the graph and dashboard to the Hamburg City Health Study data set, a large cohort study conducted in the city of Hamburg, Germany. All data structures can be accessed freely by researchers, physicians, and patients. In addition, we developed a user test conducted with physicians incorporating the System Usability Scale, individual questions, and user tasks. Results: We evaluated the attribute association graph and dashboard through an exemplary data analysis of participants with a general cardiovascular disease in the Hamburg City Health Study data set. All results extracted from the graph structure and dashboard are in accordance with findings from the literature, except for unusually low cholesterol levels in participants with cardiovascular disease, which could be induced by medication. In addition, 95% CIs of Pearson correlation coefficients were calculated for all associations identified during the data analysis, confirming the results. In addition, a user test with 10 physicians assessing the usability of the proposed methods was conducted. A System Usability Scale score of 70.5% and average successful task completion of 81.4% were reported. Conclusions: The proposed attribute association graph and dashboard enable intuitive visual data exploration. They are robust to high-dimensional as well as missing data and require no parameterization. The usability for clinicians was confirmed via a user test, and the validity of the statistical results was confirmed by associations known from literature and standard statistical inference. 
UR - https://medinform.jmir.org/2024/1/e49865 UR - http://dx.doi.org/10.2196/49865 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/49865 ER - TY - JOUR AU - Avnat, Eden AU - Samin, Michael AU - Ben Joya, Daniel AU - Schneider, Eyal AU - Yanko, Elia AU - Eshel, Dafna AU - Ovadia, Shahar AU - Lev, Yossi AU - Souroujon, Daniel PY - 2024/7/16 TI - The Potential of Evidence-Based Clinical Intake Tools to Discover or Ground Prevalence of Symptoms Using Real-Life Digital Health Encounters: Retrospective Cohort Study JO - J Med Internet Res SP - e49570 VL - 26 KW - clinical intake tool KW - evidence-based medicine KW - big data KW - digital health KW - symptoms KW - prevalence N2 - Background: Evidence-based clinical intake tools (EBCITs) are structured assessment tools used to gather information about patients and help health care providers make informed decisions. The growing demand for personalized medicine, along with the big data revolution, has rendered EBCITs a promising solution. EBCITs have the potential to provide comprehensive and individualized assessments of symptoms, enabling accurate diagnosis, while contributing to the grounding of medical care. Objective: This work aims to examine whether EBCITs cover data concerning disorders and symptoms to a similar extent as physicians, and thus can reliably address medical conditions in clinical settings. We also explore the potential of EBCITs to discover and ground the real prevalence of symptoms in different disorders, thereby expanding medical knowledge and further supporting medical diagnoses made by physicians. Methods: Between August 1, 2022, and January 15, 2023, patients who used the services of a digital health care (DH) provider in the United States were first assessed by the Kahun EBCIT. The Kahun platform gathered and analyzed the information from the sessions. This study estimated the prevalence of patients' symptoms in medical disorders using 2 data sets. 
The first data set analyzed symptom prevalence, as determined by Kahun's knowledge engine. The second data set analyzed symptom prevalence, relying solely on data from the DH patients gathered by Kahun. The variance difference between these 2 prevalence data sets helped us assess Kahun's ability to incorporate new data while integrating existing knowledge. To analyze the comprehensiveness of Kahun's knowledge engine, we compared how well it covers weighted data for the symptoms and disorders found in the 2019 National Ambulatory Medical Care Survey (NAMCS). To assess Kahun's diagnosis accuracy, physicians independently diagnosed 250 of Kahun-DH's sessions. Their diagnoses were compared with Kahun's diagnoses. Results: In this study, 2550 patients used Kahun to complete a full assessment. Kahun proposed 108,523 suggestions related to symptoms during the intake process. At the end of the intake process, 6496 conditions were presented to the caregiver. Kahun covered 94% (526,157,569/562,150,572) of the weighted symptoms and 91% (1,582,637,476/1,734,783,244) of the weighted disorders in the 2019 NAMCS. In 90% (224/250) of the sessions, both physicians and Kahun suggested at least one identical disorder, with a 72% (367/507) total accuracy rate. Kahun's engine yielded 519 prevalences while the Kahun-DH cohort yielded 599; 156 prevalences were unique to the latter and 443 prevalences were shared by both data sets. Conclusions: EBCITs, such as Kahun, encompass extensive amounts of knowledge and could serve as a reliable database for inferring medical insights and diagnoses. Using this credible database, the potential prevalence of symptoms in different disorders was discovered or grounded. This highlights the ability of EBCITs to refine the understanding of relationships between disorders and symptoms, which further supports physicians in medical diagnosis. 
UR - https://www.jmir.org/2024/1/e49570 UR - http://dx.doi.org/10.2196/49570 UR - http://www.ncbi.nlm.nih.gov/pubmed/39012659 ID - info:doi/10.2196/49570 ER - TY - JOUR AU - Ji, Hyerim AU - Kim, Seok AU - Sunwoo, Leonard AU - Jang, Sowon AU - Lee, Ho-Young AU - Yoo, Sooyoung PY - 2024/7/12 TI - Integrating Clinical Data and Medical Imaging in Lung Cancer: Feasibility Study Using the Observational Medical Outcomes Partnership Common Data Model Extension JO - JMIR Med Inform SP - e59187 VL - 12 KW - DICOM KW - OMOP KW - CDM KW - lung cancer KW - medical imaging KW - data integration KW - data quality KW - Common Data Model KW - Digital Imaging and Communications in Medicine KW - Observational Medical Outcomes Partnership N2 - Background: Digital transformation, particularly the integration of medical imaging with clinical data, is vital in personalized medicine. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) standardizes health data. However, integrating medical imaging remains a challenge. Objective: This study proposes a method for combining medical imaging data with the OMOP CDM to improve multimodal research. Methods: Our approach included the analysis and selection of Digital Imaging and Communications in Medicine (DICOM) header tags, validation of data formats, and alignment according to the OMOP CDM framework. The Fast Healthcare Interoperability Resources ImagingStudy profile guided our consistency in column naming and definitions. The Imaging Common Data Model (I-CDM), constructed using the entity-attribute-value model, facilitates scalable and efficient medical imaging data management. For patients with lung cancer diagnosed between 2010 and 2017, we introduced 4 new tables (IMAGING_STUDY, IMAGING_SERIES, IMAGING_ANNOTATION, and FILEPATH) to standardize various imaging-related data and link to clinical data. 
Results: This framework underscores the effectiveness of I-CDM in enhancing our understanding of lung cancer diagnostics and treatment strategies. The implementation of the I-CDM tables enabled the structured organization of a comprehensive data set, including 282,098 IMAGING_STUDY, 5,674,425 IMAGING_SERIES, and 48,536 IMAGING_ANNOTATION records, illustrating the extensive scope and depth of the approach. A scenario-based analysis using actual data from patients with lung cancer underscored the feasibility of our approach. A data quality check applying 44 specific rules confirmed the high integrity of the constructed data set, with all checks successfully passed, underscoring the reliability of our findings. Conclusions: These findings indicate that I-CDM can improve the integration and analysis of medical imaging and clinical data. By addressing the challenges in data standardization and management, our approach contributes toward enhancing diagnostics and treatment strategies. Future research should expand the application of I-CDM to diverse disease populations and explore its wide-ranging utility for medical conditions. UR - https://medinform.jmir.org/2024/1/e59187 UR - http://dx.doi.org/10.2196/59187 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/59187 ER - TY - JOUR AU - Duggan, M. Nicole AU - Jin, Mike AU - Duran Mendicuti, Alejandra Maria AU - Hallisey, Stephen AU - Bernier, Denie AU - Selame, A. Lauren AU - Asgari-Targhi, Ameneh AU - Fischetti, E. Chanel AU - Lucassen, Ruben AU - Samir, E. Anthony AU - Duhaime, Erik AU - Kapur, Tina AU - Goldsmith, J. 
Andrew PY - 2024/7/4 TI - Gamified Crowdsourcing as a Novel Approach to Lung Ultrasound Data Set Labeling: Prospective Analysis JO - J Med Internet Res SP - e51397 VL - 26 KW - crowdsource KW - crowdsourced KW - crowdsourcing KW - machine learning KW - artificial intelligence KW - point-of-care ultrasound KW - POCUS KW - lung ultrasound KW - B-lines KW - gamification KW - gamify KW - gamified KW - label KW - labels KW - labeling KW - classification KW - lung KW - pulmonary KW - respiratory KW - ultrasound KW - imaging KW - medical image KW - diagnostic KW - diagnose KW - diagnosis KW - data science N2 - Background: Machine learning (ML) models can yield faster and more accurate medical diagnoses; however, developing ML models is limited by a lack of high-quality labeled training data. Crowdsourced labeling is a potential solution but can be constrained by concerns about label quality. Objective: This study aims to examine whether a gamified crowdsourcing platform with continuous performance assessment, user feedback, and performance-based incentives could produce expert-quality labels on medical imaging data. Methods: In this diagnostic comparison study, 2384 lung ultrasound clips were retrospectively collected from 203 emergency department patients. A total of 6 lung ultrasound experts classified 393 of these clips as having no B-lines, one or more discrete B-lines, or confluent B-lines to create 2 sets of reference standard data sets (195 training clips and 198 test clips). Sets were respectively used to (1) train users on a gamified crowdsourcing platform and (2) compare the concordance of the resulting crowd labels to the concordance of individual experts to reference standards. Crowd opinions were sourced from DiagnosUs (Centaur Labs) iOS app users over 8 days, filtered based on past performance, aggregated using majority rule, and analyzed for label concordance compared with a hold-out test set of expert-labeled clips. 
The primary outcome was comparing the labeling concordance of collated crowd opinions to trained experts in classifying B-lines on lung ultrasound clips. Results: Our clinical data set included patients with a mean age of 60.0 (SD 19.0) years; 105 (51.7%) patients were female and 114 (56.1%) patients were White. Over the 195 training clips, the expert-consensus label distribution was 114 (58%) no B-lines, 56 (29%) discrete B-lines, and 25 (13%) confluent B-lines. Over the 198 test clips, expert-consensus label distribution was 138 (70%) no B-lines, 36 (18%) discrete B-lines, and 24 (12%) confluent B-lines. In total, 99,238 opinions were collected from 426 unique users. On a test set of 198 clips, the mean labeling concordance of individual experts relative to the reference standard was 85.0% (SE 2.0), compared with 87.9% crowdsourced label concordance (P=.15). When individual experts' opinions were compared with reference standard labels created by majority vote excluding their own opinion, crowd concordance was higher than the mean concordance of individual experts to reference standards (87.4% vs 80.8%, SE 1.6 for expert concordance; P<.001). Clips with discrete B-lines had the most disagreement from both the crowd consensus and individual experts with the expert consensus. Using randomly sampled subsets of crowd opinions, 7 quality-filtered opinions were sufficient to achieve near the maximum crowd concordance. Conclusions: Crowdsourced labels for B-line classification on lung ultrasound clips via a gamified approach achieved expert-level accuracy. This suggests a strategic role for gamified crowdsourcing in efficiently generating labeled image data sets for training ML systems. 
UR - https://www.jmir.org/2024/1/e51397 UR - http://dx.doi.org/10.2196/51397 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/51397 ER - TY - JOUR AU - Faust, Louis AU - Wilson, Patrick AU - Asai, Shusaku AU - Fu, Sunyang AU - Liu, Hongfang AU - Ruan, Xiaoyang AU - Storlie, Curt PY - 2024/6/28 TI - Considerations for Quality Control Monitoring of Machine Learning Models in Clinical Practice JO - JMIR Med Inform SP - e50437 VL - 12 KW - artificial intelligence KW - machine learning KW - implementation science KW - quality control KW - monitoring KW - patient safety UR - https://medinform.jmir.org/2024/1/e50437 UR - http://dx.doi.org/10.2196/50437 UR - http://www.ncbi.nlm.nih.gov/pubmed/38941140 ID - info:doi/10.2196/50437 ER - TY - JOUR AU - Yuan, Yannan AU - Mei, Yun AU - Zhao, Shuhua AU - Dai, Shenglong AU - Liu, Xiaohong AU - Sun, Xiaojing AU - Fu, Zhiying AU - Zhou, Liheng AU - Ai, Jie AU - Ma, Liheng AU - Jiang, Min PY - 2024/6/27 TI - Data Flow Construction and Quality Evaluation of Electronic Source Data in Clinical Trials: Pilot Study Based on Hospital Electronic Medical Records in China JO - JMIR Med Inform SP - e52934 VL - 12 KW - clinical trials KW - electronic source data KW - EHRs KW - electronic data capture systems KW - data quality KW - electronic health records N2 - Background: The traditional clinical trial data collection process requires a clinical research coordinator who is authorized by the investigators to read from the hospital's electronic medical record. Using electronic source data opens a new path to extract patients' data from electronic health records (EHRs) and transfer them directly to an electronic data capture (EDC) system; this method is often referred to as eSource. eSource technology in a clinical trial data flow can improve data quality without compromising timeliness. At the same time, improved data collection efficiency reduces clinical trial costs. 
Objective: This study aims to explore how to extract clinical trial-related data from hospital EHR systems, transform the data into a format required by the EDC system, and transfer it into sponsors' environments, and to evaluate the transferred data sets to validate the availability, completeness, and accuracy of building an eSource dataflow. Methods: A prospective clinical trial study registered on the Drug Clinical Trial Registration and Information Disclosure Platform was selected, and the following data modules were extracted from the structured data of 4 case report forms: demographics, vital signs, local laboratory data, and concomitant medications. The extracted data was mapped and transformed, deidentified, and transferred to the sponsor's environment. Data validation was performed based on availability, completeness, and accuracy. Results: In a secure and controlled data environment, clinical trial data was successfully transferred from a hospital EHR to the sponsor's environment with 100% transcriptional accuracy, but the availability and completeness of the data could be improved. Conclusions: Data availability was low due to some required fields in the EDC system not being available directly in the EHR. Some data is also still in an unstructured or paper-based format. The top-level design of the eSource technology and the construction of hospital electronic data standards should help lay a foundation for a full electronic data flow from EHRs to EDC systems in the future. 
UR - https://medinform.jmir.org/2024/1/e52934 UR - http://dx.doi.org/10.2196/52934 ID - info:doi/10.2196/52934 ER - TY - JOUR AU - Akiya, Ippei AU - Ishihara, Takuma AU - Yamamoto, Keiichi PY - 2024/6/18 TI - Comparison of Synthetic Data Generation Techniques for Control Group Survival Data in Oncology Clinical Trials: Simulation Study JO - JMIR Med Inform SP - e55118 VL - 12 KW - oncology clinical trial KW - survival analysis KW - synthetic patient data KW - machine learning KW - SPD KW - simulation N2 - Background: Synthetic patient data (SPD) generation for survival analysis in oncology trials holds significant potential for accelerating clinical development. Various machine learning methods, including classification and regression trees (CART), random forest (RF), Bayesian network (BN), and conditional tabular generative adversarial network (CTGAN), have been used for this purpose, but their performance in reflecting actual patient survival data remains under investigation. Objective: The aim of this study was to determine the most suitable SPD generation method for oncology trials, specifically focusing on both progression-free survival (PFS) and overall survival (OS), which are the primary evaluation end points in oncology trials. To achieve this goal, we conducted a comparative simulation of 4 generation methods, including CART, RF, BN, and the CTGAN, and the performance of each method was evaluated. Methods: Using multiple clinical trial data sets, 1000 data sets were generated by using each method for each clinical trial data set and evaluated as follows: (1) median survival time (MST) of PFS and OS; (2) hazard ratio distance (HRD), which indicates the similarity between the actual survival function and a synthetic survival function; and (3) visual analysis of Kaplan-Meier (KM) plots. Each method's ability to mimic the statistical properties of real patient data was evaluated from these multiple angles. 
Results: In most simulation cases, CART demonstrated high percentages of MSTs for synthetic data falling within the 95% CI range of the MST of the actual data. These percentages ranged from 88.8% to 98.0% for PFS and from 60.8% to 96.1% for OS. In the evaluation of HRD, CART revealed that HRD values were concentrated at approximately 0.9. Conversely, for the other methods, no consistent trend was observed for either PFS or OS. CART demonstrated better similarity than RF because CART overfit the data, whereas RF (an ensemble learning approach) prevented overfitting. In SPD generation, the statistical properties close to the actual data should be the focus, not a well-generalized prediction model. Neither the BN nor the CTGAN method could accurately reflect the statistical properties of the actual data because these methods are not suited to small data sets. Conclusions: As a method for generating SPD for survival data from small data sets, such as clinical trial data, CART proved to be the most effective method compared with RF, BN, and CTGAN. Additionally, it is possible to improve CART-based generation methods by incorporating feature engineering and other methods in future work. UR - https://medinform.jmir.org/2024/1/e55118 UR - http://dx.doi.org/10.2196/55118 ID - info:doi/10.2196/55118 ER - TY - JOUR AU - Vagnetti, Roberto AU - Camp, Nicola AU - Story, Matthew AU - Ait-Belaid, Khaoula AU - Mitra, Suvobrata AU - Zecca, Massimiliano AU - Di Nuovo, Alessandro AU - Magistro, Daniele PY - 2024/6/5 TI - Instruments for Measuring Psychological Dimensions in Human-Robot Interaction: Systematic Review of Psychometric Properties JO - J Med Internet Res SP - e55597 VL - 26 KW - psychometric KW - human-robot interaction KW - psychological dimensions KW - robot KW - assessment KW - systematic review N2 - Background: Numerous user-related psychological dimensions can significantly influence the dynamics between humans and robots. 
For developers and researchers, it is crucial to have a comprehensive understanding of the psychometric properties of the available instruments used to assess these dimensions as they indicate the reliability and validity of the assessment. Objective: This study aims to provide a systematic review of the instruments available for assessing the psychological aspects of the relationship between people and social and domestic robots, offering a summary of their psychometric properties and the quality of the evidence. Methods: A systematic review was conducted following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines across different databases: Scopus, PubMed, and IEEE Xplore. The search strategy encompassed studies meeting the following inclusion criteria: (1) the instrument could assess psychological dimensions related to social and domestic robots, including attitudes, beliefs, opinions, feelings, and perceptions; (2) the study focused on validating the instrument; (3) the study evaluated the psychometric properties of the instrument; (4) the study underwent peer review; and (5) the study was in English. Studies focusing on industrial robots, rescue robots, or robotic arms or those primarily concerned with technology validation or measuring anthropomorphism were excluded. Independent reviewers extracted instrument properties and the methodological quality of their evidence following the Consensus-Based Standards for the Selection of Health Measurement Instruments guidelines. Results: From 3828 identified records, the search strategy yielded 34 (0.89%) articles that validated and examined the psychometric properties of 27 instruments designed to assess individuals' psychological dimensions in relation to social and domestic robots. These instruments encompass a broad spectrum of psychological dimensions. 
While most studies predominantly focused on structural validity (24/27, 89%) and internal consistency (26/27, 96%), consideration of other psychometric properties was frequently inconsistent or absent. No instrument evaluated measurement error and responsiveness despite their significance in the clinical context. Most of the instruments (17/27, 63%) were targeted at both adults and older adults (aged ≥18 years). There was a limited number of instruments specifically designed for children, older adults, and health care contexts. Conclusions: Given the strong interest in assessing psychological dimensions in the human-robot relationship, there is a need to develop new instruments using more rigorous methodologies and consider a broader range of psychometric properties. This is essential to ensure the creation of reliable and valid measures for assessing people's psychological dimensions regarding social and domestic robots. Among its limitations, this review included instruments applicable to both social and domestic robots while excluding those for other specific types of robots (eg, industrial robots). UR - https://www.jmir.org/2024/1/e55597 UR - http://dx.doi.org/10.2196/55597 UR - http://www.ncbi.nlm.nih.gov/pubmed/38682783 ID - info:doi/10.2196/55597 ER - TY - JOUR AU - Bittmann, A. Janina AU - Scherkl, Camilo AU - Meid, D. Andreas AU - Haefeli, E. Walter AU - Seidling, M. 
Hanna PY - 2024/6/4 TI - Event Analysis for Automated Estimation of Absent and Persistent Medication Alerts: Novel Methodology JO - JMIR Med Inform SP - e54428 VL - 12 KW - clinical decision support system KW - CDSS KW - medication alert system KW - alerting KW - alert acceptance KW - event analysis N2 - Background: Event analysis is a promising approach to estimate the acceptance of medication alerts issued by computerized physician order entry (CPOE) systems with an integrated clinical decision support system (CDSS), particularly when alerts cannot be interactively confirmed in the CPOE-CDSS due to its system architecture. Medication documentation is then reviewed for documented evidence of alert acceptance, which can be a time-consuming process, especially when performed manually. Objective: We present a new automated event analysis approach, which was applied to a large data set generated in a CPOE-CDSS with passive, noninterruptive alerts. Methods: Medication and alert data generated over 3.5 months within the CPOE-CDSS at Heidelberg University Hospital were divided into 24-hour time intervals in which the alert display was correlated with associated prescription changes. Alerts were considered "persistent" if they were displayed in every consecutive 24-hour time interval due to a respective active prescription until patient discharge and were considered "absent" if they were no longer displayed during continuous prescriptions in the subsequent interval. Results: Overall, 1670 patient cases with 11,428 alerts were analyzed. Alerts were displayed for a median of 3 (IQR 1-7) consecutive 24-hour time intervals, with the shortest alerts displayed for drug-allergy interactions and the longest alerts displayed for potentially inappropriate medication for the elderly (PIM). Among the total 11,428 alerts, 56.1% (n=6413) became absent, most commonly among alerts for drug-drug interactions (1915/2366, 80.9%) and least commonly among PIM alerts (199/499, 39.9%). 
Conclusions: This new approach to estimate alert acceptance based on event analysis can be flexibly adapted to the automated evaluation of passive, noninterruptive alerts. This enables large data sets of longitudinal patient cases to be processed, allows for the derivation of the ratios of persistent and absent alerts, and facilitates the comparison and prospective monitoring of these alerts. UR - https://medinform.jmir.org/2024/1/e54428 UR - http://dx.doi.org/10.2196/54428 ID - info:doi/10.2196/54428 ER - TY - JOUR AU - Eerdekens, Rob AU - Zelis, Jo AU - ter Horst, Herman AU - Crooijmans, Caia AU - van 't Veer, Marcel AU - Keulards, Danielle AU - Kelm, Marcus AU - Archer, Gareth AU - Kuehne, Titus AU - Brueren, Guus AU - Wijnbergen, Inge AU - Johnson, Nils AU - Tonino, Pim PY - 2024/6/3 TI - Cardiac Health Assessment Using a Wearable Device Before and After Transcatheter Aortic Valve Implantation: Prospective Study JO - JMIR Mhealth Uhealth SP - e53964 VL - 12 KW - aortic valve stenosis KW - health watch KW - quality of life KW - heart KW - cardiology KW - cardiac KW - aortic KW - valve KW - stenosis KW - watch KW - smartwatch KW - wearables KW - 6MWT KW - walking KW - test KW - QoL KW - WHOQOL-BREF KW - 6-minute walking test N2 - Background: Due to aging of the population, the prevalence of aortic valve stenosis will increase drastically in upcoming years. Consequently, transcatheter aortic valve implantation (TAVI) procedures will also expand worldwide. Optimal selection of patients who benefit with improved symptoms and prognoses is key, since TAVI is not without its risks. Currently, we are not able to adequately predict functional outcomes after TAVI. Quality of life measurement tools and traditional functional assessment tests do not always agree and can depend on factors unrelated to heart disease. Activity tracking using wearable devices might provide a more comprehensive assessment. 
Objective: This study aimed to identify objective parameters (eg, change in heart rate) associated with improvement after TAVI for severe aortic stenosis from a wearable device. Methods: In total, 100 patients undergoing routine TAVI wore a Philips Health Watch device for 1 week before and after the procedure. Watch data were analyzed offline: before TAVI for 97 patients and after TAVI for 75 patients. Results: Parameters such as the total number of steps and activity time did not change, in contrast to improvements in the 6-minute walking test (6MWT) and physical limitation domain of the transformed WHOQOL-BREF questionnaire. Conclusions: These findings, in an older TAVI population, show that watch-based parameters, such as the number of steps, do not change after TAVI, unlike traditional 6MWT and QoL assessments. Basic wearable device parameters might be less appropriate for measuring treatment effects from TAVI. UR - https://mhealth.jmir.org/2024/1/e53964 UR - http://dx.doi.org/10.2196/53964 ID - info:doi/10.2196/53964 ER - TY - JOUR AU - Jiang, Yuyan AU - Liu, Xue-li AU - Zhang, Zixuan AU - Yang, Xinru PY - 2024/5/31 TI - Evaluation and Comparison of Academic Impact and Disruptive Innovation Level of Medical Journals: Bibliometric Analysis and Disruptive Evaluation JO - J Med Internet Res SP - e55121 VL - 26 KW - medical journals KW - journal evaluation KW - innovative evaluation KW - journal disruption index KW - disruptive innovation KW - academic impact KW - peer review N2 - Background: As an important platform for researchers to present their academic findings, medical journals have a close relationship between their evaluation orientation and the value orientation of their published research results. However, the differences between the academic impact and level of disruptive innovation of medical journals have not been examined by any study yet. 
Objective: This study aims to compare the relationships and differences between the academic impact, disruptive innovation levels, and peer review results of medical journals and published research papers. We also analyzed the similarities and differences in the impact evaluations, disruptive innovations, and peer reviews for different types of medical research papers and the underlying reasons. Methods: The general and internal medicine Science Citation Index Expanded (SCIE) journals in 2018 were chosen as the study object to explore the differences in the academic impact and level of disruptive innovation of medical journals based on the OpenCitations Index of PubMed open PMID-to-PMID citations (POCI) and H1Connect databases, respectively, and we compared them with the results of peer review. Results: First, the correlation coefficients of the Journal Disruption Index (JDI) with the Journal Cumulative Citation for 5 years (JCC5), Journal Impact Factor (JIF), and Journal Citation Indicator (JCI) were 0.677, 0.585, and 0.621, respectively. The correlation coefficient of the absolute disruption index (Dz) with the Cumulative Citation for 5 years (CC5) was 0.635. However, the average difference in the disruptive innovation and academic influence rankings of journals reached 20 places (about 17.5%). The average difference in the disruptive innovation and influence rankings of research papers reached about 2700 places (about 17.7%). The differences reflect the essential difference between the two evaluation systems. Second, the top 7 journals selected based on JDI, JCC5, JIF, and JCI were the same, and all of them were H-journals. In contrast, 8 (8/15, 53%), 96 (96/150, 64%), and 880 (880/1500, 58.67%) of the top 0.1%, top 1%, and top 10% papers selected based on Dz and CC5, respectively, were the same. Third, research papers with the "changes clinical practice" 
tag showed only moderate innovation (4.96) and impact (241.67) levels but had high levels of peer-reviewed recognition (6.00) and attention (2.83). Conclusions: The results of the study show that research evaluation based on innovative indicators is detached from the traditional impact evaluation system. The 3 evaluation systems (impact evaluation, disruptive innovation evaluation, and peer review) only have high consistency for authoritative journals and top papers. Neither a single impact indicator nor an innovative indicator can directly reflect the impact of medical research for clinical practice. How to establish an integrated, comprehensive, scientific, and reasonable journal evaluation system to improve the existing evaluation system of medical journals still needs further research. UR - https://www.jmir.org/2024/1/e55121 UR - http://dx.doi.org/10.2196/55121 UR - http://www.ncbi.nlm.nih.gov/pubmed/38820583 ID - info:doi/10.2196/55121 ER - TY - JOUR AU - Chelli, Mikaël AU - Descamps, Jules AU - Lavoué, Vincent AU - Trojani, Christophe AU - Azar, Michel AU - Deckert, Marcel AU - Raynier, Jean-Luc AU - Clowez, Gilles AU - Boileau, Pascal AU - Ruetsch-Chelli, Caroline PY - 2024/5/22 TI - Hallucination Rates and Reference Accuracy of ChatGPT and Bard for Systematic Reviews: Comparative Analysis JO - J Med Internet Res SP - e53164 VL - 26 KW - artificial intelligence KW - large language models KW - ChatGPT KW - Bard KW - rotator cuff KW - systematic reviews KW - literature search KW - hallucinated KW - human conducted N2 - Background: Large language models (LLMs) have raised both interest and concern in the academic community. They offer the potential for automating literature search and synthesis for systematic reviews but raise concerns regarding their reliability, as the tendency to generate unsupported (hallucinated) content persists. 
Objective: The aim of the study is to assess the performance of LLMs such as ChatGPT and Bard (subsequently rebranded Gemini) in producing references in the context of scientific writing. Methods: The performance of ChatGPT and Bard in replicating the results of human-conducted systematic reviews was assessed. Using systematic reviews pertaining to shoulder rotator cuff pathology, these LLMs were tested by providing the same inclusion criteria and comparing the results with original systematic review references, serving as gold standards. The study used 3 key performance metrics: recall, precision, and F1-score, alongside the hallucination rate. Papers were considered "hallucinated" if any 2 of the following pieces of information were wrong: title, first author, or year of publication. Results: In total, 11 systematic reviews across 4 fields yielded 33 prompts to LLMs (3 LLMs×11 reviews), with 471 references analyzed. Precision rates for GPT-3.5, GPT-4, and Bard were 9.4% (13/139), 13.4% (16/119), and 0% (0/104) respectively (P<.001). Recall rates were 11.9% (13/109) for GPT-3.5 and 13.7% (15/109) for GPT-4, with Bard failing to retrieve any relevant papers (P<.001). Hallucination rates stood at 39.6% (55/139) for GPT-3.5, 28.6% (34/119) for GPT-4, and 91.4% (95/104) for Bard (P<.001). Further analysis of nonhallucinated papers retrieved by GPT models revealed significant differences in identifying various criteria, such as randomized studies, participant criteria, and intervention criteria. The study also noted the geographical and open-access biases in the papers retrieved by the LLMs. Conclusions: Given their current performance, it is not recommended for LLMs to be deployed as the primary or exclusive tool for conducting systematic reviews. Any references generated by such models warrant thorough validation by researchers. 
The high occurrence of hallucinations in LLMs highlights the necessity for refining their training and functionality before confidently using them for rigorous academic purposes. UR - https://www.jmir.org/2024/1/e53164 UR - http://dx.doi.org/10.2196/53164 UR - http://www.ncbi.nlm.nih.gov/pubmed/38776130 ID - info:doi/10.2196/53164 ER - TY - JOUR AU - Jayasinghe, Thisakya Randi AU - Ahern, Susannah AU - Maharaj, D. Ashika AU - Romero, Lorena AU - Ruseckaite, Rasa PY - 2024/5/21 TI - Identifying Existing Guidelines, Frameworks, Checklists, and Recommendations for Implementing Patient-Reported Outcome Measures: Protocol for a Scoping Review JO - JMIR Res Protoc SP - e52572 VL - 13 KW - patient-reported outcome measures KW - patient-reported outcomes KW - quality of life KW - clinical quality registry KW - guidelines KW - frameworks KW - recommendations KW - scoping review KW - patient perspectives KW - patient perspective KW - patient-reported outcome KW - patient-reported KW - clinical setting KW - clinical registry KW - registry KW - systematic review N2 - Background: Implementing patient-reported outcome measures (PROMs) to measure and evaluate health outcomes is increasing worldwide. Along with this emerging trend, it is important to identify which guidelines, frameworks, checklists, and recommendations exist, and if and how they have been used in implementing PROMs, especially in clinical quality registries (CQRs). Objective: This review aims to identify existing publications, as well as publications that discuss the application of actual guidelines, frameworks, checklists, and recommendations on PROMs' implementation for various purposes such as clinical trials, clinical practice, and CQRs. In addition, the identified publications will be used to guide the development of a new guideline for PROMs' implementation in CQRs, which is the aim of the broader project. 
Methods: A literature search of the databases MEDLINE, Embase, CINAHL, PsycINFO, and Cochrane Central Register of Controlled Trials will be conducted from the inception of the databases, in addition to using Google Scholar and gray literature to identify literature for the scoping review. Predefined inclusion and exclusion criteria will be used for all phases of screening. Existing publications of guidelines, frameworks, checklists, recommendations, and publications discussing the application of those methodologies for implementing PROMs in clinical trials, clinical practice, and CQRs will be included in the final review. Data relating to bibliographic information, aim, the purpose of PROMs use (clinical trial, practice, or registries), name of guideline, framework, checklist and recommendations, the rationale for development, and their purpose and implications will be extracted. Additionally, for publications of actual methodologies, aspects or domains of PROMs' implementation will be extracted. A narrative synthesis of included publications will be conducted. Results: The electronic database searches were completed in March 2024. Title and abstract screening, full-text screening, and data extraction will be completed in May 2024. The review is expected to be completed by the end of August 2024. Conclusions: The findings of this scoping review will provide evidence on any existing methodologies and tools for PROMs' implementation in clinical trials, clinical practice, and CQRs. It is anticipated that the publications will help us guide the development of a new guideline for PROMs' implementation in CQRs. 
Trial Registration: PROSPERO CRD42022366085; https://tinyurl.com/bdesk98x International Registered Report Identifier (IRRID): DERR1-10.2196/52572 UR - https://www.researchprotocols.org/2024/1/e52572 UR - http://dx.doi.org/10.2196/52572 UR - http://www.ncbi.nlm.nih.gov/pubmed/38771621 ID - info:doi/10.2196/52572 ER - TY - JOUR AU - Hawa, Saadiya AU - Bane, Shalmali AU - Kinsler, Kayla AU - Rector, Amadeia AU - Chaichian, Yashaar AU - Falasinnu, Titilola AU - Simard, F. Julia PY - 2024/5/14 TI - Impact of Incentives on Physician Participation in Research Surveys: Randomized Experiment JO - JMIR Form Res SP - e54343 VL - 8 KW - internet survey KW - incentive KW - physician recruitment KW - internet surveys KW - online survey KW - online surveys KW - web-based survey KW - web-based surveys KW - survey KW - surveys KW - incentives KW - monetary incentive KW - monetary incentives KW - physician participation KW - physician participant KW - physician participants KW - physician KW - physicians KW - doctor participation KW - doctor participant KW - doctor participants KW - doctor KW - doctors KW - neurologist KW - neurologists N2 - Background: Web-based surveys can be effective data collection instruments; however, participation is notoriously low, particularly among professionals such as physicians. Few studies have explored the impact of varying amounts of monetary incentives on survey completion. Objective: This study aims to conduct a randomized study to assess how different incentive amounts influenced survey participation among neurologists in the United States. Methods: We distributed a web-based survey using standardized email text to 21,753 individuals randomly divided into 5 equal groups (≈4351 per group). In phase 1, each group was assigned to receive either nothing or a gift card for US $10, $20, $50, or $75, which was noted in the email subject and text. 
After 4 reminders, phase 2 began and each remaining individual was offered a US $75 gift card to complete the survey. We calculated and compared the proportions who completed the survey by phase 1 arm, both before and after the incentive change, using a chi-square test. As a secondary outcome, we also looked at survey participation as opposed to completion. Results: For the 20,820 emails delivered, 879 (4.2%) recipients completed the survey; of the 879 recipients, 622 (70.8%) were neurologists. Among the neurologists, most were male (412/622, 66.2%), White (430/622, 69.1%), non-Hispanic (592/622, 95.2%), graduates of American medical schools (465/622, 74.8%), and board certified (598/622, 96.1%). A total of 39.7% (247/622) completed their neurology residency more than 20 years ago, and 62.4% (388/622) practiced in an urban setting. For phase 1, the proportions of respondents completing the survey increased as the incentive amount increased (46/4185, 1.1%; 76/4165, 1.8%; 86/4160, 2.1%; 104/4162, 2.5%; and 119/4148, 2.9%, for US $0, $10, $20, $50, and $75, respectively; P<.001). In phase 2, the survey completion rate for the former US $0 arm increased to 3% (116/3928). Those originally offered US $10, $20, $50, and $75 who had not yet participated were less likely to participate compared with the former US $0 arm (116/3928, 3%; 90/3936, 2.3%; 80/3902, 2.1%; 88/3845, 2.3%; and 74/3878, 1.9%, for US $0, $10, $20, $50, and $75, respectively; P=.03). For our secondary outcome of survey participation, a trend similar to that of survey completion was observed in phase 1 (55/4185, 1.3%; 85/4165, 2%; 96/4160, 2.3%; 118/4162, 2.8%; and 135/4148, 3.3%, for US $0, $10, $20, $50, and $75, respectively; P<.001) and phase 2 (116/3928, 3%; 90/3936, 2.3%; 80/3902, 2.1%; 88/3845, 2.3%; and 86/3845, 2.2%, for US $0, $10, $20, $50, and $75, respectively; P=.10). 
Conclusions: As expected, monetary incentives can boost physician survey participation and completion, with a positive correlation between the amount offered and participation. UR - https://formative.jmir.org/2024/1/e54343 UR - http://dx.doi.org/10.2196/54343 UR - http://www.ncbi.nlm.nih.gov/pubmed/38743466 ID - info:doi/10.2196/54343 ER - TY - JOUR AU - Attarha, Mouna AU - Mahncke, Henry AU - Merzenich, Michael PY - 2024/5/13 TI - The Real-World Usability, Feasibility, and Performance Distributions of Deploying a Digital Toolbox of Computerized Assessments to Remotely Evaluate Brain Health: Development and Usability Study JO - JMIR Form Res SP - e53623 VL - 8 KW - web-based cognitive assessment KW - remote data collection KW - neurocognition KW - cognitive profiles KW - normative assessment data KW - brain health KW - cognitive status KW - assessment accessibility N2 - Background: An ongoing global challenge is managing brain health and understanding how performance changes across the lifespan. Objective: We developed and deployed a set of self-administrable, computerized assessments designed to measure key indexes of brain health across the visual and auditory sensory modalities. In this pilot study, we evaluated the usability, feasibility, and performance distributions of the assessments in a home-based, real-world setting without supervision. Methods: Potential participants were untrained users who self-registered on an existing brain training app called BrainHQ. Participants were contacted via a recruitment email and registered remotely to complete a demographics questionnaire and 29 unique assessments on their personal devices. We examined participant engagement, descriptive and psychometric properties of the assessments, associations between performance and self-reported demographic variables, cognitive profiles, and factor loadings. 
Results: Of the 365,782 potential participants contacted via a recruitment email, 414 (0.11%) registered, of whom 367 (88.6%) completed at least one assessment and 104 (25.1%) completed all 29 assessments. Registered participants were, on average, aged 63.6 (SD 14.8; range 13-107) years, mostly female (265/414, 64%), educated (329/414, 79.5% with a degree), and White (349/414, 84.3% White and 48/414, 11.6% people of color). A total of 72% (21/29) of the assessments showed no ceiling or floor effects or had easily modifiable score bounds to eliminate these effects. When correlating performance with self-reported demographic variables, 72% (21/29) of the assessments were sensitive to age, 72% (21/29) of the assessments were insensitive to gender, 93% (27/29) of the assessments were insensitive to race and ethnicity, and 93% (27/29) of the assessments were insensitive to education-based differences. Assessments were brief, with a mean duration of 3 (SD 1.0) minutes per task. The pattern of performance across the assessments revealed distinctive cognitive profiles and loaded onto 4 independent factors. Conclusions: The assessments were both usable and feasible and warrant a full normative study. A digital toolbox of scalable and self-administrable assessments that can evaluate brain health at a glance (and longitudinally) may lead to novel future applications across clinical trials, diagnostics, and performance optimization. UR - https://formative.jmir.org/2024/1/e53623 UR - http://dx.doi.org/10.2196/53623 UR - http://www.ncbi.nlm.nih.gov/pubmed/38739916 ID - info:doi/10.2196/53623 ER - TY - JOUR AU - Chiu, Keith Wan Hang AU - Ko, Koel Wei Sum AU - Cho, Shing William Chi AU - Hui, Joanne Sin Yu AU - Chan, Lawrence Wing Chi AU - Kuo, D. 
Michael PY - 2024/5/13 TI - Evaluating the Diagnostic Performance of Large Language Models on Complex Multimodal Medical Cases JO - J Med Internet Res SP - e53724 VL - 26 KW - large language model KW - hospital KW - health center KW - Massachusetts KW - statistical analysis KW - chi-square KW - ANOVA KW - clinician KW - physician KW - performance KW - proficiency KW - disease etiology UR - https://www.jmir.org/2024/1/e53724 UR - http://dx.doi.org/10.2196/53724 UR - http://www.ncbi.nlm.nih.gov/pubmed/38739441 ID - info:doi/10.2196/53724 ER - TY - JOUR AU - Jia, Yan AU - Li, Qi AU - Zhang, Xiaowen AU - Yan, Yi AU - Yan, Shiyan AU - Li, Shunping AU - Li, Wei AU - Wu, Xiaowen AU - Rong, Hongguo AU - Liu, Jianping PY - 2024/5/8 TI - Application of Patient-Reported Outcome Measurements in Adult Tumor Clinical Trials in China: Cross-Sectional Study JO - J Med Internet Res SP - e45719 VL - 26 KW - patient-reported outcomes KW - tumor KW - cross-sectional study KW - quality of life KW - outcome study N2 - Background: International health policies and researchers have emphasized the value of evaluating patient-reported outcomes (PROs) in clinical studies. However, the characteristics of PROs in adult tumor clinical trials in China remain insufficiently elucidated. Objective: This study aims to assess the application and characteristics of PRO instruments as primary or secondary outcomes in adult randomized clinical trials related to tumors in China. Methods: This cross-sectional study identified tumor-focused randomized clinical trials conducted in China between January 1, 2010, and June 30, 2022. The ClinicalTrials.gov database and the Chinese Clinical Trial Registry were selected as the databases. 
Trials were classified into four groups based on the use of PRO instruments: (1) trials listing PRO instruments as primary outcomes, (2) trials listing PRO instruments as secondary outcomes, (3) trials listing PRO instruments as coprimary outcomes, and (4) trials without any mention of PRO instruments. Pertinent data, including study phase, settings, geographic regions, centers, participant demographics (age and sex), funding sources, intervention types, target diseases, and the names of PRO instruments, were extracted from these trials. The target diseases involved in the trials were grouped according to the American Joint Committee on Cancer Staging Manual, 8th Edition. Results: Among the 6445 trials examined, 2390 (37.08%) incorporated PRO instruments as part of their outcomes. Within this subset, 26.82% (641/2390) listed PRO instruments as primary outcomes, 52.72% (1260/2390) as secondary outcomes, and 20.46% (489/2390) as coprimary outcomes. Among the 2,155,306 participants included in these trials, PRO instruments were used to collect data from 613,648 (28.47%) patients as primary or secondary outcomes and from 74,287 (3.45%) patients as coprimary outcomes. The most common conditions explicitly using specified PRO instruments included thorax tumors (217/1280, 16.95%), breast tumors (176/1280, 13.75%), and lower gastrointestinal tract tumors (173/1280, 13.52%). Frequently used PRO instruments included the European Organisation for Research and Treatment of Cancer Quality of Life Core Questionnaire-30, the visual analog scale, the numeric rating scale, the Traditional Chinese Medicine Symptom Scale, and the Pittsburgh Sleep Quality Index. Conclusions: Over recent years, the incorporation of PROs has demonstrated an upward trajectory in adult randomized clinical trials on tumors in China. Nonetheless, the infrequent measurement of the patient's voice remains noteworthy. 
Disease-specific PRO instruments should be more effectively incorporated into various tumor disease categories in clinical trials, and there is room for improvement in the inclusion of PRO instruments as clinical trial end points. UR - https://www.jmir.org/2024/1/e45719 UR - http://dx.doi.org/10.2196/45719 UR - http://www.ncbi.nlm.nih.gov/pubmed/38718388 ID - info:doi/10.2196/45719 ER - TY - JOUR AU - Zhu, Lingxuan AU - Mou, Weiming AU - Hong, Chenglin AU - Yang, Tao AU - Lai, Yancheng AU - Qi, Chang AU - Lin, Anqi AU - Zhang, Jian AU - Luo, Peng PY - 2024/5/6 TI - The Evaluation of Generative AI Should Include Repetition to Assess Stability JO - JMIR Mhealth Uhealth SP - e57978 VL - 12 KW - large language model KW - generative AI KW - ChatGPT KW - artificial intelligence KW - health care UR - https://mhealth.jmir.org/2024/1/e57978 UR - http://dx.doi.org/10.2196/57978 UR - http://www.ncbi.nlm.nih.gov/pubmed/38688841 ID - info:doi/10.2196/57978 ER - TY - JOUR AU - Yue, Qi-Xuan AU - Ding, Ruo-Fan AU - Chen, Wei-Hao AU - Wu, Lv-Ying AU - Liu, Ke AU - Ji, Zhi-Liang PY - 2024/5/3 TI - Mining Real-World Big Data to Characterize Adverse Drug Reaction Quantitatively: Mixed Methods Study JO - J Med Internet Res SP - e48572 VL - 26 KW - clinical drug toxicity KW - adverse drug reaction KW - ADR severity KW - ADR frequency KW - mathematical model N2 - Background: Adverse drug reactions (ADRs), which are the phenotypic manifestations of clinical drug toxicity in humans, are a major concern in precision clinical medicine. A comprehensive evaluation of ADRs is helpful for unbiased supervision of marketed drugs and for discovering new drugs with high success rates. Objective: In current practice, drug safety evaluation is often oversimplified to the occurrence or nonoccurrence of ADRs. Given the limitations of current qualitative methods, there is an urgent need for a quantitative evaluation model to improve pharmacovigilance and the accurate assessment of drug safety. 
Methods: In this study, we developed a mathematical model, namely the Adverse Drug Reaction Classification System (ADReCS) severity-grading model, for the quantitative characterization of ADR severity, a crucial feature for evaluating the impact of ADRs on human health. The model was constructed by mining millions of real-world historical adverse drug event reports. A new parameter called Severity_score was introduced to measure the severity of ADRs, and upper and lower score boundaries were determined for 5 severity grades. Results: The ADReCS severity-grading model exhibited excellent consistency (99.22%) with the expert-grading system, the Common Terminology Criteria for Adverse Events. Hence, we graded the severity of 6277 standard ADRs for 129,407 drug-ADR pairs. Moreover, we calculated the occurrence rates of 6272 distinct ADRs for 127,763 drug-ADR pairs in large patient populations by mining real-world medication prescriptions. With the quantitative features, we demonstrated example applications in systematically elucidating ADR mechanisms and thereby discovered a list of drugs with improper dosages. Conclusions: In summary, this study represents the first comprehensive determination of both ADR severity grades and ADR frequencies. This endeavor establishes a strong foundation for future artificial intelligence applications in discovering new drugs with high efficacy and low toxicity. It also heralds a paradigm shift in clinical toxicity research, moving from qualitative description to quantitative evaluation. UR - https://www.jmir.org/2024/1/e48572 UR - http://dx.doi.org/10.2196/48572 UR - http://www.ncbi.nlm.nih.gov/pubmed/38700923 ID - info:doi/10.2196/48572 ER - TY - JOUR AU - Saunders, L. 
Catherine PY - 2024/5/1 TI - Using Routine Data to Improve Lesbian, Gay, Bisexual, and Transgender Health JO - Interact J Med Res SP - e53311 VL - 13 KW - lesbian KW - gay KW - bisexual KW - trans KW - LGBTQ+ KW - routine data KW - England KW - United Kingdom KW - health KW - viewpoint KW - sexual orientation KW - health services KW - infrastructure data KW - policy KW - gender KW - health outcome KW - epidemiology KW - risk prediction KW - risk UR - https://www.i-jmr.org/2024/1/e53311 UR - http://dx.doi.org/10.2196/53311 UR - http://www.ncbi.nlm.nih.gov/pubmed/38691398 ID - info:doi/10.2196/53311 ER - TY - JOUR AU - He, Zhe AU - Bhasuran, Balu AU - Jin, Qiao AU - Tian, Shubo AU - Hanna, Karim AU - Shavor, Cindy AU - Arguello, Garcia Lisbeth AU - Murray, Patrick AU - Lu, Zhiyong PY - 2024/4/17 TI - Quality of Answers of Generative Large Language Models Versus Peer Users for Interpreting Laboratory Test Results for Lay Patients: Evaluation Study JO - J Med Internet Res SP - e56655 VL - 26 KW - large language models KW - generative artificial intelligence KW - generative AI KW - ChatGPT KW - laboratory test results KW - patient education KW - natural language processing N2 - Background: Although patients have easy access to their electronic health records and laboratory test result data through patient portals, laboratory test results are often confusing and hard to understand. Many patients turn to web-based forums or question-and-answer (Q&A) sites to seek advice from their peers. The quality of answers from social Q&A sites on health-related questions varies significantly, and not all responses are accurate or reliable. Large language models (LLMs) such as ChatGPT have opened a promising avenue for patients to have their questions answered. 
Objective: We aimed to assess the feasibility of using LLMs to generate relevant, accurate, helpful, and unharmful responses to laboratory test–related questions asked by patients and identify potential issues that can be mitigated using augmentation approaches. Methods: We collected laboratory test result–related Q&A data from Yahoo! Answers and selected 53 Q&A pairs for this study. Using the LangChain framework and ChatGPT web portal, we generated responses to the 53 questions from 5 LLMs: GPT-4, GPT-3.5, LLaMA 2, MedAlpaca, and ORCA_mini. We assessed the similarity of their answers using standard Q&A similarity-based evaluation metrics, including Recall-Oriented Understudy for Gisting Evaluation, Bilingual Evaluation Understudy, Metric for Evaluation of Translation With Explicit Ordering, and Bidirectional Encoder Representations from Transformers Score. We used an LLM-based evaluator to judge whether a target model had higher quality in terms of relevance, correctness, helpfulness, and safety than the baseline model. We performed a manual evaluation with medical experts for all the responses to 7 selected questions on the same 4 aspects. Results: Regarding the similarity of the responses from the other 4 LLMs, with the GPT-4 output used as the reference answer, the responses from GPT-3.5 were the most similar, followed by those from LLaMA 2, ORCA_mini, and MedAlpaca. Human answers from Yahoo data were scored the lowest and, thus, as the least similar to GPT-4–generated answers. The results of the win rate and medical expert evaluation both showed that GPT-4's responses achieved better scores than all the other LLM responses and human responses on all 4 aspects (relevance, correctness, helpfulness, and safety). LLM responses occasionally also suffered from lack of interpretation in one's medical context, incorrect statements, and lack of references. Conclusions: By evaluating LLMs in generating responses to patients'
laboratory test result–related questions, we found that, compared to the other 4 LLMs and human answers from a Q&A website, GPT-4's responses were more accurate, helpful, relevant, and safer. There were cases in which GPT-4 responses were inaccurate and not individualized. We identified a number of ways to improve the quality of LLM responses, including prompt engineering, prompt augmentation, retrieval-augmented generation, and response evaluation. UR - https://www.jmir.org/2024/1/e56655 UR - http://dx.doi.org/10.2196/56655 UR - http://www.ncbi.nlm.nih.gov/pubmed/38630520 ID - info:doi/10.2196/56655 ER - TY - JOUR AU - Caruso, Rosario AU - Di Muzio, Marco AU - Di Simone, Emanuele AU - Dionisi, Sara AU - Magon, Arianna AU - Conte, Gianluca AU - Stievano, Alessandro AU - Girani, Emanuele AU - Boveri, Sara AU - Menicanti, Lorenzo AU - Dolansky, A. Mary PY - 2024/4/17 TI - Global Trends of Medical Misadventures Using International Classification of Diseases, Tenth Revision Cluster Y62-Y69 Comparing Pre–, Intra–, and Post–COVID-19 Pandemic Phases: Protocol for a Retrospective Analysis Using the TriNetX Platform JO - JMIR Res Protoc SP - e54838 VL - 13 KW - COVID-19 KW - curve-fitting analyses KW - health care quality KW - health care safety KW - International Classification of Diseases, Tenth Revision KW - ICD-10 KW - incidence rates KW - safety KW - TriNetX N2 - Background: The COVID-19 pandemic has sharpened the focus on health care safety and quality, underscoring the importance of using standardized metrics such as the International Classification of Diseases, Tenth Revision (ICD-10). In this regard, the ICD-10 cluster Y62-Y69 serves as a proxy assessment of safety and quality in health care systems, allowing researchers to evaluate medical misadventures. Thus far, extensive research and reports support the need for more attention to safety and quality in health care. 
The study aims to leverage the pandemic's unique challenges to explore health care safety and quality trends during prepandemic, intrapandemic, and postpandemic phases, using the ICD-10 cluster Y62-Y69 as a key tool for their evaluation. Objective: This research aims to perform a comprehensive retrospective analysis of incidence rates associated with ICD-10 cluster Y62-Y69, capturing both linear and nonlinear trends across prepandemic, intrapandemic, and postpandemic phases over an 8-year span. Therefore, it seeks to understand how these trends inform health care safety and quality improvements, policy, and future research. Methods: This study uses the extensive data available through the TriNetX platform, using an observational, retrospective design and applying curve-fitting analyses and quadratic models to comprehend the relationships between incidence rates over an 8-year span (from 2015 to 2023). These techniques will enable the identification of nuanced trends in the data, facilitating a deeper understanding of the impacts of the COVID-19 pandemic on medical misadventures. The anticipated results aim to outline complex patterns in health care safety and quality during the COVID-19 pandemic, using global real-world data for robust and generalizable conclusions. This study will explore significant shifts in health care practices and outcomes, with a special focus on geographical variations and key clinical conditions in cardiovascular and oncological care, ensuring a comprehensive analysis of the pandemic's impact across different regions and medical fields. Results: This study is currently in the data collection phase, with funding secured in November 2023 through the Ricerca Corrente scheme of the Italian Ministry of Health. Data collection via the TriNetX platform is anticipated to be completed in May 2024, covering an 8-year period from January 2015 to December 2023. 
This dataset spans pre-pandemic, intra-pandemic, and early post-pandemic phases, enabling a comprehensive analysis of trends in medical misadventures using the ICD-10 cluster Y62-Y69. The final analytics are anticipated to be completed by June 2024. The study's findings aim to provide actionable insights for enhancing healthcare safety and quality, reflecting on the pandemic's transformative impact on global healthcare systems. Conclusions: This study is anticipated to contribute significantly to health care safety and quality literature. It will provide actionable insights for health care professionals, policy makers, and researchers. It will highlight critical areas for intervention and funding to enhance health care safety and quality globally by examining the incidence rates of medical misadventures before, during, and after the pandemic. In addition, the use of global real-world data enhances the study's strength by providing a practical view of health care safety and quality, paving the way for initiatives that are informed by data and tailored to specific contexts worldwide. This approach ensures the findings are applicable and actionable across different health care settings, contributing significantly to the global understanding and improvement of health care safety and quality. 
International Registered Report Identifier (IRRID): PRR1-10.2196/54838 UR - https://www.researchprotocols.org/2024/1/e54838 UR - http://dx.doi.org/10.2196/54838 UR - http://www.ncbi.nlm.nih.gov/pubmed/38630516 ID - info:doi/10.2196/54838 ER - TY - JOUR AU - Oreskovic, Jessica AU - Kaufman, Jaycee AU - Fossat, Yan PY - 2024/4/15 TI - Impact of Audio Data Compression on Feature Extraction for Vocal Biomarker Detection: Validation Study JO - JMIR Biomed Eng SP - e56246 VL - 9 KW - vocal biomarker KW - biomarker KW - biomarkers KW - sound KW - sounds KW - audio KW - compression KW - voice KW - acoustic KW - acoustics KW - audio compression KW - feature extraction KW - Python KW - speech KW - detect KW - detection KW - algorithm KW - algorithms N2 - Background: Vocal biomarkers, derived from acoustic analysis of vocal characteristics, offer noninvasive avenues for medical screening, diagnostics, and monitoring. Previous research demonstrated the feasibility of predicting type 2 diabetes mellitus through acoustic analysis of smartphone-recorded speech. Building upon this work, this study explores the impact of audio data compression on acoustic vocal biomarker development, which is critical for broader applicability in health care. Objective: The objective of this research is to analyze how common audio compression algorithms (MP3, M4A, and WMA) applied by 3 different conversion tools at 2 bitrates affect features crucial for vocal biomarker detection. Methods: The impact of audio data compression on acoustic vocal biomarker development was investigated using uncompressed voice samples converted into MP3, M4A, and WMA formats at 2 bitrates (320 and 128 kbps) with MediaHuman (MH) Audio Converter, WonderShare (WS) UniConverter, and Fast Forward Moving Picture Experts Group (FFmpeg). The data set comprised recordings from 505 participants, totaling 17,298 audio files, collected using a smartphone. 
Participants recorded a fixed English sentence up to 6 times daily for up to 14 days. Feature extraction, including pitch, jitter, intensity, and Mel-frequency cepstral coefficients (MFCCs), was conducted using Python and Parselmouth. The Wilcoxon signed rank test and the Bonferroni correction for multiple comparisons were used for statistical analysis. Results: In this study, 36,970 audio files were initially recorded from 505 participants, with 17,298 recordings meeting the fixed sentence criteria after screening. Differences between the audio conversion software, MH, WS, and FFmpeg, were notable, impacting compression outcomes such as constant or variable bitrates. Analysis encompassed diverse data compression formats and a wide array of voice features and MFCCs. Wilcoxon signed rank tests yielded P values, with those below the Bonferroni-corrected significance level indicating significant alterations due to compression. The results indicated feature-specific impacts of compression across formats and bitrates. MH-converted files exhibited greater resilience compared to WS-converted files. Bitrate also influenced feature stability, with 38 cases affected uniquely by a single bitrate. Notably, voice features showed greater stability than MFCCs across conversion methods. Conclusions: Compression effects were found to be feature specific, with MH and FFmpeg showing greater resilience. Some features were consistently affected, emphasizing the importance of understanding feature resilience for diagnostic applications. Considering the implementation of vocal biomarkers in health care, finding features that remain consistent through compression for data storage or transmission purposes is valuable. Focused on specific features and formats, future research could broaden the scope to include diverse features, real-time compression algorithms, and various recording methods. 
This study enhances our understanding of audio compression's influence on voice features and MFCCs, providing insights for developing applications across fields. The research underscores the significance of feature stability in working with compressed audio data, laying a foundation for informed voice data use in evolving technological landscapes. UR - https://biomedeng.jmir.org/2024/1/e56246 UR - http://dx.doi.org/10.2196/56246 UR - http://www.ncbi.nlm.nih.gov/pubmed/38875677 ID - info:doi/10.2196/56246 ER - TY - JOUR AU - Hirosawa, Takanobu AU - Harada, Yukinori AU - Tokumasu, Kazuki AU - Ito, Takahiro AU - Suzuki, Tomoharu AU - Shimizu, Taro PY - 2024/4/9 TI - Evaluating ChatGPT-4's Diagnostic Accuracy: Impact of Visual Data Integration JO - JMIR Med Inform SP - e55627 VL - 12 KW - artificial intelligence KW - large language model KW - LLM KW - LLMs KW - language model KW - language models KW - ChatGPT KW - GPT KW - ChatGPT-4V KW - ChatGPT-4 Vision KW - clinical decision support KW - natural language processing KW - decision support KW - NLP KW - diagnostic excellence KW - diagnosis KW - diagnoses KW - diagnose KW - diagnostic KW - diagnostics KW - image KW - images KW - imaging N2 - Background: In the evolving field of health care, multimodal generative artificial intelligence (AI) systems, such as ChatGPT-4 with vision (ChatGPT-4V), represent a significant advancement, as they integrate visual data with text data. This integration has the potential to revolutionize clinical diagnostics by offering more comprehensive analysis capabilities. However, the impact on diagnostic accuracy of using image data to augment ChatGPT-4 remains unclear. Objective: This study aims to assess the impact of adding image data on ChatGPT-4's diagnostic accuracy and provide insights into how image data integration can enhance the accuracy of multimodal AI in medical diagnostics. 
Specifically, this study endeavored to compare the diagnostic accuracy between ChatGPT-4V, which processed both text and image data, and its counterpart, ChatGPT-4, which only uses text data. Methods: We identified a total of 557 case reports published in the American Journal of Case Reports from January 2022 to March 2023. After excluding cases that were nondiagnostic, pediatric, and lacking image data, we included 363 case descriptions with their final diagnoses and associated images. We compared the diagnostic accuracy of ChatGPT-4V and ChatGPT-4 without vision based on their ability to include the final diagnoses within differential diagnosis lists. Two independent physicians evaluated their accuracy, with a third resolving any discrepancies, ensuring a rigorous and objective analysis. Results: The integration of image data into ChatGPT-4V did not significantly enhance diagnostic accuracy, showing that final diagnoses were included in the top 10 differential diagnosis lists at a rate of 85.1% (n=309), comparable to the rate of 87.9% (n=319) for the text-only version (P=.33). Notably, ChatGPT-4V's performance in correctly identifying the top diagnosis was inferior, at 44.4% (n=161), compared with 55.9% (n=203) for the text-only version (P=.002, χ² test). Additionally, ChatGPT-4's self-reports showed that image data accounted for 30% of the weight in developing the differential diagnosis lists in more than half of cases. Conclusions: Our findings reveal that currently, ChatGPT-4V predominantly relies on textual data, limiting its ability to fully use the diagnostic potential of visual information. This study underscores the need for further development of multimodal generative AI systems to effectively integrate and use clinical image data. Enhancing the diagnostic performance of such AI systems through improved multimodal data integration could significantly benefit patient care by providing more accurate and comprehensive diagnostic insights. 
Future research should focus on overcoming these limitations, paving the way for the practical application of advanced AI in medicine. UR - https://medinform.jmir.org/2024/1/e55627 UR - http://dx.doi.org/10.2196/55627 UR - http://www.ncbi.nlm.nih.gov/pubmed/38592758 ID - info:doi/10.2196/55627 ER - TY - JOUR AU - Sivarajkumar, Sonish AU - Gao, Fengyi AU - Denny, Parker AU - Aldhahwani, Bayan AU - Visweswaran, Shyam AU - Bove, Allyn AU - Wang, Yanshan PY - 2024/4/3 TI - Mining Clinical Notes for Physical Rehabilitation Exercise Information: Natural Language Processing Algorithm Development and Validation Study JO - JMIR Med Inform SP - e52289 VL - 12 KW - natural language processing KW - electronic health records KW - rehabilitation KW - physical exercise KW - ChatGPT KW - artificial intelligence KW - stroke KW - physical rehabilitation KW - rehabilitation therapy KW - exercise KW - machine learning N2 - Background: The rehabilitation of a patient who had a stroke requires precise, personalized treatment plans. Natural language processing (NLP) offers the potential to extract valuable exercise information from clinical notes, aiding in the development of more effective rehabilitation strategies. Objective: This study aims to develop and evaluate a variety of NLP algorithms to extract and categorize physical rehabilitation exercise information from the clinical notes of patients who had a stroke treated at the University of Pittsburgh Medical Center. Methods: A cohort of 13,605 patients diagnosed with stroke was identified, and their clinical notes containing rehabilitation therapy notes were retrieved. A comprehensive clinical ontology was created to represent various aspects of physical rehabilitation exercises. 
State-of-the-art NLP algorithms were then developed and compared, including rule-based, machine learning–based algorithms (support vector machine, logistic regression, gradient boosting, and AdaBoost) and large language model (LLM)–based algorithms (ChatGPT [OpenAI]). The study focused on key performance metrics, particularly F1-scores, to evaluate algorithm effectiveness. Results: The analysis was conducted on a data set comprising 23,724 notes with detailed demographic and clinical characteristics. The rule-based NLP algorithm demonstrated superior performance in most areas, particularly in detecting the "Right Side" location with an F1-score of 0.975, outperforming gradient boosting by 0.063. Gradient boosting excelled in "Lower Extremity" location detection (F1-score: 0.978), surpassing rule-based NLP by 0.023. It also showed notable performance in the "Passive Range of Motion" detection with an F1-score of 0.970, a 0.032 improvement over rule-based NLP. The rule-based algorithm efficiently handled "Duration," "Sets," and "Reps" with F1-scores up to 0.65. LLM-based NLP, particularly ChatGPT with few-shot prompts, achieved high recall but generally lower precision and F1-scores. However, it notably excelled in "Backward Plane" motion detection, achieving an F1-score of 0.846, surpassing the rule-based algorithm's 0.720. Conclusions: The study successfully developed and evaluated multiple NLP algorithms, revealing the strengths and weaknesses of each in extracting physical rehabilitation exercise information from clinical notes. The detailed ontology and the robust performance of the rule-based and gradient boosting algorithms demonstrate significant potential for enhancing precision rehabilitation. These findings contribute to the ongoing efforts to integrate advanced NLP techniques into health care, moving toward predictive models that can recommend personalized rehabilitation treatments for optimal patient outcomes. 
UR - https://medinform.jmir.org/2024/1/e52289 UR - http://dx.doi.org/10.2196/52289 UR - http://www.ncbi.nlm.nih.gov/pubmed/38568736 ID - info:doi/10.2196/52289 ER - TY - JOUR AU - Maleki, Negar AU - Padmanabhan, Balaji AU - Dutta, Kaushik PY - 2024/3/29 TI - Usability of Health Care Price Transparency Data in the United States: Mixed Methods Study JO - J Med Internet Res SP - e50629 VL - 26 KW - price transparency KW - user experiments KW - schema analysis KW - health care KW - patients KW - algorithms N2 - Background: Increasing health care expenditure in the United States has put policy makers under enormous pressure to find ways to curtail costs. Starting January 1, 2021, hospitals operating in the United States were mandated to publish transparent, accessible pricing information online about the items and services in a consumer-friendly format within comprehensive machine-readable files on their websites. Objective: The aims of this study are to analyze the available files on hospitals' websites, answering the question "Is price transparency (PT) information as provided usable for patients or for machines?" and to provide a solution. Methods: We analyzed 39 main hospitals in Florida that have published machine-readable files on their website, including commercial carriers. We created an Excel (Microsoft) file that included those 39 hospitals along with the 4 most popular services (Current Procedural Terminology [CPT] 45380, 29827, and 70553 and Diagnosis-Related Group [DRG] 807) for the 4 most popular commercial carriers (Health Maintenance Organization [HMO] or Preferred Provider Organization [PPO] plans): Aetna, Florida Blue, Cigna, and UnitedHealthcare. We conducted an A/B test using 67 MTurkers (randomly selected from US residents), investigating the level of awareness about PT legislation and the usability of available files. 
We also suggested format standardization, such as master field names using schema integration, to make machine-readable files consistent and usable for machines. Results: The poor usability and inconsistent formats of the current PT information yielded no evidence of its usefulness for patients or its quality for machines. This indicates that the information does not meet the requirements for being consumer-friendly or machine readable as mandated by legislation. Based on the responses to the first part of the experiment (PT awareness), it was evident that participants need to be made aware of the PT legislation. However, they believe it is important to know the service price before receiving it. Based on the responses to the second part of the experiment (human usability of PT information), the average number of correct responses was not equal between the 2 groups, that is, the treatment group (mean 1.23, SD 1.30) found more correct answers than the control group (mean 2.76, SD 0.58; t65=6.46; P<.001; d=1.52). Conclusions: Consistent machine-readable files across all health systems facilitate the development of tools for estimating customer out-of-pocket costs, aligning with the PT rule's main objective: providing patients with valuable information and reducing health care expenditures. UR - https://www.jmir.org/2024/1/e50629 UR - http://dx.doi.org/10.2196/50629 UR - http://www.ncbi.nlm.nih.gov/pubmed/38442238 ID - info:doi/10.2196/50629 ER - TY - JOUR AU - Snyder, Morgan AU - Elkins, R. 
Gary PY - 2024/3/14 TI - Characteristics of Users of a Digital Hypnotherapy Intervention for Hot Flashes: Retrospective Study JO - JMIR Form Res SP - e53555 VL - 8 KW - hypnotherapy KW - hot flashes KW - smartphone app KW - mHealth KW - mobile health KW - app KW - apps KW - applications KW - hypnosis KW - menopause KW - menopausal KW - gynecology KW - usage KW - women's health KW - user KW - users KW - demographics KW - demographic KW - characteristic KW - characteristics KW - mental health KW - alternative KW - complementary KW - mind body KW - hypnotism N2 - Background: Hot flashes are associated with a lower quality of life and sleep disturbances. Given the many consequences of hot flashes, it is important to find treatments to reduce them. Hypnotherapy, the use of hypnosis for a medical disorder or concern, has been shown in clinical trials to be effective in reducing hot flashes, but it is not routinely used in clinical practice. One solution to close this implementation gap is to administer hypnotherapy for hot flashes via a smartphone app. Evia is a smartphone app that delivers hypnotherapy for hot flashes. Evia has made hypnotherapy more widely accessible for women who are experiencing hot flashes; however, the app has yet to undergo empirical testing. Additionally, research on user characteristics is lacking. Objective: This study aims to (1) determine the average age, stage of menopause, and length of menopause symptoms for users of the Evia app; (2) determine the characteristics of hot flashes and night sweats for users of the Evia app; (3) determine the self-reported sleep quality of users of the Evia app; (4) determine the self-reported mental health of users of the Evia app; and (5) determine the relationship between hot flash frequency and anxiety and depression for users of the Evia app. Methods: This study analyzed data collected from participants who have downloaded the Evia app. 
Data were collected at 1 time point from a self-report questionnaire that assessed the demographic and clinical characteristics of users. The questionnaire was given to users when they downloaded the Evia app. Users of the Evia app fill out a questionnaire upon enrolling in the program and prior to beginning the intervention. This included 9764 users. Results: Results showed that the mean age of users was 49.31 years. A total of 41.6% (1942/4665) of users reported experiencing 5 or more hot flashes per day, while 51.2% (1473/2877) of users reported having difficulty falling asleep each night and 47.7% (1253/2626) of users reported their sleep quality to be terrible. In addition, 38.4% (1104/2877) of users reported that they often feel anxious or depressed. There was a small, significant, and negative correlation between hot flash frequency and self-report frequency of anxiety and depression (r=−0.09). Conclusions: This study showed that the average age of app users is in line with the median age of natural menopause. A large percentage of users reported experiencing 5 or more hot flashes per day, reported difficulties with sleep, and reported experiencing depression and anxiety. These findings are in line with previous studies that assessed hot flash frequency and the consequences of hot flashes. This was the first study to report on the characteristics of users of the Evia app. Results will be used to optimize the hypnotherapy program delivered via the Evia app. 
UR - https://formative.jmir.org/2024/1/e53555 UR - http://dx.doi.org/10.2196/53555 UR - http://www.ncbi.nlm.nih.gov/pubmed/38483465 ID - info:doi/10.2196/53555 ER - TY - JOUR AU - Declerck, Jens AU - Kalra, Dipak AU - Vander Stichele, Robert AU - Coorevits, Pascal PY - 2024/3/6 TI - Frameworks, Dimensions, Definitions of Aspects, and Assessment Methods for the Appraisal of Quality of Health Data for Secondary Use: Comprehensive Overview of Reviews JO - JMIR Med Inform SP - e51560 VL - 12 KW - data quality KW - data quality dimensions KW - data quality assessment KW - secondary use KW - data quality framework KW - fit for purpose N2 - Background: Health care has not reached the full potential of the secondary use of health data because of, among other issues, concerns about the quality of the data being used. The shift toward digital health has led to an increase in the volume of health data. However, this increase in quantity has not been matched by a proportional improvement in the quality of health data. Objective: This review aims to offer a comprehensive overview of the existing frameworks for data quality dimensions and assessment methods for the secondary use of health data. In addition, it aims to consolidate the results into a unified framework. Methods: A review of reviews was conducted including reviews describing frameworks of data quality dimensions and their assessment methods, specifically from a secondary use perspective. Reviews were excluded if they were not related to the health care ecosystem, lacked relevant information related to our research objective, and were published in languages other than English. Results: A total of 22 reviews were included, comprising 22 frameworks, with 23 different terms for dimensions, and 62 definitions of dimensions. All dimensions were mapped toward the data quality framework of the European Institute for Innovation through Health Data. 
In total, 8 reviews mentioned 38 different assessment methods, pertaining to 31 definitions of the dimensions. Conclusions: The findings in this review revealed a lack of consensus in the literature regarding the terminology, definitions, and assessment methods for data quality dimensions. This creates ambiguity and difficulties in developing specific assessment methods. This study goes a step further by assigning all observed definitions to a consolidated framework of 9 data quality dimensions. UR - https://medinform.jmir.org/2024/1/e51560 UR - http://dx.doi.org/10.2196/51560 UR - http://www.ncbi.nlm.nih.gov/pubmed/38446534 ID - info:doi/10.2196/51560 ER - TY - JOUR AU - Ru, Boshu AU - Sillah, Arthur AU - Desai, Kaushal AU - Chandwani, Sheenu AU - Yao, Lixia AU - Kothari, Smita PY - 2024/3/6 TI - Real-World Data Quality Framework for Oncology Time to Treatment Discontinuation Use Case: Implementation and Evaluation Study JO - JMIR Med Inform SP - e47744 VL - 12 KW - data quality assessment KW - real-world data KW - real-world time to treatment discontinuation KW - systemic anticancer therapy KW - Use Case Specific Relevance and Quality Assessment KW - UReQA framework N2 - Background: The importance of real-world evidence is widely recognized in observational oncology studies. However, the lack of interoperable data quality standards in the fragmented health information technology landscape represents an important challenge. Therefore, adopting validated systematic methods for evaluating data quality is important for oncology outcomes research leveraging real-world data (RWD). Objective: This study aims to implement real-world time to treatment discontinuation (rwTTD) for a systemic anticancer therapy (SACT) as a new use case for the Use Case Specific Relevance and Quality Assessment, a framework linking data quality and relevance in fit-for-purpose RWD assessment. 
Methods: To define the rwTTD use case, we mapped the operational definition of rwTTD to RWD elements commonly available from oncology electronic health record–derived data sets. We identified 20 tasks to check the completeness and plausibility of data elements concerning SACT use, line of therapy (LOT), death date, and length of follow-up. Using descriptive statistics, we illustrated how to implement the Use Case Specific Relevance and Quality Assessment on 2 oncology databases (Data sets A and B) to estimate the rwTTD of an SACT drug (target SACT) for patients with advanced head and neck cancer diagnosed on or after January 1, 2015. Results: A total of 1200 (24.96%) of 4808 patients in Data set A and 237 (5.92%) of 4003 patients in Data set B received the target SACT, suggesting better relevance of the former in estimating the rwTTD of the target SACT. The 2 data sets differed with regard to the terminology used for SACT drugs, LOT format, and target SACT LOT distribution over time. Data set B appeared to have less complete SACT records, longer lags in incorporating the latest data, and incomplete mortality data, suggesting a lack of fitness for estimating rwTTD. Conclusions: The fit-for-purpose data quality assessment demonstrated substantial variability in the quality of the 2 real-world data sets. The data quality specifications applied for rwTTD estimation can be expanded to support a broad spectrum of oncology use cases. 
UR - https://medinform.jmir.org/2024/1/e47744 UR - http://dx.doi.org/10.2196/47744 UR - http://www.ncbi.nlm.nih.gov/pubmed/38446504 ID - info:doi/10.2196/47744 ER - TY - JOUR AU - Peng, Yuan AU - Bathelt, Franziska AU - Gebler, Richard AU - Gött, Robert AU - Heidenreich, Andreas AU - Henke, Elisa AU - Kadioglu, Dennis AU - Lorenz, Stephan AU - Vengadeswaran, Abishaa AU - Sedlmayr, Martin PY - 2024/2/14 TI - Use of Metadata-Driven Approaches for Data Harmonization in the Medical Domain: Scoping Review JO - JMIR Med Inform SP - e52967 VL - 12 KW - ETL KW - ELT KW - Extract-Load-Transform KW - Extract-Transform-Load KW - interoperability KW - metadata-driven KW - medical domain KW - data harmonization N2 - Background: Multisite clinical studies are increasingly using real-world data to gain real-world evidence. However, due to the heterogeneity of source data, it is difficult to analyze such data in a unified way across clinics. Therefore, the implementation of Extract-Transform-Load (ETL) or Extract-Load-Transform (ELT) processes for harmonizing local health data is necessary, in order to guarantee the data quality for research. However, the development of such processes is time-consuming and unsustainable. A promising way to ease this is the generalization of ETL/ELT processes. Objective: In this work, we investigate existing possibilities for the development of generic ETL/ELT processes. Particularly, we focus on approaches with low development complexity by using descriptive metadata and structural metadata. Methods: We conducted a literature review following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We used 4 publication databases (ie, PubMed, IEEE Xplore, Web of Science, and Biomed Center) to search for relevant publications from 2012 to 2022. The PRISMA flow was then visualized using an R-based tool (Evidence Synthesis Hackathon). 
All relevant contents of the publications were extracted into a spreadsheet for further analysis and visualization. Results: Regarding the PRISMA guidelines, we included 33 publications in this literature review. All included publications were categorized into 7 different focus groups (ie, medicine, data warehouse, big data, industry, geoinformatics, archaeology, and military). Based on the extracted data, ontology-based and rule-based approaches were the 2 most used approaches in different thematic categories. Different approaches and tools were chosen to achieve different purposes within the use cases. Conclusions: Our literature review shows that using metadata-driven (MDD) approaches to develop an ETL/ELT process can serve different purposes in different thematic categories. The results show that it is promising to implement an ETL/ELT process by applying an MDD approach to automate the data transformation from Fast Healthcare Interoperability Resources to Observational Medical Outcomes Partnership Common Data Model. However, determining an appropriate MDD approach and tool to implement such an ETL/ELT process remains a challenge. This is due to the lack of comprehensive insight into the characterizations of the MDD approaches presented in this study. Therefore, our next step is to evaluate the MDD approaches presented in this study and to determine the most appropriate MDD approaches and the way to integrate them into the ETL/ELT process. This could verify the ability of using MDD approaches to generalize the ETL process for harmonizing medical data. 
UR - https://medinform.jmir.org/2024/1/e52967 UR - http://dx.doi.org/10.2196/52967 UR - http://www.ncbi.nlm.nih.gov/pubmed/38354027 ID - info:doi/10.2196/52967 ER - TY - JOUR AU - Weik, Lisa AU - Fehring, Leonard AU - Mortsiefer, Achim AU - Meister, Sven PY - 2024/1/22 TI - Big 5 Personality Traits and Individual- and Practice-Related Characteristics as Influencing Factors of Digital Maturity in General Practices: Quantitative Web-Based Survey Study JO - J Med Internet Res SP - e52085 VL - 26 KW - digital health KW - eHealth KW - digital maturity KW - maturity assessment KW - general practitioners KW - primary care physicians KW - primary care KW - family medicine KW - personality KW - digital affinity KW - digital health adoption N2 - Background: Various studies propose the significance of digital maturity in ensuring effective patient care and enabling improved health outcomes, a successful digital transformation, and optimized service delivery. Although previous research has centered around inpatient health care settings, research on digital maturity in general practices is still in its infancy. Objective: As general practitioners (GPs) are the first point of contact for most patients, we aimed to shed light on the pivotal role of GPs' inherent characteristics, especially their personality, in the digital maturity of general practices. Methods: In the first step, we applied a sequential mixed methods approach involving a literature review and expert interviews with GPs to construct the digital maturity scale used in this study. Next, we designed a web-based survey to assess digital maturity on a 5-point Likert-type scale and analyze the relationship with relevant inherent characteristics using ANOVAs and regression analysis. Results: Our web-based survey with 219 GPs revealed that digital maturity was overall moderate (mean 3.31, SD 0.64) and substantially associated with several characteristics inherent to the GP. 
We found differences in overall digital maturity based on GPs' gender, the expected future use of digital health solutions, the perceived digital affinity of medical assistants, GPs' level of digital affinity, and GPs' level of extraversion and neuroticism. In a regression model, a higher expected future use, a higher perceived digital affinity of medical assistants, a higher digital affinity of GPs, and lower neuroticism were substantial predictors of overall digital maturity. Conclusions: Our study highlights the impact of GPs' inherent characteristics, especially their personality, on the digital maturity of general practices. By identifying these inherent influencing factors, our findings support targeted approaches to drive digital maturity in general practice settings. UR - https://www.jmir.org/2024/1/e52085 UR - http://dx.doi.org/10.2196/52085 UR - http://www.ncbi.nlm.nih.gov/pubmed/38252468 ID - info:doi/10.2196/52085 ER - TY - JOUR AU - Ulgu, Mahir Mustafa AU - Laleci Erturkmen, Banu Gokce AU - Yuksel, Mustafa AU - Namli, Tuncay AU - Postacı, Şenan AU - Gencturk, Mert AU - Kabak, Yildiray AU - Sinaci, Anil A. AU - Gonul, Suat AU - Dogac, Asuman AU - Özkan Altunay, Zübeyde AU - Ekinci, Banu AU - Aydin, Sahin AU - Birinci, Suayip PY - 2024/1/19 TI - A Nationwide Chronic Disease Management Solution via Clinical Decision Support Services: Software Development and Real-Life Implementation Report JO - JMIR Med Inform SP - e49986 VL - 12 KW - chronic disease management KW - clinical decision support services KW - integrated care KW - interoperability KW - evidence-based medicine KW - medicine KW - disease management KW - management KW - implementation KW - decision support KW - clinical decision KW - support KW - chronic disease KW - physician-centered KW - risk assessment KW - tracking KW - diagnosis N2 - Background: The increasing population of older adults has led to a rise in the demand for health care services, with chronic diseases being a major burden. 
Person-centered integrated care is required to address these challenges; hence, the Turkish Ministry of Health has initiated strategies to implement an integrated health care model for chronic disease management. We aim to present the design, development, nationwide implementation, and initial performance results of the national Disease Management Platform (DMP). Objective: This paper's objective is to present the design decisions taken and technical solutions provided to ensure successful nationwide implementation by addressing several challenges, including interoperability with existing IT systems, integration with clinical workflow, enabling transition of care, ease of use by health care professionals, scalability, high performance, and adaptability. Methods: The DMP is implemented as an integrated care solution that heavily uses clinical decision support services to coordinate effective screening and management of chronic diseases in adherence to evidence-based clinical guidelines and, hence, to increase the quality of health care delivery. The DMP is designed and implemented to be easily integrated with the existing regional and national health IT systems via conformance to international health IT standards, such as Health Level Seven Fast Healthcare Interoperability Resources. A repeatable cocreation strategy has been used to design and develop new disease modules to ensure extensibility while ensuring ease of use and seamless integration into the regular clinical workflow during patient encounters. The DMP is horizontally scalable in case of high load to ensure high performance. Results: As of September 2023, the DMP has been used by 25,568 health professionals to perform 73,715,269 encounters for 16,058,904 unique citizens. 
It has been used to screen and monitor chronic diseases such as obesity, cardiovascular risk, diabetes, and hypertension, resulting in the diagnosis of 3,545,573 patients with obesity, 534,423 patients with high cardiovascular risk, 490,346 patients with diabetes, and 144,768 patients with hypertension. Conclusions: It has been demonstrated that the platform can scale horizontally and efficiently provides services to thousands of family medicine practitioners without performance problems. The system seamlessly interoperates with existing health IT solutions and runs as a part of the clinical workflow of physicians at the point of care. By automatically accessing and processing patient data from various sources to provide personalized care plan guidance, it maximizes the effect of evidence-based decision support services by seamless integration with point-of-care electronic health record systems. As the system is built on international code systems and standards, adaptation and deployment to additional regional and national settings become easily possible. The nationwide DMP as an integrated care solution has been operational since January 2020, coordinating effective screening and management of chronic diseases in adherence to evidence-based clinical guidelines. 
UR - https://medinform.jmir.org/2024/1/e49986 UR - http://dx.doi.org/10.2196/49986 UR - http://www.ncbi.nlm.nih.gov/pubmed/38241077 ID - info:doi/10.2196/49986 ER - TY - JOUR AU - Mehra, Tarun AU - Wekhof, Tobias AU - Keller, Iris Dagmar PY - 2024/1/17 TI - Additional Value From Free-Text Diagnoses in Electronic Health Records: Hybrid Dictionary and Machine Learning Classification Study JO - JMIR Med Inform SP - e49007 VL - 12 KW - electronic health records KW - free text KW - natural language processing KW - NLP KW - artificial intelligence KW - AI N2 - Background: Physicians are hesitant to forgo the opportunity of entering unstructured clinical notes for structured data entry in electronic health records. Does free text increase informational value in comparison with structured data? Objective: This study aims to compare information from unstructured text-based chief complaints harvested and processed by a natural language processing (NLP) algorithm with clinician-entered structured diagnoses in terms of their potential utility for automated improvement of patient workflows. Methods: Electronic health records of 293,298 patient visits at the emergency department of a Swiss university hospital from January 2014 to October 2021 were analyzed. Using emergency department overcrowding as a case in point, we compared supervised NLP-based keyword dictionaries of symptom clusters from unstructured clinical notes and clinician-entered chief complaints from a structured drop-down menu with the following 2 outcomes: hospitalization and high Emergency Severity Index (ESI) score. Results: Of 12 symptom clusters, the NLP cluster was substantial in predicting hospitalization in 11 (92%) clusters; 8 (67%) clusters remained significant even after controlling for the cluster of clinician-determined chief complaints in the model. 
All 12 NLP symptom clusters were significant in predicting a low ESI score, of which 9 (75%) remained significant when controlling for clinician-determined chief complaints. The correlation between NLP clusters and chief complaints was low (r=−0.04 to 0.6), indicating complementarity of information. Conclusions: The NLP-derived features and clinicians' knowledge were complementary in explaining patient outcome heterogeneity. They can provide an efficient approach to patient flow management, for example, in an emergency medicine setting. We further demonstrated the feasibility of creating extensive and precise keyword dictionaries with NLP by medical experts without requiring programming knowledge. Using the dictionary, we could classify short and unstructured clinical texts into diagnostic categories defined by the clinician. UR - https://medinform.jmir.org/2024/1/e49007 UR - http://dx.doi.org/10.2196/49007 UR - http://www.ncbi.nlm.nih.gov/pubmed/38231569 ID - info:doi/10.2196/49007 ER - TY - JOUR AU - Muschol, Jennifer AU - Heinrich, Martin AU - Heiss, Christian AU - Hernandez, Mauricio Alher AU - Knapp, Gero AU - Repp, Holger AU - Schneider, Henning AU - Thormann, Ulrich AU - Uhlar, Johanna AU - Unzeitig, Kai AU - Gissel, Christian PY - 2023/12/25 TI - Digitization of Follow-Up Care in Orthopedic and Trauma Surgery With Video Consultations: Health Economic Evaluation Study From a Health Provider's Perspective JO - J Med Internet Res SP - e46714 VL - 25 KW - digital health KW - economic evaluation KW - health economics KW - orthopedic KW - personnel costs KW - productivity gains KW - telemedicine KW - trauma surgery KW - utility KW - video consultations N2 - Background: Recommendations for health care digitization as issued with the Riyadh Declaration led to an uptake in telemedicine to cope with the COVID-19 pandemic. Evaluations based on clinical data are needed to support stakeholders' decision-making on the long-term implementation of digital health. 
Objective: This health economic evaluation aims to provide the first German analysis of the suitability of video consultations in the follow-up care of patients in orthopedic and trauma surgery, investigate the financial impact on hospital operations and personnel costs, and provide a basis for decisions on digitizing outpatient care. Methods: We conducted a randomized controlled trial that evaluated video consultations versus face-to-face consultations in the follow-up care of patients in orthopedic and trauma surgery at a German university hospital. We recruited 60 patients who had previously been treated conservatively or surgically for various knee or shoulder injuries. A digital health app and a browser-based software were used to conduct video consultations. The suitability of telemedicine was assessed using the Telemedicine Satisfaction Questionnaire and the EQ-5D-5L questionnaire. Economic analyses included average time spent by physician per consultation, associated personnel costs and capacities for additional treatable patients, and the break-even point for video consultation software fees. Results: After 4 withdrawals in each arm, data from a total of 52 patients (telemedicine group: n=26; control group: n=26) were used for our analyses. In the telemedicine group, 77% (20/26) of all patients agreed that telemedicine provided for their health care needs, and 69% (18/26) found telemedicine an acceptable way to receive health care services. In addition, no significant difference was found in the change of patient utility between groups after 3 months (mean 0.02, SD 0.06 vs mean 0.07, SD 0.17; P=.35). Treatment duration was significantly shorter in the intervention group (mean 8.23, SD 4.45 minutes vs mean 10.92, SD 5.58 minutes; P=.02). The use of telemedicine saved 25% (€2.14 [US $2.35]/€8.67 [US $9.53]) in personnel costs and increased the number of treatable patients by 172 annually, assuming 2 hours of video consultations per week. 
Sensitivity analysis for scaling up video consultations to 10% of the hospital's outpatient cases resulted in personnel cost savings of €73,056 (US $80,275.39) for a senior physician. A total of 23 video consultations per month were required to recoup the software fees of telemedicine through reduced personnel costs (break-even point ranging from 12-38 in the sensitivity analysis). Conclusions: Our study supports stakeholders' decision-making on the long-term implementation of digital health by demonstrating that video consultations in the follow-up care of patients in orthopedic and trauma surgery result in cost savings and productivity gains for clinics with no negative impact on patient utility. Trial Registration: German Clinical Trials Register DRKS00023445; https://drks.de/search/en/trial/DRKS00023445 UR - https://www.jmir.org/2023/1/e46714 UR - http://dx.doi.org/10.2196/46714 UR - http://www.ncbi.nlm.nih.gov/pubmed/38145481 ID - info:doi/10.2196/46714 ER - TY - JOUR AU - Xie, Yunhui AU - Wen, Jun AU - Zhu, Hongmei AU - Liu, Yanjun PY - 2023/12/22 TI - The Effects of Reinforcement Techniques in Sleeve Gastrectomy and Roux-en-Y Gastric Bypass: Protocol for a Web-Based Survey, Systematic Review, and Meta-Analysis JO - JMIR Res Protoc SP - e50677 VL - 12 KW - protocol KW - reinforcement technique KW - sleeve gastrectomy KW - Roux-en-Y gastric bypass KW - systematic review KW - meta-analysis KW - bariatric surgery KW - survey KW - effectiveness KW - surgical site KW - blood loss KW - blood KW - gastric bypass KW - gastrectomy N2 - Background: The effects of reinforcement are still controversial in bariatric surgery, and variations may exist in using this technique. Objective: This protocol describes a study that aims to survey the views of bariatric surgeons on reinforcement techniques and evaluate the effects of applying reinforcement techniques in sleeve gastrectomy (SG) and Roux-en-Y gastric bypass (RYGB). Methods: This study is composed of 2 parts. 
Part 1 will investigate differences in the use of reinforcement techniques among surgeons worldwide who perform SG or RYGB through a survey. The survey will be conducted by email and social media. Part 2 will evaluate the safety and effectiveness of using omentopexy or staple line reinforcement in SG and RYGB by systematic review and meta-analysis. In this part, literature searches will be performed in English databases, including CENTRAL, EMBASE, CINAHL, Web of Science, and PubMed, and Chinese databases, including Wanfang, China National Knowledge Infrastructure, Database of Chinese Technical Periodicals, and Chinese Biological Medicine, from their establishment to November 2023. Randomized controlled trials and case-control studies will be included. The primary outcomes are rates of postoperative bleeding and gastric leakage. The secondary outcomes include anastomotic stenosis, surgical site infection, reoperation, estimated intraoperative blood loss, operative time (minutes), length of hospital stay (days), overall complications, and 30-day mortality. The meta-analysis will be conducted using RevMan 5.4 under the random-effects model, as well as through extensive subgroup and sensitivity analyses. P values <0.05 will be considered statistically significant. This study was registered with PROSPERO (Prospective Register of Systematic Reviews) in accordance with the PRISMA-P (Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols). Results: The results of this study will be published in a peer-reviewed journal. The web-based survey and initial title or abstract review of papers identified by the search strategy will be completed in November 2023. The second round of title or abstract review and downloading of the papers for full-text inclusion will be completed in January 2024. We aim to complete data extraction and meta-analysis by February 2024 and expect to publish the findings by the end of March 2024. 
Conclusions: This study aims to investigate the impact of reinforcement techniques on reducing the incidence of postoperative complications in SG and RYGB procedures and provide assistance for standardizing the procedures of SG and RYGB operations for bariatric surgeons. Trial Registration: PROSPERO CRD42022376438; https://tinyurl.com/2d53uf8n International Registered Report Identifier (IRRID): PRR1-10.2196/50677 UR - https://www.researchprotocols.org/2023/1/e50677 UR - http://dx.doi.org/10.2196/50677 UR - http://www.ncbi.nlm.nih.gov/pubmed/38133924 ID - info:doi/10.2196/50677 ER - TY - JOUR AU - Castonguay, Alexandre AU - Lovis, Christian PY - 2023/12/21 TI - Introducing the "AI Language Models in Health Care" Section: Actionable Strategies for Targeted and Wide-Scale Deployment JO - JMIR Med Inform SP - e53785 VL - 11 KW - generative AI KW - health care digitalization KW - AI in health care KW - digital health standards KW - AI implementation KW - artificial intelligence UR - https://medinform.jmir.org/2023/1/e53785 UR - http://dx.doi.org/10.2196/53785 UR - http://www.ncbi.nlm.nih.gov/pubmed/38127431 ID - info:doi/10.2196/53785 ER - TY - JOUR AU - Autio, Reija AU - Virta, Joni AU - Nordhausen, Klaus AU - Fogelholm, Mikael AU - Erkkola, Maijaliisa AU - Nevalainen, Jaakko PY - 2023/12/15 TI - Tensorial Principal Component Analysis in Detecting Temporal Trajectories of Purchase Patterns in Loyalty Card Data: Retrospective Cohort Study JO - J Med Internet Res SP - e44599 VL - 25 KW - tensorial data KW - principal components KW - loyalty card data KW - purchase pattern KW - food expenditure KW - seasonality KW - food KW - diet N2 - Background: Loyalty card data automatically collected by retailers provide an excellent source for evaluating health-related purchase behavior of customers. The data comprise information on every grocery purchase, including expenditures on product groups and the time of purchase for each customer. 
Such data where customers have an expenditure value for every product group for each time can be formulated as 3D tensorial data. Objective: This study aimed to use the modern tensorial principal component analysis (PCA) method to uncover the characteristics of health-related purchase patterns from loyalty card data. Another aim was to identify card holders with distinct purchase patterns. We also considered the interpretation, advantages, and challenges of tensorial PCA compared with standard PCA. Methods: Loyalty card program members from the largest retailer in Finland were invited to participate in this study. Our LoCard data consist of the purchases of 7251 card holders who consented to the use of their data from the year 2016. The purchases were reclassified into 55 product groups and aggregated across 52 weeks. The data were then analyzed using tensorial PCA, allowing us to effectively reduce the time and product group-wise dimensions simultaneously. The augmentation method was used for selecting the suitable number of principal components for the analysis. Results: Using tensorial PCA, we were able to systematically search for typical food purchasing patterns across time and product groups as well as detect different purchasing behaviors across groups of card holders. For example, we identified customers who purchased large amounts of meat products and separated them further into groups based on time profiles, that is, customers whose purchases of meat remained stable, increased, or decreased throughout the year or varied between seasons of the year. Conclusions: Using tensorial PCA, we can effectively examine customers' purchasing behavior in more detail than with traditional methods because it can handle time and product group dimensions simultaneously. When interpreting the results, both time and product dimensions must be considered. 
In further analyses, these time and product groups can be directly associated with additional consumer characteristics such as socioeconomic and demographic predictors of dietary patterns. In addition, they can be linked to external factors that impact grocery purchases such as inflation and unexpected pandemics. This enables us to identify what types of people have specific purchasing patterns, which can help in the development of ways in which consumers can be steered toward making healthier food choices. UR - https://www.jmir.org/2023/1/e44599 UR - http://dx.doi.org/10.2196/44599 UR - http://www.ncbi.nlm.nih.gov/pubmed/38100168 ID - info:doi/10.2196/44599 ER - TY - JOUR AU - Khalid, Mahnoor AU - Sutterfield, Bethany AU - Minley, Kirstien AU - Ottwell, Ryan AU - Abercrombie, McKenna AU - Heath, Christopher AU - Torgerson, Trevor AU - Hartwell, Micah AU - Vassar, Matt PY - 2023/12/7 TI - The Reporting and Methodological Quality of Systematic Reviews Underpinning Clinical Practice Guidelines Focused on the Management of Cutaneous Melanoma: Cross-Sectional Analysis JO - JMIR Dermatol SP - e43821 VL - 6 KW - clinical practice guidelines KW - clinical KW - cutaneous melanoma KW - decision making KW - evidence KW - management KW - melanoma KW - practice guideline KW - review KW - systematic review N2 - Background: Clinical practice guidelines (CPGs) inform evidence-based decision-making in the clinical setting; however, systematic reviews (SRs) that inform these CPGs may vary in terms of reporting and methodological quality, which affects confidence in summary effect estimates. Objective: Our objective was to appraise the methodological and reporting quality of the SRs used in CPGs for cutaneous melanoma and evaluate differences in these outcomes between Cochrane and non-Cochrane reviews. Methods: We conducted a cross-sectional analysis by searching PubMed for cutaneous melanoma guidelines published between January 1, 2015, and May 21, 2021. 
Next, we extracted SRs composing these guidelines and appraised their reporting and methodological rigor using the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) and AMSTAR (A Measurement Tool to Assess Systematic Reviews) checklists. Lastly, we compared these outcomes between Cochrane and non-Cochrane SRs. All screening and data extraction occurred in a masked, duplicate fashion. Results: Of the SRs appraised, the mean completion rate was 66.5% (SD 12.29%) for the PRISMA checklist and 44.5% (SD 21.05%) for AMSTAR. The majority of SRs (19/50, 53%) were of critically low methodological quality, with no SRs being appraised as high quality. There was a statistically significant association (P<.001) between AMSTAR and PRISMA checklists. Cochrane SRs had higher PRISMA mean completion rates and higher methodological quality than non-Cochrane SRs. Conclusions: SRs supporting CPGs focused on the management of cutaneous melanoma vary in reporting and methodological quality, with the majority of SRs being of low quality. Increasing adherence to PRISMA and AMSTAR checklists will likely increase the quality of SRs, thereby increasing the level of evidence supporting cutaneous melanoma CPGs. 
UR - https://derma.jmir.org/2023/1/e43821 UR - http://dx.doi.org/10.2196/43821 UR - http://www.ncbi.nlm.nih.gov/pubmed/38060306 ID - info:doi/10.2196/43821 ER - TY - JOUR AU - Lee, Ra Ah AU - Park, Hojoon AU - Yoo, Aram AU - Kim, Seok AU - Sunwoo, Leonard AU - Yoo, Sooyoung PY - 2023/12/6 TI - Risk Prediction of Emergency Department Visits in Patients With Lung Cancer Using Machine Learning: Retrospective Observational Study JO - JMIR Med Inform SP - e53058 VL - 11 KW - emergency department KW - lung cancer KW - risk prediction KW - machine learning KW - common data model KW - emergency KW - hospitalization KW - hospitalizations KW - lung KW - cancer KW - oncology KW - lungs KW - pulmonary KW - respiratory KW - predict KW - prediction KW - predictions KW - predictive KW - algorithm KW - algorithms KW - risk KW - risks KW - model KW - models N2 - Background: Patients with lung cancer are among the most frequent visitors to emergency departments due to cancer-related problems, and the prognosis for those who seek emergency care is dismal. Given that patients with lung cancer frequently visit health care facilities for treatment or follow-up, the ability to predict emergency department visits based on clinical information gleaned from their routine visits would enhance hospital resource utilization and patient outcomes. Objective: This study proposed a machine learning–based prediction model to identify risk factors for emergency department visits by patients with lung cancer. Methods: This was a retrospective observational study of patients with lung cancer diagnosed at Seoul National University Bundang Hospital, a tertiary general hospital in South Korea, between January 2010 and December 2017. The primary outcome was an emergency department visit within 30 days of an outpatient visit. This study developed a machine learning–based prediction model using a common data model. 
In addition, the importance of features that influenced the decision-making of the model output was analyzed to identify significant clinical factors. Results: The model with the best performance demonstrated an area under the receiver operating characteristic curve of 0.73 in its ability to predict the attendance of patients with lung cancer in emergency departments. The frequency of recent visits to the emergency department and several laboratory test results that are typically collected during cancer treatment follow-up visits were revealed as influencing factors for the model output. Conclusions: This study developed a machine learning–based risk prediction model using a common data model and identified influencing factors for emergency department visits by patients with lung cancer. The predictive model contributes to the efficiency of resource utilization and health care service quality by facilitating the identification and early intervention of high-risk patients. This study demonstrated the possibility of collaborative research among different institutions using the common data model for precision medicine in lung cancer. 
UR - https://medinform.jmir.org/2023/1/e53058 UR - http://dx.doi.org/10.2196/53058 UR - http://www.ncbi.nlm.nih.gov/pubmed/38055320 ID - info:doi/10.2196/53058 ER - TY - JOUR AU - Petrovskaya, Olga AU - Karpman, Albina AU - Schilling, Joanna AU - Singh, Simran AU - Wegren, Larissa AU - Caine, Vera AU - Kusi-Appiah, Elizabeth AU - Geen, Willow PY - 2023/10/19 TI - Patient and Health Care Provider Perspectives on Patient Access to Test Results via Web Portals: Scoping Review JO - J Med Internet Res SP - e43765 VL - 25 KW - patient portal KW - web portal KW - MyChart KW - electronic health records KW - personal health records KW - patient access to records KW - laboratory tests KW - radiology reports KW - diagnostic imaging KW - laboratory test results KW - result release KW - embargo KW - the Cures Act N2 - Background: A frequently used feature of electronic patient portals is the viewing of test results. Research on patient portals is abundant and offers evidence to help portal implementers make policy and practice decisions. In contrast, no comparable comprehensive summary of research addresses the direct release of and patient access to test results. Objective: This scoping review aims to analyze and synthesize published research focused on patient and health care provider perspectives on the direct release of laboratory, imaging, and radiology results to patients via web portals. Methods: PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines were followed. Searches were conducted in CINAHL, MEDLINE, and other databases. Citations were screened in Covidence using the inclusion and exclusion criteria. Primary studies that focused on patient and health care provider perspectives on patient access to laboratory and imaging results via web portals were included. An updated search was conducted up to August 2023. 
Our review included 27 articles: 20 examining patient views, 3 examining provider views, and 4 examining both patient and provider views. Data extraction and inductive data analysis were informed by sensitizing concepts from sociomaterial perspectives, and 15 themes were generated. Results: Patient perspectives (24 papers) were synthesized using nine themes: (1) patterns of use and patient characteristics; (2) emotional response when viewing the results and uncertainty about their implications; (3) understanding test results; (4) preferences for mode and timing of result release; (5) information seeking and patients' actions motivated by viewing results via a portal; (6) contemplating changes in behavior and managing own health; (7) benefits of accessing test results via a portal; (8) limitations of accessing test results via a portal; and (9) suggestions for portal improvement. Health care provider perspectives (7 papers) were synthesized into six themes: (1) providers' view of benefits of patient access to results via the portal; (2) effects on health care provider workload; (3) concerns about patient anxiety; (4) timing of result release into the patient portal; (5) the method of result release into the patient portal: manual versus automatic release; and (6) the effects of hospital health information technology system on patient quality outcomes. Conclusions: The timing of the release of test results emerged as a particularly important topic. In some countries, the policy context may motivate immediate release of most tests directly into patient portals. However, our findings aim to make policy makers, health administrators, and other stakeholders aware of factors to consider when making decisions about the timing of result release. This review is sensitive to the characteristics of patient populations and portal technology and can inform result release framework policies. The findings are timely, as patient portals have become more common internationally. 
UR - https://www.jmir.org/2023/1/e43765 UR - http://dx.doi.org/10.2196/43765 UR - http://www.ncbi.nlm.nih.gov/pubmed/37856174 ID - info:doi/10.2196/43765 ER - TY - JOUR AU - Sangeorzan, Irina AU - Antonacci, Grazia AU - Martin, Anne AU - Grodzinski, Ben AU - Zipser, M. Carl AU - Murphy, J. Rory K. AU - Andriopoulou, Panoraia AU - Cook, E. Chad AU - Anderson, B. David AU - Guest, James AU - Furlan, C. Julio AU - Kotter, N. Mark R. AU - Boerger, F. Timothy AU - Sadler, Iwan AU - Roberts, A. Elizabeth AU - Wood, Helen AU - Fraser, Christine AU - Fehlings, G. Michael AU - Kumar, Vishal AU - Jung, Josephine AU - Milligan, James AU - Nouri, Aria AU - Martin, R. Allan AU - Blizzard, Tammy AU - Vialle, Roberto Luiz AU - Tetreault, Lindsay AU - Kalsi-Ryan, Sukhvinder AU - MacDowall, Anna AU - Martin-Moore, Esther AU - Burwood, Martin AU - Wood, Lianne AU - Lalkhen, Abdul AU - Ito, Manabu AU - Wilson, Nicky AU - Treanor, Caroline AU - Dugan, Sheila AU - Davies, M. Benjamin PY - 2023/10/9 TI - Toward Shared Decision-Making in Degenerative Cervical Myelopathy: Protocol for a Mixed Methods Study JO - JMIR Res Protoc SP - e46809 VL - 12 KW - degenerative cervical myelopathy KW - spine KW - spinal cord KW - chronic KW - aging KW - geriatric KW - patient engagement KW - shared decision-making KW - process mapping KW - core information set KW - decision-making KW - patient education KW - common data element KW - Research Objectives and Common Data Elements for Degenerative Cervical Myelopathy KW - RECODE-DCM N2 - Background: Health care decisions are a critical determinant in the evolution of chronic illness. In shared decision-making (SDM), patients and clinicians work collaboratively to reach evidence-based health decisions that align with individual circumstances, values, and preferences. This personalized approach to clinical care likely has substantial benefits in the oversight of degenerative cervical myelopathy (DCM), a type of nontraumatic spinal cord injury. 
Its chronicity, heterogeneous clinical presentation, complex management, and variable disease course engender an imperative for a patient-centric approach that accounts for each patient's unique needs and priorities. Inadequate patient knowledge about the condition and an incomplete understanding of the critical decision points that arise during the course of care currently hinder the fruitful participation of health care providers and patients in SDM. This study protocol presents the rationale for deploying SDM for DCM and delineates the groundwork required to achieve this. Objective: The study's primary outcome is the development of a comprehensive checklist to be implemented upon diagnosis that provides patients with essential information necessary to support their informed decision-making. This is known as a core information set (CIS). The secondary outcome is the creation of a detailed process map that provides a diagrammatic representation of the global care workflows and cognitive processes involved in DCM care. Characterizing the critical decision points along a patient's journey will allow for an effective exploration of SDM tools for routine clinical practice to enhance patient-centered care and improve clinical outcomes. Methods: Both CISs and process maps are coproduced iteratively through a collaborative process involving the input and consensus of key stakeholders. This will be facilitated by Myelopathy.org, a global DCM charity, through its Research Objectives and Common Data Elements for Degenerative Cervical Myelopathy community. To develop the CIS, a 3-round, web-based Delphi process will be used, starting with a baseline list of information items derived from a recent scoping review of educational materials in DCM, patient interviews, and a qualitative survey of professionals. A priori criteria for achieving consensus are specified. 
The process map will be developed iteratively using semistructured interviews with patients and professionals and validated by key stakeholders. Results: Recruitment for the Delphi consensus study began in April 2023. The pilot-testing of process map interview participants started simultaneously, with the formulation of an initial baseline map underway. Conclusions: This protocol marks the first attempt to provide a starting point for investigating SDM in DCM. The primary work centers on developing an educational tool for use in diagnosis to enable enhanced onward decision-making. The wider objective is to aid stakeholders in developing SDM tools by identifying critical decision junctures in DCM care. Through these approaches, we aim to provide an exhaustive launchpad for formulating SDM tools in the wider DCM community. International Registered Report Identifier (IRRID): DERR1-10.2196/46809 UR - https://www.researchprotocols.org/2023/1/e46809 UR - http://dx.doi.org/10.2196/46809 UR - http://www.ncbi.nlm.nih.gov/pubmed/37812472 ID - info:doi/10.2196/46809 ER - TY - JOUR AU - Zhang, Ying AU - Li, Xiaoying AU - Liu, Yi AU - Li, Aihua AU - Yang, Xuemei AU - Tang, Xiaoli PY - 2023/10/5 TI - A Multilabel Text Classifier of Cancer Literature at the Publication Level: Methods Study of Medical Text Classification JO - JMIR Med Inform SP - e44892 VL - 11 KW - text classification KW - publication-level classifier KW - cancer literature KW - deep learning N2 - Background: Given the threat posed by cancer to human health, there is a rapid growth in the volume of data in the cancer field and interdisciplinary and collaborative research is becoming increasingly important for fine-grained classification. The low-resolution classifier of reported studies at the journal level fails to satisfy advanced searching demands, and a single label does not adequately characterize the literature originated from interdisciplinary research results. 
There is thus a need to establish a multilabel classifier with higher resolution to support literature retrieval for cancer research and reduce the burden of screening papers for clinical relevance. Objective: The primary objective of this research was to address the low-resolution issue of cancer literature classification due to the ambiguity of the existing journal-level classifier in order to support gaining high-relevance evidence for clinical consideration and all-sided results for literature retrieval. Methods: We trained a multilabel classifier with scalability for classifying the literature on cancer research directly at the publication level to assign proper content-derived labels based on the "Bidirectional Encoder Representation from Transformers (BERT) + X" model and obtain the best option for X. First, a corpus of 70,599 cancer publications retrieved from the Dimensions database was divided into a training and a testing set in a ratio of 7:3. Second, using the classification terminology of International Cancer Research Partnership cancer types, we compared the performance of classifiers developed using BERT and 5 classical deep learning models, such as the text recurrent neural network (TextRNN) and FastText, followed by metrics analysis. Results: After comparing various combined deep learning models, we obtained a classifier based on the optimal combination "BERT + TextRNN," with a precision of 93.09%, a recall of 87.75%, and an F1-score of 90.34%. Moreover, we quantified the distinctive characteristics in the text structure and multilabel distribution in order to generalize the model to other fields with similar characteristics. Conclusions: The "BERT + TextRNN" model was trained for high-resolution classification of cancer literature at the publication level to support accurate retrieval and academic statistics. The model automatically assigns 1 or more labels to each cancer paper, as required. 
Quantitative comparison verified that the "BERT + TextRNN" model is the best fit for multilabel classification of cancer literature compared to other models. More data from diverse fields will be collected to test the scalability and extensibility of the proposed model in the future. UR - https://medinform.jmir.org/2023/1/e44892 UR - http://dx.doi.org/10.2196/44892 UR - http://www.ncbi.nlm.nih.gov/pubmed/37796584 ID - info:doi/10.2196/44892 ER - TY - JOUR AU - Fraser, Hamish AU - Crossland, Daven AU - Bacher, Ian AU - Ranney, Megan AU - Madsen, Tracy AU - Hilliard, Ross PY - 2023/10/3 TI - Comparison of Diagnostic and Triage Accuracy of Ada Health and WebMD Symptom Checkers, ChatGPT, and Physicians for Patients in an Emergency Department: Clinical Data Analysis Study JO - JMIR Mhealth Uhealth SP - e49995 VL - 11 KW - diagnosis KW - triage KW - symptom checker KW - emergency patient KW - ChatGPT KW - LLM KW - diagnose KW - self-diagnose KW - self-diagnosis KW - app KW - application KW - language model KW - accuracy KW - ChatGPT-3.5 KW - ChatGPT-4.0 KW - emergency KW - machine learning N2 - Background: Diagnosis is a core component of effective health care, but misdiagnosis is common and can put patients at risk. Diagnostic decision support systems can play a role in improving diagnosis by physicians and other health care workers. Symptom checkers (SCs) have been designed to improve diagnosis and triage (ie, which level of care to seek) by patients. Objective: The aim of this study was to evaluate the performance of the new large language model ChatGPT (versions 3.5 and 4.0), the widely used WebMD SC, and an SC developed by Ada Health in the diagnosis and triage of patients with urgent or emergent clinical problems compared with the final emergency department (ED) diagnoses and physician reviews. 
Methods: We used previously collected, deidentified, self-report data from 40 patients presenting to an ED for care who used the Ada SC to record their symptoms prior to seeing the ED physician. Deidentified data were entered into ChatGPT versions 3.5 and 4.0 and WebMD by a research assistant blinded to diagnoses and triage. Diagnoses from all 4 systems were compared with the previously abstracted final diagnoses in the ED as well as with diagnoses and triage recommendations from three independent board-certified ED physicians who had blindly reviewed the self-report clinical data from Ada. Diagnostic accuracy was calculated as the proportion of the diagnoses from ChatGPT, Ada SC, WebMD SC, and the independent physicians that matched at least one ED diagnosis (stratified as top 1 or top 3). Triage accuracy was calculated as the number of recommendations from ChatGPT, WebMD, or Ada that agreed with at least 2 of the independent physicians or were rated "unsafe" or "too cautious." Results: Overall, 30 and 37 cases had sufficient data for diagnostic and triage analysis, respectively. The rate of top-1 diagnosis matches for Ada, ChatGPT 3.5, ChatGPT 4.0, and WebMD was 9 (30%), 12 (40%), 10 (33%), and 12 (40%), respectively, with a mean rate of 47% for the physicians. The rate of top-3 diagnostic matches for Ada, ChatGPT 3.5, ChatGPT 4.0, and WebMD was 19 (63%), 19 (63%), 15 (50%), and 17 (57%), respectively, with a mean rate of 69% for physicians. The distribution of triage results for Ada was 62% (n=23) agree, 14% unsafe (n=5), and 24% (n=9) too cautious; that for ChatGPT 3.5 was 59% (n=22) agree, 41% (n=15) unsafe, and 0% (n=0) too cautious; that for ChatGPT 4.0 was 76% (n=28) agree, 22% (n=8) unsafe, and 3% (n=1) too cautious; and that for WebMD was 70% (n=26) agree, 19% (n=7) unsafe, and 11% (n=4) too cautious. The unsafe triage rate for ChatGPT 3.5 (41%) was significantly higher (P=.009) than that of Ada (14%). 
Conclusions: ChatGPT 3.5 had high diagnostic accuracy but a high unsafe triage rate. ChatGPT 4.0 had the poorest diagnostic accuracy, but a lower unsafe triage rate and the highest triage agreement with the physicians. The Ada and WebMD SCs performed better overall than ChatGPT. Unsupervised patient use of ChatGPT for diagnosis and triage is not recommended without improvements to triage accuracy and extensive clinical evaluation. UR - https://mhealth.jmir.org/2023/1/e49995 UR - http://dx.doi.org/10.2196/49995 UR - http://www.ncbi.nlm.nih.gov/pubmed/37788063 ID - info:doi/10.2196/49995 ER - TY - JOUR AU - Matsuda, Shinichi AU - Ohtomo, Takumi AU - Okuyama, Masaru AU - Miyake, Hiraku AU - Aoki, Kotonari PY - 2023/9/14 TI - Estimating Patient Satisfaction Through a Language Processing Model: Model Development and Evaluation JO - JMIR Form Res SP - e48534 VL - 7 KW - breast cancer KW - internet KW - machine learning KW - natural language processing KW - natural language-processing model KW - neural network KW - NLP KW - patient satisfaction KW - textual data N2 - Background: Measuring patient satisfaction is a crucial aspect of medical care. Advanced natural language processing (NLP) techniques enable the extraction and analysis of high-level insights from textual data; nonetheless, data obtained from patients are often limited. Objective: This study aimed to create a model that quantifies patient satisfaction based on diverse patient-written textual data. Methods: We constructed a neural network–based NLP model for this cross-sectional study using the textual content from disease blogs written in Japanese on the Internet between 1994 and 2020. We extracted approximately 20 million sentences from 56,357 patient-authored disease blogs and constructed a model to predict the patient satisfaction index (PSI) using a regression approach. 
After evaluating the model's effectiveness, PSI was predicted before and after cancer notification to examine the emotional impact of cancer diagnoses on 48 patients with breast cancer. Results: We assessed the correlation between the predicted and actual PSI values, labeled by humans, using the test set of 169 sentences. The model successfully quantified patient satisfaction by detecting nuances in sentences with excellent effectiveness (Spearman correlation coefficient [ρ]=0.832; root-mean-squared error [RMSE]=0.166; P<.001). Furthermore, the PSI was significantly lower in the cancer notification period than in the preceding control period (−0.057 and −0.012, respectively; 2-tailed t47=5.392, P<.001), indicating that the model quantifies the psychological and emotional changes associated with the cancer diagnosis notification. Conclusions: Our model demonstrates the ability to quantify patient dissatisfaction and identify significant emotional changes during the disease course. This approach may also help detect issues in routine medical practice. UR - https://formative.jmir.org/2023/1/e48534 UR - http://dx.doi.org/10.2196/48534 UR - http://www.ncbi.nlm.nih.gov/pubmed/37707946 ID - info:doi/10.2196/48534 ER - TY - JOUR AU - Fernandes, J. Glenn AU - Choi, Arthur AU - Schauer, Michael Jacob AU - Pfammatter, F. Angela AU - Spring, J. Bonnie AU - Darwiche, Adnan AU - Alshurafa, I. 
Nabil PY - 2023/9/6 TI - An Explainable Artificial Intelligence Software Tool for Weight Management Experts (PRIMO): Mixed Methods Study JO - J Med Internet Res SP - e42047 VL - 25 KW - explainable artificial intelligence KW - explainable AI KW - machine learning KW - ML KW - interpretable ML KW - random forest KW - decision-making KW - weight loss prediction KW - mobile phone N2 - Background: Predicting the likelihood of success of weight loss interventions using machine learning (ML) models may enhance intervention effectiveness by enabling timely and dynamic modification of intervention components for nonresponders to treatment. However, a lack of understanding and trust in these ML models impacts adoption among weight management experts. Recent advances in the field of explainable artificial intelligence enable the interpretation of ML models, yet it is unknown whether they enhance model understanding, trust, and adoption among weight management experts. Objective: This study aimed to build and evaluate an ML model that can predict 6-month weight loss success (ie, ≥7% weight loss) from 5 engagement and diet-related features collected over the initial 2 weeks of an intervention, to assess whether providing ML-based explanations increases weight management experts' agreement with ML model predictions, and to inform factors that influence the understanding and trust of ML models to advance explainability in early prediction of weight loss among weight management experts. Methods: We trained an ML model using the random forest (RF) algorithm and data from a 6-month weight loss intervention (N=419). We leveraged findings from existing explainability metrics to develop Prime Implicant Maintenance of Outcome (PRIMO), an interactive tool to understand predictions made by the RF model. We asked 14 weight management experts to predict hypothetical participants' weight loss success before and after using PRIMO. 
We compared PRIMO with 2 other explainability methods, one based on feature ranking and the other based on conditional probability. We used generalized linear mixed-effects models to evaluate participants' agreement with ML predictions and conducted likelihood ratio tests to examine the relationship between explainability methods and outcomes for nested models. We conducted guided interviews and thematic analysis to study the impact of our tool on experts' understanding and trust in the model. Results: Our RF model had 81% accuracy in the early prediction of weight loss success. Weight management experts were significantly more likely to agree with the model when using PRIMO (χ2=7.9; P=.02) compared with the other 2 methods with odds ratios of 2.52 (95% CI 0.91-7.69) and 3.95 (95% CI 1.50-11.76). From our study, we inferred that our software not only influenced experts' understanding and trust but also impacted decision-making. Several themes were identified through interviews: preference for multiple explanation types, need to visualize uncertainty in explanations provided by PRIMO, and need for model performance metrics on similar participant test instances. Conclusions: Our results show the potential for weight management experts to agree with the ML-based early prediction of success in weight loss treatment programs, enabling timely and dynamic modification of intervention components to enhance intervention effectiveness. Our findings provide methods for advancing the understandability and trust of ML models among weight management experts. 
UR - https://www.jmir.org/2023/1/e42047 UR - http://dx.doi.org/10.2196/42047 UR - http://www.ncbi.nlm.nih.gov/pubmed/37672333 ID - info:doi/10.2196/42047 ER - TY - JOUR AU - Henke, Elisa AU - Peng, Yuan AU - Reinecke, Ines AU - Zoch, Michéle AU - Sedlmayr, Martin AU - Bathelt, Franziska PY - 2023/8/21 TI - An Extract-Transform-Load Process Design for the Incremental Loading of German Real-World Data Based on FHIR and OMOP CDM: Algorithm Development and Validation JO - JMIR Med Inform SP - e47310 VL - 11 KW - ETL KW - incremental loading KW - OMOP CDM KW - FHIR KW - interoperability KW - Extract-Transform-Load KW - Observational Medical Outcomes Partnership Common Data Model KW - Fast Healthcare Interoperability Resources N2 - Background: In the Medical Informatics in Research and Care in University Medicine (MIRACUM) consortium, an IT-based clinical trial recruitment support system was developed based on the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). Currently, OMOP CDM is populated with German Fast Healthcare Interoperability Resources (FHIR) using an Extract-Transform-Load (ETL) process, which was designed as a bulk load. However, the computational effort that comes with an everyday full load is not efficient for daily recruitment. Objective: The aim of this study is to extend our existing ETL process with the option of incremental loading to efficiently support daily updated data. Methods: Based on our existing bulk ETL process, we performed an analysis to determine the requirements of incremental loading. Furthermore, a literature review was conducted to identify adaptable approaches. Based on this, we implemented three methods to integrate incremental loading into our ETL process. Lastly, a test suite was defined to evaluate the incremental loading for data correctness and performance compared to bulk loading. Results: The resulting ETL process supports bulk and incremental loading. 
Performance tests show that the incremental load took 87.5% less execution time than the bulk load (2.12 min compared to 17.07 min) related to changes of 1 day, while no data differences occurred in OMOP CDM. Conclusions: Since incremental loading is more efficient than a daily bulk load and both loading options result in the same amount of data, we recommend using bulk load for an initial load and switching to incremental load for daily updates. The resulting incremental ETL logic can be applied internationally since it is not restricted to German FHIR profiles. UR - https://medinform.jmir.org/2023/1/e47310 UR - http://dx.doi.org/10.2196/47310 ID - info:doi/10.2196/47310 ER - TY - JOUR AU - Sasseville, Maxime AU - Supper, Wilfried AU - Gartner, Jean-Baptiste AU - Layani, Géraldine AU - Amil, Samira AU - Sheffield, Peter AU - Gagnon, Marie-Pierre AU - Hudon, Catherine AU - Lambert, Sylvie AU - Attisso, Eugène AU - Bureau Lagarde, Victoria AU - Breton, Mylaine AU - Poitras, Marie-Eve AU - Pluye, Pierre AU - Roux-Levy, Pierre-Henri AU - Plaisimond, James AU - Bergeron, Frédéric AU - Ashcroft, Rachelle AU - Wong, Sabrina AU - Groulx, Antoine AU - Beaudet, Nicolas AU - Paquette, Jean-Sébastien AU - D'Anjou, Natasha AU - Langlois, Sylviane AU - LeBlanc, Annie PY - 2023/8/18 TI - Clinical Integration of Digital Patient-Reported Outcome Measures in Primary Health Care for Chronic Disease Management: Protocol for a Systematic Review JO - JMIR Res Protoc SP - e48155 VL - 12 KW - systematic review KW - patient-reported outcome measure KW - primary healthcare KW - health care KW - implementation science N2 - Background: Health measurement guides policies and health care decisions and is necessary to describe and attain the quintuple aim of improving patient experience, population health, care team well-being, health care costs, and equity. 
In the primary care setting, patient-reported outcome measurement allows outcome comparisons within and across settings and helps improve the clinical management of patients. However, these digital patient-reported outcome measures (PROMs) are still not adapted to the clinical context of primary health care, which is an indication of the complexity of integrating these tools in this context. We must then gather evidence of their impact on chronic disease management in primary health care and understand the characteristics of effective implementation. Objective: We will conduct a systematic review to identify and assess the impact of electronic PROMs (ePROMs) implementation in primary health care for chronic disease management. Our specific objectives are to (1) determine the impact of ePROMs in primary health care for chronic disease management and (2) compare and contrast characteristics of effective ePROMs' implementation strategies. Methods: We will conduct a systematic review of the literature in accordance with the guidelines of the Cochrane Methods Group and in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines for its reporting. A specific search strategy was developed for relevant databases to identify studies. Two reviewers will independently apply the inclusion criteria using full texts and will extract the data. We will use a 2-phase sequential mixed methods synthesis design by conducting a qualitative synthesis first, and use its results to perform a quantitative synthesis. Results: This study was initiated in June 2022 by assembling the research team and the knowledge transfer committee. The preliminary search strategy was developed and completed in September 2022. The main search strategy, data collection, study selection, and application of inclusion criteria were completed between October and December 2022. 
Conclusions: Results from this review will help support implementation efforts to accelerate innovations and digital adoption for primary health care and will be relevant for improving clinical management of chronic diseases and health care services and policies. Trial Registration: PROSPERO International Prospective Register of Systematic Reviews CRD42022333513; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=333513 International Registered Report Identifier (IRRID): DERR1-10.2196/48155 UR - https://www.researchprotocols.org/2023/1/e48155 UR - http://dx.doi.org/10.2196/48155 UR - http://www.ncbi.nlm.nih.gov/pubmed/37594780 ID - info:doi/10.2196/48155 ER - TY - JOUR AU - Jaiswal, Aman AU - Katz, Alan AU - Nesca, Marcello AU - Milios, Evangelos PY - 2023/8/9 TI - Identifying Risk Factors Associated With Lower Back Pain in Electronic Medical Record Free Text: Deep Learning Approach Using Clinical Note Annotations JO - JMIR Med Inform SP - e45105 VL - 11 KW - machine learning KW - lower back pain KW - natural language processing KW - semantic textual similarity KW - electronic medical records KW - risk factors KW - deep learning N2 - Background: Lower back pain is a common weakening condition that affects a large population. It is a leading cause of disability and lost productivity, and the associated medical costs and lost wages place a substantial burden on individuals and society. Recent advances in artificial intelligence and natural language processing have opened new opportunities for the identification and management of risk factors for lower back pain. In this paper, we propose and train a deep learning model on a data set of clinical notes that have been annotated with relevant risk factors, and we evaluate the model's performance in identifying risk factors in new clinical notes. 
Objective: The primary objective is to develop a novel deep learning approach to detect risk factors for underlying disease in patients presenting with lower back pain in clinical encounter notes. The secondary objective is to propose solutions to potential challenges of using deep learning and natural language processing techniques for identifying risk factors in electronic medical record free text and make practical recommendations for future research in this area. Methods: We manually annotated clinical notes for the presence of six risk factors for severe underlying disease in patients presenting with lower back pain. Data were highly imbalanced, with only 12% (n=296) of the annotated notes having at least one risk factor. To address imbalanced data, a combination of semantic textual similarity and regular expressions was used to further capture notes for annotation. Further analysis was conducted to study the impact of downsampling, binary formulation of multi-label classification, and unsupervised pretraining on classification performance. Results: Of 2749 labeled clinical notes, 347 exhibited at least one risk factor, while 2402 exhibited none. The initial analysis shows that downsampling the training set to equalize the ratio of clinical notes with and without risk factors improved the macro-area under the receiver operating characteristic curve (AUROC) by 2%. The Bidirectional Encoder Representations from Transformers (BERT) model improved the macro-AUROC by 15% over the traditional machine learning baseline. In experiment 2, the proposed BERT-convolutional neural network (CNN) model for longer texts improved (4% macro-AUROC) over the BERT baseline, and the multitask models are more stable for minority classes. In experiment 3, domain adaptation of BERT-CNN using masked language modeling improved the macro-AUROC by 2%. Conclusions: Primary care clinical notes are likely to require manipulation to perform meaningful free-text analysis. 
The application of BERT models for multi-label classification on downsampled annotated clinical notes is useful in detecting risk factors suggesting an indication for imaging for patients with lower back pain. UR - https://medinform.jmir.org/2023/1/e45105 UR - http://dx.doi.org/10.2196/45105 ID - info:doi/10.2196/45105 ER - TY - JOUR AU - Ferrara, Maria AU - Gentili, Elisabetta AU - Belvederi Murri, Martino AU - Zese, Riccardo AU - Alberti, Marco AU - Franchini, Giorgia AU - Domenicano, Ilaria AU - Folesani, Federica AU - Sorio, Cristina AU - Benini, Lorenzo AU - Carozza, Paola AU - Little, Julian AU - Grassi, Luigi PY - 2023/8/9 TI - Establishment of a Public Mental Health Database for Research Purposes in the Ferrara Province: Development and Preliminary Evaluation Study JO - JMIR Med Inform SP - e45523 VL - 11 KW - mental health KW - psychosis KW - epidemiology KW - electronic health registry KW - health care KW - machine learning KW - medical health records KW - electronic health records KW - clinical database KW - support KW - mental disorder KW - social determinants KW - mental health care KW - resource utilization N2 - Background: The immediate use of data exported from electronic health records (EHRs) for research is often limited by the necessity to transform data elements into an actual data set. Objective: This paper describes the methodology for establishing a data set that originated from an EHR registry that included clinical, health service, and sociodemographic information. Methods: The Extract, Transform, Load process was applied to raw data collected at the Integrated Department of Mental Health and Pathological Addictions in Ferrara, Italy, from 1925 to February 18, 2021, to build the new, anonymized Ferrara-Psychiatry (FEPSY) database. Information collected before the first EHR was implemented (ie, in 1991) was excluded. An unsupervised cluster analysis was performed to identify patient subgroups to support the proof of concept. 
Results: The FEPSY database included 3,861,432 records on 46,222 patients. Since 1991, each year, a median of 1404 (IQR 1117.5-1757.7) patients had newly accessed care, and a median of 7300 (IQR 6109.5-9397.5) patients were actively receiving care. Among 38,022 patients with a mental disorder, 2 clusters were identified; the first predominantly included male patients who were aged 25 to 34 years at first presentation and were living with their parents, and the second predominantly included female patients who were aged 35 to 44 years and were living with their own families. Conclusions: The process for building the FEPSY database proved to be robust and replicable with similar health care data, even when they were not originally conceived for research purposes. The FEPSY database will enable future in-depth analyses regarding the epidemiology and social determinants of mental disorders, access to mental health care, and resource utilization. UR - https://medinform.jmir.org/2023/1/e45523 UR - http://dx.doi.org/10.2196/45523 ID - info:doi/10.2196/45523 ER - TY - JOUR AU - Makan, Hemant AU - Makan, Lindie AU - Lubbe, Jacqueline AU - Alami, Sarah AU - Lancman, Guila AU - Schaller, Manuella AU - Delval, Cécile AU - Kok, Adri PY - 2023/8/7 TI - Clinical and Economic Assessment of MyDiaCare, Digital Tools Combined With Diabetes Nurse Educator Support, for Managing Diabetes in South Africa: Observational Multicenter, Retrospective Study Associated With a Budget Impact Model JO - JMIR Form Res SP - e35790 VL - 7 KW - diabetes mellitus KW - diabetes nurse educator KW - digital tool KW - MyDiaCare program KW - type 2 N2 - Background: In South Africa, diabetes prevalence is expected to reach 5.4 million by 2030. In South Africa, diabetes-related complications severely impact not only patient health and quality of life but also the economy. 
Objective: The Diabetes Nurse Educator (DNE) study assessed the benefit of adding the MyDiaCare program to standard of care for managing patients with type 1 and type 2 diabetes in South Africa. An economic study was also performed to estimate the budget impact of adding MyDiaCare to standard of care for patients with type 2 diabetes older than 19 years treated in the South African private health care sector. Methods: The real-world DNE study was designed as an observational, retrospective, multicenter, single-group study. Eligible patients were older than 18 years and had at least 6 months of participation in the MyDiaCare program. The MyDiaCare program combines a patient mobile app and a health care professional platform with face-to-face visits with a DNE. The benefit of MyDiaCare was assessed by the changes in glycated hemoglobin (HbA1c) levels, the proportion of patients achieving clinical and biological targets, adherence to care plans, and satisfaction after 6 months of participating in the MyDiaCare program. A budget impact model was performed using data from the DNE study and another South African cohort of the DISCOVERY study to estimate the economic impact of MyDiaCare. Results: Between November 25, 2019, and June 30, 2020, a total of 117 patients (8 with type 1 diabetes and 109 with type 2 diabetes) were enrolled in 2 centers. After 6 months of MyDiaCare, a clinically relevant decrease in mean HbA1c levels of 0.6% from 7.8% to 7.2% was observed. Furthermore, 54% (43/79) of patients reached or maintained their HbA1c targets at 6 months. 
Most patients achieved their targets for blood pressure (53/79, 67% for systolic and 70/79, 89% for diastolic blood pressure) and lipid parameters (49/71, 69% for low-density-lipoprotein [LDL] cholesterol, 41/71, 58% for high-density-lipoprotein [HDL] cholesterol, and 59/71, 83% for total cholesterol), but fewer patients achieved their targets for triglycerides (32/70, 46%), waist circumference (12/68, 18%), and body weight (13/76, 17%). The mean overall adherence to the MyDiaCare care plan was 93%. Most patients (87/117, 74%) were satisfied with the MyDiaCare program. The net budget impact per patient with type 2 diabetes, older than 19 years, treated in the private sector using MyDiaCare was estimated to be approximately South African Rands (ZAR) 71,023 (US $4089) during the first year of introducing MyDiaCare. Conclusions: The results of using the MyDiaCare program, which combines digital tools for patients and health care professionals with DNE support, suggest that it may be a clinically effective and cost-saving solution for diabetes management in the South African private health care sector. 
UR - https://formative.jmir.org/2023/1/e35790 UR - http://dx.doi.org/10.2196/35790 UR - http://www.ncbi.nlm.nih.gov/pubmed/37548994 ID - info:doi/10.2196/35790 ER - TY - JOUR AU - Wolfien, Markus AU - Ahmadi, Najia AU - Fitzer, Kai AU - Grummt, Sophia AU - Heine, Kilian-Ludwig AU - Jung, Ian-C AU - Krefting, Dagmar AU - Kühn, Andreas AU - Peng, Yuan AU - Reinecke, Ines AU - Scheel, Julia AU - Schmidt, Tobias AU - Schmücker, Paul AU - Schüttler, Christina AU - Waltemath, Dagmar AU - Zoch, Michele AU - Sedlmayr, Martin PY - 2023/7/24 TI - Ten Topics to Get Started in Medical Informatics Research JO - J Med Internet Res SP - e45948 VL - 25 KW - medical informatics KW - health informatics KW - interdisciplinary communication KW - research data KW - clinical data KW - digital health UR - https://www.jmir.org/2023/1/e45948 UR - http://dx.doi.org/10.2196/45948 UR - http://www.ncbi.nlm.nih.gov/pubmed/37486754 ID - info:doi/10.2196/45948 ER - TY - JOUR AU - Yune, Jung So AU - Kim, Youngjon AU - Lee, Woog Jea PY - 2023/7/19 TI - Data Analysis of Physician Competence Research Trend: Social Network Analysis and Topic Modeling Approach JO - JMIR Med Inform SP - e47934 VL - 11 KW - physician competency KW - research trend KW - competency-based education KW - professionalism KW - topic modeling KW - latent Dirichlet allocation KW - LDA algorithm KW - data science KW - social network analysis N2 - Background: Studies on competency in medical education often explore the acquisition, performance, and evaluation of particular skills, knowledge, or behaviors that constitute physician competency. As physician competency reflects social demands according to changes in the medical environment, analyzing the research trends of physician competency by period is necessary to derive major research topics for future studies. Therefore, a more macroscopic method is required to analyze the core competencies of physicians in this era. 
Objective: This study aimed to analyze research trends related to physicians' competency in reflecting social needs according to changes in the medical environment. Methods: We used topic modeling to identify potential research topics by analyzing data from studies related to physician competency published between 2011 and 2020. We preprocessed 1354 articles and extracted 272 keywords. Results: The terms that appeared most frequently in the research related to physician competency since 2010 were knowledge, hospital, family, job, guidelines, management, and communication. The terms that appeared in most studies were education, model, knowledge, and hospital. Topic modeling revealed that the main topics about physician competency included Evidence-based clinical practice, Community-based healthcare, Patient care, Career and self-management, Continuous professional development, and Communication and cooperation. We divided the studies into 4 periods (2011-2013, 2014-2016, 2017-2019, and 2020-2021) and performed a linear regression analysis. The results showed a change in topics by period. The hot topics that have shown increased interest among scholars over time include Community-based healthcare, Career and self-management, and Continuous professional development. Conclusions: On the basis of the analysis of research trends, it is predicted that physician professionalism and community-based medicine will continue to be studied in future studies on physician competency. 
UR - https://medinform.jmir.org/2023/1/e47934 UR - http://dx.doi.org/10.2196/47934 UR - http://www.ncbi.nlm.nih.gov/pubmed/37467028 ID - info:doi/10.2196/47934 ER - TY - JOUR AU - Lichtner, Gregor AU - Haese, Thomas AU - Brose, Sally AU - Röhrig, Larissa AU - Lysyakova, Liudmila AU - Rudolph, Stefanie AU - Uebe, Maria AU - Sass, Julian AU - Bartschke, Alexander AU - Hillus, David AU - Kurth, Florian AU - Sander, Erik Leif AU - Eckart, Falk AU - Toepfner, Nicole AU - Berner, Reinhard AU - Frey, Anna AU - Dörr, Marcus AU - Vehreschild, Janne Jörg AU - von Kalle, Christof AU - Thun, Sylvia PY - 2023/7/18 TI - Interoperable, Domain-Specific Extensions for the German Corona Consensus (GECCO) COVID-19 Research Data Set Using an Interdisciplinary, Consensus-Based Workflow: Data Set Development Study JO - JMIR Med Inform SP - e45496 VL - 11 KW - interoperability KW - research data set KW - Fast Healthcare Interoperability Resources KW - FHIR KW - FAIR principle KW - COVID-19 KW - interoperable KW - SARS-CoV-2 KW - pediatric KW - immunization KW - cardiology KW - standard N2 - Background: The COVID-19 pandemic has spurred large-scale, interinstitutional research efforts. To enable these efforts, researchers must agree on data set definitions that not only cover all elements relevant to the respective medical specialty but also are syntactically and semantically interoperable. Therefore, the German Corona Consensus (GECCO) data set was developed as a harmonized, interoperable collection of the most relevant data elements for COVID-19-related patient research. As the GECCO data set is a compact core data set comprising data across all medical fields, the focused research within particular medical domains demands the definition of extension modules that include data elements that are the most relevant to the research performed in those individual medical specialties. 
Objective: We aimed to (1) specify a workflow for the development of interoperable data set definitions that involves close collaboration between medical experts and information scientists and (2) apply the workflow to develop data set definitions that include data elements that are the most relevant to COVID-19-related patient research regarding immunization, pediatrics, and cardiology. Methods: We developed a workflow to create data set definitions that were (1) content-wise as relevant as possible to a specific field of study and (2) universally usable across computer systems, institutions, and countries (ie, interoperable). We then gathered medical experts from 3 specialties, namely infectious diseases (with a focus on immunization), pediatrics, and cardiology, to select data elements that were the most relevant to COVID-19-related patient research in the respective specialty. We mapped the data elements to international standardized vocabularies and created data exchange specifications, using Health Level Seven International (HL7) Fast Healthcare Interoperability Resources (FHIR). All steps were performed in close interdisciplinary collaboration with medical domain experts and medical information specialists. Profiles and vocabulary mappings were syntactically and semantically validated in a 2-stage process. Results: We created GECCO extension modules for the immunization, pediatrics, and cardiology domains according to pandemic-related requests. The data elements included in each module were selected, according to the developed consensus-based workflow, by medical experts from these specialties to ensure that the contents aligned with their research needs. We defined data set specifications for 48 immunization, 150 pediatrics, and 52 cardiology data elements that complement the GECCO core data set. We created and published implementation guides, example implementations, and data set annotations for each extension module. 
Conclusions: The GECCO extension modules, which contain data elements that are the most relevant to COVID-19-related patient research on infectious diseases (with a focus on immunization), pediatrics, and cardiology, were defined in an interdisciplinary, iterative, consensus-based workflow that may serve as a blueprint for developing further data set definitions. The GECCO extension modules provide standardized and harmonized definitions of specialty-related data sets that can help enable interinstitutional and cross-country COVID-19 research in these specialties. UR - https://medinform.jmir.org/2023/1/e45496 UR - http://dx.doi.org/10.2196/45496 ID - info:doi/10.2196/45496 ER - TY - JOUR AU - Yang, Dan AU - Su, Zihan AU - Mu, Runqing AU - Diao, Yingying AU - Zhang, Xin AU - Liu, Yusi AU - Wang, Shuo AU - Wang, Xu AU - Zhao, Lei AU - Wang, Hongyi AU - Zhao, Min PY - 2023/7/17 TI - Effects of Using Different Indirect Techniques on the Calculation of Reference Intervals: Observational Study JO - J Med Internet Res SP - e45651 VL - 25 KW - comparative study KW - data transformation KW - indirect method KW - outliers KW - reference interval KW - clinical decision-making KW - complete blood count KW - red blood cells KW - white blood cells KW - platelets KW - laboratory KW - clinical N2 - Background: Reference intervals (RIs) play an important role in clinical decision-making. However, due to the time, labor, and financial costs involved in establishing RIs using direct means, the use of indirect methods, based on big data previously obtained from clinical laboratories, is getting increasing attention. Different indirect techniques combined with different data transformation methods and outlier removal might cause differences in the calculation of RIs. However, there are few systematic evaluations of this. 
Objective: This study used data derived from direct methods as reference standards and evaluated the accuracy of combinations of different data transformation, outlier removal, and indirect techniques in establishing complete blood count (CBC) RIs for large-scale data. Methods: The CBC data of populations aged ≥18 years undergoing physical examination from January 2010 to December 2011 were retrieved from the First Affiliated Hospital of China Medical University in northern China. After exclusion of repeated individuals, we performed parametric, nonparametric, Hoffmann, Bhattacharya, and truncation points and Kolmogorov-Smirnov distance (kosmic) indirect methods, combined with log or BoxCox transformation, and Reed-Dixon, Tukey, and iterative mean (3SD) outlier removal methods in order to derive the RIs of 8 CBC parameters and compared the results with those directly and previously established. Furthermore, bias ratios (BRs) were calculated to assess which combination of indirect technique, data transformation pattern, and outlier removal method is preferable. Results: Raw data showed that the degrees of skewness of the white blood cell (WBC) count, platelet (PLT) count, mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), and mean corpuscular volume (MCV) were much more obvious than those of other CBC parameters. After log or BoxCox transformation combined with Tukey or iterative mean (3SD) processing, the distribution types of these data were close to Gaussian distribution. Tukey-based outlier removal yielded the maximum number of outliers. The lower-limit bias of WBC (male), PLT (male), hemoglobin (HGB; male), MCH (male/female), and MCV (female) was greater than that of the corresponding upper limit for more than half of 30 indirect methods. Computational indirect choices of CBC parameters for males and females were inconsistent. The RIs of MCHC established by the direct method for females were narrow. 
For this, the kosmic method was markedly superior, which contrasted with the RI calculation of CBC parameters with high |BR| qualification rates for males. Among the top 10 methodologies for the WBC count, PLT count, HGB, MCV, and MCHC with a high-BR qualification rate among males, the Bhattacharya, Hoffmann, and parametric methods were superior to the other 2 indirect methods. Conclusions: Compared to results derived by the direct method, outlier removal methods and indirect techniques markedly influence the final RIs, whereas data transformation has negligible effects, except for obviously skewed data. Specifically, the outlier removal efficiency of Tukey and iterative mean (3SD) methods is almost equivalent. Furthermore, the choice of indirect techniques depends more on the characteristics of the studied analyte itself. This study provides scientific evidence for clinical laboratories to use their previous data sets to establish RIs. UR - https://www.jmir.org/2023/1/e45651 UR - http://dx.doi.org/10.2196/45651 UR - http://www.ncbi.nlm.nih.gov/pubmed/37459170 ID - info:doi/10.2196/45651 ER - TY - JOUR AU - Wang, Yanyan AU - Zhang, Jin PY - 2023/7/17 TI - A Study on User-Oriented Subjects of Child Abuse on Wikipedia: Temporal Analysis of Wikipedia History Versions and Traffic Data JO - J Med Internet Res SP - e43901 VL - 25 KW - child abuse KW - user-oriented subject KW - subject schema KW - subject change KW - popularity trend KW - temporal analysis KW - Wikipedia N2 - Background: Many people turn to online open encyclopedias such as Wikipedia to seek knowledge about child abuse. However, the information available on this website is often disorganized and incomplete. Objective: The aim of this study is to analyze Wikipedia's coverage of child abuse and provide a more accessible way for users to browse child abuse–related content. 
The study explored the main themes and subjects related to child abuse on Wikipedia and proposed a multilayer user-oriented subject schema from the general users' perspective. Methods: The knowledge of child abuse on Wikipedia is presented in the child abuse–related articles on it. The study analyzed child abuse–related articles on Wikipedia, examining their history versions and yearly page views data to reveal the evolution of content and popularity. The themes and subjects were identified from the articles' text using the open coding, self-organizing map, and n-gram approaches. The subjects in different periods were compared to reveal changes in content. Results: This study collected and investigated 241 associated Wikipedia articles and their history versions and traffic data. Four facets were identified: (1) maltreatment behavior (n=118, 48.9%); (2) people and environment (n=28, 11.6%); (3) problems and risks (n=33, 13.7%); and (4) protection and support (n=62, 25.7%). A total of 8 themes and 51 subjects were generated from the text, and a user-oriented subject schema linking the facets, themes, subjects, and articles was created. Maltreatment behavior (number of total views = 1.15 × 10^8) was the most popular facet viewed by users, while people and environment (number of total views = 2.42 × 10^7) was the least popular. The popularity of child abuse increased from 2010 to 2014 but decreased after that. Conclusions: The user-oriented subject schema provides an easier way for users to seek information and learn about child abuse. The knowledge of child abuse on Wikipedia covers the harms done to children, the problems caused by child abuse, the protection of children, and the people involved in child abuse. However, there was an inconsistency between the interests of general users and Wikipedia editors, and the child abuse knowledge on Wikipedia was found to be deficient, lacking content about typical child abuse types. To meet users' 
needs, health information creators need to generate more information to fill the knowledge gap. UR - https://www.jmir.org/2023/1/e43901 UR - http://dx.doi.org/10.2196/43901 UR - http://www.ncbi.nlm.nih.gov/pubmed/37459149 ID - info:doi/10.2196/43901 ER - TY - JOUR AU - Bottani, Eleonora AU - Bellini, Valentina AU - Mordonini, Monica AU - Pellegrino, Mattia AU - Lombardo, Gianfranco AU - Franchi, Beatrice AU - Craca, Michelangelo AU - Bignami, Elena PY - 2023/7/5 TI - Internet of Things and New Technologies for Tracking Perioperative Patients With an Innovative Model for Operating Room Scheduling: Protocol for a Development and Feasibility Study JO - JMIR Res Protoc SP - e45477 VL - 12 KW - internet of things KW - artificial intelligence KW - machine learning KW - perioperative organization KW - operating rooms N2 - Background: Management of operating rooms is a critical point in health care organizations because surgical departments represent a significant cost in hospital budgets. Therefore, it is increasingly important that there is effective planning of elective, emergency, and day surgery and optimization of both the human and physical resources available, always maintaining a high level of care and health treatment. This would lead to a reduction in patient waiting lists and better performance not only of surgical departments but also of the entire hospital. Objective: This study aims to automatically collect data from a real surgical scenario to develop an integrated technological-organizational model that optimizes operating block resources. Methods: Each patient is tracked and located in real time by wearing a bracelet sensor with a unique identifier. Exploiting the indoor location, the software architecture is able to collect the time spent for every step inside the surgical block. 
This method does not in any way affect the level of assistance that the patient receives and always protects their privacy; in fact, after expressing informed consent, each patient will be associated with an anonymous identification number. Results: The preliminary results are promising, making the study feasible and functional. Times automatically recorded are much more precise than those collected by humans and reported in the organization's information system. In addition, machine learning can exploit the historical data collection to predict the surgery time required for each patient according to the patient's specific profile. Simulation can also be applied to reproduce the system's functioning, evaluate current performance, and identify strategies to improve the efficiency of the operating block. Conclusions: This functional approach improves short- and long-term surgical planning, facilitating interaction between the various professionals involved in the operating block, optimizing the management of available resources, and guaranteeing a high level of patient care in an increasingly efficient health care system. Trial Registration: ClinicalTrials.gov NCT05106621; https://clinicaltrials.gov/ct2/show/NCT05106621 International Registered Report Identifier (IRRID): DERR1-10.2196/45477 UR - https://www.researchprotocols.org/2023/1/e45477 UR - http://dx.doi.org/10.2196/45477 UR - http://www.ncbi.nlm.nih.gov/pubmed/37405821 ID - info:doi/10.2196/45477 ER - TY - JOUR AU - Wang, Liwei AU - He, Huan AU - Wen, Andrew AU - Moon, Sungrim AU - Fu, Sunyang AU - Peterson, J. 
Kevin AU - Ai, Xuguang AU - Liu, Sijia AU - Kavuluru, Ramakanth AU - Liu, Hongfang PY - 2023/6/27 TI - Acquisition of a Lexicon for Family History Information: Bidirectional Encoder Representations From Transformers–Assisted Sublanguage Analysis JO - JMIR Med Inform SP - e48072 VL - 11 KW - electronic health record KW - natural language processing KW - family history KW - sublanguage analysis KW - rule-based system KW - deep learning N2 - Background: A patient's family history (FH) information significantly influences downstream clinical care. Despite this importance, there is no standardized method to capture FH information in electronic health records and a substantial portion of FH information is frequently embedded in clinical notes. This renders FH information difficult to use in downstream data analytics or clinical decision support applications. To address this issue, a natural language processing system capable of extracting and normalizing FH information can be used. Objective: In this study, we aimed to construct an FH lexical resource for information extraction and normalization. Methods: We exploited a transformer-based method to construct an FH lexical resource leveraging a corpus consisting of clinical notes generated as part of primary care. The usability of the lexicon was demonstrated through the development of a rule-based FH system that extracts FH entities and relations as specified in previous FH challenges. We also experimented with a deep learning–based FH system for FH information extraction. Previous FH challenge data sets were used for evaluation. Results: The resulting lexicon contains 33,603 lexicon entries normalized to 6408 concept unique identifiers of the Unified Medical Language System and 15,126 codes of the Systematized Nomenclature of Medicine Clinical Terms, with an average number of 5.4 variants per concept. The performance evaluation demonstrated that the rule-based FH system achieved reasonable performance. 
The combination of the rule-based FH system with a state-of-the-art deep learning–based FH system can improve the recall of FH information evaluated using the BioCreative/N2C2 FH challenge data set, with the F1 score varied but comparable. Conclusions: The resulting lexicon and rule-based FH system are freely available through the Open Health Natural Language Processing GitHub. UR - https://medinform.jmir.org/2023/1/e48072 UR - http://dx.doi.org/10.2196/48072 UR - http://www.ncbi.nlm.nih.gov/pubmed/37368483 ID - info:doi/10.2196/48072 ER - TY - JOUR AU - Perrin Franck, Caroline AU - Babington-Ashaye, Awa AU - Dietrich, Damien AU - Bediang, Georges AU - Veltsos, Philippe AU - Gupta, Prasad Pramendra AU - Juech, Claudia AU - Kadam, Rigveda AU - Collin, Maxime AU - Setian, Lucy AU - Serrano Pons, Jordi AU - Kwankam, Yunkap S. AU - Garrette, Béatrice AU - Barbe, Solenne AU - Bagayoko, Oumar Cheick AU - Mehl, Garrett AU - Lovis, Christian AU - Geissbuhler, Antoine PY - 2023/5/10 TI - iCHECK-DH: Guidelines and Checklist for the Reporting on Digital Health Implementations JO - J Med Internet Res SP - e46694 VL - 25 KW - implementation science KW - knowledge management KW - reporting standards KW - publishing standards KW - guideline KW - Digital Health Hub KW - reporting guideline KW - digital health implementation KW - health outcome N2 - Background: Implementation of digital health technologies has grown rapidly, but many remain limited to pilot studies due to challenges, such as a lack of evidence or barriers to implementation. Overcoming these challenges requires learning from previous implementations and systematically documenting implementation processes to better understand the real-world impact of a technology and identify effective strategies for future implementation. 
Objective: A group of global experts, facilitated by the Geneva Digital Health Hub, developed the Guidelines and Checklist for the Reporting on Digital Health Implementations (iCHECK-DH, pronounced "I checked") to improve the completeness of reporting on digital health implementations. Methods: A guideline development group was convened to define key considerations and criteria for reporting on digital health implementations. To ensure the practicality and effectiveness of the checklist, it was pilot-tested by applying it to several real-world digital health implementations, and adjustments were made based on the feedback received. The guiding principle for the development of iCHECK-DH was to identify the minimum set of information needed to comprehensively define a digital health implementation, to support the identification of key factors for success and failure, and to enable others to replicate it in different settings. Results: The result was a 20-item checklist with detailed explanations and examples in this paper. The authors anticipate that widespread adoption will standardize the quality of reporting and, indirectly, improve implementation standards and best practices. Conclusions: Guidelines for reporting on digital health implementations are important to ensure the accuracy, completeness, and consistency of reported information. This allows for meaningful comparison and evaluation of results, transparency, and accountability and informs stakeholder decision-making. iCHECK-DH facilitates standardization of the way information is collected and reported, improving systematic documentation and knowledge transfer that can lead to the development of more effective digital health interventions and better health outcomes. UR - https://www.jmir.org/2023/1/e46694 UR - http://dx.doi.org/10.2196/46694 UR - http://www.ncbi.nlm.nih.gov/pubmed/37163336 ID - info:doi/10.2196/46694 ER - TY - JOUR AU - Chen, Jinying AU - Cutrona, L. Sarah AU - Dharod, Ajay AU - Bunch, C. 
Stephanie AU - Foley, L. Kristie AU - Ostasiewski, Brian AU - Hale, R. Erica AU - Bridges, Aaron AU - Moses, Adam AU - Donny, C. Eric AU - Sutfin, L. Erin AU - Houston, K. Thomas PY - 2023/3/2 TI - Monitoring the Implementation of Tobacco Cessation Support Tools: Using Novel Electronic Health Record Activity Metrics JO - JMIR Med Inform SP - e43097 VL - 11 KW - medical informatics KW - electronic health records KW - EHR metrics KW - alerts KW - alert burden KW - tobacco cessation KW - monitoring KW - clinical decision support KW - implementation science KW - smoking cessation KW - decision tool N2 - Background: Clinical decision support (CDS) tools in electronic health records (EHRs) are often used as core strategies to support quality improvement programs in the clinical setting. Monitoring the impact (intended and unintended) of these tools is crucial for program evaluation and adaptation. Existing approaches for monitoring typically rely on health care providers' self-reports or direct observation of clinical workflows, which require substantial data collection efforts and are prone to reporting bias. Objective: This study aims to develop a novel monitoring method leveraging EHR activity data and demonstrate its use in monitoring the CDS tools implemented by a tobacco cessation program sponsored by the National Cancer Institute's Cancer Center Cessation Initiative (C3I). Methods: We developed EHR-based metrics to monitor the implementation of two CDS tools: (1) a screening alert reminding clinic staff to complete the smoking assessment and (2) a support alert prompting health care providers to discuss support and treatment options, including referral to a cessation clinic. Using EHR activity data, we measured the completion (encounter-level alert completion rate) and burden (the number of times an alert was fired before completion and time spent handling the alert) of the CDS tools. 
We report metrics tracked for 12 months post implementation, comparing 7 cancer clinics (2 clinics implemented the screening alert and 5 implemented both alerts) within a C3I center, and identify areas to improve alert design and adoption. Results: The screening alert fired in 5121 encounters during the 12 months post implementation. The encounter-level alert completion rate (clinic staff acknowledged completion of screening in EHR: 0.55; clinic staff completed EHR documentation of screening results: 0.32) remained stable over time but varied considerably across clinics. The support alert fired in 1074 encounters during the 12 months. Providers acted upon (ie, not postponed) the support alert in 87.3% (n=938) of encounters, identified a patient ready to quit in 12% (n=129) of encounters, and ordered a referral to the cessation clinic in 2% (n=22) of encounters. With respect to alert burden, on average, both alerts fired over 2 times (screening alert: 2.7; support alert: 2.1) before completion; time spent postponing the screening alert was similar to completing (52 vs 53 seconds) the alert, and time spent postponing the support alert was more than completing (67 vs 50 seconds) the alert per encounter. These findings inform four areas where the alert design and use can be improved: (1) improving alert adoption and completion through local adaptation, (2) improving support alert efficacy by additional strategies including training in provider-patient communication, (3) improving the accuracy of tracking for alert completion, and (4) balancing alert efficacy with the burden. Conclusions: EHR activity metrics were able to monitor the success and burden of tobacco cessation alerts, allowing for a more nuanced understanding of potential trade-offs associated with alert implementation. These metrics can be used to guide implementation adaptation and are scalable across diverse settings. 
UR - https://medinform.jmir.org/2023/1/e43097 UR - http://dx.doi.org/10.2196/43097 UR - http://www.ncbi.nlm.nih.gov/pubmed/36862466 ID - info:doi/10.2196/43097 ER - TY - JOUR AU - Castaño Usuga, Andres Fabian AU - Gissel, Christian AU - Hernández, Mauricio Alher PY - 2022/11/25 TI - Motion Artifact Reduction in Electrocardiogram Signals Through a Redundant Denoising Independent Component Analysis Method for Wearable Health Care Monitoring Systems: Algorithm Development and Validation JO - JMIR Med Inform SP - e40826 VL - 10 IS - 11 KW - signal denoising KW - motion artifacts KW - biomedical signal processing KW - electrocardiogram KW - ECG KW - biomedical monitoring KW - home health care N2 - Background: The quest for improved diagnosis and treatment in home health care models has led to the development of wearable medical devices for remote vital signs monitoring. An accurate signal and a high diagnostic yield are critical for the cost-effectiveness of wearable health care monitoring systems and their widespread application in resource-constrained environments. Despite technological advances, the information acquired by these devices can be contaminated by motion artifacts (MA) leading to misdiagnosis or repeated procedures with increases in associated costs. This makes it necessary to develop methods to improve the quality of the signal acquired by these devices. Objective: We aimed to present a novel method for electrocardiogram (ECG) signal denoising to reduce MA. We aimed to analyze the method's performance and to compare its performance to that of existing approaches. Methods: We present the novel Redundant denoising Independent Component Analysis method for ECG signal denoising based on the redundant and simultaneous acquisition of ECG signals and movement information, multichannel processing, and performance assessment considering the information contained in the signal waveform. 
The method is based on data including ECG signals from the patient's chest and back, the acquisition of triaxial movement signals from inertial measurement units, a reference signal synthesized from an autoregressive model, and the separation of interest and noise sources through multichannel independent component analysis. Results: The proposed method significantly reduced MA, showing better performance and introducing a smaller distortion in the interest signal compared with other methods. Finally, the performance of the proposed method was compared to that of wavelet shrinkage and wavelet independent component analysis through the assessment of signal-to-noise ratio, dynamic time warping, and a proposed index based on the signal waveform evaluation with an ensemble average ECG. Conclusions: Our novel ECG denoising method is a contribution to converting wearable devices into medical monitoring tools that can be used to support the remote diagnosis and monitoring of cardiovascular diseases. A more accurate signal substantially improves the diagnostic yield of wearable devices. A better yield improves the devices' cost-effectiveness and contributes to their widespread application. 
UR - https://medinform.jmir.org/2022/11/e40826 UR - http://dx.doi.org/10.2196/40826 UR - http://www.ncbi.nlm.nih.gov/pubmed/36274196 ID - info:doi/10.2196/40826 ER - TY - JOUR AU - Huang, Yanqun AU - Zheng, Zhimin AU - Ma, Moxuan AU - Xin, Xin AU - Liu, Honglei AU - Fei, Xiaolu AU - Wei, Lan AU - Chen, Hui PY - 2022/8/3 TI - Improving the Performance of Outcome Prediction for Inpatients With Acute Myocardial Infarction Based on Embedding Representation Learned From Electronic Medical Records: Development and Validation Study JO - J Med Internet Res SP - e37486 VL - 24 IS - 8 KW - representation learning KW - skip-gram KW - feature association strengths KW - feature importance KW - mortality risk prediction KW - acute myocardial infarction N2 - Background: The widespread secondary use of electronic medical records (EMRs) promotes health care quality improvement. Representation learning that can automatically extract hidden information from EMR data has gained increasing attention. Objective: We aimed to propose a patient representation with more feature associations and task-specific feature importance to improve the outcome prediction performance for inpatients with acute myocardial infarction (AMI). Methods: Medical concepts, including patients' age, gender, disease diagnoses, laboratory tests, structured radiological features, procedures, and medications, were first embedded into real-value vectors using the improved skip-gram algorithm, where concepts in the context windows were selected by feature association strengths measured by association rule confidence. Then, each patient was represented as the sum of the feature embeddings weighted by the task-specific feature importance, which was applied to facilitate predictive model prediction from global and local perspectives. 
We finally applied the proposed patient representation into mortality risk prediction for 3010 and 1671 AMI inpatients from a public data set and a private data set, respectively, and compared it with several reference representation methods in terms of the area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), and F1-score. Results: Compared with the reference methods, the proposed embedding-based representation showed consistently superior predictive performance on the 2 data sets, achieving mean AUROCs of 0.878 and 0.973, AUPRCs of 0.220 and 0.505, and F1-scores of 0.376 and 0.674 for the public and private data sets, respectively, while the greatest AUROCs, AUPRCs, and F1-scores among the reference methods were 0.847 and 0.939, 0.196 and 0.283, and 0.344 and 0.361 for the public and private data sets, respectively. Feature importance integrated in patient representation reflected features that were also critical in prediction tasks and clinical practice. Conclusions: The introduction of feature associations and feature importance facilitated an effective patient representation and contributed to prediction performance improvement and model interpretation. UR - https://www.jmir.org/2022/8/e37486 UR - http://dx.doi.org/10.2196/37486 UR - http://www.ncbi.nlm.nih.gov/pubmed/35921141 ID - info:doi/10.2196/37486 ER - TY - JOUR AU - He, Xuefei AU - Peng, Cheng AU - Xu, Yingxin AU - Zhang, Ye AU - Wang, Zhongqing PY - 2022/4/21 TI - Global Scientific Research Landscape on Medical Informatics From 2011 to 2020: Bibliometric Analysis JO - JMIR Med Inform SP - e33842 VL - 10 IS - 4 KW - medical informatics KW - bibliometrics KW - VOSviewer KW - data visualization N2 - Background: With the emerging information and communication technology, the field of medical informatics has dramatically evolved in health care and medicine. 
Thus, it is crucial to explore the global scientific research landscape on medical informatics. Objective: This study aims to present a visual form to clarify the overall scientific research trends of medical informatics in the past decade. Methods: A bibliometric analysis of data retrieved and extracted from the Web of Science Core Collection (WoSCC) database was performed to analyze global scientific research trends on medical informatics, including publication year, journals, authors, institutions, countries/regions, references, and keywords, from January 1, 2011, to December 31, 2020. Results: The data set recorded 34,742 articles related to medical informatics from WoSCC between 2011 and 2020. The annual global publications increased by 193.86% from 1987 in 2011 to 5839 in 2020. Journal of Medical Internet Research (3600 publications and 63,932 citations) was the most productive and most highly cited journal in the field of medical informatics. David W Bates (99 publications), Harvard University (1161 publications), and the United States (12,927 publications) were the most productive author, institution, and country, respectively. The co-occurrence cluster analysis of high-frequency author keywords formed 4 clusters: (1) artificial intelligence in health care and medicine; (2) mobile health; (3) implementation and evaluation of electronic health records; (4) medical informatics technology application in public health. COVID-19, which ranked third in 2020, was the emerging theme of medical informatics. Conclusions: We summarize the recent advances in medical informatics in the past decade and shed light on their publication trends, influential journals, global collaboration patterns, basic knowledge, research hotspots, and theme evolution through bibliometric analysis and visualization maps. These findings will accurately and quickly grasp the research trends and provide valuable guidance for future medical informatics research. 
UR - https://medinform.jmir.org/2022/4/e33842 UR - http://dx.doi.org/10.2196/33842 UR - http://www.ncbi.nlm.nih.gov/pubmed/35451986 ID - info:doi/10.2196/33842 ER - TY - JOUR AU - El Emam, Khaled AU - Mosquera, Lucy AU - Fang, Xi AU - El-Hussuna, Alaa PY - 2022/4/7 TI - Utility Metrics for Evaluating Synthetic Health Data Generation Methods: Validation Study JO - JMIR Med Inform SP - e35734 VL - 10 IS - 4 KW - synthetic data KW - data utility KW - data privacy KW - generative models KW - utility metric KW - synthetic data generation KW - logistic regression KW - model validation KW - medical informatics KW - binary prediction model KW - prediction model N2 - Background: A regular task by developers and users of synthetic data generation (SDG) methods is to evaluate and compare the utility of these methods. Multiple utility metrics have been proposed and used to evaluate synthetic data. However, they have not been validated in general or for comparing SDG methods. Objective: This study evaluates the ability of common utility metrics to rank SDG methods according to performance on a specific analytic workload. The workload of interest is the use of synthetic data for logistic regression prediction models, which is a very frequent workload in health research. Methods: We evaluated 6 utility metrics on 30 different health data sets and 3 different SDG methods (a Bayesian network, a Generative Adversarial Network, and sequential tree synthesis). These metrics were computed by averaging across 20 synthetic data sets from the same generative model. The metrics were then tested on their ability to rank the SDG methods based on prediction performance. Prediction performance was defined as the difference between each of the area under the receiver operating characteristic curve and area under the precision-recall curve values on synthetic data logistic regression prediction models versus real data models. 
Results: The utility metric best able to rank SDG methods was the multivariate Hellinger distance based on a Gaussian copula representation of real and synthetic joint distributions. Conclusions: This study has validated a generative model utility metric, the multivariate Hellinger distance, which can be used to reliably rank competing SDG methods on the same data set. The Hellinger distance metric can be used to evaluate and compare alternate SDG methods. UR - https://medinform.jmir.org/2022/4/e35734 UR - http://dx.doi.org/10.2196/35734 UR - http://www.ncbi.nlm.nih.gov/pubmed/35389366 ID - info:doi/10.2196/35734 ER - TY - JOUR AU - Huang, Zonghai AU - Miao, Jiaqing AU - Chen, Ju AU - Zhong, Yanmei AU - Yang, Simin AU - Ma, Yiyi AU - Wen, Chuanbiao PY - 2022/4/6 TI - A Traditional Chinese Medicine Syndrome Classification Model Based on Cross-Feature Generation by Convolution Neural Network: Model Development and Validation JO - JMIR Med Inform SP - e29290 VL - 10 IS - 4 KW - intelligent syndrome differentiation KW - cross-FGCNN KW - TCM N2 - Background: Nowadays, intelligent medicine is gaining widespread attention, and great progress has been made in Western medicine with the help of artificial intelligence to assist in decision making. Compared with Western medicine, traditional Chinese medicine (TCM) involves selecting the specific treatment method, prescription, and medication based on the dialectical results of each patient's symptoms. For this reason, the development of a TCM-assisted decision-making system has lagged. Treatment based on syndrome differentiation is the core of TCM treatment; TCM doctors can dialectically classify diseases according to patients' symptoms and optimize treatment in time. Therefore, the essence of a TCM-assisted decision-making system is a TCM intelligent, dialectical algorithm. Symptoms stored in electronic medical records are mostly associated with patients' diseases; however, symptoms of TCM are mostly subjectively identified. 
In general electronic medical records, there are many missing values. TCM medical records, in which symptoms tend to cause high-dimensional sparse data, reduce algorithm accuracy. Objective: This study aims to construct an algorithm model compatible for the multidimensional, highly sparse, and multiclassification task of TCM syndrome differentiation, so that it can be effectively applied to the intelligent dialectic of different diseases. Methods: The relevant terms in electronic medical records were standardized with respect to symptoms and evidence-based criteria of TCM. We structuralized case data based on the classification of different symptoms and physical signs according to the 4 diagnostic examinations in TCM diagnosis. A novel cross-feature generation by convolution neural network model performed evidence-based recommendations based on the input embedded, structured medical record data. Results: The data set included 5273 real dysmenorrhea cases from the Sichuan TCM big data management platform and the Chinese literature database, which were embedded into 60 fields after being structured and standardized. The training set and test set were randomly constructed in a ratio of 3:1. For the classification of different syndrome types, compared with 6 traditional, intelligent dialectical models and 3 click-through-rate models, the new model showed a good generalization ability and good classification effect. The comprehensive accuracy rate reached 96.21%. Conclusions: The main contribution of this study is the construction of a new intelligent dialectical model combining the characteristics of TCM by treating intelligent dialectics as a high-dimensional sparse vector classification task. Owing to the standardization of the input symptoms, all the common symptoms of TCM are covered, and the model can differentiate the symptoms with a variety of missing values. 
Therefore, with the continuous improvement of disease data sets, this model has the potential to be applied to the dialectical classification of different diseases in TCM. UR - https://medinform.jmir.org/2022/4/e29290 UR - http://dx.doi.org/10.2196/29290 UR - http://www.ncbi.nlm.nih.gov/pubmed/35384854 ID - info:doi/10.2196/29290 ER - TY - JOUR AU - Jung, Hyesil AU - Yoo, Sooyoung AU - Kim, Seok AU - Heo, Eunjeong AU - Kim, Borham AU - Lee, Ho-Young AU - Hwang, Hee PY - 2022/3/11 TI - Patient-Level Fall Risk Prediction Using the Observational Medical Outcomes Partnership's Common Data Model: Pilot Feasibility Study JO - JMIR Med Inform SP - e35104 VL - 10 IS - 3 KW - common data model KW - accidental falls KW - Observational Medical Outcomes Partnership KW - nursing records KW - medical informatics KW - health data KW - electronic health record KW - data model KW - prediction model KW - risk prediction KW - fall risk N2 - Background: Falls in acute care settings threaten patients' safety. Researchers have been developing fall risk prediction models and exploring risk factors to provide evidence-based fall prevention practices; however, such efforts are hindered by insufficient samples, limited covariates, and a lack of standardized methodologies that aid study replication. Objective: The objectives of this study were to (1) convert fall-related electronic health record data into the standardized Observational Medical Outcomes Partnership (OMOP) common data model format and (2) develop models that predict fall risk during 2 time periods. Methods: As a pilot feasibility test, we converted fall-related electronic health record data (nursing notes, fall risk assessment sheet, patient acuity assessment sheet, and clinical observation sheet) into standardized OMOP common data model format using an extraction, transformation, and load process. 
We developed fall risk prediction models for 2 time periods (within 7 days of admission and during the entire hospital stay) using 2 algorithms (least absolute shrinkage and selection operator logistic regression and random forest). Results: In total, 6277 nursing statements, 747,049,486 clinical observation sheet records, 1,554,775 fall risk scores, and 5,685,011 patient acuity scores were converted into OMOP common data model format. All our models (area under the receiver operating characteristic curve 0.692-0.726) performed better than the Hendrich II Fall Risk Model. Patient acuity score, fall history, age ≥60 years, movement disorder, and central nervous system agents were the most important predictors in the logistic regression models. Conclusions: To enhance model performance further, we are currently converting all nursing records into the OMOP common data model data format, which will then be included in the models. Thus, in the near future, the performance of fall risk prediction models could be improved through the application of abundant nursing records and external validation. UR - https://medinform.jmir.org/2022/3/e35104 UR - http://dx.doi.org/10.2196/35104 UR - http://www.ncbi.nlm.nih.gov/pubmed/35275076 ID - info:doi/10.2196/35104 ER - TY - JOUR AU - Hong, Na AU - Liu, Chun AU - Gao, Jianwei AU - Han, Lin AU - Chang, Fengxiang AU - Gong, Mengchun AU - Su, Longxiang PY - 2022/3/3 TI - State of the Art of Machine Learning-Enabled Clinical Decision Support in Intensive Care Units: Literature Review JO - JMIR Med Inform SP - e28781 VL - 10 IS - 3 KW - machine learning KW - intensive care units KW - clinical decision support KW - prediction model KW - artificial intelligence KW - electronic health records N2 - Background: Modern clinical care in intensive care units is full of rich data, and machine learning has great potential to support clinical decision-making. 
The development of intelligent machine learning-based clinical decision support systems is facing great opportunities and challenges. Clinical decision support systems may directly help clinicians accurately diagnose, predict outcomes, identify risk events, or decide treatments at the point of care. Objective: We aimed to review the research and application of machine learning-enabled clinical decision support studies in intensive care units to help clinicians, researchers, developers, and policy makers better understand the advantages and limitations of machine learning-supported diagnosis, outcome prediction, risk event identification, and intensive care unit point-of-care recommendations. Methods: We searched papers published in the PubMed database between January 1980 and October 2020. We defined selection criteria to identify papers that focused on machine learning-enabled clinical decision support studies in intensive care units and reviewed the following aspects: research topics, study cohorts, machine learning models, analysis variables, and evaluation metrics. Results: A total of 643 papers were collected, and using our selection criteria, 97 studies were found. Studies were categorized into 4 topics: monitoring, detection, and diagnosis (13/97, 13.4%), early identification of clinical events (32/97, 33.0%), outcome prediction and prognosis assessment (46/97, 47.6%), and treatment decision (6/97, 6.2%). Of the 97 papers, 82 (84.5%) studies used data from adult patients, 9 (9.3%) studies used data from pediatric patients, and 6 (6.2%) studies used data from neonates. We found that 65 (67.0%) studies used data from a single center, and 32 (33.0%) studies used a multicenter data set; 88 (90.7%) studies used supervised learning, 3 (3.1%) studies used unsupervised learning, and 6 (6.2%) studies used reinforcement learning. 
Clinical variable categories, starting with the most frequently used, were demographic (n=74), laboratory values (n=59), vital signs (n=55), scores (n=48), ventilation parameters (n=43), comorbidities (n=27), medications (n=18), outcome (n=14), fluid balance (n=13), nonmedicine therapy (n=10), symptoms (n=7), and medical history (n=4). The most frequently adopted evaluation metrics for clinical data modeling studies included area under the receiver operating characteristic curve (n=61), sensitivity (n=51), specificity (n=41), accuracy (n=29), and positive predictive value (n=23). Conclusions: Early identification of clinical events, together with outcome prediction and prognosis assessment, accounted for approximately 80% of the studies included in this review. Developing reinforcement learning, active learning, and time-series analysis methods to solve intensive care unit clinical problems offers strong prospects for future clinical decision support. UR - https://medinform.jmir.org/2022/3/e28781 UR - http://dx.doi.org/10.2196/28781 UR - http://www.ncbi.nlm.nih.gov/pubmed/35238790 ID - info:doi/10.2196/28781 ER - TY - JOUR AU - Bucalon, Bernard AU - Shaw, Tim AU - Brown, Kerri AU - Kay, Judy PY - 2022/2/14 TI - State-of-the-art Dashboards on Clinical Indicator Data to Support Reflection on Practice: Scoping Review JO - JMIR Med Inform SP - e32695 VL - 10 IS - 2 KW - practice analytics dashboards KW - data visualization KW - reflective practice KW - professional learning KW - mobile phone N2 - Background: There is an increasing interest in using routinely collected eHealth data to support reflective practice and long-term professional learning. Studies have evaluated the impact of dashboards on clinician decision-making, task completion time, user satisfaction, and adherence to clinical guidelines. 
Objective: This scoping review aims to summarize the literature on dashboards based on patient administrative, medical, and surgical data for clinicians to support reflective practice. Methods: A scoping review was conducted using the Arksey and O'Malley framework. A search was conducted in 5 electronic databases (MEDLINE, Embase, Scopus, ACM Digital Library, and Web of Science) to identify studies that met the inclusion criteria. Study selection and characterization were performed by 2 independent reviewers (BB and CP). One reviewer extracted the data that were analyzed descriptively to map the available evidence. Results: A total of 18 dashboards from 8 countries were assessed. The dashboards were designed for performance improvement (10/18, 56%), to support quality and safety initiatives (6/18, 33%), and for management and operations (4/18, 22%). Data visualizations were primarily designed for team use (12/18, 67%) rather than individual clinicians (4/18, 22%). Evaluation methods varied among asking the clinicians directly (11/18, 61%), observing user behavior through clinical indicators and use log data (14/18, 78%), and usability testing (4/18, 22%). The studies reported high scores on standard usability questionnaires, favorable surveys, and interview feedback. Improvements to underlying clinical indicators were observed in 78% (7/9) of the studies, whereas 22% (2/9) of the studies reported no significant changes in performance. Conclusions: This scoping review maps the current literature landscape on dashboards based on routinely collected clinical indicator data. Although there were common data visualization techniques and clinical indicators used across studies, there was diversity in the design of the dashboards and their evaluation. There was a lack of detail regarding the design processes documented for reproducibility. 
We identified a lack of interface features to support clinicians in making sense of and reflecting on their personal performance data. UR - https://medinform.jmir.org/2022/2/e32695 UR - http://dx.doi.org/10.2196/32695 UR - http://www.ncbi.nlm.nih.gov/pubmed/35156928 ID - info:doi/10.2196/32695 ER - TY - JOUR AU - Oberschmidt, Kira AU - Grünloh, Christiane AU - Nijboer, Femke AU - van Velsen, Lex PY - 2022/1/28 TI - Best Practices and Lessons Learned for Action Research in eHealth Design and Implementation: Literature Review JO - J Med Internet Res SP - e31795 VL - 24 IS - 1 KW - action research KW - eHealth KW - best practices KW - lessons learned N2 - Background: Action research (AR) is an established research framework to introduce change in a community following a cyclical approach and involving stakeholders as coresearchers in the process. In recent years, it has also been used for eHealth development. However, little is known about the best practices and lessons learned from using AR for eHealth development. Objective: This literature review aims to provide more knowledge on the best practices and lessons learned from eHealth AR studies. Additionally, an overview of the context in which AR eHealth studies take place is given. Methods: A semisystematic review of 44 papers reporting on 40 different AR projects was conducted to identify the best practices and lessons learned in the research studies while accounting for the particular contextual setting and used AR approach. Results: Recommendations include paying attention to the training of stakeholders' academic skills, as well as the various roles and tasks of action researchers. The studies also highlight the need for constant reflection and accessible dissemination suiting the target group. Conclusions: This literature review identified room for improvements regarding communicating and specifying the particular AR definition and applied approach. 
UR - https://www.jmir.org/2022/1/e31795 UR - http://dx.doi.org/10.2196/31795 UR - http://www.ncbi.nlm.nih.gov/pubmed/35089158 ID - info:doi/10.2196/31795 ER - TY - JOUR AU - Han, Wenting AU - Han, Xi AU - Zhou, Sijia AU - Zhu, Qinghua PY - 2022/1/27 TI - The Development History and Research Tendency of Medical Informatics: Topic Evolution Analysis JO - JMIR Med Inform SP - e31918 VL - 10 IS - 1 KW - medical informatics KW - research hotspot KW - LDA model KW - topic evolution analysis KW - mobile phone N2 - Background: Medical informatics has attracted the attention of researchers worldwide. It is necessary to understand the development of its research hot spots as well as directions for future research. Objective: The aim of this study is to explore the evolution of medical informatics research topics by analyzing research articles published between 1964 and 2020. Methods: A total of 56,466 publications were collected from 27 representative medical informatics journals indexed by the Web of Science Core Collection. We identified the research stages based on the literature growth curve, extracted research topics using the latent Dirichlet allocation model, and analyzed topic evolution patterns by calculating the cosine similarity between topics from the adjacent stages. Results: The following three research stages were identified: early birth, early development, and rapid development. Medical informatics has entered the fast development stage, with literature growing exponentially. Research topics in medical informatics can be classified into the following two categories: data-centered studies and people-centered studies. Medical data analysis has been a research hot spot across all 3 stages, and the integration of emerging technologies into data analysis might be a future hot spot. Researchers have focused more on user needs in the last 2 stages. Another potential hot spot might be how to meet user needs and improve the usability of health tools. 
Conclusions: Our study provides a comprehensive understanding of research hot spots in medical informatics, as well as evolution patterns among them, which is helpful for researchers to grasp research trends and design their studies. UR - https://medinform.jmir.org/2022/1/e31918 UR - http://dx.doi.org/10.2196/31918 UR - http://www.ncbi.nlm.nih.gov/pubmed/35084351 ID - info:doi/10.2196/31918 ER - TY - JOUR AU - Chen, Qingyu AU - Rankine, Alex AU - Peng, Yifan AU - Aghaarabi, Elaheh AU - Lu, Zhiyong PY - 2021/12/30 TI - Benchmarking Effectiveness and Efficiency of Deep Learning Models for Semantic Textual Similarity in the Clinical Domain: Validation Study JO - JMIR Med Inform SP - e27386 VL - 9 IS - 12 KW - semantic textual similarity KW - deep learning KW - biomedical and clinical text mining KW - word embeddings KW - sentence embeddings KW - transformers N2 - Background: Semantic textual similarity (STS) measures the degree of relatedness between sentence pairs. The Open Health Natural Language Processing (OHNLP) Consortium released an expertly annotated STS data set and called for the National Natural Language Processing Clinical Challenges. This work describes our entry, an ensemble model that leverages a range of deep learning (DL) models. Our team from the National Library of Medicine obtained a Pearson correlation of 0.8967 in an official test set during 2019 National Natural Language Processing Clinical Challenges/Open Health Natural Language Processing shared task and achieved a second rank. Objective: Although our models strongly correlate with manual annotations, annotator-level correlation was only moderate (weighted Cohen κ=0.60). We are cautious of the potential use of DL models in production systems and argue that it is more critical to evaluate the models in-depth, especially those with extremely high correlations. In this study, we benchmark the effectiveness and efficiency of top-ranked DL models. 
We quantify their robustness and inference times to validate their usefulness in real-time applications. Methods: We benchmarked five DL models, which are the top-ranked systems for STS tasks: Convolutional Neural Network, BioSentVec, BioBERT, BlueBERT, and ClinicalBERT. We evaluated a random forest model as an additional baseline. For each model, we repeated the experiment 10 times, using the official training and testing sets. We reported 95% CI of the Wilcoxon rank-sum test on the average Pearson correlation (official evaluation metric) and running time. We further evaluated Spearman correlation, R², and mean squared error as additional measures. Results: Using only the official training set, all models obtained highly effective results. BioSentVec and BioBERT achieved the highest average Pearson correlations (0.8497 and 0.8481, respectively). BioSentVec also had the highest results in 3 of 4 effectiveness measures, followed by BioBERT. However, their robustness to sentence pairs of different similarity levels varies significantly. A particular observation is that BERT models made the most errors (a mean squared error of over 2.5) on highly similar sentence pairs. They cannot capture highly similar sentence pairs effectively when they have different negation terms or word orders. In addition, time efficiency is dramatically different from the effectiveness results. On average, the BERT models were approximately 20 times and 50 times slower than the Convolutional Neural Network and BioSentVec models, respectively. This results in challenges for real-time applications. Conclusions: Despite the excitement of further improving Pearson correlations in this data set, our results highlight that evaluations of the effectiveness and efficiency of STS models are critical. In future, we suggest more evaluations on the generalization capability and user-level testing of the models. 
We call for community efforts to create more biomedical and clinical STS data sets from different perspectives to reflect the multifaceted notion of sentence-relatedness. UR - https://medinform.jmir.org/2021/12/e27386 UR - http://dx.doi.org/10.2196/27386 UR - http://www.ncbi.nlm.nih.gov/pubmed/34967748 ID - info:doi/10.2196/27386 ER - TY - JOUR AU - Hah, Hyeyoung AU - Goldin, Shevit Deana PY - 2021/12/16 TI - How Clinicians Perceive Artificial Intelligence-Assisted Technologies in Diagnostic Decision Making: Mixed Methods Approach JO - J Med Internet Res SP - e33540 VL - 23 IS - 12 KW - artificial intelligence algorithms KW - AI KW - diagnostic capability KW - virtual care KW - multilevel modeling KW - human-AI teaming KW - natural language understanding N2 - Background: With the rapid development of artificial intelligence (AI) and related technologies, AI algorithms are being embedded into various health information technologies that assist clinicians in clinical decision making. Objective: This study aimed to explore how clinicians perceive AI assistance in diagnostic decision making and suggest the paths forward for AI-human teaming for clinical decision making in health care. Methods: This study used a mixed methods approach, utilizing hierarchical linear modeling and sentiment analysis through natural language understanding techniques. Results: A total of 114 clinicians participated in online simulation surveys in 2020 and 2021. These clinicians studied family medicine and used AI algorithms to aid in patient diagnosis. Their overall sentiment toward AI-assisted diagnosis was positive and comparable with diagnoses made without the assistance of AI. However, AI-guided decision making was not congruent with the way clinicians typically made decisions in diagnosing illnesses. 
In a quantitative survey, clinicians reported perceiving current AI assistance as not likely to enhance diagnostic capability and as negatively influencing their overall performance (β=−0.421, P=.02). Instead, clinicians' diagnostic capabilities tended to be associated with well-known parameters, such as education, age, and daily habit of technology use on social media platforms. Conclusions: This study elucidated clinicians' current perceptions and sentiments toward AI-enabled diagnosis. Although the sentiment was positive, the current form of AI assistance may not be linked with efficient decision making, as AI algorithms are not well aligned with subjective human reasoning in clinical diagnosis. Developers and policy makers in health could gather behavioral data from clinicians in various disciplines to help align AI algorithms with the unique subjective patterns of reasoning that humans employ in clinical diagnosis. UR - https://www.jmir.org/2021/12/e33540 UR - http://dx.doi.org/10.2196/33540 UR - http://www.ncbi.nlm.nih.gov/pubmed/34924356 ID - info:doi/10.2196/33540 ER - TY - JOUR AU - Ploug, Thomas AU - Sundby, Anna AU - Moeslund, B. Thomas AU - Holm, Søren PY - 2021/12/13 TI - Population Preferences for Performance and Explainability of Artificial Intelligence in Health Care: Choice-Based Conjoint Survey JO - J Med Internet Res SP - e26611 VL - 23 IS - 12 KW - artificial Intelligence KW - performance KW - transparency KW - explainability KW - population preferences KW - public policy N2 - Background: Certain types of artificial intelligence (AI), that is, deep learning models, can outperform health care professionals in particular domains. Such models hold considerable promise for improved diagnostics, treatment, and prevention, as well as more cost-efficient health care. They are, however, opaque in the sense that their exact reasoning cannot be fully explicated. 
Different stakeholders have emphasized the importance of the transparency/explainability of AI decision making. Transparency/explainability may come at the cost of performance. There is need for a public policy regulating the use of AI in health care that balances the societal interests in high performance as well as in transparency/explainability. A public policy should consider the wider public's interests in such features of AI. Objective: This study elicited the public's preferences for the performance and explainability of AI decision making in health care and determined whether these preferences depend on respondent characteristics, including trust in health and technology and fears and hopes regarding AI. Methods: We conducted a choice-based conjoint survey of public preferences for attributes of AI decision making in health care in a representative sample of the adult Danish population. Initial focus group interviews yielded 6 attributes playing a role in the respondents' views on the use of AI decision support in health care: (1) type of AI decision, (2) level of explanation, (3) performance/accuracy, (4) responsibility for the final decision, (5) possibility of discrimination, and (6) severity of the disease to which the AI is applied. In total, 100 unique choice sets were developed using fractional factorial design. In a 12-task survey, respondents were asked about their preference for AI system use in hospitals in relation to 3 different scenarios. Results: Of the 1678 potential respondents, 1027 (61.2%) participated. The respondents consider the physician having the final responsibility for treatment decisions the most important attribute, with 46.8% of the total weight of attributes, followed by explainability of the decision (27.3%) and whether the system has been tested for discrimination (14.8%). Other factors, such as gender, age, level of education, whether respondents live rurally or in towns, respondents' 
trust in health and technology, and respondents' fears and hopes regarding AI, do not play a significant role in the majority of cases. Conclusions: The 3 factors that are most important to the public are, in descending order of importance, (1) that physicians are ultimately responsible for diagnostics and treatment planning, (2) that the AI decision support is explainable, and (3) that the AI system has been tested for discrimination. Public policy on AI system use in health care should give priority to such AI system use and ensure that patients are provided with information. UR - https://www.jmir.org/2021/12/e26611 UR - http://dx.doi.org/10.2196/26611 UR - http://www.ncbi.nlm.nih.gov/pubmed/34898454 ID - info:doi/10.2196/26611 ER - TY - JOUR AU - Stöhr, R. Mark AU - Günther, Andreas AU - Majeed, W. Raphael PY - 2021/11/29 TI - The Collaborative Metadata Repository (CoMetaR) Web App: Quantitative and Qualitative Usability Evaluation JO - JMIR Med Inform SP - e30308 VL - 9 IS - 11 KW - usability KW - metadata KW - data visualization KW - semantic web KW - data management KW - data warehousing KW - communication barriers KW - quality improvement KW - biological ontologies KW - data curation N2 - Background: In the field of medicine and medical informatics, the importance of comprehensive metadata has long been recognized, and the composition of metadata has become its own field of profession and research. To ensure sustainable and meaningful metadata are maintained, standards and guidelines such as the FAIR (Findability, Accessibility, Interoperability, Reusability) principles have been published. The compilation and maintenance of metadata is performed by field experts supported by metadata management apps. The usability of these apps, for example, in terms of ease of use, efficiency, and error tolerance, crucially determines their benefit to those interested in the data. 
Objective: This study aims to provide a metadata management app with high usability that assists scientists in compiling and using rich metadata. We aim to evaluate our recently developed interactive web app for our collaborative metadata repository (CoMetaR). This study reflects how real users perceive the app by assessing usability scores and explicit usability issues. Methods: We evaluated the CoMetaR web app by measuring the usability of 3 modules: core module, provenance module, and data integration module. We defined 10 tasks in which users must acquire information specific to their user role. The participants were asked to complete the tasks in a live web meeting. We used the System Usability Scale questionnaire to measure the usability of the app. For qualitative analysis, we applied a modified think aloud method with the following thematic analysis and categorization into the ISO 9241-110 usability categories. Results: A total of 12 individuals participated in the study. We found that over 97% (85/88) of all the tasks were completed successfully. We measured usability scores of 81, 81, and 72 for the 3 evaluated modules. The qualitative analysis resulted in 24 issues with the app. Conclusions: A usability score of 81 implies very good usability for the 2 modules, whereas a usability score of 72 still indicates acceptable usability for the third module. We identified 24 issues that serve as starting points for further development. Our method proved to be effective and efficient in terms of effort and outcome. It can be adapted to evaluate apps within the medical informatics field and potentially beyond. 
UR - https://medinform.jmir.org/2021/11/e30308 UR - http://dx.doi.org/10.2196/30308 UR - http://www.ncbi.nlm.nih.gov/pubmed/34847059 ID - info:doi/10.2196/30308 ER - TY - JOUR AU - Pankhurst, Tanya AU - Evison, Felicity AU - Atia, Jolene AU - Gallier, Suzy AU - Coleman, Jamie AU - Ball, Simon AU - McKee, Deborah AU - Ryan, Steven AU - Black, Ruth PY - 2021/11/23 TI - Introduction of Systematized Nomenclature of Medicine-Clinical Terms Coding Into an Electronic Health Record and Evaluation of its Impact: Qualitative and Quantitative Study JO - JMIR Med Inform SP - e29532 VL - 9 IS - 11 KW - coding standards KW - clinical decision support KW - Clinician led design KW - clinician reported experience KW - clinical usability KW - data sharing KW - diagnoses KW - electronic health records KW - electronic health record standards KW - health data exchange KW - health data research KW - International Classification of Diseases version 10 (ICD-10) KW - National Health Service Blueprint KW - patient diagnoses KW - population health KW - problem list KW - research KW - Systematized Nomenclature Of Medicine-Clinical Terms (SNOMED-CT) KW - use of electronic health data KW - user-led design N2 - Background: This study describes the conversion within an existing electronic health record (EHR) from the International Classification of Diseases, Tenth Revision coding system to the SNOMED-CT (Systematized Nomenclature of Medicine-Clinical Terms) for the collection of patient histories and diagnoses. The setting is a large acute hospital that is designing and building its own EHR. Well-designed EHRs create opportunities for continuous data collection, which can be used in clinical decision support rules to drive patient safety. Collected data can be exchanged across health care systems to support patients in all health care settings. Data can be used for research to prevent diseases and protect future populations. 
Objective: The aim of this study was to migrate a current EHR, with all relevant patient data, to the SNOMED-CT coding system to optimize clinical use and clinical decision support, facilitate data sharing across organizational boundaries for national programs, and enable remodeling of medical pathways. Methods: The study used qualitative and quantitative data to understand the successes and gaps in the project, clinician attitudes toward the new tool, and the future use of the tool. Results: The new coding system (tool) was well received and immediately widely used in all specialties. This resulted in increased, accurate, and clinically relevant data collection. Clinicians appreciated the increased depth and detail of the new coding, welcomed the potential for both data sharing and research, and provided extensive feedback for further development. Conclusions: Successful implementation of the new system aligned the University Hospitals Birmingham NHS Foundation Trust with national strategy and can be used as a blueprint for similar projects in other health care settings. UR - https://medinform.jmir.org/2021/11/e29532 UR - http://dx.doi.org/10.2196/29532 UR - http://www.ncbi.nlm.nih.gov/pubmed/34817387 ID - info:doi/10.2196/29532 ER - TY - JOUR AU - Rajamani, Geetanjali AU - Rodriguez Espinosa, Patricia AU - Rosas, G. Lisa PY - 2021/11/19 TI - Intersection of Health Informatics Tools and Community Engagement in Health-Related Research to Reduce Health Inequities: Scoping Review JO - J Particip Med SP - e30062 VL - 13 IS - 3 KW - community engagement KW - stakeholder involvement KW - underserved communities KW - health informatics KW - health information technology KW - health inequities KW - health-related research N2 - Background: The exponential growth of health information technology has the potential to facilitate community engagement in research. 
However, little is known about the use of health information technology in community-engaged research, such as which types of health information technology are used, which populations are engaged, and what are the research outcomes. Objective: The objectives of this scoping review were to examine studies that used health information technology for community engagement and to assess (1) the types of populations, (2) community engagement strategies, (3) types of health information technology tools, and (4) outcomes of interest. Methods: We searched PubMed and PCORI Literature Explorer using terms related to health information technology, health informatics, community engagement, and stakeholder involvement. This search process yielded 967 papers for screening. After inclusion and exclusion criteria were applied, a total of 37 papers were analyzed for key themes and for approaches relevant to health information technology and community engagement research. Results: This analysis revealed that the communities engaged were generally underrepresented populations in health-related research, including racial or ethnic minority communities such as Black/African American, American Indian/Alaska Native, Latino ethnicity, and communities from low socioeconomic backgrounds. The studies focused on various age groups, ranging from preschoolers to older adults. The studies were also geographically spread across the United States and the world. Community engagement strategies included collaborative development of health information technology tools and partnerships to promote use (encompassing collaborative development, use of community advisory boards, and focus groups for eliciting information needs) and use of health information technology to engage communities in research (eg, through citizen science). The types of technology varied across studies, with mobile or tablet-based apps being the most common platform. 
Outcomes measured included eliciting user needs and requirements, assessing health information technology tools and prototypes with participants, measuring knowledge, and advocating for community change. Conclusions: This study illustrates the current landscape at the intersection of health information technology tools and community-engaged research approaches. It highlights studies in which various community-engaged research approaches were used to design culturally centered health information technology tools, to promote health information technology uptake, or for engagement in health research and advocacy. Our findings can serve as a platform for generating future research upon which to expand the scope of health information technology tools and their use for meaningful stakeholder engagement. Studies that incorporate community context and needs have a greater chance of cocreating culturally centered health information technology tools and better knowledge to promote action and improve health outcomes. UR - https://jopm.jmir.org/2021/3/e30062 UR - http://dx.doi.org/10.2196/30062 UR - http://www.ncbi.nlm.nih.gov/pubmed/34797214 ID - info:doi/10.2196/30062 ER - TY - JOUR AU - Kim, Taewoo AU - Lee, Hyun Dong AU - Park, Eun-Kee AU - Choi, Sanghun PY - 2021/11/18 TI - Deep Learning Techniques for Fatty Liver Using Multi-View Ultrasound Images Scanned by Different Scanners: Development and Validation Study JO - JMIR Med Inform SP - e30066 VL - 9 IS - 11 KW - fatty liver KW - deep learning KW - transfer learning KW - classification KW - regression KW - magnetic resonance imaging-proton density fat fraction KW - multi-view ultrasound images KW - artificial intelligence KW - machine imaging KW - imaging KW - informatics KW - fatty liver disease KW - detection KW - diagnosis N2 - Background: Fat fraction values obtained from magnetic resonance imaging (MRI) can be used to obtain an accurate diagnosis of fatty liver diseases. 
However, MRI is expensive and cannot be performed for everyone. Objective: In this study, we aim to develop multi-view ultrasound image?based convolutional deep learning models to detect fatty liver disease and yield fat fraction values. Methods: We extracted 90 ultrasound images of the right intercostal view and 90 ultrasound images of the right intercostal view containing the right renal cortex from 39 cases of fatty liver (MRI?proton density fat fraction [MRI?PDFF] ? 5%) and 51 normal subjects (MRI?PDFF < 5%), with MRI?PDFF values obtained from Good Gang-An Hospital. We obtained combined liver and kidney-liver (CLKL) images to train the deep learning models and developed classification and regression models based on the VGG19 model to classify fatty liver disease and yield fat fraction values. We employed the data augmentation techniques such as flip and rotation to prevent the deep learning model from overfitting. We determined the deep learning model with performance metrics such as accuracy, sensitivity, specificity, and coefficient of determination (R2). Results: In demographic information, all metrics such as age and sex were similar between the two groups?fatty liver disease and normal subjects. In classification, the model trained on CLKL images achieved 80.1% accuracy, 86.2% precision, and 80.5% specificity to detect fatty liver disease. In regression, the predicted fat fraction values of the regression model trained on CLKL images correlated with MRI?PDFF values (R2=0.633), indicating that the predicted fat fraction values were moderately estimated. Conclusions: With deep learning techniques and multi-view ultrasound images, it is potentially possible to replace MRI?PDFF values with deep learning predictions for detecting fatty liver disease and estimating fat fraction values. 
UR - https://medinform.jmir.org/2021/11/e30066 UR - http://dx.doi.org/10.2196/30066 UR - http://www.ncbi.nlm.nih.gov/pubmed/34792476 ID - info:doi/10.2196/30066 ER - TY - JOUR AU - Knapp, Andreas AU - Harst, Lorenz AU - Hager, Stefan AU - Schmitt, Jochen AU - Scheibe, Madlen PY - 2021/11/17 TI - Use of Patient-Reported Outcome Measures and Patient-Reported Experience Measures Within Evaluation Studies of Telemedicine Applications: Systematic Review JO - J Med Internet Res SP - e30042 VL - 23 IS - 11 KW - telemedicine KW - telehealth KW - evaluation KW - outcome KW - patient-reported outcome measures KW - patient-reported outcome KW - patient-reported experience measures KW - patient-reported experience KW - measurement instrument KW - questionnaire N2 - Background: With the rise of digital health technologies and telemedicine, the need for evidence-based evaluation is growing. Patient-reported outcome measures (PROMs) and patient-reported experience measures (PREMs) are recommended as an essential part of the evaluation of telemedicine. For the first time, a systematic review has been conducted to investigate the use of PROMs and PREMs in the evaluation studies of telemedicine covering all application types and medical purposes. Objective: This study investigates the following research questions: in which scenarios are PROMs and PREMs collected for evaluation purposes, which PROM and PREM outcome domains have been covered and how often, which outcome measurement instruments have been used and how often, does the selection and quantity of PROMs and PREMs differ between study types and application types, and has the use of PROMs and PREMs changed over time. Methods: We conducted a systematic literature search of the MEDLINE and Embase databases and included studies published from inception until April 2, 2020. 
We included studies evaluating telemedicine with patients as the main users; these studies reported PROMs and PREMs within randomized controlled trials, controlled trials, noncontrolled trials, and feasibility trials in English and German. Results: Of the identified 2671 studies, 303 (11.34%) were included; of the 303 studies, 67 (22.1%) were feasibility studies, 70 (23.1%) were noncontrolled trials, 20 (6.6%) were controlled trials, and 146 (48.2%) were randomized controlled trials. Health-related quality of life (n=310; mean 1.02, SD 1.05), emotional function (n=244; mean 0.81, SD 1.18), and adherence (n=103; mean 0.34, SD 0.53) were the most frequently assessed outcome domains. Self-developed PROMs were used in 21.4% (65/303) of the studies, and self-developed PREMs were used in 22.3% (68/303). PROMs (n=884) were assessed more frequently than PREMs (n=234). As the evidence level of the studies increased, the number of PROMs also increased (?=?0.45), and the number of PREMs decreased (?=0.35). Since 2000, not only has the number of studies using PROMs and PREMs increased, but the level of evidence and the number of outcome measurement instruments used have also increased, with the number of PREMs permanently remaining at a lower level. Conclusions: There have been increasingly more studies, particularly high-evidence studies, which use PROMs and PREMs to evaluate telemedicine. PROMs have been used more frequently than PREMs. With the increasing maturity stage of telemedicine applications and higher evidence level, the use of PROMs increased in line with the recommendations of evaluation guidelines. Health-related quality of life and emotional function were measured in almost all the studies. Simultaneously, health literacy as a precondition for using the application adequately, alongside proper training and guidance, has rarely been reported. Further efforts should be pursued to standardize PROM and PREM collection in evaluation studies of telemedicine. 
UR - https://www.jmir.org/2021/11/e30042 UR - http://dx.doi.org/10.2196/30042 UR - http://www.ncbi.nlm.nih.gov/pubmed/34523604 ID - info:doi/10.2196/30042 ER - TY - JOUR AU - Hammam, Nevin AU - Izadi, Zara AU - Li, Jing AU - Evans, Michael AU - Kay, Julia AU - Shiboski, Stephen AU - Schmajuk, Gabriela AU - Yazdany, Jinoos PY - 2021/11/12 TI - The Relationship Between Electronic Health Record System and Performance on Quality Measures in the American College of Rheumatology's Rheumatology Informatics System for Effectiveness (RISE) Registry: Observational Study JO - JMIR Med Inform SP - e31186 VL - 9 IS - 11 KW - rheumatoid arthritis KW - electronic health record KW - patient-reported outcomes KW - quality measures KW - disease activity KW - quality of care KW - performance reporting KW - medical informatics KW - clinical informatics N2 - Background: Routine collection of disease activity (DA) and patient-reported outcomes (PROs) in rheumatoid arthritis (RA) are nationally endorsed quality measures and critical components of a treat-to-target approach. However, little is known about the role electronic health record (EHR) systems play in facilitating performance on these measures. Objective: Using the American College of Rheumatology's (ACR's) RISE registry, we analyzed the relationship between EHR system and performance on DA and functional status (FS) quality measures. Methods: We analyzed data collected in 2018 from practices enrolled in RISE. We assessed practice-level performance on quality measures that require DA and FS documentation. Multivariable linear regression and zero-inflated negative binomial models were used to examine the independent effect of EHR system on practice-level quality measure performance, adjusting for practice characteristics and patient case-mix. Results: In total, 220 included practices cared for 314,793 patients with RA. NextGen was the most commonly used EHR system (34.1%). 
We found wide variation in performance on DA and FS quality measures by EHR system (median 30.1, IQR 0-74.8, and median 9.0, IQR 0-74.2, respectively). Even after adjustment, NextGen practices performed significantly better than Allscripts on the DA measure (51.4% vs 5.0%; P<.05) and significantly better than eClinicalWorks and eMDs on the FS measure (49.3% vs 29.0% and 10.9%; P<.05). Conclusions: Performance on national RA quality measures was associated with the EHR system, even after adjusting for practice and patient characteristics. These findings suggest that future efforts to improve quality of care in RA should focus not only on provider performance reporting but also on developing and implementing rheumatology-specific standards across EHRs. UR - https://medinform.jmir.org/2021/11/e31186 UR - http://dx.doi.org/10.2196/31186 UR - http://www.ncbi.nlm.nih.gov/pubmed/34766910 ID - info:doi/10.2196/31186 ER - TY - JOUR AU - Wang, Jie-Teng AU - Lin, Wen-Yang PY - 2021/10/28 TI - Privacy-Preserving Anonymity for Periodical Releases of Spontaneous Adverse Drug Event Reporting Data: Algorithm Development and Validation JO - JMIR Med Inform SP - e28752 VL - 9 IS - 10 KW - adverse drug reaction KW - data anonymization KW - incremental data publishing KW - privacy preserving data publishing KW - spontaneous reporting system KW - drug KW - data set KW - anonymous KW - privacy KW - security KW - algorithm KW - development KW - validation KW - data N2 - Background: Spontaneous reporting systems (SRSs) have been increasingly established to collect adverse drug events for fostering adverse drug reaction (ADR) detection and analysis research. SRS data contain personal information, and so their publication requires data anonymization to prevent the disclosure of individuals' privacy. We have previously proposed a privacy model called MS(k, θ*)-bounding and the associated MS-Anonymization algorithm to fulfill the anonymization of SRS data. 
In the real world, the SRS data usually are released periodically (eg, FDA Adverse Event Reporting System [FAERS]) to accommodate newly collected adverse drug events. Different anonymized releases of SRS data available to the attacker may thwart our single-release-focus method, that is, MS(k, θ*)-bounding. Objective: We investigate the privacy threat caused by periodical releases of SRS data and propose anonymization methods to prevent the disclosure of personal privacy information while maintaining the utility of published data. Methods: We identify potential attacks on periodical releases of SRS data, namely, BFL-attacks, mainly caused by follow-up cases. We present a new privacy model called PPMS(k, θ*)-bounding, and propose the associated PPMS-Anonymization algorithm and 2 improvements: PPMS+-Anonymization and PPMS++-Anonymization. Empirical evaluations were performed using 32 selected FAERS quarter data sets from 2004Q1 to 2011Q4. The performance of the proposed versions of PPMS-Anonymization was inspected against MS-Anonymization from some aspects, including data distortion, measured by normalized information loss; privacy risk of anonymized data, measured by dangerous identity ratio and dangerous sensitivity ratio; and data utility, measured by the bias of signal counting and strength (proportional reporting ratio). Results: The best version of PPMS-Anonymization, PPMS++-Anonymization, achieves nearly the same quality as MS-Anonymization in both privacy protection and data utility. Overall, PPMS++-Anonymization ensures zero privacy risk on record and attribute linkage, and exhibits 51%-78% and 59%-82% improvements on information loss over PPMS+-Anonymization and PPMS-Anonymization, respectively, and significantly reduces the bias of ADR signal. 
Conclusions: The proposed PPMS(k, θ*)-bounding model and PPMS-Anonymization algorithm are effective in anonymizing SRS data sets in the periodical data publishing scenario, preventing the series of releases from disclosing personal sensitive information caused by BFL-attacks while maintaining the data utility for ADR signal detection. UR - https://medinform.jmir.org/2021/10/e28752 UR - http://dx.doi.org/10.2196/28752 UR - http://www.ncbi.nlm.nih.gov/pubmed/34709197 ID - info:doi/10.2196/28752 ER - TY - JOUR AU - Monahan, Corneille Ann AU - Feldman, S. Sue PY - 2021/9/16 TI - Models Predicting Hospital Admission of Adult Patients Utilizing Prehospital Data: Systematic Review Using PROBAST and CHARMS JO - JMIR Med Inform SP - e30022 VL - 9 IS - 9 KW - emergency services KW - hospital KW - decision support techniques KW - patient-specific modeling KW - crowding KW - boarding KW - exit block KW - systematic review KW - PROBAST KW - CHARMS KW - predictive model KW - medical informatics KW - health services research KW - prehospital assessment KW - process improvement KW - management information system KW - predict admission KW - emergency department N2 - Background: Emergency department boarding and hospital exit block are primary causes of emergency department crowding and have been conclusively associated with poor patient outcomes and major threats to patient safety. Boarding occurs when a patient is delayed or blocked from transitioning out of the emergency department because of dysfunctional transition or bed assignment processes. Predictive models for estimating the probability of an occurrence of this type could be useful in reducing or preventing emergency department boarding and hospital exit block, to reduce emergency department crowding. 
Objective: The aim of this study was to identify and appraise the predictive performance, predictor utility, model application, and model utility of hospital admission prediction models that utilized prehospital, adult patient data and aimed to address emergency department crowding. Methods: We searched multiple databases for studies, from inception to September 30, 2019, that evaluated models predicting adult patients' imminent hospital admission, with prehospital patient data and regression analysis. We used PROBAST (Prediction Model Risk of Bias Assessment Tool) and CHARMS (Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modeling Studies) to critically assess studies. Results: Potential biases were found in most studies, which suggested that each model's predictive performance required further investigation. We found that select prehospital patient data contribute to the identification of patients requiring hospital admission. Biomarker predictors may add superior value and advantages to models. It is, however, important to note that no models had been integrated with an information system or workflow, operated independently as electronic devices, or operated in real time within the care environment. Several models could be used at the site-of-care in real time without digital devices, which would make them suitable for low-technology or no-electricity environments. Conclusions: There is incredible potential for prehospital admission prediction models to improve patient care and hospital operations. Patient data can be utilized to act as predictors and as data-driven, actionable tools to identify patients likely to require imminent hospital admission and reduce patient boarding and crowding in emergency departments. Prediction models can be used to justify earlier patient admission and care, to lower morbidity and mortality, and models that utilize biomarker predictors offer additional advantages. 
UR - https://medinform.jmir.org/2021/9/e30022 UR - http://dx.doi.org/10.2196/30022 UR - http://www.ncbi.nlm.nih.gov/pubmed/34528893 ID - info:doi/10.2196/30022 ER - TY - JOUR AU - Lu, Ding-Heng AU - Hsu, Chia-An AU - Yuan, J. Eunice AU - Fen, Jun-Jeng AU - Lee, Chung-Yuan AU - Ming, Jin-Lain AU - Chen, Tzeng-Ji AU - Lee, Wui-Chiang AU - Chen, Shih-Ann PY - 2021/7/27 TI - Experiences With Internet Triaging of 9498 Outpatients Daily at the Largest Public Hospital in Taiwan During the COVID-19 Pandemic: Observational Study JO - JMIR Med Inform SP - e20994 VL - 9 IS - 7 KW - COVID-19 KW - hospital KW - information services KW - outpatients KW - patient KW - SARS-CoV-2 KW - triage KW - virus N2 - Background: During pandemics, acquiring outpatients' travel, occupation, contact, and cluster histories is one of the most important measures in assessing the disease risk among incoming patients. Previous means of acquiring this information in the examination room have been insufficient in preventing disease spread. Objective: This study aimed to demonstrate the deployment of an automatic system to triage outpatients over the internet. Methods: An automatic system was incorporated in the existing web-based appointment system of the hospital and deployed along with its on-site counterpart. Automatic queries to the virtual private network travel and contact history database with each patient's national ID number were made for each attempt to acquire the patient's travel and contact histories. Patients with relevant histories were denied registration or entry. Text messages were sent to patients without a relevant history for an expedited route of entry if applicable. Results: A total of 127,857 visits were recorded. Among all visits, 91,195 were registered on the internet. In total, 71,816 of them generated text messages for an expedited route of entry. 
Furthermore, 65 patients had relevant histories, as revealed by the virtual private network database, and were denied registration or entry. Conclusions: An automatic triage system to acquire outpatients' relevant travel and contact histories was deployed rapidly in one of the largest academic medical centers in Taiwan. The updated system successfully denied patients with relevant travel or contact histories entry to the hospital, thus preventing long lines outside the hospital. Further efforts could be made to integrate the system with the electronic medical record system. UR - https://medinform.jmir.org/2021/7/e20994 UR - http://dx.doi.org/10.2196/20994 UR - http://www.ncbi.nlm.nih.gov/pubmed/34043524 ID - info:doi/10.2196/20994 ER - TY - JOUR AU - Willis, Matthew AU - Brand Hein, Leah AU - Hu, Zhaoxian AU - Saran, Rajiv AU - Argentina, Marissa AU - Bragg-Gresham, Jennifer AU - Krein, L. Sarah AU - Gillespie, Brenda AU - Zheng, Kai AU - Veinot, C. Tiffany PY - 2021/6/14 TI - Usability Evaluation of a Tablet-Based Intervention to Prevent Intradialytic Hypotension in Dialysis Patients During In-Clinic Dialysis: Mixed Methods Study JO - JMIR Hum Factors SP - e26012 VL - 8 IS - 2 KW - user interaction KW - dialysis KW - usability KW - informatics intervention N2 - Background: Patients on hemodialysis receive dialysis thrice weekly for about 4 hours per session. Intradialytic hypotension (IDH), low blood pressure during hemodialysis, is a serious but common complication of hemodialysis. Although patients on dialysis already participate in their care, activating patients toward IDH prevention may reduce their risk of IDH. Interactive, technology-based interventions hold promise as a platform for patient activation. However, little is known about the usability challenges that patients undergoing hemodialysis may face when using tablet-based informatics interventions, especially while dialyzing. 
Objective: This study aims to test the usability of a patient-facing, tablet-based intervention that includes theory-informed educational modules and motivational interviewing-based mentoring from patient peers via videoconferencing. Methods: We conducted a cross-sectional, mixed methods usability evaluation of the tablet-based intervention by using think-aloud methods, field notes, and structured observations. These qualitative data were evaluated by trained researchers using a structured data collection instrument to capture objective observational data. We calculated descriptive statistics for the quantitative data and conducted inductive content analysis using the qualitative data. Results: Findings from 14 patients cluster around general constraints such as the use of one arm, dexterity issues, impaired vision, and lack of experience with touch screen devices. Our task-by-task usability results showed that specific sections with the greatest difficulty for users were logging into the intervention (difficulty score: 2.08), interacting with the quizzes (difficulty score: 1.92), goal setting (difficulty score: 2.28), and entering and exiting videoconference rooms (difficulty score: 2.07) that are used to engage with peers during motivational interviewing sessions. Conclusions: In this paper, we present implications for designing informatics interventions for patients on dialysis and detail resulting changes to be implemented in the next version of this intervention. We frame these implications first through the context of the role the patients' physical body plays when interacting with the intervention and then through the digital considerations for software and interface interaction. 
UR - https://humanfactors.jmir.org/2021/2/e26012 UR - http://dx.doi.org/10.2196/26012 UR - http://www.ncbi.nlm.nih.gov/pubmed/34121664 ID - info:doi/10.2196/26012 ER - TY - JOUR AU - Manas, Gaur AU - Aribandi, Vamsi AU - Kursuncu, Ugur AU - Alambo, Amanuel AU - Shalin, L. 
Valerie AU - Thirunarayan, Krishnaprasad AU - Beich, Jonathan AU - Narasimhan, Meera AU - Sheth, Amit PY - 2021/5/10 TI - Knowledge-Infused Abstractive Summarization of Clinical Diagnostic Interviews: Framework Development Study JO - JMIR Ment Health SP - e20865 VL - 8 IS - 5 KW - knowledge-infusion KW - abstractive summarization KW - distress clinical diagnostic interviews KW - Patient Health Questionnaire-9 KW - healthcare informatics KW - interpretable evaluations N2 - Background: In clinical diagnostic interviews, mental health professionals (MHPs) implement a care practice that involves asking open questions (eg, "What do you want from your life?" "What have you tried before to bring change in your life?") while listening empathetically to patients. During these interviews, MHPs attempt to build a trusting human-centered relationship while collecting data necessary for professional medical and psychiatric care. Often, because of the social stigma of mental health disorders, patient discomfort in discussing their presenting problem may add additional complexities and nuances to the language they use, that is, hidden signals among noisy content. Therefore, a focused, well-formed, and elaborative summary of clinical interviews is critical to MHPs in making informed decisions by enabling a more profound exploration of a patient's behavior, especially when it endangers life. Objective: The aim of this study is to propose an unsupervised, knowledge-infused abstractive summarization (KiAS) approach that generates summaries to enable MHPs to perform a well-informed follow-up with patients to improve the existing summarization methods built on frequency heuristics by creating more informative summaries. Methods: Our approach incorporated domain knowledge from the Patient Health Questionnaire-9 lexicon into an integer linear programming framework that optimizes linguistic quality and informativeness. 
We used 3 baseline approaches: extractive summarization using the SumBasic algorithm, abstractive summarization using integer linear programming without the infusion of knowledge, and abstraction over extractive summarization to evaluate the performance of KiAS. The capability of KiAS on the Distress Analysis Interview Corpus-Wizard of Oz data set was demonstrated through interpretable qualitative and quantitative evaluations. Results: KiAS generates summaries (7 sentences on average) that capture informative questions and responses exchanged during long (58 sentences on average), ambiguous, and sparse clinical diagnostic interviews. The summaries generated using KiAS improved upon the 3 baselines by 23.3%, 4.4%, 2.5%, and 2.2% for thematic overlap, Flesch Reading Ease, contextual similarity, and Jensen Shannon divergence, respectively. On the Recall-Oriented Understudy for Gisting Evaluation-2 and Recall-Oriented Understudy for Gisting Evaluation-L metrics, KiAS showed an improvement of 61% and 49%, respectively. We validated the quality of the generated summaries through visual inspection and substantial interrater agreement from MHPs. Conclusions: Our collaborator MHPs observed the potential utility and significant impact of KiAS in leveraging valuable but voluminous communications that take place outside of normally scheduled clinical appointments. This study shows promise in generating semantically relevant summaries that will help MHPs make informed decisions about patient status. UR - https://mental.jmir.org/2021/5/e20865 UR - http://dx.doi.org/10.2196/20865 UR - http://www.ncbi.nlm.nih.gov/pubmed/33970116 ID - info:doi/10.2196/20865 ER - TY - JOUR AU - Blease, Charlotte AU - Torous, John AU - Kharko, Anna AU - DesRoches, M. 
Catherine AU - Harcourt, Kendall AU - O'Neill, Stephen AU - Salmi, Liz AU - Wachenheim, Deborah AU - Hägglund, Maria PY - 2021/4/16 TI - Preparing Patients and Clinicians for Open Notes in Mental Health: Qualitative Inquiry of International Experts JO - JMIR Ment Health SP - e27397 VL - 8 IS - 4 KW - open notes KW - electronic health records KW - attitudes KW - survey KW - mental health KW - psychiatry KW - psychotherapy KW - qualitative research KW - mobile phone N2 - Background: In a growing number of countries worldwide, clinicians are sharing mental health notes, including psychiatry and psychotherapy notes, with patients. Objective: The aim of this study is to solicit the views of experts on provider policies and patient and clinician training or guidance in relation to open notes in mental health care. Methods: In August 2020, we conducted a web-based survey of international experts on the practice of sharing mental health notes. Experts were identified as informaticians, clinicians, chief medical information officers, patients, and patient advocates who have extensive research knowledge about or experience of providing access to or having access to mental health notes. This study undertook a qualitative descriptive analysis of experts' written responses and opinions (comments) to open-ended questions on training clinicians, patient guidance, and suggested policy regulations. Results: A total of 70 of 92 (76%) experts from 6 countries responded. We identified four major themes related to opening mental health notes to patients: the need for clarity about provider policies on exemptions, providing patients with basic information about open notes, clinician training in writing mental health notes, and managing patient-clinician disagreement about mental health notes. 
Conclusions: This study provides timely information on policy and training recommendations derived from a wide range of international experts on how to prepare clinicians and patients for open notes in mental health. The results of this study point to the need for further refinement of exemption policies in relation to sharing mental health notes, guidance for patients, and curricular changes for students and clinicians as well as improvements aimed at enhancing patient and clinician-friendly portal design. UR - https://mental.jmir.org/2021/4/e27397 UR - http://dx.doi.org/10.2196/27397 UR - http://www.ncbi.nlm.nih.gov/pubmed/33861202 ID - info:doi/10.2196/27397 ER - TY - JOUR AU - Park, Ae Ji AU - Sung, Dong Min AU - Kim, Heon Ho AU - Park, Rang Yu PY - 2021/4/5 TI - Weight-Based Framework for Predictive Modeling of Multiple Databases With Noniterative Communication Without Data Sharing: Privacy-Protecting Analytic Method for Multi-Institutional Studies JO - JMIR Med Inform SP - e21043 VL - 9 IS - 4 KW - multi-institutional study KW - distributed data KW - data sharing KW - privacy-protecting methods N2 - Background: Securing the representativeness of study populations is crucial in biomedical research to ensure high generalizability. In this regard, using multi-institutional data has advantages in medicine. However, combining data physically is difficult as the confidential nature of biomedical data causes privacy issues. Therefore, a methodological approach is necessary when using multi-institution medical data for research to develop a model without sharing data between institutions. Objective: This study aims to develop a weight-based integrated predictive model of multi-institutional data, which does not require iterative communication between institutions, to improve average predictive performance by increasing the generalizability of the model under privacy-preserving conditions without sharing patient-level data. 
Methods: The weight-based integrated model generates a weight for each institutional model and builds an integrated model for multi-institutional data based on these weights. We performed 3 simulations to show the weight characteristics and to determine the number of repetitions of the weight required to obtain stable values. We also conducted an experiment using real multi-institutional data to verify the developed weight-based integrated model. We selected 10 hospitals (2845 intensive care unit [ICU] stays in total) from the electronic intensive care unit Collaborative Research Database to predict ICU mortality with 11 features. To evaluate the validity of our model, compared with a centralized model, which was developed by combining all the data of 10 hospitals, we used proportional overlap (ie, 0.5 or less indicates a significant difference at a level of .05; and 2 indicates 2 CIs overlapping completely). Standard and Firth logistic regression models were applied for the 2 simulations and the experiment. Results: The results of these simulations indicate that the weight of each institution is determined by 2 factors (ie, the data size of each institution and how well each institutional model fits into the overall institutional data) and that repeatedly generating 200 weights is necessary per institution. In the experiment, the estimated area under the receiver operating characteristic curve (AUC) and 95% CIs were 81.36% (79.37%-83.36%) and 81.95% (80.03%-83.87%) in the centralized model and weight-based integrated model, respectively. The proportional overlap of the CIs for AUC in both the weight-based integrated model and the centralized model was approximately 1.70, and that of overlap of the 11 estimated odds ratios was over 1, except for 1 case. Conclusions: In the experiment where real multi-institutional data were used, our model showed similar results to the centralized model without iterative communication between institutions. 
In addition, our weight-based integrated model provided a weighted average model by integrating 10 models overfitted or underfitted, compared with the centralized model. The proposed weight-based integrated model is expected to provide an efficient distributed research approach as it increases the generalizability of the model and does not require iterative communication. UR - https://medinform.jmir.org/2021/4/e21043 UR - http://dx.doi.org/10.2196/21043 UR - http://www.ncbi.nlm.nih.gov/pubmed/33818396 ID - info:doi/10.2196/21043 ER - TY - JOUR AU - Ma, Jinfei AU - Zou, Zihao AU - Pazo, Eric Emmanuel AU - Moutari, Salissou AU - Liu, Ye AU - Jin, Feng PY - 2021/3/2 TI - Comparative Analysis of Paper-Based and Web-Based Versions of the National Comprehensive Cancer Network-Functional Assessment of Cancer Therapy-Breast Cancer Symptom Index (NFBSI-16) Questionnaire in Breast Cancer Patients: Randomized Crossover Study JO - JMIR Med Inform SP - e18269 VL - 9 IS - 3 KW - breast cancer KW - NFBSI-16 KW - patient-reported outcome KW - reproducibility KW - test-retest reliability KW - web-based questionnaire N2 - Background: Breast cancer remains the most common neoplasm diagnosed among women in China and globally. Health-related questionnaire assessments in research and clinical oncology settings have gained prominence. The National Comprehensive Cancer Network-Functional Assessment of Cancer Therapy-Breast Cancer Symptom Index (NFBSI-16) is a rapid and powerful tool to help evaluate disease- or treatment-related symptoms, both physical and emotional, in patients with breast cancer for clinical and research purposes. Prevalence of individual smartphones provides a potential web-based approach to administering the questionnaire; however, the reliability of the NFBSI-16 in electronic format has not been assessed. 
Objective: This study aimed to assess the reliability of a web-based NFBSI-16 questionnaire in breast cancer patients undergoing systematic treatment with a prospective open-label randomized crossover study design. Methods: We recruited random patients with breast cancer under systematic treatment from the central hospital registry to complete both paper- and web-based versions of the questionnaires. Both versions of the questionnaires were self-assessed. Patients were randomly assigned to group A (paper-based first and web-based second) or group B (web-based first and paper-based second). A total of 354 patients were included in the analysis (group A: n=177, group B: n=177). Descriptive sociodemographic characteristics, reliability and agreement rates for single items, subscales, and total score were analyzed using the Wilcoxon test. The Lin concordance correlation coefficient (CCC) and Spearman and Kendall τ rank correlations were used to assess test-retest reliability. Results: Test-retest reliability measured with CCCs was 0.94 for the total NFBSI-16 score. Significant correlations (Spearman ρ) were documented for all 4 subscales, namely Disease-Related Symptoms Subscale-Physical (ρ=0.93), Disease-Related Symptoms Subscale-Emotional (ρ=0.85), Treatment Side Effects Subscale (ρ=0.95), and Function and Well-Being Subscale (ρ=0.91), and for the total NFBSI-16 score (ρ=0.94). Mean differences of the test and retest were all close to zero (≤0.06). The parallel test-retest reliability of subscales with the Wilcoxon test comparing individual items found GP3 (item 5) to be significantly different (P=.02). A majority of the participants in this study (255/354, 72.0%) preferred the web-based over the paper-based version. Conclusions: The web-based version of the NFBSI-16 questionnaire is an excellent tool for monitoring individual breast cancer patients under treatment, with the majority of participants preferring it over the paper-based version. 
UR - https://medinform.jmir.org/2021/3/e18269 UR - http://dx.doi.org/10.2196/18269 UR - http://www.ncbi.nlm.nih.gov/pubmed/33650978 ID - info:doi/10.2196/18269 ER - TY - JOUR AU - Conca, Antoinette AU - Koch, Daniel AU - Regez, Katharina AU - Kutz, Alexander AU - Bächli, Ciril AU - Haubitz, Sebastian AU - Schuetz, Philipp AU - Mueller, Beat AU - Spirig, Rebecca AU - Petry, Heidi PY - 2021/1/14 TI - Self-Care Index and Post-Acute Care Discharge Score to Predict Discharge Destination of Adult Medical Inpatients: Protocol for a Multicenter Validation Study JO - JMIR Res Protoc SP - e21447 VL - 10 IS - 1 KW - discharge planning KW - forecasting KW - logistic models KW - patient transfer KW - post-acute care discharge score KW - protocol KW - self-care index KW - sensitivity KW - specificity KW - validation study N2 - Background: Delays in patient discharge can not only lead to deterioration, especially among geriatric patients, but also incorporate unnecessary resources at the hospital level. Many of these delays and their negative impact may be preventable by early focused screening to identify patients at risk for transfer to a post-acute care facility. Early interprofessional discharge planning is crucial in order to fit the appropriate individual discharge destination. While prediction of discharge to a post-acute care facility using post-acute care discharge score, the self-care index, and a combination of both has been shown in a single-center pilot study, an external validation is still missing. Objective: This paper outlines the study protocol and methodology currently being used to replicate the previous pilot findings and determine whether the post-acute care discharge score, the self-care index, or the combination of both can reliably identify patients requiring transfer to post-acute care facilities. Methods: This study will use prospective data involving all phases of the quasi-experimental study "In-HospiTOOL" 
conducted at 7 Swiss hospitals in urban and rural areas. During an 18-month period, consecutive adult medical patients admitted to the hospitals through the emergency department will be included. We aim to include 6000 patients based on sample size calculation. These data will enable a prospective external validation of the prediction instruments. Results: We expect to gain more insight into the predictive capability of the above-mentioned prediction instruments. This approach will allow us to get important information about the generalizability of the three different models. The study was approved by the institutional review board on November 21, 2016, and funded in May 2020. Expected results are planned to be published in spring 2021. Conclusions: This study will provide evidence on prognostic properties, comparative performance, reliability of scoring, and suitability of the instruments for the screening purpose in order to be able to recommend application in clinical practice. International Registered Report Identifier (IRRID): DERR1-10.2196/21447 UR - http://www.researchprotocols.org/2021/1/e21447/ UR - http://dx.doi.org/10.2196/21447 UR - http://www.ncbi.nlm.nih.gov/pubmed/33263553 ID - info:doi/10.2196/21447 ER - TY - JOUR AU - Kim, Ki-Hun AU - Kim, Kwang-Jae PY - 2020/12/17 TI - Missing-Data Handling Methods for Lifelogs-Based Wellness Index Estimation: Comparative Analysis With Panel Data JO - JMIR Med Inform SP - e20597 VL - 8 IS - 12 KW - lifelogs-based wellness index KW - missing-data handling KW - health behavior lifelogs KW - panel data KW - smart wellness service N2 - Background: A lifelogs-based wellness index (LWI) is a function for calculating wellness scores based on health behavior lifelogs (eg, daily walking steps and sleep times collected via a smartwatch). A wellness score intuitively shows the users of smart wellness services the overall condition of their health behaviors. 
LWI development includes estimation (ie, estimating coefficients in LWI with data). A panel data set comprising health behavior lifelogs allows LWI estimation to control for unobserved variables, thereby resulting in less bias. However, these data sets typically have missing data due to events that occur in daily life (eg, smart devices stop collecting data when batteries are depleted), which can introduce biases into LWI coefficients. Thus, the appropriate choice of method to handle missing data is important for reducing biases in LWI estimations with panel data. However, there is a lack of research in this area. Objective: This study aims to identify a suitable missing-data handling method for LWI estimation with panel data. Methods: Listwise deletion, mean imputation, expectation maximization-based multiple imputation, predictive-mean matching-based multiple imputation, k-nearest neighbors-based imputation, and low-rank approximation-based imputation were comparatively evaluated by simulating an existing case of LWI development. A panel data set comprising health behavior lifelogs of 41 college students over 4 weeks was transformed into a reference data set without any missing data. Then, 200 simulated data sets were generated by randomly introducing missing data at proportions from 1% to 80%. The missing-data handling methods were each applied to transform the simulated data sets into complete data sets, and coefficients in a linear LWI were estimated for each complete data set. For each proportion for each method, a bias measure was calculated by comparing the estimated coefficient values with values estimated from the reference data set. Results: Methods performed differently depending on the proportion of missing data. For 1% to 30% proportions, low-rank approximation-based imputation, predictive-mean matching-based multiple imputation, and expectation maximization-based multiple imputation were superior. 
For 31% to 60% proportions, low-rank approximation-based imputation and predictive-mean matching-based multiple imputation performed best. For over 60% proportions, only low-rank approximation-based imputation performed acceptably. Conclusions: Low-rank approximation-based imputation was the best of the 6 data-handling methods regardless of the proportion of missing data. This superiority is generalizable to other panel data sets comprising health behavior lifelogs given their verified low-rank nature, for which low-rank approximation-based imputation is known to perform effectively. This result will guide missing-data handling in reducing coefficient biases in new development cases of linear LWIs with panel data. UR - http://medinform.jmir.org/2020/12/e20597/ UR - http://dx.doi.org/10.2196/20597 UR - http://www.ncbi.nlm.nih.gov/pubmed/33331831 ID - info:doi/10.2196/20597 ER - TY - JOUR AU - Garcia-Rudolph, Alejandro AU - Garcia-Molina, Alberto AU - Opisso, Eloy AU - Tormos Muñoz, Jose PY - 2020/10/6 TI - Personalized Web-Based Cognitive Rehabilitation Treatments for Patients with Traumatic Brain Injury: Cluster Analysis JO - JMIR Med Inform SP - e16077 VL - 8 IS - 10 KW - cluster analysis KW - traumatic brain injury KW - web-based rehabilitation N2 - Background: Traumatic brain injury (TBI) is a leading cause of disability worldwide. TBI is a highly heterogeneous disease, which makes it complex for effective therapeutic interventions. Cluster analysis has been extensively applied in previous research studies to identify homogeneous subgroups based on performance in neuropsychological baseline tests. Nevertheless, most analyzed samples are rarely larger than a size of 100, and different cluster analysis approaches and cluster validity indices have been scarcely compared or applied in web-based rehabilitation treatments. 
Objective: The aims of our study were as follows: (1) to apply state-of-the-art cluster validity indices to different cluster strategies: hierarchical, partitional, and model-based, (2) to apply combined strategies of dimensionality reduction by using principal component analysis and random forests and perform stability assessment of the final profiles, (3) to characterize the identified profiles by using demographic and clinically relevant variables, and (4) to study the external validity of the obtained clusters by considering 3 relevant aspects of TBI rehabilitation: Glasgow Coma Scale, functional independence measure, and execution of web-based cognitive tasks. Methods: This study was performed from August 2008 to July 2019. Different cluster strategies were executed with Mclust, factoextra, and cluster R packages. For combined strategies, we used the FactoMineR and random forest R packages. Stability analysis was performed with the fpc R package. Between-group comparisons for external validation were performed using 2-tailed t test, chi-square test, or Mann-Whitney U test, as appropriate. Results: We analyzed 574 adult patients with TBI (mostly severe) who were undergoing web-based rehabilitation. We identified and characterized 3 clusters with strong internal validation: (1) moderate attentional impairment and moderate dysexecutive syndrome with mild memory impairment and normal spatiotemporal perception, with almost 66% (111/170) of the patients being highly educated (P<.05); (2) severe dysexecutive syndrome with severe attentional and memory impairments and normal spatiotemporal perception, with 49.2% (153/311) of the patients being highly educated (P<.05); (3) very severe cognitive impairment, with 45.2% (42/93) of the patients being highly educated (P<.05). We externally validated them with severity of injury (P=.006) and functional independence assessments: cognitive (P<.001), motor (P<.001), and total (P<.001). 
We mapped 151,763 web-based cognitive rehabilitation tasks during the whole period to the 3 obtained clusters (P<.001) and confirmed the identified patterns. Stability analysis indicated that clusters 1 and 2 were respectively rated as 0.60 and 0.75; therefore, they were measuring a pattern and cluster 3 was rated as highly stable. Conclusions: Cluster analysis in web-based cognitive rehabilitation treatments enables the identification and characterization of strong response patterns to neuropsychological tests, external validation of the obtained clusters, tailoring of cognitive web-based tasks executed in the web platform to the identified profiles, thereby providing clinicians a tool for treatment personalization, and the extension of a similar approach to other medical conditions. UR - https://medinform.jmir.org/2020/10/e16077 UR - http://dx.doi.org/10.2196/16077 UR - http://www.ncbi.nlm.nih.gov/pubmed/33021482 ID - info:doi/10.2196/16077 ER - TY - JOUR AU - Tan, Marissa AU - Hatef, Elham AU - Taghipour, Delaram AU - Vyas, Kinjel AU - Kharrazi, Hadi AU - Gottlieb, Laura AU - Weiner, Jonathan PY - 2020/9/8 TI - Including Social and Behavioral Determinants in Predictive Models: Trends, Challenges, and Opportunities JO - JMIR Med Inform SP - e18084 VL - 8 IS - 9 KW - social determinants of health KW - information technology KW - health care disparities KW - population health UR - http://medinform.jmir.org/2020/9/e18084/ UR - http://dx.doi.org/10.2196/18084 UR - http://www.ncbi.nlm.nih.gov/pubmed/32897240 ID - info:doi/10.2196/18084 ER - TY - JOUR AU - Li, Xin AU - Rousseau, F. 
Justin AU - Ding, Ying AU - Song, Min AU - Lu, Wei PY - 2020/6/16 TI - Understanding Drug Repurposing From the Perspective of Biomedical Entities and Their Evolution: Bibliographic Research Using Aspirin JO - JMIR Med Inform SP - e16739 VL - 8 IS - 6 KW - drug repurposing KW - biomedical entities KW - entitymetrics KW - bibliometrics KW - aspirin KW - acetylsalicylic acid N2 - Background: Drug development is still a costly and time-consuming process with a low rate of success. Drug repurposing (DR) has attracted significant attention because of its significant advantages over traditional approaches in terms of development time, cost, and safety. Entitymetrics, defined as bibliometric indicators based on biomedical entities (eg, diseases, drugs, and genes) studied in the biomedical literature, make it possible for researchers to measure knowledge evolution and the transfer of drug research. Objective: The purpose of this study was to understand DR from the perspective of biomedical entities (diseases, drugs, and genes) and their evolution. Methods: In the work reported in this paper, we extended the bibliometric indicators of biomedical entities mentioned in PubMed to detect potential patterns of biomedical entities in various phases of drug research and investigate the factors driving DR. We used aspirin (acetylsalicylic acid) as the subject of the study since it can be repurposed for many applications. We propose 4 easy, transparent measures based on entitymetrics to investigate DR for aspirin: Popularity Index (P1), Promising Index (P2), Prestige Index (P3), and Collaboration Index (CI). Results: We found that the maxima of P1, P3, and CI are closely associated with the different repurposing phases of aspirin. These metrics enabled us to observe the way in which biomedical entities interacted with the drug during the various phases of DR and to analyze the potential driving factors for DR at the entity level. 
P1 and CI were indicative of the dynamic trends of a specific biomedical entity over a long time period, while P2 was more sensitive to immediate changes. P3 reflected the early signs of the practical value of biomedical entities and could be valuable for tracking the research frontiers of a drug. Conclusions: In-depth studies of side effects and mechanisms, fierce market competition, and advanced life science technologies are driving factors for DR. This study showcases the way in which researchers can examine the evolution of DR using entitymetrics, an approach that can be valuable for enhancing decision making in the field of drug discovery and development. UR - https://medinform.jmir.org/2020/6/e16739 UR - http://dx.doi.org/10.2196/16739 UR - http://www.ncbi.nlm.nih.gov/pubmed/32543442 ID - info:doi/10.2196/16739 ER - TY - JOUR AU - Miotto, Riccardo AU - Percha, L. Bethany AU - Glicksberg, S. Benjamin AU - Lee, Hao-Chih AU - Cruz, Lisanne AU - Dudley, T. Joel AU - Nabeel, Ismail PY - 2020/2/27 TI - Identifying Acute Low Back Pain Episodes in Primary Care Practice From Clinical Notes: Observational Study JO - JMIR Med Inform SP - e16878 VL - 8 IS - 2 KW - electronic health records KW - clinical notes KW - low back pain KW - natural language processing KW - machine learning N2 - Background: Acute and chronic low back pain (LBP) are different conditions with different treatments. However, they are coded in electronic health records with the same International Classification of Diseases, 10th revision (ICD-10) code (M54.5) and can be differentiated only by retrospective chart reviews. This prevents an efficient definition of data-driven guidelines for billing and therapy recommendations, such as return-to-work options. Objective: The objective of this study was to evaluate the feasibility of automatically distinguishing acute LBP episodes by analyzing free-text clinical notes. 
Methods: We used a dataset of 17,409 clinical notes from different primary care practices; of these, 891 documents were manually annotated as acute LBP and 2973 were generally associated with LBP via the recorded ICD-10 code. We compared different supervised and unsupervised strategies for automated identification: keyword search, topic modeling, logistic regression with bag of n-grams and manual features, and deep learning (a convolutional neural network-based architecture [ConvNet]). We trained the supervised models using either manual annotations or ICD-10 codes as positive labels. Results: ConvNet trained using manual annotations obtained the best results with an area under the receiver operating characteristic curve of 0.98 and an F score of 0.70. ConvNet's results were also robust to reduction of the number of manually annotated documents. In the absence of manual annotations, topic models performed better than methods trained using ICD-10 codes, which were unsatisfactory for identifying LBP acuity. Conclusions: This study uses clinical notes to delineate a potential path toward systematic learning of therapeutic strategies, billing guidelines, and management options for acute LBP at the point of care. UR - http://medinform.jmir.org/2020/2/e16878/ UR - http://dx.doi.org/10.2196/16878 UR - http://www.ncbi.nlm.nih.gov/pubmed/32130159 ID - info:doi/10.2196/16878 ER - TY - JOUR AU - Gong, Mengchun AU - Wang, Shuang AU - Wang, Lezi AU - Liu, Chao AU - Wang, Jianyang AU - Guo, Qiang AU - Zheng, Hao AU - Xie, Kang AU - Wang, Chenghong AU - Hui, Zhouguang PY - 2020/2/5 TI - Evaluation of Privacy Risks of Patients' Data in China: Case Study JO - JMIR Med Inform SP - e13046 VL - 8 IS - 2 KW - patient privacy KW - privacy risk KW - Chinese patients' data KW - data sharing KW - re-identification N2 - Background: Patient privacy is a ubiquitous problem around the world. 
Many existing studies have demonstrated the potential privacy risks associated with sharing of biomedical data. Owing to the increasing need for data sharing and analysis, health care data privacy is drawing more attention. However, to better protect biomedical data privacy, it is essential to assess the privacy risk in the first place. Objective: In China, there is no clear regulation for health systems to deidentify data. It is also not known whether a mechanism such as the Health Insurance Portability and Accountability Act (HIPAA) safe harbor policy will achieve sufficient protection. This study aimed to conduct a pilot study using patient data from Chinese hospitals to understand and quantify the privacy risks of Chinese patients. Methods: We used g-distinct analysis to evaluate the reidentification risks with regard to the HIPAA safe harbor approach when applied to Chinese patients' data. More specifically, we estimated the risks based on the HIPAA safe harbor and limited dataset policies by assuming an attacker has background knowledge of the patient from the public domain. Results: The experiments were conducted on 0.83 million patients (with data fields of date of birth, gender, and surrogate ZIP codes generated based on home address) across 33 provincial-level administrative divisions in China. Under the Limited Dataset policy, 19.58% (163,262/833,235) of the population could be uniquely identifiable under the g-distinct metric (ie, 1-distinct). In contrast, the Safe Harbor policy is able to significantly reduce privacy risk, where only 0.072% (601/833,235) of individuals are uniquely identifiable, and the majority of the population is 3000 indistinguishable (ie, the population is expected to share common attributes with 3000 or fewer people). 
Conclusions: Through the experiments based on real-world patient data, this work illustrates that the results of g-distinct analysis about Chinese patient privacy risk are similar to those from a previous US study, in which data from different organizations/regions might be vulnerable to different reidentification risks under different policies. This work provides a reference for Chinese health care entities to estimate patients' privacy risk during data sharing, laying the foundation for future privacy risk studies of Chinese patients' data. UR - https://medinform.jmir.org/2020/2/e13046 UR - http://dx.doi.org/10.2196/13046 UR - http://www.ncbi.nlm.nih.gov/pubmed/32022691 ID - info:doi/10.2196/13046 ER - TY - JOUR AU - Bai, Jinbing AU - Jhaney, Ileen AU - Wells, Jessica PY - 2019/11/11 TI - Developing a Reproducible Microbiome Data Analysis Pipeline Using the Amazon Web Services Cloud for a Cancer Research Group: Proof-of-Concept Study JO - JMIR Med Inform SP - e14667 VL - 7 IS - 4 KW - Amazon Web Services KW - cloud computation KW - microbiome KW - pipeline KW - sequence analysis N2 - Background: Cloud computing for microbiome data sets can significantly increase working efficiencies and expedite the translation of research findings into clinical practice. The Amazon Web Services (AWS) cloud provides an invaluable option for microbiome data storage, computation, and analysis. Objective: The goals of this study were to develop a microbiome data analysis pipeline by using AWS cloud and to conduct a proof-of-concept test for microbiome data storage, processing, and analysis. Methods: A multidisciplinary team was formed to develop and test a reproducible microbiome data analysis pipeline with multiple AWS cloud services that could be used for storage, computation, and data analysis. The microbiome data analysis pipeline developed in AWS was tested by using two data sets: 19 vaginal microbiome samples and 50 gut microbiome samples. 
Results: Using AWS features, we developed a microbiome data analysis pipeline that included Amazon Simple Storage Service for microbiome sequence storage, Linux Elastic Compute Cloud (EC2) instances (ie, servers) for data computation and analysis, and security keys to create and manage the use of encryption for the pipeline. Bioinformatics and statistical tools (ie, Quantitative Insights Into Microbial Ecology 2 and RStudio) were installed within the Linux EC2 instances to run microbiome statistical analysis. The microbiome data analysis pipeline was performed through command-line interfaces within the Linux operating system or in the Mac operating system. Using this new pipeline, we were able to successfully process and analyze 50 gut microbiome samples within 4 hours at a very low cost (a c4.4xlarge EC2 instance costs $0.80 per hour). Gut microbiome findings regarding diversity, taxonomy, and abundance analyses were easily shared within our research team. Conclusions: Building a microbiome data analysis pipeline with AWS cloud is feasible. This pipeline is highly reliable, computationally powerful, and cost effective. Our AWS-based microbiome analysis pipeline provides an efficient tool to conduct microbiome data analysis. UR - http://medinform.jmir.org/2019/4/e14667/ UR - http://dx.doi.org/10.2196/14667 UR - http://www.ncbi.nlm.nih.gov/pubmed/31710301 ID - info:doi/10.2196/14667 ER - TY - JOUR AU - Wakefield, J. Bonnie AU - Turvey, L. Carolyn AU - Nazi, M. Kim AU - Holman, E. John AU - Hogan, P. Timothy AU - Shimada, L. Stephanie AU - Kennedy, R. Diana PY - 2017/10/11 TI - Psychometric Properties of Patient-Facing eHealth Evaluation Measures: Systematic Review and Analysis JO - J Med Internet Res SP - e346 VL - 19 IS - 10 KW - telemedicine KW - computers KW - evaluation KW - use-effectiveness KW - technology KW - psychometrics N2 - Background: Significant resources are being invested into eHealth technology to improve health care. 
Few resources have focused on evaluating the impact of use on patient outcomes. A standardized set of metrics used across health systems and research will enable aggregation of data to inform improved implementation, clinical practice, and ultimately health outcomes associated with use of patient-facing eHealth technologies. Objective: The objective of this project was to conduct a systematic review to (1) identify existing instruments for eHealth research and implementation evaluation from the patient's point of view, (2) characterize measurement components, and (3) assess psychometrics. Methods: Concepts from existing models and published studies of technology use and adoption were identified and used to inform a search strategy. Search terms were broadly categorized as platforms (eg, email), measurement (eg, survey), function/information use (eg, self-management), health care occupations (eg, nurse), and eHealth/telemedicine (eg, mHealth). A computerized database search was conducted through June 2014. Included articles (1) described development of an instrument, or (2) used an instrument that could be traced back to its original publication, or (3) modified an instrument, and (4) with full text in English language, and (5) focused on the patient perspective on technology, including patient preferences and satisfaction, engagement with technology, usability, competency and fluency with technology, computer literacy, and trust in and acceptance of technology. The review was limited to instruments that reported at least one psychometric property. Excluded were investigator-developed measures, disease-specific assessments delivered via technology or telephone (eg, a cancer-coping measure delivered via computer survey), and measures focused primarily on clinician use (eg, the electronic health record). Results: The search strategy yielded 47,320 articles. 
Following elimination of duplicates and non-English language publications (n=14,550) and books (n=27), another 31,647 articles were excluded through review of titles. Following a review of the abstracts of the remaining 1096 articles, 68 were retained for full-text review. Of these, 16 described an instrument and six used an instrument; one instrument was drawn from the GEM database, resulting in 23 articles for inclusion. None included a complete psychometric evaluation. The most frequently assessed property was internal consistency (21/23, 91%). Testing for aspects of validity ranged from 48% (11/23) to 78% (18/23). Approximately half (13/23, 57%) reported how to score the instrument. Only six (26%) assessed the readability of the instrument for end users, although all the measures rely on self-report. Conclusions: Although most measures identified in this review were published after the year 2000, rapidly changing technology makes instrument development challenging. Platform-agnostic measures need to be developed that focus on concepts important for use of any type of eHealth innovation. At present, there are important gaps in the availability of psychometrically sound measures to evaluate eHealth technologies. UR - http://www.jmir.org/2017/10/e346/ UR - http://dx.doi.org/10.2196/jmir.7638 UR - http://www.ncbi.nlm.nih.gov/pubmed/29021128 ID - info:doi/10.2196/jmir.7638 ER - TY - JOUR AU - Eivazzadeh, Shahryar AU - Anderberg, Peter AU - Larsson, C. Tobias AU - Fricker, A. 
Samuel AU - Berglund, Johan PY - 2016/06/16 TI - Evaluating Health Information Systems Using Ontologies JO - JMIR Med Inform SP - e20 VL - 4 IS - 2 KW - health information systems KW - ontologies KW - evaluation KW - technology assessment KW - biomedical N2 - Background: There are several frameworks that attempt to address the challenges of evaluation of health information systems by offering models, methods, and guidelines about what to evaluate, how to evaluate, and how to report the evaluation results. Model-based evaluation frameworks usually suggest universally applicable evaluation aspects but do not consider case-specific aspects. On the other hand, evaluation frameworks that are case specific, by eliciting user requirements, limit their output to the evaluation aspects suggested by the users in the early phases of system development. In addition, these case-specific approaches extract different sets of evaluation aspects from each case, making it challenging to collectively compare, unify, or aggregate the evaluation of a set of heterogeneous health information systems. Objectives: The aim of this paper is to find a method capable of suggesting evaluation aspects for a set of one or more health information systems, whether similar or heterogeneous, by organizing, unifying, and aggregating the quality attributes extracted from those systems and from an external evaluation framework. Methods: On the basis of the available literature in semantic networks and ontologies, a method (called Unified eValuation using Ontology; UVON) was developed that can organize, unify, and aggregate the quality attributes of several health information systems into a tree-style ontology structure. The method was extended to integrate its generated ontology with the evaluation aspects suggested by model-based evaluation frameworks. 
An approach was developed to extract evaluation aspects from the ontology that also considers evaluation case practicalities such as the maximum number of evaluation aspects to be measured or their required degree of specificity. The method was applied and tested in Future Internet Social and Technological Alignment Research (FI-STAR), a project of 7 cloud-based eHealth applications that were developed and deployed across European Union countries. Results: The relevance of the evaluation aspects created by the UVON method for the FI-STAR project was validated by the corresponding stakeholders of each case. These evaluation aspects were extracted from a UVON-generated ontology structure that reflects both the internally declared required quality attributes in the 7 eHealth applications of the FI-STAR project and the evaluation aspects recommended by the Model for ASsessment of Telemedicine applications (MAST) evaluation framework. The extracted evaluation aspects were used to create questionnaires (for the corresponding patients and health professionals) to evaluate each individual case and the whole of the FI-STAR project. Conclusions: The UVON method can provide a relevant set of evaluation aspects for a heterogeneous set of health information systems by organizing, unifying, and aggregating the quality attributes through ontological structures. Those quality attributes can be either suggested by evaluation models or elicited from the stakeholders of those systems in the form of system requirements. The method continues to be systematic, context sensitive, and relevant across a heterogeneous set of health information systems. UR - http://medinform.jmir.org/2016/2/e20/ UR - http://dx.doi.org/10.2196/medinform.5185 UR - http://www.ncbi.nlm.nih.gov/pubmed/27311735 ID - info:doi/10.2196/medinform.5185 ER - TY - JOUR AU - Sadasivam, Shankar Rajani AU - Cutrona, L. Sarah AU - Kinney, L. Rebecca AU - Marlin, M. Benjamin AU - Mazor, M. Kathleen AU - Lemon, C. 
Stephenie AU - Houston, K. Thomas PY - 2016/03/07 TI - Collective-Intelligence Recommender Systems: Advancing Computer Tailoring for Health Behavior Change Into the 21st Century JO - J Med Internet Res SP - e42 VL - 18 IS - 3 KW - computer-tailored health communication KW - machine learning KW - recommender systems N2 - Background: What is the next frontier for computer-tailored health communication (CTHC) research? In current CTHC systems, study designers who have expertise in behavioral theory and mapping theory into CTHC systems select the variables and develop the rules that specify how the content should be tailored, based on their knowledge of the targeted population, the literature, and health behavior theories. In collective-intelligence recommender systems (hereafter recommender systems) used by Web 2.0 companies (eg, Netflix and Amazon), machine learning algorithms combine user profiles and continuous feedback ratings of content (from themselves and other users) to empirically tailor content. Augmenting current theory-based CTHC with empirical recommender systems could be evaluated as the next frontier for CTHC. Objective: The objective of our study was to uncover barriers and challenges to using recommender systems in health promotion. Methods: We conducted a focused literature review, interviewed subject experts (n=8), and synthesized the results. Results: We describe (1) limitations of current CTHC systems, (2) advantages of incorporating recommender systems to move CTHC forward, and (3) challenges to incorporating recommender systems into CTHC. Based on the evidence presented, we propose a future research agenda for CTHC systems. Conclusions: We promote discussion of ways to move CTHC into the 21st century by incorporation of recommender systems. 
UR - http://www.jmir.org/2016/3/e42/ UR - http://dx.doi.org/10.2196/jmir.4448 UR - http://www.ncbi.nlm.nih.gov/pubmed/26952574 ID - info:doi/10.2196/jmir.4448 ER - TY - JOUR AU - Gray, Kathleen AU - Sockolow, Paulina PY - 2016/02/24 TI - Conceptual Models in Health Informatics Research: A Literature Review and Suggestions for Development JO - JMIR Med Inform SP - e7 VL - 4 IS - 1 KW - medical informatics KW - theoretical models KW - conceptual framework KW - conceptual model KW - design-based research KW - implementation research KW - evaluation research KW - health informatics KW - research design KW - research training N2 - Background: Contributing to health informatics research means using conceptual models that are integrative and explain the research in terms of the two broad domains of health science and information science. However, it can be hard for novice health informatics researchers to find exemplars and guidelines in working with integrative conceptual models. Objectives: The aim of this paper is to support the use of integrative conceptual models in research on information and communication technologies in the health sector, and to encourage discussion of these conceptual models in scholarly forums. Methods: A two-part method was used to summarize and structure ideas about how to work effectively with conceptual models in health informatics research that included (1) a selective review and summary of the literature of conceptual models; and (2) the construction of a step-by-step approach to developing a conceptual model. 
Results: The seven-step methodology for developing conceptual models in health informatics research explained in this paper involves (1) acknowledging the limitations of health science and information science conceptual models; (2) giving a rationale for one's choice of integrative conceptual model; (3) explicating a conceptual model verbally and graphically; (4) seeking feedback about the conceptual model from stakeholders in both the health science and information science domains; (5) aligning a conceptual model with an appropriate research plan; (6) adapting a conceptual model in response to new knowledge over time; and (7) disseminating conceptual models in scholarly and scientific forums. Conclusions: Making explicit the conceptual model that underpins a health informatics research project can contribute to increasing the number of well-formed and strongly grounded health informatics research projects. This explication has distinct benefits for researchers in training, research teams, and researchers and practitioners in information, health, and other disciplines. UR - http://medinform.jmir.org/2016/1/e7/ UR - http://dx.doi.org/10.2196/medinform.5021 UR - http://www.ncbi.nlm.nih.gov/pubmed/26912288 ID - info:doi/10.2196/medinform.5021 ER - TY - JOUR AU - Van de Velde, Stijn AU - Macken, Lieve AU - Vanneste, Koen AU - Goossens, Martine AU - Vanschoenbeek, Jan AU - Aertgeerts, Bert AU - Vanopstal, Klaar AU - Vander Stichele, Robert AU - Buysschaert, Joost PY - 2015/10/09 TI - Technology for Large-Scale Translation of Clinical Practice Guidelines: A Pilot Study of the Performance of a Hybrid Human and Computer-Assisted Approach JO - JMIR Med Inform SP - e33 VL - 3 IS - 4 KW - practice guidelines as topic KW - translations KW - technology KW - education, medical, continuing KW - evidence-based practice
N2 - Background: The construction of EBMPracticeNet, a national electronic point-of-care information platform in Belgium, began in 2011 to optimize quality of care by promoting evidence-based decision making. The project involved, among other tasks, the translation of 940 EBM Guidelines of Duodecim Medical Publications from English into Dutch and French. Considering the scale of the translation process, it was decided to make use of computer-aided translation performed by certificated translators with limited expertise in medical translation. Our consortium used a hybrid approach, involving a human translator supported by a translation memory (using SDL Trados Studio), terminology recognition (using SDL MultiTerm terminology databases) from medical terminology databases, and support from online machine translation. This resulted in a validated translation memory, which is now in use for the translation of new and updated guidelines. Objective: The objective of this experiment was to evaluate the performance of the hybrid human and computer-assisted approach in comparison with translation unsupported by translation memory and terminology recognition. A comparison was also made with the translation efficiency of an expert medical translator. Methods: We conducted a pilot study in which two sets of 30 new and 30 updated guidelines were randomized to one of three groups. Comparable guidelines were translated (1) by certificated junior translators without medical specialization using the hybrid method, (2) by an experienced medical translator without this support, and (3) by the same junior translators without the support of the validated translation memory. A medical proofreader who was blinded for the translation procedure evaluated the translated guidelines for acceptability and adequacy. Translation speed was measured by recording translation and post-editing time. 
The human translation edit rate was calculated as a metric to evaluate the quality of the translation. A further evaluation was made of translation acceptability and adequacy. Results: The average number of words per guideline was 1195 and the mean total translation time was 100.2 minutes/1000 words. No meaningful differences were found in the translation speed for new guidelines. The translation of updated guidelines was 59 minutes/1000 words faster (95% CI 2-115; P=.044) in the computer-aided group. Revisions due to terminology accounted for one third of the overall revisions by the medical proofreader. Conclusions: Use of the hybrid human and computer-aided translation by a non-expert translator makes the translation of updates of clinical practice guidelines faster and cheaper because of the benefits of translation memory. For the translation of new guidelines, there was no apparent benefit in comparison with the efficiency of translation unsupported by translation memory (whether by an expert or non-expert translator). UR - http://medinform.jmir.org/2015/4/e33/ UR - http://dx.doi.org/10.2196/medinform.4450 UR - http://www.ncbi.nlm.nih.gov/pubmed/26453372 ID - info:doi/10.2196/medinform.4450 ER - TY - JOUR AU - Penteado, Pires Silvio AU - Bento, Ferreira Ricardo AU - Battistella, Rizzo Linamara AU - Silva, Manami Sara AU - Sooful, Prasha PY - 2014/09/02 TI - Use of the Satisfaction With Amplification in Daily Life Questionnaire to Assess Patient Satisfaction Following Remote Hearing Aid Adjustments (Telefitting) JO - JMIR Med Inform SP - e18 VL - 2 IS - 2 KW - audiology KW - hearing aids KW - hearing loss KW - telemedicine KW - correction of hearing impairment KW - public policy KW - prosthesis fitting KW - questionnaires KW - quality improvement N2 - Background: Hearing loss can affect approximately 15% of the pediatric population and up to 40% of the adult population. 
The gold standard of treatment for hearing loss is amplification of hearing thresholds by means of a hearing aid instrument. A hearing aid is an electronic device equipped with a topology of only three major components of aggregate cost. The gold standard of hearing aid fittings is face-to-face appointments in hearing aid centers, clinics, or hospitals. Telefitting encompasses the programming and adjustments of hearing aid settings remotely. Fitting hearing aids remotely is a relatively simple procedure, using minimal computer hardware and Internet access. Objective: This project aimed to examine the feasibility and outcomes of remote hearing aid adjustments (telefitting) by assessing patient satisfaction via the Portuguese version of the Satisfaction With Amplification in Daily Life (SADL) questionnaire. Methods: The Brazilian Portuguese version of the SADL was used in this experimental research design. Participants were randomly selected through the Rehabilitation Clinical (Espaco Reouvir) of the Otorhinolaryngology Department Medical School University of Sao Paulo. Of the 8 participants in the study, 5 were female and 3 were male, with a mean age of 71.5 years. The design consisted of two face-to-face sessions performed within 15 working days of each other. The remote assistance took place 15 days later. Results: The average scores from this study are above the mean scores from the original SADL normative data. These indicate a high level of satisfaction in participants who were fitted remotely. Conclusions: The use of an evaluation questionnaire is a simple yet effective method to objectively assess the success of a remote fitting. Questionnaire outcomes can help hearing stakeholders improve the National Policy on Hearing Health Care in Brazil. The results of this project indicated that patient satisfaction levels of those fitted remotely were comparable to those fitted in the conventional manner, that is, face-to-face. 
UR - http://medinform.jmir.org/2014/2/e18/ UR - http://dx.doi.org/10.2196/medinform.2769 UR - http://www.ncbi.nlm.nih.gov/pubmed/25599909 ID - info:doi/10.2196/medinform.2769 ER - TY - JOUR AU - Jones, Ray PY - 2013/09/02 TI - Development of a Questionnaire and Cross-Sectional Survey of Patient eHealth Readiness and eHealth Inequalities JO - Med 2.0 SP - e9 VL - 2 IS - 2 KW - eHealth readiness KW - eHealth inequalities KW - digital divide KW - questionnaire development N2 - Background: Many speak of the digital divide, but variation in the opportunity of patients to use the Internet for health (patient eHealth readiness) is not a binary difference, rather a distribution influenced by personal capability, provision of services, support, and cost. Digital divisions in health have been addressed by various initiatives, but there was no comprehensive validated measure to know if they are effective that could be used in randomized controlled trials (RCTs) covering both non-Internet-users and the range of Internet-users. Objective: The aim of this study was to develop and validate a self-completed questionnaire and scoring system to assess patient eHealth readiness by examining the spread of scores and eHealth inequalities. The intended use of this questionnaire and scores is in RCTs of interventions aiming to improve patient eHealth readiness and reduce eHealth inequalities. Methods: Based on four factors identified from the literature, a self-completed questionnaire, using a pragmatic combination of factual and attitude questions, was drafted and piloted in three stages. This was followed by a final population-based, cross-sectional household survey of 344 people used to refine the scoring system. Results: The Patient eHealth Readiness Questionnaire (PERQ) includes questions used to calculate four subscores: patients' 
perception of (1) provision, (2) their personal ability and confidence, (3) their interpersonal support, and (4) relative costs in using the Internet for health. These were combined into an overall PERQ score (0-9) which could be used in intervention studies. Reduction in standard deviation of the scores represents reduction in eHealth inequalities. Conclusions: PERQ appears acceptable for participants in British studies. The scores produced appear valid and will enable assessment of the effectiveness of interventions to improve patient eHealth readiness and reduce eHealth inequalities. Such methods need continued evolution and redevelopment for other environments. Full documentation and data have been published to allow others to develop the tool further. UR - http://www.medicine20.com/2013/2/e9/ UR - http://dx.doi.org/10.2196/med20.2559 UR - http://www.ncbi.nlm.nih.gov/pubmed/25075244 ID - info:doi/10.2196/med20.2559 ER - TY - JOUR AU - Blanson Henkemans, A. Olivier AU - Dusseldorp, ML Elise AU - Keijsers, FEM Jolanda AU - Kessens, M. Judith AU - Neerincx, A. Mark AU - Otten, Wilma PY - 2013/08/22 TI - Validity and Reliability of the eHealth Analysis and Steering Instrument JO - Med 2.0 SP - e8 VL - 2 IS - 2 KW - self-care KW - psychometrics KW - validity KW - reliability KW - scale analysis KW - effectiveness KW - self-management support N2 - Background: eHealth services can contribute to individuals' self-management, that is, performing lifestyle-related activities and decision making, to maintain a good health, or to mitigate the effect of an (chronic) illness on their health. But how effective are these services? Conducting a randomized controlled trial (RCT) is the golden standard to answer such a question, but takes extensive time and effort. The eHealth Analysis and Steering Instrument (eASI) offers a quick, but not dirty alternative. 
The eASI surveys how eHealth services score on 3 dimensions (ie, utility, usability, and content) and 12 underlying categories (ie, insight in health condition, self-management decision making, performance of self-management, involving the social environment, interaction, personalization, persuasion, description of health issue, factors of influence, goal of eHealth service, implementation, and evidence). However, there are no data on its validity and reliability. Objective: The objective of our study was to assess the construct and predictive validity and interrater reliability of the eASI. Methods: We found 16 eHealth services supporting self-management published in the literature, whose effectiveness was evaluated in an RCT and the service itself was available for rating. Participants (N=16) rated these services with the eASI. We analyzed the correlation of eASI items with the underlying three dimensions (construct validity), the correlation between the eASI score and the eHealth services' effect size observed in the RCT (predictive validity), and the interrater agreement. Results: Three items did not fit with the other items and dimensions and were removed from the eASI; 4 items were replaced from the utility to the content dimension. The interrater reliabilities of the dimensions and the total score were moderate (total, κ=.53, and content, κ=.55) and substantial (utility, κ=.69, and usability, κ=.63). The adjusted eASI explained variance in the eHealth services' effect sizes (R2=.31, P<.001), as did the dimensions utility (R2=.49, P<.001) and usability (R2=.18, P=.021). Usability explained variance in the effect size on health outcomes (R2=.13, P=.028). Conclusions: After removing 3 items and replacing 4 items to another dimension, the eASI (3 dimensions, 11 categories, and 32 items) has a good construct validity and predictive validity. The eASI scales are moderately to highly reliable. 
Accordingly, the eASI can predict how effective an eHealth service is in regard to supporting self-management. Due to a small pool of available eHealth services, it is advised to reevaluate the eASI in the future with more services. UR - http://www.medicine20.com/2013/2/e8/ UR - http://dx.doi.org/10.2196/med20.2571 UR - http://www.ncbi.nlm.nih.gov/pubmed/25075243 ID - info:doi/10.2196/med20.2571 ER -