Journal Description

JMIR Medical Informatics (JMI, ISSN 2291-9694; Editor-in-chief: Christian Lovis MD MPH FACMI) is a PubMed/SCIE-indexed journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and ehealth infrastructures and implementation. In June 2020, the journal received an impact factor of 2.58.

Published by JMIR Publications, JMIR Medical Informatics has a focus on applied, translational research, with a broad readership including clinicians, CIOs, engineers, industry and health informatics professionals.

JMIR Medical Informatics adheres to rigorous quality standards, involving a rapid and thorough peer-review process, professional copyediting, professional production of PDF, XHTML, and XML proofs (ready for deposit in PubMed Central/PubMed).


Recent Articles:

  • Source: freepik; Copyright: freepik; License: Licensed by JMIR.

    Decompensation in Critical Care: Early Prediction of Acute Heart Failure Onset


    Background: Heart failure is a leading cause of mortality and morbidity worldwide. Acute heart failure, broadly defined as the rapid onset of new or worsening signs and symptoms of heart failure, often requires hospitalization and admission to the intensive care unit (ICU). This acute condition is highly heterogeneous and less well understood than chronic heart failure. The ICU, through detailed and continuously monitored patient data, provides an opportunity to retrospectively analyze decompensation and heart failure to evaluate physiological states and patient outcomes. Objective: The goal of this study is to examine the prevalence of cardiovascular risk factors among those admitted to ICUs and to evaluate combinations of clinical features that are predictive of decompensation events, such as the onset of acute heart failure, using machine learning techniques. To accomplish this objective, we leveraged tele-ICU data from over 200 hospitals across the United States. Methods: We evaluated the feasibility of predicting decompensation soon after ICU admission for 26,534 patients admitted without a history of heart failure but with specific heart failure risk factors (ie, coronary artery disease, hypertension, and myocardial infarction) and 96,350 patients admitted without risk factors, using remotely monitored laboratory values, vital signs, and discrete physiological measurements. Multivariate logistic regression and random forest models were applied to predict decompensation and highlight important features from combinations of model inputs from dissimilar data. Results: The most prevalent risk factor in our data set was hypertension, although most patients diagnosed with heart failure were admitted to the ICU without a risk factor. 
The highest heart failure prediction accuracy was 0.951, and the highest area under the receiver operating characteristic curve was 0.9503 with random forest and combined vital signs, laboratory values, and discrete physiological measurements. Random forest feature importance also highlighted combinations of several discrete physiological features and laboratory measures as most indicative of decompensation. Timeline analysis of aggregate vital signs revealed a point of diminishing returns where additional vital signs data did not continue to improve results. Conclusions: Heart failure risk factors are common in tele-ICU data, although most patients who were diagnosed with heart failure later in an ICU stay presented without risk factors, making prediction of decompensation critical. Decompensation was predicted with reasonable accuracy using tele-ICU data, and the optimal data extraction window for time series vital signs data was identified near 200 minutes. Overall, the results suggest that combinations of laboratory measurements and vital signs are viable for early and continuous prediction of patient decompensation.
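The area under the receiver operating characteristic curve reported above has a simple probabilistic reading: it is the chance that a randomly chosen decompensating patient receives a higher risk score than a randomly chosen stable patient. A minimal, dependency-free sketch of that computation (the labels and risk scores below are hypothetical, not from the study):

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive case outscores a randomly chosen negative
    case, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical decompensation risk scores for six ICU stays
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(auroc(labels, scores))  # 8/9 ≈ 0.889
```

A perfect ranking yields 1.0; a random one hovers near 0.5, which is why the reported 0.9503 indicates a strong separation of decompensating from stable patients.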

  • Looking for evidence-based treatments for a well-known disease using BMJ Best Practice. Source: BMJ Publishing Group Limited; Copyright: BMJ Publishing Group Limited; License: Licensed by the authors.

    Semantic Deep Learning: Prior Knowledge and a Type of Four-Term Embedding Analogy to Acquire Treatments for Well-Known Diseases


    Background: How to treat a disease remains the most common type of clinical question. Obtaining evidence-based answers from biomedical literature is difficult. Analogical reasoning with embeddings from deep learning (embedding analogies) may extract such biomedical facts, although the state of the art focuses on pair-based proportional (pairwise) analogies such as man:woman::king:queen (“queen = −man +king +woman”). Objective: This study aimed to systematically extract disease treatment statements with a Semantic Deep Learning (SemDeep) approach underpinned by prior knowledge and another type of 4-term analogy (other than pairwise). Methods: As preliminaries, we investigated Continuous Bag-of-Words (CBOW) embedding analogies in a common-English corpus with five lines of text and observed a type of 4-term analogy (not pairwise) applying the 3CosAdd formula and relating the semantic fields person and death: “dagger = −Romeo +die +died” (search query: −Romeo +die +died). Our SemDeep approach worked with pre-existing items of knowledge (what is known) to make inferences sanctioned by a 4-term analogy (search query −x +z1 +z2) from CBOW and Skip-gram embeddings created with a PubMed systematic reviews subset (PMSB dataset). Stage 1: Knowledge acquisition. Obtaining a set of terms, candidate y, from embeddings using vector arithmetic. Some n-gram pairs, ranked by cosine similarity and validated with evidence (prior knowledge), are the input for the 3CosAdd formula, seeking a type of 4-term analogy relating the semantic fields disease and treatment. Stage 2: Knowledge organization. Identification of candidates sanctioned by the analogy belonging to the semantic field treatment and mapping these candidates to Unified Medical Language System (UMLS) Metathesaurus concepts with MetaMap. A concept pair is a brief disease treatment statement (biomedical fact). Stage 3: Knowledge validation. 
An evidence-based evaluation followed by human validation of biomedical facts potentially useful for clinicians. Results: We obtained 5352 n-gram pairs from 446 search queries by applying the 3CosAdd. The microaveraging performance of MetaMap for candidate y belonging to the semantic field treatment was F-measure=80.00% (precision=77.00%, recall=83.25%). We developed an empirical heuristic with some predictive power for clinical winners, that is, search queries bringing candidate y with evidence of a therapeutic intent for target disease x. The search queries -asthma +inhaled_corticosteroids +inhaled_corticosteroid and -epilepsy +valproate +antiepileptic_drug were clinical winners, finding eight evidence-based beneficial treatments. Conclusions: Extracting treatments with therapeutic intent by analogical reasoning from embeddings (423K n-grams from the PMSB dataset) is an ambitious goal. Our SemDeep approach is knowledge-based, underpinned by embedding analogies that exploit prior knowledge. Biomedical facts from embedding analogies (4-term type, not pairwise) are potentially useful for clinicians. The heuristic offers a practical way to discover beneficial treatments for well-known diseases. Learning from deep learning models does not require a massive amount of data. Embedding analogies are not limited to pairwise analogies; hence, analogical reasoning with embeddings is underexploited.
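The 3CosAdd search described above (query −x +z1 +z2) can be sketched in a few lines: it returns the vocabulary term whose vector is most similar to the sum of the two known terms minus the query term. The toy 3-dimensional embeddings and vocabulary below are entirely hypothetical and merely stand in for the CBOW/Skip-gram vectors trained on the PMSB dataset:

```python
import math

def cos(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def three_cos_add(emb, x, z1, z2):
    # 3CosAdd: find y maximizing cos(y, -x + z1 + z2),
    # excluding the three query terms themselves
    target = [b + c - a for a, b, c in zip(emb[x], emb[z1], emb[z2])]
    candidates = (w for w in emb if w not in (x, z1, z2))
    return max(candidates, key=lambda w: cos(emb[w], target))

# Toy 3-dimensional embeddings (entirely hypothetical) in which
# treatment-like terms cluster on the second axis
emb = {
    "asthma":         [1.00, 0.10, 0.00],
    "salbutamol":     [0.10, 1.00, 0.10],
    "inhaler":        [0.20, 0.90, 0.10],
    "corticosteroid": [0.15, 0.95, 0.05],
    "fracture":       [0.00, 0.10, 1.00],
}
print(three_cos_add(emb, "asthma", "salbutamol", "inhaler"))  # corticosteroid
```

Subtracting the disease vector and adding two treatment vectors pulls the target toward the treatment semantic field, which is the intuition behind the 4-term analogy the authors exploit.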

  • Maier Institute Dementia Research Dashboard. Source: The Authors / Placeit; Copyright: The Authors / Placeit; License: Licensed by JMIR.

    Analysis of Benzodiazepine Prescription Practices in Elderly Appalachians with Dementia via the Appalachian Informatics Platform: Longitudinal Study


    Background: Caring for the growing dementia population with complex health care needs in West Virginia has been challenging due to its large, predominantly rural-dwelling geriatric population and limited resource availability. Objective: This paper aims to illustrate the application of an informatics platform to drive dementia research and quality care through a preliminary study of benzodiazepine (BZD) prescription patterns and their effects on health care use by geriatric patients. Methods: The Maier Institute Data Mart, which contains clinical and billing data on patients aged 65 years and older (N=98,970) seen within our clinics and hospital, was created. Relevant variables were analyzed to identify BZD prescription patterns and calculate related charges and emergency department (ED) use. Results: Nearly one-third (4346/13,910, 31.24%) of patients with dementia received at least one BZD prescription, 20% more than those without dementia. More women than men received at least one BZD prescription. On average, patients with dementia and at least one BZD prescription sustained higher charges and visited the ED more often than those without a prescription. Conclusions: The Appalachian Informatics Platform has the potential to enhance dementia care and research through a deeper understanding of dementia, data enrichment, risk identification, and care gap analysis.

  • Source: Pexels; Copyright: Pixabay; License: Public Domain (CC0).

    Assessment of the Robustness of Convolutional Neural Networks in Labeling Noise by Using Chest X-Ray Images From Multiple Centers


    Background: Computer-aided diagnosis on chest x-ray images using deep learning is a widely studied modality in medicine. Many studies are based on public datasets, such as the National Institutes of Health (NIH) dataset and the Stanford CheXpert dataset. However, these datasets were preprocessed by classical natural language processing, which may introduce a certain extent of label error. Objective: This study aimed to investigate the robustness of deep convolutional neural networks (CNNs) for binary classification of posteroanterior chest x-rays under randomly incorrect labeling. Methods: We trained and validated the CNN architecture with different label noise levels in 3 datasets, namely, Asan Medical Center-Seoul National University Bundang Hospital (AMC-SNUBH), NIH, and CheXpert, and tested the models with each test set. Diseases of each chest x-ray in our dataset were confirmed by a thoracic radiologist using computed tomography (CT). Receiver operating characteristic (ROC) and area under the curve (AUC) were evaluated in each test. Randomly chosen chest x-rays from the public datasets were evaluated by 3 physicians and 1 thoracic radiologist. Results: In comparison with the public NIH and CheXpert datasets, where AUCs did not drop significantly until 16% label noise, the AUC of the AMC-SNUBH dataset decreased significantly from 2% label noise. Evaluation of the public datasets by 3 physicians and 1 thoracic radiologist showed a label accuracy of 65%-80%. Conclusions: The deep learning–based computer-aided diagnosis model is sensitive to label noise, and computer-aided diagnosis with inaccurate labels is not credible. Furthermore, open datasets such as NIH and CheXpert need to be distilled before being used for deep learning–based computer-aided diagnosis.
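The random incorrect labeling used in these experiments amounts to flipping a fixed fraction of binary labels before training. A small sketch of such a noise injector (the function name, seed, and data are illustrative, not taken from the study):

```python
import random

def corrupt_labels(labels, noise_level, seed=0):
    """Randomly flip a fraction `noise_level` of binary labels,
    simulating the incorrect-label experiments described above."""
    rng = random.Random(seed)
    n_flip = round(noise_level * len(labels))
    idx = rng.sample(range(len(labels)), n_flip)  # distinct positions
    noisy = list(labels)
    for i in idx:
        noisy[i] = 1 - noisy[i]
    return noisy

clean = [0, 1] * 50                      # 100 hypothetical labels
noisy = corrupt_labels(clean, 0.16)      # 16% label noise
print(sum(a != b for a, b in zip(clean, noisy)))  # 16
```

Training the same architecture at increasing noise levels (2%, 4%, ..., 16%) and comparing test AUCs is the basic protocol the abstract describes.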

  • Monitoring patient vital signs on tablet PC. Source: 123RF; Copyright: scanrail; License: Creative Commons Attribution + Noncommercial + NoDerivatives (CC-BY-NC-ND).

    Prediction of Cardiac Arrest in the Emergency Department Based on Machine Learning and Sequential Characteristics: Model Development and Retrospective...


    Background: The development and application of clinical prediction models using machine learning in clinical decision support systems is attracting increasing attention. Objective: The aims of this study were to develop a prediction model for cardiac arrest in the emergency department (ED) using machine learning and sequential characteristics and to validate its clinical usefulness. Methods: This retrospective study was conducted with ED patients at a tertiary academic hospital who suffered cardiac arrest. To resolve the class imbalance problem, sampling was performed using propensity score matching. The data set was chronologically allocated to a development cohort (years 2013 to 2016) and a validation cohort (year 2017). We trained three machine learning algorithms with repeated 10-fold cross-validation. Results: The main performance parameters were the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). The random forest algorithm (AUROC 0.97; AUPRC 0.86) outperformed the recurrent neural network (AUROC 0.95; AUPRC 0.82) and the logistic regression algorithm (AUROC 0.92; AUPRC 0.72). The performance of the model was maintained over time, with the AUROC remaining at least 80% across the monitored time points during the 24 hours before event occurrence. Conclusions: We developed a prediction model of cardiac arrest in the ED using machine learning and sequential characteristics. Its clinical usefulness was validated through chronological visualization focused on clinical usability.
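Propensity score matching, used here to resolve class imbalance, is commonly implemented as greedy 1:1 nearest-neighbour matching within a caliper: each case is paired with the closest unused control, and pairs farther apart than the caliper are discarded. A minimal sketch under that assumption (the scores and caliper value below are hypothetical; the study's exact matching procedure is not specified in the abstract):

```python
def greedy_match(case_scores, control_scores, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity scores:
    each case is paired with the closest unused control within the
    caliper, yielding a balanced case-control cohort."""
    available = dict(enumerate(control_scores))
    pairs = []
    for i, ps in sorted(enumerate(case_scores), key=lambda t: t[1]):
        if not available:
            break
        # nearest remaining control by absolute score distance
        j = min(available, key=lambda k: abs(available[k] - ps))
        if abs(available[j] - ps) <= caliper:
            pairs.append((i, j))
            del available[j]
    return pairs

cases    = [0.80, 0.30]
controls = [0.78, 0.33, 0.10]
print(greedy_match(cases, controls))  # [(1, 1), (0, 0)]
```

Matching before model training keeps the rare cardiac-arrest class from being swamped by the majority class during cross-validation.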

  • Source: freepik; Copyright: pressfoto; License: Licensed by JMIR.

    Improving Diagnostic Classification of Stillbirths and Neonatal Deaths Using ICD-PM (International Classification of Diseases for Perinatal Mortality) Codes:...


    Background: Stillbirths and neonatal deaths have long been imperfectly classified and recorded worldwide. In Hong Kong, the current code system is deficient (>90% of cases with unknown causes) in providing the diagnoses of perinatal mortality cases. Objective: The objective of this study was to apply the International Classification of Diseases for Perinatal Mortality (ICD-PM) system to existing perinatal death data. Further, the aim was to assess whether there was any change in the classifications of perinatal deaths compared with the existing classification system and to identify any areas in which future interventions can be made. Methods: We applied the ICD-PM (with International Statistical Classification of Diseases and Related Health Problems, 10th Revision) code system to existing perinatal death data in Kwong Wah Hospital, Hong Kong, to improve diagnostic classification. The study included stillbirths (after 24 weeks' gestation) and neonatal deaths (from birth to 28 days). The retrospective data (5 years) from May 1, 2012, to April 30, 2017, were recoded by the principal investigator (HML) applying the ICD-PM, then validated by an overseas expert (EA) after she reviewed the detailed case summaries. The prospective application of the ICD-PM from May 1, 2017, to April 30, 2019, was performed during the monthly multidisciplinary perinatal meetings and then also validated by EA for agreement. Results: We analyzed the data of 34,920 deliveries, and 119 cases were included for analysis (92 stillbirths and 27 neonatal deaths). The overall agreement with EA of our codes using the ICD-PM was 93.2% (111/119); 92% (78/85) for the 5 years of retrospective codes and 97% (33/34) for the 2 years of prospective codes (P=.44). After the application of the ICD-PM, the overall proportion of unknown causes of perinatal mortality dropped from 34.5% (41/119) to 10.1% (12/119) of cases (P<.001). 
Conclusions: Using the ICD-PM would lead to a better classification of perinatal deaths, reduce the proportion of unknown diagnoses, and clearly link the maternal conditions with these perinatal deaths.

  • Source: Pexels; Copyright: CDC: Center for Disease Control; License: Licensed by JMIR.

    A Predictive Model Based on Machine Learning for the Early Detection of Late-Onset Neonatal Sepsis: Development and Observational Study


    Background: Neonatal sepsis is associated with most cases of mortality and morbidity in the neonatal intensive care unit (NICU). Many studies have developed prediction models for the early diagnosis of bloodstream infections in newborns, but there are limitations to data collection and management because these models are based on high-resolution waveform data. Objective: The aim of this study was to examine the feasibility of a prediction model by using noninvasive vital sign data and machine learning technology. Methods: We used electronic medical record data from intensive care units published in the Medical Information Mart for Intensive Care III (MIMIC-III) clinical database. The late-onset neonatal sepsis (LONS) prediction algorithm using our proposed forward feature selection technique was based on NICU inpatient data and was designed to detect clinical sepsis 48 hours before occurrence. The performance of this prediction model was evaluated using various feature selection algorithms and machine learning models. Results: The performance of the LONS prediction model was found to be comparable to that of prediction models that use invasive data such as high-resolution vital sign data, blood gas estimations, blood cell counts, and pH levels. The area under the receiver operating characteristic curve of the 48-hour prediction model was 0.861 and that of the onset detection model was 0.868. The main features that could be vital candidate markers for clinical neonatal sepsis were blood pressure, oxygen saturation, and body temperature. Feature generation using kurtosis and skewness of the features showed the highest performance. Conclusions: The findings of our study confirmed that the LONS prediction model based on machine learning can be developed using vital sign data that are regularly measured in clinical settings. Future studies should conduct external validation by using different types of data sets and actual clinical verification of the developed model.
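The kurtosis- and skewness-based feature generation reported as most effective can be illustrated with plain sample moments computed over a vital-sign window: skewness captures asymmetry of the readings, excess kurtosis captures how heavy-tailed they are. A minimal sketch (the heart-rate values below are hypothetical):

```python
import math

def moments(xs):
    """Sample skewness and excess kurtosis of a vital-sign window,
    the kind of derived features the abstract reports as most useful."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    sd = math.sqrt(var)
    skew = sum((x - mean) ** 3 for x in xs) / (n * sd ** 3)
    kurt = sum((x - mean) ** 4 for x in xs) / (n * var ** 2) - 3
    return skew, kurt

# Hypothetical neonatal heart-rate window (beats per minute)
hr = [142, 144, 145, 143, 160]
skew, kurt = moments(hr)
```

Such window statistics turn a raw time series of routine measurements into fixed-length features a classifier can consume, without requiring invasive high-resolution waveforms.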

  • Source: The Authors/Placeit; Copyright: The Authors/Placeit; License: Licensed by JMIR.

    Document-Level Biomedical Relation Extraction Using Graph Convolutional Network and Multihead Attention: Algorithm Development and Validation


    Background: Automatically extracting relations between chemicals and diseases plays an important role in biomedical text mining. Chemical-disease relation (CDR) extraction aims at extracting complex semantic relationships between entities in documents, which contain intrasentence and intersentence relations. Most previous methods did not consider dependency syntactic information across sentences, which is very valuable for the relation extraction task, in particular for extracting intersentence relations accurately. Objective: In this paper, we propose a novel end-to-end neural network based on the graph convolutional network (GCN) and multihead attention, which makes use of dependency syntactic information across sentences to improve the CDR extraction task. Methods: To improve the performance of intersentence relation extraction, we constructed a document-level dependency graph to capture the dependency syntactic information across sentences. The GCN is applied to capture the feature representation of the document-level dependency graph. The multihead attention mechanism is employed to learn the relatively important context features from different semantic subspaces. To enhance the input representation, a deep context representation is used in our model instead of traditional word embeddings. Results: We evaluated our method on the CDR corpus. The experimental results show that our method achieves an F-measure of 63.5%, which is superior to other state-of-the-art methods. At the intrasentence level, our method achieves a precision, recall, and F-measure of 59.1%, 81.5%, and 68.5%, respectively. At the intersentence level, our method achieves a precision, recall, and F-measure of 47.8%, 52.2%, and 49.9%, respectively. Conclusions: The GCN model can effectively exploit cross-sentence dependency information to improve the performance of intersentence CDR extraction. 
Both the deep context representation and multihead attention are helpful in the CDR extraction task.
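A single GCN propagation step over a document-level dependency graph follows the standard form H′ = ReLU(Â·H·W), where Â is the symmetrically normalized adjacency matrix with self-loops (the Kipf-Welling formulation). A minimal sketch on a toy three-token graph; the features, weights, and graph are hypothetical, and the paper's full architecture (multihead attention, deep context representation) is not reproduced here:

```python
import math

def matmul(A, B):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def gcn_layer(adj, H, W):
    """One graph-convolution step H' = ReLU(A_hat @ H @ W), where
    A_hat is the symmetrically normalized adjacency with self-loops."""
    n = len(adj)
    A = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
         for i in range(n)]                       # add self-loops
    deg = [sum(row) for row in A]                 # node degrees
    A_hat = [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]                   # D^-1/2 (A+I) D^-1/2
    out = matmul(matmul(A_hat, H), W)
    return [[max(0.0, v) for v in row] for row in out]

# Toy 3-token dependency graph: token 0 <-> 1, token 1 <-> 2
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # hypothetical token features
W = [[1.0, 0.0], [0.0, 1.0]]               # identity weights for illustration
H2 = gcn_layer(adj, H, W)
```

Each token's new representation mixes in its dependency neighbours, which is how cross-sentence syntactic structure reaches the relation classifier.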

  • Source: Freepik; Copyright: Freepik; License: Licensed by JMIR.

    Identifying and Predicting Intentional Self-Harm in Electronic Health Record Clinical Notes: Deep Learning Approach


    Background: Suicide is an important public health concern in the United States and around the world. There has been significant work examining machine learning approaches to identify and predict intentional self-harm and suicide using existing data sets. With recent advances in computing, deep learning applications in health care are gaining momentum. Objective: This study aimed to leverage the information in clinical notes using deep neural networks (DNNs) to (1) improve the identification of patients treated for intentional self-harm and (2) predict future self-harm events. Methods: We extracted clinical text notes from electronic health records (EHRs) of 835 patients with International Classification of Diseases (ICD) codes for intentional self-harm and 1670 matched controls who never had any intentional self-harm ICD codes. The data were divided into training and holdout test sets. We tested a number of algorithms on clinical notes associated with the intentional self-harm codes using the training set, including several traditional bag-of-words–based models and 2 DNN models: a convolutional neural network (CNN) and a long short-term memory model. We also evaluated the predictive performance of the DNNs on a subset of patients who had clinical notes 1 to 6 months before the first intentional self-harm event. Finally, we evaluated the impact of a pretrained model using Word2vec (W2V) on performance. Results: The area under the receiver operating characteristic curve (AUC) for the CNN on the phenotyping task, that is, the detection of intentional self-harm in clinical notes concurrent with the events, was 0.999, with an F1 score of 0.985. In the predictive task, the CNN achieved the highest performance with an AUC of 0.882 and an F1 score of 0.769. Although pretraining with W2V shortened the DNN training time, it did not improve performance. 
Conclusions: The strong performance on the first task, namely, phenotyping based on clinical notes, suggests that such models could be used effectively for surveillance of intentional self-harm in clinical text in an EHR. The modest performance on the predictive task notwithstanding, the results using DNN models on clinical text alone are competitive with other reports in the literature using risk factors from structured EHR data.

  • Copyright: Priscilla Du Preez; License: Licensed by JMIR.

    Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis


    Background: Depression is a serious personal and public mental health problem. Self-reporting is the main method used to diagnose depression and to determine the severity of depression. However, it is not easy to discover patients with depression owing to feelings of shame in disclosing or discussing their mental health conditions with others. Moreover, self-reporting is time-consuming, and usually leads to missing a certain number of cases. Therefore, automatic discovery of patients with depression from other sources such as social media has been attracting increasing attention. Social media, as one of the most important daily communication systems, connects large quantities of people, including individuals with depression, and provides a channel to discover patients with depression. In this study, we investigated deep-learning methods for depression risk prediction using data from Chinese microblogs, which have potential to discover more patients with depression and to trace their mental health conditions. Objective: The aim of this study was to explore the potential of state-of-the-art deep-learning methods on depression risk prediction from Chinese microblogs. Methods: Deep-learning methods with pretrained language representation models, including bidirectional encoder representations from transformers (BERT), robustly optimized BERT pretraining approach (RoBERTa), and generalized autoregressive pretraining for language understanding (XLNET), were investigated for depression risk prediction, and were compared with previous methods on a manually annotated benchmark dataset. Depression risk was assessed at four levels from 0 to 3, where 0, 1, 2, and 3 denote no inclination, and mild, moderate, and severe depression risk, respectively. The dataset was collected from the Chinese microblog Weibo. 
We also compared different deep-learning methods with pretrained language representation models in two settings: (1) publicly released pretrained language representation models, and (2) language representation models further pretrained on a large-scale unlabeled dataset collected from Weibo. Precision, recall, and F1 scores were used as performance evaluation measures. Results: Among the three deep-learning methods, BERT achieved the best performance with a microaveraged F1 score of 0.856. RoBERTa achieved the best performance with a macroaveraged F1 score of 0.424 on depression risk at levels 1, 2, and 3, which represents a new benchmark result on the dataset. The further pretrained language representation models demonstrated improvement over publicly released prediction models. Conclusions: We applied deep-learning methods with pretrained language representation models to automatically predict depression risk using data from Chinese microblogs. The experimental results showed that the deep-learning methods performed better than previous methods, and have greater potential to discover patients with depression and to trace their mental health conditions.
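The gap between the microaveraged F1 score (0.856) and the macroaveraged F1 score on levels 1-3 (0.424) reflects how the two averages treat rare classes: micro F1 pools counts across classes, so the majority class dominates, while macro F1 averages per-class scores, so rare severe-risk classes weigh as much as common ones. A small sketch with hypothetical per-class counts makes the distinction concrete:

```python
def f1(tp, fp, fn):
    """F1 score from true-positive, false-positive, false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def micro_macro_f1(counts):
    """Micro F1 pools tp/fp/fn across classes; macro F1 averages the
    per-class F1 scores, giving rare classes equal weight."""
    micro = f1(sum(tp for tp, _, _ in counts),
               sum(fp for _, fp, _ in counts),
               sum(fn for _, _, fn in counts))
    macro = sum(f1(*c) for c in counts) / len(counts)
    return micro, macro

# Hypothetical (tp, fp, fn) counts per depression risk level 1-3
counts = [(80, 10, 10), (10, 10, 10), (2, 4, 4)]
micro, macro = micro_macro_f1(counts)
```

With these toy counts, micro F1 exceeds macro F1 because the common class carries most of the pooled counts, mirroring the pattern in the reported results.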

  • Source: iStock by Getty Images; Copyright: Minerva Studio; License: Licensed by the authors.

    Predicting the Development of Type 2 Diabetes in a Large Australian Cohort Using Machine-Learning Techniques: Longitudinal Survey Study


    Background: Previous conventional models for the prediction of diabetes could be updated by incorporating the increasing amount of health data available and new risk prediction methodology. Objective: We aimed to develop a substantially improved diabetes risk prediction model using sophisticated machine-learning algorithms based on a large retrospective population cohort of over 230,000 people who were enrolled in the study during 2006-2017. Methods: We collected demographic, medical, behavioral, and incidence data for type 2 diabetes mellitus (T2DM) from 236,684 diabetes-free participants recruited from the 45 and Up Study. We predicted and compared the risk of diabetes onset in these participants at 3, 5, 7, and 10 years based on three machine-learning approaches and the conventional regression model. Results: Overall, 6.05% (14,313/236,684) of the participants developed T2DM during an average 8.8-year follow-up period. The 10-year diabetes incidence in men was 8.30% (8.08%-8.49%), which was significantly higher (odds ratio 1.37, 95% CI 1.32-1.41) than that in women at 6.20% (6.00%-6.40%). The incidence of T2DM was doubled in individuals with obesity (men: 17.78% [17.05%-18.43%]; women: 14.59% [13.99%-15.17%]) compared with that of nonobese individuals. The gradient boosting machine model showed the best performance among the four models (area under the curve of 79% in 3-year prediction and 75% in 10-year prediction). All machine-learning models identified BMI as the most significant factor contributing to diabetes onset, which explained 12%-50% of the variance in the prediction of diabetes. The model predicted that if BMI in obese and overweight participants could be hypothetically reduced to a healthy range, the 10-year probability of diabetes onset would be significantly reduced from 8.3% to 2.8% (P<.001). Conclusions: A one-time self-reported survey can accurately predict the risk of diabetes using a machine-learning approach. 
Achieving a healthy BMI can significantly reduce the risk of developing T2DM.
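The reported odds ratio of 1.37 for men versus women follows directly from the two incidence proportions (odds are p/(1−p), and the odds ratio is their quotient):

```python
def odds_ratio(p1, p2):
    """Odds ratio between two incidence proportions."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

# 10-year T2DM incidence from the abstract: 8.30% in men, 6.20% in women
print(round(odds_ratio(0.0830, 0.0620), 2))  # 1.37
```

Note that at these modest incidence levels the odds ratio (1.37) is close to, but slightly larger than, the simple risk ratio (8.30/6.20 ≈ 1.34).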

  • Source: freepik; Copyright: freepik; License: Licensed by JMIR.

    Development and Evaluation of a Smart Contract–Enabled Blockchain System for Home Care Service Innovation: Mixed Methods Study


    Background: In the home care industry, the assignment and tracking of care services are controlled by care centers that are centralized in nature and prone to inefficient information transmission. A lack of trust among the involved parties, information opaqueness, and extensive manual processing result in low process efficiency. Objective: This study aimed to explore and demonstrate the application of blockchain and smart contract technologies to innovate home care services and harvest the desired blockchain benefits of process transparency, traceability, and interoperability. Methods: An object-oriented analysis/design combined with a unified modeling language tool was used to construct the architecture of the proposed home care service system. System feasibility was evaluated via an implementation test, and a questionnaire survey was performed to collect opinions from home care service respondents knowledgeable about blockchain and smart contracts. Results: According to the comparative analysis results, the proposed design outperformed the existing system in terms of traceability, system efficiency, and process automation. Moreover, for the questionnaire survey, the quantitative analysis results showed that the proposed blockchain-based system had significantly (P<.001) higher mean scores (when compared with the existing system) in terms of important factors, including timeliness, workflow efficiency, automatic notifications, insurance functionality, and auditable traceability. In summary, blockchain-based home care service participants will be able to enjoy improved efficiency, better transparency, and higher levels of process automation. Conclusions: Blockchain and smart contracts can provide valuable benefits to the home care service industry via distributed data management and process automation. 
The proposed system enhances user experiences by mitigating human intervention and improving service interoperability, transparency/traceability, and real-time response to home care service events. Efforts in exploring and integrating blockchain-based home care services with emerging technologies, such as the internet of things and artificial intelligence, are expected to provide further benefits and therefore are subject to future research.


Latest Submissions Open for Peer-Review:

View All Open Peer Review Articles
  • What databases should we include in a comprehensive search strategy protocol for systematic reviews of artificial intelligence?

    Date Submitted: Jul 27, 2020

    Open Peer Review Period: Jul 27, 2020 - Sep 21, 2020

    This is a commentary on the article by Choudhury and Asan entitled “Role of Artificial Intelligence in Patient Safety Outcomes: Systematic Literature Review”.