Maintenance Notice

Due to necessary scheduled maintenance, the JMIR Publications website will be unavailable from Wednesday, July 01, 2020 at 8:00 PM to 10:00 PM EST. We apologize in advance for any inconvenience this may cause you.

Who will be affected?


Journal Description

JMIR Medical Informatics (JMI, ISSN 2291-9694; Impact Factor: 3.188) (Editor-in-chief: Christian Lovis MD MPH FACMI) is a PubMed/SCIE-indexed, top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, ehealth infrastructures and implementation.

Published by JMIR Publications, JMIR Medical Informatics has a focus on applied, translational research, with a broad readership including clinicians, CIOs, engineers, industry and health informatics professionals.

JMIR Medical Informatics adheres to rigorous quality standards, involving a rapid and thorough peer-review process, professional copyediting, professional production of PDF, XHTML, and XML proofs (ready for deposit in PubMed Central/PubMed).


Recent Articles:

  • Untitled. Source: Freepik; Copyright: freepik; URL:; License: Licensed by JMIR.

    Predicting Current Glycated Hemoglobin Levels in Adults From Electronic Health Records: Validation of Multiple Logistic Regression Algorithm


    Background: Electronic health record (EHR) systems generate large datasets that can significantly enrich the development of medical predictive models. Several attempts have been made to investigate the effect of glycated hemoglobin (HbA1c) elevation on the prediction of diabetes onset. However, there is still a need for validation of these models using EHR data collected from different populations. Objective: The aim of this study is to perform a replication study to validate, evaluate, and identify the strengths and weaknesses of replicating a predictive model that employed multiple logistic regression with EHR data to forecast the levels of HbA1c. The original study used data from a population in the United States and this differentiated replication used a population in Saudi Arabia. Methods: A total of 3 models were developed and compared with the model created in the original study. The models were trained and tested using a larger dataset from Saudi Arabia with 36,378 records. The 10-fold cross-validation approach was used for measuring the performance of the models. Results: Applying the method employed in the original study achieved an accuracy of 74% to 75% when using the dataset collected from Saudi Arabia, compared with 77% obtained from using the population from the United States. The results also show a different ranking of importance for the predictors between the original study and the replication. The order of importance for the predictors with our population, from the most to the least importance, is age, random blood sugar, estimated glomerular filtration rate, total cholesterol, non–high-density lipoprotein, and body mass index. Conclusions: This replication study shows that direct use of the models (calculators) created using multiple logistic regression to predict the level of HbA1c may not be appropriate for all populations. This study reveals that the weighting of the predictors needs to be calibrated to the population used. However, the study does confirm that replicating the original study using a different population can help with predicting the levels of HbA1c by using the predictors that are routinely collected and stored in hospital EHR systems.

  • Source: The Authors/Placeit; Copyright: The Authors/Placeit; URL:; License: Licensed by JMIR.

    An Ensemble Learning Strategy for Eligibility Criteria Text Classification for Clinical Trial Recruitment: Algorithm Development and Validation


    Background: Eligibility criteria are the main strategy for screening appropriate participants for clinical trials. Automatic analysis of clinical trial eligibility criteria by digital screening, leveraging natural language processing techniques, can improve recruitment efficiency and reduce the costs involved in promoting clinical research. Objective: We aimed to create a natural language processing model to automatically classify clinical trial eligibility criteria. Methods: We proposed a classifier for short text eligibility criteria based on ensemble learning, where a set of pretrained models was integrated. The pretrained models included state-of-the-art deep learning methods for training and classification, including Bidirectional Encoder Representations from Transformers (BERT), XLNet, and A Robustly Optimized BERT Pretraining Approach (RoBERTa). The classification results by the integrated models were combined as new features for training a Light Gradient Boosting Machine (LightGBM) model for eligibility criteria classification. Results: Our proposed method obtained an accuracy of 0.846, a precision of 0.803, and a recall of 0.817 on a standard data set from a shared task of an international conference. The macro F1 value was 0.807, outperforming the state-of-the-art baseline methods on the shared task. Conclusions: We designed a model for screening short text classification criteria for clinical trials based on multimodel ensemble learning. Through experiments, we concluded that performance was improved significantly with a model ensemble compared to a single model. The introduction of focal loss could reduce the impact of class imbalance to achieve better performance.

  • Source: freepik; Copyright: cookie_studio; URL:; License: Licensed by JMIR.

    Diabetes Self-Management in the Age of Social Media: Large-Scale Analysis of Peer Interactions Using Semiautomated Methods


    Background: Online communities have been gaining popularity as support venues for chronic disease management. User engagement, information exposure, and social influence mechanisms can play a significant role in the utility of these platforms. Objective: In this paper, we characterize peer interactions in an online community for chronic disease management. Our objective is to identify key communications and study their prevalence in online social interactions. Methods: The American Diabetes Association Online community is an online social network for diabetes self-management. We analyzed 80,481 randomly selected deidentified peer-to-peer messages from 1212 members, posted between June 1, 2012, and May 30, 2019. Our mixed methods approach comprised qualitative coding and automated text analysis to identify, visualize, and analyze content-specific communication patterns underlying diabetes self-management. Results: Qualitative analysis revealed that “social support” was the most prevalent theme (84.9%), followed by “readiness to change” (18.8%), “teachable moments” (14.7%), “pharmacotherapy” (13.7%), and “progress” (13.3%). The support vector machine classifier resulted in reasonable accuracy with a recall of 0.76 and precision 0.78 and allowed us to extend our thematic codes to the entire data set. Conclusions: Modeling health-related communication through high throughput methods can enable the identification of specific content related to sustainable chronic disease management, which facilitates targeted health promotion.

  • Source: freepik; Copyright: pressfoto; URL:; License: Licensed by JMIR.

    Factors Influencing Doctors’ Participation in the Provision of Medical Services Through Crowdsourced Health Care Information Websites:...


    Background: Web-based crowdsourcing promotes the goals achieved effectively by gaining solutions from public groups via the internet, and it has gained extensive attention in both business and academia. As a new mode of sourcing, crowdsourcing has been proven to improve efficiency, quality, and diversity of tasks. However, little attention has been given to crowdsourcing in the health sector. Objective: Crowdsourced health care information websites enable patients to post their questions in the question pool, which is accessible to all doctors, and the patients wait for doctors to respond to their questions. Since the sustainable development of crowdsourced health care information websites depends on the participation of the doctors, we aimed to investigate the factors influencing doctors’ participation in providing health care information in these websites from the perspective of the elaboration-likelihood model. Methods: We collected 1524 questions with complete patient-doctor interaction processes from an online health community in China to test all the hypotheses. We divided the doctors into 2 groups based on the sequence of the answers: (1) doctor who answered the patient’s question first and (2) the doctors who answered that question after the doctor who answered first. All analyses were conducted using the ordinary least squares method. Results: First, the ability of the doctor who first answered the health-related question was found to positively influence the participation of the following doctors who answered after the first doctor responded to the question (βoffline1=.177, P<.001; βoffline2=.063, P=.048; βonline=.418, P<.001). Second, the reward that the patient offered for the best answer showed a positive effect on doctors’ participation (β=.019, P<.001). Third, the question’s complexity was found to positively moderate the relationships between the ability of the first doctor who answered and the participation of the following doctors (β=.186, P=.05) and to mitigate the effect between the reward and the participation of the following doctors (β=–.003, P=.10). Conclusions: This study has both theoretical and practical contributions. Online health community managers can build effective incentive mechanisms to encourage highly competent doctors to participate in the provision of medical services in crowdsourced health care information websites and they can increase the reward incentives for each question to increase the participation of the doctors.

  • Source: Wikimedia Commons; Copyright: Web.DE; URL:,_shoulder_camera_1_(20699427282).jpg; License: Creative Commons Attribution (CC-BY).

    Using Electronic Data Collection Platforms to Assess Complementary and Integrative Health Patient-Reported Outcomes: Feasibility Project


    Background: The Veteran Administration (VA) Office of Patient-Centered Care and Cultural Transformation is invested in improving veteran health through a whole-person approach while taking advantage of the electronic resources suite available through the VA. Currently, there is no standardized process to collect and integrate electronic patient-reported outcomes (ePROs) of complementary and integrative health (CIH) into clinical care using a web-based survey platform. This quality improvement project enrolled veterans attending CIH appointments within a VA facility and used web-based technologies to collect ePROs. Objective: This study aimed to (1) determine a practical process for collecting ePROs using patient email services and a web-based survey platform and (2) conduct analyses of survey data using repeated measures to estimate the effects of CIH on patient outcomes. Methods: In total, 100 veterans from one VA facility, comprising 11 cohorts, agreed to participate. The VA patient email services (Secure Messaging) were used to manually send links to a 16-item web-based survey stored on a secure web-based survey storage platform (Qualtrics). Each survey included questions about patient outcomes from CIH programs. Each cohort was sent survey links via Secure Messaging (SM) at 6 time points: weeks 1 through 4, week 8, and week 12. Process evaluation interviews were conducted with five primary care providers to assess barriers and facilitators to using the patient-reported outcome survey in usual care. Results: This quality improvement project demonstrated the usability of SM and Qualtrics for ePRO collection. However, SM for ePROs was labor intensive for providers. Descriptive statistics on health competence (2-item Perceived Health Competence Scale), physical and mental health (Patient-Reported Outcomes Measurement Information System Global-10), and stress (4-item Perceived Stress Scale) indicated that scores did not significantly change over time. Survey response rates varied (18/100, 18.0%-42/100, 42.0%) across each of the 12 weekly survey periods. In total, 74 of 100 participants provided ≥1 survey, and 90% (66/74) were female. The majority, 62% (33/53) of participants, who reported the use of any CIH modality, reported the use of two or more unique modalities. Primary care providers highlighted specific challenges with SM and offered solutions regarding staff involvement in survey implementation. Conclusions: This quality improvement project informs our understanding of the processes currently available for using SM and web-based data platforms to collect ePROs. The study results indicate that although it is possible to use SM and web-based survey platforms for ePROs, automating scheduled administration will be necessary to reduce provider burden. The lack of significant change in ePROs may be due to standard measures taking a biomedical approach to wellness. Future work should focus on identifying ideal ePRO processes that would include standardized, whole-person measures of wellness.

  • Source: Unsplash; Copyright: Raymond Wong; URL:; License: Licensed by JMIR.

    Medical Emergency Resource Allocation Model in Large-Scale Emergencies Based on Artificial Intelligence: Algorithm Development

    Authors List:


    Background: Before major emergencies occur, the government needs to prepare various emergency supplies in advance. To do this, it should consider the coordinated storage of different types of materials while ensuring that emergency materials are not missed or superfluous. Objective: This paper aims to improve the dispatch and transportation efficiency of emergency materials under a model in which the government makes full use of Internet of Things technology and artificial intelligence technology. Methods: The paper established a model for emergency material preparation and dispatch based on queueing theory and further established a workflow system for emergency material preparation, dispatch, and transportation based on a Petri net, resulting in a highly efficient emergency material preparation and dispatch simulation system framework. Results: A decision support platform was designed to integrate all the algorithms and principles proposed. Conclusions: The resulting framework can effectively coordinate the workflow of emergency material preparation and dispatch, helping to shorten the total time of emergency material preparation, dispatch, and transportation.

  • Source: Pexels; Copyright: Public Copyright; URL:; License: Licensed by the authors.

    Automatic Construction of a Depression-Domain Lexicon Based on Microblogs: Text Mining Study


    Background: According to a World Health Organization report in 2017, there was almost one patient with depression among every 20 people in China. However, the diagnosis of depression is usually difficult in terms of clinical detection owing to slow observation, high cost, and patient resistance. Meanwhile, with the rapid emergence of social networking sites, people tend to share their daily life and disclose inner feelings online frequently, making it possible to effectively identify mental conditions using the rich text information. There are many achievements regarding an English web-based corpus, but for research in China so far, the extraction of language features from web-related depression signals is still in a relatively primary stage. Objective: The purpose of this study was to propose an effective approach for constructing a depression-domain lexicon. This lexicon will contain language features that could help identify social media users who potentially have depression. Our study also compared the performance of detection with and without our lexicon. Methods: We autoconstructed a depression-domain lexicon using Word2Vec, a semantic relationship graph, and the label propagation algorithm. These two methods combined performed well in a specific corpus during construction. The lexicon was obtained based on 111,052 Weibo microblogs from 1868 users who were depressed or nondepressed. During depression detection, we considered six features, and we used five classification methods to test the detection performance. Results: The experiment results showed that in terms of the F1 value, our autoconstruction method performed 1% to 6% better than baseline approaches and was more effective and steadier. When applied to detection models like logistic regression and support vector machine, our lexicon helped the models outperform by 2% to 9% and was able to improve the final accuracy of potential depression detection. Conclusions: Our depression-domain lexicon was proven to be a meaningful input for classification algorithms, providing linguistic insights on the depressive status of test subjects. We believe that this lexicon will enhance early depression detection in people on social media. Future work will need to be carried out on a larger corpus and with more complex methods.

  • Source: Freepik; Copyright: jcomp; URL:; License: Licensed by JMIR.

    Machine Learning–Based Signal Quality Evaluation of Single-Period Radial Artery Pulse Waves: Model Development and Validation


    Background: The radial artery pulse wave is a widely used physiological signal for disease diagnosis and personal health monitoring because it provides insight into the overall health of the heart and blood vessels. Periodic radial artery pulse signals are subsequently decomposed into single pulse wave periods (segments) for physiological parameter evaluations. However, abnormal periods frequently arise due to external interference, the inherent imperfections of current segmentation methods, and the quality of the pulse wave signals. Objective: The objective of this paper was to develop a machine learning model to detect abnormal pulse periods in real clinical data. Methods: Various machine learning models, such as k-nearest neighbor, logistic regression, and support vector machines, were applied to classify the normal and abnormal periods in 8561 segments extracted from the radial pulse waves of 390 outpatients. The recursive feature elimination method was used to simplify the classifier. Results: It was found that a logistic regression model with only four input features can achieve a satisfactory result. The area under the receiver operating characteristic curve from the test set was 0.9920. In addition, these classifiers can be easily interpreted. Conclusions: We expect that this model can be applied in smart sport watches and watchbands to accurately evaluate human health status.

  • Heparin vial and syringe on a counter top. Source: shutterstock; Copyright: JLMcAnally; URL:; License: Licensed by JMIR.

    Toward Optimal Heparin Dosing by Comparing Multiple Machine Learning Methods: Retrospective Study


    Background: Heparin is one of the most commonly used medications in intensive care units. In clinical practice, the use of a weight-based heparin dosing nomogram is standard practice for the treatment of thrombosis. Recently, machine learning techniques have dramatically improved the ability of computers to provide clinical decision support and have allowed for the possibility of computer generated, algorithm-based heparin dosing recommendations. Objective: The objective of this study was to predict the effects of heparin treatment using machine learning methods to optimize heparin dosing in intensive care units based on the predictions. Patient state predictions were based upon activated partial thromboplastin time in 3 different ranges: subtherapeutic, normal therapeutic, and supratherapeutic, respectively. Methods: Retrospective data from 2 intensive care unit research databases (Multiparameter Intelligent Monitoring in Intensive Care III, MIMIC-III; e–Intensive Care Unit Collaborative Research Database, eICU) were used for the analysis. Candidate machine learning models (random forest, support vector machine, adaptive boosting, extreme gradient boosting, and shallow neural network) were compared in 3 patient groups to evaluate the classification performance for predicting the subtherapeutic, normal therapeutic, and supratherapeutic patient states. The model results were evaluated using precision, recall, F1 score, and accuracy. Results: Data from the MIMIC-III database (n=2789 patients) and from the eICU database (n=575 patients) were used. In 3-class classification, the shallow neural network algorithm performed the best (F1 scores of 87.26%, 85.98%, and 87.55% for data set 1, 2, and 3, respectively). The shallow neural network algorithm achieved the highest F1 scores within the patient therapeutic state groups: subtherapeutic (data set 1: 79.35%; data set 2: 83.67%; data set 3: 83.33%), normal therapeutic (data set 1: 93.15%; data set 2: 87.76%; data set 3: 84.62%), and supratherapeutic (data set 1: 88.00%; data set 2: 86.54%; data set 3: 95.45%) therapeutic ranges, respectively. Conclusions: The most appropriate model for predicting the effects of heparin treatment was found by comparing multiple machine learning models and can be used to further guide optimal heparin dosing. Using multicenter intensive care unit data, our study demonstrates the feasibility of predicting the outcomes of heparin treatment using data-driven methods, and thus, how machine learning–based models can be used to optimize and personalize heparin dosing to improve patient safety. Manual analysis and validation suggested that the model outperformed standard practice heparin treatment dosing.

  • Source: freepik; Copyright: jcomp; URL:; License: Licensed by JMIR.

    Ensemble Learning Models Based on Noninvasive Features for Type 2 Diabetes Screening: Model Development and Validation


    Background: Early diabetes screening can effectively reduce the burden of disease. However, natural population–based screening projects require a large number of resources. With the emergence and development of machine learning, researchers have started to pursue more flexible and efficient methods to screen or predict type 2 diabetes. Objective: The aim of this study was to build prediction models based on the ensemble learning method for diabetes screening to further improve the health status of the population in a noninvasive and inexpensive manner. Methods: The dataset for building and evaluating the diabetes prediction model was extracted from the National Health and Nutrition Examination Survey from 2011-2016. After data cleaning and feature selection, the dataset was split into a training set (80%, 2011-2014), test set (20%, 2011-2014) and validation set (2015-2016). Three simple machine learning methods (linear discriminant analysis, support vector machine, and random forest) and easy ensemble methods were used to build diabetes prediction models. The performance of the models was evaluated through 5-fold cross-validation and external validation. The Delong test (2-sided) was used to test the performance differences between the models. Results: We selected 8057 observations and 12 attributes from the database. In the 5-fold cross-validation, the three simple methods yielded highly predictive performance models with areas under the curve (AUCs) over 0.800, wherein the ensemble methods significantly outperformed the simple methods. When we evaluated the models in the test set and validation set, the same trends were observed. The ensemble model of linear discriminant analysis yielded the best performance, with an AUC of 0.849, an accuracy of 0.730, a sensitivity of 0.819, and a specificity of 0.709 in the validation set. Conclusions: This study indicates that efficient screening using machine learning methods with noninvasive tests can be applied to a large population and achieve the objective of secondary prevention.

  • Source:; Copyright: Miguel Á Padriñán; URL:; License: Licensed by JMIR.

    Identification of High-Order Single-Nucleotide Polymorphism Barcodes in Breast Cancer Using a Hybrid Taguchi-Genetic Algorithm: Case-Control Study


    Background: Breast cancer has a major disease burden in the female population, and it is a highly genome-associated human disease. However, in genetic studies of complex diseases, modern geneticists face challenges in detecting interactions among loci. Objective: This study aimed to investigate whether variations of single-nucleotide polymorphisms (SNPs) are associated with histopathological tumor characteristics in breast cancer patients. Methods: A hybrid Taguchi-genetic algorithm (HTGA) was proposed to identify the high-order SNP barcodes in a breast cancer case-control study. A Taguchi method was used to enhance a genetic algorithm (GA) for identifying high-order SNP barcodes. The Taguchi method was integrated into the GA after the crossover operations in order to optimize the generated offspring systematically for enhancing the GA search ability. Results: The proposed HTGA effectively converged to a promising region within the problem space and provided excellent SNP barcode identification. Regression analysis was used to validate the association between breast cancer and the identified high-order SNP barcodes. The maximum OR was less than 1 (range 0.870-0.755) for two- to seven-order SNP barcodes. Conclusions: We systematically evaluated the interaction effects of 26 SNPs within growth factor–related genes for breast carcinogenesis pathways. The HTGA could successfully identify relevant high-order SNP barcodes by evaluating the differences between cases and controls. The validation results showed that the HTGA can provide better fitness values as compared with other methods for the identification of high-order SNP barcodes using breast cancer case-control data sets.

  • Source: Image created by the Authors/Placeit; Copyright: The Authors/Placeit; URL:; License: Licensed by JMIR.

    Summarizing Complex Graphical Models of Multiple Chronic Conditions Using the Second Eigenvalue of Graph Laplacian: Algorithm Development and Validation


    Background: It is important but challenging to understand the interactions of multiple chronic conditions (MCC) and how they develop over time in patients and populations. Clinical data on MCC can now be represented using graphical models to study their interaction and identify the path toward the development of MCC. However, the current graphical models representing MCC are often complex and difficult to analyze. Therefore, it is necessary to develop improved methods for generating these models. Objective: This study aimed to summarize the complex graphical models of MCC interactions to improve comprehension and aid analysis. Methods: We examined the emergence of 5 chronic medical conditions (ie, traumatic brain injury [TBI], posttraumatic stress disorder [PTSD], depression [Depr], substance abuse [SuAb], and back pain [BaPa]) over 5 years among 257,633 veteran patients. We developed 3 algorithms that utilize the second eigenvalue of the graph Laplacian to summarize the complex graphical models of MCC by removing less significant edges. The first algorithm learns a sparse probabilistic graphical model of MCC interactions directly from the data. The second algorithm summarizes an existing probabilistic graphical model of MCC interactions when a supporting data set is available. The third algorithm, which is a variation of the second algorithm, summarizes the existing graphical model of MCC interactions with no supporting data. Finally, we examined the coappearance of the 100 most common terms in the literature of MCC to validate the performance of the proposed model. Results: The proposed summarization algorithms demonstrate considerable performance in extracting major connections among MCC without reducing the predictive accuracy of the resulting graphical models. For the model learned directly from the data, the area under the curve (AUC) performance for predicting TBI, PTSD, BaPa, SuAb, and Depr, respectively, during the next 4 years is as follows—year 2: 79.91%, 84.04%, 78.83%, 82.50%, and 81.47%; year 3: 76.23%, 80.61%, 73.51%, 79.84%, and 77.13%; year 4: 72.38%, 78.22%, 72.96%, 77.92%, and 72.65%; and year 5: 69.51%, 76.15%, 73.04%, 76.72%, and 69.99%, respectively. This demonstrates an overall 12.07% increase in the cumulative sum of AUC in comparison with the classic multilevel temporal Bayesian network. Conclusions: Using graph summarization can improve the interpretability and the predictive power of the complex graphical models of MCC.

Citing this Article

Right click to copy or hit: ctrl+c (cmd+c on mac)

Latest Submissions Open for Peer-Review:

View All Open Peer Review Articles
  • Web-Based Dental Patient Education and Management Application

    Date Submitted: Jun 18, 2020

    Open Peer Review Period: Jun 18, 2020 - Aug 13, 2020

    Background: It is difficult for hospitals and clinics to manage their documents related to their patients and routine works without having management software. Objective: The purpose of this paper is...

    Background: It is difficult for hospitals and clinics to manage their documents related to their patients and routine works without having management software. Objective: The purpose of this paper is to develop and evaluate a web-based dental clinic application for managing and educating patients. Methods: The application will be developed with recent web technologies such as ASP.NET, JavaScript, Bootstrap, and Web Service; it is hosted in the Cloud and it is powered by Microsoft Azure Cloud computing Service. A clinic has been selected to use and evaluate the application. Results: The evaluation results of the application show that the application meets its objectives of managing and educating patients. Conclusions: It can be updated and extended to use in other clinics for management and universities for learning and researching purposes.

  • Patient triage by topic modelling of referral letters: Feasibility study

    Date Submitted: Jun 15, 2020

    Open Peer Review Period: Jun 9, 2020 - Aug 4, 2020

    Background: Musculoskeletal conditions are managed within primary care with referral to secondary care when a specialist opinion is required. The ever increasing demand of healthcare resources emphasi...

    Background: Musculoskeletal conditions are managed within primary care with referral to secondary care when a specialist opinion is required. The ever increasing demand of healthcare resources emphasizes the need for new triage methods to streamline care pathways with the ultimate aim of ensuring that patients receive timely and optimal care. Information contained in referral letters underpins the referral decision-making process but is yet to be explored systematically for the purposes of treatment prioritization for musculoskeletal conditions. Objective: This study aims to explore the feasibility of using natural language processing and machine learning to automate triage of patients with musculoskeletal conditions by analyzing information from referral letters. Methods: We used latent Dirichlet allocation to model each referral letter as a finite mixture over an underlying set of topics and model each topic as an infinite mixture over an underlying set of topic probabilities. The topic model was evaluated in the context of automating patient triage. Given a set of treatment outcomes, a binary classifier was trained for each outcome using previously extracted topics as the input features of the machine learning algorithm. In addition, qualitative evaluation was performed to assess human interpretability of topics. Results: The prediction accuracy of binary classifiers outperformed the stratified random classifier by a large margin giving an indication that topic modelling could be used to support patient triage. Qualitative evaluation confirmed high clinical interpretability of the topic model. Conclusions: The results established the feasibility of using natural language processing and machine learning to automate triage of patients with knee and/or hip pain by analyzing information from their referral letters.

  • The effect of artificial intelligence-based automated medical history taking system on patient waiting time in the general internal medicine outpatient setting: interrupted time series analysis

    Date Submitted: Jun 4, 2020

    Open Peer Review Period: Jun 4, 2020 - Jul 30, 2020

    Background: Use of automated medical history taking (AMHT) systems in the general internal medicine outpatient department (GIM-OD) is a promising strategy to reduce waiting time. Objective: We evaluat...

    Background: Use of automated medical history taking (AMHT) systems in the general internal medicine outpatient department (GIM-OD) is a promising strategy to reduce waiting time. Objective: We evaluated the effect of AI Monshin, an AMHT system, on waiting time in the GIM-OD. Methods: We retrospectively analyzed waiting time length in a Japanese community hospital GIM-OD (April 2017–April 2020). AI Monshin was implemented in April 2019. We compared the mean waiting time before and after the AI Monshin implementation. An interrupted time series analysis of the mean waiting time/month was conducted. Results: We analyzed 21,933 cases. The mean waiting time after the AI Monshin implementation (87.0 min, SD 55.1) was significantly shorter than before the AI Monshin implementation (89.5 min, SD 56.6), with an absolute difference of −2.5 min (P = .003; 95% CI, −4.0 to −0.9). In the interrupted time series analysis, whereas the underlying linear time trend was statistically significant (−0.6 min/month; P = .005; 95% CI, −1.0 to −0.2), level change (26.2 min; P = .43; 95% CI, −16.1 to 68.5) and slope change (−0.6 min/month; P = .23; 95% CI, −2.0 to 0.8) did not differ significantly. In a sensitivity analysis of data between April 2018 and April 2020, the mean waiting time after the AI Monshin implementation (87.0 min, SD 55.1) was not significantly different compared with that before the AI Monshin implementation (86.8 min, SD 53.9) with the absolute difference of +0.2 min (P = .84; 95% CI −1.6 to 2.0). Conclusions: AI Monshin reduced waiting time only to a very limited extent.

  • Informatics management of tumor specimens in the era of Big Data: Challenges and solutions

    Date Submitted: May 18, 2020

    Open Peer Review Period: May 18, 2020 - Jul 13, 2020

    Biomedical data bears the potential to facilitate personalize diagnosis and precision treatment in the era of Big Data. Based on this, high-quality annotation of human specimens has become the primary...

    Biomedical data bears the potential to facilitate personalize diagnosis and precision treatment in the era of Big Data. Based on this, high-quality annotation of human specimens has become the primary mission of bio-bankers, especially for tumor bio-banks with large amounts of “omics” and clinical data. However, the lack of agreed-upon standardizations and the gap among heterogeneous databases make information application and communication a major challenge. International efforts are undergoing to develop national projects on informatics management. The aim of this paper is to provide references in data annotation and process to standardize and take full advantage of biomedical information. First, information categories that are vital for specimen applications, including sample attributes, external clinical and experimental data, are systematically listed to provide references for subsequent data mining. Second, commonly-used approaches in data collection, recording, extraction, transformation, integration and storage were summarized in support of data processes. In particular, a practical workflow of information annotation in daily bio-banking was drawn to help handling each step of the informatics management procedure. This paper highlights the importance of informatics management of tumor specimens, presents the process of data standardization, and provides practical instructions for bio-bankers in specimen annotation and data management.