Effects of Computerized Decision Support Systems on Practitioner Performance and Patient Outcomes: Systematic Review

Background: Computerized decision support systems (CDSSs) are software programs that support the decision making of practitioners and other staff. Other reviews have analyzed the relationship between CDSSs, practitioner performance, and patient outcomes. These reviews reported positive practitioner performance in over half the articles analyzed, but very little information was found for patient outcomes.
Objective: The purpose of this review was to analyze the relationship between CDSSs, practitioner performance, and patient medical outcomes.
Methods: PubMed, CINAHL, Embase, Web of Science, and Cochrane databases were queried. Articles were chosen based on year published (last 10 years), high quality, peer-reviewed sources, and discussion of the relationship between the use of CDSS as an intervention and links to practitioner performance or patient outcomes. Reviewers used an Excel spreadsheet (Microsoft Corporation) to collect information on the relationship between CDSSs and practitioner performance or patient outcomes. Reviewers also collected observations of participants, intervention, comparison with control group, outcomes, and study design (PICOS), along with any indications of bias. Articles were analyzed by multiple reviewers following the Kruse protocol for systematic reviews. Data were organized into multiple tables for analysis and reporting.
Results: Themes were identified for both practitioner performance (n=38) and medical outcomes (n=36). A total of 66% (25/38) of articles had occurrences of positive practitioner performance, 13% (5/38) found no difference in practitioner performance, and 21% (8/38) did not report or discuss practitioner performance. Zero articles reported negative practitioner performance. A total of 61% (22/36) of articles had occurrences of positive patient medical outcomes, 8% (3/36) found no statistically significant difference in medical outcomes between intervention and control groups, and 31% (11/36) did not report or discuss medical outcomes. Zero articles found negative patient medical outcomes attributed to using CDSSs.
Conclusions: Results of this review are consistent with previous reviews with similar objectives, but unlike these reviews we found a high level of reporting of positive effects on patient medical outcomes.


Rationale
Computerized decision support systems (CDSSs) are software programs that support the decision making of patients, practitioners, and staff with knowledge and person-specific information. CDSSs present several tools and alerts to enhance the decision-making process within the clinical workflow [1]. Knowledge-based CDSSs were the earliest class of CDSSs; they use a data repository to draw conclusions and rely on traditional computing methods that return programmed results. Non-knowledge-based CDSSs are the most common forms used today. These systems use artificial intelligence (AI) assistance to augment clinical decisions made at the point of care. AI-supported CDSSs use patient data to analyze relationships between symptoms, treatments, and patient outcomes to make clinical decisions. These patient data are usually derived from electronic health records (EHRs): digital forms of patient records that include patient information such as personal contact information, medical history, allergies, test results, and treatment plan [2]. AI, that is, software or algorithms able to perform tasks that normally require human intelligence, is integrated into CDSS processes. Data mining, a process usually assisted by AI, is often used by CDSSs to identify new data patterns from large data sets (like patient EHRs) [3]. The conclusions reached by AI used for data mining can be used by both non-knowledge-based CDSSs and knowledge-based CDSSs [3]. CDSSs are integrated into technologies such as computerized physician order entry (CPOE) [4] tools and electronic medical record (EMR)/EHR databases and use a wide variety of drug, patient, treatment, and other data to make clinical decisions that provide the best recommendations for treatment. CDSS utility varies widely: systems draw conclusions about different ailments, disorders, and syndromes. Future versions of this technology may also take into account patient preferences or financial capabilities.
In prior studies, CDSSs have been shown to improve practitioner performance, but the effects on patient outcomes were inconsistent and required further study. A review conducted in 1998 evaluated studies from the previous 5 years and found a benefit to physician performance in 66% of studies analyzed (n=65), but only 14 of those analyzed discussed outcomes, so no conclusions were made [5]. The review was repeated in 2005 with a larger sample (n=100) and found a positive impact on physician performance in 64% of studies analyzed, but as in the 1998 review, evidence on patient outcomes was insufficient to make generalizations [6]. In 2010, a research protocol was registered to repeat the review, but no publication followed. In 2011, the review was repeated with a similar number of articles analyzed (n=91) and identified a positive effect of CDSSs on practitioner performance for 57% of articles analyzed; however, consistent with previous reviews, no conclusions could be made concerning patient outcomes [7].
Since the last publication on this topic in 2011, CDSSs have seen significant industry growth, becoming more accessible, cost-effective, and reliable and possessing greater computational power [8]. In addition to hardware improvements, the inclusion of software such as AI programs is growing rapidly in CDSSs, but these improvements have not yet been systematically reviewed to determine any impacts they might have on patient outcomes and practitioner performance.

Objective
The purpose of this systematic review is to conduct a review similar to those from 1998 and 2005 and analyze the association between CDSSs, practitioner performance, and patient outcomes.
The methods used in the 2010 manuscript were never published, and those used in the 2011 review were significantly different from those in 1998 and 2005. The taxonomy of CDSSs has changed greatly since 1998, so search terms used 23 years ago will not be relevant today. CDSS employment is rapidly growing, especially with increased access to AI-supported CDSS software. Because the effects are understudied, our goal is to review the effectiveness of CDSS technologies, their employment, and their overall utility.

Protocol Registration and Eligibility Criteria
This review was not registered. The reviewers shared the workload using a technique from the Assessment of Multiple Systematic Reviews (AMSTAR) [9]. The format of the review uses the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [10]. Conceptualization of the overall review, including standardized data extraction tools, follows the Kruse protocol for writing systematic reviews in a health-related program [11]. Articles were eligible for inclusion if they were published in the English language within the last 10 years, had full text available, and reported on the elements of the objective statement: measures of effectiveness of CDSSs on practitioner performance or patient outcomes. A 10-year window was chosen to keep the research current while exceeding the 5-year window used by the 1998 and 2005 reviews. At first, we limited the search to studies in peer-reviewed journals, but because the resulting sample was too small, we expanded the search to include grey literature. From the grey literature, we included only items that reported results.

Information Sources
Five common research databases were queried: PubMed (the web-based components of MEDLINE, life science journals, and online books), CINAHL, Embase, Web of Science, and Cochrane (reviews, controlled trials, methodologies, and health technology assessments). Searches were conducted from January 29 to January 31, 2020. Databases were chosen at the recommendation of the National Institutes of Health, which recommends at least three databases: PubMed, Embase, and Cochrane [12]. This choice also follows established practice in published systematic reviews [11].

Search and Study Selection
Searches in each database were identical: ("Clinical decision support systems" OR "computerized provider order entry" OR "diagnosis, computer assisted" OR "drug therapy, computer-assisted" OR "expert systems") AND ("patient reported outcomes" OR "practitioner performance"). Embase and Web of Science do not allow Boolean searches, so an advanced search was used. Articles were eligible for inclusion if they were published in the last 10 years and discussed both CDSSs and either practitioner performance or patient-reported outcomes. We excluded reviews. In CINAHL, we excluded MEDLINE to avoid duplication with the results from PubMed. Searching for CPOE also included order entry systems, medical; medication alert systems; alert system, medication; medication alert system; system, medication alert; alert systems, medication; computerized physician order entry system; CPOE; computerized provider order entry; and computerized physician order entry. Searching for diagnosis, computer assisted also included the following: computer-assisted diagnosis; computer assisted diagnosis; computer-assisted diagnoses; and diagnoses, computer assisted. Searching for drug therapy included the following: drug therapy, computer assisted; therapy, computer-assisted drug; computer-assisted drug therapies; drug therapies, computer-assisted; therapies, computer-assisted drug; therapy, computer assisted drug; computer-assisted drug therapy; computer assisted drug therapy; protocol drug therapy, computer-assisted; and protocol drug therapy, computer assisted. A search of expert systems also included expert system; system, expert; and systems, expert.
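As an illustration of how the Boolean search string above is composed, the following Python sketch assembles it from two term lists. The helper names (`or_group`, `build_query`) are hypothetical; the sketch only reproduces the string and does not call any database API.

```python
# Term lists copied from the search string reported in the text.
INTERVENTION_TERMS = [
    "Clinical decision support systems",
    "computerized provider order entry",
    "diagnosis, computer assisted",
    "drug therapy, computer-assisted",
    "expert systems",
]
OUTCOME_TERMS = [
    "patient reported outcomes",
    "practitioner performance",
]


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    quoted = ['"{}"'.format(term) for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(interventions, outcomes):
    """Require at least one intervention term AND at least one outcome term."""
    return or_group(interventions) + " AND " + or_group(outcomes)


QUERY = build_query(INTERVENTION_TERMS, OUTCOME_TERMS)
```

Keeping the terms in lists makes it easier to apply an identical query to each database and to document later changes to the term set.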
Abstracts were independently screened by each reviewer, and a consensus meeting was called to discuss disagreement. A kappa score was calculated to provide a measure of agreement between reviewers.
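The text does not state which kappa statistic was used. Assuming Cohen's kappa for two reviewers making include/exclude decisions, the agreement measure could be computed as in this minimal sketch; the function name and the example decisions are hypothetical and are not the review's data.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who rated the same items in the same order."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(ratings_a)
    # Observed agreement: share of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both raters used a single identical label throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for four abstracts.
reviewer_a = ["include", "include", "exclude", "exclude"]
reviewer_b = ["include", "exclude", "exclude", "exclude"]
kappa = cohens_kappa(reviewer_a, reviewer_b)  # 0.5 for this toy example
```

In the toy example, observed agreement is 0.75 and chance agreement is 0.5, giving a kappa of 0.5; values close to 1 indicate near-complete agreement.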

Data Collection and Data Items
A standardized Excel spreadsheet (Microsoft Corporation) was used as a data extraction tool, in accordance with the Kruse protocol [11]. This tool acted as a template for reviewers to collect study design, participants, sample size, intervention, observed bias, and effect size, where applicable. A literature matrix was created to list and organize all articles, extract data between multiple reviewers, and discuss observations in consensus meetings. Three consensus meetings were held for reviewers to discuss disagreement and share observations. This practice helped keep the reviewers' interpretations consistent as the analysis progressed.

Risk of Bias in Individual Studies
Reviewers noted any observation of bias. We used the Johns Hopkins Nursing Evidence-Based Practice (JHNEBP) tool as a quality assessment of studies analyzed. Other forms of bias were noted as well; these are described under Risk of Bias Across Studies.

Synthesis of Results
The Excel spreadsheet was used to synthesize our observations and data collected. The spreadsheet enabled a narrative analysis which identified themes, as is the practice in multiple disciplines. We did not combine results of studies because this was not a meta-analysis.

Risk of Bias Across Studies
Forms of bias other than selection bias, such as localized studies or surveillance bias, were also noted on the spreadsheet.

Additional Analysis
Reviewers read each article two times [11]. During the second reading, reviewers made independent notes of major themes related to the objective, using the Excel data extraction tool. After a third consensus meeting debriefing the observations and themes, detailed notes were formulated about health policy implications of CDSSs. Frequency of occurrence of each of the major common themes was captured in affinity matrices for further analysis. Data and calculations are available upon request.
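The frequency counts captured in an affinity matrix amount to tallying theme occurrences and expressing each as a share of the total. The Python sketch below illustrates that arithmetic under stated assumptions: the function name and the example observations are hypothetical, and the review itself used an Excel spreadsheet for this step.

```python
from collections import Counter


def theme_frequencies(observations):
    """Tally themes and return (theme, count, rounded percentage) rows.

    `observations` is a list of (article_id, theme) pairs; one article may
    contribute more than one theme.
    """
    counts = Counter(theme for _, theme in observations)
    total = sum(counts.values())
    return [
        (theme, count, round(100 * count / total))
        for theme, count in counts.most_common()
    ]


# Hypothetical observations, not the review's data.
example = [
    ("A", "Improved screening"),
    ("B", "Improved screening"),
    ("C", "No difference reported"),
    ("D", "Improved screening"),
]
rows = theme_frequencies(example)
```

The same arithmetic applied to the practitioner performance counts reported in this review (25, 5, and 8 of 38) yields 66%, 13%, and 21%.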

Study Selection and Study Characteristics
The study selection process is illustrated in Figure 1. The 74 results from the search string in five databases were placed into an Excel spreadsheet and shared among reviewers for selection and analysis. Filters were applied in each database to capture only the last 10 years (January 30, 2011, to January 30, 2020). Reviewers independently removed duplicates and screened abstracts. A statistic of agreement, kappa, was calculated; the resulting score of .98 shows almost complete agreement on all reviewed articles [13,14]. The remaining 36 results were read in full for relevance. Observations for the 36 articles that remained were placed in an Excel spreadsheet for independent data analysis. Reviewers collected standard patient/participants, intervention, comparison, outcome, study design (PICOS) observations plus indications of either practitioner performance or patient medical outcomes (Multimedia Appendix 1). Bias was also noted. Following the Kruse protocol, observations were distilled into themes for further analysis. Three consensus meetings were used to discuss disagreement. A summary of all observations is listed in Table 1. Articles are listed in reverse chronological order. The details extracted were year of publication, authors, title, study design, participants, sample size, intervention, bias, and observations about barriers or facilitators to the adoption of CDSSs.

Risk of Bias Within Studies
Bias was observed in some, but not all, of the studies analyzed. A full review of the bias observed is provided in Multimedia Appendix 1. The JHNEBP tool found no quality measure below Level IV or C.

Results of Individual Studies
General observations and thematic analysis are listed in Table 1. Articles are listed in reverse chronological order. A table of PICOS is provided in Multimedia Appendix 1.
Table 1 (excerpt). For each study: medical outcome theme and observation, followed by practitioner performance theme and observation.
Caballero-Ruiz et al [21]
  [...]
  Practitioner performance: [...] Face-to-face visits with patients reduced by 89%; time devoted by clinicians to patient evaluation was reduced by 27%; automatic detection of 100% of patients who needed insulin therapy.
Raj et al [22]
  Medical outcomes: No difference reported. Did not improve or worsen pain management.
  Practitioner performance: No difference reported. System did not improve pain intensity, therefore no significant differences in dose of opiates compared with control; had no effect on practitioner performance.
[...]
Cox and Pieper [32]
  Medical outcomes: Improved. Improved efficacy; treatment in the doxazosin arm was stopped early due to a 1.25-fold increase in the incidence of CVD and a 2-fold increase in the incidence of heart failure compared with the diuretic arm.
  Practitioner performance: Not reported or discussed. Practitioner performance not discussed.
Schneider et al [33]
  Medical outcomes: Not reported or discussed. Medical outcomes not reported or discussed.
  Practitioner performance: Improved screening; improved buy-in of CDSSs. One CDSS scored significantly more exams as appropriate; better interface of one CDSS versus the other influenced provider willingness to use the CDS system.
Zhu and Cimino [34]
  Medical outcomes: Improved safety. Improved patient safety.
  Practitioner performance: Improved accuracy and performance. Accuracy improved: reduced inaccuracy.
Utidjian et al [35]
  Medical outcomes: Improved disease management. A quality improvement initiative supported by CDS and workflow tools integrated in the EHR improved recognition of eligibility and may have increased palivizumab administration rates; palivizumab-focused group performed significantly better than a comprehensive intervention.
  Practitioner performance: More accurate prescribing. Proportions of doses administered declined during the baseline seasons (from 72% to 62%) with partial recovery to 68% during the intervention season; palivizumab-focused group improved by 19.2 percentage points in the intervention season compared with the prior baseline season (P<.001), while the comprehensive intervention group only improved 5.5 percentage points (P=.29); difference in change between study groups was significant (P=.05).
Semler et al [36]
  Medical outcomes: No difference reported. No statistically significant difference: mortality 14% versus 15%, ICU-free days 17 versus 19, vasopressor-free days 22.2 versus 22.6.
  Practitioner performance: No difference reported. No statistically significant difference in performance (also low use of tool).
Peiris et al [37]
  Medical outcomes: Improved disease management. Improved cardiovascular disease risk management; no difference in prescription rates.
  Practitioner performance: Improved screening. Patients more likely to receive screening with CDSS (63% vs 53%); no improvements in prescription of recommended medications at the end of the study.
[...]
  Medical outcomes: No difference reported. Patients aged <65 years had greater mortality benefit (OR 0.45, 95% CI 0.20-1.00; P=.05) than patients >65 years (OR 1.28, 95% CI 0.91-1.82; P=.16); no effect was observed on incidence of Clostridium difficile (OR 1.02, 95% CI 0.34-3.01) and multidrug-resistant organism (OR 1.06, 95% CI 0.42-2.71) infections; no increase in infection-related readmission (OR 1.16 [...]
[...]
Robbins et al [48]
  Medical outcomes: Improved symptoms. Increased CD4+ lymphocyte count and reduced suboptimal follow-up appointment.
  Practitioner performance: Improved buy-in of CDSSs. A total of 90% of providers involved with the RCT supported adopting the intervention.
Chen et al [49]
  Medical outcomes: Not reported or discussed. Medical outcomes not discussed.
  Practitioner performance: Improved screening. New CDSS identified 70 records needing reassessment of triglyceride level.
Seow et al [50]
  Medical outcomes: Improved symptoms. A total of 79% of respondents rated that their "pain and other symptoms have been controlled to a comfortable level" always or most of the time compared with 8% of respondents who rated this as rarely or never occurring.
  Practitioner performance: Improved screening. A total of 87% of respondents strongly agreed or somewhat agreed that the "ESAS was important to complete because it helped the health care team to know what symptoms [they] were having and how severe they were."
CDSS: computerized decision support system.

Risk of Bias Across Studies
Multimedia Appendix 1 provides a table of PICOS and bias. Outcomes are reported in Table 1. Bias was similar across articles reviewed: most research took place in one facility, organization, or state, which is a form of selection bias and limits the broad application of results. A sample taken from a limited geographic area is inherently limited in its ability to generalize results to the general population unless steps have been taken to ensure the sample is representative of the population.

Summary of Evidence
Our review methodology enabled a meticulous evaluation of the efficiency and effectiveness of CDSSs for practitioner performance and medical outcomes. A summary of the findings from the review is listed in Table 1. Of the 36 articles analyzed that reported efficiency or effectiveness, 25 reported positive performance and 22 reported positive outcomes; 9 did not report practitioner performance and 11 did not report patient medical outcomes.
The decision of whether to adopt a CDSS is one of complexity and change management. Providers and administrators need to discuss the advantages and disadvantages. The organization's infrastructure must support the application, providers must be trained on how to implement it, and administrators must ensure that budget and organizational dynamics can afford acquisition and implementation. The literature is clear on the efficacy of CDSSs, and this should assist organizations in gaining user acceptance. Providers should carefully integrate CDSSs into their processes and clinical practice guidelines to ensure they are an asset more than a hindrance. CDSSs should be used to augment patient care rather than coming between patients and providers.
It is interesting that previous reviews did not find sufficient results on medical outcomes. This could have been a limitation in search strategy. It could also be due to the maturation of CDSSs in general. At the time the other reviews were conducted, it may have been too soon to see positive results in medical outcomes.
Because CDSSs present providers with knowledge-based information at the point of care, they augment decision making. Timely tools are available to providers through CDSSs that may not otherwise be available at the point of care. AI-supported recommendations provided by CDSSs analyze symptoms, possible treatments, clinical practice guidelines, and patient outcomes [1,2]. These capabilities are most likely the catalyst for improved practitioner performance and patient outcomes.
There does not appear to be one CDSS panacea for all practices, specialties, or templates. The literature is mixed on which products are best-of-breed systems. Clearly, additional research should continue to be conducted in this valuable area of medical practice. While other industries have fully embraced the digitized environment, health care in general has been slow to adopt, which is understandable when health is at stake. Based on the results of this review compared with similar ones in the past, CDSSs are diffusing across the health care industry as the systems improve. Further research into CDSSs should look to improve productivity and standardize their integration into clinical practice guidelines.
Another interesting note is that alert fatigue was not raised in any of the studies analyzed. Alert fatigue is a known phenomenon and worthy of note [51]. It contributes to medical error in pharmacy and physician ordering systems, which are common components of CDSSs [52]. Even in clinical trials, alert fatigue is known to persist over time [53]. Because it was not noted, it was presumably not controlled for in the studies analyzed.

Limitations
The small group of articles for analysis was a limitation. Only 36 articles met the selection criteria. A larger group for analysis would strengthen the external validity of the results because we could be better assured that our group is representative of the population. The effects of selection bias were reduced by using multiple reviewers to screen and analyze articles [9]. However, only two reviewers screened abstracts and analyzed articles for themes; one additional reviewer might have increased the number of observations. Publication bias was reduced through the inclusion of grey literature, which extends beyond peer-reviewed material; however, these articles were discarded if they did not include results. We considered only articles published in the English language. It is possible that additional observations could have been gained by expanding the search to other languages. This review is also limited by the techniques used in the trials analyzed, and statistics and effect sizes could not be combined due to the wide range used in the articles. We analyzed both qualitative and quantitative studies, and effect size is only viable for the latter. Sample sizes differed widely between studies analyzed, ranging from 6 to 900 million. Such a wide disparity makes consolidation of results difficult. We also did not analyze or compare the heuristics and algorithms used by CDSSs within the studies. To compensate for a limitation from a similar review in 2005, we expanded our analysis beyond randomized controlled trials to pre-post and other designs [6].