Research

Computerised clinical decision support systems and absolute improvements in care: meta-analysis of controlled clinical trials

BMJ 2020; 370 doi: https://doi.org/10.1136/bmj.m3216 (Published 17 September 2020) Cite this as: BMJ 2020;370:m3216

Linked Editorial

How effective are clinical decision support systems?

Linked Opinion

What I have learned about clinical decision support systems over the past decade

  1. Janice L Kwan, assistant professor1,2
  2. Lisha Lo, research coordinator3
  3. Jacob Ferguson, medical student4
  4. Hanna Goldberg, resident physician4
  5. Juan Pablo Diaz-Martinez, doctoral student5
  6. George Tomlinson, professor5
  7. Jeremy M Grimshaw, professor6
  8. Kaveh G Shojania, professor2,3,7

Affiliations:

  1. Sinai Health System, Department of Medicine, 600 University Avenue, Toronto, ON M5G 1X5, Canada
  2. Department of Medicine, University of Toronto, Toronto, ON, Canada
  3. Centre for Quality Improvement and Patient Safety, University of Toronto, Toronto, ON, Canada
  4. Faculty of Medicine, University of Toronto, Toronto, ON, Canada
  5. Biostatistics Research Unit, University Health Network and Sinai Health System, Toronto, ON, Canada
  6. Clinical Epidemiology Program, Ottawa Hospital Research Institute and Department of Medicine, University of Ottawa, Ottawa, ON, Canada
  7. Department of Medicine, Sunnybrook Health Sciences Centre, Toronto, ON, Canada

Correspondence to: Janice L Kwan janice.kwan@utoronto.ca (or @KwanJanice on Twitter)
  • Accepted 7 August 2020

Abstract

Objective To report the improvements achieved with clinical decision support systems and examine the heterogeneity from pooling effects across diverse clinical settings and intervention targets.

Design Systematic review and meta-analysis.

Data sources Medline up to August 2019.

Eligibility criteria for selecting studies and methods Randomised or quasi-randomised controlled trials reporting absolute improvements in the percentage of patients receiving care recommended by clinical decision support systems. Multilevel meta-analysis accounted for within study clustering. Meta-regression was used to assess the degree to which the features of clinical decision support systems and study characteristics reduced heterogeneity in effect sizes. Where reported, clinical endpoints were also captured.

Results In 108 studies (94 randomised, 14 quasi-randomised), reporting 122 trials that provided analysable data from 1 203 053 patients and 10 790 providers, clinical decision support systems increased the proportion of patients receiving desired care by 5.8% (95% confidence interval 4.0% to 7.6%). This pooled effect exhibited substantial heterogeneity (I2=76%), with the top quartile of reported improvements ranging from 10% to 62%. In 30 trials reporting clinical endpoints, clinical decision support systems increased the proportion of patients achieving guideline based targets (eg, blood pressure or lipid control) by a median of 0.3% (interquartile range −0.7% to 1.9%). Two study characteristics (low baseline adherence and paediatric settings) were associated with significantly larger effects. Inclusion of these covariates in the multivariable meta-regression, however, did not reduce heterogeneity.

Conclusions Most interventions with clinical decision support systems appear to achieve small to moderate improvements in targeted processes of care, a finding confirmed by the small changes in clinical endpoints found in studies that reported them. A minority of studies achieved substantial increases in the delivery of recommended care, but predictors of these more meaningful improvements remain undefined.

Introduction

Although the first electronic health record (EHR) was introduced almost six decades ago,1 dissemination has occurred surprisingly slowly.2 In the United States, as recently as 2008, fewer than 10% of hospitals had EHRs, and only 17% had computerised order entry of drugs. By 2015, however, 75% of US hospitals had adopted at least basic EHR systems.3 This rapid uptake reflects the financial incentives in the Health Information Technology for Economic and Clinical Health (HITECH) Act passed in 2009.4 Although EHR adoption has been slower in the UK,5 the NHS long term plan, released in 2019, sets a goal for all trusts and providers to move to digital health records by 2024.6

Clinical decision support systems embedded within EHRs—from pop-up alerts about serious drug allergies to more sophisticated tools incorporating clinical prediction rules—prompt clinicians to deliver evidence based processes of care,7 discourage non-indicated care,89 optimise drug orders,1011 and improve documentation.1213 Despite optimism over the effects of these support systems,141516 a systematic review published by our group in 201017 found that clinical decision support systems typically improved the proportion of patients who received target processes of care by less than 5%. The subsequent decade has seen a dramatic rise in the application and evaluation of clinical decision support systems. Systematic reviews of an increasingly large number of publications, however, have typically looked only at the features of these support systems that predicted improvements in care,1418 and reported odds ratios or risk ratios,192021 without quantifying the actual sizes of the improvements achieved.

In our systematic review and meta-analysis, we sought, firstly, to estimate the typical improvement in processes of care (and thus the potential for clinical effect) conferred by clinical decision support systems delivered at the point of care; and, secondly, to identify any study characteristics or features of these support systems consistently associated with larger effects.

Methods

We followed established methods recommended by Cochrane22 and report our findings in accordance with PRISMA (preferred reporting items for systematic reviews and meta-analyses).23

Search strategy and selection criteria

We searched Medline from the earliest available date to August 2019 without language restrictions (supplementary appendix 1) and scanned reference lists from included studies and relevant systematic reviews. We did not search Embase, CINAHL, or the Cochrane Central Register of Controlled Trials, as these databases did not increase the yield of eligible studies in our previous review.1724 We did, however, conduct fresh study screening and data abstraction for all articles, even those covered by the previous review, to accommodate new and modified data elements reflecting changes in technology in the intervening decade.1724

Eligible studies evaluated the effects of clinical decision support systems on processes or outcomes of care using a randomised or quasi-randomised controlled design (allocation on the basis of an arbitrary but not truly random process, such as even or odd patient identification numbers). Patients in control arms received “usual care” contemporaneous with care delivered in the intervention arm (that is, we excluded head-to-head comparisons of different clinical decision support systems).

We defined a clinical decision support system as any on-screen tool designed to improve adherence of physicians to a recommended process of care. Eligible studies delivered the support system intervention within a clinical information system routinely used by the provider (not a computer application separate from the EHR) at the time of providing care to the targeted patient (eg, while entering an order or a clinical note). We excluded specialised diagnostic decision support systems (eg, in medical imaging) and systems not directly related to patient care, such as decision support for billing or health record coding. We also excluded studies using simulated patients and studies in which fewer than half of participants were physicians or physician trainees.

We focused on improvements in processes of care (eg, prescribing drugs, immunisations, test ordering, documentation), rather than clinical outcomes, because we sought to determine the degree to which clinical decision support systems achieve their immediate goal—namely, changing provider behaviour. The extent to which such changes ultimately improve patient outcomes will vary depending on the strength of the relation between targeted processes and clinical outcomes. Nonetheless, we did capture clinical outcomes where reported, including intermediate endpoints, such as haemoglobin A1c and blood pressure. We coded all results so that larger numbers always corresponded to improvements in care. For example, if a study reported the proportion of patients who received inappropriate drugs,825 we recorded the complementary proportion of patients who did not receive inappropriate drugs.

Two investigators independently evaluated the eligibility of all identified studies based on titles and abstracts. Studies not excluded in this first step were independently assessed for inclusion after full text evaluation by two investigators. For articles that met all inclusion criteria, two investigators independently extracted the following information: clinical setting, participants, methodological details, characteristics of the design and content of the clinical decision support system, presence of educational and non-educational co-interventions, and outcomes. We extracted studies with more than one eligible intervention arm as separate trials (that is, comparisons). Two investigators also independently assessed risk of bias for each study using criteria outlined in the Cochrane Handbook for Systematic Reviews of Interventions.2627 Specific biases considered for each study included selective enrolment, attrition bias, similarity of baseline characteristics, unit of analysis errors, performance bias (systematic differences between groups with respect to the care provided or exposures other than the interventions of interest), and detection bias (systematic differences between groups in outcome ascertainment). We resolved all discrepancies and disagreements by discussion and consensus among the study team.

Data analysis

We used a multilevel meta-analysis model to estimate the absolute improvements (risk differences) in processes of care between intervention and control groups, using the number of patients receiving the target process of care and the total sample size for each reported outcome. This approach allowed us to account for heterogeneity between studies, and for clustering of multiple outcomes reported for the same patients within a given study.
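To make this model concrete, the sketch below shows how such a multilevel random effects meta-analysis of risk differences might be fitted with the metafor package in R (the software reported below under statistical analyses). It is a minimal illustration, not the authors' code: the data frame `trials` and its columns are assumed names, and in the full analysis the sampling variances would additionally be inflated for clustering, as described next.

```r
library(metafor)

# Minimal sketch, not the authors' code. Assumed columns:
#   ai, n1i = patients receiving the target process / total, intervention arm
#   ci, n2i = the same counts for the control arm
#   study, outcome = identifiers for each study and each outcome within it
dat <- escalc(measure = "RD", ai = ai, n1i = n1i, ci = ci, n2i = n2i,
              data = trials)

# Multilevel random effects model: between study heterogeneity plus
# clustering of multiple outcomes reported for the same patients
fit <- rma.mv(yi, vi, random = ~ 1 | study/outcome, data = dat)
summary(fit)  # pooled risk difference (absolute improvement) with 95% CI
```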

Most trials used clustered designs, assigning intervention status to the provider or provider group rather than to the individual patient, but did not always report cluster adjusted estimates. We accounted for clustering by multiplying the standard error of the risk differences by the square root of the design effect.22 Intraclass correlation coefficients were abstracted directly from the study when reported. To impute intraclass correlation coefficients in studies that did not report them, we used a published database of intraclass correlation coefficients28 stratified by type of setting (eg, hospital versus ambulatory care) and endpoint (eg, process versus outcome). We calculated the median intraclass correlation coefficient for hospital and ambulatory process measures across the 200 studies in this database and applied the relevant value to a given study.
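As a worked illustration of this adjustment, the sketch below applies the standard design effect formula from the Cochrane Handbook, 1 + (m - 1) × ICC, with m the average cluster size; the function name and the example numbers are hypothetical.

```r
# Inflate the standard error of a risk difference for a clustered trial.
# Function name and example values are hypothetical.
adjust_se_for_clustering <- function(se, n_patients, n_clusters, icc) {
  m <- n_patients / n_clusters   # average cluster size
  deff <- 1 + (m - 1) * icc      # design effect
  se * sqrt(deff)                # cluster adjusted standard error
}

# Example: 2000 patients across 40 provider clusters, with an imputed
# intraclass correlation coefficient of 0.05
adjust_se_for_clustering(se = 0.012, n_patients = 2000, n_clusters = 40,
                         icc = 0.05)
```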

For clinical endpoints, we quantified the median improvement and interquartile range across all studies that reported dichotomous clinical endpoints, such as the percentage of patients who achieved a target blood pressure or the percentage of patients who experienced a clinical event (eg, a critical laboratory result, adverse drug event, or venous thromboembolism). This method29 for summarising the effects of improvement interventions on disparate clinical endpoints was first developed in a large systematic review of implementation strategies for clinical practice guidelines,30 and subsequently applied in other systematic reviews.3132 We also calculated the median improvement and interquartile range for changes in blood pressure, the most commonly reported continuous clinical endpoint.
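In code, this summary reduces to a median and interquartile range over the per trial improvements, as in the short sketch below; the vector holds made up values for illustration only, not study data.

```r
# Hypothetical absolute improvements (percentage points) in a dichotomous
# clinical endpoint across trials; values are illustrative only
improvements <- c(-1.5, -0.7, 0.0, 0.3, 1.1, 1.9, 4.2)

median(improvements)                   # median improvement
quantile(improvements, c(0.25, 0.75))  # interquartile range
```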

We performed univariate meta-regression analyses to explore the extent to which effects varied according to study characteristics and features of the clinical decision support system. These analyses estimated the difference in absolute improvements between studies with and without each feature. Most features are easy to understand (eg, the setting of the intervention as hospital or ambulatory, or the presence of co-interventions alongside the clinical decision support system). Table 1 defines those features with less obvious meanings.

Table 1

Study and clinical decision support system features


Finally, we fitted a multivariable meta-regression model that included covariates with P<0.1 in the univariate analyses, and simplified the model by stepwise selection.33 We used the I2 statistic to summarise statistical heterogeneity (that is, the degree to which trials exhibited non-random variation in effect sizes). We expected substantial heterogeneity in effect sizes given the range of care settings, targeted clinician behaviours, and design features of the clinical decision support system. The multivariable meta-regression model was used to identify study and clinical decision support system features that predicted larger effects and to determine if heterogeneity could be reduced.
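Continuing the earlier sketch, a meta-regression of this kind adds study features as moderators to the multilevel model. The covariate name below is an assumption, and because rma.mv does not report I2 directly, the snippet uses the multilevel approximation described in the metafor documentation.

```r
# Meta-regression sketch: a binary study feature (assumed column name)
# entered as a moderator; its coefficient estimates the incremental
# absolute improvement associated with that feature
fit_mod <- rma.mv(yi, vi, mods = ~ paediatric,
                  random = ~ 1 | study/outcome, data = dat)
summary(fit_mod)

# Approximate I2 for multilevel models (metafor documentation): variance
# components relative to variance components plus typical sampling variance
W <- diag(1 / dat$vi)
X <- model.matrix(fit_mod)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
I2 <- 100 * sum(fit_mod$sigma2) /
  (sum(fit_mod$sigma2) + (fit_mod$k - fit_mod$p) / sum(diag(P)))
```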

We performed all statistical analyses in R, version 3.5.1 (R Foundation for Statistical Computing, Vienna). The rma.mv function from the metafor package was used to fit all models.

Patient and public involvement

Patients or the public were not involved in the design, conduct, reporting, or dissemination plans of this study. No funds or time were allocated for patient and public involvement at the time of the study, so we were unable to involve patients.

Results

Our search identified 5895 citations, of which 5428 were excluded at the initial stage of screening and an additional 352 on full text review, yielding a total of 115 studies that met all inclusion criteria (fig 1 and supplementary appendix 2). Nine studies contained two trials (that is, two comparisons of a clinical decision support system intervention with a control group)73435363738394041 and one study contained six such trials,42 resulting in 129 included trials.

Fig 1

Flow of studies through the review process

Of the 129 included trials (table 2 and supplementary table 1), most used a true randomised design, with only 16 (12%) involving a quasi-random allocation process. Most trials (113/129; 88%) had a clustered design, allocating intervention status to providers or provider groups rather than patients. Most trials (93/129; 72%) occurred in outpatient settings, and 85 of 129 trials (66%) came from US centres. Moreover, 93 (72%) of all 129 trials were published from 2009 onwards, when the US HITECH Act became law. The period since 2009 also included significantly more trials using commercial EHRs (39/93 (42%) v 2/36 (6%), P<0.05 with Bonferroni correction for multiple comparisons). Epic accounted for 26 (63%) of the 41 interventions on commercial EHR platforms. As shown in supplementary figures 1a-b, the risk of bias was generally low. The one exception involved unit of analysis errors, which occurred in 18 (16%) of the 113 clustered trials; these trials were therefore rated at high risk of bias. (We included them because our use of intraclass correlation coefficients and a multilevel model avoided replicating unit of analysis errors from the primary studies.)

Table 2

Summary of characteristics of included trials, according to publication year. Data are number (%) of trials unless stated otherwise


Across the 122 trials that provided analysable data from 1 203 053 patients and 10 790 providers (fig 2), clinical decision support systems produced an average absolute improvement of 5.8% (95% confidence interval 4.0% to 7.6%) in the percentage of patients receiving the desired process of care. For specific types of physician behaviours, prescribing improved by 4.4% (95% confidence interval 2.6% to 6.2%; 68 trials); test ordering by 6.8% (4.9% to 8.6%; 30 trials); documentation adherence by 7.1% (5.4% to 8.9%; 25 trials); vaccination by 5.9% (4.1% to 7.7%; 12 trials); and other process measures by 6.8% (5.0% to 8.6%; 35 trials). Other process measures included referral for specialty consultations,434445 overall guideline concordance,464748 and documenting key elements of the diagnostic process.495051 Supplementary figures 2a-e present the multilevel models for each category.

Fig 2

Absolute improvements in desired care by different categories of clinical care. Results from the multilevel random effects meta-analysis are shown. The diamond shows the summary overall absolute improvement and 95% confidence interval across all types of outcomes; the squares with lines represent estimates and their 95% confidence intervals for different categories of clinical care. *Other process outcomes included referrals for specialty consultations, overall guideline concordance, and diagnosis

In univariate meta-regression analyses (table 3), clinical decision support systems requiring acknowledgement and documentation of a reason for not adhering to the recommended action achieved improvements 4.8% larger than support systems without this feature (95% confidence interval 0.1% to 9.6%). The ability to execute the desired action through the clinical decision support system was also associated with larger effects than interventions without this feature, with an incremental difference of 4.4% (0.9% to 7.9%). Only 17 (14%) of 122 trials considered alert fatigue in designing or delivering the clinical decision support system, and the associated incremental increase for this feature was small (1.7%) and non-significant (95% confidence interval −3.5% to 6.8%; P=0.52).

Table 3

Absolute incremental improvements in desired care by CDSS feature


Improvements seen for clinical decision support systems in paediatric settings exceeded those in other patient populations by 14.7% (95% confidence interval 8.4% to 21.0%; table 4). Additional study features associated with significantly larger effects included smaller studies (patient sample size below the median), which showed improvements 3.7% (0.2% to 7.3%) greater than larger studies, and studies conducted in the US, with an incremental improvement of 3.7% (0% to 7.4%).

Table 4

Absolute incremental improvements in desired care by study feature


As expected, the meta-analytic improvement in processes of care across all trials exhibited substantial heterogeneity (I2=76%), with a minority of studies reporting much larger improvements than the meta-analytic average. The top quartile of trials reported improvements in process adherence ranging from 10% to 62%. In the multivariable meta-regression (table 5), the final model identified paediatric studies as achieving incremental improvements of 13.6% (95% confidence interval 7.4% to 19.8%) and trials with low baseline adherence (below the median across all studies) as achieving incremental improvements of 3.2% (0% to 6.4%). Even when these characteristics were incorporated in the meta-regression model, heterogeneity remained high and essentially unchanged.

Table 5

Multivariable meta-regression model for the absolute incremental improvements in desired care by study and CDSS features


Thirty trials reported dichotomous clinical endpoints (supplementary table 1). These endpoints included achieving guideline based targets for blood pressure525354 and lipid levels5354; adverse events, such as bleeding5556; in-hospital pulmonary embolism5758; hospital readmission5960; and mortality.5860 For these various endpoints, clinical decision support systems increased the proportion of patients achieving guideline based targets by a median of 0.3% (interquartile range −0.7% to 1.9%). Twenty trials reported continuous clinical endpoints (supplementary table 1), including laboratory test values (eg, haemoglobin A1c5461) and questionnaire scores, such as the 36-Item Short Form Survey.6263 Blood pressure was the most commonly reported continuous clinical endpoint.5253546465 Patients in intervention groups had a median reduction in diastolic blood pressure of 1.0 mm Hg (interquartile range 1.0 mm Hg decrease to 0.2 mm Hg increase). Median systolic blood pressure for patients receiving an intervention, however, increased by 1.0 mm Hg (interquartile range 0.3 mm Hg decrease to 1.0 mm Hg increase).

We conducted sensitivity analyses by reanalysing the percentage of patients receiving desired processes of care using only the largest improvement among the outcomes reported in each trial, rather than having each trial contribute data for all eligible processes of care. These “best case analyses” produced a pooled absolute improvement in patients receiving desired care of 8.5% (95% confidence interval 6.8% to 10.2%). We also repeated our analyses using odds ratios rather than risk differences, since absolute risk differences can produce less stable meta-analytic estimates than relative risks or odds ratios across varying levels of background risk.66 The pooled odds ratio for improvement in adherence was 1.43 (95% confidence interval 1.30 to 1.58). Overall, when odds ratios were used, our results remained qualitatively the same for important study and clinical decision support system features, types of outcome, and level of heterogeneity.
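Under the same assumed data layout as the earlier sketches, the odds ratio reanalysis changes only the effect size measure and back-transforms the pooled estimate from the log scale.

```r
# Sketch of the odds ratio sensitivity analysis (same assumed columns)
dat_or <- escalc(measure = "OR", ai = ai, n1i = n1i, ci = ci, n2i = n2i,
                 data = trials)
fit_or <- rma.mv(yi, vi, random = ~ 1 | study/outcome, data = dat_or)
predict(fit_or, transf = exp)  # pooled odds ratio and 95% CI
```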

Discussion

Principal findings

Across 122 controlled, mostly randomised, trials involving 1 203 053 patients and 10 790 providers, clinical decision support systems improved the average percentage of patients receiving the desired element of care by 5.8% (95% confidence interval 4.0% to 7.6%). As expected, these trials exhibited substantial heterogeneity (I2=76%). Although it is generally not advisable to perform a meta-analysis with this degree of heterogeneity, we have reported these results to show that the current literature, despite its substantial size, provides little guidance for identifying the circumstances under which clinical decision support system interventions produce worthwhile improvements in care.

Implications

On the one hand, the extreme heterogeneity indicates non-random variation in effect sizes, such that a minority of interventions might have achieved effects substantially larger than the 95% confidence interval around the meta-analytic average would suggest. Indeed, 25% of studies reported absolute improvements greater than 10%, with one as high as 62%.67 Yet, even with the identification of two significant predictors of larger effects (paediatric studies and those with low baseline adherence), the meta-regression model still showed extreme heterogeneity. Thus even when these characteristics were taken into account, wide, non-random variation remained in the improvements seen with clinical decision support systems. The reasons for this variation remain largely unknown.

Other systematic reviews have similarly reported extreme heterogeneity, with I2 as high as 97% in one instance.18 Moja and colleagues reported more moderate heterogeneity (I2=41-64%),21 but their review focused specifically on measures of morbidity and mortality. Other reviews alluded to high heterogeneity without reporting formal analyses,19 whereas others did not mention it.1420

Previous reviews of clinical decision support systems have typically looked only at predictors of improvements in care,1418 or reported odds or risk ratios192021 for receiving recommended care. We sought to characterise the typical improvement achieved—namely, a 5.8% increase in the percentage of patients receiving the desired process of care. To put the magnitude of this improvement into perspective, for the control groups in the included trials, a median of 39.4% of patients received care recommended by the clinical decision support system. Thus, in the typical intervention group, about 45% would receive this recommended process of care. Even if the meta-analytic result of 8.5% from our sensitivity analysis incorporating the largest improvement reported in each trial were used, the typical clinical decision support system intervention would still leave over 50% of patients not receiving recommended care.

We would characterise these absolute increases of 5.8% to 8.5% in the percentage of patients receiving recommended care as a small to moderate effect, but this does not imply that all such improvements are unimportant. For some processes of care, such as vaccinations and evidence based cancer screening, even a small increase in the percentage of patients who receive this care will translate into worthwhile benefits at the population level. For many other processes of care, where the recommendation has a weaker connection with important outcomes, a small increase in adherence may not justify its implementation and subsequent contribution to clinicians’ frustrations with EHRs.68

Clinical decision support systems embody one of the eagerly awaited applications of artificial intelligence and machine learning for patient care.69 By leveraging the power of “big data,” these technologies promise earlier recognition of sepsis7071 and impending clinical deterioration.7273 Even with the most effective artificial intelligence algorithms,74 however, the decision support tools alerting clinicians to their complex outputs will still largely depend on—and therefore be limited by—the small to moderate effect sizes typically obtained in our analysis.

Notably, studies that targeted a paediatric population were associated with the largest absolute improvement in process adherence. Many of these studies occurred at large, university affiliated health centres with mature clinical information systems. In our earlier review,17 we noted that studies conducted at institutions with longstanding experience in clinical informatics showed significantly larger improvements. In this updated review, we performed similar analyses and found no association between effect size and specific institutions or mature homegrown systems.

Although much of the early work on EHRs and clinical decision support systems focused on inpatient care, 93 (72%) of the 129 trials included in this analysis took place in the outpatient setting. Across both settings, most trials (85/129; 66%) came from the US. Nonetheless, we believe our findings are relevant elsewhere. For instance, primary care EHRs in the UK increasingly record coded demographic, diagnostic, and therapeutic data, providing a foundation for future clinical decision support systems.

Several factors could explain the disappointingly small effect sizes typically achieved by clinical decision support systems. Firstly, researchers and leaders in clinical informatics and human factors have recognised for years the importance of informing this work with a rich sociotechnical model. This model includes the human-computer interface, hardware and software computing infrastructure, clinical content, people, workflow and communication, internal organisational culture, external regulations, and system measurement.75767778798081 Yet, those who develop and implement clinical decision support systems typically do not take into account the full range of these dimensions, or the complex interplay between them. In addition to using such considerations to inform the design and evaluation of clinical decision support systems, future studies should report specific potential effect modifiers, including physician EHR training, number of alerts for each visit, physician visit volume, and broader contextual factors, such as validated measures of burnout and local safety culture. A recent publication guideline may foster such improved reporting of trials of clinical decision support systems.82

Alert fatigue is a second possible explanation for the small to moderate effect sizes typically achieved in our analysis. Clinicians could encounter the same alert for many patients, or a given clinical information system could have several clinical decision support systems operating concurrently. In either case, clinicians could become less receptive to alerts from these support systems. Surprisingly, only 19 (15%) of 129 trials in this review mentioned alert fatigue in their design. Although only a minority of trials considered alert fatigue, studies focusing on this topic outside of controlled trials have highlighted the problem, and strategies to minimise the burden to users from clinical decision support systems are under investigation. Growing concerns that alert fatigue contributes to dissatisfaction with EHRs, and even clinician burnout, underscore the importance of considering this problem in future studies of support systems.68

Finally, as has happened with other common improvement interventions, from bundles and checklists to performance report cards and financial incentives, clinical decision support systems have become a “go to” off-the-shelf intervention. Many reflexively reach for these interventions without considering how they work or the degree to which they address the underlying causes of the target problem.8384 The decision to employ a clinical decision support system often reflects ease of deployment rather than its suitability for the problem at hand. A pop-up computer screen reminding clinicians of the approved indications for a broad spectrum antibiotic may ignore the psychological reality of the clinician’s primary concern at that moment: avoiding further deterioration of a very sick patient, rather than the public health consequences of excessive antibiotic use. An antimicrobial stewardship programme85 could hold far greater promise in achieving this goal than an interruptive alert, which many clinicians will probably ignore or which, worse, could cause alert fatigue, thereby undermining the potential effectiveness of other support system interventions in a given healthcare setting. Future interventions may also seek to harness potential synergies between clinical decision support systems and other well known interventions, such as performance feedback.86

Risk of bias

The only included studies with a high risk of bias were the 18 (16%) of 113 clustered trials exhibiting unit of analysis errors, which typically underestimate the width of confidence intervals. We were able to include such studies without replicating this bias because we used their primary data, rather than the reported effect sizes, together with reported or imputed intraclass correlation coefficients for each clustered trial in a multilevel model. The lack of other trials with a high risk of bias among the included studies probably reflects the nature of the intervention (that is, clinical decision support systems delivered at the point of care) and the endpoints in our analysis (that is, the degree to which patients received recommended care as documented in the EHR), as many of the methodological shortcomings that can undermine evaluations of other interventions are easier to avoid. For instance, systematic differences in outcome ascertainment are unlikely to arise. There is also no clear way in which exposures other than the intervention of interest (the clinical decision support system) could systematically differ between intervention and control patients. Losses to follow-up (that is, attrition bias) are also virtually impossible, given that the EHR captures whether or not the clinician’s orders or documentation complied with recommended care whenever the clinical decision support system was triggered.

Attrition due to loss of entire clusters is the one possible exception and occurred in several trials in which one or more clinics assigned to the control arm dropped out of the trial. In a representative example of such a trial,87 participating clinics received an EHR system with or without additional guideline based decision support. Some clinics assigned to the control group dropped out owing to lack of motivation to implement a new system with no chance of benefitting their patients. The decision of clinics in the control arm not to follow through with an information technology intervention, which is well known to take personnel time and cause frustration to clinicians, did not seem to us to carry a clear risk of bias. We labelled the few such trials as “unclear risk.”

Limitations

Heterogeneity among the included studies is the main limitation of our analysis. Recommendations to refrain from meta-analysis when the I2 statistic exceeds 50%22 stem from the desire to avoid spurious precision in the estimated effect size. The extreme heterogeneity we report (indicating substantial variation in effects across trials beyond that expected from chance alone) suggests that some subsets of trials might have achieved substantially larger (or smaller) effects than denoted by the 95% confidence interval for the pooled effect across all trials. Indeed, this is one of the central findings of our study. Yet, even with a meta-regression model incorporating objective features of either the clinical decision support systems or the studies evaluating them, heterogeneity remained unchanged. We therefore conducted this meta-analysis despite extreme levels of heterogeneity because the results highlight that, while clinical decision support systems sometimes achieve large effects, the current literature does not adequately identify the circumstances under which worthwhile improvements occur.

Conclusions

Despite publication of over 100 randomised and quasi-randomised trials involving over one million patients and 10 000 providers, the observed improvements display extreme, unexplained variation. Achieving worthwhile improvements therefore remains largely a case of trial and error. Future research must identify new ways of designing clinical decision support systems that reliably confer larger improvements in care while balancing the threat of alert fatigue. Head-to-head trials comparing design features of different support systems will also be important.88 In the meantime, a critical consideration in deciding to implement clinical decision support systems is the strength of the connection between the targeted process of care and patient outcomes. Achieving small improvements in care with only a presumptive connection to patient outcomes may not be worth the risk of contributing to alert fatigue or the growing concern of physician burnout attributable to EHRs.6889

What is already known on this topic

  • Clinical decision support systems embedded in electronic health records prompt clinicians to deliver recommended processes of care

  • Despite enthusiasm over the potential for clinical decision support systems to improve care, a previous systematic review in 2010 found that such systems typically improved the proportion of patients receiving target processes of care by less than 5%

  • The number of trials published over the ensuing decade has grown considerably, but subsequent systematic reviews have focused only on identifying features of clinical decision support systems associated with positive results, rather than quantifying the actual sizes of improvements achieved

What this study adds

  • Most clinical decision support system interventions achieve small to moderate improvements in the percentage of patients receiving recommended processes of care

  • The pooled effect size exhibited extreme heterogeneity (that is, variation across trials beyond that expected from chance alone), which did not diminish with a meta-regression model using significant predictors of larger effect sizes

  • Thus although a minority of studies have shown that clinical decision support systems deliver clinically worthwhile increases in recommended care, the circumstances under which such improvements occur remain undefined

Acknowledgments

We thank Julia Worswick, Sharlini Yogasingam, and Claire Chow for their support with data abstraction; Alex Kiss for statistical support; and Michelle Fiander for assistance with the bibliographic search.

Footnotes

  • Contributors: JLK, LL, JMG, and KGS led the study design. JLK, LL, JF, HG, and KGS extracted the data. JLK, JPDM, GT, and KGS analysed the data. JLK and KGS drafted the manuscript. All authors provided critical revision of the manuscript for important intellectual content and approval of the final submitted version. JLK and KGS are the guarantors. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

  • Funding: JMG holds a Canada Research Chair in health knowledge transfer and uptake. The funders had no role in the study design; the collection, analysis, or interpretation of data; the writing of the report; or the decision to submit the article for publication.

  • Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

  • Ethical approval: Ethical approval for this evidence synthesis was not required.

  • Data sharing: No additional data available.

  • The lead author (the manuscript’s guarantor) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

  • Dissemination to participants and related patient and public communities: Dissemination of the study results to study participants is not applicable. We plan to use media outreach (eg, press release) and social media to disseminate our findings and communicate with the population at large.

  • Provenance and peer review: Not commissioned; externally peer reviewed.


This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
