In response to widely publicized reports highlighting the challenges of suboptimal quality of healthcare, improving patient safety has been a leading healthcare initiative for more than 10 years.1-4 Numerous strategies to improve patient safety have been proposed,5-9 but improvements have been limited, which raises questions about whether the right approaches are being employed.10,11
Checklists have served as a foundation for the standardization and safety of aviation and nuclear power12,13 and are advocated as simple and effective instruments for ensuring safe care.7,14,15 Systematic reviews of observational studies suggest that checklists can reduce medical errors and adverse events,15-19 but these reviews are at risk of bias due to the limitations of observational methods. Furthermore, discordant results of recent high-profile evaluations of the World Health Organization (WHO) Surgical Safety Checklist highlight the need for checklist evaluations using rigorous study designs.20-22 Therefore, we sought to conduct a systematic review of randomized controlled trials (RCTs) to determine whether checklists, as a type of decision-support tool, are effective at improving patient safety outcomes in hospitalized patients.
The study protocol was registered with the PROSPERO Register of Systematic Reviews (registration number: CRD42016037441) and developed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.23
On December 8, 2016, we systematically searched Ovid MEDLINE, Ovid EMBASE, PubMed, and the Cochrane Central Register of Controlled Trials. The search was performed using no language or publication date restrictions and included 2 groups of terms (key words with similar characteristics): ‘checklists’ and ‘patient outcomes assessment’. We restricted our search to patient outcomes because these are more patient-oriented than the proximal processes of care that may not translate into outcomes. The search was restricted to RCTs using the Cochrane Highly Sensitive Search Strategy for Identifying Randomized Trials from the Cochrane Collaboration.24 The MEDLINE search strategy is depicted in Appendix I (Supplementary File 1). Reference lists of included articles were manually searched for additional publications. The search strategy was designed with the help of an information scientist (DL). EndNote X7 (Thomson Reuters, Philadelphia, PA, USA) was the reference software used for the management of citations.
We selected all studies reporting patient safety outcomes of a checklist intervention, using the following inclusion criteria: 1) acute care hospital inpatient population, 2) checklist intervention, 3) control group (ie, no checklist), 4) one or more patient safety outcomes reported, as defined by the study authors (eg, medical errors, adverse events, mortality), and 5) RCT design. We restricted our focus to inpatient populations given the heterogeneity of illness and patient care between acute and community settings. We defined a checklist as a tool that details the essential steps of a task, requiring the target provider to indicate whether an item was completed or not.1,7 Tools that included only 1 item (eg, electronic prompts) or did not require acknowledgement of the items (eg, guidelines) were excluded. We accepted each study's own definition of a patient safety outcome (eg, medical error, adverse event, provider compliance with safety regulations).
Two reviewers (JMB, GW) independently, and in duplicate, reviewed the titles and abstracts of the retrieved citations against the eligibility criteria. The same 2 reviewers subsequently reviewed the full text of relevant articles for inclusion. Eligibility disagreements were resolved by consensus. A Kappa statistic was calculated for reviewer agreement of full-text screening.25 Reviewers were not blinded to author or journal names.26
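The agreement statistic mentioned above can be illustrated with a short sketch. This is a minimal Python version of Cohen's kappa for 2 reviewers, assuming each reviewer's include/exclude decisions are stored as equal-length lists; the function name `cohen_kappa` and the example decisions are hypothetical and are not the review's screening data.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who classified the same items.

    ratings_a, ratings_b: equal-length sequences of category labels
    (for example "include" / "exclude").
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)
    labels = set(ratings_a) | set(ratings_b)

    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of each rater's marginal proportions, summed over labels.
    p_expected = sum(
        (list(ratings_a).count(label) / n) * (list(ratings_b).count(label) / n)
        for label in labels
    )

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical full-text screening decisions for 8 articles (illustration only).
    reviewer_1 = ["include", "exclude", "exclude", "include", "exclude", "exclude", "include", "exclude"]
    reviewer_2 = ["include", "exclude", "include", "include", "exclude", "exclude", "exclude", "exclude"]
    print(f"kappa = {cohen_kappa(reviewer_1, reviewer_2):.3f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which matters in screening because most citations are excluded by both reviewers.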
The structured data extraction form was calibrated using the first 2 articles. The 2 reviewers (JMB, GW) independently, and in duplicate, extracted data from included studies on the study characteristics, setting, study population, sample size, intervention used, outcomes examined, analytic method, and study quality. The data extraction form is depicted in Appendix II (Supplementary File 2). Coding discrepancies were resolved by consensus.
The 2 reviewers (JMB, GW) extracted data on study quality independently and in duplicate using 2 approaches. First, reviewers assessed study quality using a component method derived from the Cochrane Collaboration criteria.24 For each included study, the reviewers documented if the authors had adequately described inclusion/exclusion criteria, randomization, allocation concealment, blinding of participants/outcome assessors, attrition, cross over, baseline characteristics, and power calculation. Second, the reviewers calculated and reported the Jadad score for each included study, a validated assessment scale that assigns points (0 to 5) based on randomization, blinding, and attrition.27
Owing to the heterogeneity of the data and the small number of studies that satisfied the inclusion criteria, the data were analyzed using guidelines for the narrative synthesis of a systematic review.28 Descriptive statistical findings from each included study were reported. The DerSimonian and Laird method for random-effects models was used to calculate a pooled estimate of 30-day all-cause mortality from the raw data available from a subset of studies (number of events, study population).29 Stata SE version 13.1 (Stata Corp, LP, College Station, TX) was used to perform the statistical analyses.
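The DerSimonian and Laird pooling described above can be sketched as follows. This is a minimal illustration in Python, not the Stata routine used for the analysis; the function name `dersimonian_laird_or`, the 0.5 continuity correction for zero cells, and the example event counts are assumptions made for illustration and are not data from the included trials.

```python
import math


def dersimonian_laird_or(studies):
    """Pool odds ratios with the DerSimonian-Laird random-effects method.

    studies: list of tuples (events_checklist, n_checklist, events_control, n_control).
    Returns a dict with the pooled OR, its 95% CI, tau^2, Cochran's Q and I^2 (%).
    """
    log_ors, variances = [], []
    for a, n1, c, n2 in studies:
        b, d = n1 - a, n2 - c
        if 0 in (a, b, c, d):
            # Assumed continuity correction so the log odds ratio is defined.
            a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
        log_ors.append(math.log((a * d) / (b * c)))
        variances.append(1 / a + 1 / b + 1 / c + 1 / d)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    df = len(studies) - 1

    # Method-of-moments estimate of the between-study variance, truncated at 0.
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical 30-day mortality counts for 3 trials (illustration only).
    example = [(12, 200, 20, 200), (8, 150, 11, 150), (30, 1000, 41, 1000)]
    result = dersimonian_laird_or(example)
    print({key: round(value, 3) for key, value in result.items()})
```

When the estimated between-study variance is 0, the random-effects weights reduce to the fixed-effect inverse-variance weights, so both models give the same pooled estimate.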
The literature search identified 11,225 unique citations from which 83 abstracts were eligible for full-text review. We identified 9 full-text articles for inclusion in the review (Figure 1 [Supplementary File 3]). The main reasons for citation exclusion during the full-text review were that the study design was not an RCT (39%) or there was no checklist intervention (34%). Inter-rater agreement for full-text inclusion was fair (K = 0.660; 95% confidence interval [CI], 0.414-0.828).
Characteristics of the included studies are summarized in Tables 1 and 2. Six of the studies were conducted in at least 1 teaching hospital.30-35 The studies varied in target populations for both the checklist user and patients. The outcomes reported varied; 3 studies examined 30-day mortality,21,30,36 4 studies examined hospital length of stay,21,30,33,36 and 2 studies reported user compliance with the checklist.21,31 Five of the studies reported patient outcomes,21,30,33,35,36 and 5 studies reported provider-level outcomes related to patient safety (eg, compliance with checklist items such as communication of medications, isolation precautions, etc.).31-34,37
Supplementary File 4 (Table 3) provides a detailed breakdown of the checklists’ purpose and components. Six of the checklists were designed to directly reduce patient safety events,21,30,33,35-37 whereas 3 of the checklists were designed to indirectly reduce patient safety events by increasing compliance with processes of care.31,32,34 Six checklists were constructed and pilot tested by the research team conducting the RCT,30-35 and the 3 remaining studies used modified versions of previously validated checklists.21,36,37 The number of items included in the checklist ranged from 2 to 54.
Impact of the Checklist
Table 4 summarizes the adverse events, medical errors, resource utilization and/or compliance reported for each checklist. Chaudhary et al. reported significant decreases in Grade III (requiring intervention)38 and IV (life-threatening)38 postoperative complications (23% vs 33%, P = 0.04) and 30-day mortality (5.7% vs 10.0%, P = 0.04) for patients assigned to the Modified WHO Surgical Safety Checklist compared to controls.21 Conversely, Haugen et al. reported a nonsignificant reduction in 30-day mortality between the WHO Surgical Safety Checklist group and controls (1.0% vs 1.6%, P = 0.151).36 Bassor et al. reported no significant difference in 30-day hospital readmission for decompensated heart failure for the heart failure discharge checklist group when compared to controls (6% vs 4%, P = NS); however, an exploratory analysis that excluded patients who died during the follow-up period found a significant difference in 30-day readmission rates (2% vs 20%, P = 0.02).30 Gentili et al. reported a higher proportion of patients with pain control in the checklist group compared to the controls (67.6% vs 54.8%), as well as fewer incidents of analgesic therapy–related uncontrolled adverse events (25.9% vs 49.9%); however, the statistical significance of these differences was not reported.35 The Writing Group for CHECKLIST-ICU reported no significant difference for in-hospital mortality between the checklist and control groups (adjusted odds ratio [AOR] 1.02, 95% CI, 0.82-1.26, P = 0.88), nor for the secondary clinical outcomes examined (Table 4).33 However, there was a significant difference between the checklist group and control group for 3 of the 7 outcomes related to processes of patient care, including a reduction in the use of both urinary catheters (adjusted rate ratio [ARR] 0.86, 95% CI, 0.80-0.93, P < 0.001) and central venous catheters (ARR 0.90, 95% CI, 0.83-0.98, P = 0.02).
Masson et al. reported that when using the FASTHUG-MAIDENS checklist, more drug-related problems were identified by pharmacy residents (in relation to the number identified by the ICU pharmacist) both per patient encounter (P = 0.008) and overall (P < 0.001).37 Ong et al. reported higher rates of compliance with isolation precautions for infectious diseases in the checklist group (71% vs 38%, P < 0.01); however, compliance with the checklist was low (40%) and qualitative analyses found participants were dissatisfied with the checklist.31 Salzwedel et al. reported the number of items handed over by anesthesia residents postoperatively to be higher in the checklist group than the control group (48.7% vs 32.4%, P < 0.001).32 In a more recent study, Salzwedel et al. reported that the proportion of items deemed by the attending anesthesiologist as “must be handed over” that were actually handed over by the anesthesia residents was higher in the checklist group than in controls (87.1% vs 75.0%, P = 0.005).34
A random-effects model pooling data from the 3 studies that reported data for 30-day all-cause mortality suggested a significant reduction with use of a checklist (odds ratio [OR] 0.60, 95% CI, 0.41-0.89, P = 0.01), with no evidence of statistical heterogeneity (I2 = 0.0%, P = 0.573).
Supplementary File 5 (Table 5) summarizes the quality assessment of the 9 studies. The clarity of description for each intervention varied. All studies reported inclusion/exclusion criteria and randomization procedures. Three studies indicated that outcome assessors were blinded to intervention allocation,32,34,36 whereas this was unclear in 2 studies.21,30 Three studies reported baseline characteristics.21,30,36 Two studies reported power calculations;33,37 however, one study had a sample size that was less than that required to achieve the target power.37 The Jadad scores ranged from 1 to 5.
This systematic review identified 9 RCTs that examined the impact of a checklist on patient safety outcomes in hospitalized patients. The studies employed checklists with different purposes and elements and measured different patient safety outcomes. The methodological quality of the included studies was moderate. In aggregate, the results suggest that checklists may be effective at improving patient safety outcomes, but the small number of moderate quality studies and the heterogeneity of interventions and outcome measures suggest that there is an urgent need for further evaluation.
The most important observation from our systematic review is the paucity of high quality evidence evaluating checklists’ impact on patient safety outcomes in acute inpatient care. The implementation of checklists is increasingly common as they are relatively low cost to develop and implement, and intuitively make sense. This is particularly true in an era of increasing efforts to standardize care as a means for improving quality and minimizing cost (ie, previous systematic reviews cite 38 unique studies).39 However, implementation of an inadequately tested checklist risks unintended consequences (eg, inefficient resource utilization).18 The small number of RCTs identified may reflect the traditional focus of quality improvement efforts on ‘real life’ applicability over rigorous research methodology.40 The translation of evidence into clinical practice is known to be slow;41 however, more rigorous methodologies reduce the risk of bias and generate high-quality evidence, which helps to identify best practices while avoiding these unintended consequences.
The studies varied both in the approaches used to develop checklists and in the number of items included (ranging from 2 to 54). What is the optimal method for developing a checklist, and how does the method affect the checklist's effectiveness?42 The answers to these questions are not known. However, this review highlights some important issues to consider when developing a checklist. As the number of items or complexity of a task increases, our ability to efficiently perform the task without aid decreases.43-45 As such, a well-designed checklist should detail explicit instructions on the what, where, when, and how of a given task in a fashion that ensures a consistent accuracy for completing the work.5 It is recommended that construction of a checklist follow the principles of human factors engineering: engage stakeholders and human factors experts in the design; base development on user needs and realities; list items in order of importance; keep the checklist concise and subgroup sections by task or chronological order; ensure usability and evaluate potential negative consequences (eg, time to complete); pilot test and validate before implementation; and update as needed based on the generation of new findings or changes in operational procedures.46 These general principles of human factors engineering46 provide a practical approach for the development and evaluation of a checklist. In addition, standardization of operational definitions (ie, process, outcome, compliance) is important for study replication and robust meta-analyses.
Checklists used in aviation are perhaps best known12 and the evidence of their effectiveness is derived from the attribution of aviation errors to incomplete checklists.12 Although more recently implemented in medicine, checklists have the potential to guide the successful completion of complex tasks in healthcare.7 Systematic reviews of observational studies have been conducted for specific checklists (eg, WHO Surgical Safety Checklist) and for select patient populations (eg, surgical patients), and the number of included studies ranges from 7 to 27 (n = 38 unique studies).15,16,18,19 For example, Gillespie et al. in a systematic review and meta-analysis reported the implementation of Surgical Safety Checklists to be associated with a reduction in postoperative complications (relative risk [RR] 0.63, 95% CI, 0.58-0.72, P < 0.001), but not mortality (RR 1.03, 95% CI, 0.73-1.4, P = 0.857).19 Similarly, Treadwell et al. reported in a systematic review of Surgical Safety Checklists that while data are promising, more evaluation of their impact on clinical outcomes is needed.18 These recommendations are nicely illustrated by Urbach et al.’s20 and O’Leary et al.’s47 evaluations of the mandatory adoption of Surgical Safety Checklists across all hospitals in Ontario, Canada, which demonstrated no significant reductions in 30-day perioperative complications for adult (OR 0.97, 95% CI, 0.90-1.03, P = 0.29) and pediatric (AOR 1.01, 95% CI, 0.90-1.14, P = 0.9) patients, respectively.
These data not only highlight the need for further evaluation of checklists but are also a reminder that checklists and their associated implementation strategies are complex interventions for which there may be important differences between the efficacy reported in clinical trials and the effectiveness reported in implementation studies.48 This all suggests that if checklists are to be effective in improving patient safety, process evaluations of implementation49 and realist reviews of published studies50 may be important to determine optimal approaches for implementation. We believe that, based on the limited currently available evidence, there is urgency for further robust evaluations of checklists before their widespread implementation. If effective, they should be widely implemented. If ineffective, they should be abandoned to minimize unintended consequences and inefficient use of resources.
There are 4 primary limitations to this review that should be considered when interpreting the findings. First, the RCT design is not the study design employed by most quality improvement initiatives.40 While some quality improvement experts may argue that an RCT design is insufficiently flexible for applied settings, it does minimize the risk of biased assessments of intervention effectiveness. Second, our search strategy included an RCT filter. The filter helped restrict the number of citations to be reviewed (n = 11,225) but could have resulted in improperly indexed studies being excluded. To guard against this risk, we used the validated Cochrane Highly Sensitive Search Strategy for Identifying Randomized Trials,24 reviewed reference lists of citations included in the review, and solicited suggestions for missing studies from quality improvement experts. Third, our review was restricted to hospitalized patients. Although the studies evaluated commonly reported safety outcomes across patients with diverse clinical conditions, care settings, and providers that broadly reflect hospital-based care, evaluations of checklists in additional patient and provider groups are needed (eg, hospitalists). Furthermore, the effectiveness of checklists for improving patient safety outcomes in outpatients is important; however, the organizational and patient characteristics of these 2 settings (hospitalized vs outpatient) are sufficiently different to warrant separate systematic reviews. Finally, owing to the heterogeneity of the checklists used and outcomes measured, we were unable to perform a robust meta-analysis. Heterogeneity, combined with the small number of studies identified in our search, prevented us from applying statistical methods to assess for publication bias. This limitation of our systematic review highlights an important gap in the literature and emphasizes the importance of additional primary research to evaluate checklists.
In summary, we identified few RCTs that examined checklists designed to improve patient safety outcomes. The small number of existing studies suggests that checklists may improve patient safety outcomes; however, these observations were not reported for all outcomes examined and the studies were heterogeneous and of limited methodological quality. There is an urgent need for high-quality evaluations of the effectiveness of patient safety checklists in inpatient healthcare settings to substantiate their perceived benefits.
We would like to thank Diane Lorenzetti for her help with the development of the search strategy.
Disclosure: The authors have no known conflicts of interest to declare.
Jamie Boyd was supported by a W21C – Alberta Innovates-Health Solutions (AIHS) Collaborative Research and Innovation Opportunities (CRIO) Health Services Research graduate studentship. Guosong Wu was supported by a Western Regional Training Centre (WRTC) for Health Services Research graduate studentship. Dr. Stelfox was supported by a Population Health Investigator Award from Alberta Innovates Health Solutions.
HTS was responsible for the study’s conception. All 3 authors contributed to the study’s design and interpretation. JB and GW were responsible for searching the literature, reviewing abstracts, selecting full-text articles and critically appraising them. All 3 authors performed the analyses. JB drafted the manuscript and all 3 authors assisted in the successive revisions of the final manuscript. All authors have read and approved the final manuscript.