Brief Reports

Ultrabrief Screens for Detecting Delirium in Postoperative Cognitively Intact Older Adults


The authors sought to develop an ultrabrief screen for postoperative delirium in cognitively intact patients older than 70 years undergoing major elective surgery. Sensitivities, specificities, and 95% confidence intervals were calculated for all possible one-, two-, and three-item screens against the reference standard for delirium, the Confusion Assessment Method (CAM). Among the 560 participants (mean age, 77 years; 58% women), delirium occurred in 134 (24%). We considered 1,100 delirium assessments from postoperative days 1 and 2. The screen with the best overall performance consisted of three items: (1) Patient reports feeling confused, (2) Months of the year backward, and (3) “Does the patient appear sleepy?”, with a sensitivity of 92% and a specificity of 72%. This brief, three-item screen rules out delirium quickly, identifies a subset of patients who require further testing, and may be an important tool to improve recognition of postoperative delirium.

© Society of Hospital Medicine

Delirium is the most common postsurgical complication for older adults, with an incidence of 15%-54%, depending on surgery type.1 Increasing numbers of older adults are undergoing surgery,2 and those who develop delirium experience negative consequences, including longer lengths of stay, higher likelihood of institutional discharge, and increased morbidity and mortality.3 The American Geriatrics Society Expert Panel on Postoperative Delirium in Older Adults and the European Society of Anaesthesiology4 recommend routine screening for delirium in those at risk.

Ultrabrief screens are designed to rule out delirium quickly and identify a subset of patients who require further testing.5 Our group and others have previously published ultrabrief screens for the general medicine, nonsurgical population and for patients with dementia.5,6 The UB-2 is an ultrabrief screen consisting of “Months of the year backward” (MOYB) and “What day of the week is it?”; it has a sensitivity of 93% and a specificity of 64% in hospitalized older adults and takes less than 40 seconds to administer.5 However, no such screens have been developed for patients with relatively high cognitive and physical functioning who undergo scheduled major surgery, a group in which delirium may present differently. Thus, the purpose of this study was to develop an ultrabrief screen for postoperative delirium using data from a large study of delirium in cognitively intact older adults undergoing scheduled major noncardiac surgery.

Methods


We performed a secondary data analysis on 560 patients enrolled between June 18, 2010, and August 8, 2013, in the Successful Aging After Elective Surgery (SAGES) study,7 an ongoing prospective cohort study of older adults undergoing major elective surgeries (eg, total hip or knee replacement; lumbar, cervical, or sacral laminectomy; lower extremity arterial bypass surgery; open abdominal aortic aneurysm repair; and open or laparoscopic colectomy). Exclusion criteria included evidence of dementia, delirium, prior hospitalization within 3 months, legal blindness, severe deafness, terminal condition, history of schizophrenia or psychosis, and history of alcohol abuse or withdrawal. The Institutional Review Boards of Beth Israel Deaconess Medical Center, Brigham and Women’s Hospital, and Hebrew SeniorLife, all in Boston, Massachusetts, approved the study.

SAGES Delirium Assessment and Additional Variables

The presence or absence of delirium was based on daily in-hospital assessments by trained research staff using the Confusion Assessment Method (CAM)8 long form. The assessment also incorporated the Delirium Symptom Interview (DSI)9 and information about acute changes in mental status provided by nursing staff and/or family. Delirium severity was determined using the CAM-S.10 Participants in the SAGES study had a baseline presurgical assessment in their homes. Data on cognitive and physical functioning, depression, comorbidities, and laboratory values, along with self-reported demographic data, were collected.

Statistical Analyses

We included CAM delirium data from postoperative days (POD) 1 and 2 for each participant, if available; POD0 was not included because of potential residual anesthetic effects. We chose these days because most delirium began on POD1 or POD2, and patients started being discharged on POD3. We considered all one-, two-, and three-item combinations of the 12 cognitive items of the 3D-CAM11 because of their demonstrated high information content for CAM diagnostic features per Item Response Theory.12 There were 12 possible one-item screens, 66 two-item screens, and 220 three-item screens. For each screen, we calculated sensitivity, specificity, and 95% confidence intervals against the CAM delirium determination. An ideal ultrabrief screen for delirium has high sensitivity with moderate specificity; by investigator consensus, our general guidelines were a sensitivity higher than 0.90 and a specificity greater than 0.70. Because these screens are used to quickly rule out delirium, we also present the percentage of positive screens among the entire population (whether delirium is present or not). Screens with a positive screen rate of more than 50% are unlikely to be helpful because they cannot rule out delirium quickly in a large enough fraction of the population. We also required that, in multiple-item screens, no two items assess the same CAM feature. For instance, we would eliminate a two-item screen with MOYB and four-digit span because both items measure CAM Feature 2 (Inattention). Finally, we evaluated screen performance separately on POD1 and POD2. Switching screens by POD can be confusing, so we chose a single best screen that retained excellent performance over both days. Data analyses used SAS version 9.4 (SAS Institute, Cary, North Carolina).
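
The enumeration described above can be sketched in a few lines of Python. This is an illustration only, not the SAS code used in the study. Apart from MOYB and four-digit span (both CAM Feature 2, as stated above), the item names and their feature assignments are hypothetical placeholders, because the 12 items and their mapping are not listed here. The rule that any positive or incorrect item makes the screen positive is assumed from the clinical workflow described for the final screen.

```python
from itertools import combinations

# Hypothetical item names mapped to the CAM feature each is assumed to test.
# Only the MOYB / four-digit-span pairing (both Feature 2) comes from the text;
# every other name and feature number is a placeholder for illustration.
ITEM_FEATURE = {
    "moyb": 2, "digit_span_4": 2, "item_03": 2, "item_04": 2,
    "item_05": 3, "item_06": 3, "item_07": 3, "item_08": 3,
    "item_09": 4, "item_10": 4, "item_11": 1, "item_12": 1,
}


def all_screens(items, max_size=3):
    """Return every unordered combination of 1..max_size items."""
    ordered = sorted(items)
    return [combo
            for size in range(1, max_size + 1)
            for combo in combinations(ordered, size)]


def has_distinct_features(screen, item_feature=ITEM_FEATURE):
    """True when no two items in the screen test the same CAM feature."""
    features = [item_feature[item] for item in screen]
    return len(features) == len(set(features))


def screen_is_positive(screen, responses):
    """Assumed rule: a screen is positive if any of its items is positive."""
    return any(responses[item] for item in screen)


screens = all_screens(ITEM_FEATURE)
counts = {size: sum(1 for s in screens if len(s) == size) for size in (1, 2, 3)}
print(counts)  # {1: 12, 2: 66, 3: 220}

# Screens that survive the "no two items from the same feature" requirement.
eligible = [s for s in screens if has_distinct_features(s)]
```

The counts match the 12, 66, and 220 screens reported above. The number of eligible screens depends entirely on the placeholder feature mapping and carries no meaning for the study.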


Results

The dataset included 560 adults with a mean age of 76.6 years (SD = 5.2); 58% were women, and participants were highly educated (mean, 15.0 years of education; SD = 2.9; Table). Postoperative delirium occurred during one or more days in 134 individuals (24%). A total of 1,100 delirium assessments were used, with 113 that were CAM positive (10.3%). For POD1, we used 551 assessments, 61 of which were positive (11.1%); for POD2, 549 assessments were used, with 51 positive (9.3%). Appendix Tables present the positive screen rates, sensitivities, specificities, and 95% confidence intervals of all 12 one-item screens and the 12 best performing two- and three-item screens in order of decreasing sensitivity.

Table. Baseline Characteristics of the Study Cohort

The best ultrabrief screen from POD1 included the following three items: “Does the patient report feeling confused?”, MOYB, and “Does the patient appear sleepy?”, with a sensitivity of 0.95 (95% CI, 0.87-0.99) and specificity of 0.73 (95% CI, 0.69-0.77). The same combination of items has a sensitivity of 0.88 (95% CI, 0.77-0.96) and a specificity of 0.70 (95% CI, 0.66-0.74) on POD2. When POD1 and POD2 are combined, the sensitivity is 0.92 (95% CI, 0.85-0.96) and specificity is 0.72 (95% CI, 0.69-0.74). We consider this to be our best screen overall.
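
To see what the combined figures imply in practice, the following sketch applies Bayes' rule to the reported sensitivity (0.92), specificity (0.72), and the proportion of CAM-positive assessments (113 of 1,100). It is a back-of-the-envelope illustration, not a result reported by the study: it treats the assessments as independent and uses the rounded point estimates, and the function name `screen_yield` is introduced here for illustration.

```python
def screen_yield(sensitivity, specificity, prevalence):
    """Return (positive screen rate, PPV, NPV) implied by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence

    positive_rate = true_pos + false_pos      # share of all screens that are positive
    ppv = true_pos / positive_rate            # P(delirium | positive screen)
    npv = true_neg / (true_neg + false_neg)   # P(no delirium | negative screen)
    return positive_rate, ppv, npv


# Point estimates taken from the text; CAM-positive share of assessments.
prevalence = 113 / 1100
rate, ppv, npv = screen_yield(0.92, 0.72, prevalence)
print(f"positive screen rate {rate:.2f}, PPV {ppv:.2f}, NPV {npv:.2f}")
```

Under these assumptions, roughly a third of assessments would screen positive, which is below the 50% ceiling set in the analysis plan, and a negative screen would carry a high negative predictive value. This is consistent with the screen's intended rule-out role.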


Discussion

We identified a three-item screen for delirium after elective surgery consisting of “Does the patient report feeling confused?”, MOYB, and “Does the patient appear sleepy?” In our own prior work, we identified a two-item screen consisting of MOYB and “What is the day of the week?” as the best ultrabrief screen for delirium in general medicine populations (termed the “UB-2”)5 and a subsequent screen for patients with delirium superimposed on dementia (DSD) that included “What type of place is this?”, Days of the Week Backward, and “Does the patient appear sleepy?”6 All three screens contain a test of attention (a cardinal feature of delirium) and a test of orientation, although the specific tests vary. Both the surgical and DSD screens include “Does the patient appear sleepy?”, which addresses a reduced level of consciousness. This item might be particularly important in the postoperative setting because residual effects of anesthesia and/or postoperative analgesic medications can contribute to delirium. Work by others supports our current finding that MOYB is the best single item for most groups. Bellelli et al13 and Han et al14 included MOYB as the single attentional item in the 4AT and B-CAM, respectively. The Nu-DESC has been used as a screen in surgical patients; however, it involves only nursing observations and no direct questioning of the patient.15

The Figure describes how our “best screen” could be integrated into clinical care. One or more “positive” or incorrect responses on these three items constitutes a positive screen that should be further evaluated with the CAM or 3D-CAM. If all three items are correct or negative, this effectively rules out delirium; however, continued periodic screening on a daily (or per shift) basis is indicated. On repeat testing, if any of the previously negative or correct items becomes positive or incorrect, this would be evidence for Acute Change, CAM Feature 1. Finally, it should be noted that, if all three items in our best screen are positive, full CAM criteria for delirium diagnosis are met within the screen itself, and no further testing is required. We envision this process being facilitated by use of an app-based program that generates optimal screening items based on patient and setting characteristics.
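
The decision logic described above can be written as a short function. This is a minimal sketch of the workflow as stated in the text, not the app-based program the authors envision; the names `NextStep`, `triage`, and `acute_change` are hypothetical and introduced only for illustration. Each argument is `True` when the item is positive or answered incorrectly.

```python
from enum import Enum


class NextStep(Enum):
    RULED_OUT = "delirium ruled out; rescreen daily or per shift"
    FURTHER_TESTING = "positive screen; evaluate with CAM or 3D-CAM"
    CRITERIA_MET = "all three items positive; CAM criteria met within the screen"


def triage(reports_confusion, moyb_incorrect, appears_sleepy):
    """Map the three screen items to the next clinical step."""
    positives = sum([reports_confusion, moyb_incorrect, appears_sleepy])
    if positives == 3:
        return NextStep.CRITERIA_MET
    if positives >= 1:
        return NextStep.FURTHER_TESTING
    return NextStep.RULED_OUT


def acute_change(previous, current):
    """True if any item that was negative on the prior screen is now positive.

    Per the text, this would count as evidence for CAM Feature 1 (Acute Change).
    Both arguments are tuples of three booleans in the same item order.
    """
    return any(now and not before for before, now in zip(previous, current))


print(triage(False, False, False).name)  # RULED_OUT
print(triage(False, True, False).name)   # FURTHER_TESTING
print(triage(True, True, True).name)     # CRITERIA_MET
```

Checking the all-positive case first keeps the two positive outcomes distinct: a fully positive screen needs no further testing, whereas one or two positive items trigger a CAM or 3D-CAM evaluation.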

Figure. Flow diagram of delirium screening process using best performing three-item delirium screen

Several limitations must be noted. First, our three-item screen may not generalize to nonsurgical candidates or those undergoing emergent surgery and should be tested in these groups. Second, the SAGES sample is relatively homogeneous in race and ethnicity and was highly educated, with little functional impairment and no dementia. Therefore, results may not be generalizable to populations with lower educational attainment and/or preexisting mental and physical disabilities. Third, screen items were included in the reference standard delirium assessment, leading to a potential bias toward increased sensitivity. Finally, all screens were derived from secondary data analysis, and further research will be needed to prospectively validate the results. Despite these limitations, this study has several strengths, including the use of a well-characterized surgical population and a rigorous approach to delirium measurement. It is one of the first studies to identify a screening tool targeted to identifying delirium in postoperative older adults.

Future research should prospectively validate our screening tool and test its implementation in a real-world clinical environment. As part of this process, clinicians should document barriers and facilitators to widespread implementation. The goal of such screens is to facilitate early identification of postoperative delirium, which will allow timely intervention to address underlying causes and prevent adverse consequences, thereby improving the outcomes of vulnerable older surgical patients.

