Original Research

Preliminary development of an ultrabrief two‐item bedside test for delirium

Abstract

BACKGROUND

Delirium is common, morbid, and costly, yet is greatly under‐recognized among hospitalized older adults.

OBJECTIVE

To identify the best single mental status test item and the best pair of items for predicting the presence of delirium.

DESIGN, SETTING

Diagnostic test evaluation study that enrolled medicine inpatients aged 75 years or older at an academic medical center.

METHODS

Patients underwent a clinical reference standard assessment involving a patient interview, medical record review, and interviews with family members and nurses to determine the presence or absence of Diagnostic and Statistical Manual of Mental Disorders, 4th Edition defined delirium. Participants also underwent the three‐dimensional Confusion Assessment Method (3D‐CAM), a brief, validated assessment for delirium. Individual items and pairs of items from the 3D‐CAM were evaluated to determine sensitivity and specificity relative to the reference standard delirium diagnosis.

RESULTS

Of the 201 participants (mean age 84 years, 62% female), 42 (21%) had delirium based on the clinical reference standard. The single item with the best test characteristics was “months of the year backwards” with a sensitivity of 83% (95% confidence interval [CI]: 69%‐93%) and specificity of 69% (95% CI: 61%‐76%). The best 2‐item screen was the combination of “months of the year backwards” and “what is the day of the week?” with a sensitivity of 93% (95% CI: 81%‐99%) and specificity of 64% (95% CI: 56%‐70%).

CONCLUSIONS

We identified a single item with >80% sensitivity and a pair of items with >90% sensitivity for delirium. If validated prospectively, these items will serve as an initial innovative screening step for delirium identification in hospitalized older adults. Journal of Hospital Medicine 2015;10:645–650. © 2015 Society of Hospital Medicine


Delirium (acute confusion) is common in older adults and leads to poor outcomes, such as death, clinician and caregiver burden, and prolonged cognitive and functional decline.[1, 2, 3, 4] Delirium is extremely costly, with estimates ranging from $143 to $152 billion annually (2005 US$).[5, 6] Early detection and management may improve the poor outcomes and reduce costs attributable to delirium,[3, 7] yet delirium identification in clinical practice has been challenging, particularly when translating research tools to the bedside.[8, 9, 10] As a result, only 12% to 35% of delirium cases are detected in routine care, with hypoactive delirium and delirium superimposed on dementia most likely to be missed.[11, 12, 13, 14, 15]

To address these issues, we recently developed and published the three‐dimensional Confusion Assessment Method (3D‐CAM), the 3‐minute diagnostic assessment for CAM‐defined delirium.[16] The 3D‐CAM is a structured assessment tool that includes mental status testing, patient symptom probes, and guided interviewer observations for signs of delirium. 3D‐CAM items were selected through a rigorous process to determine the most informative items for the 4 CAM diagnostic features.[17] The 3D‐CAM can be completed in 3 minutes, and has 95% sensitivity and 94% specificity relative to a reference standard.[16]

Despite the capabilities of the 3D‐CAM, there are situations when even 3 minutes is too long to devote to delirium identification. Moreover, a 2‐step approach in which a sensitive ultrabrief screen is administered, followed by the 3D‐CAM in positives, may be the most efficient approach for large‐scale delirium case identification. The aim of the current study was to use the 3D‐CAM database to identify the most sensitive single item and pair of items in the diagnosis of delirium, using the reference standard in the diagnostic accuracy analysis. We hypothesized that we could identify a single item with greater than 80% sensitivity and a pair of items with greater than 90% sensitivity for detection of delirium.

METHODS

Study Sample and Design

We analyzed data from the 3D‐CAM validation study,[16] which prospectively enrolled participants from a large urban teaching hospital in Boston, Massachusetts, using a consecutive enrollment sampling strategy. Inclusion criteria were: (1) ≥75 years old, (2) admitted to general or geriatric medicine services, (3) able to communicate in English, (4) without terminal conditions, (5) expected hospital stay of ≥2 days, (6) not a previous study participant. Experienced clinicians screened patients for eligibility. If the patient lacked capacity to provide consent, the designated surrogate decision maker was contacted. The study was approved by the institutional review board.

Reference Standard Delirium Diagnosis

The reference standard delirium diagnosis was based on an extensive (45 minutes) face‐to‐face patient interview by experienced clinician assessors (neuropsychologists or advanced practice nurses), medical record review, and input from the nurse and family members. This comprehensive assessment included: (1) reason for hospital admission, hospital course, and presence of cognitive concerns, (2) family, social, and functional history, (3) Montreal Cognitive Assessment,[18] (4) Geriatric Depression Scale,[19] (5) medical record review including scoring of comorbidities using the Charlson index,[20] determination of functional status using the basic and Instrumental Activities of Daily Living,[21, 22] psychoactive medications administered, and (6) a family member interview to assess the patient's baseline cognitive status that included the Eight‐Item Interview to Differentiate Aging and Dementia,[23] to assess the presence of dementia. Using all of these data, an expert panel, including the clinical assessor, the study principal investigator (E.R.M.), a geriatrician, and an experienced neuropsychologist, adjudicated the final delirium diagnoses using Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM‐IV) criteria. The panel also adjudicated for the presence or absence of dementia and mild cognitive impairment based on National Institute on Aging‐Alzheimer's Association (NIA‐AA) criteria.[24] This approach has been used in other delirium studies.[25]

3D‐CAM Assessments

After the reference standard assessment, the 3D‐CAM was administered by trained research assistants (RAs) who were blinded to the results of the reference standard. To reduce the likelihood of fluctuations or temporal changes, all assessments were completed between 11:00 am and 2:00 pm and for each participant, within a 2‐hour time period (for example, 11:23 am to 1:23 pm).

Statistical Analyses to Determine the Best Single‐ and Two‐Item Screeners

To determine the best single 3D‐CAM item to identify delirium, the responses of the 20 individual items in the 3D‐CAM (see Supporting Table 1 in the online version of this article) were compared to the reference standard to determine their sensitivity and specificity. Similarly, an algorithm was used to generate all unique 2‐item combinations of the 20 items (190 unique pairs), which were compared to the reference standard. An error, no response, or an answer of “I do not know” by the patient was considered a positive screen for delirium. The 2‐item screeners were considered positive if 1 or both of the items were positive. Sensitivity and specificity were calculated along with 95% confidence intervals (CIs).
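The pairwise procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the item names and the eight-patient toy dataset are hypothetical (the study data are not reproduced here), and the original analysis was run in SAS, not Python. The sketch shows the "positive if either item is positive" rule, the count of 190 pairs from 20 items, and ranking by sensitivity with specificity as the tiebreaker.

```python
from itertools import combinations

def sensitivity_specificity(screen_pos, delirium):
    """Return (sensitivity, specificity) of a screen against the reference standard."""
    tp = sum(1 for s, d in zip(screen_pos, delirium) if s and d)
    fn = sum(1 for s, d in zip(screen_pos, delirium) if not s and d)
    tn = sum(1 for s, d in zip(screen_pos, delirium) if not s and not d)
    fp = sum(1 for s, d in zip(screen_pos, delirium) if s and not d)
    return tp / (tp + fn), tn / (tn + fp)

def pair_positive(item_a, item_b):
    """A 2-item screen is positive if 1 or both of the items are positive."""
    return [a or b for a, b in zip(item_a, item_b)]

# 20 items yield C(20, 2) = 190 unique unordered pairs.
n_pairs = len(list(combinations(range(20), 2)))

# Hypothetical toy data (not study data).
# True = screen positive (error, "I do not know", or no response).
delirium = [True, True, True, False, False, False, False, False]
items = {
    "months_backwards": [True, True, False, True, False, False, False, False],
    "day_of_week":      [True, False, True, False, False, False, False, False],
    "year":             [False, False, False, False, True, False, False, False],
}

# Evaluate every unique pair of items against the reference standard.
results = {
    (a, b): sensitivity_specificity(pair_positive(items[a], items[b]), delirium)
    for a, b in combinations(items, 2)
}

# Rank by sensitivity first, then specificity, both descending.
ranked = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
best_pair = ranked[0][0]
```

Sorting on the (sensitivity, specificity) tuple mirrors the ordering convention used in the tables: sensitivity is the primary criterion and specificity breaks ties.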

Subset analyses were performed to determine sensitivity and specificity of individual items and pairs of items stratified by the patient's baseline cognitive status. Two strata were created: patients with dementia (N=56), and patients with normal baseline cognitive status or mild cognitive impairment (MCI) (N=145). We chose to group MCI with normal for 2 reasons: (1) dementia is a well‐established and strong risk factor for delirium, whereas the evidence for MCI being a risk factor for delirium is less established and (2) to achieve adequate allocation of delirious cases in both strata. Last, we report the sensitivity of altered level of consciousness (LOC), which included lethargy, stupor, coma, and hypervigilance, as a single screening item for delirium in the overall sample and by cognitive status. Analyses were conducted using commercially available software (SAS version 9.3; SAS Institute, Inc., Cary, NC).

RESULTS

Characteristics of the patients are shown in Table 1. Subjects had a mean age of 84 years, 62% were female, and 28% had a baseline dementia. Forty‐two (21%) had delirium based on the clinical reference standard. Twenty (10%) had less than a high school education and 100 (49%) had at least a college education.

Table 1. Sample Characteristics (N=201)

| Characteristic | N (%) |
|---|---|
| Age, y, mean (SD) | 84 (5.4) |
| Sex, n (%) female | 125 (62) |
| White, n (%) | 177 (88) |
| Education, n (%) | |
| Less than high school | 20 (10) |
| High school graduate | 75 (38) |
| College plus | 100 (49) |
| Vision interfered with interview, n (%) | 5 (2) |
| Hearing interfered with interview, n (%) | 18 (9) |
| English second language, n (%) | 10 (5) |
| Charlson, mean (SD) | 3 (2.3) |
| ADL, n (% impaired) | 110 (55) |
| IADL, n (% impaired) | 163 (81) |
| MCI, n (%) | 50 (25) |
| Dementia, n (%) | 56 (28) |
| Delirium, n (%) | 42 (21) |
| MoCA, mean (SD) | 19 (6.6) |
| MoCA, median (range) | 20 (0‐30) |

  • NOTE: Abbreviations: ADL, activities of daily living; IADL, instrumental activities of daily living; MCI, mild cognitive impairment; MoCA, Montreal Cognitive Assessment; SD, standard deviation.

Single Item Screens

Table 2 reports the results of single‐item screens for delirium, with sensitivity (the ability to correctly identify delirium when it is present by the reference standard), specificity (the ability to correctly identify patients without delirium when it is not present by the reference standard), and 95% CIs. Items are listed in descending order of sensitivity; in the case of ties, the item with the higher specificity is listed first. The screening items with the highest sensitivity for delirium are “Months of the year backwards” and “Four digits backwards,” both with a sensitivity of 83% (95% CI: 69%‐93%). Of these 2 items, “Months of the year backwards” had a much better specificity of 69% (95% CI: 61%‐76%), whereas “Four digits backwards” had a specificity of 52% (95% CI: 44%‐60%). The item “What is the day of the week?” had lower sensitivity at 71% (95% CI: 55%‐84%), but excellent specificity at 92% (95% CI: 87%‐96%).

Table 2. Top Ten Single‐Item Screens for Delirium (N=201)

| Screen Item | Screen Positive (%) | Sensitivity (95% CI) | Specificity (95% CI) | LR+ | LR− |
|---|---|---|---|---|---|
| Months of the year backwards | 42 | 0.83 (0.69‐0.93) | 0.69 (0.61‐0.76) | 2.7 | 0.24 |
| Four digits backwards | 56 | 0.83 (0.69‐0.93) | 0.52 (0.44‐0.60) | 1.72 | 0.32 |
| What is the day of the week? | 21 | 0.71 (0.55‐0.84) | 0.92 (0.87‐0.96) | 9.46 | 0.31 |
| What is the year? | 16 | 0.55 (0.39‐0.70) | 0.94 (0.9‐0.97) | 9.67 | 0.48 |
| Have you felt confused during the past day? | 14 | 0.50 (0.34‐0.66) | 0.95 (0.9‐0.98) | 9.94 | 0.53 |
| Days of the week backwards | 15 | 0.50 (0.34‐0.66) | 0.94 (0.89‐0.97) | 7.95 | 0.53 |
| During the past day, did you see things that were not really there? | 11 | 0.45 (0.3‐0.61) | 0.97 (0.94‐0.99) | 17.98 | 0.56 |
| Three digits backwards | 15 | 0.45 (0.3‐0.61) | 0.92 (0.87‐0.96) | 5.99 | 0.59 |
| What type of place is this? | 9 | 0.38 (0.24‐0.54) | 0.99 (0.96‐1) | 30.29 | 0.63 |
| During the past day, did you think you were not in the hospital? | 10 | 0.38 (0.24‐0.54) | 0.97 (0.94‐0.99) | 15.14 | 0.64 |

  • NOTE: Number of patients with delirium=42. Abbreviations: CI, confidence interval; LR, likelihood ratio.

  • There were 20 different items and 190 possible item pairs considered.

  • Top 10 items: our primary criterion for determining this was sensitivity, with a secondary criterion of specificity in the case of ties. Items are listed in descending order on this basis.

  • Screen positive: error, do not know, or no response.
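The article does not state how the 95% CIs were computed. As one possibility, the sketch below implements the exact (Clopper‐Pearson) binomial interval in pure Python by bisection; treating this as the method used is an assumption. The count of 35 of 42 delirium cases is reconstructed from the reported 83% sensitivity for “Months of the year backwards” and is likewise an assumption, so the result can only be compared informally against the reported 69% to 93%.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided CI for a binomial proportion, solved by bisection."""
    def solve(k, target):
        # binom_cdf(k, n, p) decreases as p increases, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: p such that P(X >= successes) = alpha / 2.
    lower = 0.0 if successes == 0 else solve(successes - 1, 1 - alpha / 2)
    # Upper bound: p such that P(X <= successes) = alpha / 2.
    upper = 1.0 if successes == n else solve(successes, alpha / 2)
    return lower, upper

# Assumed counts: 35 screen-positive among 42 delirium cases (about 83%).
lower, upper = clopper_pearson(35, 42)
```

With only 42 delirium cases, intervals of this kind are necessarily wide, which is why the sensitivity estimates in the tables carry ranges of 20 percentage points or more.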

We then examined performance of single‐item screeners in patients with and without dementia (Table 3). In persons with dementia, the best single item was also “Months of the year backwards,” with a sensitivity of 89% (95% CI: 72%‐98%) and a specificity of 61% (95% CI: 41%‐78%). In persons with normal baseline cognition or MCI, the best performing single item was “Four digits backwards,” with sensitivity of 79% (95% CI: 49%‐95%) and specificity of 51% (95% CI: 42%‐60%). “Months of the year backwards” also performed well, with sensitivity of 71% (95% CI: 42%‐92%) and specificity of 71% (95% CI: 62%‐79%).

Table 3. Top Three Single‐Item Screens for Delirium Stratified by Baseline Cognition

Normal/MCI Patients (n=145)

| Test Item | Screen Positive (%) | Sensitivity (95% CI) | Specificity (95% CI) | LR+ | LR− |
|---|---|---|---|---|---|
| Months backwards | 33 | 0.71 (0.42‐0.92) | 0.71 (0.62‐0.79) | 2.46 | 0.4 |
| Four digits backwards | 52 | 0.79 (0.49‐0.95) | 0.51 (0.42‐0.60) | 1.61 | 0.42 |
| What is the day of the week? | 10 | 0.64 (0.35‐0.87) | 0.96 (0.91‐0.99) | 16.84 | 0.37 |

Dementia Patients (n=56)

| Test Item | Screen Positive (%) | Sensitivity (95% CI) | Specificity (95% CI) | LR+ | LR− |
|---|---|---|---|---|---|
| Months backwards | 64 | 0.89 (0.72‐0.98) | 0.61 (0.41‐0.78) | 2.27 | 0.18 |
| Four digits backwards | 66 | 0.86 (0.67‐0.96) | 0.54 (0.34‐0.72) | 1.85 | 0.27 |
| What is the day of the week? | 50 | 0.75 (0.55‐0.89) | 0.75 (0.55‐0.89) | 3 | 0.33 |

  • NOTE: Participants with learning problems (1) grouped with dementia and MCI participants (44) grouped with normal. Number of patients with delirium=28. Abbreviations: CI, confidence interval; LR, likelihood ratio; MCI, mild cognitive impairment.

  • Top 3 items: our primary criterion for determining this was sensitivity, with a secondary criterion of specificity in the case of ties. Items are listed in descending order on this basis.

  • Screen positive: error, do not know, or no response.

Two‐Item Screens

Table 4 reports the results of 2‐item screens for delirium with sensitivity, specificity, and 95% CIs. Item pairs are listed in descending order of sensitivity following the same convention as in Table 2. The 2‐item screen with the highest sensitivity for delirium is the combination of “What is the day of the week?” and “Months of the year backwards,” with a sensitivity of 93% (95% CI: 81%‐99%) and specificity of 64% (95% CI: 56%‐70%). This screen had a positive and negative likelihood ratio (LR) of 2.59 and 0.11, respectively. The combination of “What is the day of the week?” and “Four digits backwards” had the same sensitivity of 93% (95% CI: 81%‐99%), but lower specificity of 48% (95% CI: 40%‐56%). The combination of “What type of place is this?” (hospital) and “Four digits backwards” had a sensitivity of 90% (95% CI: 77%‐97%) and specificity of 51% (95% CI: 43%‐50%).

Table 4. Top Ten Two‐Item Screens for Delirium (N=201)

| Screen Item 1 | Screen Item 2 | Screen Positive (%) | Sensitivity (95% CI) | Specificity (95% CI) | LR+ | LR− |
|---|---|---|---|---|---|---|
| What is the day of the week? | Months backwards | 48 | 0.93 (0.81‐0.99) | 0.64 (0.56‐0.70) | 2.59 | 0.11 |
| What is the day of the week? | Four digits backwards | 60 | 0.93 (0.81‐0.99) | 0.48 (0.4‐0.56) | 1.8 | 0.15 |
| Four digits backwards | Months backwards | 65 | 0.93 (0.81‐0.99) | 0.42 (0.34‐0.50) | 1.6 | 0.17 |
| What type of place is this? | Four digits backwards | 58 | 0.90 (0.77‐0.97) | 0.51 (0.43‐0.50) | 1.84 | 0.19 |
| What is the year? | Four digits backwards | 59 | 0.9 (0.77‐0.97) | 0.5 (0.42‐0.5) | 1.80 | 0.19 |
| What is the day of the week? | Three digits backwards | 30 | 0.88 (0.74‐0.96) | 0.86 (0.79‐0.90) | 6.09 | 0.14 |
| What is the year? | Months backwards | 44 | 0.88 (0.74‐0.96) | 0.68 (0.6‐0.75) | 2.75 | 0.18 |
| What type of place is this? | Months backwards | 43 | 0.86 (0.71‐0.95) | 0.69 (0.61‐0.70) | 2.73 | 0.21 |
| During the past day, did you think you were not in the hospital? | Months backwards | 43 | 0.86 (0.71‐0.95) | 0.69 (0.61‐0.70) | 2.73 | 0.21 |
| Days of the week backwards | Months backwards | 43 | 0.86 (0.71‐0.95) | 0.68 (0.6‐0.75) | 2.67 | 0.21 |

  • NOTE: Number of patients with delirium=42. Abbreviations: CI, confidence interval; LR, likelihood ratio.

  • There were 20 different items and 190 possible item pairs considered.

  • Top 10 items: our primary criterion for determining this was sensitivity, with a secondary criterion of specificity in the case of ties. Items are listed in descending order on this basis.

  • Screen positive: error, do not know, or no response.
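For readers less familiar with likelihood ratios, the short sketch below shows how they follow from sensitivity and specificity, using the best 2‐item screen as a worked case. The 2×2 cell counts are not given in the article; the values used here (39 true positives, 3 false negatives, 57 false positives, 102 true negatives) are reconstructed from the reported percentages and should be read as an assumption.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test."""
    lr_pos = sensitivity / (1 - specificity)   # how much a positive screen raises the odds
    lr_neg = (1 - sensitivity) / specificity   # how much a negative screen lowers the odds
    return lr_pos, lr_neg

# Assumed 2x2 counts, reconstructed from the reported 93% sensitivity,
# 64% specificity, and 48% screen-positive rate (N=201, 42 with delirium).
tp, fn, fp, tn = 39, 3, 57, 102

sens = tp / (tp + fn)
spec = tn / (tn + fp)
lr_pos, lr_neg = likelihood_ratios(sens, spec)
screen_positive_rate = (tp + fp) / (tp + fn + fp + tn)
```

Under these assumed counts, the rounded likelihood ratios agree with the 2.59 and 0.11 reported for this pair.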

When subjects were stratified by baseline cognition, the best 2‐item screen for normal and MCI patients was “What is the day of the week?” and “Four digits backwards,” with 93% sensitivity (95% CI: 66%‐100%) and 50% specificity (95% CI: 42%‐59%). The best pair of items for patients with dementia (Table 5) was the same as for the overall sample, “What is the day of the week?” and “Months of the year backwards,” but its performance differed, with a higher sensitivity of 96% (95% CI: 82%‐100%) and lower specificity of 43% (95% CI: 24%‐63%). This same pair of items had 86% sensitivity (95% CI: 57%‐98%) and 69% specificity (95% CI: 60%‐77%) for persons with either normal cognition or MCI.

Table 5. Top Three Two‐Item Screens for Normal/MCI and Persons With Dementia

Normal/MCI Patients (n=145)

| Test Item 1 | Test Item 2 | Item Positive (%) | Sensitivity (95% CI) | Specificity (95% CI) | LR+ | LR− |
|---|---|---|---|---|---|---|
| What is the day of the week? | Months backwards | 36 | 0.86 (0.57‐0.98) | 0.69 (0.60‐0.77) | 2.74 | 0.21 |
| What is the day of the week? | Four digits backwards | 54 | 0.93 (0.66‐1) | 0.5 (0.42‐0.59) | 1.87 | 0.14 |
| Four digits backwards | Months backwards | 61 | 0.93 (0.66‐1) | 0.43 (0.34‐0.52) | 1.62 | 0.17 |

Dementia Patients (n=56)

| Test Item 1 | Test Item 2 | Item Positive (%) | Sensitivity (95% CI) | Specificity (95% CI) | LR+ | LR− |
|---|---|---|---|---|---|---|
| What is the day of the week? | Months backwards | 77 | 0.96 (0.82‐1) | 0.43 (0.24‐0.63) | 1.69 | 0.08 |
| What is the day of the week? | Four digits backwards | 77 | 0.93 (0.76‐0.99) | 0.39 (0.22‐0.59) | 1.53 | 0.18 |
| Four digits backwards | Months backwards | 77 | 0.93 (0.76‐0.99) | 0.39 (0.22‐0.59) | 1.53 | 0.18 |

  • NOTE: Participants with learning problems (1) grouped with dementia and MCI participants (44) grouped with normal. Number of patients with delirium=28. Abbreviations: CI, confidence interval; LR, likelihood ratio; MCI, mild cognitive impairment.

  • Top 3 items: our primary criterion for determining this was sensitivity, with a secondary criterion of specificity in the case of ties. Items are listed in descending order on this basis.

  • Screen positive: error, do not know, or no response.

Altered Level of Consciousness as a Screener for Delirium

Altered level of consciousness (ALOC) was uncommon in our sample, with an overall prevalence of 10/201 (4.9%). When examined as a screening item for delirium, ALOC had very poor sensitivity of 19% (95% CI: 9%‐34%) but excellent specificity of 99% (95% CI: 96%‐100%). ALOC also demonstrated poor screening performance when stratified by cognitive status, with a sensitivity of 14% in the normal and MCI group (95% CI: 2%‐43%) and a sensitivity of 21% (95% CI: 8%‐41%) in persons with dementia.

Positive and Negative Predictive Values

Although we focused on sensitivity and specificity in evaluating 1‐ and 2‐item screeners, we also examined positive and negative predictive values. These values will vary depending on the overall prevalence of delirium, which was 21% in this dataset. The best 1‐item screener, “Months of the year backwards,” had a positive predictive value of 31% and negative predictive value of 94%. The best 2‐item screener, “Months of the year backwards” with “What is the day of the week?”, had a positive predictive value of 41% and negative predictive value of 97% (see Supporting Tables 2 and 3 in the online version of this article). LRs for the items are in Tables 2 through 5.
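The dependence of predictive values on prevalence can be made concrete with a short calculation. The sketch below applies Bayes' rule to the rounded sensitivity (93%) and specificity (64%) reported for the best 2‐item screener. The 10% prevalence scenario is hypothetical and is included only to show the direction of the effect.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given test accuracy and disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

# Best 2-item screener at the study prevalence of 21%.
ppv, npv = predictive_values(0.93, 0.64, 0.21)

# Hypothetical lower-prevalence setting (10%), for comparison only.
ppv_low, npv_low = predictive_values(0.93, 0.64, 0.10)
```

At 21% prevalence this gives values that round to the reported 41% and 97%. In the hypothetical lower‐prevalence setting the positive predictive value falls and the negative predictive value rises, which is the pattern the paragraph above describes.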

DISCUSSION

Identifying simple, efficient, bedside case‐identification methods for delirium is an essential step toward improving recognition of this highly morbid syndrome in hospitalized older adults. In this study, we identified a single cognitive item, “Months of the year backwards,” that identified 83% of delirium cases when compared with a reference standard diagnosis. Furthermore, we identified 2 items, “Months of the year backwards” and “What is the day of the week?”, which when used in combination identified 93% of delirium cases. The same 1 and 2 items also worked well in patients with dementia, in whom delirium is often missed. Although these items require further clinical validation, the development of an ultrabrief 2‐item test that identifies over 90% of delirium cases and can be completed in less than 1 minute (recently, we administered the best 2‐item screener to 20 consecutive general medicine patients over age 70 years, and it was completed in a median of 36.5 seconds) holds great potential for simplifying bedside delirium screening and improving the care of hospitalized older adults.

Our current findings both confirm and extend the emerging literature on best screening items for delirium. Sands and colleagues (2010)[26] tested a single question for delirium, “Do you think (name of patient) has been more confused lately?”, in 21 subjects and achieved a sensitivity of 80%. Han and colleagues developed a screening tool in emergency department patients using the LOC question from the Richmond Agitation‐Sedation Scale and spelling the word “lunch” backwards, and achieved 98% sensitivity, but in a younger emergency department population with a low prevalence of dementia.[27] O'Regan et al. recently also found “Months of the year backwards” to be the best single screening item for delirium in a large sample, but only tested a 1‐item screen.[28] Our study extends these studies in several important ways by: (1) employing a rigorous clinical reference standard diagnosis of delirium, (2) having a large sample with a high prevalence of patients with dementia, (3) using a general medical population, and (4) examining the best 2‐item screens in addition to the best single item.

Systematic intervention programs[29, 30, 31] that focus on improved delirium evaluation and management have the potential to improve patient outcomes and reduce costs. However, targeting these programs to patients with delirium has proven difficult, as only 12% to 35% of delirium cases are recognized in routine clinical practice.[11, 12, 13, 14, 15] The 1‐ and 2‐item screeners we identified could play an important role in future delirium identification. The 3D‐CAM combines high sensitivity (95%) with high specificity (94%)[16] and therefore would be an excellent choice as the second step after a positive screen. The feasibility, effectiveness, and cost of administering these screeners, followed by a brief diagnostic tool such as the 3D‐CAM, should be evaluated in future work.

Our study has noteworthy strengths, including the use of a large, purposefully challenging clinical sample with advanced age that included a substantial proportion with dementia, a detailed assessment, and the testing of very brief and practical tools for bedside delirium screening.[25] This study also has several important limitations. Most importantly, we presented a secondary analysis of individual items and pairs of items drawn from the 3D‐CAM assessment; therefore, the 2‐item bedside screen requires prospective clinical validation. The reference standard was based on the DSM‐IV, because this study was conducted prior to the release of DSM‐V. In addition, the ordering of the reference standard and 3D‐CAM assessments was not randomized due to feasibility constraints. Furthermore, this study was cross‐sectional, involved only a single hospital, and enrolled only older medical patients during the day shift. Our sample was older (aged 75 years and older), and a younger sample may have had a different prevalence of delirium, which could affect the positive predictive value of our ultrabrief screen. We plan to test this in a sample of patients aged 70 years and older in future studies. Finally, it should be noted that these best 1‐item and 2‐item screeners miss 17% and 7% of delirium cases, respectively. In cases where this is unacceptably high, alternative approaches might be necessary.

It is important to remember that these 1‐ and 2‐item screeners are not diagnostic tools and therefore should not be used in isolation. Optimally, they will be followed by a more specific evaluation, such as the 3D‐CAM, as part of a systematic delirium identification process. For instance, in our sample (with a delirium rate of 21%), the best 2‐item screener had a positive predictive value of 41%, meaning that positive screens are more likely to be false positives than true positives (see Supporting Tables 2 and 3 in the online version of this article).[32] Nevertheless, by reducing the total number of patients who require diagnostic instrument administration, use of these ultrabrief screeners can improve efficiency and result in a net benefit to delirium case‐identification efforts.[32]
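The efficiency argument can be illustrated with a rough workload estimate. The sketch below is a back‐of‐the‐envelope calculation for a hypothetical cohort of 1000 patients, assuming the study's 21% prevalence and the rounded accuracy of the best 2‐item screener. The function name and the cohort size are illustrative and do not come from the article.

```python
def screen_workload(n_patients, prevalence, sensitivity, specificity):
    """Estimate how a first-step screen changes second-step workload."""
    cases = n_patients * prevalence
    true_pos = cases * sensitivity
    false_pos = (n_patients - cases) * (1 - specificity)
    return {
        # Screen positives go on to the diagnostic instrument (e.g., 3D-CAM).
        "need_second_step": true_pos + false_pos,
        # Screen negatives skip the second step.
        "second_step_avoided": n_patients - true_pos - false_pos,
        # Delirium cases with a negative screen are never re-examined.
        "missed_by_screen": cases - true_pos,
    }

# Hypothetical cohort of 1000 patients at the study's 21% prevalence.
workload = screen_workload(1000, 0.21, 0.93, 0.64)
```

Under these assumptions roughly 480 of 1000 patients would go on to the second step, consistent with the 48% screen‐positive rate in Table 4, while about 15 delirium cases would be missed by the screen. This calculation says nothing about the accuracy of the second step itself.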

Time has been demonstrated to be a barrier to delirium identification in previous studies, but there are likely others. These may include, for instance, staff nihilism about screening making a difference, ambiguous responsibility for delirium screening and management, unsupportive system leadership, and absent payment for these activities.[31] Moreover, it is possible that the 2‐step process we propose may create an incentive for staff to avoid positive screens as they see it creating more work for themselves. We plan to identify and address such barriers in our future work.

In conclusion, we identified a single screening item for delirium, “Months of the year backwards,” with 83% sensitivity, and a pair of items, “Months of the year backwards” and “What is the day of the week?”, with 93% sensitivity relative to a rigorous reference standard diagnosis. These ultrabrief screening items work well in patients with and without dementia, and should require very little training of staff. Future studies should further validate these tools, and determine their translatability and scalability into programs for systematic, widespread delirium detection. Developing efficient and accurate case identification strategies is a necessary prerequisite to appropriately target delirium management protocols, enabling healthcare systems to effectively address this costly and deadly condition.

Disclosures

Author contributions: D.M.F. conceived the study idea, participated in its design and coordination, and drafted the initial manuscript. S.K.I. contributed to the study design and conceptualization, supervision, funding, preliminary analysis, and interpretation of the data, and critical revision of the manuscript. J.G. conducted the analysis for the study and critically revised the manuscript. L.N. supervised the analysis for the study and critically revised the manuscript. R.J. contributed to the study design and critical revision of the manuscript. J.S.S. critically revised the manuscript. E.R.M. obtained funding for the study, supervised all data collection, assisted in drafting and critically revising the manuscript, and contributed to the conceptualization, design, and supervision of the study. All authors have seen and agree with the contents of the manuscript.

This work was supported by National Institute on Aging grant numbers R01AG030618 and K24AG035075 to Dr. Marcantonio. Dr. Inouye's time was supported in part by grants P01AG031720, R01AG044518, and K07AG041835 from the National Institute on Aging. Dr. Inouye holds the Milton and Shirley F. Levy Family Chair (Hebrew Senior Life/Harvard Medical School). Dr. Fick is partially supported by National Institute of Nursing Research grant number R01 NR011042. Dr. Saczynski was supported in part by funding from the National Institute on Aging (K01AG33643) and from the National Heart Lung and Blood Institute (U01HL105268). The funding agencies had no role and the authors retained full autonomy in the preparation of this article. All authors and coauthors have no financial or nonfinancial conflicts of interest to disclose regarding this article.

This article was presented at the Presidential Poster Session at the American Geriatrics Society 2014 Annual Meeting in Orlando, Florida, May 14, 2014.

References
