Original Research

Prediction of onset and course of high hospital utilization in sickle cell disease




Although sickle cell disease (SCD) patients typically manage their pain at home, a small subgroup is frequently hospitalized and accounts for the majority of costs.


1) To identify prospective diagnostic and demographic markers of new periods of high utilization; 2) To identify demographic and diagnostic markers of a persistent rather than moderating course of high utilization; 3) To replicate the finding that high utilization tends to moderate.


The State Inpatient Databases for California, 2004–2007, were used. Patients with new onset periods of high utilization were compared with non‐high utilizers, and new high utilizers who moderated were compared with those who had a persistent course.


All hospitals in the state of California.


Patients age 13 years or older in 2004 with a recorded diagnosis of sickle cell disease and at least one hospitalization for crisis during the study period.


Groups were compared on demographics and discharge diagnoses of SCD complications and comorbidities.


New periods of high utilization were associated with more prior hospitalizations and previous diagnoses of aseptic necrosis and renal disease. High utilization typically moderated. A persistent course was associated with slightly more hospitalizations during the initial year of high utilization and, subsequently, with more mentions of septicemia and mood disorders.


Overall, high utilization was difficult to predict, as was its course. The diagnoses most associated with high utilization indicated more severe sickle cell disease. Septicemia deserves further investigation as a preventable cause for high utilization, as do mood disorders. Journal of Hospital Medicine 2011. © 2011 Society of Hospital Medicine

Copyright © 2010 Society of Hospital Medicine

Extremes of hospital utilization by patients with sickle cell disease (SCD) are problematic for patients, clinicians, and policymakers.1–10 Although patients manage their pain at home most of the time, even during acute crises,11 a small minority of SCD patients accounts for a remarkable amount of hospital resource utilization.1, 3, 4, 6, 11–14 Whereas it is quite unusual for a patient with SCD to be hospitalized more than twice per year,1, 11 in prior work with payer datasets our group identified some patients who were hospitalized more frequently than once per month. In rare cases, admission rates exceeding once per week were identified.1 High‐utilizing SCD patients, and particularly the very high‐utilizing subset, account for the majority of costs of care for the population.13, 14

In previous work by our group describing hospital utilization among members of a regional Medicaid MCO, results suggested that high utilization was a relatively transient phenomenon for most patients, likely resulting from short‐term increases in hospitalization rates among previously moderate utilizers.1 However, high‐utilizing members whose inpatient admission rate did not quickly moderate were progressively less likely to resume a more typical utilization pattern.

The present study used the State Inpatient Databases for years 2004 to 2007 from the Agency for Healthcare Research and Quality to replicate prior findings and to investigate questions not addressed in our prior work. Specifically, hospital discharge data from all hospitals in the state of California were examined to identify first‐year adolescent and adult high utilizers and to follow their hospital utilization over time. The objectives of the study were as follows:

  • To identify historical predictors of a period of high utilization by comparing diagnoses between 2004 and 2006 in patients who were new high utilizers in 2007 with those who were never high utilizers.

  • To identify predictors of a persistent rather than moderating course by following patients who were new high utilizers in 2005 over the succeeding 2 years.

  • To replicate prior findings on the course of high hospital utilization.

Patients and Methods

Initial Data Source

The State Inpatient Databases (SID) are provided by the Healthcare Cost and Utilization Project, sponsored by the Agency for Healthcare Research and Quality. They contain patient‐level discharge data from all hospitals in participating states. This study presents SID data from California for the years 2004 through 2007, including a total of 34,363 hospital admissions in which a diagnosis of sickle cell disease was recorded. Encrypted patient identifiers were used to identify individual patients, and there are few missing identifiers in the California dataset for these years. The data set includes up to 25 discharge diagnostic codes using ICD‐9 nomenclature. In addition, each patient's age and gender are recorded.

Categorization Based on Diagnosis and Inpatient Utilization

Management of missing or conflicting information

A minority of hospitalization records contained ambiguous demographic information (such as conflicting or missing gender or age) associated with the same patient identifier. Identical identifiers were assumed to represent the same patient for purposes of this study, even if other information was conflicting. This decision avoided overly conservative utilization estimates, as high utilizers would have correspondingly more missing information and data entry errors that could lead the same patient to be identified as multiple others with lower utilization. An examination of admissions with conflicting measures supported this method, in that most conflicts were due to missing entries in otherwise consistent data or were very likely typographical. If inconsistencies were due only to missing information for some hospitalizations, the non‐missing values were accepted. In cases where there was actual inconsistency, the following methods were employed.

For dichotomous information, such as gender, conflicts were recoded as missing. Ages recorded in each hospitalization were standardized to ages as of 2004 by subtracting the difference between the year of admission and 2004. If the spread of ages associated with a given patient identification number was greater than 3 years (missing values excluded), the age was coded as missing (note that age at hospitalization could differ by 1 year depending on the temporal relation of hospitalization to the date of birth). If the discrepancy was less, the minimum recorded age was accepted.
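
The reconciliation rules above can be sketched in a few lines of Python. This is only an illustration, not the authors' code (the study used R): the record layout, the function names `resolve_age` and `resolve_sex`, and the constants are hypothetical.

```python
from typing import Optional

BASE_YEAR = 2004        # ages are standardized to this year
MAX_AGE_SPREAD = 3      # years; a wider spread is treated as unreliable


def resolve_age(records: list[tuple[int, Optional[int]]]) -> Optional[int]:
    """Return the patient's age as of 2004, or None if missing or inconsistent.

    `records` holds one (admission_year, recorded_age) pair per
    hospitalization; this layout is an assumption made for the sketch.
    """
    # Shift every recorded age back to the base year, ignoring missing values.
    standardized = [age - (year - BASE_YEAR) for year, age in records if age is not None]
    if not standardized:
        return None
    # Spread greater than 3 years: code the age as missing.
    if max(standardized) - min(standardized) > MAX_AGE_SPREAD:
        return None
    # Otherwise accept the minimum recorded age.
    return min(standardized)


def resolve_sex(values: list[Optional[str]]) -> Optional[str]:
    """Accept the non-missing value if all records agree; conflicts become missing."""
    observed = {v for v in values if v is not None}
    return observed.pop() if len(observed) == 1 else None
```

For example, a patient recorded as 20 in 2004 and 22 in 2006 resolves to age 20, whereas recorded ages of 20 in 2004 and 30 in 2005 differ by 9 years after standardization and are coded as missing.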

Construction of the Study Subset

The study data set was constructed as follows (Fig. 1):

  • Patient identifiers associated with a diagnosis of sickle cell disease were selected by identifying admissions with ICD‐9 diagnosis codes for sickle cell disease appearing in the first 10 diagnoses for calendar years 2004 to 2007 (these included ICD‐9 codes 282.60 to 282.64, 282.68, 282.69, 282.41, and 282.42). Of this group, patients who had a record of at least 1 admission for sickle cell crisis were identified. An admission for crisis was operationalized as a hospitalization with at least 1 discharge diagnosis coded as 282.42, 282.62, 282.64, or 282.69. This yielded a data set of 34,363 admissions among 3169 patients.

  • Admissions with missing patient identification numbers were excluded (n = 2365 of 34,363 admissions, 6.88%).

  • Hospitalizations were tabulated for each unique patient identifier.

  • Patients with a known age of 13 years or more in 2004 were selected. There were 481 patients excluded due to age below 13 years, and 814 excluded for having an uncertain age. The final sample consisted of 1874 unique patient identifiers representing 10,704 hospital admissions.
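
As a rough sketch of the first selection steps listed above, the code-based filters might look as follows in Python. The dictionary keys `patient_id` and `dx` and the function name `select_cohort` are hypothetical, and the sketch assumes that a crisis code may appear in any diagnosis position. The age filter is left out.

```python
# ICD-9 codes for sickle cell disease named in the text
# (282.60 to 282.64, 282.68, 282.69, 282.41, 282.42).
SCD_CODES = {
    "282.60", "282.61", "282.62", "282.63", "282.64",
    "282.68", "282.69", "282.41", "282.42",
}
# Codes used to operationalize an admission for crisis.
CRISIS_CODES = {"282.42", "282.62", "282.64", "282.69"}


def select_cohort(admissions):
    """Return identifiers of patients with an SCD diagnosis and at least 1 crisis admission.

    `admissions` is an iterable of dicts with hypothetical keys
    'patient_id' (None when missing) and 'dx' (ordered list of ICD-9 codes).
    """
    scd_patients, crisis_patients = set(), set()
    for adm in admissions:
        pid = adm["patient_id"]
        if pid is None:
            continue  # admissions with missing identifiers are excluded
        dx = adm["dx"]
        # SCD code must appear among the first 10 listed diagnoses.
        if any(code in SCD_CODES for code in dx[:10]):
            scd_patients.add(pid)
        # Assumption: a crisis code in any position counts as a crisis admission.
        if any(code in CRISIS_CODES for code in dx):
            crisis_patients.add(pid)
    return scd_patients & crisis_patients
```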

Figure 1

Description of sample construction.

As patients who were hospitalized more often were more likely to have inconsistent data, the exclusion for unknown and inconsistent age likely biased the findings by excluding more frequently hospitalized patients. Further post‐hoc analyses were conducted to gauge the extent of this bias, reported in Results, below.

Categorization by Utilization

For each patient, inpatient hospital admissions were tabulated for each year. A year of high utilization for a patient was defined as any calendar year in which that patient had 4 or more hospital admissions. In prior well‐designed studies, categorical definitions of high utilization have used cutoffs between 3 and 5 hospitalizations per year.13, 14 In our group's experience, the cutoff around 4 admissions per year identifies a subpopulation in the top 10% to 20% for annual hospital utilization, both in the outpatient clinic and in payer populations with which our center interacts. A patient was included in the high utilizer group if he or she was a high utilizer in at least 1 year of the study period; all other patients were placed in the comparison group. There were 479 patients in the high utilizer group (25.6% of the total sample) and 1395 in the comparison group. To predict onset of a period of high utilization, patients whose first year of high utilization was 2007 (n = 84) were compared with patients who were never high utilizers (n = 1395). In the prospective analysis to predict moderation, patients who were new high utilizers in 2005 (n = 206) were divided into the group who had fewer than 4 admissions in the following year (moderating course, n = 131) and those who had 4 or more admissions in the following year (continuous course, n = 75).
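
The utilization categories defined above can be expressed as a short Python sketch. The function names and the input format (one calendar year per admission) are hypothetical, and the handling of in‐hospital deaths is omitted.

```python
from collections import Counter

HIGH_UTILIZATION_THRESHOLD = 4  # admissions in one calendar year


def high_years(admission_years):
    """Return the sorted calendar years in which the patient had 4 or more admissions."""
    counts = Counter(admission_years)
    return sorted(y for y, n in counts.items() if n >= HIGH_UTILIZATION_THRESHOLD)


def classify_course(admission_years, onset_year=2005):
    """Classify a new high utilizer by what happens in the year after onset.

    Returns 'persistent' or 'moderating' when the patient's first year of
    high utilization is `onset_year`, and None otherwise.
    """
    high = high_years(admission_years)
    if not high or high[0] != onset_year:
        return None  # never a high utilizer, or onset in a different year
    return "persistent" if onset_year + 1 in high else "moderating"
```

A patient with 5 admissions in 2005 and 2 in 2006 would be labeled moderating; one with 5 admissions in 2005 and 4 in 2006 would be labeled persistent.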

Operationalization of Diagnoses of Comorbid Conditions and Complications

Discharge diagnoses were parsed by a computer algorithm for diagnostic codes matching selected diagnoses. If the diagnosis was found at least once, the patient was coded as having the diagnosis. Diagnostic codes (ICD‐9‐CM) included the following: HIV: 042.__; septicemia: 038.__; pneumonia: 482.00 to 486.99; pulmonary embolus: 415.11, 415.12, and 415.19; acute chest syndrome: 517.3_; chronic renal disease: 585.__; diabetes mellitus: 250.__; cocaine dependence: 304.2_; cocaine abuse: 305.6_; alcohol dependence: 303.00 to 303.92; alcohol abuse: 305.0_; mood disorders (including depressive and bipolar disorders): 296.00 to 296.89; and aseptic necrosis of bone: 733.4_. Substance dependence and abuse were aggregated to create alcohol use disorder and cocaine use disorder categories. Opiate use disorders were not included, as the clinical experience of the authors suggested that clinicians may sometimes diagnose opiate dependence on the basis of frequent hospitalization in itself, and it seemed prudent to avoid the confound.
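
A minimal Python sketch of this kind of code matching is shown below. It is not the authors' algorithm: it covers only a subset of the listed diagnoses, treats each wildcard pattern as a numeric range (an assumption), and uses hypothetical names throughout.

```python
# Subset of the diagnosis definitions in the text, written as inclusive
# numeric ranges. Wildcards such as 038.__ are interpreted as 38.00-38.99.
DIAGNOSIS_RANGES = {
    "septicemia": [(38.00, 38.99)],
    "pneumonia": [(482.00, 486.99)],
    "acute chest syndrome": [(517.30, 517.39)],
    "chronic renal disease": [(585.00, 585.99)],
    "mood disorder": [(296.00, 296.89)],
    "aseptic necrosis of bone": [(733.40, 733.49)],
    # Dependence (303.00-303.92) and abuse (305.0_) aggregated into one category.
    "alcohol use disorder": [(303.00, 303.92), (305.00, 305.09)],
}


def diagnoses_for_patient(admissions):
    """Return the diagnoses found at least once across a patient's admissions.

    `admissions` is a list of admissions, each a list of ICD-9 code strings.
    """
    found = set()
    for codes in admissions:
        for code in codes:
            try:
                value = float(code)
            except ValueError:
                continue  # skip non-numeric codes such as V or E codes
            for name, ranges in DIAGNOSIS_RANGES.items():
                if any(low <= value <= high for low, high in ranges):
                    found.add(name)
    return found
```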

Statistical Analyses

All statistical and graphical analyses were performed in the R statistical computing environment.15 Intergroup differences in categorical data were analyzed using the chi‐square test for independence. The sample distributions of many measures were highly skewed, and nonparametric methods were used where practical. In general, the median and interquartile range are reported as measures of central tendency and spread, respectively. Comparisons between groups on continuous measures were done using the Mann‐Whitney‐Wilcoxon test.
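
For readers who want to check a categorical comparison by hand, the following Python sketch computes a chi‐square test for a 2×2 table. It is an illustration under stated assumptions, not the authors' R code: the continuity correction is included because R's `chisq.test` applies it by default to 2×2 tables, and the example counts (12 of 84 new high utilizers and 106 of 1395 never high utilizers with a prior aseptic necrosis diagnosis) are back‐calculated from the percentages reported in Table 2.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test of independence for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper tail
    probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction, never letting the term go negative.
        diff = max(diff - n / 2, 0.0)
    statistic = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts reconstructed from reported percentages (an assumption):
# rows = new high utilizers in 2007, never high utilizers;
# columns = prior aseptic necrosis diagnosis yes, no.
statistic, p_value = chi_square_2x2(12, 72, 106, 1289)
print(f"chi-square = {statistic:.2f}, P = {p_value:.3f}")
```

Under these assumptions the corrected test gives a P value of about 0.047 for aseptic necrosis prior to 2007.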

Institutional Review Board Approval

The study was exempt from institutional review board review, due to the nature of the data set and its noninterventional design.

Results


Comparison of Utilization Groups by Demographics and Diagnosis

Table 1 presents direct comparisons of high utilizers with comparison patients. Patients in the high utilizer group were slightly more likely to be female and had a higher prevalence of all diagnoses examined, with the exception of HIV (where prevalence was quite low). At least 1 discharge diagnosis of acute chest syndrome was common in both groups, but was more than twice as prevalent in high utilizers. Diagnoses of aseptic necrosis of bone and septicemia were much more prevalent in the high utilizer group than among comparison subjects.

Table 1. Demographics, Discharge Diagnoses, and Utilization Characteristics of High Utilizers and Comparison Patients, 2004–2007

| Characteristic | Comparison (N = 1395) | High Utilizers (N = 479) | P |
|---|---|---|---|
| Age | 32 [21] | 29 [19] | <0.001 |
| Acute chest syndrome | 15.63% | 40.29% | <0.001 |
| Aseptic necrosis | 9.18% | 30.90% | <0.001 |
| Renal disease | 4.01% | 11.48% | <0.001 |
| Pulmonary embolus | 2.51% | 10.02% | <0.001 |
| Mood disorder | 1.72% | 11.69% | <0.001 |
| Cocaine disorder | 1.00% | 9.60% | <0.001 |
| Alcohol disorder | 2.87% | 8.56% | <0.001 |
| Hospitalizations | 2 [2] | 11 [12] | <0.001 |

  • Diagnoses are represented as proportions with at least 1 discharge diagnosis mentioned in a hospital record, expressed as percent.

  • N for sex reduced to 1364 for comparison and 477 for high utilizers due to missing or inconsistent data.

  • Hospitalizations and age are entered as median [interquartile range].

Prior History of New High Utilizers

Patients who were first high utilizers in 2007 (FY2007) were compared with patients who were never high utilizers on hospital diagnoses made before 2007 to identify predictors of a new‐onset period of high utilization (Table 2). The FY2007 high utilizers did not differ significantly in demographics from non‐high utilizers. The FY2007 high utilizers had a greater prevalence of discharge diagnoses of aseptic necrosis of bone (OR 2.03, 95% CI, 1.07 to 3.85) and renal disease (OR 6.28, 95% CI, 2.72 to 14.5) prior to the onset of high utilization. FY2007 high utilizers also had a greater number of hospitalizations prior to their initial year of high utilization (median 3 vs 1); however, a similar proportion of FY2007 and never high utilizers had been hospitalized at least once before 2007. In 2007, the FY2007 high utilizers had a markedly greater prevalence of hospital diagnoses of acute chest syndrome (OR 4.67, 95% CI, 2.53 to 8.63) and septicemia (OR 8.26, 95% CI, 3.91 to 17.4). Other diagnoses that were more prevalent in 2007, expressed as OR (95% CI), included aseptic necrosis of bone 4.80 (1.89 to 12.2), pneumonia 17.6 (4.99 to 62.0), pulmonary embolus 5.70 (1.52 to 21.5), mood disorder 11.0 (3.51 to 34.3), and cocaine disorder 10.3 (2.42 to 43.8) (Table 2). However, only a minority of non‐high utilizers were hospitalized at all that year.
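
The odds ratios above can be checked with a short calculation. The Python sketch below uses the log (Woolf) method for the confidence interval, which is an assumption, since the method used is not stated in the text. The counts are back‐calculated from the reported percentages (for example, 14.29% of 84 is 12 patients and 7.60% of 1395 is 106 patients), which is also an assumption.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for the 2x2 table [[a, b], [c, d]].

    a, b = exposed group with and without the diagnosis;
    c, d = comparison group with and without the diagnosis.
    The interval is computed on the log scale (Woolf method).
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Counts reconstructed from reported percentages (an assumption).
examples = {
    "aseptic necrosis before 2007": (12, 72, 106, 1289),
    "renal disease before 2007": (8, 76, 23, 1372),
}
for label, table in examples.items():
    odds_ratio, lower, upper = odds_ratio_ci(*table)
    print(f"{label}: OR {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these reconstructed counts, the sketch gives values that agree with the reported estimates for aseptic necrosis of bone and renal disease prior to 2007.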

Table 2. Comparison of Patients Who Were Never High Utilizers in 2004–2007 With Those Whose First Year of High Utilization Was 2007

| Characteristic | Prior to 2007: Never (N = 1395) | Prior to 2007: New High Utilizers in 2007 (N = 84) | Prior to 2007: P | In 2007: Never (N = 1395) | In 2007: New High Utilizers in 2007 (N = 84) | In 2007: P |
|---|---|---|---|---|---|---|
| Age | 32 [20] | 30 [20] | 0.116 | | | |
| New complications | | | | | | |
| Acute chest syndrome | 11.18% | 15.48% | 0.306 | 4.44% | 17.86% | <0.001 |
| Aseptic necrosis | 7.60% | 14.29% | 0.047 | 1.58% | 7.14% | 0.001 |
| Renal disease | 1.65% | 9.52% | <0.001 | 2.37% | 3.57% | 0.740 |
| New comorbidities | | | | | | |
| Pulmonary embolus | 1.86% | 2.38% | 0.941 | 0.65% | 3.57% | 0.023 |
| Diabetes mellitus | 4.52% | 4.76% | 0.869 | 1.86% | 4.76% | 0.152 |
| Mood disorder | 1.15% | 3.57% | 0.156 | 0.57% | 5.95% | <0.001 |
| Cocaine disorder | 0.65% | 1.19% | 0.926 | 0.36% | 3.57% | 0.002 |
| Alcohol disorder | 4.30% | 2.38% | 0.567 | 1.43% | 4.76% | 0.057 |
| Hospitalizations | 1 [4] | 3 [4] | <0.001 | 0 [2] | 5 [2] | <0.001 |

  • Diagnoses are represented as proportions with at least 1 discharge diagnosis mentioned in a hospital record, expressed as percent.

  • Hospitalizations and age are entered as median [interquartile range].

Course of New High Utilizers

Patients who were high utilizers in 2005 but not in 2004 were identified, and their hospital utilization from 2005 to 2007 was plotted (Fig. 2). Fifty‐five of the original 91 (60.44%) new high utilizers moderated in the following year, and 6.59% were known to have died in the hospital. Of the surviving 30 (32.97%) who did not moderate in the second year, 19 (63.33%) continued the high‐utilizing pattern into the third year, while 9 (16.36%) of those who moderated in year 2 returned to the high‐utilizing pattern in year 3. During this 3‐year period, 10 members (10.99%) of the initial group died in the hospital.

Figure 2

Prospective course of 2005 first‐year high utilizers.

Diagnostic Patterns in Continued and Moderated First‐Year High Utilizers

The diagnoses of patients who were high utilizers in 2005 and not 2004 were examined for differences between those who moderated in 2006 (moderating group) and those who continued the high‐utilizing pattern (persistent group, Table 3). There were no differences in any measures examined in 2004. In 2005, the initial year of high utilization, the groups differed only on the prevalence of new diagnoses of alcohol use disorders (95% CI for odds ratio incalculable due to zero prevalence in moderating group), and slightly in number of hospitalizations (median 5 vs 5.5). Over ensuing years, the persistent group was more likely to have new discharge diagnoses of septicemia (OR 5.88, 95% CI, 1.40 to 24.7) and mood disorders (OR not calculated due to zero prevalence in the moderating group).

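Table 3. Diagnostic and Utilization Characteristics Over Time for New High Utilizers From 2005, Categorized by Moderating or Persistent Course

Course of 2005 first‐year high utilizers: Mod. = moderating course (N = 61); Pers. = persistent course (N = 30). Prior year = 2004; first year = 2005; subsequent years = 2006–2007.

| Characteristic | Prior Year: Mod. | Prior Year: Pers. | Prior Year: P | First Year: Mod. | First Year: Pers. | First Year: P | Subsequent Years: Mod. | Subsequent Years: Pers. | Subsequent Years: P |
|---|---|---|---|---|---|---|---|---|---|
| Age | 30 [22] | 25 [29.5] | .682 | | | | | | |
| New complications | | | | | | | | | |
| Acute chest syndrome | 8.2% | 6.67% | .872 | 21.31% | 33.33% | .325 | 4.92% | 13.33% | .318 |
| Aseptic necrosis | 6.56% | 13.33% | .978 | 11.48% | 10.00% | .885 | 8.20% | 13.33% | .691 |
| Renal disease | 0.00% | 0.00% | | 11.48% | 10.00% | .885 | 3.28% | 10.00% | .405 |
| New comorbidities | | | | | | | | | |
| HIV | 0.00% | 0.00% | | 1.64% | 0.00% | .716 | 0.00% | 0.00% | |
| Pulmonary embolus | 0.00% | 3.33% | .541 | 11.48% | 3.33% | .370 | 4.92% | 3.33% | .844 |
| Diabetes mellitus | 3.28% | 3.33% | .844 | 9.84% | 6.67% | .914 | 0.00% | 3.33% | .716 |
| Mood disorder | 3.28% | 6.67% | .541 | 1.64% | 10.00% | .199 | 0.00% | 13.33% | .018 |
| Cocaine disorder | 3.28% | 3.33% | .622 | 1.64% | 10.00% | .199 | 0.00% | 6.67% | .201 |
| Alcohol disorder | 0.00% | 0.00% | | 0.00% | 16.67% | .005 | 1.64% | 6.67% | .523 |
| Hospitalized | 77.05% | 73.33% | .898 | 100% | 100% | | 73.77% | 100% | .005 |
| Hospitalizations | 1 [2] | 1.5 [2.5] | .924 | 5 [2] | 5.5 [4] | .022 | 2 [4] | 11 [11.75] | <.001 |

  • Diagnoses are represented as proportions with at least 1 discharge diagnosis mentioned in a hospital record, expressed as percent.

  • Hospitalizations and age are entered as median [interquartile range].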

Assessment of Effects of Age Selection

To assess the effects of restricting the sample to patients with a known age of 13 years or older, post‐hoc analyses were performed without this restriction. In general, results were in line with findings from the planned analysis.

Using these less stringent criteria, prior to the onset of their first year of high utilization, FY2007 high utilizers (n = 142) were more likely to be female (63.0% vs 52.5%, P = 0.019) than never high utilizers (n = 2173), and also had more chronic kidney disease (7.75% vs 1.29%, P < 0.001), mood disorders (4.93% vs 0.83%, P < 0.001), and prior hospitalizations (median 3 vs 2, P < 0.001).

New 2005 high utilizers who persisted after 2005 (continuous group, n = 75) were more likely to be diagnosed with alcohol disorders in 2005 (8% vs 0%, P = 0.004) and had slightly more hospitalizations (median 5 for both groups, but with a greater spread for the continuous group, P = 0.003) in 2005 than those who moderated (n = 131). After 2005, the continuous group was more likely than the moderating group to have new diagnoses of acute chest syndrome (14.67% vs 5.34%, P = 0.043), aseptic necrosis (14.67% vs 4.58%, P = 0.023), septicemia (21.33% vs 3.82%, P < 0.001), and mood disorders (9.33% vs 0.00%, P = 0.002).

Discussion


Replication of the Moderating Course of High Utilization

This study replicates, with substantial sample size, the finding that high inpatient utilization in patients with SCD tends to moderate relatively quickly. As the present report used a statewide data set of patients not selected for payer type, it mitigates prior concerns that selection by insurance status, disenrollment, and mortality biased previous findings using payer data sets. Thus, the moderating course of the typical high‐utilizing SCD patient now seems well‐established.

The fact that those new high utilizers who did not moderate stabilized at a new, higher level of utilization suggests that interventional studies of high utilizers in SCD may best target a more extreme population, either in terms of multi‐year persistence or an accelerating course of utilization. However, this subgroup will be rare.

Prediction of Onset and Course of New High Utilizers

This is the first study to the authors' knowledge to address the question of whether the onset and course of a period of high utilization can be reliably predicted. The results were mixed. High utilizers appeared to be more ill and complex than comparison patients over a wide range of measures, and new high utilizers were diagnosed with more complications prior to and during an index period of high utilization than comparison patients. Chronic complications appeared to lead a period of new high utilization, and more acute complications occurred in the same year. However, while complications were more prevalent in new high utilizers, the differences were not of sufficient magnitude to be reliably predictive. Even the most common SCD complication noted, acute chest syndrome, occurred as a new diagnosis in less than 20% of the new high utilizers in the initial year of high utilization. Thus, paradoxically, while high utilizers appeared more ill, no particular pattern of illness was strongly predictive of high utilization.

Persistent high utilization, rather than the more usual transient course, seemed more closely related to new substance use and mood disorder diagnoses than to complications of sickle cell disease. Persistent high utilizers had a greater prevalence of new diagnoses of mood disorders than moderating high utilizers in every time period examined, emerging as statistically significant after the first year of high utilization. The difference in new diagnoses of alcohol disorders was statistically significant in the initial period of high utilization, but was also present in the other time periods. Cocaine use disorders showed a similar pattern, though they were more rare and did not rise to the level of statistical significance.

The one SCD complication associated with persistent high utilization was septicemia. It is tempting to speculate that this could be as much cause as consequence of high utilization, given the exposures of frequently hospitalized patients to invasive procedures and nosocomial infection.

There was an intriguing regularity of associations of high utilization with mood disorders. This was most clear in differentiating persistent from moderating high utilizers, but was present as a theme in the results throughout. High utilizers were much more likely to be diagnosed with a mood disorder, and both first year high utilizers and persistent high utilizers were distinguished by a higher prevalence of new mood disorder diagnoses. Patients who were persistent high utilizers after an initial high utilization period in 2005 had a cumulative prevalence of hospital‐diagnosed mood disorder approximating 30% by 2007. These differences could be due to a number of factors, including increased surveillance in high utilizers, pain and chronic illness causing mood disturbance, or mood disorders influencing the underlying disease process.


High utilization in this and other studies is closely related to evidence of more severe sickle cell disease.3, 8, 9, 17 This fact, and the apparent difficulty of predicting the onset and course of high utilization, suggest that the primary intervention to moderate high utilization is to prevent such acute complications as acute chest syndrome in the more seriously affected. While the advent of hydroxyurea produced new hope that clinicians could reduce disruptive and dangerous hospitalizations for SCD patients,18 so far there is little evidence that this has occurred.14 Particularly concerning is evidence that only a minority of patients for whom hydroxyurea is indicated are being prescribed the medication.14, 19 Given the individual and public benefits of reduced morbidity and cost, interventions to reduce barriers to physician prescribing and patient adherence are urgently needed, and this is one of the most important issues in the clinical care of SCD today.

The study also points out the continuing question of the role of psychiatric problems in the high‐utilizing SCD patient. While depression, anxiety, and addiction are frequently used as clinical explanations for high utilization in SCD patients, the research literature has stalled at reporting associations between measures of psychological distress and worsened outcome, with inconsistent results depending on methods and populations chosen. Generally, depression has been defined categorically by threshold cutoffs in screening instruments.20–23 Whether the term depression should refer to major depressive disorder as defined in the standard psychiatric diagnostic system, or as a broader entity including less severe symptoms or milder disorders, is only rarely addressed.20 This method probably produces a high false‐positive rate relative to the provisional gold standard of diagnosis: expert, diagnostic, semi‐structured interviews.21, 22

However, within the limitations of current methods, certain themes have emerged. Depression, as currently defined, appears highly prevalent among SCD patients.20, 21, 23, 24 It is clearly and consistently associated with worsened pain.20, 21, 25, 26 It also predicts greater opioid use, dramatically reduced quality of life, and reduced relief from opioids.20, 27 Findings on utilization are mixed, however. In some studies, depression has been associated with greater utilization.23, 28 However, in the longitudinal Pain in Sickle Cell Epidemiology Study (PiSCES), depression was not associated with utilization when other relevant characteristics were controlled.20 In general, depression appears to lose predictive power as more clinical variables are entered into the model; however, a number of the clinical variables associated with utilization also are related to depression. Whether depression may have a causal role through multiple pathways is not yet settled.

Another matter, frequently discussed but currently unsettled, is the role of addiction in utilization behavior in SCD. Whereas patients with SCD are heavily scrutinized for addiction, and the clinical problem of aberrant opioid use behavior is often discussed for this population, research literature gives little guidance as to the true prevalence and management of such comorbidities. It is well known that substance use disorders are interconnected, such that presence of one elevates risk for others; thus, one would expect more common substance use disorders to act as epidemiologic sentinels for the less common ones. The study of alcohol use disorders, in particular, could be an excellent candidate for developing hypotheses about substance use disorders in this population, divorced from the problem of differentiating problematic pain management behavior from purely drug‐reinforced behavior. In the present study, alcohol use disorders appeared associated with persistent high utilization; however, in other well‐designed longitudinal studies alcohol abuse was related to lower utilization. The meaning of these divergent findings is unclear for the moment.

Despite frequent clinical attention to psychiatric problems in SCD, well‐designed longitudinal studies to clarify causal relations are lacking. Such studies will require reasonable sample sizes, the use of expert structured interviews for diagnostic confirmation, and care taken to address the level of independence of the depressive syndrome from consequences of the underlying illness. The latter will be methodologically difficult. It may require detailed assessment of the course of the depressive syndrome relative to that of the hematologic disorder, assessment of background familial risk for mood disorder, and the effects of subclinical ischemic neurological insults. Despite the challenges, if this work is not done clinicians will be left with tantalizing associations but no solid evidence to guide treatment, and the confusion surrounding the role of psychiatric illness in SCD will remain. Meanwhile, it seems prudent to have a low threshold to screen for psychiatric disorder and to obtain psychiatric consultations relatively early, where appropriate, for patients whose hospital utilization appears to be accelerating.

Strengths and Weaknesses

Higher hospital utilization provided greater opportunity for diagnosis, so some detection bias in these results should be assumed. Diagnoses are extracted from hospital discharge records using ICD codes. There are opportunities for a number of errors in reporting and coding these diagnoses, and this is likely to be particularly true of conditions that are not the primary clinical focus of the hospitalization. However, all patients had at least 1 hospital admission, and the diagnostic differences are not far outside of expectation. The study also required at least one hospitalization for crisis for inclusion to guard against bias from patients with more mild disease. As such, the conservatively‐defined comparison group may not be a perfect representation of low utilizers, and some relationships may be blunted relative to findings from a broader population.

The study also had a number of strengths, particularly related to the large number of patients who could be tracked using these methods. Given the rarity of sickle cell disease, and the rarity of high‐utilizing patients in the SCD population, such methods as these are the most practical means of developing hypotheses as to the causes and course of high utilization. It is reassuring that the age selection criterion did not significantly alter results; in fact, post hoc analyses that were more inclusive of the high‐utilizing population generally reinforced results of the more stringent analysis.

