Original Research

Discrepancy between admission and discharge diagnoses as a predictor of hospital length of stay




The addition of clinical data or present on admission (POA) codes to administrative databases improves the accuracy of predicting clinical outcomes, such as inpatient mortality. Other POA information may also explain variation in hospital outcomes, such as length of stay (LOS), but this potential has not been previously explored.


To assess whether a discrepancy between the diagnosis coded at the time of admission and the diagnoses coded at discharge independently explains variation in LOS for general internal medicine patients.


A retrospective data review of patients age 18 years and older admitted to general internal medicine units at a large, urban academic medical center between July 2005 and June 2006. A generalized linear regression model was constructed to adjust for patient factors known to be associated with LOS.


Average LOS among patients with a discrepancy between the admitting and discharge diagnosis codes versus those patients without a discrepancy.


The average LOS for patients without a discrepancy between the admitting and discharge diagnosis codes, adjusted for comorbid conditions, was 3.4 days compared to 4.2 days with a discrepancy (0.76‐day increase; P < 0.01).


Diagnosis discrepancy is associated with longer LOS. Diagnosis discrepancy on admission may be a marker of diagnosis uncertainty or poor patient assessment/documentation. Further research is needed to understand the underlying reasons for this discrepancy and its association with LOS, and, potentially, clinical outcomes. Journal of Hospital Medicine 2009;4:234–239. © 2009 Society of Hospital Medicine.


Recent research has found that the addition of clinical data to administrative data strengthens the accuracy of predicting inpatient mortality.1, 2 Pine et al.1 showed that including present on admission (POA) codes and numerical laboratory data resulted in substantially better fitting risk adjustment models than those based on administrative data alone. Risk adjustment models, despite improvement with the use of POA codes, are still imperfect and severity adjustment alone does not explain differences in mortality as well as we would hope.2

The addition of POA codes improves prediction of mortality because they distinguish conditions that were present at the time of admission from conditions that were acquired during the hospitalization. It is not known, however, whether these codes are related to other measures of hospital performance, such as differences in length of stay (LOS). Which of the factors related to the patient's clinical condition at the time of hospital admission drive differences in outcomes?

A patient's admission diagnosis may be an important piece of information that accounts for differences in hospital care. A patient's diagnosis at the time of hospital admission leads to the initial course of treatment. If the admitting diagnosis is inaccurate, a physician may spend critical time following a course of unneeded treatment until the correct diagnosis is made (reflected by a discrepancy between the admitting and discharge diagnosis codes). This discrepancy may be a marker of the fact that, while some patients are admitted to the hospital for treatment of a previously diagnosed condition, other patients require a diagnostic workup to determine the clinical problem.

A discrepancy may also reflect poor systems of documenting critical information and result in delays in care, with potentially serious health consequences.3, 4 If diagnosis discrepancy is a marker of difficult‐to‐diagnose cases, leading to delays in care, we may be able to improve our understanding of perceived differences in the production of high‐quality medical care and proactively identify cases which need more attention at admission to ensure that necessary care is provided as quickly as possible.

Almost universally, comparisons of hospital performance are risk‐adjusted to account for differences in case mix and severity across institutions. These risk‐adjustment models rely on discharge diagnoses to adjust for clinical differences among patients, even though recent research has shown that models using discharge diagnoses alone are inadequate predictors of variation in mortality among hospitals. While the findings of Pine et al.1 suggest the need to add certain clinical information, such as laboratory values, to improve these models, this information may be costly for some institutions to collect and report. We aimed to explore whether other simple‐to‐measure factors that are independent of the quality of care provided and are routinely collected by hospitals' electronic information systems can be used to improve risk‐adjustment models. To assess the potential of other routinely collected diagnostic information in explaining differences in health outcomes, this study examined whether a discrepancy between the admission and discharge diagnoses was associated with hospital LOS.

Patients and Methods

Patient Population

The sample included all patients age 18 years and older who were admitted to and discharged from the general medicine units at Rush University Medical Center between July 2005 and June 2006. We further limited the sample to patients who were admitted via the emergency department (ED) or directly by their physician, excluding patients with scheduled admissions, for which LOS may vary little, and patients transferred from other hospitals. We also excluded patients admitted directly to the intensive care units. However, some patients were transferred to the intensive care units during their stay, and we retained these patients. Only a small percentage of cases (1.2%) fit this designation. We did not explore the effects of this clinical situation because of the small number of patients. Our aim was to assemble a sample of patients for whom admission is more likely to be for an episodic and diagnostically complex set of symptoms and signs.

Diagnosis Discrepancy

Admission and discharge diagnosis codes were classified using the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD‐9‐CM). An admission diagnosis is routinely documented and coded by hospitals but, unlike the discharge diagnosis codes, is not used by most private and public payers for reimbursement purposes. The admission diagnosis code summarizes information known at the time the patient is admitted to the hospital and corresponds to the chief complaint in the history and physical report. Its specificity may depend on a variety of patient and physician‐related factors, and neither the quality of the information collected at admission nor the specificity of the coded information is externally regulated. Only one admission diagnosis code is captured and, like the discharge diagnosis codes, it is coded at the time of discharge. The admission diagnosis code therefore reflects the information known at the time of admission but is coded retrospectively.

A patient may have multiple discharge diagnosis codes. These codes summarize information collected throughout a hospitalization. The discharge diagnosis codes are used to bill third‐party payers and patients. In addition, governmental agencies, benchmarking institutions, and researchers use the discharge diagnosis codes to classify a patient's condition, identify comorbidities, and measure severity of illness.

We measured discrepancy between admission and discharge diagnoses in two ways. We first compared the admitting diagnosis code with the principal discharge diagnosis code. A match was defined as a patient record in which the two codes were exactly the same at the terminal digit. If the two codes did not match exactly at the terminal digit, we classified the patient as having a discrepancy or mismatch between diagnosis codes. For example, if the admitting diagnosis code was 786.05 (shortness of breath) and the principal discharge diagnosis code was 428.0 (congestive heart failure, unspecified), the diagnosis codes were classified as discrepant. To test the robustness of our definition of discrepancy between admitting and discharge diagnoses, we created a second variable that compared the admitting diagnosis code with the first five discharge diagnosis codes. If the admitting diagnosis code did not match any of the first five discharge diagnosis codes, the diagnosis codes were classified as discrepant.
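The two definitions above can be illustrated with a minimal Python sketch. The function name `is_discrepant` and its parameter `k` are hypothetical names introduced here for illustration; the study's actual implementation is not described in the text. The sketch assumes codes are stored as strings and that the discharge codes are ordered with the principal diagnosis first.

```python
def is_discrepant(admit_code, discharge_codes, k=1):
    """Return True when the admitting code matches none of the first k
    discharge codes (k=1 compares with the principal diagnosis only).

    Hypothetical helper: codes are compared as whole strings, so a match
    requires agreement through the terminal digit.
    """
    admit = admit_code.strip()
    return all(admit != code.strip() for code in discharge_codes[:k])


# Example from the text: shortness of breath admitted, heart failure discharged.
principal_only = is_discrepant("786.05", ["428.0", "786.05"], k=1)  # discrepant
first_five = is_discrepant("786.05", ["428.0", "786.05"], k=5)      # matches 2nd code
# Codes that differ only at the terminal digit still count as discrepant.
terminal_digit = is_discrepant("786.50", ["786.59"], k=1)
```

Comparing whole strings rather than code prefixes is what makes 786.50 and 786.59 a mismatch, consistent with the terminal-digit rule described above.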

We use the term diagnosis discrepancy to refer to records that have a mismatch between admitting and discharge diagnosis codes.

Models and Data Collection

The outcome of interest was inpatient LOS. The primary independent variable was whether the patient record had a discrepancy between the admitting and discharge diagnosis codes.

Our models controlled for the following variables: age; sex; admission source (ED or primary care provider); primary source of insurance (Medicare, Medicaid, or commercial coverage); and severity of illness, measured by the number of comorbid conditions.5, 6 We also controlled for the general type of clinical condition, which was classified by the principal discharge diagnosis code using the Healthcare Cost and Utilization Project's Clinical Classifications Software 2007.7 Data were collected from the institution's clinical data warehouse.

Statistical Analysis

A generalized linear regression model fit with a negative binomial distribution was used to test for an association between inpatient LOS and a discrepancy between admitting and discharge diagnosis codes, adjusting for the variables described above. We reestimated our models without the respective diagnosis discrepancy variable and calculated a likelihood ratio test statistic for the two models to determine whether the addition of diagnosis discrepancy significantly improved our models.
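As a rough sketch of the likelihood ratio comparison described above, the following Python snippet computes the test statistic from two log likelihoods. It assumes the nested models differ by a single parameter (the diagnosis discrepancy indicator), so the statistic is referred to a chi-square distribution with 1 degree of freedom; the function name is hypothetical, and the inputs are the log likelihoods as printed in Table 4.

```python
import math


def likelihood_ratio_test(ll_restricted, ll_full):
    """Likelihood ratio statistic and p-value for nested models that
    differ by one parameter (assumption: 1 degree of freedom)."""
    stat = 2.0 * (ll_full - ll_restricted)
    # For 1 df, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0)) if stat > 0 else 1.0
    return stat, p_value


# Model 1 (no discrepancy variable) versus model 2 (principal diagnosis).
stat_m2, p_m2 = likelihood_ratio_test(11737.23, 11797.54)
# Model 1 versus model 3 (first five discharge diagnoses).
stat_m3, p_m3 = likelihood_ratio_test(11737.23, 11771.76)
```

Under these assumptions the statistics reproduce the values reported in Table 4 (120.62 and 69.06), both far beyond the 1% critical value of about 6.63 for 1 degree of freedom.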

We used two sensitivity tests to assess the specification of our models. First, we included two interaction terms: one for diagnosis discrepancy and ED admissions, to assess whether the association between diagnosis discrepancy and LOS differed by admission source; and another for diagnosis discrepancy and the number of comorbidities, to assess whether the association between diagnosis discrepancy and LOS differed by level of patient complexity. Second, we incrementally broadened our definition of a match in admitting and discharge diagnoses by comparing the admitting diagnosis with the first two discharge diagnoses, then the first three discharge diagnoses, through the 10th discharge diagnosis, and reestimated the regression models using the successively broader definition of match (principal, first two, first three, first four, through the first 10 discharge diagnoses) to further assess the robustness of our measurement of diagnosis discrepancy as a predictor of LOS.
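The second sensitivity test can be sketched as a sweep over the number of discharge codes considered. The records below are invented purely for illustration and are not study data; `discrepancy_share` is a hypothetical helper name.

```python
def discrepancy_share(records, k):
    """Fraction of records whose admitting code matches none of the
    first k discharge codes."""
    flagged = sum(1 for admit, discharge in records if admit not in discharge[:k])
    return flagged / len(records)


# Hypothetical records: (admitting code, ordered discharge codes).
records = [
    ("786.05", ["428.0", "786.05"]),
    ("486", ["486"]),
    ("780.6", ["599.0", "276.51", "780.6"]),
    ("789.00", ["577.0"]),
]

# Broaden the match definition from the principal code to the first 10 codes.
shares = {k: discrepancy_share(records, k) for k in range(1, 11)}
```

Because each broader definition can only turn a discrepancy into a match, the share of discrepant records cannot rise as `k` grows, which is why the regression must be reestimated at each step rather than assumed unchanged.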


Of the 5,375 patients discharged between July 2005 and June 2006, 75.6% had a discrepancy between their admitting and principal discharge diagnosis. Patients with a discrepancy between their admitting and principal discharge diagnosis codes had significantly longer LOS, were older, had more comorbid conditions, and were more likely to be male, admitted through the ED, and have Medicare (Table 1). Results were similar for the more encompassing definition of a discrepancy between admitting and the top 5 discharge diagnoses (results not shown).

Table 1

Sample Characteristics by Presence or Absence of a Discrepancy Between Admission Diagnosis and Principal Discharge Diagnosis

Variables | n | No Discrepancy (n = 1,313) | Discrepancy (n = 4,062) | P*
LOS (days), mean (SE) | | 3.4 (3.6) | 4.2 (4.1) | <0.001
Age (years), mean (SE) | | 56.3 (18.8) | 59.7 (18.6) | <0.001
Comorbid conditions (number), mean (SE) | | 1.2 (1.2) | 1.4 (1.3) | <0.001
Gender (%) | | | | 0.019
Admission source (%) | | | |
Insurance coverage | | | |
Clinical domain (%) | | | | <0.001
Nervous system | 230 | 35.7 | 64.4 |

NOTE: Number of patients (n) = 5,375. Numbers in parentheses are standard errors (SEs).
* Significance.

Table 2 reports the 10 most common admitting diagnoses that did not match the principal discharge diagnosis code and the 10 most common principal discharge diagnoses that did not match the admitting diagnosis code. The top 10 discrepant admitting diagnosis codes represented nearly one‐half of all cases with a discrepancy between the admitting and discharge diagnoses. The top 10 principal discharge diagnosis codes represented 23% of all discrepant diagnoses. Table 3 lists the 10 most common pairs of mismatched admitting and principal discharge diagnosis codes. The most common mismatched pair was an admitting diagnosis code of 786.05 (shortness of breath) and a principal discharge diagnosis code of 428.0 (congestive heart failure, unspecified).

Table 2

Ten Most Common Discrepant Admission and Principal Discharge Diagnosis Codes

Rank | Admission Diagnosis Code Not Matching Principal Discharge Diagnosis | % | Rank | Principal Discharge Diagnosis Code Not Matching Admission Diagnosis Code | %
1 | 786.05 Shortness of breath | 11.1 | 1 | 428.0 Congestive heart failure, unspecified | 6.0
2 | 789.00 Abdominal pain, unspecified site | 8.5 | 2 | 486 Pneumonia, organism unspecified | 3.3
3 | 780.6 Fever | 6.7 | 3 | 584.9 Acute renal failure, unspecified | 2.2
4 | 786.50 Chest pain, unspecified | 5.6 | 4 | 786.59 Chest pain, other | 2.1
5 | 787.01 Nausea without vomiting | 3.9 | 5 | 599.0 Urinary tract infection, site not specified | 2.1
6 | 780.99 Other general symptoms | 3.4 | 6 | 996.81 Complications of kidney transplant | 1.8
7 | 780.79 Other malaise and fatigue | 3.0 | 7 | 577.0 Acute pancreatitis | 1.7
8 | 780.2 Syncope and collapse | 2.6 | 8 | 996.62 Infection and inflammatory reaction due to other vascular device, implant or graft | 1.4
9 | 729.5 Pain in limb | 2.1 | 9 | 434.91 Cerebral artery occlusion with cerebral infarction, unspecified | 1.3
10 | 729.81 Swelling of limb | 2.0 | 10 | 008.8 Intestinal infection, not elsewhere classified | 1.0

Table 3

Ten Most Common Pairs of Discrepant Admission and Principal Discharge Diagnosis Codes

Admission Diagnosis | Principal Discharge Diagnosis
786.05 Shortness of breath | 428.0 Congestive heart failure, unspecified
786.50 Chest pain, unspecified | 786.59 Chest pain, other
786.05 Shortness of breath | 486 Pneumonia, organism unspecified
780.6 Fever | 486 Pneumonia, organism unspecified
780.6 Fever | 996.62 Infection and inflammatory reaction due to other vascular device, implant or graft
789.00 Abdominal pain, unspecified site | 577.0 Acute pancreatitis
780.6 Fever | 599.0 Urinary tract infection, site not specified
786.05 Shortness of breath | 491.21 Obstructive chronic bronchitis with acute exacerbation
786.05 Shortness of breath | 415.19 Pulmonary embolism and infarction, other
786.05 Shortness of breath | 493.22 Chronic obstructive asthma, with acute exacerbation

Table 4 reports the results of the generalized linear model predicting LOS. Discrepancy between the admitting and principal discharge diagnoses was associated with a 22.5% longer LOS (P < 0.01), translating into a 0.76‐day increase at the mean for those with discrepant diagnoses. Our results are robust to our definition of discrepancy between admitting and discharge diagnoses. Using the discrepancy definition based on the top five discharge diagnosis codes, a discrepancy between admitting and discharge diagnoses was associated with a 15.4% longer LOS (P < 0.01), translating into a 0.52‐day increase. Results of the likelihood ratio test showed that the addition of diagnosis discrepancy significantly improved the fit of the regression models using both the principal and top five discharge diagnosis codes.
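The conversion from a regression coefficient to a percentage change and a change in days can be sketched as follows. This assumes the negative binomial model uses a log link, so that a coefficient b corresponds to a multiplicative change of exp(b) in expected LOS; the function names are hypothetical, and the baseline of 3.4 days is the mean LOS of patients without a discrepancy. Because the coefficients in Table 4 are rounded to two decimals, the results approximate rather than reproduce the reported 22.5% and 0.76 days.

```python
import math


def percent_change(beta):
    """Percentage change in expected LOS implied by a log-link coefficient."""
    return (math.exp(beta) - 1.0) * 100.0


def days_at_baseline(beta, baseline_los):
    """Change in days implied by the coefficient at a given baseline LOS."""
    return baseline_los * (math.exp(beta) - 1.0)


# Rounded coefficients from Table 4 (models 2 and 3).
pct_principal = percent_change(0.20)         # roughly 22.1%
pct_top_five = percent_change(0.14)          # roughly 15.0%
days_principal = days_at_baseline(0.20, 3.4)
days_top_five = days_at_baseline(0.14, 3.4)
```

Applying the reported percentages directly to the 3.4-day baseline gives 3.4 × 0.225 ≈ 0.76 days and 3.4 × 0.154 ≈ 0.52 days, in line with the figures in the text.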

Table 4

Results for Generalized Linear Regression Model Predicting LOS (n = 5,375)

 | Model 1 | Model 2 | Model 3
Intercept | 0.98* (0.06) | 0.84* (0.06) | 0.89* (0.06)
Diagnosis discrepancy with principal discharge diagnosis | | 0.20* (0.03) |
Diagnosis discrepancy with top 5 discharge diagnoses | | | 0.14* (0.02)
Age | 0.001 (0.001) | 0.001 (0.001) | 0.001 (0.001)
Female | 0.03 (0.02) | 0.03 (0.02) | 0.03 (0.02)
Emergency department admission | 0.02 (0.03) | 0.03 (0.03) | 0.03 (0.03)
Medicare | 0.15* (0.03) | 0.15* (0.03) | 0.15* (0.03)
Medicaid | 0.04 (0.03) | 0.04 (0.03) | 0.05 (0.03)
Number of comorbid conditions | 0.13* (0.01) | 0.13* (0.01) | 0.13* (0.01)
Log likelihood for model | 11737.23 | 11797.54 | 11771.76
Likelihood ratio test statistic | | 120.62* | 69.06*

NOTE: Model 1 excludes diagnosis discrepancy variable; model 2 includes diagnosis discrepancy variable using the principal discharge diagnosis code; model 3 includes diagnosis discrepancy variable using the first 5 discharge diagnosis codes. Omitted category includes match in admitting and discharge diagnoses (in models 2 and 3), male, direct admission and commercial insurance coverage. Models control for clinical domain. Generalized linear models are estimated with a negative binomial distribution. Standard errors (SEs) are shown in parentheses.
* Significance at the 1% level or better.

Broadening our definition of a match between admitting and discharge diagnosis codes from matching only on the principal discharge diagnosis code to the first 10 discharge diagnosis codes showed that even when using the first 10 discharge diagnoses, a diagnosis discrepancy was still associated with a significantly longer LOS. The magnitude of the association weakened, however, as the definition of a match in diagnosis codes was broadened, ranging from 22.5% when including the principal discharge diagnosis code only to 12.1% when including the first 10 discharge diagnosis codes (Figure 1).

Figure 1

Association between discrepancy in admission and discharge diagnoses and LOS for the first 10 discharge diagnosis codes (n = 5,375).


Discrepancy between admitting and discharge diagnosis codes was associated with a large increase in LOS, even after controlling for age, sex, admission source, insurance, number of comorbid conditions, and clinical domain. This discrepancy translated into an increase of 0.76 days in LOS per general medicine patient, nearly two‐thirds larger than the increase in LOS of 0.47 days associated with having one comorbid condition, and equated to 4,102 additional patient days for the 5,375 general internal medicine patients admitted.

The relative and absolute increase in LOS associated with a diagnosis discrepancy is considerably larger than that associated with measures of comorbid illness found in other studies. In a study examining the predictive power of comorbidity measures based on diagnosis codes and outpatient pharmacy records, Parker et al.8 found that the inclusion of comorbid conditions based on only discharge diagnosis codes was associated with up to a 0.28‐day increase in LOS, and the further inclusion of comorbidity markers based on pharmacy data was associated with up to an additional 0.09‐day increase in LOS. In a study comparing different measures of disease severity and comorbidities in predicting LOS for total knee replacement patients, Melfi et al.9 found that the addition of one diagnosis code was associated with a 3.3% increase in LOS. Similarly, Kieszak et al.10 found that the likelihood of having an LOS greater than 10 days increased two‐fold for patients undergoing carotid endarterectomy who had at least one comorbidity.

While a discrepancy between the admitting and discharge diagnosis codes was consistently associated with an increased LOS, the underlying reasons are not yet understood. We can only speculate about the reasons for this association, and further work is needed to test these hypotheses. There are several possible explanations for discrepant cases: (1) poorer documentation at the time of admission, (2) more complexity in terms of the diagnostic task, and (3) less thorough diagnostic workup at the time of admission.

First, we do not think that poor documentation at the time of admission is the most likely explanation. Our ED uses documentation templates for all admitted patients, which equalizes the amount of documentation for many patients. More importantly, diagnosis codes at the time of admission via the ED are assigned by physicians, not by coders working from documented information.

We think the most likely reason is that patients with discrepant diagnoses are truly harder to diagnose. Suppose that the time to provide care to patients once admitted is the same regardless of the ED or preadmission triage; that is, all patients are seen nearly as soon as they are admitted and the workup promptly ensues. Under these conditions, variation in LOS may be due to more care being needed for the most severely ill. If this assertion is true, our finding is a new one and adds a candidate variable to explain variation in care due to patient severity (beyond comorbid illness, which we controlled for). We think we are showing that diagnostic uncertainty is a common, previously unexamined component of the complexity of clinical presentations (we propose that diagnosis discrepancy is a complexity variable rather than a comorbid, severity of illness variable). Discrepancy between admitting and discharge diagnosis codes could also be due to other patient characteristics, such as a patient's inability to communicate his or her symptoms to the physician because of language or cultural barriers.

However, regarding the third possible reason, if the ED or the preadmission setting fails to provide diagnostic services prior to admission for those patients with discrepant diagnoses regardless of diagnostic complexity, then our finding is a hospital or system performance variable. Those patients with discrepant diagnoses may have had a less thorough workup prior to admission leading to more workup being needed during the admission.

Regardless of the reason (perhaps all three reasons are involved at some level), our study points to a new component of patient care variations. We hope our finding spurs future research efforts. We are about to embark on a comparison of patients with identical discharge diagnoses but discrepant or not discrepant admission diagnoses to explore variations in the amount/type of diagnostic and treatment plans provided both before and during hospitalization.

Further support for diagnostic complexity as the reason for discrepancy is that the admission codes for discrepantly coded patients are nonspecific symptom or sign diagnoses (eg, shortness of breath, abdominal pain), while the discharge diagnoses are more specific (eg, congestive heart failure, pancreatitis) (Tables 2 and 3). The nonspecific nature of the preliminary codes likely signifies more clinically complex situations, and the discrepancy, when noted, portends more healthcare needs over and above those captured by previously described risk adjustment models. For patients admitted without a clear diagnosis of a clinical problem, diagnostic workups may be more complex and require longer hospitalization. For these patients, a longer LOS may not be a marker of poor quality of care but may instead reflect the lack of critical information at the time of admission.

Our comparison of the association between LOS and a discrepancy in diagnosis codes when the admitting diagnosis code was successively matched to a larger number of discharge diagnosis codes suggests that LOS increases not only when the admitting diagnosis is incorrect or not sufficiently specific, but also when the admitting diagnosis is correct, but not the principal discharge diagnosis. Taken together, these findings suggest that delays in care may result from lack of clear patient diagnostic information at the time of admission.

Our findings may advance the understanding of variations in hospital care from two standpoints. First, noting the discrepant diagnoses may significantly improve prediction in health services research studies examining variations in hospital performance, even beyond the addition of POA coding. Second, and perhaps more importantly, prospectively identifying patients at the time of admission with the nonspecific, preliminary codes identified in our study may allow physicians to identify, earlier in care, patients with more demanding care needs. We realize, however, that before we could use this information to prospectively attempt to improve care, coding would have to be done at admission rather than discharge. At our site, this is true in the ED setting. Patients are assigned an admission diagnosis code as they leave the ED, and this code is carried through to discharge without alteration. A nonspecific admission code could, for example, alert those taking care of the patient in the hospital that this is perhaps a more complex clinical situation requiring earlier consultation. Concurrent coding could also jumpstart studies to better understand whether what we have found in this preliminary study is due to poor assessment or difficult patient situations. However, this approach may not be possible for patients admitted directly from physician offices, as both the admission and discharge codes are determined at the time of discharge and based on documentation. Yet, on admission, a chief complaint is provided that may serve the same purpose as an admission diagnosis code if the two are sufficiently in agreement.

Our study has limitations. It is from a single medical center and uses administrative data alone. We did not have access to clinical records for more detailed information about the content and completeness of medical records at the time of admission. Our observations should be tested in other hospital systems. Another limitation may be that we focus on discrepancy and not on those patients without a discrepancy. However, the aim of testing for discrepancy is to focus on improvement. Conducting a more in‐depth chart review of patients with similar final diagnoses, some with discrepant codes and others with nondiscrepant codes, may be a way to assess the reasons why LOS varied in the two groups. The next step, should our observations be confirmed, is to systematically assess whether other characteristics exist that differentiate cases in which a discrepancy between diagnosis codes is due to diagnostic uncertainty from those in which it is due to diagnostic oversight or error. A method to systematically identify conditions at admission that are likely to be misdiagnosed or have a delay in diagnosis may substantially improve the overall quality of care provided in the hospital.

