Original Research

Variability in the interpretation of chest radiographs for the diagnosis of pneumonia in children

Abstract

BACKGROUND:

Although chest radiography is commonly used to establish the diagnosis of pneumonia in children, the reliability of radiographic findings among radiologists is not well described.

OBJECTIVE:

We sought to evaluate the inter‐rater and intra‐rater reliability of radiographic features commonly described by radiologists in childhood pneumonia.

METHODS:

Prospective case‐based study. One hundred and ten radiographs of children evaluated in a pediatric emergency department for suspicion of pneumonia were interpreted by six radiologists at two academic children's hospitals. Radiologists were blinded to the clinical history. Reliability of standardized radiographic features was evaluated using the kappa statistic.

RESULTS:

The radiographic finding of an alveolar infiltrate demonstrated substantial reliability among radiologists (κ = 0.69). The presence of ‘any infiltrate’ and pleural effusion demonstrated moderate reliability (κ = 0.47 and κ = 0.45, respectively). Other radiographic features were less reliable: air bronchograms (κ = 0.32), hilar adenopathy (κ = 0.21), and interstitial infiltrate (κ = 0.14). Similarly, the finding of alveolar infiltrate demonstrated substantial intra‐rater reliability upon review of ten duplicate radiographs, whereas interstitial infiltrate was less reliable.

CONCLUSION:

The radiographic finding of an alveolar infiltrate is very reliable among pediatric radiologists, whereas the finding of an interstitial infiltrate is less reliable. Journal of Hospital Medicine 2012; © 2011 Society of Hospital Medicine.

The chest radiograph (CXR) is the most commonly used diagnostic imaging modality in children, and is considered to be the gold standard for the diagnosis of pneumonia. As such, physicians in developed countries rely on chest radiography to establish the diagnosis of pneumonia.1‐3 However, there are limited data investigating the reliability of this test for the diagnosis of pneumonia in children.2, 4‐6

Prior investigations have noted poor overall agreement by emergency medicine, infectious diseases, and pulmonary medicine physicians, and even radiologists, in their interpretation of chest radiographs for the diagnosis of pneumonia.2, 5, 7‐10 The World Health Organization (WHO) developed criteria to standardize CXR interpretation for the diagnosis of pneumonia in children for use in epidemiologic studies.11 These standardized definitions of pneumonia have been formally evaluated by the WHO6 and utilized in epidemiologic studies of vaccine efficacy,12 but the overall reliability of these radiographic criteria has not been studied outside of these forums.

We conducted this prospective case‐based study to evaluate the reliability of the radiographic diagnosis of pneumonia among children presenting to a pediatric emergency department with clinical suspicion of pneumonia. We were primarily interested in assessing the overall reliability in CXR interpretation for the diagnosis of pneumonia, and identifying which radiographic features of pneumonia were consistently identified by radiologists.

MATERIALS AND METHODS

Study Subjects

We evaluated the reliability of CXR interpretation with respect to the diagnosis of pneumonia among radiologists. Six board‐certified radiologists at 2 academic children's hospitals (Children's Hospital of Philadelphia, Philadelphia, PA [n = 3] and Children's Hospital, Boston, Boston, MA [n = 3]) interpreted the same 110 chest radiographs in a blinded fashion. The radiologists varied with respect to the number of years practicing pediatric radiology (median 8 years, range 3‐36 years). Clinical information such as age, gender, clinical indication for obtaining the radiograph, history, and physical examination findings was not provided. Aside from the study form, which stated the WHO classification scheme for radiographic pneumonia, no other information or training was provided to the radiologists as part of this study.

Radiographs were selected among a population of children presenting to the emergency department at Children's Hospital, Boston, who had a radiograph obtained for concern of pneumonia. From this cohort, we selected children who had radiographs which encompassed the spectrum of respiratory disease processes encountered in a pediatric population. The radiographs selected for review included 50 radiographs with a final reading in the medical record without suspicion for pneumonia, and 50 radiographs in which the diagnosis of pneumonia could not be excluded. In the latter group, 25 radiographs had a final reading suggestive of an alveolar infiltrate, and 25 radiographs had a final reading suggestive of an interstitial infiltrate. Ten duplicate radiographs were included to permit assessment of intra‐rater reliability.

Radiograph Interpretation

Radiologists at both sites interpreted the identical 110 radiographs (both anteroposterior [AP] and lateral views for each subject). Digital Imaging and Communications in Medicine (DICOM) images were downloaded from a registry at Children's Hospital, Boston, and were copied to DVDs which were provided to each radiologist. Standardized radiographic imaging software (eFilm Lite [Mississauga, Canada]) was used by each radiologist to view and interpret the radiographs.

Each radiologist completed a study questionnaire for each radiograph interpreted (see Supporting Appendix A in the online version of this article). The questionnaire utilized radiographic descriptors of primary end‐point pneumonia described by the WHO, which were developed to standardize the radiographic diagnosis of pneumonia.11, 12 The main outcome of interest was the presence or absence of an infiltrate. Among radiographs in which an infiltrate was identified, radiologists selected whether there was an alveolar infiltrate, interstitial infiltrate, or both. An alveolar infiltrate was defined as a dense or fluffy opacity that occupies a portion or whole of a lobe, or of the entire lung, that may or may not contain air bronchograms.11, 12 An interstitial infiltrate was defined by a lacy pattern involving both lungs, featuring peribronchial thickening and multiple areas of atelectasis.11, 12 It also included minor patchy infiltrates that were not of sufficient magnitude to constitute consolidation, and small areas of atelectasis that in children may be difficult to distinguish from consolidation. Among interstitial infiltrates, radiologists were asked to distinguish infiltrate from atelectasis. A radiograph classified as having either an alveolar infiltrate or interstitial infiltrate (not atelectasis) was considered to have any infiltrate. Additional findings including air bronchograms, hilar adenopathy, pleural effusion, and location of abnormalities were also recorded.

Statistical Analysis

Inter‐rater reliability was assessed using the kappa statistic to determine the overall agreement between the 6 radiologists for each binary outcome (ie, presence or absence of alveolar infiltrate). To calculate 95% confidence intervals (CI) for kappa statistics with more than 2 raters, we employed a bootstrapping method with 1000 replications of samples equal in size to the study sample, using the kapci program as implemented by STATA software (version 10.1, STATA Corp, College Station, TX). Also, intra‐rater reliability was evaluated by examining the agreement within each radiologist upon review of 10 duplicate radiographs that had been randomly inserted into the case‐mix. We used the benchmarks proposed by Landis and Koch to classify the strength of agreement measured by the kappa statistic, as follows: poor (<0.0); slight (0‐0.20); fair (0.21‐0.40); moderate (0.41‐0.60); substantial (0.61‐0.80); almost perfect (0.81‐1.0).13
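The calculation described above can be illustrated with a minimal sketch. The source names the kapci program in STATA but does not give the formula, so this example assumes a Fleiss‐type multi‐rater kappa for a binary outcome and a simple percentile bootstrap that resamples radiographs with replacement. The function names (fleiss_kappa, bootstrap_ci) and the toy ratings are hypothetical, and the code is not a reimplementation of kapci.

```python
import random


def fleiss_kappa(ratings):
    """Fleiss-type kappa for a binary finding.

    ratings: one list per radiograph, holding one 0/1 call per rater.
    Returns NaN when chance agreement is 1 (every call is identical).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    pairs = n_raters * (n_raters - 1)
    agreement_sum = 0.0
    total_pos = 0
    for row in ratings:
        pos = sum(row)
        neg = n_raters - pos
        total_pos += pos
        # Fraction of ordered rater pairs that agree on this radiograph.
        agreement_sum += (pos * (pos - 1) + neg * (neg - 1)) / pairs
    p_obs = agreement_sum / n_subjects
    p_pos = total_pos / (n_subjects * n_raters)
    p_exp = p_pos ** 2 + (1 - p_pos) ** 2
    if p_exp == 1.0:
        return float("nan")
    return (p_obs - p_exp) / (1 - p_exp)


def bootstrap_ci(ratings, replications=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed variant, not kapci itself)."""
    rng = random.Random(seed)
    n = len(ratings)
    estimates = []
    for _ in range(replications):
        # Resample radiographs with replacement, keeping the sample size.
        sample = [ratings[rng.randrange(n)] for _ in range(n)]
        kappa = fleiss_kappa(sample)
        if kappa == kappa:  # drop undefined (NaN) replicates
            estimates.append(kappa)
    if not estimates:
        raise ValueError("kappa was undefined in every replicate")
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


# Hypothetical toy data: 6 radiographs, each read by 6 raters (not study data).
toy = [
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(fleiss_kappa(toy), bootstrap_ci(toy))
```

Resampling whole radiographs, rather than individual ratings, keeps the 6 calls on each image together, which is what preserves the agreement structure within each replicate.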

The study was approved by the institutional review boards at Children's Hospital, Boston and Children's Hospital of Philadelphia.

RESULTS

Patient Sample

The sample of 110 radiographs was obtained from 100 children presenting to the emergency department at Children's Hospital, Boston, with concern of pneumonia. These patients ranged in age from 1 week to 19 years (median, 3.5 years; interquartile range [IQR], 1.6‐6.0 years). Fifty (50%) of these patients were male. As stated above, the sample comprised 50 radiographs with a final reading in the medical record without suspicion for pneumonia, and 50 radiographs in which the diagnosis of pneumonia could not be excluded. The 10 duplicate radiographs encompassed a similar spectrum of findings.

Inter‐Rater Reliability

The kappa coefficients of inter‐rater reliability between the radiologists across the 6 clinical measures of interest are displayed in Table 1. As shown, the most reliable measure was that of alveolar infiltrate (Figure 1), which attained a substantial degree of agreement between the radiologists. Two other measures, any infiltrate and pleural effusion, attained moderate reliability, while air bronchograms and hilar adenopathy were each classified as having fair reliability. However, interstitial infiltrate (Figure 2) was found to have the lowest kappa estimate, with a slight degree of reliability. When examining inter‐rater reliability among the radiologists separately from each institution, the pattern of results was similar.
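As a worked illustration of how the Landis and Koch benchmarks map onto the estimates reported in Table 1, the short sketch below classifies each kappa value. The function name landis_koch is hypothetical, and treating each upper bound as inclusive (for example, 0.20 counts as slight) is an assumption, since the published bands leave small gaps between categories.

```python
def landis_koch(kappa):
    """Map a kappa estimate to the Landis and Koch strength-of-agreement label.

    Upper bounds are treated as inclusive, which is an assumption.
    """
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Inter-rater kappa estimates as reported in Table 1.
table_1 = {
    "Any infiltrate": 0.47,
    "Alveolar infiltrate": 0.69,
    "Interstitial infiltrate": 0.14,
    "Air bronchograms": 0.32,
    "Hilar adenopathy": 0.21,
    "Pleural effusion": 0.45,
}

labels = {finding: landis_koch(kappa) for finding, kappa in table_1.items()}
for finding, label in labels.items():
    print(f"{finding}: {label}")
```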

Table 1

Inter‐Rater Reliability of Radiologists (n = 6) Evaluating Chest Radiographs in Children Presenting to the ED With Suspected Pneumonia (n = 100)

All Radiologists (n = 6)    Kappa    95% Confidence Interval
Any infiltrate              0.47     0.39, 0.56
Alveolar infiltrate         0.69     0.60, 0.78
Interstitial infiltrate     0.14     0.05, 0.23
Air bronchograms            0.32     0.24, 0.42
Hilar adenopathy            0.21     0.08, 0.39
Pleural effusion            0.45     0.29, 0.61

  • Abbreviation: ED, emergency department.

Figure 1

Chest radiograph (anteroposterior [AP] view) of a child with an opacity in the right middle lobe. For this image, all 6 radiologists classified the patient as having an alveolar infiltrate.

Figure 2

Chest radiograph (anteroposterior [AP] view) of a child demonstrating increased interstitial markings which are most prominent in the right middle and left upper lobes. For this image, 4 radiologists classified this radiograph as having an interstitial infiltrate, whereas 2 radiologists classified the patient as not having an interstitial infiltrate.

At least 4 of the 6 radiologists agreed on the presence or absence of an alveolar infiltrate for 95 of the 100 unique CXRs; all 6 radiologists agreed regarding the presence or absence of an alveolar infiltrate in 72 of the 100 unique CXRs. At least 4 of the 6 radiologists agreed on the presence or absence of any infiltrate and interstitial infiltrate 96% and 90% of the time, respectively. All 6 of the radiologists agreed on the presence or absence of any infiltrate and interstitial infiltrate 35% and 27% of the time, respectively.
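The agreement percentages in this paragraph are simple counts over radiographs. The sketch below shows one way such figures could be tallied from a matrix of binary calls; the function name agreement_summary and the toy ratings are hypothetical and are not the study data. With 6 raters, the only way to miss the "at least 4 agree" threshold is a 3‐to‐3 split.

```python
def agreement_summary(ratings, threshold=4):
    """Return (share with >= threshold raters agreeing, share unanimous).

    ratings: one list per radiograph, holding one 0/1 call per rater.
    Agreement counts either direction: all present or all absent.
    """
    majority = 0
    unanimous = 0
    for row in ratings:
        pos = sum(row)
        neg = len(row) - pos
        largest_camp = max(pos, neg)
        if largest_camp >= threshold:
            majority += 1
        if largest_camp == len(row):
            unanimous += 1
    n = len(ratings)
    return majority / n, unanimous / n


# Hypothetical toy data: 4 radiographs, 6 raters each.
toy_calls = [
    [1, 1, 1, 1, 1, 1],  # unanimous, finding present
    [0, 0, 0, 0, 0, 0],  # unanimous, finding absent
    [1, 1, 1, 0, 0, 0],  # 3-to-3 split
    [1, 1, 1, 1, 0, 0],  # 4 of 6 agree
]
majority_share, unanimous_share = agreement_summary(toy_calls)
print(majority_share, unanimous_share)
```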

Intra‐Rater Reliability

Estimates of intra‐rater reliability on the primary clinical outcomes (alveolar infiltrate, interstitial infiltrate, and any infiltrate) are found in Table 2. Across the 6 raters, the kappa estimates for alveolar infiltrate were all classified as substantial or almost perfect. The kappa estimates for interstitial infiltrate varied widely, ranging from fair to almost perfect, while for any infiltrate, reliability ranged from moderate to almost perfect.

Table 2

Intra‐Rater Reliability of Radiologists Evaluating Chest Radiographs (n = 10) for Pneumonia in Children

                            Kappa    95% Confidence Interval
Any infiltrate
  Rater 1                   1.00     1.00, 1.00
  Rater 2                   0.60     0.10, 1.00
  Rater 3                   0.80     0.44, 1.00
  Rater 4                   1.00     1.00, 1.00
  Rater 5                   n/a*
  Rater 6                   1.00     1.00, 1.00
Alveolar infiltrate
  Rater 1                   1.00     1.00, 1.00
  Rater 2                   1.00     1.00, 1.00
  Rater 3                   1.00     1.00, 1.00
  Rater 4                   1.00     1.00, 1.00
  Rater 5                   0.78     0.39, 1.00
  Rater 6                   0.74     0.27, 1.00
Interstitial infiltrate
  Rater 1                   1.00     1.00, 1.00
  Rater 2                   0.21     0.43, 0.85
  Rater 3                   0.74     0.27, 1.00
  Rater 4                   n/a
  Rater 5                   0.58     0.07, 1.00
  Rater 6                   0.62     0.5, 1.00

  • Too few response categories were represented to facilitate the calculation of the kappa statistic.
  • Both responses are negative for all 10 paired radiographs; kappa cannot be calculated.
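The "n/a" entries in Table 2 reflect a general property of the kappa statistic: when chance agreement equals 1, the denominator is zero and the estimate is undefined. The sketch below illustrates this for one rater's paired readings, assuming a two‐reading Cohen's kappa on a binary finding. The function name cohen_kappa and the toy readings are hypothetical; the source does not state the exact intra‐rater formula used.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two binary readings of the same radiographs.

    Returns None when expected agreement is 1, for example when both
    readings are negative for every radiograph.
    """
    n = len(first)
    p_obs = sum(a == b for a, b in zip(first, second)) / n
    p_first = sum(first) / n
    p_second = sum(second) / n
    p_exp = p_first * p_second + (1 - p_first) * (1 - p_second)
    if p_exp == 1.0:
        return None  # kappa cannot be calculated
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical paired readings of 10 duplicate radiographs (not study data).
all_negative = [0] * 10
reading_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
reading_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

print(cohen_kappa(all_negative, all_negative))  # undefined case
print(cohen_kappa(reading_1, reading_1))        # identical readings
print(cohen_kappa(reading_1, reading_2))        # one discordant pair
```

With only 10 paired radiographs, a single discordant reading moves the estimate substantially, which is consistent with the wide confidence intervals shown in the table.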

DISCUSSION

The chest radiograph serves as an integral component of the reference standard for the diagnosis of childhood pneumonia. Few prior studies have assessed the reliability of chest radiograph findings in children.3, 5, 12, 14, 15 We found a high degree of agreement among radiologists for radiologic findings consistent with bacterial pneumonia when standardized interpretation criteria were applied. In this study, we identified radiographic features of pneumonia, such as alveolar infiltrate and pleural effusion, that were consistently identified by different radiologists reviewing the same radiograph and by the same radiologist reviewing the same radiograph. These data support the notion that radiographic features most suggestive of bacterial pneumonia are consistently identified by radiologists.16, 17 There was less consistency in the identification of other radiographic findings, such as interstitial infiltrates, air bronchograms, and hilar lymphadenopathy.

Prior studies have found high levels of disagreement among radiologists in the interpretation of chest radiographs.2, 3, 15, 18 Many of these prior studies emphasized variation in detection of radiographic findings that would not typically alter clinical management. We observed high intra‐rater and inter‐rater reliability among radiologists for the findings of alveolar infiltrate and pleural effusion. These are the radiographic findings most consistent with a bacterial etiologic agent for pneumonia.19 Other studies have also found that the presence of an alveolar infiltrate is a reliable radiographic finding in children18 and adults.7, 9, 10 These findings support the use of the WHO definition of primary endpoint pneumonia for use in epidemiologic studies.4, 6, 11

This study also confirms a previous report by Cherian et al. that interstitial infiltrates, the findings seen in many children with asthma, reactive airways disease, bronchiolitis, and viral infections, are less reliable.6 This is not surprising considering the fact that these patients often have radiographic findings due to small airway disease and atelectasis.19, 20 The differentiation between atelectasis and interstitial infiltrate is difficult, particularly in young children. A prior study conducted among neonates observed wide variability in the interpretation of chest radiographs, and found that the differentiation of pneumonia from atelectasis was difficult for this patient population.5 The decisions around antimicrobial treatment of children with radiographic findings of interstitial infiltrates should be made in the context of the clinical history and physical examination findings, and clinicians should realize that these radiographic features demonstrate poor reliability for the diagnosis of pneumonia.

Overall reliability for the presence of any infiltrate, and its converse, no infiltrate, was considered moderate. This is driven by the low reliability and variability around the radiographic diagnosis of interstitial infiltrates. Our findings are similar to those observed in adults with lower respiratory tract infections.9 The low reliability in identification of interstitial infiltrates may explain why prior studies have demonstrated that the CXR results rarely change management in children who have radiographs performed for suspicion of pneumonia.1, 21 Our study highlights the importance of quantifying CXR findings to include specific comments regarding the presence or absence of alveolar infiltrates, rather than the presence or absence of any infiltrate.

The WHO has developed definitions for the radiographic diagnosis of pneumonia, and these definitions have been utilized to help standardize the interpretation of chest radiographs for the conduct of epidemiological studies.6, 11 Specifically, the definitions not only define the presence or absence of pneumonia, but also attempt to differentiate a primarily bacterial infection (consolidation or pleural effusion) from a viral or atypical presentation (interstitial pattern). Even under the best of circumstances, the differentiation of viral versus bacterial pneumonia is not always possible, and again, is often made by the treating physician by incorporating the clinical setting within which the radiograph was obtained.

This study had several limitations. Firstly, the included radiographs did not reflect the frequency with which certain radiographic findings would be identified in children evaluated for pneumonia in a pediatric emergency department setting. Radiographs were purposefully selected to encompass a broad spectrum of radiologic findings, including less common findings such as hilar lymphadenopathy and pleural effusions. Thus, the prevalence of pneumonia and other abnormal findings in this study was artificially higher than typically observed among a cohort of children for whom pneumonia is considered, a factor that may limit the generalizability of our results. Secondly, the clinical history was not provided to the radiologists to avoid bias by indication. For this study, we notified the radiologists that all radiographs were performed for clinical suspicion of pneumonia without providing details about the subjects' signs and symptoms. The absence of clinical history, however, does not mirror the real‐world scenario in which the interpretation of the chest radiograph is frequently made in the context of the clinical history. The relevance of this latter issue is unclear, as Tudor et al. found a statistically nonsignificant improvement in the overall accuracy in chest radiograph interpretation when radiologists were provided clinical details.10 Thirdly, the radiologists recruited for this study all practice in an academic children's hospital setting, and thus, the generalizability of our findings may be limited to this type of practice setting. Finally, reproducibility does not imply accuracy, and reliability in identifying specific findings does not necessarily lead to improved or different management. Thus, while the reliability of radiographic findings of alveolar infiltrate and pleural effusion is reassuringly high, the validity of these radiographic features for bacterial pneumonia is not known. Validity can only be ascertained through the use of invasive testing such as lung biopsy, as the yield from bacterial testing such as blood cultures is low, and the results of other studies such as viral testing of nasopharyngeal washings do not prove an etiologic cause of pneumonia.

CONCLUSIONS

Radiographic findings of alveolar infiltrates and pleural effusions are highly reliable among radiologists. Radiographic interpretation of interstitial infiltrates appears to be less reliable.
