Print version ISSN 0021-7557
J. Pediatr. (Rio J.) vol.79 no.6 Porto Alegre Nov./Dec. 2003
Edgar SarriaI; Gilberto B. FischerII; João A. B. LimaIII; Sergio S. Menna BarretoIV; José A. M. FlôresV; Ricardo SukiennikVI
Pediatrician, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre,
IIMSc. Pediatric pulmonologist, Universidade Federal do Rio Grande do Sul (UFRGS). Physician, Pneumology Service, Hospital da Criança Santo Antônio (HCSA), Porto Alegre, RS, Brazil
IIIPhD. Professor, Fundação Faculdade Federal de Ciências Médicas de Porto Alegre (FFFCMPA). Chief of the Pneumology Service, Hospital da Criança Santo Antônio (HCSA)
IVPhD. Professor, Universidade Federal do Rio Grande do Sul (UFRGS). Chief of the Pneumology Service, Hospital de Clínicas de Porto Alegre (HCPA), Porto Alegre, RS, Brazil
VPediatrician, specialist in radiology. Professor, Fundação Faculdade Federal de Ciências Médicas de Porto Alegre (FFFCMPA). Chief of the Radiology Service, Hospital da Criança Santo Antônio (HCSA), Porto Alegre, RS, Brazil
VIMSc. Pediatrician. Professor, Fundação Faculdade Federal de Ciências Médicas de Porto Alegre (FFFCMPA). Physician, Emergency Room, Hospital da Criança Santo Antônio (HCSA), Porto Alegre, RS, Brazil
OBJECTIVE: To evaluate the inter-observer agreement of radiological diagnosis of lower respiratory tract infections in children.
METHODS: Chest X-rays from 60 children younger than 5 years of age were evaluated by three physicians: a pediatric radiologist (PR), a pediatric pulmonologist (PP) and an experienced emergency pediatrician (EP). All children had been brought to an emergency room due to acute respiratory infections with apparent lower respiratory tract involvement. Observers were blinded to the original diagnostic conclusions, but clinical and laboratory data from the initial medical evaluation were provided with each film. Variables were grouped into five categories: a) film quality; b) site of abnormality; c) radiological patterns; d) other radiographic images; e) diagnosis. Inter-observer agreement was assessed using Kappa statistics, accepting prevalence-adjusted bias-adjusted kappa (PABAK) values.
RESULTS: Kappa values for each of the three observer pairs (PR vs. PP, PR vs. EP, and PP vs. EP) were 0.41, 0.43, and 0.39, respectively. The overall inter-observer agreement was moderate (0.41). Agreement by variable category was as follows: regular for "film quality" (0.30); moderate for "site of abnormality" (0.48); regular for "radiological patterns" (0.29); moderate for "other radiographic images" (0.43); and regular for "diagnosis" (0.33). The overall intra-observer agreement was "moderate" (0.54), which is below the agreement values reported by other studies on chest X-ray variability.
CONCLUSIONS: Inter-observer variability is an intrinsic characteristic of the interpretation of chest X-rays, and the diagnosis of lower respiratory tract infections in children remains a challenge. Most of our results were similar to those previously reported.
Lower respiratory infections, pneumonia, diagnosis, chest x-ray.
The chest radiograph is recognized as an instrument of fundamental importance in the diagnosis of pneumonia in children,1-4 although it is not included as such in the program for the control of acute respiratory infections (ARI), which is currently integrated into the WHO IMCI (Integrated Management of Childhood Illness) strategy.5,6 The justification for this exclusion is tripartite: a) the availability of X-Ray equipment in developing countries is limited, and represents great expense to them; b) when chest x-rays are compared with clinical signs included in the ARI norms, tachypnea associated with subcostal retraction presents greater sensitivity and specificity; c) there is considerable interobserver variation in the interpretation of radiographs.7
Interobserver variation (IOV) is studied much more often in diagnostic imaging than in other fields, although it exists in all areas of medicine.8 The published literature also reports that the ARI control norms, which use clinical indicators, are themselves subject to interobserver variation, which in turn depends on levels of training and standardization of techniques.9
Notwithstanding the value and importance of x-rays to the diagnosis of pneumonia in children, few studies have been dedicated to understanding and improving the elements involved in IOV. In a recent systematic review of the subject,10 Swingler searched for studies on interobserver agreement (IOA) in the diagnosis of acute lower respiratory infections (ALRI) in children published between 1966 and 1999 in the MEDLINE, HealthSTAR and HSRPROJ (Health Services Research Projects in Progress) databases. In these databases he managed to identify just ten articles in total, of which only six were methodologically suitable for analysis.
In developing countries, the diagnosis of pneumonia and other ALRIs, depending on local conditions, is based entirely on clinical symptoms (WHO directives), or on these in conjunction with radiographic and laboratory findings, if such resources are available. In this last case, the chest radiograph is used as the gold standard, and those responsible for the diagnostic process are generally clinical practitioners with varying degrees of experience and training, because at many health centers there are no radiologists available.
In the present study we attempted to evaluate radiological agreement in ALRI diagnoses between doctors with high levels of training and who have in common frequent contact with children suffering from respiratory diseases.
The source material comprised 334 chest radiographs (CXR) of children younger than five years, obtained during medical consultations at the emergency service of a medium-sized hospital in Nicaragua during 1998 as part of a study of pneumonia.11 The children had been taken to the hospital with a clinical status suggestive of respiratory infection with pulmonary involvement, for which a chest radiograph was requested as part of the work-up. All of the children were admitted with diagnoses of pneumonia. CXRs obtained at any time other than the first consultation were excluded, as were CXRs from children with known pulmonary or cardiac malformations and CXRs which, by the first author's criteria, were of markedly low technical quality (too much movement, or over- or underexposed), making it impossible to interpret radiological findings or compromising the interpretation in a manner relevant to decision making.
Based on a calculation of sample size, 60 films were included in this study. Lateral views are uncommon in Nicaragua, so only posteroanterior views were included. The examinations were selected sequentially based on the inclusion criteria and were later evaluated individually, in 2002, by three doctors from the Santo Antônio Children's Hospital: a pediatric radiologist, a pediatric pneumologist and a pediatrician with emergency room experience. The professionals were selected by convenience, taking into account that all three work in a tertiary institution which is the regional center of excellence in the State of Rio Grande do Sul for children with respiratory diseases, all three have a minimum of ten years' experience in the area for which they were included as observers, and all three teach both medical students and residents.
The diagnoses and treatments established in Nicaragua were intentionally concealed in order to avoid any influence on interpretations. The observers were given a standard form with clinical data and blood test results from the initial consultation. Information included: age, time since onset of disease, and presence of fever, coughing, shortness of breath, retraction, and cyanosis. When present, the following pulmonary auscultation findings were also included: rhonchi, crackles and wheezing. The following were recorded from the blood test: hematocrit and total and differential leukocyte counts.
Variables were grouped into five categories: a) technical quality of the film; b) location of abnormalities, which included "pulmonary involvement" and "distribution of abnormality" (central and/or peripheral); c) roentgenographic patterns (alveolar, interstitial or mixed); d) other radiographic abnormalities, including collapses, perihilar abnormalities, bronchial thickening, and pleural opacity; e) diagnosis, which included six possibilities: normal, non-specific abnormalities (bronchitis and/or collapse), acute viral bronchiolitis, viral pneumonia, bacterial pneumonia, and mixed pneumonia (viral and bacterial).
Information was collected on a form developed for the purpose. The response options aimed to confirm the topographical location of findings in addition to their presence. The majority of the terms used on the form were adapted from the WHO recommendations for the analysis of chest radiographs of children.12 X-rays were interpreted by each doctor with no time restrictions, using a standard consulting room illuminator to view them. The form was accompanied by a sheet of instructions for its completion, and definitions of the terms and variables used. Both the form and the instructions were discussed with the observers individually.
Interobserver agreement (IOA) was tested with the initial interpretation and then, two or three weeks later, ten CXRs were selected at random and were re-interpreted by each observer to test intraobserver agreement. All three specialists re-evaluated the same x-rays.
The Kappa statistic was used to measure both IOA and intraobserver agreement, calculating the unweighted Kappa value with its 95% confidence interval, and accepting prevalence-adjusted bias-adjusted kappa (PABAK) values. We accepted as working values only those K which resulted from the PABAK correction, also known as kappanor or Bennett's S coefficient.13 Conventional interpretation of K values is as follows: 0.00-0.20 = poor agreement, 0.21-0.40 = regular, 0.41-0.60 = moderate, 0.61-0.80 = good, 0.81-1.00 = very good. Negative values are interpreted as equal to 0.00.14 The database was created with EPI-INFO v.6.04b,15 which generated the n x n tables for the three pairs of observers; the values from these tables served as the basis for the Kappa statistical calculations performed with PEPI v.3.0.13
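As an illustration of the quantities named above, the following is a minimal sketch of unweighted Cohen's Kappa, the PABAK (Bennett's S) correction, and the interpretation scale quoted in the text. It is not the PEPI implementation used in the study; the function names are hypothetical, confidence intervals are omitted, and the PABAK formula assumes k rating categories, reducing to 2Po - 1 for two categories.

```python
from collections import Counter


def observed_agreement(a, b):
    """Proportion of films on which two observers chose the same category."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Unweighted Cohen's Kappa: (Po - Pe) / (1 - Pe)."""
    n = len(a)
    po = observed_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each observer's marginal category frequencies.
    pe = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    if pe == 1.0:
        raise ValueError("Kappa is undefined when both observers use a single category")
    return (po - pe) / (1 - pe)


def pabak(a, b, k):
    """Prevalence-adjusted bias-adjusted Kappa (Bennett's S) for k categories."""
    if k < 2:
        raise ValueError("k must be at least 2")
    po = observed_agreement(a, b)
    return (k * po - 1) / (k - 1)


def interpret(kappa):
    """Label a Kappa value using the scale quoted in the text."""
    value = max(kappa, 0.0)  # negative values are read as 0.00
    if value <= 0.20:
        return "poor"
    if value <= 0.40:
        return "regular"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "good"
    return "very good"
```

With balanced marginals and two categories, Kappa and PABAK coincide; they diverge as the marginal frequencies become skewed, which is the situation the correction is intended to address.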
Anticipating Kappa values > 0.20 based on previous agreement studies16,17 and using the statistical model proposed by Donner and Eliasziw for reliability studies,18 we calculated that 40 x-rays were necessary for two observers, accepting an alpha error of 0.05 and a beta of 0.20. We added 20 x-rays, taking into account the number of variables and the possibility that not all films would be included in the analysis of all of the variables, in the event that the observers judged technical quality to be too low. In total 60 x-rays were studied.
The original study into pneumonia had been evaluated and approved by the Directorate of the HSJ in Nicaragua. The current study was submitted to and approved by the Commission for Research and Ethics in Health at the Hospital de Clínicas de Porto Alegre.
Distribution across the sexes was: 32 males (53.3%) and 28 females (46.7%). The youngest child was 1 month old and the oldest was 59 months old, with a mean and standard deviation of 17.3 and 13.3 months, respectively.
Results for interobserver agreement show that average agreements between each observer pair (Table 1, last row) did not differ greatly (0.41, 0.43, 0.39), with global agreement considered "moderate" (0.41). Of the ten variables analyzed, average agreement between observers (same table, last column) was "poor" for two variables: bronchial thickening (0.11) and perihilar abnormalities (0.12); "regular" for four: technical quality of the x-ray (0.30), general distribution (0.38), roentgenographic patterns (0.29) and diagnosis (0.33); "moderate" for three: pulmonary involvement (0.58), collapse (0.54), hyperinflation (0.45); and "very good" for one: pleural opacity (0.93). The observer pairs averaged 1.7 variables with "poor" agreement, 4.7 with "regular", two with "moderate", 0.7 with "good" and one with "very good" agreement.
The pairing of the pediatric pneumologist with the emergency pediatrician achieved the lowest agreement (0.39), considered "regular". The variables with the least agreement between the two were bronchial thickening (0.04) and perihilar abnormalities (0.02). In the other pairs, variables with "poor" levels of agreement were: between the pediatric radiologist and the pediatric pneumologist, perihilar abnormalities (-0.19); and between the radiologist and the emergency pediatrician, film quality (0.20) and bronchial thickening (0.02).
Intraobserver agreement returned better numerical values than interobserver agreement, both for the averages of the three observers (Table 2, last row) and for the averages by variable (same table, last column). However, the overall average agreement in this analysis was also "moderate", as was the case with the interobserver agreement analysis (0.54 vs. 0.41). When analyzed by observer, the pediatric radiologist had the most consistent results, with a "good" total average agreement (0.66). The exception was the variable film quality, for which agreement was "poor" (0.00). Both the pediatric pneumologist and the emergency pediatrician had "moderate" total average agreement (0.46 and 0.50), but with "poor" agreement for the variables general distribution (-0.20) and roentgenographic patterns (0.07) in the case of the former, and for the variables perihilar abnormalities (-0.20) and diagnosis (0.16) in the case of the latter.
The Kappa (K) statistic is the measure most often used in variation studies because it measures agreement between observers after discounting the agreement expected by chance. However, most studies do not use all the resources that Kappa statistics offer, which has led authors to question the appropriateness of comparing such results with those of other similar studies. The underlying problem is that the K value, including its 95% confidence interval, is greatly influenced by the prevalence of the findings being rated and by how disagreements are distributed between the observers. In order to compensate for this, in the current study we opted for PABAK, which corrects the K value to simulate similar prevalence levels.13,19,20
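The dependence of K on prevalence can be illustrated with a small worked example. The sketch below uses two hypothetical 2 x 2 tables (invented counts, not data from this study) in which two observers agree on 85 of 100 films; only the proportion of films rated abnormal differs. The function names are assumptions made for illustration.

```python
def kappa_2x2(both_yes, only_first_yes, only_second_yes, both_no):
    """Unweighted Cohen's Kappa from the four cells of a 2 x 2 agreement table."""
    n = both_yes + only_first_yes + only_second_yes + both_no
    po = (both_yes + both_no) / n
    first_yes = (both_yes + only_first_yes) / n    # observer 1 marginal
    second_yes = (both_yes + only_second_yes) / n  # observer 2 marginal
    pe = first_yes * second_yes + (1 - first_yes) * (1 - second_yes)
    return (po - pe) / (1 - pe)


def pabak_2x2(both_yes, only_first_yes, only_second_yes, both_no):
    """PABAK for two categories: 2 * Po - 1."""
    n = both_yes + only_first_yes + only_second_yes + both_no
    po = (both_yes + both_no) / n
    return 2 * po - 1


# Hypothetical counts: identical observed agreement (85/100), different prevalence.
balanced = (40, 9, 6, 45)  # about half of the films rated abnormal
skewed = (80, 10, 5, 5)    # most of the films rated abnormal

kappa_balanced = kappa_2x2(*balanced)  # roughly 0.70
kappa_skewed = kappa_2x2(*skewed)      # roughly 0.32
pabak_balanced = pabak_2x2(*balanced)  # 0.70 for both tables
pabak_skewed = pabak_2x2(*skewed)
```

Under these assumed counts the unadjusted K falls from "good" to "regular" purely because the finding is more prevalent, while PABAK is the same for both tables.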
Our results, while acceptable, had lower K values than are described in many different studies of ALRI. We believe that there are two reasons for these results, and they are not mutually exclusive. First, in hospitals in developing countries (including the HCSA) there is no routine evaluation of IOA, even though this could result in greater contact between the different professionals who participate in caring for the children, which would be productive for identifying differences and unifying diagnostic criteria. Second, in the majority of published studies there are few variables and response options are simplified (present/absent, absent/probable/present),10 in contrast to our study, in which 10 variables were analyzed, all with more complex response options than in these studies.
In Brazil, the role of interobserver variation in clinical practice is not often considered, despite its presence in all diagnostic processes.8 This being the case, articles on agreement in the diagnosis of ALRI are beyond rare: there has never been one in Latin America.10,21 We believe that this could illustrate the low level of importance given to the subject, which adds to the difficulties that the diagnosis of pneumonia in children already presents.4,22,23
The individual evaluations of each observer provide evidence of some of these difficulties. For example, with ALRI, the greatest problem is to distinguish between bacterial pneumonia and other diseases that don't require antibiotics. Classically, the identification of suggestive alveolar patterns is equivalent to a diagnosis of bacterial pneumonia, but from this point of view, a diagnosis of bacterial pneumonia, according to the three observers, would represent 14 - 67% of the cases in which they identified suggestive alveolar patterns (Table 3).
Other criteria that have been used to diagnose bacterial pneumonia in published studies are the presence of the suggestive alveolar pattern (consolidation), associated or not with bronchial thickening and/or perihilar abnormalities.4,16 Table 3 also shows that if we take this association as a diagnostic criterion, the percentage of bacterial pneumonia varies from 8 to 67%.
Furthermore, it is recognized that collapses are more common among children due to the elevated occurrence of viral infections, associated with the smaller size of the airways. However, our rate of identification of collapses was relatively low (15.8-32.2%) compared with the percentage of viral infections diagnosed (Table 4). These findings, of a large number of viral pneumonia diagnoses with little identification of collapses but with significant identification of consolidation, are contradictory, but are believed to be, to a certain extent, predictable. A number of different authors, in particular Swischuk,24,25 have argued that collapses, while common, are frequently confused with small-scale consolidation in children with ALRI. Differentiating between these findings in children is sometimes difficult, even for experienced observers, as may have happened in this study.
Bronchial thickening and perihilar abnormalities are known manifestations, but over the years it has proved difficult to standardize the definition of either of them.4,12,26 This lack of standardization may be the reason for the low agreement values identified between the observer pairs, even though the definitions we used were included in the recommendations that accompanied the data collection form.
There was no direct question about the use of antibiotics, but we can approximate the responses by dichotomizing the diagnoses. Thus, children with bacterial pneumonia and mixed pneumonia would require antibiotics and those in the other categories would not (Table 4). Based on this assumption, agreement between observer pairs in terms of diagnosis would range from "moderate" to "good" (K = 0.51-0.61).
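The dichotomization described above can be sketched as follows. The diagnosis labels follow the six categories listed in the methods; the readings are hypothetical and serve only to show how collapsing the categories changes the agreement computed. The function names and the use of the two-category PABAK (2Po - 1) are assumptions made for illustration, not the study's actual procedure.

```python
# Diagnoses assumed to imply antibiotic treatment, as stated in the text.
ANTIBIOTIC_DIAGNOSES = {"bacterial pneumonia", "mixed pneumonia"}


def needs_antibiotics(diagnosis):
    """Collapse the six diagnostic categories into antibiotics yes/no."""
    return diagnosis in ANTIBIOTIC_DIAGNOSES


def agreement(a, b, key=lambda value: value):
    """Proportion of films on which two observers agree after applying `key`."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("readings must be non-empty and of equal length")
    return sum(key(x) == key(y) for x, y in zip(a, b)) / len(a)


def dichotomized_pabak(a, b):
    """Two-category PABAK (2 * Po - 1) on the antibiotics yes/no decision."""
    return 2 * agreement(a, b, key=needs_antibiotics) - 1


# Hypothetical readings of eight films by two observers (not study data).
observer_1 = [
    "viral pneumonia", "bacterial pneumonia", "mixed pneumonia", "normal",
    "acute viral bronchiolitis", "bacterial pneumonia", "viral pneumonia",
    "non-specific abnormalities",
]
observer_2 = [
    "acute viral bronchiolitis", "mixed pneumonia", "viral pneumonia", "normal",
    "viral pneumonia", "bacterial pneumonia", "bacterial pneumonia", "normal",
]

exact_match = agreement(observer_1, observer_2)  # same diagnosis on 2 of 8 films
antibiotic_match = agreement(observer_1, observer_2, key=needs_antibiotics)  # 6 of 8
```

In this invented example the observers rarely choose the same diagnostic label but mostly agree on whether antibiotics would be indicated, which is the pattern the dichotomized analysis is meant to expose.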
From a practical point of view, this is the most important element when using CXR for the treatment of a child suspected of having pneumonia: are antibiotics required or not? The problem is that a conclusion is not reached from a simple analysis of the CXR on an x-ray viewer or the consulting room backlight, as is common. To reach a diagnostic conclusion a process is necessary which begins with a clinical assessment, followed soon after by the integration of this with an interpretation of a group of radiological findings. Dynamic prerequisites to the success of this integration are the acquisition of knowledge, training and critical discussion.
Overall, the primary limitation of this study was the technical quality of the x-rays, which was not uniform. It is common only to include x-rays of an optimum quality in published studies. We included x-rays that could be evaluated, but with varying levels of quality, simulating what happens in the daily practice of many hospitals. With varying technical quality, the degree of certainty of different observers of their perception of abnormalities may differ.
Finally, the results for each observer, and the agreement between them, reaffirmed, to a certain extent, what has been written in the literature on the limitations, difficulties and contradictions involved in the diagnosis of acute infectious pulmonary diseases in children, particularly pneumonia. The lack of uniformity in our results may express a level of variation beyond what would be desired, which reinforces the need for greater attention to these elements of radiological interpretation. In this sense, the fact that our general results coincide with those of other similar or related studies on interobserver agreement should not necessarily give satisfaction, and should rather provide a starting point for reflection on the utility of monitoring, observing and giving importance to elements of medical practice which require greater attention in our environment. This done, the quality of diagnosis would certainly improve and, in consequence, so too would the rational use of antibiotics.27-29
1. Correa A, Starke J. Infections of the lower respiratory tract in children. In: Niederman MS, Sarosi GA, Glassroth J, editors. Respiratory Infections. 2a ed. Baltimore: Lippincott, Williams & Wilkins; 2001. p. 155-170.
2. Jadavji T, Law B, Lebel M, Kennedy W, Gold R, Wang E. A practical guide for the diagnosis and treatment of pediatric pneumonia. Can Med Assoc J. 1997;156(Suppl):S703-11.
3. Paiva M, Reis F, Fischer G, Rozov T. Pneumonias na criança. J Pneumol. 1998;24(2):101-8.
4. WHO Pneumonia Vaccine Trial Investigator's Group. Standardization of interpretation of chest radiographs for the diagnosis of pneumonia in children. WHO/V&B/01.35; 2001.
5. WHO. Acute respiratory infections in children: case management in small hospitals in developing countries: a manual for doctors and other senior health workers. WHO/ARI/90.5; 1990.
6. Benguigui Y. Perspectivas en el control de enfermedades en los niños: atención integrada a las enfermedades prevalentes de la infancia. Revista Brasileira de Saúde Materno Infantil. 2001;1(1):7-19.
7. Arias SJ, Bossio JC, Benguigui Y. Investigaciones operativas prioritarias para evaluar el impacto de las acciones de control de las infecciones respiratorias agudas. In: Benguigui Y, editor. Control de las infecciones respiratorias agudas: implementación, seguimiento y evaluación. OPS Serie HCT/AIEPI-6; 1997.
8. Robinson PJA. Radiology's Achilles' heel: error and variation in the interpretation of the Röntgen image. BJR. 1997;70:1085-98.
9. Lanata C. Incidencia y evolución de la neumonía en niños a nível comunitario. In: OPS/OMS: Infecciones Respiratorias Agudas en niños. Serie HCT/AIEPI-1; 1999. p. 65-86.
10. Swingler G. Observer variation in chest radiography of acute lower respiratory infections in children: a systematic review. BMC Medical Imaging. 2001;1:1. Available from: URL:http://www.biomedcentral.com/1471-2342-1/1. Accessed: May 26, 2002.
11. Sarria-Icaza E, Duarte C, Estrada M. Neumonía en Niños: Características Clínicas y Epidemiológicas Básicas en el Hospital Santiago de Jinotepe, 1998 [dissertação]. Managua: Universidad Nacional Autónoma de Nicaragua; 2000.
12. WHO. Definitions of technical factors used in the assessment of film quality. In: System for standardizing interpretation of radiographs. WHO/ARI/90.13; 1990.
13. Abramson JH, Gahlinger P. Computer Programs for Epidemiologic Analysis, PEPI v. 3.01. Brixton Books; 1999.
14. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-74.
15. Dean AG, Coulombier D, Brendel KA, Smith DC, Burton AH, Dicker RC, et al. Epi Info, Version 6: a word processing, database, and statistics program for public health on IBM-compatible microcomputers. Centers for Disease Control and Prevention, Atlanta, Georgia, U.S.A.; 1996.
16. Davies HD, Wang E, Manson D, Babyn P, Shuckett B. Reliability of the chest radiograph in the diagnosis of lower respiratory infections in young children. Pediatr Infect Dis J. 1996;15:600-4.
17. Limón Rojas A, Moreno Altamirano L, Valenzuela Flores A, Carreón García J, Medina González JC, Sicilia E. Concordancia en la interpretación de radiografías de tórax en niños. Rev Mex Ped. 1995;62(6):219-23.
18. Donner A, Eliasziw M. Sample size requirements for reliability studies. Stat Med. 1987;6:441-8.
19. Altman DG. Some common problems in medical research. In: Practical Statistics for Medical Research. DG Altman, editor. 1a ed. CRC Press; 1991.
20. Brennan P, Silman A. Statistical methods for assessing observer variability in clinical measures. BMJ. 1992;304:1491-4.
21. Coblentz C, Babcook C, Alton D, Riley B, Norman G. Observer variation in detecting the radiologic features associated with bronchiolitis. Invest Radiol. 1991;26(2):115-18.
22. British Thoracic Society. BTS guidelines for the management of community acquired pneumonia in childhood. Thorax. 2002;57(Suppl 1):1i-24i.
23. Amantéa SL. Diagnóstico etiológico das infecções do trato respiratório inferior sempre um desafío. J Pediatr (Rio J). 1999;75(5):310-12.
24. Swischuk L. The chest. In: Emergency Imaging of the Acutely Ill or Injured Child. 4a ed. Baltimore: Lippincott, Williams & Wilkins; 2000.
25. Donnelly LF. Practical issues concerning imaging of pulmonary infection in children. J Thor Imag. 2001;16(4):238-50.
26. Coakley FV, Lamont AC, Rickett AB. An investigation into perihilar inflammatory changes on the chest radiograph of children admitted with acute respiratory symptoms. Clin Radiol. 1996;51:614-17.
27. Pechère JC. Community Acquired Pneumonia in Children. International Forum Series. U.K.: CMP. Publication; 1995.
28. Vuori-Holopainen E, Peltola H, Kallio M. Narrow versus broad spectrum parenteral antimicrobials against common infections of childhood: a prospective and randomised comparison between penicillin and cefuroxime. Eur J Pediatr. 2000;159:878-84.
29. Belongia EA, Schwartz B. Strategies for promoting judicious use of antibiotics by doctors and patients. BMJ. 1998;317:668-71.
Rua José Bonifácio 942/202
CEP 93010-180 - São Leopoldo, RS, Brazil
Phone: +55 (51) 592.3626
Manuscript received Jan 31 2003, accepted for publication Jul 23 2003.
Financially supported by: CAPES.