Reliability of cause of death coding: an international comparison

This study evaluates the agreement of nosologic coding of cardiovascular causes of death between a Chilean coder and one in the United States, in a stratified random sample of death certificates of persons aged ≥ 60, issued in 2008 in the Valparaíso and Metropolitan regions, Chile. All causes of death were converted to ICD-10 codes in parallel by both coders. Concordance was analyzed with inter-coder agreement and Cohen's kappa coefficient by level of specification of the ICD-10 code, for the underlying cause and for the total causes of death. Inter-coder agreement was 76.4% for all causes of death and 80.6% for the underlying cause (agreement at the four-digit level), with differences by the level of specification of the ICD-10 code, by line of the death certificate, and by number of causes of death per certificate. Cohen's kappa coefficient was 0.76 (95%CI: 0.68-0.84) for the underlying cause and 0.75 (95%CI: 0.74-0.77) for the total causes of death. In conclusion, cause-of-death coding and inter-coder agreement for cardiovascular diseases in two regions of Chile are comparable to an external benchmark and to reports from other countries.


Introduction
Mortality statistics are a cornerstone of health planning and research 1,2,3. Their usefulness hinges on the quality of death certification and on the reliability of coding in line with the International Classification of Diseases, 10th revision (ICD-10) 4. Despite extensive standardization efforts, the reliability of nosological coding of causes of death is often a cause for concern 3,5,6.
Apart from possible errors associated with the medical certification of cause of death, a number of factors influence the quality of nosological coding of causes of death. Some derive from ambiguities and inconsistencies of the ICD-10, but arguably most relate to differences in training and skills, and the interpretation of the ICD guidelines by coders 3,7,8,9. Studies of intercoder agreement show varying levels of agreement in relation to code classification level, the number of causes of death recorded on the death certificate, and the age of the deceased 3,7,10.
Chile has made significant efforts in recent decades to improve the quality of its vital records 11 and the international comparability of its cause-of-death statistics, but there is currently no information available on the quality of the recent coding process in the country. The aim of this study was therefore to quantify the level of international agreement between a coder from Chile and a coder from the United States using a representative sample of death certificates issued in 2008 in the Valparaíso and Metropolitan regions, Chile.

Method
As part of a study on mortality due to heart failure, a stratified random sample was drawn from death certificates of people aged 60 years and over issued in the Valparaíso and Metropolitan regions in 2008 available in the Department of Health Statistics and Information (DEIS, acronym in Spanish) of the Chilean Ministry of Health (MINSAL, acronym in Spanish). These regions were selected because they account for approximately 50% of all annual deaths in the country.
Given the low prevalence of heart failure as an underlying cause of death (approximately 2% per year), the sample universe was restricted to the group of ICD-10 codes which corresponds to pathologies frequently associated with heart failure (see codes below) and which also accounts for 47% of the ICD-10 codes for cardiovascular disease (codes I00 to I52). This randomly selected study sample was stratified by sex, age (60 to 69 years, 70 to 79 years, 80 years and over) and place of death (hospital, home). Based on the age distribution of deaths due to heart failure 12, the death certificates of individuals aged under 60 years were excluded from the analysis. Deaths due to external causes and those that occurred in a place of death listed as "other" (prisons, nursing homes, lodgings) were also excluded given the greater probability of misclassifications in such cases that could lead to selection bias. Given the significant variation in prevalence of heart failure by age 12, a stratified random sampling design using Neyman optimal allocation was chosen, with similar sampling fractions that accurately reflect the prevalence of heart failure as the underlying cause of death in each age-sex stratum. Based on the total number of deaths (31,112) in the Valparaíso and Metropolitan regions, a prevalence of heart failure as the underlying cause of death of 6.5%, and a variance of 0.0195, the minimum sample size required to attain the precision target was 437 death certificates. The actual study sample for this analysis was made up of 515 certificates (α = 0.05; Z = 1.96; β = 80%; number of strata = 12).
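To make the allocation step concrete, the following is a minimal sketch of Neyman optimal allocation for a binary outcome, in which each stratum receives a share of the sample proportional to its size multiplied by its within-stratum standard deviation. The function name, the stratum labels and all figures in the example are hypothetical and are not the study's data; the sketch does not attempt to reproduce the reported sample size of 437.

```python
import math


def neyman_allocation(total_sample_size, strata):
    """Split a total sample size across strata using Neyman allocation.

    strata maps a stratum label to (population_size, prevalence).
    For a binary outcome the within-stratum standard deviation is
    sqrt(p * (1 - p)), so each stratum receives a share proportional
    to N_h * sqrt(p_h * (1 - p_h)).
    """
    if not strata:
        raise ValueError("at least one stratum is required")
    weights = {}
    for label, (population_size, prevalence) in strata.items():
        if not 0.0 < prevalence < 1.0:
            raise ValueError("prevalence must lie strictly between 0 and 1")
        weights[label] = population_size * math.sqrt(prevalence * (1.0 - prevalence))
    weight_sum = sum(weights.values())
    # Allocations are left unrounded; rounding rules are a separate design choice.
    return {label: total_sample_size * w / weight_sum for label, w in weights.items()}


# Hypothetical strata (NOT the study's figures): label -> (deaths, prevalence)
example_strata = {
    "women_60_69": (2000, 0.03),
    "women_80_plus": (6000, 0.09),
}
allocation = neyman_allocation(437, example_strata)
```

In this illustration the larger stratum with the higher prevalence receives the larger share of the sample, which is the behaviour the design relies on when prevalence varies by age.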
Chile's mortality database is maintained jointly by the National Statistics Institute, the Civil and Identification Registry and the DEIS 11. These institutions collect and verify the information recorded on death certificates, which is standardized according to World Health Organization (WHO) 13 recommendations into two parts: Part I (Lines A, B and C), which records the sequence of diseases that led directly to death; and Part II, which includes all other morbid conditions that may have contributed to the death, but are unrelated to the underlying cause 4,13.
Based on the information recorded on the death certificate, the DEIS staff manually select and code the underlying cause of death based on the ICD-10 guidelines, translating the diagnostic terms recorded on the death certificate from words into alphanumeric code. The ICD code consists of a four-character string (with a letter in the first position and numbers in the second, third and fourth positions 10) and 21 chapters associated with the first character of the code (e.g., I: diseases of the cardiovascular system). The chapters are subdivided into homogeneous "blocks" of three-character categories (e.g., I50) that may represent a singular condition or group of diseases which have common features. Most blocks of three-character categories are subdivided by means of a fourth, numeric character after a decimal point (e.g., I50.9) 4. According to WHO recommendations, this fourth character is not required internationally 4. Therefore, a condition may be coded at the four-character, three-character or chapter (one-character) level 9.
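The three levels of specification described above can be illustrated with a short sketch that reduces a code to its chapter (first character), three-character and four-character forms. The function name `icd10_levels` and the dictionary keys are illustrative assumptions; the sketch follows the simplified structure given in the text and does not validate codes against the ICD-10 itself.

```python
def icd10_levels(code):
    """Return the chapter, three-character and four-character forms of a code.

    Follows the structure described in the text: a letter, two digits,
    and an optional fourth numeric character after a decimal point.
    """
    normalized = code.strip().upper().replace(".", "")
    if len(normalized) < 3:
        raise ValueError("an ICD-10 code needs at least three characters")
    three_character = normalized[:3]
    # Codes recorded without a fourth character stay at three characters.
    if len(normalized) > 3:
        four_character = three_character + "." + normalized[3]
    else:
        four_character = three_character
    return {
        "chapter": normalized[0],
        "three_character": three_character,
        "four_character": four_character,
    }


levels = icd10_levels("i50.9")
```

Two codes such as I50.0 and I50.9 differ at the four-character level in this scheme but coincide at the three-character and chapter levels.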
In accordance with the ICD-10 coding rules, each cause of death recorded on the death certificates was independently translated into ICD-10 codes by a coder from the DEIS and a certified nosological coding instructor from the United States. The analysis considered the frequency of non-translation of causes of death into ICD codes by both coders. Demographic information about the deceased (age, sex, marital status, education level, region of residence, place of death) and the type of medical certifier (physician, pathologist, other doctors) was recorded.
The level of intercoder agreement (%) at the four-character, three-character and chapter (one-character) level was assessed. Cohen's kappa coefficient and corresponding 95% confidence intervals (95%CI) were calculated. The chi-square test of independence was used to test whether differences reached nominal levels of statistical significance at p = 0.05. The data were analysed using SAS 9.2 software (SAS Inst., Cary, USA).
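As an illustration of the two agreement measures, the sketch below computes percentage agreement and Cohen's kappa for two coders' code lists, with an approximate 95% confidence interval. The study's analysis was run in SAS; this Python function, its name, and the simple large-sample standard error sqrt(po(1 - po) / (n(1 - pe)^2)) are assumptions made for illustration and may differ from the interval SAS reports. The example codes are invented.

```python
import math
from collections import Counter


def agreement_and_kappa(codes_a, codes_b, z=1.96):
    """Return (observed agreement, kappa, (ci_low, ci_high)) for two coders."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of items")
    n = len(codes_a)
    # Observed agreement: share of items given the identical code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of the coders' marginal frequencies per code.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    # Simple large-sample approximation of the standard error (an assumption).
    se = math.sqrt(observed * (1.0 - observed) / (n * (1.0 - expected) ** 2))
    return observed, kappa, (kappa - z * se, kappa + z * se)


# Invented three-character codes for four causes of death (not study data).
coder_chile = ["I50", "I50", "I21", "I21"]
coder_usa = ["I50", "I21", "I21", "I21"]
observed, kappa, interval = agreement_and_kappa(coder_chile, coder_usa)
```

Running the same function on codes truncated to the four-character, three-character and chapter level gives one agreement figure and one kappa per level of specification.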
The study was approved by the Human Research Ethics Committee of the Faculty of Medicine at the University of Chile.

Results
The underlying causes of death recorded within the group of ICD-10 codes assessed by this study accounted for 21% of the total deaths recorded at regional level. The study sample comprised 515 death certificates (women = 59.2%; men = 40.8%; age = 81 ± 9 years) (Table 1), which registered a total of 1,725 causes of death. The number of causes of death per death certificate ranged from one (5.8%) to eight (0.4%), and was higher among women (60.5%), the 80 to 105 years age group (55%), and on Line A (99.8%). Of the 1,725 causes of death, 97.2% were translated into ICD-10 codes by both coders, 2.3% were coded by only one coder, and 0.6% were not coded. Intercoder agreement was assessed in a total of 1,715 causes of death.
The level of intercoder agreement with respect to the underlying cause of death at the four-character, three-character and chapter (one-character) level was 80.6%, 86.6% and 94.1%, respectively, and was inversely related to the code classification level (Table 3). Differences were also observed according to the number of causes of death per death certificate at the four-character (p < 0.01) and three-character (p < 0.01) levels. The lowest level of agreement (73.7%) occurred at the four-character level in certificates with four or more causes of death, while the highest level of agreement (96.7%) was observed at the chapter level in certificates with a single cause of death. In general, the level of agreement decreased as the code classification became more specific, from 94.2% at the chapter level to 80.6% at the four-character level. No difference in level of agreement was found across age categories at any of the code classification levels. Cohen's kappa coefficients for coding of the underlying cause of death at the four-character, three-character and chapter level were 0.77 (95%CI: 0.75-0.79), 0.84 (95%CI: 0.82-0.86) and 0.76 (95%CI: 0.73-0.79), respectively.

Total causes of death
The level of intercoder agreement for the total causes of death coded by at least one of the coders at the four-character, three-character and chapter level was 76.4%, 81.7% and 86%, respectively (Table 4), with differences depending on the line of the death certificate (p < 0.01). The lowest level of agreement across all code classification levels was in Line A (53.3%, 63.1% and 67.25%, respectively) and the highest level of agreement was in Part II of the death certificate (89.3%, 91.4% and 95.9%, respectively). Cohen's kappa coefficients for the nosological coding of the total causes of death at the four-character, three-character and chapter level were 0.75 (95%CI: 0.75-0.77), 0.81 (95%CI: 0.80-0.82) and 0.76 (95%CI: 0.74-0.79), respectively.

Discussion
The results show that the level of agreement in cause of death coding between the Chilean coder and the coding instructor from the United States was adequate. With respect to underlying cause of death, intercoder agreement was inversely related to the number of causes of death per certificate and to the code classification level (four-character, three-character and chapter level), while in the case of total causes of death, agreement varied according to code classification level and line of the death certificate. The Cohen's kappa coefficient values indicated a strength of agreement considered "moderate" to "almost perfect" 14.
Table 5 shows the level of agreement between expert coders and regular coders found by calibration studies of nosological coding based on automated systems 10,15,16. Our findings are similar to those with an agreement level above 80% across all code classification levels. Based on an earlier version of the ICD, Curb et al. 5 reported a level of agreement of 97.1% for the group relating to cardiovascular causes of death (ICD-8: 390-458, 746), which is slightly higher than our findings. Harteloh et al. 9 reported lower levels of agreement for the codes ICD-10: I00-I99 relating to cardiovascular diseases at the four-character, three-character and chapter level (74.6%, 78.2% and 91.5%, respectively). These differences in level of agreement could be explained by the higher number of coders assessed by these studies, which tends to lead to an increase in intercoder variability: Harteloh et al. 9 assessed four coders, while Curb et al. 5 analysed three coders. Some studies have reported varying levels of intercoder agreement related to age, number of causes of death per death certificate, and code classification level 9,10. Our results show variations only in relation to the last two variables, ranging from 73.7% in death certificates with four or more causes of death, to 86.7% in those with only one cause of death at the four-character level. A study of agreement between an expert coder and a regular coder conducted in Taiwan 10, which included deaths of all ages, reported agreement levels at the three-character and two-character level of 80.9% and 83.9%, respectively, and showed that level of agreement was inversely related to the age of the deceased and the number of reported causes of death.
Considering the complexity of nosological coding of cause of death, the WHO has developed rules to facilitate and standardize the classification process and make it more informative 17,18. However, some critics have pointed out a number of ambiguities and inconsistencies in these rules 8,16,19,20 and concerns about the quality of the coding process persist 5,6 given the observed influence of factors such as the coder's skill level and the level and continuity of coder training. The literature also suggests that differences in interpretation of coding rules result in discrepancies in intercoder agreement 5,7,10.
Although this investigation assessed coders from different countries, the level of agreement observed by this study was high, indicating comparable interpretation of the ICD-10 rules, in contrast to the findings of the European Community sponsored EURODIAB study 16, which reported considerable differences between the interpretations of ICD rules across the nine countries included in the study. Another study which assessed nosological coding of underlying cause of death by a member of a research team and local coders showed a level of agreement of 55% at the three-character level 15, which increased to 67% when the analysis was restricted to large groups of diseases. The authors attributed this high level of disagreement to a tendency by local coders to code the immediate cause of death instead of the underlying cause 15. As stated by Winkler et al. 3, it could be argued that the ICD-10 has not achieved the desired improvements with respect to the reliability of the coding process. These authors compared the low levels of agreement with the findings of previous research 21 which reported levels of agreement at the three and four-character level of 67.7% and 61.5%, respectively. Winkler et al. 3 reported levels of agreement at the three-character and four-character level of 56% and 46%, respectively, and a Cohen's kappa coefficient of 0.69 (95%CI: 0.63-0.73) at the chapter level. One of the problems reported by this group was the limited information available on the death certificates, which adversely impacts the quality of nosological coding. Another problem concerns the quality of information on the death certificate. Ill-defined causes of death, modes of death or improper causal sequences, for example 2,8,18, negatively affect the coding process, since it involves the post hoc interpretation of a sequence of events that characterize the fatal outcome, which often involves subjective judgments. The level of intercoder agreement was lower for Line A of the death certificate (53.3%) than for the other two lines and Part II, where level of agreement was over 80%. This difference was attributable to three pairs of discordant codes: (J96.9 - R06.8), (I50.0 - I50.9) and (R99 - I46.9). In this respect, although ICD guidelines emphasise that the mechanism of death does not constitute a cause of death 20, physicians often record the former on Line A 17. Further, nosological coding relies on the coder's interpretation of the ICD rules to resolve such errors. In one of the pairs of discordant codes mentioned above (I50.0 - I50.9), disagreement occurs at the fourth-character level. According to Surján et al. 19, a poor level of agreement at the fourth-character level is considered less severe than in the whole code. Since both codes have high-order semantic relations, this discrepancy should not have a major effect on cause-of-death statistics. Our findings corroborate the results of studies that show that ambiguities at the four-character level are particularly common in coding deaths due to diabetes mellitus 9,10,16. The other two pairs of discordant codes (R99 - I46.9 and J96.9 - R06.8) are not semantically related and therefore affect agreement as well as the quality of mortality data. It should be noted that the examples described here reflect errors in certification of causes of death involving the use of modes of death instead of independent nosological entities.
One of the limitations of this study is that it only evaluated agreement between two coders, as opposed to agreement among several coders, or agreement between repeated coding of the same material, precluding an examination of intra- and intercoder variability. A possible constraint to the generalizability of our results was, in contrast with other countries, the lack of an automated coding system, which meant that agreement was assessed based on manual coding. Given the impact of nosological coding on mortality statistics, and therefore on international comparisons, we chose coders from different countries in order to provide for a wider interpretation of the ICD-10 coding rules. To our knowledge, this is the first study of this kind in Chile.
Furthermore, the fact that this study was restricted to a particular group of ICD-10 codes could also affect the generalizability of our findings. However, it is important to note that the leading cause of death in Chile is cardiovascular disease and, as mentioned in the Method section, the codes included in our study account for almost half of this large group of diseases and around a quarter of the total deaths recorded in the Valparaíso and Metropolitan regions in 2008.
Despite these limitations, our findings provide important information that was previously unavailable in Chile and scarce in other Latin American countries. They also raise a number of important new questions concerning the assessment of intra- and intercoder agreement among coders in Chile and coders of other Latin American countries, and the assessment of agreement in the ICD-10 coding process relating to deceased persons of all ages. Health policies rely in large part on vital statistics 22. Reliable, valid and timely information is one of the keys to improving health care and epidemiological research 23. A greater understanding of the quality of the nosological coding process contributes to the accurate interpretation and comparison of mortality statistics 9.
Cause-of-death coding procedures should be modernized, since system automation has a positive impact on the quality of vital statistics 24,25 because it reduces human intervention. However, the efficiency and accuracy of automated systems also depend on the quality of medical death certification, as reflected in the need to manually code between 15% and 20% of all death certificates, which in turn results in variability in the coding process 9,24,25. Thus, improving training in medical certification of cause of death is critical to the integrity of vital statistics.
In conclusion, our results show that the level of agreement in coding causes of death due to cardiovascular disease in a representative sample of death certificates from two regions in Chile was adequate and that level of agreement was inversely related to the number of causes of death per death certificate and ICD-10 code classification level.
Mortality; Causes of Death; Clinical Coding; Reproducibility of Results

Contributors
C. Antini collaborated in the research conception and design; data analysis and interpretation; statistical analysis; drafting of the manuscript; supervision; critical revision and final approval of the manuscript. D. Rajs collaborated in the research conception and design; data collection; critical revision of the manuscript for important intellectual content. M. T. Muñoz-Quezada participated in the research design; data analysis and interpretation; drafting of the manuscript. B. A. L. Mondaca participated in the statistical analysis; drafting of the manuscript. G. Heiss collaborated in the research conception and design; data analysis and interpretation; critical revision of the manuscript for important intellectual content.
95%CI: 95% confidence interval; ICD-10: International Classification of Diseases, 10th revision. Differences in the line of the death certificate: p-value < 0.01. * Number of death certificates randomly sampled from death certificates issued in the Valparaíso and Metropolitan regions in 2008. Deaths attributed to external causes and where place of death was categorized as "others" were excluded from the study sample; ** Number of causes of death with intercoder agreement based on the line of the death certificate; *** Number of causes of death coded in the study sample by at least one of the coders based on the line of the death certificate. Uncoded causes of death were removed from the denominator; # Percentage agreement.

Table 1
Number of deaths recorded in Valparaíso and Metropolitan regions *, Chile, during 2008 and demographic characteristics of the study sample.
Source: Department of Health Statistics and Information, Chilean Ministry of Health (MINSAL). * Deaths tabulated on the Chilean mortality database. Deaths attributed to external causes and where place of death was categorized as "others" were excluded from the study sample; ** Number of deaths in the Valparaíso and Metropolitan regions; *** Number of deaths in the study sample; # Column percentage; ## Row percentage.

Table 2
Underlying causes of death among the study sample and level of intercoder agreement (%) by ICD-10 code classification level. Valparaíso and Metropolitan regions, Chile, 2008.
* Number of death certificates randomly sampled from death certificates issued in the Valparaíso and Metropolitan regions in 2008. Deaths attributed to external causes and where place of death was categorized as "others" were excluded from the study sample.

Table 3
Underlying cause of death: level of intercoder agreement (%) by ICD-10 code classification level, by age and sex of deceased and number of reported causes of death per death certificate. Valparaíso and Metropolitan regions, Chile, 2008 (n = 515).

Table 4
Level of intercoder agreement (%) by ICD-10 code classification level based on the line of the death certificate and total causes of death coded in the study sample *. Valparaíso and Metropolitan regions, Chile, 2008 (n ** = 1,715).