Improving the quality of external cause of death data in Brazil: evaluation and validation of a new form to investigate garbage codes

Garbage codes, such as external causes with no specific information, indicate poor quality cause of death data. Investigation of garbage codes via an effective instrument is necessary to convert them into useful data for public health. This study analyzed the performance and suitability of the new investigation of deaths from external causes (IDEC) form to improve the quality of external cause of death data in Brazil. The performance of the IDEC form on 133 deaths with external-cause garbage codes was compared with a stratified matched sample of 992 (16%) investigated deaths that used the standard garbage codes form. Consistency between these two groups was checked. The percentage of garbage codes from external causes reclassified into valid causes was analyzed with a 95% confidence interval (95%CI). Reclassification into specific causes was described. Qualitative data on the feasibility of the form were recorded by field investigators. Investigation using the new form reduced all external garbage codes by -92.5% (95%CI: -97.0; -88.0), whereas the existing form decreased garbage codes by -60.5% (95%CI: -63.5; -57.4). The IDEC form was more effective for external-cause garbage codes of determined intent. Deaths that remained garbage codes mainly lacked information about the circumstances of poisoning and/or vehicle accidents. Although field investigators considered the IDEC form feasible, they suggested modifications for further improvement. The new form was more effective than the current standard form in improving the quality of defined external causes.


Introduction
Reliable information on causes of death can provide useful epidemiological evidence to support decision-making and inform public policies 1. In Brazil, such data can help address the wide disparities in mortality among states and sociodemographic groups 2,3. External causes (injuries) are a major public health concern in Brazil, where they account for a higher percentage of deaths (12%) than in other countries 4.
The main source of cause of death data in Brazil is the Mortality Information System (SIM), established in 1975. Brazil recorded more than 1.31 million deaths in 2017, accounting for over 96% of all deaths; of these deaths, 36.5% were garbage codes 2,3,5. Garbage codes are causes of death that are poor in quality and uninformative for public health policy, for example "undetermined intent". Garbage codes are a classic indicator of the quality of health information systems 6,7,8, and have been used as an indicator of the level of comprehensiveness and specificity of information on death certificates 9. Often, records do not specify a valid cause of death, requiring in-field collection of supplementary information 10. Such supplementary information is difficult to obtain or nonexistent, but algorithms exist to redistribute garbage codes to other specific causes of death 11. Brazil shows important regional differences in the coverage of death registration and in the quality of cause of death data, with less favorable situations in the poorest regions, such as the North and Northeast 12,13. However, improvements in the coverage of death reporting and in the quality of cause of death data over the years have reduced these differences 13,14,15.
In 2017, injuries were the cause of 158,658 deaths in Brazil. For over one-quarter of these deaths, the original cause was a garbage code 16. In recent decades, Brazil has made efforts to improve the quality of data in SIM, including investigation of garbage codes 17,18,19,20,21. As a result of investigation by municipal health departments, garbage codes from injury deaths have been halved to 13% 16. Notably, 81% of injury garbage codes were issued by forensic institutes, indicating inadequacies in issuing death certificates that coders can understand and, therefore, code correctly; this was found to be particularly true in small- and medium-sized municipalities 4,10,22.
In Brazil, there has been a long-standing process to investigate garbage codes, specifically ill-defined causes and unspecified external causes. These investigation procedures were better structured nationally around the mid-2000s through a standardized procedure and form 12,15,17,18. Investigation of deaths by injury comprises collecting information about the circumstances of the event at forensic institutes and hospitals, in addition to collecting information from other sources, such as police stations, toxicology units, and the public emergency transport service. The investigation is conducted according to the exact type of accident or violence that produced (i.e., caused) the injuries leading to death 10.
In 2017, Brazil, with support from the Data for Health Initiative, implemented actions to improve the diagnosis of cause of death in cooperation with death surveillance teams in 60 municipalities 5,21. The standard form, used by the mortality surveillance program to investigate the causes of garbage-coded deaths in hospitals 23, was proposed and tested in 2017 to collect information from medical records in hospitals to assign the underlying cause of death (Supplementary Material 1: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl1-0972-22_8224.pdf) 21. However, this form has too few questions about the circumstances of injury deaths to accurately assign the underlying cause of death. Moreover, it lacks questions that could help determine the aggressor's intent and the means used (e.g., firearm homicide or self-inflicted).
One of the actions developed was to investigate external garbage codes from different sources of information, including forensic institutes and hospitals, using a standard investigation form (investigation of deaths from external causes, IDEC) 16. Previously, external-cause garbage codes had been investigated with the same form as all other garbage codes. The IDEC form, developed by the mortality surveillance team of the Brazilian Ministry of Health, with the support of the Graduate Program in Public Health of the University of Brasília and the Federal University of Minas Gerais, was devised to be prospectively used across the country for the investigation of external-cause garbage codes (Supplementary Material 2: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl2-0972-22_5497.pdf). This form differs from the previous one in that it introduces questions and variables that allow for the collection of detailed data about the circumstances of death from external causes (Supplementary Material 1: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl1-0972-22_8224.pdf; Supplementary Material 2: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl2-0972-22_5497.pdf).
Cad. Saúde Pública 2023; 39(3):e00097222
The use of a specific form could improve the quality of external cause of death data in Brazil. This study aimed to analyze the performance and suitability of the new IDEC form to improve the quality of external cause of death data in Brazil. Therefore, we intend to verify whether this new form, designed to recapture data on the circumstances of the external cause of death, provides sufficient information for the reclassification of garbage codes into valid cause of death codes. Moreover, we assess whether a form with specific questions for this type of cause reclassifies more garbage codes into valid codes than the standard procedure.

Methods
An observational-analytical study of investigations on garbage-coded deaths due to injuries was performed using the newly developed IDEC form.
The six capitals reported 17,514 deaths from external causes, with 44% (n = 7,731) classified as garbage codes. As part of the garbage codes investigation, 85% (n = 6,606) were investigated: 6,382 using the current standard procedure and 224 (212 external causes and 12 cases suspected to be violent) with the IDEC form. These 224 cases correspond to 20% of the 1,125 garbage codes that were not investigated by the current standard procedure.
A total of 60 of the 224 deaths were discarded for not reporting an underlying cause on the death certificate in SIM. Since the field investigator did not report the type of garbage code before the investigation, it was impossible to compare and reclassify these causes before and after the investigation, so the planned comparisons could not be made. However, it is important to note that the investigation allowed classifying 83% (n = 50) of these discarded deaths into a valid code. The remaining 164 deaths were investigated using the new IDEC form. Of these deaths, 133 were exclusively due to external causes with garbage codes.
The matched random sample of deaths that used the standard procedure made up a third group of 992 cases (Supplementary Material 4: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl4-0972-22_1248.pdf; Supplementary Material 5: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl5-0972-22_3246.pdf). The 133 deaths from external causes with garbage codes were considered as a reference for sample calculation since the standard procedure for investigating deaths does not provide defined causes (valid codes) for the investigation. The sample size with the current standard procedure was defined as the largest possible that matched the proportional distribution by cause, region, sex, and age of the 133 deaths investigated with the IDEC form. The matched sample corresponded to 16%, i.e., 992 of 6,196 (6,382 - 186).
Using the IDEC form, each team retrieved data at notifying units, such as forensic institutes and hospitals, and teams were usually coordinated by physicians or nurses with experience in improving the diagnosis of cause of death in the Data for Health Initiative. After the investigation, the underlying cause of death was coded in each city and was reviewed by a senior coder at the national level, with extensive experience in external causes. The investigation using the current standard procedure and the IDEC form was carried out by the same team during the same period, after training in the field protocol; fieldwork supervision was carried out during the same period.
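The case counts reported for the six capitals can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, the numbers are those given in the text, and the value 186 is taken as reported in the sample description, without assuming why those deaths were set aside.

```python
# Cross-check of the case counts reported for the six capitals (SIM 2017).
# Variable names are illustrative assumptions; the numbers come from the text.
external_deaths = 17_514
garbage_codes = 7_731
investigated = 6_606
standard_procedure = 6_382
idec_cases = 224
discarded = 60
discarded_with_valid_code = 50
matched_sample = 992
excluded_from_pool = 186  # reported in the sample description; reason not stated


def pct(part, whole):
    """Percentage rounded to the nearest integer, as reported in the text."""
    return round(100 * part / whole)


not_investigated = garbage_codes - investigated          # garbage codes left uninvestigated
idec_analyzed = idec_cases - discarded                   # deaths kept for the IDEC analysis
sampling_pool = standard_procedure - excluded_from_pool  # pool for the matched sample

print("garbage codes among external deaths (%):", pct(garbage_codes, external_deaths))
print("garbage codes investigated (%):", pct(investigated, garbage_codes))
print("IDEC cases vs. non-investigated codes (%):", pct(idec_cases, not_investigated))
print("discarded deaths given a valid code (%):", pct(discarded_with_valid_code, discarded))
print("matched sample as share of pool (%):", pct(matched_sample, sampling_pool))
```

Each printed percentage can be compared directly with the rounded figure quoted in the corresponding sentence.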
The reclassification of an original underlying cause of death by garbage codes to a better-qualified underlying-cause after investigation was conducted according to França et al. 25 .The results before and after investigation and between the IDEC form and the current standard procedure were compared defining two comparative groups obtained from SIM 2017: "total cases" and a "matched random sample".
The primary comparison was conducted between the IDEC form and the matched sample. Results from the total cases are shown in Supplementary Material 6 (http://cadernos.ensp.fiocruz.br/static//arquivo/suppl6-0972-22_6369.pdf). These data support comparative and critical analyses alongside the main results of the article. In addition, they allow greater access to the different data produced in the study.
To characterize the garbage codes, the following variables were analyzed: age groups (0-14, 15-24, 25-44, 45-54, 55-64, 65-74, 75 or more), sex (male, female), the original underlying cause of death before the investigation (undetermined intent: Y10-Y34; all other garbage codes, detailed in Supplementary Material 3: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl3-0972-22_6700.pdf), and region (Northeast, Southeast, South/Central-West). Also, the sufficiency of the collected data was verified by identifying necessary and unnecessary variables and presenting a synthesis of the feedback offered by field researchers after the investigation using the IDEC form. Researchers' comments recorded in a field diary were organized according to strengths, difficulties or aspects to be improved, and suggestions. This feedback from field investigators supported the reformulation of the new form (IDEC), as one of the final products of this work (Supplementary Material 7: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl7-0972-22_2894.pdf). Thus, this article shows the performance of the IDEC form to improve the recording of the external cause of death and the possible improvements on this form after the field test in the capitals of Brazil.
As previously described, the six capitals were part of the 60 cities of the Data for Health Initiative that investigated garbage codes, applying the standard form and investigation protocol. The same teams from this Initiative tested the new form with the deaths that were not investigated during that project. For this reason, we did not consider the proportion of investigated deaths by capital using the new form to calculate the sample.
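One possible reading of "the largest possible" matched sample is sketched below: the sample is scaled up until the scarcest stratum in the standard-procedure pool is exhausted, while keeping the stratum proportions of the reference group. This is a hypothetical illustration, not the procedure actually used in the study; the function name, the record fields, and the toy data are all assumptions.

```python
import math
import random
from collections import Counter
from fractions import Fraction


def largest_matched_sample(reference, pool, key, seed=0):
    """Draw the largest random sample from `pool` whose stratum counts are
    proportional to those in `reference` (hypothetical helper).

    reference, pool: lists of records; key: function mapping a record to its
    stratum, e.g. a (cause, region, sex, age group) tuple.
    """
    ref_counts = Counter(key(rec) for rec in reference)
    pool_by_stratum = {}
    for rec in pool:
        pool_by_stratum.setdefault(key(rec), []).append(rec)

    # Largest scale factor k such that k * ref_count fits in every stratum.
    # Fractions avoid floating-point error when flooring below.
    k = min(
        Fraction(len(pool_by_stratum.get(stratum, [])), count)
        for stratum, count in ref_counts.items()
    )

    rng = random.Random(seed)
    sample = []
    for stratum, count in ref_counts.items():
        take = math.floor(k * count)
        sample.extend(rng.sample(pool_by_stratum.get(stratum, []), take))
    return sample


# Toy data: 2:1 ratio of strata in the reference group.
toy_reference = [{"cause": "undetermined", "sex": "M"}] * 2 + [{"cause": "other", "sex": "F"}]
toy_pool = [{"cause": "undetermined", "sex": "M"} for _ in range(10)]
toy_pool += [{"cause": "other", "sex": "F"} for _ in range(3)]
toy_sample = largest_matched_sample(toy_reference, toy_pool, key=lambda r: (r["cause"], r["sex"]))
print(Counter(r["cause"] for r in toy_sample))
```

In the toy data, the "other" stratum limits the scale factor, so the sample keeps the 2:1 ratio of the reference group rather than using every available record.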
Statistical inference addressed the change from garbage codes to valid codes before and after the investigation, measured as a proportion with 95% confidence intervals (95%CI) based on the binomial distribution. The hypothesis of no difference (H0) between proportions was rejected when the confidence intervals did not overlap, assuming the alternative hypothesis (H1) that the proportions were significantly different in the comparisons. Furthermore, the ratio of the proportions of change was calculated, using cases with the IDEC form as the reference (numerator). Changes in the three groups were compared for total garbage codes, undetermined intent, and all other garbage codes. The interval estimation is given by the following equation:
$$p \pm z\sqrt{\frac{p(1-p)}{n}}$$
where: n is the sample size; p is the proportion; and z (standardized value) is equal to 1.96 for 95% confidence.
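The interval described here is consistent with the usual normal-approximation (Wald) interval for a binomial proportion. Assuming that form, the following sketch approximately reproduces the reported intervals and the ratio of proportions from the published proportions and sample sizes. Function names are illustrative, and small differences in the last decimal can arise because the inputs are already rounded.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Proportions of garbage codes reclassified into valid codes, as reported.
idec_ci = wald_ci(0.925, 133)      # IDEC form, n = 133
standard_ci = wald_ci(0.605, 992)  # standard procedure, matched sample, n = 992
ratio = 0.925 / 0.605              # IDEC form as the numerator

print("IDEC 95%CI:", [round(100 * x, 1) for x in idec_ci])
print("Standard 95%CI:", [round(100 * x, 1) for x in standard_ci])
print("Intervals overlap:", intervals_overlap(idec_ci, standard_ci))
print("Ratio of proportions:", round(ratio, 2))
```

Because the two intervals do not overlap, the decision rule stated above would reject H0 for the comparison of total garbage codes.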
This study was approved by the Research Ethics Committee of the Federal University of Minas Gerais (CAEE 75555317.0.0000.5149). Non-nominal secondary data for the current standard procedure were used, according to Resolution n. 510/2016 26, which provides for research standards.

Results
In this section, we describe our results: firstly, we show data on the performance of the new form/ IDEC (Tables 1, 2, 3 and 4) and summarize the field researchers' feedback on its suitability to recapture data (Box 1), which led to a final reformulated version of the tested IDEC form (Supplementary Material 7: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl7-0972-22_2894.pdf).
Table 1 shows the characteristics of the deaths investigated with the new form and of the matched sample investigated with the current standard procedure. In both groups, the investigated deaths occurred more often in older men living in the Brazilian Southeast, with undetermined causes. As designed, the IDEC form group and the matched sample showed similar characteristics. Deaths in the group aged 65 years or older made up 40.6% of the cases investigated with the IDEC form and 41% of those investigated with the standard procedure. Among the evaluated injury garbage codes, the most common cause of death was "undetermined intent", at a frequency of 54.1% in the IDEC form and 54.6% in the current standard procedure. Investigated causes across regions showed slight differences in the Southeast (74.4% in the IDEC and 71.9% in the standard procedure) and South/Central-West (7.5% in the IDEC and 10.2% in the standard procedure). The IDEC form also assessed deaths from natural causes considered suspect or likely to be injuries (7.3%) and valid injury codes (11%) to confirm the cause of death (Table 3; Supplementary Material 6: http://cadernos.ensp.fiocruz.br/static//arquivo/suppl6-0972-22_6369.pdf).
Table 2 shows the results of classifying garbage codes of external causes into valid codes. The IDEC form group consisted of 133 cases due to this delimitation in its data. Garbage codes evaluation with the IDEC form reduced injury garbage codes by -92.5% (95%CI: -97.0; -88.0), whereas the current standard procedure only reduced garbage codes by -60.5% (95%CI: -63.5; -57.4) (Table 2), a 1.53 times greater reduction. This ratio was lower for deaths of undetermined intent (i.e., 1.15) but higher (2.50) for all other injury garbage codes.

Table 1
Table 3 shows the reclassification of deaths after the IDEC form investigation. We consider all investigated causes: garbage codes for injuries and natural causes and valid injury codes (11%) to confirm causes of death in 164 cases. From undetermined injuries, 7% remained garbage codes; 56% became falls; 13%, self-harm; and 8%, interpersonal violence. In total, 7% of unspecified unintentional injuries remained as garbage codes; 33% became pedestrian road injuries; 30%, motorcyclist injuries; and 19%, falls. As expected, the IDEC reclassified almost all unspecified road injuries to pedestrian, cyclist, motorcyclist, or motor vehicle ones. It also gave specific causes to previously unspecified transport injuries, interpersonal violence, and unintentional injuries. The form recategorized 30% of the causes it investigated into falls; 33%, as specific road injuries; and 15%, as specified interpersonal violence.
Table 4 shows how the current standard procedure reclassified deaths initially categorized as injury garbage codes after investigating its matched sample. Of the most common causes of undetermined injuries, 20.3% remained garbage codes. The percentage rises for most less common causes: 70.3% of unspecified transport injuries, 53.2% of unspecified unintentional injuries, 83.5% of unspecified interpersonal violence, and 100% of other garbage codes injuries. The standard procedure reclassified 23.8% of cases with specific causes into falls; 9.5%, as interpersonal violence; and 16.2%, as road injuries (much lower than the 32% via IDEC form). IDEC attributed 94% of the evaluated garbage codes to valid causes, whereas the standard procedure attributed only 60.5%.
Box 1 summarizes the feedback field researchers offered after using the IDEC form. They provided feedback after testing it on information collected in hospitals and forensic institutes. Supplementary Material 2 (http://cadernos.ensp.fiocruz.br/static//arquivo/suppl2-0972-22_5497.pdf) and Supplementary Material 7 (http://cadernos.ensp.fiocruz.br/static//arquivo/suppl7-0972-22_2894.pdf) show the IDEC form before and after the modifications we implemented based on testing and feedback. The advantages researchers mentioned included the standardization of the instrument to investigate deaths due to external causes and its improved potential for retrieving the true cause of death. However, they found that the longer IDEC form takes 30 minutes to complete, on average, a clear disadvantage.
Based on the feedback from field researchers, several modifications were made to the form to make it more intuitive, simple, objective, and clear (Box 1). One was standardizing the terminology for the variables to be consistent with other forms, such as the death certificate and violence notification form. We also removed three redundant items/questions. The sequence is important since we designed the form for use in different settings. Therefore, we grouped questions in a more relevant and logical manner for the several involved institutions: police data, followed by data from hospitals and forensic institutes. Finally, we included additional items, including complementary information on violence and accidents to facilitate specifying victims, means, other parties in transport accidents, etc.
We found that the investigations did not always define the circumstances of death because information remained insufficient after data collection from relevant sources, for example, the absence of data from police investigations or the non-registration of the circumstances of death in hospital records. All 11 garbage-coded deaths that could not be classified into valid causes lacked details in hospital records and/or police reports.

Table 4
Reclassification of deaths before and after an investigation in the matched sample using current standard procedure. Six Brazilian capitals, 2017 *.
* This table includes all cases that were investigated during form testing in the municipalities: injuries by garbage code, no injuries (including ill-defined causes), and some valid external cause codes that needed to be confirmed.
The form has few lines to write a detailed account of the event and it is not very hospital specific.

Inclusion of variable
To collect information from the media and/or social networks, which are potential sources of information; Occupation; Medical record; Author of violence should include agent/police; Modifying "civil police" to "police" is more comprehensive.

Discussion
As far as we know, this is the first Brazilian study comparing the effectiveness of a new form to investigate garbage codes for external causes. The recapture of information on the circumstances of deaths from external causes with the new IDEC form reduced garbage codes far more than the current standard procedure; only 7.5% of IDEC form deaths remained as garbage codes, compared to 34% in the matched sample. IDEC reclassified deaths from undetermined intent into valid codes twice as effectively as other external garbage codes, convincingly reclassifying garbage codes into specific categories such as falls, road injuries, and interpersonal violence.
The characteristics of the matched sample under the current standard procedure generally resembled those of the 133 cases investigated via IDEC, differing only slightly in regional distribution. Its characteristics agreed with previous studies and showed a higher frequency of death in men, both younger and older, and a higher frequency of deaths from injuries occurring in hospitals than other deaths 16,20,27.
IDEC reduced the number of garbage codes by more than 90%. Results may vary with localities, information sources, and types of garbage codes. Regional differences in police investigation, hospital service availability, and forensic institute service quality may affect investigations. Some studies have reduced garbage codes for external causes by between 39% and 83%, typically using data from forensic institutes as their central source 28,29,30. Previous research has shown better results for unspecified accidents 28,29 and undetermined intent 30. A multisource study has recently reduced garbage codes with undetermined intent by 84% and reclassified 11% of undetermined natural causes to external causes, pointing to a greater contribution of police, press, and forensic institute data 16,27. Interestingly, one study managed to categorize 67% of undetermined intent deaths by relying on newspaper reports alone 31.
Our reclassification of external causes with garbage codes obtained findings similar to previous results based on national data, including the transition of deaths between groups of external and natural causes (as did we) and the reclassification of deaths from accidents to other valid codes, which, in this study, migrated to accidents on highways 10.
We reclassified most deaths of undetermined intent into falls in both comparison groups, which may have been partially influenced by the age and gender profile of the investigated garbage codes. The greater proportional weight of females and older adults in the IDEC form and sample group, compared to the total evaluated deaths across municipalities, contributed to the form effectively reclassifying garbage codes. Previous studies have shown increased reclassification of undetermined intent into accidents, especially falls 28,29, whereas more recent studies have shown a greater reclassification of undetermined intent into homicides 16,27,30,31. Studies have indicated that using multiple sources of information improves garbage codes reclassification results 27,30, although this may be insufficient to assign a specific cause of death 20. Thus, we observed that certain sources contribute more to the identification of a particular cause of death category, such as the police investigating homicides or newspaper reports on recent traffic accidents 10,28,31.
After more than 40 years of operation, the SIM shows difficulties in the current scenario of modernization and decentralization in a relatively large country such as Brazil 32. Challenges may be greater in areas with poorer access to public services, especially small and medium-sized municipalities in the countryside or in rural areas 10,33. These challenges worsen due to poor updating of death records for external causes, especially due to poor agreement on causes of death between forensic institutes and the Health Department 22,34. Suboptimal death certificates may partly stem from forensic physicians' inclination to disregard records related to hospitalization. A recent study found that the quality of causes may show movements and discontinuities over time 35. The literature in other countries found certain similarities related to some types of garbage codes. For example, more than two-thirds of the deceased classified as having suffered from exposure to an unspecified factor (code X59 from the 10th revision of the International Classification of Diseases, ICD-10) were over the age of 65 years, and more than half of them had femur fractures 9.
The IDEC form usually collected enough data to improve the quality of external cause of death data in hospitals and forensic institutes. However, after the investigations, we had to rearrange the logical sequence of its questions to make the form more intuitive and useful for several services. Causes that remained as garbage codes due to absent information on circumstances of death usually resulted from unavailable police investigations. Greater reclassification largely stemmed from poisoning (especially by cocaine) and unspecified vehicle accidents, in which reports attributed the site of the violent act or accident to private residences or public roads without any witness. A study observed that events occurring in homes are more likely to be classified with some type of garbage code 20. The difficulty of correctly diagnosing external causes often depends on additional information supporting coroners' work, such as police investigations, since the lack of this information at the time of death registration may cause misclassification 36.
Such questions reinforce the hypothesis that the difficulty in properly diagnosing external causes extends beyond the contextual and structural issues of service and medical training that previous studies have reported, noting, as key issues, the non-use of instructional materials, certifiers attributing them low value, and the impossibility of describing chains of events 22,37,38,39. Studies suggest that the conflicting needs and working styles of the legal structure/police and health epidemiology sectors contribute to the challenges faced in death investigations 16,27. This dual role not only affects the Brazilian medicolegal death investigation system but also influences the United States, in which death investigations also carry significant societal importance for criminal justice and public health. The autonomy of federated entities can also play a role in creating multiple realities and highly varied state and local death investigation systems 40.
Despite their common goal of protecting and guaranteeing rights 41, institutions operate with different priorities and working styles. For instance, public health prioritizes victims and associated risk factors, requiring a shorter time for prompt interventions 42,43,44,45. On the other hand, the legal/police sector, part of the public security and justice system, has a normative focus on the victim-perpetrator relationship, thus operating at a slower pace and timeframe, especially in cases of violent deaths with criminal implications 46.
Experiments have shown promising results in cause of death qualification, such as the use of an online death certification system to better record causes of death, especially when coupled with a training program 47. To address errors in the cause of death on death certificates, research has suggested using multiple causes of death, which can provide strong clues about the valid cause 19,48. Other countries have delayed filing the cause of death in death certificates for up to six days after death to incorporate test results and police investigations. This entails issuing a prior death certificate and delaying categorizing the circumstances of death until a full certificate can be issued following the conclusion of outstanding investigations 27.
This study has some limitations, especially due to its low number of cases, its geographical locations, and the variety of codes it investigated, which may have affected its results. Although testing was performed only in capitals, the selection of deaths to review allowed us to investigate some cases from the countryside of the states, which have distinctive epidemiological patterns.

Conclusions
Our new form for investigating deaths from external causes (IDEC form) reduced garbage codes more efficiently than the current procedure in the studied Brazilian locations, performing much better for unspecified garbage codes. It appropriately improved the quality of data on causes of death, although it still requires further adjustments to make it more intuitive and useful for collecting data from different sources. We suggest wider tests of this new form to assess how it reclassifies several types of garbage codes from external causes in different areas of Brazil. Research should also apply the IDEC form to untested sources of information, such as police stations, which is planned for the near future. The persistence of garbage codes in reports of causes of death requires the development of best practices for searching and recapturing information via standardized procedures and validated instruments to improve the quality of external cause of death data and produce useful statistics and evidence to formulate public health policies in Brazil.

Causes of Death; Certification; External Causes; Mortality Registries
It is suggested to revise the form to be simpler and more objective, facilitating the work of researchers; Some questions seem to repeat themselves or do not clearly state which source or data type they refer to.
Specific
Lacks note on the medical record about: possible aggression or neglect; gender, race, marital status, and degree of study (information little valued in injuries); Poisoning information is rarely found in forensic institutes and hospitals; Without access to information from the technical-scientific police expertise of the event scene in the municipality; The following variables are not included in the records of the forensic medical office or other health services: sexual orientation, gender identity, death motivation (e.g., racism, femicide, homophobia), and perpetrator of violence.
Field testing of the new form was carried out in the second half of 2018, with death records from external causes notified with garbage codes in the SIM in 2017 and 2018. The cities had teams, comprised of health service professionals with experience in mortality surveillance, within the municipal health departments to investigate deaths. Using the IDEC form, each team retrieved data at notifying units.
Frequency of deaths before investigation by age, sex, cause, and region. Six Brazilian state capitals, 2017.

Table 2
Change of classification of garbage codes, after investigation. Six Brazilian state capitals, 2017.

Table 3
Reclassification of deaths before and after investigation using IDEC form. Six Brazilian capitals, 2017 *.

Table columns: causes before investigation (rows); causes after investigation: Pedestrian, Cyclist, Motorcyclist, Motor vehicle, Unintentional injuries, Self-harm, Interpersonal violence, Other injuries, No injuries, Remaining garbage codes, Total.
Note: the garbage codes and valid codes for injuries were grouped according to the GBD 2015 study 23 and the International Classification of Diseases, 10th revision (ICD-10), presented in Supplementary Material 3 (http://cadernos.ensp.fiocruz.br/static//arquivo/suppl3-0972-22_6700.pdf).
* This table includes all cases that were investigated during form testing in the municipalities: injuries by garbage code, no injuries (including ill-defined causes), and some valid external cause codes that needed to be confirmed.
Feedback from field researchers after investigation with IDEC form.Six Brazilian state capitals, 2017.
IDEC: investigation of deaths from external causes.