Analysis of quality of information about deaths from neoplasms in Brazil between 2009 and 2019

Objective: To assess the quality of information about mortality from neoplasms within the Mortality Information System. Methods: Descriptive observational study evaluating the quality of the Mortality Information System, with an outcome referring to data on deaths from neoplasms between 2009 and 2019 in the Brazilian population (≥15 years). Quality of information (QI) was measured through coverage, specificity and completeness of data, at national and state level. Results: Coverage quality ranged from “good” to “excellent” at the national and state levels. Specificity was classified as inadequate mainly in the states of the North and Northeast regions. The proportion of ill-defined causes was classified as “poor” quality in most units of analysis throughout the series. Data completeness varied according to indicator: sex and age were “excellent” for the entire period and all units of analysis, educational level varied in quality across units and periods, and marital status and ethnicity/skin color improved in quality over the period. Conclusions: The quality of data on mortality from neoplasms in the Brazilian population (≥15 years) is mostly adequate, but there are important gaps to be filled, as the expansion of QI seeks to give visibility to the health condition of the Brazilian population and to propose public actions for its improvement.


INTRODUCTION
The worldwide trend in mortality from neoplasms makes this group of diseases an important public health problem, encouraging different nations to take measures to control their incidence and mortality 1 . In Brazil alone, there were 243,000 deaths from neoplasms in 2018, and deaths are expected to reach 432,000 by 2040 2 . Accordingly, the control of mortality from neoplasms falls within the goals of the United Nations (UN) 2030 Agenda, which include reducing premature mortality from non-communicable diseases by one third 3 . Control measures must vary geographically, given the socioeconomic conditions of different countries and regions that are related to the occurrence of the disease and its outcome, thus demanding specific actions depending on the extent of the problem 4 .
In order to understand the dimension of the problem and propose effective actions to control the disease and mortality, health authorities must receive accurate information on the number of incident cases and deaths, so the availability of quality information in different official government databases is essential [5][6][7] . In Brazil, the Ministry of Health (MS) tool that provides data on vital statistics is the Mortality Information System (SIM, acronym in Portuguese). The SIM is part of the Health Information System (SIS, acronym in Portuguese) and contributes to delineating the epidemiological profile of mortality in the country 8 . It is a precise source for capturing and recording information on deaths at national, regional and municipal levels 9 .

The importance of checking the quality of information (QI) in health is described by authors from different regions of the country and the world, with particular emphasis on developing countries, where the information that feeds national and global health databases is still considered poor, so expanded actions to improve this condition are needed 5,[10][11][12][13][14][15][16] . At the national level, verifying data quality matters because managers use SIS data to make decisions, so the information available must reflect reality for quality sectoral actions to be carried out, adding political value to health information 17,18 .
The Brazilian literature on QI of SIM data is usually limited to a specific group of underlying causes of death and a single geographic unit, which leaves a gap in the comprehensive geographical perspective on QI of SIM data and prevents comparison between different Brazilian regions. Therefore, the objective of this study was to evaluate the QI about mortality from neoplasms within the scope of SIM, according to national data and by Federative Unit (FU).

METHODS
This is a descriptive observational study, carried out with data from Brazil and its FUs for the population aged 15 years or older whose deaths from neoplasms occurred between 2009 and 2019. Data were obtained from SIM through a time series of records of deaths from neoplasms in general (C00-C97) by place of residence for each study unit.
QI of SIM was measured in the following dimensions: coverage 19 , specificity 20 and completeness 21,22 . Coverage was defined as the ratio between deaths recorded in SIM and deaths projected by the Brazilian Institute of Geography and Statistics (IBGE) 23 , multiplied by 100, and was verified by comparing the occurrence data registered in SIM and in IBGE via the civil registry between 2009 and 2019. Proportions below 80% were considered "regular", from 80 to 90% "good", and 91% or higher "excellent" 19 . Specificity was measured by two indicators: the proportion of unspecific causes (PUC) and the proportion of ill-defined causes (PIDC) in SIM. PUC refers to causes of mortality not falling under Chapter XVIII of the International Classification of Diseases (ICD-10), "Abnormal symptoms and signs, and clinical and laboratory findings not classified elsewhere" (codes R00 to R99), which are considered imprecise or insufficiently defined causes. To calculate it, the number of deaths from unspecific causes recorded by place of residence in Chapter II of the ICD-10 (Neoplasms, codes C76, C77, C79, C80 and C97) was divided by the total number of deaths from neoplasms (ICD-10 C00-C97) minus the number of deaths in Chapter XVIII (ICD-10 R00 to R99), and the result was multiplied by 100. Chapter XVIII was subtracted to remove ill-defined causes and determine the weight of unspecific causes in the total number of deaths from neoplasms whose classifications were considered well defined.
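The coverage and PUC calculations described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function and argument names are hypothetical, and the handling of coverage values between 90 and 91%, which the published categories leave unassigned, is an assumption.

```python
def coverage_pct(sim_deaths: float, projected_deaths: float) -> float:
    """Deaths recorded in SIM over deaths projected by IBGE, times 100."""
    return 100 * sim_deaths / projected_deaths


def classify_coverage(pct: float) -> str:
    """Map a coverage percentage onto the three categories in the text."""
    if pct < 80:
        return "regular"
    if pct <= 90:
        return "good"
    # Assumption: values above 90 and below 91 are treated as "excellent",
    # since the text does not assign them to any category.
    return "excellent"


def puc_pct(unspecific_deaths: float,
            neoplasm_deaths: float,
            chapter_xviii_deaths: float) -> float:
    """Unspecific neoplasm deaths (C76, C77, C79, C80, C97) over
    neoplasm deaths (C00-C97) minus Chapter XVIII deaths (R00-R99), times 100."""
    denominator = neoplasm_deaths - chapter_xviii_deaths
    if denominator == 0:
        raise ValueError("neoplasm and Chapter XVIII deaths are equal")
    return 100 * unspecific_deaths / denominator


# Hypothetical counts, for illustration only.
print(classify_coverage(coverage_pct(950, 1000)))  # excellent
print(puc_pct(100, 1200, 200))                     # 10.0
print(puc_pct(50, 900, 1000))                      # -50.0
```

The last call shows that, under this definition, PUC becomes negative whenever Chapter XVIII deaths exceed deaths from neoplasms.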

REV BRAS EPIDEMIOL 2022; 25: E220022
The median of the PUC distribution (10.8%) was adopted as the cutoff for the analyzed period: a median below 10.8% was described as adequate quality, and a median greater than or equal to this value was classified as inadequate. PIDC, in turn, was described as the percentage of deaths from ill-defined causes that occurred in a given space and period; it was calculated as the number of deaths by place of residence in Chapter XVIII of the ICD-10, divided by total deaths from neoplasms (ICD-10 C00-C97) and multiplied by 100. PIDC values <5% indicated "high" QI, values from 5 to 15% "regular" QI, and values >15% "poor" QI 20 .
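The PUC cutoff and the PIDC categories can be expressed as a short Python sketch. The function names are hypothetical and the code only restates the thresholds given in the text; it is not taken from the study.

```python
PUC_CUTOFF = 10.8  # median of the PUC distribution reported in the text (%)


def classify_puc(median_puc_pct: float) -> str:
    """Adequate below the cutoff, inadequate at or above it."""
    return "adequate" if median_puc_pct < PUC_CUTOFF else "inadequate"


def pidc_pct(chapter_xviii_deaths: float, neoplasm_deaths: float) -> float:
    """Chapter XVIII deaths (R00-R99) over neoplasm deaths (C00-C97), times 100."""
    return 100 * chapter_xviii_deaths / neoplasm_deaths


def classify_pidc(pct: float) -> str:
    """Below 5% is high QI, 5 to 15% regular, above 15% poor."""
    if pct < 5:
        return "high"
    if pct <= 15:
        return "regular"
    return "poor"


print(classify_puc(14.88))   # inadequate (median reported for Minas Gerais)
print(classify_pidc(34.5))   # poor
```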
Finally, the completeness analysis sought to determine the degree of completion of demographic data, namely: age, sex, marital status, educational level and ethnicity/skin color. For this purpose, the categories referring to missing data, whose description in SIM is "ignored", were used as a basis for calculation. The score determines 5 degrees of completeness based on the proportion of ignored data, being "excellent" when the variable has less than 5% incomplete filling; "good" when it presents from 5% to less than 11%; "regular" from 11% to less than 21%; "poor" from 21% to less than 50%; and "very poor" from 50% upwards 21,22 .
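The five-level completeness score lends itself to a simple lookup. The sketch below assumes hypothetical function names and takes as input the count of records coded as "ignored" for a variable; the thresholds are those stated in the text.

```python
def incompleteness_pct(ignored_records: int, total_records: int) -> float:
    """Share of records whose value is coded as 'ignored', in percent."""
    return 100 * ignored_records / total_records


def classify_completeness(ignored_pct: float) -> str:
    """Map the percentage of ignored data onto the five degrees of completeness."""
    if ignored_pct < 5:
        return "excellent"
    if ignored_pct < 11:
        return "good"
    if ignored_pct < 21:
        return "regular"
    if ignored_pct < 50:
        return "poor"
    return "very poor"


# Hypothetical example: 120 ignored entries out of 1,000 death records.
print(classify_completeness(incompleteness_pct(120, 1000)))  # regular
```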

RESULTS
From 2009 to 2019, there were 2,173,837 deaths from neoplasms (ICD-10 C00-C97) in the population aged 15 years and over residing in Brazil, with the Southeast Region responsible for 48% of deaths, followed by the Northeast (21.5%), South (19.5%), Midwest (6.3%) and North (4.7%) regions. The mortality growth rate in the same period for Brazil as a whole was 37%, with growth of 67% in the North Region, followed by the Midwest (51%), Northeast (47%), Southeast (31%) and South (23%) regions.
Data coverage varied between 89 and 120% throughout the series, except for the State of Roraima, demonstrating that SIM data have a coverage score classified as "good" and "excellent". Roraima also presented coverage with "excellent" rating, but with proportions ranging between 790 and 1,000% throughout the series.
The specificity indicator showed that the FUs whose PUC values are classified as inadequate are concentrated in the North and Northeast regions of the country; the exception outside these regions is Minas Gerais, whose median was 14.88%. In the State of Bahia, the PUC was negative because the number of ill-defined causes (ICD-10 R00-R99) was higher than the number of deaths from neoplasms (ICD-10 C00-C97) for the 2009-2012 and 2014-2016 periods (Figure 1).
PIDC was classified as "poor" quality in most units throughout the series. The exceptions were Tocantins in 2017, classified as "regular"; Espírito Santo, with values ranging between "regular" and "good" for the entire series; Paraná, with "regular" values between 2017 and 2019; Santa Catarina, with "regular" quality data between 2014 and 2019; Mato Grosso do Sul, which presented "regular" quality until 2018; and Goiás, also "regular" in 2014 and between 2016 and 2019. The Federal District had data oscillating between "regular" and "high" quality throughout (Table 1). Although most units of analysis presented "poor" quality, there was an improvement in QI, with growth rates ranging between -3.5 and -72%. For Brazil, the growth rate was -29.7%. For the states of Piauí (21.9%), Sergipe (20.5%), Rio de Janeiro (19.5%), Mato Grosso do Sul (129.4%), and the Federal District (7.7%), an increase in ill-defined causes was observed over the series.
Data completeness varied according to the indicator used (Table 2): sex and age were "excellent" for the entire period and all units, presenting, respectively, incompleteness values that ranged from 0 to 0.2% and from 0 to 0.6%. The variable educational level varied both across units and across periods, with a negative highlight for the states of Alagoas, Bahia, Espírito Santo, Goiás, Minas Gerais, Paraíba, Rio Grande do Norte, Rio Grande do Sul, and São Paulo, which presented data quality varying in the lower ranks, from "regular" to "very poor", throughout the series. Regarding marital status, the units of analysis improved the quality of their registration over the period, with negative variations for the states of Alagoas, Espírito Santo and Paraíba. The ethnicity/skin color indicator showed a growth rate of -49% in incompleteness for Brazil throughout the series, which means that, over time, the filling out of ethnicity/skin color improved. Despite the improvement in this indicator, the states of Espírito Santo and Alagoas remained at the "regular" level for most of the series.
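The text does not state how the growth rates reported in this section were computed. One common reading, assumed here purely for illustration, is the percent change between the first and last values of the series; the function name and the input values below are hypothetical.

```python
def growth_rate_pct(first_value: float, last_value: float) -> float:
    """Percent change from the first to the last value of a series.

    Assumption: the article does not give its formula; this is the
    simple start-to-end percent change.
    """
    return 100 * (last_value - first_value) / first_value


# Hypothetical values: an indicator of poor quality falling from 200 to 100.
print(growth_rate_pct(200, 100))  # -50.0
```

Under this reading, a negative growth rate for PIDC or for the proportion of ignored data indicates an improvement in QI, while a positive rate indicates a deterioration.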

DISCUSSION
Our study showed that the coverage of data on mortality from neoplasms in SIM, in all units of analysis, is classified as good or excellent per the methodology used. The highest coverage value was found in the State of Roraima, which can be explained not only by the investment in public policies and administrative improvements in death records, but also by the low mortality projection by IBGE for this FU and the low uptake of deaths by the civil registry system 6,24 . The adequacy of coverage found in this study corroborates other studies that sought to determine the coverage of adult mortality data. In the survey by Costa et al. 25 , SIM coverage was 98%, while a low coverage of the civil registry for the State of Roraima was evidenced. In the study by Queiroz et al. 24 , with coverage data from 1980 to 2010, the states of the South and Southeast regions had their coverage classified as "excellent", while other regions had it classified between "regular" and "good". The coverage of mortality data is essential to inform about the health condition of the Brazilian population, since a reduction in ill-defined causes is seen when it is associated with actions for its improvement 6 .
The PUC of mortality from neoplasms in the FUs of the North and Northeast regions and in the State of Minas Gerais was considered inadequate, corroborating studies that described the effect of geographic area on the quality of health data 14,26 . According to Balieiro et al. 27 , unspecific causes of death in the State of Amazonas from 2006 to 2012 were related to the place of residence and occurrence of death, sociodemographic conditions, and the professional responsible for attesting to the death. The comparability of the PUC of this study with that of others is limited due to methodological differences and the lesser emphasis given to this indicator. This can be explained by the fact that the reduction in PIDC and underreporting has a greater impact on the improvement in mortality records due to a specific cause when compared to the reduction in PUC 28 . The negative PUC in the State of Bahia, which arises because the number of deaths from neoplasms is lower than that of ill-defined causes, serves as an alert for the need to improve the completion of Death Certificates, as the high number of imprecise or insufficiently defined causes compromises the assessment of the population's health condition and, consequently, the allocation of strategies and financial resources 17,18 .

Despite being classified as "poor" quality in most Brazilian states, PIDC showed an improvement in data over the years, corroborating studies that describe the reduction of PIDC in regions with low or medium socioeconomic development (although high quality data are still not available) as a result of investments in the public health system and advances in strengthening the statistics of causes of death in the country. It is important to highlight that classification with codes R00-R99 not only represents poor quality of data, but also describes the lack of adequate health care, since these codes encompass conditions such as septicemia, hypertension and heart failure, among others that are preventable and treatable; it may also be due to a high number of deaths at home, which makes it difficult to properly report the cause 7,14,18 . Even in high-income countries with advanced health information systems, classifications with R codes occur, although in smaller proportions than those shown in the present study. According to Mikkelsen et al. 29 , who investigated the use of ill-defined codes in six high-income countries, the average PIDC for such countries was 18% (2015 and 2016), while in Brazil the average PIDC in the same period was 34.5%. This comparison reinforces the importance of maintaining investments to improve the quality of Brazilian health data 14,18 , because even in states with historical reports of quality data, a high PIDC is still seen 30 .
The sex and age indicators in deaths from neoplasms were classified as having "excellent" completeness in the study period, corroborating other studies with different population and temporal approaches, which described the improvement in filling in these variables 22,31,32 . The variable educational level still has major limitations, requiring improvement in its completion, given its fundamental role in understanding social inequalities in mortality from different conditions, including neoplasms 33 . According to Melo and Valongueiro 31 , in a study that verified the incompleteness of records of deaths from external causes in the State of Pernambuco in two time series (2000 to 2002; 2008 to 2010), the completeness of educational level indicators improved when comparing both series; however, it remained of "poor" quality in both periods. The completeness of the marital status variable was classified as "good" in most units of analysis; this finding is consistent with what Messias et al. 32 presented in a study on the quality of mortality data from external causes in the city of Fortaleza, and with what Rios et al. 34 reported in a study on the completeness of mortality data from suicide among the elderly in the state of Bahia. The improvement in the filling of ethnicity/skin color in Brazil and in most FUs was also observed by Romero et al. 35 in a study that sought to highlight the trend and inequality related to this variable in the notification of mortality of the elderly in SIM between 2000 and 2015 in Brazil. This variable must be used to measure the impact of racialized inequality on mortality, and its adequate completion provides accuracy for studies that aim to understand such differences.
This study showed that the quality of data on mortality from neoplasms in the Brazilian population aged 15 years or older is, for the most part, adequate, but there are important gaps to be filled, as the expansion of QI seeks to give visibility to the health condition of the Brazilian population as a whole and to propose public actions for its improvement. Different actions and measures can be taken to improve QI. For Woods et al. 36 , the alternative to improve the quality of data in Canada was based on the development of a curation program that developed familiarization with data and with filling mechanisms, actions that were centralized when the data was received, reducing errors and making them rare over time. For Lemma et al. 37 , the expansion of QI in low- and middle-income countries took place through the combination of different tools, such as the use of technology, training of personnel and assessment tools to improve self-assessment and feedback resources in the routine health information system.
This study contributed to the discussion on the quality of health data, with a temporal and geographical approach to the adult population through nationally used indicators.
Research on the quality of health data usually applies a restricted geographical approach, with a strong emphasis on neonatal conditions, mortality in general and external causes, and a focus on mortality in children and the elderly. This investigation breaks new ground by bringing up a debate that has expanded geographically over the years and falls on a group of underlying causes of death. In addition, it is one of the first descriptive studies on the QI about mortality from neoplasms in Brazil, at the national level and by FU. This research has limitations that are consistent with those of a descriptive study, as it does not determine which factors are associated with QI. Based on results found, further studies with the same objective of analysis are necessary.