Temporal trends in prevalence and infant mortality of birth defects in Brazil, from 2001 to 2018

Congenital anomalies (CA) are a relevant problem for global public health, affecting about 3% to 6% of newborns worldwide. In Brazil, these are the second main cause of infant mortality. Thus, extensive studies are needed to demonstrate the impact of these anomalies on births and deaths. The present study describes the temporal trends of prevalence and infant mortality due to CA among live births in Brazil and its regions, from 2001 to 2018, using linked data from the Live Birth Information System (SINASC, acronym in Portuguese) and the Mortality Information System (SIM, acronym in Portuguese). The prevalence of CA and the infant mortality due to CA have increased in Brazil and in most regions, especially in the Northeast and North. CAs of the musculoskeletal system were the most frequent at birth (29.8/10,000 live births), followed by those of the circulatory system (12.7/10,000 live births), which represented the primary cause of death in this group. The linkage technique made it possible to correct the national prevalence of CA by 17.9% during the analyzed period, after retrieving the anomalies reported in SIM, thereby proving to be a good tool to improve the quality of information on anomalies in Brazil.


Introduction
It is estimated that, globally, 3% to 6% of all newborns present some form of congenital anomaly (CA) [1][2][3], which has therefore become one of the main causes of the Global Burden of Diseases [1]. These anomalies are defined by the World Health Organization (WHO) as structural or functional defects that occur before birth and can be identified in the intrauterine period, at birth, or throughout life. They represent a serious global public health problem since, in addition to having a high impact on child mortality, a significant portion are chronic conditions that affect not only the individual but also the family, health systems, and society [1].
In high-income countries, where well-established surveillance systems can be found, CAs are reported in approximately 3% of all births [2,3]. In low- and middle-income countries, this frequency is expected to be higher because of risk factors that contribute mainly to an increase in the number of anomalies of environmental etiology [4].
In Brazil, the available information on CAs comes mostly from passive surveillance, with records obtained from the Certificate of Live Birth (CLB). One recent study conducted by the Brazilian Ministry of Health used information available in the CLB to determine the prevalence of CAs considered rare in Brazilian regions and states, emphasizing the importance of this information system for the study of these conditions [5]. However, even though the CLB is the most important source of information about CAs, it is still subject to underreporting, given that the proportion of births with CA recorded in the country is less than 1% [7].
Although CAs must be reported in the CLB, the form is normally filled out a few hours after the child's birth [8], so that anomalies not identified within this time interval are generally underreported in the system. In Brazil, CAs that involve the musculoskeletal system constitute a cause group that is often diagnosed shortly after birth [9,10], while those that involve other systems and are diagnosed later are underreported in the CLBs [11], illustrating the need to implement other strategies to better understand the magnitude of this health problem in the country.
In 2015, the Zika epidemic in Brazil drastically increased the number of births with microcephaly and other congenital defects of the nervous system [12]. Although it did not change the position occupied by this cause group, the epidemic made even clearer the urgent need to improve the surveillance system for all CAs so as to enable continuous and agile follow-up of this public health emergency. Reporting CAs is also important because of their relevance to infant mortality. In Brazil, CAs rank second among the main groups of causes of death among children under one year of age and, in some Brazilian states, have already reached first place [13][14][15].
One strategy that can improve knowledge about health events that are rare and difficult to observe is the linkage of records across databases, known as record linkage. Many countries have applied this method to improve information concerning CAs and to obtain more accurate estimates of the prevalence of these conditions [16][17][18]. The methodology has proven highly valuable in the recovery of records, especially among children with more severe forms of CA, which are coded as a cause of death. In Brazil, because of the large volume and complexity of the CLB data, which involve nearly 3 million births per year, this strategy was difficult to implement. However, the Center for Data and Knowledge Integration for Health (CIDACS) developed a new tool to link large databases [19].
Therefore, using linked data from the Live Birth Information System (SINASC, in Portuguese) and the Mortality Information System (SIM, in Portuguese), the present study describes the temporal trends of prevalence and infant mortality due to CA in live births (LBs) in Brazil and its geographic regions from 2001 to 2018.

Methods
This is a descriptive, retrospective time-series study conducted using linked birth and death data from Brazil from 2001 to 2018.
SINASC is fed by an official document, the CLB, and contains information about both the gestational history of the mother and the newborn. Since 2001, the CLB has included a field for the CA code observed at birth (described in Chapter 17 of the 10th revision of the International Classification of Diseases, ICD-10) [20]. In 2011, some changes were made to the CLB: the variable that identifies the CA was added to Block I, field 6, and the variable with the ICD information was added to Block VI, field 41 [8].
The SIM receives the Death Certificate (DC), which contains information about the parents and the deceased individual. In 2011, the field reserved for recording the cause of death was moved to position 40 and consisted of the following variables: line "a", line "b", line "c", line "d", and line II. It is important to note that the underlying (basic) cause of death should be recorded in line "d", while the terminal cause should be placed in line "a" [21]. For this study, data were recovered from lines "a", "b", "c", "d", and II whenever the field had been filled in with a CA code described in Chapter 17 (Q00-Q99) of the ICD-10.
The linkage of records from the SINASC and SIM databases used in the present study was performed at CIDACS [22], which conducts studies and research based on the integration of large databases ("big data") [23]. For this purpose, CIDACS developed the CIDACS-Record Linkage (CIDACS-RL) tool, created to link big data with a high level of sensitivity [19,22]. The following attributes were considered in the linkage: name of the mother, municipality of residence, and mother's date of birth [22]. A total of 563,821 (73.2%) linked infant mortality records were obtained.
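As a rough illustration of what linking on these three attributes involves, the sketch below scores a pair of birth and death records and keeps the best match above a cut-off. It is a toy example and not the CIDACS-RL algorithm: the field names, the equal weights, the 0.9 threshold, and the sample records are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Upper-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.upper().split())


def pair_score(birth, death):
    """Average agreement over the three linkage attributes (0.0 to 1.0).

    Names are compared approximately; municipality and mother's date of
    birth are compared exactly. Equal weights are an assumption of this sketch.
    """
    name_sim = SequenceMatcher(
        None, normalize(birth["mother_name"]), normalize(death["mother_name"])
    ).ratio()
    same_city = float(birth["municipality"] == death["municipality"])
    same_dob = float(birth["mother_dob"] == death["mother_dob"])
    return (name_sim + same_city + same_dob) / 3


def link(births, deaths, threshold=0.9):
    """Pair each death with its best-scoring birth if the score reaches the threshold."""
    links = []
    for death in deaths:
        best = max(births, key=lambda b: pair_score(b, death), default=None)
        if best is not None and pair_score(best, death) >= threshold:
            links.append((best["id"], death["id"]))
    return links


# Made-up records, for illustration only.
births = [
    {"id": "B1", "mother_name": "Maria da Silva", "municipality": "292740", "mother_dob": "1985-03-02"},
    {"id": "B2", "mother_name": "Ana Souza", "municipality": "355030", "mother_dob": "1990-07-15"},
]
deaths = [
    {"id": "D1", "mother_name": "MARIA  DA SILVA", "municipality": "292740", "mother_dob": "1985-03-02"},
    {"id": "D2", "mother_name": "Joana Lima", "municipality": "130260", "mother_dob": "1979-01-20"},
]
linked = link(births, deaths)
```

Comparing every death with every birth, as done here, would not scale to millions of records; tools built for large databases typically narrow the candidate pairs first.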
The CA records found in the SINASC and SIM databases were classified according to Chapter 17 of the ICD-10: congenital anomalies of the nervous system (Q00-Q07); congenital anomalies of the eye, ear, face, and neck (Q10-Q18); congenital anomalies of the circulatory system (Q20-Q28); congenital anomalies of the respiratory system (Q30-Q34); cleft lip and cleft palate (Q35-Q37); congenital anomalies of the digestive system (Q38-Q45); congenital anomalies of the genital organs (Q50-Q56); congenital anomalies of the urinary system (Q60-Q64); congenital anomalies of the musculoskeletal system (Q65-Q79); other congenital malformations (Q80-Q89); and chromosomal abnormalities, not elsewhere classified (Q90-Q99). To calculate the prevalence, all of the LBs from 2001 to 2018 were considered, while for the infant mortality rate, the records from SIM were linked to the SINASC records of LBs from 2001 to 2017. To calculate the infant mortality rate, only children who had not completed one year of age at the time of death (≤ 364 days) were included. The LBs in 2018 were excluded from the infant mortality rate because of their incomplete follow-up time in the study.
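The grouping above can be expressed as a simple lookup. The following is a minimal sketch, assuming codes arrive as strings such as "Q21.0" or "Q210"; the function and table names are hypothetical and are not part of SINASC or SIM.

```python
# Chapter 17 cause groups as listed in the text: (first category, last category, label).
CA_GROUPS = [
    (0, 7, "nervous system"),
    (10, 18, "eye, ear, face, and neck"),
    (20, 28, "circulatory system"),
    (30, 34, "respiratory system"),
    (35, 37, "cleft lip and cleft palate"),
    (38, 45, "digestive system"),
    (50, 56, "genital organs"),
    (60, 64, "urinary system"),
    (65, 79, "musculoskeletal system"),
    (80, 89, "other congenital malformations"),
    (90, 99, "chromosomal abnormalities, not elsewhere classified"),
]


def ca_group(icd10_code):
    """Return the cause group for an ICD-10 code, or None if it is not a listed CA code."""
    code = icd10_code.strip().upper().replace(".", "")
    if len(code) < 3 or code[0] != "Q" or not code[1:3].isdigit():
        return None
    category = int(code[1:3])
    for low, high, label in CA_GROUPS:
        if low <= category <= high:
            return label
    return None  # categories outside the listed ranges (e.g. Q08) fall in no group
```

For example, `ca_group("Q21.0")` falls in the circulatory system group, while a non-Chapter 17 code such as "P07" returns `None`.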
The indicators below were calculated for Brazil and geographic regions.
Prevalence of LBs with a record of CA per 10,000 LBs (ratio between the total number of LBs with CA reported in SINASC or SIM from 2001 to 2018 and the total number of LBs over the same period, multiplied by 10,000).
Infant mortality rate due to CA per 10,000 LBs from 2001 to 2017 (ratio between the total number of infant deaths due to CA recorded in SIM and the total number of LBs, multiplied by 10,000).
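A minimal sketch of the two indicators follows. The identifiers and counts are hypothetical, chosen only to show that a live birth reported in both systems is counted once in the prevalence numerator.

```python
def per_10000(events, live_births):
    """Rate per 10,000 live births."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return events / live_births * 10_000


def births_with_ca(sinasc_ids, sim_ids):
    """Live births with CA reported in SINASC or SIM, each counted once."""
    return set(sinasc_ids) | set(sim_ids)


# Hypothetical identifiers: LB 3 appears in both systems.
cases = births_with_ca(sinasc_ids=[1, 2, 3], sim_ids=[3, 4])
prevalence = per_10000(len(cases), live_births=500)

# Hypothetical counts of infant deaths due to CA and of live births.
mortality_rate = per_10000(events=1_220, live_births=500_000)
```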
To verify the temporal trends of the prevalence at birth and of the infant mortality rates due to CA in the respective periods, the Prais-Winsten generalized linear regression model was applied, taking, in each model, the indicator of interest as the dependent variable and time as the independent variable. The Prais-Winsten method is indicated to correct the estimates of the regression parameters for the autocorrelation between successive observations of a time series [24]. The Durbin-Watson test [25] was first applied to evaluate the presence of autocorrelation in the errors of adjacent observations obtained from the regression model. Once autocorrelation had been confirmed, the Prais-Winsten and Cochrane-Orcutt corrections were applied.
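To make the procedure concrete, the sketch below computes the Durbin-Watson statistic from ordinary least squares residuals and then applies a one-step Prais-Winsten transformation for a single regressor. It is a simplified illustration in plain Python, not the Stata routine used in the study: the autocorrelation coefficient is estimated once rather than iteratively, no standard errors are produced, and the series is invented.

```python
import math


def ols(x, y):
    """Simple least squares fit y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def durbin_watson(resid):
    """Durbin-Watson statistic; values near 2 suggest little first-order autocorrelation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)


def prais_winsten(x, y):
    """One-step Prais-Winsten fit of y = b0 + b1*x with AR(1) errors.

    Returns (b0, b1, rho), with rho estimated once from the OLS residuals.
    """
    b0, b1 = ols(x, y)
    e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    rho = sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(v * v for v in e[:-1])
    rho = max(-0.99, min(0.99, rho))  # keep the square root below well defined
    w = math.sqrt(1 - rho ** 2)
    # Transformed intercept column, regressor, and response; the first
    # observation is rescaled by w instead of being dropped.
    c = [w] + [1 - rho] * (len(x) - 1)
    xs = [w * x[0]] + [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    ys = [w * y[0]] + [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    # Least squares on the transformed data: solve the 2x2 normal equations.
    scc = sum(v * v for v in c)
    scx = sum(a * b for a, b in zip(c, xs))
    sxx = sum(v * v for v in xs)
    scy = sum(a * b for a, b in zip(c, ys))
    sxy = sum(a * b for a, b in zip(xs, ys))
    det = scc * sxx - scx * scx
    return (scy * sxx - sxy * scx) / det, (scc * sxy - scx * scy) / det, rho


# Invented series: log10 of a rate over 18 years with positively correlated deviations.
years = list(range(18))
deviation = [0.01, 0.01, 0.01, -0.01, -0.01, -0.01] * 3
log_rate = [1.8 + 0.02 * t + d for t, d in zip(years, deviation)]

b0, b1 = ols(years, log_rate)
dw = durbin_watson([yi - b0 - b1 * ti for ti, yi in zip(years, log_rate)])
b0_pw, b1_pw, rho = prais_winsten(years, log_rate)
```

Rescaling the first observation rather than discarding it is what distinguishes Prais-Winsten from Cochrane-Orcutt, which matters for series as short as 17 or 18 annual points.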
The correction followed the steps described by Antunes and Cardoso (2015) [24]. First, a logarithmic transformation of the dependent variable was performed, followed by the application of the Prais-Winsten generalized linear regression model. To obtain the Annual Percent Change (APC) of the prevalence and of the mortality rate, the slope coefficient $\beta_1$ from each model was applied to the following formula: $\mathrm{APC} = [-1 + 10^{\beta_1}] \times 100\%$. To find the lower limit (LL) and the upper limit (UL) of the APC confidence interval, the following expressions were used: $\mathrm{LL} = [-1 + 10^{\beta_{1\min}}] \times 100\%$ and $\mathrm{UL} = [-1 + 10^{\beta_{1\max}}] \times 100\%$.
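The conversion from slope to APC is a one-line calculation, sketched below. The slope and its limits are hypothetical values, and the sketch assumes the dependent variable was transformed with a base-10 logarithm, as the base of the exponent in the formula suggests.

```python
def apc(beta):
    """Annual percent change implied by a slope fitted to log10 of the rate."""
    return (-1 + 10 ** beta) * 100


def apc_interval(beta1, beta1_min, beta1_max):
    """Return (APC, LL, UL) from the slope and the limits of its confidence interval."""
    return apc(beta1), apc(beta1_min), apc(beta1_max)


# Hypothetical slope and confidence limits.
estimate, lower, upper = apc_interval(beta1=0.018, beta1_min=0.010, beta1_max=0.026)
```

Because the transformation is monotonic, the limits of the slope map directly onto the limits of the APC, and a slope of zero gives an APC of zero.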
From the APC, the trend is considered to be growing when the rate is positive, declining when it is negative, and stationary when there is no significant difference between its value and zero. The APC shows the percentage of growth or decline per year in both the prevalence and the infant mortality rate due to CA. A significance level of 0.05 was adopted for all tests.
Moreover, the trend of CAs in LBs was also evaluated for 2001 to 2015, the period prior to the Zika epidemic, in order to observe whether the trend of prevalence changed in later years with the increase in Congenital Zika Syndrome (CZS). The statistical analyses were performed using the Stata 14.0 program [26].
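The classification rule can be written as a small function. This is a sketch under one assumption not stated in the text, namely that "significant" means a p-value strictly below 0.05; the function name is hypothetical.

```python
def classify_trend(apc, p_value, alpha=0.05):
    """Label a trend as growing, declining, or stationary."""
    if p_value >= alpha or apc == 0:
        return "stationary"  # not distinguishable from zero
    return "growing" if apc > 0 else "declining"
```

Applied to values reported in this study, an APC of 0.09% with p = 0.931 is stationary, while an APC of 2.80% with p = 0.018 is growing.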
This article is part of a larger project, which received approval from the Research Ethics Committee of the Federal University of Bahia in Salvador, Brazil (CAAE: 70745617.2.0000.5030).

Results
From 2001 to 2018, 377,475 LBs with a diagnosis of CA were registered in the SINASC database. After linking this system with the SIM, it was observed that 81,982 LBs had CA as the underlying (basic) cause of death but had not been recorded in the SINASC, totaling 459,457 LBs with CA, which represented an increase of 17.9% in this period. Accordingly, the prevalence of LBs with CA in the country went from 70.8/10,000 LB to 86.2/10,000 LB.
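The arithmetic behind these figures can be retraced as follows. The sketch uses only the numbers reported above. Reading the 17.9% as the share of the corrected prevalence that came from SIM is an interpretation made for this sketch, not something the text states.

```python
sinasc_cases = 377_475       # LBs with CA registered in SINASC
recovered_from_sim = 81_982  # LBs with CA found only in SIM
total_cases = sinasc_cases + recovered_from_sim

# Share of the corrected total recovered from SIM, from the counts.
share_from_counts = recovered_from_sim / total_cases * 100

# The same share computed from the rounded prevalences per 10,000 LB.
share_from_prevalence = (86.2 - 70.8) / 86.2 * 100

# Increase relative to the SINASC-only count.
relative_increase = recovered_from_sim / sinasc_cases * 100
```

Under this reading, the counts give about 17.8% and the rounded prevalences about 17.9%, whereas the increase relative to SINASC alone is about 21.7%.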
In the analyzed period, the highest prevalences of CA were observed in the Southeast and South (96.6/10,000 LB and 94.1/10,000 LB, respectively), followed by the Midwest (78.7/10,000 LB), the Northeast (78.6/10,000 LB), and the North (66.1/10,000 LB) (Figure 1).
From 2001 to 2015 (the period prior to the Zika outbreak), the annual trend of prevalence of CAs was stationary in the Midwest and South, with average annual percent changes of 2.28% (p = 0.195) and 1.20% (p = 0.285), respectively, whereas in the other regions the trends were growing (Supplementary Table 1, available at: https://doi.org/10.48331/scielodata.2VEG1Z).
When the trend of prevalence from 2001 to 2018 was analyzed, that is, including the years during and after the Zika epidemic, the trend proved to be growing in Brazil and in four regions (Southeast, North, Northeast, and Midwest). The Midwest showed a growth in the prevalence of CAs of 2.80% (p = 0.018) per year. In the North, the APC rose, corresponding to a 5.44% increase (p < 0.001) in the prevalence of CAs per year. The Southeast also showed an annual increase of 5.68% (p < 0.001) in the prevalence of CAs. Finally, the South showed a stationary trend, with an APC of 0.09% (p = 0.931), as shown in Table 1.
The CAs of the musculoskeletal system were the most prevalent in Brazil, with 29.8/10,000 LB, followed by those of the circulatory system, with 12.7/10,000 LB, and the nervous system, with 11.1/10,000 LB (Figure 2). Those related to the musculoskeletal system were the most common in all regions. These, together with those of the circulatory system and cleft lip and cleft palate, showed a higher prevalence in the Southeast and South. By contrast, the CAs of the nervous system and other congenital malformations were more prevalent in the Northeast (Supplementary Table 2, available at: https://doi.org/10.48331/scielodata.2VEG1Z).
Cleft lip and cleft palate, as well as the anomalies of the musculoskeletal system, had their records mostly registered in the SINASC (94%). The records of the CAs of the circulatory system were found primarily in the SIM; in the North, 84% of the notifications in this group were recovered from this system. Thus, with the linkage between the systems, the CAs of the circulatory system came to rank as the second most common in Brazil. Among the 59,042 CAs of the nervous system, more than 17% were recorded only among the causes of death (Supplementary Table 2, available at: https://doi.org/10.48331/scielodata.2VEG1Z).
From 2001 to 2017, the proportional mortality due to CA among children under one year of age in the country was 21.7% (140,930/650,681), while the specific infant mortality rate for this group was 24.4/10,000 LB. Figure 3 shows that the South (26.6/10,000 LB) and the Midwest (26.6/10,000 LB) presented the highest infant mortality rates due to CA, followed by the Southeast (25.5/10,000 LB), the Northeast (22.4/10,000 LB), and the North (21.8/10,000 LB).
As shown in Table 2, the annual infant mortality rate due to CA in Brazil showed a growing trend (APC = 4.23%, p < 0.001). Regarding the regions, the increase occurred in four of the five, with the highest values in the North (APC = 10.15%, p < 0.001) and Northeast (APC = 9.90%, p < 0.001). The Southeast showed a stationary trend, with an APC of 1.16% (p = 0.227).
Concerning the most common causes of death due to CA in Brazil from 2001 to 2017, those of the circulatory system predominated, with an infant mortality rate of 10/10,000 LB, followed by the group of other congenital malformations (4.8/10,000 LB), the nervous system (4.4/10,000 LB), and the musculoskeletal system (2.8/10,000 LB). Chromosomal abnormalities, not elsewhere classified (ACrNCOP, in Portuguese), represented 2.1/10,000 LB (Supplementary Figure, available at: https://doi.org/10.48331/scielodata.2VEG1Z).

Discussion
The results of the present study point to a growing trend in the prevalence of CA in Brazil from 2001 to 2018. This increase was observed mainly in the Northeast, Southeast, and North, where prevalence rose by 7.40%, 5.68%, and 5.44% per year, respectively. After the recovery of records from SIM through the linkage of databases, the prevalence of CA in the country rose from 70.8/10,000 LB to 86.2/10,000 LB, varying from 96.6/10,000 LB in the Southeast to 66.1/10,000 LB in the North. CAs of the musculoskeletal system continued to be the most common at birth, while CAs of the circulatory system moved from sixth to second place in the ranking and constituted the main cause group of infant deaths due to CA. Moreover, the trend of infant mortality due to CA in the country proved to be on the rise, with the North and Northeast presenting the highest APCs, 10.15% and 9.90%, respectively.
The increase in CA prevalence in the Brazilian regions may well be related to an improvement in notification, as observed in a previous study [27]. This trend may also stem from the changes made to the CLB in 2011, described in the Methods, which enabled the notification of a larger number of anomalies for each child and, most likely, improved the capture of these cases [8]. There was also the contribution of the sudden increase in the prevalence of CAs of the nervous system in 2015, with a peak in 2016, caused by CZS/microcephaly following the emergence of the Zika virus (ZIKV) in Brazil, which produced epidemics in many states from the second half of 2014 to the end of 2016 [28].
When the years 2016 to 2018 were added to the analysis of the trend in the prevalence of CA, the Midwest, which until 2015 had shown a stationary trend, presented a growing trend. A rise in the APC was also observed in the North and Southeast. These results reinforce the view that the measures adopted to improve the detection and recording of CZS cases resulted in an improvement in the notification of CAs, both in the course of the epidemic and in the subsequent years, especially of the malformations of the brain and eyes, reported as the main adverse outcomes in pregnancies of women infected by ZIKV [29]. However, Oliveira et al. (2017) [30] reported no increase in the number of cases of microcephaly in the South during the Zika epidemic in Brazil, which may explain why the trend in this region remained stationary after the years following 2015 were added to the analysis.
Although an improvement in the number of notifications was observed, CA in Brazil is still highly underreported, with a frequency at birth remaining below 1% (0.98%), whereas the expected frequency would be at least 3% of the LBs [2,3]. Luquetti and Koifman (2010) [6] highlight that one of the reasons for this underreporting is the limited capacity of the professionals who take part in the diagnosis and registration of CAs in the CLB. Continuing educational actions aimed at reducing such problems were carried out in the maternity wards of the city of São Paulo and led to an improvement in notification using this specific form. Consequently, an increase in CA prevalence was observed in the Southeast, as São Paulo is the most populous city in Brazil [31,32].
Thus, the most visible CAs also tend to be the most prevalent, as they are easily identified shortly after birth and reported in the CLB [9][10][11]. The substantial proportion of CAs undiagnosed at the moment of birth may be due to the difficulty in identifying the anomaly, the unavailability of diagnostic resources, or even, as already mentioned, the limited technical capacity of the professionals responsible for this process [6,33]. These factors can explain the substantial underreporting of CAs of the circulatory system in SINASC, in which only 38.5% of the total number of records from the present investigation appear. An even worse scenario was reported by Pinto Junior et al. (2015) [11], who found an even lower proportion (5.3%) of congenital heart diseases recorded in this system throughout the entire country.
These problems reinforce the need to use data linkage techniques to identify individuals with severe forms of CA whose diagnoses were recorded in other databases and, therefore, to obtain information that is closer to reality. Nonetheless, studies on CA in Brazil using linked data are still scarce, especially those that give priority to specific regions of the country.
The present article is the first of its kind to analyze the temporal trend of prevalence of this group of causes in Brazil and in the five regions through national data resulting from the linkage of two large and important databases (live births and deaths), which enabled a 17.9% correction in the prevalence of CAs in the country. In a similar study carried out only for the city of São Paulo, Geremias et al. (2009) [34] found a correction of 14.3%, while Guimarães et al. (2019) [35] reported a correction of 20% in the city of Recife.
In the Southeast and South, there is a major concentration of geneticists and reference centers in medical genetics [14,33], which could contribute significantly to the diagnosis of CAs and, later, to an improvement in their recording in the DC. Consequently, this can influence the mortality rates due to CA in these regions, since the highest infant mortality rates due to CA in the period studied were observed in these regions, along with the Midwest. It is important to highlight that the improvement in the detection of CAs is also reflected in the type of anomaly registered at the time of death, given that the ACrNCOP, which require a more specific diagnosis, showed their highest infant mortality rates in the South and Southeast.
The lowest infant mortality rates due to CA for the period studied were found in the North and Northeast. Nevertheless, the annual growth of infant mortality rates due to CA was greater in these regions. These are areas of lower socioeconomic development and present higher risks of child mortality. Nonetheless, when compared to other regions, they presented the largest drops in infant mortality between 1990 and 2015, going from 45.9/1,000 LB and 75.8/1,000 LB to 16.6/1,000 LB and 15.2/1,000 LB [36], for the North and Northeast, respectively. This is most likely because avoidable causes of death, such as infection and prematurity, have been more widely controlled in Brazil [37]. As a result, CAs began to have a greater impact on infant mortality over the years [15], leading to the larger annual growth observed for infant mortality rates due to CA in these regions. Furthermore, child mortality due to CA, caused by the increase in the number of CZS cases, has contributed to this upswing, especially in 2016 and 2017 [36].
In recent years, Brazil has implemented a wide range of public maternal-child health policies focused on the detection and treatment of cardiac CAs [38,39], policies that would most likely improve diagnosis and thus lead to a rise in the number of notifications from this group. In a study conducted in the state of Rio Grande do Sul, Luz et al. (2019) [40] observed that, in the studied period, the anomalies of the circulatory system presented a growing trend with the highest annual growth rate among the analyzed groups of CA, even though the best detection of these anomalies can be observed in SIM, since, in most cases, the diagnosis is not made in time to be reported in the SINASC [11].
The limitations of this study include the possibility of errors in the linkage of the databases, such as incorrect linkage between records and non-linkage, which can lead to a loss of data and affect the estimates of prevalence and infant mortality rate. A second source of underreporting refers to CAs that were not diagnosed at birth (SINASC) and were not recorded in the SIM, either because they were not diagnosed during the child's life or because, despite a late diagnosis, they were not severe enough to cause the child's death. These CAs could not be recovered in our study, which shows the need to include morbidity records in an attempt to reduce the underreporting of CAs in the country's public databases.
The present study showed a growing trend in the prevalence of CA and in infant mortality due to CA in Brazil and in four of the five regions in the studied period. This increase may well have occurred because of the measures implemented by the Ministry of Health to improve the reporting of this group of causes of disease, especially those related to the CZS epidemic of 2015.
The linkage between the databases concerning births and deaths proved to be an important tool to minimize underreporting and highlighted that the CAs that are less visible at birth are the most commonly underreported in the SINASC. Many CAs are subject to prevention and/or intervention and, therefore, it is important to know the number of events that most closely reflects reality. For this, it is necessary to adopt strategies to collect more reliable and detailed records about CAs and their impacts on morbidity and mortality so that they can adequately inform the formulation of public social and health policies focused on the child population.

Figure 1. Prevalence (per 10,000 live births) of the occurrence of congenital anomalies in Brazil and regions, 2001 to 2018. Source: Data obtained in the SINASC (Live Birth Information System), plus those unregistered at birth but that are registered in the Death Certificates/SIM (Mortality Information System).

Collaborations
QHRF Fernandes worked on the design, analysis, data interpretation and final writing. ES Paixão, MCN Costa and AX Acosta contributed to the design, data interpretation and final writing. MG Teixeira, JDC Rios, KSG Di Santo and ML Barreto provided critical feedback and helped with final writing.

Funding
Fundação de Amparo à Pesquisa do Estado da Bahia: funds QHRFF through a doctoral scholarship. Secretaria de Vigilância em Saúde do Ministério da Saúde do Brasil. Wellcome Trust: funds ESP through grant 213589/Z/18/Z. Wellcome Trust & the UK Department for International Development: funds ESP through grant 205377/Z/16/Z.

Table 1. Parameters obtained through the Prais-Winsten generalized linear regression model for a temporal series of prevalence (per 10,000 live births) of congenital anomalies, according to the geographic region of residence. Brazil, 2001 to 2018.

Location of residence | Annual percentage change (%) | 95%CI | p* | R² | Trend

Figure 2. Prevalence (per 10,000 live births) of the occurrence of congenital anomalies per cause group in Brazil, from 2001 to 2018. Other CM = other congenital malformations. Source: Data obtained in the SINASC (Live Birth Information System), plus those unregistered at birth but that are registered in the Death Certificates/SIM (Mortality Information System).

Table 2. Parameters obtained through the Prais-Winsten generalized linear regression model for the temporal series of infant mortality (per 10,000 live births) for congenital anomaly, in Brazil and geographic regions, from 2001 to 2017.