Validity of informed birth weight. Study of Cardiovascular Risk in Adolescents (ERICA)-Rio de Janeiro

Objectives: to verify the agreement between the birth weight reported by the guardians of the adolescents who participated in the Study of Cardiovascular Risks in Adolescents (Portuguese acronym ERICA) and the birth weight of the same adolescents recorded in the National Information System on Live Births (Portuguese acronym Sinasc). Methods: probabilistic record linkage of 1,668 records was conducted between the ERICA databases and the Sinasc databases from 1996 to 2002, both from the state of Rio de Janeiro. The agreement between the reported birth weight and the one registered in Sinasc was estimated by the Intra-class Correlation Coefficient (ICC), the Bland-Altman plot, Cohen's Kappa index, and Gwet's agreement coefficient. Results: the ICC was 0.89 (95%CI = 0.88-0.90) and increased with the mother's educational level. There was also high agreement in the classification of birth weight into the low (< 2,500 g), adequate (2,500 to 3,999 g) and elevated (≥ 4,000 g) categories, with Gwet's Agreement Coefficient = 0.91 (95%CI = 0.89-0.92). Conclusions: the results showed satisfactory agreement between the birth weight reported by the parent/guardian of the adolescents and the one registered in Sinasc, and this agreement increased with the mother's educational level.


Introduction
The rise in the prevalence of non-communicable chronic diseases has stimulated interest in studies based on self-reported information on risk factors, whether provided by adults, adolescents or parents/guardians of children and adolescents, because such information reduces cost and logistic complexity.
Birth weight is an important predictor of child morbidity and mortality, due to its association with growth impairment, vulnerability to infectious diseases, developmental delay and mortality. 1-4 In epidemiological studies, the information on birth weight depends on the memory of the interviewee, while in clinical trials there may be problems due to the lack of communication between maternity hospitals and pediatric services. 5
In recent years, the number of health studies that use record linkage has been rising. Record linkage is the process of identifying and joining records from different databases that belong to the same entity. 6 A new database is created containing the variables from the linked databases, which improves the capacity to answer the research questions by aggregating several pieces of information through common data. 7,8
Considering the importance of assessing the precision of birth weight information obtained years after birth, the present study aimed to verify the agreement between the birth weight reported by the parent/guardian of the adolescents participating in the Study of Cardiovascular Risks in Adolescents (Portuguese acronym ERICA), in the state of Rio de Janeiro, and the birth weight of the same adolescents registered in the National Information System on Live Births (Portuguese acronym Sinasc). Furthermore, we investigated whether socioeconomic and demographic characteristics influenced the degree of agreement between the birth weight reported by a parent/guardian and the one registered in Sinasc.

Methods
The linkage was conducted through the probabilistic record linkage method between the ERICA database (the Parent's/Guardian's Questionnaire, completed by the parent/guardian of the adolescent participants born in the state of Rio de Janeiro) and the annual Sinasc databases from 1996 to 2002, also from the state of Rio de Janeiro, obtained from the State Department of Vital Health Data of Rio de Janeiro.
ERICA was a school-based, national, multicenter, cross-sectional study that assessed adolescents from 12 to 17 years old enrolled in public and private schools in Brazilian cities with over 100,000 inhabitants. Further details on the protocol of the study can be found in Bloch et al. 9 Data collection for ERICA took place from March 2013 to December 2014, and this study was approved by the research ethics committee of the Federal University of Rio de Janeiro.
The probabilistic linkage was conducted using the software OpenRecLink, 6 a database linkage system that uses a probabilistic technique following the Fellegi and Sunter model 10 to match records from two databases that share no unequivocal common identifier. The ERICA and Sinasc databases went through standardization, blocking and record pairing, following the steps and parameters recommended by Camargo Jr. and Coeli. 11 Initially, characters that are not part of the mother's name, such as accents, numbers and unusual characters that could interfere in the linkage process, were removed from both databases.
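As an illustration of the Fellegi and Sunter scoring idea mentioned above, the following minimal sketch computes a pair score as a sum of log2 agreement and disagreement weights. The field names and the m and u probabilities are hypothetical values chosen for the example; they are not the parameters used in this study or the defaults of OpenRecLink.

```python
import math

def field_weight(agree: bool, m: float, u: float) -> float:
    """Log2 weight of one compared field.

    m: probability that the field agrees in true pairs (assumed value).
    u: probability that the field agrees in false pairs (assumed value).
    """
    if agree:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))

def pair_score(agreements: dict, params: dict) -> float:
    """Sum the field weights of a candidate pair."""
    return sum(field_weight(agreements[f], *params[f]) for f in agreements)

# Hypothetical parameters, for illustration only.
PARAMS = {"mother_name": (0.92, 0.01), "birth_date": (0.90, 0.05)}

score_match = pair_score({"mother_name": True, "birth_date": True}, PARAMS)
score_nonmatch = pair_score({"mother_name": False, "birth_date": False}, PARAMS)
```

Pairs in which both fields agree receive a positive score and pairs in which both disagree receive a negative one, which is why a score threshold can be used to select pairs for manual review.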
Seventeen blocking steps were conducted, applying different keys built from combinations of phonetic codes (Soundex of the first name and Soundex of the last name), gender, day, month and year of birth, and the code of the city of birth, ranging from stricter to less strict key combinations. For the record pairing, variations of the mother's name and the birth date were used.
The scores ranged from -6,877 to 10,693 (lower and upper limits) and, applying predefined rules, all the records with scores above zero were manually reviewed at the end of each blocking step to assign the status of the pair (true or false). At the end of the process, out of 2,671 adolescents from ERICA who went through the linkage, 1,822 (68.2%) were identified in Sinasc.
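The blocking idea can be sketched as follows. The example uses the standard American Soundex algorithm and a hypothetical key made of the Soundex codes of the first and last name, gender and year of birth; the phonetic coding implemented in OpenRecLink and the exact keys used in the seventeen steps may differ.

```python
# Standard American Soundex letter groups.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit

def soundex(name: str) -> str:
    """Return the 4-character Soundex code of a name ('' if no letters)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate equal codes
            prev = digit
    return (code + "000")[:4]

def blocking_key(full_name: str, gender: str, birth_year: int) -> str:
    """Hypothetical blocking key: Soundex(first) + Soundex(last) + gender + year."""
    parts = full_name.split()
    return f"{soundex(parts[0])}{soundex(parts[-1])}{gender}{birth_year}"
```

Records are only compared when their keys are equal, so spelling variants such as "Souza" and "Sousa" fall in the same block, while a less strict key (for example, dropping the year of birth) widens the block.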
The nutritional status of the adolescents was classified by the body mass index (BMI = weight/height²), expressed in Z-score values, using the WHO reference. 12 The cut-off points used were: very low weight, Z-score < -3; low weight, Z-score ≥ -3 and < -2; eutrophy, Z-score ≥ -2 and ≤ 1; overweight, Z-score > 1 and ≤ 2; obesity, Z-score > 2. This information was used to compare subgroups of adolescents with and without the Guardian's Questionnaire, and linked and non-linked.
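The cut-off points above can be expressed as a small function. This is only a sketch: it assumes the BMI-for-age Z-score has already been computed from the WHO reference tables, which are not reproduced here, and the function names are illustrative.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2

def classify_bmi_z(z: float) -> str:
    """Classify nutritional status from a BMI-for-age Z-score."""
    if z < -3:
        return "very low weight"
    if z < -2:
        return "low weight"
    if z <= 1:
        return "eutrophy"
    if z <= 2:
        return "overweight"
    return "obesity"
```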
Three different methods were used for the socioeconomic classification: an economic classification criterion (socioeconomic criteria according to the Brazilian Association of Research Companies, Portuguese acronym ABEP) that combines possession of assets with the education level of the householder, 13 grouping the five original classes in three (A = A1 and A2; B = B1 and B2; C = C, D and E); the mother's educational level, classified in four categories (incomplete primary school, including illiterate mothers; complete primary school; complete secondary school; and complete tertiary school); and the administrative nature of the adolescent's school (public or private).
For the birth weight classification, the weights registered in Sinasc were considered low birth weight (Portuguese acronym BPN) when lower than 2,500 g, adequate birth weight (Portuguese acronym APN) when equal to or higher than 2,500 g and lower than 4,000 g, and elevated birth weight (Portuguese acronym EPN) when equal to or higher than 4,000 g. 14,15 The differences between the birth weights reported by the guardians and the birth weights registered in Sinasc were compared across the categories of the socioeconomic indices and the birth weight categories through the chi-square test. The differences between the values reported by the guardians and those registered in Sinasc (birth weight reported by the guardians minus birth weight registered in Sinasc) and their absolute values were described with means and standard deviations.
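A minimal sketch of the classification and of the difference summaries described above is shown below. The function names and the example weights are hypothetical and serve only to illustrate the calculation.

```python
from statistics import mean, stdev

def classify_birth_weight(grams: float) -> str:
    """Classify a birth weight in grams into the three categories used here."""
    if grams < 2500:
        return "low"        # BPN
    if grams < 4000:
        return "adequate"   # APN
    return "elevated"       # EPN

def difference_summary(reported, registered):
    """Summarize reported minus registered weights (in grams)."""
    diffs = [r - s for r, s in zip(reported, registered)]
    abs_diffs = [abs(d) for d in diffs]
    return {
        "mean_difference": mean(diffs),
        "mean_absolute_difference": mean(abs_diffs),
        "sd_absolute_difference": stdev(abs_diffs),
    }

# Hypothetical weights, for illustration only.
summary = difference_summary([3000, 3200, 2400], [3000, 3100, 2500])
```

In the example the overestimate and the underestimate cancel in the mean difference but not in the mean absolute difference, which is why both measures are reported.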
The agreement between the birth weights reported by the guardians of the adolescents in ERICA and the ones registered in Sinasc was estimated by the intra-class correlation coefficient (ICC), 16 to obtain a summary measure of agreement between reported and registered information, and through the Bland-Altman graphic analysis. 17 The agreement in the classification of birth weight into the low, adequate and elevated categories was also verified by Cohen's Kappa index, 16,18 the Brennan-Prediger Coefficient or G-Index 19,20 and Gwet's Agreement Coefficient. 21,20 Kappa is the most frequently used index to assess agreement; however, its method of calculating the agreement expected by chance, according to the algorithm proposed by Cohen, results in overestimation of that chance agreement. 21 Furthermore, Kappa is highly influenced by the marginal distribution of the table, tending to zero when the prevalences are very heterogeneous, as is the case for the prevalence of the different weight categories described in the literature. 22
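The three categorical coefficients differ only in how the agreement expected by chance is computed, as the following sketch for two sources of classification shows. It uses the usual unweighted definitions (Cohen: product of the marginals; Brennan-Prediger: 1/q; Gwet's AC1: average marginal times its complement, divided by q - 1). The example table is hypothetical and is not data from this study.

```python
from collections import Counter

CATEGORIES = ("low", "adequate", "elevated")

def agreement_coefficients(pairs, categories=CATEGORIES):
    """Cohen's kappa, Brennan-Prediger and Gwet's AC1 for (reported, registered) pairs."""
    n = len(pairs)
    q = len(categories)
    observed = sum(1 for a, b in pairs if a == b) / n
    row = Counter(a for a, _ in pairs)
    col = Counter(b for _, b in pairs)
    p_row = {c: row[c] / n for c in categories}
    p_col = {c: col[c] / n for c in categories}
    # Chance agreement under each definition.
    pe_kappa = sum(p_row[c] * p_col[c] for c in categories)
    pe_bp = 1 / q
    pi = {c: (p_row[c] + p_col[c]) / 2 for c in categories}
    pe_ac1 = sum(pi[c] * (1 - pi[c]) for c in categories) / (q - 1)

    def chance_corrected(pe):
        return (observed - pe) / (1 - pe)

    return {
        "observed": observed,
        "kappa": chance_corrected(pe_kappa),
        "brennan_prediger": chance_corrected(pe_bp),
        "gwet_ac1": chance_corrected(pe_ac1),
    }

# Hypothetical 100 pairs with a dominant "adequate" category.
example_pairs = (
    [("adequate", "adequate")] * 88 + [("low", "low")] * 4
    + [("elevated", "elevated")] * 2 + [("low", "adequate")] * 2
    + [("adequate", "low")] * 2 + [("adequate", "elevated")] * 1
    + [("elevated", "adequate")] * 1
)
result = agreement_coefficients(example_pairs)
```

With one dominant category, Cohen's chance agreement is large and Kappa falls well below the observed agreement, while the Brennan-Prediger coefficient and AC1 stay close to it.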

Results
Of 5,042 adolescents assessed in ERICA in the state of Rio de Janeiro, 355 individuals born in 1995 were excluded because this year was not included in the analysis due to structural differences in Sinasc. Adolescents who did not bring back the Guardian's Questionnaire (n=1,702), or whose questionnaire lacked the mother's name (n=39), which is necessary for the linkage process, were also excluded, as were 275 adolescents who were not born in the State of Rio de Janeiro, leaving 2,671 records of adolescents to be linked with the Sinasc databases. In the Sinasc databases from Rio de Janeiro, 1996 to 2002, 102,550 live births without the mother's name were excluded, resulting in 1,699,591 records. Out of the 2,671 Guardian's Questionnaires from ERICA adolescents submitted to linkage, 1,822 were identified in Sinasc, meaning that the probabilistic linkage presented a sensitivity of 68.2%. Of the 1,822 pairs identified as true, 154 were excluded for not containing birth weight information (one from Sinasc and 153 from ERICA), leaving 1,668 adolescents in the final sample.
The 849 non-linked adolescents had a mean age of 15.2 years (standard deviation = 1.4); most of them were female, had adequate weight and studied in a public school (Table 1). Among the pairs identified as true, most were also female, had adequate weight and studied in a public school. The 1,668 individuals in the final sample had a mean age of 14.3 years (SD = 1.4); most were female, 19.5% were overweight and 10.5% obese, almost half of them had a mother with a level of education equal to or lower than primary school, and most studied in public schools. In that same sample, 5.2% of the adolescents were from class A and one fourth were from classes C and D of the ABEP economic classification (2011); according to Sinasc records, 7.4% were born with low birth weight (BPN) and 4.8% with elevated birth weight (EPN).
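The sample flow described above can be checked with simple arithmetic. The sketch below uses only the counts given in the text; the variable names are illustrative.

```python
assessed = 5042
born_1995 = 355
no_questionnaire = 1702
no_mother_name = 39
born_outside_rj = 275

# Records submitted to linkage.
submitted = assessed - born_1995 - no_questionnaire - no_mother_name - born_outside_rj

linked = 1822
missing_birth_weight = 154  # one from Sinasc and 153 from ERICA

sensitivity_pct = round(100 * linked / submitted, 1)
non_linked = submitted - linked
final_sample = linked - missing_birth_weight
```

The results reproduce the figures in the text: 2,671 records submitted to linkage, 849 non-linked adolescents, a linkage sensitivity of 68.2% and a final sample of 1,668.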
Tavares BM et al.
In the category without information about the mother's level of education in the Guardian's Questionnaire, 66.3% of the respondents were the mother herself. In the other categories of the mother's level of education, this percentage varied from 76.6% (incomplete primary school) to 81.8% (complete secondary school). It was also observed that the percentage of adolescents who studied in a public school in the category without information about the mother's level of education was 94.4%, and in the other categories it varied from 31.0% (tertiary school) to 94.4% (incomplete primary school).
In total, approximately 38% of the weights reported by the guardians were a perfect match to the weight in Sinasc. The percentage of answers without discrepancies was higher among mothers with complete secondary or tertiary school, and larger discrepancies (differences greater than 100 g) were more frequent among mothers who did not complete primary school and in the category with no information on the mother's level of education (Table 2). The ICC for the total sample shows that high agreement was achieved, that is, the variability between the weight reported by the guardians and the one registered in Sinasc for each adolescent was low when compared to the variability between adolescents. The agreement is a little higher if analyzed only in the subset in which the mothers answered the questionnaire (ICC = 0.90; 95%CI = 0.88 to 0.91) and higher still if we exclude 45 pairs with differences greater than 1.0 kg, which are most likely not true pairs (ICC = 0.93; 95%CI = 0.92 to 0.94). A high degree of agreement is also found in all the categories of the analyzed variables, with the exception of the elevated birth weight category. The agreement increases with the mother's educational level (Table 3).
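The interpretation of the ICC given above, within-adolescent variability compared with between-adolescent variability, can be illustrated with a one-way random-effects ICC for two measurements per adolescent. The source does not state which ICC form was used, so this form is an assumption, and the weights in the example are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for (reported, registered) pairs.

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), with k = 2 measurements per subject.
    """
    n = len(pairs)
    k = 2
    subject_means = [(a + b) / k for a, b in pairs]
    grand_mean = sum(subject_means) / n
    # Between-subject mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, subject_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical weights in grams, for illustration only.
icc_identical = icc_oneway([(3000, 3000), (3500, 3500), (2500, 2500)])
icc_noisy = icc_oneway([(3000, 3100), (3500, 3400), (2500, 2600)])
```

When the two sources agree exactly the within-subject mean square is zero and the ICC equals 1; discrepancies of 100 g that are small relative to the spread of birth weights between adolescents still give an ICC close to 1.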
The agreement measures for the classification of birth weight into the low, adequate and elevated categories (Table 4), represented by Gwet's first-order agreement coefficient (AC1), show a high degree of overall agreement in all categories of the sociodemographic variables. Cohen's Kappa presented lower agreement values, and the Brennan-Prediger coefficient presented values closer to Gwet's first-order agreement coefficient.
The high agreement between reported birth weights and birth weights registered in Sinasc can also be seen in the Bland-Altman plot (Figure 1), in which most of the point cloud lies inside the 95% limits of agreement and is concentrated along the reference line of no difference between the reported birth weights and the Sinasc birth weights. The highest frequency of overestimation occurred between 3,000 g and 4,200 g of the mean of the reported and registered birth weights, whereas the highest frequency of underestimation occurred between 2,200 g and 3,000 g.
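The quantities plotted in a Bland-Altman analysis can be computed as in the sketch below: for each adolescent, the difference (reported minus registered) is plotted against the mean of the two values, and the limits of agreement are taken as the mean difference plus or minus 1.96 standard deviations. The weights in the example are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(reported, registered):
    """Return the bias, the 95% limits of agreement and the plotted points."""
    diffs = [a - b for a, b in zip(reported, registered)]
    means = [(a + b) / 2 for a, b in zip(reported, registered)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "bias": bias,
        "lower_limit": bias - 1.96 * sd,
        "upper_limit": bias + 1.96 * sd,
        "points": list(zip(means, diffs)),  # (x, y) coordinates of the plot
    }

# Hypothetical weights in grams, for illustration only.
ba = bland_altman([3000, 3200, 2400, 3600], [3000, 3100, 2500, 3500])
```

A positive bias indicates that, on average, the guardians report a higher weight than the one registered.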

Table 2
Differences between the birth weight reported by the guardians of the adolescents and the weight registered in Sinasc, total and according to sociodemographic and birth weight classification variables. Rio de Janeiro, 2013 and 2014.

Table 3
Means, standard deviations (SD), mean differences, mean absolute differences and intraclass correlation coefficients (ICC) of the birth weights (g) referred by the guardians of the adolescents and registered in Sinasc, total and according to sociodemographic and birth weight variables. Rio de Janeiro, 2013 and 2014.

data on live births using the linkage method to verify the agreement between the Sinasc information and the data from a perinatal survey, and they verified a good reliability for birth weight data in the Sinasc, with high agreement (kappa=0.94) between low birth weight and perinatal research.

This study found a positive association between the mother's level of education and the precision of the reported birth weight. The highest precision was found in the answers of mothers with the highest level of education, and the lowest precision in the information given by mothers with a lower level of education. The agreement measured by the ICC among guardians who did not report the mother's level of education, for any reason, is similar to the one found in the category of mothers with a lower educational level. Still, the confidence intervals of the mean absolute differences do not allow us to state that these means differ meaningfully between the categories of the economic variables, mother's educational level, type of school or birth weight.
Reports of birth weight validation studies conducted in Brazil are scarce. In a cohort study conducted in Pelotas, Victora et al. 5 verified that in 61.2% of the cases the reported weight was exactly the same as the real weight. However, in this study the children's age varied between 9 to 15

Discussion
The birth weight reported by the guardian was identical to the weight in Sinasc in less than half (38%) of the sample of adolescent participants of ERICA in the State of Rio de Janeiro. However, when cases in which the difference was lower than 100 g are included, the similarity between the reported and the registered weights rises to 58%. This means that in only a little more than half of the sample could the weight reported by the parents be used as a substitute for the record made at the time of birth. This should be expected, since the time elapsed between the two pieces of information was at least 12 years, the age of the youngest adolescents who participated in ERICA.
As for the classification of birth weight adequacy, the agreement between the information reported during adolescence and the one registered at birth was high. Therefore, the use of the weight reported by guardians can result in relative imprecision in attributing the birth weight of each individual, but it is satisfactory for classifying individuals into the traditional categories (low, adequate or elevated birth weight). Silva et al. 24 evaluated the quality of registries in the hospital records, and concluded that Sinasc is an excellent information source to identify low birth weight births.
The higher frequency of answers with differences greater than 100 g in the category of guardians who did not report the mother's level of education suggests that these mothers probably had a low level of education, since the percentage of adolescents who studied in public schools in this category was practically identical to that among mothers with the lowest reported level of education. Furthermore, among those who did not report the mother's level of education, a smaller proportion of mothers acted as informants, which may have contributed to the discrepancies between the weights registered in Sinasc and reported in ERICA. Victora et al. 5 also observed a trend towards lower accuracy of information among mothers with a lower level of education.
a way that the chronological distance between the pieces of information was much shorter than in the present study with the adolescents from ERICA. Still in the Victora et al. 5 study, 79% of the answers presented a difference smaller than 100 g, and 89.5% smaller than 250 g, without any relevant tendency towards overestimation or underestimation. The same study also observed that mothers who had never attended school gave incorrect answers more frequently than the others. Victora et al. 5 considered hospital records as the real weight, whereas in this study we considered the weights registered in Sinasc as the true weight. Filha et al. 25 verified a high level of concordance (ICC over 0.90) for the birth weight registered in Sinasc, comparing it with data obtained from interviews with puerperae and from a survey of hospital records. Almeida et al. 26 also compared information from live birth certificates, which are used to fill Sinasc, with those obtained in interviews with the mothers, and the
In the total sample, the difference between the mean reported and registered weights was less than 20 grams. However, these differences varied somewhat between the categories of the variables analyzed in this study. The largest difference, approximately 100 grams, occurred only among the adolescents who were born with low weight, with the higher mean weight being the one reported by the guardians. This could be expected as a tendency of the guardians to attribute a better weight value to those who were born with low weight. To a lesser degree, the same happened with those who were born with elevated weight and with those who were born with adequate weight, so that overall there is a modest overestimation of birth weight by the guardians. This can be the result of a long-standing popular belief that babies who are big or have elevated weight are "healthy", becoming something for the parents to be proud of, despite the absence of evidence for this in the literature. Some studies have reported that childhood obesity and overweight are not perceived correctly by guardians, and that many of these guardians are not concerned about children's overweight. 27,28 A German study evaluated the maternal perception of the child's silhouette and showed that some of the mothers of younger children preferred overweight silhouettes for the child. 29
In the case of school types, the mean overestimation in public schools was approximately five times larger than in private ones, even though the largest difference was only a little over 20 grams. This finding might be related to the higher frequencies of adolescents of lower socioeconomic classes and of mothers with a low level of education in public schools. Victora et al. 5 verified greater errors and overestimation of birth weight responses in families with lower family income.
The present study found a high overall agreement between the reported weights and the weights registered in Sinasc according to the ICC, which means that the variability between the values reported in ERICA and registered in Sinasc for each individual was considerably lower than the variance between the birth weights of all adolescents in the sample. This finding, in addition to the small differences between the means already described, is important to support association studies in which birth weight is one of the ERICA variables analyzed. A high degree of agreement was also observed by Gwet's agreement coefficient when birth weight was classified into the low, adequate and elevated categories. This means that there is high agreement between the BPN, APN and EPN categories as reported and as registered in Sinasc. In fact, Gayle et al. 30 had already suggested that maternally reported birth weights are sufficiently accurate for research and programmatic purposes when birth certificate information is not readily available.
Although the agreement measured by Cohen's Kappa was only reasonable, the agreement according to Gwet's coefficient was very high. This indicator is more stable and reliable than the usual Kappa because it depends less on the prevalence distribution and on the homogeneity of the marginals, which is relevant given the frequencies of the three weight categories in the ERICA sample. 21
The results of this study show that the groups of adolescents who brought back the Guardian's Questionnaire, linked and non-linked, and those who did not bring it back, were homogeneous in relation to several characteristics; therefore, the losses of guardians' questionnaires and the non-linked records did not bias the results. It is necessary to highlight that the questionnaires with birth weight information were delivered by the adolescents, who took them to their respective guardians to answer at home and returned them later. It is therefore possible that some of the guardians consulted vaccination cards, which characterizes a possible information bias. The impact of this possible information bias is an increase in the accuracy of the birth weight data, increasing the concordance between the reported birth weight and the one registered in Sinasc.
This study has some limitations that must be mentioned. The final sample is not representative of the adolescents of the State of Rio de Janeiro, because ERICA's complex sampling design does not provide representativeness at the level of the Federation Units. In addition, only 36.1% of the 5,042 adolescent participants of ERICA in the State of Rio de Janeiro were assessed in this study, mainly due to losses in the return of the Guardian's Questionnaire and to the merely reasonable efficiency of the linkage process. The Guardian's Questionnaire was self-completed, which might have led to losses and misinterpretation due to illegibility and transcription errors, besides the dependency on the memory of the respondent (memory bias) after a minimum period of 12 years.
It is also possible that there are typing or transcription mistakes in the Sinasc databases. Another source of error is the database linkage process, since it is probabilistic, with imperfect sensitivity and specificity, despite the manual review of the doubtful pairs. 7,8 In a study to assess the sensitivity of the probabilistic linkage between the data of the Pro-Saúde Study and the Sinasc databases, Coutinho et al. 7 verified a sensitivity for the identification of births of 60.9% in the reduced strategy and of 72.8% in the amplified strategy. The present study presented an intermediate sensitivity (68.2%) between these extremes. Therefore, this study is susceptible to information biases that can influence and distort the association measures.
The birth weight values reported in ERICA, in the State of Rio de Janeiro, by the guardians of adolescents from 12 to 17 years old presented reasonable degrees of agreement at the individual level with the values registered soon after birth in Sinasc. However, the agreement between mean weights was very high, which means that these can be used safely in population studies. The results also suggest that the mother's level of education is associated with the agreement between the reported birth weight and the birth weight registered in Sinasc: the higher the level of education, the higher the precision and agreement of this measurement.
However, before the present findings and recommendations are applied more widely, it is important that similar studies be conducted in other regions of the country.

Figure 1
Graphic analysis (Bland-Altman) of the mean difference and 95% confidence interval of the reported values minus the values registered in Sinasc, in relation to the mean of the values reported and registered in Sinasc, in adolescents born and assessed in the Study of Cardiovascular Risk in Adolescents (ERICA) in the State of Rio de Janeiro, 2013 and 2014.

Table 1
Distribution of sociodemographic variables, gender, birth weight and nutritional status in the linked and non-linked groups, in the final sample, and in adolescents without the Guardian's Questionnaire. Rio de Janeiro, 2013 and 2014.
Source: ERICA and Sinasc. ABEP: Socioeconomic criteria according to the Brazilian Association of Research Companies. *Pearson's chi-square test (between final sample and adolescents without Guardian's Questionnaire).

Table 4
Kappa, G-index and 95% confidence intervals for the agreement between the birth weight classifications in the low, adequate and elevated categories, as reported by the guardians of the adolescents and as registered in Sinasc, total and according to sociodemographic variables. Rio de Janeiro, 2013 and 2014.