Assessing the validity of a food frequency questionnaire among low-income women in São Paulo, southeastern Brazil

This study describes the validity of a food frequency questionnaire (FFQ) in 93 low-income women (20-65 years) participating in a case-control study in São Paulo, Brazil. Two FFQs (FFQ1 and FFQ2, 12 months apart) and three 24-hour dietary recalls (24hR) were conducted between 2003 and 2004 to estimate dietary intake during the past year. Pearson correlation coefficients (crude, energy-adjusted and de-attenuated) were used for comparisons between the FFQ and the 24hR. Agreement between the methods was further examined by Bland-Altman analysis. For the assessment of long-term reliability, the energy-adjusted intra-class correlation coefficients were mostly around 0.40, but higher for vitamin A and folate (0.50-0.56). Energy-adjusted, attenuation-corrected Pearson validity correlations between the FFQ and the 24hR ranged from 0.30-0.54 for macronutrients and from 0.20-0.48 for micronutrients, with a higher value for calcium (0.75). There were small proportions of grossly misclassified nutrient intakes, while Bland-Altman plots indicated that the FFQ is accurate in assessing nutrient intake at a group level.

Food Consumption; Validation Studies; Questionnaires; Women
Cardoso MA et al. Cad. Saúde Pública, Rio de Janeiro, 26(11):2059-2067, nov, 2010

Table 1
Mean, standard deviation (SD) and intra-class Pearson correlation coefficients between first and second food frequency questionnaires (FFQ1 and FFQ2) (n = 93).

Introduction
In epidemiologic studies, self-reported dietary intake is often used to establish diet-disease associations. The food frequency questionnaire (FFQ) has been the dietary assessment method most frequently used in large-scale studies, primarily because it is easy to administer, is less expensive than other dietary assessment methods, and provides a rapid estimate of usual intake 1.
Over the years, investigators have recognized that the reported values from FFQ are subject to substantial error, both systematic and random 2. A limited number of validation studies of FFQ have been conducted in developing countries. People from low-income households have less nutritionally adequate diets, live for long periods of time on limited incomes, and have lower literacy, numeracy and language skills 3. Issues such as questionnaire format, participant motivation, perceived study burden, and repeated administration of the same instrument can each influence our ability to gather reliable and valid self-reported dietary intake data 4. Results usually show good test-retest agreement for foods and nutrients, but with high variability in dietary intake 5. Thus, rather than evaluating the validity of a questionnaire against a "gold standard", it has been discussed whether and by how much the measurement characteristics of dietary assessment instruments differ, especially in research designed to address dietary quality, food inequality or diet as part of health inequalities 3.
We developed an FFQ to assess dietary intake in an urban female population for use in a case-control study designed to investigate the relationship between dietary factors, serum vitamin concentrations and cervical cancer in São Paulo city, Brazil. The present study illustrates an assessment of the validity of an FFQ against 24hR, using different statistical approaches: Pearson correlation, the Bland-Altman method and joint classification analysis.

Subjects and methods

Study population and design
Participants were from the Brazilian Investigation into Nutrition and Cervical Cancer Prevention (BRINCA) Study, a hospital-based case-control study designed to investigate the relationship between dietary factors, serum vitamin concentrations and cervical cancer in São Paulo city. The exclusion criteria were pregnancy, breastfeeding, hysterectomy, positive test for HIV, bleeding, mental disturbance, and radiotherapy or chemotherapy treatments. Between March 2003 and May 2005, 1,676 women completed the study protocol by filling out a questionnaire in three major public hospitals: the Brazilian Institute for Cancer Research, the Leonor Mendes de Barros Hospital and the Pérola Byington Hospital. In the main study, reported elsewhere 6, we aimed to prospectively recruit 453 controls and 497 newly diagnosed cases of cervical intraepithelial neoplasia. Eligible women were residents of São Paulo aged 21-65 years who had no prior hysterectomy, previous treatment for cervical neoplasia or history of cancer. During the same period, control women were selected from among those attending screening in the same clinics where cases were diagnosed. Cases and controls were invited to participate in the study and interviewed before the colposcopy examination to minimize differential recall bias. The Institutional Review Board at the School of Public Health of the University of São Paulo and the Medical Ethical Committees of all participating hospitals approved the study protocol, and written informed consent was obtained from each participant.
Participants of the BRINCA Study were invited to take part in the validation study of the FFQ. The sample size for the validation study was calculated using a standard formula for correlation coefficients 7, and the number of subjects required was about 100. However, a random sub-sample of 145 participants from the main study was invited for this study to cover losses or potential incomplete responses over time. One of the authors (E.C.L.) contacted the participants within 2 weeks after the first FFQ interview at the hospital (FFQ1). Overall, three 24hR and two FFQ interviews (12 months apart) were conducted between 2003 and 2004 to estimate dietary intake during the past year. We collected three dietary recalls from each participant by telephone. Between September and November 2003, 145 participants completed the first dietary recall. Subsequently, 119 (82%) participants completed the second dietary recall and 93 (64%) completed the third, approximately 6 and 12 months after the first dietary recall, respectively. Of the initial study sample, 28 subjects (19% of 145) had moved or changed phone number, 22 (15%) could not be contacted after 5 calls, including on weekend days, and 2 (1%) refused to participate. After finishing the last dietary recall, the subjects completed the second FFQ (FFQ2). The standardized telephone interviews were conducted by trained nutritionists (E.C.L. and L.Y.T.) blinded to the group assignment in the main study. Women reported, in household measures, their dietary intake from breakfast of the previous day until breakfast the next morning; in the last interview this was followed by the FFQ2 questions on habitual food intake over the past year. Thus, for the present analysis, the validity of the FFQ was assessed among 93 participants (64 controls and 29 cases) from the main study.

Self-administered FFQ
All participants answered a short version of an FFQ previously validated for use with Japanese-Brazilians 8,9 and adapted to be used at the primary health care level 10. Women were asked about their usual frequency of consumption of 76 food items and their portion sizes, an open-ended food section, and vitamin and mineral supplements during the previous year. Questions concerning use of sauce, frequency of intake of visible fat and type of fat used in cooking procedures were also asked. Nutrients were analyzed using the Dietsys software version 4.01 (National Cancer Institute, Bethesda, USA) 11. The nutrient database was based primarily on the U.S. Department of Agriculture (USDA) publications supplied by Dietsys and supplemented by the Brazilian Standard Food Composition Table 12.

24-hour diet recalls
The 24hR were conducted by telephone, collecting information on recipes, portion size and volume. A standardized script was used for the dietary recall by telephone, as developed in a previous study 13. The database of the World Food Dietary Assessment System (WFood version 2.0; Office of Technology Licensing, University of California Berkeley, Berkeley, USA) was used to estimate the nutrient composition of diets. The nutrient database was based on the USDA publications supplied by WFood.

Statistical analysis
Means and standard deviations (SD) were calculated for total nutrient intake from each administration of the FFQ and 24hR. Since most nutrient distributions were skewed toward higher values, all variables were natural-log (ln) transformed before statistical analysis. Subsequently, to account for the correlation due to total energy intake, we calculated energy-adjusted nutrient intakes, which indicate the nutrient composition of the diet, by regressing nutrient intake on total energy 7. As a measure of long-term reliability, Pearson intra-class correlation coefficients were used to compare the two FFQs for both unadjusted and energy-adjusted nutrients 3. The validity of the FFQ was evaluated by comparing mean nutrient intake data obtained from FFQ2 with mean nutrient intake data obtained from the three dietary recalls. FFQ2 was used for the validation analysis since the reference period for FFQ1 did not correspond to the time period covered by the dietary recalls. The validation approach for the FFQ in this study used three methods: Pearson correlation, Bland-Altman and joint classification analysis.
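The energy adjustment described above can be illustrated with a minimal sketch of the residual method. The function name, the sample values and the choice to add back the predicted intake at mean ln energy are assumptions made for illustration; the original analysis was run in SPSS and its exact steps are not reported.

```python
import math
from statistics import mean


def energy_adjust(nutrient, energy):
    """Return energy-adjusted ln(nutrient) values (residual method sketch)."""
    x = [math.log(e) for e in energy]      # ln total energy
    y = [math.log(v) for v in nutrient]    # ln nutrient intake
    x_bar, y_bar = mean(x), mean(y)
    # Ordinary least-squares fit of ln nutrient on ln energy
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual plus the predicted value at mean ln energy (which equals y_bar);
    # adding this constant back is an assumption, not stated in the article.
    return [yi - (intercept + slope * xi) + y_bar for xi, yi in zip(x, y)]


# Hypothetical intakes for five women (not study data)
energy_kcal = [1500, 1800, 2000, 2200, 2600]
fat_g = [50, 55, 70, 68, 90]
adjusted = energy_adjust(fat_g, energy_kcal)
```

By construction, the adjusted values are uncorrelated with ln energy, so any remaining association with another dietary method reflects nutrient composition rather than total intake.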
Pearson correlation coefficients (crude and energy-adjusted) were used to assess the association between nutrient intake estimates from the second FFQ and the mean of three dietary recalls, covering the same 12-month period. Attenuation of the Pearson correlation coefficients between 24hR and FFQ, caused by intra-individual day-to-day variation in the nutrient intakes among subjects, was corrected by taking into account the within-to-between person variance ratio calculated from 3 days of 24hR. We de-attenuated the crude correlation coefficients by multiplying them by the factor $\left[1 + (s^2_{\text{intra}}/s^2_{\text{inter}})/n\right]^{1/2}$, where $n$ is the number of repeated 24-hour dietary recalls, $s^2_{\text{intra}}$ is the intra-individual variance and $s^2_{\text{inter}}$ is the inter-individual variance between the 24hR 7.
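As a hedged illustration of the de-attenuation factor, the sketch below estimates the two variance components from repeated recalls with a one-way analysis of variance and applies the correction. The ANOVA estimator, the function names and the example numbers are assumptions; the article states only the correction factor itself.

```python
import math
from statistics import mean


def variance_components(recalls):
    """Estimate (s2_intra, s2_inter) from per-person lists of repeated recalls.

    Assumes every person has the same number of recalls (balanced design).
    """
    k = len(recalls)        # number of persons
    n = len(recalls[0])     # recalls per person
    person_means = [mean(p) for p in recalls]
    grand_mean = mean(person_means)
    ms_within = sum(
        (v - m) ** 2 for p, m in zip(recalls, person_means) for v in p
    ) / (k * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2 for m in person_means) / (k - 1)
    s2_intra = ms_within
    s2_inter = max((ms_between - ms_within) / n, 0.0)
    return s2_intra, s2_inter


def deattenuate(r_obs, s2_intra, s2_inter, n):
    """Multiply r_obs by [1 + (s2_intra / s2_inter) / n] ** 0.5."""
    if s2_inter <= 0:
        raise ValueError("inter-individual variance must be positive")
    return r_obs * math.sqrt(1 + (s2_intra / s2_inter) / n)


# Hypothetical example: variance ratio of 2 with three recalls per person
r_true = deattenuate(0.40, s2_intra=2.0, s2_inter=1.0, n=3)
```

With a within-to-between variance ratio of 2 and three recalls, the factor is the square root of 5/3, so an observed correlation of 0.40 is raised to roughly 0.52.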
Bland-Altman limits of agreement (LOA) 14 were used to evaluate the level of agreement between the two dietary methods across the range of intakes. The LOA define the limits within which 95% of the differences are expected to fall (mean ± 2 SD of the differences). Ln transformation was performed since the dietary data were skewed. As described by Flood et al. 15, after energy adjustment and ln transformation of the data, the difference in nutrient intake between the two methods (lnFFQ - lnDR) was plotted against the mean of the paired intake values [(lnFFQ + lnDR)/2]. The antilogs of the LOA were then taken, providing a ratio of FFQ to dietary recall. The ratios were multiplied by 100 to express them as percentages, with 100% representing perfect agreement. The dependency between the two methods was tested by fitting a regression line of the differences on the means: if the two methods are equally variable, the correlation between the differences and the means would equal zero.
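The Bland-Altman steps above can be sketched as follows. This is an illustrative outline only: the function name, the returned keys and the sample values are hypothetical, and the significance test for the slope is omitted.

```python
import math
from statistics import mean, stdev


def bland_altman_percent(ffq, recall):
    """Mean agreement and limits of agreement as FFQ/recall percentages."""
    ln_f = [math.log(v) for v in ffq]
    ln_r = [math.log(v) for v in recall]
    diffs = [f - r for f, r in zip(ln_f, ln_r)]        # lnFFQ - lnDR
    means = [(f + r) / 2 for f, r in zip(ln_f, ln_r)]  # (lnFFQ + lnDR) / 2
    d_bar, sd = mean(diffs), stdev(diffs)
    lower, upper = d_bar - 2 * sd, d_bar + 2 * sd      # mean ± 2 SD
    # Slope of differences regressed on means; a slope near zero suggests
    # that the error does not depend on the level of intake.
    m_bar = mean(means)
    sxx = sum((m - m_bar) ** 2 for m in means)
    slope = sum((m - m_bar) * (d - d_bar) for m, d in zip(means, diffs)) / sxx
    return {
        "mean_agreement_pct": 100 * math.exp(d_bar),
        "loa_pct": (100 * math.exp(lower), 100 * math.exp(upper)),
        "slope": slope,
    }


# Hypothetical intakes where the FFQ reads 10% above the recall for everyone
result = bland_altman_percent([110, 220, 330], [100, 200, 300])
```

In this toy case the mean agreement is 110%, the limits collapse onto that value because every ratio is identical, and the slope is zero. Real data would give wider limits and possibly a non-zero slope.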
Misclassification error was assessed by dividing mean intake values from the FFQ and dietary recall into quintiles. Then the percentage of agreement between the two methods (classification in the same quintile) was estimated. Quadratic-weighted kappa statistics were used for comparison across quintiles of nutrient intake from FFQ and dietary recall.
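A minimal sketch of the joint classification analysis is given below, assuming rank-based quintile assignment and treating classification into opposite extreme quintiles as gross misclassification. Both choices, along with all function names, are assumptions for illustration rather than the procedure reported for SPSS.

```python
def quintile_ranks(values):
    """Assign each value a quintile index 0-4 by rank (ties broken by order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    quintiles = [0] * len(values)
    for rank, i in enumerate(order):
        quintiles[i] = rank * 5 // len(values)
    return quintiles


def cross_classification(a, b):
    """Proportions in the same, same-or-adjacent, and opposite quintiles."""
    n = len(a)
    same = sum(x == y for x, y in zip(a, b)) / n
    within_one = sum(abs(x - y) <= 1 for x, y in zip(a, b)) / n
    gross = sum(abs(x - y) == 4 for x, y in zip(a, b)) / n
    return same, within_one, gross


def quadratic_weighted_kappa(a, b, k=5):
    """Cohen's kappa with quadratic disagreement weights for k categories."""
    n = len(a)
    observed = [[0] * k for _ in range(k)]
    for i, j in zip(a, b):
        observed[i][j] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2
            num += w * observed[i][j]           # observed disagreement
            den += w * row[i] * col[j] / n      # disagreement expected by chance
    return 1 - num / den


# Hypothetical intakes for ten women, already in increasing order
ffq_q = quintile_ranks(list(range(10)))
kappa_same = quadratic_weighted_kappa(ffq_q, ffq_q)
kappa_reversed = quadratic_weighted_kappa(ffq_q, [4 - q for q in ffq_q])
```

Identical rankings give a kappa of 1 and fully reversed rankings give -1, which brackets the values that real FFQ and recall data can take.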
All statistical analyses were performed using the SPSS software version 12.0 (SPSS Inc., Chicago, USA).

Results
Table 1 shows the mean and standard deviation of the nutrient intakes estimated by FFQ1 and FFQ2, excluding supplements. The intra-class correlations measuring the reproducibility of the unadjusted nutrients from the FFQs spaced one year apart ranged from 0.33 for vitamin B12 to 0.59 for cholesterol and 0.58 for energy. Adjustment of the nutrient intake for total energy before testing reproducibility slightly decreased almost all correlations, with the exception of fiber, vitamins A, C and B12, folate and calcium. The highest energy-adjusted intra-class correlation coefficients were observed for folate and vitamin A (0.50 and 0.56, respectively).
Table 2 presents the mean values of nutrient estimates from the three dietary recalls and the second FFQ, with validity parameters obtained from three different statistical approaches. After correction for within-person variation, the de-attenuated Pearson correlation coefficients were slightly higher than the energy-adjusted values, ranging from 0.24 to 0.54 for most nutrients. The lowest coefficients were observed for vitamins A (-0.003), E (0.16) and B12 (0.05); the highest were observed for calcium (0.75), fiber (0.54) and folate (0.48). The mean agreement between the FFQ and the dietary recalls ranged from 87% for fiber to 218% for vitamin E. Although the mean agreement was around 100% for most nutrients, the FFQ could under- or overestimate intake relative to the dietary recall. The narrowest LOA were found for carbohydrate and the widest for vitamin E. On the other hand, weighted kappa values mostly showed good agreement for nutrient categories (around 80%); the highest proportion of subjects correctly classified was observed for calcium (88.2%).
Figures 1 and 2 illustrate the Bland-Altman analysis for calcium and vitamin E, respectively, the nutrients with the best and worst validity parameters. A fitted regression line indicated a significant linear trend (p < 0.001) for both nutrients, even after ln transformation of the skewed data. That is, a dependency existed between the difference of the two methods and the average of the two methods; at extreme intake levels, a larger error between the FFQ and dietary recall can be expected. On the other hand, Figure 3 illustrates the Bland-Altman analysis for total fat, which had a lower energy-adjusted and de-attenuated Pearson coefficient (0.43) than calcium (0.75). However, a good percentage agreement was observed for total fat (mean = 103, LOA 90-118) without a significant linear trend between the differences of the two methods across intake levels; that is, the two methods were equally variable.

Discussion
In light of the difficulty that women living with limited resources and low literacy have in expressing their habitual food intake, the correlation results of our FFQ were mostly fair to good between repeated long-term administrations, suggesting a reliable method of assessing nutrient intake in our study population. For validity purposes, our results suggest that the FFQ is a valid tool, based on Bland-Altman plots and correlation coefficients, for protein, total fat, carbohydrate, cholesterol, fiber, vitamin C, riboflavin, folate, calcium and magnesium (with energy-adjusted and de-attenuated correlations higher than 0.30 for nutrients comparing the FFQ with three 24hR). The high proportion of nutrient intakes classified within one quintile suggests that the FFQ adequately ranks the nutrient intake of the study group. Nevertheless, caution should be taken when using this FFQ among women with similar characteristics, since the Bland-Altman limits of agreement illustrate that the FFQ can overestimate or underestimate consumption across intake levels for the following nutrients: vitamins A, E, B6 and B12, thiamin and zinc. The long-term reliability (1-year interval) in this study was lower for all nutrients (ranging from 0.30 to 0.56) when compared with the short-term reliability (ranging from 0.52 to 0.75, within a 1-month interval) observed for the original version of this FFQ in Japanese-Brazilian women with higher education levels 9. However, it is worth noting that the test-retest results for the FFQ in the present study are even better than those from previous research on low-income Brazilian workers carried out by Fornés et al. 5. In that study, involving 104 women and men aged 18-60 years, the energy-adjusted intra-class correlation coefficients ranged from 0.23 to 0.34 in a 5-month interval between the two FFQ, suggesting poor agreement for reproducibility. As expected, the long-term reliability of an FFQ is generally lower than the short-term reliability, since real changes in diet could occur in addition to memory bias 16.

Figure 1
Bland-Altman method of assessing agreement between the food frequency questionnaire (FFQ) and 24-hour dietary recall (24hR) for dietary calcium intake, after natural-log transformation, applied to data from 93 low-income women.

Figure 2
Bland-Altman method of assessing agreement between the food frequency questionnaire (FFQ) and 24-hour dietary recall (24hR) for dietary vitamin E intake, after natural-log transformation, applied to data from 93 low-income women.

Figure 3
Bland-Altman method of assessing agreement between the food frequency questionnaire (FFQ) and 24-hour dietary recall (24hR) for dietary fat intake, after natural-log transformation, applied to data from 93 low-income women.
In the present study, we found lower correlations after energy adjustment for most nutrients in the validation analysis, except for vitamin C and folate, whose habitual dietary sources seem to be less dependent on total energy intake. Similar results were found by Slater et al. 17 in a validation study of an adolescent food frequency questionnaire applied at a public school in São Paulo. These lower correlations may be explained by an increase in correlated measurement error as a consequence of controlling for total energy intake. Therefore, additional information on Bland-Altman limits of agreement is useful to evaluate the level of agreement between the two dietary methods across the range of intakes, as is the assessment of misclassification error. The differences between the FFQ and dietary recall for estimation of energy intake were previously reported and can be explained by underreporting in the dietary recall 18,19.
Previous validation studies in Brazilian women have not reported relevant information on Bland-Altman analyses 20,21. As suggested in a consensus statement on methods for assessing FFQ 1, Bland-Altman limits of agreement illustrate whether the FFQ is overestimating or underestimating nutrient consumption across intake levels. In this study, correlation coefficients were very low and Bland-Altman limits of agreement too wide for vitamins A, E and B12, indicating the difficulty in estimating such nutrients because they may be highly concentrated in some foods. Similar findings have been reported in earlier validation studies carried out in Latin-American countries 22,23. The inclusion of more days of recall would possibly improve these results 23. However, for nutrients such as vitamin E, even 12 days of diet recording may be insufficient to accurately measure habitual intake 9.
The main limitation of the present study is the use of dietary recalls as a reference method since, like the FFQ, they rely on memory, leading to errors such as under- or over-estimation of food consumption. As shown in published data from other validation studies conducted in adult populations in both developed and developing countries 24,25, dietary records were used as the gold standard in most validation studies. Dietary records are likely to have the least correlated errors, whereas dietary histories are considered the least appropriate gold standard 25. However, repeated 24hR are less demanding for participants with low literacy. Thus, multiple 24hR were shown to be the most appropriate method for studies of diet and nutrition including subjects with low income and/or low literacy 3.
In conclusion, depending on the use of the nutrient intake estimates, different methods of analysis should be used in assessing the validity of an FFQ in similar populations, providing information about the performance of such a method across different levels of nutrient intake. Although our FFQ was reasonable for ranking individuals according to their nutrient intakes, estimates of absolute consumption should be used with caution.

Contributors
M. A. Cardoso participated in the planning of the research; data analysis and interpretation; and writing up of the article. L. Y. Tomita collaborated in data collection and processing; analysis and interpretation of results; and revision of the article. E. C. Laguna participated in the collection and processing of the data; analysis of results and final revision of the article.

Table 2
Mean energy and nutrient intakes of three 24-hour diet recalls (24hR) and the second food frequency questionnaire (FFQ2) and validity assessed by Pearson correlation coefficients, limits of agreement with p-value for slope, and joint classification of subjects by quintiles of nutrient intakes over 12 months in 93 women.