Telephone survey: post-stratification adjustments to compensate for non-coverage bias in the city of Rio Branco, Northern Brazil

OBJECTIVE: To evaluate the effects of using post-stratification weights to correct the bias due to low coverage of households with telephones.
METHODS: Results collected by the Household Survey were compared with those of VIGITEL (Telephone Survey to Monitor Risk and Protective Factors for Chronic Diseases) in Rio Branco, Northern Brazil, in 2007, where landline coverage was 40%. The potential bias in the VIGITEL survey was expressed by the difference between the prevalence rates of VIGITEL and the Household Survey, with the square root of the mean square error (RMSE) used as a measure of the accuracy of the estimate.
RESULTS: The VIGITEL weighting procedure corrected potential bias in the prevalence of consumption of fruit and vegetables, consumption of meat with visible fat, smoking, bad self-assessment of health status and morbidity of cholesterol or triglycerides. For the prevalence of physical activity in leisure time and morbidity of asthma, asthmatic bronchitis, chronic bronchitis or emphysema, the procedure adopted by VIGITEL did not reduce the potential bias.
CONCLUSIONS: In order to construct post-stratification weights which minimize the potential bias in estimates due to low coverage of households with landlines, it is necessary to use alternative weighting methods and strategies for selecting external variables.
DESCRIPTORS: Health Surveys, methods. Interviews as Topic. Bias (Epidemiology). Data Collection, methods.

INTRODUCTION
The Ministry of Health, Department of Health Monitoring (SVS/MS), established the Telephone Survey System for Monitoring Risk and Protection Factors for Chronic Illness (VIGITEL) in the 26 Brazilian state capitals and the Federal District in 2006. This system collects information on risk factors, such as smoking, excessive consumption of foods high in saturated fat, being overweight, sedentary lifestyle and excessive consumption of alcohol. Protective factors for which data were collected included exercise, fruit and vegetable consumption and cancer prevention.a
Six years after its implementation, VIGITEL has been consolidated as a tool for collecting data. Compared to the Household Survey, the telephone survey has various advantages, such as low operational cost and rapid dissemination of results.16,17,22 However, the existing literature raises three main questions: (a) the validity of estimates obtained in the Telephone Survey, due to the exclusion of households without a landline,2,4,11,18,b,c (b) increasing non-response9,d and (c) the methodological procedures for obtaining valid estimates.11,12,14,c
The experience of VIGITEL in Brazil differs from that of other countries, such as the Behavioral Risk Factor Surveillance System (BRFSS)d in the United States, with regard to levels of non-response. In the more than twenty years in which this methodology has been used in the United States, a decline in the response ratec,e was observed in the last decade, whereas VIGITEL progressively reduced its refusal rate (2.5% in the latest edition). This contributes to reducing bias and to strengthening this strategy in the country.f
Although the number of households in Brazil with a landline has increased since 2000, the results of the 2007 National Household Survey (PNAD), carried out by the Brazilian Institute of Geography and Statistics (IBGE),g show estimates ranging from around 33% (Palmas, Northern Brazil) to 78% (São Paulo, Southeastern Brazil) of households in urban areas of the capitals served by at least one residential telephone line (RTL). In the municipality of Rio Branco, Northern Brazil, 40% of households had access to a residential telephone. The lowest figures were found in the North and Northeast (40% and 48%, respectively). In the Midwest, this rate was 57%, whereas the highest figures were in the South and Southeast (71% and 76%, respectively).
Due to the exclusion of the population without a telephone, VIGITEL uses weighting to calibrate statistical inferences, aiming to correct possible bias introduced by low rates of landline coverage. This study aimed to analyze the effects of using post-stratification to correct bias due to low landline coverage.


METHODS
Prevalence rates of risk and protection factors for non-communicable chronic illness found in a household survey and in a telephone survey were compared. The prevalence rates of the household survey were treated as population values b because its sample covered households both with and without a landline. The study assumes that the prevalence rates obtained in the household survey show negligible bias and that the VIGITEL and household samples, whose response rates were 71% and 87% respectively, are independent.
The VIGITEL and the household survey were carried out in Rio Branco, AC, in 2007. Around 40% of private residences had at least one telephone line, according to data from the IBGE's PNAD.g In the household survey, 1,515 adults aged over 18 were interviewed between March 2007 and September 2008 (64% of the interviews were carried out in 2007). VIGITEL interviewed 2,010 adults aged over 18 between July and December 2007. The two studies used the same questionnaire on the subjects of Exercise, Food Intake, Smoking, Alcohol, Perception and Reported Morbidity. The data from the household survey were used to identify prevalence rates associated with owning a landline and to characterize the socio-demographic profile of the population excluded from VIGITEL.
The first stage of the study consisted of selecting the variables for evaluating the potential bias in VIGITEL. Because data collection continued into 2008, a hypothesis test (t-test) was carried out to test the equality of the parameters of the variables obtained in 2007 and 2008 ($H_0: P_{2007} = P_{2008}$), with a significance level of 5%. Of the 29 variables suggested, 18 were selected to study the potential VIGITEL bias. The variables selected were: food intake (beans five or more days/week; fruit, legumes and vegetables (FLV) regularly; recommended FLV; fatty meat; whole milk; soft drinks five or more days/week); physical activity during leisure time; sedentary lifestyle; excessive consumption of alcohol; smoker; ex-smoker; self-evaluation of health as very bad; and reported morbidities (high blood pressure; diabetes; heart attack, stroke or cerebrovascular accident; high cholesterol or triglycerides; asthma, asthmatic bronchitis, chronic bronchitis or emphysema; osteoporosis). The variables (Y) of the study were qualitative and dichotomized (1 = yes; 0 = no).
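The 2007 versus 2008 equality check can be sketched as follows. This is only an illustration under stated assumptions: it swaps in a large-sample two-proportion z-test for the t-test named in the text, it ignores the complex sample design, and the function name and all counts are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided large-sample test of H0: p1 == p2 for two independent samples.

    x1, x2 are the numbers of 'yes' answers; n1, n2 are the sample sizes.
    Returns the z statistic and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)) for the standard normal
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts for one dichotomous variable (not taken from the study)
z, p_value = two_proportion_z_test(x1=200, n1=970, x2=120, n2=545)

# A variable would be retained when equality between years is not rejected at 5%
keep_variable = p_value >= 0.05
```

A design-based analysis would replace the simple standard error with one that accounts for the sampling weights and clustering.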
The adults in the study were divided into two groups: those who had a residential telephone line (RTL) and those who did not. This allowed the prevalence rates of the two groups to be compared by testing the hypothesis of a difference in means between the populations with and without telephones (5% significance). These data allowed the socio-demographic profile of the adult population excluded from the VIGITEL system to be characterized. The multiple logistic regression model h was used, $\log\left(\frac{\pi(x)}{1-\pi(x)}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$, in which $\pi(x)$ expresses the probability of not having access to an RTL, given the characteristics $x_p$ (age group, reported skin color, schooling and number of members in the household). The variables which describe the profile of those resident in households without a landline were used to construct post-stratification weights, to reduce potential bias introduced by the sample design. The independent variables are qualitative, and the last category is taken as the reference. The results of the multiple logistic regression are expressed as the odds ratio between a specific category of $x_p$ and the reference category. An odds ratio of 1 indicates that the odds are equal in both groups. Values > 1 indicate how many times greater the odds are for the first group.
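The link between a logistic regression coefficient and the reported odds ratio can be illustrated with a short sketch. The coefficients and variable names below are hypothetical, chosen only to show that the odds ratio of a category against the reference equals the exponential of its coefficient.

```python
import math


def logistic(eta):
    """Inverse logit: probability corresponding to a linear predictor."""
    return 1.0 / (1.0 + math.exp(-eta))


def odds(p):
    """Odds of an event that has probability p."""
    return p / (1.0 - p)


def odds_ratio(p_group, p_reference):
    """Odds of the event in a group relative to the reference category."""
    return odds(p_group) / odds(p_reference)


# Hypothetical coefficients (not estimates from the study)
beta_0 = -0.5             # intercept: reference category of schooling
beta_low_schooling = 1.2  # indicator for the lowest schooling category

# Probability of NOT having an RTL in each group
p_reference = logistic(beta_0)
p_low_schooling = logistic(beta_0 + beta_low_schooling)

# Matches exp(beta_low_schooling) up to rounding error
or_low_schooling = odds_ratio(p_low_schooling, p_reference)
```

Here the odds ratio is greater than 1, so under these assumed coefficients the odds of lacking an RTL are higher in the low-schooling group than in the reference group.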
The sample distributions of the household survey and of VIGITEL were adjusted to the same population so that the two samples could be compared.
Post-stratification weights according to age, sex and schooling were used, with data from the 2007 PNAD g as the external source for constructing the weights. The VIGITEL system used the cell weighting method 11,d to obtain post-stratification weights according to age, sex and schooling, with the aim of reducing potential bias due to low landline coverage. These weights were constructed using the 2000 Census as the external data source and were made available in the database. To control for the time effect, new post-stratification weights were constructed based on PNAD g data. These weights were calculated as the ratio between the relative frequency of the population estimated from the PNAD g and that of the VIGITEL sample in each cell. These frequencies were obtained using the respective sample weights and the variables of the complex sample design. The effect of the post-stratification weighting on the variables collected by VIGITEL was expressed by the difference between the prevalence weighted by the sample weight alone and the prevalence weighted by the final weight.
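The cell-ratio construction described above can be sketched as follows. This is a minimal illustration, assuming two hypothetical cells and invented counts; the function names are not from VIGITEL, and in practice the sample counts would be sums of sample weights rather than raw counts.

```python
def poststratification_weights(population_counts, sample_counts):
    """Weight per cell = population share of the cell / sample share of the cell."""
    pop_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    weights = {}
    for cell, pop_n in population_counts.items():
        sample_n = sample_counts.get(cell, 0)
        if sample_n == 0:
            # An empty cell has no respondents to weight; it must be merged first
            raise ValueError(f"cell {cell!r} is empty in the sample")
        weights[cell] = (pop_n / pop_total) / (sample_n / sample_total)
    return weights


def weighted_prevalence(records, weights):
    """Prevalence of a 0/1 variable; records are (cell, y) pairs."""
    numerator = sum(weights[cell] * y for cell, y in records)
    denominator = sum(weights[cell] for cell, _ in records)
    return numerator / denominator


# Hypothetical two-cell example: low schooling is under-represented in the sample
population = {"low": 700, "high": 300}
sample = {"low": 40, "high": 60}
weights = poststratification_weights(population, sample)

# Hypothetical answers: 20 of 40 'yes' in the low cell, 12 of 60 in the high cell
records = ([("low", 1)] * 20 + [("low", 0)] * 20
           + [("high", 1)] * 12 + [("high", 0)] * 48)

unweighted = sum(y for _, y in records) / len(records)
weighted = weighted_prevalence(records, weights)
effect_of_weighting = weighted - unweighted
```

In this invented example the weighting raises the estimate because the group with the higher prevalence is the one under-represented in the sample.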
The potential bias of the VIGITEL survey was expressed by the difference between the prevalence rates of the VIGITEL and household surveys, $B = \hat{p}_{\mathrm{VIGITEL}} - \hat{p}_{\mathrm{household}}$. 5,13 The mean square error, $\mathrm{MSE} = \mathrm{Var}(\hat{p}_{\mathrm{VIGITEL}}) + B^2$, 5,13 was used to measure total error, composed of sampling error and systematic error. The square root of the MSE (RMSE) gives the expected distance between the VIGITEL prevalence rate and the population value.
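A small numeric sketch may help fix the relationship between bias, MSE and RMSE. It assumes the usual decomposition of the MSE into the sampling variance plus the squared bias; the prevalence values and the standard error are hypothetical and the function names are illustrative.

```python
import math


def bias(p_vigitel, p_household):
    """Signed bias: VIGITEL estimate minus the household (reference) value."""
    return p_vigitel - p_household


def mean_square_error(variance, b):
    """Total error: sampling variance plus squared systematic error."""
    return variance + b ** 2


def root_mean_square_error(variance, b):
    """RMSE, expressed on the same scale as the prevalence itself."""
    return math.sqrt(mean_square_error(variance, b))


# Hypothetical inputs (not results from the study)
p_vigitel = 0.18         # weighted VIGITEL prevalence
p_household = 0.21       # household survey prevalence, taken as the population value
standard_error = 0.04    # standard error of the VIGITEL estimate

b = bias(p_vigitel, p_household)  # negative: VIGITEL underestimates
rmse = root_mean_square_error(standard_error ** 2, b)
```

With these invented inputs the bias is -3 percentage points and the RMSE is 5 percentage points, which shows how a modest bias and a modest standard error combine into the total error.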

RESULTS
Significant differences were found in seven of the 18 variables when comparing the prevalence rates of the groups who did and did not possess an RTL (Table 1). The group with an RTL differed in the prevalence of regular fruit, legumes and vegetable consumption (FLV), consumption of fatty meat, physical activity during leisure time, smoker, self-evaluation of health as very bad and in the prevalence of reported morbidities of high cholesterol or triglycerides and asthma, asthmatic bronchitis, chronic bronchitis or emphysema.
h Paula GA. Modelos de regressão com apoio computacional. São Paulo: Instituto de Matemática e Estatística da USP; 2004.
The odds of an adult not having an RTL diminished as schooling increased. The odds also diminished as the number of adults in the household and the age group increased, and were higher among those who reported their skin color as not white (Table 2).
Due to the exclusion of households without a landline, the age pyramid of the VIGITEL sample differed from that of the population of the PNAD g in 2007. This difference also occurred for years of schooling. Using the weighting procedure to correct statistical inference, the sample distribution of the VIGITEL survey was adjusted to the population estimated using the PNAD g as an external source. With post-stratification weighting in the statistical analyses, the age pyramid estimated from the VIGITEL sample matched that of the population. The same occurred for the distribution of years of schooling (Figure).
In VIGITEL, the weighting increased the prevalence rates of smoking, alcohol consumption and food intake, except for the regular consumption of FLV. The weighting decreased four of the six estimated prevalence rates for reported morbidity. The estimated prevalence rates for exercise, consumption of whole milk and self-evaluation of health as very bad changed only by decimal fractions (Table 3).
The prevalence rates of soft drink consumption on five or more days/week, consumption of whole milk, ex-smoker and reported morbidity of high blood pressure and of asthma, asthmatic bronchitis, chronic bronchitis or emphysema were underestimated by more than 3% in the VIGITEL survey. The prevalence rates of fatty meat consumption, sedentary lifestyle, excessive alcohol consumption, self-evaluation of health as very bad and reported morbidities of heart attack, diabetes, high cholesterol and osteoporosis showed levels of bias between -3% and 3%. VIGITEL overestimated the prevalence of consumption of beans on five or more days/week [...] (Table 4).
The median error in the prevalence rates estimated by VIGITEL was 2.18%, expressed as the square root of the mean square error (RMSE). Notably, the errors in the prevalence rates for consumption of soft drinks and whole milk, physical activity during leisure time and reported morbidity of high blood pressure were in the third quartile (Table 4).

DISCUSSION
In 2007, 41% of private residences in the city of Rio Branco had a landline, according to data from the household survey. The results of the survey indicate seven variables with potential bias in the estimates due to low landline coverage. These variables are: regular FLV consumption, fatty meat consumption, physical activity during leisure time, smoking, self-evaluation of health as very bad and reported morbidities of high cholesterol or triglycerides and asthma, asthmatic bronchitis, chronic bronchitis or emphysema.
The results of the household survey show that those without a landline are concentrated in the groups with lower levels of schooling, among those whose reported skin color was not white, among individuals aged 18 to 34 and in households with one or two inhabitants. These results are consistent with those of other studies which identify the profile of individuals without a landline. 9,i
Of the seven prevalence rates indicated by the household survey as having potential bias due to the association between prevalence and owning a landline, the VIGITEL post-stratification weighting partially eliminated bias in the prevalence rates of regular FLV consumption, fatty meat consumption, smoking and reported morbidity of high cholesterol or triglycerides. The weighting procedure adopted by the system did not correct the bias in the prevalence rates of asthma, asthmatic bronchitis, chronic bronchitis or emphysema and of physical activity during leisure time.
Recent Brazilian studies evaluate the magnitude of the bias found in results produced by the VIGITEL system by comparing them with telephone or household surveys. Viacava et al 21 evaluated the bias of mammogram coverage in women between 50 and 60 in the 26 state capitals and the Federal District, comparing the results of VIGITEL 2007 with data from the 2003 and 2006 PNAD. The bias was expressed by the absolute [...]
VIGITEL underestimated the prevalence rates which composed the group of risk factors. This result is consistent with those presented by Battaglia et al, 2 who evaluated the effect of weighting to correct bias in the estimates due to a 50% non-response rate.
Béland & St-Pierre 3 found similar results for the rates of prevalence of smokers and alcohol consumption.
The results here show that VIGITEL tends to underestimate prevalence rates in areas of low landline coverage. The risk indicators (smoking, consuming soft drinks and fatty meat, self-evaluation of health as very bad) are more prevalent in populations with low levels of schooling. 1,10,15,20 'Protective' indicators (consuming FLV and exercise) are more prevalent among populations with higher levels of schooling. However, they tend to be overestimated. 7 The difference between the surveys was expressed by the ratio of prevalence as a means of detecting bias.
In the evaluation of potential bias in the prevalence of high blood pressure, the results from Rio Branco and Campinas show the presence of bias in the results presented by the VIGITEL system in municipalities with low and high landline coverage, respectively. Ferreira et al 6 showed that post-stratification weighting reduces bias in the prevalence of high blood pressure. When evaluating mammogram coverage, Viacava et al 21 showed that bias in the VIGITEL survey diminished as landline coverage increased, varying from 3.4% in São Paulo to 24.2% in João Pessoa.
Segri et al 19 detected significant difference between [...]
These studies show the presence of bias in the prevalence of high blood pressure in municipalities with low and high landline coverage and in mammogram coverage in municipalities with high coverage. These differences can be attributed to the databases, to the study population, to differences in the methodology used to estimate bias or to the statistical techniques used in the analyses.
Those without landlines are concentrated among people of lower socioeconomic status. In Rio Branco, this group showed higher prevalence rates for variables associated with risk factors, such as smoking, consumption of fatty meat and self-evaluation of health as very bad. Those without a landline had lower prevalence rates of physical activity during leisure time and of regular FLV consumption.
The VIGITEL weighting procedure corrected potential bias due to low landline coverage in five of the seven variables indicated by the household survey. For the prevalence rates of physical activity during leisure time and reported morbidity of asthma, asthmatic bronchitis, chronic bronchitis or emphysema, however, the procedure adopted by VIGITEL did not reduce the bias. These results need to be validated in other state capitals with low landline coverage. Studies comparing VIGITEL and household surveys in cities with high landline coverage are also necessary in order to validate the results found in the municipality of Rio Branco, AC.
Alternative weighting methods and strategies for selecting external variables are needed to construct post-stratification weights which minimize the potential bias in the prevalence rates of physical activity during leisure time and reported morbidity of asthma, asthmatic bronchitis, chronic bronchitis or emphysema. These will be explored in future analyses.
Recent studies evaluating the potential bias in VIGITEL diverge in detecting potential bias in high blood pressure and in mammogram coverage. These differences can be attributed to the database, the study population, differences in the methodology used to estimate bias or the statistical techniques used in the analyses.
This reinforces the importance of treating the methodology for estimating bias due to non-response and low coverage as an object of epidemiological research.
In total, 36 cells were formed, composed of sex (F; M), age group (18 to 24, 25 to 34, 35 to 44, 45 to 54, 55 to 64, and 65 and over) and years of schooling (0 to 8, 9 to 11, and 12 or more). The cell composed of women in the 65 and over age group with 12 or more years of schooling was grouped together with the cell containing women aged between 55 and 64, totaling 35 cells. The final weight attributed to each individual interviewed by VIGITEL was the sample weight multiplied by the post-stratification weight.
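The cell structure described above can be enumerated in a few lines. This is a sketch under one explicit assumption: the merged cell is taken to be women aged 55 to 64 with the same schooling level (12 or more years), which the text does not state. The labels and function names are illustrative.

```python
from itertools import product

SEXES = ("F", "M")
AGE_GROUPS = ("18-24", "25-34", "35-44", "45-54", "55-64", "65+")
SCHOOLING = ("0-8", "9-11", "12+")

# 2 sexes x 6 age groups x 3 schooling levels = 36 cells
cells = list(product(SEXES, AGE_GROUPS, SCHOOLING))


def collapse(cell):
    """Merge women aged 65+ with 12+ years of schooling into the 55-64 cell.

    Assumption: the target cell has the same schooling level.
    """
    if cell == ("F", "65+", "12+"):
        return ("F", "55-64", "12+")
    return cell


# The set removes the duplicate created by the merge, leaving 35 cells
collapsed_cells = sorted({collapse(cell) for cell in cells})


def final_weight(sample_weight, poststratification_weight):
    """Final weight of a respondent: sample weight times the cell weight."""
    return sample_weight * poststratification_weight
```

For example, a respondent with a hypothetical sample weight of 2.0 in a cell with a hypothetical post-stratification weight of 1.5 would receive `final_weight(2.0, 1.5)`, that is, 3.0.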

Table 1. Difference in prevalence rates of the variables in question according to landline ownership in the household survey. Rio Branco, Northern Brazil, 2007.

Table 2. Estimate of the odds ratio associated with not having an RTL obtained in the multiple logistic regression model. Household survey. Rio Branco, Northern Brazil, 2007.

i Thornberry OT Jr, Massey JT. Correcting for undercoverage bias in Random Digit Dialed National Health Surveys. Washington (DC): National Center for Health Statistics; 1978 [cited 2010 Oct 25]. Available from: http://www.amstat.org/sections/srms/proceedings/papers/1978_045.pdf

Figure. Pyramid of age and distribution of the variable schooling according to the household survey. Rio Branco, AC, 2007.

Table 3. Difference between weighted prevalence rates according to the variables in question. Household survey. Rio Branco, Northern Brazil, 2007.

Table 4. Bias and mean square error (MSE) of prevalence rates according to VIGITEL. Rio Branco, Northern Brazil, 2007.