Validity of indicators of physical activity and sedentariness obtained by telephone survey

OBJECTIVE: To assess the reliability and validity of indicators of physical activity and sedentariness obtained by means of a telephone-based surveillance system.
METHODS: Reliability and validity studies were carried out in two random subsamples (n=110 and n=111, respectively) obtained from the total sample (N=2,024) of adults (≥18 years) studied by the system in the municipality of São Paulo in 2005. Studied indicators included frequency of “sufficiently active during leisure time,” “inactive in four domains of physical activity (leisure, work, transportation, and housework),” and “habit of watching television for long periods.” Reliability was assessed by comparing results of the original telephone interview with those of another identical interview repeated after seven to 15 days. Validity was assessed by comparing the results of the telephone interview with those of three 24-hour recalls (reference method) carried out in the week following the original interview.
RESULTS: Frequencies obtained for the three evaluated indicators were either identical or very similar for the first and second telephone interviews. Kappa coefficients ranged from 0.53 to 0.80, indicating good reliability for all indicators. In relation to the reference method, all indicators showed 80% or higher specificity, and sensitivity values were 69.7% for “watching television for long periods,” 59.1% for “inactive in four domains,” and 50% for “sufficiently active during leisure.”
CONCLUSIONS: The indicators of physical activity and sedentariness included in the system seem reliable and sufficiently accurate. If kept operational in coming years, this system may provide Brazil with a useful instrument for evaluating public policies aimed at promoting physical activity and controlling non-transmissible chronic diseases associated with sedentariness.
DESCRIPTORS: Life Style. Diet Surveys. Indicators of Quality of Life. Reproducibility of Results. Validity of Tests. Nutritional Surveillance. Physical Activity.


INTRODUCTION
Global estimates indicate that non-transmissible chronic diseases (NTCDs) account for roughly 60% of all deaths worldwide and almost half the global burden of disease.16 In Brazil, it is estimated that NTCDs account for almost two-thirds of all deaths with known cause.a The proportion of deaths due to NTCDs in Brazilian state capitals increased more than three-fold between the 1930s and the 1990s.4 In all regions of the globe, a small group of risk factors accounts for the great majority of NTCD deaths as well as a substantial fraction of the disease burden related to these diseases. Noteworthy among these factors are unhealthy diets and insufficient physical activity.16
In Brazil, the frequency and distribution of risk factors for NTCDs are monitored by means of a telephone-based surveillance system. This system, known as VIGITEL (Vigilância de Fatores de Risco e Proteção para Doenças Crônicas não Transmissíveis por Inquérito Telefônico [Surveillance of Risk and Protective Factors for Non-Transmissible Chronic Diseases by Telephone Interview]), has been in operation since 2006 in all 26 Brazilian state capitals as well as in the Federal District.a VIGITEL was tested successfully in the city of São Paulo in 2003,8 and was retested in this same city and in four other capitals in 2005. During the second test in São Paulo, a study of the reliability and validity of the indicators obtained was coupled to the system's normal operation. The present article describes the results pertaining to indicators of physical activity and sedentariness. The reliability and validity of diet-related indicators were described in Monteiro et al.9
a Ministério da Saúde. Secretaria de Vigilância em Saúde. Departamento de Análise de Situação em Saúde. Saúde Brasil 2006: uma análise da situação de saúde no Brasil. Brasília: Ministério da Saúde; 2006. 620 p.

METHODS
Two systematic subsamples, each with 115 subjects, were extracted from the total sample (N=2,024) of subjects aged 18 years or older surveyed by the VIGITEL system in the city of São Paulo, respecting the proportion of men and women in the total sample.a Five subjects from the first subsample (reliability) and four from the second (validity) either refused to participate in the study or did not complete the required interviews. The reliability study thus included 110 subjects (47 men and 63 women; mean age 45 years; 26.3% with up to 8 years of schooling; and 32.7% with 12 or more years of schooling), whereas the validity study included 111 subjects (50 men and 61 women; mean age 44 years; 34.2% with up to 8 years of schooling; and 27.0% with 12 or more years of schooling).
Indicators of physical activity in the VIGITEL system address sufficient physical activity during leisure time, simultaneous inactivity in four domains of physical activity (leisure, work, transportation to work, and household), and the habit of watching television for extended periods of time. Based on the answers provided by the subjects to the questions on physical activity in the VIGITEL questionnaire, the system classifies as "sufficiently active during leisure" subjects who report physical exercise or sport of moderate intensity for at least 30 minutes per day at least five days per week, or sport of vigorous intensity for at least 20 minutes per day at least three days per week. Subjects were classified as "inactive in four domains of physical activity" when they reported 1) not practicing sports or physical exercise at least one day per week; 2) "not walking on a regular basis" and "not carrying heavy loads on a regular basis" at work (or being unemployed for the last three months); 3) not commuting from home to work on foot or by bicycle; and 4) not being responsible for "heavy cleaning" at home. The intensity of exercise or sport reported by the subject is classified a posteriori by the system based on a compendium that estimates the energy expenditure associated with different forms of physical activity, attributing moderate intensity to exercise or sports associated with energy expenditures ranging from three to six times that of resting, and vigorous intensity to expenditures equivalent to six or more times that of resting.1 Finally, the status of "watching television for extended periods" was attributed to subjects who watched television three or more hours per day at least five days per week.
a Ministério da Saúde. VIGITEL Brasil 2006. Vigilância de fatores de risco e proteção para doenças crônicas por inquérito telefônico: estimativas sobre freqüência e distribuição sócio-demográfica de fatores de risco e proteção para doenças crônicas nas capitais dos 26 estados brasileiros e no Distrito Federal em 2006. Brasília: Ministério da Saúde; 2007.
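The three telephone-based classification rules described above can be summarized in a minimal sketch. The field names below are hypothetical (the wording of the VIGITEL questionnaire items is not reproduced here), and treating an expenditure of exactly six times resting as vigorous is an assumption made only to resolve the boundary between the two intensity ranges.

```python
from dataclasses import dataclass


@dataclass
class PhoneAnswers:
    """Hypothetical summary of one subject's answers (field names are illustrative)."""
    exercise_days_per_week: int
    exercise_minutes_per_day: int
    exercise_mets: float  # energy expenditure as a multiple of resting
    walks_regularly_at_work: bool
    carries_heavy_loads_at_work: bool
    commutes_on_foot_or_bicycle: bool
    does_heavy_cleaning: bool
    tv_hours_per_day: float
    tv_days_per_week: int


def sufficiently_active_in_leisure(a: PhoneAnswers) -> bool:
    # Vigorous (assumed: >= 6 times resting): >= 20 min/day on >= 3 days/week.
    if a.exercise_mets >= 6:
        return a.exercise_minutes_per_day >= 20 and a.exercise_days_per_week >= 3
    # Moderate (3 to < 6 times resting): >= 30 min/day on >= 5 days/week.
    if a.exercise_mets >= 3:
        return a.exercise_minutes_per_day >= 30 and a.exercise_days_per_week >= 5
    return False


def inactive_in_four_domains(a: PhoneAnswers) -> bool:
    # All four domains must be inactive at the same time.
    return (
        a.exercise_days_per_week < 1
        and not a.walks_regularly_at_work
        and not a.carries_heavy_loads_at_work
        and not a.commutes_on_foot_or_bicycle
        and not a.does_heavy_cleaning
    )


def watches_tv_for_extended_periods(a: PhoneAnswers) -> bool:
    # Three or more hours per day on at least five days per week.
    return a.tv_hours_per_day >= 3 and a.tv_days_per_week >= 5
```

Under these assumptions, an unemployed subject would simply have both work-related fields set to false, which matches the "(or being unemployed)" clause of the second criterion.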
For the reliability study, subjects were contacted by phone seven to 15 days after the original interview by the system, when they were asked to respond again to the block of 12 questions on physical activity. The second interviewer was always different from that of the original interview. The results of these two sequential interviews were compared in terms of frequency of "sufficiently active during leisure," "inactive in four domains of physical activity," and "watching television for extended periods of time," as well as of agreement between the individual classification of each subject with respect to these three indicators. In this last case, the degree of agreement between the two interviews was evaluated using the kappa coefficient, classified as follows: above 0.80 indicates virtually perfect agreement; 0.61 to 0.80, substantial agreement; 0.41 to 0.60, moderate agreement; 0.21 to 0.40, fair agreement; and below 0.21, slight agreement.3
For the validity study, subjects responded to three 24-hour recalls addressing physical activity. These surveys consisted of asking subjects to report in detail the type and duration of all physical activity performed in the 24 hours preceding the interview.10 In the specific case of the present study, if there was no spontaneous report of physical activity in any of the four domains investigated, we directly asked the subject about occasional physical activity in these domains, including type and duration. The same procedure was used for subjects who did not mention "watching television." The 24-hour recalls were administered via telephone in the week following the original interview by the system. Two of the recalls referred to weekdays and the third referred to a Saturday, Sunday, or holiday.
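As an illustration of the agreement analysis used in the reliability study, the sketch below computes Cohen's kappa for two paired series of binary classifications and maps the result onto the interpretation scale quoted above. The function names are illustrative, and the handling of values falling between the published bands (for example, between 0.60 and 0.61) is an assumption.

```python
from typing import Sequence


def cohen_kappa(first: Sequence[bool], second: Sequence[bool]) -> float:
    """Cohen's kappa for two paired binary classifications of the same subjects."""
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("both series must be non-empty and of equal length")
    # Observed proportion of subjects classified identically in both interviews.
    observed = sum(bool(a) == bool(b) for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal frequency of each interview.
    p1 = sum(bool(a) for a in first) / n
    p2 = sum(bool(b) for b in second) / n
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    if expected == 1:
        return 1.0  # both series are constant and identical
    return (observed - expected) / (1 - expected)


def agreement_label(kappa: float) -> str:
    """Interpretation bands as quoted in the text; band edges are an assumption."""
    if kappa > 0.80:
        return "virtually perfect"
    if kappa >= 0.61:
        return "substantial"
    if kappa >= 0.41:
        return "moderate"
    if kappa >= 0.21:
        return "fair"
    return "slight"
```

For example, with first-interview classifications `[True, True, False, False]` and second-interview classifications `[True, False, False, False]`, observed agreement is 0.75, chance agreement is 0.50, and kappa is 0.50, which falls in the "moderate" band.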
The validity study consisted of comparing the results of the regular VIGITEL telephone interview with those of the three 24-hour recalls (gold standard). We compared frequencies of "sufficiently active in leisure," "inactive in four domains of physical activity," and "watching television for extended periods," in addition to calculating, for each indicator, the degree of accuracy in classifying the (true) status of each subject as determined by the reference method. We considered as "sufficiently active in leisure" subjects who, in at least two of the three 24-hour recalls, reported performing physical exercise of moderate intensity for 30 minutes or of vigorous intensity for 20 minutes, the classification of exercise intensity being based on the same criteria described above. We considered as "inactive in four domains of physical activity" subjects who, in all three 24-hour recalls, failed to report physical exercise of any type, occupational activities implying walking (at least 30 minutes) or carrying heavy loads, transportation by bicycle or on foot to and from work, and activities related to "heavy cleaning" of the subject's own home. Finally, we classified as positive for "watching television for extended periods" subjects who reported watching television for at least three hours in at least two of the three 24-hour recalls.
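The reference classification based on the three 24-hour recalls can be sketched in the same way. The `Recall` record and its field names are hypothetical, and, as before, an expenditure of exactly six times resting is assumed to count as vigorous.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class Recall:
    """Hypothetical summary of one 24-hour recall (field names are illustrative)."""
    exercise_minutes: int = 0
    exercise_mets: float = 0.0  # multiple of resting energy expenditure
    work_walking_minutes: int = 0
    carried_heavy_loads: bool = False
    active_commute: bool = False  # on foot or by bicycle, to or from work
    heavy_cleaning: bool = False
    tv_minutes: int = 0


def _meets_exercise_target(r: Recall) -> bool:
    if r.exercise_mets >= 6:  # vigorous: at least 20 minutes
        return r.exercise_minutes >= 20
    if r.exercise_mets >= 3:  # moderate: at least 30 minutes
        return r.exercise_minutes >= 30
    return False


def _inactive_day(r: Recall) -> bool:
    return (
        r.exercise_minutes == 0
        and r.work_walking_minutes < 30
        and not r.carried_heavy_loads
        and not r.active_commute
        and not r.heavy_cleaning
    )


def classify_by_recalls(recalls: Sequence[Recall]) -> Dict[str, bool]:
    if len(recalls) != 3:
        raise ValueError("exactly three 24-hour recalls are expected")
    return {
        # Target reached in at least two of the three recalls.
        "sufficiently_active_in_leisure": sum(map(_meets_exercise_target, recalls)) >= 2,
        # No activity in any of the four domains in all three recalls.
        "inactive_in_four_domains": all(_inactive_day(r) for r in recalls),
        # At least three hours of television in at least two recalls.
        "tv_extended_periods": sum(r.tv_minutes >= 180 for r in recalls) >= 2,
    }
```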
The degree of accuracy of the telephone interview in determining the true status of each subject was assessed by calculating sensitivity and specificity for each indicator, i.e., the proportion of accurate classifications made by the telephone interview among subjects with "case" or "non-case" status, respectively, according to the reference method.13 In addition, in order to assess the validity of indicators obtained by telephone, we compared, based on the three 24-hour recalls, the mean and median daily number of minutes that subjects classified by the telephone interview as "cases" or "non-cases" for each indicator spent on 1) any sport or physical exercise; 2) total physical activity in the four domains studied (leisure, work, transportation to work on foot or by bicycle, and "heavy cleaning" of the home); and 3) watching television. Given the absence of normal distribution in the duration of the activities evaluated, the statistical significance of differences between groups was determined using the non-parametric test for difference between two medians.7
The study was approved by the Research Ethics Committee of the Faculdade de Saúde Pública da Universidade de São Paulo.
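For reference, sensitivity is the proportion of reference-method "cases" that the telephone interview also classifies as cases, and specificity is the proportion of reference-method "non-cases" that it also classifies as non-cases. A minimal sketch, with illustrative function and argument names:

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    phone: Sequence[bool], reference: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of `phone` against `reference`."""
    if len(phone) != len(reference):
        raise ValueError("both series must have the same length")
    pairs = [(bool(p), bool(r)) for p, r in zip(phone, reference)]
    true_pos = sum(p and r for p, r in pairs)
    false_neg = sum((not p) and r for p, r in pairs)
    true_neg = sum((not p) and (not r) for p, r in pairs)
    false_pos = sum(p and (not r) for p, r in pairs)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("reference must contain both cases and non-cases")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity
```

With telephone classifications `[True, True, False, False, False]` and reference classifications `[True, False, True, False, False]`, one of the two reference cases is detected (sensitivity 0.5) and two of the three reference non-cases are correctly classified (specificity about 0.67).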

RESULTS
Table 1 compares the results of the original VIGITEL interviews with those of the repeat interviews. Frequency of "sufficiently active during leisure" was identical for the two series of interviews (24.5%). Frequencies were very similar for "inactive in four domains" (24.6% and 23.6%, respectively), and "watching television for extended periods" (33.6% and 34.6%, respectively). The kappa coefficient indicates substantial agreement for "sufficiently active during leisure" (0.80) and for "inactive in four domains of physical activity" (0.78) and moderate agreement for "watching television for extended periods" (0.53).
Table 2 compares frequencies estimated by the VIGITEL interview with those of the gold standard. For "inactive in four domains," the difference between values obtained by telephone interview (22.5%) and by the 24-hour recalls (19.8%) was minimal. For the other two indicators, frequencies obtained by telephone interview tended to be slightly overestimated: 26.1% vs. 21.6% for "sufficiently active in leisure" and 35.1% vs. 29.7% for "watching television for extended periods." The telephone interview showed high specificity (close to or above 80%) for all three indicators. Sensitivity was 69.7% for "watching television for extended periods," 59.1% for "inactive in four domains of physical activity," and 50% for "sufficiently active during leisure." The mean time per day spent on physical exercise or sports of any nature, estimated based on the 24-hour recalls, was 31.8 minutes for subjects classified by the telephone interview as "sufficiently active during leisure," vs. 8.9 minutes for the remainder of subjects (median = 20 and zero minutes, respectively; p<0.001).
Mean time per day spent on physical activity in the four domains studied (leisure, work, transportation to and from work, and heavy cleaning at home) was 27.5 minutes for subjects classified by the interview as "inactive in four domains" and 139.6 minutes for the remainder (median = zero and 60 min, respectively; p<0.001). Finally, mean time per day watching television was 209.1 minutes for subjects classified by the interview as "watching television for extended periods" and 122.7 minutes for the remaining subjects (median = 203 min and 120 min, respectively; p<0.001).

DISCUSSION
The present study shows that indicators of physical activity and sedentariness obtained through the VIGITEL system show good reproducibility, at both collective (identical or very similar frequencies of the three indicators evaluated in repeated interviews) and individual (kappa coefficients compatible with moderate or substantial agreement in individual classification of exposure) levels. Good reproducibility indicates that interviews are carried out in a standardized fashion, avoiding interpretation or answer induction. It also indicates that subjects are able to understand the questions and have no difficulty answering them, providing answers that do not vary with time. What is expected from a surveillance system such as VIGITEL is that it provide estimates of indicators that, in addition to being accurate, are also reproducible. Good reproducibility ensures that temporal variations in these indicators reflect actual variations in behavior among the population, rather than indicator instability.2,12
Indicator validity, i.e., in the present scenario, the ability of indicators obtained by VIGITEL to provide results that are similar to those obtained by three 24-hour recalls, was also evaluated from both collective and individual perspectives. On the collective level, the telephone interview revealed a frequency of "inactive in four domains" quite similar to that obtained by the 24-hour recalls (22.5% and 19.8%, respectively). In the case of "sufficiently active during leisure," there was a slight overestimation by the telephone interview, which may indicate subjects' "desire" to be more physically active. The same is not true, however, for "watching television for extended periods," the frequency of which was also slightly higher in the telephone interview than in the recalls.
At the individual level, all three indicators evaluated showed good specificity. Reasonable sensitivity was achieved for "watching television for extended periods" and "inactive in four domains." The low sensitivity found for "sufficiently active during leisure" (50%) may be explained, at least partly, by the fact that the 24-hour recalls investigate only three days, compared to the seven-day reference period.
Still with regard to validity at the collective level, we found that the group of subjects classified as "sufficiently active in leisure" dedicated more than three times as much time per day to physical exercise or sport as the remainder of subjects (mean of 32 minutes vs. 9 minutes). Likewise, the group of subjects classified as "watching television for extended periods" spent on average about 70% more time watching television than other subjects (mean of 209 minutes vs. 123 minutes).
We do not see important limitations in the design used for the reproducibility study, given that the major sources of intra-subject variation and of variation between interviewers were accounted for by repeating the same interview with a different interviewer. Likewise, the kappa coefficient, employed in the analysis of reproducibility of the telephone interview, is the most widely recommended measure for evaluating the reproducibility of instruments used for classifying individuals as exposed or unexposed to a given condition.13
Common limitations of validity studies include the use of insufficiently accurate reference methods and samples that are not representative of the population evaluated by the indicator.13 Regarding the first of these issues, it would have been more appropriate to extend the recall period to seven days, a duration considered more adequate for characterizing physical activity patterns,11 or to employ instruments that directly record physical activity, such as accelerometers.15 These options, however, were discarded due to the risk of affecting the response rate or even of influencing the subject's usual pattern of physical activity. In any case, issues concerning the precision of the reference method are unlikely to lead to overestimation of the validity of the method under evaluation. Rather, such issues would likely lead to underestimation, which we believe may have been the case, as mentioned, for the sensitivity of the "sufficiently active during leisure" indicator.
As to the representativeness of the current subsample, the probabilistic selection of subjects ensures that our results are applicable to the performance of VIGITEL in the city of São Paulo, but not necessarily in the other cities in which the system is being implemented. In this regard, we believe it will be essential to carry out a similar study in at least one state capital for each of the five Brazilian Regions. In addition to probabilistic selection, other strengths of the present analysis are the calculation of sensitivity and specificity, a recommended procedure given the characteristics of the indicators being studied,13 and the comparison of daily time dedicated to each of the various activities according to whether or not subjects were included in the exposed group for each indicator.
Even though restricted to developed countries, other studies of reproducibility and validity of physical activity indicators obtained by telephone interview have yielded results similar to those of the present study.5,6,14
In conclusion, the indicators of physical activity employed by the VIGITEL system appear to be reproducible and sufficiently accurate. The maintenance of the system in years to come will provide the country with a useful instrument for evaluating public policies aimed at promoting physical activity and controlling non-transmissible chronic diseases associated with sedentariness.

Table 1. Reproducibility of indicators of physical activity and sedentariness among adults (≥ 18 years) obtained by telephone interview. City of São Paulo, Southeastern Brazil, 2005.
* Subjects who perform physical exercise or sport of moderate intensity for at least 30 minutes per day, five or more days per week, or of vigorous intensity for at least 20 minutes per day, three or more days per week.
** Subjects who: 1) do not perform physical activity or sport at least one day per week; 2) "do not walk frequently" and "do not carry heavy weight frequently" at work; 3) do not commute from home to work on foot or by bicycle; and 4) are not responsible for "heavy cleaning" at home.
*** Subjects who watch television for at least three hours per day, five or more days per week.

Table 2. Validity of indicators of physical activity and sedentariness obtained by telephone interview in relation to three 24-hour recalls. City of São Paulo, Southeastern Brazil, 2005.
* Subjects who perform physical exercise or sport of moderate intensity for at least 30 minutes per day, five or more days per week, or of vigorous intensity for at least 20 minutes per day, three or more days per week.
** Subjects who: 1) do not perform physical activity or sport at least one day per week; 2) "do not walk frequently" and "do not carry heavy weight frequently" at work; 3) do not commute from home to work on foot or by bicycle; and 4) are not responsible for "heavy cleaning" at home.
*** Subjects who watch television for at least three hours per day, five or more days per week.