Socioeconomic differences between self- and interviewer-classification of color/race

METHODS: Population-based cross-sectional study of individuals aged ≥20 years (N=3,353), of both sexes, conducted in the urban zone of Pelotas, Rio Grande do Sul, in 2005. Sampling took place in two stages and data were collected in the participants' homes. Standardized, precoded questionnaires were applied in face-to-face interviews. Consistency between self-classified color/race and color/race determined by the interviewer was assessed by means of the concordance proportion and the kappa statistic. Ethnic-racial inequalities in income and socioeconomic condition were estimated using linear and ordinal logistic regression models, adjusted for sex, age and schooling.


INTRODUCTION
The topic of inequalities in health is increasingly being investigated within academic circles and among public policy formulators. Inequalities in the levels of morbidity and mortality between groups with different socioeconomic conditions, genders, ages and geographic locations have been reported in the literature.12,15 Among studies on inequalities, increasing interest in ethnic-racial disparities has been seen, particularly in the field of public health. There is already a considerable volume of evidence indicating a systematic condition of disadvantage and exclusion among black and brown (pardo) individuals, in relation to whites, covering various periods during the course of life.3,9,14
However, investigations on ethnic-racial inequalities have come up against methodological problems. Discussions on the validity and reliability of measuring the variable "race" have taken place within different fields of knowledge, such as genetics18 and public health.24 Hence, it is considered that the attribution of color/race is complex and involves various factors. In Brazil, racial identification is based on a combination of physical characteristics, such as skin color, nose and lip shape and hair type. The physical traits of the black and pardo categories are generally associated with negative connotations.22 Furthermore, other variables influence racial classifications, such as the schooling level, sex and age of interviewees.21,22
Several methods of racial classification have been used in epidemiological studies and population censuses. Piza & Rosemberg20 (2002) gave a detailed description of the racial classification systems adopted in Brazilian censuses, and of the criteria and practical difficulties encountered at various moments of the country's history. Other important published studies on the attribution of color/race in Brazil notably include those by Pinto19 (1996), Telles & Lim23 (1998) and Guimarães8 (1999). As examples of epidemiological studies using (in this order) self-classification, interviewer classification and a combination of these two strategies for determining the color/race of participating individuals, the studies by Leal et al14 (2005), Dias-da-Costa et al7 (2007) and Almeida-Filho et al1 (2005) can be cited.
In essence, the attribution of color/race is based on external observation or self-classification. In accordance with international practice, in Brazil the Instituto Brasileiro de Geografia e Estatística (IBGE - Brazilian Institute for Geography and Statistics) trains its interviewers for the ten-year census to record race as declared by the interviewee.22 Racial measurement needs to be in accordance with the objectives of each particular study. For example, the way in which data are collected in a racial classification based on color may relate to the type of exposure to health risks that the study is intended to examine. LaVeist13 (1994) suggested that color/race should be attributed by the interviewer when estimating social exposure to health risks, and self-classified when estimating exposure to risky behavior attributable to cultural factors.
According to Telles & Lim23 (1998), racial discrimination depends on classifications made by third parties. Therefore, they proposed that such measurements should be preferred when the aim is to estimate this type of discrimination.
The present study was devised with the aim of assessing: (1) the consistency between self-classified and interviewer-classified color/race according to socioeconomic and demographic variables; and (2) the magnitude of the ethnic-racial inequalities of income and socioeconomic status when these forms of classification are used.

METHODS
A cross-sectional population-based study was carried out in Pelotas, Southern Brazil, between October and December 2005, in which a variety of healthcare outcomes were investigated. The reference population for the study was composed of individuals of both sexes aged 20 years or over who were living in the urban zone of Pelotas, i.e. 93.2% of the population of the municipality. Individuals living in homes for the elderly, hospitals or prisons, those who could not answer the questionnaire for themselves because of physical impediments, and those who were said by relatives or cohabitants of the home to present any type of mental incapacity, were excluded.
The sample size calculation and the sample selection process were conducted such that they would satisfy the demands of all the sub-studies. In the first stage of the sampling process, the 404 census tracts were listed in decreasing order of mean income of heads of households. Next, 120 census tracts were systematically drawn, with probability proportional to size. The number of census tracts to be drawn was defined arbitrarily, in such a way as to reduce the design effect resulting from the sampling process without making the fieldwork costly and lengthy. Around 13 homes were systematically selected from each census tract that was drawn, thus totaling 1,597 homes and 3,353 adults who were eligible to participate in the survey. The number of eligible participants was obtained by taking into account the IBGE's estimate that there were 2.1 adults per home in Pelotas, the mean of 13.4 homes drawn per census tract and the total of 120 tracts selected (2.1 × 13.4 × 120 ≈ 3,353).
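The arithmetic behind the number of eligible adults can be checked with a short sketch. It uses only the figures quoted above; the variable names are illustrative. Note that the product of the rounded factors comes out slightly above 3,353, whereas multiplying the 1,597 homes by 2.1 adults per home reproduces it closely. Reading the reported total as derived from the unrounded number of homes is an assumption, not something the text states.

```python
# Figures quoted in the text
adults_per_home = 2.1   # IBGE estimate for Pelotas
homes_per_tract = 13.4  # mean homes drawn per tract, as reported (rounded)
tracts = 120            # census tracts drawn
homes_total = 1597      # homes selected in total

# Product of the rounded factors given in the text
rounded_product = adults_per_home * homes_per_tract * tracts

# Same estimate using the total number of homes instead of the rounded mean
from_homes = adults_per_home * homes_total

print(round(rounded_product, 1))       # 3376.8
print(round(from_homes, 1))            # 3353.7
print(round(homes_total / tracts, 2))  # 13.31 homes per tract on average
```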
Since the data had already been collected, the statistical power was calculated a posteriori. The database included records of just over 3,000 people, thus resulting in a statistical power of between 94% and 100% for estimating the relationships that self-classification and interviewer classification of color/race have with the variables investigated.
The data were collected by means of standardized precoded questionnaires that were applied in the form of face-to-face interviews. Answers given by other people on behalf of the interviewee were not accepted. The questionnaire presented two distinct items for classifying color/race in accordance with the categories of white, brown (pardo), black, yellow and indigenous that have been proposed by the IBGE. In the initial section of the instrument, there was one item in which the interviewer classified the participant into one of the abovementioned categories. At the end of this section, the conditions adopted in the IBGE's demographic censuses were simulated, such that the interviewee had to answer the question "What is your color or race?" with the same response choices. The order of these questions had the aim of avoiding any influence from the interviewees' self-classification on the interviewers' opinion.
In addition, information was collected on sex, age, monthly family income (sum of all household income, in reais, for the month preceding the interview), schooling level in complete years of study and socioeconomic condition, according to the criteria of the Associação Brasileira de Empresas de Pesquisa (ABEP - Brazilian Association of Survey Companies). Age was analyzed according to the age groups of 20-29, 30-39, 40-49, 50-59 and ≥ 60 years. Income was converted into the categories of ≤ 1.0, 1.1-3.0, 3.1-6.0 and ≥ 6.1 minimum monthly salaries (one minimum monthly salary was R$ 300.00 at the time of data collection, which was equivalent to US$ 140.00 at that time). This form of categorization of family income has been used in other studies,17,25 and its use in population-based studies in Pelotas has made it possible to establish long-term comparisons. Schooling levels were subdivided into groups of 0-4, 5-8, 9-11 and ≥ 12 years of study. Since strata A and E of the ABEP criteria were found infrequently, it was decided to group the categories of this variable as A/B, C and D/E and rename them as groups 1, 2 and 3, respectively, for analysis purposes.
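The categorizations described above can be sketched as follows. The cut-offs and the R$ 300.00 minimum salary come from the text; the function names are illustrative, and rounding income to one decimal place of a minimum salary before binning is an assumption made here to handle values falling between, say, 1.0 and 1.1.

```python
MIN_SALARY = 300.00  # R$ per minimum monthly salary at data collection (from the text)

def income_group(family_income_reais):
    """Bin monthly family income, expressed in minimum salaries."""
    # Assumption: round to one decimal before binning (not stated in the text).
    salaries = round(family_income_reais / MIN_SALARY, 1)
    if salaries <= 1.0:
        return "<=1.0"
    if salaries <= 3.0:
        return "1.1-3.0"
    if salaries <= 6.0:
        return "3.1-6.0"
    return ">=6.1"

def age_group(age_years):
    """Ten-year age bands, with an open band from 60 years (integer ages)."""
    if age_years < 20:
        raise ValueError("the study included adults aged 20 years or over")
    if age_years >= 60:
        return ">=60"
    decade = (age_years // 10) * 10
    return f"{decade}-{decade + 9}"

def schooling_group(complete_years):
    """Complete years of study, grouped as in the text."""
    if complete_years <= 4:
        return "0-4"
    if complete_years <= 8:
        return "5-8"
    if complete_years <= 11:
        return "9-11"
    return ">=12"

# ABEP strata collapsed into analysis groups 1 (A/B), 2 (C) and 3 (D/E)
ABEP_GROUP = {"A": 1, "B": 1, "C": 2, "D": 3, "E": 3}

print(income_group(1000.00), age_group(43), schooling_group(8), ABEP_GROUP["D"])
```

With the sample medians reported in the Results (R$ 1,000.00, 43 years, 8 years of study), this places the median participant in the 3.1-6.0 minimum salaries, 40-49 years and 5-8 years of study groups.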
The interviews were conducted by 38 women who had received prior training, were blind to the study hypotheses and had at least completed high school education. The quality control for the information gathered was carried out by 11 fieldwork supervisors, who applied questionnaires with a smaller number of questions to 10% of the surveyed individuals, selected at random. Kappa statistics were used to test the reproducibility of some of the questions. The data were entered into the Epi Info 6 software, with double data entry and automatic checks for consistency and range.
Out of the total of 3,353 adults eligible to participate in the survey, 93.5% were interviewed. Difficulty in locating the individuals and refusal to participate were the main reasons for the losses (N=217). For data quality control, 387 individuals were re-interviewed. The reproducibility of the variable "schooling" was tested by including it in the questionnaire with a smaller number of questions, and a kappa value of 0.7 was obtained.
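The response rate follows directly from the counts given above, as this small check shows (variable names are illustrative):

```python
eligible = 3353  # adults eligible to participate
losses = 217     # not located or refused

interviewed = eligible - losses
response_rate = 100 * interviewed / eligible

print(interviewed)              # 3136
print(round(response_rate, 1))  # 93.5
```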
The analyses were conducted in Stata, version 9.
The absolute and relative distributions of the sample were calculated for the self-classified and interviewer-classified color/race. The distribution of color/race by sex, age, schooling, family income and socioeconomic condition was also presented. The consistency between the self-classified and interviewer-classified color/race was evaluated by means of the concordance proportion and kappa statistics for the whole sample and according to sex and the strata of age, schooling, family income and socioeconomic condition, as defined by the medians of their distributions. The individuals who were self-classified as yellow (N=12) and indigenous (N=19) were excluded from this analysis because their numbers were too small.
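As an illustration of the two agreement measures, the sketch below computes the overall concordance proportion and Cohen's kappa from a cross-tabulation of self-classification (rows) against interviewer classification (columns). The counts are hypothetical and are not the study data; the function name is likewise illustrative.

```python
def agreement_and_kappa(table):
    """Return (observed agreement, Cohen's kappa) for a square table.

    table[i][j] is the number of people self-classified in category i
    and classified by the interviewer in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Proportion of people on the diagonal (same category both ways)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical counts, ordered white, pardo, black (NOT the study data)
example = [
    [800, 10, 5],
    [30, 25, 20],
    [15, 10, 85],
]
observed, kappa = agreement_and_kappa(example)
print(f"concordance = {observed:.3f}, kappa = {kappa:.3f}")
```

Because the chance-expected agreement depends on the marginal totals, kappa is sensitive to how frequent each category is, which is why both measures are reported.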
The magnitude of the inequalities in family income according to self-classified and interviewer-classified color/race was assessed by means of a multiple linear regression model, with adjustments for sex, age and schooling. The income distribution was asymmetrical and right-skewed, and it was therefore log-transformed. The model assumptions were checked by means of visual analysis of the distribution of residuals, scatter diagrams of residuals against fitted values and leverage plots for detecting influential points and outliers.
Ordinal logistic regression was used, with adjustment for the same variables as in the linear model, to analyze the association between color/race (self-classified and interviewer-classified) and socioeconomic condition according to ABEP (ordinal outcome at three levels). The ordinal regression produced odds ratios that estimate how the odds of the dependent variable being at a higher level change for each one-unit increase in the independent variable. The proportional odds assumption of the model was verified using the Brant test.5 White color/race was adopted as the reference category in the abovementioned regression models, such that the estimated effect measurements would allow comparisons of pardos and blacks with whites. The interactions of sex, age and schooling with color/race were also tested. Because of the study design, in which observations within each census tract might be correlated, the "psu" option of Stata was used to adjust all the calculated precision estimates. The significance level was taken to be 5% for two-tailed tests.
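The proportional odds assumption can be made concrete with a minimal sketch. It assumes the common parameterization P(Y ≤ j) = logistic(cut_j − η), with outcome levels 1 (A/B), 2 (C) and 3 (D/E). The cutpoints are hypothetical; the coefficient is simply the logarithm of the odds ratio of 2.1 reported in the Results for self-classified blacks, used here for illustration only.

```python
import math

def cumulative_probs(cutpoints, eta):
    """P(Y <= j) for each cutpoint under a proportional odds model."""
    return [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]

def category_probs(cutpoints, eta):
    """P(Y = j) for each of the len(cutpoints) + 1 ordered levels."""
    cum = cumulative_probs(cutpoints, eta) + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

def odds_above(cutpoints, eta):
    """Odds of being above each cutpoint, i.e. P(Y > j) / P(Y <= j)."""
    return [(1.0 - p) / p for p in cumulative_probs(cutpoints, eta)]

CUTPOINTS = [-0.5, 1.5]  # hypothetical thresholds between levels 1|2 and 2|3
BETA = math.log(2.1)     # log of an odds ratio of 2.1 (illustrative)

reference = odds_above(CUTPOINTS, 0.0)  # reference category (whites)
compared = odds_above(CUTPOINTS, BETA)  # comparison category
ratios = [c / r for c, r in zip(compared, reference)]

# The same odds ratio applies at every cutpoint: this is the assumption
# that the Brant test examines.
print([round(x, 2) for x in ratios])  # [2.1, 2.1]
print([round(p, 3) for p in category_probs(CUTPOINTS, BETA)])
```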
The study was approved by the Research Ethics Committee of the Faculdade de Medicina of the Universidade Federal de Pelotas. The participants were given guarantees regarding the confidentiality of the information provided, and written consent was obtained from all participants prior to the interview.

RESULTS
The study population presented a median age of 43.0 years and a mean age of 44.0 years (SD = 16.4), and approximately 20% of these adults were in the age group ≥ 60 years. Among all the subjects, 56.1% were female. The median and the mean family income were R$ 1,000.00 and R$ 1,623.21 (SD = 2,443.90), respectively. Approximately 12% of the interviewees had a monthly family income of less than or equal to one minimum salary. The mean schooling level was 7.9 years of study for the whole sample (median = 8.0), and it was similar for men and women. Just over one quarter of the individuals were in socioeconomic levels D and E.
Table 1 shows that, according to self-classified color/race, 81.6% were white, 6.6% were brown (pardo) and 10.8% were black. Individuals self-classified as yellow and indigenous accounted for 1.0% together. According to interviewer-classified color/race, 84.0% were white, 4.5% were pardo and 11.3% were black. The interviewers did not classify anyone as yellow, and classified 0.2% as indigenous. Given the overlap between the 95% confidence intervals, no statistically significant difference was observed between the proportions of self-classified and interviewer-classified whites, and likewise for pardos and blacks.
The percentage distribution of color/race groups according to sex, age, schooling, family income and socioeconomic condition can be seen in Table 2. The percentages of men and women were the same, independent of the means of classifying color/race. There was a greater proportion of self-classified pardos than the proportion thus classified by the interviewers. The distribution of whites in the age groups was similar in both forms of color/race classification. Whites were most frequent in the category ≥ 60 years of age and least frequent in the age group of 30-39 years, in relation to pardos and blacks. In turn, pardos and blacks were most frequent in the lowest strata of schooling, family income and socioeconomic condition.
The reproducibility between self-classified and interviewer-classified color/race was 0.8, according to the kappa value (Table 3). Whites were more likely to be classified consistently than were blacks and pardos, in this order, taking the concordance proportions into consideration. There was also a general tendency towards whitening: when the interviewers gave a different classification to self-classified pardos, they chose the white category 1.4 times more often than the black category. Whitening tended to be done by interviewers rather than by the interviewees. The same can be said in relation to blacks, in that the interviewers tended to classify them as white 1.5 times more often than as pardos in cases of inconsistency.
The concordance proportion observed in the white and black color/race groups was greater for the women, and they also presented a higher kappa value. Among the women self-classified as pardo, 28.1% were identified by the interviewers as black, while among pardo men, 20.7% were identified as black. The concordance proportion and the kappa value were greater for the stratum of greatest age, such that the tendency towards whitening was greatest among the youngest individuals. With regard to schooling, family income and socioeconomic condition, the tendency towards whitening was also observed. Pardos (to a greater degree) and blacks (to a lesser degree) of better socioeconomic situation were more likely to be included in lighter-skinned categories. On the other hand, self-classified pardos of lower family income and socioeconomic condition tended to be classified as blacks. In the same way, the concordance proportion for blacks was greater in the strata of worse schooling, family income and socioeconomic condition. The kappa values were higher in the more socially and economically disadvantaged groups, thus revealing that the reproducibility of the classification of color/race was greater in these groups.
Table 4 presents the coefficients of the linear regression models with the logarithm of family income as the outcome, using self-classified and interviewer-classified color/race. The coefficients for color/race represent the difference in the logarithm of family income for pardos and blacks in relation to whites. Compared with the logarithm of income among whites, the income among pardos and blacks was slightly lower when color/race was classified by the interviewers than when individuals were self-classified. Taking exponentials of the values in Table 4 (results not presented), it was seen that the self-classified pardos and blacks presented family incomes that were 6% and 19% lower, respectively, than the income among whites. When color/race was classified by the interviewers, pardos presented family incomes that were 8% lower than the income among whites, and for blacks it was 21% lower. These differences were statistically significant only for the comparison between whites and blacks. An analysis conducted using per capita family income showed results that were similar to those obtained using the aforementioned family income.
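The step from a coefficient on the log scale to a percentage difference can be sketched as follows: a coefficient b corresponds to a relative difference of (exp(b) − 1) × 100%. Since the Table 4 coefficients are not reproduced in this text, the sketch works backwards from the percentages reported above; the "implied" coefficients it prints are back-calculated illustrations, not the published estimates.

```python
import math

def pct_difference(log_coef):
    """Percentage difference in income implied by a log-scale coefficient."""
    return (math.exp(log_coef) - 1.0) * 100.0

def implied_log_coef(pct):
    """Log-scale coefficient that corresponds to a percentage difference."""
    return math.log(1.0 + pct / 100.0)

# Percentage differences relative to whites, as reported in the text
reported = {
    "self-classified pardos": -6.0,
    "self-classified blacks": -19.0,
    "interviewer-classified pardos": -8.0,
    "interviewer-classified blacks": -21.0,
}

for group, pct in reported.items():
    b = implied_log_coef(pct)  # back-calculated, NOT taken from Table 4
    print(f"{group}: implied coefficient {b:.3f}, "
          f"back-transformed {pct_difference(b):.1f}%")
```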
Table 5 shows that self-classified pardos and blacks had statistically significantly greater chances (OR=1.5 and OR=2.1, respectively) of presenting lower socioeconomic conditions than those of the whites. These effect measures were higher when color/race was determined by the interviewer, such that the OR was 1.6 for pardos and 2.3 for blacks. No interactions were detected in the linear and ordinal regression models.

DISCUSSION
Certain factors have contributed towards ensuring the internal validity of this study, such as the high response rate, observed representativeness of the sample, use of standardized and pretested questionnaires, data collection using previously trained interviewers and fieldwork supervision. The observed distribution of color/race was similar to what was found in Pelotas in the IBGE's 2000 census, in which 83.2% were found to be white, 6.4% brown (pardo) and 9.7% black, which also reinforces the internal validity of this study. Moreover, the adoption of a cross-sectional population-based design provided sufficient sample size and power to detect the differences investigated. Another advantage of the present study lies in the fact that the interviewers were also inhabitants of Pelotas. This may have diminished any classification errors resulting from regional variations in the attribution of color/race. Nonetheless, the external validity of the present study must be analyzed with caution. It is likely that some of the results presented here will only be applicable to localities in Brazil where the color/race distribution is similar to what was observed in Pelotas,21 such as the majority of the municipalities in the southern region of Brazil.
All the interviewers had at least completed high school education and most of them were white. Even though color/race classifications performed by white interviewers of higher education level might be seen as a bias, this could represent an additional advantage: people with such characteristics have a greater chance of occupying social positions in which decisions on racial classification affect the income of the individuals under consideration.23
One limitation to the present study was the absence of information on the sociodemographic characteristics of the interviewers in the analyses. This made it impossible to investigate how the interviewers' ages and income, for example, might influence the attribution of color/race to and by the interviewees. Another limitation relates to the fact that the interviews were solely conducted by women. The racial classification determined by the interviewers of the present study probably does not reflect the attribution of color/race that would be obtained by an average citizen in Pelotas.
Concerning the distribution according to color/race, the predominance of whites and the greater percentage of blacks in relation to pardos are highlighted. According to the IBGE's census, the Brazilian population in 2000 was composed of 53.7% whites, 38.5% pardos and 6.2% blacks. The predominance of whites in Pelotas is in line with the influx of European immigrants that took place at the end of the nineteenth century and start of the twentieth century, mainly in the southern and southeastern regions of the country. The small percentage of pardos in relation to the number of blacks suggests that there has been little miscegenation, probably because of smaller numbers of interracial marriages. Telles22 (2003) found rates of stable partnerships between whites and pardos that were comparable with the rates between whites and blacks in some cities in southern Brazil, including Rio Grande, Caxias do Sul and Pelotas. In these cities, Telles22 (2003) indicated that whites seemed to regard pardos and blacks similarly in terms of forming stable partnerships.
The reproducibility between self-classified and interviewer-classified color/race for the whole sample was 0.8 (kappa value), and this can be considered good.2 However, important patterns of variation could be seen when the concordance proportion was analyzed. [...]23,24 Through this process, personal characteristics and qualities are attributed on the basis of perceptions of individuals' color/race such that groups with higher levels of education and social status have a better chance of being included in lighter-skinned categories. The whitening process shown in the present study was consistent with other findings in the literature.16,22,23 Males, younger individuals and those of better socioeconomic status tended to be classified in lighter-skinned categories. With regard specifically to sex, the results from the present study contradict those of Telles21 (2002), who suggested that the negative connotation associated with the term "black" and courtesy offered to women would make the interviewers less likely to classify black women in that category.
Greater concordance in the color/race classification was observed among the socially and economically disadvantaged individuals, and this is not in agreement with the data of Telles & Lim23 (1998). The kappa test is sensitive to the frequency of observations in each of the categories analyzed.2 Differing from the sample studied by those authors,23 the proportion of blacks (among whom it is supposed that the consistency of classification would be greater) found in the present study was higher than the proportion of pardos. This, together with the observation that blacks were more frequent in the socioeconomically more disadvantaged strata, may explain the discrepancy between the findings.
The relative socioeconomic disadvantage of the pardos and blacks was shown in the distribution of the color/race groups according to socioeconomic variables and in the results of the regression models. Inequalities between blacks and whites were found in relation to family income even after adjusting for schooling level, sex and age. Pardos and blacks had a greater chance of presenting lower socioeconomic conditions after adjustment for the same variables. However, the magnitude of these inequalities was smaller than what was observed by Telles & Lim23 (1998) in a survey of national coverage. Those authors found that pardos and blacks declared personal incomes that were respectively 21% and 32% lower than the income of whites.23 The smaller magnitude of inequality observed between whites, pardos and blacks in Pelotas is possibly due to the lower rates of social inequality seen in this municipality, compared with the rest of the country. For example, in the southern region the proportion of individuals with treated water supply is 79.8% among whites and 77.3% among pardos and blacks. This is a difference of small magnitude when compared with the data for the whole of Brazil: 82.8% and 67.2%, respectively.9
The observation that there are socioeconomic inequalities between racial groups is an important indication of the existence of institutionalized racism.10,11,24 This is defined as differences in access to goods, services and opportunities within society according to color/race.10 From this point of view, the association between socioeconomic conditions and color/race that is commonly found would be a consequence of this type of discrimination.
The ethnic-racial inequalities that were found were slightly greater when color/race was measured by an external observer. As suggested by Telles21 (2002), whitening tends to be done by interviewers and not the contrary. As also seen in other studies,21,23 the interviewers tended to classify individuals of better socioeconomic status in lighter-skinned color/race categories, and those of worse education level, family income and socioeconomic status in darker categories, thereby making the ethnic-racial groups more unequal from an economic and social point of view.
Variation in the magnitude of inequalities as a consequence of adopting different racial classification methods has been described in studies conducted in other countries.4,6 There were similar findings from a national survey in Brazil,23 and these led the authors of that survey to postulate that interviewer-classified color/race was a more appropriate method for studying ethnic-racial inequalities. Given that external perceptions of individuals' color/race have a slightly greater impact in determining family income and socioeconomic condition, it is also suggested here that interviewer-classified color/race should be preferred in studies on racial discrimination. On the other hand, self-classification may be more appropriate for understanding the success of attempts at black mobilization.21
The present study has shown that the racial classification process depends on the social context and personal physical characteristics, among other matters. The choice of classification method should take into consideration its advantages and disadvantages and the aims of each study.21 Researchers must be alert to the tendency of interviewers to whiten the population, especially in the privileged social strata, when conducting studies with ethnic-racial stratification. Moreover, they must take into account the possibility that the classification method (self-classification or interviewer classification) may influence the results found. The inherent complexity in gathering data on the color/race variable must not inhibit its use, since it is necessary for observing ethnic-racial inequalities and formulating public policies for reducing these inequalities.

Table 1. Absolute and relative distribution of color/race in the sample according to self-classification and interviewer classification. Pelotas, Southern Brazil, 2005.
* Among the 12 individuals self-classified as yellow, 10 were identified as white, one as brown (pardo) and one as black by the interviewers.
** Among the 19 individuals self-classified as indigenous, 10 were identified as white and nine as indigenous by the interviewers.

Table 2. Distribution of self-classified and interviewer-classified color/race according to sex, age, schooling level, income and socioeconomic condition. Pelotas, Southern Brazil, 2005.

Table 3. Concordance proportion and kappa values between self-classification and interviewer classification of color/race. Pelotas, Southern Brazil, 2005.

Table 4. Linear regression on the logarithm of family income in reais, according to self-classified and interviewer-classified color/race, adjusted for schooling level, sex and age. Pelotas, Southern Brazil, 2005.

Table 5. Ordinal logistic regression for socioeconomic condition (A/B; C; D/E), according to self-classified and interviewer-classified color/race, adjusted for schooling level, sex and age. Pelotas, Southern Brazil, 2005.