Prevalence of adult smokers in Brazilian capitals according to socioeconomic deprivation

ABSTRACT Objective: To estimate the prevalence of adult smokers in the 26 capitals and the Federal District according to the Brazilian Deprivation Index (Índice Brasileiro de Privação – IBP). Methods: Data on smoking were obtained from the Surveillance of Risk and Protective Factors for Chronic Diseases by Telephone Survey (Vigitel) system for the 26 capitals and the Federal District, in the period from 2010 to 2013. The IBP classifies the census sectors according to indicators such as income below half the minimum wage, illiteracy, and lack of sanitary sewage. In the North and Northeast regions, the census sectors were grouped into four categories (low, medium, high, and very high deprivation) and, in the South, Southeast, and Center-West regions, into three (low, medium, and high deprivation). Prevalence estimates of adult smokers were obtained using the indirect estimation method for small areas. Poisson models were used to calculate the prevalence ratios. Results: A positive association between smoking prevalence and the deprivation category of census sectors was found in 16 (59.3%) of the 27 cities. In nine (33.3%) cities, the sectors with the greatest deprivation had a higher prevalence of smokers when compared to those with the least deprivation, and in two (7.4%) there were no differences. In Aracaju, Belém, Fortaleza, João Pessoa, Macapá, and Salvador, the prevalence of adult smokers was three times higher in the group of sectors with greater deprivation compared to those with less deprivation. Conclusion: Sectors with greater social deprivation had a higher prevalence of smoking than those with less deprivation, pointing to social inequalities.


INTRODUCTION
According to the World Health Organization (WHO), tobacco is the main risk factor for preventable causes of death and the second largest attributable factor of mortality in the world 1. Tobacco use is associated with variables such as low income, low education 2, and living in places with high vulnerabilities 3.
Among the social determinants, place of residence is strongly shaped by social position, and the physical surroundings of the neighborhood can be important factors in perpetuating health inequities 4,5. Accordingly, in addition to considering social aspects, epidemiological research uses spatial analysis to identify the influence of place on differences in exposure and on inequalities, expanding the understanding of health-related events in populations and of the processes of morbidity and mortality [6][7][8].
Research on these intra-urban relationships makes it possible to identify where and how interventions should be carried out. One of the tools used to understand the relationships between social determinants and health outcomes is geoprocessing, an important strategy for identifying areas of vulnerability 9.
It is noteworthy that most states lack health information on their population in small areas for formulating local public policy programs, given the high cost of surveys of this nature.
In this sense, statistics has contributed methods for obtaining reliable estimates for smaller areas, such as health regions, districts, or sub-regions, that were not initially contemplated in the survey sampling plans 10. The model-based indirect estimation method for small areas has been widely used in several fields 9. This method uses survey data and auxiliary information extracted from the last census, at the lowest level, as predictor variables of the model for estimating the variable of interest in smaller areas 10.
In 2019, the Center for Integration of Data and Knowledge for Health (Centro de Integração de Dados e Conhecimentos para Saúde - CIDACS), in partnership with the University of Glasgow, built the deprivation index for Brazil, called the Brazilian Deprivation Index (Índice Brasileiro de Privação - IBP), using data from the 2010 demographic census. This index makes it possible to highlight the inequalities between different social groups and to compare municipalities and Brazilian regions. It was built to measure inequalities in the country using a single cutoff point for all of Brazil, and it is presented by quartile, quintile, and vigintile of deprivation 11.
The use of composite indicators [12][13][14][15][16][17][18][19][20][21], such as the IBP, may support the production of estimates related to risk factors for noncommunicable chronic diseases (NCDs) in smaller areas and, thus, support policies to promote equity 1. The present study aimed to produce estimates of the prevalence of adult smokers, according to the IBP, in the 26 capitals and in the Federal District.

METHODS
This is an ecological study using data from the Surveillance of Risk and Protective Factors for Chronic Diseases by Telephone Survey (Vigilância de Fatores de Risco e Proteção para Doenças Crônicas por Inquérito - Vigitel) system, in the 26 capitals and the Federal District, from 2010 to 2013 [22][23][24][25].
Vigitel uses probability sampling of the adult population (≥18 years old) residing in the 26 state capitals and the Federal District. To draw the samples, the system uses the register of residential telephone lines made available annually by the main telephone companies. The sampling process is carried out in two steps: (a) draw of 5,000 telephone lines per city, divided into subsamples of 200 lines; (b) selection of a resident aged 18 years or older to be interviewed.
The Vigitel weighting process consists of multiplying two factors: the inverse of the number of landline telephones and the number of adults in each household. Post-stratification weights were used so that the system results are representative of the entire adult population of each city. This weighting aims to match the estimated socio-demographic composition of the population of adults with a telephone, based on the Vigitel sample in each city, to the socio-demographic composition estimated for the total adult population of the same city, in the same year the survey was carried out.
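The two weighting steps described above can be illustrated with a minimal Python sketch. The function names, the single age stratum, and the population totals are hypothetical and chosen only for illustration; Vigitel's actual post-stratification uses several socio-demographic variables and is not reproduced here.

```python
from collections import defaultdict


def base_weight(n_landlines, n_adults):
    """Product of the two factors: (1 / landlines) * adults in the household."""
    if n_landlines < 1 or n_adults < 1:
        raise ValueError("expected at least one landline and one adult")
    return (1.0 / n_landlines) * n_adults


def post_stratify(weights, strata, population_totals):
    """Rescale weights so each stratum's weighted total matches its population total.

    Assumption: one stratification variable only (a simplification).
    """
    weighted_totals = defaultdict(float)
    for w, s in zip(weights, strata):
        weighted_totals[s] += w
    return [w * population_totals[s] / weighted_totals[s]
            for w, s in zip(weights, strata)]


# Hypothetical respondents: (landlines, adults in household, age stratum)
respondents = [(1, 2, "18-34"), (2, 2, "18-34"), (1, 3, "35+")]
base = [base_weight(lines, adults) for lines, adults, _ in respondents]
adjusted = post_stratify(
    base,
    [stratum for _, _, stratum in respondents],
    {"18-34": 300.0, "35+": 600.0},  # hypothetical population totals
)
```

After the adjustment, the weights in each stratum sum to the assumed population total for that stratum, which is the property the post-stratification step is meant to guarantee.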
The study used the question "Do you smoke?", regardless of the number of cigarettes, frequency and duration of smoking, to estimate the prevalence of adult smokers according to the IBP in the period from 2010 to 2013.

Geoprocessing
Using the Vigitel samples with telephone and complete address information and the interview databases with telephone number information, it was possible to include the census tract by performing a linkage with the National Register of Addresses for Statistical Purposes (Cadastro Nacional de Endereços para Fins Estatísticos - CNEFE) of the 2010 census 26. At the end of processing the database, IBP information by census sector was added.

Brazilian deprivation index
The IBP is an index of three components: the percentage of households with an income of less than half the minimum wage, the percentage of illiterate people aged seven years or older, and the percentage of people with inadequate access to sanitary sewage, water, and garbage disposal, and without a bathroom 11. In this way, the IBP makes it possible to highlight the inequalities of different social groups by census sector.
https://doi.org/10.1590/1980-549720230044
In the North and Northeast regions, the IBP was grouped into four categories: low, medium, high, and very high deprivation. In the other regions, the IBP was grouped into three categories (low, medium, and high deprivation), given the high concentration of sectors in the low deprivation category and the few occurrences in the high and very high deprivation categories (supplementary material - Tables S1 and S2).

Indirect estimation for small areas
This study used data from Vigitel and the indirect estimation method to estimate the prevalence of adult smokers by IBP in the 26 state capitals and the Federal District. This method consists of using statistical models to carry the proportions of adult smokers observed in the capitals over to smaller areas, such as the IBP categories. A logistic regression model was used to impute the smoking response variable (Y), yes (1) or no (0), in the set of census sectors without any Vigitel interview. The model was built on the set of sectors with a single interview in the period from 2006 to 2013. This criterion was adopted because the distribution of these sectors according to the IBP is similar to that of the sectors without a Vigitel interview (supplementary material - Table S3). The response variable (y_i) is dichotomous, taking the value 1 for a smoker (success) and 0 (failure) otherwise (Table S4). The covariates by census sector were taken from the 2010 census, such as the percentage of households by type of water supply, percentage of households by type of sanitary sewage, percentage of households with no male members, percentage of households with female heads of household, percentage of households with grandchildren, great-grandchildren, son-in-law or daughter-in-law, parents or stepfathers or stepmothers, percentage of households with siblings over 50 years of age, and percentage of households with one or more residents.
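The general logistic regression model 27 is given by:

$$\pi(x) = \frac{\exp(\beta_1 + \beta_2 x_2 + \cdots + \beta_p x_p)}{1 + \exp(\beta_1 + \beta_2 x_2 + \cdots + \beta_p x_p)}$$

where: $x = (1, x_2, \ldots, x_p)$ represents the vector of covariates; $\pi(x)$ is the probability that the respondent self-declares as a smoker (success) given the characteristics $x$; $\beta = (\beta_1, \beta_2, \ldots, \beta_p)$ is the vector of model parameters.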
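As an illustration of how the logistic model turns sector covariates into a probability, the following Python sketch evaluates the standard logistic function for a covariate vector whose first element is 1 (the intercept term). The function name and the coefficient values are hypothetical; they are not the fitted values from the supplementary tables.

```python
import math


def smoking_probability(beta, x):
    """Logistic probability pi(x) for parameters beta and covariates x.

    x is expected to start with 1.0 so that beta[0] acts as the intercept.
    """
    if len(beta) != len(x):
        raise ValueError("beta and x must have the same length")
    eta = sum(b * v for b, v in zip(beta, x))  # linear predictor
    # Two algebraically equivalent forms, chosen to avoid overflow in exp().
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical coefficients: intercept -2.0 and one covariate with slope 0.5.
p = smoking_probability([-2.0, 0.5], [1.0, 4.0])  # linear predictor is 0
```

With a linear predictor of zero the probability is exactly one half; larger values of the linear predictor move the probability toward 1.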
The set of sectors with a Vigitel interview was divided into two samples, in the proportion of 70% for training and 30% for validating the model, to ensure that the model obtained in the first sample was robust. Logistic regression calculates the probability, between 0 and 1, that the adult in the census sector is a smoker, and, to classify the adults in the sectors as smokers or non-smokers, a cutoff point in probability is used. Thus, adults in sectors with a probability greater than or equal to the cutoff point were classified as smokers and, otherwise, as non-smokers. This cutoff point was determined by analyzing the receiver operating characteristic (ROC) curve 28.
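The split and the cutoff selection can be sketched in Python as follows. The text does not state which criterion was applied to the ROC curve, so the sketch assumes Youden's J statistic (sensitivity + specificity - 1), one common choice; the function names and the toy probabilities are likewise hypothetical.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle and split records into 70% training and 30% validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_cutoff(probs, labels):
    """Return (cutoff, J) maximizing Youden's J over the observed probabilities.

    Assumption: Youden's J is used as the ROC-based criterion.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are needed to build a ROC curve")
    best_cut, best_j = None, -1.0
    for cut in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= cut and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < cut and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


def classify(probs, cutoff):
    """Smoker (1) if probability >= cutoff, non-smoker (0) otherwise."""
    return [1 if p >= cutoff else 0 for p in probs]


# Toy predicted probabilities and observed smoking status (hypothetical).
probs = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
cutoff, j = roc_cutoff(probs, labels)
predicted = classify(probs, cutoff)
train, valid = split_70_30(range(10))
```

In practice the cutoff would be chosen on the training sample and then checked on the validation sample, which is the purpose of the 70/30 split.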
Multiple logistic regression models were run in Rstudio version 3.6.3 using the Tidyverse package 29.
To assess the fit of the model, a two-by-two classification matrix was used with four possible results: true positive (TP) denotes a smoking response correctly classified as smoking; true negative (TN) denotes a non-smoking response correctly classified as non-smoking; false negative (FN) denotes a smoking response incorrectly classified as non-smoking; and false positive (FP) denotes a non-smoking response incorrectly classified as smoking. The sensitivity of the model is defined by $TP/(TP+FN)$, the specificity by $TN/(TN+FP)$, and the accuracy is measured as $(TP+TN)/(TP+TN+FP+FN)$.
In the joint analysis of the sectors with and without interviews, the post-stratification weight adjusted for the 2010 census population by IBP was calculated using the rake method 30. These weights were calculated in Stata using the SURVWGT package 31, which requires sample weight information to run. In this study, the populations $N_1$ and $N_2$, extracted from the 2010 census of each region, were used to calculate the weights of the group of sectors with Vigitel interviews ($w_1 = N_1/n_1$) and of the group of sectors without interviews ($w_2 = N_2/n_2$), where $N_1$ is the total number of adults in sectors with Vigitel interviews, $N_2$ is the total number of adults in sectors without Vigitel interviews, $n_1$ is the number of Vigitel interviews, and $n_2$ is the number of sectors without interviews.
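The classification measures used in this section to assess the fitted model follow the standard definitions and can be sketched in Python as below. The function names and the toy responses are hypothetical; 1 stands for smoker and 0 for non-smoker.

```python
def confusion_counts(observed, predicted):
    """Count TP, FN, FP, TN for binary responses (1 = smoker, 0 = non-smoker)."""
    tp = fn = fp = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            tp += 1  # smoker classified as smoker
        elif obs == 1 and pred == 0:
            fn += 1  # smoker classified as non-smoker
        elif obs == 0 and pred == 1:
            fp += 1  # non-smoker classified as smoker
        else:
            tn += 1  # non-smoker classified as non-smoker
    return tp, fn, fp, tn


def classification_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, and accuracy from the four counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Toy example (hypothetical responses).
observed = [1, 1, 0, 0, 0]
predicted = [1, 0, 0, 0, 1]
counts = confusion_counts(observed, predicted)
metrics = classification_metrics(*counts)
```

Sensitivity only involves observed smokers and specificity only observed non-smokers, so the two can differ widely when smokers are a small share of the sample.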
The prevalence ratio of adult smokers according to the IBP was calculated to compare the groups. This ratio was estimated using the Poisson model, taking the first category (low deprivation) as the reference. These estimates were calculated using post-stratification weights.
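For intuition, when a weighted log-link Poisson model contains only the deprivation category as a predictor, its exponentiated coefficients equal the ratios of the weighted prevalences against the reference category. The Python sketch below computes these point estimates directly, under that single-predictor assumption; it does not produce the confidence intervals, which require a variance estimator suited to the weights. All names and data are hypothetical.

```python
from collections import defaultdict


def weighted_prevalence_by_group(records):
    """records: iterable of (category, smoker 0/1, weight) tuples."""
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for category, smoker, weight in records:
        numerator[category] += weight * smoker
        denominator[category] += weight
    return {c: numerator[c] / denominator[c] for c in denominator}


def prevalence_ratios(records, reference):
    """Ratio of each category's weighted prevalence to the reference category."""
    prevalence = weighted_prevalence_by_group(records)
    return {c: p / prevalence[reference] for c, p in prevalence.items()}


# Hypothetical weighted respondents in two deprivation categories.
records = [
    ("low", 1, 1.0), ("low", 0, 1.0), ("low", 0, 1.0), ("low", 0, 1.0),
    ("high", 1, 2.0), ("high", 1, 1.0), ("high", 0, 1.0),
]
ratios = prevalence_ratios(records, reference="low")
```

In this toy data the weighted prevalence is 0.25 in the low category and 0.75 in the high category, so the prevalence ratio for high versus low deprivation is 3.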

RESULTS
The 65,684 census sectors in the 26 Brazilian capitals and the Federal District correspond to a population of 45,980,581 people, that is, 22% of the total census sectors and 24% of the Brazilian population. Of these census sectors, 38,867 (58.2%) had at least one Vigitel interview in the period from 2010 to 2013. By region, the North, Northeast, and South had 83.1%, 81.3%, and 82.0% of sectors with interviews, with medians of five, three, and three interviews, respectively, which shows the good spread of the Vigitel samples. The Center-West Region had 69.9% (median = 3) and the Southeast 39.2%. In the Southeast Region, the capitals São Paulo and Rio de Janeiro have 18,182 and 10,158 sectors, respectively, both with a median of one interview per sector, which explains the low percentage of sectors with Vigitel interviews (supplementary material - Table S4). In general, Vigitel's samples of residential telephones are scattered throughout the capitals, with the exception of São Paulo and Rio de Janeiro.

Imputation of missing data
To build the logistic regression models, the census sectors with an interview in the period from 2006 to 2013 were selected (supplementary material - Table S5). The number of such sectors varied between 7 (Boa Vista) and 4,231 (São Paulo). Due to this high variability in the number of sectors, the number of models was reduced from 27, one for each capital, to 5, one for each region: North, Northeast, Southeast, South, and Center-West (supplementary material - Table S5).
The fitted models for the North, Northeast, Southeast, South, and Center-West regions are available in the supplementary material - Tables S6 to S10. The measures of accuracy, sensitivity, and specificity obtained in the two samples showed good adequacy of the models. However, the specificity of the models, that is, their ability to classify an individual as a non-smoker given that he or she is a non-smoker, was greater than their sensitivity (supplementary material - Table S11).

Indirect estimation
The trend of increasing prevalence as deprivation increases was found in 16 (59.3%) of the 27 cities, indicating a positive gradient. In nine (33.3%) cities, the most deprived sectors had a higher prevalence of smokers when compared to those with less deprivation and, in the other two (7.4%), there were no differences (Tables 1 to 3).
In the Southeast, South, and Center-West regions, the prevalence ratios of adult smokers ranged from 1.33 (95%CI 1.10-1.60) in Campo Grande to 2.76 (95%CI 1.38-4.02) in Florianópolis. In Curitiba, Florianópolis, and Porto Alegre, the prevalence of adult smokers was twice as high in the sectors with the greatest deprivation when compared to those with the least (Table 3).

DISCUSSION
This study used the IBP to measure intra-urban inequalities in the prevalence of adult smokers, in Brazilian capitals and the Federal District, using Vigitel data from 2010 to 2013 and the indirect method for estimation in small areas.
The study takes an ecological approach to measuring health inequalities, pointing out that the areas of greatest deprivation also had the highest prevalence of adult smokers. In Aracaju, Fortaleza, João Pessoa, and Salvador, the prevalence of smokers in very high deprivation sectors is three times higher than in low deprivation ones. The results found in the study are consistent with the literature, which points to an association between the highest prevalence of tobacco use and the population with low income and education in Brazil 2,3,32 and in other countries 33,34.
Bernal et al. 35 showed the external validity of the estimate of the prevalence of adult smokers calculated using the indirect estimation method on Vigitel Belo Horizonte data.This study used the Health Vulnerability Indicators (HVI) grouped into four categories to estimate the prevalence of adult smokers in each group.Similarities were found between the estimates calculated in Vigitel and in the household survey, corroborating the results found here.
The work has some limitations. First, in 14% of the Vigitel interviews, the census sectors were not identified in the linkage process. Second, there were no Vigitel interviews in some sectors, mainly in those with high or very high deprivation, which required the use of statistical models to impute missing data in these sectors. In this sense, the covariates of the model may have underestimated or overestimated the probability of the adult being classified as a smoker or not in the sector. The capitals São Paulo and Rio de Janeiro have interviews in only 28% and 43% of their sectors, respectively; in these capitals, the model may have underestimated the proportion of adult smokers. Third, data from the 2010 census were used both to construct the post-stratification weights by IBP, which aim to minimize the selection bias of Vigitel in the period from 2010 to 2013, and as covariates of the models. Given the long time elapsed since the last census, these covariates may have changed. Fourth, the Vigitel databases from 2006 to 2013 were pooled despite the annual variation in prevalence (supplementary material - Table S12).
Brazil produces a large amount of health survey data covering the nation, major regions, federation units, metropolitan regions, and capitals. However, most of these states lack health information on their population in small areas, due to the high cost of surveys of this nature. In this sense, the IBP can be used to measure intra-urban inequalities in the country.
This study makes a methodological contribution to the production of indicators in smaller areas, thus supporting the states with information for the formulation, monitoring, and evaluation of programs and public policies for the adequate promotion of health to combat smoking.

Table 1. Prevalence estimate and prevalence ratio of adult smokers by city and by Brazilian Deprivation Index. Northern Region, Vigitel, 2010-2013.
IBP: Brazilian Deprivation Index; CI: confidence interval; PR: prevalence ratio; *High and Very High categories were grouped together due to the small number of interviews in the period.