Spatio-temporal patterns of tuberculosis incidence in Ribeirão Preto, State of São Paulo, southeast Brazil, and their relationship with social vulnerability: a Bayesian analysis

Introduction: The purpose of this ecological study was to evaluate the urban spatial and temporal distribution of tuberculosis (TB) in Ribeirão Preto, State of São Paulo, southeast Brazil, between 2006 and 2009 and to evaluate its relationship with factors of social vulnerability such as income and education level. Methods: We evaluated data from TBWeb, an electronic notification system for TB cases. Measures of social vulnerability were obtained from the SEADE Foundation, and information about the number of inhabitants, education, and income of the households was obtained from the Brazilian Institute of Geography and Statistics. Statistical analyses were conducted using a Bayesian regression model assuming a Poisson distribution for the observed new cases of TB in each area. A conditional autoregressive structure was used for the spatial covariance structure. Results: The Bayesian model confirmed the spatial heterogeneity of TB distribution in Ribeirão Preto, identifying areas with elevated risk and the effects of social vulnerability on the disease. We demonstrated that the rate of TB was correlated with the measures of income, education, and social vulnerability. However, we observed areas with low vulnerability and high education and income, but with high estimated TB rates. Conclusions: The study identified areas with different risks for TB, allowing the public health system to address the characteristics of each region individually and to prioritize those that present a higher propensity to risk of TB. Complex relationships may exist between TB incidence and a wide range of environmental and intrinsic factors, which need to be studied in future research.

Many authors [1][2][3] have stated that tuberculosis (TB) remains the world's leading cause of death due to a single infectious agent, Mycobacterium tuberculosis, with an estimated 3 million deaths and 10 million new cases each year [1]. It is estimated that 1 in every 3 people worldwide is skin test-positive for the infection and is thus believed to harbor the bacterium [4]. Tuberculosis is strongly associated with poverty [5] and related socioeconomic determinants such as malnutrition and micronutrient deficiencies [6]. The environmental and institutional factors associated with TB include climate, indoor air pollution, poor-quality care, treatment delays, and increased drug resistance [7][8].
Tuberculosis is an important public health problem in Brazil, with approximately 90,000 cases reported annually [9]. In 2004, the coefficient of TB incidence was 41 per 100,000 inhabitants [10], and TB incidence was higher in urban areas such as the States of Rio de Janeiro (120/100,000 inhabitants) and São Paulo (80/100,000 inhabitants) [10], with important differences among intra-urban clusters. For instance, in the City of Rio de Janeiro, the Rocinha favela (the largest Brazilian slum) is considered an area with the highest rate of TB, with an incidence approximately 3.5 times that of the general population of the city [11].
In 1995, Ribeirão Preto (southeast Brazil) was included among the 25 priority municipalities targeted for TB control in the State of São Paulo [12]. In that year, the annual mean coefficient of TB incidence was 57.19 per 100,000 persons [13]. After the implementation of actions such as the DOT strategy [14] (Directly Observed Treatment, a supervised treatment regimen), the coefficients of TB incidence in Ribeirão Preto decreased to 39.68, 35.73, 30.32, and 28.73 cases per 100,000 inhabitants for the years 2001, 2002, 2003, and 2004, respectively (data from SINAN, the Brazilian Information System and Disease Notification). However, the incidence coefficients for the years 2007, 2008, and 2009 were 33.61, 34.94, and 39.78 cases per 100,000 inhabitants, respectively (estimated using data from SINAN and data on population projections described by the Brazilian Institute of Geography and Statistics, IBGE). These values suggest a recent increase in TB incidence, calling for new studies to elucidate the dynamics of the disease in Ribeirão Preto and to identify the areas that need improvement or attention.
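To make the size of the reported rebound concrete, the following minimal sketch computes the relative change between two of the incidence coefficients quoted above. The helper names (`incidence_coefficient`, `relative_change`) are illustrative and not part of the original analysis; only the yearly coefficients come from the text.

```python
# Incidence coefficients (cases per 100,000 inhabitants) quoted in the text.
incidence = {2001: 39.68, 2002: 35.73, 2003: 30.32, 2004: 28.73,
             2007: 33.61, 2008: 34.94, 2009: 39.78}


def incidence_coefficient(cases, population, per=100_000):
    """Number of cases per `per` inhabitants (hypothetical helper)."""
    return cases * per / population


def relative_change(old, new):
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


# Relative change between the lowest quoted value (2004) and 2009.
change_2004_2009 = relative_change(incidence[2004], incidence[2009])
print(f"Change 2004 -> 2009: {change_2004_2009:.1f}%")
```

Under these figures, the 2009 coefficient is roughly 38% higher than the 2004 coefficient, which returns incidence to about the level recorded in 2001.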
The use of spatial analysis techniques [15] for mapping the geographic distribution of TB cases has been considered by many authors [16][17][18][19]. By modeling the spatial nature of TB incidence data, these works found that TB is not randomly distributed among different geographical regions, with disease cases tending to congregate at particular locations [18]. Thus, the identification of areas with different risks for TB allows the public health system to address the characteristics of each region specifically and to prioritize those that present higher incidences of the disease [20]. In this context, the objective of the present study was to evaluate the spatial and temporal distribution of TB in Ribeirão Preto between the years 2006 and 2009 and to evaluate its relationship with factors of social vulnerability such as income and education level.

Study design
This ecological study analyzed relationships between the number of annually registered TB cases in Ribeirão Preto, Brazil, and variables related to social vulnerability, income, and education. For this study, a map was constructed based on the areas of coverage of the municipality's 44 health units. The study period spanned the years 2006-2009.

Study locale
Ribeirão Preto is a municipality in the northeastern region of the State of São Paulo (21° 10' 42'' South and 47° 48' 24'' West) that has an agribusiness-based economy. In 2006, the population of Ribeirão Preto was estimated at 559,650 inhabitants in a 650-km² area.

Source of data
Data concerning the reported cases of TB in Ribeirão Preto were obtained from TBWeb [21], an electronic notification system for TB cases. The Municipal Health Secretary of Ribeirão Preto authorized access to TBWeb. A case of TB is defined as a radiologically or bacteriologically confirmed patient (smear- or culture-positive) and/or a patient with clinical confirmation of the disease. The system does not report suspected TB cases that do not meet these criteria. Individuals without a fixed address or who live in a prison or penitentiary were not considered in this study. Measures of social vulnerability were obtained from the SEADE Foundation (http://www.seade.gov.br/projetos/ipvs/), and information about the number of inhabitants, education, and income of the households was obtained from the IBGE.

Construction of the map
Although many Brazilian ecological studies have used IBGE-defined census tracts as the unit of analysis for neighborhoods [20,22], the use of health administrative areas for this purpose may be more appropriate when the objective is to describe the regions that require priority TB control actions from the municipal health system. In Ribeirão Preto, the municipal health system is organized into regions called Health Districts. There are 5 Health Districts, located in the north, south, east, west, and central regions of the city. Each Health District has a Basic and District Health Unit (UBDS, Unidades Básicas e Distritais de Saúde) that is the reference point for some medical specialties for the respective region. In addition, the Health Districts are composed of various Basic Health Units (Unidades Básicas de Saúde, UBS) or Family Health Strategies (Unidades de Saúde da Família, USF) that provide primary care to the population in their areas of coverage.
Thus, the construction of the map was based on the areas of coverage of each UBS, UBDS, or USF. The boundaries of the areas were drawn manually on a large printout of a map of the streets and regions of Ribeirão Preto in accordance with the information on the areas of coverage of each health unit obtained from the website of the Municipal Health Secretary. However, a description of the areas of coverage of the newest health units was not available on the home page. Therefore, it was necessary to schedule visits to the coordinators of these health units to obtain the required detailed information. The geographic limits of these units' areas of coverage were then covered by car to derive a more accurate transposition of the boundaries to the map. Finally, these geographic limits were transferred to digital format using a graphics editor that allowed manual drawing of such municipal street boundaries on a digital map. We identified 44 areas, shown in Figure 1. The denomination of the areas and the numeric labels used to identify them are also shown in Figure 1, and the UBDS, UBS, or USF responsible for each area is stated in parentheses. Because of their size, the University of São Paulo campus area (USP, area 19) and the Moura Lacerda University Center campus area (area 30) were considered separately.

Variables
Vulnerability is a construct that refers to a dynamic context in which someone is at risk for the development of health problems resulting from inadequate economic, social, family, cognitive, psychological, or physical resources [23]. In this study, the São Paulo Social Vulnerability Index (IPVS) [24] was used to establish the degree of social vulnerability prevalent among individuals residing in each area. The IPVS is composed of socioeconomic variables related to the income and education level of household members, and to the family life cycle, such as the type of family arrangement and age of the household members. This index has been used to classify the level of social vulnerability of people in each census tract in the State of São Paulo, considering 6 classification levels: 1) no vulnerability, 2) very low vulnerability, 3) low vulnerability, 4) medium vulnerability, 5) high vulnerability, and 6) very high vulnerability. In the present study, levels 5 and 6 were combined to form 1 category for highest vulnerability. The map based on each health unit's area of coverage was superimposed on a map on the same scale containing the IPVS for each census tract (available from http://www.seade.gov.br/projetos/ipvs/mapas/Municipio/ribeirao_preto.pdf), and the observed predominant IPVS for each area of coverage was considered.
The distribution of predominant incomes in households by the number of minimum salaries was also defined by superimposing the map based on each health unit's area of coverage on a map on the same scale containing the income for each census tract, obtained from IBGE. The predominant household income in each area was classified as 0-3 minimum salaries, >3-10 minimum salaries, and 10 or more minimum salaries (in 2011, 1 Brazilian minimum salary was equivalent to approximately US$345 per month). Analogously, the predominant education level of households in each of the 44 areas was also defined by map superimposition. This variable was classified as elementary, middle, and higher education (including post-graduate levels). Areas with a predominant number of household members without education were not found.
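The assignment of a predominant class to each coverage area can be illustrated with a short sketch. The text does not state how predominance was determined from the superimposed maps, so the rule used here (most frequent census-tract class within an area, with ties going to the higher class) is an assumption, and the tract data and area names are hypothetical.

```python
from collections import Counter

# Hypothetical IPVS classes (1-6) of the census tracts that overlap each
# coverage area; in the study this assignment came from map superimposition.
tracts_by_area = {
    "area_A": [1, 2, 2, 3, 2],
    "area_B": [5, 6, 6, 4, 5],
}


def merge_highest(level):
    """Combine IPVS levels 5 and 6 into one highest-vulnerability class (5)."""
    return 5 if level >= 5 else level


def predominant(levels):
    """Most frequent class after merging; ties go to the higher class.

    The tie-breaking rule is an assumption made for this sketch.
    """
    counts = Counter(merge_highest(level) for level in levels)
    top = max(counts.values())
    return max(cls for cls, n in counts.items() if n == top)


predominant_ipvs = {area: predominant(levels)
                    for area, levels in tracts_by_area.items()}
print(predominant_ipvs)
```

The same function could be applied to income or education classes coded as ordered integers.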

Statistical analysis
Let $Y_{it}$ be the observed number of new TB cases in area $i$ and year $t$ ($t = 1$ for 2006, $t = 2$ for 2007, and so on). It was assumed that the random variable $Y_{it}$ followed a Poisson distribution with parameter $\mu_{it}$, that is, $Y_{it} \mid \mu_{it} \sim \mathrm{Poisson}(\mu_{it})$, $i = 1, \ldots, 44$, $t = 1, \ldots, 4$, where $\mu_{it} = N_{it}\theta_{it}$, $N_{it}$ being the number of inhabitants in area $i$ in year $t$ and $\theta_{it}$ being the coefficient of TB incidence in area $i$ in year $t$. The coefficient $\theta_{it}$ is related to a vector of $p$ covariates, $X_i$ (or $p$ dummy variables related to a single qualitative covariate with $p + 1$ classes), in the following manner: $\log \theta_{it} = \alpha + \omega_i + \beta_{it} + \gamma' X_i$, where $\alpha$ is an intercept, $\omega_i$ denotes random effects, $\beta_{it}$ is a fixed effect related to area $i$ and year $t$, and $\gamma = (\gamma_1, \ldots, \gamma_p)'$ denotes a vector of parameters associated with the covariates $X_{1i}, \ldots, X_{pi}$. In the Bayesian analysis, non-informative prior knowledge was considered, with a flat distribution for the intercept $\alpha$. For each area $i$, we assigned a multivariate normal prior distribution to the parameters $\beta_{it}$, with a zero mean vector and a covariance matrix for which an inverse Wishart prior distribution was specified. The prior distribution for the random effects $\omega_i$ was assumed to have a conditional autoregressive structure that required an adjacency matrix and a weight matrix [25]. Thus, it was assumed that $\omega_i \mid \omega_j, j \in A^{*}(i) \sim N(\eta_i, \sigma^2 / n_i)$, $i = 1, \ldots, 44$, where $A^{*}(i)$ denotes the set of neighbors of area $i$, $\eta_i$ is the mean of the random effects corresponding to areas in the neighborhood of area $i$, $n_i$ is the number of regions forming this neighborhood, $\sigma^2$ is a dispersion parameter (whose inverse, $\sigma^{-2}$, is the precision), and $N(a, b)$ generically denotes a normal distribution with mean $a$ and variance $b$. Thus, we adopted the nearest neighbor criteria, where the prior spatial distribution for the random effects $\omega_i$ allowed contiguous areas to

RESULTS
Between 2006 and 2009, 705 cases of TB were reported in Ribeirão Preto; the patients' mean age was 41 years. Among these cases, 68.8% were men and 60.7% had 1-7 years of schooling. In addition, 22.5% of these cases were HIV positive, and 14.9% were alcoholic.
As specified in the previous section, a Bayesian model was fitted to the data in the first step, but without considering covariates. To examine the sensitivity of the results to model estimation methods, a series of alternative Bayesian models was fitted to the data. These included models using other prior structures for the effect $\beta_{it}$ related to area $i$ and year $t$, such as a mean-zero Gaussian process with an exponential covariance function, where the covariance between $\beta_{it}$ and $\beta_{it^{*}}$ for each area $i$ is a function of the number of years separating $t$ and $t^{*}$. However, the model described in the previous section had the lowest DIC value (DIC = 733.8).
Figure 2 illustrates maps of urban areas of Ribeirão Preto, displaying the adjusted mean number of reported new TB cases in the years 2006 to 2009. In all 4 years, the highest incidence rates of TB were observed in areas 33 (Jardim Itaú), 18 (Jardim Paiva), 36 (Maria das Graças), and 16 (Eugênio Mendes Lopes). Geographically, the new annual cases of TB in Ribeirão Preto tended to be more concentrated in the west and south regions. The east region (areas 38-44) tended to have the lowest TB incidence rates in the studied period. The Bayesian estimates and posterior 95% credible intervals (95% CI) for the annual number of new TB cases per 100,000 inhabitants for the 44 areas are plotted in Figure 3, where a heterogeneous pattern among the TB longitudinal trends across the areas is visible. Some areas experienced approximately constant TB occurrence over the 4-year period (for example, areas 7, 11, 14, and 39), while the annual incidence of TB in other areas fluctuated (for example, areas 1, 9, 21, and 40). The lines in Figure 3 suggest an increase in the annual incidence of TB in 2009 in areas 1 (Heitor Rigon), 5 (Adelino Simioni), 6 (Estação do Alto), 16 (Eugênio Mendes Lopes), 18 (Jardim Paiva), 21 (Sumarezinho), 23 (Vila Recreio), 30 (Campus Moura Lacerda), 32 (Jardim Recreio), 33 (Jardim Itaú), 34 (Adão do Carmo), 36 (Maria das Graças), and 40 (Jardim Zara). Some of these areas had high social vulnerability, such as areas 33 and 36; however, others were classified as being without vulnerability (such as areas 1 and 32) or with predominantly low vulnerability (such as areas 6 and 21).
In the second step, 1 covariate was added to this model at a time. A multiple model containing more than 1 covariate was not fitted to the data due to collinearity problems that are commonly found in ecological studies. The model containing only the predominant social vulnerability level as a covariate generated a DIC value of 730.6, while the model containing only the predominant household income level as a covariate generated a DIC value of 732.5, and the model containing only the predominant household education level as a covariate generated a DIC value of 733.7. Considering only the goodness-of-fit as estimated by the DIC, the model that included the social vulnerability level was better fitted, suggesting that of the 3 covariates, this covariate is most closely related to TB incidence. In addition, the models that included the spatial structure, even the model without an independent covariate, were better fitted. This supports the view that the spatial distribution of TB is heterogeneous and clustered.
Table 1 provides the estimated mean coefficient of incidence per 100,000 persons with its respective 95% CI, obtained by the Bayesian model and considering each year and class of the covariates predominant social vulnerability level, predominant household income level, and predominant household education level. In Table 1, a marked increase in the incidence coefficients of TB in 2009 can be observed, even in areas with an absence of social vulnerability, the highest income level, and the highest education level. Generally, it can be noted in Table 1 that the incidence coefficients of TB grow as the social vulnerability levels of the areas increase and as the predominant household income level decreases.
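The DIC comparison described above can be tabulated with a small sketch. The four DIC values are those reported in the text; the dictionary keys and variable names are illustrative only.

```python
# DIC values reported in the text for the spatial models.
dic = {
    "no covariate": 733.8,
    "social vulnerability": 730.6,
    "household income": 732.5,
    "household education": 733.7,
}

baseline = dic["no covariate"]

# Difference of each model's DIC from the covariate-free model
# (negative values indicate a lower, i.e. better, DIC).
delta = {model: round(value - baseline, 1) for model, value in dic.items()}

# Model with the lowest DIC.
best = min(dic, key=dic.get)

for model in sorted(dic, key=dic.get):
    print(f"{model:22s} DIC = {dic[model]:.1f}  delta = {delta[model]:+.1f}")
```

The differences from the covariate-free model are modest (3.2 units at most), so the ranking is best read as a relative ordering of the 3 covariates rather than as strong evidence for any single model.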
Roza DL et al - Spatio-temporal patterns of TB incidence
have similar weights. The prior distribution for the parameter $\sigma^2$ was chosen as an inverse gamma density, denoted by $\sigma^{-2} \sim \mathrm{Gamma}(0.5, 0.0005)$, as recommended in the literature [26].
For the Bayesian estimation of the parameters of the model, we considered the use of Markov chain Monte Carlo methods [27]. A major simplification in the sample simulation for the joint posterior distribution of interest was obtained using the GeoBUGS module [28] in WinBUGS, a software package for Bayesian analysis of complex statistical models using Monte Carlo methods [29]. Currently, all versions of WinBUGS are free and available from the BUGS Project website (http://www.mrc-bsu.cam.ac.uk/bugs). Using the GeoBUGS notation, we obtained $\omega_i \sim$ car.normal(adj[], weights[], num[], $\sigma^2$), where num[] denotes the number of neighbors, $n_i$, for each area, adj[] denotes the specification of all neighboring areas, and weights[] are specified weights (equal to 1 if areas $i$ and $j$ are neighbors, and 0 otherwise). The convergence of the chains was assessed using visual examination of trace, density, and autocorrelation plots, and by the Geweke criterion [30]. In the fit of all models, 150,000 samples for each parameter of interest were generated, with a burn-in of 5,000 iterations aimed at avoiding the influence of the initial values and a thinning interval of 50 aimed at avoiding correlation between successive samples.
The deviance information criterion (DIC) [31] was used to assess the goodness-of-fit of the different models. Models with a lower DIC value are usually selected as providing the best representation of the data.
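As a minimal sketch of the mean structure defined in the Statistical analysis section, the following code evaluates $\mu_{it} = N_{it}\theta_{it}$ with $\log \theta_{it} = \alpha + \omega_i + \beta_{it} + \gamma' X_i$ and the corresponding Poisson log-probability. All numeric values are invented for illustration and are not estimates from the study; the function names are hypothetical.

```python
import math


def expected_cases(n_inhabitants, alpha, omega_i, beta_it, gamma, x_i):
    """mu_it = N_it * theta_it, with
    log(theta_it) = alpha + omega_i + beta_it + gamma' x_i."""
    log_theta = alpha + omega_i + beta_it + sum(
        g * x for g, x in zip(gamma, x_i))
    return n_inhabitants * math.exp(log_theta)


def poisson_log_pmf(y, mu):
    """Log-probability of observing y cases under Poisson(mu)."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


# Illustrative values only: a baseline rate of 30 per 100,000, no random
# or area-year effect, and one dummy covariate that is switched off.
alpha = math.log(30 / 100_000)
mu = expected_cases(10_000, alpha, omega_i=0.0, beta_it=0.0,
                    gamma=[0.5], x_i=[0])
print(f"expected cases: {mu:.2f}")
print(f"log P(Y = 4):   {poisson_log_pmf(4, mu):.3f}")
```

With these inputs an area of 10,000 inhabitants has about 3 expected cases per year; setting the dummy covariate to 1 would multiply the rate by exp(0.5).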
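The adj[], weights[], and num[] arguments described above are flat arrays derived from the map's neighbor relations. The sketch below builds them in Python for a hypothetical 4-area map (not the 44-area map of the study) and computes the conditional prior mean $\eta_i$ as the average of the neighboring random effects; the function names are illustrative.

```python
# Hypothetical 4-area map: neighbours of each area (1-based labels).
neighbours = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}


def to_geobugs_arrays(neighbours):
    """Flatten a neighbour dictionary into adj, weights, and num arrays."""
    adj, num = [], []
    for area in sorted(neighbours):
        adj.extend(sorted(neighbours[area]))   # neighbours listed area by area
        num.append(len(neighbours[area]))      # n_i for each area
    weights = [1] * len(adj)                   # weight 1 for every neighbour pair
    return adj, weights, num


def conditional_mean(area, omega, neighbours):
    """eta_i: mean of the random effects of the neighbours of `area`."""
    nb = neighbours[area]
    return sum(omega[j] for j in nb) / len(nb)


adj, weights, num = to_geobugs_arrays(neighbours)
omega = {1: 0.2, 2: -0.1, 3: 0.4, 4: 0.0}     # illustrative random effects
eta_3 = conditional_mean(3, omega, neighbours)
print(adj, weights, num, round(eta_3, 4))
```

The length of adj[] equals the sum of num[], and the neighbor relation must be symmetric (if area 1 lists area 2, area 2 must list area 1).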
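For readers unfamiliar with the DIC, it is commonly defined as the posterior mean deviance plus an effective number of parameters, pD, the latter being the mean deviance minus the deviance at the posterior means. The sketch below applies this definition to Poisson counts using a handful of invented posterior draws. It is a simplification: it plugs in the posterior mean of the Poisson means, whereas WinBUGS evaluates the plug-in deviance at the posterior means of the model's stochastic parameters, so the two need not agree exactly.

```python
import math


def poisson_deviance(y, mu):
    """-2 * log-likelihood of independent Poisson counts y with means mu."""
    loglik = sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
                 for yi, mi in zip(y, mu))
    return -2.0 * loglik


def dic(y, mu_draws):
    """Return (DIC, pD) from posterior draws of the Poisson means.

    mu_draws is a list of draws; each draw holds one mean per observation.
    """
    d_bar = sum(poisson_deviance(y, mu) for mu in mu_draws) / len(mu_draws)
    mu_mean = [sum(col) / len(col) for col in zip(*mu_draws)]
    d_hat = poisson_deviance(y, mu_mean)   # deviance at the plug-in estimate
    p_d = d_bar - d_hat                    # effective number of parameters
    return d_bar + p_d, p_d


# Invented counts and posterior draws, for illustration only.
y = [2, 0, 5]
draws = [[2.0, 0.5, 4.0], [3.0, 1.0, 6.0]]
dic_value, p_d = dic(y, draws)
print(f"DIC = {dic_value:.2f}, pD = {p_d:.2f}")
```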

Ethical considerations
The Committee for Ethics in Research at HCFMRP-USP, University of São Paulo, reviewed and approved this study under process 11304/2009.

DISCUSSION
In the present study, the rate of TB was correlated with the measures of income, education, and social vulnerability. These results are compatible with previous ecological analyses from the literature [32][33][34], confirming that TB continues to be a disease strongly related to the social characteristics of a population. Marais et al. [35] cite 3 primary causes that determine the role of poverty in the transmission of M. tuberculosis: I) its influence on living conditions, such as people living in overcrowded and poorly ventilated homes, II) prolonged diagnostic delay, and III) increased vulnerability due to malnutrition and/or HIV infection. Ecological studies provide a means of examining TB within its social context [34], as the relationship between the disease and the conditions related to poverty and low socioeconomic status can be unclear in studies conducted at an individual level. Several studies have already addressed the spatio-temporal patterns of TB in Ribeirão Preto [36,37], but these studies are limited because they only describe the place of residence of the individuals who developed TB and the regions with the largest concentration of cases of the disease, without the use of regression models that formally link vulnerability and socioeconomic factors to the disease from a spatial perspective. In this context, Bayesian spatial models are useful tools to understand spatial variation in disease risk and to provide important information for assessing and quantifying the amount of true spatial heterogeneity and the associated patterns [38]. In addition, these models are able to identify the areas of elevated risk and propensity of the disease, and they allow exploration of the relations between the disease and environmental exposures. Bayesian models similar to the one proposed here have been applied to investigate the spatial pattern of TB in different populations [39,40]. In the present study, the Bayesian model confirmed the spatial heterogeneity of TB distribution in Ribeirão Preto, identifying areas with elevated risk and the effects of social vulnerability on the disease.
Despite the well-known association between TB and poverty [5], the present study demonstrated that the TB rates in Ribeirão Preto were also increased in areas with low social vulnerability and high income and education levels (Table 1). This suggests that more complex relationships may exist between TB incidence and a wide range of environmental and intrinsic factors such as the individuals' occupations, work conditions, poor nutrition, access to quality healthcare, living conditions, crowding, and social behavior. Occupational exposures should be further explored in individual-level studies, given that areas such as Jardim Recreio (area 32 on the map), in which many people who work in healthcare live due to its proximity to the Hospital of Clinics of the University of São Paulo, have a low vulnerability level and high education and income levels, but high estimated TB rates.
This study has limitations that are intrinsic to its ecological design, such as the possibility of migration across the areas, temporal ambiguity (the disease can influence the social vulnerability), and ecologic biases [41,42]. The number of inhabitants in each area was estimated based on information from the IBGE, and the values may not be very accurate, especially in areas with new neighborhoods and higher than expected growth rates. The estimated mean coefficients of incidence were
Introduction: The objective of this ecological study is to evaluate the spatial and temporal distribution of tuberculosis (TB) in the urban area of Ribeirão Preto, São Paulo, between 2006 and 2009, and to study its relationships with social vulnerability factors such as income and education. Methods: Data from TBWeb, a TB notification system, were used. Measures of social vulnerability were obtained from the SEADE Foundation (Sistema Estadual de Análise de Dados), and information on the number of inhabitants and on the education and income of heads of households was obtained from the Brazilian Institute of Geography and Statistics. The statistical analysis used a Bayesian regression model assuming that the new TB cases observed in each area follow a Poisson distribution. Results: The Bayesian model confirmed the spatial heterogeneity of the distribution of TB in Ribeirão Preto, identifying areas with elevated risk of TB and the effects of social vulnerability on the disease. The TB rate was shown to be associated with the measures of income, education, and social vulnerability. However, areas with low social vulnerability and a high education level but high TB rates were observed. Conclusions: The study identified areas with different risks of TB, allowing the public health system to deal with the different characteristics of each region and to prioritize those with a greater propensity for TB risk. Complex relationships are evident between TB incidence and a large number of environmental and intrinsic factors, which shows the need for these to be studied in future work.

FIGURE 2 - Estimated mean number of new cases of tuberculosis (TB) in urban areas of Ribeirão Preto, State of São Paulo, in (a) 2006, (b) 2007, (c) 2008, and (d) 2009. The values in the legend denote the number of cases per 100,000 inhabitants.