Social segregation and lethal police violence in the city of São Paulo, Brazil (2014-2015)

Abstract We aimed to investigate how lethal police violence (LPV) in the City of São Paulo (CSP), Brazil, is associated with socioeconomic development when we consider the victims' place of residence and the locations of the fatal injuries. The spatial distribution of the lethal police violence rate (LPVR) and its association with the human development index (HDI) were investigated using Moran's I (Global and Bivariate Local). Between 2014 and 2015 we found 403 police victims in the Health database and 794 victims in the Public Security database. We found a non-random spatial distribution of the LPVR considering the victims' place of residence (I=+0.12; p<0.001) and the locations where the fatal injuries were inflicted (I=+0.07; p<0.001). We found a negative association between the LPVR and the HDI of the place of residence (I=-0.10; p<0.001) and a positive association between the LPVR and the HDI of the locations of the fatal injuries (I=+0.02; p<0.001). The results point to different dynamics of LPV in the CSP. High mortality clusters are found in areas with lower HDI when considering the victims' address, and in areas with higher HDI when considering the address of the violent events. LPV affects young, black, poorly educated residents of the outskirts, informing us about patterns of social segregation.
Keywords Legal intervention, Police, Spatial analysis, Social segregation


Introduction
Since 1996, when the World Health Organization recognized violence as a global public health problem, there has been a growing understanding that the phenomenon needs to be addressed from a public health perspective 1. The impact of interpersonal violence on mortality is evident in the large number of homicide deaths worldwide. In Brazil, much of it is related to the use of firearms and disproportionately affects men, youth, black people and those living in peripheral areas of urban centers 2,3. In 2018, the American Public Health Association (APHA) approved a statement that violence perpetrated by law enforcement institutions needs to be tackled from a public health perspective, since these institutions can perpetuate physical, psychological and sexual violence, as well as violence through neglect. In this context, tackling police violence (PV) is considered paramount to strengthening democracy and the health conditions of the population. This declaration stems from the recognition of the negative consequences of violent policing, its impacts on the most marginalized populations, and the racial selectivity of public security apparatuses 4. Since the police are one of the most visible faces of the State, the way they exercise authority does not simply reproduce inequalities, but may deepen them, reinforcing precarious models of citizenship 5.
Several studies indicate that marginalized populations are disproportionately exposed to experiences of PV, including people living with mental disorders 6, the LGBTQI+ population 7, homeless people 8, people with low income 9, sex workers 10, drug users 11, and residents of or visitors to neighborhoods hyper-surveilled by the police 12.
Studies consistently demonstrate that homicide deaths are concentrated in areas with poorer socioeconomic conditions and racial segregation 13,14. Studies of the spatial distribution of violence perpetrated by the police are rarer, which is partially explained by the difficulty of accessing information and the poor quality of data 15-17. In a study conducted in the USA 18, higher rates of police-related deaths occurred in neighborhoods with the highest concentration of black and low-income residents. The authors concluded that the contextual characteristics of a neighborhood are important to understand, prevent, and respond to lethal police violence (LPV).
Brazil is a profoundly unequal country with high levels of violence, a strong presence of organized crime groups in urban territories, and recognized violent action by police forces 19. In 2017, more than 65,600 homicides occurred in Brazil, of which 6,220 deaths were perpetrated by the police. In the State of São Paulo, in the same period, 20% of all homicides were caused by the police 20. Brazil is one of the countries with the largest number of cases of LPV in the world 21.
Few studies, however, have sought to investigate the characteristics and spatial distribution of these deaths and their association with the victims' living conditions. The few studies carried out analyzed the spatial dynamics according either to the locations where the fatal injuries were inflicted 22,23 or to the victims' place of residence 24. No study has sought to analyze the spatial dynamics using both sources of data, which provide different information about the same occurrences.
It is possible to assume that there are differences in the spatial distribution of deaths, as well as in their association with the socioeconomic conditions of urban areas, depending on whether we consider the victims' place of residence or the locations where the fatal injuries were inflicted. Our main hypothesis is that LPV affects poor people in wealthy areas, where they are perceived as suspicious, and also in poor areas, where they live. Our analysis allows us to advance the understanding of LPV not only in the CSP but globally, since we could not find any study considering the spatial distribution of LPV according to both the place of residence and the location where the fatal injuries were inflicted.
In this article, our objective is to investigate the association between the lethal police violence rate (LPVR) and socioeconomic development in the CSP when we consider the victims' place of residence and the locations where the fatal injuries were inflicted.

Methods
This study was conducted in the CSP, capital of the state of São Paulo and the largest city in Brazil, with approximately 12 million inhabitants. According to the Atlas of Human Development of the United Nations Development Program (UNDP), the Human Development Index (HDI) of the CSP in 2010 was 0.805, which makes São Paulo a city with a very high HDI. In 2017, according to official data from the Municipal Secretariat of Health, the CSP homicide mortality rate was 9.1/100,000 inhabitants.
The CSP is internally divided into 1,593 Human Development Units (HDUs), and in this study we use the HDU as the unit of analysis.

Data Sources
All LPV cases that occurred in the CSP and were recorded in two official sources in 2014 and 2015 were obtained. We chose to work with the 2014 and 2015 data because, taken together, these two years proved sufficiently robust to measure the patterns of LPV. Furthermore, 2014 is the first year of an increasing trend in lethal police violence in the city of São Paulo, which is still growing.
The first official source is the Program for Improving Mortality Information (PROAIM) of the Health Department of the CSP, where data are recorded from the death certificate (DO) and the cause of death is classified following the 10th edition of the International Classification of Diseases (ICD-10). A database of all cases classified under code Y35 (Legal Intervention) was obtained with the following information: age; home address; education; sex; race/skin color; date and time of death. The second official source is the Public Security Secretary of the State of São Paulo, responsible for the consolidation of criminal records, which are registered in Police Reports. Public Security classifies LPV as homicide resulting from police intervention. All registered incident reports were requested under the law of access to information. The following information was collected for each case: sex; date of birth; race/skin color; and the location where the fatal injuries were inflicted.
The use of the two data sources allows spatial analysis to be carried out considering the victims' home address, available on death certificates, and the locations where the violent events occurred, available in the police records.
The resident population of the city of São Paulo was obtained from the UNDP database in order to calculate the mortality rates per 100,000 inhabitants for each HDU. UNDP uses data from the Demographic Census conducted by IBGE in 2010. To characterize the degree of human development in the HDUs, we used the HDI obtained from the UNDP Brazilian Atlas of Human Development. The HDI is an indicator composed of three dimensions of human development: health, education and income 25.
This project was approved by the Ethics Committee of the University of São Paulo Medical School.

Analysis
Initially, we performed a descriptive analysis of the victims' sociodemographic characteristics for each of the sources of information separately. For this analysis, we used Stata 12 software.
All cases were geocoded via the Google My Maps website using the addresses (victims' place of residence and locations where the fatal injuries were inflicted). Using the MMQGIS plugin in QGIS software, we counted the deaths in each HDU. We estimated the 2-year average LPVR per 100,000 inhabitants using the Spatial Empirical Bayesian Rate (SEBR) for each HDU. For this, a neighbor matrix was created based on second-order Queen contiguity. We chose SEBR smoothing given the rarity of the event and the great variability in population across HDUs.
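As a rough illustration of the smoothing step, the sketch below shrinks each unit's raw rate toward the rate of its local window (the unit plus its neighbors) using a method-of-moments empirical Bayes estimator. It is an assumption that this matches the exact variant implemented in the software used in the study. The function name `spatial_eb_rates`, the four-unit neighbor structure and all counts are hypothetical toy values, not study data.

```python
def spatial_eb_rates(deaths, population, neighbors, per=100_000):
    """Shrink each raw rate toward the rate of its local window.

    neighbors maps a unit index to the indices of its neighbors
    (for example, from a second-order Queen contiguity matrix).
    """
    smoothed = []
    for i in range(len(deaths)):
        window = [i] + list(neighbors[i])
        pop_w = sum(population[j] for j in window)
        theta = sum(deaths[j] for j in window) / pop_w  # local prior mean rate
        # Population-weighted variance of raw rates around the prior mean
        var = sum(
            population[j] * (deaths[j] / population[j] - theta) ** 2
            for j in window
        ) / pop_w
        mean_pop = pop_w / len(window)
        phi = max(var - theta / mean_pop, 0.0)  # prior variance, floored at 0
        raw = deaths[i] / population[i]
        denom = phi + theta / population[i]
        weight = phi / denom if denom > 0 else 0.0  # small populations shrink more
        smoothed.append((weight * raw + (1 - weight) * theta) * per)
    return smoothed


# Hypothetical toy data: four units arranged in a line
deaths = [0, 1, 0, 6]
population = [500, 4000, 3000, 800]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
smoothed = spatial_eb_rates(deaths, population, neighbors)
```

In this toy case the unit with six deaths among 800 residents has a raw rate of 750 per 100,000, and the smoothed value is pulled toward the lower rate of its window. This is the stabilizing effect sought for rare events in small populations.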
The existence of spatial clusters was investigated through the Global Moran's I and the Local Moran's I. To analyze the spatial correlation between the distribution of the LPVR and the HDI, we used the Local Bivariate Moran's I. Statistical significance for Moran's I was tested with a Monte Carlo test with 999 permutations (p=0.001).
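The Global Moran's I and its permutation test can be sketched as follows. This is a minimal illustration, assuming row-standardized weights and a one-sided pseudo p-value computed as (M + 1)/(R + 1), where M counts permuted statistics at least as extreme as the observed one and R is the number of permutations. With 999 permutations the smallest attainable pseudo p-value is therefore 0.001. The function names and the six-unit toy layout are hypothetical.

```python
import random


def morans_i(x, neighbors):
    """Global Moran's I with row-standardized weights."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    cross = 0.0
    s0 = 0  # sum of weights: one per unit that has neighbors
    for i in range(n):
        if neighbors[i]:
            lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
            cross += z[i] * lag
            s0 += 1
    return (n / s0) * cross / sum(v * v for v in z)


def permutation_test(x, neighbors, permutations=999, seed=0):
    """Return (observed I, one-sided pseudo p-value)."""
    rng = random.Random(seed)
    observed = morans_i(x, neighbors)
    shuffled = list(x)
    extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # break any spatial structure
        sim = morans_i(shuffled, neighbors)
        if (observed >= 0 and sim >= observed) or (observed < 0 and sim <= observed):
            extreme += 1
    return observed, (extreme + 1) / (permutations + 1)


# Hypothetical toy layout: six units in a line, each adjacent to the next
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
gradient_i, gradient_p = permutation_test([1, 2, 3, 4, 5, 6], line)
alternating_i = morans_i([1, 0, 1, 0, 1, 0], line)
```

A smooth gradient yields a positive statistic (similar values sit next to each other), while a strictly alternating pattern yields a negative one.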
For spatial analysis, we used GeoDa software. Additionally, we used QGIS 2.18.23 software for map creation.

Results
Between 2014 and 2015 in the CSP, we found 403 police victims in the Municipal Health database and 794 in the registry of the Public Security State Department. The Public Security count is almost twice the Health count, indicating underreporting by the Health Department. The sociodemographic profile of the victims of LPV in the CSP is shown in Table 1.
Victims of LPV are mostly young, black, and male. The lowest age found was 11 years, and the average age is similar in both sources, around 23 years. According to both sources, men predominate among the deaths (>99.5%). About 60% to 70% of the victims were black (black + brown). It is important to emphasize that only 37% of the population of the CSP is black according to the census. Finally, most victims (72%) had little or no education (less than three years of formal education). It is noteworthy that 14% had no formal education at all and only 1.5% had eight years or more of schooling (at least elementary school).
Of the 403 deaths recorded in the Health database, 385 (95.5%) had registered addresses. We were able to geocode 99.5% of the cases. The vast majority resided in the CSP (n=361; 90%). The remaining (n=57; 10%) are deaths that occurred in the CSP but whose victims resided in another city; these were excluded from our analysis.
Notably, when analyzing the distribution of deaths by the victims' place of residence, of the 1,593 HDUs of the city of São Paulo, 1,366 (85.75%) had no record of LPV. In the other 227 (14.25%), there was at least one case, ranging from 1 to a maximum of 6. The mean number of deaths per HDU in the CSP was 0.21 (sd=0.63).
Of the 794 cases registered by Public Security, considering the locations where the fatal injuries were inflicted, 717 (90.3%) were geocoded using the address. Police records that did not provide an address or had inaccurate records (9.7%) were excluded from the analysis. In 1,210 HDUs (76%), there was no LPV. In the other 383 (24%), there was at least one case, ranging from 1 to a maximum of 10. The mean number of deaths per HDU in the CSP was 0.45 (sd=1).
The LPVR, according to the locations where the fatal injuries were inflicted, ranged from 0 to 137 per 100,000 inhabitants. The average was 3.5/100,000 (sd=4.2). Based on rate quartiles, we named the categories as follows: Low (up to 2.0 deaths/100,000), Intermediate (2.06-3.1/100,000), High (3.16-4.5/100,000) and Very high (4.5-137/100,000). In Figure 1 we present the LPVR distribution according to the victims' place of residence (1a) and the locations where the fatal injuries were inflicted (1b). Besides the mortality rates, in Figure 1c we present a map of the distribution of the HDI by HDU.
The maps are complementary and together help to understand the urban dynamics related to mortality due to LPV in the CSP. According to the Health data (Figure 1a), which consider the regions of the victims' residence, the highest rates (fourth quartile) are mainly located at the edges of the city, in peripheral regions such as the East and North HDUs. In the Central region and towards the beginning of the West Zone, areas with better socioeconomic conditions, HDUs present mainly low (first quartile) and intermediate (second quartile) rates.
According to Figure 1b, which is based on the locations where the fatal injuries were inflicted, the distribution of the mortality rates shows a more diffuse and less pronounced pattern. We see high and very high rates in more affluent areas of the city: in and around the Center, as well as in the West Zone. The eastern HDUs form a kind of "mortality belt", with rates ranging from intermediate (second quartile) to very high (fourth quartile). High and very high rates are also found in the North, South and West HDUs.
We can see an overlap between the HDUs with high mortality rates, considering the victims' places of residence (Figure 1a), and the places with the worst socioeconomic indicators in the CSP (Figure 1c). This shows that State violence is directed at the more deprived areas on the periphery of the city, mainly in the North and East regions.
The Global Moran's I shows a positive spatial correlation in the distribution of deaths considering both the victims' place of residence (I=+0.12; p=0.001) and the locations where the fatal injuries were inflicted (I=+0.07; p=0.001). Despite their low magnitude, both values indicate the existence of spatial dependence of police lethality in the urban space of the CSP. Figure 2 shows the Global Moran's I scatter plots (1a and 1b), in which a positive linear fit through the point cloud points to patterns of spatial segregation. Figure 3 shows the clustering of HDUs, with five distinct autocorrelation patterns identified. Of particular interest are the High-High clusters, with high mortality rates within the HDU and in surrounding neighborhoods, and the Low-Low clusters, with low mortality rates in the HDU and in surrounding neighborhoods.
When we consider the victims' place of residence (Figure 3a), low mortality clusters predominate in the Expanded Center of the city, mainly towards the West and South Zones. High mortality clusters are prominent in the East and North of the city. When considering the locations where the fatal injuries were inflicted (Figure 3b), High-High clusters are more widely distributed throughout the city, being present in all regions (North, South, East and West) except the Center. Low-mortality clusters are spread across the city, without a pronounced spatial pattern.
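The cluster labels discussed above (High-High, Low-Low, Low-High, High-Low, or not significant) can be derived from the Local Moran's I roughly as sketched below. This is an illustration only. It assumes row-standardized weights, conditional permutations (the unit's own value is held fixed while neighbor values are resampled from the other units) and a 0.05 significance cutoff, which is an assumed value since the cutoff used for the cluster maps is not stated here. The function names and toy values are hypothetical.

```python
import random


def quadrant(dev, lag_dev):
    """Label a unit by the sign of its deviation and of its neighbors' mean deviation."""
    return ("High" if dev > 0 else "Low") + "-" + ("High" if lag_dev > 0 else "Low")


def local_moran_labels(x, neighbors, permutations=999, alpha=0.05, seed=0):
    rng = random.Random(seed)
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    m2 = sum(v * v for v in z) / n
    labels = []
    for i in range(n):
        k = len(neighbors[i])
        if k == 0:
            labels.append("Not significant")
            continue
        lag = sum(z[j] for j in neighbors[i]) / k
        local_i = z[i] * lag / m2
        others = z[:i] + z[i + 1:]  # conditional permutation: unit i stays fixed
        extreme = 0
        for _ in range(permutations):
            sim_i = z[i] * (sum(rng.sample(others, k)) / k) / m2
            if (local_i >= 0 and sim_i >= local_i) or (local_i < 0 and sim_i <= local_i):
                extreme += 1
        p = (extreme + 1) / (permutations + 1)
        labels.append(quadrant(z[i], lag) if p <= alpha else "Not significant")
    return labels


# Hypothetical toy data: six units in a line, high values on one side
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
labels = local_moran_labels([9, 8, 9, 1, 2, 1], line)
```

Each unit ends up in exactly one of the five categories, which is what a cluster map such as Figure 3 displays.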

Association between LPVR and the HDI
The spatial correlation between the LPVR and the HDI is presented in Figure 4. We found spatial correlation whether considering the distribution of deaths by the victims' place of residence or by the locations where the fatal injuries were inflicted. The direction of the correlation, however, is inverted. Based on Health data (home address), we found a negative correlation between LPVR and HDI (I=-0.10; p<0.001), indicating higher mortality rates in areas with low HDI. On the other hand, based on the Public Security data (locations where the fatal injuries were inflicted), we found a positive correlation (I=+0.02; p<0.001), which indicates higher mortality in areas with better socioeconomic indicators.
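To make the sign of the bivariate statistic concrete, the sketch below computes a global Bivariate Moran's I as the slope of the neighbors' average of one standardized variable on another standardized variable measured in the unit itself. Which variable plays which role in the analysis above is not stated, so the assignment here is an assumption, as are the function names and the toy values.

```python
from statistics import mean, pstdev


def standardize(values):
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def bivariate_morans_i(x, y, neighbors):
    """Association between x in each unit and the neighbors' mean of y.

    Uses row-standardized weights and assumes every unit has neighbors.
    """
    zx, zy = standardize(x), standardize(y)
    cross = 0.0
    for i in range(len(zx)):
        nbrs = neighbors[i]
        if nbrs:
            cross += zx[i] * sum(zy[j] for j in nbrs) / len(nbrs)
    return cross / sum(v * v for v in zx)


# Hypothetical toy data: six units in a line
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
rate = [1, 2, 3, 4, 5, 6]
index_opposed = [6, 5, 4, 3, 2, 1]  # high where the rate is low
negative_i = bivariate_morans_i(rate, index_opposed, line)
positive_i = bivariate_morans_i(rate, rate, line)
```

A negative value means that units with high values of the first variable tend to be surrounded by low values of the second, which is the pattern reported for the place of residence. A positive value means the opposite.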
In addition, Figure 4 shows the geographical distribution of spatial clusters for the association between the HDI and the LPVR. Considering the locations where the fatal injuries were inflicted (Figure 4b), the HDUs with high mortality rates and high HDI are mostly in the expanded center and at the boundaries between the center and the periphery, especially in the northern and southern regions of the municipality. When we consider the distribution according to the victims' place of residence (Figure 4a) [...] The scatter plots of the Bivariate Moran's I can be seen in Figure 2 (scatter plots 2a and 2b), which show the correlations between LPVR and HDI in each HDU and nearby areas. Scatter plot 2a shows a negative slope and scatter plot 2b a positive slope, but both with values that indicate very low-intensity associations.

Discussion
Our results reinforce the evidence of racial and social bias in LPV as well as the existence of spatial clusters with high mortality. Additionally, our data corroborate the evidence of an uneven distribution of LPV, which disproportionately affects those with a vulnerable social background. Most victims of police brutality are black, young, and poorly educated. Moreover, the spatial analysis demonstrates that victims of LPV live in deprived areas, while lethal confrontations are concentrated mainly in affluent areas of the city. The spatial analysis made it possible to identify distinct urban dynamics of LPV.
By working with two data sources, it was possible to observe two complementary patterns of spatial distribution of police killings in the CSP. The first is the targeting of social groups living in specific districts of the city, mainly the urban peripheries, marked by the worst socioeconomic indicators and a negative correlation between mortality rates and the HDI. The second pattern, when we consider the places of fatal encounters with the police, shows the presence of violence both in peripheral districts, with the worst socioeconomic indicators, and in the Center and nearby districts, which have higher HDI. In this case, the correlation between mortality rates and the HDI is positive. Our results corroborate those of other authors, showing that PV mainly affects those in a weaker social position 3,6-12,15,17,18,26-32.
To better understand how these differences arise, it is necessary to consider aspects related to police distribution and performance in the CSP. Police distribution and the repression it exercises in the CSP are closely linked with social hierarchies and power structures 5. Population groups that have greater social prestige and economic power claim greater police presence in their neighborhoods and greater repression against socially disadvantaged inhabitants. According to Gonzalez 5, the unequal distribution of safety occurs in two different ways. The first is the demand for repression, by citizens with higher social status, against those who are marginalized (by race/color, social class or spatial status). The second is the legitimation and prioritization of the population's claims by Public Security agents. Both legitimation and prioritization are based on the position and status of those who make the claims. This means that the Security Forces respond more to demands for security and control from citizen groups with higher social status. These demands target marginalized groups of citizens who live in vulnerable areas but also circulate in more affluent regions of the city, where they may be perceived as dangerous, becoming vulnerable to LPV 29. Although we know that social hierarchies are closely related to LPV, we cannot discard other explanations for this pattern, which need to be further investigated.
The identification of different clusters of high mortality, when considering the address of residence and the place of occurrence of lethal conflicts, suggests the existence of social and racial segregation processes. This picture seems to contribute to racial discrimination by the State, when we consider the racial profile of the victims, supporting the evidence of institutional racism in the police force. Different studies, from various countries, point to racial bias in PV and LPV 12,18,20,33,34. State-sanctioned violence, such as PV, is one of the pathways that connect institutional racism and health 33. Police killings in the city of São Paulo are part of an exclusion and segregation mechanism operated by the State that connects racism to social inequalities 31.
The recognition of PV as a public health problem is recent. The many effects of PV on health are widely recognized and include physical, mental and moral consequences that may persist over time 12-16,26-28. PV affects the health of the population and individuals, generates distrust and undermines confidence in public institutions and policies 4,15. Although the social and health impact of PV is recognized 4, its systematic study remains a challenge in different countries, including Brazil, due to the lack of reliable data 20,35-37. To deal with this limitation, researchers are using alternative databases that aim at strengthening the validity of the given data, showing a better picture of the problem. In our paper two official sources were used: one reporting vital statistics (Health Department) and the other reporting criminal records (Public Security Department).
Even though some differences between the figures provided by distinct sources are expected, our data show a non-negligible underreporting of police killings by health authorities (n=403) when compared to police records (n=794). Although Public Security reports more cases, its data are often incomplete regarding the sociodemographic characteristics of the victims. Despite the underreporting, the health information system presents excellent quality considering the completeness of the victims' sociodemographic information.
There are few studies on LPV in Brazil that use data from the health sector. This scarcity is explained by the underreporting of these deaths. A study 35 in Salvador, Bahia, showed that deaths by Legal Intervention (Y35) are often classified as deaths by aggression (X85-Y09). The authors point out that forensic doctors resist using category Y35 due to possible legal consequences. Also, the lack of coordination and the insufficient exchange of information between health departments and the police can hinder the correct classification of deaths from external causes. In such cases, it is essential to clarify the circumstances of the death through regular access to police reports and the building of formal and systematic flows for sharing information.
The acknowledgment of distinct problems affecting the quality of the official data suggests that efforts to make the two databases compatible would result in substantial gains for both. Linkage experiments with police, health, and media databases have been conducted 36,37 . These enabled a deeper understanding of the topic and improved the quality of information.
Analyzing and making public the data on LPV is necessary since it can be used as a tool to control police activity enabling better decision-making, with gains for the society as a whole, including improvements in the health of the population.
It should be stated that every effort to strengthen the quality of information and to guarantee access to it is a way to deepen the discussion on access to rights, contributing to social control and to improving the quality of democracy. This is of special interest considering the growth of far-right political movements in different countries, including Brazil.

Strengths and limitations
Our study has some strengths and limitations. As far as we know, it is the first study on the spatial distribution of LPV that analyzes data from two different official sources, making it possible to consider the spatial distribution of both the victims' homes and the violent confrontations. Most studies of the spatial distribution of LPV from a public health perspective are from high-income countries with low rates of violence and police killings. Research on LPV in countries with a high number of cases and high social inequalities, such as Brazil, adds a valuable contribution to the knowledge about the topic. Our results reinforce the evidence of racial and social bias in LPV, as well as the existence of spatial clusters with high rates. Additionally, our data corroborate the evidence of an uneven distribution of LPV, which disproportionately affects those with a vulnerable social background.
Even though our study has many strengths, some limitations should be taken into consideration when interpreting the results. As mentioned, the health information system underreports deaths resulting from police intervention (ICD-10 code Y35) when compared to criminal records. As a result, spatial analysis based on Health data that consider the victims' residence could be biased. It is not possible to know whether the results would be the same if all cases were included in the analysis, since the home addresses of those missing from the Health records are unknown. Data from Public Security lack a non-negligible amount of information on some key variables, such as age and other sociodemographic indicators. We are not able to know how this affects the results of the descriptive analysis of the victims' characteristics. To overcome both issues, the linkage of health and criminal data would be necessary and is under construction. Additional analyses will be conducted soon.
Another limitation is that our analysis was based on crude mortality rates. We should also highlight that, when working with a rare event and with very small geographic units (HDUs), the LPVR can vary greatly. We used Bayesian smoothing to work around this limitation.
As for the HDU population, we used information from the 2010 Census, while LPV data were from 2014 and 2015. Population estimates for HDUs are not available for more recent years. The short time interval makes it unlikely that changes substantial enough to influence the magnitude of the LPVR have occurred. The aggregation of census tracts into HDUs respects a minimum of 400 households per unit. Although HDUs were delimited considering contiguity, in some cases the criterion of spatial contiguity may not have been respected in order to guarantee the minimum of 400 households with homogeneous socioeconomic characteristics 38. This can influence spatial patterns and underestimate the magnitude of spatial correlations. Another limitation, with a possible impact on the spatial analysis, is that we did not verify a sample of the geocoded addresses, although we had excellent geocoding coverage (>90%).

Conclusions
Police forces have the prerogative of the legitimate use of force, and killing could be exceptionally justified in some specific circumstances. In Brazil, the excessive number of cases, the profile of the victims and the spatial distribution of deaths are clear signs that the use of lethal force is no longer exceptional and can no longer be considered legitimate.
Public Health and Public Security information systems have flaws in the registration of LPV, namely underreporting and incompleteness, respectively. Efforts should be made to improve the quality of information about LPV in the health system and in criminal records, to reduce underreporting and missing information. Regular sharing of information between police and health authorities and the linkage of databases would generate a better diagnosis to foster the implementation of evidence-based policies.
Our results point to different dynamics of LPV in the urban space of the CSP. High mortality clusters are found in areas with lower HDI when considering the victims' home address, and in areas with higher HDI when considering the address where the violent events occurred. LPV disproportionately affects young, black, poorly educated residents of the outskirts, informing us about patterns of social segregation and inequality.