Industrial coagglomeration: some state-level evidence for Brazil

Abstract
The paper quantifies industrial coagglomeration between pairs of sectors in the manufacturing industry in the state of Rio de Janeiro in 2010. In order to do so, it considers the coagglomeration index of Ellison et al. (2010) and tries to relate it to indicators that approximate labor pooling, proximity to customers and suppliers, and natural advantages. Some similarities with previous evidence for the U.S. were observed, as well as important contrasts. The exploratory econometric evidence seems to indicate a stronger role for the variables approximating labor pooling and input utilization intensity.

Keywords: coagglomeration, manufacturing industry, Rio de Janeiro


1_Introduction
The study of industrial agglomerations has received increasing interest in the literature, and one of the main motivations is the recognition of the positive effects that agglomeration can exert on productivity [see Henderson (1986) and Rosenthal and Strange (2004)]. An important advance in the area is the development of indicators based on probabilistic plant location models, which provide measures with micro-foundations and benefit from the increased availability of detailed employment data for industrial plants. The measures proposed by Ellison and Glaeser-EG (1997) and Maurel and Sédillot-MS (1999) have led to a number of contributions, especially for developed countries [see, e.g., Alonso-Villar et al. (2004), Devereux et al. (2004), Alecke et al. (2006, 2008) and Simpson (2007)]. Applied studies in regional science often consider aggregated indicators (for example, location quotients) and therefore contrast with the aforementioned micro-oriented agglomeration measures that have emerged in the literature since the 1990s [see Haddad (1989) for an overview of more traditional empirical approaches].
In Brazil, interest in industrial agglomerations has grown but, as a rule, one observes either more aggregated studies [see Suzigan et al. (2001), Andrade and Serra (2000), Saboia (2000) and Lage (2002)] or indicators from the literature on local clusters based on case studies [see Hasenclever and Zissimos (2006)]. There are, however, exceptions based on the EG and/or MS measures, such as the studies by Resende and Wyllie (2005) and Lautert and Araújo (2007).
The next logical step is to investigate the determinants of industrial agglomeration. Resende and Wyllie (2004) followed a formulation similar to that of Rosenthal and Strange-RS (2001) and considered factors that explain the MS agglomeration index. That paper considered an application to the manufacturing industry in the state of Rio de Janeiro, Brazil, and sought to control for spillovers between sectors and natural advantages. In contrast with the RS specification, Resende and Wyllie (2004) considered variables reflecting local incentive policies (e.g., tax incentives), though no significant effects were detected.
An aspect that is less investigated in the literature refers to how different sectors tend to show a common pattern of agglomeration and, thus, "coagglomerate". EG advanced a measure of coagglomeration for groups of sectors. In the case of pairs of sectors, Ellison et al. (2009) have shown that the measure becomes greatly simplified relative to the multisectoral indicator advanced by Ellison and Glaeser (1997); an empirical application was considered by Ellison et al. (2010) for U.S. manufacturing industries.
The discussion of the determinants of agglomeration dates back to Marshall (1920) and has highlighted costs related to the flow of goods, people and ideas that could be mitigated by the emergence of industrial agglomerations. Ellison et al.-EGK (2010) investigated common agglomeration patterns and the role of the previously mentioned explanatory factors.
The calculation of the coagglomeration indicator for pairs of industries in a developing economy like Brazil contributes to the literature by addressing a large economy characterized by the co-existence of traditional and innovative sectors. Beyond such calculation, it is relevant to assess the importance of pertinent factors such as proximity to customers and suppliers, labor pooling, intellectual or technology spillovers and natural advantages, as advanced by EGK. Therefore, while those explanatory factors closely relate to previous discussions in the literature on industrial agglomerations, the focus of the analysis is now on common agglomeration patterns (the so-called coagglomerations). EGK investigated the role of those factors in explaining coagglomeration in the U.S. manufacturing industry, using country-wide indicators. A similar analysis is considered in the present paper, but in the context of a specific region of Brazil. In particular, the focus is on the state of Rio de Janeiro, which is not the dominant industrial state in Brazil.
The main contribution of the present paper is to assess industrial coagglomerations in Brazil and to discuss the role of some economic fundamentals in explaining those patterns. Thus, in contrast with the previous related literature in Brazil, the present paper does not focus on the measurement of industrial agglomeration but rather on common patterns of agglomeration. In fact, the purpose is to start to fill the gap in the literature in terms of coagglomeration studies.
This paper is organized as follows: the second section briefly discusses conceptual issues related to coagglomeration and industrial agglomeration, and indicates strategies for empirical quantification; the third section discusses the databases used and presents the empirical results; and the fourth section presents some concluding remarks.

2.1_Conceptual Aspects
Measures of industrial agglomeration for a given industrial sector quantify the degree of spatial concentration, conditional on industrial concentration. Ellison and Glaeser-EG (1997) and Maurel and Sédillot-MS (1999) proposed somewhat similar measures of agglomeration based on probabilistic models for the location of plants. Consider a group of industries ($i = 1, \ldots, I$) and a set of possible locations ($m = 1, \ldots, M$). One can define the employment shares of industry $i$ in each locality ($s_{1i}, \ldots, s_{Mi}$) and also measures of area size, stated in terms of the average share ($x_m$) of employment of a given locality $m$ across the different industries ($x_1, \ldots, x_M$). A first approximation of the geographical concentration of industry $i$ is given by:

$$G_i = \sum_{m=1}^{M} (s_{mi} - x_m)^2 \qquad (1)$$

EG highlight the sensitivity of the measure to the size distribution of plants in the industry and to the degree of spatial aggregation. They propose an adjusted indicator that aims at controlling for those aspects and is based on a sequential model for plant location. They suggest the following measure, to be calculated for each industrial sector:

$$\gamma_{EG} = \frac{\left[\dfrac{G_i}{1 - X}\right] - H}{1 - H} \qquad (2)$$

where the term in brackets (say $G_{EG}$) stands for a normalized version of expression (1), $H$ denotes the Herfindahl index of the industry's plant size distribution, and $X = \sum_m x_m^2$, with $x_m$ indicating (in the application of this paper) the share of the $m$th location in total employment of the manufacturing industry as a whole. MS advance a similar agglomeration measure ($\gamma_{MS}$) based on a distinct locational model for plants; an essential difference pertains to the term $G_{EG}$, which is built on the squared deviations $(s_{mi} - x_m)^2$ and therefore includes an additional cross term. The aforementioned indicators attempt to control for size differences between distinct industrial sectors relative to the aggregate industry. A downside is the strong data requirement in terms of detailed microdata.
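As an illustration of the EG agglomeration measure described above, the calculation for a single industry can be sketched in Python with NumPy. This is a minimal sketch, not the routine used in the paper: the function name `eg_index` and its argument names are hypothetical, and the plant-level shares used for the Herfindahl index are assumed to be available.

```python
import numpy as np

def eg_index(s, x, plant_shares):
    """EG agglomeration index for one industry (illustrative sketch).

    s            : employment shares of the industry across localities (sums to 1)
    x            : size shares of the localities (sums to 1)
    plant_shares : shares of each plant in the industry's employment (sums to 1)
    """
    s = np.asarray(s, dtype=float)
    x = np.asarray(x, dtype=float)
    z = np.asarray(plant_shares, dtype=float)

    G = np.sum((s - x) ** 2)   # raw geographic concentration
    X = np.sum(x ** 2)         # sum of squared locality sizes
    H = np.sum(z ** 2)         # plant-level Herfindahl index

    # Normalized concentration net of the Herfindahl term
    return (G / (1.0 - X) - H) / (1.0 - H)

# Toy case: two equally sized localities, four equally sized plants
gamma_concentrated = eg_index([1.0, 0.0], [0.5, 0.5], [0.25] * 4)
gamma_dispersed = eg_index([0.5, 0.5], [0.5, 0.5], [0.25] * 4)
```

In this toy case, an industry located entirely in one of the two localities obtains a higher index than one that mirrors the aggregate distribution.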
Taking the previously mentioned concepts as a reference, EG propose a measure of coagglomeration for groups of industries:

$$\gamma^{c} = \frac{\left[\dfrac{G}{1 - \sum_m x_m^2}\right] - H - \sum_{i} \gamma_i w_i^2 (1 - H_i)}{1 - \sum_{i} w_i^2} \qquad (3)$$

where $G$ is the raw geographic concentration of the group taken as a whole, $\gamma_i$ and $H_i$ are the agglomeration index and the Herfindahl index of industry $i$, and the Herfindahl index for the group of industries, $H = \sum_i w_i^2 H_i$, is defined in terms of a weighted average of the indexes for the component industries, with $w_i$ denoting the employment share of industry $i$ in the group. The calculation of such a measure still requires detailed microdata. The links with models of plant location choice, so as to consider aspects relating to spillovers, natural advantages and random factors, were established only in EGK. The indicator given in expression (3) becomes simpler in the case of pairs of sectors, as given below:

$$\gamma_{ij}^{C} = \frac{\sum_{m=1}^{M} (s_{mi} - x_m)(s_{mj} - x_m)}{1 - \sum_{m=1}^{M} x_m^2} \qquad (4)$$

where $s_{mi}$ denotes the employment share of industry $i$ contained in locality $m$ and $x_m$ (with a slight abuse of notation) stands for the aggregate size of area $m$, in terms of the average share across sectors in a given locality. EGK emphasize that, by construction, one should expect actual mean values to be close to 0 for that measure. In fact, the size proxy is given by $x_m$ and, thus, deviations of each sector relative to this reference will be approximately uncorrelated with the mean of the deviations of all other industries. Finally, it is worth mentioning that negative values for $\gamma_{ij}^{C}$ prevail when the industries composing a given pair agglomerate in distinct localities.
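The pairwise indicator lends itself to a compact matrix computation. The sketch below, in Python with NumPy, is only an illustration under stated assumptions: the function name `coagglomeration_matrix` and the toy employment matrix are hypothetical, and the size proxy is taken as the average share across sectors, as described in the text.

```python
import numpy as np

def coagglomeration_matrix(employment):
    """Pairwise coagglomeration indices (illustrative sketch).

    employment : array of shape (M, I) with employment counts,
                 rows = localities, columns = industries.
    Returns an (I, I) symmetric matrix whose off-diagonal entry (i, j)
    is the pairwise index for industries i and j.
    """
    E = np.asarray(employment, dtype=float)
    shares = E / E.sum(axis=0, keepdims=True)   # s_mi: each column sums to 1
    x = shares.mean(axis=1)                     # x_m: average share across sectors
    dev = shares - x[:, None]                   # deviations s_mi - x_m
    return dev.T @ dev / (1.0 - np.sum(x ** 2))

# Toy data: industries 0 and 1 share locality 0, industry 2 sits in locality 1
toy = np.array([[10.0, 10.0, 0.0],
                [0.0, 0.0, 10.0]])
gamma = coagglomeration_matrix(toy)
```

Only the off-diagonal entries are pairwise indices; in the toy data the pair located in the same locality obtains a positive value and the pairs located in distinct localities obtain negative values, in line with the remark on the sign of the measure.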

2.2_Determinants of coagglomeration
The literature on the determinants of industrial agglomeration points to a set of factors that may also explain coagglomeration. Following EGK, these are considered in turn.

a) Labor pooling
The idea is that different sectors, to some extent, share a pool of workers that could potentially be used by the different sectors. This could lead to common patterns of agglomeration. For this variable, one closely follows the procedure of EGK by segmenting the employment of each sector according to types of occupation. The related shares are the basis for the calculation of correlation coefficients for pairs of sectors that approximate the degree of labor pooling (LPOOL).

b) Proximity to suppliers and customers
In this case, the intuition is that sectors with strong vertical linkages might tend to coagglomerate. To this end, one proposes indicators based on the input-output matrix. Specifically, from the resources and uses matrix and the relevant data one can obtain the product of the technical coefficient matrix by the production vector (in the usual notation, AX). In each column, one calculates shares with respect to the column total, which indicate the importance of inputs from different sectors for a given destination sector (the input variable for the column under consideration). The indicator for each pair of sectors (henceforth INPUT) is defined in terms of average values for the two sectors involved in each pair, considering both directions. On the other hand, if one focuses on the weight of each sector as a destination for the output of a given sector, it is possible to compute shares relative to the total of each row in the matrix AX and to construct output indicators for each sector. The indicator for each pair of sectors is defined as the average of the two sectors composing the pair, considering both directions (henceforth OUTPUT).
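One possible reading of the INPUT and OUTPUT construction is sketched below in Python with NumPy. Several elements are assumptions made for illustration only: AX is interpreted as the matrix of intersectoral flows obtained by scaling column j of the technical coefficient matrix by the output of sector j, the pair indicator is the simple average of the two directional shares, and the function name `io_pair_indicators` and the toy figures are hypothetical.

```python
import numpy as np

def io_pair_indicators(A, x):
    """Pairwise INPUT and OUTPUT indicators (illustrative sketch).

    A : (n, n) technical coefficient matrix
    x : (n,) sectoral production vector
    Assumes every row and column of the flow matrix has a non-zero total.
    """
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)

    flows = A * x   # flow from sector i to sector j (column j scaled by x[j])

    # Column shares: weight of supplier i in the total inputs of sector j
    in_share = flows / flows.sum(axis=0, keepdims=True)
    # Row shares: weight of customer j in the total sales of sector i
    out_share = flows / flows.sum(axis=1, keepdims=True)

    # Average over both directions for each pair (i, j)
    INPUT = (in_share + in_share.T) / 2.0
    OUTPUT = (out_share + out_share.T) / 2.0
    return INPUT, OUTPUT

# Toy two-sector example
A_toy = np.array([[0.1, 0.2],
                  [0.3, 0.4]])
x_toy = np.array([100.0, 200.0])
INPUT, OUTPUT = io_pair_indicators(A_toy, x_toy)
```

Averaging the two directions yields symmetric indicators, so each unordered pair of sectors receives a single INPUT value and a single OUTPUT value.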

c) Shared natural advantages
The existence of natural advantages may favor common industrial agglomerations that are independent of other factors. In fact, a well-known example is given by industrial sectors whose coastal location tends to be favorable, as in the case of shipbuilding and oil refining. Another example that is usually mentioned refers to the aluminum industry which, given its intensive use of electricity, could benefit from a special provision of that input. Information about the use of specific services in different sectors of industry could be provided through the share of these services in the resources and uses matrix, defined in each column (indicator denoted by NATA).

d) Technological spillovers
Innovation efforts can generate significant spillovers onto other firms, such as in the case of investment in R&D. This type of effect is one of the major motivations for the emergence of technology parks: Silicon Valley is an iconic example. In the American case, EGK make use of information from the matrix of sectoral technology flows, provided by Scherer (1984), to construct an indicator of technological spillovers for pairs of industry sectors. Unfortunately, there is no similar information in the Brazilian case, so this aspect cannot be considered explicitly in this study. Nevertheless, the case of the state of Rio de Janeiro considered in the present application becomes more appealing since, in many sectors, more leading-edge operations tend to prevail in other states.

3.1_Data sources
The main data source is the Relação Anual de Informações Sociais-RAIS (MTE), which provides, for example, employment data for formal establishments in Brazil, classified up to the 5-digit level (CNAE5). In the present paper, the focus will be on the manufacturing industry. The level of spatial aggregation will be specified in terms of the 92 municipalities in the state of Rio de Janeiro for 2010. Given a confidentiality restriction, data from the armaments industry were not available and the related parts were thus excluded; the analysis concentrates on 102 sectors at the 3-digit level. The coagglomeration indicator is defined for pairs of industries; the number of pairs reflects all possible pairwise combinations of industries. Following a procedure analogous to that of Ellison et al. (2010), factors that would be likely to explain coagglomerations are considered.

Specifically:
Labor pooling: the data were extracted from the RAIS, segmented by occupation. The indicators for coagglomeration and labor pooling are feasible at the 5-digit level; however, the need to match the different sectoral classifications of RAIS and the input-output matrix led to an analysis focusing on the 3-digit level of aggregation.

Proximity to suppliers and customers
The aforementioned input-output matrix provides the basis for the construction of the INPUT and OUTPUT indicators. The equivalence between the sectors of that matrix and the CNAE 2.0 was feasible only at the 2-digit level.
The final analysis will be undertaken at the 3-digit level, despite the repetition of certain values given the aggregation.The indicator considers the average of the values composing each pair of sectors.

Shared natural advantages
The special access to inputs in the state of Rio de Janeiro appears to be a more recent phenomenon. An example is the steel industry (CSN), with respect to electric energy. In the present study, the focus is on advantages related to transportation. In the matrix of resources and uses, one uses the share of the row referring to "transport" in the column total of matrix AX for each sector. In order to construct indicators for each pair of sectors, the average between the values of the two sectors involved is computed once more. All the indicators were calculated for the state of Rio de Janeiro except LPOOL, which was generated for the Brazilian industry as a whole. This is discussed in section 3.2.2.
Summary statistics for the different indicators are presented in Table 1. One observes a large sample variability for the different variables, and that would remain the case even if one considered a unitless measure like the coefficient of variation. Thus, even with the analysis confined to the state of Rio de Janeiro, a large degree of heterogeneity appears to prevail.
Table 2 shows pairwise correlations for the variables.

3.2.1_Coagglomeration: descriptive results
The calculation of the coagglomeration indicators was highly data-intensive and was programmed in Matlab. The coagglomeration and labor pooling indicators, at the 3-digit level, needed to be matched with the indicators obtained from the input-output matrix; this implied a reduction in the number of pairs of sectors to 3657. The table in the Appendix presents the pairs of sectors with the 15 largest and 15 smallest values for the coagglomeration index. By construction, one should observe a mean value close to 0, as reported in Table 1. Moreover, inspection of the list of pairs of sectors indicates a relatively higher order of magnitude for some pairs, as compared to cases in the U.S. and Germany, and somewhat distinct patterns of coagglomeration. For example, one notes important coagglomerations in sectors related to textile fibers; but, in contrast, there are also high values of the index in sectors involving transportation vehicles. In fact, it is possible to observe important coagglomeration of the latter sectors with sectors referring to related parts and elastomers.
Those are probably the most salient among the more highly coagglomerated pairs of sectors listed in the Appendix. It is important to note that, in the U.S., the pairs that exhibit larger coagglomeration are, as a rule, related to the mills, fibers and textile industries. This contrasts with the evidence for the state of Rio de Janeiro. However, one cannot discard the possibility that some cases reflect historical specificities. For example, when one considers a larger sample of pairs that are not matched with the input-output sector classification, the pair with the largest coagglomeration index (0.990) is associated with rail transportation and musical instruments. Thus, in that case it is not possible to rule out the possibility that peculiar historical aspects might have played a decisive role in such a pattern, and that those aspects do not reflect an economic logic for common agglomeration. Indeed, it appears that the location of economic activity in Brazil often reflected particular personal concessions rather than specific economic fundamentals such as common natural advantages or proximity to suppliers and customers. It is worth noting that common patterns of agglomeration are not observed in high technology sectors; in any case, the prevalence of more sophisticated agglomerations would be more likely in the state of São Paulo. Moreover, even in Germany, high technology agglomerations are less frequent than expected [see Alecke et al. (2006)].
At the other extreme, one can find pairs of industries that exhibit the smallest coagglomeration indexes. Examples that are not surprising include industries involving fibers, elastomers and footwear, paired with others referring to more technological goods such as vehicles and appliances. In those cases, one would not expect the aforementioned economic fundamentals to be shared by the industries in the different groups.

3.2.2_Econometric analysis
Table 3 presents the results for the regression of coagglomeration on its expected determinants, in accordance with the previously discussed categories.
In order to facilitate the interpretation, in terms of relative importance, all variables were standardized by subtracting the mean and dividing by the standard deviation.
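The standardization and a least-squares fit on the standardized variables can be sketched as follows in Python with NumPy. This is a minimal illustration rather than the code behind Table 3: the function name `standardized_ols` and the toy data are hypothetical, and no standard errors are computed.

```python
import numpy as np

def standardized_ols(y, X):
    """OLS coefficients on z-scored variables (illustrative sketch).

    y : (n,) dependent variable, e.g. the pairwise coagglomeration index
    X : (n, k) explanatory variables, e.g. LPOOL, INPUT, OUTPUT
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)

    # Subtract the mean and divide by the sample standard deviation
    zy = (y - y.mean()) / y.std(ddof=1)
    zX = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Standardized variables have zero mean, so no intercept is needed
    beta, *_ = np.linalg.lstsq(zX, zy, rcond=None)
    return beta

# Toy check with a single regressor
x_toy = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_toy = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
beta_toy = standardized_ols(y_toy, x_toy)
```

With a single regressor, the standardized coefficient coincides with the Pearson correlation between the two variables, which is why standardized coefficients are convenient for comparing the relative importance of the explanatory factors.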
The inspection of individual coefficients indicates significant roles for labor pooling (LPOOL) and proximity to suppliers and clients (INPUT and OUTPUT), but it does not indicate a significant role for natural advantages, as approximated by the importance of transportation (TRANSP). The general fit, as indicated by the coefficient of determination, is modest and smaller than the already small value obtained by EGK. A potential concern raised by those authors pertains to possible endogeneities in the explanatory factors that could reflect coagglomeration. In other words, some apparently exogenous indicators can reflect previous coagglomeration that, in some cases, is driven by more random factors. For example, co-location might drive input-output linkages or hiring patterns. Econometric treatment would suggest an instrumental variable procedure as a cautionary extension of the ordinary least squares analysis. Those authors adopt a creative strategy by using analogous indicators from the U.K. as instruments for the U.S. indicators.
In the present application, a regional perspective at the state level is adopted. All indicators were constructed for the state of Rio de Janeiro except LPOOL, which was considered for Brazil in general and highlights the potential spatial mobility of the labor force. In fact, in that case, the country-wide indicator appears to be more appropriate, so as to mitigate possible related endogeneity concerns. In other words, the concern for the endogeneity of LPOOL would be stronger if the focus had been on labor in adjacent municipalities in the state of Rio de Janeiro. In that sense, the referenced variable is constructed based on the Brazilian figure, which can render the assumption of exogeneity more tenable. In the U.S. case, EGK relied on instruments using indicators referring to a different country (U.K.).
One could, in principle, conceive indicators for different states as instruments by analogy, although that extension goes beyond the scope of the present exploratory study.
In any case, the exploratory evidence produced so far appears to indicate that the economic fundamentals do not provide an exhaustive picture of the coagglomeration phenomenon. In fact, that would not be totally surprising: industrial development in Brazil often followed a complex pattern in which the historical conditions that led to the location of industries reflected factors that went beyond the expected economic fundamentals. Those factors are likely to play a stronger role in Brazil than in many developed countries. The variables used in the present application, constructed for Rio de Janeiro, were very similar to those considered by EGK for the U.S., except for the indicator of technological flows, which is not available in the Brazilian case. In the present application, as previously mentioned, the stronger effects are associated with INPUT and LPOOL, in that order, whereas no significant effect accrues from the natural advantages indicator (VNAT), and OUTPUT is significant but with an unexpected negative coefficient that contrasts with the results for the U.S. Thus, the relative importance of the different economic fundamentals differs between the two countries.
In both cases, however, labor pooling and proximity to suppliers appear as relevant.
4_Final Comments
This paper aimed at undertaking an initial characterization of industrial coagglomeration in the state of Rio de Janeiro in 2010, taking pairs of industries as reference, in a manner similar to Ellison et al. (2010). In addition to the theoretical basis of the index, in terms of a model for plant location, the article discusses explanatory factors associated with labor pooling, proximity to suppliers and clients and natural advantages. There are some similarities in terms of coagglomeration patterns, as indicated by previous evidence for the U.S., but there are also important contrasts. The exploratory econometric evidence suggests a stronger role for the variables LPOOL and INPUT; this finding, together with the issue of possible reverse causality, should be investigated in the future in a larger study. However, given the focus on a particular state in the present study, one should exercise caution in drawing sharp policy recommendations. Nevertheless, even the measurement of industrial coagglomerations can be informative. In fact, a recurring diagnosis indicates the presence of supply bottlenecks associated with the lack of skilled technical labor. In that sense, the need to expand technical schools is often discussed. Coagglomeration studies could, to some extent, provide some guidance regarding the significance of labor pooling and, also, a more targeted design of technical schools oriented towards specific technical skills.
Possible avenues for future research include:
a) Consideration of measures of coagglomeration for other regions of Brazil and, possibly, at a finer level of aggregation (ideally at the level of microregions for the coagglomeration indicator) and for several years. Such research would involve large implementation efforts because even the current application was very data-intensive. Moreover, the availability of input-output matrices is limited in some cases and they are rarely updated. It seems that an application for the southeast region that includes the state of São Paulo would be especially timely;
b) In the Brazilian case, unlike the U.S., there is no readily available information about intersectoral technology flows to capture aspects of technological spillovers. However, it could be interesting to segment the sample used in the regression in terms of pairs with minimum levels of R&D intensity, if one can obtain special tabulations, by state, from IBGE;
c) The potential problem of reverse causality in the regression analysis can, in principle, be handled using an empirical strategy similar to that advanced by EGK. In their case, the analysis considered country-wide indicators for the U.S., and instruments were conceived in terms of analogous indicators, such as LPOOL, INPUT and OUTPUT, but for the U.K. In the present application, given the focus on regional analysis at the state level, it would be possible to devise a similar strategy using, for example, indicators related to another state such as Minas Gerais.
All these extensions, however, go beyond the scope of this initial exploratory application.
The labor pooling data were extracted from RAIS, segmented by occupations [taking as reference the Classificação Brasileira de Ocupações-CBO at the sub-group level]. A total of 187 occupations were considered, given the confidentiality associated with 5 of the 192 occupations at the sub-group level. Next, it is possible to calculate, for each pair of sectors, the correlation between employment shares across the types of occupation. That indicator approximates the utilization of employees with similar profiles. The remaining indicators are constructed from the input-output matrix for the state of Rio de Janeiro for 2002, provided by Fundação CIDE.
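The correlation between occupational employment shares can be sketched as follows in Python with NumPy. This is an illustration under stated assumptions only: the function name `labor_pooling` and the toy occupation-by-sector matrix are hypothetical, and the actual tabulations would come from RAIS.

```python
import numpy as np

def labor_pooling(employment_by_occupation):
    """Pairwise labor pooling indicator (illustrative sketch).

    employment_by_occupation : array of shape (K, I),
        rows = occupations, columns = sectors, entries = employment counts.
    Returns an (I, I) matrix of correlations between the occupational
    share profiles of each pair of sectors.
    """
    E = np.asarray(employment_by_occupation, dtype=float)
    shares = E / E.sum(axis=0, keepdims=True)   # occupation shares within each sector
    # rowvar=False treats each column (sector) as a variable
    return np.corrcoef(shares, rowvar=False)

# Toy data: sectors 0 and 1 use the same occupational mix, sector 2 differs
toy_occ = np.array([[10.0, 20.0, 0.0],
                    [0.0, 0.0, 5.0],
                    [10.0, 20.0, 5.0]])
lpool = labor_pooling(toy_occ)
```

Sectors with identical occupational profiles obtain a correlation of one regardless of their size, since only the shares matter.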