Multivariate Analysis of Erosivity Indices and Rainfall Physical Characteristics Associated with Rainfall Patterns in Rio de Janeiro

The identification of areas with greater erosive potential is important for planning soil and water conservation. The objective of this study was to evaluate the physical characteristics of rainfall events in the state of Rio de Janeiro, Brazil, and their interactions with rainfall patterns through multivariate statistical analysis. Rainfall depth, kinetic energy, 30-min intensity (I30), duration of rainfall events, and the erosivity indices KE >10, KE >25, and EI30 in 36 locations (stations) were subjected to principal component analysis (PCA) and canonical discriminant analysis (CDA). Based on evaluation of the respective historical series of hyetographs, it was found that the advanced pattern occurs with highest frequency (51.8 %), followed by the delayed pattern (26.1 %), and by the intermediate pattern (22.1 %). All the evaluated rainfall characteristics have high response capacity in describing localities and rainfall patterns through PCA and CDA. In CDA, the Tukey test (p<0.05) applied to the scores of the first canonical discriminant function (CDF1) allowed differentiation of the stations with respect to the rainfall and erosivity characteristics for the advanced and delayed patterns. In the delayed pattern, the localities of Angra dos Reis, Campos, Eletrobrás, Manuel Duarte, Santa Isabel do Rio Preto, Tanguá, Teresópolis, Vila Mambucaba, and Xerém had the highest CDF1 scores, indicating that they have rainfalls with higher depth, I30, and duration because the standardized canonical coefficient (SCC) and the correlation coefficient (“r”) of these characteristics were positive. The rainfall events in the state of Rio de Janeiro differ from one locality to another in relation to the advanced and delayed rainfall patterns, mainly due to the physical characteristics of rainfall depth, I30, and duration, indicating a higher risk of soil loss and runoff in the localities where rainfall events with the delayed pattern prevail.


INTRODUCTION
Water erosion poses a serious threat to natural resources worldwide, with significant economic and environmental consequences (Prosdocimi et al., 2016) and evaluation of this erosion has become an important tool in soil and water conservation plans (Anache et al., 2017).Soil erosion is directly related to the physical characteristics of rainfall events (Ran et al., 2012), primarily their intensity and duration, and the diameter and final speed of raindrops, which determine their erosivity (Goebes et al., 2014).
The indices EI 30 and KE >25 have been widely used to calculate erosivity (Oliveira et al., 2013) since they have provided good correlation with soil losses in different regions (Bertol et al., 2007;Bertol et al., 2008;Silva et al., 2009).These indices are calculated through the product of kinetic energy and intensity of the rainfall (Marques et al., 1997).
Rainfall events with the same average intensity do not necessarily have the same kinetic energy (Mohamadi and Kavian, 2015), which indicates that a given precipitated volume may lead to different soil losses, even under the same conditions of land use and cover.This reality may be related to the effect of the hydrological rainfall pattern, defined in accordance with the occurrence of peak intensity in relation to its duration (Flanagan et al., 1988).Various studies have shown the influence of rainfall patterns in the erosion process, especially in runoff, soil loss, and particle distribution (Flanagan et al., 1988;Eltz et al., 2001;Oliveira et al., 2010;Wang et al., 2016).
The state of Rio de Janeiro has wide climatic variability, mainly caused by the diverse topography and consequent cloud formation.Thus, the state has orographic rainfalls and convective and frontal rainfalls (André et al., 2008;Dereczynski et al., 2009).These characteristics influence the volume, intensity, and patterns of rainfalls and, consequently, soil erosion.
Evaluations of rainfall erosivity and patterns have been made for various Brazilian localities individually (Mehl et al., 2001;Evangelista et al., 2005;Cassol et al., 2008;Machado et al., 2008;Peñalva-Bazzano et al., 2010;Oliveira et al., 2012;Eltz et al., 2013).However, there are few studies investigating the use of these rainfall physical attributes in relation to the erosive processes that aim to group various localities of a same region or state with similar characteristics.This type of approach is especially useful in identifying areas with a higher risk of soil, water, and nutrient losses.
Due to the large number of characteristics (variables) related to the rainfall events, multivariate analysis statistical techniques can be better suited to evaluate the data; they have been applied to soil science and various areas of knowledge (Maluche-Baretta et al., 2006;Dai et al., 2014).Multivariate techniques include canonical discriminant analysis (CDA) (Cruz-Castillo et al., 1994) and principal component analysis (PCA) (Pearson, 1901;Raziei et al., 2009;Santos et al., 2010).These are exploratory techniques of the data to summarize the set of original variables and facilitate interpretation of the interdependence among these variables.Applied to pluviographic data, these techniques allow localities to be distinguished and grouped in accordance with the rainfall physical characteristics that affect the erosive process.Through the PCA, it is possible to visualize the variables (characteristics) and the levels of the factors evaluated (in this case, localities and rainfall patterns) on a two-dimensional plane and, therefore, evaluate relationships between variables through components (PCA) and the potential to discriminate (CDA) locations according to the rainfall characteristics.
Canonical discriminant analysis allows identification of differences among the groups evaluated (treatments or factors) based on measurements of characteristics of the observations belonging to these groups, facilitating understanding of the relationships among the variables evaluated within these groups.The CDA estimates linear functions (called canonical functions or variables) based on the quantified original variables, separating the groups of individuals to maximize the variance between groups and minimize the variance inside them (Cruz-Castillo et al., 1994).
Utilization of the PCA and CDA multivariate techniques has not been widespread in the literature on rainfall studies (Machado et al., 2008;Jin et al., 2015), and even less so with the aim of evaluating and grouping localities according to the rainfall characteristics related to erosivity.
Considering that the different regions of Rio de Janeiro do not exhibit homogeneity of characteristics related to the potential of rainfall events to cause erosion, the aim of this study was to evaluate the physical characteristics of rainfall events, erosivity indices, and rainfall patterns in the state using multivariate statistical techniques to identify homogeneous regions.

MATERIALS AND METHODS
The present study was conducted using the rainfall records of 36 stations in the state of Rio de Janeiro that belonged to the Superintendência Estadual de Rios e Lagoas -Serla (currently part of the Instituto Estadual do Ambiente -INEA), the Instituto Nacional de Meteorologia (INMET), the Agência Nacional das Águas (ANA), and the Light Serviços de Eletricidade S.A. company (Figure 1).The period of analysis encompassed the years 1974 to 1996, with an average of 11 years of data, considering that it was not possible to establish a common period for all series studied.Regarding rainfall data, in Brazil it is still rare to have series of more than ten years (Montebeller et al., 2007), especially in working with areas of considerable geographic extension.
The daily hyetographs, with amplitude of 10 mm of rainfall depth and lowest reading range of 0.1 mm, were converted to the digital format using a digitizing table and the system for digitalization of hyetographs HidroGraph 1.02.From the digitized data, the erosivity indices (EI 30 , KE >10, and KE >25) for each station were calculated using the computer program CHUVEROS (Program created by the Professor Elemar Antonino Cassol; not published).For each rainfall event, the program provides the rainfall depth and duration, kinetic energy, and I 30 , among other characteristics.It also characterizes the rainfall events in hydrological patterns.Rainfall events were characterized in the following hydrological patterns: advanced (V), intermediate (I), and delayed (T), when the peak of highest intensity occurs in the first third, second third, and third third of the rainfall period, respectively (Horner and Jens, 1941).The rainfall characteristics and erosivity indices evaluated were rainfall depth (mm), kinetic energy -KE (J m -2 ), maximum 30-min intensity -I 30 (mm h -1 ), rainfall duration (h), KE >10 (MJ ha -1 yr -1 ), KE >25 (MJ ha -1 yr -1 ), and EI 30 (MJ mm ha -1 h -1 yr -1 ).The erosivity indices KE >25 and KE >10 were determined by the product of the kinetic energy and intensity of rainfall segments longer than 25 and 10 min h -1 , respectively.The kinetic energy determination used in CHUVEROS is the same as EI 30 , i.e., from Wischmeier and Smith (1958) modified by Foster et al. (1981), and described as KE = 0.119 + 0.0873 × log I, where I is rainfall intensity in mm h -1 (Montebeller et al., 2007).
Joint evaluation of physical characteristics and erosivity indices to obtain the groups of stations/localities with similar characteristics regarding the different rainfall patterns was made through multivariate principal component analysis (Santos et al., 2010) and canonical discriminant analysis (Cruz-Castillo et al., 1994), using the programs Canoco (ter Braak and Smilauer, 1998) and SAS (SAS, 1997), respectively.For this study, the 12 months of the year were considered as replicates for each combination between station and rainfall pattern, and they were obtained from the mean of the erosive events in each month, for all years of the historical series.
The interpretation of PCA results considered the scores of the first two principal components (Y 1 and Y 2 ) and the values of the linear correlation coefficients between the original variables and the first two components.The cluster analyses of the treatments and of the above-mentioned variables through CDA used the mean of 12 months for each rainfall pattern.
Complementarily and to facilitate graphical interpretation of the CDA, multivariate analysis of variance was made through the Wilks' Lambda test (Green and Carroll, 1978), and the Tukey test (p<0.05)was applied to the scores of the treatments.

RESULTS AND DISCUSSION
The advanced rainfall pattern (V) was the most frequent (51.8 %), followed by the delayed pattern (T) (26.1 %), and the intermediate pattern (I) (22.1 %) in the state of Rio de Janeiro under the conditions studied (Table 1).This predominance of the advanced pattern also occurs in other regions of Brazil, such as São Borja -RS (Cassol et al., 2008), Quaraí -RS (Peñalva- Bazzano et al., 2007), Lavras -MG (Evangelista et al., 2005), Seropédica -RJ and Nova Friburgo -RJ (Carvalho et al., 2005), and São Gabriel -RS (Eltz et al., 2013).In contrast, the localities Manuel Duarte, Santa Isabel do Rio Preto, and Vila Mambucaba stood out from the others with their trend of higher frequency of rainfalls with delayed pattern, which is a rare characteristic in Brazil.
From identification of the advanced pattern as predominant in the state of Rio de Janeiro, it can be inferred that soil losses are smaller than they would be if there were predominance of rainfall events with intermediate and delayed patterns (Bertol et al., 2007).The higher water content in the soil preceding the rainfall events, common in these patterns, favors disaggregation of the particles due to the impacts of the raindrops, increases surface sealing, reduces infiltration capacity, and causes greater runoff (Peñalva- Bazzano et al., 2010).Using erosion plots under natural rainfall in Seropédica -RJ, Carvalho et al. (2009) identified that 64.6, 21.3, and 14.1 % of the rainfall events were classified as advanced (V), intermediate (I), and delayed (T), respectively, generating 35.1, 6.6, and 58.3 % of the soil losses, under different management practices in an Argissolo Vermelho Amarelo (Red Yellow Ultisol).Eltz et al. (2001) also confirmed the predominance of soil losses in the delayed pattern, working with simulated rainfall in this same soil class in Santa Maria, RS.Through the PCA, 96.1 % of the information contained in the set of the seven original variables resulted from the first two principal components (Table 2), whose contributions were 83.0 and 13.1 % for the first (Y 1 ) and second (Y 2 ) axes, respectively.Higher Y 1 values were obtained for the stations Manuel Duarte (man), Santa Isabel do Rio Preto (sir), and Vila Mambucaba (vil).In complement, figure 2 shows the behavior in the first axis, in which the stations of the South Fluminense mesoregion are distant from the others, situated farther from and to the right of the origin.In the second axis, the highest values of principal components occurred for the stations Angra dos Reis (ang), Campos (cam), Capela Mayrink (cap), and Xerém (xer).
The first principal component (Y 1 ) has high positive correlation with six of the rainfall characteristics and erosivity indices studied (r >0.91) (Table 3).Hence, all characteristics, except rainfall duration, contributed to Y 1 and were considered good predictors of the effects of the treatments and, therefore, adequate for utilization in other statistical analyses.The high positive correlation with Y 2 (0.75) indicates that duration is the rainfall characteristic of highest correlation with the stations Angra, Campos, Capela Mayrink, and Xerém (Figure 2).These stations have low altitudes in common (Table 1) and are more influenced by the sea breeze, according to Dereczynski et al. (2009).
It was not possible to find the determinant effect of the rainfall and erosivity characteristics in the rainfall patterns (V, I, and T) for the stations studied, because all of them approach the origin of the axes Y 1 and Y 2 (Figure 2).However, rainfall depth was the characteristic closest to the advanced pattern.Regarding the localities and the respective patterns, the most correlated stations were Andorinhas, Fazenda Santo Amaro, Sambaetiba Most of the stations correlated with the advanced pattern are located in the Metropolitan region.However, they are closer to the border with the Central Fluminense region, which is the Serrana region on other maps.According to Montebeller et al. (2007), this region has the highest erosivity indices of the rainfall events in the state.In contrast, the stations that have delayed pattern were more associated with the South Fluminense mesoregion.
According to Brito et al. (2016), these parts of the state have total monthly rainfall events influenced by valley-mountain atmospheric circulations and also by orographic rainfall.
Through CDA, the first two functions explain 64.3 % of the total data variation, 42.2 % by the first function (CDF 1 ) and 22.1 % by the second function (CDF 2 ).The likelihood ratio test showed that CDF 1 and CDF 2 were significant at the 0.01 probability level, and the  explanation of the variation between the samples given by the other CDFs is small.The canonical correlations were 0.70 and 0.58 for CDF 1 and CDF 2 , respectively, considered as high.The results indicate high association between localities, as well as rainfall patterns with the EI 30 and KE >25 indices and the physical characteristics of rainfalls (Table 4).
The first discriminant functions (CDF 1 and CDF 2 ) differed among localities and rainfall patterns through the multivariate Wilks' Lambda test (p<0.0001)(Green and Carroll, 1978).For the characteristics I 30 , duration, and EI 30 , there were higher values (in modulus) of the standardized canonical coefficient for CDF 1 , while rainfall depth (RD), followed by duration, along with their correlation coefficients, were higher for CDF 2 (Table 4).
Based on I 30 , rainfall intensity is the physical characteristic most adequate for different localities in the state of Rio de Janeiro with respect to erosive potential, under different rainfall patterns.This index is used to calculate the EI 30 utilized in the Universal Soil Loss Equation (USLE) and is considered one of the best estimators of erosivity in countries with temperate climates (Wischmeier and Smith, 1978;Oliveira et al., 2013).
Despite subjectivity in visualization of the groups due to the large number of points, three groups of stations could be identified through the CDA in analysis of the mean scores of CDF 1 for the combination of stations versus rainfall patterns (108 observations) (Figure 3).The capV group (Capela Mayrink and advanced pattern) has lower scores; the sirT (Sta I. do Rio Preto) and ManT (Manuel Duarte) group of stations has higher values, both for the delayed pattern; and the third group, with intermediate scores, contained the other stations associated with each pattern (Figure 3).The results found in the delayed pattern were similar to those obtained by the PCA.
In the group of the stations for each rainfall pattern that was isolated, greater dispersion was observed in the advanced pattern, and it was possible to identify three groups of stations.The Capela Mayrink station had a lower score, while Santa Isabel do Rio Preto, Manuel Duarte, Vila Mambucaba, and Cordeiro had higher values (Figure 4a).In the intermediate pattern, the Itaperuna station exhibited the lowest score, while Manuel Duarte, Teresópolis, and Santa Isabel do Rio Preto showed the highest scores (Figure 4b).The other stations had intermediate values.In the delayed pattern, three groups were identified -the stations Vassouras and São Bento had the lower scores, whereas Manuel Duarte, Sta.I. do Rio Preto, Vila Mambucaba, Teresópolis, and Angra dos Reis had higher scores (Figure 4c).
To facilitate interpretation of the group presented in figures 3 and 4, the Tukey test (p<0.05)was applied to the scores (index resulting from the linear combination of all variables) of the stations/localities within each rainfall pattern.Since CDF 1 explained most of the data variation (42.2 %), only its result relative to the Tukey test (p<0.05)will be presented (Table 5).
In the advanced pattern, the Capela Mayrink station showed the lowest score, while Santa Izabel do Rio Preto (sir), Manuel Duarte, Vila Mambucaba, and Cordeiro showed the highest values (Table 5).The other stations did not differ, and some of them were similar to sir (man, vil, cor, vas, cam, laj, sce, toc, alc, pir, ipo, ang, sbe, ita, igu, and eco stations) and others similar to cap (mac, pos, fco, xer, and fsa stations) with respect to the rainfall and erosivity characteristics.For the intermediate pattern, the localities did not differ regarding the physical characteristics of rainfall events and erosivity evaluated as a single group, considering CDF 1 (Tukey <0.05).However, a lower score was observed for the Itaperuna station and higher scores for the Manuel Duarte, Teresópolis, and Santa Izabel do Rio Preto stations (Table 5).
For the delayed pattern, three groups of stations were identified: the group with the lowest values was formed by the stations Vassouras and São Bento; the group with the highest scores was composed of the stations Angra dos Reis, Campos, Eletrobrás, Manuel Duarte, Santa Isabel do Rio Preto, Tanguá, Teresópolis, Vila Mambucaba, and Xerém; the other stations showed intermediate scores (Table 5).
Considering that most characteristics are directly related to erosive potential, the more negative the mean of the scores, the greater the possibility of lower erosive potential of the rainfall events under the influence of the stations of the respective groups and rainfall patterns (Table 5).However, the stations with the highest scores, regardless of the rainfall patterns (except for Santa Isabel do Rio Preto), are related to higher values of rainfall erosivity, corroborating the study of Montebeller et al. (2007).According to these authors, the sites with the highest erosivity indices in the state occur in the Sul Fluminense region (Baía de Ilha Grande) and also in the direction from the Baixadas Litorâneas to Central Fluminense (Serrana region).This demonstrates the influence of the relief, notably the Serra do Mar mountain range, in this behavior of rainfall events.The stations with the highest scores (Table 5) coincide with most of those correlated with the delayed pattern obtained in the PCA (Figure 2), showing a certain consistency of this analysis in discriminating stations and rainfall patterns.A possible explanation for this is the procedure common to these techniques of making linear combinations of the original variables.The stations belonging to the group with the highest significant score for CDF 1 , in relation to the others in the delayed pattern, indicate that these localities (Figure 5) are similar regarding the expected high risk of soil, water, and nutrient losses due to the physical characteristics of rainfall events in them.This may occur especially in unprotected soils due to higher antecedent water content, which leads to lower infiltration capacity, lower cohesion of aggregates, and greater surface sealing and runoff.In addition, higher erosive potential is expected in this group because of the rainfall events with physical characteristics of high rainfall depth, high I 30 , and long duration, because the standardized canonical coefficient (SCC) and the correlation coefficient (r) of these characteristics were positive.Except for the Campos station, the highest scores in the delayed pattern were obtained for stations located in the regions South Fluminense, Central Fluminense, and Metropolitan, which had the highest monthly and annual mean values of rainfall (André et al., 2008;Brito et al., 2016) and the highest values of rainfall erosivity in the state (Montebeller et al., 2007;Machado et al., 2013).These results are highly influenced by the topography and higher air humidity coming from the ocean in the southeast direction (André et al., 2008).
The results obtained suggest conducting studies in erosion plots using simulated rainfall of a delayed pattern, because this would allow evaluation of the effect of different soil management systems and conservation practices on water erosion.The aim would be to control this type of soil degradation through systematization of techniques that can be extrapolated to the localities of this group.

CONCLUSIONS
Principal component analysis proved to be adequate for the data.In addition, characteristics such as rainfall depth, erosivity index EI 30 , kinetic energy (KE), maximum 30-min intensity (I 30 ), the erosivity indices KE >10 and KE >25, and duration of the rainfall events showed significant correlation with the localities and rainfall patterns studied.
Canonical discriminant analysis differentiates the stations/localities in the rainfall patterns.Through this technique, three groups of homogeneous stations are formed, regardless of the rainfall pattern, as well as three groups of similar stations for the advanced and delayed patterns for physical characteristics of rainfall events and erosivity in the state.

Figure 1 .
Figure 1.Location of the rainfall stations studied in the state of Rio de Janeiro, Brazil.

Figure 3 .
Figure 3. Standardized scores of the significant canonical discriminant functions (CDF) of rainfall and erosivity attributes for the 36 localities of the state of Rio de Janeiro, Brazil.

Figure 4 .
Figure 4. Standardized scores of the significant canonical discriminant functions (CDF) of rainfall and erosivity attributes in the advanced (a), intermediate (b), and delayed (c) patterns for the 36 localities of the state of Rio de Janeiro, Brazil.

Figure 5 .
Figure 5. Location of the stations of delayed pattern with respect to the variation in standardized canonical scores of the first canonical discriminant function (CDF 1 ).

Table 1 .
Rainfall patterns for different stations in the state of Rio de Janeiro, Brazil

Table 2 .
Values of the principal components (Y 1 and Y 2 ) for the rainfall stations, rainfall patterns, and percent of information retained by them (% variance and % accumulated variance)

Table 5 .
Means of standardized canonical scores of the canonical discriminant function (CDF 1 ) of the physical characteristic and erosivity variables for the rainfall patterns