Assessing rainfall erosivity indices through synthetic precipitation series and artificial neural networks

The rainfall parameter that expresses the capacity to promote soil erosion is called rainfall erosivity (R) and is commonly represented by the indices EI30 and KE>25. Calculating these indices requires pluviographic records, which are difficult to obtain in Brazil. This paper describes the use of synthetic rainfall series to compute EI30 and KE>25 in Espírito Santo State (Brazil). Artificial neural networks (ANNs) were also developed to spatially interpolate R values in Espírito Santo. The EI30 and KE>25 values were close to those calculated for an area that is homogeneous with respect to rainfall distribution, indicating that synthetic rainfall series are applicable for estimating the R factor. ANNs performed better than Inverse Distance Weighted and Kriging in spatially interpolating rainfall erosivity values in the State of Espírito Santo.


INTRODUCTION
Soil erosion is a widespread land degradation problem in many parts of the world. On-site and off-site costs of soil erosion reach about 44 billion dollars in the United States (Pimentel et al. 1995), 4.2 billion dollars in Brazil (Hernani et al. 2002) and 45.5 billion dollars in the European Union (Telles et al. 2011). Assessing the risk of erosion, predicting erosion rates, and designing and evaluating different soil protection strategies are essential steps in selecting best management practices for soils and watersheds. Mathematical models are used to quantify and predict soil losses. The universal soil loss equation (USLE) (Wischmeier and Smith 1978) and the revised universal soil loss equation (RUSLE) (Renard et al. 1991) have been the most widely used models for predicting soil erosion losses (Baskan et al. 2010).
Rainfall is the main climatic characteristic that influences soil erosion, given the extraordinary importance of soil detachment processes due to drop impact and runoff shear. Among USLE/RUSLE factors, the erosive capacity of rainfall is expressed as rainfall erosivity (R), commonly represented by the indices EI30 (Wischmeier and Smith 1958) or KE>25 (Hudson 1973). The main method to calculate rainfall erosivity values requires pluviographic records. This kind of information is difficult to obtain in Brazil because of the small number and inadequate spatial distribution of meteorological stations equipped to provide pluviographic data. This makes it very difficult to determine the R factor across all of Brazil. On the other hand, some empirical equations can also estimate rainfall erosivity values from geographical coordinates or pluviometric records, such as annual and monthly rainfall averages (Silva 2004, Hoyos 2005, Aquino et al. 2008, Capolongo et al. 2008, Zhang et al. 2008a).
Yu (2002) and Zhang et al. (2008b, 2010) assessed the ability of stochastic weather generators to generate daily synthetic rainfall series used to calculate the R factor. These generators have the potential to be used in Brazil to extend the rainfall erosivity index database. Many authors (Qi et al. 2000, Silva 2004, Moreira et al. 2006, Yin et al. 2007, Men et al. 2008, Shamsad et al. 2008, Angulo-Martínez et al. 2009, Silva et al. 2010a, b, Alves Sobrinho et al. 2011) used spatial interpolation techniques like "inverse distance weighted", "kriging" and "artificial neural networks" (ANNs) to create maps representing the spatial distribution of R values. Since no single interpolation method among those available for spatial interpolation of the R factor is optimal for all regions and all indices, it is very important to compare the results obtained through different methods, applied to each set of data (Goovaerts 1999, Beguería and Vicente-Serrano 2006, Angulo-Martínez et al. 2009).
The ability of ANNs to use different input parameters makes them capable of solving complex problems from many areas (Sárközy 1999, Souza et al. 2006). ANNs are cited as an alternative resource for estimating climatic variables that may replace the traditional interpolation methods (Białobrzewski 2008, Sivapragasam et al. 2010), including rainfall erosivity indices, as assessed by Moreira et al. (2006) for São Paulo State, Brazil.
In this paper we aimed at: a) calculating the rainfall erosivity indices EI30 and KE>25 using synthetic rainfall series for several locations in Espírito Santo State, Brazil; b) developing ANNs to spatially interpolate rainfall erosivity in Espírito Santo State, Brazil; and c) comparing the performance of the developed ANNs with that of other spatial interpolators.

ASSESSING RAINFALL EROSIVITY INDEXES
The stochastic weather generator ClimaBR 2.0, developed by Baena et al. (2005) and validated by Zanetti et al. (2006), was used to generate daily synthetic rainfall series for 73 pluviometric stations in Espírito Santo State (Figure 1). Oliveira et al. (2005a, b) describe the complete method used to generate synthetic rainfall series. The input data used to generate each synthetic rainfall series were the measured daily rainfall depth series in the standardized format of the Brazilian National Water Agency (http://hidroweb.ana.gov.br), each spanning 15 or more years. Each synthetic rainfall series had 100 years of daily data on rainfall depth, storm duration, peak storm intensity, time to peak and storm profile pattern.
A computational algorithm was developed to identify all the erosive precipitations in each rainfall series. Erosive precipitations were taken as all rainfall events with a depth equal to or greater than 10 mm, or with a depth lower than 10 mm but with a 15-minute depth equal to or greater than 6 mm (Wischmeier and Smith 1958, Wischmeier 1959, Cabeda 1976). Two different rainfall erosivity indices were computed (EI30 and KE>25) using two different equations to calculate the kinetic energy (KE) of erosive precipitation.
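The erosive-rain criterion above can be illustrated with a minimal Python sketch. It assumes each event is available as a list of one-minute rainfall depths in mm; the function names `max_window_depth` and `is_erosive` are hypothetical and are not taken from the algorithm developed in the study.

```python
def max_window_depth(minute_depths_mm, window=15):
    """Largest depth (mm) accumulated over any `window` consecutive minutes."""
    depths = list(minute_depths_mm)
    if not depths:
        return 0.0
    running = sum(depths[:window])  # first window (or the whole event if shorter)
    best = running
    for i in range(window, len(depths)):
        running += depths[i] - depths[i - window]  # slide the window by one minute
        best = max(best, running)
    return best


def is_erosive(minute_depths_mm):
    """Apply the criterion: depth >= 10 mm, or 15-minute depth >= 6 mm."""
    total_depth = sum(minute_depths_mm)
    return total_depth >= 10.0 or max_window_depth(minute_depths_mm, 15) >= 6.0
```

A short, intense shower (for example, 10 minutes at 0.7 mm per minute) qualifies through the 15-minute rule even though its total depth is below 10 mm.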
Before the calculation of the EI30 (Wischmeier and Smith 1958) and KE>25 (Hudson 1973) indices, it was necessary to estimate the kinetic energy (KE) of each erosive precipitation. KE values were computed individually by the equations proposed by Wischmeier and Smith (1958) (WS, Equation 1) and Wagner and Massambani (1988) (WM, Equation 2). Equations 1 and 2 were used to compute the KE of all erosive rainfall with intensity equal to or lower than 76 mm.h-1. Erosive rainfall with greater intensities was assumed to have a KE equal to 0.283 MJ.ha-1.mm-1, because raindrop diameter does not increase at rainfall intensities above this limit (Foster et al. 1981).
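As an illustration of how a logarithmic unit kinetic energy equation with the 76 mm.h-1 limit can be evaluated, the sketch below uses the commonly cited SI form of the Wischmeier and Smith relation, KE = 0.119 + 0.0873 log10(I). These coefficients are an assumption here, since they are not stated in the text above; they are consistent with the 0.283 MJ.ha-1.mm-1 limit at 76 mm.h-1. The function name and the clamping of negative values to zero at very low intensities are also assumptions. Coefficients for another equation, such as WM, can be passed through `a` and `b`.

```python
import math


def unit_kinetic_energy(intensity_mm_h, a=0.119, b=0.0873,
                        cap_intensity=76.0, cap_energy=0.283):
    """Unit kinetic energy (MJ.ha-1.mm-1) for a rainfall intensity in mm.h-1.

    Defaults are the assumed SI coefficients of the WS equation.
    """
    if intensity_mm_h <= 0.0:
        return 0.0  # no rain, no energy
    if intensity_mm_h > cap_intensity:
        return cap_energy  # constant energy above the intensity limit
    # Assumption: clamp at zero, since the log form turns negative near I = 0.
    return max(0.0, a + b * math.log10(intensity_mm_h))
```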
The EI30 parameter for each specific event was calculated as the product of the total kinetic energy (KE), computed individually by Equations 1 and 2, and the maximum 30-minute intensity, according to Wischmeier and Smith (1958). The total KE of each event was computed using a one-minute time step. Monthly values were determined as the sum of the EI30 values of the individual events (MJ.mm.ha-1.h-1), and annual values were determined in the same manner.
The KE>25 parameter for each specific event was calculated as the product of the kinetic energy (KE), computed individually by Equations 1 and 2, and rainfall depth, according to Hudson (1973). The total KE of each event was computed using a one-minute time step. Only rainfall intensities greater than 25 mm.h-1 were considered. Monthly values were determined as the sum of the KE>25 values of the individual events (MJ.ha-1), and annual values were determined in the same manner. Mean monthly and annual values were then computed from the 100 years of values.
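The event-level calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each event is a list of one-minute depths in mm, the unit energy uses the assumed SI coefficients of the WS equation (0.119 and 0.0873), and for events shorter than 30 minutes the maximum 30-minute intensity is taken as twice the total depth. All function names are hypothetical.

```python
import math


def unit_ke(intensity_mm_h):
    """Unit kinetic energy (MJ.ha-1.mm-1); assumed WS coefficients, capped at 76 mm.h-1."""
    if intensity_mm_h <= 0.0:
        return 0.0
    if intensity_mm_h > 76.0:
        return 0.283
    return max(0.0, 0.119 + 0.0873 * math.log10(intensity_mm_h))


def max_30min_intensity(minute_depths_mm):
    """Maximum 30-minute intensity (mm.h-1) from one-minute depths."""
    depths = list(minute_depths_mm)
    running = sum(depths[:30])
    best = running
    for i in range(30, len(depths)):
        running += depths[i] - depths[i - 30]
        best = max(best, running)
    return best * 2.0  # depth in 30 minutes converted to mm per hour


def event_indices(minute_depths_mm):
    """Return (EI30, KE>25) for one erosive event."""
    total_ke = 0.0   # MJ.ha-1, all one-minute steps
    ke_gt25 = 0.0    # MJ.ha-1, only steps with intensity above 25 mm.h-1
    for depth in minute_depths_mm:
        intensity = depth * 60.0            # mm per minute to mm per hour
        energy = unit_ke(intensity) * depth
        total_ke += energy
        if intensity > 25.0:
            ke_gt25 += energy
    return total_ke * max_30min_intensity(minute_depths_mm), ke_gt25
```

Monthly and annual values would then be obtained by summing the event values over each month or year of the series.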
Since two different equations were used to compute KE (Wischmeier and Smith 1958, Wagner and Massambani 1988) and two different erosivity indices were calculated (EI30 and KE>25), the final results consisted of four monthly and four annual R factor values for each pluviometric station.

RAINFALL EROSIVITY INDICES
Neural modeling was carried out with MathWorks MATLAB® software (MATLAB 2000). To develop 48 ANNs (four monthly R indices for 12 months), the R values of the pluviometric stations were randomly divided into two data subsets: a training subset (60 stations) and a test subset (13 stations).
In the present study, a four-layer ANN model was used. The ANN architecture was of the 3-n1-n2-1 type, corresponding to one input layer with three variables (input parameters), two intermediate layers with n1 and n2 neurons, and one neuron in the output layer representing the output variable. The input variables were the latitude and longitude of each station (decimal degrees) and its altitude (meters). A linear activation function was used in the output layer to obtain the rainfall erosivity value (R factor), in MJ.mm.ha-1.h-1.year-1 (EI30) or MJ.ha-1.year-1 (KE>25).
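To make the 3-n1-n2-1 architecture concrete, the following sketch builds such a network with random free parameters and runs one forward pass. It is only an illustration: the hyperbolic tangent in the intermediate layers is an assumption (the text specifies only the linear output), the function names are hypothetical, and no training is performed.

```python
import math
import random


def init_network(n1, n2, seed=0):
    """Create random weights and biases for a 3-n1-n2-1 network."""
    rng = random.Random(seed)
    sizes = [3, n1, n2, 1]  # latitude, longitude, altitude -> one R value
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [rng.uniform(-1.0, 1.0) for _ in range(fan_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate standardized inputs through the network."""
    activations = list(inputs)
    last = len(layers) - 1
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        # Assumed tanh in intermediate layers; linear activation at the output.
        activations = sums if index == last else [math.tanh(s) for s in sums]
    return activations[0]
```

For example, `forward(init_network(4, 3), [-0.2, 0.5, 0.1])` returns a single untrained output value for one station.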
Before ANN training, all input and output data sets were standardized to values between -1 and 1. This procedure is essential to guarantee better training efficiency (Maier and Dandy 2000). A feed-forward backpropagation training algorithm was used. After each iteration of the algorithm, the ANN's free parameters were refined by the Levenberg-Marquardt training rule. Different numbers of neurons in the intermediate layers were tested (n1 and n2 varying from 1 to 12 neurons), as well as different total numbers of training epochs (50, 100, 200 and 500 epochs). The total number of neurons in each ANN was limited by the number of samples (stations) in the training subset, as suggested by Hagan et al. (1996).
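The standardization step mentioned above can be illustrated with a simple min-max mapping to the interval from -1 to 1, together with its inverse for converting network outputs back to R units. This is a sketch of one common way to do it; the exact scaling routine used in the study is not stated, and the function names are hypothetical.

```python
def scale_to_unit_range(values):
    """Map values linearly to [-1, 1]; also return the (min, max) bounds used."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: a constant series is mapped to zero.
        return [0.0 for _ in values], (lo, hi)
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
    return scaled, (lo, hi)


def unscale(scaled, bounds):
    """Invert scale_to_unit_range using the stored bounds."""
    lo, hi = bounds
    return [(s + 1.0) * (hi - lo) / 2.0 + lo for s in scaled]
```

The bounds should come from the training subset and be reused for the test subset, so that both are mapped with the same transformation.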
Because the free parameters are randomly generated at the beginning of ANN training, the ANN resulting from each combination of n1, n2 and number of training epochs was trained 20 times. For each month and each of the four R factors, the ANN that presented the highest correlation coefficient (r) on the test subset was selected as the spatial interpolator.
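The selection rule described above (keep the trained network with the highest correlation coefficient on the test subset) can be sketched as below. The sketch assumes that the test-set predictions of each candidate network are already available as lists; `pearson_r` and `select_best` are hypothetical names.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0.0 or var_p == 0.0:
        return 0.0  # correlation undefined for a constant series; treated as 0
    return cov / math.sqrt(var_o * var_p)


def select_best(observed, candidate_predictions):
    """Return (index, r) of the candidate with the highest r on the test data."""
    scores = [pearson_r(observed, pred) for pred in candidate_predictions]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```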
No ANNs were developed to spatially interpolate the annual R factor, because annual values were computed as the sum of the monthly R factors interpolated by the ANNs.
The interpolators were evaluated using the cross-validation method (Robinson and Metternicht 2006). For each station, observed (Oi) and interpolated (Si) R values were used to compute the agreement index (d) (Willmott 1981):

$$ d = 1 - \frac{\sum_{i=1}^{n} (O_i - S_i)^2}{\sum_{i=1}^{n} \left( |S_i - \bar{O}| + |O_i - \bar{O}| \right)^2} $$

where Oi = observed R factor value; Si = interpolated R factor value; Ō = mean observed R factor value; and n = number of stations.

Gonçalves et al. (2006) found the same result in Rio de Janeiro, a state bordering Espírito Santo and located in an area that is homogeneous with respect to rainfall distribution (Keller Filho et al. 2005). This was also observed in other homogeneous rainfall areas in Brazil, such as the Brazilian Savanna (Marques et al. 1997).
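For the interpolator evaluation, the agreement index d of Willmott (1981) can be computed with a few lines of Python. This is a minimal sketch of the standard formulation, with observed and interpolated R values passed as plain lists; the function name is hypothetical.

```python
def agreement_index(observed, interpolated):
    """Willmott agreement index d; 1 means perfect agreement."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, interpolated))
    potential_error = sum((abs(s - mean_obs) + abs(o - mean_obs)) ** 2
                          for o, s in zip(observed, interpolated))
    if potential_error == 0.0:
        # All values equal the observed mean, so the error is also zero.
        return 1.0
    return 1.0 - squared_error / potential_error
```

In a cross-validation setting, `interpolated` would hold the value estimated for each station by the interpolator under evaluation.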

RAINFALL EROSIVITY INDEX VALUES
The EI30 index ranged from 2,123 (Santo Agostinho station) to 9,885 MJ.mm.ha-1.h-1.year-1 (Burarama (DNOS)). The KE>25 index ranged from 5.0 to 105.0 MJ.ha-1.year-1 at the same stations. The lowest R index values were observed at higher latitudes, where annual pluviometric depths are also the lowest. On the other hand, the highest values were observed at lower latitudes and higher longitudes (near the Brazilian coast).
Average EI30 and KE>25 were 5,592 MJ.mm.ha-1.h-1.year-1 and 58.0 MJ.ha-1.year-1, respectively. These values are very close to those found by Carvalho et al. (2005) and Gonçalves et al. (2006) in Rio de Janeiro, located in an area that is homogeneous with respect to rainfall distribution (Keller Filho et al. 2005).
The EI30 values on the west side of Espírito Santo (lower longitudes) were close to those found by Mello et al. (2007).
Following the criterion of Hagan et al. (1996), the maximum number of neurons in the intermediate layers of the ANNs would be 12. The low number of neurons in the developed ANNs (Table II) indicates lower complexity, a better ability to generalize and estimate (Bernardos and Vosniakos 2007), and a lower probability of "memorizing" answers (Sinha and Wang 2007).
The performance of conventional interpolators depends only on the distance between stations, which provides the weighting factors in the mathematical models used for conventional interpolation. This did not occur with the ANNs because the trained architectures (Table II) are different for each month and each situation (WS or WM equations), resulting in different performances.
In general, the conventional interpolation methods evaluated had similar performances. These interpolators have advantages and disadvantages that depend on various factors, such as the amount of data available and the regularity of its spatial distribution. If the distribution of observed data is not favorable, the results may be unsatisfactory.
Table III shows that the "d" index was higher for 44 of the 48 developed ANNs, which indicates better performance of ANNs for spatial interpolation of the R factor compared to conventional interpolators, as also seen by Moreira et al. (2006, 2009) in the States of São Paulo and Minas Gerais, respectively. According to Akkala et al. (2010), ANN interpolators work well with sparse, irregularly distributed data, as is the case for the data presented (Figure 1). To perform well, ANNs need consistent training, and the data set used must represent the nuances of the terrain to be modeled (Teegavarapu 2007, Miranda et al. 2009, Sivapragasam et al. 2010), as was the case in this study. Another important factor that led to the superiority of the ANNs was the use of altitude to interpolate the R factor (Goovaerts 1999, Moreira et al. 2006, Silva et al. 2010b). Altitude is a very important variable for explaining the behavior of precipitation, especially in regions where orography strongly influences the climate, as in Espírito Santo State (Keller Filho et al. 2005, Melo Júnior et al. 2006).
The ANNs developed to spatially interpolate the EI30 and KE>25 indices with KE computed by the WM equation always performed better than the traditional interpolators, so they are recommended for the spatial interpolation of rainfall erosivity in Espírito Santo State. Figure 2 presents the spatial distribution of the annual EI30 and KE>25 indices calculated with KE computed by the WM equation and interpolated using the developed ANNs.

CONCLUSIONS
Based on the results presented, we can conclude that: 1. The use of synthetic rainfall series is a promising alternative for estimating rainfall erosivity at locations where pluviographic data are not available; 2. There were no significant differences between the EI30 and KE>25 rainfall erosivity indices estimated using the two rainfall kinetic energy equations that were evaluated; 3. Artificial neural networks performed better than IDW and Kriging in spatially interpolating rainfall erosivity values in Espírito Santo State.