RAINFALL EROSIVITY FOR THE STATE OF RIO DE JANEIRO ESTIMATED BY ARTIFICIAL NEURAL NETWORK

Artificial Neural Networks (ANNs) are mathematical models capable of estimating non-linear response surfaces. Their advantage is the ability to represent responses that differ from those of conventional statistical models. Thus, the objective of this study was to develop and test ANNs for estimating the rainfall erosivity index (EI30) as a function of geographical location for the state of Rio de Janeiro, Brazil, and to generate a thematic visualization map. ANNs using latitude, longitude and altitude as inputs gave acceptable estimates of EI30 and allowed visualization of its spatial variability. Thus, ANNs are a potential option for estimating climatic variables, in substitution for traditional interpolation methods.


INTRODUCTION
Despite its importance for agriculture, rain is the main element in soil erosion (NYSSEN et al., 2005). Its distribution depends on static factors (latitude, distance from the ocean, orographic effects) and dynamic factors (movement of air masses), which together characterize the rainfall of a region (MONTEBELLER et al., 2007).
Soil degradation can be considered one of the most important environmental problems. Water erosion is the form of degradation that has most affected the yield potential of soils, and it is assisted and accelerated by inappropriate agricultural management practices (SILVA et al. 2005). According to MACHADO et al. (2008), accelerated erosion is considered one of the most damaging forms of land degradation. In addition to reducing fertility, and consequently crop productivity, through loss of soil, water and nutrients, this phenomenon can cause serious environmental problems, such as silting and contamination of water bodies, which raise the costs of water treatment (PRUSKI, 2009).
The Universal Soil Loss Equation (USLE) is an empirical model that estimates soil loss and identifies the factors that have the greatest effect on this phenomenon (WISCHMEIER & SMITH, 1958). Among the USLE factors, the one expressing the erosive capacity of rainfall is known as the rain erosivity index (EI30), which is the product of the total kinetic energy of the rain and its maximum intensity in 30 minutes. According to CARVALHO et al. (2010), erosivity is a numeric index that expresses the capability of rain to cause erosion in an area. This factor is considered one of the most important components in the estimation of water erosion by indirect methods, because it quantifies the effect of the impact of raindrops on the soil. MONTEBELLER et al. (2007) observed that the accurate estimation of erosivity indices in a region requires a reliable time series and a uniform distribution of pluviometers. However, due to the limited availability of pluviometer records, estimating the EI30 index is difficult for some regions in Brazil. In order to overcome this limitation, some authors propose the use of empirical equations that relate EI30 to monthly and annual precipitation (CARVALHO et al., 2005). However, even with these equations, obtaining EI30 is restricted to locations where there is a satisfactory rainfall database.
Even in localities where rainfall records are available, SILVA (2004) and GONÇALVES et al. (2006) used interpolation techniques to estimate EI30 values, based on the inverse power of distance (IPD) method. However, this method does not take into account the altitude of the area. It uses only the rainfall erosivity values from other locations, weighted in inverse proportion to the distance between the site and each of the neighboring stations. MONTEBELLER et al. (2007) evaluated the spatial variability of erosivity in the State of Rio de Janeiro using geostatistical analysis.
EI30 estimates can be improved when interpolation methods take altitude into account (GONÇALVES, 2002), which is possible by using Artificial Neural Networks (ANNs). According to SARIGUL et al. (2003), among the models that utilize artificial intelligence techniques, ANNs are the most widely used. In comparison with statistical models, which determine linear or quadratic response surfaces, the main advantage of an ANN is its ability to model nonlinear response surfaces. ZANETTI et al. (2008) state that ANNs are distributed parallel systems, composed of simple processing units that calculate certain mathematical functions. They have been successfully used to model complex relationships involving time series in various areas of knowledge. MOREIRA et al. (2006) and MOREIRA et al. (2009) used ANNs to estimate the erosivity in the states of Sao Paulo and Minas Gerais, respectively. Using estimated values of EI30 from 138 and 268 rainfall stations, respectively, the authors concluded that the developed ANNs showed satisfactory results and can be used for land-use planning, management and soil conservation in both states.
Based on the above, this study aimed to develop and test ANNs to estimate the rainfall erosivity index (EI30) from geographical features (longitude, latitude and altitude) at any location in the State of Rio de Janeiro, and to prepare a map of erosivity variability for the state.
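The inverse-distance interpolation described above can be sketched as follows. This is a minimal illustration, not the procedure used by the cited authors: the function name `idw_estimate`, the exponent `power=2.0`, the planar coordinates and the station values are all assumptions made for the example. It shows that altitude plays no role in the estimate.

```python
import math


def idw_estimate(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at target = (x, y).

    stations: list of (x, y, value) tuples. Altitude is deliberately absent,
    which is the limitation of this kind of interpolation.
    The exponent `power` is an assumed value, not taken from the paper.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(target[0] - x, target[1] - y)
        if dist == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations: x, y in km, EI30 values purely illustrative
stations = [(0.0, 0.0, 6000.0), (10.0, 0.0, 8000.0), (0.0, 10.0, 10000.0)]
estimate = idw_estimate((5.0, 0.0), stations)
print(round(estimate, 1))
```

Closer stations dominate the estimate, and two sites at the same horizontal position but different elevations would receive the same value.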

MATERIAL AND METHODS
The EI30 values used for the development and testing of the ANN were obtained from studies by GONÇALVES (2002) and MONTEBELLER (2005) for 36 locations in the State of Rio de Janeiro. The geographical position (latitude and longitude), the altitude and the EI30 values were standardized by mean and standard deviation in order to reduce the difference in scale among the variables in the training phase of the ANN. Figure 1 and Table 1 show, respectively, the geographic location and the precipitation station data used in this study. The pluviometer data series had different numbers of records, depending on the institution responsible for managing the data (7, 17, 7 and 15 years, on average, for INMET, SERLA, ANA and Light, respectively).
The training and testing of the ANN were performed in Matlab® 6.5 with the Neural Network toolbox, which uses an iterative method, known as the error back-propagation algorithm, to minimize the error between the ANN output and the observed value. This procedure minimizes the mean square error (MSE) between the ANN output and the expected output by adjusting the free parameters of the network (w's and b's). The minimization was performed by a Quasi-Newton iterative method using the Levenberg-Marquardt algorithm (ZANETTI et al., 2008).
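Standardization by mean and standard deviation can be sketched as below. The helper names, the use of the sample standard deviation and the altitude values are assumptions for illustration only; the actual means and standard deviations of the study are not reproduced here.

```python
import statistics


def standardize(values):
    """Return standardized values plus the mean and standard deviation used.

    The sample standard deviation is an assumption; the paper does not say
    whether the sample or population form was applied.
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return [(v - mean) / std for v in values], mean, std


def destandardize(z, mean, std):
    """Map a standardized value back to the original scale."""
    return z * std + mean


# Hypothetical altitudes in metres
altitudes = [10.0, 250.0, 600.0, 900.0]
z_altitudes, alt_mean, alt_std = standardize(altitudes)
print([round(z, 3) for z in z_altitudes])
```

The same mean and standard deviation must be kept so that the standardized network output can be converted back to EI30 units with `destandardize`.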

Daniel F. de Carvalho, Joseph K. Khoury Júnior, Carlos A. A. Varella et al. Eng. Agríc., Jaboticabal, v.32, n.1
The ANN architecture was of type 3-n1-n2-1, corresponding to an input vector with three variables, two intermediate layers with n1 and n2 neurons, and one neuron in the output layer. The input vector was composed of the latitude, longitude and altitude of each rainfall station. For the neuron in the output layer, a linear activation function was used to generate the rainfall erosivity (MJ mm ha⁻¹ h⁻¹ yr⁻¹) of the location represented by the input vector.
ANN training and testing were performed by the leave-one-out method, using a fixed number of iterations to minimize the error. Each ANN was tested with the observation left out of the training, so as not to bias the error. This method is a variant of cross-validation that is used when the sample has a small number of observations (SENA JÚNIOR et al., 2008). In each training run, only one observation was left out for testing, which keeps the largest possible number of observations in the training sample.
Several ANNs were trained by varying the number of epochs, the number of neurons in the intermediate layers, and the activation functions. The numbers of iterations (epochs) used in the training were 25, 50, 75, 100 and 150. Because the free parameters were randomly generated at the beginning of training and these initial values can influence the outcome, each architecture was trained ten times, and the one with the lowest MSE on the test observations was selected.
To assess the results obtained with the developed ANN, the validity indices proposed by CAMARGO & SENTELHAS (1997) and the absolute value of the mean relative error (MRE) (Equation 1) were used. The validity indices are: precision (r), indicating the degree of data dispersion (Equation 2); accuracy (d), relating to the distance of the points on the regression graph from the 1:1 line of equal values (Equation 3); and confidence (c), given by the product of the precision index and the accuracy index.
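The selection procedure (leave-one-out testing, several epoch counts, ten random restarts) can be sketched as follows. To stay short and self-contained, a one-variable linear model trained by plain gradient descent stands in for the ANN and for the Levenberg-Marquardt training; `train_linear`, `loo_mse`, `select_best`, the learning rate and the data are all hypothetical. Selecting the run by the overall leave-one-out MSE is one reading of the text, assumed here.

```python
import random


def train_linear(xs, ys, epochs, rng):
    """Stand-in for ANN training: fit y = w*x + b by gradient descent
    from a random starting point (not the method used in the paper)."""
    w, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    learning_rate = 0.05  # assumed value
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return lambda x: w * x + b


def loo_mse(xs, ys, epochs, rng):
    """Train n times, each time leaving one observation out for testing."""
    squared_errors = []
    for i in range(len(xs)):
        model = train_linear(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], epochs, rng)
        squared_errors.append((model(xs[i]) - ys[i]) ** 2)
    return sum(squared_errors) / len(squared_errors)


def select_best(xs, ys, epoch_options=(25, 50, 75, 100, 150), restarts=10, seed=0):
    """Return (test MSE, epochs) of the best of `restarts` runs per epoch count."""
    rng = random.Random(seed)
    best = None
    for epochs in epoch_options:
        for _ in range(restarts):
            mse = loo_mse(xs, ys, epochs, rng)
            if best is None or mse < best[0]:
                best = (mse, epochs)
    return best


# Hypothetical standardized data
xs = [-1.5, -0.5, 0.0, 0.5, 1.5]
ys = [2.0 * x + 1.0 for x in xs]
best_mse, best_epochs = select_best(xs, ys)
print(best_mse, best_epochs)
```

Because every test prediction comes from a model that never saw the tested observation, the resulting MSE is not biased by the training data.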
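$$\mathrm{MRE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{A_i - T_i}{T_i}\right| \tag{1}$$
where n is the number of weather stations (36); A_i is the EI30 estimated by ANN for the i-th station, MJ mm ha⁻¹ h⁻¹ yr⁻¹; and T_i is the EI30 of the i-th experimental station, MJ mm ha⁻¹ h⁻¹ yr⁻¹.
$$r = \frac{\sum_{i=1}^{n}(A_i - \bar{A})(T_i - \bar{T})}{\sqrt{\sum_{i=1}^{n}(A_i - \bar{A})^2\,\sum_{i=1}^{n}(T_i - \bar{T})^2}} \tag{2}$$
where \bar{A} is the average EI30 estimated by ANN for the 36 stations, and \bar{T} is the experimental average EI30 of the 36 stations.
$$d = 1 - \frac{\sum_{i=1}^{n}(A_i - T_i)^2}{\sum_{i=1}^{n}\left(|A_i - \bar{T}| + |T_i - \bar{T}|\right)^2} \tag{3}$$
where |A_i - \bar{T}| is the absolute deviation between the ANN estimate for the i-th station and the average of the experimental values, and |T_i - \bar{T}| is the absolute deviation between the experimental value of the i-th station and the average of the experimental values.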
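The indices can be computed as in the sketch below. It assumes that MRE is the mean of the absolute relative errors in percent, that r is the Pearson correlation coefficient, and that d is Willmott's index of agreement, which is the accuracy index adopted by CAMARGO & SENTELHAS (1997). The function name and the sample values are hypothetical.

```python
import math


def validity_indices(estimated, observed):
    """Return MRE (%), precision r, accuracy d and confidence c = r * d."""
    n = len(observed)
    pairs = list(zip(estimated, observed))
    mean_a = sum(estimated) / n
    mean_t = sum(observed) / n

    # Mean of absolute relative errors, in percent (assumed form of MRE)
    mre = 100.0 / n * sum(abs(a - t) / t for a, t in pairs)

    # Pearson correlation coefficient
    covariance = sum((a - mean_a) * (t - mean_t) for a, t in pairs)
    spread_a = sum((a - mean_a) ** 2 for a in estimated)
    spread_t = sum((t - mean_t) ** 2 for t in observed)
    r = covariance / math.sqrt(spread_a * spread_t)

    # Willmott's index of agreement
    numerator = sum((a - t) ** 2 for a, t in pairs)
    denominator = sum((abs(a - mean_t) + abs(t - mean_t)) ** 2 for a, t in pairs)
    d = 1.0 - numerator / denominator

    return {"MRE": mre, "r": r, "d": d, "c": r * d}


# Hypothetical EI30 values: estimates are uniformly 10% too high
observed = [6000.0, 8000.0, 10000.0]
estimated = [6600.0, 8800.0, 11000.0]
indices = validity_indices(estimated, observed)
print(indices)
```

In this example r equals 1 because the estimates are perfectly correlated with the observations, while d falls below 1 because the points lie off the 1:1 line, so c penalizes the systematic bias.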
A correspondence graph, together with the MRE and the validity indices, was used to compare the EI30 values obtained by ANN simulation with those calculated from pluviometers by GONÇALVES (2002) and MONTEBELLER (2005) for the 36 stations in the test phase of the ANN. Another correspondence graph was generated with 98 values of the EI30 factor: 62 estimated by MONTEBELLER (2005) with equations that correlate monthly EI30 with rainfall, and 36 obtained experimentally by the same authors. EI30 was also estimated by ANN for these locations from their geographic position (latitude, longitude and altitude) and compared with these 98 values.
To visualize the results, a thematic map of the erosivity indices of the state of Rio de Janeiro was built from the estimates of the ANN that presented the best performance. The resulting map was reclassified into EI30 classes with an interval of 2,000 MJ mm ha⁻¹ h⁻¹ yr⁻¹, in a total of seven classes, as used by MONTEBELLER (2005).
The EI30 variability map was prepared in the ArcGis program using the kriging interpolation method, as recommended by CEDDIA et al. (2009). A set of 452 points with geographic coordinates (latitude, longitude and altitude) within the limits of the State of Rio de Janeiro was obtained from imagery provided by Google Earth Release 4.2. These coordinates were used to estimate EI30 with the best ANN developed previously and to obtain a variability map with EI30 classes of 2,000 MJ mm ha⁻¹ h⁻¹ yr⁻¹, according to MONTEBELLER (2005).

RESULTS AND DISCUSSION
Table 2 shows the number of epochs, the number of neurons in the intermediate layers (Layer 1 and Layer 2), the mean relative error (MRE) of the sample for the best ANN, and the validity indices used in this work. The ANN with the best result was trained with 75 iterations (epochs) and had intermediate layers with n1 = 4 and n2 = 3 neurons. Under these conditions, the confidence index (c) was equal to 0.6, considered "poor" according to CAMARGO & SENTELHAS (1997), and the MRE was 35.4%. The activation functions used in the neurons were of the hyperbolic tangent sigmoid type (tansig) in the input and intermediate layers, and of the linear type (purelin) in the output layer.
The ANN developed to determine the EI30 values has the following parameters:
(4)
where f1 = f2 = tansig(n) = 2/(1 + exp(-2n)) - 1; f3 = purelin(n) = n; n is the value of the entry in the function; and W is the matrix of adjusted parameters (w's) of the ANN.
The values of mean and standard deviation used for standardization were, respectively:
With this defined ANN architecture and these parameters, it is possible to estimate EI30 in the State of Rio de Janeiro by implementing the network on any computing platform that operates with matrices, as well as to generate more EI30 data for use in visualization software as maps of value classes.
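A forward pass through a network with the 3-4-3-1 architecture and the tansig and purelin activations can be sketched as follows. The weights and biases are random placeholders, not the adjusted parameters of the study, and the helper names (`layer`, `predict`, `random_params`) and the input values are hypothetical. Inputs and output are on the standardized scale.

```python
import math
import random


def tansig(n):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2n)) - 1, equal to tanh(n)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0


def purelin(n):
    """Linear activation: returns its input unchanged."""
    return n


def layer(weights, biases, inputs, activation):
    """Apply activation(W . inputs + b), one matrix row per neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def predict(z_inputs, params):
    """Standardized EI30 for standardized (latitude, longitude, altitude)."""
    (w1, b1), (w2, b2), (w3, b3) = params
    hidden1 = layer(w1, b1, z_inputs, tansig)
    hidden2 = layer(w2, b2, hidden1, tansig)
    return layer(w3, b3, hidden2, purelin)[0]


def random_params(sizes, rng):
    """Placeholder weights and biases; real values come from training."""
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
        params.append((weights, biases))
    return params


rng = random.Random(0)
params = random_params([3, 4, 3, 1], rng)  # 3 inputs, 4 and 3 hidden neurons, 1 output
z_output = predict([0.2, -0.5, 1.1], params)  # hypothetical standardized inputs
print(z_output)
```

To obtain EI30 in MJ mm ha⁻¹ h⁻¹ yr⁻¹, the output would be multiplied by the standard deviation of the training EI30 values and added to their mean.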
Figure 2 shows the correspondence graph between the EI30 values estimated by the ANN (eq. (4)) and those observed from pluviometer data. Considering that MONTEBELLER (2005) presented a map with EI30 classes spanning 2,000 MJ mm ha⁻¹ h⁻¹ yr⁻¹, the use of ANN to estimate rain erosivity in the State of Rio de Janeiro is viable, because the error introduced would not be detrimental for the purposes of calculating erosivity. Take as an example a value of 8,000 MJ mm ha⁻¹ h⁻¹ yr⁻¹: when ANN is used to estimate the EI30 of that location with an MRE of 35.40%, the range of possible values is between 5,168 and 10,832 MJ mm ha⁻¹ h⁻¹ yr⁻¹ (8,000 ± 0.354 x 8,000). Thus, the ANN estimate would fall in the correct class (7,000 to 9,000) or, at most, one class above (9,000 to 11,000) or one class below (5,000 to 7,000). However, the addition of other variables that influence erosivity, besides the geographic coordinates, should be tested in order to improve the accuracy of the estimated EI30 index.
FIGURE 2. Correspondence graph between the EI30 index values (MJ mm ha⁻¹ h⁻¹ yr⁻¹) predicted by ANN (eq. (4)) and obtained from pluviograph data.
The adjusted model of the kriging semivariogram was exponential. Figure 4 shows the thematic map of the EI30 erosivity index for the State of Rio de Janeiro estimated by ANNs. The map showed the same trend of variation as that found by MONTEBELLER et al. (2007). However, the use of ANN allowed more points to be generated, which gives a better visualization of the EI30 spatial variability and a greater number of erosivity classes in the same region. This result meets the recommendation of MONTEBELLER et al. (2007) that a greater number of erosivity values be obtained, mainly in the north (micro-regions of Campos dos Goytacazes and Macaé) and in the coastal lowlands (micro-regions of Bacia de São João and Lagos).
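The class argument in the 8,000 example can be checked with a short calculation. The sketch assumes class limits at odd thousands (5,000, 7,000, 9,000, 11,000), as in the example in the text; the function names and the `origin` parameter are hypothetical.

```python
import math


def class_bounds(value, width=2000.0, origin=1000.0):
    """Lower and upper limit of the 2,000-wide class containing value.

    origin=1000.0 aligns the limits with the classes quoted in the text
    (5,000 to 7,000, 7,000 to 9,000, 9,000 to 11,000); it is an assumption.
    """
    k = math.floor((value - origin) / width)
    low = origin + k * width
    return low, low + width


def error_range(value, mre_percent):
    """Interval value +/- MRE% of value."""
    delta = value * mre_percent / 100.0
    return value - delta, value + delta


low, high = error_range(8000.0, 35.4)
classes = sorted({class_bounds(v) for v in (low, 8000.0, high)})
print(round(low), round(high), classes)
```

The interval spans three adjacent classes, so an estimate with this error lands in the correct class or in one of its two neighbors.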

FIGURE 1. Map of the state of Rio de Janeiro with the distribution of 36 pluviograph stations used.
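The vectors of input and output variables were standardized by the equation:
$$x_s = \frac{x - \bar{x}}{\sigma_x}$$
where x_s is the standardized variable; x is the real variable; \bar{x} is the average of x; and \sigma_x is the standard deviation of x.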

Figure 3 shows the correspondence graph between the 98 EI30 values estimated and observed by MONTEBELLER (2005) and GONÇALVES (2002) and the corresponding values simulated by the developed ANN for this second dataset. The confidence index (c) increased to 0.7, which is considered "good" in the classification proposed by CAMARGO & SENTELHAS (1997). The small number of EI30 observations (obtained at 36 precipitation stations) used in the earlier training and testing of the ANNs may have been the reason for the "poor" results shown in that analysis. Despite the better result with 98 EI30 values, part of these values were themselves estimated, which reduces the reliability of the validation compared to the ANN evaluated with 36 observations. Thus, estimating the EI30 index by ANN with a larger sample resulted in acceptable values when compared with the values calculated and observed by MONTEBELLER (2005) and GONÇALVES (2002).

TABLE 2. Number of neurons of intermediate layers (Layer 1 and Layer 2), root mean square (RMS), correlation coefficient (r), accuracy index (d) and reliability index (c) of the artificial neural network (ANN) available.