SPATIALIZATION OF THE ANNUAL MAXIMUM DAILY RAINFALL IN SOUTHEASTERN BRAZIL

Extreme rainfall can lead to heavy damage and losses, such as landslides, floods and agricultural productivity as well as the loss of human and animal lives. To mitigate these losses, water resources management policies are needed, among other goals, to study and predict the frequency of such events in a given region to minimize their harmful effects. The present study investigated the Generalized Extreme Value (GEV) probability distribution applied to the annual maximum daily precipitation data from rainfall stations in the southeastern Brazil. A total of 1,921 rainfall stations were considered, among which the stations with at least 15 years of uninterrupted observations were selected. Subsequently, the stationarity and adherence were tested. GEV probability distribution parameters were then estimated. The results enabled satisfactory spatial interpolation by ordinary kriging and the generation of maps of the distribution parameters. The semivariogram model with the best fit to the three GEV distribution parameters was the exponential model.

To mitigate these losses, it is necessary to improve public policies on urbanization and the environment considering the particularities of each region.For this, it is necessary to analyze the frequencies with which such events occur in a given region in order to characterize it analytically.The occurrence of maximum rainfall events has a strongly random behavior in time and space; therefore, a stochastic approach is indispensable to analyze it.
One strategy to study extreme rainfall events is to use a given probability distribution to model the frequency of observed rainfall data.The use of these distributions is crucial for the prevention of disasters involving extreme rainfall (Yuan et al., 2018).Several theoretical distributions are used in the literature to study maximum extreme events, including the Generalized Extreme Value probability distribution (GEV), which has aroused the interest of many researchers.Santos et al. (2016) used the GEV distribution in daily precipitation series of homogeneous regions of the Brazilian Amazon.The authors verified that the distribution satisfactorily modeled the precipitation data using Kolmogorov-Smirnov test.Pedron et al. (2017) used the GEV distribution to characterize extreme events of daily precipitation in the city of Curitiba, Brazil, and observed an increase in the frequency of extreme events, considering the variation of the distribution parameters.Fischer et al. (2018) investigated the seasonal cycle of total monthly rainfall maxima in Germany by modeling the GEV distribution and found that the model was appropriate for most of the rainfall stations used.The cumulative distribution function of the GEV distribution described by the authors was: The function is valid when 1 + κ > 0 and x is the studied variable; β is the position parameter; -∞ < β < ∞; α is the scale parameter; α > 0 and κ is the shape parameter; and/or -∞ < κ < ∞.According to Pedron et al. (2017) and Katz (2010), the position parameter represents the location of the distribution peak relative to the center; the scale parameter represents the size of the deviation relative to the position parameter; and the shape parameter models the tail decay rate of the distribution.
Engenharia Agrícola, Jaboticabal, v.39, n.1, p.97-109, jan./feb.2019 GEV distribution encompasses three types of distribution depending on the value of the κ parameter (Fischer et al., 2018;Shamir et al., 2013).When this parameter is positive (κ > 0), the distribution is called a Frechet, or Type II, distribution, and the tail decay is slower and gradual.For a negative shape parameter (κ < 0), the distribution is called Weibull, or Type III, distribution.For the parameter with limit κ → 0, the distribution is called Gumbel distribution for maxima, or Type I, distribution, and the tail decay is more pronounced.Fischer et al. (2018) present the inverse of the cumulative GEV distribution for the estimation of quantiles at the given return period as: Where, xp is the quantile associated with the return time T, and After selecting a specific distribution and estimating its respective parameters, it is possible to determine the behavior of the parameters by interpolating the values in space if the number of stations is enough for the variable to present a spatial structure.Ordinary kriging is a type of spatial estimator that is unbiased and with minimum variance (Landim & Yamamoto, 2013), which has been widely used in the study of rainfall, yielding good results (Seo et al., 2015;Adhikary et al., 2016;Borges et al., 2016).Batista et al. (2018) used ordinary kriging with semivariogram estimators to determine the performance of the mean annual total rainfall mapping for the state of Minas Gerais and observed good performance for all estimators, with the semivariance estimator NEW-1 having the best performance.Adhikary et al. (2016) compared the methods of ordinary kriging, inverse square distance and kriging with genetic algorithms to spatially interpolate the monthly and annual rainfall data for the Yarra River basin in Australia.The results indicated that the methods based on kriging clearly outperformed the inverse square distance method.Among all methods based on kriging, ordinary kriging resulted in better estimates.
The objective of the present study was to investigate the fit of the GEV to the annual maximum daily rainfall data for rainfall stations in southeastern Brazil, considering the stationarity and adherence of the data.The (α), (β) and (κ) parameters of the GEV distribution were estimated for each station using the L-moments method.Next, variograms of the parameters were fitted using spherical, Gaussian and exponential models and maps were developed for the parameters using ordinary kriging for spatial interpolation.A quantile map with a return period of 100 years was generated to identify the areas with the highest expected daily annual precipitation for the southeastern Brazil.

Spatial distribution of rainfall stations and estimation of GEV parameters
Data were obtained from the National Water Resources Information System (SNIRH) made available by the National Water Agency.After obtaining these historical series, the highest daily rainfall value of each year was selected for each station, and a new historical series of annual maximum daily rainfall was constructed for the available stations in the southeastern Brazil.The spatial distribution of the rainfall stations provided by SINRH (Figure 1) indicates the locations of the data collecting stations, and the number of rainfall stations per State and their respective areas are shown in Table 1  (a) National Water Agency (ANA, 2013), (b) Brazilian Institute of Geography and Statistics (IBGE, 2013).
The state of São Paulo has the highest number of rainfall stations (1,193 stations).Espírito Santo has the lowest number of stations, totaling 83, while Minas Gerais and Rio de Janeiro have 526 and 119 stations, respectively (Table 1).
The first criterion for selecting the stations was the observation period, which was equal to or greater than 15 uninterrupted years.The stationarity was then verified using the Spearman correlation test for each station.The Spearman correlation test is frequently used for trend checking in hydrological series (Fathian et al., 2015).Naghettini & Pinto (2007) and Shamir et al. (2013) clarify that there are different ways of estimating the GEV distribution parameters, including the moments method, the L-moments method and the maximum likelihood method.In this study, these parameters were estimated using the Lmoments method (LMM) following the recommendations of de Filho et al. (2017) and Junqueira Júnior et al. (2015), considering that the other methods can generate results unreliable for a small sample size.Since the majority of historical hydrological series in Brazil with reliability may be considered small (less than 40 years), the LMM can estimate better results.The adherence of the GEV distribution to each station was tested using the Filliben test (Filliben, 1975).After obtaining the parameters of the distribution, exploratory analysis of the data was conducted, followed by semivariogram modeling.

Exploratory analysis, semivariogram construction and spatial interpolation
To model a semivariogram, exploratory analysis was performed by evaluating frequency histograms, trend, boxplots and experimental semivariograms.
The exploratory analysis was initiated by examining histograms of the values of each GEV distribution parameter together with the asymmetry values of the data.Kerry & Oliver (2007a) recommend verifying the asymmetry value of the dataset as a standard practice for exploratory data analysis in geostatistics.If the asymmetry value is between -1 and 1, the raw data can be used, but if the value is outside this limit, the histogram must be investigated.
The existence of outliers was verified using boxplots to represent the variation in the observed data to locate the outliers of the dataset.The possible occurrence of spatial trends was verified, and the exploratory data analysis was performed on the spatial structure of the GEV distribution parameters using the exponential, spherical and Gaussian semivariogram models.The fit of the semivariograms models, the parameters (sill, range and nugget effect) were obtained using the weighted least squares method as described in Mello et al. (2005).After establishing the theoretical semivariogram model, the degree of spatial dependence (DSD) was evaluated.This index was measured according to the study of Cambardella et al. (1994): Where, C1 is the contribution, and C0 is the nugget effect of the semivariogram.
According to the author, if DSD < 25%, the degree of spatial dependence is considered weak; for DSD ≥ 25% and DSD < 75%, the spatial dependence degree is moderate; and if DSD ≥ 75%, there is a strong spatial dependence degree.After adjusting the theoretical model, the kriging method was applied to estimate non-sampled values.Goovaerts (1997) defines kriging as a family of generalized least squares regression algorithms.The ordinary kriging method (OK) uses the spatial continuity between neighboring samples to estimate values at any position within the space, with no trend and with minimal variance.The DSD and the results of the validation analyses were the criteria considered in choosing the best model.According to Carvalho et al. (2012), validation consists of the separation of a set of data from the sample that will not be part of the interpolation process.For the present study, 100 stations were considered.After the theoretical semivariograms fitting, the values of the separated stations were then estimated using each one the models.Therefore, the values of the validation data were compared with the values estimated by the model.The reduced mean error (RME) and the error standard deviation (ESD) were used for this comparison.
For calculation of the RME and ESD, Junqueira Júnior et al. (2008) used: The best semivariogram model was then selected and the thematic maps of the GEV distribution parameters were generated.The quantiles of intense rainfall were estimated and associated with the return period for the southeastern Brazil.To identify areas with higher vulnerability to extreme rainfall, a map with the annual maximum daily precipitation quantiles associated with a return period 100 years was created using the cumulative inverse function of the GEV distribution (Equation 2), based on the values of the parameters obtained for each station.After obtaining the quantiles for each station, a new spatial interpolation was performed using ordinary kriging to individualize the extreme values and obtain a thematic map of the study region.

RESULTS AND DISCUSSION
The minimum rainfall station density based on physiographic units recommended by the World Meteorological Organization (WMO) (1994) is shown in Table 2.The state of São Paulo has diverse physiographic units throughout its territory and the density of rainfall stations meets the most restrictive criteria; therefore, it can be concluded that this state meets the WMO recommendations (Table 2).The states of Rio de Janeiro and Espírito Santo meet the criteria of hilly, plains and coastal areas.The state of Minas Gerais, despite having 526 stations, does not meet the recommendations since the density of rainfall stations is determined by one station per each 1,115 km 2 .

Observation time records
The histogram in Figure 2 and Table 3 present the extension of the data of the rainfall series for the southeastern Brazil.The standard deviation is high for the region over the 18-year period, indicating dispersion in the data relative to the mean, which is evident in the histogram based on the nonuniformity of the data.On average, the regions presented 37 years of observation, with the highest mean value for the state of São Paulo and the lowest value for the state of Minas Gerais.
Considering the criterion of 15 years of uninterrupted observations, only 243 stations were discarded, with 1,678 stations remaining for the next step.Estimation of the GEV parameters for the stations Using the LMM method, the scale, position and shape parameters were estimated for the 951 stations distributed in the southeastern Brazil.The results of the estimation indicated scale values ranging from 9.76 to 27.89; for the position parameter, the range was 81.13 to 58.54, and for the shape parameter, the range was -0.114 to 0.263.

Analysis of stationarity and adherence
The Spearman correlation test was performed using a 5% significance level, which enabled the identification of the stations that did not present stationarity.According to the results presented in Table 4, a total of 362 stations were nonstationary.Nonstationary stations were discarded in the subsequent analyses.The analysis of adherence was performed next.According to the Filliben test (Filliben, 1975), the data from 365 stations could not be represented by the GEV; therefore, they were also discarded, leaving the data for 951 rainfall stations for the exploratory analysis step.

Exploratory analysis
The histograms of the GEV parameters (Figure 3) show estimated asymmetry values of 0.52, -0.13 and 0.38 for the alpha, beta and kappa parameters, respectively, which imply that there was no need for data transformation according to Kerry & Oliver (2007b).By analyzing the histogram of the shape parameter (Figure 3c), a range of values far from the data mass is observed, indicating possible existence of outliers, which was confirmed by observing the values shown in boxplots (Figure 4).Engenharia Agrícola, Jaboticabal, v.39, n.1, p.97-109, jan./feb.2019 Modeling the spatial continuity of the data The spatial continuity of the data was modeled after exploratory analysis and verification of the nonoccurrence of trends in the data, followed by ordinary kriging interpolation.The experimental and theoretical semivariograms for the exponential, Gaussian and spherical models for the three GEV parameters are shown in Figure 8.  5.The GEV parameters showed significant spatial dependence, i.e., at least one degree of spatial dependence (moderate).
The exponential model presented the highest DSD for the three studied parameters.In addition to presenting a higher spatial dependence degree, the model showed a larger range and a lower nugget effect for the three parameters.These results indicate that the exponential model is a valid choice.
The results show that the exponential model provided reduced mean error values close to zero and an error standard deviation close to 1.The spherical and Gaussian models also presented small values; however, the exponential model was chosen because it presented the best spatial structure.

GEV distribution parameters mapping in southeastern Brazil
The maps of the scale, position and shape parameters of the GEV distribution in the southeastern Brazil are shown in Figures 9, 10 and 11, respectively.Most shape parameter values (κ) were positive, resulting in a Type II GEV distribution (Fréchet).Following the interpretation of the GEV parameters proposed by Katz (2010), the shape parameter was related to the asymmetry of the observed data, and analysis of the map indicates that greater spatial variation in this parameter relative to the others was observed.
In addition, this result corroborates those found by Papalexiou & Koutsoyiannis (2013), who tested which of the three GEV distribution types was best fitted to the annual maximum daily rainfall data of 15,137 rainfall stations around the world.The authors found that the Type II distribution (Fréchet) was better fitted to the observed data.
According to values in Table 6, only 2.78% of the area of the map shows a negative shape parameter, i.e., it has a Type III (Weibull) distribution.The class corresponding to 25.89% of the map area has values close to zero, which indicates that these regions approximate the Type I (Gumbel) distribution.A 100-year return period map was created using the values of the parameters in the maps and applying the map algebra technique.This procedure was performed to identify the areas with the highest occurrences of maximum daily rainfall.The map is shown in Figure 12.FIGURE 12. Quantile associated with 100-year return time.
Three regions were identified according to the delimitation shown in Figure 12.Region I, which covers the northern Minas Gerais, has a semi-arid rainfall regime, with little water vapor in the atmosphere, convective events and frontal systems on a smaller scale.Region II has the highest maximum rainfall quantiles.These high values may be the result of several factors, both at the micro-and macroscales.A local scale factor is the orographic influence, consisting of the ascent of moist and warm air masses over natural barriers that cool and condense as they gain altitude, in addition to the milder temperature, which favors cloud formation (Mello & Viola, 2013).The main orographic elements that contribute to these effects are the Canastra and Espinhaço Mountains, located to the west and east of Minas Gerais, respectively, and the Mar and Mantiqueira Mountains, which are close to the sea, causing very intense rains in the coastal region.Another influential factor is the mesoscale convective complexes (MCC), as described by Reboita et al. (2010), which consist of lines of instability with water vapor generated in the Amazon region and spread throughout most of the South American continent in the summer.
A large-scale factor is the South Atlantic Convergence Zone (SACZ), a typical summer phenomenon in South America.Its main characteristic is the persistence of a northwest-southeast oriented cloud cover, and it plays a predominant role in the rainfall regime in the region where it occurs, causing high rainfall rates (Carvalho et al., 2012).This range of high values presented in the maps is close to the results of Carvalho et al. (2012), who studied extreme rainfall events related to the SACZ.It is important to highlight the factors associated with the SACZ and its relationship with other global climatic phenomena, such as El Niño and La Niña.According to Kodama (1992), the SACZ occurs mainly between 10° and 20° latitude, thus affecting much of the southeast region of Brazil.
Engenharia Agrícola, Jaboticabal, v.39, n.1, p.97-109, jan./feb.2019 For Region III, which encompasses Rio de Janeiro and the area west of São Paulo, the expected rainfall values are lower than those of Region II.According to Teixeira (2010), the frontal systems are very active in this region, with the formation of long rainfall events with low intensities and the possibility of convective rainfall formation.

CONCLUSIONS
The annual maximum daily rainfall data from 951 of the 1,921 analyzed stations are stationary according to the Spearman test criterion and are adequate in relation to the GEV distribution according to the Filliben test.The spatial correlation of the GEV parameters using semivariograms was determined, and the shape parameter presented the lowest spatial continuity.The associated quantile map with a 100-year return period indicates areas of high maximum daily rainfall values.These areas require greater attention in development projects that use water resources directly and indirectly.
number of validation data; z(xi) is the value observed at point i; z*(xi) is the estimated value for point i, and  is the standard deviation of kriging.RME should be 0 and ESD should be 1 to obtain ideal results.Engenharia Agrícola, Jaboticabal, v.39, n.1, p.97-109, jan./feb.2019

FIGURE 2 .
FIGURE 2. Extension of the rainfall series for the southeast region.

FIGURE 5 .FIGURE 6 .
FIGURE 4. Boxplots of the scale (a), position (b) and shape (c) parameters of the GEV distribution, with their outlier because the analyses of the boxplots enabled the detection of possible outliers, it was determined that the presence of outliers was detrimental to semivariogram modeling, as observed in Feld et al. (2016), Duggimpudi et al. (2017), St. Luce et al. (2014), Mingoti & Rosa (2008), Tobin et al. (2011) and Teixeira & Scalon (2014).These authors indicate that discrepant outliers can destabilize the computation of the parameters of the theoretical semivariogram, making the data spatialization unfeasible.It was therefore decided to remove the outliers from the dataset, leaving the values from 901 stations for the next step.
FIGURE 8. Experimental and theoretical semivariograms for the α (a), β (b) and κ (c) parameters of the GEV using the exponential, Gaussian and spherical models, respectively.The models presented a good visual fitting, indicating spatial structure.The values of the nugget effect, contribution and range for each fitted model as well as the DSD are shown in Table5.The GEV parameters showed significant spatial dependence, i.e., at least one degree of spatial dependence (moderate).The exponential model presented the highest DSD for the three studied parameters.In addition to presenting a higher spatial dependence degree, the model showed a
. FIGURE 1. Geographical distribution of rainfall stations in southeastern Brazil.

TABLE 2 .
Densities recommended by the WMO.

TABLE 3 .
Descriptive statistics of the data extension of the rainfall series for the southeastern Brazil.

TABLE 4 .
Result of the evaluation of the stationarity of rainfall stations by the Spearman test.