PROBABILITY DISTRIBUTION FUNCTIONS APPLIED IN THE WATER REQUIREMENT ESTIMATES IN IRRIGATION PROJECTS

ABSTRACT Spain contains a third of the entire irrigated area of Europe, accounting for 15% of the cultivated area of the country and almost 60% of the national agricultural production. Knowledge of the spatial and temporal variability of reference evapotranspiration (ETo) and of the probabilistic theory of extreme events is crucial for the elaboration of sustainable irrigation projects. The objective of this work was to define the frequency distribution that best describes ETo for the design of irrigation systems in the region of Andalusia. We used ETo data for the period 2001 to 2015 from 56 meteorological stations. The values were accumulated over periods of 5, 10, and 30 consecutive days. For all accumulated periods, nine probability distributions were fitted. The probability distribution that best described ETo for the design of irrigation systems in the region was the Gumbel II distribution. The maximum daily ETo to be considered in irrigation projects in this region is, on average, 10 mm. The accumulated ETo values for periods of 5, 10, and 30 days that should be considered are, on average, 42 mm, 78 mm, and 224 mm, respectively.


INTRODUCTION
The irrigated area of Spain represents approximately one third of the total irrigated area of Europe. According to the data from Esyrce (2015), in the last 15 years, the total irrigated area in the country increased by 8.7%, reaching approximately 3.6 million hectares, corresponding to about 15% of the cultivated area and to almost 60% of the final national agricultural production.
The Andalusia region has the largest irrigated area in Spain, with approximately 1 million hectares under irrigation, followed by the regions of Castilla-La Mancha and Castilla y Leon, which have approximately 512,000 and 481,000 hectares under irrigation, respectively (THENKABAIL et al., 2009; ESYRCE, 2015).
Irrigated agriculture facilitates the generation of jobs and economic profit, enhances food production, and mitigates the problem of rural exodus in Spain. However, the rational use of irrigation water is an increasingly important issue and a recurring theme in hydrological, agricultural, and environmental studies in the region (DUARTE; PINILLA; SERRANO, 2014; BLANCO-GUTIÉRREZ; VARELA-ORTEGA; PURKEY, 2013).
Reference evapotranspiration (ETo) is one of the most important variables for determining the water balance, as it refers to the flux of water, as vapor, from the soil-plant system to the atmosphere, and therefore serves as a basis for dimensioning irrigation systems. Evapotranspiration is controlled by energy availability and by the evaporative power of the air, with energy availability depending on the location and season of the year (GONG et al., 2006; CAI et al., 2009; TABARI; GRISMER; TRAJKOVIC, 2013; COSTA et al., 2015).
Therefore, ETo varies in space and time. Since extreme events are highly relevant in agrometeorology, ETo estimates are essential for planning and carrying out activities subject to their adverse effects, especially in agriculture (SANSÍGOLO, 2008; LIANG; LI; LIU, 2010; WANG; DICKINSON, 2012).
Although the fundamental probabilistic theory of extreme values was developed long ago, the statistical modelling of extremes remains an important subject, given its crucial role in water resource management, especially in the context of a changing climate (KATZ; PARLANGE; NAVEAU, 2002; KATZ, 2010).
Knowing the ETo frequency distribution model at a location facilitates adequate hydrological prediction when dimensioning irrigation and drainage systems (SILVA et al., 2015), mainly by enabling a more precise characterization and quantification of the water depth to be applied and of the capacity required of the irrigation system in uncommon situations, considering different periods of time.
In this context, the aim of the present work was to identify the frequency distribution that best describes the reference evapotranspiration (ETo) for the Andalusia region (Spain), focusing on the dimensioning of irrigation systems in this region.

MATERIAL AND METHODS
Andalusia is the second largest autonomous community of Spain, covering an area of approximately 87,600 km² between the longitudes of 07°31' W and 01°38' W and the latitudes of 38°44' N and 36°00' N. Approximately 31.42% of the territory is located at elevations between 0 and 200 m, 39.38% between 200 and 600 m, 25.94% between 600 and 1,400 m, and 3.27% above 1,400 m (IECA, 2013).
According to the Köppen classification, the predominant climate is Csa, a temperate climate with hot and dry summers. Exceptions are the province of Almeria, which presents a BWh climate (hot desert climate), and the Sierra Nevada, which has a Dsc climate (cold climate with dry and cool summers) (AEMET-IM, 2011).
Andalusia is divided into six main hydrographic basins, of which the hydrographic basin of the Guadalquivir River occupies the largest area (51,900 km²), approximately 59% of the Andalusian territory. This basin covers the province of Seville and parts of the provinces of Cordoba, Granada, and Jaen (IECA, 2013).
The dataset used in this study contained the reference evapotranspiration (ETo) values from 56 meteorological stations (Figure 1 and Table 1). The ETo values were accumulated in consecutive periods of 5 (ETo 5), 10 (ETo 10), and 30 (ETo 30) days. For each station and each accumulation period, the annual maxima of the accumulated reference evapotranspiration (ETo ac) formed a series, and each series was fitted to 9 probability distributions, as follows: (i) Log-normal, (ii) Weibull, (iii) Gamma, (iv) Cauchy, (v) Normal, (vi) Logistic, (vii) Birnbaum-Saunders, (viii) Gumbel-II, and (ix) Gumbel. To fit the distribution parameters, the Maximum Likelihood (ML) method was used, which selects, for a given probabilistic model, the parameter values under which the observed sample is most likely.
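The accumulation step described above can be sketched as follows. This is only an illustration in Python (the study itself used R), and it rests on assumptions that the text does not state: the daily record is continuous with no missing days, and each window is assigned to the calendar year of its last day. The function name `annual_max_accumulated` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def annual_max_accumulated(dates, eto, window):
    """Return {year: largest sum of `eto` over `window` consecutive days}.

    Assumes `dates` is a gap-free daily sequence aligned with `eto` (mm per day).
    Each window is attributed to the year of its last day (an assumption).
    """
    maxima = defaultdict(float)
    running = 0.0
    for i, value in enumerate(eto):
        running += value
        if i >= window:
            running -= eto[i - window]  # drop the day that left the window
        if i >= window - 1:  # a full window is available
            year = dates[i].year
            maxima[year] = max(maxima[year], running)
    return dict(maxima)


# Synthetic ten-day record, for illustration only (not data from the study).
days = [date(2001, 7, 1) + timedelta(days=i) for i in range(10)]
values = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2]
eto5_max = annual_max_accumulated(days, values, window=5)
```

Repeating the call with `window` set to 10 and 30 over the full record of a station would give the three annual-maximum series that are then fitted to the candidate distributions.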
Since the ML method was used to fit the distribution parameters, the Akaike Information Criterion (AIC) was used to choose the distribution that best fitted the data, that is, the one with the lowest AIC value. The distribution that presented the best fit to the majority of the ETo ac series was then chosen to represent the whole study region and to obtain the parameter values. Finally, the Anderson-Darling test was used to verify whether the chosen distribution adhered to the data of all series; the parameter values were then obtained and ETo ac was estimated for the return periods (T). Data import and processing, the construction of the daily and accumulated evapotranspiration series (hydroTSM package, Hydrologic Time Series Management), the fitting of the distribution models (fitdistrplus package, Parametric Distribution to Non-Censored or Censored Data), and the adherence test (ADGofTest package) were performed in the open-source statistical software R 3.1.2.
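A minimal sketch of AIC-based selection is given below. It is not the procedure run in the study, which fitted nine distributions in R; to stay self-contained it compares only two candidates whose maximum likelihood estimates have closed forms (Normal and Log-normal). The function names and the sample series are hypothetical.

```python
import math


def normal_loglik(x):
    """Log-likelihood of a Normal model evaluated at its ML estimates."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n  # ML variance divides by n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(x):
    """Log-likelihood of a Log-normal model evaluated at its ML estimates."""
    logs = [math.log(v) for v in x]
    # Log-normal density = Normal density of log(x) divided by x.
    return normal_loglik(logs) - sum(logs)


def aic(loglik, k):
    """Akaike Information Criterion for a model with k fitted parameters."""
    return 2 * k - 2 * loglik


def best_by_aic(x):
    """Return the name of the model with the lowest AIC and all AIC scores."""
    scores = {
        "normal": aic(normal_loglik(x), 2),
        "log-normal": aic(lognormal_loglik(x), 2),
    }
    return min(scores, key=scores.get), scores


# Strongly right-skewed synthetic series, for illustration only.
best, scores = best_by_aic([1, 2, 4, 8, 16, 32, 64])
```

Because both candidates here have two parameters, the comparison reduces to comparing log-likelihoods; the penalty term 2k matters when the candidate models differ in their number of parameters.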
Based on the parameters of the best probability distribution and on the daily and accumulated reference evapotranspiration (ETo d and ETo ac, respectively) for different return periods, spatial distribution maps were created by interpolation using the inverse distance weighting (IDW) method with a power of 2. Data interpolation and spatial distribution map generation were performed in the ArcGIS 10.1® software.

RESULTS AND DISCUSSION
Table 2 shows the best results for the fit of the data to the probabilistic models for each meteorological station. Nine probability distributions were accepted according to the Anderson-Darling adherence test at the 5% significance level; the log-normal distribution was the only one of them that was never the best fit for any series, and the generalized Rayleigh distribution did not fit the data. The distributions that best fitted the majority of the ETo ac data were the Gumbel II (ETo 5) and Weibull (ETo 10 and ETo 30) distributions. Overall, the Gumbel II distribution was the most adequate for the reference evapotranspiration data, being the best fit for approximately 20% of the tested series, followed by the Weibull probability distribution.
Jerszurki, Souza and Evangelista (2015) fitted probability distribution functions to series of ETo data for the city of Telêmaco Borba, state of Paraná (Brazil), and, applying the Kolmogorov-Smirnov adherence test at the 5% significance level, found that approximately 65% of the 10-day ETo values were best described by the Normal distribution.
On the other hand, Silva et al. (1998) determined, for the city of Cruz das Almas, state of Bahia (Brazil), that the main ETo distributions were the Normal, Log-Normal, and Beta distributions. The adequacy of the distributions therefore varies with the historical series and with the location.
In the present work, the distribution which presented the best fit to the majority of the daily maximum and accumulated ETo data was the Gumbel II distribution; consequently, it was selected to represent the entire Andalusian region. Thus, the shape (α) and scale (β) parameters of the Gumbel II distribution were obtained for the remaining stations (Table 3). The Gumbel II distribution was evaluated by the Anderson-Darling test at the 5% significance level, and the test statistics obtained were substantially lower than the critical values, so this distribution was considered adequate for the 56 meteorological stations. It is noteworthy that the α value did not vary greatly among the different simulations, with values in the interval between 7 and 9, with some exceptions, for the daily maximum ETo as well as for the accumulated ETo, in different return periods. This may be explained by the fact that α is a shape parameter of the random variable; thus, it does not depend on the magnitude of the evapotranspiration values. On the other hand, the β parameter presented great dispersion in the different simulations performed, with the greatest values in the ETo simulations accumulated over 30 days; in other words, in simulations with greater evapotranspiration values.
For the daily ETo simulation, the maximum and minimum β values were 24.35 (station 30) and 1.17 (station 29). For the ETo accumulated over 5 days, the maximum and minimum β values were 69.06 (station 37) and 0.66 (station 1). For the ETo accumulated over 10 days, the maximum and minimum β values were 74.21 (station 37) and 0.66 (station 56). For the simulation with ETo accumulated over 30 days, the maximum β value was 51.69 (station 37), while the minimum value was 0.66 (station 1). This wide range of β values reflects the fact that this parameter represents the scale of the variable.
The distribution maps of the shape (α) and scale (β) parameters of the Gumbel II distribution in the Andalusian region are shown in Figure 2. In the spatial distribution maps of the α parameter, there was a predominance of values in the interval between 7 and 9 in the central region of Andalusia, with extreme values of α located on the periphery of this region. In the ETo d map, values between 8.2 and 8.9 prevailed, while in the ETo 5 map, values between 5.9 and 8.2 were predominant. In the maps of ETo 10 and ETo 30, values mainly ranged between 5.8 and 7.9.
In the spatial distribution maps of the β parameter, we observed no predominance of any range of values. This greater variability in the β maps reinforces the hypothesis that this parameter represents the scale of the variable.
Table 4 shows the values of daily and accumulated reference evapotranspiration in different periods (5, 10, and 30 days), using the optimum probability distribution function (optimum PDF) and the Gumbel II probability distribution function (Gumbel II PDF) for a return period of 4 years. The optimum PDF corresponds to the distribution which best fitted the ETo ac. In general, as the accumulation period increases, the maximum ETo expressed per day (the accumulated value divided by the number of days) decreases. According to Santos et al. (2017), this behavior generally occurs when analyzing evapotranspiration frequency, depending on the accumulated period and on the probability level adopted, resulting in appreciable differences in the design of irrigation systems.
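To make the link between a fitted Gumbel II distribution and a design value explicit, the sketch below computes the ETo that is exceeded on average once every T years. It assumes the Fréchet form of the Gumbel II distribution, F(x) = exp(-(x/β)^(-α)), with α as shape and β as scale; the text does not state which parameterization was used, so the parameters in Table 3 may follow a different convention. The function names and the parameter values in the example are hypothetical and are not taken from Table 3.

```python
import math


def gumbel2_cdf(x, alpha, beta):
    """CDF of the Gumbel II distribution in its Frechet form (an assumption)."""
    return math.exp(-((x / beta) ** (-alpha)))


def return_level(T, alpha, beta):
    """Value whose non-exceedance probability is 1 - 1/T (T in years, T > 1)."""
    p = 1.0 - 1.0 / T
    return beta * (-math.log(p)) ** (-1.0 / alpha)


# Illustrative parameters only: shape 8.0 and scale 9.0 are made-up values.
x4 = return_level(4, alpha=8.0, beta=9.0)
x10 = return_level(10, alpha=8.0, beta=9.0)
```

Under this form, a return period of 4 years corresponds to a non-exceedance probability of 0.75, and longer return periods give larger design values for the same parameters.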
The daily maximum ETo for the optimum probability distribution function was 9.7 mm on average, varying from 6.9 to 13.9 mm in the different stations considered. For the Gumbel II probability distribution function, ETo d had an average value of 9.5 mm, with values varying from 7 to 14.5 mm.
The reference evapotranspiration accumulated over a period of 5 days, using the optimum probability distribution function, was 41.6 mm on average, varying from 30.9 to 61.3 mm in the different meteorological stations considered. For the Gumbel II PDF, the average ETo 5 value was 41.6 mm.
The 10-day accumulated reference evapotranspiration for the optimum PDF was 77.4 mm on average, ranging from 59.1 to 94.8 mm among all considered stations. For the Gumbel II PDF, the average ETo 10 value was 78.2 mm.
For the 30-day accumulated evapotranspiration, the optimum PDF had an average value of 225.3 mm, ranging from 167.8 to 271.7 mm in the different stations. For the Gumbel II PDF, the average ETo 30 value was 221.8 mm.
Based on these ETo values, the daily maximum ETo to be considered in irrigation projects in this region is approximately 9.7 mm d⁻¹. Likewise, the accumulated ETo values for periods of 5, 10, and 30 days that should be taken into account when dimensioning irrigation systems in the Andalusian region are, on average, 41.6, 77.4, and 225.3 mm, respectively.
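The decrease in daily demand with increasing accumulation period can be checked directly from the average values reported above for the optimum PDF. The short sketch below divides each accumulated value by its number of days; the variable names are hypothetical, and the figures are the regional averages quoted in the text.

```python
# Average ETo (mm) for the optimum PDF, keyed by accumulation period in days.
accumulated = {1: 9.7, 5: 41.6, 10: 77.4, 30: 225.3}

# Equivalent mean daily ETo (mm per day) within each accumulation period.
daily_equivalent = {days: total / days for days, total in accumulated.items()}

for days, value in daily_equivalent.items():
    print(f"{days:>2}-day period: {value:.2f} mm per day")
```

The equivalent daily values fall from 9.7 mm for a single day to roughly 8.3, 7.7, and 7.5 mm per day for the 5-, 10-, and 30-day periods, so a system sized for the one-day maximum has spare capacity over longer periods.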
The spatial distribution maps regarding the daily maximum and the accumulated reference evapotranspiration values in different periods (5, 10, and 30 days) for the optimum and Gumbel II probability distribution functions in the Andalusian region are presented in Figure 3.
In the ETo d distribution maps, we noted high spatial variability when using the optimum PDF. When using the Gumbel II PDF, the ETo d values were predominantly in the range between 8.9 and 10.2 mm.
In the ETo 5 distribution maps, spatial variability was high in both probability distribution functions (optimum and Gumbel II), with values mainly around 42 mm. In the distribution maps of ETo 10 , the spatial variability was also high in both probability distributions, with values mainly around 78 mm. The ETo 30 distribution maps presented a high variability in both probability distributions, with predominant values around 224 mm. The maps obtained demonstrate a high heterogeneity of reference evapotranspiration in the Andalusian region. This Spanish autonomous community is characterized by a highly developed agriculture, with irrigation application in crops such as wheat, rice, oats, rye, potato, and olives, among others. In this context, the results generated could be used in the dimensioning of irrigation systems considering the evapotranspiration demand, thereby improving the productive capacity of the region and supporting farmers and land owners.

CONCLUSIONS
The probabilistic model that presented the best adjustment to the reference evapotranspiration (ETo), aiming at the dimensioning of irrigation systems in the Andalusian region (Spain), was the Gumbel II model.
For the study region, the daily maximum ETo value to be considered in irrigation projects is, on average, 9.7 mm. The accumulated ETo values for periods of 5, 10, and 30 days, which should be considered in the dimensioning of irrigation systems, are, on average, 42, 77, and 225 mm, respectively.