INTENSITY-DURATION-FREQUENCY RELATIONSHIPS: STOCHASTIC MODELING AND DISAGGREGATION OF DAILY RAINFALL IN THE LAGOA MIRIM WATERSHED, RIO GRANDE DO SUL, BRAZIL

This study investigated the information gain in rainfall intensity-duration-frequency (IDF) relationships estimated from N+M years of data from seven rain gauge stations located in the Lagoa Mirim Watershed (South Atlantic basin). From N years of observed daily rainfall, the transition probabilities of a time-homogeneous two-state Markov chain were estimated to simulate rainfall occurrence, and a gamma distribution was fitted to simulate rainfall amounts; the resulting daily rainfall series comprised N+M years, with M being the generated years. The annual maximum daily rainfall series were fitted to the Gumbel distribution and disaggregated into durations of 10, 20, 30, 40, 50, 60, 120, 360, 720 and 1440 min. Daily rainfall disaggregation was validated against IDF relationships taken from pluviograph records, for series of N years and of N+M years, using the t test and the relative mean squared error. We infer that IDF relationships based on N years of observed data plus M years of data generated by stochastic modeling provided an information gain over those obtained from the N-year series alone.


INTRODUCTION
The Lagoa Mirim Watershed has a prominent role in water resources management, since it is a cross-border basin between Brazil and Uruguay. Despite its importance, streamflow information is scarce, which complicates the design of hydraulic structures such as spillways and drainage canals, as well as those related to soil and water conservation, among others (TEIXEIRA et al., 2011). To get around this situation, hydrological models have been applied for rainfall-runoff transformation (FIORIO et al., 2012; LALOZAEE et al., 2013). Among them, the Soil Conservation Service model (SCS, 1972) makes use of intensity-duration-frequency (IDF) relationships to estimate hyetographs and unit hydrographs for routing surface runoff.
Long rainfall series are available for the Lagoa Mirim watershed, but they contain missing records, which hinders estimation of the IDF relationships. GARCIA et al. (2011), aiming to obtain intense rainfall equations for the municipalities of Cáceres, Cuiabá and Rondonópolis, in Brazil, used pluviograph analysis, 24-h rainfall disaggregation, and Bell's method (1969). These authors also used a discontinuous series and concluded that the disaggregation method was more sensitive to series size than Bell's method. Moreover, for different durations and return periods, the disaggregation method also performed better. DAMÉ et al. (2014) estimated the IDF relationships using the daily rainfall disaggregation method for stations located in the southern half of Rio Grande do Sul State, in Brazil, and compared them with those obtained from homogeneous pluviograph records. The authors concluded that the maximum rainfall intensities obtained by disaggregation are similar to those obtained from IDF equations and, therefore, that this method is a feasible alternative for obtaining the IDF relationships.
As it is an extreme phenomenon, intense rainfall ought to be characterized by continuous and representative data series. Thus, the longer the series used to fit the IDF equation, the greater the chance of capturing extreme events. Although the city of Pelotas already had an established IDF relationship covering the period between 1961 and 1991 (30 years), TEIXEIRA et al. (2011) evaluated a longer data series, from 1921 to 2009 (79 years), which captured the extreme events of 1959 (208.4 mm) and 2004 (292.6 mm). In order to fill missing values of daily rainfall, and therefore increase series size, Markov and probabilistic models can be used to generate synthetic series. The Markov process considers at least two states, rainfall presence or absence, and the rainfall amount is quantified by a parametric distribution such as the Gamma or Exponential (STERN & COE, 1984). Combining observed data with data generated by this stochastic process can improve knowledge of rainfall behavior over time.
Aiming to simulate daily rainfall over large areas, with a focus on risk analysis studies, SERINALDI & KILSBY (2014) reported several time series models, such as the Markov chain, as well as generalized and parametric linear models. Regarding Markov modeling, the authors emphasized the importance of parameter estimation for an improved representation of transition probabilities, regardless of the order used for the alternation of precipitation states. ARAÚJO et al. (2012) applied the same model to a 47-year daily rainfall series from 75 stations in the states of Bahia and Sergipe, Brazil. They found that the rainy days of each season in this region were well represented by the Markovian likelihood function, which is a seasonal average of the unconditional probability of rainy days.
This study aimed to check information gains when estimating rainfall intensity-duration-frequency relationships through the Markov chain and Gamma distribution, by comparing them with a daily rainfall disaggregation method. For that purpose, we used data gathered during N+M years, in which a two-state Markov chain is fitted to N continuous years of daily data and used to fill M years of missing data for a few cities within the Lagoa Mirim Watershed.

MATERIAL AND METHODS
Daily rainfall data were taken from seven rain gauge stations located inside the Lagoa Mirim Watershed (88), which is part of the South Atlantic basin (8). The study area is located between parallels 31° 30' and 34° 30' South latitude and 52° 00' and 56° 00' West longitude, with an area of around 62,250 km², of which 29,250 km² (47%) are in Brazil and 33,000 km² (53%) in Uruguay. Table 1 shows information on the rain gauge stations provided by the Agência Nacional de Águas - ANA (National Water Agency), comprising code, name and location, as well as geographic coordinates, altitude and data gathering period.
Rainfall occurrence in the watershed was modelled by a two-state Markov chain, in which the probability of a day being dry or rainy depends solely on the condition of the previous day, with days classified as dry (0) or rainy (1) (STERN & COE, 1984). Days with rainfall below 1 mm were regarded as dry (MINUZZI & LOPEZ, 2014). Transition probabilities between dry and rainy conditions, i.e. P(0,0), P(0,1), P(1,0) and P(1,1) (BAÚ et al., 2013), were determined for the annual series, without considering monthly stationarity, since this study aimed to fill daily gaps and subsequently build an annual maximum daily rainfall series. After estimating the transition probabilities, 100 sequences of dry/rainy days were generated for the entire period in which gaps occurred. For these sequences, precipitation amounts on rainy days were estimated using a two-parameter Gamma distribution (DETZEL & MINE, 2011), whose parameters were estimated by the method of moments.
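The occurrence and amount models described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names are hypothetical, and fitting the Gamma distribution to the raw rainy-day amounts (rather than to amounts above the 1 mm threshold) is an assumption.

```python
import random
from statistics import mean, pvariance

WET_THRESHOLD_MM = 1.0  # days below 1 mm are treated as dry, as in the text


def transition_probabilities(rain_mm):
    """Estimate P(i, j) from consecutive day pairs (0 = dry, 1 = rainy)."""
    states = [1 if r >= WET_THRESHOLD_MM else 0 for r in rain_mm]
    counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    for prev, cur in zip(states, states[1:]):
        counts[(prev, cur)] += 1
    probs = {}
    for i in (0, 1):
        row_total = counts[(i, 0)] + counts[(i, 1)]
        for j in (0, 1):
            probs[(i, j)] = counts[(i, j)] / row_total if row_total else 0.0
    return probs


def gamma_moments(wet_amounts):
    """Method of moments: shape = mean^2 / variance, scale = variance / mean."""
    m = mean(wet_amounts)
    v = pvariance(wet_amounts)
    return m * m / v, v / m


def simulate_daily_rainfall(n_days, probs, shape, scale, rng, first_state=0):
    """Generate one synthetic sequence of daily rainfall depths (mm)."""
    state = first_state
    series = []
    for _ in range(n_days):
        # Draw today's state conditional on yesterday's state.
        state = 1 if rng.random() < probs[(state, 1)] else 0
        # Rainy days receive a Gamma-distributed depth; dry days receive zero.
        series.append(rng.gammavariate(shape, scale) if state else 0.0)
    return series
```

Repeating `simulate_daily_rainfall` 100 times with different seeds would mirror the 100 generated sequences mentioned in the text; how those sequences are combined to fill a gap is not specified here.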
Once the missing data were filled, continuous series of annual maximum daily rainfall were established for each of the seven stations. The Gumbel theoretical probability distribution was then used to estimate maximum daily rainfall for return periods of 5, 10, 20, 50 and 100 years (QUADROS et al., 2011). These annual maximum daily rainfall values were disaggregated into durations of 10, 20, 30, 40, 50, 60, 120, 360, 720 and 1440 min by the method of relations (TEIXEIRA et al., 2011).
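As an illustration of the Gumbel step, the sketch below estimates the location and scale parameters and computes the rainfall depth for a given return period. The text does not state how the Gumbel parameters were estimated, so the method of moments used here is an assumption, and the function names are hypothetical. The disaggregation coefficients are not reproduced because they are not given in the text.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates of Gumbel location and scale (assumed method)."""
    scale = stdev(annual_maxima) * math.sqrt(6.0) / math.pi
    location = mean(annual_maxima) - EULER_GAMMA * scale
    return location, scale


def gumbel_quantile(location, scale, return_period_years):
    """Depth with annual non-exceedance probability 1 - 1/Tr."""
    p_non_exceedance = 1.0 - 1.0 / return_period_years
    return location - scale * math.log(-math.log(p_non_exceedance))


def gumbel_table(annual_maxima, return_periods=(5, 10, 20, 50, 100)):
    """Maximum daily rainfall (mm) for each return period used in the study."""
    location, scale = fit_gumbel_moments(annual_maxima)
    return {tr: gumbel_quantile(location, scale, tr) for tr in return_periods}
```

Each value returned by `gumbel_table` would then be multiplied by the duration ratios of the method of relations to obtain depths for the sub-daily durations.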
Eng. Agríc., Jaboticabal, v.36, n.3, p.492-502, maio./jun. 2016
From the disaggregated data, converted into average maximum rainfall intensities, the IDF relationship parameters were fitted (MICHELE et al., 2011) by minimizing the objective function $(f_{obs} - f_{mod})^2$ with the Excel Solver tool, using the non-linear optimization code known as Generalized Reduced Gradient.
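The fitting objective can be illustrated with a short sketch. The functional form i = a·Tr^b / (t + c)^d is an assumption, since the text does not state the IDF equation explicitly; it is a form commonly used for intense rainfall equations. The parameter names a, b, c, d and the function names are hypothetical, and the optimizer itself (Generalized Reduced Gradient in Excel Solver) is not reproduced here.

```python
def depth_to_intensity(depth_mm, duration_min):
    """Convert a rainfall depth (mm) over a duration (min) to mm per hour."""
    return depth_mm / (duration_min / 60.0)


def idf_intensity(a, b, c, d, return_period_years, duration_min):
    """Assumed IDF form: i = a * Tr**b / (t + c)**d, with i in mm/h."""
    return a * return_period_years ** b / (duration_min + c) ** d


def sum_squared_error(params, observations):
    """Objective to minimize: sum of (f_obs - f_mod)^2 over all observations.

    observations is a list of (return_period_years, duration_min, intensity_obs).
    """
    a, b, c, d = params
    return sum(
        (i_obs - idf_intensity(a, b, c, d, tr, t)) ** 2
        for tr, t, i_obs in observations
    )
```

Any non-linear minimizer could be applied to `sum_squared_error`; the fitted parameters may depend on the starting point, so several initial guesses are advisable.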
The results were validated as follows:
a) N-year IDF relationships of daily rainfall disaggregated into hourly and sub-hourly durations were compared with those of PRUSKI et al. (2006), considering their proximity (LUDWIG et al., 2013) (Table 3). The null hypothesis stated that the maximum values obtained do not differ significantly at the 5% level. We therefore used Student's t test with n-k degrees of freedom, in which n is the sample size and k is the number of explanatory variables, for the linear (β0) and angular (β1) coefficients. The hypothesis is accepted when the calculated t statistic is less than its critical value. Moreover, the standard error was estimated for the maximum values within the established return periods and durations;
b) N-year IDF relationships of daily rainfall disaggregated into hourly and sub-hourly durations were compared with those obtained from N+M years. Likewise, the t test and the standard error of the estimate were used;
c) The method was considered valid when standard errors were low in the comparison between maximum rainfall obtained from the N+M-year series and from the N-year series.
It is worth noting that item a validated only the daily rainfall disaggregation method for estimating IDF relationships, whereas item b also validated the use of the Markov chain and Gamma distribution.

[Table 4: columns Station; Transition probabilities P(0,0), P(0,1), P(1,0), P(1,1); Gamma distribution parameters]
Larger rainfall volumes occurred at higher altitudes. BACK et al. (2012) determined relationships among rainfalls of different durations in Santa Catarina state (Brazil) and noted that relief contributed to their distribution across the state, with rainfall being most abundant near mountain slopes, since the rise of warm, moist air favors large rainfall volumes. The shape (α) and scale (β) parameters of the Gamma distribution used to estimate daily rainfall depth are shown in Table 4.
Descriptive statistics (mean, mm; standard deviation, mm; coefficient of variation, %; and maximum, mm) of the N+M-year series are displayed in Table 5. In comparison with the statistics presented in Table 2, which refer to N years, there was in general no change in the descriptive statistics of the composed N+M-year series. One of the assumptions of stochastic modeling is that the statistical characteristics of the historical series are preserved in the generated series (BACK et al., 2011); thus, our results met this assumption. However, for the stations 3152003 (Canguçu) and 3252003 (Estação do Curtume), the highest values of maximum daily rainfall increased from 123.30 mm (N years) to 177.30 mm (N+M years) and from 96 mm (N years) to 200.20 mm (N+M years), respectively. With these changes, the maxima of the N+M-year series at these stations became closer to those of the other stations. Furthermore, these values seem to represent local intense rainfall behavior more accurately than the N-year series, which is required when estimating IDF relationships (TEIXEIRA et al., 2011).
The maximum daily rainfall series, composed of N years of observed data and of N observed years plus M synthetic years, were fitted to the Gumbel distribution. Its scale (λ) and location (γ) parameters, as well as the results of the Kolmogorov-Smirnov (KS) adherence test (ARAGON et al., 2013) at the 5% significance level, are shown in Table 6. The KS statistics were lower than the critical values, which means that the Gumbel fit was appropriate.
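The adherence check described above can be sketched as follows, using only the standard library. The function names are hypothetical, and the critical value uses the large-sample approximation 1.36/√n for the 5% level, which is an assumption; tabulated values for the actual sample size may differ, and critical values are conservative when the parameters are estimated from the same sample.

```python
import math


def gumbel_cdf(x, location, scale):
    """Gumbel (maximum) cumulative distribution function."""
    return math.exp(-math.exp(-(x - location) / scale))


def ks_statistic(sample, location, scale):
    """Largest gap between the empirical CDF and the fitted Gumbel CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = gumbel_cdf(x, location, scale)
        # Compare the fitted CDF with the empirical step just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_critical_5pct(n):
    """Large-sample approximation of the 5% critical value (assumed)."""
    return 1.36 / math.sqrt(n)


def gumbel_fit_is_adequate(sample, location, scale):
    """The fit is not rejected when the KS statistic is below the critical value."""
    return ks_statistic(sample, location, scale) < ks_critical_5pct(len(sample))
```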
A similar finding was reported by TEIXEIRA et al. (2011), who fitted maximum rainfall intensity values for Pelotas (RS), Brazil; they concluded that the Gumbel distribution was adequate, since the KS test showed adherence between observed and fitted values. Thus, there is a consensus on its use for fitting extreme data.
Regarding the Gumbel distribution parameters, λ values varied little among stations in the two series. In contrast, γ values for the stations 3152004, 3152010 and 3252003 differ from those of the other stations. This result may stem from the small number of years used to fit the N-year series, which may have been reflected in the parameter values of the N+M-year series. QUADROS et al. (2011) fitted maximum rainfall series for Cascavel (PR), in Brazil, using the GEV and Gumbel probability distributions, considering a period of 22 years. These authors established that long-term data series are required for frequencies to be treated as probabilities, since probabilistic laws are synthetic and are intended to describe general characteristics of facts. Table 7 lists the rainfall values from the Gumbel distribution for return periods of 5, 10, 20, 50 and 100 years in both series. For the 10-year return period, which is the recommendation for hydro-agricultural planning (MOSQUE et al., 2009), the lowest values were 76.03 mm (N+M years) at station 3252003 and 113.15 mm (N years) at the Morro Redondo station (3152010). At station 3252008, the highest values were 137.92 mm (N+M years) and 143.55 mm (N years). Overall, for a 5-year return period, the values estimated from N+M years for the stations 3152004 (110.77/80.56 mm), 3152010 (105.81/76.38 mm) and 3252003 (111.00/55.53 mm) overestimated rainfall depths by 27, 27 and 50%, respectively. For larger return periods, however, these differences tend to decrease.
TABLE 7. Relations between maximum daily rainfall (mm) from the Gumbel distribution and return periods (Tr) of 5, 10, 20, 50 and 100 years for the seven stations in the Lagoa Mirim Watershed.
Table 8 displays the rainfall intensity-duration-frequency equations for N and N+M years. The t test values for the angular coefficient, obtained by comparing maximum intensities from the IDF relationships (N and N+M years), showed no significant difference at the 5% significance level for any station or return period (Table 9), since the calculated statistics were lower than the critical values, i.e. H0 is accepted.
Since the test results were not significant, the relative mean squared error was used to evaluate the information gain in the IDF relationships, both when pluviograph records and daily rainfall disaggregation are used (Pluvio_N) and when daily rainfall disaggregation is applied to the series extended by the Markov chain (N_N+M). For all studied stations and return periods, Table 10 shows that the composed N+M-year series had lower error values than the N-year series. Regarding average relative errors, ZANETTI et al. (2006), who generated synthetic series of total daily rainfall with the ClimaBR model for 12 Brazilian locations, concluded that the simulated distributions of the number of rainy days from the Markov model were representative, even in situations of large regional weather variability. The number of years was relevant both for the Markov chain transition probabilities and for the Gamma distribution parameters in order to obtain the lowest relative mean squared errors. In this sense, PAIVA & CLARKE (1997), who fitted stochastic rainfall models for the Brazilian Amazon using daily records from 402 stations, concluded that parameter fitting was adequate for stations with longer records, covering periods of over 1,000 days.

CONCLUSIONS
Stochastic modeling using a homogeneous first-order Markov chain proved adequate for estimating sequences of dry and rainy days. The statistics of the observed daily rainfall series were preserved when the Gamma probability distribution was used to simulate rainfall amounts. The proposed methodology is useful for filling missing data and extending daily rainfall series. The daily rainfall disaggregation technique performed well and is a feasible alternative for estimating rainfall intensity-duration-frequency relationships. Most importantly, there was an information gain in the intensity-duration-frequency relationships when N+M years of disaggregated daily rainfall data were used instead of N years.

TABLE 1.
Rain gauge stations used in the study.

TABLE 2.
Descriptive statistics of a daily rainfall series and period of obtaining the transition probabilities (N years).

TABLE 3.
Intensity-duration-frequency relationships of rainfall occurrence used for validation.

Larger rainfall volumes occurred at higher altitudes. BACK et al. (2012) determined relationships among rainfalls of different time lengths in Santa Catarina state

TABLE 5.
Descriptive statistics of the daily rainfall series whose gaps were filled using the Markov chain and Gamma distribution, as well as the period used to obtain annual maximum daily rainfall (N+M years).

TABLE 6.
Parameters of scale (λ) and location (γ) of Gumbel distribution and Kolmogorov-Smirnov (KS) test values at 5% significance level, obtained from N and N+M years of data.

TABLE 8.
Rainfall intensity-duration-frequency (IDF) relationships from daily rainfall disaggregation technique, within N and N+M years of data.

TABLE 10.
Relative mean squared error among maximum rainfall intensity (mm h⁻¹) by PRUSKI et al. (2006) (PLUVIO) and among the evaluated stations, over the periods of N and N+M years for various return periods.