Performance evaluation of flow duration curves regionalization methods

The main objective of this study was to evaluate the Characteristic Values and Exponential Curve methods as alternatives for the regionalization of flow permanence curves. Additionally, the use of regional flow indicators to estimate reference flows associated with flow permanence curves was evaluated. The study area covered the Itapemirim and Itabapoana river watersheds, the most important ones in the southern portion of Espírito Santo state (Brazil). The Characteristic Values method presented the best results in the regionalization of the flow permanence curves. It was also concluded that the regional indicators represent a simple and consistent alternative for estimating reference flows in the study area.


INTRODUCTION
Due to the limitations of the Brazilian hydrological monitoring network, several hydrographic basins have unsatisfactory historical flow series. In this context, hydrological regionalization is a relevant tool to support water resources management. The regionalization of hydrological variables or functions can be conducted by different methods, as illustrated by the works of ELETROBRÁS (1985), Sarhadi and Modarres (2011), Tucci (2002), Chaves et al. (2002), Baena et al. (2004), Laaha and Blöschl (2006) and Isik and Singh (2008).
The flow permanence curve is a hydrological function that relates a flow rate to the percentage of time during which that flow is equaled or exceeded over the entire historical period considered for its construction. It represents the complement of the cumulative distribution function of flows, that is, the flow exceedance probabilities (QUIMPO; ALEJANDRINO; MCNALLY, 1983; VOGEL; FENNESSEY, 1994). In addition to directly supporting studies on the use of the available watercourses, flow duration curves are instruments for comparison between watersheds, since they reflect the effects of relief, vegetation, land use and precipitation on the distribution of flows (EUCLYDES et al., 2001).
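The definition above can be illustrated with a minimal sketch of an empirical permanence curve. It assumes the Weibull plotting position, rank/(n + 1), for the exceedance percentage and linear interpolation between points. Both choices, the function names and the sample series are illustrative assumptions, not necessarily the procedure used in this study.

```python
def flow_duration_curve(flows):
    """Return (exceedance percentages, flows sorted in descending order).

    Assumption: Weibull plotting position, 100 * rank / (n + 1).
    """
    ordered = sorted(flows, reverse=True)
    n = len(ordered)
    exceedance = [100.0 * rank / (n + 1) for rank in range(1, n + 1)]
    return exceedance, ordered


def flow_at_permanence(flows, permanence):
    """Flow equaled or exceeded `permanence` percent of the time."""
    exceedance, ordered = flow_duration_curve(flows)
    # Outside the sampled range, fall back to the nearest observed flow.
    if permanence <= exceedance[0]:
        return ordered[0]
    if permanence >= exceedance[-1]:
        return ordered[-1]
    # Linear interpolation between the two bracketing points of the curve.
    for i in range(1, len(ordered)):
        if exceedance[i] >= permanence:
            p0, p1 = exceedance[i - 1], exceedance[i]
            q0, q1 = ordered[i - 1], ordered[i]
            return q0 + (q1 - q0) * (permanence - p0) / (p1 - p0)


# Hypothetical flow series (m3/s): the integers 1 to 99.
sample = list(range(1, 100))
q50 = flow_at_permanence(sample, 50.0)
q95 = flow_at_permanence(sample, 95.0)
print(q50, q95)
```

Other plotting positions (for instance rank/n) shift the curve slightly; the choice matters most at the extremes of short series.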
Among the available methods for the hydrological regionalization of flow permanence curves, the Characteristic Values and the Exponential Curve methods have been repeatedly used, owing to their ease of use and the quality of their results.
According to Tucci (2002), the Characteristic Values method initially determines the characteristic flows associated with the hydrological functions of interest: maximum flow probability curves, minimum flow probability curves or flow permanence curves. Subsequently, these flow values are related to independent variables through regression analysis. The regression analysis can be simple (using only one independent variable) or multiple (considering more than one independent variable simultaneously). In this way, the Characteristic Values method can be used to regionalize hydrological variables as well as hydrological functions. The works of Baena (2002), Silva et al. (2003), Pessoa, Blanco and Martins (2011), Moreira and Silva (2014) and Bazzo et al. (2017) illustrate the use of the Characteristic Values method for the regionalization of flow permanence curves and different reference flows.
The Exponential Curve method assumes that the flow permanence curve behaves exponentially in the interval between two characteristic flows. Although the method can be established from any pair of flows that make up the permanence curve, it is usually used for the regionalization of the branch of the permanence curve between the 50% (Q50) and 95% (Q95) permanence flows. The works of Agra et al. (2003) and Cortês (2004) illustrate the use of the Exponential Curve method for the regionalization of permanence curves.
Regional indicators represent alternatives for the regionalization of hydrological variables. A regional indicator is the average value of a hydrological variable (or of a ratio between hydrological variables) characteristic of a region presenting homogeneous behavior (REIS et al., 2008). Works such as Tucci (2002) and Reis et al. (2008) illustrate the use of regional indicators to estimate mean, minimum and maximum flows for different river basins.
For proper water resources management, the consistent assessment of water availability is central. In this context, the main objective of the research was to evaluate, for the region formed by the Itapemirim and Itabapoana river basins, the best method applicable to the regionalization of the flow permanence curve and of the reference flows associated with this hydrological function. Different authors (SILVA JUNIOR, 2014; REIS et al., 2006; COSER, 2013) stated that the Itapemirim and Itabapoana river basins, corresponding to the main watercourses located in the southern portion of Espírito Santo state, are homogeneous with respect to different characteristic flows and hydrological functions. These statements made it possible to waive the preliminary assessment of the hydrological homogeneity of the watersheds. The responses from the Characteristic Values and Exponential Curve methods and from the Regional Flow Indicators were compared in this work. These methods and indicators, although widely applied to the regionalization of permanence curves and to the estimation of characteristic flows derived from these curves, usually do not have their performances compared.

STUDY AREA
The present study was conducted for the Itabapoana and Itapemirim river basins, both mainly located in the southern portion of Espírito Santo state. These basins also cover small portions of Minas Gerais and Rio de Janeiro states (Figure 1). According to Silva Junior (2014), these basins, when integrated, constitute a hydrologically homogeneous region.
The Itabapoana river basin has a drainage area of 4,875 square kilometers and covers, totally or partially, eighteen municipalities in the states of Espírito Santo, Minas Gerais and Rio de Janeiro. The Itabapoana river rises in the Caparaó mountain range, in the municipality of Alto Caparaó (MG), where it is initially named Preto river. After its confluence with the Verde river, it is named Itabapoana.
The Itapemirim river basin has a drainage area of 6,014 square kilometers and is almost totally located in Espírito Santo state, with a small part in Minas Gerais state. The Espírito Santo state municipalities located, totally or partially, in the Itapemirim and Itabapoana basins are Ibatiba, Irupi, Iúna, Ibitirama, Muniz Freire, Alegre, Conceição do Castelo, Venda Nova do Imigrante, Castelo, Vargem Alta, Cachoeiro de Itapemirim, Jerônimo Monteiro, Muqui, Atílio Vivácqua, Presidente Kennedy, Itapemirim and Marataízes. Lajinha is the only Minas Gerais state municipality partially located in the Itapemirim basin.
Annual rainfall along the Itapemirim river course varies between 1,020 and 1,240 mm, being lower in the coastal strip, where the water balance is negative. In the Caparaó mountain range region, with a maximum altitude of 2,892 meters (Pico da Bandeira), annual rainfall increases to 1,570 mm. Because it is located in a transition zone between the southeastern and northeastern Brazilian regions, close to the ocean, and because it presents large relief variations, the Itapemirim basin presents large climatic diversity. Despite this, temperature and rainfall behave similarly throughout the year across the whole basin, with rainy summers and dry winters (KLIGERMAN, 2001).

Hydrological data
The fluviometric records used in the present study were extracted from the National Water Agency (ANA) database through the Hydrological Information System (HIDROWEB). The fluviometric stations were selected according to the length and quality of the available historical flow series; series shorter than 35 years were not used. Tables 1 and 2 show the fluviometric stations selected for the hydrological regionalization in the Itapemirim and Itabapoana river basins.
The hydrological data were processed and manipulated through the Computational System for Hydrological Analysis (SisCAH), software developed and made available by Viçosa Federal University Water Resources Research Group.

Independent variables
Watershed drainage area (A), watershed perimeter (P), main river length (L), mean basin slope (I) and mean rainfall (R) were the independent variables considered for conducting the regional flow analysis developed in the present study. These variables were obtained from the work conducted by Silva Junior (2014).

Selection of regional functions
a) Variance Inflation Factor (VIF)
Multicollinearity among the independent variables was identified in the present study by means of the variance inflation factor (VIF), presented in Equation 1:

$$\mathrm{VIF} = \frac{1}{1 - r^{2}} \qquad (1)$$

where $r^{2}$ represents the adjusted coefficient of determination of a given independent variable regressed on all the other independent variables.
If a set of independent variables is not correlated, VIF is equal to 1. If the set is strongly correlated, VIF may exceed 10. Marquadt (1980) indicated that a VIF greater than 10 reveals a very high correlation between an independent variable and the other independent variables.

In this study, the independent variables were considered correlated for VIF values above 5, according to a criterion suggested by Snee (1973).
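A minimal sketch of the collinearity screening for a pair of independent variables, in which case r² reduces to the squared Pearson correlation and VIF = 1/(1 - r²). It uses the ordinary rather than the adjusted coefficient, and the drainage areas and perimeters below are hypothetical values, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_pair(x, y):
    """VIF = 1 / (1 - r^2) for a pair of independent variables."""
    r2 = pearson_r(x, y) ** 2
    return math.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)


def is_collinear(x, y, threshold=5.0):
    """Threshold of 5, the criterion adopted in the text (Snee, 1973)."""
    return vif_pair(x, y) > threshold


# Hypothetical drainage areas (km2) and perimeters (km) for five stations.
area = [120, 450, 980, 2200, 5100]
perimeter = [55, 110, 170, 260, 410]
print(round(vif_pair(area, perimeter), 1), is_collinear(area, perimeter))
```

With more than two variables, each r² would come from regressing one variable on all the others, which calls for a multiple-regression routine.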
b) "Mallows's Cp" statistic
The best subsets approach evaluates all possible regression models for a given set of independent variables, or the best subsets of models for a given number of independent variables.
One of the criteria available for the evaluation of competing models is based on the statistic developed by Mallows (1973, 1995). The statistic proposed by Mallows (Cp) estimates the differences between fitted regression models and true models. When a regression model with k independent variables differs from the true model only by random differences, the mean value of Cp is k + 1, the number of parameters. Thus, when evaluating many alternative regression models, the objective is to find models whose Cp values are less than k + 1. The Cp statistic is defined as (Equation 2):

$$C_p = \frac{(1 - r_k^{2})(n - T)}{1 - r_T^{2}} - \left[ n - 2(k + 1) \right] \qquad (2)$$

where n is the sample size; k is the number of independent variables in a regression model; T is the total number of parameters to be estimated in the full regression model; $r_k^{2}$ is the adjusted coefficient of determination for a regression model that has k independent variables; and $r_T^{2}$ is the adjusted coefficient of determination for the full regression model containing all T estimated parameters.
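A minimal sketch of the screening by the Cp statistic, assuming the textbook form written in the docstring, where n is the sample size. The number of stations, the variable combinations and the coefficients of determination are hypothetical and serve only to exercise the arithmetic.

```python
def mallows_cp(r2_k, r2_full, n, k, T):
    """Cp = (1 - r2_k) * (n - T) / (1 - r2_full) - (n - 2 * (k + 1)).

    r2_k    : coefficient of determination of the model with k variables
    r2_full : coefficient of determination of the full model
    n       : sample size (assumed here to be the number of stations)
    T       : number of parameters of the full model, intercept included
    """
    return (1.0 - r2_k) * (n - T) / (1.0 - r2_full) - (n - 2 * (k + 1))


def is_candidate(cp, k):
    """Criterion stated in the text: retain models whose Cp is below k + 1."""
    return cp < k + 1


# Hypothetical case: 16 stations, full model with 5 variables (T = 6).
n, T = 16, 6
r2_full = 0.98
models = {("P",): 0.975, ("P", "I"): 0.979, ("A", "L"): 0.90}
for variables, r2_k in models.items():
    k = len(variables)
    cp = mallows_cp(r2_k, r2_full, n, k, T)
    print(variables, round(cp, 2), is_candidate(cp, k))
```

When the candidate model is the full model itself, the expression reduces to Cp = T = k + 1, which is a convenient sanity check.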
Figure 2 shows the flowchart of the process used in this study to select the independent variables, and their combinations, used to estimate the regional functions applicable to obtaining permanence curves.

Flows permanence curves regionalization
For the establishment of regional functions applicable to the estimation of the flow permanence curves, Characteristic Values and Exponential Curve methods were applied.
The application of the Characteristic Values method involved the following steps:
a) Establishment of the permanence curves for the different fluviometric stations of the study area, conducted with the aid of the SisCAH program;
b) Estimation, from the established permanence curves, of flows with permanence varying between 50% and 95%;
c) Association, through regression analysis (simple and multiple) assuming power functions, between the different permanence flows and the physiographic and climatological characteristics of the study area. In this step the best subsets approach was used, selecting the regional functions according to the "Mallows's Cp" statistic.
Based on the regression equations selected for each permanence flow defined in item (b), the flow permanence curves for the study area fluviometric stations were reconstructed.
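Step (c) can be sketched for the single-variable case: a power function Q = c·A^b becomes linear after taking logarithms, so ordinary least squares on ln Q versus ln A yields the coefficients. The function names, the drainage areas and the Q90 values below are hypothetical, generated only to exercise the code.

```python
import math


def fit_power_law(x, q):
    """Fit q = c * x**b by least squares on the log-transformed data."""
    log_x = [math.log(v) for v in x]
    log_q = [math.log(v) for v in q]
    n = len(log_x)
    mean_x, mean_q = sum(log_x) / n, sum(log_q) / n
    sxy = sum((a - mean_x) * (b - mean_q) for a, b in zip(log_x, log_q))
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    b = sxy / sxx                      # exponent of the power function
    c = math.exp(mean_q - b * mean_x)  # multiplicative coefficient
    return c, b


def predict(c, b, x):
    """Evaluate the fitted regional function at x."""
    return c * x ** b


# Hypothetical drainage areas (km2) and synthetic Q90 values (m3/s).
areas = [150.0, 420.0, 1100.0, 2300.0, 5200.0]
q90 = [0.015 * a ** 0.95 for a in areas]
c, b = fit_power_law(areas, q90)
print(round(c, 4), round(b, 4), round(predict(c, b, 800.0), 2))
```

Regional functions with two or three independent variables follow the same idea, but the log-linear system then needs a multiple-regression solver.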
For the application of the Exponential Curve method, the following steps were performed:
a) Estimation, from the established flow permanence curves, of the flows with 50% (Q50) and 95% (Q95) permanence. Although the method can be established from any pair of flows that make up the permanence curve, it is usually used for the regionalization of the stretch of the permanence curve between the Q50 and Q95 flows;
b) Association of the reference flows Q50 and Q95 with the physiographic and climatic characteristics of the study area, establishing regional functions by means of regression analysis (simple and multiple) assuming power functions. In this step, the best subsets approach was used, selecting the regional functions according to the "Mallows's Cp" statistic;
c) Determination, for the section of interest, of the physiographic and climatological characteristics employed in the regionalization, and estimation of the Q50 and Q95 flows through the regional functions;
d) Determination of the coefficients a and b through Equations 3 and 4;
e) Reconstruction of the permanence curve flows for the study area fluviometric stations by using Equation 5.
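A minimal sketch of steps (d) and (e), assuming the exponential function has the form Q = exp(a·P + b), with the permanence P expressed as a fraction, and that a and b are obtained by forcing the curve through the points (0.50, Q50) and (0.95, Q95). This form and the flow values are assumptions made for illustration and may differ in notation from Equations 3 to 5.

```python
import math


def exponential_coefficients(q50, q95, p50=0.50, p95=0.95):
    """Coefficients of Q = exp(a * P + b) passing through both reference flows."""
    a = (math.log(q95) - math.log(q50)) / (p95 - p50)
    b = math.log(q50) - a * p50
    return a, b


def exponential_curve(a, b, permanences):
    """Flows of the reconstructed stretch of the permanence curve."""
    return [math.exp(a * p + b) for p in permanences]


# Hypothetical regionalized reference flows (m3/s).
q50, q95 = 12.0, 5.0
a, b = exponential_coefficients(q50, q95)
permanences = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
curve = exponential_curve(a, b, permanences)
for p, q in zip(permanences, curve):
    print(f"{p:.2f}  {q:.2f}")
```

Because Q50 > Q95, the coefficient a is negative and the reconstructed flows decrease monotonically with permanence.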

Regional functions performance analysis
In this study, the responses of the regional functions were evaluated with the standard error of estimate (Equation 6), the coefficient of determination or the adjusted coefficient of determination (Equation 7) and the percentage deviation, as presented in works such as Naghettini and Pinto (2007) and Levine, Stephan and Szabat (2005).
In Equations 6 and 7, n represents the sample size, N the number of samples, $q_i$ the observed value of the dependent variable and $\hat{q}_i$ the value of the dependent variable estimated for the same sampling point. In this study, for a regional function established from the regression analysis to be considered adequate, the coefficient of determination should be at least 70% and the maximum percentage deviation between the flows obtained from the permanence curves and the regionalized flows should be 30%, according to recommendations originally established by ELETROBRÁS (1985).
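The acceptance criteria above can be sketched as follows, assuming the usual definitions: the coefficient of determination as 1 - SSE/SST and the percentage deviation as 100·(estimated - observed)/observed. The function names and the flow values are hypothetical.

```python
def coefficient_of_determination(observed, estimated):
    """R^2 = 1 - SSE / SST (assumed ordinary, not adjusted, form)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def percentage_deviations(observed, estimated):
    """Signed deviation of each estimate, in percent of the observed flow."""
    return [100.0 * (e - o) / o for o, e in zip(observed, estimated)]


def is_adequate(observed, estimated, min_r2=0.70, max_deviation=30.0):
    """Thresholds quoted in the text: R^2 >= 70% and deviations up to 30%."""
    r2 = coefficient_of_determination(observed, estimated)
    worst = max(abs(d) for d in percentage_deviations(observed, estimated))
    return r2 >= min_r2 and worst <= max_deviation


# Hypothetical observed and regionalized flows (m3/s) at four stations.
observed = [10.0, 20.0, 30.0, 40.0]
estimated = [11.0, 19.0, 33.0, 38.0]
print(coefficient_of_determination(observed, estimated))
print(percentage_deviations(observed, estimated))
print(is_adequate(observed, estimated))
```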
To indicate the most appropriate regionalization method, the relative errors (Equation 8), the mean relative errors and the Nash-Sutcliffe coefficients (Equation 9) were used. These coefficients were used by Costa, Fernandes and Naghettini (2012) when conducting the cross-validation process associated with the establishment of regional models for permanence curves of perennial, intermittent and ephemeral rivers located in Minas Gerais and Ceará states.

$$\epsilon_{s,j} = \frac{\hat{q}_{s,j} - q_{s,j}}{q_{s,j}} \qquad (8)$$

$$E_s = 1 - \frac{\sum_{j} \left( \hat{q}_{s,j} - q_{s,j} \right)^{2}}{\sum_{j} \left( q_{s,j} - \bar{q}_{s} \right)^{2}} \qquad (9)$$

In Equations 8 and 9, for a given station s, $\epsilon_{s,j}$ represents the relative error between the flows corresponding to the j-th permanence, denoted by $\hat{q}_{s,j}$ and $q_{s,j}$, obtained respectively from the regional and empirical permanence curves, and $\bar{q}_{s}$ is the mean of the empirical flows at the station. Costa, Fernandes and Naghettini (2012) evaluated the percentages (P) of cases, in relation to the set of equations, for which $E_s$ > 0.75 (P1, fair to good fit), 0.50 < $E_s$ < 0.75 (P2, poor to fair fit) and $E_s$ < 0.50 (P3, poor fit).
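A minimal sketch of these comparison metrics, assuming the relative error is (regional - empirical)/empirical and the Nash-Sutcliffe coefficient takes its standard form, one minus the ratio between the sum of squared errors and the sum of squared deviations of the empirical flows from their mean. The handling of values exactly at 0.50 or 0.75 and the flow values are assumptions.

```python
def relative_errors(q_regional, q_empirical):
    """Relative error for each permanence of one station."""
    return [(qr - qe) / qe for qr, qe in zip(q_regional, q_empirical)]


def nash_sutcliffe(q_regional, q_empirical):
    """Nash-Sutcliffe efficiency of the regional curve at one station."""
    mean_emp = sum(q_empirical) / len(q_empirical)
    num = sum((qr - qe) ** 2 for qr, qe in zip(q_regional, q_empirical))
    den = sum((qe - mean_emp) ** 2 for qe in q_empirical)
    return 1.0 - num / den


def classify(e_s):
    """P1: E > 0.75; P2: 0.50 to 0.75; P3: E < 0.50 (boundaries assumed)."""
    if e_s > 0.75:
        return "P1"
    if e_s >= 0.50:
        return "P2"
    return "P3"


# Hypothetical flows (m3/s) for four permanences at one station.
empirical = [10.0, 8.0, 6.0, 4.0]
regional = [11.0, 8.0, 6.0, 4.0]
errors = relative_errors(regional, empirical)
e_s = nash_sutcliffe(regional, empirical)
mean_abs_error = sum(abs(e) for e in errors) / len(errors)
print(errors, e_s, classify(e_s), mean_abs_error)
```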

Regional indicators
The indicators rcp50, rcp90 and rcp95 were estimated with the aid of Equations 10, 11 and 12.
Once the values of the permanence indicators were defined for each fluviometric station, the mean value of each indicator was assumed as the regional indicator for the study area.
The values of the flows Q50, Q90 and Q95 obtained from the regional indicators for the different stations were compared with the actual flows and with the flows obtained through the regional functions applicable to the estimation of the permanence curves. In this study, the quality of the responses obtained through the regional indicators was evaluated with the aid of the percentage deviation.
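A minimal sketch of the regional indicator approach, assuming that each permanence indicator is the ratio between the permanence flow and the long-term mean flow of the station (rcp90 = Q90/Qm, for instance) and that the regional value is the arithmetic mean across stations. The ratio definition, the function names and the station values are assumptions made for illustration.

```python
def station_indicator(q_perm, q_mean):
    """Assumed definition: permanence flow divided by the mean flow."""
    return q_perm / q_mean


def regional_indicator(stations):
    """Arithmetic mean of the station indicators.

    `stations` is a list of (mean flow, permanence flow) pairs.
    """
    ratios = [station_indicator(qp, qm) for qm, qp in stations]
    return sum(ratios) / len(ratios)


def estimate_flow(indicator, q_mean):
    """Permanence flow estimated at a site from its mean flow."""
    return indicator * q_mean


def percentage_deviation(observed, estimated):
    return 100.0 * (estimated - observed) / observed


# Hypothetical (Qm, Q90) pairs in m3/s for three stations.
stations = [(10.0, 3.0), (20.0, 5.0), (40.0, 12.0)]
rcp90 = regional_indicator(stations)
q90_site = estimate_flow(rcp90, 30.0)
print(round(rcp90, 4), round(q90_site, 2))
```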

Evaluation of collinearity
Table 2 shows the combinations of independent variables that presented collinearity. Combinations involving the independent variables area, perimeter and main river length produced high VIF values, recurrently higher than 5. In this context, combinations involving two or more of these variables were not considered when defining regional functions.

Independent variables selection
Appendix A presents the values of the "Mallows's Cp" statistic for the different combinations of the independent variables considered in the present study. The highlighted cells indicate the permanence curve reference flows estimated from regression models considered adequate according to the "Mallows's Cp" statistic.
Next, considering the results of the "Mallows's Cp" statistic for the different flows that make up the permanence curve, the regression models whose independent variables did not present collinearity, according to the VIF criterion, were sought.
In this context, the combinations of independent variables chosen for the regionalization of the permanence curve were a) Perimeter, b) Perimeter and Slope, and c) Perimeter, Slope and Precipitation. Additionally, Area was considered as an independent variable because it presented values lower than or very close to the limit imposed by the "Mallows's Cp" statistic for flows with permanence between 50% and 90%, and because it is the most used variable in regional flow analysis studies.

Characteristic values method
The regional functions established with the Characteristic Values method, assuming as independent variables a) Area, b) Perimeter, c) Perimeter and Slope, and d) Perimeter, Slope and Precipitation, are grouped in Table 3. This table also presents the coefficient of determination (for Area or Perimeter as single independent variables) or the adjusted coefficient of determination associated with each regional function. The determination or adjusted determination coefficients were satisfactory for all combinations of independent variables evaluated, invariably assuming values higher than 95%. The established regional functions therefore complied with the criterion proposed by ELETROBRÁS (1985), according to which regional functions obtained by regression analysis should present a coefficient of determination of at least 70%.
For all combinations of independent variables considered in this study, the standard errors of estimate remained substantially lower than the standard deviations of the actual flow series, indicating, according to the analysis criterion discussed by Naghettini and Pinto (2007), that the regression analysis produced satisfactory results in obtaining the regional functions.
The combination of independent variables Perimeter and Slope presented the best results, considering the 30% limit for the percentage deviations between the flows of the actual permanence curves and those of the permanence curves established by means of regional functions. However, for most of the analyzed fluviometric stations, regional functions established from different combinations of independent variables produced similar results.
For different flows of the regionalized stretch of the permanence curve (the whole stretch or part of it) and different combinations of independent variables, percentage deviations higher than 30% between actual and estimated flows were observed at the São José do Calçado and Guaçuí stations. Similar behavior was also observed in the works of Silva Junior (2014), Reis et al. (2006) and Coser (2013). According to Coser (2013), regional functions capable of producing smaller percentage errors between actual and estimated flows could be obtained by removing these fluviometric stations. Deviations greater than 30% were also observed for the Iúna and Ibitirama stations, for some of the regional functions and for part of the analyzed reference flows.

Exponential curve method
For the hydrological regionalization of the permanence curve with the Exponential Curve method, regional functions were established to estimate the Q50 and Q95 flows (Table 4).
The expressions shown in Table 4 allowed the estimation of the coefficients a and b (obtained with the aid of Equations 3 and 4, respectively) that form the exponential function used for each regionalized permanence curve (Equation 5).
As was done when applying the Characteristic Values method, standard errors of estimate were determined for the different independent variables considered in the present study. The standard errors of estimate for the regional functions produced through the application of the Exponential Curve method remained relatively low and considerably lower than the standard deviations obtained for the actual flows of the permanence curve.
In addition, percentage deviations between the actual flows and the flows estimated through the regional functions obtained by the Exponential Curve method were computed for the different combinations of independent variables considered in the present study.
The combination of independent variables Perimeter, Slope and Precipitation produced the best results when applying the Exponential Curve method, considering the 30% limit for the percentage deviations between the actual flows and the flows estimated by means of regional functions.
As observed during the application of the Characteristic Values method, deviations greater than 30% between actual and estimated flows were recurrently observed for the São José do Calçado, Guaçuí, Ibitirama and Iúna stations when the Exponential Curve method was applied.
The Itaci and Terra Corrida stations, for flows from the lower branch of the regionalized permanence curves (permanence between 75% and 90%), also presented deviations greater than 30% for different regional functions established with the aid of the Exponential Curve method.
Figures 3 and 4 present stretches of the permanence curves, between the 50% and 95% permanence values, for the Coutinho and Usina São Miguel stations, respectively. These figures show the parts of the permanence curves obtained from the flow records and from the regional functions defined by the different regionalization methods employed. Similar permanence curves were produced for the other fluviometric stations considered in the regional analysis.

Comparison between permanence curve results
Table 5 presents the mean relative errors and Nash-Sutcliffe coefficients established for the different stations of the study area when applying the regional function established by the Characteristic Values method with Perimeter and Slope as independent variables. Similarly, Table 6 shows the values of these parameters obtained when using the regional function established by the Exponential Curve method with Perimeter, Slope and Precipitation as independent variables. These tables also present the mean relative error values for the study area and the percentages of cases in which the Nash-Sutcliffe coefficients indicate fits of different quality.
From the values presented in Tables 5 and 6, it is possible to observe that, for the study area, the mean absolute value of the relative errors was lower when the regional function obtained with the Characteristic Values method was applied. Additionally, although the percentage of cases in which the Nash-Sutcliffe coefficient suggested a fair to good fit ($E_s$ > 0.75) was equal for the two regionalization methods, the Characteristic Values method presented the lower percentage (19%) of cases with a fit considered poor.

Regional indicators
The Qm, Q50, Q90 and Q95 regional flow indicators are presented in Table 7.
From the comparison between reference flows obtained from historical flow records and those estimated through regional indicators, it was possible to observe that the regional indicator for the reference flow Q90 presented the best results. For only two fluviometric stations (Guaçuí and São José do Calçado) did the flows obtained from the regional indicator show deviations greater than 30% in relation to the actual flows.
Satisfactory results were also obtained by means of regional indicators for the reference flows Qm, Q50 and Q95. The use of the indicators q, rcp50 and rcp95 produced percentage deviations between actual and estimated flows higher than 30% for only three fluviometric stations (Ibitirama, Guaçuí and São José do Calçado).
Comparison of the results obtained through the regional indicators with those obtained by the other permanence curve regionalization methods
Figure 5 presents a graphical comparison between the Q90 flows obtained from historical series and through the application of Regional Indicators, the Exponential Curve method and the Characteristic Values method. Similar graphs were produced for the reference flows Q50 and Q95. Error bars indicate 30% deviations in relation to the reference flows estimated from the historical flow series.
From the comparisons between the reference flows obtained from the historical flow series and the reference flows estimated through the different regional analysis alternatives, a tendency toward overestimation was observed for flows obtained through regional indicators, when compared with actual flows and with those obtained through the regional functions produced by the Exponential Curve and Characteristic Values methods.
For stations such as Iúna, Terra Corrida and Rive, the results obtained for reference flows Q50, Q90 and Q95 from the application of

CONCLUSIONS
For the region formed by the Itapemirim and Itabapoana river basins, considering the statistical criteria used in this study, the Characteristic Values method presented better performance for the establishment of regional functions applicable to the construction of permanence curves and to the estimation of the reference flows associated with these hydrological functions. The regional function that assumed Perimeter and Slope as independent variables presented the best results for the study area.
The regional flow indicators, although simple and expeditious to apply, tended to overestimate reference flows associated with flow permanence curves in the Itapemirim and Itabapoana river basins.
Qn is the flow with permanence Pn.

Figure 2. Flowchart presenting steps for the definition of regional functions.

Figure 3. Comparison between the real permanence curve and those obtained by Characteristic Values and Exponential Curve methods for the Coutinho station.

Figure 4. Comparison between the real permanence curve and those obtained by Characteristic Values and Exponential Curve methods for the Usina São Miguel station.

Table 1. Fluviometric stations installed and operated in the area considered in the study.

Table 2. Combinations of independent variables that presented collinearity.

Table 3. Regional functions obtained by the Characteristic Values method. Flow values in cubic meters per second, areas in square kilometers, perimeter in kilometers, slope in kilometers per kilometer and precipitation in millimeters.

Table 4. Regional functions for Q50 and Q95 flow estimation. Flow values in cubic meters per second, areas in square kilometers, perimeter in kilometers, slope in kilometers per kilometer and precipitation in millimeters.

Table 5. Average relative errors and Nash-Sutcliffe coefficients established by the use of the regional function selected for the application of the Characteristic Values method.

Table 6. Average relative errors and Nash-Sutcliffe coefficients established by the use of the regional function selected for the application of the Exponential Curve method.

Table 7. Regional flow indicators used to estimate reference flows for the study region.

Figure 5. Comparison between the real Q90 flow values and those obtained through Regional Indicators, Exponential Curve and Characteristic Values methods.