Dimensioning of reservoirs for semiarid regions using synthetic series

ABSTRACT The lack of fluviometric data for hydrographic basins affects the estimates of the capacity of regularization reservoirs, which are important to meet seasonal water demands. The objective of this study was to evaluate the performance of methodologies based on synthetic series (SS) of streamflow for the dimensioning of regularization reservoirs in the Jequitinhonha River Basin, Brazil. The reservoir capacity (RC) was estimated with and without association with a return period, using different proportions of the long-term mean streamflow, SS generated from observed data, and SS generated from data estimated by the streamflow regionalization process developed by the Instituto Mineiro de Gestão das Águas. The results were compared to the RC obtained by the methods of regionalization of the regularization curve and regionalization of reservoir capacity. The methods that include synthetic series associated with a return period presented better performance, with mean overestimates and underestimates lower than 25% and 15%, respectively, and estimated values closer to those obtained from the observed data series. Therefore, the use of methodologies based on SS to estimate RC was adequate for locations without fluviometric monitoring in the Jequitinhonha River Basin.


Introduction
Water scarcity occurs mainly due to asymmetry in its distribution, lack of management, and inadequate management of natural resources. In this context, arid and semiarid regions are severely affected, since they present intense rainfalls concentrated in a short period followed by a long drought period (Martins et al., 2018).
The search for temporal and spatial balance between available and demanded water is essential to meet the increasing demand. Regularization reservoirs are an alternative for decreasing losses caused by scarcity and improving the efficiency of water resource use, since they allow water to be stored during rainy periods and distributed throughout long drought periods (Wang et al., 2013; Xu et al., 2017).
Thus, regularization reservoirs are important to promote a sustainable management of water resources (Li et al., 2010). However, the adequate dimensioning of these structures requires long-term records of fluviometric data (Kuria & Vogel, 2015), which are incomplete for most hydrographic basins in the world (Loukas & Vasiliades, 2014).
The unavailability of streamflow data can be overcome by using continuous synthetic series of streamflow, which represent an alternative for estimating reservoir capacities.
Therefore, the objective of this study was to evaluate the performance of methodologies based on synthetic series (SS) of streamflow for the dimensioning of regularization reservoirs in the Jequitinhonha River Basin, Brazil.

Material and Methods
The study was developed for the Jequitinhonha River Basin, Brazil, which covers an area of 70,315 km², of which 66,319 km² are in the state of Minas Gerais and 3,996 km² in the state of Bahia (Gonçalves, 1997). A large part of this basin is in the Semiarid region, which is characterized by a sparse perennial river network, with streamflow occurring in the rainy season or soon after rainfall events.
The generation of synthetic series for the section of interest (not monitored) was based on the methodology proposed by Rodrigues (2017), using observed streamflow data or regionalized from fluviometric stations in the same hydrologically homogeneous region of the section of interest to generate representative synthetic series for the section.
The study developed by IGAM (2012) classified the whole Jequitinhonha River Basin as a hydrologically homogeneous region. Figure 1 presents the location of the Jequitinhonha River Basin, and fluviometric stations used in the study.
The proposed methodology is presented in Figure 2 and is based on the generation of synthetic series, estimation of the reservoir capacity by methodologies that use these series, and methods that do not use these series for comparisons.
The synthetic series were estimated for the section of interest, according to Eq. 1:

$$Q_{SI,d} = \left( \frac{1}{n} \sum_{i=1}^{n} \frac{q_{EF_i,d}}{q_{lt,EF_i}} \right) q_{lt,SI} \, A_{SI} \qquad (1)$$

where: $Q_{SI,d}$ - streamflow of the section of interest on day d, m³ s⁻¹; $q_{EF_i,d}$ - specific streamflow of the fluviometric station i on day d, m³ s⁻¹ km⁻²; $q_{lt,SI}$ - specific long-term average streamflow in the section of interest obtained by regionalization studies, m³ s⁻¹ km⁻²; $q_{lt,EF_i}$ - specific long-term average streamflow in the fluviometric station i (observed or regionalized), m³ s⁻¹ km⁻²; n - number of fluviometric stations; and, $A_{SI}$ - drainage area of the section of interest, km².
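The following is a minimal sketch of how such a series could be generated. It assumes that Eq. 1 averages the daily specific streamflows of the stations, each normalized by its own long-term mean, and rescales the result by the regionalized long-term specific streamflow and the drainage area of the section of interest. The function name, argument names, and sample values are hypothetical.

```python
def synthetic_series(q_stations, qlt_stations, qlt_si, area_si):
    """Daily streamflow (m3/s) for an unmonitored section of interest.

    q_stations   : one list of daily specific streamflows (m3 s-1 km-2) per station
    qlt_stations : long-term mean specific streamflow of each station (m3 s-1 km-2)
    qlt_si       : regionalized long-term mean specific streamflow at the section
    area_si      : drainage area of the section of interest (km2)
    """
    n = len(q_stations)
    n_days = len(q_stations[0])
    series = []
    for d in range(n_days):
        # Assumed form: mean of the station flows, each normalized by its own long-term mean
        ratio = sum(q[d] / qlt for q, qlt in zip(q_stations, qlt_stations)) / n
        # Rescale the dimensionless ratio to the section of interest
        series.append(ratio * qlt_si * area_si)
    return series


# Hypothetical data: two stations, two days
q_stations = [[0.002, 0.004], [0.003, 0.009]]
qlt_stations = [0.004, 0.006]
series = synthetic_series(q_stations, qlt_stations, qlt_si=0.005, area_si=1000.0)
print(series)
```

Passing observed long-term means in `qlt_stations` would correspond to an SSo-type series, and passing regionalized values to an SSr-type series.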
The synthetic series generated with $q_{lt,EF_i}$ data observed in the fluviometric stations were termed observed synthetic series (SSo), and the synthetic series generated with $q_{lt,EF_i}$ data regionalized for the stations were termed regionalized synthetic series (SSr). The regionalized data were estimated by the streamflow regionalization process described by IGAM (2012).
The performance of the synthetic series was evaluated using leave-one-out cross validation. This methodology consists of excluding one sample from the dataset and estimating its value using the remaining samples. The estimated value is then compared to the value of the removed sample (Muller & Thompson, 2016). Thus, each fluviometric station was in turn considered not monitored and left out to obtain the synthetic series (SSr or SSo) for the section.
The quality of the synthetic series (SSr and SSo) was evaluated by comparing the $q_{lt}$ of these series to the $q_{lt}$ of the historical series of observed streamflow (SO) using the relative error (RE), as described in Eq. 2:

$$RE = \frac{q_{lt,SS} - q_{lt,So}}{q_{lt,So}} \times 100 \qquad (2)$$

where: RE - relative error, %; $q_{lt,SS}$ - specific long-term average streamflow of the synthetic series (SSo or SSr), m³ s⁻¹ km⁻²; and, $q_{lt,So}$ - specific long-term average streamflow of the observed series, m³ s⁻¹ km⁻².
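The leave-one-out evaluation and the relative error can be sketched as below. This is an illustration only: the sign convention (positive values are overestimates) is an assumption, the station codes and values are made up, and `mean_of_neighbors` is a hypothetical placeholder standing in for the generation of the synthetic series.

```python
def relative_error(qlt_estimated, qlt_observed):
    """Relative error in percent; positive values are overestimates (assumed convention)."""
    return 100.0 * (qlt_estimated - qlt_observed) / qlt_observed


def leave_one_out(observed_qlt, estimator):
    """Estimate each station from the remaining ones and return its relative error."""
    errors = {}
    for code, observed in observed_qlt.items():
        # Treat this station as not monitored: hide it from the estimator
        remaining = {c: q for c, q in observed_qlt.items() if c != code}
        errors[code] = relative_error(estimator(remaining), observed)
    return errors


def mean_of_neighbors(remaining):
    # Hypothetical placeholder estimator, NOT the synthetic series method itself
    return sum(remaining.values()) / len(remaining)


# Hypothetical long-term specific streamflows (m3 s-1 km-2)
stations = {"A": 0.004, "B": 0.005, "C": 0.006}
errors = leave_one_out(stations, mean_of_neighbors)
for code, err in errors.items():
    print(f"station {code}: RE = {err:.1f}%")
```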
The reservoir capacity (RC) was estimated by different methods: regionalization of the regularization curve (RRCur), RC estimate using the observed synthetic series (MSSo), RC estimate using the regionalized synthetic series (MSSr), and regionalization of reservoir capacity (RRCap); the latter three can be associated with the return period (MSSo_T, MSSr_T, RRCap_T). RC was also estimated from the historical data series associated with the return period (MSO_T), which was taken as the reference method because it is the least prone to errors related to databases. RRCur is commonly used to dimension reservoirs in locations without fluviometric data, and RRCap is used for comparisons in the evaluation of methods based on generation of synthetic series.
Four proportions (0.25, 0.50, 0.75, and 1.00) of long-term mean streamflow were used to dimension the regularization reservoirs, which indicate the water potential availability of the hydrographic basin, representing the maximum streamflow that can be regularized.
The dimensioning of regularization reservoir capacity requires the value of the highest volumes of accumulated deficit, calculated by the method of accumulated differences or maximum accumulated deficit. In this method, the accumulated deficit volume is obtained by the highest sum of deficits between the affluent streamflow and the streamflow to be regularized. The most critical period was the one with the highest accumulated deficit and, consequently, higher RC.
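A minimal sketch of the maximum accumulated deficit calculation is given below. It assumes daily inflows and a running sum of deficits that is reset to zero whenever the accumulated surplus refills the reservoir; this reset rule is an assumption about how the deficits are accumulated. The function name and the sample inflows are hypothetical.

```python
SECONDS_PER_DAY = 86400.0
M3_PER_HM3 = 1.0e6


def max_accumulated_deficit(inflow, regulated_flow):
    """Highest accumulated deficit (hm3) for a daily inflow series (m3/s).

    regulated_flow is the streamflow to be regularized (m3/s),
    e.g. a fraction beta of the long-term mean streamflow.
    """
    deficit = 0.0      # current accumulated deficit, in (m3/s) x days
    max_deficit = 0.0  # highest accumulated deficit found so far
    for q in inflow:
        # Deficit grows when inflow < demand and shrinks when inflow > demand;
        # it never goes below zero (reservoir full, surplus is spilled)
        deficit = max(0.0, deficit + (regulated_flow - q))
        max_deficit = max(max_deficit, deficit)
    # (m3/s) x days -> hm3, i.e. a factor of 0.0864
    return max_deficit * SECONDS_PER_DAY / M3_PER_HM3


# Hypothetical daily inflows (m3/s) and a regulated flow of 5 m3/s
inflow = [10.0, 4.0, 2.0, 3.0, 12.0, 8.0]
rc = max_accumulated_deficit(inflow, regulated_flow=5.0)
print(f"RC = {rc:.4f} hm3")
```

In this example the deficits of 1, 3, and 2 m³ s⁻¹ on three consecutive days accumulate to 6 m³ s⁻¹ day, which is the critical period of the series.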
Probability density functions (PDF) were used to associate the RC with the different return periods (T) (Kite, 1988), using Eq. 3:

$$RC_T = \bar{x} + k \, s \qquad (3)$$

where: $RC_T$ - reservoir capacity associated with a return period, hm³; $\bar{x}$ - mean of the accumulated deficit volumes, hm³; k - frequency factor, dimensionless; and, s - standard deviation of the accumulated deficit volumes, hm³.
Each probability distribution has its own expression for determining the frequency factor (Kite, 1988).
Under the annual operation rule, the highest accumulated deficit volume is obtained for each year, so the number of deficits corresponds to the number of years of the synthetic series. The PDF are applied to this series of deficits, giving the RC associated with T.
Under the pluriannual operation rule, all years of the series are analyzed together, which yields only one accumulated deficit volume and does not allow the use of PDF. The modified accumulated difference method (MADM), developed by Nunes & Pruski (2015), was therefore used to obtain several RC associated with different return periods.
The MADM is based on the development of temporal synthetic series. The number of synthetic series is equal to the number of years of the historical series of available streamflows less one, and an accumulated deficit volume is associated with each series, including the original. The following probability density functions were analyzed for all deficits: Gumbel, Log-Normal type II, Log-Normal type III, Pearson type III, and Log-Pearson type III.
The regularization curve, developed from the volumes calculated for several regularized streamflows, relates the regularization degree to the volume required and allows the estimate of RC in locations without monitoring, based on Eq. 4, where: f - factor of conversion for cubic hectometers (0.0864); a and b - parameters obtained in the regression, dimensionless; β - streamflow fraction to be regularized, dimensionless; and, $Q_{lt,reg}$ - long-term mean streamflow regionalized for the section of interest, m³ s⁻¹.
In the reservoir capacity regionalization (RRCap) proposed by Rodrigues (2017), the variables to be regionalized were altered in relation to RRCur, with the advantage of associating the RC with T.
The RC was calculated for each fluviometric station located in the same hydrologically homogeneous region of the section of interest, based on the demand. The RC values obtained were then regionalized, generating regression equations that relate them to the streamflow equivalent to the rainfall volume in the drainage area of the station less an abstraction factor, which is 600 mm for the Jequitinhonha River Basin (Peq600) (IGAM, 2012), using Eq. 5:

$$P_{eq600} = \frac{(P - 600) \, A}{k} \qquad (5)$$

where: $P_{eq600}$ - streamflow equivalent to the rainfall volume less 600 mm, m³ s⁻¹; P - mean total annual rainfall in the area, mm; A - drainage area, km²; and, k - conversion factor (31536).
The abstraction factor is used to account for the part of the rainfall that is not converted into runoff over the hydrographic region due to other processes, mainly evapotranspiration (Pruski et al., 2013).
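As an illustration of the Peq600 variable, the sketch below converts the annual rainfall depth in excess of the 600 mm factor into an equivalent streamflow. The conversion factor 31536 follows from the units: 1 mm over 1 km² is 1000 m³, and one year of 365 days has 31,536,000 s. The function name and the sample values are hypothetical.

```python
def equivalent_streamflow(p_annual_mm, area_km2, abstraction_mm=600.0, k=31536.0):
    """Streamflow (m3/s) equivalent to the annual rainfall less the abstraction depth.

    1 mm over 1 km2 = 1000 m3 and 1 year = 31,536,000 s,
    hence k = 31,536,000 / 1000 = 31536.
    """
    return (p_annual_mm - abstraction_mm) * area_km2 / k


# Hypothetical section: 900 mm of mean annual rainfall over 31,536 km2
peq600 = equivalent_streamflow(900.0, 31536.0)
print(f"Peq600 = {peq600:.1f} m3/s")
```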
The regression model used was the power model, represented by Eq. 6:

$$RC_{T-D} = c \, P_{eq600}^{\,d} \qquad (6)$$

where: $RC_{T-D}$ - reservoir capacity associated with T and the demand, hm³; and, c and d - parameters obtained in the power regression.
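One common way to fit a power model of the form RC = c · Peq600^d is ordinary least squares on the logarithms of both variables. The sketch below uses that approach as an assumption, since the text does not state how the parameters were fitted; the function name and the data are made up for illustration.

```python
import math


def fit_power(x, y):
    """Fit y = c * x**d by least squares on log-transformed data (assumed fitting method)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    d = sxy / sxx                       # slope in log-log space is the exponent
    c = math.exp(mean_y - d * mean_x)   # intercept in log-log space is log(c)
    return c, d


# Hypothetical Peq600 values (m3/s) and reservoir capacities (hm3) lying on RC = 3 * Peq600**1.5
peq600_values = [1.0, 2.0, 4.0, 8.0]
rc_values = [3.0 * v ** 1.5 for v in peq600_values]
c, d = fit_power(peq600_values, rc_values)
print(f"c = {c:.3f}, d = {d:.3f}")
```

A separate pair (c, d) would be fitted for each combination of return period and demand.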
The methodologies proposed to estimate RC were evaluated by the modified Nash-Sutcliffe index (E1) and the Willmott agreement index (d). The RC values from the observed data series were adopted as the reference to evaluate the methodologies.
The Willmott agreement index (d) (Willmott et al., 1985) varies from zero (no agreement) to 1 (perfect agreement), based on Eq. 8.
The higher errors in relation to the original series, which reached approximately -20% (station 54110002) and 50% (station 54590000) in the two synthetic series, are explained by flaws in the regionalization process of IGAM (2012), in which the regionalized $q_{lt}$ estimate also presented a higher residue percentage for these stations.
The RC was calculated by the proposed methodologies for all stations, and is exemplified by fluviometric station 54580000 (Figure 4), which is representative of the other stations.
The RC obtained for the different β and T showed that the methodologies RRCur, MSSo, MSSr, and RRCap do not follow the growth of the RC estimated by MSO_T as T increases, because they are not associated with T. Thus, the analysis of these methodologies shows the occurrence of underestimates and overestimates, mainly for RRCur, which had the most inconsistent values in relation to the original series for β of 0.5. The methodology RRCap was superior to RRCur, but exhibited considerable underestimates in this station, with values lower than those of the MSSo and MSSr methodologies, which stood out with smaller deviations from the reference method, mainly for β of 0.75.
Moreover, the methodologies MSSo and MSSr have better applicability than RRCap, since, after a regionalization study and the generation of the synthetic series, only the application of the maximum accumulated deficit method is required to obtain the RC (Rodrigues, 2017). The use of RRCap additionally requires the regionalization of the RC obtained after the application of the maximum accumulated deficit method, which generates the equation for estimating the RC for the section. Thus, the use of synthetic series for RC estimation has good applicability, with more representative estimates.
Despite the satisfactory results of MSSo and MSSr, methodologies that consider T are more representative, since the estimates vary as T is modified. Methodologies that use T are more complex, requiring the application of MADM and PDF, but the RC calculation associated with the frequency factor is more suitable because it enables the forecast of large-magnitude hydrological events, which is important for the development of hydrological projects.
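The frequency-factor calculation referred to above (mean of the accumulated deficits plus the frequency factor times their standard deviation) can be sketched for one distribution. The example uses the Gumbel distribution with the standard large-sample expression for its frequency factor, and the sample standard deviation; both choices are assumptions made for illustration, since the study analyzed five distributions. The deficit values and function names are hypothetical.

```python
import math
import statistics


def gumbel_frequency_factor(return_period):
    """Large-sample Gumbel frequency factor for a return period T in years (T > 1)."""
    t = float(return_period)
    return -(math.sqrt(6.0) / math.pi) * (0.5772 + math.log(math.log(t / (t - 1.0))))


def rc_for_return_period(deficits, return_period):
    """RC_T = mean + k * s, with k from the Gumbel distribution (assumed choice)."""
    mean = statistics.mean(deficits)
    s = statistics.stdev(deficits)  # sample standard deviation (assumption)
    return mean + gumbel_frequency_factor(return_period) * s


# Hypothetical accumulated deficit volumes (hm3), one per year or per synthetic series
deficits = [12.0, 18.5, 9.3, 25.1, 14.7, 20.2]
for T in (10, 50, 100):
    print(f"T = {T:>3} years: RC = {rc_for_return_period(deficits, T):.1f} hm3")
```

Because the frequency factor increases with T, the estimated RC grows with the return period, which is the behavior that methods not associated with T cannot reproduce.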
In the expressions for E1 and d: $RC_{SO,i}$ - reservoir capacity estimated for the fluviometric station i, based on the original series of streamflows, hm³; $RC_{SS,i}$ - reservoir capacity estimated for the fluviometric station i, based on the synthetic series, hm³; $\overline{RC}_{SO}$ - mean of the observed values of reservoir capacity in the fluviometric stations, hm³; and, n - number of fluviometric stations considered.
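The two performance indices can be computed as sketched below. The expressions are the commonly cited forms and are assumptions here: E1 is taken as the Nash-Sutcliffe coefficient with absolute differences (exponent 1), and d as the Willmott index with a configurable exponent, set to 2 by default. The function names and the reservoir capacities are hypothetical.

```python
def modified_nash_sutcliffe(observed, estimated):
    """E1 = 1 - sum|O - S| / sum|O - mean(O)| (assumed form of the modified index)."""
    mean_obs = sum(observed) / len(observed)
    num = sum(abs(o - s) for o, s in zip(observed, estimated))
    den = sum(abs(o - mean_obs) for o in observed)
    return 1.0 - num / den


def willmott_d(observed, estimated, power=2):
    """d = 1 - sum|S - O|^p / sum(|S - mean(O)| + |O - mean(O)|)^p (assumed form)."""
    mean_obs = sum(observed) / len(observed)
    num = sum(abs(s - o) ** power for o, s in zip(observed, estimated))
    den = sum((abs(s - mean_obs) + abs(o - mean_obs)) ** power
              for o, s in zip(observed, estimated))
    return 1.0 - num / den


# Hypothetical reservoir capacities (hm3): reference values and estimates
rc_observed = [10.0, 20.0, 30.0]
rc_estimated = [12.0, 18.0, 33.0]
e1 = modified_nash_sutcliffe(rc_observed, rc_estimated)
d = willmott_d(rc_observed, rc_estimated)
print(f"E1 = {e1:.3f}, d = {d:.3f}")
```

Both indices equal 1 when the estimated values match the reference values exactly.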
The historical series of the fluviometric stations of the Brazilian Water Agency (Agência Nacional de Águas - ANA) in the basin were used. The regionalized Peq600 and $q_{lt}$ were obtained from IGAM (2012). The period from 1970 to 2005 was selected based on the analysis of data availability. Years with more than 5% of gaps in the records were discarded. Table 1 shows the codes of the stations used in the study, their geographical coordinates, and respective areas of contribution.

Results and Discussion
The $q_{lt}$ of the synthetic series, the $q_{lt}$ of the original series, and the respective relative error percentages in relation to the original data are shown in Figure 3.
The $q_{lt}$ of the SSo and SSr were, in general, close to the observed values, without appreciable differences between them, with mean relative errors close to 10% for SSo and 12% for SSr. The $q_{lt}$ of the synthetic series presented no trend, with errors oscillating between overestimates and underestimates (Figure 3).
Among the methodologies that use T, MSSo_T and MSSr_T had the best results, except for β = 0.5 and β = 1.0, for T = 70 years and T = 100 years, in which the RRCap_T method had values closer to those of the reference method (MSO_T). However, the RRCap_T method presented the highest underestimates of RC for the highest β, providing poor safety for the reservoir to be dimensioned.
Unlike RRCap_T, the MSSo_T and MSSr_T methodologies tend to overestimate the RC of the station, except for β = 0.75. These overestimates are not desirable, but they ensure a higher safety of water supply. Figure 5 shows the mean and maximum relative errors (RE) for each method and fluviometric station.
The results were consistent with those obtained for the RC of station 54580000 (Figure 4), in which the relative errors of the methodologies that did not consider T were higher for both overestimates and underestimates.
The mean relative errors of the methodologies that did not consider T showed that RRCur had the highest relative errors in most stations; it was responsible for the highest underestimates in nine stations and the highest overestimates in six, with maximum errors above 100% in six stations, which characterizes it as the least adequate among the studied methods for reservoir dimensioning.
The use of RRCap resulted in the highest overestimates and underestimates in three and two stations, respectively. Its overestimates were generally high, with maximum errors that reached more than 180% in three stations, as found by Rodrigues (2017).
Therefore, MSSo and MSSr had the best performance for the estimates of RC among the methods that do not use T. These methods presented small differences between them, with similar relative errors, and deviations that favor the system safety, with higher discrepancies in the overestimates (Figure 5A).
Among the models that consider T, RRCap_T had, in general, the highest amplitude of mean relative errors of the stations (Figure 5B). Moreover, this method presented 170% higher maximum overestimates in two stations. This confirms the predictive advantage for RC of the methodologies that use synthetic series.
The methods MSSr_T and MSSo_T presented similar statistical results for most stations, with mean overestimates lower than 25% in seven stations and mean underestimates lower than 15% in eight stations, denoting a higher tendency to overestimate. Overdimensioning has environmental, economic, and social disadvantages, but ensures a higher water safety.
RRCur - Regionalization of regularization curve; MSSo - RC estimate using the observed synthetic series; MSSr - RC estimate using the regionalized synthetic series; RRCap - Regionalization of reservoir capacity; MSSo_T, MSSr_T, RRCap_T - MSSo, MSSr, and RRCap methods associated with a return period
An optimal reservoir dimensioning lies between the critical situations of overdimensioning (high project cost) and underdimensioning (need for water rationing during dry periods) (Tucci, 2012). Therefore, a more adequate dimensioning requires risk analysis and valuation of water safety, preferably using high-reliability methods in situations of high risk of insufficient water supply, even when associated with a lower cost and proportion of flooded areas. Semiarid regions present high spatial and temporal rainfall variability (Cabral et al., 2016); thus, implementing reservoirs is needed due to the strong impacts of water scarcity, since the water demand in these regions is not met (Hauschild, 2000).
Considering the analyses, the methods for estimating RC that do not consider T present lower performance and should be avoided; among the methods that are associated with a frequency factor, RRCap_T presents lower efficiency; and the methods that include synthetic series are the most suitable. These results are consistent with those presented by Rodrigues (2017), who evaluated these methodologies for dimensioning reservoirs for a sub-basin of the Paracatu River.
Figures 6 and 7 show the values of E1 and d, respectively, for each β, considering the mean of all T analyzed. Only the results for the methodologies that use T were included, due to their superiority in relation to the others, together with those for RRCur, since it is the method most used in studies involving the dimensioning of reservoirs.
Figure 6. Nash-Sutcliffe mean coefficient of efficiency (E1) for each β value in four methods for estimating reservoir capacity
RRCur - Regionalization of regularization curve; MSSo_T - RC estimate using the observed synthetic series combined with a return period; MSSr_T - RC estimate using the regionalized synthetic series combined with a return period; RRCap_T - Regionalization of reservoir capacity combined with a return period
According to Silva et al. (2008), the performance of the model is good when E1 is higher than 0.75, acceptable from 0.36 to 0.75, and unacceptable when lower than 0.36.
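These performance classes can be written as a small helper, sketched below. Treating the limits 0.36 and 0.75 as belonging to the acceptable class is an assumption, and the function name is hypothetical.

```python
def classify_e1(e1):
    """Performance class for a modified Nash-Sutcliffe value, after the thresholds above."""
    if e1 > 0.75:
        return "good"
    if e1 >= 0.36:  # boundary values treated as acceptable (assumption)
        return "acceptable"
    return "unacceptable"


# Hypothetical E1 values
for value in (0.82, 0.50, 0.20):
    print(value, classify_e1(value))
```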
The method RRCur presented the lowest E1, as expected, being unacceptable for β of 0.75 to 1.00. The method RRCap_T had acceptable results for β of 0.5, 0.75, and 1.00, and good results only for β of 0.25. In addition, this method presented lower coefficients than MSSo_T and MSSr_T for all β values. The methods MSSo_T and MSSr_T presented E1 higher than 0.75 for β of 0.25 to 0.5, and acceptable results for higher β values.
The results of the Willmott agreement index (d), which indicates the distance between the observed values and those estimated by the standard model, showed better performance for the methods MSSo_T and MSSr_T, with higher indexes for all β values. Moreover, the method RRCur showed lower values than the other methods.
Thus, the methodologies MSSo_T and MSSr_T had better performances, and can be used for any β value. This denotes that regionalized data are efficient in generating synthetic series and, consequently, can be used to dimension streamflow regularization reservoirs without presenting differences in relation to the use of values of the original series.
The use of methods that aggregate improvements to estimate RC is important, since they enable regional social, economic, and environmental development, mainly in semiarid regions, in which water scarcity is a major concern of the population.
However, despite the many benefits, the construction of reservoirs generates several impacts that should be measured, analyzed, and discussed, considering mainly their purposes.

Conclusions
1. Methods that include synthetic series are adequate to dimension regularization reservoirs in locations with no or insufficient fluviometric data, in the Jequitinhonha River Basin, Brazil.
2. The methods that associate synthetic series to a return period (MSSo_T and MSSr_T) showed better performances than those that do not consider this frequency factor in the estimates of reservoir capacity.