Hydrological regionalization of streamflows for the Tocantins River Basin in Brazilian Cerrado biome

The Brazilian Cerrado biome is the largest and richest tropical savanna in the world and is among the 25 biodiversity hotspots identified worldwide. However, the lack of adequate hydrological monitoring in this region has led to problems in the management of water resources. In order to provide tools for the adequate management of water resources in the Brazilian Cerrado biome region, this paper develops the regionalization of maximum, mean and minimum streamflows in the Tocantins River Basin (287,405.5 km²), fully located in the Brazilian Cerrado biome. The streamflow records of 32 gauging stations in the Tocantins River Basin are examined using the Mann-Kendall test and the nonparametric index-flood method for hydrological homogeneity. One homogeneous region was identified for the estimation of the streamflows Qltm (long-term mean streamflow), Q90% (streamflow exceeded 90% of the time), Q95% (streamflow exceeded 95% of the time) and Q7,10 (minimum annual streamflow over 7 days with a return period of 10 years). Two homogeneous regions were identified for maximum annual streamflow estimation, and the Generalized Extreme Value distribution was found to describe the distribution of maximum events appropriately within both regions. Regional models were developed for each streamflow of each region and evaluated by cross-validation. These models can be used for the estimation of maximum, mean and minimum streamflows in ungauged basins of the Tocantins River Basin, within the drainage-area boundaries identified. Therefore, the results provided in this paper are valuable tools for practicing water-resource managers in the Brazilian Cerrado biome.


INTRODUCTION
The volume of fresh water in Brazil accounts for about 12% of the planet's total, and is one of the largest reserves in the world. However, the natural distribution of fresh water shows great spatial disparity across the territory. Along with this factor, the different types of water use in the basins lead to conflicts for the right to use and uncertainties regarding the risks of flooding. Therefore, for the maintenance of life and environmental preservation, more effective actions are needed in the management of water resources (ANA, 2019a;Charles, 2020).
The Brazilian Cerrado biome is the largest and richest tropical savanna in the world and is among the 25 biodiversity hotspots identified worldwide (Myers et al., 2000;Silva and Bates, 2002). In addition, this region encompasses the recharge area of several aquifers and important rivers in Brazil, being recognized as the "cradle of Brazil's water" (Lima, 2011). However, the lack of adequate hydrological monitoring in this region has led to problems in the management of water resources, which may further compromise the sustainability of this important biome. Thus, improving the knowledge base on streamflow in the Cerrado biome is essential for water management in Brazil and for ensuring water security and economic development (Rodrigues et al., 2021).
The Tocantins River Basin (TRB; 287,405.5 km²) covers a large area of the Cerrado biome and faces water resource management problems driven by the expansion of water-using sectors, such as irrigated agriculture and hydro-energy. Irrigated agriculture is the primary consumer of water in this basin, accounting for 44.3% of the water demand (ANA, 2019b). In addition, this basin ranks third among Brazilian basins in hydroelectric potential (ELETROBRAS, 2016). The TRB covers three Brazilian states: Goiás, Tocantins and Maranhão, which use the streamflows that are exceeded 90% and 95% of the time (Q90% and Q95%, respectively) as references for granting water use rights.
Hydrological monitoring is the ideal way to determine streamflows in watercourses of interest. However, in countries of continental dimensions such as Brazil, the density of streamflow gauging stations is unsatisfactory, which compromises the estimation of streamflows for water resource management (Melati and Marcuzzo, 2016). According to Pugliesi et al. (2016), the need to understand streamflow behavior is one of the biggest problems that occur in ungauged basins. Another obstacle, which occurs mainly in medium and small gauged basins, is the unavailability of longer time series, which limits the estimation of reliable quantiles (Beskow et al., 2016).
Rev. Ambient. Água vol. 16 n. 6, e2716 - Taubaté 2021
Rainfall-runoff models are widely used to estimate streamflow, but calibrating these models in ungauged catchments is a challenge (Pool et al., 2017). Thus, to mitigate the effect of this lack of data, streamflow regionalization is an alternative for obtaining hydrological information in locations with few or no data. Streamflow regionalization can be applied within a region with similar hydrological behaviour using statistical procedures (Naghettini and Pinto, 2007; Wolff et al., 2014). In this approach, the streamflow statistics at ungauged sites are conditioned by the streamflow statistics at a gauged site, using catchment descriptors as a similarity measure (Cupak, 2020). Furthermore, the regionalization models fitted for a given basin should not be applied with inputs outside the limits for which they were fitted (Silveira and Tucci, 1998). In the TRB, there are no established regionalized models. Therefore, there is a need to develop models that can be applied under the conditions of the Tocantins River Basin.
Maximum, mean and minimum streamflow estimates are essential for water-resource management. Maximum streamflows are essential for the design of hydraulic structures, such as dams, bridges, culverts, and urban drainage systems, and for flood risk assessments. Mean streamflow is fundamental for hydropower planning. Minimum streamflows are important as references for water-use rights concessions, water supply and habitat protection (Beskow et al., 2016).
In order to provide tools for the adequate management of water resources in the Brazilian Cerrado biome region, the objectives of the study were: i) to evaluate the suitability of 10 probability distribution functions using the L-moments method and a goodness-of-fit test; and ii) to develop the regionalization of the maximum, mean and minimum streamflows (Qmax, Qltm, Q90%, Q95% and Q7,10) for the Tocantins River Basin (287,405.5 km²), fully located in the Brazilian Cerrado biome. The novelty of this study lies in the estimation of reliable and robust flows in data-scarce regions by using a combination of statistical techniques.

Study area and streamflow data
The Tocantins River Basin (TRB) (Figure 1) comprises a drainage area of 287,405.5 km², is located in the states of Goiás, Tocantins and Maranhão, in the northern region of Brazil, and includes 213 municipalities. This basin lies entirely within the Cerrado biome and is part of the Tocantins-Araguaia Basin (TARB). The TRB was delimited at the Itaguatins streamflow station (National Water Agency code 23710000).
According to Köppen's climate classification system, the climate of the studied basin is Aw (tropical savanna), with a rainy season in the summer (from November to April) and a dry season in the winter (from May to October) (Kottek et al., 2006;EMBRAPA, 2018).
Twelve hydroelectric plants ( II (5 MW, 1999); and Sobrado (5 MW, 1994). Among these hydroelectric plants, the Serra da Mesa HPP, located in the upper reaches of the Tocantins River, stands out for having the greatest hydropower potential, reservoir and flow regulation capacity (ANEEL, 2018).
Daily streamflow data were obtained from the National Water Agency (ANA) for 32 streamflow gauging stations (Figure 1) with at least 10 years of continuous record (Cassalho et al., 2017). Precautions were taken to ensure that the streamflows represent natural conditions in the study area. Thus, to avoid the regulation effect of dams on streamflow behavior, only data prior to the start of reservoir operations were used for stations downstream of reservoirs. In this way, 13 streamflow gauging stations had their series shortened, while the other 19 had all available data kept. Table 1 shows the characteristics of the streamflow gauging stations and the record lengths used in this study, which span the period from 1955 to 2017. Figure 1 shows the location of the Tocantins River Basin (TRB) with the thirty-two streamflow gauging stations and the twelve hydropower plants.
Based on the daily streamflow historical series of the 32 streamflow gauging stations, three new series were generated for each station: the maximum annual streamflow (Qmax), the mean annual streamflow (Qmean) and the minimum annual streamflow averaged over seven consecutive days (Q7).
The first analysis in this study was to determine which series were stationary and could therefore be used as a basis for regionalization. To determine whether a given historical series is stationary, it is necessary to apply a trend test (Naghettini and Pinto, 2007). The Mann-Kendall (MK) test (Mann, 1945; Kendall, 1975) is the most recommended for evaluating trends in hydrological time series (Hamed, 2008; Zhang et al., 2015; Wang et al., 2015). Thus, this test was applied to the 32 historical series of Qmax, Qmean and Q7 to select the data to be used in the streamflow regionalization. This is one of the most important steps in the flow regionalization process, as it prevents the spread of local trends to the rest of the basin, which justified the exclusion of periods subsequent to the installation of the reservoirs. This test was performed using the Kendall package for R, run in RStudio.
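The screening step described above can be illustrated with a minimal Python sketch of the two-sided Mann-Kendall test. It is not the Kendall package used in the study: the function name `mann_kendall`, the normal approximation with continuity correction, and the rule "p >= alpha means stationary" are assumptions made here for illustration.

```python
import math

def mann_kendall(series, alpha=0.05):
    """Two-sided Mann-Kendall trend test (normal approximation).

    Hypothetical helper: a series is flagged as stationary when no
    significant monotonic trend is detected at level alpha.
    """
    x = list(series)
    n = len(x)
    # S = number of increasing pairs minus number of decreasing pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, corrected for groups of tied values.
    counts = {}
    for v in x:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return {"S": s, "Z": z, "p": p_value, "stationary": p_value >= alpha}
```

Applied to each annual series (Qmax, Qmean, Q7) of a station, a result with `stationary` equal to `False` would exclude that series from the regionalization under the assumptions stated above.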
Daily series from stations whose Qmean series are stationary were used to obtain Q90%, Q95% and Qltm (Beskow et al., 2016). For the stations whose Q7 series are stationary, PDFs (probability distribution functions) were fitted to obtain the quantile associated with the return period (RP) of 10 years (Q7,10). For the stations with Qmax stationary series, PDFs were fitted to regionalize the maximum annual streamflows as a function of RP.
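As a rough sketch of how Qltm, Q90% and Q95% can be read from a daily series, the Python snippet below computes the long-term mean and the flows exceeded 90% and 95% of the time. The helper names and the linear-interpolation rule for the empirical permanence curve are assumptions; the study does not state which interpolation rule was used.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile (0 <= p <= 1) of an ascending list."""
    pos = (len(sorted_values) - 1) * p
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

def reference_flows(daily_flows):
    """Return Qltm, Q90% and Q95% from a daily streamflow series.

    A flow exceeded 90% of the time is the 10th percentile of the
    daily values; a flow exceeded 95% of the time is the 5th percentile.
    """
    flows = sorted(daily_flows)
    return {
        "Qltm": sum(flows) / len(flows),
        "Q90": percentile(flows, 0.10),
        "Q95": percentile(flows, 0.05),
    }
```

For example, `reference_flows(daily)` with `daily` holding the daily streamflows (m³ s⁻¹) of one station returns the three reference values for that station.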

Probability distribution functions (PDF)
To regionalize streamflows associated with RPs, it is necessary to model the frequency of occurrence by a PDF. Qmax and Q7 were modeled by different PDFs, selecting the most adequate for each situation.
The following PDFs were fitted to the maximum annual streamflow series: two-parameter log-normal (LN2); Gumbel (or Extreme Value type I, EV1); Generalized Pareto (GPA); Gamma (GAM); three-parameter log-normal (LN3); Generalized Extreme Value (GEV); Pearson type III (PE3); Generalized logistic (GLO); Kappa (KAP); and Wakeby (WAK). These distributions were also fitted by Cassalho et al. (2018; 2019) to historical streamflow series from watersheds in Rio Grande do Sul. The same distributions were fitted to the Q7 series, except that the GPA distribution was replaced by the Weibull distribution. Detailed descriptions of these PDFs are presented in Naghettini and Pinto (2007) and Cassalho et al. (2018).
The L-moments method was used to estimate the parameters of the PDFs (Hosking and Wallis, 1997; Cassalho et al., 2017). The best PDF for each Q7 historical series was selected using the Anderson-Darling (AD) goodness-of-fit test (Anderson and Darling, 1954) at a significance level of 5%: among the PDFs accepted by the test, the one with the lowest AD value was chosen. The PDF used to regionalize Qmax was also selected using the AD test at a significance level of 5%. In this case, however, the chosen PDF had to be accepted for every Qmax series and have the lowest AD value among the PDFs that met this condition. The Anderson-Darling test places more weight on observations in the tails of the PDFs, which is a desirable feature when modeling extreme events (Naghettini and Pinto, 2007).
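To make the L-moments step concrete, the sketch below computes sample L-moments from unbiased probability-weighted moments and fits a GEV distribution with the approximation given by Hosking and Wallis (1997). The function names are hypothetical, only the GEV is shown (the other nine PDFs are not), and the shape parameter follows the Hosking sign convention, which is assumed here to match the one used in the study.

```python
import math

def sample_l_moments(data):
    """Return (l1, l2, t3): L-location, L-scale and L-skewness."""
    x = sorted(data)
    n = len(x)
    # Unbiased probability-weighted moments b0, b1, b2 (0-based rank i).
    b0 = sum(x) / n
    b1 = sum((i / (n - 1)) * x[i] for i in range(n)) / n
    b2 = sum((i * (i - 1) / ((n - 1) * (n - 2))) * x[i] for i in range(n)) / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2

def fit_gev_lmom(data):
    """Fit GEV parameters (xi, alpha, kappa) by the method of L-moments.

    Uses the Hosking approximation for kappa; not valid when kappa is
    exactly zero (the Gumbel limit), which this sketch does not handle.
    """
    l1, l2, t3 = sample_l_moments(data)
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    kappa = 7.8590 * c + 2.9554 * c * c
    gam = math.gamma(1.0 + kappa)
    alpha = l2 * kappa / ((1.0 - 2.0 ** (-kappa)) * gam)
    xi = l1 - alpha * (1.0 - gam) / kappa
    return xi, alpha, kappa
```

A goodness-of-fit check such as the Anderson-Darling test would then be applied to each fitted PDF; that step is left out of this sketch.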

Streamflow regionalization
Streamflow regionalization allows transferring information from a basin where data are available to another where little or no data are available through a mathematical model that is valid for a hydrologically homogeneous region (Naghettini and Pinto, 2007). Thus, to evaluate the homogeneity of the TRB, the non-parametric index-flood method was applied to the Qmax, Qmean and Q7 historical series that showed stationarity. This method was originally introduced by Dalrymple (1960) and exhibits good results.
Typically, the index-flood method is based on the analysis of a graphical frequency distribution of the dimensionless streamflows from each station. This analysis is the preliminary step of the dimensionless curve regionalization method (Euclydes et al., 2001). To apply this method, the streamflows were first made dimensionless as follows (Equation 1):
$$Q_{id} = \frac{Q_i}{Q_{mean}} \quad (1)$$
where Qid is the dimensionless streamflow, Qi is the observed streamflow at position i in ascending order, and Qmean is the average of the observed streamflows of the series.
The frequency of occurrence of the events was calculated by applying the Weibull procedure (Equation 2):
$$P(Q_{id}) = \frac{i}{N + 1} \quad (2)$$
where P(Qid) is the non-exceedance frequency of the streamflow of order i and N is the number of events.
After obtaining the dimensionless streamflows and their respective frequencies of occurrence, these data were plotted. To evaluate the homogeneity of the Qmax, Qmean and Q7 series, three graphs were plotted (one for each set). This method considers that the curves of dimensionless streamflow versus frequency of occurrence are similar for stations within the same hydrologically homogeneous region, which is the criterion for identifying homogeneous regions.
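The construction of one station's empirical curve can be sketched in a few lines of Python: each value is divided by the series mean and paired with its Weibull non-exceedance frequency i/(N + 1). The function name is an assumption made for illustration.

```python
def dimensionless_frequency_curve(series):
    """Return (Q_id, P) pairs for one station's annual series.

    Q_id is the flow divided by the series mean; P is the Weibull
    non-exceedance frequency i / (N + 1) for rank i in ascending order.
    """
    ordered = sorted(series)
    n = len(ordered)
    mean_flow = sum(ordered) / n
    return [(q / mean_flow, i / (n + 1)) for i, q in enumerate(ordered, start=1)]
```

Plotting the curves of all stations on the same axes then allows the visual comparison described above; stations whose curves overlap are candidates for the same homogeneous region.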
The homogeneous regions identified by the index-flood method for Qmax were used in the regionalization of the maximum annual streamflows as a function of RP. The regions identified for Qmean were used in the regionalization of Qltm, Q90% and Q95%, and those identified for Q7 were used in the regionalization of Q7,10.
To develop the regionalization models, it is essential to know the independent variables that better explain streamflow behaviors. The independent variables to describe this relationship could be the drainage area, drainage density, length and steepness of the main river, average annual rainfall, land cover, among others (de Souza et al., 2021). Cassalho et al. (2017;2019) highlight that to reduce uncertainties related to the regionalization process, the parsimony principle should be considered, which means that a phenomenon should be explained with the lowest number of explanatory variables. In this way, this study used the drainage area as an explanatory variable. Drainage area is the variable most used in different regionalization studies (Naghettini and Pinto, 2007;Beskow et al., 2016;Melati and Marcuzzo, 2016;Bazzo et al., 2017), mainly because it is easily obtained, enabling the use of the generated models in ungauged basins. Thus, the drainage area of each sub-basin was obtained using the ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) digital elevation model (DEM) combined with calculations made using the Raster Calculator Tool in ArcGIS 10.1 (ESRI, 2013).
According to Naghettini and Pinto (2007), the main methods applied to streamflow regionalization in a hydrologically homogeneous region are (i) the method that regionalizes the quantiles associated with a previously specified risk and (ii) the method that regionalizes a dimensionless probability curve, which is called the index-flood method. Method (i) was applied to regionalize Qltm, Q90%, Q95% and Q7,10. Method (ii) was used to regionalize the maximum annual streamflows as a function of RP (PDF regionalization). These regionalizations allow the estimation of varied quantiles to the most diverse demands in the homogeneous region, thus, are tools of extreme importance for water resources management.
The streamflows associated with specific risks were regionalized by fitting a regression model, which considers streamflow to be regionalized as the dependent variable and, in the case of this study, the drainage area as the independent variable.
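A regression of this kind can be sketched as a power model Q = a · A^b fitted by ordinary least squares on log-transformed values. Fitting in log-log space is an assumption of this sketch (the study does not state the fitting procedure), and `fit_power_model` is a hypothetical name.

```python
import math

def fit_power_model(areas, flows):
    """Least-squares fit of Q = a * A**b in log-log space.

    areas: drainage areas (km2); flows: the streamflow to regionalize
    (for example Q90%) at the same stations. Returns (a, b).
    """
    lx = [math.log(a) for a in areas]
    ly = [math.log(q) for q in flows]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    b = sxy / sxx                 # exponent of the power model
    a = math.exp(my - b * mx)     # multiplicative coefficient
    return a, b
```

The fitted pair (a, b) gives an estimate a · A^b for an ungauged site with drainage area A, provided A lies within the range of areas used for fitting.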
The regionalization of the maximum annual streamflows as a function of RP (Qmax(RP)) using the index-flood method was based on Equation 3:
$$Q_{max}(RP) = X_{RP} \cdot Q_{mean\_max} \quad (3)$$
where Qmax(RP) is the estimated maximum annual streamflow for a given return period and X_RP is the dimensionless regional quantile function, obtained parametrically by fitting a PDF to the dimensionless regional data. This PDF must fit all Qmax series within the homogeneous region. Qmean_max is the scaling factor known as the "index flood", which consists of the regional function of the mean maximum annual streamflow (mean of the Qmax series) as a function of the drainage area, obtained using method (i).
Regionalization models were fitted according to the power mathematical model, as suggested by Lisboa et al. (2008), Beskow et al. (2016) and Cassalho et al. (2017). To evaluate the capability of the regional models, the cross-validation method was applied using the root mean square error (RMSE) and the coefficient of determination (R²) as objective functions (Vezza et al., 2010). In addition, to quantify the performance of the regional models and compare the observed quantiles to the estimated ones, the confidence index (c) proposed by Camargo and Sentelhas (1997) was used.
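The evaluation statistics can be sketched as follows. The confidence index of Camargo and Sentelhas (1997) is computed here as the product of the Pearson correlation coefficient r and Willmott's index of agreement d; taking R² as the square of r is an assumption of this sketch, as is the function name.

```python
import math

def performance_metrics(observed, estimated):
    """Return RMSE, R2, Willmott's d and the confidence index c = r * d."""
    n = len(observed)
    mo = sum(observed) / n
    me = sum(estimated) / n
    sq_err = sum((e - o) ** 2 for o, e in zip(observed, estimated))
    rmse = math.sqrt(sq_err / n)
    # Pearson correlation between observed and estimated values.
    cov = sum((o - mo) * (e - me) for o, e in zip(observed, estimated))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    se = math.sqrt(sum((e - me) ** 2 for e in estimated))
    r = cov / (so * se)
    # Willmott's index of agreement.
    potential = sum((abs(e - mo) + abs(o - mo)) ** 2
                    for o, e in zip(observed, estimated))
    d = 1.0 - sq_err / potential
    return {"RMSE": rmse, "R2": r * r, "d": d, "c": r * d}
```

In a leave-one-out cross-validation, `estimated` would hold, for each station, the value predicted by a model fitted without that station.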

Preliminary analysis and delimitation of the homogeneous regions
Based on the Mann-Kendall test, out of 32 streamflow gauging stations with at least 10 years of data, 23 stations for the Qmax series, 18 for the Qmean series, and 14 for the Q7 series could be considered stationary and adequate for regionalization. Figures 2a, 2b and 2e depict the empirical dimensionless streamflow versus frequency curves for the Qmean, Qmax and Q7 series, respectively. Figure 2a shows that the 18 streamflow gauging stations had similar behavior, thus forming a single homogeneous region regarding Qmean. However, Figure 2b shows that the 23 Qmax series did not have the same behavior and therefore should not comprise a single homogeneous region. For Qmax, two homogeneous regions were identified: Region 1, consisting of 7 streamflow stations (Figure 2c), and Region 2, consisting of 16 streamflow stations (Figure 2d). Figure 2e shows that the 14 historical Q7 series did not indicate homogeneous behavior. Graphical analysis indicates that a homogeneous region with 10 streamflow stations can be defined (Figure 2f). However, the remaining four series did not resemble one another, so a second homogeneous region could not be defined for this variable. A similar process was applied by Noto and La Loggia (2009), who removed approximately 8% of the data because they did not show behavior consistent with the defined hydrological region. Table 2 presents the systematization of the homogeneous regions for the Qmax, Qmean, and Q7 series. In addition, Table 2 shows that the stations do not share a common period of record. According to Hosking and Wallis (1997), the series can be considered homogeneous and representative of the variable under analysis, so the use of a common period is unnecessary. The geographic location of the homogeneous regions determined for Qmean, Qmax and Q7 is shown in Figures 3a, 3b and 3c.
Possible characteristics that led to the differentiation between homogeneous regions and the removal of some series were identified based on the evaluation of the physiographic characteristics of the TRB. Such characteristics were obtained from land use (IBGE, 2000), soils (EMBRAPA, 2011), hydrogeological (Diniz et al., 2014) and slope (ASTER DEM) maps. For the Qmax homogeneous regions, Figures 2c and 2d show that Region 1 has a wider range of variation than Region 2. Region 1 was found to have characteristics that favor surface runoff, such as a high percentage of anthropogenic land cover and Petric Plinthosol, which has a low infiltration capacity. Regarding the removal of some Q7 series, Figures 2e and 2f show that the excluded series have more complex characteristics than the rest of the basin, such as their amplitude of variation. Based on the physiographic evaluation of these areas, it was observed that the basins with the smallest amplitude of variation in Q7 are small drainage areas with a low slope and high hydrogeological favorability, as exemplified by the Urucuia aquifer.

Regionalization of Q90%, Q95% and Qltm
Table 3 shows the streamflows with 90% and 95% permanence (Q90% and Q95%) and the long-term mean streamflow (Qltm), obtained from the permanence curves of the observed data at each of the 18 streamflow gauging stations. Table 4 shows the regionalization models of the Q90%, Q95% and Qltm streamflows fitted for the Tocantins River Basin. It can be seen in Table 4 that the three fitted models are classified as "excellent" based on a confidence index value (c) > 0.85 (Camargo and Sentelhas, 1997). In addition, the R² values showed that the models explain at least 98% of the variation in Q90%, Q95% and Qltm. When regionalizing Q90% for river basins located entirely within the state of Rio Grande do Sul, Brazil, Beskow et al. (2016) obtained models with confidence indices (c) ranging from 0.71 to 0.99, considering the drainage area as the only explanatory variable.
When regionalizing Q95% for the Taquari-Antas River Basin, located in the state of Rio Grande do Sul, Brazil, Bazzo et al. (2017) obtained models with R² coefficients ranging from 0.77 to 0.99, also considering the drainage area as the only explanatory variable. Thus, the models provided in this study can be used for the estimation of Q90%, Q95% and Qltm in ungauged basins in the Tocantins River Basin, within the homogeneous region for mean annual streamflow and within the drainage area boundary from 795 to 287,405.5 km². Therefore, they are important tools for the quantitative management of water resources and for Cerrado biome conservation, since Q90% and Q95% are the reference streamflows for granting water use rights in the states of Goiás, Tocantins and Maranhão, where the studied basin is located.
Table 5 shows the best PDF fitted to the Q7 series at each of the 10 streamflow gauging stations and the respective estimated values of Q7 streamflow for a return period of 10 years (Q7,10). The best PDF was the one with the lowest Anderson-Darling (AD) test value among the adequate PDFs, considering a significance level of 5%. The Wakeby PDF showed the best performance for 5 historical series, the Weibull PDF for 2 series, and the Pearson type III (PE3), GLO and GEV PDFs for 1 historical series each. Leme and Chaudhry (2005), comparing the Weibull and Gumbel PDFs in the determination of Q7,10 streamflow in the Jaguari Mirim River Basin, Brazil, pointed to the Weibull distribution as the one that provides the best fit. Chen et al. (2006), comparing five PDFs in the determination of Q7,10 streamflow in the Dongjiang basin, South China, pointed to the three-parameter lognormal (LN3) distribution as the one that provides the best fit, outperforming the generalized logistic (GLO), generalized extreme value (GEV), Pearson type III (PIII) and generalized Pareto (GPD) distributions. Amorim et al. (2020), characterizing the Q7,10 streamflow in the Mortes River Basin, southeastern Brazil, tested ten PDFs, the same ones studied in this work. The distributions that stood out most were Wakeby, Kappa, GEV, GLO, GPA, Weibull and PE3, which showed the best performance for 8, 3, 2, 1, 1, 1 and 1 series, respectively. Thus, no single PDF is the best for all regions, which highlights the importance of this PDF analysis for reducing errors in the Q7,10 estimates.
Table 5. The best PDF fitted to the Q7 series at each of the 10 streamflow gauging stations, along with its Anderson-Darling (AD) test value and estimated value of Q7,10.

Station | Drainage Area (km²) | Best PDF | AD of the Best PDF | Q7,10 (m³ s⁻¹)
Table 6 shows the regionalization model of Q7,10 streamflow fitted for the Q7 hydrologically homogeneous region of the Tocantins River Basin, along with the accuracy statistics associated with fitting and cross-validation. It can be seen in Table 6 that the fitted model can explain 97.5% of the variation in Q7,10 based on the drainage area alone. Furthermore, the confidence index value (c) > 0.85 showed that the model is classified as "excellent" (Camargo and Sentelhas, 1997), which demonstrates the quality of the regionalization model. When regionalizing Q7,10 for São Paulo State, Brazil, Wolff et al. (2014) obtained a model with a confidence index (c) of 0.94 and an R² coefficient of 0.92, considering the drainage area as the only explanatory variable. When regionalizing Q7,10 for the Mortes River Basin, southeastern Brazil, Amorim et al. (2020) obtained a model with an R² coefficient of 0.99, also considering the drainage area as the only explanatory variable. Thus, the Q7,10 model provided in this study can be used in ungauged basins in the Tocantins River Basin, within the Q7 hydrologically homogeneous region and within the drainage area boundary from 1566.9 to 287,405.5 km². This finding is important because Q7,10 is used as the reference streamflow for water use right concessions in the states of Minas Gerais, São Paulo and Espírito Santo (ANA, 2007), which are close to the studied basin. The Q7,10 model developed in the present study is therefore an alternative for the management of water resources.

Regionalization of Qmax
Table 7 shows the results of the Anderson-Darling (AD) test for the best PDF (the one with the lowest AD value) and for the GEV PDF, which was the only one accepted for all Qmax series, considering a significance level of 5%.
The Wakeby PDF showed the best performance for 10 historical series of Qmax, the Kappa and GEV PDFs for 4 historical series each, the Pearson type III (PE3) PDF for 3 series, and the LN2 and GLO PDFs for 1 historical series each. However, considering that the goal is to define a regional function capable of estimating the Qmax streamflow for different RPs, and that the GEV PDF was the only one that had a good fit for all Qmax series within homogeneous regions 1 and 2, the GEV PDF was adopted here. Morais et al. (2020), in a regionalization study for the Araguaia River Basin, Brazil, also identified the GEV PDF as the only one that had a good fit for all Qmax series. Kumar et al. (2003), when evaluating 12 PDFs in a regionalization study of Qmax for Middle Ganga Plains Subzone 1(f) of India, identified the GEV PDF as the most robust. Noto and La Loggia (2009), comparing 4 PDFs in the determination of Qmax streamflow in a case study on the island of Sicily, Italy, pointed to the GEV PDF as the one that provides the best fit. The robustness of the GEV for modeling Qmax was also identified by Seckin et al. (2011), when evaluating 6 PDFs in Turkey; by Cassalho et al. (2017), when evaluating 6 PDFs in the Mirim-São Gonçalo Basin, Brazil; and by Cassalho et al. (2018), when evaluating 4 PDFs in Rio Grande do Sul state, Brazil. Therefore, the results of these studies corroborate the findings of the present study. Table 7 also shows the fitted parameters of the GEV distribution and the length of each series. From these data, it was possible to estimate the regional parameters of the GEV distribution as the mean weighted by the length of the series, as recommended by Naghettini and Pinto (2007). The regional parameters of the GEV PDF obtained for Region 1 were ξ = 0.766, α = 0.328 and κ = -0.125, and for Region 2 were ξ = 0.849, α = 0.315 and κ = 0.112.
Therefore, the term X_RP of Equation 3 was defined for both regions. Table 7 also shows the mean maximum annual streamflow (Qmean_max) observed at each of the 23 streamflow gauging stations. Table 8 shows the regionalization models of Qmean_max streamflow fitted as a function of the drainage area for homogeneous regions 1 and 2 of the TRB. It can be seen in Table 8 that the fitted models are classified as "excellent" based on the confidence index value (c) > 0.85 (Camargo and Sentelhas, 1997). In addition, the R² values showed that the models explain at least 86% of the variation in Qmean_max streamflow, and the cross-validation results demonstrated the predictive ability of the models. When regionalizing Qmean_max in a case study on the island of Sicily, Italy, Noto and La Loggia (2009) obtained a model with an R² coefficient of 0.77, considering the drainage area as the only explanatory variable. When regionalizing Qmean_max for the Araguaia River Basin, Brazil, Morais et al. (2020) obtained power mathematical models with R² coefficients ranging from 0.87 to 0.9 and confidence index values (c) > 0.85, considering the drainage area as the only explanatory variable. When regionalizing Qmean_max for the state of Rio Grande do Sul, Brazil, Cassalho et al. (2018) obtained models with R² coefficients ranging from 0.57 to 0.96, also considering the drainage area as the only explanatory variable. Thus, the Qmean_max models provided in this study can be used in ungauged basins in the Tocantins River Basin, within their respective Qmax homogeneous regions, and within the drainage area boundary from 795 to 287,405.5 km² (Region 1) and from 1585.2 to 20,212.7 km² (Region 2). Therefore, the "index flood" term of Equation 3 was defined for both regions.
By coupling the Qmean_max models and the regional parameters of the GEV distribution to Equation 3, the regional functions Equation 4 and Equation 5 were obtained for Qmax homogeneous regions 1 and 2, respectively, to estimate the maximum annual streamflow in m³ s⁻¹ as a function of the RP (years) and of the drainage area DA (km²). These models allow the direct estimation of the Qmax associated with different return periods for ungauged basins located within the respective homogeneous regions, within the drainage area boundary from 795 to 287,405.5 km² (Region 1) and from 1585.2 to 20,212.7 km² (Region 2). Therefore, these regional functions are extremely important tools for the management of water resources in the Tocantins River Basin, especially for planning hydraulic structures. It should also be noted that, despite the good results obtained with all regionalization models, field-monitored data remain the best option, since the regionalization process carries uncertainties, as does any other data simulation process.
Finally, it should be highlighted that the estimates of the regionalization models represent streamflow under natural-flow conditions in the basins. This means that the approach will most probably not work, or will be misleading, if the flow regimes analysed are continually changing under man-induced impacts.
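As an illustration of the index-flood estimate in Equation 3, the sketch below evaluates the dimensionless GEV quantile X_RP with the regional parameters reported for Regions 1 and 2 and scales it by a power-model index flood. Two points are assumptions: the GEV quantile is written in the Hosking and Wallis (1997) form, which is consistent with the reported parameters giving a regional mean close to 1, and the index-flood coefficients `a` and `b` are hypothetical inputs, because the fitted coefficients are not reproduced in the text.

```python
import math

# Regional GEV parameters (xi, alpha, kappa) as reported in the text.
REGIONAL_GEV = {1: (0.766, 0.328, -0.125), 2: (0.849, 0.315, 0.112)}

def growth_factor(region, return_period):
    """Dimensionless regional quantile X_RP for a return period in years."""
    xi, alpha, kappa = REGIONAL_GEV[region]
    f = 1.0 - 1.0 / return_period  # annual non-exceedance probability
    return xi + alpha / kappa * (1.0 - (-math.log(f)) ** kappa)

def qmax(region, return_period, drainage_area, a, b):
    """Equation 3: Qmax(RP) = X_RP * Qmean_max, with Qmean_max = a * DA**b.

    a and b are placeholders for the regional index-flood coefficients,
    which must be taken from the fitted regional model.
    """
    index_flood = a * drainage_area ** b
    return growth_factor(region, return_period) * index_flood
```

The drainage area passed to `qmax` should stay within the limits stated for each region (795 to 287,405.5 km² for Region 1 and 1585.2 to 20,212.7 km² for Region 2), since the models are not meant to be extrapolated.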

CONCLUSIONS
In the present study, Q90%, Q95%, Qltm, Q7,10, and Qmax as a function of return period were regionalized. Considering the results, the following conclusions were drawn: i) The nonparametric index-flood method for the identification of hydrologically homogeneous regions was adequate for the Tocantins River Basin, presenting results that are consistent with the physiographic reality of the basin; ii) The Wakeby distribution was the best-fitting PDF for the highest number of Q7 and Qmax series among the analyzed PDFs, and the GEV distribution was the most robust for Qmax, being the only one that fitted all the series; iii) All fitted models were adequate according to the statistics used; iv) The drainage area as the only explanatory variable was robust for all fittings, which confirms the benefits of its use, especially in terms of ease of use of the generated models; v) The fitted regional models are an alternative for generating data given the sparse hydrological monitoring of this important basin of the Brazilian Cerrado; and vi) The streamflows regionalized in this study are important for water resource management in the Tocantins River Basin because they contribute to several initiatives, ranging from the planning of hydraulic structures to the quantitative management of reference streamflows for water-use concessions.