Multi-variable SWAT model calibration with remotely sensed evapotranspiration and observed flow

Although intrinsic, the uncertainty of hydrological model estimates is not always reported. The aim of this study is to evaluate the use of satellite-based evapotranspiration in SWAT model calibration, with regard to uncertainty and model performance in streamflow simulation. The SWAT model was calibrated at a monthly time step and validated at monthly (streamflow and evapotranspiration) and daily (streamflow only) time steps. The calibration and validation period covers the years from 2006 to 2009, and the study area is the upper Negro river basin, situated in Santa Catarina and Paraná. SWAT-CUP was used to calibrate and validate the model, using SUFI-2 with KGE (Kling-Gupta Efficiency) as the objective function. Different calibration strategies were evaluated, considering single-variable and multi-variable calibration, using streamflow and evapotranspiration. Compared to conventional single-variable calibration (streamflow only), multi-variable calibration (streamflow and evapotranspiration, simultaneously) produces better streamflow performance, especially for low-flow periods and daily time step validation. Despite that, no evidence of a reduction in streamflow prediction uncertainty was observed. SWAT model calibration using solely evapotranspiration still requires further study.


INTRODUCTION
Hydrological models are widely used to support water resource management, planning and decision making (DAGGUPATI et al., 2015a). Hydrological models are approximations of the real system and are therefore unable to consider all processes and variables, so model uncertainty is present in their predictions (BEVEN, 2012; MORADKHANI; SOROOSHIAN, 2009).
Besides model uncertainty, the parameters, input data and scale simplifications are also sources of uncertainty (ABBOTT; REFSGAARD, 1996; BEVEN, 2012). Wagener and Gupta (2005) emphasize the need to propagate and clearly report model uncertainties, associating an appropriate degree of confidence with the model estimates. A model calibration procedure attempts to reduce the uncertainty arising from parameter estimation and interpolation, obtaining the best parameterization for the local conditions (ABBOTT; REFSGAARD, 1996; ARNOLD et al., 2012).
Inverse modelling (IM) denotes the calibration procedure that infers parameter values from observations of model output variables (ABBASPOUR, 2005). However, a single model output variable can be simulated relatively well by several different parameter sets, leading to parameter non-uniqueness and equifinality problems (ABBASPOUR, 2005; BEVEN; BINLEY, 1992; BEVEN, 1993).
The use of multi-variable and multi-site calibration in distributed hydrological models can help reduce equifinality problems, since fewer parameter sets can satisfy the calibration criteria at all sites simultaneously (BEVEN, 2006, 2012; DAGGUPATI et al., 2015b). However, model calibration is commonly performed with watershed outlet streamflow data only. Consequently, part of the processes, especially those of the unsaturated zone, can remain uncalibrated (WANDERS et al., 2014).
Remotely sensed evapotranspiration (ET) offers useful spatial and temporal resolution and can be used to estimate parameters related to the soil water balance, such as soil moisture (ALLEN et al., 2007; GITHUI; SELLE; THAYALAKUMARAN, 2012; IMMERZEEL; DROOGERS, 2008). Recently, some studies have integrated remote sensing data into the calibration of distributed hydrological models (GITHUI; SELLE; THAYALAKUMARAN, 2012; MUTHUWATTA; BOOIJ; RIENTJES, 2009), obtaining improvements in streamflow prediction performance (KUNNATH-POOVAKKA et al., 2016; WANDERS et al., 2014; ZHANG et al., 2009). Rajib, Merwade and Yu (2016) calibrated the SWAT model using streamflow and soil moisture simultaneously, and compared the result with the conventional calibration (streamflow only). Remotely sensed soil moisture and field soil moisture measurements were used. The authors normalized the final calibrated parameter intervals and reported a reduction in parameter uncertainty when the model was calibrated with both soil moisture and streamflow.
Immerzeel and Droogers (2008) calibrated the SWAT model for a 45,678 km² Indian basin, using remotely sensed evapotranspiration only. Monthly estimates of evapotranspiration were derived by applying SEBAL to MODIS imagery, and the model was compared to the available observed streamflow data.
Githui, Selle and Thayalakumaran (2012) also used remotely sensed evapotranspiration, together with streamflow measurements, to calibrate the SWAT model and estimate aquifer recharge for an Australian irrigated basin (irrigation of 325 mm year⁻¹).
The authors reported good performance of SWAT in simulating evapotranspiration, having obtained an R² of 0.87 and low PBIAS values. The study was carried out in an area with low rainfall and evapotranspiration rates (284 mm year⁻¹ and 290 mm year⁻¹).
Among the hydrological models commonly used nowadays, SWAT (Soil and Water Assessment Tool) has been extensively applied worldwide and in Brazil (ARNOLD et al., 2012). In Brazil, the SWAT model has been applied to estimate sediment yield (SANTOS; SCUDELARI; RIGHETTO, 2013), to evaluate best management practices (STRAUCH et al., 2013) and in hierarchical calibration (BRIGHENTI; BONUMÁ; CHAFFE, 2016). Bressiani et al. (2015) identified 100 studies using the SWAT model in Brazil between 1999 and 2015.
SWAT model calibration can be carried out manually or with calibration software, such as SWAT-CUP (SWAT Calibration and Uncertainty Programs). SWAT-CUP was developed to support users in SWAT calibration, and offers five different calibration methods: GLUE (Generalized Likelihood Uncertainty Estimation); ParaSol (Parameter Solution); MCMC (Markov chain Monte Carlo); PSO (Particle Swarm Optimization) and SUFI-2 (Sequential Uncertainty Fitting) (ABBASPOUR, 2015).
Considering the growing availability and use of remotely sensed products, this study aims to investigate the impact of multi-variable calibration on the uncertainty and performance of SWAT model streamflow predictions in a Brazilian rainforest basin. The multi-variable calibration aims to reduce the parameter non-uniqueness problem by using remotely sensed evapotranspiration and measured streamflow simultaneously. SWAT model calibration solely with evapotranspiration was also evaluated, an approach that is a novelty in Brazilian basins. Parameter transfer across temporal scales was explored through daily streamflow validation with monthly calibrated parameters.

STUDY AREA
The study area was selected according to evapotranspiration data availability. The evapotranspiration estimates are a product of the study by Uda (2016), which applied the METRIC model to MODIS imagery, covering the entire Iguaçu river basin. Due to processing limitations and data availability, the modeled area was reduced to the upper Negro river basin only. The upper Negro river basin is located in the states of Santa Catarina and Paraná, between longitudes 49°55'15"W and 48°56'55"W and latitudes 25°55'06"S and 26°42'16"S, with a drainage area of 3,453 km² (Figure 1). Altitude varies from 780 m, at the basin outlet, to 1,591 m at the highest point. The average altitude is 885 m, and 80% of the basin area lies between 780 and 1,215 m.
The economy of this region is strongly based on the timber and furniture industries, besides agriculture (THOMÉ et al., 1999).
According to the Köppen classification, the regional climate is Cfb: humid temperate climate with average temperature below 18 °C for the coldest month and below 22 °C for the hottest month, constantly humid and with no significant precipitation difference between seasons (THOMÉ et al., 1999). The river basin is located in the Atlantic Forest, was originally covered with Mixed Ombrophilous Forest (Araucaria Forest) and has an average annual rainfall of 1,522 mm (FRANCO et al., 2015).

INPUT DATA AND SWAT MODEL
SWAT is a hydraulic-hydrologic model with physically based equations for the basin water cycle. It was developed to predict the impact of land use and management, and of agricultural chemicals, on water and sediment yield (NEITSCH et al., 2011). The spatial discretization is variable and can be adjusted by the user.
The default mode is semi-distributed, using HRUs (Hydrologic Response Units) as simulation units.
Daily precipitation and streamflow data were obtained from the National Water Agency (ANA) online database (ANA, 2016) (MIRANDA, 2005). The minimum contributing area for stream generation was set to 30 km², and 66 subbasins were generated in SWAT, according to the basin topography.

Land use and land cover
The land use and land cover database was produced from Landsat 8 imagery, with 30 m spatial resolution. Two scenes (220-078 and 221-078) from the year 2014 (26/08/2014 and 07/12/2014) covered the entire basin area. The maps were produced in Spring 5.2.6, using region-growing segmentation followed by supervised classification. Image segmentation methods group adjacent pixels around "seed" pixels, which aggregate pixels and regions according to some heterogeneity criterion (MENESES; ALMEIDA, 2012). The land use classes were: water, reforestation, native forest, agriculture, exposed soil, urban area and pasture.
Exposed soil was later reclassified as agriculture. The land use and land cover map was resampled to 90 m, to be compatible with the DEM resolution.
Native forest still covers most of the basin area (49.7%), especially in the basin headwaters and riparian zones. Agriculture is the dominant activity, occupying 26.9% of the basin area, followed by reforestation (17.7%). Reforestation is mostly of Pinus elliottii, which supplies the local cellulose industry. Urban area and pasture each cover approximately 5% of the basin area. Table 2 summarizes the areas and percentages of each land use and land cover class for the upper Negro river basin.

Evapotranspiration
Potential evapotranspiration (PET) can be supplied directly to SWAT, or calculated from meteorological data. The SWAT model provides three methods for PET estimation: Hargreaves, Priestley-Taylor and Penman-Monteith (NEITSCH et al., 2011). For the present study, PET was estimated with the Penman-Monteith method.
The actual evapotranspiration is then estimated from the available water capacity and the canopy water storage. First, water stored in the vegetation canopy is evaporated. If potential evapotranspiration is higher than the volume stored in the canopy, the remaining evaporative demand is distributed between plants and soil. The plant transpiration demand is supplied according to the available soil water (NEITSCH et al., 2011).
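The first step of this sequence, evaporation from the canopy, can be illustrated with a minimal sketch. It is not SWAT's implementation: the function name and arguments are hypothetical, and the later partition of the remaining demand between plants and soil is deliberately left out because it depends on details not given here.

```python
def canopy_evaporation(pet_mm, canopy_storage_mm):
    """Evaporate canopy water first and report what evaporative demand is left.

    Hypothetical helper for illustration only (not a SWAT routine).
    Returns (evaporated from canopy, remaining demand, remaining canopy storage),
    all in mm for the time step.
    """
    evaporated = min(pet_mm, canopy_storage_mm)
    remaining_demand = pet_mm - evaporated        # later shared by plants and soil
    remaining_storage = canopy_storage_mm - evaporated
    return evaporated, remaining_demand, remaining_storage


# PET exceeds canopy storage: the canopy empties and demand remains.
print(canopy_evaporation(5.0, 2.0))
# PET is below canopy storage: no demand is passed on to plants and soil.
print(canopy_evaporation(1.0, 2.0))
```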
In the present study, monthly actual evapotranspiration estimates obtained by Uda (2016) from the application of METRIC to MODIS imagery were used for model calibration and validation. The author produced estimates of monthly actual evapotranspiration for the years 2006, 2007 and 2009, with 250 m spatial resolution, for the entire Iguaçu river basin. The average pixel value for each subbasin was used for model calibration and validation.
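Averaging pixel values per subbasin is a zonal mean. The sketch below shows one way to compute it with NumPy, assuming the evapotranspiration raster and a raster of subbasin identifiers share the same grid; the function name, the use of 0 for pixels outside the basin and the handling of missing pixels as NaN are assumptions for illustration, not details from the study.

```python
import numpy as np


def subbasin_means(et_grid, subbasin_ids):
    """Mean ET per subbasin from two rasters on the same grid.

    et_grid: 2-D array of monthly ET (mm); NaN marks missing pixels (assumption).
    subbasin_ids: 2-D integer array; 0 marks pixels outside the basin (assumption).
    """
    means = {}
    for sid in np.unique(subbasin_ids):
        if sid == 0:
            continue  # skip pixels outside the basin
        means[int(sid)] = float(np.nanmean(et_grid[subbasin_ids == sid]))
    return means


# Tiny 2 x 2 example with two subbasins and one missing pixel.
et = np.array([[100.0, 110.0],
               [90.0, np.nan]])
ids = np.array([[1, 1],
                [2, 2]])
print(subbasin_means(et, ids))
```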
Evapotranspiration estimates from SEBAL and METRIC are considered reliable at the monthly scale. Typical accuracy for single-day events and scales of the order of 100 ha is +/-15% (BASTIAANSSEN et al., 2005). Also according to Bastiaanssen et al. (2005), typical SEBAL accuracy for one day is 85%, reaching 95% at the seasonal scale and an average value of 96% at the annual scale for large basins. The SUFI-2 calibration algorithm allows the user to specify the percentage measurement error of the observed data used for calibration. For the present study, the value of 10%, suggested by Abbaspour (2015) for typical streamflow measurement conditions, was used for both streamflow and evapotranspiration.

CALIBRATION AND VALIDATION
Model validation is the procedure in which the calibrated model is run and evaluated for a different time interval or subbasin. The objective is to compare the model estimates with observed data that were not used in the calibration process, and to demonstrate that the model is able to make sufficiently accurate estimates (ARNOLD et al., 2012).
Presented by Klemeš (1986), the split-sample test is a model calibration and validation approach that consists of splitting the available data equally, when the record is sufficiently long to represent different climate conditions. Further discussion of record length for calibration and validation can be found in Her and Chaubey (2015). When the available record is not sufficient for a 50/50 split, it must be split in two different ways, for example 70/30 and 30/70, such that the calibration interval is sufficiently long.
The SWAT model was calibrated at a monthly time step and validated at monthly (evapotranspiration and streamflow) and daily (streamflow only) time steps, using SUFI-2. Among the calibration techniques available in SWAT-CUP, SUFI-2 is the one that needs the smallest number of runs to achieve good prediction uncertainty ranges with reasonable coverage of data points (YANG et al., 2008).
SUFI-2 operates in successive iterations, each with the same number of simulations. In each iteration, the intervals of the calibrated parameter values (Range_Par) are reduced, always centered on the parameter set that produced the best objective function value (Best_Par) (ABBASPOUR, 2015). The number of iterations and the number of simulations per iteration are defined by the user. The objective function used to select the best parameter set (Best_Par) is also defined by the user.
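The idea of iterating with intervals that shrink around the best parameter value can be sketched for a single parameter as below. This is a toy illustration and not the SUFI-2 update rule used by SWAT-CUP: plain uniform sampling replaces Latin hypercube sampling, the fixed `shrink` factor is an assumption, and all function names are hypothetical.

```python
import numpy as np


def narrow_range(lower, upper, best, shrink=0.5):
    """Return a narrower interval around `best`, clipped to [lower, upper]."""
    half_width = 0.5 * shrink * (upper - lower)
    return max(lower, best - half_width), min(upper, best + half_width)


def calibrate(objective, lower, upper, n_iterations=2, n_simulations=500, seed=0):
    """Toy range-narrowing loop: sample, keep the best value, shrink the range.

    Returns the best value of the last iteration (Best_Par analogue) and the
    final interval (Range_Par analogue).
    """
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_iterations):
        samples = rng.uniform(lower, upper, size=n_simulations)
        scores = np.array([objective(x) for x in samples])
        best = float(samples[int(np.argmax(scores))])  # higher score is better
        lower, upper = narrow_range(lower, upper, best)
    return best, (lower, upper)


# Hypothetical objective with its optimum at 0.3 on the interval [0, 1].
best_par, range_par = calibrate(lambda x: -(x - 0.3) ** 2, 0.0, 1.0)
print(best_par, range_par)
```

Two iterations of 500 simulations mirror the setup described for the strategies in this study, but the numbers are only defaults of the sketch.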
In SUFI-2, parameter uncertainty is expressed as a uniform probability distribution over each parameter interval. The resulting output uncertainty is indicated by the 95% prediction uncertainty interval (95PPU), calculated at the 2.5 and 97.5 percentiles of the cumulative probability distribution of the output variable of interest. This uncertainty is presented as an "envelope" of solutions generated by the parameter value intervals used.
To quantify the fit of the parameter value intervals, two statistical indicators are used: the p-factor and the r-factor. The p-factor indicates the percentage of observed data bracketed by the 95PPU envelope. The r-factor is the uncertainty indicator, calculated as the ratio of the average distance between the 2.5 and 97.5 percentiles to the standard deviation of the measured data. The r-factor represents the thickness of the envelope of solutions (95PPU) (ABBASPOUR; JOHNSON; VAN GENUCHTEN, 2004).
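Both indicators can be computed directly from an ensemble of simulations. The following is a minimal sketch of the definitions given above, assuming the simulations are stored as an array with one row per run; it ignores the measurement error that SUFI-2 can attach to the observations, and the use of the sample standard deviation (`ddof=1`) is an assumption rather than a documented SWAT-CUP choice.

```python
import numpy as np


def p_and_r_factor(simulations, observed):
    """p-factor and r-factor of a 95PPU band.

    simulations: array of shape (n_runs, n_steps).
    observed: array of shape (n_steps,).
    """
    sims = np.asarray(simulations, dtype=float)
    obs = np.asarray(observed, dtype=float)
    lower = np.percentile(sims, 2.5, axis=0)    # lower limit of the 95PPU
    upper = np.percentile(sims, 97.5, axis=0)   # upper limit of the 95PPU
    inside = (obs >= lower) & (obs <= upper)
    p_factor = inside.mean()                    # fraction of bracketed observations
    r_factor = (upper - lower).mean() / obs.std(ddof=1)
    return float(p_factor), float(r_factor)


# Two runs and three time steps; the last observation falls outside the band.
sims = np.array([[1.0, 2.0, 3.0],
                 [3.0, 4.0, 5.0]])
obs = np.array([2.0, 3.0, 10.0])
print(p_and_r_factor(sims, obs))
```

Read against the thresholds quoted from Abbaspour (2015), a p-factor above 0.70 together with an r-factor below 1.50 would count as acceptable for streamflow.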
A sensitivity analysis was carried out to reduce the number of calibrated parameters. From the sensitivity analysis, 11 parameters were selected for further calibration. The calibrated parameters and initial intervals are summarized in Table 3.
Several statistical performance indicators for streamflow simulation exist. The most commonly reported for SWAT streamflow calibration and validation are the coefficient of determination (R², Equation 1), the Nash-Sutcliffe coefficient (NS, also written NSE; Equation 2) and the percent bias (PBIAS, Equation 3). R² varies from 0 to 1, NS varies between -∞ and 1, and 1 is the optimal value for both. For PBIAS, the value of zero is optimal.
$$R^2 = \frac{\left[\sum_{i=1}^{n}(O_i-\bar{O})(S_i-\bar{S})\right]^2}{\sum_{i=1}^{n}(O_i-\bar{O})^2\,\sum_{i=1}^{n}(S_i-\bar{S})^2} \qquad (1)$$
$$NS = 1-\frac{\sum_{i=1}^{n}(O_i-S_i)^2}{\sum_{i=1}^{n}(O_i-\bar{O})^2} \qquad (2)$$
$$PBIAS = 100\,\frac{\sum_{i=1}^{n}(O_i-S_i)}{\sum_{i=1}^{n}O_i} \qquad (3)$$
where n is the number of observations for the simulated period, $O_i$ and $S_i$ are the observed and simulated values for each time step i, and $\bar{O}$ and $\bar{S}$ are the observed and simulated average values (GREEN; VAN GRIENSVEN, 2008).
According to Abbaspour, Johnson and Van Genuchten (2004), a model is considered satisfactorily calibrated by SUFI-2 when R² is higher than 0.80. Abbaspour (2015) suggests p-factor values above 0.70 and r-factor values below 1.50 for acceptable streamflow calibration. Moriasi et al. (2007) define NSE > 0.5 and PBIAS < +/-25% as satisfactory values, while NSE > 0.75 and PBIAS < +/-10% are considered very good for streamflow calibration of hydrological models.
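The three indicators are simple to compute from paired observed and simulated series. The sketch below uses the usual textbook forms; the sign convention for PBIAS (observed minus simulated, so that positive values mean underestimation) is an assumption, and the function names are illustrative.

```python
import numpy as np


def r_squared(obs, sim):
    """Coefficient of determination (squared linear correlation)."""
    o, s = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    num = np.sum((o - o.mean()) * (s - s.mean())) ** 2
    den = np.sum((o - o.mean()) ** 2) * np.sum((s - s.mean()) ** 2)
    return float(num / den)


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe coefficient; 1 is a perfect fit."""
    o, s = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((o - s) ** 2) / np.sum((o - o.mean()) ** 2))


def pbias(obs, sim):
    """Percent bias; 0 is optimal, positive means underestimation (assumed sign)."""
    o, s = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(100.0 * np.sum(o - s) / np.sum(o))


# A simulation that is perfectly correlated but biased high by one unit.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
print(r_squared(obs, sim), nash_sutcliffe(obs, sim), pbias(obs, sim))
```

The example illustrates why R² alone can mislead: a constant offset leaves R² at 1 while NS and PBIAS both degrade.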
The objective function used for model calibration influences the calibration result and the final performance of the calibrated model. Gupta et al. (2009) presented the KGE function (Kling-Gupta Efficiency), based on the decomposition of NS and the mean squared error (MSE). The KGE can be decomposed into three terms, which represent the correlation, bias and relative variability between observed and simulated values. The optimal point is computed for a three-dimensional Pareto surface, in terms of the minimum Euclidean distance from the ideal point, according to:
$$KGE = 1 - ED$$
$$ED = \sqrt{(r-1)^2+\left(\frac{\sigma_s}{\sigma_o}-1\right)^2+\left(\frac{\bar{S}}{\bar{O}}-1\right)^2}$$
where ED is the Euclidean distance from the ideal point, r is the coefficient of linear correlation between observed and simulated values, $\sigma_o$ and $\sigma_s$ are the standard deviations of observed and simulated values, and $\bar{O}$ and $\bar{S}$ are the observed and simulated average values. Similar to NS, KGE values range from -∞ to 1, and the optimal value is 1. In the case of multiple observed variables, the objective function is defined as:
$$g = \sum_{j} w_j\,KGE_j$$
where $w_j$ is the weight of variable j.
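A compact way to see the three components at work is to compute KGE directly. The sketch below assumes the standard form from Gupta et al. (2009), KGE = 1 - ED, with ED built from the linear correlation, the ratio of standard deviations and the ratio of means; the function name is illustrative.

```python
import numpy as np


def kge(obs, sim):
    """Kling-Gupta Efficiency (assumed 2009 form); 1 is the optimal value."""
    o, s = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(o, s)[0, 1]      # linear correlation
    alpha = s.std() / o.std()        # relative variability
    beta = s.mean() / o.mean()       # bias ratio
    ed = np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
    return float(1.0 - ed)


# Doubling the observations keeps r = 1 but doubles both mean and spread,
# so ED = sqrt(0 + 1 + 1) and KGE = 1 - sqrt(2).
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]
print(kge(obs, sim))
```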
The KGE was used as the objective function to allow the simultaneous use of evapotranspiration and streamflow in calibration, and to enable comparison between the different strategies. For iterations in which streamflow and evapotranspiration were used simultaneously, a weight ($w_j$) of 0.5 was attributed to the outlet streamflow and 0.5/66 to the evapotranspiration of each of the 66 subbasins.
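With these weights, streamflow and evapotranspiration each contribute half of the objective, since 0.5 + 66 × (0.5/66) = 1. A minimal sketch of the weighted sum follows, assuming the per-variable KGE values have already been computed; the function name and the example values are hypothetical.

```python
def multi_variable_objective(kge_streamflow, kge_et_by_subbasin, flow_weight=0.5):
    """Weighted sum of KGE values: outlet streamflow plus ET of every subbasin.

    The remaining weight (1 - flow_weight) is shared equally among subbasins,
    which gives 0.5 / 66 per subbasin for the setup described in the text.
    """
    et_weight = (1.0 - flow_weight) / len(kge_et_by_subbasin)
    et_term = sum(et_weight * k for k in kge_et_by_subbasin)
    return flow_weight * kge_streamflow + et_term


# Hypothetical values: streamflow KGE of 0.8 and ET KGE of 0.4 in all 66 subbasins.
print(multi_variable_objective(0.8, [0.4] * 66))
```

Sharing the evapotranspiration weight equally means that no single subbasin can dominate the objective, at the cost of ignoring differences in subbasin area.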
The SWAT model was calibrated and validated using data for the period from 2006 to 2009. Following the split-sample test proposed by Klemeš (1986), the period was split unequally, such that the calibration period was sufficiently long, and the remaining data were used for model validation. The 4-year period was therefore split into 3 years for calibration and 1 year for validation, leaving two years of monthly evapotranspiration estimates for model calibration and one year for model validation.
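The 3-year/1-year split can be expressed as a simple filter on the record. The sketch below is illustrative only: the record layout (year, month, value), the placeholder values and the function name are assumptions, and the two splits shown follow the periods named in the captions of Figures 5 and 6.

```python
def split_by_year(records, calibration_years):
    """Split (year, month, value) records into calibration and validation sets."""
    calibration = [r for r in records if r[0] in calibration_years]
    validation = [r for r in records if r[0] not in calibration_years]
    return calibration, validation


# Placeholder monthly record for 2006 to 2009 (values are not real data).
records = [(year, month, 0.0) for year in range(2006, 2010) for month in range(1, 13)]

# Split A: calibrate on 2006-2008 and validate on 2009.
cal_a, val_a = split_by_year(records, {2006, 2007, 2008})
# Split B: calibrate on 2007-2009 and validate on 2006.
cal_b, val_b = split_by_year(records, {2007, 2008, 2009})
print(len(cal_a), len(val_a), len(cal_b), len(val_b))
```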
Because SUFI-2 operates with successive independent iterations, the objective function can differ between iterations of the same calibration. Different calibration strategies were evaluated. Each strategy consists of two iterations, with 500 simulations each, except for the S3' strategy, which refers to the result of only one iteration. Strategy S1 is the conventional calibration, using outlet streamflow data only, in both iterations. Calibration using solely evapotranspiration data was carried out in two successive iterations. Results for the first iteration are indicated as strategy S3', while the S3 strategy refers to the second iteration. The multi-variable calibration, strategy S2, was carried out using only evapotranspiration data in the first iteration, and streamflow and evapotranspiration simultaneously in the second iteration.
Figure 2 illustrates the calibration sequence of each strategy, according to the variable considered in the objective function of each iteration. All calibration strategies were compared to each other and to the initial performance of the uncalibrated model (S0). The uncalibrated model refers to the use of the default parameters indicated by the model developers in the SWAT manual (NEITSCH et al., 2011).
Evapotranspiration was analyzed only at the monthly scale due to data availability, and streamflow was validated at monthly and daily time steps. Streamflow validation at the daily time step was performed for the entire data interval (from 2006 to 2009), using the monthly calibrated parameters. For the monthly time step, the model was validated according to the split-sample test, with data that were not used in calibration. Although it is sometimes done, the best parameter set found during calibration with SUFI-2 should not be used or propagated for model validation. Abbaspour, Johnson and Van Genuchten (2004) point out that calibration with the SUFI-2 algorithm aims to find an optimum interval of values for each calibrated parameter (Range_Par), and not to establish a best value for each calibrated parameter (Best_Par). The calibrated parameter intervals (Range_Par) must be able to simulate the output variable of interest with acceptable uncertainty. Model validation, therefore, must be carried out by propagating the final calibrated parameter intervals (Range_Par) obtained from the last iteration of the calibration process. To exemplify the problems that may arise from an inadequate approach, the correct model validation, that is, validation using the parameter value intervals (Range_Par), will be compared to the use of the best parameter values (Best_Par).

Initial model performance
Uncalibrated model performance (S0) for streamflow simulation at the monthly time step can be considered acceptable only for the year 2009 (KGE = 0.67, NSE = 0.74 and PBIAS = -3.7%). Monthly evapotranspiration simulation also exhibits its best performance for 2009, with KGE = 0.60, PBIAS = 33.6% and R² = 0.81. Daily uncalibrated model performance is unsatisfactory for all periods, emphasizing the need for model calibration. Streamflow and evapotranspiration statistical performance indicators for the uncalibrated model (S0) are summarized in Table 4.

Calibration and validation strategies
Daily time step model validation results are presented in Table 5, and monthly time step calibration and validation results are in Table 6. The S3' calibration strategy exhibits model performance superior to that of the uncalibrated model (S0) for evapotranspiration and streamflow simulation, in both calibration and validation periods. The S3' strategy reached satisfactory performance for daily streamflow, with p-factor between 0.69 and 0.71, and r-factor from 1.00 to 1.08. For S3', monthly streamflow also reaches satisfactory performance in the calibration (validation) of 2007-2009 (2009), with a p-factor of 0.81 (0.75) and r-factor of 1.15 (0.95). Despite the unacceptable p and r factors of S3, its monthly and daily evapotranspiration and streamflow performance are superior to those of the uncalibrated model (S0). Still, monthly hydrographs show the tendency of S3 to "flatten" the flow peaks (Figure 3 and Figure 4). Compared to S3', the S3 streamflow simulation performance is worse. Calibrated parameters comprehend several hydraulic [...]
Brighenti, Bonumá and Chaffe (2016) applied the SWAT model to simulate the water cycle of the Negrinho river basin, which is a subbasin of the present study area. The authors calibrated and validated the SWAT model at monthly and daily time steps and also reported difficulties in validating the year 2006. In that study, SWAT was also calibrated and validated with SUFI-2; however, no p and r factors were reported.
The r-factor values for the analyzed strategies indicate no uncertainty reduction from multi-variable calibration. In order to seek the best possible evapotranspiration simulation, values outside the initial range (+/-0.25) for the relative parameters (CN2, SOL_AWC, SOL_K) were accepted in the second iteration. This extrapolation beyond the initial ranges may be the cause of the increase in r-factor values for the multi-variable calibration strategies. A deeper analysis of the parameters and their calibrated intervals, similar to the normalization carried out by Rajib, Merwade and Yu (2016), may provide more information about the uncertainties related to each calibration strategy.
The evapotranspiration simulation performance is unsatisfactory for all strategies, in both calibration and validation periods. High PBIAS values for evapotranspiration in all periods and strategies may suggest that the model is unable to satisfactorily simulate the actual evapotranspiration of the study area. Rajib, Merwade and Yu (2016) also reported low performance for soil moisture simulation by the SWAT model, even when measured soil moisture data were used to calibrate the model.

Best_Par versus Range_Par
The SUFI-2 calibration process indicates the best parameter set (Best_Par) and the calibrated interval (Range_Par) for each parameter, in each iteration. The best parameter set, when used to validate the model, led to unsatisfactory model performance for several time intervals. Figure 8 and Figure 9 compare the results obtained using the best parameter set (Best_Par) with those obtained using the parameter intervals (Range_Par), for monthly and daily streamflow.
It must be emphasized that, according to Abbaspour, Johnson and Van Genuchten (2004), SUFI-2 should not be used to define a single set of parameter values, but to define an adequate interval of values for each calibrated parameter.

CONCLUSIONS
The present study analyzed different calibration strategies, including the conventional calibration using streamflow data only, and multi-variable calibration with evapotranspiration data. The results led to the following conclusions:
• Multi-variable calibration (evapotranspiration + streamflow) did not reduce the uncertainty (r-factor) of streamflow predictions. A deeper analysis of the parameters and their calibrated intervals may provide more information about the uncertainties of each calibration strategy;
• Multi-variable calibration (evapotranspiration + streamflow) streamflow performance was superior to that of the other strategies for monthly and daily time steps;
• After the first iteration of the calibration using solely evapotranspiration (S3' strategy), model performance was satisfactory. After the second iteration (S3 strategy), however, model performance was unsatisfactory, with worse streamflow simulation performance than S3'. Model calibration using only evapotranspiration still requires further study.
Further studies regarding the use of different remote sensing products, such as soil moisture, are encouraged. The use of multiple streamflow stations for model calibration and validation is also suggested. Uncertainty analyses of model parameters and of other output variables are also promising alternatives for a deeper understanding of uncertainty.

Figure 1. Location of the upper Negro river basin, rainfall and streamflow gauges and meteorological stations.

Figure 2. Calibration strategies, according to the variable considered in the objective function of each iteration.

Figure 5. Daily streamflow from 01 July to 31 December 2009, calibrated with monthly data from 2006 to 2008.

Figure 6. Daily streamflow from 01 July to 31 December 2006, calibrated with monthly data from 2007 to 2009.

Table 2. Land use and land cover for the upper Negro river basin.

Table 3. [...] Neitsch et al. (2011) and initial range. The existing parameter value is multiplied by (1 + a given value). Parameter definitions and further details can be found in Neitsch et al. (2011). *

Table 5. Daily streamflow validation for the period from 2006 to 2009. R² = coefficient of determination; NS = Nash-Sutcliffe coefficient; PBIAS = percent bias; KGE = Kling-Gupta Efficiency; S̄ and Ō = average simulated and measured streamflow; σs and σo = simulated and measured standard deviation. Satisfactory p and r factor values are indicated in bold.

Table 6. Uncertainty and performance indicators for all strategies. Monthly simulation.