Automatic calibration of a large-scale sediment model using suspended sediment concentration, water quality, and remote sensing data

Calibration and validation are two important steps in the application of sediment models, and both require observed data. This study investigates the potential use of suspended sediment concentration (SSC), water quality and remote sensing data to calibrate and validate a large-scale sediment model. Observed data from 108 stations located in the Doce River basin were used for the period 1997-2010. Ten calibration and validation experiments using the MOCOM-UA optimization algorithm coupled with the MGB-SED model were carried out, which, over the same period of time, resulted in 37 calibration and 111 validation tests. The experiments were performed by modifying metrics, spatial discretization, observed data and parameters of the MOCOM-UA algorithm. Results generally demonstrated that the correlation values varied only slightly and were higher in the calibration step. Additionally, increasing spatial discretization or establishing a background concentration for the model improved the results. At a station with a large amount of SSC data, calibration improved the ENS coefficient from -0.44 to 0.44. The experiments showed that spectral surface reflectance, total suspended solids and turbidity data have the potential to enhance the performance of sediment models.


INTRODUCTION
Monitoring suspended sediments (SS) in water bodies is useful in the development of various studies that relate sediments to environmental, social and economic issues (MORRIS; FAN, 1998). Through the use of observed data and sediment models it is possible to understand erosion processes and sediment transport, and to simulate scenarios that involve, for example, changes in climate or in land cover and land use (MILLINGTON, 1986; MERRITT; LETCHER; JAKEMAN, 2003; SANTOS, 2009; PANDEY et al., 2016; WORKU; KHARE; TRIPATHI, 2017).
Two important procedures in applying sediment models are calibration and validation, which aim to ensure their optimal performance (BRESSIANI et al., 2015; PANDEY et al., 2016). Bressiani et al. (2015) showed that calibration, carried out to achieve improved results, is performed in only 66% of the models in Brazil. Model calibration is commonly carried out manually, adjusting the parameters through trial and error, which makes it a very time-consuming and monotonous task (SUGAWARA, 1979; BOYLE; GUPTA; SOROOSHIAN, 2000; MULETA; NICKLOW, 2005). Moreover, a learning period is necessary to achieve a satisfactory fit between observed and simulated data. This arises from the lack of knowledge regarding the exact outcome of changes in parameter values in each region where the model is applied. Insight acquired through practice is difficult to transfer swiftly to another person, let alone to another model (BOYLE; GUPTA; SOROOSHIAN, 2000). To aid the user in applying a model (which does not exempt them from having basic knowledge about the basin's physical characteristics, the model and its parameters), optimization methods and algorithms (e.g. YAPO; GUPTA; SOROOSHIAN, 1998; VRUGT et al., 2003; MULETA; NICKLOW, 2005) were implemented in hydrological (e.g. GUPTA; SOROOSHIAN; YAPO, 1998; VINEY; SIVAPALAN, 1999; BOYLE; GUPTA; SOROOSHIAN, 2000; TUCCI; COLLISCHONN, 2003; BLASONE; MADSEN; ROSBJERG, 2007; TUCCI; BRAVO; COLLISCHONN, 2009) and sediment (see Table 1) models, allowing the parameters to be calibrated automatically. Few sediment model studies have been carried out using automatic calibration (Table 1). The main goals of the studies listed in Table 1 were: to calibrate, to analyze uncertainties and to validate models; to compare the performances of different models; and to compare automatic calibration methods.
Among them, the studies of Van Rompaey et al. (2005) and Rostamian et al. (2008) show, respectively, that automatic calibration can be used to perform a large number of experiments (simulations) and to perform applications in large basins. A problem that emerges in merging these two approaches is that calibration and validation procedures demand observed data, of which suspended sediment concentration (SSC) or discharge (QSS) are normally used. However, the density of measuring stations across monitoring networks is low, and few stations have long and continuous data series (LODHI et al., 1998; PANDEY et al., 2016). According to Pandey et al. (2016), this limits the understanding of erosion rates at multiple spatial and temporal scales, as well as of the efficiency of erosion control measures.
In view of these issues, several alternatives to standard monitoring have been developed to estimate suspended sediments (SS), based on water quality data or on spectral surface reflectance (SSR) obtained from remote sensing (RS) images. Glysson, Gray and Conge (2000) and Williamson and Crawford (2011) used total suspended solids (TSS) data to estimate SSC through linear regression. Pavanelli and Bigi (2005) and Minella et al. (2008) estimated SSC from turbidity data through empirical equations developed in each respective study. Sari, Castro and Pedrollo (2017) fed turbidity data into an artificial neural network model and obtained valuable SSC estimates. SSR data have been used to estimate SS in water bodies since the deployment of the Landsat 1 satellite, as shown by Munday Junior and Alföldi (1979). Lodhi et al. (1998) carried out a laboratory study concerning the relationship between SSC and SSR data across different concentrations. Martinez et al. (2009) and Espinoza Villar et al. (2012) also developed empirical equations to estimate SSC from SSR in the Amazon River and in the Ucayali and Marañon rivers (tributaries of the Amazon), respectively. Zhang et al. (2014) carried out procedures similar to those of Martinez et al. (2009) and Espinoza Villar et al. (2012) to estimate SSC in the Huang He (Yellow River) estuary.
Although the aforementioned studies employed RS to estimate SSC, none of them addressed the use of this information to calibrate models. Miller et al. (2005) and Yang et al. (2014) used SSC data derived from satellite images to calibrate and validate sediment transport models for coastal waters. These two were the only studies found in the literature that used RS to calibrate or validate a sediment transport model.
The advantage of using surrogate data sources, be it water quality or RS data, comes from the increased availability of information stemming from such indirect measurements, which broadens the monitoring scope. Within the context of large-scale sediment modeling, however, relying solely on SSC derived from empirical equations (such as those used by Miller et al. (2005) and Yang et al. (2014)) has a few limitations and disadvantages: (i) in order to establish these relationships accurately, a sufficiently long SSC series is necessary, one that is representative of the basin conditions in both dry and wet seasons; (ii) generally, these relationships are only reasonable at the location where the station was deployed; (iii) in many regions, the spatial density of water quality stations and virtual stations (created from remote sensing images) is much higher than that of sediment stations. Thus, being limited to the use of sediment data derived from empirical equations would waste alternative information that has great application potential in studies related to sediment transport.
No research has been found in the literature, however, that directly uses water quality and RS data to automatically calibrate a sediment model at the basin scale. Therefore, there are no clear recommendations regarding how to handle such data. In this context, the present study investigated the potential use of SSC, turbidity, total suspended solids and spectral surface reflectance data to calibrate and validate a large-scale sediment model. To that end, the MOCOM-UA (YAPO; GUPTA; SOROOSHIAN, 1998)

Study area
The study area is the Doce River basin (Figure 1a of the supplementary material), considered by Lima et al. (2005) to have the largest average SSC (386.25 mg/L) among the large Brazilian basins. It is a very emblematic basin, since it is where, on November 5th, 2015, the infamous Mariana Dam Disaster occurred, affecting a great part of the Doce River and leading to severe environmental impacts (ANA, 2016). Aside from these reasons, the basin was selected due to the presence of a relatively large number of stations with observed data (Figure 1a of the supplementary material).
It has an area of approximately 86,715 km² (PIRH, 2010) and is located between the states of Minas Gerais and Espírito Santo. The basin has a strongly seasonal rainfall pattern (PINTO; LIMA; ZANETTI, 2015), with a dry season from April to September and a rainy season from October to March (Figure 1b of the supplementary material). The strong and concentrated rainfall contributes to the intensity of erosive processes in the basin, causing siltation issues in the reservoirs (FAN et al., 2015b). The predominant soil types within the region are Red-Yellow Latosols and Red Argisols. There are also other types of Latosols and Argisols, Litholic Neosols, Gleysols and Cambisols (PIRH, 2010). Fagundes et al. (2017) showed that the basin sediment yield varies from around 10 t/year·km² to close to 14,680 t/year·km². The authors also demonstrated that sediment yield is directly linked to the increase in slope, and that regions south of the Doce River tend to have the highest sediment yield values. Furthermore, Fagundes et al. (2017) observed that sediment discharge increases along with the drainage area, reaching values higher than 7,000,000 t/day in the Doce River. The tributaries transporting the largest sediment loads are the Piracicaba, Santo Antônio, Suaçuí Grande and Manhuaçu Rivers.

Suspended sediment concentration
We obtained data from 24 SSC monitoring stations (Figure 1) of the National Water Agency (ANA), made available through the Hydrological Information System (HidroWeb), with around four measurements per year between 1997 and 2010 (surrogate data was also acquired for this period). SSC data was also obtained from the Fazenda Ouro Fino station (Figure 1), provided by the Minas Gerais Energy Company (CEMIG), which has around one measurement per day over the rainy season, and from four to ten measurements per month in the dry season.

Turbidity and total suspended solids
Turbidity and total suspended solids (TSS) data was obtained from 63 water quality monitoring stations (Figure 1) of the Minas Gerais Water Management Institute (IGAM), with around four measurements per year. This data was used as a proxy for SSC, even though it accounts for other substances suspended in water (ASTM, 2003) besides the inorganic soil fractions (silt, clay and sand) that make up SSC. This was done because various studies (e.g. GLYSSON; GRAY; CONGE, 2000; WILLIAMSON; CRAWFORD, 2011; PAVANELLI; BIGI, 2005; MINELLA et al., 2008; SARI; CASTRO; PEDROLLO, 2017) demonstrated that there is a substantial correlation between SSC and these water quality data.

Spectral surface reflectance
Studies indicate that it is possible to use the visible (~0.40 µm to 0.70 µm) and infrared (~0.70 µm to 1.30 µm) electromagnetic spectrum to evaluate water components through remote sensing (LODHI et al., 1998; MUNDAY JUNIOR; ALFÖLDI, 1979). The red band is one of the most used in SSC estimation, for it is where the peak of reflectance occurs in water-sediment mixtures (LODHI et al., 1998). On the other hand, reflectance saturation may occur for high SSC, so that reflectance no longer increases proportionally to SSC (LODHI et al., 1998). Among the advantages of using SSR to monitor SS are the low financial cost of acquiring images and the possibility of obtaining information with wide spatial coverage and high temporal frequency (WANG et al., 2009; ESPINOZA VILLAR et al., 2012). Among the disadvantages is the limited use of SSR in rivers that are too narrow, since the reflectance of the water-sediment mixture may be influenced by the presence of river banks and/or sand bars, both in rainy and dry seasons (e.g. MARTINS et al., 2017).
Twenty-one virtual stations across the Doce River basin were created using images from the Landsat 5/TM satellite (Figure 2), where SSR information was extracted in the red band. The Landsat 5/TM satellite was chosen because it offers a long historical time series and 30 m spatial resolution, and because institutions already provide that product with atmospheric correction. In the present study, images were acquired at no charge from the United States Geological Survey - USGS (2018a) with atmospheric correction (USGS, 2018b). To encompass the entire basin, images from four scenes with the following path/row were used: 216/073, 216/074, 217/073 and 217/074. An average of 13 images per year were used for each scene, since images with high cloud cover (>80%) were discarded. SSR data extraction followed the approach proposed by Fagundes, Paiva and Fan (2017), which seeks to obtain information from pixels free from cloud and shadow interference.
One procedure that could have been adopted to work with the remote sensing data would be to transform them into SSC through empirical relationships, as performed by Martinez et al. (2009), Espinoza Villar et al. (2012), Zhang et al. (2014) and others. In the current study, however, a simpler possibility was tested, one that allows for a broader use of information: the direct comparison of reflectance with the sediment model results, following the procedure presented in the "Experiments" section.

The MGB-SED Model
The MGB-SED model (BUARQUE, 2015) is coupled to the MGB large-scale hydrological model (COLLISCHONN et al., 2007), which is a distributed, conceptual model with a daily time step and unit-catchment discretization that uses the Hydrologic Response Units (HRU) approach. MGB-SED was developed to represent erosion and sediment transport processes on hillslopes, as well as channel sediment transport with possible interactions with floodplains. The sediment yield at each unit catchment is estimated through MUSLE (Equation 1) (WILLIAMS, 1975), considering a two-dimensional LS topographic factor (BUARQUE, 2015) extracted from the Digital Elevation Model (DEM) and using surface runoff volumes calculated by MGB as an input. Fine sediments (silt and clay) are routed along the river as suspended load by the diffusion-advection equation and do not settle in the channel.

$$Sed = \alpha \cdot \left(Q_{sup} \cdot q_{pico} \cdot A\right)^{\beta} \cdot K \cdot C \cdot P \cdot LS \qquad (1)$$

where Sed [t/day] is the sediment load resulting from soil erosion, Q_sup [mm/ha] is the surface runoff volume, q_pico [m³/s] is the peak flow rate, A is the surface area, α and β are the adjustment coefficients (which are calibrated afterwards), whose values originally estimated by Williams (1975) were 11.8 and 0.56, respectively, K [0.013·t·m²·h/(m³·t·cm)] is the soil erodibility factor, C [-] is the cover and management factor, P [-] is the conservation practice factor and LS [-] is the topographic factor.
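As an illustration, the MUSLE expression can be sketched in a few lines of Python. This is only a minimal sketch that assumes the commonly cited form of the equation, Sed = α · (Q_sup · q_pico · A)^β · K · C · P · LS; the function name and the input values are hypothetical and are not taken from the MGB-SED code.

```python
def musle_sediment_load(q_sup, q_pico, area, k, c, p, ls, alpha=11.8, beta=0.56):
    """Sediment load from MUSLE, assuming the commonly cited form:
    alpha * (q_sup * q_pico * area) ** beta * K * C * P * LS.
    Defaults for alpha and beta are the values of Williams (1975)."""
    if min(q_sup, q_pico, area) < 0:
        raise ValueError("runoff volume, peak flow and area must be non-negative")
    runoff_term = q_sup * q_pico * area
    return alpha * runoff_term ** beta * k * c * p * ls


# Hypothetical inputs for a single unit catchment (illustrative values only).
default_load = musle_sediment_load(q_sup=5.0, q_pico=2.0, area=100.0,
                                   k=0.02, c=0.1, p=1.0, ls=1.5)

# alpha is a pure multiplier, so doubling it doubles the load.
doubled_load = musle_sediment_load(q_sup=5.0, q_pico=2.0, area=100.0,
                                   k=0.02, c=0.1, p=1.0, ls=1.5, alpha=23.6)
```

Keeping α and β as keyword arguments mirrors their role as the calibrated coefficients, while K, C, P and LS remain fixed inputs.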
The model configuration defined by Fagundes et al. (2017), who applied it to the Doce River basin and obtained good results in the calibration process, was employed in this study. A brief description of these settings follows.
The Doce River basin was discretized into 1173 unit catchments (Figure 1c of the supplementary material) and the HRU were acquired from the South America HRU map (FAN et al., 2015a). We used 217 rainfall and 59 ANA flow stations, along with 14 meteorological stations from the MGB internal database (FAN; COLLISCHONN, 2014). The channel flow routing method employed was Muskingum-Cunge, which has shown satisfactory results in basins with no significant backwater or floodplain storage effects (e.g. ALLASIA et al., 2015; COLLISCHONN et al., 2007; TUCCI; BRAVO; COLLISCHONN, 2009; GETIRANA et al., 2010; NÓBREGA et al., 2011; FAN et al., 2016).
The peak flow rate q_pico was calculated from the daily uniform surface runoff volume (BUARQUE, 2015). The LS factor is the combination of the slope-length (L) and slope-steepness (S) factors. It was calculated using the methodology of Buarque (2015), who developed a computational routine that computes the LS factor two-dimensionally, making use of the Desmet and Govers (1996) approach to determine the L factor and the Wischmeier and Smith (1978) equation to determine the S factor.
The P factor was set equal to 1 for two reasons: i) conservation practices have a greater impact in small watersheds, and as watershed size increases their impact can be neglected or may not cause a significant difference in the model estimates; and ii) P values are hard to obtain for large basins. The K factor was estimated following the equation proposed by Williams (1995), which uses soil texture data, obtained from the Food and Agriculture Organization of the United Nations (FAO, 1971) and displayed in Table 2. The C factor values for each HRU, obtained from the literature (see FAGUNDES et al., 2017), are also given in that table.

Procedures for automatic calibration
According to Moriasi et al. (2007), calibration is the process of estimating the model parameters through comparisons between a model estimation and an observed data set, both in similar conditions. Validation is used here as defined by Refsgaard (1997): the process that demonstrates that a specific model is capable of performing "sufficiently accurate" simulations for a location, where the term "sufficiently accurate" is subjective and related to the goals to be achieved.
The MGB-SED model was calibrated using the MOCOM-UA multi-objective automatic calibration algorithm (YAPO; GUPTA; SOROOSHIAN, 1998), which has already been implemented in the MGB model (FAN; COLLISCHONN, 2014). Some modifications were necessary to calibrate parameters related to sediment load estimation. The adjustment coefficients α and β present in MUSLE (Equation 1) were adopted as calibration parameters. These parameters have been modified in several studies, as shown by Sadeghi et al. (2014) in their review paper about MUSLE applications around the world, in which about 30% of the papers altered these parameters. A surface runoff delay parameter (TKS) from the MGB-SED model was also adopted for calibration. Sediment volumes generated at each HRU are virtually stored in a linear reservoir, which is a structure that transports sediments from the unit catchments to the river channel. The TKS parameter is associated with the linear reservoir (COLLISCHONN et al., 2007) and determines the time it takes for sediments to reach the channel. TKS was computed for each unit catchment and is directly related to its time of concentration. After setting the TKS values, changes were considered at the sub-basin level (or for the whole basin, if that were the case); that is, every TKS value was amplified or reduced at the same rate, while each unit catchment kept its own TKS value. A sensitivity analysis was also performed for each calibration parameter.
The MOCOM-UA algorithm uses genetic algorithm techniques and incorporates the Nelder and Mead simplex algorithm (SOROOSHIAN; GUPTA, 1995) in its structure. To make use of MOCOM-UA it is necessary to define the number of parameters (N) to undergo calibration; the limits of the search space for each parameter; the number of objective functions (NF) used to evaluate the model; and the number of parameter sets (NS), or points (defined randomly), within the region determined by the search space limits. Each point is defined by N parameter values and, for each point, the NF objective functions are assessed, providing a result matrix F(NS, NF) (COLLISCHONN; TUCCI, 2003).
To perform multi-objective automatic calibration it is necessary to define, beyond the calibrated parameters, which objective functions will be used to evaluate the quality of the fit. The MOCOM-UA algorithm seeks to optimize these functions simultaneously. The main characteristic of a multi-objective optimization problem is that the solution, generally, will not be a single one (COLLISCHONN; TUCCI, 2003). In this study, the parameter set that presented the best objective function average for a given data set was always the one selected for the calibration. The calibrated parameter values adopted for each experiment may be found in the supplementary material. Further information regarding the automatic calibration method used in this study, as well as other information about the subject, can be found in Collischonn and Tucci (2003).
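The bookkeeping described above (NS random points, NF objective functions, the result matrix F(NS, NF) and the final choice by best average) can be sketched as follows. This is not MOCOM-UA itself: the Pareto ranking and simplex evolution steps are omitted, the objective functions are hypothetical stand-ins for a model run, and lower objective values are assumed to be better.

```python
import random

# Search-space limits quoted in the text for most experiments.
BOUNDS = {"alpha": (2.0, 25.0), "beta": (0.2, 1.7), "tks": (0.1, 3.0)}


def sample_parameter_sets(bounds, ns, seed=0):
    """Draw NS random points, each holding one value per calibrated parameter."""
    rng = random.Random(seed)
    return [{name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
            for _ in range(ns)]


def evaluate(points, objective_functions):
    """Build the result matrix F(NS, NF): one row per point, one column per function."""
    return [[f(point) for f in objective_functions] for point in points]


def select_best_average(points, f_matrix):
    """Return the point with the lowest mean objective value (minimisation assumed)."""
    means = [sum(row) / len(row) for row in f_matrix]
    best = min(range(len(points)), key=means.__getitem__)
    return points[best], means[best]


# Hypothetical objective functions; a real application would run the sediment
# model for each point and compare simulated with observed data.
toy_objectives = [
    lambda p: abs(p["alpha"] - 11.8) / 23.0,
    lambda p: abs(p["beta"] - 0.56) / 1.5,
    lambda p: abs(p["tks"] - 1.0) / 2.9,
]

points = sample_parameter_sets(BOUNDS, ns=50)
f_matrix = evaluate(points, toy_objectives)
best_point, best_mean = select_best_average(points, f_matrix)
```

Fixing the random seed makes the sampled points reproducible, which helps when comparing experiments that differ only in search space or NS.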

Experiments
In order to investigate how remote sensing and water quality data may be used in calibrating and validating sediment models, as well as aiding towards potential enhancements, several experiments were conducted and compared to a reference simulation in which the model was not calibrated. The reference simulation was performed considering the values α = 11.8 and β = 0.56 (WILLIAMS, 1975), TKS without variation and 17 sub-basins. Ten experiments were performed (Table 3), all having the same calibration and validation period: 1997-2010. In each experiment, generally, one type of data (e.g. SSC measured in situ) was used for the automatic calibration of the model, while the others (e.g. SSR, turbidity and TSS) were used for validation, which resulted in 4 calibrations in the case of experiment E1, for example. Other elements that defined each experiment were the number of sub-basins (1, 5, and 17) and the objective functions (always three for each experiment, which may be distinct). The spatial discretizations of the basin are illustrated in Figure 1 of the supplementary material.

In the table, SSC is the suspended sediment concentration; SSR is spectral surface reflectance in the red band; TSS are total suspended solids; Rtp is the temporal correlation coefficient; Rsp is the spatial correlation coefficient; Rgl is the global correlation coefficient; KGE is the Kling-Gupta coefficient; ENS is the Nash-Sutcliffe efficiency coefficient; Imaxgen is the maximum number of algorithm iterations; QSS is the suspended solid discharge; SSCbg is the SSC background concentration that always remained in the river. All experiments made use of all datasets for validation. *For these experiments, SSC stations 56800000, 56846000 and 5697600, turbidity stations RD091 and RD098 and TSS stations RD098 and RD099 were not considered, and a second reference simulation was performed (without calibration).
The objective functions used in the automatic calibration were the Nash-Sutcliffe efficiency coefficient (ENS) (NASH; SUTCLIFFE, 1970), the Kling-Gupta efficiency coefficient (KGE) (GUPTA et al., 2009), the Pearson correlation coefficient (Rtp) (HELSEL; HIRSCH, 1992), and its variations, named spatial correlation coefficient (Rsp) and global correlation coefficient (Rgl), explained in the text below. The MOCOM-UA algorithm aims to optimize each of the objective functions; to that end, a single value derived from each function is necessary. For ENS and KGE, the objective function value (OFV) was calculated according to Equation 2.

(2)

As Rtp results in a single value for each station, the average Rtp value over all stations, subtracted from unity, was used (Equation 3). In order to calculate Rsp, two new data series were built: one composed of the long-term average of the observed values for each station, and the other of the long-term average of the simulated values for the corresponding station. Two data series were also built to estimate Rgl: one with all observed values from all stations, and the other with their respective simulated values. Once the series were defined, the Pearson correlation coefficient was computed for both Rsp and Rgl. The OFV for Rsp and Rgl is calculated in the same way as in Equation 3, replacing Rtp with Rsp or Rgl.
$$OFV = 1 - \overline{Rtp} \qquad (3)$$

Table 3 summarizes the experiments and illustrates their specific characteristics. For most experiments, the search space was set between 2.0 and 25.0 for parameter α, between 0.2 and 1.7 for β, and between 0.1 and 3.0 for TKS, with a maximum number of algorithm iterations (Imaxgen) of 60 and 50 parameter sets (NS). Since parameters α and β are not physically based, and TKS can vary broadly due to the many real conditions that may retain sediments over the basins, these ranges could be different. However, they were defined through the sensitivity analysis of these calibrated parameters.
Experiments E1, E2 and E3 were performed by varying the number of sub-basins (calibration elements) at 1, 5 and 17, respectively, to investigate whether the model better represented sediment processes when the calibration parameter set was more heterogeneous. Other experiments were carried out to investigate whether higher correlation values would be achieved when correlations were calculated between data derived from observed SSC and simulated data from MGB-SED. These were experiments E4, in which SSC data was converted into the logarithm of SSC (logSSC); E5, in which SSC was converted into QSS; E6, in which only SSC values higher than 50 mg/L (a value hardly exceeded in measurements taken during the dry season in the Doce River basin) were used; and E7, in which a background concentration (SSCbg) was employed to attempt to enhance the representation of SSC values. SSCbg was computed as the average of the SSC values measured in the dry season, for each sediment station. Experiments E8 and E9 were conducted to verify the influence of certain parameters of the automatic calibration algorithm. In experiment E10, the model was calibrated for the Fazenda Ouro Fino (CEMIG) station, which is the station with the highest number of observed SSC data. For the latter, the calibrated parameter values found were applied all over the basin.
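The three correlation variants and their objective function values can be sketched as below. The data layout (a dictionary mapping each station to its paired observed and simulated series) and the station values are hypothetical, chosen only to show how Rtp, Rsp and Rgl differ.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rtp(stations):
    """Temporal correlation: average of the per-station Pearson coefficients."""
    return mean(pearson(obs, sim) for obs, sim in stations.values())


def rsp(stations):
    """Spatial correlation: Pearson between the long-term station averages."""
    obs_means = [mean(obs) for obs, _ in stations.values()]
    sim_means = [mean(sim) for _, sim in stations.values()]
    return pearson(obs_means, sim_means)


def rgl(stations):
    """Global correlation: Pearson over all values pooled from all stations."""
    all_obs = [v for obs, _ in stations.values() for v in obs]
    all_sim = [v for _, sim in stations.values() for v in sim]
    return pearson(all_obs, all_sim)


def ofv(correlation):
    """Objective function value: correlation subtracted from unity."""
    return 1.0 - correlation


# Hypothetical stations: (observed, simulated) SSC in mg/L.
stations = {
    "A": ([10.0, 20.0, 30.0], [5.0, 10.0, 15.0]),        # simulated = 0.5 x observed
    "B": ([100.0, 200.0, 300.0], [200.0, 400.0, 600.0]),  # simulated = 2 x observed
    "C": ([50.0, 60.0, 70.0], [50.0, 60.0, 70.0]),        # perfect match
}
```

In this toy data set every station is perfectly correlated in time, so Rtp is 1, yet Rsp is below 1 because the simulated magnitudes are biased differently from station to station.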
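The data transformations behind experiments E4 to E7 can be sketched as below. Several details are assumptions made for illustration: the logarithm base (base 10), the sample layout as (month, SSC) pairs, the dry season taken as April to September as described for the study area, and the unit conversion from mg/L and m³/s to t/day.

```python
import math

DRY_SEASON_MONTHS = {4, 5, 6, 7, 8, 9}  # April to September


def log_ssc(ssc_values):
    """E4: logarithm of SSC (base 10 assumed; values must be positive)."""
    return [math.log10(v) for v in ssc_values]


def ssc_to_qss(ssc_mg_l, flow_m3_s):
    """E5: suspended solid discharge in t/day.
    mg/L equals g/m3, so g/s * 86400 s/day / 1e6 g/t gives the 0.0864 factor."""
    return 0.0864 * ssc_mg_l * flow_m3_s


def above_threshold(samples, threshold=50.0):
    """E6: keep only samples whose SSC exceeds the threshold (mg/L)."""
    return [(month, ssc) for month, ssc in samples if ssc > threshold]


def background_ssc(samples):
    """E7: SSCbg as the average SSC measured in the dry season at one station."""
    dry = [ssc for month, ssc in samples if month in DRY_SEASON_MONTHS]
    return sum(dry) / len(dry)


# Hypothetical samples for one station: (month, SSC in mg/L).
samples = [(1, 300.0), (5, 20.0), (7, 40.0), (11, 500.0)]
```

For example, with these samples `background_ssc(samples)` averages only the May and July measurements, while `above_threshold(samples)` keeps only the two rainy-season values.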

Sensitivity analysis
MUSLE parameters α and β affect the amount of sediment generated at each HRU, while TKS affects the time at which these sediments arrive at the drainage network. Figure 3 shows that sediment graphs are amplified or reduced proportionally to the variation in the value of parameter α, meaning that a 20% increase in the α value causes a 20% increase in the SSC value. Parameter β, however, amplifies the sediment graphs and intensifies their peaks and valleys as its value decreases, which is evidenced by the blue line (-50%) in Figure 4. Changes in β are not proportional: a 20% increase in the parameter's value caused, on average, a reduction of 66% in the SSC value. This happens because the β parameter is the exponent of a value that is always less than 1.
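The contrasting behaviour of the two MUSLE coefficients can be checked numerically. The sketch below isolates the term α · x^β, where x stands for the quantity raised to β; the value chosen for x is hypothetical and only needs to be below 1, as stated in the text, so the printed magnitudes are illustrative and do not reproduce the figures.

```python
def musle_core(x, alpha, beta):
    """The part of MUSLE affected by the calibrated coefficients: alpha * x**beta."""
    return alpha * x ** beta


def relative_change(x, alpha, beta, d_alpha=0.0, d_beta=0.0):
    """Fractional change in the result when alpha and/or beta are perturbed
    by the given fractions (e.g. d_alpha=0.2 means +20%)."""
    base = musle_core(x, alpha, beta)
    perturbed = musle_core(x, alpha * (1.0 + d_alpha), beta * (1.0 + d_beta))
    return perturbed / base - 1.0


X = 0.001  # hypothetical value of the term raised to beta (must be below 1)

alpha_effect = relative_change(X, 11.8, 0.56, d_alpha=0.2)      # proportional
beta_up_effect = relative_change(X, 11.8, 0.56, d_beta=0.2)     # reduces the result
beta_down_effect = relative_change(X, 11.8, 0.56, d_beta=-0.5)  # amplifies the result
```

Because the ratio for a β perturbation is x^(Δβ), its size depends on x itself, which is why the effect of β is not proportional and varies between events.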
Figure 5 shows the MGB-SED results for changes in the TKS parameter. The smaller the TKS value, the more intense the peaks and valleys, as shown by the sediment graph in blue (-50%). A 50% decrease in the TKS value could cause an increase of up to 7,300% in the SSC value. Furthermore, a change in the TKS value causes a temporal shift in the sediment graph, with smaller values anticipating the SSC peak. Comparing the sediment graph in yellow (+50%) to the sediment graph in blue inside the marked rectangle in Figure 5, a discrepancy of nearly 2 days is noted in the SSC peak, along with its reduction from 953 mg/L to 660 mg/L.
Deviations in the calibrated parameter values may result in large differences in the values estimated by the MGB-SED model, especially in the amplification of extreme values. For an appropriate representation, it is important to establish a search space during the automatic calibration process that results in simulated values consistent with the observed values.
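The effect of a delay parameter on peak size and timing can be illustrated with a generic linear reservoir. This is a sketch under stated assumptions, not the MGB-SED routine: the exponential release form, the function name and the inflow pulse are all hypothetical, and the residence time here is expressed directly in time steps.

```python
import math


def route_linear_reservoir(inflow, residence_time):
    """Route a sediment load series through a single linear reservoir.
    At each step a fraction 1 - exp(-1 / residence_time) of the stored
    sediment is released (generic formulation, assumed for illustration)."""
    if residence_time <= 0:
        raise ValueError("residence_time must be positive")
    release_fraction = 1.0 - math.exp(-1.0 / residence_time)
    storage = 0.0
    outflow = []
    for load in inflow:
        storage += load                      # sediment generated in this step
        released = storage * release_fraction
        storage -= released                  # the remainder waits in the reservoir
        outflow.append(released)
    return outflow


# Hypothetical daily sediment loads generated on the hillslopes.
pulse = [0.0, 2.0, 10.0, 4.0, 1.0, 0.0, 0.0, 0.0]

fast = route_linear_reservoir(pulse, residence_time=1.0)
slow = route_linear_reservoir(pulse, residence_time=4.0)
```

With the shorter residence time the routed peak is higher and arrives one step earlier than with the longer one, which is the qualitative behaviour described for smaller delay values.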

Experiment analysis
This section briefly outlines the results of the experiments, which support the subsequent discussion. To aid in the comprehension of the results presented in the form of tables, the average difference (AD) and the average absolute difference (AAD) were calculated. AD was calculated as the average of the values presented in the tables, taking their sign into account, to verify how much the results improved or worsened. AAD was calculated similarly, but without considering the sign of the values, in order to indicate the magnitude of the differences.
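The two summary statistics can be written directly from their definitions. The sample differences below are hypothetical and are not values from the tables.

```python
def average_difference(differences):
    """AD: mean of the signed differences relative to the reference simulation."""
    return sum(differences) / len(differences)


def average_absolute_difference(differences):
    """AAD: mean of the magnitudes of the differences, ignoring their sign."""
    return sum(abs(d) for d in differences) / len(differences)


# Hypothetical changes in a metric relative to the reference simulation.
differences = [0.12, -0.02, 0.06, -0.04]

ad = average_difference(differences)             # net improvement or worsening
aad = average_absolute_difference(differences)   # typical size of the change
```

An AD near zero together with a large AAD would indicate that improvements and deteriorations cancel out, which is why the two statistics are read together.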

Sub-basin numbers deviation
Table 4 compares the experiments in which the number of sub-basins was changed. In that table, as in the others that follow the same template, values represent the absolute increase (green) or decrease (red) in relation to the values of the reference simulation (without calibration); the main diagonal holds the metric values related to calibration; and the values in the other cells are the metric values regarding validation.
The results in Table 4 show that metrics are generally better in the calibration step and worse in the validation step, which is common in modelling (e.g. BUSSI et al., 2014; YESUF et al., 2015; AYELE et al., 2017; WORKU; KHARE; TRIPATHI, 2017). Nevertheless, the worsening of the metrics was not significant, indicating that all data assisted in the model validation process. It should be noted that experiment E1 (1 sub-basin) is the one that has the smallest differences, both in calibration (AAD = 0.02) and in validation (AAD = 0.02). With the increase in the number of sub-basins from 1 to 5, AD increased from 0.01 to 0.06 in the calibration step. When increasing from 5 to 17 sub-basins, the results exhibit little difference, with some values remaining the same. Experiment E3 (17 sub-basins) shows that reflectance was the dataset in which the correlation average increased the most (0.12), even when turbidity (0.06) and TSS (0.06) data were used to calibrate MGB-SED.

Comparison with SSC derived data
The results of experiments E5 (QSS) and E6 (SSC values higher than 50 mg/L) showed low or even negative correlation values. For these experiments, results tables are not presented. Although the MGB-SED model may have underestimated observed values in the dry season in the current application, calibrating it using only SSC values > 50 mg/L did not improve the results. By establishing an SSC threshold, the amount of observed data available for comparison decreased, and the representation of the sediment temporal variability worsened.
Table 5 presents the results of experiments E4 (SSClog) and E7 (SSCbg). Experiment E4 highlights the improvement in the mean correlation values between simulated and observed SSClog, both in calibration (+0.14) and in validation (+0.25). On the other hand, for turbidity and TSS data, both calibration (AD = -0.19) and validation (AD = -0.19) presented worse correlation values. When a background SSC was included, AD increased by 0.10 in calibration and decreased by 0.07 in validation. It is emphasized that the best model performance during the calibration step using SSC data was obtained in experiment E7. To a certain extent this was expected, since the observed SSC values themselves were used to calculate SSCbg.

Changes in automatic calibration parameters
Table 6 presents the results for experiments E8 (smaller search space and greater Imaxgen) and E9 (small α and β values). The more restricted the search space, the fewer the possible combinations able to generate an optimal result. This could make the search algorithm find a local maximum instead of a global maximum. In experiment E8, the calibration performed with SSR data resulted in higher metric values for SSC in the validation. In experiment E9, the metrics improved only for SSC and SSR data during the calibration. For turbidity and TSS data, automatic calibration did not find a better parameter set than that of the reference simulation, thus the results did not change.

The values within the cells represent the increase (green) or decrease (red) in relation to the average of the three correlations (temporal, spatial and global) when compared to the reference simulation values (without calibrating): SSC - 0.50; SSR - 0.63; Turb. - 0.63; TSS - 0.65. The results of the main diagonal (in bold) refer to the calibration process while the others refer to the validation process, both performed for the 1997-2010 period.
Fagundes et al.

These results show that parameter sets with values closest to those found by Williams (1975), especially for α, result in better model performance.
Results from experiment E8 are detailed in Table 7. The main improvements occur during calibration, for Rsp values (AD = 0.24). Considering only SSC, Rsp values increase from 0.36 (in the reference simulation) to 0.79 after the model has been calibrated. On the other hand, Rtp values, on average, varied only slightly (AAD = 0.01). Improvements in Rsp values indicate that observed and simulated values became closer for each station after the automatic calibration procedure.
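As an illustration of how the three correlations and the difference relative to the reference simulation could be computed, a minimal Python sketch follows. The definitions used are assumptions: Rtp is taken as the mean of the per-station temporal correlations, Rsp as the correlation between station means, and Rgl as the correlation over all pooled pairs. The function names, the minimum of three pairs per station and the sample data are likewise hypothetical.

```python
import math
from collections import defaultdict

def pearson_r(x, y):
    """Pearson correlation; undefined (division by zero) for constant series."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def three_correlations(records):
    """records: iterable of (station_id, observed, simulated) tuples."""
    by_station = defaultdict(list)
    for station, obs, sim in records:
        by_station[station].append((obs, sim))

    # Temporal (assumed): average of correlations computed station by station.
    per_station = [
        pearson_r([o for o, _ in pairs], [s for _, s in pairs])
        for pairs in by_station.values()
        if len(pairs) >= 3
    ]
    rtp = sum(per_station) / len(per_station)

    # Spatial (assumed): correlation between station mean values.
    mean_obs = [sum(o for o, _ in p) / len(p) for p in by_station.values()]
    mean_sim = [sum(s for _, s in p) / len(p) for p in by_station.values()]
    rsp = pearson_r(mean_obs, mean_sim)

    # Global (assumed): correlation over all observed/simulated pairs pooled.
    all_pairs = [pair for p in by_station.values() for pair in p]
    rgl = pearson_r([o for o, _ in all_pairs], [s for _, s in all_pairs])
    return rtp, rsp, rgl

def average_difference(calibrated, reference):
    """Change in the mean of (Rtp, Rsp, Rgl) relative to the reference run."""
    return sum(calibrated) / 3.0 - sum(reference) / 3.0

# Illustrative records only: simulated values are a linear function of observed ones.
observed = {"A": [10.0, 20.0, 30.0], "B": [40.0, 50.0, 70.0], "C": [5.0, 15.0, 35.0]}
records = [(st, o, 2.0 * o + 1.0) for st, values in observed.items() for o in values]
rtp, rsp, rgl = three_correlations(records)
print(rtp, rsp, rgl)
```

Under these assumed definitions, a gain in Rsp with little change in Rtp would mean that the calibrated model reproduces differences between stations better, without changing much how well it follows the variation in time at each station.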

Fazenda Ouro Fino station
During experiment E10, MGB-SED was calibrated using ENS, KGE and Rtp statistics for the Fazenda Ouro Fino station (area ~ 6.438 km²). Results showed that Rtp after calibration remained equal to 0.64; KGE increased from -0.19 to 0.52; and the most significant result of the experiment was the enhancement in the ENS coefficient, which increased from -0.44 to 0.44. Figure 6 shows SSC values observed and simulated at the Fazenda Ouro Fino station, and it is possible to note that simulated values after calibration were smaller. That is due to the characteristics of the employed metrics (e.g. ENS) combined with the optimization algorithm, which jointly aim to minimize absolute errors between observed and simulated data.

The values within the cells represent the increase (green) or decrease (red) in relation to the average of the three correlations (temporal-Rtp, spatial-Rsp and global-Rgl) when compared to the reference simulation values (without calibrating): SSC - 0.51; SSR - 0.63; Turb. - 0.65; TSS - 0.65. The results of the main diagonal (in bold) refer to the calibration process while the others refer to the validation process, both performed for the 1997-2010 period.
Results from experiment E10 are important in showing that the choice of metrics and calibration for a single station significantly improved the MGB-SED performance. Furthermore, results indicate that the achievable improvements also depend on the number of stations and, especially, on the amount of data available at each of them. For instance, Fazenda Ouro Fino is the station with the largest amount of available data, while a large part of the stations used in the other experiments have only four observations per year.
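The statistics used in experiment E10 can be sketched in Python as follows. This is a minimal illustration, assuming the usual Nash-Sutcliffe definition for ENS and the original Kling-Gupta formulation (correlation, variability ratio and bias ratio); the exact variants used with MGB-SED are not stated in this section, and the function names and sample values are hypothetical.

```python
import math

def pearson_r(obs, sim):
    """Pearson correlation between observed and simulated series."""
    n = len(obs)
    mo = sum(obs) / n
    ms = sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    vo = sum((o - mo) ** 2 for o in obs)
    vs = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(vo * vs)

def nash_sutcliffe(obs, sim):
    """ENS = 1 - sum of squared errors / variance of observations about their mean."""
    mo = sum(obs) / len(obs)
    sq_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sq_dev = sum((o - mo) ** 2 for o in obs)
    return 1.0 - sq_err / sq_dev

def kling_gupta(obs, sim):
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)."""
    n = len(obs)
    mo = sum(obs) / n
    ms = sum(sim) / n
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim) / n)
    r = pearson_r(obs, sim)
    alpha = ss / so  # ratio of standard deviations (amplitude of variation)
    beta = ms / mo   # ratio of means (bias)
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

# Illustrative SSC values in mg/L (not from the study).
obs = [10.0, 20.0, 30.0, 40.0]
sim_overestimated = [2.0 * o for o in obs]  # perfectly correlated but twice too large

print(pearson_r(obs, sim_overestimated))
print(nash_sutcliffe(obs, sim_overestimated))
print(kling_gupta(obs, sim_overestimated))
```

In this example the correlation is perfect while ENS and KGE are negative, because both penalize the overestimated magnitudes. This is consistent with the behaviour described for E10, where Rtp stayed unchanged while ENS and KGE improved as the simulated values became smaller.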

Summary of analyses
When metric values were larger during calibration for a specific dataset, metric values were, in general, smaller for the other datasets during validation. This indicates that, although all types of data carry information related to SS, there are divergences in the methods used to estimate these data and in how they represent the spatio-temporal dynamics of sediments within the basin. However, the analyses carried out from the experiments allowed for a few important considerations, outlined as follows:
• Increasing the number of sub-basins tends to improve results due to the better representation of the basin's heterogeneity;
• Calibration performed for a station with a long and dense data time series, using metrics that represent the correlation, bias and amplitude of variation, tends to provide better estimates of simulated SSC;
• Replacing SSC values by their logarithms increased metric values both in calibration and in validation for SSC data, while it generally decreased the metric values for SSR, turbidity and TSS during validation, as well as in turbidity and TSS calibration;
• Including a background concentration increased the metric values during calibration for all datasets, especially for SSC data;
• Reducing the search space around the standard calibrated parameter values (α = 11.8, β = 0.56 and TKS unchanged), combined with a large number of iterations, resulted in higher average correlation values than when the search space was larger.

DISCUSSIONS

Observational data for calibration
Several limitations and uncertainties were present in the calibration and validation processes using the MGB-SED model, and these influenced the results obtained. Firstly, the observed data carry uncertainties associated with the data themselves and with the way they were acquired (OP DE HIPT et al., 2017), and although all of them are related to suspended sediments, the approaches and methods used differ. Methods employed for SSC acquisition usually consider only inorganic sediments in water suspension (BOITEN, 2008). Turbidity may be influenced by suspended and/or dissolved organic matter in the water (ASTM, 2003). Reflectance, however, considers all suspended matter in the water that interacts with solar radiation (JENSEN, 2009).
According to Morris and Fan (1998), sampling and analysis programs are usually inadequate for determining long-term sediment loads. That is because SS measurements, used to validate the models, are generally scarce for periods of high stream flows and catastrophic events. One of the techniques used in those situations is the extrapolation of a fitted curve to a period beyond that of the observed data. This extrapolation is a possible source of errors and uncertainties (MORRIS; FAN, 1998). Figure 6 illustrates that the MGB-SED model, as applied in the Doce River basin, could not represent the highest SSC peaks. This could be related to the non-representation of landslides by the model, which are frequent in the basin, mainly in the Suaçuí Grande, Santa Maria do Doce and Caratinga basins (PIRH, 2010). Another specific source that contributes to the increase in sediment load is mining (LOBO et al., 2016). That activity is historically present in the basin (HORA et al., 2012) and is carried out in several areas, primarily in the headwaters of the Carmo and Piracicaba rivers.
Experiment E7 (SSCbg) was performed because some natural processes related to sediments were not being represented by the model, which consequently was not adequately estimating SSC values during the dry season. This indicates that during periods without precipitation, the Doce River has sediment sources other than those originating on the hillslopes and delivered to the channel. These sources could be the erosion of sediment bars (FRYIRS, 2013), erosion of the riverbed and the river banks (HOOKE, 2003) or even anthropogenic activities (LOBO et al., 2016). In the context of sediment transport connectivity, sediment bars (Figure 7) may have an important role in supplying the channel with sediments (FRYIRS, 2013). Most of these sediments correspond to the silt and clay fractions that, due to their large amount in the basin (LIMA et al., 2005), may be partly deposited during the rainy season and remain stored in the pores of sandbars. During the dry season, these fine sediments may be mobilized by flows with low SSC. Hooke (2003) mentions that a stable downstream river segment is more active, although coarse materials are not enough to supply the suspended sediment deficit, which causes fine sediments to be eroded from river banks. Coarse loads are transported especially at high water discharges, while transport of fine loads decreases proportionally (LIN et al., 2017).

The MGB-SED model and its structure
According to Morgan (2005), the difficulty of obtaining an exact fit between observed and calculated data reflects the uncertainty of the predictions performed by models. The author remarks that uncertainties originate from (i) errors in measured values; (ii) high spatial variability of some input parameters that cannot be properly represented by a single value; (iii) the need to estimate some parameter values that cannot be easily measured; and (iv) errors in the model structure or the operating equations, particularly where empirical equations are not used to describe physical processes. At a deeper level of analysis, the author mentions that there is still considerable uncertainty even regarding the nature of the mechanisms involved in soil particle detachment through surface runoff.

RBRH, Porto Alegre, v. 24, e26, 2019
Oliveira and Quaresma (2017), while analyzing the 56994500 (Colatina) fluviometric station (~88% of the Doce River basin drainage area), concluded that 63% of the variation in suspended sediment load within the Doce River basin can be explained by runoff, and the remaining 37% by other factors, such as rain intensity, vegetation cover and land use. Rainfall, considered one of the main forcings in sediment models, also has great uncertainty in its measurements (XUE; CHEN; WU, 2014). In MGB-SED, rainfall values are interpolated to unit catchment centroids, which may lead rainfall values to be underestimated in some cases and overestimated in others. Besides that, in the MUSLE equation, rain intensity factors are replaced by surface runoff and a peak flow to represent the maximum runoff energy acting on the soil. However, peak flow values are difficult to estimate (KINNELL; RISSE, 1998), and therefore a simplified assessment was performed in MGB-SED.
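To illustrate why interpolating gauge rainfall to a catchment centroid can under- or overestimate local values, the sketch below uses inverse-distance weighting in Python. The choice of inverse-distance weighting, the exponent of 2, the function name and the gauge values are assumptions for illustration; this section states only that rainfall is interpolated to the centroids.

```python
import math

def idw_rainfall(centroid, gauges, power=2.0):
    """Interpolate rainfall at a centroid from gauges by inverse-distance weighting.

    centroid: (x, y) coordinates of the unit catchment centroid.
    gauges: list of ((x, y), rainfall_mm) tuples.
    power: distance exponent (assumed value, not taken from the study).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (gx, gy), rain in gauges:
        dist = math.hypot(centroid[0] - gx, centroid[1] - gy)
        if dist == 0.0:
            return rain  # centroid coincides with a gauge
        weight = 1.0 / dist ** power
        weighted_sum += weight * rain
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical gauges: a dry one and one that recorded an intense local storm.
gauges = [((0.0, 0.0), 10.0), ((2.0, 0.0), 30.0)]
print(idw_rainfall((1.0, 0.0), gauges))
print(idw_rainfall((0.5, 0.0), gauges))
```

Because the interpolated value is a weighted average, it always lies between the lowest and highest gauge values. An intense storm recorded at one gauge is therefore smoothed at neighbouring centroids, while catchments that actually stayed dry may receive some rainfall.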
MUSLE is an empirical equation in the form of a power function, offering a simple way to represent the full complexity of sedimentological processes. Sediment retention/deposition processes in the landscape are not represented by a specific mathematical formulation, but are taken into account in the equation (WILLIAMS, 1975). Shen, Chen and Chen (2012), in an uncertainty analysis of the SWAT model, which also employs MUSLE, demonstrated that sediment simulations presented higher uncertainty than water discharge simulations, and that uncertainty becomes even higher during rainy seasons. According to the authors, this could be related to the dependence of the sediment model on the hydrological model.
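For reference, the power-function form mentioned above can be written as follows. This is the general MUSLE form from Williams (1975), with the standard coefficient and exponent quoted in this study (α = 11.8, β = 0.56). The symbol names are chosen here for illustration, and the exact variant implemented in MGB-SED may differ.

```latex
% General MUSLE form (Williams, 1975); symbols are illustrative.
% Y    : sediment yield of the event (t)
% Q    : surface runoff volume (m^3)
% q_p  : peak flow rate (m^3/s)
% K, LS, C, P : soil erodibility, topographic, cover and practice factors
\begin{equation}
  Y = \alpha \,\left( Q \, q_p \right)^{\beta} \, K \cdot LS \cdot C \cdot P,
  \qquad \alpha = 11.8, \quad \beta = 0.56
\end{equation}
% Sensitivity implied by the power form: doubling the product Q q_p
% multiplies the yield by 2^{\beta} = 2^{0.56} \approx 1.47.
```

Since β is less than 1, the yield grows less than proportionally with the runoff term, and errors in the estimated peak flow propagate to the sediment yield through this exponent.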
Uncertainties in the model parameters are practically unavoidable (SHEN; CHEN; CHEN, 2012). Hence, many model users try to overcome uncertainty in the input parameters by splitting an observed data series into two periods, one for calibration and another for validation. However, calibration may not solve all uncertainty issues in the modelling context, for it cannot be generalized to all environmental conditions (MORGAN, 2005), and the model parameters have acceptable ranges of values within which they maintain their function.

CONCLUSIONS
Several automatic calibration and validation experiments for the MGB-SED model were performed using suspended sediment concentration (SSC), spectral surface reflectance (SSR) in the red band, turbidity and total suspended solids (TSS). We sought to investigate the applicability of surrogate data sources in the calibration and validation of a large-scale sediment model, given the scarcity of observed SSC data. Model calibration and validation procedures were carried out using independent data in each one for a given period. According to the literature review, this is one of the first studies to adopt this kind of approach.
After performing 37 automatic calibration and 111 validation tests, the main conclusions were:
• SSR, turbidity and TSS data have the potential to enhance the performance of the MGB-SED model. Generally, after MGB-SED calibration, the model performance improved, with positive changes in correlation values. During validation, the model performance was generally poorer than in calibration;
• Among the surrogate data used in this study, the best calibration results were obtained in experiments E2 and E3 (which used 5 and 17 sub-basins, respectively), in which the average correlation increased by 0.12 for SSR. On the other hand, traditional SSC data showed the best result in experiment E10, where the ENS coefficient increased from -0.44 to 0.44 at the Fazenda Ouro Fino station. This station has the highest number of observed SSC data, which highlights the importance of using in situ measurements with a high sampling frequency, even when SSR data are available.
In summary, the results indicate that in basins devoid of in situ measurements, remote sensing data may be a powerful alternative in calibration and validation processes, enhancing the large-scale sediment model performance.
Furthermore, the experiments and the comparisons between observed and simulated data made it possible to identify opportunities for improving the sediment model structure. To represent more fully what happens in the environment, new processes should be included in the MGB-SED model. These enhancements will be the goal of future studies.

Figure 1. (a) Doce River basin, main rivers, and locations of ANA and CEMIG suspended sediment concentration monitoring stations, IGAM water quality stations, and virtual surface reflectance stations; (b) Long-term monthly average (1970-2010) hyetograph and hydrograph at stations 1940006 and 56994599, respectively, located in Colatina-ES.

Figure 2. Potential locations for red band surface reflectance extraction in the Doce River basin (0.64 µm-0.67 µm) and virtual stations created from Landsat 5/TM images. Satellite images in the natural composition show details of reflectance extraction places for stations Piranga, Suaçuí and Linhares.

Figure 3. Sensitivity analysis of suspended sediment concentration simulated by MGB-SED for α parameter changes in Piranga-MG.

Figure 4. Sensitivity analysis of suspended sediment concentration simulated by MGB-SED for β parameter changes in Piranga-MG.

Figure 5. Sensitivity analysis of suspended sediment concentration simulated by MGB-SED for TKS parameter changes in Piranga-MG.

Figure 6. SSC values calculated and observed at Fazenda Ouro Fino (CEMIG) station. Calibrated SSC values were calculated from experiment E10.

automatic calibration algorithm was employed jointly with the MGB-SED (BUARQUE, 2015) model to carry out 10 experiments, which resulted in 37 automatic calibrations and 111 validation tests.

Table 1. Summary of the application of sediment models that use automatic calibration. This table does not exhaust all published studies on this subject. The last study in this list refers to the current study.

Table 2. Parameters used for sediment yield estimation through MUSLE.

Table 3. Experiments for calibrating and validating the MGB-SED hydrosedimentological model with different data sources.

Table 6. Result comparison for experiments E8 (smaller search space and larger Imaxgen) and E9 (small α and β values).

Table 7. Detailed values of metrics found through experiment E8 (smaller search space and larger Imaxgen).