ESTIMATION OF THE CONCENTRATION OF SUSPENDED SOLIDS FROM TURBIDITY IN THE WATER OF TWO SUB-BASINS IN THE DOCE RIVER BASIN

Given the relationship between total suspended solids (TSS) concentration and water turbidity, and given that turbidity analysis is faster and less expensive, this study aimed to obtain mathematical models to estimate TSS concentration from turbidity values for waters of the Doce river basin. For this purpose, the water quality database of the Minas Gerais Institute of Water Management (IGAM) was used. The data were pre-treated using the adjusted boxplot technique, followed by curve fitting for the different management units and rainfall regime periods. The possibility of a single curve was then tested using the dummy variable technique. The adjusted boxplot technique proved useful for environmental data. The linear relationships had R² values that were, as a rule, higher than 0.6; however, it was not possible to develop a single model. It is concluded that the generated models fitted the data well and can be used to predict TSS concentration as a function of turbidity. However, each management unit in each rainfall regime period has particularities that were reflected in the prediction models.


INTRODUCTION
Water does not occur in the environment in purely molecular form (H2O), owing to its solvent properties and its ability to transport particles. It contains compounds of natural and anthropogenic origin (von Sperling, 2014), such as salts, metals, microorganisms and organic matter, among others. These constituents determine the characteristics of the water (von Sperling, 2014) and, therefore, its classification according to its intended uses.
In this sense, an important parameter for the analysis of water quality is the concentration of total suspended solids (TSS), because these solids are good indicators of the physical and esthetic degradation of surface water quality and of the presence of other pollutants (Hannouche et al., 2011; Kusari & Ahmedi, 2013; Rügner et al., 2013).
High TSS concentrations in water are associated with reduced photosynthetic activity, because the passage of sunlight is blocked, and with the transport of pollutants such as phosphorus, mercury and hydrophobic organic compounds (Rügner et al., 2013).
TSS can also deplete dissolved oxygen (DO), because the greater absorption of solar energy by these solids raises the surface water temperature (Naveedullah et al., 2016). Large amounts of suspended solids (SS) can affect the reproduction of fish and invertebrates by obstructing breeding habitat (Naveedullah et al., 2016). In addition, TSS can shelter pathogenic microorganisms and may be associated with bacterial contamination (Bakan et al., 2010; Henning et al., 2014).
As can be seen, the presence of suspended solids in water causes several kinds of damage to the aquatic environment. Periodic monitoring of water quality is therefore of paramount importance (Srivastava & Kumar, 2013), although it is not an easy practice (Goher et al., 2014). Quantifying TSS concentration in water may not be a simple task, because of the time and/or equipment required or because of test failures. It is therefore necessary to develop simple, practical and cost-effective methods for the proper management of water resources in order to guarantee their multiple uses.
Engenharia Agrícola, Jaboticabal, v.38, n.5, p.751-759, sep./oct. 2018
Following this line of reasoning, another parameter that can indicate TSS concentration is turbidity, which is easy to measure, because SS are the main factor responsible for turbidity (Hannouche et al., 2011).
Turbidity is quantified by the nephelometric method, which compares the intensity of light scattered by the sample under defined conditions with the intensity of light scattered by a standard reference suspension. The greater the intensity of the scattered light, the greater the turbidity of the sample under analysis (APHA et al., 2017). The equipment used for the reading is the turbidimeter, which consists of a nephelometer, and turbidity is expressed in nephelometric turbidity units (NTU). In in situ monitoring probes, the sensor consists of an infrared nephelometric turbidimeter.
There have been previous attempts to use turbidity measurements to estimate TSS concentration (e.g., Suk et al., 1998; Hannouche et al., 2011; Kusari & Ahmedi, 2013), but there is no universal relation between turbidity and TSS. Specific relations must be developed for each site (Kusari & Ahmedi, 2013).
Given the direct relationship between the concentration of suspended solids in water and its turbidity, and given that turbidity analysis is faster and cheaper, this study aimed to obtain models to estimate TSS concentration from turbidity data in the waters of two important sub-basins of the Doce river, those of the Piranga and Piracicaba rivers.

Study area
The watershed of the Doce River is located in the Southeast region of Brazil, between the parallels 17°45' and 21°15' S and the meridians 39°30' and 43°45' W, and is shared by the states of Minas Gerais (86% of the drainage area) and Espírito Santo (Ecoplan Lume Consortium, 2010). With a population of more than 3.5 million inhabitants, covering 230 municipalities and a drainage area of approximately 86,715 km², this basin is part of the Southeast Atlantic hydrographic region (Ecoplan Lume Consortium, 2010).
Its springs are located in the state of Minas Gerais, in the Mantiqueira and Espinhaço mountains. Its waters drain to the town of Regencia, in the state of Espírito Santo, where they flow into the Atlantic Ocean (Figure 1). The basin has two rivers under federal jurisdiction, the Doce river and the Jose Pedro river, a tributary of the Manhuaçu river (Ecoplan Lume Consortium, 2010). The Doce river basin was chosen because of its great socioeconomic and political importance, and because it is a basin with intense economic activity and population occupation.
Two water resources planning and management units (WRPMUs) of the Doce River basin with different predominant activities were chosen for comparison: the WRPMU of the Piracicaba River, where industrial activity predominates, and the WRPMU of the Piranga River, where agrarian activity predominates.

Characteristics of the data
In order to obtain models for estimating TSS concentration from turbidity data, a survey was made of the surface water quality database available at the "Portal InfoHidro" of the Instituto Mineiro de Gestão das Águas (IGAM), which has monitored the quality of surface water and groundwater in Minas Gerais since 1997 through the "Águas de Minas Project", generating data that are indispensable for the correct management of water resources.
Currently, IGAM has sixty-four (64) sampling stations in its qualitative monitoring network in the Doce river basin; 15 of these stations are located in the WRPMU of the Piranga river and 13 in the WRPMU of the Piracicaba river, that is, 44% of the stations in the basin are in these important WRPMUs.
To evaluate water quality, 56 parameters are analyzed, among them turbidity and TSS concentration. Sample collection and the respective analyses are carried out by the National Service of Industrial Learning, Technological Center Unit of Minas Gerais (SENAI-CETEC).
The sampling frequency varied between semiannual, quarterly and monthly. In all, there were 3062 collections in the entire Doce river basin between 1997 and 2014. Of these collections, 793 were made in the WRPMU of the Piranga river and 688 in the WRPMU of the Piracicaba river (48% of the total).

Pre-processing of data
Environmental data usually contain censored and missing values, as well as outliers (Sabino et al., 2014). To avoid the problems that these types of values may cause in statistical analyses, the database must be treated beforehand.
Thus, in this study, the methodology presented by Sabino et al. (2014) was used to treat the censored values. According to this methodology, values below the minimum detection limit are replaced by half the minimum detection limit, while values above the maximum measurable value are kept as reported by the responsible agency.
In the case of missing data, any sample lacking one or both of the two studied parameters was excluded. Since this study sought a relation between TSS and turbidity, a sample in which one or both parameters were not measured provides no pair to be included in the regression models.
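The two treatment rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function names and the detection limit of 2.0 mg/L are hypothetical and are not taken from the IGAM database or from Sabino et al. (2014).

```python
def substitute_censored(value, below_limit, detection_limit):
    """Replace a result reported as '< detection limit' by half that limit."""
    return detection_limit / 2.0 if below_limit else value


def drop_incomplete(samples):
    """Keep only samples with both turbidity and TSS, i.e. usable (x, y) pairs."""
    return [(turb, tss) for turb, tss in samples
            if turb is not None and tss is not None]


TSS_LIMIT = 2.0  # hypothetical detection limit in mg/L, for illustration only

# Hypothetical records: (turbidity in NTU, TSS in mg/L, TSS reported below limit?)
raw = [
    (12.0, 15.0, False),
    (1.5, TSS_LIMIT, True),   # TSS reported as "< 2.0"
    (None, 8.0, False),       # turbidity missing
    (30.0, None, False),      # TSS missing
]

treated = [
    (turb, tss if tss is None else substitute_censored(tss, flag, TSS_LIMIT))
    for turb, tss, flag in raw
]
pairs = drop_incomplete(treated)
print(pairs)  # [(12.0, 15.0), (1.5, 1.0)]
```

Only complete pairs reach the regression step, which is why a sample with a single measured parameter is discarded rather than imputed.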
To investigate and subsequently eliminate the outliers, the adjusted boxplot method proposed by Vandervieren & Hubert (2004) was used.
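As a rough sketch of how an adjusted boxplot differs from Tukey's fences, the Python code below computes both sets of limits. Several points are assumptions rather than the procedure used in this study: the fence constants (-4 and 3 for non-negative skewness, -3 and 4 otherwise) follow the commonly cited form of the Hubert and Vandervieren adjusted boxplot and may differ from the 2004 version; the medcouple is computed with a naive O(n²) loop that simply skips tied pairs at the median; and the sample values are invented.

```python
import math
from statistics import median, quantiles


def medcouple(values):
    """Naive medcouple, a robust skewness measure in [-1, 1].

    Simplification: pairs with x_i == x_j (ties at the median) are skipped
    instead of using the special tie kernel of the full definition.
    """
    x = sorted(values)
    m = median(x)
    lower = [v for v in x if v <= m]
    upper = [v for v in x if v >= m]
    kernel = [((xj - m) - (m - xi)) / (xj - xi)
              for xi in lower for xj in upper if xj != xi]
    return median(kernel) if kernel else 0.0


def tukey_fences(values):
    """Original boxplot limits: Q1 - 1.5 IQR and Q3 + 1.5 IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def adjusted_fences(values):
    """Skewness-adjusted limits (constants assumed, see the lead-in)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    mc = medcouple(values)
    if mc >= 0:
        return (q1 - 1.5 * math.exp(-4 * mc) * iqr,
                q3 + 1.5 * math.exp(3 * mc) * iqr)
    return (q1 - 1.5 * math.exp(-3 * mc) * iqr,
            q3 + 1.5 * math.exp(4 * mc) * iqr)


def outliers(values, fences):
    low, high = fences
    return [v for v in values if v < low or v > high]


# Invented right-skewed values, standing in for turbidity or TSS readings
skewed = [3, 3, 3, 4, 4, 5, 6, 9, 14, 40]
print(outliers(skewed, tukey_fences(skewed)))     # [40]
print(outliers(skewed, adjusted_fences(skewed)))  # []
```

For symmetric data the medcouple is zero and both methods give the same limits; for right-skewed data the upper limit is stretched, so fewer large values are flagged.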

Statistical analyses
After data pre-treatment, linear regression models were fitted for the two WRPMUs, Piranga and Piracicaba. The fitted models took the rainfall regime of the units into account; that is, simple linear regression models were fitted for the rainy season, which extends from October to March with rainfall totals of 800 to 1300 mm, and for the dry period, which extends from April to September with rainfall totals of 150 to 250 mm (Ecoplan Lume Consortium, 2010).
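The fitting step can be illustrated with a minimal Python sketch that assigns each sample to a rainfall period by its month and fits an ordinary least squares line per period. The function names and the sample values are hypothetical; only the month ranges of the two periods come from the text.

```python
from collections import defaultdict


def rainfall_period(month):
    """Rainy season: October to March; dry period: April to September."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return "rainy" if month >= 10 or month <= 3 else "dry"


def fit_simple_linear(x, y):
    """Ordinary least squares fit of y = slope * x + intercept, with R²."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical samples: (month, turbidity in NTU, TSS in mg/L)
samples = [
    (1, 40.0, 45.0), (2, 80.0, 78.0), (11, 120.0, 115.0), (12, 60.0, 62.0),
    (5, 10.0, 12.0), (6, 6.0, 9.0), (7, 15.0, 16.0), (8, 8.0, 11.0),
]

groups = defaultdict(lambda: ([], []))
for month, turb, tss in samples:
    xs, ys = groups[rainfall_period(month)]
    xs.append(turb)
    ys.append(tss)

for name, (xs, ys) in groups.items():
    slope, intercept, r2 = fit_simple_linear(xs, ys)
    print(f"{name}: TSS = {slope:.2f} * Turbidity + {intercept:.2f} (R2 = {r2:.2f})")
```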
With the models fitted, we tested the equality of the rainy/dry regression models and of the Piranga/Piracicaba models; that is, we evaluated whether the estimated parameters of the equations are statistically equal, in order to obtain a single equation. According to Regazzi & Silva (2010), this is a very frequent practice in regression analysis. For this purpose, the dummy variable method was used.
The dummy variable method and the model identity method are the most relevant for comparing linear regression equations (Magalhães & Andrade, 2009). The two methods give very similar results; however, with dummy variables there is a lower probability of Type I error (rejecting a true null hypothesis) and Type II error (failing to reject a false null hypothesis) (Magalhães & Andrade, 2009), which is why it was chosen for this study. The method consists of including additive and multiplicative binary variables, the dummy variables, which take the values 0 and 1. New models including these variables are then fitted and their coefficients are tested (Student's t-test) (Magalhães & Andrade, 2009).
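A minimal sketch of this procedure is given below in plain Python. It fits TSS on an intercept, turbidity, the dummy D (additive term) and the product D × turbidity (multiplicative term), and returns the coefficient estimates with their t statistics. The data are synthetic, the function names are hypothetical, and p-values are not computed; for large samples, |t| above roughly 2 corresponds approximately to significance at the 0.05 level.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dummy_regression(turbidity, tss, dummy):
    """Fit TSS = b0 + b1*T + b2*D + b3*(D*T); return (coefficients, t statistics)."""
    rows = [[1.0, t, d, d * t] for t, d in zip(turbidity, dummy)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, tss)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(rows, tss)]
    s2 = sum(e * e for e in resid) / (len(rows) - k)  # residual variance
    t_stats = [b / math.sqrt(s2 * inv[i][i]) for i, b in enumerate(beta)]
    return beta, t_stats


# Synthetic data: two groups with the same slope (1.0) and intercepts 2.0 and 7.0
turb = list(range(1, 11)) * 2
noise = [0.1 if i % 2 == 0 else -0.1 for i in range(10)] * 2
d = [0] * 10 + [1] * 10
tss = [2.0 + 5.0 * di + 1.0 * t + e for t, di, e in zip(turb, d, noise)]

beta, t_stats = dummy_regression(turb, tss, d)
for name, b, t in zip(["intercept", "turbidity", "D", "D:turbidity"], beta, t_stats):
    print(f"{name:12s} estimate = {b:8.4f}   t = {t:10.3f}")
```

In this synthetic case the coefficient of D has a large t statistic while the interaction term is essentially zero, which is the pattern of two parallel lines with different intercepts.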

Pre-processing of data
Initially, the boxplot method was used (here called the original boxplot); however, a large number of values (approximately 15%) were classified as discrepant, and removing them from the analyses could compromise the reliability of the results.
The original boxplot method, known as "Tukey fences" (Lyra, 2014), assumes that the data follow a normal distribution. Tests of adherence to the normal and lognormal distributions (Anderson-Darling, Kolmogorov-Smirnov, Shapiro-Wilk and Ryan-Joiner) showed no adherence to the distributions analyzed (p-value ≤ 0.05) for the data of either WRPMU. Vandervieren & Hubert (2004) explain that, when conventional methods are applied to skewed (non-normal) data, many points are usually classified as outliers because the cut-off values are derived from the normal distribution. These authors proposed an adjusted boxplot whose lower and upper limits are more robust to departures from normality and that can be applied to all distributions (Lyra, 2014). To construct the tolerance interval in the adjusted boxplot method, a robust measure of skewness is taken into account (Vandervieren & Hubert, 2004). We therefore chose this method to analyze discrepant values. Figure 2 shows the difference between the tolerance intervals of the original boxplot and the adjusted boxplot for the WRPMU of the Piranga river.
Using the adjusted boxplot, more values are retained in the distribution (approximately 8% of the data were classified as discrepant).
After the elimination of the missing values and the subsequent elimination of the outliers, 730 of the 793 collections made in the WRPMU of the Piranga river remained, 396 from the rainy period and 334 from the dry period; of the 688 collections made in the WRPMU of the Piracicaba river, 613 remained, 302 from the rainy period and 311 from the dry period. This means that approximately 88% of the initial collections remained available for the statistical analysis.

Fitting of regression models
Analysis of variance (ANOVA) of the simple linear regression models shows that turbidity has a significant linear relationship with TSS concentration in all fitted models (p-value ≤ 0.01). Other authors studying the relationship between these two parameters in stream water have also found a linear relationship (e.g., Suk et al., 1998; Daphne et al., 2011; Rügner et al., 2013). Pavanelli & Bigi (2005), using samples prepared in the laboratory with specific solids concentrations, also found a linear relationship between these variables. Bhargava & Mariam (1990), in turn, observed both linear and curvilinear relationships; however, these authors studied the relationship between turbidity and TSS content in suspensions of 4 different types of soil. The difference between the relationships for each type of soil suspension is mainly due to variations in the spectral characteristics of each material studied (Bhargava & Mariam, 1990).
Tables 1 and 2 show the regression coefficient estimates for the rainy and dry periods of the two WRPMUs, Piranga and Piracicaba. The regression coefficients were significant (p-value ≤ 0.01) for all fitted models (Tables 1 and 2). It is also noted that, within the interval analyzed, an increase in turbidity reflects an increase in TSS concentration, a fact also reported by the authors mentioned above.
According to Rügner et al. (2013), slopes between 1 and 2 have been reported in the literature, but these authors also cite other studies that obtained slopes from 0.75 to 3.3. Suk et al. (1998), monitoring TSS concentration and turbidity in a river over 25 different time periods, observed slopes between 0.2 and 2.8. Thus, the coefficients obtained in the present study are within the ranges found in the literature.
In some samples, although the turbidity value was high, the TSS concentration was low, as can be observed in the graphs of Figure 3. This may be a consequence of a high concentration of colloidal solids (muddy water) in the collected samples. These solids can contribute to turbidity when the water is turbulent; however, they are not counted in the TSS concentration, since suspended solids range in size from 10⁰ μm to 10³ μm, whereas colloidal solids range from 10⁻³ μm to 10⁰ μm (von Sperling, 2014).
The opposite was also observed: in some samples, although the turbidity value was low, the TSS concentration was high. According to Metcalf & Eddy et al. (2014), turbidity measurements, especially at low values, show a high degree of variability depending on the light source and the measurement method, which could explain what happened with these samples. Daphne et al. (2011) also found low turbidity values relative to the TSS concentration in river samples, and suggested that this may be due to a fraction of fine sand in the samples that quickly settled below the zone monitored by the turbidimeter. This was identified by observing the sand fraction retained together with other suspended solids during filtration through a 0.6 μm glass fiber filter when measuring TSS concentration (Daphne et al., 2011). In both cases, the disproportion between turbidity and TSS may also arise because turbidity is an optical property, subject to the interference of particle shape, size, refractive index and density, as well as water color, factors that have little to do with the concentration of the suspended material (Bhargava & Mariam, 1990; Gippel, 1995; Daphne et al., 2011; Hannouche et al., 2011; Rügner et al., 2013). In other words, the relationship between TSS and turbidity depends on variations in the spectral characteristics of the suspended material (Daphne et al., 2011).
Reading and/or analysis errors may also have caused the above problems. Gippel (1995) shows that turbidity data measured in laboratories can be compromised by particle agglomeration in the sample conditioning vessel. Suk et al. (1998), in turn, report that, for in situ monitoring, the collected data may be compromised by high biological growth rates on the instruments. Pavanelli & Bigi (2005) demonstrated that storing samples for a long period (1 month) at room temperature and exposed to daylight increases water turbidity owing to the development of microorganisms and algae, aggregation of clays and flocculation, in addition to other biological reactions and gas production.
Following the evaluation of the fitted models, we checked for possible deviations from the model assumptions, and therefore the adequacy of the fit, by investigating whether the residuals are normally distributed. For both WRPMUs, in the rainy and dry periods, the residuals did not adhere to the normal distribution (p-value ≤ 0.05) in any of the tests (Anderson-Darling, Shapiro-Wilk, Kolmogorov-Smirnov, Ryan-Joiner). According to Minitab (2014), p-value accuracy is sensitive to non-normal residuals when dealing with small samples (fewer than 15). Therefore, because the samples are sufficiently large, the fit of the models is not compromised by the non-normality of the residuals.
Finally, the coefficient of determination (R²) for the rainy and dry periods in the Piranga river was 0.71 and 0.87, respectively. For the WRPMU of the Piracicaba river, R² was 0.68 in the rainy season and 0.54 in the dry period. The R² values are high, except for the dry period of the Piracicaba river unit, which indicates that the simple linear model fitted well the data sets analyzed for the rainy and dry periods of the two WRPMUs. However, according to Barros Neto et al. (2007), models with R² < 0.60 should be used only as trend indicators, never for predictive purposes. Therefore, only the model fitted for the dry period in the Piracicaba WRPMU cannot be used to predict TSS concentration. Rügner et al. (2013) cite R² values in the literature varying from 0.73 to 0.99; however, these authors, studying the relationship between TSS and turbidity in basins in Southeastern Germany, found R² ranging from 0.59 to 0.98. It can be said, then, that the coefficients of determination obtained in this study are comparable to those in the literature. Since several factors influence the relation between TSS and turbidity, as discussed earlier, no general criterion can be established in this regard.
The low R² in the dry period in the Piracicaba WRPMU is due to the fact described previously: many points with high turbidity and low TSS concentration, and also many points with low turbidity and high TSS concentration. When these values were removed, the fit of the model improved, as shown by the increase of R² to 0.78. Suk et al. (1998) observed that low TSS concentrations impaired the fit of the models. Daphne et al. (2011) found that the uncertainty of turbidity increases as TSS increases, with greater consistency up to a maximum TSS concentration of approximately 50 mg L⁻¹; by restricting the range of TSS concentration, they obtained an R² of 0.81. As shown above, when we removed the values that compromised the fit of the model, we obtained an R² close to that obtained by Daphne et al. (2011), but the maximum TSS concentration that allowed greater consistency in the results was approximately 90 mg L⁻¹.
Even as trend indicators, the models would be of great value. As already mentioned, monitoring water quality is very important, but it is not an easy practice (Goher et al., 2014). With models that indicate the trend in the magnitude of a given parameter, if a discrepant value is observed, more accurate analyses can be carried out and measures can then be taken (Kusari & Ahmedi, 2013). This would contribute to a more practical and economically viable process for monitoring water quality, especially in developing countries (Kusari & Ahmedi, 2013; Rügner et al., 2013).

Verification of model equality for prediction of TSS concentration
Tables 3 and 4 show the results obtained using the dummy variable method for the rainy and dry periods, assigning D = 0 to the dry period and D = 1 to the rainy period. For both WRPMUs the coefficient of D is significant. Therefore, there is sample evidence that the hypothesis of equal intercepts of the linear models for the two periods (rainy and dry) does not hold. However, the interaction coefficient D:Turbidity is not significant in either WRPMU, which indicates that the linear models for the two periods are parallel. That is, for both WRPMUs the rainy and dry models have statistically equal slopes but different intercepts.
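This interpretation can be made explicit by writing out the dummy variable model for each value of D. The symbols β₀ to β₃, T (turbidity) and ε (error term) are notation introduced here for illustration and are not taken from the original tables.

```latex
\begin{aligned}
\mathrm{TSS} &= \beta_0 + \beta_1 T + \beta_2 D + \beta_3 (D \cdot T) + \varepsilon \\
D = 0 \ (\text{dry}):\quad \mathrm{TSS} &= \beta_0 + \beta_1 T + \varepsilon \\
D = 1 \ (\text{rainy}):\quad \mathrm{TSS} &= (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\, T + \varepsilon
\end{aligned}
```

A significant β₂ means that the intercepts of the two lines differ by β₂, while a non-significant β₃ means that there is no evidence that the slopes differ, so the two lines can be treated as parallel.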
The possibility of equality between the WRPMU models for each period was also tested, assigning D = 1 to the Piranga WRPMU and D = 0 to the Piracicaba WRPMU. In the same way, new models including the dummy variables were fitted to the whole dataset of each period (Piranga and Piracicaba). The results obtained using the dummy variable method are presented in Tables 5 and 6. There are indications that the Piranga WRPMU and Piracicaba WRPMU models are not parallel in either period of the year. However, there is evidence that the hypothesis of equal intercepts of the linear models for the two WRPMUs holds. Thus, it can be said that the Piranga WRPMU and Piracicaba WRPMU models, in both periods of the year, have equal intercepts but are not parallel. It was therefore not possible to fit a single model covering both WRPMUs and both the rainy and dry periods. These results corroborate those of Hannouche et al. (2011), who found variation between the models for drought and flood conditions and also between sites. These authors explain that in urban wastewater or rainwater the characteristics of suspended solids are heterogeneous and variable, so variations in time and space are possible (Hannouche et al., 2011).
According to Daphne et al. (2011), the relation between TSS and turbidity depends on variations in the spectral characteristics of the suspended material, so the relation is unique to each situation, as observed in the present study and in the study by Hannouche et al. (2011). It is therefore necessary to use one model for each situation.
The equations describing the fitted models to predict TSS concentration as a function of turbidity in the Piranga WRPMU during the rainy and dry periods are presented in eqs (1) and (2), respectively:

$$\mathrm{TSS}_{\mathrm{rainy}} = 0.86\,\mathrm{Turbidity} + 9.99 \tag{1}$$

$$\mathrm{TSS}_{\mathrm{dry}} = 0.79\,\mathrm{Turbidity} + 4.36 \tag{2}$$

Where: $\mathrm{TSS}_{\mathrm{rainy}}$ is the concentration of total suspended solids for the rainy period (mg L⁻¹); $\mathrm{TSS}_{\mathrm{dry}}$ is the concentration of total suspended solids for the dry period (mg L⁻¹); and Turbidity is the measured turbidity (NTU).
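As a usage illustration, eqs (1) and (2) can be wrapped in a small Python function. The coefficients are those reported above for the Piranga WRPMU; the function name, the choice of selecting the equation by month (October to March as rainy) and the input check are assumptions made for this sketch.

```python
def predict_tss_piranga(turbidity_ntu, month):
    """Estimate TSS (mg/L) in the Piranga WRPMU from turbidity (NTU).

    Uses eq. (1) for the rainy period (October to March) and eq. (2)
    for the dry period (April to September).
    """
    if turbidity_ntu < 0:
        raise ValueError("turbidity must be non-negative")
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    rainy = month >= 10 or month <= 3
    if rainy:
        return 0.86 * turbidity_ntu + 9.99   # eq. (1)
    return 0.79 * turbidity_ntu + 4.36       # eq. (2)


# Example: a reading of 50 NTU in January (rainy) and in July (dry)
print(round(predict_tss_piranga(50.0, 1), 2))  # 52.99
print(round(predict_tss_piranga(50.0, 7), 2))  # 43.86
```

Such estimates are only meaningful within the range of turbidity values used to fit the models.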
FIGURE 1. Map of the river basin of the Doce river.
FIGURE 2. Comparison between the adjusted boxplot and the original boxplot methods to investigate the existence of outliers in TSS and turbidity values for the Piranga River WRPMU. TSS: total suspended solids. Note: the data are represented as natural logarithms for better visualization of the limits.
FIGURE 3. Scatter plots of TSS concentration as a function of turbidity in water samples collected in the WRPMU of the Piranga river during the rainy (a) and dry (b) periods, and in the WRPMU of the Piracicaba river during the rainy (c) and dry (d) periods.

TABLE 1. Regression coefficients of the simple linear model for TSS concentration as a function of turbidity in the Piranga WRPMU for the rainy and dry periods.

TABLE 2. Regression coefficients of the simple linear model for TSS concentration as a function of turbidity in the Piracicaba WRPMU for the rainy and dry periods.

TABLE 3. Regression coefficients of the new fitted model for TSS concentration in water samples collected in the Piranga WRPMU.

TABLE 4. Regression coefficients of the new fitted model for TSS concentration in water samples collected in the Piracicaba WRPMU.

TABLE 5. Regression coefficients of the new fitted model for TSS concentration in water samples collected in the Doce river basin during the rainy period.

TABLE 6. Regression coefficients of the new fitted model for TSS concentration in water samples collected in the Doce river basin during the dry period.
Note: * significant at the 0.05 probability level; ** significant at the 0.01 probability level; *** significant at the 0.001 probability level; ns: not significant by Student's t-test. D: dummy variable (D = 0, Piracicaba WRPMU; D = 1, Piranga WRPMU).