Spatial-temporal analysis of the surface water quality of the Pará River Basin through statistical techniques

Water quality is a growing concern owing to the recent intensification of urbanization and industrialization. This paper evaluates and compares the surface water quality of the ten sub-basins of the Pará River, located in the São Francisco River Basin, Minas Gerais, and assesses the impact of seasonality and compliance with the limits set by current state legislation. The surface water quality monitoring database of the Institute of Water Management of Minas Gerais (Igam) was used, and 18 parameters were analyzed from a historical series covering 2008 to 2016, totaling 16,651 observations. First, descriptive statistics of the parameters were calculated for each sub-basin separately. Then, for the temporal and spatial analyses, the nonparametric Kruskal-Wallis test was applied, followed by a multiple comparison test at an alpha level of 5%; nonparametric tests were chosen because of the asymmetric behavior of the data. It was thus possible to compare the water quality of the sub-basins in the rainy and dry seasons and to identify which parameters were responsible for the greatest degradation. The compliance analysis showed that every sub-basin was outside the specified range for at least one of the evaluated parameters. Finally, the seasonality analysis revealed significant differences in dissolved oxygen, turbidity, total suspended solids, total solids and water temperature, with water quality worsening in the rainy season for most sub-basins.


INTRODUCTION
Water is a component integrated into the global system, and it has been strongly altered and degraded by anthropic actions. The speed and extent of globalization, allied with socioeconomic development, have increased the demand for water resources, which is reflected in the scarcity and deterioration of springs. Water has become a growing concern not only in terms of the quantity available but, mainly, in terms of its quality and the restrictions on its uses (Barakat et al., 2016; Giri and Qiu et al., 2016; Zeinalzadeha and Rezaei, 2017; Delkash et al., 2018). Ferrier et al. (2001) state that the properties of a water system tend to reflect the combination of geomorphological attributes, with varying direct and indirect influences of climatological aspects and of anthropogenic action in the basins. Hence, the chemical components found in water bodies are variable and correlate with the specific characteristics of each environment, being subject to constant changes in the environmental system, caused mainly by anthropic actions (Kazi et al., 2009).
Many studies on water quality have sought to evaluate the extent of the impacts of anthropization on the qualitative and quantitative aspects of hydrographic basins (Trindade et al., 2016; Pinto et al., 2017; Gebler et al., 2018). However, although much data may be available, this information is often disorganized and scattered among different institutions and public agencies. Monitoring programs also generate extensive data matrices, which are usually difficult to interpret, and identifying the possible causes that influence the occurrence and concentration of a given parameter may become a complex task (Simeonov et al., 2003).
In Minas Gerais, the state Water Management Institute (Igam), through the Project Águas de Minas, has monitored surface water and groundwater quality in the state since 1997. The program provides a historical data series on the water quality of Minas Gerais, which is indispensable to the proper management of the available water resources (IGAM, 2010a).
This database is extensive and difficult to interpret, so immediate conclusions cannot be drawn from a superficial analysis alone. In this sense, statistical methods have become an excellent tool for exploratory interpretation. Studies by Trindade et al. (2016), Christofaro et al. (2017), Costa et al. (2017) and Oliveira et al. (2017) used these statistical tools to explore many objects of study involving the quality of water bodies, pollution and anthropic relations, amongst others. The use of nonparametric tests in these studies presents positive results when the aim is to find significant differences between sampling units and to evaluate the behavior patterns of each studied event.
Rev. Ambient. Água vol. 14 n. 1, e2322 - Taubaté 2019
Thus, this work aims: (i) to characterize and compare the surface water quality of the ten sub-basins of the Pará River, located in the São Francisco River Basin, Minas Gerais, based on monitoring data for the analyzed parameters; (ii) to evaluate compliance with the standards established by current state law for each parameter, according to the legal classification; and, finally, (iii) to evaluate and compare the effect of seasonality on the surface water quality of the ten sub-basins.

Characterization of the study area
The Pará River Hydrographic Basin corresponds to the SF2 Water Resources Planning and Management Unit (UPGRH), located in southwestern Minas Gerais. Its main river rises in the municipality of Resende Costa and extends for 365 km until it flows into the São Francisco River. The basin covers approximately 12,300 km² and includes 35 municipalities. The estimated population of the basin is 700,000 inhabitants, with 12% living in rural areas (IGAM, 2018).
The basin is highly degraded, which directly affects the quality of life of the population. According to Igam reports, the main pressure sources associated with water quality changes in the Pará River basin are: extraction of sand, clay, gravel, limestone and quartz; artisanal mining; extraction of precious stones; extraction and processing of metallic minerals; mining and processing of gold; agriculture; livestock; silviculture; diffuse loads; erosion and aggradation; discharge of domestic sewage; and discharge of industrial effluents (IGAM, 2010a; 2010b).
The spatial arrangement of the water quality monitoring stations along the Pará River Basin is shown in Figure 1. The sub-basins are described in Tables 1 and 2.
Raw data from the observations at each station were organized by sub-basin and split into two seasons, dry and rainy, in order to analyze the impact of seasonality. The delimitation of the sub-basins was established by the Land Use Plan (IGAM, 2016b; 2016c) and was based on the ottocoded (1:50,000) Igam base. This methodology, proposed by Otto Pfafstetter, groups the direct contribution areas of each stretch of the hydrographic network that share the same code. Hence, the hydrographic basins correspond to the aggregation of the areas of hydrographic contribution, known as ottobasins (Pfafstetter, 1989).
Descriptive statistics for the raw data of each parameter were calculated for all sub-basins: minimum and maximum values, mean, standard deviation, geometric mean and percentiles (25th, 50th or median, and 75th).
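The summary described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not the procedure run in Statistica; the function name `describe`, the sub-basin keys and the turbidity readings are hypothetical, and the "inclusive" percentile method is an assumption, since the text does not state which interpolation rule was used.

```python
import statistics


def describe(values):
    """Summary statistics for one parameter in one sub-basin."""
    # Quartiles with the "inclusive" method (an assumption; other
    # interpolation rules give slightly different percentiles).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        # The geometric mean requires strictly positive values.
        "geometric_mean": statistics.geometric_mean(values),
        "p25": q1,
        "median": median,
        "p75": q3,
    }


# Hypothetical turbidity readings (NTU), keyed by sub-basin code.
turbidity = {
    "RBV": [12.0, 35.0, 48.0, 150.0, 22.0],
    "RPX": [8.0, 10.0, 15.0, 9.0, 60.0],
}
summary = {basin: describe(vals) for basin, vals in turbidity.items()}
```

For right-skewed data such as these, the geometric mean and the median sit well below the arithmetic mean, which is one reason to report all three.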
To define the statistical analyses to be conducted, the Shapiro-Wilk normality test was applied at a 5% significance level. Since most parameters do not follow the Normal distribution, nonparametric tests were used to compare the median values, as well as for the analysis of seasonality.
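A minimal sketch of this normality check, assuming SciPy is available, is shown below. The helper name `is_normal` and the simulated lognormal sample are hypothetical; the sample only stands in for the right-skewed concentrations typical of water quality data.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # 5% significance level, as in the text


def is_normal(values, alpha=ALPHA):
    """Shapiro-Wilk test; True means normality is NOT rejected."""
    statistic, p_value = stats.shapiro(values)
    return bool(p_value >= alpha), statistic, p_value


# Hypothetical right-skewed sample standing in for a measured parameter.
rng = np.random.default_rng(42)
skewed = rng.lognormal(mean=2.0, sigma=1.0, size=200)

normal_ok, w_statistic, p_value = is_normal(skewed)
print(f"W = {w_statistic:.3f}, p = {p_value:.2e}, normal: {normal_ok}")
```

When normality is rejected, as for this sample, rank-based tests such as Kruskal-Wallis are the usual alternative.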
To detect significant differences between the concentrations of water quality parameters in the sub-basins, the nonparametric Kruskal-Wallis test was applied to compare multiple independent samples, followed by a multiple comparison test (when applicable) at an alpha level of 5%. Thus, it was possible to identify which sub-basins were more impacted and which parameters were responsible for the worsening of water quality. Box-plots were also generated to better visualize the results. All statistical analyses were performed with the Statistica® 13.0 software.
(1) Ordination class, according to DN COPAM (1998).
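The two-step comparison can be sketched as follows, assuming NumPy and SciPy. The text does not name the post-hoc procedure; Dunn's test on mean ranks with a Bonferroni adjustment is used here as one common choice, so it should be read as an assumption and not as the exact routine run in Statistica. The function name `kruskal_with_dunn`, the group labels and the data are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def kruskal_with_dunn(groups, alpha=0.05):
    """Kruskal-Wallis test, followed by Dunn's pairwise test if significant.

    groups: dict mapping a group label to a sequence of measurements.
    Returns (H, p, {(label_i, label_j): Bonferroni-adjusted p}).
    """
    names = list(groups)
    samples = [np.asarray(groups[name], dtype=float) for name in names]
    h_stat, p_value = stats.kruskal(*samples)

    pairwise = {}
    if p_value < alpha:
        pooled = np.concatenate(samples)
        ranks = stats.rankdata(pooled)  # average ranks for ties
        n_total = len(pooled)
        # Tie correction applied to the variance of the rank differences.
        _, counts = np.unique(pooled, return_counts=True)
        tie_term = (counts**3 - counts).sum() / (12.0 * (n_total - 1))
        base_var = n_total * (n_total + 1) / 12.0 - tie_term
        # Mean rank of each group, sliced from the pooled ranking.
        bounds = np.cumsum([0] + [len(s) for s in samples])
        mean_ranks = [ranks[bounds[i]:bounds[i + 1]].mean()
                      for i in range(len(samples))]
        n_pairs = len(names) * (len(names) - 1) // 2
        for i, j in combinations(range(len(names)), 2):
            se = np.sqrt(base_var * (1 / len(samples[i]) + 1 / len(samples[j])))
            z = (mean_ranks[i] - mean_ranks[j]) / se
            p_adj = min(1.0, 2 * stats.norm.sf(abs(z)) * n_pairs)
            pairwise[(names[i], names[j])] = p_adj
    return h_stat, p_value, pairwise


# Hypothetical concentrations for three groups (not data from the study).
data = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "B": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5],
    "C": [21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
}
h_stat, p_value, pairwise = kruskal_with_dunn(data)
```

In this toy data set groups A and B overlap while C is clearly higher, so only the pairs involving C come out as significantly different after adjustment.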
Multivariate statistical methods were not used because of the heterogeneity of the data, which contained missing observations. These methods assume that the amount of data is the same for all stations and exclude observations that do not meet this condition, so their use would have resulted in a considerable loss of information.
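The information loss mentioned above can be illustrated with a small complete-case filter. This is a hypothetical example: the records, parameter names and values are invented, and `None` marks a missing observation.

```python
# Each record is one sampling campaign; None marks a missing observation.
records = [
    {"DO": 6.1, "BOD": 2.0, "turbidity": 35.0},
    {"DO": 5.4, "BOD": None, "turbidity": 120.0},
    {"DO": 7.0, "BOD": 3.1, "turbidity": 18.0},
    {"DO": None, "BOD": 4.5, "turbidity": 60.0},
    {"DO": 6.6, "BOD": 2.8, "turbidity": None},
]

# Multivariate methods typically keep only records with every parameter present.
complete = [r for r in records if all(v is not None for v in r.values())]

observed_values = sum(v is not None for r in records for v in r.values())
kept_values = sum(len(r) for r in complete)
loss = 1 - kept_values / observed_values

print(f"{len(complete)} of {len(records)} records kept; "
      f"{loss:.0%} of the observed values discarded")
```

Here three missing values out of fifteen cells are enough to discard half of the measurements that were actually taken, whereas a parameter-by-parameter nonparametric analysis uses all of them.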
To evaluate legal compliance, the results of the analyses from all sampling points in the sub-basins were compared with the limits established by the Joint Normative Deliberation COPAM/CERH n. 01/2008 (COPAM and CERH, 2008). Since state law establishes stricter values than the national CONAMA Resolution n. 357/2005, this study compared the water quality data series with the state limits in order to rate violations of water quality parameters. This made it possible to calculate the violation percentage of each parameter in each sub-basin and to identify the main causes of environmental degradation.
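The violation percentage reduces to a simple count, sketched below. The function name `violation_percentage` and the sample values are hypothetical, and the limits of 5.0 mg/L are placeholders that should be replaced by the class-specific limits of DN COPAM/CERH n. 01/2008 for each sampling point. Note that most parameters have a maximum limit, whereas dissolved oxygen has a minimum one.

```python
def violation_percentage(values, limit, kind="max"):
    """Percentage of observations that violate a legal limit.

    kind="max": a value above the limit is a violation (e.g. BOD).
    kind="min": a value below the limit is a violation (e.g. DO).
    """
    if kind == "max":
        violations = sum(v > limit for v in values)
    elif kind == "min":
        violations = sum(v < limit for v in values)
    else:
        raise ValueError("kind must be 'max' or 'min'")
    return 100.0 * violations / len(values)


# Hypothetical observations (mg/L); both limits are placeholders.
bod = [2.0, 3.0, 8.0, 12.0]
do = [6.5, 4.0, 5.2, 7.1, 3.8]

bod_violation = violation_percentage(bod, limit=5.0, kind="max")
do_violation = violation_percentage(do, limit=5.0, kind="min")
print(f"BOD: {bod_violation:.0f}% violations, DO: {do_violation:.0f}% violations")
```

The compliance percentage reported in the results is the complement, 100 minus the violation percentage.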
To verify the effect of seasonality on surface water quality, the data for each sub-basin were split into dry and rainy seasons. According to the definition in the Igam report (2016d), the dry season in the state of Minas Gerais runs from April to September and the rainy season from October to March. To verify that these seasons matched the study period, daily rainfall data from 2008 to 2016 were obtained from the automatic surface weather stations of the National Meteorology Institute (INMET) located in the Pará River Basin coverage area and its surroundings. Data from the stations Divinópolis-A564, Formiga-A524, Florestal-A535, Pompéu-A560 and Oliveira-A570 were analyzed and compared with the water quality data from the same days. This evaluation confirmed both seasons as defined by Igam.
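The seasonal split follows directly from the month of each observation. The sketch below applies the Igam definition quoted above (April to September dry, October to March rainy) and totals rainfall by season as a simple check; the function names and the daily rainfall values are hypothetical, not INMET records.

```python
from datetime import date

# Dry season months according to the Igam definition quoted in the text.
DRY_MONTHS = frozenset({4, 5, 6, 7, 8, 9})


def season_of(day):
    """Return 'dry' for April-September and 'rainy' for October-March."""
    return "dry" if day.month in DRY_MONTHS else "rainy"


def rainfall_by_season(daily_rain):
    """Total rainfall (mm) per season from (date, millimetres) pairs."""
    totals = {"dry": 0.0, "rainy": 0.0}
    for day, millimetres in daily_rain:
        totals[season_of(day)] += millimetres
    return totals


# Hypothetical daily rainfall records (mm).
daily_rain = [
    (date(2012, 1, 10), 32.5),
    (date(2012, 7, 15), 0.0),
    (date(2012, 8, 3), 1.2),
    (date(2012, 11, 21), 18.0),
]
totals = rainfall_by_season(daily_rain)
```

The same `season_of` label can be attached to each water quality observation before the seasonal comparison.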
The nonparametric Kruskal-Wallis test was used to compare multiple independent samples for all sub-basins, considering both seasons. When a significant difference between the concentrations of a parameter was detected, the multiple comparison test was applied at a 5% significance level, and finally box-plots were generated.
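One way to sketch the seasonal comparison is to run the Kruskal-Wallis test on the dry and rainy samples of each sub-basin separately, assuming SciPy. This is a simplification: the text does not specify how the groups were arranged in Statistica, and the function name `seasonal_differences`, the sub-basin labels and the turbidity values are hypothetical.

```python
from scipy import stats


def seasonal_differences(data, alpha=0.05):
    """Dry-versus-rainy Kruskal-Wallis test for each sub-basin.

    data: {sub_basin: {"dry": [...], "rainy": [...]}}
    """
    results = {}
    for basin, seasons in data.items():
        h_stat, p_value = stats.kruskal(seasons["dry"], seasons["rainy"])
        results[basin] = {
            "H": h_stat,
            "p": p_value,
            "significant": bool(p_value < alpha),
        }
    return results


# Hypothetical turbidity values (NTU) for two sub-basins.
turbidity = {
    "X": {
        "dry": [10, 12, 11, 13, 9, 14, 10.5, 12.5],
        "rainy": [40, 55, 38, 60, 45, 70, 52, 48],
    },
    "Y": {
        "dry": [20, 22, 23, 24, 25, 26, 27, 30],
        "rainy": [21, 22.5, 23.5, 24.5, 25.5, 26.5, 28, 29],
    },
}
results = seasonal_differences(turbidity)
for basin, outcome in results.items():
    print(basin, f"H = {outcome['H']:.2f}", f"p = {outcome['p']:.4f}",
          "significant" if outcome["significant"] else "not significant")
```

With only two groups, the Kruskal-Wallis test plays the same role as a rank-sum test, so no multiple comparison step is needed within a single sub-basin.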

RESULTS AND DISCUSSIONS
All analyses proposed in the methodology were conducted for the 18 chosen parameters. However, only the parameters TC/E. coli, BOD, PT, DO and turbidity, which presented a significant difference regarding seasonality, are considered here.

Descriptive analysis of database
The descriptive analysis of the data showed that TC/E. coli had high maximum values, very close or equal to the detection limits of the method. Such results are characteristic of anthropized hydrographic basins. Other studies, conducted by Calazans (2015), Trindade et al. (2016), Costa et al. (2017) and Oliveira et al. (2017), also showed high concentrations of TC/E. coli in all water bodies of Minas Gerais.
PT presented the highest median values in the Da Paciência Brook sub-basin, reaching 0.69 mg/L. Maximum values were above the 0.15 mg/L limit in all sub-basins. These results are mainly attributed to agricultural activities and to the lack of treatment, or insufficient treatment, of sanitary sewage in the municipalities.
DO and BOD behaved similarly to PT, showing the most critical values in the Middle Pará River, Da Paciência Brook and São João River sub-basins, with DO as low as 0.5 mg/L and BOD as high as 383 mg/L in the Middle Pará River. This minimum DO value fails to meet even the limit established for Class 4 water bodies. This reflects the impact of sanitary sewage released from the urban areas of Carmo do Cajuru, Divinópolis, Nova Serrana, São Gonçalo do Pará (MRP), Itaúna (RSJ) and Pará de Minas (RPC), located upstream from the stations that make up these three sub-basins.
Regarding turbidity, three high maximum values were observed, ranging from two to ten times above the maximum concentration established for Class 4, with values close to 4,000 NTU, as verified in the Da Paciência Brook.

Characterization and comparison of surface waters in the Pará River sub-basins
In the surface water quality evaluation, the Kruskal-Wallis nonparametric test identified significant differences between the concentrations of all parameters analyzed in the sub-basins. Figure 2 shows the box-plots and the results of the multiple comparison test, applied after the Kruskal-Wallis test indicated a difference (p < 0.05), for the five water quality parameters mentioned above.
Through the observation of the box-plots and tables of the multiple comparison tests, it is possible to identify that the Da Paciência Brook sub-basin showed the highest median concentrations in four of the five parameters, differing significantly, with α = 5%, from all other sub-basins, at least twice.
Regarding TC/E. coli and BOD concentrations in the Middle Pará River sub-basin, the box-plots in Figure 2 make the differences between concentrations evident. As an example of how to interpret the multiple comparison test table, consider only the parameter BOD and the MRP sub-basin. The median BOD values of this sub-basin were significantly higher than those of seven of the ten analyzed sub-basins. The median BOD of the São João River sub-basin did not differ significantly from the value shown by MRP, while the Da Paciência Brook sub-basin showed a significantly lower median value.
For turbidity, the Boa Vista Brook sub-basin presented the highest median value, significantly higher than the values from the Itapecerica, São João and Lower Pará River sub-basins, and not differing from the others.
By analyzing the box-plots and the multiple comparison tables, it was possible to identify the most- and least-impacted sub-basins. Table 3 shows a summary of the most-impacted sub-basins and the responsible parameters according to the results of the multiple comparison tests. For DO, PT, BOD, TC/E. coli, Cl-T, Cl-a, CE, COD, N-NH4+, pH, TS, MBAS and ZnT, the Da Paciência Brook sub-basin presents significantly higher concentrations than other sub-basins (significantly lower for DO). These results are shown by the box-plots and evidenced by the multiple comparison test tables in Figure 2. This makes the Da Paciência Brook sub-basin the most impacted in the Pará River basin.
This result refutes the hypothesis that the most-impacted municipalities contain the largest populations since, of the two municipalities that form the Da Paciência Brook sub-basin (Onça de Pitangui and Pará de Minas), only Pará de Minas is among the most populous. However, this sub-basin has the highest population density, approximately 201 inhabitants/km². In addition, there are many sources of nonconformity in the contributing area, such as the release of raw sewage from the urban area of Pará de Minas, downstream of the station; pig farming; mining of granite, gold and rock-forming minerals; and effluents from leather tanning, all of which may generate high environmental impacts. This makes the Da Paciência Brook sub-basin a priority for measures that control and monitor effluents from human occupation (IGAM, 2016a).
Next comes the Boa Vista Brook sub-basin, which shows significantly higher concentrations for turbidity, TSS and temperature. Finally, the Picão River sub-basin presented N-NO3 median concentrations significantly higher than those of the other sub-basins. Such results directly affect the quality of life of the population, since they reduce the potential uses of surface water and increase the cost of its treatment and consumption in the cities of Pará de Minas (RPC), Carmo da Mata (RVB) and Martinho Campos (RPI).
Table 4 was generated from the results of the multiple comparison tests and summarizes the least-impacted sub-basins and the parameters responsible for the better water quality condition. The Do Peixe River sub-basin, located in the municipalities of Maravilhas, Onça de Pitangui, Papagaios, Pitangui and Pompéu, shows the largest number of parameters with significantly lower concentrations than other sub-basins and may be considered the least impacted. This result corroborates data in the Land Use Plan of the studied basin, in which the Do Peixe River sub-basin is considered to have the lowest domestic effluent release flow (IGAM, 2016b). The Lower Pará River sub-basin comes second, followed by Upper Pará River and Boa Vista Brook.
However, it is important to point out that the discrepancy in station density among the sub-basins may also have been a relevant factor in the results found. While some sub-basins have more than one monitoring station, others have only one (RBV, RPX and RPC), located far from the outfall. This may directly affect the concentration of parameters. Hence, sampling networks need to be well delineated and adequate, with stations placed in strategic locations based on hydrological conditions, land use and occupancy, water quantity and quality, and the location of the pollution sources in the hydrographic basin (Calazans, 2015).
Table 5 shows the percentage of compliance with the limits established by DN COPAM CERH 01/08, according to the pre-established class for each existing sampling point in the ten sub-basins. Among the results that met the legal limits are those obtained by Boa Vista Brook for BOD and DO, Picão River for BOD, and Lower Pará River for DO. Regarding the highest violation rates, TC/E. coli stands out for showing the lowest compliance with the classification limits in all sub-basins except the Da Paciência Brook sub-basin, whose compliance percentage for PT was even lower than the value found for TC/E. coli.
Considering the main economic activities and characteristics of the sub-basin, these violations are, above all, a strong indication of effluent release, soil erosion, intense mining activities, low flow rates, diffuse pollution caused by agriculture and unprotected river springs.

Effect of seasonality in surface water quality in the Pará River basin
After applying statistical analyses, it was possible to identify the behavior of each parameter in the sub-basins in dry and rainy seasons. Figure 3 contains the box-plots that show the concentrations of parameters in dry and rainy seasons in sequence, aiming to ease the visualization of the results.
The results showed that, of the 18 parameters analyzed, only DO, turbidity, TSS, TS and temperature showed significant differences at a 5% significance level. DO showed similar behavior in most of the analyzed sub-basins. However, significant differences were found only in the medians of four sub-basins: Upper Pará River, Lambari River, Do Peixe River and Picão River. The median concentrations of this variable tended to be higher in the dry season than in the rainy season. In the rainy season, the values are lower because of the increase in organic matter and nutrients carried into the river by surface runoff, which leads to higher DO consumption in the degradation of organic matter.
It is emphasized that, according to Table 3, Da Paciência Brook sub-basin was the most impacted concerning water quality. For turbidity, it is noticed that six sub-basins showed a significant difference between the medians obtained in the dry and rainy seasons.
For the ten sub-basins analyzed, turbidity values were higher in the rainy season, with the largest amplitude found in the Do Peixe River sub-basin. This behavior was expected, since rainfall carries suspended solids, dissolved chemical compounds and suspended particles such as silt, clay and organic matter into the water body.
TSS also showed a similar behavior, decreasing in the dry period. In the sub-basins of Itapecerica River, Do Peixe River and Lower Pará River, TSS concentration medians in the rainy season were significantly higher when compared to the dry season, and the one that showed the highest difference amplitude was the Itapecerica River sub-basin.
Total solids showed a similar behavior to TSS, having lower concentrations in the dry season in all sub-basins. Significant differences were found for the sub-basins of Upper Pará River, Itapecerica River, Lambari River and Lower Pará River. Lastly, concerning water temperature, the values were lower in the dry period for all sub-basins, showing a significant difference in eight out of the ten evaluated sub-basins.
These results reflect the influence of seasonality on surface water quality in the Pará River Basin, showing that, despite the higher flow and resulting dilution, the concentration of parameters worsens in the rainy season for almost all sub-basins, owing to the soil carried into the water body.

FINAL CONSIDERATIONS
The water quality comparison between the ten sub-basins of the Pará River identified significant differences for the concentrations of all evaluated water quality parameters.
Regarding legal compliance, all sub-basins were out of compliance with their classification for at least one of the evaluated parameters, and in these cases the new classification suggests an inferior water quality class compared to the previously established one. TC/E. coli was the parameter with the lowest compliance with the classification limits in all sub-basins, reaching a violation of 95% of the current standards for the classification classes of the water bodies of Middle Pará River, and 89% in the Do Peixe River sub-basin. For PT in Da Paciência Brook, the compliance percentage reached only 11%.
The seasonality evaluation pointed out significant differences for 5 of the 18 evaluated parameters: DO, turbidity, TSS, TS and water temperature. All of these parameters indicated a decrease in water quality in the rainy period for most sub-basins.
The most-impacted sub-basins regarding surface water quality, considering concentrations significantly higher than in the others, were Da Paciência Brook (13 of 18 parameters) and Da Boa Vista Brook (3 of 18 parameters). On the other hand, the least-degraded sub-basins, considering concentrations significantly lower than in the others, were Do Peixe River, Lower Pará River and Upper Pará River.
The disparity in the number of monitoring stations between the regions is also noticeable. The Da Boa Vista Brook, Da Paciência Brook and Do Peixe River sub-basins have only one station each and may not properly represent the real situation of the entire sub-basin, whereas the other sub-basins concentrate most of the monitoring stations.
This work begins the surface water quality monitoring studies in the Pará River Basin and reveals the level of environmental damage to which the basin is exposed. Thus, it may be used as a reference to establish more strategic goals and management instruments aimed at the sustainable development of the region, leading to an improvement in the living conditions of local people.