Land use and water quality in watersheds in the State of São Paulo, based on GIS and SWAT data

Land use influences the quality and availability of water resources, but Brazil has made little progress in integrated watershed management. This study therefore applied geoprocessing for land-use classification and evaluated the impact of land use on the hydrological balance in order to contribute to the integrated management of water resources. Using GIS tools, two drainage areas were delimited from the water catchment points of two municipalities, Santa Cruz das Palmeiras and Piedade; land-use mapping was carried out by supervised classification of satellite images, and the SWAT model was applied for hydrological simulation. The methods used were appropriate. Surface runoff was related to the absence of vegetation and the predominance of exposed soil. The relationship between land use/land cover and the hydrological balance was evidenced, especially the impact of agricultural activities and the lack of natural vegetation on surface runoff.


Land use and water quality in watersheds of the State of São Paulo, using GIS and SWAT
ABSTRACT
Land use and occupation influence the quality and availability of water and should be considered in the integrated management of water resources and in watershed planning. Geoprocessing tools have been used for land-use classification and combined with hydrological models to evaluate the impact of different land-use scenarios on the hydrological balance, with the aim of contributing to the integrated management of water resources. Using GIS, two drainage areas were delimited from the water catchment points that supply the municipalities of Santa Cruz das Palmeiras and Piedade; land use was mapped by supervised classification of satellite images, and the SWAT model was applied for hydrological simulation. The tools used proved adequate. Surface runoff was related to the absence of vegetation and the predominance of exposed soil. A relationship was evidenced between land use and occupation of the

Land use and water quality in watersheds … Rev. Ambient. Água vol. 14 n. 5, e2325 - Taubaté 2019 2017

level of preservation. Watersheds that have riparian forests along their riverbanks and preserved areas of natural vegetation have better water quality than do those in which there is intensive agricultural activity and degraded areas (Connolly et al., 2015). Therefore, this study evaluated the use of image processing for LULC classification in two watersheds used for public water supply in the state of São Paulo, in which there is intensive agricultural activity of various patterns, and evaluated the impact of land use on the water balance in these watersheds by means of the SWAT model, with the ultimate goal of contributing to the integrated management of the quality of the water resources destined for public supply.

Description of the watersheds
The two watersheds selected, both of which are used for public water supply, are located in the municipalities of Santa Cruz das Palmeiras and Piedade, respectively, in the state of São Paulo (Figure 1). In the Ribeirão da Prata River watershed (RPRWS), in the municipality of Santa Cruz das Palmeiras, sugarcane monoculture predominates. In the Pirapora River watershed (PRWS), in the municipality of Piedade, there are a variety of fruit and vegetable crops on small- and medium-sized properties. Those watersheds were chosen as study areas for the continuation of a research project funded by the São Paulo Research Foundation and the Brazilian Unified Health Care System Research Program, in which pesticides in the water supplies of several municipalities in the state of São Paulo were monitored during 2014. The results showed elevated levels of pesticide residues in the catchment water sources within the RPRWS and PRWS.
Santa Cruz das Palmeiras has a population of 29,932 inhabitants, is within Water Resources Management Zone 9 (WRMZ - Mogi-Guaçú), and presents a landscape of flat hills with elevations of 20-50 m, red Oxisol being the predominant soil type. According to the Köppen climate classification system, the climate of the region is type Aw (tropical savanna), with dry winters and an average annual rainfall of 1500 mm. The main economic activity in Santa Cruz das Palmeiras is the production of sugarcane.
Piedade has a population of 52,123 inhabitants, is mostly within Water Resources Management Zone 10 (WRMZ - Sorocaba/Middle Tiete), and presents a type of landscape known in Brazil as mares de morros ("seas of hills") with elevations between 80 m and 200 m, red-yellow Ultisol being the predominant soil type. According to the Köppen climate classification system, the climate of the region is type Cwa (humid subtropical), with dry winters and an average annual rainfall of 1300 mm. The main economic driver in Piedade is the service industry, followed by agriculture (production of vegetables and legumes).
For the delimitation of the drainage area of the water catchment area of the municipalities, the hydrographic network was extracted with the Hydrology tool of the ArcGIS software, Version 10.1. Digital Elevation Models (DEMs) were obtained from the TOPODATA project of the Instituto Nacional de Pesquisas Espaciais (INPE, Brazilian National Institute for Space Research), which were derived from Shuttle Radar Topography Mission (SRTM) data provided by the United States Geological Survey, with a resolution of 30 m and 16 bits.
We thus generated the flow direction files, a process that defines the flow of water, pixel by pixel, by sending the flow from each cell in one of eight directions. The algorithm for calculating the discrete aspect (flow direction) was derived by Jenson et al. (1988): a 3 × 3 moving window traverses the DEM and assigns to each cell the direction of one of its eight neighbors. The direction assigned is the one with the steepest slope, calculated as dZ/dS, where dZ is the difference between the elevations of the cell in the considered direction and the central cell, and dS has a value of 1 in the perpendicular directions and √2 in the diagonal directions (Mendes and Cirilo, 2001). The pixels are given values from 1 to 255, with each direction relative to the center cell assigned its own value.
The next step was the identification of flow accumulation; pixels with high accumulated flow represent areas of greater flow concentration. From those data, together with the flow direction file, the drainage network was extracted and later ordered according to the Strahler method (Christofoletti, 1980). The area of water contribution to the catchment area for the public water supply was then calculated for both municipalities with the ArcGIS Watershed tool, which uses data related to flow direction, drainage networks, and points of interest. That delimitation corresponds to the basin of water contribution of the catchment area, which was considered in the following stages of this study.
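The steepest-slope rule described above can be illustrated with a minimal Python sketch. It is not the ArcGIS implementation: the function name `d8_flow_direction`, the toy DEM, and the handling of edges are assumptions made for illustration. The direction codes follow the power-of-two convention used by the ArcGIS Flow Direction tool (1 = east, then clockwise to 128 = northeast), and, unlike ArcGIS, cells with no lower neighbor are simply marked 0.

```python
import math

# D8 direction codes keyed by (row offset, column offset).
# Rows increase southward, so (-1, 0) is the northern neighbor.
D8_CODES = {
    (0, 1): 1,     # east
    (1, 1): 2,     # southeast
    (1, 0): 4,     # south
    (1, -1): 8,    # southwest
    (0, -1): 16,   # west
    (-1, -1): 32,  # northwest
    (-1, 0): 64,   # north
    (-1, 1): 128,  # northeast
}


def d8_flow_direction(dem, cell_size=1.0):
    """Return a grid of D8 codes pointing to the steepest downslope neighbor.

    Assumption of this sketch: cells without any lower neighbor
    (pits and flats) receive the code 0.
    """
    rows, cols = len(dem), len(dem[0])
    directions = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope, best_code = 0.0, 0
            for (dr, dc), code in D8_CODES.items():
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue  # neighbor falls outside the DEM
                # dS is 1 cell for perpendicular moves, sqrt(2) for diagonals
                d_s = cell_size * (math.sqrt(2) if dr and dc else 1.0)
                # drop from the center cell to the neighbor (positive = downhill)
                slope = (dem[r][c] - dem[nr][nc]) / d_s
                if slope > best_slope:
                    best_slope, best_code = slope, code
            directions[r][c] = best_code
    return directions


# Hypothetical 3 x 3 elevation grid (m) draining toward the lower-right corner
toy_dem = [
    [9, 8, 7],
    [8, 6, 5],
    [7, 5, 1],
]
toy_directions = d8_flow_direction(toy_dem)
```

In this toy grid the center cell drains to the southeast neighbor because the drop of 5 m over a distance of √2 cells is steeper than the 1 m drops to the east and south.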

Classification of land use
For the classification of land use/land cover, RapidEye satellite images, dated October 2014 and made available by the Brazilian Ministry of the Environment, were submitted to the Supervised Maximum Likelihood Classification method. The RapidEye is a system composed of five identical remote sensing satellites in the same sun-synchronous orbit, at an altitude of 630 km. The image collection path is 77 km wide and 1500 km long. The satellite image is composed of five spectral bands (blue, green, red, red edge, and near-infrared), with a 5-m resolution in the Horizontal Datum WGS84 (Antunes et al., 2014).
We classified the images with ArcGIS software, Version 10.1, using the maximum likelihood method, because it presents better accuracy and global accuracy values than do other classification methods (Cattani et al., 2013). The method involves pixel-by-pixel multispectral analysis, considers the weighting of the distances between means of the gray levels of the classes, and uses the training samples to calculate the probability of a pixel belonging to a certain class (IBGE, 2001). Google Earth Pro images were used in order to facilitate the visual interpretation process, and field work was used in order to quantify agricultural land use within the catchment area.
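As an illustration of how training samples yield a per-class probability for each pixel, the following is a minimal sketch of a Gaussian maximum likelihood classifier written with NumPy. It is not the ArcGIS tool used in the study: the function names, the two-band reflectance values, and the class labels are hypothetical, and equal prior probabilities are assumed for all classes.

```python
import numpy as np


def train_classes(samples):
    """Estimate mean vector and covariance matrix for each class.

    samples: dict mapping class name -> array of shape (n_pixels, n_bands).
    """
    stats = {}
    for name, pixels in samples.items():
        pixels = np.asarray(pixels, dtype=float)
        mean = pixels.mean(axis=0)
        cov = np.cov(pixels, rowvar=False)
        log_det = np.linalg.slogdet(cov)[1]
        stats[name] = (mean, np.linalg.inv(cov), log_det)
    return stats


def classify(pixels, stats):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    pixels = np.asarray(pixels, dtype=float)
    names = list(stats)
    scores = []
    for name in names:
        mean, inv_cov, log_det = stats[name]
        diff = pixels - mean
        # squared Mahalanobis distance of every pixel to the class mean
        maha = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        scores.append(-0.5 * (log_det + maha))
    best = np.argmax(np.vstack(scores), axis=0)
    return [names[i] for i in best]


# Hypothetical training samples: (red, near-infrared) reflectance per pixel
training = {
    "forest": [[0.05, 0.50], [0.06, 0.55], [0.04, 0.45], [0.05, 0.52]],
    "exposed_soil": [[0.30, 0.35], [0.28, 0.33], [0.33, 0.36], [0.31, 0.30]],
}
class_stats = train_classes(training)
labels = classify([[0.05, 0.50], [0.30, 0.34]], class_stats)
```

A production classifier would use all five RapidEye bands and many more training pixels per class, so that the covariance matrices are well conditioned.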

Water balance analysis
To analyze the impact of land use on the hydrological cycle of the areas, we used the SWAT model, which was developed to predict the long-term effects of water management and agricultural practices on the hydrological cycle of watersheds (Arnold et al., 1998). The SWAT employs the modified universal soil loss equation, which uses the amount of runoff to simulate erosion and sediment production (Arnold et al., 2012). The model performs the simulation after dividing the area into subwatersheds, grouping them by the characteristics they have in common, their specificities, and their contributions to the hydrological cycle. The subwatersheds are further divided into hydrological response units (HRUs), which correspond to smaller divisions within the subwatersheds, with unique land use, soil, and management features (Neitsch et al., 2011).
The process of water balance analysis discriminates surface runoff, infiltration, evapotranspiration, lateral flow, drainage, tributary channels, and water redistribution, according to the soil profile. The SWAT simulates the soil hydrological cycle based on the following Equation 1 (Neitsch et al., 2011):

$$SW_1 = SW_0 + \sum_{i=1}^{t} \left( R_{day} - Q_{surf} - E_a - W_{seep} - Q_{gw} \right) \quad (1)$$

where $SW_1$ is the final amount of water in the soil (mm H2O), $SW_0$ is the initial soil moisture (in mm H2O) on day i, t is the time in days, $R_{day}$ is the amount of precipitation (in mm H2O) on day i, $Q_{surf}$ is the surface runoff (in mm H2O) on day i, $E_a$ is the evapotranspiration (in mm H2O) on day i, $W_{seep}$ is the amount of water (in mm H2O) entering the aeration zone of the soil profile on day i, and $Q_{gw}$ is the amount of return flow (in mm H2O) on day i. This water balance analysis considers the characteristics of the soil, land use/land cover, slope, and climate (Arnold et al., 2012).
The daily data for precipitation, temperature, relative humidity, solar radiation, and wind speed in the 2008-2015 period were obtained from the Brazilian National Meteorological Institute for the stations closest to the study areas (INMET, 2016). The monthly precipitation data for Piedade and Santa Cruz das Palmeiras were obtained from the Center for the Promotion of Agriculture of the São Paulo State University at Campinas (CEPAGRI, 2016). As previously mentioned, the DEMs with a resolution of 30 m and 16 bits were obtained from the INPE. Soil data, at a scale of 1:500,000, were obtained from the Brazilian Agency for Agricultural Research (EMBRAPA, 2006).
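The daily bookkeeping behind Equation 1 can be sketched in a few lines of Python. This is only an illustration of the summation, not SWAT code: the function name, the dictionary keys, and the daily values are hypothetical, and all quantities are in mm H2O.

```python
def soil_water_balance(sw0, daily_fluxes):
    """Accumulate the terms of Equation 1 over a sequence of days.

    sw0: initial soil water content (mm H2O).
    daily_fluxes: iterable of dicts with the keys r_day, q_surf, e_a,
    w_seep and q_gw (all in mm H2O); the key names are assumptions
    of this sketch.
    """
    soil_water = sw0
    for day in daily_fluxes:
        soil_water += (
            day["r_day"]     # precipitation
            - day["q_surf"]  # surface runoff
            - day["e_a"]     # evapotranspiration
            - day["w_seep"]  # water leaving the profile downward
            - day["q_gw"]    # return flow
        )
    return soil_water


# Hypothetical two-day record: one rainy day followed by one dry day
fluxes = [
    {"r_day": 10.0, "q_surf": 2.0, "e_a": 3.0, "w_seep": 1.0, "q_gw": 0.5},
    {"r_day": 0.0, "q_surf": 0.0, "e_a": 2.5, "w_seep": 0.5, "q_gw": 0.5},
]
final_soil_water = soil_water_balance(100.0, fluxes)
```

In this example the 3.5 mm gained on the rainy day is lost again on the dry day, so the soil water content returns to its initial value of 100 mm.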

RESULTS AND DISCUSSION
For the RPRWS, the area of water contribution, calculated from the catchment area, was 11.6 km², divided into 23 subwatersheds or HRUs, with slopes of 3-8%; for the PRWS, the area of contribution was 93.59 km², divided into 25 subwatersheds or HRUs, with slopes of 20-45%. The results of the primary simulation show that surface runoff and flow behave similarly and are directly related to the pattern of precipitation distribution (Figure 2). The period of high surface runoff corresponded to the rainy months (October through March), and it was for these months that the SWAT simulation produced the greatest differences from the observed values; similar results were obtained by Sousa et al. (2018), even when the authors compared two methods of rainfall simulation. The hydrological results tend to be better when there is more than one rainfall station in the basin, and the results also depend on the size of the drainage area. Even in simulations calibrated using Nash-Sutcliffe efficiency (NSE), the model tends to underestimate periods of higher rainfall, as demonstrated by Pereira et al. (2016). Despite the limitations of the data, this simulation was satisfactory, especially when analyzing other hydrological parameters (Table 1).
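For readers unfamiliar with the Nash-Sutcliffe efficiency mentioned above, a minimal Python sketch of the statistic follows. The function name and the monthly flow values are hypothetical and are not data from this study. An NSE of 1 indicates a perfect match, and an NSE of 0 indicates that the simulation performs no better than the mean of the observations.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean_obs)^2)."""
    if len(observed) != len(simulated):
        raise ValueError("series must have the same length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical monthly flows (m3/s); the simulation underestimates the wet-season peak
observed_flow = [2.0, 4.0, 6.0, 8.0]
simulated_flow = [2.0, 4.0, 5.0, 6.0]
nse_value = nash_sutcliffe(observed_flow, simulated_flow)
```

Because the squared residuals are largest where the flows are highest, an underestimated rainy season lowers the NSE more than an equal relative error in the dry months.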
For the RPRWS, the evapotranspiration was estimated at 60% of the potential evapotranspiration and 51% of the total precipitation. Soil percolation and surface runoff corresponded to the greatest amount of water in the terrestrial process, being 47% of the volume precipitated. That can be associated with the permeability of Oxisol (the predominant soil type in the area), as well as with the type of vegetation cover. The curve number was 77.9, a value that is considered high and that can describe a situation of reduced vegetation cover or intensive agriculture (Tucci, 1993). For the PRWS, the simulation estimated the evapotranspiration at 49.7% of the potential evapotranspiration and 49.2% of the total precipitation, and the subsurface (lateral) flow accounted for 37.2% of the precipitation, those being the main destinations of the water entering the system. Percolation and surface runoff were considered low and collectively corresponded to only 13% of the total rainfall. The estimated curve number was significantly lower for the PRWS than for the RPRWS, reflecting greater preservation in the former; the surface runoff was consequently lower as well, since the two are directly related.
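The link between curve number and surface runoff can be made concrete with the standard SCS curve number relations in metric form, which SWAT offers as one of its surface runoff procedures. The sketch below is an illustration under stated assumptions: the function name, the 50 mm rainfall event, and the comparison curve number of 60 are hypothetical, and the initial abstraction is taken as the conventional 20% of the retention parameter.

```python
def scs_runoff_mm(rain_mm, curve_number, ia_ratio=0.2):
    """Surface runoff depth (mm) from the SCS curve number method.

    S = 25400 / CN - 254 is the retention parameter in mm, and the
    initial abstraction is assumed to be ia_ratio * S.
    """
    retention = 25400.0 / curve_number - 254.0
    initial_abstraction = ia_ratio * retention
    if rain_mm <= initial_abstraction:
        return 0.0  # all rainfall is retained before runoff begins
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


# Curve number reported for the RPRWS versus a hypothetical lower value,
# both evaluated for a hypothetical 50 mm rainfall event
runoff_cn_high = scs_runoff_mm(50.0, 77.9)
runoff_cn_low = scs_runoff_mm(50.0, 60.0)
```

For the same rainfall depth, the higher curve number yields several times more runoff, which is the mechanism by which reduced vegetation cover translates into greater surface runoff in the simulation.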
The results of the hydrological simulation were considered satisfactory, because there is coherence between the values of precipitation, surface runoff, curve number, soil type and LULC. The curve number, which is related to the LULC and soil type, is considered one of the most sensitive parameters in hydrological modeling (Ficklin et al., 2013). As demonstrated by Fukunaga et al. (2015), a SWAT simulation in a basin in southeastern Brazil produced simulated values higher than those observed for precipitation and surface runoff, which increased the value of the curve number. Nevertheless, in the calibration process the authors concluded that this difference is within the expected range and in accordance with other simulations in tropical watersheds.
According to Wallace et al. (2018), the curve number is often used because it is simple, stable and does not require a long historical series of data. The authors evaluated the influence of basin size on the variation of several parameters in the SWAT hydrologic simulation process. They concluded that, although there was a difference between the observed and simulated values, the modeling was satisfactory and could be recommended as a watershed planning and management tool.
The classification of LULC (Figure 3) indicated that the conditions of water source preservation differed between the two watersheds. In the RPRWS, more than 80% of the area was occupied by agricultural activities, whereas forest occupied only 6.28%, well below the 20% established by legislation (Brasil, 2006). In contrast, the PRWS presented 53.29% forest and 35.82% agricultural areas. The proportion of land used for agricultural purposes within the two watersheds corresponded to that observed for the respective municipalities.
The poor preservation in the RPRWS was attributable to the agricultural production method adopted by the municipality of Santa Cruz das Palmeiras, which is dependent on sugarcane monoculture, as well as coffee and oranges, which are grown as permanent crops (IBGE, 2015). Likewise, the degree of preservation in the PRWS reflects the allocation of land in the municipality of Piedade, where there are small plots and a variety of crops, including maize, onions, sweet potatoes, and beans, with persimmons, avocados, and grapes grown as permanent crops (IBGE, 2015) (Table 2).
Applying the Supervised Maximum Likelihood Classification method to the RapidEye satellite images allowed us to observe stretches, throughout the catchment areas, where there was no riparian forest and there was exposure to agricultural activities, aspects that potentiate the transport of sediment and pesticides. Similar results were obtained by Twesigye et al. (2011), based on the supervised classification of a historical series of Landsat 5 images of Lake Victoria, in Africa. The authors correlated the reduction in natural vegetation with the expansion of agriculture and the presence of pesticides in the watershed.
For the RPRWS, the surface runoff was highest in the subwatersheds with a preponderance of exposed soil, followed by those with a preponderance of sugarcane (Figure 3, A and B). We also observed a scarcity of natural vegetation, especially in the Permanent Preservation Areas, which correspond to the preservation of the riparian forests. Within the PRWS, the surface runoff values were highest in the subwatersheds with a predominance of agriculture (Figure 3, C and D) and lowest in those with the most well preserved vegetation cover.
The results of our water balance evaluation were satisfactory for analysis of the behavior of the main variables of the hydrological cycle and the relationship with LULC. The rates of percolation and surface runoff were significantly higher in the RPRWS than in the PRWS, correlating with soil permeability, as well as with the degree and type of vegetation cover. Oxisol, the predominant soil type in the RPRWS, is a soil that is deep, well drained, and permeable. The absence of vegetation cover and the cultivation of sugarcane, which has short roots, tend to contribute to the increase of surface runoff (Armas et al., 2007).
The PRWS presented results consistent with greater preservation of natural vegetation and therefore greater water storage capacity of the soil. Although Ultisol, a shallow soil type, predominates in the PRWS, the preserved vegetation cover favors greater subsurface flow. Because the PRWS is in a region with a "sea of hills" landscape, the loss of natural vegetation in the area would increase the risk of soil instability and silting of the water sources. In both areas, the surface runoff rates were higher in the subwatersheds in which there was a predominance of exposed soil and agricultural activity. A similar result was obtained by Oliveira et al. (2018a), who found that deforestation and agricultural use increase peak flows while decreasing percolation; these alterations can result in degradation of source water quality.
Armas et al. (2007) found that the presence of pesticides in surface waters in the Piracicaba River watershed was related to aspects of land use, mainly to the predominance of sugarcane cultivation. In the Corumbataí River and its tributaries, the authors found high levels of atrazine, ametryn and glyphosate. In addition to contaminant loading, different land uses can also influence the amount of soil lost and the water balance within the watershed, as demonstrated by Silva et al. (2011). Evaluating the degree of contamination of water sources in the Jacaré River watershed, within Water Resources Management Zone 13, Souza et al. (2013) concluded that the level of ammonia was high in the subwatersheds that did not contain Permanent Preservation Areas or that presented only grass cover. The authors found that preservation of riparian forests along the river also reduced the levels of dissolved oxygen, nitrates, and fine sediment in the water sources.
A similar result was reported by Mello et al. (2017), who used the SWAT to analyze the influence of land use on water quality in the Sarapuí River, near the city of Sorocaba, also in the state of São Paulo. The authors concluded that there were high levels of sediment, decreasing water quality, in the subwatershed areas in which there was agricultural activity. In addition to the spatial variation, those authors identified a temporal variation in the concentrations of sediments and substances, which were found to be higher during the rainy months.
The use of SWAT to predict impacts on water management has been recommended as decision support in different countries, including in studies on climate and land-use change, cross-boundary water transfers, nitrogen loads and others (Abbaspour et al., 2015). In addition, it is possible to prioritize areas of the basin in which to monitor and control pollutants, and thus to prevent the degradation of water quality or to improve it (Connolly et al., 2015). The lack of continuous monitoring data on flow and sediment in watersheds has been a limiting factor for the validation of the results obtained and for their use in the formal processes of water resource management in the watersheds. Nevertheless, the SWAT has been considered an important tool for the prediction of maximum annual concentrations of pollutants, even in unmonitored basins and without model calibration (Winchell et al., 2018).
In early GIS applications, LULC mapping served to differentiate and quantify the main activities in a watershed. In the last few years, this analysis has become more sophisticated: it is now possible to relate the LULC to hydrological dynamics and the emission of pollutants and to identify patterns of alteration in the spatial ordering. From this information, it is also possible to predict the costs of natural resource management, using it as a decision-support system (Shao et al., 2017).
It is understood that integrated management of watersheds should consider, in an interconnected way, the physical processes of water and the hydrological cycle, as well as their relationships with other natural strata, such as soil, relief, flora, and fauna, together with the interests of the multiple uses of water sources, and participatory management relationships at different administrative levels (Machado, 2003). However, more than 20 years after the introduction of the National Water Resources Policy and the passage of São Paulo State Water Protection Law No. 9866/97, the use of water for economic pursuits continues to be prioritized, resulting in the degradation of various water sources, especially by the agro-industrial sector. The state of São Paulo has several water sources that are at high risk, requiring conservation and restoration measures (IDS and LABGEO, 2017; Oliveira et al., 2018b) to ensure the supply and quality of water for human consumption.
The determination of areas with greater surface runoff can support the actions to prevent contamination of water in the basin. Increased surface runoff caused by agricultural occupation of the watershed may lead to water contamination, requiring additional and costly treatment to ensure safe water. With integrated management, managers can prioritize these areas for control and reduction of the use of pesticides, collection and treatment of domestic effluents, incentives for the preservation of forested areas on agricultural property, among others. The knowledge of hydrology may be used to improve environmental studies and the management of natural resources.
This work corroborates the importance of applying geotechnologies as decision support for water-resource management, not only for analyses of water availability, but mainly to emphasize the relevance of its application to the management of water quality and the risks to which the source of supply is submitted.

CONCLUSION
The present study identified alterations in the hydrological cycle according to land use/land cover. We also showed that agricultural activities and a lack of natural vegetation increase surface runoff, which is a concern for water quality. The remote-sensing tools employed in our study represent an efficient, fast and low-cost means of classifying land use/occupation in watersheds. Using those tools in combination with ecosystem modeling tools, such as the SWAT, allows land-use patterns to be correlated with the quality of the water supply and areas of greater concern in terms of the impact on water sources to be identified. This favors the collection of information and the development of plans to facilitate the integrated management of watersheds by the different actors involved, the ultimate goal being, above all, to maintain water quality and to protect the sources of the public water supply.