A Chemometric Study on the Accumulation of Heavy Metals Along the Mogi Guaçu River Basin

A study of the environmental impacts caused by agricultural and industrial activities along the Mogi Guaçu river basin was carried out in this work. The concentrations of some metals (Fe, Cd, Cu, Cr, Zn, Ni, Pb and Mn) were determined at fifteen sediment collection stations in two different periods: rainy (March/2001) and dry (June/2001). The pattern recognition methods PCA and HCA were used to group the samples collected in São Paulo and in Minas Gerais and to obtain relevant information on the possible sources of contamination of the Mogi Guaçu river according to the collection period. The chemometric analyses, using the samples collected in March/2001 and in June/2001, indicated that only the concentrations of Cd, Cu, Cr and Pb discriminate the samples collected in São Paulo from those collected in Minas Gerais. The influence of the collection period was also analysed, and the results indicated that the concentrations of four metals (Cd, Cr, Zn and Mn) were important for discriminating the samples collected in March/2001 from those collected in June/2001. In general, the concentrations of the metals studied are high at all collection stations, which characterizes an environment with a high degree of contamination caused mainly by anthropogenic activities.


Introduction
Aquatic ecosystems cover about three quarters of the earth's surface. Most of this water (about 97%) is salty and is found in the oceans. Water that evaporates forms clouds in the atmosphere and precipitates as rain on the land surface, constituting fresh water. About 77% of the fresh water in the world is found in the polar ice caps and icebergs and 23% is underground. Rivers and lakes hold only a small proportion of the fresh water on earth (0.004% and 0.33%, respectively).1 Fresh water has a fundamental role in the development of civilization (source of drinking water, means of transportation, source of electric power, etc.). As the human population increased, the natural landscape was modified: forests were cleared, agricultural monocultures arose, land was removed for flood prevention, coastal areas were filled in and large drainage systems were built. In addition, large volumes of water are used by industry, government and agriculture. These activities affect the physical, chemical and biological structure of rivers, lakes and dams and, consequently, reduce the useful potential of these systems.2 In this work, a study of the environmental effects caused by man in the Mogi Guaçu river basin (Figure 1) is carried out with the goal of providing supporting information for the treatment of areas contaminated by heavy metals such as lead, cadmium and mercury. Our main purpose here is to find relationships between the concentrations of some metals and the kind of activity performed by man along the Mogi Guaçu river basin in the São Paulo and Minas Gerais regions.
The Mogi Guaçu river (Big Snake, in the Tupi-Guarani dialect) has its springs in Bom Repouso (Minas Gerais, MG), at an average altitude of 1650 m. In São Paulo, the Mogi Guaçu river runs through strongly industrialized and urbanized regions as well as large expanses of farmland.3 The environmental study of the Mogi Guaçu river basin in São Paulo can be facilitated by analysing the main polluting sources at each of the locations that are related, directly or indirectly, to the collection stations adopted in this work. Some agricultural and industrial activities along the Mogi Guaçu river are summarized in Table 1. The most significant agricultural activity along the São Paulo stretch of the Mogi Guaçu is sugar cane cultivation. In Bom Repouso (Minas Gerais), 368 springs of the Mogi Guaçu river are found and the predominant crops in this area are potatoes and strawberries. These crops, driven by economic factors, have been expanding over the region without any concern for their environmental impacts.
The Mogi Guaçu river basin has some industries with polluting potential, as presented in Table 1. Metals released by these industries can accumulate in the sediments of lakes and rivers, changing the environmental conditions. The chemical action of heavy metals has attracted great environmental interest because these metals are not biodegradable. As a result, heavy metals persist in the global biogeochemical cycles and can remain accumulated in aquatic life at considerably high levels.
The term "heavy metal" is widely recognized and used for a group of metals that are associated with pollution and toxicity, such as lead, cadmium, mercury, arsenic and uranium. This group also includes other metals that are considered biologically essential at low concentration, for example copper, manganese, selenium and zinc.4 The term "heavy metal" does not necessarily imply "toxic metal", as many metals are considered indispensable nutrients for plants and human beings at low concentration. Several metallic ions are essential for living organisms, and other ions (such as sodium, calcium, potassium, manganese, iron, cobalt, molybdenum, copper and zinc) are of fundamental importance for man. However, the ingestion of metals can be toxic even though some of them are essential for living organisms. The presence of heavy metals at high concentration in the aquatic environment causes the death of fish and other animals. In soils, heavy metals can be retained by different mechanisms, mostly when the soil is rich in organic matter and has a pH greater than seven (which reduces leaching from the soil and the entry of metals into the water). In this kind of environment, heavy metals drastically reduce the fertility of the soil and the development of plants.5,6 Also, heavy metals that reach the human organism through the food chain can provoke countless diseases due to their cumulative effect and can cause death.7
Our main purpose in this work is to find relationships between the concentrations of heavy metals and the kind of activity performed by man along the Mogi Guaçu river in the São Paulo and Minas Gerais regions. The discrimination performed by chemometric methods is important because different activities are developed in these two regions (industrial activities in São Paulo and agricultural activities in Minas Gerais) and, based on the concentrations of the metals studied, our results can be used to assess the origin (São Paulo or Minas Gerais) of newly collected samples.

Experimental data set
The choice of the collection locations took into account the use and occupation of the soil (agricultural, industrial and extractive activities, urban agglomerations and permanent preservation areas). Five regions of the Mogi Guaçu river were selected and 15 collection points were established (identified in Figure 1).
For each region studied, a number of collection stations (enough to characterize the alterations in the biogeochemical processes along the Mogi Guaçu river) was established based on the degree of human interference. Regarding the sampling frequency, two collections were carried out, one in March/2001 (rainy season) and one in June/2001 (dry season).
The sediment samples for analysis were collected using core collectors and an Ekman dredge.8 After collection, the samples were submitted to a pretreatment, i.e., a wet digestion procedure. Afterwards, the material was analysed by atomic absorption (Hitachi model 28100) to obtain the concentrations of the metals studied. The metals selected for this work were iron (Fe), cadmium (Cd), copper (Cu), chromium (Cr), zinc (Zn), nickel (Ni), lead (Pb) and manganese (Mn). The concentration of each metal was measured in triplicate, and the mean values are presented in Table 2.
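As an illustration of how a triplicate measurement reduces to the mean value reported per station, the following minimal sketch uses Python's standard library. The readings are hypothetical placeholders, not values from Table 2, and the helper name `summarize` is our own.

```python
from statistics import mean, stdev

# Hypothetical triplicate readings (mg/kg) for one station.
# These are illustrative numbers, NOT values taken from Table 2.
triplicates = {
    "Cd": [1.2, 1.4, 1.3],
    "Pb": [20.0, 22.0, 21.0],
}


def summarize(readings):
    """Return (mean, sample standard deviation) of replicate readings."""
    return mean(readings), stdev(readings)


summary = {metal: summarize(values) for metal, values in triplicates.items()}

for metal, (avg, sd) in summary.items():
    print(f"{metal}: mean = {avg:.2f} mg/kg, SD = {sd:.2f}")
```

The sample standard deviation (n - 1 in the denominator) is assumed here because only three replicates are available.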

Chemometric analysis
In this work we employed exploratory data analysis techniques to obtain relevant information about the data set. The main goal of these techniques is to reduce the dimensionality of the data set and to reveal possible relationships between the variables (metal concentrations) and the kind of activity performed along the Mogi Guaçu river. The two exploratory techniques used in this work were Principal Component Analysis (PCA) and Hierarchical Cluster Analysis (HCA).
Principal Component Analysis (PCA) is a multivariate statistical technique whose central idea is to group correlated variables, generating a new set of variables called Principal Components (PCs), onto which the data set is projected. These PCs are built as linear combinations of the original variables and have the important property of being completely uncorrelated. In PCA, the position of the samples in the new coordinate system (PCs) is given by the score matrix, while the loadings matrix gives the importance (weight) of the original variables in the PCs. The first new axis, PC1, is chosen in the direction that maximizes the variance along that axis. The second axis must be orthogonal to the first one and in the direction that describes as much of the remaining variance as possible, and so on.9,10
Hierarchical Cluster Analysis (HCA) examines the distances between the samples in a data set and represents this information as a two-dimensional plot called a dendrogram. The HCA method is an excellent tool for preliminary data analysis and is useful for examining data sets for expected or unexpected clusters, including the presence of outliers. It is advisable to analyse the dendrogram in conjunction with PCA, since they give complementary information in different forms. In HCA each point initially forms a cluster of its own and then the similarity matrix is analysed. The most similar points are grouped into one cluster and the process is repeated until all the points belong to a single cluster.11 In this work we used the incremental technique for grouping the samples studied.12
For the chemometric analysis in this work, the variables (metal concentrations) were autoscaled so that they could be compared with each other on the same scale. In autoscaling, each variable (column) of the data matrix is mean-centered and scaled to unit variance. The exploratory data analyses (PCA and HCA) were performed using the Pirouette 2.0 software package.13
In our PCA and HCA analyses, we labeled the samples collected in March/2001 as MPi and the samples collected in June/2001 as JPi. The index i corresponds to the collection station (1 to 15 in Figure 1).
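The autoscaling and PCA steps described above can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the analyses in this work were run in Pirouette 2.0, the data matrix below is synthetic (not Table 2), and the function names `autoscale` and `pca` are our own.

```python
import numpy as np


def autoscale(X):
    """Mean-center each column and scale it to unit variance (sample SD)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def pca(X_scaled):
    """PCA via singular value decomposition.

    Returns the score matrix, the loadings matrix (one column per PC)
    and the fraction of the total variance described by each PC.
    """
    U, s, Vt = np.linalg.svd(X_scaled, full_matrices=False)
    scores = U * s            # sample coordinates on the PCs
    loadings = Vt.T           # weights of the original variables
    explained = s**2 / np.sum(s**2)
    return scores, loadings, explained


rng = np.random.default_rng(0)
# Synthetic stand-in for a table of 15 stations x 4 metals (NOT Table 2).
X = rng.lognormal(mean=1.0, sigma=0.5, size=(15, 4))

Z = autoscale(X)
scores, loadings, explained = pca(Z)
print("variance described by each PC (%):", np.round(100 * explained, 2))
```

The scores equal the autoscaled data multiplied by the loadings, so each PC is a linear combination of the original variables, and the score columns are mutually uncorrelated.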

Results and Discussion
Initially, the results obtained for each collection season (March and June/2001) were analysed individually and, subsequently, the two seasons were compared with the goal of finding relationships between the collection season and the metal distribution.

March 2001
From the data set presented in Table 2, we carried out the PCA analyses, and the results show that only four metals (Cd, Cu, Cr and Pb) are important for discriminating between samples from the São Paulo and Minas Gerais regions. The first three principal components describe 96.25% of the overall variance. The first PC explains 51.29% of the variance and the second PC describes 29.86%. Together, these two PCs account for 81.15% of the overall variance, i.e., PC1 and PC2 give most of the information on the systems studied.
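As a quick check of the figures quoted above, the cumulative variance of the first two PCs and the share left for the third PC follow from simple arithmetic:

```latex
\begin{align*}
\text{PC1} + \text{PC2} &= 51.29\% + 29.86\% = 81.15\% \\
\text{PC3} &= 96.25\% - 81.15\% = 15.10\%
\end{align*}
```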
The main result obtained with PCA is shown in the score plot (the spatial distribution of the samples in the new coordinate system, PC1 x PC2) displayed in Figure 2. From Figure 2 we can see that PC1 discriminates the samples from São Paulo (labeled MP1 to MP8 in Figure 2) from those from Minas Gerais (labeled MP9 to MP15 in Figure 2). Analysing Figure 2 and equation (1), one can see that the samples collected in São Paulo present negative values for PC1, i.e., on average these samples have higher concentrations of Cu than the ones from Minas Gerais (see Table 3). According to the characteristics presented by these samples, we can conclude that the high levels of Cr can be attributed to the effluents of sugar, leather and paper factories, which are the main activities along the São Paulo stretch. The samples collected in Minas Gerais have positive values for PC1, which means high concentrations of Cd and Pb. This indicates indiscriminate use of pesticides and fertilizers on the potato and strawberry fields in the Minas Gerais region. It is also interesting to notice that the sample MP3 behaves differently from the other São Paulo samples, since it presents a very high concentration of Cr due to the presence of a sugar factory near this collection station (MP3). The two exceptions in the PCA classification are the samples MP8 and MP14. The sample MP8 was collected in the São Paulo region, but it was classified as a sample from the Minas Gerais region. This occurred due to two factors: (i) the proximity of the sample MP8 to the Minas Gerais region, and (ii) the collection season (rainy), during which solid particles, nutrients and agricultural chemicals (pesticides and fertilizers) are transported from this collection station, characterizing this sample as one from the Minas Gerais region. The sample MP14 was collected in the Minas Gerais region, but it was classified as a sample from the São Paulo region. This can be explained by the agricultural activities developed near this collection station (MP14), which lead to an increase in Cu (originating from pesticides used on the potato and strawberry fields in the Minas Gerais region).
After the PCA analysis, we carried out the HCA analysis with the goal of grouping similar samples. Figure 3 illustrates the dendrogram obtained with the HCA analysis. The vertical lines in Figure 3 represent the samples studied and the horizontal lines represent the similarity values between pairs of samples, between a sample and a group of samples, and between groups of samples. From Figure 3 we can see that two clusters are formed: (i) the first cluster is mainly formed by the samples collected [...]
[...] high concentrations of Cd, Cu and Pb and, therefore, the sample JP2 (from São Paulo) was classified as a sample from Minas Gerais. The sample JP8 was also classified as a sample from Minas Gerais due to its location, i.e., this collection station (JP8) lies in the border region between the states of São Paulo and Minas Gerais.
The HCA results obtained for the samples collected in June/2001 are summarized in Figure 5. From Figure 5 we can see two different groups: (i) samples collected in São Paulo (JP1 and JP3 to JP7) and two samples collected in Minas Gerais (JP13 and JP14). These two samples (from Minas Gerais) were incorrectly grouped into the cluster containing the São Paulo samples because their low concentration of Pb is similar to that of the samples from São Paulo. This occurs because the JP13 and JP14 samples were collected in spring regions and, consequently, are not contaminated with pesticides that contain Pb; (ii) samples collected in Minas Gerais (JP9 to JP12 and JP15) and two samples collected in São Paulo (JP2 and JP8). The region where the sample JP2 was collected has sugar cane crops, so pesticides are used there, which characterizes the sample JP2 as one from Minas Gerais. The sample JP8 was also classified as a sample from Minas Gerais due to its location, i.e., this collection point (JP8) lies in the border region between the states of São Paulo and Minas Gerais.
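The way HCA builds groups such as the two above can be illustrated with a small agglomerative clustering sketch. Note the assumptions: the grouping in this work was obtained with the incremental technique in Pirouette 2.0, which is not reproduced here; the sketch swaps in plain average linkage with Euclidean distance as a simpler stand-in, and the sample labels and autoscaled values are hypothetical.

```python
from math import dist


def average_linkage(points, n_clusters):
    """Agglomerative clustering with average linkage.

    Every sample starts as its own cluster; the two clusters with the
    smallest mean pairwise Euclidean distance are merged repeatedly
    until only n_clusters remain. Returns lists of sample indices.
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Locate the pair of clusters that are most similar (closest).
        p, q = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda pair: cluster_distance(clusters[pair[0]],
                                              clusters[pair[1]]),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Hypothetical autoscaled (Cd, Pb) values for six samples (illustrative only).
labels = ["S1", "S2", "S3", "S4", "S5", "S6"]
points = [(-1.0, -0.9), (-0.8, -1.1), (-1.2, -1.0),
          (0.9, 1.0), (1.1, 0.8), (1.0, 1.2)]

groups = average_linkage(points, n_clusters=2)
for group in groups:
    print([labels[i] for i in group])
```

Stopping at two clusters mirrors reading the dendrogram at the level where two groups remain; a sample whose metal profile resembles the other region's profile ends up in that region's cluster, as happens with the exceptions discussed above.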

Two seasons (March and June/2001)
The aim of analysing the March and June/2001 seasons together is to find relationships between the concentrations of the metals (arising from agricultural or industrial activities) and the sample collection season (rainy or dry).
From the data set presented in Table 2 we carried out PCA and HCA analyses. The PCA results showed that four metals (Cd, Cr, Zn and Mn) were important for discriminating the samples collected in March/2001 from those collected in June/2001. The first three principal components describe 86.96% of the overall variance. The first PC explains 35.71% of the variance while the second PC describes 29.04%. Together, these two PCs account for 64.75% of the overall variance. The score plot obtained is presented in Figure 6 and the loading values are presented in equations (5) and (6).
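To show how loading equations translate into positions on the score plot, the sketch below evaluates PC1 and PC2 using the coefficients reported in equations (5) and (6). The two sample profiles are hypothetical autoscaled concentrations chosen only for illustration, and the function name `score` is our own.

```python
# Loadings copied from equations (5) and (6) of this work.
LOADINGS = {
    "PC1": {"Cd": 0.680, "Cr": 0.373, "Zn": -0.619, "Mn": -0.123},
    "PC2": {"Cd": 0.233, "Cr": 0.485, "Zn": 0.401, "Mn": 0.742},
}


def score(autoscaled, pc):
    """Score of one sample on a PC: weighted sum of autoscaled concentrations."""
    return sum(weight * autoscaled[metal]
               for metal, weight in LOADINGS[pc].items())


# Hypothetical autoscaled concentrations (illustrative, NOT from Table 2).
high_cd_cr = {"Cd": 1.0, "Cr": 1.0, "Zn": -0.5, "Mn": 0.0}
high_zn_mn = {"Cd": -0.5, "Cr": -0.5, "Zn": 1.0, "Mn": 1.0}

for name, sample in (("high Cd/Cr", high_cd_cr), ("high Zn/Mn", high_zn_mn)):
    print(f"{name}: PC1 = {score(sample, 'PC1'):.4f}, "
          f"PC2 = {score(sample, 'PC2'):.4f}")
```

A profile rich in Cd and Cr lands on the positive side of PC1, while a profile rich in Zn and Mn lands on the negative side, which is the sign pattern used to separate the two collection seasons. The reported loading vectors are also unit-length and mutually orthogonal to within rounding, as expected for PCA loadings.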

Figure 1. Location of the Mogi Guaçu river basin in the states of São Paulo and Minas Gerais, with the collection stations indicated.

Figure 2. Score plot for samples collected in March/2001.

PC1 = 0.680 Cd + 0.373 Cr - 0.619 Zn - 0.123 Mn    (5)
PC2 = 0.233 Cd + 0.485 Cr + 0.401 Zn + 0.742 Mn    (6)
From Figure 6 we can see the good discrimination between the samples collected in March/2001 and June/2001. Figure 6 also shows that the samples collected in March/2001 (rainy season) have positive values for PC1 while the samples collected in June/2001 (dry season) have negative values for PC1. According to Figure 6 and equation (5), the samples collected in the rainy season (March/2001) present high concentrations of Cd and Cr, since these metals have positive coefficients in the loading equation (see equation (5)). The high concentrations of Cd and Cr at these collection stations (MP1 to MP15 in Figure 6), which generally originate from industrial activities, are a result of the rainy season, which is responsible for the homogeneous concentration of the metals at all collection stations, since the effects of the industrial activities (carried out in São Paulo) extend to the other collection stations in Minas Gerais. The samples collected in June/2001 (dry season) present negative values of PC1, which means high concentrations [...]

Figure 6. Score plot for samples collected in March and June/2001.

Table 1. Main polluting agricultural and industrial activities along the Mogi Guaçu river in the states of São Paulo (SP) and Minas Gerais (MG)

Table 2. Metal concentrations (mg kg⁻¹) in the sediment samples collected in March and June/2001

Table 3. Average values of metal concentrations (mg kg⁻¹) and standard deviations (SD) for the samples collected in March and June/2001