Application of multivariate statistical analysis to superficial soils around a coal burning power plant

The Thermoelectric Complex Jorge Lacerda (TCJL), located in Santa Catarina State, Brazil, is the largest coal burning thermoelectric complex in Latin America and consists of seven power plants with a total capacity of 832 MWe. In order to estimate the contribution of atmospheric releases from the TCJL to the elemental composition of the surrounding surface soils, forty-five samples were collected at distances of up to 8 km. Forty-two elements were determined by ICP-MS and ICP-AES after total acid dissolution. Principal component analysis was employed to identify the major sources that contribute to surface soil composition. Additionally, source apportionment using multiple regression on absolute principal component scores was performed in order to obtain quantitative information about the contribution of each identified source to the soil composition. Based on the results obtained, four sources were identified as the main contributors to the surface soil elemental composition. One of them was related to the TCJL because it retains volatile elements that are enriched in fly ash and released from the powerhouse stacks.


Introduction
Electric power generation in Brazil has been predominantly hydroelectric. Approximately 14% is of thermoelectric origin, of which a small fraction (18%) is from coal burning power plants. There is a clear tendency to increase thermal electric generation through natural gas power plants, but the capacity of the coal power plants has remained constant due to the quality and geographical distribution of coal resources. However, large investments are planned for this decade involving both natural gas and coal fired power plants [1]. The Thermoelectric Complex Jorge Lacerda (TCJL) is located in the township of Capivari de Baixo, in the Southeast area of the state of Santa Catarina, 130 km from Florianópolis. The TCJL is the largest coal burning thermoelectric complex in Latin America, formed by seven power plants with a total capacity of 832 MWe (Table 1) [2]. Flat lands of recent sedimentary formation, with an average altitude of 9 meters above sea level and occupied by rice plantations, dominate the area around the complex. The complex is located between two cities, Capivari de Baixo and Tubarão. The principal rivers of the area are the Tubarão and the Capivari. An aerial view of the TCJL and its surroundings is shown in Figure 1. The average wind speed is 2 m s-1 and the wind direction distribution is quite homogeneous, with a calm frequency of 11.5% (Figure 2). The meteorological conditions at Jorge Lacerda, with weak winds and a high frequency of unstable conditions, tend to cause air pollution [2].
All the units of the TCJL have electrostatic precipitators with an efficiency of approximately 98%, which remove suspended particulates from the gaseous effluent. To aid the dispersion of airborne pollutants, the four older generating units are equipped with a 150 m chimney. Each 125 MW unit has a 100 m chimney, and the newest unit of 350 MW has a 200 m chimney. The power plant operator runs an environmental monitoring program for major pollutants, involving automatic SO2 and NOx monitoring stations and total suspended particulate (TSP) sampling. No environmental data concerning trace elements are available [2]. Finkelmann and Gross [3] have proposed twenty-five elements as health hazards; nineteen of them were investigated in the present study.
The objective of this work was to verify the deposition of elements on surface soils due to the TCJL operation and to obtain the local soil signature for further studies of airborne particulate matter. To attain these objectives, the use of absolute principal component analysis (APCA) was investigated [4-13]. However, its application to surface soils is not usual. It should allow estimation of the contribution of each source identified by principal component analysis to the soil concentration (mass) of each element. The main advantage of this modeling method is its receptor orientation and the opportunity to evaluate source emissions without direct measurements.
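For orientation, the APCA idea can be written compactly as follows. This is a sketch of the formulation commonly used in receptor modeling, not a transcription of the exact procedure applied here; the symbols (C for concentrations, P for component scores, b for regression coefficients) are notation introduced for this illustration only. Concentrations of element j in sample i are standardized, an artificial sample with zero concentration for every element is scored in the same way, and the difference between the two scores gives the absolute principal component scores (APCS), on which each element is then regressed:

$$ z_{ij} = \frac{C_{ij} - \bar{C}_j}{s_j}, \qquad (z_0)_j = -\frac{\bar{C}_j}{s_j} $$

$$ \mathrm{APCS}_{ik} = P_{ik} - (P_0)_k $$

$$ C_{ij} = b_{0j} + \sum_{k} b_{kj}\,\mathrm{APCS}_{ik} $$

Here P_{ik} is the score of sample i on component k and (P_0)_k is the score of the zero-concentration sample. In this formulation the product of b_{kj} and the mean of APCS_{ik} over all samples is read as the mean contribution of source k to the concentration of element j.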

Experimental
A radial sampling network, as recommended by the Environmental Monitoring Laboratory Manual, EML-USDOE (1992) [14], was initially planned but could not be achieved due to obstacles such as rice plantations, coal and ash deposits and swamps. Forty-five samples of surface soil (0-5 cm) were collected around the installation, covering distances of up to 8 km from the Thermoelectric Complex (Figure 3). The chosen sites were flat, not swampy, non-cultivated areas and, when possible, far away from any roads. Each sample was a composite of 7 cores, collected in a straight line 50 cm from each other and taken with a PVC corer of 6.5 cm diameter and 5 cm height [14]. The samples were air-dried, sieved to the 2 mm fraction and then homogenized by fine grinding with a mortar.
For the determination of elemental contents, each soil sample was analyzed in triplicate. Aliquots of 300 mg were completely dissolved, in closed Teflon® vessels, with a 1:1:1 (v/v/v) mixture of HNO3, HF and HClO4 [15,16]. Each acid dissolution batch was composed of 17 Teflon® vessels (15 soil aliquots, a reagent blank and an aliquot of the standard reference material IAEA-356 lake sediment). Major elements (Na, Mg, Al, K, Ca, Ti and Fe) were determined by ICP-AES [16] and trace elements by ICP-MS [15]. During this work, four interlaboratory exercises were performed under the USDOE Mixed Analyte Performance Evaluation Program (MAPEP), two for soil samples and two for water samples (Tables 2 and 3, respectively; ± signs in the tables denote 95% confidence ranges). For water samples, all results were considered acceptable (±20%). For soil samples, both results for selenium were classified as acceptable with warning and one for nickel as not acceptable. All the data related to this nickel determination were verified and no reason was found for this result, particularly since all the other nickel determinations were considered acceptable. The results of the analysis of the standard reference material IAEA-356 were used to build a control chart (Figure 4), which gave information about the accuracy and precision achieved. The control bars (±20%, acceptable, and ±30%, acceptable with warning) reflect the MAPEP acceptance criteria. Based on these criteria, Sb and As could be regarded as acceptable with warning and all the others as acceptable. The poor arsenic results seem to be related to an analytical problem associated with this sample, since good results were obtained under the MAPEP program (Tables 2 and 3). For Sb the observed biases were, in general, negative, probably due to losses during the sample dissolution procedure.
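The acceptance bands described above can be illustrated with a short sketch. It assumes that the bias is expressed as a percentage of the reference value and that the band limits are inclusive; both are assumptions made for illustration, and the function name and the numerical values are hypothetical rather than taken from this study.

```python
def classify_bias(measured, reference):
    """Classify a result by its relative bias (%) against the reference value.

    Bands follow the criteria quoted in the text: within +/-20% acceptable,
    within +/-30% acceptable with warning, otherwise not acceptable.
    Treating the band limits as inclusive is an assumption.
    """
    bias = 100.0 * (measured - reference) / reference
    if abs(bias) <= 20.0:
        flag = "acceptable"
    elif abs(bias) <= 30.0:
        flag = "acceptable with warning"
    else:
        flag = "not acceptable"
    return bias, flag


# Hypothetical values, for illustration only (not data from this study)
examples = [("X1", 95.0, 100.0), ("X2", 76.0, 100.0), ("X3", 140.0, 100.0)]
for element, measured, reference in examples:
    bias, flag = classify_bias(measured, reference)
    print(f"{element}: bias {bias:+.1f}% -> {flag}")
```

Applied to each replicate of the reference material, such a classification gives the points that would be plotted against the ±20% and ±30% control bars.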
Soil pH, granulometry, cation exchange capacity, and sulfur, phosphate and organic matter contents were analyzed by the National Laboratory for Soil Research of the Brazilian Company for Agricultural Research (EMBRAPA).
In order to identify and evaluate the contribution of the pollutant sources to the soil surrounding the TCJL, in particular for volatile elements such as As, Pb, Cd and Sb, receptor models [17] were used. A multivariate statistical approach using Principal Component Analysis (PCA) [18,19], Absolute Principal Component Analysis (APCA) [4-13] and Hierarchical Cluster Analysis [18,19] was applied. All the statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS)® version 9.0.

Results and Discussion
The main soil sample characteristics are shown in Table 4 and their chemical composition is presented in Table 5. The soils are, in general, loams and have an acid pH. Some high phosphate values were observed, reflecting the agricultural activity of this region. The soil sample with the highest phosphate content (sampling point 4, circa 1 km from the TCJL in the ENE sector) was eliminated after the statistical tests described below. No statistically valid correlation (95% confidence level) was observed between any soil parameter described in Table 4 and the elements listed in Table 5. Therefore, these main soil parameters were excluded from further statistical tests. Histograms and normal and lognormal distributions were generated to validate the data and remove outliers. For each element, a stepwise linear regression was performed on the validated data set. Those elements that could not be predicted by any other element were excluded from the data set. In general, these had many results below or close to the quantification limit, such as Se, Ge, Ag and Bi.
One sampling point (point 4) presented several outlier values, such as the elemental concentrations of Ca and Zn. Visual examination of this sample showed the presence of shale fragments and, therefore, it was excluded from the data set. The validated data set had 44 sampling points and 44 variables.
According to Henry et al. [20], in order to obtain reliable results from a multivariate model in ecological applications, the degrees of freedom per variable should be at least 30; as a consequence, for 44 samples no more than 24 variables should be used for the Principal Component Analysis. In order to reduce the number of variables involved, additional criteria were applied. Preliminary PCA tests showed that the lanthanides formed a separate group; therefore, it was decided to exclude them from the PCA evaluation. Because Fe and Mn, and Mg and Ca, are strongly correlated, only one element of each pair was included (Fe and Mg). Other elements such as Li, Rb, Cs, Ba, Nb and W were also excluded in order to preserve others such as Zn, Sb or As, which are enriched in fly ashes [15] and, therefore, more relevant for the present study. After these exclusions, the data set was reduced to 21 variables and 44 samples. In Table 6, PCA results are presented with the elements retained in each component and their communalities. The communality represents how well the presence of one particular element is explained by the selected components. Based on the criteria defined by Hopke [17], an element with a factor loading greater than 0.4 was considered to belong to that component. A soil component means a soil phase or a mineral present with at least one of the selected elements associated with it. The four components chosen are able to explain 85.3% of the data set variability.
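The quantities named above (rotated factor loadings, communalities, the 0.4 loading threshold and the explained variance) can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the SPSS procedure used in this work: the data are synthetic, the function names are hypothetical, and the varimax routine is the common SVD-based iteration.

```python
import numpy as np


def pca_loadings(X, n_components):
    """Unrotated loadings from the correlation matrix of X (samples x variables)."""
    corr = np.corrcoef(X, rowvar=False)
    eigval, eigvec = np.linalg.eigh(corr)            # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]  # keep the largest ones
    return eigvec[:, order] * np.sqrt(eigval[order])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loadings matrix (variables x components)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() < criterion * (1.0 + tol):        # no further improvement
            break
        criterion = s.sum()
    return loadings @ rotation


# Synthetic stand-in data (assumption): 44 samples, 6 variables, 2 latent sources
rng = np.random.default_rng(0)
sources = rng.normal(size=(44, 2))
mixing = np.array([[1.0, 0.9, 0.8, 0.0, 0.1, 0.0],
                   [0.0, 0.1, 0.0, 1.0, 0.9, 0.8]])
X = sources @ mixing + 0.3 * rng.normal(size=(44, 6))

unrotated = pca_loadings(X, n_components=2)
rotated = varimax(unrotated)
communality = (rotated**2).sum(axis=1)       # variance of each variable explained
members = np.abs(rotated) > 0.4              # loading threshold quoted in the text
explained = (rotated**2).sum() / X.shape[1]  # fraction of total variance explained
```

Because the rotation is orthogonal, the communalities and the total explained variance are the same before and after rotation; only the distribution of loadings among the components changes, which is what makes the rotated components easier to interpret.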
Components 1 and 2 seem to represent the soil matrix. The first component is related to the oxides of iron (manganese) and titanium present in soils. Metals such as nickel, chromium and copper are also associated with this phase. The second component, with aluminum and also potassium, iron and scandium, is associated with clay minerals. The radioactive elements uranium and thorium are also associated with this soil phase. The third component was attributed to the TCJL and includes the elements strongly concentrated in fly ashes, such as As, Zn, Cd, Pb and Sb. However, As, Cd, Pb and Zn have substantial soil contributions, in contrast to Sb, for which the main source seems to be the TCJL. The fourth component includes sodium and potassium, indicating some marine aerosol and biomass contributions.
In order to validate the PCA results, a hierarchical cluster analysis was performed including the component factor scores retained in the PCA as new variables. The result obtained was in agreement with that described above. Four clusters were observed, each one including the elements belonging to one PCA component together with the corresponding component factor scores, as shown in Figure 5. The agreement between the PCA and the cluster analysis indicates that the database is valid.
Source apportionment was then performed by APCA [4-13]. The results obtained (Table 7) show that, together with the TCJL component, the soil matrix (PCA components 1 and 2) contributes a significant percentage of the mass content of As, Pb and Zn in the surface soils. On the other hand, the TCJL component (component 3) represents 2/3 of the soil content of Sb and 1/2 of the soil content of Cd. The S/M values close to one show that the calculated elemental concentrations in soil are in agreement with the observed mean values. The pie charts shown in Figure 6 better illustrate these relative contributions. In spite of this TCJL contribution, the mean values observed for the elements of component 3 are similar to those reported by other authors [21-23], as shown in Table 8.
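The regression on absolute principal component scores can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the SPSS workflow used here: the components are left unrotated for brevity, the data are synthetic, the function name `apca` is hypothetical, and the ratio computed at the end (sum of source contributions over the observed mean, intercept excluded) is only one plausible reading of the S/M ratio in Table 7.

```python
import numpy as np


def apca(X, n_components):
    """Mean source contributions per element from a samples x elements matrix X."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                              # standardized concentrations
    eigval, eigvec = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigval)[::-1][:n_components]
    W = eigvec[:, order] / np.sqrt(eigval[order])     # unit-variance score weights
    scores = Z @ W
    z0 = (0.0 - mean) / std                           # artificial zero-concentration sample
    apcs = scores - z0 @ W                            # absolute principal component scores
    A = np.column_stack([np.ones(len(X)), apcs])      # intercept + APCS as regressors
    coef, *_ = np.linalg.lstsq(A, X, rcond=None)      # one regression per element
    intercept = coef[0]
    contrib = coef[1:] * apcs.mean(axis=0)[:, None]   # components x elements
    return intercept, contrib


# Synthetic stand-in data (assumption): 44 samples, 5 elements, 2 sources
rng = np.random.default_rng(1)
sources = rng.lognormal(size=(44, 2))
mixing = np.array([[5.0, 3.0, 0.5, 0.2, 1.0],
                   [0.3, 1.0, 4.0, 2.0, 1.0]])
X = sources @ mixing + 0.2 * rng.normal(size=(44, 5))

intercept, contrib = apca(X, n_components=2)
ratio = contrib.sum(axis=0) / X.mean(axis=0)          # assumed analogue of S/M
```

In this sketch the intercept of each regression collects the part of the mean concentration that is not attributed to any retained component, so the intercept plus the summed contributions reproduces the observed mean exactly, while the ratio without the intercept approaches one only when the retained components account for most of the element's mass.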
The distribution with distance from the TCJL for some elements belonging to components 1-4 is shown in Figure 7 a-d. As there is no well-defined main wind direction, a symmetric distribution in relation to the complex is expected for those elements with significant contributions from atmospheric releases from the TCJL. A maximum at some distance from the complex, coinciding with the plume touchdown, is also expected. Due to the presence of coal and ash deposits around the complex, elevated concentrations close to the complex are also expected. Curve (c), with As and Zn, is the one that most closely corresponds to this description. Thus Figure 7 (c), with elements belonging to component 3, represents atmospheric releases from the TCJL.

Conclusions
Based on the application of multivariate data treatment methods, it was possible to identify the origin of metals in surface soils around the Thermoelectric Complex Jorge Lacerda (TCJL). In particular, it was possible to verify that the thermoelectric complex contributes with a significant

Figure 1. Aerial photograph of the Thermoelectric Complex Jorge Lacerda (TCJL) and its surroundings.

Figure 3. Soil sampling points with the TCJL at the center.

Figure 4. Control chart based on the standard reference material IAEA-356.

Figure 5. Hierarchical cluster analysis dendrogram showing the distance between the elements, with the PCA factor scores included as new variables.

Table 3. Results obtained during participation in the USDOE Mixed Analyte Performance Evaluation Program (MAPEP) for water samples

Table 4. Descriptive statistics of the main soil characteristics

Table 5. Descriptive statistics of the studied elements in superficial soils around the TCJL (values in mg kg-1)

Table 6. Varimax rotated factor loadings matrix and communalities obtained with principal component analysis for the studied elements in the superficial soils around the TCJL (EV: eigenvalue, VAR: explained variance, CVAR: cumulative variance explained) a
a Only factor loadings larger than 0.1 are shown; those higher than 0.4 are in bold.

Table 7. Results of source apportionment obtained by the application of APCA, and the ratio of the calculated concentration in soil to the observed mean value (values in mg kg-1) a
a Only statistically significant regression coefficients (95% confidence level) are shown.