Geostatistical techniques applied to mapping limnological variables and quantify the uncertainty associated with estimates

: Aim: This study aimed to map the concentrations of limnological variables in a reservoir employing semivariogram geostatistical techniques and Kriging estimates for unsampled locations, as well as the uncertainty calculation associated with the estimates. Methods: We established twenty-seven points distributed in a regular mesh for sampling. Then it was determined the concentrations of chlorophyll-a , total nitrogen and total phosphorus. Subsequently, a spatial variability analysis was performed and the semivariogram function was modeled for all variables and the variographic mathematical models were established. The main geostatistical estimation technique was the ordinary Kriging. The work was developed with the estimate of a heavy grid points for each variables that formed the basis of the interpolated maps. Results: Through the semivariogram analysis was possible to identify the random component as not significant for the estimation process of chlorophyll-a, and as significant for total nitrogen and total phosphorus. Geostatistical maps were produced from the Kriging for each variable and the respective standard deviations of the estimates calculated. These measurements allowed us to map the concentrations of limnological variables throughout the reservoir. The calculation of standard deviations provided the quality of the estimates and, consequently, the reliability of the final product. Conclusions: The use of the Kriging statistical technique to estimate heavy mesh points associated with the error dispersion (standard deviation of the estimate), made it possible to make quality and reliable maps of the estimated variables. Concentrations of limnological variables in general were higher in the lacustrine zone and decreased towards the riverine zone. The chlorophyll-a and total nitrogen correlated comparing the grid generated by Kriging. Although the use of Kriging is more laborious compared to other interpolation methods, this technique is distinguished for its ability to minimize the variance of the estimate and provide the estimated value of the degree of uncertainty.

Abstract: Aim: This study aimed to map the concentrations of limnological variables in a reservoir employing semivariogram geostatistical techniques and Kriging estimates for unsampled locations, as well as the uncertainty calculation associated with the estimates.Methods: We established twenty-seven points distributed in a regular mesh for sampling.Then it was determined the concentrations of chlorophyll-a, total nitrogen and total phosphorus.Subsequently, a spatial variability analysis was performed and the semivariogram function was modeled for all variables and the variographic mathematical models were established.The main geostatistical estimation technique was the ordinary Kriging.The work was developed with the estimate of a heavy grid points for each variables that formed the basis of the interpolated maps.Results: Through the semivariogram analysis was possible to identify the random component as not significant for the estimation process of chlorophyll-a, and as significant for total nitrogen and total phosphorus.Geostatistical maps were produced from the Kriging for each variable and the respective standard deviations of the estimates calculated.These measurements allowed us to map the concentrations of limnological variables throughout the reservoir.The calculation of standard deviations provided the quality of the estimates and, consequently, the reliability of the final product.Conclusions: The use of the Kriging statistical technique to estimate heavy mesh points associated with the error dispersion (standard deviation of the estimate), made it possible to make quality and reliable maps of the estimated variables.Concentrations of limnological variables in general were higher in the lacustrine zone and decreased towards the riverine zone.The chlorophyll-a and total nitrogen correlated comparing the grid generated by Kriging.Although the use of Kriging is more laborious compared to other interpolation methods, this technique is distinguished for its ability to minimize the variance of the estimate and provide the estimated value of the degree of uncertainty.
The use of models and the generation of sample meshes, followed by interpolation, have been consolidated over time in limnology studies.
Geostatistical techniques for estimates in unsampled locations have been applied, for example, in evaluating the spatial distribution of fish in lakes (Bezerra-Neto et al., 2013), macrophyte geographical distribution modeling in river basins (Cancian, 2012), spatial distribution of phytoplankton (Campos, 2010), spatial structuring of submerged aquatic vegetation in an estuarine habitat (Burgos-León et al., 2013), bathymetric and morphometric studies in lotic (Zani et al., 2008) and lentic environments (Lopes et al., 2013;Cigagna et al., 2014;Furlan et al., 2015), cyanobacteria mapping (Utsumi et al., 2015), among others.These studies have at least one thing in common, that is, they attempt to make great predictions of the variable under study in unsampled locations.
To predict a variable of a point or block the uncertainty associated with the estimate should also be considered.In this case, the traditional methods of evaluation such as radial functions, spline, polynomial and inverse distance, among others, are limited so as to calculate the uncertainty associated with the estimate.Statistical methods can evaluate the accuracy of the estimate, however, these methods

Introduction
The theory of regionalized variables, also known as Geostatistics, was developed based on the description, modeling, and the use of spatial continuity of regionalized variables.According to this theory, the variables representing a regionalized phenomenon have a casual variation in space, however, as will the same approach, there is the existence of a correlation that reflects the structure of regionalized phenomenon (Huijbregts, 1975).The application of this theory had its development based on several studies of empirical conducted by researchers in the gold mines in South Africa in the 50's.
One notable difference between Classical Statistics and the Geostatistics is that the first requires independent sample values spatially, while the second requires values of the samples correlated (dependent) for a certain distance in space.According to Matheron (1963), the starting point for the development of Geostatistics was due to the inability of Classical Statistics to consider the spatial aspect of a natural phenomenon, which is an important feature for assessing the physical environment.For this feature, the Geostatistics has many applications in Geosciences, mainly to describe and model spatial variability patterns (semivariogram), predict values in unsampled locations (Kriging), get the uncertainty of estimates for unsampled locations (standard deviation Kriging) and optimizing sampling meshes.
over-estimate this error, since it does not take into account the spatial correlations (Guerra, 1988).
Among the interpolation methods currently used, Kriging's geostatistical technique constitutes the best unbiased linear estimator (Isaaks & Srivastava, 1989), considering the use of the semivariogram function, which allows us to model the variability of the variable studied in space.Such a function seeks to capture the spatial correlation that will be used to assign a weight to the samples surrounding the estimates.In general the semivariogram is an increasing function of distance (h), since the values taken from two different points are averaged, the farther away they are the more different they are from each other.Thus the semivariogram gives precise meaning to the traditional notion of the influence zone of the samples (Guerra, 1988).
The semivariogram function is a basic technique for geostatistical estimates because it works directly with the spatial variation, i.e., it is a measurement that expresses the spatial correlation of the variable.This measurement constitutes the semivariances differences in experimental values situated at regular intervals.In stationary conditions, the expected average value is constant, which reduces the semivariogram to the quadratic mean of the differences between the experimental values (Isaaks & Srivastava, 1989).
According to Guerra (1988), although it exists theoretically, a γ(h) trend indefinitely to (h) occurs frequently in practice that from a certain distance the samples have some influence over each other, as can be observed in Figure 1.The parameter (a) is called range and corresponds to the idea of influence of a sample area, i.e., the structured component.It marks the distance from which a point of the variable under study has no more influence on the neighboring point.It represents the beginning of a pure randomness zone.
When a semivariogram in different directions exhibits the same behavior, it's said to be an isotropic variable; otherwise, it is said there is anisotropy.An anisotropy corresponds to the existence of privileged directions that conditioned the genesis of the natural phenomenon under study.In the case of isotropic γ(h) does not depend on the direction in which one studies the phenomenon, because their behavior is the same in any of them.In the anisotropic case, γ(h) will also influence the direction (h) and not only in the module itself (Andriotti, 2003).
For obtaining a semivariogram, it is assumed that the regionalized variable has a stationary behavior, that is, the expected mean values, as well as its spatial covariance, is statistically the same for a given area.It is assumed, therefore, that the values within the area of interest have no tendency that might affect the results (Landim et al., 2002).
The geostatistical technique of Kriging comprises a set of technical estimates and prediction of surfaces based on semivariogram modeling of spatial correlation structure.What distinguishes Kriging from other interpolation methods is the estimation of a spatial covariance matrix that determines the weights assigned to different samples, treatment of data redundancy, the neighborhood being considered in the inferential process and the uncertainty associated with the estimated value (Camargo et al., 2004).Hence the difference is the way the weights are assigned to different samples.In the case of interpolation by a simple average, for example, the weights are all equal to 1/N (N = number of samples); at the interpolation based on the inverse square of the distance, the weights are defined as the inverse square of the distance separating the interpolated value of the values observed.
In Kriging, the procedure is similar to the interpolation by the weighted moving average, except that here the weights are determined from a spatial analysis based on semivariogram, which brings measurement differences in variance as a function of the distances between observed points or "lag".The Kriging uses these parameters as input terms in a system of linear equations to make predictions.
The Kriging geostatistical technique stands out for its ability to minimize the variance of the error between the sample and the estimated point and allows the estimation measurement accuracy, that is, the uncertainty associated (Goovaerts, 1997).Minimizing the variance of the estimation, in  (Guerra, 1988).
relation to the condition of non-bias is a question that requires, for its solution, using the technique of Lagrange multipliers.Theoretical considerations show that the optimum weights are found by solving a set of linear equations, whose coefficients are a function of the semivariogram and location of the samples in relation to the point or block to be estimated (Andriotti, 2003).
In this study, the Kriging geostatistical technique was applied in the estimation of limnological variables in a reservoir using twenty-seven sampling points distributed in a regular mesh.The objective was to map the dispersion of limnological variables starting with a detailed semivariogram analysis that supported the estimates and, consequently, enabled us to calculate the respective standard deviations of the estimated values.

Study area
The central reservoir of FEENA ("Edmundo Navarro de Andrade" State Forest) is situated in a Sustainable Use Conservation Unit of the State of São Paulo, located in the southeastern region of Brazil, at the eastern end of the city of Rio Claro, SP (Figure 2).The quadrant with UTM (Universal Transverse Mercator) coordinates covering the study area are: 240258-240442E and 7519137-7519380S, 23K zone, datum SIRGAS/2000.
The climate of the area according to the Köppen classification is tropical with two distinct seasons -Cwa, i.e., w: dry in winter, a: hottest month with temperatures above 22 °C.
As for rainfall, there is a dry period between April and September, with average rainfall from 34.2 to 72.3 mm, and a rainy period, from October to March, with the average ranging from 119.3 mm to 338.6 mm.The annual average rainfall is 1366 mm.
The study area is within the Ribeirão Claro catchment which, in turn, is part of the watershed of the Corumbataí River.The main drainage system in the region is represented by the Corumbataí River and its tributaries: Ribeirão Claro, Cabeça and Passa Cinco.Such rivers' source is on the slopes of the Cuesta and they flow south, emptying into the Piracicaba River which runs westward, and takes its waters to the Tietê River (Cottas, 1983).The main tributaries of the Ribeirão Claro that cross FEENA are the Lavapés, Santo Antônio and Ibitinga streams.The reservoir is a result of the damming of the Ibitinga stream waters in the final portion of its course, within the boundaries of FEENA.This affluent feeds the reservoir at the north end, the spillway is south, draining the water into the Santo Antônio stream, left tributary of the Ribeirão Claro.The latter is responsible for part of the water supply for the city of Rio Claro.Some principal morphometric parameters of the reservoir are presented in Table 1.

Data acquisition
The authorization for the development of this research project was granted by the Forestry Institute, an agency of the Environmental Department of the State of São Paulo (SMA Case N°. 260108-006744/2012).
Twenty-seven sampling points were established and distributed in a regular mesh (Figure 3).The sampling was conducted in September 2012 (end of the dry season).Each point was georeferenced and identified using a Garmin GPS12.The UTM projection system and the datum SIRGAS/2000 were adopted.
A bottle of the Van Dorn horizontal type, Wildco make, model 1120, with 2.2 liter storage capacity was used for water sampling.Water samples were placed in frosted plastic bottles and stored in a thermal insulated box with ice.They were then transported to the limnology laboratory at the Ecology Department of the Institute of Biosciences of the Universidade Estadual Paulista/UNESP, Rio Claro Campus, where chlorophyll-a, total phosphorus and total nitrogen calculations were made according to the methods described by (Lorenzen, 1967), (Golterman et al., 1978) and (Mackereth et al., 1978), respectively.

Semivariogram function
In the next stage, we carried out a detailed analysis of the spatial variability of each limnological variable through semivariograms.Table 1.Morphometric parameters of the FEENA reservoir (Cigagna et al., 2014).For spatial variability analysis of limnological data, we used the VARIOWIN ® 2.21 software authored by (Pannatier, 1996).The VARIOWIN ® consists of three modules (Prevar2D, Vario2D, and Model).With Prevar2D a file was created with extension PCF (Pair Comparison File) containing all possible distances between the observation data points.Then, we calculated a semivariogram with an angular tolerance of 45 o to each variable using the Vario2D.Finally, the Model was used in the modeling of semivariograms which were adjusted with the spherical model.The spherical variographic models refer to transitional phenomena, characterized by semivariograms whose upward curve tendency rise plateaus at C 0 + C 1 .

Morphometric parameters
The anisotropy was also modeled, since the limnological data showed principal directions of 0° and 90° which, respectively represent the directions of greater and lesser correlation between the data.The final product modeling constitutes the variographic parameters nugget effect, range, sill and anisotropy that describe the spatial variability of natural phenomena.These parameters supported the Kriging performed in the next step.

Ordinary Kriging
Kriging geostatistical technique was employed to estimate a heavy and regular grid of values using the Surfer10 ® software.Variographic parameters of the random component (nugget effect), structured component (sill), the semivariogram range (a) and anisotropy, obtained in the modeling of semivariograms were used.The predicted surface was computed to satisfy the conditions of minimal tendency and minimum variance and in this way to obtaining by means of a Surfer10 ® subroutine, a measure of estimate accuracy, i.e., the standard deviation.

Variographic parameters
To check the significance of the variability of the nugget effect we calculated the ratio between C 0 and C 1 , which expresses the degree of randomness of the variable (Table 2).
The ratio of both parameters C 0 and C 1 allows us to classify the random component as not significant for the estimation process of chlorophyll-a and as significant for total nitrogen and total phosphorus.According to Guerra (1988), the combination of these two parameters in a single parameter, which is the relative nugget effect, it is a convenient way to express the relative randomness of regionalization.
The distance (h) wherein the variables curve stabilized on semivariograms (Figure 4), i.e., the range (a), was 82 meters on average.These parameters reflect the distance to where there is a spatial dependence of the variable, that is, the distance away from the study point which has no more influence on the neighboring point.This distance (a) delimiting the zone of influence from which the variability is maximum.

Geostatistical maps and standard deviations of the estimates
Through the geostatistical technique of semivariogram analysis and Kriging, it was possible to map the variability of limnological variables in the reservoir, resulting in the geostatistical maps as a finished product and their respective standard errors of the estimates.Obtaining the minimum variance of the estimate for the situation we guaranteed the best possible advantage of limnological data, i.e., it provides greater security to have the most accurate assessment possible with the twenty-seven sampling points available.
The chlorophyll-a (Figure 5a) has a higher concentration in the central zone of the reservoir with a maximum level of 6 µg.L -1 .The lowest values were found in the riverine zone (0.60 µg.L -1 ), close to the Ibitinga stream.
The total nitrogen had little variation, with levels between 0.23 and 0.24 mg.L -1 (Figure 6a).It was observed that, like chlorophyll-a, the highest values were located in the central zone and showed a decrease in the riverine zone.
The highest concentrations of total phosphorus occurred in some places the reservoir shore, especially in the western shore, and smaller in the central zone (Figure 7a).The highest level was 170 µg.L -1 and the minimum 10 µg.L -1 .
Usually all the maps show a longitudinal gradient to the concentrations of the limnological variables.The riverine zone that's where lower concentrations were shown.The highest levels were observed in the central zone and in the lacustrine zone, where the reservoir spillway is located.Among the variables compared here, geostatistical maps of chlorophyll-a and total nitrogen showed the higher spatial correlation.Pearson's linear correlation coefficient between the grids generated by Kriging indicated 0.38 between chlorophyll-a and total nitrogen.This correlation using the grid generated by Kriging can be useful in characterizing the limiting nutrient in aquatic systems.Chlorophyll-a mapping allows a quantitative biomass assessment present in the environment, which can be used as a trophic condition indicator (Dodds et al., 1998).
One of the advantages of the Kriging interpolation technique is that each estimate is accompanied by a corresponding Kriging standard deviation (Figures 5b, 6b and 7b).Thus, for any values of a contour map, a reliability map can be produced.Therefore, if a standard deviation is greater than the standard deviation of the initial sample, the estimate has a significant degree of randomness (Clark, 1979).
The average values of standard deviations calculated for each variable were 1.57 for chlorophyll-a, 0.10 for total nitrogen, and 55.80 for  the isovalue curves tend to be concentric towards the determined value for the variable, therefore the error dispersion is greater.Moreover, the chart indicates that the uncertainty of the estimate increases as it moves away from the observation points towards the edges of the reservoir, where the sampling mesh is smaller.In this sense, the standard deviation map can still provide subsidies for the most appropriate establishment of the sampling mesh.

Conclusion
All maps show a longitudinal gradient to the concentrations of the limnological variables.The levels were higher in the lacustrine zone and decreased towards the riverine zone.The chlorophyll-a total phosphorus, respectively.Due to the small number of sampling points available and the variability of the natural phenomena expressed by semivariograms, a relatively high variance is to be expected.However, Kriging allowed the minimum variance of the uncertainty to the available data.
Considering that the mesh used in the Kriging was the same for all limnological variables and the fact that the uncertainty estimated is linked directly to the configuration of the sampling points, the standard deviation maps showed little visual difference between them in general appearance.As a rule, the map of the standard deviations shows a configuration known as "bull eyes", i.e., at the sampling site where the points are more numerous,  and total nitrogen correlate comparing the grids generated by Kriging.The use of Kriging starting with a detailed spatial analysis, based on semivariograms, enabled us to perform more accurate estimates of the dispersion of the variables in the reservoir, in this sense this technique considers the random component, structured component, the influence radius of sample points and the anisotropic aspect of the system.The estimate of heavy meshes, associated with the error dispersion, allowed us to produce qualified and reliable maps of the estimated limnological variables concentrations.Although the use of Kriging is more complex and laborious, compared to other interpolation methods, this technique is distinguished for its ability to minimize the variance of the estimate and provide the estimated value of the degree of uncertainty.

Figure 3 .
Figure 3. Sampling points distributed in a regular mesh in the FEENA reservoir.

Figure 5 .
Figure 5. Estimates (a) and standard deviations (b) of the chlorophyll-a concentrations.

Figure 6 .
Figure 6.Estimates (a) and standard deviations (b) of total nitrogen concentrations.

Figure 7 .
Figure 7. Estimates (a) and standard deviations (b) of total phosphorus concentrations.