OPTIMIZATION OF THE SAMPLING SCHEME FOR MAPS OF PHYSICAL AND CHEMICAL PROPERTIES ESTIMATED BY KRIGING

The sampling scheme is essential in the investigation of the spatial variability of soil properties in Soil Science studies. The high costs of sampling schemes optimized with additional sampling points for each physical and chemical soil property, prevent their use in precision agriculture. The purpose of this study was to obtain an optimal sampling scheme for physical and chemical property sets and investigate its effect on the quality of soil sampling. Soil was sampled on a 42-ha area, with 206 georeferenced points arranged in a regular grid spaced 50 m from each other, in a depth range of 0.00-0.20 m. In order to obtain an optimal sampling scheme for every physical and chemical property, a sample grid, a medium-scale variogram and the extended Spatial Simulated Annealing (SSA) method were used to minimize kriging variance. The optimization procedure was validated by constructing maps of relative improvement comparing the sample configuration before and after the process. A greater concentration of recommended points in specific areas (NW-SE direction) was observed, which also reflects a greater estimate variance at these locations. The addition of optimal samples, for specific regions, increased the accuracy up to 2 % for chemical and 1 % for physical properties. The use of a sample grid and mediumscale variogram, as previous information for the conception of additional sampling schemes, was very promising to determine the locations of these additional points for all physical and chemical soil properties, enhancing the accuracy of kriging estimates of the physical-chemical properties.


INTRODUCTION
An insufficient sampling intensity can be an important limiting factor for precision agriculture and has been the subject of several studies (Atkinson & Lloyd, 2007;Montanari et al., 2012).Approximately 80 to 85 % of the total errors in liming and fertilizer recommendations can be attributed to the peculiarities of soil sampling (Siqueira et al., 2010).The sampling scheme influences research efficiency and cost of the spatial variability of soil properties.The collection, preparation and analysis of large numbers of samples is cost-intensive; however, the higher the number of samples, the clearer becomes the view of the spatial variability of regionalized variables.Although sparse sampling schemes are cheaper, they might fail to capture some spatial dependence aspects (Groenigen et al., 1999;Groenigen, 2000).Thus, geostatistics plays an important role in optimal sampling studies of regionalized variables (Kerry & Oliver, 2007;Pang et al., 2009;Baume et al., 2011;Montanari et al., 2012).
As the initial part of many geostatistical studies, the sampling scheme plays a fundamental role in the quality of estimations (Bilgili et al., 2011).A regular grid is typically recommended for result optimization (Vašát et al., 2010), although it does not take regionalized variables into account and the area where sampling optimization firstly occurs (Christakos & Olea, 1992).In addition, an optimal sampling scheme for each soil physical and chemical property would make the adoption of sampling optimization unfeasible.In this sense, scale variograms of soil properties, used for a same region, represent a simple and powerful method to provide integrated information (Vieira et al., 1991).
Ordinary kriging (OK) is the most commonly used interpolation method in geostatistics.The relative transparency and simplicity of the OK algorithm, combined with its great success in past applications in Soil Science, explain its continued popularity (Groenigen, 2000).An important property of OK is the intrinsic hypothesis that accurate estimates can be expressed by the kriging variance (Isaaks & Srivastava, 1989).It is important to observe that both scale and non-scale variograms have the same kriging values (Vieira et al., 1997).Therefore, the kriging variance can be used as a tool to evaluate the quality of sampling schemes (Groenigen, 2000;Lark & Lapworth, 2012).
Several studies using a preliminary sampling grid as previous information have been performed (McBratney & Webster, 1983;Groenigen et al., 1999;Montanari et al., 2012).Groenigen et al. (1999), for instance, showed that the Spatial Simulated Annealing (SSA) algorithm could be employed to construct sampling schemes with minimal kriging variance by the use of a continuous solution in space, with a higher number of real observations.Burgess et al. (1981) and Groenigen et al. (1999) indicated a considerable influence of anisotropy on the sampling scheme optimization, increasing sample density towards a greater variability.Groenigen (2000) also mentioned that the variogram parameters also influence the optimal sampling schemes.
Optimal sampling schemes show the locations of additional sample points considered for the simultaneous analysis of several properties (Vašát et al., 2010;Montanari et al., 2012).This information may play an important role in terms of precision and for the economical feasibility of the application of these techniques in precision agriculture, e.g., for the specific management of crop fields.The aim of this study was to develop an optimal sampling scheme for a set of physical and chemical soil properties, using the SSA algorithm in an experimental area with a preestablished sampling grid.

Description of the study area
The study area is located in Guariba, in Northeastern São Paulo State, Brazil.The geographic coordinates are (21 o 19' S, 48 o 13' W, average altitude 640 m).According to the Köppen classification, the climate is mesothermic with dry winters (Cwa), with an annual average rainfall of 1,400 mm, with rains concentrated from November to February.The natural vegetation comprises tropical semideciduous forest and riparian vegetation.The relief is predominantly undulated with an average slope of 3 to 8 %.
The experimental area had been growing sugarcane for over 30 years and, at the moment of this study, the cane was about to be cut for the fifth time (ratoon) after replanting.The soil was classified as a clayey Oxisol (LVef) developed from basalt alterations (Embrapa, 2006).A regular 206 point grid was considered for the 42-ha area (Figure 1); observation points were spaced 50 m apart, geo-referenced and sampled in the depth range of 0.00-0.20 m, for physical and chemical property analysis.

Descriptive statistics
Descriptive statistical values of mean, median, standard deviation, minimum, first quartile, third quartile, maximum, coefficient of variation, skewness and kurtosis were previously used in the description of the variable distribution shapes for the application and interpretation of the geostatistical analysis.

Variograms and kriging
Considering the intrinsic hypothesis, the isotropic variogram is defined as the variance between the difference of the regionalized random variable pairs, Z (x) and Z (x + h), separated by the distance vector h, i.e.: [ ] In practice, the experimental variogram is estimated by: where is the variogram for the h class distance; and N(h) represents the number of the regionalized variable pairs separated by an h distance.In this experimental variogram, allowable models are adjusted using nonlinear regression (Isaaks & Srivastava, 1989).This estimated model, ) ( ˆh g , is used in the matrix system of OK linear equations.The OK estimator can be written as the weighted average of N observations by: where x z 0 ) ( ˆ is the kriging estimate at the x 0 point; Z(x i ) represent the measured values at x i , i = 1,2,…N; and λ i is the kriging weights attributed to the Z(x i ) nearest neighbor values to estimate x z 0 ) ( ˆ.The kriging variance can be calculated by: where µ(x 0 ) is the Lagrange multiplier (Isaaks & Srivastava, 1989).In equation 4, the variogram, sampling numbers and local point to be estimated are the only factors that influence kriging variance.This means that if the variogram is known or can be assumed, it is possible to calculate the kriging variance before sampling is performed.This fact is used to optimize the sampling scheme and minimize kriging variance (Groenigen, 2000).Vieira et al. (1991) proposed the following scaling technique for the variogram: where ) ( ˆh esc g is the estimated scale variogram; ) ( ˆh g , the estimated non-scale variogram; and Var(Z), the data sampling variance.Hypothetically, equation 5 requires a finite variance, which is guaranteed if the second stationary order exists.However, in practice, this condition does not always exist and is not easily verified (Vieira et al., 1997).
A medium-scale variogram was adjusted for all experimental data of both physical and chemical properties, which generated a representative model of spatial dependence for each property set (Montanari et al., 2012).The scale variograms were then used in the extended SSA algorithm for the development of two optimal sampling schemes: one for the physical and the other for the chemical properties.

Spatial Simulated Annealing (SSA)
In this study, an extended SSA method, developed as an algorithm, was used to optimize spatial sampling schemes, which is a modification of the simulation algorithm of Simulated Annealing (SAS) (Groenigen et al., 1999;Groenigen, 2000).Groenigen & Stein (1998) extended this method to optimize sampling schemes, using the minimization of kriging variance as criterion.This extension also includes the use of a prior sampling scheme and variogram modeling of soil properties.The term "annealing" is related to the metallurgical process of metal alloy heating and slow cooling in order to toughen and reduce brittleness (Goovaerts, 1997).The SSA algorithm can be described by the following steps: 1. Consider a set of possible spatial sampling schemes consisting of n observations (E n ), with an objective function (.) which should be minimized.The optimization process begins with a random sampling scheme ) is obtained from the random perturbation sequence E 0 with the motion of random observation E 0 , with a randomly chosen vector h x ; 3. The transition probability , that E i+1 is accepted, is determined by the Metropolis criterion (Metropolis et al., 1953): Where t is a positive control parameter, which decreases with the optimization progress.If E i+1 is accepted, it serves as a starting point for the next scheme E i+2 ; thus, the process continues until the disturbances in the samples no longer significantly reduce the objective function φ (.).During the process development, the size of vector h x will also decrease.Simultaneously, it decreases the chances of unfavorable sampling schemes to be accepted with the t parameter reduction, shown in equation 6 (Aarts & Korst, 1990).By this process, the sampling scheme is safely stabilized in an optimal configuration for global sampling.In addition, SSA allows the use of several quantitative criteria of optimization, including the minimization of the kriging variance.In this study, 20 sample points were added to the initial sampling grid by the SSA method, which represents approximately 10 % of the total points.

Minimization of kriging variance
The main objective of this criterion is to minimize the kriging variance for the entire study area, which is given by the integral where E is the sampling scheme; A the area size; x 0 the vector location to be estimated; and 2 KO s is the kriging variance.
As in most cases, the integral in equation 6 cannot be analytically solved; considering that the kriging estimates are calculated for the grid points, the SSA method discretizes the area to calculate the defined function presented in equation 7. It can then be approximated by the average of kriging variance at points of a denser grid, i.e., where x e,j represents the grid point j; and n e the number of points in the grid used for the calculation (Groenigen et al., 1999;Groenigen, 2000).This optimization criterion was proposed by Christakos & Olea (1992) and a complete discussion about the SSA optimization procedure is given by Groenigen et al. (1999).These same authors developed a computer program called Sanos for Windows 0.1, where the extended SSA algorithm was implemented, using the minimization of the kriging variance as criterion by supplying an optimal spatial sampling scheme in the presence of previous information (Groenigen & Stein, 1998;Groenigen et al., 1999).

Validation of the proposed optimal sampling scheme
After optimization of the sampling scheme by the SSA method, the variance values of each location interpolated by OK were estimated, using the scale variogram parameters.At each interpolated site, the estimated variance values were compared, both before and after sampling optimization, through the relative improvement index (RI) (Equation 9).obtained by interpolation using, respectively, an original sample scheme (N=206) and optimal sampling scheme (N=226), as proposed by the SSA method.

RESULTS AND DISCUSSION
The results of the descriptive analysis of all studied properties (Table 1) showed that the mean and median values were close, indicating a symmetrical distribution, with asymmetry and kurtosis values close to 0. Only the properties P and Mg contents had a higher degree of asymmetry.According to Warrick & Nielsen (1986), the coefficient of variation (CV) obtained for the properties pH, CEC, clay content, silt content, and ST can be classified as low, for the properties OM, H + Al, V, MS, FS and VFS as average; and the properties P, K, Ca, Mg, SB, and AG, as high.These same authors observed that when dealing with natural data, the theoretical distribution adjustment is only approximate.Data normality is not a geostatistical requirement; however, a strong asymmetry of the distribution would not be appropriate (Cressie, 1990).Salviano et al. (1998) and Souza et al. (2004) also found high CV values for P and Mg, possibly indicating the influence of fertilization.According to Isaaks & Srivastava (1989), the presence or absence of proportional effects is more important than data normality.If the average data variation is proportional to the magnitude, the minimum stationary for geostatistical applications may be affected.
A unique scale variogram model adjusted to the spatial dependence of physical and chemical properties (Figure 2) leads to either the underestimation or overestimation of parameters of unitary models; however, the practical gain using this technique is of great interest for some areas, such as the precision agriculture.Thus, the scaled model adjusted for chemical and physical properties is the spherical, represented by, respectively, Based on these models, it was possible to use SANOS 0.1 for Windows (Groenigen et al., 1999) to determine the specific location of each additional point, which minimizes the kriging variance function for the studied area.Montanari et al. (2005), when studying chemical properties of an Oxisol in Jaboticabal, São Paulo State, in a 94-ha area, with a grid of 421 sampling points, obtained an optimal sampling scheme for each property, with a grid and individual variogram for each soil property.This sampling scheme was however infeasible for practical use, due to its high costs.
Sampling schemes optimized by additional observations, suggested by the SSA algorithm, are presented for all chemical and physical properties in figure 3a,c, respectively, where differences can be observed.The chemical properties have 14 additional points in the more elongated area, NW-SE direction (Figure 3a).The spatial distribution map of calcium (Figure 3b) showed that this is the part of the area with the highest spatial discontinuity.The fact that the extra samples are located in the transition areas among the mapped variable classes reinforces the importance of optimization procedures for a clear identification of the class boundaries.Therefore, due to the lower accuracy of these regions, as indicated by SANOS, the boundaries indicated among classes may differ from the real ones, either over-or underrepresenting certain regions.The six remaining points (Figure 3a,b) were placed on the borders of the central area, where spatial continuity is higher and isolines are fewer and kriging variance values highest (Groenigen, 2000).
The sampling optimization scheme for the physical properties (Figure 3c) shows a more distributed point arrangement than that of the chemical properties (Figure 3a).The number of additional points placed in the most elongated area (Figure 3c), NW-SE direction, is smaller than that suggested for the chemical properties (Figure 3a); consequently, the central part received more points.The scheme of the spatial distribution of physical properties suggests a more homogeneous spatial continuity in the area, as observed for the property coarse sand (Figure 3d), resulting in a more dispersed spatial arrangement of points, generated by the sampling optimization model.
The regions where samples were added resulted in an increased accuracy of the estimates after the optimization procedure (Figure 4).The chemical properties had relative improvement indexes (RI) of up to 2 %, concentrating the greatest gains in the NW-SE direction.In turn, the RI for the physical properties showed values ranging from 0 to 1 %.The greatest gains in precision by the chemical properties were due to the closer proximity of optimized sampling sites, promoting a contribution of more points in a smaller region.Groenigen et al. (1999) reported reductions of up to 53 % in the variance of sand content estimates using 100 % of additional optimized samples.The low values of RI in this study were due to the use of only 10 % of additional samples.We also emphasize that this technique has great potential for poorly sampled areas, which are very common in  (9) Coefficient of Kurtosis.OM (organic matter, g dm -3 ); P (phosphorus, g dm -3 ); K (potassium, mmol c dm -3 ); Ca (calcium, mmol c dm -3 ); Mg (magnesium, mmol c dm -3 ); Al (aluminum, mmol c dm -3 ); SB (sum of bases, mmol c dm -3 ); V (base saturation, mmol c dm -3 ); CEC (cation exchange capacity, mmol c dm -3 ); Clay (clay content, g kg -1 ); Silt (silt content, g kg -1 ); ST (sand total content, g kg -1 ); CS (coarse sand content, g kg -1 ); MS (medium sand content -g kg -1 ); FS (fine sand content -g kg -1 ) e VFS (very fine sand content -g kg -1 )."precision agriculture"; for these areas, a large increase in the estimate accuracy is expected through the inclusion of but a few points.
From the maps of optimum sampling schemes, some information was obtained to enable a better understanding of the spatial distribution pattern and define different management zones.The identification of specific management zones, in agricultural areas, enables input application at specific rates, resulting in greater production uniformity and reducing economic and environmental costs (Zhang et al., 2010).These maps can be very useful in the experimental design and, also, as a tool for precision agriculture programs, since the addition of sampling points implies in high costs.According to Groenigen et al. (1999), the relevant aspect of the SSA algorithm is the indication of the places, within the studied area,  of real the sampling requirement to increase the accuracy of kriging estimates, resulting in more reliable maps.
It should be emphasized that, in this study, a larger recommended point concentration in some specific areas was observed, resulting in greater variance estimate errors.As the spatial variability differs across the study area, it is important to establish a sampling plan with differentiated intensity.

CONCLUSION
Based on an initial sampling scheme and the use of a single-scale variogram as previous information, for the representation of the spatial dependence of several physical and chemical soil properties, the SSA algorithm was effective to suggest an optimal spatial sampling scheme, increasing the estimate accuracy of the spatial distribution maps obtained by ordinary kriging.

Figure 1 .
Figure 1.Studied area location and soil sampling grid representation.

Figure 4 .
Figure 4. Relative improvement (%) of the accuracy due to the use of optimized additional samples: chemical (a) and physical (b) properties.

Figure 3 .
Figure 3.The two optimal sampling schemes, showing the locations of additional points (red dots) for the chemical properties (a), physical properties (c), optimal sampling scheme applied to the kriging map of the calcium content (b) and scheme applied to the kriging map of the coarse sand content (d).