SOIL PENETRATION RESISTANCE ANALYSIS BY MULTIVARIATE AND GEOSTATISTICAL METHODS MÉTODOS GEOESTADÍSTICOS Y MULTIVARIADOS

: The penetration resistance (PR) is a soil attribute that allows identifies areas with restrictions due to compaction, which results in mechanical impedance for root growth and reduced crop yield. The aim of this study was to characterize the PR of an agricultural soil by geostatistical and multivariate analysis. Sampling was done randomly in 90 points up to 0.60 m depth. It was determined spatial distribution models of PR, and defined areas with mechanical impedance for roots growth. The PR showed a random distribution to 0.55 and 0.60 m depth. PR in other depths analyzed showed spatial dependence, with adjustments to exponential and spherical models. The cluster analysis that considered sampling points allowed establishing areas with compaction problem identified in the maps by kriging interpolation. The analysis with main components identified three soil layers, where the middle layer showed the highest values of PR. RESUMEN : La resistencia a la penetración (RP) es un atributo del suelo que permite identificar zonas con restricciones debido a la compactación, que se traduce en impedancia mecánica para el desarrollo de las raíces y en una menor productividad de los cultivos. El objetivo del presente trabajo fue caracterizar la RP de un suelo agrícola, mediante análisis geoestadístico y multivariado. El muestreo se realizó de manera aleatoria en 90 puntos, hasta una profundidad de 0,60 m. Se determinaron los modelos de distribución espacial de la RP y se delimitaron áreas con problemas de impedancia mecánica de las raíces. La RP presentó distribución aleatoria a 0,55 y 0,60 m de profundidad. La RP en las otras profundidades analizadas mostraron dependencia espacial, con ajustes a modelos exponenciales y esféricos. El análisis jerárquico que consideró puntos de muestreo, permitió establecer zonas con problemas de compactación, identificadas en los mapas obtenidos mediante interpolación por kriging. El análisis de componentes principales permitió identificar tres capas de suelo, donde la capa intermedia fue la que presentó los mayores valores de RP.


INTRODUCTION
Some soils show signs of structural degradation due to continues application of high pressure from inappropriate use of agricultural machinery for crops or animals production. Characterize soils physical properties are required in these cases, for taking management decisions and to determine the effects of agricultural practices on soil properties. The level of characterization may range from visual observations of selected properties of the soil on field or laboratory detailed evaluations (BOWEN, 1981).
The penetration resistance (PR) is an appropriate index to evaluate soil compaction problems, limiting plant root development. PR is a physical property of soil, which represents the behavior and effects of different soil properties such as bulk density, moisture content, porosity and permeability, which, result from the particles size distribution, the structure, and the mineral and organic content of soil (SOANEN et al., 1981).
Soil compaction is a detrimental factor that has been evaluated because it causes poor emergency capacity of the plants, affecting soil characteristics such as: availability of nutrients (CONTE et al., 2007), and water storage, causing inefficient use of water resources (SERAFIM et al., 2008) and reducing directly crop yield (STELLUTI et al., 1998).
The PR is an index that can be evaluated through a simple, fast and punctual determination, made directly on the field, by penetrometer or recording penetrometer that are effective and fast (BOWEN, 1981). The PR determined in this way is an indirect measure of the force exerted by the roots to penetrate the pores of the channels, or to deform the soil structure.
Usually, the analysis of this information is performed by univariate techniques that only allow characterizing on time the PR behavior and other soil properties. Therefore, it requires a considerable number of measurements at different locations and depths, to obtain an accurately estimate (LIPIEC & HATANO, 2003). For the analysis of data thus generated, it recommended the use of geostatistical tools, that allow evaluate and understand the spatial variability of soil variables (RAMIREZ-LOPEZ et al., 2008), looking to solve problems such as variability of morphological properties (SALVADOR-BLANES et al., 2006), variability in physical properties (CRUZ et al., 2010, MARTINS et al., 2010, determination of management units and analysis of crop yield (CUCUNUBA-MELO et al., 2011).
Although the geostatistics permit to identify the spatial dependence of a soil property, the multivariate analysis permit to classify an attribute, from the same data set, into groups with common characteristics or similar behaviors, allowing a "natural division" (GRUNWALD et al, 2001), useful to identify management zones in crop areas.
Given the importance of this physical property in the crop growth and development, the aim of this research is to characterize the PR spatial variability in the 'Silvania' county (Cundinamarca, Colombia), in an area planted with tomato tree ( Cyphomandra betacea) with the use of geostatistical tools and multivariate analysis.
The sampling was performed in an area of 0.5 ha, in March 2010, ending the in dry period, in which 90 points were established following a randomly alternate course with a spacing of 3.0 m between plants in alternate rows ( Figure 1). To obtain the PR values it was used an Eijkelkamp recording penetrometer, measuring at each point, reaching up to 0.6 m. depth. For the measuring was used a B penetrometer tip (ASAE S313.3, 2009), with an accuracy of 0.05 MPa. At each net site the measurements were made with constant speed, obtaining three curves, which shows the PR behavior trough the depth. Subsequently, it was generated the data set or database, taking the values at different soil depths: 1, 3,6,10,15,20,25,30,35,40,45,50,55 and 60 cm.
Initially it was conducted a statistical descriptive analysis using SPSS v.17 software (SPSS Inc., Chicago, IL), with were determined mean or average, maximum, minimum, coefficient of variation (CV), skewness and kurtosis, for each depth, including the Kolmogorov-Smirnov normality test, that is sensitive to values close to the median and the edges of the distribution. In this way, it was verified both empirically and numerically the normality fit to each variable, although not essential, provides better predictions when combined with geostatistical techniques (DIGGLE & RIBEIRO, 2000). For analysis of the CV it was taken the criteria of WARRICK & NIELSEN (1980), with low variability for CV under 12%, mean variability between 12 to 60%, and high variability above 60%. Then spatial dependence of the attributes was established using the theory of regionalized variables (WEBSTER & OLIVER, 2007), in which the data set were fitted by semivarigram theoretical models  (h), defined by: where z (x) is the variable value at a site x, z (xi + h) another sample value separated at the h distance, and N the number of separated pairs by an h distance. The semivariograms is the arithmetic mean of all squared differences between experimental pairs of values separated by an h distance, or the variance increases of the regionalized variable at locations separated by an h distance.
There are several theoretical models that can fit the experimental semivariogram. In general, these models can be divided into unbounded (linear, logarithmic, potential) and bounded (spherical, exponential, Gaussian). Bounded models are widely applied in the study of the spatial variability of soil attributes. Equations 2, 3 and 4 present the expressions that define the spherical, exponential and Gaussian semivariogram; respectively. These models have three common parameters, which are the nugget effect (C0), the plateau (C0 + C) and the range or scope (a). The nugget effect indicates the discontinuity between samples; it is the spatial variability not detected during the sampling process. The plateau is the value of the semivariance where the model is stable and has a constant value. The range represents the distance where the correlation is constant; indicating that from this distance there is not correlation between samples. For the points above the range is possible to say that the correlation between pairs of points is zero and is called statistically independent values (Moncayo et al., 2006).
The spherical model has a high growth rate, although for large distances, marginal increments decrease until it becomes zero. Exponential and Gaussian models are applied when the spatial dependence shows a steady increase in proportion to the distance between observations, where the spatial dependence vanishes for a distance that tends to infinity.
For the geostatistical analysis was used GS + v. 9 software (Gamma Design Software, LLC, Plainwell, MI), determining the theoretical model of semivariogram, whose selection is based on the lower value of the sum of residuals squared, the R 2 of the fitting equation and the similar values obtained between the actual value and the estimated value, that are obtained in the cross validation (FARAC et al., 2008, JOHANN et al., 2010. It also established the degree of spatial dependence (DSD) as the ratio between the nugget effect and the plateau (100 * C/C 0 + C), which is classified as strong if it exceeds 75%, moderate with values between 25% and 75%, and weak when it less than 25% (CAMBARDELLA et al., 1994). When the DSD is close to zero, the fitted model to the experimental semivariogram is called nugget pure effect (NPE) (WEBSTER & OLIVER, 2007) and is defined by  (h) = C 0 , for h> 0, denoting a random spatial distribution of the variable.
Interpolation was subsequently performed by the classic method of Kriging, which is considered the best linear neutral estimator and with minimum variance, presenting results by digital elevation maps. This procedure was performed with the Sufer v. 9 software (Golden Software Inc., Golden, CO).
To improve the understanding of spatial behavior determined through the geostatistical methods it was applied multivariate statistics using SPSS v.17 software (SPSS Inc., Chicago, IL), for the purpose to identify the behavior by the depth and by the PR point , by hierarchical cluster analysis (HCA) and principal component analysis (PCA). In this study it was considered that PR values were correlated both horizontally and vertically (multicollinearity), for which the univariate analysis provides redundant information. This can be eliminated by the use of PCA, because this method reduces the dimensionality and shows the different basic components, which are called principal components. Each principal component (PC) is described in terms of new components, defined as a linear combination from the original variables. The first component represents the maximum value of the total variance and is associated with the largest auto data. The second component is the second linear combination, uncorrelated with the first, which represents the maximum residual variance, and so on, until totally account the variance. It is desirable that a small number of components explain the higher percentage of the total variance, it means, the data set can be described in a lower dimensional space (RAMOS et al., 2007).
On the other hand, the HCA allows to group variables that have similarities. The difference between groups is an estimated distance, which separates a set of variables or attributes in groups. The results are obtained using different algorithms, which calculate the distance through the sum of the squares between two groups, rising over all variables. In each estimate, the sum of squares between groups is minimized over the partitions, through the combination of two groups, estimated in a previous step (RAMOS et al., 2007).
In the HCA was complete group formation by depth and point, to establish the presence of layers and limiting compaction areas, using the Euclidean distance to separate the groups to be identified in the respective Dendrograms. The results were obtained using Ward's algorithm, which calculates the distance through the sum of the squares between two groups. PCA was then carried out for the different depths considered, using Varimax's rotation.

RESULTS AND DISCUSSION
The results obtained from descriptive statistics analysis to demonstrate that it has a symmetric distribution, with similar average and median values for each depth analyzed with a rising trend, associated with this kurtosis values (Table 1). The data set tend to a normal distribution, which is confirmed by the results of the Kolmogorov -Simonov's test.
In the PR averages is observed that for the soil surface layers is given a lower rank up to six centimeter deep, caused by the recent agricultural management performed in these layers, plus the effect of slope that does not allow the soil stabilization by the effect of dragging of sediments in the area under study. Furthermore, these trends are similar in other studies (VERONESE JUNIOR et al., 2006;RAMIREZ-LOPEZ et al, 2008). According to WARRICK & NIELSEN (1980) criteria, CV values show variability average for all depths, getting greater dispersion the first superficial layer, which may be due to soil and climatic conditions to the area and the agricultural management of land under study.
The PR average values higher than 2 MPa for depths evaluated, except the first one, can argue that is a soil that has a potential restriction on the crop roots growth by compaction levels (CARR et al., 2007;RAMIREZ-LOPEZ et al., 2008); however, there are areas with lower values close to 0 MPa and higher values close to 4 MPa.
The spatial dependence evaluated with semivariograms indicates that most of the PR of different depths fit to exponential models, followed by the spherical model ( Table 2). The depths at 0.55 and 0.60 m did not show spatial dependence; it indicates that the spatial distribution was random, showing a pure nugget effect (PNE). C 0 values for the variables that showed spatial dependence close to zero, indicating the existence of spatial correlation between neighboring points. On the other hand, the nugget was lower than 50% of the plateau, showing that the correlation model adequately describes the PR distribution. The determination coefficient (R 2 ) was always greater than 0.53 for properties that showed spatial dependence, being the depth of 0.15 m the best fit. These values of R 2 , with values close to one (1) of the cross-validation coefficient (CVC) for all properties, indicate appropriate confidence of the results. Studies carried out by different authors obtained theoretical adjustments semivariograms models with higher tendency to exponential models (RAMIREZ-LOPEZ et al., 2008;VERONESE JUNIOR et al., 2006), suggesting that PR is commonly adjusted to this model. The lower ranges were obtained for the first depths, which in turn present the lowest values of C 0 and higher values of DSD, which shows that for future sampling this distance can be lower, indicating that the sampling distance is considered appropriate for the majority of measurements and the predictions for not sampled points by the Kriging method are reliable (VALVUENA CALDERON et al., 2008).
The use of hierarchical cluster analysis (HCA) allows identified management zones, based on different values of PR, both in space and depth. While it is possible to define several areas, in addition to the Euclidean distance is also considered the PR value which occur limiting crop root development for different depths. This value is very controversial and may vary between 1.5 and 3.0 MPa, depending on soil texture and crop. Whereas the crop of the study area is a medium-sized tree, for the present study a value of 2.0 MPa is assumed, a value that is commonly accepted by researchers (HAMZA & ANDERSON, 2005). Based on these criteria, it was defined two management zones, considering a breakpoint Euclidean distance of 30 (Figure 2A).
The first group corresponds to sites with PR low values, while group II show high values, which may be a limitation to root development and consequently for crop productivity, presumably those that do not represent mechanical constraints for plants. The maps obtained at different depths ( Figures 2B, 2C, 2D and 2E), have in common that most limiting areas are located in the center and in the upper right and lower left of the study area, associated with the groups obtained in the HCA The HCA for depth also presented two groups ( Figure 3A), differing from the principal component analysis, which showed three groups ( Figure 3C) of which provides a clear performance of the PR in three soil layers: a surface between 0 and 0.10 m deep, which recorded the lowest values of PR, an average between 0.10 and 0.35 m deep, which presents the highest values of PR, and finally a third layer to 0.60 m depth, where the PR has lower values than the second layer, but above the topsoil. These three groups represent the typical curve of PR in agricultural soils ( Figure  3B) and confirm different behavior for each depth considered (WERICH NETO et al., 2006), which also allows subsequent decisions related to soil preparation and partly depends on the crop root system to be establish. To facilitate the analysis of the PR using multivariate techniques, it was considered as independent of PR variables measuring at each depth. In addition, it was analyzed the principal components with auto data greater than one (1), which for this study was considered an appropriate interval, since the first three principal components explain more than 80% of the total variance (Table 3), and as indicated by KAISER & RICE (1974), it is desirable that the main components analyzed to explain at least 75% of the total variance.
Communality values close to one (1), for the PR at different depths, showing the representative of the components analyzed for this study. Each principal component is directly related to a soil layer, where the first component (PC1), accounting for 53.39% of the total variance, corresponds to the depth of the soil where they presented the highest values of PR located between 0,40 and 0.60 m, an area that is less influenced by the natural processes of soil weathering and where the tillage and machinery traffic probably had little effect, since in this area are not used heavy equipment, obeying this behavior mainly to natural features and drainage conditions, as also reported by WERICH NETO et al. (2006).  In sequence, the second component is related to the surface layer, which presents the major changes, both agronomic management and soil formation processes. This layer shows the lowest values of PR and higher reconsolidation processes occurring in agricultural soils after being subjected to external forces, such as tillage. The third component (PC3) accounts for 11.98% of the total variance and it is represented mainly by the soil layer between 0.10 and 0.30 m, which is the transition zone between A and B horizons of soils of the study area, also showing the side effect of soil preparation labor.
Moreover, most of the resulting coefficients for each component, allow to verify a positive correlation between the different PR between different depths, showing relative continuity between adjacent soil layers, behavior also reported by STELLUTI et al. (1998), because when pressure is exerted on the soil surface, it is transmitted to subsurface layers, dissipating its effect at 0.50 m depth, primarily depending on the texture and soil moisture content (SOANA et al., 1981).

CONCLUSIONS
The study of soil penetration resistance by and multivariate analysis characterized this property both vertically and horizontally, which it identified correlation of penetration resistance independently to each soil depth, and thus allows identifying areas with limitations for crop development.
The multivariate analysis helps to establish management zones, by grouping areas that have similarities both vertically and horizontally, which makes this tool an appropriate complement to classify and divide an attribute, according to the characteristics of the study area.
Another benefit of multivariate analysis in the penetration resistance study is the principal component evaluation by depth, because this allows identifying and characterizing different soil layers, indicating accurately the depth for different agriculture practices, such as tillage.