Slash Spatial Linear Modeling: Soybean Yield Variability as a Function of Soil Chemical Properties

Fagundes, Regiane Slongo; Uribe-Opazo, Miguel Angel; Guedes, Luciana Pagliosa Carvalho; Galea, Manuel

doi:10.1590/18069657rbcs20170030

ABSTRACT:

In geostatistical modeling of soil chemical properties, one or more influential observations in a dataset may impair the construction of interpolation maps and their accuracy. An alternative to avoid the problem would be to use most robust models, based on distributions that have heavier tails. Therefore, this study proposes a spatial linear model based on the slash distribution (SSLM) in order to characterize the spatial variability of soybean yields as a function of soil chemical properties. The likelihood ratio statistic (LR) was applied to verify the significance of parameters associated with the model. We evaluated the sensitivity of the maximum likelihood estimators by means of local influence analysis for both the soybean response and the linear predictor. In the proposed model, we analyzed data gathered from a commercial grain production area (127.18 ha) located in the western part of the state of Paraná (Brazil). The results showed that the slash distribution allowed us to adjust the high kurtosis of the data set distribution and the LR test confirmed that the soil chemical properties of phosphorus, potassium, pH, and organic matter were significant for the SSLM. Diagnostic analysis indicated that the atypical value of the sample set was not influential in the parameter estimation process. Construction of the interpolation map based on the proposed model is not affected when considering the atypical and/or influential observations. Thus, SSLM becomes a robust alternative in the study of soybean yield variability as a function of soil chemical properties, making it possible to investigate the productive potential of the areas.

Keywords:
spatial variability; slash distribution; maximum likelihood; yield estimators

INTRODUCTION

Knowledge of spatial variability of soybean yield and its relationship to soil chemical properties are essential for proper crop management (Sobjak et al., 2016Sobjak R, Souza EG, Bazzi CL, Uribe-Opazo MA, Betzek NM. Redundant variables and the quality of management zones. Eng Agric. 2016;36:78-93. https://doi.org/10.1590/1809-4430-Eng.Agric.v36n1p78-93/2016
https://doi.org/10.1590/1809-4430-Eng.Ag... ). However, many precision agriculture users get disappointed trying to find the ideal variable-rate application of the nutrient based on the prescription map, because does not always correspond to the soybean yield map generated after the intervention. One of the main explanations is related to complexity of the soil, which is considered a dynamic system whose functionality arises from interactions between physical, chemical, and biological components that are specific for each crop (Pereira et al., 2016Pereira FCBL, Mello LMM, Pariz CM, Mendonça VZ, Yano EH, Miranda EEV, Crusciol CAC. Autumn Maize Intercropped with Tropical Forages: Crop Residues, Nutrient Cycling, Subsequent Soybean and Soil Quality. Rev Bras Cienc Solo. 2016;40:e0150003. https://doi.org/10.1590/18069657rbcs20150003
https://doi.org/10.1590/18069657rbcs2015... ). Thus, a localized management system for soybean requires accurate information on spatial variation and the interaction between soil chemical properties and their relationship to yield (Pagani and Mallarino, 2015Pagani A, Mallarino AP On-farm evaluation of corn and soybean grain yield and soil pH responses to liming. Agron J. 2015;107:71-82. https://doi.org/10.2134/agronj14.0314
https://doi.org/10.2134/agronj14.0314... ; Al-Kaisi et al., 2016Al-Kaisi MM, Archontoulis S, Kwaw-Mensah D. Soybean spatiotemporal yield and economic variability as affected by tillage and crop rotation. J Am Soc Agron. 2016;108:1267-80. https://doi.org/10.2134/agronj2015.0363
https://doi.org/10.2134/agronj2015.0363... ; Dalla Nora et al., 2017Dalla Nora D, Amado TJC, Nicoloso RS, Gruhn EM. Modern high-yielding maize, wheat and soybean cultivars in response to gypsum and lime application on no-till Oxisol. Rev Bras Cienc Solo. 2017;41:e0160504. https://doi.org/10.1590/18069657rbcs20160504
https://doi.org/10.1590/18069657rbcs2016... ).

One of the methods used in such characterization is geostatistics, which takes soil spatial variation patterns into account and provides techniques that enable construction of maps associated with one or more soil chemical properties (Cressie, 2015Cressie NAC. Statistics for spatial data. 2nd ed. NewYork: Jonh Wiley & Sons;2015.). One of the different methods used in geostatistical studies is spatial linear models, which have been widely evaluated by assuming a Gaussian stochastic process (Uribe-Opazo et al., 2012Uribe-Opazo MA, Borssoi JA, Galea M. Influence diagnostics in Gaussian spatial linear models. J Appl Stat. 2012;39:615-30. https://doi.org/10.1080/02664763.2011.607802
https://doi.org/10.1080/02664763.2011.60... ; Grzegozewski et al., 2013Grzegozewski DM, Uribe-Opazo MA, De Bastiani F, Galea M. Local influence when fitting Gaussian spatial linear models: an agriculture application. Cien Inv Agr. 2013;40:523-35. https://doi.org/10.4067/S0718-16202013000300006
https://doi.org/10.4067/S0718-1620201300... ; Nesi et al., 2013Nesi CN, Ribeiro A, Bonat WH, Ribeiro Junior PJ. Verossimilhança na seleção de modelos para predição espacial. Rev Bras Cienc Solo. 2013;37:352-8. https://doi.org/10.1590/S0100-06832013000200006
https://doi.org/10.1590/S0100-0683201300... ; De Bastiani et al., 2015De Bastiani F, Cysneiros AHMA, Uribe-Opazo MA, Galea M. Influence diagnostics in elliptical spatial linear models. TEST. 2015;24:322-40. https://doi.org/10.1007/s11749-014-0409-z
https://doi.org/10.1007/s11749-014-0409-... ). This modeling enables estimation of spatial dependence parameters through the maximum likelihood method (ML), making inferential studies possible.

However, Assumpção et al. (2014)Assumpção RAB, Uribe-Opazo MA, Galea M. Analysis of local influence in geostatistics using Student's t-distribution. J Appl Stat. 2014;41:2323-41. http://doi.org/10.1080/02664763.2014.909793
http://doi.org/10.1080/02664763.2014.909... and Schemmer et al. (2017)Schemmer R, Uribe-Opazo MA, Galea M, Assumpção RAB. Spatial variability of soybean yield through a reparameterized t-Student model. Eng Agric. 2017;37:360-70. http://dx.doi.org/10.1590/1809-4430-eng.agric.v37n4p760-770/2017
http://dx.doi.org/10.1590/1809-4430-eng.... reported on the sensitivity of a Gaussian distribution for atypical values and errors with heavier tails than normal and stated that such modeling may generate unrealistic maps. Thus, one of the alternatives to avoid this problem would be to use more robust models based on heavier tails. Following this line of reasoning, multivariate slash distribution is quite attractive because it has an additional parameter, here designated as η (0 < η < 1) which allows kurtosis adjustment, making the modeling more flexible in the presence of atypical values (Osorio et al., 2009Osorio F, Paula GA, Galea M. On estimation and influence diagnostics for the Grubbs’ model under heavy-tailed distributions. Comput Stat Data An. 2009;53:1249-63. https://doi.org/10.1016/j.csda.2008.10.034
https://doi.org/10.1016/j.csda.2008.10.0... ; Alcantara and Cysneiros, 2013Alcantara IC, Cysneiros FJA. Linear regression models with slash-elliptical errors. Comput Stat Data An. 2013;64:153-64. https://doi.org/10.1016/j.csda.2013.02.029
https://doi.org/10.1016/j.csda.2013.02.0... ). This distribution was first introduced by Lange and Sinsheimer (1993)Lange K, Sinsheimer JS. Normal/independent distributions and their applications in robust regression. J Comput Graph Stat. 1993;2:175-98. https://doi.org/10.2307/1390698
https://doi.org/10.2307/1390698... , belonging to the class of scale mixtures of normal distribution. The basic idea behind this distribution class is to insert randomness in the covariance matrix, as well as in the mean vector of the multivariate normal distribution, through a strictly positive mixture variable. This allows generalization of a multivariate normal distribution, preserving the main properties.

Although it creates robust models, spatial modeling based on the slash distribution may be affected by influential observations. Therefore, it is important to evaluate its sensitivity through influence diagnostics, which may be carried out by different evaluation methods, one of them being local influence assessment. The aim of this assessment is to evaluate the goodness of fit of the model, and the robustness of its estimates when small perturbations are introduced in the model and/or in the dataset (Cook, 1986Cook RD. Assessment of local influence. J R Statist Soc B. 1986;48:133-69.; Zhu and Lee, 2001Zhu H-T, Lee S-Y Local influence for incomplete-data models. J R Statist Soc B. 2001;63:111-26. https://doi.org/10.1111/1467-9868.00279
https://doi.org/10.1111/1467-9868.00279... ; Jonathan et al., 2016Jonathan R, Uribe-Opazo MA, De Bastiani F, Johann JA. Técnicas para detecção de pontos influentes em variáveis contínuas regionalizadas. Eng Agric. 2016;36:152-65. http://dx.doi.org/10.1590/1809-4430-Eng.Agric.v36n1p152-165/2016
http://dx.doi.org/10.1590/1809-4430-Eng.... ).

Regarding georeferenced data, Uribe-Opazo et al. (2012)Uribe-Opazo MA, Borssoi JA, Galea M. Influence diagnostics in Gaussian spatial linear models. J Appl Stat. 2012;39:615-30. https://doi.org/10.1080/02664763.2011.607802
https://doi.org/10.1080/02664763.2011.60... evaluated the sensitivity of covariance function estimators and linear predictors under small dataset perturbations and/or a spatial linear model with a normal distribution. Assumpção et al. (2011)Assumpção RAB, Opazo MAU, Galea M. Local influence for spatial analysis of soil physical properties and soybean yield using Student's t-distribution. Rev Bras Cienc Solo. 2011;35:1917-26. https://doi.org/10.1590/S0100-06832011000600008
https://doi.org/10.1590/S0100-0683201100... presented techniques for local influence for spatial analysis of soil physical properties and soybean yield using Student's t-distribution. Grzegozewski et al. (2013)Grzegozewski DM, Uribe-Opazo MA, De Bastiani F, Galea M. Local influence when fitting Gaussian spatial linear models: an agriculture application. Cien Inv Agr. 2013;40:523-35. https://doi.org/10.4067/S0718-16202013000300006
https://doi.org/10.4067/S0718-1620201300... considered a Gaussian spatial linear model (GSLM) and information about soil macro- and micronutrients in evaluating the effects of influential observations on spatial model selection, parameter estimation by maximum likelihood, and characterization of spatial continuity of soybean yield. In these studies, the authors emphasized that influential values can modify the interpolated maps and may generate inaccurate predictions.

The aim of this study was to propose a spatial linear model based on the slash distribution (SSLM) to characterize the spatial variability of soybean yields as a function of soil chemical properties.

MATERIALS AND METHODS

Experimental site

We gathered data from a field experiment set up in a commercial grain production area in which the no-tillage system has been practiced since 1994. The area is located in the municipality of Cascavel, in the west of the state of Paraná, Brazil. It lies at the geographical coordinates of approximately 24.95° S, 53.57° W, with an average altitude of 650 m. Soil at the location is classified as a Latossolo Vermelho Distróférrico (Santos et al., 2013Santos HG, Jacomine PKT, Anjos LHC, Oliveira VA, Oliveira JB, Coelho MR, Lumbreras JF, Cunha TJF. Sistema brasileiro de classificação de solos. 3. ed. Rio de Janeiro: Embrapa Solos; 2013.), which corresponds to Rhodic Hapludox (Soil Survey Staff, 2014Soil Survey Staff. Keys to soil taxonomy. 12th ed. Washington, DC: United States Department of Agriculture, Natural Resources Conservation Service; 2014.), with an average slope of around 4 %. The temperate climate is mesothermal and highly humid, classified as Cfa (Köppen system); and mean annual temperature is 21 °C. Samples were taken during the 2014/2015 crop season in a 127.18 ha area. For this study, we used a lattice plus close pairs design (Diggle and Ribeiro Junior, 2007Diggle PJ, Ribeiro Junior PJ. Model-based geostatistics. New York: Springer; 2007.), with a distance of 141 m between points belonging to the regular grid and, in some random locations, shorter distances of 75 and 50 m between points, thereby obtaining 78 locations. All the collected data was georeferenced by a GPS (Global Positioning System) receiver, Geoexplorer^® 3 (Trimble^®), under a UTM coordinate system, zone 22 south and datum WGS 84 (Figure 1).

Figure 1
Location map of the study site.

The response variable considered in the model was soybean yield, which was estimated by the quantity of grains harvested in an area of 0.90 m² centered on each georeferenced point. After harvest, the grain from each plot was weighed and moisture level was corrected to 13 %; this weight was then converted into t ha⁻¹. As explanatory variables (covariates), we selected P (mg dm^-3), K (mg dm^-3), pH, and organic matter (OM) (g dm^-3). Five samples were collected at a depth of 0.00-0.20 m to determine the chemical contents.

These samples were taken near each georeferenced point and were then mixed in order to have a representative composed sample for each plot. Analysis was carried out in the laboratory of the Cooperativa Central de Pesquisa Agrícola (Coodetec, Brazil), whose methodology is available in Costa and Oliveira (2001)Costa JM, Oliveira EF. Fertilidade do solo e nutrição de plantas. 2. ed rev. Campo Mourão: Coamo - Cascavel: Coodetec; 2001..

These covariates were chosen based on the following considerations: (i) high kurtosis associated with indexes outside the range of −2 to +2 (Casella and Berger, 2010Casella G, Berger RL. Inferência estatística - tradução da 2a edição norte-americana. São Paulo: Cengage Learning; 2010.); (ii) structure of spatial dependence of soil chemical properties, identified from the construction of omnidirectional experimental semivariograms using the Matheron estimator (Soares, 2014Soares, A. Geoestatística para as ciências da terra e do ambiente. 3. ed. Portugal: IST Press; 2014.; Cressie, 2015Cressie NAC. Statistics for spatial data. 2nd ed. NewYork: Jonh Wiley & Sons;2015.); (iii) P increases crop yield at early reproductive stages and is applied in large amounts since Brazilian soils are poor in this element; (iv) K plays an important role in plant photosynthesis and respiration, assists in formation of starch and sugars, and produces vigorous and resistant plants; (v) Brazilian soils are acidic (low pH), which limits crop yield; and (vi) OM provides nutrients for sustainable crop production and increases water infiltration into the soil, reducing erosion (Santos et al., 2013Santos HG, Jacomine PKT, Anjos LHC, Oliveira VA, Oliveira JB, Coelho MR, Lumbreras JF, Cunha TJF. Sistema brasileiro de classificação de solos. 3. ed. Rio de Janeiro: Embrapa Solos; 2013.).

Spatial modeling and parameter estimation

In order to build the data model from the spatially correlated variables, we considered a stochastic process {Y(s_i), s_i) ∈ S} defined in a region S ⊂ ℛ², where each element Y (s_i) related to soybean yield has known locations s_i, i = 1, …, n. We also consider the process as stationary, wherein Y = (Y (s_i), …, Y(s_n))^T follows a multivariate slash distribution (Lange and Sinsheimer, 1993Lange K, Sinsheimer JS. Normal/independent distributions and their applications in robust regression. J Comput Graph Stat. 1993;2:175-98. https://doi.org/10.2307/1390698
https://doi.org/10.2307/1390698... ) in which the second moment is finite, being that Y~SL_n (Χβ, Σ, η), where E(Y) = Χβ and Coν(Y) = Σ. In addition, X is a matrix formed by the vector 1's and the covariates P, K, pH, and OM; β = (β_o, …, β_q)^τ, the vector of unknown parameters associated with each covariate; and η, the parameter of kurtosis adjustment. Under these conditions, the SSLM was expressed by equation 1:

Eq. 1

Y (s_{i}) = β_{0} + β_{1} P (s_{i}) + β_{2} K (s_{i}) + β_{3} p H (s_{i}) + β_{4} O M (s_{i}) + ε (s_{i})

in which ε(s_i) is the random [ε~SL_n (O, Σ, η)].

Spatial dependence was determined by the covariance matrix Σ, whose parametric form is given by Σ = ϕ₁ l_n + ϕ₂ R(ϕ₃), where l_n is the identify matrix; ϕ₁ ≥0 is the nugget effect; ϕ₂ ≥0 is the partial sill; ϕ₃ is the parameter related to the spatial dependence radius (range); and R(ϕ₃) = [(r_ij)] is the correlation matrix for the soybean yield dataset observed between the points located at s_i and at s_j (Uribe-Opazo et al., 2012Uribe-Opazo MA, Borssoi JA, Galea M. Influence diagnostics in Gaussian spatial linear models. J Appl Stat. 2012;39:615-30. https://doi.org/10.1080/02664763.2011.607802
https://doi.org/10.1080/02664763.2011.60... ).

We considered the Matérn family covariance function (Matérn, 1986Matérn B. Spatial variation - Lecture notes in statistics. 2nd ed. New York: Springer-Verlag; 1986.) to explain spatial dependence with the smoothness parameter κ ∈ {0.5, 1.5, 2.5}, whose relationship to the practical range is given by 3 ϕ₃, 4.75 ϕ₃, and 5.92 ϕ₃, respectively (Diggle and Ribeiro Junior, 2007Diggle PJ, Ribeiro Junior PJ. Model-based geostatistics. New York: Springer; 2007.).

We estimate the unknown vector of parameters of the model θ = (β^T, ϕ^T)^T, with ϕ = (ϕ₁, ϕ₂, ϕ₃)^T, by maximizing the log-likelihood function (ML), whose form is expressed by the equation 2:

Eq. 2

l (θ) = - \log (η) + \frac{n}{2} \log (\frac{c (η)}{2 π}) - \frac{1}{2} \log | Σ | + \log (\int_{0}^{1} v^{a - 1} \exp {- v b} d v)

in which a = (n/2 + η⁻¹) and b = c(η)δ/2, with δ = (Y - Χβ)^T Σ⁻¹ (Y – Χβ) and c(η) = (1 – η)⁻¹, for 0 < η < 1, where η stands for the parameter of kurtosis adjustment. Here, the ML method defines the estimator $\hat{θ}$ of θ as being the vector that maximizes Ι(θ) in the parametric space Θ.

The ML estimator was calculated using an expectation-maximization (EM) algorithm, which consists of a computational method that generates approaches iteratively by maximizing expectation of the logarithm of the likelihood function for the completed data set, called the Q-function, which is expressed by the equation 3:

Eq. 3

Q (θ | \hat{θ}) = - \log (η) + \frac{n}{2} \log (\frac{c (η)}{2 π}) - \frac{1}{2} \log | Σ | - \frac{1}{2} w δ + (\frac{n}{2} + \frac{1}{η} - 1) d,

in which $w = c (η) E {V | Y, \hat{θ}}$ and $d = E {\log (V | Y, \hat{θ}}$ . Here, V ~ Beta(η⁻¹, 1) and (Y|V = ν) ~ Ν_η(Χβ, c(η)ν⁻¹ Σ). For further details, see Lange and Sinsheimer (1993)Lange K, Sinsheimer JS. Normal/independent distributions and their applications in robust regression. J Comput Graph Stat. 1993;2:175-98. https://doi.org/10.2307/1390698
https://doi.org/10.2307/1390698... and Osorio et al. (2009)Osorio F, Paula GA, Galea M. On estimation and influence diagnostics for the Grubbs’ model under heavy-tailed distributions. Comput Stat Data An. 2009;53:1249-63. https://doi.org/10.1016/j.csda.2008.10.034
https://doi.org/10.1016/j.csda.2008.10.0... .

To avoid identifiability problems in ML estimations, the parameters of kurtosis adjustment (η) and smoothing (κ) of the Matérn family model were determined through cross-validation (CrV) (De Bastiani et al., 2015De Bastiani F, Cysneiros AHMA, Uribe-Opazo MA, Galea M. Influence diagnostics in elliptical spatial linear models. TEST. 2015;24:322-40. https://doi.org/10.1007/s11749-014-0409-z
https://doi.org/10.1007/s11749-014-0409-... ) and Trace criterion (Tr) (Kano et al., 1993Kano Y, Berkane M, Bentler PM. Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations. J Am Stat Assoc. 1993;88:135-43. https://doi.org/10.2307/2290706
https://doi.org/10.2307/2290706... ).

Significance (hypothesis) testing on the estimated parameters β_s was carried out by the likelihood ratio statistic (LR), given by the equation 4:

Eq. 4

L R = 2 [l (\hat{θ}) - l (\tilde{θ})],

where $l (\hat{θ})$ is the log-likelihood value of the model with all covariates and $l (\tilde{θ})$ is the log-likelihood value of the model without a certain covariable. At a significance level α, if $| L R | \geq χ^{2}_{(k_{1}, α/ 2)}$ , we should reject the null hypothesis, meaning that the covariate tested was significant in the composition of the model (Dagenais and Dufor, 1991Dagenais MG, Dufour JM. Invariance, nonlinear models, and asymptotic tests. Econometrica. 1991;59:1601-15. https://doi.org/10.2307/2938281
https://doi.org/10.2307/2938281... ).

Local influence analysis

After defining the parameters, an interpolation map was built by regression-kriging in order to visualize soybean yield variability as a function of the covariates (P, K, pH, and OM) (Soares, 2014Soares, A. Geoestatística para as ciências da terra e do ambiente. 3. ed. Portugal: IST Press; 2014.). To investigate the presence of observations that may have interfered in the process of interpolation, a local influence analysis was performed.

As proposed by Cook (1986)Cook RD. Assessment of local influence. J R Statist Soc B. 1986;48:133-69., we examined possible influential observations on the soybean yield response variable by considering the likelihood displacement (Equation 2) after we insert a perturbation vector ω (noise). We used the expression $L D (ω) = 2 [l (\hat{θ}) - l ({\tilde{θ}}_{ω})]$ , where $l ({\tilde{θ}}_{ω})$ is the log-likelihood value of the model perturbed by ω. High values of LD(ω) means that $l (\hat{θ})$ and $l ({\tilde{θ}}_{ω})$ differ considerably, with the the i-th influential observation being considered if $| h_{[L] m a x i} | > \bar{D} + 2 s d (D)$ , where $\bar{D}$ denotes the mean of the elements of the vector |h_[L]max|, and sd(D) is the standard deviation. Here, |h_[L]max| corresponds to a normalized unit eigenvector associated with the largest eigenvalue of the matrix $B_{L} = Δ_{L ω}^{T} [L {(θ)}^{- 1}] Δ_{L ω},$ , in which L(θ) is a Hessian matrix evaluated at $θ = \hat{θ}$ , and $Δ_{L ω} = \partial /^{2} (θ, ω) / \partial θ \partial ω$ is a perturbation matrix (q + 4) × n evaluated at $θ = {\hat{θ}}_{ω}$ .

In a similar manner, influence analysis was performed on the Q-function (Equation 3), which was considered the reference measure |h_[Q]max|. For more details, see Zhu and Lee (2001)Zhu H-T, Lee S-Y Local influence for incomplete-data models. J R Statist Soc B. 2001;63:111-26. https://doi.org/10.1111/1467-9868.00279
https://doi.org/10.1111/1467-9868.00279... and Assumpção et al. (2011Assumpção RAB, Opazo MAU, Galea M. Local influence for spatial analysis of soil physical properties and soybean yield using Student's t-distribution. Rev Bras Cienc Solo. 2011;35:1917-26. https://doi.org/10.1590/S0100-06832011000600008
https://doi.org/10.1590/S0100-0683201100... , 2014Assumpção RAB, Uribe-Opazo MA, Galea M. Analysis of local influence in geostatistics using Student's t-distribution. J Appl Stat. 2014;41:2323-41. http://doi.org/10.1080/02664763.2014.909793
http://doi.org/10.1080/02664763.2014.909... ).

To evaluate the influence on the linear predictor, we considered the methodology presented by Assumpção et al. (2011)Assumpção RAB, Opazo MAU, Galea M. Local influence for spatial analysis of soil physical properties and soybean yield using Student's t-distribution. Rev Bras Cienc Solo. 2011;35:1917-26. https://doi.org/10.1590/S0100-06832011000600008
https://doi.org/10.1590/S0100-0683201100... applied to the SSLM (Equation 1). In this case, maximum influence directions were denoted by L_p and Q_p. Analysis was carried out using R software 3.3.2 for Windows (R Development Core Team, 2016R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016 [acessed on 2 Nov. 2016]. Available at: http://www.R-project.org.
http://www.R-project.org... ).

RESULTS AND DISCUSSIONS

Table 1 shows the exploratory data analysis for the response variable and covariates, i.e., the soil chemical properties (P, K, pH, and OM). Average soybean yield was 2.37 t ha⁻¹, which is lower than the national average (3.03 t ha⁻¹) for the 2014/2015 season (IBGE, 2016Instituito Brasileiro de Geografia e Estatística -IBGE. Banco de dados agregados- Sistema IBGE de recuperação automática-SIDRA; 2016 [acesso em 10 nov 2016]. Disponível em: http://www.sidra.ibge.gov.br
http://www.sidra.ibge.gov.br... ). Standard deviation and coefficient of variation were 0.87 t ha⁻¹ and 11.48 %, respectively, exhibiting low data dispersion around the mean and homogeneity (Carvalho et al., 2003Carvalho CGP, Arias CAA, Toledo JFF, Almeida LA, Kiihl RAS, Oliveira MF, Hiromoto DM, Takeda C. Proposta de classificação dos coeficientes de variação em relação à produtividade e altura da planta de soja. Pesq Agropec Bras. 2003;38:187-93. https://doi.org/10.1590/S0100-204X2003000200004
https://doi.org/10.1590/S0100-204X200300... ).

Thumbnail

Table 1
Descriptive statistics of soybean yield (Y) and of the soil chemical properties

Quartile analysis indicated an atypical value of 3.18 t ha⁻¹, corresponding to sample 33. Standard analysis of the directional semivariograms (omitted here) towards 0°, 45°, 90°, and 135° showed similar behavior for all directions, indicating that the spatial dependence structure is isotropic.

Descriptive statistics of soil chemical properties (Table 1) indicate very high levels of P and K for soybean (Costa and Oliveira, 2001Costa JM, Oliveira EF. Fertilidade do solo e nutrição de plantas. 2. ed rev. Campo Mourão: Coamo - Cascavel: Coodetec; 2001.). Moreover, the standard deviation and coefficient of variation for both P and K suggested average dispersion around the mean and heterogeneity (Warrick and Nielsen, 1980Warrick AW, Nielsen DR. Spatial variability of soil physical properties in the field. In: Hillel D, editor. Applications of soil physics. New York: Academic Press; 1980. p. 319-44.). High variations in P and K contents may be due to continuing fertilization at a fixed rate along the plant row, affecting micro and macro spatial variability (Amado et al., 2009Amado TJC, Pes LZ, Lemainski CL, Schenato RB. Atributos químicos e físicos de Latossolos e sua relação com os rendimentos de milho e feijão irrigados. Rev Bras Cienc Solo. 2009;33:831-43. https://doi.org/10.1590/S0100-06832009000400008
https://doi.org/10.1590/S0100-0683200900... ). Another factor may be the mobility of these chemical elements in the soil. For example, as a monovalent cation, K is easily leached absorbed, fixed, adsorbed, or stabilized in the soil solution, generating large variability. Therefore, these variations may reflect the effects of the complex interaction between soil chemical properties and crop management practices (Bottega et al., 2013Bottega EL, Queiroz DM, Pinto FAC, Souza CMA. Variabilidade espacial de atributos do solo em sistema de semeadura direta com rotação de culturas no cerrado brasileiro. Rev Cienc Agron. 2013;44:1-9. https://doi.org/10.1590/S1806-66902013000100001
https://doi.org/10.1590/S1806-6690201300... ; Pereira et al., 2016Pereira FCBL, Mello LMM, Pariz CM, Mendonça VZ, Yano EH, Miranda EEV, Crusciol CAC. Autumn Maize Intercropped with Tropical Forages: Crop Residues, Nutrient Cycling, Subsequent Soybean and Soil Quality. Rev Bras Cienc Solo. 2016;40:e0150003. https://doi.org/10.1590/18069657rbcs20150003
https://doi.org/10.1590/18069657rbcs2015... ).

Soil pH was classified as highly acidic (from 4.31 to 5.00), with low dispersion around the mean. Organic matter contents, in turn, were rated as high (from 35.01 to 60.00 g dm^-3), with a coefficient of variation of 12.50 % (Warrick and Nielsen, 1980Warrick AW, Nielsen DR. Spatial variability of soil physical properties in the field. In: Hillel D, editor. Applications of soil physics. New York: Academic Press; 1980. p. 319-44.; Costa and Oliveira, 2001Costa JM, Oliveira EF. Fertilidade do solo e nutrição de plantas. 2. ed rev. Campo Mourão: Coamo - Cascavel: Coodetec; 2001.). Large amounts of OM and high acidity might be associated with the no-tillage system (NTS) implemented in the area since 1994. The amount of straw increases on the soil surface in areas under NTS for long periods, which shifts the quantity and quality of OM and gradually alters pH due to basic cations and soluble organic carbon (Dalchiavon et al., 2013Dalchiavon FC, Carvalho MP, Montanari R, Andreotti M. Sugarcane productivity correlated with physical-chemical attributes to create soil management zone. Rev Ceres. 2013;60:706-14. https://doi.org/10.1590/S0034-737X2013000500015
https://doi.org/10.1590/S0034-737X201300... ). In addition, NTS may have limited the action of liming, restricting it to the areas of application, i.e., in the surface soil layers (Nunes et al., 2011Nunes RS, Sousa DMG, Goedert WJ, Vivaldi LJ. Distribuição de fósforo no solo em razão do sistema de cultivo e manejo da adubação fosfatada. Rev Bras Cienc Solo. 2011;35:877-88. https://doi.org/10.1590/S0100-06832011000300022
https://doi.org/10.1590/S0100-0683201100... ; Paganiand Mallarino, 2015Pagani A, Mallarino AP On-farm evaluation of corn and soybean grain yield and soil pH responses to liming. Agron J. 2015;107:71-82. https://doi.org/10.2134/agronj14.0314
https://doi.org/10.2134/agronj14.0314... ; Dalla Nora et al., 2017Dalla Nora D, Amado TJC, Nicoloso RS, Gruhn EM. Modern high-yielding maize, wheat and soybean cultivars in response to gypsum and lime application on no-till Oxisol. Rev Bras Cienc Solo. 2017;41:e0160504. https://doi.org/10.1590/18069657rbcs20160504
https://doi.org/10.1590/18069657rbcs2016... ).

All variables showed high kurtosis (Table 1) (Casella and Berger, 2010Casella G, Berger RL. Inferência estatística - tradução da 2a edição norte-americana. São Paulo: Cengage Learning; 2010.). A process modeled under this condition may produce a model of interpolation with biased parameters and, consequently, generate overestimated/underestimated regions in the map. Our proposed model based on the slash distribution is able to reduce this impact by determining the shape parameter η, which will allow kurtosis adjustment of the data.

All the samples and the SSLM were considered in studying the dataset, assuming Y~SL_n (Χβ, Σ, η). For comparison, a Gaussian spatial linear model (GSLM) was also used, in which Y~N_n (Χβ, Σ). Thus, we could assess the model robustness and the effect of outliers and/or influential values on parameter estimates and mapping. Both criteria, CrV and Tr, indicated that for an SSLM, the adequate value for kurtosis adjustment was η = 0.25, and for the Matérn model, smoothing was κ = 0.25. For a GSLM, an adequate model for explaining spatial dependence was also the Matérn model, with a smoothing parameter of κ = 0.25.

Table 2 shows the parameter estimated by the ML estimator. The asymptotic standard errors (in brackets) were calculated from Fisher's information matrix. It may be noted that the estimated values of the parameter vector β were similar in both models, with the lowest standard deviations observed for the GSLM, except for ${\hat{β}}_{1}$ , which was equal.

Thumbnail

Table 2
Values of parameters for kurtosis adjustment (η) and of smoothing (κ) selected from CrV (Cross-validation) and Tr (Trace) criteria; estimated values of parameters of the Gaussian spatial linear model (GSLM) and slash spatial linear model (SSLM) by maximum likelihood using the EM algorithm with respective asymptotic standard deviations (in brackets)

It is noteworthy that the large values of coefficient of the parameters were associated with K ( ${\hat{β}}_{2}$ ) content, but with a negative signal, and pH ( ${\hat{β}}_{3}$ ) with a positive signal. This relationship may have contributed to the estimation of average soybean yield (t ha⁻¹) showing lower values than the national average, since an increase in soil acidity implies an increase of Al³⁺, which inhibits root development and, consequently, decreases uptake of nutrients, such as K.

Another issue that may be related to pH is the low value of the ${\hat{β}}_{1}$ coefficient, since soil acidity decreases release of available P in organic matter of the soil (Santos et al., 2013Santos HG, Jacomine PKT, Anjos LHC, Oliveira VA, Oliveira JB, Coelho MR, Lumbreras JF, Cunha TJF. Sistema brasileiro de classificação de solos. 3. ed. Rio de Janeiro: Embrapa Solos; 2013.; Passos et al., 2015Passos AMA, Rezende PM, Carvalho ER, Ávila FW. Pó de carvão, esterco de curral e cama de frango no cultivo da soja e atributos químicos de um Cambissolo distrófico. Rev Bras Cienc Agrar. 2015;10:382-8. https://doi.org/10.5039/agraria.v10i3a4546
https://doi.org/10.5039/agraria.v10i3a45... ).

Comparing values estimated for parameters related to the covariance of the spatial process, relevant differences were found, with the lowest standard deviations being registered in the SSLM. By calculating the relationship ϕ₁/(ϕ₁ + ϕ₂), we found that SSLM had an index of 0.66, while GSLM had an index of 0.75. The smoothing parameter was κ = 0.25 for both models, and this confirms that the practical range of SSLM was 485.37 m, whereas that of GSLM was merely 311.78 m. For Cambardella et al. (1994)Cambardella CA, Moorman TB, Parkin TB, Karlen DL, Novak JM, Turco RF, Konopka AE. Field-scale variability of soil properties in Central Iowa Soils. Soil Sci Soc Am J. 1994;58:1501-11. https://doi.org/10.2136/sssaj1994.03615995005800050033x
https://doi.org/10.2136/sssaj1994.036159... , the best estimates are reached when models are based on covariance functions with the lowest “nugget effect/sill” ratio and the highest range.

Using data from table 2, we were able to establish the GSLM of the soybean yield at S_i for the area under study, which was expressed by equation 5:

Eq. 5

Y (s_{i}) = 2.030 + 0.003 P (s_{i}) - 0.121 K (s_{i}) + 0.033 p H (s_{i}) + 0.003 O M (s_{i}) + ε (s_{i})

with i = 1, …, 78 and the covariance matrix given by $\sum^{^} = 0.053 l_{78} + 0.018 \hat{R} (52.666)$ . The SSLM was expressed by the equation 6:

Eq. 6

Y (s_{i}) = 1.993 + 0.003 P (s_{i}) - 0.109 K (s_{i}) + 0.034 p H (s_{i}) + 0.004 O M (s_{i}) + ε (s_{i})

with i = 1, …, 78 and the covariance matrix given by $\sum^{^} = 0.051 l_{78} + 0.026 \hat{R} (81.988)$ .

In both cases, the correlation matrix elements R were determined by the correlation function of the Matérn family with κ = 0.25.

The hypothesis test to assess the significance of parameter vector β was applied jointly and individually for both models. Table 3 displays the results of the LR and respective p-values. The null hypothesis β₁ = β₂ = β₃ = β₄ = 0 was rejected at 5 % probability for both models. Individual testing also showed the parameter values of (β's) as significant. Therefore, the covariates P, K, pH, and OM remained in the modeling.

Thumbnail

Table 3
Likelihood ratio statistic (LR) of the parameter vector β of the Gaussian spatial linear model (GSLM) and slash spatial linear model (SSLM) at 5 % probability

Local influence was diagnosed to evaluate the sensitivity of estimates from the model and from the predictive process to atypical and/or influential values. For GSLM, we used a perturbation scheme fitted to a normal distribution, as presented by De Bastiani et al. (2015)De Bastiani F, Cysneiros AHMA, Uribe-Opazo MA, Galea M. Influence diagnostics in elliptical spatial linear models. TEST. 2015;24:322-40. https://doi.org/10.1007/s11749-014-0409-z
https://doi.org/10.1007/s11749-014-0409-... . For the GSLM, the samples exerting influence on the response variables were 15 and 72 (Figure 2a). Conversely, for the linear predictor, the number of influencing samples was higher, namely, samples 15, 58, 65, and 72 (Figure 3a).

Figure 2
Graphs of local influence perturbing the response variable considering: (a) the Gaussian spatial linear model (GSLM) and measure |h_[L]max|; (b) the slash spatial linear model (SSLM) and measure |h_[L]max|; and (c) the slash spatial linear model (SSLM) and measure |h_[Q]max|. Observations above the dotted line are classified as influential.

Figure 3
Graphs of local influence perturbing the linear predictor considering: (a) the Gaussian spatial linear model (GSLM) and measure L_p; (b) the slash spatial linear model (SSLM) and measure L_p; and (c) the slash spatial linear model (SSLM) and measure Q_p. Observations above the dotted line are classified as influential.

The figure regarding local influence on the response variable perturbing the SSLM shows that no observations were identified when considered measure |h_[L]max| (Figure 2b). However, sample 71 was influential for measure |h_[Q]max| (Figure 2c). Considering perturbation of the linear predictor, samples 1 and 71 were influential (Figure 3b) when measure L_p was used. And, for perturbation of the Q-function, measure Q_p, samples 1, 3, 62, and 71 were identified as influential (Figure 3c).

It is noteworthy that atypical sample 33 was not identified as influential on soybean yield response in any of the cases. These results corroborate those of Uribe-Opazo et al. (2012)Uribe-Opazo MA, Borssoi JA, Galea M. Influence diagnostics in Gaussian spatial linear models. J Appl Stat. 2012;39:615-30. https://doi.org/10.1080/02664763.2011.607802
https://doi.org/10.1080/02664763.2011.60... , who used a GSLM, tested different influence diagnostics techniques, and observed that not every atypical value could have an influence on model determination in a covariate study. Assumpção et al. (2014)Assumpção RAB, Uribe-Opazo MA, Galea M. Analysis of local influence in geostatistics using Student's t-distribution. J Appl Stat. 2014;41:2323-41. http://doi.org/10.1080/02664763.2014.909793
http://doi.org/10.1080/02664763.2014.909... , based on Student's t-distribution, also found differences between the atypical and the influential values; they highlighted the relevance of the local influence method as opposed to a simple box-plot analysis. Furthermore, the smaller number of influential cases found here when SSLM is fitted is consistent with the results of De Bastiani et al. (2015)De Bastiani F, Cysneiros AHMA, Uribe-Opazo MA, Galea M. Influence diagnostics in elliptical spatial linear models. TEST. 2015;24:322-40. https://doi.org/10.1007/s11749-014-0409-z
https://doi.org/10.1007/s11749-014-0409-... ; these authors used distributions with heavier tails and obtained greater modeling robustness.

Figure 4 displays the post-plot graph of the experimental area, highlighting the position of each influential sample. For GSLM (Figure 4a), the yield value of sample 72 was from 1.87 to 2.18 t ha⁻¹ and the yield values of samples 58 and 65 were from 2.18 to 2.33 t ha⁻¹. Note that nearest observations of these samples showed soybean yield highs values. Sample 15 was found within a region with high values. When considering SSLM (Figure 4b), we found that the values of samples 1, 62, and 71 were from 2.55 to 3.18 t ha⁻¹ and that of sample 3 was from 2.33 to 2.55 t ha⁻¹. Spatial analysis of the samples showed that neighboring observations to the left of sample 71 had values lower than the values of samples 1 and 3, which are close to the contour of the plotting domain.

Figure 4
Post-plot graph of soybean yield (t ha⁻¹) in the experimental area identifying influential samples: (a) the Gaussian spatial linear model (GSLM) and (b) the slash spatial linear model (SSLM).

In diagnostic analysis, when one or more influential values are detected, they are removed from the dataset to understand how their removal affects model selection, parameter estimates, and construction of the interpolation maps (Assumpção et al., 2011Assumpção RAB, Opazo MAU, Galea M. Local influence for spatial analysis of soil physical properties and soybean yield using Student's t-distribution. Rev Bras Cienc Solo. 2011;35:1917-26. https://doi.org/10.1590/S0100-06832011000600008
https://doi.org/10.1590/S0100-0683201100... ). For GSLM, samples 15 and 72 were removed from the dataset since they were identified as influential when applying measures |h_[L]max| and L_p. Using the same criterion for SSLM and considering measures |h_[L]max|, |h_[Q]max|, L_p, and Q_p, samples 1 and 71 were removed. Table 4 shows the values of η and κ, which were selected by CrV and Tr criteria; this table also shows the estimated values of parameters for both models after sample removal. Furthermore, it is notable that there was no change in the choice of the model to describe spatial dependence. Nevertheless, after removing the influential samples, the estimated values of parameters changed, and especially asymptotic standard deviation showed higher values compared to the model without sample exclusion (Table 2). Moreover, we note a reduction in ϕ₁ values and in the estimated spatial dependence radius, which is 170.73 m for GSLM and 348.57 m for SSLM.

Thumbnail

Table 4
Values of the parameter of kurtosis adjustment (η) and of smoothing (κ) selected by CrV (Cross-validation) and Tr (Trace) criteria; estimated values of parameters of the Gaussian spatial linear model (GSLM) and slash spatial linear model (SSLM) by maximum likelihood using the EM algorithm with respective asymptotic standard deviations (in brackets) after removal of influential samples

Figure 5 shows the interpolated maps of yield as a function of the covariates studied using regression-kriging, and the estimated values of parameters are shown in tables 2 and 4. The maps were generated based on the following scenarios: S1- using the entire dataset and the GSLM (Figure 5a); S2 - using the entire dataset and the SSLM (Figure 5b); S3 - removing the influential observations for GSLM (Figure 5c); and S4 - removing the influential observations for SSLM (Figure 5d). Four classes of the same size were considered to build the interpolated maps; these classes were obtained by dividing the estimated range of yield variation.

Figure 5
Interpolation maps showing the soybean yield (t ha⁻¹) as a function of soil chemical properties [P, K, pH, and OM] for: (a) S₁: Gaussian spatial linear model (GSLM) with the complete dataset; (b) S₂: slash spatial linear model (SSLM) with the complete dataset; (c) S₃: Gaussian spatial linear model (GSLM) removing samples 15 and 72; (d) S₃: slash spatial linear model (SSLM) removing samples 1 and 71.

Comparing figure 5b to 5a, similar regions could be observed. However, the SSLM generated map (with ${\hat{ϕ}}_{1} = 0.051$ and 485.37 m range) exhibited a continuous structure, whereas the GSLM map (with ${\hat{ϕ}}_{1} = 0.053$ and 311.78 m range) had image formats of small dimensions. One of the problems related to maps whose modeling generates small regions is the difficulty of homogenizing the area to define management zones for localized application of fertilizers (Al-Kaisi et al., 2016Al-Kaisi MM, Archontoulis S, Kwaw-Mensah D. Soybean spatiotemporal yield and economic variability as affected by tillage and crop rotation. J Am Soc Agron. 2016;108:1267-80. https://doi.org/10.2134/agronj2015.0363
https://doi.org/10.2134/agronj2015.0363... ). Soares (2014)Soares, A. Geoestatística para as ciências da terra e do ambiente. 3. ed. Portugal: IST Press; 2014. affirmed a direct relationship of this effect with the parameters describing spatial dependence structure. According to this author, when the nugget effect is low or null, the influence of all samples is greater, and the map surface becomes smoother for larger ranges. When looking at figure 5c, an interpolated map that was built considering a GSLM without influential samples, a variation could be observed for short distances. This behavior is related to a structure with a sill and no nugget effect, showing a “screen effect” that is characterized by major weights around the point, as well as by almost null or even negative weights due to samples that are remote or beyond the practical limits (Soares, 2014Soares, A. Geoestatística para as ciências da terra e do ambiente. 3. ed. Portugal: IST Press; 2014.). With respect to figure 5d, empirical analysis indicated that the removal of influential samples caused small alterations in the interpolated map. Thus, when the data set contains influential values, the maps generated using SSLM can prevent problems that cause misinterpretation in defining management zones.

To quantify the similarity between the interpolated maps, the global accuracy (GA) and kappa indexes were used. According to the classification of Krippendorff (2004)Krippendorff K. Content analysis: an introduction to its methodology. 2nd ed. Thousand Oaks: Sage Publications; 2004., maps show low similarity if kappa < 0.67, medium similarity if 0.67 ≤ kappa ≤ 0.80, and high similarity if kappa > 0.80. With respect to GA, as stated by Anderson et al. (1976)Anderson JR, Hardy EE, Roach JT, Witmer, RE. A land use and land cover classification system for use with remote sensor data - Geological survey professional paper 964. Washington, DC: United States Government Printing Office; 1976 [acessed on 7 Sep 2016]. Available at: http://www.pbcgis.com/data_basics/anderson.pdf
http://www.pbcgis.com/data_basics/anders... , maps are similar if the GA index is greater than 0.85. Table 5 shows the values associated with each index. Comparing scenario S₁ to S₂, the kappa index indicated medium similarity, whereas the GA index assigned similarity to the maps. However, comparing S₁ to S₃, the index values mark dissimilarity between the maps, i.e., upon removal of samples 15, 58, 65, and 72, the map outlines changed. In the case of comparing S₂ to S₄, the maps were similar.

Thumbnail

Table 5
Index of similarity between the maps after local influence analysis

Therefore, the accuracy indexes of the maps showed the lower sensitivity of those generated by the slash distribution when removing an influential value, in contrast to a normal distribution. Thus, the SSLM enabled more robust modeling in the presence of influential observations, avoiding unnecessary exclusion of samples. From the interpolated maps, the yield potential of the area and its relationship to sub-regions can be investigated. This allows definition of better soil management and production system strategies, such as acidity correction.

CONCLUSIONS

The map of soybean yield variability as a function of soil chemical properties generated from the SSLM was less sensitive to the high kurtosis of the data set and the presence of influential and/or atypical values, which shows the robustness of the proposed model.

Diagnostic of local influence on the response variable and on the linear predictor based on the SSLM confirmed that an atypical value might not be influential since spatial modeling takes the position of the variable within the space into account. This finding may be a determining factor in the prediction process, avoiding exclusion of samples when using this method of modeling.

ACKNOWLEDGMENTS

We wish to acknowledge partial financial support from the Fundação Araucária of the state of Paraná, Capes, and CNPq of Brazil, and Project Fondecyt 1150325 of Chile.

REFERENCES

Al-Kaisi MM, Archontoulis S, Kwaw-Mensah D. Soybean spatiotemporal yield and economic variability as affected by tillage and crop rotation. J Am Soc Agron. 2016;108:1267-80. https://doi.org/10.2134/agronj2015.0363
» https://doi.org/10.2134/agronj2015.0363
Alcantara IC, Cysneiros FJA. Linear regression models with slash-elliptical errors. Comput Stat Data An. 2013;64:153-64. https://doi.org/10.1016/j.csda.2013.02.029
» https://doi.org/10.1016/j.csda.2013.02.029
Amado TJC, Pes LZ, Lemainski CL, Schenato RB. Atributos químicos e físicos de Latossolos e sua relação com os rendimentos de milho e feijão irrigados. Rev Bras Cienc Solo. 2009;33:831-43. https://doi.org/10.1590/S0100-06832009000400008
» https://doi.org/10.1590/S0100-06832009000400008
Anderson JR, Hardy EE, Roach JT, Witmer, RE. A land use and land cover classification system for use with remote sensor data - Geological survey professional paper 964. Washington, DC: United States Government Printing Office; 1976 [acessed on 7 Sep 2016]. Available at: http://www.pbcgis.com/data_basics/anderson.pdf
» http://www.pbcgis.com/data_basics/anderson.pdf
Assumpção RAB, Opazo MAU, Galea M. Local influence for spatial analysis of soil physical properties and soybean yield using Student's t-distribution. Rev Bras Cienc Solo. 2011;35:1917-26. https://doi.org/10.1590/S0100-06832011000600008
» https://doi.org/10.1590/S0100-06832011000600008
Assumpção RAB, Uribe-Opazo MA, Galea M. Analysis of local influence in geostatistics using Student's t-distribution. J Appl Stat. 2014;41:2323-41. http://doi.org/10.1080/02664763.2014.909793
» http://doi.org/10.1080/02664763.2014.909793
Bottega EL, Queiroz DM, Pinto FAC, Souza CMA. Variabilidade espacial de atributos do solo em sistema de semeadura direta com rotação de culturas no cerrado brasileiro. Rev Cienc Agron. 2013;44:1-9. https://doi.org/10.1590/S1806-66902013000100001
» https://doi.org/10.1590/S1806-66902013000100001
Cambardella CA, Moorman TB, Parkin TB, Karlen DL, Novak JM, Turco RF, Konopka AE. Field-scale variability of soil properties in Central Iowa Soils. Soil Sci Soc Am J. 1994;58:1501-11. https://doi.org/10.2136/sssaj1994.03615995005800050033x
» https://doi.org/10.2136/sssaj1994.03615995005800050033x
Carvalho CGP, Arias CAA, Toledo JFF, Almeida LA, Kiihl RAS, Oliveira MF, Hiromoto DM, Takeda C. Proposta de classificação dos coeficientes de variação em relação à produtividade e altura da planta de soja. Pesq Agropec Bras. 2003;38:187-93. https://doi.org/10.1590/S0100-204X2003000200004
» https://doi.org/10.1590/S0100-204X2003000200004
Casella G, Berger RL. Inferência estatística - tradução da 2a edição norte-americana. São Paulo: Cengage Learning; 2010.
Cook RD. Assessment of local influence. J R Statist Soc B. 1986;48:133-69.
Costa JM, Oliveira EF. Fertilidade do solo e nutrição de plantas. 2. ed rev. Campo Mourão: Coamo - Cascavel: Coodetec; 2001.
Cressie NAC. Statistics for spatial data. 2nd ed. NewYork: Jonh Wiley & Sons;2015.
Dagenais MG, Dufour JM. Invariance, nonlinear models, and asymptotic tests. Econometrica. 1991;59:1601-15. https://doi.org/10.2307/2938281
» https://doi.org/10.2307/2938281
Dalchiavon FC, Carvalho MP, Montanari R, Andreotti M. Sugarcane productivity correlated with physical-chemical attributes to create soil management zone. Rev Ceres. 2013;60:706-14. https://doi.org/10.1590/S0034-737X2013000500015
» https://doi.org/10.1590/S0034-737X2013000500015
De Bastiani F, Cysneiros AHMA, Uribe-Opazo MA, Galea M. Influence diagnostics in elliptical spatial linear models. TEST. 2015;24:322-40. https://doi.org/10.1007/s11749-014-0409-z
» https://doi.org/10.1007/s11749-014-0409-z
Dalla Nora D, Amado TJC, Nicoloso RS, Gruhn EM. Modern high-yielding maize, wheat and soybean cultivars in response to gypsum and lime application on no-till Oxisol. Rev Bras Cienc Solo. 2017;41:e0160504. https://doi.org/10.1590/18069657rbcs20160504
» https://doi.org/10.1590/18069657rbcs20160504
Diggle PJ, Ribeiro Junior PJ. Model-based geostatistics. New York: Springer; 2007.
Grzegozewski DM, Uribe-Opazo MA, De Bastiani F, Galea M. Local influence when fitting Gaussian spatial linear models: an agriculture application. Cien Inv Agr. 2013;40:523-35. https://doi.org/10.4067/S0718-16202013000300006
» https://doi.org/10.4067/S0718-16202013000300006
Instituito Brasileiro de Geografia e Estatística -IBGE. Banco de dados agregados- Sistema IBGE de recuperação automática-SIDRA; 2016 [acesso em 10 nov 2016]. Disponível em: http://www.sidra.ibge.gov.br
» http://www.sidra.ibge.gov.br
Jonathan R, Uribe-Opazo MA, De Bastiani F, Johann JA. Técnicas para detecção de pontos influentes em variáveis contínuas regionalizadas. Eng Agric. 2016;36:152-65. http://dx.doi.org/10.1590/1809-4430-Eng.Agric.v36n1p152-165/2016
» http://dx.doi.org/10.1590/1809-4430-Eng.Agric.v36n1p152-165/2016
Kano Y, Berkane M, Bentler PM. Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations. J Am Stat Assoc. 1993;88:135-43. https://doi.org/10.2307/2290706
» https://doi.org/10.2307/2290706
Krippendorff K. Content analysis: an introduction to its methodology. 2nd ed. Thousand Oaks: Sage Publications; 2004.
Lange K, Sinsheimer JS. Normal/independent distributions and their applications in robust regression. J Comput Graph Stat. 1993;2:175-98. https://doi.org/10.2307/1390698
» https://doi.org/10.2307/1390698
Matérn B. Spatial variation - Lecture notes in statistics. 2nd ed. New York: Springer-Verlag; 1986.
Nesi CN, Ribeiro A, Bonat WH, Ribeiro Junior PJ. Verossimilhança na seleção de modelos para predição espacial. Rev Bras Cienc Solo. 2013;37:352-8. https://doi.org/10.1590/S0100-06832013000200006
» https://doi.org/10.1590/S0100-06832013000200006
Nunes RS, Sousa DMG, Goedert WJ, Vivaldi LJ. Distribuição de fósforo no solo em razão do sistema de cultivo e manejo da adubação fosfatada. Rev Bras Cienc Solo. 2011;35:877-88. https://doi.org/10.1590/S0100-06832011000300022
» https://doi.org/10.1590/S0100-06832011000300022
Osorio F, Paula GA, Galea M. On estimation and influence diagnostics for the Grubbs’ model under heavy-tailed distributions. Comput Stat Data An. 2009;53:1249-63. https://doi.org/10.1016/j.csda.2008.10.034
» https://doi.org/10.1016/j.csda.2008.10.034
Pagani A, Mallarino AP On-farm evaluation of corn and soybean grain yield and soil pH responses to liming. Agron J. 2015;107:71-82. https://doi.org/10.2134/agronj14.0314
» https://doi.org/10.2134/agronj14.0314
Passos AMA, Rezende PM, Carvalho ER, Ávila FW. Pó de carvão, esterco de curral e cama de frango no cultivo da soja e atributos químicos de um Cambissolo distrófico. Rev Bras Cienc Agrar. 2015;10:382-8. https://doi.org/10.5039/agraria.v10i3a4546
» https://doi.org/10.5039/agraria.v10i3a4546
Pereira FCBL, Mello LMM, Pariz CM, Mendonça VZ, Yano EH, Miranda EEV, Crusciol CAC. Autumn Maize Intercropped with Tropical Forages: Crop Residues, Nutrient Cycling, Subsequent Soybean and Soil Quality. Rev Bras Cienc Solo. 2016;40:e0150003. https://doi.org/10.1590/18069657rbcs20150003
» https://doi.org/10.1590/18069657rbcs20150003
R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016 [acessed on 2 Nov. 2016]. Available at: http://www.R-project.org
» http://www.R-project.org
Santos HG, Jacomine PKT, Anjos LHC, Oliveira VA, Oliveira JB, Coelho MR, Lumbreras JF, Cunha TJF. Sistema brasileiro de classificação de solos. 3. ed. Rio de Janeiro: Embrapa Solos; 2013.
Schemmer R, Uribe-Opazo MA, Galea M, Assumpção RAB. Spatial variability of soybean yield through a reparameterized t-Student model. Eng Agric. 2017;37:360-70. http://dx.doi.org/10.1590/1809-4430-eng.agric.v37n4p760-770/2017
» http://dx.doi.org/10.1590/1809-4430-eng.agric.v37n4p760-770/2017
Soares, A. Geoestatística para as ciências da terra e do ambiente. 3. ed. Portugal: IST Press; 2014.
Sobjak R, Souza EG, Bazzi CL, Uribe-Opazo MA, Betzek NM. Redundant variables and the quality of management zones. Eng Agric. 2016;36:78-93. https://doi.org/10.1590/1809-4430-Eng.Agric.v36n1p78-93/2016
» https://doi.org/10.1590/1809-4430-Eng.Agric.v36n1p78-93/2016
Soil Survey Staff. Keys to soil taxonomy. 12th ed. Washington, DC: United States Department of Agriculture, Natural Resources Conservation Service; 2014.
Uribe-Opazo MA, Borssoi JA, Galea M. Influence diagnostics in Gaussian spatial linear models. J Appl Stat. 2012;39:615-30. https://doi.org/10.1080/02664763.2011.607802
» https://doi.org/10.1080/02664763.2011.607802
Walkley A, Black I A. An examination of the Degtjareff method for determining soil organic matter and a proposed modification of the chromic acid titration method. Soil Sci. 1934;27:29-37. https://doi.org/10.1097/00010694-193401000-00003
» https://doi.org/10.1097/00010694-193401000-00003
Warrick AW, Nielsen DR. Spatial variability of soil physical properties in the field. In: Hillel D, editor. Applications of soil physics. New York: Academic Press; 1980. p. 319-44.
Zhu H-T, Lee S-Y Local influence for incomplete-data models. J R Statist Soc B. 2001;63:111-26. https://doi.org/10.1111/1467-9868.00279
» https://doi.org/10.1111/1467-9868.00279

Publication Dates

Publication in this collection
2018

History

Received
31 Jan 2017
Accepted
31 Aug 2017

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

[1] Al-Kaisi MM, Archontoulis S, Kwaw-Mensah D. Soybean spatiotemporal yield and economic variability as affected by tillage and crop rotation. J Am Soc Agron. 2016;108:1267-80. https://doi.org/10.2134/agronj2015.0363
» https://doi.org/10.2134/agronj2015.0363

[2] Alcantara IC, Cysneiros FJA. Linear regression models with slash-elliptical errors. Comput Stat Data An. 2013;64:153-64. https://doi.org/10.1016/j.csda.2013.02.029
» https://doi.org/10.1016/j.csda.2013.02.029

[3] Amado TJC, Pes LZ, Lemainski CL, Schenato RB. Atributos químicos e físicos de Latossolos e sua relação com os rendimentos de milho e feijão irrigados. Rev Bras Cienc Solo. 2009;33:831-43. https://doi.org/10.1590/S0100-06832009000400008
» https://doi.org/10.1590/S0100-06832009000400008

[4] Anderson JR, Hardy EE, Roach JT, Witmer, RE. A land use and land cover classification system for use with remote sensor data - Geological survey professional paper 964. Washington, DC: United States Government Printing Office; 1976 [acessed on 7 Sep 2016]. Available at: http://www.pbcgis.com/data_basics/anderson.pdf
» http://www.pbcgis.com/data_basics/anderson.pdf

[5] Assumpção RAB, Opazo MAU, Galea M. Local influence for spatial analysis of soil physical properties and soybean yield using Student's t-distribution. Rev Bras Cienc Solo. 2011;35:1917-26. https://doi.org/10.1590/S0100-06832011000600008
» https://doi.org/10.1590/S0100-06832011000600008

[6] Assumpção RAB, Uribe-Opazo MA, Galea M. Analysis of local influence in geostatistics using Student's t-distribution. J Appl Stat. 2014;41:2323-41. http://doi.org/10.1080/02664763.2014.909793
» http://doi.org/10.1080/02664763.2014.909793

[7] Bottega EL, Queiroz DM, Pinto FAC, Souza CMA. Variabilidade espacial de atributos do solo em sistema de semeadura direta com rotação de culturas no cerrado brasileiro. Rev Cienc Agron. 2013;44:1-9. https://doi.org/10.1590/S1806-66902013000100001
» https://doi.org/10.1590/S1806-66902013000100001

[8] Cambardella CA, Moorman TB, Parkin TB, Karlen DL, Novak JM, Turco RF, Konopka AE. Field-scale variability of soil properties in Central Iowa Soils. Soil Sci Soc Am J. 1994;58:1501-11. https://doi.org/10.2136/sssaj1994.03615995005800050033x
» https://doi.org/10.2136/sssaj1994.03615995005800050033x

[9] Carvalho CGP, Arias CAA, Toledo JFF, Almeida LA, Kiihl RAS, Oliveira MF, Hiromoto DM, Takeda C. Proposta de classificação dos coeficientes de variação em relação à produtividade e altura da planta de soja. Pesq Agropec Bras. 2003;38:187-93. https://doi.org/10.1590/S0100-204X2003000200004
» https://doi.org/10.1590/S0100-204X2003000200004

[10] Casella G, Berger RL. Inferência estatística - tradução da 2a edição norte-americana. São Paulo: Cengage Learning; 2010.

[11] Cook RD. Assessment of local influence. J R Statist Soc B. 1986;48:133-69.

[12] Costa JM, Oliveira EF. Fertilidade do solo e nutrição de plantas. 2. ed rev. Campo Mourão: Coamo - Cascavel: Coodetec; 2001.

[13] Cressie NAC. Statistics for spatial data. 2nd ed. NewYork: Jonh Wiley & Sons;2015.

[14] Dagenais MG, Dufour JM. Invariance, nonlinear models, and asymptotic tests. Econometrica. 1991;59:1601-15. https://doi.org/10.2307/2938281
» https://doi.org/10.2307/2938281

[15] Dalchiavon FC, Carvalho MP, Montanari R, Andreotti M. Sugarcane productivity correlated with physical-chemical attributes to create soil management zone. Rev Ceres. 2013;60:706-14. https://doi.org/10.1590/S0034-737X2013000500015
» https://doi.org/10.1590/S0034-737X2013000500015

[16] De Bastiani F, Cysneiros AHMA, Uribe-Opazo MA, Galea M. Influence diagnostics in elliptical spatial linear models. TEST. 2015;24:322-40. https://doi.org/10.1007/s11749-014-0409-z
» https://doi.org/10.1007/s11749-014-0409-z

[17] Dalla Nora D, Amado TJC, Nicoloso RS, Gruhn EM. Modern high-yielding maize, wheat and soybean cultivars in response to gypsum and lime application on no-till Oxisol. Rev Bras Cienc Solo. 2017;41:e0160504. https://doi.org/10.1590/18069657rbcs20160504
» https://doi.org/10.1590/18069657rbcs20160504

[18] Diggle PJ, Ribeiro Junior PJ. Model-based geostatistics. New York: Springer; 2007.

[19] Grzegozewski DM, Uribe-Opazo MA, De Bastiani F, Galea M. Local influence when fitting Gaussian spatial linear models: an agriculture application. Cien Inv Agr. 2013;40:523-35. https://doi.org/10.4067/S0718-16202013000300006
» https://doi.org/10.4067/S0718-16202013000300006

[20] Instituito Brasileiro de Geografia e Estatística -IBGE. Banco de dados agregados- Sistema IBGE de recuperação automática-SIDRA; 2016 [acesso em 10 nov 2016]. Disponível em: http://www.sidra.ibge.gov.br
» http://www.sidra.ibge.gov.br

[21] Jonathan R, Uribe-Opazo MA, De Bastiani F, Johann JA. Técnicas para detecção de pontos influentes em variáveis contínuas regionalizadas. Eng Agric. 2016;36:152-65. http://dx.doi.org/10.1590/1809-4430-Eng.Agric.v36n1p152-165/2016
» http://dx.doi.org/10.1590/1809-4430-Eng.Agric.v36n1p152-165/2016

[22] Kano Y, Berkane M, Bentler PM. Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations. J Am Stat Assoc. 1993;88:135-43. https://doi.org/10.2307/2290706
» https://doi.org/10.2307/2290706

[23] Krippendorff K. Content analysis: an introduction to its methodology. 2nd ed. Thousand Oaks: Sage Publications; 2004.

[24] Lange K, Sinsheimer JS. Normal/independent distributions and their applications in robust regression. J Comput Graph Stat. 1993;2:175-98. https://doi.org/10.2307/1390698
» https://doi.org/10.2307/1390698

[25] Matérn B. Spatial variation - Lecture notes in statistics. 2nd ed. New York: Springer-Verlag; 1986.

[26] Nesi CN, Ribeiro A, Bonat WH, Ribeiro Junior PJ. Verossimilhança na seleção de modelos para predição espacial. Rev Bras Cienc Solo. 2013;37:352-8. https://doi.org/10.1590/S0100-06832013000200006
» https://doi.org/10.1590/S0100-06832013000200006

[27] Nunes RS, Sousa DMG, Goedert WJ, Vivaldi LJ. Distribuição de fósforo no solo em razão do sistema de cultivo e manejo da adubação fosfatada. Rev Bras Cienc Solo. 2011;35:877-88. https://doi.org/10.1590/S0100-06832011000300022
» https://doi.org/10.1590/S0100-06832011000300022

[28] Osorio F, Paula GA, Galea M. On estimation and influence diagnostics for the Grubbs’ model under heavy-tailed distributions. Comput Stat Data An. 2009;53:1249-63. https://doi.org/10.1016/j.csda.2008.10.034
» https://doi.org/10.1016/j.csda.2008.10.034

[29] Pagani A, Mallarino AP On-farm evaluation of corn and soybean grain yield and soil pH responses to liming. Agron J. 2015;107:71-82. https://doi.org/10.2134/agronj14.0314
» https://doi.org/10.2134/agronj14.0314

[30] Passos AMA, Rezende PM, Carvalho ER, Ávila FW. Pó de carvão, esterco de curral e cama de frango no cultivo da soja e atributos químicos de um Cambissolo distrófico. Rev Bras Cienc Agrar. 2015;10:382-8. https://doi.org/10.5039/agraria.v10i3a4546
» https://doi.org/10.5039/agraria.v10i3a4546

[31] Pereira FCBL, Mello LMM, Pariz CM, Mendonça VZ, Yano EH, Miranda EEV, Crusciol CAC. Autumn Maize Intercropped with Tropical Forages: Crop Residues, Nutrient Cycling, Subsequent Soybean and Soil Quality. Rev Bras Cienc Solo. 2016;40:e0150003. https://doi.org/10.1590/18069657rbcs20150003
» https://doi.org/10.1590/18069657rbcs20150003

[32] R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016 [acessed on 2 Nov. 2016]. Available at: http://www.R-project.org
» http://www.R-project.org

[33] Santos HG, Jacomine PKT, Anjos LHC, Oliveira VA, Oliveira JB, Coelho MR, Lumbreras JF, Cunha TJF. Sistema brasileiro de classificação de solos. 3. ed. Rio de Janeiro: Embrapa Solos; 2013.

[34] Schemmer R, Uribe-Opazo MA, Galea M, Assumpção RAB. Spatial variability of soybean yield through a reparameterized t-Student model. Eng Agric. 2017;37:360-70. http://dx.doi.org/10.1590/1809-4430-eng.agric.v37n4p760-770/2017
» http://dx.doi.org/10.1590/1809-4430-eng.agric.v37n4p760-770/2017

[35] Soares, A. Geoestatística para as ciências da terra e do ambiente. 3. ed. Portugal: IST Press; 2014.

[36] Sobjak R, Souza EG, Bazzi CL, Uribe-Opazo MA, Betzek NM. Redundant variables and the quality of management zones. Eng Agric. 2016;36:78-93. https://doi.org/10.1590/1809-4430-Eng.Agric.v36n1p78-93/2016
» https://doi.org/10.1590/1809-4430-Eng.Agric.v36n1p78-93/2016

[37] Soil Survey Staff. Keys to soil taxonomy. 12th ed. Washington, DC: United States Department of Agriculture, Natural Resources Conservation Service; 2014.

[38] Uribe-Opazo MA, Borssoi JA, Galea M. Influence diagnostics in Gaussian spatial linear models. J Appl Stat. 2012;39:615-30. https://doi.org/10.1080/02664763.2011.607802
» https://doi.org/10.1080/02664763.2011.607802

[39] Walkley A, Black I A. An examination of the Degtjareff method for determining soil organic matter and a proposed modification of the chromic acid titration method. Soil Sci. 1934;27:29-37. https://doi.org/10.1097/00010694-193401000-00003
» https://doi.org/10.1097/00010694-193401000-00003

[40] Warrick AW, Nielsen DR. Spatial variability of soil physical properties in the field. In: Hillel D, editor. Applications of soil physics. New York: Academic Press; 1980. p. 319-44.

[41] Zhu H-T, Lee S-Y Local influence for incomplete-data models. J R Statist Soc B. 2001;63:111-26. https://doi.org/10.1111/1467-9868.00279
» https://doi.org/10.1111/1467-9868.00279

Variable	Y	P	K	pH(H₂O)	OM
	t ha^{-1 (1)}	mg dm^-3	mg dm^-3		g dm^-3
Mean	2.37	19.19	0.31	4.82	50.63
Standard deviation	0.07	123.56	0.02	0.17	40.05
Coefficient of variation (%)	11.48	57.91	45.35	8.43	12.50
Skewness	0.51	1.34	0.60	1.08	0.02
Kurtosis	3.11	4.73	2.44	4.31	2.44

Model	Structure	Estimated parameter⁽¹⁾ (1) βi: parameters associated with the variable i = [soybean yield (Y), P, K, pH, and OM]; ϕ^1 nugget effect, ϕ^2 partial sill, ϕ^3 parameter that defines the spatial dependence radius.
Model	Structure	${\hat{β}}_{0}$	${\hat{β}}_{1}$	${\hat{β}}_{2}$	${\hat{β}}_{3}$	${\hat{β}}_{4}$	${\hat{ϕ}}_{1}$	${\hat{ϕ}}_{2}$	${\hat{ϕ}}_{3}$
GSLM	Matérn κ = 2.5	2.030	0.003	-0.121	0.033	0.003	0.053	0.018	52.666
GSLM	Matérn κ = 2.5	(0.412)⁽²⁾ (2) The lowest values of the asymptotic standard deviations are underlined.	(0.003)	(0.228)	(0.076)	(0.005)	(0.059)	(0.059)	(0.007)
SSLM η = 0.25	Matérn κ = 2.5	1.993	0.003	-0.109	0.034	0.004	0.051	0.026	81.988
SSLM η = 0.25	Matérn κ = 2.5	(0.516)	(0.003)	(0.288)	(0.095)	(0.006)	(0.036)	(0.013)	(0.006)

Hypothesis⁽¹⁾ (1) Parameters associated with the variable: β1 - P, β2 - K, β3 - pH, and β4 - OM.	GSLM		SSLM
	LR	p-value	LR	p-value
β₁ = 0	6.687	0.036^* * significant at 5 % probability.	11.659	0.004^* * significant at 5 % probability.
β₂ = 0	7.001	0.032^* * significant at 5 % probability.	15.215	0.001^* * significant at 5 % probability.
β₃ = 0	5.953	0.050^* * significant at 5 % probability.	8.567	0.016^* * significant at 5 % probability.
β₄ = 0	6.257	0.044^* * significant at 5 % probability.	9.113	0.013^* * significant at 5 % probability.
β₁ = β₂ = β₃ = β₄ = 0	7.865	0.039^* * significant at 5 % probability.	13.897	0.003^* * significant at 5 % probability.

Model	Structure	Estimated parameter⁽¹⁾ (1) βi: parameters associated with the variable i = [soybean yield (Y), P, K, pH, and OM]; ϕ^1: nugget effect, ϕ^2: partial sill, ϕ^3: parameter that defines the spatial dependence radius.
Model	Structure	${\hat{β}}_{0}$	${\hat{β}}_{1}$	${\hat{β}}_{2}$	${\hat{β}}_{3}$	${\hat{β}}_{4}$	${\hat{ϕ}}_{1}$	${\hat{ϕ}}_{2}$	${\hat{ϕ}}_{3}$
GSLM	Matérn κ = 2.5	2.066	0.003	-0.102	0.039	0.002	0.000	0.074	28.840
GSLM	Matérn κ = 2.5	(0.423)⁽²⁾ (2) The lowest values of the asymptotic standard deviations are underlined.	(0.003)	(0.238)	(0.078)	(0.006)	(0.986)	(0.987)	(0.003)
SSLM η = 0.20	Matérn κ = 2.5	2.172	0.004	-0.164	0.053	-0.002	0.050	0.010	58.880
SSLM η = 0.20	Matérn κ = 2.5	(0.493)	(0.003)	(0.277)	(0.009)	(0.006)	(0.030)	(0.007)	(0.003)

Maps analyzed	Index
Maps analyzed	kappa	Global accuracy (GA)
S₁ against S₂	0.78	0.88
S₁ against S₃	0.59	0.67
S₂ against S₄	0.87	0.86