Water retention and penetration resistance equations for the least limiting water range

The least limiting water range is a soil physical quality indicator, which is useful to predict the optimum water range for plant growth in a given soil and to study the effects of soil use and management over this optimum water range by integrating the effects of available water, penetration resistance and air filled porosity. This study tested six equations to fit water retention and penetration resistance surface responses used to determine the least limiting water range and present a simple algorithm written in the open source software R for fitting, calculation and visualization of the least limiting water range. Five soils from Brazil and Canada, under different use and management conditions were used to test the functions. The results show that the three water retention surface responses had good statistical properties for fitting water retention and that two of the penetration resistance surface responses were adequate to fit the data, while one failed to achieve convergence in two instances. The open source code performed as well as the commercial statistical package SAS for fitting the penetration resistance and water retention


Introduction
Over the last decades, there has been a search for indicators of soil physical quality able to integrate soil physical parameters in a single index that can be easily measured and correlated with crop production factors (Dexter, 2004;Letey, 1985;Silva et al., 1994).The Least Limiting Water Range (LLWR) is an indicator of soil physical quality and has been used and validated under different crop production systems (Lima et al., 2012;Silva et al., 1994;Wu et al., 2003;Zou et al., 2000).The conceptual basis of the LLWR was proposed by Letey (1985), while Silva et al. (1994) developed a mathematical and statistical framework for the application of the concept to soil data.LLWR is sensitive to changes in soil air filled porosity (AFP) and soil resistance to penetration (SR) and integrates the traditional concept of available water defined as the water content between field capacity (FC) and permanent wilting point (PWP) in a particular soil.In the implementation of the method by Silva et al. (1994), AFP, SR, PWP and FC are defined as a function of soil bulk density (D b ), thereby creating an index that, once parameterized, can be used for monitoring soil physical quality by measuring soil bulk density alone.
The objectives of this research were to: i) evaluate the fitting properties and statistical quality of three soil water retention and three soil resistance to penetration equations to calculate LLWR, and ii) present a code developed in the open source software GNU R (R Development Core Team, version 3.0.1 "Good Sport") for fitting the relevant equations and creating a LLWR plot.

Materials and Methods
Mathematically, LLWR quantification is based on the fitting of a water retention function and a soil pen-etration resistance function.The water retention function, in this particular case, must incorporate soil structural variability, which can be achieved by incorporating soil bulk density into the function.Silva et al. (1994) used a power function described by Ross et al. (1991) for fitting the soil water retention curve: (1) where: θ = volumetric water content (cm 3 cm -3 ), ψ = soil water potential (MPa) and a and b = empirical parameters.
The "stepwise" multiple regression procedures used by Silva et al. (1994) resulted in a three parameter equations with good statistical properties to characterize the structural influence on the soil water retention process (Tormena et al., 1999;Betz et al., 1998).
where: D b = soil bulk density (g cm -3 ) and a, b, and c = empirical parameters.Soil bulk density can also be incorporated into the classic equation of van Genuchten (1980) for the soil water retention curve: where: θ r = soil residual water content (cm 3 cm -3 ), θ s = soil saturated water content (cm 3 cm -3 ), α = inverse of the air entry water potential value (MPa -1 ) and n = empirical parameter.To incorporate soil bulk density, the soil saturated water content is replaced by soil total porosity, calculated as 1 -D b /D p , where: D p = soil particle density, usually assumed as 2.65 g cm -3 .Another form of incorporating D b into Eq. 1 is through the a parameter: where: a 0 , a 1 and b = empirical parameters.The soil penetration resistance relationship is usually described using the non-linear equation developed by Busscher and Sojka (1987) (Silva et al., 1994): where: SR = soil resistance to penetration (MPa) and d, e, and f = empirical parameters.Busscher (1990) evaluated eight equations to fit the soil penetration resistance function and Eq. 5 was the equation with the highest coefficient of determination (R 2 average = 0.91) and with a good ratio of the coefficient of determination to the number of empirical parameters.Equation 5 is also mathematically simple and easily implemented in statistical software.
Despite the frequent use of Eq. 5 in studies to quantify LLWR and in soil resistance studies, several equations can be used to fit the soil resistance function, including equations that incorporate other independent variables, such as grain size parameters, water content at a given water potential, organic matter, among others.Perumpral (1987), Busscher (1990) and Caranache (1987) list almost twenty equations that can be used to fit the soil resistance to penetration function.Ayres and Perumpral (1982) searched for an empirical nonlinear equation that was simple, but with good fitting properties to fit SR data as a function of soil water content and dry bulk density.Their procedure resulted in a four-parameter equation: where: a, b, c and d = empirical parameters.In their study, Eq. 6 resulted in values of R 2 > 0.9 for soils with clay content ranging from 0 to 1000 g kg -1 (Ayres and Perumpral, 1982).Vaz et al. (2001) used a nonlinear three-parameter equation to fit the SR function by simultaneously measuring water content and soil penetration resistance using a cone penetrometer adapted with a time domain reflectometry (TDR) sensor: where: a, b, and c = Empirical parameters.The majority of the equations presented earlier can be linearized by logarithmic transformation.Since it is mathematically and computationally easier to work with linearized equations, many studies have used linearized forms of these equations to fit water retention and penetration resistance data (Silva et al., 1994;Betz et al., 1998;Zou et al., 2000).However, in many instances, the linear transformation also involves a transformation of the stochastic term (error), affecting statistical assumptions of the model (Bates and Watts, 1988).Besides, techniques and algorithms for nonlinear regression have improved dramatically over the past decades, becoming more efficient and accessible to a greater number of researchers (Seber and Wild, 1989).The computing power of most personal devices of today are greater than the machines used in the past and fitting a nonlinear equation has become trivial in terms of processing power and time consumption.
Equations 2 to 7 were evaluated using a dataset compiled and described by Leão et al. (2005).Five soils from different regions, taxonomic classes, grain size distributions and managements were used to evaluate the equations.The basic characterization of soils, including grain size distribution, use and reference is presented in Table 1.The soils used in this research are: a silt loam Inceptisol (Can1), a loamy sand Alfisol (Can2); a sandy clay Oxisol under native vegetation (NV), continuous grazing (CG), and short duration intensive grazing (SG); a clay Oxisol under grain rotation (Arg); and a loamy sand Ultisol under orange orchard (Are) (Table 1).
As several publications deal with the theoretical concept of LLWR (Letey, 1985;Silva et al., 1994) and the procedures for sampling and laboratory analysis to quantify the variables D b , θ, RP and ψ (Klute, 1986;Silva et al., 1994;Betz et al., 1998;Tormena et al., 1999), this research was mainly focused in the data analysis procedures.For model comparison, the equations were fitted using SAS software (Statistical Analysis System, version 9.2) as described by Leão et al. (2005).Afterward, two equations were chosen, one for the water retention curve and the other for the soil penetration resistance  and 0.0008 (cm 3 cm -3 ) 2 for Eq. 4. The average R 2 value was approximately 0.85 for Eqs. 2 and 3 and 0.86 for Eq. 4, while the F value was 3808.6, 3817.3 and 3088.8 for Eqs. 2, 3 and 4, respectively (Table 2).This shows that the statistical quality of the fittings was similar.In six cases, one of the parameters was not statistically significant, as evaluated by the confidence interval of the parameter estimate.The presence of nonsignificant parameters in a nonlinear model may be an indicator that these parameters are superfluous to fit the model to the specific dataset (Ratkowsky, 1990).However, these parameters did not affect the fitting quality and predictive capability of the models, as indicated by data in Table 2.
Although it is nearly equivalent to Eqs. 2 and 3, Eq. 4 had fitting quality indicators slightly better than the other two, with slightly lower mean squared error and slightly higher R 2 .Nonetheless, this model should be used with caution, especially when simple optimization methods are used, such as in Leão and Silva (2004).Equation 4is a more complex model from the nonlinear fitting point of view as the likelihood of dispersion of the parameter surface in the nonlinear regression search procedure is greater.In the case of complex, multiparameter and/or intrinsically nonlinear models, more robust search procedures are recommended.One of the best procedures for this purpose is the Levenberg-Marquardt procedure (Levenberg, 1944;Marquardt, 1963).The Levenberg-Marquardt is a robust search procedure that usually converges when the initial parameter value guess is within the expected range and when the model adequately describes the shape of the relationship be-function.A simple code in GNU R (R Development Core Team, version 3.0.1 "Good Sport") was developed to fit the equations and plot the LLWR.The fitting properties of the R code were compared to the SAS results.

Models comparison
Results from fitting the soil water retention curve functions described by Eqs. 2, 3 and 4 are presented in Table 2.The nonlinear regression procedure achieved convergence for all equations and all soils (p < 0.0001).The main points to be checked in a nonlinear regression fitting, in order of importance, are: 1. the convergence of the model, 2. the significance of the parameters, 3. the probability associated with the F value of the regression analysis of variance, 4. the mean square error, and 5. the F value.The coefficient of determination, although often regarded as the most important indicator of goodness of fit in regression procedures, is not a reliable indicator in nonlinear regression since the total sum of squares does not add to zero as it is the case in linear regression procedures (Seber and Wild, 1989).Therefore, the R 2 values presented in Tables 1 and 2 should be viewed with caution.These values are not derived from the nonlinear fitting procedure.They were calculated by linear regression of the observed versus predicted values for each model.
Overall, the value of the mean square error was very similar among the models under evaluation, with an average value of 0.00093 (cm 3 cm -3 ) 2 for Eqs. 2 and 3 tween dependent and independent variables (Leão et al., 2005).As with any nonlinear search procedure, when the Levenberg-Marquardt method is used, the user should check for the possibility of local minima in the procedure convergence (Seber and Wild, 1989).The fitting procedure for Eq. 4 should follow the recommendations to fit the van Genucthen (1980) water retention equation from which it derives.Equations 2 and 3 are intrinsically linear and therefore the fitting is somewhat trivial.
Analysis of soil resistance to penetration equations (Eqs.5 to 7) showed that Eqs. 5 and 7 had a very similar performance with mean values of the mean squared error of 0.31 and 0.34 MPa 2 , and F values of 457 and 410, respectively, and R 2 of 0.81 in both cases (Table 3).Equation 5was slightly better than Eq.7 when the mean squared error and F value are considered.Eq. 6 had an overall lower mean squared error and higher R 2 and F; however, it failed to converge in two conditions, namely CG and Are, indicating that this model can be problematic in some cases and should be used with caution.The increase in the number of empirical parameters is known to cause difficulties in the nonlinear search procedure and/or dispersion of the parameter surface in the search procedure (Seber and Wild, 1989).This can be minimized by changing the initial values of the parameters thereby avoiding dispersion of the search procedure in the first iterations of the process.In this particular case, the initial values of the parameters were the same for all soils and conditions to evaluate the models in similar conditions.If convergence is not achieved, even if the initial parameter values are altered, this can be an indicator that the equation does not conform to the data or that there is an excess of empirical parameters, and this overparameterization can be very problematic when nonlinear models are used (Ratkowsky, 1990).Therefore, the use of Eqs. 5 or 7 is recommended to fit soil penetration resistance data.

Code development in GNU R
Based on the model evaluation above, a code in GNU R environment was developed to fit and calculate LLWR.Previously, Leão and Silva (2004) and Leão et al. (2005) developed algorithms to calculate LLWR in commercial software programs, which are usually expensive and often unavailable to public research and education institutions in developing countries.The R system is a free software program developed under the GNU General Public License system, based on the S programming language and can be downloaded for free at http://www.R-project.org.A dataset in *.dat format was used in the analysis.The dataset, called llwr.dat for illustrative purposes, is formatted as shown in Table 4, where Site is the treatment or soil, Db is the soil bulk density (g cm -3 ), q is the volumetric water content (cm 3 cm -3 ), SR is soil resistance to penetration (MPa) and Y soil water potential (MPa).To import the file into the program, the following line of code can be used: llwr<-read.table("C:/llwr.txt",header=T, sep="\t") The options header and sep indicate that the first line of the file corresponds to the declaration of vari- ables names and that the data are separated by tabulation.The configurations for file format and separator can be easily changed according to the user's requirements.The analysis can be performed in each treatment at a time, in this case the "Are" treatment, referring to the loamy sand Ultisol was used (Table 1): attach(llwr) areia<-llwr[which(Site=="Are"),] detach(llwr) The commands attach and detach are used to make objects within dataframes accessible to R in fewer keystrokes.In simple terms, they assign or remove data from the program memory for immediate use.For adjusting the equations, the file created for the "Are" dataset, called areia, is assigned to the program memory.The soil water retention curve is then adjusted using the nls command.Equation 2 was used in the fitting procedure for simplicity.The initial values for parameters a, b and c are specified in the command start.The command summary provides the results of the nonlinear fitting procedure: attach(areia) areia.cra<-nls(q~exp(a+b*Db)*Y^c,start=list(a=-0.1,b=-0.1,c=-0.1))summary(areia.cra) The same procedure is repeated for the soil penetration resistance procedure.Equation 5 was used for simplicity: areia.crp<-nls(SR~d*Db^e*q^f,start=list(d=0.1,e=-0.1,f=0.1))summary(areia.crp) The results from the nonlinear procedure for the water retention curve equation are presented below: As discussed before, the values of initial parameters for the nonlinear regression can and should be altered according to the units of each variable to which it corresponds and in case the procedure does not converge.Lack of convergence indicates that the search mechanism did not find a reasonable result for the parameter set.Nonlinear equations where convergence is not met should not be used to predict or model data.The results show that convergence was achieved in 7 iterations and that the values are almost identical to those found in SAS software (Table 2).The values for the soil penetration resistance surface fitting were also very similar to those found in SAS: To generate values of the volumetric water contents at the critical limits of LLWR, the datasets containing the parameter values estimated by the two equations must be combined to the original data.This can be done by transforming the parameter vector into columns by using the command rbind on the fitting parameters data extracted using the coef command.The command merge combines the two datasets.The new dataset is then attributed to the program memory using the command attach: areia.crp.coef<-rbind(coef(areia.crp)) areia.cra.coef<-rbind(coef(areia.cra)) areia.coefs<-merge(areia.crp.coef,areia.cra.coef)areia.all<-merge(areia,areia.coefs)detach(areia) attach(areia.all) The critical value of the volumetric water content at the FC (qfc) is generated by replacing the water potential at the FC in Eq. 2, |0.01|MPa in this case.For the critical value of volumetric water content at the PWP (qpwp), the value of water potential of |1.5| MPa was used: areia.all$qfc<-exp(areia.all$a+areia.all$b*areia.all$Db)*0.01^areia.all$cThe critical volumetric water content where AFP is 10 % is calculated as: where: D p is the particle density (g cm -3 ) assumed as 2.65 g cm -3 : areia.all$qafp<-(1-(areia.all$Db/2.65)-0.1) The new variables created by the equations above are assigned to the dataset specifying the name of the dataset open, followed by the symbol "$" and the name of the new variable.Therefore, to generate the variable qafp, the command areia.all$qashould be specified.The same procedure can be used on the variables open for manipulation in the dataset, one example is the code fragment areia.all$Db.The critical volumetric water content where the SR is equal to 2 MPa (qsr) is calculated by replacing this value into Eq. 5 and solving for volumetric water content.The critical value can be easily changed to a value different from 2 MPa, according to the user's requirements.areia.all$qsr<-((2/(areia.all$d*areia.all$Db^areia.all$e))^(1/areia.all$f)) LLWR plot can be produced with the following code fragment: attach(areia.all)par(mar=c(4,4.5,2,1))par(oma=c(0,0,0,0)) plot(Db,qfc, col = "blue", type = "b", xlim = c(1.45,1.85),ylim=c(0.0,0.3),lwd=2,xlab=expression(paste("Soil Bulk Density, g cm"^"-3")), ylab=expression(paste("Volumetric Water Content, cm"^{3}, "cm"^{-3}))) lines(Db,qafp, col = "orange", type="b", lwd=2, pch=2) lines(Db,qsr, col = "green", type="b", lwd=2, pch=2) lines(Db,qpwp, col = "red", type="b", lwd=2, pch=2) legend("topleft", legend=c("FC", "PWP", "SR", "AFP"), cex=1, col=c("blue", "red", "green", "orange"), lty=1:3, lwd=2, bty="n") The command par is only necessary when there are size configuration problems in the window where the plot is shown.The commands in R should be executed line by line, the execution of the code as a whole is not recommended until the user becomes more proficient with the procedure and the software.By executing each step at a time, the user can check for problems and inconsistencies in the fitting and plotting procedures.Figure 1 shows the procedure results.The axis, fonts, lines, colors and other configurations can be easily modified according with the preferences of the user.
The D b value, where LLWR = 0, called the critical bulk density (D bc ), can also be calculated in R. A simple optimization procedure is used.The equation to be optimized is the equation of the superior limit minus the inferior LLWR limit.The equations that intercept at the critical limit can be easily identified by inspecting the LLWR plot (Figure 1).The equation should be specified with the values of the coefficients obtained from the nonlinear regression procedure.In the case of the areia dataset, D bc corresponds to the intersection of the lines of the volumetric water content at field capacity and the water content in which the soil penetration resistance is equal to 2 MPa.In R, the critical bulk density was specified as the Dbc function, where x is the value of the critical bulk density to be optimized.The command uniroot performs the optimization in the D b limits between 0.0 and 2.0 g cm -3 : Dbc<-function(x) exp(-0.98-1.39*x)*(0.01^-0.34)-((2/(0.000007*x^15))^(1/-1.84)) uniroot(Dbc, c(0,2)) As a result, the critical bulk density where LLWR = 0 is estimated: The last step is to add a line corresponding to Dbc into the graph.This can be specified by the command: abline(v=1.820318,col ="black")

Conclusions
All water retention equations performed well in fitting to the observed data.For the penetration resistance data, Eqs. 5 and 7 performed similarly well, while Eq. 6 failed to converge in two occasions.These equations can be combined to determine the least limiting water range.A simple GNU R application was also developed to quantify the least limiting water range.The results from the application were very similar to those found using a commercial software program widely used in statistical analyses.As R is a very flexible programming language, the code fragments presented in this study can be promptly modified to suit the needs of different datasets, models as well as for data manipulation.

Figure 1 −
Figure 1 − Least limiting water range plot generated using R software.

Table 1 −
Description of the soils used for testing Eqs. 2 to 7.

Table 2 −
Fitting results for water retention equations.Convergence status, yes or no and number of iterations to achieve convergence.F = F value from the nonlinear regression ANOVA; ns = not significant (p < 0.05).

Table 3 −
Fitting results for the penetration resistance equations.Convergence = Convergence status, yes or no and number of iterations to achieve convergence.F = F value from the nonlinear regression ANOVA; ns = not significant (p < 0.05).

Table 4 -
Model for the input datafile used in R software. LeãoEqs.