Nonlinear modeling of liming reaction and extractable base curves

HIGHLIGHTS: Statistically nonlinear models were proposed for pH as a function of lime doses. Statistically nonlinear models were proposed for exchangeable Ca+2 + Mg+2 as a function of lime doses. The use of a statistical index combined with constraints on extrapolation values enabled the se-lection of the best-fitting models. ABSTRACT Modeling the response of soils to liming is important for understanding neutralization reactions and predicting lime residual effects. Models based on simple or quadratic polynomial equations are the most used due to their simplicity and ease of fitting; however, they fail to reproduce a realistic soil response to liming, indicating a decrease in pH as the lime dose is increased after reaching a maximum point. Thus, several nonlinear functions were tested and compared to polynomial models, using a dataset from a liming test conducted on a sandy clay loam soil in a farm. The best-fitting models for pH data were the Mitscherlich, three-parameter logistic, and Morgan-Mercer-Flodin models. The best-fitting models for exchangeable Ca+2 + Mg+2 data were Skaggs et al., Gompertz, and Morgan- Mercer-Flodin. The use of the proposed T index, which ranks models based on their residual standard error and Akaike information criterion values, combined with constraints on extrapolation values, was useful for selecting models that are statistically robust and empirically coherent.


HIGHLIGHTS:
Statistically nonlinear models were proposed for pH as a function of lime doses.Statistically nonlinear models were proposed for exchangeable Ca +2 + Mg +2 as a function of lime doses.
The use of a statistical index combined with constraints on extrapolation values enabled the selection of the best-fitting models.

Introduction
The modeling of responses of soils to liming is important for understanding neutralization reactions and predicting lime residual effects.The use of liming is a wellestablished agricultural practice that has been extensively studied, however, many details still require clarification.The soil reaction time to different lime sources and rates and the short-term effects on crop yield still require detailed information.
Soil preparation for new agricultural fields in the Cerrado region of western Bahia, Brazil, usually involves the use of high lime rates, which exceed the current recommended rates for any textural soil group (Albuquerque Filho et al., 2011).The soil preparation process includes applying lime rates of 6 Mg ha -1 in the first year, followed by 2 Mg ha -1 in the second year plus 1 Mg ha -1 of gypsum, and finally 4 Mg ha -1 of lime and 1 Mg ha -1 of gypsum.
Fitting neutralization curves with mathematical models is used to study the kinetics of lime dissolution and estimate the unreacted lime.Models based on simple or quadratic polynomial equations are commonly used for this purpose due to their simplicity and ease of fitting.
However, these models fail to reproduce a realistic soil response to liming, as they indicate a decrease in pH as the lime rate is increased after reaching a maximum point.Thus, they cannot be used for extrapolation, not even for values slightly above the highest ones used for curve fitting.The pH levels should increase continuously as the lime rate is increased until they reach the theoretical limit defined by the plateau, which is determined by the limestone abrasion pH (just above eight); thus, equations from the growth function family, such as the monomolecular or Mitscherlich types, should be included to describe the lime reaction over time.
The aim of this work was to test several nonlinear functions and compare them to polynomial models currently used.

Material and Methods
The soils were sampled in agricultural plots of the Xanxere farm (13°47'01''S; 46°00' 13''W; altitude of 930 m) (Figure 1).The mean chemical properties of 20 soil samples collected from the experimental area are shown in Table 1.The farm is on the Sao Francisco plateau, characterized by a flat to gently undulating terrain with deep, well-drained, acidic, Figure 1.Map of the location of the experimental area and soil sampling points and low-fertility soils.The soils were classified as Oxisols or Psamments (Latossolos Vermelho-Amarelos, Latossolos Amarelos, or Neossolos Quartzarenicos; Santos et al., 2018) of sandy clay loam texture.The mean annual rainfall depth in the region is 1,300 mm, concentrated from November to March.The water deficit period ranges from April/May to September/October.The mean air relative humidity is 64%, decreasing to 34% from June to September.
A liming test was conducted in the Xanxere farm, where 50 × 200 m plots were subjected to liming using dolomitic limestone at rates ranging from zero to 20 Mg ha -1 (0, 5, 10, and 20 Mg ha -1 ).Four points were sampled in the middle of the plots, equally spaced in a straight line, in two periods (one and two and a half years after liming) for each of the evaluated rates.A composite soil sample was obtained by collecting three samples of the 0-20 cm and 20-40 cm layers for each sampling point, using an auger.Soybean crops were grown in the area for two consecutive years and soil samples were collected in the first and second crop seasons.
The soil samples were air-dried and sent for standard fertility analyses, according to the procedures of Teixeira et al. (2017).
The soil experimental data (pH in water and sum of Ca +2 and Mg +2 as a function of the applied lime rates) were fitted to a set of nonlinear models (Table 2) and statistically compared using the R software (R Core Team, 2022).The nonlinear models listed in Table 1 were selected from the reference literature and have been applied for various purposes (Lima & Silva, 2003).
The performance of the models was evaluated with a ranking system that uses the values of residual standard error (RSE) and Akaike information criterion (AIC), as well as an additional criterion based on the result of the extrapolation of the models outside the range of the data used for modeling.This proposed criterion compares the estimated values generated by the model for values above or below the range of the data used for its fit with the expected values based on soil chemical theory.This criterion was selected to avoid models that produce unrealistic results outside the range of the data used for model fitting, such as pH values above the maximum of the lime (8.3) or below the untreated soil pH, or declining pH values with increasing lime rates.This undesired trend is found in secondorder polynomial models fitted to neutralization curves.
The ranking index show the average position of each model for each of soil layer and sampling year in a classification list, according to Eq. 1, 2, and 3.The highest values indicate the best-fitting models.Index T m -final score index of the model m.
where: RSEm ij -score index of the model m by the RSE from year 1 to year i, and from soil layer 1 to layer j; N -number of models to be tested (15 models; and sum of x a , 120); I -number of sampling years to be tested (2 years); J -number of soil layers to be tested (2 layers); RSEm -position in the classification list of the model m (1 ton models in the list); and, 1 st = worst and 15 th = the best model, with the lowest RSE.
where: AICm ij -score index of the model m by the RSE from year 1 to year i, and from soil layer 1 to layer j; N -number of models to be tested (15 models; and sum of x a , 120); I -number of sampling years to be tested (2 years); j -number of soil layers to be tested (2 layers); AICm -position in the classification list of the model m (1 ton models in the list); and, 1 st = worst and 15 th = the best model, with the lowest AIC.

Results and Discussion
Soil pH as a function of lime rates presented the expected increases in both years evaluated, but with a decreasing effect (1) (2) (3) Rev. Bras.Eng.Agríc.Ambiental, v.27, n.10, p.820-827, 2023.2A and B), despite the increase in lime rates at double rates (from 5 to 10 to 20 Mg ha -1 ).The mean pH was 7 for the highest lime rate in the first year, and 6.6 in the year 2. The decrease in pH of the soil surface layer from the first to the second year is connected to the effects of the crop grown in the area, such as root acidification and nutrient extraction, among others.The flattening trend observed in the reaction curves over time, indicated by the measured pH levels, has been previously reported in other studies (Nogaroli & Fonseca, 2020;Cavalli et al. 2021).
The increase in pH of the soil subsurface layer from the first to the second year is due to vertical percolation and/or biological activity, decreasing the liming effects.Long-term experiments using high lime rates of up to 20 Mg ha -1 have reported a pH plateau in the range of 7 to 7.5 (Goulding, 2016).Although the decrease in soil surface pH over time was small, it may have been affected by soil spatial variability, sampling procedures, and irregular distribution of lime throughout the area.However, even small changes in pH can impact model predictions, which will be discussed later.Decreases in lime solubility as pH increases may reduce the effects of higher lime rates, as already found in previous studies.
Tables 3 and 4 present the statistical evaluation of model fitting to pH data, while Table 5 presents the evaluation for exchangeable Ca +2 + Mg +2 data.Some models performed poorly, with higher values of residual standard error (RSE) and Akaike and Bayesian information criteria (AIC and BIC).The Haverkamp & Parlange, Weilbull (three parameters), and Three-parameter log-logistic models did not adequately fit the data, mainly for values close to zero (Figures 3 and 4), resulting in lower ranks in the classification by the T index based on RSE and AIC (Tables 6 and 7).Therefore, these models were discarded.
The others models had mixed performance, with a narrow range of RSE and AIC values.Considering the 15 models, 10 of them [Gompertz; Lima & Silva; Mitscherlich; Four-parameter log-logistic; Morgan-Mercer-Flodin, Richards, Quadratic polynomial, Second-order polynomial exponential, Skaggs et al.; and Weilbull (4 parameters)] were among the three best models for each of the four sets of pH data, denoting the relative flexibility of these models to fit the data.Additionally, this shows the practical difficulty of choosing an appropriate function model based solely on empirical fit quality.
The three best-fitting models, based on T index classification, were the Four-parameter log-logistic, Quadratic polynomial, and Second-order polynomial exponential models.However, the latter two models failed to predict acceptable pH levels beyond the range of the model data, as they predicted a decrease in pH levels as lime rates increased, which is unrealistic (Tables 8 and 9).The four-parameter loglogistic model also failed to predict an acceptable pH level (9.3) for the 0-20 cm soil layer in the first year for a lime rate of 100 Mg ha -1 , which exceeds the theoretical limit (8.3) achievable by liming.Other models, such as Weilbull (3 parameters), Three-parameter log-logistic, Tangent hyperbolic, and Weilbull (4 parameters), overestimated pH levels for higher lime rates (Tables 8 and 9).Thus, according to the combined use of T index and constraints on extrapolation values, the best models to fit the pH data were the Mitscherlich, Threeparameter logistic, and Morgan-Mercer-Flodin models.
The statistical evaluation of model fitting to exchangeable Ca +2 + Mg +2 data in the second year (Table 5) resulted in the ranking presented in Table 7, in which the best models were the Second-order polynomial exponential, Gompertz, and Morgan-Mercer-Flodin models.Most of the models produced acceptable results within the range of fitting.However, as expected, the polynomial models predicted a decrease in pH levels beyond that range, which is not realistic or acceptable in real-world situations (Table 7 and Figures 5A and B).Although the theoretical upper limit for exchangeable Ca +2 + Mg +2 in soils may be high, such as in soils developed from limestones, stoichiometric calculations constrain the increase  in exchangeable Ca +2 + Mg +2 to be proportional to the applied rate; moreover, empirical evidence indicates that only a portion of it will be available for measurements due to different soil reactions.Therefore, models that predict Ca +2 + Mg +2 contents far beyond those usually found in soils are unrealistic and should be discarded.Furthermore, this is coherent with the expected pH response.The three-parameter logistic model and Tangent hyperbolic model predicted pH levels that were one to The results of the present study highlight two important but often not obvious points when fitting models.Firstly, the risk of extrapolating the model beyond the range of the data Table 9. Extrapolated estimated pH values by the models, in the second year, for lime doses beyond the evaluated doses  10.Extrapolated estimated values of exchangeable Ca +2 + Mg +2 by the models, in the second year, for lime doses beyond the evaluated doses without proper evaluation of the results (Table 10).Secondly, the practical usefulness of adding more criteria, both theoretical and empirical, to aid in selecting the best model.Purely statistical criteria may be misleading if used blindly, as model fitting is sensitive to the absolute values of the data, mainly when considering nonlinear data.Furthermore, the uncertainty of the data limits the reliability of the fitted models, no matter how accurate the fitting procedures are.

Conclusions
1.The best-fitting models for soil pH as a function of lime doses were the Mitscherlich, three-parameter logistic, and Morgan-Mercer-Flodin models.
2. The best-fitting models for exchangeable Ca +2 + Mg +2 as a function of lime doses were the Skaggs, Gompertz, and Morgan-Mercer-Flodin models.
3. The use of the proposed T index, which ranks models based on their residual standard error and Akaike information criterion values, combined with analysis of estimated values through extrapolation beyond the limits of the data used for fitting, was useful for selecting models that are statistically robust and empirically coherent.

Figure 2 .
Figure 2. Soil pH as a function of the applied lime doses, a) 0-20 cm soil layer, b) 20-40 cm soil layer, in two years of lime rates (Figures2A and B), despite the increase in lime rates at double rates (from 5 to 10 to 20 Mg ha -1 ).The mean pH was 7 for the highest lime rate in the first year, and 6.6 in the year 2. The decrease in pH of the soil surface layer from the first to the second year is connected to the effects of the crop grown in the area, such as root acidification and nutrient extraction, among others.The flattening trend observed in the reaction curves over time, indicated by the measured pH levels, has been previously reported in other studies(Nogaroli & Fonseca, 2020;Cavalli et al. 2021).The increase in pH of the soil subsurface layer from the first to the second year is due to vertical percolation and/or biological activity, decreasing the liming effects.Long-term experiments using high lime rates of up to 20 Mg ha -1 have reported a pH plateau in the range of 7 to 7.5(Goulding, 2016).Although the decrease in soil surface pH over time was small, it may have been affected by soil spatial variability, sampling procedures, and irregular distribution of lime throughout the area.However, even small changes in pH can impact model predictions, which will be discussed later.Decreases in lime solubility as pH increases may reduce the effects of higher lime rates, as already found in previous studies.Tables3 and 4present the statistical evaluation of model fitting to pH data, while Table5presents the evaluation for exchangeable Ca +2 + Mg +2 data.Some models performed poorly, with higher values of residual standard error (RSE) and Akaike and Bayesian information criteria (AIC and BIC).The Haverkamp & Parlange, Weilbull (three parameters), and Three-parameter log-logistic models did not adequately fit the data, mainly for values close to zero (Figures3 and 4), resulting in lower ranks in the classification by the T index based on RSE and AIC (Tables6 and 7).Therefore, these models were discarded.The others models had mixed performance, with a narrow range of RSE and AIC values.Considering the 15 models, 10 of them [Gompertz; Lima & Silva; Mitscherlich; Four-parameter log-logistic; Morgan-Mercer-Flodin, Richards, Quadratic polynomial, Second-order polynomial exponential, Skaggs et al.; and Weilbull (4 parameters)] were among the three best models for each of the four sets of pH data, denoting the relative flexibility of these models to fit the data.Additionally, this shows the practical difficulty of choosing an appropriate function model based solely on empirical fit quality.The three best-fitting models, based on T index classification, were the Four-parameter log-logistic, Quadratic

Table 4 .
Statistics for comparison of model fit to pH data in the second year (*)RSE = residual standard error; Convergence = whether the model converged after 50 iterations; AIC = Akaike information criterion; BIC = Bayesian information criterion; RDF = residual degrees of freedom Table 5. Statistics for comparison of model fit to exchangeable Ca +2 + Mg +2 data in the second year (*)RSE = residual standard error; Convergence = whether the model converged after 50 iterations; AIC = Akaike information criterion; BIC = Bayesian information criterion; RDF = residual degrees of freedom

Figure 3 .
Figure 3. Soil pH as a function of applied lime doses, with extrapolation beyond the evaluated doses (A) 0-20 cm soil layer, (B) 20-40 cm soil layer, in the first year

Figure 4 .
Figure 4. Model prediction extrapolated beyond the fitted data (pH data as a function of applied lime doses), (A) 0-20 cm soil layer, (B) 20-40 cm soil layer, in the second year Table6.Ranking of the models according to results of Tables3 and 4(model fit to pH data in both sampling years).The higher values of T index indicate the best-fitting models

Figure 5 .
Figure 5. Exchangeable Ca +2 + Mg +2 as a function of applied lime doses, with extrapolation beyond the evaluated lime doses (A) 0-20 cm soil layer, (B) 20-40 cm soil layer, in the second year

Table 1 .
Soil chemical attributes of 20 samples from the experimental area OM -Organic matter; ( * ) Means and standard deviation (in parentheses) ( * ) a, b, c, d, fitted parameters

Table 2 .
Nonlinear models applied in this work

Table 3 .
Statistics for comparison of model fit to pH data in the first year