Chanter model: nonlinear modeling of the fruit growth of cocoa

Modelo Chanter: modelagem não linear do crescimento de frutos do cacaueiro

Pollyane Vieira da Silva, Taciana Villela Savian

ABSTRACT:

The growth of plants and animals can be described by a growth curve, given by the equation of a nonlinear model such as the Logistic or the Gompertz model. The objective of this study was to fit the Chanter model, as well as the Logistic and Gompertz models, to a data set of cocoa (clone SIAL-105) fruits whose length and diameter were measured every 15 days from 30 to 180 days after pollination. The Chanter model is a hybrid between the Logistic and Gompertz models, and its parameters can be interpreted similarly. The quality of fit of the models was compared using the following statistical measures: Akaike’s information criterion (AIC), Akaike’s weights, the Bayesian information criterion (BIC), the residual standard deviation (RSD), the adjusted coefficient of determination (R²aj) and two measures of nonlinearity, Box’s bias and the Bates and Watts curvature. The Chanter model was the most suitable of the models studied for the cocoa data.

Key words:
nonlinear regression; Chanter model; nonlinearity; cocoa fruit measurements

RESUMO:

O crescimento de plantas e animais pode ser descrito por meio de uma curva. Essa curva é dada pela equação de um modelo não linear, como o modelo Logístico e o modelo Gompertz. O objetivo deste trabalho foi ajustar o modelo Chanter, assim como o Logístico e Gompertz, utilizando um conjunto de dados do fruto do cacaueiro do clone SIAL - 105, cujas medidas de comprimento e diâmetro foram avaliadas de 30 até 180 dias após a polinização, a cada 15 dias. O modelo Chanter é um híbrido entre o modelo Logístico e o modelo Gompertz cujos parâmetros podem ser interpretados similarmente. A avaliação da qualidade do ajuste entre os modelos foi feita utilizando as seguintes medidas estatísticas: o critério de informação de Akaike (AIC), o critério Peso de Akaike, o critério de informação de Bayes (BIC), o desvio padrão residual (DPR), o coeficiente de determinação ajustado (R²aj) e as medidas de não linearidade, vício de Box e curvatura de Bates e Watts. Verificou-se que o modelo Chanter dentre os modelos estudados neste trabalho é o mais adequado para o ajuste aos dados do fruto do cacaueiro.

Palavras-chave:
regressão não linear; modelo Chanter; não linearidade e medidas do cacau

INTRODUCTION:

The classical growth models, such as the Logistic model (NELDER, 1961) and the Gompertz model (LAIRD, 1965), have long been used to describe various biological processes in animals and plants. They can be fitted to data with relative ease using standard statistical software, and their parameters have considerable biological interpretability. A common feature of the classical growth models is that individual growth is always increasing and tends to an asymptotic value. It is usually reasonable to assume that, as time (or another independent variable) increases, the dependent variable approaches a nonzero constant.

RIBEIRO et al. (2018) fitted the Logistic and Gompertz models, among other nonlinear models, and judged the Gompertz model the most suitable to describe the diameter and mass of pequi fruit. FERNANDES et al. (2017), MUIANGA et al. (2016) and PRADO et al. (2013) also fitted the Logistic and Gompertz models to growth data of coffee fruit, cashew fruit and green dwarf coconut fruit, respectively. In the last two studies the Logistic model fitted the data better.

The equations of growth models are solutions of differential equations. The differential equation of the Chanter model contains one part of the differential equation of the Logistic model and another part of the differential equation of the Gompertz model. This feature indicates that the Chanter model is a hybrid between the Gompertz and Logistic models, whose parameters may be interpreted similarly (FRANCE; THORNLEY, 1984).

The objective of this research was to study the Chanter model, its parameters and its differential equation, to fit the Chanter model, as well as the Logistic and Gompertz models, to a set of cocoa fruit data, and to carry out residual analysis and diagnostics.

MATERIALS AND METHODS:

The data used in this research are the length (cm) and diameter (cm) of cocoa fruits. The fruits came from shaded cocoa trees of clone SIAL-105, arranged in rows in the Clonal Garden, Court E, of the Cocoa Research Center in Ilhéus, Bahia (BRITO; SILVA, 1983). Twenty-three trees were artificially pollinated with pollen taken from clone EEG-9 trees in the neighboring row of the germplasm series; the terrain is characterized by alfisol soils with higher clay content but without humus. Fruit collection began one month after pollination and was repeated every 15 days until 180 days after pollination. As the first fruits were small, 50 fruits were taken in the first collection. Later, with the increase in fruit size and the shortage of material, the number of fruits per collection decreased: 40 fruits in the second collection, 30 in each of the third and fourth, 20 in each of the fifth to eighth, 15 in each of the ninth and tenth, and 16 in the eleventh. After collection, the fruits were placed in plastic bags and immediately taken to the laboratory, where length and diameter were measured with a caliper.

The Logistic, Gompertz and Chanter nonlinear models were fitted to these data, with the following parameterizations:

Logistic model:

$$y_i = \frac{\beta_1}{1 + \exp(-\beta_2 - \beta_3 x_i)} + \varepsilon_i \tag{1}$$

Gompertz model:

$$y_i = \beta_1 \exp\left(-\exp\left(-\beta_2 (x_i - \beta_3)\right)\right) + \varepsilon_i \tag{2}$$

Chanter model:

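$$y_i = \frac{\beta_1 \beta_2}{\beta_1 + (\beta_2 - \beta_1)\exp\left(-\frac{\beta_3}{\beta_4}\left(1 - \exp(-\beta_4 x_i)\right)\right)} + \varepsilon_i \tag{3}$$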

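where $y_i$, $i = 1, 2, \ldots, n$, is the response (dependent) variable; $x_i$ is the independent variable (days after pollination); $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$ are the parameters of the models; and the random errors $\varepsilon_i$ are assumed independent and identically distributed, following a normal distribution with mean 0 and variance $\sigma^2$.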
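As a minimal sketch, the three mean functions can be written in Python as below. The function names are illustrative and not part of the original study, and the code assumes the parameterizations read as equations (1) to (3). The parameter values in the usage loop are the Figure 1 values, used only for illustration.

```python
import math


def logistic(x, b1, b2, b3):
    """Logistic mean function, equation (1)."""
    return b1 / (1.0 + math.exp(-b2 - b3 * x))


def gompertz(x, b1, b2, b3):
    """Gompertz mean function, equation (2)."""
    return b1 * math.exp(-math.exp(-b2 * (x - b3)))


def chanter(x, b1, b2, b3, b4):
    """Chanter mean function, equation (3); requires b4 != 0."""
    inner = -(b3 / b4) * (1.0 - math.exp(-b4 * x))
    return b1 * b2 / (b1 + (b2 - b1) * math.exp(inner))


if __name__ == "__main__":
    # Illustrative parameter values taken from the Figure 1 caption.
    for x in (0.0, 1.0, 2.0, 5.0, 10.0):
        print(x, chanter(x, 2.0, 3.0, 1.0, -0.5))
```

At x = 0 the Chanter function returns β1, and for β4 < 0 it approaches β2 as x grows, which matches the interpretation of these two parameters given in the text.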

In the Logistic and Gompertz models the parameter β1 represents the carrying capacity; it is the upper horizontal asymptote on the right, that is, the value at which fruit growth tends to stabilize (MISCHAN; PINHO, 2014). The inflection point, where fruit growth passes from an increasing to a decreasing rate, occurs at $\left(-\beta_2/\beta_3,\ \beta_1/2\right)$ for the Logistic model and at $\left(\beta_3,\ \beta_1/e\right)$ for the Gompertz model, where $e$ is the base of the natural logarithm.

The Chanter model was proposed by Dennis Osborne Chanter in 1976 and, unlike the models mentioned above, it has four parameters. These parameters satisfy the following conditions: $\beta_1, \beta_2, \beta_3 \in \mathbb{R}^*_+$, $\beta_4 \in \mathbb{R}^*$ and $\beta_2 > \beta_1$.

When the parameter β4 is positive, the Chanter model has the following horizontal asymptotes: on the right, $y = \dfrac{\beta_1\beta_2}{\beta_1 + (\beta_2 - \beta_1)\exp(-\beta_3/\beta_4)}$, and on the left, $y = 0$. When the parameter β4 is negative, the horizontal asymptotes are: on the right, $y = \beta_2$, and on the left, $y = \dfrac{\beta_1\beta_2}{\beta_1 + (\beta_2 - \beta_1)\exp(-\beta_3/\beta_4)}$. The parameter β1 is the intercept of the graph with the y-axis, that is, the value of y when x = 0.

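The x-coordinate of the inflection point is not unique, because the solution of the equation $d^2y(x)/dx^2 = 0$, with

$$\frac{d^2 y(x)}{dx^2} = \frac{2\beta_1\beta_2\beta_3^2(\beta_2-\beta_1)^2 \exp\left(-\frac{2\beta_3}{\beta_4} + \frac{2\beta_3}{\beta_4}\exp(-\beta_4 x) - 2\beta_4 x\right)}{\left[\beta_1 + (\beta_2-\beta_1)\exp\left(-\frac{\beta_3}{\beta_4} + \frac{\beta_3}{\beta_4}\exp(-\beta_4 x)\right)\right]^3} + \frac{\beta_1\beta_2\beta_3(\beta_2-\beta_1)\exp\left(-\frac{\beta_3}{\beta_4} + \frac{\beta_3}{\beta_4}\exp(-\beta_4 x) - \beta_4 x\right)\left(-\beta_3\exp(-\beta_4 x) - \beta_4\right)}{\left[\beta_1 + (\beta_2-\beta_1)\exp\left(-\frac{\beta_3}{\beta_4} + \frac{\beta_3}{\beta_4}\exp(-\beta_4 x)\right)\right]^2},$$

depends on an auxiliary variable $Z$ and is expressed by:

$$x = -\frac{1}{\beta_4}\ln\left(\frac{Z\beta_4 + \beta_3}{\beta_3}\right),$$

where $Z$ is a root of

$$-Z\beta_4\beta_2 e^{Z} + Z\beta_4\beta_1 e^{Z} + Z\beta_4\beta_1 - \beta_3\beta_2 e^{Z} + \beta_3\beta_1 e^{Z} + \beta_1\beta_3 + \beta_4\beta_1 + \beta_4\beta_2 e^{Z} - \beta_4\beta_1 e^{Z} = 0.$$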
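The implicit equation for Z can be solved numerically. The sketch below is not the authors' procedure: it assumes that Z is a root of the implicit equation obtained by setting the second derivative to zero, uses plain bisection as the root finder, and takes the Figure 1 values (β1 = 2, β2 = 3, β3 = 1, β4 = -0.5) as an example. The bracket [-1, 0] is an assumption that holds for these values and would need to be chosen again for other parameters.

```python
import math


def chanter(x, b1, b2, b3, b4):
    """Chanter mean function, equation (3)."""
    inner = -(b3 / b4) * (1.0 - math.exp(-b4 * x))
    return b1 * b2 / (b1 + (b2 - b1) * math.exp(inner))


def g(z, b1, b2, b3, b4):
    """Left-hand side of the implicit equation for Z (factored form)."""
    return (z * b4 * b1 + b3 * b1 + b4 * b1
            + (b2 - b1) * math.exp(z) * (-z * b4 - b3 + b4))


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection; f(lo) and f(hi) must have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def inflection_x(b1, b2, b3, b4, z_lo, z_hi):
    """Map the root Z back to x = -(1/b4) * ln((Z*b4 + b3) / b3)."""
    z = bisect(lambda t: g(t, b1, b2, b3, b4), z_lo, z_hi)
    return -math.log((z * b4 + b3) / b3) / b4


# Figure 1 parameter values; the bracket for Z is specific to them.
B = (2.0, 3.0, 1.0, -0.5)
x_star = inflection_x(*B, z_lo=-1.0, z_hi=0.0)
y_star = chanter(x_star, *B)
print(x_star, y_star)
```

A quick way to check the result is to confirm that a finite-difference second derivative of the curve changes sign around `x_star`.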

The graphs in Figure 1 show the Chanter function for different parameter values, all for the case β4 < 0. In graph (a) the parameter β1 sets the intercept, and in graph (b) the parameter β2 sets the upper horizontal asymptote on the right. In graph (c), for high values of β3 the curve is relatively more “closed”: as x increases, the curve approaches the asymptote more quickly. In graph (d), the lower the value of β4, the faster the curve approaches the asymptote as x increases.

Figure 1
Graphs of the Chanter function with (a) different values for β1, β2 = 3, β3 = 1 and β4 = -0.5; (b) different values for β2; (c) β1 = 2, β2 = 3, different values for β3 and β4 = -0.5; and (d) β1 = 2, β2 = 3, β3 = 2 and different values for β4.

When fitting the Logistic, Gompertz and Chanter models to the real data, the parameter estimates were obtained by the method of least squares. The most common nonlinear regression algorithm for parameter estimation, and the one used in this research, is the Gauss-Newton method, which is based on a linear approximation of the expectation function f(X, β) at each step. Initial values for fitting the Logistic and Gompertz models were chosen by linearizing the models themselves. The values were β1 = 11.66, β2 = -2.006, β3 = 0.028 for the Logistic model and β1 = 11.66, β2 = 0.022, β3 = 49.227 for the Gompertz model.
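The Gauss-Newton iteration can be sketched as follows for the Logistic model. This is an illustration under stated assumptions, not the software used in the study: the data are synthetic and noise-free, generated from hypothetical "true" parameters, the step-halving safeguard is an addition to the basic method, and all function names are illustrative. The starting point is the set of Logistic initial values quoted above.

```python
import math


def _e(x, b):
    # exp(-b2 - b3*x), with the exponent capped to avoid overflow in wild trial steps
    return math.exp(min(700.0, -b[1] - b[2] * x))


def logistic(x, b):
    """Logistic mean function of equation (1); b = (b1, b2, b3)."""
    return b[0] / (1.0 + _e(x, b))


def jacobian_row(x, b):
    """Partial derivatives of the mean function with respect to b1, b2, b3."""
    e = _e(x, b)
    d = 1.0 + e
    w = e / d / d
    return [1.0 / d, b[0] * w, b[0] * x * w]


def solve(a, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    sol = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = (m[r][n] - s) / m[r][r]
    return sol


def rss(xs, ys, b):
    """Residual sum of squares."""
    return sum((y - logistic(x, b)) ** 2 for x, y in zip(xs, ys))


def gauss_newton(xs, ys, b0, max_iter=100, tol=1e-12):
    b = list(b0)
    p = len(b)
    for _ in range(max_iter):
        res = [y - logistic(x, b) for x, y in zip(xs, ys)]
        jac = [jacobian_row(x, b) for x in xs]
        # Normal equations of the linearized problem: (J'J) step = J'r
        jtj = [[sum(r[i] * r[j] for r in jac) for j in range(p)] for i in range(p)]
        jtr = [sum(r[i] * e for r, e in zip(jac, res)) for i in range(p)]
        step = solve(jtj, jtr)
        current, scale = rss(xs, ys, b), 1.0
        while scale > 1e-10:  # step halving: shrink until RSS does not increase
            trial = [bi + scale * si for bi, si in zip(b, step)]
            if rss(xs, ys, trial) <= current:
                break
            scale *= 0.5
        converged = max(abs(u - v) for u, v in zip(trial, b)) < tol
        b = trial
        if converged:
            break
    return b


# Synthetic, noise-free data from hypothetical parameters (not the cocoa data).
TRUE_B = (11.0, -2.2, 0.03)
XS = list(range(30, 181, 15))
YS = [logistic(x, TRUE_B) for x in XS]
fit = gauss_newton(XS, YS, (11.66, -2.006, 0.028))
print(fit)
```

Because the synthetic data contain no noise, the iteration should recover the generating parameters; with real measurements the result would instead be the least-squares estimates.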

The Chanter model cannot be linearized, so the initial values for the algorithm were chosen as follows. β1 = 2, because the average fruit length in the first collection, one month after pollination, was 2 cm. β2 = 11.66, because this parameter represents the asymptote when β4 < 0, like the parameter β1 of the Logistic and Gompertz models. β3 = 0.02, because it represents a ratio, in the same way that the parameter β3 of the Logistic model is the ratio between the amount missing for the curve to reach the asymptote β1 and the initial value y0, and the parameter β2 of the Gompertz model is the ratio between the asymptote β1 and the amount missing for the curve to reach the asymptote. For β4, a grid of negative values was investigated, and the value with the lowest residual sum of squares (RSS), β4 = -0.11, was used as the initial value.

The Shapiro-Wilk test was applied to the residuals of the nonlinear models studied in this research to verify their normality. As the p-values of the test were higher than the 5% significance level, the hypothesis of residual normality was not rejected.

The quality of fit of the three models was evaluated through the following statistical measures: the corrected Akaike’s information criterion (AICc), the Bayesian information criterion (BIC), the residual standard deviation (RSD), the adjusted coefficient of determination (R²aj), Akaike’s weights, and the nonlinearity measures Box’s bias and Bates and Watts curvature.

Akaike’s information criterion compares the quality of fit of two or more models. AKAIKE (1974) defined his criterion as:

$$\mathrm{AIC} = -2\ln(L) + 2p \tag{4}$$

where L is the maximum of the likelihood function and p is the number of fitted parameters.

BOZDOGAN (1987) proposed the following correction for AIC:

$$\mathrm{AICc} = -2\ln(L) + 2p + \frac{2p(p+1)}{n-p-1} \tag{5}$$

where L is the maximum of the likelihood function, p is the number of fitted parameters and n is the number of observations, or equivalently, the sample size. The model with the lowest AIC value is considered the best-fitting model; the same holds for the AICc.

One way to quantify the chance that a model is correct is through the criterion called Akaike’s weights (MOTULSKY; CHRISTOPOULOS, 2003). The calculation requires the AICc values.

The chance of a model being correct is calculated by the following equation, where Δ is the difference between the AICc values:

$$\text{Akaike's weight} = \frac{e^{-0.5\Delta}}{1 + e^{-0.5\Delta}} \tag{6}$$

The Bayesian information criterion (BIC), proposed by SCHWARZ (1978), is used to maximize the probability of choosing the true model. The criterion is given by:

$$\mathrm{BIC} = -2\ln(L) + p\ln(n) \tag{7}$$

where L is the maximum of the likelihood function, p is the number of fitted parameters and n is the number of observations. The model with the lowest BIC value is considered the best-fitting model.
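Equations (4) to (7) can be sketched in Python as below. Two assumptions are made that the text does not state explicitly: the likelihood is evaluated for independent normal errors with the variance estimated by RSS/n, and Δ in equation (6) is taken as the AICc of the model under evaluation minus the AICc of the reference model. The function names and the RSS values in the usage example are hypothetical, not results from the study.

```python
import math


def neg2_loglik(rss, n):
    """-2 ln(L) for i.i.d. normal errors, with sigma^2 estimated by rss / n."""
    return n * (math.log(2.0 * math.pi * rss / n) + 1.0)


def aic(rss, n, p):
    """Equation (4)."""
    return neg2_loglik(rss, n) + 2.0 * p


def aicc(rss, n, p):
    """Equation (5); requires n > p + 1."""
    return aic(rss, n, p) + 2.0 * p * (p + 1) / (n - p - 1)


def bic(rss, n, p):
    """Equation (7)."""
    return neg2_loglik(rss, n) + p * math.log(n)


def akaike_weight(aicc_model, aicc_reference):
    """Equation (6) with delta = AICc(model) - AICc(reference)."""
    delta = aicc_model - aicc_reference
    w = math.exp(-0.5 * delta)
    return w / (1.0 + w)


# Hypothetical RSS values for a 4-parameter and a 3-parameter model, n = 11.
n_obs = 11
aicc_four = aicc(rss=0.40, n=n_obs, p=4)
aicc_three = aicc(rss=1.10, n=n_obs, p=3)
weight_three = akaike_weight(aicc_three, aicc_four)
print(aicc_four, aicc_three, weight_three)
```

With this convention the two weights of a pair of models sum to one, so a weight close to zero for one model means strong support for the other.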

The residual standard deviation (RSD) quantifies the average size of the residuals and is calculated from the sum of squared residuals (SSR), the number of fitted parameters p and the number of observations n, through the following equation:

$$\mathrm{RSD} = \sqrt{\frac{\mathrm{SSR}}{n-p}} \tag{8}$$

The adjusted coefficient of determination (R²aj) is obtained by:

$$R^2_{aj} = 1 - \frac{(1 - R^2)(n-1)}{n-p} \tag{9}$$

where R² is the unadjusted coefficient of determination, n is the number of times at which measurements were taken and p is the number of model parameters.
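Equations (8) and (9) translate directly into code. The short sketch below uses illustrative function names and made-up input values, chosen only so that the arithmetic is easy to follow.

```python
import math


def rsd(ssr, n, p):
    """Residual standard deviation, equation (8)."""
    return math.sqrt(ssr / (n - p))


def r2_adjusted(r2, n, p):
    """Adjusted coefficient of determination, equation (9)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p)


# Made-up inputs: 11 measurement times and a 4-parameter model.
print(rsd(7.0, 11, 4))
print(r2_adjusted(0.93, 11, 4))
```

Both measures penalize the number of parameters through the divisor n - p, so with only 11 measurement times the fourth parameter of the Chanter model has a visible cost.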

The measures of nonlinearity in nonlinear regression are related to the method of error minimization. The closer f(X, β) is to linear, the better the inferential results associated with the nonlinear model. The nonlinearity measure Box’s bias must take the smallest possible values for the function to be considered close to linear.

According to BOX (1971), when the bias values are higher than 1%, the model is considered to have nonlinear behavior. Therefore, the smaller the difference between the linear and quadratic approximations, the lower the percentage of bias. The Bates and Watts curvature measure indicates how far a nonlinear model is from a linear model. The intrinsic curvature is inherent to the solution locus and does not depend on the particular parameterization. The curvature caused by the effect of the model parameters corresponds to a particular direction of the parametric space (RATKOWSKY, 1983).

RESULTS AND DISCUSSION:

The term hybrid refers to the differential equations of each of the models mentioned above: from the differential equations of the Logistic and Gompertz models we obtain a relationship with the differential equation of the Chanter model.

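Indeed, the Logistic model comes from the following differential equation,

$$\frac{dy(x)}{dx} = \beta_3\, y(x)\left(1 - \frac{y(x)}{\beta_1}\right) \tag{10}$$

and the Gompertz model comes from the differential equation

$$\frac{dy(x)}{dx} = \beta_2\, y(x)\exp\left(-\beta_2 (x - \beta_3)\right). \tag{11}$$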

The Chanter model, in turn, is the solution of the following system (I), in the case β4 > 0:

$$\begin{cases} \dfrac{dy(x)}{dx} = \beta_3\, y(x)\left(1 - \dfrac{y(x)}{\beta_2}\right)\exp(-\beta_4 x) \\[2ex] \lim\limits_{x \to +\infty} y(x) = \dfrac{\beta_1\beta_2}{\beta_1 + (\beta_2 - \beta_1)\exp(-\beta_3/\beta_4)} \\[2ex] y(0) = \beta_1 \end{cases}$$

The differential equation means that the growth rate $dy(x)/dx$ is proportional to y(x), to the amount missing to reach the parameter β2 (as in the Logistic model) and to a negative exponential function of x (as in the Gompertz model).

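The expression of the Chanter model is obtained from the differential equation by separation of variables:

$$\frac{dy(x)}{dx} = \frac{\beta_3}{\beta_2}\, y(x)\left(\beta_2 - y(x)\right)\exp(-\beta_4 x) \;\Longrightarrow\; \frac{dy(x)}{y(x)\left(\beta_2 - y(x)\right)} = \frac{\beta_3}{\beta_2}\exp(-\beta_4 x)\,dx \tag{12}$$

Note that there are $A, B \in \mathbb{R}$ such that

$$\frac{1}{y(x)\left(\beta_2 - y(x)\right)} = \frac{A}{y(x)} + \frac{B}{\beta_2 - y(x)} = \frac{A\beta_2 + (B - A)\,y(x)}{y(x)\left(\beta_2 - y(x)\right)} \tag{13}$$

If $A\beta_2 = 1$ and $(B - A) = 0$, then $A = B = 1/\beta_2$. Hence,

$$\frac{1}{\beta_2\, y(x)}\,dy + \frac{1}{\beta_2\left(\beta_2 - y(x)\right)}\,dy = \frac{\beta_3}{\beta_2}\exp(-\beta_4 x)\,dx \;\Longrightarrow\; \frac{1}{y(x)}\,dy + \frac{1}{\beta_2 - y(x)}\,dy = \beta_3\exp(-\beta_4 x)\,dx. \tag{14}$$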

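Integrating, we obtain

$$\ln\left(\frac{y(x)}{\beta_2 - y(x)}\right) = -\frac{\beta_3}{\beta_4}\exp(-\beta_4 x) + K$$

$$\frac{y(x)}{\beta_2 - y(x)} = \exp\left(-\frac{\beta_3}{\beta_4}\exp(-\beta_4 x) + K\right) = C\exp\left(-\frac{\beta_3}{\beta_4}\exp(-\beta_4 x)\right), \quad C = e^{K}$$

$$y(x) = \frac{C\beta_2}{\exp\left(\frac{\beta_3}{\beta_4}\exp(-\beta_4 x)\right) + C} \tag{15}$$

Then

$$\lim_{x \to +\infty} y(x) = \lim_{x \to +\infty} \frac{C\beta_2}{\exp\left(\frac{\beta_3}{\beta_4}\exp(-\beta_4 x)\right) + C} = \frac{\beta_1\beta_2}{\beta_1 + (\beta_2 - \beta_1)\exp(-\beta_3/\beta_4)}.$$

But $\beta_4 > 0$, so

$$\lim_{x \to +\infty} \frac{C\beta_2}{\exp\left(\frac{\beta_3}{\beta_4}\exp(-\beta_4 x)\right) + C} = \frac{C\beta_2}{1 + C}.$$

Thus,

$$C = \frac{\beta_1}{\exp(-\beta_3/\beta_4)\,(\beta_2 - \beta_1)}.$$

Therefore, substituting into (15), we have:

$$y(x) = \frac{\beta_1\beta_2}{\beta_1 + (\beta_2 - \beta_1)\exp\left(-\frac{\beta_3}{\beta_4}\left(1 - \exp(-\beta_4 x)\right)\right)}. \tag{16}$$

For x = 0 we have $y(0) = \beta_1$, and hence (16) is a solution of system (I).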

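For the case β4 < 0, (16) is also a solution of system (II):

$$\begin{cases} \dfrac{dy(x)}{dx} = \beta_3\, y(x)\left(1 - \dfrac{y(x)}{\beta_2}\right)\exp(-\beta_4 x) \\[2ex] \lim\limits_{x \to -\infty} y(x) = \dfrac{\beta_1\beta_2}{\beta_1 + (\beta_2 - \beta_1)\exp(-\beta_3/\beta_4)} \\[2ex] y(0) = \beta_1 \end{cases}$$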
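The derivation can be checked numerically. The sketch below compares a central finite-difference derivative of (16) with the right-hand side of the differential equation and evaluates the limit conditions far from the origin. The parameter sets are the Figure 1 values with β4 taken once negative and once positive; the function names, step size and evaluation points are illustrative choices.

```python
import math


def chanter(x, b1, b2, b3, b4):
    """Closed-form solution, equation (16)."""
    inner = -(b3 / b4) * (1.0 - math.exp(-b4 * x))
    return b1 * b2 / (b1 + (b2 - b1) * math.exp(inner))


def ode_rhs(x, y, b2, b3, b4):
    """Right-hand side of the Chanter differential equation."""
    return b3 * y * (1.0 - y / b2) * math.exp(-b4 * x)


def central_diff(f, x, h=1e-5):
    """Second-order central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def max_ode_residual(params, xs):
    """Largest |y'(x) - rhs(x, y(x))| over the evaluation points xs."""
    b1, b2, b3, b4 = params

    def f(t):
        return chanter(t, b1, b2, b3, b4)

    return max(abs(central_diff(f, x) - ode_rhs(x, f(x), b2, b3, b4)) for x in xs)


def limit_value(b1, b2, b3, b4):
    """Limit that appears in systems (I) and (II)."""
    return b1 * b2 / (b1 + (b2 - b1) * math.exp(-b3 / b4))


POINTS = [0.0, 0.5, 1.0, 2.0, 3.0]
SYSTEM_I = (2.0, 3.0, 1.0, 0.5)    # beta4 > 0
SYSTEM_II = (2.0, 3.0, 1.0, -0.5)  # beta4 < 0

print(max_ode_residual(SYSTEM_I, POINTS), max_ode_residual(SYSTEM_II, POINTS))
print(chanter(200.0, *SYSTEM_I), limit_value(*SYSTEM_I))
print(chanter(-200.0, *SYSTEM_II), limit_value(*SYSTEM_II))
```

Residuals near zero for both parameter sets are consistent with (16) solving the same differential equation in both cases, with only the side on which the limit condition applies changing with the sign of β4.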

Figure 2 shows the scatter plot of the mean data over the 180 days of the experiment and the fitted curves of the Logistic, Gompertz and Chanter models for the length and diameter of the cocoa fruit. Table 1 shows the estimates of β1, β2, β3 and β4 and the p-values of the Shapiro-Wilk test for the Logistic, Gompertz and Chanter models, for both length and diameter. To compare the fit of the three models and identify the most suitable one for describing the phenomenon, since all of them fit the data well, the values of the criteria were obtained, some of which are shown in Table 1. Based on the AICc, BIC and RSD values in Table 1, the most appropriate model for these data is the Chanter model, because its values for these criteria were lower than those of the Logistic and Gompertz models. The coefficient of determination was also higher for the Chanter model, for both the length and the diameter data, as shown in Table 1.

Figure 2
Chanter, Logistic and Gompertz models fitted to the mean length and diameter data of cocoa fruit over time.

Table 1
Estimates (standard errors) of the parameters, p-values of the Shapiro-Wilk test and values of the fit quality evaluators for the Chanter, Logistic and Gompertz models.

To calculate Akaike’s weights, it is necessary to obtain the differences between the AICc values of the Chanter, Logistic and Gompertz models. For the length data, the Logistic model has a probability of 0.0292 of being correct compared with the Chanter model, and the Gompertz model has a probability of 0.0011 of being correct compared with the Chanter model. For the diameter data, the corresponding probabilities are 0.0077 for the Logistic model and 0.0001 for the Gompertz model.

The Bates and Watts curvature measure gave very small intrinsic curvature values, below 0.3, for the three models under study, indicating that the nonlinearity is mainly due to the parameter effect. The lowest intrinsic curvature was that of the Logistic model, that is, the Logistic model behaved closest to linear when compared with the Gompertz and Chanter models. The Chanter model stands out in the curvature due to the parameter effect, with values of 7.21 and 7.12 for the length and diameter data, respectively. For the other models the values did not exceed 1.4.

The nonlinearity measure Box’s bias showed the importance of evaluating the biases, since it indicates which model parameters are most responsible for the departure from linear behavior. The bias values indicated that the parameter β2 of the Gompertz model and the parameters β3 and β4 of the Chanter model are responsible for the departure from linear behavior in both the length and the diameter data, with bias higher than 1%. The Logistic model had no parameter with bias greater than 1%. In the Gompertz and Chanter models these parameters (β2, and β3 and β4, respectively) appear in negative exponential terms that are functions of time, and the result obtained with Box’s bias reinforces the idea that they are responsible for the flexible behavior of these models.

Based on the values of the fit quality evaluators shown above, there is evidence that the Chanter model is the best model for the data studied in this research. The prediction equation of the Chanter model for cocoa fruit length is:

$$\hat{y}_i = \frac{16.46}{1.56 + 8.99\exp\left[0.5\left(1 - \exp(0.02\,x_i)\right)\right]} \tag{17}$$

The prediction equation of the Chanter model for cocoa fruit diameter is:

$$\hat{y}_i = \frac{4.83}{0.57 + 7.9\exp\left(1 - \exp(0.01\,x_i)\right)} \tag{18}$$

For the Chanter model, the estimates of the parameter β2 were approximately 10.55 cm for length and 8.47 cm for diameter; the fruit reaches maximum growth in length at 155 days and in diameter at 243 days. After 155 days the fruit is therefore expected to have reached its maximum length, but not yet its maximum diameter. This information is relevant for choosing the harvest time, which is a major factor in fruit quality.

The change in the concavity of the curve, that is, the point at which the growth rate of the fruit stops increasing and starts decreasing, occurs at approximately (83; 7) for the length and (100; 5) for the diameter.

CONCLUSION:

The parameters β1 and β2 have a practical interpretation: β1, the intercept, indicates the fruit size on day zero, and β2, the upper right asymptote, indicates the asymptotic maximum size of the fruit. Studies using the Chanter model to explore the practical interpretation of the other two parameters would therefore be relevant in future research.

The Logistic, Gompertz and Chanter models all proved adequate to describe the length and diameter of the cocoa fruit over time; however, the Chanter model was the best. The Chanter model has greater flexibility in fitting the data, which is due to several factors, one being that it has one more parameter than the Logistic and Gompertz models.

The estimates of the parameter β2, which represents the asymptotic size of the fruit, were approximately 10.55 cm for length and 8.47 cm for diameter, with the fruit reaching maximum growth at 155 days and 243 days, respectively. The parameter β1 represents the fruit size at day zero, that is, 30 days before the first collection: the fruit length was 1.56 cm and the fruit diameter was 0.57 cm. The growth rate of the fruit increases up to the 83rd day for the length and up to the 100th day for the diameter. The other two parameters have no immediate practical interpretation.

ACKNOWLEDGEMENTS

We thank the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for the financial support for this research.

REFERENCES

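AKAIKE, H. A new look at the statistical model identification. IEEE Transactions on Automatic Control, New York, v.19, p.716-723, 1974. Available from: <http://bayes.acs.unt.edu:8083/BayesContent/class/Jon/MiscDocs/Akaike_1974.pdf>. Accessed: Nov. 12, 2017.

BOX, M. L. Bias in nonlinear estimation. Journal of the Royal Statistical Society. Series B (Methodological), London, v.33, n.2, p.171-201, 1971. Available from: <https://doi.org/10.1111/j.2517-6161.1971.tb00871.x>. Accessed: Dec. 15, 2017. doi: 10.1111/j.2517-6161.1971.tb00871.x.

BOZDOGAN, H. Model selection and Akaike’s information criterion (AIC): The general theory and its analytical extensions. Psychometrika, 1987.

BRITO, I. C.; SILVA, C. P. Medidas biométricas do fruto do cacaueiro durante seu desenvolvimento. Sitientibus, 1983. Available from: <http://www2.uefs.br/sitientibus/pdf/3/medidas_biometricas_do_fruto_do_cacaueiro.pdf>. Accessed: Jan. 15, 2018.

FERNANDES, T. J. et al. Double sigmoidal models describing the growth of coffee berries. Ciência Rural, 2017. Available from: <http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0103-84782017000800401>. Accessed: Nov. 22, 2017. doi: 10.1590/0103-8478cr20160646.

FRANCE, J.; THORNLEY, J. H. M. Mathematical models in agriculture. London: Butterworths, 1984. p.335.

LAIRD, A. K. Dynamics of relative growth. Growth, 1965. Available from: <https://www.cabdirect.org/cabdirect/abstract/19661402865>. Accessed: Feb. 11, 2018.

MISCHAN, M. M.; PINHO, S. Z. Modelos não lineares. 1. ed. São Paulo: Cultura Acadêmica, 2014.

MOTULSKY, H.; CHRISTOPOULOS, A. Fitting models to biological data using linear and nonlinear regression: a practical guide to curve fitting. 4. ed. San Diego, CA: GraphPad Software, 2003.

MUIANGA, C. A. et al. Description of the growth curve of cashew fruits in nonlinear models. Revista Brasileira de Fruticultura, 2016. Available from: <http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0100-29452016000100022>. Accessed: Nov. 12, 2017. doi: 10.1590/0100-2945-295/14.

NELDER, J. The fitting of a generalization of the Logistic curve. Biometrics, 1961.

PRADO, T. K. L. et al. The fit Gompertz and Logístic models to the growth data of green dwarf coconut fruits. Ciência Rural, 2013. Available from: <http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0103-84782013000500008>. Accessed: Dec. 12, 2017. doi: 10.1590/S0103-84782013005000044.

RATKOWSKY, D. A. Nonlinear Regression Modeling. New York: Dekker, 1983.

RIBEIRO, T. D. et al. Description of the growth of pequi fruits by nonlinear models. Revista Brasileira de Fruticultura, 2018. Available from: <http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0100-29452018000400705>. Accessed: Jan. 5, 2017. doi: 10.1590/0100-29452018949.

SCHWARZ, G. Estimating the dimension of a model. The Annals of Statistics, 1978. Available from: <https://projecteuclid.org/download/pdf_1/euclid.aos/1176344136>. Accessed: Nov. 28, 2017. doi: 10.1214/aos/1176344136.

CR-2019-0409.R1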

Publication Dates

  • Publication in this collection
    21 Oct 2019
  • Date of issue
    2019

History

  • Received
    29 May 2019
  • Accepted
    12 Aug 2019
  • Reviewed
    27 Sept 2019
Universidade Federal de Santa Maria, Centro de Ciências Rurais, 97105-900 Santa Maria, RS, Brazil. Tel.: +55 55 3220-8698, Fax: +55 55 3220-8695
E-mail: cienciarural@mail.ufsm.br