Evaluation of the critical points of the most adequate nonlinear model in adjusting growth data of ‘Green Dwarf’ coconut fruits

‘Green Dwarf’ coconut is a fruit of great economic interest, since all of its components are used, in addition to the water, its main component. It is a crop of the humid tropics, widely produced in northeastern Brazil, where it is an important source of income. The study of the phenology of this fruit is extremely important, but few studies are available in the literature. Regression models, especially nonlinear growth models, can be of great value to understand how fruit growth behaves. The scarcity of works of this nature may be linked to difficulties in estimating the parameters of nonlinear models, such as assigning initial values to the iterative process. Beyond this difficulty, in any regression analysis, linear or not, several steps need to be followed to ensure the validity of the results. Much information can be extracted from nonlinear growth models, such as the asymptotic value, the growth rate and the critical points (maximum acceleration point, inflection point, maximum deceleration point and asymptotic deceleration point). The aim of this work was to describe the stages of nonlinear regression analysis and to estimate the critical points of ‘Green Dwarf’ coconut growth curves. After the initial fits, the only unmet assumption was independence of the residuals, so a first-order autoregressive term was added. The models were then refitted and all parameters were significant. Both models, Gompertz and Logistic, fitted the data well, with a slight advantage for the Logistic model, which had better values of the fit quality criteria. Its maximum expected values for the longitudinal external diameter of harvested fruits (LED) and of fruits kept on plants (LEDKP) were 21.4037 cm and 21.5478 cm, respectively. The abscissas and ordinates of the critical points were estimated, and these values can help producers make more objective decisions about the appropriate time to harvest coconut fruits, considering the diverse uses of this type of fruit.


Introduction
Coconut tree (Cocos nucifera L.) is a plant of the Arecaceae family that reaches up to ten meters in height. It is a tree of great economic interest, since all its components are used: fruit, leaves, pulp, water, bark and fibers. The main economic activity involving this species is the production of green coconut, mainly for water (SILVA et al., 2017). It is a crop of the humid tropics; however, production is largely concentrated in semi-arid regions of northeastern Brazil, with high temperatures and little rain for most of the year (CÂMARA et al., 2019). Phenology is important in coconut farming for fruit development and production, since it allows the viability of fruits to be tested over the years, in different seasons, soil types and climates. However, despite the relevance of such results, studies on the phenology of coconut crops in northeastern Brazil are scarce and generic (ARAÚJO et al., 2011; CÂMARA et al., 2019). Growth models can therefore be of great value to understand the phenology of this crop, since studying growth helps producers decide how best to manage each growth phase. In practice, however, many authors from the most varied areas who work with this type of data and fit regression models do not take into account the stages of statistical modeling, which are essential to ensure the validity of inferences. Regression analysis originated in the 19th century with Francis Galton; it is a technique for inferring the functional relationship between a dependent variable, generally designated "Y", and one or more independent variables, designated "X".
When animal or plant growth data are analyzed, the functional relationship is usually described by a sigmoid curve and is better fitted by nonlinear models, more specifically nonlinear growth models (RIBEIRO et al., 2018b). These have advantages over linear models, as they are more parsimonious, that is, they satisfactorily describe the relationship between variables with a smaller number of parameters, and their parameters have biological interpretation (MISCHAN; PINHO, 2014).
For linear or nonlinear regression models to be used correctly, one must follow the modeling steps: choice of the models to be fitted, residual analysis, significance of the parameters and adequacy of the fit. The first step is the choice of candidate models, that is, through descriptive analysis, the researcher identifies which models may be more suitable to describe the relationship under study (SARI et al., 2019a). After this initial choice, the models are fitted by estimating their parameters from the experimental data. In nonlinear models there is an additional difficulty, as such estimates cannot be obtained in closed form and must be approximated by numerical methods, which require good initial values to start the iterative process (MISCHAN; PINHO, 2014; ARCHONTOULIS; MIGUEZ, 2015; FRÜHAUF et al., 2020).
Once the models are fitted, the residual vector must be checked to guarantee the validity of the hypothesis tests applied in the study. Several hypothesis tests are available in the literature to verify the assumptions on the residual vector, which can also be checked by graphical analysis (RIBEIRO et al., 2018b; SILVA et al., 2020). For models to be considered fit to describe the functional relationship under study, in addition to the residual analysis, it is essential to verify the significance of the model parameters. If any parameter is nonsignificant, another model must be fitted (RENCHER; SCHAALJE, 2008).
Once the residual assumptions are met and the parameters are significant, one more inspection is important: verifying the adequacy of the models, which can be done through the adjusted determination coefficient (R²aj) and the parametric nonlinearity measure (BATES; WATTS, 1980). Models are considered adequate when R²aj values are close to 1 and parametric nonlinearity measures are below 1 (SARI et al., 2019b).
With all these steps completed, the various tested models can be reduced to a single model, said to be the "most adequate" (ARCHONTOULIS; MIGUEZ, 2015). To select the "most adequate" model, criteria that assess the quality of fit are used, as in several growth studies (RIBEIRO et al., 2018a; SILVA et al., 2020): the Akaike information criterion (AIC) (AKAIKE, 1974), the Bayesian information criterion (BIC) (SCHWARZ, 1978), the adjusted determination coefficient (R²aj), the residual standard deviation (RSD) and the intrinsic (Cl) and parametric (Cq) nonlinearity measures.
Curves that represent growth models do not have extreme, maximum or minimum points, but some points are considered important from the physiological point of view, each having a specific meaning. The study of these points is performed through the derivatives of equations in relation to time (MISCHAN; PINHO, 2014).
Critical points have been satisfactorily studied in the growth of different types of fruits (SARI et al., 2019a; SARI et al., 2019b; DIEL et al., 2020; SILVA et al., 2020). In this context, the aim of this work was to describe the steps in fitting the Gompertz and Logistic models to growth data of the longitudinal diameter of 'Green Dwarf' coconut fruits and to estimate the critical points of the models.

Material and methods
Data were obtained from Benassi (2007). The fieldwork was carried out in a seven-year-old 'Green Dwarf' coconut (Cocos nucifera L.) orchard in the city of Bebedouro, state of São Paulo, with spacing of seven meters between rows and six meters between plants in a triangular arrangement. Eight plants were evaluated, with three fruits measured on each, totaling 24 fruits, and the average was used to fit the models. On these fruits, the longitudinal external diameter of harvested fruits (LED) and of fruits kept on plants (LEDKP) was measured with a caliper until 120 days of age, in centimeters (cm). The first measurement was performed 1 day after inflorescence opening (DAIO) and the last at 375 DAIO, at 15-day intervals, totaling 26 measurements over time.
Logistic (1) and Gompertz (2) nonlinear regression models were adjusted and the parameterization chosen was based on the work of Fernandes et al. (2015).
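Equations (1) and (2) are not reproduced in the text above. The following is a sketch of the forms they most plausibly take, assuming the parameterization usually attributed to Fernandes et al. (2015) and the parameter meanings given in this section (a as upper asymptote, b as abscissa of the inflection point, k as growth rate index). The exact notation of the original equations is an assumption.

```latex
\begin{align}
Y_i &= \frac{a}{1 + e^{k(b - t_i)}} + \varepsilon_i \tag{1} \\
Y_i &= a\, e^{-e^{k(b - t_i)}} + \varepsilon_i \tag{2}
\end{align}
```

Under these forms, the Logistic curve takes the value a/2 at t = b and the Gompertz curve takes the value a/e at t = b, which agrees with the interpretation of b as the abscissa of the inflection point.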
In models (1) and (2), $Y_i$ is the response variable, that is, the diameter value taken at time $t_i$, with $i = 1, 2, \ldots, 26$; $a$ is the asymptotic value, or maximum expected fruit diameter; $b$ is the abscissa of the inflection point; $k$ is related to the fruit growth rate, and the higher the $k$ value, the faster the fruit is expected to reach its adult stage (MISCHAN; PINHO, 2014); and $\varepsilon_i$ is the random error, which, as an initial assumption, is independent and normally distributed with zero mean and constant variance, that is, $\varepsilon_i \sim N(0, \sigma^2)$.
In nonlinear models, an iterative method is needed to approximate the estimates. In this work, the Gauss-Newton method was used because it is implemented in the "nls" function, and the initial values were obtained graphically with the "manipulate" package (ALLAIRE, 2014), both available in the R statistical software.
After the initial fits, residual analysis was performed to verify whether the assumptions were violated. For this, the Shapiro-Wilk test (SHAPIRO; WILK, 1965) for normality, the Breusch-Pagan test (BREUSCH; PAGAN, 1979) for homoscedasticity and the Durbin-Watson test (DURBIN; WATSON, 1951) for the independence of residuals were used, complemented by graphical analysis.
If normality is not met, a transformation may be applied to the data or a different model may be chosen. If the homoscedasticity assumption is violated, a viable solution is to model the inverse of the variance. Finally, if the residuals are not independent, the dependence must be modeled by including an autoregressive term of order p (AR(p)).
The significance of the model parameters can be verified through Student's t test, in which p-values lower than the pre-established nominal level imply a significant parameter; otherwise, the parameter is not significant. The confidence interval (CI) of the parameters can also be used, calculated with the following expression:
$CI(\theta_i): \hat{\theta}_i \pm t_{(\nu, \alpha/2)} \, ep(\hat{\theta}_i)$
where $\hat{\theta}_i$ is the i-th estimated parameter, $t_{(\nu, \alpha/2)}$ is the quantile of Student's t distribution, $\alpha$ is the significance level adopted, $\nu = n - p$ is the number of degrees of freedom, $n$ is the size of the response vector, $p$ is the number of parameters of the model and $ep(\hat{\theta}_i)$ is the standard error of the estimate, with $ep(\hat{\theta}_i) = \sqrt{V(\hat{\theta}_i)}$, where $V(\hat{\theta}_i)$ is an estimate of the variance of $\hat{\theta}_i$ obtained from the diagonal of the variance and covariance matrix. If this interval does not contain zero, the parameter can be considered significant.
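The iterative estimation described in this section can be illustrated with a minimal Gauss-Newton sketch. It is written in plain Python rather than with R's "nls", assumes the Logistic form a / (1 + exp(k(b - t))), adds a simple step-halving safeguard that is not mentioned in the text, and uses synthetic noise-free data with hypothetical parameter values, not the measurements of this study.

```python
import math

def logistic(t, a, b, k):
    """Assumed Logistic mean function a / (1 + exp(k * (b - t)))."""
    z = min(k * (b - t), 300.0)  # cap the exponent to avoid overflow on wild trial values
    return a / (1.0 + math.exp(z))

def jacobian_row(t, a, b, k):
    """Partial derivatives of the Logistic mean with respect to (a, b, k)."""
    e = math.exp(min(k * (b - t), 300.0))
    d = (1.0 + e) ** 2
    return [1.0 / (1.0 + e), -a * k * e / d, -a * (b - t) * e / d]

def solve(A, y):
    """Solve the small linear system A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def rss(ts, ys, theta):
    """Residual sum of squares for parameter vector theta = (a, b, k)."""
    return sum((y - logistic(t, *theta)) ** 2 for t, y in zip(ts, ys))

def gauss_newton(ts, ys, theta0, max_iter=100, tol=1e-10):
    """Gauss-Newton iterations with step halving, started from user-supplied initial values."""
    theta = list(theta0)
    for _ in range(max_iter):
        J = [jacobian_row(t, *theta) for t in ts]
        r = [y - logistic(t, *theta) for t, y in zip(ts, ys)]
        JtJ = [[sum(row[i] * row[j] for row in J) for j in range(3)] for i in range(3)]
        Jtr = [sum(row[i] * ri for row, ri in zip(J, r)) for i in range(3)]
        step = solve(JtJ, Jtr)  # solves the normal equations (J'J) step = J'r
        current = rss(ts, ys, theta)
        scale = 1.0
        while scale > 1e-8:
            trial = [p + scale * s for p, s in zip(theta, step)]
            if rss(ts, ys, trial) <= current:
                break
            scale /= 2.0  # halve the step until the fit no longer gets worse
        theta = trial
        if max(abs(scale * s) for s in step) < tol:
            break
    return theta

# Synthetic, noise-free example on 26 hypothetical measurement days (0, 15, ..., 375).
days = [15.0 * i for i in range(26)]
true_theta = (21.5, 165.0, 0.0145)  # hypothetical values, not the estimates of this study
diameters = [logistic(t, *true_theta) for t in days]
estimate = gauss_newton(days, diameters, theta0=(20.0, 150.0, 0.02))
print([round(v, 4) for v in estimate])
```

The initial values passed as `theta0` play the role of the graphically chosen starting values; poor starting values can prevent the iterations from converging, which is the difficulty noted in the Introduction.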
In regression analysis, the ideal is to have several candidate models and test which is the most suitable for each situation. In this work, models (1) and (2) were chosen as candidates based on the scatter plot of the data and on the study objectives. After the candidate models are defined, the one that best fits the data is selected, and conclusions are then drawn from its results. To select the model that best fits the data, the following criteria were used:
• Corrected Akaike information criterion (AICc), with n being the number of observations and p the number of parameters;
• Bayesian information criterion (BIC);
• Residual standard deviation (RSD), where QME is the residual mean square;
• Nonlinearity measures (intrinsic and parametric);
• Adjusted determination coefficient (R²aj), where R² is the determination coefficient.
Lower AICc, BIC and RSD values, lower intrinsic and parametric nonlinearity measures and higher R²aj values characterize the most appropriate model to describe the data under study.
Together with the interpretation of the parameters, the first four derivatives of the Logistic and Gompertz models with respect to time were used, the first derivative being the growth rate. Setting the second derivative equal to zero gives the inflection point (ip) of the curve. Setting the third derivative equal to zero gives the maximum acceleration point (map) and the maximum deceleration point (mdp), and the fourth derivative gives the asymptotic deceleration point (adp) (MISCHAN; PINHO, 2014). The coordinates of the critical points under the parameterization of this work can be found with the expressions shown in Table 1.
Table 1. Expressions to find the abscissas and ordinates of the maximum acceleration point (map), inflection point (ip), maximum deceleration point (mdp) and asymptotic deceleration point (adp) of the Logistic and Gompertz models using the parameterization of this work.
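For the Logistic model, the critical points described above can be sketched in a few lines. The sketch assumes the form a / (1 + exp(k(b - t))); writing u = y/a, the second, third and fourth derivatives vanish at the roots of 1 - 2u, 1 - 6u + 6u² and 1 - 12u + 12u², respectively. The function name and the derivation are illustrative and are not taken from Table 1.

```python
import math

def logistic_critical_points(a, b, k):
    """Return {name: (t, y)} for the critical points of a / (1 + exp(k * (b - t)))."""
    # Fractions u = y / a at which the higher-order derivatives vanish.
    fractions = {
        "map": (3.0 - math.sqrt(3.0)) / 6.0,  # lower root of 1 - 6u + 6u^2 (third derivative)
        "ip": 0.5,                            # root of 1 - 2u (second derivative)
        "mdp": (3.0 + math.sqrt(3.0)) / 6.0,  # upper root of 1 - 6u + 6u^2 (third derivative)
        "adp": (3.0 + math.sqrt(6.0)) / 6.0,  # upper root of 1 - 12u + 12u^2 (fourth derivative)
    }
    points = {}
    for name, u in fractions.items():
        # Invert u = 1 / (1 + exp(k * (b - t))) to recover the abscissa.
        t = b - math.log((1.0 - u) / u) / k
        points[name] = (t, a * u)
    return points

# a and k as reported for LEDKP; b = 167 is the approximate inflection abscissa quoted in the text.
for name, (t, y) in logistic_critical_points(21.5478, 167.0, 0.0143).items():
    print(f"{name}: t = {t:.1f} DAIO, y = {y:.2f} cm")
```

With these inputs the adp falls near 327 DAIO and 19.57 cm, in line with the LEDKP values reported in the Results; the ordinate of the adp is about 90.8% of the asymptote.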

Results and discussion
As the initial part of any regression analysis, models are fitted assuming that all assumptions are met; after these first fits, the residual analysis is performed. The results in Table 2 show that the only assumption violated was independence, which was expected for the LED variable, since data were taken from the same fruit and, as mentioned by Ribeiro et al. (2018a) and Cassiano and Safadi (2015), observations performed on the same individual are generally autocorrelated. Graphical analysis corroborated the results of the hypothesis tests in Table 2, as can be seen in Figures 1, 2, 3 and 4. The models were therefore refitted incorporating a first-order autoregressive term (AR(1)).
Before concluding that the models are adequate and selecting the one that best fits the data, the significance of the parameters was tested via Student's t test and 95% confidence intervals. For both fitted models, all parameters were significant and their confidence intervals did not contain zero. Thus, after verifying the fits and the significance and CI of the parameters, both models were adequate to describe the relationship of LED and LEDKP with time (Table 4).
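The independence check that motivated the AR(1) term can be illustrated with a short sketch. It computes only the Durbin-Watson statistic (not the test's p-value) and a simple moment estimate of the lag-1 autocorrelation; the residual values are hypothetical and are not those of this study, and the AR(1) term in the analysis above was estimated jointly with the model, not by this shortcut.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: close to 2 for uncorrelated residuals,
    close to 0 under strong positive autocorrelation."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

def lag1_autocorrelation(residuals):
    """Moment estimate of the lag-1 autocorrelation of a zero-mean residual series."""
    num = sum(residuals[i] * residuals[i - 1] for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Hypothetical residuals that drift slowly, as repeated measures on the same fruit often do.
example = [0.3, 0.4, 0.35, 0.2, 0.1, -0.1, -0.25, -0.3, -0.2, -0.05, 0.1, 0.2]
print(durbin_watson(example), lag1_autocorrelation(example))
```

A statistic well below 2 together with a clearly positive lag-1 autocorrelation is the pattern that suggests adding an autoregressive term.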
The criteria used to assess the fit quality of the models, with R²aj values greater than 0.99 and parametric nonlinearity less than 1, indicate good fits of the models to the data (Table 3). Sari et al. (2019a) and Fernandes et al. (2019) obtained similar values in studies with tomato and with meat-producing mammals, respectively.
Table 3. Criteria for assessing fit quality: corrected Akaike information criterion (AICc), Bayesian information criterion (BIC), residual standard deviation (RSD), intrinsic nonlinearity (Cl), parametric nonlinearity (Cq) and adjusted determination coefficient (R²aj), for comparing the Logistic and Gompertz models fitted to 'Green Dwarf' coconut growth data for the longitudinal external diameter of harvested fruits (LED) and of fruits kept on plants (LEDKP).

Table 4 (fragment). Parameter k (lower limit, estimate, upper limit): LED 0.0128, 0.01464**, 0.0166; LEDKP 0.0125, 0.0143**, 0.0160. Parameter phi: LED 0.6154; LEDKP 0.6787.
Although both fitted models were adequate to describe the relationship among the variables, the Logistic model was chosen for inference as the one that best adhered to the data, since it presented higher R²aj values and lower AICc, BIC and RSD values (Table 3). Bates and Watts (1988) used intrinsic and parametric nonlinearity measures to evaluate the nonlinearity of a nonlinear model: the lower the Cl and Cq values, the closer the model is to linear. As the Cl and Cq values were lower for the Logistic model for both LED and LEDKP, it appears to be the most appropriate model, corroborating the AICc, BIC, RSD and R²aj values.
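The comparison by information criteria can be sketched as follows. The expressions used in this work for AICc, BIC and RSD are not shown in the text, so the sketch assumes the usual least-squares forms under Gaussian errors and counts only the mean parameters in p; other conventions (for example, also counting the error variance or the autoregressive term) shift the values but not the idea. The residual sums of squares are hypothetical.

```python
import math

def fit_criteria(rss, n, p):
    """AICc, BIC and RSD for a least-squares fit with Gaussian errors.

    rss: residual sum of squares; n: number of observations;
    p: number of estimated parameters (assumed counting convention).
    """
    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)
    aic = -2.0 * log_lik + 2.0 * p
    aicc = aic + 2.0 * p * (p + 1.0) / (n - p - 1.0)  # small-sample correction
    bic = -2.0 * log_lik + p * math.log(n)
    rsd = math.sqrt(rss / (n - p))  # square root of the residual mean square
    return {"AICc": aicc, "BIC": bic, "RSD": rsd}

# Hypothetical residual sums of squares for two three-parameter models on 26 observations.
for name, rss_value in (("model A", 1.0), ("model B", 2.0)):
    print(name, fit_criteria(rss_value, n=26, p=3))
```

With the same n and p, the model with the smaller residual sum of squares has the lower AICc, BIC and RSD, which is the direction of preference described above.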
According to Diel et al. (2019), several studies use fit quality criteria as the only way to choose the best model, but another aspect must be taken into account: the estimated asymptote value (a), which must be neither underestimated nor overestimated, as this parameter has the most important interpretation. In this work, the asymptote values estimated by the Logistic model for both LED and LEDKP deviated little from the observed values, at 21.4037 cm for LED and 21.5478 cm for LEDKP (Figures 1 and 2, respectively). Nonlinear regression models must be used with some caution: if the necessary steps of a good study are not followed, mistaken results can be obtained, and checking only the fit quality criteria can be misleading for inference. The ideal is also to check the bias of the parameter estimates, mainly of the horizontal asymptote, that is, parameter a (Table 4).
Considering this additional aspect, which corroborates the fit quality criteria, the Logistic model can be considered the most suitable to describe the relationship proposed in this work. This model was also chosen as the most adequate in the works of Prado, Savian and Muniz (2013), Prado et al. (2020), Muianga et al. (2016), Jane et al. (2020a) and Jane et al. (2020b). Table 4 shows the estimates and confidence intervals for the parameters of the longitudinal external diameter of harvested fruits (LED) and of fruits kept on plants (LEDKP), and the critical points from the four derivatives of the Logistic model. Asymptotic values were 21.4037 cm and 21.5478 cm for LED and LEDKP, respectively, found at 375 DAIO. These values are higher than those found by Silva et al. (2009) for 'Green Dwarf' coconut grown in the conventional system (17.23 cm) and in the organic system (16.75 cm). The difference can be explained by the fact that Silva et al. (2009) harvested fruits at 240 DAIO, a phase between the inflection point and the maximum deceleration point found in this work.
According to Silva et al. (2009), Ferreira Neto et al. (2007) and Ribeiro, Costa and Aragão (2017), coconut fruits are harvested at 240 DAIO, as the fresh coconut water market requires, because the water tastes better in this period. In the work by Silva et al. (2009), the diameter was 16.57 cm in the organic system and 17.23 cm in the conventional system. The estimates of this work at 240 DAIO, a phase close to the maximum deceleration point (mdp), were 16.11 cm for LED and 15.86 cm for LEDKP.
For the agroindustry, Ribeiro, Costa and Aragão (2017) point out that the fruit must be harvested between 8 and 9 months, that is, between 240 DAIO and 270 DAIO, a phase between the maximum deceleration point and the asymptotic deceleration point found in this study. At exactly 270 DAIO, the estimated LED and LEDKP values were 17.66 cm and 17.47 cm, respectively. The final phase of fruit harvesting takes place when the fruits are dry, and according to Ribeiro, Costa and Aragão (2017), this phase occurs at 11 months, or 330 DAIO. Mischan and Pinho (2014) point out that the asymptotic deceleration point (adp) corresponds to approximately 90% of the horizontal asymptote value (a). In this work it occurred at approximately 329 DAIO (19.43 cm) for LED and 327 DAIO (19.57 cm) for LEDKP, close to the 330 DAIO needed for coconut fruits to be dry.
Figures 5 and 6 show the observed data, the fitted curves and the curve that represents the growth rate over time. The highest point of the growth rate curve is the inflection point and, as mentioned by Mischan and Pinho (2014), the ordinate of this point is exactly half of the upper horizontal asymptote of the Logistic model. It occurred at approximately 163 DAIO, or 10.70 cm, for LED (Figure 5) and at 167 DAIO, or 10.77 cm, for LEDKP (Figure 6). The fitted curve also adhered well to the observed data, corroborating the initial hypothesis that the data have a sigmoidal shape. Other important points for the production of 'Green Dwarf' coconut on the growth rate curve are the map (in yellow), the mdp (in blue) and the adp (in pink) (Figure 5).

Conclusion
The Logistic and Gompertz models were adequate to describe the growth of LED and LEDKP over time; however, based on the criteria used to assess fit quality and on the analysis of the horizontal asymptote, the Logistic model was more adequate, with asymptotic values of approximately 21 cm for both variables.
For studies with 'Green Dwarf' coconut, the critical points of the models are of great help for the researcher to find the harvest point of fruits with the best water flavor, for use by the agroindustry and for use as dry coconut in the food industry. The asymptotic values were 21.4037 cm for LED and 21.5478 cm for LEDKP.