Nonlinear regression analysis of length growth in cultured rainbow trout

Length growth as a function of time has a non-linear relationship, so nonlinear equations are recommended to represent this kind of curve. We used six nonlinear models to calculate the length gain of rainbow trout ( Oncorhynchus mykiss ) during the final grow-out phase of 98 days under three different feed types in triplicate groups. We fitted the von Bertalanffy, Gompertz, Logistic, Brody, Power Function, and Exponential equations to individual length-at-age data of 900 fish. Equations were fitted to the data based on the least square method using the Marquardt iterative algorithm. Accuracy of the fitted models was evaluated using a model performance metrics combining mean squared residuals (MSR), mean absolute error (MAE) and Akaike's Information Criterion corrected for small sample sizes (AICc). All models converged in all cases tested. Evaluation criteria for the Logistic model indicated the best overall fit (0.67 of combined metric MSR, MAE and AICc) under all different feeding types, followed by the Exponential model (0.185), and the von Bertalanffy and Brody model (0.074, respectively). Additionally, ∆AICc results identify the Logistic and Gompertz models as being substantially supported by the data in 100% of cases. The logistic model can be suggested for length growth prediction in aquaculture of rainbow trout.


INTRODUCTION
Mathematical modeling is defined as the use of equations to describe or simulate processes in a system, such as animal growth (Santos, 2008). Mostly, growth is described as an increase in body dimension (mass, volume or length) as a function of time, and when this relationship is plotted, it results in a growth curve.
Length increase in fish has been studied for a long time in population and fisheries research. In contrast, aquaculture studies mostly refer to body weight. Because both measurement (weight and length) are understood to be largely caused by the same genes (Gunnes and Gjedrem, 1981) weight and length are closely linked by mathematical relationships. Accordingly, length, just as weight, can be affected by the environmental factors present in aquaculture facilities, i.e. tank design (Ross et al., 1995;Üstündağ and Rad, 2014) water quality and stocking density (Person et al., 2008).
Therefore, length growth and other size related treatment studies in cultured fish have received increasing attention since they can be used in the management of aquaculture production (Furuya et al., 2014;Silva et al., 2015;Lugert et al., 2017). In fact, during the last decade the interest to record these growth measurements in situ has grown, in order to improve the automatization of rearing practices in commercial fish aquaculture with the goal to advance in terms of productivity and profitability (Miranda et al., 2017;Saberioon and Císař, 2018).
Predicting growth in aquaculture facilities with high accuracy is possible using statistically based models, i.e. nonlinear equations, since statistical processing software are capable of handling complex mathematical algorithms in order to achieve analytical solutions (Lugert et al., 2016;Powel et al., 2019). Thus, the aim of this work was to fit six nonlinear models to length growth data of cultured rainbow trout by nonlinear regression and evaluate which model or models have the highest accuracy to display the growth curve.

MATERIAL AND METHODS
The data were collected at a commercial rainbow trout farm. The farm is located in the municipality of Nova Friburgo, a mountain region of the state of Rio de Janeiro, Brazil (22 ° 23'36 "S, 42 ° 29'12" W, 1.032 m altitude). This research was approved by the Ethics Committee on Animal Use (CEUA) of the Rio de Janeiro State Fisheries Foundation-FIPERJ with document number 007/2017007/2017. The fish, without sex distinction, were acquired from the farms' own breeding program. Nine hundred fish with an age of 273 days post-hatch (dph), and length (fork length) mean of 22.42 ± 0.71cm, were selected. Fish were distributed randomly into nine masonry tanks with a volume of 40 m 3 each. Fish were fed with three different types of extruded pellets (two commercials diets, A and B, and one experimental diet, C) in triplicates [(A/1, A/2, A/3) (B/1, B/2, B/3) (C/1, C/2, C/3)]. Rations were offered twice a day until apparent saturation. The experimental period was 98 days. Length measures at the beginning and the end of the trial for each feed type are shown in Table 1.
The six nonlinear equations chosen were von Bertalanffy, Brody, Gompertz, Logistic, Exponential, and Power Function; the mathematical expression of each function is presented in Table 2. Models were fitted using the Levenverg-Marquardt algorithm through the nlsLM computational process in the statistical software R (Elzhov et al., 2015). This process uses the nonlinear least squares (nls) method. The default convergence conditions were used with the exception of the maximum number of iterations being increased to 1000.  Table 2. Mathematical expression of the seven equations fitted to length growth data of cultured rainbow trout Huxley, 1932 Y = dependent variable; t = independent variable; A = asymptote; B = exponential rate of approximation to the asymptote; T = the location of the point of inflection (POI); C = an integration constant without biological interpretation; T0 = is the intercept on the xaxis; Y1 and Y2 = first and the last recorded length data, respectively; T1 and T2 = age of fish at the beginning and the final of period experiment, respectively; L0 = is the intercept on the y-axis; and k = exponential rate to infinity.
The accuracy of the fitted models was evaluated using a model performance metrics. The performance criteria to evaluate the goodness of fit are: The mean squared residuals ( = * [ − ] -1 ); where RSS is the residual sum of squares, n is the number of observations, p is the number of parameters of the model (Rawlings et al., 1998). The Akaike Information Criterion (AIC) corrected for small sample sizes (AICc).
= 2 − 2 ln(̂); where k is the number of estimated parameters in the model and L̂ is the maximum value of the likelihood function for the model, and ln is the natural logarithm (Akaike, 1973).
where n is the sample size and k is the number of parameters.
We calculated the difference in AICc (∆AICc) values to test the support of inferior models by the data. ∆AICc is calculated as: AICc (AICc i -AICc min) (Katsanevakis and Maravelias, 2008). Models with ∆AICc >10 have no support from the data, while models with ∆AICc < 2 have substantial support (Burnham and Anderson, 2002). Models with ∆AICc between 4-7 are somewhat supported by the data and might be taken into consideration.
The Mean Absolute Error (MAE) is the average absolute difference between observed and predicted outcomes and is calculated as: = (| − |). The MSR, AICc, and MAE were calculated using SAS (Statistical…, 2013). Finally, the results from MSR, AICc, and MAE were analyzed using a scoring system in which each best fit accounted for one score. The model that had the best fit in most tested cases achieved the highest score. In addition, we interpreted the estimated regression parameters of each model in regard to the biological attributes of the species whenever possible.

RESULTS
All models met convergence in all (9 of 9 evaluations) tested cases through Levenverg-Marquardt´s iterative method. All models needed a comparably low number of iterations, and convergence was generally met within 100 iterations. The estimated parameters for each model are shown in Table 3.      Figure 1 for threeparameter functions (Logistic, von Bertalanffy, Gompertz and Brody models) and, two-parameter functions (Power Function and Exponential models).
The model performance metrics for each model are presented in Table 4. Lowest MSR-values are produced by the Logistic model in 0.67 of tested cases, followed by the Von Bertalanffy (0.11), Brody (0.11), Exponential (0.11) and Power Function (0.11) models. The Gompertz model did not perform the lowest MSR in any case. MAE is lowest in the Logistic model in 5 out of 9 tested groups, 0.55 respectively. The Gompertz, Exponential and Power function models produced lowest MAE once (0.11) ( Table 4).
The AICc values of each model and all tested cases are listed in Table 4. Lowest AICc values are most often obtained by the Logistic model (6 of 9 cases). The Exponential model produced lowest AICc in 2 out of 9 cases, and the von Bertalanffy and Brody model both achieved lowest AICc values in 1 of 9 cases. The Gompertz and Power Function models never achieve lowest AICc.
The overall score obtained by the models are presented at the bottom of Table 4. Undisputedly, the Logistic model achieved the best overallscoring with 18 of 27 best fits (0.67). The Exponential model achieved best overall fit in 5 of 27 cases. The von Bertalanffy, and the Brody models scored only 2 out of 27 (0.07), and the Gompertz, and Power Function models achieved best fit just in one tested case and criteria (0.04). ∆AICc values range between 0.026 as the lowest and 44.61 as the highest. The Logistic and Gompertz models had substantial support by the data in all cases (Table 4). The von Bertalanffy and Brody models in 6 cases and, the Exponential and Power Function models in 2 cases each.

DISCUSSION
Convergence is met when the iterative process successfully estimates parameters for the function within the given maximum number of iterations set in the fitting algorithm. In this study, all models met convergence in all tested cases using the Marquardt algorithm. This algorithm is described to be more robust than others offered on statistical software (Elzhov, et al., 2015;Lugert et al., 2017). This is especially important, as nonconvergence situations of models for aquaculture data are described by several authors (Costa et al., 2009;Mansano et al., 2012;Allaman et al., 2013;Sousa et al., 2014).
Parameter A for three-parameter models (von Bertalanffy, Brody, Logistic and Gompertz) describe the infinite size of an organism and can be interpreted as the possibility of the model to reflect the biological properties of the species. O. mykiss is known to exceed 120cm in length (Eaton et al., 1995). Accordingly, Logistic and Gompertz models, estimated A within the biological range of the species in 7 of 9 cases, and for von Bertalanffy and Brody models in 5 of 9 cases.
Parameter B for three-parameter models denotes the precocity index. This means the larger the numeric value, the quicker the fish will reach the asymptotic or infinite size (Malhado et al., 2009). Estimated B values for Logistic and Gompertz models (0.0001 and 0.0144) in this study have the tendency to be greater than those being obtained from wild rainbow trout (0.002 to 0.049) in 7 and 6 of 9 cases, respectively (Blair et al., 2013;Sloat and Reeves, 2014;Cilbiz and Yalim, 2017 The point of inflection (POI) (parameter T) of the growth curve is only parameterized in the Gompertz and the Logistic model. At the POI, the rate of grow this largest, before diminishing asymptotically against zero. In this study, T values obtained by the Gompertz and Logistic models are generally lower than those estimated by Sloat and Reeves (2014) in weight data of wild rainbow trout using the Gompertz model. Furthermore, in aquaculture operations, the parameter T can be useful in the empiric adjustment of management strategies, as it is proven to correlate with other husbandry information. For instance, T parameter has significant meaning on cultured Carassius auratus gibelio because it positively correlates with dietary protein levels (Yun et al., 2015). Likewise, Oreochromis niloticus shows significant influence of water temperature on weight gain and at the age at the inflexion point (Santos et al., 2013).
Parameter T0 for the von Bertalanffy model defines the hatching day of rainbow trout. In this study, this parameter does not have congruence with biological features since it is not possible to have negative or positive hatching age (up to 54 days). Similarly, parameter L0 for Exponential and Power Function models which define the hatching length differ between both models. In addition, L0 of the Power function model shows values (0.0127 to 0.1116cm) smaller than the biologic features of rainbow trout (1.2 to 2cm) as described by Lavens and Sorgeloos (1996).
Parameter k represents the constant growth rate of rainbow trout trough all growth-curve for Exponential and Power Function model. k values of Exponential model are lower than those obtained by the Power Function. These values must be taken with care since both models display exponential shape and are not intended for longer growth periods or extrapolation of data. However, because of their simplicity they are frequently used in aquaculture studies (Santos et al., 2008;Costa et al., 2009).
In model selection, the goodness of fit should generally not be based on a single criterion. Correspondingly, it has become common practice to evaluate the most suitable model based on an evaluation metrics of mostly three statistical parameters of different properties (e.g. Yun et al., 2015;Lugert et al., 2017;Powell et al., 2019). One parameter should be based on the residuals from fitting the model. The second parameter is often based on information theory either AIC, AICc or BIC. A third parameter is mostly somehow based on the deviation between estimated and sampled data. For these three categories of evaluation parameters, several different statistical parameters are available. In each scenario, the author needs to decide individually, which parameter is most suitable for the current study.
In our study, we used Mean Squared Residual (MSR), Akaike Information Criterion for small sample sizes (AICc) and Mean Absolute Error (MAE). The non-linear least squares method aims to achieve non-linear equation parameter by minimizing the Residual Sum of Squares (RSS). The smaller RSS, the smaller the MSR and the better the fit (Rawlings et al., 1998). In this study, the Logistic model most often achieved the smallest MSR values. Similar results were obtained by Costa et al. (2009) in growth studies of Orechormis niloticus, but are in contrast to Mansano et al. (2012) on Lithobates catesbeianus, with both species being reared under aquaculture conditions.
We used ∆AICc to identify whether our datasets were supported by more than one model. This was necessary, as the outcome from the analysis revealed very close numeric results between different models within tested groups. ∆AICc < 2 indicates substantial support of a model by the data (Burnham and Anderson, 2002). Indeed, in all 9 analysis, 2 out of 6 tested models were supported by the data, namely Logistic and Gompertz. This might be due to the specific pattern of our recorded data (grow-out phase), which are distributed around the POI of the growth curve. Accordingly, several models of sigmoidal behavior can equally well reflect this segment of the curve.
Primarily, we observed that the different nonlinear models adjusted their fit individually to the various growth trajectories expressed by rainbow trout caused by different diet treatments. Araneda et al. (2013) observed similar results when fitting models on various growth data of Penaeus vannamei. This specific application has huge potential in predicting the effects of new feed formulations, harvest size and production period in all aquaculture species. However, it is necessary to verify and validate this potential through studies with rigorous control of diet quality and quantity as recorded in carp (Yun et al., 2015).

CONCLUSION
All six models (von Bertalanffy, Brody, Logistic, Gompertz, Exponential and Power Function) have shown the capacity to fit the length-at-age data of cultured rainbow trout during the grow-out phase. However, in the current study, the Logistic model achieved the highest accuracy in fit. Despite the growth-length curve of cultured rainbow trout not clearly follows a sigmoidal shape, the diminishing-return shaped von Bertalanffy and Brody models, as well as the exponential shaped Power Function and Exponential models do not meet the mathematical attributes needed to reflect length-at-age data. This is also verified by ∆AICc values, which indicate the Logistic and Gompertz model, as the only models having substantial support by the data in all cases. Furthermore, we showed that it is possible to model the impact of varies feeding strategies to predict long-term influences on growth and harvest size and production period.

ACKNOWLEDGMENTS
This study was financed by the Co-ordination of Higher-Level Population Survey -Brazil (CAPES) -Finance Code 001, The National Council for Scientific and Technological Development (CNPq) Process number 141889/2019-5, Araribá Trout Farm, and by C.N. Rações. The authors want to thank Victor Octavio Borda Pua for his help with the suggestion of codes on the R software.

COMMITTEE ON ETHICS AND BIOSAFETY
The procedures adopted were approved by the Ethics Committee on Animal Use (CEUA) of the Rio de Janeiro State Fisheries Foundation-FIPERJ, document number 007/2017, been in accordance with the ethical principles in animal experimentation elaborated by the Brazilian College of Animal Experimentation (COBEA).