Nonlinear regression and plot size to estimate green beans production

The objectives of this work were to fit nonlinear regression models to green bean production and to identify the plot size that provides the best explanatory power and model fit. Two field trials and two protected-environment (plastic tunnel) trials were used, carried out in the autumn-winter and spring-summer seasons. The logistic and von Bertalanffy models were fitted to the average weight of green beans accumulated over multiple harvests, for different plot sizes. The two models gave similar estimates, and each model gave the same parameter estimates for all plot sizes. The logistic model provided estimates closest to reality and best described the average weight of pods over the productive cycle. In the autumn-winter season, plots of 14 basic units (28 plants in the cultivation line direction) in the field and of two basic units (four plants in the cultivation line direction) under the plastic tunnel provide a good quality of model fit. In the spring-summer season, the corresponding plot sizes are six basic units (12 plants in the cultivation line) in the field and seven basic units (14 plants in the cultivation line) under the plastic tunnel.


Pesquisa / Research
Hortic. bras., v. 34, n. 4, out.-dez. 2016

Snap-bean (Phaseolus vulgaris) is the most important representative of the Fabaceae among horticultural crops. Snap-bean pods, unlike those of common beans, are harvested when immature and consumed whole after cooking, and are used both in processed and in natura food (Filgueira, 2008).
In multiple-harvest trials, such as those carried out with snap-bean, all harvests should be considered in the statistical analyses, as they are one of the sources of variability in the observed variables. In this situation, residual variance tends to be inflated, which leads to an inadequate choice of experimental design, plot size and sample size and, thus, impairs the discrimination of treatment effects. This increase in residual variance is due to the lack of information during harvest, that is, the absence of suitable fruits or pods to be harvested in the plants and experimental plots in certain harvests during the crop productive cycle, which favors overdispersion in the database through the recording of a large number of zero values.
Several studies have already been carried out to develop strategies and to identify the most appropriate procedures to minimize data variability in trials with snap-beans. Among these are the studies by Haesbaert et al. (2011), to estimate sample size, and Santos et al. (2012a, 2012b, 2014), to estimate plot size and the feasibility of applying the Papadakis method to adjust for experimental error. Studies aiming to improve the quality of trials use the following strategies: determination of plot size and sample size adjusted to the variability of the experimental areas; determination of the behavior of variability between lines and between harvests; and study of data transformation and use of the Papadakis method to minimize the effects of excessive zeros, which cause overdispersion in the observed variables due to the lack of suitable fruits to be harvested from certain plants in certain harvests.
The relationship between the variables observed in trials with vegetables and their behavior during the productive cycle of the species is important, as it provides information on how the multiple harvests should be planned and carried out in order to reduce the number of plants with zero values for the variables number and weight of fruits or pods harvested. One useful strategy in this regard is to estimate the optimum plot size, that is, the one that provides the smallest variance between the plots evaluated (Lorentz et al., 2005; Lúcio et al., 2008, 2012). Another strategy is to accumulate the values of these variables in each plant over successive harvests. With this accumulation, the values increase monotonically in each plant evaluated, which allows regression analysis techniques to be applied to estimate the aforementioned variables. With either strategy, the percentage of zero values in the database is reduced, since each plot aggregates a greater number of plants or of observations from successive harvests.

Regression models classified as nonlinear are useful for describing the growth of individuals over time, and they support the researcher's decisions because their parameters have a biological interpretation. According to Seber & Wild (1989) and Smyth (2002), nonlinear models are generally adopted when the researcher observes, and assumes as a hypothesis, that the relationship between the dependent variable and the predictors follows a certain function. Draper & Smith (1981) state that the kind of nonlinear model adopted depends on the research area, on the specific problem and on the kind of growth to be modeled.
Applications of nonlinear growth models can be found in studies in different areas of knowledge. Modeling studies on vegetables, as well as on beans, aim to evaluate the whole cycle of a specific species or to model the growth of different cultivars or of plants under different cultural managements (Urchei et al., 2000; Martins Filho et al., 2008; Vieira et al., 2008; Vieira Neto et al., 2013).
However, the authors did not identify any study describing nonlinear relationships between pod or fruit production and the advance of the production cycle of vegetable crops with multiple harvests, or the behavior of this production in relation to plot size and to the accumulation of production over multiple harvests. Thus, the goals of this work were to fit nonlinear regression models to describe the production of snap-bean pods and to identify the plot size that provides the best explanation and model fit.

MATERIAL AND METHODS
Four uniformity trials, with no treatment application, were carried out with snap-bean cultivar "Macarrão" in the field and in a protected environment (plastic tunnel). Two of these trials were carried out in autumn-winter and the other two in spring-summer, in the experimental field of the Setor de Experimentação Vegetal (Plant Experimentation Division) of the Departamento de Fitotecnia (Plant Sciences Department) of the Universidade Federal de Santa Maria (Santa Maria Federal University), Brazil (29°42'23''S, 53°43'15''W, 95 m elevation).
The plastic tunnel has a 3 m ceiling height, is 20 m long and is oriented north-south. It was covered with 150 µm thick, anti-UV low-density polyethylene (LDPE). The area and structure used for unprotected and protected cultivation in autumn-winter were the same as those used in the spring-summer season. In all trials, the plants were grown on three ridges (growing rows) with 84 plants per row (42 BUs of two plants each), approximately 0.20 m high and 0.40 m wide, covered with black opaque low-density polyethylene mulch. Spacing was 0.2 m between plants and 1.0 m between rows in all trials, and each basic experimental unit (BU) consisted of two plants, totaling 42 basic units within each growing row. From initial growth, the plants were trained vertically with raffia thread attached to wires along the growing row. A drip irrigation system was used and the other cultural practices followed the recommendations for the snap-bean crop.
In autumn-winter, four harvests were carried out, at 61, 74, 88 and 112 days after sowing (DAS), whereas in spring-summer three harvests were carried out, at 70, 91 and 99 DAS. In each harvest, the authors recorded the weight (in grams) of the pods harvested per BU using a digital scale with 0.01 g accuracy.
To reduce the percentage of plots with zero values and to decrease data variability and the residual variance in the analysis of variance of the fitted models, thereby allowing adequate and reliable estimates of the parameters of the nonlinear regression models, different plot sizes were simulated by grouping BUs in the cultivation line direction. The plot sizes were divisors of the total number of BUs per row (42), so that the whole experimental area was used. Thus, the authors simulated plots with 1, 2, 3, 6, 7, 14 and 21 BUs (Table 1).
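The simulated plot sizes are the proper divisors of the 42 BUs in a growing row, which is why every size uses the whole row with no BU left over. A minimal Python sketch of this enumeration follows; the function and variable names are illustrative and not taken from the study.

```python
def plot_sizes(bus_per_row):
    """Plot sizes (in BUs) that split a row into equal plots, excluding the whole row."""
    return [size for size in range(1, bus_per_row) if bus_per_row % size == 0]


BUS_PER_ROW = 42  # stated in the text: 42 basic units per growing row

sizes = plot_sizes(BUS_PER_ROW)
# Number of plots obtained in each row for every simulated plot size
plots_per_row = {size: BUS_PER_ROW // size for size in sizes}

print(sizes)
print(plots_per_row)
```

Larger plots leave fewer plots per row (for example, three plots of 14 BUs), which matters later for the degrees of freedom of the confidence intervals.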
For each simulated plot size, the authors computed the average weight of pods harvested per plot (grouping of BUs) within each of the multiple harvests, and the total average weight of pods accumulated over the productive crop cycle. The accumulated value was divided by the number of BUs in each simulated plot, so that all statistical analyses were carried out on the same value scale.
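The grouping and accumulation described above can be sketched in Python as follows. The weights are hypothetical values chosen only to show how zeros in individual BUs disappear once BUs are grouped and harvests are accumulated; the function name and data layout (one list of BU weights per harvest) are assumptions.

```python
def accumulated_mean_per_bu(weights_by_harvest, plot_size):
    """For each harvest, return one value per plot: the pod weight accumulated
    up to that harvest, divided by the number of BUs in the plot."""
    n_bus = len(weights_by_harvest[0])
    if n_bus % plot_size != 0:
        raise ValueError("plot_size must divide the number of BUs in the row")
    n_plots = n_bus // plot_size
    running = [0.0] * n_plots  # accumulated weight per plot
    result = []
    for harvest in weights_by_harvest:
        for p in range(n_plots):
            start = p * plot_size
            running[p] += sum(harvest[start:start + plot_size])
        # Divide by plot size so every plot size is on the per-BU scale
        result.append([total / plot_size for total in running])
    return result


# Hypothetical weights (g) for a 6-BU row over three harvests;
# zeros mark BUs with no pods suitable for harvest.
row = [
    [0.0, 12.0, 0.0, 8.0, 0.0, 10.0],
    [20.0, 0.0, 16.0, 24.0, 0.0, 30.0],
    [10.0, 14.0, 0.0, 0.0, 18.0, 6.0],
]

by_bu = accumulated_mean_per_bu(row, 1)    # plots of 1 BU
by_plot = accumulated_mean_per_bu(row, 3)  # plots of 3 BUs
print(by_bu[0])
print(by_plot)
```

In this toy row, half of the single-BU plots are zero at the first harvest, while no 3-BU plot is.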
Levene's test (Levene, 1960) was used to check the homogeneity of variance among the multiple harvests, and the Durbin-Watson test (Durbin & Watson, 1950, 1951) to check for autocorrelation among residuals. To describe the behavior of the accumulated average weight of pods as the productive cycle advances, biologically based nonlinear regression models were adopted: a) logistic (Nelder, 1961; Seber & Wild, 1989) and b) von Bertalanffy, where: Yi = accumulated average weight of pods (in grams) harvested per BU; Xi = time elapsed, in days, from sowing until the observation of the variable (61, 74, 88 and 112 days after sowing in autumn-winter and 70, 91 and 99 days after sowing in spring-summer); β1 = parameter that represents the asymptotic weight; β2 = location-scale parameter; β3 = growth rate parameter; εi = random error. Since the models studied have three parameters to be estimated and the observations were made on four and three harvest dates, in autumn-winter and spring-summer respectively, the greatest possible number of observations (in basic experimental units) was used in each simulated plot size (Table 1). Thus, the experimental error adopted in the analyses of variance in all statistical tests was pure error, as recommended by Ezekiel & Fox (1959).
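For reference, one commonly used three-parameter form of each model, compatible with the parameter definitions above, is given below. These parametrizations are an assumption for illustration; the exact forms fitted in the study are not shown in this text and may differ (for example, in how β2 enters the exponent).

```latex
% Logistic model (assumed parametrization)
Y_i = \frac{\beta_1}{1 + \beta_2 \, e^{-\beta_3 X_i}} + \varepsilon_i

% von Bertalanffy model (assumed parametrization)
Y_i = \beta_1 \left( 1 - \beta_2 \, e^{-\beta_3 X_i} \right)^{3} + \varepsilon_i
```

In both forms, Y_i tends to β1 as X_i grows, which matches the interpretation of β1 as the asymptotic weight.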
To estimate the model parameters, the authors used the ordinary least squares method, with the Levenberg-Marquardt iterative process and Gaussian-type error (Madsen et al., 2004), building, for each situation, confidence intervals for each parameter estimate. To assess the goodness of fit and the explanatory capacity of each regression model, the adjusted coefficient of determination (R²aj), the standard error of the adjustment (SEA = √MSResidue) and the residual plots of each model were obtained. With this strategy, the authors could properly define the proposed nonlinear models, estimate more precisely the smallest plot size to be used for a given level of experimental accuracy, and explain the productive behavior of snap-beans.
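The two numeric quality indicators can be computed from observed and fitted values as in the minimal Python sketch below. The function name, the sample values and the use of n − p residual degrees of freedom are assumptions for illustration; the study used pure error as the experimental error, and the numbers shown are not taken from the trials.

```python
import math


def fit_quality(observed, predicted, n_params):
    """Return (adjusted R², standard error of the adjustment) for a fitted model."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))  # residual SS
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)              # total SS
    df_res = n - n_params                                            # assumed residual df
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / df_res
    sea = math.sqrt(ss_res / df_res)  # SEA = sqrt(mean square of residuals)
    return r2_adj, sea


# Hypothetical accumulated weights (g per BU) and fitted values from a 3-parameter model
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 30.0, 42.0, 48.0]

r2_adj, sea = fit_quality(observed, predicted, n_params=3)
print(round(r2_adj, 3), round(sea, 3))
```

A higher R²aj together with a lower SEA indicates a better fit, which is the pattern used later to compare plot sizes.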
All statistical analyses adopted a 5% probability of error and were carried out in the computer program Table Curve 3D (Jandel Scientific, 1993).

RESULTS AND DISCUSSION
Homogeneity of variance of the average weight of pods across the multiple harvests was observed in all growing environments. Nevertheless, this behavior was identified only in plots with 14 BU in the field trial (p-value = 0.06) and 6 BU in the protected environment trial (p-value = 0.06) in the autumn-winter season, and with 21 BU in the field trial (p-value = 0.15) and 2 BU in the protected environment trial (p-value = 0.068) in the spring-summer season. The authors also observed an increase in p-value with the increase in the number of BUs making up the plot (Table 1), which indicates a greater chance of obtaining homogeneous variances in larger plots made up of more plants.
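As a reminder of what is being tested, the Levene statistic compares the absolute deviations of each observation from its group mean, with harvests as the groups. The Python sketch below computes only the W statistic (Levene's original mean-centered version); the p-value would additionally require the F distribution with k − 1 and N − k degrees of freedom. The function name and the data are hypothetical.

```python
def levene_w(groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - mean of group i|
    deviations = []
    for g in groups:
        mean_g = sum(g) / len(g)
        deviations.append([abs(y - mean_g) for y in g])
    group_means = [sum(z) / len(z) for z in deviations]
    grand_mean = sum(sum(z) for z in deviations) / n_total
    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for z, m in zip(deviations, group_means) for v in z)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical per-plot weights (g) in two harvests with different spread
harvest_a = [1.0, 2.0, 3.0]
harvest_b = [0.0, 2.0, 4.0]

w = levene_w([harvest_a, harvest_b])
print(round(w, 3))
```

A small W (large p-value) is compatible with homogeneous variances among harvests.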
Regardless of environment and growing season, and of the number of BUs making up the plots, the Durbin-Watson statistics ranged from 1.57 to 1.87, which indicates, according to the classification presented by Souza (1998), independence of the residuals. Thus, the authors decided not to transform or weight the data when fitting the regression models proposed in the study.
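The Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals; values near 2 indicate no first-order autocorrelation. A minimal Python sketch follows, with hypothetical residuals that are not taken from the trials.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals given in observation order."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residuals from a fitted growth model
residuals = [0.5, -0.2, 0.1, -0.4, 0.3]

dw = durbin_watson(residuals)
print(round(dw, 2))
```

The statistic ranges from 0 (strong positive autocorrelation) to 4 (strong negative autocorrelation).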
The fitted nonlinear regression growth models always gave the same estimates of the three parameters for all plot sizes evaluated, since the average of the values observed within each plot size was used in order to keep a single value scale in the statistical analyses. The authors observed that the von Bertalanffy model always presented higher values of β1 and lower values of β3 than the logistic model (Tables 2 and 3). Since these estimates represent, respectively, the total average weight of pods harvested at the final phase of the productive cycle and the growth rate, the logistic [...] (Tables 2 and 3), reinforcing the choice of the logistic model as the best-performing one. The logistic model was also adopted in several other works, such as studies on eucalyptus (Calegario et al., 2005; Mafia et al., 2005), arrack (Hernández et al., 2007), corn (Lyra et al., 2008), banana (Maia et al., 2009) and cucumber (Vieira Neto et al., 2013), for being more general in its application and in its description of the behavior of the different variables evaluated in these studies.
In each growing season, adopting the logistic model for its better performance in describing the behavior of the average weight of pods, the authors verified that the field experiment presented a lower estimated total average weight of pods in the autumn-winter season and a higher value in the spring-summer season, when compared to the protected environment trials (Tables 2 and 3). This can be explained by the fact that the protected environment, the plastic tunnel in this case, favors plant production relative to field cultivation when the crop is grown outside its season, which, in the case of snap-beans, is spring-summer, mainly in southern Brazil. The use of environments with plastic covering during the growing seasons with lower temperatures is an alternative to minimize the effects of these low temperatures and to allow "out-of-season" cultivation and experiments, as it increases the thermal comfort of the plants. Comparing the logistic model estimates between the two growing seasons, the authors verified that the estimate of β1 was always higher in the spring-summer season: in the absence of problems with low temperatures, the performance of plants in the field tends to be superior to that of plants grown in environments with plastic covering.
Observing the behavior of the upper and lower limits of each of the estimates of the model parameters, the authors verified that as the plot size increased, in number of BUs, the range of the confidence interval also increased (Tables 2 and 3). This is caused by the change in the tabulated "t" statistic value, by the reduction in the number of plots within each growing row with an increase of the plot size and, consequently, the reduction in the degrees of freedom used in the estimation of the confidence interval limits.

Table 2. Estimates of the logistic and von Bertalanffy model parameters, with the respective lower (LI) and upper (LS) limits, for the mean weight of green beans in different plot sizes in basic units (1, 2, 3, 6, 7, 14 and 21 BU) and different environments (field experiment and protected environment experiment), cultivated during the autumn-winter season. Santa Maria, UFSM, 2015.
Regarding the quality of the fits, the logistic and the von Bertalanffy models showed very similar estimates of the quality indicators (Table 4). As the number of BUs in the plot increased, the estimates of R²aj increased, while the standard error of the adjustment (Table 4) and the model residuals decreased, regardless of the season and of the growing environment. Thus, the strategy of working with larger plot sizes was effective in reducing data variability. This can be explained by the reduction in the number of plots with zero values as the number of BUs making up the plot increases, which reduces the variance between plots within each harvest.
With respect to the number of BUs per plot, and adopting R²aj ≥ 0.70 as the lower limit, plots of 14 BUs (28 plants in the cultivation line direction) are a good option for the field trial in the autumn-winter season, and plots of six BUs (12 plants in the cultivation line direction) in the spring-summer season. For the protected environment experiments, plots of two BUs (four plants in the cultivation line direction) are recommended in the autumn-winter season and plots of seven BUs (14 plants in the cultivation line direction) in the spring-summer season (Table 4). Starting from these results, the researcher can establish larger plots if desired. With larger plots, R²aj is also greater and the models are more reliable, as there is a reduction in the mean square error of the model fits. The researcher should decide based on the level of accuracy and reliability required in the research and on the available field area, labor and resources.
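The selection rule above (smallest plot size whose R²aj reaches 0.70) can be written as a short Python sketch. The R²aj values below are hypothetical placeholders, not the values of Table 4, and the function name is illustrative.

```python
PLANTS_PER_BU = 2  # stated in the text: each basic unit has two plants


def smallest_adequate_plot(r2_adj_by_size, threshold=0.70):
    """Return the smallest plot size (in BUs) with adjusted R² >= threshold, or None."""
    adequate = [size for size, r2 in sorted(r2_adj_by_size.items())
                if r2 >= threshold]
    return adequate[0] if adequate else None


# Hypothetical R²aj per plot size (BUs); real values would come from the fitted models
r2_adj_by_size = {1: 0.31, 2: 0.42, 3: 0.48, 6: 0.61, 7: 0.66, 14: 0.74, 21: 0.80}

chosen = smallest_adequate_plot(r2_adj_by_size)
plants_in_line = chosen * PLANTS_PER_BU
print(chosen, plants_in_line)
```

Raising the threshold simply moves the recommendation to a larger plot, which reflects the trade-off between reliability and the field area, labor and resources available.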
The need for a larger plot size in field trials carried out in the autumn-winter season is due to the direct interference of the low temperatures and low solar radiation that are common in southern Brazil in this season. Under these conditions, plants are exposed to unfavorable conditions for their development and production, which alters the variability of the experiments. In the appropriate season for the crop, spring-summer, the plot sizes for the field and the protected environment experiments are about the same.

Table 1. Number of observations, in basic experimental units (BU), used in the analyses and p-value of Levene's test at 5% probability of error to evaluate the homogeneity of variances of the green beans mean weight among different plot sizes in two cultivation seasons. Santa Maria, UFSM, 2015.