
Study of repeatability and phenotypical stabilization in kale using frequentist, Bayesian and bootstrap resampling approaches

ABSTRACT.

The aim of this study was to obtain information for the genetic improvement of kale through repeatability and phenotypic stabilization studies and to compare methodologies that represent the reliability of the estimated parameters. Thirty-three half-sib progenies were evaluated in a randomized block design with three replicates and six plants per plot. Eight harvests were evaluated for yield of fresh leaves, number of shoots, number of leaves and average leaf mass. A phenotypic repeatability and stabilization study was then performed, estimating the variance components σ²a, σ²g and σ²e, the coefficient of environmental variation and the repeatability using frequentist and Bayesian methodologies. To evaluate the reliability of these estimates, intervals were obtained using the frequentist, Bayesian and bootstrap methods. Reliable selection of half-sib progenies of kale can be achieved with four harvests performed between 95 and 170 days after planting. The frequentist and Bayesian methodologies are better suited to obtaining reliable estimates of the genetic parameters evaluated, and the latter provided narrower intervals. The bootstrap methodologies are not recommended for phenotypic repeatability and stabilization studies in kale.

Keywords:
improvement; genetics; Brassica oleracea var. acephala; crops.

Introduction

Kale is a vegetable with nutritional characteristics of interest, such as high levels of carbohydrates, proteins, vitamins and fibre (Novo, Prela-Pantano, Trani, & Blat, 2010). However, there is a need for cultivars with higher productivity and resistance to the crop’s major pests (Azevedo et al., 2016a; Azevedo et al., 2017).

The selection of kale genotypes is based on several harvests (Azevedo et al., 2016a); thus, information on the minimum number of harvests is needed to avoid more evaluations than necessary, which would waste labour and financial resources (Cruz, Regazzi, & Carneiro, 2012). To establish how many and which harvests should be considered during progeny selection, repeatability and phenotypic stabilization studies can be performed (Martuscello et al., 2015).

For decision making, measures of the reliability of the estimates are important and are commonly expressed as confidence intervals (CI) (Khosravi, Nahavandi, Creighton, & Atiya, 2011; Severiano, Carriço, Robinson, Ramirez, & Pinto, 2011; Santos et al., 2011). There are several methodologies for estimating such intervals. The frequentist methodologies (Toral, Alencar, & Freitas, 2007) require knowledge of the probability distribution followed by the parameters of interest, as well as the associated number of degrees of freedom (Cecon, Silva, Nascimento, & Ferreira, 2012).

Computationally intensive methodologies such as bootstrap resampling can also be used for this purpose, dispensing with assumptions about the probability distribution of the parameters (Efron & Tibshirani, 1993). Another possible approach is the Bayesian one, in which credibility intervals are obtained from the posterior density (Zhang, Rojas, & Cuervo, 2010; Santos et al., 2011; Azevedo et al., 2017).

Given the importance of representing the uncertainty associated with parameter estimation in repeatability studies, a comparison of methodologies for this purpose is required. This study aimed to obtain information for the genetic improvement of kale through repeatability and phenotypic stabilization studies and to compare methodologies that represent the reliability of the estimated parameters.

Material and methods

Procurement and maintenance of progenies

The progeny test was conducted in the municipality of Diamantina, Minas Gerais State, Brazil, in the Olericulture Sector of the Federal University of the Jequitinhonha and Mucuri Valleys (UFVJM). The JK Campus is located at an altitude of 1,400 m, at 18º 9' S latitude and 43º 21' W longitude. The predominant soil is a Typic Orthic Quartzarenic Neosol (Santos et al., 2018).

Seeds from 33 half-sib families (progenies), obtained from surveys conducted in Viçosa, Minas Gerais State, Brazil (Azevedo et al., 2016b), were used. Seeding was performed in 72-cell trays filled with commercial Plantimax® substrate and kept in a greenhouse for approximately 50 days. Soil preparation consisted of ploughing and two harrowings, followed by the formation of beds 1.2 m wide and 30 cm high. The seedlings were planted 0.50 m apart. Planting and top-dressing fertilization, irrigation, and pest and disease management were performed according to the needs of the crop.

The experiment was set up in a randomized block design with 33 treatments (half-sib progenies), four replicates and six plants per plot. After 30 days from planting, eight harvests were taken at biweekly intervals, and the yield of fresh marketable leaves, the number of shoots, the number of leaves and the fresh mass per marketable leaf were evaluated. Leaves longer than 15 cm and without signs of senescence or damage caused by pests and diseases were considered marketable (Azevedo et al., 2012).

Obtaining estimates for the repeatability study using the frequentist method (ANOVA) and Bayesian method

For the repeatability study, the statistical model proposed by Cruz, Regazzi, and Carneiro (2012) was used: $Y_{ij} = m + g_i + a_j + e_{ij}$, where $Y_{ij}$ is the observation in the plot that received the i-th family (i = 1, 2, ..., 33 families) in the j-th harvest (j = 1, 2, ..., 8 harvests); $m$ is the overall average; $g_i$ is the random effect of the i-th family under the influence of the permanent environment; $a_j$ is the fixed effect of the temporary environment in the j-th harvest; and $e_{ij}$ is the experimental error established by the temporary effects of the environment on the i-th family in the j-th harvest.

The repeatability was estimated using the ANOVA (frequentist) method from the genotypic variance, $\hat{\sigma}_g^2 = (QMG - QMR)/a$; the temporary environmental (harvest) variance, $\hat{\sigma}_a^2 = (QMA - QMR)/g$; and the residual variance, $\hat{\sigma}_e^2 = QMR$, where QMG, QMA and QMR are the mean squares for families, harvests and residual, respectively, a is the number of harvests and g is the number of families. From these variance components, the repeatability coefficient, $r = \hat{\sigma}_g^2/(\hat{\sigma}_g^2 + \hat{\sigma}_e^2)$; the residual coefficient of variation, $CV_e(\%) = 100\sqrt{QMR}/m$; the coefficient of determination, $R^2 = nr/[1 + r(n-1)]$; and the optimal number of harvests, $\eta_0 = R^2(1-r)/[(1-R^2)\,r]$, were determined, where n is the number of harvests.
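As a minimal sketch of how the last two expressions behave, the functions below evaluate the coefficient of determination for n harvests and the number of harvests needed to reach a target R². The function names and the repeatability value 0.63 are chosen here for illustration only; this is not the authors' code.

```python
def determination_coefficient(r: float, n: float) -> float:
    """R^2 = n r / (1 + r (n - 1)) for repeatability r and n harvests."""
    return n * r / (1 + r * (n - 1))


def harvests_needed(r: float, target_r2: float) -> float:
    """eta_0 = R^2 (1 - r) / ((1 - R^2) r): harvests needed to reach target_r2."""
    return target_r2 * (1 - r) / ((1 - target_r2) * r)


if __name__ == "__main__":
    r = 0.63  # illustrative repeatability value (assumption)
    for n in range(1, 9):
        print(n, round(determination_coefficient(r, n), 4))
    print("harvests for R^2 = 0.80:", round(harvests_needed(r, 0.80), 2))
```

The two functions are inverses of each other: plugging the value returned by `harvests_needed` back into `determination_coefficient` recovers the target R². In practice the fractional result is rounded up to a whole number of harvests.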

Bayes' theorem was also used to estimate the above-mentioned parameters. Assuming that $e_{ij} \sim N(0, \sigma_e^2)$, the sampling distribution of the observed data (likelihood function) is $y_{ij} \mid m, g_i, a_j, \sigma_e^2 \sim N(m + g_i + a_j, \sigma_e^2)$. For the location parameters, the following prior distributions were considered: $m \mid u_m, \sigma_m^2 \sim N(u_m, \sigma_m^2)$, $g_i \mid u_g, \sigma_g^2 \sim N(u_g, \sigma_g^2)$ and $a_j \mid u_a, \sigma_a^2 \sim N(u_a, \sigma_a^2)$. For the variance components, the scaled inverse chi-squared distribution was used as the prior: $\sigma_g^2 \mid v_g, s_g \sim v_g s_g \chi_{v_g}^{-2}$, $\sigma_a^2 \mid v_a, s_a \sim v_a s_a \chi_{v_a}^{-2}$ and $\sigma_e^2 \mid v_e, s_e \sim v_e s_e \chi_{v_e}^{-2}$.

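Therefore, the joint posterior distribution can be represented by:

$$
\begin{aligned}
P(\theta \mid y) \propto{} & \prod_{i=1}^{33}\prod_{j=1}^{8} (\sigma_e^2)^{-0.5} \exp\left\{-\frac{[y_{ij} - (m + g_i + a_j)]^2}{2\sigma_e^2}\right\} \times (\sigma_m^2)^{-0.5} \exp\left\{-\frac{(m - u_m)^2}{2\sigma_m^2}\right\} \\
& \times \prod_{i=1}^{33} (\sigma_g^2)^{-0.5} \exp\left\{-\frac{(g_i - u_g)^2}{2\sigma_g^2}\right\} \times (\sigma_g^2)^{-\left(\frac{v_g}{2} + 1\right)} \exp\left\{-\frac{v_g s_g}{2\sigma_g^2}\right\} \\
& \times \prod_{j=1}^{8} (\sigma_a^2)^{-0.5} \exp\left\{-\frac{(a_j - u_a)^2}{2\sigma_a^2}\right\} \times (\sigma_a^2)^{-\left(\frac{v_a}{2} + 1\right)} \exp\left\{-\frac{v_a s_a}{2\sigma_a^2}\right\} \\
& \times (\sigma_e^2)^{-\left(\frac{v_e}{2} + 1\right)} \exp\left\{-\frac{v_e s_e}{2\sigma_e^2}\right\}
\end{aligned}
$$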

For the statistical analysis, the rjags package of the R software (R Core Team, 2016) was used. Because no previous work with half-sib progenies of kale was available, vague (non-informative) priors were used. Thus, for the location effects, $u_i = 0$ (i = m, g, a, e). For the variance component of the overall average, $\sigma_m^2 = 0.001$ was stipulated, and for the other variance components, $v_i/2 = s_i/2 = 0.001$ (i = m, g, a, e). Markov chain Monte Carlo (MCMC) chains were run with 1,000,000 iterations per trait. A burn-in of 100,000 iterations and a thinning interval of 500 iterations were used, resulting in a final sample of 1,800 iterations for each trait.
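The final sample size follows directly from the chain settings. The short check below (a hypothetical helper, not part of the rjags workflow) reproduces that arithmetic.

```python
def retained_samples(iterations: int, burn_in: int, thin: int) -> int:
    """Number of draws kept after discarding the burn-in and keeping every thin-th draw."""
    return (iterations - burn_in) // thin


if __name__ == "__main__":
    # Settings reported in the text: 1,000,000 iterations, burn-in 100,000, thinning 500.
    print(retained_samples(1_000_000, 100_000, 500))  # 1800
```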

Determination of confidence intervals of estimates for the repeatability study using the frequentist method

The confidence intervals were determined using the frequentist and bootstrap resampling methods. To obtain the CIs for the variance components, the following estimator was used: $CI_{\sigma^2} = \left[ f\hat{\sigma}^2/u_2;\ f\hat{\sigma}^2/u_1 \right]$, where $\sigma^2$ is the variance component; f is the number of degrees of freedom associated with it; and $u_1 = \chi^2_{f;\,\alpha/2}$ and $u_2 = \chi^2_{f;\,1-\alpha/2}$ are quantiles of the chi-square ($\chi^2$) distribution. To obtain the degrees of freedom of $\sigma_g^2$, the Satterthwaite method was used, where $gl_{\sigma_g^2} = \dfrac{(QMG - QMR)^2}{\dfrac{(QMG)^2}{g-1} + \dfrac{(QMR)^2}{(g-1)(a-1)}}$. For the residual coefficient of variation, the estimator $CI_{CV_e} = \left[ \dfrac{CV}{\sqrt{\left(\dfrac{u_2 + 2}{v + 1} - 1\right)CV^2 + \dfrac{u_2}{v}}};\ \dfrac{CV}{\sqrt{\left(\dfrac{u_1 + 2}{v + 1} - 1\right)CV^2 + \dfrac{u_1}{v}}} \right]$ was used, where v is the number of degrees of freedom associated with $\sigma_e^2$, and $u_1 = \chi^2_{v;\,1-\alpha/2}$ and $u_2 = \chi^2_{v;\,\alpha/2}$ are quantiles of the chi-square ($\chi^2$) distribution. For the repeatability coefficient, the estimator used was $CI_r = \left[ 1 - \left( \dfrac{QMT + (a-1)QMR}{a\,QMR} \times F_{\alpha/2;\,n_1;\,n_2} \right)^{-1};\ 1 - \left( \dfrac{QMT + (a-1)QMR}{a\,QMR} \times F_{1-\alpha/2;\,n_1;\,n_2} \right)^{-1} \right]$, where F refers to the quantile of the F distribution.
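The Satterthwaite step can be sketched as follows. The mean squares in the usage lines are made-up values for illustration only; the numbers of families (33) and harvests (8) come from the text, and the function name is hypothetical.

```python
def satterthwaite_df(qmg: float, qmr: float, g: int, a: int) -> float:
    """Approximate degrees of freedom of the genotypic variance estimate.

    qmg, qmr: mean squares for families and residual.
    g, a: number of families and number of harvests.
    """
    numerator = (qmg - qmr) ** 2
    denominator = qmg ** 2 / (g - 1) + qmr ** 2 / ((g - 1) * (a - 1))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical mean squares (assumption); 33 families and 8 harvests as in the text.
    df = satterthwaite_df(qmg=40.0, qmr=10.0, g=33, a=8)
    print(round(df, 2))
```

The approximate degrees of freedom are generally not an integer, so the chi-square quantiles used in the interval must be evaluated for fractional degrees of freedom.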

Determination of estimates of confidence intervals to study repeatability using the bootstrap method

In the bootstrap method, 1,000 samples were generated by re-sampling with replacement of the original data, always maintaining information for the respective family and harvest for each value. For each of the 1,000 samples, all previously mentioned parameters were estimated, considering the same statistical expressions quoted for the frequentist analysis. Thus, for each parameter, a distribution was obtained, which was referred to as the bootstrap empirical distribution.

For the percentile method, the values of the bootstrap empirical distribution were sorted. The values at positions 25 (1,000 × α/2) and 975 (1,000 × (1 − α/2)) were taken as the lower and upper limits of the confidence interval (α = 0.05).
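A minimal sketch of the percentile step, assuming the bootstrap estimates of a parameter have already been computed and stored in a list (the function name and the simulated values in the usage lines are illustrative, not the study's data):

```python
import random


def percentile_interval(boot_estimates, alpha=0.05):
    """Percentile bootstrap limits: the sorted values at positions B*alpha/2 and B*(1 - alpha/2)."""
    ordered = sorted(boot_estimates)
    b = len(ordered)
    lower_pos = round(b * alpha / 2)        # 25 when b = 1000 and alpha = 0.05
    upper_pos = round(b * (1 - alpha / 2))  # 975 when b = 1000 and alpha = 0.05
    # Positions are 1-based in the text, so subtract 1 for Python indexing.
    return ordered[lower_pos - 1], ordered[upper_pos - 1]


if __name__ == "__main__":
    rng = random.Random(1)
    # Stand-in for 1,000 bootstrap estimates of repeatability (simulated, assumption).
    fake_estimates = [rng.gauss(0.55, 0.05) for _ in range(1000)]
    print(percentile_interval(fake_estimates))
```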

In the bias-corrected percentile bootstrap (BCPB) confidence interval, the limits are adjusted percentiles of the bootstrap distribution, which corrects the bias resulting from the asymmetry of that distribution. First, the proportion $p_0$ of bootstrap replicates smaller than the estimate $\hat{\theta}$ obtained from the original sample by the frequentist analysis is found. From $p_0$, the bias-correction parameter is defined as $z_0 = \Phi^{-1}(p_0)$, where $\Phi^{-1}$ is the inverse of the cumulative distribution function of the standard normal distribution. The limits of the confidence interval are then obtained by selecting the estimates at positions 1,000 × LI, with $LI = \Phi(2z_0 - z_{\alpha/2})$, and 1,000 × LS, with $LS = \Phi(2z_0 + z_{\alpha/2})$, where $z_{\alpha/2}$ is the upper α/2 point of the standard normal distribution.
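The bias-corrected limits can be sketched with the standard library alone. This is an illustration under the assumption that $z_{\alpha/2}$ denotes the upper α/2 point of the standard normal distribution (about 1.96 for α = 0.05); the function name is hypothetical.

```python
from statistics import NormalDist


def bias_corrected_interval(boot_estimates, original_estimate, alpha=0.05):
    """Bias-corrected percentile bootstrap limits.

    p0 is the share of bootstrap estimates below the original estimate;
    it must lie strictly between 0 and 1, otherwise inv_cdf raises an error.
    """
    std_normal = NormalDist()
    ordered = sorted(boot_estimates)
    b = len(ordered)
    p0 = sum(est < original_estimate for est in ordered) / b
    z0 = std_normal.inv_cdf(p0)
    z = std_normal.inv_cdf(1 - alpha / 2)  # upper alpha/2 point (assumed convention)
    li = std_normal.cdf(2 * z0 - z)
    ls = std_normal.cdf(2 * z0 + z)
    # Convert the adjusted levels into 1-based positions, kept inside 1..b.
    lower_pos = max(1, round(b * li))
    upper_pos = min(b, round(b * ls))
    return ordered[lower_pos - 1], ordered[upper_pos - 1]
```

When exactly half of the bootstrap estimates fall below the original estimate, $z_0 = 0$ and the limits coincide with those of the plain percentile method. When more than half fall below it, both limits shift upwards.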

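In addition to correcting for the asymmetry of the bootstrap empirical distribution, the bias-corrected and accelerated (BCa) method uses an acceleration constant. In this method, the constant $\hat{z}_0$ is estimated as mentioned previously. The acceleration constant is obtained by $\hat{a} = \dfrac{\sum_{i=1}^{n} (\hat{\theta}_{\cdot} - \hat{\theta}_i)^3}{6\left[\sum_{i=1}^{n} (\hat{\theta}_{\cdot} - \hat{\theta}_i)^2\right]^{3/2}}$, where $\hat{\theta}_i$ is the estimate of the parameter obtained by the jackknife methodology (leaving out the i-th observation) and $\hat{\theta}_{\cdot}$ is the average of the $\hat{\theta}_i$. From the acceleration constant, the lower and upper limits of the confidence interval are obtained by selecting the estimates at positions 1,000 × LI, with $LI = \Phi\left(\hat{z}_0 + \dfrac{\hat{z}_0 - z_{\alpha/2}}{1 - \hat{a}(\hat{z}_0 - z_{\alpha/2})}\right)$, and 1,000 × LS, with $LS = \Phi\left(\hat{z}_0 + \dfrac{\hat{z}_0 + z_{\alpha/2}}{1 - \hat{a}(\hat{z}_0 + z_{\alpha/2})}\right)$, with $z_{\alpha/2}$ the upper α/2 point of the standard normal distribution.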
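The acceleration constant and the adjusted levels can be sketched as below. The code assumes the usual leave-one-out jackknife and takes $z_{\alpha/2}$ as the upper α/2 point of the standard normal distribution, so that the levels reduce to the bias-corrected ones when the acceleration is zero. All function names are illustrative.

```python
from statistics import NormalDist, mean


def jackknife_estimates(data, statistic):
    """Leave-one-out estimates: the statistic recomputed with each observation removed."""
    return [statistic(data[:i] + data[i + 1:]) for i in range(len(data))]


def acceleration(jack_estimates):
    """Acceleration constant from jackknife estimates (undefined if they are all equal)."""
    theta_dot = mean(jack_estimates)
    diffs = [theta_dot - t for t in jack_estimates]
    numerator = sum(d ** 3 for d in diffs)
    denominator = 6 * sum(d ** 2 for d in diffs) ** 1.5
    return numerator / denominator


def bca_levels(z0, a_hat, alpha=0.05):
    """Adjusted lower and upper levels (LI, LS) of the BCa interval."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)  # upper alpha/2 point (assumed convention)
    li = std_normal.cdf(z0 + (z0 - z) / (1 - a_hat * (z0 - z)))
    ls = std_normal.cdf(z0 + (z0 + z) / (1 - a_hat * (z0 + z)))
    return li, ls


if __name__ == "__main__":
    sample = [2.1, 3.4, 1.9, 5.6, 2.8, 3.1]  # made-up data (assumption)
    a_hat = acceleration(jackknife_estimates(sample, mean))
    print(a_hat, bca_levels(z0=0.1, a_hat=a_hat))
```

Multiplying LI and LS by the number of bootstrap samples gives the positions in the sorted bootstrap distribution from which the limits are read.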

Determination of estimated credibility intervals for the Bayesian repeatability study

The chains obtained by the MCMC method were tested for convergence using the Geweke test and checked for the absence of autocorrelation. After these criteria were verified, the mean, median, mode and credibility intervals of each parameter were estimated from its posterior distribution using the ‘boa’ package in the R software (R Core Team, 2016).

Results

Estimates of σ²a were higher than those found for σ²g in all characteristics (Table 1). This indicates the predominance of the temporary environmental effect (harvests) over genetic effects, which hinders selection in breeding programmes. Often, the estimates of the variance components (σ²a, σ²g, and σ²e) and the coefficient of residual variation obtained by the frequentist analysis were higher than those given by the mode of the posterior density in the Bayesian approach. However, the repeatability estimates obtained using these two methods were close (Table 1). Repeatability estimates for frequentist and Bayesian methodologies ranged from 0.52 to 0.55 (for leaf yield, number of shoots and fresh leaf mass), while for leaf number, the estimate was 0.63 by the frequentist method and 0.64 by the Bayesian method.

For the bootstrap methodology, the percentile and BCPB techniques, although they generally presented CIs of smaller amplitude, were not consistent: the parameter estimates did not lie between the lower and upper limits of the confidence intervals for σ²e and CVe of the characteristics "number of shoots" and "number of leaves". The same was observed for the estimated repeatability of the number of leaves with the BCPB method. However, the BCa technique, in addition to presenting CIs of low amplitude, always contained the estimates of the variance components (Table 1).

Table 1
Parameters estimated by frequentist, bootstrap and Bayesian analysis, followed by confidence interval (CI) and credibility interval.

The bootstrap CIs contained the repeatability estimates. The same was observed for the frequentist and Bayesian methods, with little variation in amplitude. Likewise, for the variance components, the frequentist and Bayesian intervals always contained the estimates, with the amplitude of the frequentist intervals generally greater than that of the Bayesian intervals.

From the repeatability study, three harvests are already sufficient to obtain a coefficient of determination above 80% (83.45%) for the number of leaves (Figure 1). For the other characteristics, four harvests are needed to exceed 80% (with four harvests, 82.01, 81.22, 87.21, and 82.90% for leaf yield, number of shoots, number of leaves and fresh mass per leaf, respectively). For all characteristics, the mode of the coefficient of determination was always between the lower and upper limits of the credibility interval (Figure 1).

Figure 1
Estimation of coefficient of determination related to number of harvests estimated by Bayesian inference and their respective credibility intervals for leaf yield (A), number of leaves (B), number of shoots (C), and fresh mass per leaf (D) in families of half-sibs of kale.

With the frequentist methodology (Figure 2), the ideal number of harvests leads to the same conclusions drawn from Figure 1. However, for the confidence intervals estimated by the bootstrap methodologies, it was verified that the estimates of the coefficient of determination were very close to the percentile method and very close to the upper limit of the BCPB and BCa methods.

The stabilization by the frequentist analysis showed that for the leaf yield variable, considering the optimum number of four harvests for selection of this variable (Figure 1), harvests 5, 6, 7, and 8 provided the highest repeatability and coefficient of determination (Table 2).

For the number of leaves and fresh mass per leaf, harvests 6 and 7 were already sufficient for reliable selection (R² greater than 80%), whereas for the number of shoots, harvests 3, 4, and 5 would already be sufficient (Table 2). The highest coefficient of determination was obtained when all harvests were evaluated. Considering three harvests as ideal for selection based on the number of leaves, harvests 6, 7, and 8 would be most suitable for this purpose. For fresh mass per leaf and number of shoots, whose optimal number of harvests was four (Figure 1), harvests 5, 6, 7, and 8 were the most suitable (Table 2).

Figure 2
Estimation of coefficient of determination as a function of ideal harvest number, estimated by bootstrap inference by the percentile, BCa and BCPB techniques, for leaf yield (A), leaf number (B), number of shoots (C), and fresh mass per leaf (D) in families of half-sibs of kale.

Table 2
Study of phenotypic stability by frequentist analysis for leaf productivity, number of leaves, number of shoots and fresh mass per leaf in families of half-sib of kale.

Stabilization by the Bayesian methodology (Table 3) provided information similar to that obtained by the frequentist method (Table 2). The same was also observed for the credibility interval.

Table 3
Study of phenotypic stability by Bayesian analysis for leaf productivity, number of leaves, number of shoots and fresh mass per leaf in families of half-sib of kale.

When analysing stabilization by bootstrap techniques, it was observed that, as occurred for the frequentist and Bayesian methodology, the increase in the number of harvests reduced the amplitude of the CI, but in several situations, these intervals did not include the repeatability estimates (Tables 4 and 5).

Table 4
Study of phenotypic stability by bootstrap analysis for leaf yield and number of leaves in families of half-sibs of kale and their respective confidence intervals.
Table 5
Study of phenotypic stability by Bootstrap analysis for number of shoots and fresh mass per leaf in families of half-sibs of kale and their respective confidence intervals.

Discussion

In multi-harvest crops such as kale, it is important to perform repeatability studies (Chia, Lopes, Cunha, Rocha, & Lopes, 2009; Azevedo et al., 2016a). These studies establish the number of harvests that must be performed for efficient selection of the best progenies (Sobrinho, Borges, Lédo, & Kopp, 2010; Bruna, Moreto, & Dalbó, 2012). The environment directly influenced the performance of the quantitative characteristics of kale, especially fresh leaf yield, since temporary environmental variance predominated over genotypic variance. This makes the selection of genotypes more difficult, since their effects are confounded in the total variation (Cruz et al., 2012).

The lower repeatability estimates for leaf yield, number of shoots and fresh mass per leaf may be associated with greater sensitivity to environmental factors, such as climatic conditions, fertilization, cultural treatments, and irrigation. However, the higher values of repeatability for the number of leaves may indicate a lower influence of environmental effects. It is important to remember that a higher emission of leaves is not necessarily related to higher productivity.

The repeatability estimate can vary from 0 to 1, and high coefficients allow prediction of the real value of a given characteristic with only a few measurements (Oliveira & Moura, 2010). Repeatability estimates above 0.5 were found for all evaluated characteristics, which, according to Padilha, Oliveira, and Mota (2003), indicates good reliability for selection. This coefficient is also important because it expresses the maximum value that broad-sense heritability can reach (Cruz et al., 2012).

Obtaining the variance components is of fundamental importance for estimating genetic parameters such as repeatability (Resende, 2016), which guide decision-making within the breeding programme. Several statistical methodologies, such as frequentist and Bayesian methods, can be used to estimate these parameters; the choice depends on the type of data, the presence or absence of imbalance, and the researcher's familiarity with the technique.

The parameter estimates obtained in this research by frequentist analysis and Bayesian inference were largely similar. This is a consequence of the use of non-informative (vague) priors in the Bayesian methodology. In this case, the likelihood dominates the prior, and the point estimates (mean, mode and median) of the parameters are very close to those obtained by the ANOVA method. Although different methods are used to study these genetic parameters, frequentist analysis is the most widely used, especially because its calculations are less complex. However, the frequentist technique is limited to point estimates of the variance components (Xie & Singh, 2013), whereas the Bayesian approach provides a distribution for every parameter (the posterior distribution). One of the most frequently used frequentist methods is based on ANOVA, but it can yield negative estimates of variance components (Karaman, Firat, & Narinc, 2014), which are of no use to the breeder and do not occur in Bayesian inference.

The adoption of confidence intervals for estimates is important because it allows conclusions to take into account the error associated with their estimation (Kenz, Banks, & Smith, 2013). These intervals can also be obtained using different methodologies, such as frequentist, bootstrap and Bayesian methods. The frequentist and Bayesian methodologies were efficient in obtaining these intervals, which always included the estimates. Among the bootstrap techniques, BCa was the most successful, since it presented narrower confidence intervals and contained the parameter estimates between its lower and upper limits more frequently. This better accuracy of the BCa technique is associated with the use of the acceleration constant (Haukoos & Lewis, 2005).

As expected, the coefficient of determination increased with the number of harvests, with four harvests giving estimates above 80% for all characteristics. This value is considered high and allows satisfactory precision in selection (Martuscello, Jank, Fonseca, Cruz, & Cunha, 2007). Four harvests is slightly more than the number reported by Azevedo et al. (2016a), who, evaluating kale clones, recommended three harvests as sufficient for reliable selection (R² greater than 80%).

In addition to the study of the optimal number of evaluations, the phenotypic stabilization study is of great importance in plant selection, since it allows verification of which harvests provide greater efficiency (Martuscello et al., 2015). The evaluation of non-stabilized genotypes may suffer from low repeatability, and in these situations increasing the number of replications does not always alleviate the problem (Pereira et al., 2002), because genotypes perform differently depending on the stage of development of the crop (Martuscello, Jank, Fonseca, Cruz, & Cunha, 2007). Considering four harvests as the ideal, the repeatability study showed that harvests 5, 6, 7, and 8 (between 95 and 170 days after planting) provide higher determination coefficients for most characteristics, especially leaf yield. Therefore, in breeding programmes, these harvests should be prioritized.

The Bayesian and frequentist techniques proved adequate for obtaining the intervals in the phenotypic stabilization study. The three bootstrap techniques showed limitations, since in some cases the repeatability estimates fell outside the resulting confidence intervals (CIs). The poor performance of the bootstrap methodologies stems from the resampling of the original data: the resamples contain a greater number of repeated values, which causes the repeatability estimates to be overestimated. As a result, the repeatability values obtained in the bootstrap samples are generally larger than the estimate obtained in the original sample, and in some situations the repeatability value of the original sample is lower than the lower limit of the confidence interval obtained by the percentile method. This problem is mitigated by the asymmetry correction applied in the BCPB and BCa methods, but it is not always solved. With a smaller sample size, as in the stabilization study, which involves few evaluations, these problems are aggravated. Therefore, bootstrap methodologies should be avoided when obtaining confidence intervals in repeatability studies.
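
The mechanism described above can be illustrated with a short sketch of a percentile bootstrap interval for repeatability. Several assumptions apply: repeatability is estimated here with a simplified one-way ANOVA estimator, the resampling unit is the harvest (the columns of the data matrix), and the function names and default settings are hypothetical. The resampling scheme actually used in the study is not stated in this section, and the BCPB and BCa corrections are not implemented.

```python
import random


def repeatability(data):
    """One-way ANOVA repeatability; data[i][j] is genotype i at harvest j."""
    g, n = len(data), len(data[0])
    grand = sum(map(sum, data)) / (g * n)
    means = [sum(row) / n for row in data]
    # Mean squares between genotypes and within genotypes (residual).
    ms_g = n * sum((m - grand) ** 2 for m in means) / (g - 1)
    ms_e = sum((x - m) ** 2 for row, m in zip(data, means) for x in row)
    ms_e /= g * (n - 1)
    var_g = (ms_g - ms_e) / n
    return var_g / (var_g + ms_e)


def percentile_interval(data, n_boot=2000, alpha=0.05, seed=1):
    """Percentile interval from resampling harvests with replacement."""
    rng = random.Random(seed)
    n = len(data[0])
    estimates = []
    for _ in range(n_boot):
        # Assumption: whole harvests (columns) are the resampling unit.
        cols = [rng.randrange(n) for _ in range(n)]
        resample = [[row[j] for j in cols] for row in data]
        estimates.append(repeatability(resample))
    estimates.sort()
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper
```

In this sketch, a resample that repeats the same harvest contributes identical values within each genotype, which shrinks the residual mean square and pushes the estimate upward. With only a few harvests, such duplicated columns are frequent, which is consistent with the aggravation noted above for small samples.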

Conclusion

Highly reliable selection of kale half-sib progenies can be achieved with only four harvests.

Harvests between 95 and 170 days after planting (harvests 5, 6, 7, and 8) should be prioritized because they provide greater phenotypic stability.

The frequentist and Bayesian analyses are efficient for estimating confidence intervals and credibility intervals, respectively. Bayesian inference yielded narrower intervals than the frequentist methodology.

The bootstrap percentile, BCPB and BCa techniques are not recommended for obtaining confidence intervals in phenotypic repeatability and stabilization studies.

Acknowledgements

The authors thank the National Council for Scientific and Technological Development (CNPq), the Foundation for Research Support of the State of Minas Gerais (FAPEMIG) and the Coordination for the Improvement of Higher Education Personnel (CAPES) for granting scholarships and resources for the development of the project.

References

  • Azevedo, A. M., Andrade Júnior, V. C., Pedrosa, C. E., Fernandes, J. S. C., Valadares, N. R., Ferreira, M. A. M., & Martins, R. A. V. (2012). Desempenho agronômico e variabilidade genética em genótipos de couve. Pesquisa Agropecuária Brasileira, 47(12), 1751-1758. DOI: 10.1590/S0100-204X2012001200011.
    » https://doi.org/10.1590/S0100-204X2012001200011.
  • Azevedo, A. M., Andrade Júnior, V. C., Pedrosa, C. E., Valadares, N. R., Andrade, R. F., & Souza, J. R. S. (2016a). Estudo da repetibilidade genética em clones de couve. Horticultura Brasileira, 34(1), 54-58. DOI: 10.1590/S0102-053620160000100008.
    » https://doi.org/10.1590/S0102-053620160000100008.
  • Azevedo, A. M., Seus, R., Gomes, C. L., Freitas, E. M., Candido, D. M., Silva, D. J. H., & Carneiro, P. C. S. (2016b). Correlações genotípicas e análise de trilha em famílias de meios-irmãos de couve de folhas. Pesquisa Agropecuária Brasileira, 51(1), 35-44. DOI: 10.1590/S0100-204X2016000100005.
    » https://doi.org/10.1590/S0100-204X2016000100005
  • Azevedo, A. M., Andrade Júnior, V. C., Santos, A. A., Sousa Júnior, Oliveira, A. J. M., & Ferreira, M. A. M. (2017). Population parameters and selection of kale genotypes using Bayesian inference in a multi-trait linear model. Acta Scientiarum. Agronomy, 39(1), 25-31. DOI: 10.4025/actasciagron.v39i1.30856.
    » https://doi.org/10.4025/actasciagron.v39i1.30856.
  • Bruna, E. D., Moreto, A. L., & Dalbó, M. A. (2012). Uso do coeficiente de repetibilidade na seleção de clones de pessegueiro para o litoral sul de Santa Catarina. Revista Brasileira de Fruticultura, 34(1), 206-215. DOI: 10.1590/S0100-29452012000100028.
    » https://doi.org/10.1590/S0100-29452012000100028.
  • Cecon, P. R., Silva, A. R., Nascimento, M., & Ferreira, A. (2012). Métodos estatísticos. Viçosa, MG: UFV.
  • Chia, G. S., Lopes, R., Cunha, R. N. V., Rocha, R. N. C., & Lopes, M. T. G. (2009). Repetibilidade da produção de cachos de híbridos interespecíficos entre o caiaué e o dendezeiro. Acta Amazonica, 39(2), 249-254. DOI: 10.1590/S0044-59672009000200001.
    » https://doi.org/10.1590/S0044-59672009000200001.
  • Cruz, C. D., Regazzi, A. J., & Carneiro, P. C. S. (2012). Modelos biométricos aplicados ao melhoramento genético. Viçosa, MG: UFV.
  • Efron, B., & Tibshirani, R. J. (1993). An introduction to the Bootstrap. New York, US: Chapman & Hall.
  • Haukoos, J. S., & Lewis, R. J. (2005). Advanced statistics: Bootstrapping confidence intervals for statistics with “difficult” distributions. Academic Emergency Medicine, 12(4), 360-365. DOI: 10.1197/j.aem.2004.11.018.
    » https://doi.org/10.1197/j.aem.2004.11.018.
  • Karaman, E., Firat, M. Z., & Narinc, D. (2014). Single-trait bayesian analysis of some growth traits in japanese quail. Brazilian Journal of Poultry Science, 16(2), 51-56. DOI: 10.1590/1516-635x160251-56.
    » https://doi.org/10.1590/1516-635x160251-56.
  • Kenz, Z. R., Banks, H. T., & Smith, R. C. (2013). Comparison of frequentist and bayesian confidence analysis methods on a viscoelastic stenosis model. Society for Industrial and Applied Mathematics and American Statistical Association, 1(1), 348-369. DOI: 10.1137/130917867.
    » https://doi.org/10.1137/130917867.
  • Khosravi, A., Nahavandi, S., Creighton, D., & Atiya, A. F. (2011). Lower upper bound estimation method for construction of neural network-based prediction intervals. IEEE Transactions on Neural Networks, 22(3), 337-346. DOI: 10.1109/TNN.2010.2096824.
    » https://doi.org/10.1109/TNN.2010.2096824.
  • Martuscello, J. A., Braz, T. G. S., Jank, L., Cunha, D. N. F. V., Lima, B. P. S., & Oliveira, L. P. (2015). Repeatability and phenotypic stabilization of Panicum maximum accessions. Acta Scientiarum. Animal Sciences, 37(1), 15-21. DOI: 10.4025/actascianimsci.v37i1.23206.
    » https://doi.org/10.4025/actascianimsci.v37i1.23206.
  • Martuscello, J. A., Jank, L., Fonseca, D. M., Cruz, C. D., & Cunha, D. N. F. V. (2007). Repetibilidade de caracteres agronômicos em Panicum maximum Jacq. Revista Brasileira de Zootecnia, 36(6), 1975-1981 (supl.). DOI: 10.1590/S1516-35982007000900005.
    » https://doi.org/10.1590/S1516-35982007000900005
  • Novo, M. C. S. S., Prela-Pantano, A., Trani, P. E., & Blat, S. F. (2010). Desenvolvimento e produção de genótipos de couve-manteiga. Horticultura Brasileira, 28(3), 321-325. DOI: 10.1590/S0102-05362010000300014.
    » https://doi.org/10.1590/S0102-05362010000300014.
  • Oliveira, M. S. P., & Moura, E. F. (2010). Repetibilidade e número mínimo de medições para caracteres de cacho de bacabi (Oenocarpus mapora). Revista Brasileira de Fruticultura, 32(4), 1173-1179. DOI: 10.1590/S0100-29452010005000120.
    » https://doi.org/10.1590/S0100-29452010005000120.
  • Padilha, N. C. C., Oliveira, M. S. P., & Mota, M. G. C. (2003). Estimativa da repetibilidade em caracteres morfológicos e de produção de palmito em pupunheira (Bactris gasipaes Kunth). Revista Árvore, 27(4), 435-442. DOI: 10.1590/S0100-67622003000400003.
    » https://doi.org/10.1590/S0100-67622003000400003.
  • Pereira, A. V., Cruz, C. D., Ferreira, R. P., Botrel, M. A., & Oliveira, J. S. (2002). Influência da estabilização de genótipos de capim elefante (Pennisetum purpureum Schum.) sobre a estimativa da repetibilidade de características forrageiras. Ciência e Agrotecnologia, 26(4), 762-767.
  • R Core Team. (2016). R: A language and environment for statistical computing. Vienna, AT: R Foundation for Statistical Computing.
  • Resende, M. D. V. (2016). Software Selegen-REML/BLUP: a useful tool for plant breeding. Crop Breeding and Applied Biotechnology, 16(4), 330-339. DOI: 10.1590/1984-70332016v16n4a49.
    » https://doi.org/10.1590/1984-70332016v16n4a49.
  • Santos, H. G., Jacomine, P. K. T., Anjos, L. H. C., Oliveira, V. A., Lumbreras, J. F., Coelho, M. R., ... Cunha, T. J. F. (2018). Sistema brasileiro de classificação de solos. Brasília, DF: Embrapa Solos.
  • Santos, A. I., Ribeiro, R. P., Vargas, L., Mora, F., Alexandre Filho, L., Fornari, D. C., & Oliveira, S. N. (2011). Bayesian genetic parameters for body weight and survival of Nile tilapia farmed in Brazil. Pesquisa Agropecuária Brasileira, 46(1), 33-43. DOI: 10.1590/S0100-204X2011000100005.
    » https://doi.org/10.1590/S0100-204X2011000100005.
  • Severiano, A., Carriço, J. A., Robinson, D. A., Ramirez, M., & Pinto, F. R. (2011). Evaluation of jackknife and bootstrap for defining confidence intervals for pairwise agreement measures. PLoS ONE, 6(5), 1-11. DOI: 10.1371/journal.pone.0019539.
    » https://doi.org/10.1371/journal.pone.0019539.
  • Sobrinho, F. S., Borges, V., Lédo, F. J. S., & Kopp, M. M. (2010). Repetibilidade de características agronômicas e número de cortes necessários para seleção de Urochloa ruziziensis. Pesquisa Agropecuária Brasileira, 45(6), 579-584. DOI: 10.1590/S0100-204X2010000600007.
    » https://doi.org/10.1590/S0100-204X2010000600007.
  • Toral, F. L. B., Alencar, M. M., & Freitas, A. R. (2007). Abordagens freqüentista e bayesiana para avaliação genética de bovinos da raça Canchim para características de crescimento. Revista Brasileira de Zootecnia, 36(1), 43-53. DOI: 10.1590/S1516-35982007000100006.
    » https://doi.org/10.1590/S1516-35982007000100006.
  • Xie, M., & Singh, K. (2013). Confidence distribution, the frequentist distribution estimator of a parameter: a review. International Statistical Review, 81(1), 3-39. DOI: 10.1111/insr.12000.
    » https://doi.org/10.1111/insr.12000.
  • Zhang, H., Rojas, H. A. G., & Cuervo, E. C. (2010). Confidence and credibility intervals for the difference of two proportions. Revista Colombiana de Estadística, 33(1), 63-88.

Publication Dates

  • Publication in this collection
    13 June 2019
  • Date of issue
    2019

History

  • Received
    17 Oct 2017
  • Accepted
    14 Mar 2018
Editora da Universidade Estadual de Maringá - EDUEM Av. Colombo, 5790, bloco 40, 87020-900 - Maringá PR/ Brasil, Tel.: (55 44) 3011-4253, Fax: (55 44) 3011-1392 - Maringá - PR - Brazil
E-mail: actaagron@uem.br