Determination of the optimal number of evaluations in half-sib progenies of kale by Bayesian approach

Kale has a long vegetative cycle and requires a lot of labor for staking, sprout removal and multiple harvests, which makes experiments difficult to maintain and evaluate. Thus, the objective was to estimate the minimum number of evaluations for the assertive selection of half-sib progenies of kale by means of a repeatability study using a Bayesian approach. Twenty-four half-sib progenies were evaluated in a randomized block design with four replicates and five plants per plot. The number of shoots, number of marketable leaves, fresh mass of marketable leaves and fresh mass per leaf were measured over 15 harvests. All traits showed high repeatability estimates, indicating high regularity in the expression of the traits during the harvesting period. With eight harvests it is possible to evaluate all the traits with a coefficient of determination above 85% in half-sib progenies of kale.

Kale is considered an annual or biennial plant, and its leaves are harvested periodically throughout the vegetative cycle. The crop also demands a lot of labor, as it needs staking, sprout removal and multiple harvests, which makes experiments difficult to maintain and evaluate. According to Patcharin et al. (2013), the lack of information about the minimum number of harvests needed to adequately evaluate an experiment can lead the researcher to carry out more harvests than necessary to differentiate treatments (Della Bruna et al., 2012). This can waste labor and financial resources (Martuscello et al., 2007).
The reliability of the good performance of a genotype throughout successive evaluations can be assessed by the repeatability coefficient (Neves et al., 2010), from which the ideal number of harvests can be estimated. In order to estimate genetic parameters such as repeatability, it is necessary to obtain variance components (Della Bruna et al., 2012; Tenkouano et al., 2012). These are unknown and are generally estimated by the method of moments, maximum likelihood (ML) or restricted maximum likelihood (REML) (Azevedo et al., 2017). Bayesian inference can also be used advantageously, since it provides the a posteriori distribution and credibility intervals of the estimated parameters (Gonçalves-Vidigal et al., 2008; Mathew et al., 2012; Rodovalho et al., 2014). This makes the technique very informative (Mathew et al., 2012) and facilitates hypothesis testing. In addition, Bayesian inference enables the evaluation of experiments with unbalanced data and the study of complex statistical models (Waldmann & Ericsson, 2006; Bink et al., 2007). Consequently, its use is increasing among breeders, not only for the analysis of molecular data, but also for phenotypic data (Waldmann & Ericsson, 2006; Omer et al., 2016). In this sense, the objective was to estimate the minimum number of evaluations for assertive selection of half-sib progenies of kale by means of a repeatability study, using the Bayesian approach.

MATERIAL AND METHODS
The experiment was carried out under field conditions, in the research vegetable garden of the Federal University of Viçosa (UFV), in Viçosa-MG. Twenty-four half-sib progenies of kale were evaluated in a randomized block design with four replicates and five plants per plot. The seeds were sown in expanded polystyrene trays. After 60 days, the seedlings were transplanted to seedbeds approximately 2.5 m wide and 0.3 m high, using a 1 m x 0.5 m spacing. Starting 30 days after transplanting, fifteen harvests were carried out at fourteen-day intervals. In the five plants of each plot, the number of shoots (which were removed on that occasion), number of marketable leaves, fresh mass of marketable leaves and fresh mass per leaf were evaluated. Completely expanded leaves with a leaf blade longer than 15 cm and without signs of senescence were considered marketable.
For the repeatability study, the statistical model proposed by Cruz et al. (2012) was used: $Y_{ij} = m + g_i + a_j + e_{ij}$, in which $Y_{ij}$ is the observation for the i-th progeny (i = 1, 2, ..., 24) in the j-th harvest (j = 1, 2, ..., 15); $m$ is the general mean; $g_i$ is the random effect of the i-th progeny under the influence of the permanent environment; $a_j$ is the effect of the j-th harvest; and $e_{ij}$ is the experimental error associated with the observation $Y_{ij}$. Bayes' theorem was used to estimate the variance components. According to this theorem, the joint a posteriori distribution of all unknown parameters is proportional to the product of the likelihood function and the a priori distribution (Azevedo et al., 2017). Assuming that $e_{ij} \sim N(0, \sigma^2_e)$, the sampling distribution of the observed data (likelihood function) is $Y_{ij} \mid m, g_i, a_j, \sigma^2_e \sim N(m + g_i + a_j, \sigma^2_e)$.
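As an illustration of this model, the following Python sketch simulates one trait for 24 progenies over 15 harvests and recovers the repeatability by the method of moments (two-way analysis of variance without replication), not by the Bayesian procedure adopted in this work. The variance values, the seed and the function names are hypothetical, and the expression r = σ²g / (σ²g + σ²e) is the usual definition of repeatability for this kind of model, taken here as an assumption.

```python
import random
from statistics import mean


def simulate(n_prog=24, n_harv=15, m=10.0, var_g=1.0, var_a=2.0, var_e=1.0, seed=1):
    """Simulate Y_ij = m + g_i + a_j + e_ij with normal effects (illustrative values)."""
    rng = random.Random(seed)
    g = [rng.gauss(0.0, var_g ** 0.5) for _ in range(n_prog)]  # progeny effects
    a = [rng.gauss(0.0, var_a ** 0.5) for _ in range(n_harv)]  # harvest effects
    return [[m + g[i] + a[j] + rng.gauss(0.0, var_e ** 0.5)
             for j in range(n_harv)] for i in range(n_prog)]


def repeatability_anova(y):
    """Method-of-moments repeatability from a progeny x harvest table."""
    p, n = len(y), len(y[0])
    grand = mean(v for row in y for v in row)
    row_means = [mean(row) for row in y]
    col_means = [mean(y[i][j] for i in range(p)) for j in range(n)]
    # Mean squares for progenies and for the residual
    ms_g = n * sum((rm - grand) ** 2 for rm in row_means) / (p - 1)
    ss_e = sum((y[i][j] - row_means[i] - col_means[j] + grand) ** 2
               for i in range(p) for j in range(n))
    ms_e = ss_e / ((p - 1) * (n - 1))
    var_g_hat = (ms_g - ms_e) / n  # E[MS_g] = var_e + n * var_g
    return var_g_hat / (var_g_hat + ms_e)


data = simulate()
r_hat = repeatability_anova(data)
print(f"estimated repeatability: {r_hat:.2f}")
```

With the simulated variances above, the true repeatability is 0.5, and the estimate fluctuates around this value from one simulated data set to another.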
For the location parameters, normal a priori distributions were considered: $m \sim N(u_m, \sigma^2_m)$, $g_i \sim N(u_g, \sigma^2_g)$ and $a_j \sim N(u_a, \sigma^2_a)$. For the variance components $\sigma^2_g$, $\sigma^2_a$ and $\sigma^2_e$, the inverse chi-squared distribution was considered as the a priori distribution. Therefore, the joint a posteriori distribution can be represented by: $P(m, g, a, \sigma^2_g, \sigma^2_a, \sigma^2_e \mid Y) \propto P(Y \mid m, g, a, \sigma^2_e)\, P(m)\, P(g \mid \sigma^2_g)\, P(a \mid \sigma^2_a)\, P(\sigma^2_g)\, P(\sigma^2_a)\, P(\sigma^2_e)$.
For the statistical analysis, the rjags package of the R software (R Core Team, 2020) was employed. Since there is no previous work with half-sib progenies of kale from which informative priors could be obtained, vague (less informative) priors were used. Thus, for the location effects, $u_i = 0$ (i = m, g, a, e) was considered. For the variance of the general mean, $\sigma^2_m$, and for the other variance components, vague hyperparameter values were stipulated. In order to obtain the MCMC chains, 1,000,000 iterations per trait were established. A burn-in of 100,000 iterations and a thinning interval of 500 iterations were used, resulting in a final sample of 1,800 draws for each trait. The 95% HPD (highest posterior density) interval and the mode were estimated with the aid of the Bayesian Output Analysis (BOA) package of the R software.
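The chain settings and the HPD summary described above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the gamma-distributed draws stand in for a real posterior sample, and the interval is computed as the shortest window containing 95% of the sorted draws, which is one common empirical approach and not necessarily the exact routine implemented in the BOA package.

```python
import random


def retained_draws(n_iter, burn_in, thin):
    """Draws kept after discarding the burn-in and keeping one iteration in every `thin`."""
    return (n_iter - burn_in) // thin


def hpd_interval(samples, prob=0.95):
    """Shortest interval containing a fraction `prob` of the sorted draws."""
    x = sorted(samples)
    n = len(x)
    k = max(1, int(round(prob * n)))  # number of draws inside the interval
    # Width of every window of k consecutive sorted draws, paired with its start index
    widths = [(x[i + k - 1] - x[i], i) for i in range(n - k + 1)]
    _, start = min(widths)
    return x[start], x[start + k - 1]


print(retained_draws(1_000_000, 100_000, 500))  # 1800, as reported in the text

# Hypothetical skewed posterior sample, e.g. for a variance component
rng = random.Random(0)
draws = [rng.gammavariate(2.0, 1.0) for _ in range(1800)]
low, high = hpd_interval(draws)
print(f"95% HPD: [{low:.2f}, {high:.2f}]")
```

For a skewed sample such as this one, the resulting interval is asymmetric around the mode, which is the behavior expected for variance components.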

RESULTS AND DISCUSSION
Only the number of shoots showed overlapping HPD intervals for the variance components due to the effects of family and evaluation (Table 1). For the other traits, the estimates of the variance components due to the effects of the evaluations were higher than those due to the family effects, without overlapping HPD intervals. The asymmetric HPD intervals obtained for the variance components and genetic parameters are a peculiarity of Bayesian inference and facilitate hypothesis testing (Azevedo et al., 2017).
The higher magnitudes of the variance components of the evaluation effects compared to the genetic effects (progenies), without overlapping credibility intervals, indicate the predominance of the evaluation effects over the genetic effects. This happened for number of leaves, fresh leaf mass and fresh mass per leaf. Higher magnitudes of the variance components of the evaluation effects compared to the genetic effects were also verified by Brito et al. (2019) when evaluating half-sib progenies of kale. The higher magnitudes of the residual coefficient of variation for number of shoots and fresh leaf mass show that these traits are more influenced by random effects of the environment (experimental error).
The highest mode for the residual coefficient of variation was found for the number of shoots (19.10%), and its HPD interval overlapped those of all traits except the number of leaves and fresh mass per leaf, which presented the lowest modes (8.17 and 10.48%, respectively). The HPD intervals of the coefficients of determination overlapped for all traits. The traits with the highest mode values for this parameter were number of leaves and number of shoots (96.67 and 95.31%, respectively). The lowest mode for repeatability was found for fresh matter per leaf, followed by fresh matter of leaves, with mode values of 0.40 and 0.47 (Figure 1). In contrast, the number of leaves and number of shoots showed the highest estimates of repeatability, with mode values of 0.65 and 0.55, respectively. However, the HPD intervals of repeatability overlapped for all traits. The repeatability estimate can vary from 0 to 1, and high coefficients make it possible to predict the real value of a given trait with few measurements (Oliveira & Moura, 2010). The highest estimates of the repeatability coefficient for the number of marketable leaves (Figure 1) were also found by Brito et al. (2019). For this trait, this indicates a smaller gain in experimental accuracy from increasing the number of evaluations (Della Bruna et al., 2012). On the other hand, the trait fresh mass per leaf, with smaller estimates, requires a greater number of harvests for selection with greater efficiency and reliability.
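The link between repeatability and the coefficient of determination can be illustrated with the expression commonly used in repeatability studies, R² = nr / [1 + (n - 1)r], in which n is the number of measurements. This expression is not written out in the text and is assumed here; the function name is hypothetical. The sketch applies it to the repeatability modes reported above, with n = 15 harvests.

```python
def determination(r, n):
    """Coefficient of determination of the mean of n measurements with repeatability r."""
    return n * r / (1.0 + (n - 1) * r)


# Repeatability modes reported in the text
modes = {
    "number of marketable leaves": 0.65,
    "number of shoots": 0.55,
    "fresh mass of leaves": 0.47,
    "fresh mass per leaf": 0.40,
}

for trait, r in modes.items():
    print(f"{trait}: R2 = {100 * determination(r, 15):.1f}%")
```

The values obtained in this way need not coincide exactly with the reported modes of the coefficient of determination, since the mode of a transformed posterior quantity is generally not the transformation of the mode.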

The number of harvests required for a given coefficient of determination indicates that the traits fresh mass per leaf and fresh mass of leaves require a greater number of evaluations for the efficient selection of progenies (Figure 2). This can be explained by the percentage of water in the leaves, which can vary at the time of harvest due to variations in soil moisture, relative humidity or temperature. For this, 13 harvests are required to guarantee a coefficient of determination of 90%, and eight harvests to reach a coefficient of determination of 85%. This information is important and indicates that, in future experiments with half-sib progenies of kale, considerable precision can be obtained with only eight harvests. This number of harvests is much higher than that found by Azevedo et al. (2012), who suggest only three harvests to obtain a coefficient of determination higher than 95%. The smaller number of harvests found by these authors can be explained mainly by the fact that they evaluated kale clones. In the present work, the genetic variability within each treatment (half-sib progenies) may have contributed to the lower repeatability coefficients. This explanation agrees with Cruz et al. (2012), who state that the repeatability coefficient may vary according to the genetic structure of the studied population (clones, half-sib progenies, full-sib progenies). The higher number of measurements required for the evaluation of fresh mass per leaf may be due to a stronger interaction between genotypes and temporary environment for this trait. A possible cause of this interaction is the regulation of the trait by different gene sets, which may be more or less active depending on the developmental stage of the individual (Cruz et al., 2012). Brito et al. (2019) also needed more than eight harvests to obtain a coefficient of determination greater than 85% in half-sib progenies.
On the other hand, the number of leaves is the trait that requires the smallest number of evaluations, followed by the number of shoots. From the mode of the a posteriori distribution, it is estimated that three, two, four and six evaluations are required for number of shoots, number of marketable leaves, fresh matter of leaves and fresh matter per leaf, respectively, if a coefficient of determination of 80% is desired (Figure 2). To obtain a coefficient of determination of 85%, four, three, six and eight evaluations are required, respectively. To achieve a coefficient of determination of 90%, seven, five, ten and 12 evaluations are required, respectively. To obtain a coefficient of 95%, 14, 10, 20 and 26 evaluations are necessary, respectively. In breeding programs, this information is important, as it indicates the minimum number of evaluations needed to compare genotypes (Patcharin et al., 2013). This avoids wasting time on more evaluations than necessary and also avoids evaluating for too short a period, which can lead to errors in the identification of superior genotypes (Neves et al., 2010).
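The number of evaluations needed for a target coefficient of determination is usually obtained in repeatability studies from η = R²(1 - r) / [(1 - R²)r]. This expression is assumed here, since it is not written out in the text, and the function name is hypothetical. The sketch below applies it to the repeatability mode reported for the number of marketable leaves (0.65).

```python
def n_evaluations(r, r2_target):
    """Number of measurements needed to reach r2_target given repeatability r."""
    return r2_target * (1.0 - r) / ((1.0 - r2_target) * r)


r_leaves = 0.65  # repeatability mode reported for number of marketable leaves
for target in (0.80, 0.85, 0.90, 0.95):
    eta = n_evaluations(r_leaves, target)
    print(f"R2 = {target:.0%}: {eta:.2f} evaluations")
```

Under these assumptions, the results (about 2.2, 3.1, 4.8 and 10.2) are close to the two, three, five and ten evaluations reported for this trait. For the other traits the agreement with the rounded repeatability modes is looser, presumably because the reported values are modes of the a posteriori distribution of the number of evaluations itself.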
Therefore, it can be concluded that the number of leaves is the trait with the highest repeatability, as opposed to the fresh mass per leaf, which requires a higher number of harvests for the selection of superior half-sib progenies of kale. With eight harvests it is possible to evaluate all the traits with a coefficient of determination above 85% in half-sib progenies of kale.