Estimation of repeatability and genotypic superiority of elephant grass half-sib families for energy purposes using mixed models

ABSTRACT The mixed-model methodology is an alternative for selecting genotypes for traits highly influenced by the environment. In addition, this method allows estimating the repeatability coefficient and predicting the number of assessments a selection process needs to increase reliability. This study aimed to determine the minimum number of evaluations necessary for a reliable selection process and to estimate the variance components used for predicting genetic gains between and within half-sib families of elephant grass (Cenchrus purpureus (Schumach.) Morrone) using the mixed-model methodology. Half-sib families were generated using genotypes from the Active Germplasm Bank of Elephant Grass. The experiment was performed in a randomized block design with nine half-sib families, three replicates, and eight plants per plot. We evaluated 216 genotypes (individual plants) of elephant grass. The deviance analysis was carried out, genetic parameters were estimated, gains between and within families were predicted, and repeatability coefficients were obtained using Selegen software. There was genetic variability for selection within the families evaluated. The reliability values above 60 % for plant height and number of tillers and above 80 % for dry matter yield suggest that only two evaluations are required to select superior genotypes with adequate reliability. Sixteen genotypes were identified and selected for their productive potential, which can be used as parents in elephant grass breeding programs for bioenergy production.


Introduction
Elephant grass (Cenchrus purpureus (Schumach.) Morrone) shows potential for biomass production, with yearly yields that can reach 59 t of biomass per hectare (Silva et al., 2020). Moreover, elephant grass has chemical characteristics, such as cellulose, hemicellulose, and lignin contents, moisture content, density, C/N ratio, and calorific value, that reinforce its high potential for energy production (Vidal et al., 2017; Freitas et al., 2018).
Studies should be conducted to develop genotypes of interest that combine favorable energy production traits. However, this selection is not an easy task, since traits such as crop yield are under complex genetic control and are greatly influenced by the environment (Viana and Resende, 2014), as reported by studies on production traits of elephant grass (Rodrigues et al., 2017; Stida et al., 2018; Silva et al., 2018).
Plant breeding programs require precise selection methods; therefore, the REML/BLUP mixed-model methodology is a valuable tool for selecting genotypes for traits highly influenced by the environment. This approach has been increasingly used in breeding programs of several crops, namely sugar cane (Gonçalves et al., 2014), papaya (Cortes et al., 2019), passion fruit (Silva et al., 2017), guava (Gomes et al., 2017), corn (Vittorazzi et al., 2017), and cowpea (Cruz et al., 2021).
The REML/BLUP methodology also allows estimating the repeatability coefficient, which is highly relevant, considering that selection in perennial species requires a long time due to the long productive and reproductive cycles and the need to conduct extensive and expensive experiments. REML/BLUP allows selecting superior genotypes with greater efficiency and lower operating costs (Marçal et al., 2016).
This study was developed to determine the minimum number of evaluations necessary for a reliable selection process, to estimate the variance components, and to use the estimates to predict genetic gains between and within elephant grass half-sib families using the mixed-model methodology.
Meteorological data were obtained from the automatic agrometeorological station near the experimental area. Figure 1 shows the monthly precipitation and temperature values recorded during the experimental period (Nov 2019 to Aug 2021).

Progeny formation
The families were generated using genotypes from the Active Germplasm Bank of Elephant Grass (BAGCE). To this end, the nine genotypes with the highest dry matter yield (DMY) were selected, following Rocha et al. (2015) (Table 1).
The crosses were carried out from June to Aug 2019, during the crop flowering stage. The genotypes were planted in 9-m rows without repetition. The crosses were allowed to occur naturally, and panicles of the respective genotypes were subsequently harvested. Panicles were harvested at the beginning, middle, and end of the flowering season. This procedure enabled the collection of seeds pollinated by early and late parents, ensuring greater variability in the families generated.
After collection, the seeds of each genotype were removed from the panicles, homogenized, packed in aluminum foil, and stored in a refrigerator. On 18 Sept 2019, sowing was carried out in 128-cell Styrofoam trays filled with forest substrate, and the trays were kept for 60 days in a greenhouse equipped with an irrigation system to provide ideal conditions for germination and seedling maintenance. The plot-uniformity cut was made on 14 Apr 2020 to standardize the plants before the evaluation period.

Implementation and conduct of the experiment
Soil preparation consisted of two disc-harrowing operations. Seedlings were transplanted to the field on 18 Nov 2019. Supplementary irrigation was applied by a conventional sprinkler system only during implementation (Nov and Dec) to ensure plant establishment and development in this early stage.
Fertilizer application was carried out throughout the experiment according to the soil analysis results and the recommendations provided in the liming and fertilization manual. The fertilizer treatment was split into three applications: at planting and once at each evaluation harvest. To this end, 60 g single superphosphate were distributed in each row at planting. Fifty days later, the area was topdressed with 70 g urea and 40 g KCl per row, corresponding to 28.6 kg N and 24 kg K2O per hectare, respectively.
The families were evaluated in a randomized block design with three replicates. The plot consisted of a 15-m row with a spacing of 1.50 × 1.50 m, totaling ten plants per plot. The usable area comprised the eight central plants, and the plants at the ends of the row were considered borders.
The grass was harvested for evaluations on two occasions after eight months of plant growth, first on 1 Dec 2020 and then on 30 July 2021.

Traits evaluated
Evaluations were performed on eight individual plants from each plot to measure the following traits:
• Dry matter yield (DMY, t ha⁻¹) - a sample was taken randomly from each plant, chopped, placed in a labeled paper bag, weighed, and oven-dried at 65 °C for 72 h. Subsequently, the samples were weighed again to obtain the air-dried sample weight. The dried material was then ground in a Wiley mill with a 5-mm sieve and placed in plastic bags to determine the oven-dried sample weight, which was obtained by oven-drying 2 g of each ground material at 105 °C for 18 h and weighing again;
• Number of tillers (NT) - determined by counting the number of tillers of each plant evaluated;
• Plant height (PH, m) - measured from the ground to the inflection of the last fully expanded leaf of each of the eight plants;
• Stem diameter (SD, mm) - defined as the average of three tillers of each plant evaluated, measured with a digital caliper at 1 m above the ground.

Statistical analysis
The traits evaluated were submitted to deviance analysis to estimate the genetic parameters, which were then used to predict gains between and within families by mixed models (REML/BLUP). The deviance analysis followed the model described in Viana and Resende (2014), in which: ln(L) is the maximum point of the logarithm of the restricted maximum likelihood (REML) function; y is the vector of the variable analyzed; m is the vector of the effects of observations, assumed fixed; X is the incidence matrix of the fixed effects; and V is the variance-covariance matrix of y.
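The display equation for the deviance did not survive in this copy. As a reference point only, the restricted log-likelihood commonly written for this kind of analysis has the form below, using the symbols defined above and omitting constant terms. It is the textbook expression and is assumed, not confirmed, to match the one given by Viana and Resende (2014).

```latex
D = -2\ln(L), \qquad
\ln(L) = -\tfrac{1}{2}\ln\left|X^{\top}V^{-1}X\right|
         -\tfrac{1}{2}\ln\left|V\right|
         -\tfrac{1}{2}\,(y - Xm)^{\top}V^{-1}(y - Xm)
```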
The likelihood ratio test (LRT) was used to test the significance of the effects by comparing the reduced and full models, where: L_se is the maximum point of the maximum likelihood function for the reduced model (without the effect being tested) and L_fm is the maximum point of the maximum likelihood function for the full model. The variables were analyzed using the Selegen-REML/BLUP software to obtain the variance components by the REML method and the individual genotypic values by the best linear unbiased predictor (BLUP) method.
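As an illustration of the test described above, here is a minimal Python sketch. It assumes the usual convention that the LRT statistic is the difference between the deviances of the reduced and full models, compared with a chi-square distribution with one degree of freedom. The function names and the deviance values are hypothetical and are not taken from Selegen or from the study.

```python
import math


def lrt_statistic(deviance_reduced: float, deviance_full: float) -> float:
    """LRT = -2 ln(L_se) + 2 ln(L_fm), i.e. reduced-model deviance minus full-model deviance."""
    return deviance_reduced - deviance_full


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if x <= 0:
        return 1.0
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def lrt_test(deviance_reduced: float, deviance_full: float, alpha: float = 0.05):
    """Return (statistic, p-value, significant?) for one tested effect."""
    stat = lrt_statistic(deviance_reduced, deviance_full)
    p_value = chi2_sf_1df(stat)
    return stat, p_value, p_value < alpha


# Hypothetical deviances, for illustration only.
stat, p_value, significant = lrt_test(1250.0, 1243.0, alpha=0.01)
print(f"LRT = {stat:.2f}, p = {p_value:.4f}, significant at 1 %: {significant}")
```

The one-degree-of-freedom reference distribution is an assumption of this sketch; the thresholds it reproduces (3.84 at 5 % and 6.63 at 1 %) are the ones customarily used in deviance tables.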
For the REML/BLUP approach, model 8 of the Selegen-REML/BLUP software was used, which evaluates genotypes in half-sib progenies with several observations per plot, at one location and several harvests, in a complete-block design with results by genotype (Viana and Resende, 2014). Breeding values were predicted using the mixed-model approach, adopting the model described below:
y = Xm + Za + Wp + Ts + e
where: y is the data vector; m is the vector of the effects of the measurement-replicate combinations (assumed fixed) added to the overall mean; a is the vector of the individual additive genetic effects (assumed random) ~ NID (0, σ_a²); p is the vector of plot effects (random) ~ NID (0, I σ_plot²); s is the vector of permanent effects (random) ~ NID (0, I σ_perm²); and e is the vector of errors or residuals (random) ~ NID (0, I σ_e²). Uppercase letters represent the incidence matrices for these effects. Vector m encompasses all measurements in all replicates and adjusts simultaneously for the effects of replicates, measurements, and the replicate × measurement interaction.
The model provided the components of phenotypic variance. The variance components for calculating the repeatability coefficient were estimated using the REML procedure. Repeatability at plot level (r) was estimated from V_g, the genetic variance between plants; V_perm, the variance of permanent effects; and V_p, the phenotypic variance (Viana and Resende, 2014).
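The component formulas are not legible in this copy. The sketch below assumes that the phenotypic variance is the sum of the additive, plot, permanent, and residual variances (the decomposition stated in the Results) and that heritability and the coefficients of determination are the usual ratios to that total. The function name and the input values are hypothetical, and the repeatability expression itself is deliberately not reproduced.

```python
def variance_ratios(v_a: float, v_plot: float, v_perm: float, v_e: float) -> dict:
    """Ratios of each variance component to the individual phenotypic variance.

    Assumes v_p = v_a + v_plot + v_perm + v_e (assumption of this sketch).
    """
    v_p = v_a + v_plot + v_perm + v_e
    return {
        "v_p": v_p,                   # individual phenotypic variance
        "h2a": v_a / v_p,             # individual narrow-sense heritability
        "c2_plot": v_plot / v_p,      # coefficient of determination of plot effects
        "c2_perm": v_perm / v_p,      # coefficient of determination of permanent effects
    }


# Hypothetical components, not the estimates in Table 3.
ratios = variance_ratios(v_a=1.0, v_plot=0.5, v_perm=4.0, v_e=2.5)
for name, value in ratios.items():
    print(f"{name}: {value:.4f}")
```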

Results and Discussion
According to the LRT, only plant height (PH) differed significantly when family was the source of variation.
Considering the plot as the source of variation, there was a significant effect for PH at 1 % (p < 0.01) and for the other traits at 5 % (p < 0.05) (Table 2).
The results obtained for the family source of variation show that, for DMY, SD, and NT, between-family selection does not provide significant gains because of the low variability between the families. Significance for plot effects indicates significant genetic variability within the plot. Thus, it is more effective to undertake selection within families rather than between families (Borém et al., 2017).
The phenotypic variance was decomposed into additive genetic variance, environmental variance between plots, variance of permanent effects, and temporary residual variance. The contribution of additive genetic variance was small for all traits evaluated, with the environmental effects predominating. The variance of permanent effects (V_perm) made the largest contribution for DMY, and the temporary residual variance (V_e) for the other traits (Table 3).
This result was expected, since the traits evaluated in this study are controlled by many genes and are highly affected by the growth environment (Souza et al., 2017). However, the vegetative propagation of the crop makes it possible to exploit all genetic variance, whether of an additive, dominant, or epistatic nature (Cruz et al., 2014).
Estimates of individual narrow-sense heritability (h_a²) were considered low for three of the four traits evaluated. Only PH had a medium heritability value (0.435), according to the classification proposed by Viana and Resende (2014). Knowing the magnitude of heritability is very important in plant breeding, as it determines how difficult it is to improve a trait. The low estimates of h_a² found in this study indicate that selection for these traits is expected to be difficult. However, low individual heritability is common for quantitative traits (Viana and Resende, 2014). Moreover, the use of mixed-model analysis is warranted, as favorable genetic gains are predicted and the genotypes have potential for selection, even in the case of traits with low heritability (Cruz et al., 2014).
Individual repeatability (R) showed a high magnitude for DMY (0.668). DMY is one of the essential traits in elephant grass for energy production. Thus, the high repeatability value obtained for this trait shows that it is possible to predict the real value of the genotypes with a relatively small number of measurements, indicating little gain in accuracy with an increase in the number of measurements (Sanchéz et al., 2017).
Individual repeatability was considered medium for PH and NT and low for SD. When repeatability is low, many repetitions are required to reach a satisfactory determination value. Knowledge of repeatability estimates allows the evaluation phase to be carried out efficiently and with minimal expenditure of time and labor, thereby maximizing selection efficacy (Viana and Resende, 2014).
Selection efficacy is maximized when more than one measurement is made in each genotype. This is because the genotypic value is better measured when more than one assessment is made, as the best genotypes in one evaluation are not necessarily the best in another. The repeatability coefficient measures the capacity of plants to repeat the expression of the trait (Resende, 2016). Therefore, when associated with vegetative propagation, using the repeatability coefficient for yield-related traits is an efficient breeding strategy.
The coefficients of determination of repeatability with two measurements ranged from 0.32 to 0.80. The repeatability coefficient can be classified as high (r ≥ 0.60), medium (0.30 < r < 0.60), or low (r ≤ 0.30) (Resende, 2016). By these thresholds, the coefficients obtained with two measurements were considered high for all traits, except for SD (Table 4). Only two measurements were considered sufficient to estimate the real value of the genotypes, with PH and NT exhibiting reliability values above 60 % and DMY above 80 % (Table 4). A selection strategy based on coefficients of determination greater than 80 % can be considered adequate (Viana and Resende, 2014).
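The link between individual repeatability and the reliability reached with m measurements can be sketched as follows. The sketch assumes the standard expression R² = m·r / (1 + (m − 1)·r) and its inverse; the function names are hypothetical, and the only value taken from the text is the individual repeatability reported for DMY (0.668).

```python
import math


def determination(r: float, m: int) -> float:
    """Coefficient of determination of repeatability after m measurements."""
    return m * r / (1.0 + (m - 1) * r)


def measurements_needed(r: float, target: float) -> float:
    """Number of measurements (not rounded) needed to reach a target determination."""
    return target * (1.0 - r) / ((1.0 - target) * r)


r_dmy = 0.668  # individual repeatability reported for DMY
print(f"R2 with two harvests: {determination(r_dmy, 2):.3f}")
print(f"Harvests needed for 80 %: {math.ceil(measurements_needed(r_dmy, 0.80))}")
```

Under this assumed formula, a repeatability of 0.668 gives a determination of about 0.80 with two harvests, which agrees with the value reported for DMY.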
Our results are relevant when compared with those described in other studies with elephant grass, in which the number of measurements needed for reliable selection was much higher than in our study. Souza et al. (2017) estimated the repeatability coefficient in 73 elephant grass genotypes using the methods of analysis of variance, principal components, and structural analysis and concluded that at least nine harvests are necessary to predict the real value of the genotypes for DMY with 80 % reliability. Ferreira et al. (2021), investigating 19 clones and two controls in six environments with the mixed-model methodology (REML/BLUP), concluded that at least seven harvests are required to reach an accuracy of 80 % for DMY.
The accuracy of permanent phenotypic values based on m evaluation harvests (A_cm) measures the proximity between predicted and true genetic values. Additionally, it is an indicator of the quality of experimental information, taking into account heritability, the repeatability coefficient, and experimental precision. Accuracy is classified as high when it exceeds 0.70 and low when it is below 0.50 (Resende and Alves, 2020).
The values found in this study with two evaluation harvests were considered low for DMY, SD, and NT (0.45, 0.43, and 0.30, respectively) and high for PH (0.76). The low heritability estimates in this study may have contributed to the low selection accuracy values for the traits mentioned above. However, the analysis effectively indicated the minimum number of evaluations for the population.
The use of two evaluation harvests allowed for increases in selection efficacy (Ef) of 9 % for DMY, 15 % for PH, 30 % for SD, and 15 % for NT. Based on the previously described repeatability coefficients and the efficacies found, the degree of genetic determination of the traits after two harvests was high, providing a favorable scenario for the genetic selection of the genotypes evaluated.
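The efficacy of repeated measures can be illustrated with a short sketch. It assumes the commonly used expression Ef = sqrt(m / (1 + (m − 1)·r)), which compares selection on the mean of m harvests with selection on a single harvest; whether this is the exact expression applied in the analysis is an assumption. The function name is hypothetical.

```python
import math


def efficacy(r: float, m: int) -> float:
    """Efficacy of using m measurements relative to a single measurement."""
    return math.sqrt(m / (1.0 + (m - 1) * r))


r_dmy = 0.668  # individual repeatability reported for DMY
ef = efficacy(r_dmy, 2)
print(f"Ef = {ef:.3f} (increase of {100 * (ef - 1):.1f} %)")
```

With the repeatability reported for DMY, this expression gives an increase of roughly 9 to 10 %, in line with the value in the text. The lower the repeatability, the larger the benefit of a second harvest, which is consistent with SD showing the largest increase.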
Gain estimates for DMY were the largest among the traits evaluated, ranging from 18.37 % with the selection of family 9 to 0 % with the selection of family 3. The main objective of a crop used for bioenergy production is to achieve high energy yields and total dry biomass per unit area (Gravina et al., 2020). Thus, family 9 has the potential to produce 18 % more than the overall mean of the experiment (Table 5).
For the other traits, the highest gains ranged from 11.27 to 6.78 %. Families 9 and 2 were the best ranked for most traits evaluated, demonstrating their high productive potential. However, as shown in the deviance analysis (Table 2), most of the variability found in this study was within the families. Therefore, genetic variability is essential for selecting superior genetic materials, as the selection of families combined with individual selection allows for greater gains (Borém et al., 2017).
Individual within-family selection is based on a series of morpho-agronomic traits. The aim is to retain a considerable number of genotypes to increase the probability that at least one of these plants combines several traits of agronomic interest, since the genotype is fixed (clone) by selection (Rodrigues et al., 2017). Therefore, among the 216 genotypes evaluated, the 25 best were selected for each of the four traits evaluated. In total, 83 genotypes were selected (Table 6). All genotypes selected showed a new mean higher than the overall mean of the experiment (DMY: 14.917; PH: 3.106; SD: 5.275; and NT: 34.817). Genetic gains obtained among the genotypes selected ranged from 30.50 to 16.56 % for DMY. Using these genotypes as clones provides gains of up to 30 % in yield without additional expenses on inputs and labor. Similar results were found by Stida et al. (2018), who selected 80 accessions of elephant grass via REML/BLUP and obtained a 32 % gain in DMY. In contrast, Silva et al. (2020) obtained only a 17 % gain with the selection of elephant grass full-sib families.
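The way a predicted gain and a new mean follow from ranked predicted values can be sketched as below. This is an illustration under stated assumptions, not the Selegen procedure: the gain is taken as the mean predicted genotypic effect of the selected plants, and the new mean as the overall mean plus that gain. The plant labels and predicted effects are hypothetical; only the overall DMY mean (14.917) comes from the text.

```python
from statistics import mean


def select_top(predicted_effects: dict, k: int) -> list:
    """Return the k (plant, effect) pairs with the largest predicted effects."""
    ranked = sorted(predicted_effects.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def selection_gain(predicted_effects: dict, k: int, overall_mean: float):
    """Return (gain, new mean, gain as % of the overall mean) for the top k plants."""
    chosen = select_top(predicted_effects, k)
    gain = mean(effect for _, effect in chosen)
    return gain, overall_mean + gain, 100.0 * gain / overall_mean


# Hypothetical predicted effects (deviations from the overall mean).
effects = {"P1": 4.0, "P2": 3.0, "P3": -1.0, "P4": 0.5, "P5": 2.0}
gain, new_mean, gain_pct = selection_gain(effects, k=2, overall_mean=14.917)
print(f"gain = {gain:.2f}, new mean = {new_mean:.3f}, gain = {gain_pct:.2f} %")
```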
V_e: residual variance; tr: matrix trace operator; C22, C33, and C44: derived from the coefficient matrix of the mixed-model equations; accuracy of genetic value prediction; σ_p²: individual phenotypic variance; h_a²: individual narrow-sense heritability, that is, heritability of additive effects; r: repeatability; c_plot²: coefficient of determination of plot effects; c_perm²: coefficient of determination of permanent effects; and overall mean of the experiment.
h²m = heritability at the genotype level associated with the mean of the harvests; R² = coefficient of determination of repeatability; A_cm = accuracy of permanent phenotypic values based on m evaluation harvests; and Ef = efficacy of m evaluations compared with the situation in which only one evaluation is performed.
Elephant grass breeding for bioenergy. Sci. Agric. v.80, e20220103, 2023

Table 1 - Identification of nine parents from the Active Germplasm Bank of Elephant Grass (BAGCE) at LEAG/CCTA/UENF used as female parents to generate the half-sib families (Campos dos Goytacazes, Rio de Janeiro State, Brazil, 2019-2021).

Table 3 - Components of variance obtained by individual REML for dry matter yield (DMY), plant height (PH), stem diameter (SD), and number of tillers (NT) in elephant grass half-sib families evaluated at two harvests (Campos dos Goytacazes, Rio de Janeiro State, Brazil, 2019-2021).

Table 4 - Efficacy of repeated measures predicted by BLUP for dry matter yield (DMY), plant height (PH), stem diameter (SD), and number of tillers (NT) in elephant grass half-sib families evaluated at two harvests (Campos dos Goytacazes, Rio de Janeiro State, Brazil, 2019-2021).