Genetic parameters and selection gain in tropical wheat populations via Bayesian inference

ABSTRACT: The development process of a new wheat cultivar requires time between obtaining the base population and selecting the most promising line. Estimating genetic parameters more accurately in early generations, with a view to anticipating selection, represents an important advance for wheat breeding programs. Thus, the present study estimated the genetic parameters of F2 populations of tropical wheat and the genetic gain from selection via the Bayesian approach. To this end, the authors assessed the grain yield per plot of 34 F2 populations of tropical wheat. The Bayesian approach provided an adequate fit to the model, estimating genetic parameters within the parametric space. Heritability (h²) was 0.51. Among those selected, 11 F2 populations performed better than the control cultivars, with genetic gain of 7.80%. The following populations were the most promising: Tbio Sossego/CD 1303, CD 1303/Tbio Ponteiro, BRS 254/CD 1303, Tbio Duque/Tbio Aton, and Tbio Aton/CD 1303. Bayesian inference can be used to significantly improve tropical wheat breeding programs.


INTRODUCTION
Brazil consumes about 12 million tons of wheat annually; however, it produces only about 50% of this total, requiring imports to meet domestic demand (CONAB, 2021). The country depends on the production of exporters such as Argentina, the European Union, and the United States. Thus, it is constantly affected by protectionist policies in these exporters and by occasional weather events that reduce yield in their agricultural areas. In this context, wheat breeding programs should focus on selecting genotypes with high grain yield, especially for areas in which wheat cultivation is secondary in importance despite its potential, such as the Brazilian Cerrado (PASINATO et al., 2018).
The development of new cultivars starts with the identification of segregating populations with the potential to derive lines with superior genetic value, which depends on parents with a high concentration of favorable alleles for the trait (FASAHAT et al., 2016). Defining the best strategies both for identifying superior populations and for advancing them through breeding programs requires accurate prediction of breeding values, as well as the estimation of variance components and genetic parameters. Studies usually apply frequentist approaches such as Restricted Maximum Likelihood/Best Linear Unbiased Prediction (REML/BLUP) to this end. This type of approach was used by PIMENTEL et al. (2014) in F3 wheat populations, by THORWARTH et al. (2019) in wheat hybrids, and by MAHJOURIMAJD et al. (2016) in doubled haploid wheat. Although the frequentist approach has several useful properties, such as unbiased estimators with minimum variance, it has limitations, such as providing only approximate standard errors for heritability (RESENDE, 2002).

Ciência Rural, v.53, n.7, 2023. Mezzomo et al.
As an alternative to the frequentist approach, the Bayesian approach combines subjective information contained in a priori probability distributions with sample information to obtain the a posteriori distribution of the parameters. A central feature of the Bayesian approach is that probability distributions express the uncertainty about unknown parameters. In the frequentist approach, by contrast, the parameters are fixed constants, not associated with any probability distribution (BOX & TIAO, 1992).
The Bayesian approach provides more complete results, supporting the selection of the best segregating populations to continue breeding programs. In this process, progenies are selected based on a performance evaluation according to the breeder's criteria (SILVA et al., 2019). The literature reports several successful applications of Bayesian inference, such as the selection of guava (Psidium guajava L.) (SILVA et al., 2020), kale (Brassica oleracea L. var. acephala DC) (AZEVEDO et al., 2017), and eucalyptus (Eucalyptus globulus) populations (MORA et al., 2019).
The definition of selection strategies in a breeding program requires information on the populations under study, the estimation of variance components and breeding values, as well as the estimation of heritability (SEARLE et al., 1992; GONÇALVES-VIDIGAL et al., 2008). Bayesian inference is advantageous in these cases, since it yields marginal posterior densities and credibility intervals for the variance components, breeding values, and genetic parameters such as heritability (WALDMANN & ERICSSON, 2006).
Bayesian approaches have many practical applications in breeding programs, including the study of adaptability and stability in genotypes of Gossypium L. (NASCIMENTO et al., 2020) and Zea mays (OLIVEIRA et al., 2018), repeatability analysis in Jatropha curcas (PEIXOTO et al., 2021), and parameter estimation and population selection in Brassica oleracea L. (AZEVEDO et al., 2017). Moreover, the Bayesian approach produces information regarding distributions and credibility intervals; however, its use is rarely reported in wheat breeding programs. Information is lacking in wheat improvement on the estimation of population parameters, the selection of F2 populations, and the estimates of genetic gain from selection. Accordingly, the present study analyzes 34 F2 populations for grain yield using a Bayesian approach, assessing heritability, breeding values, and genetic gain from selection.

Genetic material and experimental design
This study included 34 F2 segregating populations belonging to the Wheat Breeding Program of the Federal University of Viçosa (UFV), Brazil, and eight commercial cultivars used as parents (Table 1). The F2 populations come from crosses conducted in 2019 (winter) in a greenhouse, involving eight parents selected for their genetic variability in cycle, health, and agronomic performance. The F1 seeds harvested from the crosses were sown under greenhouse conditions in the summer season of 2020 for generation advancement. Then, physiologically mature ears were harvested and threshed manually, and F2 seeds and parents were separated and arranged according to the experimental design.
The experiment was conducted in the winter season of 2020, in a randomized block design with two replications. The plots consisted of two 1.5 m rows, spaced 0.2 m apart. Sowing density was 350 seeds m⁻². Crop management followed the technical recommendations for wheat cultivation in central Brazil (EMBRAPA, 2020). The plots were harvested by manually cutting the plants, followed by mechanical threshing, cleaning, and drying of the grains to 13% moisture to determine yield per plot, in grams (g).

Statistical analysis
The Bayesian approach was used to analyze plot yield data. Parameter estimates via Bayesian inference were obtained using Markov chain Monte Carlo (MCMC) algorithms. The analysis was performed using the MCMCglmm package (HADFIELD, 2010) in the R software (R CORE TEAM, 2020). A total of 1,000,000 iterations (nitt) were run, discarding the first 50,000 (burn-in). One sample was retained every five iterations (thin), resulting in a chain of 190,000 samples, from which posterior estimates were obtained. Convergence was assessed by the Geweke criterion (GEWEKE, 1991), and graphical analysis was performed using the BOA package (SMITH, 2007) in R.
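The chain length and convergence check described above can be illustrated with a short Python sketch. It is not the authors' R code. The function names are hypothetical, and the z-score below uses plain sample variances of the two chain segments, whereas Geweke's diagnostic uses spectral density estimates, so it is only a reasonable approximation for weakly autocorrelated (thinned) draws.

```python
import math


def retained_samples(nitt, burnin, thin):
    """Number of draws kept after discarding the burn-in and thinning."""
    return (nitt - burnin) // thin


def simplified_geweke_z(chain, first=0.1, last=0.5):
    """Compare the mean of the first 10% of a chain with the mean of the last 50%.

    Simplification (assumption): naive sample variances replace the
    spectral density estimates used in Geweke's diagnostic.
    """
    n = len(chain)
    head = chain[: int(first * n)]
    tail = chain[n - int(last * n):]

    def mean(values):
        return sum(values) / len(values)

    def variance(values):
        centre = mean(values)
        return sum((v - centre) ** 2 for v in values) / (len(values) - 1)

    standard_error = math.sqrt(variance(head) / len(head) + variance(tail) / len(tail))
    return (mean(head) - mean(tail)) / standard_error


# Settings reported in the text: nitt = 1,000,000, burn-in = 50,000, thin = 5.
print(retained_samples(1_000_000, 50_000, 5))  # 190000
```

An absolute z-score below 1.96 is consistent with convergence at the 5% level, which is the criterion the text applies.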
The posterior means and medians, credibility intervals, and standard deviations of the estimates were obtained according to the linear model presented below:

$$y = Xb + Zg + e \quad (1)$$

where y is the vector of phenotypic values (with dimension nm × 1, where n = 42 is the number of genotypes, that is, the 34 F2 populations and the eight parents, and m = 2 is the number of blocks), g is the vector of breeding values of populations, and e is the vector of random errors, with $e \sim N(0, I\sigma_e^2)$, where $\sigma_e^2$ is the residual variance and I is an identity matrix. X and Z are the incidence matrices of effects b and g, respectively.

The joint distribution of the data is normal, with mean and variance given by:

$$y \mid b, g, \sigma_g^2, \sigma_e^2 \sim N(Xb + Zg, I\sigma_e^2) \quad (2)$$

where $\sigma_g^2$ is the genetic variance.

The a priori distributions of the parameters are:

$$b \sim N(0, \Sigma_b); \quad g \mid \sigma_g^2 \sim N(0, I\sigma_g^2); \quad \sigma_g^2 \sim GI(v_g, V_g); \quad \sigma_e^2 \sim GI(v_e, V_e) \quad (3)$$

where b assumes a non-informative distribution (a normal distribution with a large variance, $\Sigma_b$), and GI represents the inverse gamma distribution with hyperparameters $v_g$, $v_e$, and $V_e = V_g = 1$.

After a sufficiently large number of iterations, the values generated from the full conditional posterior distributions (f.c.p.d.) are samples of the posterior marginal distributions. Two models were defined, the first with genetic effects (complete model) and the second without them (reduced model). The goodness of fit of these models was compared using the deviance information criterion (DIC) proposed by SPIEGELHALTER et al. (2002). The DIC is given by:

$$DIC = D(\bar{\theta}) + 2p_D \quad (8)$$

where $D(\bar{\theta})$ is the deviance evaluated at the posterior mean of the parameters of the evaluated model, and $p_D$ is the effective number of parameters in the model.
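To make the sampling scheme concrete, the following Python sketch runs a small Gibbs sampler that draws each parameter in turn from its full conditional posterior distribution. It is an illustration under stated assumptions, not the MCMCglmm implementation. The model is simplified to an overall mean plus genotype effects (block effects are omitted), the data are simulated, and the inverse gamma priors use shape nu/2 and scale nu·V/2 with nu = 0.002, a value chosen here for illustration and not taken from the study. All function names are hypothetical.

```python
import random


def inverse_gamma(shape, scale, rng):
    """If X ~ Gamma(shape, scale=1/scale), then 1/X ~ InverseGamma(shape, scale)."""
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)


def simulate_trial(n=42, m=2, mu=245.0, var_g=900.0, var_e=900.0, seed=7):
    """Hypothetical data: n genotypes in m blocks, yield in g per plot."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        effect = rng.gauss(0.0, var_g ** 0.5)
        data.append([mu + effect + rng.gauss(0.0, var_e ** 0.5) for _ in range(m)])
    return data


def gibbs_h2(y, n_iter=3000, burn_in=500, thin=5, nu=0.002, V=1.0, seed=1):
    """Return thinned posterior draws of h2 = var_g / (var_g + var_e)."""
    rng = random.Random(seed)
    n, m = len(y), len(y[0])
    total = n * m
    mu = sum(sum(row) for row in y) / total
    g = [0.0] * n
    var_g, var_e = 1.0, 1.0
    h2_draws = []
    for it in range(n_iter):
        # Overall mean given the rest (flat prior on mu).
        mean_mu = sum(sum(row) - m * g_i for row, g_i in zip(y, g)) / total
        mu = rng.gauss(mean_mu, (var_e / total) ** 0.5)
        # Genotype effects given the rest (normal full conditional).
        precision = m / var_e + 1.0 / var_g
        for i, row in enumerate(y):
            mean_g = ((sum(row) - m * mu) / var_e) / precision
            g[i] = rng.gauss(mean_g, precision ** -0.5)
        # Variance components given the rest (inverse gamma full conditionals).
        ss_g = sum(g_i * g_i for g_i in g)
        var_g = inverse_gamma((nu + n) / 2.0, (nu * V + ss_g) / 2.0, rng)
        ss_e = sum((obs - mu - g_i) ** 2 for row, g_i in zip(y, g) for obs in row)
        var_e = inverse_gamma((nu + total) / 2.0, (nu * V + ss_e) / 2.0, rng)
        # Keep one draw every `thin` iterations after the burn-in.
        if it >= burn_in and (it - burn_in) % thin == 0:
            h2_draws.append(var_g / (var_g + var_e))
    return h2_draws


draws = gibbs_h2(simulate_trial())
print(len(draws), sum(draws) / len(draws))
```

With the default settings the sampler keeps (3000 − 500) / 5 = 500 draws, and their mean is the posterior mean of heritability for the simulated data.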
The posterior densities of the genetic and residual variance components were used to obtain the posterior density and the estimate of heritability (h²), as follows:

$$h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}$$

where $\sigma_g^2$ is the genetic variance and $\sigma_e^2$ is the residual variance.
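As a sketch of how a posterior density for heritability follows from the variance components, the Python snippet below computes h² for each retained MCMC draw and summarises the draws with a highest posterior density (HPD) interval, taken here as the shortest interval containing the requested share of draws. The variance values are hypothetical, and the function names are introduced only for illustration.

```python
import math


def heritability_draws(var_g_draws, var_e_draws):
    """One h2 value per MCMC draw: var_g / (var_g + var_e)."""
    return [g / (g + e) for g, e in zip(var_g_draws, var_e_draws)]


def hpd_interval(draws, prob=0.95):
    """Shortest interval that contains at least `prob` of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    k = max(1, math.ceil(prob * n))  # number of draws inside the interval
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


# Hypothetical variance draws, for illustration only.
var_g = [400.0, 900.0, 1200.0, 700.0, 1000.0]
var_e = [900.0, 900.0, 800.0, 1100.0, 1000.0]
h2 = heritability_draws(var_g, var_e)
print(sum(h2) / len(h2), hpd_interval(h2, prob=0.8))
```

Computing h² draw by draw, rather than from the posterior means of the two variances, is what yields a full posterior distribution and hence an interval for heritability.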
The selection differential (SD) was obtained as follows:

$$SD = \bar{x}_s - \bar{x}_0$$

where x̄s is the posterior mean of the selected populations (at a selection intensity of 30% of the populations, corresponding to 12 populations), and x̄0 is the posterior mean of all populations and parents in the experiment. With the information on heritability and selection differential, the expected genetic gain from selection was estimated, according to FALCONER & MACKAY (1996) and ST MARTIN & FUTI (2000), by the expression:

$$GS = (\bar{x}_s - \bar{x}_0) \times h^2$$

and the expected genetic gain in percentage was calculated according to the expression:

$$GS(\%) = \frac{GS}{\bar{x}_0} \times 100$$
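The expressions above can be checked with a few lines of Python. The function name is hypothetical. The inputs are the values reported in the Results: a mean of 282.28 g plot⁻¹ for the selected group, an overall intercept of 244.61 g plot⁻¹, and h² = 0.51.

```python
def selection_gain(mean_selected, mean_all, h2):
    """Return the selection differential, the expected gain, and the gain in %."""
    sd = mean_selected - mean_all
    gs = sd * h2
    gs_percent = gs / mean_all * 100.0
    return sd, gs, gs_percent


sd, gs, gs_percent = selection_gain(282.28, 244.61, 0.51)
print(f"SD = {sd:.2f} g/plot, GS = {gs:.2f} g/plot, GS = {gs_percent:.2f}%")
# SD = 37.67 g/plot, GS = 19.21 g/plot, GS = 7.85%
```

These values agree with the reported selection differential of 37.67 g plot⁻¹ and with the expected gain of 19.2 g plot⁻¹, or about 7.8%.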

RESULTS
The chains reached convergence by the Geweke criterion at the 5% significance level after 1,000,000 iterations (Table 2). The DIC value was 895.48 for the complete model and 933.73 for the reduced model (without genetic effects). Consequently, the best-fit model contained the population genetic effects, and the a posteriori inference is based on it. Thus, a posteriori estimates and densities were obtained for the sources of variation populations (pop) and error (units). The Bayesian density distribution for heritability estimation is given in Figure 1. These densities give a clear graphical representation of the degree of uncertainty around the mean heritability estimates and are therefore an intuitive way to present the results.

The deviance information criterion (DIC) is widely applied to assess the goodness of fit of models in Bayesian inference (RESENDE et al., 2014); the model with the lowest DIC value has the best fit. In the present study, the complete model showed the best fit, with DIC equal to 895.48, compared with the reduced model. This result indicates that the genetic effects of the populations under study are significant. Bayesian inference has advantages over the commonly used frequentist inference, including the incorporation of a priori knowledge and more accurate credibility intervals (0.025 and 0.975 quantiles), increasing the reliability of the estimated components and effects (PEIXOTO et al., 2021).
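As a quick check on the model comparison, the difference between the two reported DIC values can be written out. The symbol ΔDIC is introduced here only for illustration.

$$\Delta DIC = DIC_{reduced} - DIC_{complete} = 933.73 - 895.48 = 38.25$$

Since a lower DIC indicates a better fit, the positive difference favors the complete model.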
When using noninformative a priori information, the estimates of genetic parameters obtained by Bayesian inference present values similar to those obtained by frequentist inference by restricted maximum likelihood (REML) (BEAUMONT & RANNALA, 2004). Nonetheless, SILVA et al. (2020) obtained different results when testing three approaches in segregating populations of Psidium guajava. Two of these were Bayesian approaches (one with an informative and the other with a noninformative a priori distribution), and the third was a mixed model. The authors reported the greatest accuracy for Bayesian analysis with informative a priori information, followed by Bayesian analysis with noninformative a priori information and, finally, REML/BLUP analysis.
The a posteriori mean of broad-sense heritability (h²) for grain yield per plot was 0.51, with credibility interval limits HPD (0.025) = 0.01 and HPD (0.975) = 0.73, and an a posteriori standard deviation of 0.15 (Table 2). Figure 2 shows the posterior density of the heritability estimates. According to RESENDE (2002), the heritability of the present study (0.51) is high (h² > 0.50). This estimate is within the expected range for grain yield, considering that this trait is controlled by a large number of genes and is highly influenced by the environment. Previous studies on segregating wheat populations estimated heritability using frequentist approaches. For instance, AKEL et al. (2018) analyzed F1 hybrids of Triticum durum separately and reported an h² of 0.67 for grain yield (t ha⁻¹), and of 0.40 when analyzing the parents. In turn, PIMENTEL et al. (2014) observed moderate heritability for grain yield, equal to 39.15%, in F3 populations of Triticum aestivum. The heritability value of the present study is thus intermediate in relation to the previous literature. However, it is noteworthy that estimates of genetic parameters from Bayesian analyses tend to be more accurate (SILVA et al., 2020).
Bayesian models are more robust, generating more accurate estimates (JUNQUEIRA et al., 2016). This supports the selection of superior populations with improved accuracy. In this context, the best F2 populations were selected for generation advancement and for deriving promising lines for the breeding program, capable of meeting the demand of agricultural producers in the Brazilian Cerrado. Since the objective is to increase average grain yield, 13 populations were selected among those with estimates higher than the overall average of the experiment.
Figure 3A shows that, among parents and F2 populations, 23 crosses and genotypes had positive breeding values, ranging from 0.11 to 92.23. Figure 3B shows the populations selected based on positive breeding value plus intercept. The group of 13 selected populations had a mean of 282.28 g plot⁻¹, with a selection differential of 37.67 g plot⁻¹ in relation to the intercept of all populations and parents (244.61 g plot⁻¹). The expected genetic gain from selection was 19.2 g plot⁻¹, equivalent to 7.8%, for a selection intensity of 30%. The smallest breeding value plus intercept within the selected group belongs to population F2_B4 (BRS 254/BRS 394), with 253.55 g plot⁻¹. The population with the highest average was F2_H5 (Tbio Sossego/CD 1303), with an average plot yield of 336.84 g plot⁻¹. The populations F2_H5 (Tbio Sossego/CD 1303), F2_E7 (CD 1303/Tbio Ponteiro), F2_B5 (BRS 254/CD 1303), F2_F1 (Tbio Duque/Tbio Aton), and F2_A5 (Tbio Aton/CD 1303) showed average grain yield per plot superior to that of all parents used in the crosses and included in the experiment as controls. This shows the potential for selecting superior transgressive individuals within the wheat populations developed by the UFV Wheat Breeding Program, with a view to meeting the demand of the agricultural market in the Brazilian Cerrado. Another six F2 populations of tropical wheat were selected together with two commercial control cultivars, Tbio Duque and Tbio Sossego.
The five populations that showed significant genetic effects, mentioned in the previous paragraph, presented an average estimate of 320.41 g plot⁻¹. The intercept of all hybrid and parent combinations was 244.61 g plot⁻¹. The selection differential (SD) between the populations with significant genetic effects and all combinations was therefore 75.80 g plot⁻¹. Considering only the parents used, the difference is even greater: the average of the parents was 234.29 g plot⁻¹, giving an SD of 86.12 g plot⁻¹.
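The two selection differentials quoted above follow directly from the reported means. The subscripts are introduced here only to tell the two reference means apart.

$$SD_{all} = 320.41 - 244.61 = 75.80 \text{ g plot}^{-1}$$

$$SD_{parents} = 320.41 - 234.29 = 86.12 \text{ g plot}^{-1}$$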
Of the five F2 populations with significant effects, the three populations with the highest breeding values were Tbio Sossego/CD 1303 (F2_H5), CD 1303/Tbio Ponteiro (F2_E7), and BRS 254/CD 1303 (F2_B5). For the three most promising populations among all 34 combinations, crosses involving wheat parents from different breeders (Biotrigo Genética, Coodetec, and Embrapa Trigo) resulted in F2 populations with high average grain yield per plot and a greater probability of yielding wheat progenies with satisfactory agronomic performance.
Strategies aimed at launching cultivars with high yield potential are desirable, as they allow the intensification of wheat-growing areas and enable satisfactory gains in production per unit of area. However, the annual gain in wheat yield is limited, with indications of stagnation. BECHE et al. (2014) reported gains of 0.92% year⁻¹ when evaluating wheat cultivars released in the last 60 years in Brazil. In turn, WOYANN et al. (2019) observed gains of up to 1.28% year⁻¹ in cultivars released between 1985 and 2014. Therefore, the significant selection gains obtained are noteworthy, especially when considering the gain of the selected F2 populations in relation to the parents, which are commercial cultivars.
The gains are substantial and point to the possibility of extracting high-yielding lines. It is noteworthy that the F2 populations come from crosses involving parents from different

In the author's where we read: Henrique Caletti Mezzmo

Read:
Henrique Caletti Mezzomo

In summary, random samples of the posterior marginal distributions are indirectly generated from the full conditional posterior distributions (f.c.p.d.) (likelihood function × prior distribution of each parameter) by means of the MCMC algorithms.

Figure 1 - Chain of 190,000 estimates for the sources of variation population (pop) and error (units) of the model using a noninformative prior (left) and the density function corresponding to the chain (right).

Figure 2 - Chain of 190,000 heritability estimates of the model using a noninformative prior (left) and the density function corresponding to the chain (right).

Figure 3 - Estimates of genetic value (A) and genetic value plus intercept (B) of 34 F2 populations and eight tropical wheat cultivars obtained by the Bayesian approach for the yield per plot trait (g). Viçosa, MG, 2020.

Table 1 - Description of the cultivars used in the crosses as maternal (♀) and paternal (♂) parents, regarding breeding institution, commercial class, cycle, and 1000-seed weight (W1000S, g), and of the F2 populations obtained by artificial crossing.