ABSTRACT
Among the multi-trait models used to study several traits and environments jointly, the Bayesian framework has been a preferred tool for constructing more complex and biologically realistic models. Most studies using the Bayesian approach adopt non-informative prior distributions, although the approach yields more accurate estimates when informative prior distributions are used. The present study evaluated the efficiency and applicability of multi-trait multi-environment (MTME) models within a Bayesian framework, using a strategy for eliciting informative prior distributions from previous data on rice. The data comprised rice (Oryza sativa L.) genotypes evaluated in three environments and five crop seasons (2010/2011 to 2014/2015) for the following traits: grain yield (GY), flowering in days (FLOR) and plant height (PH). Variance components and genetic and non-genetic parameters were estimated using the Bayesian method. In general, the informative prior distributions in Bayesian MTME models provided higher estimates of individual narrow-sense heritability and variance components, as well as shorter highest posterior density (HPD) intervals, compared with the respective analyses using non-informative prior distributions. More informative prior distributions made it possible to detect genetic correlations between traits that were not detected with non-informative prior distributions. Therefore, the mechanism presented here for updating knowledge to elicit an informative prior distribution can be efficiently applied in rice breeding programs.
Keywords: MCMC; genetic correlation; genetic improvement; heritability; prior distribution
Introduction
Rice (Oryza sativa L.) is one of the most important sources of the global population’s daily caloric and nutritional requirements (FAO, 2020). Across the world, the population is increasing while, at the same time, the available area of suitable wetlands is decreasing (Ray et al., 2013). It is estimated that by 2050 the agricultural production of rice should increase by between 60 and 110 % (Hunter et al., 2017; Juliana et al., 2019). Thus, the evaluation of multiple traits in rice cultivation aims to maximize grain yield potential (Liang et al., 2021). In general, in a plant breeding program aimed at identifying the genetically superior genotypes, selection is based on one trait only (Suela et al., 2019; Sabri et al., 2020). However, this approach can cause problems if the performance of the selected genotypes in other desirable traits is not evaluated (Cruz et al., 2014). Genetic evaluation of multiple traits is relevant since superior varieties combine optimal attributes for several traits simultaneously (Torres et al., 2018). In these cases, selection can be made indirectly, based on easy-to-measure secondary traits of low environmental influence that are genetically correlated with the target trait, which is an attractive alternative for maximizing accuracy (Santos et al., 2018).
Among the multi-trait models used for modeling several traits and environments jointly, the Bayesian framework has been the preferred tool when a more complex and biologically realistic model is required (Dunson, 2001). Several studies have demonstrated the potential of the Bayesian approach to genetic evaluation in plant breeding by considering multi-trait evaluation (Torres et al., 2018; Peixoto et al., 2021). However, the majority of these studies use non-informative prior distributions. The Bayesian approach tends to be less biased and to present more accurate estimates than classical analysis when it uses informative prior distributions (van de Schoot et al., 2021), which should therefore be preferred for breeding purposes aimed at improving selection accuracy.
Systems for updating knowledge of the hyperparameters and building informative prior distributions were proposed by Silva et al. (2013) and Azevedo et al. (2022). The first study used scaled inverse chi-square prior distributions for the parameters in univariate analyses in maize breeding, drawing on previous phenotypic data from two selection cycles. The second study used inverse gamma prior distributions considering ten crop seasons in white oat (Avena sativa L.). However, these procedures for eliciting informative prior distributions have not yet been presented for multi-trait analysis. Thus, the present study aimed to evaluate a strategy for eliciting informative prior distributions using previous data from rice. To this end, phenotypic data on three traits from eighteen rice genotypes evaluated in five crop seasons were used.
Materials and Methods
Experimental data
The field experiments were carried out in experimental areas in the municipalities of Janaúba (15°48’77” S, 43°17’59.09” W, altitude 516 m), Lambari (21°58’11.24” S, 45°20’59.6” W, altitude 887 m) and Leopoldina (21°31’55” S, 42°38’35” W, altitude 225 m), in the state of Minas Gerais, Brazil. The experiments evaluated eighteen rice genotypes (lines) from a flood-irrigated rice breeding program. Among these genotypes, five cultivars were used as experimental controls (Rubelita, Seleta, Ourominas, Predileta, and Rio Grande). Grain yield (GY, kg ha–1), flowering in days (FLOR) and plant height (PH, cm) were evaluated in the crop seasons 2010/2011 to 2014/2015. All experiments were arranged in a randomized block design with three replications.
The useful area consisted of four meters of the three internal rows (4 m × 0.9 m, 3.60 m²). The experiments were conducted on floodplain soils with continuous flood irrigation. The cultural practices followed the recommendations for irrigated rice cultivation in the evaluated regions (Soares et al., 2005).
Model and Bayesian inference
The traits (GY, FLOR and PH) were analyzed using multi-trait models fitted with a Markov chain Monte Carlo (MCMC) Bayesian approach. The first step was to compare the full model (considering the interaction between genotypes and environments) with the null model (not considering the interaction). The full fitted multi-trait statistical model was given by:
$$\begin{bmatrix} y_{11} \\ \vdots \\ y_{33} \end{bmatrix} = X \begin{bmatrix} b_{11} \\ \vdots \\ b_{33} \end{bmatrix} + Z_1 \begin{bmatrix} r_{11} \\ \vdots \\ r_{33} \end{bmatrix} + Z_2 \begin{bmatrix} u_{11} \\ \vdots \\ u_{33} \end{bmatrix} + \begin{bmatrix} e_{11} \\ \vdots \\ e_{33} \end{bmatrix},$$
which can be rewritten as:
$$y = Xb + Z_1 r + Z_2 u + e,$$
where $y_{ij}$ is the vector of phenotypic values of the i-th trait (i = 1,2,3) in the j-th environment (j = 1,2,3); $b_{ij}$ the vector of effects of the j-th environment on the i-th trait; $r_{ij}$ the vector of block effects of the i-th trait in the j-th environment; $u_{ij}$ the vector of genetic values of the i-th trait in the j-th environment, and $e_{ij}$ the residual vector of the i-th trait in the j-th environment. $X$ is the incidence matrix of systematic effects, $Z_1$ the incidence matrix of block effects and $Z_2$ the incidence matrix of genetic effects.
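The prior distributions for the parameters of the model were given by:
$$b \mid \Sigma_b \sim N(0, \Sigma_b \otimes I), \quad r \mid \Sigma_r \sim N(0, \Sigma_r \otimes I), \quad u \mid \Sigma_u \sim N(0, \Sigma_u \otimes I), \quad e \mid \Sigma_e \sim N(0, \Sigma_e \otimes I),$$
where $I$ is the identity matrix, and $\Sigma_b$, $\Sigma_r$, $\Sigma_u$ and $\Sigma_e$ are the (co)variance matrices, with prior distributions given by:
$$\Sigma_b \sim IW(V_b, h_b), \quad \Sigma_r \sim IW(V_r, h_r), \quad \Sigma_u \sim IW(V_u, h_u), \quad \Sigma_e \sim IW(V_e, h_e),$$
where IW is the inverted Wishart distribution; $V_b$, $V_r$, $V_u$ and $V_e$ are matrices with known values, and $h_b$, $h_r$, $h_u$ and $h_e$ are known constants (the hyperparameters). The elements of the (co)variance matrices $\Sigma_r$, $\Sigma_u$ and $\Sigma_e$ are $\sigma^2_{r_{i(j)}}$, $\sigma^2_{u_{i(j)}}$ and $\sigma^2_{e_{i(j)}}$, respectively the block, genetic and residual variances of the i-th trait in the j-th environment; $\sigma_{r_{ii'(j)}}$, $\sigma_{u_{ii'(j)}}$ and $\sigma_{e_{ii'(j)}}$, the block, genetic and residual covariances between the i-th trait and the i’-th trait in the j-th environment; and $\sigma_{r_{ii'(j,j')}}$, $\sigma_{u_{ii'(j,j')}}$ and $\sigma_{e_{ii'(j,j')}}$, the block, genetic and residual covariances between the i-th trait in the j-th environment and the i’-th trait in the j’-th environment. $\Sigma_b$ is a diagonal matrix with values equal to $10^8$.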
The Bayesian estimation of the parameters ($b$, $r$, $u$, $\Sigma_r$, $\Sigma_u$, $\Sigma_e$) is based on their marginal posterior distributions, which are generated indirectly through MCMC algorithms that create a chain of values for each parameter. The w-th value of the chain of individual narrow-sense heritability associated with the i-th trait and the j-th environment, $h^{2(w)}_{i(j)}$, is given by:
$$h^{2(w)}_{i(j)} = \frac{\sigma^{2(w)}_{u_{i(j)}}}{\sigma^{2(w)}_{r_{i(j)}} + \sigma^{2(w)}_{u_{i(j)}} + \sigma^{2(w)}_{e_{i(j)}}},$$
where $\sigma^{2(w)}_{r_{i(j)}}$, $\sigma^{2(w)}_{u_{i(j)}}$ and $\sigma^{2(w)}_{e_{i(j)}}$ are, respectively, the block, genetic and residual variances of the w-th iteration for the i-th trait in the j-th environment. The relative variation coefficient is the ratio of the coefficient of genotypic variation to the coefficient of residual variation, i.e. $CV_g/CV_e$. The w-th value of the chain of genetic correlation between traits i and i’ in the j-th environment, $r^{(w)}_{ii'(j)}$, is given by:
$$r^{(w)}_{ii'(j)} = \frac{\sigma^{(w)}_{u_{ii'(j)}}}{\sqrt{\sigma^{2(w)}_{u_{i(j)}} \, \sigma^{2(w)}_{u_{i'(j)}}}},$$
where $\sigma^{(w)}_{u_{ii'(j)}}$ is the genetic covariance between traits i and i’; $\sigma^{2(w)}_{u_{i(j)}}$ is the genetic variance of the i-th trait, and $\sigma^{2(w)}_{u_{i'(j)}}$ the genetic variance of the i’-th trait, in the j-th environment and the w-th iteration of the MCMC algorithm. The w-th value of the chain of genetic correlation of the i-th trait between environments j and j’, $r^{(w)}_{i(j,j')}$, is given by:
$$r^{(w)}_{i(j,j')} = \frac{\sigma^{(w)}_{u_{i(j,j')}}}{\sqrt{\sigma^{2(w)}_{u_{i(j)}} \, \sigma^{2(w)}_{u_{i(j')}}}},$$
where $\sigma^{(w)}_{u_{i(j,j')}}$ is the genetic covariance between environments j and j’; $\sigma^{2(w)}_{u_{i(j)}}$ is the genetic variance in the j-th environment, and $\sigma^{2(w)}_{u_{i(j')}}$ the genetic variance in the j’-th environment, for the i-th trait in the w-th iteration of the MCMC algorithm. The efficiency of indirect selection for the i-th trait in the j’-th environment relative to direct selection in the target j-th environment, $EIS_{j(j')}$, proposed by Windhausen et al. (2012), is given by:
$$EIS_{j(j')} = r_{i(j,j')} \sqrt{\frac{h^2_{i(j')}}{h^2_{i(j)}}},$$
where $r_{i(j,j')}$ is the posterior mean of the genetic correlation of the i-th trait between environments j and j’, $h^2_{i(j')}$ is the posterior mean of the individual narrow-sense heritability associated with the i-th trait and the j’-th environment, and $h^2_{i(j)}$ is the posterior mean of the individual narrow-sense heritability associated with the i-th trait and the j-th environment.
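The quantities defined above can be illustrated with a minimal Python sketch. It is not the authors' R routine: the function names and the variance draws are hypothetical, and the heritability denominator (block + genetic + residual variance) is an assumption based on the three variances named in the text. The inputs to the last call are values reported in the Results for GY (heritabilities of 0.21 in Lambari and 0.14 in Leopoldina); taking 0.47 as the Lambari-Leopoldina genetic correlation is also an assumption.

```python
import math

def heritability(var_block, var_genetic, var_residual):
    """One chain value of h2: genetic variance over the sum of the three variances."""
    return var_genetic / (var_block + var_genetic + var_residual)

def genetic_correlation(covariance, var_a, var_b):
    """One chain value of a genetic correlation from a covariance and two variances."""
    return covariance / math.sqrt(var_a * var_b)

def indirect_selection_efficiency(r_g, h2_indirect, h2_direct):
    """Efficiency of selecting in the indirect environment relative to the direct one."""
    return r_g * math.sqrt(h2_indirect / h2_direct)

# Hypothetical MCMC draws of (block, genetic, residual) variances for one trait/environment
draws = [(0.5, 2.0, 5.5), (0.4, 2.4, 5.2), (0.6, 1.8, 5.6)]
h2_chain = [heritability(*d) for d in draws]
h2_posterior_mean = sum(h2_chain) / len(h2_chain)

# GY: direct selection in Leopoldina (h2 = 0.14), indirect in Lambari (h2 = 0.21)
eis_gy = indirect_selection_efficiency(0.47, h2_indirect=0.21, h2_direct=0.14)
print(round(h2_posterior_mean, 3), round(eis_gy, 2))
```

Heritability and correlations are computed per iteration and summarized afterwards, whereas the efficiency of indirect selection is computed from posterior means, as described in the text.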
The informativeness of a prior distribution is associated with the values of the hyperparameters and, consequently, in this study, with the (co)variance matrices of the normal distribution (van de Schoot et al., 2021). Based on phenotypic databases containing several years of data, it is possible to update the hyperparameters, increase our knowledge of the (co)variance matrices and thus create informative prior distributions. In univariate analyses with ten seasons of data, the knowledge update for the p-th season should be carried out using information from the (p – 1)-th season only, and not from earlier seasons (Azevedo et al., 2022). Therefore, in this study, we used two prior distributions, one non-informative and one informative, the latter created according to Azevedo et al. (2022), though in our case the multivariate approach was adopted. Thus, two analyses were performed: i) five multi-trait multi-environment models, which considered each crop season separately and used non-informative prior distributions in all seasons; ii) five multi-trait multi-environment models with informative prior distributions, where the (p – 1)-th crop season informed the prior for the p-th crop season. In the first season, a non-informative prior distribution was used.
In non-informative prior distribution, we consider the hyperparameter to be equal to and (Hadfield, 2010). For the construction of the informative prior distributions, we know that if the (co)variance matrix is distributed as ∑~IW(V, h) (in this study, the dimension of ∑ is 9 × 9), then the expected value of ∑ is given by and the mode of ∑ is given by . Thus, the posterior mean of the (co)variance components (Σ) obtained in the analysis of the (p – 1)-th season and its respective posterior mode (Mo) were equated to the expected value and mode of the ∑~IW(V, h) distribution. Through these expressions, it was possible to find the following equality and,
and these values were used as hyperparameters of the prior distribution of the p-th crop season.
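As a hedged illustration of this moment-matching step, assume the standard inverse Wishart parameterization with scale matrix $V$, degrees of freedom $h$ and dimension $k$ (here $k = 9$); the parameterization actually used by the authors, or by MCMCglmm, may differ. Under that assumption the mean and mode have the closed forms below. If, as a further simplifying assumption, the posterior mean $\bar{\Sigma}$ and the posterior mode $Mo$ from season $p - 1$ are proportional with ratio $c > 1$, the hyperparameters follow directly:

```latex
% Standard inverse Wishart moments (assumed parameterization)
E[\Sigma] = \frac{V}{h - k - 1} \quad (h > k + 1),
\qquad
\mathrm{Mode}[\Sigma] = \frac{V}{h + k + 1}

% Matching E[\Sigma] = \bar{\Sigma} and Mode[\Sigma] = Mo with \bar{\Sigma} = c \, Mo
\frac{h + k + 1}{h - k - 1} = c
\;\Longrightarrow\;
h = \frac{(k + 1)(c + 1)}{c - 1},
\qquad
V = (h - k - 1)\,\bar{\Sigma}
```

The closer the posterior mean is to the posterior mode (c close to 1), the larger the implied $h$, that is, the more informative the resulting prior.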
The following quantities were calculated in order to assess the impact of inserting prior knowledge: i) the posterior coefficient of variation (CV) of the estimates of the variance components, individual narrow-sense heritability, genetic correlation and additive genetic values; ii) the length of the highest posterior density (HPD) intervals of the parameter estimates; iii) the deviance information criterion (DIC), when possible, since the quality of fit can only be compared using the DIC when the models use the same data; iv) the agreement between the genotypes selected under the non-informative and informative prior distributions, considering a selection differential of 30 % (a total of six genotypes).
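To make criteria i) and ii) concrete, the sketch below computes a posterior CV and an HPD interval from a chain of sampled values. It is an independent Python illustration, not the authors' routine: the function names and the chain are hypothetical, and the HPD is taken as the shortest window holding roughly the requested share of the sorted samples, which is one common empirical definition.

```python
import statistics

def posterior_cv(chain):
    """Posterior coefficient of variation (%): standard deviation over |mean|."""
    return 100.0 * statistics.stdev(chain) / abs(statistics.mean(chain))

def hpd_interval(chain, prob=0.95):
    """Shortest interval containing about `prob` of the sampled values."""
    ordered = sorted(chain)
    n = len(ordered)
    n_in = max(1, min(n, round(prob * n)))  # number of samples inside the interval
    starts = range(n - n_in + 1)
    # pick the window of n_in consecutive sorted values with the smallest width
    best = min(starts, key=lambda s: ordered[s + n_in - 1] - ordered[s])
    return ordered[best], ordered[best + n_in - 1]

# Hypothetical heritability chain with one extreme draw
chain = [0.25, 0.27, 0.26, 0.28, 0.27, 0.29, 0.26, 0.27, 0.28, 0.60]
low, high = hpd_interval(chain, prob=0.90)
print(low, high, round(high - low, 2), round(posterior_cv(chain), 1))
```

A shorter HPD interval and a smaller CV both indicate a more concentrated posterior, which is how the two prior specifications are compared in the Results.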
All computational implementations of the analysis were performed using the MCMCglmm package (Hadfield, 2010) in R (version 4.1.0). A total of 3,000,000 iterations were generated, with a burn-in period of 100,000 iterations and a sampling interval (thinning) of ten iterations, which resulted in 290,000 retained samples. The convergence of the MCMC chains was assessed by Geweke’s diagnostic (Geweke, 1992), computed using the CODA R package (Plummer et al., 2006). The computational routine is available at https://github.com/licaeufv/Multi-trait-Multi-Environment-Rice.
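The number of retained samples follows from the chain settings by simple arithmetic, sketched here in Python with a hypothetical helper name:

```python
def retained_samples(total_iterations, burn_in, thinning_interval):
    """Samples kept after discarding the burn-in and keeping every k-th iteration."""
    return (total_iterations - burn_in) // thinning_interval

# Settings reported in the text: 3,000,000 iterations, burn-in 100,000, thinning 10
kept = retained_samples(3_000_000, 100_000, 10)
print(kept)
```

With the reported settings, (3,000,000 − 100,000) / 10 gives the 290,000 samples stated above.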
Results and Discussion
Model selection and convergence of parameters
Overall, except for the 2012-2013 crop season, the full model (with the interaction effect) presented lower DIC values than the null model (without the interaction effect) (Table 1). Lower DIC values indicate a better fit of the full model (Spiegelhalter et al., 2014); the exception was the 2012-2013 crop season analysis using non-informative prior distributions. For all parameters, the p-values of Geweke’s Z statistics were greater than 1 % (Tables 2 and 3), indicating that convergence was achieved and that inferences could be made.
Comparison between informative and non-informative prior distributions
Lower posterior coefficient of variation (CV) values for the genetic variance and individual narrow-sense heritability (Table 2) and for the genetic correlation (Table 3) were observed when the informative prior was used in the estimation process. In this approach, the hyperparameters of the prior distributions were obtained from the analysis of the previous year. Consequently, the HPD intervals were also shorter, owing to the higher precision provided by the informative prior (Tables 2 and 3). The same results were found by Silva et al. (2013) when considering univariate analyses in maize. However, the same was not observed for the genetic values. In most analyses, the CV of the genetic values was larger under the informative priors. Despite this, considering a selection differential of 30 %, the agreement between the genotypes selected under the two prior distributions was more than 50 % (Table 4).
Mean, 95 % highest posterior density (HPD) interval and coefficient of variation (CV) of the posterior densities of the genetic parameters for traits relative to 2014-2015, considering non-informative and informative prior distributions, the convergence statistics and the deviance information criterion (DIC).
Individual narrow-sense heritability of traits
Considering the results obtained with the informative prior distributions, the estimates of individual narrow-sense heritability for GY, FLOR and PH ranged from low to high: 0.27 [0.25; 0.29], 0.47 [0.21; 0.74] and 0.79 [0.77; 0.80], respectively, for Janaúba; 0.21 [0.20; 0.23], 0.62 [0.60; 0.64] and 0.43 [0.40; 0.45] for Lambari; and 0.14 [0.13; 0.15], 0.77 [0.76; 0.79] and 0.58 [0.56; 0.60] for Leopoldina (Table 2). It is worth emphasizing that the low heritability values observed did not depend on the number of evaluated genotypes, since the Bayesian approach is recommended particularly for small sample sizes (Torres et al., 2018). In addition, GY is a quantitative trait that is highly affected by the environment (Rao et al., 2017; Li et al., 2018; Kumar et al., 2019; Zhang et al., 2020).
Studies using the Bayesian approach for estimating genetic parameters are rare in rice. Using multi-trait and multi-environment Bayesian analysis, but with non-informative prior distributions and only one crop season, Silva Junior et al. (2022) found heritabilities of 0.28 and 3.32e-6 for GY, also in the localities of Lambari and Janaúba, respectively. As for FLOR, Silva Junior et al. (2022), in the same study, found heritabilities of 0.31 and 0.27 in Lambari and Janaúba, respectively. Using rice genotypes in assays performed at the Dale Bumpers National Rice Research Center (DBNRRC), Sharma et al. (2021) found heritabilities of 0.60 and 0.93 for GY and PH, respectively. The heritabilities observed by Bhandari et al. (2019) in three managed environments in the Philippines ranged from 0.75 to 0.84, 0.54 to 0.91, and 0.71 to 0.96 for the FLOR, GY, and PH traits, respectively.
With the informative prior, we observed increased additive genetic variance and heritability relative to the results obtained with the non-informative prior distribution for all traits, except for PH in Lambari and FLOR in Janaúba (Table 2). Using an informative prior distribution based on the data can lead to biased (over- or underestimated) variance component estimates (Silva et al., 2013). Furthermore, the posterior mean of heritability can increase when an informative prior is used. However, in the Bayesian approach, interval estimates such as HPD intervals are more often recommended than point estimates. Among the 18 rice genotypes evaluated, the GY trait in Janaúba showed the highest additive genetic variance, while the lowest value was found for PH in Lambari. We also observed the highest heritability, 0.79, for PH in Janaúba and the lowest heritability, 0.14, for GY in Leopoldina.
The GY, FLOR and PH traits presented coefficients of genotypic variation (CVg) from 0.95 % to 2.58 %, 3.25 % to 3.84 % and 1.06 % to 2.48 %, respectively, under informative prior distributions across the locations studied. These can be considered adequate when compared with the classification of coefficients of variation for rice cultivation proposed by Costa et al. (2002), which determined that the coefficients of variation should be below 51.36 %, 7.62 %, and 17.27 % for grain yield, flowering in days and plant height, respectively. Relative variation coefficients (CVg/CVe) greater than unity suggest that genetic variation is more influential than residual variation (Torres et al., 2018). This was observed in this study for FLOR, in Lambari and Leopoldina, and for PH, in Janaúba and Leopoldina.
Genetic correlation between environments
The genetic correlations between environments ranged from 0.14 to 0.47 for GY, 0.14 to 0.36 for FLOR, and 0.34 to 0.47 for PH, which indicates the existence of genotype × environment (G × E) interaction (Table 5). The genetic correlations between environments were positive for all traits. Leopoldina was the environment that presented the highest correlation with the other locations. Following Oliveira et al. (2020), who consider genetic correlations below 0.30 as low and above 0.60 as high, these values suggest the occurrence of high (0.14-0.22) and moderate (0.34-0.47) G × E interaction, i.e., the performance of genotypes varied between environments.
Mean and 95 % highest posterior density (HPD) interval of the genetic correlation (rg) between environments (upper diagonal), relative variation coefficient (diagonal) and agreement between genetic breeding values estimated for each pair of environments relative to 2014-2015 (lower diagonal).
The efficiency of indirect selection compared to direct selection is presented in Table 6. For all evaluated traits, direct selection proved to be more efficient. However, as expected, indirect selection was more efficient in more correlated environments where the heritability of the indirect selection environment was greater than in the direct selection environment, which varied according to the traits under study. For GY, the highest efficiency of indirect selection was observed in direct selection in Leopoldina and indirect in Lambari (0.58). In contrast, for FLOR, the highest efficiency was observed in direct selection in Janaúba and indirect selection in Leopoldina (0.46). For PH, the highest efficiency was observed in direct selection in Lambari and indirect in Leopoldina (0.55).
The percentage of agreement considering a selection differential of 30 % was calculated to compare the ranking of genotypes between the three environments for each trait, as described above (Table 5). For the GY trait, 83.33 %, 0.00 % and 16.67 % of coincidence between the environments were observed. For the FLOR trait, 83.33 %, 50 % and 66.67 % of coincidence between the environments were observed. For the PH trait, all coincidences observed were 16.67 %. This result suggests that the PH and GY traits are more influenced by the environment than FLOR.
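The percentage of agreement can be sketched as the overlap between the sets of genotypes selected in two environments. The Python example below is illustrative only: the function name, genotype labels and values are hypothetical, and it assumes that larger values are preferred for selection, which may not hold for every trait.

```python
def selection_agreement(values_a, values_b, n_selected):
    """Percentage of genotypes common to the top `n_selected` of two rankings."""
    top_a = set(sorted(values_a, key=values_a.get, reverse=True)[:n_selected])
    top_b = set(sorted(values_b, key=values_b.get, reverse=True)[:n_selected])
    return 100.0 * len(top_a & top_b) / n_selected

# Hypothetical genetic values for 18 genotypes in two environments
env_a = {f"G{i}": float(19 - i) for i in range(1, 19)}  # G1 is best, G18 is worst
env_b = dict(env_a)
env_b["G6"] = 0.0    # G6 drops out of the selected group in environment B
env_b["G7"] = 100.0  # G7 enters the selected group in environment B

# Six selected genotypes correspond to roughly 30 % of the 18 evaluated
agreement = selection_agreement(env_a, env_b, n_selected=6)
print(round(agreement, 2))
```

With six selected genotypes, the possible outcomes are multiples of 1/6, which is why values such as 16.67 %, 50 %, 66.67 % and 83.33 % appear in the text.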
The difference between the genotype rankings in this study indicates that if breeders selected genotypes using only individual trial results, their selection would change between trials. In addition, there is still the possibility that high-yield genotypes are discarded and low-yield genotypes are chosen for other environments. This low correlation between the rankings for GY and PH also indicates that the rice genotypes carry many alleles that are differentially adapted to the evaluated environments, highlighting the importance of multi-environmental trials for this data set to address and deal with the genotype by environment interaction.
Genetic correlation between the traits
The HPD intervals of the genetic correlations estimated with the informative prior distribution became shorter over the years (Figures 1, 2 and 3). In addition, four combinations of trait pair and environment (GY × FLOR in the Lambari and Leopoldina localities and GY × PH) were not significant in the first years but became significant as information accumulated over the years. In contrast, none of the correlations obtained with the non-informative model was significant.
Figure 1 - (A) Posterior mean (in bars) and highest posterior density interval (in arrows) of the genetic correlation (GC) between the GY and FLOR traits, (B) absolute value of the coefficient of variation (CV) of the genetic correlation (GY × FLOR), using the non-informative and informative prior distributions over the five years. Grain yield (GY), in kg ha⁻¹, and flowering (FLOR), in days.
Figure 2 - (A) Posterior mean (in bars) and highest posterior density interval (in arrows) of the genetic correlation (GC) between the GY and PH traits, (B) absolute value of the coefficient of variation (CV) of the genetic correlation (GY × PH), using the non-informative and informative prior distributions over the five years. Grain yield (GY), in kg ha⁻¹, and plant height (PH), in cm.
Figure 3 - (A) Posterior mean (in bars) and highest posterior density interval (in arrows) of the genetic correlation (GC) between the FLOR and PH traits, (B) absolute value of the coefficient of variation (CV) of the genetic correlation (FLOR × PH), using the non-informative and informative prior distributions over the five years. Flowering (FLOR), in days, and plant height (PH), in cm.
The correlations obtained with the informative model for the GY and PH traits were significant in all locations. For Lambari and Leopoldina, the correlations were 0.14 [0.09, 0.20] and 0.15 [0.10, 0.21], respectively, while for Janaúba the correlation was –0.50 [–0.54, –0.46]. Similar results were observed by Lakshmi et al. (2014) and by Oladosu et al. (2018) in their study on rice genotypes under tropical conditions, which found correlations of 0.18 and –0.34, respectively. This divergence can be explained by the effect of the environment on the expression of these traits, as observed in the results of Table 5. The estimated correlations between GY and FLOR were 0.11 [0.05, 0.16] for Lambari and 0.13 [0.07, 0.18] for Leopoldina, which corroborates the correlation of 0.11 estimated by Lakshmi et al. (2014).
Bayesian estimation of parameters such as the genetic correlation is advantageous compared with classical estimation by the maximum likelihood method (Nustad et al., 2018). In classical statistics, confidence intervals for such parameters are only possible through bootstrap and delta method procedures (Manichaikul et al., 2006), and these intervals are generally wide (Beyene and Moineddin, 2005). The Bayesian approach makes it possible to estimate credible intervals, which are in general shorter than confidence intervals. Shorter intervals make it easier to detect correlations between traits and even between environments.
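To make the interval-based reasoning concrete, the sketch below computes a highest posterior density (HPD) interval from a set of posterior draws and checks whether it excludes zero. The draws are simulated rather than taken from the study, the function name is hypothetical, and reading "significant" as "the 95 % HPD interval excludes zero" is an assumption about the criterion.

```python
import numpy as np


def hpd_interval(samples, prob=0.95):
    """Shortest interval containing `prob` of the draws (unimodal posterior assumed)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    n_in = int(np.ceil(prob * n))  # number of draws the interval must contain
    if n_in < 2 or n_in > n:
        raise ValueError("not enough draws for the requested probability")
    # Width of every candidate interval that spans n_in consecutive sorted draws.
    widths = x[n_in - 1:] - x[: n - n_in + 1]
    start = int(np.argmin(widths))
    return float(x[start]), float(x[start + n_in - 1])


# Simulated draws standing in for an MCMC chain of a genetic correlation.
rng = np.random.default_rng(1)
draws = np.clip(rng.normal(loc=0.15, scale=0.03, size=5000), -1.0, 1.0)

low, high = hpd_interval(draws, prob=0.95)
excludes_zero = low > 0.0 or high < 0.0
print(f"posterior mean = {draws.mean():.3f}, 95% HPD = [{low:.3f}, {high:.3f}]")
print(f"interval excludes zero: {excludes_zero}")
```

Because the interval is read directly from the posterior draws, no bootstrap or delta method approximation is needed, and a narrower posterior translates directly into a shorter interval.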
Conclusions
We demonstrated the feasibility of the proposed multi-trait multi-environment Bayesian model for plant breeding involving a low number of genotypes that are evaluated for multiple traits across a range of environments. In addition, we presented a knowledge-updating mechanism for eliciting an informative prior distribution. More informative prior distributions make it possible to detect genetic correlations between traits. This was not feasible with the use of non-informative prior distributions.
Acknowledgments
The authors are grateful for the financial support of the Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG), the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).
References
- Azevedo, C.F.; Nascimento, M.; Carvalho, I.R.; Nascimento, A.C.C.; Almeida, H.C.F.; Cruz, C.D.; Silva, A.G. 2022. Updated knowledge in the estimation of genetic parameters: a Bayesian approach in white oat (Avena sativa L.). Euphytica 218: 43. https://doi.org/10.1007/s10681-022-02995-0
- Bhandari, A.; Bartholomé, J.; Cao-Hamadoun, T.V.; Kumari, N.; Frouin, J.; Kumar, A.; Ahmadi, N. 2019. Selection of trait-specific markers and multi-environment models improve genomic predictive ability in rice. PLoS One 14: e0208871. https://doi.org/10.1371/journal.pone.0208871
- Beyene, J.; Moineddin, R. 2005. Methods for confidence interval estimation of a ratio parameter with application to location quotients. BMC Medical Research Methodology 5: 32. https://doi.org/10.1186/1471-2288-5-32
- Costa, N.H.A.D.; Seraphin, J.C.; Zimmermann, F.J.P. 2002. A new method of variation coefficient classification for upland rice crop. Pesquisa Agropecuária Brasileira 37: 243-249 (in Portuguese, with abstract in English). https://doi.org/10.1590/S0100-204X2002000300003
- Cruz, C.D.; Regazzi, A.J.; Carneiro, P.C.S. 2014. Biometric Models Applied to Genetic Breeding Improvements = Modelos Biométricos Aplicados ao Melhoramento Genético. Editora UFV, Viçosa, MG, Brazil (in Portuguese).
- Dunson, D.B. 2001. Commentary: practical advantages of Bayesian analysis of epidemiologic data. American Journal of Epidemiology 153: 1222-1226. https://doi.org/10.1093/aje/153.12.1222
- Food and Agriculture Organization [FAO]. 2020. Future of Food and Agriculture: Alternative Pathways to 2050. FAO, Rome, Italy.
- Geweke, J. 1992. Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. p. 625. In: Bernardo, J.M.; Berger, J.O.; Dawid, A.P.; Smith, A.F.M., eds. Bayesian statistics. Oxford University Press, Oxford, England.
- Hadfield, J.D. 2010. MCMC methods for multi-response generalized linear mixed models: the MCMCglmm R package. Journal of Statistical Software 33: 1-22. https://doi.org/10.18637/jss.v033.i02
- Hunter, M.C.; Smith, R.G.; Schipanski, M.E.; Atwood, L.W.; Mortensen, D.A. 2017. Agriculture in 2050: recalibrating targets for sustainable intensification. BioScience 67: 386-391. https://doi.org/10.1093/biosci/bix010
- Juliana, P.; Poland, J.; Huerta-Espino, J.; Shrestha, S.; Crossa, J.; Crespo-Herrera, L.; Toledo, F.H.; Govindan, V.; Mondal, S.; Kumar, U.; Bhavani, S.; Singh, P.K.; Randhawa, M.S.; He, X.; Guzman, C.; Dreisigacker, S.; Rouse, M.N.; Jin, Y.; Pérez-Rodríguez, P.; Montesinos-López, O.A.; Singh, D.; Rahman, M.M.; Marza, F.; Singh, R.P. 2019. Improving grain yield, stress resilience and quality of bread wheat using large-scale genomics. Nature Genetics 51: 1530-1539. https://doi.org/10.1038/s41588-019-0496-6
- Kumar, M.; Singh, R.P.; Singh, O.N.; Singh, P.; Arsode, P.; Jena, D.; Samantaray, S.; Verma, R. 2019. Generation mean analysis for bacterial blight resistance and yield traits in rice. Journal of Pharmacognosy and Phytochemistry 8: 2120-2124.
- Lakshmi, M.V.; Suneetha, Y.; Yugandhar, G.; Lakshmi, N.V. 2014. Correlation studies in rice (Oryza sativa L.). Journal of Genetic Engineering and Biotechnology 5: 121-126.
- Li, X.; Wu, L.; Geng, X.; Xia, X.; Wang, X.; Xu, Z.; Xu, Q. 2018. Deciphering the environmental impacts on rice quality for different rice cultivated areas. Rice 11: 7. https://doi.org/10.1186/s12284-018-0198-1
- Liang, Y.; Nan, W.; Qin, X.; Zhang, H. 2021. Field performance on grain yield and quality and genetic diversity of overwintering cultivated rice (Oryza sativa L.) in southwest China. Scientific Reports 11: 1846. https://doi.org/10.1038/s41598-021-81291-8
- Manichaikul, A.; Dupuis, J.; Sen, S.; Broman, K.W. 2006. Poor Performance of Bootstrap Confidence Intervals for the Location of a Quantitative Trait Locus. Genetics 174: 481-489. https://doi.org/10.1534/genetics.106.061549
- Nustad, H.E.; Page, C.M.; Reiner, A.H.; Zucknick, M.; LeBlanc, M. 2018. A Bayesian mixed modeling approach for estimating heritability. BMC Proceedings 12: 31. https://doi.org/10.1186/s12919-018-0131-z
- Oladosu, Y.; Rafii, M.Y.; Magaji, U.; Abdullah, N.; Miah, G.; Chukwu, S.C.; Hussin, G.; Ramli, A.; Kareem, I. 2018. Genotypic and phenotypic relationship among yield components in rice under tropical conditions. BioMed Research International 2018: 8936767. https://doi.org/10.1155/2018/8936767
- Oliveira, I.C.M.; Guilhen, J.H.S.; Ribeiro, P.C.O.; Gezan, S.A.; Schaffert, R.E.; Simeone, M.L.F.; Damasceno, C.M.B.; Carneiro, J.E.S.; Carneiro, P.C.S.; Parrella, R.A.C.; Pastina, M.M. 2020. Genotype-by-environment interaction and yield stability analysis of biomass sorghum hybrids using factor analytic models and environmental covariates. Field Crops Research 257: 107929. https://doi.org/10.1016/j.fcr.2020.107929
- Peixoto, M.A.; Evangelista, J.S.P.C.; Coelho, I.F.; Alves, R.S.; Laviola, B.G.; Silva, F.F.; Resende, M.D.V.; Bhering, L.L. 2021. Multiple-trait model through Bayesian inference applied to Jatropha curcas breeding for bioenergy. PLoS One 16: e0247775. https://doi.org/10.1371/journal.pone.0247775
- Plummer, M.; Best, N.; Cowles, K.; Vines, K. 2006. CODA: Convergence Diagnosis and Output Analysis for MCMC. R News 6: 7-11.
- Rao, M.; Grithlahre, S.; Bisen, P.; Loitongbam, B.; Dar, M.H.; Zaidi, N.W.; Singh, U.S.; Singh, P.K. 2017. Generation mean analysis for grain yield and its component traits in submergence rice. SABRAO Journal of Breeding and Genetics 49: 327-335.
- Ray, D.K.; Mueller, N.D.; West, P.C.; Foley, J.A. 2013. Yield trends are insufficient to double global crop production by 2050. PLoS One 8: e66428. https://doi.org/10.1371/journal.pone.0066428
- Sabri, R.S.; Rafii, M.Y.; Ismail, M.R.; Yusuff, O.; Chukwu, S.C.; Hasan, N.A. 2020. Assessment of agro-morphologic performance, genetic parameters and clustering pattern of newly developed blast resistant rice lines tested in four environments. Agronomy 10: 1098. https://doi.org/10.3390/agronomy10081098
- Santos, I.G.D.; Cruz, C.D.; Nascimento, M.; Rosado, R.D.S.; Ferreira, R.D.P. 2018. Direct, indirect and simultaneous selection as strategies for alfalfa breeding on forage yield and nutritive value. Pesquisa Agropecuária Tropical 48: 178-189. https://doi.org/10.1590/1983-40632018v4851950
- Sharma, S.; Pinson, S.R.M.; Gealy, D.R.; Edwards, J.D. 2021. Genomic prediction and QTL mapping of root system architecture and above-ground agronomic traits in rice (Oryza sativa L.) with a multitrait index and Bayesian networks. G3 11: jkab178. https://doi.org/10.1093/g3journal/jkab178
- Silva, F.F.; Viana, J.M.S.; Faria, V.R.; Resende, M.D.V. 2013. Bayesian inference of mixed models in quantitative genetics of crop species. Theoretical and Applied Genetics 126: 1749-1761. https://doi.org/10.1007/s00122-013-2089-6
- Silva Júnior, A.C.; Sant’Anna, I.C.; Siqueira, M.J.S.; Cruz, C.D.; Azevedo, C.F.; Nascimento, M.; Soares, P.C. 2022. Multi-trait and multi-environment Bayesian analysis to predict the G x E interaction in flood-irrigated rice. PLoS One 17: e0259607. https://doi.org/10.1371/journal.pone.0259607
- Soares, P.C.; Melo, P.G.S.; Melo, L.C.; Soares, A.A. 2005. Genetic gain in an improvement program of irrigated rice in Minas Gerais. Crop Breeding and Applied Biotechnology 5: 142-148.
- Spiegelhalter, D.J.; Best, N.G.; Carlin, B.P.; van der Linde, A. 2014. The deviance information criterion: 12 years on. Journal of the Royal Statistical Society. Series B - Statistical Methodology 76: 485-493. https://doi.org/10.1111/rssb.12062
- Suela, M.M.; Lima, L.P.; Azevedo, C.F.; Resende, M.D.V.; Nascimento, M.; Silva, F.F. 2019. Combined index of genomic prediction methods applied to productivity. Ciência Rural 49: 6. https://doi.org/10.1590/0103-8478cr20181008
- Torres, L.G.; Rodrigues, M.C.; Lima, N.L.; Trindade, T.F.H.; Silva, F.F.; Azevedo, C.F.; Lima, R.O. 2018. Multi-trait multi-environment Bayesian model reveals G x E interaction for nitrogen use efficiency components in tropical maize. PLoS One 13: e0199492. https://doi.org/10.1371/journal.pone.0199492
- van de Schoot, R.; Depaoli, S.; King, R.; Kramer, B.; Märtens, K.; Tadesse, M.G.; Vannucci, M.; Gelman, A.; Veen, D.; Willemsen, J.; Yau, C. 2021. Bayesian statistics and modelling. Nature Reviews Methods Primers 1: 1-26. https://doi.org/10.1038/s43586-020-00001-2
- Windhausen, V.S.; Wagener, S.; Magorokosho, C.; Makumbi, D.; Vivek, B.; Piepho, H.P.; Melchinger, A.E.; Atlin, G.N. 2012. Strategies to subdivide a target population of environments: results from the CIMMYT-led maize hybrid testing programs in Africa. Crop Science 52: 2143-2152. https://doi.org/10.2135/cropsci2012.02.0125
- Zhang, H.; Zhu, Y.; Zhu, A.; Fan, Y.; Huang, T.; Zhang, J.; Xie, H.; Zhuang, J. 2020. Identification and verification of quantitative trait loci affecting milling yield of rice. Agronomy 10: 75. https://doi.org/10.3390/agronomy10010075
Publication Dates
- Publication in this collection: 10 Oct 2022
- Date of issue: 2023
History
- Received: 08 Mar 2022
- Accepted: 27 June 2022