Bayesian inference applied to soybean grown under different shading levels

The present study aimed to determine the effects of different light restriction levels (shading levels) on soybean genetic parameters using a Bayesian multi-trait model (MTM) and to select high-yielding soybean cultivars. Eighteen commercial soybean cultivars bred in a soybean breeding program were evaluated over two agricultural seasons. Three shading levels were used over two agricultural crop seasons, giving six treatments (light restriction × crop season). The experiments were arranged in a randomized complete block design with six treatments replicated thrice. The genetic values and parameters were estimated using a Markov chain Monte Carlo algorithm. Broad-sense heritability ranged from 0.2093 to 0.7153. The lowest genotypic variance estimate was observed at the 45 % photosynthetically active radiation level in the 2019/2020 crop season, compared with the other shading levels. Furthermore, a 40 % selection intensity gave the highest soybean yield under different shading levels. The Bayesian MTM combined with the factor analysis and genotype-ideotype distance method can be used to evaluate and select soybean genotypes considering different shading levels. The soybean cultivars 8579RSF, NS8338, NS7901, NS7667, RK8115, and 8473RSF had higher genetic potential than the other cultivars under different shading levels.


Introduction
Soybean (Glycine max (L.) Merr.) is an important crop worldwide, with high grain protein and oil levels, and is extensively used in the food processing, animal feed, bioenergy, and chemical industries (Gonçalves et al., 2020). Soybean can be integrated into crop-livestock-forestry (CLF) production systems (Feng et al., 2019; Cristo et al., 2020). However, choosing appropriate crop varieties for CLF production systems is important because shading of the crop by forestry plants reduces photosynthetically active radiation (PAR) and can induce physiological and morpho-agronomic changes that affect productivity and quality performance (Werner et al., 2017). The development, through breeding, of high-yielding soybean varieties that are resistant to abiotic and biotic stresses and adapted to the environment is therefore necessary. In addition, the interactions and correlations between agronomically essential traits need to be understood, as this can improve selection accuracy in complex trait systems (Yu et al., 2019).
Predicting secondary traits in multi-trait analyses can improve the prediction of primary traits, particularly when they have low heritability. Although the genetic correlation between traits is essential, modeling recursive interactions between phenotypes provides information for developing breeding strategies that cannot be obtained using conventional multivariate approaches (Momen et al., 2019). Bayesian inference helps deal with complex models such as multi-trait models (MTMs) (Torres et al., 2018; Silva Junior et al., 2022a). The Bayesian approach has been reported to estimate genetic parameters more accurately than frequentist approaches (Torres et al., 2018; Volpato et al., 2019; Peixoto et al., 2021; van de Schoot et al., 2021). Bayesian MTMs are suitable for plant genetic evaluation (Volpato et al., 2019; Silva Junior et al., 2022a). Additionally, Bayesian MTMs enable the estimation of variance components and genetic values for individual traits (Peixoto et al., 2021) and the joint analysis of multiple traits. The potential of the Bayesian approach for genetic evaluation in plant breeding has been demonstrated in several studies considering multiple environments and multiple traits (Volpato et al., 2019; Silva Junior et al., 2022a). However, more information is still needed on the use of Bayesian MTMs for soybean cultivated under different shading levels.
Therefore, the present study aimed to determine the effects of different shading levels on soybean genetic parameters using a Bayesian MTM and select soybean cultivars with good genetic potential.

Field experiments
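The experiments were conducted at the Instituto de Ciências Agrárias da Universidade Federal dos Vales do Jequitinhonha e Mucuri (UFVJM), Unaí, Minas Gerais State, Brazil (16°26'10.48"S, 46°54'2.28"W, altitude 634 m) during the rainy season between Nov 2019 and Mar 2020, and between Oct 2021 and Feb 2022, that is, in two crop seasons with three shading levels (PAR environments), totaling six treatments (shading level × crop season). Both experiments were conducted during the soybean cropping period in two different years to minimize climate interference on the research data. ... 303, 304, 401, 403, 404, and 405 of different groups of relative maturity. There is still no information about these cultivars in shaded areas, as studies on shading in soybean crops are scarce. In addition, as these cultivars are grown over a wide region, studying their response to shading can provide inferences that still need to be evaluated for other crop systems.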
Screens with black shade nets allowing 18 and 35 % light passage were installed in the field for the different PAR environments 20 days after crop emergence. The shades were installed 1.6 m above the ground so as not to disturb soybean growth and development. Photosynthetically active radiation was measured every 15 days using a PAR meter (Apogee quantum meter, model MQ-200, Apogee Instruments). The average PAR reduction values were determined and compared with the controls (measurements taken under full light conditions). Thus, three different shading environments were established. Soybean grown under 100, 75, and 55 % PAR had 100 (full sun environment), 18, and 35 % light passage, respectively. In all shading environments, the experimental plots had four planting rows, each 6 m long. The plant population was determined according to the recommendation for each cultivar, and all agronomic practices were carried out according to the recommendations for commercial soybean production.
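The comparison of shaded readings with the full-light control can be sketched as a simple ratio of means. This is only an illustration: the function name and the readings below are hypothetical, and the source does not state exactly how the averages were computed.

```python
def relative_par_pct(shaded_readings, control_readings):
    """Mean PAR under the shade net as a percentage of the full-sun mean.

    Both arguments are sequences of PAR meter readings taken on the same
    dates (hypothetical values; the unit cancels out in the ratio).
    """
    shaded_mean = sum(shaded_readings) / len(shaded_readings)
    control_mean = sum(control_readings) / len(control_readings)
    return 100.0 * shaded_mean / control_mean


# Hypothetical readings from three measurement dates.
control = [1800.0, 1750.0, 1850.0]
shaded = [1350.0, 1312.5, 1387.5]
print(f"Relative PAR: {relative_par_pct(shaded, control):.1f} %")
```

The PAR reduction is then 100 minus this percentage.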
Soybean pods were hand-harvested at the R8 stage (when more than 95 % of pods had turned yellow) from the two middle rows (5 m²) of each plot. The pods from each plot were threshed using a stationary thresher. The grain was weighed, and the obtained mass was expressed in kg ha⁻¹ and normalized to a moisture content of 13.0 %. GY1 (100 % PAR), GY2 (75 % PAR), and GY3 (55 % PAR) were assigned as grain yield for the 2019/2020 crop season, while GY4 (100 % PAR), GY5 (75 % PAR), and GY6 (55 % PAR) were assigned as grain yield for the 2021/2022 crop season.
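The conversion from plot grain mass to kg ha⁻¹ at 13 % moisture can be sketched as follows. The moisture correction shown is the usual dry-matter-preserving formula, which the source does not spell out, so it is an assumption here; the function name and the example plot mass and moisture are hypothetical.

```python
def grain_yield_kg_ha(plot_mass_kg, moisture_pct,
                      plot_area_m2=5.0, target_moisture_pct=13.0):
    """Convert harvested plot grain mass to kg/ha at a target moisture.

    Assumes the standard correction that keeps dry matter constant:
    corrected = mass * (100 - measured) / (100 - target).
    """
    correction = (100.0 - moisture_pct) / (100.0 - target_moisture_pct)
    corrected_mass_kg = plot_mass_kg * correction
    # 1 ha = 10,000 m2, so scale the harvested area up to one hectare.
    return corrected_mass_kg * 10_000.0 / plot_area_m2


# Hypothetical plot: 1.8 kg of grain at 15 % moisture from the 5 m2 harvested area.
print(f"{grain_yield_kg_ha(1.8, 15.0):.0f} kg/ha")
```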

Statistical analysis
Data were analyzed using the MTM through the Markov chain Monte Carlo (MCMC) Bayesian approach. The MTM is given by:

$$ y = Xb + Zg + e \tag{1} $$

where y is the vector of phenotypic data, with conditional distribution $y \mid b, g, G, R \sim N(Xb + Zg, R \otimes I)$.
In this model, G is the genotypic covariance matrix, R is the residual covariance matrix, and I is an identity matrix; b is a vector of systematic effects (genotype mean and replication effects), assumed to be $b \sim N(b, \Sigma_b \otimes I)$; g is the vector of genotype effects, assumed to be $g \mid G \sim N(0, G \otimes I)$; e is the vector of residuals, assumed to be $e \mid R \sim N(0, R \otimes I)$; and X and Z are the incidence matrices for effects b and g, respectively. The R package MCMCglmm (Hadfield, 2010) was used to fit the model. A total of 1,900,000 samples were obtained. A burn-in of 10,000 iterations and a thinning interval of 10 iterations were assumed, resulting in 189,000 samples. MCMC convergence was verified according to the criterion of Geweke (1992), using the R packages boa (Smith, 2007) and coda (Convergence Diagnosis and Output Analysis; Plummer et al., 2006).
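The chain bookkeeping and the idea behind the Geweke check can be illustrated with a short sketch. The first function reproduces the retained-sample count from the settings reported above. The second is a simplified Geweke-style z-score that uses plain sample variances instead of the spectral density estimates used by the full diagnostic, so it is an assumption-laden illustration and not the routine implemented in boa or coda. The function names and the simulated chains are hypothetical.

```python
import numpy as np


def retained_samples(n_iter, burn_in, thin):
    """Number of draws kept after discarding burn-in and thinning."""
    return (n_iter - burn_in) // thin


def geweke_z_simplified(chain, first=0.1, last=0.5):
    """Compare the mean of the first 10 % of a chain with the last 50 %.

    Simplification: variances are plain sample variances, which is only
    reasonable when the (thinned) draws are close to independent.
    """
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    head = chain[: int(first * n)]
    tail = chain[n - int(last * n):]
    se = np.sqrt(head.var(ddof=1) / head.size + tail.var(ddof=1) / tail.size)
    return (head.mean() - tail.mean()) / se


print(retained_samples(1_900_000, 10_000, 10))  # 189000

rng = np.random.default_rng(1)
stationary = rng.normal(loc=50.0, scale=5.0, size=5000)  # hypothetical chain
drifting = np.linspace(0.0, 10.0, 2000)                  # clearly not converged
print(geweke_z_simplified(stationary), geweke_z_simplified(drifting))
```

A z-score close to zero is consistent with the two chain segments sharing the same mean, whereas a large absolute value points to a chain that has not stabilized.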
Models were compared using the deviance information criterion (DIC) proposed by Spiegelhalter et al. (2002):

$$ \mathrm{DIC} = D(\bar{\theta}) + 2 p_D \tag{2} $$

where $D(\bar{\theta})$ is a point estimate of the deviance obtained by replacing the parameters with their posterior mean estimates in the likelihood function, and $p_D$ is the effective number of model parameters. Models with a lower DIC are preferred over those with a higher DIC.
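As a rough sketch, the DIC can be computed from the deviance evaluated at each retained MCMC draw. The code uses the definition of Spiegelhalter et al. (2002), in which the effective number of parameters is the posterior mean deviance minus the deviance at the posterior mean. The function name and the toy deviance values are hypothetical.

```python
import numpy as np


def dic_from_deviance(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, p_D) from posterior deviance draws.

    p_D = mean deviance - deviance at the posterior mean of the parameters
    DIC = deviance at the posterior mean + 2 * p_D
    """
    mean_deviance = float(np.mean(deviance_samples))
    p_d = mean_deviance - deviance_at_posterior_mean
    return deviance_at_posterior_mean + 2.0 * p_d, p_d


# Toy deviance draws (hypothetical numbers, for illustration only).
dic, p_d = dic_from_deviance([102.0, 104.0, 106.0], 100.0)
print(dic, p_d)  # 108.0 4.0
```

When two candidate models are compared, the one with the smaller DIC is preferred, and the difference between the two values gives a sense of how strongly.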
The highest posterior density (HPD) intervals for all traits were estimated using the R package boa (Smith, 2007). Variance components, broad-sense heritability (H²), the correlation between H² estimates for the different soybean shading levels under the MTM, and breeding values were calculated from the posterior distribution. Posterior estimates of H² for each trait at each iteration were calculated from the posterior samples of the variance components using the following expression:

$$ H^{2(i)} = \frac{\sigma_g^{2(i)}}{\sigma_g^{2(i)} + \sigma_r^{2(i)}} $$

where $\sigma_g^{2(i)}$ and $\sigma_r^{2(i)}$ are the genetic and residual variance components at iteration i, respectively.
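The per-iteration heritability and its HPD interval can be sketched as below. Heritability is taken as the genetic variance divided by the sum of genetic and residual variances at each draw, and the HPD interval is approximated as the shortest interval containing the requested share of sorted draws. This is a common approximation, not necessarily the exact routine in boa. The simulated variance draws and function names are hypothetical.

```python
import numpy as np


def heritability_samples(var_g, var_r):
    """Broad-sense heritability at each MCMC iteration."""
    var_g = np.asarray(var_g, dtype=float)
    var_r = np.asarray(var_r, dtype=float)
    return var_g / (var_g + var_r)


def hpd_interval(samples, prob=0.95):
    """Shortest interval containing `prob` of the sorted draws."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(prob * n))           # number of draws inside the interval
    widths = x[k - 1:] - x[: n - k + 1]  # width of every candidate interval
    start = int(np.argmin(widths))
    return x[start], x[start + k - 1]


# Hypothetical posterior draws of the variance components for one trait.
rng = np.random.default_rng(42)
var_g = rng.gamma(shape=5.0, scale=20.0, size=2000)
var_r = rng.gamma(shape=8.0, scale=25.0, size=2000)

h2 = heritability_samples(var_g, var_r)
lower, upper = hpd_interval(h2, prob=0.95)
print(f"mean={h2.mean():.3f} median={np.median(h2):.3f} "
      f"HPD95=({lower:.3f}, {upper:.3f})")
```

Because H² is a ratio of positive quantities, its posterior is often skewed, which is why the mean, median, and mode can differ.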

Selection based on the selection index
The multi-trait index based on factor analysis and genotype-ideotype distance (FAI-BLUP) was used to identify superior soybean genotypes under different shading levels (Rocha et al., 2018). The index is given by:

$$ P_{ij} = \frac{1/d_{ij}}{\sum_{i=1}^{n} \sum_{j=1}^{m} 1/d_{ij}} $$

where $P_{ij}$ is the probability that the i-th genotype (i = 1, 2, ..., n, with n = 16) is similar to the j-th ideotype (j = 1, 2, ..., m), and $d_{ij}$ is the genotype-ideotype distance from the i-th genotype to the j-th ideotype, based on the standardized mean distance. Selection gains were estimated from the FAI-BLUP by considering five selection intensities (20, 30, 40, 60, and 80 %), using the mean of the selected genotypes ($\bar{X}_g$) and the overall population mean ($\bar{X}_0$).
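A minimal sketch of the index calculations follows, under stated assumptions. The probabilities are taken as inverse genotype-ideotype distances normalized over all genotype-ideotype pairs, following the description in Rocha et al. (2018). The selection gain is expressed as the percentage difference between the selected mean and the overall mean, which is an assumed form because the exact expression is not recoverable from the text. The number of selected cultivars is obtained by truncation, also an assumption. All function names and data are hypothetical.

```python
import numpy as np


def ideotype_probabilities(distances):
    """P_ij from a genotype-by-ideotype distance matrix (all d_ij > 0)."""
    inverse = 1.0 / np.asarray(distances, dtype=float)
    return inverse / inverse.sum()


def n_selected(n_genotypes, intensity):
    """Number of genotypes kept at a selection intensity (truncated)."""
    return int(n_genotypes * intensity)


def selection_gain_pct(values, selected_idx):
    """Assumed form: 100 * (selected mean - overall mean) / overall mean."""
    values = np.asarray(values, dtype=float)
    overall_mean = values.mean()
    selected_mean = values[selected_idx].mean()
    return 100.0 * (selected_mean - overall_mean) / overall_mean


# Hypothetical example: 4 genotypes, 2 ideotypes, one trait.
d = np.array([[1.0, 2.0], [2.0, 4.0], [4.0, 4.0], [2.0, 1.0]])
p = ideotype_probabilities(d)
genotypic_values = np.array([10.0, 20.0, 30.0, 40.0])

print(p.round(3))
print(n_selected(16, 0.40))                         # 6 cultivars at 40 %
print(selection_gain_pct(genotypic_values, [2, 3]))
```

Genotypes with the largest probability of resembling the desirable ideotype are ranked first, and the gain is then computed for the top-ranked subset at each intensity.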

Results
The Geweke criterion indicated convergence for all dispersion parameters with 1,900,000 MCMC iterations, a burn-in of 10,000 iterations, and a sampling interval of ten, totaling 189,000 samples used to estimate the variance components (Figure 1). By this criterion, all chains ([co]variance components) converged. The variance components had similar posterior means, indicating a normal density. The DIC revealed that the full multi-trait model best fits the data, indicating that genotypic effects were significant (DIC = 4474.94 and 4542.29 for the full and restricted models, respectively).
The posterior mean estimates of the variance components revealed chi-square density and normal distributions (Figure 1). Broad-sense heritability estimates of soybean grain yield and their HPD intervals were obtained for the different shading levels for the selection of genotypes (Figure 2).
The correlations between the H² estimates of GY6 and those of GY2 and GY4 were considered intermediate (Table 2). Conversely, GY6 was negatively correlated with GY1 and GY3, with low intensity. However, no high-intensity correlation was found for GY1, and GY3 had the highest correlation (0.4656).
The posterior estimates of genotypic and residual variances from the MTM differed between the soybean shading levels (Table 3). The GY3 trait, which corresponds to 55 % PAR in the 2019/2020 crop season, had the lowest estimate of genotypic variance compared with the other shading levels. In the 2021/2022 crop season, at 55 % PAR, the genotypic variance estimate was intermediate compared with those of the other shading levels, indicating a greater influence of the genetic components than of the environmental components on the expression of this trait.
The posterior density of the genetic variance ($\sigma_g^2$) for soybean grain yield at different shading levels in the 2019/2020 and 2021/2022 crop seasons based on the MTM is shown in Figure 3. All the mean variance estimates exhibited chi-square density and normal distributions (Figure 3). The FAI-BLUP index exhibited discrepant selection gains between different selection intensities for the same trait based on the average estimates from the MTM (Table 4). Selection gains were reduced with increasing selection intensity. The highest selection gain was estimated for GY5 and GY1 in all the selection index scenarios. On the other hand, GY3 had the lowest selection gain of all the evaluated traits.
The classification of the 16 cultivars, considering the characteristics evaluated based on the FAI-BLUP

Discussion
The success of a breeding program depends on the accurate prediction of genotypic values, which requires the use of appropriate models. In the present study, we used an MTM to estimate the variance components and select soybean cultivars with higher genetic potential at different shading levels. Soybean shading can induce physiological and morpho-agronomic changes, influencing yield and quality performance due to PAR reduction (Werner et al., 2017).
In Bayesian inference, the posterior distribution of the parameters is used, enabling the establishment of precise credibility intervals for estimates of random variables and variance components (Resende et al., 2001). Posterior distributions have been used to estimate genetic parameters for nitrogen (N) uptake and use efficiency under varying soil N levels using models such as the MTM (Torres et al., 2018). Bayesian MTMs have also been used to estimate genetic parameters for the selection of segregating soybean progenies (Volpato et al., 2019) and to estimate genetic parameters in flood-irrigated rice (Silva Junior et al., 2022a, b).
Credibility intervals of approximately 95 % for the H² parameter were obtained in the present study (Table 2). In flood-irrigated rice, the H² estimate was > 80 % (Silva Junior et al., 2022b). In corn lines, the heritability for N use efficiency was 50 %, considered highly heritable (Torres et al., 2018), indicating that the MTM estimates H² more accurately than the individual models. Eucalyptus globulus clones were evaluated and found to have moderate to high H² values (ranging from 12 to 41 % [mode value of the posterior distribution of heritability]) for the tree height trait (Mora et al., 2019). In addition to the statistical model, the H² of a trait also affects prediction accuracy (Lorenz et al., 2011; Gill et al., 2021). Low H² estimates reduce accuracy in predicting individual traits (Heffner et al., 2009). The application of the MTM can improve the prediction of poorly heritable traits using information from correlated traits with high H² (Jia and Jannink, 2012; Jiang et al., 2015; Lado et al., 2018; Bhatta et al., 2020; Gill et al., 2021). Furthermore, the MTM is more effective when there is a moderate genetic correlation between traits (Jia and Jannink, 2012). The differences among the mean, mode, and median H² estimates (Table 2) indicate asymmetry in the posterior distributions. Previous studies have also reported asymmetry among the mean, mode, and median heritability estimates of the posterior distribution (Torres et al., 2018; Silva Junior et al., 2022a, b). The low broad-sense heritability observed for the traits was independent of the number of samples evaluated, as the Bayesian structure used is recommended for analyzing small samples. Quantitative traits of agronomic interest are determined by several genes of small effect and are influenced by the environment (Falconer and MacKay, 1996), consistent with the results for the GY trait evaluated in the present study. However, traits with low heritability performed better than those with high heritability when using the MTM, as it considers the interaction between traits and genotypes while providing a better estimate of the correlation between traits (Guo et al., 2020).
Multi-trait analysis is effective and provides more accurate estimates than single-trait analysis because it considers the underlying correlation structure of a multi-trait dataset. However, both Bayesian and non-Bayesian inferences from MTM analysis are complex and computationally demanding. The shading level influenced grain yield between cultivars and between environments for the same cultivar. This effect was evident in the results presented throughout the article and was reflected mainly in the selection gains.
Accurate estimates of genetic parameters provide new perspectives on the use of Bayesian methods to model soybean genetic improvement under different shading levels. The results of the present study revealed that the MTM effectively estimates soybean genetic parameters under different shading levels. Soybean breeding programs require accurate results rapidly; therefore, the model choice and the selection index (selection pressure) should be used as breeding strategies. The FAI-BLUP index was used to select genotypes because it allows genotypes to be classified on multiple traits free of multicollinearity (Rocha et al., 2018). Therefore, the soybean cultivar 8579RSF was selected. The cultivars NS8338, NS7901, NS7667, RK8115, and 8473RSF also had higher genetic potential than the other soybean cultivars under different shading levels. Thus, based on the results of the FAI-BLUP index, these cultivars can be grown under different shading levels.
The Bayesian MTM combined with the FAI-BLUP method could be used to evaluate and select soybean genotypes considering different shading levels.

Figure 1 - Convergence for the genotypic variance of the six traits analyzed in the multi-trait model. The numbers on the right refer to the posterior density of the genetic variance estimates. The numbers on the left refer to the Markov chains for the genetic variance estimates. Grain yield: GY1 (100 % photosynthetically active radiation, PAR), GY2 (75 % PAR), and GY3 (55 % PAR) in the 2019/2020 crop season; GY4 (100 % PAR), GY5 (75 % PAR), and GY6 (55 % PAR) in the 2021/2022 crop season.

Figure 4 - Selection considering a 40 % selection intensity (selection of six cultivars). The line indicates the soybean genotypes for the different shading levels. The cultivars selected by the FAI-BLUP index correspond to the red dots outside the red line.

Biometry, Modeling, and Statistics Research article using the multiple-trait model
Ciências Agrárias da Universidade Federal dos Vales do Jequitinhonha e Mucuri (UFVJM), Unaí, Minas Gerais State, Brazil (16°26'10.48"S, 46°54'2.28"W, altitude 634 m) during the rainy season between Nov 2019 and Mar 2020, and Oct 2021 and Feb 2022, on two crop seasons with tree shading levels (PAR environments), totaling six treatments (shading level × crop season).Both experiments were conducted during the soybean crop harvest period in two different years to minimize the climate interference on the research data.

Table 1 - Posterior inferences for the mode, mean, median, and highest posterior density (HPD) interval of broad-sense heritability, considering the multi-trait model.

Table 2 - Correlation between estimates of broad-sense heritability for the different soybean shading levels, considering the multi-trait model.

Table 4 - Percentage of selection gains, factor number, and communalities obtained using the factor analysis and genotype-ideotype distance index considering five different selection intensities: 20, 30, 40, 60, and 80 %.