1 Crop Breeding and Applied Biotechnology - 20(2): e282920217, 2020 Adaptability and yield stability of soybean genotypes by REML/BLUP and GGE Biplot

: The aim of this study was to ascertain the association between the REML/BLUP and GGE Biplot methodologies for selection of superior genotypes in regard to adaptability and yield stability for various regions of the Middle North region of Brazil. Sixteen soybean genotypes were evaluated in eight environments during the 2015/2016 and 2016/2017 crop seasons, analyzing the following traits: number of days to maturity, plant height, one hundred seed weight, and grain yield. In this study, the REML/BLUP and the GGE Biplot methods are highly correlated in terms of genotype ranking for selection and recommendation purposes. The genotypes BRASBT13-0528, M8372 IPRO, and BRASBT13-0621 most approximate a hypothetical ideal genotype.


INTRODUCTION
Soybean [Glycine max (L.) Merrill] is of great socioeconomic importance for Brazil as it is the basis for various food products and for animal feed. In the 2017/2018 crop season, Brazilian production was 118.8 million tons, with a mean yield of approximately 3334 kg ha -1 (CONAB 2019).
The Brazilian Cerrado (tropical savanna) regions constitute 204 million hectares, 10.7% of which are in the states of Maranhão and Piauí, which compose the Middle-North or Western Northeast region of Brazil. This region is prominent in food production due to its climate conditions and soils conducive to grain production (Cardoso et al. 2012). Over the past two decades, there has been an expressive increase in soybean production in the region, from 459.3 thousand tons in the 1998/1999 crop season to about 5.5 million tons in the 2017/2018 crop season (CONAB 2019).
The increase in soybean production has mainly come about through advances in crop breeding. Higher-yielding genotypes with desirable characteristics are selected through experiments in different environments (years and locations), which show that the same cultivar may perform differently according to the growing environment. This fluctuation arises from the genetic and environmental components and the interaction between them, known as the genotype × environment (G × E) interaction (Bornhofen et al. 2017). The effects of this interaction are one of the main challenges faced by breeders at the time of GMC Gonçalves et al. identification, selection, and recommendation of cultivars. This interaction may lead to inconsistency in classification of genotypes in the various environments tested and minimize the magnitude of the association between phenotypic and genotypic values (Polizel et al. 2013).
An alternative to attenuate the effects of the G × E interaction is the identification and selection of genotypes with greater adaptability and yield stability. Numerous methods have been reported in the literature for studying adaptability and stability in multi-environment trials. The methods proposed may be based on the components of analysis of variance, on the regression method, on non-parametric methods, on multivariate methods, on mixed models, and on new methods, such as factor analytic models (Van Eeuwijk et al. 2016, Carvalho et al. 2016, Li et al. 2017, Sousa et al. 2020. Methodologies based on mixed models (REML/BLUP) are currently in wide use in plant breeding programs (Smith et al. 2005, Torres Filho et al. 2017, Sousa et al. 2019. The REML (Restricted Maximum Likelihood) procedure estimates the variance components and genetic parameters, and BLUP (Best Linear Unbiased Prediction) is the ideal selection procedure for additive genetic and dominance effects and genotypic effects (Resende 2016). In the context of mixed models, the following parameters can be obtained: harmonic mean of the genotypic values (HMGV), to infer stability and yield; relative performance of genotypic values (RPGV), to analyze genotypic adaptability and yield; and harmonic mean of the relative performance of the genotypic values (HMRPGV), to simultaneously evaluate stability, adaptability, and yield (Resende 2007).
Considering the multivariate methods, a model frequently used, proposed by Yan et al. (2000), is the main effects of genotypes + multiplicative effect of the genotype × environment interaction, called GGE Biplot. This analysis clusters the additive effects of the genotypes with the multiplicative effects of the interaction and places them under principal component analysis. Interpretation of the results of the GGE Biplot is facilitated by the graphic display of the biplots, allowing important aspects to be observed, as for example, the formation of mega-environments, the genotypes and ideal environments, and the most representative and discriminant environments, as well as easier comparison of the genotypes evaluated (Yan et al. 2000).
In spite of the expansion of soybean in the Middle-North region of Brazil in recent years, there are few studies in the literature aiming to select genotypes for adaptability and yield stability for the region. Thus, the aim of this study was to ascertain the association between the REML/BLUP and GGE Biplot methodologies for the selection of soybean genotypes with superior yield, stability, and adaptability for the Middle-North region of Brazil.

MATERIAL AND METHODS
Data from the Value for Cultivation and Use (VCU) trials were used from the following sixteen soybean genotypes, with their respective maturity groups: twelve lines with the Intacta® technology originating from the Soybean Breeding Program The experiments were conducted in a randomized block design with three replications. The experimental plot was composed of four 5-meter rows spaced at 0.5 m between rows. The two center rows were used for data collection, disregarding the 0.5 m from the ends as a border area. All the trials were sown mechanically, uniformly distributing 15 seeds per linear meter. All crop management practices were undertaken according to crop requirements, following Sediyama et al. (2015).
At crop maturity, the following agronomic traits were evaluated in the area of the plot used for data collection: number of days to maturity (NDM), defined as the period from sowing to the R8 phenological stage, in which 95% of the pods in the area are mature; plant height (PH), measured in cm, comprising the distance between the soil surface and the tip of the main stem, in ten random plants; one hundred seed weight (HSW), weight in g of 100 seeds taken at random; and grain yield (GY), corresponding to the grain yield of plants after harvest and processing, with grain weights adjusted to 13% moisture and the value extrapolated to kg ha -1 .
To estimate the variance components and to predict of the effects of genotypes and of the genotype × environment interaction by REML/BLUP, the follow mixed model was used: y = Xr + Zg + Wi + e, where y is the data vector; r, g, i, and e correspond to the effects of the blocks added to the overall mean (considering all the replications of all the locations -assumed as fixed), genotypic effects (assumed as random), effects of the G × E interaction (random), and the random errors, respectively; and X, Z, and W represent the incidence matrices for the aforementioned effects.
The following parameters were estimated: phenotypic variance (σ 2 f ); genotypic variance (σ 2 g ); residual variance (σ 2 e ); variance of the genotype × environment interaction (σ 2 c ); genotypic correlation (r gloc = σ 2 c / σ 2 g + σ 2 c ); heritability of the average of genotypes (h 2 mg = σ 2 g /(σ 2 g + σ 2 e /e )), where e is the number of environments; accuracy in the selection of genotypes (r̂g g = ĥ 2 mg ); and the relative coefficient of variation, calculated by CV g /CV e , where CV g and CV e are the coefficient of genetic and environmental variation, respectively. To test the significance of the random effects of the model, the likelihood ratio test (LRT) was conducted, generating a Deviance Analysis table (Resende 2016). Based on this model, the genotypic values free of all interaction with environments were obtained by μ + g i , in which μ is the mean of all the environments and g i is the genotypic effect free of the genotype × environment interaction. For each environment j, the genotypic values were predicted by μ + g i + (ge) ij , where μ is the mean of environment j, g i is the genotypic effect of genotype i in environment j, and (ge) ij is the effect of the G × E interaction in relation to genotype i and environment j (Resende 2007).
For adaptability and stability analyses, only the GY trait was considered. By the REML/BLUP method, the following parameters were obtained: harmonic mean of genotypic values (HMGV) obtained by the equation , where e is the number of environments in which genotype i was evaluated, and GC ij is the genotypic value of genotype i in environment j; relative performance of genotypic values (RPGV) obtained by the expression where μ j is the mean of environment j; harmonic mean of RPGV (HMRPVG) calculated by the equation HMRPGV i = e/Σ e j=1 (1/RPGV j ), whose terms have already been explained (Resende 2007). The mean genotypic value, capitalizing on adaptability, is obtained by RPGV multiplied by the general mean of all environments (RPVG*µ), and the average genotypic value penalized by instability and capitalized by stability is calculated by the HMRPGV multiplied by the general mean of all environments (HMRPGV*µ). The mixed model analysis, analysis of deviance, and adaptability and stability analyses were performed using the SELEGEN-REML/BLUP software (Resende 2007(Resende , 2016. Adaptability and yield stability were also evaluated by the GGE Biplot method, proposed by Yan and Rajcan (2002). Analysis was based on information of genotypic means, considering the model Y̅ ij -μ = G i + A j + GA ij , where Y̅ ij represents the genotypic value of genotype i in environment j; μ is the overall mean of the observations; G i is the principal effect of genotype i; A j is the principal effect of environment j; and GA ij is the effect of the genotype i and environment j interaction. The GGE Biplot model does not separate the genotype (G) effect from the effect of the (G × E) interaction, maintaining them together in two multiplicative terms, represented by the expression: Y ij -μ -β j = g i1 e i1 + g i2 e i2 + ε ij , where Y ij is the yield expected from genotype i in environment j; μ is the overall mean of the observations; β j is the principal effect of environment j; g i1 and e i1 are the principal scores of genotype i and environment j, respectively; g i2 and e i2 are the secondary scores for genotype i and environment j, respectively; and, ε ij is the unexplained residue for both effects.
The biplot of the GGE Biplot model was constructed by means of simple dispersion of g i1 and g i2 for genotypes and e i1 and e i2 for environments by decomposition of the singular value, according to the equation Y ij -Y j = λ 1 ε i1 ρ j1 + λ 2 ε i2 ρ j2 + ε ij , where λ 1 and λ 2 are the highest eigenvalues of the first principal component (PCA1) and second principal component (PCA2), respectively; ε i1 and ε i2 are the eigenvalues of genotype i for PCA1 and PCA2, respectively; and ρ j1 and ρ j2 are the eigenvalues of environment j for PCA1 and PCA2, respectively Rajcan 2002, Yan andTinker 2006). Analysis was GMC Gonçalves et al.
performed with the assistance of the R statistical environment (R Development Core Team 2014) using the GGEBiplotGUI package (Frutos et al. 2014), selecting the no-scaling model; tester-centered G + GE, which corresponds to the GGE Biplot model; the singular value partitioning (SVP) method of the column metric preserving type; and the biplot type considering the first and second principal components (PC).

RESULTS AND DISCUSSION
By the likelihood ratio test, a significant effect (P < 0.01) was observed for genotypes and for the G × E interaction for all the traits evaluated (Table 1). Thus, it can be inferred that there is genetic variability for the traits, allowing selection of genotypes with superior performance, and that the genotypes exhibited differential performance according to the crop environment. The significance of the G × E interaction can affect selection of the best genotypes, hindering recommendation of new cultivars by breeders. Freiria et al. (2018), Hamawaki et al. (2018), and Volpato et al. (2018) also found significance for interaction upon studying soybean genotypes.
The variance components revealed a greater contribution from genotypic variance in phenotypic expression (σ 2 f ) of the NDM and PH traits (Table 1). In contrast, for HSW and GY, a greater contribution was observed from environmental variance and from variance of the G × E interaction for phenotypic variation. The relative coefficient of variation, higher than 1.0 for the NDM and PH traits and lower than 1.0 for HSW and GY, also shows that most of the phenotypic variation is attributed to genetic causes in the first case and to environmental variation in the second case. Therefore, although the traits evaluated showed complex inheritance, HSW and GY were more affected by the environment. This result was expected, since GY is controlled by many genes and is highly affected by the crop environment (Costa et al. 2015).
The coefficients of mean heritability of genotype were 0.70 for GY and 0.97 for NDM (Table 1). The parameter is estimated using means of blocks as the criterion of evaluation and/or selection (Resende 2007). In accordance with the values obtained for all the traits (> 0.70), selection of soybean genotypes based on predicted genotypic values can be performed relatively easily. GY and HSW showed lower heritability compared to NDM and PH, since GY and HSW were more affected by the environment and the G × E interaction in their phenotypic expression.
Selective accuracy was used as a parameter to evaluate experimental accuracy. The selective accuracy is one of the most relevant parameters for evaluation of the quality of experiments, defined as the correlation between the genotypic values predicted from experimental data and the true genotypic values (Resende and Duarte 2007).
The traits NDM, PH, HSW, and GY had selective accuracy of 0.98 (very high experimental precision), 0.98 (very high experimental precision), 0.88 (high experimental precision), and 0.84 (high experimental precision), respectively For GY and HSW, the estimates of correlation between the environments (0.35 and 0.53, respectively) were of smaller magnitude than the estimates for NDM and PH (0.63 and 0.72, respectively) ( Table 1). Such estimates show the predominance of the complex type interaction, inhibiting selection and recommendation of superior genotypes (Costa et al. 2015).
When the effects of treatments are accepted as random, it is not necessary to perform multiple mean comparison tests (Resende 2004). Thus, from the estimates of the genotypic means free of interaction for the traits evaluated (Table  2), the cycle of the genotypes, with a mean of 104.89 days, ranged from 96.85 days (G13) to 114.98 days (G16). The lines, in general, were earlier than the controls. Evaluation of the soybean cycle is of fundamental importance to correctly plan the time of sowing and harvest, which may affect growth and development of the crop (Hu and Wiatrak 2012).
For PH, a variation from 50.80 cm (G7) to 83.43 cm (G6) was observed, with a mean of 67.01 cm. The controls were slightly taller than the lines. The soybean plant should achieve a height from 60.0 cm to 110.0 cm for better combine harvesting efficiency (Shigihara and Hamawaki 2005).
The HSW ranged from 13.61 g (G1) to 15.13 g (G13), with a mean of 14.39 g. The lines and the controls had similar mean HSW, of 14.33 g and 14.55 g, respectively. This trait is related to seed vigor and, consequently, plant vigor, and is highly correlated with soybean grain yield (Arshad et al. 2006).
The genotype G7 was the lowest yielding (2597.19 kg ha -1 ) and G16 was the highest (3238.09 kg ha -1 ) ( Table 2). In general, the lines yielded less than the controls, with mean yield of 2970.00 kg ha -1 and 3019.07 kg ha -1 , respectively. Among the lines, G5, G8, and G3 stand out, with grain yield of 3181.64 kg ha -1 , 3139.96 kg ha -1 , and 3128.61 kg ha -1 , respectively, higher than the controls G13 and G15.

GMC Gonçalves et al.
From analysis of the predicted genotypic values for grain yield in all the environments evaluated (Table 3), a change could be seen in the order of the best genotypes due to the crop environment, showing the effect of the genotype × environment interaction and the low genotypic correlation observed for the trait (Table 1).
The results of stability (HMGV), adaptability (RPGV), and simultaneous stability and adaptability (HMRPGV) ( Table  4) show that the five best genotypes based on the HMGV, RPGV, and HMRPGV criteria correspond to the five highest yielding genotypes (Table 4). In relation to HMGV, there was an inversion between the genotypes G5 and G14. For RPGV and HMRPGV, the order of the five best genotypes was the same as obtained for the mean genotypic value.
The five best genotypes (G16, G14, G5, G8, and G3) by the HMRPGV*µ criterion had grain yield of 3293.69 kg ha -1 , 3257.66 kg ha -1 , 3235.49 kg ha -1 , 3184.15 kg ha -1 , and 3140.50 kg ha -1 , respectively, representing superiority over the overall mean of 10%, 9%, 8%, 7%, and 5%, respectively. Among the advantages of using HMRPGV*µ are selection through genotypic adaptability and stability, since the effects of genotypes are considered as random; the ability to work with unbalanced data and heterogeneity of the variances; and provision of results regarding the actual magnitude of the trait evaluated, allowing it to be used for any number of environments (Resende 2007).
By the "which-won-where" pattern of the GGE Biplot ( Figure 1A), the polygon was delimited by the genotypes G16, G14, G5, G3, G1, G7, and G13, which correspond to the genotypes present in the vertices more distant from the origin of the biplot and with the best performance in one or more environments (Yan and Rajcan 2002). The vectors (red lines) that go out from the center of the biplot (0.0) delimited the diagram in six sectors, forming two mega-environments: I) A1, A4, A6, A3 and II) A2, A5, A8. A mega-environment can be defined as a cluster of positively correlated environments or sub-regions in which a genotype or a group of genotypes is specifically adapted and achieves better performance (Yan et al. 2000). The genotypes located in sectors that do not have any clustered environment showed low yield and were therefore unfavorable in all the environments tested.

GMC Gonçalves et al.
According to the "mean x stability" diagram ( Figure 1B), the continuous green line with a single arrow, called the "average-environment axis" (AEA), indicates the genotypes that exhibited greater mean yield performance. The second continuous green line, perpendicular to the AEA, indicates greater variability (lower stability) in any direction, such that the greater the length of the dotted green line, the less stable the genotype is (Yan and Tinker 2006). In addition, it allows separation of the genotypes that are above or below the mean.
The genotypes that had higher grain yield are, in decreasing order, G16, G14, G5, G8, G3, G6, G2, G11, and G12. In contrast, G15, G10, G9, G4, G13, and G7 had the lowest yields, with performance lower than the mean. In relation to stability, the genotype G4 was the most stable, while the genotype G16 had the lowest stability. The genotypes G14 and G5 stand out because of simultaneous high yields and high stabilities.
The genotypes that are nearest the center of the concentric circles are the most desirable ( Figure 1C). The genotype should have both high yield performance and high stability (Yan et al. 2000). Therefore, the REML/BLUP and GGE Biplot methodologies coincide in identification of superior soybean genotypes for the Middle North region of Brazil. Among the 16 genotypes evaluated, G5, G14, and G8 most approximate a hypothetical ideal genotype.