Evaluation of maize hybrids and environmental stratification by the methods AMMI and GGE biplot

The purpose of this study was to evaluate yield stability, adaptability and environmental stratification by the methods AMMI (Additive Main Effects and Multiplicative Interaction Analysis) and GGE (Genotype and Genotype-by-Environment Interaction) biplot and to compare the efficiency of these methods. Data from the evaluation of 20 experimental single-cross hybrids and three commercial hybrids at 11 locations in two growing seasons (2005/2006 and 2006/2007) were used. Analyses of variance, adaptability, stability and environmental stratification were performed. The best combination of adaptability and stability was observed in hybrids 10 and 16, according to the graphs of the AMMI and GGE biplot methods, respectively. The number of locations could be reduced by 28% based on stratification. The predictive correlation of the AMMI and GGE methods was 0.88 and 0.86, respectively. The results showed that it is possible to reduce the number of evaluation sites; AMMI tended to be more accurate than GGE analysis.


INTRODUCTION
The genotype by environment interaction (GE) may be reduced by using specific cultivars for each environment, by using cultivars with wide adaptability and good stability, or by stratifying the region under study into mega-environments with similar environmental characteristics, within which the interaction becomes insignificant (Terasawa Júnior et al. 2008).
There are several methodologies to assess the GE interaction, of which the most commonly used are based on simple and multiple regression. Despite their widespread use, regression-based methods have limitations that are frequently reported in the literature. Crossa (1990) argues that linear regression analysis is not informative if linearity fails, is highly dependent on the group of genotypes and environments included, and tends to oversimplify response models by explaining the variation caused by interaction in one dimension, when in fact it can be quite complex. Crossa (1990) suggested that multivariate methods can be useful to better exploit the information contained in the data. He proposed techniques such as principal component analysis (PCA), cluster analysis and the AMMI procedure (Additive Main Effects and Multiplicative Interaction Analysis), which have gained wide application in recent years. A detailed description of the AMMI methodology is given by Ebdon and Gauch (2002) and Duarte and Vencovsky (1999). The AMMI model has also been used for environmental stratification, and stratification based on the winning genotypes has been more efficient than other stratification methods (Pacheco et al. 2003).
Recently, a modification of conventional AMMI analysis, proposed by Yan et al. (2000) and denoted GGE (Genotype and Genotype-by-Environment Interaction) biplot, has been used to study the genotype by environment interaction. GGE analysis groups the genotype effect, which is an additive effect in the AMMI analysis, with the GE interaction, which is a multiplicative effect, and subjects these effects to a multiplicative site regression model (SREG, Site Regression).
The purpose of this study was to evaluate yield stability, adaptability and environmental stratification by the methods AMMI and GGE biplot, using test data to evaluate maize hybrids and compare the efficiency of the methods.

MATERIAL AND METHODS
Grain yield data of the final trials of maize hybrids were used, provided by the company Monsanto. The tests were performed in a randomized block design (three replications in the 2005/2006 and two replications in the 2006/2007 growing season) at 11 locations in two growing seasons, in a total of 22 environments, in the states of Minas Gerais, São Paulo, Paraná, Goiás, Mato Grosso do Sul, Bahia and Distrito Federal (Table 1). The plots consisted of four 5-m rows and the evaluated area per plot was 7.5 m². Of the 23 evaluated hybrids, 20 were experimental and three were commercial hybrids (described in Table 1).
To assess the genetic variability among the treatments (hybrids) and the experimental accuracy, analysis of variance was performed for each environment (individual analysis). Then a combined analysis was carried out, as proposed by Ramalho et al. (2000), considering the genotype and environment effects as fixed. The analyses of variance were performed with PROC GLM of the software package SAS v 8.0 (SAS Institute 2000).
Once the presence of GE interaction was confirmed (significant F test), the stability analysis was carried out, which allows a measurement of adaptability and yield stability of each test hybrid. The AMMI and GGE biplot methods were used in the assessment in order to compare their efficiency. The AMMI analysis was performed using the software Estabilidade, developed by Ferreira and Zambalde (1997), while the GGE biplot analysis was performed using SAS v 8.0, with IML (Interactive Matrix Language) and SAS GRAPH (SAS Institute 2000).
The AMMI analysis according to Zobel et al. (1988) combines in a single model additive components for the main effects of genotypes ($g_i$) and environments ($a_j$), and multiplicative components for the effect of the GE interaction ($ge_{ij}$). The model that describes the mean yield of genotype i in environment j is given by:

$$Y_{ij} = \mu + g_i + a_j + \sum_{k=1}^{n} \lambda_k \gamma_{ik} \alpha_{jk} + r_{ij} + \varepsilon_{ij}$$

where: $Y_{ij}$ is the average yield of genotype i in environment j; $\mu$ is the overall mean yield; $g_i$ is the effect of genotype i; $a_j$ is the effect of environment j; $\lambda_k$ is the k-th singular value of the original interaction matrix (GE); $\gamma_{ik}$ is the element corresponding to the i-th genotype in the k-th column singular vector of the GE matrix; $\alpha_{jk}$ is the element corresponding to the j-th environment in the k-th row singular vector of the GE matrix; $r_{ij}$ is the noise associated with the term $ge_{ij}$ that is not explained by the retained principal components; n is the number of axes or principal components retained to describe the GE interaction pattern; and $\varepsilon_{ij}$ is the average experimental error associated with the observation, assumed to be independent, $\varepsilon_{ij} \sim N(0, \sigma^2)$.
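As a rough illustration of the decomposition described above, the following Python sketch fits an AMMI model to a genotype-by-environment table of cell means using NumPy. The function name `ammi_fit`, its arguments and the toy data are assumptions made for illustration only; the study itself used the software Estabilidade, and the replication-level error is not modelled here.

```python
import numpy as np

def ammi_fit(Y, n_axes=1):
    """Fit AMMI with n_axes multiplicative terms to a genotypes x environments
    table of cell means (hypothetical helper, not the Estabilidade software)."""
    Y = np.asarray(Y, dtype=float)
    mu = Y.mean()                            # overall mean
    g = Y.mean(axis=1) - mu                  # genotype main effects g_i
    a = Y.mean(axis=0) - mu                  # environment main effects a_j
    ge = Y - mu - g[:, None] - a[None, :]    # interaction residuals ge_ij
    # SVD of the interaction matrix: ge = sum_k lambda_k * gamma_ik * alpha_jk
    U, lam, Vt = np.linalg.svd(ge, full_matrices=False)
    k = n_axes
    fitted = mu + g[:, None] + a[None, :] + (U[:, :k] * lam[:k]) @ Vt[:k, :]
    share = lam**2 / np.sum(lam**2)          # fraction of SS_GE carried by each axis
    return fitted, ge, lam, share

# Toy data (made up for illustration): 4 genotypes x 3 environments
Y_toy = np.array([[5.1, 6.0, 4.2],
                  [4.8, 6.5, 4.9],
                  [5.6, 5.2, 5.0],
                  [4.0, 6.1, 4.4]])
fitted1, ge_toy, lam_toy, share_toy = ammi_fit(Y_toy, n_axes=1)
```

Here `share_toy[0]` plays the role of the proportion of the GE sum of squares explained by IPCA1, and `ammi_fit(Y_toy, n_axes=2)` would correspond to an AMMI2 fit.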
For the GE interaction, the biplot is interpreted by observing the magnitude and sign of the scores of genotypes and environments on the interaction axis (or axes). Thus, low scores (close to zero) represent genotypes and environments that are little involved in the interaction and are characterized as stable. In an AMMI2 biplot, the points of stable genotypes and environments (with little contribution to the sum of squares of the GE interaction, $SS_{GE}$) lie near the origin.
To stratify the environments, the AMMI methodology was used with the winning genotypes approach, as proposed by Gauch and Zobel (1997). In this study, the groups were obtained from the AMMI1 estimates of the interaction. Thus, the strata were defined by the winning genotypes. In this context, each winning genotype, i.e., the genotype with the highest mean yield in one or more environments, determines an environmental stratum.
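The winning genotypes approach can be sketched as follows: the AMMI1 model is fitted, the genotype with the highest fitted yield is identified in each environment, and environments sharing the same winner are placed in the same stratum. The function name `ammi1_strata` and the toy table are hypothetical and serve only as a minimal illustration under these assumptions.

```python
import numpy as np

def ammi1_strata(Y):
    """Group environments by the genotype with the highest AMMI1 fitted yield
    (hypothetical helper; Y is a genotypes x environments table of means)."""
    Y = np.asarray(Y, dtype=float)
    mu = Y.mean()
    g = Y.mean(axis=1) - mu
    a = Y.mean(axis=0) - mu
    ge = Y - mu - g[:, None] - a[None, :]
    U, lam, Vt = np.linalg.svd(ge, full_matrices=False)
    # AMMI1 estimates: main effects plus the first multiplicative term only
    fitted = mu + g[:, None] + a[None, :] + lam[0] * np.outer(U[:, 0], Vt[0])
    winners = fitted.argmax(axis=0)          # winning genotype index per environment
    strata = {}
    for env, win in enumerate(winners):
        strata.setdefault(int(win), []).append(env)
    return strata

# Made-up table with a crossover: genotype 0 wins in environments 0 and 1,
# genotype 2 wins in environments 2 and 3
Y_toy = np.array([[13.0, 13.0,  9.0,  9.0],
                  [10.0, 10.0, 10.0, 10.0],
                  [ 7.0,  7.0, 11.0, 11.0]])
strata_toy = ammi1_strata(Y_toy)   # {0: [0, 1], 2: [2, 3]}
```

Each key of the returned dictionary is a winning genotype (zero-based index) and each value lists the environments that form its stratum.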
To verify the efficiency of SREG, denoted GGE biplot (Genotype and Genotype-by-Environment Interaction) by Yan et al. (2000), in explaining the effect of genotypes (G) plus the GE interaction in relation to the AMMI analysis, the GGE analysis was performed using SAS v. 8.0 (SAS Institute 2000), considering the simplified model for two principal components:

$$Y_{ij} - \mu_j = \lambda_1 \gamma_{i1} \alpha_{j1} + \lambda_2 \gamma_{i2} \alpha_{j2} + \rho_{ij} + \varepsilon_{ij}$$

where $Y_{ij}$ is the mean yield of cultivar i in environment j; $\mu_j$ is the mean of environment j; $\lambda_1 \gamma_{i1} \alpha_{j1}$ is the first principal component (PCA1); $\lambda_2 \gamma_{i2} \alpha_{j2}$ is the second principal component (PCA2); $\lambda_1$ and $\lambda_2$ are the eigenvalues associated with PCA1 and PCA2; $\gamma_{i1}$ and $\gamma_{i2}$ are the scores of PCA1 and PCA2, respectively, for the genotype effect; $\alpha_{j1}$ and $\alpha_{j2}$ are the scores of PCA1 and PCA2, respectively, for the environmental effect; $\rho_{ij}$ is the residue of the genotype-environment interaction, also known as noise, corresponding to the principal components not retained in the model; and $\varepsilon_{ij}$ is the residual effect of the model, normally distributed with zero mean and variance $\sigma^2/r$ (where $\sigma^2$ is the variance of the error between plots for each environment and r is the number of replications).
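A minimal Python sketch of the site regression (GGE) decomposition described above is given below: the environment means are removed from the table of cell means and the remainder (G + GE) is decomposed by SVD. The name `gge_scores` and the toy data are assumptions for illustration; the study used SAS/IML, and biplot-specific scaling of the scores (for example, symmetric scaling by the square root of the singular values) is not applied here.

```python
import numpy as np

def gge_scores(Y, n_axes=2):
    """SVD of the environment-centred table (G + GE), as in a site regression
    model (hypothetical helper; Y is genotypes x environments)."""
    Y = np.asarray(Y, dtype=float)
    env_mean = Y.mean(axis=0, keepdims=True)   # mu_j
    Z = Y - env_mean                           # G + GE effects
    U, lam, Vt = np.linalg.svd(Z, full_matrices=False)
    k = n_axes
    geno = U[:, :k]                            # genotype scores gamma_ik
    env = Vt[:k].T                             # environment scores alpha_jk
    fitted = env_mean + (geno * lam[:k]) @ env.T
    share = lam[:k]**2 / np.sum(lam**2)        # fraction of SS(G + GE) per retained axis
    return geno, lam[:k], env, fitted, share

# Toy data (made up for illustration): 4 genotypes x 3 environments
Y_toy = np.array([[5.1, 6.0, 4.2],
                  [4.8, 6.5, 4.9],
                  [5.6, 5.2, 5.0],
                  [4.0, 6.1, 4.4]])
geno2, lam2, env2, fitted2, share2 = gge_scores(Y_toy, n_axes=2)
```

Plotting the two columns of `geno2` and `env2` against each other would give the raw material for a GGE2 biplot, and `share2.sum()` is the proportion of G + GE retained by the two axes.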
The AMMI1 and GGE biplot graphs were compared for their efficiency in retaining the greatest portion of the sum of squares of the GE interaction effects, as well as of genotype (G) + genotype-environment (GE) interaction. For this purpose, the sums of squares of G and GE contained in PCA1 and PCA2 of the GGE biplot were partitioned according to the expressions of Gauch et al. (2008), in which SSG and SSGE are the sums of squares of genotype and genotype-environment (GE) interaction contained in the first two principal components of GGE2 (GGE model with two principal components), together with the first two scores for genotype (G) and environment (E), respectively; K is the matrix of the phenotypic means distributed along the k-th column; and P = I - K, where I is the identity matrix contained in the singular value decomposition (SVD) of G + GE.
Gauch and Zobel (1988) argue that methods of model evaluation based on F tests are not efficient for the selection of parsimonious models and are likely to include noise. In contrast, predictive evaluation criteria capitalize on the ability of a model to predict data not included in the analysis, simulating future responses not yet measured, so it would be preferable to choose the model based on these criteria.
Unless the choice of a model or performance evaluation of a predictor is based on assumptions of distribution, the method providing the most general results should be adopted.Methods essentially based on data free from theoretical distributions have the greatest generality.These methods involve resampling in a given data set by techniques such as jackknife, bootstrap and cross validation (Dias and Krzanowski 2003).
The accuracy of the graphical identification methods of mega-environments and winning genotypes was tested by the cross-validation procedure proposed by Gabriel (2002). For this purpose, the statistics PRESSm and PRESScorr were used to measure, respectively, the discrepancy between observed and predicted values and the predictive correlation (Dias and Krzanowski 2003). Gabriel (2002) proposed an algorithm for cross-validation that partitions the data matrix and applies the singular value decomposition (SVD). The algorithm is developed based on the submatrix $X_{\backslash 11}$ obtained from the partition:

$$X_{g \times e} = \begin{bmatrix} x_{11} & \mathbf{x}_{1\cdot}^{T} \\ \mathbf{x}_{\cdot 1} & X_{\backslash 11} \end{bmatrix}$$

where:

$$X_{\backslash 11} = \sum_{k=1}^{r} \mathbf{u}_{(k)} d_k \mathbf{v}_{(k)}^{T} = U D V^{T}$$

in which r is the number of multiplicative components under analysis, and $\mathbf{u}_{(k)}$, $d_k$ and $\mathbf{v}_{(k)}$ correspond, respectively, to the elements $\gamma_{ik}$, $\lambda_k$ and $\alpha_{jk}$ (k = 1, 2, ..., r) derived from the SVD, described above for the models evaluated. The predicted value for $x_{11}$ is given by $\hat{x}_{11} = \mathbf{x}_{1\cdot}^{T} V D^{-1} U^{T} \mathbf{x}_{\cdot 1}$. The residue of the cross-validation is obtained by $e_{11} = x_{11} - \hat{x}_{11}$.
Similarly, the cross-validation fitted values $\hat{x}_{ij}$ and the errors $e_{ij} = x_{ij} - \hat{x}_{ij}$ are calculated for all other elements $x_{ij}$ (i = 1, 2, ..., g, where g is the number of genotypes; j = 1, 2, ..., e, where e is the number of environments). The errors and fitted values are summarized by $\mathrm{PRESS}_m = \sum_{i} \sum_{j} e_{ij}^{2}$ and $\mathrm{PRECORR}(m) = \mathrm{Corr}(x_{ij}, \hat{x}_{ij} \mid \forall\, i,j)$, respectively. According to Gabriel (2002), the model that provides the lowest $\mathrm{PRESS}_m$ value should be chosen.
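The leave-one-out scheme described above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: the function name `gabriel_cv` is hypothetical, PRESS is computed here as a plain (unnormalised) sum of squared errors, and the input is assumed to be whichever matrix the multiplicative model decomposes.

```python
import numpy as np

def gabriel_cv(X, m):
    """Leave-one-out cross-validation of a rank-m SVD approximation in the
    style of Gabriel (2002) (hypothetical helper).
    m must not exceed min(g - 1, e - 1) and the first m singular values of
    every submatrix must be non-zero."""
    X = np.asarray(X, dtype=float)
    g, e = X.shape
    pred = np.empty_like(X)
    for i in range(g):
        for j in range(e):
            # Submatrix without row i and column j, and the deleted row/column
            sub = np.delete(np.delete(X, i, axis=0), j, axis=1)
            x_row = np.delete(X[i, :], j)     # row i without element j
            x_col = np.delete(X[:, j], i)     # column j without element i
            U, d, Vt = np.linalg.svd(sub, full_matrices=False)
            # x_hat_ij = x_row' V D^{-1} U' x_col, using m components
            pred[i, j] = ((x_row @ Vt[:m].T) / d[:m]) @ (U[:, :m].T @ x_col)
    err = X - pred
    press = float(np.sum(err**2))                                  # PRESS_m
    precorr = float(np.corrcoef(X.ravel(), pred.ravel())[0, 1])    # PRECORR(m)
    return pred, press, precorr

# Sanity check on a made-up, exactly rank-one table: with m = 1 the
# cross-validated predictions reproduce the data up to rounding error
X_rank1 = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
pred1, press1, precorr1 = gabriel_cv(X_rank1, m=1)
```

Running `gabriel_cv` for several values of m and keeping the one with the lowest PRESS mirrors the model choice rule stated above.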
The empirical means for the genotypes ($Y_{ij}$) are explained by the genotype scores in the following sense: in the biplot graph, the higher the score of the first principal component (PCA1), the higher the genotype mean, and if the score of the second principal component (PCA2) is close to zero, the genotype is considered more stable (Yan et al. 2000).

RESULTS AND DISCUSSION
Based on the combined analysis, it was confirmed that the sources of variation genotypes, environments and GE interaction were significant at 1% probability in all 22 environments. This high significance of the GE interaction indicates different responses of the genotypes in each evaluation environment. The coefficient of variation of the combined analysis was 9.85%, indicating high experimental accuracy in the test set.
The results of the AMMI analysis (Table 2) show the lowest PRESSm value for the AMMI1 model, i.e., the discrepancy between the values observed and predicted by the model was lowest. Consequently, the predictive correlation (PRESScorr) of the AMMI1 model was highest. Thus, the AMMI1 biplot was used for the graphical analysis (main effects vs. scores of the first interaction principal component, IPCA1).
The first principal component of the AMMI model explained only 21.88% of $SS_{GE}$; however, the AMMI1 model was the most accurate, as mentioned above (Table 2). This was the case because much of $SS_{GE}$ may be associated with noise, so the pattern of GE interaction would be inexpressive (Duarte and Vencovsky 1999).
The AMMI1 graph shows that hybrids 4, 6, 10, 14, and 17 stood out with the lowest IPCA1 scores (Figure 1). This indicates that these hybrids were least involved in the interaction, and are therefore the most stable. However, only the yields of hybrids 6 and 10 were above average. The mean hybrid yield performance, in decreasing order, ranked hybrid 15 first (most productive), 11 and 18 second, and 16, 19 and 10 third. There was no significant difference between hybrids 11 and 18, nor among hybrids 16, 19 and 10, by the Scott and Knott (1974) test at 5% probability (Table 1). Thus, considering adaptability and stability in all environments, the best hybrid is 10, since it ranked third in yield and was more stable than the highest-yielding hybrids (Figure 1).
Table 2. Recovery of the sums of squares of genotypes ($SS_G$) and of the genotype-environment interaction ($SS_{GE}$) by the methods AMMI1, AMMI2 and GGE2, based on the mean grain yield (kg ha⁻¹) of 23 maize hybrids evaluated in 22 environments. In the first principal component of the AMMI1 (IPCA1) diagram, the additive effects of G and E were used; it was assumed that this principal component accounts for 100% of the additive effects.
Group 1 was composed of only five locations in the same growing season. However, of these five sites, the IPCA1 environmental scores of three (Brasília, Chapadão do Sul, Iraí de Minas) were very close. This indicates that these sites affect the general genotype performance similarly, so only one of them needs to be used in the group. Thus, the group was composed of the following locations: Chapadão do Sul, Rio Verde and Uberaba. The 17 environments that make up group 2 are distributed over the 11 sites, six of which are present in two seasons (Barreiras, José Bonifácio, Passos, Presidente Olegário, Rolândia, and Três Corações). However, the IPCA1 environmental scores of Barreiras and Presidente Olegário were very close in both growing seasons. Thus, only five sites formed group 2, which are: José Bonifácio, Passos, Presidente Olegário, Rolândia, and Três Corações.
Regarding the number of sites evaluated, the AMMI1 model allowed a reduction of 28% in the number of sites used in the evaluation tests of maize hybrids. This reduction in the number of sites to be used in future evaluation tests represents a significant reduction in the costs of obtaining new hybrids in breeding programs, since genotype assessment is the most costly step of a breeding program (Terasawa Júnior et al. 2008).
In the GGE biplot method, the first two principal components (PCA1 and PCA2), derived by singular value decomposition of the effects of genotype (G) + interaction (GE), were presented. The first principal component (PCA1) indicates genotype adaptability, i.e., it is highly correlated with yield (Yan et al. 2000). Accordingly, it can be seen that hybrid 15 was best adapted to the evaluation environments, followed by hybrids 11, 18, 16, and 19 (Figure 3). The second principal component (PCA2) indicates phenotypic stability, i.e., genotypes with PCA2 closer to zero would be the most stable (Yan et al. 2000). Thus, the stability of the hybrids was, in decreasing order, hybrid 6 > 22 > 16 > 5 > 12 > 3 (Figure 3). Analyzing the two components of the graph, it was concluded that the best genotype, considering adaptability and stability, was hybrid 16, for being among the most stable and also the third most productive hybrid (Figure 3).
The GGE biplot also presents an environmental stratification based on the winning genotypes. Figure 3 shows the formation of two groups of environments (also called mega-environments), i.e., environments determined by winning genotypes. These genotypes are located at the vertices of the polygon and the mega-environments are separated by lines perpendicular to the polygon sides. Hybrids 15 and 18 determined mega-environments I and II, respectively, i.e., they are the most productive genotypes in the environments included in each mega-environment.

Figure 1. AMMI1 biplot with the main effects vs. the first interaction principal component (IPCA1), corresponding to the representation of 23 hybrids and 22 environments. The identification of the hybrids and environments, with the respective reference numbers used in this diagram, is shown in Table 1.

Figure 3. GGE biplot with the first two principal components of G + GE (PCA1 and PCA2), corresponding to the representation of 23 genotypes and 22 environments (numbers preceded by the letter E). The numerals I and II represent the mega-environments I and II, respectively. The identification of the hybrids and environments, with the respective reference numbers used in this diagram, is shown in Table 1.