Mega-environment analysis of maize breeding data from Brazil

ABSTRACT: The development and recommendation of single cross maize hybrids (SH) for use over extensive land areas (mega-environments) and in different crop seasons requires many experiments under numerous environmental conditions. The question we asked is whether the data from these multi-environment experiments are sufficient to identify the best hybrid combinations. The aim of this study was to critically analyze the phenotypic yield data from experiments established by a large seed producing company under a high level of imbalance. Data from the evaluation of 2770 SH were used, from experiments conducted over four years, involving the first and second crop seasons, in 50 locations in different years and regions of Brazil. Different types of analysis were carried out and genetic and non-genetic components were estimated, with emphasis on the different interactions of the SH with the environments. Results showed that the number of hybrids common to these experiments is normally small. The estimates of the correlations between the means of the hybrids coinciding in pairs of environments were of low magnitude. The hybrid × crop season interaction was always substantial; however, the interactions of hybrids with other environmental variables were also important. Under these conditions, alternatives were discussed for making more efficient use of the information obtained from the experiments in the process by which companies obtain new hybrids.


Introduction
Two maize crop seasons per year are common in Brazil. The first crop occurs from September to December, while the second occurs from January to April. Environmental conditions in these two crop seasons are quite distinct with respect to temperature and rainfall distribution. In addition, farmers' use of technology in maize growing is quite diversified. This makes the selection of hybrids for recommendation under these different conditions a much greater challenge than in temperate regions, for example.
In order for a breeder to be successful in identifying hybrids adapted to the mega-environment of maize growing, the hybrids obtained annually must be broadly evaluated. Clearly, these evaluations will only be successful if the experiments are conducted in the greatest possible number of environments. Experience in this respect shows that secure recommendation has only been possible through medium-term results coming from hundreds of replications (Troyer, 1996; Gaffney et al., 2015). However, companies obtain thousands of hybrids annually, which makes testing in multiple replications difficult. As a result, the same hybrid will rarely be evaluated in all the environments, which leads to highly unbalanced data and consequently hinders decision making at the time of recommendation.
In many situations, these experiments are used to evaluate the possibility of employing genomic selection in the prediction of potentially superior hybrid combinations. In this sense, the more accurate the model, the stronger the association between the response predicted from the genotyped lines and the future performance of the hybrid. Previous experience shows that the effect of the hybrid × environment interaction greatly complicates the prediction process. Because of this interaction, the responses of the hybrids do not coincide in the diverse environments evaluated. An alternative is to include this effect in the predictive models to obtain more accurate information (Lado et al., 2016; Ferrão et al., 2018; Dias et al., 2018a; Montesinos-López et al., 2019; Krause et al., 2020). The question is whether the hybrid × environment interaction component obtained from highly unbalanced experiments can contribute to the predictive models.
Thus, the purpose of the present study was to analyze the phenotypic data from yield experiments of different crop years and seasons, estimate genetic and phenotypic parameters under a high level of imbalance, and comment on the impact of these conditions on breeders' decisions when selecting maize hybrids.

Genetic material, experimental design, and environments
Grain yield data (t ha⁻¹), kindly provided by a Brazilian company of hybrid maize cultivars, were used in this study. These data were obtained over four years with two crop seasons per year, including numerous locations in the central and southern regions of Brazil (Figure 1). During this period, 2770 SH of maize were evaluated. These hybrids originated from crosses of 447 lines coming from different tropical, subtropical, and temperate regions around the world. Because the breeding program in question is a line introgression program, the number of SH common to the two crop seasons, as well as the number of experiments and of treatments evaluated per experiment in each location, was quite variable (Table 1 and Figure 2).

Genetics and Plant Breeding. Research Article. Multi-environment data analysis. Sci. Agric. v.79, n.2, e20200314, 2022

Randomized block (RBD) and incomplete block (IBD) experimental designs were used for evaluation of the hybrids, with two or three replications. The plots consisted of four 5 m rows with a 0.7 m between-row spacing. Different experiments were set up within each crop season in the same location. The experiments within each location were connected through check varieties in common, since the hybrids evaluated in each experiment were not necessarily the same. Additional information regarding the number of hybrids, the locations, experiments, replications, and experimental design adopted in each crop season is provided in Table 1. Each crop season was identified by an abbreviation that corresponds to the year of sowing (2011, 2012, 2013, or 2014), followed by the crop season (first crop, s1, or second crop, s2) and region (West Center, C, or South, S) (Figure 1 and Table 1).
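The abbreviation scheme for crop seasons can be illustrated with a small parser. This is only a sketch: the function name `parse_crop_season` and the field names of the returned record are hypothetical and are not part of the original analysis.

```python
import re
from typing import NamedTuple


class CropSeason(NamedTuple):
    year: int    # year of sowing (2011 to 2014)
    season: int  # 1 = first crop (s1), 2 = second crop (s2)
    region: str  # "C" = West Center, "S" = South


# Year, then "s1" or "s2", then the region letter, e.g. "2011s1C".
_PATTERN = re.compile(r"^(2011|2012|2013|2014)s([12])([CS])$")


def parse_crop_season(code: str) -> CropSeason:
    """Split an abbreviation such as '2011s1C' into year, season, and region."""
    match = _PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a valid crop season code: {code!r}")
    year, season, region = match.groups()
    return CropSeason(int(year), int(season), region)


print(parse_crop_season("2012s2S"))
```

Keeping the three parts separate makes it straightforward to group environments by year, sowing time, or region when pairs of crop seasons are compared.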

Statistical analyses
For better characterization of the dataset, the overall mean per crop season and the variation among mean values of the SH in the different experiments and, subsequently, in the locations were estimated. Considering only the data from the 2011s1C crop season, for the purpose of making inferences regarding what occurs among locations in the same crop season, the genetic variance ($s_g^2$) among the hybrids evaluated, the error variance ($s_e^2$), and the heritability ($h^2$) were estimated in each location using the following statistical model:
$$y = X\tau + Z_g u_g + Z_{ge} u_{ge} + Z_b u_b + e$$
where $y$ is the vector of phenotypic observations; $\tau$ is the vector of fixed effects of experiments; $u_g$ is the vector of random genotypic effects of hybrids, with $u_g \sim N(0, s_g^2 I_g)$; $u_{ge}$ is the vector of random effects of the hybrid by experiment interaction, with $u_{ge} \sim N(0, s_{ge}^2 I_{ge})$; $u_b$ is the vector of random effects of replications, with $u_b \sim N(0, s_b^2 I_b)$; $e$ is the vector of random errors, with $e \sim N(0, s_e^2 I_n)$; $X$, $Z_g$, $Z_{ge}$, and $Z_b$ are the incidence matrices associated with the vectors $\tau$, $u_g$, $u_{ge}$, and $u_b$; $s_g^2$, $s_{ge}^2$, $s_b^2$, and $s_e^2$ are the variance components associated with the vectors $u_g$, $u_{ge}$, $u_b$, and $e$; and $I_g$, $I_{ge}$, $I_b$, and $I_n$ are the identity matrices associated with the vectors $u_g$, $u_{ge}$, $u_b$, and $e$.
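To make the role of the incidence matrices concrete, the sketch below builds a matrix of the kind used for the hybrid effects, with one row per observation and one column per hybrid. The function name `incidence_matrix` and the toy labels are hypothetical; in practice, mixed model software constructs these matrices internally.

```python
def incidence_matrix(labels):
    """Return (levels, Z), where Z[i][j] = 1 if observation i belongs to level j."""
    levels = sorted(set(labels))
    column = {level: j for j, level in enumerate(levels)}
    Z = [[0] * len(levels) for _ in labels]
    for i, label in enumerate(labels):
        Z[i][column[label]] = 1
    return levels, Z


# Toy example (made-up labels): four plots, three hybrids, H2 observed twice.
levels, Z_g = incidence_matrix(["H1", "H2", "H2", "H3"])
for row in Z_g:
    print(row)
```

Each row contains exactly one 1, so multiplying the matrix by the vector of hybrid effects assigns to every plot the effect of the hybrid grown in it.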
In each crop season, the grain yield data were analyzed through a mixed models approach, considering the following model according to the experimental design adopted:
$$y = X\tau + Z_g u_g + Z_{gl} u_{gl} + Z_b u_b + e$$
where $y$ is the vector of phenotypic observations; $\tau$ is the vector of fixed effects (RBD experiments: location and experiment within location; IBD experiments: location, experiment within location, and replication within experiment and location); $u_g$ is the vector of random genotypic effects of hybrids, with $u_g \sim N(0, s_g^2 I_g)$; $u_{gl}$ is the vector of random effects of the hybrid by location interaction, with $u_{gl} \sim N(0, s_{gl}^2 I_{gl})$; $u_b$ is the vector of random effects (RBD experiments: block within experiment and location; IBD experiments: block within replication, experiment, and location), with $u_b \sim N(0, s_b^2 I_b)$; $e$ is the vector of random errors, with $e \sim N(0, R)$; $X$, $Z_g$, $Z_{gl}$, and $Z_b$ are the incidence matrices associated with the vectors $\tau$, $u_g$, $u_{gl}$, and $u_b$; $s_g^2$, $s_{gl}^2$, $s_b^2$, and $s_e^2$ are the variance components associated with the vectors $u_g$, $u_{gl}$, $u_b$, and $e$; and $I_g$, $I_{gl}$, $I_b$, and $I_n$ are the identity matrices associated with the vectors $u_g$, $u_{gl}$, $u_b$, and $e$. To model the effect of location within each crop season, the residual (co)variance matrix adopted a block diagonal structure built from identity matrices, $R = \bigoplus_{i=1}^{l} s_{e_i}^2 I_{n_i}$. Previously, alternative models with unstructured variance-covariance (VCOV) matrices were tested to try to model the genetic correlation between all pairs of environments. These matrices allow a better understanding of the genetic structure and an evaluation of the stability of genotypes in mega-environments. To this end, the genetic and residual effects were considered as $u_g \sim MVN(0, \sigma_g^2 I \otimes \Sigma_l)$ and $e \sim MVN(0, \sigma_e^2 I \otimes R_l)$, where $\Sigma_l$ is the VCOV matrix for the additive genetic effects in the $l$ environments and $R_l$ is the VCOV matrix for the residual effects in the $l$ environments.
In this case, the main environment effects were implicitly modeled and an unstructured form was assumed for the genetic ($\Sigma_l$) and residual ($R_l$) VCOV matrices. Because of the large number of sites evaluated in each season (> 5), models with these unstructured matrices had difficulty converging. Therefore, the block diagonal variance structure described above was adopted to model the genetic and residual effects in this study.
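The convergence difficulty is easier to appreciate by counting the parameters that each covariance structure requires for a given number of environments. The sketch below is illustrative only: the function names are hypothetical, and the factor analytic count uses the usual formula for k factors (loadings plus specific variances, minus the rotation constraints), a structure the paper mentions as an alternative in its Discussion.

```python
def n_params_unstructured(n_env: int) -> int:
    """Variances plus all pairwise covariances: l(l + 1) / 2."""
    return n_env * (n_env + 1) // 2


def n_params_diagonal(n_env: int) -> int:
    """One variance per environment and no covariances."""
    return n_env


def n_params_factor_analytic(n_env: int, k: int) -> int:
    """Loadings (l * k) plus specific variances (l), minus k(k - 1) / 2 constraints."""
    return n_env * k + n_env - k * (k - 1) // 2


# Environment counts chosen only for illustration.
for n_env in (5, 10, 20):
    print(
        n_env,
        n_params_unstructured(n_env),
        n_params_diagonal(n_env),
        n_params_factor_analytic(n_env, 2),
    )
```

The unstructured count grows quadratically with the number of environments and applies separately to the genetic and the residual matrix, whereas the diagonal structure grows only linearly.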
The variance components associated with the random effects were estimated by the residual maximum likelihood (REML) method (Patterson and Thompson, 1971), and their significance was verified by the likelihood ratio test. To make inferences regarding the occurrence of interaction, the correlation ($r_{qs}$) among the mean values of the SH coinciding in the $q$ and $s$ crop seasons was estimated. The estimator used was similar to that presented by Steel et al. (1997):
$$r_{qs} = \frac{\mathrm{Cov}(HS_{iq}, HS_{is})}{\sqrt{s_{HS_{iq}}^2 \, s_{HS_{is}}^2}}$$
where $\mathrm{Cov}(HS_{iq}, HS_{is})$ is the covariance between the means of the coinciding hybrids in the two crop seasons; $HS_{iq}$ is the mean of single cross hybrid $i$ in crop season $q$; $HS_{is}$ is the mean of single cross hybrid $i$ in crop season $s$; $s_{HS_{iq}}^2$ is the variance among the means of the single cross hybrids in crop season $q$; and $s_{HS_{is}}^2$ is the variance among the means of the single cross hybrids in crop season $s$.
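The correlation between crop seasons uses only the hybrids evaluated in both. A minimal sketch of that calculation, assuming the hybrid means of each crop season are held in dictionaries keyed by hybrid name, is shown below; the function name `correlation_common` and the toy values are hypothetical.

```python
from math import sqrt


def correlation_common(means_q: dict, means_s: dict) -> float:
    """Pearson correlation between the means of hybrids present in both seasons."""
    common = sorted(set(means_q) & set(means_s))
    if len(common) < 3:
        # With only two common hybrids the correlation is trivially +1 or -1.
        raise ValueError("need at least three common hybrids")
    x = [means_q[h] for h in common]
    y = [means_s[h] for h in common]
    n = len(common)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    return cov / sqrt(var_x * var_y)


# Made-up means (t/ha); hybrids "D" and "E" appear in one season only and are ignored.
season_q = {"A": 7.1, "B": 8.4, "C": 9.0, "D": 6.2}
season_s = {"A": 5.0, "B": 4.1, "C": 5.6, "E": 4.8}
print(round(correlation_common(season_q, season_s), 2))
```

When few hybrids coincide, the estimate rests on a small and selected sample, which is one reason the low correlations reported in this kind of dataset need to be interpreted with care.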
In addition, the effect of the hybrid × crop season interaction was also verified through the coincidence of the genotypes selected based on the mean of two environments (considering different combinations of year, region, and sowing time) in relation to selection based on the mean of each environment individually. For that purpose, the maize yield data from each location within the crop seasons were adjusted for the effects of blocks and replications, according to the design adopted in each situation, to obtain the EBLUEs (Empirical Best Linear Unbiased Estimates). These estimates were used to carry out the individual analyses of each crop season and also the combined analyses of pairs of environments. Using the EBLUPs (Empirical Best Linear Unbiased Predictions), the ten best hybrids in the mean of the environments and also in each one of the environments were selected.
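The coincidence between selections can be expressed as the share of hybrids chosen on the joint mean that would also be chosen in a single environment. The sketch below assumes that the predicted values are available as dictionaries keyed by hybrid name; the function names `top_k` and `coincidence` and the toy values are hypothetical.

```python
def top_k(eblups: dict, k: int = 10) -> set:
    """Names of the k hybrids with the largest predicted values."""
    ranked = sorted(eblups, key=eblups.get, reverse=True)
    return set(ranked[:k])


def coincidence(selected_joint: set, selected_single: set) -> float:
    """Fraction of jointly selected hybrids that are also selected in one environment."""
    if not selected_joint:
        raise ValueError("joint selection is empty")
    return len(selected_joint & selected_single) / len(selected_joint)


# Made-up predictions for four hybrids, selecting the best two in each case.
joint = {"A": 0.9, "B": 0.7, "C": 0.1, "D": -0.4}
first_season = {"A": 0.2, "B": 1.1, "C": 0.8, "D": -0.3}
print(coincidence(top_k(joint, 2), top_k(first_season, 2)))
```

A value of 1 means that selecting on the joint mean and selecting in the single environment identify the same hybrids, while lower values indicate rank changes caused by the interaction.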
Furthermore, in each crop season, the correlation between the mean value and the EBLUP of the SH was estimated, and the estimates of heritability were obtained using the following estimators. The first was the standard method, using the expression presented by Falconer and Mackay (1996), which presupposes balanced data and independent genetic effects. The second was the method of Holland et al. (2003), cited by Piepho and Möhring (2007):
$$h^2 = \frac{s_g^2}{s_g^2 + \bar{v}_{BLUE}/2}$$
where $\bar{v}_{BLUE}$ is the mean variance of the difference between the fitted mean values of two treatments (EBLUE). The third was the method of Cullis et al. (2006), which can also be used in cases of imbalance and with a random effect of genotype:
$$h^2 = 1 - \frac{\bar{v}_{EBLUP}}{2 s_g^2}$$
where $\bar{v}_{EBLUP}$ is the mean variance of the difference between two EBLUPs. Finally, combined analysis was carried out involving the single cross hybrids common to two crop seasons using different alternatives. The crop seasons chosen were 2011s1C/2012s2C and 2013s1C/2014s2C. These crop seasons were evaluated in different years and also differed in experimental accuracy and in the number of hybrids that coincided among them. In all the alternatives adopted, the 20 best SH common to the two crop seasons chosen (2011s1C/2012s2C and 2013s1C/2014s2C) were selected.
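The heritability estimators based on the mean variance of a difference can be sketched in a few lines. The functions below are illustrative only and their names are hypothetical. The balanced-data form in `h2_standard` is a common textbook expression for heritability on an entry-mean basis and is an assumption here, since the exact expression used in the study is not reproduced in the text.

```python
def h2_standard(var_g, var_gl, var_e, n_loc, n_rep):
    """Entry-mean heritability for balanced data (assumed textbook form)."""
    return var_g / (var_g + var_gl / n_loc + var_e / (n_loc * n_rep))


def h2_from_blue(var_g, v_blue):
    """Holland et al. (2003), cited by Piepho and Möhring (2007).

    v_blue is the mean variance of a difference between two EBLUEs.
    """
    return var_g / (var_g + v_blue / 2.0)


def h2_cullis(var_g, v_blup):
    """Cullis et al. (2006); v_blup is the mean variance of a difference of two EBLUPs."""
    return 1.0 - v_blup / (2.0 * var_g)


# Balanced toy case (made-up values): one location, two replications, no interaction.
var_g, var_e, n_rep = 1.0, 2.0, 2
standard = h2_standard(var_g, 0.0, var_e, n_loc=1, n_rep=n_rep)
v_blue = 2.0 * var_e / n_rep            # variance of a difference of two means
v_blup = 2.0 * var_g * (1.0 - standard)  # variance of a difference of two BLUPs
print(standard, h2_from_blue(var_g, v_blue), h2_cullis(var_g, v_blup))
```

In this balanced toy case the three estimators agree, which illustrates that the differences among them arise only when the data are unbalanced.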
Combined analysis based on the EBLUE estimates that were obtained for the hybrids in each environment was performed a) involving only the SH common to the two environments and common residual variance; b) considering all the SH present in the two environments, i.e., also those that were eliminated in the year, crop season, or location and common residual variance; c) involving only the SH common to the two environments and considering a residual variance different for each environment using the diagonal matrix, and d) considering all the SH present in the two environments and residual variance different for each environment.
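The practical difference between assuming a common residual variance and a residual variance specific to each environment can be illustrated with a deliberately simplified calculation, in which the estimates of one hybrid in several environments are averaged either with equal weights or with weights inversely proportional to the residual variance. This is not the mixed model fitted in the study, only a sketch of the weighting idea; the function name `combine_estimates` and the numbers are hypothetical.

```python
def combine_estimates(estimates, residual_variances=None):
    """Combine the per-environment estimates of one hybrid into a single value.

    With residual_variances=None every environment receives the same weight
    (common residual variance). Otherwise each environment is weighted by the
    inverse of its residual variance (heterogeneous residual variances).
    """
    if residual_variances is None:
        weights = [1.0] * len(estimates)
    else:
        weights = [1.0 / v for v in residual_variances]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, estimates)) / total


# Made-up EBLUEs (t/ha) of one hybrid in two environments.
print(combine_estimates([8.0, 6.0]))              # equal weights
print(combine_estimates([8.0, 6.0], [1.0, 3.0]))  # more weight on the precise environment
```

Because the less precise environment is down-weighted, the combined value moves toward the estimate from the more precise environment, which is how hybrid rankings can change between the two assumptions.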

Results
The dataset evaluated is typical of breeding programs with the objective of line introgression to obtain new SH. As many lines do not adapt well to Brazilian climate conditions, an imbalance was observed in the number of SH, locations, experiments, and replications evaluated over the crop seasons (Table 1).
The number of SH that were repeated among the crop seasons varied widely. Comparing the 2011s1S crop season and the 2012s1S crop season, of the 783 SH evaluated in the first year, only 159 proceeded, i.e., high selection intensity was applied and only 20 % of the SH evaluated in 2011s1S were allocated to the experiments in the following crop season. An even more complex scenario was observed between the 2012s2C and 2013s2C crop seasons, where only 13 % of the hybrids evaluated in 2012s2C advanced to the 2013s2C crop season. The same observation is valid for other years when comparing crop seasons and/or regions, and it becomes clear that there is great difficulty in evaluating data from different crop seasons in a combined manner (Figure 2).
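The proportion of hybrids carried forward can be checked with a simple set operation. In the sketch below the hybrid labels are synthetic and were generated only to reproduce the counts quoted above (783 evaluated, 159 advanced); the function name `advanced_fraction` is hypothetical.

```python
def advanced_fraction(evaluated_first: set, evaluated_next: set) -> float:
    """Share of the hybrids from the first season that reappear in the next one."""
    if not evaluated_first:
        raise ValueError("first season has no hybrids")
    return len(evaluated_first & evaluated_next) / len(evaluated_first)


# Synthetic labels reproducing the counts reported for 2011s1S and 2012s1S.
first = {f"SH{i}" for i in range(783)}
following = {f"SH{i}" for i in range(159)}
print(round(100 * advanced_fraction(first, following), 1))  # 20.3
```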
The overall mean of the crop seasons ranged from 3.9 to 9.7 t ha⁻¹, and the first crop seasons (8.3 t ha⁻¹) yielded 1.56 times as much as the second crop seasons (5.3 t ha⁻¹), regardless of the region evaluated. The variation among the crop seasons was high; for example, for the West Center region, the yield in the first crop season in 2011/2012 was 58 % greater than in the second crop season. However, in 2012/2013, this superiority was much lower, only 18 %, but it returned to a higher level in 2013/2014, at 51 %, once more showing the discrepancy among the crop seasons evaluated (Table 2).
Within each crop season, a greater variation in mean yield of the SH tested was observed among the experiments than among the locations. The amplitude of variation of the experiments in relation to the mean was up to approximately 110 %, as in the 2011s1S crop season [(11.2 − 2.7) / 7.6 ≈ 1.1]. The variation between the lower and upper limits among the locations in relation to the overall mean was under 55 %, except in the 2011s1S and 2014s2C crop seasons. It is important to highlight that the greater variation among experiments is a result of the effect of locations and also of the different SH evaluated among the experiments (Table 2).
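The amplitude of variation relative to the mean is a one-line calculation, sketched below with the rounded values quoted for the 2011s1S crop season. The function name `relative_amplitude` is hypothetical. With these rounded inputs the ratio comes out close to 1.1; any small difference from the reported figure presumably reflects the use of unrounded means in the original calculation, which is an assumption.

```python
def relative_amplitude(lowest: float, highest: float, overall_mean: float) -> float:
    """Range of the experiment means expressed as a fraction of the overall mean."""
    return (highest - lowest) / overall_mean


# Rounded values quoted in the text for 2011s1S (t/ha).
print(round(relative_amplitude(2.7, 11.2, 7.6), 2))
```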
The grain yield data were fitted through the mixed models/REML approach. The estimates of the correlation between the means of the SH and their EBLUPs were greater than 0.9 in the crop seasons of more recent years. It follows that, under these conditions, a close association between the mean values and the EBLUPs of the SH was obtained (Table 2). In all the crop seasons, hybrids were observed with discrepant performance in relation to the set evaluated. The 2012s1S crop season exhibited the widest amplitude of variation, which was associated with the highest mean yield value (9.7 t ha⁻¹). In that crop season, the EBLUPs of the hybrids ranged from -4.0 to 3.1. In the 2012s2C crop season, a lower amplitude of variation and a mean value of 6.0 t ha⁻¹ were found, with EBLUPs ranging from -0.7 to 1.0 (Table 2 and Figure 3A).
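One reason why hybrid means and EBLUPs can diverge under imbalance is that the amount of shrinkage depends on how many observations each hybrid has. The sketch below uses a one-way random model with known variance components and a known overall mean, which is far simpler than the models fitted in the study; the function names and the numbers are hypothetical.

```python
def shrinkage_weight(var_g: float, var_e: float, n_obs: int) -> float:
    """Weight applied to the deviation of a hybrid mean from the overall mean."""
    return var_g / (var_g + var_e / n_obs)


def simple_blup(hybrid_mean, overall_mean, var_g, var_e, n_obs):
    """BLUP of the genotypic effect in a one-way random model with known variances."""
    return shrinkage_weight(var_g, var_e, n_obs) * (hybrid_mean - overall_mean)


# Two made-up hybrids with the same mean but different numbers of plots.
few_plots = simple_blup(9.0, 8.0, var_g=0.5, var_e=1.0, n_obs=2)
many_plots = simple_blup(9.0, 8.0, var_g=0.5, var_e=1.0, n_obs=10)
print(few_plots, round(many_plots, 3))
```

The two hybrids have identical means, but the one observed in fewer plots is shrunk more strongly toward zero. When the number of observations per hybrid is similar, the weights are nearly equal and means and predictions rank the hybrids almost identically.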
To more easily make inferences regarding what happens among the locations within the same crop season, each location in the 2011s1C crop season was analyzed in detail. In that year, a total of 28 different experiments were set up; however, the number of experiments evaluated in each location was different. The mean yield variation of the experiments in each location was at most 44 % (6.1 to 8.8 t ha⁻¹), and the overall means of the five locations evaluated ranged from 6.5 to 9.2 t ha⁻¹. Variation was observed in the magnitude of the estimates of genetic variance and of standard heritability among the five locations. The variance of the hybrid × experiment interaction was greater than the genetic variance in all the locations evaluated in this season, except for one of the locations, where both variances were similar.
The estimates of the genetic variance components ($s_g^2$) involving all the experiments and locations obtained in each crop season were significant by the likelihood ratio test, indicating the presence of genetic variability and the possibility of selection among the hybrids. The magnitude of the variance of the hybrid × environment interaction in relation to genetic variation was substantial, reflecting hybrid performance that did not coincide across the environments. The $s_{g \times l}^2 / s_e^2$ ratio ranged from 0.12 (2013s1C) to 4.47 (2012s2C) among the crop seasons, reinforcing this observation.
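A likelihood ratio test for a single variance component compares the log-likelihoods of the models with and without that component. The sketch below is illustrative: the function name is hypothetical, and halving the p-value is a commonly used correction for a null hypothesis on the boundary of the parameter space (variance equal to zero). Whether the original analysis applied this correction is not stated in the text.

```python
from math import erfc, sqrt


def lrt_pvalue_variance(loglik_full: float, loglik_reduced: float) -> float:
    """P-value for dropping one variance component from a REML-fitted model."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_chi2_1 = erfc(sqrt(statistic / 2.0))
    # Equal mixture of chi-square(0) and chi-square(1) under a boundary null.
    return 0.5 * p_chi2_1


# Made-up log-likelihoods for models with and without the hybrid variance.
print(lrt_pvalue_variance(-1520.4, -1532.9))
```

Both models must have the same fixed effects for the REML log-likelihoods to be comparable.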
The substantial hybrid × environment interaction within each crop season was also shown by the estimates of the correlations ($r_{qs}$) between the mean performance of the SH coinciding in pairs of crop seasons. The estimates of $r_{qs}$ across the combinations of crop seasons ranged from -0.01 to 0.51. In the pairs of environments 2012s2C/2012s1C, 2012s2S/2012s1C, 2012s2C/2014s2C, and 2012s2S/2014s2C, in which the lowest estimates of $r_{qs}$ were observed, a high selection intensity had also been applied to the SH evaluated from one crop season to another (Figure 2).
The presence of the interaction was also highlighted by the coincidence among the ten best SH selected based on the mean of the EBLUPs of the two crop seasons in relation to their relative performance in each crop season. When the same region and year of evaluation were considered, i.e., the response in the first and second crop seasons, the coincidence varied between the sowing times. As expected, the greatest coincidence in most cases was in the first crop season. However, even under these conditions, in the 2011/2012 and 2013/2014 crop years, the coincidence was less than 50 %. These results once more show the difficulty of moving toward recommendation of new SH involving different regions, sowing times, and crop years.
As the experiments were unbalanced, the heritability ($h^2$) in the mean of the hybrids within each crop season was estimated considering three strategies: i) the standard method according to Falconer and Mackay (1996), ii) the method according to Cullis et al. (2006), and iii) the method according to Holland et al. (2003), cited by Piepho and Möhring (2007). The estimates from strategy i, which presupposes balanced data, ranged from 0.46 to 0.69 among the crop seasons and, as expected, were always higher than the estimates of $h^2$ obtained by the other strategies, except in the 2013s1C crop season (Figure 3B). The estimates of heritability obtained by the methods of Cullis et al. (2006) and of Holland et al. (2003), cited by Piepho and Möhring (2007), were always of similar magnitude, except in the 2012s2C crop season, and ranged from 0.09 to 0.69 among the crop seasons evaluated. However, the coincidence in the estimates of $h^2$ from the three strategies was greater in the experiments conducted in more recent years (Figure 3A).
In carrying out combined analysis involving the SH common to two or more environments, there are some alternatives. One is involving only the SH in common and the other would be considering all the SH, i.e., also those that were eliminated in some environments. It is also possible to carry out the analyses considering residual variance in common or heterogeneous residual variance. In this study, combined analysis was performed considering the combinations of the 2011s1C/2012s2C and 2013s1C/2014s2C crop seasons. These pairs were chosen as they consisted of data from the first and second crop seasons of different years that also differed in experimental accuracy and in the number of coinciding hybrids.
For the 2011s1C/2012s2C combination, which involves the first and second crop seasons in the 2011/2012 crop year, the coincidence of the 20 best SH, which is what most interests breeders, changed considerably. Of the 20 best SH ranked in the analysis involving all the hybrids present in the two crop seasons, only five remained when the analysis was performed considering only the SH common to the two crop seasons. This result holds whether homogeneous or heterogeneous residual variance is considered.
Different results were observed for the combination of the first and second crop seasons in the 2013/2014 crop year (2013s1C/2014s2C combination) in relation to previous crop seasons. Coincidence in identification of the 20 best SH, involving all the SH or only the SH common to two crop seasons, was total upon using the same residual variance. However, when the residual variance used was heterogeneous, though the coincidence was high, it was not total (15 SH in 20 SH).

Discussion
The challenge common to all companies is evaluating a large number of SH annually for the purpose of recommending those that have the best performance for farmers. The information coming from these evaluations is often unbalanced in relation to the number of SH, of replications, of experiments, and of locations, which may compromise the choice of the best hybrid. In addition, the breeder needs to deal with the hybrid × environment interaction in seeking greater reliability in future recommendations since this interaction is a complicating factor in the performance of the SH evaluated.
The results obtained in this analysis showed considerable substitution of hybrids among the different environmental conditions (Figure 2). This low coincidence among the hybrids evaluated is understandable: if a given SH did not have good results under certain conditions, there is little reason to reevaluate it under other conditions. Other plausible explanations are the difficulty of continuing to evaluate an SH that has some agronomic trait other than grain yield that would make its future recommendation unviable, as well as the lack of adaptation of the lines to tropical conditions, which impedes the production of hybrid seed in large quantities for evaluation of the SH in different environments. Thus, the low coincidence among the SH evaluated, above all between crop seasons, is a reality that is unlikely to change.
The low coincidence among the hybrids evaluated in the different environments makes it difficult to estimate genetic and phenotypic parameters, especially genetic variance and, above all, the SH × environment interaction. This difficulty has been reported in the literature by diverse authors in recent years (Smith et al., 2001; Möhring and Piepho, 2009; Smith et al., 2015; Nuvunga et al., 2015; Silva et al., 2019).
To deal with unbalanced data, some proposals have been implemented more recently for analysis of experiments with plants using, for example, analysis in two steps, in which weighting is considered in the second step in accordance with the number of replications, the experimental design, and the residual variance (Smith et al., 2001; Möhring and Piepho, 2009; Welham et al., 2010; Piepho et al., 2012). Other alternatives are multiplicative models (Smith et al., 2015; Nuvunga et al., 2015), sequential analysis, which considers all the hybrids evaluated in the previous generations (Piepho and Möhring, 2006), and models that consider the use of heterogeneous residual variance (Edwards and Jannink, 2006; So and Edwards, 2011; Orellana et al., 2014; Hu et al., 2014; Andrade et al., 2015; Silva et al., 2019).
Due to the wide variation in analytical possibilities for unbalanced experiments in multi-environments, in this study, individual analyses were initially performed in each location within each crop season. Due to the great volume of information, we chose to present only the results in reference to the 2011s1C crop season. In this analysis, the importance of the SH × experiment interaction was clear, even in a single location. This was possible because some SH were present in more experiments.
After that, the yield data from each crop season were fitted through mixed models accounting for the block and location effects, seeking to obtain the best estimates of the genetic value of each hybrid. Under conditions such as those in this study, in which a very large number of hybrids were evaluated in many environments across the years, approaches such as unstructured VCOV matrices and factor analytic models have been adopted, since these structures allow a different genetic variance to be modeled for each site and different covariances between pairs of environments (Smith et al., 2002; Burgueño et al., 2012; Krause et al., 2020; Oliveira et al., 2020).
In this study, unstructured VCOV structures were tested to better understand the correlation between environments by including the genotype × environment interaction in the model. However, models that include unstructured VCOV matrices have difficulty converging because of the very large number of parameters to be estimated under a high level of imbalance. The factor analytic structure is an alternative approach to deal with these limitations (Smith et al., 2002; Kelly et al., 2007; Dias et al., 2018b). Because of the convergence difficulties and computational limitations, in this study we adopted simpler VCOV structures to model the effect of location within each crop season. Nevertheless, whenever possible, it is important to adopt models with more complex structures, as previously mentioned.
In situations in which the dataset exhibits considerable imbalance, a way of checking the fit of the model is to correlate the mean values and the EBLUPs of the SH. In general, the estimates of correlation in most of the crop seasons evaluated were high, especially in more recent years, showing that in many situations, when the experiments are well conducted, the mean can be considered a good indication of the performance of the SH, even under unbalanced conditions (Figure 2). A significant effect of the SH × environment (locations and experiments within locations) interaction was found in all the crop seasons evaluated. In these cases, the SH × environment interaction component was greater than the genetic variance component in most of the crop seasons. This is very frequent in most of the situations in which various hybrids are evaluated in the same crop season (Tonk et al., 2011; Nzuve et al., 2013; Ndhlela et al., 2014; Mengesha et al., 2019). In the conditions evaluated, the significant effect of the interaction is expected because of the substantial environmental variation in numerous factors, such as climate, soil fertility, and management practices, that occur in the different locations in which maize experiments are conducted (Noia Junior et al., 2019; Embrapa, 2020).
It should be emphasized that environmental variations under tropical and subtropical conditions are more pronounced than those normally observed under temperate conditions. This environmental variation is even more challenging since it is largely unpredictable (Eeuwijk et al., 2016). Given this situation, the great challenge for breeders is identifying hybrids that are more adapted and stable under these growing conditions. For that purpose, numerous methods have been proposed in the literature over the past fifty years (Eberhart and Russel, 1966; Wricke and Weber, 1986; Gauch and Zobel, 1988; Piepho, 1997; Yan et al., 2000; Smith et al., 2015; Nuvunga et al., 2015); most recently, the use of mixed models above all has been proposed, according to a survey performed by Eeuwijk et al. (2016).
In addition, in Brazil, variation in environmental factors in the second crop season is much more expressive than in the first, especially due to drought stress or heat stress (Andrea et al., 2019;Andrea et al., 2018), and so a difference in mean yield between the crop seasons is expected. In spite of that, this yield difference has diminished through the choice of more adapted hybrids and the use of greater technology in crop fields. The great challenge for seed production companies currently has been identifying hybrids adapted to both growing conditions. The results obtained using this dataset show that finding a hybrid with wide adaptation to different climatic regions is a challenging factor for breeders because of the enormous contribution of the SH × crop season interaction (Figure 2).
In the present study, the effect of the interaction on SH performance can be observed through estimation of the correlation between the means of the hybrids in common across the crop seasons (Figure 2). Another option for the study of the interaction, of greater importance for breeders, was the coincidence of the hybrids selected considering two or more environments. The low magnitudes of the estimates of correlation and the low coincidences observed show that, in most of the cases evaluated, the response to the interaction was of a complex nature and in some cases was probably not even linear, making identification of the best hybrid difficult. Results similar to these are discussed by Eeuwijk et al. (2016) through graph illustrations involving the yield of the genotype and environmental quality.
The heritability ($h^2$) estimate is a key parameter in plant breeding because it is associated with predictive measurement of success in selection. It has been estimated by the ratio between the part of genetic variance exploited by the genotypes evaluated and the phenotypic variance of the selection unit applied (Falconer and Mackay, 1996; Bernado, 2010). However, with the increased use of mixed models to attenuate the effects of unbalanced data, new options for $h^2$ estimates have been proposed (Cullis et al., 2006; Piepho and Mohring, 2007; Schmidt et al., 2019).
In this study, $h^2$ was estimated by three procedures. Especially in the first crop seasons, the estimates did not greatly coincide. However, in more recent crop seasons, there was greater coincidence. It should be emphasized that, as was expected in all cases, the absolute value of $h^2$ in the standard method was superior to the other two (Cullis et al., 2006; Schmidt et al., 2019) (Figure 3A). This occurs because in the standard method, phenotypic variance is estimated considering that there is no variation in the number of replications and locations, for example. This discrepancy in the estimates of $h^2$ has also frequently been observed in other conditions (Piepho and Mohring, 2007; Schmidt et al., 2019).
Regardless of the method used, the $h^2$ estimates, in most cases, were considered of medium magnitude, which is a favorable condition for selection of SH based on the overall mean of each crop season. It should be emphasized that when selection is made among SH, all the genetic variance is used, i.e., additive, dominance, and epistatic variance (Hallauer and Miranda Filho, 1998; Souza Junior, 2007).
The proposal of using all the hybrids in the analyses, and not only those common to both crop seasons, as proposed by Piepho and Möhring (2006), did not prove to be more effective than using only the SH in common. Whether or not homogeneous residual variance is assumed makes a difference above all when the $h^2$ estimates are of lower magnitude. From the results obtained in this study, it is therefore difficult to decide whether the use of heterogeneous residual variance is more appropriate in that case, since there is no way to prove which ranking is more trustworthy. In the literature, however, there are numerous reports that the use of heterogeneous variance is more advisable (Edwards and Jannink, 2006; So and Edwards, 2011; Orellana et al., 2014; Hu et al., 2014; Andrade et al., 2015; Silva et al., 2019).
From the above, it is clear that selecting hybrids with general adaptation to different growing seasons is very difficult. The chances of identifying hybrids that stand out under both conditions can be increased by using experiments with a smaller number of hybrids, with check varieties common to the experiments, with more replications, and with evaluations in the greatest possible number of locations, as has frequently been reported in the literature (Troyer, 1996; Cooper et al., 2014; Gaffney et al., 2015).
In the current period of "plant breeding 4.0", the need for evaluating hybrids considering various replications is not disregarded. In addition, the proposal considers the use of other information, such as climate, soil, geographic coordinates, phenological data, molecular markers, and the possibilities that exist in current analytical terms to identify the SH with best performance (Wallace et al., 2016;Ersoz et al., 2019;Ramstein et al., 2019). Obtaining accurate experiments is especially important in the molecular marker validation phase. Without accurate experiments, it is impossible to find trustworthy associations between the phenotype and the molecular marker.
More recently, the use of genomic selection models that include the effect of the genotype × environment interaction has frequently been reported as a tool to accelerate the selection process and improve the agreement between predicted and observed values in breeding programs (Cuervas et al., 2016; Lado et al., 2016; Ferrão et al., 2018; Dias et al., 2018a; Montesinos-López et al., 2019; Monteverde et al., 2019; Ames and Bernado, 2020). Nevertheless, it is clear that the interaction information will only effectively contribute to improving the predictive capability of the models if the interaction component used includes not only the genetic variation but also the future possibilities of environmental variation.
The analyses presented here were carried out to better understand what happens with this dataset. In addition, they provide support for the genomic prediction study, which will be carried out in a subsequent step based on the genotyping of the parental lines of the hybrids evaluated in this study. Studies published recently in the literature involving the prediction of hybrids under different environmental conditions suggest that including the genotype × environment interaction component in genomic prediction models may improve hybrid predictions if the environmental component is reliable (Krause et al., 2020; Oliveira et al., 2020). The question remains whether, given that the experiments are very unbalanced, the interaction component to be used in the model will be able to improve its predictive capability, since, as found in this study, the hybrid × environment interaction component is very substantial.
Therefore, it should be highlighted that in breeding programs of any species, the most important step is the final evaluation of the lines/hybrids. Recommendation of a cultivar with low accuracy of evaluation is a huge risk, not only economically, but also for the image of the company. The risk will only be reduced if, as already emphasized, the experiments are not only conducted in various environments, but are also as accurate as possible.