Similarity networks for the classification of rice genotypes as to adaptability and stability

The objective of this work was to evaluate the similarity network graphic methodology for the classification of flood-irrigated rice (Oryza sativa) genotypes regarding their adaptability and stability. Two statistical measures were used to represent the proximity of the behavior (based on Pearson’s correlation) or of the values (based on Gower’s distance) between pairs of genotypes or between genotype and environment. Productivity data of 18 genotypes were evaluated in three locations in the state of Minas Gerais, Brazil, in the harvests of 2012/2013, 2013/2014, 2014/2015, and 2015/2016, in a randomized complete block design. The genotypes were previously assessed for adaptability and stability by the Eberhart & Russell and centroid methods. The graphical representations provided by the similarity networks allowed a better identification of the pattern of the genotype x environment interaction, overcoming the interpretation difficulties caused by the disagreements between the results obtained by the Eberhart & Russell and centroid methods. The similarity networks improve genotype x environment interaction studies.


Introduction
Rice (Oryza sativa L.) represents the most important commodity worldwide, standing out as the second most cultivated and one of the most consumed grains, providing over 70 and 65% of the Asian and world population meals, respectively (Santos et al., 2006). In Mercosul, Brazil occupies the first position in harvested area and rice production (Acompanhamento…, 2017).
In general, the greatest challenge for breeding programs for grains and agricultural species has been to select genotypes that are stable and have high productivity in several environments (Reginato Neto et al., 2013). In this context, the evaluation of the genotype x environment (GxE) interaction is of special importance. For this, several methodologies have been proposed, following the analysis of variance principles, environmental stratification (Lin & Binns, 1988), or adaptability and stability analyses based on simple, multiple, and nonparametric linear regressions (Eberhart & Russell, 1966; Rocha et al., 2005).
Although widely used, those methodologies have some limitations when applied individually. For example, they usually do not report specific GxE interactions and are difficult to interpret (Malosetti et al., 2013). Another problem faced by GxE interaction studies is the classification mismatch between different methodologies, which makes it even more difficult for the breeder to interpret data and make decisions. Aiming to overcome difficulties of interpretation, several authors have proposed the simultaneous use of traditional and graphic methodologies, such as additive main effects and multiplicative interaction (AMMI) (Zobel et al., 1988), GGE biplot (Yan et al., 2000), and restricted maximum likelihood/best linear unbiased prediction (REML/BLUP) (Resende, 2002), which are useful for zoning purposes and specific indications in studies with a wide range of environments. Faria et al. (2017) used the Eberhart & Russell, centroid, AMMI, and mixed model methods to evaluate the adaptability and stability of commercial corn (Zea mays L.) hybrids. The authors observed that the studied methods diverged in the indication of hybrids with specific adaptability to favorable and unfavorable environments, and concluded that the use of more than one evaluation method allows a more reliable recommendation.
In practice, the breeder is interested in knowing if a genotype is able to thrive in more than one environment.
The consistency of this response pattern can be determined by measuring correlations or distances regarding the performance of pairs of genotypes in environments; the performance of a genotype in pairs of environments; and the relationship between genotypes and environments (Cruz et al., 2014). With these data, it is possible to obtain a matrix of correlations or distances. Although similarity measures have already been successfully used in clustering methods (Cruz et al., 2011), few studies currently adopt correlation or distance measurements to aid in the classification of genotypes as to adaptability and stability (Silva et al., 2019).
In the present study, a new methodology is proposed, based on the similarity analysis of the behavior of genotypes and environments, which represents a very useful and easily interpreted alternative for GxE interaction studies. The technique was built in analogy to the correlation network plots used by several authors (Kumar & Deo, 2012; Saba et al., 2014; Monforte et al., 2015; Silva et al., 2016) to represent and explore, by nodes and lines, the similarity pattern among genotypes and/or environments (Epskamp et al., 2012). This graphical analysis allows the organization, by zoning, of the evaluated environments, adding to them information about the adaptation of a genotype to specific regions.
The objective of this work was to evaluate the similarity network graphic methodology for the classification of flood-irrigated rice genotypes regarding their adaptability and stability.

Materials and Methods
Eighteen flood-irrigated rice genotypes were evaluated for grain yield (kg ha⁻¹) in multi-location value for cultivation and use (VCU) trials. Of these genotypes, 13 are elite lines and 5 are commercial cultivars: Rio Grande, BRS Ourominas, BRSMG Seleta, BRSMG Predileta, and BRSMG Rubelita (Table 1).
The experimental trials in all locations were performed within each harvest year, totaling 12 environments (Table 1). First, individual analyses of variance were performed and, then, the joint analysis of all sources of variation was carried out based on a simple factorial arrangement, considering the genotypes as fixed and the environments as random (Ramalho et al., 2000). The statistical model used was: $Y_{ijk} = \mu + B/E_{jk} + G_i + E_j + GE_{ij} + \varepsilon_{ijk}$, where $Y_{ijk}$ is the observation in the k-th block for the i-th genotype in the j-th environment; $\mu$ is the general mean of the experiments; $B/E_{jk}$ is the effect of block k within environment j; $G_i$ is the effect of the i-th genotype, considered as fixed; $E_j$ is the effect of the j-th environment, considered as random; $GE_{ij}$ is the random effect of the interaction between genotype i and environment j; and $\varepsilon_{ijk}$ is the random error associated with observation $Y_{ijk}$.
To better assess the interactions between genotypes and environments, the genotypes were previously classified as to adaptability and stability by the Eberhart & Russell (1966) and centroid (Rocha et al., 2005) methods. These classifications were applied in the graphical analyses using the similarity network and distance projection methods proposed in the present study.
According to Eberhart & Russell (1966), genotype adaptability can be classified as: class I, general adaptability; class II, adaptability to favorable environments; class III, adaptability to unfavorable environments; class IV, restrictions to recommendation (not adapted); and class V, not recommended (not adapted). In the centroid method, four classes are used: class I, wide adaptability; class II, adaptability to favorable environments; class III, adaptability to unfavorable environments; and class IV, minimally adapted (Rocha et al., 2005). Classes IV and V of the Eberhart & Russell method and class IV of the centroid method refer to disposable or poorly adapted material; therefore, these classes are considered equivalent. All analyses were performed using the Genes software (Cruz, 2016).
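As an illustration of the quantities behind the Eberhart & Russell classification, the sketch below estimates, for each genotype, the mean, the regression coefficient on the environmental index, and the coefficient of determination. It is a simplified Python example and not the Genes implementation: the function name `eberhart_russell`, the genotype-by-environment layout of `yields`, and the use of genotype means per environment (without the test of the deviations from regression) are assumptions made here.

```python
def eberhart_russell(yields):
    """Estimate mean, beta1, and R2 per genotype.

    yields[i][j] is the mean yield of genotype i in environment j
    (hypothetical layout assumed for this sketch).
    """
    n_gen, n_env = len(yields), len(yields[0])
    grand_mean = sum(map(sum, yields)) / (n_gen * n_env)

    # Environmental index: environment mean minus grand mean (sums to zero).
    env_index = [
        sum(row[j] for row in yields) / n_gen - grand_mean
        for j in range(n_env)
    ]
    ss_index = sum(idx * idx for idx in env_index)

    results = []
    for row in yields:
        mean_i = sum(row) / n_env
        # Slope of the regression of genotype yield on the environmental index.
        beta1 = sum(y * idx for y, idx in zip(row, env_index)) / ss_index
        ss_total = sum((y - mean_i) ** 2 for y in row)
        ss_regression = beta1 ** 2 * ss_index
        r2 = ss_regression / ss_total if ss_total > 0 else float("nan")
        results.append({"mean": mean_i, "beta1": beta1, "r2": r2})
    return grand_mean, results
```

A genotype with a mean above the general mean, a slope close to 1, and a high coefficient of determination would be a candidate for class I; the exact decision thresholds and significance tests are not reproduced in this sketch.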
The graphical representation of the similarity networks was built in analogy to the correlation network plots proposed by Epskamp et al. (2012). Each line has a weight indicating the strength of the correlation or of the similarity, depending on the similarity matrix used, which is based either on Pearson's correlation or on Gower's similarity. As the relationship between two variables gets stronger, the lines that connect them get thicker in the network frame. The intensity of the correlations and/or similarities is also reflected in the length of the lines. According to Epskamp et al. (2012), shorter lines indicate stronger relationships, which allows different variables to be grouped. The two-dimensional network representation of a p-dimensional similarity matrix, for example, allows the researcher to detect important structures and complex statistical patterns that are difficult to extract from a table (Silva et al., 2019).
Two different similarity matrices were generated, with elements representing the proximity either of the behavior (using the correlation principle) or of the values (using the distance principle) between pairs of genotypes or between genotype and environment. Therefore, the similarity matrix was constructed based on two principles: Pearson's correlation (SN/$r_p$) and the complement of Gower's distance (SN/G). Similarity was structured in a matrix of dimension $(g + e) \times (g + e)$, with two diagonal blocks ($R_{g \times g}$ and $R_{e \times e}$) arranged on the main diagonal and the information $R_{g \times e} = R_{e \times g}^{T}$ on the secondary diagonal.
For the similarity matrix based on Pearson's correlation, the diagonal blocks were composed of Pearson's correlations ($r_{pearson}$). The information about the performance of the g genotypes in the e environments made up the data matrix (M), from which the $R_{g \times g}$ submatrix was generated, whose elements represent the correlation between the columns of M. Similarly, the transpose of M ($M^{T}$) allowed generating the $R_{e \times e}$ submatrix, whose elements represent the correlation between the columns of $M^{T}$. The $R_{g \times e}$ submatrix was obtained from a transformation of M, which consisted of converting the maximum and minimum values of each column into 1 and 0, respectively; the other values were interpolated within these limits. The combination of the submatrices generated the R matrix for the similarity network analysis: $R = \begin{bmatrix} R_{g \times g} & R_{g \times e} \\ R_{e \times g} & R_{e \times e} \end{bmatrix}$.
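The assembly of the Pearson-based similarity matrix can be sketched as follows. This is a minimal Python illustration and not the routine implemented in Genes: the function names, the orientation of `M` (environments in rows, genotypes in columns, so that correlations between columns relate genotypes), and the yield values are assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences.

    Raises ZeroDivisionError if one of the sequences is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def column_correlations(matrix):
    """Correlation matrix between the columns of `matrix`."""
    cols = transpose(matrix)
    return [[pearson(a, b) for b in cols] for a in cols]


def scaled_columns(matrix):
    """Min-max scale each column to [0, 1]; one output row per input column."""
    scaled = []
    for col in transpose(matrix):
        lo, hi = min(col), max(col)
        scaled.append([(v - lo) / (hi - lo) for v in col])
    return scaled


def similarity_matrix_pearson(m):
    """Build the (g + e) x (g + e) block matrix from M (rows: environments)."""
    r_gg = column_correlations(m)             # g x g, between genotypes
    r_ee = column_correlations(transpose(m))  # e x e, between environments
    r_ge = scaled_columns(m)                  # g x e, genotype vs. environment
    r_eg = transpose(r_ge)                    # e x g
    top = [gg + ge for gg, ge in zip(r_gg, r_ge)]
    bottom = [eg + ee for eg, ee in zip(r_eg, r_ee)]
    return top + bottom


# Hypothetical yields (kg/ha): 3 environments (rows) x 2 genotypes (columns).
M = [[4000, 4600],
     [5000, 5200],
     [6500, 5000]]
R = similarity_matrix_pearson(M)
```

In this layout, the first g rows and columns of `R` refer to genotypes and the remaining e to environments, so the resulting matrix is symmetric and could be passed to a network plotting routine.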
The similarity matrix based on Gower's distance was constructed equivalently to the one based on Pearson's correlation. From the M matrix, it was possible to obtain the $R_{g \times g}$ and $R_{e \times e}$ submatrices, which represented the diagonal blocks of the R matrix. The $R_{g \times e}$ submatrix was established identically to the one for the similarity matrix based on Pearson's correlation. The diagonal blocks were composed of the similarity matrices generated based on Gower's algorithm, described by
$$S_{ij} = \frac{\sum_{k=1}^{p} w_{ijk} s_{ijk}}{\sum_{k=1}^{p} w_{ijk}},$$
where p is the number of variables (p = e to get $R_{g \times g}$ and p = g to get $R_{e \times e}$); $w_{ijk}$ is the weight given to the ijk comparison, assigning 1 for valid comparisons and 0 for missing values; and $s_{ijk}$ is the similarity between i and j, which represent pairs of genotypes or pairs of environments, for variable k ($0 \le s_{ijk} \le 1$).
The classification of genotypes and environments into groups according to the Eberhart & Russell and centroid methods was associated with the similarity matrices. Therefore, a total of four scenarios were evaluated: similarity network using $r_{pearson}$ (SN/$r_p$) and similarity network using Gower's distance (SN/G) for the GxE classification groups given by the Eberhart & Russell method; and similarity network using $r_{pearson}$ (SN/$r_p$) and similarity network using Gower's distance (SN/G) for the GxE classification groups obtained by the centroid method.
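A possible way to compute a Gower-type similarity block is sketched below in Python. It assumes quantitative variables, for which the per-variable similarity is commonly taken as one minus the absolute difference divided by the range of the variable; the function name `gower_similarity`, the use of `None` for missing values, and the choice of giving weight 0 to variables with zero range are assumptions of this example, not details stated in the text.

```python
def gower_similarity(rows):
    """Gower-type similarity between units for quantitative variables.

    rows[i][k] is the value of variable k for unit i (a genotype or an
    environment); None marks a missing value and receives weight 0.
    """
    n, p = len(rows), len(rows[0])

    # Range of each variable over the observed (non-missing) values.
    ranges = []
    for k in range(p):
        observed = [row[k] for row in rows if row[k] is not None]
        ranges.append(max(observed) - min(observed))

    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            num = den = 0.0
            for k in range(p):
                xi, xj = rows[i][k], rows[j][k]
                # Assumption: missing values and zero-range variables get weight 0.
                if xi is None or xj is None or ranges[k] == 0:
                    continue
                num += 1.0 - abs(xi - xj) / ranges[k]
                den += 1.0
            sim[i][j] = num / den if den else float("nan")
    return sim
```

Calling the function on the rows of the data matrix gives the similarity between the units in rows; calling it on the transposed matrix gives the similarity between the units in columns.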

Results and Discussion
The joint analysis of the experiments showed that environment and GxE interaction effects were significant (Table 2). Genotype means, however, did not differ significantly, which is explained by the significance of the GxE interaction and by the advanced breeding stage in which the lines were evaluated, which makes it difficult to detect differences among the general means of the lines.
The significant interaction between genotypes and environments (Table 2) is indicative of the varying behavior of the rice genotypes throughout the evaluated environments. This justifies the need to perform adaptability and stability studies to better assess genotype performance in different environmental conditions and to identify stable genotypes (Cruz et al., 2014).
When the Eberhart & Russell and centroid methods were used to classify the genotypes for adaptability and stability, differences were observed in some line rankings (Table 3). Assuming that classes IV and V of the Eberhart & Russell method are equivalent to class IV of the centroid method, a 50% agreement was found between the classifications of the 18 assessed genotypes. Both methods, for example, classified the BRA 02691 (G3), MGI 0717-18 (G12), MGI 0607-1 (G5), and the control genotype 'BRS Ourominas' (G8) as of general adaptability, but differed regarding the classification of other lines of interest. The BRA 031001 (G1) genotype was classified as of general adaptability (class I) by the Eberhart & Russell method, since it presented $M_i > M$, $\beta_{1i} = 1$, and $R_i^2 > 70\%$. However, this same genotype was classified as poorly adapted (class IV) by the centroid method, because its average productivity was not high compared with that of MGI 0607-1 (G5). The classification of the BRA 02708 (G13) and BRA 041230 (G10) genotypes also differed: both were classified as of general adaptability (class I) by the Eberhart & Russell method, whereas BRA 02708 (G13) and BRA 041230 (G10) were classified as of specific adaptability to unfavorable and favorable environments, respectively, by the centroid method.
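The agreement between the two classifications can be computed as in the Python sketch below, which merges classes IV and V of the Eberhart & Russell method before comparing them with the centroid classes. The dictionaries contain only the nine genotypes whose classes are stated in the text, since Table 3 is not reproduced here; therefore, the resulting fraction refers to this subset and is not the 50% reported for all 18 genotypes. The names `er_classes`, `centroid_classes`, `harmonize`, and `agreement` are hypothetical.

```python
# Classes stated in the text for nine of the 18 genotypes.
er_classes = {
    "G3": "I", "G12": "I", "G5": "I", "G8": "I", "G1": "I",
    "G13": "I", "G10": "I", "G17": "I", "G4": "V",
}
centroid_classes = {
    "G3": "I", "G12": "I", "G5": "I", "G8": "I", "G1": "IV",
    "G13": "III", "G10": "II", "G17": "III", "G4": "II",
}


def harmonize(er_class):
    """Treat Eberhart & Russell classes IV and V as centroid class IV."""
    return "IV" if er_class in ("IV", "V") else er_class


def agreement(er, centroid):
    """Return the fraction of matching genotypes and the list of matches."""
    shared = sorted(set(er) & set(centroid))
    matches = [g for g in shared if harmonize(er[g]) == centroid[g]]
    return len(matches) / len(shared), matches


rate, matches = agreement(er_classes, centroid_classes)
```

For this subset, only G3, G5, G8, and G12 receive the same class from both methods.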
These discrepancies make the decision-making process difficult for breeders. Several other authors also reported differences in the classifications and recommendations given by different methodologies of adaptability and stability analyses (Pelúzio et al., 2008; Barroso et al., 2017; Silva et al., 2019). Therefore, each method presents singularities when ranking lines, and the choice of the best biometric technique should be made carefully (Faria et al., 2017).
The graphical representation of the similarity networks based on Pearson's correlation (SN/$r_{pearson}$), using the groups given by the Eberhart & Russell and centroid methods, showed that each methodology supported the other in important aspects (Figure 1). The exceptions were the divergent rankings for BRA 02708 (G13) and BRA 031018 (G17), placed in class I by the Eberhart & Russell method and in class III by the centroid method. These two lines had a strong correlation with environment E3, classified as favorable, in both graphs; however, BRA 02708 (G13) was classified as of specific adaptability to unfavorable environments by the centroid method, with similar probabilities of belonging to classes I and III (Table 3). Therefore, the proposed similarity network gives the researcher greater security in classifying genotypes of general adaptability and with a high correlation with favorable environments. It also highlights the greater correlation between genotypes that have an average higher than the general one. A similar response pattern was found by Silva et al. (2019) when adopting strategies based on the projection of dissimilarity measures.
Table 3. Estimates of the adaptability and stability parameters, spatial probabilities [P(I) to P(IV)], and phenotypic adaptability classification of flood-irrigated rice (Oryza sativa) genotypes.
Pesq. agropec. bras., Brasília, v.55, e01017, 2020 DOI: 10.1590/S1678-3921.pab2020.v55.01017
Another discrepant result that had to be evaluated with caution was the classification of the 'BRSMG Rubelita' (G4) commercial line. It was classified as not recommended (class V) by the Eberhart & Russell method, since its average was lower than the general average of the experiment, but showed specific adaptability to favorable environments (class II) by the centroid method (Table 3). The obtained graphs emphasized a strong correlation between 'BRSMG Rubelita' (G4) and environment E3 (Figure 1). By the centroid method, the probability of this line not being recommended (class IV) was similar to that of showing specific adaptability to favorable environments (class II), which caused confusion in its classification. In this case, the graphical representation was useful because it aided in the visualization of the relationships between genotypes and environments. Silva et al. (2019) also evaluated the 'BRSMG Rubelita' (G4) commercial line, and found a strong correlation between this genotype and environments classified as favorable. According to the authors, the graphical representation by projections of distances helped to visualize the relationship between genotypes and environments.
Relationships between genotypes, environments, and GxE were expressed by Gower's similarity measure (Figure 2). By adopting this similarity matrix over the previous one based on Pearson's correlation, it is possible to make inferences about genotypes and environments based on the obtained values and not on their general behavior. This is important because, even when the correlation between two environments, considering distinct genotypes, is high, the comparative values between them may be as discrepant as if one environment were favorable and the other unfavorable (Cruz et al., 2014). Therefore, a distance measurement, rather than a correlation measurement, would be able to capture this and provide a new angle for the breeder's interpretation (Cruz et al., 2014;Epskamp et al., 2012).
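The difference between the two measures can be seen in a small numerical example, sketched below in Python with hypothetical yields for three genotypes in three environments (A, B, and C). Environments A and B rank the genotypes identically, but B yields 3,000 kg ha⁻¹ more for every genotype; C has values close to those of A. The helper names `pearson` and `gower_pair`, as well as the range-based similarity for quantitative variables, are assumptions of this illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def gower_pair(a, b, ranges):
    """Mean of 1 - |difference| / range over all variables (no missing values)."""
    return sum(1 - abs(x - y) / r for x, y, r in zip(a, b, ranges)) / len(ranges)


# Hypothetical yields (kg/ha) of three genotypes in each environment.
envs = {
    "A": [3000, 3500, 4000],
    "B": [6000, 6500, 7000],  # same pattern as A, but much higher values
    "C": [3100, 3400, 4100],  # values close to those of A
}
# Range of each genotype (variable) across the three environments.
ranges = [max(col) - min(col) for col in zip(*envs.values())]

r_ab = pearson(envs["A"], envs["B"])             # behavior: perfectly correlated
g_ab = gower_pair(envs["A"], envs["B"], ranges)  # values: nearly dissimilar
g_ac = gower_pair(envs["A"], envs["C"], ranges)  # values: very similar
```

In this example, the correlation places A and B together, whereas the Gower-type similarity places A next to C, which is the distinction between behavior and values discussed above.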
Similarity networks based on the complement of Gower's distance (SN/G) also used the groups given by the Eberhart & Russell and centroid methods (Figure 2). The obtained graph evidenced the similarity between environments belonging to the same favorable or unfavorable class. A similar pattern of behavior was verified for the BRA 02708 (G13) and BRA 031018 (G17) lines regarding favorable environments, particularly E1 and E3. The BRA 031001 (G1) line also posed a conflicting result for the breeder, since it presented a higher average than the general one, but was classified as not adapted (class IV) by the centroid method and as of general adaptability (class I) by the Eberhart & Russell method (Table 3). Applying the proposed similarity networks (Figures 1 and 2), this line showed a high correlation with genotypes classified as of general adaptability and as adapted to favorable environments, even for the networks constructed based on the centroid classification (Figure 1 B). The same similarity pattern was observed when the decision criterion was based on Gower's similarity, i.e., BRA 031001 (G1) also presented a great similarity with genotypes classified as of general adaptability and as adapted to favorable environments, as well as with the favorable environment E1 (Figure 2 B).
Considering that the correlation network analysis has been useful in plant breeding studies (Silva et al., 2016), similarity networks showed an increase in the effectiveness of genotype selection, assisting in the decision-making process, especially for genotypes that are difficult to classify (Silva et al., 2019).

Conclusions
1. Similarity matrices between genotypes, between environments, and between genotypes and environments are effective for genotype x environment interaction studies in flood-irrigated rice (Oryza sativa).
2. The graphical evaluations provided by the proposed similarity network methodology are useful in the breeder's decision-making process, when evaluating lines classified by the Eberhart & Russell and centroid methods.