Association between genetic distances in wheat (Triticum aestivum L.) as estimated by AFLP and morphological markers

Matrices derived from molecular and morphological data are informative tools for estimating genetic distances. We used AFLP markers, morphological traits and a combined analysis to estimate the genetic distances between wheat genotypes and to ascertain any association between the two techniques. Nineteen wheat (Triticum aestivum L.) genotypes were analyzed using amplified fragment length polymorphism (AFLP) markers and evaluated in the field for two years. The distance matrices obtained from the morphological and molecular marker data showed a significant but moderate correlation (r = 0.47), indicating that the two techniques sample distinct genome regions. The combined analysis was highly correlated with the AFLP markers (r = 0.97) and moderately correlated with the morphological markers (r = 0.59). A possible explanation for these results is a bias caused by the much higher number of AFLP (229) than morphological (17) markers. Thus, the combined analysis is not efficient when the individual techniques contribute very different numbers of markers. Therefore, to better understand the degree of divergence among genotypes, each analysis should be considered separately.


Introduction
The analysis of genetic distance is a very useful auxiliary tool in breeding programs and an important link between the conservation and use of available genetic resources (Mohammadi and Prasanna, 2003). Knowledge of genetic distance not only improves the understanding of germplasm organization and the efficiency of genotype sampling but also supports a biologically oriented choice of crosses and of gene introgression from exotic germplasm. It can also be used to recommend cultivars when the goal is to broaden the genetic base of the commercial cultivars grown in a given region.
In a breeding program, the genetic gain achieved through artificial selection is directly related to the amount of variability and gene quality present in the segregating population. Thus, the correct choice of the parents used to develop the base population can influence the final result of artificial selection and promote a better allocation of financial resources throughout the process of adjusting genotypes to a given environment (Nienhuis et al., 1993; Bohn et al., 1999). However, for these expectations to be met, the parents must combine high means with increased variability for the characters under selection.
Molecular and morphological analyses are among the most widely used tools for estimating genetic distances within a group of genotypes. Molecular markers provide an excellent tool for obtaining genetic information, and their use in the assessment of genetic divergence in wheat (Triticum aestivum L.) has increased in the last few years (Manifesto et al., 2001; Corbellini et al., 2002; Almanza-Pinzón et al., 2003; Máric et al., 2004; Roy et al., 2004). Also, because molecular markers are not subject to environmental influence, they are considered superior to morphological markers (Máric et al., 2004). Amplified fragment length polymorphism (AFLP) markers are often the preferred type of molecular marker because of their multiplex power, their efficiency in detecting genetic variability and the robustness of AFLP assays (Vos et al., 1995).
Morphological characters, in association with multivariate techniques, have been employed in quantifying genetic distance in wheat (Zeven and Schachl, 1989; Van Beuningen and Bush, 1997; Máric et al., 2004; Roy et al., 2004). However, phenotypic expression is influenced by genotype × year and genotype × location interactions, which lowers the accuracy of quantitative genetic parameter estimates.
Nevertheless, the effectiveness of the simultaneous use of AFLP and morphological markers to measure genotype relationships, as well as the magnitude of association between these methods, is not yet fully understood. Thus, the objective of the work presented in this paper was to use AFLP, morphological and combined markers to estimate the genetic distance between wheat (Triticum aestivum L.) genotypes and to establish the degree of association between these techniques.

Materials and Methods
The 19 wheat genotypes used in our study are listed in Table 1. Two of them, the aluminum-tolerant BH1146 and the aluminum-sensitive Sonora 64 (Bertan et al., 2006), were included because of their well-known responses to excess aluminum. To measure the morphological characters, experiments were conducted during the 2003 and 2004 cropping seasons in an experimental field belonging to the Genomic and Plant-improvement Center of the Federal University of Pelotas (Centro de Genômica e Fitomelhoramento/Universidade Federal de Pelotas, CGF/UFPEL), in the municipality of Capão do Leão (31°52'00" S, 52°21'2" W, altitude = 13.24 m) in the southern Brazilian state of Rio Grande do Sul. The experimental design was randomized complete blocks with three replications, and each plot was composed of five 5 m rows spaced 0.2 m apart. The two outer rows and 0.5 m at each end of the three inner rows were disregarded, so the useful area of each plot consisted of a 4 m length of each of the three inner rows. Each plot was base-fertilized with the equivalent of 300 kg ha⁻¹ of 5-20-20 NPK fertilizer and top-dressed with the equivalent of 60 kg ha⁻¹ N at tillering. Weeding was performed manually and ants were controlled using granulated baits; other pest and disease control measures were carried out according to the recommendations of the Southern Brazilian Wheat Research Committee (Comissão Sul Brasileira de Pesquisa de Trigo, 2002).
A total of 17 morphological characters were determined, according to the procedures presented by Scheeren (1984): i) days from emergence to flowering (DEF); ii) days from emergence to maturation (DEM); iii) days from flowering to maturation (DFM); iv) plant stature in cm (PS); v) number of fertile tillers per linear meter (FTLM); vi) weight of a thousand grains (WTG); vii) hectoliter weight in kg hL⁻¹ (HW); viii) grain yield in kg ha⁻¹ (GY). The following morphological characters were measured on a sample of 25 plants per plot: ix) flag leaf blade width in cm (LBW); x) flag leaf blade length in cm (FLL); xi) leaf sheath length in cm (LSL); xii) peduncle length in cm (PL). In addition, the following characters were determined on a sample of 25 spikes per plot: xiii) spike length in cm (SL); xiv) number of spikelets per spike (NS); xv) spike weight in g (SW); xvi) number of grains per spike (NG) and xvii) number of grains per spikelet (NG/NS).
Morphological character data were subjected to analysis of variance (ANOVA), considering the effects of genotypes and years as fixed. The character means, based on two years, were compared using the least significant difference (LSD) method (Steel and Torrie, 1980) at the 5% probability level (p = 0.05), and the generalized Mahalanobis distance (D²) was obtained for all genotype pairs, based on two years of evaluation, using the Genes software (Cruz, 2001). From the genetic distance matrix, a dendrogram was constructed using the unweighted pair group method with arithmetic means (UPGMA). The fit between the distance matrix and the dendrogram was estimated by the cophenetic correlation coefficient (r; Sokal and Rohlf, 1962) using the NTSYS pc 2.1 software (Rohlf, 2000).
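The clustering and goodness-of-fit steps described above can be illustrated with a minimal sketch. It is not the NTSYS or Genes implementation: the function names and the toy 4 x 4 distance matrix are assumptions made for illustration, and the Mahalanobis distances themselves are taken as given. UPGMA repeatedly joins the two clusters with the smallest average pairwise distance, and the cophenetic correlation is the Pearson correlation between the original distances and the heights at which each pair is first joined.

```python
from itertools import combinations
from math import sqrt


def average_distance(dist, members_a, members_b):
    """Mean of the original distances between two groups of items."""
    total = sum(dist[i][j] for i in members_a for j in members_b)
    return total / (len(members_a) * len(members_b))


def upgma_cophenetic(dist):
    """Cluster by UPGMA and return the cophenetic matrix.

    Entry [i][j] is the height (average distance) at which items i and j
    are first merged into the same cluster.
    """
    n = len(dist)
    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_distance(dist, clusters[p[0]], clusters[p[1]]),
        )
        height = average_distance(dist, clusters[a], clusters[b])
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and cophenetic distances."""
    pairs = list(combinations(range(len(dist)), 2))
    x = [dist[i][j] for i, j in pairs]
    y = [coph[i][j] for i, j in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical distances among four genotypes (not data from this study).
toy = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
]
print(round(cophenetic_correlation(toy, upgma_cophenetic(toy)), 3))
```

A coefficient close to 1 means that the dendrogram distorts the original distances very little.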
The DNA used in the AFLP analyses was extracted according to the protocol described by Saghai-Maroof et al. (1984), and the AFLP analyses were performed according to the protocol described by Vos et al. (1995). The six primer combinations used were M-CTA/E-ACT ($C_1$), ACA ($C_4$), M-CAA/E-ACA ($C_5$) and M-CAG/E-ACT ($C_6$), where $C_i$ represents the $i$th primer combination and E and M represent the EcoRI and MseI restriction enzymes. The amplified fragments were electrophoresed in 6% (w/v) denaturing polyacrylamide gels and stained using silver nitrate (Creste et al., 2001). Bands were scored as binary data (1 = presence and 0 = absence) and the average polymorphic information content (PIC) was calculated for each primer combination by applying the formula $\mathrm{PIC} = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of the $i$th allele (Powell et al., 1996). The marker index (MI) was calculated for each primer combination as $\mathrm{MI} = \mathrm{PIC} \times np_i$, where $np_i$ is the number of polymorphic bands (Powell et al., 1996).
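As a worked illustration of the PIC and MI formulas, the following sketch computes both for one hypothetical primer combination. It assumes that each band is treated as a locus with two "alleles" (presence and absence), so that the PIC of a band with presence frequency p is 1 - (p² + (1 - p)²). This is a common convention for dominant markers, but the text does not state how allele frequencies were derived, so it is an assumption. The function names and the toy band matrix are also illustrative.

```python
def band_pic(column):
    """PIC of one band, treating presence and absence as two alleles."""
    p = sum(column) / len(column)          # frequency of band presence
    return 1.0 - (p ** 2 + (1.0 - p) ** 2)


def primer_summary(bands):
    """Summarize one primer combination.

    bands: one row per genotype, each row a list of 0/1 band scores.
    Returns (number of polymorphic bands, average PIC, marker index).
    """
    columns = list(zip(*bands))
    # A band is polymorphic if it is present in some genotypes but not all.
    polymorphic = [c for c in columns if 0 < sum(c) < len(c)]
    if not polymorphic:
        return 0, 0.0, 0.0
    mean_pic = sum(band_pic(c) for c in polymorphic) / len(polymorphic)
    return len(polymorphic), mean_pic, mean_pic * len(polymorphic)


# Hypothetical scores: 4 genotypes x 3 bands (the first band is monomorphic).
toy_bands = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
]
print(primer_summary(toy_bands))  # (2, 0.5, 1.0)
```

Because MI multiplies the average PIC by the number of polymorphic bands, primer combinations with similar PIC values are ranked mainly by how many polymorphic bands they produce.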
Genetic similarity was calculated using the NTSYS pc 2.1 software. The genetic similarity ($S_{ij}$) was measured using the Dice coefficient (Dice, 1945) according to the equation $S_{ij} = 2N_{ij}/(N_i + N_j)$, where $N_{ij}$ is the number of bands present in both genotypes $i$ and $j$, $N_i$ is the number of bands present in genotype $i$, and $N_j$ is the number of bands present in genotype $j$. The genetic similarity was converted to genetic dissimilarity according to the equation $D_{ij} = 1 - S_{ij}$, in which $D_{ij}$ is the genetic dissimilarity and $S_{ij}$ the genetic similarity between genotypes $i$ and $j$. The resulting dissimilarity matrix was used to generate a UPGMA dendrogram, and the fit between the dissimilarity matrix and the dendrogram was estimated from the cophenetic correlation coefficient (r) using the NTSYS pc 2.1 software. Cluster stability was measured by bootstrap analysis with 1,000 replications using the Winboot software (Yap and Nelson, 1996). The Genes software (Cruz, 2001) was used to estimate the minimum number of markers needed for the estimation of genetic distance with a correlation coefficient (r) of at least 0.95.
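The Dice similarity and its conversion to dissimilarity can be sketched directly from the two equations above. The function names and the band profiles are hypothetical, and returning a similarity of 0 for two profiles with no bands at all is a choice made here only to avoid division by zero.

```python
def dice_similarity(profile_i, profile_j):
    """Dice coefficient S_ij = 2 N_ij / (N_i + N_j) for two 0/1 band profiles."""
    n_ij = sum(1 for a, b in zip(profile_i, profile_j) if a == 1 and b == 1)
    n_i, n_j = sum(profile_i), sum(profile_j)
    if n_i + n_j == 0:
        return 0.0  # assumption: no bands in either profile
    return 2.0 * n_ij / (n_i + n_j)


def dice_dissimilarity_matrix(profiles):
    """Matrix of D_ij = 1 - S_ij for a list of band profiles."""
    n = len(profiles)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - dice_similarity(profiles[i], profiles[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical band profiles of two genotypes scored for five bands.
g1 = [1, 1, 0, 1, 0]
g2 = [1, 0, 0, 1, 1]
print(round(dice_similarity(g1, g2), 4))  # 0.6667 (2 shared bands, 3 + 3 in total)
```

Note that bands absent from both genotypes do not enter the coefficient, so shared absences do not inflate the similarity.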
Genetic similarity was also estimated between all genotype pairs using the similarity index proposed by Gower (1971), which uses both binary and quantitative morphological data to estimate a single similarity index ranging from 0 to 1, calculated as $S_{\alpha\beta} = \sum_h t_{\alpha\beta h}\,\delta_{\alpha\beta h} / \sum_h \delta_{\alpha\beta h}$, where $t_{\alpha\beta h}$ is the score for variable $h$, adjusted for the type of variable, and $\delta_{\alpha\beta h}$ is its weight. If $h$ is binary (as in the AFLP analysis), then $t_{\alpha\beta h} = 0$ and $\delta_{\alpha\beta h} = 1$ if $h$ differs between the two genotypes, while both $t_{\alpha\beta h}$ and $\delta_{\alpha\beta h}$ equal 1 if $h$ is present in both genotypes or 0 if $h$ is absent from both genotypes. If $h$ is a quantitative variable, as is the case for some morphological parameters, then $t_{\alpha\beta h} = 1 - |x_{\alpha h} - x_{\beta h}|/R_h$ and $\delta_{\alpha\beta h} = 1$, where $x_{\alpha h}$ is the value of variable $h$ for genotype $\alpha$, $x_{\beta h}$ is the value of the same variable for genotype $\beta$ and, following Gower (1971), $R_h$ is the range of variable $h$. The genetic similarity was estimated using the Multiv v. 2.3 software (Pillar, 1997) and was converted into genetic dissimilarity according to the equation $D_{\alpha\beta} = 1 - S_{\alpha\beta}$, in which $D_{\alpha\beta}$ is the genetic dissimilarity and $S_{\alpha\beta}$ the genetic similarity between genotypes $\alpha$ and $\beta$. The resulting dissimilarity matrix was used to construct a UPGMA dendrogram, and the fit between the dissimilarity matrix and the dendrogram was estimated from the cophenetic correlation coefficient (r) using the NTSYS pc 2.1 software. The Mantel test (Mantel, 1967) with 1,000 permutations, calculated using the NTSYS pc 2.1 software, was used to assess the significance of the correlation (association) between the distance matrices resulting from the morphological, AFLP and combined analyses.
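The two computations described in this paragraph can be sketched as follows. This is an illustrative outline, not the Multiv or NTSYS implementation. The Gower function assumes the standard rules of Gower (1971): joint absences of a band receive zero weight, and quantitative differences are scaled by the range of the variable. The Mantel function assumes a one-tailed permutation test in which the rows and columns of the second matrix are shuffled together. All names, the seed and the toy data are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def gower_similarity(bin_a, bin_b, quant_a, quant_b, ranges):
    """Gower similarity between two genotypes with mixed data."""
    score, weight = 0.0, 0.0
    for x, y in zip(bin_a, bin_b):
        if x == 0 and y == 0:
            continue                      # joint absence: weight 0
        weight += 1.0
        score += 1.0 if x == y else 0.0   # 1 only when the band is shared
    for x, y, r in zip(quant_a, quant_b, ranges):
        weight += 1.0
        score += 1.0 - abs(x - y) / r     # r is the range of the variable
    return score / weight


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(m1, m2, permutations=1000, seed=0):
    """Matrix correlation and one-tailed permutation p-value."""
    n = len(m1)
    pairs = list(combinations(range(n), 2))
    x = [m1[i][j] for i, j in pairs]
    observed = pearson(x, [m2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)                # relabel the genotypes of m2
        y = [m2[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical pair of genotypes: four bands and one trait with range 40.
print(gower_similarity([1, 1, 0, 0], [1, 0, 0, 1], [80.0], [100.0], [40.0]))  # 0.375
```

Because every variable with non-zero weight counts equally in the Gower index, a data set with many more binary markers than quantitative characters yields a combined similarity that is driven mostly by the binary markers.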

Results and Discussion
The analysis of variance indicated statistically significant differences (p = 0.05) between the genotypes for all the characters evaluated and, for most characters, significant effects of years and of the genotype × year interaction. This justifies evaluating genotypes for more than one year in order to obtain reliable estimates of individual means for the majority of the characters (Table 2). The coefficients of variation for the data shown in Table 2 were low (2 to 13%), indicating high experimental accuracy.
Examination of the maximum and minimum values for each of the characters measured showed that some genotypes presented means located at the top and bottom limits for a large number of characters. For example, the Sonora 64 genotype showed mean values at the bottom limit of the range for seven (DEF, DEM, LBW, LSL, GY, HW and NG/NS) of the 17 measured characters (Table 2), while for genotype TB 951 a total of six characters had mean values located at the top (LBW, FLL, NS and SL) or bottom (PS and PL) limits (Table 2), indicating that these genotypes differ markedly in morphology from the other genotypes. According to the LSD test, DEF was the only character with no genotype means significantly below the overall mean, while NG/NS was the only character with no genotype means significantly above the overall mean (Table 2).
The overall Mahalanobis distance (D²) estimated using morphological characters revealed that the most distant genotypes were TB 951 and BRS 194 while the closest were BRS 119 and BRS 208. The dendrogram showed that the most distinct genotypes were Sonora 64, TB 951 and BR 18 (Figure 1). The large distance found between the Sonora 64 and TB 951 genotypes and the other genotypes in the dendrogram generated from morphological characters was expected, since their means were located at the top or bottom limits for a large number of phenotypic characters. The cophenetic correlation coefficient (r = 0.80) showed a fair degree of agreement between the graphical representation of distances and the original matrix, supporting the visual inferences drawn from Figure 1.

The six AFLP primer combinations used generated a total of 262 bands, of which 239 (91.2%) were polymorphic among the 19 genotypes studied. Analysis of the minimum number of informative markers revealed that, among all the markers obtained, at least 200 were required for the combined analysis to have a correlation coefficient of 0.95. Since the minimum number of informative markers was very close to the number of polymorphic markers evaluated (239), subsequent analyses were performed with all markers.
The PIC values were very close for the primer combinations used, so the highest MI values were detected for the combinations with the highest number of polymorphic bands (Table 3). The combinations producing the highest number of polymorphic bands (48) were M-CTA/E-ACT and M-CAC/E-ACA, while the combination producing the lowest number of polymorphic bands (31) was M-CAA/E-ACA. The low number of monomorphic bands (23) obtained across all primer combinations demonstrated that AFLP analysis has a high potential for detecting the genetic variability present in these wheat genotypes. A similar scenario was reported by Corbellini et al. (2002), who analyzed 40 wheat genotypes from Central and Southern Europe using five AFLP primer combinations and obtained an average of 40 polymorphic bands per primer combination and a total of 200 polymorphic bands. Lower levels of polymorphism have also been detected in wheat, such as the 59% reported by Almanza-Pinzón et al. (2003) and the 47% reported by Roy et al. (2004). Taken together, these results indicate that AFLP markers are efficient in detecting genetic variability in wheat.
The genetic dissimilarity estimated using AFLP markers showed that the most similar genotypes were BR 35 and BRS 120 and the most dissimilar were TB 951 and ICA 1. Clustering percentage values of 30% or more over 1,000 bootstrap cycles occurred in only four groups: BR 18 and TB 951 (57%), BRS 119 and BRS 192 (40%), BR 35 and BRS 120 (37%) and BR 23 and BRS 177 (30%), indicating that these clusters are the most consistent (Figure 2). The cophenetic correlation coefficient for the dendrogram (r = 0.85) indicated good agreement between the graphical display of distances and the original matrix, supporting the visual inferences suggested in Figure 2.
Of the four most consistent clusters in the AFLP analysis, three (BR 18 and TB 951; BR 23 and BRS 177; BRS 119 and BRS 192) were consistent with the distances estimated using morphological characters (Figures 1 and 2). There was some agreement between the distances estimated by the two techniques, as evidenced by a moderate but significant correlation (r = 0.47) between the morphological genetic distance matrix and the AFLP marker matrix (Table 4). Previous studies have reported that small genetic distances estimated by molecular markers were consistently associated with small phenotypic distances, while large molecular distances can be associated with either large or small phenotypic distances (Dillmann et al., 1997; Lefebvre et al., 2001).

The moderate association between genetic distances estimated using molecular and phenotypic markers can be explained by a range of factors. Molecular analysis provides a wider genome sampling than morphological analysis, since a study comparing both techniques rarely evaluates the same, or even a similar, number of morphological and molecular markers. The association between estimates is also influenced by the fact that a large portion of the variation detected by molecular markers is non-adaptive and, therefore, not subject to either natural or artificial selection. Phenotypic characters, on the other hand, are subject to both natural and artificial selection and are also highly dependent on the environment. Moreover, two identical phenotypes are not always determined by the same genes, i.e., distinct genes may lead to similar phenotypes.
Thus, it is clear that such estimates are closer when there is an association between the loci controlling the targeted morphological traits (quantitative trait loci, or QTLs) and the evaluated bands and when a large number of morphological traits are evaluated (Dillmann et al., 1997; Schut et al., 1997; Lefebvre et al., 2001; Máric et al., 2004; Roy et al., 2004). Máric et al. (2004) investigated wheat and reported a correlation of r = 0.12 between distances estimated using random amplified polymorphic DNA (RAPD) markers and 12 morphological characters. This was lower than the r = 0.47 correlation detected by us, possibly because we used a larger number of morphological characters (17) and hence a larger sample of the wheat genome. In another study of wheat, Roy et al. (2004) reported that the correlation between genetic distances estimated using AFLP markers and 14 morphological characters was r = 0.072, indicating an association close to null. However, although the morphological measurements conducted by Roy et al. (2004) were also performed in two years, the genotypes were evaluated under spaced-plant conditions, unlike in our study, in which the genotypes were evaluated under full-row conditions. Taken together, these studies may indicate that genetic distances estimated from morphological and molecular data tend to agree more closely under full-row conditions than under spaced-plant conditions, in which genotypes can express their phenotype without facing competition for space and nutrients.

Dreisigacker et al. (2004) analyzed 68 advanced wheat lines using 99 simple sequence repeat (SSR, or microsatellite) markers but could not differentiate the lines according to the five distinct mega-environments in which they were selected. According to Dreisigacker et al. (2004) there may be three possible reasons for the unsuccessful separation: i) selection based on mega-environment adaptation has not been practiced long enough to differentiate the germplasm; ii) genes conferring fitness to one mega-environment are not unique to that mega-environment and may confer fitness to several other mega-environments; and iii) adaptation to a mega-environment is not based on accretion of random genes but rather on a limited set of specific genes.
In our study, genetic dissimilarity as estimated through the combined analysis of AFLP and morphological characters showed that the most similar genotypes were BR 35 and BRS 120 and that the most dissimilar were TB 951 and ICA 1. The most distinct genotypes in the collection used in this study were ICA 2 and Sonora 64 (Figure 3). The cophenetic correlation coefficient of the dendrogram (r = 0.81) showed good agreement between the graphical representation of the distances and the original matrices, which enabled more accurate visual inferences to be drawn using Figure 3.
We also found a higher correlation (r = 0.97) between the combined and molecular matrices than between the combined and morphological matrices (r = 0.59), with both correlations being significant at p = 0.05 by the Mantel test with 1,000 permutations (Table 4). Such a difference in association between matrices could be due to the different number of data points for AFLP markers (229) and phenotypic characters (17). The results obtained in our study suggest that, to obtain a more complete understanding of the degree of genotype divergence, it is necessary to consider the molecular and morphological data separately. Franco et al. (2001) have suggested that better genotype discrimination is obtained from the combined molecular and morphological data when one determines, a priori, the minimum number of markers that will lead to the same results as the combination of all markers. However, Warburton et al. (2002) have pointed out that this strategy may lead to errors because the presence of translocated segments on chromosomes within the studied genotypes can lead to overestimation of genetic distances, especially when there is no information available about the presence or absence of the translocated segments concerned. Warburton et al. (2002) also showed that only four AFLP markers located on translocated wheat segments, out of 40 polymorphic markers, were sufficient to mistakenly cluster 101 sister lines solely as a function of the presence of translocated segments, and that separation of the 101 sister lines as a function of their pedigrees was achieved only when these four markers were eliminated from the analyses.

Figure 3 - Combined morphological character and amplified fragment length polymorphism (AFLP) marker unweighted pair group method with arithmetic means (UPGMA) dendrogram resulting from the analysis of 19 wheat genotypes using the complement of the similarity index of Gower (1971) obtained from the combined analysis of phenotypic and AFLP markers as a measure of genetic dissimilarity. The value of the cophenetic correlation coefficient (r) was 0.81.

In our study, the morphological and molecular marker analyses showed that the most divergent genotype is Sonora 64 (Figures 1 and 2). However, this genotype should not be prioritized in crosses, because it presents the lowest grain yield value (Table 2). Among the characters most targeted in plant breeding are those related to productivity (GY and HW), which determine the ultimate genotype performance. For such characters, crosses intended to raise the productivity plateau should involve genotypes with high GY that are nevertheless divergent for the pool of evaluated characters. Genotypes that are high yielding and also divergent probably have different QTLs controlling GY, and these QTLs could be combined into a new genotype superior to both parents (a transgressive segregant). In this sense, BR 23 and Rubi are promising genotypes, since they fulfill the requirements of being high yielding (Table 2) and divergent as indicated by both morphological and molecular data (Figures 1 and 2). Among the possible crossing combinations, special attention should be drawn to the promising cross between BRS 177 and Rubi, whose parents are high yielding and divergent and have high HW (Table 2, Figures 1 and 2).