Genetic diversity of sweet sorghum germplasm in Mexico using AFLP and SSR markers

The objective of this work was to evaluate the diversity and genetic relationships between lines and varieties of the sweet sorghum (Sorghum bicolor) germplasm bank of the National Institute for Forestry, Agriculture and Livestock Research, Mexico, using AFLP and SSR markers. The molecular markers revealed robust amplification profiles and were able to differentiate the 41 genotypes of sweet sorghum evaluated. Analysis of the frequency and distribution of polymorphic fragments allowed for the detection of unique (AFLP) and rare (SSR) alleles in several genotypes (RBSS-8, RBSS-9, RBSS-25, RBSS-32, and RBSS-37), indicating that these markers may be associated with a feature that has not yet been determined or may be useful for the identification of these genotypes. The genetic relationships indicated the presence of at least two types of sweet sorghum: a group of modern genotypes used for sugar and biofuel production, and another group consisting of historic and modern genotypes used for the production of syrups. Sweet sorghum genotypes may be used to develop new varieties with higher sugar and juice contents.


Introduction
Sweet sorghum [Sorghum bicolor (L.) Moench] belongs to the same species as grain sorghum and forage sorghum; however, it has been agriculturally selected to accumulate high levels of sucrose in the stem parenchyma. The juice extracted from sweet sorghum cane contains high levels of sucrose and invert sugar that are easily fermented to produce ethanol (Prasad et al., 2007). This juice has been used to produce syrup in the USA and alcohol in Brazil and India. It is estimated that, under favorable conditions, sweet sorghum can produce around 43 Mg ha⁻¹ per year of juice, which contains 11.8% of fermentable sugars (Kim & Day, 2011). Some varieties have been reported to produce sugar yields similar to those of sugarcane (Ratnavathi et al., 2010). Additionally, sweet sorghum is highly efficient in water use, even in areas where there are frequent periods of drought and high temperatures. The cultivation costs are also three times lower than those of sugarcane (Reddy et al., 2005), making it an ideal raw material for bioenergy.
Pesq. agropec. bras., Brasília, v.47, n.8, p.1095-1102, ago. 2012
Several sweet sorghum varieties from the USA have been introduced into Mexico, and new genotypes have been developed from these varieties. However, according to Murray et al. (2009), only six African genotypes were used in the development of many of the new varieties, indicating that sweet sorghum cultivars in the USA may have a narrow genetic base. If the genetic base is too narrow, the development of new genotypes for bioenergy production will be difficult. Therefore, it is necessary to determine the genetic diversity of this crop and to identify groups of similar genotypes for the purposes of conservation and proper use of genetic resources, in addition to the protection of property rights.
Genetic diversity has been estimated using DNA molecular markers, and the quality of this estimation depends on the number of markers generated for each species and on their coverage of the genome. In sweet sorghum, genetic diversity has been determined using amplified fragment length polymorphism (AFLP) and simple sequence repeat (SSR) markers. Due to their high values for polymorphic information content (PIC) and the Shannon diversity index (Geleta et al., 2006), these marker types are adequate for use in this species.
Some analyses of the sweet sorghum germplasm in the USA have indicated that sweet sorghum genotypes cannot be separated from the grain sorghum lines because of their similar racial origin within S. bicolor ssp. bicolor (Ritter et al., 2007; Ali et al., 2008). However, a more recent study using SSR and single nucleotide polymorphism (SNP) markers showed that sweet sorghum accessions can be classified into three main groups: historical and modern genotypes used for syrup production, modern genotypes used for sugar or energy production, and amber genotypes (Murray et al., 2009). In Mexico, the National Institute for Forestry, Agriculture and Livestock Research (Inifap) has introduced lines and varieties from the USA and the International Crops Research Institute for the Semi-Arid Tropics (Icrisat); however, the level of genetic diversity required to develop new genotypes is unknown.
Regarding SSR markers, different studies have recommended the use of these markers in analyses of genetic diversity due to their high degree of polymorphism (Geleta et al., 2006; Ali et al., 2008; Shehzad et al., 2009).
The objective of this work was to evaluate the diversity and genetic relationships between lines and varieties of the sweet sorghum germplasm bank of Inifap using AFLP and SSR markers.

Materials and Methods
A representative set of 41 lines and cultivars of sweet sorghum (Table 1) from Inifap was used for diversity analyses. Total genomic DNA was extracted from the young leaves of five plants of each line following the standard cetyltrimethylammonium bromide (CTAB) method with minor modifications (Doyle & Doyle, 1990). Leaf tissue (2 g) was ground to a fine powder in liquid nitrogen, and 100 mg of this powder were homogenized in 800 µL of extraction buffer [100 mmol L⁻¹ Tris, pH 8.0; 150 mmol L⁻¹ EDTA, pH 8.0; 2.1 mol L⁻¹ NaCl; 3% (w v⁻¹) CTAB; 1% (v v⁻¹) β-mercaptoethanol].
AFLP analysis was performed as described by Vos et al. (1995). Genomic DNA (10 ng µL⁻¹) was digested with 5 U each of EcoRI and MseI at 37°C for 3 hours. Digested samples were incubated at 70°C for 15 min to deactivate the restriction enzymes. EcoRI and MseI adapters (5 and 50 pmol, respectively) were ligated to the digested DNA fragments using 1 U of T4 DNA ligase in ligation buffer (1X T4 DNA ligase buffer), and incubated at 37°C overnight. Taq DNA polymerase was used in all polymerase chain reactions (PCR). Selective amplification was performed using six different AFLP primer combinations (Table 2). Pre-amplification of ligated DNA (diluted 10-fold) was carried out with primers complementary to the EcoRI and MseI adapters with one selective nucleotide (adenine and cytosine, respectively) in a Px2 thermal cycler (Thermo Electron Corporation, Milford, MA, USA), using the following cycling parameters: 20 cycles of 94°C for 30 s, 56°C for 30 s, and 72°C for 60 s. The second amplification was carried out with six selective primer combinations of EcoRI (700 and 800 nm) and MseI with three selective nucleotides (Table 2).
Each PCR product was electrophoresed independently on 6% denaturing polyacrylamide gels. The PCR products were fractionated on a sequencing system (Li-Cor IR2, Li-Cor, Lincoln, NE, USA) equipped with two infrared lasers with the ability to read at 700 and 800 nm wavelengths. Only bright, clearly distinguishable bands between 50 and 700 bp were recorded for analysis. Negative controls containing no template were performed for each PCR.
The SSR primers used in the present study were developed for S. bicolor (Taramino et al., 1997; Bhattramakki et al., 2000; Kong et al., 2000) and were selected based on their clear polymorphic patterns and on their position in the sorghum genome, covering ten linkage groups or chromosomes. In total, 29 SSR primers were used (Table 3). The microsatellite PCR contained 50 ng µL⁻¹ template DNA, 1X PCR buffer, 0.2 mmol L⁻¹ each dNTP, 4 mmol L⁻¹ MgCl₂, 1 µmol L⁻¹ each primer (F and R), and 1 unit of Taq DNA polymerase in a total reaction volume of 25 µL. Amplification was performed in a Px2 thermal cycler (Thermo Electron Corporation, Milford, MA, USA) using a program with an initial denaturation at 94°C for 60 s followed by 35 cycles at 94°C for 60 s, 57°C for 60 s, and 72°C for 60 s with a final hold at 72°C for 10 min. Formamide dye was added to the appropriate PCR products, which were then subjected to electrophoretic separation on 8.5% denaturing polyacrylamide gels.
To determine which of the AFLP primer combinations were the most informative, the following parameters were calculated: polymorphism information content (PIC), marker index (MI), resolving power (RP), and diversity index (DI). The PIC value for each AFLP primer combination was calculated according to the formula $PIC_i = 2f_i(1 - f_i)$, in which $f_i$ is the frequency of the marker fragments that were present, and $1 - f_i$ is the frequency of the marker fragments that were absent. PIC was averaged over all fragments for each primer combination. PIC for each SSR locus was calculated using a freely available online calculator (Kemp, 2012).
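The AFLP PIC calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function names and the toy presence/absence matrix are hypothetical, and rows are assumed to be accessions and columns fragments.

```python
def fragment_pic(column):
    """PIC of one dominant (present/absent) fragment: 2 * f * (1 - f)."""
    f = sum(column) / len(column)  # frequency of accessions showing the band
    return 2 * f * (1 - f)


def primer_pic(matrix):
    """Average PIC over all fragments (columns) of one primer combination."""
    columns = list(zip(*matrix))  # transpose: one tuple per fragment
    return sum(fragment_pic(col) for col in columns) / len(columns)


# Hypothetical data: 4 accessions x 3 fragments (1 = band present, 0 = absent)
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
]
print(round(primer_pic(toy), 3))
```

For a dominant marker the PIC reaches its maximum of 0.5 when a fragment is present in exactly half of the accessions, which is why an average of 0.3 can already be considered informative.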
The marker index (MI) was calculated according to the formula $MI = PIC \times EMR$, in which EMR is the effective multiplex ratio, defined as the product of the total number of loci or fragments per primer ($n$) and the fraction of polymorphic loci or fragments ($\beta$). The resolving power (RP) of each primer was calculated as $RP = \sum I_b$, in which $I_b$ represents fragment informativeness. The $I_b$ can be represented on a 0-1 scale using the formula $I_b = 1 - [2 \times |0.5 - p|]$, in which $p$ is the proportion of the 88 accessions that contain the fragment. The diversity index, which indicates the genetic diversity of the germplasm, was calculated using the formula $DI = 1 - \sum P_i^2$, in which $P_i$ is the frequency of allele $i$ (each individual allele is considered a unique fragment amplification).
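The three indices defined above can be illustrated with a short Python sketch. It assumes a binary matrix with accessions in rows and fragments in columns, and it treats a fragment as polymorphic when it is present in some but not all accessions; that criterion, the function names, and the toy data are assumptions made for illustration only.

```python
def effective_multiplex_ratio(matrix):
    """EMR = n * beta (total fragments times fraction that is polymorphic)."""
    columns = list(zip(*matrix))
    n = len(columns)
    # Assumption: polymorphic = present in at least one but not all accessions
    polymorphic = sum(1 for col in columns if 0 < sum(col) < len(col))
    beta = polymorphic / n
    return n * beta


def marker_index(pic, emr):
    """MI = PIC x EMR."""
    return pic * emr


def resolving_power(matrix):
    """RP = sum of I_b, with I_b = 1 - 2 * |0.5 - p| for each fragment."""
    total = 0.0
    for col in zip(*matrix):
        p = sum(col) / len(col)  # proportion of accessions with the fragment
        total += 1 - 2 * abs(0.5 - p)
    return total


def diversity_index(allele_frequencies):
    """DI = 1 - sum of squared allele frequencies."""
    return 1 - sum(p * p for p in allele_frequencies)


# Hypothetical data: 4 accessions x 4 fragments; the last one is monomorphic
toy = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
]
emr = effective_multiplex_ratio(toy)
print(emr, resolving_power(toy), marker_index(0.3, emr))
print(diversity_index([0.5, 0.3, 0.2]))
```

Note that a monomorphic fragment (p = 1) contributes nothing to RP, while a fragment found in half of the accessions contributes the maximum value of 1.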
Each AFLP primer combination received a score (1, for presence, and 0, for absence of bands in each accession), and a binary matrix was generated. A dendrogram was constructed using the Dice similarity coefficient (DSC) (Nei & Li, 1979), the unweighted pair group method with arithmetic average (UPGMA), and the jackknife method for corroboration. A bootstrap resampling method was performed to determine the robustness of the dendrogram, and 1,000 bootstrap replicates were obtained from the original data of 88 accessions. From these 1,000 matrices, confidence limits for each pair-wise comparison were determined. All calculations were performed using FreeTree 0.9.1.50 (Pavlícek et al., 1999), and the dendrogram was drawn using TreeView 1.6.6 (Page, 2001). Moreover, a principal coordinate analysis (PCooA), using the DSC (Nei & Li, 1979), was conducted with NTSYS PC version 2.2 (Rohlf, 2000). This analysis was performed to generate a three-dimensional representation of the accessions to confirm the similarity among them.
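The clustering step can be sketched as follows. This is an illustrative pure-Python version of Dice similarity followed by UPGMA, not a reproduction of FreeTree; the accession names G1 to G4 and their band profiles are hypothetical, and distances are assumed to be 1 minus the Dice similarity.

```python
def dice(a, b):
    """Dice (Nei & Li) similarity between two binary band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    return 2 * shared / total if total else 0.0


def upgma(profiles):
    """Return the UPGMA merges as (cluster_a, cluster_b, distance) tuples."""
    names = list(profiles)
    # Pairwise distances between single accessions: 1 - Dice similarity
    dist = {
        frozenset((p, q)): 1 - dice(profiles[p], profiles[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }

    def average(c1, c2):
        # Arithmetic mean of all pairwise distances between the two clusters
        pairs = [dist[frozenset((p, q))] for p in c1 for q in c2]
        return sum(pairs) / len(pairs)

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        candidates = [
            (average(c1, c2), c1, c2)
            for i, c1 in enumerate(clusters)
            for c2 in clusters[i + 1:]
        ]
        d, c1, c2 = min(candidates, key=lambda t: t[0])  # closest pair
        merges.append((c1, c2, d))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band profiles for four accessions
profiles = {
    "G1": [1, 1, 0, 0, 1],
    "G2": [1, 1, 0, 0, 0],
    "G3": [0, 0, 1, 1, 0],
    "G4": [0, 1, 1, 1, 1],
}
for left, right, distance in upgma(profiles):
    print(left, right, round(distance, 3))
```

Each merge joins the two clusters with the smallest average distance, so the list of merges read top to bottom corresponds to the dendrogram read from the tips to the root.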

Results and Discussion
Robust amplification profiles were obtained from AFLP and SSR markers, which were able to differentiate between genotypes. The polymorphism rate of the AFLP markers was 54%, which was similar to that observed by Ritter et al. (2007); however, other studies have reported polymorphism rates between 85 and 93% using AFLP and RAPD markers (Geleta et al., 2006; Kachapur et al., 2009). In the present study, AFLP markers were polymorphic enough to distinguish between sweet sorghum genotypes and to detect high levels (69%) of genetic diversity (Table 2), which is consistent with the observations of Geleta et al. (2006).
The analysis of the frequency and distribution of polymorphic AFLP fragments showed 39 specific fragments (each present in a single individual) and 65 rare fragments present in up to 10% of the genotypes (Table 2). It is probable that the presence of such fragments in a gene is related to a loss of genetic diversity, possibly due to selection (Foster et al., 2010). However, specific and rare alleles are of great interest because they may be linked to a particular genotype and may serve to distinguish a genotype or a specific region of the genome (Agrama & Tuinstra, 2003). The genotypes with the highest number of unique and rare AFLP fragments were RBSS-8, RBSS-9, RBSS-25, RBSS-32 (Dale), and RBSS-37 (Theis); three of them are genotypes introduced by Icrisat, and two are improved USA varieties, confirming the data on rare and unique markers (Agrama & Tuinstra, 2003; Foster et al., 2010).
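The distinction between specific and rare AFLP fragments can be illustrated with a short Python sketch. It assumes a binary matrix with genotypes in rows and fragments in columns, and it assumes that the rare class excludes fragments already counted as specific; the function name and the toy matrix are hypothetical.

```python
def classify_fragments(matrix, rare_percent=10):
    """Return (specific, rare) lists of fragment column indices.

    specific: present in exactly one genotype.
    rare: present in more than one genotype but in at most rare_percent
    of them (assumption: the two classes do not overlap).
    """
    n_genotypes = len(matrix)
    specific, rare = [], []
    for index, column in enumerate(zip(*matrix)):
        count = sum(column)
        if count == 1:
            specific.append(index)
        elif count > 1 and count * 100 <= rare_percent * n_genotypes:
            rare.append(index)
    return specific, rare


# Hypothetical data: 20 genotypes x 3 fragments, present in 1, 2, and 10 genotypes
toy = [
    [1 if i == 0 else 0, 1 if i < 2 else 0, 1 if i < 10 else 0]
    for i in range(20)
]
print(classify_fragments(toy))
```

With the 41 genotypes of the present study, a 10% cut-off corresponds to fragments found in at most four genotypes.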
Of the six AFLP primer combinations tested, five were highly informative (E-AGA + M-CAC, E-AGG + M-CAT, E-AAG + M-CAG, E-AGA + M-CAA, and E-AGG + M-CAG) with an average PIC of 0.3 (Table 2). Therefore, these AFLP combinations are recommended for use in germplasm diversity analysis of S. bicolor and other species. In addition, the correlation of PIC with MI and RP (0.83 and 0.99, respectively) confirms the usefulness of these AFLP combinations (Tatikonda et al., 2009).
The total number of alleles and size of SSR markers varied from the ones originally reported, indicating that the source of the material tested differed from that of the previously characterized collections (Ali et al., 2008; Shehzad et al., 2009). An average of five alleles per locus and a polymorphism rate of 98% were observed (Table 3); the latter was similar to that found by Agrama & Tuinstra (2003) and Pei et al. (2010), but higher than the ones observed by Ali et al. (2008) and Shehzad et al. (2009). Moreover, 11 tested loci (mSbCIR286, Xgap84, Xtxp12, Xtxp15, Xtxp265, Xtxp6, Xtxp321, Xtxp289, Xtxp67, Xcup67, and Sb03g039090) had PIC values greater than 0.7, which is consistent with the results found for some of these loci by Shehzad et al. (2009) and Pei et al. (2010). The high average genetic diversity observed in the present study (0.79) was similar to that found by Smith et al. (2000) and Agrama & Tuinstra (2003), who obtained values of 0.62 and 0.58, respectively, but was higher than the values reported by Schloss et al. (2002), Ali et al. (2008), and Shehzad et al. (2009), who obtained values of 0.46, 0.40, and 0.217, respectively.
Rare alleles were also observed in 65% of the evaluated loci (Table 4). Rare alleles have a frequency less than 0.05 (Casa et al., 2005; Somers et al., 2007), and their importance lies in the fact that they are unique and associated with a particular genotype (Agrama & Tuinstra, 2003). In the present study, the genotypes with the highest number of rare SSR alleles also contained unique AFLP alleles.
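The rare-allele criterion (frequency below 0.05) can be illustrated for a single SSR locus with the Python sketch below. It assumes one allele call per genotype, as would be expected for inbred lines; the genotype labels, allele sizes, and function name are hypothetical and do not come from Table 4.

```python
from collections import Counter


def rare_alleles(calls, threshold=0.05):
    """Return {allele: frequency} for alleles below the frequency threshold.

    calls maps each genotype to its allele size (bp) at one SSR locus.
    Assumption: one allele call per genotype.
    """
    counts = Counter(calls.values())
    total = sum(counts.values())
    return {
        allele: count / total
        for allele, count in counts.items()
        if count / total < threshold
    }


# Hypothetical locus scored on 41 genotypes: 25 x 200 bp, 15 x 204 bp, 1 x 210 bp
calls = {f"line-{i + 1}": 200 for i in range(25)}
calls.update({f"line-{i + 26}": 204 for i in range(15)})
calls["line-41"] = 210
print(rare_alleles(calls))
```

In a panel of 41 genotypes, an allele carried by one or two genotypes falls below the 0.05 threshold, whereas an allele carried by three genotypes (about 0.07) does not.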
The Xtxp265 marker was highly polymorphic and had a high PIC value of 0.7 (Table 3). Murray et al. (2009) reported that this marker had a significant association with plant height and flowering time. Xtxp265 is located on chromosome 6, within the gene most closely linked with photoperiod sensitivity (Murray et al., 2008). Similarly, Ji et al. (2011) found a positive association between the Xtxp18 marker and an allele of 270 bp located on chromosome SB1-08. Polymorphisms were also observed using the marker Sb03g027710, which corresponds to an SNP marker derived from sugarcane (Calvino et al., 2009). These authors recommend the use of this marker along with five additional SNP markers for the genetic mapping of genes linked to sugar content or for the selection of parents in sweet sorghum. Therefore, this marker may be useful for the genetic improvement of this species in the Mexican germplasm bank.
The diversity indices for AFLP and SSR markers were 69 and 79%, respectively. Although a more accurate estimate of genetic relationships would theoretically be obtained using a greater number of markers, the distribution of these markers in the genome is equally important. Menz et al. (2002) indicated that markers with a low level of genomic coverage may provide a different classification for a particular gene. For example, many of the AFLP markers are generated using the enzymes EcoRI and MseI, and the recognition sites for these enzymes are more densely located in the centromeric regions of chromosomes; therefore, centromeric regions have a high weight in the classification of germplasm. With the creation of genetic maps for many crop species, markers can be selected to adequately cover the entire genome without any particular region being over- or underrepresented. Consequently, a more accurate estimate of the dissimilarity between genotypes can be achieved by selecting markers that are uniformly distributed throughout the genome. In the present study, 29 SSR markers distributed throughout the ten chromosomes of sorghum were selected, which provided a better estimate of genetic relationships among the studied genotypes compared to the AFLP markers.
These genetic relationships (Figure 1) indicated the presence of at least two types of sorghum. The first group, cluster 1, was divided into two subgroups (1A and 1B). Cluster 1A contained most of the genotypes, including the grain sorghums Sureño and Mazatlán, and both subgroups of this cluster contain historical and modern sorghum lines used for syrup production (Murray et al., 2009). These lines have tall stems that are large in diameter and very juicy, with high Brix content, although lower than that of sorghum lines developed for sugar and energy production (bicolor type). The second group, cluster 2, includes four modern genotypes of sweet sorghum used for bioenergy production, i.e., Dale, Theis, Maravilla, and RB-Cañero (Murray et al., 2009); the latter is derived from the variety M81E (kafir/bicolor type).
Deu et al. (2006) found that the bicolor race has high genetic diversity and many rare alleles, which is not surprising considering that this race is considered to be the oldest and most widely distributed geographically due to its several uses (fodder, brooms, and sweet stems). Additionally, genotypes RBSS-7, RBSS-8, RBSS-9, and RBSS-21 were very divergent. Casa et al. (2008) observed that the genotypes of amber varieties of sweet sorghum are very similar to those of the bicolor type, although amber varieties are more divergent. Therefore, it is likely that the four genotypes previously mentioned correspond to the amber variety. Furthermore, bootstrap confidence intervals indicating the similarity of the genotypes showed that, at multiple nodes, the association was not as robust because the values were less than 30%.
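The idea behind bootstrap confidence limits for a pairwise similarity can be sketched as follows. This is a simplified illustration that resamples fragments with replacement for a single pair of genotypes; it is not the FreeTree procedure used for the dendrogram nodes, and the function names, the percentile limits, the fixed seed, and the toy profiles are all assumptions.

```python
import random


def dice(a, b):
    """Dice (Nei & Li) similarity between two binary band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    return 2 * shared / total if total else 0.0


def bootstrap_dice(a, b, replicates=1000, seed=0):
    """Approximate 95% percentile limits for the Dice similarity of a and b."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(a)
    values = []
    for _ in range(replicates):
        # Resample fragment positions with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(dice([a[i] for i in idx], [b[i] for i in idx]))
    values.sort()
    lower = values[int(0.025 * replicates)]
    upper = values[int(0.975 * replicates) - 1]
    return lower, upper


# Hypothetical band profiles for two genotypes scored on 12 fragments
g1 = [1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
g2 = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0]
print(dice(g1, g2), bootstrap_dice(g1, g2))
```

Wide limits for a pair, like low bootstrap values at a node, indicate that the observed similarity depends strongly on which fragments happen to be scored.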
To validate these results, PCooA was performed. The first three principal coordinates accounted for 26% of total variation, which agreed with the bootstrap values. A three-dimensional representation (Figure 2) showed that the dispersion of the genotypes of the detected clusters corresponded to the one observed in the dendrogram, and also illustrated that the genotypes of cluster 1B were more heterogeneous. Moreover, outlier genotypes (RBSS-7, RBSS-8, RBSS-9, and RBSS-21) could clearly be distinguished.
Conclusions
2. In general, AFLP and SSR markers show a high level of genetic diversity.
3. The Inifap germplasm bank contains a great diversity of genotypes, which may be used in breeding programs for the development of new varieties of sweet sorghum with high sugar and juice content.

Figure 1. Dendrogram of the 41 evaluated genotypes of sweet sorghum (Sorghum bicolor) obtained from amplified fragment length polymorphism (AFLP) and simple sequence repeat (SSR) molecular markers, using the similarity index of Nei & Li and the unweighted pair group method with arithmetic mean (UPGMA).

Figure 2. Dispersion of sweet sorghum (Sorghum bicolor) germplasm from the National Institute for Forestry, Agriculture and Livestock Research, based on the first three principal coordinates.

Table 1. Genotypes of sweet sorghum (Sorghum bicolor) used in the study.

Table 2. Polymorphism information, genetic diversity index (DI), marker index (MI), effective multiplex ratio (EMR), polymorphic information content (PIC), and resolving power (RP) given for each of the amplified fragment length polymorphism (AFLP) combinations used in the study.

Table 3. Number of alleles (NA), expected number of alleles (NAE), Shannon index (SI), marker index (MI), polymorphic information content (PIC), resolving power (RP), and genetic diversity index (DI) calculated for each simple sequence repeat (SSR) locus.

Table 4. Number and frequency of simple sequence repeat (SSR) alleles observed in sweet sorghum.