A novel statistical method for assessing SSR variation in autotetraploid alfalfa (Medicago sativa L.)

The level of variation of simple sequence repeat (SSR) markers in cultivated alfalfa from American, Australian and Chinese sources was evaluated using a novel autotetraploid statistical method to calculate the effective number of alleles, the allele frequencies and heterozygosity. We used 19 SSR primers to screen seven polymorphic SSR loci in 320 plants from eight populations. The genetic distance and phylogenetic analysis (DISPAN) program was used to calculate the inter- and intra-population genetic relationships using the conventional binary absence/presence (0/1) method and our novel autotetraploid method. The autotetraploid method resulted in significantly higher heterozygosity (p < 0.01), a significantly higher average effective number of alleles (p < 0.01) and a significantly lower standard genetic distance (p < 0.01) than the binary method. Our results suggest that our new autotetraploid method is a useful tool for assessing genetic variation and genetic relationships in autotetraploid plant species.


Introduction
Alfalfa, Medicago sativa L., is the most widely cultivated forage legume due to its ability to fix atmospheric dinitrogen and its high protein content. Analysis of the genetic variability within and among populations of cultivated alfalfa can help to assess the future risk of genetic erosion and to develop sustainable conservation and genetic improvement strategies. However, since alfalfa is autotetraploid (2n = 4x = 32), allogamous and seed-propagated, successful assessment of its genetic diversity has been hampered by the statistical methods available (Stanford 1951; Flajoulot et al. 2005).
Compared with other DNA-based markers such as restriction fragment length polymorphisms (RFLP), random amplified polymorphic DNA (RAPD), single nucleotide polymorphisms (SNPs) and amplified fragment length polymorphisms (AFLP), simple sequence repeat (SSR) markers occur frequently in plants, are multiallelic, co-dominant and highly reproducible, and can be used with low-quality DNA (Morgante and Olivieri 1993; Wang et al. 1994). Many SSR markers have been developed and are widely used in plants for genetic mapping, genetic diversity assessment, population genetics and marker-assisted selection (Gupta and Varshney 2000).
With the rapid development of expressed sequence tags (ESTs), a large number of SSR markers have been developed from the EST library of Medicago truncatula (Baquerizo et al. 2001; Eujayl et al. 2004), a model diploid organism with 2n = 2x = 16. Eujayl et al. (2004) searched 147,000 M. truncatula ESTs and identified 455 SSR primer pairs which produced characteristic SSR bands of the expected length in Medicago species. Ellwood et al. (2006) used six SSR loci to analyze the genetic diversity of, and relationships between, randomly selected specimens from 192 accessions in the core M. truncatula collection.
It thus seems that SSR markers would be a powerful molecular tool for assessing genetic diversity and characterizing germplasm in tetraploid alfalfa. Bernadette et al. (2003) used 87 SSR primer pairs, most from M. truncatula ESTs, for genotyping and mapping tetraploid alfalfa populations, and SSR markers have been applied to alfalfa in several other studies (Diwan et al. 1997, 2000; Mengoni et al. 2000a, b; Baquerizo et al. 2001; Eujayl et al. 2004; Flajoulot et al. 2005; Sledge et al. 2005; Ellwood et al. 2006). Even so, relatively little is known about the utility of SSR markers in elucidating the genetic relationships within and among populations of cultivated alfalfa, because of the autotetraploid inheritance of alfalfa and the codominant nature of SSR markers. A method for analyzing the segregation patterns of SSR markers in autotetraploid plants has been developed which can predict the genotypes of a pair of autotetraploid specimens at a genetic locus from their gel-pattern phenotypes and the phenotypes of their offspring (Luo et al. 2000, 2001; Hackett and Luo 2002). This methodology has greatly facilitated the construction of genetic linkage maps in tetraploid plants and has resulted in the application of SSR methodology to the analysis of genetic diversity within and among tetraploid plant populations.
There are large differences between the genotypes and the phenotypes of autotetraploid plants due to the possibility of different allele dosages and the presence of null alleles (Table 1), i.e. alleles for which the PCR primers fail to anneal to the DNA template so that the corresponding electrophoretic band is absent (Luo et al. 2000). Null alleles often occur in cultivated alfalfa because there is more SSR variation in autotetraploid than in diploid plants (Callen et al. 1993), and it has also been pointed out that the number of null alleles should be higher when M. truncatula SSR primers are used in alfalfa (Bernadette et al. 2003). Jenczewski et al. (1999) have pointed out that the frequencies of the null alleles, the amount of self-fertilization within populations (s) and the proportion of double reduction at each locus (α) should be evaluated when describing genetic relationships in autotetraploid alfalfa accessions. The allele frequencies, including the null alleles, shown in Table 1 form the basis of our hypothesis that in natural populations of cultivated alfalfa all genotype frequencies within the same set of banding patterns (gel) are equivalent, assuming that the populations mate at random (s = 0) and that loci in autotetraploid alfalfa do not experience double reduction (α = 0). These hypotheses are supported by the previous work of Jenczewski et al. (1999).
Based on these allele frequencies, we put forward formulas for the heterozygosity (H) and the average effective number of alleles (Ne).
In the present paper we describe the use of SSR markers and a novel autotetraploid statistical method to investigate the genetic variability within and among cultivated alfalfa accessions. This methodology can be used to gain a better understanding of biodiversity conservation and of the utilization of genetic resources in designing breeding programs for cultivated alfalfa.

Plant materials
All alfalfa (Medicago sativa L.) accessions investigated were supplied by the Institute of Animal Science, Chinese Academy of Agricultural Sciences (IAS-CAAS), Beijing. The eight accessions studied were two native Chinese accessions (Weinan, a landrace from Shaanxi Province, and Zhongmu N. 1, an improved variety from IAS-CAAS) and six accessions introduced from Australia and America (Alfanafa, Atlantic, AZ-GermSalt, AZ-88NDC, AZ-90NDC-ST and Salado). Except for Weinan, all the accessions were improved varieties. Seeds were germinated and grown in a greenhouse for 30 days before 0.5 g of fresh young leaves was collected from each of 40 randomly selected seedlings per accession.

DNA preparation and SSR amplification
For each accession, total genomic DNA was extracted from individual seedlings using a mini-extraction kit according to the manufacturer's protocol (BioDev-Tech, Beijing, China) and stored at -20 °C until needed. To determine the optimum polymerase chain reaction (PCR) conditions we carried out trials using the PCR protocol described below but varying the concentrations of MgCl2, dNTPs, Taq DNA polymerase and genomic DNA as well as other parameters, e.g. melting temperature (Tm, °C) and annealing time. Nineteen SSR primer pairs were screened using ten DNA samples from different accessions, and on the basis of these preliminary data seven SSR primer pairs (AFca16, AFca1, AFca11, AFct32, AFct11, MTLEC2A and B14B03) (Diwan et al. 1997; Bernadette et al. 2003) were chosen. The PCR amplification was carried out in a 12 μL final volume containing 15 ng of genomic DNA as template, 0.2 mM of each dNTP (Sangon Co., Shanghai, China), 0.25 μM of each primer and 1.0 unit of Taq DNA polymerase (Sangon Co., Shanghai, China) in 1× PCR reaction buffer (Sangon Co., Shanghai, China). The optimal Mg2+ concentration differed among primer pairs (Table 2). The PCR conditions consisted of an initial denaturation at 95 °C for 5 min, followed by 33 cycles of denaturation at 94 °C for 45 s, annealing at 56 °C (adjusted for each primer pair, Table 2) for 45 s and extension at 72 °C for 1 min, with a final extension at 72 °C for 10 min.
The PCR products (3 μL per lane) were resolved on 8% or 12% (w/v) polyacrylamide gels, using an acrylamide/bisacrylamide ratio of 29:1, and run at a constant 175 V for 10 h at 4 °C. The pBR322/MspI marker (Huamei, Beijing, China) was run on each gel as a molecular weight standard. Bands were stained using a silver-nitrate method (Tixier et al. 1997), then scanned and analyzed using the Gel Logic 200 system and the associated 1D Image Analysis Software (Kodak, USA).

SSR analysis
The SSR banding phenotypes on the visualized gel were scored by two different protocols: the conventional binary method, in which a band is scored as absent (0) or present (1) (Botstein et al. 1980; Anderson et al. 1993), and our novel tetraploid method.
In the conventional binary 0/1 method, the binary allele frequencies (P), heterozygosity (h) (Nei 1987), effective number of alleles (Ne) and polymorphism information content (PIC) (Botstein et al. 1980; Anderson et al. 1993) were calculated as follows, where P_i is the frequency of an allele for locus i, f_i (0/1) is the presence or absence of a specific band in each plant, n is the number of plants in the population and N is the total number of locus i bands in a population. This leads to the equation in which P_ij is the frequency of the j-th allele of locus i, m is the total number of alleles at locus i, Ne is the effective number of alleles for locus i and n is the total number of loci.
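The quantities defined above can be illustrated with a minimal sketch. It assumes the standard textbook forms h = 1 - Σp², Ne = 1/Σp² and the Botstein et al. (1980) PIC, and it assumes that a band's frequency is its count divided by the total number of bands scored at the locus, as the definition of N suggests. The function names are hypothetical and the code is not taken from the original analysis.

```python
from itertools import combinations


def binary_allele_frequencies(band_matrix):
    """Band frequencies at one locus from 0/1 scores.

    band_matrix: one row per plant, one 0/1 column per band.
    Assumption: frequency = band count / total bands at the locus.
    """
    counts = [sum(column) for column in zip(*band_matrix)]
    total = sum(counts)
    if total == 0:
        raise ValueError("no bands scored at this locus")
    return [count / total for count in counts]


def heterozygosity(freqs):
    """h = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def effective_alleles(freqs):
    """Ne = 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - sum(p * p for p in freqs) - cross


# Four hypothetical plants scored for three bands at one locus.
plants = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]
freqs = binary_allele_frequencies(plants)
print(freqs, heterozygosity(freqs), effective_alleles(freqs), pic(freqs))
```

Because PIC subtracts an extra non-negative term from h, it can never exceed h for the same frequencies, which is consistent with the h and PIC values reported for each accession.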
In our new autotetraploid method the autotetraploid heterozygosity (h') was calculated using the same formula as for the binary heterozygosity (h), and the allele frequencies (p_i) and the effective number of alleles (Ne') were calculated as follows, where P_i' is the expected number of the j allele (except for the null allele) of a locus, P_E is the sum of N_j in locus i and P_O is the expected number of null alleles for a locus. The banding phenotypes are given by α = one band, β = two bands, χ = three bands and ε = four bands.
For both methods, genetic diversity was estimated using Nei's standard genetic distance (D, Nei 1972) computed using the genetic distance and phylogenetic analysis (DISPAN) program (Ota 1993), and unweighted pair group method with arithmetic mean (UPGMA) dendrograms (Sneath and Sokal 1973) were drawn from the genetic distance (D) matrices using the DISPAN clustering routine.
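The assumption behind Table 1, that all genotypes compatible with a given banding phenotype are equally frequent, can be made concrete with a short sketch. It enumerates the compatible tetraploid genotypes for a phenotype with a given number of bands and averages the allele dosages. This is one reading of that assumption, not the authors' own formulas, which are not reproduced here. The function names and the label "O" for the null allele (following Table 1) are illustrative choices.

```python
from fractions import Fraction
from itertools import product

PLOIDY = 4
NULL = "O"  # label for the null allele, as in Table 1


def expected_dosage(n_bands, ploidy=PLOIDY):
    """Expected copies per visible allele and of the null allele.

    Every genotype compatible with the phenotype is weighted equally:
    each visible allele has at least one copy and the null allele
    fills whatever remains of the `ploidy` copies.
    """
    if not 0 <= n_bands <= ploidy:
        raise ValueError("number of bands must be between 0 and ploidy")
    genotypes = []
    for dosages in product(range(1, ploidy + 1), repeat=n_bands):
        null_copies = ploidy - sum(dosages)
        if null_copies >= 0:
            genotypes.append((dosages, null_copies))
    n = len(genotypes)
    # By symmetry every visible allele has the same expected dosage.
    visible = Fraction(sum(d[0] for d, _ in genotypes), n) if n_bands else Fraction(0)
    null = Fraction(sum(o for _, o in genotypes), n)
    return visible, null


def allele_frequencies(phenotypes, ploidy=PLOIDY):
    """Population allele frequencies, null allele included.

    phenotypes: one set of band labels per plant at a single locus.
    """
    totals = {}
    for bands in phenotypes:
        visible, null = expected_dosage(len(bands), ploidy)
        for band in bands:
            totals[band] = totals.get(band, 0) + visible
        totals[NULL] = totals.get(NULL, 0) + null
    denominator = ploidy * len(phenotypes)
    return {allele: copies / denominator for allele, copies in totals.items()}


# One-band phenotype: AAAA, AAAO, AAOO and AOOO are equally likely.
print(expected_dosage(1))  # (Fraction(5, 2), Fraction(3, 2))
```

For a one-band phenotype the four genotypes listed in Table 1 give an average of 2.5 copies of the visible allele and 1.5 copies of the null allele. The binary method, by contrast, implicitly ignores both dosage and null alleles.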

Genetic diversity within and among accessions
By optimizing the PCR conditions for each primer pair, especially the Mg2+ concentration and the melting temperature (Tm, °C) (Table 2), we were able to visualize the SSR banding patterns of the accessions; an example is the SSR variation at the AFca16 locus for the Alfanafa accession shown in Figure 1. We used the allele frequencies to calculate the heterozygosity (h), the number of effective alleles (Ne) and the polymorphism information content (PIC) by both the binary method and our novel autotetraploid method, and found that both statistical methods indicated substantial variability among the eight accessions studied. In the eight accessions we found 59 bands (alleles) at seven SSR loci, with 27 (45.8%) occurring in all the accessions. The number of bands detected in each accession was 51 in AZ-GermSalt, 49 in Atlantic, 48 in Salado and Zhongmu N. 1, 47 in AZ-90NDC-ST, 46 in Alfanafa, 44 in AZ-88NDC and 41 in Weinan.

Comparison of the binary and autotetraploid statistical methods
With the binary 0/1 method, within-population SSR variability was high, with the heterozygosity (h) ranging from 0.6877 for the Alfanafa accession to 0.7640 for the AZ-GermSalt accession (Figure 2A) and the mean effective number of alleles (Ne) ranging from 3.4 for Alfanafa and AZ-88NDC to 4.4 for Zhongmu N. 1 (Figure 2B). The most powerful index for evaluating locus polymorphism is the PIC value, which ranged from 0.6276 for the B14B03 locus to 0.7989 for the AFca11 locus. The PIC values for the AFca16, AFca11, AFct11, MTLEC2A and B14B03 loci were all higher than 0.6500 and the mean PIC for all seven loci was 0.6977 (Figure 3). The regression coefficient between h and PIC was 0.9917 (|t| = 7.697 > t0.01, p < 0.01).
Our novel autotetraploid method showed greater within-population SSR variability, with the maximum autotetraploid heterozygosity (h') being 0.8234 for the AZ-GermSalt accession. The mean h' value for the eight accessions was 0.7707, significantly higher than the mean heterozygosity calculated by the binary method (h = 0.7338, p < 0.05) (Figure 2A). The mean effective number of alleles by our autotetraploid method (Ne') was also significantly higher (p < 0.01) than the effective number of alleles calculated by the binary method (Ne) (Figure 2B).

Figure 2 - The effective number of alleles (Ne) and heterozygosity (h), with their respective standard errors, of eight alfalfa accessions as calculated using two statistical methods. Panel A shows the heterozygosity of the eight alfalfa accessions by the binary method and our novel autotetraploid statistical method. Panel B shows the number of effective alleles by these two methods. All the values calculated using the autotetraploid method are significantly higher than those calculated using the binary method because the autotetraploid method takes the null allele into account and uses the expected allele frequencies (see Table 1). Panel C shows the polymorphism information content (PIC) + SE of the eight alfalfa populations as calculated using the binary method. Accessions: Alfanafa (Alf), Atlantic (Atl), AZ-GermSalt (AZS), AZ-88NDC (AZ8), AZ-90NDC-ST (AZ9), Salado (Sal), Weinan (Wei), Zhongmu N. 1 (Zho).

Standard genetic distance and dendrogram
The allele frequencies and the DISPAN program were used to calculate the genetic distances (Table 3) from the results produced by the binary method (D) and the novel autotetraploid method (D'), and two UPGMA dendrograms were constructed (Figure 4). With the binary method the minimum D value was 0.0173 between the Salado and Zhongmu N. 1 accessions and the maximum D value was 0.2217 between the Alfanafa and Weinan accessions. The autotetraploid statistical method identified the same pairs of accessions, with a minimum D' value of 0.0119 between the Salado and Zhongmu N. 1 accessions and a maximum D' value of 0.1473 between the Alfanafa and Weinan accessions. However, there were significant differences (p < 0.01) between the D and D' values.
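For readers who want to see how a distance of this kind is obtained from allele frequencies, the following minimal sketch uses the usual definition of Nei's (1972) standard genetic distance, D = -ln(J_XY / sqrt(J_X J_Y)), where the J terms are sums of products of allele frequencies averaged over loci. It is an illustration of that definition only. The function name and data layout are hypothetical, and it makes no claim about how DISPAN handles sample-size corrections internally.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance between two populations.

    pop_x, pop_y: lists with one dict per locus mapping allele -> frequency.
    Both lists must cover the same loci in the same order.
    """
    if not pop_x or len(pop_x) != len(pop_y):
        raise ValueError("populations must share the same non-empty set of loci")
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    # Averaging each J over the number of loci cancels in the ratio.
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


# Two hypothetical populations scored at a single locus.
x = [{"a": 1.0}]
y = [{"a": 0.5, "b": 0.5}]
print(nei_standard_distance(x, y))  # 0.5 * ln(2)
```

Feeding the function binary-method frequencies or autotetraploid frequencies (null allele included) gives D or D' respectively, so any difference between the two distance matrices comes entirely from how the allele frequencies were estimated.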
When plotted as a dendrogram, the data from the binary analysis clustered the eight alfalfa populations into two groups (Figure 4A), with Weinan and Atlantic forming one group and the other six accessions forming the other. Data from our novel autotetraploid method produced similar results, except that there were differences in the AZ-GermSalt and AZ-90NDC-ST branches (Figure 4B).
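The clustering step can be sketched with a generic UPGMA implementation. At each step the two closest clusters are merged, and the distance from the new cluster to every other cluster is the size-weighted average of the old distances. This is a textbook version written for illustration, with hypothetical names and toy data. It is not the DISPAN routine used in the study.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Generic UPGMA clustering.

    labels: names of the populations.
    matrix: symmetric list of lists of distances in the same order.
    Returns (tree, merges): a nested tuple and a list of
    (left subtree, right subtree, node height) in merge order.
    """
    trees = {i: labels[i] for i in range(len(labels))}
    sizes = {i: 1 for i in trees}
    dist = {(i, j): matrix[i][j] for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    merges = []
    while len(trees) > 1:
        (i, j), d_ij = min(dist.items(), key=lambda item: item[1])
        merges.append((trees[i], trees[j], d_ij / 2))
        for k in trees:
            if k in (i, j):
                continue
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            # Size-weighted mean keeps every original leaf equally weighted.
            dist[(k, next_id)] = (sizes[i] * d_ik + sizes[j] * d_jk) / (sizes[i] + sizes[j])
        dist = {key: value for key, value in dist.items() if i not in key and j not in key}
        trees[next_id] = (trees.pop(i), trees.pop(j))
        sizes[next_id] = sizes.pop(i) + sizes.pop(j)
        next_id += 1
    return next(iter(trees.values())), merges


# Toy distances between three hypothetical populations.
tree, merges = upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
print(tree, merges)
```

Because UPGMA depends only on the distance matrix, two matrices that rank the closest pairs differently (as D and D' may do for some accessions) can yield dendrograms that differ in individual branches while keeping the same major groups.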

Discussion
This study is the first attempt to evaluate genetic variation patterns in an autotetraploid plant using a new statistical method. The data from the two different statistical methods provide evidence of inter- and intra-populational genetic polymorphisms, which could be interpreted as being due to allogamy and the tetraploid nature of alfalfa. The PIC index describes diversity within accessions (intra-populational diversity) and characterizes the degree of polymorphism at each locus, with a PIC value of less than 0.25 indicating low polymorphism, a value between 0.25 and 0.5 average polymorphism and a value higher than 0.5 a highly polymorphic locus (Botstein et al., 1980). In the present study all seven loci had PIC values exceeding 0.5 and could be considered highly polymorphic (Figure 3). The highest PIC value was 0.7302 for the AZ-GermSalt accession, indicating that this accession had the greatest genetic diversity of all the accessions investigated (Figure 2C). However, all eight alfalfa accessions studied showed high heterozygosity as measured by both h and h', supporting previous reports of high heterozygosity in alfalfa as detected using RFLP (Brummer et al., 1991) and SSR and RAPD (Mengoni et al., 2000a).
In our study we found that both statistical methods clustered the eight accessions into two groups, which may be related to their salt-tolerance traits because Alfanafa, AZ-GermSalt, AZ-90NDC-ST, AZ-88NDC, Salado ... The differences between the conventional binary and our novel autotetraploid dendrograms (Figures 4A and 4B) are probably due to the different statistical theory on which the two methods are based. The usefulness of nuclear SSR markers in the analysis of alfalfa germplasm has been reported previously (Mengoni et al., 2000a), but our novel autotetraploid method allows a clearer insight into the genetic variability of cultivated tetraploid alfalfa using SSR markers because, when calculating the allele frequencies, it considers not only the differences between marker phenotypes and genotypes but also the frequency of null alleles. Compared with the binary analysis, our new statistical method produced not only a significantly higher number of effective alleles, significantly higher heterozygosity values and significantly lower standard genetic distances but also a different UPGMA dendrogram. Because the autotetraploid method takes into account the complex mechanisms of autotetraploid population genetics and the codominant nature of SSR markers, the results of this type of analysis should be closer to the true genetic characteristics of autotetraploid alfalfa.

Table 3 - Standard genetic distances (D) between populations calculated using two different statistical methods, the 0/1 method (upper diagonal) and the tetraploid method (lower diagonal). The standard genetic distances computed by the 0/1 method and the tetraploid method are significantly different (|t| = 3.018 > t0.01, p < 0.01).
In our tetraploid method, we did not present a new formula for calculating the PIC because this is very complex for autotetraploid plants. In the traditional binary method, we found that the correlation coefficient between the heterozygosity (h) and PIC was 0.9917 (p < 0.01), a similar correlation having also been reported by Qu et al. (2004), and that the populations showing the maximum and the minimum h (Figure 2A) and PIC (Figure 2C) values were AZ-GermSalt (h = 0.7640, PIC = 0.7302) and Alfanafa (h = 0.6877, PIC = 0.6474), respectively. This suggests that the autotetraploid PIC value (PIC') could be estimated from the autotetraploid h' value and used to evaluate the genetic diversity within an autotetraploid population.
Polyploidy is very prevalent in plants, with at least 50% and perhaps up to 95% of angiosperms having experienced one or more episodes of chromosome doubling during their evolutionary history (Leitch and Bennett 1997). A large proportion of cultivated crops are allopolyploids, prominent among them being wheat, cotton, tobacco and many of the forage grasses. Autotetraploids are less common, but include some important plants such as alfalfa, birdsfoot trefoil, potato, tea and rose. In the work described in this paper we used cultivated alfalfa as a model plant to establish a novel autotetraploid statistical method for studying inter- and intra-population genetic relationships in autotetraploid species. The results of our study indicate that the tetraploid statistical method will be useful in a range of autotetraploid species for gene localization, construction of linkage maps and the planning of breeding programs.

Figure 1 - Simple sequence repeat variation at the AFca16 locus of some individual Alfanafa plants. Lane 'M' contains the molecular weight marker, with bands of 147 bp, 110 bp and 90 bp from top to bottom.

Figure 3 - The polymorphism information content (PIC) and corresponding standard error of seven simple sequence repeat (SSR) loci in eight alfalfa accessions calculated using the conventional binary method. See Figure 2 for the abbreviations.

Table 1 - Relationship between marker banding phenotypes and genotypes at a single locus. Different letters represent different alleles, with 'O' denoting the null allele. The expected allele frequencies were calculated on the assumption that the frequencies of each genotype of the same banding phenotype (i.e. number of bands present on the electrophoresis gel after staining) are equivalent in natural populations of cultivated alfalfa. For example, for the 'One band' banding phenotype all corresponding genotypes (AAAA, AAAO, AAOO and AOOO) share the same genotype frequency.

Table 2 - The seven pairs of alfalfa polymorphic simple sequence repeat (SSR) primers used in this study. Note that the number of alleles does not include the null allele.