Genetic diversity analysis of Cuban traditional rice (Oryza sativa L.) varieties based on microsatellite markers

Microsatellite polymorphism was studied in a sample of 39 traditional rice (Oryza sativa L.) varieties and 11 improved varieties widely planted in Cuba. The study was aimed at assessing the extent of genetic variation in traditional and improved varieties and at establishing their genetic relationships for breeding purposes. Heterozygosity was analyzed at each microsatellite locus and for each genotype using 10 microsatellite primer pairs. Genetic relationships between varieties were also estimated. The number of alleles per microsatellite locus ranged from 4 to 8, averaging 6.6 alleles per locus. Higher heterozygosity (H) was found in traditional varieties (HTV = 0.72) than in improved varieties (HIV = 0.42), and 68% of the total microsatellite alleles were found exclusively in the traditional varieties. Genetic diversity, represented by cluster analysis, indicated three different genetic groups based on their origin. Genetic relationship estimates based on the proportion of microsatellite loci with shared alleles indicated that the majority of traditional varieties were poorly related to the improved varieties. We also discuss the more efficient use of the available genetic diversity in future programs involving genetic crosses.


Introduction
In Cuba, the genetic breeding of rice (Oryza sativa) is mainly conducted at the Cuban Rice Research Institute (Instituto de Investigaciones del Arroz, IIArroz, Cuba) and is characterized by the use of consanguineous parent plants mainly introduced from the International Rice Research Institute (IRRI, Philippines) and the International Center for Tropical Agriculture (Centro Internacional de Agricultura Tropical, CIAT, Colombia), together with improved varieties developed in our country. These improved rice varieties are characterized by a limited genetic base which can be traced back to just six maternal cytoplasm sources. All the improved varieties are grown under irrigated lowland conditions and are semi-dwarf O. sativa subspecies indica genotypes which present a high field genetic uniformity, because rice production in Cuba during the last 30 years was mainly undertaken using an extensive system based on a limited number of these improved varieties (Fuentes et al., 2004).
Since 1996 the profile of the Cuban rice crop has changed, and today nearly 60% of the total rice production comes from local farmers growing traditional varieties of unclear origin or varieties improved in situ, mainly under irrigated lowland conditions. Many of the traditional varieties are probably locally adapted farmers' selections from varieties from the United States (US) that showed good grain quality and local adaptability and were introduced before the beginning of the Cuban rice breeding program. However, the genetic relationship among traditional varieties and between these materials and the improved varieties released by IIArroz remained unknown.
Molecular marker technologies can assist conventional breeding efforts and are valuable tools for the analysis of genetic relatedness and the identification and selection of desirable genotypes for crosses, as well as for germplasm conservation in gene banks. Simple sequence repeat markers (microsatellites) are co-dominant, hypervariable, abundant and well distributed throughout the rice genome (Temnykh et al., 2001). About 2240 microsatellite markers are now available through the published high-density linkage map (McCouch et al., 2002) or public databases. Moreover, multiplex microsatellite marker panels have been designed for high-throughput analyses and semi-automated genotyping (Coburn et al., 2002). The applications of microsatellite markers in rice include characterization of the genetic structure of the cultivated rice O. sativa at both the inter- and intra-varietal level (Olufowote et al., 1997; Garris et al., 2005), genetic diversity and/or evolutionary analyses of landraces, weedy and wild rice germplasm (Yang et al., 1994; Vaughan et al., 2001; Ni et al., 2002; Gao, 2004, 2005; Gao et al., 2005), determination of the purity of breeding material or seed stocks (Olufowote et al., 1997), prediction of hybrid performance and heterosis (Xiao et al., 1996a) and the analysis and tagging of valuable quantitative trait loci (QTL) and genes (Xiao et al., 1996b; Zou et al., 2000).
Our hypothesis is that traditional rice varieties locally cultivated by farmers that were either not used, or poorly used, as parents in the Cuban rice breeding program represent alternative genetic pools to the improved varieties. To corroborate this hypothesis, microsatellite markers were used for the genetic diversity analysis of a sample of 39 traditional varieties and 11 varieties representing a significant portion of the rice cultivated in Cuba by the extensive system of state companies. This study was aimed at assessing the extent of genetic variation in the traditional and improved rice varieties and at establishing their genetic relationships for breeding purposes.

Rice materials
We investigated 39 traditional rice varieties (Oryza sativa), collected from 1975 to 1985 at different farms in Cuba and held at the IIArroz Rice Germplasm Bank, and 11 varieties commercially exploited in Cuba (Fuentes et al., 2003) (Table 1).

Microsatellite assay
For each variety, 20 seeds were planted in a greenhouse and leaves from 20-day-old seedlings were collected for DNA extraction (Dellaporta et al., 1983). The polymerase chain reaction (PCR) was conducted in a final volume of 20 μL containing 20 ng of template DNA, 0.1 μL of a 20 μM solution of each of the primers presented in Table 2, 250 μM of each dNTP, 1.8 mM of MgCl2 and 1 unit of Taq DNA polymerase (Promega, USA). The reaction was processed at 94 °C for 3 min, followed by 34 cycles of 94 °C for 30 s, 54 °C for 30 s and 72 °C for 1 min, and a final 5 min extension at 72 °C. After the reaction, 5 μL of stop solution (95% (w/v) formamide, 20 mM EDTA, 0.05% (w/v) bromophenol blue and 0.05% (w/v) xylene-cyanol) was added to the amplification product and 3 μL samples were loaded onto 6% (w/v) polyacrylamide denaturing gels containing 6 M urea. A silver-staining procedure (Cho et al., 1996) was used to reveal bands after electrophoresis.
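As a quick check of the reaction set-up, the final concentration of each primer follows from the dilution relation C1V1 = C2V2. The worked arithmetic below assumes that the 0.1 μL of 20 μM primer stock is diluted into the full 20 μL reaction volume; the symbol names are ours.

```latex
C_{\text{final}}
  = \frac{C_{\text{stock}} \, V_{\text{stock}}}{V_{\text{reaction}}}
  = \frac{20\ \mu\text{M} \times 0.1\ \mu\text{L}}{20\ \mu\text{L}}
  = 0.1\ \mu\text{M}
```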

Data analysis
Because of the codominance of the markers, microsatellites were scored as homozygous and heterozygous genotypes. The gene diversity or heterozygosity (Nei, 1973) of the polymorphic loci was calculated as $H = \frac{n}{n-1}\left(1 - \sum_i p_i^2\right)$, where n is the number of genotypes in the sample and $p_i$ is the frequency of the i-th allele at the locus. For each genotype, heterozygosity was expressed as the percentage of loci at which the genotype is heterozygous out of the total possible number of 10 loci. The partitioning of genetic variation in the total variety sample was studied considering the traditional and improved rice varieties as subgroups. The gene frequencies of these two variety subgroups were compared by means of the chi-squared (χ2) test (p < 0.05).
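The gene diversity calculation described above can be illustrated with a minimal Python sketch. It assumes the sample-size-corrected form of Nei's gene diversity, H = n/(n - 1) × (1 - Σ p_i²), with allele frequencies estimated from the scored alleles; the function name and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def gene_diversity(alleles, n_genotypes):
    """Gene diversity at one locus: n/(n-1) * (1 - sum of squared allele frequencies).

    `alleles` is the list of alleles scored at the locus (assumption: one
    entry per observed allele copy); `n_genotypes` is the sample size n.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    sum_sq = sum((c / total) ** 2 for c in counts.values())
    return n_genotypes / (n_genotypes - 1) * (1.0 - sum_sq)


# Hypothetical locus scored in 5 homozygous genotypes (one allele each).
alleles = ["a", "a", "b", "b", "c"]
h = gene_diversity(alleles, n_genotypes=5)
print(round(h, 3))
```

With frequencies 0.4, 0.4 and 0.2 the uncorrected value is 0.64, and the n/(n - 1) factor raises it to 0.8 for n = 5. The correction becomes small for samples as large as the 50 varieties used here.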
Genetic relationships between genotypes (Sxy) were calculated as the proportion of microsatellite loci with shared alleles (Lynch, 1990), which is equivalent to the F statistic (Nei and Li, 1979). Thus, Sxy = 0 if none of the 10 microsatellite loci had alleles common to both genotypes, whereas Sxy = 0.5 if the genotypes had identical alleles at 5 of the 10 microsatellite loci. In a germplasm collection consisting of n varieties, each variety can be compared with the other (n - 1) varieties, so that (n - 1) Sxy values can be obtained. An average Sxy can then be obtained for each variety from these (n - 1) values (Xu et al., 2004). This index was therefore used for identifying the more genetically diverse varieties for breeding purposes. For the diversity representation, a genetic relationship matrix was used to produce a dendrogram in a sequential agglomerative hierarchical nested cluster analysis (SAHN), based on the unweighted pair group method with arithmetic mean (UPGMA), using the NTSYS-pc package (Rohlf, 1997; Exeter Software, Setauket, USA).
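The shared-allele index can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the band-sharing form 2·n_xy / (n_x + n_y) of Nei and Li (1979), which reduces to the proportion of loci with identical alleles when both genotypes are homozygous at every locus. How heterozygous loci were weighted in the actual analysis is not specified in the text, and the variety names and allele labels below are hypothetical.

```python
def shared_allele_proportion(x, y):
    """Band-sharing similarity 2*n_xy / (n_x + n_y) summed over all loci.

    `x` and `y` are lists of sets, one set of alleles per locus, in the
    same locus order. Assumes every genotype has at least one scored allele.
    """
    n_xy = sum(len(a & b) for a, b in zip(x, y))
    n_x = sum(len(a) for a in x)
    n_y = sum(len(b) for b in y)
    return 2 * n_xy / (n_x + n_y)


def average_similarity(name, genotypes):
    """Mean Sxy of one variety against the other (n - 1) varieties."""
    others = [g for key, g in genotypes.items() if key != name]
    total = sum(shared_allele_proportion(genotypes[name], g) for g in others)
    return total / len(others)


# Hypothetical genotypes scored at four loci.
genotypes = {
    "V1": [{1}, {2}, {3}, {4}],
    "V2": [{1}, {2}, {5}, {6}],
    "V3": [{7}, {8}, {9}, {10}],
}
print(shared_allele_proportion(genotypes["V1"], genotypes["V2"]))
print(average_similarity("V1", genotypes))
```

Varieties with a low average Sxy are the ones that share the fewest alleles with the rest of the collection, which is how the index is used to flag genetically diverse candidates.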

Results
Allele number, allele frequency and heterozygosity
A total of 66 alleles were detected at the ten microsatellite loci evaluated in the varieties. In this study, the banding patterns resolved by each primer pair were in accordance with single locus variation. Therefore, we refer to the sequence amplified by each primer pair as a locus and each variant as an allele. The number of alleles per microsatellite locus ranged from 4 at RM-18 to 8 at RM-4, RM-167 and RM-202 (Table 3). In accordance with previous studies (Wu and Tanksley, 1993; Ni et al., 2002; Lu et al., 2005), the average number of alleles per locus was 6.6, a value higher than that reported by Xiao et al. (1996b) surveying elite inbred lines but slightly lower than those found by Yang et al. (1994), Xu et al. (2004) and Gao et al. (2005) studying larger samples of landraces and/or commercial varieties.
Very frequent alleles were considered to be those occurring in more than 10% of the varieties in the collection, while rare alleles were classified as occurring in between 2% and 10% of the varieties in the collection. In our study, alleles at microsatellite loci were detected with very different frequencies in the different genotypes, with 37 very frequent alleles being identified at the ten microsatellite loci while 15 rare alleles were identified at 7 microsatellite loci. A total of 14 unique alleles were detected at eight microsatellite loci (Table 1). Ten of the rice varieties, representing 20% of the collection, had unique alleles for at least one microsatellite locus, 7 of which had a unique allele at only one marker locus. The varieties with the highest number of unique alleles were Negrón (2), Selección 143 (3) and M-4 (2).
The expected total heterozygosity (HT) of the microsatellite polymorphic loci was between 0.60 for RM-167 and 0.85 for RM-18 markers, with an average value of HT = 0.75. When heterozygosity was calculated for each variety (Table 1), it ranged between zero and 0.80. As expected for cultivated rice, a high degree of homozygosity was observed. Thus, the majority of rice varieties showed a single allele in their respective microsatellite profiles. Out of a total of 500 amplification profiles (50 genotypes × 10 primer pairs) scored in this study, 109 (22%) showed two alleles and 387 (77%) showed a single allele. Interestingly, four varieties (Jabao, Rexoro, Pati Prieto II and Selección 30) showed three alleles each for the RM-4 primer pair. Since the RM-4 locus is duplicated on chromosomes 11 and 12 of the rice genome (Panaud et al., 1996), it is possible to detect up to four alleles for this locus.
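The allele classes used above can be expressed as a small Python sketch. The class boundaries are an assumption on our part: an allele carried by exactly one variety is treated as unique, an allele carried by more than one variety but by at most 10% of the collection as rare, and anything above 10% as very frequent. The function name is hypothetical.

```python
def classify_allele(n_carriers, n_varieties):
    """Classify one allele by the number of varieties that carry it."""
    if n_carriers == 1:
        return "unique"
    # Integer arithmetic avoids floating-point issues at the 10% boundary.
    if n_carriers * 10 <= n_varieties:
        return "rare"
    return "very frequent"


# With 50 varieties: 1 carrier is unique, 2 to 5 carriers are rare,
# and 6 or more carriers are very frequent.
for carriers in (1, 5, 6):
    print(carriers, classify_allele(carriers, 50))

# The reported class sizes add up to the 66 alleles detected.
reported = {"very frequent": 37, "rare": 15, "unique": 14}
print(sum(reported.values()))
```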
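The per-variety heterozygosity and the tally of amplification profiles can be illustrated with the following Python sketch. It assumes that any locus showing more than one allele counts as heterozygous, which would also include the three-allele RM-4 profiles; the genotype shown is hypothetical.

```python
from collections import Counter


def heterozygosity_percent(profile):
    """Percentage of loci at which a genotype shows more than one allele."""
    het = sum(1 for alleles in profile if len(alleles) > 1)
    return 100.0 * het / len(profile)


def tally_profiles(profiles):
    """Count amplification profiles by the number of alleles observed."""
    return Counter(len(alleles) for profile in profiles for alleles in profile)


# Hypothetical genotype scored at 10 loci, heterozygous at two of them.
variety = [{"a"}, {"b", "c"}, {"a"}, {"d"}, {"a", "b"},
           {"c"}, {"a"}, {"b"}, {"e"}, {"a"}]
h_percent = heterozygosity_percent(variety)
counts = tally_profiles([variety])
print(h_percent, dict(counts))

# Reported tallies: 387 single-allele, 109 two-allele and 4 three-allele
# profiles account for all 500 profiles (50 genotypes x 10 primer pairs).
reported = {1: 387, 2: 109, 3: 4}
print(sum(reported.values()))
```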

Comparison of variability in traditional and improved varieties
In accordance with previous studies (Yang et al., 1994; Xu et al., 2004), heterozygosity was higher in the traditional varieties (HTV = 0.72) than in the improved varieties (HIV = 0.42) (χ2 test, p < 0.05). In terms of the genetic diversity documented in the entire set of varieties, the traditional varieties included 98% (65/66) of the microsatellite alleles detected in this study while the improved varieties included 32% (21/66). A total of 45 alleles (68%) were found exclusively in the traditional varieties, while only one allele (2%) was found exclusively in the improved varieties. Furthermore, 20 (30%) of the microsatellite alleles were common to traditional and improved varieties, contrasting with the results of Yang et al. (1994), who reported that about 72% of the microsatellite alleles observed in landraces were incorporated into elite varieties. The result obtained in our present study was not surprising because the traditional varieties were either poorly used, or not used, as parents in the Cuban rice breeding program.
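The partitioning of alleles between the two variety groups amounts to simple set operations, sketched below in Python. The locus and allele labels in the toy example are hypothetical; the final lines only check that the counts reported above are mutually consistent.

```python
def partition_alleles(traditional, improved):
    """Return counts of alleles exclusive to each group and shared by both."""
    only_traditional = traditional - improved
    only_improved = improved - traditional
    shared = traditional & improved
    return len(only_traditional), len(only_improved), len(shared)


# Hypothetical (locus, allele) labels observed in each group.
traditional = {("locus1", 1), ("locus1", 2), ("locus2", 1)}
improved = {("locus1", 2), ("locus2", 3)}
result = partition_alleles(traditional, improved)
print(result)

# Consistency of the reported counts: 45 exclusive to traditional varieties,
# 1 exclusive to improved varieties and 20 shared, out of 66 alleles.
exclusive_trad, exclusive_impr, common = 45, 1, 20
total = exclusive_trad + exclusive_impr + common
in_traditional = exclusive_trad + common
in_improved = exclusive_impr + common
print(total, in_traditional, in_improved)
```

The shared and exclusive counts reproduce the 65 alleles present in the traditional varieties and the 21 present in the improved varieties.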

Genetic relationship between varieties
A cluster analysis based on genetic similarity estimates is shown in Figure 1. Considering a truncation level of 0.7, three variety groups (I-III) were obtained. Groups I and III included only traditional varieties, while Group II clustered all the improved varieties and a few of the traditional varieties which had probably been derived from the improved varieties. The subgrouping of the improved varieties within Group II was basically in accordance with their genealogies, with, for example, the related genotypes Jucarito-104 being placed in the IIa subgroup and Amistad-82 into the IIb subgroup as reported in a previous study (Fuentes et al., 2003).
Group I may correspond to the tropical japonica rice subspecies group since it included some US varieties (i.e., Rexoro, Blue Bonnet and Nira) previously assigned to this subspecies group by Lu et al. (2005). To gain more insight into this relationship, further research needs to be carried out including some international japonica control strains, such as the Nipponbare cultivar, and an additional set of microsatellite markers with proven diagnostic value for differentiation between the indica and japonica subspecies groups (Ni et al., 2002; Coburn et al., 2002; Gao et al., 2005). We have previously shown that the improved varieties in the Cuban rice breeding program, here included in group II, are indica genotypes (Fuentes et al., 1999, 2004, 2005). Thus, the traditional varieties of this group may also be indica genotypes, since they are close to the improved varieties. When an international control for the indica subspecies (variety IR36) was included in the microsatellite diversity analysis (data not shown), it clustered in group II, confirming the indica nature of this genetic group. The traditional varieties within group III could not be assigned to any specific subspecies group.
A quantitative estimate of the genetic relationships between the traditional and improved varieties based on the Sxy pairwise values was undertaken to assist genetic crosses using these varieties (Table 4). The most similar varieties shared alleles at all microsatellite loci while the least similar varieties shared alleles at none of the marker loci. Thus, Sxy values among all traditional/improved varietal pairs were calculated, and subsequently the average values of Sxy for each variety were obtained. On average, the 11 improved varieties shared between 13% and 28% of the microsatellite alleles with traditional varieties, while the 39 traditional varieties shared between 5% and 56%. Out of a total of 429 traditional/improved varietal pairs (39 traditional × 11 improved varieties), 95 (22%) shared alleles at none of the microsatellite loci, 228 (53%) shared alleles at between 5% and 25% of the microsatellite loci, while 106 (25%) shared alleles at more than 25%. No genetic polymorphism could be detected at any microsatellite locus between the varieties IACuba-14, IACuba-16 and IACuba-24, which all trace back to a common parent (CP1-C8), confirming their close pedigree relationship (Fuentes et al., 2003). A similar result was found for the IACuba-21 and IACuba-26 varieties, mutant lines derived from Jucarito-104, as well as for the traditional varieties Blanquito, Espiritista and Selección Tres Provincias, all of unknown origin.
The average genetic similarity (Sxy) values for each variety ranged from 11% for Selección 135 to 47% for Jabao and Rexoro, with an overall average Sxy value of 34% (Table 1). When the traditional and improved variety groups were evaluated separately (data not shown), the average Sxy values for the traditional group ranged from 10% for Jorge Valladares to 67% for Blue Bonnet, with an overall average Sxy value of 40%, while for the improved varieties the Sxy values were between 59% for Jucarito-104 and 70% for Perla de Cuba, with an overall average value of 65%. This indicated that varieties within each group were closely related genetically.
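The grouping of traditional/improved pairs by their share of common alleles can be sketched in Python as below. The bin boundaries (zero, up to and including 25%, and above 25%) are our reading of the text, and the toy Sxy values are hypothetical; the last lines check the arithmetic of the reported counts.

```python
def bin_pairs(sxy_values):
    """Count varietal pairs in three similarity classes."""
    bins = {"none": 0, "up to 25%": 0, "more than 25%": 0}
    for s in sxy_values:
        if s == 0:
            bins["none"] += 1
        elif s <= 0.25:
            bins["up to 25%"] += 1
        else:
            bins["more than 25%"] += 1
    return bins


# Hypothetical pairwise Sxy values for five traditional/improved pairs.
toy = bin_pairs([0.0, 0.1, 0.25, 0.3, 0.0])
print(toy)

# Reported counts for the 39 x 11 = 429 pairs and their rounded percentages.
reported = {"none": 95, "up to 25%": 228, "more than 25%": 106}
n_pairs = 39 * 11
percentages = {k: round(100 * v / n_pairs) for k, v in reported.items()}
print(sum(reported.values()), percentages)
```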

Discussion
In this paper we present a genetic diversity analysis of traditional and improved Cuban rice varieties based on microsatellite marker data. The study demonstrated that the variety sample possesses a high level of microsatellite variation (HT = 0.75). This result agrees with previous studies that showed high microsatellite variation in O. sativa (Yang et al., 1994; Gao et al., 2005) and related wild rice species (Gao, 2004, 2005). We used microsatellite sequences that have previously revealed a high level of polymorphism in Latin American rice germplasm (data not shown) and which are mostly located on chromosomes harboring a high level of genetic diversity (Gao et al., 2005). Thus, these microsatellite sequences may be useful tools in future genetic studies of rice germplasm.
Based on phenotypic characteristics (data not shown) and information supplied by farmers on the origin of the varieties, we hypothesize that traditional rice varieties grown by local farmers in Cuba represent alternative genetic pools to those present in the improved varieties. This hypothesis is supported by the higher mean heterozygosity of the traditional varieties (HTV = 0.72) compared with the improved varieties (HIV = 0.42; χ2 test, p < 0.05) and by the 45 microsatellite alleles (68% of the total) found exclusively in the traditional varieties. The hypothesis was also corroborated by cluster analysis based on genetic relationship estimates (Figure 1). As expected, all the improved varieties clustered in the same genetic group (group II), confirming the close genetic relationship reported for these genotypes in previous studies (Fuentes et al., 1999, 2005). The clustering of some traditional varieties in group II suggested their close genetic relationship with improved varieties. Considering the collection date of these traditional varieties (Table 1), only two, M2 and M4, could be selections from the improved varieties surveyed in this work, i.e., Jucarito-104, released for production in 1981. The remaining traditional varieties included in Group II (Selección 30, 132, 142 and 143, Arroz Bolito, Matancero, Jorge Valladares and Caña Verde) could be selections from earlier varieties such as IR-880C9, Cica 4 or Naylamp, released by the Cuban breeding program between 1972 and 1976. These three varieties trace to the same maternal cytoplasm (Cina) as the IACuba-14 and IACuba-16 varieties (Fuentes et al., 2004). The Cica 4 variety is also the grandmother parent of the IACuba-14 variety (Fuentes et al., 2003). Future studies using a higher number of microsatellite markers will be necessary to confirm this hypothesis.
A different origin could be expected for traditional varieties from groups I and III. Group I clustered tropical japonica varieties such as Rexoro, Blue Bonnet and Nira (Lu et al., 2005), which were introduced into Cuba from the US before the beginning of the rice breeding program. Several of these traditional varieties show phenotypic traits such as good cooking quality, crystalline endosperm and white grain color, common to the Fortuna and Rexoro varieties, which are known to be contributors to grain quality for Latin American and US rice varieties (Cuevas-Pérez et al., 1992; Lu et al., 2005). The independence of the breeding programs applied in Cuba and the US suggests that the US breeding pool contained in group I may serve as a reservoir of genetic diversity for Cuban varieties. In spite of the attractive traits of the US varieties, only one variety (Century Patna) has been used as a parent of modern varieties in Cuba (Fuentes et al., 2003).
In this sense, Table 4 provides a guideline for the selection of traditional/improved parent combinations for genetic crosses on the basis of their microsatellite diversity. It may be suggested that those varieties sharing alleles at less than 25% of the microsatellite loci are adequate for genetic crosses, assuming 0.25 as the inbred limit value (Kempthorne, 1969) and that the proportion of shared microsatellite bands (Sxy) refers to bands identical by descent and not by state (Lynch, 1988). Thus, nearly 75% of the traditional varieties could be crossed with almost any improved variety. However, broadening the genetic diversity of improved varieties is not in itself an aim of a breeding program, but a consequence of the adequate use of parent genotypes selected for different aims such as introducing resistance genes into varieties or increasing the yield and quality of the crop. For this reason, the information contained in Table 4 should be used to complement other breeding criteria and agronomic traits such as grain quality, local adaptability and disease resistance, all of which are currently being evaluated in these varieties.
In summary, the microsatellite diversity analysis suggests two different origins for the traditional Cuban rice varieties studied, with most of the varieties originating from US introductions and a relatively smaller number being derived from commercial varieties improved in Cuba. Furthermore, compared to the improved varieties clustered in group II, the traditional varieties clustering in groups I and III were genetically more diverse and represent alternative genetic pools for improving Cuban rice varieties. This study demonstrates the usefulness of microsatellite markers for recommending parent genotypes for genetic crosses in rice.
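The parent-selection guideline discussed in this section, which treats pairs sharing alleles at less than 25% of the microsatellite loci as adequate for crossing, can be sketched in Python as follows. The function name, the variety labels and the Sxy values are hypothetical, and in practice such a list would only complement the agronomic criteria mentioned above.

```python
def candidate_crosses(sxy, threshold=0.25):
    """Return (traditional, improved) pairs with Sxy below the threshold,
    ordered from least to most similar."""
    below = [(pair, s) for pair, s in sxy.items() if s < threshold]
    return [pair for pair, _ in sorted(below, key=lambda item: item[1])]


# Hypothetical pairwise Sxy values between traditional (T) and improved (I)
# varieties, in the layout of a traditional x improved table.
sxy = {
    ("T1", "I1"): 0.0,
    ("T1", "I2"): 0.3,
    ("T2", "I1"): 0.1,
    ("T2", "I2"): 0.25,
}
crosses = candidate_crosses(sxy)
print(crosses)
```

The strict inequality follows the wording "less than 25%", so a pair at exactly 0.25 is excluded in this sketch.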

Figure 1 - Dendrogram of rice varieties obtained by UPGMA cluster analysis based on microsatellite data.

Table 1 - Characteristics of the 50 Cuban rice varieties held at the Cuban Rice Research Institute (CRRI) and used in the present study. Heterozygosity (H) is expressed as the percentage of loci at which the genotype was heterozygous out of the total of 10 possible loci. Sxy is the mean proportion of microsatellite loci with shared alleles. Varieties are ordered by date and province. Unless otherwise indicated, all varieties came from Cuba.

Table 2 - The microsatellite primers used in this study came from a study of polymorphic microsatellite primer pairs from Latin American rice varieties (G. Gallego, personal communication). The size (bp) of the polymerase chain reaction (PCR) product and the number of perfect repeats refer to variety IR-36. Ordered by linkage group.

Table 3 - Allele number, allele frequency and heterozygosity of the ten polymorphic microsatellite loci studied. Ordered first by total number of alleles and then by number of frequent alleles.
* Alleles occurring in more than 10% of the total varieties in the collection; parentheses indicate the number of unrepresented alleles in the improved variety collection. † Alleles occurring in 2% to 10% of the total varieties in the collection. ‡ Alleles occurring in only one variety.

Table 4 - Proportion of shared alleles (Sxy) between traditional and improved variety pairs, expressed as a percentage. Average values for each traditional and improved variety are given in the last column and row, respectively.