Determination of genetic variability of traditional varieties of Brazilian rice using microsatellite markers

The rice (Oryza sativa) breeding program of the Rice and Bean research center of the Brazilian agricultural company Empresa Brasileira de Pesquisa Agropecuária (Embrapa) is well established and releases new cultivars every year to meet the demand for improved high-yielding varieties with tolerance to biotic and abiotic stresses. However, the elite genitors used to compose new populations for selection are closely related, which has contributed to the yield plateau reached in the last 20 years. To overcome this limit, it is necessary to broaden the genetic basis of the cultivars using diverse germplasm such as wild relatives or traditional varieties. The latter are more practical because they cross more easily with elite germplasm, which accelerates the recovery of modern plant types in the breeding lines. The objective of our study was to characterize the allelic diversity of 192 traditional varieties of Brazilian rice using 12 simple sequence repeat (SSR or microsatellite) markers. The germplasm was divided into 39 groups by common name similarity. A total of 176 alleles were detected, 30 of which (from 23 accessions) were exclusive. The number of alleles per marker ranged from 6 to 22, with an average of 14.7 alleles per locus. We identified 16 accessions as a mixture of pure lines or heterozygous plants. Dendrogram analysis identified six clusters of identical accessions with different common names and just one cluster of identical accessions with the same common name, indicating that SSR markers are fundamental to determining the genetic relationship between landraces. A subset of 24 landraces, representing the 13 similarity groups plus the 11 ungrouped accessions, was the most variable set of genotypes analyzed. These accessions can be used as genitors to increase the genetic variability available to rice breeding programs.


Introduction
Rice (Oryza sativa) genetic resources are widely available worldwide, as the crop is cultivated between latitudes 55°N and 36°S in a variety of ecosystems, including irrigated, rainfed lowland, rainfed upland and flood-prone areas. Human selection and crop adaptation to diverse environmental conditions have resulted in a large number of genotypes, and it has been estimated that about 120,000 rice varieties exist worldwide (Khush, 1997). Rice production doubled between 1966 and 1990 due to the proliferation of highly productive cultivars, but the use of elite germplasm in breeding programs reduced the genetic variability available for selection and is believed to be the main factor in the leveling off of yield (Rangel et al., 1996). In addition, limited genetic diversity has led to increased disease susceptibility and an increase in insect pests. The use of rice genetic resources available at genebanks is an important strategy for incorporating genetic variability into rice breeding programs, which can potentially generate new cultivars with a broadened genetic basis and allow new and useful allelic combinations (McCouch, 2005). Crosses to broaden the genetic basis of rice can also promote the preservation of rare alleles that can be incorporated into elite germplasm. The use of adapted rice landraces as the primary source of variation into which desired characters present in modern cultivars are introgressed may be an effective strategy for producing cultivars adapted to difficult production environments (Hawtin et al., 1997).
Rice, a very important source of carbohydrate and protein for Brazilians, was probably introduced to Brazil during Portuguese colonization around 1550 (Pereira, 2002). Subsequent cultivation has generated many local Brazilian varieties (landraces) adapted to different environments, resulting in a regional fixation of distinctly favorable alleles. To preserve this genetic variability, more than 3,000 traditional Brazilian accessions have been collected during the last 30 years. These accessions represent a variety of soil and climate conditions, including low temperatures in the South Region of Brazil, saline and dry soils in the Brazilian Northeast and low-fertility acid soils in the Cerrado (Brazilian savanna) Region (Burle et al., 2001).
For the effective conservation of rice genetic resources it is important to characterize genetic variability, so that germplasm bank curators can preserve genetic diversity and breeders can use it effectively (Olufowote et al., 1997). Speed, reproducibility and the ability to detect genetic variation within and between accessions determine the utility of molecular techniques for germplasm bank management (Gilbert et al., 1999). The increasing availability of highly polymorphic genetic markers and the decreasing cost of genotyping provide powerful tools for finding the true biological relationship between individuals (Presciuttini et al., 2002). Simple sequence repeat (SSR or microsatellite) markers provide a higher level of information compared with other classes of molecular markers (Rafalski et al., 1996). In rice, more than 2,000 SSR markers are available from SSR-enriched libraries and from the sequenced rice genome (McCouch et al., 2002). This high number of markers permits the selection of the most informative and well-distributed SSR loci in the rice genome for use in molecular analysis.
In the work described in this paper we used SSR markers to infer genetic variability in rice landraces with the objective of characterizing the allelic diversity of a single set of landraces from the entire collection of Brazilian rice landraces.

Plant material
We investigated 192 Brazilian rice landrace accessions stored at the Rice and Bean Germplasm Bank at the Rice and Bean Research Center of the Brazilian agricultural company Empresa Brasileira de Pesquisa Agropecuária (Embrapa Arroz e Feijão, Santo Antônio de Goiás, GO, Brazil). The accessions were divided into 39 groups based on their common names, but 44 accessions were not included in any of these groups (Table 1).

DNA extraction and SSR analysis
For each accession, DNA was extracted from the leaves of five plants, quantified and adjusted to a final concentration of 3 ng µL⁻¹ using the protocol described by Brondani et al. (2002). Two five-plant pools were made for each of the 192 accessions. The pooled DNA was used as a template for the SSR PCR reactions.
To select suitable SSR markers, we genotyped four rice cultivars (Cica-8, BR Irga 409, Primavera and BRS Colosso) with 40 SSR markers known to have high PIC values and retained the markers showing the most desirable characteristics: highest polymorphism, good band resolution and good distribution across all twelve rice chromosomes. Genotyping was carried out as described in the next paragraph. The selected primers were: RM9, RM11, RM22 (Panaud et al., 1996); RM204, RM207, RM223, RM229, RM247, RM252 (Chen et al., 1997); RM310 (Temnykh et al., 2000); and OG61 and OG106 (Brondani et al., 2001).
The PCR reactions were conducted in a final volume of 13 µL containing 0.3 µM of each primer, 1 unit of Taq DNA polymerase, 0.2 mM of each dNTP, 1 mM TRIS-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 1.3 µL of DMSO (50% w/v) and 7.5 ng of template DNA. The PCR was performed in a PT-100 thermocycler (MJ Research) programmed for one pre-cycle at 96 °C for 5 min, followed by 30 cycles of 94 °C for 1 min, 56 °C for 1 min and 72 °C for 1 min, with a final extension at 72 °C for 7 min. Amplification was checked by horizontal electrophoresis in 3.5% (w/v) agarose gel containing TBE (0.09 M TRIS-Borate and 2 mM EDTA, pH 8.3) and 0.2 µg mL⁻¹ ethidium bromide. Allelic polymorphism was detected in 4% (w/v) denaturing polyacrylamide gels containing 7 M urea and 1x TBE buffer, and bands were visualized using silver staining (Bassam et al., 1991).
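As a quick check on the cycling program, the hold times can be summed. This is a minimal sketch that counts only the programmed hold times; ramp times between temperatures depend on the instrument and are ignored here, and the function name is hypothetical.

```python
# Hold times of the cycling program described above, in minutes.
PRE_CYCLE_MIN = 5          # one pre-cycle at 96 °C
CYCLES = 30                # number of amplification cycles
STEP_MIN = (1, 1, 1)       # 94 °C, 56 °C and 72 °C steps within each cycle
FINAL_EXTENSION_MIN = 7    # final extension at 72 °C


def total_hold_minutes(pre, cycles, steps, final):
    """Sum of programmed hold times; ramping between steps is not included."""
    return pre + cycles * sum(steps) + final


total_min = total_hold_minutes(PRE_CYCLE_MIN, CYCLES, STEP_MIN, FINAL_EXTENSION_MIN)
print(total_min)  # 102
```

The run therefore takes at least 102 minutes of hold time per plate.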

Statistical analysis
The number of alleles per locus and polymorphism information content (PIC) were estimated using the Genetic Data Analysis program (Lewis and Zaykin, 2000). The estimates of probability of identity (PI) were obtained using the Identity program (Sefc et al., 1997). The dendrogram was constructed from the genetic distance matrix obtained by Rogers distance as modified by Wright (1978), hereafter named Rogers-W distance coefficient. The accessions were grouped by cluster analysis using the unweighted pair group method with averages (UPGMA) (Weir, 1990) as implemented in the Numerical Taxonomy and Multivariate Analysis System (NTSYS) program (Rohlf, 1989). Tocher's optimization method was performed using the Genes program (Cruz, 1997).
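The distance and clustering steps can be illustrated with a small sketch. It assumes the usual form of the modified Rogers distance, d = sqrt((1 / 2m) Σ_loci Σ_alleles (p − q)²) for m loci, and a plain UPGMA in which the distance between two clusters is the mean of all pairwise distances between their members. It is not the NTSYS implementation; the function names, accession labels and allele sizes are hypothetical.

```python
from itertools import combinations
from math import sqrt


def rogers_w_distance(freqs_a, freqs_b):
    """Modified Rogers distance between two accessions.

    Each argument is a list with one dict per locus (allele -> frequency).
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both accessions must be scored at the same loci")
    total = 0.0
    for locus_a, locus_b in zip(freqs_a, freqs_b):
        alleles = set(locus_a) | set(locus_b)
        total += sum((locus_a.get(al, 0.0) - locus_b.get(al, 0.0)) ** 2
                     for al in alleles)
    return sqrt(total / (2 * len(freqs_a)))


def upgma(labels, dist):
    """Return the merge history as (cluster_1, cluster_2, height) tuples.

    dist maps frozenset({a, b}) to the distance between labels a and b.
    """
    def average(c1, c2):
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters; ties are broken by position.
        height, i, j = min((average(clusters[i], clusters[j]), i, j)
                           for i, j in combinations(range(len(clusters)), 2))
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Invented two-locus allele frequencies for four hypothetical accessions.
accessions = {
    "A": [{150: 1.0}, {200: 1.0}],
    "B": [{150: 1.0}, {204: 1.0}],
    "C": [{154: 0.5, 158: 0.5}, {204: 1.0}],
    "D": [{150: 1.0}, {200: 1.0}],  # same profile as A
}
dist = {frozenset((a, b)): rogers_w_distance(accessions[a], accessions[b])
        for a, b in combinations(sorted(accessions), 2)}
merges = upgma(sorted(accessions), dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

In this toy data set the identical accessions A and D join first at height 0, which is how clusters of identical accessions appear at the base of a dendrogram.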

Results and Discussion
Traditional rice varieties, or landraces, have a high level of genetic heterogeneity compared to modern cultivars. This genetic variability is very important for the sustainability of small farmers because, despite their low yield capacity, these varieties present high yield stability (Oka, 1991). Landraces are adapted to local, small-scale, low-input environments where the plant ideotype may differ considerably from that developed for modern agricultural systems (Veteläinen et al., 1997). The evaluation of the genetic variability of landrace accessions can provide the basic information necessary to help genebanks multiply and properly conserve these genetic resources. It will also help breeding programs to plan crosses that incorporate this variability into the genetic background of elite rice lines, which in turn will generate new rice cultivars.
In our analysis of SSR variation we detected 176 alleles in the 192 landrace accessions investigated, the number of alleles per marker ranging from 6 (RM22) to 22 (OG61 and OG106), with an average of 14.7 alleles per locus. The PIC varied from 0.33 (RM252) to 0.94 (OG106), with an average of 0.73 (Table 2). The number of alleles and the PIC are higher than those reported in previous works (Xu et al., 2004; Yu et al., 2003; Ni et al., 2002; Coburn et al., 2002; Panaud et al., 1996). The higher number of alleles and PIC detected in the present work indicate that the Brazilian rice landraces are a good source of genetic variability to be explored in crosses with elite rice germplasm. We also detected 30 private or exclusive alleles in 23 of the accessions (Table 2). The molecular data were generally correlated with variation at the agro-morphological level in the crop plants and therefore provide good guidance on the distribution of useful variation as well as on the existence of co-adapted gene complexes (Hawtin et al., 1997). Although the SSR markers used in our study are considered neutral (i.e., not functionally related to any trait), the genomic fingerprinting profile of each plant does not vary from one environment to another, reflecting the capacity of an individual plant or population to adapt (Virk et al., 1996). This direct association between the fingerprint of an accession and the phenotypic response to a target environment is caused by linkage disequilibrium, which in rice is mainly due to autogamous reproduction (Ford-Lloyd et al., 1997). Another important aspect to consider is the differentiation of characters that are highly subject to natural selection, such as tolerance to biotic and abiotic stresses occurring during the cultivation of rice landraces. In addition, neutral markers can be used to establish the evolutionary past of varietal groups and to account for pre-selection of the germplasm to be used in breeding programs (Glaszmann et al., 1996).
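To make the PIC values concrete, the sketch below uses the common definition PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². That the Genetic Data Analysis program applies exactly this formula is an assumption, and the allele frequencies are invented. The last lines recompute the mean number of alleles per locus from the totals reported above.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content of one locus from its allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Invented frequency profiles: one dominant allele versus many balanced alleles.
skewed_locus = [0.8, 0.1, 0.1]
balanced_locus = [0.1] * 10

print(round(pic(skewed_locus), 3))
print(round(pic(balanced_locus), 3))

# Mean number of alleles per locus from the reported totals (176 alleles, 12 markers).
mean_alleles = 176 / 12
print(round(mean_alleles, 1))  # 14.7
```

A locus dominated by one allele scores low, while a locus with many alleles at similar frequencies approaches 1, which matches the contrast between the least and most informative markers.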
The use of DNA pools from five plants was very effective for evaluating the allelic diversity of the rice accessions, and the analysis of two pools per accession detected 16 traditional varieties that were heterogeneous for at least two SSR markers (Table 3). A second round of SSR analysis was then performed to characterize individually the five plants present in one or both pools of each of the 16 accessions. The pool heterogeneity was more frequently related to the occurrence of plants homozygous for different alleles than to heterozygous plants (Table 3). The accessions Agulha, Japonês, Prata Roxa and Venez Branco showed the highest number of heterogeneous pools (Table 3). Pooling plants is a very important strategy for speeding up the genotyping of landraces, which, due to their higher genetic variability, require the analysis of more individual plants per accession to produce a realistic evaluation of the variability within the germplasm of a particular accession.
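The two-step screening described above can be sketched as follows. The data layout, function names, labels and allele sizes are hypothetical, and the rule that a pooled locus counts as heterogeneous when more than one allele appears across the pools is an assumption made for illustration.

```python
def heterogeneous_markers(pools):
    """Markers showing more than one allele across the DNA pools of an accession.

    pools maps marker name -> list of allele sets, one set per pool.
    """
    return sorted(marker for marker, allele_sets in pools.items()
                  if len(set().union(*allele_sets)) > 1)


def classify_plants(plants):
    """Interpret the single-plant genotypes (allele_1, allele_2) at one marker."""
    if any(a != b for a, b in plants):
        return "heterozygous plants present"
    if len({a for a, _ in plants}) > 1:
        return "mixture of homozygous lines"
    return "homogeneous"


# Invented pool results for one accession (allele sizes in bp).
pool_calls = {
    "RM9": [{150}, {150, 154}],
    "RM11": [{120}, {120}],
    "OG61": [{98, 102}, {98}],
}
flagged = heterogeneous_markers(pool_calls)
print(flagged)                      # ['OG61', 'RM9']
print(len(flagged) >= 2)            # heterogeneous for at least two markers

# Second round: individual plants from a heterogeneous pool at one marker.
print(classify_plants([(150, 150), (154, 154), (150, 150)]))
print(classify_plants([(150, 154), (150, 150)]))
```

The second function separates the two situations reported in Table 3: plants homozygous for different alleles versus plants that are themselves heterozygous.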
Six clusters were composed of identical accessions: 1) Cacho Grande (155), Dez Anos (162), Carolina (66) and Chatão (77); 2) Bico Roxo (45), Bico Torto (46) and Branco (47); 3) Guapa (167) and IAC 1246 (101); 4) Mineiro (172) and Montanha (124); 5) Paulista (179), Pindorama (181) and Pratão 4 Meses (136); and 6) Guaíra (97) and Guaíra Amarelo (98) (data not shown). The entire set of SSR markers used in this work produced a combined probability of identity (PI) of 6.2 × 10⁻¹² (Table 2), which is the probability of having two individuals with the same genotype in a group of accessions. The fact that the PI value is very low indicates that the accessions of the clusters listed above are identical despite their different common names, it being already known that the two Guaíra accessions (97 and 98) are identical. The most common case observed in our study was accessions with the same name that were not genetically identical. This probably occurred for two reasons: a) when a farmer gives the name of the landrace to the germplasm collector, the farmer is influenced by the name used regionally for the accession, so although names can vary from county to county the germplasm can be the same; or b) events operating independently or simultaneously, e.g. seed mixture, cross pollination, selection and genetic drift caused by reduced sampling of the seeds to be used in the next rice crop season. In this last case, occurring in specific environments, an original landrace or old variety released to different farmers would, after years of successive cultivation, generate populations with a genetic constitution different from the original genotype. This adaptation, i.e. the movement of a population towards a phenotype that best fits the population within a specific environment, may select the best gene or gene combinations, which can be explored by breeding programs.
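The way per-locus values combine into a very small overall PI can be sketched as below. The per-locus formula used here, PI = Σ p_i⁴ + Σ_{i<j} (2 p_i p_j)², is the standard one for unrelated individuals; that the Identity program computes PI this way is an assumption, and the frequencies are invented. Multiplying across loci assumes the loci are independent.

```python
from itertools import combinations
from math import prod


def locus_pi(freqs):
    """Probability that two unrelated individuals share a genotype at one locus."""
    same_homozygote = sum(p ** 4 for p in freqs)
    same_heterozygote = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return same_homozygote + same_heterozygote


def combined_pi(loci):
    """Product of per-locus PI values, assuming independent loci."""
    return prod(locus_pi(freqs) for freqs in loci)


# Invented example: 12 loci, each with 10 equally frequent alleles.
example_loci = [[0.1] * 10 for _ in range(12)]
print(locus_pi(example_loci[0]))
print(combined_pi(example_loci))
```

Because the per-locus values multiply, a panel of highly polymorphic markers quickly drives the combined PI toward zero, which is why matching profiles across all 12 markers are strong evidence that two accessions are the same genotype.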
We used Tocher's optimization method to identify the most variable accessions present in the entire set of 192 genotypes, to appreciate the extent of genetic variability and to select a starting point (i.e. an accession, or accessions) within the collection for a breeding program. This test was conducted using the SSR marker genetic distance values of all pairwise combinations of the 192 genotypes. Tocher's method yielded a final set of 14 accessions, with an average Rogers-W distance coefficient of 0.76, similar to the average Rogers-W distance coefficient of 0.77 obtained when the 192 genotypes were analyzed (Figure 1). However, we observed that of the 14 accessions, eight (3 Meses, De Morro, Douradão, Anão, Chatão, Branco, 90 Dias and Amarelo) had previously been classified in similarity Group A, two (Buriti Vermelho and Brejeiro) in Group E, one (Ligeiro) in Group F, one (Mucuim) in Group B, one (Cateto Amarelo) in Group H and one (Japonês) had not been included in any similarity group (Figure 1). When we analyzed a set of genotypes composed of one representative of each of 13 similarity groups plus the 11 ungrouped accessions, we obtained an average Rogers-W distance coefficient of 0.90, higher than those of the entire set of 192 accessions and of the 14 accessions from Tocher's method (Figure 2). This result clearly shows that an efficient way to select genetically distinct genotypes in a large collection of germplasm is to first analyze all accessions and then choose representative accessions from each cluster. In order to decide which accession should be chosen from each group, agronomic performance traits should be effective in selecting genotypes that will produce potentially even better results in breeding programs. All sets of accessions were well distributed spatially according to principal component analysis (Figure 3). In terms of genetic variability based on SSR marker genotyping, we recommend the 24 accessions (the set of accessions representative of each similarity group plus the unclassified accessions) for use as genitors in crosses with the elite rice genotypes to increase the genetic variability available for selection, and also to increase the possibility of detecting transgressive segregation. However, for such variability to result in effective genetic gain, further experiments, such as diallel crosses, should be carried out to determine the combining ability of the traditional varieties and to prove that the genetic variability found with SSR markers is effectively related to good field performance of lines derived from elite x traditional variety crosses.
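The comparison of subsets by their average pairwise distance can be illustrated with a toy example. The accession labels and distances are invented, and the sketch shows only the averaging step, not Tocher's optimization itself.

```python
from itertools import combinations


def mean_pairwise_distance(names, dist):
    """Average distance over all pairs of accessions in a subset."""
    pairs = list(combinations(names, 2))
    if not pairs:
        raise ValueError("at least two accessions are needed")
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


# Hypothetical collection: two similarity groups with two accessions each.
groups = {"group_1": ["A1", "A2"], "group_2": ["B1", "B2"]}
dist = {
    frozenset(("A1", "A2")): 0.1,   # within group_1
    frozenset(("B1", "B2")): 0.2,   # within group_2
    frozenset(("A1", "B1")): 0.9,   # between groups
    frozenset(("A1", "B2")): 0.9,
    frozenset(("A2", "B1")): 0.9,
    frozenset(("A2", "B2")): 0.9,
}

all_accessions = [name for members in groups.values() for name in members]
representatives = [members[0] for members in groups.values()]

full_mean = mean_pairwise_distance(all_accessions, dist)
rep_mean = mean_pairwise_distance(representatives, dist)
print(round(full_mean, 2), round(rep_mean, 2))  # 0.65 0.9
```

Taking one accession per group removes the short within-group distances from the average, which is consistent with the higher mean distance observed for the set of 24 representatives than for the full collection.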

One of the implications of SSR analysis for landrace conservation is that the genotyping of landraces can detect differences that cannot be detected by the traditional methods of morphological characterization used routinely in genebanks. At Embrapa Rice and Bean, 12 morphological descriptors, which are traits with high heritability such as pilosity of the leaves and presence of awns, are used for rice (Fonseca et al., 1981). Another relevant aspect of genotyping using molecular markers, particularly SSRs, is the ability of this methodology not only to analyze many accessions simultaneously but also to investigate individual plants of a specific accession. Knowledge of within-accession variability is important for conservation purposes, because it makes it possible to identify the most genetically variable accessions, which demand the additional effort of sampling a higher quantity of seeds in order to preserve this genetic variability and prevent genetic drift during routine periodic germplasm multiplication. The analysis of individual plants of an accession is also relevant for breeding purposes, since homozygous plants can be selected and used as genitors in crosses with elite rice genotypes. The ex situ collection of the Embrapa Genebank is preserving for the future the genetic variability of many landraces that are no longer cultivated by farmers. At the beginning of the 1970s the estimated number of rice varieties cultivated by Brazilian farmers was around 3,000 (Fonseca et al., 1982). There are no estimates of the number of varieties cultivated today, but there is a consensus that it is much lower, following the falling trend in the number of small farmers who traditionally cultivate landraces. Due to the size of Brazil, there are regions that still need to be included in collection expeditions in order to increase the Brazilian sampling coverage of such valuable germplasm.

Table 1 - Brazilian rice landraces used in the simple sequence repeat (SSR) analysis. The landraces were grouped according to the similarity of their common names. The state where the landraces were collected is shown in the third column and the cluster code in the dendrogram in the fourth column.

Figure 1 -Dendrogram based on Rogers-W distance coefficient of a subset of 14 rice landraces.

Figure 2 -Dendrogram based on Rogers-W distance coefficient of a subset of 24 rice landraces.

Table 2 - Chromosomal location (CL), number of alleles, polymorphism information content (PIC), probability of identity (PI) and private alleles detected with 12 simple sequence repeat (SSR) markers used for genotyping 192 rice landrace accessions.
*Parentheses contain, in order, allele frequency and band size in base pairs (bp).

Table 3 - Rice landrace accessions that showed heterogeneous pooled DNA detected by simple sequence repeat (SSR) markers.