Characterization of Brazilian soybean cultivars using microsatellite markers

Abstract

Microsatellite markers, or simple sequence repeats (SSR), have proved to be an excellent tool for cultivar identification, pedigree analysis and the evaluation of genetic distance among organisms. Soybean cultivars have been characterized mainly by morphological and biochemical traits. However, these traits have not been sufficient to characterize the large number of cultivars eligible to receive protection under the Brazilian Cultivar Protection Act. In order to define new soybean cultivar markers, the alleles of twelve SSR loci of 186 Brazilian soybean cultivars were studied by estimating the variation in their size range and their respective frequencies. On average, 5.3 alleles per locus were detected, with a mean genetic diversity of 0.64 ± 0.12. These loci were used to distinguish morphologically similar groups, which presented a mean similarity coefficient of 0.46; their use allowed 184 profiles to be determined for the 186 cultivars. A dendrogram based on the SSR loci profiles showed good agreement with the cultivar pedigree information.

Key words: Glycine max (L.) Merrill, simple sequence repeat, microsatellites, molecular markers, soybean elite cultivars.


Regina Helena Geribello Priolli1, Celso Teixeira Mendes-Junior1, Neylson Eustáquio Arantes2 and Eucleia Primo Betioli Contel1

1Departamento de Genética, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, SP, Brazil.

2Empresa Brasileira de Pesquisa Agropecuária, Centro Nacional de Pesquisa de Soja, Uberaba, MG, Brazil.

Send correspondence to E.P.B. Contel. Departamento de Genética, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Av. Bandeirantes, 3900, 14049-900, Ribeirão Preto, SP, Brazil. E-mail: epbconte@rge.fmrp.usp.br.

Received: April 3, 2002; accepted: June 17, 2002.

INTRODUCTION

Soybean, Glycine max (L.) Merrill, a legume native to China, is currently one of the most important crops worldwide. Brazil is the second largest producer, with a cultivated area of 13.68 million hectares and 37.8 million tons harvested in 2000/2001 (http://www.conab.gov.br). The importance of soybean in Brazilian agriculture is due partly to suitable climate and soil management, but particularly to the large number of improved cultivars. For the 2000/2001 harvest alone, the Brazilian Agricultural Research Corporation (Embrapa) listed about 259 soybean cultivars adapted to the most diverse producing regions in Central Brazil (Embrapa, 2000). This figure has increased year after year, as new, more productive and pathogen-resistant cultivars are released for both consolidated and expanding cropping areas.

Along with the development of new cultivars, there has been growing interest in their genetic characterization for the commercial protection provided by the Brazilian Cultivar Protection Act (1997). Among the requirements for protection, the Act states that a cultivar has to be reliably distinct, homogeneous and stable.

Plant breeders have traditionally used morphological and biochemical traits to register and protect their varieties. Although these traits remain predominant and important, they present limitations, particularly in closely related cultivars. In plants with a narrow genetic base, such as soybean, they may not be sufficient, given the large number of cultivars eligible for protection. In such cases, molecular descriptors can provide additional information about the characterization, degree of diversity and genetic constitution of the existing germplasm.

Microsatellites, or SSR, are sequences of a few adjacent, repeated base pairs that are well distributed over the eukaryote genome (Powell et al., 1996). Variations in the number of repeats can be detected by polymerase chain reaction (PCR), using primers (20 to 30 base pairs) designed specifically for amplification and complementary to the unique sequences flanking the microsatellite. These markers have been used for genotypic identification of many plant species, such as soybean (Cregan et al., 1994; Diwan and Cregan, 1997; Rongwen et al., 1995; Maughan et al., 1995; Song et al., 1999), grape (Vitis vinifera L.) (Thomas and Scott, 1993), rapeseed (Brassica napus L.) (Kresovich et al., 1995), apple (Malus x domestica Borkh.) (Hokanson et al., 1998), and many others.

A high level of polymorphism in the SSR loci has been reported for soybean. Akkaya et al. (1992) detected an average of seven alleles at each of three microsatellite loci studied in a group of 43 soybean genotypes. Morgante and Olivieri (1994) detected similar levels of allelic diversity in seven SSR loci in a group of 61 genotypes. Rongwen et al. (1995) reported 11 to 26 alleles at seven loci in a group of 96 soybean cultivars and plant introductions (PIs). Maughan et al. (1995) detected 79 alleles across five SSR loci in a sample of 94 soybean accessions of G. max and G. soja genotypes. Using 12 microsatellite primers, Doldi et al. (1997) found two to six alleles per locus in a group of 18 soybean cultivars. Narvel et al. (2000), using 72 microsatellite loci, detected a total of 397 alleles in 79 elite soybean cultivars and PIs.

Using 20 SSR markers, Diwan and Cregan (1997) were able to distinguish the 35 soybean genotypes that accounted for about 95% of the alleles present in North-American soybean. They detected an average of 10.1 alleles per locus, and concluded that the stuttering associated with dinucleotide loci made it harder to define the main peak used to establish allele size; they therefore suggested the use of trinucleotide loci for cultivar identification. Song et al. (1999) selected a group of 13 trinucleotide SSR loci to characterize morphologically similar cultivars, and standardized the identification of North-American soybean cultivars with this group of loci.

The objective of the present study was to determine the number of alleles and the gene diversity of trinucleotide loci in a group of soybean cultivars suitable for cultivation in Brazil, and to select a set of loci that gives a distinct profile for each cultivar.

MATERIAL AND METHODS

Soybean plant material and DNA isolation

A group of 186 soybean elite cultivars, developed and released by Brazilian public and private institutions, was selected to represent the complete range of cultivars grown in Brazil. Seeds of each of the 186 cultivars were obtained from the Embrapa-Soybean Germplasm Collection. The cultivars are listed in Table I.

Thirty to fifty plants of each soybean cultivar were grown in a greenhouse for DNA isolation. The equivalent of 30 leaf tissue samples was collected from each cultivar, frozen in liquid nitrogen and lyophilized for 1-2 days. DNA was isolated from the bulked lyophilized leaf tissue of the plants of each cultivar by a mini-prep procedure based on Doyle and Doyle (1990). DNA quality and concentration were evaluated by electrophoresis in 0.8% agarose gel stained with ethidium bromide (EtBr).

Morphological and genealogical traits of the cultivars

The pedigrees and some morphological traits of the soybean cultivars were compiled from the literature and from information provided by Embrapa-Soybean and private breeders. The 186 cultivars were divided into 15 groups, designated by Roman numerals, according to similarities in hypocotyl color (green or purple), flower color (white or purple), pubescence color (gray or brown) and hilum color (buff, brown, yellow, black and imperfect black), as shown in Table I.

SSR loci

Twelve pairs of soybean primers flanking microsatellite regions, previously developed and published by Cregan et al. (1999), were selected. They were synthesized by Bio Synthesis Inc., Texas, USA, and coded as Satt 002, Satt 005, Satt 009, Satt 102, Satt 173, Satt 263, Satt 307, Satt 308, Satt 309, Satt 335, Satt 406, and Sct_189. The sequences of the forward and reverse primers are available from the USDA-ARS Soybean Genome Database (http://129.186.26.94/SSR.html). The loci represent 12 of the 20 soybean linkage groups and were chosen because they had shown polymorphism in previous studies and/or because of their trinucleotide nature.

PCR amplification of SSR loci

PCR amplification was performed on each of the 186 soybean genotypes, using primers for each SSR locus. Reaction mixtures contained 30 ng of soybean genomic DNA, 0.2 µM 3’ and 5’ end primers, 200 µM of each nucleotide, 1X PCR buffer containing 50 mM KCl, 10 mM Tris-HCl pH 8.9, 2.0 mM MgCl2, and 1 unit of Taq DNA polymerase, in a total volume of 25 µL. For primers Satt 002, Satt 005 and Satt 009, the MgCl2 concentration was changed to 2.5 mM for better amplification. A thermal cycler (PCR Machine Robocycler, Stratagene) was programmed for 2 min at 94 °C, followed by 32 cycles of 1 min at 94 °C, 1 min at 47 °C and 1 min at 72 °C, and a final cycle of 10 min at 72 °C.
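
As a quick check on the cycling program described above, the following Python sketch adds up the programmed hold times. It is only an illustration: the constant and function names are hypothetical, and the time the cycler needs to change temperature or move tubes between blocks is ignored, so the real run would take longer.

```python
# Cycling program as described in the text: hold times in minutes.
INITIAL_DENATURATION_MIN = 2   # 2 min at 94 C
CYCLE_STEPS_MIN = [
    ("denaturation, 94 C", 1),
    ("annealing, 47 C", 1),
    ("extension, 72 C", 1),
]
N_CYCLES = 32
FINAL_EXTENSION_MIN = 10       # 10 min at 72 C


def nominal_run_minutes():
    """Sum of programmed hold times; ramp and transfer times are ignored."""
    per_cycle = sum(minutes for _, minutes in CYCLE_STEPS_MIN)
    return INITIAL_DENATURATION_MIN + N_CYCLES * per_cycle + FINAL_EXTENSION_MIN


print(nominal_run_minutes())  # 2 + 32 * 3 + 10 = 108
```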

Amplification products were separated in denaturing gels containing 10% polyacrylamide, 8 M urea and 1X TBE, for approximately 4 h at 15 mA. The size of each band was estimated by comparison with a 25-bp DNA Ladder (Life Technologies-Gibco BRL). Amplified SSR fragments of different sizes were considered to be different alleles. The fragments were detected by silver staining, following the protocol of Sanguineti et al. (1994).

Statistical analysis

The gene diversity (Weir, 1990) was calculated as $1 - \sum_j P_{ij}^2$, where $P_{ij}$ is the frequency of the jth allele at the ith locus, summed across all alleles at the locus.
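
The gene diversity formula can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study; the function name and the allele frequencies in the usage line are hypothetical and are not taken from Table II.

```python
def gene_diversity(allele_freqs):
    """Gene diversity at one locus: 1 minus the sum of squared allele frequencies."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical locus with four alleles (illustrative frequencies only).
example_gd = gene_diversity([0.40, 0.30, 0.20, 0.10])
print(round(example_gd, 2))
```

A locus with a single fixed allele gives a value of 0, and the value rises as alleles become more numerous and more evenly distributed.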

A genetic dissimilarity coefficient was calculated for each pair of cultivars, according to Diwan and Cregan (1997), to determine the effectiveness of the group of twelve SSR loci in distinguishing each of the 186 cultivars. These authors state that in elite soybean cultivars (which are often derived from identical plants), 12.5% and 6.25% of heterozygous loci remain in the F4 and F5 generations, respectively, and that such heterozygosity may be expressed as a mixture of two different homozygotes in later generations. Therefore, they suggest that segregating bulks should be taken into account in the identification of soybean cultivars, and they describe a computer program that compares each pair of loci and assigns them either similarity or dissimilarity values.

In order to obtain a dendrogram with significance values, the bootstrap procedure was applied to the original data set to construct 100 resampled data sets, each obtained by sampling the 12 loci with replacement, as suggested by Felsenstein (1985). For each data set, Microsoft Excel, Version 5.0, was used to build a spreadsheet in which each locus of two cultivars scored 1.0 if they shared the same alleles, that is, if both alleles had the same size; 0.5 if only one of the alleles was the same; and 0 if they had no alleles in common. These values were used to calculate a simple genetic dissimilarity coefficient (1 - Score/12) between each pair of cultivars. The 100 matrices of genetic dissimilarity coefficients were used to construct a consensus UPGMA (Unweighted Pair-Group Method using Arithmetic Average) dendrogram, using the NEIGHBOR and CONSENSE programs contained in the PHYLIP package (Phylogeny Inference Package), Version 3.57c (Felsenstein, 1989).

The capacity of the markers to distinguish between morphologically similar groups was also determined by calculating the genetic similarity coefficients (1 - genetic dissimilarity coefficient) of each pair of cultivars in 14 out of the 15 groups shown in Table I, since one of the identified groups consisted of only one cultivar.
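
The locus scoring rule and the bootstrap over loci described above can be sketched in Python as follows. This is an illustrative reading of the procedure, not the original spreadsheet or the program of Diwan and Cregan (1997): the function names, the three-locus profiles and the allele sizes are all hypothetical, and the way partially matching genotypes are counted is an assumption.

```python
import random


def locus_score(genotype_a, genotype_b):
    """Score one locus: 1.0 if both alleles match, 0.5 if one matches, 0.0 if none.

    Each genotype is a pair of allele sizes in bp, e.g. (230, 230).
    """
    remaining = list(genotype_b)
    shared = 0
    for allele in genotype_a:
        if allele in remaining:
            remaining.remove(allele)  # each allele can be matched only once
            shared += 1
    return shared / 2.0


def dissimilarity(profile_a, profile_b, loci):
    """Simple genetic dissimilarity: 1 - (sum of locus scores / number of loci)."""
    score = sum(locus_score(profile_a[locus], profile_b[locus]) for locus in loci)
    return 1.0 - score / len(loci)


def bootstrap_loci(loci, n_replicates=100, seed=0):
    """Resample the loci with replacement; returns one list of loci per replicate."""
    rng = random.Random(seed)
    return [[rng.choice(loci) for _ in loci] for _ in range(n_replicates)]


# Hypothetical three-locus profiles (locus names and allele sizes are made up).
cultivar_x = {"L1": (230, 230), "L2": (150, 153), "L3": (201, 201)}
cultivar_y = {"L1": (230, 230), "L2": (150, 156), "L3": (204, 204)}
loci = ["L1", "L2", "L3"]

print(dissimilarity(cultivar_x, cultivar_y, loci))

# One dissimilarity value per bootstrap replicate for this pair of cultivars.
replicates = bootstrap_loci(loci)
pair_values = [dissimilarity(cultivar_x, cultivar_y, rep) for rep in replicates]
```

In the study the same calculation would run over 12 loci and all pairs of cultivars, giving one dissimilarity matrix per bootstrap replicate as input for the clustering step.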

RESULTS

SSR polymorphism in 186 soybean cultivars

All 12 SSR loci were polymorphic, as shown in Table II. The number of alleles per locus varied from four to eight, with an average of 5.3 alleles per locus, distributed among the 186 cultivars. Seventy-five percent of the 64 detected alleles had frequencies lower than 0.25, and the remaining 25% had frequencies equal to or higher than 0.25. Only one allele, in Satt102, showed a frequency higher than 0.75, and two alleles had frequencies lower than 0.01, one in locus Satt005 and the other in locus Satt002. These values indicate that the alleles were well distributed and well represented in the studied sample. The genetic diversity (GD), which is indicative of how informative the SSR loci are, was also relatively high, ranging from 0.41 to 0.82, with a mean value of 0.64 ± 0.12.

The 12 SSR loci provided 184 profiles for the 186 studied cultivars. The four cultivars that could not be distinguished formed two pairs: Embrapa 1 (IAS 5 RC) and RS 9 (Itaúba), and FT 103 and FT 104. Both Embrapa 1 (IAS 5 RC) and RS 9 (Itaúba) derive from IAS 5: the first resulted from backcrossing to IAS 5 over five generations, and the second from a cross between FT 2 and IAS 5. Although their origin is unknown, the other two cultivars, FT 103 and FT 104, were developed by the same institution by crossing many progenitors (bulk), which does not exclude the possibility that they have similar origins. It should also be noted that the alleles making up the profiles of these cultivars at the 12 loci were precisely the most frequent ones, although the probability of finding identical individuals at random in this sample was practically null.

The genetic dissimilarity coefficients found in the cultivar comparison matrix were relatively high. The distribution of the 17,205 pairwise comparisons (Figure 1) included the extreme values: zero, indicating indistinguishable cultivars, and 1, indicating completely different cultivars. However, most of the values lay between 0.4 and 0.9, indicating that the cultivars were, on the whole, more dissimilar than similar.
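
The number of pairwise comparisons follows directly from the sample size, since each unordered pair of the 186 cultivars is compared once:

```latex
\[
\binom{186}{2} = \frac{186 \times 185}{2} = 17{,}205
\]
```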


The 12 SSR loci were also successful in distinguishing cultivars with identical morphological traits (Table III). The mean similarity value among cultivars belonging to the same group was 0.46. There were totally different cultivars in the same group (coefficient 0.0), as in groups seven and nine, but the average for the minimum similarity values of all groups was 0.25. Although completely similar cultivars (coefficient 1.0) were present in groups 1 and 3, as mentioned above, the mean of the maximum similarity values was 0.81. Therefore, out of the 24 possible comparisons between two morphologically similar cultivars (12 loci x 2 possible alleles), 11 were observed to be identical, on average.

Germplasm

The consensus tree relating the 186 cultivars based on the twelve SSR loci (Figure 2) separates groups with maximum and minimum similarities. The results were also highly consistent with the ancestry of the groups, and identified groups with some degree of relatedness. For instance, cultivars FT Eureka, Ocepar 8, BRSMG Virtuosa, and almost all the other cultivars in the group named Paraná cluster together with Paraná itself; all of them descend either from Paraná or from a selection of it. For the same reason, IAC 8, IAC 8-2, BR IAC 21, IAC 17, IAC 18, CAC 1, and CS 303 are in the same group as their ancestor Bragg. Similarly, other groups contain small sets of cultivars, all of them related to the same common ancestor, as shown in Figure 2.


Many of the ancestral genotypes mentioned are themselves related to some degree. Dourados, for instance, is a selection of Andrews, which, in turn, is a selection of Santa Rosa. Bragg and FT Cristalina share a common parent, D492491; Paraná and IAS 5 also share a common parent, Hill. All of them are ancestors of other groups.

There was only about 10% discrepancy between the dendrogram and the recorded pedigrees, such as the inclusion of MG/BR 46 Conquista in the group containing Santa Rosa, and of BRSMS Piracanjuba, FT Estrela and RB 603 in the IAS 5 group, as well as all the X-marked cultivars in Figure 2. This incongruity, along with the lack of a common parent in some clusters, may mean either that there is no relationship with the indicated ancestral genotype or that precise pedigree data are lacking.

Apart from relationships arising from pedigree or from one variety being selected from another, the analysis did not show any correlation with growth habit, morphological similarity or geographical origin among the groups.

The dendrogram also revealed which North-American varieties contributed most to the formation of this group of Brazilian cultivars. Santa Rosa, D492491 (sister line of Lee), Hill, Davis and Hood were most frequently used as parents, since they were directly or indirectly identified in most of the clusters.

DISCUSSION

The polymorphism of SSR loci detected in this study was consistent with previous studies by Akkaya et al. (1992), Morgante and Olivieri (1994), Maughan et al. (1995), Doldi et al. (1997) and Narvel et al. (2000), but lower than that obtained by Rongwen et al. (1995) and Diwan and Cregan (1997). One possible reason for this difference is that the materials used in the present study were all from breeding programs, and thus had a relatively narrow genetic base. In a study on genetic diversity in soybean, 11 to 26 alleles per microsatellite primer pair were amplified from 96 soybean genotypes, but this number was reduced by five to 10 alleles per primer pair in 26 cultivars from North-American breeding programs (Rongwen et al., 1995).

The gene diversity (GD) obtained was in agreement with the data of Rongwen et al. (1995), who found a mean value of 0.74 in a group of 96 soybean genotypes. It is also in line with the results of Diwan and Cregan (1997), who found mean GD values close to 0.69 in a group of 36 commercial soybean lines, and with the data of Narvel et al. (2000), who detected a mean value of 0.50 ± 0.02 in a group of 39 elite cultivars.

The presence of low-frequency alleles at some SSR loci, as observed in Satt002 and Satt005, may reflect the soybean microsatellite mutation rate, estimated at $10^{-5}$ to $10^{-4}$ per generation (Diwan and Cregan, 1997). These authors argued that such a rate is similar to the human rate, and that it should not be a hindrance to the use of SSR for cultivar identification. They also stated that soybean cultivars should be described for identification based on a bulk of 30 to 50 plants, since rare mutant alleles would then go undetected and, therefore, mutations in isolated plants would not alter the allelic constitution recorded for the cultivar. However, Song et al. (1999), using this procedure, detected 10 new alleles in 66 soybean cultivars that were not present in the 35 ancestral lines; and Narvel et al. (2000) recorded 32 alleles specific to elite cultivars, within a total of 397 alleles detected in 40 lines and 39 soybean cultivars.

The genetic dissimilarity coefficient derived from the 12 studied loci had a mean value of 0.63, which means that, on average, two genotypes differed in 15 of the 24 alleles compared. Table III shows that, even in groups that are similar for certain morphological traits, the mean similarity value was 0.46, or 11 alleles in common. These results favor the use of these loci for distinguishing the assayed cultivars.
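
The allele counts quoted here follow from the 24 allele comparisons available per pair of cultivars (12 loci x 2 alleles), as a short worked calculation shows:

```latex
\[
0.63 \times 24 \approx 15.1 \ \text{alleles differing},
\qquad
0.46 \times 24 \approx 11.0 \ \text{alleles in common}
\]
```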

The existence of non-distinguished cultivars in the sample may reflect the narrow genetic base of the Brazilian soybean germplasm. Hiromoto and Vello (1986) reported that all the cultivars recommended at that time derived from only 26 ancestral genotypes, nine of which were responsible for more than 80% of that gene pool, and only four of which were responsible for 50% of it. This picture did not change much in the following years, since Abdelnoor et al. (1995) did not find much variation (14.2 to 20.5%) in the genetic distances among 38 Brazilian soybean cultivars, as estimated by RAPD molecular markers.

The data obtained suggest that this group of 12 microsatellite loci can be used to distinguish Brazilian soybean cultivars from each other, inasmuch as 98.9% of the assayed cultivars could be identified. Furthermore, cultivars that were identical for certain morphological traits could be distinguished by the same SSR loci in 12 out of the 14 established groups. Despite the existence of four non-distinguished cultivars, which, as mentioned above, were closely related in their origin, the use of these 12 SSR loci may be a feasible alternative for identifying and evaluating the soybean cultivars to be protected.

ACKNOWLEDGEMENTS

This work was supported by CAPES and FAEPA-HC-FMRP. Pedro Roberto R. Prado provided valuable technical assistance during the course of this study. Leones de Almeida, Milton Kaster, Roberto Zito, Nelson R. Braga and João Alberine provided samples and information about cultivars indicated in this study.

REFERENCES

  • Abdelnoor RV, Barros EG and Moreira MA (1995) Determination of diversity within Brazilian soybean germplasm using random amplified polymorphic DNA techniques and comparative analysis with pedigree data. Braz J Genet 18:265-273.
  • Akkaya MS, Bhagwat AA and Cregan PB (1992) Length polymorphisms of simple sequence repeat DNA in soybean. Genetics 132:1131-1139.
  • Cregan PB, Bhagwat AA, Akkaya MS and Rongwen J (1994) Microsatellite fingerprinting and mapping of soybean. Methods Mol Cell Biol 5:49-61.
  • Cregan PB, Jarvik T, Bush AL, Shoemaker RC, Lark KG, Kahler AL et al. (1999) An integrated genetic linkage map of the soybean genome. Crop Sci 39:1464-1490.
  • Diwan N and Cregan PB (1997) Automated sizing of fluorescent-labeled simple sequence repeat (SSR) markers to assay genetic variation in soybean. Theor Appl Genet 95:723-733.
  • Doldi ML, Vollmann J and Lelley T (1997) Genetic diversity in soybean as determined by RAPD and microsatellite analysis. Plant Breeding 116:331-335.
  • Doyle JJ and Doyle JL (1990) Isolation of plant DNA from fresh tissue. BRL Focus 12:13-15.
  • Embrapa (2000) Recomendações Técnicas para a Cultura da Soja na Região Central do Brasil 2000/01. Embrapa Soja/Fundação MT, Londrina, 245pp.
  • Felsenstein J (1989) PHYLIP - Phylogeny Inference Package (version 3.2). Cladistics 5:164-166.
  • Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783-791.
  • Hiromoto DM and Vello NA (1986) The genetic base of Brazilian soybean cultivars. Braz J Genet 9:295-306.
  • Hokanson SC, Szewc-McFadden AK, Lamboy WF and McFerson JR (1998) Microsatellite (SSR) markers reveal genetic identities, genetic diversity, and relationships in a Malus x domestica Borkh. core subset collection. Theor Appl Genet 97:671-683.
  • Kresovich S, Szewc-McFadden AK and Bliek SM (1995) Abundance and characterization of simple-sequence repeats (SSRs) isolated from a size fractionated genomic library of Brassica napus L. (rapeseed). Theor Appl Genet 91:206-211.
  • Maughan PJ, Saghai-Maroof MA and Buss GR (1995) Microsatellite and amplified sequence length polymorphisms in cultivated and wild soybean. Genome 38:715-723.
  • Morgante M and Olivieri AM (1994) Genetic mapping and variability of seven soybean simple sequence repeat loci. Genome 37:763-769.
  • Narvel JM, Fehr WR, Chu WS, Grant D and Shoemaker RC (2000) Simple sequence repeat diversity among soybean plant introductions and elite genotypes. Crop Sci 40:1452-1458.
  • Powell W, Machray GC and Provan J (1996) Polymorphism revealed by simple sequence repeats. Trends in Plant Science 1:215-222.
  • Rongwen J, Akkaya MS, Lavi U and Cregan PB (1995) The use of microsatellite DNA markers for soybean genotype identification. Theor Appl Genet 90:43-48.
  • Sanguineti C, Dias-Neto E and Simpson AJG (1994) RAPD silver staining and recovery of PCR products separated on polyacrylamide gels. Biotechniques 17:914-921.
  • Song QJ, Quigley CV, Nelson RL, Carter TE, Boerma HR, Strachan JR et al. (1999) A selected set of trinucleotide simple sequence repeat markers for soybean cultivar identification. Plant Var Seeds 12:207-220.
  • Thomas MR and Scott NS (1993) Microsatellite repeats in grapevine reveal DNA polymorphisms when analysed as sequence-tagged sites (STSs). Theor Appl Genet 86:985-990.
  • Weir BS (1990) Genetic data analysis: methods for discrete population genetic data. Sinauer Associates, Sunderland, Massachusetts, 445 pp.

Publication Dates

  • Publication in this collection
    11 Sept 2002
  • Date of issue
    2002

Sociedade Brasileira de Genética Rua Cap. Adelmio Norberto da Silva, 736, 14025-670 Ribeirão Preto SP Brazil, Tel.: (55 16) 3911-4130 / Fax.: (55 16) 3621-3552 - Ribeirão Preto - SP - Brazil
E-mail: editor@gmb.org.br