Development of microsatellite markers for the genetic analysis of Magnaporthe grisea

An AG microsatellite-enriched genomic DNA library was constructed for Magnaporthe grisea (anamorph Pyricularia grisea), the causal agent of rice blast. Seventy-two DNA clones containing microsatellite repeats were isolated and sequenced in order to develop a series of new PCR-based molecular markers to be used in genetic studies of the fungus. Twenty-four of these clones were selected to design primer pairs for the PCR amplification of microsatellite alleles. Single spore cultures of M. grisea isolated from rice and wheat in Brazil, Colombia and China were genotyped at three microsatellite loci. Isolates from southern Brazil were predominantly monomorphic at the tested SSR loci, indicating a low level of genetic variability in these samples. However, seven alleles were observed at the MGM-1 locus in isolates from Central Brazil and at least nine alleles were detected at the same locus in a sample of Colombian isolates. Polymorphism analysis at SSR loci is a simple and direct approach for estimating the genetic diversity of M. grisea isolates and a powerful tool for studying M. grisea genetics.

1 EMBRAPA Arroz e Feijão, Caixa Postal 179, 75375-000 Sto. Antônio de Goiás, GO, Brasil. Send correspondence to C.B. E-mail: brondani@cnpaf.embrapa.br
2 EMBRAPA Recursos Genéticos e Biotecnologia, Laboratório de Genética, Caixa Postal 02372, 70849-970 Brasília, DF, Brasil.


INTRODUCTION
Rice blast, caused by the fungus Magnaporthe grisea (T.T. Hebert) Yaegashi and Udagawa (anamorph Pyricularia grisea (Cooke) Sacc.), is one of the most important diseases of rice in the tropics, and is responsible for yield losses in both upland and irrigated rice production systems. Understanding the rice-Magnaporthe pathosystem requires knowledge of the pathogen's genetic diversity and the mechanisms that lead to the development of new virulent genotypes. The development of technologies that allow monitoring of the dynamics of fungal populations is crucial in designing strategies for disease control.
The genetic diversity of M. grisea and its correlation with pathotypes have been studied by various approaches, including repetitive DNA sequences. Since repetitive DNA sequences seem to be distributed throughout the pathogen genome (Weising et al., 1995), the differences and similarities among fungal isolates can be assessed in satellite regions and the data can be used to monitor disease epidemics, population dynamics and the racial composition of pathogen populations. Long repetitive sequences that have been described and studied in the M. grisea genome include the Magnaporthe grisea repeats (MGR), containing a core repetitive sequence of 1,860 bp, with an estimated average of 46 copies per genome (Hamer, 1991; Farman et al., 1996a). Some retrotransposon repetitive sequences such as grh (Dobinson et al., 1993) and MAGGY (Farman et al., 1996b) have also been detected in M. grisea DNA, with approximately 50 copies per genome. Assessment of polymorphism at these repetitive element regions among fungal isolates has relied on RFLP, which is often costly as well as time and labor intensive. Kachroo et al. (1994) described a method that uncovers the polymorphism of Pot2, another repetitive element found in the M. grisea genome, based on PCR with primers flanking the core repetitive sequence. Alternatively, since simple sequence repeats (SSR) or microsatellite sequences are relatively abundant in microorganisms (Field and Wills, 1996), a protocol that could efficiently uncover SSR loci in the genome of the rice blast pathogen, together with a simple PCR assay to assess the polymorphism at these hypervariable loci, would be very useful.
Microsatellites are highly informative repetitive sequences of 2-6 bp, dispersed throughout the eukaryotic genome (Morgante and Olivieri, 1993; Taramino and Tingey, 1996). The development of SSR markers requires the identification and sequencing of SSR loci and the construction of primers that can be used to amplify the alleles. Polymorphism at microsatellite loci can be efficiently assessed by PCR (Weber and May, 1989). The alleles are usually separated and identified by high resolution polyacrylamide gel electrophoresis. Characteristics such as convenient analysis through PCR, a high number of alleles per locus, precise allele identification through the use of allelic ladders and the accurate comparison of data among researchers and laboratories make SSR markers one of the most informative techniques for genome mapping, DNA fingerprinting and population studies (Taramino and Tingey, 1996). Allelic diversity at SSR loci, which arises from variation in the number of repeats of the core sequence, is probably caused by polymerase "slippage" and a lack of repair during DNA replication (Field and Wills, 1996).
The objectives of this work were i) to construct a microsatellite-rich DNA library of M. grisea; ii) to identify, select and sequence microsatellite DNA clones; iii) to characterize microsatellite markers of the M. grisea genome, and iv) to genotype M. grisea isolates from different areas and hosts using some of the newly developed SSR markers.

M. grisea DNA extraction
Monosporic M. grisea cultures were obtained from M. grisea-infected rice leaves in the field. The leaves were kept in a humid chamber for 12 h at 22°C in the dark to induce sporulation. The spores were isolated from the leaves using a stereoscopic microscope, transferred to water-agar medium (1.7% agar) and kept at 24°C for 24 h under fluorescent light. After germination, the spores were isolated and transferred to solid oat medium (Tuite, 1969) containing 250 mg/l chloramphenicol and maintained at 26°C under fluorescent light for 15 days. Agar discs containing mycelia and spores were used to inoculate liquid Fries medium (Tuite, 1969) and the culture was incubated at room temperature in the dark under stirring at 150 rpm. The mycelia were isolated from the medium, frozen in liquid nitrogen and ground to powder with a mortar and pestle. DNA was extracted from this powder using the extraction protocol described by Brondani et al. (1998).

Construction of SSR-enriched genomic library
An M. grisea SSR-enriched library was obtained as described by Rafalski et al. (1996). DNA (50 µg) from the M. grisea isolate PS-3 was digested with the restriction enzyme Tsp 509 I (New England Biolabs, Beverly, MA), and size separated by 2% agarose gel electrophoresis. Fragments ranging from 300-600 bp were recovered after electrophoresis using an NA-45 DEAE (diethylaminoethyl) membrane (Schleicher and Schuell, Keene, NH) inserted into the gel. The fragments were ligated to adapters at the Tsp 509 I restriction site. Positive clones containing SSR were selected by hybridization with biotinylated oligonucleotides complementary to the repetitive sequence AG/CT, and recovered with magnetic beads linked to streptavidin (Dynal, Oslo, Norway). Microsatellite-rich fragments were amplified by PCR and cloned into phage Lambda Zap II (Stratagene, La Jolla, CA).
Selection and sequencing of SSR-positive clones and design of flanking primers
Plaques of a Lambda Zap II DNA library were lifted onto Hybond-N membranes (Amersham, Buckinghamshire, England) and screened by hybridization with a digoxigenin-labeled poly (dA-dG) probe (Boehringer, Mannheim, Germany) to select fragments containing AG repeats. The positive plaques were selected and phagemids were isolated by in vivo excision, according to the manufacturer's instructions. The presence, orientation and length of the inserts were determined by an anchor-PCR strategy (Rafalski et al., 1996). This method ensured that only clones containing repeats were amplified using a combination of primers that annealed to either T3 or T7 primer sites and to the SSR sequence of the insert. Subsequent electrophoresis on 3.5% agarose gels revealed clones containing SSR inserts and the direction within the vector from which they were to be sequenced. After in vivo excision, the phagemid DNA was isolated using Wizard minipreps (Promega, Madison, WI). Double-stranded DNA samples were sequenced with an Applied Biosystems 377 DNA sequencer (Perkin-Elmer, Foster City, CA) using dye terminator fluorescent chemistry (Roche Molecular Systems, Branchburg, NJ), which consists of labeling each of the four dideoxyterminators (ddNTPs) with four different fluorescent dyes. When these terminators replace standard dideoxynucleotides in enzymatic sequencing, a dye label is incorporated into the DNA along with the terminating base. Each dye emits light at a different wavelength when excited by the laser light of the sequencer so that the four ddNTPs can be detected. These data are transformed into a deoxynucleotide position in the DNA molecule that is being sequenced. The sequences of regions flanking microsatellite loci were used to design 18-24-bp primer pairs, using the software Primer 0.5 (S. Lincoln, M. Daly, and E. Lander, Cambridge, MA).
The criteria adopted to reduce the amplification of nonspecific bands in PCR reactions included a minimum primer annealing temperature (Ta) of 50°C, a maximum difference of 1°C in Ta between the two primers of an SSR locus, and a G + C content ranging from 40 to 50%. The primers were synthesized by Operon Technologies (Alameda, CA).
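The primer selection criteria above can be expressed as a simple filter. The following is a minimal sketch, not the procedure implemented in Primer 0.5: the function names are hypothetical, and the annealing temperature is approximated with the Wallace rule (2°C per A or T, 4°C per G or C), which is an assumption made here only to have a concrete temperature estimate.

```python
def gc_content(primer: str) -> float:
    """Return the G + C content of a primer as a percentage."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> float:
    """Rough temperature estimate (deg C) by the Wallace rule: 2(A+T) + 4(G+C).

    Assumption: used only as a stand-in for the annealing temperature
    reported by the primer design software.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2.0 * at + 4.0 * gc


def passes_criteria(forward: str, reverse: str) -> bool:
    """Apply the length, temperature and G + C filters described in the text."""
    for primer in (forward, reverse):
        if not 18 <= len(primer) <= 24:          # 18-24 bp primers
            return False
        if not 40.0 <= gc_content(primer) <= 50.0:  # G + C of 40 to 50%
            return False
        if wallace_tm(primer) < 50.0:            # minimum temperature of 50 deg C
            return False
    # Maximum difference of 1 deg C between the two primers of a locus.
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= 1.0


# Hypothetical primer sequences, for illustration only.
fwd = "ATGCATGCATGCATGCATAT"
rev = "TACGTACGTACGTACGTATA"
print(passes_criteria(fwd, rev))
```

Because the Wallace rule moves in steps of 2°C, the 1°C tolerance effectively requires equal estimates for both primers; a nearest-neighbor model would give finer resolution.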
PCR amplification of alleles at SSR loci
PCR was done in a final volume of 20 µl, containing 15 ng of DNA, 0.3 µM of each primer, 0.25 mM of each dNTP, 1.5 mM MgCl2, 5% DMSO and 1 unit of Taq polymerase. The reactions were done on a PT-100 thermal cycler (MJ Research, Watertown, MA), programmed for the following steps: 94°C for 4 min, then 30 cycles of 94°C for 1 min, 50°C for 1 min, 72°C for 1 min, and a final extension step of 72°C for 7 min. The samples were electrophoresed on a) 3.5% agarose gels containing 0.1 µg of ethidium bromide/µl in 1X TBE buffer (89 mM Tris-borate, 2 mM EDTA, pH 8.3) and run at 90 V for 2.5 h, or b) 4% denaturing polyacrylamide gels in 1X TBE buffer run at 45 W for 30 min. The bands on the polyacrylamide gels were silver stained as described by Bassam et al. (1991).
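As a worked check of the reaction setup, the stated final concentrations can be converted into absolute amounts per 20-µl reaction, and the programmed hold times can be summed. This is a sketch under the assumption that ramping time between steps is ignored; the helper names are hypothetical.

```python
REACTION_VOLUME_UL = 20.0  # final reaction volume stated in the text


def amount_pmol(conc_um: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Amount of solute in pmol: a 1 µM solution holds 1 pmol per µl."""
    return conc_um * volume_ul


def program_minutes(initial: float, cycles: int, steps: list, final: float) -> float:
    """Total programmed hold time in minutes, ignoring ramping between steps."""
    return initial + cycles * sum(steps) + final


primer_pmol = amount_pmol(0.3)           # each primer at 0.3 µM
dntp_nmol = amount_pmol(250.0) / 1e3     # each dNTP at 0.25 mM (250 µM)
mgcl2_nmol = amount_pmol(1500.0) / 1e3   # MgCl2 at 1.5 mM (1500 µM)
dmso_ul = 0.05 * REACTION_VOLUME_UL      # 5% (v/v) DMSO
total_min = program_minutes(4, 30, [1, 1, 1], 7)

print(f"primer: {primer_pmol:.1f} pmol each")
print(f"dNTP: {dntp_nmol:.1f} nmol each, MgCl2: {mgcl2_nmol:.1f} nmol")
print(f"DMSO: {dmso_ul:.1f} µl, program: {total_min:.0f} min")
```

Under these assumptions each reaction holds about 6 pmol of each primer, 5 nmol of each dNTP, 30 nmol of MgCl2 and 1 µl of DMSO, and the program runs for 101 min of hold time.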

Confirmation of alleles at SSR loci
To verify whether the amplified DNA fragments were representative of the expected microsatellite, and not the result of amplification of spurious sequences, the M. grisea isolate PS-3 was analyzed at three SSR loci (MGM-1, MGM-21 and MGM-24). After the PCR reaction, the amplified DNA products for each SSR primer pair were electrophoresed on 3.5% agarose gels. The DNA band of each primer pair was recovered from the gel by cutting it right after the band, inserting an NA-45 nitrocellulose membrane into the cut, and running the gel until the band was bound to the membrane. The bound DNA was extracted from the membrane with 250 µl NET buffer (1.0 mM NaCl, 0.1 mM EDTA and 20 mM Tris-HCl, pH 8.0) in a 1.5-ml tube at 65°C for 1 h, and then precipitated with 625 µl ethanol overnight. The DNA was resuspended in 10 µl water, quantified, diluted, and 20 ng was used as template DNA for automated sequencing, as described above. A similar protocol was used to sequence the PCR products of M. grisea isolates MT-20, 6043 and CPAC-01 at the MGM-1 locus. For each sample, a single-strand sequence was generated using only the forward primer.

Genotyping of M. grisea isolates
A total of 158 M. grisea single spore isolates from different areas and hosts were collected and used for genotypic analysis using some of the newly developed SSR markers. Twenty-eight isolates from two rice blast-infected areas in southern Brazil were kindly provided by Dr. Alceu S. Ribeiro (EMBRAPA-CPACT, Brazil). These isolates were obtained from lesions in rice plants infected in the field in two counties of the State of Rio Grande do Sul, approximately 200 km apart from each other. The areas of collection were Rosario do Sul County (elevation 125 m, 30° S latitude, 54° W longitude), with isolates identified as "RS" followed by a number representing the order of isolation, and Palmares do Sul (elevation 9 m, 30° S latitude, 50° W longitude), with isolates identified as "PS" followed by a number as described above. DNA from 34 M. grisea isolates collected throughout Colombia was kindly provided by Dr. Fernando Correa-Victoria (CIAT, Colombia), and coded as MT or OLY. Ninety-six isolates were also obtained from an irrigated commercial rice field in Formoso, State of Tocantins, Central Brazil (elevation 240 m, 11° S latitude, 49° W longitude). These isolates represent a sample from an M. grisea epidemic since they were collected from infected plants 10 m apart covering an area of 1 ha. The genetic analysis of this M. grisea population was used to indicate the level of pathogen variability in small areas during an epidemic. One M. grisea isolate from rice collected in China (number 6043, kindly provided by Dr. Barbara Valent, Du Pont, USA) and one isolate from wheat (CPAC-01, kindly provided by Dr. José Ribamar dos Anjos, EMBRAPA-CPAC) collected in Central Brazil were also evaluated. The polymorphism information content (PIC) of each tested marker was calculated for isolates from Central Brazil, as described by Taramino and Tingey (1996).
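The text does not spell out the PIC formula. A form commonly used for SSR markers is PIC = 1 - Σ p_i², where p_i is the frequency of the i-th allele; the sketch below assumes that form and assumes each haploid isolate contributes one allele per locus. The allele sizes in the example are hypothetical and are not data from this study.

```python
from collections import Counter


def pic(alleles) -> float:
    """PIC of one locus, assumed here to be 1 - sum(p_i^2).

    `alleles` holds one allele call per isolate (haploid organism).
    """
    counts = Counter(alleles)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical allele sizes (bp) for eight isolates at a single locus.
calls = [126, 126, 126, 126, 118, 118, 140, 152]
print(round(pic(calls), 2))
```

A monomorphic locus gives a PIC of 0, and the value approaches 1 as the number of equally frequent alleles grows.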

Genomic library and SSR development
The AG-enriched library contained inserts with an average length of 500 bp, which facilitated the subsequent steps of SSR marker development. Around 40% of the plaques contained clones that hybridized with the poly (dA-dG) probe. A total of 193 clones were tested for microsatellite sequences by the anchor-PCR approach, and 168 (87%) of these tested positive for SSRs. Among the clones with microsatellites, 152 (90%) had an SSR of adequate size and position within the cloned insert to be sequenced. Seventy-two of these clones were sequenced and found to contain AG repeats. Of these sequences, 24 had SSR sequences and a flanking region of adequate size for the design of forward and reverse primers. These clones are described in Table I, with the primer sequences in bold. Twenty-one of the clones had simple, straight AG repeats, while three were complex (clones MGM-3, MGM-14, MGM-26) (Table I).
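The clone counts reported above form a screening funnel whose step-by-step yields can be recomputed directly. The sketch below uses only the counts given in the text; the step labels and function name are illustrative.

```python
# Clone counts at each stage, as reported in the text.
funnel = [
    ("tested by anchor-PCR", 193),
    ("positive for SSRs", 168),
    ("adequate SSR size and position", 152),
    ("sequenced", 72),
    ("suitable for primer design", 24),
]


def step_yields(steps):
    """Percentage of each step's count relative to the preceding step."""
    return [
        (label, 100.0 * count / prev_count)
        for (_, prev_count), (label, count) in zip(steps, steps[1:])
    ]


yields = step_yields(funnel)
overall = 100.0 * funnel[-1][1] / funnel[0][1]

for label, pct in yields:
    print(f"{label}: {pct:.0f}% of previous step")
print(f"overall: {overall:.0f}% of tested clones yielded primer pairs")
```

The first two yields reproduce the 87% and 90% stated in the text. Note that only 72 of the 152 suitable clones were sequenced, so the overall figure reflects sampling effort as well as attrition.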

Screening of SSR markers
The 24 SSR primer pairs designed were tested for DNA amplification on a set of seven M. grisea isolates from different regions and hosts: RS-10, RS-13, PS-3 and PS-41 (from two areas of southern Brazil), MT-20 (from Colombia), 6043 (from China) and CPAC-01 (from wheat). Under the PCR conditions used (see Methods), three primer pairs produced clear DNA amplification products: MGM-1, MGM-21 and MGM-24. The other 21 SSR primer pairs did not produce clear-cut PCR products under the PCR conditions used. Other conditions were also tested for these primers, including higher annealing temperatures for primer pairs with several bands (Rafalski et al., 1996) or a "touchdown" approach for primer pairs with low band resolution (Brown et al., 1996). Since specific conditions should be adjusted for each primer pair independently, the M. grisea SSR analysis was continued only with the three pairs that showed good resolution under the originally developed PCR conditions. The results for the remaining primer pairs will be reported elsewhere. Alleles at the three SSR loci (MGM-1, MGM-21, MGM-24) of the isolates cited above were detected on agarose and polyacrylamide gels (Figure 1).
M. grisea isolated from rice and wheat yielded PCR products from all three loci, indicating the conservation of these loci in M. grisea pathogenic to rice and wheat. The southern Brazilian isolates (RS-10, RS-13, PS-3 and PS-41) were easily distinguished from the MT-20 (Colombia), [...] were confirmed to be microsatellite alleles by sequence analysis of recovered fragments amplified at these three SSR loci using the PS-3 isolate (data not shown). All three sequences from recovered PCR products had an AG-SSR region and, as expected, were identical in sequence to the original clones derived from the SSR library. The allele sizes observed for the CPAC-01 (wheat) and PS-3 (southern Brazil) isolates were 84 and 126 bp, respectively (Figure 1). Sequence analysis of both alleles revealed that this difference in allele size was caused mainly by variation in the number of AG repeats. The DNA-flanking sequence of this microsatellite locus was almost completely conserved in isolates from wheat and rice (Table II). Sequencing of the MGM-1 allele of isolate MT-20 (Colombia) revealed more than 60 AG repeats.

Genotyping of M. grisea isolates
Genetic variation among 158 isolates of M. grisea from different regions was detected by analyzing their genotypes at the MGM-1, MGM-21 and MGM-24 loci. Twenty-eight southern Brazilian isolates, 96 central Brazilian isolates and a sample of 34 Colombian M. grisea isolates were genotyped at these loci. None of the 28 southern Brazilian isolates showed polymorphism at the three loci after electrophoresis on 3.5% agarose gels (Figure 2). The situation for the Colombian and central Brazilian isolates was quite different. Analysis of the MGM-1 locus for a sample of Colombian isolates indicated high variability. At least nine alleles were identified on 3.5% agarose gels (Figure 2), with estimated sizes ranging from 118 bp to approximately 400 bp. The Colombian isolate MT-2 showed more than one amplified band, which is not expected to occur in haploid mycelial DNA. Several of the 96 M. grisea isolates sampled from a commercial rice field infected by the blast fungus in Central Brazil were separated based on their genotypes at the MGM-1 and MGM-21 loci (Figure 3). At least seven alleles were identified at these loci on 3.5% agarose gels, the most frequent of them being observed in almost half of the isolates. PIC values estimated from allelic variation observed in the 96 isolates collected from a 1-ha area in Central Brazil were 0.54 for the MGM-1 locus and 0.44 for MGM-21. The PIC values would be expected to be higher on polyacrylamide gels. Some isolates (Figure 3) showed two alleles, especially at the MGM-24 locus.

DISCUSSION
We have described the construction of a dinucleotide microsatellite-rich DNA library of M. grisea, the identification, selection and sequencing of microsatellite DNA clones, the characterization of three of the newly developed M. grisea microsatellite markers and the use of these marker loci to genotype blast isolates from different areas and hosts. Dinucleotide SSR markers have been detected abundantly in mammals and plants. While in mammals and plants the most frequent repetitive motif is AC/TG (Beckmann and Weber, 1992) and AT/TA (Morgante and Olivieri, 1993), respectively, there is limited information about the type of motif and the frequency of SSR loci in microorganisms. Field and Wills (1996) described the results of a search for microsatellite sequences in microorganisms by examining the cloned sequences deposited at GenBank. Forty-six out of 375 sequences found were from fungi and all of them were trinucleotides. Since no information about SSR in M. grisea was available at the beginning of this work, a dinucleotide-enriched library was constructed by assuming that dinucleotides are expected to occur more frequently in the genome than tri- or tetranucleotides, thus increasing the chance of cloning these regions. Indeed, the data collected to date indicate that dinucleotide repetitive sequences can efficiently be isolated from the M. grisea genome.
M. grisea isolates from Brazil, China and Colombia and one isolate from wheat were genotyped at three SSR marker loci. Several amplified alleles of different lengths were detected in agarose and polyacrylamide gels, as were null alleles, observed for the MGM-21 locus in the Chinese isolate, and the MGM-24 locus in the PS-41 isolate from Brazil. These findings indicate that the SSR flanking regions are conserved among isolates from different geographic regions and countries, and that they are also conserved in an M. grisea isolate from wheat (CPAC-01). The possibility of differentiating isolates based on differences in the number of dinucleotide repeats and the presence or absence of a site for primer annealing is useful for discriminating among isolates and provides a powerful tool for the genetic analysis of M. grisea populations. Using a small set of selected SSR markers, it may be possible to differentiate large numbers of pathogen isolates.
The number of AG repeats in the M. grisea genome varied significantly between some isolates. The allele sizes observed in isolates CPAC-01 (collected in wheat) and PS-3 (collected in southern Brazil) were 84 and 126 bp, respectively (Figure 1). Sequence analysis of the two fragments indicated that the large difference in size resulted from a 5-fold difference in the number of AG repeats between the two isolates (Table II). Likewise, sequencing of the MGM-1 allele of isolate MT-20 (collected in Colombia) revealed more than 60 AG repeats at this locus, which represents one of the highest levels of simple sequence repeats known among eukaryotic organisms.
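The relation between allele size and repeat number can be illustrated with a short sketch. It counts the longest uninterrupted (AG)n tract on the given strand of a sequence and converts a size difference between two alleles into dinucleotide units, assuming the whole difference lies in the repeat tract (the text says only that it is mainly due to the repeats). The function names and the example sequence are hypothetical.

```python
import re


def longest_ag_run(seq: str) -> int:
    """Number of repeat units in the longest uninterrupted (AG)n tract."""
    runs = re.findall(r"(?:AG)+", seq.upper())
    return max((len(run) // 2 for run in runs), default=0)


def repeat_unit_difference(size_a_bp: int, size_b_bp: int, unit_len: int = 2) -> float:
    """Repeat units separating two alleles if only the repeat tract differs."""
    return abs(size_a_bp - size_b_bp) / unit_len


# Hypothetical sequence: flanking bases around five AG units.
example = "TTC" + "AG" * 5 + "CCT"
print(longest_ag_run(example))

# Allele sizes reported for isolates CPAC-01 and PS-3.
print(repeat_unit_difference(84, 126))
```

For the 84-bp and 126-bp alleles, the 42-bp gap corresponds to at most 21 AG units under this assumption.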
The difference in the number of alleles detected in southern Brazil relative to the central Brazilian and Colombian M. grisea isolates was significant and may be related to the sampling strategy used to recover M. grisea isolates in these areas. The southern Brazilian isolates were sampled in a typical agricultural environment where the rice varieties have a very narrow genetic base (Rangel et al., 1996). The use of the same or very similar rice varieties in these areas over the years favors the predominance of a specific pathotype of the fungus. Also, the temperate climate of southern Brazil is not as favorable to the development of the disease as other areas, such as the Formoso River valley of Central Brazil, where high temperatures and humidity most of the year create a suitable environment for the pathogen. Blast symptoms can be observed on rice and weeds in the field throughout the year in the Formoso River valley, and this creates a continuous source of M. grisea inoculum in the region. The M. grisea isolates from Central Brazil were collected from a blast epidemic on rice cultivar Metica-1. A large number of isolates were collected from infected plants only 10 m apart, over a total area of 1 ha. The level of variability observed was very striking, indicating that the genetic structure of M. grisea populations in this area is far more complex than anticipated. The isolates from Colombia, representing a sample of M. grisea found in several parts of that country, were collected from various rice cultivars (Correa-Victoria and Zeigler, 1993). In this case, the genotypic variability was expected to be high because of the different selection pressures imposed by host genotype variability and the diverse environments where the pathogen was sampled. The level of variability of M. grisea isolates from Colombia was similar to that found in the population sampled in a small area of Central Brazil.
The Colombian isolate MT-2 showed more than one amplified band, which would not be expected to occur in haploid mycelial DNA. This could reflect nonspecific primer annealing during PCR, the contamination of isolates or the presence of duplicated loci in the M. grisea genome. Current information on loci duplication in the M. grisea genome (Valent and Chumley, 1994) favors the last hypothesis. Double-banding patterns were also observed in some isolates from Central Brazil genotyped at the MGM-24 locus (Figure 3).
The estimates of PIC for the Central Brazilian M. grisea population were 0.54 for MGM-1 and 0.44 for MGM-21. PIC values will probably rise when polyacrylamide gels are used and as more isolates are studied. The combined use of the MGM-1 and MGM-21 markers in this population allowed the identification of nine different genotypes. The MGM-1 and MGM-21 markers discriminated five and three genotypes, respectively, when used individually. The development of additional SSR markers and their analysis by PCR using fluorescent dye technology or silver-stained gels could rapidly provide genetic data for a number of studies, including the estimation of genetic parameters for pathogen populations. One clear advantage is the possibility of constructing databases of allele frequencies at selected SSR loci. The development of allelic ladders for informative SSR loci will facilitate accurate allele identification within and between populations of the pathogen, thereby facilitating the comparison of data among laboratories.
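Counting genotypes per marker and for markers combined amounts to counting distinct tuples of allele calls. The sketch below shows the idea on hypothetical isolates and allele sizes, not on the data of this study; the function name and the dictionary layout are assumptions made for illustration.

```python
def count_genotypes(isolates: dict, loci: list) -> int:
    """Number of distinct genotypes across the given loci.

    `isolates` maps an isolate name to a dict of locus -> allele size (bp).
    """
    return len({tuple(calls[locus] for locus in loci) for calls in isolates.values()})


# Hypothetical allele calls for four isolates at two loci.
isolates = {
    "iso1": {"MGM-1": 120, "MGM-21": 150},
    "iso2": {"MGM-1": 120, "MGM-21": 160},
    "iso3": {"MGM-1": 130, "MGM-21": 150},
    "iso4": {"MGM-1": 120, "MGM-21": 150},
}

print(count_genotypes(isolates, ["MGM-1"]))
print(count_genotypes(isolates, ["MGM-21"]))
print(count_genotypes(isolates, ["MGM-1", "MGM-21"]))
```

In this toy data set each marker alone separates two genotypes while the pair separates three, which mirrors how combining loci increases discriminating power.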
The application of SSR markers to genetic studies of M. grisea is very promising. The practical advantages of a robust PCR-based approach in contrast to the labor-intensive and costly multilocus hybridization probe techniques are easily appreciated, especially for studies involving large samples such as those required for population diversity analysis and for monitoring genotype and pathotype distribution in epidemic areas. Data gathered on the genotype of individual isolates of a fungal population during an epidemic and on the dynamics of genotype change over time may be useful for developing and breeding new resistant rice cultivars.