Print version ISSN 1415-4757
Genet. Mol. Biol. vol.31 no.1 São Paulo 2008
Paula A. Roratto; Darine Buchmann; Sandro Santos; Marlise L. Bartholomei-Santos
Laboratório de Diversidade Genética, Centro de Ciências Naturais e Exatas, Universidade Federal de Santa Maria, Santa Maria, RS, Brazil
Protocols for microsatellite-enrichment libraries have been widely applied to several species in order to supply the most informative molecular markers for population and inbreeding studies. One drawback of these protocols is the proportion of designed primer pairs that fail to amplify the expected fragment, even after exhaustive optimization attempts. A possible cause of unsuccessful microsatellite primers may be that such loci are artifacts resulting from chimeric PCR products, instead of real genomic sequences. The microsatellite-enriched library constructed for Aegla longirostri (Crustacea, Decapoda, Anomura) showed that 29% of sequenced clones were chimeric products, because these sequences shared one of the flanking regions around the same repeat motif but not the other. PCR-mediated recombination is a well-known event described for several procedures in which related sequences are used as a template. Here we associate this phenomenon with microsatellite marker development, which explains the high ratio of recombinant sequences generated in the A. longirostri microsatellite-enriched library. We discuss the mechanism and implications of PCR chimeric-product formation during microsatellite isolation.
Key words: chimeric PCR product, microsatellite isolation, recombination, simple sequence repeats (SSR).
Microsatellites or simple sequence repeats (SSR) are genetic markers widely used in individual identification and population-level analysis because of their high power of genetic resolution (Chambers and Macavoy, 2000). Several techniques for microsatellite isolation have been developed, and in a recent review of the most frequently used strategies for microsatellite isolation, Zane et al. (2002) noted that selective hybridization protocols are extremely popular, being used in over 25% of all reviewed articles and in 70% of those employing enrichment steps. The basic protocol was proposed by Karagyozov et al. (1993), Armour et al. (1994) and Kijas et al. (1994). Selective hybridization is performed by using an oligonucleotide probe containing the repeat motif to be isolated. The probe can be cross-linked to a nylon membrane or can be biotinylated at one end so that DNA hybridized to the probe can be selectively captured using streptavidin-coated magnetic beads. The use of a biotinylated probe is generally preferable because of its greater efficiency in hybridizing to the target DNA. After selective hybridization, recovered fragments are amplified using polymerase chain reaction (PCR) and cloned using standard methods for further sequencing (Zane et al., 2002).
Despite modern technology, results are not always satisfactory: the percentage of designed primer pairs that are successfully optimized is variable and generally low (Table 1). However, because publications about microsatellite isolation are generally limited to notes, authors usually focus on amplifiable loci and do not discuss the possible factors behind loci that did not work. In our opinion, the failure of loci to amplify may be attributable to artifacts of the PCR-based isolation process.
Chimeric PCR products can arise when sequences with relatively high similarity are present in a reaction as templates. An incompletely extended primer in the elongation phase of the PCR cycle generates a shorter nascent strand, which, in a subsequent cycle, can prime off a heterologous target sequence and be completely extended. The chimeric sequence produced has its 5' end corresponding to the first template and its 3' end to the heterologous template (Bhavsar et al., 1994).
Formation of hybrid sequences by PCR has been reported for attempts to amplify genes belonging to multiple families (Bhavsar et al., 1994), characterization of alleles from a heterozygous subject (Bradley and Hillis, 1997), forensic applications in which the template is generally ancient DNA (Pääbo et al., 1990), reverse transcription (Brakenhoff et al., 1991), in PCR-derived clones from polyploid genomes (Cronn et al., 2002), and for environmental DNA samples (von Wintzingerode et al., 1997). Even false sequences resulting from chimeric PCR have been deposited in public databases (Hugenholtz and Huber, 2003; Ashelford et al., 2005). On the other hand, a chimeric PCR product is a powerful tool widely used to create recombinant molecules in biotechnology assays (Coljee et al., 2000).
In our attempt to isolate microsatellite sequences from the South American freshwater crab Aegla longirostri, out of the 61 clones obtained, 13 lacked sufficient flanking sequences for primer design and two were repeated (identical clones). Moreover, 17 clones were doubtful because they were a type of "shuffled" sequence in which identical flanking regions occurred in different combinations around the (CA)n repeat, a pattern that drew our attention to these clones.
Suspicion of chimeric products led us to an exhaustive search for such phenomena related to studies on the development of microsatellites. Despite the evident propensity of microsatellites to produce PCR-based recombination because of the presence of the same motif in all templates, few studies have reported the possibility of such an event occurring (Refseth et al., 1997; Koblízková et al., 1998; Poteaux et al., 1999; Hughes et al., 2002). Except for the study of Koblízková et al. (1998), none of the studies mentioning the possibility of chimeras in microsatellite development proposed a mechanism to explain this phenomenon or associated its occurrence with unsuccessfully amplifiable microsatellite loci.
Koblízková et al. (1998) proposed that chimeric PCR products result from elongation of free oligonucleotide probes, which would first generate amplification fragments lacking one flanking region and then produce chimeric products in subsequent cycles. They also suggested that the problem could be eliminated using 3' modified oligonucleotides.
This paper reports and discusses an unexpected result from a microsatellite-development procedure and proposes a mechanism for chimeric microsatellite loci that differs from the one proposed by Koblízková et al. (1998). Several methodologies for microsatellite isolation are based on PCR amplification and are therefore liable to form chimeric products. We also discuss ways to detect and avoid chimeric clones, which may have been responsible for many of the literature reports of microsatellite primers that failed to amplify the expected products despite attempts to optimize PCR conditions.
We developed microsatellites following the method described by Refseth et al. (1997), with some modifications. Genomic DNA from Aegla longirostri Bond-Buckup & Buckup, 1994 was digested with TaqI (CenBiot) and fragments (500 ng) were ligated to an adapter (25 µM) using T4 DNA ligase (1 U) (Invitrogen) at 4 °C for 16 h. The adapter oligonucleotide sequences were TaqI20Mer (5'-ATGAAGCCTTGGTACTGGAT-3') and TaqI22Mer (5'-pCGATCCAGTACCAAGGCTTCAT-3'). About 100 ng of the adapter-ligated DNA was hybridized to a 5' biotinylated (CA)8 probe (0.4 µM) (MWG) in TE/NaCl buffer (10 mM Tris-HCl, 1 mM EDTA, 1 M NaCl) containing the oligonucleotide TaqI20Mer (2 µM). The DNA was denatured by incubating at 95 °C for 10 min, followed by incubation at 60 °C for 1 min to allow the biotinylated probe to hybridize to the target DNA. To capture the fragments hybridized to the probe we used the affinity of the biotin in the probe for streptavidin-coated magnetic beads (Dynabeads M-280 Streptavidin, Dynal, Norway), incubating 100 µg of beads for 30 min at room temperature with the hybridized DNA in TE/NaCl buffer. The beads were then washed three times in 2x standard saline citrate containing 0.1% (w/v) sodium dodecyl sulfate at 50 °C for 10 min and once in TE/NaCl at room temperature to remove unbound DNA and excess oligomers. The immobilized single-stranded DNA was eluted from the beads in 50 µL of distilled water at 90 °C for 5 min. Recovered DNA was PCR-amplified in a 50 µL reaction containing 10 µL of the captured fragments (without beads), 10 pmol of oligonucleotide TaqI20Mer (MWG), 2.5 U of Taq DNA polymerase (Invitrogen), 100 µM of each dNTP (Invitrogen) and Taq DNA polymerase buffer (10 mM Tris-HCl pH 8.5; 50 mM KCl; 4 mM MgCl2). Reactions were denatured for 5 min at 95 °C before amplification using 30 cycles of 1 min at 95 °C, 30 s at 61 °C and 2 min at 72 °C, followed by a final extension of 8 min at 72 °C.
The amplification products were purified by polyethylene glycol precipitation and cloned using a TA Cloning kit (Invitrogen).
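The relationship between the two adapter oligonucleotides given above can be checked with a short script. This is only an illustrative sketch: the helper name reverse_complement is ours, and the 5' phosphate of TaqI22Mer is left out of the string. It tests whether TaqI22Mer, apart from its first two bases, is the reverse complement of TaqI20Mer, so that the annealed adapter would carry a two-base 5' overhang compatible with the ends left by TaqI (T^CGA).

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


# Adapter oligonucleotides as given in the text (5' phosphate not represented).
TAQI_20MER = "ATGAAGCCTTGGTACTGGAT"
TAQI_22MER = "CGATCCAGTACCAAGGCTTCAT"

# Split the 22-mer into its first two bases and the remaining 20 bases.
overhang = TAQI_22MER[:2]
duplex_part = TAQI_22MER[2:]

# True if the 20 remaining bases pair with the 20-mer along its full length.
adapter_is_complementary = duplex_part == reverse_complement(TAQI_20MER)

print(len(TAQI_20MER), len(TAQI_22MER), overhang, adapter_is_complementary)
```

Under these assumptions the two oligonucleotides anneal over 20 bases and leave a 5'-CG extension on the 22-mer, which is the cohesive end expected for ligation to TaqI-digested fragments.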
Positive clones, checked by PCR for the presence of an insert, were sequenced using a MegaBACE 500 sequencer (Amersham Biosciences). Before primer design, the repeat was removed and all 3' and 5' flanking regions around the (CA)n repeat were aligned as separate queries using ClustalW (Chenna et al., 2003), which calculates pairwise scores as the number of identities in the best alignment divided by the number of residues compared (percentage identity scores).
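The percentage identity score described above can be illustrated on a pair of sequences that have already been aligned. The following is a minimal sketch and not the ClustalW implementation: the function name percent_identity and the toy sequences are ours, and we assume that alignment columns containing a gap in either sequence are not counted as compared residues.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identities divided by residues compared, as a percentage.

    Both inputs must come from the same pairwise alignment, with '-' as the
    gap character. Columns with a gap in either sequence are skipped
    (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identities = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap column: not a compared residue pair
        compared += 1
        if a == b:
            identities += 1
    if compared == 0:
        return 0.0
    return 100.0 * identities / compared


# Toy flanking regions (invented for illustration, not data from this study).
score_mismatch = percent_identity("ACGTACGTAC", "ACGTTCGTAC")  # one mismatch in ten
score_gap = percent_identity("ACG-TA", "ACGTTA")               # one gap column
print(score_mismatch, score_gap)
```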
The first microsatellite-enriched library that we developed yielded 62 positive colonies. Three of 32 sequenced clones did not contain repeats. All the others showed only one region flanking the microsatellite. These sequences originated from internal priming by biotinylated oligonucleotide probe that had leaked from the magnetic beads. The absence of the adapter sequence at the microsatellite end of the insert confirms that the repeat was the priming site.
In order to avoid carrying over oligonucleotide probes with the recovered DNA, we adopted the following strategies: reduction of probe concentration from 0.4 µM to 0.3 µM; reduction of the elution temperature from 90 °C to 80 °C (high temperatures could break the strong ligation between the biotin in the probe and the streptavidin bound to the magnetic beads); and an additional elution step, in which 20 µg of magnetic beads was added to the eluted DNA to interact with any biotinylated probe remaining in solution.
Our second attempt, using the modifications above, yielded 61 positive clones, of which only two resulted from free oligonucleotide extension. Different clones that showed high identity for both upstream and downstream regions around the (CA)n repeat were considered redundant (two cases). However, the outcome revealed that some inserts shared one of the flanking regions, but not the other. These 17 doubtful sequences corresponded to 28.8% of our clones and were grouped in six subsets (A to F), each containing clones that shared the same flanking region, as shown in Table 2, with high alignment scores. Only the 5' end of clone AlCA112 showed relatively low identity with the related sequences 5' AlCA121, 5' AlCA124 and 5' AlCA166.
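The text does not state the denominator behind the 28.8% figure. A possible reading, offered here only as an assumption, is that the two clones arising from free oligonucleotide extension were excluded, leaving 59 clones. The short calculation below compares that reading with the alternative of using all 61 clones; the variable names are ours.

```python
total_clones = 61            # positive clones from the second attempt
probe_extension_clones = 2   # clones arising from free oligonucleotide extension
doubtful_clones = 17         # clones sharing only one flanking region

# Assumed denominator: clones remaining after excluding probe-extension products.
remaining_clones = total_clones - probe_extension_clones

share_of_remaining = round(100 * doubtful_clones / remaining_clones, 1)
share_of_all = round(100 * doubtful_clones / total_clones, 1)
print(remaining_clones, share_of_remaining, share_of_all)
```

Only the first reading reproduces the reported value, which is why we regard it as the more likely one.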
To assess the possibility that these sequences were part of repetitive genes, BLAST analyses (Altschul et al., 1990) were performed, but no matches were found. Sequences were also submitted to NEBcutter V 2.0 (Vincze et al., 2003) to check for sites for TaqI, the endonuclease used to digest the DNA before isolation; such a site could represent a point of ligation between two different loci that had been cloned together. No sites for TaqI were found in these shuffled sequences.
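The check for internal TaqI sites can also be reproduced with a few lines of code. This sketch is not NEBcutter: the function name find_taqi_sites and the toy sequences are ours. It relies on the TaqI recognition sequence TCGA being palindromic, so that scanning one strand is sufficient.

```python
TAQI_SITE = "TCGA"  # TaqI recognition sequence (cuts T^CGA); palindromic


def find_taqi_sites(seq: str) -> list:
    """Return the 0-based start positions of every TCGA in the sequence."""
    seq = seq.upper()
    width = len(TAQI_SITE)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == TAQI_SITE]


# Toy inserts (invented for illustration, not data from this study).
sites_single = find_taqi_sites("AATCGATT")            # one internal site
sites_overlap = find_taqi_sites("tcgatcga")           # two sites, lower-case input
sites_none = find_taqi_sites("GGATCACACACACACATTGC")  # no site
print(sites_single, sites_overlap, sites_none)
```

An insert that returned an empty list would be consistent with the result reported above, in which a ligation junction between two TaqI fragments was ruled out.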
We suggest that chimeric microsatellite loci arise from in vitro recombination events during the PCR that precedes cloning. A captured microsatellite locus that is not completely extended in one cycle (i.e., extended only as far as the (CA)n repeat) anneals by its 3' end to the (TG)n repeat of another microsatellite locus in the subsequent cycle, where it serves as a primer for further extension. The resulting nascent strand is a chimera formed by the 5' flanking region of one locus, a (CA)n repeat that was the crossover point and the 3' flanking region of another locus (Figure 1). This chimera does not represent a contiguous sequence present in the A. longirostri genome, and primers designed for these PCR artifacts will not amplify the expected fragment.
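The proposed mechanism can be pictured with a toy model operating on sequence strings. The sketch below is a deliberate simplification under our own assumptions: both loci are written in the same (CA)n orientation, the incomplete strand is taken to stop exactly at the end of its repeat, the repeat length of the first locus is retained, and the function names and sequences are invented for illustration.

```python
import re

CA_REPEAT = re.compile(r"(?:CA){5,}")  # at least five CA units


def split_locus(seq: str):
    """Return (5' flank, repeat, 3' flank) around the first (CA)n run."""
    match = CA_REPEAT.search(seq)
    if match is None:
        raise ValueError("no (CA)n repeat found")
    return seq[:match.start()], match.group(0), seq[match.end():]


def make_chimera(locus_a: str, locus_b: str) -> str:
    """Model an abortive strand from locus A completed on locus B.

    The truncated product (5' flank and repeat of A) anneals through its
    repeat to locus B and is extended through the 3' flank of B.
    """
    flank5_a, repeat_a, _ = split_locus(locus_a)
    _, _, flank3_b = split_locus(locus_b)
    return flank5_a + repeat_a + flank3_b


# Toy loci (invented sequences, not data from this study).
locus_a = "GGATCCTTAG" + "CA" * 8 + "TTGACCGTAA"
locus_b = "ACGTTGCAAT" + "CA" * 10 + "GGCTAAGTCC"

chimera = make_chimera(locus_a, locus_b)
print(chimera)
```

In this model a forward primer designed on the 5' flank of the chimera and a reverse primer designed on its 3' flank would target two different genomic loci, which is the situation that leads to failed amplification.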
Despite the high alignment scores (> 92% identity) for most of the flanking regions that we analyzed, the identity was always less than 100% (Table 2). This may have been due to misincorporation of deoxynucleotides by Taq DNA polymerase during PCR or to sequencing errors. However, some of our chimeric clones with similarity scores below 100% could have been generated if the incomplete extension during amplification passed beyond or stopped before the microsatellite and the 3' portion of the sequence acted as a primer in a subsequent cycle (Figure 2a). Clone AlCA111 shares its 5' flanking region with sequences 3'AlCA95, 5'AlCA116 and 5'AlCA161 up to position 70 (TG), where the repeat begins for that clone, but not for the others (Figure 2b). Following the alignment shown in Figure 2b, the same was observed for sequence AlCA161 at position 85. Probably, a strand incompletely extended up to these points (70 and 85), and therefore ending in TG, primed off another microsatellite locus and generated a chimeric sequence.
The above clones constituted the main reason to believe that PCR recombination is the cause of chimerism in microsatellite development, because they showed a few differences at their flanking end shared with other clones, although they were clearly related. The mechanism described above for clones AlCA111, AlCA95, AlCA116 and AlCA161 could not be attributed to products generated by a contaminant repetitive probe, as suggested by Koblízková et al. (1998), because the latter can only generate identical flanking regions.
Chimeric sequences can be detected by aligning the microsatellite flanking ends, after removing the repetitive region, and searching for shuffled sequences originating from a chimeric PCR. The low probability of obtaining related recombinant sequences when only a subset of cloned PCR products is sequenced, or when a small number of clones is obtained from an isolation procedure, should be taken into account. The more clones are sequenced, the more probable is the detection of chimeras. If a flanking region is not present in shuffled clones, it can be assumed that it is real and it can be considered for primer design. Because much effort, time and money are often employed in primer design and optimization of PCR conditions for microsatellite loci, we strongly suggest the alignment of flanking regions of repeats in order to detect these artifacts.
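The screening step suggested above can be sketched as a small script that removes the repeat, compares the flanking regions of every pair of clones and reports pairs that share one flank but not the other. This is an illustration under several assumptions of ours, not the procedure used in this study: similarity is scored with Python's difflib.SequenceMatcher instead of ClustalW, the 0.9 threshold is arbitrary, all clones are assumed to be in the same orientation, and the clone names and sequences are invented.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

REPEAT = re.compile(r"(?:CA){5,}|(?:TG){5,}")  # first run of >= 5 CA or TG units


def flanks(seq: str):
    """Return (5' flank, 3' flank) around the first repeat, or None."""
    match = REPEAT.search(seq)
    if match is None:
        return None
    return seq[:match.start()], seq[match.end():]


def similar(a: str, b: str, threshold: float) -> bool:
    """True if both flanks are non-empty and their similarity ratio passes."""
    if not a or not b:
        return False
    return SequenceMatcher(None, a, b, autojunk=False).ratio() >= threshold


def find_shuffled(clones: dict, threshold: float = 0.9) -> list:
    """List (clone, clone, shared flank) for pairs sharing exactly one flank."""
    split = {name: flanks(seq) for name, seq in clones.items()}
    suspects = []
    for (name1, f1), (name2, f2) in combinations(split.items(), 2):
        if f1 is None or f2 is None:
            continue  # clone without a detectable repeat
        same5 = similar(f1[0], f2[0], threshold)
        same3 = similar(f1[1], f2[1], threshold)
        if same5 != same3:  # one flank shared, the other not
            suspects.append((name1, name2, "5'" if same5 else "3'"))
    return suspects


# Toy clones (invented): clone1 and clone2 share only their 5' flank.
toy_clones = {
    "clone1": "GGATCCTTAGCTAGGTCAAT" + "CA" * 8 + "TTGACCGTAAGCTTGGATCC",
    "clone2": "GGATCCTTAGCTAGGTCAAT" + "CA" * 9 + "GGCTAAGTCCATGCAATTGC",
    "clone3": "ACGTTGCAATCGGATTACGT" + "CA" * 7 + "AGTCCGATTAGGCATCGAAG",
}
suspect_pairs = find_shuffled(toy_clones)
print(suspect_pairs)
```

Pairs in which both flanks are similar are not reported, since they correspond to redundant clones and not to shuffled ones. A fuller version would also compare each flank with the reverse complement of the others, because clones may be inserted in either orientation.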
To minimize PCR recombination, it helps to recall when it occurs. Premature extension products compete with the normal PCR primers for target sequences, and their chance of success increases with each round of amplification as the concentration of primers available to the target DNA progressively decreases; recombination events therefore occur late in the PCR (Judo et al., 1998). Our procedure for A. longirostri microsatellite development was performed with 30 PCR cycles with 2 min elongation steps. To reduce the recombination ratio, we will in future adopt, and we recommend, a reaction with fewer cycles, a longer elongation time and the replacement of Taq DNA polymerase by a polymerase with higher processivity. These recommendations are valid for any protocol that includes a PCR step before cloning, including the popular ones relying on selective hybridization.
For protocols that capture microsatellites with streptavidin-coated beads, the possibility of probes functioning as primers can be drastically reduced by using a lower concentration of biotinylated oligonucleotide in the hybridization step, by eluting at 80 °C when recovering fragments containing microsatellites and by adding an extra 20 µg of beads to the recovered DNA (to interact with any probe remaining in solution) followed by another elution step. Also, beads should not be present in the PCR. Alternatively, a 3' biotinylated oligonucleotide can be used (Koblízková et al., 1998).
Here, we propose a mechanism that can explain the unfruitful loci for which the designed primer pairs have failed to amplify microsatellite markers in several studies, and we also recommend means to avoid some pitfalls in microsatellite development.
Future studies of PCR-mediated recombination with different microsatellite loci as templates would be of great value to estimate recombination ratios and evaluate factors that affect the formation of abortive extension products for repetitive sequences.
We gratefully acknowledge Dr. Élgion Loreto for the sequencing service, and the Brazilian Agency Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for financial support.
Altschul SF, Gish W, Miller W, Myers EW and Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403-410.
An HS and Han SJ (2006) Isolation and characterization of microsatellite DNA markers in the Pacific abalone, Haliotis discus hannai. Mol Ecol Notes 6:11-13.
Armour JA, Neumann R, Gobert S and Jeffreys AJ (1994) Isolation of human simple repeat loci by hybridization selection. Hum Mol Genet 3:599-605.
Ashelford KE, Chuzhanova NA, Fry JC, Jones AJ and Weightman AJ (2005) At least 1 in 20 16S rRNA sequence records currently held in public repositories is estimated to contain substantial anomalies. Appl Environ Microbiol 71:7724-7736.
Bhavsar D, Zheng HD and Drysdale J (1994) Chimerism in PCR products from a multigene family. Biochem Biophys Res Commun 205:944-947.
Blanquer A, Uriz MJ and Pascual M (2005) Polymorphic microsatellite loci isolated from the marine sponge Scopalina lophyropoda (Demospongiae, Halichondrida). Mol Ecol Notes 5:466-468.
Bradley RD and Hillis DM (1997) Recombinant DNA sequences generated by PCR amplification. Mol Biol Evol 14:592-593.
Brakenhoff RH, Schoenmakers JG and Lubsen NH (1991) Chimeric cDNA clones: A novel PCR artifact. Nucleic Acids Res 19:1949.
Cesari M, Mularoni L, Scanabissi F and Mantovani B (2004) Characterization of dinucleotide microsatellite loci in the living fossil tadpole shrimp Triops cancriformis (Crustacea Branchiopoda Notostraca). Mol Ecol Notes 4:733-735.
Chambers GK and Macavoy ES (2000) Microsatellites: Consensus and controversy. Comp Biochem Physiol, B 126:455-476.
Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG and Thompson JD (2003) Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res 31:3497-3500.
Coljee VM, Murray HL, Donahue WF and Jarrell KA (2000) Seamless gene engineering using RNA- and DNA-overhang cloning. Nat Biotechnol 18:788-791.
Cronn R, Cedroni M, Haselkorn T, Grover C and Wendel JF (2002) PCR-mediated recombination in amplification products derived from polyploid cotton. Theor Appl Genet 104:482-489.
Gaublomme E, Dhuyvetter H, Verdyck P, Mondor-Genson G, Rasplus JY and Desender K (2003) Isolation and characterization of microsatellite loci in the ground beetle Carabus problematicus (Coleoptera, Carabidae). Mol Ecol Notes 3:341-343.
Hadonou AM, Walden R and Darby P (2004) Isolation and characterization of polymorphic microsatellites for assessment of genetic variation of hops (Humulus lupulus L.). Mol Ecol Notes 4:280-282.
Herrera EA, Chemello ME, Lacey EA, Salas V and Sousa BF (2004) Characterization of microsatellite markers from capybaras, Hydrochoerus hydrochaeris (Rodentia, Hydrochoeridae). Mol Ecol Notes 4:541-543.
Hugenholtz P and Huber T (2003) Chimeric 16S rDNA sequences of diverse origin are accumulating in the public databases. Int J Syst Evol Microbiol 53:289-293.
Hughes M, Russel J and Hollingsworth PM (2002) Polymorphic microsatellite markers for the Socotran endemic herb Begonia socotrana. Mol Ecol Notes 2:159-160.
Judo MSB, Wedel AB and Wilson C (1998) Stimulation and suppression of PCR-mediated recombination. Nucleic Acids Res 26:1819-1825.
Karagyozov L, Kalcheva ID and Chapman VM (1993) Construction of random small-insert genomic libraries highly enriched for simple sequence repeats. Nucleic Acids Res 21:3911-3912.
Kijas JM, Fowler JC, Garbett CA and Thomas MR (1994) Enrichment of microsatellites from the citrus genome using biotinylated oligonucleotide sequences bound to streptavidin-coated magnetic particles. Biotechniques 16:656-662.
Koblízková A, Dolezel J and Macas J (1998) Subtraction with 3' modified oligonucleotides eliminates amplification artifacts in DNA libraries enriched for microsatellites. Biotechniques 25:32-38.
Mottura MC, Finkeldey R, Verga AR and Gailing O (2005) Development and characterization of microsatellite markers for Prosopis chilensis and Prosopis flexuosa and cross-species amplification. Mol Ecol Notes 5:487-489.
Pääbo S, Irwin DM and Wilson A (1990) DNA damage promotes jumping between templates during enzymatic amplification. J Biol Chem 265:4718-4721.
Poteaux C, Bonhomme F and Berrebi P (1999) Microsatellite polymorphism and genetic impact of restocking in Mediterranean brown trout (Salmo trutta L.). Heredity 82:645-653.
Refseth UH, Fangan BM and Jakobsen KS (1997) Hybridization capture of microsatellites directly from genomic DNA. Electrophoresis 18:1519-1523.
Schwartz TS, Jenkins F and Beheregaray LB (2005) Microsatellite DNA markers developed for the Australian bass (Macquaria novemaculeata) and their cross-amplification in estuary perch (Macquaria colonorum). Mol Ecol Notes 5:519-520.
Vincze T, Posfai J and Roberts RJ (2003) NEBcutter: A program to cleave DNA with restriction enzymes. Nucleic Acids Res 31:3688-3691.
von Wintzingerode F, Göbel UB and Stackebrandt E (1997) Determination of microbial diversity in environmental samples: Pitfalls of PCR-based rRNA analysis. FEMS Microbiol Rev 21:213-229.
Zane L, Bargelloni L and Patarnello T (2002) Strategies for microsatellite isolation: A review. Mol Ecol 11:1-16.
Zhou Y, Gu H and Dorn S (2005) Polymorphic microsatellite loci in the parasitic wasp Cotesia glomerata (Hymenoptera, Braconidae). Mol Ecol Notes 5:475-477.
Send correspondence to:
Marlise Ladvocat Bartholomei-Santos
Programa de Pós-graduação em Biodiversidade Animal
Centro de Ciências Naturais e Exatas
Universidade Federal de Santa Maria
97.105-900 Santa Maria, RS, Brazil
Received: April 20, 2007; Accepted: August 13, 2007.
Nucleotide sequence data reported are available in the GenBank database under the accession numbers EF025136-EF025166, EF528186 and EF528187.
Associate Editor: Fausto Foresti