PCR-mediated recombination in development of microsatellite markers: Mechanism and implications

Protocols for microsatellite-enrichment libraries have been widely applied to several species in order to supply the most informative molecular markers for population and inbreeding studies. One drawback of these protocols is the ratio of designed primer pairs that fail to amplify the expected fragment, even after exhaustive optimization attempts. A possible cause of unsuccessful microsatellite primers may be that such loci are artifacts resulting from chimeric PCR products, instead of real genomic sequences. The microsatellite-enriched library constructed for Aegla longirostri (Crustacea, Decapoda, Anomura) showed that 29% of sequenced clones were chimeric products because these sequences shared one of the flanking regions around the same repeat motif but not the other. PCR-mediated recombination is a well-known event described for several procedures in which related sequences are used as a template. We have associated this phenomenon with microsatellite marker development. This study explained the high ratio of recombinant sequences generated in the A. longirostri microsatellite-enriched library. We discuss the mechanism and implications of PCR chimeric-product formation during microsatellite isolation.

Microsatellites or simple sequence repeats (SSR) are genetic markers widely used in individual identification and population-level analysis because of their high power of genetic resolution (Chambers and Macavoy, 2000). Several techniques for microsatellite isolation have been developed, and in a recent review of the most frequently used strategies for microsatellite isolation, Zane et al. (2002) noted that selective hybridization protocols are extremely popular, being used in over 25% of all reviewed articles and in 70% of those employing enrichment steps. The basic protocol was proposed by Karagyozov et al. (1993), Armour et al. (1994) and Kijas et al. (1994). Selective hybridization is performed by using an oligonucleotide probe containing the repeat motif to be isolated. The probe can be cross-linked to a nylon membrane or can be biotinylated at one end so that DNA hybridized to the probe can be selectively captured using streptavidin-coated magnetic beads. The use of a biotinylated probe is generally preferable because of its greater efficiency in hybridizing to the target DNA. After selective hybridization, recovered fragments are amplified using polymerase chain reaction (PCR) and cloned using standard methods for further sequencing (Zane et al., 2002).
Despite modern technology, results are not always satisfactory: the percentage of designed primer pairs that are successfully optimized is variable and generally low (Table 1). However, because publications about microsatellite isolation are generally limited to notes, authors usually focus on amplifiable loci and do not discuss the possible factors behind loci that did not work. We believe that the failure of loci to amplify might be attributable to artifacts of the PCR-based isolation process.
Chimeric PCR products can arise when sequences with relatively high similarity are present in a reaction as templates. An incompletely extended primer in the elongation phase of the PCR cycle generates a shorter nascent strand, which, in a subsequent cycle, can prime off a heterologous target sequence and be completely extended. The chimeric sequence produced has its 5' end corresponding to the first template and its 3' end to the heterologous template (Bhavsar et al., 1994).
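The template-switching step described above can be illustrated with a toy string model. This is only a sketch under simplifying assumptions: both templates are written as top strands in the same orientation, annealing is reduced to an exact match of the last few bases of the incomplete strand, and the function name, the `anchor_len` parameter and the two example loci are hypothetical, not taken from the study.

```python
def template_switch(template_a, template_b, stop_in_a, anchor_len=8):
    """Toy model of PCR-mediated recombination (top-strand view).

    Extension on template_a aborts after `stop_in_a` bases. In the next
    cycle the 3' end of that incomplete strand (its last `anchor_len`
    bases) anneals to the first exact match on template_b and is extended
    to the end of template_b. Returns the chimeric strand, or None when
    the 3' end finds no match on template_b.
    """
    partial = template_a[:stop_in_a]
    if len(partial) < anchor_len:
        return None  # too short to anneal in this simplified model
    anchor = partial[-anchor_len:]
    pos = template_b.find(anchor)
    if pos == -1:
        return None  # 3' end lies in a region not shared with template_b
    return partial + template_b[pos + anchor_len:]


# Two hypothetical loci that share only a (CA)10 repeat.
locus_a = "GGATTCAGGT" + "CA" * 10 + "TTGACCGTAA"
locus_b = "CCTAGGTTAC" + "CA" * 10 + "AGGCTTAGCA"

# Extension stops inside the repeat of locus A (10 flank bases + 6 CA units).
chimera = template_switch(locus_a, locus_b, stop_in_a=22)
print(chimera)

# Extension stops inside the unique 5' flank: no shared 3' end, no chimera.
print(template_switch(locus_a, locus_b, stop_in_a=9))
```

In this model the product carries the 5' flank of locus A and the 3' flank of locus B, and the repeat length of the chimera depends on where the incomplete strand happens to anneal, so it need not equal the repeat length of either parent.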
Formation of hybrid sequences by PCR has been reported for attempts to amplify genes belonging to multiple families (Bhavsar et al., 1994), characterization of alleles from a heterozygous subject (Bradley and Hillis, 1997), forensic applications in which the template is generally ancient DNA (Pääbo et al., 1990), reverse transcription (Brakenhoff et al., 1991), in PCR-derived clones from polyploid genomes (Cronn et al., 2002), and for environmental DNA samples (von Wintzingerode et al., 1997). Even false sequences resulting from chimeric PCR have been deposited in public databases (Hugenholtz and Huber, 2003; Ashelford et al., 2005). On the other hand, a chimeric PCR product is a powerful tool widely used to create recombinant molecules in biotechnology assays (Coljee et al., 2000).
In our attempt to isolate microsatellite sequences from the South American freshwater crab Aegla longirostri, of the 61 clones obtained, 13 lacked sufficient flanking sequences for primer design and two were repeated (identical clones). Moreover, 17 clones were doubtful because they were a type of "shuffled" sequence, with varying combinations of identical flanking regions around the (CA)n repeat, a pattern which attracted our attention to these clones.
Suspicion of chimeric products led us to an exhaustive search for such phenomena related to studies on the development of microsatellites. Despite the evident propensity of microsatellites to produce PCR-based recombination because of the presence of the same motif in all templates, few studies have reported the possibility of such an event occurring (Refseth et al., 1997; Koblízková et al., 1998; Poteaux et al., 1999; Hughes et al., 2002). Except for the study of Koblízková et al. (1998), none of the studies mentioning the possibility of chimeras in microsatellite development proposed a mechanism to explain this phenomenon or associated its occurrence with unsuccessfully amplifiable microsatellite loci. Koblízková et al. (1998) proposed that chimeric PCR products result from elongation of free oligonucleotide probes, which would first generate amplification fragments lacking one flanking region and then produce chimeric products in subsequent cycles. They also suggested that the problem could be eliminated using 3' modified oligonucleotides.
Our present paper reports and discusses a surprising result from the microsatellite-development procedure, and we propose a mechanism for chimeric microsatellite loci which differs from that proposed by Koblízková et al. (1998). Several methodologies for microsatellite isolation are based on PCR amplification and are hence liable to form chimeric products. We therefore also discuss ways to detect and avoid chimeric clones, which may have been responsible for many of the literature reports of microsatellite primers that failed to amplify the expected products despite attempts to optimize PCR conditions.
We developed microsatellites following the method described by Refseth et al. (1997), with some modifications. Genomic DNA from Aegla longirostri Bond-Buckup & Buckup, 1994 was digested with TaqI (CenBiot) and fragments (500 ng) were ligated to an adapter (25 μM) using T4 DNA ligase (1 U) (Invitrogen) at 4 °C for 16 h. The adapter oligo sequences used were: TaqI20Mer (5'-ATGAAGCCTTGGTACTGGAT-3') and TaqI22Mer (5'-pCGATCCAGTACCAAGGCTTCAT-3'). About 100 ng of the DNA ligated to the adapter was hybridized to a 5' biotinylated probe (CA)8 (0.4 μM) (MWG) in TE/NaCl buffer (10 mM Tris-HCl, 1 mM EDTA, 1 M NaCl) containing the oligonucleotide TaqI20Mer (2 μM). The DNA was denatured by incubating at 95 °C for 10 min, followed by incubation at 60 °C for 1 min in order to allow the biotinylated probe to hybridize to the target DNA. To capture the fragments hybridized to the probe we used the affinity of the biotin in the probe for the streptavidin-coated magnetic beads (Dynabeads M-280 Streptavidin, Dynal, Norway) by incubating 100 μg of beads for 30 min at room temperature with the hybridized DNA in TE/NaCl buffer. The beads were then washed 3 times in 2x standard saline citrate containing 0.1% (w/v) sodium dodecyl sulfate at 50 °C for 10 min and once in TE/NaCl at room temperature to remove unbound DNA and excess oligomers. The immobilized single-stranded DNA was eluted from the beads in 50 μL of distilled water at 90 °C for 5 min. Recovered DNA was PCR-amplified in a 50 μL reaction containing 10 μL of the captured fragments (without beads), 10 pmol of oligonucleotide TaqI20Mer (MWG), 2.5 U of Taq DNA polymerase (Invitrogen), 100 μM of each dNTP (Invitrogen) and Taq DNA polymerase buffer (10 mM Tris-HCl pH 8.5; 50 mM KCl; 4 mM MgCl2). Reactions were denatured for 5 min at 95 °C before amplification using 30 cycles of 1 min at 95 °C, 30 s at 61 °C and 2 min at 72 °C, followed by a final extension of 8 min at 72 °C. The amplification products were purified by polyethylene glycol precipitation and cloned using a TA Cloning kit (Invitrogen). Positive clones, checked by PCR for the presence of an insert, were sequenced using a MegaBACE 500 sequencer (Amersham Biosciences). Before the primer design, the repeat was removed and all 3' and 5' flanking regions around the (CA)n repeat were aligned as separate queries using ClustalW (Chenna et al., 2003), which calculates pairwise scores as the number of identities in the best alignment divided by the number of residues compared (percentage identity scores).
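The percentage identity score defined above (identities in the best alignment divided by residues compared) can be sketched as follows. This is a minimal illustration and not ClustalW itself: the alignment is a plain Needleman-Wunsch global alignment, and the scoring values (match 1, mismatch -1, gap -2) and function names are assumptions chosen for the example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identities divided by residues compared (gap columns are not compared)."""
    gapped_a, gapped_b = global_align(a.upper(), b.upper())
    compared = [(x, y) for x, y in zip(gapped_a, gapped_b)
                if x != "-" and y != "-"]
    if not compared:
        return 0.0
    identical = sum(x == y for x, y in compared)
    return 100.0 * identical / len(compared)


# Hypothetical flanking fragments differing by a single substitution.
print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 87.5
```

Scores produced this way will not match ClustalW output exactly, because ClustalW uses its own pairwise alignment parameters, but the definition of the score is the same.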
The first microsatellite-enriched library that we developed yielded 62 positive colonies. Three of 32 sequenced clones did not contain repeats. All the others showed only one flanking region to the microsatellite. These sequences originated from internal priming by biotinylated oligonucleotide probe that had leaked from the magnetic beads. Absence of the adapter sequence at the microsatellite end of the insert confirms the repeat as the primer site.
In order to avoid carrying over oligonucleotide probes with the recovered DNA, we adopted the following strategies: reduction of probe concentration from 0.4 μM to 0.3 μM; reduction of the elution temperature from 90 °C to 80 °C (high temperatures could break the strong bond between the biotin in the probe and the streptavidin bound to the magnetic beads); and an additional elution step, in which 20 μg of magnetic beads was added to the eluted DNA to interact with any biotinylated probe remaining in solution.
Our second attempt, using the modifications above, yielded 61 positive clones, of which only two resulted from free oligonucleotide extension. Different clones that showed high identity for both upstream and downstream regions around the (CA)n repeat were considered redundant (two cases). However, the outcome revealed that some inserts shared one of the flanking regions, but not the other. These 17 doubtful sequences corresponded to 28.8% of our clones and were grouped in six subsets (A to F), each containing clones that shared the same flanking region, as shown in Table 2, with high alignment scores. Only the 5' end of clone AlCA112 showed relatively low identity with the related sequences 5' AlCA121, 5' AlCA124 and 5' AlCA166.
To assess the possibility that these sequences were part of repetitive genes, BLAST analyses (Altschul et al., 1990) were performed, but no matches were found. Sequences were also submitted to NEBcutter V 2.0 (Vincze et al., 2003) to check the presence of sites for the TaqI endonuclease that was used for the DNA digestion before isolation, which could represent a point of ligation between two different loci that had been cloned together. No sites for TaqI were found in these shuffled sequences. We suggest that the mechanism implicated in chimeric microsatellite loci is the in vitro recombination events occurring during the PCR preceding cloning. A captured microsatellite locus not completely extended (i.e., extended only to the (CA)n repeat) in one cycle annealed its 3' end with a (TG)n repeat of another microsatellite locus in the subsequent cycle, functioning as a priming site for subsequent extension. The generated nascent strand is a chimera formed by the 5' flanking region of one locus, a (CA)n repeat that was the crossover point and a 3' flanking region of another locus (Figure 1). This chimera does not represent a contiguous sequence present in A. longirostri genome and primers designed for these PCR artifacts will certainly not amplify.
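The restriction-site check can be approximated with a simple scan. This sketch is not the NEBcutter analysis that was actually used; it relies only on the fact that TaqI recognizes the palindromic site TCGA, and the function name and example sequences are hypothetical. The sequence passed in is assumed to be the genomic insert with the adapter sequences already trimmed.

```python
import re

TAQI_SITE = "TCGA"  # TaqI recognition sequence; palindromic, so one strand suffices


def taqi_sites(insert):
    """Return 0-based start positions of every TaqI site in the insert.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    return [m.start() for m in re.finditer("(?=" + TAQI_SITE + ")", insert.upper())]


# Hypothetical inserts: one with an internal site, one without.
with_site = "GGATTCAGGT" + "CA" * 8 + "TTGATCGAAC"
without_site = "GGATTCAGGT" + "CA" * 8 + "TTGACCGTAA"

print(taqi_sites(with_site))     # [30]
print(taqi_sites(without_site))  # []
```

An internal site would suggest that two TaqI fragments were ligated together before cloning; an empty result, as reported for the shuffled clones, argues against that explanation.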
Despite the high alignment scores (> 92% of identity) for most of the flanking regions analyzed by us, the identity was always less than 100% (Table 2). This may have been due to misincorporation of deoxynucleotides by Taq DNA polymerase during PCR or sequencing errors. However, some of our chimeric clones, which showed similarity scores less than 100%, could be generated if the incomplete extension during amplification passed beyond or stopped before the microsatellite and the 3' portion of the sequence acted as a primer in a subsequent cycle (Figure 2a). Clone AlCA111 shares its 5' flanking region with sequences 3'AlCA95, 5'AlCA116 and 5'AlCA161 until position 70 (TG), where the repeat begins for that clone, but not for the others (Figure 2b). Following the alignment shown in Figure 2b, the same was observed for sequence AlCA161 at position 85. Probably, an incompletely extended strand until these points (70 and 85), finishing in TG, primed off another microsatellite locus and generated a chimeric sequence.
The above clones constituted the main reason to believe that PCR recombination is the cause of chimerism in microsatellite development, because they showed a few differences at their flanking end shared with other clones, although they were clearly related. The mechanism described above for clones AlCA111, AlCA95, AlCA116 and AlCA161 could not be attributed to products generated by a contaminant repetitive probe, as suggested by Koblízková et al. (1998), because the latter can only generate identical flanking regions.
Chimeric sequences can be detected by aligning the microsatellite flanking ends, after removing the repetitive region, and looking for shuffled sequences originating from a chimeric PCR. The low probability of obtaining related recombinant sequences when only a subset of cloned PCR products is sequenced, or if a small number of clones is obtained from an isolation procedure, should be taken into account. The more clones are sequenced, the more probable is the detection of chimeras. If a flanking region is not present in shuffled clones, it can be assumed that it is real and it can be considered for primer design. Because much effort, time and money are often employed in primer design and optimization of PCR conditions for microsatellite loci, we strongly suggest the alignment of flanking regions of repeats in order to detect these artifacts.
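The screening procedure suggested above can be sketched as a short script. It is an illustration under stated assumptions rather than the procedure used in this study: `difflib.SequenceMatcher` stands in for a proper alignment score, the 0.9 threshold and the minimum of five repeat units are arbitrary choices, flanks are also compared as reverse complements in case clones were sequenced in opposite orientations, and all clone names and sequences are hypothetical.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_flanks(seq, min_units=5):
    """Remove the longest (CA)n or (TG)n run; return (5' flank, 3' flank)."""
    seq = seq.upper()
    pattern = r"(?:CA){%d,}|(?:TG){%d,}" % (min_units, min_units)
    runs = list(re.finditer(pattern, seq))
    if not runs:
        return None
    longest = max(runs, key=lambda m: m.end() - m.start())
    return seq[:longest.start()], seq[longest.end():]


def similar(a, b, threshold=0.9):
    """True if a resembles b or its reverse complement (stand-in score)."""
    if not a or not b:
        return False
    scores = (SequenceMatcher(None, a, x, autojunk=False).ratio()
              for x in (b, revcomp(b)))
    return max(scores) >= threshold


def classify_pair(flanks_i, flanks_j, threshold=0.9):
    """'redundant' if both flanks match, 'shuffled' if only one does."""
    shares_one = False
    for a in (0, 1):
        for b in (0, 1):
            if similar(flanks_i[a], flanks_j[b], threshold):
                if similar(flanks_i[1 - a], flanks_j[1 - b], threshold):
                    return "redundant"
                shares_one = True
    return "shuffled" if shares_one else "unrelated"


def find_shuffled(clones, threshold=0.9):
    """Return sorted pairs of clone names sharing exactly one flank."""
    flanks = {name: split_flanks(seq) for name, seq in clones.items()}
    flanks = {name: f for name, f in flanks.items() if f is not None}
    return [(a, b) for a, b in combinations(sorted(flanks), 2)
            if classify_pair(flanks[a], flanks[b], threshold) == "shuffled"]


# Hypothetical flanks; F1_MUT differs from F1 by one substitution.
F1, F1_MUT = "GGATTCAGGTCCATAGCTAG", "GGATTCAGGTCCTTAGCTAG"
F2, F3 = "TTGACCGTAAGCTTAGGCAT", "AGGCTTAGCAATCCGGTATC"
F4, F5 = "CCTAGGTTACGATCGGAAGT", "GTCCAAGTTCGGATACCTGA"
clones = {
    "cloneA": F1 + "CA" * 8 + F2,
    "cloneB": F1_MUT + "CA" * 10 + F3,  # shares only the 5' flank with cloneA
    "cloneC": F4 + "CA" * 9 + F5,       # unrelated locus
}
print(find_shuffled(clones))
```

Flanks that never appear in a flagged pair are the candidates for primer design; flagged pairs still need inspection, since this screen cannot tell which member of a pair, if either, reflects the genomic arrangement.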
To minimize PCR recombination, it should be remembered that recombination events occur late in the PCR reaction: as the concentration of normal PCR primers available to target DNA progressively decreases, the possibility that premature extension products compete successfully with them for target sequences increases with each subsequent round of amplification (Judo et al., 1998). Our procedure for A. longirostri microsatellite development was performed with 30 PCR cycles with 2 min elongation steps. In an attempt to minimize the recombination ratio, we will, in future, adopt and recommend a reaction with fewer cycles, longer elongation time and replacement of Taq DNA polymerase by a polymerase with higher processivity. These recommendations are valid for any protocol that performs a PCR step before cloning, including the popular ones relying on selective hybridization.
For protocols using capture of microsatellites with streptavidin-coated beads, the possibility of probes functioning as primers can be drastically reduced by using a lower concentration of biotinylated oligonucleotide in the hybridization step, eluting at 80 °C when recovering fragments containing microsatellites, and adding an extra 20 μg of beads to the recovered DNA (to interact with the probe remaining in solution) followed by another elution step. Also, beads should not be present in PCR. Alternatively, a 3' biotinylated oligonucleotide can be used (Koblízková et al., 1998).
Here, we propose a mechanism that can explain the unfruitful loci for which the designed primer pairs have failed to amplify microsatellite markers in several studies, and we also recommend means to avoid some pitfalls in microsatellite development.
Future studies of PCR-mediated recombination with different microsatellite loci as templates would be of great value to estimate recombination ratios and evaluate factors that affect the formation of abortive extension products for repetitive sequences.
Table 2 - Identity scores (% sequence similarity) for microsatellite flanking regions shared among different clones obtained from an Aegla longirostri microsatellite-enriched library. The 5' and 3' flanking sequences of each microsatellite were analyzed separately, without the (CA) repeat. Six groups of clones (A to F) were arranged according to the same shared flanking region. Scores for the non-shared flanking sequences of these shuffled clones are also shown, denoting low identity among them. The upstream flanking region (5') and downstream flanking region (3') are shown with the sequence length in base pairs given in parentheses.

Figure 1 - Schematic representation of three successive PCR cycles related to formation of chimeric products. The illustrations represent two isolated microsatellite loci with the same repeat motif (light hatched region) but different flanking regions. White boxes show flanking regions of locus A, while black boxes show the flanking regions of locus B. Adapters functioning as primers are shown as checkered boxes. In cycle n, an incompletely extended strand (IES) was produced from locus A, which ends in the repeat and serves as a primer for locus B in cycle n + 1. A chimeric strand is generated, which becomes double-stranded in the next cycle (n + 2) and which can be amplified in the following cycles (not represented).

Table 1 - Variable efficiency of microsatellite-enrichment protocols, based on the percentage of designed primer pairs for different loci that successfully amplified the target region, according to different methodologies, which include a PCR step before cloning. Publications were randomly chosen as examples and the table is ordered by decreasing efficiency.