Sequence diversity and copy number variation of Mutator-like transposases in wheat

Partial transposase-coding sequences of Mutator-like elements (MULEs) were isolated from a wild einkorn wheat, Triticum urartu, by degenerate PCR. The isolated sequences were classified into a MuDR or Class I clade and divided into two distinct subclasses (subclass I and subclass II). The average pair-wise identity between members of the two subclasses was 58.8% at the nucleotide sequence level. Sequence diversity of subclass I was larger than that of subclass II. DNA gel blot analysis showed that subclass I was present as low copy number elements in the genomes of all Triticum and Aegilops accessions surveyed, while subclass II was present as high copy number elements. These two subclasses seemed incapable of recognizing each other for transposition. The number of copies of subclass II elements was much higher in Aegilops with the S, Sl and D genomes and in polyploid Triticum species than in diploid Triticum with the A genome, indicating that active transposition occurred in the S, Sl and D genomes before polyploidization. DNA gel blot analysis of six species selected from three subfamilies of Poaceae demonstrated that only the tribe Triticeae possessed both subclasses. These results suggest that the differentiation of these two subclasses occurred before or immediately after the establishment of the tribe Triticeae.


Introduction
The Mutator trait, characterized by a high frequency of forward mutations, was first identified in a single maize line (Robertson 1978). Like the Ac/Ds and Spm/dSpm systems, Mutator activity is regulated by a two-component system composed of two DNA-type transposable elements, MuDR and Mu (Bennetzen 1996). MuDR is an autonomous element, while Mu is a non-autonomous deletion derivative of MuDR. Mutator-like elements (MULEs) have been identified in diverse plant species including both monocots and dicots (Mao et al. 2000; Yu et al. 2000; Lisch et al. 2001; Rossi et al. 2001). Transposases sharing a homologous domain with MURA (the mudrA product) were also found in prokaryotes (Eisen et al. 1994), and a MULE named Hop has recently been isolated from the fungus Fusarium oxysporum (Chalvet et al. 2003). It is thus apparent that Mutator constitutes a widespread superfamily.
Molecular features and transposition mechanisms of Mutator have been extensively studied (for reviews see Lisch 2002; Walbot and Rudenko 2002). MuDR is 4.9 kbp long and possesses terminal inverted repeats (TIRs) of around 200 bp. During insertion, a 9-bp duplication of the recipient DNA is generated. MuDR carries two genes, mudrA and mudrB. The former encodes the MURA transposase that catalyzes the excision of Mutator (Eisen et al. 1994; Benito and Walbot 1997) and the latter encodes the MURB protein that is proposed to be involved in the reinsertion of Mutator (Lisch et al. 1999; Raizada and Walbot 2000). However, the sequence corresponding to mudrB has only been identified in the genus Zea so far (Lisch et al. 1995; Walbot and Rudenko 2002). Transposition activity of Mutator is differentially regulated in somatic and germinal tissues (Lisch et al. 1995; Raizada et al. 2001). Mutator uses a cut-and-paste mechanism in somatic tissues, whereas in germinal tissues it appears to transpose either by a gap-repair mechanism or by a semi-conservative and duplicative transposition mechanism. Consequently, numerous copies of Mutator can accumulate in a given genome.
Wheat consists of a series of species with different ploidy levels, and their genome differentiation and evolutionary history through allopolyploidization are well known (Kihara and Tanaka 1970; Kimber and Sears 1983; Feldman et al. 1995). Wheat is therefore an informative material for studying MULE dynamics in relation to genome differentiation and evolution through allopolyploidization. Partial sequences of MULE transposases were isolated by PCR from several grass species including common wheat (Lisch et al. 2001). Genome sequencing analysis led to the identification of a few MULEs in einkorn wheat (Yan et al. 2002), durum wheat (Wicker et al. 2003) and common wheat (Chantret et al. 2005), as well as in Aegilops tauschii (synonymous with Ae. squarrosa), the wild D genome donor species of common wheat (Li et al. 2004; Chantret et al. 2005). However, little is known about MULE dynamics in Triticum and Aegilops.
We previously identified and characterized MULEs in rice (Yoshida et al. 1998; Asakura et al. 2002). One of these rice MULEs, OsMu4-2, carried a transcriptionally active gene encoding a putative transposase with a significantly high identity to maize MURA. In this study, we performed degenerate PCR to isolate partial sequences of MULE transposases from Triticum urartu, a donor of the A genome to two tetraploid wheat groups (emmer and timopheevi) (Tsunewaki and Nakamura 1995). A pair of primers was designed based on the sequences of MURA and a putative transposase of OsMu4-2. The isolated sequences possessed a conserved transposase-coding domain and were divided into two subclasses with different sequence diversity. DNA gel blot analysis in Triticum and Aegilops species revealed marked copy number variation between the two subclasses. Furthermore, the copy number of MULEs of the high copy number subclass differed greatly between diploid Triticum and Aegilops species. In addition, we studied the distribution of the two MULE subclasses in other grass species.

Plant materials
Wild einkorn wheat, Triticum urartu (accession no. KU199-2, Plant Germplasm Institute, Kyoto University, Japan, Table 1), was used as a donor of template DNA for degenerate PCR amplification of the conserved transposase-coding region of MULEs. Twenty-eight accessions of Triticum and Aegilops species, including KU199-2, were used for DNA gel blot analysis. Oryza sativa (rice) from the subfamily Ehrhartoideae, Secale cereale (rye), Hordeum vulgare (barley) and Avena sativa (oat) from the subfamily Pooideae, Zea mays (maize) and Sorghum bicolor (sorghum) from the subfamily Panicoideae, were also used for DNA gel blot analysis.
Degenerate PCR and sequencing of the amplified DNA fragments
Total DNA was extracted from young leaves of KU199-2 with the DNeasy Plant Maxi Kit (QIAGEN, Hilden, Germany). Degenerate primers were designed after comparing amino acid sequences of the conserved transposase region of the maize MURA and a putative transposase of rice OsMu4-2 (Asakura et al. 2002). The nucleotide sequences of the forward and reverse primers were 5'-GAYGGICAYAAYTGGATG-3' and 5'-GTGATRWARTCRCAYTTDAT-3', respectively. Coding sequences corresponding to the region from MURA amino acid residue 350 through residue 484 were amplified. The PCR amplification was carried out in 50 μL of a reaction mixture containing 2.5 U of Taq polymerase, 1.0 μM of primers, 50 ng of total DNA, 0.2 mM of each dNTP, 1.5 mM of MgCl2 and reaction buffer. PCR was performed in a GeneAmp PCR 9700 (Applied Biosystems, Foster City, CA) as follows: a pre-denaturation step of 1 min at 94°C; 30 cycles of denaturation for 1 min at 94°C, annealing for 2 min at 55°C and extension for 2 min at 72°C; followed by a post-extension incubation for 3 min at 72°C. Amplified DNA fragments were cloned into the pGEM-T Easy vector (Promega, Madison, Wis.) and used to transform Escherichia coli strain DH5α. In order to exclude clones with unexpected DNA fragments, the length of the inserted DNA fragments was examined by direct colony PCR followed by agarose-gel electrophoresis. The inserts were sequenced by the dideoxynucleotide chain-termination method. Individual sequences of putative MULE transposases were designated MoTU, according to the nomenclature system of Lisch et al. (2001). Mo is the prefix of maize mudrA-like sequences. T and U are the initials of Triticum urartu.
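As an illustration of what "degenerate" means for the two primers given above (read without internal whitespace), the following Python sketch counts how many distinct oligonucleotides each primer notation represents. It is not part of the original analysis. Treating inosine (I) as a single residue rather than a mixture is an assumption of this sketch, and the function names are hypothetical.

```python
from math import prod

# Number of standard bases represented by each IUPAC nucleotide code.
# Assumption: "I" (inosine) is counted as one residue, because it is a
# single synthesized base that pairs broadly, not a mixture of bases.
IUPAC_COUNTS = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

# Primer sequences from the text, written 5' to 3' without whitespace.
FORWARD = "GAYGGICAYAAYTGGATG"
REVERSE = "GTGATRWARTCRCAYTTDAT"


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences a degenerate primer encodes."""
    seq = primer.replace(" ", "").upper()
    return prod(IUPAC_COUNTS[base] for base in seq)


def coding_span_nt(first_residue: int, last_residue: int) -> int:
    """Return the nucleotide length of an inclusive range of codons."""
    return 3 * (last_residue - first_residue + 1)


if __name__ == "__main__":
    print("forward degeneracy:", degeneracy(FORWARD))
    print("reverse degeneracy:", degeneracy(REVERSE))
    # MURA residues 350 through 484, as stated in the text.
    print("coding span (nt):", coding_span_nt(350, 484))
```

Under these assumptions the forward primer is 8-fold and the reverse primer 96-fold degenerate, and the targeted MURA region (residues 350 through 484) spans 405 nucleotides of coding sequence.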

Sequence analyses
Putative amino acid sequences deduced from the nucleotide sequences of MoTUs were analyzed using the NCBI conserved domain search (Marchler-Bauer and Bryant 2004) in order to confirm that they coded for MULE transposases. Homologous sequences from Triticum and Aegilops were searched in the TIGR Wheat Genome Database using BLASTn and tBLASTx (Altschul et al. 1997). Two other Mutator-related elements, Trap (Comelli et al. 1999) and Jittery (Xu et al. 2004), have been identified in maize. Mutator-like transposases were divided into four classes in sugarcane, rice and Arabidopsis (Rossi et al. 2004). Nucleotide sequences of the conserved transposase region from MuDR, Trap, Jittery and representative clones, TE165, TE109, TE266 and TE148, from each of the four MULE classes of sugarcane were used as queries in the homology search. To characterize MoTUs, individual sequences were compared to the query sequences, to the sequences of MULEs found in the homology search and to OsMu4-2. Multiple alignments of coding sequences (amino acid residues 385 to 477 of MURA) were carried out using CLUSTAL X ver. 1.83 (Thompson et al. 1997). Phylogenetic analysis was performed by the neighbor-joining (NJ) method (Saitou and Nei 1987) using MEGA ver. 3.1 (Kumar et al. 2004).
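For readers unfamiliar with the neighbor-joining method named above, the following Python sketch implements the basic algorithm on a small distance matrix. It is a generic textbook version under stated assumptions, not the MEGA implementation, and the five-taxon matrix is a standard illustrative example rather than data from this study. The function name and the internal node labels ("node0", "node1", ...) are hypothetical.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Run neighbor joining on a symmetric distance matrix.

    Returns a list of joins (child_a, child_b, length_a, length_b, parent).
    The last entry connects the two remaining nodes with one branch and
    has parent None.
    """
    d = {frozenset((names[i], names[j])): float(matrix[i][j])
         for i, j in combinations(range(len(names)), 2)}
    nodes = list(names)
    joins = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i)
             for i in nodes}
        # Join the pair that minimizes the Q criterion.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        dij = d[frozenset((i, j))]
        # Branch lengths from the joined pair to their new parent node.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        parent = f"node{next_id}"
        next_id += 1
        # Distances from the new parent to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((parent, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [parent]
        joins.append((i, j, li, lj, parent))
    a, b = nodes
    joins.append((a, b, d[frozenset((a, b))], 0.0, None))
    return joins


# Illustrative additive distance matrix (not data from this study).
EXAMPLE_NAMES = ["a", "b", "c", "d", "e"]
EXAMPLE_MATRIX = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

if __name__ == "__main__":
    for join in neighbor_joining(EXAMPLE_NAMES, EXAMPLE_MATRIX):
        print(join)
```

In practice the input distances would be computed from the multiple alignment, and support for each node would be assessed by bootstrap resampling of alignment columns, which this sketch does not cover.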

DNA gel blot analysis
DNA gel blot analysis was carried out to study the distribution of MULE transposases related to MoTUs in Triticum and Aegilops species and the distribution of homologous sequences among six other grass species. Total DNA was extracted from young leaves of all accessions using the DNeasy Plant Maxi Kit (QIAGEN). Genomic DNA of wheat, rye, barley and oat (10 μg), and of rice, maize and sorghum (5 μg) was digested with HindIII. The digests were fractionated by electrophoresis through 0.8% agarose gels and transferred onto nylon membranes using the alkaline blotting method. Two clones, MoTU-12 and MoTU-32, were used as probes. Labeling, hybridization and signal detection were performed using the Gene Images Random-Prime Labelling and Detection System (Amersham Biosciences) according to the manufacturer's instructions with slight modifications. Prehybridization and hybridization were conducted for 5 and 18 h, respectively, at 65°C in buffer containing 5× SSC, 0.1% SDS, 5% dextran sulfate and 5% blocking reagent. Membranes were washed twice for 25 min at 65°C in a solution containing 1× SSC and 0.1% SDS, followed by two 25-min washes at 65°C in 0.1× SSC and 0.1% SDS (high stringency). Signals were detected after exposure of the membranes to X-ray film for about 1 h.
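To give a rough sense of why the 0.1× SSC wash counts as high stringency, the sketch below estimates how much the melting temperature of a long DNA duplex drops when the salt concentration is lowered from 1× to 0.1× SSC. It is not taken from the original methods. It assumes the commonly used empirical salt term of 16.6 × log10[Na+] and takes 1× SSC as 0.15 M NaCl plus 0.015 M trisodium citrate (0.195 M Na+). The sodium contributed by SDS is ignored, and the function names are hypothetical.

```python
from math import log10

# Assumption: 1x SSC = 0.15 M NaCl + 0.015 M trisodium citrate,
# i.e. 0.15 + 3 * 0.015 = 0.195 M sodium ions.
SSC_1X_SODIUM_MOLAR = 0.195

# Assumption: empirical salt coefficient (degrees C per log10 unit of [Na+])
# used in common Tm approximations for long DNA duplexes.
SALT_COEFFICIENT = 16.6


def sodium_molar(ssc_fold: float) -> float:
    """Return the approximate sodium concentration of an SSC dilution."""
    return ssc_fold * SSC_1X_SODIUM_MOLAR


def tm_shift(ssc_from: float, ssc_to: float) -> float:
    """Return the estimated change in Tm (degrees C) between two SSC strengths."""
    return SALT_COEFFICIENT * log10(sodium_molar(ssc_to) / sodium_molar(ssc_from))


if __name__ == "__main__":
    # Moving from the 1x SSC wash to the 0.1x SSC wash at the same temperature.
    print(f"Estimated Tm change, 1x -> 0.1x SSC: {tm_shift(1.0, 0.1):.1f} C")
```

Under these assumptions, a tenfold dilution of SSC lowers the estimated Tm by about 16.6°C, so at a fixed wash temperature of 65°C only better-matched hybrids are expected to remain bound. The approximation is usually applied only at moderate salt concentrations and should be read as a guide, not as a measured value.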

Isolation of MULE transposase-coding sequences from Triticum urartu
Degenerate PCR was carried out to isolate partial coding sequences of MULE transposases from T. urartu (KU199-2, Table 1). After cloning the amplified DNA fragments into plasmids, the length of each inserted sequence was measured by direct colony PCR followed by agarose-gel electrophoresis.

Classification of MoTUs
MULEs of Triticum and Aegilops species were searched in the TIGR Wheat Genome Database. Twenty-nine MULE sequences significantly homologous to at least one of the three maize elements (MuDR, Trap and Jittery) and four sugarcane MULEs (TE165, TE109, TE266 and TE148) were found (Table 2). A phylogenetic analysis resulted in the classification of the MULE sequences into five clades (Figure 1). MULEs of Triticum and Aegilops species were found in all clades except the Trap or Class II clade. Of the 29 wheat MULEs identified by homology search, 19 were classified into the Jittery clade. All MoTUs were classified into a major MuDR or Class I clade, as expected, because the degenerate primers were designed based on the MuDR and OsMu4-2 sequences. Furthermore, MoTUs were clearly divided into two subclasses: 11 of the 16 MoTUs belonged to subclass I and the remaining five belonged to subclass II. The average pair-wise identity between these two subclasses was 58.8% at the nucleotide sequence level. Subclass I exhibited the highest similarity to the maize MURA transposase among the MULE transposases identified in Triticum and Aegilops species. As shown in Figure 1, the sequence diversity of subclass I was larger than that of subclass II.
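The average pair-wise identity between two groups of aligned sequences can be computed as sketched below in Python. This is only an illustration of the statistic. The toy sequences are hypothetical and are not MoTU data, and the treatment of gaps (alignment columns with a gap in either sequence are skipped) is an assumption, since the text does not state how gaps were handled.

```python
from itertools import product


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    columns = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def mean_between_group_identity(group_1, group_2) -> float:
    """Average percent identity over all pairs with one member from each group."""
    values = [percent_identity(a, b) for a, b in product(group_1, group_2)]
    return sum(values) / len(values)


# Hypothetical aligned toy sequences, used only to exercise the functions.
TOY_SUBCLASS_1 = ["ATGGCATTGA", "ATGGCTTTGA"]
TOY_SUBCLASS_2 = ["ATGACATCGA", "ATG-CATCGA"]

if __name__ == "__main__":
    value = mean_between_group_identity(TOY_SUBCLASS_1, TOY_SUBCLASS_2)
    print(f"mean between-group identity: {value:.1f}%")
```

Applied to an alignment of the 11 subclass I and five subclass II sequences, the same calculation would average over 55 between-subclass pairs.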

Distribution of MULEs in wheat
Distribution of MULEs belonging to subclasses I and II was surveyed by DNA gel blot analysis in the 28 accessions of Triticum and Aegilops listed in Table 1. We used MoTU-32 and MoTU-12 as probes because they represent the two MoTU subclasses. The hybridization patterns with each probe clearly differed in these accessions. Several bands were detected with the MoTU-32 probe (subclass I) in the diploids with the A, S, Sl and D genomes and in the two types of timopheevi wheat, T. araraticum and T. timopheevi, with the AG genome (Figure 2A). A few more bands were detected in the emmer wheat (T. dicoccoides, T. dicoccum and T. durum) with the AB genome and in the common wheat (T. spelta and T. aestivum) with the ABD genome. On the other hand, MoTU-12 (subclass II) detected more than 15 major bands in the einkorn wheat (T. urartu, T. boeoticum and T. monococcum) with the A genome and many more bands in the Aegilops species with the S, Sl or D genomes (Figure 2B). Numerous bands were also detected in all of the tetraploid and hexaploid wheat accessions. Subclass II MULEs therefore represented a high-copy element in the genomes of Triticum and Aegilops species, particularly in the S, Sl and D genomes and probably also in the B and G genomes. In contrast to subclass II MULEs, subclass I MULEs existed as a low-copy element.

Figure 1. Coding sequences corresponding to amino acids 385 through 477 of the maize MURA were used for the analysis. A phylogenetic tree was constructed by the NJ method. Bootstrap values higher than 50% for 1000 replications are shown at the nodes. Classification into classes followed Rossi et al. (2004). MuDR, Jittery and Trap are from maize and OsMu4-2 (Asakura et al. 2002) is from rice. Sequences denoted with TE are from sugarcane (Rossi et al. 2004) and sequences designated MoTU were isolated from T. urartu. Details of other sequences identified in the genera Triticum and Aegilops by BLASTn and tBLASTx searches are described in Table 2.

Distribution of MULEs among some grass species
The distribution of MULEs homologous to the two subclasses was studied through DNA gel blot analysis in 16 accessions from six grass species. Intense and distinct hybridization bands of subclass I were detected by MoTU-32 in all accessions of barley and rye, which belong to the tribe Triticeae (Figure 3A). More bands were detected in barley than in rye and diploid wheat (Figure 2A). Weak signals were detected in rice and no signals were detected in oat, maize and sorghum. Intense hybridization bands of subclass II were detected by MoTU-12 in all accessions of barley, rye and oat, which belong to the subfamily Pooideae (Figure 3B). In barley and rye, as in wheat, numerous bands were detected. In oat, however, only two intense signals, of 3.6 kbp and 1.3 kbp, were detected. The 1.3-kbp signal was extremely intense, indicating that the subclass II transposase also existed as a high copy number element in oat. No subclass II signals were detected in rice, maize and sorghum.

Discussion
Sequence differentiation of MULE transposases in wheat
MULE transposase homologs found in Triticum and Aegilops were classified into five clades, i.e., MuDR or Class I, Jittery, Trap or Class II, Class III and Class IV (Figure 1). Most MULE transposases in plants were reported to exhibit higher similarity to the transposase of Jittery than to MURA (Walbot and Rudenko 2002; Xu et al. 2004), and the Jittery clade also included the largest number of MULEs in T. aestivum. Many MULE transposases belonging to the Jittery clade are probably also present in T. urartu. MoTUs, however, showed a higher similarity to the transposase of MuDR than to that of Jittery. This result was most probably caused by a PCR bias arising from sequence differences between MuDR and Jittery at the primer target regions. Consequently, we selectively amplified MULE transposase sequences belonging to the MuDR clade.
Clear sequence differentiation of MULEs was found in the MuDR clade. MoTUs consisted of two distinct subclasses that exhibited an average pair-wise identity of 58.8% at the nucleotide sequence level. Among wheat MULEs, subclass I was the closest to the maize MURA. Sequence diversity was higher in subclass I than in subclass II MULEs. The differentiation within subclass I may have occurred earlier than that within subclass II.

Copy number variation of MULEs in wheat
DNA gel blot analysis showed that subclass II MULEs existed as a high copy number element in the genomes of wheat, rye and barley (Figures 2B and 3B). Furthermore, it is intriguing that the copy number of subclass II transposases clearly differed among the ancestral diploid genomes: the S, Sl and D genomes of Aegilops species possessed higher copy numbers than the A genome of diploid Triticum species. Tetraploid and hexaploid wheat genomes also contained numerous subclass II transposases. The B genome of emmer wheat and common wheat and the G genome of timopheevi wheat were most probably derived from the S genome of Aegilops speltoides (Dvorak and Zhang 1990). Aegilops tauschii donated the D genome to common wheat (Kihara 1944; McFadden and Sears 1946). Tetraploid and hexaploid wheat thus probably inherited the numerous copies of subclass II MULEs through their evolution by allopolyploidization. This copy number variation among diploid species probably reflects historical differences in transposition frequencies of subclass II MULEs after the differentiation of the genera Triticum and Aegilops. The factors determining such copy number variation require further study.
The copy number of subclass II MULEs was much higher than that of subclass I in wheat, rye and barley (Figures 2A and 3A). Sequence diversity of subclass II was lower than that of subclass I. These results suggest that rapid amplification of subclass II MULEs has occurred recently. Furthermore, the results also suggest that the transposition of each MULE subclass is regulated differently. The MURA transposase binding site (MBS), a 32-bp motif in the TIRs, is well conserved among the mobile Mutator elements (Benito and Walbot 1997; Rudenko and Walbot 2001). It is possible that the transposase active in the transposition of subclass II MoTUs is unable to recognize the MBSs of subclass I MoTUs. A similar behavior was observed between distinct groups of mariner-like elements coexisting in a Drosophila genome (Lohe et al. 1995).

MULE dynamics in grass species
DNA gel blot analysis revealed that MULEs of the two subclasses were present at least within the tribe Triticeae (Figure 3). This result suggests that the differentiation of these subclasses occurred before or immediately after the establishment of the tribe Triticeae. However, the two subclasses clearly differ in their distribution. Subclass I MULEs were found in rice of the subfamily Ehrhartoideae but not in oat of the tribe Aveneae, which belongs to the subfamily Pooideae. Oat, on the other hand, possessed subclass II MULEs (Figure 3B). This could be explained by the stochastic loss of subclass I MULEs in oat, as originally proposed to account for the patchy distribution of P elements among Drosophila melanogaster strains (Engels 1981). More extensive studies are required in order to clarify the distribution and sequence diversity of these two subclasses of MULEs and to understand MULE dynamics in grass species.