Insertion sequences as variability generators in the Mycoplasma hyopneumoniae and M. synoviae genomes

We analyzed the sequenced genomes of three strains of Mycoplasma hyopneumoniae and one strain of M. synoviae, and found three and two different transposable element families in these species, respectively. In M. hyopneumoniae, the IS4 family of insertion sequences is represented by ISMHp1, a putatively active element, and the IS3 family by several degenerate sequences. A third element, called tMH, shows some characteristics reminiscent of retrotransposons. In M. synoviae, three different, possibly active IS4 elements are present (ISMHp1-like, ISMsy1 and IS1634-like elements). The IS30 family is represented by the degenerate IS1630-like element. The IS1634-like element is shown to be involved in chromosomal rearrangements and horizontal gene transfer (HGT). The ISMHp1-like element is shown to be related to the HGT of a 25-kb region from M. gallisepticum to M. synoviae. The fraction of each genome that corresponds to mobile elements varied from 1.35 to 3.13% in the M. hyopneumoniae strains and was 2.08% in M. synoviae. Although these species possess reduced genomes, they maintain mobile elements, perhaps as a mechanism for generating genetic variability.

Mycoplasmas, apart from being potential pathogens, are considered the best examples of a minimal genome. They are Gram-positive bacteria, but rather than being primitive, they diverged recently, around 65 million years ago, with a drastic reduction in genome size that resulted in the loss of many biosynthetic abilities. Normally, their genome size is smaller than 1 Mb, and whole-genome comparisons suggest that the severe genome reduction in mollicutes probably reflects their parasitic lifestyle (Trachtenberg, 2005).
Small genomes, such as those of mycoplasmas, are characterized by progressive gene loss and are often presumed to be impermeable to mobile DNA, given that only the essential genes would be maintained in these cases (Rocha and Blanchard, 2002). However, mobile elements have been described in many mycoplasma species (Ferrell et al., 1989; Hu et al., 1990; Bhugra and Dybvig, 1993; Zheng and McIntosh, 1995; Calcutt et al., 1999; Chandler and Mahillon, 2002; Ditty et al., 2003; Thomas et al., 2005). Complete genome sequencing has also revealed mobile elements in the majority of the Mycoplasma species studied so far (Chambaud et al., 2001; Sasaki et al., 2002; Papazisi et al., 2003; Westberg et al., 2004; Vasconcelos et al., 2005). In M. mycoides, for example, insertion sequences (ISs) represent 13% of the genome (Westberg et al., 2004). Nevertheless, there are some exceptions, as no insertion sequence, transposon, or endogenous plasmid was found in M. mobile (Jaffe et al., 2004) or in other bacteria with reduced genomes, such as Wigglesworthia glossinidia, Buchnera aphidicola, and Blochmannia floridanus (Bordenstein and Reznikoff, 2005).
One interesting question is whether mobile elements are maintained in these reduced genomes because they are important genomic components, probably as sources of genetic variability. Alternatively, in light of their selfish nature, these mobile elements may be maintained because the hosts are unable to get rid of these parasites.
The most abundant class of mobile elements in the Mycoplasma genomes is the IS. ISs are mobile genetic parasites of about 800-2500 bp, often present in multiple copies in bacterial genomes and able to carry out transposition. Transposition is the movement of specialized DNA elements, namely transposons and insertion elements, within or between loci, mediated by the transposition enzymes, either a transposase or an integrase (Zhou and Reznikoff, 1997; Haren et al., 1999). The substrate of transposition is DNA flanked by inverted repeats, which are recognized as the target of the transposase. The mobile stretch of DNA bordered by these inverted repeats is called the transposable element, the insertion sequence or the transposon (Zhou and Reznikoff, 1997; Haren et al., 1999). Transposition of IS elements can cause deletions, insertions and inversions of genomic loci and consequently contributes to the genetic variability of bacteria. The genome thus acquires greater plasticity from these processes and adapts quickly to diverse environmental selective pressures (Bordenstein and Reznikoff, 2005).
Mahillon and Chandler (1998) provided a review of insertion sequences and, at that time, a total of 17 IS families were recognized based on the following features: IS ORF organization, conserved signature motifs among transposases, similarity of terminal inverted repeat sequences (TIRs), and length of target site duplications (direct repeats, DRs). So far, more than 800 IS elements belonging to 19 families have been discovered (http://www-is.biotoul.fr/is.html).
Insertion sequences are also useful as genetic markers for diagnosis and epidemiological analyses. This is because IS elements are typically present in multiple copies, which makes assays more sensitive and reduces the amount of DNA needed for analysis. In addition, IS mobility can contribute to producing variants and subtypes of bacterial species (Stanley et al., 1993; Small et al., 1994; Frey, 1998).
The present study reports the presence, copy number and functional status of the IS elements in three strains of Mycoplasma hyopneumoniae (the pathogenic 7448 and 232 strains and the non-pathogenic J strain) and one strain of M. synoviae, aiming to contribute to the understanding of the evolutionary role played by transposable elements in reduced genomes such as those of mollicutes.
The sequences annotated as transposases or transposable elements in the M. hyopneumoniae and M. synoviae genomes (data available at http://www.brgene.lncc.br/finalMS for M. synoviae, http://www.genesul.lncc.br/finalMH for M. hyopneumoniae strain J and http://www.genesul.lncc.br/finalMP for M. hyopneumoniae strain 7448) served as the starting point for the analyses. The sequences annotated as transposable elements were used as seeds to search for related sequences using BLAST (http://www.ncbi.nlm.nih.gov/blast/). Global alignments were performed using Clustal X (Thompson et al., 1997). The Artemis software was used for ORF integrity analyses and for visualization of the IS distribution in the genomes (Rutherford et al., 2000). ISs were assigned to the described families using the following criteria: i) similarities in ORF organization; ii) identities or similarities in their transposases (Tpases; common domains or motifs); iii) similar features of their ends (TIRs); iv) direct target duplication characteristics (ISFinder, http://www-is.biotoul.fr/is.html).
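As an illustration of criterion iii, the length of the terminal inverted repeats of a candidate element can be estimated by comparing its left end with the reverse complement of its right end. The following Python sketch is not the procedure used in this study; the function names, the 60-bp search window and the mismatch tolerance are assumptions made for the example.

```python
# Illustrative sketch only: names and default parameters are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def tir_length(element, max_len=60, max_mismatches=0):
    """Estimate the terminal inverted repeat (TIR) length of an element.

    The left end is compared, base by base, with the reverse complement
    of the right end. Scanning stops when the number of mismatches
    exceeds max_mismatches; the value returned is the last position
    (1-based) at which the two ends still matched.
    """
    left = element[:max_len].upper()
    right = reverse_complement(element[-max_len:]).upper()
    best = 0
    mismatches = 0
    for position, (a, b) in enumerate(zip(left, right), start=1):
        if a != b:
            mismatches += 1
            if mismatches > max_mismatches:
                break
        else:
            best = position
    return best


# Toy element: 10-bp perfect TIRs around an arbitrary 10-bp core.
toy_perfect = "GGTACCTTAA" + "A" * 10 + "TTAAGGTACC"
# Same element with one mismatch at position 6 of the right TIR.
toy_imperfect = "GGTACCTTAA" + "A" * 10 + "TTAACGTACC"
```

Allowing a small number of mismatches matters here because several of the elements described below have TIRs that differ at one or two positions.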
We found three transposable elements (TEs) from three different families in Mycoplasma hyopneumoniae, while M. synoviae had four different TEs from two IS families. Table 1 summarizes the main information about these ISs: their location in the genomes, their length, and the lengths of their TIRs and DRs. The elements are also classified according to their structure as: i) complete, i.e., ISs without nonsense mutations or indels; these are putatively active elements, and for each complete IS the length of the putative protein is also given; ii) defective, i.e., ISs with almost the full length of putatively active elements but carrying nonsense mutations or indels, suggesting that these sequences are inactive; iii) partial, i.e., short sequences showing high similarity (more than 95%) to part of the IS.
Elements of the IS4 family are the best represented in the analyzed genomes. This IS family is a heterogeneous group and is present in various bacterial taxa. Usually, these ISs have Terminal Inverted Repeats (TIRs) of approximately 50 bp along with direct repeats (DRs) of 9 to 12 bp that correspond to the target sequence duplication, and they contain a single ORF coding for a transposase with a DD_E motif. In M. hyopneumoniae the sole representative of this family is ISMHp1 (Calcutt and Wise, 2000; direct submission to NCBI/GenBank). This element is present in a high copy number in the J and 7448 strains (11 and 10 copies, respectively), and in a lower number (4) in the 232 strain. A complete ISMHp1 element is 1910 bp long, encodes a 552-aa transposase, and has 21-bp TIRs. The various complete elements present in the genomes are almost identical (98-99% nucleotide similarity) and show polymorphism of insertion sites among strains (Figure 1-A), so ISMHp1 is probably the most active IS in this species.
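The three structural classes used in Table 1 can be written as a simple decision rule. The sketch below is an illustration rather than the procedure applied in this study: the text does not state what counts as "almost the complete length", so the 0.9 length fraction is an assumed threshold, and the function and parameter names are hypothetical. Only the 95% similarity cut-off for partial copies comes from the text.

```python
# Illustrative sketch only: the 0.9 length fraction is an assumption.
def classify_is_copy(copy_len, ref_len, orf_intact, identity,
                     min_full_fraction=0.9, min_partial_identity=0.95):
    """Classify an IS copy as complete, defective or partial.

    copy_len   -- length of the copy in bp
    ref_len    -- length of the complete reference element in bp
    orf_intact -- True if the transposase ORF has no nonsense
                  mutations or indels
    identity   -- fractional similarity to the reference (0 to 1)
    """
    near_full_length = copy_len >= min_full_fraction * ref_len
    if near_full_length:
        # Full-length copies differ only in the integrity of the ORF.
        return "complete" if orf_intact else "defective"
    if identity > min_partial_identity:
        # Short fragment that is still highly similar to part of the IS.
        return "partial"
    return "unclassified"
```

For example, a 1910-bp copy with an intact ORF would be called complete, whereas a 400-bp fragment with 97% similarity to part of the element would be called partial.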
Furthermore, partial ISMHp1 copies were detected in the M. hyopneumoniae genomes. These partial sequences are probably footprints of complete ISs that recently occupied these genomic positions. A remarkable characteristic of ISMHp1 is the extraordinary variability in the length of the target sequence duplication generated by transposition events, which can sometimes result in extremely long DRs. For example, DR length varied between 8 and 82 bp, with an average of 50 bp, in the J strain. In the 7448 strain, it varied between 8 and 151 bp (64.7 bp on average). Although the mechanism that generates such long and variable DRs remains unknown, they have also been identified in two other IS4 elements, IS1549 from Mycobacterium smegmatis and IS1634 from Mycoplasma mycoides (Vilei et al., 1999; Chandler and Mahillon, 2002).
In the M. synoviae genome we found three different ISs belonging to the IS4 family. Vasconcelos et al. (2005) previously identified the ISMHp1-like and IS1634-like elements and annotated them as IS4. Here, we add a third IS4 element to this list, which we have called ISMsy1. This element was classified as IS4 on the basis of its DD_E motif and the similarity of its TIRs to those of other IS4 elements. ISMsy1 is a 1881-bp, putatively active element presumed to encode a protein of 552 amino acids. Two copies were found in the genome, one complete and one partial, with TIRs of 12 and 19 bp, respectively.
The second IS4 element present in M. synoviae was called IS1634-like by Vasconcelos et al. (2005) because of its similarity to IS1634 from M. mycoides (Vilei et al., 1999). The overall nucleotide similarity between these elements is 71%, but the TIRs are almost identical. The single IS1634-like copy found is 1801 bp long and has perfect 18-bp TIRs. The ORF has some indels and frameshifts and has no potential for coding a transposase. However, it is likely that this copy was mobilized recently, because its long 284-bp DR is almost completely conserved. As previously mentioned, the IS1634 element is known to generate long DRs (Vilei et al., 1999).
The third IS4 element in the M. synoviae genome was named ISMHp1-like by Vasconcelos et al. (2005) because of its similarity to the ISMHp1 element described in M. hyopneumoniae. Although the two elements show low overall nucleotide similarity (65%), they share TIRs and some remarkably well-conserved parts of their sequences. The complete element is 1862 bp long and encodes a protein of 562 aa. It has 20-bp TIRs, with the 3' and 5' TIRs generally differing at two nucleotides. As with the ISMHp1 element of M. hyopneumoniae, the DRs varied in length, between 19 and 73 bp. Five copies are present in the M. synoviae genome, of which one is complete and putatively active, three are defective, and one is partial (Table 1).
In the Mycoplasma species studied, the only representative of the IS3 family found was IS1221I in M. hyopneumoniae. This element was previously described in the J strain by Ferrell et al. (1989) as an element related to IS1221 from M. hyorhinis. It is present in the J strain as two defective copies of roughly 1500 bp, with 20-27 bp TIRs but without any detectable DRs. These features indicate that these copies are probably inactive and ancient. This element is absent from the 7448 and 232 strains.
Elements of the IS30 family were found exclusively in M. synoviae. A total of nine copies were found, all of which had indels and frameshift mutations in the transposase gene. These sequences were called IS1630-like because they are similar to the IS1630 element from Mycoplasma fermentans (Calcutt et al., 1999). The TIRs are 28 bp long, and the two TIRs of the same element differ slightly. One noteworthy feature of this element is that the DRs of one copy are sometimes found in another copy. This can simply result from homologous inter- or intramolecular recombination between two IS elements, each with a different DR sequence, or from the formation of adjacent deletions resulting from duplicative intramolecular transposition (Turlan and Chandler, 1995; Ohtsubo and Ohtsubo, 1978). For example, the DR of the IS1630-like element located at position 302572 is exchanged with that of the element located at position 498002. The same kind of exchange can be seen for the ISs located at positions 365937 and 504938. This illustrates the capacity of these ISs to act as agents of chromosomal rearrangement.
The tMH element is neither an IS nor related to any described prokaryotic transposable element. However, it has features resembling those of LTR retrotransposons. This element was described by Harasawa et al. (1995) as organized into three ORFs flanked by 272-bp LTRs (Figure 1-B). The putative encoded peptides do not show similarity to any described protein. Nonetheless, when the genetic code is changed from the Mycoplasma code to the universal code, the third ORF shows weak similarity to RNA polymerases. Both complete, potentially active copies and partial copies are present in the M. hyopneumoniae strains. Recently, Wu et al. (2004) described the presence of retrotransposon-like elements in the genome of Wolbachia pipientis wMel. Even though these elements cannot be classified as "classical" retrotransposons, they indicate that this class of TEs may also be present in prokaryotes. In the genomes analyzed here, the tMH elements are arranged in tandem and adjacent copies share the same LTR. In the 7448 strain, three tMH elements are arranged in tandem, while the J and 232 strains each have two copies. Nevertheless, in the three strains, the elements occupy the same genomic location (Figure 1-A). This suggests that the element was present in the ancestor of these strains and has been maintained by vertical transfer. In the J and 232 strains an ISMHp1 element is inserted in the shared internal LTR (Figure 1-B). However, in these two strains the ISMHp1 elements are in opposite orientations and occupy different positions in the LTR, and thus show different DRs. We therefore conclude that these represent independent ISMHp1 insertions in these strains.
An overview of the location of the transposable elements in the analyzed genomes can be seen in Figure 1-A. By searching for the flanking sequences of each element, we were able to show that the tMH elements and two copies of ISMHp1 share the same position in the three M. hyopneumoniae strains. A third ISMHp1 copy is in the same genomic position in the J and 7448 strains. An analysis of these insertion site polymorphisms supports the suggestion that these elements are active or were active until recently. Also, some regions where ISs accumulate can be observed in the M. synoviae genome, predominantly ISMHp1-like and IS1630-like elements. These results could also suggest the existence of hot-spot regions for these ISs in this genome.
Vasconcelos et al. (2005) described the horizontal gene transfer (HGT) of fourteen regions between Mycoplasma synoviae and M. gallisepticum, the largest being 5.9 kb long and encompassing various CDSs. Some CDSs are hypothetical, while others code for an ABC transporter, a signal peptidase I, and a putative EF-G elongation factor. These fourteen regions were almost identical in both genomes, indicating a recent transfer event. HGT between these species could have been facilitated by the fact that both are bird parasites and are therefore in close contact. Furthermore, these Mycoplasma species belong to different Mycoplasmatales clades that diverged about 350 million years ago (Vasconcelos et al., 2005). The nucleotide similarity observed in conserved genes such as 16S rRNA is only 79%, while the similarity observed for two different ISs is considerably higher. For example, the similarity between the M. synoviae ISMHp1-like elements and a related element found in M. gallisepticum is 97%. Also, an element similar to the ISMsy1 found in M. synoviae is present in the M. gallisepticum genome, with identical 12-bp TIRs and an overall similarity of 96%. HGT of these ISs was suggested by Vasconcelos et al. (2005), and four of the fourteen putatively transferred regions correspond to these ISs.
Transposable elements have been pointed out in the literature as potential promoters of HGT (Lawrence, 2002). For this reason, we looked for a putative association between ISs and the 14 regions described as involved in HGT by Vasconcelos et al. (2005). We found a 25-kb region in the M. synoviae genome that encompasses five of the regions described as implicated in HGT and is flanked by ISMHp1-like elements. It is remarkable that these flanking IS elements share the same DRs (Figure 2). This finding could be explained by homologous recombination between two IS elements, as described above for the IS1630-like element. However, the fact that the regions described as 7, 8, 9 and 10 by Vasconcelos et al. (2005) are syntenic in the M. gallisepticum genome, corresponding to a sequence conserved between both species, makes the homologous recombination hypothesis highly improbable. The most parsimonious suggestion is that the 25-kb segment encompassing regions 7, 8, 9, 10 and 12 (Vasconcelos et al., 2005), flanked by the ISMHp1-like element in the M. gallisepticum genome, was mobilized by a specific transposase. In M. synoviae, this 25-kb region was probably inserted by means of the ISMHp1-like transposase acting on the TIRs of both ISs. This transposition could generate the DRs as depicted in Figure 2. It is notable that region 12, present in M. synoviae, is not syntenic in the sequenced M. gallisepticum strain. However, as region 12 is also involved in the HGT event, the most parsimonious hypothesis is that this region was included in the 25-kb region in the M. gallisepticum donor strain implicated in the HGT.
It is noteworthy that, among the fourteen regions identified by Vasconcelos et al. (2005) as involved in HGT, we were able to demonstrate an association with an IS in nine. Five are genomic regions mobilized by means of the ISs in their flanking sequences, and the other four regions are the ISs themselves. This result strongly supports the idea that transposable elements are probably the most important HGT-promoting agents.
Mobile elements correspond to 2.08% of the M. synoviae genome, and for M. hyopneumoniae the figure varies among the strains (3.13% for 7448, 3.08% for J, and 1.35% for 232). In bacteria, there is a significant, positive correlation between genome size and the percentage of the genome made up of mobile DNA (Bordenstein and Reznikoff, 2005). The proportion of mobile elements in the species we have analyzed is similar to that of free-living bacteria with larger genomes. The maintenance of transposable elements, even in a reduced genome such as that of Mycoplasma, suggests that these sequences could be important for producing the genetic diversity necessary for evolution, by generating chromosomal rearrangements, altering gene expression, or promoting HGT. From an evolutionary standpoint, the possibility that transposable elements might be one of the necessary components of a minimal genome cannot be ruled out. In view of these findings, for the genomes described as devoid of mobile elements it is necessary to clarify whether this absence is a real feature of these genomes or a peculiarity of the sequenced strain. If transposable elements are part of a minimal genome, we would expect to find them in other strains of those species believed to be devoid of TEs.
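The percentages above are the summed length of all mobile-element copies divided by the genome length. A minimal Python sketch of this calculation is given below; it merges overlapping annotations so that nested elements (for example, an IS inserted into another element) are not counted twice. The function name, the half-open coordinate convention and the toy coordinates are assumptions for illustration and do not reproduce the data in Table 1.

```python
# Illustrative sketch only: hypothetical names and toy coordinates.
def mobile_fraction(intervals, genome_size):
    """Percentage of a genome covered by mobile elements.

    intervals   -- iterable of (start, end) pairs, 0-based, half-open
    genome_size -- genome length in bp
    Overlapping or nested intervals are merged before summing.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged block and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap with the current block: extend it if needed.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / genome_size


# Toy data: two overlapping copies and one isolated copy in a
# hypothetical 100-kb genome.
toy_percent = mobile_fraction([(0, 1000), (500, 1500), (5000, 6000)], 100_000)
```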

Figure 1 - A - Overview of the location of the different transposable elements in the J, 7448 and 232 M. hyopneumoniae genomes and the M. synoviae genome. Different arrowheads symbolize the various ISs. The lines connecting ISs among genomes indicate elements that are homologous in the different genomes. B - Structural features of the tMH element in M. hyopneumoniae. In the upper diagram tMH is depicted as described by Harasawa et al. (1995), with the Long Terminal Repeats (LTRs) and three ORFs. The complete element is 4193 bp long. The remaining diagrams depict the arrangement of tMH in the 7448 strain, in which three tMH elements are arranged in tandem and share adjacent LTRs, and in the J and 232 strains, in which two tMH elements are arranged in tandem and an ISMHp1 element is inserted in the shared internal LTR. Note the opposite orientations of the inserted ISMHp1 element in the two strains.

Table 1 - Characterization of ISs in the Mycoplasma hyopneumoniae and M. synoviae genomes: location in the genomes (beginning and end), structure (complete, defective or partial), length, and the lengths of putative proteins, TIRs or LTRs, and DRs.