Molecular analysis of an Integrative Conjugative Element, ICEH, present in the chromosome of different strains of Mycoplasma hyopneumoniae

Diversification of bacterial species and pathotypes is largely caused by lateral gene transfer (LGT) of diverse mobile DNA elements such as plasmids, phages, transposons and genomic islands. Thus, acquisition of new phenotypes by LGT is very important for bacterial evolution and for the relationship with hosts. This paper reports a 23 kb region containing fourteen CDSs with similarity to the previously described Integrative Conjugal Element of Mycoplasma fermentans (ICEF). This element, named ICEH, is present as one copy at distinct integration sites in the chromosomes of the pathogenic strains 7448 and 232 and is absent in the type strain J (non-pathogenic). Notable differences in the nucleotide composition of the insertion sites were detected and could be correlated with a lack of specificity of the ICEH integrase. Although present in strains of the same organism, the ICEH elements are more divergent than is typical of other chromosomal loci of Mycoplasma hyopneumoniae, suggesting an accelerated evolution of these constins or an ongoing process of degeneration, while conservation of the tra genes is maintained. An extrachromosomal form of this element was detected in the 7448 strain, suggesting a possible involvement in its mobilization and in the transfer of CDSs to new hosts.


Introduction
In recent years, lateral gene transfer (LGT), the intraspecies and interspecies exchange of genetic information, has been substantially correlated with changes in genome content and with the evolution of bacteria, disseminating key traits within and among bacterial species (Boucher et al., 2003; Lerat et al., 2005). With the increase in genome sequence information associated with microarray technology, it is now possible to trace the presence or absence of genes in closely related bacterial genomes that could determine a sudden adaptation of bacteria to new environmental conditions (Calcutt et al., 2002; Porwollik and McClelland, 2003; Ochman et al., 2005).
A number of phenotypes have been associated with this type of dissemination during conjugation of chromosomal regions: sucrose metabolism in Salmonella enterica Senftenberg via cTnscr94 (Hochhut et al., 1997); the symbiosis traits associated with the 500 kb symbiosis island in Mesorhizobium loti (Sullivan and Ronson, 1998); the degradation of chlorocatechol mediated by gene products resident on the 105 kb clc element of Pseudomonas putida (Ravatn et al., 1998); the tetracycline resistance element Tn916 of Bacteroides (Salyers et al., 1995); the large SXT unit of Vibrio cholerae (Hochhut and Waldor, 1999); and the antibiotic resistance gene cluster from Salmonella (Doublet et al., 2005). The term constin was recently coined to describe this diverse group of conjugative, self-transmissible, integrating elements (Hochhut et al., 2001). Despite the existence of a diversity of LGT mechanisms, only evolutionarily relevant transfer events persist within the bacterial genome (Jain et al., 2003).
Mycoplasma hyopneumoniae is recognized as a potent pathogen that causes mycoplasmal pneumonia in swine by colonising the respiratory tract and inducing an inflammatory response (Maes et al., 1996). This chronic worldwide disease causes important economic losses, including in Brazil (Sobestiansky et al., 2001). With the completion of the genome sequences of 10 different Mycoplasma species, the search for horizontally acquired genes revealed the presence of constins only in M. fermentans (Calcutt et al., 2002) and M. hyopneumoniae (Vasconcelos et al., 2005). During a survey of specific sequences, a region containing fourteen CDSs (≈ 23 kb) with similarity to the previously described Integrative Conjugal Element of M. fermentans (ICEF) (Calcutt et al., 2002) was found in the Brazilian field isolate, strain 7448 (Vasconcelos et al., 2005). Recently, subtractive hybridization analysis indicated the presence of coding sequences related to ICEF in the Mycoplasma bovis and M. agalactiae genomes (Marenda et al., 2005).
Scrutiny of the complete genomes of three strains (7448, J and 232) of M. hyopneumoniae (Vasconcelos et al., 2005; Minion et al., 2004) revealed the presence of a constin element in only two of them. This genetic element is present as one copy at distinct integration sites in the chromosomes of strains 7448 and 232 and is absent in the type strain J. An extrachromosomal form of this element can be detected in the 7448 strain by the inverse PCR method (Vasconcelos et al., 2005).

Sequence analysis
Sequence data from the Mycoplasma hyopneumoniae strains analyzed in this paper were obtained from GenBank under accession nos. NC_006360 for strain 232, NC_007295 for strain J and NC_007332 for strain 7448.
Similarity searches were performed using a locally installed NCBI BLAST 2.2.12 (Altschul et al., 1997) against custom-built databases of known constins and the nr database (downloaded on 05/09/2005) from NCBI. Initial analysis of sequence families was done using the Tribe-MCL software (Enright et al., 2002), and the Bidirectional Best Hits (BBH) approach (Tatusov et al., 2000) was used to find putative orthologues among the analyzed constins. Sequence annotation was manually curated with the help of similarity searches against the PFAM (Sonnhammer et al., 1997), PRODOM (Bru et al., 2005) and INTERPRO (Mulder et al., 2005) databases. Genometric indexes were obtained with the EMBOSS suite of programs (Rice et al., 2000). Phylogenetic analyses were performed with MEGA 3 (Kumar et al., 2004), and multiple sequence alignments were generated with CLUSTALX (Thompson et al., 1997). The JTT distance was employed for both the neighbor-joining and parsimony tree reconstruction analyses, and 1000 replications were used for bootstrap analysis. The software TOPALi (Milne et al., 2004) was used for the recombination analysis of the traE genes, and the PDM and HMM analyses were done with default settings.
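The Bidirectional Best Hits criterion mentioned above can be illustrated with a minimal Python sketch. It assumes that the similarity search results have already been reduced to (query, subject, bit score) tuples; the identifiers in the usage example are hypothetical, and the sketch is not the pipeline actually used in this work.

```python
def best_hits(hits):
    """Map each query to its single highest-scoring subject.

    `hits` is an iterable of (query, subject, bit_score) tuples, e.g. parsed
    from a tabular similarity-search report (assumed input format).
    """
    best = {}
    for query, subject, bit_score in hits:
        if query not in best or bit_score > best[query][1]:
            best[query] = (subject, bit_score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)   # element A CDSs searched against element B
    reverse = best_hits(b_vs_a)   # element B CDSs searched against element A
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical CDS identifiers, for illustration only.
a_vs_b = [("A1", "B1", 200.0), ("A1", "B2", 50.0),
          ("A2", "B2", 80.0), ("A3", "B1", 90.0)]
b_vs_a = [("B1", "A1", 198.0), ("B2", "A2", 75.0), ("B2", "A1", 40.0)]
pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
```

In this toy input, A3 has B1 as its best hit, but B1 points back to A1, so A3 is not reported as a putative orthologue.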
Bacterial strains and culture conditions
M. hyopneumoniae strain J (ATCC 25934), a non-pathogenic strain with reduced capacity to adhere to porcine cilia (Zielinski and Ross, 1990; Zielinski and Ross, 1993; Zhang et al., 1995), was acquired from the American Type Culture Collection by CNPSA, EMBRAPA (Concórdia, SC, Brazil). M. hyopneumoniae strain 7448 was isolated from an infected swine in Lindóia do Sul, Santa Catarina, Brazil, as described by Vasconcelos et al. (2005).

PCR analysis and DNA sequencing
PCR amplification of the target sequences was performed in a DNA thermal cycler (MJ Research). The PCR mixture contained the reaction buffer (50 mM KCl, 10 mM Tris-Cl pH 8.3, 1.5 mM MgCl2), 200 mM of each dNTP, 20 pmol of each primer, and 1 U of Taq polymerase (CenBiot enzymes, Centro de Biotecnologia, UFRGS). Assays were carried out using 50 ng of genomic DNA from each M. hyopneumoniae strain (J and 7448) and distilled water to a final volume of 25 μL. The reaction mixture was subjected to PCR under the following conditions: heat denaturation at 94 °C for 5 min, followed by 30 cycles of denaturation at 94 °C for 1 min, annealing at 60.6 °C for 1 min, and DNA extension at 68 °C for 1 min. After the last cycle, samples were maintained at the annealing temperature for 5 min, followed by 68 °C for 8 min. PCR products were resolved by electrophoresis in a 1.2% gel, stained with ethidium bromide and visualized by UV illumination. PCR amplification was performed with primers ICEH1-F (5'TTTTTCATTGCCTAAATCTTGTTT3') and ICEH2-R (5'AAATCACAAACTTAAAAATGCCAAT3') (Invitrogen, USA). Before the sequencing reaction, the amplification products (300 ng) were purified using the GFX kit (Amersham Biosciences, GE Healthcare) or treated with 3.3 U of exonuclease III and 0.2 U of shrimp alkaline phosphatase (SAP) (Amersham Biosciences, GE Healthcare) at 37 °C for 30 min. After this period, the enzymes were inactivated by incubation at 80 °C for 15 min. Amplification products were sequenced using the DYEnamic ET dye terminator cycle sequencing (MegaBACE) kit and run on MegaBACE 1000 capillary sequencers (Amersham Biosciences, GE Healthcare). The generated sequences were assembled, and a quality score of 20 was considered for the final assembly. The derived sequence was used as a BLASTn query against the complete M. hyopneumoniae strain 7448 genome to identify the element in the chromosome.
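To make the quality threshold concrete, the sketch below assumes that the quality score of 20 is a Phred-scaled value, which the text does not state explicitly. Under that assumption, a score Q corresponds to an error probability of 10^(-Q/10), so Q = 20 means roughly one miscalled base in 100. The function names and the masking strategy are illustrative only.

```python
def error_probability(quality):
    """Error probability implied by a Phred-scaled quality score (assumed scale)."""
    return 10 ** (-quality / 10)


def mask_low_quality(bases, qualities, threshold=20):
    """Replace every base whose quality is below `threshold` with 'N'."""
    if len(bases) != len(qualities):
        raise ValueError("bases and qualities must have the same length")
    return "".join(
        base if quality >= threshold else "N"
        for base, quality in zip(bases, qualities)
    )


# Toy read, for illustration only.
masked = mask_low_quality("ACGT", [30, 10, 20, 19])
```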

General characteristics
Although conjugative elements have been described for members of many bacterial genera, indigenous self-transmissible molecules have so far been reported in Mycoplasma only in M. fermentans. During an ongoing analysis of genes encoding surface components of M. fermentans, genetic elements designated ICEF were found (Calcutt et al., 2002). Two kinds of elements that differ in the presence or absence of a few CDSs were characterized by those authors and called Type I and Type II. The CDSs present in these elements suggested that they belong to the constin family. The sequence comparison of three M. hyopneumoniae genomes carried out recently (Vasconcelos et al., 2005) also revealed the presence of putative ICE elements, called ICEH, in two of these strains. Interestingly, these elements were present in the two pathogenic strains (7448 and 232) but were absent from the non-pathogenic one (J strain).
Pinto et al.
Unlike ICEF elements I and II, which consist of twenty-one and nineteen CDSs, respectively, the ICEH7448 element consists of 17 CDSs and ICEH232 of 22 CDSs. However, despite these differences, the organization of these elements is very similar.
The M. hyopneumoniae elements ICEH7448 and ICEH232 are 22,816 and 21,061 nucleotides in length, respectively. The average length of the ICEH CDSs is 1,218 bp in ICEH7448 and 1,187 bp in ICEH232. Most of the CDSs (14 in ICEH7448 and 12 in ICEH232) are considered hypothetical, since no significant similarity was found in the available databases. Conversely, some CDSs show similarity to tra genes that are associated with the conjugative plasmids of bacteria (Snyder and Champness, 2002), such as traK, traI and traE, and to a CDS encoding a single-strand binding protein (Ssb) that is essential for the transfer process. The ICEF element has two tra genes (traG and traE), with an average similarity of 30% to the tra genes found in ICEH232 (traG and traE) and 29% to the tra gene of ICEH7448 (only traE). The ICEH232 element has three tra genes, one traG and two copies of traE. In the M. hyopneumoniae 7448 ICEH element, we found only one traE gene, with an average similarity of 44% to the traE genes found in ICEH232. A characteristic shared by both ICEH and ICEF type elements is the lack of a recognizable gene encoding an integrase function. Calcutt et al. (2002) suggested that the integrase function is possibly mediated by the hypothetical CDSs present in both type I and type II ICEF elements. This may also be true for the ICEH elements, because they contain hypothetical CDSs as well (Table 1). The AT content of some CDSs in the ICEH element is higher than the overall percentage for the M. hyopneumoniae genome. For instance, CDSs ICE_MH7448-ORF09 and ICE_MH7448-ORF017 exhibit 76.13% and 76.06% AT content, respectively, suggesting a heterologous origin for these ORFs in the ICEH element.
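The AT percentages quoted above are simple base-composition ratios. A minimal Python sketch of such a calculation is given below; it is not the EMBOSS procedure used in this work, and the choice to ignore ambiguous bases as well as the toy sequences are assumptions made for illustration.

```python
def at_content(sequence):
    """Percentage of A and T among the unambiguous bases of `sequence`."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for base in bases if base in "AT")
    return 100.0 * at_count / len(bases)


def at_rich_cds(cds_sequences, reference_at):
    """Names of CDSs whose AT content exceeds a reference percentage."""
    return sorted(
        name for name, seq in cds_sequences.items()
        if at_content(seq) > reference_at
    )


# Toy sequences with hypothetical names, for illustration only.
toy = {"cds_a": "AATTAATTGC", "cds_b": "GGCCGGCCAT"}
rich = at_rich_cds(toy, reference_at=50.0)
```

Comparing each CDS against a genome-wide reference value in this way is one simple means of flagging candidates for heterologous origin; it does not by itself demonstrate such an origin.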
A complete reannotation of the elements ICEH7448, ICEH232 and ICEF was also conducted. From these analyses, we attributed functions to seven previously unannotated CDSs, including a CDS in ICEH7448 that was not recognized initially and was identified as a putative helicase. The CDSs annotated in this way are marked with asterisks in Table 1. Note the identification of a transposase-like CDS in the element ICEH232, but not in ICEF and ICEH7448. Other ICEs containing CDSs related to transposases have already been described, such as the ICE element from Streptococcus thermophilus (Pavlovic et al., 2004). The role of transposases residing inside mobile genomic islands is not clear. However, it is possible that such elements could help the transfer process of the ICEs by acting as triggers for the activation of the damage-repair machinery found in the host cells. It has already been suggested that the damage-repair machinery is important for the integration of ICEs into the chromosome of the host cells (Auchtung et al., 2005).
The ICEH7448 element is inserted at position 517,895 of the M. hyopneumoniae 7448 genome, whereas ICEH232 is found at position 654,116 of the M. hyopneumoniae 232 genome, counted from the replication origin. This indicates that these elements may insert themselves in different genomic regions. We were unable to determine with accuracy the presence of direct or inverted repeats flanking both elements. However, ICEH7448 is flanked by a duplication of part of the YP_287818.1 CDS, which is located downstream from the element. Nevertheless, considering that the duplicated gene is not exactly at the terminus of the element, we cannot conclude that this duplication is the result of the ICEH7448 insertion.
The difference in the number of CDSs present in distinct ICE elements suggests that they are able to acquire or lose CDSs over time. For instance, the CDSs AAN85219.1-AAN85220.1 and AAN85229.1, although present in the ICEF-I element, are not present in the ICEF-II element, possibly indicating a very recent acquisition of new sequences by ICEF-I. Thus, it is possible that these elements may work as carriers of CDSs that, when mobilized to a new host, could confer new properties.
Genometric analyses were conducted on the genomes of M. hyopneumoniae strains 7448 and 232. Interestingly, the ICEH7448 element was found in an island of high AT content relative to the genome, while the ICEH232 element was located in an island of high GC content. The observed deviations in the GC content of the insertion sites indicate that DNA structural features derived from compositional constraints may not be a major determinant of the element insertion site. Individually, the CDSs of the ICEH232 element display a slightly higher GC content. The element positions and the graph of GC% cumulative skew are depicted in Figure 1. The variability in base composition is not limited to the element as a whole when compared to the genome. Analysis of the genometric properties of the ICE elements revealed that specific regions contain deviations in sequence properties. The traE gene sequence is peculiar in that, in all elements, this gene has a higher AT content than the other CDSs of the element. This internal variation in base composition could be related to a high expression level of the traE genes. High codon usage bias has already been linked to a high expression profile in genes with distinctive base composition (Wang et al., 2005). This is important because the traE genes are fundamental to the process of element transfer, and it indicates that selection for optimum codon usage can operate in constins in a marked way.
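A cumulative GC skew profile of the kind depicted in Figure 1 can be approximated with the sketch below. It uses the common definition of skew, (G - C) / (G + C), computed over non-overlapping windows and then summed along the sequence. The window size and the non-overlapping scheme are assumptions, and this is not necessarily how the EMBOSS-based analysis in this work was parameterized.

```python
def gc_skew(window):
    """GC skew (G - C) / (G + C) of one window; 0.0 if it has no G or C."""
    g = window.count("G")
    c = window.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def cumulative_gc_skew(sequence, window=1000):
    """Running sum of GC skew over consecutive non-overlapping windows."""
    sequence = sequence.upper()
    total = 0.0
    profile = []
    for start in range(0, len(sequence), window):
        total += gc_skew(sequence[start:start + window])
        profile.append(total)
    return profile


# Toy sequence, for illustration only: three windows of four bases each.
profile = cumulative_gc_skew("GGGG" + "CCCC" + "GGCA", window=4)
```

Plotting such a profile against genome position and marking the element coordinates gives a view comparable in spirit to Figure 1, where a local change of slope reflects a region of atypical composition.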

ICEH elements
It was not possible to determine a single similarity value among the studied elements, as the sequences are heterogeneous in their local similarity scores. The similarity ranges from regions with none (even between ICEH232 and ICEH7448) to regions with 67% similarity (the traE genes, for instance). Figures 2 and 3 show the similarity between the ICEH232 and ICEH7448 elements. Figure 2 shows a nucleic acid word match, while Figure 3 is a TBLASTX comparison based on the 3-frame translated amino acid sequence of all CDSs in the elements. The notable divergence between the ICEH elements, present in different strains of the same organism, could suggest an ongoing process of degeneration or accelerated evolution. These questions can be resolved when more sequences of ICEH constins become available.
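The idea behind a word-match comparison such as the one in Figure 2 can be sketched as follows: every exact word (k-mer) shared by two sequences yields a coordinate pair, and counting these pairs per region gives a match density. The word size, bin size and sequences below are arbitrary illustrative choices, not the settings used to produce the figure.

```python
from collections import Counter, defaultdict


def word_matches(seq_a, seq_b, word_size=4):
    """List (i, j) start positions where seq_a and seq_b share an exact word."""
    index = defaultdict(list)
    for j in range(len(seq_b) - word_size + 1):
        index[seq_b[j:j + word_size]].append(j)
    matches = []
    for i in range(len(seq_a) - word_size + 1):
        for j in index.get(seq_a[i:i + word_size], ()):
            matches.append((i, j))
    return matches


def match_density(matches, bin_size=100):
    """Count word matches falling in each (bin_a, bin_b) cell of a grid."""
    return Counter((i // bin_size, j // bin_size) for i, j in matches)


# Toy sequences, for illustration only.
matches = word_matches("ACGTAC", "TTACGT", word_size=4)
density = match_density(matches, bin_size=10)
```

Each cell count in `density` plays the role of a peak height in a density plot: regions of the two sequences sharing many words produce high counts, while regions with no similarity produce none.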
While searching for CDSs related to ICEs in the chromosomes of M. hyopneumoniae strains J, 7448 and 232, a specific region of about 10 kb with significant similarity to the traE gene and to two other hypothetical proteins was found. We named these regions ICEH-like MHJ, ICEH-like 7448 and ICEH-like 232. Figure 3A depicts the organization of the ICEH-like 232 region. It is interesting to note that this ICE-like region shows not only sequence similarity but also conserved gene order, indicating a common origin with the ICEH elements. Additionally, these regions are flanked by inverted repeats of 45 nucleotides in length that show a conservation of 71% between them. This element is present in the three M. hyopneumoniae strains sequenced to date with a highly conserved gene content, gene order and structure, except for ICEH-like MH232, which has an additional CDS, without any similarity to known genes, located upstream from the SMF DNA processing protein. It is possible that these ICEH-like elements are the product of an ancestral ICE integration event followed by an irregular excision that left parts of the element inserted in the chromosome. However, phylogenetic analyses of the traE genes suggest that these regions are maintained at approximately the same evolutionary rate as the other genes in the chromosome, with lower divergence between the traE genes found in the ICEH-like region or in the ICEH232 and ICEH7448 elements (Figure 4B). This might be an indication that the ICEH-like elements have acquired new functions during evolution, which differ from those assigned to the ICEH elements that assume an independent behavior in the host.

Comparisons between mycoplasma ICE elements
Two clustering experiments were also conducted using the CDSs available from the six elements sequenced so far (ICEH-like MHJ, ICEH7448, ICEH232, pSKU146, pBJS-O and ICESt1), one with the BBH strategy and another employing the MCL algorithm. Both methods found a small number of putative orthologous genes among the analysed sequences, and the traE gene was the most conserved of all. It is interesting to note that, although present in the same species, the two ICEH elements are less conserved in relation to each other than the pSKU146 and pBJS-O elements. These two latter elements, although present in different species, are more conserved and syntenic. This could be explained by the fact that the ICEH elements diverged before the pSKU146 and pBJS-O elements, as evidenced by the phylogenetic analysis of 3 conserved orthologous CDSs of these elements (Figure 4A).
Although the phylogenetic tree of the elements is congruent with their host species, it disagrees for the traE gene, the most conserved putative orthologous gene. This result might indicate a process of CDS exchange by horizontal gene transfer between these elements, or that this gene is under more stringent selective pressure. These analyses also indicate that the traE gene of ICE clearly diverged before the same genes found in the more recent ICEs such as pSKU146 and pBJS-O. Curiously, a duplication of the traE gene in the MH232 ICE element was identified. From the phylogenetic analysis (Figure 4B), it is possible to assume that CDS15 originated by duplication of CDS16, which is the putative ortholog of the unique traE gene from ICEH7448. This observation is corroborated by the best bidirectional hit observed among these genes. Apparently, CDS15 is undergoing a process of degeneration, as it is the most distinct traE gene among the homologous genes found in the ICEH elements. A recombination analysis of the most conserved traE genes, from the elements ICEH-like MH232, ICEH-like MH7448, ICEH MH232 and ICEH MH7448, showed that a probable recombination event occurred in an ancestral form of these genes, originating a mosaic structure comprising two phylogenetically distinct domains (Figure 5).

Evidence for a circular intermediate ICEH
Extrachromosomal forms of the integrative conjugative element have been shown in M. fermentans (Calcutt et al., 2002) and M. agalactiae (Marenda et al., 2006). A model has been proposed in which the element is excised, at low frequency, from the chromosome, circularized as a nonreplicative intermediate, and transferred by conjugation to a recipient. Finally, in the recipient cell, the element is integrated into the chromosome (Calcutt et al., 2002).
To investigate the possible presence of extrachromosomal forms of ICEH in M. hyopneumoniae, two outwardly facing PCR primers were designed, each annealing to a region located near one end of the element. The presence of a circular form should yield a fragment of about 600 bp after amplification by PCR. PCR reactions were carried out using 50 ng of genomic DNA from each M. hyopneumoniae strain (J and 7448). Figure 6 shows that a product of 633 bp was observed in M. hyopneumoniae strain 7448 and, as expected, no product was observed in strain J. The PCR product was sequenced and aligned perfectly to the extremities of the ICEH, as expected if this element has a circular form (data not shown). These data indicate the presence of an ICEH circular intermediate in M. hyopneumoniae strain 7448.
The circular conformation of the excised ICEH has important implications for the biology of the element, since in this form the constin can remain more stable, increasing the probability of horizontal transfer. As constins do not rely on transposases, which remain associated with the insertion sequence DNA molecule during the course of the excision/insertion process, a linear form would be quickly degraded by exonucleases, compromising the relocation to a different host or locus. Additionally, circular DNA forms are structurally more stable than linear ones, helping the element to maintain its structural integrity during the transfer process.
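The logic of the outward-facing primer assay described in this section can be summarized in a short sketch. The primers point away from each other on the integrated element, so no product spanning the element ends is expected in that case; once the element circularizes, its two ends are joined and the primers converge across the junction. The coordinates below are hypothetical toy values (0-based positions of the primer 5' ends within the element) and are not the actual positions of ICEH1-F and ICEH2-R.

```python
def inverse_pcr_product_size(element_length, left_primer_5p,
                             right_primer_5p, circular):
    """Expected amplicon size for outward-facing primers on an element.

    left_primer_5p  : 0-based position of the 5' end of the primer near the
                      left end, extending toward position 0.
    right_primer_5p : 0-based position of the 5' end of the primer near the
                      right end, extending toward the last base.
    Returns None for the linear (integrated) form, in which the primers
    diverge and no junction-spanning product is expected.
    """
    if not 0 <= left_primer_5p < right_primer_5p < element_length:
        raise ValueError("primer positions must lie inside the element")
    if not circular:
        return None
    right_arm = element_length - right_primer_5p  # right primer to last base
    left_arm = left_primer_5p + 1                 # first base to left primer
    return right_arm + left_arm


# Toy coordinates, for illustration only.
size_circular = inverse_pcr_product_size(1000, 99, 900, circular=True)
size_linear = inverse_pcr_product_size(1000, 99, 900, circular=False)
```

With these toy values the circular form yields a 200 bp product and the linear form yields none, mirroring the presence of a product in strain 7448 and its absence in strain J, which lacks the element.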

Concluding Remarks
Our analyses indicate the presence of putative integrative conjugal elements (ICE) in pathogenic strains of M. hyopneumoniae. It was also demonstrated that this element exists in an extrachromosomal form. These results suggest that the ICE is a mobile DNA that is probably involved in genomic recombination events and in pathogenicity. Nevertheless, the transfer of ICEH between cells remains to be demonstrated experimentally. Genomic sequence analysis also indicated the existence of ICE-like elements in M. hyopneumoniae, which are probably derived from an ancestral ICEH integration event followed by an irregular excision. ICEs are not confined to M. hyopneumoniae. Sequences related to these elements were also found in other species such as M. fermentans, M. bovis and M. agalactiae. Interestingly, some CDSs of these elements have similarities to the Type IV secretion machinery present in different bacterial species. This system is involved in the secretion of DNA and proteins to the extracellular milieu or into eukaryotic cells, and has a central role in pathogenicity (reviewed by Christie et al., 2005). This observation reinforces a possible role of ICEs in pathogenicity. However, there is no evidence that links them to protein secretion.

Figure 1 - GC% skew graph showing the differential regions between strains 7448, J and 232 of Mycoplasma hyopneumoniae. The dotted lines indicate the positions of ICEH7448 and ICEH232, respectively.

Figure 2 - Word density 3D plot of the ICEH232 and ICEH7448 sequences. The peaks correspond to the word density match (number of local nucleotide sequence similarities in a region) between the sequences of the elements, where higher peaks denote higher similarity. The location of the peaks in the base plane is determined by the coordinates of the word matches between the two sequences, represented on the X and Z axes.

Figure 3 - TBLASTX (translated search of nucleotide sequences) similarity of the elements ICEH-like MH232, ICEH7448 and ICEH232. Dark gray bars denote direct sequence similarity between elements; light gray bars connect reverse sequence similarity between elements. CDSs are represented by boxes. Inverted repeats of the ICEH-like elements are represented by arrows. The CDSs represented are listed in Table 1 in order of appearance in the figure, except for the ICEH232 element, for which the matches were flipped for better visualization and the CDSs appear in the inverse of the order given in the table. A: Comparison between the ICEH-like MH232 element and the ICEH232 element. B: Comparison between the ICEH232 element and the ICEH7448 element.

Figure 4 - Phylogenetic analysis of the ICE elements and TraE genes. A: Neighbor-joining tree calculated using the JTT distance of three concatenated orthologous CDSs among the six analyzed elements. (The parsimony tree had the same topology as the neighbor-joining tree and was omitted from the figure.) B: Neighbor-joining tree calculated using the JTT distance of 10 TraE genes found in the elements ICEH-like MH232, ICEH-like MH7448, ICEH-like MHJ, ICEH7448, ICEH232, pkstu, pksc and ICESt1.

Figure 5 - Recombination analysis of the traE sequences. The tree on the left is based on the representation of the traE mosaic structure.