Characterization of two full-sized P elements from Drosophila sturtevanti and Drosophila prosaltans

Only partial P element sequences have previously been reported in the saltans group of Drosophila. Here we report two complete P element sequences, one from Drosophila sturtevanti and one from Drosophila prosaltans. The divergence of these sequences from the canonical P element of Drosophila melanogaster is about 31% at the nucleotide level. Phylogenetic analysis revealed that both elements belong to a clade of divergent sequences from the saltans and willistoni groups previously described by other authors.


Introduction
The P elements were first discovered in Drosophila melanogaster because of their ability to induce hybrid dysgenesis (Kidwell et al., 1977). Autonomous P elements are 2.9 kb in length and have four open reading frames which encode two polypeptides, an 87 kDa transposase enzyme necessary for transposition (Rio et al., 1986) and a 66 kDa repressor protein (Robertson and Engels, 1989). Also required for transposition are the element termini, which include flanking 31-bp perfect inverted repeats (O'Hare and Rubin, 1983), 11-bp subterminal repeats and unique terminal sequences comprising approximately 150 bp (see Engels, 1989 for a review).
Sequences belonging to the P family are particularly common in the four principal species groups (melanogaster, obscura, saltans and willistoni) which make up the subgenus Sophophora (Daniels et al., 1990) but have also been described in drosophilid species such as Drosophila mediopunctata, which is not part of the Sophophora subgenus (Loreto et al., 2001), and in Scaptomyza pallida, a drosophilid which does not belong to the genus Drosophila (Anxolabéhère et al., 1985; Simonelig and Anxolabéhère, 1991, 1994). Transposable elements similar to the P elements of the Drosophilidae have also been isolated from members of a few other Diptera families, e.g. Lucilia cuprina from the Calliphoridae (Perkins and Howells, 1992), Musca domestica from the Muscidae (Lee et al., 1999) and seven species of Anopheles from the Culicidae (Sarkar et al., 2003). More divergent and rudimentary sequences related to P transposable elements have also been found by 'in silico' searches, such as Hoppel (Reiss et al., 2003) and Proto P (Kapitonov and Jurka, 2003) in the Drosophila melanogaster genome and Phsa (Hagemann and Pinsker, 2001) in the human genome.
Phylogenetic studies based on nucleotide sequences (Clark and Kidwell, 1997; Hagemann et al., 1994, 1996; Silva and Kidwell, 2000) indicated that the more than 200 P element sequences obtained to date fall into 16 distinct clades or subfamilies (Figure 1). Four of these subfamilies have been well characterized. The canonical subfamily appears to be restricted to the sophophoran New World species groups saltans and willistoni (Clark et al., 1995), with the notable exception of Drosophila mediopunctata, which contains P elements due to horizontal transfer (Loreto et al., 2001). Three P element subfamilies (M-, O- and T-type) are found in the Old World obscura species group (Hagemann et al., 1992, 1994, 1996), with the T-type appearing to be restricted to the obscura lineage (Hagemann et al., 1998) while the M- and O-types also occur in the saltans and willistoni groups. A new subfamily, the K-type (restricted to the montium subgroup species), has recently been described by Nouaud et al. (2003).
The descriptions of the P element subfamilies in the saltans and willistoni species groups have so far been based mainly on partial sequences (Clark et al., 1995; Clark and Kidwell, 1997; Haring et al., 2000; Silva and Kidwell, 2000). The work described in this paper compared two complete sequences obtained from two different saltans subgroups (sturtevanti and saltans) to some of the complete sequences from different P element subfamilies, as well as to a data set consisting of 80 partial consensus sequences provided by Clark and Kidwell (1997), in order to establish their phylogenetic relationships.

Fly stocks
We used Drosophila sturtevanti collected in the Mexican town of Matlapa and Drosophila prosaltans collected in the town of Eldorado in the Brazilian state of Rio Grande do Sul. Both strains were recently derived from natural populations and subsequently maintained in the laboratory under standard conditions.

DNA Amplification and Sequencing
Polymerase chain reactions were carried out in a total volume of 50 µL containing 100 ng of genomic DNA, 1 mM MgCl2, 400 µM of each deoxynucleotide, 0.5 µM of M-IR primer (5'CATAAGGTGGTCCCGTCG3', corresponding to nt. 14-31 and 2877-2894, within the terminal inverted repeats, TIRs, of the D. melanogaster canonical P element; Haring et al., 1995) and 2 units of Taq polymerase in 1x polymerase buffer. Amplification consisted of 30 cycles of 45 s denaturation at 94 °C, 45 s of primer annealing at 57 °C and 1.5 min of primer extension at 72 °C. The first cycle was preceded by a denaturation step of 7 min at 94 °C and the last cycle was followed by a final extension at 72 °C for 10 min. The PCR products were cloned into a TOPO TA cloning vector (Invitrogen) and both strands of a single randomly chosen clone were sequenced for each species. The primers Stu1 369 (5'GTTCCGTATCGAGACCCGAC3'), Stu2 2402 (5'AATGACGAAGACTCGTCGCG3'), Stu5 1091 (5'GGAAGCAACCAGTTTTCTTT3') and Stu6 1598 (5'CACATCAAACCAATCATTTA3') were designed based on the clone sequences and used to complete the sequencing of the element.
Sequences were aligned with the CLUSTAL W program (Thompson et al., 1994) and the phylogenetic relationships between the P sequences were reconstructed using the maximum parsimony method as implemented in PAUP version 4.0b10 (Swofford, 1997). Branch support was calculated using bootstrap analysis with 500 replicates. The distance matrix was constructed according to the Kimura two-parameter model of nucleotide substitution (Kimura, 1980).
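The reaction set-up and cycling profile described above can be checked with some simple arithmetic. The Python sketch below is illustrative only: the function and variable names are ours, the values are taken from the protocol text, and the run time ignores the ramping between temperatures, which depends on the thermocycler.

```python
# Illustrative sketch; all names are hypothetical, values come from the protocol text.

M_IR = "CATAAGGTGGTCCCGTCG"  # M-IR primer, written 5' to 3'


def span_length(start, end):
    """Length in bp of an inclusive coordinate range."""
    return end - start + 1


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def amount_pmol(conc_um, volume_ul):
    """Amount in pmol for a concentration in µM and a volume in µL (µM x µL = pmol)."""
    return conc_um * volume_ul


def cycling_minutes(cycles=30, denature_s=45, anneal_s=45, extend_s=90,
                    initial_min=7, final_min=10):
    """Total programmed time in minutes, excluding temperature ramping."""
    return initial_min + cycles * (denature_s + anneal_s + extend_s) / 60 + final_min


# The primer length should match both coordinate ranges given for it.
print(len(M_IR), span_length(14, 31), span_length(2877, 2894))
# Primer and dNTP amounts in a 50 µL reaction.
print(amount_pmol(0.5, 50), amount_pmol(400, 50))
# Programmed cycling time in minutes.
print(cycling_minutes())
```

The 18-nt primer matches both 18-bp target ranges, and a single primer can serve at both ends because the two ranges lie in inverted repeats.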
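For reference, the Kimura two-parameter distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The following Python sketch shows the calculation for a single pair of sequences. It is not the software used in this study; the function name is ours, and skipping sites with gaps or ambiguous bases is an assumption, since the treatment of gaps is not stated here.

```python
import math

# Purine-purine and pyrimidine-pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences.

    Assumption: sites where either sequence has a gap or an ambiguous
    base are skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code
        sites += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: 10 sites, one transition (A/G) and one transversion (A/C).
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"), 3))
```

In the toy example the uncorrected difference is 20%, while the corrected distance is somewhat larger because the model allows for multiple substitutions at the same site.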

Sequence features
The M-IR primer corresponds to positions 14-31 and 2877-2894 of the canonical P element, and so cannot amplify the first and last 13 bp of a complete P element. The D. sturtevanti P element is 2829 bp and the D. prosaltans 2828 bp; if the TIRs of these elements are complete they should be 2854 bp for D. sturtevanti and 2855 bp for D. prosaltans, which is about 50 bp less than the D. melanogaster canonical P element. The alignment of the two sequences against the D. melanogaster sequence showed that the D. sturtevanti and D. prosaltans P elements are similar in structure and sequence to each other (88%) but strongly divergent from the D. melanogaster canonical P element (31% different). Table 1 shows the main differences between the alignments, from which it can be seen that, in general, D. sturtevanti has the same deletions and insertions as D. prosaltans.
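The length estimate follows from adding the 13 bp that the primer cannot cover at each end of the element. A minimal Python sketch of this arithmetic is given below; the names are ours, and the canonical length is derived from the primer coordinates on the assumption that the element ends 13 bp beyond position 2894.

```python
# Illustrative arithmetic only; names are hypothetical.

PRIMER_5P_START = 14   # first canonical position covered by M-IR at the 5' end
PRIMER_3P_END = 2894   # last canonical position covered by M-IR at the 3' end

# Bases lying outside the primer at each end of a complete element.
MISSING_PER_END = PRIMER_5P_START - 1

# Canonical element length implied by the primer coordinates (assumption:
# the same 13 bp lie beyond the primer at the 3' end).
CANONICAL_LENGTH = PRIMER_3P_END + MISSING_PER_END


def full_length(amplicon_bp):
    """Expected element length if both TIRs are complete."""
    return amplicon_bp + 2 * MISSING_PER_END


for amplicon in (2828, 2829):
    total = full_length(amplicon)
    print(amplicon, total, CANONICAL_LENGTH - total)
```

Amplicons of 2828 and 2829 bp therefore correspond to full lengths of 2854 and 2855 bp, about 50 bp shorter than the canonical element.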
Even though the TIRs were not completely sequenced, PCR amplification with primers specific to the TIR regions indicates that at least the second half of each TIR is present and well conserved in both D. sturtevanti and D. prosaltans. However, the transposase binding sites, located at positions 48-68 and 2855-2871 in D. melanogaster (Kaufman et al., 1989), the TATA box and the 11-bp subterminal inverted repeats are not well conserved. In all four exons the translational reading frame is interrupted by stop codons and frameshift mutations, suggesting that these sequences do not encode a functional protein. Indeed, leucine-zipper and helix-turn-helix motifs were not detected, supporting the suggestion that these sequences might be non-autonomous in the genome.
A matrix of nucleotide divergence based on Kimura's two-parameter method was calculated for the full-length P element sequences from the literature and the two sequences described here (Table 2). The D. sturtevanti and D. prosaltans sequences show an overall divergence of 31% from the canonical sequences described in D. melanogaster, D. willistoni, D. mediopunctata and D. nebulosa.
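An interrupted reading frame of the kind described above can be detected by scanning an exon codon by codon for stop codons. The Python sketch below illustrates the idea on short invented sequences; it is not the procedure used in this study, and the function names and toy sequences are hypothetical.

```python
# Illustrative sketch; function names and toy sequences are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(seq, frame=0):
    """Return 0-based start positions of stop codons in the given reading frame."""
    seq = seq.upper()
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


def premature_stops(seq, frame=0):
    """Stop codons in frame that occur before the final complete codon."""
    last_codon_start = frame + 3 * ((len(seq) - frame) // 3 - 1)
    return [i for i in in_frame_stops(seq, frame) if i < last_codon_start]


intact = "ATGGCTGAACGTTAA"       # ATG GCT GAA CGT TAA: stop only at the end
interrupted = "ATGGCTTAACGTTAA"  # ATG GCT TAA CGT TAA: stop in the third codon

print(premature_stops(intact))       # no premature stop
print(premature_stops(interrupted))  # premature stop at position 6
```

A frameshift has a similar effect, because it moves translation into a frame in which stop codons are no longer avoided; in the toy intact sequence, for example, frame 2 contains a TGA codon.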

Phylogenetic analysis
Phylogenetic analyses of P elements in the subgenus Sophophora (Clark et al., 1995, 1998; Clark and Kidwell, 1997; Silva and Kidwell, 2000) indicate that the lineages of single species can harbor multiple P element subfamilies, which apparently entered the genome at different times in the past (Lee et al., 1999; Haring et al., 2000). The aim of our phylogenetic analysis was to determine whether the D. sturtevanti and D. prosaltans P element sequences belong to any of the well-characterized P element subfamilies. Figure 1 summarizes the results of our parsimony analysis of P element sequences, from which it can be seen that in 100% of bootstrap replicates the D. sturtevanti and D. prosaltans sequences clustered in Clark and Kidwell's (1997) F clade, which contains P element sequences of some other saltans group species as well as some willistoni group species.
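Bootstrap support of this kind is the percentage of pseudo-replicate data sets, built by resampling alignment columns with replacement, in which a given grouping is recovered. The Python sketch below shows only the resampling and counting steps. The tree search itself was done in PAUP and is represented here by a hypothetical callback, `clade_found`, which is not part of any real library; all other names are ours as well.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the rows aligned."""
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def bootstrap_support(alignment, clade_found, replicates=500, seed=0):
    """Percentage of replicates in which clade_found(replicate) is True.

    clade_found is a hypothetical stand-in for building a tree from the
    replicate and testing whether the clade of interest is present.
    """
    rng = random.Random(seed)
    hits = sum(bool(clade_found(bootstrap_alignment(alignment, rng)))
               for _ in range(replicates))
    return 100.0 * hits / replicates


# Toy alignment of three sequences and six columns.
toy = ["ACGTAC", "ACGTTC", "TCGAAC"]
replicate = bootstrap_alignment(toy, random.Random(1))
print(len(replicate), len(replicate[0]))
```

Each replicate has the same number of sequences and columns as the original alignment, so a support of 100% means the grouping was recovered in every one of the resampled data sets.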

Figure 1 - Cladogram of P element nucleotide sequences obtained in this study and those published by Clark and Kidwell (1997). Species names are given followed by the total number of clones used to obtain a consensus sequence or, for the sequences obtained in our study, an arrow. This is a consensus tree of 1,000 equally parsimonious trees each requiring 2,193 steps. Of 449 characters, 34 were constant, 60 uninformative, and 355 parsimony informative. The consistency index is 0.40 and the retention index 0.815. Bootstrap analysis of 500 replicates was performed on the data. Clade names were given by Clark and Kidwell (1997) and are shown on the right.

Table 1 - Differences observed between Drosophila sturtevanti and D. prosaltans P element sequences and that of the canonical P element of D. melanogaster.