The mitochondrial genome of the stingless bee Melipona bicolor (Hymenoptera, Apidae, Meliponini): Sequence, gene organization and a unique tRNA translocation event conserved across the tribe Meliponini

At present a complete mtDNA sequence has been reported for only two hymenopterans, the Old World honey bee, Apis mellifera and the sawfly Perga condei. Among the bee group, the tribe Meliponini (stingless bees) has some distinction due to its Pantropical distribution, great number of species and large importance as main pollinators in several ecosystems, including the Brazilian rain forest. However few molecular studies have been conducted on this group of bees and few sequence data from mitochondrial genomes have been described. In this project, we PCR amplified and sequenced 78% of the mitochondrial genome of the stingless bee Melipona bicolor (Apidae, Meliponini). The sequenced region contains all of the 13 mitochondrial protein-coding genes, 18 of 22 tRNA genes, and both rRNA genes (one of them was partially sequenced). We also report the genome organization (gene content and order), gene translation, genetic code, and other molecular features, such as base frequencies, codon usage, gene initiation and termination. We compare these characteristics of M. bicolor to those of the mitochondrial genome of A. mellifera and other insects. A highly biased A+T content is a typical characteristic of the A. mellifera mitochondrial genome and it was even more extreme in that of M. bicolor. Length and compositional differences between M. bicolor and A. mellifera genes were detected and the gene order was compared. Eleven tRNA gene translocations were observed between these two species. This latter finding was surprising, considering the taxonomic proximity of these two bee tribes. The tRNA gene translocation was investigated within Meliponini and showed high conservation across the Pantropical range of the tribe.


Introduction
In recent years a number of mitochondrial genomes have been completely sequenced, contributing to the knowledge of molecular features related to function and evolution of this peculiar genome (Boore, 1999). Its gene content is typically rather conserved in Metazoa (Boore et al., 1995), with notable exceptions described in nematodes (Okimoto et al., 1992), molluscs (Hoffmann et al., 1992) and cnidarians (Beagley et al., 1998). In general, mitochondrial DNA (mtDNA) contains genes for two ribosomal subunits (12S and 16S), 22 tRNA, and 13 proteins (three subunits of cytochrome c oxidase, cytochrome B, subunits 6 and 8 of ATP F0 synthase, and seven subunits of NADH dehydrogenase). There is also a non-coding A+T rich region that contains signaling elements for regulation of replication and transcription (Wolstenholme, 1992). Animal mitochondrial genome size is usually about 16 kb long, with few exceptions (Brown, 1985). Gene rearrangements within the mtDNA, formerly considered rare, have been described for a number of taxa, including bees (Mindell et al., 1998;Dowton and Campbell, 2001;Shao et al., 2003;Silvestre et al., 2002;Silvestre and Arias, 2006).
One hundred sixteen complete arthropod mtDNA sequences, including 61 insect species, have been deposited in GenBank. In the described insect mtDNA genomes, A+T content is very high. Currently, complete mitochondrial genomes have been sequenced for just two hymenopteran species: the sawfly Perga condei (Castro and Dowton, 2005) and the honeybee Apis mellifera (Crozier and Cro-zier, 1993). The honeybee mtDNA presents the highest A+T bias (84.9%) known for insects. According to Crozier and Crozier (1993) the A+T bias is probably maintained due to a greater number of transversions over transitions.
Among bees, the tribe Meliponini, known as stingless bees, has gained some attention. The tribe has a wide geographic distribution, inhabits all tropical areas of the World (Michener, 2000), and includes the main pollinators in several neotropical ecosystems (Kerr et al., 1996). In addition to their ecological importance, some species produce honey, pollen, wax and propolis that are commercially exploited (Nogueira-Neto, 1997). The species Melipona bicolor presents a very unique behavioral characteristic, polygyny, where several laying queens may cohabitate the nest for considerable time (Velthuis et al., 2006). This species is distributed in the southern and southeastern Brazil, in the Atlantic rain forest. Nonetheless as this ecosystem has been severely reduced in size by human activity and only 7% of the original area remains, M. bicolor and other bees are considered endangered.
In previous studies we have characterized the mitochondrial genomes of several meliponine species through RFLP analysis (Francisco et al., 2001;Weinlich et al., 2004;Brito and Arias, 2005). Evidences of size difference, in comparison to those expected for Apis mellifera, were obtained for some mitochondrial regions after PCR amplification. Those data were interpreted as indirect evidence of gene order or content changes. Later these size differences, relative to A. mellifera, were confirmed by sequencing (Silvestre et al., 2002;Silvestre and Arias, 2006), and all involved tRNA gene rearrangements. Eleven tRNA translocations were mapped between A. mellifera and M. bicolor , so far the highest translocation number verified between species belonging to the same taxonomic subfamily.
Mitochondrial gene order has been considered a molecular class very promising for phylogenetic studies, especially among major taxonomic groups (Boore, 1999). tRNA rearrangements are highly reliable as evolutionary markers, neutral and less prone to homoplasy (Boore and Brown, 1998). We previously sequenced a portion of the mitochondrial genome of M. bicolor, and reported that there were 11 tRNA gene rearrangements when this genome was compared with that of Apis mellifera (Silvestre et al., 2002;Silvestre and Arias, 2006). However, in that report we did not extensively characterize the M.bicolor mitochondrial genome, focusing our observations on genome organization. In the present study, we present a thorough characterization of the M. bicolor mitochondrial genome, with respect to codon usage, base frequencies, gene initiation and termination, and tRNA structure. In addition, we examine the evolutionary origin of one of the tRNA gene rearrangements identified in the previous study, the translocation of the tRNA Lys gene. We survey a broad range of Meliponini to more accurately determine the evolutionary origin of this translocation.

Material and Methods mtDNA sequence analysis
Individuals of M. bicolor were collected from a single monogynic colony maintained at the Laboratório de Abelhas, Departamento de Ecologia, IB-USP, São Paulo. Total genomic DNA was extracted using the TNEprotocol (Sheppard and McPheron, 1991). The experimental approach to obtain the whole mtDNA molecule consisted of the amplification of small and overlapped fragments. Thus aliquots of TNE extraction were used as template for PCR reactions, with Taq DNA polymerase (Invitrogen). We initially used the following conditions: denaturation at 94°C/5 min, followed by 35 cycles of 94°C/60 s, 42°C/80 s and 64°C/120 s. An additional final extension step of 64°C for 10 min was performed. The annealing temperature was optimized when necessary. The primers were derived from A. mellifera (Hall and Smith, 1991;Arias et al., 2008), M. bicolor (designed in our laboratory), and other organisms (Simon et al., 1994) (Table 1).
The PCR products were cloned into pGEM T-Easy Vector (Promega). At least two clones of each region were sequenced on both directions using Thermo Sequenase Dye Terminator (Amersham Life Science) or Big Dye Terminator (Applied Biosystems). Samples were analyzed on ABI-PRISM 310 and 3100 automated sequencers (Applied Biosystems).
Electropherograms were checked by eye on Trace-Viewer 2.0.1, and the sequences were assembled manually through GeneRunner 3.00 program (Hastings Software). Automated alignments were obtained using DAMBE (Xia, 2000), employing the CLUSTALW algorithm. Codon usage was analyzed with the software CODONTREE. BLAST searches at National Center for Biotechnology Information (NCBI) were used to verify the similarity between our sequences and those from other insect mitochondrial genomes. Transfer RNA genes were identified and the secondary structures were inferred by the software tRNA-Scan (Lowe and Eddy, 1997).
A single sequence of 14,422 bp was assembled and deposited at GenBank database under the accession number AF466146, which has also a Genome accession number, NC_004529.

KD tRNA cluster analysis
Fourteen Meliponini species (Table 2) were selected for the KD tRNA cluster sequencing. The species analyzed, collected in four continents, may be considered representative of the Pantropical geographic distribution of the tribe. The KD region was amplified using the pair of primers Cox 2/ Atp8 and the PCR conditions described in Castro et al. (2002). The computational analyses were as described above. 452 Silvestre et al. *Primers initially used to amplify longer amplicons, covering most of the genome. These regions were re-amplified in smaller fragments to continue the sequencing process.

Results and Discussion
General features of the genome and gene content The size of the M. bicolor mtDNA had been previously estimated to be 18,500 bp by RFLP analyses (Weinlich et al., 2004). However, this total size could not be fully confirmed by sequencing, we were unable to clone the 4,100 bp fragment containing the control region and its adjacent region. We analyzed a continuous fragment of 14,422 bp, or about 78% of the estimated size. This fragment contains the 13 protein-coding genes, 18 of 22 tRNA genes and the two rRNA genes (complete sequence was obtained for the large subunit -16S, and partial sequence for the small subunit -12S) ( Figure 1). We detected five overlapping regions between genes, three of them including genes that were on the same strand (Table 3). In total, 30 bp were involved.
Seventeen non-coding regions were detected, with sizes ranging from one to 173 bp, totaling 486 bp (Table 4). Considering the same portion of A. mellifera mtDNA, excluding the hypervariable COI-COII intergenic region and the control region, the number of non-coding nucleotides is greater, 618 bp (Crozier and Crozier, 1993). Thus, comparing the sequenced portion of M. bicolor genome with the respective genome portion of A. mellifera, we verified that M. bicolor presents a more compact arrangement.
The COI-COII intergenic region known to occur in A. mellifera was absent in M. bicolor. Moreover, indirect PCR evidence indicated that the COI-COII region is absent in at least 16 other Meliponini species . This intergenic region has been extensively studied in A. 454 Silvestre et al.

Trigona doipaensis Thailand
Trigona flavibasis Thailand   (Garnery et al., 1992(Garnery et al., , 1995Franck et al., 1998). It has also been cited as a possible second origin of mtDNA replication and transcription (Cornuet et al., 1991). Our data clearly suggest that this region and possible function is not a shared feature between A. mellifera and Meliponini, in a broader sense.
The longest intergenic region found in M. bicolor consisted of 173 bp and was located between the tRNA Met and ND2 genes. Concerning the sequence similarities between M. bicolor and A. mellifera, we observed a 46 bp segment within this region that was highly similar (84%) to the non-coding region between COIII and tRNA Gly of A. mellifera.
The second longest non-coding region of M. bicolor mtDNA was located between the ND6 and CytB genes and was 94 bp in length. In A. mellifera this non-coding region is 60 bp long. The sequence similarity between these was 61%. The other intergenic regions found in M. bicolor were also analyzed, but were too small and showed no significant similarities with any region of the mtDNA of A. mellifera or other organisms.

Base composition -A+T bias
The adenine+thymine bias was very high in M. bicolor mtDNA (86.7%), as has been described in A. mellifera (84. 9% Crozier and Crozier, 1993). This latter species, as a member of Hymenoptera, has been cited to be the most AT biased insect mitochondrial genome sequenced (Simon et al., 1994). One hypothesis that attempts to explain this bias is that the DNA polymerase could use those bases in a more efficient way during mtDNA replication (Clary and Wolstenholme, 1985). The lower energetic cost to break the A-T links during mtDNA replication and transcription would generate AT bias on organisms that rely on mitochondrial efficiency to keep a high metabolic rate (Xia, 1996). Studies of additional Apidae genomes may indicate whether (and when) this character was fixed by selection for high metabolism in the evolutionary history of this group.

Protein-coding genes
The mitochondrial protein-coding genes were analyzed and nucleotide composition, codon usage and size were compared with A. mellifera. The initiation codons in M. bicolor protein-coding genes were seven ATT (for isoleucine), four ATA, and two ATG (both for methionine). Although the insect mitochondrial genetic code predicts isoleucine for the first codon, it is generally assumed that a special feature on translation changes all mitochondrial initiation codons to methionine on the final amino acid sequence (Wolstenholme, 1992). As in A. mellifera, there was no anomalous initiation codon (like D. yakuba ATAA; Clary and Wolstenholme, 1985). All M. bicolor stop codons are TAA, the standard for the mitochondrial genetic code (Wolstenholme, 1992). There were no incomplete codons (T or TA), as found in two genes of A. mellifera and four of D. yakuba (Crozier and Crozier, 1993).
The standard insect mitochondrial genetic code was used to analyze M. bicolor mtDNA successfully, since it yielded no stop codons within the gene sequences. The total number of codons (excepting start and stop codons) was 3,643, while the A. mellifera genome has 3,686. The codon usage of all M. bicolor protein-coding genes was compared with A. mellifera (Table 5). It is possible to observe that there is a preferred codon for each amino acid, generally ending with A or T. Interestingly, these codons are not always the complement to their anticodons, particularly when these latter begin with C or G, a feature that is common in insects (Foster et al., 1997). In A. mellifera, there are seven codons that are not used at all, and in M. bicolor there are even more, 12 non-used codons, all ending with C or G.
The AT bias in codon usage can be expressed in terms of the ratio of "G+C" (Pro, Ala, Arg and Gly) to "A+T" rich codons (Phe, Ile, Met, Tyr, Asn and Lys) (Crozier and Crozier, 1993). That ratio is 0.43 for D. yakuba, 0.18 for A. mellifera and 0.14 for M. bicolor, confirming the extreme AT bias of bee mtDNA. Foster et al. (1997) developed a graphical and statistical representation for the nucleotide usage on first and second codon positions, called square plots, to analyze AT bias on mitochondrial protein-coding genes. Those graphics do not use the third position because it is too variable and generally does not reflect the amino acid composition. Figure 2 represents the square plot of M. bicolor. The AT bias is evident, as most codons are in the first quadrant (two positions occupied by A or T). Table 6 summarizes the results obtained for all the protein-coding genes of M. bicolor, and the comparisons with A. mellifera. One of the most intriguing features noticed is the size difference for some genes between the two bees. The cytB gene presents the most extreme example, being 102 bp shorter in M. bicolor, such difference is concentrated at the amino terminal portion of the cytB protein, therefore the reading frame starts 34 codons downstream in reference to A. mellifera. Moreover the former species presents a non-coding region of 94 bp preceding the cytB gene, with 13 possible initiation codons (nine ATT and four ATA). However none of these give rise to a continuous reading frame. Base substitution or even deletion may explain the absence and presence of the initiation codon at different positions than expected. This difference becomes more striking if one considers that cytB is one of the most conserved mitochondrial gene (Simon et al., 1994).

Ribosomal RNA genes
The precise size of ribosomal RNA transcripts are normally difficult to infer from the DNA sequence by itself, so it is assumed that they end on the boundaries of the flanking genes (Boore, 2001). The M. bicolor large subunit of ribosomal RNA gene (lrRNA or 16S) was completely sequenced. It has 1,354 bp, 17 bp smaller than the A. mellifera 16S gene (Crozier and Crozier, 1993), and their nucleotide similarity was 81%. The 16S G+C content in M. bicolor is 13.2%, while A. mellifera is 15.4% and Drosophila yakuba, 17%.
The sequence for the small rRNA subunit (srRNA or 12S) was not obtained completely because it flanks the control region of the mtDNA, which was not amplified. Assuming that this gene has the same length as A. mellifera, we sequenced 55% of it (437 bp). The sequence similarity between A. mellifera and M. bicolor for this sequenced fragment is 79%. As found for 16S, the 12S sequenced stretch presented a higher A+T content (83%) comparing to A. mellifera (81%) and D. yakuba (79%). 456 Silvestre et al.   ATP6  684  681  ATG  ATG  TAA  TAA  79%  ATP8  168  159  ATT  ATT  TAA  TAA  72%  COI  1560  1566  ATT  ATA  TAA  TAA  86%  COII  678  676  ATT  ATT  TAA  T  82%  COIII  780  777  ATG  ATG  TAA  TAA  77%  CytB  1050  1152  ATT  ATG  TAA  TAA  80%  ND1  930  918  ATA  ATT  TAA  TAA  75%  ND2  939  1002  ATA  ATC  TAA  TAA  69%  ND3  354  354  ATA  ATA  TAA  TAA  73%  ND4  1323  1344  ATT  ATA  TAA  TAA  76%  ND4L  279  264  ATA  ATT  TAA  TAA  72%  ND5  1647  1665  ATT  ATT  TAA  TAA  77%  ND6  540  504 ATT ATT TAA TAA 69% The difference in size observed for the 16S gene is quite small, since we have found variations of 102 bp for protein-coding genes (cytB). Size differences are acceptable on rRNA genes more than on protein-coding genes, since there is no need to maintain a frame to read, and only the secondary structure matters to their function (Wolstenholme, 1992). Castro and Dowton (2005) aligned the 12S and 16S genes of several insects and found con-served sequence blocks, indicating that the rRNA secondary structure is also conserved as consequence.

Difficulties to amplify the A+T-rich region in M. bicolor
The A+T-rich region appears generally difficult to amplify in insects, mainly due to tandem repeats, heteroplasmy and great length variation at intra and inter-specific levels (Zhang and Hewitt, 1997). Also the use of heterospecific primers, designed from flanking tRNA gene sequences, may lead to amplification failure if one of those genes is translocated to another region of the genome (Zhang and Hewitt, 1997). Our inability to amplify the control region of M. bicolor may also be explained by its size. Weinlich et al. (2004) estimated it to be about 3,300 bp long, around 2.5 kb longer than in A. mellifera (Crozier and Crozier, 1993). Normally such differences are due to partial duplications inside this region, a common feature of insect mtDNA (Simon et al., 1994), which may also make amplification difficult.

Transfer RNA genes
The tRNA genes were mainly identified by eye, using simple local alignment with their homologues in A. mellifera mtDNA. However, when detection was difficult due to low similarity or translocations, these regions were analyzed with specific software (tRNA-Scan, Lowe and Eddy, 1997), which identifies the genes and folds them in typical cloverleaf structures.
From the 22-23 tRNA genes regularly found in animal mitochondrial genomes, 19 were identified on M. bicolor mtDNA (Figure 3). The four missing genes were tRNAs for Cys, Gln, Ile and Ser (S1). Although the tRNA Ile gene could be identified and positioned on the M. bicolor genome, its sequence was not entirely obtained. Analyzing all the M. bicolor tRNA sequences as a whole, we have 1,193 bp, with 11.1% G+C and 88.9% A+T. The proportion of A+T for the same genes of A. mellifera is slightly less: 87.1%.
The secondary structures of tRNAs of A. mellifera and M. bicolor were very similar. Nonetheless, a few differences were found, concentrated at the D and TΨC arms, considered the most variable ones (Clary and Wolstenholme, 1985). The anticodon loop always had seven nucleotides, and its arm always had five base pairs, except on A. mellifera tRNA Val (4 bp). The acceptor arm was also conserved in size (7 bp), except on M. bicolor tRNA Arg (6 bp). The anticodons were the same for both species.
Eleven tRNA gene clusters ( Figure 1) were identified, comprising the 19 tRNA genes sequenced. Nine tRNA genes of M. bicolor are on different positions or strands when compared with A. mellifera. Two tRNA genes had their position inferred based on their absence in the predicted or sequenced locations. These 11 genes are distributed in only 4 clusters. This number of translocations is higher than those usually found between pairs of Diptera species (OGRE). The molecular and evolutionary implications of this phenomenon on bees are discussed elsewhere (Silvestre et al., 2002;Silvestre and Arias, 2006).

The KD tRNA cluster in Meliponini
In insects, the junction between the cytochrome oxidase II and ATPase 8 genes normally contains two tRNA genes, tRNA Lys (K) and tRNA Asp (D) (e.g. Clary and Wolstenholme, 1985). This junction has been called the KD cluster (Dowton and Austin, 1999), as the plesiomorphic organization is considered to be KD (i.e. COII-K-D-ATP8); this arrangement has been verified in several members of the Hymenoptera, Diptera and Orthoptera. However, in M. bicolor this cluster is represented only by the tRNA Asp (D) gene, while A. mellifera has the plesiomorphic organization (KD). This observation led us to amplify, sequence and characterize the KD cluster in 14 Meliponini species, representing the large distributional range of the tribe. Species from Brazil, India, Thailand, Australia and Africa were analyzed; all contained only the tRNA Asp gene (D). In the M. bicolor mitochondrial genome, the tRNA Lys gene is located in the first tRNA cluster, near the A+T rich region; this region is considered a hot spot of translocation events (Boore and Brown, 1998). Although we have not sequenced cluster one in the additional meliponine species (to confirm the location of the tRNA Lys gene), its absence in the KD cluster is a strong argument that the translocation of this gene occurred very early in the evolutionary history of Meliponini, and seems to be a fixed character of the tribe. Data from other bee families and tribes were obtained (Dowton and Austin, 1999;Silvestre and Arias, unpublished data) and reinforce that this gene rearrangement is a unique feature of Meliponini.
The mitochondrial genome of M. bicolor is the first that has been sequenced for stingless bee species. This work provides critical data for future mtDNA analyses of other meliponine species, facilitating the investigation of biological, ecological and evolutionary questions at intraand inter-specific levels. As several stingless bees are considered endangered, population studies applying molecular tools will be very important in terms of conservation. The sequence obtained here, representing 78% of the total genome, has been available in GenBank since 2003, and has already been used for phylogenetic purposes (Castro and Dowton, 2005) in an attempt to reconstruct the phylogeny of the Holometabola and the position of the Hymenoptera within it. There is an emerging tendency to use complete or nearly complete mtDNA genome sequences for phylogenetic analysis, while this type of data may also provide gene order characters for phylogenetic inferences (Boore and Brown, 1998;Rawlings et al., 2001). It is worthwhile to note the high rate of tRNA gene rearrangement found in the M. bicolor mitochondrial genome in comparison to other insects. Studies on wasps have already indicated that the Hymenoptera generally have an accelerated rate of mtDNA gene rearrangement (Dowton and Austin, 1999;Dowton et al., 2003). Gene order characters may help resolve the evolution and phylogeny of the bees, particularly the unsolved question about the origin of eusociality among corbiculate bees. The high conservation of the translocation of the tRNA Lys gene in Meliponini suggests that it occurred before the great diversification of Meliponini species and their dispersion over the tropical and southern subtropical areas of the World. The inclusion of more molecular data and analysis of other tRNA clusters may provide further clues for the evolution and biogeography of the Meliponini.