Complete sequence and organization of the mitochondrial genome of Cyclemys atripons (Testudines, Geoemydidae)

The Black Bridged Leaf Turtle, Cyclemys atripons (Testudines; Cryptodira; Geoemydidae), is a poorly known species within the genus Cyclemys. We determined the complete nucleotide sequence of the Cyclemys atripons mitochondrial genome (mtDNA) and found it to be 16,500 base pairs (bp) in length, with genome organization, gene order and base composition identical to those of typical vertebrates. However, unlike most turtle mtDNAs reported so far, it has no extra base in the NADH3 gene. The C. atripons mtDNA control region was 981 bp long. Comparison with three other geoemydids showed that the C. atripons control region contains a highly variable region at the 3’ end composed of AT-enriched tandem repeats, including a variable number of tandem repeats (VNTR) array of fifteen 5’-A(AT)3-3’ units.

Vertebrate mitochondrial (mt) DNA is a double-stranded circular molecule of about 15-20 kb that generally contains 37 genes, encoding 13 proteins, 22 tRNAs and 2 rRNAs, and a major noncoding region bearing signals for mitochondrial replication and transcription (Wolstenholme, 1992). Due to its maternal inheritance and relative lack of recombination, the mitochondrial genome has been widely employed as a marker in vertebrate phylogenetic analyses and has often been used in turtle research. Turtles, with approximately 270 species worldwide (Iverson, 1992), are easily recognized by the public and have been widely studied. Many earlier studies of turtle mtDNA concentrated on polymorphism analysis using restriction fragment length polymorphism (RFLP) and on partial sequences (Lamb et al., 1989, 1994). However, studies are increasingly moving to direct sequencing of the complete mtDNA (Peng et al., 2005, 2006; Parham et al., 2006). As of March 2007, complete mitochondrial genomes had been released in GenBank for only 17 turtle species, comprising 16 cryptodiran turtles and one side-necked turtle, which is far from sufficient for reliable turtle studies.
The Black Bridged Leaf Turtle, Cyclemys atripons (Testudines; Cryptodira; Geoemydidae), is a poorly understood cryptodiran turtle species within the genus Cyclemys (Guicking et al., 2002). Previously, only the cytochrome b (Cyt b) gene of C. atripons mtDNA had been published (Spinks et al., 2004), so further studies on this species are clearly necessary. In this study we sequenced and characterized the complete mitochondrial genome of C. atripons, laying the foundation for further comparative analyses between C. atripons and other turtles.
In 2005 a C. atripons specimen was obtained from the suburbs of Longzhou city in the Guangxi region of China and, after its natural death, was frozen at -80 °C for preservation. Total DNA was extracted from liver and muscle tissue using the proteinase K method (Sambrook and Russell, 2001) and kept at -20 °C until needed for polymerase chain reaction (PCR) amplification.
Based on the partial sequences reported by Spinks et al. (2004) and the similarity of the mtDNA sequences of the painted turtle (Chrysemys picta; GenBank NC_002073) and Reeves' Turtle (Chinemys reevesii; GenBank AY676201), we designed 16 pairs of primers for PCR amplification (Table 1). The PCR was carried out in a total volume of 25 µL containing 100 ng of sample genomic DNA, 2.5 µL of 10× buffer (TaKaRa, Japan), 2 µL of 2.5 mol L⁻¹ MgCl₂, 1.5 µL of each dNTP, 0.25 µL of each primer (25 µmol L⁻¹) and 1 unit of Taq DNA polymerase (TaKaRa). Thermal cycling consisted of pre-denaturation at 95 °C for 2 min, followed by 35 cycles of 94 °C for 40 s, 51 °C to 58 °C for 45 s and 72 °C for 1 min, plus a final extension at 72 °C for 10 min. The resulting PCR fragments were first resolved on 1% (w/v) agarose gels (Promega, USA). After electrophoresis, gels were stained with ethidium bromide and bands were visualized under ultraviolet light. Bands of the intended size were excised and recovered with a Gel Extract Purification Kit (TaKaRa, Japan). The cleaned PCR products were sequenced in both directions on an ABI3730 automated sequencer (Invitrogen Biotechnology). The sequences obtained from each sequencing reaction averaged 1000 bp in length, and each segment overlapped the next contig by roughly 150 bp. The whole mtDNA sequence was read at least twice.
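The reaction recipe and cycling programme above imply two quantities that are easy to check by hand: the final concentration of each primer in the 25 µL reaction and the minimum hold time of the programme. The sketch below is only an arithmetic illustration using the figures quoted in the text; the function and variable names are ours, and the time estimate ignores ramping between temperatures, which depends on the thermocycler.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution formula: C_final = C_stock * V_stock / V_total.

    The result carries the same concentration unit as stock_conc.
    """
    return stock_conc * stock_volume_ul / total_volume_ul


# Values quoted in the text: 0.25 uL of a 25 umol/L primer stock in 25 uL.
primer_final = final_concentration(25.0, 0.25, 25.0)  # umol/L

# Hold times in seconds: 2 min pre-denaturation, 35 cycles of
# (40 s + 45 s + 60 s), and a 10 min final extension.
cycle_seconds = 40 + 45 + 60
total_seconds = 2 * 60 + 35 * cycle_seconds + 10 * 60

print("final primer concentration:", primer_final, "umol/L")
print("minimum programme time:", round(total_seconds / 60, 1), "min")
```

Under these figures each primer ends up at 0.25 µmol L⁻¹ and the programme holds for 5,795 s (about 97 min) before ramp time is added.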
Sequence data were analyzed with the EditSeq (DNASTAR) and ClustalX 1.8 (Thompson et al., 1997) programs. The locations of protein-coding, rRNA and tRNA genes were identified with the tRNAscan-SE 1.21 and Sequin v. 5.35 programs, which were also used for comparisons with the corresponding sequences from the other turtles cited above. The control region sequence was analyzed with the DNAsis and BioEdit programs. The resulting complete mitochondrial genome of C. atripons (16,500 bp) was deposited in GenBank under accession number EF067858.
The structural organization (Table 2) and gene order (Figure 1) of the complete C. atripons mitochondrial genome were identical to those of other typical vertebrates. The genome contains 13 protein-coding genes, all of which except NADH6 are encoded on the H-strand; 22 tRNA genes, 14 on the H-strand and 8 on the L-strand; 2 rRNA genes, 12S and 16S, both on the H-strand; and one control region. Noncoding intergenic spacers were few and short, with intervening sequences of 8 bp between the cytochrome c oxidase mitochondrial subunit II gene (COII) and tRNA Lys and 13 bp between NADH4 and tRNA His (Table 2). The base composition of the major coding strand of C. atripons mtDNA was A = 34.42%, G = 13.01%, C = 25.36% and T = 27.20%, showing the low G content and high A+T bias seen in most other turtles (Pu et al., 2005; Peng et al., 2005, 2006). In C. atripons mtDNA three protein-coding genes (COIII, NADH6 and Cyt b) have an incomplete stop codon such as T, while the cytochrome c oxidase mitochondrial subunit I gene (COI) has GTG instead of ATG as its start codon. As in other vertebrate mitochondrial genomes, we found three instances of reading frame overlap: 10 nucleotides between ATP8 and ATP6, 7 between NADH4L and NADH4, and 5 between NADH5 and NADH6 (Table 2).
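Base composition figures of the kind reported above can be derived from any sequence by simple counting. The following is a minimal sketch, not the procedure used in this study (which relied on the programs named in the methods); the helper name `base_composition` and the toy sequence are illustrative assumptions, and only the four reported percentages come from the text.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T in seq, ignoring other symbols."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


# Toy sequence for illustration only (not C. atripons data).
toy = base_composition("AACGTT")

# Percentages reported in the text for the major coding strand.
reported = {"A": 34.42, "G": 13.01, "C": 25.36, "T": 27.20}
at_content = round(reported["A"] + reported["T"], 2)
checksum = round(sum(reported.values()), 2)

print("toy composition:", toy)
print("reported A+T content (%):", at_content)
print("sum of reported percentages:", checksum)
```

From the reported values, A+T comes to 61.62% and the four percentages sum to 99.99%, the shortfall being consistent with rounding to two decimal places.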
However, our analysis of C. atripons did not find the extra base usually present at a specific position in NADH3 of most other turtles (Mindell et al., 1998, 1999; Pu et al., 2005; Parham et al., 2006). Such an insertion in the NADH3 gene has been reported in most turtles, with the exception of Pelodiscus sinensis and Kinosternon flavescens (Peng et al., 2005; Pu et al., 2005; Parham et al., 2006). It is generally thought that the extra base could introduce a frameshift leading to a TAA stop codon that prematurely terminates protein translation unless corrected by RNA editing or other mechanisms (Mindell et al., 1998, 1999). However, since the additional base is apparently absent from C. atripons, more studies are needed to ascertain whether the extra base in NADH3 is a common characteristic of turtles or is specific to certain species and genera.
The C. atripons mitochondrial genome contains 22 tRNA genes, ranging in size from 66 to 76 nucleotides (Table 2) and interspersed between the rRNA and protein-coding genes, as is typical of the mtDNAs of other vertebrates. Most of the tRNA genes could be folded into the canonical cloverleaf secondary structure, the exception being the tRNA Ser (AGY) gene, which lacks the dihydrouridine arm (D arm). The C. atripons 12S rRNA gene was 969 nucleotides long and the 16S rRNA gene 1,601 nucleotides long; as in other vertebrates, these genes are separated by the tRNA Val gene and positioned between the tRNA Phe and tRNA Leu (UUR) genes (Table 2).
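The frameshift argument made for the NADH3 extra base can be illustrated with a toy reading frame: inserting a single nucleotide shifts every downstream codon and can bring a stop codon into frame early. The sketch below uses an invented 18-nucleotide sequence, not the real NADH3 gene, and a hypothetical helper `first_stop_index`; the stop codon set is that of the vertebrate mitochondrial genetic code (TAA, TAG, AGA, AGG).

```python
# Stop codons of the vertebrate mitochondrial genetic code.
MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def first_stop_index(seq, stops=MITO_STOPS):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in stops:
            return i // 3
    return None


# Invented open reading frame: ATG CCT AAC GGA TTT TAA
toy_orf = "ATGCCTAACGGATTTTAA"

# Insert one extra base after the start codon: ATG GCC TAA CGG ...
shifted = toy_orf[:3] + "G" + toy_orf[3:]

print("stop codon index without insertion:", first_stop_index(toy_orf))
print("stop codon index with insertion:", first_stop_index(shifted))
```

In this toy case the stop moves from codon 5 to codon 2, which is the kind of premature termination that would have to be corrected for a full-length protein to be produced.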
In the C. atripons mitochondrial DNA, the light-strand replication origin (31 nucleotides) was located between the tRNA Asn and tRNA Cys genes inside the WANCY tRNA gene cluster (Figure 1). This region has been found in the mtDNAs of all other cryptodiran turtles investigated (Pu et al., 2005; Peng et al., 2006), in contrast to its apparent absence from Pelomedusa subrufa (Zardoya and Meyer, 1998b). The C. atripons sequence can potentially fold into a stable stem-loop secondary structure with a stem of 10 bp and a loop of 10 nucleotides. The secondary structures of the origin of light-strand replication (OL) for 19 cryptodiran turtles (Figure 2) show that the OL sequences are rather conserved and that their secondary structures are also similar, with 9 bp identical in the stems, possibly a common characteristic of cryptodiran turtles.
The C. atripons D-loop control region was 981 bp long, 69.52% A+T and flanked by the tRNA Pro and tRNA Phe genes (Table 2). A comparison of the complete control region sequences of four geoemydid turtles is given in the online edition of this paper (Figure S1). As in the three other geoemydid turtles, conserved sequence blocks (CSBs) 1-3 (Walberg and Clayton, 1981) were identified in the C. atripons control region. The total lengths of the four control region sequences ranged from 981 bp in C. atripons to 1,379 bp in Cuora aurocapitata, with the differences arising mainly from sequences at the 3' end. Interestingly, a large number of AT-enriched tandem repeats containing variable number tandem repeats (VNTRs) were found at the 3' end (right domain) of the control regions. Furthermore, the composition and number of these tandem units differed among species, with the C. atripons VNTR being composed of fifteen 5'-A(AT)3-3' units (Figure S1).
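A tandem array such as the fifteen A(AT)3 units described above can be located in a sequence by searching for consecutive copies of the repeat core. The following is a minimal sketch under stated assumptions: the core A(AT)3 is written out as AATATAT, the test sequence is synthetic (fifteen cores between arbitrary flanks, not the real control region), and `tandem_runs` is a hypothetical helper built on Python's standard `re` module. It finds exact repeats only and would miss units carrying substitutions.

```python
import re


def tandem_runs(seq, motif, min_units=2):
    """Return (start, unit_count) for each run of >= min_units exact motif copies."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_units))
    return [(m.start(), len(m.group(0)) // len(motif))
            for m in pattern.finditer(seq)]


CORE = "AATATAT"  # A(AT)3 written out in full

# Synthetic sequence: arbitrary flanks around fifteen copies of the core.
synthetic = "GCGC" + CORE * 15 + "GGCC"

runs = tandem_runs(synthetic, CORE)
for start, units in runs:
    print("run at position", start, "with", units, "units,",
          units * len(CORE), "bp")
```

With a 7-nucleotide core, fifteen units span 105 bp, so an array of this kind accounts for a noticeable share of a 981 bp control region.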
The control region is usually considered the most variable part of the mtDNA in terms of nucleotide substitutions, short insertions/deletions and VNTR dynamics. However, these variations are not distributed randomly across the whole region but occur at particular hypervariable sites and domains at the 5' and 3' ends (Su, 2005).
Previous studies of turtles utilizing control region sequences focused primarily on the 5' end adjacent to tRNA Pro and on several regulatory motifs (Lamb et al., 1994; Walker et al., 1997; Walker and Avise, 1998). At present, however, most work focuses on the 3' end close to tRNA Phe, especially on tandemly repeated sequences, including VNTRs (Serb et al., 2001). Length differences between the mitochondrial genomes of different species are caused mainly by divergent tandem repeats, which are thought to be generated by strand slippage and mispairing during replication (Fumagalli et al., 1996).
Tandemly repeated control region DNA has been reported from an ever-growing number of taxa. Zardoya and Meyer (1998a) characterized six tandem repeats (containing VNTRs) in the 3' domain of the P. subrufa control region and suggested that this sequence might be a potentially informative molecular marker for population studies because of its unique location in the maternally inherited mitochondrial molecule. Remarkably, tandem repeats are present in all four geoemydid turtles discussed in this paper. In Pyxidea mouhotii (DQ659152) the repeat consists of two different repeat cores, "ATTATATC" followed by "AT", rather than the single core "A(AT)3" found in C. atripons; in C. reevesii (AY676201) there are more than ten 5'-ATATATC-3' units followed by an AT-rich repeat; and in C. aurocapitata (AY874540) an approximately 490 bp AT-rich repeat lies at the 3' end of the control region (CR), with only a few "G" nucleotides and no "C" nucleotides.
In the mitochondrial genomes of C. atripons and other turtles, AT-enriched tandem repeats (containing VNTRs) reflect heteroplasmy, suggesting interspecies turtle genetic diversity. The occurrence of tandemly repeated mtDNA in the control region could be regarded as a special molecular marker in research on turtle species. However, to confirm this and to establish the presence or absence of variation between specimens of the same species (intraspecies polymorphism), more informative characters need to be obtained from more turtle species and specimens. Taken as a whole, our study highlights the need for further work focusing on the 3' domain of the mitochondrial control region.

Zhang et al.
Table 1 - Polymerase chain reaction primers used in the determination of the complete mitochondrial genome of Cyclemys atripons.
Primer    Primers (Y = C/T, R = A/G, W = A/T, M = A/C, H = A/C