Organization and variation of mitochondrial DNA control region in pleurodiran turtles

Three complete mitochondrial DNA (mtDNA) control regions (CRs) of Chelodina rugosa (Ogilby, 1890), Chelus fimbriata (Schneider, 1783), and Podocnemis unifilis (Troschel, 1848) were determined for the first time using the long-PCR method; their lengths were 1,016 bp, 1,149 bp, and 985 bp, respectively. Together with the CR of Pelomedusa subrufa (Bonnaterre, 1789) and the nearly complete CR of Podocnemis expansa (Schweigger, 1812) obtained from GenBank, the structural and evolutionary characteristics of mtDNA CRs in pleurodiran turtles were analyzed in this study. We identified three functional domains (TAS, CD, and CSB domains) as well as their conserved sequences (TAS, CSB-F, and CSB-1) according to their homology to those of other turtles. Within the TAS domain, an interrupted poly-C stretch was found in C. rugosa, C. fimbriata, and P. subrufa, which also exists in the published mtDNA CRs of Chrysemys picta (Schneider, 1783), Trachemys scripta (Thunberg in Schoepff, 1792), and Trionyx triunguis (Forskål, 1775). An analysis of the origin of the poly-C sequences in the TAS domain of these six turtles suggested that they are more closely related to the "goose hairpin" of birds than to the CSB-2 of the CSB domain. In the CSB domain, CSB-2 and CSB-3, which have been identified in the CRs of Cryptodira, were absent from the CRs of Pleurodira, indicating that the regulatory mechanisms of transcription may differ between the two suborders and that the lack of CSB-2 and CSB-3 could be proposed as a diagnostic character distinguishing Pleurodira from Cryptodira at the molecular level. As in the CRs of cryptodiran turtles, variable numbers of tandem repeats (VNTRs) were found at the 3' end of the CRs of the five pleurodiran turtles. Interestingly, the long repeated motifs from each species could form stable stem-loop secondary structures, suggesting that the repeated sequences may play an important role in regulating replication of the mitochondrial genome in Pleurodira, and that the secondary structures of VNTRs may provide some potential information for phylogenetic inference.

The mitochondrial genome of vertebrates is highly conserved and compact, usually including genes that code for 13 proteins, 22 tRNAs, and two rRNAs, as well as an important noncoding sequence, the control region (CR). The CR contains two promoters for transcription (LSP and HSP), the heavy-strand replication origin (OH), and the displacement loop (D-loop) (CLAYTON 1982, CHANG & CLAYTON 1986). Based on the distribution of the variable nucleotide positions and the different nucleotide frequencies, the mitochondrial CR is divided into three domains (BROWN et al. 1986, SACCONE et al. 1991): the termination associated sequence (TAS) domain, the central conserved domain (CD), and the conserved sequence block (CSB) domain. The TAS domain has been shown to contain sequences associated with the termination of newly synthesized H-strands during replication (DODA et al. 1981, BROWN et al. 1986, SBISÀ et al. 1997). The CD, which contains several areas of highly conserved sequence, is more conserved than the TAS and CSB domains. However, the nature of these conserved sequences varies among vertebrate classes. For example, ANDERSON et al. (1981) identified several conserved sequence boxes (CSB B, C, D, E, and F) in the CD of the human mitochondrial CR. Of those, only three (CSB B, CSB D, and CSB F) can be found in the avian mitochondrial CD (RUOKONEN & KVIST 2002). The CSB domain usually contains the origin of the H-strand transcription (WALBERG & CLAYTON 1981, BROWN et al. 1986, KING & LOW 1987, FORAN et al. 1988). In many taxa, the TAS and CSB domains, which are more variable than the CD, have variable numbers of tandem repeats (VNTRs) (BROUGHTON & DOWLING 1994, SBISÀ et al. 1997, ZARDOYA & MEYER 1998a, DELPORT et al. 2002, FU et al. 2006, ZHANG et al. 2009, XIONG et al. 2010).
Extant turtles have been divided into two monophyletic clades, Pleurodira and Cryptodira. However, virtually all studies on turtle mitochondrial genomes have focused on Cryptodira. Studies comparing the mitochondrial CRs of pleurodiran turtles have been limited because only one complete CR has been published, that of Pelomedusa subrufa (Bonnaterre, 1789). In the present study, the complete CRs of three pleurodiran turtles, Chelodina rugosa (Ogilby, 1890), Chelus fimbriata (Schneider, 1783), and Podocnemis unifilis (Troschel, 1848), representing two families (Chelidae and Podocnemididae), are characterized. Together with the complete CR of P. subrufa and a nearly complete CR of Podocnemis expansa (Schweigger, 1812) obtained from GenBank, we have compared the CR sequences of pleurodiran turtles with the CRs of other vertebrates. The features that are shared, or not shared, among the pleurodiran turtles and other vertebrates have been identified and are discussed in detail.
ZOOLOGIA 28 (4): 495-504, August, 2011

Sample and sequencing
Specimens of C. rugosa, C. fimbriata, and P. unifilis were stored at the Anhui Normal University. Total genomic DNA was extracted from their muscles with the proteinase K method (SAMBROOK & RUSSELL 2001) and kept at -20°C for PCR amplification.
The mitochondrial CR was amplified by long-PCR. The entire CR and three tRNA genes (tRNAThr, tRNAPro, and tRNAPhe), as well as partial Cyt b and 12S rRNA gene sequences, were amplified in a single step using a pair of long-PCR primers:
LCR-F: 5'-CTTCCTATTTGCCTATGCTATC-3'
LCR-R: 5'-TATTTTGGGCTCCTGGTGTA-3'
Long-PCR conditions were: one minute at 95°C, then 30 cycles of 10 seconds at 98°C and five minutes at 55°C, followed by a final extension of 10 minutes at 72°C. PCR products were isolated using a Gel Extract Purification Kit (TaKaRa Co., Ltd, Dalian, China) after 1% agarose gel electrophoresis. The purified products were sequenced with an ABI3730 automated sequencer.
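As a quick sanity check on the primer pair given above, the following minimal sketch reports the length, GC content, and reverse complement of each primer. Only the two primer sequences come from the text; the function names are illustrative and no melting temperature model is implied.

```python
# Minimal sketch: basic properties of the two long-PCR primers quoted in the text.
# Function names are illustrative; only the primer sequences come from the paper.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

PRIMERS = {
    "LCR-F": "CTTCCTATTTGCCTATGCTATC",
    "LCR-R": "TATTTTGGGCTCCTGGTGTA",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"{100 * gc_fraction(seq):.1f}% GC",
              reverse_complement(seq))
```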

Sequence analysis
In order to determine the complete CR sequence of the three turtles, the sequences obtained for C. rugosa, C. fimbriata, and P. unifilis were compared with the complete mtDNA of P. subrufa (ZARDOYA & MEYER 1998b), and the tRNAs were identified using tRNAscan-SE 1.21 (LOWE & EDDY 1997). The sequence containing the almost complete CR of P. expansa was retrieved from GenBank. Subsequently, the five pleurodiran turtle sequences were aligned using ClustalX 1.8 (THOMPSON et al. 1997) and then checked manually in order to define the conserved sequence blocks.
After comparison with published data from other taxa (the sequences used are listed in the Appendix), the conserved sequence box F (CSB-F) was delimited in the aligned sequences. The conserved sequence block 1 (CSB-1), with its characteristic motif GACATA, was also delimited. The starting point of the CSB-F was taken as the boundary between the TAS domain and the central conserved domain.
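The boundary rule described here (TAS domain ends where CSB-F starts; the CSB domain starts at CSB-1) can be sketched as a simple slicing step. This is only an illustration under stated assumptions: the CSB-F start index is supplied by hand, the first GACATA occurrence downstream of it is used as a stand-in for the start of CSB-1, and the example sequence is invented. In the study itself, the blocks were delimited from a manually checked alignment.

```python
# Minimal sketch of the domain-boundary rule; not the authors' procedure.
# Assumptions: csb_f_start is known from an alignment, and the first GACATA
# hit downstream of CSB-F approximates the start of CSB-1.

def split_domains(cr: str, csb_f_start: int, csb1_motif: str = "GACATA"):
    """Split a control region into (TAS domain, central domain, CSB domain)."""
    cr = cr.upper()
    csb1_start = cr.find(csb1_motif, csb_f_start)
    if csb1_start < 0:
        raise ValueError("CSB-1 motif not found downstream of CSB-F")
    tas_domain = cr[:csb_f_start]            # 5' end up to the start of CSB-F
    central = cr[csb_f_start:csb1_start]     # CSB-F start up to CSB-1
    csb_domain = cr[csb1_start:]             # CSB-1 to the 3' end
    return tas_domain, central, csb_domain


if __name__ == "__main__":
    toy_cr = "ACCCCTAAAC" + "GGTTAGTCCA" + "GACATAATTT"  # invented sequence
    for name, part in zip(("TAS", "CD", "CSB"), split_domains(toy_cr, 10)):
        print(name, len(part), part)
```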
The CSB domain was always set to start with the conserved sequence block 1 (CSB-1) (ZHANG et al. 2009, XIONG et al. 2010). The program Tandem Repeats Finder (BENSON 1999) was used to identify the VNTRs in the CRs. Furthermore, putative secondary structures in the CRs were determined using the software RNAstructure (MATHEWS et al. 1999). Subsequently, the program RNAdraw (HOFACKER et al. 1995) was employed to prepare the secondary structures for publication.
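To illustrate what a tandem-repeat search looks for, the sketch below scans a sequence for exact, directly adjacent copies of a motif. It is a deliberately naive stand-in and not the algorithm of Tandem Repeats Finder, which also handles mismatches and indels; the function name, parameters, and example sequence are invented for illustration.

```python
# Naive exact tandem-repeat scan; an illustrative stand-in, NOT the
# Tandem Repeats Finder algorithm (which tolerates mismatches and indels).

def find_tandem_repeats(seq: str, min_unit: int = 2, max_unit: int = 50,
                        min_copies: int = 2):
    """Return a list of (start, unit, copies) for perfect tandem repeats."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        best = None  # (unit, copies) covering the longest span from position i
        longest_unit = min(max_unit, (n - i) // min_copies)
        for unit_len in range(min_unit, longest_unit + 1):
            unit = seq[i:i + unit_len]
            copies = 1
            # Count how many times the unit repeats back to back.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                if best is None or copies * unit_len > best[1] * len(best[0]):
                    best = (unit, copies)
        if best is not None:
            hits.append((i, best[0], best[1]))
            i += len(best[0]) * best[1]  # skip past the reported repeat array
        else:
            i += 1
    return hits


if __name__ == "__main__":
    toy = "GGT" + "ATTAC" * 3 + "TTG"  # invented sequence
    print(find_tandem_repeats(toy))
```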

The length and base composition of pleurodiran turtle CRs
The CRs of C. rugosa, C. fimbriata, and P. unifilis are 1,016 bp, 1,149 bp, and 985 bp long, respectively. They are rich in adenine and thymine and have few L-strand guanines, which is evident in each domain, particularly in the CSB domain (Tab. I). However, the base composition is not uniform among the three domains. The central conserved domain is poor in adenine and rich in L-strand guanines compared to the flanking TAS and CSB domains. Interestingly, among the five turtles compared, the TAS domain was found to be richer in adenine and thymine in three species: C. rugosa, C. fimbriata, and P. subrufa. This TAS domain composition is consistent with that of cryptodiran turtles (ZHANG et al. 2009, XIONG et al. 2010). However, in the other two pleurodiran turtles (P. unifilis and P. expansa), the TAS domain was rich in cytosine and thymine.
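Per-domain base composition of the kind summarized in Table I can be computed as in the minimal sketch below. The function name and the example sequence are illustrative only; the values in Table I are not reproduced here.

```python
# Minimal sketch: percentage base composition of one strand of a sequence.
# The example sequence is invented and unrelated to the values in Table I.
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the percentage of A, C, G, and T, ignoring other symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


if __name__ == "__main__":
    for base, pct in base_composition("AATTACGC").items():
        print(base, f"{pct:.1f}%")
```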

The organization of turtle CRs
As in most cryptodiran turtles, the CRs in the mitochondrial genomes of pleurodiran turtles are located between the tRNAPro and tRNAPhe genes. The CSB-F and CSB-1 were easy to define in the five turtles (Fig. 1). In the CSB-F, 12 of 29 (41.4%) nucleotide positions were fixed among the five turtles. In the CSB-1, 9 of 20 (45%) nucleotide positions were fixed. Compared with those of cryptodiran turtles (ZHANG et al. 2009, XIONG et al. 2010), the conserved blocks of pleurodiran turtles varied greatly. This variation may be due to the age of the group. The fossil record has shown that most pleurodiran turtles diverged about 100-150 mya (million years ago) (GAFFNEY 1990). During this long evolutionary history, the conserved block sequences of the mtDNA CRs might have changed independently, accumulating more variation than the CRs of cryptodiran turtles, in which most species diverged after 100 mya (NEAR et al. 2005).
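The proportion of fixed positions quoted for CSB-F and CSB-1 is a column-wise count over the alignment. A minimal sketch follows, assuming that a position counts as fixed when every sequence carries the same nucleotide and none has a gap; the toy alignment is invented and is not the alignment of Fig. 1.

```python
# Minimal sketch: count alignment columns where all sequences agree.
# Assumption: a column with a gap ("-") is not counted as fixed.

def fixed_sites(aligned: list) -> tuple:
    """Return (number of fixed columns, alignment length)."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    fixed = sum(
        1 for column in zip(*aligned)
        if len(set(column)) == 1 and column[0] != "-"
    )
    return fixed, len(aligned[0])


if __name__ == "__main__":
    toy_alignment = ["GACATA", "GACGTA", "GATATA"]  # invented sequences
    fixed, length = fixed_sites(toy_alignment)
    print(f"{fixed} of {length} ({100 * fixed / length:.1f}%) positions fixed")
```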

TAS domain
The TAS domain was located between the 5' end of the CR and the beginning of the CSB-F. The length of the TAS domain varied from 243 bp to 357 bp, and the sequences were heterogeneous and could not be unambiguously aligned. However, the TAS sequences were easily determined and were identical among the pleurodiran and cryptodiran turtles (Fig. 1).
An interrupted poly-C stretch was found in the TAS domain of C. rugosa and C. fimbriata. This poly-C stretch also occurs in four other turtles: the cryptodires Chrysemys picta (Schneider, 1783), Trachemys scripta (Thunberg in Schoepff, 1792), and Trionyx triunguis (Forskål, 1775), and the pleurodire P. subrufa (ZARDOYA & MEYER 1998a). It is a remarkable feature of the TAS domain sequence in these turtles. The interrupted poly-C stretch is repeated once in C. picta and twice in T. scripta. The similar poly-C stretch found in many birds (QUINN & WILSON 1993, RANDI & LUCCHINI 1998, RITCHIE & LAMBERT 2000) and crocodiles (RAY & DENSMORE 2002) is referred to as the "goose hairpin". In birds, it could potentially form a stable hairpin structure, which is usually characterized by several consecutive cytosine residues separated from the complementary guanines by a putative stop sequence (TCCC). The poly-C stretch in crocodiles (Crocodylus) was reported as a region with high cytosine content that could form a similar stem-loop structure. However, in the present study, the only poly-C stretches with the capacity to form a similar secondary structure are those of C. fimbriata and P. subrufa (data not shown). ZARDOYA & MEYER (1998a) reported that the interrupted poly-C of the TAS domain in P. subrufa was remarkably similar to the CSB-2. In order to investigate the origin of these similar poly-C sequences, we performed a clustering analysis using the Neighbor-joining (NJ) method on the poly-C sequences of the TAS domain of the six turtles, the CSB-2 sequences from 31 turtles, and the "goose hairpin" sequence of Gallus gallus (Linnaeus, 1758). Interestingly, the sequences grouped into two major clades, A and B (Fig. 2). All CSB-2 sequences grouped within Clade A. Though the relationships among these CSB-2 sequences may not represent the "species tree" of turtles, the fact that they clustered together indicates that the CSB-2 of these turtles is conserved. Clade B consists of the poly-C sequences in the TAS domain from the six turtles and the "goose hairpin" sequence from G. gallus. The separation of Clades A and B indicates that the poly-C sequences in the TAS domain are closer in origin to the "goose hairpin" than to the CSB-2.
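The NJ analysis takes a matrix of pairwise distances as input; according to the legend of Figure 2, the distance used was the absolute number of differences. The sketch below computes only that input matrix for pre-aligned sequences of equal length and does not perform the tree building, which in the study was done in MEGA. The sequence labels and sequences are invented placeholders.

```python
# Minimal sketch: pairwise "number of differences" distances between
# aligned, equal-length sequences. Tree construction itself is not shown.
from itertools import combinations


def n_differences(a: str, b: str) -> int:
    """Count positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def distance_matrix(seqs: dict) -> dict:
    """Return {(name1, name2): differences} for every unordered pair."""
    return {
        (n1, n2): n_differences(seqs[n1], seqs[n2])
        for n1, n2 in combinations(seqs, 2)
    }


if __name__ == "__main__":
    toy = {  # invented placeholder sequences, not data from the study
        "polyC_1": "CCCCCTCCCC",
        "polyC_2": "CCCCCACCCC",
        "csb2_1": "CCCCCCCCTA",
    }
    for pair, d in distance_matrix(toy).items():
        print(pair, d)
```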

Central conserved domain
The central conserved domain (CD) is more conserved than the peripheral domains (TAS and CSB domains). It is 353-420 bp long in the five turtles. Our resulting alignment had 102 conserved sites, 273 variable sites, and 127 parsimony-informative sites. Although the extreme conservation of this region in vertebrates may suggest that it has played an important role in the evolutionary history of the mitochondrial genome, its function remains completely unknown. On the other hand, the high degree of similarity makes this domain suitable for phylogenetic inference (SACCONE et al. 1991). However, we failed to reconstruct the phylogenetic relationships based on it. We believe that the phylogenetic distances in this domain are subject to high statistical fluctuations among family-level taxa of turtles because of the small number of sites in this region (353-420 bp). Thus, the phylogenetic relationships of the five pleurodiran turtles reconstructed from this domain were not well resolved.

CSB domain
The CSB domain, ranging from 265 bp (P. unifilis) to 588 bp (P. subrufa), has several conserved blocks involved in the regulation of replication and transcription. These short conserved sequence blocks were first named CSB-1, CSB-2, and CSB-3 by WALBERG & CLAYTON (1981). Researchers have proposed that they provide regulatory signals for the processing of the RNA primers for replication of the H-strand in human KB and mouse L cells. Since then, these three conserved sequence blocks have been identified in a number of vertebrate mtDNAs, such as those of fish (BUROKER et al. 1990, BROUGHTON & DOWLING 1994, LIU 2002), amphibians (DUNON-BLUTEAU et al. 1985), lizards (BREHM et al. 2003), turtles (SERB et al. 2001, ZHANG et al. 2009, XIONG et al. 2010), and mammals (SBISÀ et al. 1997). However, it is interesting that in turtles only the CSB-1 is found consistently, whereas in marsupials all three CSBs are consistently present. In placental mammals, CSB-2 and CSB-3 are sometimes missing (NILSSON 2009). Thus, there seems to be a lot of plasticity across vertebrates concerning CSB-1, CSB-2, and CSB-3. In particular, the CSB-1 was found to be a basic element in the mtDNA CR of all vertebrates, and duplications of the CSB-1 have been found in sperm whales, shrews, hedgehogs, opossums (SBISÀ et al. 1997), and hornbills (SBISÀ et al. 1997, DELPORT et al. 2002). The sequence GACATA is identical among several vertebrate mtDNA CSB-1 sequences, from fish to mammals and birds (BUROKER et al. 1990, SBISÀ et al. 1997, RUOKONEN & KVIST 2002, NILSSON 2009). The conserved sequences and structure of CSB-1 may have important functions. For example, GHIVIZZANI et al. (1994) reported that the origin of heavy-strand replication (OH) and the transition from RNA transcription to DNA replication should occur in the proximity of the CSB-1.
The other two conserved sequence blocks, CSB-2 and CSB-3, are believed to play an important role in RNA processing during transcription in mice (CHANG & CLAYTON 1986). In humans, the CSB-2 and CSB-3 may have another function: GHIVIZZANI et al. (1994) proposed that, together with C residues, they appear to function as domains which avoid interaction with the mitochondrial transcription factor A (mtTFA), thus reinforcing the organization of the mtTFA binding. Although they appear to have an important function in different taxa, CSB-2 and CSB-3 are not present in the mtDNA CRs of all vertebrates. In this study, the CSB-2 and CSB-3 were missing in all five pleurodiran turtles. The same condition has been reported in birds (RAMIREZ et al. 1993, BAKER & MARSHALL 1997, HÄRLID et al. 1997, RANDI & LUCCHINI 1998) and some mammals (SBISÀ et al. 1997). Interestingly, all mtDNA CRs of cryptodiran turtles published so far have the CSB-2 and CSB-3. Thus, there are two known types of mitochondrial CR in turtles at present, which correspond to variations in the CSB domain (Figs 3 and 4). In all cryptodiran turtles, the CSB domain contains three conserved blocks (CSB-1, CSB-2, and CSB-3) (SERB et al. 2001, JUNGT et al. 2006, AMER & KUMAZAWA 2009, ZHANG et al. 2009, XIONG et al. 2010) (Fig. 3), but only one (CSB-1) can be identified in the CSB region of our five pleurodiran turtles (Fig. 4). This considerable variation shows that the regulatory mechanisms of transcription may be different among vertebrates or may involve specific nuclear-mitochondrial co-evolutionary processes (SBISÀ et al. 1997), and in particular, that the mechanisms of regulation may differ between the two suborders of turtles (Pleurodira and Cryptodira). If the lack of the CSB-2 and CSB-3 in the CSB region turns out to be a pervasive character state in Pleurodira, it could be used as a diagnostic character distinguishing Pleurodira from Cryptodira at the molecular level.

Figure 2. NJ tree based on the CSB-2 and poly-C sequences. The tree was reconstructed according to SAITOU & NEI (1987) using the absolute number of differences in MEGA (KUMAR et al. 1993). Sequences used in the analysis are listed in the Appendix. A total of 1000 bootstrap replicates were performed and values greater than 50 are indicated at the respective nodes. Clade A consists of species in which CSB-2 is located within the CSB domain, whereas clade B contains the species whose poly-C sequences are within the TAS domain.
The VNTRs of the newly determined mtDNA CRs from the three pleurodiran turtles, and of two additional pleurodiran turtles (P. expansa and P. subrufa), which were located only at the 3' end of the CR, were further analyzed in this study (Tab. II). In the pleurodiran turtles, the repeated motifs of each species were not homogeneous at the end of the CR. The motifs of the VNTRs in the five pleurodiran turtles were similar in length, and the lengths of the VNTRs were more similar between closely related species.
Potential secondary structures of the repeated sequences in the CRs of the five pleurodiran turtles were constructed in this study, and the long repeated motif from each species formed a stable stem-loop structure (Figs 5-9). In hornbills, the stem-loop structures of these VNTRs are believed to function as a termination sequence in mitochondrial genome replication (DELPORT et al. 2002). In turtles, whether these VNTRs have some function or not needs further study, but it is noteworthy that the potential secondary structures of the repeated sequences were very similar among P. unifilis, P. expansa, and P. subrufa (Figs 7-9).
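Whether a repeated motif has the potential to fold into a stem-loop can be screened with a simple search for two complementary arms separated by a loop. The sketch below is only such a screen, assuming Watson-Crick pairs and a minimum loop of three bases; it does not compute folding energies and is not a substitute for the RNAstructure predictions used in the study. The example sequence is invented.

```python
# Minimal sketch: find the longest perfect stem (inverted repeat) that
# could close a hairpin. Assumptions: Watson-Crick pairs only, no bulges
# or mismatches, loop of at least min_loop bases, no energy model.

PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def longest_stem(seq: str, min_loop: int = 3) -> tuple:
    """Return (stem length, index of 5' end, index of 3' end) of the best stem.

    Returns (0, None, None) when no base pair can be formed.
    """
    seq = seq.upper()
    n = len(seq)
    best = (0, None, None)
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            k = 0
            # Pair i+k with j-k while at least min_loop bases stay unpaired.
            while i + k < j - k - min_loop and (seq[i + k], seq[j - k]) in PAIRS:
                k += 1
            if k > best[0]:
                best = (k, i, j)
    return best


if __name__ == "__main__":
    toy_motif = "GGGCCTTTTGGCCC"  # invented sequence, not from the study
    print(longest_stem(toy_motif))
```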
Although the mechanism that generates the VNTRs in the mitochondrial genome is not well understood at present, it is believed that VNTRs can be used as molecular markers and provide sufficient information for molecular phylogenetics, population genetics, species identification, and studies of genetic diversity and conservation, because they can be used to differentiate among genera, species, populations, and even individuals (ZARDOYA & MEYER 1998a, ZHANG et al. 2009). Pelomedusids and podocnemidids were placed into Pelomedusidae (sensu lato) based on morphological characteristics (GAFFNEY 1975, GAFFNEY & MEYLAN 1988). Although DE BROIN (1988) suggested that the extant Pelomedusinae and Podocneminae (GAFFNEY & MEYLAN 1988) should be recognized as two families, i.e. Pelomedusidae (consisting of the extant African Pelusios and Pelomedusa) and Podocnemidae (consisting of the extant Madagascan Erymnochelys and the extant South American Podocnemis and Peltocephalus), morphological studies still support the monophyly of Pelomedusidae (sensu lato). Recently, molecular evidence has also suggested that the Pelomedusidae (sensu lato) may be monophyletic (GEORGES et al. 1998, NOONAN 2000, FUJITA et al. 2004, NOONAN & CHIPPINDALE 2006, THOMSON & SHAFFER 2010). In the present study, the secondary structures of the VNTRs of P. subrufa (Pelomedusinae), P. expansa, and P. unifilis (Podocneminae) also support the monophyly of Pelomedusidae (sensu lato). We suggest that comparative analyses of the secondary structures of VNTRs may provide some potential information for phylogenetic inference.

Figure 1. The alignment of the TAS and conserved blocks from five pleurodiran turtles. Note: Dots indicate the same nucleotides as in the top sequence, and asterisks mark identical sites.

Figures 3-4. Two typical mtDNA CR structures published in turtles: (3) the structure of the mtDNA CR in Cryptodira. VNTRs a and VNTRs b are only identified in the CRs of some trionychids; (4) the structure of the mtDNA CR in Pleurodira. P: tRNA-Pro gene, F: tRNA-Phe gene.

Table I. The base composition (%) and length of the five pleurodiran turtle CRs.
Note: In the CR of P. expansa, the TAS and central conserved domains are complete, and the CSB domain is almost complete.