CO1-Based DNA barcoding for assessing diversity of Pteropus giganteus from the state of Azad Jammu Kashmir, Pakistan

The flying fox (Pteropus giganteus) also familiar with the name of the greater Indian fruit Bat belongs to the order Chiroptera and family Pteropodidae. Current research emphasis on the DNA barcoding of P. giganteus in Azad Jammu Kashmir. Bat sequences were amplified and PCR products were sequenced and examined by bioinformatics software. Congeneric and conspecific, nucleotide composition and K2P nucleotide deviation, haplotype diversity and the number of haplotypes were estimated. The analysis showed that all of the five studied samples of P. giganteus had low G contents (G 19.8%) than C (27.8%), A (25.1%) and T (27.3%) contents. The calculated haplotype diversity was 0.60% and the mean intraspecific K2P distance was 0.001% having a high number of transitional substitutions. The study suggested that P. giganteus (R=0.00) do not deviate from the neutral evolution. It was determined from the conclusion that this mtDNA gene is a better marker for identification of Bat species than nuclear genes due to its distinctive characteristics and may serve as a landmark for the identification of interconnected species at the molecular level and in the determination of population genetics.


Introduction
Bats (Order: Chiroptera) offer important ecosystem services such as pollination, seed dispersal and insect pest suppression (Kunz et al., 2011). Flying foxes are mainly significant players in island ecosystems where they often work as seed dispersers and keystone pollinator both within and between islands (McConkey and Drake, 2007;McConkey and Drake, 2015). Other than that, flying foxes greatly maintains their numbers at high densities which is very essential for the existence of plant communities (McConkey and Drake, 2006). Chiroptera comprises many closely linked species which are sharing very similar ecologies and morphologies (Clare et al., 2007). These sort of similarities in morphology results in overlooked diversity (Mayer et al., 2007). Indian Flying Fox (P. giganteus) is an extensively dispersed species of Bats that present in the different tropical areas of central Asia, mostly between China and Pakistan (Bates and Harrison, 1997). In Pakistan, these Indian flying foxes are widely distributed in the areas of Sindh, Punjab and Northwest Frontier Province (Sheikh and Molur, 2004).
There is an estimation that about 10 to 100 million species may exist on our earth from which at least 1.5 million species of organisms are described (Wilson, 2003). Due to this enormous gap, there is a need to search for new technologies to describe the immense biological diversity. For this, the genetic screening methods which involve a single or a small number of genes -usually called DNA barcoding -were used in diverse taxonomic groups. There are two major aims for this: (i) to use a standardized technique for the identification of species, and (ii) to detect new species efficiently to bring us closer to the true number of species (Blaxter et al., 2004;Moritz and Cicero, 2004;Hebert and Gregory, 2005). Therefore, DNA barcodes were primarily used for genetic identification of taxonomically poorly studied taxa and geographic regions like the tropics (Blaxter et al., 2004;Monaghan et al., 2005;Smith et al., 2006). DNA barcoding is one of the best-developed technique (Hebert et al., 2004) in which a portion of the mitochondrial cytochrome c oxidase subunit 1 (COI) gene is used. DNA barcoding techniques contributed to identifying the cryptic species which increased the knowledge of immense diversity in the different species of Chiroptera (Clare et al., 2007;Mayer et al., 2007). The COI gene size has been preserved completely and used in evolutionary studies. There is a huge number of unidentified species or individuals below the species level in some higher vertebrate classes that could be documented by this best-accomplished tool of identification. DNA barcoding technique is also applicable for biodiversity analysis and certification of species (Ospina-Alvarez and Piferrer, 2008).
Pakistan is recognized as a country of very rich biodiversity but most of the species, including relatively familiar mammalian groups, are poorly identified and there is a lack of prior research and published data (Sheikh and Molur, 2004). This limitation serves as an opportunity for further research to explore unidentified species, therefore, the present study was conducted and the objectives were: (i) to assess the taxonomic identification of Pteropus giganteus by molecular sequencing data, (ii) to explore the diversity of chiropteran (Genus Pteropus) in State of Azad Jammu Kashmir by using mitochondrial COI gene, and (iii) to establish a national reference library in GenBank for Bat (Indian flying fox) species of State of Azad Jammu Kashmir, Pakistan.

Material and Methods
Blood samples of bats were collected from the districts Muzaffarabad (34° 21' N and 73° 28' E) and Bagh (33° 58' N and 73° 46' E) Azad Jammu Kashmir, Pakistan. Bats were randomly captured for the blood collection and anesthetized while blood collection and were released after sampling (McMichael et al., 2015). Identification of the bats was based on morphological principles described in key taxonomic references including Bates and Harrison (1997), Borissenko and Kruskop (2003), and Francis (2008), as well as describing individual species by primary literature review. Blood was collected in the 3 ml EDTA tubes with the help of 3 ml syringes and samples were labeled and were brought back to the Laboratory of Genetics, University of Azad Jammu and Kashmir. Blood samples were stored in the fridge until DNA extraction has been done. A total of 5 specimens were barcoded, from which 3 samples (M1, M2 and M3) were collected from district Muzaffarabad and 2 samples (B1 and B2) were collected from district Bagh, State of Azad Jammu Kashmir.
DNA extraction was done by using a phenol-chloroform extraction procedure followed by Sambrook et al., (1989). The extracted DNA was allowed to run on 1% agarose gel and visualized under UV transilluminator and results were recorded by using a gel documentation system. The quantity was measured by a spectrophotometer at 260/280 nm wavelength. Online software Primer 3 was used for the designing of primers sequences of Cytochrome C Oxidase 1 (CO1) gene of Pteropus giganteus. Primer sequences used were: PG-COI-F (5´-AAC GGC CCT CAG CCT ACT AA-3´), PG-COI-R (5´-GAA AAA GGT GGT ATT TAG GTT-3´) (1000 bps product size).
COI gene was amplified by using PCR (Polymerase Chain Reaction). PCR was carried out in 25μl (standard amount) reaction volumes in each tube (14 μl PCR water, 3 μl template DNA, 2.5 μl Taq buffer, 0.5 μl dNTPs (2.5 μM), primers 1 μl of each (10 μM), 2.5 μl magnesium chloride, and 0.5 U Taq polymerase. For thorough mixing, the reaction mixture kept for 30 seconds on 8,000 rpm for centrifugation. Thermal cycling comprised 95ºC for 3 min, followed by 39 cycles of 95ºC for 30 secs, annealing at 55-63 °C for 30 s, and an extension temperature of 72 °C for 1 min. This was followed by a final extension of 72 °C for 10 min. Before cycle sequencing, PCR products were purified with Exo Sap-IT (Affymetrix purification kit). Nucleotide (nt) sequencing was carried out in an ABI Prism 3100 Genetic Analyzer (PE Applied Biosystems, Foster City, CA, USA) using gene-specific forward and reverse primers for COI genes. One microliter cleaned PCR product was used for each 10 μL reaction. Indian flying fox gene sequences were edited by using the BioEdit program (Hall, 1999) for nucleotides and amino acid variations. Alignment of these nucleotide sequences was done with other already known international sequences by using Clustal W in the MegAlign program of LASERGENE package (DNASTAR Inc, Madison, WI, USA) and the international sequences were retrieved from the NCBI GenBank resource. After alignment and sequencing, these sequences were deposited in GenBank for accession numbers ( Table 1).
The identification of Bat samples was done through Barcode of Life Data Systems (BOLD) and Basic Local Alignment Search Tool (BLAST) of GenBank/NCBI. Sequence divergence between the Bat species of AJK (Pakistan) and other international reference species was determined by multiple sequence alignments. A phylogenetic tree was generated using the neighbor-joining method and Kimura 2-parameter supported by 1,000 bootstrap replicates (Akhtar and Ali, 2016;Akhtar et al., 2017;Ayesha et al., 2019) in MEGA 6.06 software. Replicate trees percentage in which all the related taxa are grouped in the bootstrap test (thousand replicates) were shown in the next branches of the MEGA 6.06 software. Considering the results of Gager et al. (2016) three Cynomops species (GenBank accession numbers JF447634, JN312044 and EF080319) and Molossus rufus (JF444936) were used as outgroups to root the tree inferred from CO1 sequences. Nucleotide composition and rate of transition/transversion ratio (R) were also calculated by MEGA 6.06 software. DNASP 5.0 program (Rozas et al., 2003) was used for the estimation of the number of polymorphic sites, no. of haplotypes, haplotype diversity (Hd), and nucleotide diversity (Pi).

Results
From (~1000 bps) PCR amplicons, the sequence used for comparison was the 627 bps consensus (Figure 1). Of the 627 sites, 626 (99.8%) were constant, 1 (0.15%) site was variable and also parsimony informative. The nucleotides' translation into a total of 209 amino acids sequences produced the 188 (89.9%) constant sites, while 1 (0.47%) site was variable and also parsimony informative. The empirical base composition of the COI gene among the examined samples was T (27.3%), C (27.8%), A (25.1%) and G (19.8%). The frequencies of A+T contents (52.4%) were slightly higher than those of G+C contents (47.6%), resulting in anti-G bias sequencing which is a mitochondrial gene characteristic.
Cytochrome oxidase I gene of bats was sequenced and aligned with other international sequences available in the BOLD and GenBank database using the Clustal W program. Because of the alignment species were grouped into definite clusters. These five Bats specimens were named as P. giganteus because they offered maximum sequence similarity with the BOLD (99.84 to 100%) and BLAST (100%) reference sequence of P. giganteus (KT291772.1) of Assam, India (Table 2).  The total number of haplotypes in the sequences of P. giganteus was 2 (haplotype 1 is shared by B1 & B2 while, haplotype 2 is shared by M1, M2 & M3) and haplotype (gene) diversity was 0.60±0.17. The remarkable rate of transversional substitutions was observed as compared to transitional substitutions. The transition/transversion rate ratios are K1 = 0 (purines) and K2 = 0 (pyrimidines). The overall transition/transversion bias is R = 0, where R = [A*G*k1 + T*C*k2]/[(A+G)*(T+C)]. The estimated proportion of invariable sites was 0.0010%, and the configuration of the gamma parameter was anticipated as 200. The best-fit ML model for the CO1 gene data of P. giganteus was Jukes-Cantor (JC) as DNA evolution with the lowest Bayesian Information Criterion (BIC= 1811.843) scores.

Phylogenetic analysis of present studied sequences
Present study sequences of P. giganteus were formed two main clusters in the phylogenetic tree. One cluster was formed by the three specimens collected from Muzaffarabad district with 100% bootstrap value while the second cluster was formed by the two specimens collected from Bagh district with 100% bootstrap value ( Figure 2). All the samples of P. giganteus had same nucleotide and protein sequence, the only difference is that the sequences of Bagh district (B1 & B2) had C (Proline) at position 290, whereas, all the sequences of Muzaffarabad district (M1, M2 & M3) had A (Histidine) at this position. The sequence divergence between the present studied samples of district Muzaffarabad and district Bagh is 0.2% and these sequences show the 0.001% overall nucleotide divergence (Table 3).

Combined (studied + international) sequences analyses
Combined sequences of different Bat species available on NCBI GenBank resources were analyzed together with present studied samples to create the collective phylogenetic tree. These species made three major clades with more than 50% bootstrap values. All samples of P. giganteus were grouped in clade B with strong bootstrap support (59% posterior probabilities (pP), more than 59% from NJ). While the remaining sequences of Pteropus species form two sub-clusters (A1 & A2). One sub-cluster consists of Pteropus lylei with more than 50% posterior probabilities (pP), which support the 93% bootstrap values. Similarly, all the Pteropus vampyrus samples were clustered together in a vampyrus lineage with 86% and 91% bootstrap values. Clade C consists of outgroup species with 65% and 100% bootstrap values. These outgroup species were also positioned as a base of the tree (Figure 3). The mean intraspecific K2P genetic distance of studied samples was 0.001 ± 0.001. Similarly, the overall mean interspecific genetic distance was 0.016±0.003 while, the mean intergeneric genetic distance was 0.097±0.011 (Table 4).

Discussion
In mitochondrial DNA some specific sequences are helpful in the satisfactory identification of unknown species (Dawnay et al., 2007). In contrast to other mitochondrial genes, cytochrome oxidase I gene produces very low changes in the sequence of Amino acids (Hebert et al., 2003). Mitochondrial CO1 gene consists of 600-800 nucleotides, an important part of animal DNA, and it helps in identification    (Janzen et al., 2005;Savolainen et al., 2005;Dawnay et al., 2007). It is an important tool of precise documentation of several groups and shows the best way to identify unknown species of many animal classes (Elmeer et al., 2012;Ospina-Alvarez and Piferrer 2008) below the level of species and biodiversity study (Mayden et al., 2007). Surprisingly, it can also disclose secrets of unknown diversity and hidden species (Meyer and Paulay, 2005;Kerr et al., 2009). Researchers of wild biodiversity considered DNA barcoding an important means to open the new fields of research (Janzen et al., 2005;Savolainen et al., 2005;Dawnay et al., 2007). The records of DNA barcoding also provide a system of identification where smashed and blemished materials are not identified by their morphological characters (Wong and Hanner, 2008;Victor et al., 2009). It might enable following the exotic insidious species (Ficetola et al., 2008) also give support to identification through analysis of fecal sample or content of gut (Kaartinen et al., 2010). Therefore, barcoding has a significant role in protection from poaching and illegal trade (e.g. International Trade of Endangered Species convention) but it also protects traders from advertising deceit (Wong and Hanner, 2008).
Current investigations include the phylogenetic analysis among the same species of Indian flying fox that are collected from Muzaffarabad and Bagh (AJK) and also includes the comparison of Pteropus giganteus (studied) with the international sequences of Pteropus giganteus and other closely related species present in the genus Pteropus with outgroup species used by Gager et al. (2016). The sequences for the international species are obtained from NCBI (GenBank). Our results confirmed that CO1 gene sequences of Bats are closely related with P. giganteus of BOLD (99.84-100%) and BLAST (100%) species in databanks. It is the fact that if the CO1 gene of the samples, under investigation, shows similarity below the 2% distance (Tamura et al., 2013) or less than 3% (Wong and Hanner, 2008) then the population belongs to the same species. The intra-specific sequence divergence possibility was may be due to ancestral polymorphisms and hybridization (Hajibabaei et al., 2006). The neighbor-joining (NJ) method also revealed similar relationships and authenticated that these Bat samples are P. giganteus. Similarly, Figure (3) demonstrated the phylogenetic tree of studied specimens made a single cluster with NCBI data of P. giganteus whereas, all other species ended in separate clusters. Sympatry of very similar-looking species is common in Bats (Von Helversen et al., 2001;Ibáñez et al., 2006) and it can make identification of species in the field very problematic. We clustered individuals successfully with the CO1 sequences. Besides, we incorporated GenBank sequences from different species and countries with the help of the COI tree.
The present findings offered very low (0.001%) intra-specific K2P genetic distance showing extremely low genetic variations between the species of two (Bagh and Muzaffarabad) district, as Ward et al. (2009) recommended that K2P distance between the entities of the same species should be very low. The current findings disclosed that there was minimum haplotype diversity of the COI gene along with low nucleotide diversity. Less Haplotype (gene) diversity in bats may be due to some environmental disturbances that could be a reason for the decrease of population. Furthermore, after the population becomes small, the genetic factors speed up the process of extinction for that population (Westemeier et al., 1998).

Conclusion
The efficacy of COI barcodes has strongly been validated by this study for identifying the Indian flying fox. The straight forward amplification and sequencing were proved by the COI barcode region, and this would enable the accurate identification of specimens and also aid the consequent generation of barcode database. It was confirmed through BOLD and BLAST that CO1 gene sequences of Bats were P. giganteus. This study concluded that P. giganteus had low G contents (19.8%) and mean intraspecific K2P genetic distance was very low, with a low rate of mutation. For the conservational benefits and to overcome the challenge for correctly describing and mapping biodiversity DNA barcodes will surely facilitate. There is an urgent need for this work because, despite high levels of genetic diversity within many species, conservationists and politicians still focus their effort around named species. There should be some healthy conservation actions within this region because habitat lost is greater already, we are confident that the DNA barcodes use and access of such information to the public with the help of Web portals will surely boost the intensified taxonomic effort needed to catalog and describe this diversity and ultimately help in its protection.