Molecular characterization and genetic relationships of seven piranha species of the genera Serrasalmus and Pygocentrus (Characiformes: Serrasalmidae) from Paraná-Paraguay, São Francisco and Tocantins River basins in Brazil

Distributed: November 30, 2020 (With 3 figures) Abstract Genetic and phylogenetic relationships among seven piranha species of the genera Serrasalmus and Pygocentrus from the Paraná-Paraguay, São Francisco and Tocantins River basins were evaluated in the present study by partial sequences of two mitochondrial genes, Cytochrome b and Cytochrome c Oxidase I. Phylogenetic analysis of Maximum-Likelihood and Bayesian inference were performed. Results indicated, in general, greater genetic similarity between the two species of Pygocentrus ( P. nattereri and P. piraya ), between Serrasalmus rhombeus and S. marginatus and between S. maculatus , S. brandtii and S. eigenmanni . Pygocentrus nattereri , S. rhombeus and S. maculatus showed high intraspecific genetic variability. These species have each one, at least two different mitochondrial lineages that, currently, occur in sympatry ( S. rhombeus ) or in allopatry ( P. nattereri and S. maculatus) . Species delimitation analysis and the high values of genetic distances observed between populations of S. rhombeus and of S. maculatus indicated that each species may corresponds to a complex of cryptic species. The non-monophyletic condition of S. rhombeus and S. maculatus reinforces the hypothesis. The geographic distribution and the genetic differentiation pattern observed for the piranha species analyzed herein are discussed regarding the geological and hydrological

Piranhas are freshwater Neotropical fish endemic from South America, occurring in the rivers of the Amazon, Orinoco, Guiana, Araguaia-Tocantins, Paraná-Paraguay and São Francisco basins (Jégu, 2003). Despite their ecological and economic importance, the taxonomy and systematic classification of piranhas and other serrasalmids are confusing. As a result, species identification and phylogenetic placement of many taxa are problematic (Freeman et al., 2007;Thompson et al., 2014).
Genetic characterization of piranha species is therefore essential to aid in the elucidation of genetic relationships and identification at species level. Hence, the goal of our research was to use partial sequences of the mitochondrial genes Cytochrome b (cytb) and Cytochrome c Oxidase subunit I (coI) to characterize and to elucidate the genetic and phylogenetic relationships of seven piranha species of the genera Serrasalmus (S. maculatus Kner, 1858;S. marginatus Valenciennes, 1837;S. eigenmanni Norman, 1929;S. rhombeus Linnaeus, 1766 andS. brandtii Lütken, 1875) and Pygocentrus (P. nattereri Kner, 1858 and P. piraya Cuvier, 1819) from the Upper Paraná, Upper Paraguay, São Francisco and Tocantins River basins, in Brazil.

Sample collection
Sixty-four piranha specimens of the genera Serrasalmus and Pygocentrus were collected at four hydrographic basins: Upper Paraná, Upper Paraguay, Tocantins and São Francisco Rivers ( Figure 1; Table 1). Specimens were anaesthetized and subsequently sacrificed by overdosing of clove oil. Muscle samples were preserved in 96% ethanol and maintained at -20 °C before DNA extraction. Piaractus brachypomus was used as outgroup in phylogenetic analysis (GenBank accession numbers: AY791429 for cytb and FJ978042 for coI sequences).
The amplification efficiency was confirmed on 1% agarose gel. Samples were then purified with polyethylene glycol. Resulting fragments were once more amplified with primers L14841 or L6448-F1 for cytb or coI regions, respectively. Approximately 50 ng of the final products were directly used in the nucleotide sequencing reactions using the DYEnamic ET Dye Terminator Kit (Amersham Biosciences) in the automatic sequencer MegaBACE 1000 (Amersham Biosciences), following manufacturer instructions.

Sequence and phylogenetic analysis
Partial nucleotide sequences of cytb and coI were edited and aligned separately by Clustal Omega (Sievers et al., 2011) and BioEdit 7.0.1 (Hall, 2011) and, when necessary, adjusted manually. Genetic diversity indices such as polymorphic sites, number of transitions and transversions, p-distances and nucleotide compositions were obtained with MEGA7 (Kumar et al., 2016).
After checking for congruency among tree topologies derived from the single-gene phylogenies, analyses of Maximum-Likelihood (ML) and Bayesian (BA) inferences were based on the concatenated sequences of cytb and coI. The best-fit model of nucleotide evolution was estimated by PartitionFinder 2.1 software (Lanfear et al., 2012). Best-scoring ML trees were estimated with the raxmlGUI software (Silvestro and Michalak, 2012), using rapid bootstrap algorithm, autoMRE function for resamplings, and the partition set defined by the PartitionFinder 2.1. BA trees were calculated using the uncorrelated lognormal relaxed-clock model implemented in BEAST 1.8.2 with an input file generated in BEAUti 1.8.0 (Drummond et al., 2012). The Yule process of speciation, which assumes a constant speciation rate among lineages, was applied as a tree prior. A Monte-Carlo Markov chain (MCMC) of 50,000,000 generations was performed and sampled every 1,000 generations. Results were checked using Tracer 1.6  (Rambaut et al., 2014; ESS >200). The final trees were calculated after 20% of burn-in. Tree was edited with FigTree (Rambaut, 2010). Support for nodes was determined using posterior probabilities (PP; calculated by BEAST).

Species delimitation analyses
Three species delimitation methods were applied to the coI sequences only using either sequence-based estimations (Automatic Barcode Gap Discovery -ABGD) or topology-based analyses based on the ML or BA inferences (Poisson Tree Process -PTP, and General Mixed Yule Coalescent Model -GMYC). The ABGD method was conducted on the online server (Puillandre et al., 2012) using the default parameters and the Kimura model (K80; 2.0) of nucleotide substitution. The PTP model was implemented on the online server (Zhang et al., 2013) using the best-scoring ML tree constructed on raxmlGUI, as mentioned before. The GMYC method was implemented using the ultrametric tree based on the Bayesian inference constructed in BEAST, as mentioned above. Tracer 1.6 software was used to check for chain convergence and the effective sampling size (ESS>200). The identification of significant clusters was implemented in Rstudio software (RStudio Team, 2016) by using the splits package (Ezard et al., 2009).

Results
The total length of the concatenated sequences of partial coI (549 bp) and cytb (592 bp) genes was of 1,141 bp long, without indels. A total of 167 variable sites were identified within the 64 analyzed specimens (excluding outgroup); 157 of them were parsimony-informative. Polymorphic sites and diagnostic nucleotides are presented in Supplementary Material 1. Sequence analysis revealed a ratio of transitions and transversions of R = si/sv = 8.26 and a proportion of bases of T = 26.7%, C = 32.5%, A = 24.1% and G = 16.7%.
Concatenated cytb and coI sequences were efficient in discriminating the piranha species, revealing high bootstrap or posterior probabilities values supporting clades in both dendrograms ( Figure 2) as well as consistent p-distance  (Table 2). Single-gene topologies obtained with cytb or coI sequences were similar with each other (data not shown) and with the topology based on the concatenated sequences. Only two taxa were problematic regarding their positions in the trees: S. eigenmanni and S. brandtii. In the ML dendrogram obtained with coI sequences only, S. eigenmanni was the sister group of S. rhombeus and S. marginatus (see Figure 3), instead of S. maculatus and S. brandtii (as in Figure 2). Serrasalmus brandtii, on the other hand, was closer to S. maculatus  from Paraná-Paraguay basin (data not shown), rather than S. maculatus from Tocantins, in the ML tree based on cytb sequences. All the analyzed species of Serrasalmus were grouped in a monophyletic clade in the ML and BA trees. The same was observed for Pygocentrus in the ML analysis (Figures 2 and 3). The two species of the genus Pygocentrus formed a basal clade in both dendrograms, and the relationships among specimens were mostly corroborated in the ML and BA analyses. Pygocentrus nattereri and P. piraya constituted each one a monophyletic group and they presented a genetic distance from each other of 2.48% ( Table 2). In addition, P. nattereri was separated in two clades, each one corresponding to the populations of Tocantins and Upper Paraguay River basins. This fact was corroborated by the values of genetic distance: although variability within each group was low (0.04% to 0.08%), the genetic distance between the two populations of P. nattereri was of 2.25% (Table 3).
The clade constituted by Pygocentrus was the sister group of all species of the genus Serrasalmus in dendrograms. As observed for P. nattereri, genetic isolation of populations from Tocantins River and Paraná-Paraguay hydrographic basin was also observed for S. maculatus. Specimens of S. maculatus from the Paraná-Paraguay River basin formed a clade distinct from the population of Tocantins River ( Figure 2) and showed high genetic distance values (4.01%; Table 2). However, no significant differentiation was observed between S. maculatus of Paraná and Paraguay Rivers basins. These results were corroborated by the genetic distance values (Table 3). The population of S. maculatus from Tocantins River diverged from the populations of Paraná and Paraguay basins by mean values of 4.20% and 3.69%, respectively. On the other hand, the genetic differentiation between S. maculatus from Paraná and Paraguay Rivers was only 1.04%. In addition, S. brandtii was placed within the clade formed by S. maculatus (Figure 2), indicating the non-monophyletic condition of S. maculatus. Serrasalmus eigenmanni was defined as the sister group of the clade formed by S. maculatus and S. brandtii.
Serrasalmus rhombeus and S. marginatus were genetically close to each other and formed a major clade that was further divided into three sub-clades: one formed exclusively by S. marginatus and two formed by S. rhombeus (Figure 2). The two clades of S. rhombeus were distant from each other by a value of genetic p-distance of 1.78% (Table 3). Values of such magnitude were also identified between S. rhombeus and S. marginatus (p-distance = 1.68%; Table 2). Additionally, S. rhombeus was non-monophyletic in the ML (Figure 2A) and BA (Figure 3) trees.
The three species delimitation analyses were mostly congruent and returned ten (for ABGD and PTP methods) or eleven (GMYC) lineages for Serrasalmus and Pygocentrus species (Figure 3). All methods split both P. nattereri and S. rhombeus into two subgroups. However, S. maculatus was divided into two (ABGD and PTP) or three (GMYC) lineages, depending on the approach.

Delimitation of piranha species
All of the piranha species analyzed herein were discriminated based on the cytb and coI mitochondrial DNA (mtDNA) sequences and based on the species delimitation analyses (ABGD, PTP and GMYC). Genetic p-distance mean values between species varied from 1.68% to 5.97% ( Table 2). The genetic differentiation between P. nattereri and P. piraya was evident. Although the value of p-distance detected between the species were low (2.48%), this is an expected result, since the speciation event leading to P. nattereri and P. piraya occurred recently, during the last 2.6 ± 0.2 Ma (Mateussi et al., 2019). We also detected high genetic variability within P. nattereri, as previously reported by Mateussi et al. (2019) and Luz et al. (2015). Specimens of P. nattereri were separate in the dendrograms according to the sample locality. Values of p-distance observed between populations of P. nattereri (Paraguay and Tocantins River basins; 2.25%; Table 3) were lower than the values identified between species of Serrasalmus, suggesting that the existing polymorphism in P. nattereri is related to intra-specific variability, which is commonly observed in geographically isolated populations. Distinction between populations of P. nattereri of different river basins has already been discussed (Fink, 1993;Fink and Zelditch, 1997), appointing to shape differences consistent with the geographical variation of a widely distributed species (Fink and Zelditch, 1997). However, all species delimitation analysis recovered two mtDNA lineages among P. nattereri specimens (Figure 3). Similarly, Mateussi et al. (2019) detected at least five lineages of P. nattereri in five hydrographic river basins (Amazonas, Guaporé, Itapecuru, Paraná-Paraguay and Tocantins-Araguaia). The authors also concluded that these lineages can be potentially sibling species, a special case of cryptic species, representing the P. nattereri species complex (Mateussi et al., 2019). Finally, the monophyletic condition of P. nattereri confirmed in the studies of Fink (1993) and Ortí et al. (1996) was also observed in our analysis (Figure 2).
The same pattern of genetic differentiation identified in P. nattereri was also observed for the specimens of S. maculatus, i.e., population of S. maculatus from Tocantins River basin was segregated from the populations of Paraná-Paraguay system. In the case of S. maculatus, p-distance values were relatively higher than in P. nattereri. ABGD, PTP and GMYC approaches split S. maculatus in two or three distinct lineages, as previously described by Bignotto et al. (2019). The high values of genetic distance and the differentiation of S. maculatus haplogroups from Tocantins and Paraná-Paraguay River basins, associated with the non-monophyletic condition of S. maculatus observed in the dendrograms, may constitute evidence for more than one species within S. maculatus (Bignotto et al., 2019). Similarly, Machado et al. (2018) did not recover the monophyletic condition of the species.
In addition to the wide geographic distribution, S. maculatus has high karyotypic variability (Cestari and Galetti Junior, 1992;Nakayama et al., 2000;Centofante et al., 2002;Nakayama et al., 2002). The different cytotypes identified in S. maculatus are related to the different hydrographic basins, characterizing differences among populations. Consequently, these evidences support a complex of cryptic species for S. maculatus (Nakayama et al., 2000).
Results obtained in the present study also revealed the close relationship between S. maculatus and S. brandtii (p-distance ranging from 3.63% to 3.74%; Table 2), corroborating previous studies (Cestari and Galetti Junior, 1992;Hubert et al., 2007). Cestari and Galetti Jr. (1992b) reported that S. brandtii of the São Francisco River and S. maculatus from the Upper Paraná River have similar karyotypes. According to Hubert et al. (2007), the event that gave rise to S. brandtii and S. maculatus occurred around 8 Ma.
The low genetic p-distance values observed between S. rhombeus and S. marginatus (1.68%; Table 2) were inferior to those detected among other species of piranhas analyzed in the present study. Values of such magnitude could be explained by two hypotheses. First, specimens namely S. rhombeus and S. marginatus might represent only one species with wide intraspecific variability. Another more reasonable explanation is that S. rhombeus and S. marginatus have undergone a recent speciation process, which impeded the accumulation of mtDNA polymorphisms. Both S. rhombeus and S. marginatus have been previously described, are well-defined species, and are morphologically distinct from each other (Jégu, 2003), which gives support for the second hypothesis.
Analysis clearly divided specimens of S. rhombeus from Tocantins River basin into two clades. Furthermore, p-distance values detected between these two groups of S. rhombeus are in the same magnitude of interspecific genetic divergence observed between S. rhombeus and S. marginatus. Additionally, ML and BA trees revealed the non-monophyletic condition of S. rhombeus. Thompson et al. (2014) and Machado et al. (2018) also did not obtain the monophyly of S. rhombeus. Hence, in the present study, we found evidence for at least two sympatric mitochondrial lineages of S. rhombeus that occur in the Tocantins River basin which diverged recently from each other. We suggest additional studies combining molecular and morphological data to confirm if both groups of S. rhombeus should be considered different species.
Cytogenetic studies also reported the high variability within S. rhombeus (Nakayama et al., 2001(Nakayama et al., , 2012. Different S. rhombeus cytotypes were found in both allopatry and sympatry, suggesting that each cytotype represents a different fish species (Nakayama et al., 2001). Nakayama et al. (2001) also noted that differences in parasite species supported the recognition of a cryptic species of piranha within S. rhombeus. Our results agree with previously described data: according to Géry (1976), S. rhombeus is a species complex (rhombeus group) comprising six to nine species (Nakayama et al., 2001). Consequently, caution should be used in any decision regarding the species complex of the rhombeus group.
Therefore, mtDNA sequences (cytb and coI) were sufficient to characterize and discriminate all of the piranha species analyzed herein from the Upper Paraná, Upper Paraguay, São Francisco and Tocantins River basins, with additional evidence for the possible occurrence of cryptic species in S. maculatus, S. rhombeus and P. nattereri.

Relationships among hydrographic basins
The mitochondrial haplotypes of both P. nattereri and S. maculatus populations were related to the Amazon (Tocantins) or Paraná-Paraguay hydrographic basins, i.e., the genetic diversity is geographically structured and there are no shared haplotypes among rivers. The biogeographical pattern and the genetic differentiation found in the populations of both species can be explained by vicariance and/or dispersion events.
According to Hubert and Renno (2006), the Paraná-Paraguay system split from the Amazon at 10 Ma, in the Late Miocene. Therefore, during the formation of these basins, the definitive separation of the two populations of both S. maculatus and P. nattereri was possible due to vicariance. Similarly, dispersal events between different drainages were hypothesized to promote post-dispersal allopatric speciation once the connections ceased (Hubert et al., 2007). During the last 10 Ma, headwater-capture events occurred between the Amazon and the Paraná systems (Lundberg et al., 1998). Several studies have reported the possible connection between the Amazon and the Paraná-Paraguay systems (e.g., Aquino and Schaefer, 2010). Therefore, vicariance and dispersion events could be related to the genetic differentiation of the populations of both S. maculatus and P. nattereri of the Amazon and the Paraná-Paraguay River basins.
The biogeography associated to the distribution of the haplotypes of both species (S. maculatus and P. nattereri) indicate that the populations are isolated geographically, reproductively and/or by distance for enough time so that mutations could be accumulated in the mtDNA molecule. Hubert et al. (2007) suggested that at least P. nattereri colonized the Paraná River at 2 Ma, which justifies the low genetic divergence identified in the populations of P. nattereri. However, the differentiation between the populations of S. maculatus must have occurred earlier, due to the higher genetic divergence detected for this species.
Fishes from the same river basin tend to be most closely related, whereas populations of fishes from adjacent localities within basins should be most similar (Lovejoy and Araújo, 2000). Such genetic pattern was identified for both S. maculatus and S. marginatus, species that occur in the Paraná-Paraguay hydrographic basin. Within the Upper Paraná River or the Upper Paraguay River, populations of both S. maculatus and S. marginatus revealed minor values of genetic distance. However, comparisons of populations between these localities showed only slightly higher values of genetic distance, characterizing polymorphisms within a species.
Probably, gene flow between populations of the Paraná-Paraguay hydrographic basin of S. marginatus and also of populations of S. maculatus has occurred until recently or may still occur. Occasional migrations from upstream to downstream and/or incorporation of the subpopulation from the Itaipu region to the population of the Upper Paraná River may have contributed to the attenuation of the genetic differentiation between these populations. The Guaíra Falls (Seven Falls) has characterized an efficient geographical barrier against the migration from downstream to upstream (Júlio Júnior et al., 2009). However, migration from upstream to downstream could frequently happen.
Genetic similarity was higher between the piranha species from the São Francisco River (S. brandtii and P. piraya) and the Paraná-Paraguay hydrographic basin (S. maculatus and P. nattereri) than with species occurring in the Tocantins River (S. maculatus and P. nattereri) ( Table 3). This could be explained by the hydrological and geological history. Events of headwater capture were identified after the separations of the Upper Paraná and the São Francisco River basins (Hubert and Renno, 2006), which could enable ichthyofauna exchange between these two hydrographic basins. However, until now, headwater capture events have not been identified between São Francisco and Tocantins River basins.
Another two situations arose in the region and promoted fish dispersion between hydrographic basins. First, in the 1960s, the artificial transposition of the Piumhi River, originally from the Upper Paraná River basin, to the São Francisco River and, consequently, the entire fish fauna of this river was transposed to the São Francisco River (Moreira-Filho and Buckup, 2005). Second, Blanco et al. (2010) suggested that, in the same region prior to transposition of the Piumhi River, there may have been a natural connection causing mixture of the ichthyofauna. During floods, it is likely that connections between the Grande River (Paraná hydrographic basin) and the São Francisco River through the Cururu wetland occurred, representing a natural migratory route for fish (Blanco et al., 2010).
The results of the present study suggest that the biogeographic pattern identified for the piranha species analyzed of Serrasalmus and Pygocentrus agree with the geological and hydrological events previously documented in the literature. The separation of the hydrographic basins, as well as the headwaters events and dispersion routes among the different river systems that occurred in the past, may have strongly influenced the distribution and genetic differentiation of the piranha species.