Genetic differentiation through dispersal and isolation in two freshwater fish species from coastal basins of Northeastern Brazil

The coastal basins in Northeastern Brazil used in this study make up two different ecoregions for freshwater fishes (Amazonas estuary and coastal drainages, and Parnaiba) and two areas of endemism for Characiformes (Maranhão and Parnaíba), and exhibits a diversified yet poorly explored freshwater fish fauna. The population structure and biogeography of two migratory freshwater fish species that are commercially exploited from Maranhão and Parnaíba regions were herein analyzed. Molecular sequence data and statistical analyses were used to estimate haplotypes networks and lineage divergence times and correlated with hydrographic history of drainage and paleodrainages of the region. A total of 171 sequences was produced for both species, Schizodon dissimilis (coI, n = 70) and Prochilodus lacustris (D-loop, n = 101). All analyses identified the presence of three genetically delimited groups of S. dissimilis and six groups of P. lacustris. The lineage time analyses indicate diversification among these species within the past 1 million year. The results indicate the influence of geodispersal in the formation of the ichthyofauna in the studied area through headwater stream capture events and reticulated connections between the mouths of rivers along the coastal plain due to eustatic sea level fluctuations.


INTRODUCTION
Studies using phylogenetic and comparative approaches to understand the biodiversity and biogeography of South American fishes are many and varied (e.g. Sivasundar et al., 2001;Beheregaray et al., 2002;Farias, Hrbek, 2008;Albert et al., 2011a;Pereira et al., 2013). Most of these studies focus on hypotheses to explain the origin, richness, and distribution patterns of fishes in fauna of the Amazon and adjacent river basins of northern South America. Some of these hypotheses, such as refuges, museums, and paleogeographic events (Endler, 1982;Fjeldså, 1994;Roy et al., 1997;Montoya-Burgos, 2003;Hubert, Renno, 2006;Haffer, 2008), were derived from studies of diversification in terrestrial taxa and adapted to fit certain groups of freshwater organisms. River capture and sea-level changes are geomorphological and climatic processes that strongly constrain the biological processes of dispersal and gene flow in obligate freshwater taxa (Albert et al., 2011b;Dias et al., 2014;Thomaz et al., 2015Thomaz et al., , 2017Albert et al., 2018;Thomaz, Knowles, 2018). By altering hydrographic connections among portions of river drainage networks, these processes act to separate and merge adjacent river segments, thereby promoting both speciation and extinction, and increasing the rate of net lineage diversification (Fagan, 2002;Albert, Crampton, 2010;Albert et al., 2016). The combined action of these processes results in a history of landscape evolution events that may be modelled as alternative paleogeographic hypotheses (Albert et al., 2011a;Thomaz et al., 2015).
Paleogeographic hypotheses regarding changing connections among rivers of 3/19 ni.bio.br | scielo.br/ni northern South America have considered the role of semipermeable watershed barriers and the uplift or burial of structural arches that bound major sedimentary basins (Albert et al., 2006;Winemiller et al., 2008;Lovejoy et al., 2010;Albert et al., 2018). Hydrogeologic hypotheses consider the role of headwater and lateral river capture events as geodispersal events, which move the location of barriers among the basins and promote the colonization of new areas, followed by differentiation (Hubert, Renno, 2006). The coastal dispersion or paleodrainage dispersal hypothesis posits the effects of sea-level fluctuations on changing shorelines and patterns of river-mouth connectivity through time (Huber, 1998;Thomaz, Knowles, 2018). These distinct events can influence the process of population differentiation and/or speciation acting both between (river capture) and within (river capture and sea level change) basins. The Maranhão region of Northeastern Brazil exhibits a diverse and poorly explored freshwater fish fauna and is represented in different configurations in studies based in the distribution patterns of the species. This area has been treated as representing two different ecoregions (Abell et al., 2008; the Amazonas estuary and coastal drainages, and Parnaiba), and alternatively two distinct areas of endemism (to Hubert, Renno, 2006;Maranhão and Parnaíba), differentiated by the size of the areas. In recent years, the description of new species (e.g. Piorski et al., 2008;Ottoni, 2011;Ramos et al., 2017;Guimarães et al., 2018a,b;Rocha et al., 2018;Zanata et al., 2018) and the compilation of taxonomic distributions (e.g. Melo et al., 2016;Abreu et al., 2019) for this area have improved the knowledge about the ichthyofauna diversity, and opened new questions regarding the processes involved in the evolution of fish groups of this region. The five principal river drainages of the study area are the Parnaíba, Itapecuru, Mearim, Turiaçu and Gurupi basins. To date, information about fish diversity has been published only for the first four of these basins (e.g. Piorski et al., 1998Piorski et al., , 2003Piorski et al., , 2007Soares, 2005;Barros et al., 2011;Ramos et al., 2014).
Several papers have proposed dispersal along the coastal plain to explain diversity patterns among fishes shared by the Amazonian and these drainages (e.g. Huber, 1998;Montoya-Burgos, 2003;Hubert, Renno, 2006). According to this model, dispersal, speciation and extinction were promoted by sea-level changes through the Neogene (20.4 -2.6 Ma) and Quaternary (2.6 -0.01 Ma). During episodes of marine regressions and transgressions, mouths of coastal rivers became connected and disconnected, respectively, resulting in episodes in which populations of adjacent basins were merged and isolated (Hanebuth et al., 2011;Roxo et al., 2014). The ichthyofauna of the Amazon and these coastal basins is also thought to have been affected by the Neogene uplift of the Serra do Tiracambu mountain range (Piorski, 2010). Beginning in the Late Miocene, the Tiracambu range arose to form the watershed divides between the Tocantins basin draining its western slope, and the coastal Gurupi and Mearim basins that drain its north and northeastern slopes (Soares Júnior et al., 2011). The approximate timing of this geological event generally coincides with estimated lineages divergence times for Hypostomus Lacepède, 1803, species that inhabit these three basins (Montoya-Burgos, 2003).
The population and biogeographic structure of two characiform freshwater fish species, Schizodon dissimilis (Garman, 1890) (Anostomidae) and Prochilodus lacustris Steindachner, 1907 (Prochilodontidae) (Fig. 1) (Fricke et al., 2020), Schizodon dissimilis is distributed only in the Poti River of the Parnaíba basin and, according to Barros et al. (2011), in the Itapecuru basin. Abreu (2013) expanded the known distribution to other localities in the Parnaíba basin, and also to the Mearim and Turiaçu basins. Prochilodus lacustris is distributed to the Parnaíba, Itapecuru, Mearim, Turiaçu, and Tocantins basins (Piorski, 2010;Barros et al., 2011;Fricke et al., 2020). The study seeks to identify the main hydrographic and paleogeographic events (river capture and dispersal by paleodrainages) that influenced the geographic distributions and evolutionary histories of these fish lineages. Through the use of haplotypes network, lineage divergence times, and statistical analyses using molecular data, and by comparing these biological data with available geological information from the region, we identify the most important paleogeographic events that promoted the differentiation of these species.

MATERIAL AND METHODS
Sampling and DNA extraction. Seventy individuals of Schizodon dissimilis and 101 individuals of Prochilodus lacustris were sampled from 14 localities across the Maranhão coastal basins including the Parnaíba, Itapecuru, Mearim, Turiaçu, and Tocantins basins (Tab. 1). Tissue samples were taken from fins of sampled individuals, preserved FIGURE 1 | Species used in this study, Schizodon dissimilis (above) and Prochilodus lacustris (below).
Amplification products were visualized in 1% agarose gel electrophoresis and purified with Illustra GFX PCR DNA and Gel Purification Kit (GE Healthcare). Samples were sequenced using both forward and reverse primers and Big Dye Terminator kit 3.1 Cycle Sequencing kit (Thermo Fisher Scientific) in ABI 3730 DNA Analyzer (Thermo Fisher Scientific). The quality of sequences was inspected using BioEdit 7.0.5.3 (Hall, 1999) and aligned using ClustalW 2.0.3 (Larkin et al., 2007). All sequences are deposited in NCBI with the follow corresponding accession number: MK530951-MK531020 to S. dissimilis and MK531021-MK531121 to P. lacustris (S1).
Genetic diversity and structure. Haplotype and nucleotide diversity indices were obtained using Arlequin 3.5.1.3 (Excoffier, Lischer, 2010). Tajima's D values were also calculated for selective neutrality testing (Tajima, 1989) and Fs (Fu, 1997). A Spatial Analysis of Molecular Variation (SAMOVA) in the program SAMOVA 2.0 (Dupanloup et al., 2002) was used to assess population structure. The variance was partitioned for haplotypes within populations, among groups and among populations within groups. Potential populations were identified using BAPS 6 (Bayesian Group Analysis; Corander et al., 2013) and a haplotype network was constructed to identify the genetic distances among and within populations using the program Haploviewer 1.6 (Salzburger et al., 2011). In all analyses each basin was considered as a group and each locality as a population.

Phylogenetics analyses.
The best molecular evolution model for the data was obtained using jModelTest 2 (Darriba et al., 2012). Bayesian Information Criterion (BIC) identified HKY+I as the best model for both genes coI and D-Loop. A molecular clock was applied to the tree through BEAST 2.4.7 program (Bouckaert et al., 2014), to identify divergence times among populations. The analysis was run with 200 million generations in 4 runs, sampled every 1,000 states with a burn-in of first 10% of the generations excluded. We used for both species the substitution rate of 1.67% of genetic divergence per million years proposed by Sivasundar et al. (2001). The maximum credibility tree was visualized with TreeAnnotator 1.8.1 after an additional burn-in period of 20,000. Median node ages and 95% highest posterior density (HPD) intervals were plotted in the chronogram.

Reconstruction of paleodrainages. Paleodrainage boundaries at the Last Glacial
Maximum (LGM) (e.g. Dias et al., 2014;Thomaz et al., 2015;Thomaz, Knowles, 2018) was estimated using topographical and bathymetric information extracted from a digital elevation model (DEM) GEBCO_2014 at 30 arc-second resolution (ca. 1 km; http:// www.gebco.net/) (Weatherall et al., 2015) using hydrological and spatial tools available in the GRASS environment of QGIS 3.2.3 (QGIS Development Team, 2018). In QGIS program the areas were delimited using the Contour and Mask tools; elevation and flow directions were identified using the Fill and Carve tools; river basin boundaries using the Watershed tool; and sea channel boundaries using the Lake tool. These coastline models were used to identify the minimum and the maximum number of basins with a separate mouth to the sea during the estimated period and to infer whether this connection and separation processes influenced the distribution of the studied species, considering a sealevel of -122 m, previously proposed by Hansen et al. (2013).

Phylogeographic analyses.
A total of 171 sequences for both genes were produced: Schizodon dissimilis (coI, n = 70, 623bp, 12 haplotypes) and P. lacustris (D-loop, n = 101, 702bp, 49 haplotypes). The difference in haplotype numbers between the species is expected more due to the different quantity of sequences, localities and genes used, than the possible different histories of each taxon. Populations from the Parnaíba and Mearim basins showed high indices of haplotype diversity (h) for both species, and Itapecuru and Tocantins basins also presented high levels of haplotype diversity in P. lacustris. All populations showed low levels of nucleotide diversity (π) (Tab. 2). The neutrality indices of Tajima' D and Fu' Fs were not significant for populations of S. dissimilis (p > 0.05), confirming the hypothesis of neutrality, and although significant among populations of P. lacustris (p<0.001) from the Parnaíba and Mearim basins (Tab. 2), not enough to deny the hypothesis of selective neutrality once the stability can only be rejected if both analyses are significant for all populations. Populations of S. dissimilis from the Itapecuru basin showed only one haplotype, which was not enough to make inferences of any indices.
The SAMOVA test separated the studied populations of S. dissimilis in three groups ( Fig. 2A) and P. lacustris in six groups (Fig. 3A), most of the molecular variance is The BAPS analysis identified the same number of groups as did SAMOVA for S. dissimilis (log(ml) = -280.2301; Fig. 2B) and P. lacustris (log(ml) = -881.0328; Fig. 3B), as well as the haplotype networks (Figs. 2C and 3C). For S. dissimilis three groups were identified (the colors in parentheses correspond to Fig. 2A, C): the first group is formed by specimens from Parnaíba and Mearim drainages (green), the second by specimens from Parnaíba, Itapecuru, and Mearim drainages (blue), and the third formed by specimens from Turiaçu and Mearim drainages (red). The large number of mutational steps observed in the network indicate the strong genetic structure and a potential putative new species in the Turiaçu and Mearim drainages. For P. lacustris, six groups were identified with specimens from the drainages (the colors in parentheses correspond to Fig. 3A, C): 1) Parnaíba and Mearim (blue), 2) Parnaíba and Itapecuru (yellow), 3) Turiaçu and Tocantins (aqua), 4) Itapecuru, Turiaçu, and Tocantins (pink), 5) Parnaíba, Itapecuru, and Mearim (green) and 6) Parnaíba, Itapecuru, Mearim, and Tocantins (red). The paleodrainage reconstruction recover four paleodrainages for the studied area, which S. dissimilis is present in three paleodrainages and P. lacustris in four. It is necessary highlight that our reconstruction possesses a different configuration those presented by Thomaz, Knowles (2018), that identified six paleodrainages for part of the studied area.
The results of AMOVA analysis (Tab. 4) for S. dissimilis identified that all results, except to paleobasins, showed that the principal variance are among populations within groups. For P. lacustris all results, except to paleobasins showed that the principal variance is within populations. These results indicate that none of these geological scenario supports the observed genetic structure, which is an indicative that the principal events responsible for the differentiation in these species are the river capture and dispersal probably by paleodrainage to contribute for the occupation of the region. The comparison of Fst values (Tab. 5) was high only between Itapecuru and Turiaçu for S. dissimilis, and Parnaíba and Mearim for P. lacustris.
The time-calibrated phylogeny indicated that populations of S. dissimilis diverged about 600,000 years (Fig. 4), when the group inhabiting the Turiaçu and Mearim basins  ni.bio.br | scielo.br/ni captures between the Turiaçu-Mearim and Itapecuru-Parnaíba basins, presumed to have been caused by reactivation of several faults and lineaments (Costa et al., 1997;Soares Júnior et al., 2011), and 2) coastal dispersal (dispersal by paleodrainages) due Pleistocene sea level fluctuations and reticulated patterns of connectivity among the mouths of rivers on the coastal plain that repeatedly changed the basins configuration (Soares Júnior et al., 2011;Thomaz et al., 2017). Furthermore, different genetic markers identified similar events over the past 1.0 My influencing the geographic distribution of species and genetic differentiation of populations. Shared haplotypes among S. dissimilis populations from Turiaçu and Mearim basins possibly occurred due to partial capture of Turiaçu by Mearim basin through damming and formation of lakes. The convergence of the Mearim, Pindaré and Grajaú rivers, influenced by quaternary neotectonic structures (Costa et al., 1997) and associated with transgressive and regressive sea movements (Soares Júnior et al., 2011), produced a set of interconnected lakes known as Baixada Maranhense. This is a wide plain region extending from the Mearim River to northwest in the direction of the Turiaçu valley. During the Lower Pleistocene (2.6 Ma -0.1 Ma), the western edge of the region seems     ni.bio.br | scielo.br/ni to have been affected by a positive epeirogenesis resulting from tectonic reactivations of faults forming the Ferrer-Urbano Santos Arch, especially in the area of the remains of Gurupi belt (Klein et al., 2005). The Gurupi belt lies to the west of the Turiaçu River. Thus, its reactivation may have contributed to direct parts of the middle Turiaçu draining towards the Pindaré-Mearim system. At that time, Pleistocene eustatic sea-level changes responsible for marine regressions and transgressions (Petri, Fúlfaro, 1983;Soares Júnior et al., 2011) may have controlled the connections between Turiaçu River and Pindaré-Mearim systems. These connections may have been favored during transgressive movements, when river flows were dammed and scattered across the interfluvium. This is plausible because there is some evidence floodplains promote connectivity between hydrological units and allow high levels of gene flow for species adapted to explore them (Balcombe et al., 2007), which is the case of these two rivers, whereas rivers with deep valleys will rarely overflow, leading populations to isolation and subsequent divergence (Burridge et al., 2007). Like many fish species that inhabit these regions, both investigated species have migratory habits and are commercially exploited (Godoy, 1975;Goulding, 1981), which facilitate dispersal processes. The low diversity identified in some populations could be an indicative that these populations are in constant contact by dispersal or could indicate a sample problem, once at least in Turiaçu all sample were collected just in one place. Another factor that could contribute to this lower diversity levels is the mixture among populations caused by the overflow of breeding tanks and release in the river of individuals from other locations.
The haplotype sharing, principally in S. dissimilis, between the Itapecuru and Parnaíba basins, can be explained by headwater capture events, probably occurred in the region between the modern cities of Nova Iorque and São João dos Patos, both in Maranhão state. These rivers have an elbow form which is characteristic of capture events (Rosa, 2004) that could be explained by the several reactivation processes of the Picos-Santa Inês Lineament, from its formation to the present, which was responsible for the change of direction of the middle and lower courses of these rivers (Cunha, 1986).
The results indicate relatively young divergence times for these species, which was not sufficient to allow visual identification of the new species. For example, although S. dissimilis populations exhibit substantial genetic divergence, they do not show clear morphological or morphometric differences (Abreu, 2013). The morphometric analysis developed by Abreu (2013) showed a strong overlap in the populations, that was not enough to separate in different species. Here we found a high number of mutational steps among the groups identified, suggesting the presence of cryptic species only detectable by genetic analyses and we suggest a review in morphological analysis for this specie.
Many studies of fish species distributions in the coastal basins of Brazil use paleodrainage reconstructions during the LGM at ca. 18 -21 Ky to explain the occurrence of species shared for the hydrographic basins (e.g. Thomaz et al., 2015Thomaz et al., , 2017Lima et al., 2017;Tschá et al., 2017). In our reconstruction, the Mearim and Itapecuru basins belong to the same paleodrainage at the LGM, possibly explaining the shared haplotypes of populations of these basins, or perhaps demonstrating that modern watershed barriers are insufficient to block gene flow once the geographic distance does not strongly hinder dispersal in freshwater fishes of this region, as showed by Abreu et al. (2019).
14/19 ni.bio.br | scielo.br/ni The probable dispersal of S. dissimilis and P. lacustris along the coastal plain of this area during the cyclic events of sea-level change, would have allowed shared haplotypes among these several basins that do not have hydrological connections today . Soares Júnior et al. (2011) proposed that in the end of Pleistocene the coast of Maranhão began to assume its current configuration, influenced directly by the sea-level fluctuations. Successive events of sea-level variation may have allowed species dispersal process in the region among the drainages and the subsequent isolation of populations in different basins.
The role of Pleistocene eustatic sea-level falls merging river mouths has long been understood to facilitate geographic range expansions of fish species along flat coastal plains (e.g. Huber, 1998;Albert et al., 2006;Dias et al., 2014;Thomaz et al., 2015). Likewise, the role of eustatic sea-level rises separating river mouths has been argued to isolate species in different river basins (e.g. Lundberg et al., 1998;Albert et al., 2006;López-Fernández, Albert, 2011). In combination, these marine regressions and transgression may be hypothesized to have had multiplicative effects over multiple cycles of merging and separating fish populations through Pleistocene, resulting in repeated rounds of dispersal, isolation, speciation, and extinction. The effects of climatic oscillations driving high species richness have been referred to as a "species pump" or "flickering connectivity system" hypothesis (e.g. Rossiter, 1995;Sedano, Burns, 2010;Papadopoulou, Knowles, 2015;Ferreira et al., 2017;Flantua, Hooghiemstra, 2018).

ACKNOWLEDGMENTS
This work was supported by grants from the US National Science Foundation DEB 0614334, 0741450 and 1354511to J.S.A.; and from FAPEMA 011644/2016 to J.M.S.A., and FAPEMA 01490/16 to N.M.P.