Genetic diversity and population structure of bocachico Prochilodus magdalenae (Pisces, Prochilodontidae) in the Magdalena River basin and its tributaries, Colombia.

Prochilodus magdalenae is an endemic freshwater fish that occurs in the Magdalena, Sinú and Atrato hydrographic basins. It has an important economic role and is a food resource for artisanal fishing communities. Its socioeconomic importance contrasts with the current status of its fisheries, in which stocks are being depleted. Considering its importance and the lack of information on its genetic structure, we used seven microsatellite markers to assess the genetic structure of wild populations of P. magdalenae. Genetic diversity was assessed and population genetic structure was estimated through FST, analysis of molecular variance and Bayesian analysis. A total of 290 alleles were found across all loci and populations. This high polymorphism contrasts with the low observed heterozygosity (Ho = 0.276), which is among the lowest values recorded for the family. We found three populations of bocachico coexisting throughout the studied system, contradicting the hypothesis that freshwater migratory fish form panmictic populations. These results on the genetic structure of P. magdalenae provide tools for a better understanding of the behavior and biology of this species, contributing to fish management and conservation programs.


Introduction
The Magdalena River is the principal hydrographic system in Colombia and the major axis of economic development in the country (Jimenez-Segura et al., 2010). Its basin is densely populated and nearly 80% of the Colombian human population lives within it (Galvis and Mojica, 2007). The basin has an area of 257,438 km², occupying approximately 24% of the Colombian territory (Galvis and Mojica, 2007), and has a large biomass and diversity of freshwater fish, harboring 167 species (Galvis and Mojica, 2007), some of which, like the bocachico (Prochilodus magdalenae), migrate in the reproductive season. The bocachico is an endemic species occurring in the Magdalena, Sinú and Atrato hydrographic basins (Mojica et al., 2002; Maldonado-Ocampo et al., 2005). It is an important fish resource for artisanal fishing communities and one of the most commonly captured species. The bocachico is a detritivorous species that feeds on fine bottom sediment containing organic particles. It is considered a key element in the ecosystem and may play an ecological role because of its sediment-processing habit (Flecker, 1996).
Like many other prochilodontids, P. magdalenae has high fecundity and spawns all its eggs at once in the open waters of the main river channel (Agostinho et al., 1993). Larvae drift passively towards flooded areas, where they feed and remain until maturation (Agostinho et al., 1993). The life cycle depends on the hydrological pattern of the Magdalena River basin, in which the fish migrate to the main river during two hydrological periods each year. In the first, when water levels begin to fall (December-February), the upstream spawning migration starts and the fish remain upstream during the dry season. Spawning begins with the onset of the first high-water period (March-June). The second takes place when water levels again start to fall (July-September) and a second spawning migration occurs. This second spawning is of lower magnitude and the process is not well understood, but it apparently involves fish that failed to reproduce in the previous period (Jimenez-Segura et al., 2010).
In recent years, genetic investigations have revealed that freshwater migratory fish can display population structuring within a single hydrographic system (Wasko and Galetti Jr, 2002; Hatanaka et al., 2006; Sanches and Galetti Jr, 2007). During the spawning season, fish schools may behave in a way that maintains the genetic integrity of the populations (Hatanaka et al., 2006; Sanches and Galetti Jr, 2007). Reproductive behavior is therefore a decisive factor in the population subdivision of a species (Sanches and Galetti Jr, 2007).
One approach that has been successfully applied in genetic studies of fish is the use of microsatellite markers (Piorski et al., 2008), which are short tandem repeats of 1-6 nucleotides widely distributed in the genome (Litt and Luty, 1989). Because of their abundance, high degree of polymorphism, Mendelian inheritance (Selkoe and Toonen, 2006) and simple evolutionary mechanisms (Piorski et al., 2008), microsatellites have been widely used to assess genetic diversity (Barroso et al., 2005; Matsumoto and Hilsdorf, 2009; Barroca et al., 2012) and population structure in fish (Barroso et al., 2005; Hatanaka et al., 2006; Carvalho-Costa et al., 2008; Calcagnotto and DeSalle, 2009; Matsumoto and Hilsdorf, 2009; Sanches et al., 2012; Barroca et al., 2012). However, few studies have used microsatellite markers to detect the genetic structure of bocachico in the Magdalena River (Santacruz, 2003).
The study of genetic diversity is essential for fish conservation, which depends on knowing the amount of variation that exists in a local reproductive unit (Carvalho, 1993). The importance of this approach lies in its potential for delimiting priority areas for species conservation and sustainable use (Sanches and Galetti Jr, 2007). The purpose of this study was to use seven microsatellite loci to gather information on the genetic structure of Prochilodus magdalenae in the Magdalena River and its tributaries, providing data for the conservation and management planning of this fish.

Sample collection
Specimens were captured with the help of artisanal fishermen at 25 sites from upstream (Neiva) to downstream (Ciénaga Grande de Santa Marta) in the Magdalena River basin and its main tributaries, such as the Sogamoso, Cauca and San Jorge rivers, from April to December 2010 (Figure 1). Muscle tissue was removed from 759 specimens immediately after capture and stored in 96% ethanol until use.

DNA extraction and amplification of loci
Genomic DNA was extracted from P. magdalenae muscle using the MasterPure kit (Epicentre Biotechnologies®). Genetic diversity was analyzed using seven microsatellite loci described for Prochilodus lineatus (PL3, PL14, PL23, PL28, PL34, PL64 and PL119) that cross-amplify in P. magdalenae (Rueda et al., 2011). PCRs were carried out in a final volume of 10 µL containing 100 ng of DNA, 1X reaction buffer (20 mM Tris-HCl pH 8; 50 mM KCl), 2 mM MgCl2, 200 µM dNTPs, 0.2 µM of each primer (forward and reverse) and 0.25 U of Taq polymerase (Bioline Meridian Life Science). PCR conditions were as follows: 5 min at 95°C; 30 cycles of 30 s at 94°C, 30 s at the annealing temperature and 30 s at 72°C; and a final extension at 72°C for 10 min. The reactions were performed in an ESCO-SWIF MaxPro gradient thermocycler. The amplification products were analyzed by capillary electrophoresis on a QIAxcel Advance (QIAGEN), using a high-resolution kit (High Resolution Kit QIAGEN) and a molecular weight ladder of known concentration (DNA Size Marker 50-800 bp v2.0 QIAGEN). The size of each amplified product was determined with the ScreenGel QIAxcel v1.0 QIAGEN program, which quantifies the molecular weight of each band and thus distinguishes heterozygous from homozygous individuals.

Microsatellite statistical analysis
Genetic diversity was estimated for each population from the number of alleles per locus (NA), observed heterozygosity (Ho) and expected heterozygosity (He), computed using GenAlEx 6.0 (Peakall and Smouse, 2006). The inbreeding coefficients (FIS) per locus were calculated in FSTAT (Goudet, 1995). Departure from Hardy-Weinberg expectations was tested using a test analogous to Fisher's exact test (Guo and Thompson, 1992), whose null hypothesis is the random union of gametes. This test was run with a Markov Chain Monte Carlo (MCMC) series of permutations (10,000 batches/1000 iterations), implemented in GENEPOP (Raymond and Rousset, 1995). The independent segregation of genotypes (linkage disequilibrium) was also tested in GENEPOP using the exact test with an MCMC procedure (10,000 batches/1000 iterations) (Guo and Thompson, 1992). The presence of genotyping errors arising from technical artifacts, namely null alleles or large allele dropout due to poor DNA quality, was assessed with the software MICRO-CHECKER (van Oosterhout et al., 2004). For population structure analysis, ARLEQUIN 2000 (Schneider et al., 2000) was used to calculate pairwise FST estimates (Weir and Cockerham, 1984) for each population pair. We examined the distribution of genetic variability across hierarchical levels at different geographical scales in the Magdalena River and its tributaries through AMOVA (Analysis of Molecular Variance), also performed in ARLEQUIN 2000. The hierarchical levels were: 1) all populations forming a single group and 2) a group containing all localities of the principal channel of the Magdalena River, excluding tributaries such as the Sogamoso, Cauca and San Jorge rivers. AMOVA thus provided FST values analogous to the FST of Wright (1978), and the null hypothesis of no genetic structuring between populations was tested through a non-parametric procedure of random permutations (10,000).
38 Berdugo and Barandica
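The per-locus diversity statistics named above can be illustrated with a minimal sketch. It assumes diploid genotypes coded as pairs of fragment sizes, uses the simple (uncorrected) expected heterozygosity He = 1 - Σp², and takes FIS = 1 - Ho/He. The function name `locus_diversity` and the example genotypes are hypothetical, and the estimators implemented in GenAlEx and FSTAT may differ in detail (for example, in small-sample corrections).

```python
from collections import Counter


def locus_diversity(genotypes):
    """Return (number of alleles, Ho, He, FIS) for one locus in one population.

    `genotypes` is a list of (allele_a, allele_b) tuples, one per diploid
    individual. He is the uncorrected 1 - sum(p_i^2); this is an assumption
    of the sketch, not necessarily the estimator used in the study.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Count every allele copy (two per individual).
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_copies = 2 * n
    # Observed heterozygosity: proportion of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 1.0 - sum((c / total_copies) ** 2 for c in allele_counts.values())
    # Inbreeding coefficient; undefined when the locus is monomorphic.
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return len(allele_counts), ho, he, fis


# Hypothetical genotypes (fragment sizes in bp), not data from the study.
example = [(100, 100), (100, 102), (102, 102), (100, 100)]
n_alleles, ho, he, fis = locus_diversity(example)
print(n_alleles, ho, round(he, 4), round(fis, 4))
```

A positive FIS, as in this toy example, indicates fewer heterozygotes than expected under random mating.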
The Bayesian clustering method implemented in STRUCTURE version 2.3.3 (Hubisz et al., 2009) was applied to identify clusters of genetically similar individuals and to determine the level of genetic substructure in the data set independently of sampling areas. To estimate the number of subpopulations (K), three independent runs for each K = 1-15 were carried out with 100,000 Markov Chain Monte Carlo repetitions after a burn-in period of 100,000, using no prior information and assuming correlated allele frequencies and the admixture model. The migration pattern of P. magdalenae implies high gene flow between populations, so we used the admixture model, in which each individual is assumed to have inherited some proportion of its ancestry from each population. To determine the number of populations (K) present in the Magdalena River basin and its tributaries, we used the method proposed by Evanno et al. (2005). This value was obtained using the software STRUCTURE HARVESTER 0.56.3 (Earl, 2009).
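The ΔK statistic of Evanno et al. (2005) can be sketched as follows. This is one common reading of the method, in which the second-order difference is taken on the mean ln P(X|K) over replicate runs and divided by the standard deviation of ln P(X|K) at that K; the exact computation in STRUCTURE HARVESTER may differ in detail. The function name `evanno_delta_k` and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    `lnp_by_k` maps K to a list of ln P(X|K) values, one per replicate run.
    Delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), with L the mean over runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta_k = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # Delta K is undefined for the smallest and largest K
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when runs are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta_k[k] = second_diff / sd
    return delta_k


# Hypothetical ln P(X|K) values for three runs per K, not results from the study.
runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-700, -701, -699],
    4: [-690, -700, -680],
    5: [-685, -695, -690],
}
delta = evanno_delta_k(runs)
best_k = max(delta, key=delta.get)
print(best_k, delta)
```

The K with the largest ΔK is taken as the most likely number of clusters. Note that ΔK cannot be evaluated at K = 1 or at the largest K tested.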

Genetic diversity
All seven microsatellites were polymorphic (100%) in the wild populations of P. magdalenae within the Magdalena River basin and its tributaries. A total of 290 alleles were found across all loci in the whole population, ranging from 33 (PL64) to 59 (PL119) per locus. The average number of alleles per locus in the population was 41.4 ± 9.6, and the mean at each sampling site ranged from 9.57 (Ciénaga de Canta Gallo) to 18.43 alleles (Ciénaga de Pijiño) (Table 1).
Of the 290 alleles found in the bocachico population of the Magdalena River basin, 61 were private alleles with a frequency below 11%. Such alleles were detected in 21 of the 25 sampling sites and ranged from one each in Barranco de Loba, Sogamoso River, Neiva and Gambote to eight in Gamarra (Table 1). The observed heterozygosity (Ho) was low in the wild population of bocachico and ranged from 0.19 (Cauca River) to 0.33 (Neiva) (Table 1). The highest expected heterozygosity (He) was found in Puerto Boyacá (0.9188) and the lowest in Ciénaga de Canta Gallo (0.8289). The average FIS (inbreeding coefficient) was high at all sampled sites in the Magdalena River basin and ranged from 0.624 (Ciénaga San Silvestre) to 0.777 (Ciénaga de Pajarales).
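A private allele is one observed at a single sampling site only. The idea can be sketched as below, assuming the alleles seen at each site have already been tabulated per locus. The function name `private_alleles`, the data layout and the example values are hypothetical and do not reproduce the study's data.

```python
from collections import defaultdict


def private_alleles(site_alleles):
    """Return, for each site, the set of (locus, allele) pairs found only there.

    `site_alleles` maps site name -> {locus name -> set of observed alleles}.
    """
    # Record every site at which each (locus, allele) pair was observed.
    seen_at = defaultdict(set)
    for site, loci in site_alleles.items():
        for locus, alleles in loci.items():
            for allele in alleles:
                seen_at[(locus, allele)].add(site)
    # Keep only the pairs observed at exactly one site.
    result = {site: set() for site in site_alleles}
    for key, sites in seen_at.items():
        if len(sites) == 1:
            (only_site,) = sites
            result[only_site].add(key)
    return result


# Hypothetical allele sets (fragment sizes in bp) for one locus at three sites.
observed = {
    "site_A": {"PL3": {100, 102}},
    "site_B": {"PL3": {102, 104}},
    "site_C": {"PL3": {102}},
}
private = private_alleles(observed)
print(private)
```

Counting the entries per site gives the kind of tally reported in Table 1. Allele frequencies would be needed in addition to apply a frequency threshold.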
All sites showed departures from Hardy-Weinberg expectations (HWE) at all loci (p < 0.01). The U test (Raymond and Rousset, 1995) showed that the deviation from HWE is due to a heterozygote deficit (p < 0.0001). This deficit was also revealed by the positive FIS values. Null alleles were detected at all loci and in all localities in the Magdalena River basin by the tests performed in MICRO-CHECKER.
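As a rough consistency check on the heterozygote deficit, the relationship FIS = 1 - Ho/He can be applied to the overall means reported in this paper (Ho = 0.276 and He = 0.877). This is only an approximation: the per-site FIS values come from FSTAT and are averaged over loci, so they need not equal a ratio of global means.

```python
# Overall means reported in the paper; the calculation itself is an
# illustrative approximation, not the estimator used by FSTAT.
ho_mean = 0.276
he_mean = 0.877

fis_approx = 1 - ho_mean / he_mean
print(round(fis_approx, 3))  # about 0.685
```

The result falls inside the reported per-site range of 0.624 to 0.777, so the low observed heterozygosity and the high FIS values are mutually consistent.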
No pair of the seven loci showed significant linkage disequilibrium, suggesting that the analyses could be performed assuming statistical independence of the loci.

Population differentiation and structure
Pairwise FST values showed differentiation in some population comparisons and none in others, ranging from -0.00038 between the localities of Mompós and Puerto Boyacá to 0.15 between Ciénaga de Canta Gallo and Sogamoso River (Table 2).
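The logic of a pairwise FST can be illustrated with a simplified heterozygosity-based form, FST = (HT - HS)/HT, for one locus and two equally weighted populations. This is explicitly not the Weir and Cockerham (1984) estimator computed by ARLEQUIN, which corrects for sample size and can therefore return slightly negative values such as the -0.00038 reported above; the simplified form cannot go below zero. The function name `fst_two_pops` and the frequencies are hypothetical.

```python
def fst_two_pops(freqs_1, freqs_2):
    """Simplified FST = (HT - HS) / HT for one locus and two populations.

    Each argument maps allele -> frequency within one population
    (frequencies in each population should sum to 1).
    """
    alleles = set(freqs_1) | set(freqs_2)
    # Mean within-population expected heterozygosity (HS).
    hs_1 = 1.0 - sum(p * p for p in freqs_1.values())
    hs_2 = 1.0 - sum(p * p for p in freqs_2.values())
    hs = (hs_1 + hs_2) / 2
    # Expected heterozygosity of the pooled population (HT), equal weights.
    pooled = {a: (freqs_1.get(a, 0.0) + freqs_2.get(a, 0.0)) / 2 for a in alleles}
    ht = 1.0 - sum(p * p for p in pooled.values())
    return (ht - hs) / ht if ht > 0 else 0.0


# Hypothetical allele frequencies, not taken from the study.
same = fst_two_pops({100: 0.5, 102: 0.5}, {100: 0.5, 102: 0.5})
fixed = fst_two_pops({100: 1.0}, {102: 1.0})
print(same, fixed)
```

Identical frequencies give no differentiation and fixed alternative alleles give complete differentiation; the values in Table 2 lie close to the lower end of that scale.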
Variation among and within populations was assessed by AMOVA using ARLEQUIN 2000. To test the genetic structure, the sampling sites were divided into the hierarchical groups described above. The first analysis revealed a significant genetic differentiation among all populations in the Magdalena River basin and its tributaries (FST = 0.06511; p < 0.00), showing that nearly 6.5% of the total microsatellite DNA diversity was explained by variability among populations and 93.49% was found within populations (Table 3). When the tributaries were excluded from the analysis, the AMOVA continued to show a significant genetic differentiation (FST = 0.0638; p < 0.00), with nearly 6.38% of the variation explained by variability among populations and the remainder found within them.
However, the AMOVA and FST tests require an a priori definition of subpopulations or sampled localities, and this grouping may not correspond to biological groups or populations. We therefore used STRUCTURE 2.3.3 to verify whether individuals were clustered into two or more populations when their genotypes indicated that they were admixed. The Bayesian analysis, evaluated with the method proposed by Evanno et al. (2005), gave the most probable number of subpopulations of P. magdalenae in the Magdalena basin and indicated at least three genetically differentiated clusters or populations (K = 3) within the data set (Figure 2). These populations are distributed along the basin, where at least one fish of each sampled site was assigned to one cluster (red, blue or green).
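How cluster assignments can be summarized per sampling site is sketched below. The sketch assumes that each individual has a vector of membership coefficients (one per cluster, as produced under an admixture model) and assigns it to the cluster with the highest coefficient. That hard assignment is a simplification chosen for illustration; the function name `clusters_per_site`, the site labels and the coefficients are hypothetical.

```python
from collections import Counter, defaultdict


def clusters_per_site(rows):
    """Count individuals per cluster at each site.

    `rows` is an iterable of (site, membership) pairs, where `membership`
    is a list of coefficients, one per cluster. Each individual is assigned
    to the cluster (numbered from 1) with the highest coefficient.
    """
    counts = defaultdict(Counter)
    for site, membership in rows:
        best = max(range(len(membership)), key=membership.__getitem__) + 1
        counts[site][best] += 1
    return counts


# Hypothetical membership coefficients for K = 3, not results from the study.
individuals = [
    ("site_A", [0.7, 0.2, 0.1]),
    ("site_A", [0.1, 0.1, 0.8]),
    ("site_B", [0.2, 0.6, 0.2]),
]
summary = clusters_per_site(individuals)
print({site: dict(c) for site, c in summary.items()})
```

A site whose individuals fall into more than one cluster, like `site_A` here, corresponds to the mixture of populations described in the text.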

Genetic diversity in wild populations
Despite the large biodiversity of freshwater fishes in South America, few studies have used microsatellite markers to assess the genetic characteristics of migratory fish populations. The present study demonstrates that P. magdalenae has a high genetic diversity (He = 0.877) compared to that observed for other species of the genus (Santacruz, 2003; Hatanaka et al., 2006; Galzerani, 2007; Carvalho-Costa et al., 2008; Silva, 2011; Rueda et al., 2011) and for other commercially important species that have the same reproductive pattern and migrate long distances (Sanches, 2007; Batista, 2010; Dantas, 2010) (Table 4). The microsatellite genetic diversity was slightly lower than that reported by Aguirre et al. (2013) for mitochondrial DNA (mtDNA control region) in P. magdalenae (HD = 0.997). These results reflect the vagility of P. magdalenae, whose strong migratory behavior facilitates the maintenance of high levels of genetic variability, as in other species of the genus Prochilodus (Lassala and Renesto, 2007; Santos et al., 2007).
On the other hand, the observed heterozygosity of the P. magdalenae populations is much lower than that recorded for other fish species (Santacruz, 2003; Hatanaka et al., 2006; Galzerani, 2007; Sanches, 2007; Carvalho-Costa et al., 2008; Silva, 2011; Rueda et al., 2011); these values are among the lowest recorded for the whole family (Table 4). The present study shows that all loci exhibited a heterozygote deficiency throughout the populations surveyed, and the presence of null alleles is suggested as its cause. The same explanation for heterozygote deficiency was proposed by Barroso et al. (2005) and Matsumoto and Hilsdorf (2009). However, several factors can cause heterozygote deficiency, such as selection on a specific locus, inbreeding and cryptic population structure (Garcia De Leon et al., 1997). The first factor was not considered, since we did not observe loci under selection (Slatkin, 1995). In the freshwater fish Prochilodus argenteus, a heterozygote deficit was attributed to a combination of random sampling effects and null alleles (Hatanaka et al., 2006). Thus, the heterozygote deficit cannot be explained by a single factor, since the interaction of several factors may be contributing to it (Sanches, 2007).
The microsatellites used in this study allowed the estimation of a high genetic diversity in the P. magdalenae population, despite the presence of some unique alleles. Given the large number of individuals sampled in this study (759), it is likely that these unique alleles are rare. If so, they can be used as population markers that are powerful for detecting gene flow (Slatkin, 1985).

Population structure
The microsatellite data suggest the presence of significant population structuring in the migratory fish P. magdalenae from the Magdalena River basin and its tributaries, as revealed by the FST and AMOVA statistics. This suggests that these fish organize themselves during the spawning period in a way that maintains the integrity of each subunit of the system (Sanches and Galetti Jr, 2007). Pereira et al. (2009) studied the population structure of Pseudoplatystoma corruscans and argued that, given the magnitude and geographical scale of their study, the genetic differentiation found could only be explained by homing behavior. However, it is important to note that their samples were collected in tributaries during spawning migration movements, to avoid collecting mixed populations in feeding areas. This prevents comparisons between our work and that of Pereira et al. (2009), as we included samples from the main channels of the Magdalena River and from feeding areas (such as lagoons and marshes), making the structuring pattern of P. magdalenae unclear.
Further studies with samples from headwaters of the tributary rivers could clarify whether P. magdalenae presents homing behavior during spawning migration. Thus, the genetic differentiation found in the population of P. magdalenae is not consistent with the currently accepted idea of panmictic populations of neotropical migratory freshwater fish (Sivasundar et al., 2001;Castro and Vari, 2004;Aguirre et al., 2013), despite the enormous vagility and long migrations of these fish.
Similar trends have been reported for Prochilodus argenteus in the São Francisco River (Brazil) with microsatellite markers (Hatanaka et al., 2006). A similar situation was also reported for P. costatus and P. argenteus studied in the same hydrographic basin (Barroca et al., 2012). In other species with similar reproductive and migratory behaviors, such as Brycon orthotaenia, two defined populations were identified in the São Francisco River (Sanches et al., 2012). It has also been found that B. insignis is currently structured, possibly due to anthropogenic actions (Matsumoto and Hilsdorf, 2009). A similar situation was reported for P. magdalenae in the Sinú River (Colombia) (Santacruz, 2003), in which a significant genetic structure was identified with heterologous microsatellite markers. However, population studies of P. magdalenae from the Magdalena River basin based on the mtDNA control region indicated a single panmictic population (Aguirre et al., 2013). The difference observed between the two markers may be explained by the recent impact of human activities, which probably has not had enough time to leave traces in the mtDNA control region.
The structuring pattern was also clearly demonstrated in the Bayesian analysis, in which three populations were identified. These populations are distributed along the Magdalena River basin, given that at least one individual collected from each sampled location was assigned to one of three clusters, so that each location represents a mixture of different populations. This result reflects the dispersal capabilities of bocachico, due to its reproductive behavior and the geographic proximity of some localities, as many of the sampled sites (such as lagoons and marshes) are interconnected during periods of flooding. Thus, our hypothesis is that the population genetic structure of P. magdalenae may be maintained by "reproductive waves" (Jorgensen et al., 2005), that is, genetically differentiated groups that breed in the same place at different time periods with some overlap. This idea is supported by the fact that P. magdalenae has two reproductive peaks during the year in the Magdalena River basin (Valderrama, 1972; Valderrama-Barco and Petrere Jr, 1994; Jimenez-Segura et al., 2010). These authors suggested that a fraction of the population reproduces during the first hydrological cycle and the remainder in the second one. This second migration pulse may involve individuals that could not migrate during the first hydrological cycle.
However, this "reproductive waves" behavior would imply the existence of two populations in the Magdalena River basin. The third population detected by the Bayesian analysis could be a consequence of the repeated restocking programs conducted in this basin. Machado-Schiaffino et al. (2007) provided evidence of significant genetic variation losses in Atlantic salmon stocks (Salmo salar) created for supportive breeding in which the juveniles released in the rivers possessed significantly lower allelic richness than the wild stocks. The cultivated population of Brycon opalinus also presented heterozygosity reduction, indicating a loss of genetic variability in the reproductive supply currently kept in the hatchery (Barroso et al., 2005). A similar trend was reported by Matsumoto and Hilsdorf (2009), in which the broodstock of B. insignis kept at the hatchery has likely maintained the genetic diversity formerly present in some rivers and no longer existing in natural populations. Further studies may clarify whether the broodstock used in stocking programs has influenced the genetic structure and diversity of the P. magdalenae population in the Magdalena basin.

Implications for conservation of P. magdalenae
The IUCN recognizes the need to conserve genetic diversity as one of three global priorities for biodiversity conservation, with a focus on species that are vulnerable and at risk of overexploitation, such as P. magdalenae, which is considered vulnerable (Mojica et al., 2012). Therefore, the main goal of this study was to improve the genetic information on P. magdalenae population structure for fishery management and conservation purposes. Despite the high levels of genetic diversity, the current distribution of P. magdalenae is discontinuous or heterogeneous, with at least three populations distributed along the Magdalena River basin. This suggests that management and conservation strategies for P. magdalenae should aim at preserving the diversity of each population.
In this scenario, the recovery of P. magdalenae depends on the strategies implemented for conservation or even on reversing the habitat degradation of the river and riparian environments. On the other hand, other measures could be implemented, such as restocking programs in areas where the population has a lower genetic variability. In this case, the genetic divergence found in the P. magdalenae populations should be taken into account and appropriate breeding management aimed at reducing the risks of genetic drift, inbreeding, and the bottleneck effect should be implemented (Matsumoto and Hilsdorf, 2009). The genetic monitoring of the broodstock and juveniles used for supportive breeding is also essential as it will result in more efficient management and conservation strategies (Barroso et al., 2005;Machado-Schiaffino et al., 2007;Lopera Barrero et al., 2009;Matsumoto and Hilsdorf, 2009).