Genetic diversity and aquaculture conservation for a threatened Neotropical catfish

Due to the ecological importance of Lophiosilurus alexandri, the present work evaluated its genetic representativeness by comparing wild stocks to broodstocks that were kept at three restocking hatcheries along the São Francisco River. A total of 97 samples were genotyped for newly developed microsatellite markers. Low levels of genetic diversity (average alleles number of 4.2 alleles) were detected in all cases, being more severe in captive groups. Significant pairwise FST and DEST values, Structure, and DAPC analyses showed that wild animals were structured in two groups, and a third group was formed by captive animals, evidencing the need to adopt genetic criteria to retain genetic diversity in the hatcheries. For this reason, three full-sib families were constructed to select the best relatedness estimator for L. alexandri and establish a cut-off value aimed to avoid full-sibling matings in the hatcheries. Two estimators, Wang (RW) and Lynch & Li (RLL), were accurate in reflecting the relatedness level for full-sibs in this species. According to them, less than 50% of the potential breeding matings in the three hatcheries are advisable. The innate low diversity of L. alexandri highlights the importance of minimizing inbreeding and retaining genetic diversity towards the species recovery.


INTRODUCTION
Brazil is one of most biodiverse countries for freshwater fish species (Collen et al., 2014), but its ichthyofauna has been impacted due to overfishing, riparian forest degradation, siltation, water transposition, and especially, hydroelectric plants. The São Francisco River (SF) is one of the most impacted basins with declines in fish populations, and at least 10% of the river's more than 200 species are considered threatened (Maneta et al., 2009;Reis et al., 2016;ICMBio, 2018), which has resulted in implementing restocking actions for these species.
One of this species is the Lophiosilurus alexandri Steindachner, 1876, an endemic catfish to the SF, which belongs to the order Siluriformes and family Pseudopimelodidae (Carvalho et al., 2016). It is a sedentary, carnivorous, active nocturnally, and exhibits male parental care for the progenies (Sato et al., 2003;Costa et al., 2015). This species has been subjected to a drastic population reduction in the submiddle and lower stretches of São Francisco River and it has been included into restocking actions for more than two decades (Lopes et al., 2013). Despite the efforts, L. alexandri was included in the Brazilian Red List as Vulnerable, designation for species at high risk of extinction (ICMBio, 2014). Recently, L. alexandri was also included in the Ministry of Environment's National Action Plan (PAN São Francisco -Ordinance n.34 of May 27 th , 2015), which aims to conserve threatened species of this river basin.
One neglected aspect by these restocking actions is the genetic representativeness of the broodstocks compared to the wild conspecifics. Keeping sufficient breeders and incubators are usually prohibitive and most of the hatcheries use a limited number of them. Breeders are mostly not renewed and are replaced by their consecutive progeny generations (F 1 , F 2 , F 3 etc.), therefore using a culture-based restocking model. This kind of strategy favors matings between related individuals and can lead to inbreeding depression, which affects reproductive success, survival, growth, disease resistance, etc and can reduce genetic diversity, which limits the evolutionary potential of the population (Frankham et al., 2010;Keller et al., 2012).
For effective conservation, genetic monitoring of wild populations and captive broodstocks is important to prevent adverse effects on the ichthyofauna (Povh et al., 2008;Schreier et al., 2015). This is particularly important when dealing with species with low genetic diversity, which are more prone to inbreeding and genetic drift (Coleman et al., 2018). Breeding strategies must be adopted to avoid loss of genetic diversity in captive populations of threatened species, such as genetically characterizing the broodstocks (Carvalho et al., 2012), introducing new wild breeders (Whiteley et al., 2015), and slowing down the process of inbreeding by selecting matings based on the degree of relatedness (Wang, 2014).
The use of estimators to determine genetic relatedness between captive-grown breeders could allow selection of those with the lowest degree of genetic similarity, resulting in progenies with greater genetic variability. Genetic relatedness estimators, such as Queller, Goodnight (1989), Ritland (1996), and Lynch, Ritland (1999, based on co-dominant markers, can be used to identify animal relationships. Despite the variety of existing estimators, the best estimator for a certain species can only be found after tests are conducted with the different estimators in groups whose relationships, full-sib or half-sib, are known (Ortega-Villaizan et al., 2011).
For the conservation of L. alexandri, analyses of the genetic diversity remains to be investigated and three hypotheses must be tested: (1) a panmictic population exists along the São Francisco River basin, (2) captive broodstocks of L. alexandri have similar genetic diversity to the wild conspecifics; (3) a relatedness estimator can be used to select the best matings.
In this study, we developed and validated a set of microsatellite markers to evaluate the genetic variability and structure of wild animals and captive broodstocks of L. alexandri. In addition, these markers were also used to identify the best genetic relatedness estimator among L. alexandri full-sib families, towards a conservation aquaculture program.

MATERIAL AND METHODS
Sampling. For the population genetics approach, fin clips from 55 individuals of L. alexandri from three hatcheries along the São Francisco River were collected. Twenty of these were from Paulo Afonso Hatchery of Chesf (São Francisco Hydroelectric Company) in the submiddle stretch of the river, 23 were from the Bebedouro Hatchery of Codevasf (Development Company of São Francisco and Parnaíba Valleys) also in the submiddle stretch of the São Francisco River, and the remaining 12 were from the Itiúba Hatchery of Codevasf in the lower stretch of the river. Forty-two wild animals were sampled, 21 from the submiddle stretch, in the vicinity of Santa Maria da Boa Vista-PE, and 21 from the upper São Francisco River, in the reservoir of Três Marias and downward the reservoir (Fig. 1).
In order to select the best genetic relatedness estimator, three full-sibling families were constructed: FS1 from Bebedouro hatchery, FS2 from Itiúba hatchery, and FS3 from LAQUA (Laboratório de Aquacultura at the Universidade Federal de Minas Gerais). Each mating pair was tagged with PIT tags and was kept in an individual tank with ni.bio.br | scielo.br/ni sand bottom and flowing water. The eggs were collected and transferred to a trough incubator with flowing water, where the fingerlings were maintained until reaching a size of 1 cm. Fifty-three fin clips from the three progenies were collected: 20 samples from FS1, 20 from FS2, and 13 from FS3.
In the genetic diversity analysis, none of the full-sibs was included, only the breeders were used.
One voucher specimen used in this study was deposited in the collection of the Coleção Ictiológica at the Museu de Ciências Naturais da PUC Minas (Voucher number: MCNIP 3217). DNA extraction. DNA extraction was conducted based on the phenol, chloroform, isoamyl alcohol protocol described by Sambrook et al. (1989). The DNA samples were quantified by fluorimeter (Quantus -Promega), and their integrity and quality were evaluated using 0.8% agarose gel electrophoresis and a NanoVue™ Plus Spectrophotometer (GE Healthcare), respectively.

Microsatellites.
Muscle fragments were obtained from a freshly captured L. alexandri. The voucher specimen was fixed in 10 % formalin and later preserved in 70 % ethanol (voucher: LGC6088 at Museu de Ciências Naturais da PUC Minas). Genomic DNA was extracted using a modified salting-out method (Sunnucks, Hales, 1996) and nebulized for 6 min to obtain 200-600bp fragments. Fragments were sequenced using a Nextera kit in a sixth of an Illumina HiSeq 2000 lane.
A set of 178 potentially amplifiable microsatellite loci (PAL) were developed for L. alexandri using the Misa algorithm (Beier et al., 2017) from the scaffold data assembled by SOAP de novo (Li et al., 2008). Primer pairs were designed for each locus using Primer3 (Rozen, Skaletsky, 2000) and tested against the scaffolds using in silico PCR software to verify whether a single product was found. Selected PALs consisted of primers, which produced in silico amplicons with no more than 10 non-ACTG bases. The markers were named with the abbreviation of the species name "Lalex" followed by a number.
From these PALs, 55 loci were initially tested with samples from wild L. alexandri and only those loci that showed polymorphism were subsequently analyzed in captive animals. Four different repeat motifs were tested: 42 dinucleotide motif with eight repeats, one dinucleotide motif locus with nine repeats, ten loci with trinucleotide motifs and six repeats, and two tetranucleotide motif loci with five repeats.
Genetic data analysis. Polymorphism information content (PIC) was calculated, only for the 42 wild samples, using CERVUS v 3.0.7 software (Kalinowski et al., 2007), and the markers were classified as highly informative for PIC values greater than 0.5, as reasonably informative for values between 0.25 and 0.5, and as slightly informative markers for values less than 0.25 (Botstein et al., 1980). In order to predict the probability of two randomly selected individuals from a population holding the same genotype (Waits et al., 2001), the multi-loci probability of identity (PID) was calculated for all 10 loci and the 42 wild samples, using the GIMLET software (Valière, 2002) and under two relatedness scenarios: ''unrelated'', which corrects for sampling a small number of individuals, and ''sibs'', which assumes that the population is composed of siblings.
A Bayesian approach with STRUCTURE v. 2.3.4 was used to investigate the genetic structure of the population using its under admixture model, which assumes that individuals may have different ancestor genotypes with contributions from different populations (Pritchard et al., 2000). The analysis in STRUCTURE v. 2.3.4 was set for a burn-in period of 10,000 interactions, followed by Markov Chain Monte Carlo (MCMC) with 100,000 replicates and four replicates for each hypothetical value of k (k = 1-10).
Structure Harvester v. 0.6.7 software (Earl, vonHoldt, 2012) was used to coordinate the results and assume the most likely number of K (number of clusters -genetic groups), which was estimated through the log-likelihood data (ln (P (X /K )) (Falush et al., 2003) and the best ∆K was based on the rate of change in the log probability of the data (Evanno et al., 2005).
Discriminant Analysis of Principal Components (DAPC) (Jombart et al., 2010) was also used to investigate the population structure and was run in the Adegenet package (Jombart, 2008) in R platform, v. 4.0.2 (R Core Team, 2020). This multivariate analysis is used to identify and describe genetic clusters, while optimizing the variation among them. DAPC does not require populations to be in Hardy-Weinberg Equilibrium or linkage disequilibrium loci (Jombart et al., 2010).
For genetic relatedness analysis, seven microsatellite markers, those that were polymorphic among parental breeders, were used to analyze the parents and progenies from three full-sib families (FS1, FS2 and FS3). The COANCESTRY software (Wang, 2011) was used to generate pairwise relatedness values (dyads) in each full-sib family, according to the five following relatedness estimators: Wang R W (Wang, 2002) (Lynch, 1988;Li et al., 1993), Lynch, Ritland R LR (1999), Ritland R xy (1996), and Queller, Goodnight R QG (1989). In order to choose the best estimator, a mean pairwise relatedness value was calculated for the three full-sib families and compared to the expected reference value for full-siblings (0.5).
The same estimators were applied to the wild samples from the submiddle and upper São Francisco stretches to ensure that estimations were accurate, since an absence of relatedness is expected among wild animals. Further, a minimum value of genetic relatedness of full-sib will be taken as a cut-off value for parentage.
In addition, a simulation of potential breeding pairs within the three hatcheries (Bebedouro, Paulo Afonso and Itiúba) was conducted using the best estimators and their respective cut-off values.

RESULTS
A total of 16,819 tandem repetitive sequences were recovered using the Misa software into contigs assembled by SOAP de novo. After filtering potential amplifiable microsatellite loci, we produced 178 primer pairs for these loci for this catfish (shown in Tab. S1).
Of the 55 microsatellite loci analyzed, 10 showed polymorphisms, 41 were monomorphic, and four did not amplify. The ten polymorphic loci were in HWE, showed PCR amplification consistency, and there was an absence of linkage disequilibrium when genotyped for the wild samples. Polymorphism information content (PIC) values ranged from 0.202 to 0.821 and the number of alleles ranged from two (Lalex5 and Lalex19) to nine (Lalex38) (Tab. 1). The resulting values of multiloci probability of identity (PID) were 3.4 x 10 -7 and 2.4 x 10 -3 for unrelated and sibs scenarios, respectively.
Genetic variability of the 10 microsatellite loci for each of the five sampling sites is described in Tab. 2. A total of 49 alleles were found, 37 of which were present in the upper SF samples, 32 in the submiddle SF, 24 in Bebedouro, 27 in Paulo Afonso, and 23 in the Itiúba samples. The average number of alleles varied between 2.30 and 3.70 per sampling group. Thirteen alleles were private, seven of which belonged to the wild samples from the upper SF, two from wild samples from the submiddle SF, two from Bebedouro, and one each from Paulo Afonso and Itiúba hatcheries. Null alleles were found in Lalex25, Lalex30, and Lalex38 but not systematically in all samples. No evidence was found of either the misidentification of stutters as alleles or dropouts.
The average observed and expected heterozygosities (H o and H e ), respectively ranged from 0.350 to 0.525 and from 0.393 to 0.481, respectively. Hardy-Weinberg (HWE) equilibrium deviations were significant after Bonferroni correction for Lalex1 (Bebedouro and upper SF), Lalex9 (upper SF), Lalex25 (upper SF), Lalex30 (Bebedouro), Lalex38 (submiddle SF), but none of the loci deviated from HWE in the wild samples. Among the 50 estimates (10 loci x 5 sampling sites) of HWE deviation, only six were significant.
Regarding the inbreeding coefficient (F IS ), mean values were all positive, ranging between 0.044 and 0.198, except for the upper SF that was -0.092. Overall F ST among the hatcheries was around 0.10 and the pairwise estimates between them ranged from 0.03 to 0.16 (p < 0.05). The F ST between wild groups (submiddle and upper SF) was  0.25. The pairwise F ST and D EST values between the two hatcheries that restock the submiddle (Paulo Afonso and Bebedouro) and their wild conspecifics (submiddle SF) varied between 0.30 and 0.34 and 0.32 and 0.33, respectively. Overall, D EST among the hatcheries was around 0.07, and the pairwise between them ranged from 0.016 to 0.10 (p < 0.05) (Tab. 3). The AMOVA results for captive samples showed that most of the variance (86.6%) found could be explained by differences within the hatcheries and not among them (10.5%) (p < 0.05). For the wild samples, 78.7% of the variance could be explained by differences within groups and 27.4% for differences between them. The structure analysis was done in two scenarios, firstly for all groups (captive and wild) and secondly only for the wild samples, in both cases the highest likelihood of K values suggested two genetic clusters (K = 2). All suggested clusters were present in the five sampling sites, however, at different proportions. For the first scenario, cluster 1 (red) was predominant among wild animals and was present in more than 96% of the individuals from the submiddle and upper SF. For captive samples, cluster 2 (green) was more abundant and was present in more than 98% of the samples ( Fig. 2A). For the second scenario, there was a distinction between samples from the submiddle and upper SF. Cluster 2 (green) was predominant in the submiddle samples (98.6%) and cluster 1 (red) was primary presented in the upper SF samples (Fig. 2B).
Results of DAPC identified the presence of five clusters (k = 5), with an overlap of cluster clusters 1, 3, and 5 by captive animals. DAPC showed a clear separation of the three groups: upper SF, submiddle SF, and captive samples (Fig. 3  Regarding the genetic relatedness, the R LR , R xy and R QG estimators showed negative mean pairwise relatedness values for all three full-sib families and were discarded. However, R w and R LL gave positive average values between 0.51 and 0.64 for the three full-sib families and negative average values for the wild samples (Tab. 4).
The distribution of pairwise relatedness in intervals for the three full-sib families and wild samples under the R w and R LL estimators is represented in Fig. 4. The extremes of distribution pairwise values into full-sibs were 0.12 and 1 for the R W estimator and 0.03 and 1 for R LL , which the minimum extremes were used as cut-off values for full-sib relatedness. The distribution of pairwise relatedness values for the wild samples showed  (Fig. 5).

DISCUSSION
Microsatellites. This is the first study that developed and used microsatellite markers to examine the genetic diversity of the threatened Neotropical catfish L. alexandri. According to Botstein et al. (1980) classification of PIC values, of the ten loci genotyped of the wild samples from the submiddle and upper São Francisco (mean PIC: 0.482), four were highly informative (Lalex1, Lalex9, Lalex15, and Lalex30), five were reasonably informative (Lalex1, Lalex19, Lalex25, Lalex37, and Lalex58), and one was Eigenvalues of the analysis are displayed in inset.

12/22
ni.bio.br | scielo.br/ni slightly informative (Lalex5). Similarly, Hughes et al. (2015) were able to determine the structuring and a low genetic diversity in the Australian lungfish Neoceratodus forsteri (Krefft, 1870) after the authors used a set of 11 microsatellite markers in which two were highly informative, six were reasonably informative, and three were slightly informative (mean PIC: 0.345). In the present study, although not all loci were highly informative under PIC classification, the multi-loci probability of identity (PID) values were below the threshold value for codominant markers 0.01 (Waits et al., 2001), which is necessary to accurately distinguish individual genotypes, even in the sibs scenario. This indicates that the novel microsatellite set described herein is effective in assessing genetic diversity, population structure, and relatedness in the L. alexandri.  Genetic diversity. A low number of alleles (two to nine), with an average of 4.2 alleles, and expected heterozygosity (H e ) (0.393 to 0.481) were found. These values were well below those reported by Martinez et al. (2018), which found an average number of alleles per locus of 14.81 for freshwater fish species.
It is possible that the low number of polymorphic loci found in L. alexandri might be related to its sedentary behavior. The Australian lungfish Neoceratodus forsteri (Sarcopterygii, Neoceratodontidae), which also has sedentary habits and parental care (Pusey et al., 2004;James et al., 2010), show similar values of genetic diversity in terms of average number of alleles and heterozygosities (Hughes et al., 2015). Other Brazilian freshwater fish species with limited displacement and parental care to whom low genetic diversity has been reported include the Arapaima gigas, with an average of 3.57 alleles per locus, ranging from 1 to 5 and H e from 0.026 to 0.504 (Vitorino et al., 2017), and the Geophagus brasiliensis (Quoy & Gaimard, 1824) which showed an average of 5.33 alleles per locus and H e from 0.534 to 0.706 (Ferreira et al., 2015). Also, studies with the Neotropical catfish species Neoplecostomus yapo Zawadzki, Pavanelli & Langeani, 2008(Philippsen et al., 2009 and Hypostomus ancistroides (Ihering, 1911) (Sofia et al., 2008) using other markers related the low genetic diversity to the sedentary habits of these species.
Despite the low number of alleles detected in the present study, a decrease (20.3%) in the average number of alleles present in the captive animals, when compared to the wild conspecifics from the submiddle, could be observed.  (Valenciennes, 1837) and Brycon insignis Steindachner, 1877, respectively, and observed a reduction of almost 30% in the number of alleles between wild and hatchery broodstocks. A review conducted by Araki, Schmid (2010) in nearly 50 restocking actions worldwide reported that the decreased number of alleles found in captive breeders compared to those wild conspecifics was the most common feature.
Genetic diversity is used as an essential criterion for evolutionary potential. As long as the population retains genetic diversity, there will be evolutionary adaptation to deal with a variable environment (Frankham et al., 2010;Allendorf et al., 2012). Restocking processes based on culture-based methods, lead to mating closely related individuals and to inbreeding (Keller et al., 2012). The three hatcheries of L. alexandri showed high mean inbreeding (F IS ) values between 0.098-0.198, while the wild upper sampling site presented a negative number (-0.092). The submiddle sampling site, in turn, presented a positive value (0.044). Also, private alleles were more abundant in the upper than in the submiddle stretch. The submiddle stretch of the São Francisco River has in one extreme the reservoir of Sobradinho, and at the other end the reservoirs of Itaparica, Moxotó and Paulo Afonso Complex (I, II, III and IV), with no perennial rivers (ANA, 2003), which makes it the most fragmented stretch of the river. In this cascade of reservoirs, Sobradinho (the second largest lake in South America) is the first one in this region to retain the highest rate of nutrients and sediments, thus creating important modifications on primary productivity downstream, with severe implications for fish communities (Santos et al., 2020). Furthermore, the submiddle stretch is affected by long periods of droughts (Stolf et al., 2012), contrasted with the upper region, which has 48 perennial tributaries (ANA, 2003). The wild samples from the upper stretch were mostly collected in the dendritic reservoir of Tres Marias, which despite being a reservoir, is rich in nutrients due to marginal pond grasses that become submerged during the rainy season and decompose, providing food for the fish community (Sato, Sampaio, 2005). Therefore, the two wild sampling sites (upper and submiddle São Francisco) constitute contrasting areas in terms of nutrient and sediment input that might explain differences in genetic diversity level. Hence, it is possible to consider that there are at least two units of conservation in this river, one in the upper and another in the submiddle stretch. Additional samples taken in the middle and lower stretches may reveal other units along a gradient system.
The continuous restocking of L. alexandri in the submiddle stretch with high levels of inbreeding may reduce population persistence and make the species even more vulnerable. The innate low genetic diversity detected in the L. alexandri makes the challenge of restocking actions even more difficult.
In order to avoid the loss of alleles and maximize genetic diversity, it is of utmost importance that restocking actions use an abundant number of breeders (fully genotyped) and that matings be selected under a relatedness cut-off value.
Genetic structure. The genetic difference found between wild animals from the upper and the submiddle SF might be related to the sedentary behavior and parental care of this species. Sedentary behavior and short displacements have been pointed to as limiting factors for gene flow, resulting in population structure in many freshwater species, such as Arapaima gigas (Schinz, 1822) (Torati et al., 2019), Geophagus brasiliensis (Ferreira et al., 2015), Ichthyoelephas longirostris (Steindachner, 1879) (Landínez-Garcia, Márquez, 2016), and Mogurnda adspersa (Castelnau, 1878) (Hughes et al., 2012). Lophiosilurus alexandri inhabits sandy bottoms where it buries itself waiting for its prey (Costa et al., 2015), rarely showing displacement. The sedentary behavior and the distance of nearly 1,500 km between the upper and submiddle sampling sites in the São Francisco River might also have acted to favor a genetic structuring, limiting gene flow.
The genetic structure emphasizes the necessity of having distinct broodstocks for each stretch of the river in conservation aquaculture programs and that the exchange of breeders between hatcheries from different stretches should be avoided.
Considering the levels of genetic differentiation (Tab. 3, Fig. 3), in terms of inbreeding and number of alleles found between hatcheries (Paulo Afonso and Bebedouro) that restock the submiddle SF and the corresponding wild sample, the use of these captive breeders should not be recommended for conservation purposes. The continuing use of these broodstocks for restocking can cause genetic erosion of wild stocks and decrease the adaptive genetic variation and individual fitness, thus limiting the evolutionary potential of the population (Bijlsma, Loeschcke, 2012;Carvalho et al., 2012). Therefore, we suggest replacing the existing captive broodstock with wild animals (fully genotyped), as well as adopting a capture-based restocking method to guarantee the maintenance of genetic diversity and support long-term conservation of the species, or alternatively selecting breeding pairs under a relatedness cut-off value.
Relatedness. The developed microsatellites were also useful in identifying the best 16/22 ni.bio.br | scielo.br/ni genetic relatedness estimators (R W and R LL ) in three full-sib families. Considering the distribution of pairwise relatedness of the three full-sib families, it was possible to determine that 0.12 and 0.03 are the cut-off values for full-sib relatedness, according to R W and R LL , respectively. Considering the innate low diversity of the species, R LL should be used for selecting matings in captive systems. The practice of using relatedness estimators has been successfully applied in other culture-based restocking programs, such as for the coho salmon (Conrad et al., 2013) and sockeye salmon (Kozfkay et al., 2008).
The distribution of pairwise relatedness in wild samples indicated that there is a considerable number of pairwise combinations with some degree of relatedness even in the natural environment. It is possible to speculate that the life strategy (sedentary habit with parental care) of L. alexandri might interfere in the genetic relatedness and matings between related individuals that would happen more frequently than expected by chance.
The availability of L. alexandri in the wild is still incipient even after two decades of restocking actions for the species. Reports related to the monitoring of artisanal fishing in the submiddle and lower São Francisco River from June 2017 to April 2018 revealed an average monthly yield of 52.69 kg of L. alexandri, compared to 783.5 kg for Prochilodus argenteus and 2,861.3 kg for Myleus micans (Lütken, 1875), other species that are equally subjected to restocking actions (Chesf, 2017(Chesf, , 2018. Based on the low genetic diversity recovered from wild and hatchery broodstocks, we recommend that greater efforts should be taken to minimize the genetic loss in captivity and further genetic erosion of wild populations in restocking actions. Towards a conservation aquaculture program for L. alexandri, the implementation of the "culturebased restocking" method requires a strategy based on the use of a minimum number of animals to form the founding broodstock, the physical tagging of breeders, the genetic monitoring of breeders and progenies (prior to being released) and the adoption of a genetic relatedness estimator.