Microsatellite loci development for three catfish species from northwestern South America

The Neotropical catfish species Ageneiosus pardalis, Pimelodus grosskopfii, and Sorubim cuspicaudus are important fishery resources in Colombia that show historical declines in their capture. This study used next-generation sequencing with 454 FLX technology (Roche Applied Science) and bioinformatics analysis to develop between 18 and 24 microsatellite loci for each of these species. The novel microsatellite loci showed high values of polymorphic information content (PIC; A. pardalis: 0.601–0.903, P. grosskopfii: 0.748–0.946, and S. cuspicaudus: 0.383–0.876), and the number of alleles per locus ranged from 7–15 for A. pardalis, 9–30 for P. grosskopfii, and 5–14 for S. cuspicaudus. The average observed and expected heterozygosities were, respectively, 0.757 ± 0.035 and 0.834 ± 0.015 for A. pardalis; 0.596 ± 0.040 and 0.881 ± 0.009 for P. grosskopfii; and 0.747 ± 0.031 and 0.757 ± 0.025 for S. cuspicaudus. These loci will be useful in future studies for estimating the genetic diversity and population structure of these three Neotropical catfishes.


INTRODUCTION
Population genetic studies generate valuable information for management, conservation, and genetic-diversity monitoring programs of many species (Schwartz et al., 2007; Allendorf et al., 2010; Frankham, 2010), particularly those affected by anthropogenic activities (Frankham, 2010). Among the molecular markers used in population genetic studies, microsatellite loci are among the most informative and widely used (Hamilton et al., 1999; Guichoux et al., 2011). However, the first approaches to microsatellite loci development in non-model species were expensive and complex (Hamilton et al., 1999; Castoe et al., 2010; Fernandez-Silva et al., 2013), and yielded few markers useful for population studies (Zalapa et al., 2012). Fortunately, next-generation sequencing technologies have allowed the fast development of useful molecular markers that generate population and evolutionary information for species at lower costs (Ekblom, Galindo, 2011; Guichoux et al., 2011; Fernandez-Silva et al., 2013; Miller et al., 2013), although for the vast majority of fish species these markers are still limited or absent (Kumar, Kocour, 2017).
No microsatellite loci are available for catfishes from west of the Eastern Cordillera of the Andes, except for Pimelodus grosskopfii (Hernandez-Escobar et al., 2011 in Agostini et al., 2011), which limits population genetic studies of these species. Some authors have used microsatellite loci developed for phylogenetically close species (heterologous loci); however, in some cases their use is associated with amplification failures, low levels of polymorphism, size homoplasy, null alleles, and the amplification of non-orthologous loci (Primmer et al., 2005; Barbará et al., 2007; Castoe et al., 2010; Yue et al., 2010). This has stimulated the development of new molecular markers suitable for population genetic studies of these species.
Consequently, based on next-generation sequencing with the 454 GS-FLX technology (Roche Applied Science) and bioinformatics analysis, this study developed species-specific microsatellite loci for the non-model catfish species Ageneiosus pardalis (Lütken, 1874), Pimelodus grosskopfii (Steindachner, 1879), and Sorubim cuspicaudus (Littmann, Burr, Nass, 2000). These three carnivorous and migratory species are important for fisheries, yet many aspects of their basic biology and population genetics remain unknown, which hinders the development of adequate management programs. This issue is important because population densities of these species have been reduced by anthropogenic activities in all Colombian watersheds (Galvis, Mojica, 2007; Usma-Oviedo et al., 2009; Mojica et al., 2012), which led to their classification as vulnerable in the red list of freshwater fish of Colombia (Mojica et al., 2012). Moreover, P. grosskopfii was also included as a critically endangered species in the Red List of Threatened Species of the International Union for the Conservation of Nature (IUCN; Villa-Navarro et al., 2016). The loci developed here will allow future population genetic studies that support proposals focused on the sustainable management and conservation of these species.

MATERIAL AND METHODS
Samples were collected from 2011 to 2014 in the lower section of the Cauca River and supplied to the Laboratorio de Biología Molecular y Celular (Universidad Nacional de Colombia) under the scientific cooperation agreement CT-2013-002443, within the framework of the environmental license # 0155 of January 30, 2009, from the Ministerio de Ambiente, Vivienda y Desarrollo Territorial. For each species, we took advantage of nuclear reads from pyrosequenced genomic libraries of one individual collected in the lower section of the Cauca River (Restrepo-Escobar et al., 2016a,b). Identification of microsatellite loci, primer design, and electronic PCR to verify the correct alignment of primers were performed using the software and procedures of Landínez-García, Márquez (2016). Between 39 and 43 primer pairs per species were selected from the list of primers validated by electronic PCR to evaluate their consistent amplification and polymorphism under standard PCR conditions in 12 individuals of each species. Then, we selected the primer pairs that fulfilled the conditions proposed by Landínez-García, Márquez (2016): (1) specific amplification in all individuals within the designed size range (100 to 350 bp), (2) band resolution, (3) specificity, and (4) ability to detect heterozygotes. The forward primers of these pairs were either directly fluorescently labeled or extended with a universal sequence at their 5′ tail to incorporate the fluorescent label through the three-primer PCR method (Blacket et al., 2012), and were further evaluated in 50 individuals of each species.
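The microsatellite identification step can be illustrated with a minimal sketch. It is not the software used by Landínez-García, Márquez (2016); it only shows, under assumed and purely illustrative thresholds (`min_repeats`), how perfect tandem repeats of 2 to 4 bp can be located in a read with a regular expression. The function name and the example sequence are hypothetical.

```python
import re

def find_microsatellites(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    # Assumed thresholds: minimum number of repeats for each motif length.
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 5}
    seq = seq.upper()
    hits = []
    for size, n in min_repeats.items():
        # One captured motif followed by at least n - 1 further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (AAAA, ACAC, ...).
            if any(size % d == 0 and motif == motif[:d] * (size // d)
                   for d in range(1, size)):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return hits

# Hypothetical read containing an (AC)8 and an (AATG)5 repeat.
read = "GGTT" + "AC" * 8 + "GGCATC" + "AATG" * 5 + "CC"
print(find_microsatellites(read))
```

In practice, candidate loci found this way would still need flanking sequence long enough for primer design and an expected amplicon within the 100 to 350 bp range stated above.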
The PCR amplification was carried out under the conditions proposed by Landínez-García, Márquez (2016) for the directly labeled primers, and by Landínez-García, Márquez (2018) for the primers labeled using the three-primer method. In both cases, the PCR products were separated in an ABI 3130 automatic sequencer (Applied Biosystems, USA) using GeneScan-500 LIZ (Applied Biosystems, USA) as the size marker; the electropherograms obtained were reviewed using GeneMapper 4.0 (Applied Biosystems, USA). Before the statistical analysis, Micro-Checker 2.2.1 (Van Oosterhout et al., 2004) was used to detect possible genotyping errors.
For each species, the genetic diversity, the allelic frequencies, the average observed (Ho) and expected (He) heterozygosities, and the average number of alleles per locus (Na) were determined with GenAlEx 6.503 (Peakall, Smouse, 2012). Additionally, Arlequin 3.5.2.2 (Excoffier, Lischer, 2010) was used to test the statistical significance of departures from Hardy-Weinberg and linkage equilibria. For multiple comparisons, the statistical significance was adjusted by the sequential Bonferroni correction (Rice, 1989). Furthermore, the polymorphic information content (PIC) was determined for each locus with CERVUS 3.0.7 (Marshall et al., 1998).
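As an illustration of the per-locus statistics named above, the following sketch computes Na, Ho, the uncorrected He (1 − Σp²), and PIC from a list of diploid genotypes, using the PIC formula of Botstein et al. (1980): PIC = 1 − Σp_i² − Σ_{i<j} 2p_i²p_j². It is a simplified stand-in, not the GenAlEx or CERVUS implementation, which may apply additional sample-size corrections; the function name and the genotype data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def locus_stats(genotypes):
    """Summary statistics for one locus.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    """
    n = len(genotypes)
    # Allele frequencies from the 2n gene copies sampled.
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]
    # Observed heterozygosity: proportion of heterozygous individuals.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity (uncorrected): 1 - sum of squared frequencies.
    he = 1 - sum(p * p for p in freqs)
    # PIC (Botstein et al., 1980): He minus the pairwise correction term.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"Na": len(counts), "Ho": ho, "He": he, "PIC": pic}

# Hypothetical genotypes (allele sizes in bp) for four individuals.
example = [(100, 104), (100, 100), (104, 108), (100, 104)]
print(locus_stats(example))
```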

RESULTS
Most of the evaluated loci of A. pardalis (17 of 21, Tab. 1) and S. cuspicaudus (17 of 24, Tab. 3) were in linkage and Hardy-Weinberg equilibria after the Bonferroni correction. In contrast, 13 loci of P. grosskopfii showed heterozygote deficits and allelic frequencies that departed from Hardy-Weinberg equilibrium (Tab. 2), and significant linkage disequilibrium was detected between the locus pairs Pgrk01-Pgrk02 and Pgrk08-Pgrk20.
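The sequential Bonferroni correction (Rice, 1989) applied to these tests can be sketched as follows. The P-values are ranked from smallest to largest, the smallest is compared with α/k, the next with α/(k − 1), and so on, and testing stops at the first non-significant result. The function name, the α of 0.05, and the P-values below are illustrative assumptions, not values from this study.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which tests remain significant."""
    k = len(p_values)
    # Indices of the tests ordered from the smallest to the largest P-value.
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        # Threshold relaxes step by step: alpha/k, alpha/(k-1), ..., alpha/1.
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # all remaining (larger) P-values are non-significant
    return significant

# Hypothetical P-values from four Hardy-Weinberg tests.
print(sequential_bonferroni([0.001, 0.04, 0.03, 0.012]))
```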

DISCUSSION
In this work, 63 microsatellite loci were designed for future studies of the genetic diversity of A. pardalis (21), S. cuspicaudus (24), and P. grosskopfii (18). The microsatellite loci of A. pardalis and S. cuspicaudus represent the first species-specific codominant markers for both Neotropical genera. In addition, the 18 new microsatellite loci for P. grosskopfii complement the currently available markers (Agostini et al., 2011). Pyrosequencing also [...] et al., 1998). Shallow sequencing of A. pardalis, P. grosskopfii, and S. cuspicaudus showed a higher frequency of 2-mer motifs, a characteristic previously described for several eukaryotic organisms (Meglécz et al., 2012). Additionally, in this study, 4-mer motifs were the second most frequent for the three species studied, an outcome similar to that of other fishes such as Megaleporinus obtusidens (=Leporinus obtusidens in Villanova et al., 2015), Craterocephalus fluviatilis, Galaxias fuscus, Henicorhynchus lobatus, Henicorhynchus siamensis, Alticus arnoldorum, Amphiprion sandaracinos, and Amphiprion mccullochi (Meglécz et al., 2012). This result contrasts, however, with that of other fish species such as Ictalurus punctatus (Somridhivej et al., 2008). The high frequency of the AC repeat motif is concordant with that described for all species of the phylum Chordata, especially for species of the class Actinopterygii (Meglécz et al., 2012). Similarly, the low frequency of the CG repeat motif found in this work is consistent with that described for most eukaryotic species (Meglécz et al., 2012). The most common repeat motifs found in this work, AC and ATT, have been described in bony fishes such as Rhamdia sp. (Rodrigues et al., 2015), I. punctatus (Somridhivej et al., 2008) (Saarinen, Austin, 2010). However, the frequency of the ATT motif differs from that found in Piaractus brachypomus (AGC; Jorge et al., 2018) and Takifugu rubripes (AGG; Edwards et al., 1998).
The most frequent 4-mer repeat motifs found for the three studied catfish species (AAAT, ATCT, TCTG, AGTG, and AATG) are also among the most frequent for other catfishes, such as I. punctatus (Somridhivej et al., 2008), C. batrachus (Mohindra et al., 2012), and T. fulvidraco (Zhang et al., 2014), as well as for other Neotropical fishes such as P. brachypomus (Jorge et al., 2018) and M. obtusidens (Villanova et al., 2015).
All the microsatellite loci developed in this work (except Scus11) have PIC values that allow them to be considered highly informative according to the classification proposed by Botstein et al. (1980), and are also higher than those described for [...] Pereira et al., 2012). Moreover, the loci designed for A. pardalis and S. cuspicaudus showed evidence of linkage equilibrium, and most of their allelic frequencies are in Hardy-Weinberg equilibrium, which makes them highly informative for determining the diversity and structure of populations of these species. In contrast, most of the loci of P. grosskopfii showed allelic frequencies that deviated from Hardy-Weinberg equilibrium, and two pairs of loci showed signals of linkage disequilibrium. It remains to be explored whether these features reflect technical problems or are a consequence of the high levels of exploitation of P. grosskopfii. Linkage disequilibrium has also been described in some pairs of loci in the exploited species B. rousseauxii (Batista et al., 2010), S. parahybae (Fonseca et al., 2016), and Pimelodus maculatus (Paiva, Kalapothakis, 2008).
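For reference, the informativeness classes of Botstein et al. (1980) used in this comparison can be expressed as a short sketch: PIC > 0.5 is highly informative, 0.25 < PIC ≤ 0.5 is reasonably informative, and lower values are slightly informative. The function name is hypothetical; the example values are the lower PIC limits reported in the abstract for S. cuspicaudus (0.383) and A. pardalis (0.601).

```python
def pic_class(pic):
    """Classify a locus by its PIC value following Botstein et al. (1980)."""
    if pic > 0.5:
        return "highly informative"
    if pic > 0.25:
        return "reasonably informative"
    return "slightly informative"

# Lower PIC limits reported in the abstract for two of the species.
for value in (0.383, 0.601):
    print(value, pic_class(value))
```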

ACKNOWLEDGMENTS
This work was funded by the Universidad Nacional de Colombia, Sede Medellín and Empresas Públicas de Medellín, Grant CT-2013-002443 "Variación genotípica y fenotípica de poblaciones de especies reófilas presentes en el área de influencia del proyecto hidroeléctrico Ituango". We thank the engineering consulting company Integral S.A., for providing the field samples used in this work.