Molecular characterization and genetic structure of the Nero Siciliano pig breed

Nero Siciliano is an autochthonous pig breed that is reared mainly in semi-extensive systems in northeastern Sicily. Despite its economic importance and well-appreciated meat products, this breed is currently endangered. Consequently, an analysis of intra-breed variability is a fundamental step in preserving this genetic resource and its breeding system. In this work, we used 25 microsatellite markers to examine the genetic composition of 147 unrelated Nero Siciliano pigs. The total number of alleles detected (249, 9.96 per locus) and the expected heterozygosity (0.708) indicated that this breed had a high level of genetic variability. Bayesian cluster analysis showed that the most likely number of groups into which the sample could be partitioned was nine. Based on the proportion of each individual's genome derived from ancestry, pigs with at least 70% of their genome belonging to one cluster were assigned to that cluster. The cluster size ranged from 7 to 17 (n = 108). Genetic variability in this sub-population was slightly lower than in the whole sample; genetic differentiation among clusters was moderate (FST 0.125) and the FIS value was 0.011. NeighborNet and correspondence analysis revealed two clusters as the most divergent. Molecular coancestry analysis confirmed the good within-breed variability and highlighted the clusters that retained the highest genetic diversity.


Introduction
In recent decades, many livestock breeds have experienced a severe loss of biodiversity that has markedly affected animal production systems, especially in marginal areas. Attempts to reverse this negative trend have led to research in the preservation and exploitation of local animal breeds, with efforts to identify and reintroduce potentially important genetic traits that have been overlooked by globalized production systems. Autochthonous breeds require careful molecular and morphological characterization that takes into account potential influences of the environment in which they were originally developed and adapted.
Nero Siciliano is an ancient black pig breed that originated in Sicily and has always been reared in extensive and semi-extensive systems on this island. Currently, most of the farms that rear autochthonous black pigs (1800 pigs, 70% of the entire population for this breed) are located mainly in the mountainous area of Nebrodi in northeastern Sicily. Black pigs are rustic animals that thrive on roughage and a limited food supply, in addition to being resistant to diseases in harsh conditions. This breed still retains its distinctiveness thanks to the geographical and orographical characteristics of the island and breeding area, but runs a high risk of losing its original traits because of the lack of a suitable plan to safeguard and exploit its production.
Nero Siciliano pigs grow slowly and yield tasty meat and fat (Pugliese et al., 2003) used to produce high quality meat products, including salami and cured ham, that are highly appreciated by local consumers. The creation of a Protected Designation of Origin (PDO) label for Nero Siciliano meat and other related products has helped to safeguard and preserve this breed, and has led to an important increase in the number of farms (21%) and in sow rearing (24%) in the last three years. Despite this renewed interest of breeders and consumers, only 850 breeding sows are currently being reared and the Nero Siciliano breed is included in the list of endangered autochthonous breeds.
Genetic characterization is a fundamental prerequisite for managing genetic resources. A recent morphometric analysis of Nero Siciliano pigs involving 13 body measurements highlighted the low-medium size of this breed when compared with other Italian breeds (Guastella et al., 2009). Genetic variability has also been assessed based on the use of different genetic markers (Russo et al., 2004; D'Alessandro et al., 2007; Davoli et al., 2008). In particular, microsatellite markers have been useful for quantifying genetic variation within and among several European pig breeds, including Nero Siciliano (SanCristobal et al., 2006).
The aim of this study was to use microsatellite markers to assess the genetic variability and genetic structure of Nero Siciliano pigs reared in the Nebrodi mountains, in order to provide suitable data for conservation strategies.

Sample collection and microsatellite analysis
A representative sample of 147 Nero Siciliano pigs (22 boars and 125 sows) was selected from 22 farms in 11 communes (Alcara Li Fusi, Brolo, Capizzi, Caronia, Floresta, Longi, Mirto, Sanfratello, San Salvatore di Fitalia, Raccuja and Tortorici) in the Nebrodi area; first- and second-degree relatives were avoided. The Nebrodi area is part of a natural park (37°50'-38°9' N; 14°26'-14°54' E) located 100-1700 m above sea level, and is where the Nero Siciliano breed originated and is still extensively bred. The sample size ranged from 2 to 15 pigs per farm, depending on the herd size. Only pigs that met the morphological standard for the breed were sampled, and digital photographs of each animal were taken. Farmers were also asked about the breeding strategies that they employed to select animals with an exclusively native Sicilian germplasm: farms in which there had been recent crossbreeding with commercial breeds were excluded from the sampling.
For each pig, 10 mL of peripheral blood was collected in K3-EDTA tubes. DNA was extracted from blood using the commercial Illustra blood genomic Prep Mini Spin kits (GE Healthcare, Little Chalfont, UK). Genetic characterization was done with a set of 25 microsatellite markers (Table S1) chosen from a list maintained by the Pig Biodiversity project and USDA MARC database. The markers were chosen based on preliminary data about their degree of heterozygosity and polymorphism obtained from small samples of Nero Siciliano and other local Italian pig populations. Six PCR multiplex reactions and two single PCR reactions were done to amplify the microsatellites according to standard protocols. The amplicons (2 µL) were mixed with 4 µL of loading buffer containing formamide and 350 TAMRA as an internal size standard (Applied Biosystems, Warrington, UK). Individual genotypes were determined with an ABI PRISM® 377 sequencer equipped with Genescan Analysis® v.3.1.2 and Genotyper® v.2.5 software (Applied Biosystems, Foster City, CA, USA).

Statistical analysis
Individual multilocus genotypes were analyzed by Molkin v. 3.0 software (Gutierrez et al., 2005) to calculate the main parameters of genetic variability. For each locus and group of pigs, the allele frequencies, private alleles (Ap), effective number of alleles (Ae), and observed (Ho) and expected (He) heterozygosities were calculated. Molecular coancestry coefficients and kinship distances (Dk) weighted by polymorphic information content (PIC) were also assessed. In addition, the contribution of the different groups of pigs to the overall or total diversity was inferred according to Caballero and Toro (2002) and Petit et al. (1998).
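As an illustration of the per-locus parameters listed above, the following minimal sketch computes He, Ae and PIC from a vector of allele frequencies using the standard textbook formulas. It is not the Molkin implementation (which may, for example, apply a small-sample correction to He); the function names and the example frequencies are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), with no small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)


def effective_alleles(freqs):
    """Ae = 1 / sum(p_i^2), i.e. 1 / (1 - He)."""
    return 1.0 / sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over allele pairs of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical three-allele locus (frequencies invented for illustration).
example = [0.5, 0.3, 0.2]
print(round(expected_heterozygosity(example), 4))
print(round(effective_alleles(example), 4))
print(round(pic(example), 4))
```

Because Ae depends only on the sum of squared frequencies, a locus with many rare alleles can have a high allele count but a modest Ae.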
Genepop v. 4.0 (Rousset, 2007) was used to perform the score test for Hardy-Weinberg equilibrium (Rousset and Raymond, 1995) per locus using a Markov chain algorithm implemented with 10,000 dememorizations, 200 batches and 5000 iterations per batch. The presence of null alleles was tested with MICRO-CHECKER v. 2.2.3 (Van Oosterhout et al., 2004), using Bonferroni adjustments.
FSTAT v.2.9.3 software (Goudet, 2001) was used to estimate the FIT, FST and FIS statistics (Weir and Cockerham, 1984) and their significance was inferred by methods based on randomisation. Multiple tests of significance were corrected by the sequential Bonferroni method (Rice, 1989).
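The sequential Bonferroni correction can be sketched as follows. This is a generic step-down procedure written for illustration, not the FSTAT code; the function name, the significance level and the example p-values are assumptions.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Step-down (sequential) Bonferroni correction.

    The smallest p-value is compared with alpha/k, the next with
    alpha/(k-1), and so on; testing stops at the first failure.
    Returns a list of booleans in the original order of p_values.
    """
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # all larger p-values are declared non-significant
    return significant


# Hypothetical p-values for four loci (invented for illustration).
print(sequential_bonferroni([0.001, 0.04, 0.012, 0.30]))
```

With these invented values, the first and third tests remain significant, whereas the second (p = 0.04) does not survive the correction even though it is below 0.05.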
The model-based approach proposed by Falush et al. (2003) in the software STRUCTURE v.2.2 was used to assess the genomic clustering of the sample. Individual pigs were probabilistically assigned to two or more subpopulations on the basis of their multilocus genotype, assuming that they were admixed. As suggested by the authors, the admixture model associated with the option of correlated allele frequencies was used to infer the population structure. The run length was set to a burn-in of 100,000 iterations followed by 100,000 Markov chain Monte Carlo (MCMC) iterations. This setting produced consistent estimations that were not significantly altered by a longer burn-in or MCMC run. The range of possible clusters (K) tested was from 1 to 15, and 10 runs were done for each K. CLUMPP software (Jakobsson and Rosenberg, 2007) was subsequently used to find the optimal alignment of the 10 replicate cluster analyses of the same K. The similarity coefficient G', which is also a measure of the constancy over runs, was used to define the population structure. The mean membership matrix across replicates was plotted with the program DISTRUCT (Rosenberg, 2004). The method reported by Evanno et al. (2005) was also used to estimate the most likely value of K that explained the sample structure.
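The ΔK statistic of Evanno et al. (2005) can be illustrated with a short sketch. It assumes the common formulation in which the absolute second-order difference of the mean ln P(X|K) across replicate runs is divided by the standard deviation of ln P(X|K) at that K; the function name and the log-likelihood values are hypothetical and are not the values obtained in this study.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Return {K: deltaK} for every K that has both neighbours K-1 and K+1.

    lnp maps K to a list of ln P(X|K) values, one per replicate run.
    deltaK = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd L(K)
    """
    result = {}
    for k in sorted(lnp):
        if k - 1 in lnp and k + 1 in lnp:
            second_diff = abs(
                mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1])
            )
            sd = stdev(lnp[k])
            if sd > 0:  # deltaK is undefined when replicates are identical
                result[k] = second_diff / sd
    return result


# Invented log-likelihoods for K = 1..4, three replicate runs each.
runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-890, -892, -888],
    4: [-889, -893, -885],
}
dk = delta_k(runs)
print(dk)
print(max(dk, key=dk.get))  # K with the largest deltaK
```

Note that ΔK cannot be computed for the smallest and largest K tested, so the method cannot by itself identify K = 1 as the best solution.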
Reynolds' pairwise distances (Reynolds et al., 1983), used to assess the genetic relationship between clusters, were calculated with the software Phylip v. 3.67 (Felsenstein, 2005). A multivariate method of correspondence analysis allowed the simultaneous representation of inferred clusters. GENETIX v.4.05 software (Belkhir et al., 2004) was used to spatially plot clusters and individuals based on the allele frequencies of all loci and a correspondence analysis in which the Chi-square distances served to judge the proximity of the clusters.

Results
Nero Siciliano pigs showed high genetic variability (Table 1). Two hundred and forty-nine alleles were detected, with 5 (locus S0026) to 19 (locus S0005) alleles per locus and an average number per locus (9.96) that was fairly high. The effective number of alleles (Ae), which takes into account the expected heterozygosity (He), showed that S0005 was the most polymorphic locus and SW951 the least polymorphic. The expected heterozygosity was higher than that observed at each locus and ranged from 0.245 to 0.907 (mean: 0.708). The estimated polymorphic information content (PIC) ranged from 0.236 at SW951 to 0.901 at S0005 (data not shown). Overall, Nero Siciliano pigs showed Hardy-Weinberg disequilibrium, with significant deviations from equilibrium being observed for 10 microsatellites (Table 1). The presence of null alleles, inferred for five loci (S0005, SW911, SW1873, SW1556, SW2038), could explain the excess of homozygotes and the deviation from genetic equilibrium at these loci. The heterozygote deficiency within populations (FIS) was significant for nine loci. The overall FIS coefficient for the loci was 0.109, indicating a significant (p < 0.001) excess of homozygotes in the whole sample.
The level of population structuring was quite high. Model-based clustering of the microsatellite genotypes revealed that the likelihood variance of the observed data initially increased with the predefined number of clusters, reached a peak value at K = 9 and then decreased. As indicated by Pritchard et al. (2000), nine is the smallest value of K that captures the major structure in the dataset. Based on the method proposed by Evanno et al. (2005), two clear peaks at K = 9 and K = 2 (the latter being particularly indicative of a very low likelihood variance) were observed in the ΔK distribution. The highest values of the similarity coefficient G' (> 95%) were detected at K = 2, K = 7 and K = 9. Based on these three approaches, we assumed K = 9 to be the most likely number of clusters.
Based on the average matrix of membership (data not shown), animals with at least 70% of their genome belonging to one cluster were assigned to that cluster. The cluster size ranged from 7 to 17 (for a subsample of 108 of the original 147 pigs). Table 2 shows the genetic parameters of diversity for the nine clusters and the subsample of 108 pigs. The average genetic differentiation among inferred clusters was moderate (FST 0.125; p = 0.001). The genetic variability of this subsample was still high, with the overall FIS value (0.011) not significantly different from zero, indicating frequent random mating. The observed and expected heterozygosities were slightly lower than in the whole sample of 147 pigs.
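The 70% assignment rule can be expressed as a simple filter on the membership matrix. The sketch below is illustrative only; the function name, the animal identifiers and the membership coefficients are hypothetical, and only three clusters are shown for brevity.

```python
def assign_clusters(q_matrix, threshold=0.70):
    """Assign each animal to its best cluster if membership >= threshold.

    q_matrix maps an animal identifier to its list of membership
    coefficients (one per cluster, summing to 1). Clusters are numbered
    from 1; animals below the threshold are left unassigned (None).
    """
    assignments = {}
    for animal, q in q_matrix.items():
        best = max(range(len(q)), key=lambda k: q[k])
        assignments[animal] = best + 1 if q[best] >= threshold else None
    return assignments


# Invented membership coefficients for three animals and three clusters.
q = {
    "pig_a": [0.85, 0.10, 0.05],
    "pig_b": [0.40, 0.35, 0.25],  # admixed: no cluster reaches 70%
    "pig_c": [0.05, 0.25, 0.70],
}
print(assign_clusters(q))
```

Animals such as the hypothetical pig_b correspond to the 39 pigs of the original sample that were not retained in the subsample of 108.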
Among the 22 farms sampled, nine were highly homogenous, with > 70% of the genome belonging to a cluster. In contrast, animals from three farms that were particularly active in fattening pigs for meat production and pork products showed a wide genome distribution among clusters. Some clusters (K7, K5 and K3) consisted of individuals from single farms, whereas two (K9 and K4) consisted of pigs from different herds reared in the same area.
The rarefacted number of alleles (Ar), which measures the contribution of alleles weighted by the sample size, ranged from 3.01 (K7) to 4.40 (K1). Thirty-three private alleles were detected in the model-inferred group of pigs. Each cluster was characterized by at least one unique allele: high frequencies were observed for alleles SW1695
Correspondence analysis provided an alternative spatial representation of individuals and clusters scattered in the metric space (Figure 1). The first two axes contributed almost equally to the total inertia (16.84% and 15.96%, respectively). With regard to the dispersion in the first and second axes, S0017 (155 bp), SW1928 (85 bp), SW240 (122 bp) and S0017 (155 bp) were the most important alleles, each with a contribution > 5.7%.
The spatial dispersion of clusters and related individuals was determined mainly by four alleles that had very different frequencies among clusters, and by high frequency alleles detected exclusively in a given cluster. In particular, allele SW1928 (85 bp), which was found only in K7 at a high frequency (35.7%), contributed significantly to dispersion in the first and second axes (3.8% and 2.4%, respectively); an additional three alleles contributed more than 5% to dispersion in the first two axes and occurred at very high frequencies, i.e., 50% for SW240 (122 bp) in K2, 56% for S0017 (155 bp) in K2 and 82% for SW1873 (136 bp) in K7.
Molecular coancestry in each cluster ranged from 0.287 in K1 to 0.389 in K7. Self-coancestry and the inbreeding coefficient showed the same trend: the lowest values were detected in K6, the highest in K3. Kinship distances ranged from 0.277 to 0.361. The contribution to the diversity of the whole sample is shown in Table 2. Cluster 2 contributed the most to the total genetic diversity (CGD) and K8 the least, when assessed according to Caballero and Toro (2002). However, when the rarefacted number of alleles was considered, clusters K1 and K7 provided the highest and lowest contributions (CAr), respectively (Petit et al., 1998).
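For readers unfamiliar with these quantities, the sketch below shows one common way to compute molecular coancestry, self-coancestry and kinship distance between two diploid individuals, following the definitions usually attributed to Caballero and Toro (2002). It is unweighted (the PIC weighting applied in this study is omitted), it is not the Molkin code, and the function names and genotypes are hypothetical.

```python
def locus_coancestry(g1, g2):
    """Probability that two alleles drawn at random, one from each
    diploid genotype, are identical in state at a single locus."""
    (a, b), (c, d) = g1, g2
    return ((a == c) + (a == d) + (b == c) + (b == d)) / 4.0


def molecular_coancestry(ind1, ind2):
    """Average of the per-locus coancestry over all loci (unweighted)."""
    values = [locus_coancestry(g1, g2) for g1, g2 in zip(ind1, ind2)]
    return sum(values) / len(values)


def kinship_distance(ind1, ind2):
    """Dk = (s_i + s_j) / 2 - f_ij, where s is the self-coancestry."""
    s1 = molecular_coancestry(ind1, ind1)
    s2 = molecular_coancestry(ind2, ind2)
    return (s1 + s2) / 2.0 - molecular_coancestry(ind1, ind2)


# Invented two-locus genotypes (allele sizes in bp) for two animals.
pig_x = [(155, 155), (122, 136)]
pig_y = [(155, 157), (122, 122)]
print(molecular_coancestry(pig_x, pig_y))
print(molecular_coancestry(pig_x, pig_x))  # self-coancestry of pig_x
print(kinship_distance(pig_x, pig_y))
```

Under these definitions, the self-coancestry at a locus is 1 for a homozygote and 0.5 for a heterozygote, which is why self-coancestry and the inbreeding coefficient follow the same trend across clusters.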

Discussion
The number of effective alleles (4.53) was higher than in the European breeds (2.74 on average) and the Nero Siciliano pigs (4.03) studied by SanCristobal et al. (2006), and higher than or comparable to those reported for several Chinese breeds (1.69-5.62) (Li et al., 2004; Fang et al., 2005), except for Lingao (4.76; Fang et al., 2005) and Nang Yang Black, Sheng Xian Spotted and Hai Nan (5.12, 5.21 and 5.62, respectively) (Li et al., 2004). Conversely, the number of effective alleles and expected heterozygosity in Nero Siciliano pigs were lower than in the Indian Ankamali breed (Ae = 5.34, He = 0.83) (Behl et al., 2006). The significant deviation from Hardy-Weinberg equilibrium observed at 10 loci may reflect the presence of null alleles at some loci and the Wahlund effect, i.e., a reduction in the observed heterozygosity that occurs in subdivided populations. Genetic substructures in Nero Siciliano pigs apparently resulted from different management strategies across herds and inbreeding within herds. In the Nebrodi area, farmers use their own boars for a year or more, with only a limited exchange of animals among farms and herds. The within-group inbreeding values (FIS) support this hypothesis (Table 2).
The unsupervised analysis of population structure detected nine homogenous groups in the sample analyzed and determined the corresponding fraction of the individual's genome derived from ancestry in each cluster. Cluster K1 showed the highest variability in the number of alleles and heterozygosity, whereas K7 showed the least. These results agreed with the values for molecular coancestry and were confirmed by analysis of the contribution to the total genetic diversity estimated from the allelic richness.
The negative FIS values detected in clusters K2 and K5, which represent two closed-cycle farms, probably reflected the proper selection of breeding animals among the farms sampled. In these farms, which sell piglets of different ages for production or reproduction, the systematic exchange of genetic resources that occurs between breeding and processing herds probably helps to maintain a high level of diversity. Molecular data confirmed that these two farms had good mating management programs that ensured high genetic variability.
Correspondence analysis provided a more interesting and informative spatial representation of the relationships among the nine clusters with respect to the network built on genetic distances. Clusters K2 and K7 represented the two most distinct groups of pigs and accounted for 14.2% and 12.7% of the inertia, respectively. Cluster 7 represented a single herd. The farm in question is interesting because it is the only one in the Nebrodi area in which the recessive mutation in the RYR1 (Ryanodine Receptor 1) locus responsible for the Porcine Stress Syndrome (PSS) has been reported (Matassino et al., 2007); ten of the 14 pigs in this cluster were carriers of this recessive allele. Despite their morphological traits and recent successful breeding history, these pigs retain the traces of a possible accidental introgression of other commercial breeds. In addition, the low values for effective alleles and expected heterozygosity in K7 may reflect mating between closely related or inbred animals. Cluster 2 also represented a single farm that was among the first to rationally manage the short chain cycle of breeding-processing-resale. The spatial distribution of K2 pigs and the high contribution of this cluster to overall diversity support the hypothesis that the present herd was derived from different genetic lineages that were incorporated to control the level of inbreeding.
Mitochondrial DNA haplotype analysis has shown that modern European pig breeds belong to a general European cluster (Larson et al., 2005). The populations are often heavily structured and more than half of the European pig diversity can be assigned to local breeds (Ollivier et al., 2005). The results described here indicate a high degree of genetic variability in Nero Siciliano pigs, as previously reported (SanCristobal et al., 2006; Davoli et al., 2007), despite the limited population size of this breed. This is a rather frequent condition for populations reared in extensive systems where there is no systematic selection and planned mating. Part of the observed heterogeneity may have originated from accidental crosses with wild boars (Ollivier et al., 2005) and limited introgression with other Italian breeds (Neapolitan and Casertana), as well as with Iberian breeds and commercial pigs (Russo et al., 2004).
As shown here, the genetic profile established by using neutral markers and the analysis of intra-population structure provided a general outline of the breeding systems applied to Nero Siciliano pigs in the Nebrodi area. The model-based approach identified homogenous groups in the sample, some of which coincided with specific farms and others with breeding areas. Our results indicate that molecular data can be helpful in the selection of parental stocks and planned matings as part of strategies to preserve and restore the rational breeding of black pigs. Boars belonging to cluster K7 need to be screened for the PSS gene and possibly excluded from selection programs in order to eliminate the mutated PSS gene and the related PSE (pale, soft, exudative) defect from the population gene pool. Individuals belonging to different clusters could be used in planned matings to maintain a good level of genetic variability and rusticity (stress-resistance) and avoid excessive inbreeding. On the other hand, pigs sharing the same clusters and chosen based on their individual multilocus genotypes may be used in planned matings to preserve the most typical traits in this autochthonous population. Nero Siciliano pigs belonging to the most divergent cluster (K2) and pigs not included in the most homogenous subpopulation of 108 animals need to be incorporated into selection schemes in order to counterbalance any increase in inbreeding. Such approaches should help to preserve the Nero Siciliano breed and ensure the production of high quality products for local and national markets.