Patterns of genetic diversity of local pig populations in the State of Pernambuco, Brazil

This study estimated the genetic diversity and structure of 12 genetic groups (GG) of locally adapted and specialized pigs in the State of Pernambuco using 22 microsatellite markers. Nine locally adapted breeds (Baé, Caruncho, Canastra, Canastrão, Mamelado, Moura, Nilo, Piau and UDB (Undefined Breed)) and three specialized breeds (Duroc, Landrace and Large White), totaling 190 animals, were analyzed. The Analysis of Molecular Variance (AMOVA) showed that 3.2% of the total variation was due to differences between genetic groups, and 3.6% to differences between local and commercial pigs. One hundred and ninety-eight alleles were identified and, apart from the Large White breed, all GG presented deviations from Hardy-Weinberg Equilibrium for some loci. The mean total and effective numbers of alleles were lowest for Duroc (3.65 and 3.01) and highest for UDB (8.89 and 4.53) and Canastra (8.61 and 4.58). Using Nei's standard genetic distance and the UPGMA method, the Landrace breed grouped with the local genetic groups Canastra, Moura, Canastrão, Baé and Caruncho. Due to the complex admixture pattern, the genetic variability of the 12 genetic groups can be analyzed by distributing the individuals into two populations, as demonstrated by a Bayesian analysis. This corroborates the AMOVA results, which revealed a low level of genetic differentiation between the inferred populations.


Introduction
A large portion of the Brazilian swine herd is composed of breeds with high genetic potential for meat production (Duroc, Landrace, Large White, Hampshire and Pietrain), which were intensively introduced into the country during the 1960s, and of commercial or specialized lines (mostly derived from crossbreeding) that were introduced more recently. Naturalized pigs, i.e., animals that were locally adapted and that have survived natural selection processes since the country's Colonial Period, are facing extinction because they are characterized as fat-type pigs with low potential for lean meat production.
In recent years, molecular techniques have supported several studies focused on estimating the distribution of genetic variability among and within endangered breeds of farm animals. Most of these studies were carried out using information from molecular markers such as microsatellites, which are considered to be extremely informative for genetic diversity studies in populations (Aranguren-Méndez et al., 2005). Sollero et al. (2009) identified genetic diversity loss in pig breeds locally adapted in Brazil (Moura, Piau and Monteiro) with the use of a panel of microsatellite markers. Similar studies were also carried out in other countries (Calvo et al., 2000; Martínez et al., 2000; Lemus-Flores et al., 2001; Chaiwatanasin et al., 2002; Fabuel et al., 2004; Li et al., 2004; Canul et al., 2005; Kim et al., 2005; Martínez et al., 2005; Thuy et al., 2006; Vicente et al., 2008).
According to Lemus-Flores et al. (2001), Mexican pig populations may be carriers of genetic variants important for preservation and may be a source of new alleles that could be used for further improvement of commercial pig breeds. Kim et al. (2005) found a strong relation between Korean pigs and the Berkshire and Landrace commercial breeds, and results obtained by Behl et al. (2006) identified a distinct type of Indian pig. Such examples reinforce the value of using microsatellite markers in studies focused on the characterization and conservation of animal genetic resources.
The objective of this study was to quantify the genetic diversity and structure of locally adapted pig populations in the State of Pernambuco, Brazil, with the use of a panel of microsatellite markers.

Material and Methods
One hundred and ninety samples of commercial and locally adapted pigs were used in the study: 176 from 82 properties in different regions of the State of Pernambuco (26 samples from the Metropolitan Region of Recife, 15 from the North and South Zona da Mata regions, 19 from the Central and Meridional Agreste regions and 116 from the city of Salgueiro and from the Sertão do Araripe region); and 14 samples as positive controls from other Brazilian states (Santa Catarina, Distrito Federal and Bahia). The collection period was from January to September 2009.
The locally adapted pigs were classified into nine genetic groups (Baé, Caruncho, Canastra, Canastrão, Mamelado, Moura, Nilo, Piau and UDB, Undefined Breed) according to their qualitative morphological characteristics as described by Vianna (1986), Barreto (1973) and Castro et al. (2002). The UDB group was composed of 47 pigs with morphological characteristics that were too heterogeneous to be classified into one of the other groups. Pig samples from specialized breeds (Landrace, Large White and Duroc) from Embrapa CENARGEN's DNA/Tissue Bank were used as outgroups to evaluate their relation with the locally adapted animals.
The markers were distributed across 17 autosomal chromosomes of the pig genome. Nineteen of these microsatellites are part of a panel recommended by FAO (Food and Agriculture Organization)/ISAG (International Society of Animal Genetics) (FAO, 2004) for the study of genetic diversity in pigs.
Four loci (S0090, S0386, S0228 and S0097) were excluded from the analyses because of problems in amplification or in allele calling, so the final panel was composed of 18 loci.
PCR amplifications were carried out using 4.5 ng of genomic DNA, 0.2 to 0.23 μM of each primer and 5 μL of QIAGEN Master Mix kit, according to the manufacturer's recommendations. The following cycling steps were used: 95°C (15 min) followed by 36 cycles of 94°C (1 min), 60°C (1 min 30 s) and 72°C (1 min), and one final extension step at 72°C (10 min). PCR products were submitted to capillary electrophoresis in an ABI Prism 3100 automated sequencer (Applied Biosystems), according to the manufacturer's recommendations.
The genetic structure of the 12 genetic groups studied was assessed using the Analysis of Molecular Variance (AMOVA) with the software Arlequin (Excoffier et al., 2006) to determine the genetic differentiation between groups through FST estimates (Weir & Cockerham, 1984), which were tested with 100,000 iterations of Markov Chains and 10,000 permutations. Four contrasts were tested: (I) the genetic groups separated by region, without considering their phenotypic characteristics, which resulted in five populations (Metropolitan Recife, Zona da Mata, Agreste, Sertão and Embrapa's DNA/Tissue Bank CENARGEN, DF); (II) the three main phenotypic groups (locally adapted, commercial and UDB); (III) the UDB population grouped with the other locally adapted genetic groups, creating two groups: one with the commercial populations and another with the locally adapted ones; (IV) all 12 genetic groups as independent populations.
The other strategy to define the population structure consisted of estimating the most probable number of populations (K) in the samples from the data generated with the 18 microsatellites in the panel, using a Bayesian approach with the software Structure v. 2.3.3 (Pritchard et al., 2000). Values of K from 1 to 15 were tested, with five simulations for each K, a burn-in of 100,000 repetitions and 500,000 iterations of Markov chain Monte Carlo (MCMC) simulation. Additional analyses were performed to test the consistency of the results (burn-ins of 50,000, 80,000 and 100,000 with 100,000, 150,000 and 500,000 MCMC iterations, respectively). The tests were based on the admixture model with correlated allele frequencies (Falush et al., 2003).
From the data probability logarithms (Ln P(D)) obtained with the Structure software for both analyses, the best K was estimated with an ad hoc statistic named DeltaK (ΔK), which is based on the rate of change of the data probability logarithm between successive K values (Evanno et al., 2005). The graphical displays of the Structure results were produced with the software Distruct (Rosenberg, 2004).
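The DeltaK statistic described above can be sketched as follows. This is a minimal illustration, not the procedure actually run by the authors: it assumes the common formulation ΔK = |L(K+1) - 2L(K) + L(K-1)| / sd(L(K)), where L(K) is the mean Ln P(D) over replicate runs, and the function name and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Return {K: DeltaK} from {K: [Ln P(D) of each replicate run]}.

    DeltaK is undefined for the smallest and largest K tested,
    because it needs both neighbours K-1 and K+1.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # DeltaK is undefined when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result

# Hypothetical Ln P(D) values, three replicate runs per K (not the study's data).
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -892.0, -888.0],
    4: [-885.0, -880.0, -890.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

With these invented values the likelihood gain flattens after K = 2, so ΔK peaks there, which is the kind of signal the statistic is meant to detect.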
After the identification of the most probable genetic structure for this data set, a series of intra- and interpopulation parameters was estimated with the software GenAlEx v. 6.3 (Peakall & Smouse, 2006): total number of alleles (AT); average number of alleles (AM); effective number of alleles (AE); number of private alleles; expected heterozygosity (He); and observed heterozygosity (Ho). Arlequin v. 3.1 (Excoffier et al., 2006) was used to test whether the studied loci were in Hardy-Weinberg Equilibrium in each population, using the model proposed by Guo & Thompson (1992).
Polymorphic Information Content (PIC) and two parentage exclusion probabilities were estimated for each locus with Cervus v. 3.0 (Marshall et al., 1998). PE1 is the exclusion probability estimated when only the genotypes from the offspring and the supposed father are available, whereas PE2 is estimated with the inclusion of the genotype of one of the progenitors (the mother).
FSTAT v. 2.9.3 (Goudet, 2002) was used for estimating Wright's F statistics (FST, FIS and FIT; Weir & Cockerham, 1984) per locus within populations. The significance of the F statistics was tested by bootstrap with 1000 resamplings, with a 99% confidence interval (CI). Bonferroni correction was used with α = 5% (0.05/18).
The genetic relation between populations was evaluated based on Nei's (1972) standard genetic distance and Reynolds' (1983) genetic distance, using the software TFPGA v. 1.3 (Miller, 2000). Results were used to build a dendrogram with the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), with 1000 bootstrap replicates.

Results and Discussion
In order to identify the main genetic structure among swine genetic groups from Pernambuco State, a hierarchical analysis with the AMOVA procedure was carried out to test the significance of four contrasts. The highest inter-population variation was observed when two groups composed of specialized and locally adapted animals (analysis III) were contrasted (FST = 3.6%, P<0.0001). When the analysis was performed with only one group composed of the 12 genetic groups (analysis IV), the FST value was 3.2% (P<0.0001). Both tests confirm a weak genetic structure among swine populations sampled in the State of Pernambuco.
In the other tests, the variation between groups from five different geographic regions, regardless of their phenotypic characteristics, was quite low (FST = 0.48%) and non-significant (P>0.0001); the FST of a grouping with three populations (specialized, locally adapted and UDB) was also low (2.09%; P<0.0001). Considering that FST is inversely related to gene flow, the low level of genetic differentiation observed between genetic groups is probably a consequence of a high level of gene flow among different herds from the same geographical region. Santos-Silva et al. (2009), evaluating Portuguese sheep breeds, affirmed that geographically close breeds tend to have a low differentiation level because gene flow between breeds is more likely to occur.
In contrast to our findings, Yang et al. (2003) found higher levels of differentiation between Chinese pig populations (FST = 7.7%). AMOVA results of five Brazilian genetic groups (Piau, Moura, Monteiro, Landrace and MS60) obtained by Sollero et al. (2009) showed that 14% of all observed diversity came from the difference between the evaluated genetic groups. In another study carried out with Mexican pigs, an FST value of 11% was found (Lemus-Flores et al., 2001). The highest genetic differentiation values reported for pig populations were FST = 27% between European pigs (Laval et al., 2000), followed by 26.1% in a differentiation study carried out with European, Korean and Chinese pigs (Kim et al., 2005).
Structure (Pritchard et al., 2000) has been satisfactorily used in several studies to estimate the number of probable subpopulations/breeds that make up a certain species (Paiva, 2005; Sollero et al., 2009; Vicente et al., 2008; Martínez et al., 2008; Santos-Silva et al., 2009). The genetic differentiation observed between genetic groups was low in all structures tested in this study. Therefore, the data probability logarithms (Ln P(D)) from the software Structure were used to infer the best K with the ad hoc statistic DeltaK (ΔK), and the best K was taken to be the one with the highest ΔK value (92.01). Among the 15 values of K tested, K = 2 had the highest probability. It was possible to visualize the genetic variability distribution of individuals according to structures of K = 2 and K = 4. For the other inferred populations (K = 5 to K = 15), the admixture patterns observed were complex and difficult to evaluate.
Figure 1 presents the genetic structure distribution of the 12 pig genetic groups studied from K = 2 to K = 4.
Introgression in different proportions of genetic material from specialized breeds into the locally adapted groups, as well as high levels of admixture, were observed. These results corroborate reports from Sollero et al. (2009), who showed that breeds such as Landrace and Piau share a high number of alleles, as other specialized and locally adapted swine breeds do.
Probabilities from the allocation test of individuals for the 12 pig genetic groups estimated with Structure are presented in Table 1. In cluster 1 the highest grouping probabilities were found for Canastrão (0.885) and Duroc (0.871), which may be a result of the genetic management used, since the Canastrão samples came from the Sertão region of Pernambuco (Araripe and Salgueiro) and the Duroc pigs came from Embrapa Swine and Poultry.
Cluster 1 also presented high allocation probabilities for the Landrace (0.736) and Large White (0.628) specialized breeds, as well as for other local genetic groups like Baé (0.679), Caruncho (0.602) and Canastra (0.559), and it contributed to most of the studied groups. Local genetic groups Piau (0.573), Nilo (0.487) and Moura (0.422) were grouped in cluster 2. Cluster 3 was related to the local genetic group Nilo (0.433), while groups Mamelado (0.43) and UDB (0.476) shared cluster 4. Notably, all genetic groups, despite having presented higher proportions in some of the clusters, had some individuals allocated to other clusters.
Table 2 presents the results of the quantitative analysis of variability per locus: total (AT), mean (AM) and effective (AE) number of alleles; expected (He) and observed (Ho) heterozygosity according to the Hardy-Weinberg Equilibrium (HWE); Polymorphic Information Content (PIC); the two parentage exclusion probabilities (PE1 and PE2); the F statistics for each locus; and the number of genetic groups that presented significant deviations in the HWE test.
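For readers less familiar with these indexes, the per-locus quantities can be sketched from raw genotypes as below. This is only an illustration: it assumes the uncorrected forms He = 1 - Σp² and AE = 1/Σp², the function name is hypothetical, and the allele sizes are invented rather than taken from the study.

```python
from collections import Counter

def locus_stats(genotypes):
    """Summarize one locus in one group from a list of (allele1, allele2) pairs."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    n_copies = sum(counts.values())                 # 2 x number of animals
    freqs = [c / n_copies for c in counts.values()]
    homozygosity = sum(p * p for p in freqs)        # expected homozygosity
    observed_het = sum(a != b for a, b in genotypes) / len(genotypes)
    return {
        "A": len(counts),          # number of distinct alleles
        "Ae": 1 / homozygosity,    # effective number of alleles
        "He": 1 - homozygosity,    # expected heterozygosity
        "Ho": observed_het,        # observed heterozygosity
    }

# Hypothetical genotypes (allele sizes in base pairs) for four animals.
stats = locus_stats([(100, 102), (100, 100), (102, 102), (100, 102)])
```

A group in which Ho falls clearly below He at many loci shows the heterozygote deficit that a positive FIS summarizes.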
In the analyses of intra- and inter-group variability, 198 alleles were identified across the 18 loci analyzed in the 12 swine genetic groups (Table 2). All loci tested were polymorphic in all breeds with the exception of S0355, which was monomorphic in the Duroc and Large White breeds. This observation was probably due to the low number of individuals genotyped in these breeds, n = 1, each. Lemus-Flores et al. (2001) detected 5 and 7 S0355 alleles in Duroc and Large White/Landrace pigs, respectively. Conversely, Martínez et al. (2005) reported that S0355 was monomorphic in six of the eleven Iberian pig populations studied.
Common alleles were observed at high frequencies among breeds in most of the tested loci. Thirteen loci presented one or two alleles common to 100% of the populations: SW1517, SW830, S0155, S0026, S0002, SW240, SW857, S0178, OPN, S0101, SW72, SW455 and SW911. Sixteen alleles from all evaluated loci were present in 91.7% of the pig populations, and four private alleles from four different loci were detected with frequencies lower than 7% (loci S1517, S0355 and SW857 in the UDB population and SW72 in Nilo).
In nine of the 12 populations (Baé, Canastra, Canastrão, Landrace, Moura, Mamelado, Nilo, Piau and UDB), the presence of rare alleles (with frequencies lower than 5%) was observed. Groups UDB, Canastra and Moura presented higher total allele numbers (160, 155 and 142, respectively) and rare allele numbers (68, 63 and 45, respectively). Among the specialized breeds, Landrace had the highest allele number, with 136 alleles observed in the 18 genotyped loci, of which 33 had a frequency lower than 5%. The elevated total allele number detected in local groups in relation to the specialized ones may indicate that the specialized breeds have suffered a loss of allelic diversity, or may reflect recent admixture events, which tend to raise the number of alleles. The rare alleles that were identified in large quantities in local groups may have resulted from mutations and/or recent gene flow. Total allele number per locus (AT) varied from 7 (SW830 and SW72) to 20 (S0005). Six of the loci recommended by FAO (SW830, S0026, S0355, S0178, S0101 and SW72) showed low numbers of alleles, 7 to 8 per locus. Laval et al. (2000) detected 7 to 9 alleles per locus for five of these loci (S0026, S0355, S0178, S0101 and SW72), and Behl et al. (2005) also found few alleles at loci S0026, S0355, S0178 and SW72 (6 to 11 alleles).
Polymorphic Information Content (PIC) estimates varied between 0.541 (SW72) and 0.9333 (S0005), with a mean value of 0.738. Of the 18 microsatellite loci analyzed, nine presented high polymorphism (PIC>0.7) and only two had PIC lower than 0.6. According to the classification of Botstein et al. (1980), markers with PIC above 0.5 are considered very informative; therefore, all loci used in this study were informative. The highest PIC values (0.939 and 0.903) corresponded to the microsatellites that presented the highest allele numbers (S0005 and S0068, respectively).
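As a reminder of how PIC relates to allele frequencies, the Botstein et al. (1980) definition, PIC = 1 - Σp_i² - ΣΣ 2p_i²p_j² (i < j), can be sketched as below. The study obtained its values from Cervus; this function is only an illustrative re-implementation under that formula, with hypothetical frequencies.

```python
def pic(freqs):
    """Polymorphic Information Content for one locus from its allele frequencies."""
    homozygosity = sum(p * p for p in freqs)
    # Correction term: matings in which the transmitted allele cannot be traced.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - cross

two_equal = pic([0.5, 0.5])        # biallelic locus at equal frequencies
ten_equal = pic([0.1] * 10)        # ten equally frequent alleles (hypothetical)
```

The comparison shows why loci with many alleles at balanced frequencies, such as the most polymorphic ones in this panel, reach PIC values close to 1, while a biallelic locus cannot exceed 0.375.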
Based on the PIC values observed and on the high combined exclusion probabilities of 0.99998 (PE1) and 0.99999 (PE2), it can be inferred that all loci can be used to compose a panel for parentage tests. It is worth noting, however, that some of the loci recommended by FAO (SW830, S0155, S0355, SW2406, SW857 and SW72) individually presented PE1 and PE2 values below 50%.
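The reason a panel can reach a combined exclusion probability near 1 even when individual loci fall below 50% can be sketched as follows. The sketch assumes independent loci, so that the combined value is 1 - Π(1 - PE_l); the per-locus values used here are hypothetical and are not those of Table 2.

```python
import math

def combined_exclusion(per_locus_pe):
    """Combined exclusion probability across independent loci."""
    return 1 - math.prod(1 - pe for pe in per_locus_pe)

two_loci = combined_exclusion([0.5, 0.5])
# Eighteen hypothetical loci, each excluding only 45% of false parents.
panel = combined_exclusion([0.45] * 18)
```

Because the non-exclusion probabilities multiply, eighteen modest loci already push the combined value above 0.9999 in this toy case.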
The fixation coefficient of populations (FST) per locus varied from zero (S0155) to 0.242 (S0355), with a mean value of 0.039, showing that only 3.9% of the total genetic variation was explained by differences between populations. For the FST index, 66.7% of the loci presented values outside of the confidence interval (CI = 99%), with eleven loci presenting significant, though low, values to diagnose differences between the genetic groups. The mean FIS and FIT values for all loci were 0.186 and 0.214, with 38.9% and 33% of the values outside of the CI, respectively. The inbreeding coefficient within the populations (FIS) indicated high inbreeding levels at each locus, except for loci S0002 (-0.130), SW857 (-0.236) and S0178 (-0.427), which presented negative FIS, suggesting an excess of heterozygotes or outbreeding. The positive FIS (0.186) for all loci in the populations probably reflects the subdivision of the general population into subpopulations, together with the inbreeding accumulated in small populations and the deviations from the Hardy-Weinberg Equilibrium (HWE).
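The three coefficients are linked by Wright's identity (1 - FIT) = (1 - FIS)(1 - FST), which offers a quick consistency check on the reported means. The sketch below simply applies that identity; the function name is hypothetical, and an exact match is not expected, presumably because the reported figures are means of per-locus estimates.

```python
def fit_from(fis, fst):
    """Total inbreeding coefficient implied by Wright's identity."""
    return 1 - (1 - fis) * (1 - fst)

# Mean values reported in the text: FIS = 0.186, FST = 0.039.
implied_fit = fit_from(0.186, 0.039)
```

The implied value is about 0.218, close to the reported mean FIT of 0.214, so most of the overall heterozygote deficit is attributable to the within-group component rather than to differentiation between groups.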
The HWE test showed that all of the polymorphic loci deviated from HWE (P<0.05) in at least one population. Loci SW72 (8), S0068 (8) and OPN (7) deviated in the highest numbers of populations, whereas loci SW911 and S0830 deviated in only one population each (UDB and Canastra, respectively).
According to the estimated indexes related to the genetic variability (Table 3), the mean allele number was lowest for Duroc (3.65), while the UDB and Canastra populations presented mean allele numbers of 8.89 and 8.61, respectively. The overall AM was 6.20 alleles; five of the locally adapted genetic groups and the Landrace specialized breed presented AM above this average.
Several factors may have contributed to the low Ho/He ratio in the studied populations, such as inbreeding, population subdivision, presence of null alleles or even selection in favor of homozygotes, which leads to a loss of heterozygotes (Maudet et al., 2002). The PIC was highest in the genetic group Moura (0.717) and lowest in the Large White breed (0.560) and in the Duroc breed (0.564), which also had the lowest He value.
The variability within genetic groups estimated by the inbreeding coefficient (FIS) showed that the highest value was obtained for the Duroc breed (0.250; P<0.05), and the lowest for the Large White breed (0.056), which was not significant (P>0.05). Three locally adapted pig genetic groups (Canastra, UDB and Baé) presented high FIS (P<0.05). In the local genetic groups, since there are no organized selection programs, it can be inferred that the positive FIS values observed result from inbreeding, which may contribute to the observed heterozygote deficits through matings between related individuals on the small swine farms found in Pernambuco.
The Landrace breed presented a high and significant inbreeding value (FIS = 0.180; P<0.05), as expected, since many of the individuals in the sample were known to have some degree of kinship due to the local production system. The locally adapted Piau breed presented a high FIS value (0.184, P<0.05). For the genetic group UDB, composed of 47 pigs, FIS was even higher (0.228, P<0.05). The positive and significant values for FIS in 11 genetic groups also showed that they are in Hardy-Weinberg disequilibrium. Sollero et al. (2009) found negative FIS values (-0.054 and -0.055) when evaluating Landrace and Moura pigs, respectively; however, they obtained a similarly high value for the Piau breed (FIS = 0.126). Higher FIS values (over 0.10) were observed for Landrace, Large White and Duroc specialized breeds studied in other countries by Lemus-Flores et al. (2001) and Kim et al. (2005), respectively.
The UDB group presented 16 loci that did not fit HWE (P<0.05), while the Large White breed had only one locus that did not adhere to HWE (P<0.05). This shows a great difference in the number of loci in Hardy-Weinberg disequilibrium among the analyzed populations, especially in the locally adapted groups, which presented the widest range (Table 3). After the Bonferroni correction, the highest number of loci in disequilibrium was still observed for the genetic groups UDB and Canastra (14 and 11, respectively), and in five groups all loci adhered to HWE (Baé, Caruncho, Duroc, Large White and Mamelado). In the analyzed groups, the high number of deviating loci is probably due to the fact that the animals were raised by small farmers in the State of Pernambuco. These farmers often mate related individuals, mainly because herds are small. They also habitually trade boars and sows between nearby properties, and most of them used to buy animals in free markets, which suggests gene flow from other populations.
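The effect of the Bonferroni correction on the count of deviating loci can be sketched as below. The threshold follows the 0.05/18 stated in Material and Methods; the function name and the P values attached to the three loci are hypothetical and serve only to show how a locus can deviate at P<0.05 yet adhere to HWE after correction.

```python
def bonferroni_significant(p_values, alpha=0.05, n_tests=None):
    """Return the corrected threshold and the tests that remain significant."""
    n = n_tests if n_tests is not None else len(p_values)
    threshold = alpha / n
    kept = [name for name, p in p_values.items() if p < threshold]
    return threshold, kept

# Hypothetical HWE test P values for three loci in one genetic group.
hwe_p = {"S0005": 0.0004, "SW72": 0.01, "OPN": 0.003}
threshold, still_deviating = bonferroni_significant(hwe_p, n_tests=18)
```

All three hypothetical loci deviate at the nominal 5% level, but only one stays below the corrected threshold of about 0.0028.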
The high number of homozygotes observed is probably due to the Wahlund effect, which results from the presence of subpopulations within the samples representing each group, as reported by Lemus-Flores et al. (2001) for Mexican pig populations. This effect can be one of the reasons why 11 genetic groups are in Hardy-Weinberg disequilibrium, together with the low sample number (Canul et al., 2005).
Overall, the nine locally adapted swine genetic groups presented higher mean values for intra-population genetic diversity parameters such as AE (4.18), AM (7.22), PIC (0.67) and He (0.70) than the three specialized breeds, and these values were even higher than the means of the 12 genetic groups. This shows the high diversity of the locally adapted populations in comparison to the specialized ones. The higher diversity may be explained by the fact that the locally adapted genetic groups are not subject to constant improvement programs for specific characteristics, as specialized breeds are.
This higher diversity of the locally adapted groups results in a gene pool that makes them capable of surviving the adversities of their living environment. Despite such high diversity, the Ho (0.60) was lower for the local genetic groups, suggesting that the low numbers of heterozygotes in these groups are due to low numbers of breeders and the lack of breeding management practices, which lead to an increase in the degree of inbreeding, as shown by the highest mean value for the intra-population inbreeding coefficient (FIS = 0.17).
Nei's standard genetic distance indicated that Canastra was the locally adapted breed closest to the specialized Landrace breed (0.094), and that Mamelado was the most genetically distant from Duroc (0.794). The Duroc breed presented the longest distances in relation to the other locally adapted populations, probably due to geographic isolation, drift and sampling effects. Similar results were also obtained with Reynolds' distance, showing that Landrace is the specialized breed most closely related to the local breeds. The results obtained corroborate findings reported by Martínez et al. (2000), who used the same distance method and showed that there is genetic differentiation between Duroc and Iberian pig breeds.
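For reference, Nei's (1972) standard distance can be sketched from allele frequencies as D = -ln(Jxy / sqrt(Jx Jy)), where Jx, Jy and Jxy are sums over loci of Σx_i², Σy_i² and Σx_i y_i. The distances in this study were computed with TFPGA; the function below is only an illustrative version with hypothetical frequencies, and it omits any sample-size correction.

```python
import math

def nei_standard_distance(pop_x, pop_y):
    """Nei's (1972) D between two populations.

    Each population is a list with one dict per locus: allele -> frequency.
    Raises ValueError if the populations share no allele at any locus.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    return -math.log(jxy / math.sqrt(jx * jy))

# Hypothetical single-locus frequencies.
fixed = [{"A": 1.0}]
mixed = [{"A": 0.5, "B": 0.5}]
d_same = nei_standard_distance(mixed, mixed)
d_diff = nei_standard_distance(fixed, mixed)
```

Identical frequency profiles give a distance of zero, and the distance grows as fewer alleles are shared.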
Figure 2 presents a distance tree built with the UPGMA method from Nei's (1972) standard genetic distance matrix. Four groups were observed: the first one, with 51% confidence, formed by Landrace and Canastra, Moura, Canastrão, Baé and Caruncho; and the second group, also with 51% confidence, formed by Piau, UDB, Nilo and Mamelado. There was a closer proximity between the Landrace specialized breed and the locally adapted group Canastra, with a high bootstrap value (77%). Large White and Duroc were more genetically distant from the other genetic groups, with bootstrap values of 73% and 100%, respectively, and composed the other two groups.
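The UPGMA clustering behind such a tree can be sketched in a few lines. This is a bare illustration of the algorithm only: it returns the topology as nested tuples, without branch lengths or bootstrap support, and it is not the TFPGA implementation. Only the Landrace-Canastra distance (0.094) comes from the text; the other two distances are hypothetical.

```python
def upgma(names, dist):
    """Cluster labels by UPGMA; dist maps (a, b) pairs to distances."""
    sizes = {name: 1 for name in names}              # cluster -> number of leaves
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(sizes) > 1:
        pair = min(d, key=d.get)                     # closest pair of clusters
        a, b = sorted(pair, key=str)                 # deterministic ordering
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        del d[pair]
        for c in sizes:
            dac = d.pop(frozenset((a, c)))
            dbc = d.pop(frozenset((b, c)))
            # Size-weighted mean, i.e. the average over all leaf pairs.
            d[frozenset((merged, c))] = (na * dac + nb * dbc) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))

tree = upgma(
    ["Landrace", "Canastra", "Duroc"],
    {
        ("Landrace", "Canastra"): 0.094,   # reported in the text
        ("Landrace", "Duroc"): 0.5,        # hypothetical
        ("Canastra", "Duroc"): 0.6,        # hypothetical
    },
)
```

In this toy input Landrace and Canastra join first and Duroc attaches last, mirroring the pattern described for Figure 2.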

Conclusions
There is a weak genetic structure among all studied swine genetic groups in Pernambuco State. Landrace and Canastra, Moura, Canastrão, Caruncho and Baé are genetically close, while Duroc and Large White are the most distinct breeds, possibly because they were not part of the colonization and admixture of the local populations sampled in the region. Due to the complex admixture pattern resulting from crossbreeding between specialized and locally adapted pig breeds, the genetic variability of the 12 genetic groups can be analyzed by distributing the 190 individuals into two populations, as shown by the DeltaK statistic. This distribution corroborates the AMOVA results, which identified a low level of genetic differentiation between the inferred populations. Therefore, conservation programs for pigs in this State must focus on local genetic groups in general, and not on pre-defined breeds, due to the excessive crossbreeding between the genetic groups.

Figure 1 - Distribution of the genetic structure of the 12 pig genetic groups studied, obtained with the software Structure/Distruct for K = 2 to K = 4.

Figure 2 - Genetic distance tree grouped by the UPGMA method based on Nei's (1972) distance values, showing the genetic relations between the 12 genetic groups studied.

Table 1 - Allocation probabilities of individuals in the 12 genetic groups, estimated with the software Structure

Table 2 - Estimates of genetic variability indexes per locus based on 190 specialized and locally adapted pigs
a Total allele number; b Mean allele number; c Effective allele number; d Observed heterozygosity; e Expected heterozygosity; f Polymorphic Information Content; g PE1 = exclusion probability 1; h PE2 = exclusion probability 2; i fixation index of the global population; j endogamy due to the differentiation between subpopulations, in relation to the total population; l inbreeding coefficient in relation to its subpopulations; m Number of genetic groups which deviated from the HWE (P<0.05)

Table 3 - Estimates of genetic variability indexes per group based on 190 specialized and locally adapted pigs evaluated with 18 microsatellites