Fine mapping and single nucleotide polymorphism effects estimation on pig chromosomes 1, 4, 7, 8, 17 and X

Fine mapping of quantitative trait loci (QTL) from previous linkage studies was performed on pig chromosomes 1, 4, 7, 8, 17, and X, which were known to harbor QTL. Traits were divided into growth performance, carcass, internal organs, cut yields, and meat quality. Fifty families of an F2 population produced by crossing local Brazilian Piau boars with commercial sows were used. The linkage map consisted of 237 SNP and 37 microsatellite markers covering 866 centimorgans. QTL were identified by regression interval mapping using GridQTL. Individual marker effects were estimated by Bayesian LASSO regression using R. In total, 32 QTL affecting the evaluated traits were detected along the chromosomes studied. Seven of the QTL were known from previous studies using our F2 population, and 25 novel QTL resulted from the increased marker coverage. Six of the seven QTL that were significant at the 5% genome-wide level had SNPs within their confidence interval whose effects were among the 5% largest effects. The combined use of microsatellites along with SNP markers increased the saturation of the genome map and led to smaller confidence intervals of the QTL. The results showed that the tested models yield similar improvements in QTL mapping accuracy.


Introduction
Quantitative trait loci (QTL) mapping efforts often result in detection of genomic regions that explain part of the quantitative trait variation. However, these regions are usually so large that they do not allow accurate identification of the responsible genes or variants. By including single nucleotide polymorphisms (SNPs) in the analysis, the genome can be saturated with more markers and the intervals of these QTL may be narrowed. Making QTL regions as small as possible is a first step towards the identification of the relevant gene(s) and the respective causative mutation(s).
Previous studies from our research group were conducted on the same population and detected QTL by means of microsatellite markers. A combined total of 40 QTL for growth performance, meat quality, internal organs, cut yield, and carcass composition were found in studies by Paixão et al. (2008, 2012, 2013), Silva et al. (2008), and Sousa et al. (2011): five on chromosome 1 (SSC1), 12 on chromosome 4 (SSC4), nine on chromosome 7 (SSC7), eight on chromosome 8 (SSC8), three on chromosome 17 (SSC17), and three on chromosome X (SSCX). The sparse genetic maps that were used led to the detection of QTL with large confidence intervals. Combining the microsatellite genotypes with new information from SNP markers in these regions should allow fine mapping and more precise positioning of these QTL.
One important issue when performing fine mapping is how to analyze the data and combine the resulting information, since linkage mapping and genome-wide association (GWA) use different statistical approaches. Linkage mapping is based on simple linear regression using line of origin probabilities of the genotypes. In contrast, certain GWA methods are based on multiple regression models where the allele substitution effects for all SNPs are estimated simultaneously (e.g. Meuwissen et al., 2001). The multiple regression models can have estimation problems that stem from multicollinearity between markers, requiring some special statistical treatment. The Bayesian LASSO regression (BLR) (Park and Casella, 2008) combines desirable features of variable selection and regularization via shrinkage of the regression coefficients (de los Campos et al., 2009).
The objectives of this study were (i) to fine map chromosomes that had QTL in our pig population (SSC1, SSC4, SSC7, SSC8, SSC17 and SSCX) with increased marker coverage; and (ii) to use complementary information from SNP marker effects estimated by BLR to determine the concordance between the positions of SNPs with the top 5% estimates and the regions covered by a QTL confidence interval.

Material and Methods
Experimental population and phenotypic data
All procedures with animals were carried out in accordance with the Ethics Statements of the Department of Animal Science, Federal University of Viçosa (UFV), MG, Brazil.
A three-generation resource population was created and managed as described by Band et al. (2005a). Briefly, two local Piau breed grandsires were crossed with 18 granddams, composed of Large White, Landrace, and Pietrain breeds, producing the F1 generation from which 11 F1 sires and 54 F1 dams were randomly selected (Peixoto et al., 2006). These F1 individuals were crossed to produce 627 F2 animals. Piau is a local unimproved breed with a high level of fatness (Serão et al., 2011).

DNA extraction, SNP selection and genotyping
DNA was extracted at the Animal Biotechnology Laboratory of the Department of Animal Science at the Federal University of Viçosa. Genomic DNA was extracted from white blood cells of grand parental, F1, and F2 animals, as described in Band et al. (2005b).

Statistical analysis - QTL mapping
Genetic distance between markers was extrapolated from the physical distance (1 Mb = 1 cM) (Amaral et al., 2008) to build the combined map of microsatellites and SNPs. The combined genotypic data were used in linkage analysis using the regression method implemented in GridQTL (Seaton et al., 2006).
512 Hidalgo et al.
The statistical model assumed that the putative QTL is diallelic, with alternative alleles fixed in each of the grand parental breeds. The probability that each F2 individual carries each of the three QTL genotypes was calculated according to the genotype of the markers at 1 cM intervals along the chromosome. From these probabilities the additive and dominance coefficients were calculated and used to regress the individual phenotypes for each animal.
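The step from genotype probabilities to regression coefficients can be sketched as follows. This is a minimal illustration, not the GridQTL implementation: it assumes the usual regression interval mapping convention in which the additive coefficient is the probability of the homozygote for one line of origin minus the probability of the other homozygote, and the dominance coefficient is the probability of the heterozygote. The function names and the toy probabilities are hypothetical.

```python
import numpy as np

def hk_coefficients(p_piau_hom, p_het, p_comm_hom):
    """Expected additive (c_a) and dominance (c_d) coefficients at one map position.

    Inputs are per-animal probabilities of the three putative QTL genotypes
    (homozygous Piau origin, heterozygous, homozygous commercial origin).
    """
    p_piau_hom = np.asarray(p_piau_hom, dtype=float)
    p_het = np.asarray(p_het, dtype=float)
    p_comm_hom = np.asarray(p_comm_hom, dtype=float)
    c_a = p_piau_hom - p_comm_hom  # positive when Piau origin is more likely
    c_d = p_het
    return c_a, c_d

def regress_on_coefficients(y, c_a, c_d):
    """Least-squares fit of y = mean + c_a * a + c_d * d; returns (mean, a, d)."""
    X = np.column_stack([np.ones(len(y)), c_a, c_d])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Toy genotype probabilities for five hypothetical F2 animals (rows sum to 1).
probs = np.array([
    [1.00, 0.00, 0.00],
    [0.00, 1.00, 0.00],
    [0.00, 0.00, 1.00],
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
])
c_a, c_d = hk_coefficients(probs[:, 0], probs[:, 1], probs[:, 2])

# Noise-free phenotypes built from known effects, so the fit should recover them.
y = 10.0 + 2.0 * c_a + 0.5 * c_d
mean_hat, a_hat, d_hat = regress_on_coefficients(y, c_a, c_d)
```

With this sign convention a positive additive estimate means that alleles of Piau origin increase the trait. Fixed effects and covariates are omitted here to keep the sketch short.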
The following statistical model was adopted:

$$y_{ijk} = S_i + L_j + H_k + (C_{ijk} - \bar{C})\,b + c_a a + c_d d + e_{ijk} \qquad (1)$$

where $y_{ijk}$ = phenotype; $S_i$ = fixed effect of sex $i$; $L_j$ = fixed effect of batch $j$, $j$ = 1, 2, 3, 4, 5; $H_k$ = fixed effect of the halothane genotype $k$, $k$ = 1 (NN), 2 (Nn); $(C_{ijk} - \bar{C})\,b$ = adjustment for covariates; $c_a$ and $c_d$ = additive and dominance coefficients of the putative QTL; $a$ and $d$ = additive and dominance effects of the QTL; and $e_{ijk}$ = residual error.
The halothane genotype was included as a fixed effect since Band et al. (2005a,b) reported significant effects of the Hal 1843 mutation on performance, carcass, and meat quality traits in our population. Carcass weight at slaughter was included as a covariate for carcass and internal organ traits; age at slaughter was included for meat quality traits. Litter size was included as a covariate for birth weight; litter size at weaning was included for weight at 21, 42, 63, 77, 105 days, and slaughter weight; weight at 77 days was included for feed conversion, feed intake and average daily gain.
The F ratio was calculated at each position, comparing the model with a QTL to the equivalent model without QTL. Estimates for $a$ and $d$ were calculated at the best estimated position with the highest F ratio. The additive fraction of phenotypic variance ($h^2_Q$) in the F2 generation explained by a given QTL was computed according to Pérez-Enciso et al. (2000). The conditional probability functions of the QTL given the genotype of the markers ($c_a$ and $c_d$) were estimated according to Haley et al. (1994).
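The F-ratio scan can be illustrated with a short sketch. It assumes a standard nested-model F test with two numerator degrees of freedom (one each for the additive and dominance terms). The helper names, the simulated coefficients, and the three map positions are hypothetical and serve only to make the example runnable. The coefficients are drawn independently per position, which is unrealistic for linked positions but keeps the toy data simple.

```python
import numpy as np

def residual_ss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def f_ratio(y, X_null, c_a, c_d):
    """F ratio of the model with a QTL (c_a, c_d added) against the model without."""
    X_full = np.column_stack([X_null, c_a, c_d])
    rss_null = residual_ss(X_null, y)
    rss_full = residual_ss(X_full, y)
    df_qtl = 2  # additive and dominance terms
    df_resid = len(y) - np.linalg.matrix_rank(X_full)
    return ((rss_null - rss_full) / df_qtl) / (rss_full / df_resid)

def scan(y, X_null, coef_by_pos):
    """Return the position with the highest F ratio and all F ratios."""
    f_by_pos = {pos: f_ratio(y, X_null, ca, cd) for pos, (ca, cd) in coef_by_pos.items()}
    best_pos = max(f_by_pos, key=f_by_pos.get)
    return best_pos, f_by_pos

# Simulated example: 200 animals, three positions (cM), true QTL at 40 cM.
rng = np.random.default_rng(0)
n_animals = 200
coef_by_pos = {
    pos: (rng.uniform(-1.0, 1.0, n_animals), rng.uniform(0.0, 1.0, n_animals))
    for pos in (20, 40, 60)
}
y = 5.0 + 3.0 * coef_by_pos[40][0] + rng.normal(0.0, 0.1, n_animals)
X_null = np.ones((n_animals, 1))  # intercept only; fixed effects would be added here

best_pos, f_by_pos = scan(y, X_null, coef_by_pos)
```

In a real analysis the null design matrix would also carry the fixed effects and covariates of the model, and significance thresholds would be set at the chromosome-wide or genome-wide level rather than read from a single F distribution.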

Statistical analysis - SNP effect estimates
In the GWA analysis the phenotypic outcomes, $y_i$ ($i = 1, 2, \ldots, 345$), were regressed on marker covariates $x_{ik}$ ($k = 1, 2, \ldots, 237$), and the same fixed effects and covariates were used as in the linkage analysis, following the regression model proposed by Meuwissen et al. (2001):

$$y_i = \mu + Z_i f_i + \sum_{k=1}^{237} x_{ik} b_k + e_i \qquad (2)$$

where $y_i$ is the phenotypic observation of animal $i$, $\mu$ is the general mean, $f_i$ is a set of fixed effects relative to each animal $i$ with incidence matrix $Z_i$, $b_k$ is the effect of marker $k$, and $e_i$ is the residual term, $e_i \sim N(0, \sigma_e^2)$. In this model, $x_{ik}$ takes the values 0, 1, and 2 for the SNP genotypes AA, Aa, and aa at each locus $k$, respectively. In matrix notation, the GWA model can be rewritten as:

$$\mathbf{y} = \mathbf{1}\mu + \mathbf{Z}\mathbf{f} + \sum_{k=1}^{237} \mathbf{X}_k b_k + \mathbf{I}\mathbf{e} \qquad (3)$$

where $\mathbf{1}$, $\mathbf{I}$, and $\mathbf{Z}$ are, respectively, a unit vector, an identity matrix, and a fixed effect incidence matrix ($\mathbf{Z}_{345 \times N_f}$), with $N_f$ being the number of fixed effects; $\mathbf{y} = [y_1, y_2, \ldots, y_{345}]'_{345 \times 1}$, $\mathbf{X}_k = [x_{1k}, x_{2k}, \ldots, x_{345k}]'_{345 \times 1}$, and $\mathbf{e} = [e_1, e_2, \ldots, e_{345}]'_{345 \times 1}$.
Solutions from model 3 were obtained using BLR (de los Campos et al., 2009), assuming that each locus explains its own amount of the variation. The BLR is a penalized Bayesian regression procedure whose general estimator is given by

$$\hat{\mathbf{b}} = \underset{\mathbf{b}}{\arg\min} \left\{ \sum_{i=1}^{345} \left( y_i - \mu - Z_i f_i - \sum_{k=1}^{237} x_{ik} b_k \right)^2 + \lambda \sum_{k=1}^{237} |b_k| \right\}$$

where $\lambda$ is the regularization parameter. When $\lambda = 0$ there is no regularization, and when $\lambda > 0$ there is a shrinkage of the marker effects toward zero, with the possibility of setting some redundant effects ($b$'s) identically equal to zero, resulting in a simultaneous estimation and variable selection procedure. The BLR package (de los Campos et al., 2009; Pérez et al., 2010) of R (R Development Core Team, 2011) was used, which assumes that the joint prior distribution of the marker effects is

$$p(b_1, \ldots, b_{237} \mid \sigma_e^2, \tau_1^2, \ldots, \tau_{237}^2) = \prod_{k=1}^{237} N(b_k \mid 0, \sigma_e^2 \tau_k^2)$$

where $\sigma_e^2$ is the residual variance, with a scaled inverse $\chi^2$ prior distribution, and $\tau_k^2$ is the scale parameter related to each marker. In turn, the BLR also assumes that the joint prior distribution for the scale parameters $(\tau_1^2, \tau_2^2, \ldots, \tau_{237}^2)$ is the product of Exponential distributions, and that the prior distribution of $\lambda$ is a Gamma distribution, $G(\nu_1, \nu_2)$. The BLR was implemented using 50,000 MCMC iterations with a burn-in equal to 10,000 iterations. Chain length was validated for each MCMC separately using Geweke convergence diagnostics implemented in the BOA package (Bayesian Output Analysis, Smith, 2007).
Large-effect SNPs were identified for each trait as those SNPs whose absolute estimated effects were within the top 5%. The genome positions of these markers with large effects were used to analyze overlap with QTL regions.
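The effect of the regularization parameter can be illustrated with a small sketch. Note the substitution: this is a plain (frequentist) LASSO fitted by coordinate descent, not the MCMC sampler of the BLR package, and it minimises the residual sum of squares plus lambda times the sum of absolute marker effects. All data are simulated, and the function names, sample sizes, and lambda values are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; returns exactly zero when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimise sum((y - X b)^2) + lam * sum(|b|) for centred X and y."""
    n_markers = X.shape[1]
    b = np.zeros(n_markers)
    col_ss = (X ** 2).sum(axis=0)
    resid = y.copy()  # residuals for the starting value b = 0
    for _ in range(n_sweeps):
        for k in range(n_markers):
            resid += X[:, k] * b[k]  # remove marker k from the current fit
            rho = X[:, k] @ resid
            b[k] = soft_threshold(rho, lam / 2.0) / col_ss[k]
            resid -= X[:, k] * b[k]  # put the updated marker k back
    return b

# Simulated genotypes coded 0, 1, 2 for 100 animals and 10 markers.
rng = np.random.default_rng(1)
n_animals, n_markers = 100, 10
genotypes = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)
X = genotypes - genotypes.mean(axis=0)
true_b = np.zeros(n_markers)
true_b[[2, 7]] = [1.5, -1.0]  # only two markers have real effects
y = X @ true_b + rng.normal(0.0, 1.0, n_animals)
y = y - y.mean()

b_unpenalised = lasso_coordinate_descent(X, y, lam=0.0)  # ordinary least squares
b_lasso = lasso_coordinate_descent(X, y, lam=40.0)       # shrunken estimates
b_all_zero = lasso_coordinate_descent(X, y, lam=1e9)     # penalty dominates
```

With lambda equal to zero the estimates coincide with ordinary least squares. As lambda grows the estimates shrink, and a sufficiently large lambda sets every effect to exactly zero. In the Bayesian version lambda is not fixed by the user but is given a prior and sampled along with the other parameters.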
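The selection of large-effect SNPs and the overlap check can be sketched as below. This is one possible reading of the procedure, with assumptions that are not stated in the text: the number of top SNPs is rounded up, ties are broken by sort order, and the check is run per chromosome with positions in cM. The function names and all values in the example are hypothetical.

```python
import numpy as np

def top_fraction_mask(effects, fraction=0.05):
    """Boolean mask of the SNPs whose absolute effects are in the top fraction."""
    abs_eff = np.abs(np.asarray(effects, dtype=float))
    n_top = max(1, int(np.ceil(fraction * abs_eff.size)))  # assumption: round up
    order = np.argsort(-abs_eff)  # indices from largest to smallest |effect|
    mask = np.zeros(abs_eff.size, dtype=bool)
    mask[order[:n_top]] = True
    return mask

def snps_in_interval(positions_cm, mask, ci_start, ci_end):
    """Indices of selected SNPs lying inside a QTL confidence interval."""
    pos = np.asarray(positions_cm, dtype=float)
    inside = (pos >= ci_start) & (pos <= ci_end)
    return np.flatnonzero(mask & inside)

# Hypothetical chromosome with 20 SNPs spaced 5 cM apart.
positions_cm = np.arange(20) * 5.0
effects = np.full(20, 0.001)
effects[7] = -0.9  # one SNP with a large negative effect, at 35 cM

mask = top_fraction_mask(effects, fraction=0.05)
hits_inside = snps_in_interval(positions_cm, mask, ci_start=30.0, ci_end=40.0)
hits_outside = snps_in_interval(positions_cm, mask, ci_start=0.0, ci_end=10.0)
```

Ranking on absolute values matters because an allele substitution effect can be negative. The sign depends only on which allele is counted.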
Fine mapping on the pig genome

QTL analysis
In total, 32 QTL were detected that surpassed the 5% chromosome-wide significance level (CWL) (Table 2). Of these 32 QTL, 12 surpassed the 1% CWL and 7 the 5% genome-wide level (GWL). The 32 QTL included 7 that were found in previous studies on this F2 population and 25 novel QTL that were detected by applying linkage analysis with the increased marker coverage.

Confirmation of previously known QTL
None of the previously found QTL on SSC1 surpassed the 5% CWL in the current analysis. The QTL for HEART on SSC4 was confirmed at the 1% CWL and explained 3.81% of phenotypic variance. On SSC7 the QTL for AF was significant at the 5% CWL, explaining 4.06% of the phenotypic variance. The QTL for CLBRA and CLUSA (SSC7) were significant at the 5% GWL and explained 6.01% and 6.34% of the phenotypic variance, respectively. On SSC8, the QTL for LBF was significant at the 5% GWL, explaining 7.30% of the phenotypic variance. On SSC17, the QTL for W77 was significant at the 1% CWL and explained 2.68% of the phenotypic variance. On SSCX, the QTL for A was significant at the 5% CWL and explained 3.30% of the phenotypic variance. The peaks of the QTL confirmed in the current analysis deviated on average by 33.8 cM from the QTL positions obtained in the previous analysis.

Table 2 footnotes: *, ** and *** significant at the 5% chromosome-wide level, 1% chromosome-wide level, and at the 5% genome-wide level, respectively. Positive additive effects indicate that Piau alleles increased the trait and negative, that commercial alleles increased it. 1 CLBRA, carcass length by the Brazilian carcass classification method; 2 CLUSA, carcass length by the American carcass classification method; 3 LBF, midline lower backfat thickness above the last lumbar vertebrae; 4 PBF, midline backfat thickness between last and penultimate lumbar vertebrae; 5 LRBF, midline backfat thickness immediately after the last rib; 6 SBF, higher backfat thickness on the shoulder region; 7 TRIMHW, trimmed ham weight; 8 TBSW, total Boston shoulder weight; 9 TRIMBSW, trimmed Boston shoulder weight.

New QTL detected
On average there were four new QTL detected on each of the targeted chromosomes. On SSC1 a QTL for LEA was found surpassing the 5% GWL, explaining 5.60% of the phenotypic variance. In addition three new QTL were found surpassing the 5% CWL on SSC1 for SA, BW, and DL, explaining 5.03%, 8.72%, and 0.60% of the phenotypic variance, respectively. On SSC4 a QTL for LUNG was found surpassing the 5% GWL, explaining 6.54% of the phenotypic variance. In addition three new QTL were found surpassing the 5% CWL on SSC4 for CL, SF and LBF, explaining 0.07%, 1.08%, and 4.97%, respectively. On SSC7 a single new QTL was found for SF that was significant at the 5% CWL, explaining 0.04% of the phenotypic variance. On SSC8 six new QTL were found at different thresholds. Two QTL for AF and SBF were significant at the 5% GWL, explaining 3.97% and 4.55% of the phenotypic variance, respectively. Two additional QTL for LIVER and SIL were significant at the 1% CWL, explaining 3.90% and 3.83% of the phenotypic variance, respectively. The final two QTL on SSC8 for PBF and LRBF were significant at the 5% CWL and explained 5.23% and 2.45%, respectively. On SSC17 one QTL for BW was significant at the 1% CWL, explaining 0.42% of the phenotypic variance, and three additional QTL for TN, W63, and pH24 were significant at the 5% CWL, explaining 1.47%, 1.50%, and 1.41% of the phenotypic variance, respectively. On SSCX all six new QTL for SIL, BCW, LW, TRIMHW, TBSW and TRIMBSW were significant at the 5% CWL and respectively explained 3.40%, 5.61%, 4.82%, 3.42%, 4.16%, and 4.04% of the phenotypic variance.

Confidence interval
Of the 32 QTL described in the current analysis, most (23 QTL) were mapped with a 95% confidence interval of 10 cM or less. Only 4 of the QTL were mapped with a confidence interval larger than 20 cM. Confidence intervals of QTL that were detected previously and confirmed in this study were reduced by 23.9 cM on average (Figure 1).

Top 5% SNPs with the largest effect
Six of the seven QTL that were significant at the 5% GWL had at least one top 5% SNP within their QTL confidence intervals. For 11 of the 32 QTL described in the current analysis at least one marker from the top 5% SNPs was located within their QTL confidence interval. Seven of these 11 QTL harbored exactly one of the top 5% SNPs: QTL affecting CL, LUNG and HEART on SSC4, AF and CLBRA on SSC7, LBF on SSC8, and BCW on SSCX. The other four QTL each harbored exactly two top 5% SNPs within their confidence interval: QTL affecting LEA on SSC1, CLUSA on SSC7, and AF and SBF on SSC8.

Discussion
A QTL mapping study was carried out, and QTL confidence intervals were inspected to determine whether they harbored any of the top 5% SNPs identified by the Bayesian LASSO method. Using the QTL regression approach, 32 QTL were detected at the 5% CWL using a combined genetic map. Eight of these had not been reported in the consulted literature: SF and LUNG on SSC4; SBF, PBF and LRBF on SSC8; BW on SSC17; SIL and BCW on SSCX (PigQTLdb). Compared to previous studies that used the same F2 population and relied only on microsatellite markers (Paixão et al., 2008, 2012, 2013; Silva et al., 2008; Sousa et al., 2011), only seven of 40 QTL were confirmed in the present analysis. For these confirmed QTL the confidence intervals were narrowed down on average by 23.9 cM using the dataset with increased marker coverage. Given the increase in power in the present study, from adding SNP markers to the existing microsatellite map, we infer that QTL that were not confirmed in the current study are most likely false positives.

Confirmation of previously known QTL
On SSC4 a QTL associated with HEART was confirmed. The new analysis positions the QTL in the same interval, between the S0001 and S0217 microsatellite markers, but Silva et al. (2008) reported a much larger confidence interval (68 cM) than obtained in the present study (9 cM), showing an increase in mapping precision.
On SSC7, QTL were confirmed for AF, CLBRA and CLUSA, previously reported in this population by Sousa et al. (2011). The QTL for AF and CLBRA were located between microsatellite markers S0064 and S0102, whereas the QTL for CLUSA was located in the neighboring interval, between microsatellite markers S0102 and SW252. The current results place the SSC7 QTL in the same microsatellite intervals as before, but again with much smaller confidence intervals. A QTL on SSC7 was found by Mikawa et al. (2011), affecting the number of vertebrae, which would increase carcass length. The marker SW252 also flanks a QTL in their study, and the vertnin gene is the suspected cause of variation in the number of vertebrae in commercial populations. This gene could affect other traits, as there are other QTL near this region; one example is the QTL for AF that we confirmed. The QTL affecting CLBRA and CLUSA were detected close together on SSC7, suggesting that the same gene possibly affects these two similar traits. The Piau alleles at CLBRA and CLUSA were associated with longer carcasses, which was unexpected, as increased carcass length would be expected from the larger commercial breed. Such sources of cryptic variation are, however, known to exist (Abasht et al., 2006).
On SSC8 the QTL for LBF detected by Sousa et al. (2011) was confirmed. The QTL was located between the microsatellite markers SW905 and S0017. The current study narrowed the confidence interval from 12 cM to 8 cM, which is not as dramatic as for some other QTL, but this is mainly because the interval was already quite narrow. Piau alleles were associated with higher values of backfat thickness as measured by LBF, which is expected since Piau is a breed with a high level of fatness.
On SSC17 the QTL for W77 detected by Paixão et al. (2008) was also confirmed. The QTL was located between the microsatellite markers S0359 and SW2427. The original confidence interval of 35 cM was dramatically reduced to 2 cM in the present study. Piau alleles were associated with higher W77, which was not expected from the phenotypic means of the grand parental populations, and like the results for carcass length on SSC7 this again indicates the presence of cryptic variation.
On SSCX a QTL associated with meat color, A, identified by Paixão et al. (2012) was confirmed. The peak of the QTL remained located between markers SW1943 and S0218. Our confidence interval was much smaller (9 cM) than theirs (33 cM). Piau alleles were related to an increase in A, following the expectation, as values for A were higher in Piau than in the commercial breed.

New QTL detected
On SSC1 four new QTL were detected from this resource family. The QTL affecting SA was considered a new QTL because the SA QTL detected by Paixão et al. (2013) on this chromosome was located at a different position, in a different marker interval. We detected a QTL affecting BW near a QTL previously reported by Knott et al. (1998) and Beeckmann et al. (2003), located in the proximal region of the chromosome. A QTL associated with DL, as detected here, was also previously reported, but at a different position (e.g. Ponsuksili et al., 2008). The QTL affecting LEA may coincide with the one detected by Malek et al. (2001) and Grapes and Rothschild (2006). Although it was considered a cryptic effect, alleles originating from the Piau breed were shown to increase LEA. This corresponds with results in the literature, where alleles from the Berkshire, which is considered a fatter breed, were found to increase growth and leanness (Malek et al., 2001; Grapes and Rothschild, 2006).
On SSC4, QTL for SF and LUNG were detected for which we did not find any previous reports in the literature. The QTL affecting CL was also detected by Große-Brinkhaus et al. (2010) in a Duroc x Pietrain cross. The QTL for LBF that we found was also detected by Silva et al. (2008) and Malek et al. (2001). Piau alleles were estimated to increase backfat thickness as expected. On SSC7 we detected a QTL affecting SF, for which a QTL has also been reported by Edwards et al. (2008) on this chromosome, but at a different location.
On SSC8, QTL for SBF and PBF have not previously been reported. Other QTL have been mapped before: the one affecting LIVER was mapped in the same interval by Beeckmann et al. (2003), and QTL associated with AF (Knott et al., 1998; Sousa et al., 2011) and LRBF (Fan et al., 2011) have also been reported. For all backfat traits, the Piau alleles increased backfat thickness. The QTL associated with SIL was previously reported by Knott et al. (1998) in a cross between European wild pigs and Large White, by Gao et al. (2010) in a Duroc x Erhualian cross, and by Sousa et al. (2011) in the current reference population. These other reported QTL were located at some distance from the current QTL. The commercial breed alleles were associated with longer SIL (0.48), supporting the hypothesis that small intestine length increased in response to selection and domestication, as proposed by Andersson et al. (1994).
On SSC17, a QTL affecting BW was detected that has not been reported before. QTL associated with TN (Guo et al., 2008), with W63, and with pH24 (Wimmers et al., 2007) were previously reported, but in different chromosomal regions. BW is increased by the Piau alleles, and commercial breed alleles increase TN, which were expected effects in view of the higher fatness of the Piau breed and higher number of piglets per litter in commercial breeds. Piau alleles were related to an increase in pH24, following the expectation of higher pH24 for Piau than for commercial breeds.
On SSCX a QTL for BCW was detected at 45 cM, and a QTL for SIL at 106 cM for which no previous reports were found in the literature. Piau alleles were associated with longer SIL, which was against expectation (Andersson et al., 1994), and different from the other QTL effects detected for SIL in this study. Three of the four remaining new QTL, for LW, TRIMHW, and TRIMBSW, were previously reported by Milan et al. (2002). The QTL for TRIMHW was also reported by Cepica et al. (2003) who additionally reported a QTL for TBSW. The estimated additive effect of the QTL affecting BCW, TRIMBSW and TBSW implied that Piau alleles increase the phenotype for these traits, and that LW and TRIMHW are increased by commercial breed alleles.
The Piau breed has never undergone strong selection for lean growth, as is common in current commercial breeding programs, explaining the higher carcass fatness of the Piau. For QTL related to fatness, such as QTL for backfat on SSC8, for AF on SSC8, and for BCW on SSCX, the Piau breed alleles were expected to result in more fat. For QTL related to growth and meat weight, like LW and TRIMHW on SSCX, the Piau breed alleles were expected to result in less growth. Many of the new QTL detected in this study did not follow this expectation. Instead, many new QTL showed cryptic effects where the alleles of the Piau breed increased growth or decreased fatness. QTL with cryptic effects were BW and LEA on SSC1, BW and W63 on SSC17, and TRIMBSW, TBSW and SIL on SSCX. While cryptic QTL effects are unexpected, they are not uncommon. In other studies on pigs (Yue et al., 2003), as well as studies on different species (Abasht et al., 2006), cryptic QTL effects have been shown.
The length of the intestine, for which a QTL was found on SSCX (SIL), is an important factor affecting the potential to grow, possibly by influencing the nutrient absorption efficiency and digestion (Gao et al., 2010). It was expected that alleles from commercial breeds would be associated with longer intestine length, but the opposite was found. A similar cryptic allelic effect was also reported by Gao et al. (2010) for intestine length on a different chromosome, SSC7, using a White Duroc x Chinese Erhualian intercross resource population. In addition to cryptic QTL, other QTL were found by Gao et al. (2010), where the alleles for higher intestine length came from the commercial White Duroc breed. We speculate that alleles from local breeds cause an increase in SIL due to their adaptation to low quality feed, which requires better digestion and higher absorption efficiency. This can be achieved through the longer digestion time provided by a longer small intestine.

Marker effects
Eleven out of the 32 QTL confidence intervals covered at least one of the top 5% SNPs from the BLR analysis. Six of the seven QTL that surpassed the 5% GWL each contained at least one top 5% SNP within their confidence intervals. The only genome-wide significant QTL without a top 5% SNP was found on SSC4 for LUNG (CI = 65 cM - 74 cM). Nonetheless, the marker ALGA0025795, located at 70 cM, had the largest effect in the region (0.00024) and was immediately below the significance threshold for inclusion in the top 5%. Out of the remaining 25 QTL that were significant at the 5% CWL, only four contained a top 5% SNP within their confidence intervals. The smaller proportion of overlap with a top 5% SNP for the chromosome-wide significant QTL is probably due to the smaller amount of variance explained by these chromosome-wide significant QTL. The overlap between results from linkage mapping and effects of individual markers based on association analysis corroborated to some extent the QTL found by the two models, especially for the genome-wide significant QTL. Given the increase in power in the present study compared to the previous analyses of this resource population, we infer that QTL that were found in previous studies, but not in the current study, were false positives. On the other hand, there is also a chance that new QTL detected at CWL, most of which did not present a top 5% SNP within their confidence interval, are also false positives.
In summary, the addition of more markers and animal genotypes increased the statistical power for QTL detection compared to previous studies and led to QTL with much smaller confidence intervals. Seven previously discovered QTL were confirmed, 25 novel QTL were identified, and 33 QTL that were detected in previous studies were lost. Most of the genome-wide significant QTL contained at least one of the top 5% SNP effects estimated by the Bayesian approach, corroborating the QTL found by the regression method and showing that both models can be used to refine QTL mapping results. With decreasing SNP genotyping costs, updating existing QTL studies with low density SNP genotypes can be a fruitful approach to improve statistical power to detect QTL and reduce confidence intervals.