Identification of selection signatures in livestock species.

The identification of regions that have undergone selection is one of the principal goals of theoretical and applied evolutionary genetics. Such studies can also provide information about the evolutionary processes involved in shaping genomes, as well as physical and functional information about genes/genomic regions. Domestication followed by breed formation and selection schemes has allowed the formation of very diverse livestock breeds adapted to a wide variety of environments and with special characteristics. The advances in genomics in the last five years have enabled the development of several methods to detect selection signatures and have resulted in the publication of a considerable number of studies involving livestock species. The aims of this review are to describe the principal effects of natural/artificial selection on livestock genomes, to present the main methods used to detect selection signatures and to discuss some recent results in this area. This review should be useful also to research scientists working with wild animals/non-domesticated species and plant biologists working with breeding and evolutionary biology.


Introduction
Selection tends to cause specific changes in the patterns of variation among selected loci and in neutral loci linked to them. These genomic footprints left by selection are known as selection signatures and can be used to identify loci subjected to selection (Kreitman, 2000). The recent availability of genomic information on domestic animal species and the development of improved statistical tools make the identification of these footprints in a given species possible (International Chicken Genome Sequencing Consortium, 2004;The Bovine Genome Sequencing and Analysis Consortium, 2009;The International Sheep Genomics Consortium, 2010;Groenen et al., 2012;Dong et al., 2013).
The identification of selection signatures is currently one of the principal interests of evolutionary geneticists because it can provide information ranging from basic knowledge about the evolutionary processes that are shaping genomes to functional information about genes/genomic regions (Nielsen, 2001, 2005; Schlötterer, 2003). For example, if a region that was not previously identified as contributing to any special trait in mapping experiments is targeted by selection in a specific population, then this information could lead to an initial inference about the functional characteristics of that region. This approach could also lead to the identification of genes related to ecological traits (e.g., genes related to tropical adaptation) that are difficult to identify through laboratory experiments and may also be useful in corroborating quantitative trait loci (QTL) mapping experiments in production animals. The final and certainly most ambitious aim of these studies is to identify the causal mutations that confer a selective advantage in a specific population or species (Nielsen, 2001; Schlötterer, 2003; Hayes et al., 2008).
Domestication greatly changed the morphological and behavioral characteristics of modern domestic animals and, along with breed formation and selection schemes for improving the production of specific products or achieving a morphological/behavioral standard, allowed the formation of very diverse modern breeds (Diamond, 2002;Toro and Mäki-Tanila, 2007;Flori et al., 2009). These features, along with extensive knowledge about genomic regions that affect economically important traits and recent advances in the field of genomics, provide an excellent opportunity for identifying loci subjected to selection and for the validation of new methods developed to detect selection signatures (Hayes et al., 2008;Flori et al., 2009).
In this review, we describe the effects of natural/artificial selection on genomes, summarize the main methods of detecting the footprints of selection and, finally, indicate and discuss studies aimed at detecting selection signatures in livestock.

Natural and Artificial Selection
Natural selection is a phenomenon driven by the environment in which individuals with specific genotypes have a differential capacity for contributing to the next generation's gene pool (Falconer and Mackay, 1996;Templeton, 2006;Driscoll et al., 2009). Natural selection could basically act in three ways: positive selection, purifying selection (also known as negative or background selection) and balancing selection. Each form of selection is a response to environmental pressure and acts differentially to alter the allelic and genotypic frequencies (Harris and Meyer, 2006;Oleksyk et al., 2010).
Positive selection occurs when a newly arisen mutation has a selective advantage over other mutations and, therefore, increases in frequency in the population (Kaplan et al., 1989). In purifying selection, the disadvantageous variants that appear in the population tend to be removed, thereby maintaining the functional integrity of DNA sequences (Charlesworth et al., 1993). Balancing selection occurs when polymorphism is favored, leading to increased genetic variability. Several biological processes can be grouped in this type of selection, e.g., overdominant selection (in which the heterozygote has a selective advantage), frequency-dependent selection (in which the fitness of an allele depends on its frequency in the population) and temporally or spatially heterogeneous selection (in which different alleles are favored at different times or in different places) (Charlesworth, 2006).
In contrast to natural selection, artificial selection (also called selective breeding) is a human-mediated process in which the gene pool of the next generation does not depend exclusively (or necessarily) on fitness components, but also on traits chosen by humans. Artificial selection can be classified as unconscious selection or methodical selection. The former occurs when there is no long-term objective, and this has been suggested as the cause of the early domestication process. The latter occurs when a standard or objective drives the choice of parents for the next generation. Despite these differences, and although the time frame in which these changes occur is often considerably different, the genetic consequences of natural and artificial selection are essentially the same (Avise and Ayala, 2009; Driscoll et al., 2009; Gregory, 2009).

Selection Signatures
The occurrence of selection creates departures from the neutral theory expectations in the patterns of molecular variation. Each form of selection causes specific changes in the selected loci and in neutral loci linked to them (Kreitman, 2000). When positive selection operates in a newly arisen allele that has a selective advantage it tends to increase in frequency in the population and carries linked neutral alleles along with it. This phenomenon is known as the hitchhiking effect or selective sweep (Maynard-Smith and Haigh, 1974;Charlesworth, 2007). The selective sweep reduces the heterozygosity of regions surrounding the selected locus (Kaplan et al., 1989;Kim and Stephan, 2002) and introduces a skew in the site frequency spectrum (SFS) because of an excess of rare variants in the selected region (Braverman et al., 1995;Kim and Stephan, 2002).
An increase in the average linkage disequilibrium (LD) leading to long haplotypes is also expected in the region surrounding the selected site (Kim and Stephan, 2002). As LD decays and high frequency neutral alleles become fixed in the population after fixation of the selected mutation, this selection signature vanishes rapidly (Przeworski, 2002; Kim and Nielsen, 2004; McVean, 2007). Thus, a high frequency derived allele surrounded by long-range LD is indicative of a recent selective sweep (Sabeti et al., 2002; Voight et al., 2006). In addition, the levels of within-population diversity tend to decrease while the between-population levels of diversity tend to increase in the region surrounding the selected locus (Beaumont, 2005; Storz, 2005). Furthermore, the number of nonsynonymous substitutions per nonsynonymous site (dN) tends to be higher than the number of synonymous substitutions per synonymous site (dS) (Nei, 2005; Harris and Meyer, 2006).
The model of selective sweep in which a newly arisen allele with a strong selective advantage increases quickly in frequency until reaching fixation is known as "hard sweep". In contrast, when the selected allele is part of existent genetic variation, it causes a "soft sweep" in which the footprint left by selection tends to be less pronounced and the frequency of the selected allele at the beginning of the selected phase is the crucial factor influencing the selective sweep (Przeworski et al., 2005;Pritchard et al., 2010).
Balancing selection favors the maintenance of polymorphism (Harris and Meyer, 2006; Oleksyk et al., 2010). The persistence of the same alleles for a long time is known as long-term balancing selection and, in addition to maintaining polymorphism in the selected locus, it also tends to increase diversity in tightly linked neutral sites; if the region under selection has low recombination rates then it generally also has longer coalescence times than other regions (Charlesworth, 2006). In the presence of long-term balancing selection, the within-population diversity levels tend to increase and the between-population levels of diversity tend to decrease (Navarro and Barton, 2002; Charlesworth et al., 2003; Charlesworth, 2006), leading to reduced fixation index (FST) values among populations compared to neutral expectations (Beaumont, 2005; Storz, 2005). However, in some cases, the FST levels may be higher than expected by neutrality (Beaumont, 2005; Charlesworth, 2006).
When negative (background) selection occurs, the novel variants are disadvantageous and are consequently removed from the population, along with neutral variations linked to them (Innan and Stephan, 2003). If the recombination rate in the region is restricted or the population is highly inbred then background selection reduces the variability around the eliminated sites (Charlesworth et al., 1993, 1995; Andolfatto, 2001; Stephan, 2010). An excess of low frequency alleles is also observed in small to moderately sized populations (Charlesworth et al., 1993, 1995) and the number of nonsynonymous substitutions per nonsynonymous site tends to be lower than the number of synonymous substitutions per synonymous site (Nei, 2005; Harris and Meyer, 2006). However, in regions with normal recombination rates, or when inbreeding is restricted, no reduction in variability is observed (Charlesworth et al., 1993, 1995; Stephan, 2010). Furthermore, background selection does not cause a marked bias in the frequency spectrum (Charlesworth et al., 1993, 1995; Kim and Stephan, 2000; Andolfatto, 2001; Stephan, 2010).
Selection signatures can be influenced by several factors, such as the type of selection, the relative age of the neutral linked alleles, the strength of selection and the recombination rate (Braverman et al., 1995; Kaplan et al., 1989; Kim and Stephan, 2002; Charlesworth, 2007; McVean, 2007). Recognition of the molecular footprints left by different types of selection is a crucial task in identifying genomic regions subjected to selection. In this case, the neutral theory serves as the backbone for the statistical tests developed to detect selection signatures. However, in natural populations, some assumptions of the neutral theory can be violated (e.g., by population expansion, subdivision and bottlenecking) and this can lead to signals that mimic the footprints of selection. The interaction of different types of selection and the interaction between selection and demographic factors can bias the footprints left in the genome (Barton, 1998; Kim and Stephan, 2000; Kreitman, 2000; Charlesworth et al., 2003; Harris and Meyer, 2006; Toro and Mäki-Tanila, 2007). Because of this, it is worth noting that in studies designed to detect selection signatures in livestock a considerably high rate of false positives is expected as a result of genetic drift and founder effects, both of which were particularly important during the development of livestock breeds (Petersen et al., 2013).

Methods for Detecting Selected Loci
The methods proposed for detecting selected loci can be classified in different ways (Harris and Meyer, 2006;Oleksyk et al., 2010). Based on the main variables that affect the patterns of molecular variation left by selection, Hohenlohe et al. (2010) proposed a decision tree designed to identify the most appropriate method for each case. This decision tree is based primarily on the time scale in which selection can occur, but also considers other factors (e.g., the number of populations in the study, mode of selection, etc.) and can be used by researchers in studies designed to detect selection signatures.

Tests based on synonymous and non-synonymous substitution rates
When the coding sequences of orthologous genes of interest are compared, it is expected that under neutral evolution, dN/dS = 1. When positive selection is in effect, dN/dS > 1, and under negative selection, dN/dS < 1. Differences in dN/dS are also expected among lineages when selection is in effect (Yang, 1998). Several methods have been proposed to estimate dN and dS (Nei, 2005). These methods were initially approximations based on the comparison of two sequences (Nei and Gojobori, 1986). More recently, maximum likelihood estimates from multiple alignments that account for transition/transversion rate bias, codon usage bias, selective constraints at the protein level (Goldman and Yang, 1994), and variable dN/dS among sites and among lineages have been proposed (Yang et al., 2000; Yang, 2002; Yang and Nielsen, 2002; O'Brien et al., 2009). Hypothesis testing can be done using a likelihood ratio test that compares the null model (assuming neutrality) with alternative models (Yang, 1998; Yang and Nielsen, 1998; Yang et al., 2000, 2005). Packages such as MEGA (Tamura et al., 2007) and PAML (Yang, 2007) implement the dN/dS selection tests.
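The likelihood ratio comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two models are nested, the chi-square approximation applies, and the log-likelihoods have already been obtained from a codon-model package. The function names and the numeric values in the usage note are hypothetical and do not come from any of the cited studies.

```python
import math

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Compare two nested models via twice the log-likelihood difference.

    Returns (statistic, p_value). Closed-form chi-square tails are used,
    so this sketch only handles 1 or 2 degrees of freedom.
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, df = 1
    elif df == 2:
        p_value = math.exp(-stat / 2.0)             # chi-square tail, df = 2
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p_value

def classify_omega(dn, ds):
    """Label a dN/dS point estimate; this is a description, not a test."""
    if ds == 0:
        return None  # ratio undefined without synonymous substitutions
    omega = dn / ds
    if omega > 1:
        return "positive"
    if omega < 1:
        return "negative"
    return "neutral"
```

For example, with made-up log-likelihoods of -1000.0 for the neutral model and -996.0 for an alternative model with two extra parameters, `likelihood_ratio_test(-1000.0, -996.0, 2)` gives a statistic of 8.0. The appropriate degrees of freedom depend on the pair of models being compared.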

Tests based on the frequency spectrum
The θ parameter can be estimated from DNA sequences in several ways, and comparison of the different estimates of θ is the basis for some tests aimed at identifying selected regions (Tajima, 1989; Fu and Li, 1993; Fu, 1996, 1997). Tajima (1989) proposed a test based on the difference between θπ (the average number of nucleotide differences) and θS (the number of segregating sites along the DNA sequence) because the presence of selection tends to alter the value of θπ while that of θS tends to remain unaffected (Tajima, 1989; Hartl and Clark, 2010). The proposed statistic (Tajima's D) corresponds to the standardized difference between θπ and θS (Tajima, 1989; Harris and Meyer, 2006). Under neutrality, the value of D tends to be zero. Positive and negative selection tend to reduce heterozygosity and cause an excess of rare variants surrounding the selected locus, leading to D < 0 (Kaplan et al., 1989; Tajima, 1989; Charlesworth et al., 1993, 1995; Braverman et al., 1995; Andolfatto, 2001; Kim and Stephan, 2002; Stephan, 2010). In contrast, long-term balancing selection increases the diversity around the selected locus, leading to D > 0 (Tajima, 1989; Navarro and Barton, 2002; Charlesworth, 2006). Several other tests for detecting selection based on the excess of rare alleles have been developed (Fu and Li, 1993; Fu, 1996, 1997). However, the results of these tests do not always have a straightforward biological interpretation because in some situations it is impossible to differentiate between positive and negative selection (Tajima, 1989; Harris and Meyer, 2006), and also because these tests are sensitive to demography (Tajima, 1989; Charlesworth et al., 1993; Fu and Li, 1993; Fu, 1996, 1997).
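As an illustration of the standardized difference just described, the following Python sketch computes Tajima's D from a small alignment using the constants given by Tajima (1989). It assumes equal-length, gap-free sequences and treats any column with more than one state as a single segregating site; the function name and the toy alignments in the usage note are hypothetical.

```python
import math
from itertools import combinations

def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences.

    Returns None when D is undefined (fewer than 4 sequences or no
    segregating sites).
    """
    n = len(seqs)
    if n < 4:
        return None
    length = len(seqs[0])
    # Number of segregating sites: columns with more than one state.
    seg_sites = sum(1 for j in range(length) if len({s[j] for s in seqs}) > 1)
    if seg_sites == 0:
        return None
    # Average number of pairwise nucleotide differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)
    ) / n_pairs
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    # Segregating-site estimate of theta is seg_sites / a1.
    return (pi - seg_sites / a1) / math.sqrt(variance)
```

A toy alignment in which every variant is a singleton, such as `["TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAA"]`, yields a negative D, whereas an alignment split evenly between two haplotypes yields a positive D, in line with the expectations stated above.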
While a reduction in heterozygosity and an excess of rare variants are not necessarily a specific pattern left by selection, an excess of high-frequency derived variants (non-ancestral alleles, determined using an outgroup) has been identified as a unique feature produced by positive selection (Fay and Wu, 2000). To assess this feature, Fay and Wu (2000) proposed the H statistic, which contrasts θπ with θH, an estimator of θ that gives more weight to high-frequency derived variants.
The decrease in variability caused by positive selection tends to be broken by recombination events. Consequently, "valleys" of reduced heterozygosity have been suggested to be footprints of recent hitchhiking events. The depth and extent of the "valleys" are influenced by several factors, such as the strength of selection, recombination rates and effective population size. Because of this, Kim and Stephan (2002) proposed a composite likelihood approach for detecting positive selection in a recombining chromosome. The test is based on the expected number of sites where the derived allele is part of a given frequency interval in the population. More recently, extensions of these tests based on the frequency spectrum around a selective sweep have been proposed. These new methods can deal with genomic data and account for the ascertainment bias (Kelley et al., 2006; Williamson et al., 2007).
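The derived-allele contrast of Fay and Wu (2000) can be illustrated with a short Python sketch. It assumes that the ancestral state of every segregating site has already been inferred from an outgroup and that the input is simply the number of sampled chromosomes carrying the derived allele at each site; the function name and input layout are hypothetical.

```python
def fay_wu_h(derived_counts, n):
    """Fay and Wu's H from derived-allele counts at segregating sites.

    derived_counts: number of chromosomes (1 .. n-1) carrying the derived
    allele at each segregating site; n: number of sampled chromosomes.
    """
    if n < 2:
        raise ValueError("at least two chromosomes are required")
    denom = n * (n - 1)
    # Pairwise-difference estimator of theta.
    theta_pi = sum(2.0 * i * (n - i) for i in derived_counts) / denom
    # Estimator weighted towards high-frequency derived alleles.
    theta_h = sum(2.0 * i * i for i in derived_counts) / denom
    return theta_pi - theta_h
```

With ten chromosomes and a single site at which nine carry the derived allele, `fay_wu_h([9], 10)` is negative, which is the direction expected when high-frequency derived variants are in excess.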

Tests based on linkage disequilibrium
Exploitation of the LD patterns is the focus of several tests for detecting selection (Sabeti et al., 2002, 2007; Kim and Nielsen, 2004; Voight et al., 2006; Kimura et al., 2007). However, these signatures tend to be transient since recombination tends to quickly break down this LD as soon as the selected locus reaches fixation (Przeworski, 2002; Kim and Nielsen, 2004; McVean, 2007). Sabeti et al. (2002) proposed an approach referred to as the long-range haplotype (LRH) test to detect recent selective sweeps by focusing on the relationship between the allele frequency and the LD level surrounding it.
This test starts with identification of the core haplotypes (through genotyping a set of single nucleotide polymorphisms (SNPs) in a region so small that recombination is unlikely to occur). Subsequently, other SNPs at increasing distances from the core haplotypes are analyzed to evaluate the decay of LD according to distance (Sabeti et al., 2002). The LD is measured at increasing distances from the core haplotypes through calculation of the extended haplotype homozygosity (EHH), which is the probability that two chromosomes carrying a specific core haplotype are homozygous for the whole region from the core to a distance x (Sabeti et al., 2002). The relative EHH (REHH) is then calculated to compare the decay of EHH of one specific core haplotype to the decay of EHH of all the other core haplotypes combined. To test for selection, the REHH and the frequency of each core haplotype are compared to the REHH and the frequency of the other core haplotypes. Positive selection is inferred if one core haplotype has a combination of high REHH and high frequency in the population (Sabeti et al., 2002).
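The EHH calculation can be sketched as follows. This is a simplified illustration, not the implementation used in the cited studies: it assumes phased haplotypes written as strings of marker alleles, and it uses a single core marker with two alleles rather than a multi-SNP core haplotype, so that the REHH reduces to the ratio between the EHH of one core allele and that of the other. All names are hypothetical.

```python
from collections import Counter
from math import comb

def ehh(haplotypes, core, allele, end):
    """Extended haplotype homozygosity for carriers of a core allele.

    haplotypes: phased haplotypes as equal-length strings; core: index of
    the core marker; allele: core allele of interest; end: index of the
    marker up to which homozygosity is evaluated.
    Returns None when fewer than two chromosomes carry the allele.
    """
    carriers = [h for h in haplotypes if h[core] == allele]
    if len(carriers) < 2:
        return None
    lo, hi = min(core, end), max(core, end)
    # Group carriers that are identical over the whole core-to-end interval.
    groups = Counter(h[lo:hi + 1] for h in carriers)
    identical_pairs = sum(comb(count, 2) for count in groups.values())
    return identical_pairs / comb(len(carriers), 2)

def rehh(haplotypes, core, allele, other_allele, end):
    """Relative EHH of one core allele against the other (biallelic core)."""
    focal = ehh(haplotypes, core, allele, end)
    other = ehh(haplotypes, core, other_allele, end)
    if focal is None or not other:
        return None  # undefined when either EHH is missing or zero
    return focal / other
```

Evaluating `ehh` at markers progressively further from the core gives the EHH decay curve for that allele.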
An extension of the LRH test was proposed by Voight et al. (2006). This test is referred to as the iHS (integrated haplotype score) and was designed to work on a genomic scale using information from dense SNP chips. The iHS value can be defined simply as a measure of how unusual the haplotypes around an SNP are, compared to the genome (Voight et al., 2006). In this approach, each SNP is treated as a core SNP and the test starts with calculation of the EHH for each core SNP. As SNPs are biallelic loci, each core SNP allele can be ancestral or derived. For the test, the integral of the observed decay of EHH from a core SNP until EHH reaches 0.05 is computed (the area under the curve in an EHH vs. distance plot). This value is referred to as the integrated EHH (iHH) and is identified as iHHA or iHHD, depending on whether it was computed from the ancestral or the derived allele of the core SNP. The log-ratio of iHHA to iHHD is then standardized to allow direct comparisons among different SNPs regardless of allele frequencies (Voight et al., 2006).
Hussin et al. (2010) proposed a method based on the haplotype allelic classes (HAC). This measure can be defined as the count of allelic differences between the reference allelic class and the individual haplotypes in the sample. The statistic proposed is referred to as Svd, with positive values suggesting positive selection (Hussin et al., 2010).
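The integration and standardization steps of the iHS can be sketched in Python as below. The sketch takes precomputed EHH values on one side of a core SNP, integrates them with the trapezoidal rule, and stops at the last marker whose EHH is still at or above the cutoff (no interpolation to the exact cutoff is attempted, which is a simplification). Standardization is shown for a single group of SNPs assumed to share similar allele frequencies. All function names are hypothetical.

```python
import math
from statistics import mean, stdev

def integrate_ehh(distances, ehh_values, cutoff=0.05):
    """Area under the EHH-vs-distance curve, starting at the core SNP.

    distances and ehh_values are parallel lists ordered by increasing
    distance from the core; integration stops before the first marker
    whose EHH falls below the cutoff.
    """
    area = 0.0
    for k in range(1, len(distances)):
        if ehh_values[k] < cutoff:
            break
        width = distances[k] - distances[k - 1]
        area += 0.5 * (ehh_values[k] + ehh_values[k - 1]) * width
    return area

def unstandardized_ihs(ihh_ancestral, ihh_derived):
    """Natural log of the ratio between ancestral and derived iHH."""
    return math.log(ihh_ancestral / ihh_derived)

def standardize(scores):
    """Centre and scale scores from SNPs of similar allele frequency."""
    mu, sd = mean(scores), stdev(scores)
    return [(s - mu) / sd for s in scores]
```

In a genome scan, `integrate_ehh` would be applied on both sides of each core SNP and the two areas summed for each allele before taking the log-ratio.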
The LRH and iHS tests rely on the frequencies of alleles at the core SNP and therefore have reduced power for detecting selection when the selected allele has reached fixation. To deal with situations in which the selected allele is fixed in one population but remains polymorphic in others, LRH-derived tests based on pairwise comparisons among populations have been proposed (Kimura et al., 2007; Sabeti et al., 2007; Tang et al., 2007). The XP-EHH statistic can be defined as the normalized log-ratio between IA and IB, where IA is the integral of the observed decay of EHH from a core SNP to an SNP X (which has an EHH value as close as possible to 0.04 in both populations) in population A, and IB is the analogous measure in population B (Sabeti et al., 2007). The ln(Rsb) statistic proposed by Tang et al. (2007) is very similar to XP-EHH. The main difference between them is that the former calculates the EHH based on the status of each core SNP allele and the latter calculates the EHH based on the core SNP site (Sabeti et al., 2007; Tang et al., 2007).

Tests based on population differentiation
The estimation of FST from multiple loci and comparison of these values with their neutral expectations is the basis of several tests aimed at identifying selection (Lewontin and Krakauer, 1973; Bowcock et al., 1991; Vitalis et al., 2001, 2003; Beaumont and Balding, 2004; Foll and Gaggiotti, 2008; Excoffier et al., 2009; Bonhomme et al., 2010). The first effort in this direction was proposed by Lewontin and Krakauer (1973). They suggested that the FST estimated from several loci under neutrality must show small heterogeneity; however, if selection is acting on some of them then the estimates of FST tend to vary widely. The Lewontin and Krakauer test involves comparison between the variance of FST estimated from the data and the expected variance of FST under neutrality through a variance ratio test (Lewontin and Krakauer, 1973).
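The variance ratio idea can be illustrated with the Python sketch below. It assumes biallelic loci, equal weighting of populations, and the expected neutral variance k * (mean FST)^2 / (n - 1) with k = 2 for n populations, as suggested by Lewontin and Krakauer (1973). The choice of the population variance inside FST and of the sample variance across loci is an assumption of this sketch, and the function names are hypothetical.

```python
from statistics import mean, pvariance, variance

def fst_from_frequencies(freqs):
    """Single-locus FST from the frequency of one allele in each population.

    Uses the variance of allele frequencies among populations divided by
    p_bar * (1 - p_bar); returns None if the locus is monomorphic overall.
    """
    p_bar = mean(freqs)
    if p_bar in (0.0, 1.0):
        return None
    return pvariance(freqs) / (p_bar * (1.0 - p_bar))

def lewontin_krakauer_ratio(fst_values, n_populations, k=2.0):
    """Ratio of the observed to the expected neutral variance of FST."""
    f_bar = mean(fst_values)
    expected = k * f_bar ** 2 / (n_populations - 1)
    observed = variance(fst_values)  # sample variance across loci
    return observed / expected
```

A ratio well above one suggests more heterogeneity among loci than the neutral expectation allows, although the criticisms of the underlying assumptions raised soon after the test was published apply to this sketch as well.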
Lewontin and Krakauer's test was severely criticized soon after publication because of the assumptions they made in estimating the variance of FST under neutrality (Nei and Maruyama, 1975; Robertson, 1975). To avoid the effects of population structure, Bowcock et al. (1991) suggested the use of a null distribution obtained by calculating an FST distribution using simulations that take into account the populations' phylogenetic history. More recently, models capable of generating the null distribution of FST that are robust to population history and structure (recent divergence and growth, isolation by distance and heterogeneous levels of gene flow between populations) have been proposed (Beaumont and Nichols, 1996; Beaumont and Balding, 2004; Foll and Gaggiotti, 2008; Excoffier et al., 2009) and implemented in freely distributed software such as BayesFST (Beaumont and Balding, 2004), BayeScan (Foll and Gaggiotti, 2008) and Arlequin (Excoffier et al., 2009). The methods proposed by Beaumont and Nichols (1996) and Excoffier et al. (2009) are computationally feasible, but the presence of some complex demographic histories can lead to important biases. On the other hand, Markov chain Monte Carlo (MCMC) based methods (Beaumont and Balding, 2004; Foll and Gaggiotti, 2008) efficiently accommodate some departures from model assumptions but are computationally very intensive.
Another way to avoid the effects of demography is to perform pairwise comparisons between populations (Tsakas and Krimbas, 1976). Based on this idea, Vitalis et al. (2001) proposed a simple model of population divergence from which they obtained the joint distribution of population-specific estimators of branch length which were used to construct the confidence interval. This approach seems to be robust against departures from model assumptions and also tends to remove the bias introduced by unknown population structure. However, the pairwise comparison tends to reduce the power of the test because information from other populations is discarded (Tsakas and Krimbas, 1976;Vitalis et al., 2001). This analysis is implemented in the software DetSel 1.0 (Vitalis et al., 2003).
The foregoing discussion has shown that there are currently several approaches for detecting footprints left by selection. Each of these approaches can capture specific patterns of molecular variation. The use of a combination of alternative approaches for detecting selection signals is an interesting strategy that has been suggested as a means of increasing the reliability of these studies. However, the success of one test and failure of another does not exclude the region of interest from having been subjected to selection since different tests can focus on different signals left by selection or look for different time scales in which the selection can act (Hohenlohe et al., 2010;Oleksyk et al., 2010).

Selection signatures in livestock
Domestication has resulted in considerable changes in the morphology and behavior of livestock species. In the early stages of domestication, unconscious selection for behavioral traits was applied. This early stage was followed by methodical selection in which specific traits were selected based on goals (Diamond, 2002;Gregory, 2009).
The development of specialized breeds, improved to produce specific products or to reach a morphological standard, increased the differences between domesticated animals and their wild relatives and also generated an enormous variety of different populations, with specific traits related to their specialization. Some of these traits are controlled by several interacting genes with minor effects. This creates an exceptional opportunity to gain knowledge of the molecular basis of these traits, particularly since most economically important traits in livestock are quantitative (Andersson and Georges, 2004).
The identification of genes targeted by selection in livestock can help to find and prove causal mutations in regions previously identified by QTL mapping experiments and can reveal genes related to ecological traits (e.g., genes related to tropical adaptation) that are difficult to find experimentally. Furthermore, these studies can help to identify the genes or gene networks that contribute to the same trait but that were selected differentially between breeds; they can also unveil genes responsible for genetic correlations and the domestication process (Schlötterer, 2003;Hayes et al., 2008;Ojeda et al., 2008;Flori et al., 2009;MacEachern et al., 2009).

Signatures associated with domestication and early breed development
In some wild species, the expression of both eumelanin and phaeomelanin pigments is related to a camouflaged coat color. During domestication, non-camouflaged coat patterns were selected because of their direct effect on animal husbandry and also because these patterns may have been used as markers associated with improved individuals, or because of cultural preferences (Fang et al., 2009; Wiener and Wilkinson, 2011). The melanocyte stimulating hormone receptor gene (MC1R) influences the production of eumelanin and phaeomelanin pigments (Werth et al., 1996; Kijas et al., 1998; Fang et al., 2009; Li et al., 2010b) and is under selection in domestic cattle (Stella et al., 2010) and pig (Fang et al., 2009; Li et al., 2010b; Amaral et al., 2011) breeds. Other genes that influence coat color pattern were also suggested to be under selection in domestic species. Selection signatures around the V-Kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT) have been reported for cattle (Stella et al., 2010), pigs (Fontanesi et al., 2010; Amaral et al., 2011) and sheep (Kijas et al., 2012). The melanocyte protein 17 precursor (PMEL17), also known as the Silver gene (SILV), is suggested to be under selection in some cattle breeds.
The presence/absence of horns is another important feature in breed definition in some livestock species. Recently, the relaxin-like receptor 2 (RXFP2) gene was associated with this trait (Johnston et al., 2011), and a SNP surrounding this gene showed a strong selection signal in an analysis involving 74 sheep breeds. In cattle, the region surrounding the polled locus was shown to be under selection, although the gene responsible for this trait was not mapped (Drögemüller et al., 2005;Li et al., 2010a;Stella et al., 2010). Behavioral changes, such as a reduction in fear and anti-predator responses and an increase in sociability, are believed to be important reflections of animal domestication (Diamond, 2002;Amaral et al., 2011;Wiener and Wilkinson, 2011). Indeed, several studies in livestock suggest selection signatures surrounding genes related to nervous system development and function (The Bovine HapMap Consortium, 2009;Gautier et al., 2009;Stella et al., 2010;Amaral et al., 2011).

Cattle
Modern bovine breeds can basically be grouped into two major types, the taurine and indicine groups. Within each group, several breeds have been developed, and there is considerable intra- and inter-group variability in productive (milk yield and quality, meat production), morphological (coat color, presence/absence of horns) and adaptive (disease resistance, heat tolerance) traits (The Bovine HapMap Consortium, 2009). Several genome-wide studies focusing on different approaches and using different sets of breeds have searched for selection signatures in bovines (Prasad et al., 2008; Barendse et al., 2009; Flori et al., 2009; Gautier et al., 2009; Hayes et al., 2009; MacEachern et al., 2009; The Bovine HapMap Consortium, 2009; Li et al., 2010a; Qanbari et al., 2010, 2011; Stella et al., 2010; Hosokawa et al., 2012).
Various studies in beef cattle using approaches such as differences in allele frequencies, iHS and FST have found selection signals in the centromeric region of BTA14 (The Bovine HapMap Consortium, 2009), a region involved in the control of marbling and fatness traits (Barendse, 1999; Moore et al., 2003; Thaller et al., 2003; Casas et al., 2005; Pannier et al., 2010; Veneroni et al., 2010). An increase in intramuscular fat percentage in Australian Angus in recent years, together with a significant effect of this region on fat traits, may corroborate the selection signature found in these studies.
The double muscled phenotype has been selected in some beef breeds and mutations in the Growth Differentiation Factor 8 (also known as myostatin or GDF-8) gene are related to this phenotype (Bellinge et al., 2005). A decrease in heterozygosity around this gene has been demonstrated in double muscled breeds (Wiener et al., 2003;Wiener and Gutiérrez-Gil, 2009) and an increase in LD (measured using the iHS approach) has been reported in this region (The Bovine HapMap Consortium, 2009).
Using the FST approach, a selection signature was found in the median region of BTA2 (Barendse et al., 2009; The Bovine HapMap Consortium, 2009; Qanbari et al., 2011). This region was associated with feed efficiency and intramuscular fat in beef breeds (Barendse et al., 2007, 2009) and contains the R3H Domain Containing 1 (R3HDM1) and Zinc Finger, RAN Binding Domain Containing 3 (ZRANB3) genes, which have been suggested to be involved in feed efficiency (Barendse et al., 2009; The Bovine HapMap Consortium, 2009).
The increase in allele frequency differences between meat and dairy cattle and the high linkage disequilibrium in dairy breeds (using EHH and iHS methods) suggest that the region surrounding DGAT1 is under selection (Qanbari et al., 2010; Hosokawa et al., 2012; Schwarzenbacher et al., 2012). This gene is suggested to be responsible for a QTL with a major effect on milk fat percentage (Grisart et al., 2002; Khatkar et al., 2004; Cole et al., 2009; Hayes et al., 2010; Jiang et al., 2010).
At least two QTLs affecting milk traits are located on BTA20. The first QTL was mapped to the region surrounding the Growth Hormone Receptor (GHR) gene and has a marked effect on protein percentage and a minor effect on fat percentage and milk yield, while the second overlaps the Prolactin Receptor (PRLR) gene and affects protein and fat yield (Blott et al., 2003; Khatkar et al., 2004; Schnabel et al., 2005; Viitala et al., 2006; Cole et al., 2009; Ogorevc et al., 2009; Jiang et al., 2010). These regions are under selection (Hayes et al., 2009; The Bovine HapMap Consortium, 2009; Qanbari et al., 2010, 2011; Stella et al., 2010).
Some studies have shown the presence of QTLs affecting milk fat and protein traits in the region surrounding the Signal Transducer and Activator of Transcription 1 (STAT1) gene. This gene has been implicated in mammary gland development and is associated with milk, fat and protein yield in Holstein cattle (Cobanoglu et al., 2006). Two studies comparing allele frequency differences between beef and dairy cattle suggested a selection signal in the region surrounding this gene (Hosokawa et al., 2012).
The region surrounding the Sialic Acid Binding Ig-Like Lectin 5 (SIGLEC-5) and Zinc Finger Protein 577 (ZNF577) genes was shown to be associated with Net Merit and several related traits, such as conformation, longevity and calving ease in Holstein cattle (Cole et al., 2009). Based on findings using the iHS approach, this region was suggested to be under selection in Holstein cattle and, although these traits were not the main objective in breeding improvement programs, a weak selection against unfavorable alleles may be responsible for this signature (Qanbari et al., 2011).
Several other regions have been suggested to be under selection in cattle, but the genes under selection cannot be proposed for most of them. Functional analysis of these regions reveals the presence of genes involved in the gonadotropic and somatotropic axes, muscle development, growth, nervous system development and immune response (Barendse et al., 2009; Flori et al., 2009; The Bovine HapMap Consortium, 2009; Qanbari et al., 2010, 2011; Stella et al., 2010).

Pigs
Pig domestication occurred independently multiple times in diverse locations across Eurasia approximately 9000 years ago (Larson et al., 2005). Domestic pigs are found in a wide range of environments and show extensive variation in morphological, behavioral and ecological characteristics (Larson et al., 2005; Chen et al., 2007). The use of this species in very different production systems and environmental conditions around the globe has resulted in an enormous variety of breeds, each one harboring adaptations to special conditions. Currently, most pig production systems are based on five breeds (Large White, Duroc, Landrace, Hampshire and Pietrain) that have been subjected to intense artificial selection focused on productivity traits. Moreover, there is a considerable number of related species and wild individuals that can be used to infer some aspects of selection (Chen et al., 2007).
The increase in muscle mass and decrease in fat content in pigs have been subject to strong selective pressure in commercial pig populations and are related to a substitution in intron 3 of the Insulin-Like Growth Factor 2 (IGF2) gene (Van Laere et al., 2003). Using Tajima's D, Ojeda et al. (2008) identified a selection signature in the IGF2 gene in three breeds (Pietrain, Hampshire and Duroc) that are commonly used as sire lines and have been selected for growth and meat leanness. The Melanocortin 4 Receptor (MC4R) gene, which is related to growth and fatness traits, has also been suggested to be under selection in pigs (Onteru et al., 2013).
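Tajima's D contrasts two estimates of genetic diversity: the mean number of pairwise differences and the number of segregating sites scaled by a sample-size constant. An excess of rare variants, as expected after a selective sweep, gives negative values. The sketch below follows the standard formula of Tajima (1989) and assumes haplotypes encoded as equal-length strings of 0 and 1; the function name and the toy haplotypes are hypothetical and do not reproduce the data or pipeline of the study cited above.

```python
import math


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length 0/1 haplotype strings.

    Returns math.nan when there are no segregating sites or fewer than
    two haplotypes, because D is undefined in those cases.
    """
    n = len(haplotypes)
    if n < 2:
        return math.nan
    length = len(haplotypes[0])
    seg_sites = 0
    pairwise_diff_sum = 0
    for site in range(length):
        k = sum(1 for hap in haplotypes if hap[site] == "1")
        if 0 < k < n:
            seg_sites += 1
            # k * (n - k) pairs of haplotypes differ at this site
            pairwise_diff_sum += k * (n - k)
    if seg_sites == 0:
        return math.nan
    pi = pairwise_diff_sum / (n * (n - 1) / 2.0)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1.0) / (3.0 * (n - 1.0))
    b2 = 2.0 * (n * n + n + 3.0) / (9.0 * n * (n - 1.0))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Toy sample in which every variant is a singleton (excess of rare alleles)
rare_variants = ["1000", "0100", "0010", "0001", "0000"]
# Toy sample in which every variant is at intermediate frequency
intermediate_variants = ["110", "101", "011", "000"]
d_rare = tajimas_d(rare_variants)
d_intermediate = tajimas_d(intermediate_variants)
print(d_rare, d_intermediate)
```

The singleton-only sample yields a negative D and the intermediate-frequency sample a positive D. In practice the statistic is computed in windows along the genome and interpreted against a demographic null model, because population expansion can also produce negative values.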
An intronic substitution in the Estrogen Receptor (ESR) gene has been associated with litter size in pigs (Rothschild et al., 1996; Short et al., 1997). Although some studies have reported divergent results (Muñoz et al., 2007), this marker has been used by the pig breeding industry in Marker Assisted Selection (Dekkers, 2004). Recently, Bonhomme et al. (2010) suggested that this gene is under selection in the Large White breed.
Functional analysis of regions under positive selection in pig breeds has identified genes involved in development of the nervous system and muscle, growth, pigmentation, metabolism, visual/odor perception, immune and inflammatory responses and reproduction (Amaral et al., 2011; Rubin et al., 2012; Esteve-Codina et al., 2013).

Sheep and goats
Sheep and goats were the first livestock species to be domesticated, approximately 9000 years ago. The wide distribution of these species is a reflection of their adaptability to different environments and this has resulted in enormous morphological variation among populations (Diamond, 2002; Gentry et al., 2004; Naderi et al., 2008; Chessa et al., 2009; Kijas et al., 2009). Since their domestication, sheep have been selected for meat, wool and milk production (Chessa et al., 2009; Kijas et al., 2009).
Kijas et al. (2012) performed a genome scan based on FST to detect selection signatures in a panel of 2819 individuals from 74 sheep breeds. Thirty-one regions showed selection signals and contained genes related to coat color, bone morphology, growth and reproduction traits. This analysis revealed a strong peak of differentiation surrounding the Growth Differentiation Factor 8 (GDF-8) gene when Texel individuals were compared with all other breeds (Kijas et al., 2012). In addition, Clop et al. (2006) showed a reduction in the variability of microsatellites surrounding this gene upon comparing hyper-muscled Texels with other sheep breeds. The region surrounding GDF-8 was associated with QTLs for carcass traits in the Texel breed and a point mutation in the 3' UTR of this gene was suggested to be the causal mutation affecting extreme muscling in Texel individuals (Clop et al., 2006).
Moradi et al. (2012) performed a genome scan with approximately 50K SNPs to search for signatures of divergent selection in a comparison between fat-tailed and thin-tailed sheep breeds; their study identified at least three regions (on chromosomes OAR5, OAR7 and OARX) that have undergone selection. Interestingly, most of the regions identified by Moradi et al. (2012) intersect with QTLs for carcass traits. Improvement in the sheep genome annotation will facilitate the search for and validation of candidate genes related to these traits.

Horses
Horse domestication appears to have occurred approximately 6000 years ago and was central to the development of human history. The major attraction for domestication of this species was probably its ability to run fast for long distances, but its value as a source of meat may also have been an important factor. The domestic horse shows marked variation in morphological traits, including shape, size, color and gait (Bowling and Ruvinsky, 2000; Levine, 2005).
Thoroughbred horses have been selected for athletic performance traits and this has led to individuals with extreme phenotypes related to anaerobic and aerobic metabolic capabilities. A genome scan aimed at identifying putative regions under selection in this breed (based on a combination of reduced heterozygosity and increased population differentiation) revealed the presence of genes related to phosphoinositide 3-kinase (PI3K) and insulin-signalling pathways, oxidative stress, energy regulation, adipocyte differentiation and muscle regulation and development. These functions are directly related to the main focus of selection in this breed, namely, racetrack performance (Gu et al., 2009). Among the genes suggested to be under selection in Thoroughbred horses, the Pyruvate Dehydrogenase Kinase, isozyme 4 (PDK4) gene has been associated with racing performance phenotypes (Hill et al., 2010). Petersen et al. (2013) identified a strong signal of differentiation around the myostatin (GDF-8) gene in a comparison of the American Paint Horse and Quarter Horse with other breeds. This gene was also associated with muscle fiber type proportions in these breeds.
Another important trait for particular horse breeds is their ability to perform alternate gaits. Recently, it was shown that the Doublesex and Mab-3 Related Transcription Factor 3 (DMRT3) gene is involved in this trait in several breeds (Andersson et al., 2012). In addition, the region encompassing this gene was suggested (based on population differentiation) to be under selection in several breeds that have been selected for alternative gaits (identified as a breed-defining characteristic) (Petersen et al., 2013).

Conclusions
Domestication and artificial selection processes have definitely shaped livestock genomes. The identification of candidate regions under selection can help researchers understand the molecular mechanisms involved in adaptation and may also be useful in identifying regions associated with important traits that are under selection.