
Validation of a customized subset of SNPs for sheep breed assignment in Brazil

Validação de um subconjunto de SNPs específicos para certificação racial de ovinos no Brasil

Abstract:

The objective of this work was to evaluate the usefulness of a subset of 18 single nucleotide polymorphisms (SNPs) for breed identification of Brazilian Crioula, Morada Nova (MN), and Santa Inês (SI) sheep. Data from 588 animals were analyzed with the Structure software. Assignments with confidence higher than 90% were observed in 82% of the studied samples. Most of the low-value assignments were observed in the MN and SI breeds. Therefore, although this subset of 18 SNPs is highly reliable, it is not enough for an unequivocal assignment of the studied breeds, mainly of the hair breeds. A more precise panel still needs to be developed for widespread use in breed assignment.

Index terms:
Ovis aries; animal genetic resources; certification of origin; genomics; traceability

Resumo:

O objetivo deste trabalho foi avaliar a utilidade de um subconjunto de 18 polimorfismos de nucleotídeo único (SNPs) para a certificação das raças de ovinos Crioula Brasileira, Morada Nova (MN) e Santa Inês (SI). Dados de 588 animais foram analisados com o programa Structure. Em 82% dos casos, observou-se designação racial correta com confiança acima de 90%. A maioria dos casos de designação incorreta de raça foi observada em MN e SI. Portanto, apesar de o subconjunto de 18 SNPs ter confiabilidade elevada, ele não é suficiente para a inequívoca certificação das raças estudadas, principalmente das deslanadas. É necessário o desenvolvimento de um painel mais preciso para uso amplo em certificação racial.

Termos para indexação:
Ovis aries; recursos genéticos animais; certificação de origem; genômica; rastreabilidade

Precise breed identification is a key step in genetic and genomic studies, as accurate breed assignment can improve the accuracy of genomic breeding value estimation, especially when mixed-breed populations are used for developing or applying prediction equations (Kachman et al., 2013; Vandenplas et al., 2016). Moreover, many examples of protected designation of origin (PDO) and protected geographical indication (PGI) for animal-derived products are directly associated with specific breeds (Dimauro et al., 2015; Mateus & Russo-Almeida, 2015), and proper certification therefore depends on the correct identification of the livestock breed. The issuing of PDO and PGI certifications, together with robust methods to monitor marketed animal products, has contributed to preventing breed extinctions, mainly in Europe (Di Stasio et al., 2017).

Most Brazilian sheep breeds are considered local genetic resources, which currently face the challenges associated with uncontrolled crossbreeding (McManus et al., 2010). Hair sheep breeds (such as Morada Nova and Santa Inês) are found mainly in Northeastern Brazil, a region characterized by high heat stress and associated with lower productivity indices. Wool sheep (such as Crioula) are reared mainly in the Southern part of the country (McManus et al., 2014). Both regions have great potential for the development of PDO and PGI products and depend on inexpensive and accurate methods for breed certification.

As individual animals have a low overall value, and sheep farming in Brazil is carried out by small, low-income farmers, the use of low-density SNP panels for breed assignment is highly appealing because it lowers genotyping costs. Therefore, a key goal is to identify a subset of SNPs (up to 96) that can be used for accurate breed assignment.

Vieira et al. (2015) used information generated with the Ovine SNP50 BeadChip (Illumina Inc., San Diego, CA, USA) to identify a subset of SNPs to differentiate between Crioula, Morada Nova, and Santa Inês. These authors applied three different prediction methods (least absolute shrinkage and selection operator, Lasso; Random Forest; and boosting) to select a minimum number of SNP markers for sheep breed identification. They defined a set of 18 SNPs able to distinguish samples of these three breeds. However, Vieira et al. (2015) used a reduced sample of genotypes from only 72 animals (23 Crioula, 22 Morada Nova, and 27 Santa Inês), so validation with an independent dataset remains necessary.

The objective of this work was to verify the usefulness of this subset of SNPs previously reported for breed identification of Crioula (BC), Morada Nova (MN), and Santa Inês (SI) sheep.

Samples from 19 BC, 308 MN, and 261 SI animals were genotyped with the Ovine SNP50 BeadChip (Illumina Inc., San Diego, CA, USA). The full set of genotypes was used to calculate the genomic relationship matrix for each breed, with each marker normalized individually (GCTA method) (Yang et al., 2011). The average relationship between the animals used by Vieira et al. (2015) (reference population) and the animals evaluated in the present study (validation population) was then calculated. The results showed a low relationship between animals from the two datasets (mean±standard deviation: Crioula, 0.029±0.132; Morada Nova, 0.012±0.049; and Santa Inês, 0.008±0.053).
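The marker-wise normalization described by Yang et al. (2011) can be sketched in plain Python as follows: each SNP is centered by twice its allele frequency and scaled by 2p(1-p), and the products are averaged over SNPs. The toy genotypes, the function names, and the index lists for the two datasets are illustrative assumptions, not the study's data or pipeline. The diagonal is computed with the same expression as the off-diagonal elements, which is a simplification relative to GCTA.

```python
from statistics import mean


def grm(genotypes):
    """Genomic relationship matrix from allele counts coded 0, 1, or 2.

    genotypes: list of individuals, each a list of allele counts per SNP.
    Each SNP is centered by 2p and scaled by its expected variance 2p(1-p).
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    # Frequency of the counted allele at each SNP
    freqs = [sum(ind[i] for ind in genotypes) / (2 * n_ind) for i in range(n_snp)]
    # Monomorphic SNPs have zero variance and cannot be normalized
    usable = [i for i, p in enumerate(freqs) if 0.0 < p < 1.0]
    if not usable:
        raise ValueError("no polymorphic SNPs available")
    matrix = [[0.0] * n_ind for _ in range(n_ind)]
    for j in range(n_ind):
        for k in range(j, n_ind):
            total = 0.0
            for i in usable:
                p = freqs[i]
                total += ((genotypes[j][i] - 2 * p) * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            matrix[j][k] = matrix[k][j] = total / len(usable)
    return matrix


def mean_cross_relationship(matrix, ref_idx, val_idx):
    """Average relationship between two groups of individuals."""
    return mean(matrix[j][k] for j in ref_idx for k in val_idx)


# Hypothetical genotypes: 4 animals x 3 SNPs (not real data)
toy_genotypes = [
    [0, 1, 2],
    [0, 2, 2],
    [2, 0, 0],
    [1, 1, 0],
]
G = grm(toy_genotypes)
# Assume animals 0-1 are "reference" and animals 2-3 are "validation"
cross = mean_cross_relationship(G, [0, 1], [2, 3])
print(round(cross, 3))
```

Only the reference-by-validation block of the matrix enters the average, so the simplified diagonal does not affect the cross-dataset relationship.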

The eighteen SNPs selected by Vieira et al. (2015) were extracted from the dataset, and minor allele frequencies (MAF) were determined for each breed (Table 1). As the minor allele can be different from one breed to another, and can differ between the two datasets, contrasts were performed between breeds and studies. Only one SNP in Santa Inês (s32131) and one in Morada Nova (s69653) differed in minor allele between the reference population (Vieira et al., 2015) and the validation population used in the present study.
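The comparison of minor alleles between datasets can be illustrated with a short sketch. It assumes genotypes are stored as two-letter strings (for example "AG") with "--" for missing calls; this encoding, the function names, and the example genotypes are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def minor_allele(genotypes):
    """Return (minor allele, its frequency) for one biallelic SNP.

    genotypes: list of two-character strings; "--" marks a missing call.
    Returns (None, 0.0) when the SNP is monomorphic in the sample.
    """
    counts = Counter()
    for g in genotypes:
        if g == "--":
            continue
        counts.update(g)  # adds one count per allele character
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no called genotypes")
    if len(counts) == 1:
        return None, 0.0
    # Ties are broken alphabetically so the result is deterministic
    allele, n = min(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return allele, n / total


def minor_allele_differs(reference, validation):
    """True when the two samples disagree on which allele is the minor one."""
    return minor_allele(reference)[0] != minor_allele(validation)[0]


# Hypothetical genotypes for one SNP in one breed (not real data)
reference_sample = ["AA", "AG", "GG", "GG"]
validation_sample = ["AA", "AA", "AG", "GG", "--"]

ref_result = minor_allele(reference_sample)
val_result = minor_allele(validation_sample)
flipped = minor_allele_differs(reference_sample, validation_sample)
print(ref_result, val_result, flipped)
```

A flip of this kind does not necessarily indicate a genotyping problem: when both alleles have frequencies near 0.5, sampling variation alone can change which allele is the less frequent one.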

Table 1.
Minor allele frequency estimates for each SNP marker used in the analyses, for each breed (Crioula, Morada Nova, and Santa Inês), in the “Reference” dataset, according to Vieira et al. (2015), and in the “Validation” dataset, from the present study.

The Structure software version 2.3.4 (Pritchard et al., 2000) was used to estimate individual allocation probabilities in each of the three breeds. The definition of clusters was based on the admixture model and on the assumption that allele frequencies were correlated between breeds. Run parameters were as follows: 588 individuals; 18 loci, without a priori information on populations; a burn-in period of 10,000 iterations; and 200,000 Markov Chain Monte Carlo (MCMC) repetitions after burn-in. The number of clusters (K) was set to 2, 3, 4, and 5, with five runs for each value of K. Following the method of Evanno et al. (2005), the best K was 3, which agrees with the number of breeds in the data and shows that this extremely small panel is able to identify this structure in the samples. Thereafter, we used the results for K=3 to evaluate the correct classification rate.
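The choice of K with the Evanno et al. (2005) statistic can be sketched as below. The sketch computes ΔK from the mean log-likelihood of the replicate runs at each K, divided by the standard deviation at that K, which is one common way of applying the method. The log-likelihood values are invented for illustration and are not the study's output. With K tested from 2 to 5, ΔK can only be computed for K=3 and K=4, since it needs both neighbouring values of K.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Evanno-style delta K from replicate log-likelihoods.

    lnp: dict mapping K to a list of Ln P(D) values, one per replicate run.
    Returns a dict mapping each interior K to
    |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd L(K).
    """
    avg = {k: mean(values) for k, values in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # the second difference needs both neighbours
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        result[k] = second_diff / stdev(lnp[k])
    return result


# Invented Ln P(D) values for five runs at each K (not real results)
hypothetical_lnp = {
    2: [-9800.0, -9805.0, -9795.0, -9802.0, -9798.0],
    3: [-9200.0, -9202.0, -9198.0, -9201.0, -9199.0],
    4: [-9150.0, -9170.0, -9130.0, -9160.0, -9140.0],
    5: [-9120.0, -9160.0, -9080.0, -9140.0, -9100.0],
}

dk = delta_k(hypothetical_lnp)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

The statistic favours the K at which the likelihood stops improving sharply while the replicate runs still agree closely, which is why a small standard deviation at a given K raises its ΔK.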

The percentage of individuals classified in each cluster was determined from the estimated membership proportion of each individual genotype in each of the clusters. Tests of individual allocation were performed with and without a priori information about the source population of individuals, yielding similar outcomes. Therefore, results without a priori information were used, as they better represent a real breed assignment situation, in which there is no prior information about the sample.
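The allocation step can be illustrated with a short sketch that takes a table of membership proportions (one row per animal, one column per cluster), assigns each animal to its highest-scoring cluster, and counts the assignment as confident when the proportion exceeds 0.90. Because cluster labels are arbitrary when no population information is supplied, the mapping from cluster to breed is an assumption that has to be provided, for example from the breed that dominates each cluster. All names and values here are hypothetical.

```python
def assign(q_row, threshold=0.90):
    """Return (index of the best cluster, whether its proportion exceeds threshold)."""
    best = max(range(len(q_row)), key=lambda c: q_row[c])
    return best, q_row[best] > threshold


def confident_rate(q_matrix, breeds, cluster_to_breed, threshold=0.90):
    """Share of animals per breed placed in their own breed's cluster
    with a membership proportion above the threshold."""
    totals, hits = {}, {}
    for q_row, breed in zip(q_matrix, breeds):
        cluster, confident = assign(q_row, threshold)
        totals[breed] = totals.get(breed, 0) + 1
        if confident and cluster_to_breed[cluster] == breed:
            hits[breed] = hits.get(breed, 0) + 1
    return {breed: hits.get(breed, 0) / totals[breed] for breed in totals}


# Hypothetical membership proportions for five animals and K=3 (not real data)
q_matrix = [
    [0.95, 0.03, 0.02],
    [0.10, 0.85, 0.05],
    [0.02, 0.96, 0.02],
    [0.05, 0.40, 0.55],
    [0.01, 0.02, 0.97],
]
breeds = ["BC", "MN", "MN", "SI", "SI"]
# Assumed labelling of the clusters
cluster_to_breed = {0: "BC", 1: "MN", 2: "SI"}

rates = confident_rate(q_matrix, breeds, cluster_to_breed)
print(rates)
```

In this toy example the second and fourth animals fall in the correct cluster but below the 0.90 threshold, so they count as low-confidence assignments rather than as confident ones.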

Accurate breed assignments (confidence >90%) were observed in 89, 86, and 75% of BC, MN, and SI animals, respectively. Mean cluster allocation values ranged from 90.9 to 93.7% (Table 2). SI has been previously shown to have been formed by crossbreeding of MN, Bergamasca, and Somalis (McManus et al., 2010). MN and SI animals were observed to have some degree of admixture and an estimated fixation index (Fst) of 6.59% (Genome-wide..., 2012). Therefore, some allocation errors between MN and SI were expected. Nonetheless, high levels of correct breed allocation (>90%) were observed.
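As a quick consistency check, the per-breed rates can be combined into an overall rate by weighting each breed by its sample size. The sketch below uses only the rounded percentages and sample sizes reported in the text, so the result is approximate; it comes to roughly 81%, in line with the 82% stated in the abstract once rounding of the per-breed figures is allowed for.

```python
# Sample sizes and rounded confident-assignment rates as reported in the text
sample_sizes = {"BC": 19, "MN": 308, "SI": 261}
confident_share = {"BC": 0.89, "MN": 0.86, "SI": 0.75}

total_animals = sum(sample_sizes.values())
# Weighted average: each breed contributes in proportion to its sample size
overall = sum(sample_sizes[b] * confident_share[b] for b in sample_sizes) / total_animals

print(total_animals, round(overall, 3))
```

Because BC contributes only 19 of the animals, the overall figure is driven almost entirely by the MN and SI rates.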

Table 2.
Mean cluster allocation of Crioula (BC), Morada Nova (MN), and Santa Inês (SI) sheep obtained with the Structure analysis of data from 18 SNP markers.

The results obtained here using 18 SNPs were less accurate than those of previous studies, most likely because of the higher information content of microsatellite markers compared with SNPs, and the great difference in the number of SNPs used. SNPs for parentage... (2014) identified a set of 163 SNPs for accurate parentage testing and traceability in many of the world’s main sheep breeds. Mateus & Russo-Almeida (2015) identified 12 microsatellite markers able to correctly classify animals into their respective breeds, while Di Stasio et al. (2017) used 15 microsatellite markers for breed certification in Italian sheep breeds. Other studies (Bertolini et al., 2015; Dimauro et al., 2015) showed that a minimum of 100 SNPs is required for correct and accurate breed assignment of cattle and sheep breeds.

The 18-SNP panel tested showed 90% correct assignment of the studied breeds. Incorrect assignments ranged from 6 to 9% of the animals (Table 2). Ideally, a system for breed certification requires a correct allocation close to 100%, with minimal incorrect assignment. The SNP panel tested showed high levels of correct assignment; however, the results obtained are not sufficient for its widespread use in breed certification.

The construction and validation of a larger panel with additional SNPs could provide higher correct assignment rates (close to 100%), including for other major sheep breeds reared in Brazil, which may contribute to breed identification and certification procedures. This tool could then be incorporated into routine inspection services and into ongoing genetic improvement and conservation activities.

Acknowledgments

To Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and to Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (Capes), for financial support; to Instituto Federal de Educação, Ciência e Tecnologia Goiano (IF Goiano) and to International Sheep Genomics Consortium, for technical and logistics support; and to Embrapa multiuser bioinformatics laboratory (Laboratório Multiusuário de Bioinformática da Embrapa) for permitting the use of its high-performance computational infrastructure.

References

  • BERTOLINI, F.; GALIMBERTI, G.; CALÒ, D.G.; SCHIAVO, G.; MATASSINO, D.; FONTANESI, L. Combined use of principal component analysis and random forests identify population-informative single nucleotide polymorphisms: application in cattle breeds. Journal of Animal Breeding and Genetics, v.132, p.346-356, 2015. DOI: https://doi.org/10.1111/jbg.12155.
  • DI STASIO, L.; PIATTI, P.; FONTANELLA, E.; COSTA, S.; BIGI, D.; LASAGNA, E.; PAUCIULLO, A. Lamb meat traceability: The case of Sambucana sheep. Small Ruminant Research, v.149, p.85-90, 2017. DOI: https://doi.org/10.1016/j.smallrumres.2017.01.013.
  • DIMAURO, C.; NICOLOSO, L.; CELLESI, M.; MACCIOTTA, N.P.P.; CIANI, E.; MOIOLI, B.; PILLA, F.; CREPALDI, P. Selection of discriminant SNP markers for breed and geographic assignment of Italian sheep. Small Ruminant Research, v.128, p.27-33, 2015. DOI: https://doi.org/10.1016/j.smallrumres.2015.05.001.
  • EVANNO, G.; REGNAUT, S.; GOUDET, J. Detecting the number of clusters of individuals using the software structure: a simulation study. Molecular Ecology, v.14, p.2611-2620, 2005. DOI: https://doi.org/10.1111/j.1365-294X.2005.02553.x.
  • GENOME-WIDE analysis of the world’s sheep breeds reveals high levels of historic mixture and strong recent selection. PLoS Biology, v.10, e1001258, 2012. DOI: https://doi.org/10.1371/journal.pbio.1001258.
  • KACHMAN, S.D.; SPANGLER, M.L.; BENNETT, G.L.; HANFORD, K.J.; KUEHN, L.A.; SNELLING, W.M.; THALLMAN, R.M.; SAATCHI, M.; GARRICK, D.J.; SCHNABEL, R.D.; TAYLOR, J.F.; POLLAK, E.J. Comparison of molecular breeding values based on within- and across-breed training in beef cattle. Genetics Selection Evolution, v.45, p.1-9, 2013. DOI: https://doi.org/10.1186/1297-9686-45-30.
  • MATEUS, J.C.; RUSSO-ALMEIDA, P.A. Traceability of 9 Portuguese cattle breeds with PDO products in the market using microsatellites. Food Control, v.47, p.487-492, 2015. DOI: https://doi.org/10.1016/j.foodcont.2014.07.038.
  • MCMANUS, C.; HERMUCHE, P.; PAIVA, S.R.; MORAES, J.C.F.; DE MELO, C.B.; MENDES, C. Geographical distribution of sheep breeds in Brazil and their relationship with climatic and environmental factors as risk classification for conservation. Brazilian Journal of Science and Technology, v.1, p.1-15, 2014. DOI: https://doi.org/10.1186/2196-288X-1-3.
  • MCMANUS, C.; PAIVA, S.R.; ARAÚJO, R.O. de. Genetics and breeding of sheep in Brazil. Revista Brasileira de Zootecnia, v.39, p.236-246, 2010. Suplemento especial. DOI: https://doi.org/10.1590/S1516-35982010001300026.
  • PRITCHARD, J.K.; STEPHENS, M.; DONNELLY, P. Inference of population structure using multilocus genotype data. Genetics, v.155, p.945-959, 2000.
  • SNPs FOR PARENTAGE testing and traceability in globally diverse breeds of sheep. PLoS ONE, v.9, e94851, 2014. DOI: https://doi.org/10.1371/journal.pone.0094851.
  • VANDENPLAS, J.; CALUS, M.P.L.; SEVILLANO, C.A.; WINDIG, J.J.; BASTIAANSEN, J.W.M. Assigning breed origin to alleles in crossbred animals. Genetics Selection Evolution, v.48, p.1-22, 2016. DOI: https://doi.org/10.1186/s12711-016-0240-y.
  • VIEIRA, F.D.; OLIVEIRA, S.R. de M.; PAIVA, S.R. Metodologia baseada em técnicas de mineração de dados para suporte à certificação de raças de ovinos. Engenharia Agrícola, v.35, p.1172-1186, 2015. DOI: https://doi.org/10.1590/1809-4430-Eng.Agric.v35n6p1172-1186/2015.
  • YANG, J.; LEE, S.H.; GODDARD, M.E.; VISSCHER, P.M. GCTA: a tool for genome-wide complex trait analysis. American Journal of Human Genetics, v.88, p.76-82, 2011. DOI: https://doi.org/10.1016/j.ajhg.2010.11.011.

Publication Dates

  • Publication in this collection
    06 May 2019
  • Date of issue
    2019

History

  • Received
    01 Feb 2018
  • Accepted
    27 June 2018
Embrapa Secretaria de Pesquisa e Desenvolvimento; Pesquisa Agropecuária Brasileira Caixa Postal 040315, 70770-901 Brasília DF Brazil, Tel. +55 61 3448-1813, Fax +55 61 3340-5483 - Brasília - DF - Brazil
E-mail: pab@embrapa.br