Association of candidate genes for fatty acid content in soybean by temperature-switch PCR (TSP) genotyping

The development of molecular markers is essential for improvement of soybean cultivars with modified fatty acid content. The objective of this study was to identify and validate SNP markers in candidate genes for fatty acid content in soybean. Six candidate genes (ARAF, PDAT, ABI3, FAD2-1B, FAD3B, and FAD3C) were selected. Alignment of gene sequences identified 25 SNPs and 3 INDELs. TSP primers were used to identify SNP alleles. A total of 259 recombinant inbred lines (RILs) (FA22 x CD219) and 185 F2 progenies (A29 x Tucunaré) were tested for association of SNPs with fatty acid content. An SNP in FAD3B was associated with variation in content of linoleic acid (R² = 5.84%) and linolenic acid (R² = 6.79%). In FAD3C, an SNP was associated with linoleic and linolenic acids (R² of 9.21% and 18.51%, respectively). An SNP in the ABI3 gene was associated with palmitic acid content (R² = 5.41%). The SNP markers identified will be used in marker-assisted selection for improvement of fatty acid content.

The high concentration of polyunsaturated fatty acids (linoleic and linolenic acids) reduces the oxidative stability of soybean oil. Oxidation processes diminish the sensory quality of vegetable oils and the storage time of the raw material and derivative products (Regitano Neto et al. 2016). The low oxidative stability of soybean oil also affects biodiesel production. Fuel produced from oil with a high concentration of polyunsaturated fatty acids has a low cetane index and higher iodine value, which reduces the technological quality of biodiesel (Knothe 2007). Therefore, there is great interest in developing soybean cultivars with low linolenic and linoleic acid content (Knothe 2007, Santos et al. 2013).
Although inheritance of fatty acid content is treated in a quantitative manner, genetic modifications are achieved by selection of major genes (Pinto et al. 2013). The enzyme ω-3 desaturase is responsible for synthesis of linolenic acid (18:3), using linoleic acid as a substrate and adding one unsaturation to the carbon chain at the Δ15 position (Baud and Lepiniec 2010). The genes GmFAD3A, GmFAD3B, and GmFAD3C encode this enzyme. Different combinations of mutations in these three genes can produce soybean with a low level of linolenic acid (1%) (Bilyeu et al. 2011). Molecular markers associated with desaturase genes are widely used in selection of genotypes with reduced content of linolenic acid (Bilyeu et al. 2005, 2006, 2011). A deletion detected in the FAD3A gene derived from the A5 line allows detection of the mutant allele for low linolenic acid content (Bilyeu et al. 2005). Bilyeu et al. (2006) identified single-nucleotide polymorphisms (SNPs) in the FAD3B and FAD3C genes from the A29 line. They developed codominant markers for allelic detection by RFLP-PCR. However, this process is expensive and time consuming, and it does not allow parallel tests, so only one locus can be characterized per assay. Hayden et al. (2009) describe temperature-switch polymerase chain reaction (TSP) as a rapid method for genotyping SNPs. TSP uses PCR amplification based on hybridization with allele-specific primers for discrimination of alleles by agarose gel electrophoresis. TSP marker tests are carried out quickly and with basic laboratory equipment, allowing the use of SNP information in breeding programs for marker-assisted selection (MAS). Therefore, this study was developed to identify and validate SNPs in candidate genes associated with content of soybean fatty acids. In addition, TSP markers were designed for allele discrimination with the aim of applying MAS to reduce polyunsaturated fatty acids in soybean.

MATERIAL AND METHODS
Four candidate genes were selected (ABI3, FAD2-1B, ARAF, and PDAT) based on their relationship with the synthesis and regulation of fatty acid content in soybean (Heppard et al. 1996, Wei et al. 2008, Baud and Lepiniec 2010). The in silico search for selection of sequences of each candidate gene was based on the following criteria: (i) be a single copy or a maximum of two copies; (ii) present an e-value equal or close to zero; and (iii) provide functional annotation in Phytozome (http://www.phytozome.net/soybean). In addition, we used the GmFAD3B and GmFAD3C genes encoding ω-3 desaturases associated with linolenic acid accumulation in soybean (Bilyeu et al. 2006).
The final volume of the PCR reaction was adjusted to 30 µL using 1 U Platinum® Taq DNA polymerase, 20 mM Tris-HCl pH 8.4, 50 mM KCl, 1.5 mM MgCl₂, 0.2 mM of each dNTP, 0.2 µM of each primer, and 30 ng of template DNA. The template DNA used was obtained from the parents A29, Tucunaré, FA22, and CD219RR, as described below. The steps of the PCR cycle were performed in an Eppendorf (AG-22331) thermocycler under the following conditions: initial step of denaturation at 94 °C for 2 min, followed by 35 cycles of 94 °C for 30 sec, 55 °C for 30 sec, and 72 °C for 60 sec. The products generated by amplification were separated by agarose gel electrophoresis (1.7%) containing ethidium bromide (0.2 µg mL⁻¹), and they were photographed under UV light using the L-PIX system (Loccus Biotechnology, São Paulo, Brazil).
The PCR products exhibiting the expected fragments were purified using the ExoSAP-IT kit (USB Corporation, Cleveland, Ohio, USA) in a ratio of 6 µL of ExoSAP and 18 µL of PCR product. The purified samples were then sequenced by Macrogen Company (Gasan-dong, Geumchun-gu, Seoul, South Korea). The sequences obtained were edited using the Sequencher 4.1.4 program (Gene Codes Corporation) and aligned with the ClustalW program (http://www.ebi.ac.uk/Tools/clustalw2/index.html). The same sequences were also edited and aligned with the CodonCode Aligner program (http://www.codoncode.com/aligner/).
Two populations segregating for oleic and linolenic acid levels were developed for association of the SNPs with soybean fatty acid contents. The first population, obtained from the cross FA22 x CD219, consisted of 259 recombinant inbred lines (RILs) in the F6 generation. The second population, consisting of 185 F2:3 lines, was obtained from the cross A29 x Tucunaré. The A29 genotype (1% linolenic acid) was obtained by hybridization and selection from mutant strains for low content of linolenic acid (Ross et al. 2000). The FA22 line was selected for higher levels of oleic acid (~50%) (Alt et al. 2005). CD219RR and Tucunaré (cultivars adapted to conditions in central Brazil) have linolenic acid and oleic acid concentrations of 8% and 19%, respectively. All progeny and parental lines were grown in a greenhouse under controlled temperature and humidity conditions. Cultivation practices were the same as recommended for the crop.
Leaf tissues were collected from parents, the F2 progeny (A29 x Tucunaré), and RILs (FA22 x CD219) 15 days after planting. For extraction of genomic DNA, we used the method proposed by Doyle and Doyle (1990). DNA concentration and quality were determined using the NanoDrop spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). The DNA obtained underwent electrophoresis on 0.8% agarose gel for quality evaluation. Based on the estimated concentration, samples were diluted to a final concentration of 15 ng µL⁻¹.
Composition of fatty acids in the seed oil fraction was obtained by gas chromatography (Bubeck et al. 1989).Derived methyl esters were fractionated in a Carbowax column of 30 m length and 0.32 mm internal diameter, with temperatures of the injector-column and detector maintained at 240 °C and 280 °C, respectively.
Based on the SNP polymorphisms identified in the ABI3, ARAF, FAD2-1, FAD3B, and FAD3C genes, TSP primers were designed for genotyping in the study populations (Hayden et al. 2009). TSP primers were designed with the help of the Primer 3 and Primer Select programs. TSP reactions were performed in 15 µL containing 1 U Platinum® Taq DNA polymerase, 20 mM Tris-HCl pH 8.4, 50 mM KCl, 1.5 mM MgCl₂, 0.1 µg µL⁻¹ bovine serum albumin, 0.01% Tween-20, 1.0% formamide, 0.2 mM of each dNTP (dATP, dTTP, dGTP, and dCTP), 0.083 µM of site-specific primer, 0.83 µM of informative primer, and 30 ng of template DNA. PCR reactions were performed in an Eppendorf (AG-22331 model) thermocycler. For PCR, we used an initial step at 94 °C for 2 min, followed by 10 cycles at 94 °C for 30 sec, 58 °C for 30 sec, and 72 °C for 60 sec; 5 cycles at 94 °C for 10 sec and 45 °C for 30 sec; and 15 cycles at 94 °C for 30 sec, 53 °C for 30 sec, and 72 °C for 5 sec. The products obtained were visualized by electrophoresis in 2% agarose gel containing ethidium bromide (0.2 µg mL⁻¹) under UV using the L-PIX system.
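The three-phase TSP cycling program described above can be written down as a small data structure, which makes the temperature switch between phases explicit. The following Python sketch is only an illustration: the phase names are descriptive labels chosen here, not terms from the protocol, and the computed time counts only the programmed hold steps (ramp times of the thermocycler are not included).

```python
# Each phase: a label (our own wording), the number of cycles, and the
# (temperature in degrees C, hold time in seconds) steps within one cycle.
TSP_PROGRAM = [
    {"name": "initial denaturation", "cycles": 1,
     "steps": [(94, 120)]},
    {"name": "phase 1, annealing at 58 C", "cycles": 10,
     "steps": [(94, 30), (58, 30), (72, 60)]},
    {"name": "phase 2, annealing at 45 C", "cycles": 5,
     "steps": [(94, 10), (45, 30)]},
    {"name": "phase 3, annealing at 53 C", "cycles": 15,
     "steps": [(94, 30), (53, 30), (72, 5)]},
]


def total_hold_seconds(program):
    """Sum the programmed hold times over all cycles of all phases."""
    return sum(
        phase["cycles"] * sum(seconds for _temp, seconds in phase["steps"])
        for phase in program
    )


if __name__ == "__main__":
    for phase in TSP_PROGRAM:
        print(phase["name"], "x", phase["cycles"], phase["steps"])
    minutes = total_hold_seconds(TSP_PROGRAM) / 60
    print(f"programmed hold time: {minutes:.1f} min (ramping excluded)")
```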
Segregation of markers was tested using the Bonferroni-corrected chi-square test (p < 0.05). The normality of phenotypic data was tested by the Lilliefors test. Marker-trait association was determined by linear regression.
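As an illustration of the segregation test, the sketch below computes a chi-square goodness-of-fit statistic for genotype counts against a Mendelian ratio (1:1 for RILs, 1:2:1 for an F2) and compares the p-value with a Bonferroni-adjusted threshold. The genotype counts are hypothetical, not data from this study, and dividing 0.05 by five tests (one per marker) is an assumption about how the correction was applied. The p-value uses the closed forms of the chi-square survival function for 1 and 2 degrees of freedom, so only the Python standard library is needed.

```python
import math


def chi_square_gof(observed, ratio):
    """Chi-square goodness-of-fit of genotype counts to an expected ratio.

    Returns (statistic, degrees of freedom, p-value). Only two or three
    genotype classes are handled, which covers 1:1 and 1:2:1 ratios.
    """
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))  # exact for df = 1
    elif df == 2:
        p_value = math.exp(-stat / 2.0)             # exact for df = 2
    else:
        raise ValueError("this sketch handles 2 or 3 genotype classes only")
    return stat, df, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical counts, chosen only to match the population sizes.
    f2_counts = [45, 95, 45]   # 185 F2 individuals, expected 1:2:1
    ril_counts = [130, 129]    # 259 RILs, expected 1:1
    threshold = bonferroni_threshold(0.05, 5)  # assumed: five markers tested
    for counts, ratio in ((f2_counts, [1, 2, 1]), (ril_counts, [1, 1])):
        stat, df, p = chi_square_gof(counts, ratio)
        verdict = "fits" if p > threshold else "deviates from"
        print(f"chi2={stat:.3f} df={df} p={p:.3f}: {verdict} the expected ratio")
```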

RESULTS AND DISCUSSION
Primer screening to access the transcribed regions of the candidate genes detected ten different fragments. We obtained 8,461 kbp of good quality sequences with a Phred value greater than or equal to 30. A total of 25 SNPs and three INDELs were identified. The ARAF gene had the highest number of polymorphisms, eight SNPs and three INDELs in the 5' UTR and 3' UTR regions. The FAD3B and FAD3C genes each showed a single SNP, detected in the coding region of each gene. The ABI3, FAD2-1B, and PDAT genes exhibited five SNPs each, dispersed across the coding regions, 5' UTR, and 3' UTR.
For each of the six candidate genes, one SNP was selected to design TSP primers (Table 1). Each TSP marker comprised a set of three primers. A pair of locus-specific (LS) primers with an annealing temperature (Ta) of 58 °C was used to amplify fragments of ~400 bp. An informative allele-specific (AS) primer with Ta of 45 °C, with its 3' end complementary to one of the alleles in the locus, was synthesized with an additional tail on the 5' end with two to three random nucleotides not complementary to the sequence of the DNA template. As a result, a fragment of ~100 bp was obtained, which allowed identification of the alternative SNP allele (Figure 1). Thus, heterozygous individuals can be characterized by the presence of two bands.
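The band-to-genotype logic of a TSP marker can be sketched in a few lines of Python. The sketch follows the footnote to Table 1: a sample carrying only the allele complementary to the AS primer shows the short (~100 bp) product, a sample carrying only the other allele shows the long (~400 bp) LS product, and a heterozygote shows both. The function name, the genotype labels, and the 30 bp sizing tolerance are hypothetical choices made for this illustration; in practice bands are scored visually on the agarose gel.

```python
def call_tsp_genotype(band_sizes_bp, short_bp=100, long_bp=400, tolerance_bp=30):
    """Assign a genotype class from the estimated sizes of the bands in a lane.

    short_bp and long_bp are the approximate AS and LS product sizes;
    tolerance_bp is an assumed allowance for gel sizing error.
    """
    has_short = any(abs(size - short_bp) <= tolerance_bp for size in band_sizes_bp)
    has_long = any(abs(size - long_bp) <= tolerance_bp for size in band_sizes_bp)
    if has_short and has_long:
        return "heterozygous"
    if has_short:
        return "homozygous_AS_matching"   # allele complementary to the AS primer
    if has_long:
        return "homozygous_non_matching"  # allele not complementary to the AS primer
    return "no_call"                      # failed reaction or unexpected band


if __name__ == "__main__":
    # Hypothetical lanes: sample name -> estimated band sizes in bp.
    lanes = {"parent_1": [105], "parent_2": [395], "F2_017": [100, 400], "F2_018": []}
    for sample, bands in lanes.items():
        print(sample, call_tsp_genotype(bands))
```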
Genomic DNA of the parents A29, CD219, Tucunaré, and FA22 was used to analyze polymorphisms of the developed primers (Figure 1). TSP primers designed to discriminate polymorphisms in the ARAF, PDAT, FAD3B, FAD3C, and ABI3 genes showed polymorphic bands, with no weak or nonspecific bands. The primers designed to discriminate alleles of the FAD2-1B gene showed nonspecific amplification and were discarded.
Polymorphic TSP primers of the ARAF and PDAT genes were amplified in 259 RILs of the cross FA22 x CD219. The two markers showed the expected 1:1 segregation (Table 2). The segregation of markers for the ABI3, FAD3B, and FAD3C genes was tested in 185 F2 individuals of the cross A29 x Tucunaré, yielding the expected segregation pattern of 1:2:1 (Table 2).
Descriptive statistics of the fatty acid content in the segregating populations and their parents are given in Table 3. The A29 genotype stands out for having a low content of linolenic acid (1.19 ± 0.17%) and high content of linoleic acid (62.03 ± 3.41%). Bilyeu et al. (2005) and Pinto et al. (2013) reported the same levels of linolenic acid for accession A29. The FA22 genotype showed medium content of oleic acid (38.57 ± 4.04%), which was less than expected, probably due to environmental factors. Tucunaré and CD219 have an average content of 8% linolenic acid and 19% oleic acid. However, Tucunaré and CD219 have different levels of linoleic acid (58% and 48%, respectively).
The phenotypic distribution for linolenic acid content in the F2:3 population (A29 x Tucunaré) ranged from 0.97% to 12.90%. The population mean of 4.21% is close to the mean of the parents (4.59%), indicating predominance of additive effects in control of linolenic acid content (Pinto et al. 2013). The average content of linoleic acid in this population was 57.04%, ranging from 37.06% to 65.92%.
In the FA22 x CD219 population, mean oleic acid content was 35.73%. This indicates a deviation toward the mean of the FA22 parent. The range of variation for oleic acid in this population was 12.81% to 46.38%, an indication of transgressive segregation. Other authors identified transgressive segregation in populations derived from FA22 (Alt et al. 2005). The existence of transgressive segregants indicates the possibility of genetic complementation by fixing alleles of different loci in order to increase the oleic acid content. In this study, due to the amplification of nonspecific DNA fragments, it was not possible to obtain TSP markers for polymorphisms tested in the FAD2-1 gene. SNP markers for the ARAF and PDAT genes tested in the FA22 x CD219 population were not significantly associated with fatty acid content (Table 4). In the A29 x Tucunaré population, markers for the ABI3, FAD3B, and FAD3C genes showed significant association (Table 4). The SNP marker for the ABI3 gene was significantly associated with concentration of palmitic acid, explaining 5.42% of phenotypic variation. The FAD3B marker explained 5.84% of the variation in linoleic acid concentration and 6.79% of the variation in linolenic acid concentration. The FAD3C marker explained 9.21% and 18.51% of the variation in linoleic and linolenic acid concentrations, respectively.
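The proportion of phenotypic variation explained by a marker (R²) can be obtained from a simple linear regression of the trait on the marker genotype. The Python sketch below is a minimal illustration under stated assumptions: genotypes are coded additively as 0, 1, and 2 copies of one allele (the coding used in the study is not specified), and the trait values are invented for the example rather than taken from Table 4.

```python
def marker_trait_regression(genotype_codes, trait_values):
    """Least-squares regression of a trait on additive genotype codes.

    Returns (slope, intercept, r_squared). The slope estimates the average
    effect of one allele substitution under the assumed 0/1/2 coding.
    """
    n = len(genotype_codes)
    if n != len(trait_values) or n < 3:
        raise ValueError("need paired observations for at least 3 individuals")
    mean_x = sum(genotype_codes) / n
    mean_y = sum(trait_values) / n
    sxx = sum((x - mean_x) ** 2 for x in genotype_codes)
    syy = sum((y - mean_y) ** 2 for y in trait_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(genotype_codes, trait_values))
    if sxx == 0 or syy == 0:
        raise ValueError("marker or trait shows no variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # equals the squared correlation
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical F2 data: genotype code and linolenic acid content (%).
    codes = [0, 0, 1, 1, 2, 2]
    linolenic = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    slope, intercept, r2 = marker_trait_regression(codes, linolenic)
    print(f"slope={slope:.2f} intercept={intercept:.2f} R2={100 * r2:.2f}%")
```

With real data, R² values of the size reported here (roughly 5% to 19%) correspond to markers that explain only part of the variation, which is consistent with quantitative inheritance.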
TSP genotyping was efficient, resolving the three genotypic classes of the markers tested in the F2 population (A29 x Tucunaré) and the two genotypic classes in the RIL population (FA22 x CD219). The TSP genotyping method proved to be effective because it is simple, inexpensive, accurate, and easy to implement, enabling its use in MAS (Hayden et al. 2009, Tabone et al. 2009).
The ability to detect heterozygosity by TSP allows its application in marker-assisted backcrossing or early selection for quantitative traits (Singh and Singh 2015, Shi et al. 2015). Recent research has shown that inheritance of fatty acid content is quantitative in nature, mainly controlled by moderate to large-effect genes (Xie et al. 2012, Wang et al. 2012). Additive genetic effects have been shown to be predominant for the SNP markers in the FAD3B and FAD3C genes (Pinto et al. 2013). The effectiveness of FAD3B and FAD3C markers for selection of individuals with low concentration of linolenic acid has been successfully tested in populations with different genetic backgrounds (Bilyeu et al. 2005, 2006).
Although constraints have limited the use of MAS in plant breeding programs in Brazil (Sakiyama et al. 2014), use of molecular markers is effective for some traits, mainly in transfer of major genes (Santana et al. 2014, Yamanaka et al. 2013, Dimitrijević et al. 2017). The use of backcrossing may be the best approach to transfer low linolenic acid and high oleic acid contents to elite cultivars. In this case, for each backcross cycle, it is necessary to evaluate progenies phenotypically in the F2 or F2:3 generations. With MAS, BCnF1 individuals can be tested with molecular markers for the presence of the allele of interest, saving at least one generation of selfing or reducing the amount of testing needed for the selection of low linolenic/high oleic acid traits.
In addition to the FAD3B and FAD3C genes, the FAD3A gene has a strong influence on the concentration of linolenic acid (Cardinal et al. 2011).Different combinations of mutations in these three genes are capable of producing soybeans with low levels of linolenic acid (1%) (Bilyeu et al. 2011).With validation of markers for these three genes, it will be possible to use marker-assisted selection to introduce alleles for low linolenic acid content.
Another interesting fact is the significant association of the ABI3 gene with palmitic acid content. The ABI3 (Abscisic Acid Insensitive 3) gene belongs to the family of regulatory genes responsible for activation of fatty acid synthesis in seeds and other processes related to seed ontogeny (Baud and Lepiniec 2010). This association may be explained by the presence of palmitic acid as the first intermediate to be released from the fatty acid synthase complex and transported from the chloroplast to the cytosol (Baud and Lepiniec 2010). As the ABI3 gene is a global enabler of lipogenesis, the effect can be more pronounced in palmitic acid content. However, this is only one piece of evidence for this hypothesis.

Table 1. Characterization of SNP alleles, their localization, and primers designed to access polymorphisms by TS-PCR for each candidate gene
a An SNP allele complementary to the allele-specific (AS) primer produces the reference SNP allele, with a shorter PCR product. The allele with no complementarity produces an alternative SNP allele, with PCR products of greater length. Heterozygous samples produce both PCR products.
b The 5' arbitrary sequence of nucleotides shown in bold and underlined is not complementary to target DNA.
c LS: locus-specific primer.

Table 4. Associative analysis of SNP markers identified in the ARAF, PDAT, ABI3, FAD3B, and FAD3C genes with fatty acid content, evaluated in populations derived from the crosses FA22 x CD219 and A29 x Tucunaré