Effect of a mutation in Raffinose Synthase 2 (GmRS2) on soybean quality traits

The presence of stachyose and raffinose is considered an antinutritional factor for humans and monogastric animals, leading to limitations on soybean consumption as a protein source. In the present study, the effect of a mutation in the raffinose synthase 2 gene was measured on soybean quality traits. We used an F2 population with 168 soybean individuals developed by crossing four soybean lines and evaluated oil, protein, sucrose, raffinose, stachyose and fatty acid contents and their relationships. The mutation explained 69.61%, 51.81% and 31.96% of stachyose, raffinose and sucrose variation, respectively, and we were able to produce soybean with average stachyose of 0.18%. The low coefficients of determination for protein and oil indicate that the mutation can be used to increase sucrose and reduce raffinose and stachyose content without major changes in oil and protein.


INTRODUCTION
Soybean (Glycine max (L.) Merrill) is one of the most important crops in the world and is among the main commodities in Brazilian agribusiness. Brazil is the second largest producer, with 119.281 million tons of grain in the 2017/2018 harvest, about one-third of global production (CONAB 2019). Over the years, the demand for special soybeans has been increasing due to the direct use of the grain or its use in the production of derivatives, such as textured protein, tofu, soymilk, natto and edamame (Wang et al. 2014, Matei et al. 2017, Jiang et al. 2018).
Oligosaccharide content is one of the factors responsible for soybean quality. Sucrose is the only oligosaccharide useful for monogastric digestion (Yang et al. 2014), and its content in the grain is a critical factor in the production of soybean derivatives (Sato et al. 2014, Wang et al. 2014). In addition, sucrose is probably the main factor influencing the flavor of vegetable soybean (Li et al. 2012).
The presence of the raffinose family of oligosaccharides (RFOs) in soybean, which includes the α-galactosides raffinose and stachyose, is considered an antinutritional factor because humans and monogastric animals do not have α-galactosidase, the enzyme responsible for hydrolyzing RFOs (Yang et al. 2014, Matei et al. 2017). RFO consumption by organisms without α-galactosidase can cause organic dysfunctions such as diarrhea, nausea and flatulence (Liener 1994, Reddy et al. 2016, Matei et al. 2017). Stachyose is the second most abundant soluble sugar in soybean, usually ranging from 1.4-4.1% (Hymowitz et al. 1972, Zeng et al. 2015), and leads to limitations on soybean consumption as a protein source. Reducing stachyose and raffinose will increase the metabolizable energy and decrease the undesirable effects of RFO consumption (Suarez et al. 1999, Parsons et al. 2000, Dierking and Bilyeu 2008).
The literature describes several procedures to reduce or eliminate stachyose content in soybean grains and byproducts. These include imbibition and germination (Kim et al. 1973); fermentation processes (Mital and Steinkraus 1975); oligosaccharide extraction in water (Ku et al. 1976); ultrafiltration of water-soluble soy extract (Omosaiye et al. 1978); oligosaccharide extraction with ethanol from soybean meal (Leske et al. 1991); and use of plant and microbial α-galactosidase (Guimarães et al. 2001), among others. However, the production of low stachyose soybean varieties, associated with high levels of sucrose and protein, is an interesting alternative to improve the nutritional quality of soybeans (Sato et al. 2014), since these varieties will reduce the time and cost of processing the grain and its derivatives.
Raffinose and stachyose biosynthesis is mediated by raffinose synthase enzymes, which transfer galactosyl residues from galactinol to sucrose (Dierking and Bilyeu 2008, Qiu et al. 2015, Bilyeu and Wiebold 2016). Skoneczka et al. (2009) characterized a three base pair deletion in the raffinose synthase 2 gene (GmRS2, Glyma.06G179200) in the PI200508 accession (high sucrose and low stachyose content). This mutation was analyzed in two F2 populations derived from PI200508 and explained 88-94% and 76% of the variation in stachyose and sucrose content in the grain, respectively. Neus et al. (2005) reported that seed vigor is not affected in lines with low RFO content derived from PI200508, and that there are no significant differences in characteristics such as emergence in the field, seed yield, maturity, height and fatty acid contents. These characteristics are important for soybean nutritional quality (Yang et al. 2014), making PI200508 a good allele donor for low RFO content.
Correlations between grain quality traits have been investigated by many researchers (Hymowitz et al. 1972, Hartwig et al. 1997, Wilcox and Shibles 2001, Jiang et al. 2018). Sucrose, raffinose and stachyose are produced in the same metabolic pathway (Dierking and Bilyeu 2008, Bilyeu and Wiebold 2016), and an increase in sucrose content through inhibition of RFO synthesis is expected without a decrease in protein content (Sato et al. 2014). In the present study, we used a population derived by crossing four soybean lines, segregating for several grain quality characteristics, to validate a new molecular marker based on the mutation found in the line PI200508, and to study the effect of this mutation on soybean quality traits.

Plant material and population development
The present study developed an F2 population with 168 individuals by crossing PI603452, PI283327, PI200508 and NA5909. PI200508 has a mutation associated with low stachyose levels (Skoneczka et al. 2009). Accessions PI603452 and PI283327 have mutations in the GmFAD2-1A and GmFAD2-1B genes, respectively, and when combined can produce soybean with more than 80% oleic acid (Pham et al. 2011, Pham et al. 2012). The cultivar NA5909 (Nidera Seeds) is a high agronomic performance variety. The population was developed using the following steps: PI603452 was crossed with PI283327, and the resultant F1 was crossed with PI200508. This new F1 was crossed with the NA5909 variety. These F1 plants were genotyped for the mutation in the GmRS2 gene from PI200508 (see the next section, Genotyping Analysis), and heterozygous plants were used to produce the F2 population.
The varieties UFVTN105AP (high protein, BIOAGRO/UFV), CS303TNKCA (low linolenic, middle oil, BIOAGRO/UFV) and Tucunaré (good agronomic traits, Mato Grosso Foundation) were used as additional controls. All the crosses were made and the F2 population was grown in a greenhouse at the Universidade Federal de Viçosa, in Viçosa, Minas Gerais. F2 seeds were planted in June and harvested in October 2015. F2:3 seeds were used for phenotypic analysis.

Genotyping Analysis
Leaf samples from PI603452, PI283327, PI200508, NA5909, F1 and F2 plants were harvested at the V2 stage (Fehr and Caviness 1977), frozen in liquid nitrogen and stored in a freezer at -80 °C. Genomic DNA was extracted using the methodology proposed by Doyle and Doyle (1990). The DNA concentration was determined using a NanoDrop spectrophotometer (NanoDrop Technologies, Wilmington, DE) and the quality was checked by 0.8% agarose gel electrophoresis.
LCC Silva et al.
Primers developed by Skoneczka et al. (2009) (Forward 5'-GGACTTGAAGGAACAGTTTAGG-3', and Reverse 5'-CGTTACTGACGATCTTATCCAC-3') were evaluated for their suitability for High Resolution Melt (HRM) genotyping (Liew et al. 2004, Simko 2016). Subsequently, the same primers were used to sequence the mutation region of the GmRS2 gene in the four parents used in this study. After sequencing, a new set of HRM primers was designed (Forward 5'-GTGGAGCAGGTGTATGTG-3', and Reverse 5'-GTCTGACCCCACCCCAATAC-3'). The F2 population was genotyped in a RotorGene-Q (QIAGEN), and the reaction was carried out using 30 ng of DNA, 0.7 μM of each primer, and 5 μL of Type-it HRM 2X PCR kit (QIAGEN), in a total volume of 10 μL. DNAs from PI200508, NA5909 and one F1 plant were used as standards for mutant, wild-type and heterozygous genotypes, respectively. Each sample was run in duplicate, and the PCR was performed under the following conditions: 94 °C for 10 minutes, followed by 40 cycles of 94 °C for 20 seconds, 54 °C for 20 seconds, and 72 °C for 30 seconds. Finally, the amplification products were subjected to a temperature gradient from 60 to 90 °C, with fluorescence read every 0.1 °C to determine the melting curves.
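The melting step described above can be sketched in a few lines of Python. This is only an illustration: the fluorescence values are synthetic, the melting temperature of 78 °C is a made-up value, and the simple min-max normalization is an assumption that stands in for whatever normalization the instrument software applies. The function names are hypothetical.

```python
import math

def melt_temperatures(start=60.0, stop=90.0, step=0.1):
    """Temperatures at which fluorescence is read during the melt (60 to 90 °C, every 0.1 °C)."""
    n_steps = round((stop - start) / step)
    return [round(start + i * step, 1) for i in range(n_steps + 1)]

def normalize_curve(fluorescence):
    """Rescale a raw melt curve to 0-100% (simple min-max normalization, an assumption)."""
    lowest, highest = min(fluorescence), max(fluorescence)
    return [100.0 * (f - lowest) / (highest - lowest) for f in fluorescence]

temps = melt_temperatures()

# Synthetic sigmoid melt curve with a hypothetical melting temperature of 78 °C.
raw = [1000.0 / (1.0 + math.exp((t - 78.0) / 0.8)) + 50.0 for t in temps]
normalized = normalize_curve(raw)

print(len(temps), "readings from", temps[0], "to", temps[-1], "°C")
```

Under these assumptions, normalized curves from the mutant, wild-type and heterozygous standards could be overlaid with each sample to assign its genotype by shape.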

Phenotyping analyses
Moisture, protein, and oil contents were determined using near-infrared (NIR) spectroscopy (FT-NIR analyzer, Thermo Scientific, model Antaris II) using 20-30 crushed seeds. The extraction of sucrose, raffinose and stachyose oligosaccharides was performed as described by Teixeira et al. (2012), and the extract was analyzed by high performance liquid chromatography (HPLC) in a Shimadzu Prominence chromatograph using acetonitrile as a mobile phase. Lipids were extracted as described by Gesteira et al. (2003), and analyzed in a Shimadzu GC-2010 Plus chromatograph.

Statistical analysis
To verify the segregation of the GmRS2 gene mutation in the F2 population, the Chi-Square test was used. Descriptive statistics (mean, maximum, minimum, amplitude, variance and standard deviation) were calculated for each mutation genotype and for each trait. The Pearson Correlation Coefficient was calculated for each pair of characteristics, and its significance was assessed by the t-test. To estimate the effect of the GmRS2 gene on the evaluated characteristics, linear regression was performed. The significance of the regression coefficient was verified using the t-test, and the fit of the linear model for each regression was measured by the coefficient of determination. Graphs were constructed showing the distribution of sucrose, raffinose and stachyose content for mutant, wild-type and heterozygous individuals, showing their means and coefficient of determination.
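As a minimal sketch of the regression step, the example below fits a trait on genotype by ordinary least squares and reports the coefficient of determination. The coding of genotypes as 0 (mutant), 1 (heterozygous) and 2 (wild type) is an assumption, since the text does not state how genotypes were coded. The stachyose values are invented for illustration and are not data from this study.

```python
def linear_regression(x, y):
    """Ordinary least squares fit of y on x; returns slope, intercept and R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical data: genotype coded as number of wild-type alleles (assumed coding).
genotype_codes = [0, 0, 0, 1, 1, 1, 2, 2, 2]
stachyose_pct = [0.10, 0.20, 0.25, 1.00, 1.30, 1.10, 2.00, 2.20, 1.90]

slope, intercept, r_squared = linear_regression(genotype_codes, stachyose_pct)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R2 = {r_squared:.3f}")
```

In this sketch, R2 is read as the fraction of trait variation explained by the genotype, which is how the percentages for stachyose, raffinose and sucrose are reported in this paper.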

Population development and genotyping method
The population used in this study was developed by crossing four soybean lines as described in the methodology. We first used the primers designed by Skoneczka et al. (2009) for HRM genotyping, to select individuals from the F2 population carrying the mutation in the GmRS2 gene (Glyma.06G179200). However, the marker did not show Mendelian segregation in the evaluated population, probably due to some unidentified mutation in the amplified region of the accessions used. To evaluate this hypothesis, we amplified and sequenced the fragments from PI200508, PI283327, PI603452, and NA5909 using the same primers developed by Skoneczka et al. (2009), which generated fragments ranging from 182 to 185 bp. We re-identified the 3 bp deletion in the PI200508 accession; however, we also found a silent mutation at position 25 (C>T) (Figure 1A) in the PI603452 accession, which is used for the development of soybean with more than 80% oleic acid (Pham et al. 2011, Pham et al. 2012). Since the High Resolution Melt technique is sensitive to variations of a single nucleotide (Wu et al. 2008, Simko 2016), this mutation can alter the denaturation curve and consequently prevent the correct identification of genotypes derived from PI603452. Therefore, we redesigned the marker to amplify a 55 bp fragment that does not include the PI603452 mutation region (Figure 1A and 1B). A total of 168 F2 plants were genotyped, and this new marker behaved as expected for Mendelian segregation. We found 42 mutant, 78 heterozygous and 48 wild-type plants, giving a chi-square of 1.29 (p = 0.53) for the expected 1:2:1 ratio.
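The segregation test can be reproduced from the genotype counts reported above. The sketch below is one way to do it, assuming the expected ratio is 1:2:1 (mutant : heterozygous : wild type) with two degrees of freedom. The function name is hypothetical.

```python
import math

def chi_square_1_2_1(mutant, heterozygous, wild_type):
    """Chi-square goodness-of-fit statistic and p-value against a 1:2:1 ratio."""
    observed = [mutant, heterozygous, wild_type]
    total = sum(observed)
    expected = [total * 0.25, total * 0.50, total * 0.25]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Three classes give 2 degrees of freedom; for df = 2 the chi-square
    # survival function has the closed form exp(-chi2 / 2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value

chi2, p_value = chi_square_1_2_1(42, 78, 48)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2f}")  # chi-square = 1.29, p = 0.53
```

With 168 plants the expected counts are 42, 84 and 42, so only the heterozygous and wild-type classes contribute to the statistic.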

Descriptive statistics
The population used in this study was phenotyped, the data were statistically evaluated and the variability for all characteristics analyzed was determined (Table 1). The stachyose content ranged from 0.04-0.55% for mutants and 1.11-2.96% for wild-type individuals, with averages of 0.18% and 2.08%, respectively. Skoneczka et al. (2009) found higher levels of stachyose, ranging from 0.56-1.47% for mutant and 2.58-5.63% for wild-type individuals, using two F2 populations derived from PI200508 and the accessions PI87013 or PI243545. The sucrose and raffinose contents ranged from 1.33-7.07% and 0.02-0.84%, respectively. A broad range of variation was found in protein and oil contents (33.00-47.82% and 12.75-25.51%, respectively), with maximum values close to those found in the UFVTN105AP (high protein) and CS303TNKCA (middle oil) control varieties. The fatty acid contents ranged between 6.67-17.32% for palmitic acid, 2.38-5.45% for stearic acid, 17.11-82.17% for oleic acid, 4.01-60.99% for linoleic acid, and 3.70-10.96% for linolenic acid. The results showed that maximum and minimum values for all characteristics exceeded the parental values, indicating the occurrence of transgressive segregation. This phenomenon occurs when the parental individuals are divergent for the trait and do not have the extreme genotypic combinations (Grant 1964), and it has been observed in other studies with soybean populations segregating for protein content (Hyten et al. 2004), grain yield and weight (Mansur et al. 1993), sucrose and oligosaccharide content (Kim et al. 2006), and oleic (Pham et al. 2011, Bueno et al. 2018) and linoleic acid contents (Pham et al. 2011). Progenies derived from divergent crosses are expected to show a broad spectrum of genetic variability, providing an increase in the number of transgressive segregating individuals (Tyagi and Khan 2010, Mughal et al. 2015).

Linear regression and relations between seed quality features
Linear regression analysis showed a significant association between the GmRS2 mutation and sucrose, raffinose, stachyose, protein and oil contents (Table 2). Graphs were designed using the distribution of sucrose, raffinose or stachyose contents and the genotypes identified (Figure 2). The mutation explained 69.61%, 51.81% and 31.96% of the variation in stachyose, raffinose and sucrose content, respectively. Skoneczka et al. (2009) found different coefficients of determination, and this can be explained in part by the differences in experimental location and in the population used. We developed a population using four divergent parents, which may lead to an increase in segregation, generating greater genetic variability (Hanson 1959, Alliprandini and Vello 2004).
We found an association between the mutation in the GmRS2 gene and protein and oil contents, but the coefficients of determination were low (4.1% for protein and 2.6% for oil), indicating that the mutation can be used to increase sucrose and reduce raffinose and stachyose contents without major changes in oil and protein, as suggested by Sato et al. (2014). Additionally, no association was found between the mutation and the total amount of oligosaccharides (sum of sucrose, raffinose and stachyose contents), indicating that the variation in oligosaccharide content is mainly due to the reduced conversion of sucrose into raffinose (Skoneczka et al. 2009, Yang et al. 2014). Regarding the composition of oil, no association was found between the mutation and palmitic, stearic, oleic, linoleic and linolenic acids, agreeing with Neus et al. (2005).
A negative correlation between protein and sucrose was found (-0.40) (Table 3), weaker than what has been previously reported. Hartwig et al. (1997) evaluated 20 high protein and 20 high oil soybean cultivars and breeding lines and found a correlation of -0.78 between protein and sucrose. Wilcox and Shibles (2001) found a correlation of -0.66 for these two traits evaluating F4:5, F4:6 and F4:7 soybean populations derived from crossing C1834 (low protein) and CX1314-37 (high protein) lines. In another study, Jaureguy et al. (2011) evaluated the protein and sugar content of 98 F4:5 soybean RILs derived from crossing R95-1705 (45.9% protein and 3.4% sucrose, on average) and MFL-552 (41.7% protein and 4.72% sucrose, on average), and found an average correlation of -0.68 between these traits. Sato et al. (2014) found a correlation of -0.86 between protein and sucrose in a study with four soybean populations derived from different crosses. Sucrose is directly involved in protein biosynthesis (Li et al. 2012), and the negative correlation between sucrose and protein is probably due to the competition for ATP and carbon skeletons required by each pathway (Paul and Foyer 2001). Carbohydrate accumulation is an important factor involved in protein production during the grain filling period, and for this reason, carbon and nitrogen metabolism are not completely independent (Li et al. 2012). The weaker negative correlation in our data may be due to the segregation of other genes with effects on both characteristics, given that the effect of the mutation in the GmRS2 gene is only related to the conversion of sucrose into raffinose.
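The t-test named in the statistical analysis can be illustrated with the protein x sucrose correlation reported above. The sketch assumes that all 168 individuals entered the correlation, which the text does not state explicitly. The function name is hypothetical.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n pairs against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))

r = -0.40  # protein x sucrose correlation reported in Table 3
n = 168    # assumption: every genotyped individual was also phenotyped
t = correlation_t_statistic(r, n)
print(f"t = {t:.2f} with {n - 2} degrees of freedom")
```

Under this assumption the statistic is about -5.6, well beyond the two-sided 5% critical value of roughly 1.97 for 166 degrees of freedom. The correlation is therefore significant even though it is weaker than those reported in earlier studies.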
Breeding programs focused on the development of special soybeans seek adequate seed size, high levels of protein and sucrose, and low levels of the oligosaccharides raffinose and stachyose (Chen 2004, Jaureguy et al. 2011).
In the present study, we developed and validated a new genotyping methodology for the mutation in the GmRS2 gene (Skoneczka et al. 2009), and used it to evaluate the effect of this mutation on several grain quality characteristics. The results show that it is possible to increase sucrose and reduce stachyose and raffinose contents without major changes in other grain quality characteristics, such as oil and protein, by interrupting the conversion of sucrose into raffinose. This information is important to help breeders develop varieties that best suit the market for specialty soybeans.

Figure 1. A. Alignment of a GmRS2 fragment amplified by the primers described by Skoneczka et al. (2009) for all parents of the segregating population and the reference genome (Wm_82). The new mutation identified in PI603452 (I), the previously identified mutation in PI200508 (II) and the position of the new set of primers for HRM are shown. B. High resolution melt marker assay. The genotyping method was developed based on a 3-bp deletion in the GmRS2 soybean gene found in PI200508, as described in the methodology. The normalized melt curves of mutant, heterozygous and wild-type genotypes are shown.

Figure 2. Distribution graphs of stachyose, raffinose and sucrose content for 168 soybean individuals in a segregating population separated by GmRS2 genotypes. The coefficient of determination of "stachyose, raffinose or sucrose content x GmRS2 genotype" and the average of the features of each genotype are shown. The genotype letters mean: mutant (M), heterozygous (H) and wild type (W).

Table 2. Linear regression of soybean seed quality features x GmRS2 genotype