Expression analysis in response to drought stress in soybean: Shedding light on the regulation of metabolic pathway genes

Metabolomics analysis of wild type Arabidopsis thaliana plants, under control and drought stress conditions revealed several metabolic pathways that are induced under water deficit. The metabolic response to drought stress is also associated with ABA dependent and independent pathways, allowing a better understanding of the molecular mechanisms in this model plant. Through combining an in silico approach and gene expression analysis by quantitative real-time PCR, the present work aims at identifying genes of soybean metabolic pathways potentially associated with water deficit. Digital expression patterns of Arabidopsis genes, which were selected based on the basis of literature reports, were evaluated under drought stress condition by Genevestigator. Genes that showed strong induction under drought stress were selected and used as bait to identify orthologs in the soybean genome. This allowed us to select 354 genes of putative soybean orthologs of 79 Arabidopsis genes belonging to 38 distinct metabolic pathways. The expression pattern of the selected genes was verified in the subtractive libraries available in the GENOSOJA project. Subsequently, 13 genes from different metabolic pathways were selected for validation by qPCR experiments. The expression of six genes was validated in plants undergoing drought stress in both pot-based and hydroponic cultivation systems. The results suggest that the metabolic response to drought stress is conserved in Arabidopsis and soybean plants.


Introduction
Crop plants are often exposed to various biotic (viruses, bacteria and fungi) and abiotic stress factors (such as water deficit and salinity) that may impair their growth, development and ultimately affect productivity (Kang et al., 2002;Mahajan and Tuteja, 2005). Damage caused by these stresses represents a major concern for producers, consumers and governments, especially in relation to crops of great economic importance, such as wheat, corn and soybean, whose losses may range between 78%-87% of maximum yield under ideal conditions (Bray et al., 2000).
Soybean [Glycine max (L.) Merr.], the most important legume grown worldwide, is an essential source of oil, protein, macronutrients and minerals (Clemente and Cahoon, 2009). Despite increased global demand, the current losses in soybean production are estimated to be over one fifth of the crop worldwide. Most of these losses are attributed to abiotic factors, responsible for a decrease of 69% in comparison to the record yield capacity (Bray et al., 2000). In Brazil, the occurrence of prolonged drought during summer has become increasingly common in recent years (Brando et al., 2010). In the state of Paraná, Brazil, soybean yields have fallen due to drought resulting in a cumulative decline of almost 11 million tons in total production (Franchini et al., 2009). In 2008-2009, losses due to drought in the north and west of the state of Paraná, were 80% (Franchini et al., 2009). This situation may become even more dramatic in light of current environmental predictions, which point to global warming and subsequent occurrence of drought in water-stressed regions, which represent one-third of the world's culturable land (Manavalan et al., 2009).
In order to better cope with drought stress, plants possess a large repertoire of morphological, biochemical, physiological and molecular adaptations and responses (Bray, 1993;Seki et al., 2003;Yamaguchi-Shinozaki and Shinozaki, 2006). Recent functional genomics studies using combined strategies of transcriptomics, proteomics, and metabolomics revealed a wide range of important genes involved in the synthesis of metabolites in response to drought, such as osmoprotectants, osmolytes, compatible solutes, or signaling molecules (Shinozaki and Yamaguchi-Shinozaki, 2007;Verbruggen and Hermans, 2008;Urano et al., 2010).
The accumulation of osmolytes in plant cells results in a decrease in osmotic potential, water absorption and cell turgor pressure, which contribute to the maintenance of physiological processes such as stomata opening, photosynthesis and plant growth (Hsiao, 1973;Shinozaki and Yamaguchi-Shinozaki, 2000;Baxter et al., 2007). Solute accumulation under stress is probably the most distinctive feature of an adaptive response to stresses that involve a component of water deficit, such as drought, freezing and salinity (Hsiao, 1973;Thomashow, 1999;Zhu, 2002). A specific physiological response to drought represents combinations of events that are activated and turned off by the perception of stress. An understanding of how these events interact is an important step towards the development of crops with greater tolerance to drought.
Two experimental procedures are usually applied to assess a gene expression profile during drought stress conditions in soybean: the pot-based system (PSys) (Casagrande et al., 2001;Qin et al., 2007;Martins et al., 2008;Tran et al., 2009) and the hydroponic system (HSys) (Martins et al., 2008;Kulcheski et al., 2010). Drought stress in plants cultured in PSys is more similar to field conditions, where the rate of water loss is slower, allowing acclimation to the drought condition (Cowan, 1965). In the HSys, the plants are placed in containers where a nutrient solution composed of water and nutrients circulates, without the presence of soil as a substrate. In this system, the simulation of drought is carried out by removing the plants from the nutrient medium, so water loss is more rapid, causing a shock in the plant, and within minutes it is possible to observe the physical effects caused by the stress. HSys does not allow plant acclimation (Munns et al., 2010).
In this work, we investigated several metabolic pathways potentially associated with water deficit in soybean (G. max). For this purpose, we employed different strategies, combining an in silico approach and gene expression analysis by qPCR. The gene expression analysis was performed with plants cultivated under HSys and PSys, which allowed us to compare the effects and responses to differences in acclimation. The identification of such genes is the first step to better understand the effects of water deficit on the regulation of expression of metabolic pathway genes in soybean. This knowledge should also be helpful in the identification of drought tolerant soybean cultivars and provide better tools to develop water-stress tolerant crops.

Plant material, growth conditions and treatments
The Glycine max L. Merrill cultivars BR 16 and Embrapa 48 have been shown to have contrasting responses to water deficit; BR 16 is very sensitive to drought, and Embrapa 48 shows a high tolerance to this stress (Casagrande et al., 2001;Texeira et al., 2008).
We used two different water deficit treatments, a pot-based system (PSys) in which plant were grown in sand and a hydroponics system (HSys) in which plants were grown in a nutrient solution (Martins et al., 2008;Kulcheski et al., 2010).
Plants grown in the PSys were maintained in a greenhouse at 30°C ± 5°C temperature and 60% ± 20% relative humidity. The cultivars BR16 and Embrapa 48 were germinated in washed sand where they remained for about 10 days. After this period, seedlings were transplanted to pots. Seedlings at the V4 development stage (fourth trifoliate fully expanded) (Fehr et al., 1971) were watered on a daily basis in the control pots, whereas watering was suspended in the pots of plants under drought stress. The water potential (Yw) was measured daily (always between 05:00 and 06:00) after the second day of the interruption of watering. The Yw for each plant was measured by the Scholandertype pressure chamber. Seven days after the interruption of watering the Yw was -1.5 ± 0.2 MPa (moderate stress level) and after ten days -3.0 ± 0.2 MPa (severe stress level). The roots with sand were removed from their pots and then immediately and gently rinsed with water for 1 min, in order to remove all the sand. To remove biological contaminants, the roots were carefully immersed in 2% SDS solution for 1 min, and washed gently with ultrapure water for 1 min. After this process, the root samples for one plant from each treatment, in total, two plants (two biological replicates), were immediately frozen in liquid nitrogen and stored at -80°C for RNA extraction.
For cultivation in the hydroponic system (HSys), seeds were pre-germinated on moist filter paper in dark conditions at 25°C ± 1°C and 65% ± 5% relative humidity. Plantlets were then placed in polystyrene supports in such a way that the roots of the seedlings were completely immersed in the solution. Each tray containing seedlings was maintained in a greenhouse at 25°C ± 2°C and 60% ± 5% relative humidity, under natural daylight (photosynthetic photon flux density (PPFD) = 1.5 x 10 3 mmoles m -2 s -1 , equivalent to 8.93 x 10 4 lux) and a 12 h day length. After 15 days, seedlings at V4 development stage were submitted to different treatments in which they were removed from the hydroponic solution and kept in a tray in the dark without nutrient solution or water for 0 min (T0, or unstressed), 50 min (T50), 100 min (T100) and 150 min (T150). Two biological replicates of root samples from both cultivars were collected at these time points and immediately frozen in liquid nitrogen followed by storage at -80°C for posterior RNA extraction.

Total RNA isolation
Root samples from the PSys were processed for RNA extraction using the Plant RNAeasy kit (Qiagen) following the manufacturer's instructions. The samples of dried roots from hydroponic experiments were processed for RNA extraction with Trizol ® Reagent (Invitrogen). To remove any DNA contamination, samples were treated with RNAsefree DNAseI (BioLabs). RNA concentration and purity were determined before and after DNAse I treatment using a NanoDropTM spectrophotometer ND-1000 (Thermo Scientific), and RNA integrity was verified by electrophoresis in a 1% agarose gel.

Real-time quantitative polymerase chain reaction (RT-qPCR)
Primers were designed using the Primer 3 plus software (Untergasser et al., 2007) using as criteria the generation of amplicons ranging from 80 to 200 bp with a Tm of 60°C ± 1°C (primer sequences are listed in Table S1). Both candidate and housekeeping genes were amplified in a one step protocol. As housekeeping genes, ACT11 (cytoskeleton structural protein) and FBOX (F-Box protein family) (Kulcheski et al., 2010) were used for normalization of target gene expression. Melting curve and gel electrophoresis analysis of the amplification products confirmed that the primers amplified only a single product of expected size (data not shown).
PCRs were carried out in an optical 96-well plate with a Realplex 4 Eppendorf Masterclycler ® Ep gradient sequence detection system (Eppendorf) Power SYBR ® Green RNA-to-Ct TM 1-Step Kit (Applied Biosystems) was used as recommended by the manufacturer. For each sample, 25 ng of RNA was used in the reaction mixture in a final volume of 20 mL. Reaction mixtures were incubated for 30 min at 48°C and 10 min at 95°C, followed by 40 amplification cycles of 15 s at 95°C, and 1 min at 60°C. Primer set efficiencies were estimated for each experimental set by Miner software (Zhao and Fernald, 2005) and these values were used in all subsequent analyses. Miner software was used to determine the starting and ending points of the exponential phase of PCR from raw fluorescence data. It also estimated primer set amplification efficiencies through a nonlinear regression algorithm without the need for a standard curve. In addition, the values of the threshold cycle (quantification cycle value -Cq) were converted by the program QBASE v1.3.5 (Hellemans et al., 2007) into relative amounts normalized (NRQ). All references and samples for each experimental condition were evaluated in technical triplicates.

Bioinformatic tools
Identification of metabolic pathway genes in soybean Arabidopsis genes associated with response to drought in different pathways were selected based on information from the literature (Sanchez et al., 2008;Bundy et al., 2009;Urano et al., 2009;Hey et al., 2010). Gene models for the metabolic pathway genes were obtained using the tools AraCyc metabolic pathway from the TAIR (The Arabidopsis Information Resource) and KEGG pathways websites. The digital expression pattern of these genes under drought conditions in Arabidopsis was evaluated by using the Genevestigator web tool (Hruz et al., 2008). Subsequently, the protein sequences of possible orthologs in soybean were used to conduct Blastp searches in Phytozome. All sequences with an e-value = 0, or, in the absence of sequences with e-value = 0, the first five with e-value lower than 10 -30 were analyzed for their presence in subtractive libraries available in the GENOSOJA LGE (Laboratory of Genomic and Expression: Project GENOSOJA) database (Rodrigues et al., 2012). These subtractive libraries are composed of samples from leaves and roots in three separate bulks with regard to the dehydration period: bulk 1 (T25-50 min); bulk 2 (T75-100 min) and bulk 3 (T125-150 min), for both cultivars (Rodrigues et al., 2012). The presence of a given gene in these libraries is indicative of the induction of its expression during water deficit. The selected genes represented in the libraries were also submitted to a dendrogram analysis, as well as a validation of their expression pattern through qPCR.

Generation of dendrograms
The protein sequences of A. thaliana were used to search for all aligned genes in G. max and Oryza sativa (out group) genomes, as well as in Arabidopsis. The alignment of amino acid sequences was done using the ClustalW2 software (Larkin et al., 2007). The software MEGA v.4 was used to construct dendrograms by means of the Neighbor-Joining algorithm (Tamura et al., 2007), under a Poisson model, complete deletion, and bootstrapping with 1,000 replications (Sitnikova et al., 1995). G. max, O. sativa and A. thaliana genes were selected considering e-values smaller than 10 -15 in the Phytozome and TAIR databases.

Promoter analysis
Sequences of 1,000 bp upstream to the start codon of the genes of the soybean genome were obtained by using 224 Expression analysis in response to drought stress the genome browse tool in the Phytozome database. Cisregulatory elements related to drought stress, salinity stress and ABA were identified in the database of Plant Cis program-acting Regulatory DNA Elements -(PLACE) by a keyword search (Higo et al., 1999). The POBO tool (Kankainen and Holm, 2004) was used for comparison of motif occurrences in promoters of putative orthologous genes by using the whole genome of G. max as background information.

Results
In silico identification and characterization of soybean genes involved in different pathways in response to dehydration The metabolic pathways of Arabidopsis involved the synthesis and degradation of metabolites during drought stress were selected based on information from the literature (Sanchez et al., 2008;Bundy et al., 2009;Urano et al., 2009;Hey et al., 2010). Each step of the metabolic pathways was investigated in the AraCyc metabolic pathway (Zhang et al., 2005) and KEGG pathway tools (Zhang and Wiemann, 2009). The digital expression profile for each gene under water deficit was evaluated through clustering analysis by the Genevestigator web tools (Hruz et al., 2008). This procedure allowed us to select 80 genes from Arabidopsis belonging to 39 different metabolic pathways that are regulated during water deficit (Table S2). For simplicity, this group was named "Arabidopsis Genes of the Metabolic Pathways" (AGMPs). The diagram of the search strategy employed is illustrated in Figure 1.
The 354 putative soybean orthologs of the 80 Arabidopsis genes were identified by Blastp searches on the Phytozome website. The putative soybean ortholog genes had their expression pattern evaluated by subtractive library tools of the GENOSOJA LGE (Laboratory of Genomic and Expression: Project GENOSOJA) (Rodrigues et al., 2012). This step allowed us to check whether the expression of these genes is induced during drought stress. The selection criteria were the presence of the gene in at least two subtractive libraries related to drought stress. This strategy allowed us to identify 13 putative soybean ortholog genes belonging to seven different metabolic pathways (data not shown). We herein focus on the description of three pathways: lysine degradation, putrescine biosynthesis and stachyose biosynthesis.
In order to identify the best candidates in the soybean genome for the AGMPs, we performed dendrogram analyses. These included the genes GmaxLKR/SDH-like1, GmaxLKR/SDH-like2 and GmaxADC2-like1 ( Figure 3) and also GmaxGOLS2-like1, GmaxGOLS2-like2, and GmaxGOLS2-like3 ( Figure 4). These genes are part of the metabolic pathways of lysine degradation II, putrescine biosynthesis I and stachyose biosynthesis, respectively (Figure 2). For those soybean genes where the neighbor-joining analysis was not able to determine the closest Arabidopsis ortholog, the selection of the soybean gene(s) for posterior analysis was based on their expression frequency in the drought induced subtractive library of the GENOSOJA LGE database (Table S2). The putative soybean orthologs of AGMPs were identified through Blastp searches in the soybean genome on the Phytozome website, followed by dendrogram analysis. For each AGMP, we identified a putative ortholog in the G. max and O. sativa genomes. The dendrogram analysis indicated that the Guimarães-Dias et al. 225  Arabidopsis genes AtLKR/SDH (At4g33150) and AtGOLS2 (At1g56600) have two putative orthologs in the soybean genome. For the gene GmaxLKR/SDH the putative orthologs are Glyma13g17580 and Glyma17g0492, while for the gene GmaxGOLS2 the putative orthologs are Glyma20g22700, Glyma03g38080 and Glyma19g40680 ( Figures 3A and 4). The dendrogram analysis of ADC2 pointed to Glyma04g00960 as being the closest gene to AGMP. However, Glyma04g00960 was present only in a single subtractive library whereas Glyma06g00990 was represented in four. Therefore, Glyma06g00990 was also selected to be validated by qPCR ( Figure 3B).

RT-qPCR
Through in silico analysis we selected six genes for validation by qPCR of root samples of the sensitive (BR16) and tolerant (Embrapa 48) cultivars submitted to water deficit in PSys and HSys.
The genes GmaxLKR/SDH-like1 and GmaxLKR/SDH-like2 showed higher expression in PSys compared to HSys ( Figure 5A, B). The expression profile in the sensitive cultivars showed a gradual increase in all conditions tested. Interestingly, the expression of GmaxLKR/SDH-like1 and GmaxLKR/SDH-like2 in the tolerant cultivar was down-regulated in the PSys when exposed to drought. In the HSys condition, these genes showed a higher increase in expression at a later time (T100 min and T150 min) in both cultivars.
The GmaxADC2-like1 gene showed similar expression dynamics for both cultivars in the two systems studied, with a peak of relative expression under moderate stress in PSys (Yw -1.5 MPa) at 100 min (T100) in the HSys culture condition. Furthermore, expression levels were significantly higher in the HSys condition ( Figure 5C).
The GmaxGOLS2-like1 gene presented a quite different expression profile during drought stress in the two tested systems when compared with the other two GmaxGOLS2 soybean orthologs, GmaxGOLS2-like2 and GmaxGOLS2-like3. It is worthy of note that the level of expression of GmaxGOLS2-like1 is eight times higher in the tolerant cultivar at an early time point (T50 min) in HSys compared to the non-stress sample, while the sensitive cultivar showed a level of expression four times higher for the same time point (T50 min) compared to the control sample. In PSys, the tolerant cultivar showed a subtle increase in the GmaxGOLS2-like1 expression level under moderate stress (-1.5 MPa) compared to the control, while the sensitive cultivar exhibited mild repression under the same stress level ( Figure 5D) The GmaxGOLS2-like2 and GmaxGOLS2-like3 showed fairly similar gene expression profiles for both 226 Expression analysis in response to drought stress  cultivars in the two systems studied. These genes reached the highest level of relative expression under the most severe stress (Yw -3.0 MPa) in the PSys condition. Notwithstanding, it is important to note that the expression level of GmaxGOLS2-like2 was about ten times higher than that of GmaxGOLS2-like3. In the HSys conditions, expression levels were very low for both cultivars which indicates that these genes are not regulated during water deficit stress in this system ( Figure 5E,F). In addition to the gene expression studies we investigated the presence of cis-regulatory elements in soybean drought-response genes selected for in silico analysis. By means of the Place tool, 17 candidate motifs related to drought were identified (data not show) and the statistical significance of their enrichment was assessed using the POBO tool, which compares motif abundance in the given promoter set relative to G. max background (BG) frequen-cies. The analysis revealed that two ABA responsive binding elements, named AREBs, (ACGTG and ACGTGKC) and one motif for the early response to dehydration, named ERD (ACGT) are enriched in the promoter of the selected genes when compared to the background genome. The analysis in POBO also indicated that the ACGTG motif was present in 54.5% of the promoters of all genes of interest. The average number of promoters that presented this motif was 2.55 compared to an average of 0.88 for all G. max promoters (BG) (t-test; p > 0.0001). The ACGTGKC motif was present in 54.5% of the promoters of all genes of interest. The average number of promoters that showed this motif in the selected gene set was 1.46 compared to an average of 0.13 for all G. max promoters (t-test; p > 0.0001). The ACGT motif is the most representative one within the set of target genes, being present in 81.8% of the promoters. The average number of promoters harboring this motif was Guimarães-Dias et al. 227 5.96 compared to an average of 3.03 in the promoter regions of the G. max genome (Table S3).

Discussion
Herein we identified several soybean genes that are responsive to drought stress. These belong to different metabolic pathways based on previous information of the model plant Arabidopsis (Taji et al., 2002;Sanchez et al., 2008;Urano et al., 2009Urano et al., , 2010. We identified 354 putative orthologs in the soybean genome within 39 metabolic pathways. We used the subtractive libraries performed on soybean root tissues obtained from the GENOSOJA database to direct us in the selection of the key genes. Through in silico analysis, we selected six soybean genes from three metabolic pathways for qPCR validation. The expression was assayed in roots of plants under water deficit in two ways: (i) PSys, in which the rate of water loss is slower, and allows the plant to adapt to the unfavorable environmental conditions, and (ii) HSys, in which the rate of water loss is very rapid, not giving the plant time to adapt to the stress conditions (Bray, 1993). Employing these alternative systems helped us to understand the control of gene expression involved in drought-induced metabolism.
Drought in plants starts as a complex set of responses, beginning with the perception of stress, which triggers a cascade of molecular events that comprise various levels of physiological, metabolic and developmental responses (Mahajan and Tuteja, 2005). Previous studies indicate that PSys and HSys physiological responses were observed at a stress level of -3.0 MPa and T100 min, respectively (Martins et al., 2008). At this point, soybean plants begin a process of wilting, where the rate of photosynthesis decreases, leading to stomata closure and increased leaf temperature. Our expression analysis allowed to characterize the two systems, revealing a distinct perception of stress in the plants kept under PSys and HSys in cultivars that are tolerant and sensitive to drought, respectively.
In previous studies carried out with different soybean cultivars, the Embrapa 48 cultivar showed a reduced response to the evaluated characteristics, such as lower rates of reduction in germination rate, lower percentage of reduction in primary root length, and lower photosynthetic rate under moderate and severe water deficit, compared to other cultivars, including BR16 (Casagrande et al., 2001;Texeira et al., 2008). Hence, the Embrapa 48 cultivar is considered more tolerant to water deficit because it reacts more rapidly to the adverse situation. In our analysis, GmaxGOLS2-like2 and GmaxGOLS2-like3, for instance, were expressed in both cultivars in the Psys condition, but expression levels were significantly higher in Embrapa 48 ( Figure 5E,F). Differences in the regulation of gene expression between cultivars were also noted when the expression of GmaxLKR/SDH-like1 and GmaxLKR/SDH-like-2 were evaluated in the Psys condition. Both presented high expression levels under this control condition, which may in-dicate that the Embrapa 48 cultivar presents naturally higher levels of protective compounds and can better cope with a water deficit. These conclusions do not apply to the HSys experiment, where practically no differences were observed between the cultivars. These results strongly suggest that a water deficit in the sensitive and tolerant cultivar activates distinct molecular switches depending on the cultivation system.
The adaptive response to stress at cellular and molecular levels involves the accumulation of osmolytes and proteins related to stress tolerance (Kishor et al., 1995;Kiyosue et al., 1996;Zhu, 2002;Mahajan and Tuteja, 2005;Fujita et al., 2006;Hummel et al., 2010;Ashraf et al., 2011). In Arabidopsis, drought stress responses are perceived by the biosynthetic genes BCAT2, LKR/SDH, P5CS1 and ADC2 pertaining to the ABA-dependent pathway, while the raffinose (RFO) and galactinol (GOLS2) genes are not regulated by ABA during dehydration stress (Taji et al., 2002;Sanchez et al., 2008;Hirayama and Shinozaki, 2010). If the GOLS2 ABA independent response is conserved in the three putative soybean homologues, our results suggest that an ABA independent response is activated in both systems tested (PSys and HSys).
Among the genes expected to participate in the ABA-dependent pathway in soybean, GmaxLKR/SDH-like1, GmaxLKR/SDH-like2 and GmaxADC2-like1 showed different expression dynamics during water deprivation. The putative paralogs GmaxLKR/SDH-like1 and GmaxLKR/SDH-like2 displayed a quite similar expression pattern ( Figure 5A,B). Moreover, the gene GmaxADC2-like1 showed higher levels of expression in the HSys condition ( Figure 5C). On the other hand, genes belonging to the ABA-independent pathway presented distinct patterns of gene expression, such as those displayed by GmaxGOLS2-like1, GmaxGOLS2-like2 and GmaxGOLS2-like3 (Figure 5D-F) Lysine is catabolized in plants from saccharopine to glutamic acid and acetyl-CoA. Lysine catabolism is largely regulated by two enzymes, lysine-ketoglutarate reductase (LKR) and saccharopine dehydrogenase (SDH). These are linked to each other by a single bi-functional protein encoded by a single LKR/SDH gene (Arruda et al., 2000;Galili et al., 2001;Anderson et al., 2010) (Figure 2A). The response of LKR/SDH gene expression to ABA as well as to biotic and abiotic stresses (Moulin et al., 2000) implies that the Lys catabolism pathway participates in a metabolic networks that helps plants withstand such stresses. A dendrogram analysis allowed us to identify the putative soybean orthologs of LKR/SDH ( Figure 3A). The analysis also suggests that duplication events occurred in the soybean LKR/SDH genes, generating the two genes found in the soybean genome, GmaxLKR/SDH-like1 and GmaxLKR/SDH-like2 ( Figure 3A). This event has already been described in other crop species, such as sugarcane, coffee, cotton, maize and tobacco, and generated a large 228 Expression analysis in response to drought stress number of paralogous genes for LKR/SDH (Soltis and Sol tis, 1999;Schmutz et al., 2010). This is in accordance with previous studies that indicated two major duplication events in the soybean genome, resulting in a current conformation with almost 75% of the genes represented in multiple copies that were maintained over time (Schmutz et al., 2010). In the gene expression analysis, the GmaxLKR/SDH-like1 and GmaxLKR/SDH-like2 soybean genes presented quite similar expression regulation indicating that the respective promoter regions may not have diverged among the duplicated genes. However, these genes showed a rather distinct gene expression profile between sensitive and tolerant cultivars in the Psys condition (Figure 5A, B).
Arginine decarboxylase (ADC) is a key plant enzyme that converts arginine into putrescine, an important mediator of abiotic stress tolerance ( Figure 2B) (Peremarti et al., 2010). The over-expression of ADC2 in transgenic Arabidopsis showed that higher levels of putrescine increased drought tolerance (Alcazar et al., 2006(Alcazar et al., , 2010. Dendrogram analysis allowed us to identify two paralogs, GmaxADC2-like1 (Glyma06g00990) and GmaxADC2-like2 (Glyma04g0960) ( Figure 3B). An analysis by qPCR was not done for GmaxADC2-like2 because previous information from subtractive library data did not indicate its expression during water deficit. The GmaxADC2-like1 reached peak expression at a water deficit of -1.5 MPa in the PSys and at the T100 time point in the HSys condition in both cultivars. Interestingly, unlike the GmaxLKR/SDH-like1 and GmaxLKR/SDH-like2 genes, the expression levels of GmaxADC2-like1 were lower in the PSys when compared to the HSys condition ( Figure 5C). This indicates that the regulation of GmaxADC2-like1 expression may be early and transient after the onset of a water deficit sensitivity.
The conversion of myo-inositol to galactinol or to other raffinose series oligosaccharides ( Figure 2C) under abiotic stress was studied in Arabidopsis (Seki et al., 2002;Taji et al., 2002;Shinozaki and Yamaguchi-Shinozaki, 2007;Urano et al., 2009Urano et al., ,2010. Among the key genes of this pathway, AtGOLS1 and AtGOLS2 are the best studied. Their expression patterns are tightly regulated by drought stress and the over-expression of AtGOLS2 in Arabidopsis increases dehydration tolerance (Taji et al., 2002). The neighbor joining analysis suggests that there are six genes in the soybean genome related to AtGOLS1, AtGOLS2 and AtGOLS3:  and GmaxGOLS2-like5 (Glyma19g41550) (Figure 4). The genes GmaxGOLS2-like4 and GmaxGOLS2-like5 were not selected for validation by qPCR because they were absent in the subtractive libraries (Table S2). Our analysis in HSys revealed that GmaxGOLS2-like1 shows higher levels of gene expression at earlier stages (T50 min) in the tolerant cultivar (Embrapa 48), while the sensitive cultivar (BR16) shows a slower response to water deficit ( Figure 5D). A similar expression profile was also observed in PSys, but expression levels were significantly lower when compared with HSys ( Figure 5D). In contrast, GmaxGOLS2-like2 and GmaxGOLS2-like3 were induced exclusively in the PSys condition ( Figure 5E, F). Moreover, the expression levels in the tolerant cultivars were dramatically higher under severe stress ( Figure 5E, F). This result indicates that the expression of GmaxGOLS2-like2 and GmaxGOLS2-like3 is not regulated during the sudden water deficit promoted by the HSys treatment, but may be fundamental during the slow adaptation to drought in a PSys condition. The disparity observed in the regulation of gene expression between GmaxGOLS2-like1 and the two paralogs GmaxGOLS2-like2 and GmaxGOLS2-like3 fits with the well-accepted model according to which changes in the transcriptional regulation of duplicated genes play an important role for their fixation in the genome (Carroll, 2000). The distinct regulation of expression of the GmaxGOLS2 genes may be important to soybean plants to promote tight control of GOLS2 expression under a multitude of environmental conditions. The analysis of soybean gene promoters, using the POBO tool, revealed a cluster composed of up-regulated genes in PSys or HSys, where the frequency of the ACGT (ERD1), ACGTG (ABRE) and ACGTGKC (ABRE) cis-elements is higher. The high frequency of these cis-elements suggests that they may function as important regulatory players in genes that participate in different metabolic pathways during drought stress.
The results presented here indicate that several genes of different metabolic pathways have their expression tightly regulated by drought stress in soybean. Moreover, the data show that the dynamics and the expression level can change drastically depending on the drought stress system and also among closely related orthologs. Our work has shed light on the gene expression response of key genes involved in soybean metabolism during drought stress. The information provided here is important to better understand the molecular mechanisms involved in water deficit tolerance in soybean and may contribute to the development of soybean varieties that are more apt to cope with water stress.  This material is available as part of the online article from http://www.scielo.br/gmb.

Supplementary Material
The following online material is available for this article: Table S1 -Sequences and features of primers used in this study. Table S2 -Prevalence of soybean matches in different metabolic pathways responsive to drought in the subtractive libraries. Table S3 -Transcription factor binding site verification performed with the POBO tool.
License information: This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

232
Expression analysis in response to drought stress  Glyma15g41690.1 No No No No No No No No No No No No Glyma08g17450.1 No No No No No No No No No No No No Glyma13g37080.1 No No No No No No No No No No No No Glyma12g33350.1 No No No No No No No No No No No No Glyma12g26170.1 No No No No No No No No No No No No Glyma06g35630.1 No No No No No No No No No No No No Glyma06g35580.1 No No No No No No No No No No No No Glyma13g37080.1 No No No No No No No No No No No No Glyma12g33350.1 No No No No No No No No No No No No Glyma12g26170.1 No No No No No No No No No No No No Glyma06g35630.1 No No No No No No No No No No No No Glyma06g35580.1 No No No No No No No No No No No No Glyma20g28720.5 No No No No No No No No No No No No Glyma20g28720.4 No No No No No No No No No No No No Glyma20g28720.3 No No No No No No No No No No No No Glyma20g28720.1 No No No No No No No No No No No No Glyma19g34120.1 No No No No No No No No No No No No Methionine biosynthesis II At4g23590 At4g23600 At3g22740