An improved method for RNA extraction from common bean seeds and validation of reference genes for qPCR

An RNA extraction method that yields material of high integrity and purity, together with the selection of adequate reference genes, is a prerequisite for gene expression analysis. For common bean seeds, there is no well-defined protocol that can be used in a laboratory routine for gene expression analysis. In this study, an extraction protocol for RNA from common bean seeds, which produced material with good integrity for qPCR (RIN ≥ 6.5), was optimized. In addition, 10 reference genes were evaluated under qPCR standard conditions using different tissue samples of common beans. Gene stabilities were analyzed using the delta-CT method, BestKeeper, NormFinder and geNorm approaches. The genes β-tubulin and T197 were ranked as the most stable among the sample sets evaluated with different tissue samples, while PvAct and Pv18S were the least stable. To our knowledge, this is the first study evaluating RNA isolation methods and reference gene selection for seeds of Phaseolus vulgaris.


INTRODUCTION
Experimental validation of gene transcription data is usually performed using real-time quantitative polymerase chain reaction (qPCR), which involves amplifying in vitro copies of complementary DNA (cDNA) from an RNA template and monitoring the levels of the molecules produced during each cycle in real time (Heid et al. 1996). The advantages of this method include speed, specificity and sensitivity for amplifying the target fragment through relative and absolute transcribed gene quantification (Hu et al. 2014). The qPCR technique is considered the gold standard for quantifying gene expression, provided that factors such as the quality and integrity of the extracted RNA, sample storage, and the correct choice of internal controls are taken into account to avoid biased and/or misleading results (Dheda et al. 2005).
Only a set of RNA that truly represents cell transcription for a particular sample will provide accurate information on the characteristics and expression levels of the transcripts analyzed. Because cell composition varies, with differing levels of secondary metabolites, polysaccharides, and phenolic and oxidative compounds, a single standardized RNA extraction procedure for any type of tissue is not viable (Mornkham et al. 2013). Establishing such procedures is not trivial, and efforts to develop seed RNA extraction protocols have been made for different species (Christou et al. 2014, Ma et al. 2015). For the common bean, RNA extraction methods from leaves and roots have been established, based either on commercial kits (Contour-Ansel et al. 2010) or on other procedures (Borges et al. 2012). However, a well-defined RNA extraction protocol for common bean grain/seed tissue is not available and, among legumes, there is just one protocol, which was created for soybean seeds (Yin et al. 2014).
Because extraction of high-quality RNA and identification of reference genes are among the most important factors for reliable qPCR experiments, we aimed in this paper to establish a suitable RNA extraction method for common bean seeds. Additionally, we evaluated a set of reference genes previously described in the literature, optimized the qPCR conditions and determined the most adequate method for qPCR analysis in different tissues of common beans and under variable experimental conditions.

Plant samples
Seeds from six cultivars of common bean of the Carioca grain type were used: BRSMG Madrepérola, BRS Pontal, Pérola, BRS Estilo, CNFC10467 and Pinto Beans. Plants were grown in a field from August to October 2014 in Santo Antônio de Goiás, GO, Brazil, and the plots consisted of 10 rows that were each five meters in length. After harvesting, the seeds were dried at room temperature (21 °C), processed and stored at -80 °C until use. To define stable reference genes across different types of common bean tissue in addition to seeds, RNA samples were used from the leaves and roots of Pérola and BAT477 that were cultivated under drought stress and from the leaves of BRS Realce and BRS Executivo that were inoculated with the fungus Colletotrichum lindemuthianum.

Seed total RNA extraction
The total RNA from mature seeds of the six cultivars was extracted using three different protocols: 1) based on TRIzol® RNA Isolation Reagents (Invitrogen™, Carlsbad, California), following the manufacturer's instructions; 2) a commercial kit, the PureLink® RNA Mini Kit (Ambion®, Carlsbad, California), which is based on a column extraction method, following the manufacturer's instructions; and 3) a protocol proposed by Silva et al. (2011) for coffee seeds, with adaptations as described below.
The ground seeds (150 mg, five grains on average) were transferred to 1.5 mL tubes, 1000 μL of Concert™ Plant RNA Reagent (Invitrogen™, Carlsbad, California) was added, and the samples were homogenized by vortexing for 2 min. Subsequently, the tubes were incubated horizontally at room temperature (21 °C) for 5 min. Next, the samples were centrifuged at 11,400 rpm for 2 min at room temperature, and the supernatant was transferred to fresh tubes. The supernatant was treated with 100 μL of 1.5 mol L⁻¹ NaCl and homogenized by inverting the tube eighty times. Thereafter, 300 μL of chloroform was added and the mixture was homogenized by inverting the tube eighty times. The tubes were then centrifuged for 10 min at 11,400 rpm at 4 °C; the upper phase was recovered and transferred to a new tube. Then, 500 μL of the lysing reagent (Concert™ Plant RNA Reagent) was added to the recovered upper phase, and the initial extraction procedures were repeated. The centrifugation step was repeated, and the final supernatant was recovered; then, 500 μL of isopropanol was added, and the tubes were inverted eighty times. The tubes were incubated at room temperature for 10 min. The precipitate was recovered by centrifugation at 11,400 rpm for 10 min at 4 °C, followed by washing with 1000 μL of cold 75% ethanol and centrifugation at 11,400 rpm for 1 min at room temperature. To the dried precipitate, 70 μL of autoclaved diethylpyrocarbonate (DEPC)-treated water was added, followed by storage at -80 °C. Three technical replicates were used for each sample. The total RNA extraction from leaves and roots was performed using the PureLink® RNA Mini Kit (Ambion®, Carlsbad, California).

RNA quality control and synthesis of complementary DNA (cDNA)
The quantity and purity of each RNA sample were estimated using a NanoVue™ Plus Spectrophotometer (General Electric Company, GE). Additionally, the RNA integrity number (RIN) was verified with an Agilent 2100 Bioanalyzer (Agilent Technologies). RNA samples were treated with DNase I (Invitrogen™, Carlsbad, California) following the manufacturer's guidelines. Subsequently, the RNA was reverse transcribed into cDNA using SuperScript® II Reverse Transcriptase (Invitrogen™, Carlsbad, California), as directed by the manufacturer. The cDNA was quantified using a Qubit® ssDNA assay in a Qubit® 2.0 Fluorometer (Life Technologies™, Carlsbad, California).
Applied Biotechnology -17: 150-158, 2017 WJ Pereira et al.

Optimization of primer concentration for qPCR
The primer sequences corresponding to the target reference genes in different plants were obtained through a literature search (Table 1). Each primer was individually tested for amplification and specificity. The PCR reaction was conducted as described by Müller et al. (2014), and the amplified products were visualized on a 2.0% agarose gel.
For the primer titration, a matrix of reactions was created. A total of 9 reactions were performed, varying the concentration of each primer (forward and reverse) independently (50, 300, and 900 nM), to determine the best quantification cycle (Cq) values while avoiding primer-dimer formation. The reactions were conducted in duplicate using Real Master SYBR ROX Mix (5 Prime, Gaithersburg, Maryland) and 10 ng of cDNA in a final volume of 20 μL, as suggested by the manufacturer. The thermocycling conditions included one cycle at 94 °C for 2 min followed by 40 cycles at 95 °C for 15 s and 60 °C for 1 min. The qPCR was performed on an ABI7500 Real Time PCR system (Life Technologies™).
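The titration matrix described above can be sketched in a few lines of Python. This is only an illustration: the function names and the selection rule (lowest mean Cq among reactions without primer dimers) are assumptions, not the authors' procedure, and the Cq values in the usage example are hypothetical.

```python
from itertools import product

# Primer concentrations (nM) tested for each primer, as stated in the text.
CONCENTRATIONS_NM = (50, 300, 900)


def titration_matrix(concentrations=CONCENTRATIONS_NM):
    """Return every (forward_nM, reverse_nM) combination to be tested."""
    return list(product(concentrations, repeat=2))


def pick_best(results):
    """Pick the combination with the lowest mean Cq and no primer dimers.

    `results` maps (forward_nM, reverse_nM) -> (mean_cq, has_primer_dimer).
    The "lowest Cq without dimers" rule is an assumption for illustration.
    Returns None when every combination shows primer dimers.
    """
    clean = {combo: cq for combo, (cq, dimer) in results.items() if not dimer}
    if not clean:
        return None
    return min(clean, key=clean.get)


matrix = titration_matrix()
print(len(matrix))  # 3 forward x 3 reverse concentrations

# Hypothetical duplicate-averaged results for three of the combinations.
example = {
    (50, 50): (29.1, False),
    (300, 300): (24.8, False),
    (900, 900): (24.1, True),  # lowest Cq, but rejected: primer dimer
}
print(pick_best(example))
```

Each of the nine combinations would be run in duplicate, so the matrix corresponds to 18 wells per primer pair.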
A dissociation curve based on a melting temperature (Tm) analysis was generated and analyzed using the 7500 software v2.0.5. The Sequence Detection Software (SDS) v.1.3 (Applied Biosystems) was used to calculate the cycle threshold (Ct). Two methods were used to evaluate the amplification efficiencies: 1) a standard curve based on a serial cDNA dilution (128, 64, 32, 16, 8, 4 and 2 ng) and 2) linear regression within the window-of-linearity by the LinRegPCR software (Ruijter et al. 2009).
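For the standard-curve method, efficiency is conventionally derived from the slope of Cq against log10 of the template amount, E = 10^(-1/slope) - 1. The sketch below applies that relation to the dilution series given above; the Cq values are hypothetical (chosen so that each two-fold dilution costs exactly one cycle), and the function name is an assumption. It does not reproduce the LinRegPCR window-of-linearity method.

```python
import math


def standard_curve_efficiency(cdna_ng, cq_values):
    """Estimate amplification efficiency from a dilution series.

    Fits Cq = slope * log10(input) + intercept by ordinary least squares
    and returns (slope, efficiency_percent), where
    efficiency = (10 ** (-1 / slope) - 1) * 100.
    """
    xs = [math.log10(q) for q in cdna_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    efficiency = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return slope, efficiency


# Dilution series from the text (ng of cDNA).
dilutions = [128, 64, 32, 16, 8, 4, 2]
# Hypothetical Cq values: one extra cycle per two-fold dilution.
cqs = [20.0 + i for i in range(len(dilutions))]

slope, eff = standard_curve_efficiency(dilutions, cqs)
print(round(slope, 3), round(eff, 1))
```

A slope near -3.32 corresponds to perfect doubling per cycle (100% efficiency); shallower slopes give the apparent efficiencies above 100% discussed later in the text.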

Gene expression stability
The Cq values were used to determine the reference gene stability using a variety of methods: Delta-CT (Silver et al. 2006), NormFinder (Andersen et al. 2004), geNorm (Vandesompele et al. 2002) and BestKeeper (Pfaffl et al. 2004). The web-based tool RefFinder (Xie et al. 2012) was used to integrate the results of these programs and to generate the final ranking of the tested reference genes.
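As an illustration of how one of these methods scores stability, the comparative Delta-CT approach can be sketched as follows: for every pair of genes, the standard deviation of the Cq difference across samples is computed, and each gene is scored by the mean of its pairwise standard deviations (lower means more stable). The gene names and Cq values are hypothetical, and this sketch is not the RefFinder implementation.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(cq):
    """Score genes by the comparative Delta-CT approach.

    `cq` maps gene name -> list of Cq values, one per sample, with all
    lists in the same sample order. Returns gene -> mean standard
    deviation of its pairwise Cq differences (lower = more stable).
    """
    genes = list(cq)
    pair_sd = {g: [] for g in genes}
    for a, b in combinations(genes, 2):
        deltas = [x - y for x, y in zip(cq[a], cq[b])]
        sd = stdev(deltas)
        pair_sd[a].append(sd)
        pair_sd[b].append(sd)
    return {g: mean(sds) for g, sds in pair_sd.items()}


# Hypothetical Cq values for three genes measured in four samples.
example_cq = {
    "gene_a": [20.0, 21.0, 22.0, 23.0],
    "gene_b": [25.0, 26.0, 27.0, 28.0],  # tracks gene_a exactly
    "gene_c": [18.0, 24.0, 19.0, 25.0],  # varies independently
}

scores = delta_ct_stability(example_cq)
ranking = sorted(scores, key=scores.get)
print(ranking)
```

Because the score depends only on differences between genes, two co-regulated genes can appear stable together, which is one reason the text combines several algorithms.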

RESULTS AND DISCUSSION
Table 1. Candidate genes for qPCR analyses with their respective nomenclature and bibliographic citation, GenBank accession number, gene description, primer sequence (forward and reverse) and product size in base pairs (bp)
Columns: Origin of primers; Primer sequences (forward/reverse) 5'-3'; Size (bp)

Several methods to remove polysaccharide and protein contamination from plant RNA have been developed, as described for seeds containing high levels of starch (Li and Trick 2005) and soybean seeds (Yin et al. 2014). In this report, three different protocols were used based on their reported impact in the literature. The procedures based on TRIzol® and column purification (PureLink®) did not generate RNA from seeds with sufficient purity and integrity for subsequent qPCR analyses (Table 2). The resulting 260/280 ratio (1.8-2.0 desired as an indication of purity) was below 1.5 for the TRIzol® reagent. For the 260/230 ratio (1.8-2.0 desired as an indication of purity), the values obtained were below 0.3 for both TRIzol® and PureLink® reagents, and the RIN values were below 2.6 for all samples.
The adapted protocol from Silva et al. (2011), based on an isolation procedure using the organic solvents phenol and chloroform, in addition to various mixing steps (extremely important for tissue homogenization), allowed the insoluble material (such as polysaccharides and proteins) to be removed from the samples. The 260/280 ratio obtained was adequate (≥ 2.0). The extraction of high-purity RNA is challenging because of the high protein and carbohydrate content of common bean seeds, which may be difficult to remove from samples. Secondary metabolites can interact with or intercalate in RNA, or be co-extracted with it because of reagent affinity, which reduces the yield and quality of the RNA. The amounts of these compounds may vary widely among common bean genotypes, reaching values of 32% protein and 23% carbohydrate in the seeds (Brigide et al. 2014). The optimized protocol was efficient at removing these organic compounds. For the 260/230 ratio, low values were obtained (ranging from 0.37 to 0.82), revealing remains of solvents or salts in the RNA solution. These impurities did not compromise the reliability of the qPCR results in this study; this finding is consistent with previous reports in the literature (Cincinnati et al. 2008, Yee et al. 2014). The RIN values ranged from 6.93 to 8.30 (Table 2) and the efficiency of the qPCR reactions was high, as demonstrated by the test of efficiency (Table 3); these two measures were used in this study as the criteria for protocol selection. Most studies have assessed RNA integrity using the RIN algorithm.
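The quality criteria discussed above can be collected into a small check. This is a minimal sketch under stated assumptions: the lower bound of 1.8 for both absorbance ratios comes from the desired range quoted in the text, the RIN cutoff of 6.5 is the value given in the abstract, and the class, function and sample values are hypothetical rather than measured data.

```python
from dataclasses import dataclass


@dataclass
class RnaSample:
    name: str
    a260_280: float  # protein contamination indicator
    a260_230: float  # salt/solvent/polysaccharide contamination indicator
    rin: float       # RNA integrity number


def assess(sample, min_ratio=1.8, min_rin=6.5):
    """Flag which quality criteria a sample meets.

    min_ratio: lower bound of the 1.8-2.0 range the text calls desirable.
    min_rin: integrity cutoff quoted in the abstract (assumed here as the
    acceptance threshold).
    """
    return {
        "a260_280_ok": sample.a260_280 >= min_ratio,
        "a260_230_ok": sample.a260_230 >= min_ratio,
        "rin_ok": sample.rin >= min_rin,
    }


# Hypothetical sample resembling the profile reported for the adapted
# protocol: good 260/280 and RIN, low 260/230.
sample = RnaSample(name="seed_example", a260_280=2.0, a260_230=0.6, rin=7.5)
print(assess(sample))
```

A sample like this one fails only the 260/230 criterion, which mirrors the situation described in the text, where integrity and qPCR efficiency rather than the 260/230 ratio guided protocol selection.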

Reference genes for qPCR
Concerning the reference genes available for common beans, this work does not intend to confirm information already described in the literature, but rather to provide additional guidance on working with common bean reference genes. The purpose is to give readers access to the most appropriate genes for different plant tissues and to the qPCR amplification conditions that are fundamental to routine laboratory analysis. Important steps of the research process that are not available in the published literature, such as primer titrations and a determination of suitability across different tissue samples with strong statistical support, were provided in the present study. Of the genes that were evaluated (Table 1), only the gene Const15 did not generate an amplified product. It is a constitutive gene of soybeans, and an amplified product may be obtained by redesigning the primer sequences using the reference common bean genome. Of the remaining 9 genes, four presented the formation of a secondary structure, as demonstrated by the dissociation curve, indicating the need to design new sets of primers for these genes. This optimization step for qPCR conditions, implemented in the present study, was fundamental for the development of a robust assay that ensures reproducibility between replicates. Of the five remaining primer pairs suitable for evaluation as reference genes (Pv18S, β-tubulin, PvAct, Tc127, T197) in leaf, root and seed tissues, four were derived from bean genomic sequences and one from cocoa.
The values of PCR efficiency ranged from 63 to 103% based on the standard curve (Table 3). Although the method based on the standard curve remains reliable and broadly used (Svec et al. 2015), it often results in unrealistic values greater than 100% for qPCR efficiency. As described by Peirson et al. (2003), several analytical procedures could result in a cumulative error that might lead to efficiency overestimation. To overcome these limitations, mathematical models have been published that describe the kinetics of the qPCR reaction and estimate qPCR efficiency from a single reaction (Robledo et al. 2014). The methodology implemented by the LinRegPCR software used in the present study resulted in values of qPCR efficiency ranging from 55 to 94%, which are lower than those obtained with the standard curve. The qPCR efficiency was similar, or even higher, for the seed samples compared to the leaf/root samples, using both the standard curve method and LinRegPCR (Table 3). Based on the standard curve method, all genes, except Pv18S, showed adequate estimates of efficiencies ranging from 95 to 105%, while for LinRegPCR all genes, except Pv18S and PvAct, presented values of efficiency higher than 82% (Table 3). For three genes, β-tubulin, Tc127, and T197, both methods provided satisfactory levels of efficiency (≥ 82.97%), with the highest value for T197 (≥ 93.33%). Among all genes, T197 (which encodes a guanine nucleotide-binding protein beta subunit-like protein) was selected from the common bean cDNA libraries; however, in a previous study, it was not recommended for use as a normalizer gene due to the variability of its expression levels (Thibivilliers et al. 2009). As the primer amplifications in this study were carefully adjusted by a titration procedure, the potential of each gene and its efficiency as a reference for normalization were evaluated under adequate qPCR conditions. Failures to amplify a desired target sequence are often attributed to inadequate primer design, and targeting other regions of the gene sequence could be a useful strategy for redesigning a set of primers and obtaining satisfactory amplification.

Expression stability of reference genes
It is well known that gene expression stability is one of the most important criteria for selection of a reference gene. The five selected genes in this study (Pv18S, β-tubulin, PvAct, Tc127 and T197) showed adequate amplification of cDNA samples from the seeds, leaves and roots of common bean plants. The absolute Cq values obtained for each of the five reference genes are shown as box plots in Figure 1, where the median raw Cqs are represented by lines. A basic premise is that reference genes must not be regulated by the experimental conditions of the sample set (Robledo et al. 2014). The geNorm and BestKeeper algorithms are based on the assumption that none of the analyzed genes are co-regulated (Matz et al. 2013). For this reason, the use of more than one algorithm for the validation of reference genes is suggested to give more reliable results (Zyzynska-Granica and Koziak 2012).
The gene-to-gene correlations were verified by the Pearson correlation coefficient (r) (Pfaffl et al. 2004), and a strong positive correlation among the reference genes (r ≥ 0.80) was observed (Figure 2), except for Pv18S, which presented a strong negative correlation with the other reference genes evaluated (r ≤ -0.66). For Pv18S, the Cq value tended to be reduced in seeds (Figure 1d) (ranging from 16.92 to 30.23) compared to the other tissues, which is indicative of higher initial copy numbers of this target sequence. The Pv18S expression level (Cq value) compared with the remaining genes resulted in statistically significant negative correlations (Figure 2).
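The gene-to-gene comparison above rests on the Pearson correlation coefficient between the Cq series of two genes. A minimal sketch of that calculation follows; the function name and the Cq values are hypothetical and serve only to show a positively and a negatively correlated pair.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical Cq values across five samples.
gene_1 = [24.0, 25.5, 27.0, 28.5, 30.0]
gene_2 = [22.0, 23.4, 25.1, 26.3, 28.2]  # rises with gene_1
gene_3 = [30.0, 27.5, 24.0, 21.0, 17.5]  # falls as gene_1 rises

print(round(pearson_r(gene_1, gene_2), 3))
print(round(pearson_r(gene_1, gene_3), 3))
```

A coefficient near +1 indicates that two genes shift together across samples, whereas a negative coefficient, as reported for Pv18S, indicates that one gene's Cq falls where the others rise.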
The ranking of reference gene stabilities evaluated for the whole set of samples and for inter-group variation (abiotic stress, biotic stress and seeds) showed that the genes did not present a high stability of expression across all sample groups (Table 4), as demonstrated by the high standard deviation and by the wide range of Cq values (Figure 1a). For all samples analyzed together, geNorm ranked T197 and β-tubulin as the most stable genes (M = 1.4), with the combined value below the recommended cutoff of 1.5 (Robledo et al. 2014). NormFinder considered the same two genes most stable, with a combined stability value of 0.468. Based on RefFinder and BestKeeper, T197 and β-tubulin were also considered the most stable, although the SD (1.26 and 2.37, respectively) and CV (4.52 and 9.50, respectively) determined by BestKeeper were above the recommended cutoff of 1 (Table 4).
When the sample groups were considered separately, the Cq values became more stable and lower (Figures 1b, 1c and 1d). For the samples of leaves and roots submitted to the same experimental conditions (abiotic stress), Tc127, T197, β-tubulin and PvAct were ranked as more stable, with reduced SD (0.8), by BestKeeper. The remaining algorithms agreed with one another and indicated that the reference genes T197 and β-tubulin were the most stable, with a geNorm value of 0.415. For the samples submitted to biotic stress, the genes Tc127 and β-tubulin were ranked as the best combination by geNorm (M = 0.386) and by the remaining methods except BestKeeper, for which the genes with the lowest SD (below the cutoff of 1) were β-tubulin (0.52), PvAct (0.58), Tc127 (0.59) and T197 (0.64). For the seed samples, geNorm suggested β-tubulin and Tc127 as the most stable combination (M = 0.605), as did the other programs. All genes, except for Pv18S, presented SD values lower than one by BestKeeper. For qPCR data normalization, two genes have frequently been reported as the optimal number in several experimental plots (Borges et al. 2012, Müller et al. 2014).
In the present study, out of the five genes, β-tubulin and T197, previously determined for common beans under biotic and abiotic stress, respectively, ranked as the most stable among the sample sets evaluated in different tissue samples (leaves, roots and seeds) and experimental conditions (biotic and abiotic stress). In a previous study, two β-tubulin genes (Tub8 and Tub9) were not selected as reference genes (Borges et al. 2012), suggesting that the β-tubulin isoforms may have different expression patterns, as seen for other enzymes (Sielski et al. 2014). In addition, the genes PvAct and Pv18S were the least stable, in accordance with previous studies (Fernandez et al. 2011).
The RNA extraction, optimization and validation of reference genes for expression analysis in different tissues and experimental conditions were aimed at enabling the optimal performance of qPCR experiments in common bean samples. A more adequate method for seed RNA extraction was implemented that produced RNA with high integrity for gene expression studies (RIN ≥ 6.9). Furthermore, a set of reference genes was tested using SYBR Green chemistry and made available for normalization in qPCR experiments for different tissue samples and experimental conditions. This study also validated the web-based tool RefFinder, which integrates a diverse set of methods to compare and rank the best combination of candidate reference genes, in common bean samples. The data from the present study strongly support the indication of reference genes for qPCR analysis.

Figure 1. Box plot analysis of Cqs obtained for distinct sample sets of each gene. Each box indicates the middle 50% of the distribution, with a line at the median dividing the box into two parts. A) Cqs distribution values for all samples; B) Cqs distribution for abiotic stress; C) Cqs distribution for biotic stress; D) Cqs distribution for seed samples.

Figure 2. Correlation between Cq values for the reference genes. Values in the boxes: correlation coefficient (r), p < 0.001.

Table 2. Summary of the common bean seed RNA profile based on the three extraction methods

Table 3. Performance evaluation of the five selected candidate genes in leaf/root pooled cDNA and seed cDNA

Table 4. Stability rankings obtained with the six determination methods for the individual group of common bean samples