Print version ISSN 1415-4757
Genet. Mol. Biol. vol.35 no.1 supl.1 São Paulo 2012
Ana C. Wanderley-Nogueira (I); Luis C. Belarmino (I); Nina da M. Soares-Cavalcanti (I); João P. Bezerra-Neto (I); Ederson A. Kido (I); Valesca Pandolfi (I); Ricardo V. Abdelnoor (I); Eliseu Binneck (II); Marcelo F. Carazzole (III, IV); Ana M. Benko-Iseppon (I)
(I) Departamento de Genética, Centro de Ciências Biológicas, Universidade Federal de Pernambuco, Recife, PE, Brazil
(II) Embrapa Soja, Londrina, PR, Brazil
(III) Laboratório de Genômica e Expressão, Departamento de Genética, Evolução e Bioagentes, Instituto de Biologia, Universidade Estadual de Campinas, Campinas, SP, Brazil
(IV) Centro Nacional de Processamento de Alto Desempenho em São Paulo, Universidade Estadual de Campinas, Campinas, SP, Brazil
Plants have the ability to recognize and respond to a multitude of pathogens, resulting in a massive reprogramming of the plant to activate defense responses including Resistance (R) and Pathogenesis-Related (PR) genes. Abiotic stresses can also activate PR genes and enhance pathogen resistance, representing valuable genes for breeding purposes. The present work offers an overview of soybean R and PR genes present in the GENOSOJA (Brazilian Soybean Genome Consortium) platform, regarding their structure, abundance, evolution and role in the plant-pathogen metabolic pathway, as compared with Medicago and Arabidopsis. Searches revealed 3,065 R candidates (756 in soybean, 1,142 in Medicago and 1,167 in Arabidopsis), and PR candidates matching 1,261 sequences (310, 585 and 366 for the three species, respectively). The identified transcripts were also evaluated regarding their expression pattern in 65 libraries, showing prevalence in seeds and developing tissues. Upon consulting the SuperSAGE libraries, 1,072 R and 481 PR tags were identified in association with the different libraries. Multiple alignments were generated for Xa21 and PR-2 genes, allowing inferences about their evolution. The results revealed interesting insights regarding the variability and complexity of defense genes in soybean, as compared with Medicago and Arabidopsis.
Key words: pathogen response, biotic stress, bioinformatics, Glycine max, Medicago truncatula.
In order to prevent the effects of pathogen attack, plants evolved the ability to recognize the threat and struggle against the invader as well as trigger an effective response (Bolton, 2009). One of the most important steps of this complex response lies in the detection of pathogen invaders by the plant, a step where R (Resistance) genes play a crucial role. This sensing involves the recognition of a pathogen gene product called avirulence (avr) factor by a corresponding R gene. When the avr and R genes are compatible, the plant is resistant and pathogen growth and establishment are impaired, leading to the so-called Hypersensitive Response (HR), which triggers diverse responses, including local cell death to impair spreading of the pathogen (Bonas and Anckerveken, 1999). Besides this local reaction, the HR activates a signal cascade, including hormones and PR (Pathogenesis-Related) genes, among others, that is able to establish resistance against a spectrum of different pathogen classes. This corroborates observations made at the beginning of the last century that plants, as well as animals (Benko-Iseppon et al., 2010), may be immunized against the attack of a given pathogen after infection by another pathogen (Chester, 1933).
Besides a local reaction, plants may also display the Systemic Acquired Resistance (SAR). The SAR pathway is also common in many non-compatible plant-pathogen interactions (Nurnberg and Brunner, 2002). As soon as the pathogenic agent is detected, the plant induces a complex set of signal molecules able to activate defense proteins that may have a direct antimicrobial effect, as in the case of Pathogenesis-Related (PR) genes (Durrant and Dong, 2004). Alternatively, they may induce the production of secondary metabolites that impair pathogen movement or growth within the plant tissues (Sparla et al., 2004).
Resistance genes are generally classified into five different groups or classes, defined according to their conserved domains (CD) (Bent, 1996; Hammond-Kosack and Jones, 1997; Ellis and Jones, 2000). The first class is represented by the HM1 gene of maize that encodes a reductase able to inactivate toxins produced by the fungus Helminthosporium carbonum (Joahal and Briggs, 1992). It is the only R gene class where conserved domains are absent. A second class is represented by the Pto gene from tomato that confers resistance against the bacterium Pseudomonas syringae pv. tomato. It is characterized by a serine/threonine-kinase (ser/thre-kinase) domain, able to interact with the avrPto gene (Tang et al., 1999). This gene was also identified in other plants, such as Arabidopsis thaliana, Phaseolus vulgaris (Melotto et al., 2004), eucalyptus (Barbosa-da-Silva et al., 2005) and sugarcane (Wanderley-Nogueira et al., 2007).
The third class is represented by genes bearing two domains, viz. LRR (Leucine Rich Repeats) and NBS (Nucleotide Binding Site) (Liu et al., 2004). This is the case of the Rpm1 and Rps2 genes from A. thaliana, the N gene from tobacco, L6 from flax, Prf from tomato and Rpg1 from soybean also found in common bean and faba bean (Mindrinos et al., 1994; Lawrence et al., 1995; Salmeron et al., 1996; Ashfield et al., 2003). The fourth R gene class encodes a membrane-anchored protein composed of an extracellular LRR domain, a transmembrane region and a short intracellular tail in the C terminal. The Cf gene from tomato is an example of this class, conferring resistance against Cladosporium fulvum (Dixon et al., 1996).
The Xa21 gene from rice confers resistance to the bacteria Xanthomonas oryzae pv. oryzae and is a representative of the fifth class (Song et al., 1995; Wang et al., 1995). This gene encodes an extracellular LRR domain (similar to the Cf gene), as well as a ser/thre-kinase domain (similar to the Pto gene), suggesting an evolutionary connection among different classes in the genesis of plant R genes (Song et al., 1997).
PR proteins comprise pathogen-induced proteins that are routinely classified into 17 families based on their biochemical and molecular biological properties, from PR-1 to PR-17 (van-Loon et al., 2006). Similarities among sequences and serological or immunological properties form the basis of their classification (van-Loon et al., 1999). Although most PR proteins are known to have antifungal activities, their active molecular mechanisms are not well understood except for PR-2 (β-glucanases) and PR-3 (chitinases) (Kitajima and Sato, 1999). PR-1 is the most abundantly accumulated protein after pathogen infection and its genes have been cloned in many plants, such as tobacco (Gaffney et al., 1993), A. thaliana (Metzler et al., 1991), tomato (Tornero et al., 1997) and apple. Although its phytochemical functions are unknown in all these species, this gene class is nonetheless considered to be a typical SAR marker (Bonasera et al., 2006). PR-5 is a thaumatin-like protein with high antifungal activity that is also expressed under cold stress in overwintering monocots, where it exhibits antifreeze activities (Hon et al., 1995; Atici and Nalbantolu, 2003; Griffith and Yaish, 2004). Other families like PR-8 (glycosyl hydrolase), PR-9 (secretory peroxidase), PR-14 (lipid transfer proteins), PR-15 (oxalate oxidase) and PR-17 (basic secretory proteins) (Nanda et al., 2010) have been well studied and are believed to be involved in plant defense responses, although their molecular mechanisms have yet to be determined (Bolton, 2009). Most PR genes are expressed at a basal level under normal growth conditions, but are rapidly induced after pathogen infection. It is worthy of note that several PR genes are also regulated during development, leaf senescence and pollen maturation, as well as by environmental factors, such as osmotic, cold and light stress (Zeier et al., 2004).
Soybean (Glycine max) is a globally important crop, providing oil and at least twice as much protein per acre as any other major grain (Libault et al., 2010). Economically, soybean is the most valuable source of protein and edible oil crop in the world and serves as a model for seed and other developmental processes (Cannon et al., 2009).
The present evaluation offers an overview of the main available sequences regarding plant-pathogen interaction of the R and PR classes in the soybean transcriptome, here compared with data available from Arabidopsis and Medicago, providing insights on the expression of such sequences in different tissues and inferring as to how these genes may have behaved over the course of evolution.
Material and Methods
Search and screening for R and PR genes in soybean, Medicago and Arabidopsis databases
For this purpose 59 proteins that play important roles in plant defense response were selected as seed sequences. The selected protein sequences were related to the 42 R and 17 PR gene classes described above. The R genes were previously compiled by Barbosa-da-Silva et al. (2005) and Wanderley-Nogueira et al. (2007), and PR seed sequences are available in Table S1 (Supplementary Material). All 59 seed sequences regarded full cDNAs that were obtained from the NCBI database and conceptually translated to improve search strategies.
For the identification of these gene analogs in the soybean, Medicago and Arabidopsis transcriptomes, tBLASTn alignments were carried out against three platforms: GENOSOJA (The Brazilian Soybean Genome Consortium), TIGR (The Institute for Genomic Research) and TAIR (The Arabidopsis Information Resource), using an e-value of 1e-05 as the cut-off.
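The e-value screening described above can be illustrated with a minimal sketch. It assumes the BLAST hits were exported in the common 12-column tabular layout (e-value in column 11, bit score in column 12); the function name `filter_hits` and the sample rows are hypothetical and are not taken from the original pipeline.

```python
E_VALUE_CUTOFF = 1e-05  # cut-off value reported in the text

def filter_hits(lines, cutoff=E_VALUE_CUTOFF):
    """Return (query, subject, e-value, bit score) for hits at or below the cutoff."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        evalue = float(fields[10])     # column 11: e-value
        bit_score = float(fields[11])  # column 12: bit score
        if evalue <= cutoff:
            kept.append((fields[0], fields[1], evalue, bit_score))
    return kept

# Hypothetical hits, invented for illustration only.
sample_hits = [
    "Xa21\tcontig_A\t45.2\t310\t170\t3\t1\t310\t12\t940\t2e-40\t165.0",
    "Xa21\tcontig_B\t28.1\t90\t65\t2\t5\t94\t30\t299\t3e-03\t38.5",
    "PR-2\tsinglet_C\t61.0\t200\t78\t1\t10\t209\t1\t600\t1e-05\t120.0",
]

for hit in filter_hits(sample_hits):
    print(hit)
```

In this sketch a hit whose e-value equals the cut-off is retained; whether the original searches treated the threshold as inclusive is not stated in the text.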
Obtained clusters were annotated and analyzed for score, e-values, sequence size and presence of conserved domains, as shown in Table 1. For this purpose all clusters were translated using the TRANSLATE tool of Expasy and screened for conserved motifs with the aid of the rps-BLAST CD-search tool (Altschul et al., 1990). The best match for each gene in each studied species was submitted to a BLASTx alignment in NCBI GenBank in an effort to confirm their putative function.
In a second, manual analysis, redundancies (i.e., clusters that matched more than one gene due to common domains) were eliminated. For this purpose, clusters matching each query sequence were annotated in a local database (called 'non-redundant').
The third step of the analysis aimed at comparing the number of R and PR candidate sequences obtained after the tBLASTn searches against the soybean, Arabidopsis and Medicago databases by direct counting of non-redundant clusters for each one of the 59 genes studied.
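The redundancy removal and counting steps were performed manually by the authors. The sketch below only illustrates one possible automated rule, which is an assumption made here: each cluster is assigned to the seed gene with the highest bit score, and the non-redundant clusters are then counted per gene. All identifiers and scores are hypothetical.

```python
from collections import Counter

def assign_best_seed(hits):
    """Map each cluster to the seed gene giving its highest bit score.

    hits: iterable of (seed_gene, cluster_id, bit_score) tuples.
    """
    best = {}
    for seed, cluster, score in hits:
        if cluster not in best or score > best[cluster][1]:
            best[cluster] = (seed, score)
    return {cluster: seed for cluster, (seed, _score) in best.items()}

def count_per_seed(assignment):
    """Count non-redundant clusters assigned to each seed gene."""
    return Counter(assignment.values())

# Hypothetical hits: contig_A matches two seeds through a shared domain.
hits = [
    ("Xa21", "contig_A", 165.0),
    ("EFR", "contig_A", 140.0),
    ("Pto", "contig_B", 98.0),
    ("Xa21", "contig_C", 77.0),
]

assignment = assign_best_seed(hits)
counts = count_per_seed(assignment)
print(assignment)
print(counts)
```

A best-score rule is simple but can misassign clusters that share a long common domain, which is presumably why manual sorting was preferred in the study.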
To analyze the relationships among these genes, some R and PR gene candidates were selected from all three studied species for an evolutionary analysis using the maximum parsimony method and a bootstrap test with 5,000 replicates. For this purpose, CLUSTALx alignments were submitted to the program MEGA (Molecular Evolutionary Genetics Analysis), Version 4 for Windows (Tamura et al., 2007).
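The bootstrap test mentioned above is carried out internally by MEGA. As a rough illustration of the underlying idea only, the sketch below builds one bootstrap pseudo-replicate by resampling alignment columns with replacement; it does not perform the parsimony tree search, and the toy alignment and function name are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate with columns resampled with replacement.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    rng: a random.Random instance, so that results can be reproduced.
    """
    length = len(next(iter(alignment.values())))
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}

# Toy alignment, invented for illustration only.
toy = {
    "soybean":     "MKLTAGVS",
    "Medicago":    "MKLSAGVS",
    "Arabidopsis": "MRLTAG-S",
}

rng = random.Random(0)
replicate = bootstrap_alignment(toy, rng)
print(replicate)
```

In a full analysis, a tree would be inferred from each of the 5,000 pseudo-replicates, and the bootstrap value of a branch is the percentage of those trees in which the branch appears.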
Studying syntenic regions among the soybean and Medicago genomes
Best matches for all selected soybean genes were aligned against the M. truncatula pseudogenome in order to anchor the 59 soybean sequences in virtual chromosomes through the CVit-BLAST procedure implemented in the Medicago sequencing resource website. BLAST algorithm parameters (score, e-value and percentage of identity) were adjusted to infer the position of soybean sequences along the Medicago virtual chromosomes.
In silico expression assay based on GENOSOJA EST sequences
A preliminary analysis of the prevalence of the 59 genes in the soybean libraries was carried out by directly comparing the read frequencies of each cluster across the GENOSOJA cDNA libraries. Information regarding the 65 libraries that constitute the GENOSOJA database is available on The Soybean Genome Project Website. For practical purposes we combined some libraries that comprised different stages of the same tissue/organ (for example, B01 and B02 are here referred to as "B"), resulting in a total of 16 libraries (B: vegetable buds of field grown plants; C: cotyledons; EN: endosperm; EP: epicotyls; F: flowers; H: hypocotyls; LV: leaves; R: roots; SH: germination shoots; ST: stems; SO: somatic embryos; SC: soybean submitted to drought; LI: leaves infected with Asian rust; MJ: soybean submitted to Meloidogyne javanica; SD: seeds and UK: unknown). To generate an overall picture of selected R and PR gene expression patterns in soybean, a hierarchical clustering approach (Eisen et al., 1998) was applied using normalized data and a graphic representation constructed with the aid of the CLUSTER program. Dendrograms including both axes (using the weighted pair-group for each cluster and library) were generated with the aid of the TreeView program (Page, 1996). In these graphics, light yellow means no expression and red indicates all degrees of expression.
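The pooling of library stages and the normalization of read counts can be sketched as follows. The text does not specify the normalization scheme, so the one used here (reads per 10,000 library reads) is an assumption, and the read counts are hypothetical. The clustering itself was done with the CLUSTER program and is not reproduced.

```python
import re
from collections import defaultdict

def merge_stages(read_counts):
    """Pool libraries that differ only by a trailing stage number (B01, B02 -> B)."""
    merged = defaultdict(int)
    for library, reads in read_counts.items():
        prefix = re.sub(r"\d+$", "", library)  # strip trailing digits
        merged[prefix] += reads
    return dict(merged)

def normalize(cluster_reads, library_totals, scale=10_000):
    """Express a cluster's reads per `scale` total reads of each library."""
    return {library: scale * cluster_reads.get(library, 0) / total
            for library, total in library_totals.items()}

# Hypothetical read counts for one cluster and hypothetical library sizes.
cluster_reads = merge_stages({"B01": 3, "B02": 2, "SD01": 5})
library_totals = merge_stages({"B01": 600, "B02": 400, "SD01": 2000, "MJ01": 1500})

print(cluster_reads)
print(normalize(cluster_reads, library_totals))
```

Normalizing by library size keeps a large library from dominating the picture simply because it was sequenced more deeply.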
In silico expression assay based on the GENOSOJA SuperSAGE libraries
R and PR candidates were also used to screen the six SuperSAGE libraries generated by the GENOSOJA consortium. For the drought experiment, four libraries were generated using roots of two contrasting soybean genotypes, viz. Embrapa-48 (tolerant) and BR-16 (susceptible), both submitted to dehydration in the dark for 25 to 150 min (all times bulked together), as compared with non-stressed controls. The other stressed library was generated using leaves of the resistant accession PI561356 inoculated with rust fungus and collected 12, 24 and 48 h post inoculation. For the composition of the pathogen-stressed library, equimolar amounts of the three inoculation times were used, as compared with the negative, non-inoculated control of the same genotype. The libraries were constructed at GenXPro GmbH (Frankfurt, Germany), essentially as described by Matsumura et al. (2008), and were subsequently sequenced via a SOLEXA platform.
Aiming to perform an overview of the GENOSOJA SuperSAGE data associated with R and PR genes, SuperSAGE tags were submitted to a BLASTn (maximum e-value 1e-05) against the database generated from three comparisons of the six available libraries (1-Embrapa-48, drought tolerant stressed vs. negative control; 2-BR-16, drought susceptible stressed vs. negative control; 3-PI561356 fungus resistant stressed vs. negative control). Each SuperSAGE tag was annotated considering the respective library comparison and also the respective aligned ESTs.
Description and distribution of R and PR genes in soybean, Medicago and Arabidopsis
The tBLASTn alignment against the soybean transcriptome using the 59 known R and PR gene probes returned 1,066 non-redundant sequences from the contigs and singlets deposited in the GENOSOJA database. Among them, 700 represented contigs and 366 singlets, which together encompassed 26,653 reads. Regarding the tBLASTn searches in the Medicago transcriptome, a total of 1,727 sequences were positive matches. In Arabidopsis, 1,533 sequences returned matches after the same procedure.
A screening of R and PR genes in these three species resulted in the identification of 4,326 candidates, of which 3,065 were R and 1,261 PR gene candidates. A graphical representation regarding the prevalence of these sequences and how they are distributed among the soybean, Medicago and Arabidopsis transcriptomes is shown in Figure 1.
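As a quick consistency check, the candidate counts reported in the abstract and in this section can be tallied. The snippet below uses only numbers stated in the text.

```python
# Candidate counts per species, as reported in the text.
r_counts = {"soybean": 756, "Medicago": 1142, "Arabidopsis": 1167}
pr_counts = {"soybean": 310, "Medicago": 585, "Arabidopsis": 366}

# Non-redundant sequences per species (R + PR).
per_species = {species: r_counts[species] + pr_counts[species]
               for species in r_counts}

total_r = sum(r_counts.values())
total_pr = sum(pr_counts.values())

print(per_species)
print(total_r, total_pr, total_r + total_pr)
```

The per-species sums agree with the 1,066, 1,727 and 1,533 sequences reported for soybean, Medicago and Arabidopsis, and the overall totals agree with the 3,065 R, 1,261 PR and 4,326 combined candidates.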
After analyzing all results it was observed that only one PR (PR-13) and two R genes (L6 and M) were absent from the soybean transcriptome, while all the other 56 genes presented positive results in the tBLASTn searches. The same was observed in the Medicago tBLASTn results for these three genes. In Arabidopsis, too, no matches could be found for the two R genes L6 and M, but four candidate sequences could be identified for the PR-13 class, as shown in Table 1. A comparison of the distribution of non-redundant sequences in the three species revealed that the NBS-LRR family was the most frequent one in all cases, while the LRR-kinase class was the least represented in all studied organisms (Figure 2). Moreover, it was observed that while Arabidopsis presented the highest number of R gene candidates, Medicago presented the highest number of PR gene candidates. In both cases, soybean presented the lowest number of matches (Figure 3A).
The three most represented R and PR genes in all species were the same, with Xa21, EFR and Pti6 representing R genes and PR-2, PR-7 and PR-9 representing PR genes. Due to this abundance, both Xa21 and the PR-2 genes were selected for the construction of a dendrogram and expression analysis. Matching of Xa21 and PR-2 candidates in soybean, Medicago and Arabidopsis did not follow a regular distribution pattern, since soybean presented fewer matches for both genes, and most of the Xa21 candidate sequences were found in Medicago, whereas most PR-2 candidates were found in Arabidopsis (Figure 3B).
Among the 310 PR genes of soybean only 40 matched with more than one seed sequence, all the others being exclusive to a given PR gene family. On the other hand, almost all R genes matched sequences that aligned with more than one probe, requiring manual sorting. Exceptions occurred only with respect to RAR, RIN, P, WRKY29, and Xa21, which aligned in most cases with exclusive sequences.
Phylogenetic analysis of Xa21 and PR-2 genes
Dendrograms generated for Xa21 and PR-2 genes using the soybean sequences and orthologs clearly divided dicots and monocots into distinct clades (Figure 4). In the Xa21 analysis, the lycophyte Selaginella moellendorffii was placed in a basal position from which the two branches representing monocots and dicots emerged (Figure 4A). The monocot group included members of the Poaceae family in one branch, with bootstrap support of 95%, associated in the same branch with the palm Elaeis guineensis. Regarding the dicot group, it was observed that both Fabaceae members (G. max and M. truncatula) were positioned together, while the other branch included members of the suborder Eurosidae I (Vitis vinifera and Ricinus communis), together with A. thaliana, a member of the Eurosidae II suborder.
Considering the PR-2 dendrogram (Figure 4B), the grasses (Poaceae represented by rice and maize) occupied a basal position, from which a clade containing two monocots, ginger (Zingiber officinale) and banana (Musa paradisiaca), emerged. Moreover, a large clade containing all dicots was split into two subclades that behaved as merophyletic groups. For example, tobacco (Nicotiana tabacum) and coffee (Coffea arabica), members of the Asterid order, remained together, but potato (Solanum tuberosum) of the same order was positioned on another branch. Soybean and Medicago were also positioned in separate subclades.
Expression pattern of R (Xa21) and PR (PR-2) genes in the soybean transcriptome
From the 26,653 reads identified, an in silico expression assay was carried out considering transcripts from both genes Xa21 (2,980 reads) and PR-2 (1,099 reads). This allowed identifying their prevalence and normalizing their distribution among the tissues and conditions represented in the 65 different libraries. Graphic illustrations of these comparisons are available as Figures S1 and S2 (Supplementary Material).
The analysis of their expression pattern in soybean, obtained from normalized data, revealed that all libraries presented almost the same number of reads. The most representative library was from seed tissues (SD), presenting 10% of the identified reads. Leaves (LV), roots (R) and flowers (F) presented similar expression, each representing 9% of all reads. The remaining tissues also presented significant expression (ranging from 5% to 8%), except in the case of libraries made from tissues submitted to the nematode Meloidogyne javanica (MJ), where no reads were identified.
Expression considering the SuperSAGE libraries
BLASTn results revealed that 944 soybean EST candidates aligned with 1,553 SuperSAGE tags when considering an e-value cut-off of 1e-05. Among all tags, 1,072 aligned with the R gene candidates from different classes, with emphasis on the NBS-LRR class. Additionally, 481 tags aligned with PR gene candidates, most of them with the PR-9 secretory peroxidase family (Figure 5). Data concerning sequence-tag association are available as supplementary material (Tables S2, S3 and S4). The best results were obtained for comparison 2 (BR-16, drought susceptible stressed vs. negative control), which matched 613 non-redundant tags, while 465 were found for comparison 1 (Embrapa-48, drought tolerant stressed vs. negative control), and for comparison 3 (PI561356 fungus resistant stressed vs. negative control) 475 SuperTags were represented (Figure 5). It is noteworthy that many tags matched in more than one comparison.
Anchoring soybean R and PR genes in Medicago virtual chromosomes
The alignment of 59 soybean genes against the Medicago virtual chromosomes revealed 1,253 sites in all nine chromosomes, also including sub-telomeric regions (Figure 6). 58 genes presented similarities with distinct segments in the same chromosome or appeared twice in distinct chromosomes. Only the PR-1 sequence anchored in an exclusive chromosome (2).
The highest number of anchored genes was found in chromosome 8, matching 32 of the 59 genes in 85 sites. On the other hand, chromosome 6 presented the lowest number of anchored genes (12). Nonetheless, this chromosome presented the highest number of duplications, matching 228 sites, most of them in tandem positions. Such tandem repetitions could be also observed in three sites of chromosome 3. The lowest gene density was observed in the long arm of chromosome 3. Syntenic regions were evident in chromosomes 2 and 4 (Figure 6).
Several sequences clustered along the genome, with some chromosomes rich in resistance genes, especially chromosomes 2, 7, 8 and 9, with at least four distinct genes in very close positions. These blocks of genes always matched R genes, while PR genes generally appeared in the same chromosomes in distinct sites.
The 1,066 soybean sequences resulting from the tBLASTn alignments confirmed the excellent coverage of the existing GENOSOJA databank, which includes the most important representatives from different gene families.
Legumes are plants known to be able to withstand many kinds of stresses, including rapid climate changes, drought, exposure to diseases and pests, water logging and flooding (Cannon et al., 2009), which could explain the higher number of PR genes encountered in Medicago in comparison to Arabidopsis, since these families of genes can be activated by different kinds of biotic or abiotic stress (Glombitza et al., 2004). The low number of R and PR gene candidates found in soybean is curious when compared to Arabidopsis and Medicago, since these have smaller genomes (157 Mb and 583 Mb, respectively) than that of G. max (1,115 Mb). This may be due to the analyzed sample, which was restricted to expressed sequence tags, whereas the databases of both Arabidopsis and Medicago are larger. Previous studies on legumes showed that despite the relatively large difference in genome sizes of soybean and Medicago, gene densities are similar, indicating that a given Medicago region is likely to correspond well with two soybean regions (Mudge et al., 2005). This leads us to believe that additional expression assays in soybean may reveal important genes that are expressed under very specific conditions.
The number of soybean clusters that aligned with more than one R gene seed sequence is not surprising. Similar results were observed in previous studies regarding R genes of eucalyptus (Barbosa-da-Silva et al., 2005) and sugarcane (Wanderley-Nogueira et al., 2007). This occurs due to the common domains shared by R genes, as for example the LRR domain that is present in the LRR, NBS-LRR and LRR-kinase gene families, facilitating alignments with more than one gene. This is rarer when considering PR gene categories that are more distinct in structure and function (Kitajima and Sato, 1999), as also observed herein. A higher number of sequences matching NBS-LRR families, when compared to other classes, was also reported by Barbosa-da-Silva et al. (2005) and Wanderley-Nogueira et al. (2007), confirming the general observation that most R genes are members of this class.
Dendrograms generated from these data revealed a similar picture in both gene classes selected (Xa21 and PR-2). In the case of Xa21, the positioning of Selaginella moellendorffii as an outgroup was expected, since this species figures as a member of an ancient vascular plant lineage that first appeared 400 million years ago, and thus represents a basal node on the plant evolutionary tree (Weng et al., 2008). The analysis of the Xa21 orthologs from different species reflected their relationship according to classic taxonomy. The class Liliopsida (monocots) appeared as a monophyletic group uniting on the same branch Oryza sativa, Zea mays and Sorghum bicolor, which are all annual cereal grains of the Poaceae family, while the palm Elaeis guineensis (Arecaceae) was positioned on another branch. Considering the Magnoliopsida (dicots), the same occurred, since Medicago and soybean, both legumes and members of Fabaceae, appeared in a subclade, separated from the remaining species. R genes are considered fast evolving, due to their co-evolution with specific pathogens (Michelmore and Meyers, 1998). In the case of Xa21 the most polymorphic region is its extracellular LRR domain, which is responsible for pathogen specificity (Ellis et al., 2000), defining the relationships of the dendrogram presented here.
The PR-2 dendrogram topology showed two main clades, as expected, monocots and dicots. The grouping of monocots followed the taxonomic relationship, segregating Musa and Zingiber (Zingiberales) from Oryza and Zea (Poaceae). It was possible to identify that a symplesiomorphic character united all dicots, reflecting their common origin. Moreover, considering the Magnoliopsida group, the evolutionary model of the PR-2 class seemed to follow a synapomorphic pattern, leading to their diversification in different groups comprising families and orders, this probably reflecting divergent processes regarding this PR gene.
The studied organisms presented different centers of origin, habitats and cycles of life, as well as tolerance, resistance and sensitivity to diverse kinds of biotic and abiotic stresses. Nonetheless, from an overall perspective and considering the position of different species in the dendrograms, it is evident that both Xa21 and PR-2 pathway genes were present in a common ancestor of the angiosperms, since they appear relatively conserved in different plant groups.
Many PR genes are constitutively expressed in given plant tissues (Velazhahan and Muthukrishnan, 2003; Liu et al., 2004), suggesting a link between biotic and abiotic stresses and indicating that at least some members of the PR proteins play important roles in plant development, besides their role in defense responses. This fact may explain why the expression of the PR-2 gene can be observed at a basal level in almost all tissues, as seen when considering its frequencies in the soybean libraries.
Studies carried out by Li et al. (2008) and Libault et al. (2010) revealed consistent differences in gene expression patterns among diverse tissues, especially between roots and aerial tissues, but also revealed similarities between expression levels in tissues such as flowers and leaves, corroborating our results. The most represented library was for seeds, including different development stages, which is not surprising, since previous evaluations also revealed that the soybean grain contained the vast majority of expressed genes and regulatory sequences in the plant (Cannon et al., 2009). In the case of the PR-2 protein, it is interesting to note that previous evaluations carried out by Leubner-Metzger (2005) in tobacco suggest that this gene could play a role in seed germination. Furthermore, the expression of both genes was also increased in leaves, roots and flowers, confirming their prevalence in developing tissues.
As mentioned above, abiotic stress is able to trigger diverse plant responses. After an initial massive distribution of energy triggered by stress, a wide array of defense mechanisms is activated by R genes, inducing a signal cascade and increased PR gene transcription (Vergne et al., 2010). This may justify the considerable amount of soybean SuperSAGE tags related to these genes among the three comparisons considered, with considerable representation in both biotic and abiotic (drought) conditions, as well as in the negative controls, with many tags represented in more than one treatment. The high number of tags that matched with BR-16 drought susceptible library vs. control could be explained by the ability of the plant to continue expressing genes related to systemic acquired resistance as a consequence of contact with any kind of previous stress, a crosstalk previously reported for other plants (Durrant and Dong, 2004; Kido et al., 2011). Comparing the distribution between R and PR genes, both were representative with 1,072 tags matching R genes and 481 tags matching PR candidates, indicating that additional analytical efforts regarding the SuperSAGE candidates will reveal not only associations with specific situations, but also allelic differences important in the definition of biotic and abiotic stress responses.
Flowering plants originated approximately 200 million years ago (Wilkstrom et al., 2001) and subsequently diverged into several lineages. Legumes are an old family believed to have originated approximately 54 Mya (Lavin et al., 2005). Soybean and other papilionoid legumes show evidence of an older shared duplication, and soybean probably underwent polyploidy 13 Mya (Shoemaker et al., 2006). These duplications are widely evident, both in the number of similar duplicated genes and in large areas of synteny between chromosomal regions. Previous evidence indicates extensive similarities in gene densities and distribution between soybean and Medicago, implying that a given Medicago region is likely to correspond well with two soybean regions (Mudge et al., 2005). This evidence suggests that Medicago could represent "a simplified draft" of the soybean gene distribution, making an evaluation regarding R and PR soybean ortholog distribution in this crop most desirable. Hence, it is not surprising that all identified soybean R and PR transcripts appeared anchored in 1,253 sites in all segments of Medicago virtual chromosomes.
The rich R gene regions found in chromosomes 2, 7, 8 and 9 confirm previous observations that most resistance genes reside in clusters (Kanazin et al., 1996), as reported in maize (Dinesh-Kumar et al., 1995), lettuce (Maisonneuve et al., 1994), oat (Rayapati et al., 1994) and flax (Ellis et al., 1995). The formation of gene clusters is in general associated with a common ancestor, and the diversification of these genes is the result of duplication processes followed by diversification due to pathogen or environmental pressure.
Clustering of R genes corroborates the existing theory that a common genetic mechanism involving duplication has been responsible for the evolution and diversification of this gene superfamily (Hulbert et al., 2001). The four clusters presented similarities with distinct segments in the same chromosome, probably reflecting tandem gene duplication mechanisms. Such duplicated copies tend to diverge by acquiring additional mutations and may specialize or optimize to play slightly different roles (Alberts et al., 2002).
Considering duplicated segments across the entire genome, 58 genes could be identified in at least two distinct chromosomes. Unlike tandem duplications, repetitions in distinct chromosomes result from duplication events followed by translocation and sequence divergence, which also allows functional diversification (Wendell, 2000; Thiel et al., 2009). There is also evidence that bursts of transposition can be activated by severe biotic or abiotic environmental stress.
Still regarding the duplication analysis, a large tandem repetition was evident in both chromosomes 3 and 6, represented by the genes Xa1/I2 and RRS1, respectively. Previous reports suggested that, once duplicated, genes in tandem repetitions may expand rapidly through unequal crossing over when the character confers an advantage to the organism (Alberts et al., 2002), in this case a higher diversity of genes associated with resistance and stress response. This evidence supports the assumption that future efforts to increase pathogen resistance may rely on biotechnological approaches that consider whole gene clusters naturally associated in neighboring positions (Dafny-Yelin and Tzfira, 2007), rather than isolated genes, as has been traditionally done.
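The distinction drawn above between tandem copies on one chromosome and copies dispersed over distinct chromosomes can be illustrated with a minimal Python sketch. It is not the procedure used in this work: the function name, the input layout, the distance threshold `max_gap` and the toy coordinates are all assumptions made only for illustration.

```python
from collections import defaultdict


def classify_duplicates(hits, max_gap=200_000):
    """Flag each gene family as tandem and/or dispersed.

    hits: iterable of (family, chromosome, start) tuples (hypothetical layout).
    max_gap: assumed distance (bp) below which two copies on the same
             chromosome are treated as tandem neighbours.
    """
    by_family = defaultdict(lambda: defaultdict(list))
    for family, chrom, start in hits:
        by_family[family][chrom].append(start)

    result = {}
    for family, chroms in by_family.items():
        # Tandem: two adjacent copies on one chromosome lie within max_gap.
        tandem = False
        for positions in chroms.values():
            ordered = sorted(positions)
            if any(b - a <= max_gap for a, b in zip(ordered, ordered[1:])):
                tandem = True
        # Dispersed: copies are present on at least two distinct chromosomes.
        dispersed = len(chroms) >= 2
        result[family] = {"tandem": tandem, "dispersed": dispersed}
    return result


# Toy coordinates, invented for illustration only (not data from this study).
example_hits = [
    ("famA", 3, 1_000), ("famA", 3, 50_000),   # two close copies, one chromosome
    ("famB", 2, 1_000), ("famB", 7, 9_000),    # copies on two chromosomes
    ("famC", 5, 100),                          # single copy
]
example = classify_duplicates(example_hits, max_gap=100_000)
```

In this toy input, `famA` is flagged as tandem only and `famB` as dispersed only. A real analysis would also need sequence similarity criteria and a threshold chosen for the genome under study.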
In conclusion, the sequences identified here represent valuable resources for soybean breeding programs, allowing their use in biotechnological approaches, with emphasis on transgenes. They are also valuable for mapping purposes, considering their putative distribution uncovered here by anchoring on the known gene distribution of the Medicago genome.
Considering the gene diversity revealed especially by the SuperSAGE approach, the association of these genes with specific responses to biotic or abiotic stress conditions may reveal important gene variants for germplasm screening in the search for new accessions useful for breeding, especially in association with marker-assisted selection (MAS), potentially saving decades of laborious research.
The authors would like to thank CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), FACEPE (Fundação de Amparo à Ciência e Tecnologia do Estado de Pernambuco), and CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior) for their financial support.
Alberts B, Johnson A, Lewis J, Raff M, Roberts K and Walter P (2002) Molecular Biology of the Cell. 4th edition. Garland Publishing Company, New York & London, 1616 pp.
Altschul SF, Gish W, Miller W and Myers E (1990) Basic local alignment search tool. J Mol Biol 215:403-410.
Ashfield T, Bocian A, Held D, Henk AD, Marek LF, Danesh D, Penuela S, Meksem K, Lightfoot DA, Young ND, et al (2003) Genetic and physical localization of the soybean Rpg1-b disease resistance gene reveals a complex locus containing several tightly linked families of NBS-LRR genes. Mol Plant Microbe Interact 16:817-826.
Atici O and Nalbantoglu B (2003) Antifreeze proteins in higher plants. Phytochemistry 64:1187-1196.
Barbosa-da-Silva A, Wanderley-Nogueira AC, Silva RRM, Belarmino LC, Soares-Cavalcanti NM and Benko-Iseppon AM (2005) In silico survey of resistance (R) genes in Eucalyptus transcriptome. Genet Mol Biol 28:562-574.
Benko-Iseppon AM, Galdino SL, Calsa Júnior T, Kido EA, Tossi A, Belarmino LC and Crovella S (2010) Overview of plant antimicrobial peptides. Curr Prot Pept Sci 11:181-188.
Bent AF (1996) Plant disease resistance genes: Function meets structure. Plant Cell 8:1751-1771.
Bolton M (2009) Primary metabolism and plant defense - Fuel for the fire. Mol Plant Microbe Interact 22:487-497.
Bonas U and Anckerveken GV (1999) Gene-for-gene interactions: Bacterial avirulence proteins specify plant disease resistance. Curr Opin Plant Biol 2:94-98.
Bonasera JM, Kim JF and Beer SV (2006) PR genes of apple: Identification and expression in response to elicitors and inoculation with Erwinia amylovora. BMC Plant Biol 6:23-34.
Cannon SB, May GD and Jackson SA (2009) Three sequenced legume genomes and many crop species: Rich opportunities for translational genomics. Plant Physiol 151:970-977.
Chester KS (1933) The problem of acquired physiological immunity in plants. Quart Rev Phytopathol 42:185-209.
Dafny-Yelin M and Tzfira T (2007) Delivery of multiple transgenes to plant cells. Plant Physiol 145:1118-1128.
Dinesh-Kumar SP, Whitham S, Choi D, Hehl R, Corr C and Baker B (1995) Transposon tagging of tobacco mosaic virus resistance gene N: Its possible role in the TMV-N-mediated signal transduction pathway. Proc Natl Acad Sci USA 92:4175-4180.
Dixon MS, Jones DA, Keddie JS, Thomas CT, Harrison K and Jones JDG (1996) The tomato Cf2 disease resistance locus comprises two functional genes encoding leucine rich repeats proteins. Cell 84:451-459.
Durrant WE and Dong X (2004) Systemic acquired resistance. Annu Rev Phytopathol 42:185-209.
Eisen MB, Spellman PT, Brown PO and Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95:14863-14868.
Ellis J and Jones D (2000) Structure and function of proteins controlling strain-specific pathogen resistance in plants. Curr Opin Plant Biol 1:288-293.
Ellis J, Lawrence GJ, Finnegan EJ and Anderson PA (1995) Contrasting complexity of two rust resistance loci in flax. Proc Natl Acad Sci USA 92:4185-4188.
Ellis J, Dodds P and Pryor T (2000) Structure, function and evolution of plant disease resistance genes. Curr Opin Plant Biol 3:278-284.
Gaffney T, Friedrich L, Vernooij B, Negrotto D, Nye G, Ukness S, Ward E, Kessman H and Ryals J (1993) Requirement of salicylic acid for the induction of systemic acquired resistance. Science 261:754-756.
Glombitza S, Dubuis P-H, Thulke O, Welzl G, Bovet L, Götz M, Affenzeller M, Geist B, Hehn A, Asnaghi C, et al (2004) Crosstalk and differential response to abiotic and biotic stressors reflected at the transcriptional level of effector genes from secondary metabolism. Plant Mol Biol 54:817-835.
Griffith M and Yaish MWF (2004) Antifreeze proteins in overwintering plants: A tale of two activities. Trends Plant Sci 9:399-405.
Hammond-Kosack KE and Jones JDG (1997) Plant disease resistance genes. Annu Rev Plant Physiol 48:575-607.
Hon WC, Griffith M, Mlynarz A, Kwok YC and Yang DSC (1995) Antifreeze proteins in winter rye are similar to pathogenesis-related proteins. Plant Physiol 109:879-889.
Hulbert SH, Webb CA, Smith SM and Sun Q (2001) Resistance gene complexes: Evolution and utilization. Annu Rev Phytopathol 39:285-312.
Joahal GS and Briggs SP (1992) Reductase activity encoded by the Hm1 resistance gene in maize. Science 198:985-987.
Kanazin V, Marek LF and Shoemaker RC (1996) Resistance gene analogs are conserved and clustered in soybean. Proc Natl Acad Sci USA 93:11746-11750.
Kido EA, Barbosa PK, Ferreira Neto JCR, Pandolfi V, Houllou-Kido LM, Crovella S and Benko-Iseppon AM (2011) Identification of plant protein kinases in response to abiotic and biotic stresses using SuperSAGE. Curr Prot Pept Sci 12:643-656.
Kitajima S and Sato F (1999) Plant pathogenesis-related proteins: Molecular mechanisms of gene expression and protein function. J Biochem 125:1-8.
Lavin M, Herendeen PS and Wojciechowski MF (2005) Evolutionary rates analysis of Leguminosae implicates a rapid diversification of lineages during the tertiary. Syst Biol 54:575-594.
Lawrence GJ, Finnegan EJ, Ayliffe MA and Ellis JG (1995) The L6 gene for flax rust resistance is related to the Arabidopsis bacterial resistance gene RPS2 and the tobacco viral resistance gene N. Plant Cell 7:1195-1206.
Leubner-Metzger G (2005) β-1,3-glucanase gene expression in low-hydrated seeds as a mechanism for dormancy release during tobacco after-ripening. Plant J 41:133-145.
Li L, He H, Zhang J, Wang X, Bai S, Stolc V, Tongprasit W, Young ND, Yu O and Deng XW (2008) Transcriptional analysis of highly syntenic regions between Medicago truncatula and Glycine max using tiling microarrays. Genome Biol 9:R57.
Libault M, Farmer A, Joshi T, Takahashi K, Langley RJ, Franklin LD, He J, Xu D, May G and Stacey G (2010) An integrated transcriptome atlas of the crop model Glycine max and its use in comparative analyses in plants. Plant J 63:86-99.
Liu B, Zhang S, Zhu X, Yang Q, Wu S, Mei M, Mauleon R, Leach J, Mew T and Leung H (2004) Candidate defense genes as predictors of quantitative blast resistance in rice. Mol Plant Microbe Interact 17:1146-1152.
Maisonneuve B, Bellec Y, Anderson P and Michelmore RW (1994) Rapid mapping of two genes for resistance to downy mildew from Lactuca serriola to existing clusters of resistance genes. Theor Appl Genet 89:96-104.
Matsumura H, Kruger DH, Kahl G and Terauchi R (2008) SuperSAGE: A modern platform for genome-wide quantitative transcript profiling. Curr Pharm Biotechnol 9:368-374.
Melotto M, Coelho MF, Pedrosa-Harand A, Kelly JD and Camargo LE (2004) The anthracnose resistance locus Co-4 of common bean is located on chromosome 3 and contains putative disease resistance-related genes. Theor Appl Genet 109:690-699.
Metzler MC, Cutt JR and Klessig DF (1991) Isolation and characterization of a gene encoding a PR-1 like protein from Arabidopsis thaliana. Plant Physiol 96:346-348.
Michelmore RW and Meyers BC (1998) Clusters of resistance genes in plants evolve by divergent selection and a birth-and-death process. Genome Res 8:1113-1130.
Mindrinos M, Katagiri F, Yu GL and Ausubel FM (1994) The Arabidopsis thaliana disease resistance gene encodes a protein containing a nucleotide-binding site and leucine rich repeats. Cell 78:1089-1099.
Mudge J, Cannon SB, Kalo P, Oldroyd GE, Roe BA, Town CD and Young ND (2005) Highly syntenic regions in the genomes of soybean, Medicago truncatula and Arabidopsis thaliana. BMC Plant Biol 5:e15.
Nanda AK, Andrio E, Marino D, Pauly N and Dunand C (2010) Reactive Oxygen Species during plant-microorganism early interactions. J Integr Plant Biol 52:195-204.
Nurnberg T and Brunner F (2002) Innate immunity in plants and animals: Emerging parallels between the recognition of general elicitors and pathogen-associated molecular patterns. Curr Opin Plant Biol 5:318-324.
Page RD (1996) Treeview program, ver. 1.6.1. Comp Appl Biosci 12:357-358.
Rayapati PJ, Lee M, Gregory JW and Wise RP (1994) A linkage map of diploid Avena based on RFLP loci and a locus conferring resistance to nine isolates of Puccinia coronata var. 'avenae'. Theor Appl Genet 89:831-837.
Salmeron JM, Oldroyd GED, Romens CMT, Scofield SR, Kim HS, Lavelle DT, Dahlbeck D and Staskawicz BJ (1996) Tomato Prf is a member of the leucine rich repeats class of plant disease resistance genes and lies embedded within the Pto kinase gene cluster. Cell 86:123-133.
Shoemaker RC, Schlueter J and Doyle JJ (2006) Paleopolyploidy and gene duplication in soybean and other legumes. Curr Opin Plant Biol 9:104-109.
Song WY, Pi LY, Wang GL, Gardner J, Holsten T and Ronald PC (1997) Evolution of the rice Xa21 disease resistance genes family. Plant Cell 9:1279-1287.
Song WY, Wang GL, Kim HS, Pi LY, Gardner J, Wang B, Holsten T, Zhai WX, Zhu LH, Fauquet C, et al (1995) A receptor kinase-like protein encoded by the rice disease resistance gene Xa21. Science 270:1804-1806.
Sparla F, Rotino L, Valgimigli MC, Pupillo P and Trost P (2004) Systemic resistance induced by benzothiadiazole in pear inoculated with the agent of fire blight. Sci Hortic 101:269-279.
Tamura K, Dudley J, Nei M and Kumar S (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software ver. 4.0. Mol Biol Evol 24:1596-1599.
Tang X, Xie M, Kim YJ, Zhou J, Klessig DF and Martin GB (1999) Overexpression of Pto activates defense responses and confers broad resistance. Plant Cell 11:15-29.
Thiel T, Graner A, Waugh R, Grosse I, Close TJ and Stein N (2009) Evidence and evolutionary analysis of ancient whole-genome duplication in barley predating the divergence from rice. BMC Evol Biol 9:209-227.
Tornero P, Gadea J, Conejero V and Vera P (1997) Two PR-1 genes from tomato are differentially regulated and reveal a novel mode of expression for a pathogenesis-related gene during the hypersensitive response and development. Mol Plant Microbe Interact 10:624-634.
van-Loon LC, Geraats BPJ and Linthorst HJM (2006) Ethylene as a modulator of disease resistance in plants. Trends Plant Sci 11:184-191.
van-Loon LC, Pierpoint WS, Boller T and Conejero V (1999) Recommendations for naming plant pathogenesis-related proteins. Plant Mol Biol Rep 12:245-264.
Velazhahan R and Muthukrishnan S (2003) Transgenic tobacco plants constitutively overexpressing a rice thaumatin-like protein (PR-5) show enhanced resistance to Alternaria alternata. Plant Biol 47:347-354.
Vergne E, Grand X, Ballini R, Chalvon V, Saindrenan P, Tharreau D, Nottéghem J-L and Morel J-B (2010) Preformed expression of defense is a hallmark of partial resistance to rice blast fungal pathogen Magnaporthe oryzae. BMC Plant Biol 10:e206.
Wanderley-Nogueira AC, Mota N, Lima-Morais D, Silva LCB, Silva AB and Benko-Iseppon AM (2007) Abundance and diversity of resistance (R) genes in the sugarcane transcriptome. Genet Mol Res 6:866-889.
Wang GL, Holsten TE, Song WY, Wang HP and Ronald PC (1995) Construction of a rice bacterial artificial chromosome library and identification of clones linked to the Xa21 disease resistance locus. Plant J 7:525-533.
Wendell J (2000) Genome evolution in polyploids. Plant Mol Biol 42:225-249.
Weng JK, Banks JA and Chapple C (2008) Parallels in lignin biosynthesis: A study in Selaginella moellendorffii reveals convergence across 400 million years of evolution. Comm Int Biol 1:20-22.
Wilkstrom N, Savolainen V and Chase MW (2001) Evolution of the angiosperms: Calibrating the family tree. Proc Soc Biol Sci 268:2211-2220.
Zeier J, Pink B, Mueller MJ and Berger S (2004) Light conditions influence specific defense responses in incompatible plant-pathogen interactions: Uncoupling systemic resistance from salicylic acid and PR-1 accumulation. Planta 219:673-683.
Expert Protein Analysis System (Expasy), http://expasy.org.uk (August 18, 2010).
Medicago sequencing resource website, http://www.medicago.org (October 19, 2010).
The Arabidopsis Information Resource (TAIR), http://www.arabidopsis.org (September 1, 2010).
The Brazilian Soybean Genome Consortium (GENOSOJA), http://bioinfo03.ibi.unicamp.br/soja (August 8, 2010).
The Institute for Genomic Research (TIGR), http://plantta.jcvi.org (August 8, 2010).
Send correspondence to:
Ana Maria Benko-Iseppon
Departamento de Genética, Centro de Ciências Biológicas, Universidade Federal de Pernambuco,
Av. Prof. Morais Rego 1235,
50.670-420 Recife, PE, Brazil
License information: This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The following online material is available for this article:
Figure S1 - Xa21 expression profile in 16 different libraries from GENOSOJA.
Figure S2 - PR-2 expression profile in 16 different libraries from GENOSOJA.
Table S1 - Accession number of reference PR genes used as seed sequences.
Table S2 - Number of SuperSAGE tags per comparison.
Table S3 - SuperSAGE tags that matched genes.
Table S4 - Number of tag repetitions in comparisons matching R and PR genes.
This material is available as part of the online article from http://www.scielo.br/gmb.