The post-transcriptional gene silencing pathway in Eucalyptus

Post-transcriptional gene silencing (PTGS) is a conserved surveillance mechanism that identifies and cleaves double-stranded RNA molecules and their cellular cognate transcripts. The RNA silencing response is currently used as a powerful technique (named RNA interference) for potent and specific inhibition of gene expression in several organisms. To identify gene products in Eucalyptus sharing similarities with enzymes involved in the PTGS pathway, we queried the expressed sequence tag database of the Brazilian Eucalyptus Genome Sequence Project Consortium (FORESTs) with the amino acid sequences of known PTGS-related proteins. Among twenty-six prospected genes, our search detected fifteen assembled sequences encoding products presenting a high level of similarity (E value < 10⁻⁴⁰) to proteins involved in PTGS in plants and other organisms. We conclude that most of the genes known to be involved in the PTGS pathway are represented in the FORESTs database.


Introduction
Post-transcriptional gene silencing (PTGS) is a widely conserved surveillance system that acts at the transcriptome level. It identifies long double-stranded RNA molecules (dsRNA) such as those generated during virus replication, transposon mobilization or aberrant RNA synthesis. These molecules act as trigger signals for the PTGS machinery, guiding the cell to an alert state.
In plants and worms, the system is self-sustained and amplified by the activity of a siRNA-primed RNA-dependent RNA polymerase (RdRP or RDR; Mourrain et al., 2000; Dalmay et al., 2000). In certain cases, such as in silencing induced by dsRNA-replicating viruses, the requirement of an RdRP activity is bypassed (Dalmay et al., 2000). In Caenorhabditis elegans, local introduction of dsRNA leads to systemic silencing that is probably mediated by a transmembrane protein named SID-1 (Winston et al., 2002). The activity of a protein of unknown function with coiled-coil domains (SGS3) is also required for gene silencing (Mourrain et al., 2000).
PTGS can also affect the genome, promoting alterations such as DNA methylation (through MET1; Finnegan et al., 1996) and chromatin remodeling (through DDM1; Jeddeloh et al., 1999) at loci homologous to the target RNA (Morel et al., 2000).
The PTGS phenomenon, also called RNA silencing, is a potent means to counteract foreign sequences, and its natural role in host defenses against transposable elements and viruses has been demonstrated. Consistent with this defense role, a great number of viral suppressors of RNA silencing have been identified so far (Voinnet et al., 1999). A gene encoding an endogenous RNA silencing suppressor was discovered in tobacco (named rgs-CaM; Anandalakshmi et al., 2000) and was proposed to be involved in endogenous gene regulation.
RNA silencing has been observed in plants, protozoans, insects, fungi (named quelling) and mammals (named RNA interference; RNAi), indicating common aspects and functional analogy among pathways. RNAi is currently used as an experimental approach to suppress the expression of specific transcripts, thus becoming a powerful tool to dissect gene function and a promising therapeutic technology.
In this paper we provide an inventory of the putative genes involved in the PTGS pathway of Eucalyptus by exploring the expressed sequence tag (EST) information generated in the Brazilian Eucalyptus Genome Sequence Project Consortium (FORESTs). Enhanced understanding of this pathway will contribute to the establishment of RNAi-based protocols intended for functional genomic studies in this important tree.

Materials and Methods
To identify gene products sharing similarities with enzymes involved in the PTGS pathway (Figure 1), similarity searches against cluster consensus sequences (contigs) in the FORESTs database (https://forests.esalq.usp.br) were performed using BLAST search programs (Altschul et al., 1990). The database was first prospected using query sequences from known PTGS-related proteins (Table 1) and the TBLASTN algorithm. Positive hits (E values < 10⁻⁴⁰) were validated against existing homologous sequences in the GenBank nonredundant protein database by using BLASTX. To further validate annotation, a protein domain analysis was performed using the Reverse Position Specific-BLAST (RPS-BLAST) algorithm at www.ncbi.nlm.nih.gov/structure/cdd/wrpsb.cgi and the Pfam (Bateman et al., 2000) and SMART (Schultz et al., 2000) databases.
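As an illustration of the hit-selection step, the following minimal Python sketch filters BLAST hits by the E value cutoff stated above. It assumes the hits are available in the 12-column tabular layout produced by NCBI BLAST+ with `-outfmt 6`, which is an assumption and not necessarily the format used in this study; the query names, contig names and scores are hypothetical.

```python
# Sketch: keep only BLAST hits below an E value cutoff.
# Assumes NCBI BLAST+ tabular output ("-outfmt 6"), whose default columns are:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
E_VALUE_CUTOFF = 1e-40  # cutoff for positive hits, as stated in the text

# Hypothetical example rows (not real FORESTs data).
EXAMPLE_HITS = (
    "AGO1_query\tcontig_A\t78.2\t410\t89\t2\t1\t410\t12\t1241\t1e-120\t455\n"
    "AGO1_query\tcontig_B\t41.0\t120\t70\t3\t5\t124\t30\t389\t3e-12\t70.1\n"
    "DCL1_query\tcontig_C\t65.5\t300\t103\t1\t10\t309\t1\t900\t5e-85\t310\n"
)


def parse_hits(text):
    """Parse tab-separated BLAST rows into dictionaries."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(
            {
                "query": fields[0],
                "subject": fields[1],
                "evalue": float(fields[10]),
                "bitscore": float(fields[11]),
            }
        )
    return hits


def positive_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Return the hits whose E value is strictly below the cutoff."""
    return [hit for hit in hits if hit["evalue"] < cutoff]


selected = positive_hits(parse_hits(EXAMPLE_HITS))
for hit in selected:
    print(hit["query"], hit["subject"], hit["evalue"])
```

In this sketch, hits above the cutoff are simply discarded; a weaker hit that is retained on the basis of domain evidence would have to be handled separately.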
An estimate of the relative abundance of the identified putative genes was generated based on EST counts per corresponding contig.
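The EST-count estimate can be sketched in Python as follows. The EST and contig identifiers are hypothetical, and expressing each count as a fraction of the ESTs in the identified contigs is an assumption made for illustration; the text only specifies that ESTs were counted per contig.

```python
from collections import Counter

# Hypothetical (EST identifier, contig identifier) assignments.
EST_TO_CONTIG = [
    ("est001", "contig_A"),
    ("est002", "contig_A"),
    ("est003", "contig_A"),
    ("est004", "contig_B"),
    ("est005", "contig_B"),
    ("est006", "contig_C"),
]


def est_counts(assignments):
    """Count how many ESTs were assembled into each contig."""
    return Counter(contig for _est, contig in assignments)


def relative_abundance(counts):
    """Express each contig's EST count as a fraction of all counted ESTs."""
    total = sum(counts.values())
    return {contig: n / total for contig, n in counts.items()}


counts = est_counts(EST_TO_CONTIG)
abundance = relative_abundance(counts)
for contig in sorted(counts):
    print(contig, counts[contig], round(abundance[contig], 3))
```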

Results and Discussion
The FORESTs database was mined for Eucalyptus gene products potentially involved in the PTGS pathway (Figure 1). We searched for genes encoding 26 PTGS-related proteins and found matches for 15 of them (Table 1). Among the identified EST clusters, the one annotated as a MET1 homologue exhibited a comparatively weak E value (2e-11) but contained a highly conserved methyltransferase domain, indicating that it probably encodes a DNA methyltransferase. Searches also revealed several clusters with significant sequence and domain similarities to the four Arabidopsis homologues of Dicer (DCL1 to 4) and four clusters encoding RDR6 (also known as SDE1/SGS2).
Using the amino acid sequences of AGO-1, AGO-2 and AGO-4 as queries, nine clusters encoding Argonaute proteins were identified. AGO proteins are a common component of all RISC-related complexes and are defined by the presence of two conserved domains: a ~20 kDa N-terminal PAZ domain and a ~40 kDa C-terminal PIWI domain, which is in fact a (cryptic) RNase H domain responsible for RNA target cleavage (Song et al., 2004). At least ten AGO homologues have been identified in the Arabidopsis genome. Among them, AGO-1 is involved in the siRNA and microRNA (miRNA) pathways (Vaucheret et al., 2004) while AGO-4 plays a role in locus-specific siRNA accumulation and RNA-directed DNA methylation (Zilberman et al., 2003; Zilberman et al., 2004). In Drosophila, Homo sapiens and Mus musculus, AGO-2 mediates target mRNA cleavage (Okamura et al., 2004; Meister et al., 2004; Liu et al., 2004). Since these proteins are closely related to each other, and biochemical characterization of AGO-2 in plants is still missing, the correct assignment of the annotated Eucalyptus clusters proved to be difficult without further analyses.
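The two-domain definition of AGO proteins given above lends itself to a simple set test. The Python sketch below flags a cluster as Argonaute-like only when its domain annotation contains both PAZ and PIWI; the cluster names and their domain lists are hypothetical and stand in for whatever a domain search would report.

```python
# Sketch: flag clusters whose domain annotation contains both AGO-defining domains.
REQUIRED_AGO_DOMAINS = {"PAZ", "PIWI"}

# Hypothetical domain annotations per cluster (not real FORESTs results).
CLUSTER_DOMAINS = {
    "cluster_1": {"PAZ", "PIWI"},
    "cluster_2": {"PIWI"},  # partial sequence: only one of the two domains found
    "cluster_3": {"PAZ", "PIWI", "OTHER_DOMAIN"},
}


def is_argonaute_like(domains):
    """True when every required domain is present in the annotation."""
    return REQUIRED_AGO_DOMAINS <= set(domains)


candidates = sorted(
    name for name, domains in CLUSTER_DOMAINS.items() if is_argonaute_like(domains)
)
print(candidates)
```

Because EST contigs are often partial, a cluster missing one domain is not necessarily a non-AGO sequence; a test of this kind can confirm membership in the family but cannot distinguish among closely related AGO members.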
We were also able to assign three clusters to AtXRN4, two clusters to SGS3, one cluster to SDE3 and a cluster to rgs-CaM (2e-41), a calmodulin-related protein that acts as a cellular suppressor of PTGS (Anandalakshmi et al., 2000).
In contrast, BLAST searches revealed no gene product with significant similarity to RDE-4 (R2D2 in Drosophila), a dsRNA binding protein that interacts with RNA molecules identical to the trigger dsRNA during RNAi in C. elegans (Tabara et al., 2002). Likewise, no HEN1-related clusters could be found in the FORESTs database. HEN1 is a protein of 942 amino acids playing a role in siRNA and miRNA accumulation in plants (Boutet et al., 2003). We do believe, however, that this protein is present in Eucalyptus but is not represented in the FORESTs database. Moreover, we did not find any EST clusters similar to proteins that have been genetically or biochemically linked to the PTGS pathway in humans (Gemin 3 and 4 of unknown function), Drosophila (RNA-binding proteins VIG and dFXR associated with RISC) and C. elegans (MUT-7 RNase D, also known as WEX in Arabidopsis; ERI-1, a siRNA-degrading RNase; Kennedy et al., 2004).
An estimate of the relative abundance of the identified genes was obtained by comparing the number of times ESTs were assigned to a particular contig. An overall view of the obtained results shows that the enzymes involved in PTGS are poorly represented in the FORESTs database (Figure 2). In general, transcripts encoding proteins of the AGO family, DCL family and RDR6 figured as the most extensively represented.
In summary, several EST clusters could be assigned to proteins of known functions in PTGS in plants and other organisms, covering almost the entire pathway (Figure 1 and Table 1). The identification of components of the PTGS pathway within the generated set of Eucalyptus ESTs provides good evidence for the conservation of the RNA silencing mechanism among different plant species and organisms. Recent findings suggest that RNA silencing and related pathways are involved in several cellular processes such as defense, RNA surveillance and development. Elucidation of the molecular mechanisms underlying these processes is an important step towards the full understanding of the PTGS phenomenon. From our analyses, we can conclude that Eucalyptus encodes a functional RNA silencing pathway. This collection of PTGS-related ESTs provides an interesting resource for molecular and functional genomic studies in this important tree.

Figure 1 - Schematic representation of the molecular steps in the PTGS pathway. Nuclear events at the loci of the target gene or transgene are promoted by MET1, DDM1 and QDE3 (a RecQ DNA helicase). PTGS is triggered by long dsRNA molecules that are converted by the ribonuclease III Dicer (or Dicer-like enzymes) into small interfering dsRNAs of 21-23 nucleotides (siRNA). One strand of siRNA is transferred from DCL to RISC, thus directing target RNA cleavage. Unprotected generated ends are targets for exoribonucleases (RNase D and AtXRN4). A siRNA-primed RNA-dependent RNA polymerase (RdRP) synthesizes new dsRNA molecules from target RNA, thus perpetuating the process. SID-1 is a transmembrane protein required for systemic signaling in C. elegans.

Figure 2 - Cluster distribution and total number of ESTs (in each cluster) encoding enzymes involved in the PTGS pathway.

Table 1 - PTGS-related proteins and homologues identified in Eucalyptus.