Eucalyptus grandis THIOREDOXINS, DIVERSITY AND GENE EXPRESSION

Several tree genomes have been sequenced in recent years, providing a source of basic information for the characterization of multigene families. Comparative genomics based on the complete genome sequences available in public databases is an important tool that provides useful information for advancing functional gene characterization. In this work, we focus on the genes encoding thioredoxins (Trxs) in the Eucalyptus grandis genome. Trxs are oxidoreductase enzymes involved in significant biochemical processes, above all the maintenance of cellular homeostasis. Here we investigate the diversity, structure and expression of these genes in eucalyptus. For this purpose, bioinformatics tools were employed, using data from public platforms, to identify coding sequences and validate gene expression. Specific software was employed to characterize gene structure and expression. RT-PCR assays were carried out to specifically verify the expression of 4 cytoplasmic thioredoxin genes, observed in silico, in leaf, phloem, xylem and apical meristem tissues. Twenty-two Trxs with characteristic, canonical active sites were identified, confirming the presence of all types of the three main groups already defined as plastidial (m, f, x, y, z), cytoplasmic (h) and mitochondrial (o). However, differences in the number of genes per group were observed when compared with other tree genomes. Some of these thioredoxin genes presented expression patterns that diverge from those of their homologs in Arabidopsis thaliana, suggesting a functional specificity in eucalyptus, as in the case of the Eucgr.F01604 gene, encoding an h1 cytoplasmic Trx, which is strongly expressed in conductive tissues.


INTRODUCTION
Thioredoxins (Trxs) are small ubiquitous enzymes (approximately 14 kDa) present in all organisms and responsible for maintaining the redox state in cells (Geigenberger et al., 2017). They were first discovered in Escherichia coli (Laurent et al., 1964) as hydrogen donors for ribonucleotide reductase (RNR), and present a protruding and conserved active site, frequently bearing the WCGPC amino acid sequence. The two cysteines of this canonical active center form a dithiol, which is responsible for their enzymatic activity: it reduces and opens the disulfide bridges of other proteins (Holmgren et al., 1985).
Systematic sequencing of plant genomes (Michael and Jackson, 2013) has allowed a better characterization of gene function (Rhee and Mutwil, 2014), mostly through the implementation of NGS (Next-Generation Sequencing) platforms, which allow high-throughput DNA sequencing and include RNA-seq strategies. These efforts, allied to comparative genomics supported by increasingly robust bioinformatics tools, have contributed to a better understanding and analysis of the functional characterization of multigene families, such as Trxs in plants.
In this paper, we analyze sequences of Trx genes obtained from the Eucalyptus grandis genome sequencing project, published by Myburg et al. (2014). This study addresses the genetic diversity of Trxs in eucalyptus; it aims to validate data obtained during the FORESTs initiative (Eucalyptus Genome Sequencing Project Consortium), carried out by Brazilian groups (Vicentini et al., 2005), and focuses on a comparative genomics strategy based on the available tree genomes. Our interest is to advance the functional characterization of these enzymes in trees, considering differential and specific gene expression in different plant tissues. Gene expression from RNA-seq data is analyzed, as well as protein interaction, based on a comparative approach with the Populus trichocarpa genome (Tuskan et al., 2006), the first tree genome sequenced. Special attention is paid to semi-quantitative RT-PCR assays with genes encoding 4 Trxs h. The practical outcome of this analysis is the identification of genes that may be good candidates for obtaining commercial transgenic plants overexpressing transcripts of interest, which could enhance plant productivity in different respects, such as growth or tolerance to biotic or abiotic stress.

MATERIAL AND METHODS
Trx sequences were identified in three databases. Firstly, we accessed the eucalyptus genome directly through the AUSX00000000 GenBank link provided in the article (www.ncbi.nlm.nih.gov/nuccore/AUSX00000000). The other two sources were the plant genome database, Phytozome (https://www.phytozome.jgi.doe.gov/), and the eucalyptus-specific database, Eucgenie.org (https://eucgenie.org/). In the GenBank searches, we obtained sequences via NCBI, the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/), performing BLAST alignments with Arabidopsis thaliana thioredoxin nucleotide sequences. In the other two databases, a textual search with the word "thioredoxin" was used.
RT-PCR studies were performed in 4 steps: total RNA extraction from E. grandis tissues (adult leaves, young leaves, xylem, phloem and apical meristem), cDNA preparation using 1 μg of total RNA, PCR amplification with specific primers for h Trxs, and analysis of the PCR products by agarose gel electrophoresis.
PCR products were analyzed in 1.5% agarose gel on a runVIEW Cleaver electrophoresis apparatus. Samples were loaded in Blue/Orange Loading Dye 6X buffer (Promega), supplemented with 2 μl of GelRed (Biotium; 1:10,000 dilution) for DNA labeling. Gels were then visualized on a Hoefer transilluminator and recorded with Cleaver photodocumentation equipment.

RESULTS
In this work, we classified plant Trxs according to Chibani et al. (2009), who recognize 4 groups: typical CXXC/S Trxs, possessing a classic active site; atypical Trxs, with single or multiple domains; TDX; and the thioredoxin reductases, which are the reducing agents in the different thioredoxin systems described. Data mining has shown that the Eucalyptus grandis genome encodes more than 50 Trxs, counting typical and atypical, single-domain and multi-domain proteins.
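The distinction between the canonical WCGPC site and the CXXC or CXXS variants can be illustrated with a simple pattern search. The following is only a minimal sketch under stated assumptions, not the procedure used in this work: the function name `classify_active_site` and the short peptide strings are hypothetical, and a real classification would also rely on domain architecture and phylogeny.

```python
import re

# Hypothetical patterns for illustration only.
CANONICAL = re.compile(r"WCGPC")   # canonical site cited in the Introduction
CXXC = re.compile(r"C..C")         # any two cysteines separated by two residues
CXXS = re.compile(r"C..S")         # second cysteine replaced by a serine


def classify_active_site(protein_seq):
    """Return a coarse label for the first Trx-like motif found in a sequence."""
    seq = protein_seq.upper()
    if CANONICAL.search(seq):
        return "canonical"
    if CXXC.search(seq):
        return "CXXC"
    if CXXS.search(seq):
        return "CXXS"
    return "none"


if __name__ == "__main__":
    # Toy peptides (not real E. grandis sequences).
    for peptide in ("MAAWCGPCKM", "AAWCGPSKK", "AACAACKK", "MKLV"):
        print(peptide, classify_active_site(peptide))
```

The patterns are tested from the most to the least specific, so that a sequence carrying WCGPC is never reported as a generic CXXC site.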
Here we are especially interested in typical Trxs. Our results in Table 1 show that the eucalyptus genome has representatives of all seven types of Trxs described (m, f, h, x, y, z, o), with 4, 2, 9, 1, 2, 1 and 1 genes, respectively, plus 2 genes presenting a CXXS site.
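As a quick consistency check, the gene numbers per type can be tallied to recover the total of 22 typical Trxs. The sketch below uses only the counts stated above; the variable names and the grouping dictionary are illustrative, with the compartments taken from the three main groups named in the abstract (plastidial, cytoplasmic, mitochondrial).

```python
# Gene counts per Trx type, as stated in the text.
typical_trx_counts = {"m": 4, "f": 2, "h": 9, "x": 1, "y": 2, "z": 1, "o": 1}
cxxs_gene_count = 2  # additional genes presenting a CXXS site

# Compartment assignment of each type (plastidial, cytoplasmic, mitochondrial).
compartment_of = {
    "m": "plastidial", "f": "plastidial", "x": "plastidial",
    "y": "plastidial", "z": "plastidial",
    "h": "cytoplasmic",
    "o": "mitochondrial",
}

# Sum the counts within each compartment.
per_compartment = {}
for trx_type, n_genes in typical_trx_counts.items():
    group = compartment_of[trx_type]
    per_compartment[group] = per_compartment.get(group, 0) + n_genes

cxxc_total = sum(typical_trx_counts.values())
grand_total = cxxc_total + cxxs_gene_count

if __name__ == "__main__":
    print(per_compartment)
    print(cxxc_total, grand_total)
```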
Phylogenetic trees obtained with approaches such as the one described here are intended to provide information on the structural and functional characterization of genes through rooting analysis; they do not constitute a classical evolutionary analysis. In this context, we sought to confirm the identification of the genes and to group the different types of thioredoxins. In this comparative analysis we used sequences homologous to Trxs from 4 different plant species: A. thaliana, V. vinifera, P. trichocarpa and J. curcas. The results of this analysis, presented in Figure 1, allowed the grouping and identification of all thioredoxin sequences together with their potential orthologs in the other species.
The expression study was initially based on RNA-seq data available for the E. grandis genome. These raw FPKM data are available on public domain platforms (https://eucgenie.org/) and, after standardization, were used to generate the heatmap shown in Figure 2. The heatmap therefore comprises the 22 Trx coding sequences for types m, f, h, o, x, y and z.
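The standardization step is not detailed in the text. One common choice for heatmaps, shown here purely as an assumption, is a log2(FPKM + 1) transform followed by a per-gene z-score across tissues. The function name `standardize_row` and the example values are hypothetical and do not reproduce the data of Figure 2.

```python
import math
from statistics import mean, pstdev


def standardize_row(fpkm_values):
    """Per-gene standardization: log2(FPKM + 1), then z-score across tissues.

    Assumed procedure for illustration; the text does not specify the
    method that was actually applied.
    """
    logged = [math.log2(value + 1.0) for value in fpkm_values]
    mu = mean(logged)
    sigma = pstdev(logged)  # population standard deviation
    if sigma == 0:
        # A gene with identical expression everywhere carries no contrast.
        return [0.0 for _ in logged]
    return [(x - mu) / sigma for x in logged]


if __name__ == "__main__":
    # Hypothetical FPKM values for one gene in four tissues.
    print(standardize_row([0.0, 1.0, 3.0, 7.0]))
```

Standardizing each gene separately emphasizes tissue-to-tissue contrasts within a gene; comparisons of absolute abundance between genes still require the untransformed FPKM values.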
The heatmap allowed the identification of those Trxs h whose expression appears to be more abundant in certain tissues. This is particularly the case for the Eucgr.I02383, Eucgr.I01913, Eucgr.F01854 and Eucgr.F01604 genes. This result motivated preliminary RT-PCR assays to study the expression of these genes more precisely in the laboratory. The results of these tests (Figure 3) tend to confirm what was observed in the in silico analyses.
Barbosa and Marinho (2005), when analyzing the first eucalyptus transcriptome, found 7 Trxs h, one of them with a CXXS site, and 1 Trx x, but they were unable to identify Trxs of groups z and o. In the same work, 4 m and 2 f Trxs were identified. The absence of Trxs o and z in that first approach may be explained by the lack of transcripts for these genes, which can now be found in the E. grandis genome, in the libraries employed for that transcriptome. These genes are known to be transcribed at low levels because they are involved in more specific situations, such as the oxidative stress response (Balsera and Buchanan, 2019), and are therefore not constitutively expressed (Laloi et al., 2001; Chibani et al., 2011). Regarding Trxs m and f, which are involved in the Calvin-Benson cycle reactions (Buchanan, 2002) and are therefore abundant in terms of transcripts, the earlier gene numbers, 4 and 2 respectively, match those we observe here. The presence of 9 Trxs h rather than the 7 seen by Barbosa and Marinho (2005) may likewise be explained by transcript levels, and also by the redundancy of these genes. However, it is interesting to note that even in a transcriptome produced with major technical limitations, and with the analysis of only the 5' transcripts (ESTs), 7 of them were already identified at the time. Most likely, this shows the plasticity of this group of Trxs, which act preferentially in the cytoplasm.
With respect to the CXXS site Trxs h, the genome presents two of them while only one was found in the previous data mining approach.

DISCUSSION
Here we can also observe that the basic set of Trx genes found in A. thaliana during the genome analysis, and in the green lineage as defined by Meyer et al. (2009), is maintained in the E. grandis genome, as in the other plant genomes studied. However, the number of genes, and consequently of proteins, changes, as can be verified in comparative genomics analyses. It is likely that, as Meyer et al. (2009) point out, gene duplication has occurred throughout plant evolution, which would be supported here by the location of some genes on the same chromosomes.
If we consider these numerical data for Trx genes in E. grandis further, we can see that gene diversity within the typical Trx subfamily also appears in other sequenced tree genomes. P. trichocarpa, for example (Chibani et al., 2009), presents a similar situation to E. grandis, but differs in that poplar has 8 Trx m genes, which is an unusual situation. This analysis, however, will be improved by the rapid, high-quality sequencing of other tree genomes under way around the world, which will allow further hypotheses about the numerical diversity of Trx genes in these genomes.
The general expression profile in eucalyptus, considering all genes, is consistent with what is found in the literature (Meyer et al., 2009) and justifies their classification by cell compartmentalization and function. For most genes, expression in conductive tissues is lower than in leaf or meristem tissues, except in the case of thioredoxin group h, which has a more varied profile. Trxs f act clearly in adult leaves, as expected from their well-characterized role in photosynthetic tissues. The same can be said for Trxs m, which also act in green photosynthetic tissues.
The expression of eucalyptus thioredoxin genes based on RNA-seq data had not been reported until now; this is the first work with this approach. Vining et al. (2015) analyzed a eucalyptus floral transcriptome, but those tissues are not present in this work. However, a large study was performed by Belin et al. (2015) in A. thaliana, analyzing RNA-seq data for all plant Trx genes. From that work, some relevant considerations can be made in relation to the results obtained here.
If we consider the different thioredoxin groups in our study individually, we can say that the 4 plastidial Trxs m are predominantly expressed in adult leaf tissues, as in A. thaliana. In that plant, however, Trx m3 has low, constitutive expression in the tissues analyzed. The Eucgr.L03049 gene, which corresponds to Trx m3 in eucalyptus, is strongly expressed in adult leaves compared to the other 3 Trxs m, which may indicate a possible functional specificity in tree genomes. The same expression pattern can be observed for the two eucalyptus Trxs f, which are significantly present in photosynthetic adult leaf tissues. The other eucalyptus plastidial Trxs, x, y and z, are expressed at lower levels than Trxs m or f in the tissues studied. Transcripts for these plastidial Trxs predominate in mature or young leaf tissues; nevertheless, the lack of values for Trx z in adult leaves is noticeable. This thioredoxin is, in fact, a particular case, with more specific functions involving the target proteins FLN1 and FLN2 (Arsova et al., 2010; Meng et al., 2010). This is the likely explanation for the absence of transcripts in the library studied. With respect to A. thaliana, Belin et al. (2015) mentioned a strong expression of Trx z in ovarian tissues, which reinforces its more specific character.
Eucalyptus Trx o has a low and constant expression in the six libraries analyzed, the same as observed in A. thaliana, which is consistent with its recently described more generalist character (Geigenberger et al., 2017).
Eucalyptus CXXS Trxs have an interesting expression profile: they are more strongly expressed in conductive tissues, although also present in young or adult leaf tissues. This specificity for conductive tissues distinguishes them from the other Trxs and also suggests specific functions. In A. thaliana, CXXS2 is strongly detected in pollen grains, according to Belin et al. (2015).
The Trxs h of eucalyptus, represented here by 9 genes, are expressed in all tissues studied and can be divided into two groups. The first group shows lower and more uniform expression in the six libraries: genes Eucgr.J02387, Eucgr.A00783, Eucgr.I01912, Eucgr.B02586 and Eucgr.K01294. The second group consists of 4 Trxs, Eucgr.I02383, Eucgr.I01913, Eucgr.F01854 and Eucgr.F01604, whose expression is clearly more abundant than that of the first group, and even greater than that of all the other Trxs studied here. This possible functional role, characterized by a division of labor in a tree genome, may be of great interest. For example, the most abundantly expressed genes are Eucgr.F01854 and Eucgr.F01604. The first has its lowest expression in conductive tissues, but a high number of reads in young or adult leaf tissues. The second, Eucgr.F01604, is strongly expressed in all six libraries, suggesting a possible and important role in plant growth and vertical expansion, because it is presumably involved in transport in conductive tissues. Examining the differential expression profile of Trxs across tissues or cell compartments is a recurrent approach in the functional characterization of these enzymes (Meyer et al., 2012). The central difficulty in characterizing these genes lies, on the one hand, in the simplicity of the biochemical redox system, represented by the reduction of disulfide bridges of target proteins via electron transfer by Trxs, and, on the other, in the absence of Trx h mutants that would allow a specific function to be determined for each one. In this sense, the data presented here, based on expression profile analysis and on the simple presence and intensity of transcripts in different tissues, could be of great interest.
The strong differential expression observed for some genes justifies this assumption and reinforces the idea that one Trx h cannot substitute for another in specific situations. Those assumptions, however, must be confirmed, for instance by obtaining knockout mutants, which are not available in eucalyptus.
Our semi-quantitative RT-PCR results are relevant because they suggest a possible division of labor among the Trx h genes in eucalyptus. This aspect is not negligible in the functional study of thioredoxins, whose functional gene redundancy has already been widely verified with the use of insertion mutants (Meyer et al., 2012).

CONCLUSIONS
The results presented here confirm the numerical and group complexity of Trx genes already observed in the first Eucalyptus transcriptome. Genes have been identified for all described Trx groups. Eucalyptus possesses at least 22 typical thioredoxin genes identified by comparative phylogenetic reconstruction. The numerical distribution of genes by group in eucalyptus is similar to that reported for P. trichocarpa, except that eucalyptus maintains 4 Trx m genes, as in A. thaliana, instead of the 8 found in P. trichocarpa. The group with the most genes is Trx h, with 9 genes, and the expression profile of these genes revealed unique expression patterns not yet reported in eucalyptus. The Eucgr.F01604 gene, encoding a Trx h1, may play a significant role in conductive tissues and is also relevantly expressed in young leaf tissues. This tissue-specific expression was verified in semi-quantitative RT-PCR experiments. The expression of the Eucgr.I02383, Eucgr.I01913, Eucgr.F01854 and Eucgr.F01604 genes by RT-PCR confirms what is observed in silico. This genetic characterization, via differential expression of transcripts, indicates that future studies using those genes to obtain commercial transgenic plants with biotechnological potential are of interest. However, more experiments on the functional characterization of eucalyptus thioredoxins h in planta are still needed.