Print version ISSN 0074-0276
Mem. Inst. Oswaldo Cruz vol.103 no.6 Rio de Janeiro Sept. 2008
Laboratório de Epidemiologia Molecular de Doenças Infecciosas, Instituto Oswaldo Cruz-Fiocruz, Av. Brasil 4365, 21045-900 Rio de Janeiro, RJ, Brasil
Expressed Sequence Tag (EST) sequence analysis rapidly gained widespread use for the discovery of gene transcripts in a variety of organisms (Adams et al. 1991). As of April 11, 2008, 51,391,051 EST entries were recorded in dbEST at the NCBI (National Center for Biotechnology Information). Most of these (61%) were generated from laboratory models and economically important organisms (21 species; see Table I for the 10 most frequently sequenced organisms). ESTs have proven invaluable tools for accelerating the discovery of transcribed sequences in the genome and generating markers for genome mapping. Several bioinformatic methods for the analysis of ESTs at both small and large scales have been developed in the last 10 years. A thorough review of these methods, as well as a comprehensive analysis of the breadth of their applications for EST processing, sequence quality analysis and functional discovery, has been published by Nagaraj et al. (2007). Accompanying this road map analysis is a guide through the world of EST bioinformatics tools available at http://biolinfo.org/EST/. This guide contains an extensive list of software and methods.
For the purpose of this opinion article, I am focusing on the Trypanosoma cruzi EST collection. This analysis may, in general terms, be applied to other trypanosomatids. To illustrate this opinion with examples, I have searched the T. cruzi EST dataset as described below.
The full collection of T. cruzi ESTs in dbEST (GenBank, NCBI) was downloaded and screened for the presence of the spliced leader fragment at the 5' end. After redundancy elimination, this EST set was aligned with ClustalX (Thompson et al. 1997) in order to identify genes with evidence of alternative trans-splicing. Each EST set was compared to genomic sequences available in either GenBank (www.ncbi.nlm.nih.gov) or GeneDB (www.GeneDB.org) to confirm the presence of poly-pyrimidine-rich regions and trans-splicing sites (AG dinucleotide). This procedure allowed artifacts from cDNA libraries primed with spliced leader fragments to be discarded. The presence of a poly-pyrimidine-rich region (at least 10 bases long) before the trans-splicing site in the genomic DNA segment being analyzed indicates that the transcribed mRNA was correctly primed. According to this criterion, I selected 88 sets of ESTs for further analyses. These analyses included BLAST similarity searches to label genes, inspection of the 5' untranslated region (UTR) for additional poly-pyrimidine-rich regions and validation of trans-splicing sites. The EST figures for other parasites and model organisms were obtained directly from the dbEST summary page at http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html.
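The screening and validation steps described above can be illustrated with a minimal Python sketch. It is not the procedure actually used for this article: the spliced leader fragment shown is a toy placeholder rather than the real T. cruzi sequence, the function names are hypothetical, and two simplifying assumptions are made, namely that the "poly-pyrimidine-rich region" is an uninterrupted run of C/T bases and that it is searched for within a 60-base window upstream of the acceptor site.

```python
import re

# Toy placeholder, NOT the real T. cruzi spliced leader sequence.
SL_FRAGMENT = "GTACTATATTG"

def strip_spliced_leader(est, sl_fragment=SL_FRAGMENT, max_offset=5):
    """Return the EST sequence downstream of the SL fragment,
    or None if the fragment is not found near the 5' end."""
    pos = est.upper().find(sl_fragment.upper())
    if pos == -1 or pos > max_offset:
        return None
    return est[pos + len(sl_fragment):]

def longest_pyrimidine_run(seq):
    """Length of the longest uninterrupted run of C/T bases in seq."""
    runs = re.findall(r"[CT]+", seq.upper())
    return max((len(run) for run in runs), default=0)

def acceptor_is_supported(genomic, transcript_start, min_tract=10, window=60):
    """Check the genomic context of a putative trans-splicing site.

    transcript_start is the 0-based genomic index of the first base
    that follows the spliced leader in the mRNA. The site is accepted
    when the two preceding bases are AG and a pyrimidine run of at
    least min_tract bases lies within `window` bases upstream of them.
    """
    if transcript_start < 2:
        return False
    dinucleotide = genomic[transcript_start - 2:transcript_start].upper()
    upstream = genomic[max(0, transcript_start - 2 - window):transcript_start - 2]
    return dinucleotide == "AG" and longest_pyrimidine_run(upstream) >= min_tract

# Toy example: a 12-base pyrimidine tract, then GAAG, then the transcript.
body = "ACGTTAGCATGAAA"
est = SL_FRAGMENT + body
genomic = "GGA" + "CTTCTCCTTTCT" + "GAAG" + body
start = genomic.find(strip_spliced_leader(est))
print(acceptor_is_supported(genomic, start))
```

In practice the EST would be placed on the genome by a similarity search rather than by exact string matching, and a tract containing a few purines could still count as pyrimidine-rich; both choices are left as parameters to adjust.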
ESTs from parasitic organisms - Parasitic organisms account for only 2.5% of the total EST entries in dbEST and represent approximately 60 human and animal parasitic species (Table II). The most frequently sequenced parasite is Schistosoma mansoni. Among the protozoan parasites (28 species, 4.1% of parasite ESTs and 1% of total EST entries), the apicomplexan organisms have the largest number of ESTs. Specialized databases for Plasmodium spp (www.plasmodb.org) and Toxoplasma gondii (www.toxodb.org) provide resources that complement the information coming from these organisms' respective genome projects. This effort to continuously update the EST collection has also been important in correcting gene predictions and annotations of the Plasmodium genome. Plasmodium was also the first parasite to have an EST sequencing project (Chakrabarti et al. 1994).
The trypanosomatid EST collection - Trypanosomatids, which include data from nine species, account for less than 0.1% of the total EST entries and 3.0% of total parasitic ESTs (Table III). T. cruzi is the most frequently sequenced trypanosomatid, ranking in the 18th position in the whole dataset. This raises the question of why trypanosomatids are not covered by extensive EST sequencing. Several reasons, both technical and related to scientific goals, contribute to the answer. First, the number of laboratories performing EST sequencing in trypanosomatids, and in parasites in general, is small compared to the number involved with model or economically important organisms. Second, EST sequencing has been surpassed by more powerful techniques for gene transcription analysis, such as microarrays, SAGE and its variants. EST sequencing is limited by its dependence on sequencing facilities, its requirement for the construction of good cDNA libraries (total and normalized) and the typical bias of the clones obtained towards more frequently transcribed genes. In addition, the resources needed to completely characterize libraries mean that every EST sequencing project is executed on terms similar to those of full genome projects. Single laboratories are not prepared to undertake massive EST sequencing as they would a typical SAGE or microarray experiment, the costs of which are within their reach. Thus, the information that EST sequencing generates regarding the transcription activities of genes is less rewarding today than it was 10 years ago. In contrast to the direct sequencing of cDNAs, however, the quantitative and arraying methods do not provide sufficient sequence information to address features like polycistronic transcription and trans-splicing. Though an EST database encompassing all of the organisms represented in the current version of GenBank (Lee et al. 2005) has been constructed, some specific features of trypanosomatids may not be evident in this kind of general analysis. The main assumption for this analysis is based on a kind of gene organization, including gene interruptions and cis-splicing, that is not present in trypanosomatids. The overwhelming majority of trypanosomatid genes do not contain introns, and mRNA processing occurs via trans-splicing. Thus, refined analysis directed at trypanosomatid features should be performed in order to extract relevant information regarding gene transcription and organization. To this end, I have taken T. cruzi as an example of what transcriptionally relevant information we can obtain from the present collection of ESTs. I propose that more ESTs from monoxenous and heteroxenous trypanosomatids should be produced as tools to understand the evolution of transcription in these protozoa.
What information can be extracted from a trypanosomatid EST collection? The T. cruzi example - T. cruzi, the flagellated protozoan that causes Chagas disease in humans, exhibits gene organization and molecular processes (e.g., intronless genes and polycistronic transcription) similar to those of prokaryotic cells (Hausler et al. 1997). The immature polycistronic transcript is processed via a trans-splicing mechanism, which adds a small, spliced, 39 nucleotide RNA leader to the 5' end of the newly generated monocistronic messengers (Campbell et al. 2003). Variant mRNAs can be found in trypanosomes due to different transcription initiation sites that generate distinct 5' UTR lengths (alternative trans-splicing) (Nepomuceno-Silva et al. 2001). Likewise, alternative polyadenylation is a mechanism for the generation of transcriptional diversity (different 3' UTR lengths) from the same gene (Kubo et al. 2006). Ten years ago, the first T. cruzi ESTs became publicly available (Brandão et al. 1997). The pattern arising from that work showed a picture that would be confirmed at the end of the genome sequencing project for this protozoan: most of the transcripts coded for genes either of unknown function or specific to this parasite. The EST sequencing projects that followed this first initiative, although they included larger numbers of clones, did not diverge from this pattern. However, cDNAs from other developmental stages (e.g., trypomastigote and amastigote) were added to the sequence databases (Verdun et al. 1998, Aguero et al. 2004, Cerqueira et al. 2005). These observations show that a great deal of research effort should be focused on deciphering the codes behind the T. cruzi genome.
Population bias of T. cruzi ESTs - Despite its relatively simple gene organization and mRNA processing, T. cruzi exhibits functional and population diversity that can be observed in the numerous vertebrate hosts it infects as well as in the different clinical presentations of Chagas disease (Devera et al. 2003). Two major phylogenetic lineages or at least three population groups are recognized by molecular markers (Fernandes et al. 2001, Brandão et al. 2006): T. cruzi I, T. cruzi II and T. cruzi III zymodemes. The T. cruzi EST collection was not obtained from a single strain: ESTs from the CL-Brener, Dm28c, Y, Tulahuen and Tehuantepec strains at different developmental stages have been deposited. However, they do not account for all of the diversity of T. cruzi populations. Because these populations can behave distinctly with regard to both biological and clinical parameters, these biased cDNAs do not provide a full picture of their functional heterogeneity. The total number of entries in the dbEST-NCBI is almost twice as high as the estimated number of genes in T. cruzi. However, redundancies and multiple fragments for the same gene bias the EST collection towards more frequently expressed genes. To avoid a flawed T. cruzi EST analysis, transcripts from representative strains of the major phylogenetic groups should be present in the database.
Alternative trans-splicing - One of the post-transcriptional modifications in trypanosomatids involves changes in the trans-splicing site (Manning-Cela et al. 2002, Helm et al. 2008). This can be observed for several genes in T. cruzi; among approximately 14,000 ESTs, 195 entries were identified as possessing at least two trans-splicing sites and generating mRNAs with distinct 5' UTRs. The majority of these mRNAs correspond to genes coding for ribosomal proteins (rp). In more general terms, two types of mRNAs can be deduced from these T. cruzi EST sets (supplementary data): (i) mRNAs that differ in length (number of bases) and composition of the 5' UTR. For example, EST CF889765 in rp L7 contains a 5' UTR with 12 bases, whereas EST CF888327 contains a 5' UTR with 35 bases.
This difference suggests that there are at least two trans-splicing sites along the 5' UTR. Other examples include Histone H3 (ESTs CF890229, CF890408 and CF890611) and rp L34 (ESTs CF889802, CF890219). The ESTs for these last two genes came from the trypomastigote stage of T. cruzi CL-Brener (Aguero et al. 2004); (ii) mRNAs with similar lengths (number of bases) but different nucleotide composition of the 5' UTR. For example, rp L27 ESTs CF888060 and CF888883 show different composition of their 5' UTRs. They are not differentially trans-spliced. The flanking regions for each copy in the genome are different, resulting in the production of two mRNAs that code for the same protein but differ in their 5' UTRs.
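As an illustration of how type (i) variants could be flagged once 5' UTR lengths have been measured, the following Python sketch groups ESTs by gene and reports the genes whose ESTs differ in 5' UTR length. The rp L7 values are those quoted above; the second gene and its EST identifiers are invented controls, and the function name and the `min_diff` threshold are assumptions made for this example.

```python
from collections import defaultdict

def genes_with_variant_utrs(records, min_diff=1):
    """Group (gene, est_id, utr5_length) records by gene and return the
    genes whose ESTs differ in 5' UTR length by at least min_diff bases."""
    lengths = defaultdict(dict)
    for gene, est_id, utr5_length in records:
        lengths[gene][est_id] = utr5_length

    variants = {}
    for gene, by_est in lengths.items():
        values = by_est.values()
        if max(values) - min(values) >= min_diff:
            # Order the ESTs from shortest to longest 5' UTR.
            variants[gene] = dict(sorted(by_est.items(), key=lambda kv: kv[1]))
    return variants

records = [
    ("rp L7", "CF889765", 12),    # lengths quoted in the text
    ("rp L7", "CF888327", 35),
    ("toy gene", "EST_A", 20),    # invented control, identical lengths
    ("toy gene", "EST_B", 20),
]
print(genes_with_variant_utrs(records))
```

A length comparison of this kind cannot detect type (ii) variants, which have similar lengths but different composition; those require a direct sequence comparison of the 5' UTRs.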
In addition to those coding for rps, other genes also present variations in their 5' UTRs. For example, ESTs CF888803 (from strain CL-Brener) and CB923724 (from strain Tulahuen), which both derive from a gene encoding a hypothetical protein, present similar coding regions but have 5' UTRs that differ by 113 nucleotides in length. Pairwise comparison of these mRNAs with their corresponding genomic sequences reveals that CL-Brener exhibits two loci for this gene: one copy gives rise to mRNAs corresponding to both ESTs (CF888803 and CB923724), whereas the other copy matches only EST CB923724. The latter copy lacks the additional trans-splicing site that yields the longer mRNA (CF888803).
The transcription of some genes yielding variant mRNAs may use acceptor splice sites different from the AG dinucleotide that follows the poly-pyrimidine-rich region. Through comparison with genomic sequences, we observed the use of five non-canonical acceptor sites by at least one copy of some genes. The GG dinucleotide was used as a surrogate acceptor site for two genes encoding a hypothetical protein (ESTs CF889477 and CF890296). The mucin TcSMUGS (ESTs AI035079, AI612603, CF890337 and CF890370) and rp S6 (ESTs CF889016, CF890501) transcripts present TG and CG dinucleotides, respectively, as alternative acceptor sites.
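The comparison with genomic sequence that reveals the acceptor dinucleotide can be sketched as follows. This is a simplified illustration on toy sequences, not the analysis performed here: it assumes the spliced leader has already been removed from the EST, locates the transcript on the genomic sequence by exact matching of a short seed (a similarity search would be used in practice), and uses hypothetical function names.

```python
from collections import Counter

def acceptor_dinucleotide(genomic, transcript_body, seed=12):
    """Return the two genomic bases immediately upstream of the point
    where the transcript (without spliced leader) starts, or None if
    the transcript cannot be placed by exact matching of its first
    `seed` bases."""
    g = genomic.upper()
    pos = g.find(transcript_body.upper()[:seed])
    if pos < 2:  # not found (-1) or too close to the sequence start
        return None
    return g[pos - 2:pos]

def tally_acceptors(pairs):
    """Count acceptor dinucleotides over (genomic, transcript_body) pairs."""
    counts = Counter()
    for genomic, body in pairs:
        dinucleotide = acceptor_dinucleotide(genomic, body)
        counts[dinucleotide if dinucleotide else "unmapped"] += 1
    return counts

# Toy sequences: one canonical AG site and one GG site.
pairs = [
    ("CTTCTCCTTTCTGAAG" + "ACGTTAGCATGAAACCC", "ACGTTAGCATGAAACCC"),
    ("CTTCTCCTTTCTGAGG" + "TTGACCGATCGGATGC", "TTGACCGATCGGATGC"),
]
counts = tally_acceptors(pairs)
non_canonical = {d: n for d, n in counts.items() if d not in ("AG", "unmapped")}
print(counts, non_canonical)
```

Any dinucleotide other than AG reported by such a tally would still need to be checked against sequencing errors and against the other genomic copies of the gene before being considered a genuine non-canonical site.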
Analysis of both ESTs and the genomic organization of the genes shows that variant mRNA production is more frequent in duplicated or multicopy genes. However, single copy genes that have long 5' UTRs also exhibit this phenomenon. For example, ESTs CF888585 and CF888100 originated from the same single copy gene encoding a hypothetical protein that is transcribed into two mRNAs with distinct 5' UTR lengths (82 and 200 bases).
Below, I provide a list of information that can be retrieved for further experimental validation:
Trans-splicing site and additional/multiple trans-splicing site in 5' UTR
Exact length and start of 5' UTR
Composition of both UTRs
Sequence context for additional trans-splicing site
Use of non canonical dinucleotides for the trans-splicing site
Exact length and end of 3' UTR
Poly-A site and additional/multiple poly-A sites
Sequence context for poly-A sites
Sequence context for trans-splicing signal (poly-pyrimidine rich regions) after comparison to genomic sequence
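One possible way to keep the items listed above together for each EST is a small record type. The Python sketch below is only a suggestion: the class names, field names and the `needs_validation` helper are all hypothetical, and the example record is invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SpliceSite:
    genomic_pos: int             # position of the site on the genomic sequence
    dinucleotide: str            # acceptor dinucleotide, e.g. "AG"
    upstream_context: str = ""   # sequence context, e.g. the pyrimidine tract

    @property
    def canonical(self) -> bool:
        return self.dinucleotide.upper() == "AG"

@dataclass
class TranscriptFeatures:
    est_id: str
    gene: str
    utr5_length: Optional[int] = None
    utr3_length: Optional[int] = None
    splice_sites: List[SpliceSite] = field(default_factory=list)
    polya_sites: List[int] = field(default_factory=list)

    def needs_validation(self) -> List[str]:
        """List the features that call for experimental follow-up."""
        notes = []
        if len(self.splice_sites) > 1:
            notes.append("multiple trans-splicing sites")
        if any(not site.canonical for site in self.splice_sites):
            notes.append("non-canonical acceptor dinucleotide")
        if len(self.polya_sites) > 1:
            notes.append("multiple poly-A sites")
        return notes

# Invented record, for illustration only.
record = TranscriptFeatures(
    est_id="EST_X",
    gene="toy gene",
    utr5_length=35,
    splice_sites=[SpliceSite(100, "AG"), SpliceSite(140, "GG")],
    polya_sites=[900],
)
print(record.needs_validation())
```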
Two of these points - the definition of polyadenylation sites and the transcription starting points - are extremely important for the comprehension of key trypanosomatid mRNA processing events. Trypanosomatids diverge from the rest of the eukaryotes at this point, because they do not have a consensus nucleotide signal that indicates probable polyadenylation sites. In addition, the coupling of trans-splicing and poly-A addition in just one molecular event represents a differential mechanism of mRNA processing. Therefore, knowing what parts of intergenic sequences and flanking transcriptional starting points are present in pre-mRNA is a necessary step in understanding these phenomena. Experimental evidence suggests that UTRs influence gene transcription and expression as well as mRNA half life and degradation. This means that EST sequences still provide a source of information for expanding these experimental findings to the trypanosomatid universe. The T. cruzi genome sequencing effort has made clear that the exact definition of a UTR is dependent on experimental information and it should be checked by full cDNAs or ESTs primed at either the 5' or 3' end (Brandão 2006).
Trypanosomatids and next-generation sequencing technology: who will pay for it? - For almost 30 years, DNA sequencing has relied on the enzymatic method of dideoxynucleotide chain termination developed by Sanger et al. (1977). The introduction of innovations, such as laser-based fluorescence detection, capillary electrophoresis and automated base calling, has not altered the core of the methodology. Sanger sequencing was the tool that shaped genomic analysis into the new face of biology. The economic and entrepreneurial opportunities provided by this new biology have generated an entire industry; furthermore, the sequencing knowledge base has moved from academia to private organizations. Some of the giants in the biotech and pharmaceutical segments, eager to set new standards in genomics, have steadily acquired newcomers and advertised innovations in DNA sequencing technology. While the scientific community uses and adapts these technologies to its needs, a battle among the owners and vendors of these technologies is currently underway to define which technology will win the race and become the standard technique. In this new field of sequencing-by-synthesis, three main competitors and techniques are struggling for researchers' attention and resources. These include Roche's Genome Sequencer FLX (GS FLX; 454 Life Sciences pyrosequencing, http://www.roche-applied-science.com), Illumina's Solexa 1G sequencer (http://www.illumina.com) and Applied Biosystems' SOLiD system (Supported Oligonucleotide Ligation and Detection) (http://solid.appliedbiosystems.com). These technologies bring technical innovations in the way DNA sequence fragments are generated. For example, 454 sequencing (Roche GS FLX) overcomes the inefficiency and bias of PCR amplification by enclosing single template DNA molecules in the droplets of a water-in-oil emulsion (emulsion PCR) and combines this with the highly parallel processing capacity of pyrosequencing (Ronaghi et al. 1996, Margulies et al. 2005).
The Illumina system uses a Sanger-like, four-color sequencing system and "innovates" with solid phase bridge PCR amplification and the reversible chain terminator (Solexa sequencing). SOLiD technology from Applied Biosystems may be viewed as an intermediate between pyrosequencing and Solexa sequencing, and it innovates with sequencing by ligation-based chemistry. These technologies, along with their achievements, potential applications and drawbacks, have been reviewed in depth by Schuster (2008) and Holt and Jones (2008). For trypanosomatids, the technological capacity to deal with massive genomes and transcriptomes offers an immense window of opportunity; this is especially true since one of the most significant concerns regarding trypanosomatids is their diversity of environments, hosts and genotypes. Approaches similar to metagenomic sequencing analysis using pyrosequencing (Edwards et al. 2006) could be envisaged for high throughput diversity analysis in trypanosomatids. Additionally, searching for drug-induced alterations is a good starting point for genome resequencing projects involving T. cruzi strains and isolates. The resequencing of experimentally modified genomes is an immediate application for these technologies, as demonstrated by the full genome methylation analysis of Arabidopsis thaliana with Solexa sequencing (Cokus et al. 2008).
A new world of trypanosomatid transcription processing can emerge via these technologies. Since parasitic organisms are responsible for neglected diseases in lower income countries, however, the question remaining is: who will pay for it?
Taking T. cruzi ESTs as an example of a potential source of information useful for understanding similar processes in other trypanosomatids, I present here some points to support the widening of EST sequencing in trypanosomatids:
1: Trypanosomatids can use two mechanisms to generate transcript diversity. They may have either (i) distinct splicing acceptor sites (usually an AG dinucleotide located downstream of a tract of poly-pyrimidines) or (ii) distinct polyadenylation sites. By using both mechanisms, trypanosomatids generate mRNAs corresponding to the same gene with different UTR lengths and/or compositions. Since no removal of introns is necessary in the mRNA processing of T. cruzi, alternative cis-splicing should be uncommon. Until now, few experimental works have demonstrated alterations in protein composition due to alternative cis-splicing (Mair et al. 2000).
2: Additional trans-splicing in T. cruzi occurs more frequently in rp genes. Nevertheless, the existence of this phenomenon in other genes leads to the assumption that this is a common feature of transcriptional activity in the T. cruzi epimastigote and other developmental stages. For example, ESTs from the genes coding for Histone H3 and rp L34 were obtained from T. cruzi CL-Brener trypomastigote RNA (Aguero et al. 2004). These ESTs exhibit alternative trans-splicing at this stage. The final result of this process is that two or more mRNAs with different 5' UTRs code for one specific protein. Alternative trans-splicing may impact mRNA translation because many genes in trypanosomatids are regulated via other mRNA interactions at the post-transcriptional level (Teixeira et al. 1995, Hausler & Clayton 1996, D'Orso & Frasch 2001, D'Orso et al. 2003). Altered 5' or 3' UTRs imply the existence of mRNAs with different properties that may change parameters such as half-life, secondary structures, regulatory motifs and protein binding sites.
3: The comparison of some ESTs to the genomic sequence of T. cruzi (CL-Brener clone) shows that some genes use non-canonical trans-splicing sites other than the classical AG. This use of non-canonical trans-splicing sites is another possibility for generating mRNA variation in T. cruzi and possibly other trypanosomatids.
Despite all of the potential of EST analysis, the trypanosomatid collection in the EST section of GenBank remains disappointingly small in comparison to those of either model organisms (e.g., Mus musculus, Homo sapiens and A. thaliana) or other protozoan parasites. Trypanosomatid EST sequencing efforts reflect the same situation as the diseases they cause: as a tool for information, EST sequencing is neglected in research focused on these parasites. Protozoa researchers should endeavor to increase the number of EST entries from trypanosomatids in public databases.
Next-generation sequencing technology provides a broad window of opportunity to tackle the so-called diversity in trypanosomatids from a medical perspective. For example, this includes the task of identifying key targets in transcription and metabolic networks with the goal of drug development and immunotherapy. However, the costs of these technologies are prohibitively high for most laboratories. Due to this obstacle, "old generation sequencing methodologies" such as EST analysis and SAGE still play important roles in trypanosomatid research.
Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, Kerlavage AR, McCombie WR, Venter JC 1991. Complementary DNA sequencing: expressed sequence tags and human genome project. Science 252: 1651-1656.
Aguero F, Ben Abdellah K, Tekiel V, Sanchez DO, Gonzalez A 2004. Generation and analysis of expressed sequence tags from Trypanosoma cruzi trypomastigote and amastigote cDNA libraries. Mol Biochem Parasitol 136: 221-225.
Brandão A 2006. The untranslated regions of genes from Trypanosoma cruzi, perspectives for functional characterization of strains and isolates. Mem Inst Oswaldo Cruz 101: 775-777.
Brandão A, Fernandes O 2006. Trypanosoma cruzi: mutations in the 3' untranslated region (3' UTR) of calmodulin gene are specific for lineages T. cruzi I, T. cruzi II and the Zymodeme III isolates. Exp Parasitol 112: 247-252.
Brandão A, Ürmenyi TP, Rondinelli E, Gonzalez A, Miranda AB, Degrave W 1997. Identification of Transcribed Sequences (ESTs) in the Trypanosoma cruzi Genome Project. Mem Inst Oswaldo Cruz 92: 863-866.
Campbell DA, Thomas S, Sturm NR 2003. Transcription in kinetoplastid protozoa, why be normal? Microbes Infect 5: 1231-1240.
Cerqueira GC, DaRocha WD, Campos PC, Zouain CS, Teixeira SMR 2005. Analysis of expressed sequence tags from Trypanosoma cruzi amastigotes. Mem Inst Oswaldo Cruz 100: 385-389.
Chakrabarti D, Reddy GR, Dame JB, Almira EC, Laipis PJ, Ferl RJ, Yang TP, Rowe TC, Schuster SM 1994. Analysis of expressed sequence tags from Plasmodium falciparum. Mol Biochem Parasitol 66: 97-104.
Cokus SJ, Feng S, Zhang X, Chen Z, Merriman B, Haudenschild CD, Pradhan S, Nelson SF, Pellegrini M, Jacobsen SE 2008. Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature 452: 215-219.
Devera R, Fernandes O, Coura JR 2003. Should Trypanosoma cruzi be called "cruzi" complex? A review of the parasite diversity and the potential of selecting population after in vitro culturing and mice infection. Mem Inst Oswaldo Cruz 98: 1-12.
D'Orso I, De Gaudenzi JG, Frasch AC 2003. RNA-binding proteins and mRNA turnover in trypanosomes. Trends Parasitol 19: 151-155.
D'Orso I, Frasch AC 2001. TcUBP-1, a developmentally regulated U-rich RNA-binding protein involved in selective mRNA destabilization in trypanosomes. J Biol Chem 276: 34801-34809.
Edwards RA, Rodrigues-Brito B, Wegle L, Haynes M, Breitbart M, Peterson DM, Saar MO, Alexander S, Alexander EC, Rohwer F 2006. Using pyrosequencing to shed light on deep mine microbial ecology. BMC Genomics 7: 57.
Fernandes O, Santos SS, Cupolillo E, Mendonça B, Derre R, Junqueira AC, Santos LC, Sturm NR, Naiff RD, Barret TV, Campbell DA, Coura JR 2001. A mini-exon multiplex polymerase chain reaction to distinguish the major groups of Trypanosoma cruzi and T. rangeli in the Brazilian Amazon. Trans R Soc Trop Med Hyg 95: 97-99.
Hausler T, Clayton C 1996. Post-transcriptional control of hsp70 mRNA in Trypanosoma brucei. Mol Biochem Parasitol 76: 57-71.
Hausler T, Stierhof YD, Blattner J, Clayton C 1997. Conservation of mitochondrial targeting sequence function in mitochondrial and hydrogenosomal proteins from the early-branching eukaryotes. Crithidia, Trypanosoma and Trichomonas. Eur J Cell Biol 3: 240-251.
Helm JR, Wilson ME, Donelson JE 2008. Different trans RNA splicing events in bloodstream and procyclic Trypanosoma brucei. Mol Biochem Parasitol 159: 134-137.
Holt RA, Jones SJM 2008. The new paradigm of flow cell sequencing. Genome Res 18: 839-846.
Kubo T, Wada T, Yamaguchi Y, Shimizu A, Handa H 2006. Knock-down of 25 kDa subunit of cleavage factor Im in Hela cells alters alternative polyadenylation within 3' UTRs. Nucleic Acids Res 34: 6264-6271.
Lee Y, Tsai J, Sunkara S, Karamycheva S, Pertea G, Sultana R, Antonescu V, Chan A, Cheung F, Quackenbush J 2005. The TIGR Gene Indices: clustering and assembling EST and known genes and integration with eukaryotic genomes. Nucleic Acids Res 33: D71-74.
Mair G, Shi H, Li H, Djikeng A, Aviles HO, Bishop JR, Falcone FH, Gavrilescu C, Montgomery JL, Santori MI, Stern LS, Wang Z, Ullu E, Tschudi C 2000. A new twist in trypanosome RNA metabolism, cis-splicing of pre-mRNA. RNA 6: 163-169.
Manning-Cela R, González A, Swindle J 2002. Alternative splicing of LYT1 transcripts in Trypanosoma cruzi. Infect Immun 70: 4726-4728.
Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Irzyk GP, Jando SC, Alenquer ML, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg JM 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376-380.
Nagaraj SH, Gasser RB, Ranganathan SA 2007. Hitchhiker's guide to expressed sequence tag (EST) analysis. Brief in Bioinform 8: 6-21.
Nepomuceno-Silva JL, Yokoyama K, de Mello LD, Mendonca SM, Paixao JC, Baron R, Faye JC, Buckner FS, Van Voorhis WC, Gelb MH, Lopes UG 2001. TcRho1, a farnesylated Rho family homologue from Trypanosoma cruzi, cloning, trans-splicing, and prenylation studies. J Biol Chem 276: 29711-29718.
Ronaghi M, Karamohamed S, Pettersson B, Uhlén M, Nyrén P 1996. Real-time DNA sequencing using detection of pyrophosphate release. Anal Biochem 242: 84-89.
Sanger F, Nicklen S, Coulson AR 1977. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74: 5463-5467.
Schuster CS 2008. Next-generation sequencing transforms today's biology. Nat Methods 5: 6-18.
Teixeira SM, Kirchhoff LV, Donelson JE 1995. Post-transcriptional elements regulating expression of mRNAs from the amastin/tuzin gene cluster of Trypanosoma cruzi. J Biol Chem 270: 22586-22594.
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25: 4876-4882.
Verdun RE, Di Paolo N, Urmenyi TP, Rondinelli E, Frasch AC, Sanchez DO 1998. Gene discovery through expressed sequence Tag sequencing in Trypanosoma cruzi. Infect Immun 66: 5393-5398.
Received 27 April 2008
Accepted 31 July 2008
Financial support: Instituto Oswaldo Cruz-Fiocruz