Genetics and Molecular Biology

Print version ISSN 1415-4757; On-line version ISSN 1678-4685

Genet. Mol. Biol. vol.30 no.3 suppl.0 São Paulo  2007 



CitEST libraries



Maria Luísa P. Natividade Targon (I); Marco Aurélio Takita (I, II); Alexandre M. do Amaral (I, III); Alessandra A. de Souza (I); Eliane Cristina Locali-Fabris (I); Sílvia de Oliveira Dorta (I); Kleber Martins Borges (I); Juliana Mendonça de Souza (I); Carolina Munari Rodrigues (I); Adriano Reis Lucheta (I); Juliana Freitas-Astúa (I, IV); Marcos Antonio Machado (I)

(I) Centro APTA Citros Sylvio Moreira, Instituto Agronômico de Campinas, Cordeirópolis, SP, Brazil
(II) Centro de Recursos Genéticos Vegetais, Instituto Agronômico de Campinas, Campinas, SP, Brazil
(III) EMBRAPA Recursos Genéticos e Biotecnologia, Brasília, DF, Brazil
(IV) EMBRAPA Mandioca e Fruticultura Tropical, Cruz das Almas, BA, Brazil





Abstract

To obtain a better understanding of citrus, 33 cDNA libraries were constructed from different citrus species and genera. Total RNA was extracted from fruits, leaves, flowers, bark, seeds and roots at several developmental stages, from plants either subjected or not subjected to biotic and abiotic stresses (pathogens and drought). To identify putative promoter sequences, as well as molecular markers that could be useful for breeding programs, one shotgun library was prepared from sweet orange (Citrus sinensis var. Olimpia). In addition, EST libraries were constructed for a citrus pathogen, the oomycete Phytophthora parasitica, in both its virulent and avirulent forms. A total of 286,559 cDNA clones from citrus were sequenced from their 5’ end, generating 242,790 valid citrus reads. A total of 9,504 sequences were produced from the shotgun library, and the valid reads were assembled using CAP3, yielding 1,131 contigs and 4,083 singletons. A total of 19,200 cDNA clones from P. parasitica were sequenced, resulting in 16,400 valid reads. The set of ESTs generated in this project constitutes, to our knowledge, the largest citrus sequence database in the world.

Key words: ESTs, shotgun, citrus, phytopathogen.




Introduction

Citrus is an important crop worldwide, with an annual production estimated at over 105 million tons in the period 2000-2004 (FAO, 2005). Brazil is one of the main citrus-producing countries, together with the Mediterranean countries, the United States and China. More than two thirds of global citrus fruit production comes from these countries. Processing accounts for approximately one third of citrus fruit production, and more than 80 percent of the processed fruit goes into frozen concentrated orange juice. São Paulo State in Brazil and Florida in the USA are the two main orange juice producers. Brazil exports approximately 99 percent of its production, whereas 90 percent of Florida’s production is consumed domestically and only 10 percent is exported.

Cultivated citrus species are susceptible to various pathogens, including bacteria, fungi, nematodes, viruses and viroids, which are responsible for severe losses worldwide (Deng et al., 2001). Citrus trees are usually grown as a combination of a productive scion variety bud-grafted onto a rootstock variety adapted to soil and environmental conditions (Forment et al., 2005). Citrus breeding is an expensive, long-term process, and citrus breeders have always faced many problems due to the complex genetic background of this crop.

The haploid (n) Citrus genome is approximately 367,000,000 bp in size, making it too expensive to be completely sequenced. Expressed sequence tags (ESTs), on the other hand, are currently a fast and inexpensive way of identifying new genes, obtaining data on gene expression and regulation, and reconstructing metabolic maps based on genome data. ESTs have been generated for many important crops, including sugarcane (Vettore et al., 2001, 2003), Populus (Sterky et al., 2004) and Vitis species (da Silva et al., 2005). For citrus, a genomic shotgun library was also constructed in this work, but the focus was on the cDNA libraries, which were constructed from different tissues of citrus species and genera, either in the presence or in the absence of a pathogen or abiotic stress.

Pathogens are also a focus of CitEST. One of the major concerns of the citrus industry is the phytosanitary problem caused by the different pathogens that attack the crop. Two of these pathogens, Xylella fastidiosa and Xanthomonas axonopodis pv. citri, have had their genomes completely sequenced (Simpson et al., 2000; da Silva et al., 2002). In CitEST, EST libraries were constructed for the oomycete Phytophthora parasitica in both its virulent and avirulent forms.

In this paper, we report the construction and general data obtained from the CitEST libraries.


Materials and Methods

Biological material

The citrus species and genera, as well as all the tissue sources used to construct the cDNA libraries, are listed in Table 1. The material was collected at the Centro APTA Citros Sylvio Moreira - IAC (Cordeirópolis, SP, Brazil) from field trees and from trees grown in a greenhouse or phytotron.



P. parasitica isolates IAC-01/95 G-40 and IAC-01/95 II, avirulent and virulent forms, respectively, were obtained from the collection of the Phytopathological Laboratory of the Centro APTA Citros Sylvio Moreira - IAC.

RNA isolation and cDNA library construction

Total RNA was extracted from 1 g of tissue using the TRIzol Reagent, according to the manufacturer's instructions (Invitrogen). Because of the high carbohydrate content of seeds, total RNA was extracted from them using a modified version of the procedure described by Naito et al. (1988). P. parasitica was obtained by filtration of liquid culture, and total RNA was extracted from 1 g of the filtrate.

Poly A+ RNA was isolated from 0.5 mg of total RNA using the mRNA Isolation System (Promega Corporation, Madison, WI). cDNA libraries were constructed with the SuperScript Plasmid System with Gateway Technology for cDNA Synthesis and Cloning kit (Invitrogen). Complementary DNA was synthesized from mRNA using a primer consisting of an oligo(dT) sequence with a NotI restriction site. SalI adapters were ligated to the blunt-ended cDNA fragments, followed by NotI digestion. The cDNA fragments were size fractionated in Sephacryl S cDNA Size Fractionation Columns (Invitrogen) and cloned between the NotI and SalI restriction sites of the pSPORT 1 vector. The ligated cDNA fragments were transformed into E. coli DH5α cells by the ice-cold RbCl/CaCl2 solution method (Hanahan, 1983). White colonies were inoculated in 200 µL of liquid Circle Grow medium (Molecular Biology Certified Bacterial Growth Media, QBiogene - Bio 101 Systems - USA) containing 8% (v/v) glycerol and 100 µg/mL of ampicillin, in 96-well microtiter plates, incubated overnight at 37 °C and stored at -80 °C.

The citrus EST collections were catalogued by two characters indicating the genus and species, followed by two numbers indicating the variety, a character and a number indicating the tissue source, and three numbers indicating the conditions. For example, TS27-C2-300 designates the library from bark tissue of Citrus sunki grown in a greenhouse.
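The cataloguing scheme can be illustrated with a short parser. The following is a minimal sketch, not part of the CitEST pipeline: the regular expression is inferred only from the description above and from the three library codes quoted in this paper (TS27-C2-300, PT11-C2-301 and CS00-C1-100), and the names `LibraryCode` and `parse_library_code` are hypothetical. The meaning of each field value is given in Table 1 and is not decoded here.

```python
import re
from typing import NamedTuple

# Assumed layout, inferred from the text: two letters (genus and species),
# two digits (variety), a letter plus a digit (tissue source) and three
# digits (conditions), separated by hyphens as in "TS27-C2-300".
LIBRARY_CODE = re.compile(r"([A-Z]{2})(\d{2})-([A-Z])(\d)-(\d{3})")


class LibraryCode(NamedTuple):
    """Fields of a CitEST library code (hypothetical container)."""
    taxon: str      # genus and species characters, e.g. "TS"
    variety: str    # two-digit variety number, e.g. "27"
    tissue: str     # tissue source character and number, e.g. "C2"
    condition: str  # three-digit condition number, e.g. "300"


def parse_library_code(code: str) -> LibraryCode:
    """Split a library code into its four fields, or raise ValueError."""
    match = LIBRARY_CODE.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a recognised library code: {code!r}")
    taxon, variety, tissue_letter, tissue_number, condition = match.groups()
    return LibraryCode(taxon, variety, tissue_letter + tissue_number, condition)


if __name__ == "__main__":
    for code in ("TS27-C2-300", "PT11-C2-301", "CS00-C1-100"):
        print(code, parse_library_code(code))
```

Under this reading, the two bark libraries mentioned in the Results (TS27-C2-300 and PT11-C2-301) share the tissue field C2, which is consistent with the scheme described.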

Shotgun library preparation

For the shotgun library, ten grams of young leaves were collected from Pera Olimpia sweet orange grown in a greenhouse and used for total DNA preparation according to Dellaporta et al. (1983). The DNA was purified in a cesium chloride gradient. Twenty micrograms of DNA were partially digested with 0.1 U of Sau3AI (Invitrogen) for 10 min at room temperature. The DNA was separated in a 0.8% agarose gel, and the fragments of 1.5 to 3 kb were isolated with the GFX PCR DNA and Gel Band Purification Kit (GE Healthcare) and cloned into pGEM3Z linearized at the BamHI restriction site.

Plasmid DNA minipreparation and sequencing

Plasmid DNA was extracted by the boiling method (Marra et al., 1999). The sizes of the cloned fragments from the EST libraries were evaluated by digesting the plasmid DNA with PvuII, and the products were visualized by electrophoresis in 1.2% agarose gels. The sequencing reactions were prepared according to the manufacturer's instructions (Applied Biosystems) for the Big Dye Terminator cycle sequencing ready reaction kit, v3 and v3.1. Sequencing was done in ABI 3700 and 3730 DNA Analyzers (Applied Biosystems).

Sequence data analysis

The methods used to submit, process and analyze the ESTs are described elsewhere in this issue (Reis et al., in this issue).


Results and Discussion

A total of 33 cDNA libraries were constructed from different tissues of citrus species and genera at different developmental stages (fruits), or under biotic or abiotic stresses (pathogens and drought), as listed in Table 1. A total of 16, 9, 1, 1, 2 and 4 libraries were prepared from leaves, fruits, flowers (mixture of four developmental stages), seeds (from fruits at different developmental stages), roots and bark, respectively. Two cDNA libraries from mycelia of P. parasitica and one shotgun library from leaves of C. sinensis var. Pera Olimpia were also prepared. The fragment size of the cDNA clones was evaluated for each library and ranged from 1 to 2 kb, depending on the library (average fragment size was 1.5 kb). Vettore et al. (2001) reported an average fragment size of 1,250 bp for clones of sugarcane cDNA libraries.

A total of 13 C. sinensis var. Pera cDNA libraries were constructed, mainly because this is the most important cultivar in Brazil. Leaves from plants infected with X. fastidiosa, CTV (Citrus tristeza virus), CiLV-C (Citrus leprosis virus cytoplasmic type) and P. parasitica were used as RNA sources to evaluate the genes induced or repressed by these pathogens. Two cDNA libraries from Rangpur lime (C. limonia) were constructed because this Citrus species was the one most commonly used as rootstock in Brazil. To compare gene expression in different tissues, libraries were constructed using tissues of the same plant.

cDNA libraries from the peel of C. sinensis fruits with diameters of 1, 2.5, 5, 7, 8 and 9 cm and of C. reticulata fruits with diameters of 1, 2.5 and 5 cm were constructed with the purpose of identifying differentially expressed genes in each stage. Factors associated with productivity and fruit quality are strongly dependent on fruit development. Understanding the molecular mechanisms by which citrus plants regulate this complex process offers a unique opportunity to clarify the physiological mechanisms that determine fruit setting, fruit size, organic acid accumulation, carbon flow, peel color and morphology. A database containing nonredundant sequences from all cDNA libraries was constructed and used to perform comparisons across all libraries and evaluate individual gene expression levels.

The processing of ESTs is a fundamental step in obtaining high quality sequence reads from raw sequencer trace data (Kunne et al., 2005). Table 2 presents the general data obtained from each CitEST library. A total of 286,559 cDNA clones from citrus were sequenced from their 5’ end, generating 242,790 valid citrus reads. A total of 19,200 cDNA clones from P. parasitica were sequenced, generating 16,400 valid reads. The success index is the percentage of reads with more than 150 bp with Phred quality above 20. Table 2 presents the efficiency per library as a percentage (number of valid reads / number of submitted reads × 100). The minimum efficiency was observed in TS27-C2-300 and PT11-C2-301, from bark tissue of C. sunki and P. trifoliata, respectively. The maximum efficiency was observed in CS00-C1-100, the library from leaves collected from trees grown in a greenhouse. In this case, the leaves used to construct the library were from new flushes of branches, and therefore more suitable for RNA extraction. The sequences are available at:



Table 3 presents a summary of CitEST data. The average read size was 847.48 bp. Forment et al. (2005) reported an average sequence length of 500 nucleotides for a citrus EST collection obtained from 25 cDNA libraries. The average EST length from 26 libraries constructed from different sugarcane tissues was 750 bp (Vettore et al., 2001). The number of clusters per species is also presented in Table 3 and is expressed as the number of contigs plus the number of singlets. The average efficiency of all libraries was 84.73%. Redundancy is calculated as the number of clusters divided by the number of reads, expressed as a percentage. The high redundancy values obtained for some species are related to the high number of reads sequenced for these particular species, since the libraries are being exhausted.
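The read-quality criterion and the summary metrics defined in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline described by Reis et al.: the function names are hypothetical, and `is_valid_read` assumes that "more than 150 bp with Phred quality above 20" means counting the individual bases whose Phred score exceeds 20. The read totals are the ones reported in the text.

```python
from typing import Sequence

MIN_GOOD_BASES = 150  # a valid read needs more than this many good bases
MIN_PHRED = 20        # a base is "good" if its Phred score is above this


def is_valid_read(phred_scores: Sequence[int]) -> bool:
    """Apply the validity criterion under the per-base reading assumed above."""
    good_bases = sum(1 for q in phred_scores if q > MIN_PHRED)
    return good_bases > MIN_GOOD_BASES


def efficiency(valid_reads: int, submitted_reads: int) -> float:
    """Efficiency as defined for Table 2: valid / submitted x 100."""
    return valid_reads / submitted_reads * 100


def cluster_count(contigs: int, singlets: int) -> int:
    """Number of clusters as defined for Table 3: contigs + singlets."""
    return contigs + singlets


def redundancy_as_stated(clusters: int, reads: int) -> float:
    """Redundancy exactly as worded in the text: clusters / reads x 100."""
    return clusters / reads * 100


if __name__ == "__main__":
    # Totals reported in the text for citrus and for P. parasitica.
    print(f"citrus efficiency:        {efficiency(242_790, 286_559):.2f}%")
    print(f"P. parasitica efficiency: {efficiency(16_400, 19_200):.2f}%")
```

With the reported totals, the pooled citrus efficiency is 84.73%, the same figure given above as the average efficiency of all libraries, and the P. parasitica efficiency is 85.42%.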



The efficiency of the genomic (shotgun) library of sweet orange was 78.2%, based on the same quality analysis used for the ESTs. The objective of constructing this library was to identify putative promoter sequences as well as molecular markers that could be useful for breeding programs. We produced a total of 9,504 sequences for this library, and the valid reads were assembled using CAP3, yielding 1,131 contigs and 4,083 singletons. These clusters were analyzed for chloroplast or mitochondrial contamination using the Blast tool to search for similarity against the chloroplast and mitochondrial protein sequences from the Swiss-Prot knowledgebase. This analysis showed that 107 contigs and 203 singletons were, in fact, contaminant sequences from either chloroplasts or mitochondria. They encompass a total of 1,652 sequences, which corresponds to 17.38% of the total.
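The shotgun figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers reported in the preceding paragraph; the variable names are hypothetical, and the last two quantities are derived here rather than reported, on the assumption that each singleton corresponds to exactly one sequence.

```python
# Figures reported in the text for the sweet orange shotgun library.
TOTAL_SEQUENCES = 9_504
CONTIGS, SINGLETONS = 1_131, 4_083
CONTAMINANT_CONTIGS, CONTAMINANT_SINGLETONS = 107, 203
CONTAMINANT_SEQUENCES = 1_652

# Clusters produced by the CAP3 assembly.
clusters = CONTIGS + SINGLETONS
contaminant_clusters = CONTAMINANT_CONTIGS + CONTAMINANT_SINGLETONS

# Share of all produced sequences flagged as organellar contamination.
contaminant_pct = CONTAMINANT_SEQUENCES / TOTAL_SEQUENCES * 100

# Derived, not reported: a singleton holds one sequence, so the remaining
# contaminant sequences must sit in the contaminant contigs.
sequences_in_contaminant_contigs = CONTAMINANT_SEQUENCES - CONTAMINANT_SINGLETONS
mean_per_contaminant_contig = sequences_in_contaminant_contigs / CONTAMINANT_CONTIGS

if __name__ == "__main__":
    print(f"clusters: {clusters}, of which contaminant: {contaminant_clusters}")
    print(f"contaminant sequences: {contaminant_pct:.2f}% of {TOTAL_SEQUENCES}")
    print(f"sequences in contaminant contigs: {sequences_in_contaminant_contigs}")
    print(f"mean sequences per contaminant contig: {mean_per_contaminant_contig:.1f}")
```

The reported 17.38% is therefore relative to the 9,504 sequences produced, not to the valid reads only. The contaminant contigs are comparatively deep, averaging about 13.5 sequences each.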

Other sequencing projects in Brazil have reported a total of 237,954 sugarcane ESTs (Vettore et al., 2003) and 123,889 Eucalyptus ESTs (Carrer, 2005). Forment et al. (2005) obtained 22,635 high quality citrus ESTs from 25 citrus libraries covering different tissues, developmental stages and stress conditions. In the CitEST project, a total of 242,790 citrus ESTs were generated. This collection is, to our knowledge, the largest citrus sequence database in the world.



Acknowledgments

The authors would like to thank Marcelo Reis for assistance with bioinformatics issues. The CitEST project was sponsored by grants from CNPq/Institutos do Milênio/Citrus to M. A. M., process number 620054/01-8.



References

Carrer H (2005) Seeing the FORESTs for the trees. Genet Mol Biol 28(3, Suppl.):i-ii.

da Silva AC, Ferro JA, Reinach FC, Farah CS, Furlan LR, Quaggio RB, Monteiro-Vitorello CB, Van Sluys MA, Almeida NF, Alves LM, et al. (2002) Comparison of the genomes of two Xanthomonas pathogens with differing host specificities. Nature 417:459-463.

da Silva FG, Iandolino A, Al-Kayal F, Bohlmann MC, Cushman MA, Lim H, Ergul A, Figueroa R, Kabuloglu EK, Osborne C, et al. (2005) Characterizing the grape transcriptome. Analysis of expressed sequence tags from multiple Vitis species and development of a compendium of gene expression during berry development. Plant Physiol 139:574-597.

Dellaporta SL, Wood J and Hicks JB (1983) A plant minipreparation: Version II. Plant Mol Biol Rep 1:19-20.

Deng Z, Tao Q, Chang YL, Huang Ling P, Yu C, Chen C, Gmitter Jr DFG and Zhang HB (2001) Construction of a bacterial artificial chromosome (BAC) library for citrus and identification of BAC contigs containing resistance gene candidates. Theor Appl Genet 102:1177-1184.

FAO (2005) Food and Agriculture Organization of the United Nations.

Forment J, Gadea J, Huerta L, Abizanda L, Agusti J, Alamar S, Alos E, Andres F, Arribas R, Beltran JP, et al. (2005) Development of a citrus genome-wide EST collection and DNA microarray as resources for genomic studies. Plant Mol Biol 57:375-391.

Hanahan D (1983) Studies on transformation of Escherichia coli with plasmids. J Mol Biol 166:557.

Kunne C, Lange M, Funke T, Miehe H, Thiel T, Grosse I and Scholz U (2005) CR-EST: A resource for crop ESTs. Nucleic Acids Res 33(Database issue):D619-D621.

Marra MA, Kucaba TA, Hillier LW and Waterston RH (1999) High-throughput plasmid DNA purification for 3 cents per sample. Nucleic Acids Res 27:e37.

Naito S, Dubé PH and Beachy RN (1988) Differential expression of conglycinin α' and β subunit genes in transgenic plants. Plant Mol Biol 11:109-123.

Simpson AJG, Reinach FC, Arruda PA, Abreu FA, Acencio M, Alvarenga R, Alves LMC, Araya JE, Baia GS, Baptista CS, et al. (2000) The genome sequence of the plant pathogen Xylella fastidiosa. The Xylella fastidiosa Consortium of the Organization for Nucleotide Sequencing and Analysis. Nature 406:151-157.

Sterky F, Bhalerao RR, Unneberg P, Segerman B, Nilsson P, Brunner AM, Charbonnel-Campaa L, Lindvall JJ, Tandre K, Strauss SH, et al. (2004) A Populus EST resource for plant functional genomics. Proc Natl Acad Sci USA 101:13951-13956.

Vettore AL, da Silva FR, Kemper ED and Arruda P (2001) The libraries that made SUCEST. Genet Mol Biol 24:1-7.

Vettore AL, da Silva FR, Kemper ED, Souza GM, da Silva AM, Ferro MIT, Henrique-Silva F, Giglioti EA, Lemos MVF, Coutinho LL, et al. (2003) Analysis and functional annotation of an expressed sequence tag collection for tropical crop sugarcane. Genome Res 13:2725-2735.



Send correspondence to
Maria Luísa P. Natividade Targon
Centro APTA Citros Sylvio Moreira
Instituto Agronômico de Campinas
Rod. Anhanguera km 158, Caixa Postal 4
13490-970 Cordeirópolis, SP, Brazil

Received: July 24, 2006; Accepted: April 17, 2007.



Associate Editor: Ivan de Godoy Maia

All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.