Genetics and Molecular Biology

Print version ISSN 1415-4757; On-line version ISSN 1678-4685

Genet. Mol. Biol. vol.24 no.1-4 São Paulo Jan./Dec. 2001 

A search for markers of sugarcane evolution


M. Bacci Jr.1*, V.F.O. Miranda1, V.G. Martins1, A.V.O. Figueira2, M.V. Lemos3, J.O. Pereira4 and C.L. Marino5
1Centro de Estudos de Insetos Sociais, IB, UNESP. Av. 24-A 1515, 13506-900 Rio Claro, SP, Brazil.
2Laboratório de Melhoramento de Plantas/Centro de Energia Nuclear na Agricultura, USP. Av. Centenário 303, 130400-97 Piracicaba, SP, Brazil.
3Departamento de Biologia Aplicada à Agropecuária, UNESP, Via de Acesso Prof. Paulo D. Castellane km 5, 14870-000 Jaboticabal, SP, Brazil.
4Departamento de Biotecnologia de Plantas Medicinais, UNAERP. Av. Costábile Romano 2201, 14096-380 Ribeirão Preto, SP, Brazil.
5Departamento de Genética, IB, UNESP, 18610-000 Botucatu, SP, Brazil.
*Send correspondence to M. Bacci Jr. E-mail:




To determine the phylogenetic relationship between sugarcane cultivars and other members of the Saccharinae subtribe, we identified the fast-evolving ITS1-5.8S-ITS2 region (ITS = internal transcribed spacer; 5.8S = 5.8S ribosomal DNA) of the sugarcane genome in the Sugarcane Expressed Sequence Tag (SUCEST) genome project database. Parsimony analysis of this region and its homologs from the 23 closely related Andropogoneae currently deposited in the GenBank database placed sugarcane as the sister group of Saccharum sinense. However, because the ITS1-5.8S-ITS2 region contains few parsimony-informative characters and shows high homoplasy, we were not able to determine with confidence the phylogenetic relationship between sugarcane and some of the remaining members of the Saccharinae subtribe. To find alternatives for reconstructing the evolutionary history of sugarcane, we selected 17 markers (nuclear, chloroplastic or mitochondrial) from the SUCEST database, of which alpha-tubulin, ribosomal protein L16 (rpl16) and DNA-directed RNA polymerase beta chain (rpoC2) were found to have a low incidence of polymorphism and rates of evolution comparable to, or even faster than, that of the ITS1-5.8S-ITS2 region. We suggest that these markers should be considered preferential choices for phylogenetic studies of the Saccharinae subtribe.




The Saccharum L. group is a polyploid complex within the Saccharinae subtribe of the Andropogoneae Dumort tribe, which is itself located within the Poaceae family (Sobral et al., 1994; Jacobs and Everett, 2000). The Saccharum L. group and Sorghum (Sorghinae subtribe) seem to have diverged from a common ancestor 5 million years ago (Al-Janabi et al., 1994) through a single and rapid radiation (Kellog, 2000; Spangler, 2000), accumulating few consistent mutations (Kellog and Watson, 1993; Mason-Gamer et al., 1998; Spangler et al., 1999). Because of this, phylogenetic reconstructions of the Andropogoneae often result in clades with poorly supported relationships (Spangler, 1999). Even the fast-evolving ITS1 and ITS2 regions have shown few differences between or within Andropogoneae species (Wang et al., 2000; Ainouche and Bayer, 1997). Therefore, the molecular systematics of the Andropogoneae would greatly benefit from markers whose rates of evolution are similar to, or even faster than, those of the ITS regions. In this regard, the Sugarcane Expressed Sequence Tag (SUCEST) genome project is an outstanding source of information because it contains many of the genes expressed by sugarcane cultivars used in agriculture.

However, identifying such markers may be complicated by the large size of the Saccharum genome (2n = 18-170), whose expansion has produced new gene combinations and increased both polymorphism and chromosome number (Ming et al., 1998). The genomic complexity of the sugarcane cultivars represented in the SUCEST database is probably even greater, since they are derived from hybrids between Saccharum spontaneum and Saccharum officinarum.

The aim of the present investigation was to identify fast-evolving markers within the SUCEST database, investigate their polymorphism and select the most appropriate of these markers for studying the evolution of closely related specimens within the Andropogoneae. To accomplish this, we identified sugarcane internal transcribed spacer (ITS) ITS1 and ITS2 sequences and used them to infer the phylogenetic position of sugarcane within the Saccharinae. We also compared the rate of evolution of the ITS regions with that of nuclear, chloroplastic and mitochondrial cDNA sequences selected from the SUCEST database. Our results indicate that at least three sugarcane markers have evolution rates comparable to, or even faster than, those of the ITS regions and can be considered preferential choices for research on the molecular systematics of the Saccharinae subtribe.



Search for sugarcane sequences

Complete coding sequences of markers, including some commonly used in plant phylogenetic studies (Table I), were retrieved from the GenBank database and used as queries in a BLASTN (basic local alignment search tool) search against the SUCEST database. The results were treated as described below in order to select appropriate molecules for evolutionary studies.



Retrieval of sequences homologous to sugarcane markers

Each of the identified sugarcane markers was utilized in a new BLASTN search against the GenBank database, and homologous sequences were retrieved and grouped into a new database for phylogenetic analysis.
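The retrieval step can be pictured as a filter over BLASTN hits. The snippet below is only a minimal sketch, not the procedure used in this study: it assumes the hits are available in the 12-column tab-separated BLAST layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score). The function name, the sequence identifiers and both cut-offs are hypothetical values chosen for illustration.

```python
from collections import defaultdict


def group_homologs(tabular_text, max_evalue=1e-10, min_length=100):
    """Group subject IDs by query marker, keeping only strong hits.

    tabular_text: BLAST hits, one per line, 12 tab-separated columns.
    max_evalue, min_length: hypothetical cut-offs, not taken from the paper.
    """
    groups = defaultdict(list)
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        query, subject = fields[0], fields[1]
        length = int(fields[3])      # column 4: alignment length
        evalue = float(fields[10])   # column 11: E-value
        if evalue <= max_evalue and length >= min_length:
            if subject not in groups[query]:
                groups[query].append(subject)
    return dict(groups)


# Made-up hits for two markers (identifiers are placeholders).
example = "\n".join([
    "ITS\tseqA\t98.5\t590\t9\t0\t1\t590\t1\t590\t0.0\t1050",
    "ITS\tseqB\t80.0\t60\t12\t0\t1\t60\t5\t64\t1e-3\t50",
    "rpl16\tseqC\t95.0\t400\t20\t0\t1\t400\t1\t400\t1e-150\t600",
])
homologs = group_homologs(example)
```

Each key of `homologs` then corresponds to one marker-specific set of sequences that could be passed on to alignment.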

Phylogenetic analysis

Conserved elements within the homologous sequences were aligned through the introduction of gaps using the program MALIGN (multiple alignment), version 2.8 (Wheeler and Gladstein, 1994), running with the score zero and quick parameters. Regions which could not be unambiguously aligned were excluded from the phylogenetic analyses. These analyses were carried out by parsimony or maximum likelihood methods using the program PAUP* (phylogenetic analysis using parsimony and other methods), version 4.0b4a (Swofford, 2000), with a branch-and-bound search (Handy and Denng, 1982). The decay index was calculated according to Bremer (1988), using the TreeRot 2.0 program of Sorenson (1999).
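To make the parsimony score and the decay index concrete, the sketch below scores a fixed tree with Fitch's algorithm for unordered characters and then takes the difference in length between the shortest tree lacking a clade and the shortest tree overall. It is an illustration under simplifying assumptions, not a substitute for PAUP* or TreeRot: it does not search tree space, the four-taxon alignment is invented, and all function names are hypothetical.

```python
def _fitch(node, states):
    """Return (possible ancestral states, number of changes) for one site."""
    if isinstance(node, str):            # leaf: a taxon name
        return {states[node]}, 0
    left, left_cost = _fitch(node[0], states)
    right, right_cost = _fitch(node[1], states)
    shared = left & right
    if shared:                           # children agree: no extra change
        return shared, left_cost + right_cost
    return left | right, left_cost + right_cost + 1


def fitch_length(tree, alignment):
    """Parsimony length of a rooted binary tree given as nested 2-tuples.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    """
    n_sites = len(next(iter(alignment.values())))
    return sum(
        _fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


def decay_index(best_length, best_length_without_clade):
    """Bremer support: extra steps needed before the clade is lost."""
    return best_length_without_clade - best_length


# Invented four-taxon alignment; with four taxa there are only three
# unrooted topologies, so the alternatives can be listed by hand.
toy = {"A": "AAGT", "B": "AAGC", "C": "CTGC", "D": "CTAT"}
with_ab = fitch_length((("A", "B"), ("C", "D")), toy)
alternatives = [(("A", "C"), ("B", "D")), (("A", "D"), ("B", "C"))]
shortest_without = min(fitch_length(t, toy) for t in alternatives)
support = decay_index(with_ab, shortest_without)
```

In this toy case the tree grouping A with B needs 5 steps and the best tree without that group needs 6, so the clade has a decay index of 1.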



Many current investigations within the Andropogoneae tribe use the ITS1 and ITS2 regions because they are known to incorporate changes at comparatively high rates (Hershkovitz and Zimmer, 1997; Baum et al., 1998). Analysis of the sugarcane dataset indicates that cluster SCCCLR1CO5E04.g (107274) contains 590 base pairs corresponding to the complete coding sequence of the ITS1-5.8S-ITS2 region, which was used for the parsimony analyses summarized in Table II.

The position of sugarcane cultivars, represented by SCCCLR1CO5E04.g cluster, has been determined within the 23 closely related Andropogoneae taxa that have the ITS1-5.8S-ITS2 region deposited in the GenBank database. The generated alignment was made up of 603 nucleotides and resulted in 33 most-parsimonious trees (MPTs) with length = 373, consistency index (CI) = 0.7131 and retention index (RI) = 0.7902. In order to minimize the interference from homoplastic characters, data were submitted to successive approximation using the PAUP* option for rescaled consistency index (RCI, Farris, 1969; Carpenter, 1988). This procedure increased the goodness-of-fit and resulted in one of the 33 MPTs (length = 227.94614, CI = 0.8175, RI = 0.8877) which shows SCCCLR1CO5E04.g cluster as the sister group of Saccharum sinense in a clade supported by a bootstrap index = 99 (Figure 1). A reasonable bootstrap index (88) was also obtained for the clustering of Saccharum robustum and Saccharum cultivar R46, but little support exists for the clustering of Saccharum barberi, Saccharum officinarum and Saccharum cultivar R48. Identical tree topology and comparable bootstrap indexes (not shown) were obtained for a single MPT (length = 47, CI = 0.9362, RI = 0.8235) derived from the analysis which included only Saccharum specimens and Miscanthus sinensis. The pairwise test applied to Saccharum and Miscanthus ITS1-5.8S-ITS2 sequences showed few differences, from 1 to 5%, and the alignment of these sequences resulted in only 12 parsimony informative characters (Table II). By considering gaps as a fifth state, parsimony informative characters increased to 13, but this did not increase the support for Saccharum barberi, Saccharum officinarum and Saccharum cultivar R48 clade (not shown). 
In addition to bootstrap values, low decay index and distance values (Figure 1) indicate that tree nodes involving Saccharum species are supported by few mutations and that tree topology may be highly influenced by homoplasy.
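The fit statistics quoted above follow the usual definitions, which can be written out as a short sketch. Here `min_steps` is the minimum number of changes a character could show on any tree, `observed_steps` the number it shows on the tree in question, and `max_steps` the largest number it could show; these argument names are chosen for illustration, and returning 0 for an uninformative character is a convention assumed here. The rescaled consistency index used for successive approximation is the product of CI and RI.

```python
def consistency_index(min_steps, observed_steps):
    """CI = minimum possible steps / observed steps."""
    return min_steps / observed_steps


def retention_index(min_steps, observed_steps, max_steps):
    """RI = (max - observed) / (max - min)."""
    if max_steps == min_steps:
        return 0.0  # assumed convention for uninformative characters
    return (max_steps - observed_steps) / (max_steps - min_steps)


def rescaled_consistency_index(ci, ri):
    """RCI = CI * RI; characters with more homoplasy get smaller values."""
    return ci * ri


# Ensemble values reported for the 33 equally weighted MPTs.
tree_rci = rescaled_consistency_index(0.7131, 0.7902)

# A hypothetical character that needs at least 1 step, shows 2 on the
# tree and could show at most 3.
char_ci = consistency_index(1, 2)
char_ri = retention_index(1, 2, 3)
char_rci = rescaled_consistency_index(char_ci, char_ri)
```

With the reported CI and RI the ensemble RCI is about 0.56, while the hypothetical homoplastic character would be down-weighted to 0.25 in the next round of successive approximation.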



In addition to parsimony, maximum-likelihood analysis was applied to the ITS1-5.8S-ITS2 alignment. To render the analysis computationally tractable, only Saccharum and Miscanthus sinensis representatives have been considered. The most parameter-rich model (general time-reversible + proportion of invariant sites + rate heterogeneity modeled as a gamma distribution with six rate categories) was significantly more likely than the next best model, as determined by the likelihood ratio test. The selected model was applied to a starting tree that had been generated by neighbor-joining (Kimura 2-parameter distances) and tree bisection-reconnection (TBR) branch swapping. The single most likely tree (MLT) generated preserved clades which had high support in the single MPT, i.e. those involving SCCCLR1CO5E04.g cluster and Saccharum sinense or Saccharum robustum and cultivar R46. However, the MPT clade containing Saccharum cultivar R48, Saccharum officinarum and Saccharum barberi was not present in the MLT (Figure 1). These results reinforce the low confidence in the poorly supported relationships obtained for this clade in the MPT.
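The Kimura 2-parameter distance used to build the neighbor-joining starting tree can be sketched as follows. With P the proportion of sites that differ by a transition and Q the proportion that differ by a transversion, the distance is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). The code is a minimal illustration on invented sequences; skipping gapped and ambiguous sites pairwise is an assumption made here, not a statement about how PAUP* handled them.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumed: skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Invented 10-site example with a single A/G transition.
d = k2p_distance("ACGTACGTAC", "ACGTACGTGC")
```

For sequences that differ by only 1 to 5%, as reported for Saccharum and Miscanthus, the corrected distance stays close to the raw proportion of differences.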

The phylogenetic relationship between sugarcane and other members of the Saccharinae subtribe is not easy to determine clearly, and we believe that resolving it will require a larger number of consistent mutations than those accumulated by the ITS regions. To accomplish this, molecular analysis of the Saccharinae should focus on data sets containing not only ITS markers but also other markers with relatively high evolutionary rates, and should use methods such as successive weighting to minimize homoplastic noise.

We searched the SUCEST database to find fast evolving markers, the searches being directed at three categories of molecules, i.e. nuclear or mitochondrial ribosomal DNA and spacers, nuclear or mitochondrial protein coding genes related to carbohydrate metabolism and electron transport, and chloroplast genes. These searches resulted in the identification of 17 cDNA clusters (Table III).

Since single clusters were identified for every marker, a low polymorphism incidence was detected in sugarcane. While the finding of a unique gene is consistent with the absence of polymorphism, this result must be interpreted with special caution. Identification of a given marker from a cDNA library occurs as a function of its relative abundance, so that its absence may just be the result of a low transcription rate.

Thus, to refine the results of the polymorphism investigation, we carried out a BLASTN search for reads in the SUCEST database using each of the 17 identified sugarcane clusters as queries. This procedure identified small gene fragments, which were found either as single molecules or as domains inserted in larger sequences. These fragments were mostly found in nuclear 18S ribosomal RNA (18S rRNA) and elongation factor 1 alpha (EF-1 alpha), and their distribution in sugarcane cDNA sequences may ultimately prove useful in the study of genome expansion in the Saccharum L. group.

In order to evaluate the utility of the selected markers for evolutionary studies, we generated alignments using a given marker and its homologs from the Poaceae (Gramineae) family currently deposited in the GenBank database. The rates of evolution of each marker were estimated by determining the percentage of parsimony-informative characters (PIC) presented by each of the alignments. Table III shows that the ITS1-5.8S-ITS2 alignment contained 136 PIC out of 577 characters, i.e. 24%. In addition, the alpha-tubulin alignment presented 21% of PIC, indicating that alpha-tubulin has an evolution rate comparable to that found in ITS1-5.8S-ITS2 region. On the other hand, rpl16 and rpoC2 have faster evolution rates, since their alignments contained 49% and 59% of PIC respectively. Thus it is reasonable to assume that sequencing of alpha-tubulin, rpl16 and rpoC2 from Saccharinae specimens would add considerable information to the phylogenetic analysis of this subtribe. These findings make such molecules preferential markers for future studies on the systematic position of sugarcane cultivars within the Saccharinae subtribe.
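The percentage of parsimony-informative characters can be computed directly from an alignment: a site is informative when at least two different states each occur in at least two sequences. The sketch below applies that standard definition to an invented alignment, with an option to count gaps as a fifth state; the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def parsimony_informative_sites(alignment, gaps_as_fifth_state=False):
    """Count sites where >= 2 states are each present in >= 2 sequences."""
    valid = set("ACGT")
    if gaps_as_fifth_state:
        valid.add("-")
    n_sites = len(alignment[0])
    informative = 0
    for i in range(n_sites):
        counts = Counter(
            seq[i].upper() for seq in alignment if seq[i].upper() in valid
        )
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return informative


# Invented alignment of four sequences and six sites.
toy = ["ACGTA-", "ACGTA-", "ATGCAA", "ATGCTA"]
pic = parsimony_informative_sites(toy)
pic_with_gaps = parsimony_informative_sites(toy, gaps_as_fifth_state=True)

# Percentage reported for the ITS1-5.8S-ITS2 alignment in Table III.
its_percent = round(100 * 136 / 577)
```

Here the toy alignment has 2 informative sites, or 3 when gaps are scored, and the ITS figures reproduce the 24% given in the text.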



We thank F. Marques for his comments on the parsimony analyses; Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) for the financial support of this work (Proc. 00-07438-1) and the master's degree fellowship of V.F.O.M. (Proc. 00/05098-9); and Centro de Estudos de Insetos Sociais-UNESP, Rio Claro, for logistical support. Jonathan and Cybel Burgess are also acknowledged for reviewing the English.




In order to determine the phylogenetic relationship between sugarcane and members of the Saccharinae subtribe, the fast-evolving nuclear ITS1-5.8S-ITS2 region (ITS: internal transcribed spacer; 5.8S: 5.8S ribosomal DNA) was identified in the database of the Sugarcane Expressed Sequence Tag (SUCEST) genome project. A parsimony analysis using this region and homologous sequences from 23 Andropogoneae retrieved from the GenBank database indicated that sugarcane is the sister group of Saccharum sinense. However, because of the small number of parsimony-informative characters and the homoplasy present in the ITS1-5.8S-ITS2 region, it was not possible to determine with confidence the phylogenetic relationship between sugarcane and some of the other members of the Saccharinae. As an alternative to this low resolution, seventeen nuclear, chloroplast or mitochondrial gene regions were selected from the SUCEST database with the aim of finding markers better suited to reconstructing sugarcane phylogeny. Among them, those corresponding to alpha-tubulin, rpl16 and rpoC2 showed a low incidence of polymorphism and rates of evolution equal to or even higher than that observed for the ITS1-5.8S-ITS2 region. These markers are proposed as preferential choices for phylogenetic studies of the Saccharinae subtribe.




Al-Janabi, S.M., Honeycutt, R.J., Peterson, C. and Sobral, B.W.S. (1994). Phylogenetic analysis of organellar DNA sequence in Andropogoneae: Saccharum. Theor. Appl. Genet. 88: 933-934.

Ainouche, M.L. and Bayer, R.J. (1997). On the origins of tetraploid Bromus species (section Bromus, Poaceae): insights from internal transcribed spacer sequences of nuclear ribosomal DNA. Genome 49: 730-743.

A.P.G. (1998). An ordinal classification for the families of flowering plants. Ann. Missouri Bot. Gard. 85: 531-553.

Baldauf, S.L., Palmer, J.D. and Doolittle, W.F. (1996). The root of universal tree and the origin of eukaryotes based on elongation factor phylogeny. Proc. Natl. Acad. Sci. USA 93: 7749-7754.

Baum, D.A., Small, R.L. and Wendel, J.F. (1998). Biogeography and floral evolution of baobabs (Adansonia, Bombacaceae) as inferred from multiple data sets. Syst. Biol. 47: 181-207.

Bremer, K. (1988). The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42: 795-803.

Carpenter, J.M. (1988). Choosing among multiple equally parsimonious cladograms. Cladistics 4: 291-296.

Duff, R.J. and Nickrent, D.L. (1999). Phylogenetic relationships of land plants using mitochondrial small-subunit rDNA sequences. Amer. J. Bot. 86: 372-386.

Farris, J.S. (1969). A successive approximations approach to character weighting. Syst. Zool. 18: 374-385.

Goto, S.G. and Kimura, M.T. (2001). Phylogenetic utility of mitochondrial COI and nuclear Gpdh genes in Drosophila. Mol. Phylogenet. Evol. 18: 404-422.

Handy, M.D. and Denng, M.D. (1982). Branch-and-bound algorithmic to determine minimal evolutionary tree. Discrete Math. 96: 51-68.

Hershkovitz, M.A. and Zimmer, E.A. (1997). On the evolutionary origins of the cacti. Taxon 46: 217-232.

Hiesel, R., von Haeseler, A. and Brennicke, A. (1994). Plant mitochondrial nucleic acid sequences as a tool for phylogenetic analysis. Proc. Natl. Acad. Sci. USA 91: 629-633.

Hillis, D.M. and Dixon, M.T. (1991). Ribosomal DNA: molecular evolution and phylogenetic inference. Quart. Rev. Biol. 66: 411-453.

Kellog, E.A. (2000). Molecular and morphological evolution in the Andropogoneae. In: Grasses: systematics and evolution (Jacobs, S.W.L. and Everett, J., eds.), CSIRO Publishing, Collingwood, Melbourne, Australia, pp. 149-158.

Kellog, E.A. and Watson, L. (1993). Phylogenetic studies of a large data set. I. Bambusoideae, Andropogonodae, and Pooideae (Gramineae). Bot. Rev. 59: 273-343.

Mason-Gamer, R.J., Weil, C.F. and Kellog, E.A. (1998). Granule-bound starch synthase: structure, function, and phylogenetic utility. Mol. Biol. Evol. 15: 1658-1673.

Ming, R., Liu, S.C., Lin, Y.R., da Silva, J., Wilson, W., Braga, D., van Deynze, A., Wenslaff, T.F., Wu, K.K., Moore, P.H., Burnquist, W., Sorrels, M.E., Irvine, J.E. and Paterson, A.H. (1998). Detailed alignment of Saccharum and Sorghum chromosomes: comparative organization of closely related diploid and polyploid genomes. Genetics 150: 1663-1682.

Monteiro, A. and Pierce, N.E. (2001). Phylogeny of Bicyclus (Lepidoptera: Nymphalidae) inferred from COI, COII, and EF-1 alpha gene sequences. Mol. Phylogenet. Evol. 18: 264-281.

Olmstead, R.G. and Palmer, J.D. (1994). Chloroplast DNA systematics: a review of methods and data analysis. Amer. J. Bot. 81: 1205-1224.

Sobral, B.W.S., Braga, D.P.V., LaHodd, E.S. and Kein, P. (1994). Phylogenetic analysis of chloroplast restriction enzyme site mutation in the Saccharinae Griseb. subtribe of Andropogoneae Dumort. tribe. Theor. Appl. Genet. 87: 843-853.

Soltis, E.D. and Soltis, P.M. (2000). Contributions of plant molecular systematics to studies of molecular evolution. Plant Mol. Biol. 42: 45-75.

Soltis, P.S., Soltis, D.E. and Chase, M.K. (1999). Angiosperm phylogeny inferred from multiple genes as a tool for comparative biology. Nature 402: 402-404.

Sorenson, M.D. (1999). Boston University, Boston, MA, USA.

Spangler, R.E. (2000). Andropogoneae systematics and generic limits of Sorghum. In: Grasses: systematics and evolution (Jacobs, S.W.L. and Everett, J., eds.), CSIRO Publishing, Collingwood, Melbourne, Australia.

Spangler, R.E., Zaitchik, B., Russo, E. and Kellog, E. (1999). Andropogoneae evolution and genetic limits in Sorghum (Poaceae) using ndhF sequences. Systematic Bot. 24: 267-281.

Swofford, D.L. (2000). Phylogenetic analysis using parsimony, version 4.0b4a. Illinois Natural History Survey, Champaign, USA.

Wheeler, W.C. and Gladstein, D. (1994). MALIGN: a multiple sequence alignment program. J. Hered. 85: 417.

Wang, J.B., Wang, C., Shi, S.H. and Zhong, Y. (2000). Evolution of parental ITS regions of nuclear rDNA in allopolyploid Aegilops (Poaceae) species. Hereditas 133: 1-7.

All the contents of this journal, except where otherwise noted, is licensed under a Creative Commons Attribution License