Mitochondrial and chloroplast localization of FtsH-like proteins in sugarcane based on their phylogenetic profile

A phylogenetic analysis of plant FtsH-like proteins was performed using protein sequences from the GenBank database, and five groups of plant FtsH-like proteins were identified by neighbor-joining analysis. Prediction of the subcellular location of the proteins suggested that two groups (FtsH-m1 and FtsH-m2) were targeted to mitochondria and three (FtsH-p1, FtsH-p2 and FtsH-p3) to plastids. The phylogenetic profile of plant FtsH-like proteins was used to search sugarcane expressed sequence tag (EST) clusters in the SUCEST database. Initially, 153 clusters presenting homology with FtsH-like proteins were recovered, of which 23 were confirmed by a BLAST search in the GenBank database and by comparison of their hydropathy index with that of previously described FtsH-like proteins. Sugarcane presented EST clusters in all phylogenetic groups. In silico expression analysis showed that the groups are differentially expressed in sugarcane tissues, with FtsH-p2 and FtsH-m1 presenting increased levels of expression.


INTRODUCTION
The AAA protein family, or ATPases associated with different cellular activities, is a distinct group of the Walker-type superfamily of A/GTPases (Kunau et al., 1993; Walker et al., 1982). Members of the AAA family are characterized by the presence of one or two copies of the AAA module, a conserved sequence of 220-250 amino acids that encompasses the Walker A and B motifs, and a highly conserved amino acid sequence termed the second region of homology (SRH), which distinguishes the AAA family from other Walker-type A/GTPases (Karata et al., 1999).
AAA proteins are widespread in all living organisms, indicating that functional divergence among AAAs is based on one of the oldest biochemical traits, the conversion of the chemical energy stored in ATP molecules into biological activity. These proteins are involved in a wide range of cellular processes, including cellular housekeeping, control of the cell cycle, protein degradation, regulation of gene expression and organelle biogenesis (for a review, see Patel and Latterich, 1998). The AAA family can be divided into metalloproteases and other subfamilies of proteins involved in vesicle-mediated secretion, homotypic fusion, peroxisome biogenesis, meiosis and the functioning of mitochondria (Karata et al., 1999).
AAA metalloproteases are ubiquitous in Bacteria and Eukarya, but have not yet been identified in Archaea (Swaffield and Purugganan, 1997; Langer, 2000). A typical member of this subfamily presents one copy of the AAA module and is distinguished from AAA proteins of other subfamilies by the presence of a zinc-binding motif (HEXGH) and a putative coiled-coil region, both located at the C-terminus (Figure 1). Filamentation temperature sensitive (fts) mutants of Escherichia coli fail to septate when cultured at elevated temperature. The FtsH protein was the first reported AAA metalloprotease. This E. coli protein is an integral membrane ATP-dependent protease involved in the degradation of soluble and integral membrane proteins (Tomoyasu et al., 1995; Akiyama et al., 1996).
FtsH proteases span the cytoplasmic membrane twice, exposing a very short part of the N-terminus and a long part of the C-terminus to the cytoplasm. The C-terminal part contains the AAA module, a zinc-binding motif and a leucine-zipper coiled coil (Tomoyasu et al., 1995; Shotland et al., 2000).
Genetics and Molecular Biology, 24 (1-4), 183-190 (2001)
FtsH has a homooligomeric structure similar to ring-like structures found in other ATP-dependent proteases such as Clp/ClpP, HslU/HslV, and the 26S proteasome. The N-terminal region (including the two transmembrane segments and a small periplasmic domain) plays an important role in homooligomerization and in the modulation of the proteolytic activity of this enzyme (Akiyama et al., 1998).
Several eukaryotic FtsH orthologues have been identified and are apparently localized exclusively in mitochondria and chloroplasts. In most cases, the proteins are encoded by nuclear genes and post-translationally imported into their respective organelles by specific targeting sequences (Hugueney et al., 1995; Leonhard et al., 1995; Chen et al., 2000).
In yeast mitochondria, three FtsH-like proteins have so far been identified: Yme1p (Yta11p), Yta10p (Afg3p) and Yta12p (Rca1p). Yme1p forms a homooligomeric complex termed the i-AAA protease that exposes its catalytic sites to the inter-membrane space. Yeast cells lacking the i-AAA protease have an increased rate of mtDNA escape and are respiratory-deficient at elevated temperatures (Thorsness et al., 1993). In fact, it has been shown that the AAA module of Yme1p has chaperone-like properties, in which the interaction with unfolded substrates ensures the specificity of proteolytic activity (Leonhard et al., 2000). In contrast to the i-AAA protease, the m-AAA protease, formed by heterooligomerization of the yeast tat-binding analogs (Yta) Yta10p and Yta12p, has its catalytic site oriented towards the mitochondrial matrix (Leonhard et al., 1995). This protease degrades unassembled polypeptide chains in the mitochondrial matrix and is required for the correct assembly of several mitochondrial protein complexes (Paul and Tzagoloff, 1995; Arlt et al., 1998).
Although Yme1p, Yta10p and/or Yta12p orthologues have been described in many eukaryotes, including the red alga Cyanidioschyzon merolae, Caenorhabditis elegans, Drosophila melanogaster, mice and humans, none have as yet been characterized in higher plants. In fact, the plant FtsH-like proteins have so far been detected only in plastids, the plastid fusion and/or translocation factor (Pftf) of Capsicum annuum being the first FtsH-like protein identified in plants (Hugueney et al., 1995). A Pftf orthologue (VAR2) identified in variegated mutants of Arabidopsis appears to be expressed only in the green organs of the plant, with genetic evidence indicating that it is involved in the biogenesis of thylakoid membranes (Chen et al., 2000).
A second type of thylakoidal FtsH-like protein has been found in tobacco, where it is related to a hypersensitivity response, and in Arabidopsis, where it is involved in the light-induced turnover of the Photosystem II D1 protein (Seo et al., 2000; Lindahl et al., 2000). In pea cells, a thylakoid membrane metalloprotease stimulated by zinc ions has been described as being responsible for the degradation of unassembled subunits of the cytochrome b6f complex (Ostersetzer and Adam, 1997). This protein might be either an already described FtsH-like protein or a new type of AAA metalloprotease.
Several authors have shown that, in the absence of direct biochemical or genetic data, information on the subcellular localization and biological function of proteins can be tentatively derived from phylogenetically established profiles (Marcotte et al., 2000; Hannenhalli and Russell, 2000). The subcellular localization and evolutionary relationship with eubacterial AAA metalloproteases indicate that the eukaryotic FtsH-like proteins arose by gene migration events from an ancestral endosymbiont of mitochondria and chloroplasts to the primitive eukaryotic nucleus (Swaffield and Purugganan, 1997).
Eukaryotic orthologues of the FtsH-like proteins of the cyanobacterium Synechocystis have been identified using the phylogenetic approach described by Chen et al. (2000). Three (FtsH-1, 2 and 4) of the four different FtsH-like proteins observed in Synechocystis have been found to be closely related to plant FtsH-like proteins, although it remains unclear whether a eukaryotic orthologue of the FtsH-3 protein actually exists.
The availability of a complete genome sequence for Arabidopsis and of a transcriptome database for sugarcane provides invaluable tools for both the construction of a comprehensive picture of plant FtsH-like proteins and the elucidation of their phylogenetic relationship to the eubacteria.
In this paper, we provide a wider view of the phylogenetic profile of plant FtsH-like proteins as well as their putative subcellular localization by using the data available in the sugarcane EST project (SUCEST) database and the FtsH-like protein sequences available in the GenBank database. We also describe for the first time the presence of FtsH-like proteins in monocotyledonous plants and propose a new classification of plant FtsH-like proteins.

MATERIAL AND METHODS
Forty-one FtsH-like protein sequences from a wide range of species were obtained from the GenBank database and used to produce plant metalloprotease phylogenetic groups (Table I). These sequences were identified using the basic local alignment search tool (BLAST) (http://www.ncbi.nlm.nih.gov/BLAST/) and the Arabidopsis thaliana X99808 protein as query sequence (Lindahl et al., 1996). The sequences obtained were then checked for the presence of AAA metalloprotease-specific motifs (the AAA module and the zinc-binding motif), and only sequences matching these criteria were used for further analysis. Four protein sequences annotated as FtsH-like proteins in the Arabidopsis database were not included in this analysis due to the absence of the zinc-binding motif (accession numbers: AAF322452, BAB01269, AAB63647, CAB43894). The alignment procedure and the distance-based phylogenetic reconstruction were carried out using the Clustal W program (Thompson et al., 1994), the sequences being aligned using the standard alignment parameters. A Gonnet matrix was used to generate the protein sequence distance matrix that was used in the phylogenetic analysis employing the neighbor-joining method (Saitou and Nei, 1987). Support for nodes was estimated by bootstrapping using 10000 data re-samplings and the phylogenetic tree was graphically displayed using TreeView 1.5 (Page, 1996). The putative subcellular localization of the 41 FtsH-like proteins was predicted by the TargetP software (Emanuelsson et al., 2000) using the first 130 N-terminal amino acids of each sequence. Sugarcane expressed sequence tag (EST) cluster consensi were initially identified by BLAST searches using the 17 previously aligned plant FtsH-like protein sequences as the query sequences in the SUCEST cluster consensi database. Using a BLAST cut-off value of E < 1e-5, 153 EST clusters were identified. Each EST cluster identified was further used as a query sequence in a new BLAST search of the GenBank database, with only those first aligning to the 41 previously recorded FtsH-like proteins being considered as putative FtsH-like sequences.
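The motif check applied to the retrieved sequences can be illustrated with a minimal Python sketch. It screens a protein sequence for the zinc-binding motif (HEXGH, as given in the Introduction) located C-terminal to a Walker A motif. The Walker A pattern G-x(4)-G-K-[S/T] is the commonly cited consensus and is used here only as an assumed proxy for the AAA module; the function name and the toy sequences are hypothetical and are not taken from the original analysis, which does not state how the motifs were scored.

```python
import re

# Zinc-binding motif as written in the text: H-E-x-G-H (x = any residue).
ZINC_MOTIF = re.compile(r"HE.GH")
# Walker A consensus G-x(4)-G-K-[ST]; an assumed stand-in for the AAA module,
# which the text does not define as a sequence pattern.
WALKER_A = re.compile(r"G.{4}GK[ST]")


def has_metalloprotease_motifs(seq: str) -> bool:
    """Return True if a Walker A motif is followed by a zinc-binding motif."""
    seq = seq.upper()
    walker = WALKER_A.search(seq)
    if walker is None:
        return False
    # The zinc-binding motif is expected C-terminal to the AAA module,
    # so the search starts after the end of the Walker A match.
    return ZINC_MOTIF.search(seq, walker.end()) is not None


# Toy sequences (hypothetical, for illustration only).
with_both = "MKTGPPGTGKTLLAKAHEAGHAL"
without_zinc = "MKTGPPGTGKTLLAKA"
```

With these toy inputs, `has_metalloprotease_motifs(with_both)` is true and `has_metalloprotease_motifs(without_zinc)` is false, which mirrors the exclusion of annotated sequences lacking the zinc-binding motif.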
Three protein sequences from Arabidopsis (accession numbers X99808, Y12780 and AAD50055) mapped to the same locus, suggesting that they are allelic forms, and therefore only one of them (AAD50055) was included in our analysis. Hydropathy index profiles were obtained using the prediction of transmembrane regions (TMpred) program (http://www.ch.embnet.org/software/TMPRED_form.html).
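TMpred relies on its own statistical tables, which are not reproduced here. For orientation only, the general idea of a hydropathy profile can be sketched in Python with a sliding-window average over the Kyte-Doolittle scale, which is a different and simpler method than the one used in this work. The window length of 19 residues and the function name are assumptions made for the sketch.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list:
    """Mean hydropathy of every full window along the sequence.

    Raises KeyError for non-standard residue codes (e.g. 'X').
    Returns an empty list if the sequence is shorter than the window.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    if len(values) < window:
        return []
    total = sum(values[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        profile.append(total / window)
    return profile
```

Profiles computed this way for two sequences can be plotted against residue position and compared by eye, in the spirit of the visual comparison of hydropathy profiles reported in the Results.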

RESULTS AND DISCUSSION
According to the endosymbiont hypothesis, symbiotic events involving a proto-eukaryotic host and the ancestors of modern α-proteobacteria and cyanobacteria gave rise to mitochondria and plastids, respectively (Margulis, 1970). After the symbiotic relationship was established, the loss of redundant genes and the translocation of endosymbiont genes to the nucleus of the eukaryote resulted in the current distribution of genes between the three genomes (Swaffield and Purugganan, 1997). It is interesting to note that FtsH-like proteins have been found in cyanobacteria and α-proteobacteria.
Our neighbor-joining analysis identified five groups of plant FtsH-like proteins targeted to either plastids (p) or mitochondria (m). The plastid groups are FtsH-p1 and FtsH-p2 (previously described by Chen et al. (2000)) and a new group, FtsH-p3, while the mitochondrial groups are FtsH-m1 and FtsH-m2. All these groupings are supported by strong bootstrap values (Figure 2). Supporting evidence for these groupings exists in the finding that sequence identity (data not shown) and hydropathy index similarity extended beyond the AAA module, mainly at the C-terminal region (Figure 3). This was further used as a signature to differentiate the groups. The grouping together of the ARATH-AAD30220-998-I and MESVI-AAF-43852-890 sequences was not supported by visual inspection of the hydropathy profiles (data not shown), and these sequences were thus not considered a phylogenetically related group in our analysis. The relationship of this group to other eukaryotic FtsH-like proteins remains unclear.
In agreement with the endosymbiont hypothesis, most mitochondrial and plastid groups appear to have orthologues in the α-proteobacteria and the cyanobacterium Synechocystis, respectively. FtsH-p1 is very similar to the cyanobacterial FtsH 4, whereas FtsH-p2 is supposed to be orthologous to the cyanobacterial FtsH 1 and FtsH 2 proteins (Figure 2). These observations support recent research by Chen et al. (2000) based on the classification of cyanobacterial FtsH-like proteins. Interestingly, the algal FtsH-p2 group is encoded in the chloroplast genome (Table I). The plastid FtsH-p3 and the mitochondrial FtsH-m1 groups appear to have as prokaryotic orthologues the cyanobacterial FtsH 3 and FtsH-like proteins from the α-proteobacteria Rickettsia prowazekii and Bradyrhizobium japonicum, respectively (Figure 2).
In contrast to the new FtsH-p3 plastid-localized proteins, all the other groups have at least two members whose subcellular localization has been experimentally confirmed. In addition, prediction of the subcellular localization of most eukaryotic FtsH-like proteins is in agreement with their phylogenetic profile (Table I).
Phylogenetic analysis of metazoan mitochondrial FtsH-like proteins suggests the existence of three major groups of FtsH-like proteins related to Yme1p, paraplegin and Yta10/Yta12 (Juhola et al., 2000). Our phylogenetic analysis has revealed that plants and fungi present two major groups of mitochondrial FtsH-like proteins, FtsH-m1 and FtsH-m2, which are homologous to Yme1p and Yta10/Yta12, respectively (Figure 2, Table II). However, in contrast to fungi and metazoans, plants seem to have two FtsH-like proteins related to Yme1p (Figure 2 and Table II).
We used the phylogenetic profile of plant FtsH-like proteins to search for sugarcane EST clusters in the SUCEST database and found 153 clusters presenting similarity with FtsH-like proteins, 23 of these clusters being confirmed as FtsH-like proteins following a BLAST search in the GenBank database. Interestingly, sugarcane presents clusters in all groups previously described by phylogenetic analysis (Table II).
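The two-step selection described here (an E-value cut-off in the first search, followed by a confirmatory search in which the top hit must be one of the known FtsH-like proteins) can be sketched in Python as follows. The data layout, function name, and cluster and accession labels are hypothetical; only the cut-off E < 1e-5 comes from the Methods, and the sketch is not the pipeline actually used.

```python
def confirm_clusters(first_pass, top_hit, known_ids, max_evalue=1e-5):
    """Select candidate clusters and those confirmed by the second search.

    first_pass: dict mapping cluster name -> best E-value against the
                plant FtsH-like query sequences (first search).
    top_hit:    dict mapping cluster name -> accession of the first
                aligned sequence in the confirmatory search.
    known_ids:  set of accessions of the previously recorded
                FtsH-like proteins.
    Returns (candidates, confirmed) as sorted lists of cluster names.
    """
    candidates = sorted(c for c, e in first_pass.items() if e < max_evalue)
    confirmed = [c for c in candidates if top_hit.get(c) in known_ids]
    return candidates, confirmed


# Made-up example data (not from SUCEST or GenBank).
first_pass = {"cluster_A": 1e-40, "cluster_B": 1e-7, "cluster_C": 0.3}
top_hit = {"cluster_A": "known_1", "cluster_B": "other_protein"}
known_ids = {"known_1", "known_2"}
candidates, confirmed = confirm_clusters(first_pass, top_hit, known_ids)
```

In this toy example two clusters pass the E-value cut-off but only `cluster_A` is confirmed, analogous to the reduction from 153 candidate clusters to 23 confirmed ones.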
Sequencing of the Arabidopsis thaliana genome has revealed very similar FtsH-like protein paralogues in all phylogenetic groups, the exception being the FtsH-m1 set of proteins (designated in this article as FtsH-m1A (AAC31223) and FtsH-m1B (BAB08420)), which present more divergent paralogues. Although it is devoid of a typical targeting sequence, the FtsH-m1A protein seems to be targeted to mitochondria based on its phylogenetic profile. Interestingly, the first amino acid of this protein sequence aligns with amino acid 87 of the putatively mitochondrial sugarcane EST cluster consensus SCCCLR1078F01.g, suggesting that the N-terminus of FtsH-m1A was misassigned or misidentified by the exon-predicting program.
Analysis of the sugarcane EST database showed the existence of FtsH-m1A and FtsH-m1B orthologues, indicating that they are also present in monocotyledonous plants (Table II). This suggests that a gene duplication event occurred before monocotyledons and dicotyledons diverged and that the paralogues were maintained in both descendant lineages.
In order to avoid confusion and facilitate scientific communication, a systematic approach to the cataloging of all plant FtsH-like proteins has recently been proposed (Adam et al., 2001). This approach has shown that Arabidopsis contains nine FtsH protease isomers, most of which are located in mitochondria and chloroplasts. In our work, we have extended this classification and produced five distinct groups based on phylogenetic analyses and hydropathy index profiles. In addition, subcellular localization of the FtsH-like proteins was restricted to mitochondria (two groups) and chloroplasts (three groups).
In the absence of experimental data, inferences about the biological function of new proteins may be obtained from their phylogenetic profile, in which proteins whose functions are well known are compared with their respective orthologues (Marcotte et al., 2000; Hannenhalli and Russell, 2000). Although the function of plant FtsH-like proteins is poorly understood, their orthologues in yeast and bacteria have received a lot of attention in the last few years. In this paper we have presented a new class of FtsH-like proteins, FtsH-p3, of unknown function, which we predict to be localized in the plastids of plant cells. The function and location of some members of the FtsH-p1 and FtsH-p2 groups have recently been reported. In Arabidopsis, the proteins from these groups are located in the thylakoid membrane and are apparently involved in the light-induced turnover of the Photosystem II D1 protein (FtsH-p1) and thylakoid membrane biogenesis (FtsH-p2) (Lindahl et al., 2000; Chen et al., 2000), while tobacco FtsH-p1 has been associated with the hypersensitive reaction (Seo et al., 2000). The data shown in Table II and Figure 3 indicate that sugarcane EST clusters present high similarity with these proteins. We believe that these proteins play a phylogenetically conserved role in plastid metabolism because light-induced turnover of the D1 protein and thylakoid membrane biogenesis are crucial in both monocotyledons and dicotyledons.
In yeast cells, mitochondrial FtsH-like proteins are involved in assembly of the respiratory chain complex as well as in degradation of unassembled respiratory chain subunits. The human orthologue of Yme1p complements a yeast yme1 disruptant (Shah et al., 2000), and it seems that the function of FtsH-like proteins is conserved among eukaryotes.
In order to estimate FtsH-like gene expression in different cDNA libraries, we carried out an in silico Northern blot, based on the assumption that the number of reads for a gene in a specific cDNA library is approximately proportional to the level of expression of that gene in the tissue. The SUCEST database is composed of non-normalized cDNA libraries from several sugarcane tissues, and we performed the in silico Northern analysis for the different groups of sugarcane FtsH-like proteins using the total number of reads per library and the number of FtsH-like ESTs (Figure 4).
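The proportionality assumption behind the in silico Northern analysis amounts to dividing the number of FtsH-like reads in a library by the total number of reads in that library. A minimal Python sketch of this normalization is given below. The scaling to reads per 10,000, the function name, and the library names and counts are assumptions for illustration; the text does not state the scaling actually used for Figure 4.

```python
def reads_per_10k(group_reads, library_totals):
    """Scale EST counts of one FtsH group to reads per 10,000 library reads.

    group_reads:    dict mapping library name -> number of ESTs assigned
                    to the FtsH group in that library.
    library_totals: dict mapping library name -> total reads in the library.
    Libraries with no reads for the group get a value of 0.0; libraries
    with a total of zero are skipped to avoid division by zero.
    """
    return {
        lib: 10_000 * group_reads.get(lib, 0) / total
        for lib, total in library_totals.items()
        if total > 0
    }


# Made-up counts (not SUCEST data).
library_totals = {"leaf": 10_000, "root": 20_000}
ftsh_reads = {"leaf": 5}
relative = reads_per_10k(ftsh_reads, library_totals)
```

Because the libraries are non-normalized, comparing such scaled values across libraries is what allows a group with more reads in a smaller library to be recognized as more highly represented there.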
The FtsH-p2 group was expressed, both qualitatively and quantitatively, more than the FtsH-p1 and FtsH-p3 groups, with increased expression in leaf, apical meristem, stem bark and root tissues. Expression of plastid groups of FtsH-like proteins in lateral buds was not detected, suggesting either that these proteins are not expressed or that they are expressed at very low levels. On the other hand, all plastid groups of FtsH-like proteins are expressed in seeds and in plantlets infected with Herbaspirillum rubrisubalbicans. In general, except for lateral buds, FtsH-like proteins are differentially expressed in most sugarcane tissues (Figure 4A). Interestingly, expression of all plastid groups was observed in the root and in the transition zone. Since the FtsH-p1 and FtsH-p2 orthologues are supposed to be light regulated (Lindahl et al., 1996; Chen et al., 2000), it might be interesting to study their regulation in tissues not exposed to light. As observed with the plastid-located FtsH-like proteins, the mitochondrial groups present a broad spectrum of gene expression and are apparently differentially regulated. However, FtsH-m1 presents a higher expression level in leaf tissues than FtsH-m2 (Figure 4B).
Multifunctionality is an important feature of the AAA metalloproteases, including the FtsH protein from E. coli and the Yme1p, Yta10 and Yta12 proteins from yeast. In addition, FtsH chaperone activity seems to be uncoupled from proteolytic activity. An intriguing question is whether the plastid FtsH-like proteins might behave like their mitochondrial counterparts. According to our observations, expression of FtsH-like proteins in non-photosynthetic tissues might suggest a new cellular function for these proteins. Although speculative, this might shed light on the current understanding of the FtsH-like protein family in plant cells.

Figure 1 - Diagram of a typical eukaryotic AAA metalloprotease. The AAA module encompasses the Walker A and B motifs and the second region of homology (SRH).

Figure 2 - Phylogenetic groups obtained by neighbor-joining analysis. Bootstrap support is given as the percentage of 10000 re-samplings in which a given node appeared. The protein sequences are identified through their Swiss-Prot species identification code, accession number and amino acid number (all items are described in Table I). In the Arabidopsis protein sequences, the roman numeral after the amino acid number indicates the chromosome in which the FtsH encoding gene is located.

Figure 3 - Hydropathy index profile for sequences of the five plant phylogenetic groups (A, B, C, D and E) and for the aligned sugarcane EST cluster consensi (F, G, H, I and J). The EST cluster consensi were positioned according to their best aligned protein sequence (see Table II).

Table I - List of the protein sequences used in the phylogenetic analysis. The Swiss-Prot identification code of species is shown in parentheses. *This protein sequence appears not to have a targeting sequence. Prediction of its subcellular localization was based on the phylogenetic profile only.

Table II - Phylogenetic profile of FtsH-like proteins and sugarcane EST clusters.