Print version ISSN 1415-4757
Genet. Mol. Biol. vol.24 no.1-4 São Paulo Jan./Dec. 2001
Identification of metalloprotease gene families in sugarcane
O.H.P. Ramos and H.S. Selistre-de-Araujo*
Departamento de Ciências Fisiológicas, Universidade Federal de São Carlos, 13565-905 São Carlos, SP, Brazil.
Send correspondence to H.S. Selistre-de-Araujo. E-mail: email@example.com.
Metalloproteases play a key role in many physiological processes in mammals such as cell migration, tissue remodeling and processing of growth factors. They have also been identified as important factors in the patho-physiology of a number of human diseases, including cancer and hypertension. Many bacterial pathogens rely on proteases in order to infect the host. Several classes of metalloproteases have been described in humans, bacteria, snake venoms and insects. However, the presence and characterization of plant metalloproteases have rarely been described in the literature. In our research, we searched the sugarcane expressed sequence tag (SUCEST) DNA library in order to identify, by homology with sequences deposited in other databases, metalloprotease gene families expressed under different conditions. Protein sequences from Arabidopsis thaliana and Glycine max were used to search the SUCEST data bank. Conserved regions corresponding to different metalloprotease domains and sequence motifs were identified in the reads to characterize each group of enzymes. At least four classes of sugarcane metalloproteases have been identified, i.e. matrix metalloproteases, zincins, inverzincins, and ATP-dependent metalloproteases. Each enzyme class was analyzed for its expression in different conditions and tissues.
Metalloproteases (EC 3.4.24) are usually characterized by a catalytic zinc ion in the active site (Matrisian, 1990). Many metalloproteases have a conserved consensus sequence (HEXXH) involved in metal coordination, while others, including the mammalian matrix metalloproteases (MMPs) and the snake venom metalloproteases (svMPs), have an extended zinc-binding motif (HEXXHXXGXXH) (Stöcker and Bode, 1995). Matrix metalloproteases comprise a family of extracellular matrix-degrading proteases which play essential roles in the tissue remodeling that occurs during physiological processes including morphogenesis, wound healing and angiogenesis. MMPs degrade components of the extracellular matrix such as collagens, fibronectin and proteoglycans (Matrisian, 1992). The aberrant functioning of MMPs causes severe tissue damage, as observed in tumor invasion, metastasis and tissue ulceration (Shapiro, 1998).
Both MMPs and svMPs are multi-modular, secreted proteins with a highly conserved signal peptide and pro-enzyme domain, followed by a protease domain which includes the zinc binding motif. The pro-metalloprotease domain contains a conserved cysteine residue at its C-terminus which is bound to zinc at the active site to inactivate metalloprotease activity by a mechanism called the cysteine-switch (Grams et al., 1993). In addition to the catalytic domain, MMPs have a hemopexin-like domain, which has integrin-binding activity (Brooks et al., 1998) while some MMPs also have a domain for insertion in the membrane (MT-MMPs) (Shapiro, 1998). In the svMPs, a disintegrin domain which interferes with integrins (a class of membrane-bound cell adhesion molecules) replaces the hemopexin-like domain. The disintegrin domain inhibits the interaction of some integrins with their target molecules such as the extracellular matrix components, therefore inhibiting cell adhesion (Bjarnason and Fox, 1994).
Snake venom metalloproteases are related to the ADAMs (a disintegrin and metalloprotease), a large group of membrane-bound cell surface proteases that interfere with a number of integrins in several physiological processes such as fertilization, cell differentiation, and shedding of receptors (Killar et al., 1999). More than 23 ADAMs have been described in many different tissues from several sources including Drosophila melanogaster, Caenorhabditis elegans, and mammals (Stone et al., 1999). The ADAMs have a unique domain organization, including pro-domain, zinc-metalloprotease, disintegrin-like, cysteine-rich, transmembrane, and cytoplasmic domains. The disintegrin domain in ADAMs has been shown to mediate cell-cell interactions via integrin molecules, which is a crucial step in several physiological processes such as fertilization. The cytoplasmic domain has potential phosphorylation sites, suggesting a role in signal transduction. On the basis of this domain structure, these proteins are believed to mediate a variety of cellular functions including processing of precursor forms and growth factors, cell adhesion, fusion and signaling (Killar et al., 1999). Despite their importance in biological processes, similar proteins have not yet been described in plants.
Patel and Latterich (1998) have described a new family of ATP-dependent metalloproteases, the ATPases associated with diverse cellular activities (AAA), in bacteria, mammals and plants. This class of intracellular proteases has a zinc-binding motif characterized by a HEXGH sequence at the C-terminus and a conserved nucleotide-binding motif. These metalloproteases may be anchored to the cytoplasmic membrane via two transmembrane regions at the N-terminus in such a way that the long C-terminus is exposed to the cytoplasm. The AAA proteins participate in a variety of cellular functions, including cell-cycle regulation, protein degradation (including activity on transcription factors), organelle biogenesis and chaperone activity. In plants, homologous proteins are found in the thylakoid membranes and their expression is dependent on light.
In this work we searched the sugarcane expressed sequence tag (SUCEST) DNA library for metalloprotease families by homology with sequences deposited in other databases (e.g. GenBank). Using the open reading frames of the sequences found, we analyzed the domain organization and putative enzymatic activities of the predicted proteins. Conserved regions corresponding to the different domains of ADAMs, svMPs, MMPs and ATP-dependent metalloproteases, including the zinc-binding and disintegrin motifs, were used as tools for the database search. The main purpose was to identify coding sequences for such enzymes, which could have a role in sugarcane development and disease.
MATERIALS AND METHODS
The SUCEST database (http://sucest.lad.ic.unicamp.br/en) consists of almost 300,000 expressed sequence tags (ESTs) collected from cDNA libraries from different sugarcane tissues under different environmental conditions. Construction of cDNA libraries was carried out at the Center for Molecular Biology and Genetic Engineering (CBMEG, Campinas, SP, Brazil). Plants were grown at the Copersucar experimental station (Piracicaba, SP, Brazil) and at the CBMEG greenhouse. Seeds were collected from plants grown at the experimental station of Universidade Federal de Alagoas (Murici, AL, Brazil).
Searching for sugarcane metalloproteases
For manually annotated sequences, we searched for plant metalloproteases by their biochemical characteristics, whereas for automatically annotated sequences in international databases such as the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/) and The Institute for Genomic Research (TIGR, http://www.tigr.org/tdb/e2k1/ath1/) we used a keyword search. Protein sequences were preferred to nucleotide sequences because they are more conserved; nucleotide sequences were used only when no protein sequence could be found. Protein sequences from Arabidopsis thaliana (accession numbers gi:15238585, gi:15221678, gi:15218963, gi:15225398, gi:15230313, gi:15238333 and gi:15232715) and Glycine max (accession number gi:2827777), as well as nucleotide sequences from Sorghum bicolor (accession numbers BE365151, BE592495 and BE598830), that share homology with members of metalloprotease families were selected from data banks at NCBI and TIGR. These sequences were compared using the basic local alignment search tool (BLAST; Altschul et al., 1990) against SUCEST reads and the consensus sequences of clusters assembled with the Phrap software, with no cost for opening or extending a gap and scoring by the Blocks Substitution Matrix 62 (BLOSUM 62; Henikoff and Henikoff, 1992).
Sugarcane sequence hits with E values lower than e-50 were considered metalloproteases. Hits with E values lower than e-15 were also accepted as putative metalloproteases, provided they had the zinc-binding motif.
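The two-tier acceptance rule above can be sketched as a small function. This is only an illustration under stated assumptions: the function name `accept_hit`, its arguments, and the returned labels are hypothetical and do not come from the SUCEST pipeline, and the thresholds are read as 1e-50 and 1e-15.

```python
def accept_hit(e_value, has_zinc_motif):
    """Apply the two E-value thresholds described in the text.

    e_value        -- BLAST E value of the sugarcane hit
    has_zinc_motif -- True if a zinc-binding motif was found in the hit
    (Names and return labels are illustrative assumptions.)
    """
    if e_value < 1e-50:
        # Strong hits are accepted regardless of motif content.
        return "metalloprotease"
    if e_value < 1e-15 and has_zinc_motif:
        # Weaker hits need the zinc-binding motif as supporting evidence.
        return "putative metalloprotease"
    return "rejected"


if __name__ == "__main__":
    for e, motif in [(1e-60, False), (1e-20, True), (1e-20, False)]:
        print(e, motif, accept_hit(e, motif))
```

A hit with an intermediate E value is therefore kept only when the motif check supports it.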
Metalloproteases from animal sources, snake venom metalloproteases and members of the ADAMs family (accession numbers U18233, U86634, gi:17447343, gi:17435145, gi:13643723, gi:16156304, gi:14746383, gi:14194437, gi:1857673 and gi:1616601) were also compared against SUCEST reads using the BLAST program.
The SUCEST hits that satisfied the above conditions were translated in a reading frame aligned by BLAST and classified according to the presence of conserved motifs (Stöcker and Bode, 1995; Karata et al., 1999) and the sequence peculiarities of each metalloprotease subfamily (Hooper, 1994).
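As a rough sketch of this step, the snippet below translates a read in a given reading frame with the standard genetic code and assigns a class from the zinc-binding motifs named in this paper (HEXXHXXGXXH, HEXGH, HEXXH and HXXEH). The function names, the order in which the motifs are tested, and the use of the zinc-binding motif alone are assumptions made for illustration; the classification described here also relied on other conserved motifs and subfamily-specific features.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate a DNA string starting at offset `frame` (0, 1 or 2).

    Codons containing ambiguous bases are rendered as 'X'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Most specific patterns first, because HEXXH also matches the longer
# MMP motif and the HEXGH motif (this ordering is an assumption).
MOTIFS = [
    ("matrix metalloprotease-like", re.compile(r"HE..H..G..H")),
    ("ATP-dependent metalloprotease", re.compile(r"HE.GH")),
    ("zincin", re.compile(r"HE..H")),
    ("inverzincin", re.compile(r"H..EH")),
]


def classify(protein):
    """Return the first class whose zinc-binding motif occurs, else None."""
    for name, pattern in MOTIFS:
        if pattern.search(protein):
            return name
    return None


if __name__ == "__main__":
    peptide = translate("CATGAAGCTGGTCAT")  # encodes H-E-A-G-H
    print(peptide, classify(peptide))
```

Testing the longer patterns first keeps an MMP-like or HEXGH-containing read from being reported as a generic zincin.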
To search for homologous proteins from other sources, each metalloprotease consensus cluster was translated and compared against the NCBI database using the BLAST program, with 11 as the cost for opening a gap, 1 as the cost for extending a gap, and scoring by BLOSUM 62.
The expression profile of different classes of metalloproteases was found by counting the metalloprotease reads found in each library. To compare the results, the number of reads of each subfamily in one library was divided by the total number of reads, resulting in a frequency number.
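The frequency calculation can be sketched as follows. The sketch assumes that "the total number of reads" means the total number of reads in the same library; the function name, the data layout, and the library labels in the example are hypothetical.

```python
from collections import Counter


def expression_frequencies(read_classes, library_sizes):
    """Compute per-library read frequencies for each subfamily.

    read_classes  -- iterable of (library, subfamily) pairs, one per
                     metalloprotease read
    library_sizes -- dict mapping library name to its total read count
                     (assumed denominator; see the note above)
    """
    counts = Counter(read_classes)
    return {
        (library, subfamily): n / library_sizes[library]
        for (library, subfamily), n in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical libraries and counts, for illustration only.
    reads = [("LIB_A", "zincin"), ("LIB_A", "zincin"), ("LIB_B", "inverzincin")]
    sizes = {"LIB_A": 1000, "LIB_B": 500}
    for key, freq in expression_frequencies(reads, sizes).items():
        print(key, freq)
```

Dividing by library size makes counts comparable between libraries that were sequenced to different depths.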
RESULTS AND DISCUSSION
The data banks for A. thaliana, G. max, and S. bicolor were searched for protein sequences related to all the metalloprotease families. Some animal sequences were extracted from GenBank. Those included several members of the ADAMs family, svMPs with a disintegrin domain, and MMPs. The selected sequences from all databases were used to search the SUCEST database. About 300,000 ESTs were obtained from all sugarcane libraries. Reads with an E value lower than e-50 were considered to be metalloproteases, while reads and clusters with an E value between e-15 and e-50 were analyzed to identify the zinc-binding motif and other conserved motifs found in the different classes of metalloproteases. Based on these results, each read was classified into one of the following classes: zincins, matrix metalloprotease-like, inverzincins, and ATP-dependent metalloproteases (Table I and http://sucest.lad.ic.unicamp.br/private/mining-reports/SA/SA-mining.htm). Interestingly, searches in the SUCEST database performed with sequences from animal sources (e.g. ADAMs or svMPs) gave either no results or hits with high E values. Additional domains such as the disintegrin domain were not detected.
The AAA protein family is a distinct subfamily of ATPases, characterized by a highly conserved ATP-binding motif (the AAA motif). AAA proteins participate in a variety of cellular functions, including cell-cycle regulation, protein degradation, organelle biogenesis and vesicle-mediated protein transport (Patel and Latterich, 1998). Some members of the AAA protein family have a metalloprotease domain in addition to the ATPase domain. The best-characterized member of this class is the FtsH protein from Escherichia coli. FtsH is a membrane-bound ATP-dependent protease with a Zn2+-binding motif (HEAGH) at the C-terminus thought to be the catalytic center for proteolysis (Tomoyasu et al., 1993). Several FtsH substrates have been identified, such as the heat-shock transcription factor sigma-32 and the a subunit of the F0 complex of the H+-ATPase (Tomoyasu et al., 1995). FtsH also acts as a molecular chaperone, and such ATP-dependent proteases with intrinsic chaperone activity have been designated charonins (Schumann, 1999).
ATP-dependent metalloproteases were found to be intensively expressed in various sugarcane tissues collected under different conditions of growth (Figure 1). In the translated sequences it was possible to identify several conserved motifs such as the Walker A and B motifs for ATP-binding activity and the second region of homology (SRH) which has ATPase activity and is characteristic of members of the AAA protein family (Table I). In sugarcane the physiological function of individual members of the AAA protein family remains unclear although stress does not, apparently, induce the expression of ATP-dependent metalloproteases (Figure 1).
An indication of the high level of conservation of the AAA protein family can be seen in Table II. The NCBI search for proteins homologous to the SUCEST reads resulted in very low E values for proteins from organisms as diverse as Synechocystis species, Saccharomyces cerevisiae, Trypanosoma brucei, Plasmodium falciparum, Caenorhabditis elegans, Nicotiana tabacum, Xenopus laevis, Gallus gallus, Drosophila melanogaster, Rattus species and Homo sapiens. The table also shows that the functions assigned to these proteins were divergent.
ZINC-ENDOPEPTIDASES OR METALLOPROTEASES (Zincins)
Metalloproteases are generally characterized by a catalytic zinc ion in their active site. The highly conserved consensus sequence HEXXH is involved in metal ligation. The two histidines of this sequence motif serve as ligands to the zinc, whereas the glutamic acid is believed to transfer hydrogen atoms and to increase the polarization of the zinc-bound water molecule for nucleophilic attack on the scissile peptide bond of the substrate (Stöcker and Bode, 1995).
Enzymes having the consensus sequence HEXXH were classified as zincins (Hooper, 1994). The expression of zincins was observed in growing plantlets and seeds. The search for homologous proteins from other sources at NCBI gave no significant results.
Matrix metalloproteases are usually undetectable in animal cells under normal circumstances but they are prominently expressed during a variety of biological processes such as morphogenesis, angiogenesis, wound healing and tumor invasion (Shapiro, 1998). Because aberrant function of MMPs results in severe tissue damage, the activities of MMPs are regulated at multiple steps such as gene expression, secretion, activation of pro-enzymes, and inhibition by specific inhibitors (Matrisian, 1990). In mammals, several classes of MMPs have been described, such as the collagenases (MMP-1, MMP-8, and MMP-13), stromelysins (MMP-3 and MMP-10) and gelatinases (MMP-2 and MMP-9), among others. The zinc-binding consensus sequence for this protein family is HEXXHXXGXDHS (Hooper, 1994). MMPs have an additional hemopexin-like domain whose function is not entirely understood.
MMP-like proteins were identified in sugarcane due to the presence of some conserved motifs such as the cysteine-switch consensus sequence and the zinc-binding motif (Table I). In some clones, other conserved motifs in the MMP family such as the Met-turn and the calcium-binding motif were also identified. The cysteine-switch motif was identified in one of the reads, suggesting that the protein is synthesized as a pro-enzyme.
A BLAST search at NCBI revealed that the only MMP-like cluster found shows slight identity with plant, amphibian, and mammalian proteins (Table II). Despite the striking differences between the animal and plant extracellular matrix, very similar enzymes degrade them. It has been demonstrated that a metalloprotease from Xanthomonas campestris specifically degrades proline/hydroxyproline-rich glycoproteins of the plant extracellular matrix (Dow et al., 1998). This class of enzyme occurred mainly in growing plantlets and seeds (Figure 1).
A few metalloproteases are characterized by an inverted zinc-binding motif HXXEH. Members of this family include the human, rat and Drosophila insulin-degrading enzymes known as insulinases (Hooper, 1994).
Several inverzincins were expressed in sugarcane (Table I) and this was the only class of metalloprotease (excluding the ATP-dependent metalloproteases, which have completely different functions) observed in sugarcane flower-stems or in sugarcane callus-tissue under light and temperature stress (Figure 1). Inverzincins were also expressed in the leaf-root transition zone and the stem-bark of immature plants. The only inverzincin cluster found showed homology with some pathogenic bacteria, insect, and human proteins (Table II).
PLANT METALLOPROTEASES IN GENERAL
Plant metalloproteases are less studied than human, bacterial or snake venom metalloproteases and there are only a few reports on the isolation and characterization of these enzymes in plants (Eriksson and Glaser, 1992). A dimeric, 17-kDa metalloprotease has been isolated from the endosperm of sorghum seedlings, the dimer consisting of two 8-kDa subunits linked by disulfide bonds (Macedo et al., 1999).
More information on the primary structure of plant metalloproteases is being obtained from DNA databases, using the conserved zinc-binding motif found in other organisms, than from protein characterization. A cDNA sequence coding for a soybean leaf metalloprotease has been completely characterized by Pak et al. (1997). The protein seems to be synthesized as a preproenzyme, with the consensus sequence for proenzyme activation by the cysteine-switch mechanism (Grams et al., 1993). Northern and western blotting analyses have shown that the metalloprotease transcript and protein are under a strict developmental program and that both are expressed only in leaf tissue and in a temporally regulated manner. The protein is secreted but a portion of the mature form is tightly bound to the cell wall (Pak et al., 1997).
Related proteins such as the ADAMs and additional domains were not found in the SUCEST database, although it may be possible that these and other proteins were not expressed under the conditions used for the construction of the libraries. Another possibility is that in many cases the ESTs are related to incomplete sequences, the full length sequences being necessary for the complete protein characterization.
In conclusion, we have identified at least four different classes of metalloproteases in sugarcane, which were analyzed for their expression in various tissues under different conditions. The exact role of these proteins in plant development and disease remains to be elucidated.
Metalloproteases play important roles in many physiological processes in mammals, such as cell migration, tissue remodeling and processing of growth factors. These enzymes are also involved in the pathophysiology of a large number of human diseases, such as hypertension and cancer. Many pathogenic bacteria depend on proteases to infect the host. Several classes of metalloproteases have been described in humans, bacteria, snake venoms and insects. However, the presence and characterization of metalloproteases in plants are poorly described in the literature. In this work, the sugarcane expressed sequence tag (SUCEST) cDNA library was searched to identify, by homology with sequences deposited in other databases, metalloprotease gene families expressed under different conditions. Protein sequences from Arabidopsis thaliana and Glycine max and nucleotide sequences from Sorghum bicolor were used. Conserved regions corresponding to the different metalloprotease domains and sequence motifs were identified in the sugarcane cDNAs to characterize each group of enzymes. At least four classes of metalloproteases were identified in sugarcane, namely matrix metalloproteases, zincins, inverzincins and ATP-dependent metalloproteases. Each of these classes was analyzed for its expression under the different conditions and in the tissues used to construct the cDNA libraries.
This work was supported by the Brazilian agency FAPESP, grant number 00/07426-3.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990). Basic local alignment search tool. J. Mol. Biol. 215: 403-410.
Bjarnason, J.B. and Fox, J.W. (1994). Hemorrhagic metalloproteinases from snake venoms. Pharmac. Ther. 62: 325-372.
Brooks, P.C., Siletti, S., von Schalscha, T.L., Friedlander, M. and Cheresh, D.A. (1998). Disruption of angiogenesis by PEX, a noncatalytic metalloproteinase fragment with integrin binding activity. Cell 92: 391-400.
Dow, J.M., Davies, H.A. and Daniels, M.J. (1998). A metalloprotease from Xanthomonas campestris that specifically degrades proline/hydroxyproline-rich glycoproteins of the plant extracellular matrix. Mol. Plant-Microbe Interact. 11: 1085-1093.
Eriksson, A.C. and Glaser, E. (1992). Mitochondrial processing proteinase - A general processing proteinase of spinach leaf mitochondria is a membrane-bound enzyme. Biochim. Biophys. Acta 1140: 208-214.
Grams, F., Huber, R., Kress, L.F., Moroder, L. and Bode, W. (1993). Activation of snake venom metalloproteinases by a cysteine-switch like mechanism. FEBS Lett. 335: 76-80.
Henikoff, S. and Henikoff, J.G. (1992). Amino acid substitution matrices from protein blocks. Proceedings of the National Academy of Sciences USA 89: 10915-10919.
Hooper, N.M. (1994). Families of zinc metalloproteases. FEBS Lett. 354: 1-6.
Karata, K., Inagawa, T., Wilkinson, A.J., Tatsuta, T. and Ogura, T. (1999). Dissecting the role of a conserved motif (the second region of homology) in the AAA family of ATPases. J. Biol. Chem. 274: 26225-26232.
Killar, L., White, J., Black, R. and Peschon, J. (1999). Adamalysins - A family of metzincins including TNF-alpha converting enzyme (TACE). Annals New York Acad. Sci. 878: 442-452.
Macedo, I.Q., Marques, P. and Delgadillo, I. (1999). Purification and characterization of a novel plant metalloproteinase. Biotech. Tech. 13: 677-680.
Matrisian, L.M. (1990). Metalloproteinases and their inhibitors in matrix remodeling. Trends Genet. 6: 121-125.
Matrisian, L.M. (1992). The matrix-degrading metalloproteinases. Bioessays 14: 455-463.
Pak, J.H., Liu, C.Y., Huangpu, J. and Graham, J.S. (1997). Construction and characterization of the soybean leaf metalloproteinase cDNA. FEBS Lett. 404: 283-288.
Patel, S. and Latterich, M. (1998). The AAA team: related ATPases with diverse functions. Trends Cell Biol. 8: 65-71.
Schumann, W. (1999). FtsH - a single-chain charonin? FEMS Microbiol. Rev. 23: 1-11.
Shapiro, S.D. (1998). Matrix metalloproteinase degradation of extracellular matrix: biological consequences. Curr. Op. Cell Biol. 10: 602-608.
Stöcker, W. and Bode, W. (1995). Structural features of a superfamily of zinc-endopeptidases: the metzincins. Curr. Op. Struct. Biol. 5: 383-390.
Stone, A.L., Kroeger, M. and Sang, Q.X.A. (1999). Structure-function analysis of the ADAM family of disintegrin-like and metalloprotease-containing proteins (review). J. Prot. Chem. 18: 447-465.
Tomoyasu, T., Yuki, T., Morimura, S., Mori, H., Yamanaka, K., Niki, H., Hiraga, S. and Ogura, T. (1993). The Escherichia coli FtsH protein is a prokaryotic member of a protein family of putative ATPases involved in membrane functions, cell-cycle control, and gene-expression. J. Bacteriol. 175: 1344-1351.
Tomoyasu, T., Gamer, J., Bukau, B., Kanemori, M., Mori, H., Rutman, A.J., Oppenheim, A.B., Yura, T., Yamanaka, K., Niki, H., Hiraga, S. and Ogura, T. (1995). Escherichia coli FtsH is a membrane-bound, ATP-dependent protease which degrades the heat-shock transcription factor sigma (32). Embo J. 14: 2551-2560.