Abstract
Almost identical polyglutamine-containing proteins with unknown structures have been found in human, mouse and rat genomes (GenBank AJ277365, AF525300, AY879229). We infer that an identical really interesting new gene (RING) finger domain is located in each C-terminal segment. A three-dimensional (3-D) model was generated by remote homology modeling and the functional implications are discussed. The model consists of 65 residues from terminal position 707 to 772 of the human protein with a total length of 796 residues. The 3-D model predicts a ubiquitin-protein ligase (E3) as a binding site for ubiquitin-conjugating enzyme (E2). Both enzymes are part of the ubiquitin pathway that labels unwanted proteins for subsequent enzymatic degradation. Molecular contact specificities are suggested both for substrate recognition and for the residues at the possible E2-binding surface. The predicted structure of a ubiquitin-protein ligase (E3, enzyme class number 6.3.2.19, CATH code 3.30.40.10.4) may help to explain the process of ubiquitination. The 3-D model supports the idea of a C3HC4-RING finger with a partially new pattern. The putative E2-binding site is formed by a shallow hydrophobic groove on the surface adjacent to the helix and one zinc finger (L722, C739, P740, P741, R744). Solvent-exposed hydrophobic amino acids lie around both zinc fingers (I717, L722, F738, or P765, L766, V767, V733, P734). The 3-D structure was deposited in the protein databank theoretical model repository (2B9G, RCSB Protein Data Bank, NJ).
Key words: E3 ubiquitin-protein ligase, RING finger structure, Low homology model
Braz J Med Biol Res, March 2007, Volume 40(3) 293-299
In silico analysis identifies a C3HC4-RING finger domain of a putative E3 ubiquitin-protein ligase located at the C-terminus of a polyglutamine-containing protein
T. Scior1, F. Luna1, W. Koch2 and J.F. Sánchez-Ruiz2
1Departamento de Farmacia, Facultad de Ciencias Químicas, Benemérita Universidad Autónoma de Puebla, Puebla, México
2Facultad de Estudios Superiores, Zaragoza, Mexico-City, México
Introduction
The number of new genes that encode proteins with unknown structure and function is steadily growing. Three-dimensional (3-D) structure modeling is a computational tool used to narrow this gap. Computing the protein structure of new sequences provides a better understanding of biological functions and indicates directions for new biochemical and cellular experiments. We present a polyglutamine-containing protein of unknown structure present in the human, mouse and rat genomes. We call the human protein the "target" used to build our 3-D model. In the present study, we establish a theoretical structure of a ubiquitin-protein ligase (E3) that belongs to the enzymatic pathway of ubiquitin-mediated degradation of unwanted proteins in living cells. E3 binds to the unwanted protein to be labeled by ubiquitin prior to destruction. E3 is also associated with a ubiquitin-conjugating enzyme (E2)-ubiquitin complex so that the ubiquitination of the E3-bound protein takes place. Thereafter the ubiquitin-labeled protein is destroyed by proteolysis. The functional part of E3 contains a really interesting new gene (RING) finger, which is composed of cysteines and histidines complexed to two central zinc atoms and is believed to mediate protein-protein interactions.
Material and Methods
Sequence alignments characterize our target protein unambiguously as containing a RING finger motif (1,2). The total percent similarity between templates and target was 16% (identity matrix), while the homology was 23% (Dayhoff matrix) (3). Web-based automated homology modeling of proteins by the Molecular Operating Environment (MOE), Swiss-model or CPH servers (last visit, September 2005) is difficult because adequate templates are needed with reliable identity scores above a length-dependent threshold, i.e., 25 to 35% for an average domain length of 120 to 80 residues (4,5). We present a manually built remote homology model of a C3HC4-type RING finger domain of an E3 ubiquitin-protein ligase. The computational study was conducted with two sequence search tools, PSI-BLAST against the Swiss-Prot database, and threading techniques within the Swiss-PdbViewer (SPV, DeepView) (6). Side chain conformations were predicted using SCWRL (7). Later, during the loop and RING construction steps, the geometry was locally refined under AMBER-94 force field conditions using MOE software (4). The final 3-D model was chosen after evaluating the stereochemical quality with VERIFY3D, PROCHECK and WHAT_CHECK at the structure analysis and verification server (SAVS) (8-10).
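To make the identity score concrete, the following minimal Python sketch counts identical residues over the aligned, gap-free columns of a pairwise alignment. The function name `percent_identity` and the two toy sequences are hypothetical illustrations, not the target and template sequences of this study, and the choice of denominator (columns without gaps) is an assumption; other conventions divide by the length of the shorter sequence or of the full alignment and give different percentages.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns that hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment (NOT the real target/template sequences).
toy_target = "CPIC-LEKHA"
toy_template = "CAVCALDRH-"

print(f"identity: {percent_identity(toy_target, toy_template):.1f}%")
```

A score computed this way can then be compared with the length-dependent threshold quoted above to judge whether automated modeling is likely to succeed.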
The three query sequences were accessed through GenBank (11) at the National Center for Biotechnology Information, NCBI (accession: AJ277365, AF525300, AY879229) and homologous sequences were found with FASTA and BLAST (12,13). The template candidates were retrieved at Protein Data Bank Brookhaven (PDB) (14). Lower homology searches were performed by using PSI-BLAST (14). In order to select adequate candidates, multiple sequence alignments were conducted with Clustal W (15). Alignment parameters were set as follows: i) pairwise alignment with Ktuple, 1; gap penalty, 3; windows, 5, and diagonals saved, 5; ii) multiple alignment with gap penalty, 10, and gap length penalty, 10. The higher PAM (250) and lower BLOSUM (45) weight matrices were also used. Structural fitting data were accessed through Swiss-Model SPV programs. More than 30,000 protein structures deposited at PDB (14) were screened in order to find suitable 3-D templates (Table 1). Sequence identities were calculated and alignments manually edited to accommodate the single template modeling around the core motif (double zinc finger), while for the remainder a multiple template approach was used (Table 1). The structure file 1IYM represents the structural data of a RING-H2 finger motif (C3H2C3-type). The overall backbone and one zinc finger (C3H-type) were resolved using this single template. The loop and the other zinc finger (C4-type) were adopted using multiple template construction, i.e., 1BOR, 1FBV_A, 1LDJ_B, and 1E4U_A as more distantly related sequences (16-27).
Results and Discussion
The sequence showing the highest degree of sequence homology with the query was obtained with the PSI-BLAST option for low homology, which is available at the NCBI (http://www.ncbi.nlm.nih.gov/BLAST/) (28). Multiple sequence alignments as prerequisites for the homology-based structure construction were performed with the on-line tool Clustal W, which is accessible at the Bio-web server Pasteur (http://bioweb.pasteur.fr/seqanal/interfaces/clustalw.html) (15).
Since neither MOE's built-in automated structural modeling tool nor the Swiss-model server had predicted a homology model, we decided to use a manual procedure. First, the spatial coordinates of the 1IYM structure (22), which showed the highest sequence homology with the query (23% after manual editing, Table 1), were used as a rigid scaffold for the main part of the query. The side chains were interchanged (mutation simulation) to reflect the target sequence (Figure 1). Second, the original conformations were kept at identical positions, conserved positions were mutated preferentially respecting the template orientation, whereas non-conserved side chain conformations were locally refined. Side chain conformations were usually constructed by means of molecular mechanics with empirical energy refinements towards local minima. Alternatively, the web-based SCWRL server or its download version could be used to predict conformations (7). The loop database of SPV was also consulted before completing the atom-scale model with MOE tools under AMBER-94 force-field conditions (29). Due to the high degree of local conservation, especially with respect to the amino acids forming the structural RING motif, the template backbone fold was retained for the entire length of 64 residues. Third, amino acids which are possible salt bridge partners due to charge and position were searched among the remaining residues. If correlated occurrences of these amino acids were detected from the multiple sequence alignment produced with Clustal W (15), they were modeled as partners of a salt bridge. Finally, the remaining amino acids were placed to create optimal steric and electrostatic interactions in the proximity. Both RING finger geometries were adopted from their templates and kept rigid, like the backbone.
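The geometric half of the salt bridge search in the third step can be illustrated with a small Python sketch. It is not the procedure used in this study, which also relied on correlated occurrences in the multiple sequence alignment; it only lists oppositely charged residue pairs whose charged groups lie within a distance cutoff. The function name `salt_bridge_candidates`, the 4.0 Å cutoff and all coordinates are assumptions chosen for illustration; the coordinates are invented and do not come from the 2B9G model.

```python
import math
from itertools import combinations

POSITIVE = {"ARG", "LYS"}
NEGATIVE = {"ASP", "GLU"}


def charge(resname: str) -> int:
    """Formal side-chain charge used for pairing (histidine ignored here)."""
    if resname in POSITIVE:
        return 1
    if resname in NEGATIVE:
        return -1
    return 0


def salt_bridge_candidates(residues, cutoff=4.0):
    """Return oppositely charged residue pairs closer than `cutoff` (in Å).

    `residues` is a list of (name, number, (x, y, z)) tuples, where the
    coordinate stands for the position of the charged group.
    """
    pairs = []
    for (n1, i1, c1), (n2, i2, c2) in combinations(residues, 2):
        if charge(n1) * charge(n2) == -1 and math.dist(c1, c2) <= cutoff:
            pairs.append((f"{n1}{i1}", f"{n2}{i2}"))
    return pairs


# Invented coordinates for illustration only (not taken from the model).
toy_residues = [
    ("ARG", 744, (0.0, 0.0, 0.0)),
    ("GLU", 745, (3.0, 0.0, 0.0)),
    ("LYS", 763, (20.0, 0.0, 0.0)),
    ("LEU", 722, (1.0, 0.0, 0.0)),
]

print(salt_bridge_candidates(toy_residues))
```

Pairs returned by such a filter would still need to be checked against the alignment before being modeled as salt bridge partners.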
A C3HC4 zinc finger was identified within the C-terminal domain. For substrate recognition on the solvent-exposed surface the following hydrophobic residues are identified: ILE717, LEU722, PHE738 around the Zn atom 1, PRO765, LEU766, VAL767, VAL733, and PRO734 around the Zn atom 2, and PHE740, PRO741 at the end of the alpha-helix. The geometry optimization was performed with the AMBER-94 protocol in MOE (4) running on a Linux-PC using the Kollman all atom force field (29). The structures were solvated in a water shell and the geometry was optimized by 5000 steps of conjugate gradient energy minimization. A final molecular dynamics simulation under the MMFF94 (30) force field was performed by heating to 300 K (NVE equilibration phase), and thermal fluctuation became minimal during a 300-picosecond simulation with temperature control (NVT equilibrium). Finally, the constructed model passed the statistical quality tests for proteins at SAVS (Figure 2) (8). The atom-scale model for the putative E3 segment was deposited in the RCSB theoretical 3-D model repository: http://deposit.rcsb.org/depoinfo/depotools.html (Figure 3). No homology modeling has been devised for the substrate binding site due to lack of templates.
The templates belong to two different subfamilies with C3HC4- and C3H2C3-RING domains, respectively. The former type is clearly related to the cysteine/histidine pattern of the target. The target's RING finger domain pattern, C-x(2)-C-x(11)-C-x(5)-H-x(2)-C-x(2)-C-x(18)-C-x(5)-C, is quite similar to the well-established pattern, C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. The conserved cysteine and histidine positions are the same in both patterns, and the target's spacers x(11), x(2) and x(18) fall within the established ranges. The two new positions of the target are the x(5) spacers, which replace x(1 to 3) before the histidine and x(2) between the last two cysteines.
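The comparison of the two patterns can be reproduced with a short Python sketch that translates the pattern notation used above into regular expressions. The helper `pattern_to_regex` and the synthetic test sequence are hypothetical; the sequence is built only from the target pattern with alanine fillers and is not the AJ277365 sequence.

```python
import re


def pattern_to_regex(pattern: str) -> str:
    """Translate e.g. 'C-x(2)-C-x(9 to 39)-C' into a regular expression."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        spacer = re.fullmatch(r"x\((\d+)(?:to(\d+))?\)", element)
        if spacer:
            low, high = spacer.group(1), spacer.group(2)
            # x(n) -> exactly n residues; x(n to m) -> between n and m
            parts.append(f".{{{low},{high}}}" if high else f".{{{low}}}")
        elif re.fullmatch(r"[A-Z]", element):
            parts.append(element)  # literal residue such as C or H
        else:
            raise ValueError(f"unrecognized pattern element: {element!r}")
    return "".join(parts)


TARGET = "C-x(2)-C-x(11)-C-x(5)-H-x(2)-C-x(2)-C-x(18)-C-x(5)-C"
ESTABLISHED = ("C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)"
               "-C-x(2)-C-x(4 to 48)-C-x(2)-C")

target_re = pattern_to_regex(TARGET)
established_re = pattern_to_regex(ESTABLISHED)

# Synthetic sequence following the target spacing (alanine as filler).
synthetic = ("C" + "A" * 2 + "C" + "A" * 11 + "C" + "A" * 5 + "H" + "A" * 2
             + "C" + "A" * 2 + "C" + "A" * 18 + "C" + "A" * 5 + "C")

print(bool(re.search(target_re, synthetic)))       # target pattern
print(bool(re.search(established_re, synthetic)))  # established pattern
```

In this sketch the synthetic sequence satisfies the target pattern but not the established one, because the two x(5) spacers lie outside the established ranges, which is the sense in which the target pattern is partially new.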
The known protein structure 1IYM (22) is a RING-H2 (C3H2C3) finger protein and constitutes the highest scoring homologue (23%). Although it was used as a single template construction for the main part, other templates were also considered to copy fold information. Further biochemical analysis of the electrostatic surface shows that this RING finger domain presents a remarkable charge compensation by positively or negatively charged counter residues (GLU720-ARG721-GLU723, LYS737-ASP724, ARG744-GLU745, GLU762-LYS763, -ARG744). The whole picture is graphically illustrated in Figure 3 and shows a ligase classifiable as EC 6.3.2.19 (ubiquitin protein-ligase) or CATH 3.30.40.10.4. We identified five residues at the possible E2-binding surface in good agreement with the E2-binding site reported for template 1IYM (22). They lie in a shallow hydrophobic groove on the surface between the helix and the zinc finger: ARG744/ARG176, PRO741/PRO173, PHE740/TRP165, CYS739/CYS161, LEU722/VAL136, for target/template, respectively. The RING finger region is a protein interaction domain, normally 60 residues in length. The recognition may be assisted by solvent-exposed hydrophobic residues: ILE717, LEU722, and PHE738 near Zn atom 1, or PRO765, LEU766, VAL767, VAL733, and PRO734 near coordination center Zn2.
The RING structure, more than 60 residues long, was deposited in the PDB theoretical model repository (2B9G). It provides a ready-to-use starting point for further molecular characterization of ubiquitination and helps to address the present lack of such structures in PDB entries (last visit July 2006).
Figure 1. Edited sequence alignment with best scoring template 1IYM for homology modeling. First and last lines: labeling of the zinc finger motifs indicates to which cation the amino acid belongs, "Zn1" and "Zn2", respectively. The RING finger motif is composed of a C3HC4-type domain, i.e., two sets of four amino acids each coordinate the Zn atom 1 (C731, C759, C765, H737) and Zn atom 2 (C716, C719, C740, C743). The template is classified as a RING-H2 finger protein (C3H2C3). Despite the different cysteine/histidine pattern the target shows a C3HC4-type RING finger which is clearly related. The automated alignment with the Clustal W program was manually corrected, improving the homology score (15). The target sequence shows a similarity score of 23% with the template sequence. The third sequence (letters printed in black) indicates the computationally generated 3-D model with over 60 resolved amino acids. The 3-D model spans positions 706 to 772 of the AJ277365 human sequence.
Figure 2. Graphic quality assessment at the structure analysis and verification server (http://nihserver.mbi.ucla.edu/SAVS/) of the generated 3-dimensional (3-D) model (2B9G.pdb at http://deposit.rcsb.org/depoinfo/depotools.html) by Ramachandran diagram plots (31). These plots report permitted and forbidden phi-psi angle combinations for the 3-D model (left) of target AJ277365 polyglutamine-containing protein and template 1IYM_A (right) (9-11). The target structure plot shows an acceptable overall geometrical quality and is not significantly worse than its template. The values range from -180 to 180 degrees for phi (x-axis) and psi (y-axis) torsions. SAVS returns a good 3-D to 1-D profile with scores above 0.2 and never below 0. Pairs of stereochemical results for template and target were: chi1-chi2 plots, 1/1; residue properties: max. deviation, 4.0/8.4; bad contacts, 18/13; main chain bond lengths not within 4.8/1.9% limits; main chain bond angles not within 0.8/15.9% limits; planar groups not within 0/1 amino acid limits.
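As background to the Ramachandran plots, the phi and psi values are torsion angles defined by four consecutive backbone atoms (C of the preceding residue, N, CA, C for phi; N, CA, C, N of the following residue for psi). The Python sketch below computes such a torsion angle from four points and returns a value between -180 and 180 degrees. It is a generic illustration, not the code used by SAVS or PROCHECK; the function name `dihedral_deg` and the toy coordinates are hypothetical and are not atom positions from the model.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle (degrees, -180 to 180) around the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy points (not real atom coordinates): cis, gauche-like and trans cases.
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
quarter = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))

print(cis, quarter, trans)
```

Using `atan2` rather than an inverse cosine keeps the sign of the angle, which is what places a residue in the left or right half of the plot.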
Figure 3. Schematic drawing of the complete sequence. The RING finger domain of the ubiquitin-protein ligase (E3) is located at the C-terminus of AJ277365, to the right. The sequence starts with a low complexity (repetitive residues) segment. The generated 3-D model of the RING finger part is graphically depicted with its backbone folds. The secondary structures were predicted with Jpred (32). The helix was predicted with AGADIR (33). The numbers identify the amino acid positions in AJ277365 polyglutamine-containing protein. The E3 ligase is involved in ubiquitination processes (34-37). Three-dimensional presentation (ribbon and atomic display) of essential amino acids C731, C759, C765, H737 and C716, C719, C740, C743 found in the C3HC4-RING finger. The central helical wheel is colored red, the loops dark blue. Color code for atoms displayed in a space filling manner: Zn, green; C, white; H, omitted; O, red; N, blue; S, yellow. Ribbon presentation of the protein: N-terminal end, above; C-terminal end, middle right (4).
Acknowledgments
The authors wish to thank Sergio R. Ojeda, Division of Neuroscience, Oregon National Primate Research Center, Oregon Health & Science University, Beaverton, OR, USA, for sequence contribution and discussion.
Address for correspondence: T. Scior, Departamento de Farmacia, Facultad de Ciencias Químicas, Benemérita Universidad Autónoma de Puebla, Ciudad Universitaria, Edificio 139, 14 Sur Con Avenida San Claudio, C.P. 72570, Colonia San Manuel, Puebla, Mexico. E-mail: tscior@siu.buap.mx
Received April 24, 2006. Accepted September 19, 2006.
References
- 1. Lorick KL, Jensen JP, Fang S, Ong AM, Hatakeyama S, Weissman AM. RING fingers mediate ubiquitin-conjugating enzyme (E2)-dependent ubiquitination. Proc Natl Acad Sci U S A 1999; 96: 11364-11369.
- 2. Borden KL, Freemont PS. The RING finger domain: a recent example of a sequence-structure family. Curr Opin Struct Biol 1996; 6: 395-401.
- 3. Dayhoff MO, Schwartz RM, Orcutt BC. A model of evolutionary change in proteins. In: Dayhoff MO (Editor), Atlas of protein sequence and structure Silver Spring: National Biomedical Research Foundation; 1978. p 345-352.
- 4. The molecular operating environment, MOE [Computer program]. Montreal: Chemical Computing Group Inc.; 2004.
- 5. Lund O, Frimand K, Gorodkin J, Bohr H, Bohr J, Hansen J, et al. Protein distance constraints predicted by neural networks and probability density functions. Protein Eng 1997; 10: 1241-1248.
- 6. Guex N, Peitsch MC. SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 1997; 18: 2714-2723.
- 7. Canutescu AA, Shelenkov AA, Dunbrack RL Jr. A graph-theory algorithm for rapid protein side-chain prediction. Protein Sci 2003; 12: 2001-2014.
- 8. Structure analysis and verification server, online: with Ramachandran plots and WHAT_CHECK and PROCHECK options. http://nihserver.mbi.ucla.edu/SAVS/
- 9. Hooft RW, Vriend G, Sander C, Abola EE. Errors in protein structures. Nature 1996; 381: 272.
- 10. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst 1993; 26: 283-291.
- 11. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank. Nucleic Acids Res 2005; 33: D34-D38.
- 12. Pearson WR. Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol 1990; 183: 63-98.
- 13. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990; 215: 403-410.
- 14. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The protein data bank. Nucleic Acids Res 2000; 28: 235-242.
- 15. Thompson JD, Higgins DG, Gibson TJ. Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994; 22: 4673-4680.
- 16. Zheng N, Schulman BA, Song L, Miller JJ, Jeffrey PD, Wang P, et al. Structure of the Cul-1-Rbx1-Skp1-F box(Skp2) SCF ubiquitin ligase complex. Nature 2002; 416: 703-709.
- 17. Hanzawa H, de Ruwe MJ, Albert TK, van Der V, Timmers HT, Boelens R. The structure of the C4C4 ring finger of human NOT4 reveals features distinct from those of C3HC4 RING fingers. J Biol Chem 2001; 276: 10185-10190.
- 18. Brzovic PS, Rajagopal P, Hoyt DW, King MC, Klevit RE. Structure of a BRCA1-BARD1 heterodimeric RING-RING complex. Nat Struct Biol 2001; 8: 833-837.
- 19. Bellon SF, Rodgers KK, Schatz DG, Coleman JE, Steitz TA. Crystal structure of the RAG1 dimerization domain reveals multiple zinc-binding motifs including a novel zinc binuclear cluster. Nat Struct Biol 1997; 4: 586-591.
- 20. Zheng N, Wang P, Jeffrey PD, Pavletich NP. Structure of a c-Cbl-UbcH7 complex: RING domain function in ubiquitin-protein ligases. Cell 2000; 102: 533-539.
- 21. Barlow PN, Luisi B, Milner A, Elliott M, Everett R. Structure of the C3HC4 domain by 1H-nuclear magnetic resonance spectroscopy. A new structural class of zinc-finger. J Mol Biol 1994; 237: 201-211.
- 22. Katoh S, Hong C, Tsunoda Y, Murata K, Takai R, Minami E, et al. High precision NMR structure and function of the RING-H2 finger domain of EL5, a rice protein whose expression is increased upon exposure to pathogen-derived oligosaccharides. J Biol Chem 2003; 278: 15341-15348.
- 23. Babu CR, Flynn PF, Wand AJ. Validation of protein structure from preparations of encapsulated proteins dissolved in low viscosity fluids. J Am Chem Soc 2001; 123: 2691-2692.
- 24. Borden KL, Boddy MN, Lally J, O'Reilly NJ, Martin S, Howe K, et al. The solution structure of the RING finger domain from the acute promyelocytic leukaemia proto-oncoprotein PML. EMBO J 1995; 14: 1532-1541.
- 25. Pascual J, Martinez-Yamout M, Dyson HJ, Wright PE. Structure of the PHD zinc finger from human Williams-Beuren syndrome transcription factor. J Mol Biol 2000; 304: 723-729.
- 26. Capili AD, Schultz DC, Rauscher FJ III, Borden KL. Solution structure of the PHD domain from the KAP-1 corepressor: structural determinants for PHD, RING and LIM zinc-binding domains. EMBO J 2001; 20: 165-177.
- 27. Gervais V, Busso D, Wasielewski E, Poterszman A, Egly JM, Thierry JC, et al. Solution structure of the N-terminal domain of the human TFIIH MAT1 subunit: new insights into the RING finger family. J Biol Chem 2001; 276: 7457-7464.
- 28. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997; 25: 3389-3402.
- 29. Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM, Ferguson DM, et al. A second generation force-field for the simulation of proteins, nucleic-acids, and organic molecules. J Am Chem Soc 1995; 117: 5179-5197.
- 30. Halgren TA. Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. J Comput Chem 1996; 17: 490-519 (520-552, 553-586, 587-615, 616-641).
- 31. Ramachandran GN, Ramakrishnan C, Sasisekharan V. Stereochemistry of polypeptide chain configurations. J Mol Biol 1963; 7: 95-99.
- 32. Cuff JA, Barton GJ. Evaluation and improvement of multiple sequence methods for protein secondary structure prediction. Proteins 1999; 34: 508-519.
- 33. Munoz V, Serrano L. Development of the multiple sequence approximation within the AGADIR model of alpha-helix formation: comparison with Zimm-Bragg and Lifson-Roig formalisms. Biopolymers 1997; 41: 495-509.
- 34. Rampazzo A, Pivotto F, Occhi G, Tiso N, Bortoluzzi S, Rowen L, et al. Characterization of C14orf4, a novel intronless human gene containing a polyglutamine repeat, mapped to the ARVD1 critical region. Biochem Biophys Res Commun 2000; 278: 766-774.
- 35. Hatakeyama S, Nakayama KI. U-box proteins as a new family of ubiquitin ligases. Biochem Biophys Res Commun 2003; 302: 635-645.
- 36. Hagglund R, Roizman B. Characterization of the novel E3 ubiquitin ligase encoded in exon 3 of herpes simplex virus-1-infected cell protein 0. Proc Natl Acad Sci U S A 2002; 99: 7889-7894.
- 37. Pickart CM. Mechanisms underlying ubiquitination. Annu Rev Biochem 2001; 70: 503-533.
Publication Dates
Publication in this collection: 18 Feb 2007
Date of issue: Mar 2007
History
Received: 24 Apr 2006
Accepted: 19 Sept 2006