In silico analysis identifies a C3HC4-RING finger domain of a putative E3 ubiquitin-protein ligase located at the C-terminus of a polyglutamine-containing protein

Almost identical polyglutamine-containing proteins with unknown structures have been found in human, mouse and rat genomes (GenBank AJ277365, AF525300, AY879229). We infer that an identical really interesting new gene (RING) finger domain is located in each C-terminal segment. A three-dimensional (3-D) model was generated by remote homology modeling and the functional implications are discussed. The model comprises 65 residues, from position 707 to 772 of the human protein, which has a total length of 796 residues. The 3-D model predicts a ubiquitin-protein ligase (E3) as a binding site for ubiquitin-conjugating enzyme (E2). Both enzymes are part of the ubiquitin pathway, which labels unwanted proteins for subsequent enzymatic degradation. Molecular contact specificities are suggested for both substrate recognition and the residues at the possible E2-binding surface. The predicted structure of a ubiquitin-protein ligase (E3, enzyme class number 6.3.2.19, CATH code 3.30.40.10.4) may help to explain the process of ubiquitination. The 3-D model supports the idea of a C3HC4-RING finger with a partially new pattern. The putative E2-binding site is formed by a shallow hydrophobic groove on the surface adjacent to the helix and one zinc finger (L722, C739, P740, P741, R744). Solvent-exposed hydrophobic amino acids lie around both zinc fingers (I717, L722, F738, or P765, L766, V767, V733, P734). The 3-D structure was deposited in the protein databank theoretical model repository (2B9G, RCSB Protein Data Bank, NJ).


Introduction
The number of new genes that encode proteins with unknown structure and function is steadily growing. Three-dimensional (3-D) structure modeling is a computational tool used to narrow this gap. Computing the protein structure of new sequences provides a better understanding of biological functions and indicates directions for new biochemical and cellular experiments. We present a polyglutamine-containing protein of unknown structure present in the human, mouse and rat genomes. We call the human protein the "target" used to build our 3-D model. In the present study, we establish a theoretical structure of a ubiquitin-protein ligase (E3) that belongs to the enzymatic pathway of ubiquitin-mediated degradation of unwanted proteins in living cells. E3 binds to the unwanted protein, which is labeled with ubiquitin prior to destruction. E3 is also associated with a ubiquitin-conjugating enzyme (E2)-ubiquitin complex so that the ubiquitination of the E3-bound protein takes place. Thereafter the ubiquitin-labeled protein is destroyed by proteolysis. The functional part of E3 contains a really interesting new gene (RING) finger composed of cysteines and histidines complexed to two central zinc atoms and believed to mediate protein-protein interactions.

Material and Methods
Sequence alignments characterize our target protein unambiguously as containing a RING finger motif (1,2). The total percent similarity between templates and target was 16% (identity matrix), while the homology was 23% (Dayhoff matrix) (3). Web-based automated homology modeling of proteins by the Molecular Operating Environment (MOE), Swiss-Model or CPH servers is difficult because adequate templates are needed with reliable identity scores above a length-dependent threshold, i.e., 25 to 35% for an average domain length of 120 to 80 residues (last visit, September 2005) (4,5). We present a manually built remote homology model of a C3HC4-type RING finger domain of an E3 ubiquitin-protein ligase. The computational study was conducted with two sequence search tools, PSI-BLAST against the Swiss-Prot database, and threading techniques within the Swiss-PdbViewer (SPV, DeepView) (6). Side chain conformations were predicted using SCWRL (7). Later, during the loop and RING construction steps, the geometry was locally refined under AMBER-94 force field conditions using MOE software (4). The final 3-D model was chosen after evaluating the stereochemical quality with VERIFY3D, PROCHECK and WHAT_CHECK at the structure analysis and verification server (SAVS) (8-10).
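The identity scores quoted above can be illustrated with a minimal sketch. It assumes a pairwise alignment given as two gapped strings of equal length and counts identity only over columns where both sequences carry a residue; other conventions for the denominator exist, and the function name and the toy alignment are hypothetical, not taken from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns without gaps.

    Assumption: identity is normalized by the number of columns in which
    both sequences have a residue (gap columns are skipped).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def passes_threshold(identity: float, threshold: float = 25.0) -> bool:
    """Compare an identity score with a cutoff (25% is the lower bound quoted in the text)."""
    return identity >= threshold


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    score = percent_identity("CPIC-LEKF", "CAVCGLERY")
    print(score, passes_threshold(score))
```

A score of 23%, as reported for the best template, would fall below the 25% lower bound, which is consistent with the automated servers not returning a model.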
The three query sequences were accessed through GenBank (11) at the National Center for Biotechnology Information, NCBI (accession: AJ277365, AF525300, AY879229) and homologous sequences were found with FASTA and BLAST (12,13). The template candidates were retrieved at Protein Data Bank Brookhaven (PDB) (14). Lower homology searches were performed by using PSI-BLAST (14). In order to select adequate candidates, multiple sequence alignments were conducted with Clustal W (15). Alignment parameters were set as follows: i) pairwise alignment with Ktuple, 1; gap penalty, 3; windows, 5, and diagonals saved, 5; ii) multiple alignment with gap penalty, 10, and gap length penalty, 10. The higher PAM (250) and lower BLOSUM (45) weight matrices were also used. Structural fitting data were accessed through Swiss-Model SPV programs. The more than 30,000 protein structures deposited at PDB (14) were screened in order to find suitable 3-D templates (Table 1). Sequence identities were calculated and alignments manually edited to accommodate the single template modeling around the core motif (double zinc finger) while for the remainder a multiple template approach was used (Table 1). The structure file 1IYM represents the crystallographic data of a RING-H2 finger motif (C3H2C3-type). The overall backbone and one zinc finger (C3H-type) were resolved using this single template. The loop and the other zinc finger (C4-type) were adopted using multiple template construction, i.e., 1BOR, 1FBV_A, 1LDJ_B, and 1E4U_A as more distantly related sequences (16-27).

Table 1 entries (template descriptions with references):
Ubiquitin-ligase activity, EL5. RING finger is the E2-binding site of EL5 (23)
RING finger domain from the acute promyelocytic leukemia proto-oncoprotein (25)
Structure of the Phd zinc finger from human Williams-Beuren syndrome, transcription factor (26)
Equine herpes virus 1 (22)
Unusual multi-class zinc-binding motifs: a zinc C3HC4-RING finger together with another C2H2 zinc finger (binuclear cluster) (20)
Transcriptional repressor Not4 (18)
N-terminal domain of the human Tfiih Mat1 subunit is a member of the RING finger family (28)
Ubiquitin-protein ligase E3 contains a RING finger with 3 Zn atoms (17)
Ubiquitin ligase complex: E3 ubiquitin-protein ligase-scaffold-E2 ubiquitin-conjugating enzyme: Cul-1-Rbx1-Skp1-F boxSkp2 = "SCF ubiquitin ligase complex": SCF complexes are the largest family of E3 ubiquitin-protein ligases and mediate the ubiquitination of diverse regulatory and signaling proteins as substrates for degradation. Cul-1 is an elongated protein that consists of a long stalk and a globular domain. The globular domain binds the RING finger protein Rbx1 (=E3) through an inter-molecular beta-sheet, forming a two-subunit catalytic core that recruits the ubiquitin-conjugating enzyme E2 below (in 1LDK, not in 1LDJ). Cul-1 serves as a rigid scaffold that organizes the Skp1-F boxSkp2 and the Rbx1 (E3) subunits, holding them over 100 Å apart. The structure suggests that Cul-1 may contribute to catalysis through the positioning of the substrate and the ubiquitin-conjugating enzyme E2 (17)
Phosphatidylinositol-3-phosphate-binding Fyve domain of Vps27P protein (29)
Breast cancer type 1 susceptibility protein (24)
Phd domain from the Kap-1 co-repressor: structural determinants for Phd, RING and Lim zinc-binding domains (27)
E2-E3-substrate complex: E3 bound to a cognate E2 and a substrate fragment shows how the RING domain of E3 recruits ubiquitin-conjugating enzyme E2. E3 ubiquitin-protein ligase (chain A) may function as a scaffold that positions the substrate (chain B) and the E2 ubiquitin-conjugating enzyme (chain C) optimally for ubiquitin transfer

Templates are listed in decreasing order of importance and referenced by their Protein Data Bank (PDB) code and identity scores (id) in percent. The query contains two so-called RING motifs. The C4 RING finger was resolved well using chain A of 1IYM, whereas the C3H-RING finger was adopted using multiple template construction with 1BOR, 1FBV_A, 1LDJ_B, and 1E4U_A (14).

Results and Discussion
The sequence showing the highest degree of homology with the query was obtained with PSI-BLAST, the option for low homology available at the NCBI (http://www.ncbi.nlm.nih.gov/BLAST/) (28). Multiple sequence alignments, as prerequisites for the homology-based structure construction, were performed with the on-line tool Clustal W, which is accessible on the Bio-web server at Pasteur (http://bioweb.pasteur.fr/seqanal/interfaces/clustalw.html) (15).
Since neither MOE's built-in automated structural modeling tool nor the Swiss-Model server had predicted a homology model, we decided to use a manual procedure. First, the spatial coordinates of the 1IYM structure (22), which showed the highest sequence homology with the query (23% after manual editing, Table 1), were used as a rigid scaffold for the main part of the query. The side chains were interchanged (mutation simulation) to reflect the target sequence (Figure 1). Second, the original conformations were kept at identical positions, conserved positions were mutated preferentially respecting the template orientation, whereas non-conserved side chain conformations were locally refined. Side chain conformations were usually constructed by means of molecular mechanics with empirical energy refinements towards local minima. Alternatively, the web-based SCWRL server or its download version could be used to predict conformations (7). The loop database of SPV was also consulted before completing the atom-scale model with MOE tools under AMBER-94 force-field conditions (29). Due to the high degree of local conservation, especially with respect to the amino acids forming the structural RING motif, the template backbone fold was followed for the entire length of 64 residues. Third, amino acids which are possible salt bridge partners due to charge and position were searched among the remaining residues. If correlated occurrences of these amino acids were detected from the multiple sequence alignment produced with Clustal W (15), they were modeled as partners of a salt bridge. Finally, the remaining amino acids were placed to create optimal steric and electrostatic interactions in the proximity. Both RING finger geometries were adopted from their templates and kept rigid, like the backbone.
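The search for possible salt bridge partners by charge and position can be sketched as a simple distance screen. This is only an illustration under stated assumptions: the 4.0 Å cutoff is a commonly used convention rather than a value given in the text, the function name is hypothetical, and the coordinates in the usage example are invented (only the residue labels echo pairs named later in the article).

```python
import math

ACIDIC = {"ASP", "GLU"}  # negatively charged side chains
BASIC = {"ARG", "LYS"}   # positively charged side chains


def salt_bridge_candidates(residues, cutoff=4.0):
    """Return acidic/basic residue pairs whose charged groups lie within `cutoff` (in Å).

    `residues` is a list of (name, number, (x, y, z)) tuples, where the
    coordinate is assumed to be the center of the charged side-chain group.
    """
    pairs = []
    for name_a, num_a, xyz_a in residues:
        if name_a not in ACIDIC:
            continue
        for name_b, num_b, xyz_b in residues:
            if name_b not in BASIC:
                continue
            distance = math.dist(xyz_a, xyz_b)
            if distance <= cutoff:
                pairs.append((f"{name_a}{num_a}", f"{name_b}{num_b}", round(distance, 2)))
    return pairs


if __name__ == "__main__":
    # Invented coordinates, for illustration only.
    toy = [
        ("GLU", 720, (0.0, 0.0, 0.0)),
        ("ARG", 721, (3.0, 0.0, 0.0)),
        ("LYS", 763, (10.0, 0.0, 0.0)),
        ("GLU", 762, (10.0, 3.5, 0.0)),
    ]
    for pair in salt_bridge_candidates(toy):
        print(pair)
```

In the procedure described above, such geometric candidates would additionally be checked for correlated occurrence in the multiple sequence alignment before being modeled as salt bridge partners.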
A C3HC4 zinc finger was identified within the C-terminal domain. For substrate recognition on the solvent-exposed surface the following hydrophobic residues are identified: ILE717, LEU722, PHE738 around Zn atom 1, PRO765, LEU766, VAL767, VAL733, and PRO734 around Zn atom 2, and PHE740, PRO741 at the end of the alpha-helix. The geometry optimization was performed with the AMBER-94 protocol in MOE (4) running on a Linux PC using the Kollman all-atom force field (29). The structures were solvated in a water shell and the geometry was optimized by a 5000-step conjugate gradient energy minimization. A final molecular dynamics simulation under the MMFF94 (30) force field was performed by heating to 300 K (NVE equilibration phase); thermal fluctuations became minimal during a 300-picosecond simulation with temperature control (NVT equilibrium). Finally, the constructed model passed the statistical quality tests for proteins at SAVS (Figure 2) (8). The atom-scale model for the putative E3 segment was deposited in the RCSB theoretical 3-D model repository: http://deposit.rcsb.org/depoinfo/depotools.html (Figure 3). No homology modeling has been devised for the substrate binding site due to lack of templates.
The templates belong to two different subfamilies with C3HC4- and C3H2C3-RING domains, respectively. The former type is clearly related to the cysteine/histidine pattern of the target. The target's RING finger domain pattern: (31).

Figure 2. These plots report permitted and forbidden phi-psi angle combinations for the 3-D model (left) of target AJ277365 polyglutamine-containing protein and template 1IYM_A (right) (9-11). The target structure plot shows an acceptable overall geometrical quality and is not significantly worse than its template. The values range from -180 to 180 for phi (x-axis) and psi (y-axis) torsions. SAVS returns a good 3-D to 1-D profile with scores above 0.2 and never below 0. Pairs of stereochemical results for template and target were: chi1-chi2 plots, 1/1; residue properties: max. deviation, 4.0/8.4; bad contacts, 18/13; main chain bond lengths not within limits, 4.8/1.9%; main chain bond angles not within limits, 0.8/15.9%; planar groups not within limits, 0/1 amino acid.

Figure 3. Schematic drawing of the complete sequence. The RING finger domain of the ubiquitin-protein ligase (E3) is located at the C-terminus of AJ277365 to the right. The sequence starts with a low complexity (repetitive residues) segment. The generated 3-D model of the RING finger part is graphically depicted with its backbone folds. The secondary structures were predicted with Jpred (32). The helix was predicted with AGADIR (33). The numbers identify the amino acid positions in AJ277365 polyglutamine-containing protein. The E3 ligase is involved in ubiquitination processes (34-37). Three-dimensional presentation (ribbon and atomic display) of essential amino acids C731, C759, C765, H737 and C716, C719, C740, C743 found in the C3HC4-RING finger. The central helical wheel is colored red, the loops dark blue. Color code for atoms displayed in a space-filling manner: Zn, green; C, white; H, omitted; O, red; N, blue; S, yellow; ribbon presentation of protein, N-terminal above, C-terminal end, middle right (4).

The known protein structure 1IYM (22) is a RING-H2 (C3H2C3) finger protein and constitutes the highest scoring homologue (23%). Although it was used as a single template for the main part, other templates were also considered to copy fold information. Further analysis of the electrostatic surface shows that this RING finger domain presents a remarkable charge compensation by positively or negatively charged counter residues (GLU720-ARG721-GLU723, LYS737-ASP724, ARG744-GLU745, GLU762-LYS763, -ARG744). The whole picture is graphically illustrated in Figure 3 and shows a ligase classifiable as EC 6.3.2.19 (ubiquitin-protein ligase) or CATH 3.30.40.10.4. We identified five residues at the possible E2-binding surface in good agreement with the E2-binding site reported for template 1IYM (22). They lie in a shallow hydrophobic groove on the surface between the helix and the zinc finger: ARG744/ARG176, PRO741/PRO173, PHE740/TRP165, CYS739/CYS161, LEU722/VAL136, for target/template, respectively. The RING finger region is a protein interaction domain, normally 60 residues in length. The recognition may be assisted by solvent-exposed hydrophobic residues: ILE717, LEU722, and PHE738 near Zn atom 1, or PRO765, LEU766, VAL767, VAL733, and PRO734 near coordination center Zn2.
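The phi-psi quality plots mentioned for the model and its template rest on backbone torsion angles. As a minimal sketch, assuming atom coordinates are available as (x, y, z) tuples, a torsion angle over four consecutive atoms can be computed as follows; phi uses C(i-1), N, CA, C and psi uses N, CA, C, N(i+1). The function name and the test points are hypothetical, and this is not the procedure implemented by PROCHECK or SAVS.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle in degrees, in the range -180 to 180, around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Four points in a planar trans arrangement (invented for illustration).
    print(dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0)))
```

Plotting phi against psi for every residue and comparing the points with the permitted regions gives the kind of plot used to judge the model.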
The RING structure, more than 60 residues long, was deposited in the PDB theoretical model repository (2B9G). It provides a ready-to-use starting point for further molecular characterization of ubiquitination and helps to address the present lack of such structures among PDB entries (last visit July 2006).

Figure 1. Edited sequence alignment with best scoring template 1IYM for homology modeling. First and last lines: labeling of the zinc finger motifs indicates to which cation the amino acid belongs, "Zn1" and "Zn2", respectively. The RING finger motif is composed of a C3HC4-type domain, i.e., two sets of four amino acids coordinate Zn atom 1 (C731, C759, C765, H737) and Zn atom 2 (C716, C719, C740, C743), respectively. The template is classified as a RING-H2 finger protein (C3H2C3). Despite the different cysteine/histidine pattern, the target shows a C3HC4-type RING finger which is clearly related. The automated alignment with the Clustal W program was manually corrected, improving the homology score (15). The target sequence shows a similarity score of 23% with the template sequence. The third sequence (letters printed in black) indicates the computationally generated 3-D model with over 60 resolved amino acids. The 3-D model spans positions 706 to 772 of the AJ277365 human sequence.
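The zinc-coordinating positions listed in the Figure 1 caption can be turned into a ligand spacing pattern with a few lines of code. The sketch below assumes only the positions and residue types given in the caption; the function and variable names are hypothetical.

```python
# Zinc ligands as listed in the Figure 1 caption: position -> one-letter residue code.
ZINC_LIGANDS = {
    716: "C", 719: "C", 740: "C", 743: "C",  # Zn atom 2
    731: "C", 759: "C", 765: "C", 737: "H",  # Zn atom 1
}


def ligand_order(ligands):
    """Residue types of the ligands in sequence order."""
    return "".join(ligands[pos] for pos in sorted(ligands))


def spacing_pattern(ligands):
    """Ligands in sequence order, separated by the number of intervening residues."""
    positions = sorted(ligands)
    parts = [ligands[positions[0]]]
    for prev, cur in zip(positions, positions[1:]):
        parts.append(f"X{cur - prev - 1}")  # residues strictly between two ligands
        parts.append(ligands[cur])
    return "-".join(parts)


if __name__ == "__main__":
    print(ligand_order(ZINC_LIGANDS))     # CCCHCCCC
    print(spacing_pattern(ZINC_LIGANDS))  # C-X2-C-X11-C-X5-H-X2-C-X2-C-X15-C-X5-C
```

The ligand order reads C3HC4, matching the domain type named in the caption, while the spacings make the cysteine/histidine pattern of the target explicit for comparison with other RING fingers.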

Table 1. 3-D template candidates for the RING finger model.
www.bjournal.com.br