The quality control of glycoprotein folding in the endoplasmic reticulum, a trip from trypanosomes to mammals

The present review deals with the stages of synthesis and processing of asparagine-linked oligosaccharides occurring in the lumen of the endoplasmic reticulum and their relationship to the acquisition by glycoproteins of their proper tertiary structures. Special emphasis is placed on reactions taking place in trypanosomatid protozoa since their study has allowed the detection of the transient glucosylation of glycoproteins catalyzed by UDP-Glc:glycoprotein glucosyltransferase and glucosidase II. The former enzyme has the unique property of covalently tagging improperly folded conformations by catalyzing the formation of protein-linked Glc1Man7GlcNAc2, Glc1Man8GlcNAc2 and Glc1Man9GlcNAc2 from the unglucosylated proteins. Glucosyltransferase is a soluble protein of the endoplasmic reticulum that recognizes protein domains exposed in denatured but not in native conformations (probably hydrophobic amino acids) and the innermost N-acetylglucosamine unit that is hidden from macromolecular probes in most native glycoproteins. In vivo, the glucose units are removed by glucosidase II. The influence of oligosaccharides in glycoprotein folding is reviewed as well as the participation of endoplasmic reticulum chaperones (calnexin and calreticulin) that recognize monoglucosylated species in the same process. A model for the quality control of glycoprotein folding in the endoplasmic reticulum, i.e., the mechanism by which cells recognize the tertiary structure of glycoproteins and only allow transit to the Golgi apparatus of properly folded species, is discussed. The main elements of this control are calnexin and calreticulin as retaining components, the UDP-Glc:glycoprotein glucosyltransferase as a sensor of tertiary structures and glucosidase II as the releasing agent.


Protein glycosylation and initial oligosaccharide-processing reactions
The initial biochemical steps in N-glycosylation (i.e., glycosylation of asparagine units in proteins) have been known for some years but their true biological meaning has only been emerging in recent times (1). As was shown in 1972, N-glycosylation is initiated by the transfer of an oligosaccharide (Glc3Man9GlcNAc2 in most species, Figure 1) from a dolichol-P-P derivative to asparagine units in nascent polypeptide chains (2). Synthesis of Glc3Man9GlcNAc2-P-P-dolichol involves first the transfer of GlcNAc-1-P from UDP-GlcNAc to dolichol-P, followed by transfer of an additional N-acetylglucosamine unit and five mannose residues. The donor of the latter is GDP-Man. The subsequent four mannose and three glucose residues are transferred from dolichol-P-Man and dolichol-P-Glc, respectively. The monophosphate derivatives are formed upon reaction of dolichol-P and GDP-Man or UDP-Glc (3).
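The residue bookkeeping of the assembly pathway described above can be checked with a short tally. The following Python sketch is only an illustration: the step table and the function names are hypothetical, and the donor of the second N-acetylglucosamine unit, which the text does not state, is assumed here to be UDP-GlcNAc.

```python
from collections import Counter

# Assembly steps of the dolichol-linked precursor as (donor, residue, count).
# The grouping into five steps is a simplification made for this sketch.
ASSEMBLY_STEPS = [
    ("UDP-GlcNAc", "GlcNAc", 1),      # GlcNAc-1-P transferred to dolichol-P
    ("UDP-GlcNAc", "GlcNAc", 1),      # second GlcNAc (donor assumed, not stated in the text)
    ("GDP-Man", "Man", 5),            # first five mannose residues
    ("dolichol-P-Man", "Man", 4),     # remaining four mannose residues
    ("dolichol-P-Glc", "Glc", 3),     # three glucose residues
]


def composition(steps):
    """Sum the residues contributed by each assembly step."""
    totals = Counter()
    for _donor, residue, count in steps:
        totals[residue] += count
    return totals


def formula(totals):
    """Write a composition in the Glc/Man/GlcNAc order used in the text."""
    return "".join(f"{r}{totals[r]}" for r in ("Glc", "Man", "GlcNAc") if totals[r])


print(formula(composition(ASSEMBLY_STEPS)))

# Leaving out the dolichol-P-Glc step gives the glucose-free oligosaccharide.
without_glc = [step for step in ASSEMBLY_STEPS if step[0] != "dolichol-P-Glc"]
print(formula(composition(without_glc)))
```

The first line printed is the composition of the complete precursor and the second that of an oligosaccharide assembled without the glucose-adding step.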
Oligosaccharyltransferase is an enzyme bound to the membrane of the endoplasmic reticulum (ER) and composed of several subunits. Ribophorins I and II are two of them, a fact compatible with transfer of the oligosaccharide to nascent polypeptide chains in the ER lumen (4). Cell-free assays have shown that transfer of glucose-free oligosaccharides is about 10-20-fold slower than that of compounds having the full complement of three glucose units (5). This feature of oligosaccharyltransferase apparently explains why glycoproteins of Saccharomyces cerevisiae mutants that are unable to synthesize glucosylated derivatives of dolichol-P-P and that accumulate Man9GlcNAc2-P-P-dolichol are heavily underglycosylated (6). The presence of the consensus sequence Asn-x-Ser/Thr (where x may be any amino acid except proline) in the nascent peptide is necessary but not sufficient for N-glycosylation.
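The consensus sequence rule lends itself to a simple sequence scan. The sketch below, in Python with one-letter amino acid codes, lists candidate sites only; since the consensus is necessary but not sufficient, a hit does not imply that the site is actually glycosylated. The function name and the example sequence are hypothetical.

```python
import re

# Asn-x-Ser/Thr with x != Pro. The lookahead lets overlapping sites
# (for example in "NNSS") be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return the 1-based positions of Asn residues in candidate sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


example = "MKNGSAANPTLLNVTQ"  # made-up sequence, for illustration only
print(find_sequons(example))  # [3, 13]; Asn8 is skipped because x is Pro
```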
Processing of the oligosaccharide in the lumen of the ER starts as soon as it is transferred to protein: the more external glucose unit is removed by glucosidase I, a membrane-bound α(1,2)-glucosidase, whereas the remaining two glucoses are excised by glucosidase II, an α(1,3)-glucosidase only loosely attached to the inner membrane of the ER (7,8). The more external, α(1,2)-linked glucose unit is removed cotranslationally whereas the half-life of the middle unit is longer and that of the innermost glucose unit is even longer. The explanation for this last fact will be presented below. Another processing reaction also occurring in the ER lumen is the removal of one or two α(1,2)-linked peripheral mannose residues. Specific ER α-mannosidases have been identified, one in S. cerevisiae and at least two in mammalian cells (9-11). Cell-free experiments showed that both glucosylated and unglucosylated oligosaccharides may be substrates for mammalian ER α-mannosidases. At this stage Man7-9GlcNAc2 protein-linked oligosaccharides are transported to the Golgi apparatus where further processing of the saccharide moieties may proceed.

Protein glycosylation and oligosaccharide processing in trypanosomatids
Trypanosomatids are parasitic protozoa of considerable medical and economic importance because they are the causative agents of chronic human and livestock diseases endemic in developing countries. Examples of the former are Chagas disease, present in most of Latin America and produced by Trypanosoma cruzi, sleeping sickness or African trypanosomiasis, caused by Trypanosoma brucei gambiense and Trypanosoma b. rhodesiense, and several forms of leishmaniasis (visceral, mucocutaneous and cutaneous), produced by species belonging to the genus Leishmania. On the other hand, T. b. brucei, Trypanosoma congolense and Trypanosoma vivax cause nagana in Africa, a disease affecting cattle, whereas Trypanosoma evansi is the agent of surra, a disease of camels and horses (12).
Trypanosomatids are the only wild type cells that transfer in vivo unglucosylated oligosaccharides in protein N-glycosylation, i.e., Man6GlcNAc2, Man7GlcNAc2 or Man9GlcNAc2, depending on the species (Figure 1) (13,14). The parasites are defective in the synthesis of dolichol-P-Glc (15). Moreover, species transferring Man6GlcNAc2 or Man7GlcNAc2 were found to be defective in the dolichol-P-Man-dependent mannosyltransferases responsible for the addition of the seventh, eighth and ninth mannose residues or for the addition of the eighth and ninth mannose residues, respectively (15). It was found that glucosylated protein-linked oligosaccharides were transiently formed upon pulse-chase labeling of trypanosomatid cells with [14C]glucose (16-20). Thus, Glc1Man9GlcNAc2, Glc1Man8GlcNAc2 and Glc1Man7GlcNAc2 were detected in parasites transferring Man9GlcNAc2, Glc1Man7GlcNAc2 in those transferring Man7GlcNAc2, and Glc1Man6GlcNAc2 and Glc1Man5GlcNAc2 in parasites where Man6GlcNAc2 is involved in N-glycosylation. In all cases the glucosylated compounds showed a transient existence. A glucosidase II-like enzyme was characterized in trypanosomatids but an activity with the specificity of glucosidase I was not found (21).
As unglucosylated oligosaccharides were transferred to proteins in trypanosomatids, the above mentioned results indicated that glucosylation of protein-linked compounds had occurred. Moreover, as the parasites were defective in the synthesis of dolichol-P-Glc, it was concluded that another sugar donor, probably UDP-Glc, was involved in the glucosylation reaction. In the case of T. cruzi, Glc1Man9GlcNAc2, Glc1Man8GlcNAc2 and Glc1Man7GlcNAc2 were immediately labeled in the glucose residues upon addition of [14C]glucose to the medium containing intact cells. In contrast, label in the mannose units of the last two compounds only appeared with a considerable delay after it had appeared in the first one (17). This implied that Glc1Man8GlcNAc2 and Glc1Man7GlcNAc2 were formed by glucosylation of Man8GlcNAc2 and Man7GlcNAc2 and not by demannosylation of Glc1Man9GlcNAc2.

Transient glucosylation of glycoproteins in mammalian, plant and fungal cells
It was first thought that glucosylation of protein-linked high mannose-type compounds only occurred in trypanosomatids, but the process was later found to occur as well in mammalian, fungal and plant cells pulse-chased with [14C]glucose (initial processing reactions occurring in mammals, plants and fungi and in T. cruzi are shown in Figure 2A and B, respectively) (22-24). Incubation of rat liver microsomes with UDP-[14C]Glc led to the formation of protein-linked Glc3Man9GlcNAc2, Glc1Man9GlcNAc2, Glc1Man8GlcNAc2 and Glc1Man7GlcNAc2. Addition of amphomycin, an inhibitor of the synthesis of dolichol-P-Glc, to the incubation mixtures precluded formation of the first compound but not of the last three, thus indicating that they were formed by direct transfer of glucose units from the sugar nucleotide to unlabeled glycoproteins present in microsomes (23). Formation of the glucosylated compounds was not affected under conditions in which glucosidase I and II and α-mannosidase activities were inhibited, thus confirming that the glucosylated compounds were not formed by deglucosylation and demannosylation of Glc3Man9GlcNAc2. Moreover, upon fractionation of microsomes in sucrose gradients, maximal formation of the monoglucosylated compounds was detected in the rough ER-containing fractions and not in those containing Golgi apparatus-derived membranes (23). On the other hand, addition of thyroglobulin, a glycoprotein containing high mannose-type oligosaccharides, to the incubation mixtures did not enhance formation of the monoglucosylated derivatives. This result stalled further research on the subject for several years. It is worth mentioning that the formation of Glc1Man8GlcNAc2 and Glc1Man7GlcNAc2 upon incubation of mammalian cells with [14C]glucose or [14C]galactose had been observed previously by other researchers, but they had assumed without any experimental evidence that those compounds had been formed by successive deglucosylation and demannosylation of the transferred oligosaccharide (Glc3Man9GlcNAc2) (25,26). It was the fact that direct glucosylation of high mannose-type compounds was first detected in trypanosomatids that suggested the occurrence of this pathway in mammalian cells.
Figure 1. The structure is that of the oligosaccharide transferred to proteins in wild type mammalian, plant and fungal cells. The lettering used for the identification of the individual monosaccharide residues (a, b, c, ...) follows the same order as the addition of the monosaccharides in the assembly of the oligosaccharide. The oligosaccharides with compositions Man9GlcNAc2, Man7GlcNAc2 and Man6GlcNAc2 transferred in different trypanosomatid species are devoid of residues l, m and n, or of j, k, l, m and n, or of i, j, k, l, m and n, respectively.

The UDP-Glc:glycoprotein glucosyltransferase discriminates between misfolded and native glycoproteins or glycopeptides
Mainly by chance it was found that 8 M urea-denatured thyroglobulin was a good glucose acceptor when added to incubation mixtures containing UDP-[14C]Glc and rat liver microsomes. The products formed were protein-linked Glc1Man9GlcNAc2, Glc1Man8GlcNAc2 and Glc1Man7GlcNAc2 (27). Evidence was obtained indicating that this reaction represented transfer of glucose units from UDP-Glc to the acceptor protein and that no dolichol monophosphate or diphosphate derivatives were involved in it (27). Moreover, this result provided a convenient method for assaying the transferring enzyme (UDP-Glc:glycoprotein glucosyltransferase, GT), which was detected in microsomal membranes of mammalian, plant, fungal and trypanosomatid cells (27). Further work showed that GT was a soluble protein of the ER lumen and that the reaction products had the glucose unit linked to the same mannose with the same α(1,3) bond as in Glc1Man9GlcNAc2-P-P-dolichol (Figure 1) (27,28). As observed for thyroglobulin, other glycoproteins (phytohemagglutinin, soybean agglutinin, ribonuclease B) were not glucosylated unless they had been previously denatured (29). Moreover, glycopeptides obtained upon digestion of thyroglobulin with a nonspecific protease (Pronase) or of phytohemagglutinin or soybean agglutinin with trypsin were very poorly glucosylated by the enzyme (29).
The assay allowed purification to homogeneity of the enzyme from rat liver and Schizosaccharomyces pombe (30,31). GT had an almost absolute calcium requirement for activity (maximal activity was attained at about 10 mM of the cation). The ER lumen is the major intracellular calcium reservoir. The concentration of the cation in this subcellular location is about 3 mM, about three orders of magnitude higher than that in the cytosol. The luminal concentration of the cation is not constant and is apparently regulated by the action of an ion-motive ATPase that pumps calcium into the lumen and by inositol 1,4,5-trisphosphate, which releases it to the cytosol. The possibility exists, therefore, that variations in the intraluminal calcium concentration may affect transient glucosylation of glycoproteins. The only donor substrate for the homogeneous rat liver or S. pombe GTs appeared to be UDP-Glc. Other sugar nucleotides such as ADP-Glc and TDP-Glc were not effective substrates. The problem posed by the fact that the donor substrate (UDP-Glc) is synthesized in the cytoplasm and acts in the interior of the ER was solved by the description of a transport system specific for UDP-Glc in the ER membrane of mammalian cells (32).

Not all oligosaccharides are transiently glucosylated in vivo by the UDP-Glc:glycoprotein glucosyltransferase
Are the majority or only a minor fraction of glycoproteins transiently glucosylated in vivo? To answer this question in mammalian cells it would be necessary to have available compounds able to inhibit only the removal of glucose units added by UDP-Glc:glycoprotein glucosyltransferase and not of those present in the transferred oligosaccharide. Such compounds cannot exist because, as mentioned above, Glc1Man9GlcNAc2 formed by GT is structurally indistinguishable from that produced by deglucosylation of Glc3Man9GlcNAc2. Moreover, known inhibitors of glucosidase II also inhibit glucosidase I, and hence their addition to intact mammalian cells produces an accumulation of Glc3Man9GlcNAc2 and Glc2Man9GlcNAc2. The question can be answered, nevertheless, for trypanosomatids since their glucosylated compounds are exclusively formed by action of UDP-Glc:glycoprotein glucosyltransferase (Figure 2B). It was found that about 50% of all N-linked oligosaccharides were glucosylated in T. cruzi or Crithidia fasciculata cells incubated in the presence of castanospermine and/or 1-deoxynojirimycin, known inhibitors of glucosidase II (33,34). Analysis of individual glycoproteins indicated that all glycoproteins are glucosylated in vivo but that the same oligosaccharide is glucosylated in some molecules but not in others (35). A possible explanation for this result will be given below.

Molecular basis for the selective glucosylation of misfolded glycoproteins by the UDP-Glc:glycoprotein glucosyltransferase
As mentioned above, native glycoproteins or glycopeptides were not or only very poorly glucosylated by GT whereas denatured glycoproteins were good glucose acceptors. Further work showed that the effect of denaturation was not to expose the oligosaccharide to the outer surface since oligosaccharides in native conformations, although not glucosylated by GT, were nevertheless accessible to macromolecular probes such as α-mannosidase, concanavalin A and endo-β-N-acetylglucosaminidase H (29). This result was consistent with the fact that oligosaccharides in glycopeptides were not substrates for GT although they were completely accessible to the enzyme (29). Denatured, endo-β-N-acetylglucosaminidase H-deglycosylated glycoproteins appeared to be very good inhibitors of the glucosylation of denatured glycoproteins, whereas native bovine serum albumin had no effect (29). The inhibition was not glycoprotein-specific since, for instance, denatured deglycosylated phytohemagglutinin inhibited glucosylation of thyroglobulin. Thus, it was concluded that the effect of denaturation was to make accessible to the GT certain protein domains hidden in the interior of the molecules in the native conformations. Interaction of those protein domains with GT was apparently necessary for the occurrence of the transfer reaction (29). Several hypotheses may be advanced concerning the nature of the protein domains interacting with GT. They may be formed by a) specific amino acid sequences common to all glycoproteins, b) certain specific amino acids that are separated in the primary sequence but that become close in the denatured conformations, and c) nonspecific amino acids but common three-dimensional structures shared by all denatured glycoproteins but generated by totally different amino acid sequences. The widespread distribution of transient glucosylation in nature indicates that very different glycoproteins are glucosylated in vivo, a fact that invalidates all the hypotheses advanced above since it should not be reasonably expected for any of the structural features alluded to in them to be common to all glucosylated glycoproteins.
The interior of water-soluble proteins in their native states is predominantly composed of hydrophobic amino acids while the hydrophilic side chains are on the exterior where they interact with water. Denatured states have, in general, more hydrophobic side chains on the exterior than native ones. Thus, the effect of denaturation may be to provide a certain hydrophobic environment in the vicinity of the oligosaccharide and this environment may be recognized by GT. Further work showed that under physiological pH and salt concentration conditions GT very efficiently binds hydrophobic but not hydrophilic peptides and that binding of the former species can be inhibited by denatured but not native glycoproteins (36). On the other hand, a rather unexpected result was the fact that not only native but also denatured non-glycosylated proteins failed to inhibit glucosylation of denatured glycoproteins. Additional experiments showed that the GT recognized both protein domains exposed in denatured conformations and the innermost N-acetylglucosamine unit of the oligosaccharide, i.e., the residue that is left linked to the protein moiety by endo-β-N-acetylglucosaminidase H (36). In many native glycoproteins the innermost N-acetylglucosamine interacts with neighboring amino acid residues and is not accessible to a macromolecular probe such as endo-β-N-acetylglucosaminidase H. In these cases, cleavage of oligosaccharides by the endoglycosidase requires denaturation of the glycoprotein (37). It may be speculated that proper folding of most glycoproteins would hinder recognition of the innermost N-acetylglucosamine unit by GT and thus prevent glucosylation. Several experiments showed that both recognition elements, the protein domains (hydrophobic amino acids) and the oligosaccharides, had to be covalently linked. This is an important restriction. Since there are numerous unfolded, partially folded and misfolded proteins and glycoproteins in the ER lumen, if the protein domains recognized by the GT and the oligosaccharides were not required to be covalently linked it might be speculated that domains exposed in not properly folded species would induce glucosylation of glycoproteins already in their native conformations, provided that the innermost N-acetylglucosamine units are accessible to GT in the latter species.

GT as a sensor of glycoprotein tertiary structures
GT appears to be an extremely sensitive sensor of the tertiary structure of glycoproteins since it is able to differentially glucosylate two neoglycoproteins (i.e., glycoproteins formed by a chemical coupling of a protein and an oligosaccharide) having the same secondary but different tertiary structures (36). The staphylococcal nuclease is a relatively small (149 amino acids) nonglycosylated protein devoid of cysteine residues. One cysteine residue was introduced at position 70 by replacing a lysine (K70C) (38). A high mannose-type glycopeptide was then linked to that position using N-succinimidyl 3-(2-pyridyldithio) propionate, a bifunctional reagent that reacts with amino and sulfhydryl groups. The neoglycoprotein thus formed (K70C-Glyc) had the same specific nuclease activity as the unmodified protein, thus suggesting that both species had the same tertiary structure (38,39). The neoglycoprotein had a very low glucose acceptor capacity, and addition of the nuclease inhibitor pdTp [3',5'-diphosphothymidine], a stabilizer of the native nuclease conformation and an inducer of proper folding in truncated species (see below), diminished it further. The neoglycoprotein was nevertheless efficiently glucosylated when previously denatured with 8 M urea.
It has been reported that a truncated nuclease lacking the last 14 amino acids at the C-terminal end is per se (i.e., without any denaturing treatment) in a compact but disordered conformation, but that the enzyme can be induced to properly fold in the presence of Ca2+ and pdTp or the substrate (DNA), as judged by far UV circular dichroism (CD) and nuclear magnetic resonance spectra (39,40). This large fragment showed the same specific nuclease activity as the full length enzyme, thus indicating that both species had the same tertiary structure.
Two truncated neoglycoproteins were synthesized using this large fragment as a protein backbone, one with the oligosaccharide attached to a cysteine introduced at position 70 (1-135 K70C-Glyc), i.e., in the same position as in the full length species, and the other at position 124 (1-135 H124C-Glyc). In the latter case the cysteine was introduced in place of a histidine. Both truncated neoglycoproteins were efficiently glucosylated when pdTp was omitted from the reaction mixture (no denaturing treatment was performed). This suggests that the position of the oligosaccharide is not an important factor for glucosylation. Upon addition of pdTp, the glucose acceptor capacity of both truncated neoglycoproteins was reduced by about 60%, but without reaching the basal levels of the full length neoglycoprotein (K70C-Glyc) in the presence of pdTp (it is worth mentioning that Ca2+, the other element required for inducing proper folding, is always present in reaction mixtures since it is required for GT activity).
In order to check that the incubation of the truncated neoglycoproteins with Ca2+ and pdTp had actually led to the transition to a native conformation, the far UV CD spectra of the truncated neoglycoprotein 1-135 K70C-Glyc and the full length wild type enzyme were compared. The truncated neoglycoprotein yielded a spectrum strikingly different from that of the wild type enzyme, but both spectra became superimposable when Ca2+ and pdTp were added to the former species. This indicated that both the truncated neoglycoprotein in the presence of Ca2+ and pdTp and the wild type enzyme had the same secondary structure.
Nevertheless, the tertiary structures of K70C-Glyc (a full length neoglycoprotein) and of 1-135 K70C-Glyc (a truncated neoglycoprotein) were different: when submitted to limited proteolysis in the presence of 1 mM pdTp and 10 mM CaCl2, K70C-Glyc was barely cleaved even after a 60-min incubation with trypsin whereas 1-135 K70C-Glyc was rapidly degraded. Moreover, the specific nuclease activities of both truncated neoglycoproteins (1-135 K70C-Glyc and 1-135 H124C-Glyc) were much lower than that of the full length neoglycoprotein (K70C-Glyc). The tertiary structures of the truncated neoglycoproteins in the presence of Ca2+ and pdTp or DNA were, therefore, looser than those of the truncated nonglycosylated species. The latter had, as mentioned above, the same specific nuclease activity as the full length enzyme with or without the oligosaccharide. The different tertiary structures of full length and truncated neoglycoproteins may explain why the latter had a residual glucose acceptor capacity in the presence of Ca2+ and pdTp (36).
The excellence of GT as a sensor of the tertiary structure of glycoproteins was also revealed by the fact that the folding status of glycoproteins was paralleled by their ability to be glucosylated by the enzyme: soybean agglutinin was denatured with 6 M guanidine hydrochloride, diluted and allowed to renature under controlled conditions. Renaturation was monitored by measuring the fluorescence emission of tryptophan at 350 nm. The wavelengths of maximum tryptophan emission in soybean agglutinin are 328 and 350 nm in native and random coil conformations, respectively. The decrease in glucose acceptor capacity closely followed renaturation, thus confirming the conclusion reached above (36).

The primary sequence of UDP-Glc:glycoprotein glucosyltransferase: what it suggests
As mentioned above, GT is an excellent sensor of glycoprotein tertiary structures. The primary sequences of GT from two different species (Drosophila melanogaster and S. pombe, GenBank accession numbers U20554 and U38417, respectively) are known (41,42). In addition, we have recently sequenced the enzyme from rat liver (Trombetta ES and Parodi AJ, unpublished results). The sequence of an open reading frame (gene F48E3.3; GenBank U28735) of the Caenorhabditis elegans genome probably corresponds to the gene encoding GT, given its high identity with the other three sequences. The four sequences correspond to proteins of the same size (about 160 kDa). Moreover, retrieval sequences typical of ER soluble proteins are found at their C-terminal ends. All of them also have several N-glycosylation consensus sequences. This agrees with the fact that the mammalian and fungal enzymes interact with concanavalin A. Comparison of the D. melanogaster and S. pombe GT sequences with that obtained for the rat liver enzyme is depicted in Figure 3A and B. All three enzymes show an extremely high sequence homology at their C-terminal domains. A similar homology has been found between those domains and conceptual translations of expressed sequence tags from rice and Arabidopsis thaliana (GenBank D24933 and T23006, respectively). Since a limited but significant sequence homology may be observed between the C-terminal domains of the fly, fungal and mammalian enzymes and several bacterial glycosyltransferases that use UDP-Glc or UDP-Gal as donor substrates (41), it may be speculated that the highly conserved GT C-terminal portions are responsible for UDP-Glc recognition. The same conserved enzyme portions might also be responsible for the recognition of the other common substrate structure, that of the oligosaccharide acceptor.
The N-terminal portions of the GT sequences show a lower homology than the C-terminal ones. This is more noticeable upon comparison of the rat liver and S. pombe sequences (Figure 3B). These structural features are reminiscent of the hsp70 family of proteins, which includes the ER chaperone BiP: their N-termini, carrying ATPase activity, have a high degree of homology whereas the C-terminal portions, containing the hydrophobic peptide binding sites, are much less conserved (43). As mentioned above, under physiological salt and pH conditions, GT binds hydrophobic peptides exposed in misfolded glycoproteins (36). GT shares this property, therefore, with known chaperones and may well perform this function in vivo. The recognition of hydrophobic domains is probably one of the elements that determine the exclusive glucosylation of misfolded structures. It may be speculated that the GT N-terminal domains are responsible for binding the hydrophobic patches and that, as they have to recognize a wide variety of different structures, they may be constituted by quite different amino acid sequences. By further pursuing the analogy with hsp70 proteins, it may also be speculated that part of the energy spent in the transfer of the glucose unit is also used to facilitate the release of the enzyme from the misfolded substrate, as is the case for chaperones upon ATP hydrolysis.
The GT sequences show a certain homology with that of the S. cerevisiae Kre5 protein. This is also a soluble ER protein similar in size to the GTs. The exact function of the Kre5 protein is unknown. Experimental evidence obtained both with cell-free extracts and with intact cells showed that S. cerevisiae is the only eukaryote known to date to be devoid of GT activity (31). The C-terminal portions having the highest homology (over 66-70%) among rat liver, D. melanogaster, C. elegans and S. pombe GTs only show a much reduced (about 25%) homology with the Kre5 protein. Another difference is that certain Kre5 mutants appeared to be lethal whereas S. pombe mutant cells completely devoid of GT activity showed no discernible phenotype (42,44). As certain non-lethal Kre5 mutants were affected in cell wall glucan biosynthesis it may be speculated that Kre5 is a glucosyltransferase involved in the formation of that polysaccharide (44). Alternatively, the possibility exists that Kre5 is a GT that has lost its glucosyltransferase activity but has conserved the postulated chaperone activity.

The influence of saccharides on glycoprotein folding
Glycoproteins, like all proteins entering the so-called secretory pathway (followed by secreted proteins, lysosomal enzymes and proteins resident in several cellular membranes), acquire their tertiary structure (and in some cases also their quaternary one) in the ER. Glycoproteins that fail to fold properly are retained in that subcellular location and are degraded in the proteasome. This implies the occurrence of a very stringent quality control of folding in the ER.
With a few exceptions, saccharide moieties have no influence on the tertiary structure of mature glycoproteins. On the other hand, their influence on folding varies for different species. Three approaches have been followed to study the effect of oligosaccharides on the folding process. The first takes advantage of the fact that tunicamycin, an N-acetylglucosamine analog, inhibits the synthesis of dolichol-P-P-GlcNAc, i.e., the first step in the synthesis of Glc3Man9GlcNAc2-P-P-dolichol. Addition of this drug to intact cells drastically inhibits protein glycosylation. The second approach uses cell lines defective in steps leading to the formation of dolichol-P-P derivatives. Finally, a third approach consists of abolishing or creating new glycosylation sites in glycoproteins by site-directed mutagenesis (45).
In most cases glycoproteins show an absolute requirement for oligosaccharides for proper folding. There are, however, numerous exceptions to this rule (45). In some cases folding becomes temperature dependent whereas in others a fraction of the proteins folds properly and is transported to the Golgi apparatus, whereas the rest is degraded (46-49). Finally, in some cases the saccharides are not required for attaining the correct tertiary structure (48,50-52). Small differences in the amino acid primary sequence are often responsible for determining whether or not the oligosaccharide is required for folding. For instance, only one G protein of two strains of vesicular stomatitis virus requires the presence of the oligosaccharide for proper folding. The saccharide-dependent species can be converted into an independent one by a single point mutation (53).
Site-directed mutagenesis of glycoproteins having several N-linked oligosaccharides has very often shown that no single oligosaccharide is essential for folding and that several glycosylation consensus sequences must be suppressed to affect folding. On the other hand, the consequences of suppressing glycosylation sites may be reversed by creating new ones at entirely different locations (54-57).

Endoplasmic reticulum chaperones
Several chaperones involved in protein and glycoprotein folding are present in the ER lumen. The most abundant are BiP/GRP78 and GRP94. Two others are calnexin and calreticulin. Calnexin is a type I transmembrane protein with a molecular mass of 65 kDa (58) and a cytosolic tail with several phosphorylation sites. The C-terminus of mammalian calnexin ends with the sequence Arg-Lys-x-Arg-Arg-x, which is probably an ER retention sequence for type I transmembrane proteins (59). Calreticulin (molecular mass 46 kDa) is a soluble homolog of calnexin that ends with Lys-Asp-Glu-Leu, the known retention sequence for soluble proteins of the endoplasmic reticulum (60). Both calnexin and calreticulin have calcium-binding motifs.
What is particularly interesting about calnexin and calreticulin is that both exclusively bind glycoproteins (61,62). Moreover, inhibition of glucosidases I and II by the addition of castanospermine or 1-deoxynojirimycin to cells yielded glycoproteins that were not precipitated with anticalnexin or anticalreticulin antibodies under native conditions. Evidence was presented indicating that both calnexin and calreticulin behaved as lectin-like proteins that recognized monoglucosylated oligosaccharides. Further experiments showed that binding of calnexin and calreticulin to glycoproteins depended exclusively on the presence of monoglucosylated oligosaccharides and was independent of the tertiary structure of the protein moieties (63,64). Although mammalian cells have both calnexin and calreticulin, only calnexin has been detected in S. cerevisiae and S. pombe, and only calreticulin was found to be present in trypanosomatid protozoa (65; Labriola C and Parodi AJ, unpublished results).

The quality control of glycoprotein folding
As mentioned above, a very stringent quality control of folding is required to prevent passage of misfolded proteins from the ER to the Golgi cisternae. A model for such quality control applicable to glycoproteins has been recently proposed (45,62,66). According to this model, the transferred oligosaccharide (Glc3Man9GlcNAc2) would first be deglucosylated to Glc1Man9GlcNAc2 or Man9GlcNAc2. The high mannose-type oligosaccharides thus formed would then shuttle between monoglucosylated and unglucosylated structures, the interconversion being catalyzed by the GT and glucosidase II.
Calnexin and calreticulin would bind the monoglucosylated structures in denatured conformations and thus retain glycoproteins in the ER as long as the protein moieties are not properly folded, that is, as long as glycoproteins are reglucosylated by the GT. On attaining the correct native conformations, glycoproteins would become substrates for glucosidase II but not for GT, and thus the element recognized by the calnexin/calreticulin anchor would be eliminated. Glycoproteins would then be able to be transported to the Golgi apparatus. It was found that the interaction of calnexin/calreticulin with monoglucosylated oligosaccharides not only retains improperly folded glycoproteins in the ER but also facilitates folding of bound species, probably by preventing their aggregation and thus allowing interaction with other chaperones such as BiP (67). It was also found that, for individual glycoproteins, not all the molecules interact with calnexin/calreticulin (67). This finding correlates with the known fact that folding is an asynchronous process, i.e., different molecules of the same species might follow different folding pathways. All molecules end up with the same native conformation, although at different rates. It may be assumed that glycoprotein molecules that rapidly attain their proper tertiary structures would not interact with calnexin/calreticulin. This would also explain the fact, mentioned above, that the same oligosaccharide in individual T. cruzi glycoproteins appeared to be glucosylated by GT in some molecules but not in others (35).
This model predicts that inhibition of removal of the glucose units added by GT would delay or abolish (depending on how tightly calnexin and calreticulin bind monoglucosylated oligosaccharides) the exit of glycoproteins from the ER. This prediction cannot be tested, however, in most eukaryotic cells because, as mentioned above, glucosidase II is responsible for removal of both α(1,3)-linked glucose units, and known inhibitors of glucosidase II also inhibit glucosidase I. Addition of those drugs to cells leads to the accumulation of oligosaccharides having two or three glucoses that are not recognized by calnexin or calreticulin. The prediction can be tested, however, in trypanosomatid protozoa because, as shown in Figure 1A and B, the only glucose present in their glycoproteins is that added by the GT. It was found that addition of 1-deoxynojirimycin (an inhibitor of glucosidase II) to T. cruzi cells delayed the exit of glycoproteins from the ER (35).
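The logic of this prediction can be illustrated with a toy simulation. The Python sketch below is not taken from any of the cited studies: the discrete "cycles", the probabilities p_fold and p_deglucosylation, and all function names are assumptions introduced only for illustration. It follows a single glycoprotein that starts unfolded and unglucosylated, as in trypanosomatids. The GT tags it with glucose while it is misfolded, calnexin/calreticulin retain it while the glucose is present, and it leaves the ER only when it is both folded and deglucosylated. Lowering p_deglucosylation mimics partial inhibition of glucosidase II.

```python
import random
from dataclasses import dataclass


@dataclass
class Glycoprotein:
    folded: bool = False        # native tertiary structure attained?
    glucosylated: bool = False  # monoglucosylated oligosaccharide present?


def cycles_until_exit(p_fold, p_deglucosylation, rng, max_cycles=10_000):
    """Return the cycle at which the molecule leaves the ER, or None if retained.

    All parameters are hypothetical: p_fold is the per-cycle chance of folding,
    p_deglucosylation the per-cycle chance that glucosidase II removes the glucose.
    """
    gp = Glycoprotein()
    for cycle in range(1, max_cycles + 1):
        # Folding attempt; once folded, the molecule stays folded.
        if not gp.folded and rng.random() < p_fold:
            gp.folded = True
        if gp.glucosylated:
            # Retained by calnexin/calreticulin until glucosidase II acts.
            if rng.random() < p_deglucosylation:
                gp.glucosylated = False
        elif not gp.folded:
            # GT glucosylates only improperly folded molecules.
            gp.glucosylated = True
        if gp.folded and not gp.glucosylated:
            return cycle  # nothing left for the lectins to bind: exit to Golgi
    return None


def mean_exit_cycle(p_fold, p_deglucosylation, n_molecules=2000, seed=0):
    """Mean exit cycle over the molecules that exit, and the fraction that exit."""
    rng = random.Random(seed)
    exits = [cycles_until_exit(p_fold, p_deglucosylation, rng)
             for _ in range(n_molecules)]
    exited = [c for c in exits if c is not None]
    mean = sum(exited) / len(exited) if exited else float("nan")
    return mean, len(exited) / n_molecules


if __name__ == "__main__":
    print("control:  ", mean_exit_cycle(p_fold=0.3, p_deglucosylation=0.9))
    print("inhibited:", mean_exit_cycle(p_fold=0.3, p_deglucosylation=0.1))
```

In this sketch, molecules that fold in the first cycle are never glucosylated and exit at once, which mirrors the observation that not all molecules of a given glycoprotein interact with calnexin/calreticulin. Reducing p_deglucosylation lengthens the mean residence of the remaining molecules, and setting it to zero retains them indefinitely, corresponding to the "delay or abolish" alternatives of the prediction.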
What started as the study of the initial steps in the processing of N-linked oligosaccharides in trypanosomatid protozoa ended up as a model for the quality control of glycoprotein folding in the ER. This confirms the ultimate principle, one of Murphy's laws: by definition, when you are investigating the unknown, you do not know what you will find.

Figure 1 - Structure of oligosaccharides. The structure is that of the oligosaccharide transferred to proteins in wild type mammalian, plant and fungal cells. The lettering used for the identification of the individual monosaccharide residues (a, b, c, ...) follows the same order as the addition of the monosaccharides in the assembly of the oligosaccharide. The oligosaccharides with compositions Man9GlcNAc2, Man7GlcNAc2 and Man6GlcNAc2 transferred in different trypanosomatid species are devoid of residues l, m and n, or of j, k, l, m and n, or of i, j, k, l, m and n, respectively.

Figure 2 - Processing of N-linked oligosaccharides. The processing reactions occurring in the endoplasmic reticulum of mammalian, plant and fungal (A) or T. cruzi (B) cells are shown. Pr stands for protein.