Protein PEGylation for the design of biobetters: from reaction to purification processes

The covalent attachment of polyethylene glycol (PEG) to therapeutic proteins is an important route to develop biobetters for the biomedical, biotech and pharmaceutical industries. PEG conjugation can shield antigenic epitopes of the protein, reduce degradation by proteolytic enzymes, enhance long-term stability and maintain or even improve the pharmacokinetic and pharmacodynamic characteristics of the protein drug. Nonetheless, accurate information on the PEGylation process, from reaction to downstream processing, is of paramount importance for industrial application and process scale-up. In this review we present and discuss the main steps in protein PEGylation, namely: PEGylation reaction, separation of the products and final characterization of the structure and activity of the resulting species. These steps are not trivial, which is why bioprocessing operations based on PEGylated proteins rely on analytical tools chosen according to the specific pharmaceutical conjugate being developed. Therefore, the appropriate selection of technical and analytical methods may ensure success in implementing a feasible industrial process.


PEGYLATION FOR BIOBETTERS DEVELOPMENT AND PRODUCTION
Therapies based on biological drugs represented a revolutionary innovation in the pharmaceutical industry owing to their success in overcoming medical challenges such as haemophilia, diabetes, arthritis and diseases of the immune system. Because of patent protection strategies and the high market price of biological drugs, the companies holding innovative molecules generated considerable revenue (Calo-Fernández, Martínez-Hurtado, 2012). In 2016, the global market of biological drugs reached US$ 209.8 billion (Transparency Market Research, 2016) and Roche® was the company with the highest income, reaching US$ 38.7 billion in 2015 (Spadiut et al., 2014). However, some drawbacks are intrinsic to biological drugs, and more specifically to protein drugs, which usually present immunogenicity and a short plasma half-life.
Immunogenicity is mainly related to the production of anti-drug antibodies (ADA), which reduce clinical efficacy by neutralizing biological activity or by inducing hypersensitivity, including anaphylactic reactions (Barbosa et al., 2012; Kuriakose, Chirmule, Nair, 2016). The short biological half-life of protein drugs means that more frequent administrations are needed to achieve the desired clinical effect in patients (Ryu, Kim, Nam, 2012). Another concern related to commercial protein drugs is economic in nature and is called the "Patent Cliff", a market phenomenon well described for chemical drugs that is now occurring with biological ones (Calo-Fernández, Martínez-Hurtado, 2012). It refers to a sharp drop in sales of blockbusters following the end of their patent protection, which can affect the main participants in the biological drug industry negatively or positively, depending on the marketing strategies adopted (Calo-Fernández, Martínez-Hurtado, 2012).
Considering the economic need to remain competitive in the market and the necessity to solve the inherent problems of biological drugs, a new generation of protein-based medicines has emerged: the follow-on biologics. This new generation of biological drugs is divided into two main groups: the biosimilars (Barbosa et al., 2012; Calo-Fernández, Martínez-Hurtado, 2012) and the biobetters (Ryu, Kim, Nam, 2012; Gorham, 2015). Both biosimilars and biobetters are similar to a reference product; however, biosimilars aim to establish similarity to a known biological, whereas biobetters seek superiority in one or more aspects of their clinical profile (Sassi et al., 2015). In this sense, biosimilars have the same amino acid sequence as the originator biological drug, and must also have the same safety, purity and efficacy profile (Beck, Sanglier-Cianférani, Van Dorsselaer, 2012). A biobetter, on the other hand, is a biological molecule derived from an originator by chemical or molecular modifications that generate functional changes, including increased half-life, reduced toxicity, reduced immunogenicity and/or enhanced pharmacodynamics (Beck, Sanglier-Cianférani, Van Dorsselaer, 2012; Sassi et al., 2015).
Biobetters represent an opportunity for innovation with reduced risk and increased sales for manufacturers, since the mechanism of action of the originator molecule is already known. At the same time, they offer improved treatment for patients and the possibility of cost reduction for health systems. One of the main tools for the development of biobetters is PEGylation, a technique in which at least one chain of polyethylene glycol (PEG) is covalently attached to the structure of the protein (Figure 1) (Hoffman, 2016). PEG is a biocompatible polymer that presents low immunogenicity, antigenicity and toxicity, is soluble in water and in organic solvents, is readily cleared from the body and has high mobility in solution, making it the polymer of choice for bioconjugation (Jevševar, Kunstelj, Porekar, 2010). Additionally, PEG is one of the few synthetic polymers approved by the US FDA for internal administration. Since the first report of protein PEGylation in the 1970s (Hoffman, 2016), many proteins and peptides have been covalently conjugated with PEG and many more are currently in clinical trials. To date, 14 biobetters (PEGylated proteins, peptides, antibody fragments, and oligonucleotides) have been approved by the FDA and are currently on the market (Table I). Of these, 12 are PEGylated proteins, with a total market value of over US$ 8 billion per year (Ginn et al., 2014).
PEGylation technology was first reported in 1977 (Abuchowski et al., 1977) for the modification of albumin and catalase. Depending on the number, molecular weight and location of the attached PEG chains, this covalent modification can enhance the physicochemical properties of the protein without compromising its secondary structure (González-Valdez, Rito-Palomares, Benavides, 2012).
The PEGylation strategy provides a number of advantages for protein conjugates, such as (i) protection of antigenic sites present on the protein surface, i.e. antigenic epitopes; (ii) prevention of in vivo degradation by endocytosis and proteolytic enzymes; (iii) increase in apparent protein size and hydrodynamic volume, which reduces renal filtration, alters biodistribution and increases in vivo half-life; (iv) increased water solubility and reduction of protein aggregates due to steric repulsion between the PEGylated surfaces; (v) increased thermal and long-term stability (Beck, Sanglier-Cianférani, Van Dorsselaer, 2012; Sassi, Nagarkar, Hamblin, 2015). It may also promote sustained release of the originator drug (Monfardini et al., 1995; Veronese, Caliceti, Schiavon, 1997).
Several studies have focused on the use of protein PEGylation to develop novel biobetters. A thorough survey of the scientific literature from 1991 to November 2017 yielded 1450 articles in which protein PEGylation was used in biobetter development (Figure 2). PEGylation is therefore clearly a topic of intense interest in the biopharmaceutical field. In this paper, we review the main concepts, strategies and pitfalls of protein PEGylation aimed at biobetter manufacturing.

PEGYLATION REACTION DESIGN
Selecting the appropriate reaction chemistry in the design of PEGylated molecules is the first step towards a successful process (Pasut, Veronese, 2012; Pfister, Morbidelli, 2014). PEGylation reactions have been extensively reviewed in the literature (Jevševar, Kunstelj, Porekar, 2010; Palm, Esfandiary, Gandhi, 2011; Pasut, Veronese, 2012; Ginn et al., 2014; Pfister, Morbidelli, 2014). The selection of the PEG derivative (the reactive PEG used in the PEGylation reaction) is fundamental, since it depends strongly on the location and number of amino acids that can be PEGylated (Veronese, 2001; Zhou, He, Wang, 2016). Generally, the reactive groups in proteins that covalently bind activated PEG molecules are nucleophiles (Da Silva Freitas, Mero, Pasut, 2013; Zhou, He, Wang, 2016), with the following moieties ranked in decreasing order of reactivity: thiol, α-amino, ε-amino, carboxyl, and hydroxyl (González-Valdez et al., 2012). Moreover, the number and local reactivity of the available PEGylation sites (nucleophilic groups of the amino acids), the experimental conditions of the PEGylation reaction (i.e. pH, temperature, reaction time and molar ratio between PEG derivative and protein), and the reactivity of the PEG derivative influence the final composition of PEGylated products (González-Valdez et al., 2012). PEG coupling may result in heterogeneous mixtures of PEGylated conjugates with several degrees of PEGylation and non-PEGylated forms (Pasut, Veronese, 2012).
PEGylation reactions are preferably conducted in a single-step unidirectional batch system to guarantee batch-to-batch control and the formation of all products under the same conditions. This favours validation, reproducibility and optimization of the reaction, while making it easy to trace the potential formation of undesirable products (i.e. PEGylation adducts) (González-Valdez et al., 2012). Maximization of the PEGylation yield and specificity of every reaction is also required. The main challenges of PEGylation are (i) to design site-specific PEGylation reactions that avoid heterogeneity of PEGylated conjugates; (ii) to shorten reaction times and thus increase PEGylation productivity; (iii) to decrease the amount of reactive PEG and the overall cost of the process.
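The molar ratio between PEG derivative and protein is one of the reaction conditions listed above, and it directly sets how much reactive PEG a batch consumes. As a minimal sketch (the function name and the example values are hypothetical, not taken from any cited protocol), the required mass of PEG derivative can be estimated from the protein mass, the two molecular weights and the chosen molar ratio:

```python
def peg_mass_mg(protein_mass_mg, protein_mw_da, peg_mw_da, molar_ratio):
    """Mass of PEG derivative (mg) for a given PEG:protein molar ratio.

    Hypothetical helper for illustration only. It assumes the stated
    molecular weights are exact and ignores reagent purity and hydrolysis.
    """
    inputs = (protein_mass_mg, protein_mw_da, peg_mw_da, molar_ratio)
    if min(inputs) <= 0:
        raise ValueError("all inputs must be positive")
    # moles of protein are proportional to mass / MW; the PEG mass scales
    # with the molar ratio and with the PEG:protein molecular weight ratio
    return protein_mass_mg * molar_ratio * peg_mw_da / protein_mw_da


# Illustrative values: 10 mg of a 14.3 kDa protein, 5 kDa PEG, 5:1 molar ratio
print(round(peg_mass_mg(10.0, 14300.0, 5000.0, 5.0), 2), "mg of PEG derivative")
```

Because the PEG mass grows linearly with the molar ratio, lowering the excess of reactive PEG (challenge iii) translates directly into lower reagent consumption per batch.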
Reactive PEGs for random PEGylation usually target amino groups of the protein, most frequently the ε-amino groups of the side chains of lysine residues. Examples of randomly PEGylated biobetters available in the market are Adagen® and Oncaspar® (Table I), which are both complex mixtures of various conjugates PEGylated at lysine residues and N-termini (higher polydispersity). PEG-INTRON® (Sylatron™), PEGASYS® and Mircera® are also examples of randomly PEGylated drugs (Table I), but in this case they are mixtures of mono-PEGylated positional isomers exhibiting extended half-lives in comparison to the originator drugs. PEGylation reactions can be directed towards the formation of site-specific PEGylated conjugates by the optimization of reaction conditions (Veronese, 2001; Zhao et al., 2012; Da Silva Freitas, Mero, Pasut, 2013). An example of a site-specific PEGylated biobetter available in the market is Neulasta® (Table I), an N-terminally mono-PEGylated granulocyte colony-stimulating factor (G-CSF).
PEGylation of therapeutic proteins usually involves the use of mono-methoxy PEG (mPEG), approved by the FDA and EMA. Since it has only one reactive hydroxyl group, undesirable by-products, i.e. cross-linked products, are avoided (Jevševar, Kunstelj, Porekar, 2010; Sassi, Nagarkar, Hamblin, 2015). One of the most common mPEG derivatives is the amino-reactive N-hydroxysuccinimide (NHS) functionalized polyethylene glycol (PEG-NHS), used to modify proteins, peptides or any molecule or structure with available amino groups. The reaction of NHS esters with primary amine groups at pH 7-8.5 results in stable amide bonds. Compared to other PEG NHS ester derivatives, the succinimidyl carbonate (SC) functionalized mPEG-NHS offers superior reactivity and higher stability in aqueous solution (Nanocs, 2017).
In PEGylation, linear PEGs are the conventional and simplest conjugating agents (Figure 3). With this type of PEG, the protein is conjugated at the distal end of the PEG molecule through a single attachment site (Roberts, Bentley, Harris, 2012). Bifunctional PEGs are linear PEGs with two available sites for protein conjugation, meaning that at most two biomolecules may be conjugated, which limits the loading capacity compared with more recent PEG derivatives (Veronese, Caliceti, Schiavon, 1997; Gokarn, McLean, Laue, 2012). Moreover, bifunctional PEGs may significantly increase viscosity compared to the originator drug formulation, since one reactive PEG molecule can conjugate to two different protein molecules, resulting in protein cross-linking and hydrogel formation. In addition, when a cleavable PEG is employed, linear PEGs of large molecular weight may impede the appropriate release of small protein drugs, preventing them from reaching therapeutic concentrations at the target sites (Roberts, Bentley, Harris, 2012). To overcome these drawbacks, several novel types of PEG derivatives have been synthesized, including Y-shaped PEGs, forked PEGs and multi-arm PEGs (Figure 3) (Roberts, Bentley, Harris, 2012; Pfister, Morbidelli, 2014). Y-shaped PEGs have an "umbrella-like" structure, linking two linear PEG derivatives to active groups of amino acids. This structure provides better protection than linear PEGs against antibody recognition and cleavage by proteolytic enzymes. This PEG variant has been tested with several proteins (e.g. ribonuclease, catalase, asparaginase and trypsin, among others), but is not applied as frequently to peptides and small-molecule drugs (Monfardini et al., 1995).
Forked PEGs provide multiple proximal reactive groups at one or both ends of a linear PEG chain (Veronese, Caliceti, Schiavon, 1997). The first report of a forked PEG synthesis dates from 1999 (Harris, Kozlowski, 1999); it describes the attachment of a trifunctional linker, such as serinol or β-glutamic acid, to the terminus of the polymer backbone. Forked PEGs are more useful for conjugating small molecules than proteins, since forked PEGs functionalized at both ends of the polymer chain generate protein hydrogels rather than soluble PEGylated proteins. Nonetheless, forked PEGs may find application in biobetter design when attached to Fab' antibody fragments to produce a conjugate similar in structure to the full-length antibody (Constantinou, Chen, Deonarain, 2010). Multi-arm PEGs are star-like structures carrying multiple hydroxyl or functional groups, which increases the number of active sites and the molecular weight (Kim et al., 2016). Like forked PEGs, multi-arm PEGs have been little explored for attachment to proteins because of protein cross-linking. They have been widely investigated for the conjugation of drugs such as NKTR-102 (PEG-irinotecan) (Adkins et al., 2015), EZN-2208 (PEG-SN38) (Garrett et al., 2013) and NKTR-214 (PEG-aldesleukin) (Charych et al., 2016), which are some examples of PEGylated drugs that have entered clinical trials.
PEGylation processes may be classified as "first-generation" and "second-generation". First-generation processes involved random PEGylation, which results in multiple isoforms and a lack of control over the physicochemical and pharmaceutical properties of the final product (i.e. presence of mixtures of isomers with batch-to-batch variation and unstable bonds). Despite these limitations, first-generation PEGylated drugs are still in use today. Second-generation PEGylated biomolecules, in contrast, are obtained with novel PEG derivatives (Figure 3), including higher molecular weight and branched structures. In comparison to linear PEG, branched PEGylation decreases immunogenicity and increases half-life, but the decrease in activity of the biomolecules is usually more pronounced. Recent trends in PEGylation recognise a "third-generation" technology, which aims to preserve the drug's bioactivity using novel non-linear PEG derivatives and alternative PEGylation strategies (Swierczewska, Lee, Lee, 2015), discussed further below. Several sites of proteins can be targeted for PEGylation (Figure 4), and in the next subsections we explore different PEGylation strategies that can be applied in the development of novel biobetters.

PEGylation of ε-amino groups
Amine groups are present in large numbers on the surface of most therapeutic proteins, as lysine residues or N-termini, and PEGylation of lysine ε-amino groups is the most studied type of PEGylation (Figure 5). It may be performed via N-alkylation or N-acylation using activated PEG carbonates or carboxylates (Pasut, Veronese, 2012; Zhang et al., 2012).
The main drawback of this approach is the low selectivity of the PEG position, since after random attachment to some lysine residues, steric hindrance may occur at neighbouring lysine residues (Pfister, Morbidelli, 2014). This leads to significant variation in the number of chains introduced and in their location, resulting in a mixture of heterogeneous isomers (Ginn et al., 2014) with batch-to-batch variation. The first two PEGylated proteins to reach the market were adenosine deaminase (ADA) and asparaginase, both modified with 5 kDa succinimidyl succinate-activated PEG (mPEG-SS): pegademase (Adagen®, 11-17 PEG molecules) and pegaspargase (Oncaspar®, 69-82 PEG molecules) (Turecek et al., 2016).
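To give a feel for how quickly random amine PEGylation generates heterogeneity, the sketch below uses a deliberately simplified model: every reactive amine is assumed to be modified independently and with the same probability, which gives a binomial distribution of the degree of PEGylation. This is an assumption made for illustration only; as noted above, steric hindrance between neighbouring sites means real reactions deviate from it. The function names and example numbers are hypothetical.

```python
from math import comb


def degree_distribution(n_sites, p_site):
    """Fraction of protein molecules carrying k PEG chains, k = 0..n_sites.

    Simplifying assumption: sites react independently with equal
    probability p_site (binomial model), ignoring steric hindrance.
    """
    if n_sites < 0 or not 0.0 <= p_site <= 1.0:
        raise ValueError("need n_sites >= 0 and 0 <= p_site <= 1")
    return [
        comb(n_sites, k) * p_site**k * (1.0 - p_site) ** (n_sites - k)
        for k in range(n_sites + 1)
    ]


def positional_isomers(n_sites, degree):
    """Number of distinct ways to place `degree` PEG chains on n_sites sites."""
    return comb(n_sites, degree)


# Illustrative protein with 7 reactive amines (e.g. 6 lysines + N-terminus)
fractions = degree_distribution(7, 0.3)
for k, fraction in enumerate(fractions):
    print(f"{k} PEG: fraction {fraction:.3f}, {positional_isomers(7, k)} positional isomers")
```

Even under this idealized model, a protein with a handful of reactive amines yields several degrees of PEGylation at once, each made up of multiple positional isomers, which is consistent with the heterogeneous mixtures described for first-generation products.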
It is known that arginine residues are less prone to reaction with the reactive PEG due to delocalization of charge in the guanidinium group. Thus, molecular strategies such as replacement of lysine by arginine at essential sites often result in proteins with retention of activity and more controlled PEGylation at ε-amino groups. The reverse strategy, i.e. the replacement of some neutral residues by lysine at non-essential sites, adds new regions for PEGylation. These strategies control the number of possible PEGylation sites and may result in more homogeneous PEGylated preparations (Pasut, Veronese, 2012; Zhang et al., 2012).

PEGylation of thiol groups
Thiol groups are very suitable for site-specific PEGylation, since specific covalent conjugation is possible even in the presence of other protein nucleophiles (Ginn et al., 2014) and the reduction of protein activity is usually lower. Even a few PEGylated thiol groups can improve the pharmacokinetic properties of therapeutic proteins (Zhang et al., 2012). The most commonly used thiol-reactive PEGylation agents include maleimide PEGs, which form a thioether bond by Michael addition, together with derivatives such as orthopyridyl disulphide (PEG-OPSS) and tosylate (PEG-TS) (Pasut, Veronese, 2012; Pfister, Morbidelli, 2014). The reaction pH should be carefully buffered at values below the pKa of lysine residues (usually 9.3 to 10.5) to avoid the coupling of protein amine groups to maleimide (Pasut, Veronese, 2012).
The main limitation of this technique is that proteins rarely present cysteine residues in reduced form; they are generally involved in disulfide bonds. When present, reduced cysteine residues are located mainly in inaccessible hydrophobic domains and for that reason present low reactivity (Pasut, Veronese, 2012; Pfister, Morbidelli, 2014). This problem can be overcome by using genetic engineering tools to insert one or more free cysteine residues on the protein surface to facilitate site-specific PEGylation (Constantinou, Chen, Deonarain, 2010). CIMZIA® is a PEGylated anti-TNF recombinant antibody Fab fragment, the only FDA-approved protein with thiol group PEGylation (Figure 6). It is approved for the treatment of rheumatoid arthritis, Crohn's disease, and axial spondyloarthritis. CIMZIA® is produced in E. coli and covalently bound to a 40 kDa branched PEG at a cysteine residue, which was inserted three amino acids from the C-terminus of the heavy chain antibody fragment by genetic engineering (Turecek et al., 2016).

PEGylation of disulfide bonds
Recently, disulfide bonds have been considered as targets for site-specific PEGylation, mainly as a way to overcome the lack of free cysteines in proteins (Pfister, Morbidelli, 2014). The applicability of this approach to therapeutic protein PEGylation is still under investigation, but it is known that disulfide bridges are present in small numbers in therapeutic proteins, making PEGylation at these sites attractive due to the possibility of obtaining homogeneous conjugates. However, the natural distance between the sulfur atoms must be preserved upon PEGylation, since these bonds are essential for protein conformation (Pasut, Veronese, 2012). The technique was first proposed by Brocchini et al. (2006), using a bis-thiol alkylating PEG reagent capable of forming a three-carbon bridge after reduction of the disulfide bonds of the protein.
The mild reduction of the accessible native disulfide bonds enables site-specific PEGylation at the sulfur atoms, while retaining the protein tertiary structure (Figure 7) (Pasut, Veronese, 2012; Kolate et al., 2014).

N-terminal and C-terminal PEGylation
N-terminal PEGylation is considered a site-directed reaction (Ginn et al., 2014), since there is only one such group per protein chain and the number of PEGylation sites is dramatically reduced (Pfister, Morbidelli, 2014). The reaction selectivity is based on pKa differences between the ε-amino group of lysine residues (9.3-10.5) and the N-terminal α-amino group of proteins (7.6 to 8). Accordingly, at pH values lower than 9.3 the lysine residues will be predominantly protonated and unavailable to react with active PEG (Pasut, Veronese, 2012). In some cases, the determination of the optimal pH conditions can be difficult, since small changes may result in competition from other amine groups and the optimum pH varies between proteins (Ginn et al., 2014). Molecular strategies can be used to improve selectivity, such as depletion or modification of some lysine residues on the protein surface (by chemical or genetic engineering) and functionalization of the N-terminal portion with reactive carbonyl groups that are more selective and efficient (Ginn et al., 2014; Pfister, Morbidelli, 2014). Neulasta® (Pegfilgrastim) and Plegridy® (Peginterferon β-1) are two examples of protein N-terminal PEGylation with 20 kDa PEG.
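The pKa argument above can be made quantitative with the Henderson-Hasselbalch relation, which gives the fraction of an amine that is unprotonated (and therefore nucleophilic) at a given pH. The sketch below is illustrative only: the default pKa values (7.8 for the α-amino group, 10.0 for the ε-amino group) are assumed mid-range picks from the intervals quoted in the text, and real reactivity also depends on accessibility and local environment.

```python
def unprotonated_fraction(ph, pka):
    """Fraction of an amine in the unprotonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def alpha_over_epsilon(ph, pka_alpha=7.8, pka_epsilon=10.0):
    """Ratio of unprotonated alpha-amino to unprotonated epsilon-amino groups.

    Default pKa values are assumptions chosen within the ranges given in
    the text (7.6-8 and 9.3-10.5); they are not measured for any protein.
    """
    return unprotonated_fraction(ph, pka_alpha) / unprotonated_fraction(ph, pka_epsilon)


for ph in (5.0, 6.0, 7.0, 8.0, 9.0):
    print(
        f"pH {ph}: alpha {unprotonated_fraction(ph, 7.8):.4f}, "
        f"epsilon {unprotonated_fraction(ph, 10.0):.6f}, "
        f"ratio {alpha_over_epsilon(ph):.1f}"
    )
```

In this simple picture the per-group preference for the N-terminus grows as the pH is lowered, while the absolute fraction of reactive α-amino groups falls, which is one way to see why the optimum pH has to be found for each protein.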
In the same way, each protein chain has one C-terminus that can be used for site-specific PEGylation. This reaction can be performed by introducing a hydrazine group at this location using a fusion technique with inteins (Pasut, Veronese, 2012; Kolate et al., 2014). An intein is a protein segment that is self-excised through a process known as protein splicing (Kolate et al., 2014). The conjugation reactions include intein excision, yielding the protein of interest with one hydrazine group at the C-terminus, which can react specifically with PEG molecules functionalized with aldehyde or ketone groups (Pasut, Veronese, 2012). This approach has recently been applied to IFN-α (Pasut, Veronese, 2012; Ginn et al., 2014; Kolate et al., 2014) and IFN-β (Thom et al., 2011); however, no conjugate PEGylated by this technique has reached the stage of clinical research (Ginn et al., 2014).

Enzymatic PEGylation
This site-specific PEGylation is based on a transglutaminase (TGase)-catalysed acyl transfer reaction between glutamine (Gln) residues and the primary amino groups of reactive PEGs (Figure 8) (Sato, 2002). For chimeric proteins, a short sequence of Gln residues is incorporated at the protein terminus by genetic engineering, without disturbing its flexibility and conformation, and then PEGylated with primary amine derivatives of PEG in the presence of the enzyme. Compared with other strategies, TGase-mediated conjugations were found to be more site-specific, reproducible and versatile (Fontana et al., 2008). Enzymatic PEGylation with TGase has already been investigated for several proteins, such as apomyoglobin (apoMb), human growth hormone (hGH), α-lactalbumin (α-LA), human interleukin-2 (hIL-2) and human granulocyte colony-stimulating factor (hG-CSF) (Fontana et al., 2008; Mero et al., 2009).

Released and non-covalent PEGylation
One of the main drawbacks of protein PEGylation is the loss of activity of therapeutic proteins. One solution is to target amino acids at non-essential sites, but this often requires laborious protein engineering, which can bring about conformational changes in the protein structure (Gong, Leroux, Gauthier, 2015). The reversible attachment of the PEG polymer to the protein is therefore of interest as a way to obtain conjugates with longer half-life while preserving pharmacodynamic properties (Pfister, Morbidelli, 2014). Some of the approaches to reversible PEGylation include bicine linkers, histidine PEGylation, glucose and different ligands ("linkers") that are cleaved by serum proteases (Nollmann et al., 2013) and other enzymes, especially via a β-elimination reaction (Figure 9). An important requirement is that the linker does not leave any residue on the protein after cleavage of the bond. The pharmacodynamic response must also be controlled and optimized (Pasut, Veronese, 2012; Gong, Leroux, Gauthier, 2015).
Non-covalent PEGylation, in turn, is based on hydrophobic interactions (Pasut, Veronese, 2012), coordination complex formation (Mero et al., 2011), and protein-polyelectrolyte and protein-block copolymer complexes (Kurinomaru, Shiraki, 2015). Research on this particular type of PEGylation has increased in recent years and promising results have been observed, but more extensive studies are still needed, especially to demonstrate significant improvement in pharmacokinetic properties (Pfister, Morbidelli, 2014).
One of the main disadvantages of this type of PEGylation refers to the release of the protein during purification steps or during storage, which is inherent to non-covalent conjugates.To avoid cleavage during storage, additional techniques such as lyophilization are required.This can be a challenge due to the difficulty of preserving PEGylated protein structure/activity during the freezing step (Pasut, Veronese, 2012;Kolate et al., 2014).

CURRENT ADVANCES IN THE PURIFICATION OF PEGYLATED PROTEINS
PEGylation reactions commonly result in heterogeneous mixtures of unreacted protein, PEGylated conjugates and undesired PEGamers (proteins with varying numbers of attached PEG molecules) (Yoshimoto, Yamamoto, 2012; Moosmann, Müller, Böttinger, 2014). Furthermore, PEGylated conjugates that differ from one another in the number, length and attachment sites of the grafted chains may be formed. Therefore, efficient downstream strategies are needed to purify these complex mixtures for commercial approval (Yoshimoto, Yamamoto, 2012). Purification of PEGylated proteins poses three main challenges: (i) isolation and recycling of the unreacted protein from the PEGylated proteins, (ii) isolation of each PEGylated protein form from the reaction media (e.g. PEG derivative, undesired PEGamers and other reagents such as hydroxylamine), and (iii) fractionation of PEGylated conjugates based on the degree of PEGylation. This multifaceted challenge is not easy to overcome, since PEG-protein conjugates are structurally similar to the originator protein (González-Valdez, Rito-Palomares, Benavides, 2012). A combination of chromatographic (Moosmann, Müller, Böttinger, 2014) and/or non-chromatographic (Mayolo-Deloisa et al., 2011) techniques is usually designed for each PEGylation process, exploiting the physicochemical properties of the molecules present in the PEGylation reaction mixture. In recent years, chromatographic fractionation platforms have been commonly used in downstream processes for PEGylated proteins (Fee, 2003; Moosmann et al., 2010; 2014; Müller et al., 2010; Mayolo-Deloisa et al., 2012), as shown in Figure 10. Non-chromatographic techniques have been suggested as alternatives in recent years, since they can exhibit advantages such as high versatility, ease of scale-up and low overall processing cost and time (Cramer, Holstein, 2011). However, these are not fully characterized for the fractionation and analysis of PEGylated conjugates (Mayolo-Deloisa et al., 2011). Currently, efforts are being made to characterize and optimize those purification platforms for a larger spectrum of PEGylated proteins (Mayolo-Deloisa et al., 2011; Galindo-López, Rito-Palomares, 2013).

Chromatographic fractionation platforms
As mentioned above, most purification processes for PEGylated proteins are based on chromatographic techniques, especially size exclusion chromatography (SEC) (Maiser et al., 2015) and ion exchange chromatography (IEX) (Zhao et al., 2012). SEC is a chromatographic method in which molecules in solution are separated by molecular size, which correlates with molecular weight (MW). It is well recognized that SEC can be used to separate PEGylated species from unreacted protein and other components, but the effectiveness will greatly depend upon the molecular size of the species involved (Silva Freitas, Abrahao-Neto, 2010). For instance, the conjugation of a single PEG polymer to a protein of the same MW more than doubles the protein molecular radius, due to the steric elongation of PEG chains (Fee, Van Alstine, 2011). Therefore, native and mono-PEGylated proteins should be readily separable by SEC (Fee, Van Alstine, 2004). Nonetheless, the resolution between chromatographic peaks decreases as the extent of PEGylation increases, due to the presence of more PEGylated conjugates (Fahrländer et al., 2015; Maiser et al., 2015). SEC is not able to separate the positional isomers of PEGylated conjugates, due to the minimal differences in radius among these protein species (Fee, Van Alstine, 2011). Proteins with different degrees of PEGylation may be separated by IEX, since this chromatographic technique separates proteins based on net surface charge and, for each PEG molecule attached to an amino group, for example, a PEGylated protein has one less positive charge (Fee, Van Alstine, 2011). By choosing the optimal ion exchanger and separation conditions, high resolution can be obtained. The best protocol must be established case by case, depending on the specific protein and type of PEGylation (Fee, Van Alstine, 2011). In recent years, IEX has been the most used downstream technique for the separation of PEGylated conjugates (Figure 10) (Abe et al., 2010; Moosmann et al., 2012; Zhao et al., 2012; Morgenstern et al., 2017). The major challenge of IEX is that PEG chains sterically interfere with the interaction between the charged residues of the protein and the ion exchange support, producing a masking effect on the charges (Fee, Van Alstine, 2011).
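The charge argument behind IEX separation can be illustrated with a nominal "charge ladder". The sketch below is a rough bookkeeping exercise, not a model of retention: it assumes that all ionizable groups are fully charged at the working pH, that each PEG chain is coupled to an amino group in a way that removes its positive charge (as in the example given in the text), and it ignores the steric masking effect just described. The function names and residue counts are hypothetical.

```python
def nominal_net_charge(n_positive, n_negative, n_peg_on_amines):
    """Nominal net charge after PEGylating n_peg_on_amines amino groups.

    Assumes fully ionized groups and loss of one positive charge per PEG
    chain attached to an amine; PEG charge masking is not modelled.
    """
    if not 0 <= n_peg_on_amines <= n_positive:
        raise ValueError("PEGylated amines must be between 0 and n_positive")
    return (n_positive - n_peg_on_amines) - n_negative


def charge_ladder(n_positive, n_negative, max_degree):
    """Map each degree of PEGylation (0..max_degree) to its nominal net charge."""
    return {
        degree: nominal_net_charge(n_positive, n_negative, degree)
        for degree in range(max_degree + 1)
    }


# Hypothetical protein with 10 positive and 6 negative groups
for degree, charge in charge_ladder(10, 6, 3).items():
    print(f"{degree} PEG chain(s): nominal net charge {charge:+d}")
```

Each additional PEG chain shifts the nominal net charge by one unit, which is the basis for resolving species by degree of PEGylation; positional isomers share the same nominal charge, so any separation between them must come from local effects such as charge masking.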
Hydrophobic interaction chromatography (HIC) is also used to fractionate PEGylated proteins (Müller et al., 2010; Mayolo-Deloisa et al., 2012; Moosmann, Müller, Böttinger, 2014), although not as extensively as IEX and SEC. HIC takes advantage of the changes in hydrophobicity brought about by PEGylation, separating PEGylated conjugates based on their relative hydrophobicity/hydrophilicity (Fee, Van Alstine, 2011). However, HIC has low capacity and poor resolution between adjacent peaks. To date, it is not possible to conclude which type of chromatographic technique is the best option to fractionate PEGylated conjugates. Furthermore, there are no generalized chromatographic protocols to purify PEGylated proteins; separation strategies must be developed on a case-by-case basis (Fee, Van Alstine, 2011; Mayolo-Deloisa et al., 2011).

Non-chromatographic fractionation platforms
The non-chromatographic fractionation platforms applied in the purification of PEGylated proteins are: capillary electrophoresis (CE), membrane separation techniques (ultrafiltration, diafiltration and dialysis) and aqueous biphasic systems (ABS) (Mayolo-Deloisa et al., 2011).Membrane separation processes are the simplest among all non-chromatographic techniques currently used and are based on the molecular weight and hydrodynamic radius of the proteins.PEGylated species can be efficiently fractionated and recovered using ultrafiltration and diafiltration (Cheang, Zydney, 2003;Ruanjaikaen, Zydney, 2011).However, ultrafiltration methods may generate high product losses, particularly when using membranes with pores considerably smaller than the hydrodynamic radius of the PEGylated protein (Kwon, Molek, Zydney, 2008).Even though membrane separation techniques are not able to separate conjugates according to their positional isomerism, they present certain advantages, such as costs, over SEC and IEX.Capillary electrophoresis (CE) proved to be a powerful tool for the high-resolution separation of different PEGylated products; it has been used in the analysis and small-scale purification of PEGylated proteins (Li et al., 2001;Na et al., 2008;Lee, Na, 2010).The major advantages of this technique are its automation capability, low sample consumption and short-time process (Caslavska, Thormann, 2004).On the other hand, CE lacks in the industrial process due to technique scaleup incapability.
To overcome the main drawbacks of non-chromatographic downstream processes, microfluidic devices are emerging as a promising strategy for high-throughput monitoring of the purification of PEGylated proteins. The design of these novel devices for rapid separation, concentration and recovery of PEGylated proteins in a one-step operation is a current trend in non-chromatographic downstream processing (Yoshimoto, Yamamoto, 2012; Mata-Gómez et al., 2016).
Aqueous biphasic systems offer an efficient alternative downstream processing platform for PEGylated proteins due to their versatility, ease of scale-up and low cost (Santos et al., 2017). Several works have addressed the application of ABS as purification tools for the separation of PEGylated proteins (Delgado et al., 1994; González-Valdez et al., 2011; González-Valdez, Rito-Palomares, Benavides, 2013; Santos et al., 2017). Delgado et al. (1994) used PEG/dextran ABS to characterize the degree of PEGylation (n) in mixtures of PEG-protein conjugates of bovine serum albumin and granulocyte-macrophage colony stimulating factor. These authors determined the relationship between the increase in the partition coefficient (K) of the PEGylated conjugate and the degree of PEGylation of the protein species present in the resulting mixtures. González-Valdez et al. (2011) studied PEG/salt-based ABS for the separation of native RNase A and lactoalbumin from their respective PEGylated conjugates. The results indicated the potential of ABS for the fractionation of PEGylated proteins from the respective unreacted proteins. However, the sub-fractionation of PEGylated proteins from one another (according to the degree of PEGylation) was not achieved. Extended applications of ABS were achieved by combining this technique with continuous scale-up platforms, such as counter current distribution ABS (CCD-ABS). CCD-ABS has already been applied to the fractionation of PEGylated forms of lysozyme and RNase (Sookkumnerd, Hsu, 2000; Galindo-López, Rito-Palomares, Benavides, 2013).
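As a minimal illustration of the partitioning analysis described above, the partition coefficient can be computed as the ratio of the conjugate concentration in the top phase to that in the bottom phase, and log K can then be regressed against the degree of PEGylation n. The log-linear form of the calibration, the function names and all numerical values in this sketch are assumptions chosen for illustration; they are not data from the cited studies.

```python
import math


def partition_coefficient(c_top, c_bottom):
    """K = concentration in the top phase / concentration in the bottom phase."""
    if c_top <= 0 or c_bottom <= 0:
        raise ValueError("concentrations must be positive")
    return c_top / c_bottom


def fit_log_k_vs_n(degrees, k_values):
    """Least-squares fit of log10(K) = intercept + slope * n (assumed model)."""
    ys = [math.log10(k) for k in k_values]
    m = len(degrees)
    mean_x = sum(degrees) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in degrees)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(degrees, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_degree(k, slope, intercept):
    """Invert the calibration to estimate n from a measured K."""
    return (math.log10(k) - intercept) / slope


# Hypothetical calibration set: K doubles with each attached PEG chain.
degrees = [0, 1, 2, 3]
k_values = [0.5, 1.0, 2.0, 4.0]
slope, intercept = fit_log_k_vs_n(degrees, k_values)
```

Under these assumptions, `estimate_degree(k, slope, intercept)` returns a fractional n for a mixture, which would be read as an average degree of PEGylation rather than the identity of a single species.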
Despite the advances reported, other types of ABS, such as PEG/polyacrylate (with or without electrolytes), PEG/ionic liquid and copolymer/salt systems, are worth investigating for the fractionation of PEGylated conjugates with different degrees of PEGylation. In situ product recovery through ABS in a continuous regime and other approaches to continuous flow purification, namely centrifugal partition chromatography (CPC), should also receive attention in the near future (Mayolo-Deloisa et al., 2011).

CHARACTERIZATION OF PEGYLATED PROTEINS
To validate a PEGylation reaction and to characterize and quantify PEGylated conjugates, several analytical techniques must be applied. These techniques must be adapted to the specific polymer-protein conjugates and be able to assess their (i) structural arrangement, (ii) bioactivity potential and (iii) stability.
In terms of structural analysis of the PEGylated conjugates, it is important to determine the molecular mass, the number of bound polymer chains, the specific sites of PEG attachment and the secondary and tertiary structure (González-Valdez, Rito-Palomares, Benavides, 2012). Together with UV-visible spectrophotometry and electrophoresis, size exclusion chromatography (SEC), mass spectrometry (MS), Fourier-transform infrared spectroscopy (FTIR), circular dichroism (CD) and dynamic light scattering (DLS) are the major techniques used to characterize the molecular mass and structure of PEGylated proteins, and they play a prominent role in biobetter manufacturing and quality control.
Generally, the first step in structural characterization is polyacrylamide gel electrophoresis (PAGE) and/or western blotting to confirm the identity of the PEGylated protein (Zhou, He, Wang, 2016). However, PEG may change the electrophoretic mobility, which complicates size characterization and makes this technique imprecise (González-Valdez, Rito-Palomares, Benavides, 2012). The mass of PEGylated species is commonly evaluated by MS, which is also capable of identifying the specific sites and number of attached polymer chains in the primary amino acid sequence (Domon, Aebersold, 2006). MALDI-TOF MS has been the technique of choice to characterize the average molecular weight and degree of PEGylation (Chowdhury, Doleman, Johnston, 1995; Bullock, Chowdhury, Johnston, 1996; Wang et al., 2002; Na, Youn, Lee, 2004; Yun et al., 2005). Irrespective of the size and type of PEG used (mono- and heterofunctional, linear or branched), MALDI provides first-rate information on the molecular weight, total amount and distribution of PEG on the protein, as well as site-specific information on the PEGylation coupling site (Cindrić et al., 2007). On the other hand, electrospray ionization MS (ESI-MS) has been gaining attention in the past decade for the analysis of PEGylated proteins (Cindrić et al., 2007; Forstenlehner et al., 2014). Compared to MALDI, ESI-MS has some advantages, such as an automated workflow and reduced sample preparation time (Gioacchini et al., 1997). Yet, the overlapping protein charge pattern and the polydispersity of the PEGylated derivatives make the ESI-MS spectrum difficult to interpret. Some additional techniques have been employed as alternatives, namely tryptic digestion (Wu et al., 2011) and isotopically labelled internal standards of PEGylated proteins (Watson et al., 1994). SEC can also be applied to infer the size of PEGylated proteins (Fee, Van Alstine, 2004). However, some studies suggest that determining the mass of PEGylated proteins with this technique is less adequate than using MS (Fee, Van Alstine, 2004; González-Valdez, Rito-Palomares, Benavides, 2012). The apparent interaction of PEG with the material of SEC columns could lead to anomalously slow elution of PEGylated proteins and, thus, incorrect molecular weight determinations (Fee, Van Alstine, 2004).
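The way a degree of PEGylation is read from MS peak masses can be sketched with simple arithmetic: the mass added to the native protein is divided by the average mass of one PEG chain. The sketch below is only illustrative. It ignores the mass of the linker and the polydispersity of PEG, and the function names and all masses are hypothetical values, not measurements from the cited works.

```python
def degree_of_pegylation(conjugate_mass_da, protein_mass_da, peg_mass_da):
    """Estimate n as (conjugate mass - native protein mass) / average PEG mass."""
    if peg_mass_da <= 0:
        raise ValueError("PEG mass must be positive")
    return (conjugate_mass_da - protein_mass_da) / peg_mass_da


def assign_species(peak_masses_da, protein_mass_da, peg_mass_da):
    """Round each estimate to the nearest integer number of PEG chains."""
    return [
        round(degree_of_pegylation(mass, protein_mass_da, peg_mass_da))
        for mass in peak_masses_da
    ]


# Hypothetical example: a 14,300 Da protein reacted with 5,000 Da PEG.
protein_mass = 14300.0
peg_mass = 5000.0
peaks = [14300.0, 19350.0, 24280.0]  # hypothetical peak masses in Da
species = assign_species(peaks, protein_mass, peg_mass)
```

In this example the three peaks would be assigned to the native, mono-PEGylated and di-PEGylated species. Rounding hides how far each peak lies from the ideal mass, so the unrounded estimate is worth inspecting as well.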
After confirming the mass and polydispersity of the PEGylated protein, the effect of PEGylation on the secondary and tertiary structure of the protein should be evaluated, and CD spectroscopy is a powerful technique for this purpose. Far-UV CD (data collected from ~190 to 250 nm) and near-UV CD (data collected from 250 to 300 nm) provide information on the secondary and tertiary structure of the protein, respectively. Additionally, CD can be used to study protein stability as a function of temperature or denaturing conditions. FTIR spectroscopy can alternatively be used to evaluate the effect of PEG on the protein secondary structure; however, it is less sensitive than CD (Rajan et al., 2006).
The hydrodynamic diameter of PEGylated proteins is also important for identifying different PEG-protein conjugates, and it can be estimated from DLS measurements (Gokarn, McLean, Laue, 2012). DLS is the most conventional method for the size determination and characterization of nanoparticles. In PEGylation, the hydrodynamic diameter of a conjugate usually increases with the addition of PEG chains to the protein (at least up to three PEG molecules) (Gokarn, McLean, Laue, 2012).
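DLS instruments report a translational diffusion coefficient, which is commonly converted into a hydrodynamic diameter through the Stokes-Einstein relation, d_H = k_B T / (3 π η D). The following sketch shows that conversion; the diffusion coefficient, temperature and viscosity are assumed values for a dilute aqueous sample at 25 °C and do not come from the cited study.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def hydrodynamic_diameter_m(diffusion_m2_s, temperature_k=298.15,
                            viscosity_pa_s=8.9e-4):
    """Stokes-Einstein relation: d_H = k_B * T / (3 * pi * eta * D)."""
    if diffusion_m2_s <= 0 or viscosity_pa_s <= 0:
        raise ValueError("diffusion coefficient and viscosity must be positive")
    return (BOLTZMANN_J_PER_K * temperature_k
            / (3.0 * math.pi * viscosity_pa_s * diffusion_m2_s))


# Hypothetical diffusion coefficient for a native protein.
d_native_nm = hydrodynamic_diameter_m(1.0e-10) * 1e9
# A conjugate diffusing half as fast appears twice as large.
d_conjugate_nm = hydrodynamic_diameter_m(0.5e-10) * 1e9
```

With these assumed inputs the native protein comes out at roughly 5 nm. The relation treats the particle as a hard sphere, so for a flexible PEG-protein conjugate the result is an apparent size and not a physical dimension.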
Determining the biological activity of a PEGylated protein is one of the most important steps in its characterization (Fontana et al., 2008; Jevševar, Kunstelj, Porekar, 2010; González-Valdez, Rito-Palomares, Benavides, 2012). For enzymes, in particular, kinetic parameters must be determined (e.g. Km, kcat and Vmax). Additionally, in vivo bioactivity must be confirmed for PEGylated proteins. A decrease in the activity of the PEGylated protein can be observed depending on the reaction site, since the covalent attachment of PEG imposes restrictions on molecular conformation (Hsieh, Lin, 2015; Morgenstern et al., 2017). Other methods currently used to quantify biological activity involve bioassays, immunoassays and radioassays. Finally, stability assays are crucial to identify the most stable PEGylated isoform as a promising biobetter. Protein stability assessment should consider thermostability (Santiago-Rodríguez et al., 2011; Hsieh, Lin, 2015), long-term stability (Santiago-Rodríguez et al., 2011), pH stability (Tian et al., 2013) and resistance to proteolytic digestion (Kurinomaru, Shiraki, 2015).
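For the kinetic parameters mentioned above, a minimal sketch of how Km, Vmax and kcat could be estimated from initial-rate data is given below, assuming Michaelis-Menten kinetics, v = Vmax [S] / (Km + [S]), and kcat = Vmax / [E]total. A Lineweaver-Burk (double-reciprocal) linear fit is used here only for brevity; nonlinear regression is generally preferred for real, noisy data. The function names, concentrations and rates are hypothetical.

```python
def michaelis_menten(substrate, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * substrate / (km + substrate)


def fit_lineweaver_burk(substrate, rates):
    """Fit 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax by ordinary least squares."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals Km / Vmax
    intercept = mean_y - slope * mean_x  # equals 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


def turnover_number(vmax, enzyme_total):
    """kcat = Vmax / total enzyme concentration (same concentration units)."""
    return vmax / enzyme_total


# Hypothetical noise-free data generated with Km = 2.0 mM and Vmax = 10.0 uM/s.
substrate_mm = [0.5, 1.0, 2.0, 5.0, 10.0]
rates_um_s = [michaelis_menten(s, 10.0, 2.0) for s in substrate_mm]
km, vmax = fit_lineweaver_burk(substrate_mm, rates_um_s)
kcat = turnover_number(vmax, 0.1)  # 0.1 uM enzyme gives kcat in 1/s
```

Running the same fit on the native protein and on each PEGylated species would allow kcat/Km to be compared across conjugates as one measure of retained catalytic efficiency.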
Overall, a protein PEGylation process to develop a biobetter can be defined as a multi-step approach (Figure 11), in which the technological and economic benefits of each step should be fully analysed to make the process viable for the pharmaceutical and biotechnological industries.

CONCLUSIONS
PEGylation is an attractive and promising tool to improve drug properties, especially those of protein drugs. Despite all the molecular biology tools available today, it is still one of the main strategies for the development of biobetters. The overall process must be considered when developing a PEGylated protein, from the chemical reaction to the proper purification of the PEGylated protein, which can be considered the downstream step of PEGylation. The reaction step involves the proper selection of the PEGylation reagents and chemistry, which is crucial for the characteristics of the final product and its application. There is no single best chemistry; the choice depends on the characteristics of the protein drug, such as whether it needs to be detached from PEG to present activity and for how long it should circulate in the body. After the reaction, the downstream steps include the fractionation and purification of PEG-protein conjugates, followed by the complete structural and activity characterization of the resulting species. A high degree of purity is needed for protein drugs, and for this reason it is hard to imagine industrial purification processes without the chromatographic steps already in use. Nonetheless, a complete analysis of the technological and economic benefits of PEGylation could support biopharmaceutical industries in seeking more efficient strategies to develop biobetters.

FIGURE 1 - (A) Chemical structure of N-terminal reactive PEG and (B) schematic representation of a tetrameric protein PEGylated with 10,000 Da PEG chains at the N-terminal groups.

FIGURE 2 - PubMed citations of the title words "Protein PEGylation" and "PEGylation" in published articles in the last 26 years (from 1991 to 2017).

FIGURE 3 - Reactive PEGs used in PEGylation reactions. R represents functional groups of PEG derivatives. Linear - simplest and most often used conjugate agent with only one reactive group; Bifunctional - linear PEG derivative with two reactive groups; Y-shaped PEG - two linear PEG derivatives linked to a single point of attachment to proteins; Fork-shaped PEG - multi-proximal reactive groups at one or both ends of a linear PEG chain; Multi-arm PEG - an eight-arm PEG carrying multi-hydroxyl or other functional groups (R) with pentaerythritol, hexaglycerol, or tripentaerythritol as the central core.

FIGURE 4 - Potential sites of PEGylation in proteins. Typical reactive amino acids, including lysine, cysteine and glutamine, as well as the N-terminal amino group and the C-terminal carboxylic acid, are specific sites for conjugation with PEG derivatives.

FIGURE 5 - PEGylation of primary amines in proteins, forming an amide linkage.

FIGURE 8 - Site-specific PEGylation of proteins with the transglutaminase (TGase) strategy, in which Gln residues are PEGylated in the presence of TGase.

FIGURE 10 - Distribution of the publications reported in the last 7 years (> 2010) dealing with chromatographic and non-chromatographic techniques for the separation of PEGylated proteins in the Web of Science and PubMed databases. SEC - size exclusion chromatography, IEX - ion exchange chromatography, HIC - hydrophobic interaction chromatography.

FIGURE 11 - Flow diagram of the PEGylation bioprocess operation from reaction design to purification and characterization, ending with the final application of the chemically modified protein drug as an effective biobetter.

TABLE I - FDA-approved PEGylated biotherapeutic drugs in the United States and/or Europe