Cross-linking mass spectrometry reveals structural insights of the glutamine synthetase from Leishmania braziliensis

BACKGROUND Leishmaniasis is a neglected tropical disease caused by the parasite Leishmania braziliensis, commonly found in Brazil and associated with cutaneous and visceral forms of this disease. Like other organisms, L. braziliensis has an enzyme called glutamine synthetase (LbGS) that catalyses the synthesis of glutamine from glutamate. This enzyme plays an essential role in the metabolism of these parasites and is a potential therapeutic target for treating this disease. OBJECTIVES To investigate the structure of LbGS and generate structural models of the protein. METHODS We used crosslinking mass spectrometry (XLMS) and generated structural models in silico using I-TASSER. FINDINGS Forty-two cross-linked (XL) peptides were identified, of which 37 are explained by a monomeric model; the other five indicate the LbGS dimerization and pentamer interaction regions. Comparing the 3D models generated in the presence and absence of XLMS restraints demonstrated the benefits of modeling with XLMS, highlighting the inappropriate folding that results from the absence of spatial restraints. MAIN CONCLUSIONS We disclose the conservation of the active site and interface regions, as well as unique features of LbGS, showing the potential of XLMS to probe structural information and support the exploration of new drugs.

The enzyme glutamine synthetase (GS) is essential in nitrogen metabolism, catalysing the synthesis of glutamine from ATP, glutamate, and ammonia. This process occurs in two stages: the formation of the activated intermediate γ-glutamyl phosphate (γ-G-P), followed by a nucleophilic attack of ammonia on this intermediate, releasing phosphate and forming glutamine. (5,6,7,8) GS is found in all organisms, including Leishmania sp., and occurs in three types: GS-I, found in most prokaryotes; GS-II, found in eukaryotes; and GS-III, found in some prokaryotes. (9) GS types I and II are dodecamers formed by two hexameric rings maintained mainly by hydrophobic interactions. GS type III is formed by two hexameric rings that associate across opposite interfaces, each ring being flipped 180° with respect to its position in the other two types. (10,11) The glutamine synthetase of L. braziliensis (LbGS) is formed by two interacting pentameric rings, probably held together by hydrophobic interactions, given the conservation (relative to human GS, HsGS) of the sequence rich in prolines and lysines. Hydrogen bonds and salt bridges sustain the interaction between monomers, and this interface is weaker in LbGS than in HsGS. (12)

MATERIALS AND METHODS
In this work, the nucleotide sequence encoding the putative LbGS (GenBank CAM36993.1) was cloned into a pET28a plasmid vector. A 42.35 kDa protein was obtained by overexpressing LbGS in the Escherichia coli (DE3) NiCo strain with 1 mM IPTG at 30°C for four hours. The recombinant protein was purified from the soluble fraction of the cell lysate using a HisTrap column in the Akta Purifier system (GE Healthcare), with buffer A (10 mM sodium phosphate buffer pH 7.4, 500 mM NaCl, 40 mM imidazole) to equilibrate the column and a linear gradient of buffer B (10 mM sodium phosphate buffer pH 7.4, 500 mM NaCl, 1 M imidazole) for elution.
We performed crosslinking (XL) experiments using the purified protein as previously described. (13) The protein was digested with trypsin at an enzyme-to-substrate ratio of 1:50 (E/S) for 20 hours, and the enzymatic reaction was stopped by adding trifluoroacetic acid (0.4% v/v final). Subsequently, the peptides were quantified using the Qubit 4.0® fluorometric assay (Invitrogen) according to the manufacturer's recommendations. Each sample was desalted and concentrated using Stage-Tips (STop and Go-Extraction TIPs) according to the literature. (14) The peptide mixture was suspended in 0.1% formic acid and analysed as follows. An Ultimate 3000 (Thermo Fisher®) coupled online with a Fusion Lumos Orbitrap mass spectrometer (Thermo Fisher®) was used to generate the mass spectra. The peptide mixture was chromatographically separated on a column (15 cm in length with a 75 μm I.D.) packed in-house with ReproSil-Pur C18-AQ 3 μm resin (Dr Maisch GmbH HPLC) at a flow of 250 nL/min, from 5% to 50% ACN in 0.1% formic acid, in a 140 min gradient. The Fusion Lumos Orbitrap was set to the data-dependent acquisition (DDA) mode to automatically switch between full-scan MS and MS/MS acquisition with 60 s dynamic exclusion. Survey scans (200-1500 m/z) were acquired in the Orbitrap system with a resolution of 120,000 at m/z 200. The most intense ions captured in a 2 s cycle time were selected, excluding those with unassigned or 1+ charge states, sequentially isolated, and fragmented by HCD (higher-energy collisional dissociation) using a normalised collision energy of 30. The fragment ions were analysed with a resolution of 30,000 at m/z 200. The general mass spectrometric conditions were as follows: 2.5 kV spray voltage, no sheath or auxiliary gas flow, heated capillary temperature of 250°C, predictive automatic gain control (AGC) enabled, and an S-lens RF level of 40%. Mass spectrometer scan functions and nLC solvent gradients were controlled by the Xcalibur 4.1 data system (Thermo Fisher®).
Protein identification was performed using PatternLab for Proteomics V, available at http://www.patternlabforproteomics.org, and a database containing 8,084 sequences of L. braziliensis downloaded from UniProt. Results were filtered as described in the software's bioinformatics protocol (15) and only the protein of interest was identified, thus achieving 0% FDR. XL identification was performed with the Spectrum Identification Machine for Cross-Linked Peptides (SIM-XL) software, which is freely available at http://www.patternlabforproteomics.org/sim-xl. (16) The LbGS sequence from L. braziliensis was downloaded on March 29th, 2021, from the NCBI. The search parameters considered fully tryptic peptide candidates with masses between 600 and 4800 Da and a tolerance of 20 ppm for precursor and fragment masses. The modifications were carbamidomethylation of cysteine (fixed) and oxidation of methionine (variable). The files are available at proteomics.fiocruz.br/LbGS (Supplementary data). The distance of 11.4 Å between cross-linked lysines identified using SIM-XL (score thresholds of 1.5 for intralinks and 2.0 for interlinks) (16) was used as an input for I-TASSER. (17) Structural analysis and visual inspection were conducted with EBI-PISA, (18) PyMOL (The PyMOL Molecular Graphics System, Version 2.0 Schrödinger), WinCoot, (19) and ChimeraX. (20)
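The search window described above (candidate masses between 600 and 4800 Da, 20 ppm tolerance) can be illustrated with a minimal sketch. This is not SIM-XL code; the function names `ppm_error` and `within_search_window` are hypothetical and the logic only shows how a parts-per-million tolerance is usually evaluated.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Relative mass deviation in parts per million (ppm)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_search_window(observed_mass, theoretical_mass,
                         tol_ppm=20.0, min_mass=600.0, max_mass=4800.0):
    """Return True if a candidate lies in the mass range and within tolerance.

    Defaults mirror the parameters quoted in the text; how SIM-XL applies
    them internally is an assumption of this sketch.
    """
    if not (min_mass <= theoretical_mass <= max_mass):
        return False
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm
```

For example, a 1000.00 Da candidate observed at 1000.01 Da deviates by about 10 ppm and would be accepted, whereas the same candidate observed at 1000.03 Da (about 30 ppm) would not.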

RESULTS AND DISCUSSION
The identity of the purified protein (Fig. 1A) was confirmed by mass spectrometry (Fig. 1B). The experimental constraints obtained by XLMS are listed in Table I. The tertiary model (Fig. 1C) displayed a C-score of 0.89 and a TM-score of 0.83 ± 0.08, which indicate good confidence and a correct fold (a TM-score > 0.5 suggests a correct fold). (17,21) Thirty-seven of the 42 XL distances could be placed in the monomeric model, with acceptable distances between 11.4 Å and 35 Å (Table I). (22) All GSs are oligomers, and eukaryotic type II GSs are decamers composed of superimposed pentameric rings, with monomers of ~350 to 420 residues. GSs have ten active sites per oligomer, placed at the interface of two interacting monomers. (5,23) The five XL restraints that cannot be explained by a monomeric model should indicate the region of dimerization; these results corroborate that, as in other GSs, dimerization occurs between the C-terminal region of one monomer and the N-terminal region of the other (Table I).
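The separation of cross-links into those explained by the monomeric model and those that are not can be sketched as a simple distance check. The sketch below assumes that 35 Å acts as an upper bound on the distance between the two linked residues in the model, which is our reading of the range quoted above; the function name `split_restraints` and the coordinate dictionary are hypothetical, and real coordinates would come from the model file.

```python
import math

MAX_XL_DISTANCE = 35.0  # Å, upper limit quoted in the text (assumed to be a cutoff)


def split_restraints(coords, crosslinks, max_dist=MAX_XL_DISTANCE):
    """Split cross-links into those satisfied or violated by a single model.

    coords     -- dict mapping a residue label to its (x, y, z) position in Å
    crosslinks -- iterable of (residue_a, residue_b) label pairs
    Returns two lists of (residue_a, residue_b, distance) tuples.
    """
    satisfied, violated = [], []
    for res_a, res_b in crosslinks:
        d = math.dist(coords[res_a], coords[res_b])  # Euclidean distance
        target = satisfied if d <= max_dist else violated
        target.append((res_a, res_b, d))
    return satisfied, violated
```

Links that end up in the violated list for the monomer are the candidates for inter-subunit contacts, which is the reasoning applied to the five unexplained cross-links.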
The GS decamers present two main interaction interfaces: an intra-ring interface that forms the pentameric rings and a tail-to-tail interface between the superimposed pentameric rings. The human GS intra-ring interface is formed by the interaction of the N- and C-terminal regions of two subunits. (24) Comparing the LbGS and HsGS interfaces (Fig. 1D, Tables II-III), we observe a highly conserved C-terminal region but a divergent N-terminal region (Fig. 2A) (Table II). In LbGS, the numbers of interface contacts are considerably lower than in HsGS: seven hydrogen bonds and nine salt bridges (Table III). Although the interface of LbGS is less stable than that of HsGS, the following XLs might indicate that LbGS dimerization occurs in solution: K167-K166, S02-K240, D29-K28, C162-S163, K167-S166, D314-S317, D234-K240. Regarding the tail-to-tail interface, the cross-linked residue pairs S163-K167 and S166-K167 may indicate the presence of this interface in our sample.
Studies have shown that the interaction within the pentameric HsGS model also depends on residues L139 to P160, which form a loop rich in proline and glycine, favoring hydrophobic interactions within pentamers. (24) This loop (I138 to M159, LbGS numbering) is conserved in the LbGS structure and is also rich in prolines and glycines (Fig. 2B). However, the differences in protein sequence result in a β-sheet (R143, R144 and P145, LbGS numbering) that is not conserved in the homolog HsGS (Fig. 2B).
The active site of GSs comprises three regions: one for glutamate, one for ATP and one for ammonia, with highly conserved residues from two subunits of the pentameric ring. (23) Each monomer is divided into two domains, each contributing to the active site of the adjacent monomer: the N-terminal domain (the smaller) is composed of a sheet formed by six antiparallel β-strands, two of which take part in the active site; the C-terminal domain (the larger) is formed mainly by α-helices and six β-strands and contains most of the residues that make up the active site (9) (Fig. 3A). In the model obtained by the crosslinking method, the active site residues E133, E135, N247, G248, H252, R294, R314, E333 and R335 (glutamate site, LbGS numbering) (Fig. 3B) and G186, S256 and R319 (ATP site, LbGS numbering) (Fig. 3C) are fully conserved, together with the ammonia site, which involves three residues from the C-terminal region (E195, E202 and E300, LbGS numbering) and two residues from the N-terminal region (D58 and S60, LbGS numbering) of the adjacent subunit (Fig. 3D). The structural differences in the active sites found by superposing the HsGS and LbGS structures reside in: (i) the glutamate site, in region 287-303 (LbGS numbering, Fig. 2A), where R299 (PDB ID 2OJW) is the terminal part of a loop whereas R294 (LbGS) is part of an α-helix; (ii) the ammonia site, in region 55-63, where D63 (PDB ID 2OJW) is part of a β-sheet whereas D58 (LbGS) is part of a loop (Fig. 2A). Finally, we also predicted a model using only the sequence of LbGS, without XL restraints. The model constructed by I-TASSER displayed good confidence scores (C-score of -0.12, TM-score of 0.70 ± 0.12). When both LbGS models (modeled with and without XL restraints) are superimposed, they present an RMSD of 0.671 Å, and the following regions would be modeled inappropriately in the absence of the restraints obtained from the experimental XLs: S17-D24, N38-P53, G288-E306 (Fig. 4).
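For reference, the RMSD between two superimposed models is the square root of the mean squared distance between equivalent atoms. The sketch below assumes the two coordinate lists are already superimposed and matched atom by atom; it does not perform the superposition itself and is not the procedure used by the visualization software cited in the Methods. The function name `rmsd` is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists (Å).

    Both lists must hold (x, y, z) tuples for the same atoms in the same
    order, taken from structures that are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))
```

Because the RMSD is a global average, a low overall value is compatible with a few locally divergent segments, which is why the per-region comparison above is informative.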
Thus, spatial restraints give the modeling a sense of the real in vitro conformation of the enzyme, unlike a model generated from homology alone, allowing us to evaluate its in-solution conformation in addition to comparing it with the already known 3D structures of counterparts available in the PDB. Drug design covers both small therapeutic molecules that target proteins and proteins used as drugs themselves (biotherapeutics). Knowing a protein's in vitro conformation and its ligand-binding sites is at the heart of structure-based drug development. Currently, this strategy depends not only on structural information but also on dynamics, kinetics, and enzyme-substrate interaction data, which together provide dynamic information on a protein's in vitro conformation and flexibility and have become possible thanks to the computational advances of recent years. (25,26) Some studies have used GS enzymes as a therapeutic target to treat diseases such as cancer, malaria, and leishmaniases. (23,24,27,28) LbGS lacks structural and functional studies; the closest available are studies of the GS from L. donovani (LdGS). Kumar et al. performed biochemical studies demonstrating the enzyme's dependence on divalent metals for optimal activity and an optimum pH of 7 to 9, similar to HsGS. (29) They also provide a structural comparison of LdGS and HsGS describing non-conserved residues relevant for substrate recognition (E7, L132, S190, S249 and V205, LdGS numbering) and the importance of the electropositive potential in the active sites. (30) These differences allowed them to find specific LdGS inhibitors, which might also act on LbGS, as we observed that these residues are conserved in LbGS. The potential of GS from Leishmania sp. as a therapeutic target was also evidenced by knock-out experiments indicating the dependence of parasite proliferation and infectivity on an external supply of glutamine. (31)
Herein, we provide a structural investigation of LbGS, identifying the active site, important interfaces, and unique structural features of LbGS. Together, this information supports the search for new drugs.

AUTHORS' CONTRIBUTION
JYL - experiments, data analysis, writing and review; MDMS - XLMS experiments, data analysis and review; MTM - protein production and review; PC and LC - MS/MS data analysis and review; TACBS - project coordination, writing, review and editing.