Design of Inhibitors for Nucleoside Hydrolase from Leishmania donovani using Molecular Dynamics Studies

Neste trabalho é proposto o primeiro modelo por homologia para a nucleosídeo hidrolase de Leishmania donovani construído a partir das estruturas das nucleosídeo hidrolases de Crithidia fasciculata e de Leishmania major. Usando as informações de interação entre o inibidor p-aminofeniliminoribitol e a nucleosídeo hidrolase de Crithidia fasciculata foram planejados dois novos potenciais inibidores, os quais apresentam novas interações com alguns resíduos da bolsa hidrofóbica do sítio ativo do modelo. Simulações por dinâmica molecular dos protótipos ancorados nos sítios ativos do modelo e das enzimas usadas como moldes, mostraram que, diferente do p-aminofeniliminoribitol, eles permaneceram ancorados nos sítios ativos das três enzimas ao longo de toda a dinâmica, interagindo fortemente com os aminoácidos da bolsa hidrofóbica.


Introduction
According to the World Health Organization (WHO), leishmaniasis is a threat for 350 million people in 88 countries, where 72 of those are developing countries.WHO also estimates that 12 million people in the world are infected and 2 million new cases appear annually, despite the official report of only about 600,000 new cases every year. 1 It is clear that leishmaniasis is one of the most important parasitic diseases, which is caused by protozoan of the Leishmania genus.2][3][4] Visceral leishmaniasis, also known as "kala-azar", is considered the most dangerous form of this disease, being fatal in 100% of the untreated cases 1 and is caused by the L. donovani complex. 4he search for prevention against leishmaniasis infection includes vaccines with weakened and genetically modified strains of the parasite and recombinant antigens 5 or its encoding DNA. 6The only preventive treatment for leishmaniasis available today is the canine vaccine Leishmune ® . 7An efficient human vaccine is not available yet.Accordingly, the most effective way to combat leishmaniasis still is the use of chemotherapy.
The chemotherapy against leishmaniasis is still based on the use of the very toxic pentavalent antimonium compounds and pentamidine.Using those compounds, the treatment of patients with leishmaniasis may last up to several weeks, thus maximizing the occurrence of undesirable collateral effects like abdominal pain and renal and hepatic toxicity.Moreover, there is evidence of the emergence of parasites which are resistant to the commonly used antimony drugs. 4,8][14][15][16][17][18] These enzymes are widely distributed in nature and have already been found in bacteria, 19,20 yeast, 21 protozoa 16,22 and insects. 23Their encoding genes are present in plants, amphibians and fish but, surprisingly, no NH activity or their encoding genes have been detected in mammals until the present moment. 18This fact turns NH into ideal targets for selective chemotherapy for parasitic and bacterial infections.
The NH, or nucleoside N-ribohidrolases, present in the genus Leishmania, are part of a super family of metalloproteins structurally related with an original topology of -sheets. 24,25All NH studied to date are homodimers or homotetramers and present great similarity in the amino acid sequences of the active site, thus indicating that they are closely related in structure and catalytic mechanism.7][28] This cation is octacoordinated by a net of interactions involving the O atoms of the side chains of residues Asp10, Asp15, Asp242, the carbonyl oxygen of the main chain of Thr126, and three water molecules.In the enzyme-substrate complex, two of the water molecules are replaced by the 2' and 3' hydroxyl groups of the substrate ribose.The specificity of NH for the ribose fraction is established by a net of conserved interactions involving the 2', 3' and 5' hydroxyl groups of the sugar, the conserved residues Asp14, Asn160, Glu166, Asn168 and Asp242 and the Ca 2+ ion. 18onspecific NH catalyze the hydrolysis of purine and pirimidine nucleosides, as well as the hydrolysis of p-nitrophenyl--D-ribofuranoside.][31][32] The monomeric NH from L. donovani (LdNH) presents 314 residues and its three-dimensional structure has not been experimentally elucidated yet.Fortunately it presents a quite significant homology with the nonspecific Leishmania major nucleoside hydrolase (LmNH), 95% of sequential identity, and Crithidia fasciculata nucleoside hydrolase (CfNH), 80% of sequential identity, suggesting that this enzyme is also nonspecific, as proposed by Cui et al. 16 based on enzymatic assays.The high conservation of the NH sequences among the Trypanosomatidae species suggests that this enzyme is fundamental for the purines and pirimidines salvage in these species.Considering that one of the metabolic factors that differentiate protozoan parasites from mammals is the lack of de novo biosynthesis of purines and pirimidines, these parasites depend exclusively of purines and pirimidines from the host to survive.
In this work we report the first multiple alignment homology model for LdNH, built based on the crystallographic structures of CfNH and LmNH.Also, analysis of pAPIR, a known CfNH inhibitor, 33 docked inside the active site of CfNH, pointed to six residues on the hydrophobic pocket of the active site that are close enough to interact with eventual pAPIR derivatives with substitutions in their aromatic base portion.The analyses of the potential interactions with such residues lead to the proposition of two new potential NH inhibitors.Dynamics simulations of 2 ns of pAPIR and the proposed inhibitors inside the active sites of LdNH, LmNH and CfNH, showed that those compounds, differently from pAPIR, remained tightly bound inside the active sites of the three enzymes during the whole simulation, interacting strongly with those residues of the hydrophobic pocket that are not able to interact with pAPIR.

Methods
In the search for good templates, all the amino acid sequences of NH from protozoa available in the Protein Data Bank (PDB) 34 were investigated.Among the NH structures found, the only ones possessing more than 35% of sequence identity with LdNH were the enzymes from L. major (PDB entry 1EZR; resolution = 2.5 Å, R-value = 0.203) 15 and C. fasciculata (PDB entry 2MAS; resolution = 2.30 Å, R-value = 0.205). 14These sequences, after alignment with the LdNH sequence using the BLAST server, 35,36 presented 95% and 80% of sequence identity, respectively.
The multiple alignment of the primary sequences of LdNH, LmNH and CfNH were manually refined in order to reproduce the alignment of the LdNH, LmNH and CfNH sequences proposed by Cui et al. 16 This refined alignment was then submitted to the SWISSMODEL server, [37][38][39] in the optimize mode, in order to generate the final model of the monomeric apoenzyme.
After building the structure of the LdNH apoenzyme model, the following step was to build the enzyme/Ca 2+ / pAPIR complex by docking pAPIR, and Ca 2+ into the enzyme active site.This task was accomplished by the superposition of the backbone of our model with the crystallographic structure of CfNH complexed with Ca 2+ and pAPIR (PDB entry 2MAS).The coordinates of Ca 2+ and pAPIR were, then, copied into the model.The atomic partial charges of Ca 2+ and pAPIR were previously calculated at the Hartree Fock level with the 6-31G* basis set using the CHELPG approach of the Gaussian98 package, 40 and their topological files, which are compatible with the format of the GROMACS 3.2 package 41,42 databank, were generated at the Dundee PRODRG server. 43The holoenzyme obtained by this procedure, which volume is about 300 nm 3 , was set into a cubic box of about 722.86 nm 3 (nm = nanometer) containing 21,332 water molecules and minimized with the GROMOS96 force field, 44 implemented in the GROMACS 3.2 package. 41,42The minimization steps were carried out, firstly, using the steepest descent algorithm until reaching an energy gradient of 100 kcal mol -1 nm -1 and further with the conjugate gradients algorithm until reaching the energy gradient of 20 kcal mol -1 nm -1 .
In order to check the quality of our model, it was submitted to validation analysis at the Biotech Validation Suite for Protein Structures. 45There were performed the atomic volume analysis by PROVE, 46 the full geometric analysis by PROCHECK 47 and all the WHAT IF 48 checks available on the site.Furthermore, the model was also submitted to the Verify3D 49,50 structure evaluation server 51 in order to check for the chemical environment of each residue and had its Z-scores calculated using the software PROSA 2003. 52he two proposed compounds as potential NH inhibitors were built using the Gaussian View software from the Gaussian 98 package, 40 based on the structure of pAPIR as template and using the potential interactions with the hydrophobic pocket of the active site as a guide.In the same way as for Ca 2+ and pAPIR, the potential inhibitors had their atomic partial charges calculated at the Hartree Fock level with the 6-31G* basis set using the CHELPG approach of the Gaussian98 package, 40 and their topological files generated at the Dundee PRODRG server. 43The insertions of the proposed inhibitors into the LdNH, LmNH and CfNH active sites were performed by superposition to the pAPIR in the same conformation as reported in the literature. 33urthermore, LdNH and the crystallographic structures of LmNH and CfNH, with Ca 2+ and each proposed compound docked into their active sites, were submitted to rounds of energy minimization following the same procedure described to the LdNH/Ca 2+ /pAPIR complex.
The molecular dynamics (MD) steps were carried out according to the following procedure: First, a 50 ps of MD at 300 K in the water molecules inside the box in order to allow for the equilibration of the solvent around the protein residues.In this simulation, all protein atoms had their positions restrained.Then, it was carried out a full MD simulation of 2,000 ps at 300 K with no restrictions, using 1 fs of integration time and a cut-off of 14 Å (angstroms) for long-range interactions.
Because these proteins have only a residual net charge, which was balanced by contra-ions for these simulations, it was possible to use the direct cut-off for long range interactions without smoothening functions.The effects of the electrostatic potential truncation are minor at 14 Å, and the cut-off procedure makes the calculations faster than other methods.Considering that the protein monomer and the water molecules inside the simulation box are composed by about 130,000 atoms, this method can save significant computational time.
As a whole, 1,000 conformations were stored during each simulation.In this step the pair lists were updated every 500 time steps and all the Lys and Arg residues were positively charged while the Glu and Asp residues were negatively charged.
Also, in order to obtain a neutral net charge for all the systems, the charges were neutralized by the addition of Na + ions.The program calculates the electrostatic potential to find best positions for ion insertion by replacing water molecules that are at least at 3.50 Å from the protein surface.There were employed Periodic Boundary Conditions (PBC) and the water model used was the Simple Point Charge water (SPC), the default solvent of the GROMACS 3.2 package. 41,42he hardware resources used in this work were a PC AMD Atlhon 2,000 MHz and a cluster of computers composed by a PC Pentium IV 2.4 and four PC Pentium 2,600 MHz.

Homology modeling
Even considering the sequence similarity of 95% and 80% between the target and the two template proteins, it was important to obtain a homology model of the target protein, because the aim of this work was the design of new potential inhibitors.The most desirable model for this type of work is the crystallographic structure of the protein with a known inhibitor inside the active site.In the present case, the use of very similar crystallographic templates guarantee the obtainment of a very good quality model, as it is necessary for a precise description of the active site.
One of the most important steps in homology modeling is the definition of the correct sequence alignment between the target and the template proteins.Figure 1 shows our final alignment, which was submitted to the SWISSMODEL server [37][38][39] in order to build the initial model of LdNH.This alignment was obtained after manual adjustments of the initial alignment obtained from the BLAST server 35,36 in order to reproduce the alignment of LdNH, LmNH and CfNH sequences proposed by Cui et al. 16 The Ramachandran plot 53 of the model shows more than 98% of the amino acid residues in the favorable regions of the plot for the whole enzyme, and 100% for the active site residues.
The backbones of the modeled monomer of LdNH and the templates used in the multiple alignment were superimposed, using SPDBViewer, 36 in order to calculate the RMSD values and to check the structural compatibility of the model with the templates (see Figure 2).As it was expected due to the great sequence similarity between the target and the template enzymes, the RMSD values obtained were of 0.1 Å and 0.08 Å for CfNH and LmNH respectively, thus confirming that the templates are satisfactory for the homology modeling process.The superposition of the backbones shows that the general fold of LdNH is almost identical to LmNH and differs from CfNH only at the external loops L1 and L2, which are located at the entrance of the enzymes active sites.As can be seen in Figure 2, those loops in CfNH are closer to each other, turning the entrance of this enzyme active site narrowed than in LdNH and LmNH.

Validation of the model and active site determination
Regarding the main chain properties of the modeled enzyme, the careful examination of the checking results performed at the Biotech Validation Suite for Protein Structures, 45 did not show any considerable bad contacts, nor C tetrahedron distortion nor buried unsatisfied H-bond donors and acceptors.Moreover, the average G-factor (-1.04), was well inside the permitted values for homology models. 47he G-factor provides a measure of how "normal", or alternatively how "unusual", a given stereochemical property is.When applied to a given residue, a low G-factor indicates that the property corresponds to a low-probability conformation.For example, residues falling inside the disallowed regions of the Ramachandran plot will have a low (or very negative) G-factor.
Additionally, there were not found any distortions of the side chain torsion angles, the 3D-1D averaged score for each residue calculated by the Verify3D, 49,50 ranged among 0.08 and 0.69 and the Z-scores calculated using the software PROSA 2003, 52 showed that LdNH and its templates are inside the range of a typical native structure. 54Accordingly to these results, the model was considered satisfactory.
The active site determination in our model was accomplished based on its alignment to the templates.There were found degrees of sequence identity of 100% of the active site with LmNH and 98% with CfNH.As it is the case for LmNH, the active site of LdNH presents the residues: Asp10, Asp14, Asp15, Asn39, Ile81, His82, Thr126, Met152, Asn160, Glu166, Phe167, Asn168, Tyr225, Tyr229, Glu232, His241 and Asp242.In the active site of CfNH, however, Glu232 is replaced by Arg233. 13,15,16,18uperimposing the residues of the active sites from our model and the templates, it can be initially inferred, based on the alignment in Figure 1 and the picture of the active site from CfNH reported by Versées and Steyaert, 18 that, in LdNH and LmNH, residues Asp10, Asp14, Asp15, Asn39, Thr126, Asn160, Glu166, Asn168 and Asp242, would be the ones that stabilize the binding of the ribose portion of the nucleosides, while the hydrophobic pocket of the nucleic base would be formed by residues Ile81, His82, Asn160, Phe167, Tyr225, Tyr229 and His241.Also, residues Asp10, Asp15, Thr126 and Asp242 would be the ones binding the Ca 2+ ion.As it can be noticed in the superposition in Figure 3, Arg233 and the residues of the hydrophobic pocket: Asn39, Ile81, His82, Tyr225 and Tyr229 in CfNH, do not superimpose well to their homologues in LmNH and LdNH.This happens because those residues are located at loops L1 and L2, which are the only elements of secondary structure of CfNH that do not fit well to LdNH and LmNH as already shown in Figure 2.

Analysis of the potential interactions of pAPIR inside the CfNH active site and proposition of potential inhibitors
Analysis of Figure 4 shows that the pAPIR aromatic base portion is inserted into a pocket, defined mainly by the residues Asn39, His82, Asn160, Tyr225, Tyr229 and Arg233.However, the side chains of these residues are between 2.25 and 4.77 Å from the pAPIR aromatic base atoms.Those distances seem too far to enable electrostatic interactions or the formation of H-bonds that would stabilize the base portion inside the active site.This inability to anchorage properly the aromatic base portion of pAPIR could be the main reason for the low inhibition of NH by pAPIR when compared to other inhibitors with substitutions on the aromatic base portion, as already described in literature. 21In fact, Miles et al. 55 have synthesized and evaluated the inhibitory activity of 20 pAPIR analogues against specific and non-specific NH.He showed that the most powerful inhibitors of NH are just the ones presenting substitutions at the aromatic base portion.We believe that this probably happens because these groups are able to interact with the hydrophobic pocket residues.The presence of the side chains of Asn39, His82, Asn160, Tyr225, Tyr229 and Glu232 (Arg233 in CfNH) in positions quite close to the aromatic base portion of the inhibitors represents a good opportunity to design new derivatives of pAPIR with substitutions in this part of the molecule.These compounds could be able to interact specifically with those residues, thus binding more tightly to the active site.
That analysis lead to the proposition of the structures of the two pAPIR derivatives as potential inhibitors of non-specific NH, which are shown in Figure 5. Compound 1 was proposed based on the substitution of the group p-amino phenyl of pAPIR by a quinoline ring substituted by an amino group at position 2 and an amino acid group (glycine) at position 5.This creates several possibilities of interactions with the polar groups (amine, amide or hydroxyl) of the side chains of residues Asn160, Tyr225, His82 and Asn39 as well as hydrophobic interactions like -stacking and H-bonds with Tyr225 and Tyr229.In the proposed compound 2 we added to the aromatic amine of pAPIR, a carboxyl and an ethylamine groups with the goal of exploring possible interactions with the side chains of Asn160, Tyr229 and Glu232 (Arg233 in CfNH).Also, a meta hydroxyl group was placed at the aromatic ring in order to explore interactions with Asn39 and His82.The chemical environment around compounds 1 and 2 inside the CfNH hydrophobic pocket is shown in Figure 6.

Molecular dynamics simulations
In order to determine if the proposed inhibitors would interact with the active site of LdNH they were docked, together with the Ca 2+ cation, inside the active sites of LdNH, LmNH and CfNH, affording six enzyme/Ca 2+ /inhibitor complexes.This docking was accomplished using as basis the intermolecular interactions present in the crystallographic structure of pAPIR bound with the active site of CfNH.Each complex was submitted to 50 ps of position-restrained MD (PRMD)   followed by 2,000 ps of MD simulations with no restriction, as described in the methods section.
There were carried out calculations of temporal RMSD on all the atoms in all the systems for 1,000 frames generated at each 2 ps of MD simulation.In this case the result is a unique general value for each enzyme monitored throughout the whole simulation time.Taking into account that the complexes could float inside the water box, each frame was adjusted to the former by the minimum squares method when calculating the standard deviation.In Figure 7, where the variation of the RMSD with time is monitored, it can be observed that the systems NH/Ca 2+ /Compound 1 equilibrate along the first few picoseconds of the MD simulation.This behavior was common to all 6 MD simulations, with deviations never passing 0.1 nm after the first 100 ps of simulation.We also observed that, for all simulations, the temporal RMSD of NH as apoenzymes practically fits the temporal RMSD of the complexes.Also, as expected, the ligands display a greater fluctuation than for the whole complexes but with a smaller RMSD, thus confirming the stability of the MD simulations.
The spatial RMSD (RMSF) on each amino acid residue was also calculated in the time range of 100 to 2,000 ps, at each 1 ps, totalizing 1,900 frames.Figure 8 shows the plots of RMSF for the systems NH/Ca 2+ /Compound 1, with the total RMSF (peptide backbone plus side chains) in gray, the RMSF of the peptide backbone (N, C , C) in black and the RMSF of the side chains in dotted lines.In Figure 9 are shown the qualitative illustrations in the tube representation of the RMSF for the NH/Ca 2+ /Compound 1 systems.Analyzing Figures 8  and 9 there can be determined the most mobile regions along the MD simulations as the regions with major values of RMSF and major thickness of the tubes in the tube representation.These regions correspond to the residues near the two terminus of each monomer and to the loops regions.On the other hand, the residues at the active site regions, as well as those at the -helix and -sheet regions, present lower RMSF values, revealing to be the most stable regions of the system.Along the MD simulations, very few fluctuations exceeded 0.3 nm and even less fluctuations overpassed 0.6 nm.

Dynamics behavior of the designed inhibitors in the active sites of LdNH, LmNH and CfNH
In order to analyze and compare the dynamics behavior of each compound inside the active sites of the enzymes   along the MD simulations, we extracted one frame every 20 ps of simulation, totalizing 100 frames for each system.To make possible the visualization we selected, in each frame, only some of those residues directly involved in the interactions with compounds 1 and 2. The frames in Figures 10, 11 and 12 show that the two proposed compounds stay tightly bound to the active sites of all the enzymes along the whole MD simulations.The same is not observed for pAPIR, which is expelled from the active sites of LdNH and LmNH.The dynamics of pAPIR inside the active site of CfNH shows that this compound is not expelled from the active site only because, differently from LdNH and LmNH, the active site entrance of this enzyme is narrowed by loops L1 and L2 and also, by the side chain of Arg233 that stays at the entrance.This dynamics behavior of pAPIR probably happens because, not having substitutions on its aromatic base portion, it does not   57 interact with the residues in the hydrophobic pocket, counting only with the interactions of the ribose portion to be stabilized inside the active site.Also, this observation corroborates the results of Milles et al. 55 showing that pAPIR is a weaker inhibitor of NH when compared to inhibitors with substitutions on the aromatic base portion and could explain the fact that there are not any other crystallographic structures of a NH in complex with pAPIR than that with CfNH.
The fact that our proposed compounds remain tightly bound to the active sites of all the enzymes along the whole MD simulations also confirms the results of Milles et al. 55 showing that the best NH inhibitors are the ones with substitutions on the aromatic base portion.This result also suggests that the proposed inhibitors have a good potential to become candidates for the strong inhibition of non-specific nucleoside hydrolases.

Conclusions
In this work we have used homology modeling to propose the first 3D structure for the nucleoside hydrolase of Leishmania donovani (LdNH).Analysis of the interactions of the known NH inhibitor pAPIR with the active sites of this model and of nucleoside hydrolase from Leishmania major (LmNH), which were based on the interactions of pAPIR co-crystallized with the nucleoside hydrolase from Crithidia fasciculata (CfNH), pointed to the hydrophobic pocket residues: Asn39, His82, Asn160, Tyr225, Tyr229 and Glu232 (Arg233 in CfNH) as potential sites for enzyme-inhibitor interactions.Using that information we proposed two potential inhibitors for those enzymes, which structures were based on modifications of the aromatic base portion of pAPIR, in order to explore the potential interactions with the hydrophobic pocket.The proposed inhibitors and pAPIR were evaluated using MD simulations of the complexes formed by the docked compounds at the active sites of the enzymes.The results showed that our proposed compounds stay tightly bound to the active sites of all the enzymes along the whole MD simulations while pAPIR is expelled out of the active sites of LmNH and LdNH.It was found that pAPIR remains inside the active site of CfNH only because the narrower entrance and the presence of the side chain of Arg233 in this active site.The interaction of the proposed inhibitors with those residues, which are located in the hydrophobic pocket of the enzymes and are quite close to the aromatic base portion of pAPIR, explain the greater dynamics stability of their complexes with the three enzymes when compared to pAPIR.Interactions with such residues also explain the better inhibition of NH by pAPIR analogues with substitutions on the aromatic base portion already described experimentally in literature. 21,55forts are now being carried out in our laboratory in order to propose other analogues of pAPIR as more efficient potential inhibitors, and to synthesize compounds 1 and 2 for LdNH inhibition studies and biological activity tests.

Figure 1 .
Figure 1.Alignment of LdNH amino acids sequence with the sequences of the Crystallographic structures of LmNH and CfNH available in PDB.

Figure 2 .
Figure 2. Superposition of the backbones of our model and the templates.Black: LmNH, Dark gray: LdNH and Ligth gray: CfNH.

Figure 3 .
Figure 3. Superposition of some residues of the active sites hydrophobic pockets of our model and the templates Black: LmNH, Dark gray: CfNH and Ligth gray: LdNH.

Figure 4 .
Figure 4. pAPIR inside the active site of CfNH.(a) side view; (b) front view.

Figure 5 .
Figure 5. Proposed structures of two new potential inhibitors for non specific NHs.

Figure 10 .
Figure 10.Dynamic Behavior of pAPIR inside the active sites of CfNH (a), LmNH (b) and LdNH (c) along the 2,000 ps of MD simulations.The picture was generated with the Pymol software.57

Figure 11 .
Figure 11.Dynamic Behavior of compound 1 inside the active sites of CfNH (a), LmNH (b) and LdNH (c) along the 2,000 ps of MD simulations.The picture was generated with the Pymol software.57

Figure 12 .
Figure 12.Dynamic behavior of compound 2 inside the active sites of CfNH (a), LmNH (b) and LdNH (c) along the 2,000 ps of MD simulations.The picture was generated with the Pymol software.57