Molecular Dynamics of the Interaction of Plasmodium falciparum and Human Serine Hydroxymethyltransferase with 5-Formyl-6-hydrofolic Acid Analogues : Design of New Potential Antimalarials

A Serina Hidroximetiltransferase de Plasmodium falciparum nunca foi vista como alvo para a quimioterapia antimalarial, possivelmente devido a sua grande similaridade seqüencial com a enzima humana. Esta similaridade sugere que este parasita pode ser incapaz de mutar esta enzima para desenvolver resistência à quimioterapia. Neste trabalho, diferenças observadas no comportamento dinâmico dos sítios ativos da estrutura cristalográfica da enzima humana e um modelo, por homologia, da enzima do parasita, ambos complexados com glicina, como N-glicina-[3-hidróxi-2-metil-5-fosfonooximetil-piridin-4-il-metano], e ácido 5-formil-6hidrofólico, foram utilizadas para o planejamento de inibidores seletivos desta enzima como novos potenciais antimalariais. Estes potenciais inibidores são derivados do ácido 5-formil-6hidrofólico com diferentes cargas e comprimentos de cauda. Simulações por dinâmica molecular e estimativas das energias de interação dos compostos nos sítios ativos de ambas as enzimas mostraram que os compostos com carga líquida negativa e caudas curtas ou caudas longas e anfóteras, seriam potencialmente mais seletivos em relação à enzima do parasita.


Introduction
Medicinal chemistry has undergone a revolutionary change in the last years.Rapid advances in biological sciences and bioinformatics have resulted in a much better understanding of how organisms function at the cellular and molecular levels.As a result, most research projects in the pharmaceutical industry now begin by identifying a suitable target and designing a drug to interact with that target.An understanding of the structure and functional mechanism of the target is, therefore, crucial to this approach.This target-oriented approach is know as "rational design of drugs" and involves many different stages, from choosing a suitable molecular target to marketing the final drug. 1 Along with target validation, the crucial step of this process is to find or design a compound that shows the desired pharmaceutical activity and selectivity on the target: the so-called "lead compound".The level of activity may not be very significant and there may be undesirable side effects, but the lead compound provides a start for the process of drug development. 1alaria is still a world problem of public health, affecting 300 to 500 million people worldwide and causing about 2.5 million deaths annually, mainly among children under five years old. 2 The inefficacy of vaccines 3 and the rapid emergence of P. falciparum strains by resistant to currently available antimalarial drugs [4][5][6][7] have stimulated new efforts regarding the development of new chemotherapy for malaria.0][11][12] Among those enzymes, Plasmodium falciparum Serine Hydroxymethyltransferase (pfSHMT) is the only one that has yet not been seen as a potential target to antimalarials.This probably happens because of the great sequence identity of this enzyme with human Serine Hydroxymethyltransferase (hSHMT), which could implicate in poor selectivity.In fact, the similarity between the active sites of pfSHMT and hSHMT is 86.95% while the entire sequence similarity between those enzymes is 48.50%.However, we have shown, 8 comparing the dynamic behavior of the active sites of crystallographic hSHMT (PDB ID 1BJ4) 13 and the first proposed multiple alignment homology model of pfSHMT, that there are key differences in the anchorage of the glutamate tail of the substrate 5-formyl-6-hydrofolic acid (FFO) in hSHMT and pfSHMT active sites.Those differences could be useful for the design of selective inhibitors of pfSHMT.The main difference between both enzymes is located at residue Glu137 of pfSHMT and its equivalent in hSHMT, Asp146.Based on the results of molecular dynamics (MD) simulations, it was proposed that modifications in the charge and length of the FFO tail could lead to compounds that would be able to interact preferentially with the pfSHMT active site.
In this work, the computer aided design of ligands using MD simulations and studies of the enzyme-inhibitor interactions were employed to propose structures and to study the dynamic behavior in the active site of seven potential lead compounds as selective inhibitors of pfSHMT.
The seven compounds designed in this work are, therefore, FFO derivatives with different charges and tail lengths.All the compounds plus glycine, as N-glycine-[3hydroxy-2-methyl-5-phosphonooxymethyl-pyridin-4-ylmethane] (PLG) were positioned into pfSHMT and hSHMT active sites, submitted to further 500 ps of MD simulations and had their interaction energies estimated with the package Insight Discovery 2000. 14The results show that those inhibitors with a negative net charge at physiologic conditions and possessing either shorter tails or longer amphoteric tails would be more selective towards pfSHMT.

Methods
The previously reported 8 first multiple alignment homology model of pfSHMT, with PLG and FFO positioned into its active sites, was used to build seven holoenzymes.The FFO analogues were planned according to the differences previously observed in the behavior of the FFO tail during the MD simulations in the active site of both enzymes. 8Accordingly, the FFO structure was systematically modified by changing its tail length and charge to yield seven analogues, which structures were built using the Gaussian View software from the Gaussian 98 package, 15 based on the structure of FFO as template.The insertion of the analogues into the modeled enzyme active site was performed by superposition to the FFO in the same conformation as reported in the literature. 16he atomic partial charges of PLG and the analogues were previously calculated at the Hartree Fock level with the 6-31G* basis set using the CHELPG approach of the Gaussian98 package 15 and their topological files, compatible with the format of the GROMACS 3.1.4package 17,18 databank, were generated at the Dundee PRODRG server. 19The holoenzymes obtained by this procedure, which volume is about 700 nm 3 , were set into a 1,350 nm 3 cubic box containing about 43,000 water molecules and minimized with the GROMOS 87 force field, 20 implemented in the GROMACS 3.1.4package. 17,18hese minimizations were carried out using the steepest descent algorithm until reaching an energy gradient of 100 kcal mol -1 nm -1 .
The MD steps were carried out according to the following procedure: First, 50 ps of MD at 300 K of the water molecules inside the box, in order to allow for equilibration of the solvent around the protein residues.In those dynamic simulations, all protein atoms had their positions restrained.Then, it was carried out a full MD simulation of 500 ps at 300 K with no restrictions, using 1 fs of integration time and a cut-off of 14 Å for the longrange interactions.
Because these proteins have only a residual net charge, which was balanced by contra-ions for these simulations, it was possible to use the direct cut-off for long range interactions without smoothening functions.The effects of the electrostatic potential truncation are minor at 14 Å, and the cut-off procedure makes the calculations faster than other methods.Considering that the protein dimer and water in the simulation box can achieve about 130,000 atoms, this method can save significant computational time.
As a whole, 100 frames were stored during this simulation.At this stage of the work the pair lists were updated every picosecond.All the Lys and Arg residues were positively charged and the Glu and Asp residues were negatively charged.
The crystallographic structures of dimeric hSHMT, with PLG and each designed compound inserted into their active sites were also submitted to several rounds of energy minimization and dynamic simulations, following the same procedure as described for pfSHMT.
Also, in order to obtain a neutral net charge for all the systems, they were neutralized by the addition of ions Na + for the hSHMT and Cl -for the pfSHMT systems, respectively.The program calculates the electrostatic potential to find the best positions for ion insertion by replacing water molecules that are at least at 3.50 Å from the protein surface.There were employed Periodic Boundary Conditions (PBC) and the water model used was the Simple Point Charge water (SPC), the default solvent of the GROMACS 3.1.4. 17,18he estimation of the interaction energies of each compound inside the active sites of pfSHMT and hSHMT were performed using the AMBER force field, which is included in the Insight Discovery 2000 package. 14

Proposition of the structures for potential inhibitors of pfSHMT
The differences previously observed in the interactions of the FFO tail with the active sites of the homology model of pfSHMT and the crystallographic structure of hSHMT, 8 led to the design of a series of FFO derivatives as potential selective inhibitors of pfSHMT, which are shown in Figure 1.
First of all we proposed seven structures with different tail lengths, divided into three groups, according to their expected net charges in physiologic conditions: Group 1 included the positively charged compounds (compounds 2, 3, 4, 5 and 6 in Figure 1); group 2 included the neutral compound (compound 1 in Figure 1) and group 3 included the compound with a negatively charged tail (compound 7 in Figure 1).We tried a neutral and a negatively charged compound in physiologic conditions because dimeric forms of hSHMT and pfSHMT are differently charged in physiologic conditions.By summing the charged residues at pH 7.5, hSHMT presents a net charge of -4, while pfSHMT presents a net charge of +8.Since the affinity between an enzyme and an inhibitor could be related to the electrostatic interactions between them, this enzyme net charge difference could be used for the proposition of selective inhibitors.Accordingly, we decided to check the effect of different charges on hSHMT and pfSHMT selectivity designing compounds 1 and 7 with charges different from compounds in group 1.
According to our previous results, 8 the possible selective inhibitors of pfSHMT would need to have a positive tail instead of the glutamate tail of FFO in order to have an attractive interaction with the carboxylate of pfSHMT Glu137 side chain.Initially we decided to design potential inhibitors maintaining the pteridinic ring intact and changing the PABA portion of FFO by substitution of the amino nitrogen of PABA by a CH 2 group for simplification of the synthetic procedure.This was the case for all the structures except compound 7.For the first structure the only additional modification was the substitution of the last carboxylate group of the glutamate tail of FFO by a positively charged guanidine (compound 1), maintaining the length of the tail as close as possible that of FFO.For the next compound we also substituted the amide and the last carboxylate group by an aromatic ring, to explore the hydrophobic behavior of this ring as spacer, maintaining the guanidine terminal group (2).For the next three compounds it was decided to shorten the tail length.Therefore, the whole tail was substituted by amidine (3), guanidine (4) or methylene-guanidine (5) groups.The substitution of the methylene-guanidine group of compound 5 for a guanylhydrazone group, also changing the substitution pattern to the meta position, led to compound 6.The meta position was used in compound 6 in order to maintain the tail size as closer as that of compound 5. Finally, compound 7 was a modification of compound 3, where the CH-CHO portion of the pteridine ring was substituted by a CH 2 , for synthesis simplification, and two -SO 3 -groups were introduced at the aromatic ring in order to give the compound a net negative charge.
All these seven structures were proposed based on the differences previously observed in the dynamics studies of the FFO inside the active sites of crystallographic hSHMT and the pfSHMT homology model. 8There are many other potential structures but these seven ones were chosen as the starting point for the design of potential selective inhibitors for pfSHMT.J. Braz.Chem.Soc.

Molecular dynamics simulations
The seven proposed compounds were positioned, together with PLG, inside the active sites of the dimeric hSHMT and pfSHMT affording 14 enzyme-coenzyme-inhibitor systems.Each system was submitted to 50 ps of position-restrained MD (PRMD) followed by 500 ps of MD simulations with no restriction, as described in the experimental section.We decided to stop the MD simulations at 500 ps because all the systems reached a stable state before the 100 ps of MD simulation, as it can be seen in the energy plots for compound 1 in Figure 2.This procedure allowed for the reduction of the total computational time required to perform all the 14 MD simulations.
There were carried out calculations of temporal RMSD on all the atoms in all the systems for 500 frames generated at each 1 ps of MD simulation.In this case the result is a unique general value for each SHMT monitored throughout the simulation time.Taking into account that the complexes could float inside the water box, each frame was adjusted to the former by the minimum squares method when calculating the standard deviation.In Figure 3, where the variation of RMSD with time is monitored, it can be observed that the systems SHMT/PLG/ Compound 1 equilibrate along the first few picoseconds of the MD simulation.This behavior was common to all 14 MD simulations, with deviations never larger than 0.1 nm after the first 100 ps of simulation.We also observed that, for all simulations, the temporal RMSD of hSHMT and pfSHMT as apoenzymes, practically fits the temporal RMSD of the complexes (data not shown).Also, as expected, the ligands display a fluctuation greater than for the complexes but with a smaller RMSD, thus confirming the stability of the MD simulations.
The spatial RMSD (or RMSF) on each amino acid residue was also calculated in the time range of 100 to 500 ps, at each 1 ps, totalizing 400 frames.Figure 4 shows the plots of RMSF for the systems SHMT/PLG/Compound 1, with the total RMSF (peptidic backbone plus side chains) in gray, the RMSF of the peptidic backbone (N, C α , C) in black and the RMSF of the side chains in dotted lines.In Figure 5      in the tube representation.These regions correspond to the residues near the two terminus of each monomer and to the loops regions.On the other hand, the residues at the active site regions, as well as those at the α-helix and β-sheet regions, present lower RMSF values, revealing to be the most stable regions of the system.Throughout the dynamics simulations, very few fluctuations exceed 0.2 nm and even less fluctuations overpass 0.4 nm.The fluctuations close to 0.4 nm observed in the dynamic plot of the complex with hSHMT are located at the regions of residues 278-283 (monomer A) and 748-753 (monomer B), which correspond to the loop L3.In a previous work 8 it was shown that this loop is smaller in pfSHMT and because of that, these fluctuations are not so apparent in the RMSF plots of the complexes with the parasite enzyme.

Analysis of the dynamic behavior of the designed compounds in the active sites of pfSHMT and hSHMT
In order to analyze and compare the dynamic behavior of each compound inside the active sites of hSHMT and pfSHMT along the MD simulations, we extracted one frame every 20 ps of simulation, totalizing 25 frames for each system.To make possible the visualization we selected, in each frame, only those residues directly involved in the interactions with the compounds tails.These residues are Thr131, Hys132, Gly133, Phe134, Phe135, Asp136, Glu137, Lys138, Lys139, Lys140, Val141 and Ser142 for pfSHMT and Thr140, Hys141, Gly142, Phe143, Met144, Thr145, Asp146, Lys147, Lys148, Lys149, Ile150 and Ser151 for hSHMT.
The interactions of each compound with each active site were analyzed by evaluating, along the dynamics simulations, the distance between the compound tail and the Glu137 side chain for pfSHMT and the Asp146 side chain for hSHMT.The distance that was actually monitored corresponded to the distance between the δ carbons of the Glu137 in pfSHMT or Asp146 in hSHMT and the carbon atom bonded to the amidine or guanidine tails.
Table 1 summarizes the distance variations between the compounds tails and the side chains of Glu137 and Asp146, respectively.To analyze the data from this table it must be stressed that the starting distances tail-residue for the dynamics with each potential inhibitor are different for the two enzymes.This starting distance is necessarily greater for hSHMT, as Asp146 is oriented away from the position of the tail of the different compounds.In fact, in hSHMT it is not possible to set the compounds closer to Asp146. 8This is not the case for pfSHMT, where the Glu137 residue is oriented towards the tail of the potential inhibitors.In fact, the starting position of all compounds in the active sites was obtained by their superimposition with the crystallographic E. coli SHMT (eSHMT) cocrystallized with FFO and PLG. 8,16In Table 1 it can be observed that compound 3 was the one that got closer to the Glu137 side chain, stabilizing their final position at 3.5 Å of the δ carbon of Glu137.All the other compounds ended at distances varying from 5 to 8 Å.We also can observe from Table 1, that for compounds 4 and 5 the distance variation during the MD simulation is small and that the tail of compound 6, instead of approaching, increases its distance from Glu137.This probably happens because the guanidine group is kept from turning its positive charge towards Glu137 because of its The picture was generated with the molmol software. 21ulkiness and because it presents electrostatic repulsive interactions with Lys138, Lys139 and Lys140.Compound 7, despite having a short tail, probably suffers repulsive interactions of its SO 3 -groups with Phe135, as showed in Figure 6, keeping its amidine group far from the Glu137 side chain.Also, the absence of the pteridinic CHO group in compound 7 could eliminate important interactions for the stabilization of this compound inside the active site.As shown in Figure 7, compound 1, despite approaching the Glu137 side chain, in the same way as compounds 4, 5 and 6, presents some difficulty in accommodating its tail between the positively charged side chains of Lys138, Lys139 and Lys140.For this compound, it is possible that a change in the positions of the negatively and positively charged substituents could overcome this problem.Of all the designed  compounds only compound 3, with its short amidine tail, succeeded in approaching the Glu137 side chain enough to make possible the formation of H-bonds, as showed in Figure 8.
For hSHMT, despite some compounds approaching their tails significantly to the Asp146 side chain, during the MD simulation, this approximation was not enough to improve their interaction with the enzyme active site, as they still remained at an absolute distance varying from 10 to 18 Å away from the Asp146 side chain.

Estimation of the compound-enzyme interaction energies
As mentioned before, the compounds were classified into three groups according to their expected ionic states under physiologic conditions.Group 1 (compounds 2, 3, 4, 5 and 6) included the positively charged structures while group 2 (compound 1) and group 3 (compound 7) included the neutral and the negatively charged structures, respectively.This classification came from the observation that dimeric hSHMT and pfSHMT, when submitted to the GROMOS87 force field 20 as apoenzymes, in a water box and under physiologic conditions, presented different net charges.While hSHMT had charge -4, pfSHMT had charge +8.So, it is possible that hSHMT would have greater affinity for molecules with a positive net charge while pfSHMT would have greater affinity for molecules with a negative net charge.A similar hypothesis has already been studied in a previous work where we proposed that a continuous positive potential region between the two active sites of the bifunctional enzyme Dihydrofolate Reductase-Tymidilate Synthase (DHFR-TS) of P. falciparum, could be involved in an optimized mechanism for the transport of dihydrofolate negatively charged from TS active site to DHFR active site. 9In the present work, the interaction energy of the compounds with the enzymes was estimated using the Insight Discovery 2000 package, 13 and are in complete agreement with this hypothesis.Table 2 presents, for each compound, theoretical values corresponding to the electrostatic and van der Waals contributions to the energy of interaction of hSHMT and pfSHMT with the designed compounds.These values were obtained for the last molecular frames generated after each MD simulation of 500 ps, as discussed in the former section.The larger the value of the total energy, the smaller the enzyme affinity for the corresponding compound.In the last column are shown the values of ΔE between the systems for both enzymes, calculated as the difference between the total energy of the compound located in the active site of pfSHMT and the total energy of the compound, inside the active site of hSHMT as showed in equation 1.We have also estimated the interaction energies after two additional MD simulations of 1 ns for the systems pfSHMT/compound 3 and hSHMT/compound 3. The results (see Table 2) show that there are not significant changes in this property from 500 to 1000 ps of MD simulation, thus confirming that (1) It is important to notice that, at this point, we do not intent to predict the binding affinities of the compounds with each protein.The goal of calculating ΔE was only to obtain some qualitative insights of which could be the most promising compounds for selective interaction with pfSHMT.The appropriate way for predicting the binding affinities must include the contribution of the entropic terms in equation 1 and will be the subject of a forthcoming article.
A negative ΔE means that the compound could have more affinity for pfSHMT than for hSHMT.Taking into account these observations to analyze the results from Table 2, one can suppose that the compounds of group 1 (compounds 2, 3, 4, 5 and 6) afford the worst qualitative values in terms of selectivity once the electrostatic contributions dominates the enzyme-compound interaction energies.Probably the enzyme net charge has a major influence in the total energy because the Coulomb interactions are of long-range nature.Group 2 (compound 1) presents closer values for both enzymes.The factors that define the selectivity in this case are the specific interactions in each enzyme active site.The results here are relatively the same.Group 3 (compound 7) was the one that afforded the best value in terms of selectivity for pfSHMT.In this case both, the long-range and the specific interactions are more favorable to pfSHMT than to hSHMT.So, from a qualitative energetic point of view, compound 7 would have more affinity for pfSHMT.

Conclusions
In this work there were carried out MD simulations and interaction energy analysis for seven compounds that were designed as potential selective inhibitors of pfSHMT based on the conformational and dynamics differences observed between the residues Asp146 and Glu137 in the active sites of hSHMT and pfSHMT, respectively.
The MD studies showed that, in order to act as selective ligands for the pfSHMT active site, the tails of the FFO analogues should be short to avoid the repulsive interactions with residues Lys138, Lys139 and Lys140 of the active site of pfSHMT, as it is the case for compounds 3 and 7. On the other hand, the tails may be longer, but in that case they must possess both negative and positive charges at the right positions, in order to explore their interactions with Lys138, Lys139, Lys140 and Glu137.It seems that, comparing the positive groups used to change  the charge of the tail, the amidine is better than guanidine as option to interact with Glu137, mainly because the bulkiness of the latter introduces a repulsive effect with Lys138, Lys139 and Lys140.Also, the analysis of the qualitative interaction energy values for all compounds showed that the prototypes with negative total charge have greater affinity for pfSHMT than for hSHMT.This probably happens because, as demonstrated by the GROMOS 87 force field, 20 hSHMT and pfSHMT have opposite net charges.Accordingly, the Coulomb interactions become determinant in the selectivity and molecules with a net negative charge under physiologic conditions would display greater affinity for pfSHMT.Thus, we can conclude that short tailed compounds and compounds with long tails negatively and positively chrged are able to interact with Lys138, Lys139, Lys140 and Glu137, but with a negative net charge under physiologic conditions, would be more selective towards pfSHMT as they present negative values of ΔE.
Our results clearly suggest that it is possible to design potential selective inhibitors for pfSHMT despite its great similarity to the human enzyme.This may be a very important finding to develop new and more effective chemotherapy for malaria.
Many more potential selective inhibitors for pfSHMT may still be proposed, and the data obtained in this work is now being used to design new compounds for synthesis and biological evaluation.
are shown the qualitative illustrations in the tube representation of the RMSF for the SHMT/PLG/Compound 1 systems.Analyzing Figures 4 and 5 there can be determined the most mobile regions along the MD simulations, as the regions with larger values of RMSF and large thickness of the tubes

Figure 1 .
Figure 1.Structures of FFO and the seven proposed compounds.

Figure 5 .
Figure 5. Qualitative illustrations of RMSF for the MD simulations of: (a) hSHMT/PLG/Compound1 and (b) pfSHMT/PLG/Compound 1. PLG and Compound 1 are omitted.Black tubes correspond to α-helix; Dark gray tubes correspond to β-sheets and Light gray tubes correspond to loops.The picture was generated with the molmol software.21

Figure 6 .
Figure 6.Frames of MD simulations of Compound 7 inside hSHMT (a) and pfSHMT (b) active sites.Picture generated with the PyMOL software.22

Figure 7 .
Figure 7. Frames of MD simulations of Compound 1 inside hSHMT (a) and pfSHMT (b) active sites.Picture generated with the PyMOL software.22

Figure 8 .
Figure 8. Frames of MD simulations of Compound 3 inside hSHMT (a) and pfSHMT (b) active sites.Picture generated with the PyMOL software.22

Table 1 .
Distance variations in Å along the MD simulations, between the compounds (Comp) tails and the Glu137 side chain for pfSHMT and the Asp146 side chain for hSHMT.Variation (Å) = Initial Distance (Å) -Final Distance (

Table 2 .
12lues of the enzyme-compound interaction energies (kcal mol -1 ) for compounds (Comp) 1 to 7 calculated in the package Insight Discovery 2000.12Energies of the van der Waals interactions; b Energies of the electrostatic interactions; c Values inside brackets were estimated after 1 ns of MD simulation. a