The Linear Relationship Between Koopmans' and Hydrogen Bond Energies for Some Simple Carbonyl Molecules

Recently Galabov and Bobadova-Parvanova showed that the energy of hydrogen bond formation obtained from calculations at the HF/6-31G(d,p) level is highly correlated with the molecular electrostatic potential at the acceptor site in some simple carbonyl compounds. In this work we show that the electrostatic potential can be replaced by Koopmans' energy. The correlation between this energy and the energy of hydrogen bond formation is as high as that observed by Galabov and Bobadova-Parvanova. Siegbahn's potential relating Koopmans' energies and GAPT charges shows that the hydrogen bond energy is not simply correlated with the charge of the acceptor site, since the charges of the neighboring atoms are also important in the hydrogen bonding process.


Introduction
The success of QSAR studies depends on whether the molecular descriptors chosen are appropriate to explain biological activities. Descriptors are obtained from a number of sources such as experimental physical-chemical data, geometrical structure parameters and theoretical electronic indexes obtained from quantum mechanical calculations.1 Among the electronic indexes most prominent in QSAR studies are the atomic charges. The charge values used in QSAR studies depend on the basis set used, the level of electron correlation treatment and the method used to extract charge values from the molecular wave function.
Models generated by 3D-QSAR, for example using the CoMFA (Comparative Molecular Field Analysis) approach, can be used either to predict biological activity, taking into account mainly the statistical aspects of the model, or to obtain information about the physicochemical molecular surroundings, in order to describe single steps of interactions with receptor binding sites. If the charges are correlated with the activity they can be used for predictions even though the absolute charges are not correct. On the other hand, the electrostatic fields generated from different computational methods could simulate very different physicochemical conditions of ligand-protein interactions.
Folkers et al.2 studied the effect of charge calculation methods in CoMFA models for 24 substituted N2-phenylguanines, which inhibit the Herpes Simplex Virus 1 thymidine kinase (HSV1 TK inhibitors). Similar CoMFA results were observed with different charge methods. However, their study showed that the observed electrostatic fields were greatly affected by the computational methods. Kroemer et al.3 analyzed 17 different methods at three different levels of theory to calculate charges and their effects on CoMFA results. Gasteiger-Marsili, semiempirical (MNDO, AM1 and PM3) and ab initio (HF/STO-3G, HF/3-21G* and HF/6-31G*) charges were included. The ESPFIT-derived charges yielded better models than those based on charges calculated from Mulliken population analyses. However, the simple Gasteiger-Marsili charges did not give the worst model.
For angiotensin converting enzyme (ACE) and thermolysin inhibitors, Waller et al.5 studied the effect of charges calculated by the Gasteiger-Hückel and PM3 methods. In the ACE inhibitor series, the two methods gave nearly identical models. For thermolysin inhibitors, a better model (internal predictivity) was observed using Gasteiger-Hückel charges, but the PM3 method gave better external predictivity for 11 test compounds. For non-steroidal aromatase inhibitors related to fadrozole, Recanatini6 observed similar models using either the AM1 method for geometry optimization and charge calculations or MAXIMIN2 molecular mechanics for geometry optimization and Gasteiger-Marsili charges. Navajas et al.7 verified that the mutagenic activity of 16 5H-furan-2-one derivatives was correlated with the LUMO field. The MNDO, PM3 and AM1 Hamiltonians were used to optimize and generate the LUMO field. Only the AM1 and PM3 methods gave satisfactory CoMFA models.
More recently the descriptive power of MS-WHIM, classified as a global 3D-QSAR method to model specific biological interactions, was described and the dependence of MS-WHIM on the type of atomic charge used to compute the electrostatic potential was analyzed. The MS-WHIM descriptors were found to be sensitive to the type of partial atomic charges applied, and improved models were obtained using more accurate charges.8
CoMFA models derived for artemisinin derivatives, using semiempirical AM1 and HF/3-21G optimized geometries, revealed that the HF/3-21G method was usually, but not drastically, better than AM1. Additional calculations were performed to investigate the electrostatic field difference using the Gasteiger-Marsili charges, the electrostatic potential fit charges at the AM1 level, and the natural population analysis charges at the HF/3-21G level of theory. For the HF/3-21G optimized structures no difference in predictability was observed, whereas for the AM1 optimized structures differences were found.9
In addition to these problems it is not always clear that the atomic charge is the most appropriate descriptor to be used in QSAR studies. This is particularly true for problems involving hydrogen bonding. Galabov and Bobadova-Parvanova10,11 have recently performed theoretical studies at the HF/6-31G(d,p) and HF/6-31+G(d,p) levels showing that the energy of hydrogen bond formation is not highly correlated with the atomic charge of the H-bonding site in the acceptor molecule but rather with the electrostatic potential at this site. For hydrogen bonding of HF to some open chain carbonyl and nitrile compounds the H-bonding energy was not found to be systematically related to the atomic charges of the oxygen or nitrogen acceptor atoms but was linearly related to the electrostatic potentials at these nuclei. No precise linear relation was found for any of several types of atomic charges: Mulliken,12 CHELPG13 and MK.14 This is particularly significant since the charges and potentials were calculated from the same wave functions.
In this paper we investigate the relation between the H-bonding energy and GAPT15 atomic charges. Even though GAPT charges can be determined experimentally from infrared intensities and also calculated from molecular wave functions,16 their relation to H-bonding has not yet been investigated. Furthermore the GAPT charges have been shown to be related to atomic electrostatic potentials by Siegbahn's simple potential model.17,18 These relationships suggest the use of Koopmans' energies for core electrons as QSAR descriptors in activity problems for which the H-bonding phenomenon is important.

Calculations
H-bonding energies for the hydrogen fluoride molecule with the HCOH, HCOOH, HCOSH, HCOOCH3, HCONH2, HCONO2, HCOCN, HCOF, HCOCl, HCOCH3 and HCOCF3 molecules were taken from references 10 and 11 and are reproduced in Table 1. These values were calculated at the HF/6-31G(d,p) level and for this reason our GAPT charges and Koopmans' energies were calculated using the same wave functions. All calculations were performed at the theoretical equilibrium geometries of these molecules using the Gaussian 98 program20 on a DEC ALPHA 1000 workstation.
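The molecular electrostatic potential at a point x in space, where x refers to the Cartesian coordinates of the acceptor nucleus in the molecule, is standard output of the Gaussian program and can be calculated by the equation:11

$$V(\mathbf{x}) = \sum_{A} \frac{Z_A}{|\mathbf{R}_A - \mathbf{x}|} - \int \frac{\rho(\mathbf{r}')}{|\mathbf{r}' - \mathbf{x}|}\, d\mathbf{r}'$$

where Z_A is the charge of nucleus A located at R_A and ρ(r') is the electron density; when x coincides with a nucleus, the term for that nucleus is excluded from the sum.
The GAPT charges are mean dipole moment derivatives given by

$$q_a = \frac{1}{3}\left(\frac{\partial p_x}{\partial x_a} + \frac{\partial p_y}{\partial y_a} + \frac{\partial p_z}{\partial z_a}\right)$$

where a represents an atom in a molecule and p is the molecular dipole moment. In other words the GAPT charges are one third of the trace of the atomic polar tensor:20,21

$$\mathbf{P}_X^{(a)} = \begin{pmatrix} \partial p_x/\partial x_a & \partial p_x/\partial y_a & \partial p_x/\partial z_a \\ \partial p_y/\partial x_a & \partial p_y/\partial y_a & \partial p_y/\partial z_a \\ \partial p_z/\partial x_a & \partial p_z/\partial y_a & \partial p_z/\partial z_a \end{pmatrix}$$

The molecular polar tensor is a juxtaposition of the atomic polar tensors

$$\mathbf{P}_X = \left\{ \mathbf{P}_X^{(1)} \; \mathbf{P}_X^{(2)} \; \cdots \; \mathbf{P}_X^{(N)} \right\}$$

where there are N atoms in the molecule, and is calculated from

$$\mathbf{P}_X = \mathbf{P}_Q \mathbf{L}^{-1} \mathbf{U} \mathbf{B}$$

where L^{-1}, U and B are well known transformation matrices22 used in molecular vibrational analysis and P_Q contains the dipole moment derivatives with respect to the normal coordinates. These latter elements are proportional to the square roots of the experimental infrared intensities.23 The GAPT charges are automatically calculated with the Gaussian program when the FREQ option is used.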
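The definition of the GAPT charge as one third of the trace of the atomic polar tensor can be illustrated with a short numerical sketch. It assumes NumPy and a molecular polar tensor stored as a 3 x 3N array of juxtaposed atomic blocks; the function name `gapt_charges` and the numbers in the example are hypothetical and are not taken from the calculations reported here.

```python
import numpy as np


def gapt_charges(polar_tensor):
    """Return one GAPT charge per atom from a 3 x 3N molecular polar tensor.

    The tensor is assumed to be the juxtaposition of N atomic polar
    tensors, each a 3 x 3 block of dipole moment derivatives.
    """
    p = np.asarray(polar_tensor, dtype=float)
    if p.ndim != 2 or p.shape[0] != 3 or p.shape[1] == 0 or p.shape[1] % 3 != 0:
        raise ValueError("expected an array of shape (3, 3N)")
    n_atoms = p.shape[1] // 3
    # Split the columns into N atomic blocks: blocks[a] is the 3 x 3 tensor of atom a.
    blocks = p.reshape(3, n_atoms, 3).transpose(1, 0, 2)
    # GAPT charge = one third of the trace of each atomic polar tensor.
    return np.trace(blocks, axis1=1, axis2=2) / 3.0


# Hypothetical two-atom example (illustrative numbers only).
apt_atom_1 = np.diag([0.3, 0.3, 0.6])
apt_atom_2 = -apt_atom_1  # the atomic tensors of a neutral molecule sum to zero
example_tensor = np.hstack([apt_atom_1, apt_atom_2])
example_charges = gapt_charges(example_tensor)
```

Because only the diagonal elements of each block enter the trace, the charge obtained is invariant to rotation of the Cartesian axes.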

Results
Galabov and Bobadova-Parvanova10 have shown that the energy of hydrogen bond formation calculated at the HF/6-31G(d,p) level is not highly correlated with the atomic charge on the oxygen atom of the carbonyl group of the isolated acceptor molecule. Regressions of this energy on Mulliken, CHELPG and MK charges for the oxygen atoms of this group of molecules resulted in relatively low coefficients of determination, R² = 0.867, 0.867 and 0.837, respectively. Figure 1 presents the graph of the energy of hydrogen bond formation as a function of the GAPT atomic charges on the oxygen atoms. The energies were taken from reference 10 and the GAPT charges were calculated in our laboratory and are presented in Table 1. The scatter of the points about the regression line, included in the graph, is quite large and reflects the low coefficient of determination obtained, 0.602. The slope of the regression line is positive, as expected, since more negative oxygen acceptor charges should result in more stable hydrogen bonds.
Galabov and Bobadova-Parvanova10 reported that a much higher correlation is obtained when the energy of hydrogen bond formation is graphed against the molecular electrostatic potential at the carbonyl oxygen atom in the isolated acceptor molecules. The graph can be seen in reference 10 and the corresponding regression has an R² value of 0.958, much larger than the values observed for the various types of atomic charges assigned to the oxygen atom in the isolated acceptor molecules. The molecular electrostatic potential at an atom can be accurately estimated by Koopmans' energies of electrons occupying core orbitals. This is clearly illustrated in Figure 2, where Koopmans' energies for the 1s electrons of the oxygen atoms for the group of carbonyls investigated here are graphed against the corresponding molecular electrostatic potentials. The numerical values used have been included in Table 1. This graph shows an almost perfect correlation and its regression has an R² value of 1.000.
As such one can expect that the energy of hydrogen bond formation is highly correlated with Koopmans' energy. The corresponding graph is presented in Figure 3; its regression has an R² value of 0.951 and is very similar to the one reported by Galabov and Bobadova-Parvanova10 for the electrostatic potential. The small difference in the coefficient of determination could come from the fact that, besides investigating the HCOR molecules treated in this work, Galabov and Bobadova-Parvanova also studied their methyl analogues, CH3COR.
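The coefficients of determination quoted in this section come from simple linear regressions of one calculated quantity on another. A minimal sketch of such a fit is given below, assuming NumPy; the function name `linear_fit_r2` is hypothetical and the example arrays are arbitrary placeholders, not the values of Table 1.

```python
import numpy as np


def linear_fit_r2(x, y):
    """Least-squares line y = slope * x + intercept and its R^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 3:
        raise ValueError("x and y must be 1-D arrays of equal length >= 3")
    ss_tot = np.sum((y - y.mean()) ** 2)
    if ss_tot == 0.0:
        raise ValueError("y is constant; R^2 is undefined")
    # Degree-1 polynomial fit returns the coefficients [slope, intercept].
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - np.sum(residuals ** 2) / ss_tot
    return float(slope), float(intercept), float(r2)


# Placeholder data (arbitrary numbers, for illustration only).
descriptor = [0.0, 1.0, 2.0, 3.0]
response = [0.0, 1.0, 0.0, 1.0]
slope, intercept, r2 = linear_fit_r2(descriptor, response)
```

In practice the descriptor would be the oxygen charge, electrostatic potential or 1s Koopmans' energy of each acceptor molecule, and the response its hydrogen bond formation energy.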

Discussion
It is not surprising that hydrogen bond formation energies are highly correlated with the molecular electrostatic potentials at the acceptor atoms or with Koopmans' energies of the 1s electron on the acceptor atoms. Hydrogen bond formation consists of moving a positive hydrogen atom, together with the rest of the donor molecule, from infinity and placing the hydrogen atom close to the oxygen nucleus. One can idealize this process by moving a positive test charge from infinity and placing it on the oxygen nucleus. If the test charge does not affect the molecular environment, the energy of this process is just the molecular electrostatic potential at the acceptor atom. This explains why the molecular electrostatic potential approximates the hydrogen bond formation energy and why their values are expected to be correlated. Koopmans' ionization of a 1s electron is a process almost exactly opposite to the one used for determining the molecular electrostatic potential. Instead of adding a positive charge, a negative charge is removed. As such it is also highly correlated with the energy of hydrogen bond formation.
These relations are only approximate for two reasons. First, hydrogen bond formation affects the electron distribution of the acceptor molecule. This, of course, is not the case for molecular electrostatic potentials and Koopmans' energies of the isolated acceptor molecule. Second, the latter quantities, as used in this study, are evaluated at the acceptor atom. The proton involved in hydrogen bonding is close to the acceptor atom but not on it, where close means at the hydrogen bond distance. Since these distances are approximately the same for all hydrogen bonding pairs treated here (between 177 and 222 pm according to the theoretical calculations10) a good correlation between the hydrogen bond formation energy and these two quantities can be expected. The scatter of points about the regression line in Figure 3 could be due, in part, to the different hydrogen bonding distances. However it must be remembered that the molecular electrostatic potentials and Koopmans' energies provide extremely simple models for explaining the much more complicated hydrogen bonding process.
It is also simple to understand why the molecular electrostatic potentials and Koopmans' energies are more highly correlated with the hydrogen bonding energies than the atomic charges. This is more conveniently explained using the GAPT charges. Siegbahn and co-workers11 have shown that Koopmans' energy can be calculated using a simple potential model represented by the equation:

$$E_{1s,\mathrm{O}}^{\mathrm{Koop}} = k\, q_{\mathrm{O}} + \sum_{a \neq \mathrm{O}} \frac{q_a}{R_{\mathrm{O}a}}$$

Here $E_{1s,\mathrm{O}}^{\mathrm{Koop}}$ is Koopmans' energy of a 1s electron of the carbonyl oxygen, $k$ is a constant, $q_{\mathrm{O}}$ is the charge on the oxygen atom, $q_a$ the charge on a neighboring atom and $R_{\mathrm{O}a}$ the distance between the oxygen and its a-th neighboring atom. If this potential model is applied to experimental ESCA energies a relaxation energy term must be added on the right hand side of this equation. This relation has been shown to be valid for both experimental 1s electron ionization energies and Koopmans' energies if GAPT charges obtained from experimentally measured infrared intensities or calculated from molecular wave functions are used.17,18 Evidently, the formation energy of the hydrogen bond and GAPT oxygen charges are not highly correlated because the electrostatic potential at the oxygen atom contributed by the neighboring atoms is also important in the hydrogen bonding process. In other words an approaching donor molecule is influenced not only by the charge of the acceptor oxygen atom but also by the charges on the other atoms in the molecule. Analogous arguments could be made for relations between other types of atomic charges and the hydrogen bonding energies. However in those cases the Siegbahn simple potential is not as accurate as for the GAPT charges.
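The role of the neighboring atoms in the simple potential model can be made concrete with a small sketch. It assumes NumPy, atomic units, and a model of the form k times the charge on the central atom plus the sum of neighbor charges divided by their distances to it; the function names, the value of k and the two-atom geometry are hypothetical and serve only as an illustration.

```python
import numpy as np


def neighbor_potential(charges, coords, center):
    """Sum of q_a / R_a over every atom a other than `center` (atomic units)."""
    q = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    if q.ndim != 1 or r.shape != (q.size, 3):
        raise ValueError("charges must have shape (N,) and coords shape (N, 3)")
    others = np.arange(q.size) != center
    # Distances from the central atom to every other atom.
    dist = np.linalg.norm(r[others] - r[center], axis=1)
    if np.any(dist == 0.0):
        raise ValueError("two atoms share the same position")
    return float(np.sum(q[others] / dist))


def potential_model_energy(k, charges, coords, center):
    """Simple potential model: k * q_center + neighbor potential."""
    q = np.asarray(charges, dtype=float)
    return float(k * q[center] + neighbor_potential(charges, coords, center))


# Hypothetical two-atom system: charges in e, positions in bohr.
toy_charges = [-0.5, 0.5]
toy_coords = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
toy_neighbor_term = neighbor_potential(toy_charges, toy_coords, center=0)
toy_energy = potential_model_energy(1.0, toy_charges, toy_coords, center=0)
```

Two molecules with the same oxygen charge but different neighbor terms give different model energies, which is the reason the oxygen charge alone is a weaker descriptor than the potential.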

Recommendations
QSAR studies have often been carried out by regressing biological activities on sets of molecular descriptors that often include atomic charges calculated from quantum chemical wave functions. If the main result of this paper, that Koopmans' energies are more highly correlated than atomic charges with hydrogen bond formation energies, can be extended to other groups of molecules besides those investigated here, it seems reasonable to use these energies as descriptors for biological processes in which hydrogen bonding is important. Their use instead of electrostatic potentials or atomic charges has several advantages. First, ionization energy processes are more familiar to chemists than the more abstract molecular potential concept. Second, their use instead of atomic charges eliminates two uncertainties in QSAR applications. Which electron correlation level should be used to calculate the molecular wave function? What is the most appropriate way (Mulliken, CHELPG, GAPT, etc.) of determining the atomic charges from this wave function? Since Koopmans' energies are defined at the Hartree-Fock level, no electron correlation treatment is required, which also brings significant savings in computer time. One could also use experimental core electron ionization (ESCA) energies as QSAR descriptors. However these quantities are only available for molecules in the gas and solid phases. Furthermore they reflect not only the initial electronic structure of the molecule but also its relaxation on core electron ionization, which is probably not relevant to biological problems.

Figure 1. Graph of HF/6-31G(d,p) calculated hydrogen bond formation energy against GAPT charges of the carbonyl oxygen in the isolated acceptor molecules.

Figure 3. Graph of HF/6-31G(d,p) calculated hydrogen bond formation energy against Koopmans' energies of the 1s electron of the carbonyl oxygen in the isolated acceptor molecules.

Figure 2. Graph of HF/6-31G(d,p) calculated molecular electrostatic potentials and 1s electron Koopmans' energies of the carbonyl oxygen in the isolated acceptor molecules.

Table 1. HF/6-31G(d,p) ab initio calculated energy of hydrogen bond formation, GAPT charge, Koopmans' energy and molecular electrostatic potential of the carbonyl oxygen in the isolated molecule.
a Values taken from Tables 1 and 3 of Reference 1. b Values calculated in our laboratory using the Gaussian 98 program.