A Linear Solvation Energy Relationship to Predict Vapor Pressure from Molecular Structure

Vapor pressures of organic liquids (in Pa at 298 K) correlate (R = 0.986) according to the relationship log P vap = 7.86 - 3.54 V - 1.17 E - 1.52 (S + λ) - 3.64 (η × A × B), where V, E, S, A, and B are empirical solute parameters for the molar volume, the excess molar refraction, the dipolarity/polarizability, and the hydrogen-bond donor and acceptor capacities, respectively. The parameter λ adjusts the value of the S term for specific functional groups, while η accounts for differences in hydrogen bonding among distinct classes of pure liquids. This linear solvation energy relationship (LSER) is chemically reasonable and allows the vapor pressure of organic liquids to be predicted from solute parameters that are known or estimated from chemical structure. These results illustrate the potential of using solute parameters to develop LSERs for predicting the properties of pure substances.


Introduction
The dispersal of organic liquids in the environment depends on their physical properties, especially their volatility and their solubility in water. 1 It is therefore of central importance in environmental studies to be able to predict these properties when experimental values are not available. 2 While these properties can sometimes be estimated on the basis of known physical constants, such as boiling point and heat of vaporization, 3 or by computational methods, 4 it is particularly useful to have methods that estimate them by inspection of molecular structure, without recourse to other experimental values or to computational results. In the present work, we examine the potential advantages and limitations of employing a Linear Solvation Energy Relationship (LSER) approach for the estimation of the vapor pressure of organic liquids at 298 K.
As introduced by Kamlet and Taft 5 and subsequently developed by Abraham, 6 the LSER approach characterizes solvation effects in terms of nonspecific (orientation-independent) and hydrogen-bonding interactions. Thus, a solvation property of interest (P) for an organic solute is modeled by a linear free energy relationship of the form 7
P = c + v V + e E + s S + a A + b B (1)
where c, v, e, s, a, and b are constants characteristic of the system being studied. The nonspecific interactions are represented by V, the characteristic volume 8 of the molecule, which is taken to be a measure of cavitation and generalized dispersion interactions; by E, the excess molar refraction of the compound relative to that of an aliphatic hydrocarbon of the same molar volume, which is thought to indicate the importance of interactions of molecules through their pi- and n-electron pairs; and by S, a measure of the dipolarity/polarizability of the solute. 9 The specific interactions are incorporated through the A and B parameters, which respectively represent the sums of the hydrogen-bond donor and hydrogen-bond acceptor characteristics of the solute. This LSER approach has been used to develop predictive equations for a wide variety of chromatographic and phase transfer processes. 10
The Abraham method was developed to model interactions of a set of solutes of diverse structure with a single solvent system or their transfer between two solvents or between a solvent and another phase. Recently, however, we began to explore the viability of the LSER approach for the estimation of physical properties of pure substances, including the work of interfacial adhesion of organic liquids with water 11 and the surface tension of organic liquids. 12 Along the same lines, Abraham has reported a correlation for the solubility of neat organic compounds in water, 13 and recently Bel'skii reported a correlation of vapor pressure along the lines of equation 1 but with an additional V² term. 9
All of the applications discussed in the preceding paragraph represent significant departures from the original Abraham model because, in a neat liquid, the surrounding solvent molecules are different for each solute. A reasonable LSER was nevertheless reported in each of these cases, suggesting that applying the Abraham method to pure liquids has some validity. It is therefore important to understand more fully the basis for the apparent validity of the extended Abraham model and, even more important, to determine the limitations of this approach.
Of all the phase changes involving neat liquids, vaporization should provide the most sensitive test of the extent to which the parameters used in an LSER-type correlation can appropriately model the intermolecular interactions present in a neat liquid. Therefore we have undertaken a study of the extent to which vapor pressure values correlate with the empirical solute parameters of Abraham. The results reported here provide a useful method for estimating the vapor pressures of organic compounds from their structures. In addition, the results both lead to a better understanding of the limitations of using LSER methods to model the properties of pure substances and point to some methods to improve such correlations.

Results and Discussion
In order to avoid the complications of hydrogen bonding in the initial phase of the study, we first considered a group of 315 organic liquids having A values equal to or near 0. The data set included alkanes, cycloalkanes, alkenes, cycloalkenes, dienes, alkynes, benzene, alkylbenzenes, alkylnaphthalenes, alkyl halides, aryl halides, ethers, thioethers, aldehydes, ketones, esters, mercaptans, tertiary amines, pyridines, nitriles, and nitro compounds. 14 The resulting correlation of log P vap (in Pa, at 298 K) with the three non-specific solute parameters, V, E, and S, is shown in equation 2. 15,16
log P vap = 7.78 - 3.45 V - 0.93 E - 1.70 S (2)
The fact that the coefficients of V, E, and S in equation 2 are all negative is chemically reasonable because all stabilizing intermolecular interactions decrease the vapor pressure of a liquid. 9 Additionally, equation 2 indicates that generalized dispersion is the dominant factor in determining the vapor pressure of compounds that do not hydrogen bond. That is, the effect of dispersion (given by the product 3.45 V) is greater than that of either of the other interactions (0.93 E or 1.70 S) for all of the compounds in the data set. In fact, the contribution of dispersion averages 77% of the total contribution of the V, E, and S interactions for all of the compounds in the data set and is more than 50% for all but three compounds. 17
It is noteworthy that the correlation in equation 2 is so good, given that, as noted above, the solvent is different for each solute in the group. One might expect that the interactions of a solute molecule with identical solvent molecules would be similar for compounds that are close homologs, but the results suggest that such interactions are also similar among many different classes of compounds. In order to test that conclusion more fully, we checked for functional group-specific differences between literature values of log P vap and those predicted with equation 2.
Such differences were indeed found for four families of compounds. The predicted values of log P vap were about 0.30 units too large for alkyl nitriles and about 0.41 units too large for alkyl nitro compounds. Alkylbenzenes as a group gave predicted values about 0.24 units too small, and predicted values for alkylnaphthalenes were about 0.27 units too small. These systematic deviations are illustrated in Figure 1.
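The check for functional group-specific deviations described above can be sketched as a grouped average of signed residuals. This is only a minimal illustration: the function name `mean_deviation_by_class` and the rows in `records` are hypothetical and were invented for the example; they are not values from Table S1.

```python
from collections import defaultdict
from statistics import mean


def mean_deviation_by_class(records):
    """Return {class: mean(predicted - literature)} for log P vap values.

    A positive mean indicates that the correlation overestimates log P vap
    for that class; a negative mean indicates an underestimate.
    """
    grouped = defaultdict(list)
    for compound_class, literature, predicted in records:
        grouped[compound_class].append(predicted - literature)
    return {cls: mean(devs) for cls, devs in grouped.items()}


# Hypothetical (class, literature log P vap, predicted log P vap) rows,
# made up for illustration only.
records = [
    ("nitrile", 3.00, 3.30),
    ("nitrile", 2.50, 2.80),
    ("alkylbenzene", 3.50, 3.25),
    ("alkylbenzene", 2.90, 2.67),
    ("ketone", 3.40, 3.41),
]

for cls, dev in mean_deviation_by_class(records).items():
    print(f"{cls}: {dev:+.2f}")
```

A class whose mean signed deviation is clearly larger than the overall standard error of the fit is a candidate for a class-specific correction; classes with deviations near zero need none.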
It has been suggested 18 that deviation of a value for an aliphatic nitro compound predicted with an LSER may reflect some degree of tautomerization to the nitronic acid (as shown for nitromethane in equation 3), which could result in hydrogen-bonding interactions. 19,20
CH3NO2 ⇌ CH2=N(O)OH (3)
The equilibrium constant for conversion of nitromethane to the nitronic acid in the gas phase is calculated to be 2.2 × 10⁻¹², however, and tautomerization of a nitronic acid to the isomeric nitroalkane in the condensed phase is thought to be essentially complete. 19 Moreover, in aqueous solution and at cellular pH, aci-tautomerization of secondary nitroalkanes is much greater than that of primary nitroalkanes, 21 but the deviation between predicted and literature values of P vap in the present study is much greater for 1-nitropropane than for 2-nitropropane. In addition, the average deviation between predicted and literature values of log P vap for a series of nitroalkanes is only slightly greater than the average deviation observed for nitrobenzene, m-nitrotoluene, and o-nitrotoluene. 22 Similarly, the average deviation between literature and predicted values for a set of alkyl nitriles is not much greater than that for nitrobenzene. Therefore hydrogen bonding arising from tautomerization seems not to explain our log P vap results.
Both cyano and nitro aliphatic compounds have significantly larger local dipole moments than do other compounds in the data set. This observation suggested that the lower than predicted vapor pressures of the nitriles and nitro compounds might reflect an intrinsic limitation of the S parameter as a descriptor for the properties of neat liquids. Thus, if S reflects primarily the non-specific dipolar interactions of the solute with a surrounding dielectric medium, it will fail to account adequately for dipole-dipole interactions strong enough to cause some transient ordering of molecules in the liquid state. 23 An LSER correlation such as equation 2, which uses S as the only parameter for dipolar interaction, will therefore overestimate the vapor pressures of nitriles and nitro compounds, as we observed.
The situation is the opposite for nonpolar, aromatic compounds. There is a strong correlation (R² = 0.97) between the S and E values of those alkylbenzenes and alkylnaphthalenes in the data set. This correlation may lead to an overestimation of the stabilization due to dispersion interactions in the bulk liquid and consequently to an underestimation of the vapor pressure, as observed in Figure 1 for log P vap values predicted with equation 2.
The most direct way to compensate for the deviations noted above is to apply a functional group-specific adjustment (λ) to the S parameter for these four classes of solutes. There is some precedent for this approach in the use of the "polarization correction" parameter δ by Kamlet et al. to predict octanol/water partition coefficients. 24 Empirically, the best-fit values of λ were found to be +0.26 for aliphatic nitriles, +0.32 for nitro compounds, -0.20 for benzene and alkylbenzenes, and -0.32 for alkylnaphthalenes (with λ = 0 for all other classes of compounds in the data set). For the former two, these λ values are qualitatively in line with the relative order of the magnitudes of the local dipole moments, and the λ values for the latter two follow the order of the polarizabilities of phenyl vs. naphthyl rings. The resulting correlation, given by equation 4, showed a substantial improvement in the F value and in the standard error of the prediction. Moreover, there was a substantial increase in the partial F values and a notable decrease in the standard errors of the coefficients of E and (S + λ) for the whole data set, again consistent with the proposed origin of the deviations for the four classes identified above.
log P vap = 7.86 - 3.54 V - 1.17 E - 1.52 (S + λ) (4)
(n = 315, R² = 0.985, F = 6930, standard error = 0.145)
For compounds having values of both A and B significantly greater than 0, hydrogen bonding is expected to be the dominant type of specific interaction in the neat liquid. As in previous work, 11,12 we included the parameter A × B to model the overall strength of hydrogen-bond interaction in the correlation (equation 5) for a data set of [...]
Inspection of these results again revealed some interesting functional group-specific deviations between predicted and literature values. The predicted values of log P vap for primary alcohols were consistently about 0.37 units too large, while those for primary amines were about 0.37 units too small (Figure 2). Predicted values for secondary alcohols were close to the experimental values, while those for secondary amines were about 0.32 units too small (Figure 3). The simplest explanation for these discrepancies is that the product of the A and B values for a particular solute does not adequately quantify the hydrogen-bonding interactions present in the neat liquid because of pronounced steric effects on self-association. Indeed, steric influences on hydrogen bonding have been reported previously for neat alcohols, phenols, and amines. 26
The most expedient way of correcting an LSER for steric effects on hydrogen bonding in neat liquids is by the inclusion of a scaling factor, η, as an empirical hydrogen-bonding index. Setting the value of η to 2.0 for primary alcohols resulted in best-fit values of η (to the nearest 0.01 unit) of 1.43 for secondary alcohols; 1.27 for tertiary alcohols, phenols, and anilines; 0.61 for primary amines; and 0 for secondary amines. Such values of η for the alcohols are reminiscent of the 2.0 : 1.66 : 0.94 ratios reported for the relative self-association constants for 1-propanol, 2-propanol, and 2-methyl-2-propanol, respectively, in cyclohexane solution. 26 As with those equilibrium constants, the values of η reflect the decrease in net stabilization of clustered alcohol molecules as substitution around the hydroxyl group increases. Inclusion of the η parameter results in equation 6, which is a substantially better correlation than that in equation 5. In addition, the coefficients for the non-hydrogen-bonding interaction terms in equation 6 are identical to those of equation 4, as would be expected if the separation of the hydrogen-bonding and non-hydrogen-bonding interactions described here is correct. The results for all 376 compounds are plotted in Figure 4.
log P vap = 7.86 - 3.54 V - 1.17 E - 1.52 (S + λ) - 3.64 (η × A × B) (6)
(n = 376, R² = 0.986, F = 6524, standard error = 0.148)
Equation 6 implicitly assumes that the attractive intermolecular interactions present in the neat liquid are lost when the liquid vaporizes. This assumption will not be true, and thus equation 6 will not apply, for two categories of compounds. The first includes compounds that are associated in the vapor phase, typically as a result of hydrogen bonding. For example, carboxylic acids have a strong tendency to dimerize in the neat liquid and, for the more volatile aliphatic carboxylic acids, even in the vapor phase. 27 Since solute parameters are available only for monomeric carboxylic acids, values of log P vap predicted for carboxylic acids on the basis of equation 6 would exhibit poor agreement with literature values. The second category of compounds not well described by equation 6 includes compounds such as 2-alkoxyalcohols that exhibit some intramolecular hydrogen-bonding interactions as monomers in the gas phase. 28 In addition, highly polar compounds that show enhanced dipole-dipole attraction similar to that proposed for the nitriles and nitro compounds would require determination of an appropriate λ value before equation 6 could be applied to them.
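Equation 6, together with the λ and η values reported in the text, can be sketched as a small function. The coefficients and the λ and η tables are taken from the text; everything else is an assumption made for illustration: the function name `predict_log_pvap`, the class labels used as dictionary keys, the choice to raise an error when a hydrogen-bonding compound has no tabulated η, and the descriptor values in the two example calls (an alkane with E = S = A = B = 0 and an invented primary alcohol).

```python
# Coefficients of equation 6 (log P vap in Pa at 298 K).
C, V_COEF, E_COEF, S_COEF, HB_COEF = 7.86, -3.54, -1.17, -1.52, -3.64

# Functional group-specific adjustments to S; all other classes use 0.
LAMBDA = {
    "aliphatic nitrile": 0.26,
    "nitro compound": 0.32,
    "alkylbenzene": -0.20,      # benzene and alkylbenzenes
    "alkylnaphthalene": -0.32,
}

# Empirical hydrogen-bonding index for self-associating classes.
ETA = {
    "primary alcohol": 2.0,
    "secondary alcohol": 1.43,
    "tertiary alcohol": 1.27,
    "phenol": 1.27,
    "aniline": 1.27,
    "primary amine": 0.61,
    "secondary amine": 0.0,
}


def predict_log_pvap(V, E, S, A=0.0, B=0.0, compound_class=None):
    """Estimate log P vap (Pa, 298 K) from solute parameters via equation 6."""
    lam = LAMBDA.get(compound_class, 0.0)
    hydrogen_bonding = A * B
    if hydrogen_bonding > 0.0:
        # Design choice of this sketch: refuse to guess eta for an untabulated class.
        if compound_class not in ETA:
            raise ValueError(f"no eta value tabulated for class {compound_class!r}")
        hydrogen_bonding *= ETA[compound_class]
    return (C + V_COEF * V + E_COEF * E
            + S_COEF * (S + lam) + HB_COEF * hydrogen_bonding)


# Assumed descriptors for an alkane such as n-hexane: V = 0.954, E = S = A = B = 0.
print(round(predict_log_pvap(V=0.954, E=0.0, S=0.0), 3))

# Invented descriptors for a hypothetical primary alcohol (illustration only).
print(round(predict_log_pvap(V=0.59, E=0.236, S=0.42, A=0.37, B=0.48,
                             compound_class="primary alcohol"), 3))
```

The function should not be applied to the cases excluded in the text, such as carboxylic acids, 2-alkoxyalcohols, or highly polar compounds for which no λ value has been determined.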
Although equation 6 was derived by using solutes for which the appropriate parameters have been reported by Abraham, it can also be used to estimate vapor pressure from molecular structure. The V parameter can be calculated easily from the number of atoms, bonds, and rings in the molecule, 8 and a neural network method has been used to predict S. 29 Values of E can be calculated from the refractive index or computed from the molar refraction calculated at the sodium D line (589 nm). 30 In addition, several groups have reported multiparametric linear regression and neural network methods for the estimation of the E, S, A, and B parameters directly from molecular structure. 31,32
An alternative approach for structures with a single functional group is to estimate the parameters for the compound from the parameters reported for analogous structures. We did that for an additional 76 compounds for which literature values of log P vap are available but for which solute parameters had not been tabulated. The additional compounds included a cycloalkane, alkenes, alkynes, conjugated dienes, alkyl halides, alkylbenzenes, alkylnaphthalenes, sulfides, disulfides, ethers, aldehydes, ketones, mercaptans, amines, alcohols, and a phenol. 33 As shown in Figure 5, the correlation between predicted and literature values is comparable (R² = 0.990, standard error = 0.156) to that for compounds with known solute parameters. Thus the use of equation 6, along with readily accessible molecular solute parameters, offers a convenient way to estimate the vapor pressure of many organic liquids without recourse to experimentally determined values.
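The structure-based estimation of V and E mentioned above can be illustrated with a short sketch. It assumes the commonly tabulated McGowan atomic increments and bond decrement for V, and the usual Abraham definition of E from the refractive index; none of these constants appear in the present text, so they should be verified against references 8 and 30 before use. The function names and the refractive indices used in the example (about 1.3749 for n-hexane and 1.5011 for benzene) are likewise assumptions.

```python
# Assumed McGowan atomic increments (cm^3/mol) for a few common elements.
ATOM_VOLUME = {
    "C": 16.35, "H": 8.71, "N": 14.39, "O": 12.43,
    "F": 10.48, "Cl": 20.95, "Br": 26.21, "S": 22.91,
}
BOND_DECREMENT = 6.56  # cm^3/mol subtracted per bond, regardless of bond order


def characteristic_volume(atom_counts, n_rings=0):
    """Return V in units of (cm^3/mol)/100 from atom counts and ring count.

    The number of bonds follows from the atoms and rings:
    bonds = atoms - 1 + rings.
    """
    n_atoms = sum(atom_counts.values())
    n_bonds = n_atoms - 1 + n_rings
    total = sum(ATOM_VOLUME[element] * n for element, n in atom_counts.items())
    return (total - BOND_DECREMENT * n_bonds) / 100.0


def excess_molar_refraction(refractive_index, V):
    """Return E from the refractive index and V (assumed Abraham definition).

    The molar refraction term uses the Lorentz-Lorenz function of the
    refractive index; the last two terms subtract the value expected for
    an alkane of the same characteristic volume.
    """
    n2 = refractive_index ** 2
    molar_refraction = 10.0 * (n2 - 1.0) / (n2 + 2.0) * V
    return molar_refraction - 2.83195 * V + 0.52553


v_hexane = characteristic_volume({"C": 6, "H": 14})
v_benzene = characteristic_volume({"C": 6, "H": 6}, n_rings=1)
print(f"n-hexane: V = {v_hexane:.4f}, E = {excess_molar_refraction(1.3749, v_hexane):.3f}")
print(f"benzene:  V = {v_benzene:.4f}, E = {excess_molar_refraction(1.5011, v_benzene):.3f}")
```

Under these assumptions an alkane gives E close to zero, as expected from the definition of E as an excess over an aliphatic hydrocarbon of the same volume, while an aromatic ring gives a distinctly positive value.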

Conclusions
The linear solvation energy relationship method, which was developed to model the properties of a series of solutes in a given solvent system, can be extended to the prediction of physical properties of pure substances. However, it is necessary to demonstrate that the LSER applies equally well to all classes of compounds in the data set by looking for functional group-specific deviations from the overall correlation. Such deviations can be accommodated by the inclusion of additional parameters, but only if the chemical interpretation of such additional parameters is both clear and reasonable, as with the η and λ parameters developed here. This approach led to the development of equation 6,

Figure 1. Correlation of literature values of log P vap and those predicted with equation 2 for nitriles, nitro compounds, alkylbenzenes, and alkylnaphthalenes. The diagonal line represents a perfect correlation of literature and predicted values.

Figure 2. Correlation of literature values of log P vap and those predicted with equation 5 for primary alcohols (Δ) and primary amines. The diagonal line represents a perfect correlation of literature and predicted values.

Figure 3. Correlation of literature values of log P vap and those predicted with equation 5 for secondary alcohols (Δ) and secondary amines. The diagonal line represents a perfect correlation of literature and predicted values.

Figure 4. Correlation of values of log P vap predicted with equation 6 with literature log P vap values for all 376 compounds in the data set. The diagonal line represents a perfect correlation of literature and predicted values.

Figure 5. Correlation of values of log P vap predicted with equation 6 and solute parameters estimated from molecular structure with literature log P vap values for 76 additional compounds. The diagonal line represents a perfect correlation of literature and predicted values.

Table S1. Parameters and log P vap values used to develop the LSER

Table S1 (cont.). Explanation of column headings: Values of P vap are calculated from DIPPR equation 101, using the constants given for a specific compound. The source of the constants is experimental (E), predicted (P), or a combination of the two (EP). The error is the maximum percent error of the resulting P vap values as determined by DIPPR. Predicted log P vap is the value of log P vap calculated using equation 6 and the solute parameters A, B, V, E, S, λ, and η. The sources of the parameters A, B, V, E, and S are given in the text, and the values of λ and η are taken from the manuscript.

Table S2. Solute parameters estimated from molecular structure *
* E values for molecules with a particular functional group tend to decrease with increasing alkyl substitution because the difference between the compound's index of refraction and the index of refraction of an alkane with the same molecular volume tends to decrease as the compound becomes more alkane-like. Thus E values may be interpolated from values of homologous compounds in Table S1 above if a sufficient range of values is available. This was done for the compounds in Table S3 except for the higher molecular weight trialkylamines, for which E values were calculated as indicated in Lima, G. A. R. Ph.D. Thesis, University of São Paulo, Brazil, 2000.

Table S3. Solute parameters and log P vap values used to prepare Figure 5

Table S4. Statistical data for LSER correlations

Table S5. Correlation matrix for solute parameters
The following table shows the cross-correlation of the solute parameters for all 376 compounds in the data set used to generate equation 6: