Local Intersection Volume (LIV) Descriptors: 3D-QSAR Models for PGI2 Receptor Ligands

Prostacyclin I2 inhibits platelet aggregation by interacting with a specific membrane receptor. In this work, we developed 3D-QSAR models for a series of aromatic heterocyclic compounds using the local intersection volume as the descriptor. The models obtained can be applied to the development of new PGI2 receptor ligands with potential platelet anti-aggregating activity.


Introduction
[2][3] Under physiological conditions, prostacyclin is a labile compound of limited clinical use [3], with a half-life of approximately 3 minutes. [5][6][7][8] Based on data from their research, we developed, in this work, 3D-QSAR models for ligands of the IP receptor using the local intersection volume (LIV) descriptor [9], i.e., the intersection volume between the volumes of the compound atoms and the volumes of a set of spheres of defined atom size. These spheres compose a three-dimensional box, in analogy to the Grid method [10]. [12][13] They can be applied to design new PGI2 receptor ligands that could be potential inhibitors of blood platelet aggregation.
Figure 1. ... [4][5][6][7][8] and pharmacophoric sites (S1, S2, S3, and S4) defined according to SAR studies [5,20].
Step 2. Construction of a grid matrix composed of cubic unit cells, where the vertices of the cells correspond to the Cartesian coordinates of the eight carbon atoms. The cell edge length is 1.50 Å (that is, almost equal to the carbon van der Waals radius of 1.54 Å). The grid matrix is composed of a total of 2197 (13 × 13 × 13) carbon atoms, constructed in the Excel® program and imported into the Insight II program [15] (Figure 2, Supplementary Material).
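The grid construction in Step 2 can be sketched in a few lines of Python. This is only an illustration: the origin at (0, 0, 0) and the function name `build_grid` are assumptions, since the text does not state where the box is placed relative to the molecules.

```python
from itertools import product

EDGE = 1.50       # cell edge length in angstroms (Step 2)
N_PER_AXIS = 13   # 13 x 13 x 13 = 2197 grid points

def build_grid(n=N_PER_AXIS, edge=EDGE, origin=(0.0, 0.0, 0.0)):
    """Return the (x, y, z) coordinates of an n x n x n cubic lattice.

    The origin is an assumption; the paper does not give the box position.
    """
    ox, oy, oz = origin
    return [(ox + i * edge, oy + j * edge, oz + k * edge)
            for i, j, k in product(range(n), repeat=3)]

grid = build_grid()
print(len(grid))   # 2197
print(grid[-1])    # (18.0, 18.0, 18.0)
```

Each coordinate triple would correspond to one carbon atom of the grid matrix, and the box spans 12 × 1.50 Å = 18.0 Å along each axis.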
Step 3. The conformational analysis was performed with the systematic search tool available in PC Spartan Pro v.1.0.5 [16] using the MMFF force field [17]. Default options were used, including a maximum of 100 conformations and an energy cutoff of 10 kcal mol⁻¹ from the minimum energy conformation found. The conformations generated were optimized using the AM1 [18] Hamiltonian. We excluded the most similar conformations of each compound according to the root mean square (RMS) deviation, using all-atom superposition in the Search_Compare module available in the Insight II program [15]. From this new set, one conformation was selected for each compound as the one with the lowest RMS value in the alignment with a preselected conformation of PGI2 (see Supplementary Material).
Step 4. In the alignment step, the molecules were inserted into the grid matrix according to the RMS deviation from the selected conformation of PGI2, used as the reference compound [19] (see Step 3). The alignment rules are in accordance with structure-activity relationship (SAR) studies [5,20], in which the pharmacophoric groups are labeled as S1 (carbon atom of the carboxylic acid group), S2 (oxygen atom of the endocyclic ring), S3 (oxygen atom of the hydroxyl group bound at C11), and S4 (hydrophobic chain), as can be seen in Figure 1.
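The RMS-based superposition used in Steps 3 and 4 can be illustrated with the Kabsch algorithm. This is a stand-in, not the Insight II procedure, whose internals are not described here. The function name `kabsch_rmsd` and the four coordinates (meant to represent matched points such as S1 to S4) are hypothetical.

```python
import numpy as np

def kabsch_rmsd(mobile, reference):
    """Superpose `mobile` onto `reference` (both N x 3, matched row by row).

    Returns (rmsd, aligned_coordinates). Kabsch algorithm; an assumption
    standing in for the alignment tool used in the paper.
    """
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(reference, dtype=float)
    p_centroid, q_centroid = P.mean(axis=0), Q.mean(axis=0)
    P0, Q0 = P - p_centroid, Q - q_centroid
    H = P0.T @ Q0                        # 3 x 3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])           # guard against improper rotations
    R = Vt.T @ D @ U.T                   # optimal rotation
    aligned = (R @ P0.T).T + q_centroid
    rmsd = float(np.sqrt(((aligned - Q) ** 2).sum(axis=1).mean()))
    return rmsd, aligned

# Hypothetical reference points and a rotated, translated copy of them.
reference = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0],
                      [1.5, 2.0, 0.0], [0.0, 2.0, 3.0]])
rot_z_90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
mobile = reference @ rot_z_90.T + np.array([5.0, -2.0, 1.0])

rmsd, aligned = kabsch_rmsd(mobile, reference)
print(round(rmsd, 6))
```

For a rigidly moved copy of the reference the RMS deviation is essentially zero; for real conformers it measures how well the pharmacophoric points can be matched.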
Step 5. The volume of each hard sphere of the grid matrix was calculated using a radius of 1.54 Å × 0.65, where 0.65 is the scale factor used to avoid a large overlap among the spheres (although still allowing a small one) and, consequently, to keep the loss of volume among the hard spheres minimal. The 42 compound volumes were calculated using the van der Waals radii without scale factors. Subsequently, the intersection volumes were calculated using the molecular volume of each compound and the volume of each hard sphere that composes the grid matrix. These intersection volumes are labeled as local intersection volumes (LIV) and represent the independent variables (volume descriptors) of the database (DB). All volumes were calculated in the Search_Compare module of the Insight II program [15].
Step 6. The LIV calculation generated a total of 2197 variables (LIV descriptors). In order to exclude noise from the database, we reduced the data so as to generate three databases: A, B, and C. DB-A, with 753 LIVs, was constructed from the original DB by excluding the variables in which LIV equals zero in all molecules; DB-B, with 438 LIVs, was constructed from DB-A by excluding the variables in which LIV is different from zero in three or fewer molecules; and DB-C, with 349 LIVs, was constructed from DB-B by excluding the variables in which LIV is different from zero in five or fewer molecules. In DB-A we exclude useless variables, and in DB-B and DB-C we harmonize the data, removing variables that were not properly represented in the dataset. That meant we did take into account the structural peculiarities of a few compounds.
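The descriptor calculation and pruning of Steps 5 and 6 can be sketched as follows. The volumes in this work were computed in Insight II; the lattice-sampling estimate below is only an assumed substitute that shows what a local intersection volume is. All function names and the example atom are hypothetical.

```python
import math

VDW_CARBON = 1.54                 # radius quoted in the text (angstroms)
SCALE = 0.65                      # scale factor from Step 5
GRID_RADIUS = VDW_CARBON * SCALE  # about 1.00 angstrom

def sphere_volume(radius):
    return 4.0 / 3.0 * math.pi * radius ** 3

def local_intersection_volume(center, atoms, radius=GRID_RADIUS, n=20):
    """Estimate the volume shared by one grid sphere and a molecule.

    `atoms` is a list of ((x, y, z), vdw_radius) pairs. The bounding cube of
    the grid sphere is sampled at the midpoints of an n x n x n lattice.
    """
    cx, cy, cz = center
    step = 2.0 * radius / n
    inside = 0
    for i in range(n):
        x = cx - radius + (i + 0.5) * step
        for j in range(n):
            y = cy - radius + (j + 0.5) * step
            for k in range(n):
                z = cz - radius + (k + 0.5) * step
                if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 > radius ** 2:
                    continue  # sample lies outside the grid sphere
                if any((x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2 <= r * r
                       for (ax, ay, az), r in atoms):
                    inside += 1  # sample lies inside at least one atom
    return inside * step ** 3

def drop_sparse(descriptors, max_nonzero):
    """Keep descriptors that are nonzero in more than `max_nonzero` molecules."""
    return {name: values for name, values in descriptors.items()
            if sum(v != 0 for v in values) > max_nonzero}

# Hypothetical atom 1.0 angstrom away from a grid sphere at the origin.
liv = local_intersection_volume((0.0, 0.0, 0.0), [((1.0, 0.0, 0.0), 1.70)])
print(0.0 < liv < sphere_volume(GRID_RADIUS))  # True
```

With a table mapping each LIV name to its values over all molecules, `drop_sparse(table, 0)`, `drop_sparse(table, 3)`, and `drop_sparse(table, 5)` would mirror the rules given for DB-A, DB-B, and DB-C.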
Step 7. The models were calculated with a combined GA-PLS analysis implemented in the Wolf 6.2 program [11]. We created 400, 100, and 100 randomly generated models for DB-A, DB-B, and DB-C, respectively. Initially, each model contained four independent variables. The mutation operator was set to 100% for each 10-crossover operation. The smoothing factor (the variable that controls the number of independent variables in the models) was set to 0.5, and the maximal number of components for the PLS regression analysis was set to three. We performed 35,000, 40,000, and 20,000 crossover operations for DB-A, DB-B, and DB-C, respectively. All other options were left at their default values.
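To show how a smoothing factor can control model size, the sketch below uses Friedman's lack-of-fit score in the form commonly quoted for genetic function approximation. That this is the exact expression evaluated by Wolf 6.2 is an assumption, as is the normalization of the least-squares error by the number of compounds.

```python
def lack_of_fit(residuals, n_terms, n_features, smoothing=0.5):
    """Friedman-style LOF = LSE / (1 - (c + d * p) / M) ** 2.

    c = n_terms, p = n_features, d = smoothing factor, M = number of compounds.
    LSE is taken here as the mean squared residual (an assumption).
    """
    m = len(residuals)
    lse = sum(r * r for r in residuals) / m
    penalty = 1.0 - (n_terms + smoothing * n_features) / m
    if penalty <= 0.0:
        raise ValueError("model is too large for the number of compounds")
    return lse / penalty ** 2

# Same fit quality, different model sizes (illustrative residuals only).
residuals = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1] * 3
small = lack_of_fit(residuals, n_terms=4, n_features=4)
large = lack_of_fit(residuals, n_terms=6, n_features=6)
print(small < large)  # True
```

For equal residuals the larger model receives the worse (higher) score, so a model gains from an extra variable only if it reduces the error enough to offset the penalty.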
Step 8. The ten best 3D-QSAR models, as scored by the "lack-of-fit" (LOF) measure [13] from the GA-PLS analysis, were evaluated by the "leave-one-out" (LOO) cross-validation procedure [13] using the Wolf 6.2 program [11]. The test set was used only for the external validation process.
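The leave-one-out procedure of Step 8 can be sketched as below. Ordinary least squares is used here in place of PLS to keep the example short, so this is not the regression performed by Wolf 6.2. The data are synthetic and the name `loo_q2` is hypothetical.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / SS_total.

    Each compound is predicted by an ordinary least-squares model fitted to
    the remaining compounds (OLS is an assumption standing in for PLS).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])   # intercept plus descriptors
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2
    ss_total = ((y - y.mean()) ** 2).sum()
    return 1.0 - press / ss_total

# Synthetic example: 20 "compounds", 2 descriptors, noisy linear activity.
rng = np.random.default_rng(0)
X = rng.random((20, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(0.0, 0.1, 20)
print(loo_q2(X, y) <= 1.0)  # True
```

Because every prediction comes from a model that never saw the predicted compound, Q² is at most 1 and is usually lower than the fitted R² of the same model.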

Results and Discussion
Models 1, 2, and 3, described below, correspond to the best models from DB-A, DB-B, and DB-C, respectively. These models are composed of six independent variables (LIVs), with the squared linear correlation coefficient (R²) varying from 0.86 to 0.92 and the cross-validated R² (Q²), which measures predictive capacity, varying from 0.73 to 0.84. We have chosen Model 2 as the best model for two reasons: it is statistically better, presenting the highest Q² value (Q² = 0.84) and only two outlier compounds (Figure 3, Supplementary Material); and it is more comprehensive in a mechanistic sense, due to the pharmacophoric groups that could be correlated to the selected LIVs (see the subsequent discussion of Model 2). Model 1 has four outlier compounds, and Model 3, although it also has only two outlier compounds, has the lowest Q² value (Q² = 0.73). Table 2 shows the compound numbering and the corresponding experimental, calculated (training set), and predicted (test set) pIC50 values and residuals, plus the standard deviation of residues for Models 1, 2, and 3.
The graphic representation of the maximum LIVs for Model 2 is shown in Figure 4. We used compound 2, the most active compound, as a template. Therefore, this figure does not represent the LIV values for compound 2. LIV_1504 represents a positive contribution at the S1 site; however, due to the relative conformational freedom of the carboxylic acid chain, not all compounds occupy this LIV. There is no correlation between any LIV descriptor and the pharmacophoric S2 and S3 sites in Model 2. LIV_1466 has a positive contribution to the activity, even though it does not correlate to any pharmacophoric site. In Figure 4, LIV_1466 looks as if it were close to the S2 site, but it is not. This results from a three-dimensional figure being represented in two dimensions.
LIV_434 and LIV_603 correspond to positive and negative contributions, respectively, at the S4b site; both are located around the meta position of the phenyl ring. The contribution degree of these LIVs depends on the relative orientation between this phenyl ring and the heterocyclic B ring. The more perpendicular this orientation is, the greater the positive contribution will be; on the other hand, the more coplanar it is, the greater the negative contribution will be. In an intermediate orientation, both contributions will be observed. LIV_541 and LIV_725 correspond to positive and negative contributions, respectively, at the S4a site. LIV_541 is located near the meta position of the phenyl ring, and LIV_725 is located near the para position of the same phenyl ring, which is bound to the heterocyclic B ring. Again, the contribution degree of these LIVs depends on the relative orientation between this phenyl ring and the heterocyclic B ring. The more coplanar this orientation is, the greater the positive contribution (LIV_541) will be.
For compound 2, X-ray crystallographic studies by Meanwell and co-workers [7] describe a coplanar arrangement among the heterocyclic A and B rings and the phenyl ring (bound at the C5 position of the B ring). According to these studies, when the coplanarity between the heterocyclic B ring and the phenyl ring is reduced (as in compound 8), the activity decreases. These observations corroborate the results described by LIV_541 in Model 2.
As we can see in Table 2, there are two outlier compounds for Model 2: compounds 4 and 39. Compound 4 is an unexpected outlier, since the only difference between it and compound 2 is a methyl group at the C2 position of its heterocyclic A ring, and since both have similar activities. The residual value for compound 4 is -3.24 (calculated minus experimental pIC50); that is, the predicted activity is lower than the experimental one. The difference is that LIV_725 (with a negative contribution) is 3.75 for compound 4 and 0.59 for compound 2.
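The residual convention above (calculated minus experimental pIC50) can be written out explicitly. The rule used to call a compound an outlier is not stated in this section, so the two-standard-deviation threshold below is an assumption, and the numbers are invented for illustration and are not taken from Table 2.

```python
import statistics

def flag_outliers(experimental, calculated, n_sd=2.0):
    """Return (residuals, outlier_ids) with residual = calculated - experimental.

    A compound is flagged when |residual| exceeds n_sd sample standard
    deviations of the residuals (threshold chosen for illustration only).
    """
    residuals = {cid: calculated[cid] - experimental[cid] for cid in experimental}
    sd = statistics.stdev(list(residuals.values()))
    outliers = [cid for cid, r in residuals.items() if abs(r) > n_sd * sd]
    return residuals, outliers

# Invented pIC50 values, for illustration only.
experimental = {"a": 6.0, "b": 5.5, "c": 7.0, "d": 4.0,
                "e": 6.5, "f": 5.0, "g": 6.2, "h": 5.8}
calculated = {"a": 6.1, "b": 5.3, "c": 7.05, "d": 7.0,
              "e": 6.4, "f": 5.15, "g": 6.15, "h": 5.8}

residuals, outliers = flag_outliers(experimental, calculated)
print(outliers)  # ['d']
```

A negative residual, as reported for compound 4, means the model underestimates the activity; a positive one means it overestimates it.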
Observing Figure 5, we can see that what causes the difference between the conformations of the two compounds is the orientation of the carboxylic acid chain. Therefore, as the alignment procedure uses the C1 atom of this group, the overall relative orientation between these compounds is slightly modified. This outlier behavior of compound 4 hinted that, even though this conformation was selected using the same criteria used for the other compounds in this work (see Methods, Step 4), it was not the best possible one to describe the observed activity. In order to verify this hypothesis, we used another conformation of compound 4 and obtained a smaller residual value; in fact, with this new conformation, compound 4 stopped behaving as an outlier (data not shown).
Compound 39 has a residual value of 4.31 (Table 2); that is, the predicted activity is greater than the experimental one, due to the large positive contributions of LIV_1466 and LIV_1504 (Figure 6, Supplementary Material). Compound 2, like most of the other analyzed compounds, has the heterocyclic A ring linking the heterocyclic B ring to the phenoxy ring, with these two in a pseudo-cis arrangement. On the other hand, compound 39 has a single bond linking the heterocyclic B ring to the phenoxy ring. Hence, due to this higher conformational freedom, the relative orientation between these rings can be antiperiplanar or synclinal, the synclinal conformation corresponding to the pseudo-cis arrangement. The selected conformation of compound 39 (Figure 6) is synclinal (the torsion angle is equal to 74.45°), which is better for the activity than the antiperiplanar one, as may be seen by comparing the activities of compounds 9 (cis isomer) and 36 (trans isomer) (Table 1). In our study, compound 39 behaves as an outlier because of the synclinal conformation, while in the biological medium the antiperiplanar conformation is probably the predominant one. This might also explain why other analogs of compound 39 with predominantly antiperiplanar conformations are not outliers.
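The synclinal and antiperiplanar labels follow from the torsion angle about the linking bond. The sketch below computes a torsion angle from four atomic positions and classifies it with the usual Klyne-Prelog ranges. The coordinates are hypothetical and are not those of compound 39.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(p1, p2, p3, p4):
    """Signed torsion angle p1-p2-p3-p4 in degrees (range -180 to 180)."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _dot(b2, _cross(n1, n2)) / math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def klyne_prelog(angle_deg):
    """Classify a torsion angle by its absolute value."""
    a = abs(angle_deg)
    if a <= 30.0:
        return "synperiplanar"
    if a <= 90.0:
        return "synclinal"
    if a <= 150.0:
        return "anticlinal"
    return "antiperiplanar"

# Hypothetical four-atom fragment with a 90 degree twist about the z axis.
angle = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(klyne_prelog(angle))   # synclinal
print(klyne_prelog(74.45))   # synclinal
```

Under this classification the 74.45° torsion reported for the selected conformation falls in the synclinal range, while a trans-like arrangement near 180° is antiperiplanar.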

Conclusions
In this work, we developed 3D-QSAR models for ligands of the IP receptor using the local intersection volume (LIV) descriptor [9]. The LIV is the intersection volume between the volumes of the compound atoms and the volumes of a set of spheres of defined atom size, which compose a three-dimensional box. We obtained three LIV-3D-QSAR models by genetic algorithm (GA) and partial least squares (PLS) methods [11][12][13], namely Model 1, Model 2, and Model 3, corresponding to the best models from databases A, B, and C, respectively.
Model 2 was chosen as the best model, since it has the highest Q² value (Q² = 0.84) and only two outlier compounds, and also because, in a mechanistic sense, it is more comprehensive, due to the pharmacophoric groups that could be correlated to the selected LIVs. Among the LIVs selected in Model 2, we can distinguish four with positive and two with negative contributions to the biological activity. LIV_434, LIV_541, LIV_603, and LIV_725 are correlated to the pharmacophoric S4 site, and LIV_1504 to the S1 site. There is no correlation between any LIV descriptor and the pharmacophoric S2 and S3 sites. LIV_1466 does not correlate to any pharmacophoric site. Compounds 4 and 39 are outliers for Model 2, probably because the selected conformations were not appropriate to describe the observed activity.
In order to verify this hypothesis, we used a different conformation for compound 4 and obtained a smaller residual value for it. Thus, with this new conformation, compound 4 is no longer an outlier in Model 2.
We may thus conclude that, independently of further analysis, Model 2 can be applied to design new PGI2 receptor ligands with potential platelet anti-aggregating activity.

Figure 4. Graphical representation of the LIV-3D-QSAR Model 2 using compound 2 as reference. LIV_434, LIV_541, LIV_1466, and LIV_1504 correspond to positive (gray) contributions, and LIV_603 and LIV_725 correspond to negative (dark gray) contributions to the activity. The LIV descriptors are represented in their maximum size. The nitrogen and oxygen atoms of the heterocyclic rings and the phenoxy group are explicitly shown.

Figure 5. Graphical representation of the LIV-3D-QSAR Model 2 for compound 4. LIV_434 and LIV_1504 correspond to positive (gray) contributions, and LIV_603 and LIV_725 correspond to negative (dark gray) contributions to the activity. The nitrogen and oxygen atoms of the heterocyclic rings and the phenoxy group are explicitly shown.

Table 2. Experimental and calculated pIC50 values, residual values, and standard deviation of residues of Models 1, 2, and 3
a Compounds from the test set are underlined. b Standard deviation of residues for the entire group (training set and test set). c Standard deviation of residues for the training set.