2D QSAR Studies on a Series of Bifonazole Derivatives with Antifungal Activity

Candida albicans (CA) is considered the main opportunistic pathogen in immunosuppressed patients. Most of the drugs available for treating resistant strains are either highly toxic or ineffective. One way to improve this scenario would be to modify the structure of azole derivatives so as to increase potency and selectivity. To clarify which chemical and structural properties are important for the antifungal activity of azole derivatives, classical 2D QSAR and hologram QSAR (HQSAR) studies were carried out for a diverse set of 52 bifonazole derivatives with antifungal activity. The topological descriptors used in the classical 2D QSAR studies gave models with low internal consistency (r² = 0.38, q² = 0.27) and no predictive power (r²pred = −0.6). In contrast, the use of molecular holograms allowed the construction of robust HQSAR models (r² = 0.92, q² = 0.65) with good predictive power (r²pred = 0.79).


Introduction
During the past two decades, the incidence of invasive and systemic fungal infections has increased dramatically.[3][4][5] Although Candida albicans (CA) has been identified as the major opportunistic pathogen in many fungal infections, the number of infections caused by other Candida species has been increasing. 1 Currently available drugs to treat these infections include azoles (such as fluconazole, ketoconazole, and itraconazole), polyenes (such as amphotericin B and nystatin), echinocandins (such as caspofungin and micafungin), and allylamines (such as naftifine and terbinafine) (Figure 1). Among them, azole derivatives are the drugs most commonly used against fungal infections. 2 The azoles are fungistatic compounds that inhibit lanosterol C-14 demethylase, a key enzyme in the sterol biosynthesis pathway, leading to the accumulation of C-14 methyl sterols that alter normal membrane function. 6,7 Nevertheless, some antifungal drugs are either highly toxic (e.g., amphotericin B, AMB) or are becoming ineffective against resistant strains that mostly affect hospitalized patients (e.g., flucytosine and azoles). 3 In fact, azole resistance is a major concern in the long-course treatment of AIDS patients. The causes of resistance are generally associated with mutations in lanosterol 14α-demethylase that reduce azole binding, and with decreased intracellular drug accumulation due to increased expression of efflux pump genes. 4 Moreover, long-term treatment may also cause hepatotoxicity, as azole derivatives can also interact with mammalian cytochrome P450 enzymes. 8

In order to improve antifungal potency and selectivity, efforts have been made to synthesize new classes of antifungal agents or to modify the structures of azole drugs that remain effective. 5 Indeed, several three-dimensional quantitative structure-activity relationship (3D QSAR) studies have been reported for different datasets of azole derivatives; 2,6,9-12 however, this strategy afforded models with only moderate robustness and local predictive ability. For instance, Di Santo and co-workers 1 report that their CoMFA model was contradicted by the synthesis and evaluation of novel compounds. This result showed that the predictive ability of their model was restricted and that it should not be used to guide the development of novel azole derivatives. One may argue that alignment rules based on energy-minimized or homology-model-driven conformations contribute significantly to this outcome.
Pharmacophore models have also been developed by docking several antifungal agents to 3D homology models of CYP51. 2 Although important information regarding selectivity towards candidosis or aspergillosis was made available, that study did not focus on molecular modifications that might increase potency or overcome resistance. One of the major limitations of the previous studies is the lack of accurate 3D structural information for C. albicans CYP51. An initial step towards overcoming this limitation was the determination of the crystal structure of Mycobacterium tuberculosis sterol 14α-demethylase (MTCYP51) in complex with two azole inhibitors (4-phenylimidazole and fluconazole). 9 Yet the bacterial source of this enzyme still poses a problem for the development of 3D QSAR (e.g., comparative molecular field analysis, CoMFA) and 3D pharmacophore models, as these methods are highly dependent on molecular alignment. 13 Alternatively, we resorted to 2D QSAR approaches that require no explicit 3D information for the ligands (e.g., putative binding conformations and molecular alignment), employing both classical and fragment-based hologram QSAR (HQSAR) methods. 14,15 HQSAR is an important drug design tool that encodes useful fragment-based information about molecular structures. Nevertheless, a more robust QSAR analysis can be carried out when molecular properties (e.g., physicochemical parameters) are also accounted for. 16 Such an approach highlights the complementary nature of classical and HQSAR methods. Moreover, to the best of our knowledge, no HQSAR investigation of this class of antifungal compounds has been reported to date. The results of modeling this data set are reported herein.

Data set
The data set used for the QSAR studies contains 52 bifonazole derivatives with antifungal activity that were selected from the literature. 17,18 The biological property of this data set is reported as MIC values, i.e., the antifungal concentration required to substantially inhibit the growth of the organism (C. albicans). As most azoles are fungistatic, breakpoints cannot be clearly defined, since residual growth persists at all concentrations above the MIC (National Committee for Clinical Laboratory Standards-CLSI M27-A2 document). For this reason, the concentration giving 90% growth inhibition (MIC90) was considered an accurate measure of antifungal activity. To make comparison with the reference compound and with other results from the literature 2,6,9-12 easier, the same ratio employed in previous papers (MIC90/MIC90bifonazole) was used to derive the 2D QSAR models.
The structures and corresponding MIC90/MIC90bifonazole values for the whole set of inhibitors are included in Table 1. The MIC90/MIC90bifonazole values were converted to pMIC90/MIC90bifonazole (−log MIC90/MIC90bifonazole) values and used as dependent variables in the QSAR analyses.
The chemical structures were drawn in 2D format and converted to 3D using the Sybyl 7.3 platform (Tripos Inc., St. Louis, USA). All structures were single-point optimized using the AM1 semi-empirical method. A hierarchical cluster analysis, carried out with the Pirouette 3.11 software (Infometrix, Washington, USA) using the complete linkage clustering method (Euclidean distances) and data autoscaling, guided the division of the complete data set into training (compounds 1-43, Table 1) and test (compounds 44-52, Table 1) sets.
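The conversion of the activity ratio to its negative logarithm can be illustrated with a minimal sketch. The function name and the numeric values are hypothetical (they are not taken from Table 1), and a base-10 logarithm is assumed, as is customary for pMIC-type scales.

```python
import math

def p_relative_mic(mic90, mic90_reference):
    """Return -log10(MIC90 / MIC90 of the reference compound).

    Assumes both values are positive and expressed in the same unit.
    """
    if mic90 <= 0 or mic90_reference <= 0:
        raise ValueError("MIC values must be positive")
    return -math.log10(mic90 / mic90_reference)

# Hypothetical MIC90 values, for illustration only.
reference = 2.0
for mic in (0.2, 2.0, 20.0):
    print(mic, round(p_relative_mic(mic, reference), 2))
```

On this scale a compound more potent than the reference (lower MIC90) has a positive value, and a less potent one has a negative value.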
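As a rough illustration of the clustering step, the sketch below autoscales a small descriptor matrix and applies agglomerative clustering with complete linkage on Euclidean distances. It is a plain-Python stand-in, not the Pirouette implementation: the data are hypothetical, and the procedure stops at a requested number of clusters rather than at a similarity threshold.

```python
import math
from statistics import mean, stdev

def autoscale(rows):
    """Mean-center each column and scale it to unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # A constant column has zero deviation; divide by 1.0 to avoid a zero division.
    stds = [stdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def complete_linkage(rows, n_clusters):
    """Merge clusters until n_clusters remain.

    The distance between two clusters is the largest Euclidean distance
    between any pair of their members (complete linkage).
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(math.dist(rows[a], rows[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical two-descriptor data for four compounds.
rows = [[1.0, 10.0], [1.1, 11.0], [5.0, 50.0], [5.2, 49.0]]
print(complete_linkage(autoscale(rows), n_clusters=2))
```

Autoscaling matters here because Euclidean distances would otherwise be dominated by whichever descriptor has the largest numerical range.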

Classical QSAR studies
Classical QSAR studies require the calculation and selection of suitable descriptors. The following software was employed for this task: DRAGON 5.4 (Talete SRL, Milan, Italy) and BUILDQSAR. 19 Briefly, 2D molecular descriptors, including topological descriptors, connectivity indices, 2D autocorrelation descriptors, and Burden eigenvalue indices, among others, were computed with DRAGON 5.4 and used as independent variables in the QSAR analyses. A total of 929 molecular descriptors were calculated. Descriptors with zero variance or with poor correlation to biological activity (r² < 0.10) were discarded. Then, the BUILDQSAR software was employed to search systematically for multiple linear regression (MLR) models of up to 4 variables with r² > 0.81. All descriptors present in the MLR models were pooled together, autoscaled and used for the partial least squares (PLS) analysis performed with PIROUETTE 3.11.
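The first descriptor-reduction step (discarding descriptors with zero variance or with r² < 0.10 against activity) can be sketched as follows. This is an illustrative plain-Python filter, not the DRAGON or BUILDQSAR procedure; the descriptor names and values are hypothetical.

```python
from statistics import mean, pvariance

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def filter_descriptors(descriptors, activity, r2_min=0.10):
    """Keep descriptors with non-zero variance and r^2 >= r2_min against activity."""
    kept = {}
    for name, values in descriptors.items():
        if pvariance(values) == 0:
            continue  # constant descriptor carries no information
        if r_squared(values, activity) < r2_min:
            continue  # too weakly correlated with the dependent variable
        kept[name] = values
    return kept

# Hypothetical activities and descriptors for four compounds.
activity = [0.1, 0.4, 0.9, 1.3]
descriptors = {
    "const": [2.0, 2.0, 2.0, 2.0],
    "trend": [1.0, 2.0, 3.0, 4.0],
    "noise": [1.0, -1.0, -1.0, 1.0],
}
print(sorted(filter_descriptors(descriptors, activity)))
```

A univariate filter of this kind is cheap, but it can discard descriptors that are informative only in combination with others.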

HQSAR analysis
The HQSAR modeling analyses, calculations and visualizations were performed using the SYBYL 7.3 package (Tripos Inc., St. Louis, USA) running on Red Hat Enterprise IV workstations. HQSAR models can be affected by a number of parameters concerning hologram generation: hologram length, fragment size and fragment distinction. Molecular holograms were generated using several combinations of the following fragment distinctions: atoms (A), bonds (B), connections (C), hydrogen atoms (H), chirality (Ch), and donor/acceptor (DA). The influence of fragment size, which controls the minimum and maximum length of the fragments included in the hologram, was further investigated for the model with the best q², using 5 distinct fragment sizes over the 12 default hologram lengths, ranging from 53 to 401 bins. The patterns of fragment counts from the training set inhibitors were then related to the experimental biological data using PLS analysis.
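To make the roles of fragment size and hologram length concrete, the toy sketch below builds a fixed-length count vector from the fragments of a linear chain of atoms. It is only an illustration under strong simplifications: SYBYL enumerates linear, branched and overlapping fragments of the full molecular graph and applies the chosen fragment distinctions, whereas this code handles a single chain of atom symbols. The function names, the example chain, and the use of CRC32 as the hash are assumptions made for this example.

```python
import zlib

def chain_fragments(atoms, min_size=4, max_size=7):
    """Yield every contiguous run of min_size..max_size atoms from a linear chain."""
    for size in range(min_size, max_size + 1):
        for start in range(len(atoms) - size + 1):
            yield "-".join(atoms[start:start + size])

def hologram(atoms, length=53, min_size=4, max_size=7):
    """Return a list of `length` bins holding fragment counts.

    Each fragment is hashed to a bin index; distinct fragments can
    collide in the same bin, which is why hologram length is a
    parameter worth varying.
    """
    bins = [0] * length
    for fragment in chain_fragments(atoms, min_size, max_size):
        index = zlib.crc32(fragment.encode("utf-8")) % length
        bins[index] += 1
    return bins

# Hypothetical 8-atom chain, not a compound from the data set.
atoms = ["C", "C", "N", "C", "C", "O", "C", "C"]
h = hologram(atoms, length=53)
print(len(h), sum(h))
```

The bin counts of all training set molecules, stacked row by row, would form the matrix of independent variables passed to the regression step.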

QSAR model validation
All QSAR models were investigated using full cross-validated r² (q²) PLS. Leave-one-out (LOO) cross-validation was applied to determine the number of principal components that yields optimally predictive models. External validation was performed with a test set of 10 compounds, which were not considered for QSAR model development. The predictive ability of the models is expressed by the predictive r² value (r²pred), calculated as follows (equation 1):

$$r^2_{pred} = \frac{SD - PRESS}{SD} \qquad (1)$$

where SD is the sum of squared deviations between the biological activities of the test set molecules and the mean activity of the training set molecules, and PRESS is the sum of squared deviations between the observed and the predicted activities of the test set molecules. 20
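The predictive r² can be computed directly from the definitions of SD and PRESS given above. The sketch below is a minimal plain-Python version; the function name and the numeric values are hypothetical and serve only to show how the statistic behaves.

```python
from statistics import mean

def r2_pred(train_observed, test_observed, test_predicted):
    """Predictive r^2 = (SD - PRESS) / SD.

    SD:    squared deviations of test activities from the training set mean.
    PRESS: squared deviations between observed and predicted test activities.
    """
    train_mean = mean(train_observed)
    sd = sum((y - train_mean) ** 2 for y in test_observed)
    press = sum((y - p) ** 2 for y, p in zip(test_observed, test_predicted))
    if sd == 0:
        raise ValueError("SD is zero; r2_pred is undefined")
    return (sd - press) / sd

# Hypothetical activities, for illustration only.
train = [0.0, 1.0, 2.0]
test = [0.0, 2.0]
print(r2_pred(train, test, [0.0, 2.0]))  # perfect predictions
print(r2_pred(train, test, [1.0, 1.0]))  # predicting the training mean
print(r2_pred(train, test, [2.0, 0.0]))  # worse than the training mean
```

A negative value therefore means that the model predicts the test set worse than simply using the mean activity of the training set.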

Fisher's weight
The Fisher weight is a measure of the distance between two categories. It is given by the squared difference between the mean values of the two categories divided by the sum of their variances, and can be interpreted as a normalized distance between the classes. 21 The Fisher weight is defined as:

$$F_p = \frac{(\bar{x}_{p,1} - \bar{x}_{p,2})^2}{S_{p,1}^2 + S_{p,2}^2} \qquad (2)$$

where $\bar{x}_{p,1}$ and $\bar{x}_{p,2}$ denote the average values of descriptor p in class 1 and class 2, respectively, and $S_{p,1}$ and $S_{p,2}$ denote the standard deviations of descriptor p in class 1 and class 2, respectively. 22 This value was calculated for every descriptor of the amino- and nitro-substituted compounds, in order to identify the descriptors that best differentiate between them.
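A per-descriptor Fisher weight can be sketched in a few lines. The version below assumes the commonly used form in which the difference between the class means is squared before division by the sum of the class variances, and it uses sample variances; both choices are assumptions of this example, and the descriptor values are hypothetical.

```python
from statistics import mean, variance

def fisher_weight(class1, class2):
    """Squared difference of the class means over the sum of the class variances."""
    numerator = (mean(class1) - mean(class2)) ** 2
    denominator = variance(class1) + variance(class2)
    if denominator == 0:
        raise ValueError("both classes have zero variance")
    return numerator / denominator

# Hypothetical values of two descriptors in two classes of three compounds each.
well_separated = fisher_weight([1.0, 2.0, 3.0], [5.0, 6.0, 7.0])
overlapping = fisher_weight([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(well_separated, overlapping)
```

Ranking descriptors by this value puts first those whose class means differ most relative to the spread within each class.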

Chemical and biological data
Classical QSAR and HQSAR models were derived for a series of bifonazole derivatives with antifungal activity (Table 1). An initial exploratory analysis was carried out by hierarchical cluster analysis (HCA), available in PIROUETTE 3.11, using the complete linkage clustering method (Euclidean distances) and data autoscaling. The cluster analysis shows 8 distinct clusters at 50% similarity, suggesting a reasonable structural diversity of the data set. The generation of consistent statistical models depends on the adequacy of the training and test sets. Therefore, molecules from each cluster were randomly assigned to either the training set (compounds 1-43, Table 1) or the test set (compounds 44-52, Table 1), so that structurally diverse molecules with a wide range of activities were used for model generation, whereas the 10 inhibitors from the test set were employed for external validation.
The in vitro MIC90/MIC90bifonazole values employed in this work were measured under the same experimental conditions, 17,18 a fundamental requirement for QSAR studies. 15,23 Taken together, these two aspects indicate that this data set is suitable for QSAR modeling.
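The cluster-guided random assignment can be sketched as a stratified split: a fixed fraction of each cluster is drawn at random for the test set. The function name, the test fraction, the seed and the example clusters are all hypothetical; the source does not state how many compounds were drawn from each cluster.

```python
import random

def split_by_cluster(clusters, test_fraction=0.2, seed=0):
    """Randomly assign a fraction of each cluster to the test set.

    `clusters` is a list of lists of compound identifiers. Small clusters
    may contribute no test compound because the count is rounded.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for members in clusters:
        shuffled = list(members)
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return sorted(train), sorted(test)

# Hypothetical clusters of compound numbers.
clusters = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
train, test = split_by_cluster(clusters, test_fraction=0.2, seed=1)
print(train, test)
```

Drawing from every cluster keeps the test set inside the structural space covered by the training set, which is what makes the external validation meaningful.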

Classical QSAR analysis
Classical 2D QSAR studies require the calculation of a variety of molecular descriptors (e.g., connectivity indices, 2D autocorrelation descriptors, Burden eigenvalues) that are used as independent variables in QSAR modeling. The DRAGON 5.4 software was used to generate the descriptors for model development. The descriptors were then selected as follows. In order to reduce the number of descriptors, the BUILDQSAR software was employed to search systematically for MLR models of up to 4 variables with correlation coefficients r² > 0.81. All descriptors present in the MLR models (9) were pooled together, autoscaled and then explored using more robust statistical methods such as PCA and PLS, as implemented in the PIROUETTE 3.11 software (Table 2).
The PCA results show that the first principal component accounts for 46.7% of the total variance, while PC2 and PC3 account for 25.09% and 5.52%, respectively. Additional components make insignificant contributions and were not considered further. The first PC broadly accounts for potency: the less potent inhibitors have positive PC1 values, whereas the most potent ones display negative values. These preliminary results prompted us to use the selected descriptors for QSAR modeling studies using PLS. Unfortunately, all models show poor statistical parameters (best model: r² = 0.38, q² = 0.27) and no predictive power at all (r²pred = −0.6) (Table 3). This suggests that the selected descriptors could not capture the chemical features that are important for the biological activity of azole derivatives. As pMIC90/MIC90bifonazole values are influenced by both pharmacokinetic and pharmacodynamic events, and 2D descriptors account for whole-molecule properties, it is difficult to identify which phenomenon needs a better description.
As an alternative to gain further insight into the fragment-based structure-activity relationships for this series of bifonazole derivatives with antifungal activity, we resorted to another QSAR approach, Hologram QSAR.

HQSAR analysis
HQSAR relates biological activity to structural fragments. Basically, this analysis involves three main steps: 1. the generation of structural fragments for each azole derivative of the training set; 2. the encoding of these fragments into a molecular hologram; 3. the statistical generation of PLS QSAR models. 15 In our studies, the influence of three parameters (fragment distinction, fragment size, and hologram length, HL) on the statistical values of the models was investigated. Thus, several combinations of fragment distinction were considered during the QSAR modeling runs using the default fragment size (4-7), as follows: ABC, ABCH, ABCChH, ABCChDAH, ABH, ABDA, ABCDA, ABHDA, ABCHDA, ABHChDA, ABCChDA, and ABCCh. HQSAR analysis was performed over the 12 default hologram lengths of 53, 59, 61, 71, 83, 97, 151, 199, 257, 307, 353, and 401 bins. The statistical results from the PLS analyses for the 42 training set inhibitors are summarized in Table 4.
According to the HQSAR analysis, donor/acceptor atoms (model 8) or hydrogen atoms (model 2) add no information to the default fragment distinction (model 1). On the other hand, chirality seems to play an important role in antifungal activity (compare models 6 and 1). Furthermore, a small improvement was achieved by including donor/acceptor atoms in the fragment distinction (model 12). Interestingly, when hydrogen atoms were considered in addition to chirality and donor/acceptor atoms, a further improvement was observed (model 4). This result is unexpected, as previous studies have shown that stereochemical features do not significantly influence the antifungal activity of azole derivatives. 18,24,25 Nevertheless, lack of stereoselectivity does not necessarily indicate that the stereochemistry of the inhibitors is irrelevant to ligand binding in the active site of the putative fungal target; instead, it might be related to other phenomena such as cell uptake or different cellular localization. Hence, the importance of chirality in the best HQSAR model might be a consequence of structural features that are important for the interaction of azole derivatives with the lanosterol 14α-demethylase binding site. The influence of different fragment sizes, which control the minimum and maximum length of the fragments included in the hologram, was further investigated as shown in Table 5, but no improvement was achieved.
As the molecular structure encoded within the molecular hologram is directly related to the anti-Candida activity of the training set compounds, the HQSAR model should be able to predict the activity of new, related molecules. Thus, the predictive ability of the model was assessed using the same test set compounds employed in the classical 2D QSAR studies (compounds 43-52, Table 1). The results of the external validation for the best predictive model (model 4, r²pred = 0.79) are displayed in Table 3, and the experimental versus predicted values for both training and test sets are plotted in Figure 2. As can be seen, the predicted values are in good agreement with the experimental pMIC90/MIC90bifonazole values, deviating by no more than 0.56 log units. Therefore, the model presents good correlative and predictive abilities.
Besides predicting the property of interest for untested molecules, HQSAR models should also provide hints about the relationships between different molecular fragments and biological activity. 14,15 HQSAR models can be graphically represented in the form of contribution maps, where the color of each molecular fragment reflects the contribution of an atom or a small number of atoms to the activity of the molecule under study. The colors at the red end of the spectrum (e.g., red, red-orange, and orange) reflect poor (or negative) contributions, while colors at the green end (e.g., yellow, green-blue, and green) reflect favorable (positive) contributions. Atoms with intermediate contributions are colored white.
The HQSAR contribution map (Figure 3) shows that compounds ortho-substituted (compare 45 and 1) or para-substituted (compare 10 and 1) at the benzyl ring next to the pyrrole ring have good activity against C. albicans. Another interesting feature highlighted in the contribution maps is the opposite effect of nitro and amine moieties on potency (e.g., 3 and 4). Unfortunately, HQSAR affords no plain explanation for the different contributions of the fragments towards potency. The synergistic use of classical QSAR and HQSAR could circumvent this sort of limitation, 16 but the poor correlation of the classical QSAR models prevents this approach. Instead, topological descriptors computed in the early steps of this work were employed to shed some light on this subject. A subset of compounds bearing either amine (4, 8 and 14) or nitro (3, 24 and 49) moieties at equivalent positions was investigated in the search for descriptors that could discriminate between them. The assumption was that such a descriptor would somehow explain the different inhibitory profiles of amine- and nitro-substituted compounds. To accomplish this task, a low-dimensional classification rule (Fisher's weight) was employed. As seen in Table 6, two descriptors are important for class discrimination: the Zagreb index by valence vertex degrees (ZM1V), which accounts for molecular branching in hydrogen-depleted structures, 26 and the maximum negative intrinsic state difference in the molecules (MAXDN), which can be related to the nucleophilicity of the molecules. 27 The greater value of MAXDN suggests that it is the most appropriate descriptor to separate amine-substituted compounds from nitro-substituted ones. Therefore, it is tempting to assume that the better anti-Candida activity of nitro-substituted compounds is correlated with selective binding to electrophilic residues in the active site of lanosterol 14α-demethylase. This hypothesis sheds new light on the previous work of Artico and co-workers, 17 which shows that lipophilicity (logP) is responsible for the different potency of nitro- and amine-substituted compounds. This apparent contradiction can be reconciled if one considers that the potency of azole derivatives depends on pharmacokinetic (logP) as well as pharmacodynamic (MAXDN) factors.

Conclusions
Although classical QSAR models were unable to describe the anti-Candida activity of azole derivatives, fragment-based hologram QSAR succeeded in this task. The good correlation between experimental and predicted anti-Candida activity for the 10 test set compounds, together with the new insight into the importance of nucleophilicity for azole potency, further highlights the value of the HQSAR models constructed here. Moreover, the HQSAR contribution maps provide information about the importance of the benzyl substitution pattern for potency. The combined use of these results should be useful for developing selective and more potent azole derivatives.

Figure 1. Representative examples of currently available drugs used in the treatment of fungal infections.

Table 1. Chemical structures and corresponding MIC90/MIC90bifonazole values for a series of bifonazole derivatives with antifungal activity

Table 1 (continued). Test set compounds

Table 2. Descriptors selected for QSAR model development

Table 3. Experimental and predicted activities (pMIC90/MIC90bifonazole) with residual values for the test set compounds. * Residual values are calculated as experimental minus predicted values.

Table 4. HQSAR analyses for various fragment distinctions on the key statistical parameters using the default fragment size (4-7)

Table 5. HQSAR analysis of the influence of different fragment sizes on the statistical parameters (HL stands for hologram length and PC for the number of principal components)

Figure 2. Predicted versus actual pMIC90/MIC90bifonazole values for the training and test sets.

Table 6. Comparison of nitro- and amine-substituted compounds by Fisher's weight