Acessibilidade / Reportar erro

On the information content of 2D and 3D descriptors for QSAR

To gain better understanding on the information content of two-dimensional (2D) vs. three-dimensional (3D) descriptor systems, we analyzed principal component analysis scores derived from 87 2D descriptors and 798 3D (ALMOND) variables on a set of 5998 compounds of medicinal chemistry interest. The information overlap between ALMOND and 2D-based descriptors, as modeled by the fraction of explained variance (r²) and by seven-groups cross-validation (q²) in a two PLS components model was 40%. Individual component analysis indicates that the first and second principal components from the 2D-descriptors are related to the first and third dimensions from the ALMOND PCA model. The first ALMOND component is explained (61%) by size-related descriptors, whereas the third component is marginally explained (25%) by hydrophobicity-related descriptors. Surprisingly, 2D-based hydrogen-bonding descriptors did not contribute significantly in this analysis. These results do not a priori justify the choice of one methodology over the other, when performing QSAR studies.

ALMOND; cheminformatics; chemometrics; QSAR


Sociedade Brasileira de Química Instituto de Química - UNICAMP, Caixa Postal 6154, 13083-970 Campinas SP - Brazil, Tel./FAX.: +55 19 3521-3151 - São Paulo - SP - Brazil
E-mail: office@jbcs.sbq.org.br