A set of nine activity classes used in our publication "Tovar et al. Comparison of 2D Fingerprint Methods for Multiple-Template Similarity Searching on Compound Activity Classes of Increasing Structural Diversity, ChemMedChem 2, 208-217, 2007" is currently available. The activity classes contain between 22 and 27 compounds each, with increasing structural diversity, and are abbreviated as follows: ANG, angiotensin-II antagonists; ARI, aldose reductase inhibitors; COX, cyclooxygenase-2 inhibitors; ETA, endothelin antagonists; HIV, HIV protease inhibitors; IL1, IL-1 beta converting enzyme inhibitors; LSI, leukotriene synthesis inhibitors; REN, renin inhibitors; THR, thrombin inhibitors (Download). For license reasons, the data sets are available only as MDDR external registry numbers (EXTREG). Please contact MDL (www.mdli.com) for further information on the database.
"2D unique" version of ZINC
Compound IDs of a "2D unique" version of the publicly available ZINC database can be downloaded. The ZINC database (Irwin, J. J.; Shoichet, B. K. ZINC a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 2005, 45, 177-182) consists of several million molecules in predicted 3D conformational states. To generate a "2D unique" version, we calculated the values of a pool of 184 1D and 2D descriptors for each compound of a previous ZINC version (with ~2.01 million molecules) and identified compound entries with duplicate descriptor settings. Only one entry was retained from each group of duplicates, which led to the removal of ~0.57 million ZINC compounds. The "2D unique" subset of ZINC thus contains ~1.44 million molecules (Download).
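The duplicate-removal step described above can be illustrated with a minimal sketch. It is not the code used to build the subset: the function name `unique_by_descriptors`, the rounding tolerance, the three-value descriptor vectors, and the placeholder compound IDs are all assumptions made for illustration, and the rule of keeping the first entry seen is likewise an assumption (the text states only that one entry per duplicate group was retained).

```python
def unique_by_descriptors(records, ndigits=6):
    """Return compound IDs, keeping the first ID seen per descriptor vector.

    records: iterable of (compound_id, descriptor_values) pairs.
    ndigits: rounding applied before comparison (an assumed tolerance).
    """
    first_id_for_vector = {}
    for compound_id, descriptors in records:
        # Two entries count as duplicates when every descriptor value matches.
        key = tuple(round(value, ndigits) for value in descriptors)
        if key not in first_id_for_vector:
            first_id_for_vector[key] = compound_id
    # Dicts preserve insertion order, so the input order of kept IDs survives.
    return list(first_id_for_vector.values())


# Hypothetical records: placeholder IDs with three made-up descriptor values
# each (the real pool has 184 descriptors per compound).
records = [
    ("ZINC_A", [180.16, 3.0, 1.19]),
    ("ZINC_B", [180.16, 3.0, 1.19]),  # same descriptor settings as ZINC_A
    ("ZINC_C", [151.16, 2.0, 0.46]),
]

unique_ids = unique_by_descriptors(records)
print(unique_ids)  # ['ZINC_A', 'ZINC_C']
```

Using the descriptor vector as a dictionary key keeps the comparison to a single pass over the data, which matters at the scale of millions of entries.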
activity classes
A set of 26 activity classes with selectivity for pairs among 13 targets from three protein families (GPCRs, thiol proteases, and serine proteases) is described in our publication "Stumpfe et al. Methods for Computer-Aided Chemical Biology, Part 1: Design of a Benchmark System for the Evaluation of Compound Selectivity, Chemical Biology & Drug Design, in press". Compound names are taken from the referenced sources. These files contain the selectivity ratios computed from given Ki or IC50 values as indicated in the field names. For MDDR compounds, only registry numbers (EXTREG) can be provided, not structures, due to license reasons. Please contact MDL (www.mdli.com) for further information about the MDDR compounds. (Download)
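As a rough illustration of how a selectivity ratio can be derived from two potency values, the sketch below divides the Ki (or IC50) measured against one target by the value measured against another. The direction of the ratio, the function name `selectivity_ratio`, and the example values are assumptions; the field names in the downloaded files remain the authoritative description of how each ratio was computed.

```python
def selectivity_ratio(potency_a, potency_b):
    """Return potency_b / potency_a for two Ki or IC50 values in the same unit.

    With this (assumed) convention, a ratio above 1 means the compound is
    more potent against target A than against target B.
    """
    if potency_a <= 0 or potency_b <= 0:
        raise ValueError("potency values must be positive")
    return potency_b / potency_a


# Hypothetical compound: Ki of 2 nM against target A, 200 nM against target B.
ratio = selectivity_ratio(2.0, 200.0)
print(ratio)  # 100.0
```

Both values must be of the same measurement type and unit; mixing Ki with IC50 values in a single ratio would not give a meaningful number.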
descriptor histogram
A Novel Descriptor Histogram Filtering Method for Database Mining and the Identification of Active Molecules (Download)