Parameter Questions

How are the SMILES, InChI and InChI Key created for CMNPD?

These descriptors are calculated using BIOVIA Pipeline Pilot (version 18.1).

SMILES

SMILES strings are generated using the Daylight algorithm.

Standard InChI

Standard InChI is calculated using the algorithm from IUPAC. The version of InChI used in CMNPD is 1.05.

InChI Key

InChI Key is based on a strong hash (SHA-256 algorithm) of an InChI string.
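Because the key is a fixed-length hash, its layout can be checked without any chemistry toolkit. The sketch below is illustrative only and is not part of the CMNPD pipeline: it splits a 27-character InChIKey into its blocks, assuming the usual layout of 14 letters, a hyphen, 8 letters plus a standard/non-standard flag and a version letter, a hyphen, and one protonation letter. The function name `split_inchikey` and the dictionary keys are hypothetical.

```python
import re

# Assumed InChIKey layout: 14 letters, hyphen, 8 letters, a flag letter
# (S = standard, N = non-standard), a version letter, hyphen, and one
# protonation letter.
INCHIKEY_RE = re.compile(r"([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])")


def split_inchikey(key):
    """Split an InChIKey into its blocks; raise ValueError if malformed."""
    match = INCHIKEY_RE.fullmatch(key)
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    first, second, flag, version, protonation = match.groups()
    return {
        "first_block": first,        # hash of the connectivity layers
        "second_block": second,      # hash of the remaining layers
        "is_standard": flag == "S",  # True for a Standard InChI
        "version_char": version,
        "protonation_char": protonation,
    }


# Example key in the standard format (the key commonly listed for caffeine).
parts = split_inchikey("RYYVLZVUVIJVGH-UHFFFAOYSA-N")
print(parts["first_block"], parts["is_standard"])
```

A format check like this only confirms that a string looks like an InChIKey; it cannot confirm that the key belongs to a particular structure.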

How are the physicochemical properties in CMNPD calculated?

All physicochemical properties are calculated using algorithms available in RDKit.

Molecular Weight

The sum of the atomic masses, using the isotope-averaged mass of each atom.

Molecular Mass

The sum of the atomic masses, using the mass of the most common isotope of each atom.
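The difference between the two values can be illustrated with a small hand calculation. This is a minimal sketch, not the code used to build CMNPD: the mass tables are rounded values for four elements only, and `formula_mass` is a hypothetical helper. In RDKit, comparable values are exposed as `Descriptors.MolWt` and `Descriptors.ExactMolWt`.

```python
# Rounded atomic masses in daltons, listed for illustration only.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def formula_mass(formula, mass_table):
    """Sum the atomic masses for a formula given as {element: count}."""
    return sum(mass_table[element] * count for element, count in formula.items())


# Caffeine, C8H10N4O2, written as element counts.
caffeine = {"C": 8, "H": 10, "N": 4, "O": 2}

molecular_weight = formula_mass(caffeine, AVERAGE_MASS)     # isotope-averaged
molecular_mass = formula_mass(caffeine, MONOISOTOPIC_MASS)  # most common isotopes

print(f"Molecular Weight: {molecular_weight:.3f}")
print(f"Molecular Mass:   {molecular_mass:.4f}")
```

The two numbers differ in the first decimal place for a molecule of this size, which is why mass spectrometry comparisons normally use the Molecular Mass rather than the Molecular Weight.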

ALogP

Calculated lipophilicity of a molecule, expressed as the logarithm of the octanol/water partition coefficient. The method is described in "Prediction of physicochemical parameters by atomic contributions" by Wildman, S. A. et al., J. Chem. Inf. Comput. Sci., 1999, 39, 868-873.

Rotatable Bonds

Number of rotatable bonds in the molecule. Based on matching this SMARTS pattern:

[!$(*#*)&!D1]-&!@[!$(*#*)&!D1]

HBA

Number of hydrogen bond acceptors: number of heteroatoms (Oxygen, Nitrogen, Sulfur, or Phosphorus) with one or more lone pairs, excluding atoms with positive formal charges, amide and pyrrole-type Nitrogen, and aromatic Oxygen and Sulfur atoms in heterocyclic rings. Based on matching these SMARTS patterns:

[$([O,S;H1;v2]-[!$(*=[O,N,P,S])]),$([O,S;H0;v2]),$([O,S;-]),$([N;v3;!$(N-*=!@[O,N,P,S])]),$([nH0,o,s;+0])]

HBD

Number of hydrogen bond donors: number of heteroatoms (Oxygen, Nitrogen, Sulfur, or Phosphorus) with one or more attached Hydrogen atoms. Based on matching these SMARTS patterns:

[$([N;!H0;v3]),$([N;!H0;+1;v4]),$([O,S;H1;+0]),$([n;H1;+0])]

Polar Surface Area

The polar surface area is calculated using a method based on "Fast calculation of molecular polar surface area as a sum of fragment based contributions and its application to the prediction of drug transport properties" by Ertl P. et al., J. Med. Chem., 2000, 43, 3714-3717.

Aromatic Rings

The number of aromatic rings in the molecule.

Heavy Atoms

The number of non-hydrogen atoms in the molecule.

QED Weighted

This is the quantitative estimate of drug-likeness as described in "Quantifying the chemical beauty of drugs" by Bickerton G. R. et al., Nat. Chem., 2012, 4, 90-98.

The values range from 0 to 1, where 1 is the most drug-like and 0 the least drug-like.

How are the predicted ADMET parameters in CMNPD calculated?

All ADMET parameters are calculated using ADMET models available in BIOVIA Pipeline Pilot (version 18.1).

Blood Brain Barrier Penetration

This model predicts blood-brain barrier (BBB) penetration after oral administration. The method is described in "Prediction of drug absorption using multivariate statistics" by Egan, W. J. et al., J. Med. Chem., 2000, 43, 3867-3877. The model reports two properties:

Blood Brain Barrier Penetration: Base 10 logarithm of (brain concentration)/(blood concentration)

Blood Brain Barrier Penetration Level: The predicted blood-brain penetration level, assigned to one of the following categories:

| Level | Value | Description |
|-------|-----------|----------------------------------------|
| 0 | Very High | Brain-blood ratio greater than 5:1 |
| 1 | High | Brain-blood ratio between 1:1 and 5:1 |
| 2 | Medium | Brain-blood ratio between 0.3:1 and 1:1 |
| 3 | Low | Brain-blood ratio less than 0.3:1 |
| 4 | Undefined | Outside 99 percent confidence ellipse |
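As a rough illustration of how the level relates to the logarithmic value, the sketch below maps a predicted log10(brain/blood) value onto levels 0 to 4. It is not the Pipeline Pilot implementation. It assumes the Medium band spans ratios from 0.3:1 to 1:1 (the gap between High and Low), that boundary values fall into the higher-penetration band except at 5:1, and that the confidence-ellipse check is supplied by the caller; `bbb_level` and `inside_ellipse` are hypothetical names.

```python
import math

# Level boundaries expressed as log10(brain/blood concentration ratio).
VERY_HIGH_CUTOFF = math.log10(5.0)  # ratio 5:1
HIGH_CUTOFF = math.log10(1.0)       # ratio 1:1
MEDIUM_CUTOFF = math.log10(0.3)     # ratio 0.3:1 (assumed lower edge of Medium)


def bbb_level(log_bb, inside_ellipse=True):
    """Return the penetration level (0-4) for a predicted log10(brain/blood)."""
    if not inside_ellipse:
        return 4  # Undefined: outside the 99 percent confidence ellipse
    if log_bb > VERY_HIGH_CUTOFF:
        return 0  # Very High
    if log_bb >= HIGH_CUTOFF:
        return 1  # High (boundary handling is an assumption)
    if log_bb >= MEDIUM_CUTOFF:
        return 2  # Medium
    return 3      # Low


for value in (1.0, 0.3, -0.2, -1.0):
    print(value, bbb_level(value))
```

Note that a lower level number means higher predicted penetration, so sorting compounds by level in ascending order lists the most brain-penetrant candidates first.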

Human Intestinal Absorption

This model predicts human intestinal absorption (HIA) after oral administration. The method is described in "Prediction of drug absorption using multivariate statistics" by Egan, W. J. et al., J. Med. Chem., 2000, 43, 3867-3877 and "Prediction of intestinal permeability" by Egan, W. J. et al., Adv. Drug Delivery Rev., 2002, 54, 273-289.

Intestinal absorption is defined as a percentage absorbed rather than as a ratio of concentrations (cf. blood-brain barrier penetration); a well-absorbed compound is one that is at least 90 percent absorbed into the bloodstream in humans. The levels are defined as:

  • 0 = Good

  • 1 = Moderate

  • 2 = Low

  • 3 = Very low

Aqueous Solubility

The Aqueous Solubility descriptor uses linear regression to predict the solubility of each compound in water at 25 degrees Celsius. The method is described in "Prediction of aqueous solubility of a diverse set of compounds using quantitative structure-property relationships" by Cheng, A. et al., J. Med. Chem., 2003, 46, 3572-3580.

Aqueous Solubility: The base 10 logarithm of the molar solubility as predicted by the regression.

Aqueous Solubility Level: Assigns the molecule to one of seven solubility classes based on the value of ADMET Solubility. The classes define the solubility relative to a compendium of known drugs:

| Level | Value | Description |
|-------|--------------|----------------------------------------------------------|
| 0 | < -8.0 | Extremely low solubility, lower than 95 percent of drugs |
| 1 | (-8.0, -6.0) | Very low solubility, at borderline of 95 percent of drugs |
| 2 | (-6.0, -4.0) | Low solubility, at lower end of 95 percent of drugs |
| 3 | (-4.0, -2.0) | Good, slightly soluble to soluble |
| 4 | (-2.0, 0.0) | Optimal solubility |
| 5 | > 0.0 | Very soluble, perhaps too soluble |
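The mapping from the predicted value to its level can be sketched as follows. This is an illustration under stated assumptions rather than the Pipeline Pilot code: the listed ranges do not say which class an exact boundary value belongs to, so the sketch keeps "< -8.0" and "> 0.0" strict and places every other boundary value in the higher class. The name `solubility_level` is hypothetical.

```python
# Upper edges of levels 0-3 in log10(molar solubility).
UPPER_EDGES = (-8.0, -6.0, -4.0, -2.0)


def solubility_level(log_s):
    """Return the solubility level (0-5) for a predicted log10 molar solubility."""
    if log_s > 0.0:
        return 5  # very soluble
    for level, upper_edge in enumerate(UPPER_EDGES):
        if log_s < upper_edge:
            return level
    return 4      # between -2.0 and 0.0 inclusive (assumed boundary handling)


for value in (-9.0, -7.0, -5.0, -3.0, -1.0, 0.5):
    print(value, solubility_level(value))
```

Since the value is a base 10 logarithm, a predicted value of -3.0 corresponds to a molar solubility of 10 ** -3.0, or 0.001 mol/L.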

CYP2D6 Binding Model

This model predicts whether a compound is an inhibitor of the CYP2D6 isozyme of cytochrome P-450, with a value of True or False. The model was trained on a data set of compounds as described in "Use of robust classification techniques for the prediction of human cytochrome P450 2D6 inhibition" by Susnow R. G. et al., J. Chem. Inf. Comput. Sci., 2003, 43, 1308-1315.

Hepatotoxicity Model

This model predicts dose-dependent human hepatotoxicity with a value of True (toxic) or False (nontoxic). The model was trained on a data set of 436 compounds as described in "In silico models for the prediction of dose-dependent human hepatotoxicity" by Cheng A. et al., J. Comput.-Aided Mol. Des., 2003, 17, 811-823.

Plasma Protein Binding Model

This model predicts plasma protein binding, that is, whether a compound is likely to be highly bound to carrier proteins in the blood. If True, the compound is estimated to be a binder (>=90%). Otherwise, it is estimated to be a weak binder or non-binder (<90%). The model was trained on several data sets of compounds using modified Bayesian learning as described in "Classification of kinase inhibitors using a Bayesian model" by Xia, X. et al., J. Med. Chem., 2004, 47, 4463-4470.
