Is experimental data quality the limiting factor in predicting the aqueous solubility of druglike molecules?

David Palmer, John Mitchell

Research output: Contribution to journal › Article

37 Citations (Scopus)

Abstract

We report the results of testing Quantitative Structure-Property Relationships (QSPR) that were trained upon the same druglike molecules but two different sets of solubility data: (i) data extracted from several different sources from the published literature, for which the experimental uncertainty is estimated to be 0.6-0.7 log S units (referred to mol/l); (ii) data measured by a single accurate experimental method (CheqSol), for which experimental uncertainty is typically < 0.05 log S units. Contrary to what might be expected, the models derived from the CheqSol experimental data are not more accurate than those derived from the “noisy” literature data. The results suggest that, at the present time, it is the deficiency of QSPR methods (algorithms and/or descriptor sets), and not, as is commonly quoted, the uncertainty in the experimental measurements, which is the limiting factor in accurately predicting aqueous solubility for pharmaceutical molecules.
Language: English
Pages: 2962–2972
Number of pages: 11
Journal: Molecular Pharmaceutics
Volume: 11
Issue number: 8
Early online date: 11 Jun 2014
DOI: 10.1021/mp500103r
Publication status: Published - 2014

Fingerprint

Solubility
Uncertainty
Quantitative Structure-Activity Relationship
Pharmaceutical Preparations
Data Accuracy
Datasets

Keywords

  • pharmaceutical
  • rule-of-five
  • solubility
  • bioavailability
  • QSPR
  • QSAR
  • druglike
  • ADME
  • Random Forest
  • dissolution
  • experimental error
  • CheqSol
  • Noyes−Whitney
  • Henderson−Hasselbalch
  • polymorph
  • crystal
  • machine learning
  • general solubility equation
  • ADMET

Cite this

@article{06323db503d34ff39deebf8771796ae7,
title = "Is experimental data quality the limiting factor in predicting the aqueous solubility of druglike molecules?",
abstract = "We report the results of testing Quantitative Structure-Property Relationships (QSPR) that were trained upon the same druglike molecules but two different sets of solubility data: (i) data extracted from several different sources from the published literature, for which the experimental uncertainty is estimated to be 0.6-0.7 log S units (referred to mol/l); (ii) data measured by a single accurate experimental method (CheqSol), for which experimental uncertainty is typically < 0.05 log S units. Contrary to what might be expected, the models derived from the CheqSol experimental data are not more accurate than those derived from the “noisy” literature data. The results suggest that, at the present time, it is the deficiency of QSPR methods (algorithms and/or descriptor sets), and not, as is commonly quoted, the uncertainty in the experimental measurements, which is the limiting factor in accurately predicting aqueous solubility for pharmaceutical molecules.",
keywords = "pharmaceutical, rule-of-five, solubility, bioavailability, QSPR, QSAR, druglike, ADME, Random Forest, dissolution, experimental error, CheqSol, Noyes−Whitney, Henderson−Hasselbalch, polymorph, crystal, machine learning, general solubility equation, ADMET",
author = "David Palmer and John Mitchell",
year = "2014",
doi = "10.1021/mp500103r",
language = "English",
volume = "11",
pages = "2962–2972",
journal = "Molecular Pharmaceutics",
issn = "1543-8384",
publisher = "American Chemical Society",
number = "8",
}
