We report methods to predict the intrinsic aqueous solubility of crystalline organic molecules from two different thermodynamic cycles. We find that direct computation of solubility, via ab initio calculation of thermodynamic quantities at an affordable level of theory, cannot deliver the required accuracy. We have therefore turned to a mixture of direct computation and informatics, using the calculated thermodynamic properties, along with a few other key descriptors, in regression models. Prediction of the logarithm of intrinsic solubility (with solubility expressed in mol/L) by a three-variable linear regression equation gave r² = 0.77 and RMSE = 0.71 for an external test set comprising drug molecules. The model includes a calculated crystal lattice energy, which provides a computational means of accounting for the interactions in the solid state. We suggest that it is not necessary to know the polymorphic form prior to prediction. Furthermore, the method developed here may be applicable to other solid-state systems such as salts or cocrystals.
- aqueous solubility
- thermodynamic cycle
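
As an illustration of why direct computation is so demanding, the following minimal sketch converts free energies from a sublimation-plus-hydration cycle into log solubility. It is not the authors' implementation: the function name `log_solubility`, the choice of a 1 mol/L standard state in both the gas and solution phases, the assumption of ideal dilute behaviour, and the example free energies are all assumptions made for illustration.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def log_solubility(dg_sub, dg_hyd, temperature=298.15):
    """Return log10 of solubility (mol/L) from a sublimation cycle.

    dg_sub: free energy of sublimation (crystal -> gas), kJ/mol
    dg_hyd: free energy of hydration (gas -> aqueous solution), kJ/mol
    Assumes both terms use a 1 mol/L standard state and that the
    saturated solution behaves as an ideal dilute solution.
    """
    dg_sol = dg_sub + dg_hyd  # crystal -> solution, by Hess's law
    return -dg_sol / (R * temperature * math.log(10))


# Hypothetical free energies, not taken from the paper.
print(round(log_solubility(40.0, -30.0), 2))

# Shift in log S caused by a 5.7 kJ/mol error in the summed free energy.
print(round(log_solubility(40.0, -30.0) - log_solubility(45.7, -30.0), 2))
```

Under these assumptions, an error of about 5.7 kJ/mol in the summed free energy at 298 K shifts the predicted solubility by roughly one log unit, which is of the same order as the RMSE quoted above.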