Singh N., DoD Biotechnology High Performance Software Applications Institute |
Chaudhury S., DoD Biotechnology High Performance Software Applications Institute |
Liu R., DoD Biotechnology High Performance Software Applications Institute |
Abdulhameed M.D.M., DoD Biotechnology High Performance Software Applications Institute |
And 2 more authors.
Journal of Chemical Information and Modeling | Year: 2012
As novel and drug-resistant bacterial strains continue to present an emerging health threat, the development of new antibacterial agents is critical. This includes improving existing antibacterial scaffolds as well as identifying novel ones. The aim of this study is to apply a Bayesian classification QSAR approach to rapidly screen chemical libraries for compounds predicted to have antibacterial activity. Toward this end, we assembled a data set of 317 known antibacterial compounds and a second data set of diverse, well-validated, non-antibacterial compounds from 215 PubChem Bioassays against various bacterial species. We constructed a Bayesian classification model using structural fingerprints and physicochemical property descriptors and achieved an accuracy of 84% and a precision of 86% in identifying antibacterial compounds on an independent test set. To demonstrate the practical applicability of the model in virtual screening, we screened an independent data set of ∼200k compounds. The results show that the model identifies PubChem Bioassay actives among its top-ranked hits with an accuracy of up to ∼76%, representing a 1.5- to 2-fold enrichment. The top screened hits were a mixture of known antibacterial scaffolds and novel scaffolds. Our study suggests that a well-validated Bayesian classification QSAR approach could complement other screening approaches in identifying novel and promising hits. The data sets used in constructing and validating this model have been made publicly available. © 2012 American Chemical Society.
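The abstract reports three kinds of figures: accuracy, precision, and fold enrichment among top-ranked hits. The sketch below shows how such quantities are conventionally computed. It is not the authors' code, and the abstract does not state how they calculated enrichment. The names `Confusion`, `accuracy`, `precision`, and `enrichment`, and all counts, scores, and labels, are hypothetical toy values chosen for illustration and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary classifier evaluated on a labeled test set."""
    tp: int  # actives predicted active
    fp: int  # inactives predicted active
    tn: int  # inactives predicted inactive
    fn: int  # actives predicted inactive


def accuracy(c: Confusion) -> float:
    """Fraction of all test compounds that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def precision(c: Confusion) -> float:
    """Fraction of predicted actives that are truly active."""
    return c.tp / (c.tp + c.fp)


def enrichment(scores: list[float], labels: list[int], top_n: int) -> float:
    """Hit rate among the top_n highest-scoring compounds divided by the
    hit rate of the whole library (assumed definition of fold enrichment)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Toy confusion matrix (hypothetical counts, not the paper's data).
toy = Confusion(tp=40, fp=10, tn=45, fn=5)
print(accuracy(toy), precision(toy))

# Toy screening run: model scores and true activity labels (1 = active).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
print(enrichment(toy_scores, toy_labels, top_n=4))
```

Under this definition, an enrichment of 1.5 means the top-ranked subset contains actives at 1.5 times the rate expected from picking compounds at random from the same library.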