journal.pone.0063240.pdf (424.3 kB)

Enhancing Hit Identification in Mycobacterium tuberculosis Drug Discovery Using Validated Dual-Event Bayesian Models

journal contribution
posted on 29.03.2016, 00:00 by Sean Ekins, Robert C. Reynolds, Scott G. Franzblau, Baojie Wan, Joel S. Freundlich, Barry A. Bunin
High-throughput screening (HTS) in whole cells is widely pursued to find compounds active against Mycobacterium tuberculosis (Mtb) for further development towards new tuberculosis (TB) drugs. Hit rates from these screens, usually conducted at 10 to 25 μM concentrations, typically range from less than 1% to the low single digits. New approaches to increase the efficiency of hit identification are urgently needed to learn from past screening data. The pharmaceutical industry has for many years taken advantage of computational approaches to optimize compound libraries for in vitro testing, a practice not fully embraced by academic laboratories in the search for new TB drugs. Adapting these proven approaches, we have recently built and validated Bayesian machine learning models for predicting compounds with activity against Mtb based on publicly available large-scale HTS data from the Tuberculosis Antimicrobial Acquisition Coordinating Facility. We now demonstrate the largest prospective validation to date, in which we computationally screened 82,403 molecules with these Bayesian models, assayed a total of 550 molecules in vitro, and identified 124 actives against Mtb. Individual hit rates for the different datasets ranged from 15% to 28%. We have identified several FDA-approved and late-stage clinical candidate kinase inhibitors with activity against Mtb, which may represent starting points for further optimization. The computational models developed herein and the commercially available molecules derived from them are now available to any group pursuing Mtb drug discovery.
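
The headline figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the article: it computes the overall hit rate from the 124 actives among 550 assayed molecules (about 22.5%, which falls inside the reported per-dataset range) and the fraction of the 82,403 computationally screened molecules that were taken forward to in vitro testing. The 1% baseline used for the fold comparison is an assumption chosen from the "less than 1% to the low single digits" range quoted above, not a value reported by the authors, and the function and constant names are hypothetical.

```python
def hit_rate(actives: int, tested: int) -> float:
    """Return the fraction of tested compounds that were active."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    if not 0 <= actives <= tested:
        raise ValueError("actives must be between 0 and tested")
    return actives / tested


# Counts quoted in the abstract.
SCREENED_IN_SILICO = 82_403   # molecules scored with the Bayesian models
TESTED_IN_VITRO = 550         # molecules assayed against Mtb
ACTIVES = 124                 # molecules found active

# Assumption for illustration only: a 1% baseline HTS hit rate, picked from
# the "less than 1% to the low single digits" range in the abstract.
ASSUMED_BASELINE_HIT_RATE = 0.01

overall = hit_rate(ACTIVES, TESTED_IN_VITRO)
fraction_tested = TESTED_IN_VITRO / SCREENED_IN_SILICO
fold_over_baseline = overall / ASSUMED_BASELINE_HIT_RATE

print(f"Overall hit rate: {overall:.1%}")
print(f"Fraction of screened molecules assayed: {fraction_tested:.2%}")
print(f"Fold over assumed 1% baseline: {fold_over_baseline:.1f}x")
```

The fold figure depends entirely on the assumed baseline; with a baseline in the low single digits rather than 1%, it would be correspondingly smaller.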


The CDD TB has been developed thanks to funding from the Bill and Melinda Gates Foundation (Grant #49852 “Collaborative drug discovery for TB through a novel database of SAR data optimized to promote data archiving and sharing”). The project described was supported by Award Number R43 LM011152-01 “Biocomputation across distributed private datasets to enhance drug discovery” from the National Library of Medicine. RCR and SGF acknowledge the American Reinvestment and Recovery Act Grant 1RC1AI086677-01 (National Institutes of Health (NIH), National Institute of Allergy and Infectious Diseases (NIAID)) – “Targeting MDR-TB.”


Publisher Statement

© 2013 Ekins et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. © 2013 by Public Library of Science, PLoS ONE


Public Library of Science




