Time filter

Source Type

Allenstown Elementary School, United States

Ma C.,Joint Pitt CMU Computational Biology Program | Ma C.,University of Pittsburgh | Wang L.,University of Pittsburgh | Xie X.-Q.,Joint Pitt CMU Computational Biology Program | Xie X.-Q.,University of Pittsburgh
Journal of Chemical Information and Modeling

Chemical similarity calculation plays an important role in compound library design, virtual screening, and "lead" optimization. In this manuscript, we present a novel GPU-accelerated algorithm for all-vs-all Tanimoto matrix calculation and nearest neighbor search. By taking advantage of multicore GPU architecture and CUDA parallel programming technology, the algorithm is up to 39 times superior to the existing commercial software that runs on CPUs. Because of the utilization of intrinsic GPU instructions, this approach is nearly 10 times faster than existing GPU-accelerated sparse vector algorithm, when Unity fingerprints are used for Tanimoto calculation. The GPU program that implements this new method takes about 20 min to complete the calculation of Tanimoto coefficients between 32 M PubChem compounds and 10K Active Probes compounds, i.e., 324G Tanimoto coefficients, on a 128-CUDA-core GPU. © 2011 American Chemical Society. Source

Ma C.,Joint Pitt CMU Computational Biology Program | Ma C.,University of Pittsburgh | Wang L.,Joint Pitt CMU Computational Biology Program | Wang L.,University of Pittsburgh | And 2 more authors.
Journal of Chemical Information and Modeling

Advanced high-throughput screening (HTS) technologies generate great amounts of bioactivity data, and this data needs to be analyzed and interpreted with attention to understand how these small molecules affect biological systems. As such, there is an increasing demand to develop and adapt cheminformatics algorithms and tools in order to predict molecular and pharmacological properties on the basis of these large data sets. In this manuscript, we report a novel machine-learning-based ligand classification algorithm, named Ligand Classifier of Adaptively Boosting Ensemble Decision Stumps (LiCABEDS), for data-mining and modeling of large chemical data sets to predict pharmacological properties in an efficient and accurate manner. The performance of LiCABEDS was evaluated through predicting GPCR ligand functionality (agonist or antagonist) using four different molecular fingerprints, including Maccs, FP2, Unity, and Molprint 2D fingerprints. Our studies showed that LiCABEDS outperformed two other popular techniques, classification tree and Naive Bayes classifier, on all four types of molecular fingerprints. Parameters in LiCABEDS, including the number of boosting iterations, initialization condition, and a "reject option" boundary, were thoroughly explored and discussed to demonstrate the capability of handling imbalanced data sets, as well as its robustness and flexibility. In addition, the detailed mathematical concepts and theory are also given to address the principle behind statistical prediction models. The LiCABEDS algorithm has been implemented into a user-friendly software package that is accessible online at http://www.cbligand.org/LiCABEDS/. © 2011 American Chemical Society. Source

Discover hidden collaborations