Gordon Life Science Institute

San Diego, CA, United States


Chou K.-C.,Gordon Life Science Institute | Chou K.-C.,King Abdulaziz University
Medicinal Chemistry | Year: 2015

Facing the explosive growth of biological sequence data generated in the post-genomic age, such as protein/peptide and DNA/RNA sequences, many bioinformatical and mathematical approaches as well as physicochemical concepts have been introduced to derive useful information from these sequences in a timely manner, in order to stimulate the development of medical science and drug design. Meanwhile, because of the rapid penetration of these disciplines, medicinal chemistry is currently undergoing an unprecedented revolution. This minireview summarizes the progress by focusing on the following six aspects. (1) Using the pseudo amino acid composition (PseAAC) to predict various attributes of protein/peptide sequences that are useful for drug development. (2) Using the pseudo oligonucleotide composition (PseKNC) to do the same for DNA/RNA sequences. (3) Introducing the multi-label approach to study systems whose constituent elements bear multiple characters and functions. (4) Utilizing graphical rules and "wenxiang" diagrams to analyze complicated biomedical systems. (5) Recent developments in identifying the interactions of drugs with their various types of target proteins in cellular networking. (6) The distorted key theory and its application in developing peptide drugs. © 2015 Bentham Science Publishers.


Zhou G.-P.,Gordon Life Science Institute | Zhou G.-P.,Guangxi Academy of science | Huang R.-B.,Guangxi Academy of science
Current Topics in Medicinal Chemistry | Year: 2013

Transmissible spongiform encephalopathies (TSEs) are prion protein misfolding diseases that involve the accumulation of an abnormal, β-sheet-rich aggregated form (PrPsc) of the normal α-helix-rich prion protein (PrPc) within the central nervous system (CNS) and other organs. Because of its large size and insolubility, PrPsc is quite difficult to characterize. A soluble intermediate, called PrPβ or β°, which exhibits many of the same features as PrPsc, can be generated using a combination of low pH and/or mild denaturing conditions. Here, we review the current knowledge on the following five issues relevant to the conversion mechanisms of PrPc to PrPsc: (1) How is the stability of the helical structures in native PrPc related to its primary structure? (2) Why is a low-pH solution system an ideal trigger of the PrPc to PrPsc conversion? (3) How are the structural and dynamical characteristics of the α-helix-rich intermediates determined using NMR data? (4) How are the premolten (PrPα4 and PrPαβ) and β-oligomer (PrPβ) intermediates detected and assayed? (5) Can the disordered N-terminal domain be folded into a structural segment? In particular, Chou's wenxiang diagram (http://en.wikipedia.org/wiki/Wenxiang_diagram) is introduced to provide an intuitive picture. This review may help to further understand the prion protein misfolding mechanism. © 2013 Bentham Science Publishers.


Xu Y.,University of Science and Technology Beijing | Xu Y.,Gordon Life Science Institute | Chou K.-C.,Gordon Life Science Institute | Chou K.-C.,King Abdulaziz University
Current Topics in Medicinal Chemistry | Year: 2016

Posttranslational modification (PTM) is a later but subtle step in protein biosynthesis that changes the properties of a protein by adding a modifying group to one or more of its amino acid residues. PTMs are responsible for many significant biological processes and also for many major diseases, such as cancer. Facing the avalanche of biological sequences generated in the post-genomic age, it is important for both basic research and drug development to identify the PTM sites in proteins in a timely manner. This review summarizes the recent progress in this area, with a focus on those predictors that were developed based on the pseudo amino acid composition (PseAAC) approach and for which a publicly accessible web-server has been established. The future challenges in this area are also briefly addressed. © 2016 Bentham Science Publishers.


Chen W.,Hebei United University | Chen W.,Gordon Life Science Institute | Feng P.-M.,Hebei United University | Lin H.,University of Electronic Science and Technology of China | Chou K.-C.,Gordon Life Science Institute
Nucleic Acids Research | Year: 2013

Meiotic recombination is an important biological process. As a main driving force of evolution, recombination provides natural new combinations of genetic variations. Rather than occurring randomly across a genome, meiotic recombination takes place in some genomic regions (the so-called 'hotspots') with higher frequencies, and in other regions (the so-called 'coldspots') with lower frequencies. Therefore, information about the hotspots and coldspots would provide useful insights for in-depth study of the mechanism of recombination and of the genome evolution process as well. So far, the recombination regions have been mainly determined by experiments, which are both expensive and time-consuming. With the avalanche of genome sequences generated in the postgenomic age, it is highly desirable to develop automated methods for rapidly and effectively identifying the recombination regions. In this study, a predictor, called 'iRSpot-PseDNC', was developed for identifying the recombination hotspots and coldspots. In the new predictor, the DNA sequence samples are formulated by a novel feature vector, the so-called 'pseudo dinucleotide composition' (PseDNC), into which six local DNA structural properties, i.e. three angular parameters (twist, tilt and roll) and three translational parameters (shift, slide and rise), are incorporated. The rigorous jackknife test showed that the overall success rate achieved by iRSpot-PseDNC was >82% in identifying recombination spots in Saccharomyces cerevisiae, indicating that the new predictor is promising or at least may become a complementary tool to the existing methods in this area. Although the benchmark data set used to train and test the current method was from S. cerevisiae, the basic approaches can also be extended to deal with all the other genomes. In particular, it has not escaped our notice that the PseDNC approach can also be used to study many other DNA-related problems. As a user-friendly web-server, iRSpot-PseDNC is freely accessible at http://lin.uestc.edu.cn/server/iRSpot-PseDNC. © The Author(s) 2013. Published by Oxford University Press.
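
A minimal sketch of how a PseDNC-style feature vector could be assembled, assuming the general pseudo-composition form: 16 normalized dinucleotide frequencies followed by λ weighted sequence-order correlation factors. The abstract does not state the formula, so the function name `pse_dnc`, the parameters `lam` and `w`, and the single placeholder property (GC count per dinucleotide, standing in for the six structural parameters) are assumptions for illustration only. The input is assumed to contain only A, C, G and T.

```python
from itertools import product

# The 16 dinucleotides in a fixed order: AA, AC, AG, ..., TT.
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]

# PLACEHOLDER property table (assumption): one toy property per dinucleotide,
# its GC count. The real predictor uses six structural parameters
# (twist, tilt, roll, shift, slide, rise) whose values are not given here.
TOY_PROPS = {d: [float(d.count("G") + d.count("C"))] for d in DINUCS}


def pse_dnc(seq, lam=2, w=0.5, props=TOY_PROPS):
    """Return a (16 + lam)-dimensional pseudo dinucleotide composition."""
    seq = seq.upper()
    dinucs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = len(dinucs)
    if lam >= n:
        raise ValueError("lam must be smaller than the number of dinucleotides")

    # Local part: occurrence frequency of each dinucleotide.
    freqs = [dinucs.count(d) / n for d in DINUCS]

    # Global part: tier-j correlation between dinucleotides j steps apart,
    # averaged over all pairs and over all properties.
    thetas = []
    for j in range(1, lam + 1):
        total = 0.0
        for i in range(n - j):
            a, b = props[dinucs[i]], props[dinucs[i + j]]
            total += sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
        thetas.append(total / (n - j))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]
```

Calling `pse_dnc("ACGTACGTAC", lam=2)` yields an 18-component vector; the weight `w` controls how much the sequence-order terms contribute relative to the plain composition.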


Zhou G.-P.,Gordon Life Science Institute | Zhou G.-P.,North Carolina State University
Journal of Theoretical Biology | Year: 2011

The wenxiang diagram is a new two-dimensional representation that characterizes the disposition of hydrophobic and hydrophilic residues in α-helices. In this research, the hydrophobic and hydrophilic residues of two leucine zipper coiled-coil (LZCC) structural proteins, cGKIα 1-59 and MBS CT35, are placed on their respective wenxiang diagrams according to the heptad repeat pattern (abcdefg)n. Their wenxiang diagrams clearly demonstrate that residues with the same repeat letter lie on the same side of the spiral diagram, where most hydrophobic residues are positioned at a and d, and most hydrophilic residues are located in the polar position regions b, c, e, f and g. The wenxiang diagram of a dimeric LZCC can be represented by the combination of two monomeric wenxiang diagrams, and the wenxiang diagrams of the two LZCC (tetramer) complex structures can also be assembled by using two pairs of their wenxiang diagrams. Furthermore, by comparing the wenxiang diagrams of cGKIα 1-59 and MBS CT35, the interaction between cGKIα 1-59 and MBS CT35 is suggested to be weaker. By analyzing the wenxiang diagram of the cGKIα 1-59.MBS CT42 complex structure, the residues of cGKIα 1-59 most affected by the interaction with MBS CT42 are proposed to be at positions d, a, e and g of the LZCC structure. These findings are consistent with our previous NMR results. Combined with NMR spectroscopy, the wenxiang diagrams of LZCC structures may provide novel insights into the interaction mechanisms between dimeric, trimeric and tetrameric coiled-coil structures. © 2011 Elsevier Ltd.
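
The heptad assignment described above can be sketched as follows, assuming the standard α-helix geometry of 3.6 residues per turn (100° per residue) for the angular position on a spiral diagram. The function name `heptad_positions`, the `start` parameter and the seven-residue example sequence are hypothetical and not taken from the paper.

```python
HEPTAD = "abcdefg"
DEGREES_PER_RESIDUE = 100  # 360 degrees / 3.6 residues per helical turn


def heptad_positions(seq, start="a"):
    """Label each residue with its heptad letter and helical angle.

    Returns a list of (residue, heptad_letter, angle_in_degrees) tuples,
    where the first residue is assigned the heptad letter `start`.
    """
    offset = HEPTAD.index(start)
    return [
        (res, HEPTAD[(offset + i) % 7], (DEGREES_PER_RESIDUE * i) % 360)
        for i, res in enumerate(seq)
    ]


def core_residues(seq, start="a"):
    """Residues at positions a and d, where hydrophobic residues cluster."""
    return [res for res, letter, _ in heptad_positions(seq, start)
            if letter in "ad"]
```

For the toy sequence `"LKELEDK"` starting at position a, the two leucines fall on positions a and d, which is the pattern the abstract reports for the hydrophobic core of a leucine zipper.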


Chou K.-C.,Gordon Life Science Institute
Current Drug Metabolism | Year: 2010

Using graphic rules to deal with kinetic systems is an elegant approach that combines graph representation (schematic representation) with rigorous mathematical derivation. It has the following advantages: (1) providing an intuitive picture or illuminating insights; (2) helping to grasp the key points from complicated details; (3) greatly simplifying many tedious, laborious, and error-prone calculations; and (4) making it possible to double-check the final results. In this mini review, the non-steady-state graphic rule in enzyme-catalyzed kinetics and protein-folding kinetics is extended to cover drug-metabolic systems. As a demonstration, a step-by-step illustration is presented showing how to use the graphic rule to derive the concentrations of the parent drug and its metabolites vs. time for the seliciclib, vildagliptin, and cyclin-dependent kinase inhibitor (AG-024322) metabolic systems, respectively. It can be seen from these paradigms that the graphic rule is particularly useful for analyzing complicated drug-metabolic systems and ensuring the correctness of the derived results. Meanwhile, the intuitive feature of graphic representation may facilitate analyzing and classifying drug-metabolic systems; e.g., according to their directed graphs, the metabolism of seliciclib and the metabolism of vildagliptin can be categorized as a 0 → 5 mechanism, while that of AG-024322 can be categorized as a 0 → 4 → 3 mechanism. © 2010 Bentham Science Publishers Ltd.


Chou K.-C.,Gordon Life Science Institute | Wu Z.-C.,Jing de Zhen Ceramic Institute | Xiao X.,Gordon Life Science Institute | Xiao X.,Jing de Zhen Ceramic Institute
PLoS ONE | Year: 2011

Predicting protein subcellular localization is an important and difficult problem, particularly when query proteins may have the multiplex character, i.e., simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular location predictors can only be used to deal with single-location or "singleplex" proteins. Actually, multiple-location or "multiplex" proteins should not be ignored because they usually possess some unique biological functions worthy of special notice. By introducing the "multi-labeled learning" and "accumulation-layer scale", a new predictor, called iLoc-Euk, has been developed that can be used to deal with systems containing both singleplex and multiplex proteins. As a demonstration, the jackknife cross-validation was performed with iLoc-Euk on a benchmark dataset of eukaryotic proteins classified into the following 22 location sites: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centriole, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole, where none of the proteins included has ≥25% pairwise sequence identity to any other in the same subset. The overall success rate thus obtained by iLoc-Euk was 79%, which is significantly higher than that of any of the existing predictors that also have the capacity to deal with such a complicated and stringent system. As a user-friendly web-server, iLoc-Euk is freely accessible to the public at the web-site http://icpr.jci.edu.cn/bioinfo/iLoc-Euk. It is anticipated that iLoc-Euk may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development. Also, its novel approach will further stimulate the development of predicting other protein attributes. © 2011 Chou et al.
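
The jackknife (leave-one-out) test mentioned above can be illustrated with a short sketch: each sample is held out in turn, predicted from all the others, and scored. The nearest-neighbour rule used here is a stand-in chosen for brevity, not the iLoc-Euk algorithm, and the feature points and location labels are invented toy data.

```python
import math


def nearest_neighbor_labels(query, train_x, train_y):
    """Return the label set of the training point closest to `query`."""
    best = min(range(len(train_x)),
               key=lambda i: math.dist(query, train_x[i]))
    return train_y[best]


def jackknife_exact_match(xs, ys):
    """Leave-one-out test: fraction of samples whose whole label set
    is predicted exactly when that sample is excluded from training."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_neighbor_labels(xs[i], train_x, train_y) == ys[i]:
            hits += 1
    return hits / len(xs)


# Toy data (hypothetical): 2-D feature points with one or two locations each.
toy_x = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
toy_y = [{"nucleus"}, {"nucleus"},
         {"cytoplasm", "nucleus"}, {"cytoplasm", "nucleus"},
         {"vacuole"}]
toy_rate = jackknife_exact_match(toy_x, toy_y)
```

Because every sample is tested exactly once and never appears in its own training set, the jackknife yields a single, reproducible rate for a given benchmark dataset.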


Chou K.-C.,Gordon Life Science Institute | Wu Z.-C.,Jing de Zhen Ceramic Institute | Xiao X.,Gordon Life Science Institute | Xiao X.,Jing de Zhen Ceramic Institute
Molecular BioSystems | Year: 2012

Although numerous efforts have been made to predict the subcellular locations of proteins based on their sequence information, this remains a challenging problem, particularly when query proteins may have the multiplex character, i.e., they simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing methods were established on the assumption that a protein has one, and only one, subcellular location. Actually, recent evidence has indicated that an increasing number of human proteins have multiple subcellular locations. Such multiplex proteins should not be ignored because they may bear some special biological functions worthy of our attention. Based on the accumulation-label scale, a new predictor, called iLoc-Hum, was developed for identifying the subcellular localization of human proteins with both single and multiple location sites. As a demonstration, the jackknife cross-validation was performed with iLoc-Hum on a benchmark dataset of human proteins that covers the following 14 location sites: centrosome, cytoplasm, cytoskeleton, endoplasmic reticulum, endosome, extracellular, Golgi apparatus, lysosome, microsome, mitochondrion, nucleus, peroxisome, plasma membrane, and synapse, where some proteins belong to two, three or four locations but none has 25% or higher pairwise sequence identity to any other in the same subset. For such a complicated and stringent system, the overall success rate achieved by iLoc-Hum was 76%, which is remarkably higher than that of any of the existing predictors that also have the capacity to deal with this kind of system. Further comparisons were also made via two independent datasets; all indicated that the success rates of iLoc-Hum were significantly higher than those of its counterparts. As a user-friendly web-server, iLoc-Hum is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Hum or http://www.jci-bioinfo.cn/iLoc-Hum. For the convenience of most experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results by choosing either a straightforward submission or a batch submission, without the need to follow the complicated mathematical equations involved. © 2012 The Royal Society of Chemistry.


Chou K.-C.,Gordon Life Science Institute | Chou K.-C.,King Abdulaziz University
Molecular BioSystems | Year: 2013

Many molecular biosystems and biomedical systems are multi-label systems, in which each constituent molecule possesses one or more functions or features and hence needs one or more labels to indicate its attribute(s). With the avalanche of biological sequences generated in the post-genomic age, it is highly desirable to develop computational methods that identify these various kinds of attributes in a timely and reliable manner. Compared with single-label systems, multi-label systems are much more complicated and difficult to deal with. The current mini review focuses on the recent progress in this area, covering both conceptual aspects and detailed mathematical formulations. © 2013 The Royal Society of Chemistry.
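
To make the difference from single-label scoring concrete, here is a small sketch of two generic multi-label measures: the exact-match ratio (a prediction counts only if the whole label set is right) and the Hamming loss (the fraction of individual label decisions that are wrong). These are standard textbook measures chosen for illustration; the review's own metrics and notation are not reproduced here, and the function names are assumptions.

```python
def exact_match_ratio(true_sets, pred_sets):
    """Fraction of samples whose predicted label set equals the true set."""
    matches = sum(t == p for t, p in zip(true_sets, pred_sets))
    return matches / len(true_sets)


def hamming_loss(true_sets, pred_sets, all_labels):
    """Fraction of (sample, label) decisions that are wrong.

    The symmetric difference t ^ p holds every label that is either
    missed or wrongly predicted for one sample.
    """
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * len(all_labels))
```

A prediction that recovers one of two true labels scores zero under the exact-match ratio but is only partly penalized by the Hamming loss, which is why multi-label systems are usually reported with more than one measure.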


Chou K.-C.,Gordon Life Science Institute
Journal of Theoretical Biology | Year: 2011

With the accomplishment of human genome sequencing, the number of sequence-known proteins has increased explosively. In contrast, the pace is much slower in determining their biological attributes. As a consequence, the gap between sequence-known proteins and attribute-known proteins has become increasingly large. This unbalanced situation, which has critically limited our ability to utilize the newly discovered proteins for basic research and drug development in a timely manner, calls for computational methods or high-throughput automated tools that quickly and reliably identify various attributes of uncharacterized proteins based on their sequence information alone. Actually, during the last two decades or so, many methods in this regard have been established in the hope of bridging such a gap. In the course of developing these methods, the following things often needed to be considered: (1) benchmark dataset construction, (2) protein sample formulation, (3) operating algorithm (or engine), (4) anticipated accuracy, and (5) web-server establishment. This review discusses each of the five procedures, with a special focus on the introduction of pseudo amino acid composition (PseAAC), its different modes and applications as well as its recent development, particularly how to use the general formulation of PseAAC to reflect the core and essential features that are deeply hidden in complicated protein sequences. © 2010 Elsevier Ltd.
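
As an illustration of the protein sample formulation step, the sketch below builds a PseAAC-style vector: 20 amino acid frequencies plus λ sequence-order correlation factors computed from a standardized residue property. It assumes the commonly described form with a weight factor `w`. The property table is a placeholder (alphabet rank, not a physical quantity), and the names `pse_aac`, `standardize`, `lam` and `w` are assumptions rather than the review's notation. The input is assumed to contain only the 20 standard amino acid letters.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# PLACEHOLDER property (assumption): alphabet rank of each residue.
# A real application would use measured values such as hydrophobicity.
RAW_PROP = {aa: float(i) for i, aa in enumerate(AMINO_ACIDS)}


def standardize(prop):
    """Shift and scale a property to zero mean and unit variance
    over the 20 amino acids."""
    mu = mean(prop.values())
    sd = pstdev(prop.values())
    return {aa: (v - mu) / sd for aa, v in prop.items()}


def pse_aac(seq, lam=3, w=0.05, prop=RAW_PROP):
    """Return a (20 + lam)-dimensional pseudo amino acid composition."""
    h = standardize(prop)
    length = len(seq)
    if lam >= length:
        raise ValueError("lam must be smaller than the sequence length")

    # Conventional amino acid composition.
    freqs = [seq.count(aa) / length for aa in AMINO_ACIDS]

    # Tier-j correlation factors between residues j positions apart.
    thetas = [
        sum((h[seq[i]] - h[seq[i + j]]) ** 2 for i in range(length - j))
        / (length - j)
        for j in range(1, lam + 1)
    ]

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]
```

The first 20 components carry the order-independent composition, while the last `lam` components retain some sequence-order information that plain composition discards.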
