MOE Key Laboratory of Bioinformatics

Liu X.,University of Texas Southwestern Medical Center | Zhang Y.,University of Texas Southwestern Medical Center | Chen Y.,University of Texas at Dallas | Li M.,CAS Shanghai Institutes for Biological Sciences | And 14 more authors.
Cell | Year: 2017

Cis-regulatory elements (CREs) are commonly recognized by correlative chromatin features, yet the molecular composition of the vast majority of CREs in chromatin remains unknown. Here, we describe a CRISPR affinity purification in situ of regulatory elements (CAPTURE) approach to unbiasedly identify locus-specific chromatin-regulating protein complexes and long-range DNA interactions. Using an in vivo biotinylated nuclease-deficient Cas9 protein and sequence-specific guide RNAs, we show high-resolution and selective isolation of chromatin interactions at a single-copy genomic locus. Purification of human telomeres using CAPTURE identifies known and new telomeric factors. In situ capture of individual constituents of the enhancer cluster controlling human β-globin genes establishes evidence for composition-based hierarchical organization. Furthermore, unbiased analysis of chromatin interactions at disease-associated cis-elements and developmentally regulated super-enhancers reveals spatial features that causally control gene transcription. Thus, comprehensive and unbiased analysis of locus-specific regulatory composition provides mechanistic insight into genome structure and function in development and disease. © 2017 Elsevier Inc.

Gong Y.,MOE Key Laboratory of Bioinformatics | Zhang L.,MOE Key Laboratory of Bioinformatics | Zhang L.,Tsinghua University | Li J.,MOE Key Laboratory of Bioinformatics | And 2 more authors.
Bioconjugate Chemistry | Year: 2016

Development of peptide-based affinity matrices and detection reagents is important for biomedical research and the biopharmaceutical industry. In the present work, we designed and synthesized an immunoglobulin G (IgG)-binding peptide ligand, Fc-III-4C. Fc-III-4C is composed of 15 residues, and its 4 cysteine residues form 2 disulfide bonds to generate a double cyclic structure. The binding affinity of the Fc-III-4C peptide toward human IgG was determined to be 2.45 nM (Kd), which is a higher affinity than that of Protein A/G (Pro-A/G) for IgG. Importantly, the Fc-III-4C peptide displayed high affinity for IgGs from various species. Fc-III-4C-immobilized agarose beads exhibited high stability and reusability compared with Pro-A/G-immobilized beads. The conjugate of Fc-III-4C with FITC was demonstrated to be suitable for immunofluorescence detection of proteins expressed in cells. These results demonstrate that the Fc-III-4C peptide is a useful affinity ligand for antibody purification and a useful protein detection reagent. © 2016 American Chemical Society.
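
To give a feel for what a Kd of 2.45 nM means, the sketch below computes the equilibrium fraction of IgG occupied by peptide. It assumes a simple 1:1 binding model with free peptide in excess; that model and the function name `fraction_bound` are illustrative assumptions, not taken from the paper.

```python
def fraction_bound(ligand_nM: float, kd_nM: float = 2.45) -> float:
    """Equilibrium fraction of target bound at a given free-ligand concentration.

    Assumes a simple 1:1 binding model (an assumption of this sketch):
        fraction = [L] / ([L] + Kd)
    The default Kd of 2.45 nM is the value reported in the abstract.
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


if __name__ == "__main__":
    # Occupancy at a few hypothetical free-peptide concentrations.
    for conc in (0.245, 2.45, 24.5):
        print(f"{conc:7.3f} nM -> {fraction_bound(conc):.3f}")
```

Under this model, half of the IgG is bound when the free peptide concentration equals Kd, and a tenfold excess over Kd gives roughly 91% occupancy.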

PubMed | Tsinghua University and Griffith University
Journal: Bioinformatics (Oxford, England) | Year: 2016

The quality of the fragment library determines the efficiency of fragment assembly, an approach that is widely used in de novo protein-structure prediction algorithms. Conventional fragment libraries are constructed mainly based on the identities of amino acids, sometimes facilitated by predicted information including dihedral angles and secondary structures. However, it remains challenging to identify near-native fragment structures with low sequence homology. We introduce a novel fragment-library-construction algorithm, LRFragLib, to improve the detection of near-native low-homology fragments of 7-10 residues, using a multi-stage, flexible selection protocol. Based on logistic regression scoring models, LRFragLib outperforms existing techniques by achieving a significantly higher precision and a comparable coverage on recent CASP protein sets in sampling near-native structures. The method also has a computational efficiency comparable to that of the fastest existing technique, with substantially reduced memory usage. The source code is available for download. Supplementary data are available at Bioinformatics online.
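
As a rough illustration of scoring candidates with a logistic regression model, the sketch below ranks fragments by a logistic score and keeps the top few. The features, weights, bias, and function names are hypothetical placeholders; this is not LRFragLib's actual model or selection protocol.

```python
import math


def logistic_score(features, weights, bias):
    """Probability-like score from a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def select_fragments(candidates, weights, bias, top_k):
    """Rank (name, features) candidates by logistic score and keep the best top_k."""
    scored = [(logistic_score(feats, weights, bias), name) for name, feats in candidates]
    scored.sort(reverse=True)  # highest score first
    return [name for _, name in scored[:top_k]]


if __name__ == "__main__":
    # Hypothetical candidates with two made-up features each.
    candidates = [("a", [0.9, 0.8]), ("b", [0.1, 0.2]), ("c", [0.5, 0.5])]
    print(select_fragments(candidates, weights=[2.0, 1.0], bias=-1.5, top_k=2))
```

In a real pipeline the weights would be fitted on fragments labeled as near-native or not, and the features would come from the sequence and predicted structural properties.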

Chicas A.,Cold Spring Harbor Laboratory | Wang X.,MOE Key Laboratory of Bioinformatics | Zhang C.,Cold Spring Harbor Laboratory | McCurrach M.,Cold Spring Harbor Laboratory | And 7 more authors.
Cancer Cell | Year: 2010

The RB protein family (RB, p107, and p130) has overlapping and compensatory functions in cell-cycle control. However, cancer-associated mutations are almost exclusively found in RB, implying that RB has a nonredundant role in tumor suppression. We demonstrate that RB preferentially associates with E2F target genes involved in DNA replication and is uniquely required to repress these genes during senescence but not other growth states. Consequently, RB loss leads to inappropriate DNA synthesis following a senescence trigger and, together with disruption of a p21-mediated cell-cycle checkpoint, enables extensive proliferation and rampant genomic instability. Our results identify a nonredundant RB effector function that may contribute to tumor suppression and reveal how loss of RB and p53 cooperate to bypass senescence. © 2010 Elsevier Inc. All rights reserved.

Chen Y.,MOE Key Laboratory of Bioinformatics | Wu X.,Massachusetts Institute of Technology | Jiang R.,MOE Key Laboratory of Bioinformatics
BMC Medical Genomics | Year: 2013

Background: The identification of genes involved in human complex diseases remains a great challenge in computational systems biology. Although methods have been developed to use disease phenotypic similarities with a protein-protein interaction network for the prioritization of candidate genes, other valuable omics data sources have been largely overlooked in these methods. Methods: With this understanding, we proposed a method called BRIDGE to prioritize candidate genes by integrating disease phenotypic similarities with such omics data as protein-protein interactions, gene sequence similarities, gene expression patterns, gene ontology annotations, and gene pathway memberships. BRIDGE utilizes a multiple regression model with lasso penalty to automatically weight different data sources and is capable of discovering genes associated with diseases whose genetic bases are completely unknown. Results: We conducted large-scale cross-validation experiments and demonstrated that more than 60% of known disease genes can be ranked top one by BRIDGE in simulated linkage intervals, suggesting the superior performance of this method. We further performed two comprehensive case studies by applying BRIDGE to predict novel genes and transcriptional networks involved in obesity and type II diabetes. Conclusion: The proposed method provides an effective and scalable way of integrating multi-omics data to infer disease genes. Further applications of BRIDGE will help to identify novel disease genes and the underlying mechanisms of human diseases. © 2013 Chen et al.; licensee BioMed Central Ltd.
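
To illustrate how a lasso penalty can weight data sources automatically, the sketch below fits a lasso regression by coordinate descent and shows an uninformative column being driven to exactly zero. The toy data, the objective scaling, and the function names are assumptions for illustration; this is not the BRIDGE implementation.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of rows; each column stands for one (hypothetical) data source.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            rho = 0.0
            for i in range(n):
                # Prediction from every column except j.
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            rho /= n
            w[j] = soft_threshold(rho, lam) / z
    return w


if __name__ == "__main__":
    # Toy data: y depends only on column 0; column 1 is uninformative.
    X = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
    y = [2.0, 4.0, 6.0, 8.0]
    print(lasso_coordinate_descent(X, y, lam=0.1))
```

The exact zero on the second weight is the property that makes a lasso penalty useful for selecting among sources: sources that add little are dropped rather than given small nonzero weights.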

Jiang R.,MOE Key Laboratory of Bioinformatics | Wu M.,MOE Key Laboratory of Bioinformatics | Li L.,MOE Key Laboratory of Bioinformatics
BMC Genomics | Year: 2015

Background: Pinpointing genes involved in inherited human diseases remains a great challenge in the post-genomics era. Although approaches have been proposed that are based either on the guilt-by-association principle or on disease phenotype similarities, the low coverage of both diseases and genes in existing methods has prevented scanning for causative genes at the whole-genome level for a significant proportion of diseases. Results: To overcome this limitation, we proposed a rigorous statistical method called pgFusion to prioritize candidate genes by integrating one type of disease phenotype similarity derived from the Unified Medical Language System (UMLS) and seven types of gene functional similarities calculated from gene expression, gene ontology, pathway membership, protein sequence, protein domain, protein-protein interaction and regulation pattern, respectively. Our method covered a total of 7,719 diseases and 20,327 genes, achieving the highest coverage thus far for both diseases and genes. We performed leave-one-out cross-validation experiments to demonstrate the superior performance of our method and applied it to a real exome sequencing dataset of epileptic encephalopathies, showing the capability of this approach in finding causative genes for complex diseases. We further provided the standalone software and online services of pgFusion. pgFusion not only provided an effective way of prioritizing candidate genes, but also demonstrated feasible solutions to two fundamental questions in the analysis of big genomic data: the comparability of heterogeneous data and the integration of multiple types of data. Applications of this method in exome or whole-genome sequencing studies would accelerate the finding of causative genes for human diseases. Other research fields in genomics could also benefit from the incorporation of our data fusion methodology. © 2015 Jiang et al.; licensee BioMed Central Ltd.
