Marshfield, WI, United States
Marshfield, WI, United States

Time filter

Source Type

Balaji S.,Florida State University | McClendon C.,Florida State University | Chowdhary R.,Marshfield Clinic Marshfield Center | Liu J.S.,Harvard University | Zhang J.,Florida State University
Bioinformatics | Year: 2012

Motivation: Molecular interaction information, such as protein-protein interactions and protein-small molecule interactions, is indispensable for understanding the mechanism of biological processes and discovering treatments for diseases. Many databases have been built by manual annotation of literature to organize such information into structured form. However, most databases focus on only one type of interactions, which are often not well annotated and integrated with related functional information.Results: In this study, we integrate molecular interaction information from literature by automatic information extraction and from manually annotated databases. We further integrate the relationships between protein/gene and other bio-entity terms including gene ontology terms, pathways, species and diseases to build an integrated molecular interaction database (IMID). Interactions can be selected by their associated probabilities. IMID allows complex and versatile queries for context-specific molecular interactions, which are not available currently in other molecular interaction databases. © The Author 2012. Published by Oxford University Press. All rights reserved.

Bell L.,Florida State University | Chowdhary R.,Marshfield Clinic Marshfield Center | Liu J.S.,Harvard University | Niu X.,Florida State University | Zhang J.,Florida State University
PLoS ONE | Year: 2011

A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulated at an increasing speed, the information on bio-entity relationships is archived in different forms at scattered places. Most of such information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates study of biological systems using integrative approaches, but also allows discovery of new knowledge in an automatic and systematic way. In this study, we performed a large scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction of unstructured text in scientific literature. The relationship information we integrated in this study includes protein-protein interactions, protein/gene regulations, protein-small molecule interactions, protein-GO relationships, protein-pathway relationships, and pathway-disease relationships. The relationship information is organized in a graph data structure, named integrated bio-entity network (IBN), where the vertices are the bio-entities and edges represent their relationships. Under this framework, graph theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses-the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs. © 2011 Bell et al.

Chowdhary R.,Marshfield Clinic Marshfield Center | Tan S.L.,Marshfield Clinic Marshfield Center | Zhang J.,Florida State University | Karnik S.,Marshfield Clinic Marshfield Center | And 2 more authors.
PLoS ONE | Year: 2012

Background: Protein interaction networks (PINs) specific within a particular context contain crucial information regarding many cellular biological processes. For example, PINs may include information on the type and directionality of interaction (e.g. phosphorylation), location of interaction (i.e. tissues, cells), and related diseases. Currently, very few tools are capable of deriving context-specific PINs for conducting exploratory analysis. Results: We developed a literature-based online system, Context-specific Protein Network Miner (CPNM), which derives context-specific PINs in real-time from the PubMed database based on a set of user-input keywords and enhanced PubMed query system. CPNM reports enriched information on protein interactions (with type and directionality), their network topology with summary statistics (e.g. most densely connected proteins in the network; most densely connected protein-pairs; and proteins connected by most inbound/outbound links) that can be explored via a user-friendly interface. Some of the novel features of the CPNM system include PIN generation, ontology-based PubMed query enhancement, real-time, user-queried, up-to-date PubMed document processing, and prediction of PIN directionality. Conclusions: CPNM provides a tool for biologists to explore PINs. It is freely accessible at © 2012 Chowdhary et al.

Loading Marshfield Clinic Marshfield Center collaborators
Loading Marshfield Clinic Marshfield Center collaborators