Orchard S.,European Bioinformatics Institute | Ammari M.,University of Arizona | Aranda B.,European Bioinformatics Institute | Breuza L.,Swiss Institute of Bioinformatics | And 32 more authors.
Nucleic Acids Research | Year: 2014

IntAct (freely available at http://www.ebi.ac.uk/intact) is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and are merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium (http://www.imexconsortium.org). © 2013 The Author(s). Published by Oxford University Press.


Gurulingappa H.,Molecular Connections Pvt Ltd | Toldo L.,Merck KGaA | Rajput A.M.,Merck KGaA | Rajput A.M.,University of Bonn | And 3 more authors.
Pharmacoepidemiology and Drug Safety | Year: 2013

Purpose: The aim of this study was to assess the impact of automatically detected adverse event signals from text and open-source data on the prediction of drug label changes. Methods: Open-source adverse effect data were collected from FAERS, Yellow Cards and SIDER databases. A shallow linguistic relation extraction system (JSRE) was applied for extraction of adverse effects from MEDLINE case reports. A statistical approach was applied to the extracted datasets for signal detection and subsequent prediction of label changes issued for 29 drugs by the UK Regulatory Authority in 2009. Results: 76% of drug label changes were automatically predicted. Out of these, 6% of drug label changes were detected only by text mining. JSRE enabled precise identification of four adverse drug events from MEDLINE that were undetectable otherwise. Conclusions: Changes in drug labels can be predicted automatically using data and text mining techniques. Text mining technology is mature and well-placed to support the pharmacovigilance tasks. © 2013 John Wiley & Sons, Ltd.


Gurulingappa H.,Molecular Connections Pvt Ltd | Mateen-Rajput A.,Merck KGaA | Toldo L.,Merck KGaA
Journal of Biomedical Semantics | Year: 2012

The sheer amount of information about potential adverse drug events published in medical case reports poses major challenges for drug safety experts to perform timely monitoring. Efficient strategies for identification and extraction of information about potential adverse drug events from free-text resources are needed to support pharmacovigilance research and pharmaceutical decision making. Therefore, this work focusses on the adaptation of a machine learning-based system for the identification and extraction of potential adverse drug event relations from MEDLINE case reports. It relies on a high quality corpus that was manually annotated using an ontology-driven methodology. Qualitative evaluation of the system showed robust results. An experiment with large scale relation extraction from MEDLINE delivered under-identified potential adverse drug events not reported in drug monographs. Overall, this approach provides a scalable auto-assistance platform for drug safety professionals to automatically collect potential adverse drug events communicated as free-text data. © 2012 Gurulingappa et al.; licensee BioMed Central Ltd.


Kerrien S.,European Bioinformatics Institute | Aranda B.,European Bioinformatics Institute | Breuza L.,Swiss Institute of Bioinformatics | Bridge A.,Swiss Institute of Bioinformatics | And 18 more authors.
Nucleic Acids Research | Year: 2012

IntAct is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. Two levels of curation are now available within the database, with both IMEx-level annotation and less detailed MIMIx-compatible entries currently supported. As from September 2011, IntAct contains approximately 275 000 curated binary interaction evidences from over 5000 publications. The IntAct website has been improved to enhance the search process and in particular the graphical display of the results. New data download formats are also available, which will facilitate the inclusion of IntAct's data in the Semantic Web. IntAct is an active contributor to the IMEx consortium (http://www.imexconsortium.org). IntAct source code and data are freely available at http://www.ebi.ac.uk/intact. © The Author(s) 2011. Published by Oxford University Press.


Malhotra A.,Fraunhofer Institute for Algorithms and Scientific Computing | Malhotra A.,University of Bonn | Younesi E.,Fraunhofer Institute for Algorithms and Scientific Computing | Younesi E.,University of Bonn | And 4 more authors.
PLoS Computational Biology | Year: 2013

Speculative statements communicating experimental findings are frequently found in scientific articles, and their purpose is to provide an impetus for further investigations into the given topic. Automated recognition of speculative statements in scientific text has gained interest in recent years as systematic analysis of such statements could transform speculative thoughts into testable hypotheses. We describe here a pattern matching approach for the detection of speculative statements in scientific text that uses a dictionary of speculative patterns to classify sentences as hypothetical. To demonstrate the practical utility of our approach, we applied it to the domain of Alzheimer's disease and showed that our automated approach captures a wide spectrum of scientific speculations on Alzheimer's disease. Subsequent exploration of derived hypothetical knowledge leads to generation of a coherent overview on emerging knowledge niches, and can thus provide added value to ongoing research activities. © 2013 Malhotra et al.


Gurulingappa H.,Molecular Connections Pvt Ltd | Mudi A.,Molecular Connections Pvt Ltd | Toldo L.,Merck KGaA | Hofmann-Apitius M.,Fraunhofer Institute for Algorithms and Scientific Computing | Bhate J.,Molecular Connections Pvt Ltd
RSC Advances | Year: 2013

Chemical information extracted from the literature is of immense value for the pharmaceutical and chemical industries in many areas, including supporting drug discovery, manufacturing processes, or intellectual property protection. However, the exponential growth of the chemical literature has made it increasingly difficult for researchers to find the information they need within a reasonable time-frame. In order to address this issue, a large number of text mining approaches have been developed that can extract chemical information from different types of literature. But the lack of a single universal standard for chemical structure and nomenclature representation has posed significant challenges in mining the chemical information. Hence, a review on the current state of chemical text mining, problems confronted, solutions available, and future prospectus is presented. © The Royal Society of Chemistry 2013.


Urban M.,Rothamsted Research | Pant R.,Molecular Connections Pvt Ltd | Raghunath A.,Molecular Connections Pvt Ltd | Irvine A.G.,Rothamsted Research | And 2 more authors.
Nucleic Acids Research | Year: 2015

Rapidly evolving pathogens cause a diverse array of diseases and epidemics that threaten crop yield, food security as well as human, animal and ecosystem health. To combat infection greater comparative knowledge is required on the pathogenic process in multiple species. The Pathogen-Host Interactions database (PHI-base) catalogues experimentally verified pathogenicity, virulence and effector genes from bacterial, fungal and protist pathogens. Mutant phenotypes are associated with gene information. The included pathogens infect a wide range of hosts including humans, animals, plants, insects, fish and other fungi. The current version, PHI-base 3.6, available at http://www.phi-base.org, stores information on 2875 genes, 4102 interactions, 110 host species, 160 pathogenic species (103 plant, 3 fungal and 54 animal infecting species) and 181 diseases drawn from 1243 references. Phenotypic and gene function information has been obtained by manual curation of the peer-reviewed literature. A controlled vocabulary consisting of nine high-level phenotype terms permits comparisons and data analysis across the taxonomic space. PHI-base phenotypes were mapped via their associated gene information to reference genomes available in Ensembl Genomes. Virulence genes and hotspots can be visualized directly in genome browsers. Future plans for PHI-base include development of tools facilitating community-led curation and inclusion of the corresponding host target(s). © The Author(s) 2014.


Rajput A.M.,University of Bonn | Gurulingappa H.,Molecular Connections Pvt Ltd
Procedia Computer Science | Year: 2013

Ontology enrichment is a process of embedding metadata associated with concepts described in the ontology. Manual information retrieval and enrichment process is labor-intensive and time consuming as each concept is unique and has domain specific meanings. An approach to deal with this problem is to have a unified resource and an automated solution. Different approaches have been used to automate the enrichment process with varying success. Here, we describe our approach of combining automated information retrieval with manual enrichment of retrieved results. Unified Medical Language System implemented on MySQL server was used as a resource for ontology enrichment. To automate the task of information retrieval, KNIME was used which is a workflow management program. The deployed system allows quick retrieval of metadata associated with nearly 1000 ontology terms in a reasonable time frame. Performance evaluation indicated that most of the retrieved results were accurate. © 2013 The Authors.


PubMed | European Bioinformatics Institute, Rothamsted Research, Molecular Connections Pvt Ltd and University of Cambridge
Type: Journal Article | Journal: Nucleic acids research | Year: 2016

The pathogen-host interactions database (PHI-base) is available at www.phi-base.org. PHI-base contains expertly curated molecular and biological information on genes proven to affect the outcome of pathogen-host interactions reported in peer reviewed research articles. In addition, literature that indicates specific gene alterations that did not affect the disease interaction phenotype is curated to provide complete datasets for comparative purposes. Viruses are not included. Here we describe a revised PHI-base Version 4 data platform with improved search, filtering and extended data display functions. A PHIB-BLAST search function is provided, along with a link to PHI-Canto, a tool for authors to directly curate their own published data into PHI-base. The new release of PHI-base Version 4.2 (October 2016) has an increased data content containing information from 2219 manually curated references. The data provide information on 4460 genes from 264 pathogens tested on 176 hosts in 8046 interactions. Prokaryotic and eukaryotic pathogens are represented in almost equal numbers. Host species belong 70% to plants and 30% to other species of medical and/or environmental importance. Additional data types included in PHI-base 4 are the direct targets of pathogen effector proteins in experimental and natural host organisms. The curation problems encountered and the future directions of the PHI-base project are briefly discussed.

