Dessimoz C.,Swiss Institute of Bioinformatics | Gabaldon T.,Center for Genomic Regulation and | Roos D.S.,University of Pennsylvania | Sonnhammer E.L.L.,Stockholm Bioinformatics Center | Herrero J.,European Bioinformatics Institute
Bioinformatics | Year: 2012

The identification of orthologs (gene pairs descended from a common ancestor through speciation, rather than duplication) has emerged as an essential component of many bioinformatics applications, ranging from the annotation of new genomes to experimental target prioritization. Yet the development and application of orthology inference methods are hampered by the lack of consensus on source proteomes, file formats and benchmarks. The second 'Quest for Orthologs' meeting brought together stakeholders from various communities to address these challenges. We report on achievements and outcomes of this meeting, focusing on topics of particular relevance to the research community at large. The Quest for Orthologs consortium is an open community that welcomes contributions from all researchers interested in orthology research and applications. © The Author(s) 2012. Published by Oxford University Press.

Schmitt T.,Stockholm Bioinformatics Center | Messina D.N.,Stockholm Bioinformatics Center | Schreiber F.,Stockholm Bioinformatics Center | Sonnhammer E.L.L.,Stockholm Bioinformatics Center | Sonnhammer E.L.L.,University of Stockholm
Briefings in Bioinformatics | Year: 2011

There is a great need for standards in the orthology field. Users must contend with different ortholog data representations from each provider, and the providers themselves must independently gather and parse the input sequence data. These burdensome and redundant procedures make data comparison and integration difficult. We have designed two XML-based formats, SeqXML and OrthoXML, to solve these problems. SeqXML is a lightweight format for sequence records, the input for orthology prediction. It stores the same sequence and metadata as typical FASTA format records, but overcomes common problems such as unstructured metadata in the header and erroneous sequence content. XML provides validation to prevent data integrity problems that are frequent in FASTA files. The range of applications for SeqXML is broad and not limited to ortholog prediction. We provide read/write functions for BioJava, BioPerl, and Biopython. OrthoXML was designed to represent ortholog assignments from any source in a consistent and structured way, yet cater to specific needs such as scoring schemes or meta-information. A unified format is particularly valuable for ortholog consumers who want to integrate data from numerous resources, e.g. for gene annotation projects. Reference proteomes for 61 organisms are already available in SeqXML, and 10 orthology databases have signed on to OrthoXML. Adoption by the entire field would substantially facilitate exchange and quality control of sequence and orthology information. © The Author 2011. Published by Oxford University Press.
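
The contrast the abstract draws between free-text FASTA headers and structured XML records can be illustrated with a small sketch. The record contents, element names and attribute names below are hypothetical and chosen only for illustration; they are not the actual SeqXML schema, and the residue check is a stand-in for the schema validation the abstract mentions.

```python
import xml.etree.ElementTree as ET

# Hypothetical FASTA record: all metadata is free text packed into the header.
FASTA_RECORD = ">P12345 example protein OS=Homo sapiens\nMKTAYIAKQR\n"

# Hypothetical XML record holding the same information in named fields.
# Element and attribute names are illustrative, NOT the real SeqXML schema.
XML_RECORD = """
<entry id="P12345">
  <species name="Homo sapiens"/>
  <description>example protein</description>
  <sequence>MKTAYIAKQR</sequence>
</entry>
"""

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta_record(record):
    """Split a FASTA record; everything after the ID stays one free-text string."""
    header, _, body = record.partition("\n")
    identifier, _, free_text = header[1:].partition(" ")
    return {
        "id": identifier,
        "free_text": free_text,
        "sequence": body.replace("\n", ""),
    }


def parse_xml_record(text):
    """Read the illustrative XML record and reject non-amino-acid characters."""
    entry = ET.fromstring(text)
    sequence = entry.findtext("sequence", default="").strip()
    invalid = set(sequence) - AMINO_ACIDS
    if invalid:
        raise ValueError(f"invalid residues: {sorted(invalid)}")
    return {
        "id": entry.get("id"),
        "species": entry.find("species").get("name"),
        "description": entry.findtext("description"),
        "sequence": sequence,
    }


fasta_fields = parse_fasta_record(FASTA_RECORD)
xml_fields = parse_xml_record(XML_RECORD)
```

In the FASTA case the species name has to be recovered from `free_text` by a provider-specific convention, whereas the structured record exposes it as its own field.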

Forslund K.,Stockholm Bioinformatics Center | Schreiber F.,Stockholm Bioinformatics Center | Thanintorn N.,Stockholm Bioinformatics Center | Sonnhammer E.L.L.,Stockholm Bioinformatics Center | Sonnhammer E.L.L.,University of Stockholm
Briefings in Bioinformatics | Year: 2011

Orthology is one of the most important tools available to modern biology, as it allows inferences to be made from easily studied model systems to much less tractable systems of interest, such as ourselves. This becomes important not least in the study of genetic diseases. We here review work on the orthology of disease-associated genes and also present an updated version of the InParanoid-based disease orthology database and web site OrthoDisease, with 14-fold increased species coverage since the previous version. Using this resource, we survey the taxonomic distribution of orthologs of human genes involved in different disease categories. The hypothesis that paralogs can mask the effect of deleterious mutations predicts that known heritable disease genes should have fewer close paralogs. We found large-scale support for this hypothesis, as significantly fewer duplications were observed for disease genes in the OrthoDisease ortholog groups. © The Author 2011. Published by Oxford University Press.

Alexeyenko A.,KTH Royal Institute of Technology | Alexeyenko A.,Science for Life Laboratory | Schmitt T.,Science for Life Laboratory | Schmitt T.,Stockholm Bioinformatics Center | And 15 more authors.
Nucleic Acids Research | Year: 2012

FunCoup (http://FunCoup.sbc.su.se) is a database that maintains and visualizes global gene/protein networks of functional coupling that have been constructed by Bayesian integration of diverse high-throughput data. FunCoup achieves high coverage by orthology-based integration of data sources from different model organisms and from different platforms. We here present release 2.0, in which the data sources have been updated and the methodology has been refined. It contains a new data type, Genetic Interaction, and three new species: chicken, dog and zebrafish. As FunCoup extensively transfers functional coupling information between species, the new input datasets have considerably improved both coverage and quality of the networks. The number of high-confidence network links has increased dramatically. For instance, the human network has more than eight times as many links above confidence 0.5 as the previous release. FunCoup provides facilities for analysing the conservation of subnetworks in multiple species. We here explain how to do comparative interactomics on the FunCoup website. © The Author(s) 2011.

Ostlund G.,Stockholm Bioinformatics Center | Ostlund G.,University of Stockholm | Sonnhammer E.L.L.,Stockholm Bioinformatics Center | Sonnhammer E.L.L.,University of Stockholm | Sonnhammer E.L.L.,Swedish eScience Research Center
Gene | Year: 2012

mRNA expression is widely used as a proxy for protein expression. However, their true relation is not known, and two genes with the same mRNA levels might have different abundances of their respective proteins. A related question is whether the coexpression of mRNA for gene pairs is reflected by the corresponding protein pairs. We examined the mRNA-protein correlation for both expression and coexpression. This analysis yielded insights into the relationship between mRNA and protein abundance, and allowed us to identify subsets of greater mRNA-protein coherence. The correlation between mRNA and protein was low for both expression and coexpression, 0.12 and 0.06 respectively. However, applying the best-performing quality measure, high-quality subsets reached a Spearman correlation of 0.31 for expression, 0.34 for coexpression and 0.49 for coexpression when restricted to functionally coupled genes. Our methodology can thus identify subsets for which the mRNA levels are expected to be most strongly correlated with protein levels. © 2012 Elsevier B.V.
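
For readers unfamiliar with the Spearman correlation reported above, the following minimal sketch computes it as the Pearson correlation of rank vectors. It is a generic textbook formulation, not the authors' pipeline, and the abundance values are made up purely for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all entries tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up mRNA and protein abundances for five genes (illustration only).
mrna = [10.0, 20.0, 30.0, 40.0, 50.0]
protein = [1.0, 3.0, 2.0, 5.0, 4.0]
rho = spearman(mrna, protein)
```

Because only ranks enter the calculation, the measure is insensitive to the different scales on which mRNA and protein abundances are typically reported.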
