Alexeyenko A.,KTH Royal Institute of Technology | Alexeyenko A.,Science for Life Laboratory | Schmitt T.,Science for Life Laboratory | Schmitt T.,Stockholm Bioinformatics Center | And 15 more authors.
Nucleic Acids Research | Year: 2012

FunCoup (http://FunCoup.sbc.su.se) is a database that maintains and visualizes global gene/protein networks of functional coupling that have been constructed by Bayesian integration of diverse high-throughput data. FunCoup achieves high coverage by orthology-based integration of data sources from different model organisms and from different platforms. We here present release 2.0 in which the data sources have been updated and the methodology has been refined. It contains a new data type Genetic Interaction, and three new species: chicken, dog and zebra fish. As FunCoup extensively transfers functional coupling information between species, the new input datasets have considerably improved both coverage and quality of the networks. The number of high-confidence network links has increased dramatically. For instance, the human network has more than eight times as many links above confidence 0.5 as the previous release. FunCoup provides facilities for analysing the conservation of subnetworks in multiple species. We here explain how to do comparative interactomics on the FunCoup website. © The Author(s) 2011.


Tjarnberg A.,Stockholm Bioinformatics Center | Tjarnberg A.,University of Stockholm | Nordling T.E.M.,Stockholm Bioinformatics Center | Nordling T.E.M.,KTH Royal Institute of Technology | And 5 more authors.
Journal of Computational Biology | Year: 2013

Gene regulatory network inference (that is, determination of the regulatory interactions between a set of genes) provides mechanistic insights of central importance to research in systems biology. Most contemporary network inference methods rely on a sparsity/regularization coefficient, which we call ζ (zeta), to determine the degree of sparsity of the network estimates, that is, the total number of links between the nodes. However, they offer little or no advice on how to select this sparsity coefficient, in particular, for biological data with few samples. We show that an empty network is more accurate than estimates obtained for a poor choice of ζ. In order to avoid such poor choices, we propose a method for optimization of ζ, which maximizes the accuracy of the inferred network for any sparsity-dependent inference method and data set. Our procedure is based on leave-one-out cross-optimization and selection of the ζ value that minimizes the prediction error. We also illustrate the adverse effects of noise, few samples, and uninformative experiments on network inference as well as our method for optimization of ζ. We demonstrate that our ζ optimization method for two widely used inference algorithms, Glmnet and NIR, gives accurate and informative estimates of the network structure, given that the data is informative enough. © Mary Ann Liebert, Inc.
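
The leave-one-out selection described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the estimator `fit_sparse` (a soft-thresholded marginal regression) is a hypothetical stand-in for a sparsity-dependent method such as Glmnet or NIR, and all function names are assumptions made for the example.

```python
def soft_threshold(value, zeta):
    """Shrink a coefficient towards zero by zeta; small values become exactly 0."""
    if value > zeta:
        return value - zeta
    if value < -zeta:
        return value + zeta
    return 0.0


def fit_sparse(X, y, zeta):
    """Toy sparsity-dependent estimator (assumption, not Glmnet or NIR):
    per-feature marginal least-squares slope, soft-thresholded by zeta."""
    coefs = []
    for j in range(len(X[0])):
        sxx = sum(row[j] ** 2 for row in X)
        sxy = sum(row[j] * target for row, target in zip(X, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        coefs.append(soft_threshold(slope, zeta))
    return coefs


def predict(coefs, row):
    return sum(c * v for c, v in zip(coefs, row))


def loo_error(X, y, zeta):
    """Sum of squared prediction errors over all leave-one-out splits."""
    total = 0.0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        coefs = fit_sparse(X_train, y_train, zeta)
        total += (y[i] - predict(coefs, X[i])) ** 2
    return total


def select_zeta(X, y, candidates):
    """Pick the candidate zeta with the smallest leave-one-out error."""
    return min(candidates, key=lambda zeta: loo_error(X, y, zeta))


if __name__ == "__main__":
    X = [[1, 0], [0, 1], [1, 1], [2, 0], [0, 2], [1, -1]]
    y = [2.0, 0.1, 2.0, 4.1, -0.1, 1.9]
    print(select_zeta(X, y, [0.0, 0.1, 0.5, 100.0]))
```

A very large ζ drives every coefficient to zero, which corresponds to the empty network that the abstract uses as a baseline for poor choices.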


Sundstrom K.,Karolinska Institutet | Ploner A.,Karolinska Institutet | Arnheim-Dahlstrom L.,Karolinska Institutet | Eloranta S.,Karolinska Institutet | And 8 more authors.
Journal of the National Cancer Institute | Year: 2015

Background: The clinical significance of co-infections with high-risk (HR) and low-risk (LR) human papillomavirus (HPV) in the etiology of cervical cancer is debated, as prospective evidence on this issue is limited. However, the question is of increasing relevance in relation to HPV-based cancer prevention. Methods: In two population-based nested case-control studies among women participating in cervical screening with baseline normal smears, we collected 4659 smears from women who later developed cancer in situ (CIS; n = 524) or squamous cervical cancer (SCC; n = 378) and individually matched control subjects who remained free of disease during study follow-up. The median follow-up until diagnosis was 6.4 to 7.8 years. All smears were tested for HPV. We used conditional logistic regression models with two-way interaction terms to estimate relative risks (RRs) for CIS and SCC, respectively. All statistical tests were two-sided. Results: Compared with women who were infected with HRHPV only, women who were also infected with LRHPV had a lower risk for SCC (RR = 0.2, 95% confidence interval [CI] = 0.04 to 0.99, P = .049). This interaction was not shown for CIS (RR = 1.1, 95% CI = 0.4 to 3.6). Women who were positive for both HRHPV and LRHPV had, on average, a 4.8 year longer time to diagnosis of SCC than women who were positive for HRHPV only (P = .006). Results were highly robust in sensitivity analyses. Conclusion: Co-infection with LRHPV is associated with a lower risk of future invasive disease and longer time to diagnosis than infection with HRHPV alone. We propose that co-infection with LRHPV interferes with the rate of progression to invasive cervical cancer. © 2015 The Author.


Sonnhammer E.L.L.,Stockholm Bioinformatics Center | Sonnhammer E.L.L.,Swedish e-Science Research Center | Sonnhammer E.L.L.,University of Stockholm | Gabaldon T.,Center for Genomic Regulation | And 10 more authors.
Bioinformatics | Year: 2014

Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking. © The Author 2014. Published by Oxford University Press.


Szulkin R.,Karolinska Institutet | Szulkin R.,Center for Family and Community Medicine | Holmberg E.,Gothenburg University | Stattin P.,Umeå University | And 6 more authors.
Prostate | Year: 2012

BACKGROUND: Currently used prognostic markers are limited in their ability to accurately predict disease progression among patients with localized prostate cancer. We examined 23 reported prostate cancer susceptibility variants for association with disease progression. METHODS: Disease progression was explored among 4,673 Swedish patients treated for clinically localized prostate cancer between 1997 and 2002. Prostate cancer progression was defined according to primary treatment as a composite event reflecting termination of deferred treatment, biochemical recurrence, local progression, or presence of distant metastasis. Associations for single variants, and for all variants combined, were tested in Cox regression analysis assuming both log-additive and co-dominant genetic models. RESULTS: Three of the 23 genetic variants explored were nominally associated with prostate cancer progression: rs9364554 (P = 0.041) on chromosome 6q25 and rs10896449 (P = 0.029) on chromosome 11q13 among patients treated with curative intent, and rs4054823 (P = 0.008) on chromosome 17p12 among patients on surveillance. However, none of these associations remained statistically significant after correction for multiple testing. The combined effect of all susceptibility variants was not associated with prostate cancer progression, either among patients receiving treatment with curative intent (P = 0.14) or among patients on surveillance (P = 0.92). CONCLUSIONS: We observed no evidence for an association between any of 23 established prostate cancer genetic risk variants and disease progression. Accumulating evidence suggests separate genetic components for initiation and progression of prostate cancer. Future studies systematically searching for genetic risk variants associated with prostate cancer progression and prognosis are warranted. Copyright © 2011 Wiley Periodicals, Inc.


Forslund K.,University of Stockholm | Sonnhammer E.L.L.,University of Stockholm | Sonnhammer E.L.L.,Swedish e-Science Research Center
Methods in Molecular Biology | Year: 2012

This chapter reviews the current research on how protein domain architectures evolve. We begin by summarizing work on the phylogenetic distribution of proteins, as this directly impacts which domain architectures can be formed in different species. Studies relating domain family size to occurrence have shown that they generally follow power law distributions, both within genomes and larger evolutionary groups. These findings were subsequently extended to multidomain architectures. Genome evolution models that have been suggested to explain the shape of these distributions are reviewed, as well as evidence for selective pressure to expand certain domain families more than others. Each domain has an intrinsic combinatorial propensity, and the effects of this have been studied using measures of domain versatility or promiscuity. Next, we study the principles of protein domain architecture evolution and how these have been inferred from distributions of extant domain arrangements. Following this, we review inferences of ancestral domain architecture and the conclusions concerning domain architecture evolution mechanisms that can be drawn from these. Finally, we examine whether all known cases of a given domain architecture can be assumed to have a single common origin (monophyly) or have evolved convergently (polyphyly). © 2012 Springer Science+Business Media, LLC.


Tjarnberg A.,Stockholm Bioinformatics Center | Tjarnberg A.,University of Stockholm | Nordling T.E.M.,Stockholm Bioinformatics Center | Nordling T.E.M.,Uppsala University | And 5 more authors.
Molecular BioSystems | Year: 2015

Statistical regularisation methods such as LASSO and related L1 regularised regression methods are commonly used to construct models of gene regulatory networks. Although they can theoretically infer the correct network structure, they have been shown in practice to make errors, i.e. leave out existing links and include non-existing links. We show that L1 regularisation methods typically produce a poor network model when the analysed data are ill-conditioned, i.e. the gene expression data matrix has a high condition number, even if it contains enough information for correct network inference. However, when these methods fail, the correct structure of network models can still be obtained for informative data (data with such a signal-to-noise ratio that existing links can be proven to exist) by using least-squares regression and setting small parameters to zero, or by using robust network inference, a recent method that takes the intersection of all non-rejectable models. Since available experimental data sets are generally ill-conditioned, we recommend checking the condition number of the data matrix to avoid this pitfall of L1 regularised inference, and also considering alternative methods. This journal is © The Royal Society of Chemistry 2015.
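
One of the alternatives named in this abstract, least-squares regression followed by setting small parameters to zero, can be sketched as below. This is an illustrative sketch under stated assumptions rather than the authors' procedure: the function names and the fixed `cutoff` are hypothetical, and the fit goes through the normal equations, which squares the condition number of the data matrix and is therefore only reasonable for small, well-conditioned examples.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        rest = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - rest) / M[i][i]
    return x


def least_squares(X, y):
    """Ordinary least squares via the normal equations (X^T X) a = X^T y."""
    n = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) for j in range(n)]
            for i in range(n)]
    rhs = [sum(row[i] * target for row, target in zip(X, y))
           for i in range(n)]
    return solve_linear(gram, rhs)


def threshold_small(coefs, cutoff):
    """Set parameters with magnitude below the (hypothetical) cutoff to zero."""
    return [c if abs(c) >= cutoff else 0.0 for c in coefs]


if __name__ == "__main__":
    X = [[1, 0], [0, 1], [1, 1]]
    y = [2.0, 0.05, 2.05]
    dense = least_squares(X, y)
    print(dense, threshold_small(dense, 0.1))
```

How to choose the cutoff is not specified in the abstract; in this sketch it is simply a user-supplied number.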


Guala D.,Stockholm Bioinformatics Center | Guala D.,University of Stockholm | Sjolund E.,Stockholm Bioinformatics Center | Sjolund E.,University of Stockholm | And 3 more authors.
Bioinformatics | Year: 2014

MaxLink, a guilt-by-association network search algorithm, has been made available as a web resource and a stand-alone version. Based on a user-supplied list of query genes, MaxLink identifies and ranks genes that are tightly linked to the query list. This functionality can be used to predict potential disease genes from an initial set of genes with known association to a disease. The original algorithm, used to identify and rank novel genes potentially involved in cancer, has been updated to use a more statistically sound method for selection of candidate genes and made applicable to other areas than cancer. The algorithm has also been made faster by re-implementation in C++, and the Web site uses FunCoup 3.0 as the underlying network. Availability and implementation: MaxLink is freely available at http://maxlink.sbc.su.se both as a web service and a stand-alone application for download. © 2014 The Author.
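
The guilt-by-association idea in this abstract can be illustrated with a simplified ranking by number of direct links to the query set. This is a sketch under assumptions, not the MaxLink algorithm itself: the statistical candidate selection mentioned in the abstract is not reproduced, and `rank_candidates`, `min_links`, and the toy gene names are hypothetical.

```python
from collections import defaultdict


def rank_candidates(links, query_genes, min_links=1):
    """Rank non-query genes by how many query genes they link to directly.

    links: iterable of (gene_a, gene_b) pairs from an undirected network.
    Returns (gene, link_count) pairs, most connected first, ties by name.
    """
    query = set(query_genes)
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    scores = {}
    for gene, partners in neighbours.items():
        if gene in query:
            continue  # only genes outside the query list are candidates
        count = len(partners & query)
        if count >= min_links:
            scores[gene] = count
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    network = [("A", "X"), ("B", "X"), ("C", "X"), ("A", "Y"), ("Y", "Z")]
    print(rank_candidates(network, ["A", "B", "C"]))
```

In practice the links would come from a confidence-filtered network such as FunCoup rather than a hand-written list.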


Schmitt T.,Stockholm Bioinformatics Center | Schmitt T.,University of Stockholm | Ogris C.,Stockholm Bioinformatics Center | Ogris C.,University of Stockholm | And 3 more authors.
Nucleic Acids Research | Year: 2014

We present an update of the FunCoup database (http://FunCoup.sbc.su.se) of functional couplings, or functional associations, between genes and gene products. Identifying these functional couplings is an important step in the understanding of higher level mechanisms performed by complex cellular processes. FunCoup distinguishes between four classes of couplings: participation in the same signaling cascade, participation in the same metabolic process, co-membership in a protein complex and physical interaction. For each of these four classes, several types of experimental and statistical evidence are combined by Bayesian integration to predict genome-wide functional coupling networks. The FunCoup framework has been completely re-implemented to allow for more frequent future updates. It contains many improvements, such as a regularization procedure to automatically downweight redundant evidences and a novel method to incorporate phylogenetic profile similarity. Several datasets have been updated and new data have been added in FunCoup 3.0. Furthermore, we have developed a new Web site, which provides powerful tools to explore the predicted networks and to retrieve detailed information about the data underlying each prediction. © 2013 The Author(s). Published by Oxford University Press.
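
As a rough illustration of combining several evidence types by Bayesian integration, the sketch below multiplies per-evidence likelihood ratios into posterior odds. It assumes a naive Bayes model in which evidence types are treated as independent; the abstract does not spell out the exact scoring, so this should not be read as FunCoup's actual formula, and `posterior_probability` and the example numbers are hypothetical.

```python
import math


def posterior_probability(prior, likelihood_ratios):
    """Combine independent evidence into a posterior probability.

    prior: prior probability that a gene pair is functionally coupled.
    likelihood_ratios: one ratio P(evidence | coupled) / P(evidence | not
    coupled) per evidence type; each must be positive.
    """
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(ratio) for ratio in likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Two supporting evidence types starting from an uninformative prior.
    print(posterior_probability(0.5, [2.0, 3.0]))
```

Working in log-odds keeps the combination additive, so ratios above 1 raise the score and ratios below 1 lower it.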


Studham M.E.,Stockholm Bioinformatics Center | Studham M.E.,University of Stockholm | Tjarnberg A.,Stockholm Bioinformatics Center | Tjarnberg A.,University of Stockholm | And 6 more authors.
Bioinformatics | Year: 2014

Motivation: Gene regulatory network (GRN) inference reveals the influences genes have on one another in cellular regulatory systems. If the experimental data are inadequate for reliable inference of the network, informative priors have been shown to improve the accuracy of inferences. Results: This study explores the potential of undirected, confidence-weighted networks, such as those in functional association databases, as a prior source for GRN inference. Such networks often erroneously indicate symmetric interaction between genes and may contain mostly correlation-based interaction information. Despite these drawbacks, our testing on synthetic datasets indicates that even noisy priors reflect some causal information that can improve GRN inference accuracy. Our analysis on yeast data indicates that using the functional association databases FunCoup and STRING as priors can give a small improvement in GRN inference accuracy with biological data. © 2014 The Author. Published by Oxford University Press. All rights reserved.
