Interuniversity Institute of Bioinformatics in Brussels

Brussels, Belgium


Pucci F.,Roosevelt University | Pucci F.,Interuniversity Institute of Bioinformatics in Brussels | Rooman M.,Roosevelt University | Rooman M.,Interuniversity Institute of Bioinformatics in Brussels
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences | Year: 2016

Despite the intense efforts of the last decades to understand the thermal stability of proteins, the mechanisms responsible for its modulation still remain debated. In this investigation, we tackle this issue by showing how a multiscale perspective can yield new insights. With the help of temperature-dependent statistical potentials, we analysed some amino acid interactions at the molecular level, which are suggested to be relevant for the enhancement of thermal resistance. We then investigated the thermal stability at the protein level by quantifying its modification upon amino acid substitutions. Finally, a large-scale analysis of protein stability at the structurome level contributed to the clarification of the relation between stability and natural evolution, thereby showing that the mutational profile of proteins differs according to their thermal properties. Some considerations on how the multiscale approach could help in unravelling the protein stability mechanisms are briefly discussed. This article is part of the themed issue 'Multiscale modelling at the physics-chemistry-biology interface'. © 2016 The Author(s) Published by the Royal Society. All rights reserved.

Faust K.,Catholic University of Leuven | Faust K.,Center for the Biology of Disease | Faust K.,Vrije Universiteit Brussel | Lahti L.,Wageningen University | And 8 more authors.
Current Opinion in Microbiology | Year: 2015

The recent increase in the number of microbial time series studies offers new insights into the stability and dynamics of microbial communities, from the world's oceans to human microbiota. Dedicated time series analysis tools allow taking full advantage of these data. Such tools can reveal periodic patterns, help to build predictive models or, on the contrary, quantify irregularities that make community behavior unpredictable. Microbial communities can change abruptly in response to small perturbations, linked to changing conditions or the presence of multiple stable states. With sufficient samples or time points, such alternative states can be detected. In addition, temporal variation of microbial interactions can be captured with time-varying networks. Here, we apply these techniques on multiple longitudinal datasets to illustrate their potential for microbiome research. © 2015 The Authors.

Vranken W.F.,Vrije Universiteit Brussel | Vranken W.F.,Interuniversity Institute of Bioinformatics in Brussels
Progress in Nuclear Magnetic Resonance Spectroscopy | Year: 2014

NMR spectroscopy is a key technique for understanding the behaviour of proteins, especially highly dynamic proteins that adopt multiple conformations in solution. Overall, protein structures determined from NMR spectroscopy data constitute just over 10% of the Protein Data Bank archive. This review covers the validation of these NMR protein structures, but rather than describing currently available methodology, it focuses on concepts that are important for understanding where and how validation is most relevant. First, the inherent characteristics of the protein under study have an influence on quality and quantity of the distinct types of data that can be acquired from NMR experiments. Second, these NMR data are necessarily transformed into a model for use in a structure calculation protocol, and the protein structures that result from this reflect the types of NMR data used as well as the protein characteristics. The validation of NMR protein structures should therefore take account, wherever possible, of the inherent behavioural characteristics of the protein, the types of available NMR data, and the calculation protocol. These concepts are discussed in the context of 'knowledge based' and 'model versus data' validation, with suggestions for questions to ask and different validation categories to consider. The principal aim of this review is to stimulate discussion and to help the reader understand the relationships between the above elements in order to make informed decisions on which validation approaches are the most relevant in particular cases. © 2014 Elsevier B.V. All rights reserved.

Skwark M.J.,University of Stockholm | Skwark M.J.,Aalto University | Raimondi D.,University of Stockholm | Raimondi D.,Interuniversity Institute of Bioinformatics in Brussels | And 2 more authors.
PLoS Computational Biology | Year: 2014

Given sufficiently large protein families, and using a global statistical inference approach, it is possible to obtain sufficient accuracy in protein residue contact predictions to predict the structure of many proteins. However, these approaches do not consider the fact that the contacts in a protein are neither randomly nor independently distributed, but actually follow precise rules governed by the structure of the protein and thus are interdependent. Here, we present PconsC2, a novel method that uses a deep learning approach to identify protein-like contact patterns to improve contact predictions. A substantial enhancement can be seen for all contacts independently of the number of aligned sequences, residue separation or secondary structure type, but is largest for β-sheet containing proteins. In addition to being superior to earlier methods based on statistical inferences, in comparison to state-of-the-art methods using machine learning, PconsC2 is superior for families with more than 100 effective sequence homologs. The improved contact prediction enables improved structure prediction. © 2014 Skwark et al.
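The "effective sequence homologs" mentioned above are usually counted by down-weighting near-duplicate sequences in the alignment. The sketch below shows one common convention, not necessarily the one used for PconsC2: each aligned sequence is weighted by the inverse of the number of sequences (itself included) that share at least a given fraction of identical positions with it, and the weights are summed. The function names and the 0.8 identity threshold are assumptions made for illustration.

```python
def sequence_identity(a, b):
    """Fraction of aligned positions at which two equal-length sequences agree."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def effective_sequences(alignment, identity_threshold=0.8):
    """Sum of per-sequence weights 1/n_i, where n_i counts the sequences
    (including sequence i itself) whose identity to sequence i is at least
    identity_threshold. The threshold value is an assumption, not taken
    from the abstract."""
    total = 0.0
    for seq in alignment:
        neighbours = sum(
            1 for other in alignment
            if sequence_identity(seq, other) >= identity_threshold
        )
        total += 1.0 / neighbours  # neighbours >= 1 because seq matches itself
    return total


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAA", "CCCC"]  # two duplicates and one distinct sequence
    print(effective_sequences(toy_alignment))
```

In the toy alignment the two identical sequences share one unit of weight, so three raw sequences count as two effective ones. The pairwise loop is quadratic in the number of sequences, which is fine for a sketch but slow for large families.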

Cilia E.,Free University of Colombia | Cilia E.,Interuniversity Institute of Bioinformatics in Brussels | Pancsa R.,Vrije Universiteit Brussel | Tompa P.,Interuniversity Institute of Bioinformatics in Brussels | And 7 more authors.
Nucleic Acids Research | Year: 2014

Protein dynamics are important for understanding protein function. Unfortunately, accurate protein dynamics information is difficult to obtain: here we present the DynaMine webserver, which provides predictions for the fast backbone movements of proteins directly from their amino-acid sequence. DynaMine rapidly produces a profile describing the statistical potential for such movements at residue-level resolution. The predicted values have meaning on an absolute scale and go beyond the traditional binary classification of residues as ordered or disordered, thus allowing for direct dynamics comparisons between protein regions. Through this webserver, we provide molecular biologists with an efficient and easy-to-use tool for predicting the dynamical characteristics of any protein of interest, even in the absence of experimental observations. The prediction results are visualized and can be directly downloaded. The DynaMine webserver, including instructive examples describing the meaning of the profiles, is available at © 2014 The Author(s).

Lopes M.,Free University of Colombia | Lopes M.,Interuniversity Institute of Bioinformatics in Brussels | Kutlu B.,Institute for Systems Biology | Miani M.,Free University of Colombia | And 9 more authors.
Genomics | Year: 2014

Type 1 Diabetes (T1D) is an autoimmune disease where local release of cytokines such as IL-1β and IFN-γ contributes to β-cell apoptosis. To identify relevant genes regulating this process we performed a meta-analysis of 8 datasets of β-cell gene expression after exposure to IL-1β and IFN-γ. Two of these datasets are novel and contain time-series expressions in human islet cells and rat INS-1E cells. Genes were ranked according to their differential expression within and after 24 h from exposure, and characterized by function and prior knowledge in the literature. A regulatory network was then inferred from the human time expression datasets, using a time-series extension of a network inference method. The two most differentially expressed genes previously unknown in T1D literature (RIPK2 and ELF3) were found to modulate cytokine-induced apoptosis. The inferred regulatory network is thus supported by the experimental validation, providing a proof-of-concept for the proposed statistical inference approach. © 2014 Elsevier Inc.

Pozzolo A.D.,Free University of Colombia | Caelen O.,Fraud Risk Management Analytics | Bontempi G.,Free University of Colombia | Bontempi G.,Interuniversity Institute of Bioinformatics in Brussels
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2015

A well-known rule of thumb in unbalanced classification recommends the rebalancing (typically by resampling) of the classes before proceeding with the learning of the classifier. Though this seems to work for the majority of cases, no detailed analysis exists about the impact of undersampling on the accuracy of the final classifier. This paper aims to fill this gap by proposing an integrated analysis of the two elements which have the largest impact on the effectiveness of an undersampling strategy: the increase of the variance due to the reduction of the number of samples and the warping of the posterior distribution due to the change of a priori probabilities. In particular we will propose a theoretical analysis specifying under which conditions undersampling is recommended and expected to be effective. It emerges that the impact of undersampling depends on the number of samples, the variance of the classifier, the degree of imbalance and more specifically on the value of the posterior probability. This makes it difficult to predict the average effectiveness of an undersampling strategy since its benefits depend on the distribution of the testing points. Results from several synthetic and real-world unbalanced datasets support and validate our findings. © Springer International Publishing Switzerland 2015.
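The warping of the posterior described above can be illustrated with a short Bayes-rule calculation. The sketch assumes a simple setting that the abstract does not spell out: every minority (positive) sample is kept, and each majority (negative) sample is kept independently with probability beta. Under that assumption, a true posterior p becomes p / (p + beta * (1 - p)) after undersampling. The function names and the parameter beta are introduced here for illustration only.

```python
def warped_posterior(p, beta):
    """Posterior of the positive class after undersampling, given the true
    posterior p, when negatives are kept with probability beta (assumption)
    and all positives are kept."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    return p / (p + beta * (1.0 - p))


def corrected_posterior(p_s, beta):
    """Invert the warping: recover the original posterior from the one
    estimated on the undersampled data."""
    if not 0.0 <= p_s <= 1.0:
        raise ValueError("p_s must lie in [0, 1]")
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    return beta * p_s / (beta * p_s - p_s + 1.0)


if __name__ == "__main__":
    p, beta = 0.1, 0.1  # keep 10% of the negatives
    p_s = warped_posterior(p, beta)
    print(p_s, corrected_posterior(p_s, beta))
```

With beta = 1 nothing is discarded and the posterior is unchanged. As beta shrinks, intermediate posteriors are pushed toward 1, and the size of the shift depends on p itself, which is consistent with the abstract's remark that the effect depends on the value of the posterior probability.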

Cilia E.,Free University of Colombia | Cilia E.,Interuniversity Institute of Bioinformatics in Brussels | Pancsa R.,Vrije Universiteit Brussel | Tompa P.,Interuniversity Institute of Bioinformatics in Brussels | And 6 more authors.
Nature Communications | Year: 2013

Protein function and dynamics are closely related; however, accurate dynamics information is difficult to obtain. Here, based on a carefully assembled data set derived from experimental data for proteins in solution, we quantify backbone dynamics properties on the amino-acid level and develop DynaMine - a fast, high-quality predictor of protein backbone dynamics. DynaMine uses only protein sequence information as input and shows great potential in distinguishing regions of different structural organization, such as folded domains, disordered linkers, molten globules and pre-structured binding motifs of different sizes. It also identifies disordered regions within proteins with an accuracy comparable to the most sophisticated existing predictors, without depending on prior disorder knowledge or three-dimensional structural information. DynaMine provides molecular biologists with an important new method that grasps the dynamical characteristics of any protein of interest, as we show here for human p53 and E1A from human adenovirus 5. © 2013 Macmillan Publishers Limited. All rights reserved.

PubMed | Interuniversity Institute of Bioinformatics in Brussels
Journal: Scientific Reports | Year: 2016

Next Generation Sequencing is dramatically increasing the number of known protein sequences, with related experimentally determined protein structures lagging behind. Structural bioinformatics is attempting to close this gap by developing approaches that predict structure-level characteristics for uncharacterized protein sequences, with most of the developed methods relying heavily on evolutionary information collected from homologous sequences. Here we show that there is a substantial observational selection bias in this approach: the predictions are validated on proteins with known structures from the PDB, but for exactly those proteins significantly more homologs are available than for less studied sequences randomly extracted from UniProt. Structural bioinformatics methods that were developed this way are thus likely to have over-estimated performances; we demonstrate this for two contact prediction methods, where performances drop by up to 60% when taking into account a more realistic amount of evolutionary information. We provide a bias-free dataset for the validation of contact prediction methods called NOUMENON.

PubMed | Interuniversity Institute of Bioinformatics in Brussels and Vrije Universiteit Brussel
Type: Journal Article | Journal: Human mutation | Year: 2016

Cysteines are among the rarest amino acids in nature, and are both functionally and structurally very important for proteins. The ability of cysteines to form disulfide bonds is especially relevant, both for constraining the folded state of the protein and for performing enzymatic duties. But how does the variation record of human proteins reflect their functional importance and structural role, especially with regard to deleterious mutations? We created HUMCYS, a manually curated dataset of single amino acid variants that (1) have a known disease/neutral phenotypic outcome and (2) cause the loss of a cysteine, in order to investigate how mutated cysteines relate to structural aspects such as surface accessibility and cysteine oxidation state. We also have developed a sequence-based in silico cysteine oxidation predictor to overcome the scarcity of experimentally derived oxidation annotations, and applied it to extend our analysis to classes of proteins for which the experimental determination of their structure is technically challenging, such as transmembrane proteins. Our investigation shows that we can gain insights into the reason behind the outcome of cysteine losses in otherwise uncharacterized proteins, and we discuss the possible molecular mechanisms leading to deleterious phenotypes, such as the involvement of the mutated cysteine in a structurally or enzymatically relevant disulfide bond.
