Gagneur J.,TU Munich | Friedel C.,Ludwig Maximilians University of Munich | Heun V.,Ludwig Maximilians University of Munich | Zimmer R.,Ludwig Maximilians University of Munich | And 4 more authors.
Informatik-Spektrum | Year: 2017

Informatics and life sciences (molecular biology and medicine) are undoubtedly the most rapidly growing and most dynamic endeavors of modern society. Computational biology or bioinformatics describes the rising field that integrates those endeavors. Over the last 50 years, the field has shifted focus from the study of individual genes and proteins (1967–1994), to that of entire organisms (1995–2015), and more recently to studying the diversity of populations. The increasing amount of big data created by the life sciences is challenging already by its volume alone. Even more challenging is the high intrinsic complexity of the data. In addition, the data are changing at a breathtaking speed; most data generated in 2016 probe conditions that had not been anticipated 15 years ago. Precision medicine and personalized health are just two descriptors of how modern biology will become relevant for improving our health. All new drugs have at some point relied on bioinformatics tools in their development. Similarly, there would not be any digital medicine without bioinformatics expertise, or any advances without mastering the machine learning tools that turn raw data into valuable insights and decisions. © 2017 Springer-Verlag Berlin Heidelberg

Goldberg T.,Bioinformatics I12 | Hecht M.,Bioinformatics I12 | Hamp T.,Bioinformatics I12 | Karl T.,Bioinformatics I12 | And 35 more authors.
Nucleic Acids Research | Year: 2014

The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy Q18 = 80 ± 3% for eukaryotes and a six-state accuracy Q6 = 89 ± 4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. Response time of the unloaded server is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome not considering the generation of the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3. © 2014 The Author(s).
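As a minimal sketch of what a Q-state accuracy such as Q18 or Q6 measures, the snippet below computes the percentage of proteins whose predicted localization class matches the observed one. The function name `q_state_accuracy` and the five example labels are hypothetical and chosen only for illustration; they are not taken from LocTree3 or its evaluation data, and the snippet does not reproduce how the reported error margins were estimated.

```python
from typing import Sequence


def q_state_accuracy(observed: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the percentage of proteins predicted in their observed class."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one protein is required")
    # Count exact matches between observed and predicted class labels.
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


# Hypothetical toy data (not from the paper): four of five labels agree.
observed = ["nucleus", "cytoplasm", "mitochondrion", "nucleus", "secreted"]
predicted = ["nucleus", "cytoplasm", "nucleus", "nucleus", "secreted"]
accuracy = q_state_accuracy(observed, predicted)
print(f"Q-state accuracy: {accuracy:.1f}%")
```

The same function applies regardless of the number of classes; only the set of labels changes between the 18-state, six-state and three-state settings.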

Yachdav G.,TU Munich | Yachdav G.,Biosof LLC | Kloppmann E.,TU Munich | Kloppmann E.,Columbia University | And 25 more authors.
Nucleic Acids Research | Year: 2014

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. © 2014 The Author(s).

Reeb J.,TU Munich | Kloppmann E.,TU Munich | Kloppmann E.,New York Structural Biology Center | Bernhofer M.,TU Munich | And 4 more authors.
Proteins: Structure, Function and Bioinformatics | Year: 2015

Experimental structure determination continues to be challenging for membrane proteins. Computational prediction methods are therefore needed and widely used to supplement experimental data. Here, we re-examined the state of the art in transmembrane helix prediction based on a nonredundant dataset with 190 high-resolution structures. Analyzing 12 widely-used and well-known methods using a stringent performance measure, we largely confirmed the expected high level of performance. On the other hand, all methods performed worse for proteins that could not have been used for development. A few results stood out: First, all methods predicted proteins in eukaryotes better than those in bacteria. Second, methods worked less well for proteins with many transmembrane helices. Third, most methods correctly discriminated between soluble and transmembrane proteins. However, several older methods often mistook signal peptides for transmembrane helices. Some newer methods have overcome this shortcoming. In our hands, PolyPhobius and MEMSAT-SVM outperformed other methods. © 2014 Wiley Periodicals, Inc.

Bernhofer M.,TU Munich | Kloppmann E.,TU Munich | Kloppmann E.,New York Structural Biology Center | Reeb J.,TU Munich | And 4 more authors.
Proteins: Structure, Function and Bioinformatics | Year: 2016

Transmembrane proteins (TMPs) are important drug targets because they are essential for signaling, regulation, and transport. Despite important breakthroughs, experimental structure determination remains challenging for TMPs. Various methods have bridged the gap by predicting transmembrane helices (TMHs), but room for improvement remains. Here, we present TMSEG, a novel method identifying TMPs and accurately predicting their TMHs and their topology. The method combines machine learning with empirical filters. Testing it on a non-redundant dataset of 41 TMPs and 285 soluble proteins, and applying strict performance measures, TMSEG outperformed the state-of-the-art in our hands. TMSEG correctly distinguished helical TMPs from other proteins with a sensitivity of 98 ± 2% and a false positive rate as low as 3 ± 1%. Individual TMHs were predicted with a precision of 87 ± 3% and recall of 84 ± 3%. Furthermore, in 63 ± 6% of helical TMPs the placement of all TMHs and their inside/outside topology was correctly predicted. There are two main features that distinguish TMSEG from other methods. First, the errors in finding all helical TMPs in an organism are significantly reduced. For example, in human this leads to 200 and 1600 fewer misclassifications compared to the second and third best methods available, and 4400 fewer mistakes than by a simple hydrophobicity-based method. Second, TMSEG provides an add-on improvement for any existing method to benefit from. Proteins 2016; 84:1706–1716. © 2016 Wiley Periodicals, Inc.
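The per-protein measures quoted above follow the standard confusion-matrix definitions. The sketch below shows those definitions in code. The class name `Counts` and the specific counts are hypothetical: they were picked only so that the totals match a dataset of 41 TMPs and 285 soluble proteins, and they are not the paper's actual confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for a binary TMP vs. soluble classification."""
    tp: int  # TMPs predicted as TMPs
    fn: int  # TMPs predicted as soluble
    fp: int  # soluble proteins predicted as TMPs
    tn: int  # soluble proteins predicted as soluble


def sensitivity(c: Counts) -> float:
    """Fraction of true TMPs that were found (also called recall)."""
    return c.tp / (c.tp + c.fn)


def false_positive_rate(c: Counts) -> float:
    """Fraction of soluble proteins wrongly predicted as TMPs."""
    return c.fp / (c.fp + c.tn)


def precision(c: Counts) -> float:
    """Fraction of predicted TMPs that are true TMPs."""
    return c.tp / (c.tp + c.fp)


# Hypothetical counts (assumption, for illustration only): 41 TMPs, 285 soluble.
example = Counts(tp=40, fn=1, fp=9, tn=276)
print(f"sensitivity:         {sensitivity(example):.3f}")
print(f"false positive rate: {false_positive_rate(example):.3f}")
print(f"precision:           {precision(example):.3f}")
```

Because soluble proteins greatly outnumber TMPs in a proteome, even a small false positive rate translates into many misclassified proteins, which is why the abstract reports the reduction in whole-organism misclassifications separately.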
