Computational Biology


News Article | April 17, 2017
Site: www.eurekalert.org

VIDEO: This is an attempt to explain what we think are some of the most salient results of our research, packed into a 4-minute video.

People's ability to make random choices or mimic a random process, such as coming up with hypothetical results for a series of coin flips, peaks around age 25, according to a study published in PLOS Computational Biology. Scientists believe that the ability to behave in a way that appears random arises from some of the most highly developed cognitive processes in humans, and may be connected to abilities such as human creativity.

Previous studies have shown that aging diminishes a person's ability to behave randomly. However, it had been unclear how this ability evolves over a person's lifetime, and it had not been possible to assess the ways in which humans may behave randomly beyond simple statistical tests.

To better understand how age impacts random behavior, Nicolas Gauvrit and colleagues at the Algorithmic Nature Group, LABORES for the Natural and Digital Sciences, Paris, assessed more than 3,400 people aged 4 to 91. Each participant performed a series of online tasks that assessed their ability to behave randomly. The five tasks included listing the hypothetical results of a series of 12 coin flips so that they would "look random to somebody else," guessing which card would appear when selected from a randomly shuffled deck, and listing the hypothetical results of 10 rolls of a die--"the kind of sequence you'd get if you really rolled a die."

The scientists analyzed the participants' choices according to their algorithmic randomness, which is based on the idea that patterns that are more random are harder to summarize mathematically. After controlling for characteristics such as gender, language, and education, they found that age was the only factor that affected the ability to behave randomly. This ability peaked at age 25, on average, and declined from then on.

"This experiment is a kind of reverse Turing test for random behavior, a test of strength between algorithms and humans," says study co-author Hector Zenil. "25 is, on average, the golden age when humans best outsmart computers," adds Dr. Gauvrit.

The study also demonstrated that a relatively short list of choices, say 10 hypothetical coin flips, can be used to reliably gauge the randomness of human behavior. The authors are now using a similar approach to study potential connections between the ability to behave randomly and such things as cognitive decline and neurodegenerative diseases. The authors have produced a video to summarize the key results of their research, which can be found, with a caption and further details, here: https:/

In your coverage please use this URL to provide access to the freely available article in PLOS Computational Biology: http://journals.

Funding: HZ received partial funding from the Swedish Research Council (VR). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist.


News Article | April 27, 2017
Site: www.eurekalert.org

Researchers have developed a personalized algorithm that predicts the impact of particular foods on an individual's blood sugar levels, according to a new study published in PLOS Computational Biology. The algorithm has been integrated into an app, Glucoracle, which will allow individuals with type 2 diabetes to keep a tighter rein on their glucose levels -- the key to preventing or controlling the major complications of a disease that affects 8 percent of Americans. Medications are often prescribed to help patients with type 2 diabetes manage their blood sugar levels, but exercise and diet also play an important role.

"While we know the general effect of different types of food on blood glucose, the detailed effects can vary widely from one person to another and for the same person over time," said lead author David Albers, PhD, associate research scientist in Biomedical Informatics at Columbia University Medical Center (CUMC). "Even with expert guidance, it's difficult for people to understand the true impact of their dietary choices, particularly on a meal-to-meal basis. Our algorithm, integrated into an easy-to-use app, predicts the consequences of eating a specific meal before the food is eaten, allowing individuals to make better nutritional choices during mealtime."

The algorithm uses a technique called data assimilation, in which a mathematical model of a person's response to glucose is regularly updated with observational data--blood sugar measurements and nutritional information--to improve the model's predictions, explained co-study leader George Hripcsak, MD, MS, the Vivian Beaumont Allen Professor and chair of Biomedical Informatics at CUMC. Data assimilation is used in a variety of applications, notably weather forecasting. "The data assimilator is continually updated with the user's food intake and blood glucose measurements, personalizing the model for that individual," said co-study leader Lena Mamykina, PhD, assistant professor of biomedical informatics at CUMC, whose team has designed and developed the Glucoracle app.

Glucoracle allows the user to upload fingerstick blood glucose measurements and a photo of a particular meal to the app, along with a rough estimate of the meal's nutritional content. The app then gives the user an immediate prediction of post-meal blood sugar levels, and the estimate and forecast are subsequently adjusted for accuracy. The app begins generating predictions after it has been used for a week, by which time the data assimilator has learned how the user responds to different foods.

The researchers initially tested the data assimilator on five individuals using the app, including three with type 2 diabetes and two without the disease. The app's predictions were compared with actual post-meal blood glucose measurements and with the predictions of certified diabetes educators. For the two non-diabetic individuals, the app's predictions were comparable to the actual glucose measurements. For the three subjects with diabetes, the app's forecasts were slightly less accurate, possibly due to fluctuations in the physiology of patients with diabetes or parameter error, but were still comparable to the predictions of the diabetes educators.

"There's certainly room for improvement," said Dr. Albers. "This evaluation was designed to prove that it's possible, using routine self-monitoring data, to generate real-time glucose forecasts that people could use to make better nutritional choices. We have been able to make an aspect of diabetes self-management that has been nearly impossible for people with type 2 diabetes more manageable. Now our task is to make the data assimilation tool powering the app even better."

Encouraged by these early results, the research team is preparing for a larger clinical trial. The researchers estimate that the app could be ready for widespread use within two years.

This release is based on text provided by the authors. In your coverage please use this URL to provide access to the freely available article in PLOS Computational Biology: http://journals.

Funding: GH, ML, and DJA are supported by a grant from the National Library of Medicine LM006910. LM, DJA and ML are supported by a grant from the Robert Wood Johnson Foundation RWJF 73070. LM is supported by a grant from the National Institute of Diabetes and Digestive and Kidney Diseases R01DK090372. Competing Interests: The authors have declared that no competing interests exist.
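The release describes data assimilation only at a high level. Purely as an illustration of the idea, and not the Glucoracle model, the sketch below steps a deliberately simple, hypothetical one-compartment glucose model forward in time and corrects it with a fingerstick reading via a scalar Kalman-style update; the model form, the meal term, and every parameter are invented for this example.

def model_step(g, baseline=90.0, carbs=0.0, decay=0.3, carb_gain=0.8):
    """One hourly step of a toy glucose model (mg/dL): relax toward baseline,
    plus a crude rise proportional to grams of carbohydrate eaten this hour."""
    return g + decay * (baseline - g) + carb_gain * carbs

def assimilate(g_pred, p_pred, measurement, meas_var=15.0**2):
    """Scalar Kalman update: blend the model forecast with a fingerstick reading."""
    gain = p_pred / (p_pred + meas_var)
    g_new = g_pred + gain * (measurement - g_pred)
    p_new = (1.0 - gain) * p_pred
    return g_new, p_new

# Hypothetical usage: forecast three hours after a 60 g carbohydrate meal,
# correcting with one reading taken after the first hour.
g, p = 100.0, 20.0**2                              # state estimate and its variance
g, p = model_step(g, carbs=60.0), p + 10.0**2      # predict hour 1 (uncertainty grows)
g, p = assimilate(g, p, measurement=155.0)         # fingerstick reading arrives
for _ in range(2):                                 # forecast hours 2-3, no new data
    g, p = model_step(g), p + 10.0**2
print(f"forecast: {g:.0f} mg/dL (+/- {p**0.5:.0f})")

The design point the release emphasizes is the loop itself: each new food photo and glucose reading tightens the personalized model, so the forecasts improve the longer the app is used.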


News Article | March 17, 2017
Site: www.techtimes.com

The latest advancement in diagnostic technology could help doctors identify the presence of autism biomarkers in a patient's bloodstream as early as childhood. This novel diagnostic tool is the first physiological test that can detect the disorder, and its accuracy greatly surpasses currently used screening measures, which focus only on behavioral symptoms. The blood test uses an algorithm to track levels of metabolites and is designed to predict the occurrence of autism spectrum disorder (ASD) in children, allowing the possibility of earlier diagnosis.

Researchers at the Rensselaer Polytechnic Institute in New York, who created the algorithm, studied its efficiency through advanced data analysis and published the results in the journal PLOS Computational Biology. According to Juergen Hahn, one of the study's lead authors, previous research typically focused on only a single biomarker: one metabolite or gene. Although successful, past results were not statistically strong enough to be replicated in other diagnostic cases. Hahn's study, however, is based on multivariate statistical models that enabled his team to classify children with autism based on their neurological status.

Researchers analyzed biomedical data from 149 blood samples, belonging to 83 autistic children and 76 neurotypical participants (children not affected by ASD) — all aged between 3 and 10. Instead of focusing on individual metabolites, Hahn's team investigated several metabolite patterns correlated with autism and discovered important differences in metabolite concentrations between the ASD test group and the neurotypical cohort. The comparative analysis revealed disparities in two metabolic processes: the methionine cycle (linked to several cellular functions) and the transsulfuration pathway (responsible for producing antioxidants to decrease cell oxidation). Past studies showed that both pathways are altered in people at high risk of autism.

"By measuring 24 metabolites from a blood sample, this algorithm can tell whether or not an individual is on the Autism spectrum, and even to some degree where on the spectrum they land," explained Hahn. This new diagnostic method was shown to be "highly accurate and specific," and it correctly identified 97.6 percent of the children who had autism and 96.1 percent of those who were neurotypical. No other diagnostic approach currently available can produce an equally precise classification of autistic patients or predict on which end of the spectrum they are found. Hahn's team believes their algorithm is "a strong indicator that the metabolites under consideration are strongly correlated with an ASD diagnosis."

"The method presented in this work is the only one of its kind that can classify an individual as being on the autism spectrum or as being neurotypical. We are not aware of any other method, using any type of biomarker that can do this, much less with the degree of accuracy that we see in our work," the study authors said in a statement.

© 2017 Tech Times, All rights reserved. Do not reproduce without permission.
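The article does not spell out the modeling pipeline. Purely to illustrate what multivariate classification over a 24-metabolite panel can look like, the sketch below trains a linear discriminant model on synthetic stand-in data with the study's sample sizes and validates it by cross-validation; the paper's actual features, method, and validation procedure differ.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_asd, n_typical, n_metabolites = 83, 76, 24

# Synthetic metabolite concentrations with a small average group difference.
X = np.vstack([
    rng.normal(loc=1.1, scale=0.3, size=(n_asd, n_metabolites)),
    rng.normal(loc=1.0, scale=0.3, size=(n_typical, n_metabolites)),
])
y = np.array([1] * n_asd + [0] * n_typical)

# Multivariate model: scale the 24 concentrations jointly, then classify,
# so the decision uses the whole metabolite pattern rather than one marker.
clf = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis())
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")

The point of the multivariate approach, as Hahn notes, is that the joint pattern across metabolites carries a signal that no single biomarker shows on its own.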


News Article | April 17, 2017
Site: www.scientificamerican.com

The brain processes sights, sounds and other sensory information—and even makes decisions—based on a calculation of probabilities. At least, that’s what a number of leading theories of mental processing tell us: The body’s master controller builds an internal model from past experiences, and then predicts how best to behave. Although studies have shown humans and other animals make varied behavioral choices even when performing the same task in an identical environment, these hypotheses often attribute such fluctuations to “noise”—to an error in the system.

But not everyone agrees this provides the complete picture. After all, sometimes it really does pay off for randomness to enter the equation. A prey animal has a higher chance of escaping predators if its behavior cannot be anticipated easily, something made possible by introducing greater variability into its decision-making. Or in less stable conditions, when prior experience can no longer provide an accurate gauge for how to act, this kind of complex behavior allows the animal to explore more diverse options, improving its odds of finding the optimal solution. One 2014 study found rats resorted to random behavior when they realized nonrandom behavior was insufficient for outsmarting a computer algorithm. Perhaps, then, this variance cannot simply be chalked up to mere noise. Instead, it plays an essential role in how the brain functions.

Now, in a study published April 12 in PLoS Computational Biology, a group of researchers in the Algorithmic Nature Group at LABORES Scientific Research Lab for the Natural and Digital Sciences in Paris hope to illuminate how this complexity unfolds in humans. “When the rats tried to behave randomly [in 2014],” says Hector Zenil, a computer scientist who is one of the study’s authors, “researchers saw that they were computing how to behave randomly. This computation is what we wanted to capture in our study.” Zenil’s team found that, on average, people’s ability to behave randomly peaks at age 25, then slowly declines until age 60, when it starts to decrease much more rapidly.

To test this, the researchers had more than 3,400 participants, aged four to 91, complete a series of tasks—“a sort of reversed Turing test,” Zenil says, determining how well a human can outcompete a computer when it comes to producing and recognizing random patterns. The subjects had to create sequences of coin tosses and die rolls they believed would look random to another person, guess which card would be drawn from a randomly shuffled deck, point to circles on a screen and color in a grid to form a seemingly random design.

The team then analyzed these responses to quantify their level of randomness by determining the probability that a computer algorithm could generate the same decisions, measuring algorithmic complexity as the length of the shortest possible computer program that could model the participants’ choices. In other words, the more random a person’s behavior, the more difficult it would be to describe his or her responses mathematically, and the longer the algorithm would be. If a sequence were truly random, it would not be possible for such a program to compress the data at all—it would be the same length as the original sequence. After controlling for factors such as language, sex and education, the researchers concluded age was the only characteristic that affected how randomly someone behaved. “At age 25, people can outsmart computers at generating this kind of randomness,” Zenil says.

This developmental trajectory, he adds, reflects what scientists would expect measures of higher cognitive abilities to look like. In fact, a sense of complexity and randomness is based on cognitive functions including attention, inhibition and working memory (which were involved in the study’s five tasks)—although the exact mechanisms behind this relationship remain unknown. “It is around 25, then, that minds are the sharpest.” This makes biological sense, according to Zenil: Natural selection would favor a greater capacity for generating randomness during key reproductive years.

The study’s results may even have implications for understanding human creativity. After all, a large part of being creative is the ability to develop new approaches and test different outcomes. “That means accessing a larger repository of diversity,” Zenil says, “which is essentially randomness. So at 25, people have more resources to behave creatively.”

Zenil’s findings support previous research, which also showed a decline in random behavior with age. But this is the first study to employ an algorithmic approach to measuring complexity as well as the first to do so over a continuous age range. “Earlier studies considered groups of young and older adults, capturing specific statistical aspects such as repetition rate in very long response sequences,” says Gordana Dodig-Crnkovic, a computer scientist at Mälardalen University in Sweden, who was not involved in the research. “The present article goes a step further.”

Using algorithmic measures of randomness, rather than statistical ones, allowed Zenil’s team to examine true random behavior instead of statistical, or pseudorandom, behavior—which, although satisfying statistical tests for randomness, would not necessarily be “incompressible” the way truly random data is. The fact that algorithmic capability differed with age implies the brain is algorithmic in nature—that it does not assume the world is statistically random but takes a more generalized approach without the biases described in more traditional statistical models of the brain.

These results may open up a wider perspective on how the brain works: as an algorithmic probability estimator. The theory would update and eliminate some of the biases in statistical models of decision-making that lie at the heart of prevalent theories—prominent among them is the Bayesian brain hypothesis, which holds that the mind assigns a probability to a conjecture and revises it when new information is received from the senses. “The brain is highly algorithmic,” Zenil says. “It doesn’t behave stochastically, or as a sort of coin-tossing mechanism.” Neglecting an algorithmic approach in favor of only statistical ones gives us an incomplete understanding of the brain, he adds. For instance, a statistical approach does not explain why we can remember sequences of digits such as a phone number—take “246-810-1214,” whose digits are simply even counting numbers: This is not a statistical property, but an algorithmic one. We can recognize the pattern and use it to memorize the number. Algorithmic probability, moreover, allows us to more easily find (and compress) patterns in information that appears random.

“This is a paradigm shift,” Zenil says, “because even though most researchers agree that there is this algorithmic component in the way the mind works, we had been unable to measure it because we did not have the right tools, which we have now developed and introduced in our study.”

Zenil and his team plan to continue exploring human algorithmic complexity, and hope to shed light on the cognitive mechanisms underlying the relationship between behavioral randomness and age. First, however, they plan to conduct their experiments with people who have been diagnosed with neurodegenerative diseases and mental disorders, including Alzheimer’s and schizophrenia. Zenil predicts, for example, that participants diagnosed with the latter will not generate or perceive randomness as well as their counterparts in the control group, because they often make more associations and observe more patterns than the average person does. The researchers’ colleagues are standing by. Their work on complexity, says Dodig-Crnkovic, “presents a very promising approach.”
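The complexity estimators used in the study are tailored to very short sequences, where general-purpose compressors are of little use; with that caveat, a crude compression-based proxy still illustrates the underlying intuition that structured sequences admit shorter descriptions than irregular ones. The sketch below is only an illustration of that idea on longer, hypothetical coin-flip strings, not the algorithmic-probability estimator the authors applied.

import random
import zlib

def compressibility(seq: str) -> float:
    """Compressed length over raw length; values closer to (or above) 1
    indicate little structure the compressor can exploit."""
    raw = seq.encode("ascii")
    return len(zlib.compress(raw, level=9)) / len(raw)

patterned = "HT" * 32                                        # strongly structured flips
irregular = "".join(random.choice("HT") for _ in range(64))  # pseudo-random flips

print(f"patterned sequence: {compressibility(patterned):.2f}")
print(f"irregular sequence: {compressibility(irregular):.2f}")

The patterned string compresses far below its raw length while the irregular one barely shrinks, which is the sense in which "harder to compress" stands in for "more random" in the article's description.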


Scientists have identified two small molecules that could be pursued as potential treatments for chronic inflammatory diseases. According to a paper published in PLOS Computational Biology, the researchers singled out the molecules using a new drug screening approach they developed.

Both molecules, known as T23 and T8, inhibit the function of a protein called tumor necrosis factor (TNF), which is involved in inflammation in diseases such as rheumatoid arthritis, Crohn's disease, psoriasis, multiple sclerosis, and more. Drugs that inhibit TNF's function are considered the most effective way to combat such diseases. However, not all patients respond to them, and their effectiveness can wear off over time.

To aid discovery of better TNF inhibitor drugs, Georgia Melagraki and colleagues from Greece and Cyprus developed a new computer-based drug screening platform. The platform incorporates proprietary molecular properties shared between TNF and another protein called RANKL, which is also involved in chronic inflammatory diseases. The researchers developed the platform based on a combination of advanced computational tools. The platform was then used to virtually screen nearly 15,000 small molecules with unknown activity and to predict their interactions with the TNF and RANKL proteins; specifically, how well the small molecules might disrupt the protein-protein interactions (PPIs) leading to the trimerization and activation of these crucial proteins.

"This virtual experiment identified nine promising molecules out of thousands of candidates," says study co-corresponding author Antreas Afantitis of NovaMechanics Ltd, Cyprus. To further evaluate their potential, the scientists studied how the nine small molecules interacted with TNF and RANKL in real-world laboratory experiments. Of the nine molecules, T23 and T8 surfaced as particularly strong TNF inhibitors. Both molecules bind to TNF and RANKL, preventing them from interacting properly with other proteins. Both also show low potential for causing toxic side effects in humans.

With further research, T23 and T8 could be "further optimized to develop improved treatments for a range of inflammatory, autoimmune, and bone loss diseases," says study co-corresponding author George Kollias of the Biomedical Sciences Research Center 'Alexander Fleming', Greece. Meanwhile, the new virtual drug screening approach could enable discovery of other promising TNF inhibitors, and could be modified to search for potential treatments for additional diseases.

In your coverage please use this URL to provide access to the freely available article in PLOS Computational Biology: http://journals.

Citation: Melagraki G, Ntougkos E, Rinotas V, Papaneophytou C, Leonis G, Mavromoustakos T, et al. (2017) Cheminformatics-aided discovery of small-molecule Protein-Protein Interaction (PPI) dual inhibitors of Tumor Necrosis Factor (TNF) and Receptor Activator of NF-κB Ligand (RANKL). PLoS Comput Biol 13(4): e1005372. https:/

Funding: This work was funded by the Greek "Cooperation" Action project TheRAlead (09SYN-21-784), co-financed by the European Regional Development Fund and NSRF 2007-2013, the Innovative Medicines Initiative (IMI)-funded project BTCure (No 115142), and a European Research Council (ERC) Advanced Grant, MCs-inTEST (No 340217), to GKol. AA would like to acknowledge funding from the Cyprus Research Promotion Foundation, DESMI 2008, ΕΠΙΧΕΙΡΗΣΕΙΣ/ΕΦΑΡΜ/0308/20 http://www. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: Georgia Melagraki, Georgios Leonis and Antreas Afantitis are employed by NovaMechanics Ltd, a drug design company. The other authors declare that there are no conflicts of interest.
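The release summarizes the screen only at a high level. As a rough sketch of the library-ranking step common to virtual screens of this kind, and not the authors' proprietary platform, the code below trains a model on molecules with known activity, scores an unlabeled library of about 15,000 candidates, and nominates a short list for laboratory testing; the descriptors and labels are synthetic placeholders rather than real cheminformatics features.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n_descriptors = 50

# Hypothetical training set: molecular descriptors with known active/inactive labels.
X_train = rng.normal(size=(500, n_descriptors))
y_train = rng.integers(0, 2, size=500)

# Hypothetical unlabeled library of candidate molecules to be screened.
X_library = rng.normal(size=(15_000, n_descriptors))

model = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)
scores = model.predict_proba(X_library)[:, 1]   # predicted probability of activity
top_hits = np.argsort(scores)[::-1][:9]         # carry the best nine forward
print("candidate indices for experimental follow-up:", top_hits)

The shape of the workflow matches what the release describes: thousands of molecules scored in silico, a handful carried into wet-lab validation.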


News Article | April 20, 2017
Site: www.eurekalert.org

Findings could help guide development of treatments that target many mutated proteins at once.

Scientists have identified thousands of previously ignored genetic mutations that, although rare, likely contribute to cancer growth. The findings, which could help pave the way to new treatments, are published in PLOS Computational Biology.

Cancer arises when genetic mutations in a cell cause abnormal growth that leads to a tumor. Some cancer drugs exploit this to attack tumor cells by targeting proteins that are mutated from their usual form because of mutations in the genes that encode them. However, only a fraction of all the mutations that contribute significantly to cancer have been identified.

Thomas Peterson, at the University of Maryland, and colleagues developed a new statistical analysis approach that uses genetic data from cancer patients to find cancer-causing mutations. Unlike previous studies that focused on mutations in individual genes, the new approach addresses similar mutations shared by families of related proteins. Specifically, the new method focuses on mutations in sub-components of proteins known as protein domains. Even though different genes encode them, different proteins can share common protein domains. The new strategy draws on existing knowledge of protein domain structure and function to pinpoint locations within protein domains where mutations are more likely to be found in tumors.

Using this new approach, the researchers identified thousands of rare tumor mutations that occur in the same domain location as mutations found in other proteins in other tumors -- suggesting that they are likely to be involved in cancer. "Maybe only two patients have a mutation in a particular protein, but when you realize it is in exactly the same position within the domain as mutations in other proteins in cancer patients," says senior author of the study Maricel Kann, "you realize it's important to investigate those two mutations."

The researchers have coined the term "oncodomain" to refer to protein domains that are more likely to contain cancer-causing mutations. Further study of oncodomains could help inform drug development: "Because the domains are the same across so many proteins," Kann says, "it is possible that a single treatment could tackle cancers caused by a broad spectrum of mutated proteins."

In your coverage please use this URL to provide access to the freely available article in PLOS Computational Biology: http://journals.

Funding: This work was funded by NSF (award #1446406, PI: MGK) and NIH (award #1K22CA143148, PI: MGK, and award #R01LM009722, CoPI: MGK). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist.
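To make the aggregation idea concrete, here is a minimal sketch, with hypothetical protein and domain names rather than data from the study, of how mutations observed in different proteins can be pooled by their position within a shared domain so that positions hit in more than one protein stand out as oncodomain candidates.

from collections import defaultdict

# (protein, domain, position_within_domain) for observed tumor mutations.
# All entries below are invented examples for illustration.
mutations = [
    ("PROT_A", "kinase_dom", 57),
    ("PROT_B", "kinase_dom", 57),   # same domain position, different protein
    ("PROT_C", "kinase_dom", 12),
    ("PROT_D", "SH2_dom",    33),
    ("PROT_E", "SH2_dom",    33),
]

# Group mutations by (domain, position) and record which proteins carry them.
hits = defaultdict(set)
for protein, domain, pos in mutations:
    hits[(domain, pos)].add(protein)

# Domain positions mutated in more than one protein are candidate hotspots.
for (domain, pos), proteins in sorted(hits.items()):
    if len(proteins) > 1:
        print(f"{domain} position {pos}: mutated in {sorted(proteins)}")

This is the sense in which two mutations that look rare gene-by-gene become informative once they are seen to strike the same spot in a domain shared across many proteins.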


A new computational modeling technique could indicate when atherosclerotic plaques will likely undergo rapid growth, reports a study published this week in PLOS Computational Biology.

Atherosclerosis is a form of vascular disease that can result in heart attacks, strokes and gangrene by causing the thickening of artery walls and the narrowing of the arteries themselves at plaque locations. Every year, atherosclerotic plaques rupture and cause millions of deaths worldwide. While patient imaging is advancing and now gives clinicians the ability to detect the geometry and composition of blood vessels, and even the dynamics of blood flow within an artery, the ability to process and interpret such information is still lacking.

Rita Bhui and Heather Hayenga at the University of Texas, Dallas, have developed a three-dimensional computational approach to capture and integrate crucial spatiotemporal events, in order to predict white blood cell movement from the blood into the artery wall, and the subsequent plaque evolution. The researchers coupled two computational modeling techniques, agent-based modeling (ABM) and computational fluid dynamics (CFD), to simulate the complex phenomena of inflammation-induced atherosclerotic development. This approach provides explanatory insight into the collective behavior of agents, i.e., cells, obeying simple rules. Unlike previous models that only consider biochemical processes or focus on a specific process of the disease, their model reveals how mechanical forces from the blood flow influence white blood cell migration into the artery wall, in addition to the effect of biochemical processes occurring within the artery wall.

Using the model, the researchers discovered that neutrophils, a type of white blood cell, are the primary cell type in the plaque at two timepoints during atherosclerosis: 1) at the beginning and 2) when the plaque starts to restrict the blood flow. Moreover, the model suggests that hemodynamic conditions favorable for plaque growth arise in steps rather than linearly. Knowing how a plaque grows, and which cell types dominate it at different stages of growth, could help clinicians choose the most effective treatment plan.

Looking to the future, the research group aims to expand on the model to make it more patient-specific. "One day multiscale modeling of plaque evolution will be used for individualized decision making, such as deciding whether or not to treat a patient's lesion. It will also be foundational in optimizing design and interventional approaches, such as theorizing how an artery will respond to a pharmaceutical agent or stent design," Hayenga says.

This press release is based on text provided by the authors. In your coverage please use this URL to provide access to the freely available article in PLOS Computational Biology: http://journals.

Citation: Bhui R, Hayenga HN (2017) An agent-based model of leukocyte transendothelial migration during atherogenesis. PLoS Comput Biol 13(5): e1005523. https:/

Funding: Funding from the American Heart Association Scientist Development Grant (17SDG33400239) to HNH was used to support this work https:/ The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist.
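The release describes the ABM-CFD coupling only qualitatively. The toy sketch below, with invented rules and numbers and a crude Poiseuille-style scaling standing in for the CFD step rather than the authors' flow solver, shows the general pattern of such coupling: a flow-derived quantity (here wall shear stress) drives an agent adhesion rule, and the agents' accumulation feeds back into the flow geometry.

import random

random.seed(0)
lumen_radius = 1.0      # arbitrary units
adhered_cells = 0

for step in range(200):
    # Crude stand-in for CFD: for a fixed flow rate, wall shear stress rises
    # roughly with the inverse cube of the lumen radius (Poiseuille scaling).
    shear = 1.0 / lumen_radius**3

    # Agent rule: lower shear means a higher chance a passing leukocyte adheres.
    p_adhere = min(1.0, 0.05 / shear)
    if random.random() < p_adhere:
        adhered_cells += 1
        lumen_radius -= 0.002   # accumulated cells thicken the wall slightly

print(f"adhered cells: {adhered_cells}, remaining lumen radius: {lumen_radius:.2f}")

The published model resolves far more biology (cell types, chemokines, real vessel geometry and flow fields); the sketch only shows why coupling the two scales matters, since each adhesion event changes the very flow conditions that govern the next one.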


News Article | May 26, 2017
Site: www.cemag.us

In the May issue of PLOS Computational Biology, scientists from UC San Diego and the University of Notre Dame report on a study that could open up the field for nanopore-based protein identification – and eventually proteomic profiling of large numbers of proteins in complex mixtures of different types of molecules. According to UC San Diego computer science and engineering professor Pavel Pevzner, senior author on the paper, the new approach identifies proteins by analyzing the distinct electrical signals produced when the molecules pass through a nanopore (which acts like a sieve).

In theory, says Pevzner, nanopores could allow researchers to characterize large numbers of proteins in complex mixtures. While nanopores work extremely well in analyzing single molecules, they are less effective when trying to characterize large numbers of proteins in complex mixtures. As a result, the currently preferred approach to screening complex mixtures involves using other techniques, notably mass spectrometry. (Pevzner and CSE professors Vineet Bafna and Nuno Bandeira are principal investigators of the NIH-funded Center for Computational Mass Spectrometry at UC San Diego.)

As recently as 2016, leading nanopore developers were pessimistic about being able to apply nanopores to large-scale protein profiling in the near term. “We aren’t even close to doing that at the moment,” Oxford Nanopore co-founder Hagan Bayley told GenomeWeb, adding that he “wouldn’t say it’s an impossible goal, but it is a bit of a stretch.”

UC San Diego’s Pevzner, however, believes that a breakthrough is at hand. “The key is to use machine learning to analyze information generated by proteins when they translocate through a nanopore,” says Pevzner. “By applying machine learning techniques, we were able to identify distinct signals that could lead to large-scale nanopore protein analysis.”

In an interview with GenomeWeb, Pevzner says that, early on, the obstacles appeared intractable. "The data was so noisy that we almost thought we should give up," he explains. "I have been working for almost 10 years now on top-down mass spectrometry, and in comparison with protein identification by top-down mass spectrometry, which by now is almost a mature area, it looked like there was no hope that nanopores could produce a comparable signal." Then, when the researchers applied a random forest analysis tool from machine learning to the problem, everything changed. Recalls Mikhail Kolmogorov, a graduate student in Pevzner’s lab: “All of a sudden, the structure of the signal emerged.”

As stated in the PLOS paper, the researchers argue that “the current technology is already sufficient for matching nanospectra against small protein databases, e.g., protein identification in bacterial proteomes.”
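The article names random forests as the tool that made the noisy signals tractable but gives no detail. Purely as an illustration of that kind of analysis, and not the paper's actual nanospectrum-matching pipeline, the sketch below reduces simulated current traces from two hypothetical proteins to a few summary features and checks how well a random forest separates them; the traces, features, and numbers are all invented.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)

def simulate_trace(mean_current, n_points=500, noise=5.0):
    """Hypothetical noisy current trace (pA) for one translocation event."""
    return mean_current + noise * rng.standard_normal(n_points)

def features(trace):
    """Simple per-event summary features fed to the classifier."""
    return [trace.mean(), trace.std(),
            np.percentile(trace, 10), np.percentile(trace, 90)]

# Two hypothetical proteins producing slightly different current blockades.
events = [simulate_trace(60.0) for _ in range(100)] + \
         [simulate_trace(63.0) for _ in range(100)]
X = np.array([features(t) for t in events])
y = np.array([0] * 100 + [1] * 100)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(f"cross-validated accuracy: {cross_val_score(clf, X, y, cv=5).mean():.2f}")

The design choice this illustrates is the one Kolmogorov describes: once the raw, noisy signal is summarized into features a tree ensemble can work with, structure that is invisible to the eye becomes separable.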

