Gretton A.,MPI for Intelligent Systems |
Gretton A.,Gatsby Computational Neuroscience Unit |
Borgwardt K.M.,Max Planck Institutes Tubingen |
Rasch M.J.,Beijing Normal University |
And 3 more authors.
Journal of Machine Learning Research | Year: 2012
We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests. © 2012 Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf and Alexander Smola.
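To make the quadratic-time computation concrete, here is a minimal sketch of an unbiased estimate of the squared MMD. The Gaussian kernel, the bandwidth `sigma=1.0`, the function names, and the synthetic data are all assumptions chosen for illustration; the abstract does not fix any of them, and this is not the authors' reference implementation.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-sq_dists / (2.0 * sigma**2))

def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased quadratic-time estimate of the squared MMD between samples x and y."""
    m, n = len(x), len(y)
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    # Drop the diagonal terms k(x_i, x_i) so the within-sample averages are unbiased.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    term_xy = k_xy.mean()
    return term_xx + term_yy - 2.0 * term_xy

# Hypothetical data: two samples from the same Gaussian, and two whose means differ.
rng = np.random.default_rng(0)
same = mmd2_unbiased(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
diff = mmd2_unbiased(rng.normal(size=(200, 2)), rng.normal(loc=1.0, size=(200, 2)))
```

In this sketch `same` should sit near zero and `diff` should be clearly positive. Turning the statistic into a test would still need a rejection threshold, for example from the bounds or the asymptotic distribution mentioned in the abstract.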
Bunzeck N.,University College London |
Dayan P.,Gatsby Computational Neuroscience Unit |
Duzel E.,University College London |
Duzel E.,Otto Von Guericke University of Magdeburg
Human Brain Mapping | Year: 2010
Declarative memory is remarkably adaptive in the way it maintains sensitivity to relative novelty in both unknown and highly familiar environments. However, the neural mechanisms underlying this contextual adaptation are poorly understood. On the basis of emerging links between novelty processing and reinforcement learning mechanisms, we hypothesized that responses to novelty will be adaptively scaled according to expected contextual probabilities of new and familiar events, in the same way that responses to prediction errors for rewards are scaled according to their expected range. Using functional magnetic resonance imaging in humans, we show that the influence of novelty and reward on memory formation in an incidental memory task is adaptively scaled and furthermore that the BOLD signal in orbital prefrontal and medial temporal cortices exhibits concomitant scaled adaptive coding. These findings demonstrate a new mechanism for adjusting gain and sensitivity in declarative memory in accordance with contextual probabilities and expectancies of future events. © 2010 Wiley-Liss, Inc.
Song L.,Georgia Institute of Technology |
Smola A.,Yahoo! |
Gretton A.,Gatsby Computational Neuroscience Unit |
Gretton A.,Intelligent Group |
And 3 more authors.
Journal of Machine Learning Research | Year: 2012
We introduce a framework for feature selection based on dependence maximization between the selected features and the labels of an estimation problem, using the Hilbert-Schmidt Independence Criterion. The key idea is that good features should be highly dependent on the labels. Our approach leads to a greedy procedure for feature selection. We show that a number of existing feature selectors are special cases of this framework. Experiments on both artificial and real-world data show that our feature selector works well in practice. © 2012 Le Song, Alex Smola, Arthur Gretton, Justin Bedo and Karsten Borgwardt.
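As a rough illustration of dependence-based feature selection, the sketch below computes a biased empirical Hilbert-Schmidt Independence Criterion and uses it in a greedy forward search. The forward direction, the Gaussian kernels with unit bandwidth, the function names, and the toy data are assumptions made for this example; the abstract only says the procedure is greedy.

```python
import numpy as np

def rbf_gram(z, sigma=1.0):
    """Gaussian kernel Gram matrix for the rows of z (bandwidth is an assumed default)."""
    sq = np.sum(z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * z @ z.T
    return np.exp(-d2 / (2.0 * sigma**2))

def hsic(k, l):
    """Biased empirical HSIC from two Gram matrices: trace(K H L H) / (m - 1)^2."""
    m = k.shape[0]
    h = np.eye(m) - np.ones((m, m)) / m  # centering matrix
    return np.trace(k @ h @ l @ h) / (m - 1) ** 2

def greedy_forward_select(x, y, n_select):
    """Repeatedly add the feature that most increases HSIC between features and labels."""
    l = rbf_gram(y.reshape(len(y), -1))
    selected = []
    remaining = list(range(x.shape[1]))
    for _ in range(n_select):
        scores = [hsic(rbf_gram(x[:, selected + [j]]), l) for j in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical data: only feature 2 carries information about the label.
rng = np.random.default_rng(0)
x = rng.normal(size=(150, 5))
y = x[:, 2] + 0.1 * rng.normal(size=150)
chosen = greedy_forward_select(x, y, n_select=1)
```

On this toy problem the search should pick feature 2 first, since it is the only column the label depends on.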
News Article | December 7, 2016
When you start a new job, it's normal to spend the first day working out who's who in the pecking order, information that will come in handy for making useful connections in the future. In an fMRI study published December 7 in Neuron, researchers at DeepMind and University College London provide new insights into how we acquire knowledge about social hierarchies, reveal the specific mechanisms at play when that hierarchy is our own (as compared to that of another person), and demonstrate that the brain automatically generates signals of social rank even when they're not needed to perform a task. The work could prove useful in guiding future research, not only in neuroscience, but also in artificial intelligence. In order to determine how we learn about social hierarchies, the authors asked 30 healthy college students to perform a task in the fMRI scanner. In this task, they learned about the power structure of a fictitious company that they imagined working for in the future and that of one of their friends. They learned about the relative power of different people in each company, through watching "contests" between pairs of individuals and seeing who won. Once they understood the power structures of both companies, they then saw pictures of individual people from each company and had to decide which company the person worked for. "We found that the way in which participants learn about the power of individuals was best explained by a process of Bayesian inference," says Dharshan Kumaran, a research scientist at DeepMind. "Essentially you have an estimate about the level of power of each person, which you update as you receive new information (i.e., the outcome of a contest between 2 people)."
In this context, you can actually gain knowledge about how powerful someone is when they're not around: for example, if you see that Jane wins a contest against Paul, and later Paul wins many contests against other people, you should probably up your estimate of Jane's power because the evidence suggests that Paul is much better than you might have previously thought. So what this means is that people are able to rapidly form a coherent understanding of the whole hierarchy through putting together the outcome of different interactions between people, filling in missing pieces. "We found that different processes seem to be used for learning about and representing a social structure that you yourself are part of, compared to a social structure that involves someone else," says Dharshan Kumaran. "The prefrontal cortex, a region that is highly developed in humans, was particularly important when participants were learning about the power of people in their own social group, as compared to that of another person. This points towards the special nature of representing information that relates to the self." Indeed, sophisticated social interactions necessitate distinguishing one's own thoughts, goals, and preferences from those of other people--a cognitive function we know humans in particular excel at. "Part of the reason we do neuroscience research at DeepMind is because our ultimate goal is to develop artificial general intelligence that can be applied to solve some of the world's most intractable problems," says Kumaran. "Understanding how we ourselves learn structured forms of knowledge is a key component of what we'd call 'intelligence,' and it is therefore an important focus for our research." This work is supported by DeepMind in London, the Gatsby Computational Neuroscience Unit, the Institute of Cognitive Neuroscience at University College London, and the Wellcome Trust.
Neuron (@NeuroCellPress), published by Cell Press, is a bimonthly journal that has established itself as one of the most influential and relied upon journals in the field of neuroscience and one of the premier intellectual forums of the neuroscience community. It publishes interdisciplinary articles that integrate biophysical, cellular, developmental, and molecular approaches with a systems approach to sensory, motor, and higher-order cognitive functions.
Pouget A.,University of Geneva |
Pouget A.,University of Rochester |
Pouget A.,Gatsby Computational Neuroscience Unit |
Drugowitsch J.,University of Geneva |
Kepecs A.,Cold Spring Harbor Laboratory
Nature Neuroscience | Year: 2016
When facing uncertainty, adaptive behavioral strategies demand that the brain performs probabilistic computations. In this probabilistic framework, the notions of certainty and confidence would appear to be closely related, so much so that it is tempting to conclude that these two concepts are one and the same. We argue that there are computational reasons to distinguish between these two concepts. Specifically, we propose that confidence should be defined as the probability that a decision or a proposition, overt or covert, is correct given the evidence, a critical quantity in complex sequential decisions. We suggest that the term certainty should be reserved to refer to the encoding of all other probability distributions over sensory and cognitive variables. We also discuss strategies for studying the neural codes for confidence and certainty and argue that clear definitions of neural codes are essential to understanding the relative contributions of various cortical areas to decision making. © 2016 Nature America, Inc. All rights reserved.
Kanitscheider I.,University of Geneva |
Kanitscheider I.,University of Texas at Austin |
Coen-Cagli R.,University of Geneva |
Kohn A.,Yeshiva University |
And 3 more authors.
PLoS Computational Biology | Year: 2015
Neural responses are known to be variable. In order to understand how this neural variability constrains behavioral performance, we need to be able to measure the reliability with which a sensory stimulus is encoded in a given population. However, such measures are challenging for two reasons: First, they must take into account noise correlations, which can have a large influence on reliability. Second, they need to be as efficient as possible, since the number of trials available in a set of neural recordings is usually limited by experimental constraints. Traditionally, cross-validated decoding has been used as a reliability measure, but it only provides a lower bound on reliability and underestimates reliability substantially in small datasets. We show that, if the number of trials per condition is larger than the number of neurons, there is an alternative, direct estimate of reliability which consistently leads to smaller errors and is much faster to compute. The superior performance of the direct estimator is evident both for simulated data and for neuronal population recordings from macaque primary visual cortex. Furthermore, we propose generalizations of the direct estimator which measure changes in stimulus encoding across conditions and the impact of correlations on encoding and decoding, typically denoted by I_shuffle and I_diag respectively. © 2015 Kanitscheider et al.
News Article | March 4, 2016
The first major results of the Blue Brain Project, a detailed simulation of a bit of rat neocortex about the size of a grain of coarse sand, were published last year [1]. The model represents 31,000 brain cells and 37 million synapses. It runs on a supercomputer and is based on data collected over 20 years. Furthermore, it behaves just like a speck of brain tissue. But therein, say critics, lies the problem. “It's the best biophysical model we have of any brain, but that's not enough,” says Christof Koch, a neuroscientist at the Allen Institute for Brain Science in Seattle, Washington, which has embarked on its own large-scale brain-modelling effort. The trouble with the model is that it holds no surprises: no higher functions or unexpected features have emerged from it. Some neuroscientists, including Koch, say that this is because the model was not built with a particular hypothesis about cognitive processes in mind. Its success will depend on whether specific questions can be asked of it. The irony, says neuroscientist Alexandre Pouget, is that deriving answers will require drastic simplification of the model, “unless we figure out how to adjust the billions of parameters of the simulations, which would seem to be a challenging problem to say the least”. By contrast, Pouget's group at the University of Geneva, Switzerland, is generating and testing hypotheses on how the brain deals with uncertainty in functions such as attention and decision-making. There is a widespread preference for hypothesis-driven approaches in the brain-modelling community. Some models might be very small and detailed, for example, focusing on a single synapse. Others might explore the electrical spiking of whole neurons, the communication patterns between brain areas, or even attempt to recapitulate the whole brain. But ultimately a model needs to answer questions about brain function if we are to advance our understanding of cognition.
Blue Brain is not the only sophisticated model to have hit the headlines in recent years. In late 2012, theoretical neuroscientist Chris Eliasmith at the University of Waterloo in Canada unveiled Spaun, a whole-brain model that contains 2.5 million neurons (a fraction of the human brain's estimated 86 billion). Spaun has a digital eye and a robotic arm, and can reason through eight complex tasks such as memorizing and reciting lists, all of which involve multiple areas of the brain [2]. Nevertheless, Henry Markram, a neurobiologist at the Swiss Federal Institute of Technology in Lausanne who is leading the Blue Brain Project, noted [3] at the time: “It is not a brain model.” Although Markram's dismissal of Spaun amused Eliasmith, it did not surprise him. Markram is well known for taking a different approach to modelling, as he did in the Blue Brain Project. His strategy is to build in every possible detail to derive a perfect imitation of the biological processes in the brain with the hope that higher functions will emerge: a 'bottom-up' approach. Researchers such as Eliasmith and Pouget take a 'top-down' strategy, creating simpler models based on our knowledge of behaviour. These skate over certain details, instead focusing on testing hypotheses about brain function. Rather than dismiss the criticism, Eliasmith took Markram's comment on board and added bottom-up detail to Spaun. He selected a handful of frontal cortex neurons, which were relatively simple to begin with, and swapped them for much more complicated neurons, ones that account for multiple ion channels and changes in electrical activity over time. Although these complicated neurons were more biologically realistic, Eliasmith found that they brought no improvement to Spaun's performance on the original eight tasks. “A good model doesn't introduce complexity for complexity's sake,” he says.
For many years, computational models of the brain were what theorists call unconstrained: there were not enough experimental data to map onto the models or to fully test them. For instance, scientists could record electrical activity, but from only one neuron at a time, which limited their ability to represent neural networks. Back then, brain models were simple out of necessity. In the past decade, an array of technologies has provided more information. Imaging technology has revealed previously hidden parts of the brain. Researchers can control genes to isolate particular functions. And emerging statistical methods have helped to describe complex phenomena in simpler terms. These techniques are feeding newer generations of models. Nevertheless, most theorists think that a good model includes only the details needed to help answer a specific question. Indeed, one of the most challenging aspects of model building is working out which details are important to include and which are acceptable to ignore. “The simpler the model is, the easier it is to analyse and understand, manipulate and test,” says cognitive and computational neuroscientist Anil Seth of the University of Sussex in Brighton, UK. An oft-cited success in theoretical neuroscience is the Reichardt detector (a simple, top-down model for how the brain senses motion), proposed by German physicist Werner Reichardt in the 1950s. “The big advantage of the Reichardt model for motion detection was that it was an algorithm to begin with,” says neurobiologist Alexander Borst of the Max Planck Institute of Neurobiology in Martinsried, Germany. “It doesn't speak about neurons at all.” When Borst joined the Max Planck Society in the mid-1980s, he ran computational simulations of the Reichardt model, and got surprising results.
He found, for instance, that neurons oscillated when first presented with a pattern that was moving at constant velocity, a result that he took to Werner Reichardt, who was also taken aback. “He didn't expect his model to show that,” says Borst. They confirmed the results in real neurons, and continued to refine and expand Reichardt's model to gain insight into how the visual system detects motion. In the realm of bottom-up models, the greatest success has come from a set of equations developed in 1952 to explain how flow of ions in and out of a nerve cell produces an action potential. These Hodgkin–Huxley equations are “beautiful and inspirational”, says neurobiologist Anthony Zador of Cold Spring Harbor Laboratory in New York, adding that they have allowed many scientists to make predictions about how neuronal excitability works. The equations, or their variants, form some of the basic building blocks of many of today's larger brain models of cognition. Although many theoretical neuroscientists do not see value in pure bottom-up approaches such as that taken by the Blue Brain Project, they do not dismiss bottom-up models entirely. These types of data-driven brain simulations have the benefit of reminding model-builders what they do not know, which can inspire new experiments. And top-down approaches can often benefit from the addition of more detail, says theoretical neuroscientist Peter Dayan of the Gatsby Computational Neuroscience Unit at University College London. “The best kind of modelling is going top-down and bottom-up simultaneously,” he says. Borst, for example, is now approaching the Reichardt detector from the bottom up to explore questions such as how neurotransmitter receptors on motion-sensitive neurons interact. And Eliasmith's more complex Spaun has allowed him to do other types of experiment that he couldn't before; in particular, he can now mimic the effect of sodium-channel blockers on the brain.
Also taking a multiscale approach is neuroscientist Xiao-Jing Wang of New York University Shanghai in China, whose group described a large-scale model of the interaction of circuits across different regions of the macaque brain [4]. The model is built, in part, from his previous, smaller models of local neuronal circuits that show how neurons in a group fire in time. To scale up to the entire brain, Wang had to include the strength of the feedback between areas. Only now has he got the right data, thanks to the burgeoning field of connectomics (the study of connection maps within an organism's nervous system), to build in this important detail, he says. Wang is using his model to study decision-making, the integration of sensory information and other cognitive processes. In physics, the marriage between experiment and theory led to the development of unifying principles. And although neuroscientists might hope for a similar revelation in their field, the brain (and biology in general) is inherently more noisy than a physical system, says computational neuroscientist Gustavo Deco of the Pompeu Fabra University in Barcelona, Spain, who is an investigator on the Human Brain Project. Deco points out that equations describing the behaviour of neurons and synapses are non-linear, and neurons are connected in a variety of ways, interacting in both a feedforward and a feedback manner. That said, there are examples of theory allowing neuroscientists to extract general principles, such as how the brain balances excitation and inhibition, and how neurons fire in synchrony, Wang says. Complex neuroscience often requires huge computational resources. But it is not a want of supercomputers that limits good, theory-driven models. “It is a lack of knowledge about experimental facts. We need more facts and maybe more ideas,” Borst says.
Those who crave vast amounts of computer power misunderstand the real challenge facing scientists who are trying to unravel the mysteries of the brain, Borst contends. “I still don't see the need for simulating one million neurons simultaneously in order to understand what the brain is doing,” he says, referring to the large-scale simulation linked with the Human Brain Project. “I'm sure we can reduce that to a handful of neurons and get some ideas.” Computational neuroscientist Andreas Herz, of the Ludwig-Maximilians University in Munich, Germany, agrees. “We make best progress if we focus on specific elements of neural computation,” he says. For example, a single cortical neuron receives input from thousands of other cells, but it is unclear how it processes this information. “Without this knowledge, attempts to simulate the whole brain in a seemingly biologically realistic manner are doomed to fail,” he adds. At the same time, supercomputers do allow researchers to build details into their models and see how they compare to the originals, as with Spaun. Eliasmith has used Spaun and its variations to see what happens when he kills neurons or tweaks other features to investigate ageing, motor control or stroke damage in the brain. For him, adding complexity to a model has to serve a purpose. “We need to build bigger and bigger models in every direction, more neurons and more detail,” he says. “So that we can break them.”
Grabska-Barwinska A.,Gatsby Computational Neuroscience Unit |
Beck J.,Duke University |
Pouget A.,University of Geneva |
Latham P.E.,Gatsby Computational Neuroscience Unit
Advances in Neural Information Processing Systems | Year: 2013
The olfactory system faces a difficult inference problem: it has to determine what odors are present based on the distributed activation of its receptor neurons. Here we derive neural implementations of two approximate inference algorithms that could be used by the brain. One is a variational algorithm (which builds on the work of Beck et al., 2012), the other is based on sampling. Importantly, we use a more realistic prior distribution over odors than has been used in the past: we use a "spike and slab" prior, for which most odors have zero concentration. After mapping the two algorithms onto neural dynamics, we find that both can infer correct odors in less than 100 ms. Thus, at the behavioral level, the two algorithms make very similar predictions. However, they make different assumptions about connectivity and neural computations, and make different predictions about neural activity. Thus, they should be distinguishable experimentally. If so, that would provide insight into the mechanisms employed by the olfactory system, and, because the two algorithms use very different coding strategies, that would also provide insight into how networks represent probabilities.
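A "spike and slab" prior can be pictured by sampling from one. In the sketch below each odor is absent (concentration exactly zero, the spike) with high probability and otherwise draws a positive concentration from a continuous distribution (the slab). The exponential slab, the presence probability of 0.01, and the function name are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def sample_spike_and_slab(n_odors, p_present, slab_scale, rng):
    """Draw odor concentrations: exactly zero with probability 1 - p_present,
    otherwise a positive value from an (assumed) exponential slab."""
    present = rng.random(n_odors) < p_present
    slab_draws = rng.exponential(slab_scale, size=n_odors)
    return np.where(present, slab_draws, 0.0)

# Hypothetical settings: 1000 possible odors, about 1% present at any time.
rng = np.random.default_rng(0)
c = sample_spike_and_slab(n_odors=1000, p_present=0.01, slab_scale=1.0, rng=rng)
fraction_absent = float(np.mean(c == 0.0))
```

With these settings nearly all entries of `c` are exactly zero, which is the sparsity property the abstract describes as more realistic than earlier priors.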
Bahrami B.,University College London |
Bahrami B.,Aarhus University Hospital |
Olsen K.,Aarhus University Hospital |
Latham P.E.,Gatsby Computational Neuroscience Unit |
And 4 more authors.
Science | Year: 2010
In everyday life, many people believe that two heads are better than one. Our ability to solve problems together appears to be fundamental to the current dominance and future survival of the human species. But are two heads really better than one? We addressed this question in the context of a collective low-level perceptual decision-making task. For two observers of nearly equal visual sensitivity, two heads were definitely better than one, provided they were given the opportunity to communicate freely, even in the absence of any feedback about decision outcomes. But for observers with very different visual sensitivities, two heads were actually worse than the better one. These seemingly discrepant patterns of group behavior can be explained by a model in which two heads are Bayes optimal under the assumption that individuals accurately communicate their level of confidence on every trial.
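The abstract does not give the model's equations, so the following is only a hedged illustration of how both outcomes can arise from one rule. It assumes each observer has a sensitivity `s` (for example, a psychometric slope) and that a pair sharing confidence attains a joint sensitivity of `(s1 + s2) / sqrt(2)`. That combination rule and the function names are assumptions of this sketch.

```python
import math

def dyad_sensitivity(s1, s2):
    """Assumed joint sensitivity of two observers who share confidence."""
    return (s1 + s2) / math.sqrt(2.0)

def collective_benefit(s1, s2):
    """Ratio of the pair's sensitivity to that of its better member."""
    return dyad_sensitivity(s1, s2) / max(s1, s2)

# Under this rule the pair beats its better member only when the
# weaker observer's sensitivity exceeds (sqrt(2) - 1) times the stronger one's.
break_even_ratio = math.sqrt(2.0) - 1.0

similar = collective_benefit(1.0, 1.0)     # equally sensitive observers
dissimilar = collective_benefit(1.0, 0.2)  # very different sensitivities
```

Under the assumed rule, `similar` exceeds 1 (the pair outperforms either individual) while `dissimilar` falls below 1 (the pair does worse than its better member), matching the qualitative pattern reported above.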
Lloyd K.,Gatsby Computational Neuroscience Unit |
Dayan P.,Gatsby Computational Neuroscience Unit
PLoS Computational Biology | Year: 2015
Substantial evidence suggests that the phasic activity of dopamine neurons represents reinforcement learning’s temporal difference prediction error. However, recent reports of ramp-like increases in dopamine concentration in the striatum when animals are about to act, or are about to reach rewards, appear to pose a challenge to established thinking. This is because the implied activity is persistently predictable by preceding stimuli, and so cannot arise as this sort of prediction error. Here, we explore three possible accounts of such ramping signals: (a) the resolution of uncertainty about the timing of action; (b) the direct influence of dopamine over mechanisms associated with making choices; and (c) a new model of discounted vigour. Collectively, these suggest that dopamine ramps may be explained, with only minor disturbance, by standard theoretical ideas, though urgent questions remain regarding their proximal cause. We suggest experimental approaches to disentangling which of the proposed mechanisms are responsible for dopamine ramps. © 2015 Lloyd, Dayan.
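For reference, the standard temporal difference prediction error can be written as follows. The symbols are the usual textbook ones and are assumed here rather than taken from the abstract: r_t is the reward at time t, gamma is the discount factor, and V(s) is the learned value of state s.

```latex
\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t),
\qquad
\mathbb{E}\left[\delta_t \mid s_t\right] = 0
\quad \text{once } V \text{ is accurate.}
```

The second expression is why a signal that preceding stimuli predict reliably is hard to read as this kind of error: after learning, the expected error given the current state is zero, so a persistently predictable ramp needs another explanation, such as the three accounts listed above.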