UCL uncovers how exploratory actions in animals learn faster than AI
Researchers at the Sainsbury Wellcome Centre and Gatsby Computational Neuroscience Unit at UCL found the instinctual exploratory runs that animals carry out are not random. These purposeful actions allow mice to learn a map of the world efficiently. The study, published today in Neuron, describes how neuroscientists tested their hypothesis that the specific exploratory actions that animals undertake, such as darting quickly towards objects, are important in helping them learn how to navigate their environment.
Professor Tiago Branco, Group Leader at the Sainsbury Wellcome Centre at UCL and corresponding author on the paper, said: “There are a lot of theories in psychology about how performing certain actions facilitates learning. In this study, we tested whether simply observing obstacles in an environment was enough to learn about them, or if purposeful, sensory-guided actions help animals build a cognitive map of the world.”
In previous work, scientists at UCL observed a correlation between how well animals learn to go around an obstacle and the number of times they had run to the object. In this study, Philip Shamash (Sainsbury Wellcome Centre at UCL), PhD student and first author of the paper, carried out experiments to test the impact of preventing animals from performing exploratory runs. By expressing a light-activated protein called channelrhodopsin in one part of the motor cortex, Philip was able to use optogenetic tools to prevent animals from initiating exploratory runs towards obstacles.
The team found that even though mice had spent a lot of time observing and sniffing obstacles, if they were prevented in running towards them, they did not learn. This shows that the instinctive exploratory actions themselves are helping the animals learn a map of their environment.
To explore the algorithms that the brain might be using to learn, the team worked with Sebastian Lee, a PhD student in Dr Andrew Saxe’s lab in the Sainsbury Wellcome Centre at UCL, to run different models of reinforcement learning that people have developed for artificial agents, and observe which one most closely reproduces the mouse behaviour.
There are two main classes of reinforcement learning models: model-free and model-based. The team found that under some conditions mice act in a model-free way but under other conditions, they seem to have a model of the world. And so the researchers implemented an agent that can arbitrate between model-free and model-based. This is not necessarily how the mouse brain works, but it helped them to understand what is required in a learning algorithm to explain the behaviour.
Professor Branco said: “One of the problems with artificial intelligence is that agents need a lot of experience in order to learn something. They have to explore the environment thousands of times, whereas a real animal can learn an environment in less than ten minutes. We think this is in part because, unlike artificial agents, animals’ exploration is not random and instead focuses on salient objects. This kind of directed exploration makes the learning more efficient and so they need less experience to learn.”
The next steps for the researchers are to explore the link between the execution of exploratory actions and the representation of subgoals. The team are now carrying out recordings in the brain to discover which areas are involved in representing subgoals and how the exploratory actions lead to the formation of the representations.
This research was supported by Wellcome and the Royal Society.