Dr Mark Humphries

PhD Projects

The interaction of performance and learning in action selection

Principal Investigator: Dr Mark Humphries


Project available for individuals with self-arranged funding.


Reinforcement learning models of behaviour separate the learning and performance of actions. In these models, appropriate actions are learnt through prediction-error feedback from their consequences. Actions are chosen according to their learnt values, modulated by the current balance between the desire to exploit existing knowledge and the desire to explore new options. But by controlling which actions are chosen, this exploration-exploitation trade-off must alter the course of learning. This project will explore how this interaction between performance and learning works when the explore-exploit trade-off is itself a function of the rate of learning.

We have good reason to believe that learning and performance are coupled in the brain. A longstanding theory holds that phasic dopamine signals a prediction error. New evidence and models suggest that tonic dopamine controls the exploration-exploitation trade-off. Because tonic dopamine is, to a first approximation, just the time integral of phasic dopamine, the two are coupled.
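As an illustration only, the coupling described above can be sketched as a simple bandit task in Python. This is not the project's model: the function names, the leaky-integrator form of the "tonic" signal, the linear mapping from that signal to the softmax inverse temperature `beta`, and all parameter values are assumptions made for the example. In particular, the direction in which tonic dopamine shifts the trade-off is chosen arbitrarily here.

```python
import math
import random


def softmax_choice(values, beta, rng):
    """Pick an action index with probability proportional to exp(beta * value)."""
    top = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (v - top)) for v in values]
    return rng.choices(range(len(values)), weights=weights)[0]


def run_bandit(reward_probs, n_trials=500, alpha=0.1, decay=0.95,
               beta_base=2.0, beta_gain=4.0, seed=0):
    """Simulate learning in which exploration depends on recent prediction errors.

    Assumptions (hypothetical, for illustration):
      - 'phasic' signal = reward prediction error on each trial
      - 'tonic' signal  = leaky running integral of the phasic signal
      - beta = beta_base + beta_gain * tonic, clipped at zero
    """
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)  # learnt action values
    tonic = 0.0
    history = []
    for _ in range(n_trials):
        # Performance: the explore-exploit balance is set by the tonic signal.
        beta = max(0.0, beta_base + beta_gain * tonic)
        action = softmax_choice(values, beta, rng)
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Learning: update the chosen action's value from the prediction error.
        delta = reward - values[action]
        values[action] += alpha * delta

        # Coupling: the tonic signal integrates the phasic signal over time.
        tonic = decay * tonic + (1.0 - decay) * delta
        history.append((action, reward, beta))
    return values, history


if __name__ == "__main__":
    values, history = run_bandit([0.2, 0.8])
    print("learnt values:", [round(v, 2) for v in values])
    print("final beta:", round(history[-1][2], 2))
```

Because `beta` depends on the history of prediction errors, the choices made on each trial shape what is learnt next, and what is learnt in turn changes how choices are made. Swapping the sign of `beta_gain` gives the opposite assumption about how the tonic signal shifts the trade-off.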

We will use both algorithmic and neural models to study this interaction and the role of dopamine in it. One goal will be to determine whether the classic distinction between habitual and goal-directed instrumental behaviour is actually a performance effect rather than a distinction between learning systems. Another goal will be to seek ideas for forms of directed exploration that could advance the cutting edge of machine learning.


Related Publications

  • Humphries, M. D., Khamassi, M. & Gurney, K. (2012) Dopaminergic control of the exploration-exploitation trade-off via the basal ganglia. Frontiers in Neuroscience, 6, 9.
  • Khamassi, M. & Humphries, M. D. (2012) Integrating cortico-limbic-basal ganglia architectures for learning model-based and model-free navigation strategies. Frontiers in Behavioral Neuroscience, 6, 79.
  • Wunderlich, K., Smittenaar, P. & Dolan, R. J. (2012) Dopamine Enhances Model-Based over Model-Free Choice Behavior. Neuron, 75, 418-424.


Fee Band

This project has a Band 1 fee. Details of different fee bands are available for UK/EU or International applicants. See: Fees.

How to Apply

Find out how to apply for this PhD project.