John G. Mikhael [1,2,7,8,*], HyungGoo R. Kim [3,4,5,7], Naoshige Uchida [5], and Samuel J. Gershman [6]
[1] Program in Neuroscience, Harvard Medical School, Boston, MA 02115, USA
[2] MD-PhD Program, Harvard Medical School, Boston, MA 02115, USA
[3] Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon 16419, Republic of Korea
[4] Department of Biomedical Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
[5] Department of Molecular and Cellular Biology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
[6] Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA 02138, USA
[7] These authors contributed equally
[8] Lead contact
[*] Correspondence
Abstract
Reinforcement learning models of the basal ganglia map the phasic dopamine signal to reward prediction errors (RPEs). Conventional models assert that, when a stimulus predicts a reward with fixed delay, dopamine activity during the delay should converge to baseline through learning. However, recent studies have found that dopamine ramps up before reward in certain conditions even after learning, thus challenging the conventional models. In this work, we show that sensory feedback causes an unbiased learner to produce RPE ramps. Our model predicts that when feedback gradually decreases during a trial, dopamine activity should resemble a “bump,” whose ramp-up phase should, furthermore, be greater than that of conditions where the feedback stays high. We trained mice on a virtual navigation task with varying brightness, and both predictions were empirically observed. In sum, our theoretical and experimental results reconcile the seemingly conflicting data on dopamine behaviors under the RPE hypothesis.
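The conventional prediction mentioned in the abstract can be illustrated with a minimal sketch of tabular temporal-difference (TD) learning, in which the RPE is computed as r + gamma * V(next state) - V(current state). This is not the authors' model and does not include the sensory-feedback mechanism; the chain length, discount factor, learning rate, and the function name `train_td` are all illustrative assumptions. It only shows that, with perfectly known states and a fixed delay, the RPE during the delay shrinks toward zero as learning proceeds. The cue-evoked response at trial onset is not modeled here.

```python
import numpy as np

def train_td(n_states=10, gamma=0.9, alpha=0.1, n_trials=2000, reward=1.0):
    """Tabular TD(0) on a chain of states; reward arrives in the last state.

    All parameter values are illustrative assumptions, not taken from the paper.
    Returns the learned values and the RPEs from the final trial.
    """
    # One extra entry serves as the terminal state, whose value stays 0.
    V = np.zeros(n_states + 1)
    rpe = np.zeros(n_states)
    for _ in range(n_trials):
        for t in range(n_states):
            r = reward if t == n_states - 1 else 0.0
            # Reward prediction error for the transition t -> t + 1.
            rpe[t] = r + gamma * V[t + 1] - V[t]
            # Move the value estimate toward the TD target.
            V[t] += alpha * rpe[t]
    return V[:n_states], rpe

values, rpes = train_td()
print("learned values:", np.round(values, 3))
print("max |RPE| on final trial:", np.abs(rpes).max())
```

In this sketch the learned values rise toward the reward while the per-step RPE vanishes, which is the baseline behavior that the ramping observations described in the abstract appear to contradict.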