Reinforcement learning approaches to hippocampus-dependent flexible spatial navigation

Brain and Neuroscience Advances ◽

10.1177/2398212820975634 ◽

2021 ◽

Vol 5 ◽

pp. 239821282097563

Author(s):

Charline Tessereau ◽

Reuben O’Dea ◽

Stephen Coombes ◽

Tobias Bast

Keyword(s):

Reinforcement Learning ◽

Spatial Navigation ◽

Place Learning ◽

Learning Performance ◽

Learning Approaches ◽

Spatial Cues ◽

Delayed Matching ◽

Place Memory ◽

Neurobiological Substrates ◽

New Locations

Humans and non-human animals show great flexibility in spatial navigation, including the ability to return to specific locations based on as few as one single experience. To study spatial navigation in the laboratory, watermaze tasks, in which rats have to find a hidden platform in a pool of cloudy water surrounded by spatial cues, have long been used. Analogous tasks have been developed for human participants using virtual environments. Spatial learning in the watermaze is facilitated by the hippocampus. In particular, rapid, one-trial, allocentric place learning, as measured in the delayed-matching-to-place variant of the watermaze task, which requires rodents to learn repeatedly new locations in a familiar environment, is hippocampal dependent. In this article, we review some computational principles, embedded within a reinforcement learning framework, that utilise hippocampal spatial representations for navigation in watermaze tasks. We consider which key elements underlie their efficacy, and discuss their limitations in accounting for hippocampus-dependent navigation, both in terms of behavioural performance (i.e. how well do they reproduce behavioural measures of rapid place learning) and neurobiological realism (i.e. how well do they map to neurobiological substrates involved in rapid place learning). We discuss how an actor–critic architecture, enabling simultaneous assessment of the value of the current location and of the optimal direction to follow, can reproduce one-trial place learning performance as shown on watermaze and virtual delayed-matching-to-place tasks by rats and humans, respectively, if complemented with map-like place representations. The contribution of actor–critic mechanisms to delayed-matching-to-place performance is consistent with neurobiological findings implicating the striatum and hippocampo-striatal interaction in delayed-matching-to-place performance, given that the striatum has been associated with actor–critic mechanisms. 
Moreover, we illustrate that hierarchical computations embedded within an actor–critic architecture may help to account for aspects of flexible spatial navigation. The hierarchical reinforcement learning approach separates trajectory control via a temporal-difference error from goal selection via a goal prediction error and may account for flexible, trial-specific, navigation to familiar goal locations, as required in some arm-maze place memory tasks, although it does not capture one-trial learning of new goal locations, as observed in open field, including watermaze and virtual, delayed-matching-to-place tasks. Future models of one-shot learning of new goal locations, as observed on delayed-matching-to-place tasks, should incorporate hippocampal plasticity mechanisms that integrate new goal information with allocentric place representation, as such mechanisms are supported by substantial empirical evidence.


Reinforcement Learning approaches to hippocampus-dependent flexible spatial navigation

10.1101/2020.07.30.229005 ◽

2020 ◽

Author(s):

Charline Tessereau ◽

Reuben O’Dea ◽

Stephen Coombes ◽

Tobias Bast

Keyword(s):

Reinforcement Learning ◽

Spatial Navigation ◽

Place Learning ◽

Learning Performance ◽

Learning Approaches ◽

Spatial Cues ◽

Place Memory ◽

Neurobiological Substrates ◽

New Locations ◽

Human Participants

Humans and non-human animals show great flexibility in spatial navigation, including the ability to return to specific locations based on as few as one single experience. To study spatial navigation in the laboratory, watermaze tasks, in which rats have to find a hidden platform in a pool of cloudy water surrounded by spatial cues, have long been used. Analogous tasks have been developed for human participants using virtual environments. Spatial learning in the watermaze is facilitated by the hippocampus. In particular, rapid, one-trial, allocentric place learning, as measured in the Delayed-Matching-to-Place (DMP) variant of the watermaze task, which requires rodents to learn repeatedly new locations in a familiar environment, is hippocampal dependent. In this article, we review some computational principles, embedded within a Reinforcement Learning (RL) framework, that utilise hippocampal spatial representations for navigation in watermaze tasks. We consider which key elements underlie their efficacy, and discuss their limitations in accounting for hippocampus-dependent navigation, both in terms of behavioural performance (i.e., how well do they reproduce behavioural measures of rapid place learning) and neurobiological realism (i.e., how well do they map to neurobiological substrates involved in rapid place learning). We discuss how an actor-critic architecture, enabling simultaneous assessment of the value of the current location and of the optimal direction to follow, can reproduce one-trial place learning performance as shown on watermaze and virtual DMP tasks by rats and humans, respectively, if complemented with map-like place representations. The contribution of actor-critic mechanisms to DMP performance is consistent with neurobiological findings implicating the striatum and hippocampo-striatal interaction in DMP performance, given that the striatum has been associated with actor-critic mechanisms.
Moreover, we illustrate that hierarchical computations embedded within an actor-critic architecture may help to account for aspects of flexible spatial navigation. The hierarchical RL approach separates trajectory control via a temporal-difference error from goal selection via a goal prediction error and may account for flexible, trial-specific, navigation to familiar goal locations, as required in some arm-maze place memory tasks, although it does not capture one-trial learning of new goal locations, as observed in open field, including watermaze and virtual, DMP tasks. Future models of one-shot learning of new goal locations, as observed on DMP tasks, should incorporate hippocampal plasticity mechanisms that integrate new goal information with allocentric place representation, as such mechanisms are supported by substantial empirical evidence.


Goal-driven active learning

Autonomous Agents and Multi-Agent Systems ◽

10.1007/s10458-021-09527-5 ◽

2021 ◽

Vol 35 (2) ◽

Author(s):

Nicolas Bougie ◽

Ryutaro Ichise

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Learning Process ◽

Real World ◽

Imitation Learning ◽

Learning Approaches ◽

Wide Range ◽

Fixed Set ◽

Complex Decision Making ◽

Complex Decision

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. In fact, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or distribution mismatch (when the learner’s goal deviates from the demonstrated behaviors). Besides, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the MuJoCo domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks in terms of exploration efficiency and average scores.


Adaptive and Reinforcement Learning Approaches for Online Network Monitoring and Analysis

IEEE Transactions on Network and Service Management ◽

10.1109/tnsm.2020.3037486 ◽

2020 ◽

pp. 1-1

Author(s):

Sarah Wassermann ◽

Thibaut Cuvelier ◽

Pavol Mulinka ◽

Pedro Casas

Keyword(s):

Reinforcement Learning ◽

Network Monitoring ◽

Learning Approaches


Reinforcement Learning Approaches in Social Robotics

Sensors ◽

10.3390/s21041292 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1292

Author(s):

Neziha Akalin ◽

Amy Loutfi

Keyword(s):

Reinforcement Learning ◽

Real World ◽

Social Robotics ◽

Research Field ◽

Social Robots ◽

Learning Approaches ◽

Reward Function ◽

Optimal Behavior ◽

Learning Challenges ◽

Starting Point

This article surveys reinforcement learning approaches in social robotics. Reinforcement learning is a framework for decision-making problems in which an agent interacts through trial-and-error with its environment to discover an optimal behavior. Since interaction is a key component in both reinforcement learning and social robotics, it can be a well-suited approach for real-world interactions with physically embodied social robots. The scope of the paper is focused particularly on studies that include social physical robots and real-world human-robot interactions with users. We present a thorough analysis of reinforcement learning approaches in social robotics. In addition to a survey, we categorize existing reinforcement learning approaches based on the method used and the design of the reward mechanisms. Moreover, since communication capability is a prominent feature of social robots, we discuss and group the papers based on the communication medium used for reward formulation. Considering the importance of designing the reward function, we also provide a categorization of the papers based on the nature of the reward. This categorization includes three major themes: interactive reinforcement learning, intrinsically motivated methods, and task performance-driven methods. The paper also covers the benefits and challenges of reinforcement learning in social robotics, the evaluation methods of the papers regarding whether or not they use subjective and algorithmic measures, a discussion in view of real-world reinforcement learning challenges and proposed solutions, and the points that remain to be explored, including the approaches that have thus far received less attention. Thus, this paper aims to become a starting point for researchers interested in using and applying reinforcement learning methods in this particular research field.


Reinforcement Learning approaches to Economic Dispatch problem

International Journal of Electrical Power & Energy Systems ◽

10.1016/j.ijepes.2010.12.008 ◽

2011 ◽

Vol 33 (4) ◽

pp. 836-845 ◽

Cited By ~ 28

Author(s):

E.A. Jasmin ◽

T.P. Imthias Ahamed ◽

V.P. Jagathy Raj

Keyword(s):

Reinforcement Learning ◽

Economic Dispatch ◽

Learning Approaches ◽

Economic Dispatch Problem


Reinforcement Learning Approaches to Biological Manufacturing Systems

CIRP Annals ◽

10.1016/s0007-8506(07)62960-6 ◽

2000 ◽

Vol 49 (1) ◽

pp. 343-346 ◽

Cited By ~ 51

Author(s):

Kanji Ueda ◽

Itsuo Hatono ◽

Nobutada Fujii ◽

Jari Vaario

Keyword(s):

Reinforcement Learning ◽

Manufacturing Systems ◽

Learning Approaches


Place learning in virtual space III: Investigation of spatial navigation training procedures and their application to fMRI and clinical neuropsychology

Behavior Research Methods, Instruments, & Computers ◽

10.3758/bf03195344 ◽

2001 ◽

Vol 33 (1) ◽

pp. 21-37 ◽

Cited By ~ 30

Author(s):

Kevin G. F. Thomas ◽

Ming Hsu ◽

Holly E. Laurance ◽

Lynn Nadel ◽

W. Jake Jacobs

Keyword(s):

Spatial Navigation ◽

Place Learning ◽

Virtual Space ◽

Clinical Neuropsychology


A Comparative Study of Model-Free Reinforcement Learning Approaches

Advances in Intelligent Systems and Computing - Advanced Machine Learning Technologies and Applications ◽

10.1007/978-981-15-3383-9_50 ◽

2020 ◽

pp. 547-557

Author(s):

Anant Moudgalya ◽

Ayman Shafi ◽

B. Amulya Arun

Keyword(s):

Reinforcement Learning ◽

Comparative Study ◽

Learning Approaches ◽

Model Free


Policy-Based Reinforcement Learning Approaches

Deep Reinforcement Learning ◽

10.1007/978-981-13-8285-7_10 ◽

2019 ◽

pp. 127-140

Author(s):

Mohit Sewak

Keyword(s):

Reinforcement Learning ◽

Learning Approaches


Enhancing Building Performance and Environmental Learning

Architecture and Design ◽

10.4018/978-1-5225-7314-2.ch026 ◽

2019 ◽

pp. 707-727

Author(s):

Shannon M. Chance ◽

J. Timothy Cole

Keyword(s):

Public Schools ◽

Natural Resources ◽

Learning Performance ◽

Building Performance ◽

School Buildings ◽

Learning Approaches ◽

Environmental Learning ◽

Certification Programs ◽

The People ◽

Virginia Beach

School buildings directly affect their natural and socio-cultural environments. They do this through their construction, maintenance, operation, and demolition. Most of the school buildings we have in stock today drain natural resources and inadvertently perpetuate a culture of environmental, social, and long-term economic ignorance and misuse. When approached thoughtfully, however, the design of school buildings can help inform and enrich society. Well-designed buildings can impart environmental knowledge and values. They can foster more effective behaviors among the people who learn in and from them. Effectively designed buildings can also conserve natural resources and—at their best—even help replenish the natural environment. For many school leaders today, participation in green certification programs represents one important step toward improved building and learning performance. This chapter provides a case study of successful learning approaches developed by Virginia Beach City Public Schools (VBCPS).
