Advanced Off-policy Approaches

Department of Electrical Engineering

Control Robotics and Machine Learning Lab

Technion - Israel Institute of Technology

המעבדה לבקרה רובוטיקה ולמידה חישובית

Advanced Approaches in Off-Policy Learning

Background

Many of the recent breakthroughs in Deep Reinforcement Learning (DRL) have come from advanced methods for collecting and processing data. The original DQN is an example: its experience replay enabled the first DRL solution to a wide range of Atari games. Later methods, such as Prioritised Experience Replay, Distributed Experience Replay, and more unusual data collection methods such as Go-Explore, have each led to state-of-the-art performance in various games.
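To make the experience replay idea concrete, here is a minimal sketch of a uniform replay buffer in plain Python. It is an illustration only, not the DQN authors' code: the names `Transition` and `ReplayBuffer`, the field layout, and the `capacity` and `seed` parameters are all assumptions made for this example.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; the field names are illustrative.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-capacity FIFO buffer with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # deque(maxlen=...) evicts the oldest transition once full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._storage)

    def add(self, state, action, reward, next_state, done):
        self._storage.append(
            Transition(state, action, reward, next_state, done)
        )

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return self._rng.sample(list(self._storage), batch_size)


# Usage: store five transitions in a buffer that holds only three.
buffer = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buffer.add(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
batch = buffer.sample(batch_size=2)
```

Sampling at random from stored transitions, rather than training on them in the order they were collected, breaks the correlation between consecutive samples and lets each transition be reused in several updates.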

Project Goal

Implement advanced methods for experience storing and collection.
Assess the performance of each method.
Publish results.

Project steps

Understand the DQN framework.
Understand advanced experience replay methods (prioritized / distributed).
Implement new storage and collection methods.
Suggest your own ideas!
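As a starting point for the prioritized replay step above, the following is a minimal sketch of proportional prioritization: transition i is sampled with probability p_i^alpha / sum_k p_k^alpha and receives an importance-sampling weight (N * P(i))^(-beta). The class name, the default values of `alpha`, `beta`, and `eps`, the O(N) list-based sampling (practical implementations typically use a sum tree), sampling with replacement, and normalizing weights by the batch maximum are all assumptions made for this illustration.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay, O(N) per sample (sketch only)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6, seed=None):
        self.capacity = capacity
        self.alpha = alpha  # 0 gives uniform sampling, 1 fully proportional
        self.eps = eps      # keeps every priority strictly positive
        self._data = []
        self._priorities = []
        self._next = 0      # index of the slot to overwrite when full
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._data)

    def add(self, transition):
        # New transitions get the current maximum priority so that each
        # one is likely to be replayed at least once.
        max_p = max(self._priorities, default=1.0)
        if len(self._data) < self.capacity:
            self._data.append(transition)
            self._priorities.append(max_p)
        else:
            self._data[self._next] = transition
            self._priorities[self._next] = max_p
        self._next = (self._next + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        scaled = [p ** self.alpha for p in self._priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self._data)
        # Sampling with replacement, proportional to probs.
        indices = self._rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform
        # sampling; here they are normalized by the batch maximum.
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return indices, [self._data[i] for i in indices], weights

    def update_priorities(self, indices, td_errors):
        # Priority is the absolute TD error plus a small constant.
        for i, err in zip(indices, td_errors):
            self._priorities[i] = abs(err) + self.eps


# Usage: transition 0 has a much larger TD error than the others.
per = PrioritizedReplayBuffer(capacity=4, seed=0)
for t in range(4):
    per.add(("s%d" % t, 0, 1.0))
per.update_priorities([0, 1, 2, 3], [10.0, 0.1, 0.1, 0.1])
indices, batch, weights = per.sample(batch_size=200)
```

In a training loop, the returned weights would scale each sample's loss, and the new TD errors would be passed back through `update_priorities` after every update.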

Required knowledge

Strong programming skills.
Knowledge of Deep Learning.
Any knowledge of RL is an advantage.

Environment

TensorFlow.

Comments and links