Department of Electrical Engineering
Control Robotics and Machine Learning Lab
Technion - Israel Institute of Technology
Action-eliminating DQN: exploration via confidence bounds
Background
Deep reinforcement learning (DRL) methods such as the Deep Q-Network (DQN) have achieved state-of-the-art results in a variety of challenging, high-dimensional domains.
A major difficulty in DRL is that the state and action spaces are large, so success often hinges on efficient exploration.
While DRL methods are new, RL is a well-established field: in the tabular case there exist various algorithms that ensure proper exploration and convergence to the optimal solution. One such algorithm is the ‘Model-free AE Algorithm’ proposed by Even-Dar et al. (page 1095).
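At its core, the algorithm maintains upper and lower confidence bounds on the action values and eliminates an action once its upper bound falls below another action's lower bound. The Python sketch below illustrates that elimination test only; the Hoeffding-style interval width and the constant beta are simplified assumptions, not the exact bounds from the paper.

import numpy as np

# Illustrative sketch of action elimination via confidence bounds.
# The interval width beta / sqrt(n) is a simplified, Hoeffding-style
# assumption, not the exact bound used by Even-Dar et al.

def interval_width(counts, beta=1.0):
    # The confidence interval shrinks as the visit count grows.
    return beta / np.sqrt(np.maximum(counts, 1))

def surviving_actions(q_values, counts, beta=1.0):
    # Keep actions whose upper bound still reaches the best lower
    # bound; all other actions can be safely eliminated.
    ci = interval_width(counts, beta)
    upper, lower = q_values + ci, q_values - ci
    return np.flatnonzero(upper >= lower.max())

# Example: with enough samples, the clearly worse action is eliminated.
q = np.array([1.0, 0.2])
n = np.array([400, 400])
print(surviving_actions(q, n))  # -> [0]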
Recently, Bellemare et al. proposed a count-based exploration method for DRL. Their method uses pseudo-counts, derived from a density model over states, as an additional intrinsic motivation for the agent to explore better.
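Concretely, if rho(x) is the probability the density model assigns to state x before observing it and rho'(x) is its probability after one more update on x, the pseudo-count is N(x) = rho(x) * (1 - rho'(x)) / (rho'(x) - rho(x)), and the exploration bonus decays roughly as 1/sqrt(N(x)). A minimal Python sketch, with the bonus scale beta as a free parameter:

# Pseudo-count from a density model, following Bellemare et al.
# rho: probability the model assigns to a state before seeing it;
# rho_prime: probability after one more update on that state.
# The model must be learning-positive (rho_prime > rho), otherwise
# the formula below is undefined.

def pseudo_count(rho, rho_prime):
    return rho * (1.0 - rho_prime) / (rho_prime - rho)

def exploration_bonus(rho, rho_prime, beta=0.05):
    # Intrinsic reward that shrinks as the pseudo-count grows.
    return beta / (pseudo_count(rho, rho_prime) + 0.01) ** 0.5

# A frequently seen state barely moves the model: large pseudo-count, small bonus.
print(exploration_bonus(0.20, 0.21))   # ~0.013
# A novel state moves the model a lot: small pseudo-count, large bonus.
print(exploration_bonus(0.001, 0.01))  # ~0.14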
In this project you will combine the best of both worlds: using the pseudo-counts method proposed by Bellemare et al., you will implement the ‘Model-free AE Algorithm’ as a robust exploration strategy for deep agents.
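One plausible way to wire the two together, sketched below under loose assumptions: treat a per-action pseudo-count (e.g., from a density model over state-action pairs) as the visit count inside the AE confidence bounds, and restrict the DQN's action selection to the actions that survive elimination. The names q_network and pseudo_counts are hypothetical placeholders for your own components, and the bound shape is again a simplification.

import numpy as np

# Hypothetical glue code: q_network(state) returns a Q-value per action,
# pseudo_counts(state) returns a pseudo-count per action. Both are
# placeholders, not part of any existing library.

def select_action(q_network, pseudo_counts, state, beta=1.0):
    q = q_network(state)
    ci = beta / np.sqrt(pseudo_counts(state) + 0.01)
    surviving = np.flatnonzero(q + ci >= (q - ci).max())
    # Explore uniformly among the non-eliminated actions; acting greedily
    # within this set (surviving[q[surviving].argmax()]) is an alternative.
    return np.random.choice(surviving)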
Project Goal
Implement the ‘Model-free AE Algorithm’ in a DRL agent such as the DQN.
Project steps
- Understand the DQN framework.
- Get a basic understanding of the ‘Model-free AE Algorithm’.
- Implement and test the algorithm on various domains.
- Suggest and test your own ideas!
Required knowledge
- Strong programming skills.
- Prior knowledge of deep learning (DL) and RL is an advantage.
Environment
- Torch / TensorFlow / PyTorch
Comments and links