Department of Electrical Engineering
Control Robotics and Machine Learning Lab
Technion - Israel Institute of Technology
Action-eliminating DQN: exploration via confidence bounds
Background
Deep reinforcement learning (DRL) methods such as the Deep Q-Network (DQN) have achieved state-of-the-art results in a variety of challenging, high-dimensional domains.
A major difficulty in DRL is that the state and action spaces are large, so success often hinges on efficient exploration.
While DRL methods are new, RL is a well-established field: in the tabular case there exist various algorithms that ensure proper exploration and convergence to the optimal solution. One such algorithm is the ‘Model-free AE Algorithm’ proposed by Even-Dar et al. (page 1095).
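At its core, the algorithm maintains upper and lower confidence bounds on the action values and eliminates an action once its upper bound falls below another action's lower bound. The Python sketch below illustrates that elimination test only; the Hoeffding-style interval width and the constant beta are simplified assumptions, not the exact bounds from the paper.

import numpy as np

# Illustrative sketch of action elimination via confidence bounds.
# The interval width beta / sqrt(n) is a simplified, Hoeffding-style
# assumption, not the exact bound used by Even-Dar et al.

def interval_width(counts, beta=1.0):
    # The confidence interval shrinks as the visit count grows.
    return beta / np.sqrt(np.maximum(counts, 1))

def surviving_actions(q_values, counts, beta=1.0):
    # Keep actions whose upper bound still reaches the best lower
    # bound; all other actions can be safely eliminated.
    ci = interval_width(counts, beta)
    upper, lower = q_values + ci, q_values - ci
    return np.flatnonzero(upper >= lower.max())

# Example: with enough samples, the clearly worse action is eliminated.
q = np.array([1.0, 0.2])
n = np.array([400, 400])
print(surviving_actions(q, n))  # -> [0]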
Recently, Bellemare et al. proposed a count-based exploration method for DRL. Their method uses pseudo-counts, derived from a density model over states, as an additional intrinsic motivation for the agent to explore better.
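Concretely, if rho(x) is the probability the density model assigns to state x before observing it and rho'(x) is its probability after one more update on x, the pseudo-count is N(x) = rho(x) * (1 - rho'(x)) / (rho'(x) - rho(x)), and the exploration bonus decays roughly as 1/sqrt(N(x)). A minimal Python sketch, with the bonus scale beta as a free parameter:

# Pseudo-count from a density model, following Bellemare et al.
# rho: probability the model assigns to a state before seeing it;
# rho_prime: probability after one more update on that state.
# The model must be learning-positive (rho_prime > rho), otherwise
# the formula below is undefined.

def pseudo_count(rho, rho_prime):
    return rho * (1.0 - rho_prime) / (rho_prime - rho)

def exploration_bonus(rho, rho_prime, beta=0.05):
    # Intrinsic reward that shrinks as the pseudo-count grows.
    return beta / (pseudo_count(rho, rho_prime) + 0.01) ** 0.5

# A frequently seen state barely moves the model: large pseudo-count, small bonus.
print(exploration_bonus(0.20, 0.21))   # ~0.013
# A novel state moves the model a lot: small pseudo-count, large bonus.
print(exploration_bonus(0.001, 0.01))  # ~0.14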
In this project you will combine the best of both worlds: using the pseudo-counts method proposed by Bellemare et al., you will implement the ‘Model-free AE Algorithm’ as a robust exploration strategy for deep agents.
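One plausible way to wire the two together, sketched below under loose assumptions: treat a per-action pseudo-count (e.g., from a density model over state-action pairs) as the visit count inside the AE confidence bounds, and restrict the DQN's action selection to the actions that survive elimination. The names q_network and pseudo_counts are hypothetical placeholders for your own components, and the bound shape is again a simplification.

import numpy as np

# Hypothetical glue code: q_network(state) returns a Q-value per action,
# pseudo_counts(state) returns a pseudo-count per action. Both are
# placeholders, not part of any existing library.

def select_action(q_network, pseudo_counts, state, beta=1.0):
    q = q_network(state)
    ci = beta / np.sqrt(pseudo_counts(state) + 0.01)
    surviving = np.flatnonzero(q + ci >= (q - ci).max())
    # Explore uniformly among the non-eliminated actions; acting greedily
    # within this set (surviving[q[surviving].argmax()]) is an alternative.
    return np.random.choice(surviving)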
Project Goal
Implement the ‘Model-free AE Algorithm’ in a DRL agent such as the DQN.
Project steps
- Understand the DQN framework.
- Get a basic understanding of the ‘Model-free AE Algorithm’.
- Implement and test the algorithm on various domains.
- Suggest and test your own ideas!
Required knowledge
- Strong programming skills.
- Prior knowledge of deep learning (DL) and RL is an advantage.
Environment
- Torch / TensorFlow / PyTorch
Comments and links