Supplementary Material

Department of Electrical Engineering

Control Robotics and Machine Learning Lab

Technion - Israel Institute of Technology

We encourage all our students to register for the course 'Learning and Planning in Dynamic Programming' - 046194. The course, taught by Prof. Shie Mannor/Dr. Aviv Tamar, provides excellent theoretical tools to understand and analyze RL algorithms.

If you would like to start reading material related to the projects, we have provided some relevant material below. Please note that this is a large amount of material; you aren't expected to watch all of the lectures or read all of the tutorials. The list is only meant to give you an idea of what is available.

Reinforcement Learning

Reinforcement Learning: An Introduction - Excellent book by Richard S. Sutton and Andrew G. Barto.

Algorithms for Reinforcement Learning (book) - A useful guide to Reinforcement Learning.

Deep Learning

NYU Deep Learning Course - An excellent introductory course on Deep Learning.

A Deep Learning tutorial by Yann LeCun (one of the leaders in the field).

Machine Learning Course by Nando de Freitas at the University of Oxford. A very well structured course with some excellent lectures on Deep Learning and Reinforcement Learning.

Geoffrey Hinton's Neural Networks for Machine Learning course, on Coursera.

Deep Reinforcement Learning

Andrej Karpathy's blog on DRL

WildML tutorials and code on GitHub

KDnuggets overview

David Silver's Course in Reinforcement Learning - A very good introduction to Reinforcement Learning. David Silver is now a research scientist at Google DeepMind.

Machine Learning

Andrew Ng's Machine Learning course.

Learning from Data - Machine Learning course from Caltech.

Code

Software:

Learn Lua in 15 minutes! (The language used for the Deep Neural Network algorithm.)

Official PyTorch tutorials.

Google's Machine Learning Crash Course (TensorFlow).

NVIDIA has an online course, currently running, that gives some practical tips on Deep Learning and the Deep Learning software frameworks.

PyTorch implementations of value-based DRL algorithms

Practical DRL course

Algorithms:

The DQN algorithm code that you will be using for your projects is available here.

OpenAI Baselines provides implementations of many algorithms in TensorFlow.

Environments:

Malmo - Microsoft's Project Malmo is a Minecraft simulator for machine learning experiments.

OpenAI Gym - A unified Python interface for interacting with multiple environments, such as Atari games and continuous control tasks (Walker, Hopper, Inverted Pendulum, etc.).

Research Papers

DQN and basic extensions:

DQN Paper (the Deep Learning algorithm that will be used for this project). Feel free to contact us if you want to read the Nature paper but don't have access.

Double DQN - A simple modification of DQN that reduces the over-estimation of Q values and provides more stable learning behavior.
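
As a rough illustration of the idea: plain DQN takes the maximum over the target network's Q values, while Double DQN lets the online network choose the next action and the target network evaluate it. The sketch below uses plain Python lists; the function names and the Q values are illustrative assumptions, not code from the paper.

```python
def dqn_target(reward, done, q_target_next, gamma=0.99):
    """Plain DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: online network selects, target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q values for the next state s' (three actions).
q_online_next = [1.0, 2.0, 0.5]
q_target_next = [1.5, 1.2, 3.0]

y_dqn = dqn_target(1.0, False, q_target_next, gamma=0.9)
y_double = double_dqn_target(1.0, False, q_online_next, q_target_next, gamma=0.9)
```

In this made-up example the two networks disagree about the best action, so the Double DQN target is lower than the plain DQN target.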

Prioritized Experience Replay - Smarter sampling from the replay buffer based on the TD error of each sample (i.e. give higher priority to transitions {s,a,r,s',t} with a larger TD error, from which the network can still learn the most).
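
A minimal sketch of TD-error-based sampling, assuming the proportional variant: each transition gets priority |TD error| + eps, sampling probabilities are the priorities raised to a power alpha and normalised, and importance-sampling weights correct for the non-uniform sampling. The helper names and the default values of alpha and beta are assumptions chosen for illustration.

```python
import random


def priorities_to_probs(td_errors, alpha=0.6, eps=1e-6):
    """Turn TD errors into sampling probabilities P(i) = p_i^alpha / sum_k p_k^alpha."""
    scaled = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights (N * P(i))^(-beta), scaled so the largest is 1."""
    n = len(probs)
    weights = [(n * p) ** (-beta) for p in probs]
    max_w = max(weights)
    return [w / max_w for w in weights]


def sample_indices(probs, batch_size, rng=random):
    """Sample buffer indices (with replacement) according to the probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


# Hypothetical TD errors of four stored transitions.
td_errors = [0.1, 2.0, 0.5, 0.0]
probs = priorities_to_probs(td_errors)
weights = importance_weights(probs)
batch = sample_indices(probs, batch_size=2, rng=random.Random(0))
```

A real replay buffer would typically store the priorities in a sum-tree so that sampling stays efficient for large buffers; the list-based version here is only meant to show the arithmetic.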

Dueling DQN - Instead of predicting the Q values directly, the network predicts the state Value and the Advantage of each action, and Q is computed as a combination of the two.
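
One common way to combine the two streams, sketched here under the assumption of the mean-subtracted form Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a'). The function name and the numbers are illustrative only.

```python
def dueling_q_values(value, advantages):
    """Combine a scalar state value with per-action advantages into Q values."""
    mean_adv = sum(advantages) / len(advantages)
    # Subtracting the mean advantage makes the value/advantage split identifiable.
    return [value + (a - mean_adv) for a in advantages]


# Hypothetical network outputs for one state with three actions.
q = dueling_q_values(2.0, [1.0, 0.0, -1.0])
```

With this form the average of the Q values equals the state value, so the value stream carries "how good is the state" and the advantage stream carries "how much better is each action than average".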

High-level extensions:

Count-Based Exploration - Estimate how many times each state has been visited and provide an additional reward proportional to 1/(number of visits).
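
A minimal tabular sketch of such a bonus, assuming states are hashable and can be counted exactly. In large state spaces the counts have to be estimated instead, and other decay rates such as 1/sqrt(count) are also used. The class name and the scale beta are hypothetical.

```python
from collections import Counter


class CountBonus:
    """Exploration bonus beta / N(s), where N(s) is the visit count of state s."""

    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = Counter()

    def bonus(self, state):
        """Record a visit to `state` and return the bonus for that visit."""
        self.counts[state] += 1
        return self.beta / self.counts[state]


explorer = CountBonus(beta=1.0)
first = explorer.bonus("s0")    # first visit to s0
second = explorer.bonus("s0")   # second visit to s0
other = explorer.bonus("s1")    # first visit to s1
```

The bonus would be added to the environment reward before the transition is stored, so rarely visited states look temporarily more rewarding.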

Distributional DQN [1, 2] - Instead of learning the value function (the expected return) directly, learn a distribution over possible returns. This can provide insight in stochastic environments or when using stochastic policies.
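
To make the idea concrete, the sketch below assumes a categorical representation: each action has a probability vector over a fixed grid of possible returns ("atoms"), and the usual Q value is recovered as the expectation. The support range, the number of atoms and the two distributions are made-up values for illustration.

```python
def make_support(v_min, v_max, n_atoms):
    """Evenly spaced return values (atoms) between v_min and v_max."""
    step = (v_max - v_min) / (n_atoms - 1)
    return [v_min + i * step for i in range(n_atoms)]


def expected_value(probs, support):
    """Q value as the mean of the return distribution."""
    return sum(p * z for p, z in zip(probs, support))


def greedy_action(dists, support):
    """Pick the action whose return distribution has the highest mean."""
    q = [expected_value(p, support) for p in dists]
    return max(range(len(q)), key=lambda a: q[a]), q


support = make_support(-1.0, 1.0, 5)   # atoms: -1, -0.5, 0, 0.5, 1
dists = [
    [0.0, 0.0, 1.0, 0.0, 0.0],   # action 0: return is always 0
    [0.4, 0.0, 0.0, 0.0, 0.6],   # action 1: risky, return is -1 or +1
]
action, q_values = greedy_action(dists, support)
```

The two actions have similar means but very different spreads, which is exactly the information a scalar Q value discards.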

Noisy Networks for Exploration - Learned noise is added to the weights of the network. This can be seen as a way of improving exploration under uncertainty.
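
A minimal sketch of a noisy linear layer, assuming independent Gaussian noise per weight, so that the effective weight is mu + sigma * eps. In a real network mu and sigma would be trained by gradient descent; here they are only initialised. The class name, the initialisation and the omission of a bias term are simplifying assumptions.

```python
import random


class NoisyLinear:
    """Linear layer whose weights are mu + sigma * eps, with eps ~ N(0, 1)."""

    def __init__(self, in_features, out_features, sigma_init=0.5, rng=None):
        self.rng = rng or random.Random()
        bound = 1.0 / in_features ** 0.5
        self.mu = [[self.rng.uniform(-bound, bound) for _ in range(in_features)]
                   for _ in range(out_features)]
        self.sigma = [[sigma_init * bound] * in_features for _ in range(out_features)]
        # Noise starts at zero, so the layer is deterministic until resampled.
        self.eps = [[0.0] * in_features for _ in range(out_features)]

    def resample_noise(self):
        """Draw fresh noise, e.g. once per environment step or per batch."""
        self.eps = [[self.rng.gauss(0.0, 1.0) for _ in row] for row in self.mu]

    def forward(self, x):
        """Compute the layer output with the current noisy weights."""
        return [
            sum((m + s * e) * xi for m, s, e, xi in zip(mu_r, sig_r, eps_r, x))
            for mu_r, sig_r, eps_r in zip(self.mu, self.sigma, self.eps)
        ]


layer = NoisyLinear(3, 2, rng=random.Random(0))
x = [1.0, 2.0, 3.0]
y_mean = layer.forward(x)    # output with zero noise
layer.resample_noise()
y_noisy = layer.forward(x)   # output with perturbed weights
```

Because the perturbation sits in the weights, the agent's behaviour changes consistently across states until the noise is resampled, unlike epsilon-greedy, which randomises each action independently.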

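Rainbow - A combination of several of the extensions above (Double DQN, Prioritized Experience Replay, Dueling DQN, Distributional DQN and Noisy Networks) together with multi-step learning, plus an ablation study that shows the impact of each component.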

Report Template

The project reports are short reports in the format of international workshop publications. We have provided a LaTeX template on Overleaf containing a 'skeleton' of how we expect the reports to be structured. Overleaf is a web environment for writing in LaTeX; you can clone the template, or download the source files and edit them on your computer using a LaTeX editor such as Texmaker.

For examples of previous reports written by students in our lab, please visit the completed projects tab.