Practical Reinforcement Learning (Coursera)

Welcome to the Reinforcement Learning course. Here you will find out about:

- foundations of RL methods: value/policy iteration, q-learning, policy gradient, etc. --- with math & batteries included;
- using deep neural networks for RL tasks --- also known as "the hype train";
- state of the art RL algorithms --- and how to apply duct tape to them for practical problems;
- and, of course, teaching your neural network to play games --- because that's what everyone thinks RL is about. We'll also use it for seq2seq and contextual bandits.

Jump in. It's gonna be fun!


Syllabus


Week 1

Intro: why should I care?

In this module we're going to define and "taste" what reinforcement learning is about. We'll also learn one simple algorithm that can solve reinforcement learning problems with embarrassing efficiency.
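
To make that concrete, here is a minimal sketch of one such simple method, the crossentropy method: play a batch of episodes, keep the best ones, imitate them, repeat. The environment (Gymnasium's Taxi-v3) and all hyperparameters are illustrative assumptions, not the course's reference code.

```python
import numpy as np
import gymnasium as gym

env = gym.make("Taxi-v3")
n_states, n_actions = env.observation_space.n, env.action_space.n
policy = np.full((n_states, n_actions), 1.0 / n_actions)  # start uniform

def play_episode(policy, t_max=200):
    """Roll out one episode, sampling actions from the current policy."""
    states, actions, total_reward = [], [], 0.0
    s, _ = env.reset()
    for _ in range(t_max):
        a = int(np.random.choice(n_actions, p=policy[s]))
        new_s, r, terminated, truncated, _ = env.step(a)
        states.append(s); actions.append(a); total_reward += r
        s = new_s
        if terminated or truncated:
            break
    return states, actions, total_reward

for epoch in range(50):
    sessions = [play_episode(policy) for _ in range(250)]
    rewards = [r for _, _, r in sessions]
    threshold = np.percentile(rewards, 70)  # keep roughly the top 30%
    elite = [(ss, aa) for ss, aa, r in sessions if r >= threshold]
    counts = np.zeros_like(policy)
    for ss, aa in elite:
        for s, a in zip(ss, aa):
            counts[s, a] += 1
    # Re-estimate the policy from elite state-action pairs; states never
    # visited by an elite session keep their previous action distribution.
    visited = counts.sum(axis=1) > 0
    policy[visited] = counts[visited] / counts[visited].sum(axis=1, keepdims=True)
```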


Week 2

At the heart of RL: Dynamic Programming

This week we'll consider the reinforcement learning formalisms in a more rigorous, mathematical way. You'll learn how to effectively compute the return your agent gets for a particular action, and how to pick the best actions based on that return.
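
For instance, when the environment model is fully known, value iteration computes exactly these quantities. Below is a minimal sketch; the transition format P[s][a] = [(prob, next_state, reward), ...] is an assumption made for this example, not a fixed course API.

```python
import numpy as np

def value_iteration(P, n_states, n_actions, gamma=0.9, tol=1e-6):
    """Iterate the Bellman optimality backup until state values converge."""
    V = np.zeros(n_states)
    while True:
        # One-step lookahead: expected return of taking a in s, then acting greedily.
        Q = np.zeros((n_states, n_actions))
        for s in range(n_states):
            for a in range(n_actions):
                Q[s, a] = sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
        new_V = Q.max(axis=1)
        if np.max(np.abs(new_V - V)) < tol:
            return new_V, Q.argmax(axis=1)  # state values and a greedy policy
        V = new_V
```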


Week 3

Model-free methods

This week we'll find out how to apply last week's ideas to real-world problems: ones where you don't have a perfect model of your environment.
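
Tabular Q-learning is the classic example: it needs only sampled transitions, never the transition probabilities themselves. A minimal sketch, assuming a Gymnasium-style environment with discrete states and actions (the hyperparameters are placeholders):

```python
import numpy as np

def q_learning(env, n_episodes=5000, alpha=0.1, gamma=0.99, epsilon=0.1):
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    for _ in range(n_episodes):
        s, _ = env.reset()
        done = False
        while not done:
            # epsilon-greedy: mostly exploit, sometimes explore
            if np.random.rand() < epsilon:
                a = int(env.action_space.sample())
            else:
                a = int(Q[s].argmax())
            s2, r, terminated, truncated, _ = env.step(a)
            done = terminated or truncated
            # TD target: observed reward plus discounted best next-state value
            target = r + (0.0 if terminated else gamma * Q[s2].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
    return Q
```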


Week 4

Approximate Value Based Methods

This week we'll learn to scale things even further up by training agents based on neural networks.
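
In that spirit, here is a minimal sketch of approximate Q-learning with a small PyTorch network: essentially DQN without the replay buffer and target network it would need to train stably. CartPole-v1 and every hyperparameter are illustrative assumptions.

```python
import torch
import torch.nn as nn
import gymnasium as gym

env = gym.make("CartPole-v1")
obs_dim, n_actions = env.observation_space.shape[0], env.action_space.n

# Q(s, .) as a small feedforward network over the observation vector
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma, epsilon = 0.99, 0.1

for episode in range(200):
    s, _ = env.reset()
    done = False
    while not done:
        qs = q_net(torch.as_tensor(s, dtype=torch.float32))
        a = env.action_space.sample() if torch.rand(1).item() < epsilon else int(qs.argmax())
        s2, r, terminated, truncated, _ = env.step(a)
        done = terminated or truncated
        with torch.no_grad():
            next_q = 0.0 if terminated else gamma * q_net(torch.as_tensor(s2, dtype=torch.float32)).max()
        loss = (qs[a] - (r + next_q)) ** 2  # squared one-step TD error
        opt.zero_grad(); loss.backward(); opt.step()
        s = s2
```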


Week 5

Policy-based methods

We spent the three previous modules working on value-based methods: learning state values, action values, and whatnot. Now it's time to see an alternative approach that doesn't require you to predict all future rewards in order to learn something.
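
The best-known such method is REINFORCE: sample an episode, then push the policy toward the actions that preceded high returns, with no value function in sight. A minimal PyTorch sketch, again with an illustrative environment and hyperparameters:

```python
import torch
import torch.nn as nn
import gymnasium as gym

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(env.observation_space.shape[0], 64), nn.ReLU(),
                       nn.Linear(64, env.action_space.n))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
gamma = 0.99

for episode in range(500):
    log_probs, rewards = [], []
    s, _ = env.reset()
    done = False
    while not done:
        logits = policy(torch.as_tensor(s, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        a = dist.sample()
        log_probs.append(dist.log_prob(a))
        s, r, terminated, truncated, _ = env.step(int(a))
        rewards.append(r)
        done = terminated or truncated
    # discounted returns-to-go, computed backwards through the episode
    G, returns = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        returns.append(G)
    returns = torch.tensor(list(reversed(returns)))
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)  # crude baseline
    loss = -(torch.stack(log_probs) * returns).sum()  # policy gradient objective
    opt.zero_grad(); loss.backward(); opt.step()
```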


Week 6

Exploration

In this final week you'll learn how to build better exploration strategies, with a focus on the contextual bandit setup. In the honors track, you'll also learn how to apply reinforcement learning to train structured deep learning models.
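
As a taste of what a "better exploration strategy" looks like, here is a minimal UCB1 sketch for the simplest, context-free multi-armed bandit; the contextual setup adds per-round features, but the explore/exploit tension is the same. The `pull` callback is a hypothetical reward function supplied by the caller.

```python
import math
import random

def ucb1(pull, n_arms, n_steps=10000):
    """Pick arms by mean reward plus an optimism bonus that shrinks with visits."""
    counts, sums = [0] * n_arms, [0.0] * n_arms
    for t in range(1, n_steps + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once before trusting the estimates
        else:
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                                    + math.sqrt(2 * math.log(t) / counts[a]))
        sums[arm] += pull(arm)
        counts[arm] += 1
    return [s / c for s, c in zip(sums, counts)]

# Usage: three Bernoulli arms with hidden success probabilities.
probs = [0.2, 0.5, 0.7]
estimates = ucb1(lambda a: float(random.random() < probs[a]), n_arms=3)
```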