This repository contains lecture notes, tutorial tasks including solutions, and online videos for the reinforcement learning course hosted by Paderborn University. The source code for the entire course material is open, and everyone is cordially invited to use it for self-learning (students) or to set up their own course (lecturers).
- Introduction to Reinforcement Learning
- Markov Decision Processes
- Dynamic Programming
- Monte Carlo Methods
- Temporal-Difference Learning
- Multi-Step Bootstrapping
- Planning and Learning with Tabular Methods
- Function Approximation with Supervised Learning
- On-Policy Prediction with Function Approximation
- Value-Based Control with Function Approximation
- Stochastic Policy Gradient Methods
- Deterministic Policy Gradient Methods
- Further Contemporary RL Algorithms (TRPO, PPO)
- Outlook and Research Insights
- Summary of Part One: Reinforcement Learning in Finite State and Action Spaces
- Summary of Part Two: Reinforcement Learning Using Function Approximation
- Full course slides
All exercises are based on Python 3.9 and the packages listed in requirements.txt:
$ pip install setuptools==65.5.0
$ pip install -r requirements.txt
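Since the pinned packages are tied to Python 3.9, it can help to confirm the interpreter version before installing. The following is a minimal sketch, not part of the course material; the helper name `matches_required_version` and the constant `REQUIRED` are hypothetical and only use the standard library.

```python
import sys

# Assumption: the exercises target Python 3.9, as stated in this README.
REQUIRED = (3, 9)


def matches_required_version(version_info=None, required=REQUIRED):
    """Return True if the (major, minor) version equals the required one."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if matches_required_version():
        print(f"Python {current} matches the version the exercises are based on.")
    else:
        print(
            f"Python {current} detected; the exercises are based on Python 3.9, "
            "so the pinned packages may not install cleanly."
        )
```

Running the script inside the environment intended for the exercises prints which of the two cases applies.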
- Basics of Python for Scientific Computing
- Tutorial video (only 2022 edition available due to technical outage)
- Tutorial template
- Tutorial solution
- Manually Solving Basic Markov Chain, Reward and Decision Problems
- The Beer-Bachelor and Dynamic Programming (the Shortest Beer Problem)
- Tutorial video (only 2022 edition available due to technical outage)
- Tutorial template
- Tutorial solution
- Drive Through the Race Track with Monte Carlo Learning
- Drive even Faster Using Temporal-Difference Learning
- Stabilizing the Inverted Pendulum by Tabular Multi-Step Methods
- Boosting the Inverted Pendulum by Integrating Learning & Planning (Dyna Framework)
- Predicting the Operating Behavior of a Real Electric Drive System with Supervised Learning
- Evaluate the Performance of Given Agents in the Mountain Car Problem Using Function Approximation
- Escape from the Mountain Car Valley Using Semi-Gradient Sarsa & Least-Squares Policy Iteration
- Landing on the Moon with REINFORCE and Actor-Critic Methods
- Shoot for the Moon with DDPG & PPO
We highly appreciate any feedback on and input to the course material, e.g.:
- typos or content-related discussions (please raise an issue)
- adding new contents (please provide a pull request)
If you would like to contribute to the repo to a larger extent, please do not hesitate to contact us directly.
The lecture notes are inspired by
- Richard S. Sutton, Andrew G. Barto, 'Reinforcement Learning: An Introduction', Second Edition, MIT Press, Cambridge, MA, 2018
- David Silver, UCL Course on Reinforcement Learning, 2015
The tutorials partly use pre-packaged environments from
- Gymnasium (maintained branch of OpenAI's Gym)
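As a rough illustration of how such environments are driven, the sketch below runs one episode with random actions against the Gymnasium-style interface, where `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`. The `StubEnv` class and the `run_random_episode` helper are hypothetical stand-ins written for this example so that it runs without Gymnasium installed; they are not taken from the tutorials.

```python
import random


class StubEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step interface."""

    class _ActionSpace:
        def __init__(self, n, rng):
            self.n = n
            self._rng = rng

        def sample(self):
            # Draw a random discrete action in {0, ..., n - 1}.
            return self._rng.randrange(self.n)

    def __init__(self, horizon=5, seed=0):
        self.horizon = horizon
        self.action_space = self._ActionSpace(2, random.Random(seed))
        self._t = 0

    def reset(self, seed=None):
        self._t = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self._t += 1
        terminated = False                   # this stub has no goal state
        truncated = self._t >= self.horizon  # time limit reached
        return self._t, -1.0, terminated, truncated, {}


def run_random_episode(env, max_steps=1000, seed=None):
    """Run one episode with random actions; return (total_reward, steps)."""
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps


if __name__ == "__main__":
    print(run_random_episode(StubEnv(horizon=5)))
```

With Gymnasium installed, an environment created via `gymnasium.make("MountainCar-v0")` could presumably be passed to the same helper in place of the stub.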
To cite this work, see "Cite this repository" at the top of the repository page.