A Multiplicative Value Function for Safe and Efficient Reinforcement Learning
Nick Bührer, Zhejun Zhang, Alexander Liniger, Fisher Yu and Luc Van Gool
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic. The safety critic predicts the probability of constraint violation and discounts the reward critic, which estimates only constraint-free returns. By splitting responsibilities between the two critics, we simplify the learning task and thereby increase sample efficiency.
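As a rough illustration of this idea (the function and critic names below are hypothetical and not taken from this codebase), the multiplicative combination can be sketched as:

```python
import torch

def multiplicative_value(reward_critic, safety_critic, obs):
    """Sketch: combine a reward critic and a safety critic multiplicatively.

    Assumptions (not taken from this repo): reward_critic(obs) returns an
    estimate of the constraint-free return V_r(s); safety_critic(obs) returns
    a logit for the probability of a constraint violation in state s.
    """
    v_reward = reward_critic(obs)                    # constraint-free return estimate
    p_violation = torch.sigmoid(safety_critic(obs))  # predicted probability of constraint violation
    return (1.0 - p_violation) * v_reward            # return discounted by the safety probability
```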
@inproceedings{buehrer2023saferl,
title = {A Multiplicative Value Function for Safe and Efficient Reinforcement Learning},
author = {B{\"u}hrer, Nick and Zhang, Zhejun and Liniger, Alexander and Yu, Fisher and Van Gool, Luc},
booktitle = {International Conference on Intelligent Robots and Systems (IROS)},
year = {2023}
}
Create the conda environment by running
conda env create -f conda_env.yaml
Note that our implementation is built on stable-baselines3 1.2.0 (as pinned in the YAML file) and might not work with newer versions.
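For reference, a conda environment file of this kind typically looks roughly like the following; this is only a sketch, and apart from the stable-baselines3 pin mentioned above, the names and versions here are assumptions. The repo's conda_env.yaml is authoritative.

```yaml
# Hypothetical sketch only; the actual conda_env.yaml in this repo is authoritative.
name: safe-rl            # assumed environment name
dependencies:
  - python=3.8           # assumed Python version
  - pip
  - pip:
      - stable-baselines3==1.2.0   # version pinned as noted above
      - hydra-core                  # used for experiment configuration
```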
All experiments can be launched from main.py. For experiment configuration, we use Hydra. The following environments are currently supported:
- Lunar Lander Safe
- Car Racing Safe
- Point Robot Navigation
To run PPO Mult V1 in Lunar Lander Safe, simply execute:
python main.py +lunar_lander=ppo_mult_v1
To run the Lagrangian baseline PPO Lagrange in Car Racing Safe, execute:
python main.py +car_racing=ppo_lagrange
All experiment configs can be found under the experiments folder. For the Lunar Lander example, the config is located at experiments/lunar_lander/ppo_mult_v1.yaml.
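Since Hydra is used for configuration, individual values can also be overridden from the command line. For example (the seed key is only an illustration and may not exist under that name in these configs):

```
python main.py +lunar_lander=ppo_mult_v1 seed=123
```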