Risk-Sensitive Stochastic Optimal Control as Rao-Blackwellized Markovian Score Climbing

This repository implements a policy optimization technique for risk-sensitive stochastic optimal control based on Rao-Blackwellized Markovian score climbing.
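For readers new to the idea, here is a minimal, generic sketch of Markovian score climbing on a one-dimensional toy problem. It is not the code in this repository and leaves out the Rao-Blackwellization and control aspects entirely. The toy target N(3, 0.5^2), the proposal family N(mu, 1), the step-size schedule, and every function name below are assumptions chosen for illustration. Each iteration draws a new sample from a Markov kernel that leaves the target invariant (here, conditional importance sampling) and then takes a gradient step on the log-density of the proposal at that sample.

```python
import math
import random


def log_target(z):
    """Unnormalized log-density of the toy target N(3, 0.5^2) (an assumption)."""
    return -0.5 * ((z - 3.0) / 0.5) ** 2


def log_q(z, mu):
    """Log-density of the proposal N(mu, 1), up to an additive constant."""
    return -0.5 * (z - mu) ** 2


def cis_kernel(z_prev, mu, num_particles, rng):
    """Conditional importance sampling step that keeps the target invariant."""
    # Retain the previous state as one particle and propose the rest from q.
    particles = [z_prev] + [rng.gauss(mu, 1.0) for _ in range(num_particles - 1)]
    log_w = [log_target(z) - log_q(z, mu) for z in particles]
    # Subtract the maximum before exponentiating for numerical stability.
    max_log_w = max(log_w)
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    # Pick the next state in proportion to the importance weights.
    return rng.choices(particles, weights=weights, k=1)[0]


def markovian_score_climbing(num_iters=2000, num_particles=16, seed=0):
    """Fit the proposal mean mu by stochastic ascent on log q(z; mu)."""
    rng = random.Random(seed)
    mu, z = 0.0, 0.0
    for k in range(1, num_iters + 1):
        z = cis_kernel(z, mu, num_particles, rng)
        step = k ** -0.7  # decreasing Robbins-Monro style step size
        mu += step * (z - mu)  # gradient of log N(z; mu, 1) w.r.t. mu is z - mu
    return mu


if __name__ == "__main__":
    # The fitted mean should drift toward the target mean of 3.
    print(markovian_score_climbing())
```

Keeping the previous sample among the particles is what turns plain importance sampling into a Markov kernel, so the chain keeps targeting the same distribution while the proposal parameters change underneath it.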

Installation

Create a conda environment, replacing NAME with an environment name of your choice

conda create -n NAME python=3.10

Then activate the environment (conda activate NAME), change into the cloned repository, and run

pip install -e .

Examples

To run a policy learning example on a simple pendulum environment, execute the following from the repository root

python examples/feedback/rb_csmc_pendulum.py