A simple implementation of clipped Proximal Policy Optimization (PPO) in PyTorch that runs in gym envs. This library also contains some weird additions, shortcuts and experimental stuff like truncated distributions, a fixed std on the policy network (surprisingly, this works quite well) and full-episode rollouts, so it may not always marry up precisely with OpenAI Baselines.
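For anyone new to the "clipped" part, here is a minimal sketch of the standard clipped surrogate loss. It is not this library's code: it uses plain Python lists rather than PyTorch tensors so it runs with no dependencies, and the function name `clipped_surrogate_loss` and the default `clip_eps=0.2` are assumptions made for illustration.

```python
import math


def clipped_surrogate_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped PPO policy loss over a batch (negated objective, to minimise).

    Hypothetical helper for illustration only, not part of ppo_pytorch.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the rollout policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return -total / len(advantages)


if __name__ == "__main__":
    # The ratio is 2.0 here, so with a positive advantage it is clipped to 1.2.
    print(clipped_surrogate_loss([math.log(2.0)], [0.0], [1.0]))
```

With a positive advantage the loss stops improving once the ratio passes `1 + clip_eps`, so the example above prints about -1.2 rather than -2.0. With a negative advantage the unclipped term is kept when it is worse, which is what stops a single update from moving the policy too far.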
This guy has been training for 50409 rollouts of 16 episodes each. In the episode shown, he scored 284.6.
- Ideally, make yourself a virtualenv so I don't fuck up your torch install or whatever, and then do:
git clone https://github.com/leaprovenzano/ppo_pytorch.git
pip install -e ppo_pytorch
COMING SOON ... a notebook or something, soz!