0.2.2
Algorithm Implementation
- Generalized Advantage Estimation (GAE), sketched in the example after this list;
- Update PPO algorithm with arXiv:1811.02553 and arXiv:1912.09729;
- Vanilla Imitation Learning (BC & DA, with continuous/discrete action space);
- Prioritized DQN (a generic prioritized-sampling sketch also follows the list);
- RNN-style policy network;
- Fix SAC with torch==1.5.0
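For reference, here is a minimal sketch of how GAE advantages can be computed from per-step rewards and value predictions. The array names (`rewards`, `values`, `dones`) and the backward recursion below are illustrative only and do not reproduce Tianshou's internal implementation:

```python
import numpy as np

def compute_gae(rewards, values, dones, gamma=0.99, gae_lambda=0.95):
    """GAE: A_t = sum_l (gamma * lambda)^l * delta_{t+l}, computed backwards.

    `values` must hold V(s_0), ..., V(s_T): one more entry than `rewards`.
    """
    advantages = np.zeros(len(rewards), dtype=np.float64)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 1.0 - dones[t]
        # TD residual: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        gae = delta + gamma * gae_lambda * not_done * gae
        advantages[t] = gae
    return advantages
```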
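Similarly, a generic sketch of proportional prioritized sampling, the core idea behind prioritized DQN. The class below is purely illustrative and does not mirror the `tianshou.data` buffer API:

```python
import numpy as np

class NaivePrioritizedBuffer:
    """Toy proportional prioritized replay: P(i) ~ p_i ** alpha."""

    def __init__(self, size, alpha=0.6):
        self.size, self.alpha = size, alpha
        self.data, self.priorities = [], []

    def add(self, transition, priority=1.0):
        if len(self.data) >= self.size:   # drop the oldest transition
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size, beta=0.4):
        probs = np.asarray(self.priorities) ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # importance-sampling weights correct the non-uniform sampling bias
        weights = (len(self.data) * probs[idx]) ** (-beta)
        return [self.data[i] for i in idx], idx, weights / weights.max()
```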
API change
- Change `__call__` to `forward` in policy;
- Add `save_fn` in trainer;
- Add `__repr__` in tianshou.data, e.g. `print(buffer)`; a short usage sketch of these changes follows.
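The sketch below illustrates the renamed policy method and the new hooks. The trainer call is abbreviated and the exact keyword arguments are not spelled out here; check the documentation for this release before relying on them:

```python
import torch
from tianshou.data import ReplayBuffer

buffer = ReplayBuffer(size=20)
print(buffer)   # __repr__ now prints the buffer contents

# Policies now implement forward() (returning a Batch with .act) instead of
# overriding __call__; since BasePolicy is an nn.Module, calling the policy
# still dispatches to forward():
#     result = policy(batch, state=None)

# Trainers accept a save_fn callback, e.g. to checkpoint the policy weights:
def save_fn(policy):
    torch.save(policy.state_dict(), 'policy.pth')

# offpolicy_trainer(policy, train_collector, test_collector, ...,
#                   save_fn=save_fn)
```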