0.2.2
Algorithm Implementation
- Generalized Advantage Estimation (GAE), sketched in the example after this list;
- Update PPO algorithm with arXiv:1811.02553 and arXiv:1912.09729;
- Vanilla Imitation Learning (BC & DA, with continuous/discrete action space);
- Prioritized DQN (a generic prioritized-sampling sketch also follows the list);
- RNN-style policy network;
- Fix SAC with torch==1.5.0
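For reference, here is a minimal sketch of how GAE advantages can be computed from per-step rewards and value predictions. The array names (`rewards`, `values`, `dones`) and the backward recursion below are illustrative only and do not reproduce Tianshou's internal implementation:

```python
import numpy as np

def compute_gae(rewards, values, dones, gamma=0.99, gae_lambda=0.95):
    """GAE: A_t = sum_l (gamma * lambda)^l * delta_{t+l}, computed backwards.

    `values` must hold V(s_0), ..., V(s_T): one more entry than `rewards`.
    """
    advantages = np.zeros(len(rewards), dtype=np.float64)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 1.0 - dones[t]
        # TD residual: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        gae = delta + gamma * gae_lambda * not_done * gae
        advantages[t] = gae
    return advantages
```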
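Similarly, a generic sketch of proportional prioritized sampling, the core idea behind prioritized DQN. The class below is purely illustrative and does not mirror the `tianshou.data` buffer API:

```python
import numpy as np

class NaivePrioritizedBuffer:
    """Toy proportional prioritized replay: P(i) ~ p_i ** alpha."""

    def __init__(self, size, alpha=0.6):
        self.size, self.alpha = size, alpha
        self.data, self.priorities = [], []

    def add(self, transition, priority=1.0):
        if len(self.data) >= self.size:   # drop the oldest transition
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size, beta=0.4):
        probs = np.asarray(self.priorities) ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # importance-sampling weights correct the non-uniform sampling bias
        weights = (len(self.data) * probs[idx]) ** (-beta)
        return [self.data[i] for i in idx], idx, weights / weights.max()
```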
API change
- Change `__call__` to `forward` in policy;
- Add `save_fn` in trainer;
- Add `__repr__` in tianshou.data, e.g. `print(buffer)`; a short usage sketch of these changes follows.
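The sketch below illustrates the renamed policy method and the new hooks. The trainer call is abbreviated and the exact keyword arguments are not spelled out here; check the documentation for this release before relying on them:

```python
import torch
from tianshou.data import ReplayBuffer

buffer = ReplayBuffer(size=20)
print(buffer)   # __repr__ now prints the buffer contents

# Policies now implement forward() (returning a Batch with .act) instead of
# overriding __call__; since BasePolicy is an nn.Module, calling the policy
# still dispatches to forward():
#     result = policy(batch, state=None)

# Trainers accept a save_fn callback, e.g. to checkpoint the policy weights:
def save_fn(policy):
    torch.save(policy.state_dict(), 'policy.pth')

# offpolicy_trainer(policy, train_collector, test_collector, ...,
#                   save_fn=save_fn)
```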