Feature/20240911 enable squashed gaussian on ppo #130

ishihara-y · 2024-09-17T09:20:39Z

Some distributions (such as SquashedGaussian) do not have an analytical form for the entropy, so PPO cannot use them for training.
However, when the entropy coefficient of PPO training is 0, PPO does not need to compute the entropy of the policy distribution.
This PR enables using SquashedGaussian and other such distributions in PPO when the entropy coefficient is 0.
The new implementation skips the entropy computation during policy training when the coefficient is 0.
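
For illustration, here is a minimal sketch of the idea in plain Python. It is not the code from this PR: `ppo_policy_loss`, `no_closed_form_entropy`, and all parameter names are hypothetical, and the entropy is passed as a callable only so that the example can show it is never evaluated when the coefficient is 0.

```python
from typing import Callable, Sequence


def ppo_policy_loss(
    ratios: Sequence[float],
    advantages: Sequence[float],
    entropy_fn: Callable[[], float],
    epsilon: float = 0.2,
    entropy_coefficient: float = 0.0,
) -> float:
    """Clipped PPO surrogate loss with an optional entropy bonus (hypothetical helper)."""
    surrogate = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Clip the probability ratio to [1 - epsilon, 1 + epsilon].
        clipped = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        surrogate += min(ratio * advantage, clipped * advantage)
    loss = -surrogate / len(ratios)

    # Query the entropy only when it actually contributes to the loss.
    if entropy_coefficient != 0.0:
        loss -= entropy_coefficient * entropy_fn()
    return loss


def no_closed_form_entropy() -> float:
    # Stand-in for a distribution without an analytical entropy,
    # such as a squashed Gaussian (assumption for this sketch).
    raise NotImplementedError("entropy has no analytical form")


# With a coefficient of 0 the entropy callable is never invoked,
# so the distribution without an analytical entropy can still be used.
loss = ppo_policy_loss([1.0, 1.5], [1.0, 2.0], no_closed_form_entropy)
```

With a non-zero `entropy_coefficient`, the same call would raise `NotImplementedError`, which mirrors the limitation described above.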

sbsekiguchi · 2024-09-18T02:02:11Z

LGTM.

ishihara-y force-pushed the feature/20240911-enable-squashed-gaussian-on-ppo branch from 92b0df8 to 03908bc September 17, 2024 09:20

ishihara-y requested a review from sbsekiguchi September 17, 2024 09:20

ishihara-y self-assigned this Sep 17, 2024

ishihara-y requested a review from TakayoshiTakayanagi September 17, 2024 09:21

Enable using squashed gaussian in PPO when entropy coef is 0

03908bc

sbsekiguchi merged commit 15c2a84 into master Sep 18, 2024
11 checks passed
