TRPO results on the CartPole environment #95

Open
24krab opened this issue Oct 29, 2024 · 0 comments

24krab commented Oct 29, 2024

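I copied the code from the TRPO chapter and ran it unchanged, but on the CartPole environment my results differ markedly from those shown in the textbook. With 500 sampled episodes per iteration, the return almost never reaches 200. Increasing the number of episodes to 1000 improves things somewhat, but the gap from the published results is still obvious. I got similar results on two different computers. What might be causing this?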
[Image: CartPole, 2 rounds, smoothed]
[Image: CartPole, 1000 episodes, smoothed]
