
Why not detach the hidden state of the GRU from the computational graph? #44

Open
MejiroSilence opened this issue Dec 19, 2024 · 0 comments

MejiroSilence commented Dec 19, 2024

Hello author, I have a question about your code: why isn't the hidden state of the GRU detached from the computational graph? Backpropagating through the full unrolled sequence can lead to exploding/vanishing gradients. Other RNN code I have seen detaches the hidden state from the previous step; it seems that only PyMARL and its various derived extensions do not. I checked https://github.com/oxwhirl/pymarl and PyMARL itself is written this way, and all the improved repositories based on PyMARL handle it the same way. I hope to get an answer. Thank you.

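For concreteness, here is a minimal PyTorch sketch of the two schemes the question contrasts (this is not the repository's actual code; the `GRUCell` setup, shapes, and loss are illustrative assumptions). Without `detach()`, `backward()` propagates through every unrolled step (full BPTT); detaching the hidden state before each step truncates the graph so gradients only flow through the most recent step, which is the usual remedy for exploding/vanishing gradients over long rollouts:

```python
import torch
import torch.nn as nn

# Illustrative setup (not PyMARL's code): a GRUCell unrolled over T steps.
torch.manual_seed(0)
cell = nn.GRUCell(input_size=4, hidden_size=8)
xs = torch.randn(5, 3, 4)  # (T, batch, input)

def run(detach_hidden: bool) -> torch.Tensor:
    """Unroll the cell and return the gradient of the recurrent weights."""
    h = torch.zeros(3, 8)
    for x in xs:
        if detach_hidden:
            # Cut the graph: gradients will not flow into earlier steps.
            h = h.detach()
        h = cell(x, h)
    loss = h.pow(2).mean()
    cell.zero_grad()
    loss.backward()
    return cell.weight_hh.grad.clone()

g_full = run(detach_hidden=False)   # full BPTT through all 5 steps
g_trunc = run(detach_hidden=True)   # gradients from the last step only
```

Comparing `g_full` and `g_trunc` shows the two schemes produce different gradients: the full-graph version accumulates contributions from every timestep, which is exactly the path through which gradients can blow up or vanish over long sequences.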
