errors for charptb_discreteflow_af-scf #1
Thanks for sharing this awesome work.

I'm trying to reproduce your results on the PTB dataset. `baseline` and `charptb_discreteflow_af-af` work well (as in the log below), but `charptb_discreteflow_af-scf` fails with `NaN` for the loss and `NameError: name 'cur_impatience' is not defined`. Could you check it? Thanks.
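For context, a `NameError` like this usually means the variable is only assigned on one code path. A hypothetical sketch of the pattern, assuming a typical early-stopping counter (this is illustrative, not the repo's actual code; `validate` and `best_val` are made up):

```python
import random

def validate():
    # Stand-in for the real validation pass; returns a fake loss.
    return random.uniform(1.0, 2.0)

best_val = 0.5  # e.g. restored from a checkpoint, so no epoch "improves"

for epoch in range(10):
    val_loss = validate()
    if val_loss < best_val:
        best_val = val_loss
        cur_impatience = 0   # the counter is only ever bound on this branch...
    else:
        cur_impatience += 1  # ...so this raises NameError on the first miss

# The usual fix is to initialize the counter once, before the loop:
#     cur_impatience = 0
```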
When running the code on PyTorch 0.4.1 (as listed in the Dependencies section of the README), the NaN behaviour above did not happen and learning seems stable (although I haven't checked whether the results in the paper are replicated yet). However, when running the code on PyTorch >= 1.0, I can confirm that the NaN issue emerges. There is a fix, though. I've looked into what was causing the NaN, and observed that the […]. Looking at the release notes for PyTorch 1.0 (which list what changed from v0.4.1), it turns out that the change made to […]. So for version 0.4.1, […].
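For anyone else hunting down where the NaN first appears, a minimal sketch of the standard PyTorch debugging tools (generic code, not tied to this repo; `model` is a placeholder for the flow model under test):

```python
import torch

# Make backward raise immediately when a NaN appears in the graph,
# with a traceback pointing at the forward op that produced it:
torch.autograd.set_detect_anomaly(True)

# Optionally, flag the first module whose forward output goes NaN:
def nan_hook(module, inputs, output):
    if torch.is_tensor(output) and torch.isnan(output).any():
        raise RuntimeError(f"NaN in output of {module.__class__.__name__}")

# model = ...  # placeholder
# for m in model.modules():
#     m.register_forward_hook(nan_hook)
```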
A side note on code compatibility with PyTorch 1.0 and 1.1: it seems like using in-place operations on tensors leads to the training loss […]. I used […] instead of the original code (Lines 53 to 58 in 0eb8552); see the toy sketch below. In general, I think it's recommended not to use in-place operations with autograd, although for PyTorch >= 1.2 the in-place correctness checks do confirm that you are getting correct gradients whenever there is no error message.
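To illustrate the general point (a toy example, not the code from Lines 53 to 58): `sigmoid` saves its output for the backward pass, so mutating that output in place trips autograd's in-place correctness check, while the out-of-place version is safe.

```python
import torch

x = torch.randn(5, requires_grad=True)

# In-place version: sigmoid saves its *output* for backward, and the
# in-place add overwrites that saved buffer, so autograd raises.
y = torch.sigmoid(x)
y += 1.0
try:
    y.sum().backward()
except RuntimeError as err:
    print("in-place op invalidated the graph:", err)

# Out-of-place version: the saved output stays intact and the
# gradients are computed correctly.
y = torch.sigmoid(x)
z = y + 1.0
z.sum().backward()
```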
Thanks for your confirmation and guidance. I'll try it.