loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)
The loss computed by the line above is averaged over all time steps, which can make the model difficult to train.
So I suggest accumulating (summing) the loss over the time steps instead. In my experiments, this made the model easier to train.
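One way to accumulate the loss is to switch the reduction from the default mean to a sum. A minimal sketch, assuming output holds log-probabilities of shape (seq_len, batch, vocab_size) and trg has shape (seq_len, batch); the final division by the batch size is an optional normalization I added, not something from the original report:

import torch.nn.functional as F

# Sum the NLL over all non-pad target tokens instead of averaging,
# so longer sequences contribute proportionally more to the loss.
loss = F.nll_loss(
    output[1:].view(-1, vocab_size),   # skip the initial <sos> step
    trg[1:].contiguous().view(-1),
    ignore_index=pad,
    reduction='sum',
)
loss = loss / trg.size(1)              # optional: normalize per sequence in the batch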
yxdr changed the title from "line 108 in model.py can cause serious memory leaks." to "line 108 in model.py may cause memory leaks." on Dec 7, 2019.
yxdr changed the title from "line 108 in model.py may cause memory leaks." to "A problem with loss computation." on Dec 18, 2019.
loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)
The loss computed by the line above is averaged over all time steps, which can make the model difficult to train.
So I suggest accumulating (summing) the loss over the time steps instead. In my experiments, this made the model easier to train.
So, how should the loss be written?