
A problem with loss computation. #23

Open

yxdr opened this issue Dec 7, 2019 · 1 comment


yxdr commented Dec 7, 2019

loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)

The loss computed by the line above is the average over every time step, which can make the model difficult to train. So I suggest accumulating (summing) the loss over the time steps instead. In my experiments, this makes the model easier to train.
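
For illustration, here is a minimal sketch of the suggested change in PyTorch. The dummy tensor shapes, the pad index of 0, and the batch-size normalization are assumptions made for the sketch, not taken from the repository's model.py:

import torch
import torch.nn.functional as F

# Hypothetical shapes matching the snippet above: output holds log-probabilities
# of shape (seq_len, batch, vocab_size); trg holds token indices (seq_len, batch).
seq_len, batch, vocab_size, pad = 10, 4, 100, 0
output = torch.log_softmax(torch.randn(seq_len, batch, vocab_size), dim=-1)
trg = torch.randint(1, vocab_size, (seq_len, batch))

# Original line: averages over all non-pad tokens, so each time step's
# contribution to the gradient shrinks as sequences get longer.
loss_mean = F.nll_loss(output[1:].view(-1, vocab_size),
                       trg[1:].contiguous().view(-1), ignore_index=pad)

# Suggested change: sum the per-token losses (i.e. accumulate over the
# time steps), then normalize by batch size only.
loss_sum = F.nll_loss(output[1:].view(-1, vocab_size),
                      trg[1:].contiguous().view(-1),
                      ignore_index=pad, reduction='sum')
loss = loss_sum / batch

Note that with reduction='sum' the loss scale grows with sequence length, so the learning rate may need retuning; the upside is that long sequences no longer dilute the per-step gradient.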

yxdr changed the title from "line 108 in model.py can cause serious memory leaks." to "line 108 in model.py may cause memory leaks." on Dec 7, 2019
yxdr changed the title from "line 108 in model.py may cause memory leaks." to "A problem with loss computation." on Dec 18, 2019
fengxin619 commented:

> loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)
>
> The loss computed by the line above is the average over every time step, which can make the model difficult to train. So I suggest accumulating (summing) the loss over the time steps instead. In my experiments, this makes the model easier to train.

So, how should the loss be written?
