loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)
The loss computed by the line above is averaged over all time steps, which can make the model difficult to train.
So I suggest accumulating (summing) the loss over the time steps instead. In my experiments, this made the model easier to train.
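One way to accumulate the loss is to switch the reduction from the default mean to a sum. A minimal sketch, assuming output holds log-probabilities of shape (seq_len, batch, vocab_size) and trg has shape (seq_len, batch); the final division by the batch size is an optional normalization I added, not something from the original report:

import torch.nn.functional as F

# Sum the NLL over all non-pad target tokens instead of averaging,
# so longer sequences contribute proportionally more to the loss.
loss = F.nll_loss(
    output[1:].view(-1, vocab_size),   # skip the initial <sos> step
    trg[1:].contiguous().view(-1),
    ignore_index=pad,
    reduction='sum',
)
loss = loss / trg.size(1)              # optional: normalize per sequence in the batch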
yxdr changed the title from "line 108 in model.py can cause serious memory leaks." to "line 108 in model.py may cause memory leaks." on Dec 7, 2019.
yxdr changed the title from "line 108 in model.py may cause memory leaks." to "A problem with loss computation." on Dec 18, 2019.
loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)
The loss computed by the line above is averaged over all time steps, which can make the model difficult to train.
So I suggest accumulating (summing) the loss over the time steps instead. In my experiments, this made the model easier to train.
So, how should the loss be written?