On copying input_ids directly as the text label #38

Open
Cooperx521 opened this issue Mar 18, 2024 · 1 comment

Comments

@Cooperx521

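Hello. In documents/pretraining/Causal LM for Continual Pre-training.md there is this sentence: "at input time, you only need to copy input_ids directly as the label". When the loss is computed, the labels have to be shifted left by one position, so where is that step performed? Is it in the trainer? If so, how would the trainer know that this is a causal LM loss?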

@wjn1996
Contributor

wjn1996 commented Apr 12, 2024

This step is implemented in the model's forward, not in the trainer. See line 122 of https://github.com/HugAILab/HugNLP/blob/main/models/language_modeling/causal_lm.py:

# Shift so that tokens < n predict n
shift_logits = lm_logits[..., :-1, :].contiguous()
shift_labels = labels[..., 1:].contiguous()
# print("shift_labels=", shift_labels)
# Flatten the tokens
loss_fct = CrossEntropyLoss()
loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
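
To see why copying input_ids as the labels is enough, here is a minimal standard-library sketch of the same shift for a single sequence. It is an illustration only, not HugNLP code: `causal_lm_loss`, `one_hot_logits`, and the toy token ids are hypothetical names and values, and plain Python lists stand in for tensors.

```python
import math


def causal_lm_loss(logits, labels):
    """Mean next-token cross-entropy for one sequence (hypothetical helper).

    logits: list of T rows, each holding V unnormalized scores.
    labels: list of T token ids, a plain copy of input_ids.
    """
    # Shift so that position t is scored against token t + 1,
    # mirroring lm_logits[..., :-1, :] and labels[..., 1:].
    shift_logits = logits[:-1]
    shift_labels = labels[1:]

    total = 0.0
    for scores, target in zip(shift_logits, shift_labels):
        # Numerically stable log-sum-exp, then negative log-likelihood.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[target]
    return total / len(shift_labels)


def one_hot_logits(token, vocab_size=4, scale=20.0):
    """Toy logits that put almost all probability on `token`."""
    return [scale if v == token else 0.0 for v in range(vocab_size)]


input_ids = [2, 0, 1, 3]
labels = list(input_ids)  # labels are just a copy; no manual shift

# A model that predicts the *next* token at every position.
# The last row is never scored, so its content is arbitrary.
next_token_logits = [one_hot_logits(t) for t in input_ids[1:]] + [one_hot_logits(0)]
# A model that merely echoes the *current* token.
echo_logits = [one_hot_logits(t) for t in input_ids]

loss_next = causal_lm_loss(next_token_logits, labels)
loss_echo = causal_lm_loss(echo_logits, labels)
print(loss_next < loss_echo)
```

The next-token predictor gets a loss close to zero while the echoing model is penalized heavily, even though both receive unshifted labels. The alignment happens entirely inside the loss computation, so the caller never needs to know that the objective is causal.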
