This issue is very general; I do not have a MRE yet.
Large Language Models use the EOS token to signify that generation is over. Instruct models are different: they have additional special tokens to mark the beginning and end of an instruction, as well as the beginning and end of the model's response; see for instance OpenAI's ChatML format. My impression with the current setup is that generation does not stop when these special tokens are generated, and that generating a turn-start token is not disallowed, unlike with the BOS and EOS tokens.
I noticed this when I went over the SCP examples. There we use JSON-based structured generation to generate the different entries. It turns out that some of the JSON fields contained these special tokens, and then more text, as you can see in this commit where I made the correction by hand.
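Until generation stops on these tokens properly, a minimal post-hoc workaround is to truncate the generated text at the first turn marker, mimicking the stop-at-EOS behavior. The marker list below is an assumption (ChatML-style names); the real set for a given model comes from its tokenizer's special-tokens configuration:

```python
# Hypothetical workaround sketch: cut generated text at the earliest
# occurrence of any instruct-model special token. The markers below are
# assumed ChatML-style names, not taken from any specific tokenizer.
TURN_MARKERS = ["<|im_start|>", "<|im_end|>", "<|endoftext|>"]

def truncate_at_special_tokens(text: str, markers=TURN_MARKERS) -> str:
    """Return `text` truncated at the first special marker, if any."""
    cut = len(text)
    for marker in markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# Example: a JSON field that leaked a turn token plus trailing text
print(truncate_at_special_tokens('{"name": "SCP-173"}<|im_end|>more text'))
# -> {"name": "SCP-173"}
```

A proper fix would instead treat these token ids like EOS inside the generation loop (stop on them, and mask them out of the allowed set), but the string-level cleanup above at least prevents them from leaking into structured JSON fields.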