It is currently easy to cause an out-of-memory condition by prompting a model with a very long prompt. This is an expected consequence of how certain tokenizers and transformer attention are implemented. Experienced users may intentionally want to use long prompts, but less experienced users may hit this by accident and run into confusing OOM conditions (#31) or extremely slow runtime performance.
It may be helpful to explore a default limit on prompt length to help users avoid these friction points.
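One possible shape for this is a token-count check performed before generation, with a default ceiling that experienced users can raise or disable explicitly. The sketch below is only illustrative, assuming hypothetical names (`DEFAULT_MAX_PROMPT_TOKENS`, `check_prompt_length`, `PromptTooLongError`) that are not part of this project's API:

```python
# Hypothetical prompt-length guard: fail fast with a clear error
# instead of letting an oversized prompt cause an OOM or a very
# slow generation. All names here are illustrative placeholders.

DEFAULT_MAX_PROMPT_TOKENS = 2048  # illustrative default, not a project setting


class PromptTooLongError(ValueError):
    """Raised when a prompt exceeds the configured token limit."""


def check_prompt_length(token_ids, max_tokens=DEFAULT_MAX_PROMPT_TOKENS):
    """Reject prompts whose token count exceeds `max_tokens`.

    Passing `max_tokens=None` disables the check for users who
    intentionally want very long prompts.
    """
    if max_tokens is not None and len(token_ids) > max_tokens:
        raise PromptTooLongError(
            f"Prompt is {len(token_ids)} tokens, exceeding the default limit "
            f"of {max_tokens}. Pass a larger max_tokens (or None) to override "
            f"if this is intentional."
        )
    return token_ids
```

The key design choice is making the limit an explicit, overridable default rather than a hard cap, so the guard protects newcomers without blocking deliberate long-prompt use.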