Introduce guardrails around long prompts #32

Open
jncraton opened this issue Feb 18, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@jncraton
Owner

It is currently easy to cause an out-of-memory condition when prompting a model with a very long prompt. This is an expected result of the implementation of both certain tokenizers and transformer attention. Experienced users may intentionally want to use long prompts, but less experienced users may run into this by accident and hit confusing OOM conditions (#31) or extremely slow runtime performance.

It may be helpful to explore a mechanism that limits prompt length by default in order to help users avoid these friction points, for example along the lines of the sketch below.
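
As a rough illustration (not a proposal for this project's actual API), one option is a small check that runs before generation and fails fast with an explanatory error when the tokenized prompt exceeds a configurable default limit. The function name, the `max_prompt_tokens` parameter, and the default value of 2048 are all hypothetical:

```python
def check_prompt_length(tokenizer, prompt, max_prompt_tokens=2048):
    """Guardrail sketch: reject prompts whose token count exceeds a default limit.

    `max_prompt_tokens` and its default are illustrative only; the real limit
    would likely depend on the loaded model's context window.
    """
    # Assumes a tokenizer exposing an encode() method that returns a token list.
    tokens = tokenizer.encode(prompt)

    if len(tokens) > max_prompt_tokens:
        raise ValueError(
            f"Prompt is {len(tokens)} tokens, which exceeds the default limit of "
            f"{max_prompt_tokens}. Long prompts can cause out-of-memory errors or "
            f"very slow generation. Pass a larger max_prompt_tokens value to "
            f"override this guardrail if the long prompt is intentional."
        )

    return tokens
```

An explicit error like this keeps the escape hatch available for experienced users (raise the limit or disable the check) while giving newer users an actionable message instead of a silent OOM crash. Truncating the prompt instead of raising is another option, though silent truncation can be more surprising.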

@jncraton jncraton added the enhancement New feature or request label Feb 18, 2024