Commit
Enable FusedSDPA prefill by default (#447)
This removes the need to pass the VLLM_PROMPT_USE_FUSEDSDPA environment variable in order to enable FusedSDPA attention. Fallback attention can still be used by setting VLLM_PROMPT_USE_FUSEDSDPA=0.
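For illustration only, a default-on environment flag of this kind could be read roughly as follows. This is a minimal sketch, not the code changed by this commit: the helper name `prompt_fusedsdpa_enabled` is hypothetical, and treating only the exact value `0` as "disabled" is an assumption based on the description above.

```python
import os


def prompt_fusedsdpa_enabled(environ=None):
    """Hypothetical helper: report whether FusedSDPA prefill should be used.

    Assumption: the flag is on unless VLLM_PROMPT_USE_FUSEDSDPA is set to "0".
    """
    env = os.environ if environ is None else environ
    # An unset variable falls back to "1", i.e. FusedSDPA enabled by default.
    value = env.get("VLLM_PROMPT_USE_FUSEDSDPA", "1")
    return value.strip() != "0"


if __name__ == "__main__":
    # Reads the real process environment when no mapping is passed.
    print("FusedSDPA prefill enabled:", prompt_fusedsdpa_enabled())
```

Under this sketch, running with VLLM_PROMPT_USE_FUSEDSDPA=0 in the environment selects the fallback attention path, and leaving the variable unset selects FusedSDPA.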