Pull requests: vllm-project/vllm

[Misc] Pass attention to impl backend
#12218 opened Jan 20, 2025 by wangxiyuan
[V1] Remove _get_cache_block_size (label: ready)
#12214 opened Jan 20, 2025 by heheda12345
[core][bugfix] configure env var during import vllm (label: ready)
#12209 opened Jan 20, 2025 by youkaichao
[Model] Introduce CUDA Graph support for DeepSeek v3
#12204 opened Jan 20, 2025 by houseroad
[V1][Spec Decode] Ngram Spec Decode
#12193 opened Jan 19, 2025 by LiuXiaoxuanPKU (Draft)
[misc] add cuda runtime version to usage data
#12190 opened Jan 19, 2025 by youkaichao
[Misc] Add Gemma2 GGUF support
#12186 opened Jan 18, 2025 by Isotr0py (Draft)
[Kernel] add triton fused moe kernel for gptq/awq
#12185 opened Jan 18, 2025 by jinzhen-lin
[WIP][Hardware][CPU] testing branch for mlperf (labels: ci/build, documentation, needs-rebase)
#12141 opened Jan 17, 2025 by bigPYJ1151 (Draft)
[Misc] Update to Transformers 4.48 (labels: ci/build, ready)
#12120 opened Jan 16, 2025 by tlrmchlsmth
Use CUDA 12.4 as default for release and nightly wheels (labels: ci/build, documentation)
#12098 opened Jan 15, 2025 by mgoin