Pull requests: vllm-project/vllm
[V1] Remove _get_cache_block_size
ready
ONLY add when PR is ready to merge/full CI is needed
#12214
opened Jan 20, 2025 by
heheda12345
[bugfix] catch xgrammar unsupported array constraints
#12210
opened Jan 20, 2025 by
Jason-CKY
[core][bugfix] configure env var during import vllm
ready
ONLY add when PR is ready to merge/full CI is needed
#12209
opened Jan 20, 2025 by
youkaichao
[Model]: get aria to work with the lastest transfomers impl
needs-rebase
#12207
opened Jan 20, 2025 by
xffxff
[Model] Introduce CUDA Graph support for DeepSeek v3
#12204
opened Jan 20, 2025 by
houseroad
[Bugfix] fix race condition that leads to wrong order of token returned
#12192
opened Jan 19, 2025 by
joennlae
[Kernel] add triton fused moe kernel for gptq/awq
#12185
opened Jan 18, 2025 by
jinzhen-lin
[Hardware][Gaudi][Bugfix] Fix HPU tensor parallelism, enable multiprocessing executor
#12167
opened Jan 17, 2025 by
kzawora-intel
[Quantization/Parameter] WIP: Another Implementation of the Quantization Parameter Subclass Substitution
#12158
opened Jan 17, 2025 by
cennn
[Core] Optimize topp/topk calculation in sampler
#12156
opened Jan 17, 2025 by
afierka-intel
Draft
[WIP][Hardware][CPU] testing branch for mlperf
ci/build
documentation
Improvements or additions to documentation
needs-rebase
#12141
opened Jan 17, 2025 by
bigPYJ1151
Draft
[Misc] Update to Transformers 4.48
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#12120
opened Jan 16, 2025 by
tlrmchlsmth
[BUILD] Add VLLM_BUILD_EXT to control custom op build
ci/build
#12116
opened Jan 16, 2025 by
MengqingCao
[Misc]add modules_to_not_convert attribute to gptq series
#12103
opened Jan 16, 2025 by
1096125073
Use CUDA 12.4 as default for release and nightly wheels
ci/build
documentation
Improvements or additions to documentation
#12098
opened Jan 15, 2025 by
mgoin
Add: Support for Sparse24Bitmask Compressed Models
#12097
opened Jan 15, 2025 by
rahul-tuli
Draft
1 task
[V1][Perf] Reduce scheduling overhead in model runner after cuda sync
#12094
opened Jan 15, 2025 by
youngkent