Set vllm-hpu-extension to c2cd742 #588
Closed
szutenberg wants to merge 678 commits into main from dev/mszutenberg/c2cd742
+27,075 -8,228
Commits
This pull request is big! We're only showing the most recent 250 commits
Commits on Nov 6, 2024
Commits on Nov 7, 2024
[Misc] Add Gamma-Distribution Request Generation Support for Serving Benchmark. (vllm-project#10105)
Commits on Nov 8, 2024
Disable spec-decode + chunked-prefill for draft models with tensor parallelism > 1 (vllm-project#10136)
Commits on Nov 9, 2024
[Kernel][Triton] Add Triton implementation for scaled_mm_triton to support fp8 and int8 SmoothQuant, symmetric case (vllm-project#9857)
Commits on Nov 10, 2024
Commits on Nov 11, 2024
Commits on Nov 12, 2024
[BugFix] Do not raise a `ValueError` when `tool_choice` is set to the supported `none` option and `tools` are not defined. (vllm-project#10000)
[V1] Use pickle for serializing EngineCoreRequest & Add multimodal inputs to EngineCoreRequest (vllm-project#10245)
Commits on Nov 13, 2024
[Model] Add support for Qwen2-VL video embeddings input & multiple image embeddings input with varied resolutions (vllm-project#10221)
[Model] Adding Support for Qwen2VL as an Embedding Model. Using MrLight/dse-qwen2-2b-mrl-v1 (vllm-project#9944)
Commits on Nov 14, 2024
[BugFix]: properly deserialize `tool_calls` iterator before processing by mistral-common when MistralTokenizer is used (vllm-project#9951)
Commits on Nov 15, 2024
[Bugfix] Ensure special tokens are properly filtered out for guided structured output with MistralTokenizer (vllm-project#10363)
Commits on Nov 16, 2024
Commits on Nov 17, 2024
Commits on Nov 18, 2024
Commits on Nov 19, 2024
Commits on Nov 20, 2024
Commits on Nov 21, 2024
Commits on Nov 22, 2024
Commits on Nov 25, 2024
Commits on Nov 26, 2024
Commits on Nov 27, 2024
Commits on Nov 28, 2024
Commits on Nov 29, 2024
Commits on Dec 2, 2024
Commits on Dec 3, 2024
Commits on Dec 4, 2024