⬆️✅ Support 0.6.5+ vllm #7
base: main
Conversation
Signed-off-by: Evaline Ju <[email protected]>
Force-pushed from 843768f to 66b9acf
-assert type(request) == ErrorResponse
-assert request.code == HTTPStatus.BAD_REQUEST.value
+# As of vllm >= 0.6.5, extra fields are allowed
+assert type(request) == ChatCompletionRequest
Hmm, this will change the general API behavior from our side. Does the orchestrator expect a bad request in such a scenario, or a passthrough?
From my testing, this will just cause a passthrough of the field. My worry is that adding additional validation when vllm and OpenAI allow passthrough ties us even more closely to small API changes (like tracking all expected fields).
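For illustration, here is a minimal sketch (not part of this PR's diff) of the passthrough behavior the updated assertion relies on, assuming vllm >= 0.6.5 is installed; the extra field name `detector_params` is only a made-up example:

```python
# Sketch: with vllm >= 0.6.5, ChatCompletionRequest no longer rejects
# unknown fields, so a request carrying extra keys validates successfully
# instead of producing an ErrorResponse / 400.
from vllm.entrypoints.openai.protocol import ChatCompletionRequest

request = ChatCompletionRequest.model_validate(
    {
        "model": "my-model",
        "messages": [{"role": "user", "content": "hello"}],
        # "detector_params" is a hypothetical extra field; it is accepted
        # rather than triggering a validation error.
        "detector_params": {"threshold": 0.5},
    }
)
assert type(request) == ChatCompletionRequest
```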
vllm APIs, including `OpenAIServingChat` that the chat detection base class is built on, underwent some breaking changes. The decision was made in this PR to just update the lower bound of `vllm` instead of maintaining conditional support in the tests for `0.6.2` through pre-`0.6.5`, since vllm APIs move quickly. At time of writing, there are at least two patch versions, `0.6.5` - `0.6.6`, with these supported changes. Some post-`0.6.6` but in-`main`-branch changes have been noted as inline comments.
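As a rough illustration of what the new lower bound means in practice, a guard like the following could sit at the top of the test suite (a sketch only, not something this PR necessarily adds; the `packaging` library is assumed to be available):

```python
# Sketch: skip early when the installed vllm predates 0.6.5, since the
# tests assume the post-0.6.5 APIs described in this PR.
from importlib.metadata import version

import pytest
from packaging.version import Version

if Version(version("vllm")) < Version("0.6.5"):
    pytest.skip("These tests require vllm >= 0.6.5", allow_module_level=True)
```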
Key changes:
- https://github.com/vllm-project/vllm/pull/9919, building on https://github.com/vllm-project/vllm/pull/9358, added the non-optional `chat_template_content_format` field to `OpenAIServingChat` (see the sketch after this list)
- https://github.com/vllm-project/vllm/pull/10463 allows extra fields in the vllm API, since the OpenAI API now allows extra fields, impacting request/response classes like `ChatCompletionRequest` (used to make the request to chat completions)
- https://github.com/vllm-project/vllm/pull/11164 added `get_diff_sampling_params` to model configs
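As a quick way to see the first breaking change above, the sketch below (not from this PR) inspects the `OpenAIServingChat` constructor; with vllm >= 0.6.5 it should report the new `chat_template_content_format` parameter:

```python
# Sketch: confirm that OpenAIServingChat now takes the
# chat_template_content_format argument introduced in vllm PR #9919.
import inspect

from vllm.entrypoints.openai.serving_chat import OpenAIServingChat

params = inspect.signature(OpenAIServingChat.__init__).parameters
assert "chat_template_content_format" in params
```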
Closes: #6