⬆️✅ Support 0.6.5+ vllm #7
base: main
Conversation
Signed-off-by: Evaline Ju <[email protected]>
Force-pushed from 843768f to 66b9acf
-assert type(request) == ErrorResponse
-assert request.code == HTTPStatus.BAD_REQUEST.value
+# As of vllm >= 0.6.5, extra fields are allowed
+assert type(request) == ChatCompletionRequest
Hmm, this will change the general API behavior from our side. Does the orchestrator expect a bad request in such a scenario, or a passthrough?
From my testing, this will just cause a passthrough of the field. My worry is that adding additional validation when vllm and OpenAI allow passthrough ties us even more closely to small API changes (like tracking all expected fields).
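For illustration, here is a minimal sketch (not part of this PR's diff) of the passthrough behavior the updated assertion relies on, assuming vllm >= 0.6.5 is installed; the extra field name `detector_params` is only a made-up example:

```python
# Sketch: with vllm >= 0.6.5, ChatCompletionRequest no longer rejects
# unknown fields, so a request carrying extra keys validates successfully
# instead of producing an ErrorResponse / 400.
from vllm.entrypoints.openai.protocol import ChatCompletionRequest

request = ChatCompletionRequest.model_validate(
    {
        "model": "my-model",
        "messages": [{"role": "user", "content": "hello"}],
        # "detector_params" is a hypothetical extra field; it is accepted
        # rather than triggering a validation error.
        "detector_params": {"threshold": 0.5},
    }
)
assert type(request) == ChatCompletionRequest
```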
vllm APIs, including `OpenAIServingChat` that the chat detection base class is built on, underwent some breaking changes. The decision was made in this PR to just update the lower bound of `vllm` instead of maintaining conditional support in the tests for `0.6.2` through pre-`0.6.5`, since vllm APIs move quickly. At time of writing, there are at least two patch versions, `0.6.5` - `0.6.6`, with these supported changes. Some post-`0.6.6` but in-`main`-branch changes have been noted as inline comments.
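As a rough illustration of what the new lower bound means in practice, a guard like the following could sit at the top of the test suite (a sketch only, not something this PR necessarily adds; the `packaging` library is assumed to be available):

```python
# Sketch: skip early when the installed vllm predates 0.6.5, since the
# tests assume the post-0.6.5 APIs described in this PR.
from importlib.metadata import version

import pytest
from packaging.version import Version

if Version(version("vllm")) < Version("0.6.5"):
    pytest.skip("These tests require vllm >= 0.6.5", allow_module_level=True)
```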
Key changes:
- https://github.com/vllm-project/vllm/pull/9919, building on https://github.com/vllm-project/vllm/pull/9358, added the non-optional `chat_template_content_format` field to `OpenAIServingChat` (see the sketch after this list)
- https://github.com/vllm-project/vllm/pull/10463 allows extra fields in the vllm API, since the OpenAI API now allows extra fields, impacting request/response classes like `ChatCompletionRequest` (used to make the request to chat completions)
- https://github.com/vllm-project/vllm/pull/11164 added `get_diff_sampling_params` to model configs
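As a quick way to see the first breaking change above, the sketch below (not from this PR) inspects the `OpenAIServingChat` constructor; with vllm >= 0.6.5 it should report the new `chat_template_content_format` parameter:

```python
# Sketch: confirm that OpenAIServingChat now takes the
# chat_template_content_format argument introduced in vllm PR #9919.
import inspect

from vllm.entrypoints.openai.serving_chat import OpenAIServingChat

params = inspect.signature(OpenAIServingChat.__init__).parameters
assert "chat_template_content_format" in params
```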
Closes: #6