server : various fixes #10704
ggml-ci
// Some idiosyncrasy in task processing logic makes several trailing calls
// with empty content, we ignore these at the callee site.
if (content.empty()) {
    return std::vector<json>({json::object()});
}
This fixes #10694
@@ -34,14 +34,6 @@ endforeach()
add_executable(${TARGET} ${TARGET_SRCS})
install(TARGETS ${TARGET} RUNTIME)

# clean up generated files in pre-build step
Just a note here: we should add a check in /scripts/xxd.cmake to see whether the file needs to be regenerated. I will do that in another PR.
OK. You mentioned that the /slots endpoint is also broken. I haven't looked at it yet. Maybe we can apply any additional fixes in this PR before merging? Feel free to push directly.
Yup, I fixed it in 01da1ed.
I also fixed a problem with the cpp wrapper llama_get_chat_template: it returned a null terminator in the final JSON.
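For readers unfamiliar with this class of bug, here is a minimal sketch of how a trailing null terminator can leak into a std::string built from a C-style buffer. It is not the actual llama.cpp code: get_meta_value, read_template_with_terminator and read_template are hypothetical names, and the getter is assumed to behave like snprintf (it returns the value length without the terminator).

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for a C-style getter: writes at most buf_size bytes
// (including the terminating '\0') and returns the value length without it.
static int get_meta_value(char * buf, size_t buf_size) {
    return std::snprintf(buf, buf_size, "%s", "{{ messages }}");
}

// Buggy variant: the string is built from the whole buffer, so the
// terminating '\0' becomes the last character of the std::string and
// would later be serialized into the JSON response.
static std::string read_template_with_terminator() {
    const int len = get_meta_value(nullptr, 0); // query the required length
    if (len < 0) {
        return "";
    }
    std::vector<char> buf(len + 1, 0);
    get_meta_value(buf.data(), buf.size());
    return std::string(buf.data(), buf.size());
}

// Corrected variant: only the len payload characters are copied.
static std::string read_template() {
    const int len = get_meta_value(nullptr, 0);
    if (len < 0) {
        return "";
    }
    std::vector<char> buf(len + 1, 0);
    get_meta_value(buf.data(), buf.size());
    return std::string(buf.data(), len);
}
```

Under these assumptions the two variants differ by exactly one character: the buggy one ends in '\0', which a JSON serializer would typically emit as an escaped control character at the end of the template string.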
Co-authored-by: Georgi Gerganov <[email protected]>
* server : various fixes

ggml-ci

* server : show current seed in slot_params

ggml-ci

* fix /slots endpoint

* Update examples/server/server.cpp

Co-authored-by: Georgi Gerganov <[email protected]>

* server : reflect endpoint response changes in the readme

ggml-ci

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
Co-authored-by: Xuan Son Nguyen <[email protected]>
Important
The /slots and /props responses have changed. See the updated README
llama-server on each make
n_ctx from slot_params to server_slot
server_slot.to_json()