async execute is not run concurrently #7888
Comments
+1. We also encountered this problem in NGC Triton server 23.12. I suspect that the underlying … @Tabrizian @okdimok @oandreeva-nv PTAL, thanks!
Thanks for the issue. Is it possible to share some code? That will significantly speed up the debugging process.
I believe you can use this model in non-decoupled mode: server/qa/python_models/bls_async/model.py, lines 228 to 249 at 0194c3d.
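For reference, here is a minimal sketch of the pattern that model demonstrates (this is not the contents of the linked file): an async `execute` that fans out several BLS calls with `asyncio.gather` in non-decoupled mode. The model name and tensor names are placeholders.

```python
import asyncio
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    async def execute(self, requests):
        responses = []
        for request in requests:
            input_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            # Fan out two BLS requests and await them together; while they
            # are pending, the event loop is free to run other coroutines.
            infer_requests = [
                pb_utils.InferenceRequest(
                    model_name="downstream_model",  # placeholder name
                    requested_output_names=["OUTPUT0"],
                    inputs=[input_tensor],
                )
                for _ in range(2)
            ]
            results = await asyncio.gather(
                *[r.async_exec() for r in infer_requests]
            )
            for infer_response in results:
                if infer_response.has_error():
                    raise pb_utils.TritonModelException(
                        infer_response.error().message()
                    )
            output = pb_utils.get_output_tensor_by_name(results[0], "OUTPUT0")
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[output])
            )
        return responses
```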
Hi @oandreeva-nv, when I used … it can be seen that the CPU is mostly consumed at https://github.com/python/cpython/blob/3.10/Lib/concurrent/futures/thread.py#L81 and https://github.com/python/cpython/blob/3.10/Lib/concurrent/futures/thread.py#L58. The underlying calls are PyThread_acquire_lock_timed (libpython3.10.so.1.0) and pthread_cond_timedwait (libc.so.6). I took a quick look at the …
I am not sure whether it is CPU-intensive and causes the blocking, but when I replaced it with the aio gRPC client, the blocking disappeared. FYI.
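For completeness, a sketch of the workaround described above, using the asyncio gRPC client (`tritonclient.grpc.aio`) instead of BLS `async_exec`. The URL, model name, and tensor names are assumptions for illustration; a real model would also reuse a single client rather than opening one per call.

```python
import numpy as np
import tritonclient.grpc.aio as grpcclient


async def infer_via_aio_grpc(batch: np.ndarray) -> np.ndarray:
    client = grpcclient.InferenceServerClient(url="localhost:8001")
    inputs = [grpcclient.InferInput("INPUT0", batch.shape, "FP32")]
    inputs[0].set_data_from_numpy(batch)
    outputs = [grpcclient.InferRequestedOutput("OUTPUT0")]
    # `infer` is a coroutine in the aio client, so awaiting it yields to
    # the event loop instead of blocking on a thread-pool future.
    result = await client.infer("downstream_model", inputs, outputs=outputs)
    await client.close()
    return result.as_numpy("OUTPUT0")
```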
Description
We have a Python BLS model that calls into another model. This BLS model is just a thin wrapper, and we use `await infer_request.async_exec()`. In this case, the async function should handle multiple requests concurrently while it is waiting for `async_exec`. However, we notice the backlog builds up on this BLS model rather than on the actual backend model, which means requests are not being processed concurrently.
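A minimal sketch of the wrapper described above (the downstream model name and tensor names here are placeholders, not our actual configuration):

```python
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    async def execute(self, requests):
        responses = []
        for request in requests:
            infer_request = pb_utils.InferenceRequest(
                model_name="backend_model",  # placeholder name
                requested_output_names=["OUTPUT0"],
                inputs=[pb_utils.get_input_tensor_by_name(request, "INPUT0")],
            )
            # While this await is pending, other `execute` coroutines for
            # queued requests should be able to make progress.
            infer_response = await infer_request.async_exec()
            responses.append(
                pb_utils.InferenceResponse(
                    output_tensors=infer_response.output_tensors()
                )
            )
        return responses
```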
Triton Information
24.11
To Reproduce
Expected behavior
If the async BLS model can handle concurrent requests, the backlog should occur on the backend model rather than on the BLS model.