Remove Gunicorn and use Uvicorn only for gateway #530
base: main
Conversation
This is a big change; can you include a summary of the test plan? I'm worried about any performance impact.
"--workers", | ||
f"{num_workers}", | ||
"1", # Let the Kubernetes deployment handle the number of pods |
Will this reduce the amount of traffic we can receive per-pod? Why not keep it at 4?
This is to remove load balancing within the pod.
Are we increasing the number of pods by 4 to compensate?
That's the initial plan.
Tbh, I think LLM Engine is overprovisioned.
In addition to the planned load testing, let's also monitor post-rollout 👀
Pull Request Summary
In a Kubernetes environment we don't really need multiple workers in the same pod; it's simpler to let Kubernetes autoscale the number of pods. Based on some internal benchmarks, Gunicorn has known load-balancing issues, and removing this layer results in fewer errors and better latency.
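For illustration, here is a minimal sketch of what a single-worker Uvicorn invocation could look like after this change; the ASGI app path, host, and port below are assumptions, not taken from this PR:

```python
import subprocess

# Hedged sketch of launching the gateway with Uvicorn only and one worker per
# pod; the app path, host, and port are assumed values for illustration.
command = [
    "uvicorn",
    "model_engine_server.api.app:app",  # assumed ASGI application path
    "--host", "0.0.0.0",
    "--port", "5000",
    "--workers",
    "1",  # let the Kubernetes deployment handle the number of pods
]
subprocess.run(command, check=True)
```

With a single worker per pod, horizontal scaling is handled entirely by the Kubernetes deployment (replica count or an autoscaler) rather than by Gunicorn's in-process worker balancing.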
Test Plan and Usage Guide
Will run simple load testing for GET requests with and without Gunicorn.
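As a rough illustration of such a test, a small script like the one below could drive concurrent GET requests and report latency percentiles; the URL, concurrency, and request count are placeholder assumptions:

```python
# Minimal load-test sketch for comparing GET latency with and without Gunicorn.
# The endpoint URL, concurrency, and request count are assumptions.
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:5000/healthz"  # assumed GET endpoint on the gateway
CONCURRENCY = 16
REQUESTS = 500

def timed_get(_: int) -> float:
    start = time.perf_counter()
    with urllib.request.urlopen(URL) as resp:
        resp.read()
    return time.perf_counter() - start

with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    latencies = list(pool.map(timed_get, range(REQUESTS)))

print(f"p50={statistics.median(latencies) * 1000:.1f} ms, "
      f"p99={statistics.quantiles(latencies, n=100)[98] * 1000:.1f} ms")
```

Running the same script against the Gunicorn-backed and Uvicorn-only gateway would give a simple before/after comparison of error rate and latency.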