System Info
I'm using the AWS TEI Docker image (2.0.1-tei1.4.0-gpu-py310-cu122-ubuntu22.04) for text embeddings inference. When I deploy it on a SageMaker g4dn.xlarge instance, the process stops working after just a couple of requests. Strangely, the same setup runs smoothly on a g5 instance without any issues.
It looks like after a few inference requests on g4dn.xlarge, the processes that serve the models just die.
Any idea why this is happening on this specific instance type?
Information
Docker
The CLI directly
Tasks
An officially supported command
My own modifications
Reproduction
Deploy a model with TEI on a g4dn instance and send a few hundred or a few thousand requests.
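A load-generation sketch like the following can drive the reproduction, assuming a TEI model is already deployed behind a SageMaker endpoint. The endpoint name `tei-g4dn-endpoint` is a placeholder; TEI's `/embed` route accepts a JSON body of the form `{"inputs": [...]}`, which SageMaker forwards to the container unchanged.

```python
import json

# Placeholder name -- substitute your own deployed SageMaker endpoint.
ENDPOINT_NAME = "tei-g4dn-endpoint"


def build_payloads(texts, batch_size=8):
    """Group input texts into TEI-style JSON request bodies."""
    return [
        json.dumps({"inputs": texts[i : i + batch_size]})
        for i in range(0, len(texts), batch_size)
    ]


def main():
    import boto3  # requires AWS credentials and a live endpoint

    client = boto3.client("sagemaker-runtime")
    texts = [f"sample sentence {i}" for i in range(1000)]
    for body in build_payloads(texts):
        resp = client.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            ContentType="application/json",
            Body=body,
        )
        # Drain the response; a dead worker surfaces here as an error.
        resp["Body"].read()


if __name__ == "__main__":
    main()
```

On the failing g4dn.xlarge setup, the expectation is that the loop above errors out after the first few invocations, while on g5 it runs to completion.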
Expected behavior
I would expect the processes not to die.