What is your request?
I tried deploying modular/max-openai-api to fly.io, but the first compilation of the model takes a long time. Is it possible to cache the compiled model on disk?
What is your motivation for this change?
Add a --model-compile-cache=/.root/model parameter so the compiled model can be reused across restarts.
Any other details?
Fly.io is a serverless GPU deployment platform where machines are stopped and started frequently. Right now, model compilation is too slow to make deployment viable on this kind of infrastructure.
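As a rough illustration of what such a flag could do, here is a minimal sketch of an on-disk compilation cache, assuming a cache directory and a compilation step. The compile_model function and the cache-key scheme are hypothetical placeholders, not part of the MAX API; the point is only that a compiled artifact keyed by the model's contents could be reused across restarts if the cache directory lives on persistent storage.

```python
import hashlib
import pickle
from pathlib import Path

# Hypothetical sketch: persist compiled model artifacts on disk so a
# restarted machine can skip recompilation. compile_model() is a
# placeholder stub, not the actual MAX compiler.


def compile_model(model_path: str) -> bytes:
    """Placeholder for the real (expensive) model compilation step."""
    return Path(model_path).read_bytes()


def cache_key(model_path: str) -> str:
    """Derive a cache key from the model file's contents."""
    return hashlib.sha256(Path(model_path).read_bytes()).hexdigest()[:16]


def load_or_compile(model_path: str, cache_dir: str):
    """Return a compiled model, reusing an on-disk artifact when present."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    artifact = cache / f"{cache_key(model_path)}.bin"

    if artifact.exists():
        # Cache hit: skip the expensive compilation step entirely.
        return pickle.loads(artifact.read_bytes())

    compiled = compile_model(model_path)
    artifact.write_bytes(pickle.dumps(compiled))
    return compiled
```

On Fly.io, pointing such a cache directory at a mounted volume would let the compiled artifact survive machine stops and starts, so only the very first boot pays the compilation cost.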