The following instructions will help you get started with MosaicML's inference service.
In this folder, we provide example model handler implementations for various models. They can be used out of the box with the provided YAMLs on the MosaicML inference service.
Before using the inference service, you must request access here.
Once you have access:
- Follow the instructions here to install `mcli`, our command-line interface that allows you to deploy models and run inference on them.
- Once you have `mcli` set up, the Inference Docs will give you a high-level overview of how you can deploy your model and interact with it (see the sketch after this list).
- Now you are ready to look at the README for each of the models in this repo to start running inference on them!
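As a quick illustration, here is a minimal sketch of interacting with a deployment through the `mcli` Python SDK. The payload is a placeholder; the exact input format depends on the model handler you deploy.

```python
# Illustrative only: the request payload depends on your model handler,
# and the deployment used below is whichever one is listed first.
from mcli import get_inference_deployments, predict

# List the inference deployments visible to your account.
deployments = get_inference_deployments()
for deployment in deployments:
    print(deployment.name)

# Send a request to the first deployment (placeholder payload).
response = predict(deployments[0], {"inputs": ["Hello, world!"]})
print(response)
```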
We have provided examples for three different model types in this repo. Each model directory contains model handlers and YAML files.
Model handlers define how your model is loaded and how it should be run when receiving a request. A model handler is expected to be a class that implements a `predict` function and, optionally, a `predict_stream` function if you'd like your deployment to support streaming outputs. For more details about the structure of model handlers, please refer to the mcli docs.
If a model that you'd like to deploy isn't supported by one of the existing model handlers, you have full flexibility to configure how your model behaves in the server by implementing your own model handler.
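As a rough illustration (not the exact interface, which is specified in the mcli docs), a custom handler might be shaped like this. `ExampleModelHandler` and its echo logic are placeholders, not a real model.

```python
# Placeholder handler: the class name, constructor arguments, and echo
# logic are illustrative. Check the mcli docs for the exact signatures
# the server expects.
from typing import Any, Dict, Iterator, List


class ExampleModelHandler:

    def __init__(self, model_name: str):
        # Load your model and tokenizer once, at server startup.
        self.model_name = model_name

    def predict(self, model_requests: List[Dict[str, Any]]) -> List[str]:
        # Run inference on a batch of requests and return one output
        # per request.
        return [f"{self.model_name} output for {r}" for r in model_requests]

    def predict_stream(self, model_requests: List[Dict[str, Any]]) -> Iterator[str]:
        # Optional: yield partial outputs so the deployment can stream
        # responses back to the client.
        for request in model_requests:
            yield f"{self.model_name} output for {request}"
```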
Deployment submissions to the MosaicML platform can be configured through a YAML file or through our Python API's `InferenceDeploymentConfig` class. The YAMLs provided in these examples contain information such as the deployment `name`, `image`, the download path for the model, and more. Please see this link to understand what these parameters mean.
Note: The `image` field in the YAML corresponds to the Docker image for the Docker container that is executing your deployment. It is set to our latest inference release.
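For example, a deployment could be submitted programmatically along these lines. This sketch assumes `InferenceDeploymentConfig` supports a `from_file` constructor analogous to the one on run configs, so treat it as an illustration rather than a verified recipe.

```python
# Sketch of a programmatic submission. `from_file` is assumed to mirror
# the run-config API; the YAML path is a placeholder for one of the
# YAMLs provided in this repo.
from mcli import InferenceDeploymentConfig, create_inference_deployment

config = InferenceDeploymentConfig.from_file("model_deployment.yaml")
deployment = create_inference_deployment(config)
print(f"Submitted deployment: {deployment.name}")
```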
- Check out our LLM Foundry, which contains code to train, finetune, evaluate, and deploy LLMs.
- Check out the Prompt Engineering Guide to better understand LLMs and how to use them.
- Check out the MosaicML Blog to learn more about large-scale AI.
- Follow us on Twitter and LinkedIn.
- Join our community on Slack.