Examples

Fast reference examples for training ML models with recipes, designed to be easily forked and modified.

ResNet-50 + ImageNet

Figure 1: Comparison of MosaicML recipes against other results, all measured on 8x A100s on MosaicML Cloud.

Train the MosaicML ResNet, the fastest ResNet-50 implementation, which yields a ✨ 7x ✨ faster time-to-train than a strong baseline. See our blog for more details and recipes. Our recipes were also demonstrated at MLPerf, a cross-industry ML benchmark.

🚀 Get started with the code here.

DeepLabV3 + ADE20k

Train the MosaicML DeepLabV3, which yields a ✨ 5x ✨ faster time-to-train than a strong baseline. See our blog for more details and recipes.

🚀 Get started with the code here.

Large Language Models (LLMs)

A simple yet feature-complete implementation of GPT that scales to 70B parameters while maintaining high performance on GPU clusters. The code is flexible and written in vanilla PyTorch, using PyTorch FSDP and some recent efficiency improvements.

🚀 Get started with the code here.
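
To give a feel for what a model at this scale looks like, here is a minimal sketch that estimates the parameter count of a decoder-only GPT from its shape. It uses the common approximation of 12 × d_model² weights per transformer block and ignores biases and LayerNorm. The function name `estimate_gpt_params` and the configuration values are hypothetical and chosen for illustration only; they are not taken from this repository's configs.

```python
def estimate_gpt_params(n_layers: int, d_model: int,
                        vocab_size: int, max_seq_len: int) -> int:
    """Approximate parameter count for a decoder-only GPT.

    Biases and LayerNorm weights are ignored; the MLP is assumed to use
    a 4x hidden expansion and position embeddings are assumed learned.
    """
    attn = 4 * d_model * d_model   # Q, K, V and output projections
    mlp = 8 * d_model * d_model    # two linear layers, 4x expansion
    blocks = n_layers * (attn + mlp)
    embeddings = (vocab_size + max_seq_len) * d_model  # token + position
    return blocks + embeddings


# Hypothetical configuration (an assumption, not a config from this repo).
config = dict(n_layers=80, d_model=8192, vocab_size=50257, max_seq_len=2048)
total = estimate_gpt_params(**config)
print(f"{total / 1e9:.1f}B parameters")
```

Under these assumptions, the transformer blocks dominate the count and the embeddings contribute well under 1% of the total, which is why depth and width are the main knobs when sizing a model of this kind.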
