Examples

Fast reference examples for training ML models with recipes. Designed to be easily forked and modified.

ResNet-50 + ImageNet

Figure 1: Comparison of MosaicML recipes against other results, all measured on 8x A100s on MosaicML Cloud.

Train the MosaicML ResNet, the fastest ResNet-50 implementation, which yields a ✨ 7x ✨ faster time-to-train than a strong baseline. See our blog for more details and recipes. Our recipes were also demonstrated at MLPerf, a cross-industry ML benchmark.

🚀 Get started with the code here.

DeepLabV3 + ADE20k

Train the MosaicML DeepLabV3, which yields a ✨ 5x ✨ faster time-to-train than a strong baseline. See our blog for more details and recipes.

🚀 Get started with the code here.

Large Language Models (LLMs)

Training curves for various LLM sizes.

A simple yet feature-complete implementation of GPT that scales to 70B parameters while maintaining high performance on GPU clusters. The code is flexible, written in vanilla PyTorch, and uses PyTorch FSDP along with some recent efficiency improvements.

🚀 Get started with the code here.
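
To give a rough sense of why sharding matters at the 70B scale mentioned above, here is a back-of-the-envelope sketch of per-GPU memory for model state when parameters, gradients, and optimizer state are fully sharded, which is the idea behind FSDP. It is an illustration, not code from this repository: the helper `sharded_state_gib` is hypothetical, and the 16 bytes per parameter figure assumes mixed-precision training with Adam (fp16 weights and gradients plus fp32 master weights, momentum, and variance), which this README does not specify. Activations and communication buffers are ignored.

```python
def sharded_state_gib(n_params: float, n_gpus: int, bytes_per_param: int = 16) -> float:
    """Approximate per-GPU memory (GiB) for model state under full sharding.

    bytes_per_param=16 is an assumption: 2 (fp16 weights) + 2 (fp16 grads)
    + 12 (fp32 master weights, Adam momentum, Adam variance).
    Activation memory is not included.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    total_bytes = n_params * bytes_per_param
    return total_bytes / n_gpus / 2**30


if __name__ == "__main__":
    # Hypothetical cluster sizes, chosen only to show the trend.
    for n_gpus in (1, 8, 64, 256):
        gib = sharded_state_gib(70e9, n_gpus)
        print(f"{n_gpus:>4} GPUs: ~{gib:,.1f} GiB of model state per GPU")
```

Under these assumptions, the model state of a 70B-parameter model is far larger than a single accelerator's memory, and the per-GPU share shrinks in proportion to the number of GPUs it is sharded across.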