This MLOps repository contains Python modules for image classification using PyTorch and Ray, a distributed computing framework. Train, tune, and serve image classifiers with ease.
The goal of this repository is to explore model training, tuning, and serving with the Ray framework. You might be wondering, why Ray? Over the past four years, my curiosity has been fueled by the desire to understand how massive models are trained. Think about it: training a vision transformer with millions of parameters on your laptop, or even running ablation studies, seems like an insurmountable task.
That's where Ray steps in. It provides distributed computing capabilities that make it practical to train very large models quickly, and it takes care of much of the infrastructure heavy lifting so you don't need to be an infrastructure expert. Moving from local development to a cloud environment is also straightforward with Ray, with no drastic code changes required. For an in-depth understanding of the framework, I urge you to refer to the Ray documentation.
This repository implements an end-to-end machine learning pipeline for an image classification task using the ResNet50 model with pretrained weights. Data preparation, model training, and tuning are done with Ray. In addition, MLflow is used for experiment tracking and Gradio for model serving.
Follow the steps outlined in the notebook to understand how to run the required modules. Examples of how to run the modules are given below.
Model Training: python train_engine.py
Model Tuning: python tune_engine.py
Model Evaluation: python evaluate_engine.py --experiment-name "tuning-resnet-1693749273"
Model Serving with Gradio: python serve_engine.py --experiment_name "tuning-resnet-1693749273"
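Note that the evaluation and serving commands above spell the flag differently (--experiment-name and --experiment_name), so check each script's argument parser before copying a command. If you maintain the scripts yourself, one possible way to accept both spellings is sketched below. This is only an illustration that assumes the scripts use argparse; the function name build_parser is hypothetical and is not taken from this repository.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser sketch; the repository's real scripts may differ."""
    parser = argparse.ArgumentParser(
        description="Evaluate or serve a previously tuned model."
    )
    # Register both spellings as aliases of a single option,
    # so either form of the command is accepted.
    parser.add_argument(
        "--experiment-name",
        "--experiment_name",
        dest="experiment_name",
        required=True,
        help="Name of the experiment to load.",
    )
    return parser


# Parse an explicit argument list here so the sketch runs without a real CLI call.
args = build_parser().parse_args(["--experiment_name", "tuning-resnet-1693749273"])
print(f"Using experiment: {args.experiment_name}")
```

With a parser like this, both the hyphenated and the underscored form resolve to the same args.experiment_name attribute.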