This project demonstrates wine quality prediction with machine learning and uses MLflow to manage experiment tracking and model deployment. The dataset contains features describing the chemical composition of wines, along with their corresponding quality ratings.
- Introduction
- Features
- Technology Used
- Installation
- MLflow Tracking
- AWS CI/CD Deployment with GitHub Actions
- Preview of Wine Quality Prediction in Action
- License
In this project, we utilize machine learning algorithms to predict the quality of wines based on their chemical attributes. MLflow is used to manage the end-to-end machine learning lifecycle, including experiment tracking, model development, and deployment.
The Wine Quality Prediction model utilizes a set of chemical attributes as features to predict the quality of wines. The features used in this project include:
- Fixed Acidity: The amount of fixed acids in the wine.
- Volatile Acidity: The amount of volatile acids in the wine, which contribute to its odor.
- Citric Acid: The amount of citric acid present in the wine.
- Residual Sugar: The amount of residual sugar left after fermentation.
- Chlorides: The level of salt present in the wine.
- Free Sulfur Dioxide: The amount of sulfur dioxide that is not bound to other molecules.
- Total Sulfur Dioxide: The total amount of sulfur dioxide in the wine.
- Density: The density of the wine.
- pH: The pH level, which indicates the acidity or alkalinity of the wine.
- Sulphates: The amount of sulphates, an additive that can contribute to sulfur dioxide levels in the wine.
- Alcohol: The alcohol content of the wine.
These features are used to train the machine learning model and make predictions about the quality of the wine. The dataset containing these features has been preprocessed and split into training and testing sets for model training and evaluation.
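As a rough illustration of how the eleven attributes listed above might be packaged before being passed to a model, the sketch below defines a hypothetical `WineFeatures` dataclass and converts it to an ordered feature vector. The class name, field names, and sample values are assumptions made for this example; the project's actual column names and preprocessing may differ.

```python
from dataclasses import astuple, dataclass, fields
from typing import List


@dataclass(frozen=True)
class WineFeatures:
    """Hypothetical container for the eleven chemical attributes."""

    fixed_acidity: float
    volatile_acidity: float
    citric_acid: float
    residual_sugar: float
    chlorides: float
    free_sulfur_dioxide: float
    total_sulfur_dioxide: float
    density: float
    ph: float
    sulphates: float
    alcohol: float

    def to_vector(self) -> List[float]:
        """Return the values in declaration order as plain floats."""
        return [float(value) for value in astuple(self)]


# Feature names in the same order as the vector.
FEATURE_NAMES = [field.name for field in fields(WineFeatures)]

# Sample values are illustrative only, not taken from the project.
sample = WineFeatures(7.4, 0.70, 0.0, 1.9, 0.076, 11.0, 34.0, 0.9978, 3.51, 0.56, 9.4)
vector = sample.to_vector()
```

Keeping the order in one place (the dataclass definition) helps avoid feeding columns to the model in a different order than it was trained on.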
This project leverages various technologies to achieve its goals:
- Programming Language: Python
- MLflow: Used for experiment tracking, model versioning, and deployment. It allows us to manage the end-to-end machine learning lifecycle.
- DagsHub: Used for experiment tracking and collaboration, providing an integrated platform to log experiments and share results.
- FastAPI: A modern, fast web framework for building APIs with Python.
- Docker: A platform to develop, ship, and run applications in containers.
- Amazon Web Services (AWS): A cloud computing platform that provides services and tools for deploying and managing applications.
  - AWS Identity and Access Management (IAM): Used to manage access to AWS resources securely.
  - Elastic Container Registry (ECR): A managed container registry to store, manage, and deploy Docker container images.
  - Amazon Elastic Compute Cloud (Amazon EC2): Provides scalable compute capacity in the cloud, often used to host applications and services.
These technologies collectively enable us to efficiently develop, evaluate, and deploy machine learning models for wine quality prediction.
- Clone the repository:
git clone https://github.com/SahilChowkekar/Wine-Quality-Prediction-with-MLflow.git
- Navigate to the project directory:
cd Wine-Quality-Prediction-with-MLflow
- Create a conda environment after opening the repository:
conda create -n mlops python=3.8 -y
- Activate the conda environment:
conda activate mlops
- Install the required dependencies:
pip install -r requirements.txt
- Finally, run the following command to start the application:
python app.py
- Open your preferred web browser and visit:
http://localhost:8080
To integrate MLflow tracking with DagsHub, follow these steps:
- Log in to DagsHub and connect your account to GitHub.
- Connect your repository and grant DagsHub access to it.
- Select the repository for this project and connect to it.
- Go to "Remote" and click on "Experiment".
- Copy the MLflow tracking commands provided:
MLFLOW_TRACKING_URI=https://dagshub.com/SahilChowkekar/End-to-end-Machine-Learning-Project-with-MLflow.mlflow \
MLFLOW_TRACKING_USERNAME=SahilChowkekar \
MLFLOW_TRACKING_PASSWORD=<your-dagshub-token> \
python script.py
Alternatively, export the values as environment variables, replacing them with your own information:
export MLFLOW_TRACKING_URI=https://dagshub.com/SahilChowkekar/End-to-end-Machine-Learning-Project-with-MLflow.mlflow
export MLFLOW_TRACKING_USERNAME=SahilChowkekar
export MLFLOW_TRACKING_PASSWORD=<your-dagshub-token>
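The same three variables can also be set from Python before any MLflow call is made, since MLflow reads them from the process environment. The following is a minimal sketch using only the standard library; the helper names `missing_tracking_vars` and `configure_tracking` are hypothetical and not part of this repository or of MLflow.

```python
import os
from typing import List, Mapping

# Variable names taken from the commands above.
TRACKING_VARS = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def missing_tracking_vars(env: Mapping[str, str]) -> List[str]:
    """Return the tracking variables that are unset or empty in `env`."""
    return [name for name in TRACKING_VARS if not env.get(name)]


def configure_tracking(uri: str, username: str, token: str) -> None:
    """Set the tracking variables for the current process and its children."""
    os.environ["MLFLOW_TRACKING_URI"] = uri
    os.environ["MLFLOW_TRACKING_USERNAME"] = username
    os.environ["MLFLOW_TRACKING_PASSWORD"] = token
```

A script could call `missing_tracking_vars(os.environ)` at startup and stop with a clear message if the list is non-empty. The token is best read from a prompt or a secret store rather than written into source code or committed files.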
This repository provides instructions and steps to deploy an application using AWS services and GitHub Actions for continuous integration and continuous deployment (CI/CD).
Log in to the AWS console using your credentials.
Create an IAM user with specific access for deployment purposes, including EC2 and ECR access. Attach the following policies:
- AmazonEC2ContainerRegistryFullAccess
- AmazonEC2FullAccess
Create an Elastic Container Registry (ECR) repository to store your Docker image. Save the repository URI, e.g., 062582090401.dkr.ecr.us-east-2.amazonaws.com/text-s.
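An ECR repository URI has the form `<account-id>.dkr.ecr.<region>.amazonaws.com/<repository>`. As a small sketch of how the saved URI could be validated and split into its parts, the hypothetical `parse_ecr_uri` helper below uses a regular expression; it is an illustration, not code from this repository.

```python
import re
from typing import NamedTuple


class EcrUri(NamedTuple):
    account_id: str
    region: str
    repository: str

    @property
    def registry(self) -> str:
        """Registry host, i.e. the URI without the repository name."""
        return f"{self.account_id}.dkr.ecr.{self.region}.amazonaws.com"


# 12-digit account ID, then the region, then the repository path.
_ECR_PATTERN = re.compile(
    r"^(?P<account_id>\d{12})\.dkr\.ecr\."
    r"(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>.+)$"
)


def parse_ecr_uri(uri: str) -> EcrUri:
    """Split an ECR repository URI into account ID, region, and repository."""
    match = _ECR_PATTERN.match(uri.strip())
    if match is None:
        raise ValueError(f"not an ECR repository URI: {uri!r}")
    return EcrUri(**match.groupdict())
```

For the example URI above, `parse_ecr_uri` yields the account ID `062582090401`, the region `us-east-2`, and the repository `text-s`.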
Create an EC2 instance with the Ubuntu operating system.
Connect to your EC2 instance and install Docker:
sudo apt-get update -y
sudo apt-get upgrade
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
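After running the commands above, it can be useful to confirm that Docker is actually available on the instance. The sketch below is one possible check using only the standard library; the function names are hypothetical, and the only Docker command it relies on is `docker --version`.

```python
import shutil
import subprocess
from typing import Optional


def docker_on_path(path: Optional[str] = None) -> bool:
    """Return True if an executable named `docker` is found on `path` (default: $PATH)."""
    return shutil.which("docker", path=path) is not None


def docker_version(path: Optional[str] = None) -> Optional[str]:
    """Return the output of `docker --version`, or None if Docker is missing or fails."""
    exe = shutil.which("docker", path=path)
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    return result.stdout.strip()
```

Calling `docker_version()` on the EC2 instance should return a version string once the installation has succeeded, and `None` otherwise.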
Go to your repository settings in GitHub, navigate to Actions, and set up a new self-hosted runner. Choose the appropriate OS and run the commands GitHub provides on your EC2 instance.
In your GitHub repository, go to Settings > Secrets and add the following secrets:
- AWS_ACCESS_KEY_ID: Your AWS access key ID.
- AWS_SECRET_ACCESS_KEY: Your AWS secret access key.
- AWS_REGION: The AWS region you are using (e.g., us-east-2).
- AWS_ECR_LOGIN_URI: The ECR login URI (e.g., 062582090401.dkr.ecr.us-east-2.amazonaws.com/text-s).
- ECR_REPOSITORY_NAME: The name of your ECR repository (e.g., textsummary-app).
Now, your GitHub Actions workflows can securely access the required AWS resources using these secrets.
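A deployment step can fail in confusing ways when one of these secrets is missing or empty. As a hedged sketch, the check below assumes the workflow exposes each secret as an environment variable of the same name (that mapping is an assumption, not something this repository guarantees); `missing_secrets` and `require_secrets` are hypothetical helper names.

```python
import os
from typing import List, Mapping

# Secret names taken from the list above.
REQUIRED_SECRETS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_ECR_LOGIN_URI",
    "ECR_REPOSITORY_NAME",
)


def missing_secrets(env: Mapping[str, str]) -> List[str]:
    """Return the required secret names that are unset or empty in `env`."""
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


def require_secrets(env: Mapping[str, str]) -> None:
    """Raise RuntimeError naming any missing secrets; values are never printed."""
    missing = missing_secrets(env)
    if missing:
        raise RuntimeError("missing secrets: " + ", ".join(missing))
```

Running `require_secrets(os.environ)` as the first step of a deployment script reports only the names of missing secrets, so nothing sensitive ends up in the workflow logs.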
Following these steps sets up a CI/CD pipeline that deploys the application with AWS services and GitHub Actions. Adjust the AWS resource names and regions shown here to match your own project.
Here's a preview of the Wine Quality Prediction in action:
This project is licensed under the MIT License. See the LICENSE file for more details.