Grocery Sales Prediction

https://grocergeniusaibasedsupermarketsalesprediction.streamlit.app/

Introduction

Welcome to the Grocery Sales Prediction project!

In today's competitive retail landscape, accurate sales forecasting is crucial for inventory management, resource allocation, and strategic planning. This project leverages machine learning to predict the sales of grocery items across various outlets, enabling businesses to make data-driven decisions and optimize their operations.

Why is this important?

Inventory Management: Prevent overstocking or stockouts.
Pricing Strategies: Adjust prices based on demand predictions.
Marketing Campaigns: Target promotions effectively.

Technology Stack

The project is built on the following tools:

Programming Language: Python 3.x
Data Manipulation: Pandas, NumPy
Data Visualization: Matplotlib, Seaborn
Machine Learning: Scikit-learn
Model Persistence: Joblib
Web Framework: Streamlit
Version Control: Git
Development Environment: Jupyter Notebooks

Project Workflow

The project moves through the following stages, from raw data to live predictions:

Data Collection 📊
- Gather raw sales data from multiple grocery outlets.
Data Preprocessing 🛠️
- Clean the data, handle missing values, and prepare it for modeling.
Exploratory Data Analysis (EDA) 🔍
- Visualize data patterns and uncover insights.
Feature Engineering 🧪
- Create new features and transform existing ones.
Model Training 🤖
- Train machine learning models and fine-tune hyperparameters.
Model Evaluation 🏆
- Assess model performance using appropriate metrics.
Model Deployment 🚀
- Deploy the model using Streamlit for user interaction.
Inferencing 🔮
- Generate predictions based on user inputs.

Data Preprocessing

"Data is the new oil." — Clive Humby

Before modeling, we preprocessed the raw data in the following steps:

Handling Missing Values:
- Item Weight: Imputed using median values grouped by Item Type.
- Outlet Size: Filled using mode values grouped by Outlet Type.
Outlier Detection and Treatment:
- Applied the Interquartile Range (IQR) method to cap outliers.
Data Standardization:
- Unified labels in Item Fat Content to ensure consistency.
Feature Creation:
- Item Visibility Bins: Categorized into 'Low', 'Medium', 'High'.
- Years Since Establishment: Number of years each outlet has been operating, derived from Outlet Establishment Year.
Encoding Categorical Variables:
- One-Hot Encoding: For nominal variables like Item Type.
- Ordinal Encoding: For variables with an inherent order.
- Mean Target Encoding: For Outlet Identifier based on mean sales.
Feature Transformation:
- Log transformation applied to Item Visibility to reduce skewness.
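The cleaning steps above (grouped imputation, IQR capping, label unification) can be sketched with pandas as follows. This is a minimal illustration on a toy frame, not the project's actual `utils.py`: the underscore column names, the raw fat-content variants, the choice of `Item_Visibility` as the capped column, and the 1.5 multiplier are all assumptions.

```python
import numpy as np
import pandas as pd

# Toy data; column names mirror the README's field names (assumed spelling).
df = pd.DataFrame({
    "Item_Type": ["Dairy", "Dairy", "Dairy", "Snacks", "Snacks"],
    "Item_Weight": [9.0, np.nan, 11.0, 5.0, np.nan],
    "Outlet_Type": ["Grocery", "Grocery", "Supermarket", "Supermarket", "Supermarket"],
    "Outlet_Size": ["Small", None, "Medium", "Medium", None],
    "Item_Fat_Content": ["LF", "low fat", "Regular", "reg", "Low Fat"],
    "Item_Visibility": [0.02, 0.03, 0.025, 0.035, 0.9],
})

# Item Weight: fill gaps with the median weight of the same Item Type.
df["Item_Weight"] = df["Item_Weight"].fillna(
    df.groupby("Item_Type")["Item_Weight"].transform("median")
)

# Outlet Size: fill gaps with the most frequent size of the same Outlet Type.
# Assumes every Outlet Type has at least one known size.
size_mode = df.groupby("Outlet_Type")["Outlet_Size"].agg(lambda s: s.mode().iloc[0])
df["Outlet_Size"] = df["Outlet_Size"].fillna(df["Outlet_Type"].map(size_mode))


def cap_iqr(series, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# Outlier capping on one numeric column (which columns are capped is assumed).
df["Item_Visibility"] = cap_iqr(df["Item_Visibility"])

# Unify Item Fat Content labels (the raw variants listed here are assumed).
fat_map = {"LF": "Low Fat", "low fat": "Low Fat", "reg": "Regular"}
df["Item_Fat_Content"] = df["Item_Fat_Content"].replace(fat_map)

print(df)
```

Capping keeps every row and only pulls extreme values back to the fence, which is why it is often preferred over dropping outliers on a small dataset.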

Visual Overview of Preprocessing Steps:

flowchart TD
    A[Raw Data] --> B[Handle Missing Values]
    B --> C[Outlier Treatment]
    C --> D[Data Standardization]
    D --> E[Feature Creation]
    E --> F[Encoding Categorical Variables]
    F --> G[Feature Transformation]
    G --> H[Preprocessed Data]
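The later stages of this flow (feature creation, encoding, transformation) might look like the sketch below. The bin edges, the reference year, the choice of Outlet Size as the ordinal variable and its category order, and the use of `log1p` (so that a visibility of zero stays defined) are assumptions made for illustration; the README does not specify them.

```python
import numpy as np
import pandas as pd

# Toy data; column names are assumed spellings of the README's field names.
df = pd.DataFrame({
    "Item_Visibility": [0.0, 0.02, 0.08, 0.20],
    "Item_Type": ["Dairy", "Snacks", "Dairy", "Meat"],
    "Outlet_Size": ["Small", "Medium", "High", "Medium"],
    "Outlet_Identifier": ["OUT01", "OUT02", "OUT01", "OUT02"],
    "Outlet_Establishment_Year": [1999, 2007, 1999, 2007],
    "Item_Outlet_Sales": [100.0, 300.0, 200.0, 500.0],
})

REFERENCE_YEAR = 2024  # hypothetical reference year

# Feature creation: visibility bins (edges are illustrative) and outlet age.
df["Item_Visibility_Bin"] = pd.cut(
    df["Item_Visibility"],
    bins=[-np.inf, 0.05, 0.15, np.inf],
    labels=["Low", "Medium", "High"],
)
df["Years_Since_Establishment"] = REFERENCE_YEAR - df["Outlet_Establishment_Year"]

# Ordinal encoding: map ordered categories to integers (order is assumed).
size_order = {"Small": 0, "Medium": 1, "High": 2}
df["Outlet_Size_Enc"] = df["Outlet_Size"].map(size_order)

# Mean target encoding: replace each outlet code with its mean sales.
# In practice, compute these means on the training split only to avoid leakage.
outlet_means = df.groupby("Outlet_Identifier")["Item_Outlet_Sales"].mean()
df["Outlet_Identifier_Enc"] = df["Outlet_Identifier"].map(outlet_means)

# Log transformation: log1p handles zero visibility without producing -inf.
df["Item_Visibility_Log"] = np.log1p(df["Item_Visibility"])

# One-hot encoding for a nominal variable.
encoded = pd.get_dummies(df, columns=["Item_Type"])

print(encoded.head())
```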

Modeling

The modeling stage covers algorithm selection, training, and evaluation:

Algorithm Selection: Random Forest Regressor
- Reasons:
  - Models both linear and non-linear relationships.
  - Reduces overfitting through ensemble learning.
  - Captures complex feature interactions.
Model Training:
- Data split into training and validation sets.
- Hyperparameters tuned using grid search.
Evaluation Metrics:
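- Mean Squared Error (MSE): Measures the average squared difference between predicted and actual sales; lower is better.
- R-squared (R²): Indicates the proportion of variance in sales explained by the model; closer to 1 is better.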
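The training and evaluation steps above can be sketched with scikit-learn as follows. The data here is synthetic and the parameter grid, split ratio, and number of folds are illustrative assumptions; the project's actual search space is not stated in this README.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the preprocessed feature matrix and sales target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 4))
y = 3 * X[:, 0] + np.sin(6 * X[:, 1]) + rng.normal(0, 0.1, size=200)

# Hold out a validation set (the 80/20 ratio is an assumption).
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Small illustrative grid; a real search would likely cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

# Evaluate the best model on data it has not seen during the search.
y_pred = search.best_estimator_.predict(X_val)
mse = mean_squared_error(y_val, y_pred)
r2 = r2_score(y_val, y_pred)

print("Best parameters:", search.best_params_)
print(f"Validation MSE: {mse:.4f}")
print(f"Validation R^2: {r2:.4f}")
```

Grid search scores each combination by cross-validation on the training split only, so the validation set remains an untouched check on the selected model.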

Feature Importance Plot:

An image showcasing the importance of each feature in the model can be placed here.
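One way to produce the numbers behind such a plot is to read `feature_importances_` from a fitted Random Forest. The sketch below uses synthetic data in which only one feature drives the target; the feature names are assumed spellings of fields mentioned in this README.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: only Item_MRP influences the target here, by construction.
rng = np.random.default_rng(0)
feature_names = ["Item_MRP", "Item_Visibility", "Item_Weight"]  # assumed names
X = pd.DataFrame(rng.uniform(0, 1, size=(300, 3)), columns=feature_names)
y = 5 * X["Item_MRP"] + rng.normal(0, 0.1, size=300)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort them for display.
importances = pd.Series(
    model.feature_importances_, index=feature_names
).sort_values(ascending=False)

print(importances)
```

With Matplotlib installed, `importances.plot.barh()` would turn this series into a horizontal bar chart.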

Inferencing

The deployed model is accessible through an interactive web application:

User Interface: Built with Streamlit for a seamless experience.
Real-Time Predictions: Users receive immediate feedback upon input.
Robust Error Handling: Ensures smooth user interaction and guides users in case of invalid inputs.
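As an illustration of the error-handling idea, the sketch below checks a few form values before they reach the model and returns readable messages. The function name, the input keys, and the weight bounds are hypothetical; only the two fat-content choices come from this README.

```python
# Hypothetical limits; the app's real ranges are not given in the README.
WEIGHT_RANGE = (0.0, 50.0)
ALLOWED_FAT_CONTENT = {"Low Fat", "Regular"}


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_inputs(inputs):
    """Return a list of error messages; an empty list means the inputs are valid."""
    errors = []
    low, high = WEIGHT_RANGE

    weight = inputs.get("item_weight")
    if not _is_number(weight):
        errors.append("Item Weight must be a number.")
    elif not (low < weight <= high):
        errors.append(f"Item Weight must be greater than {low} and at most {high}.")

    if inputs.get("item_fat_content") not in ALLOWED_FAT_CONTENT:
        errors.append("Item Fat Content must be 'Low Fat' or 'Regular'.")

    mrp = inputs.get("item_mrp")
    if not _is_number(mrp) or mrp <= 0:
        errors.append("Item MRP must be a positive number.")

    return errors


good = {"item_weight": 12.5, "item_fat_content": "Low Fat", "item_mrp": 99.0}
bad = {"item_weight": -1, "item_fat_content": "Skim", "item_mrp": "abc"}

print(validate_inputs(good))
for message in validate_inputs(bad):
    print(message)
```

In a Streamlit app, each returned message could be shown with `st.error` and the prediction skipped until the list is empty.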

Usage

Prerequisites

Ensure you have the following installed:

Python 3.x
Python Libraries:
- pandas
- numpy
- scikit-learn
- joblib
- streamlit

Setup Instructions

Follow these steps to get the project up and running:

Clone the Repository

git clone https://github.com/yourusername/grocery_sales_prediction.git
cd grocery_sales_prediction

Create a Virtual Environment

python3 -m venv env
source env/bin/activate  # For Windows: env\Scripts\activate

Install Dependencies
```
pip install -r requirements.txt
```

Directory Structure

Your project should have the following structure:

grocery_sales_prediction/
├── data_alchemy/
│   └── raw/
│       └── train.csv
├── model_factory/
│   ├── models/
│   ├── encoders/
│   └── features/
├── codebase/
│   ├── utils.py
│   ├── training_script.py
│   └── app.py
└── README.md

Place Your Data
- Copy your train.csv file into data_alchemy/raw/.
Train the Model
```
cd codebase
python training_script.py
```
- This script will preprocess the data and train the model.
Run the Streamlit App
```
streamlit run app.py
```
Access the Application
- Open your web browser and navigate to http://localhost:8501.
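The training script and the app need to share the fitted model, and the technology stack lists Joblib for that purpose. A minimal sketch of the save and load round trip is shown below. The file name `random_forest.joblib` is hypothetical, and a temporary directory stands in for the project root so the example leaves no files behind.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Tiny synthetic model standing in for the trained sales model.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(50, 3))
y = X.sum(axis=1)
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    # Mirrors the model_factory/models/ folder; the file name is hypothetical.
    model_path = Path(tmp) / "model_factory" / "models" / "random_forest.joblib"
    model_path.parent.mkdir(parents=True, exist_ok=True)

    # Training side: persist the fitted estimator.
    joblib.dump(model, model_path)

    # App side: load it back before serving predictions.
    restored = joblib.load(model_path)

# The restored model should reproduce the original predictions.
same = np.allclose(model.predict(X), restored.predict(X))
print("Predictions match after reload:", same)
```

Any fitted encoders would need to be saved and reloaded the same way so that the app applies the same transformations used during training.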

Using the Application

Step-by-Step Guide:

Navigate to the Application
- Open your browser and go to http://localhost:8501.
Input Features
- Item Identifier: Select from the dropdown.
- Item Weight: Enter a value within the specified range.
- Item Fat Content: Choose between 'Low Fat' and 'Regular'.
- Item Visibility: Adjust using the slider.
- Item Type: Select the appropriate category.
- Item MRP: Enter the maximum retail price.
- Outlet Identifier: Select the outlet code.
- Outlet Establishment Year: Choose the year from the dropdown.
- Outlet Size: Select the size category.
- Outlet Location Type: Choose the location type.
- Outlet Type: Select the type of outlet.
Predict Sales
- Click the "Predict" button to generate the sales prediction.
View Results
- The predicted sales figure will be displayed on the screen.
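Behind the form, the eleven inputs listed above have to be assembled into a single-row table before preprocessing and prediction. The sketch below shows one way to do that with pandas; the underscore column names and every sample value are hypothetical placeholders, not values taken from the project's data.

```python
import pandas as pd

# One entry per form field; names and values are illustrative assumptions.
form_values = {
    "Item_Identifier": "ITEM001",
    "Item_Weight": 12.5,
    "Item_Fat_Content": "Low Fat",
    "Item_Visibility": 0.05,
    "Item_Type": "Dairy",
    "Item_MRP": 150.0,
    "Outlet_Identifier": "OUT001",
    "Outlet_Establishment_Year": 1999,
    "Outlet_Size": "Medium",
    "Outlet_Location_Type": "Tier 1",
    "Outlet_Type": "Supermarket Type1",
}

# A list with one dict yields a one-row DataFrame with one column per field.
row = pd.DataFrame([form_values])

print(row.T)
```

This row would then pass through the same preprocessing as the training data before being handed to the model's `predict` method.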

Sample Screenshot of the Application:

Include a screenshot of the Streamlit app interface here.

Contributors

We extend our heartfelt gratitude to everyone who contributed to this project:

Mentor: Amal Salilan (amalsalilan)
Aman (theamansyed)
Vrushika K Panchal (vrushika-k-panchal)
Chetan (Chetanp717)
Rimi (rs2103)
Shilpa Manaji (Shilpa-Manaji)
Tharun (Kottetharun-09)
Sumithra (Sumithra-git)
Yanvi Arora (YanviAroraCS)
Sayantan (SayanRony)
Muskan Asthana (muskan42)
Purnima Pattnaik (Purnima07-sudo)
Rameswar Bisoyi (RB137)
Raunit (raunit45)
Hima Mankanta (manu-vasamsetti)
Nuka Abhinay (NUKA-ABHINAY)
Anjan Kumar (Anjankumarkamalapur)

Your contributions have been invaluable. Thank you for your dedication and hard work! 🙌

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Open-Source Community: For providing tools and resources that made this project possible.
You: For taking the time to explore our project.

Feel free to reach out for any queries or collaboration opportunities.

Contact: [email protected]

Made with ❤️ by the Grocery Sales Prediction Team.
