An end-to-end Machine Learning workflow focused on stock price forecasting. This project encompasses everything from the ML development process to the ML production lifecycle.
- Forecasting stock price using Machine Learning
- Real-time dashboard showing forecasted values
- Observability dashboard for ML models and their performance
- Open APIs for data retrieval
- BinanceAPI - for data retrieval from Binance
- MageAI - for data processing pipeline
- DVC - for data version control
- PyCaret - for Automated ML model training
- Weights & Biases - for model experiment tracking
- Aporia's MLNotify - for monitoring model training
- Docker - for system containerization and deployment
- BentoML - for model serving
- Yatai - for model serving at scale on Kubernetes
- Caddy - for reverse proxy of services
- Grafana and Prometheus - for system, model, and hardware observability
- Coming soon
System Architecture (coming soon)
Functional Architecture (coming soon)
# clone this repository
git clone https://github.com/thakorneyp11/stock-price-prediction.git
# change directory to project
cd stock-price-prediction
# create virtual environment (`pip3 install virtualenv` if not installed)
virtualenv env
# activate virtual environment
source env/bin/activate
# install dependencies
pip3 install -r requirements.txt
Data Sources:
- Binance Data Dumper: includes data from 2017 to the present
- Kaggle - prasoonkottarathil/btcinusd: only includes data from 2017 to 2021
- Binance API: only supports a few candles per request, so it is not suitable for historical data retrieval
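The per-request candle limit is what makes the Binance API awkward for backfilling history. As a rough sketch, the snippet below splits a date range into request windows. It assumes 15-minute candles and a cap of 1000 candles per request; that cap is an assumption, so check the Binance documentation for the limit on the endpoint you use. `plan_requests` and `to_ms` are hypothetical helper names and are not part of this repository.

```python
from datetime import datetime, timezone

# Assumption: per-request candle cap; verify against the Binance docs.
MAX_CANDLES_PER_REQUEST = 1000
# 15-minute candles, matching the interval in the dataset file name.
INTERVAL_MS = 15 * 60 * 1000


def to_ms(dt):
    """Convert a timezone-aware datetime to a Unix timestamp in milliseconds."""
    return int(dt.timestamp() * 1000)


def plan_requests(start_ms, end_ms, interval_ms=INTERVAL_MS,
                  limit=MAX_CANDLES_PER_REQUEST):
    """Split [start_ms, end_ms) into windows of at most `limit` candles each."""
    window_ms = interval_ms * limit
    windows = []
    cursor = start_ms
    while cursor < end_ms:
        window_end = min(cursor + window_ms, end_ms)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows


if __name__ == "__main__":
    start = to_ms(datetime(2017, 8, 1, tzinfo=timezone.utc))
    end = to_ms(datetime(2023, 10, 1, tzinfo=timezone.utc))
    windows = plan_requests(start, end)
    print(f"{len(windows)} requests needed for 15m candles")
```

Each window would map to one API call, so the length of the returned list is the number of round trips a full backfill would take.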
Download Historical Data:
- Download historical data from Binance Data Dumper:
python3 data_download.py
- Raw CSV dataset:
dataset/BTCUSDT_15m_Aug2017-Oct2023.csv
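To sanity-check the downloaded file before moving on, a small helper like the one below can report its columns and row count. This is a minimal sketch using only the standard library. `summarize_csv` is a hypothetical name, and no column layout is assumed because the header is read from the file itself.

```python
import csv


def summarize_csv(path):
    """Return the header row and the number of data rows in a CSV file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return header, row_count
```

Once the download step has finished, calling `summarize_csv("dataset/BTCUSDT_15m_Aug2017-Oct2023.csv")` should give the column names and the number of candles in the raw dataset.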
Retrieve Real-time Data:
- A sample script can be found in `data_retrieval.py` (it will later be scheduled and executed using MageAI).
- Note: update the `.env` file with your Binance API key and secret before running it.
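As an illustration of the `.env` step, the sketch below reads a key and secret from a `.env` file using only the standard library. The variable names `BINANCE_API_KEY` and `BINANCE_API_SECRET` and both helper functions are assumptions made for this example; use whatever names `data_retrieval.py` actually expects.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_binance_credentials(env_path=".env"):
    """Read the key and secret from .env, falling back to the process environment."""
    file_values = {}
    if os.path.exists(env_path):
        with open(env_path) as f:
            file_values = parse_env(f.read())
    # Hypothetical variable names, chosen for this sketch only.
    key = file_values.get("BINANCE_API_KEY") or os.environ.get("BINANCE_API_KEY")
    secret = file_values.get("BINANCE_API_SECRET") or os.environ.get("BINANCE_API_SECRET")
    if not key or not secret:
        raise RuntimeError("Binance API key and secret are missing from .env")
    return key, secret
```

Failing early with a clear error is a deliberate choice here: a scheduled job that starts without credentials is harder to debug than one that refuses to start.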
- Exploratory Data Analysis (EDA): `eda.ipynb`
- Feature Engineering: `feature_engineering.ipynb` (reference)
- Processed CSV datasets: 1) `dataset/feature_extracted_data.csv` and 2) `dataset/feature_selected_data.csv`
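For readers new to feature engineering on candle data, here is a minimal sketch of two common price-derived features: simple returns and a trailing moving average. These are illustrative assumptions, not necessarily the features computed in `feature_engineering.ipynb`, and the function names are hypothetical.

```python
def simple_returns(closes):
    """Fractional change between consecutive closes; the first entry is None."""
    if not closes:
        return []
    out = [None]
    for prev, curr in zip(closes, closes[1:]):
        out.append((curr - prev) / prev)
    return out


def moving_average(closes, window):
    """Trailing simple moving average; None until `window` values are available."""
    out = []
    running = 0.0
    for i, price in enumerate(closes):
        running += price
        if i >= window:
            # Drop the value that just left the window.
            running -= closes[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out
```

Both functions return a list the same length as the input, so their output can be attached to the candle rows as new columns.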
- Coming soon