This repo contains the code for the Handling Imbalanced Data series on our YouTube channel. Please check out the videos for detailed explanations: https://youtube.com/playlist?list=PL2L4c5jChmctqiXvOaJA91o0OJhYq1rR9
Welcome to our Handling Imbalanced Data in machine learning classification series. You'll work on a highly imbalanced example dataset in Python.
In this Part 1 video, we'll learn:
- what imbalanced data is
- which evaluation metrics are appropriate for it
- how to set up our example of a highly imbalanced dataset so it's ready for modeling
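To make the point about evaluation metrics concrete, here is a minimal sketch in plain Python. It is not the code from the videos: the labels and predictions are made up for illustration, and `confusion_counts` and `summarize` are hypothetical helper names. It compares a "model" that always predicts the majority class with one that actually finds some of the rare positives.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 990 negatives (0) followed by 10 positives (1).
y_true = [0] * 990 + [1] * 10

# A "model" that always predicts the majority class.
always_negative = [0] * 1000

# A model that raises 20 false alarms but catches 6 of the 10 positives.
some_skill = [1] * 20 + [0] * 970 + [1] * 6 + [0] * 4

baseline = summarize(y_true, always_negative)
better = summarize(y_true, some_skill)
print("always negative:", baseline)
print("some skill:     ", better)
```

On this toy data the always-negative model reaches 99% accuracy with zero recall, while the second model has lower accuracy (97.6%) but a recall of 0.6 and an F1 of about 0.33. That gap is why accuracy alone is a poor guide on highly imbalanced data.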
GitHub Repo with code: https://github.com/liannewriting/YouTube-videos-public/tree/main/imbalanced-data-machine-learning-abalone19
Source of the dataset: https://sci2s.ugr.es/keel/dataset.php?cod=115 (please download it from GitHub, since we've made minor changes to the original dataset)
Please check out the Part 2 video to learn 6 popular techniques for dealing with the imbalanced data problem in Python:
✔️Collecting a bigger sample
✔️Oversampling (e.g., random, SMOTE)
✔️Undersampling (e.g., random, K-Means, Tomek links)
✔️Combining over and undersampling
✔️Weighting classes differently
✔️Changing algorithms
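Two of the techniques above, class weighting and random oversampling, are simple enough to sketch in plain Python. This is a hand-rolled illustration on made-up data, not the code from the videos. The helper names `balanced_class_weights` and `random_oversample` are hypothetical. The weight formula is the same heuristic scikit-learn uses for `class_weight="balanced"`, and imbalanced-learn ships ready-made samplers such as `RandomOverSampler` and `SMOTE` that you would normally use instead.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate random rows of smaller classes until every class
    has as many rows as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    new_rows, new_labels = list(rows), list(labels)
    for cls, cnt in counts.items():
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        # Sample with replacement; k is 0 for the majority class.
        extra = rng.choices(pool, k=target - cnt)
        new_rows.extend(extra)
        new_labels.extend([cls] * len(extra))
    return new_rows, new_labels


# Made-up data: 95 majority rows (label 0) and 5 minority rows (label 1).
X = [[i] for i in range(100)]
y = [0] * 95 + [1] * 5

weights = balanced_class_weights(y)
X_res, y_res = random_oversample(X, y)

print("class weights:", weights)
print("before:", Counter(y), "after:", Counter(y_res))
```

Here the minority class gets a weight of 10.0 against roughly 0.53 for the majority class, and oversampling grows the data to 95 rows per class. In practice, resample only the training split so that duplicated rows never leak into the evaluation data.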
Technologies that will be used: ☑️ JupyterLab (Notebook) ☑️ pandas ☑️ sklearn ☑️ imbalanced-learn (imblearn)
Links mentioned in the video
►8 popular Evaluation Metrics for Machine Learning Models: https://www.justintodata.com/machine-learning-model-evaluation-metrics/
►FREE Python crash course - basics: https://www.justintodata.com/learn-python-free-online-course-data-science/
►Python for Data Analysis with projects: https://www.udemy.com/course/python-for-data-analysis-step-by-step/?referralCode=C8B8B507FB1197183455
►Logistic Regression for Machine Learning: complete Tutorial: https://www.justintodata.com/logistic-regression-for-machine-learning-tutorial/
►Logistic Regression Example in Python: Step-by-Step Guide: https://www.justintodata.com/logistic-regression-example-in-python/
There's also an article version of the same content. If you prefer reading, please check it out. How to handle Imbalanced Data in machine learning classification: https://www.justintodata.com/imbalanced-data-machine-learning-classification/
For more data science materials, check out our website, Just into Data: https://justintodata.com/