Machine Learning and Data Mining Projects (2022-2023).
This repo demonstrates multiple machine learning and data mining techniques, including:
- Data Handling and Data Manipulation
- Applying Linear Regression under multiple sets.
- Applying classification and feature engineering techniques to reach a target accuracy.
- Study of Regularization effects.
- Study of different quality metrics and the use of GridSearchCV to find the optimal value of k in KNeighborsClassifier.
- Applying decision trees and the ROC AUC score to the Breast Cancer Wisconsin (Diagnostic) dataset.
- Study of ensembles by comparing the mean squared error of a kNN regressor, a random forest regressor, and a stacking regressor on the California Housing dataset.
- Training a Transformer-based neural network for an image classification task.
- Dog breed identification using transfer learning and CNN autoencoders.
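
As an illustration of the hyperparameter search mentioned in the list above, here is a minimal sketch of tuning k for KNeighborsClassifier with GridSearchCV. It is not taken from the course notebooks. The dataset choice (scikit-learn's built-in breast cancer data), the scaling step, the range of k values, the split, and the ROC AUC scoring are all assumptions made for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the built-in breast cancer dataset stands in for the course data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# kNN is distance-based, so features are standardized inside the pipeline;
# this keeps the scaler from seeing validation folds during cross-validation.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Assumption: odd k values from 1 to 29 are an arbitrary illustrative grid.
param_grid = {"knn__n_neighbors": list(range(1, 31, 2))}

# 5-fold cross-validated search, scored by ROC AUC.
search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=5)
search.fit(X_train, y_train)

best_k = search.best_params_["knn__n_neighbors"]
# The search refits the best pipeline on the full training split by default,
# so it can be used directly to score the held-out test split.
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])

print(f"best k: {best_k}")
print(f"cross-validated ROC AUC: {search.best_score_:.3f}")
print(f"test ROC AUC: {test_auc:.3f}")
```

The selected k depends on the split and the grid, so the printed values are best read as an example of the workflow rather than as reference results.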
Skills developed: pandas | scikit-learn | matplotlib | NumPy | Regression | Classification | Data Processing | Feature engineering | Regularization | Quality metrics | Hyperparameter optimization | Decision Trees | Ensemble Learning | Neural Networks | PyTorch | Python.
This repo is part of the MLDM course, HSE, Moscow, Russia.