Emotion analysis and classification of short comments using machine learning techniques

😃 🙄 😔 😭

This repository contains codes that allow studying supervised machine learning models for short text comment classification tasks.

Python Environment

The project requires Python 3.8 or higher. You can check the Python version using the command in terminal:

python --version

If you don't have Python installed on your machine, see: https://www.python.org

The repository currently relies on the main.ipynb, oversampling_test.ipynb, emotion_analysis.py scripts:

main.ipynb: This has the main application of the project, it has tasks of pre-processing the texts and classification of them considering the Naive Bayes(NB), Support Vector Machine(SVM) and K-Nearest Neighbors (KNN) models.
oversampling_test.ipynb: This has an isolated case, the application of the main code, performing tests of the Oversampling function that seeks to balance the database, considering that the dataset.xlsx file has an amount of unbalanced emotions (classes).
emotion_analysis.py: This one has all the functions necessary for the operation of the other codes, storing functions of pre-processing of the data and also of plots necessary for the evaluation of each model.
dataset.xlsx: This has the data set collected by the authors of the present project, having 173 comments in Portuguese classified as Joy, Sadness and Surprise.

You can install the necessary packages to run the code using the command in the terminal:

pip install -r requirements.txt

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
__pycache__		__pycache__
README.md		README.md
dataset.xlsx		dataset.xlsx
emotion_analysis.py		emotion_analysis.py
ensemble_methods.ipynb		ensemble_methods.ipynb
main.ipynb		main.ipynb
oversampling_and_ensemble.ipynb		oversampling_and_ensemble.ipynb
oversampling_test.ipynb		oversampling_test.ipynb
requirements.txt		requirements.txt