#96 Added Fake news detection with TF-IDF and ML #186
Merged
⚡️ Fake News Detection
Description:
This pull request implements a basic framework for fake news detection using TF-IDF and a Passive Aggressive Classifier. It includes the following functionality:
1. Data Loading and Preprocessing:
Reads news articles and labels (fake/real) from a CSV file (news.csv).
2. NLP (TF-IDF):
Applies TF-IDF (Term Frequency-Inverse Document Frequency) to convert textual content into numerical features for machine learning.
3. Machine Learning Model:
Trains a Passive Aggressive Classifier to learn patterns that distinguish fake from real news based on TF-IDF features.
4. Prediction:
Provides a function fake_news_det(news) to predict labels (fake or real) for new, unseen news content.
5. Code Readability:
Enhances code readability by adding comments that explain the NLP and ML stages involved.
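For readers who want to see how the stages above fit together, here is a minimal sketch. It assumes scikit-learn is the library in use (the description does not name it), that the CSV has columns called text and label, and that labels are the strings FAKE and REAL; all of these are assumptions for illustration. A small in-memory CSV stands in for news.csv so the example runs on its own.

```python
import csv
import io

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier

# Stand-in for news.csv. The column names "text" and "label" and the
# label values "FAKE"/"REAL" are assumptions, not taken from the PR.
SAMPLE_CSV = """text,label
"Scientists publish peer-reviewed study on vaccine safety",REAL
"City council approves budget for new public library",REAL
"Central bank holds interest rates steady this quarter",REAL
"Miracle pill cures all diseases overnight, doctors stunned",FAKE
"Aliens secretly control the government, insider reveals",FAKE
"Shocking trick melts fat instantly, experts hate it",FAKE
"""


def load_news(handle):
    """Stage 1: read article texts and labels from an open CSV file."""
    rows = list(csv.DictReader(handle))
    return [row["text"] for row in rows], [row["label"] for row in rows]


# With a real file this would be: load_news(open("news.csv", newline=""))
texts, labels = load_news(io.StringIO(SAMPLE_CSV))

# Stage 2: turn raw text into TF-IDF feature vectors.
vectorizer = TfidfVectorizer(stop_words="english")
features = vectorizer.fit_transform(texts)

# Stage 3: train the classifier on the TF-IDF features.
model = PassiveAggressiveClassifier(max_iter=50, random_state=0)
model.fit(features, labels)


def fake_news_det(news):
    """Stage 4: predict a label for one unseen piece of news text."""
    vector = vectorizer.transform([news])  # reuse the fitted vocabulary
    return str(model.predict(vector)[0])


print(fake_news_det("Doctors stunned by miracle cure"))
```

The vectorizer is fitted once on the training texts and only transform is called at prediction time, so new articles are mapped into the same feature space the model was trained on. A real run would also hold out a test split to measure accuracy, which this sketch omits for brevity.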
Improvements and Benefits:
✨ This implementation offers a starting point for exploring fake news detection with machine learning.
✨ The inclusion of TF-IDF as a feature extraction technique allows the model to focus on words that are important for a particular news article but not overly common in the entire dataset.
✨ Clear comments improve understanding and maintainability of the code.
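To make the TF-IDF point concrete, the sketch below computes the textbook weight tf(t, d) * log(N / df(t)) by hand using only the standard library. The three example sentences and the helper name tf_idf are made up for illustration. Library implementations such as scikit-learn's TfidfVectorizer apply smoothing and normalization by default, so their numbers will differ, but the ranking idea is the same.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; "the" appears in every document.
docs = [
    "the senate passed the budget bill",
    "the team won the championship game",
    "the budget deficit grew last year",
]


def tf_idf(documents):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [doc.split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


weights = tf_idf(docs)
for term in ("the", "budget", "senate"):
    print(term, round(weights[0][term], 3))
```

In the first document, "the" gets a weight of zero because it occurs in every document, "budget" gets a small weight because it occurs in two of the three, and "senate" gets the largest weight because it occurs only there.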