UBC-MDS/DSCI524_Text_Analyzer_19

textanalyzer

TextAnalyzer includes powerful tools to perform natural language processing on English texts.

TextAnalyzer is a Python package designed for performing comprehensive Natural Language Processing (NLP) tasks on English texts. This package provides tools for sentiment analysis, keyword extraction, topic modeling, and the detection and visualization of language patterns, making it ideal for text mining and content analysis projects.

Installation

$ pip install textanalyzer

Usage

  • analyze_sentiment(message, model="default"): This function analyzes the sentiment of a given message and prints an alert message if the sentiment is highly negative.
  • topic_modeling(): This function extracts topics from a list of texts using non-negative matrix factorization (NMF) and returns the words that represent each extracted topic.
  • extract_keywords(messages, method="tfidf", num_keywords=5): This function extracts the top keywords from a list of messages using specified methods like TF-IDF or RAKE.
  • detect_language_patterns(messages, method="language", n=2, top_n=5): This function detects patterns in a list of messages, such as the languages used, common n-grams, or character usage patterns.
  • visualize_language_patterns(patterns, method="language"): This function visualizes the detected language patterns using bar charts for language frequency, n-grams, or character patterns.
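
To make two of the ideas above concrete, the following is a minimal standard-library sketch of n-gram counting and TF-IDF keyword ranking. It is not the package's implementation: the helper names `tokenize`, `top_ngrams`, and `tfidf_keywords` are hypothetical, and the tokenizer and smoothed IDF formula are assumptions chosen for illustration. Only the parameter names `n`, `top_n`, and `num_keywords` are borrowed from the signatures listed above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def top_ngrams(messages, n=2, top_n=5):
    """Hypothetical helper: count word n-grams across all messages."""
    counts = Counter()
    for message in messages:
        tokens = tokenize(message)
        # Zipping n shifted copies of the token list yields every n-gram.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(top_n)


def tfidf_keywords(messages, num_keywords=5):
    """Hypothetical helper: rank words by TF-IDF summed over all messages."""
    docs = [tokenize(m) for m in messages]
    n_docs = len(docs)
    # Document frequency: the number of messages in which each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    scores = Counter()
    for doc in docs:
        if not doc:
            continue  # skip messages with no tokens
        for word, count in Counter(doc).items():
            # Smoothed inverse document frequency (an assumed variant).
            idf = math.log((1 + n_docs) / (1 + df[word])) + 1
            scores[word] += (count / len(doc)) * idf
    return [word for word, _ in scores.most_common(num_keywords)]


if __name__ == "__main__":
    sample = ["the cat sat on the mat", "the cat ate the fish"]
    print(top_ngrams(sample, n=2, top_n=3))
    print(tfidf_keywords(sample, num_keywords=3))
```

The sketch trades accuracy for transparency: it has no stop-word filtering or stemming, so common words such as "the" can rank highly.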

Ecosystem Fit

TextAnalyzer integrates into the Python NLP ecosystem by offering a simple yet powerful toolkit for analyzing text data. While other Python libraries like NLTK and spaCy provide extensive NLP functionalities, TextAnalyzer focuses on making sentiment analysis, keyword extraction, and language pattern visualization more accessible and user-friendly.

For keyword extraction, packages like YAKE and RAKE-NLTK provide similar functionality. However, TextAnalyzer combines these tasks into a unified and streamlined workflow.

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms. (TODO: add link)

Dependencies

License

textanalyzer was created by Quanhua Huang, Adrian Leung, Anna Nandar, and Colombe Tolokin. It is licensed under the terms of the MIT license.

Credits

textanalyzer was created with cookiecutter and the py-pkgs-cookiecutter template.
