Remove stopwords from the generated word clouds #1655

Open
maxdavidson91 opened this issue Oct 10, 2024 · 2 comments · May be fixed by #1676
Labels
feature request 💬 Requests for new features

Comments


maxdavidson91 commented Oct 10, 2024

Missing functionality

Word clouds show the most common words, and for free-text fields these are often stopwords such as 'and', 'to', 'the', and 'from', which provide no meaningful insight into the data.

Proposed feature

Include an option to remove stopwords when generating word clouds, perhaps by using the nltk package to supply the list of stopwords.

Example:

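import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')  # one-time download of the stopword corpus
stop = stopwords.words('english')  # list of English stopwords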
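To illustrate where such an option could apply, here is a minimal sketch that filters stopwords at the word-counting step, so only the frequencies feeding the word cloud change. It is not ydata-profiling's actual implementation: `word_frequencies`, its `stopwords` parameter, and the tiny `STOPWORDS` set are hypothetical names used for illustration, and the hard-coded set stands in for a fuller list such as the one from nltk.

```python
import re
from collections import Counter

# Hypothetical stand-in for a real stopword list such as
# nltk.corpus.stopwords.words('english'); kept tiny to avoid dependencies.
STOPWORDS = frozenset({"and", "to", "the", "from", "a", "of", "in", "is"})


def word_frequencies(texts, stopwords=frozenset()):
    """Count words across `texts`, skipping any word found in `stopwords`."""
    counts = Counter()
    for text in texts:
        # Lowercase first so matching against the stopword list is case-insensitive.
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts


texts = [
    "The parcel was sent from the depot to the customer",
    "The customer returned the parcel to the depot",
]

raw = word_frequencies(texts)                  # current behaviour: no filtering
filtered = word_frequencies(texts, STOPWORDS)  # proposed behaviour: opt-in filtering

print(raw.most_common(1))  # [('the', 6)]
print("the" in filtered)   # False
```

Because the filter only touches the counts, the original `texts` stay unchanged, and the default empty `stopwords` argument keeps the existing behaviour unless the option is enabled.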

Alternatives considered

Removing stopwords from the pandas DataFrame before generating the report wouldn't suffice in this case, as it would also alter the 'samples' shown in the report.
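A short sketch of why that workaround falls short, using a plain list of strings as a stand-in for a DataFrame column. The `strip_stopwords` helper and the four-word `STOPWORDS` set are hypothetical and only serve the illustration.

```python
# Hypothetical tiny stopword list, for illustration only.
STOPWORDS = frozenset({"and", "to", "the", "from"})


def strip_stopwords(text, stopwords=STOPWORDS):
    """Return `text` with stopwords removed (the pre-cleaning workaround)."""
    return " ".join(w for w in text.split() if w.lower() not in stopwords)


# Stand-in for a free-text column in the source data.
column = ["Sent from the depot to the customer", "Returned to the depot"]

# Pre-cleaning rewrites every value, so any sample rows drawn from
# `cleaned` no longer match what is actually stored in the data.
cleaned = [strip_stopwords(value) for value in column]

print(cleaned[0])  # Sent depot customer
```

The word cloud built from `cleaned` would be free of stopwords, but so would every value the report displays as a sample, which is why the filtering needs to happen inside the word cloud generation instead.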

Additional context

No response

maxdavidson91 changed the title from "Feature Request" to "Remove stopwords from the generated word clouds" on Oct 10, 2024
fabclmnt added the "feature request 💬 Requests for new features" label and removed the "needs-triage" label on Oct 15, 2024
fabclmnt (Contributor) commented

Hi @maxdavidson91,

Thank you for the suggestion. Let me know if you would be interested in contributing to ydata-profiling by developing this feature!

maxdavidson91 (Author) commented

I'd be happy to contribute.

3 participants