Does it accept Arabic (or any non-ASCII) in general? #62

Open
ameer-kanaan opened this issue Nov 2, 2020 · 0 comments

ameer-kanaan commented Nov 2, 2020

I'm having difficulty vectorizing Arabic text; I can't seem to get anything useful out of it.

The word2vec function extracts only odd characters (emoji and the like) from a text file of about 200k Arabic words. It also seems to convert these characters to code point values.

I would like to get a normal-looking word2vec model for my Arabic text.

Any comments or workarounds?
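
A quick check that may help narrow this down: the symptoms described above could be consistent with the file being read under the wrong encoding, though that is only a guess. The sketch below does not use the word2vec package at all. It uses only the Python standard library to compare how many Arabic-script tokens survive when the same bytes are decoded as UTF-8 versus Latin-1. The helper name `arabic_tokens`, the sample sentence, and the choice of the U+0600 to U+06FF range are assumptions made for illustration.

```python
import re
from typing import List

# Runs of characters from the basic Arabic block (U+0600-U+06FF).
# This is a rough approximation: it ignores the supplementary Arabic blocks.
ARABIC_TOKEN = re.compile(r"[\u0600-\u06FF]+")


def arabic_tokens(raw: bytes, encoding: str = "utf-8") -> List[str]:
    """Decode raw bytes with the given encoding and return Arabic-script runs."""
    text = raw.decode(encoding, errors="replace")
    return ARABIC_TOKEN.findall(text)


# Hypothetical sample standing in for the contents of the real text file.
sample = "مرحبا بالعالم".encode("utf-8")

utf8_tokens = arabic_tokens(sample, "utf-8")      # decoded with the right encoding
latin1_tokens = arabic_tokens(sample, "latin-1")  # decoded with the wrong encoding

print(len(utf8_tokens), utf8_tokens)
print(len(latin1_tokens), latin1_tokens)
```

If the real file yields few or no Arabic tokens when decoded as UTF-8, it may have been saved in a different encoding; re-saving it as UTF-8 before training would be one thing to try.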
