Fatal Error on 1 MB file #42

Open
rychardguedes opened this issue Dec 5, 2017 · 2 comments

Comments

@rychardguedes

Congratulations, the package is great and thanks for developing it.

I've tested it with some standard datasets and it works great (including the 50 MB cookbooks corpus). However, when using it with a personal 1 MB dataset written in Brazilian Portuguese, R crashes every single time. I've already removed punctuation and excess whitespace, tried 1/2/4/8 threads and 100/200/500 vectors, and trained both with and without stopword removal, but got no better result. Do you have any idea what the reason for this crash might be?

Help!
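For reference, a minimal sketch of the sweep described above, assuming this issue concerns wordVectors::train_word2vec() (the function whose tutorial uses the 50 MB cookbooks corpus); `dataset_ptbr.txt` and the output names are placeholders:

```r
library(wordVectors)

# Sweep the same grid of settings described above. Note that a crash at
# the C level kills the whole R session, so in practice each combination
# may need to be run in a fresh session rather than in one loop.
for (threads in c(1, 2, 4, 8)) {
  for (vectors in c(100, 200, 500)) {
    out_file <- sprintf("vectors_t%d_v%d.bin", threads, vectors)
    message("training: threads=", threads, ", vectors=", vectors)
    model <- train_word2vec("dataset_ptbr.txt", out_file,
                            vectors = vectors, threads = threads)
  }
}
```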

@rychardguedes
Author

More information and tests:

I tried with accented characters, like "é ó ú â ã", and it worked fine for small files (5 KB). I also tried small pieces of my personal dataset, and it worked with samples from 1% to 6% of it. I also generated lorem ipsum files from 5 KB up to 700 KB, and those worked too. However, when I tried around 1 MB, it crashed again.
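Since accents alone work and small samples work, one way to narrow this down is an encoding check plus a prefix bisection. A rough sketch using base R only (the file name is a placeholder):

```r
lines <- readLines("dataset_ptbr.txt", encoding = "UTF-8")

# Invalid UTF-8 bytes are a common crash culprit in C extensions,
# and would fit the pattern of pure accented text working fine.
bad <- which(!validUTF8(lines))
print(bad)

# Write growing prefixes of the corpus; training on each in a fresh
# R session brackets the size (or line) where the crash starts.
for (frac in seq(0.1, 1, by = 0.1)) {
  n <- floor(length(lines) * frac)
  writeLines(lines[seq_len(n)],
             sprintf("chunk_%03d.txt", round(frac * 100)),
             useBytes = TRUE)
}
```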

@mdilmanian

I'm having the same issue: RStudio aborts even when training very reasonably sized files. I see the same problem with the rword2vec package as well. Path/file length is unlikely to be the cause (my path length is around 80-90 characters).

Appreciate any suggestions!
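One thing that sometimes helps with these aborts: run the same call from a plain terminal instead of RStudio, since the terminal usually prints the underlying crash message that RStudio's "R Session Aborted" dialog hides. A minimal sketch, again assuming wordVectors and placeholder file names (save as train.R and run with `Rscript train.R`):

```r
# train.R -- run via `Rscript train.R` from a terminal so any
# C-level crash message is printed rather than swallowed by RStudio.
library(wordVectors)

model <- train_word2vec("dataset.txt", "dataset_vectors.bin",
                        vectors = 100, threads = 1)
```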
