Currently, when using the AsyncWriter, it is possible to hit an OOM error because the internal queue can grow unbounded.
For instance, this snippet fills the queue faster than its contents can be sent over HTTPS to HDFS:
```python
import csv
import random
import string

import hdfs

client = hdfs.InsecureClient(<valid arguments>)
with client.write("filename", encoding="utf-8") as file_handle:
    writer = csv.writer(file_handle)
    # 25 blocks of 25 pseudo rows of CSV junk, 100 random letters per row
    for element in [
        ["".join(random.choice(string.ascii_letters) for _ in range(100)) for _ in range(25)]
        for _ in range(25)
    ]:
        writer.writerows(element)
```
This leads to unmanageably large memory usage.
Would it be possible to set a limit on the queue size when creating a file_handle? If you like, I'd be happy to open a PR with a possible solution.
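For illustration, a bounded queue would make the producing thread block whenever the consumer falls behind, which caps memory use. The sketch below uses Python's standard `queue.Queue(maxsize=...)` to show the mechanism only; the consumer loop and the `maxsize` value here are assumptions for the example, not the AsyncWriter's actual internals or API.

```python
import queue
import threading

# Hypothetical sketch of a bounded write queue (not the AsyncWriter's
# actual internals). With maxsize set, put() blocks once the queue is
# full, so the producer can never run ahead of a slow consumer by more
# than `maxsize` chunks.

def consume(q):
    # Simulates the slow HTTPS upload draining the queue.
    while True:
        chunk = q.get()
        if chunk is None:  # sentinel: no more data
            break
        # ... send chunk to HDFS over the wire ...
        q.task_done()

q = queue.Queue(maxsize=16)  # bounded: at most 16 chunks buffered
worker = threading.Thread(target=consume, args=(q,), daemon=True)
worker.start()

for i in range(1000):
    q.put(f"chunk-{i}".encode())  # blocks while the queue is full
q.put(None)  # signal the consumer to stop
worker.join()
```

Blocking the producer trades some write throughput for a hard memory bound, which seems preferable to an OOM.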
Hi @Kaldie. A PR for this would be welcome.
Created a PR; however, I can't seem to link it here 😢
Hi @mtth, could you have a look at the corresponding PR?