-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase grpc max message size #212
base: master
Are you sure you want to change the base?
Conversation
fhoering
commented
Apr 2, 2020
- Currently the limit in Python is 4 MB which fails for retrieving large application logs
- Currently the limit in Python is 4 MB which fails for retrieving large application logs
@jcrist
|
Apologies, things have been a bit chaotic. Thinking about this a bit more, I think the method returning logs should be a streaming output method, so the msg size matters less. We might stream one response per container, which would hopefully stay below this limit. Thoughts? |
Why not but I think one response by container could still easily go above 4 MB. It turns out even 50 MB seems not enough in some cases. We have jobs that produce logs for 24 hours and each spawned container produces the same amount of logs. In that case we would even need to stream the response for each container in chunks. But that also means we have to merge it back on client side to have a cleaner API right ? Not sure a list of strings would be a nice API to use. |
Hmmm, yeah. In that case we may want to stream in fixed bytesize chunks (line-by-line seems too much to me). To the end user I think this should keep the same python api, so we'd have to reassemble in the client before returning. I'm fine merging this as a stop-gap for now, but I think we should redesign the logs api to avoid this issue entirely. |
OK. I agree it would be nice not having to specify this setting. Having a quick look we could change the proto like this :
Then send n LogResponse messages and aggregate this back to the client. Seems easy on Python side to just iterate over the messages and merge them. The only complication is to propagate this all the way down to the yarn api on Java side. |
👍 That rpc design makes sense to me. |
OK. I also had a look at LogClient and it could also just produce a stream with Java8 stream API. I will work on it and propose something in another PR. In the meantime, if you have time, please have a look at: |
Hi folks, I've been hitting this issue consistently, with my yarn jobs generating around 20MBs of logs. It looks like there was agreement on merging the stop gap measure proposed in this PR. If that is still the case, can we please merge? |