Large windowSize can trip ES circuit breaker, fail slices, crash ES nodes. #948
Comments
I've discussed this with Kimbro and Brien today at some length. Our proposed solution would be, for workers, to set the URL parameter So for example, for a slice with a Does that seem reasonable @jsnoble ? |
The other thing we noticed is that the workers appear to be querying with |
Using the value from Setting We should test this to confirm of course. |
fetch now uses an expanded slice count as the query size refs: #948
@briend can you update this issue and let me know whether the changes in the Teraslice elasticsearch asset |
Yes, elasticsearch asset |
Thanks Brien. Unfortunately we have a problem in the internal Spaces QPL query use case. Given use of the fetcher in QPL it often expands the slice past the Our first thought on addressing this will be to make it configurable whether that error is a hard failure or not. That way, Teraslice can still fail in this scenario, but QPL can configure itself to not fail. Not that we're happy about the situation. |
So in talking through the situation with spaces queries, @kstaken asked: "Does fetch do the expansion on the first request?" The answer is "yes" and I think it's important that it do so, otherwise we increase the number of retried requests, which would be wasteful. We want the retry path to be a rare exception. That being said, I also think my choice of expansion coefficient, |
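To make the "expand on the first request, retry only rarely" behaviour concrete, here is a minimal sketch. It is not the asset's actual fetch code: `SearchFn`, `ExpansionOptions`, `fetchWithExpansion`, the retry condition (a full page may mean truncated results), and the `maxSize` cap are all assumptions made for illustration.

```typescript
// Hypothetical search callback: runs the slice query with the given `size`
// and resolves with the matching records. Not part of any real client API.
type SearchFn<T> = (size: number) => Promise<T[]>;

interface ExpansionOptions {
    coefficient: number; // assumed growth factor applied to the slice count
    maxSize: number;     // assumed hard cap, e.g. the index max_result_window
}

async function fetchWithExpansion<T>(
    search: SearchFn<T>,
    sliceCount: number,
    { coefficient, maxSize }: ExpansionOptions
): Promise<T[]> {
    if (coefficient <= 1) throw new RangeError('coefficient must be > 1');

    // Expand on the very first request so that a retry is the rare case.
    let size = Math.min(Math.max(1, Math.ceil(sliceCount * coefficient)), maxSize);

    for (;;) {
        const records = await search(size);

        // Fewer records than requested means nothing was cut off.
        if (records.length < size) return records;

        // A full page at the cap cannot be expanded any further.
        if (size >= maxSize) {
            throw new Error(`slice may exceed the maximum query size of ${maxSize}`);
        }

        // Otherwise grow the size and try again.
        size = Math.min(Math.ceil(size * coefficient), maxSize);
    }
}
```

Whether the final error should be a hard failure is the point debated in this thread; in this sketch a caller that wants softer behaviour would catch the error itself.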
Rather than changing the error handling as I tried here (which we can close without merging): I think it's just more straightforward if the Space Reader API just always uses the original fetch which sets the query |
This was resolved in #965 |
windowSize is defined here in the elasticsearch-reader-api, but it does not appear to be externally configurable via job parameters (adding windowSize to the job does not override the value):
elasticsearch-assets/packages/elasticsearch-asset-apis/src/elasticsearch-reader-api/ElasticsearchReaderAPI.ts
Lines 74 to 78 in 381fdab
windowSize seems to be automatically set to match the per-index max_result_window setting from Elasticsearch, and is ultimately used as the size parameter for the query:
elasticsearch-assets/packages/elasticsearch-asset-apis/src/elasticsearch-reader-api/ElasticsearchReaderAPI.ts
Lines 637 to 660 in 381fdab
Here we can see that varying the size (windowSize) dramatically affects performance, despite the record count remaining the same (fewer than 50 records returned for this query):
Presumably we could manually set max_result_window on the indices and the Teraslice job would no longer trip the circuit breaker or crash ES nodes, but ideally this would just be set on the job or more intelligently adapt to slice sizes somehow; if a slice only had 1000 records, perhaps a default windowSize of 1000*2 would be appropriate.
Maybe this issue was alluding to this problem: #13
(Thanks to @briend for this description)
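The suggestion above (a windowSize of roughly twice the slice's record count, rather than the full max_result_window) could be sketched as follows. This is only an illustration under stated assumptions: `chooseQuerySize` is a hypothetical helper, not a function in elasticsearch-assets, and the default factor of 2 is taken from the "1000*2" example in the description.

```typescript
// Hypothetical helper; the name, signature, and default factor are illustrative only.
function chooseQuerySize(
    sliceCount: number,      // number of records the slicer expects in this slice
    maxResultWindow: number, // per-index max_result_window reported by Elasticsearch
    factor = 2               // assumed headroom multiplier, from the "1000*2" example
): number {
    if (!Number.isInteger(sliceCount) || sliceCount < 0) {
        throw new RangeError('sliceCount must be a non-negative integer');
    }
    if (!Number.isInteger(maxResultWindow) || maxResultWindow < 1) {
        throw new RangeError('maxResultWindow must be a positive integer');
    }

    // Leave headroom above the expected count, but always request at least one record.
    const expanded = Math.max(1, Math.ceil(sliceCount * factor));

    // Never ask for more than the index allows.
    return Math.min(expanded, maxResultWindow);
}

// A 1000-record slice on an index with a max_result_window of 10000:
const size = chooseQuerySize(1000, 10000); // 2000
```

With this shape, a small slice gets a small size, and only slices near the index limit fall back to the full max_result_window.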