elasticsearch date reader slicer can't slice data with gaps in the time field #166

ciorg · 2019-10-17T23:37:24Z

This could be related to issue #11.

The problem seems to appear when the time field has a significant period with no data, followed by data again.

details:

job:
- assets: elasticsearch: 1.6.1
- 10 workers
- 1 slicer
- operations: elasticsearch_reader -> noop

When reading data from an index, 2 slices failed with the error `Elasticsearch Error: [query_phase_execution_exception] Result window is too large,`

The data itself has second resolution: the timestamps include milliseconds, but the milliseconds are always 000. The slicer runs at millisecond resolution.

2 slices failed during the initial job:

slice1: {"start":"2019-05-22T00:00:34+00:00","end":"2019-05-27T00:03:11+00:00","count":2293241}
slice2: {"start":"2019-05-14T00:02:50+00:00","end":"2019-05-17T00:05:39+00:00","count":3873168}
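For context, a "Result window is too large" error is what Elasticsearch returns when a search asks for more hits than `index.max_result_window` allows (10000 by default). The check below is only an illustrative sketch, assuming the index used that default; the helper name `oversized` is hypothetical and is not part of the reader.

```python
import json

# Elasticsearch default for index.max_result_window (assumed unchanged here).
DEFAULT_MAX_RESULT_WINDOW = 10_000

# The two failed slices exactly as reported above.
failed_slices = [
    '{"start":"2019-05-22T00:00:34+00:00","end":"2019-05-27T00:03:11+00:00","count":2293241}',
    '{"start":"2019-05-14T00:02:50+00:00","end":"2019-05-17T00:05:39+00:00","count":3873168}',
]


def oversized(slices, window=DEFAULT_MAX_RESULT_WINDOW):
    """Return (start, end, excess) for every slice whose count exceeds the window."""
    result = []
    for raw in slices:
        record = json.loads(raw)
        if record["count"] > window:
            result.append((record["start"], record["end"], record["count"] - window))
    return result


for start, end, excess in oversized(failed_slices):
    print(f"{start} -> {end}: {excess} docs over the window")
```

Both slices are far above the window, which suggests the slicer stopped subdividing these ranges instead of narrowing them down to a readable size.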

When I broke up the slices into smaller groups, the jobs ran without issues.

Starting from each failed slice's start, I moved the job's start date forward until the job no longer errored out.

Slice 1 succeeded with start: 2019-05-22T00:00:34, end: 2019-05-26T22:34:15, then start: 2019-05-26T22:34:15, end: 2019-05-27T00:03:11

Slice 2 succeeded with start: 2019-05-14T00:02:50, end: 2019-05-16T23:12:16, then start: 2019-05-16T23:12:16, end: 2019-05-17T00:05:39

For slice 1, an index search between 2019-05-22T00:00:34 and 2019-05-26T22:34:15 returns 0 results.
An index search between 2019-05-26T22:34:15 and 2019-05-27T00:03:11.000 returns 2293241 docs, the same record count as slice 1.

For slice 2, an index search with date_field:>2019-05-14T00:02:50.000+AND+date_field:<2019-05-16T23:12:16.000 returns a count of 0 docs.

Searching date_field:>2019-05-16T23:12:16+AND+date_field:<2019-05-17T00:05:39.000 returns 3873168 docs, the same record count as slice 2.
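To put the searches above in numbers, here is a small sketch that measures the empty and populated parts of each failed slice. The boundaries are the ones found by hand above; the `describe` helper and its field names are made up for illustration.

```python
from datetime import datetime


def parse(ts):
    """Parse the second-resolution timestamps used in this report."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S")


def describe(start, boundary, end, count):
    """Split a slice at the hand-found boundary between no data and data."""
    empty = parse(boundary) - parse(start)
    populated = parse(end) - parse(boundary)
    return {
        "empty": empty,
        "populated": populated,
        "docs_per_second": count / populated.total_seconds(),
    }


slice1 = describe("2019-05-22T00:00:34", "2019-05-26T22:34:15",
                  "2019-05-27T00:03:11", 2293241)
slice2 = describe("2019-05-14T00:02:50", "2019-05-16T23:12:16",
                  "2019-05-17T00:05:39", 3873168)

for name, info in (("slice1", slice1), ("slice2", slice2)):
    print(name, "empty:", info["empty"], "populated:", info["populated"],
          "docs/sec: %.0f" % info["docs_per_second"])
```

In both cases several days of empty range are followed by well under two hours that hold every document in the slice.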

Excluding the time periods with 0 results let the test jobs finish with no issues.
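As a reference for the behavior I expected, below is a rough sketch of a count-driven range split that drops empty sub-ranges and keeps halving populated ones. It is not the slicer's actual implementation: the in-memory `count_in_range` stands in for a count query against the index, and all names and the synthetic data are assumptions for illustration.

```python
from bisect import bisect_left


def count_in_range(timestamps, start, end):
    """Count sorted timestamps t with start <= t < end (stand-in for a count query)."""
    return bisect_left(timestamps, end) - bisect_left(timestamps, start)


def split_range(timestamps, start, end, max_size, min_step=1):
    """Recursively halve [start, end) until each slice holds at most max_size docs.

    Empty sub-ranges are dropped. A range that is already min_step wide is
    returned as-is even if it is still over max_size, since it cannot be
    divided further by time alone.
    """
    count = count_in_range(timestamps, start, end)
    if count == 0:
        return []
    if count <= max_size or end - start <= min_step:
        return [(start, end, count)]
    mid = start + (end - start) // 2
    return (split_range(timestamps, start, mid, max_size, min_step)
            + split_range(timestamps, mid, end, max_size, min_step))


# Synthetic data shaped like the failing slices: a long gap, then dense data
# at second resolution (5 docs per second for 100 seconds).
timestamps = [t for t in range(1000, 1100) for _ in range(5)]
slices = split_range(timestamps, 0, 1100, max_size=50)

print(len(slices), "slices, largest:", max(c for _, _, c in slices))
```

With this approach the leading gap costs only a few extra count calls, and every returned slice stays within the size limit.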


ciorg added the bug label Oct 17, 2019

ciorg changed the title ~~elasticsearch date reader slicer can't slice data~~ elasticsearch date reader slicer can't slice data with gaps in the time field Oct 17, 2019
