-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UnionIterator skips not readable subiterators #106
base: main
Are you sure you want to change the base?
UnionIterator skips not readable subiterators #106
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @maartyman. I know these are hard to find and solve.
Nonetheless I'll need to ask for a test that breaks without but works with.
That's to avoid accidental breakage in the future.
Code-wise, can we just make the read
thing a simple variable?
I'm sorry, in my hastiness I forgot the skip check in the _read() function. |
Also, why is does the UnionIterator extend the BufferedIterator? |
Short answer: so readers have to wait less (it fills up in the background). Longer answer: buffering should've been mix-in functionality rather than inherited functionality. |
Okay, I'm asking because I see a performance increase if I use a non buffered UnionIterator. But this is just in my tests with querying local stores with incremunica, when using online sources this might be different. |
We can give it a go if the performance effect extends to other cases. An easy fix could be to set the buffer size to zero. Perhaps there's already sufficient intermediary buffering happening in your case. |
There is a closed PR #81 which does it and has significant improvements - it was meant to be bundled into a bugger update that I never got around to. |
@@ -1817,8 +1821,11 @@ export class UnionIterator<T> extends BufferedIterator<T> { | |||
// Pick the next source | |||
this._currentSource = (this._currentSource + 1) % this._sources.length; | |||
const source = this._sources[this._currentSource]; | |||
if (!source.source.readable && !source.requiresRead) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason reason why only having !source.source.readable
is insufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's to jumpstart the sub-iterators, I added this because I had issues with it in the past. If I remove it, all test still pass, so I guess it's not necessary? I guess an iterator should be created with readable set to true...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess an iterator should be created with readable set to true...
Yes, for iterators without autoStart, readable should be true.
The UnionIterator is used in the bind join in comunica (and incremunica). During some performance tests, I saw that the UnionIterator called
read()
on all its sub-iterators, even if only one was readable. This PR fixes this by adding a check before callingread()
on the sub-iterators. The sub-iterators need to be read to start them, that is why I added theread
attribute to the _sources. To be fair, I think there is a better way to check if the sources have started and if not callread()
on them.