Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade asynciterator to v3.x #71

Merged
merged 1 commit into from
Sep 25, 2020

Conversation

RubenVerborgh
Copy link
Member

No description provided.

@RubenVerborgh RubenVerborgh added the enhancement New feature or request label Sep 9, 2020
@RubenVerborgh RubenVerborgh marked this pull request as draft September 9, 2020 15:30
@RubenVerborgh
Copy link
Member Author

@rubensworks The upgrade of our Comunica dependencies introduces unpredictable timeouts in test/system/profile-test.js. Could you have a look?

@rubensworks
Copy link
Contributor

This appears to be a very painfull one to debug...

Some findings in the meantime:

  • Clearing cache before each test solves the problem.
  • Can not reproduce when calling Comunica directly (with a regular on-data listener).
  • I suspect the problem is related to attaching/detaching data listeners as done here: https://github.com/LDflex/LDflex-Comunica/blob/master/src/ComunicaEngine.js#L83-L117
  • Since the problem only occurs when everything is cached, some iterator in the pipeline may have ended before listeners are attached. (should not happen since we use autoStart: false, but this may potentially not work well together with attaching/detaching data listeners).

@RubenVerborgh
Copy link
Member Author

Is there an alternative streamToAsyncIterable implementation available?

@rubensworks
Copy link
Contributor

Is there an alternative streamToAsyncIterable implementation available?

Buffering would be one obvious alternative (which will fix the problem). But I don't think we want that, as we would lose lazy streaming.

To clarify, I don't think streamToAsyncIterable is incorrect, but it only causes the problem.

This is confirmed by checking how many times the yieldValue callback is invoked for the "friend names" test. Sometimes it's 296 (all friends), but most of the time it's a random value between 0-296. This seems to indicate that the data event is being emitted without a listener being attached.

@RubenVerborgh
Copy link
Member Author

Does Comunica listen for listener unsubscribes and react immediately?

@rubensworks
Copy link
Contributor

Does Comunica listen for listener unsubscribes and react immediately?

No, it does not. But AFAICS comunica should not listen to unsubscribes, as asynciterator should handle those, right?

@RubenVerborgh
Copy link
Member Author

asynciterator should handle those

And it seems to be covered indeed:
https://github.com/RubenVerborgh/AsyncIterator/blob/v3.0.1/asynciterator.ts#L517-L537

@RubenVerborgh
Copy link
Member Author

RubenVerborgh commented Sep 14, 2020

There's actually a possible race condition:

  while (this.listenerCount('data') !== 0 && (item = this.read()) !== null)
    this.emit('data', item);

What is the listener is removed when read is called?
Not sure what to do with the read item then though…

@RubenVerborgh
Copy link
Member Author

FYI just tried changing it into

    // Reads the next item
    function readNext(resolve, reject) {
      if (pendingError)
        return reject(pendingError);
      if (readable.done)
        return resolve({ value: undefined, done: true });

      // Try to see if an item is already available
      const item = readable.read();
      if (item !== null)
        return resolve({ value: item, done: false });

      // Attach stream listeners
      readable.once('data', yieldValue);
      readable.once('end', finish);
      readable.once('error', finish);

but does not help.

@RubenVerborgh
Copy link
Member Author

…and a Heisenbug, too. Inserting logging makes it more likely for the error not to occur.

@rubensworks
Copy link
Contributor

I'll try to see if I can reproduce it without Comunica and LDflex.

@RubenVerborgh
Copy link
Member Author

…and a Heisenbug, too. Inserting logging makes it more likely for the error not to occur.

And even worse, when trying the scenario outside of Jest (to get rid of the whole ceremony Jest inserts around console.log), it consistently succeeds.

@rubensworks
Copy link
Contributor

And even worse, when trying the scenario outside of Jest (to get rid of the whole ceremony Jest inserts around console.log), it consistently succeeds.

That's not good :-(

At least we know how to reproduce it in the browser, so that's something: #45 (comment)

@RubenVerborgh
Copy link
Member Author

RubenVerborgh commented Sep 14, 2020 via email

@rubensworks
Copy link
Contributor

Sounds good.

One solution might be to create an async variant on .read(), but will always return the next data element (unless end is reached), as flow-mode isn't really needed in this case.

RubenVerborgh added a commit to LDflex/LDflex-Comunica that referenced this pull request Sep 14, 2020
@RubenVerborgh
Copy link
Member Author

RubenVerborgh commented Sep 14, 2020

Implemented streamToAsyncIterable without data listeners (LDflex/LDflex-Comunica@cb0e118), and the problem persists. This might be helpful for debugging.

So you can npm link @ldflex/comunica to the latest master if you want.

RubenVerborgh added a commit to LDflex/LDflex-Comunica that referenced this pull request Sep 14, 2020
@rubensworks
Copy link
Contributor

Interesting finding: with this change, the test "main profile fields" also starts failing sometimes on my end. This seems to indicate that the problem also occurs for queries with just a single triple pattern, while I used to think it occured only on queries with 2+ triple patterns.

@rubensworks
Copy link
Contributor

Another finding:
Contrary to what I assumed before, no intermediary results are lost.
Instead, at some point in the stream data and end events seem to halt.

For example, if we have a stream that should go from 1->10, we could have something like this: 1,2,3,4,5,6,7, FREEZE

@RubenVerborgh
Copy link
Member Author

Yeah, so it's definitely a race condition somewhere. Hence, we should also be careful with some of the assumptions above, in particular:

Can not reproduce when calling Comunica directly (with a regular on-data listener).

Is there any possibility of doing a git bisect?

@rubensworks
Copy link
Contributor

Is there any possibility of doing a git bisect?

Should be possible, but unlikely to be useful I think. I'm quite sure the problem is caused by the asynciterator update, which is part of a large refactoring commit.
Currently doing some fine-grained debugging to find out the exact location in the asynciterators where it fails.

@rubensworks
Copy link
Contributor

Ok, good news and bad news.

Good news: I've identified the problem: https://github.com/comunica/comunica/blob/master/packages/actor-query-operation-bgp-single/lib/ActorQueryOperationBgpSingle.ts#L33-L42
Removing that block makes the LDflex tests always pass again.

Bad news: Removing this block is (currently) not an option, since it breaks other tests.

The way forward:
This problem is related to these issues:

Comunica's current way of handling metadata internally causes these problems, and needs to be refactored.
Unfortunately this will require quite a bit of time, as I don't see a possible quick fix here.

@RubenVerborgh
Copy link
Member Author

Excellent job on finding the problem. That must've been hard!

In addition to identifying it, did we also learn something about how to spot such problems more easily in the future?

@rubensworks
Copy link
Contributor

Excellent job on finding the problem. That must've been hard!

Yes, this definitely goes in my top 10 of most frustrating bugs.

In addition to identifying it, did we also learn something about how to spot such problems more easily in the future?

Things would have been simplified a lot if we'd have the ability to print the state of our asynciterators.
This would allow us to identify where in the iterator pipeline things are blocked.

For example:

const it = new AsyncIterator(...);
it.debug = 'My-Iterator-1';
setTimeout(() => it.dumpState(), 1000);

Output:

My-Iterator-1
  STATE: READABLE
  BUFFER: [1, 2, 3]
  DESTINATION: ...

Maybe some kind of global flag to enable this on all iterators would also be useful.

@RubenVerborgh
Copy link
Member Author

Yes, I do something similar when debugging them myself. I think it would be good to ship a debug build of asynciterator with such logging functionality built in.

rubensworks added a commit to comunica/comunica that referenced this pull request Sep 21, 2020
This is done by removing asyncreiterable sources,
and using property-based metadata instead of event-based metadata.

This allows us to undo all hacks to force metadata loading.

Related to #724
Related to LDflex/Query-Solid#71
Closes #674
Related to #553
Closes #566
Closes #555
rubensworks added a commit to comunica/comunica that referenced this pull request Sep 25, 2020
This is done by removing asyncreiterable sources,
and using property-based metadata instead of event-based metadata.

This allows us to undo all hacks to force metadata loading.

Related to #724
Related to LDflex/Query-Solid#71
Closes #674
Related to #553
Closes #566
Closes #555
@rubensworks
Copy link
Contributor

All of the tests seem to pass on my end in the newly released Comunica version (1.17.0).

@RubenVerborgh RubenVerborgh marked this pull request as ready for review September 25, 2020 21:07
@RubenVerborgh RubenVerborgh force-pushed the feature/upgrade-asynciterator-3.x branch from 5d132dd to 3c65a34 Compare September 25, 2020 21:09
@RubenVerborgh RubenVerborgh merged commit 3c65a34 into master Sep 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants