Performance: Improved performance for async iterables #42618

Closed · jeswr opened this issue Apr 6, 2022 · 23 comments

Labels: feature request (Issues that request new features to be added to Node.js.), stale

Comments

@jeswr commented Apr 6, 2022

What is the problem this feature will solve?

I have recently been working on optimizations for this implementation of asynchronous iterators. I ran a quick benchmark comparing that implementation against the Readable API in Node v17.4.0 and found that some of our latest optimizations (yet to be released) are orders of magnitude faster than the Readable API. As an example, we can apply 10 maps to 200,000 elements in ~20 ms, whilst the Readable API takes ~2 s (tested on a Dell XPS 15, 32 GB RAM).

import { ArrayIterator } from './dist/asynciterator.js'
import { Readable } from 'stream';

async function test(fn) {
  let j = 0;
  // Build the array [0, 1, ..., 199999].
  const arr = new Array(200_000).fill(true).map(() => j++);
  let iterator = fn(arr);

  // Chain 10 synchronous map transforms.
  for (let i = 0; i < 10; i++) {
    iterator = iterator.map(x => x + 1);
  }

  // Measure how long it takes to fully drain the iterator.
  const now = performance.now();
  return await new Promise((res) => iterator.on('data', () => {}).on('end', () => {
    res(performance.now() - now);
  }));
}

console.log('ArrayIterator', await test(arr => new ArrayIterator(arr)));
console.log('Readable.from', await test(arr => Readable.from(arr)));

Implementation | Time (ms)
-------------- | ------------------
ArrayIterator  | 20.243778005242348
Readable.from  | 2403.212348997593

Most of the discussion around the recent optimizations we have made can be found in this (somewhat long) issue.

What is the feature you are proposing to solve the problem?

Use some of the optimizations already implemented in https://github.com/RubenVerborgh/AsyncIterator, RubenVerborgh/AsyncIterator#59

What alternatives have you considered?

No response

jeswr added the feature request label Apr 6, 2022
@mscdex (Contributor) commented Apr 6, 2022

Are the suggested changes passing all of node's tests?

@jeswr (Author) commented Apr 6, 2022

It wouldn't, because this library is missing a few elements that the Readable API has (for instance, it doesn't expose some read-only properties such as readableFlowing), though there is a PR to add more of those properties (RubenVerborgh/AsyncIterator#45), which also improved the performance of the AsyncIterator library by ~5-10% in cases like the one given above.

@mscdex (Contributor) commented Apr 6, 2022

If it passes all tests, then feel free to submit a PR.

@benjamingr (Member)

@nodejs/streams

@benjamingr (Member)

Hey, thanks for asking. I don't think we've spent a lot of time optimizing this area in the past; we were mostly waiting for people to ask.

Is the ask specifically to make async iterators for Readable.from(largeArray) fast?

@ronag (Member) commented Apr 6, 2022

I don't think we can just replace the current implementation. Would break too much stuff.

@benjamingr (Member)

> I don't think we can just replace the current implementation. Would break too much stuff.

The current implementation doesn't refer to the whole of Readable (the AsyncIterator package doesn't contain even 10% of the functionality of streams, or things like watermarks).

I think the ask is to improve our performance for the async iterator API for Readable?

@ronag (Member) commented Apr 6, 2022

> I think the ask is to improve our performance for the async iterator API for Readable?

I'm not sure what the ask here is. 😕

@ronag (Member) commented Apr 6, 2022

> Use some of the optimizations already implemented in https://github.com/RubenVerborgh/AsyncIterator, RubenVerborgh/AsyncIterator#59

I think this needs to be a bit more specific... I'm not sure what optimisations we should apply.

@benjamingr (Member)

For from, probably just adding a special case for arrays to avoid all the iteration objects should be fine?

diff --git a/lib/internal/streams/from.js b/lib/internal/streams/from.js
index d3d43f7dfb..ad2afa578c 100644
--- a/lib/internal/streams/from.js
+++ b/lib/internal/streams/from.js
@@ -25,10 +25,28 @@ function from(Readable, iterable, opts) {
     });
   }
 
+  if (Array.isArray(iterable)) {
+    let i = 0;
+    return new Readable({
+      objectMode: true,
+      ...opts,
+      read() {
+        if (i < iterable.length) {
+          this.push(iterable[i++]);
+        } else {
+          this.push(null);
+        }
+      }
+    });
+  }
+
   let isAsync;
   if (iterable && iterable[SymbolAsyncIterator]) {
     isAsync = true;
     iterator = iterable[SymbolAsyncIterator]();
   } else if (iterable && iterable[SymbolIterator]) {
     isAsync = false;
     iterator = iterable[SymbolIterator]();

I'm not sure it's worth it though? Statically creating readables with .from is probably not a hot path? (is it for your use case @jeswr ?)

@ronag (Member) commented Apr 6, 2022

We could probably make that even faster by just injecting the array as the readable's buffer. But again this seems like a quite unusual use case.
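
For illustration, here is a minimal sketch of a batched variant that drains the array inside read() and relies on push()'s return value for backpressure; this approximates, but is not, the buffer injection described above (fromArray is a hypothetical helper):

import { Readable } from 'stream';

// Hypothetical sketch: push array elements in batches rather than one per
// read() call. push() returns false once the internal buffer is full, so
// we stop and resume on the next read().
function fromArray(arr, opts) {
  let i = 0;
  return new Readable({
    objectMode: true,
    ...opts,
    read() {
      while (i < arr.length) {
        if (!this.push(arr[i++])) return; // backpressure: wait for next read()
      }
      this.push(null); // end-of-stream
    }
  });
}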

@mcollina (Member) commented Apr 6, 2022

I don't really understand what the use case for this is. I'm totally OK with having the perf improvement in, but what are we optimizing for in this case? I would not recommend anyone process an Array using an AsyncIterator; it'll definitely be slower than just processing the Array.

@jeswr (Author) commented Apr 6, 2022

Apologies for the lack of clarity in my original issue. The main point was to highlight that there is room to make pull-based/asynchronous streams a lot faster, which has the potential to speed up downstream applications; I see how the choice to benchmark using Readable.from may have caused confusion. @jacoscaz reported to me that he has applications (not using Readable.from or its equivalent) that are now 600-900x faster as a result of porting from Readable to the current release of the asynciterator package.

I also note that I'm not intimately familiar with the Node.js codebase; all comments are based on the observed performance comparison between Readable and the asynciterator package.

The ask is more specifically to optimize (in order of importance) the speed of:
(1) internal buffers, by using linked lists (see the sketch after this list);
(2) transformations of pull-based/asynchronous iterators. Specifically, the case of applying several synchronous transformations such as synchronous map/filter/skip can be made a lot faster; we have done this optimization in RubenVerborgh/AsyncIterator#59, with more discussion around its performance gains in RubenVerborgh/AsyncIterator#58. The overarching idea is to apply all the transforms in a single read and skip any intermediary buffers;
(3) the speed of Readable.from, by truncating the array on load.

Once/if these have been applied, there are potentially some further optimisations that can be looked into, based on the thread I linked to in my original message.
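
Regarding (1), a minimal sketch of a linked-list FIFO buffer, assuming the goal is O(1) enqueue/dequeue without the reindexing cost of Array#shift on large buffers (the class and its names are illustrative, taken from neither codebase):

// Minimal singly linked-list FIFO: O(1) push and shift, no reindexing.
class LinkedQueue {
  constructor() {
    this.head = null; // oldest node
    this.tail = null; // newest node
    this.length = 0;
  }
  push(value) {
    const node = { value, next: null };
    if (this.tail) this.tail.next = node;
    else this.head = node;
    this.tail = node;
    this.length++;
  }
  shift() {
    const node = this.head;
    if (node === null) return undefined;
    this.head = node.next;
    if (this.head === null) this.tail = null;
    this.length--;
    return node.value;
  }
}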

@mcollina (Member) commented Apr 6, 2022

@jeswr I'm not sure I follow. Readable.from() is primarily a utility to convert something into a Node stream.Readable, which allows plenty of things, including piping. There are plenty of ways to make it faster; however, doing so would deeply break backward compatibility for all the things that depend on Stream.

@jeswr (Author) commented Apr 7, 2022

Note: I'll try to revisit this once I've had a chance to dig into this codebase further; I don't have as much time as I would like to dig into it right now.

Off the top of my head, there is a very naive way to get improved performance without breaking backwards compatibility: a constructor parameter pullOnly (defaulting to false) which, when enabled, allows optimizations such as this one to be applied. It won't break any old code since it is opt-in, and it will allow a lot of packages to get a performance improvement by simply enabling this parameter.

However, I think there should also be ways of achieving this without downstream libs having to add this extra parameter. The way this kind of thing is achieved in the AsyncIterator lib is by using different classes with different internals to handle different types of operations, i.e. all of these fast 'synchronous' transformations are handled by a MappingIterator, whilst other transforms are handled by a TransformIterator. I'd imagine this could be extended to all of the other kinds of operations that are made available here. (A sketch of the idea follows.)
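
A minimal sketch of that dispatch idea: map() composes the synchronous transforms into a single function, so a chain of ten maps does one fused call per element with no intermediary buffers. All names here are illustrative, not the actual MappingIterator implementation:

// Illustrative fusion of synchronous maps: chaining map() composes the
// callback functions instead of stacking buffered stages.
class SyncMapped {
  constructor(source, fn) {
    this.source = source; // anything with a pull-style read() returning a value or null
    this.fn = fn;
  }
  read() {
    const item = this.source.read();
    return item === null ? null : this.fn(item);
  }
  map(fn) {
    // Fuse: reuse the same source with a composed function.
    const prev = this.fn;
    return new SyncMapped(this.source, x => fn(prev(x)));
  }
}

// A trivial pull source yielding 0, 1, 2, then null.
let n = 0;
const source = { read: () => (n < 3 ? n++ : null) };

let it = new SyncMapped(source, x => x + 1);
for (let k = 0; k < 9; k++) it = it.map(x => x + 1); // ten maps in total
console.log(it.read(), it.read(), it.read(), it.read()); // 10 11 12 null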

In addition, let me allay some confusion by giving an example without Readable.from(). Here we create a Readable with a read method in object mode. In this example, Readable takes 1170 ms (benchmarked on Node v17.4.0), whilst the BufferedIterator from the AsyncIterator lib takes 50 ms (benchmarked on RubenVerborgh/AsyncIterator#59):

import { Readable } from 'stream';

let i = 0;
let SIZE = 100_000;

// A push-based Readable source producing three values per read() call.
let iterator = new Readable({
  read() {
    if (i < SIZE) {
      this.push(i++);
      this.push(i++);
      this.push(i++);
    } else {
      this.push(null);
    }
  },
  objectMode: true
});

for (let i = 0; i < 10; i++) {
  iterator = iterator.map(x => x + 1);
}

const now = performance.now();
iterator.on('data', () => {}).on('end', () => {
  console.log(performance.now() - now);
});
// The equivalent using the asynciterator package (separate script):
import { BufferedIterator } from './dist/asynciterator.js';

let i = 0;
let SIZE = 100_000;

class MyIterator extends BufferedIterator {
  _read(count, done) {
    if (i < SIZE) {
      this._push(i++);
      this._push(i++);
      this._push(i++);
      done();
    } else {
      this.close();
      done();
    }
  }
}


let iterator = new MyIterator();

for (let i = 0; i < 10; i++) {
  iterator = iterator.map(x => x + 1);
}

const now = performance.now();
iterator.on('data', () => {}).on('end', () => {
  console.log(performance.now() - now);
});

@ronag (Member) commented Apr 7, 2022

I think the problem here is that Readable.map is always asynchronous. We haven't optimized for synchronous mappers.

If you remove the map part of the above example then I think the performance difference will be significantly smaller.
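
As a sketch of the general technique (not Node's actual implementation), a mapper could detect thenables at runtime and only await when the callback is actually asynchronous, sparing synchronous mappers an extra await per element:

// Sketch: await only when fn returns a thenable, so synchronous
// mappers skip the extra per-element await and its microtask.
async function* mapped(iterable, fn) {
  for await (const chunk of iterable) {
    const result = fn(chunk);
    yield typeof result?.then === 'function' ? await result : result;
  }
}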

@benjamingr (Member)

> I think the problem here is that Readable.map is always asynchronous. We haven't optimized for synchronous mappers.

I'm not sure why I'd use an async iterator with a synchronous map, but we can definitely make that much, much faster and (for example) fall back on a regular for loop for a synchronous map on the stream's buffer, and even do some of the tricks @jeswr is trying to apply (transducing several .map calls into one), though I am really not sure that's something people should use streams for in the first place?

@jeswr (Author) commented Apr 7, 2022

> I'm not sure why I'd use an async iterator with a synchronous map

One use case is this configurable query engine, which takes data from pull-based sources and then applies various transforms to the data during query evaluation. Here is a link to a subset of the places where it applies such transforms: RubenVerborgh/AsyncIterator#44 (comment)

@jacoscaz commented Jun 27, 2022

Elaborating on @jeswr's comments with my own thoughts, I think the underlying issue here is that developers are often tempted to use streams (as in the API implemented by the stream module and the readable-stream package) for both I/O and for CPU-bound internal processing. A simple example of this would be a streaming processor with multiple internal processing stages that consumes process.stdin and pipes to process.stdout.

In one use case, the code of which I am unfortunately unable to share, we achieved a 150x perf. increase simply by moving from a pre-existing stream.Readable -> stream.Transform -> stream.Transform -> stream.Transform -> stream.Writable chain to stream.Readable -> AsyncIterator -> AsyncIterator -> AsyncIterator -> stream.Writable, with virtually no changes to each transform step while maintaining full support for backpressure management.
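
For the general shape of such a hybrid chain, here is a hedged sketch assuming the asynciterator package's wrap() helper accepts a Node readable (as its documentation describes); the pipeline below is illustrative, not the code from that use case:

import { Readable } from 'stream';
import { wrap } from 'asynciterator';

// Wrap a Node readable, chain synchronous AsyncIterator transforms,
// then consume the results via 'data'/'end' events.
const source = Readable.from([1, 2, 3, 4], { objectMode: true });

const transformed = wrap(source)
  .map(x => x + 1)          // 2, 3, 4, 5
  .filter(x => x % 2 === 0) // 2, 4
  .map(x => x * 10);        // 20, 40

transformed.on('data', x => console.log(x));
transformed.on('end', () => console.log('done'));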

> We haven't optimized for synchronous mappers.

And I am unsure as to whether the stream API should do so, either; at least not as a primary goal. After all, it was born to deal with a different problem. However, especially for newcomers to Node.js and/or stream-based processing, the convenience of simplified construction, of objectMode, and of methods like .map() and .filter() (which also happen to use .shift()) can be misleading as to the performance one can expect out of a CPU-bound processing chain.

If I had to formulate an ask, I guess it might be two-fold:

  1. Some of the optimizations we've been working on may be applicable to the stream API too. If there's interest in doing so, perhaps we could have a go at evaluating and porting some of them over.
  2. It would be helpful for the official documentation to elaborate on the applicability of the stream API to different use cases (I/O, CPU-bound processing, ...). Also, given the number of ways in which async iteration can be done (stream API, for await ... of, third-party modules like asynciterator, ...), a table comparing one against the others would also be quite helpful in guiding readers to the solution that best fits their use case. Can PRs be made against the official documentation?

@jacoscaz

> In one use case, the code of which I am unfortunately unable to share, we achieved a 150x perf. increase simply by moving from a pre-existing stream.Readable -> stream.Transform -> stream.Transform -> stream.Transform -> stream.Writable chain to stream.Readable -> AsyncIterator -> AsyncIterator -> AsyncIterator -> stream.Writable, with virtually no changes to each transform step while maintaining full support for backpressure management.

I should note that we did not consider Readable#map() at the time due to its experimental status and our having to support both Node 14.x and 16.x. At first glance it should make for a smaller perf. difference, though still a significant one, but don't quote me on this.

@mcollina (Member)

> Can PRs be made against the official documentation?

Definitely! Any contribution is welcome.

@github-actions (Contributor)

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

github-actions bot added the stale label Dec 25, 2022
@github-actions (Contributor)

There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment.

For more information on how the project manages feature requests, please consult the feature request management document.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jan 25, 2023
targos moved this from Pending Triage to Stale in Node.js feature requests, Apr 4, 2023