Performance: Improved performance for async iterables #42618

Closed · jeswr opened this issue Apr 6, 2022 · 23 comments

Labels: feature request (Issues that request new features to be added to Node.js.), stale

Comments

@jeswr commented Apr 6, 2022

What is the problem this feature will solve?

I have recently been working on optimizations for this implementation of asynchronous iterators. I ran a quick benchmark comparing that implementation against the Readable API in Node v17.4.0 and found that some of our latest optimizations (yet to be released) are orders of magnitude faster than the Readable API. As an example, we can apply 10 maps to 200,000 elements in ~20 ms, whilst the Readable API takes ~2 s (tested on a Dell XPS 15, 32 GB RAM).

import { ArrayIterator } from './dist/asynciterator.js'
import { Readable } from 'stream';

async function test(fn) {
  let j = 0;
  // Build the array [0, 1, ..., 199999].
  const arr = new Array(200_000).fill(true).map(() => j++);
  let iterator = fn(arr);

  // Chain 10 synchronous map transforms.
  for (let i = 0; i < 10; i++) {
    iterator = iterator.map(x => x + 1);
  }

  // Measure how long it takes to fully drain the iterator.
  const now = performance.now();
  return await new Promise((res) => iterator.on('data', () => {}).on('end', () => {
    res(performance.now() - now);
  }));
}

console.log('ArrayIterator', await test(arr => new ArrayIterator(arr)));
console.log('Readable.from', await test(arr => Readable.from(arr)));

Implementation | Time (ms)
-------------- | ------------------
ArrayIterator  | 20.243778005242348
Readable.from  | 2403.212348997593

Most of the discussion around the recent optimizations we have made can be found in this (somewhat long) issue.

What is the feature you are proposing to solve the problem?

Use some of the optimizations already implemented in https://github.com/RubenVerborgh/AsyncIterator, RubenVerborgh/AsyncIterator#59

What alternatives have you considered?

No response

jeswr added the feature request label Apr 6, 2022
@mscdex (Contributor) commented Apr 6, 2022

Are the suggested changes passing all of node's tests?

@jeswr (Author) commented Apr 6, 2022

It wouldn't, because this library is missing a few elements that the Readable API has (for instance, it doesn't expose some read-only properties such as readableFlowing), though there is a PR to add more of those properties (RubenVerborgh/AsyncIterator#45), which also improved the performance of the AsyncIterator library by ~5-10% in cases like the one given above.

@mscdex (Contributor) commented Apr 6, 2022

If it passes all tests, then feel free to submit a PR.

@benjamingr (Member)

@nodejs/streams

@benjamingr (Member)

Hey, thanks for asking. I don't think we've spent a lot of time optimizing this area in the past; we were mostly waiting for people to ask.

Is the ask specifically to make async iterators for Readable.from(largeArray) fast?

@ronag (Member) commented Apr 6, 2022

I don't think we can just replace the current implementation. Would break too much stuff.

@benjamingr (Member)

> I don't think we can just replace the current implementation. Would break too much stuff.

The current implementation doesn't refer to the whole of Readable (the AsyncIterator package doesn't contain even 10% of the functionality of streams, or things like watermarks).

I think the ask is to improve our performance for the async iterator API for Readable?

@ronag (Member) commented Apr 6, 2022

> I think the ask is to improve our performance for the async iterator API for Readable?

I'm not sure what the ask here is. 😕

@ronag (Member) commented Apr 6, 2022

> Use some of the optimizations already implemented in https://github.com/RubenVerborgh/AsyncIterator, RubenVerborgh/AsyncIterator#59

I think this needs to be a bit more specific... I'm not sure what optimisations we should apply.

@benjamingr (Member)

For from, probably just adding a special case for arrays to avoid all the iteration objects should be fine?

diff --git a/lib/internal/streams/from.js b/lib/internal/streams/from.js
index d3d43f7dfb..ad2afa578c 100644
--- a/lib/internal/streams/from.js
+++ b/lib/internal/streams/from.js
@@ -25,10 +25,28 @@ function from(Readable, iterable, opts) {
     });
   }
 
+  if (Array.isArray(iterable)) {
+    let i = 0;
+    return new Readable({
+      objectMode: true,
+      ...opts,
+      read() {
+        if (i < iterable.length) {
+          this.push(iterable[i++]);
+        } else {
+          this.push(null);
+        }
+      }
+    });
+  }
+
   let isAsync;
   if (iterable && iterable[SymbolAsyncIterator]) {
     isAsync = true;
     iterator = iterable[SymbolAsyncIterator]();
   } else if (iterable && iterable[SymbolIterator]) {
     isAsync = false;
     iterator = iterable[SymbolIterator]();

I'm not sure it's worth it though? Statically creating readables with .from is probably not a hot path? (is it for your use case @jeswr ?)

@ronag (Member) commented Apr 6, 2022

We could probably make that even faster by just injecting the array as the readable's buffer. But again this seems like a quite unusual use case.
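
For illustration, here is a minimal sketch of a batched variant that drains the array inside read() and relies on push()'s return value for backpressure; this approximates, but is not, the buffer injection described above (fromArray is a hypothetical helper):

import { Readable } from 'stream';

// Hypothetical sketch: push array elements in batches rather than one per
// read() call. push() returns false once the internal buffer is full, so
// we stop and resume on the next read().
function fromArray(arr, opts) {
  let i = 0;
  return new Readable({
    objectMode: true,
    ...opts,
    read() {
      while (i < arr.length) {
        if (!this.push(arr[i++])) return; // backpressure: wait for next read()
      }
      this.push(null); // end-of-stream
    }
  });
}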

@mcollina (Member) commented Apr 6, 2022

I don't really understand what the use case for this is. I'm totally OK with having the perf improvement in, but what are we optimizing for in this case? I would not recommend anyone process an Array using an AsyncIterator; it'll definitely be slower than just processing the Array.

@jeswr (Author) commented Apr 6, 2022

Apologies for the lack of clarity in my original issue. The main point was to highlight that there is room to make pull-based/asynchronous streams a lot faster, which has the potential to speed up downstream applications; I see how the choice to benchmark using Readable.from may have caused confusion. @jacoscaz reported to me that he has applications (not using Readable.from or its equivalent) that are now 600-900x faster as a result of porting from Readable to the current release of the asynciterator package.

I also note that I'm not intimately familiar with the Node.js codebase; all comments are based on the observed performance comparison between Readable and the asynciterator package.

The ask is more specifically to optimize (in order of importance) the speed of:
(1) internal buffers, by using linked lists (see the sketch after this list);
(2) transformations of pull-based/asynchronous iterators. Specifically, the case of applying several synchronous transformations such as synchronous map/filter/skip can be made a lot faster; we have done this optimization in RubenVerborgh/AsyncIterator#59, with more discussion around its performance gains in RubenVerborgh/AsyncIterator#58. The overarching idea is to apply all the transforms in a single read and skip any intermediary buffers;
(3) the speed of Readable.from, by truncating the array on load.

Once/if these have been applied, there are potentially some further optimisations that can be looked into, based on the thread I linked to in my original message.
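
Regarding (1), a minimal sketch of a linked-list FIFO buffer, assuming the goal is O(1) enqueue/dequeue without the reindexing cost of Array#shift on large buffers (the class and its names are illustrative, taken from neither codebase):

// Minimal singly linked-list FIFO: O(1) push and shift, no reindexing.
class LinkedQueue {
  constructor() {
    this.head = null; // oldest node
    this.tail = null; // newest node
    this.length = 0;
  }
  push(value) {
    const node = { value, next: null };
    if (this.tail) this.tail.next = node;
    else this.head = node;
    this.tail = node;
    this.length++;
  }
  shift() {
    const node = this.head;
    if (node === null) return undefined;
    this.head = node.next;
    if (this.head === null) this.tail = null;
    this.length--;
    return node.value;
  }
}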

@mcollina (Member) commented Apr 6, 2022

@jeswr I'm not sure I follow. Readable.from() is primarily a utility to convert something into a Node stream.Readable, which allows plenty of things, including piping. There are plenty of ways to make it faster; however, doing so would deeply break backward compatibility for all the things that depend on Stream.

@jeswr (Author) commented Apr 7, 2022

Note: I'll try to revisit this once I've had a chance to dig into this codebase further; I don't have as much time as I would like to dig into it right now.

Off the top of my head, there is a very naive way to get improved performance without breaking backwards compatibility: a constructor parameter pullOnly (defaulting to false) which, when enabled, allows optimizations such as this one to be applied. It won't break any old code since it is opt-in, and it will allow a lot of packages to get a performance improvement by simply enabling this parameter.

However, I think there should also be ways of achieving this without downstream libs having to add this extra parameter. The way this kind of thing is achieved in the AsyncIterator lib is by using different classes with different internals to handle different types of operations, i.e. all of these fast 'synchronous' transformations are handled by a MappingIterator, whilst other transforms are handled by a TransformIterator. I'd imagine this could be extended to all of the other kinds of operations that are made available here. (A sketch of the idea follows.)
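
A minimal sketch of that dispatch idea: map() composes the synchronous transforms into a single function, so a chain of ten maps does one fused call per element with no intermediary buffers. All names here are illustrative, not the actual MappingIterator implementation:

// Illustrative fusion of synchronous maps: chaining map() composes the
// callback functions instead of stacking buffered stages.
class SyncMapped {
  constructor(source, fn) {
    this.source = source; // anything with a pull-style read() returning a value or null
    this.fn = fn;
  }
  read() {
    const item = this.source.read();
    return item === null ? null : this.fn(item);
  }
  map(fn) {
    // Fuse: reuse the same source with a composed function.
    const prev = this.fn;
    return new SyncMapped(this.source, x => fn(prev(x)));
  }
}

// A trivial pull source yielding 0, 1, 2, then null.
let n = 0;
const source = { read: () => (n < 3 ? n++ : null) };

let it = new SyncMapped(source, x => x + 1);
for (let k = 0; k < 9; k++) it = it.map(x => x + 1); // ten maps in total
console.log(it.read(), it.read(), it.read(), it.read()); // 10 11 12 null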

In addition, let me allay some confusion by giving an example without Readable.from(). Here we create a Readable with a read method in object mode. In this example, Readable takes 1170 ms (benchmarked on Node v17.4.0), whilst the BufferedIterator from the AsyncIterator lib takes 50 ms (benchmarked on RubenVerborgh/AsyncIterator#59):

import { Readable } from 'stream';

let i = 0;
let SIZE = 100_000;

// A push-based Readable source producing three values per read() call.
let iterator = new Readable({
  read() {
    if (i < SIZE) {
      this.push(i++);
      this.push(i++);
      this.push(i++);
    } else {
      this.push(null);
    }
  },
  objectMode: true
});

for (let i = 0; i < 10; i++) {
  iterator = iterator.map(x => x + 1);
}

const now = performance.now();
iterator.on('data', () => {}).on('end', () => {
  console.log(performance.now() - now);
});
// The equivalent using the asynciterator package (separate script):
import { BufferedIterator } from './dist/asynciterator.js';

let i = 0;
let SIZE = 100_000;

class MyIterator extends BufferedIterator {
  _read(count, done) {
    if (i < SIZE) {
      this._push(i++);
      this._push(i++);
      this._push(i++);
      done();
    } else {
      this.close();
      done();
    }
  }
}


let iterator = new MyIterator();

for (let i = 0; i < 10; i++) {
  iterator = iterator.map(x => x + 1);
}

const now = performance.now();
iterator.on('data', () => {}).on('end', () => {
  console.log(performance.now() - now);
});

@ronag (Member) commented Apr 7, 2022

I think the problem here is that Readable.map is always asynchronous. We haven't optimized for synchronous mappers.

If you remove the map part of the above example then I think the performance difference will be significantly smaller.
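
As a sketch of the general technique (not Node's actual implementation), a mapper could detect thenables at runtime and only await when the callback is actually asynchronous, sparing synchronous mappers an extra await per element:

// Sketch: await only when fn returns a thenable, so synchronous
// mappers skip the extra per-element await and its microtask.
async function* mapped(iterable, fn) {
  for await (const chunk of iterable) {
    const result = fn(chunk);
    yield typeof result?.then === 'function' ? await result : result;
  }
}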

@benjamingr (Member)

> I think the problem here is that Readable.map is always asynchronous. We haven't optimized for synchronous mappers.

I'm not sure why I'd use an async iterator with a synchronous map, but we can definitely make that much, much faster and (for example) fall back on a regular for loop for a synchronous map on the stream's buffer, and even do some of the tricks @jeswr is trying to apply (transducing several .map calls into one), though I am really not sure that's something people should use streams for in the first place?

@jeswr (Author) commented Apr 7, 2022

> I'm not sure why I'd use an async iterator with a synchronous map

One use case is this configurable query engine, which takes data from pull-based sources and then applies various transforms to the data during query evaluation. Here is a link to a subset of the places where it applies such transforms: RubenVerborgh/AsyncIterator#44 (comment)

@jacoscaz commented Jun 27, 2022

Elaborating on @jeswr's comments with my own thoughts, I think the underlying issue here is that developers are often tempted to use streams (as in the API implemented by the stream module and the readable-stream package) for both I/O and for CPU-bound internal processing. A simple example of this would be a streaming processor with multiple internal processing stages that consumes process.stdin and pipes to process.stdout.

In one use case, the code of which I am unfortunately unable to share, we achieved a 150x perf. increase simply by moving from a pre-existing stream.Readable -> stream.Transform -> stream.Transform -> stream.Transform -> stream.Writable chain to stream.Readable -> AsyncIterator -> AsyncIterator -> AsyncIterator -> stream.Writable, with virtually no changes to each transform step while maintaining full support for backpressure management.
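
For the general shape of such a hybrid chain, here is a hedged sketch assuming the asynciterator package's wrap() helper accepts a Node readable (as its documentation describes); the pipeline below is illustrative, not the code from that use case:

import { Readable } from 'stream';
import { wrap } from 'asynciterator';

// Wrap a Node readable, chain synchronous AsyncIterator transforms,
// then consume the results via 'data'/'end' events.
const source = Readable.from([1, 2, 3, 4], { objectMode: true });

const transformed = wrap(source)
  .map(x => x + 1)          // 2, 3, 4, 5
  .filter(x => x % 2 === 0) // 2, 4
  .map(x => x * 10);        // 20, 40

transformed.on('data', x => console.log(x));
transformed.on('end', () => console.log('done'));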

> We haven't optimized for synchronous mappers.

And I am unsure as to whether the stream API should do so, either; at least not as a primary goal. After all, it was born to deal with a different problem. However, especially for newcomers to Node.js and/or stream-based processing, the convenience of simplified construction, of objectMode, and of methods like .map() and .filter() (which also happen to use .shift()) can be misleading as to the performance one can expect out of a CPU-bound processing chain.

If I had to formulate an ask, I guess it might be two-fold:

  1. Some of the optimizations we've been working on may be applicable to the stream API too. If there's interest in doing so, perhaps we could have a go at evaluating and porting some of them over.
  2. It would be helpful for the official documentation to elaborate on the applicability of the stream API to different use cases (I/O, CPU-bound processing, ...). Also, given the number of ways in which async iteration can be done (stream API, for await ... of, third-party modules like asynciterator, ...), a table comparing one against the others would also be quite helpful in guiding readers to the solution that best fits their use case. Can PRs be made against the official documentation?

@jacoscaz

> In one use case, the code of which I am unfortunately unable to share, we achieved a 150x perf. increase simply by moving from a pre-existing stream.Readable -> stream.Transform -> stream.Transform -> stream.Transform -> stream.Writable chain to stream.Readable -> AsyncIterator -> AsyncIterator -> AsyncIterator -> stream.Writable, with virtually no changes to each transform step while maintaining full support for backpressure management.

I should note that we did not consider Readable#map() at the time due to its experimental status and our having to support both Node 14.x and 16.x. At first glance it should make for a smaller perf. difference, though still a significant one, but don't quote me on this.

@mcollina (Member)

> Can PRs be made against the official documentation?

Definitely! Any contribution is welcome.

@github-actions (Contributor)

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

github-actions bot added the stale label Dec 25, 2022
@github-actions (Contributor)

There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment.

For more information on how the project manages feature requests, please consult the feature request management document.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jan 25, 2023
targos moved this from Pending Triage to Stale in Node.js feature requests, Apr 4, 2023