Merkle performance improvements #233

Closed
wants to merge 27 commits into from
Conversation

@eljobe (Member) commented Apr 29, 2024

This change does two things:

  1. It introduces a Merkle::extend method on the merkle tree implementation, which extends the vectors holding the leaves and their parents and then calls set on each of the hashes being appended to layer[0].
  2. It changes the Memory::resize method to use Merkle::extend instead of throwing away the merkle tree and creating a new one.
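The extend-in-place idea can be sketched as follows. This is a minimal toy model (toy hash, simplified types), not the prover's actual Merkle implementation: `extend` appends to layer 0 and then re-hashes only the parents above the appended region, rebuilding a layer in full only when it is newly created.

```rust
// Toy sketch of extend-in-place for a layered Merkle tree.
// hash_pair stands in for a real hash function like Keccak.
fn hash_pair(a: u64, b: u64) -> u64 {
    a.wrapping_mul(1_000_003).wrapping_add(b)
}

struct Merkle {
    // layers[0] holds the leaves; each layer above is roughly half the size.
    layers: Vec<Vec<u64>>,
}

impl Merkle {
    fn new(leaves: Vec<u64>) -> Merkle {
        let mut m = Merkle { layers: vec![leaves] };
        m.rehash_from(0);
        m
    }

    // Append new leaves and recompute only the hashes they affect.
    fn extend(&mut self, new_leaves: &[u64]) {
        let first_new = self.layers[0].len();
        self.layers[0].extend_from_slice(new_leaves);
        self.rehash_from(first_new);
    }

    // Recompute parents from the first dirty index upward. A brand-new
    // layer is computed in full; growing only the edge of the upper
    // layers is exactly the kind of bug this PR's commits describe.
    fn rehash_from(&mut self, mut dirty: usize) {
        let mut layer = 0;
        while self.layers[layer].len() > 1 {
            let parents = (self.layers[layer].len() + 1) / 2;
            let mut parent_dirty = dirty / 2;
            if self.layers.len() == layer + 1 {
                self.layers.push(vec![0; parents]);
                parent_dirty = 0; // new layer: compute everything
            } else {
                self.layers[layer + 1].resize(parents, 0);
            }
            for i in parent_dirty..parents {
                let left = self.layers[layer][2 * i];
                let right = self.layers[layer].get(2 * i + 1).copied().unwrap_or(0);
                self.layers[layer + 1][i] = hash_pair(left, right);
            }
            dirty = parent_dirty;
            layer += 1;
        }
    }

    fn root(&self) -> u64 {
        self.layers.last().and_then(|l| l.first()).copied().unwrap_or(0)
    }
}

fn main() {
    // Extending in place must give the same root as rebuilding from scratch.
    let mut extended = Merkle::new(vec![1, 2, 3, 4]);
    extended.extend(&[5, 6, 7]);
    let rebuilt = Merkle::new(vec![1, 2, 3, 4, 5, 6, 7]);
    assert_eq!(extended.root(), rebuilt.root());
    println!("roots match");
}
```

The payoff is that a resize touching k new leaves costs O(k + log n) hashes instead of O(n) for a full rebuild.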

This makes the always_merkleize strategy much faster than it would be without these changes. Before this change, the benchbin binary ran with metrics like these:

Running benchmark with always merkleize feature on
avg hash time  49.506µs, avg step time     49.583µs, step size       1, num_iters 100, total time  1.587369292s
avg hash time 131.402µs, avg step time      67.44µs, step size    1024, num_iters 100, total time   1.56975325s
avg hash time 529.716µs, avg step time  65.128135ms, step size   32768, num_iters 100, total time  8.108754125s
avg hash time 486.609µs, avg step time 372.639626ms, step size 1048576, num_iters 100, total time 38.849238375s

After this change, the data is more like this:

Running benchmark with always merkleize feature on
avg hash time     55.777µs, avg step time     57.294µs, step size        1, num_iters 100, total time  11.315333ms
avg hash time    126.267µs, avg step time     71.536µs, step size     1024, num_iters 100, total time  19.788125ms
avg hash time    497.955µs, avg step time   3.839355ms, step size    32768, num_iters 100, total time  433.74075ms
avg hash time    461.622µs, avg step time  50.498257ms, step size  1048576, num_iters 100, total time  5.09599725s
avg hash time    826.465µs, avg step time 676.037947ms, step size 16777216, num_iters 100, total time 67.686471417s

NOTE: With the optimization, even step size 16,777,216 (2^24) is able to run 100 iterations in just over a minute.
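For a rough summary, the speedups at the four step sizes present in both runs can be computed directly from the total times reported above:

```rust
fn main() {
    // Before/after total times in seconds, taken from the benchmark
    // output above, at the four step sizes both runs share.
    let step_sizes = [1u64, 1024, 32768, 1048576];
    let before = [1.587369292_f64, 1.56975325, 8.108754125, 38.849238375];
    let after = [0.011315333_f64, 0.019788125, 0.43374075, 5.09599725];
    for ((step, b), a) in step_sizes.iter().zip(&before).zip(&after) {
        // Roughly 140x, 79x, 19x, and 7.6x respectively.
        println!("step size {:>7}: {:>5.1}x faster", step, b / a);
    }
}
```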

references https://linear.app/offchain-labs/issue/NIT-2411/arbitrator-optimizations

rauljordan and others added 17 commits April 21, 2024 04:19
Cherry-Picked from OffchainLabs/nitro@flatmerkleapril16
It can be easy to look at small average step and hash times and miss
that the total time is what we're really trying to reduce.
I definitely had some incorrect assumptions about this data structure
which made it more difficult to learn. So, I'm documenting how it
works and adding some tests.

The simple_merkle test is currently failing because the `set` method
doesn't allow setting an index larger than the largest currently
set leaf's index.

There is some debate as to whether or not this is the correct
behavior. To run the test, use:

```
$> cargo test -- --include-ignored
```
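The restriction under debate can be sketched like this. SimpleMerkle here is a hypothetical stand-in, not the actual prover code; the point is the `#[ignore]`d test, which plain `cargo test` skips and `cargo test -- --include-ignored` runs.

```rust
// Hypothetical sketch of the debated `set` behavior: setting an index
// beyond the currently allocated leaves is rejected.
struct SimpleMerkle {
    leaves: Vec<u64>,
}

impl SimpleMerkle {
    fn new(size: usize) -> Self {
        SimpleMerkle { leaves: vec![0; size] }
    }

    // Panics when idx is past the last leaf, mirroring the restriction
    // described above.
    fn set(&mut self, idx: usize, value: u64) {
        assert!(idx < self.leaves.len(), "set past the last leaf");
        self.leaves[idx] = value;
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Skipped by plain `cargo test`; runs under
    // `cargo test -- --include-ignored`.
    #[test]
    #[ignore]
    fn set_beyond_last_leaf() {
        let mut m = SimpleMerkle::new(4);
        m.set(10, 7); // fails under the restrictive behavior
    }
}

fn main() {
    let mut m = SimpleMerkle::new(4);
    m.set(2, 9); // in range: fine
    assert_eq!(m.leaves[2], 9);
    // Out of range panics under this behavior:
    let result = std::panic::catch_unwind(|| {
        let mut m = SimpleMerkle::new(4);
        m.set(10, 7);
    });
    assert!(result.is_err());
    println!("in-range set ok; out-of-range set rejected");
}
```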
At this point, the new root hash is eagerly calculated after each call to `extend`.
If this happened frequently, it should really improve the performance
of the machine. However, it looks like it doesn't happen at all with
the benchmark inputs.
Previously, it could hit an index out of bounds if the new leaves
caused any parent layer to grow beyond its current size.
Hopefully, this will allow us to compare this branch's implementation
of a merkle tree to the one on merkle-perf-a.
The previous implementation was growing the same layers and
dirty_indices arrays because the clone isn't deep (I guess).
There are a few different things going on in this commit.

1. I've added some counters for when methods get called on the Merkle
tree.

2. I've added integration with gperftools for profiling specific areas
of the code.
This allows me to profile CPU and Heap independently, and to enable
and disable the call counters independently.
This part of the code is obviously slow.
Let's see if we can improve it.
This is why there were all those unexpected "new_advanced" calls on
the memory merkle. The resizes were actually setting self.merkle
back to None.
There was a bug where expanding the lowest layer and calling set on
all of the new elements was not sufficient to grow the upper layers.

This commit also fixes a warning about the package-level profile
override being ineffective.
@cla-bot cla-bot bot added the s label Apr 29, 2024
@eljobe (Member, Author) commented May 2, 2024

I'm closing this PR. The implementation is only fast because it has a bug: the root hash will quite often be incorrect after the merkle tree is extended in Memory::resize.
I'll try again after lunch.

@eljobe eljobe closed this May 2, 2024