feat(l1): fix snap sync + add healing #1505

Open · fmoletta wants to merge 397 commits into main

Conversation

@fmoletta (Contributor) commented Dec 13, 2024

Motivation
Fix snap sync logic:
Instead of rebuilding every block's state via snap, we select a pivot block (sync head - 64), fetch its state via snap, and then execute all blocks after it (see the sketch below).
Add a healing phase.
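
A rough sketch of that flow (illustrative only; BlockHeader, fetch_state_via_snap and execute_block are placeholder names, not the actual ethrex APIs):

const MIN_FULL_BLOCKS: usize = 64;

struct BlockHeader {
    number: u64,
    state_root: [u8; 32],
}

// Placeholder: download the account/storage ranges for this state root via
// snap, then run the healing phase to repair any trie nodes that changed
// while the ranges were being downloaded.
fn fetch_state_via_snap(_state_root: [u8; 32]) -> Result<(), String> {
    Ok(())
}

// Placeholder: fetch the block body and fully execute the block.
fn execute_block(_header: &BlockHeader) -> Result<(), String> {
    Ok(())
}

// Pick the pivot (sync head - 64 when enough headers exist), snap-sync its
// state, then execute every block after it. Assumes headers is non-empty and
// ordered from oldest to sync head.
fn snap_sync(headers: &[BlockHeader]) -> Result<(), String> {
    let pivot_idx = if headers.len() > MIN_FULL_BLOCKS {
        headers.len() - MIN_FULL_BLOCKS
    } else {
        headers.len() - 1
    };
    fetch_state_via_snap(headers[pivot_idx].state_root)?;
    for header in &headers[pivot_idx + 1..] {
        execute_block(header)?;
    }
    Ok(())
}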

Missing from this PR:

  • Reorg handling
  • Fetching receipts

Description

Closes #1455

@fmoletta fmoletta marked this pull request as ready for review December 17, 2024 20:47
@fmoletta fmoletta requested a review from a team as a code owner December 17, 2024 20:47
/// Returns the nodes or None if:
/// - There are no available peers (the node just started up or was rejected by all other nodes)
/// - The response timed out
/// - The response was empty or not valid
Collaborator

Haven't read the implementation for this, but it's not clear what happens if some nodes are found and others have issues (partial success). Does it return None or the nodes that are valid? Seems like the comment should make this clear

Contributor Author

If a node is invalid, then the whole response is considered invalid; there are no partial successes.
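
For illustration, a minimal sketch of that all-or-nothing contract (the is_valid closure stands in for the real per-node hash check; this is not the actual ethrex code):

// Returns Some(nodes) only if every node passes validation; an empty or
// partially invalid response is discarded as a whole, yielding None.
fn validate_response<T>(nodes: Vec<T>, is_valid: impl Fn(&T) -> bool) -> Option<Vec<T>> {
    if nodes.is_empty() || nodes.iter().any(|node| !is_valid(node)) {
        return None;
    }
    Some(nodes)
}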

let mut pivot_idx = if all_block_headers.len() > MIN_FULL_BLOCKS {
all_block_headers.len() - MIN_FULL_BLOCKS
} else {
all_block_headers.len() - 1
Collaborator

It's a bit weird to do snap sync if there are fewer than 64 total blocks. Can't we default to full sync here, or would this increase complexity? If it's not trivial I would not do it.

Contributor Author

I'd say this is a very rare edge case; we only perform snap sync if:
A) We are just starting up the node, in which case it is very rare that the chain we sync to has fewer than 64 blocks outside of a test case.
B) A re-org threw us below the latest pivot, in which case it is also very rare that the new canonical chain has 64 blocks or fewer.

// If the pivot became stale, set a further pivot and try again
if stale_pivot && pivot_idx != all_block_headers.len() - 1 {
warn!("Stale pivot, switching to newer head");
pivot_idx = all_block_headers.len() - 1;
@mpaulucci (Collaborator) commented Dec 19, 2024

Shouldn't you update all_block_headers before choosing a new pivot? I imagine the pivot could go stale after some time has passed, by which point there would be a lot more headers.

Collaborator

Why are you choosing pivot_idx = all_block_headers.len() - 1? It will only move the pivot forward by 63 blocks.

Contributor Author

We are not fetching any more headers, as we don't know about any later block aside from the sync root; the only way to sync further would be to wait for the next fork choice update.

store.set_canonical_block(header.number, hash)?;
store.add_block_header(hash, header)?;
let mut pivot_idx = if all_block_headers.len() > MIN_FULL_BLOCKS {
all_block_headers.len() - MIN_FULL_BLOCKS
Collaborator

Does it make sense to keep all block headers in memory? For mainnet that would be ~21M headers. If it's a first implementation it's fine; we can optimize later.

Collaborator

But maybe I would just do something like pivot_idx = head_block.number - MIN_FULL_BLOCKS

Contributor Author

It made sense for small tests, as we could store the full block once we had the body too, but yes, we wouldn't be able to do this with the full Ethereum state.
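
As a rough illustration of the suggestion above (pivot_number is a hypothetical helper, not the actual ethrex code), the pivot can be derived from the head block number alone, so the full header list never has to be kept in memory:

const MIN_FULL_BLOCKS: u64 = 64;

// Pivot = head - MIN_FULL_BLOCKS, saturating at genesis for chains shorter
// than 64 blocks; only the sync head's number is needed.
fn pivot_number(head_number: u64) -> u64 {
    head_number.saturating_sub(MIN_FULL_BLOCKS)
}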

result?;
if stale_pivot {
warn!("Stale pivot, aborting sync");
return Ok(());
Collaborator

Return an error here?

Contributor Author

I looked at the error handling in geth and they don't return an error from the main sync process.
The sync process happens on its own; there is no one we need to reply to with the status of the sync, only the owner/user of the node itself (via tracing), so it doesn't make much sense to return an error unless we want to shut down the node entirely.
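
A minimal sketch of that shape (assuming tokio and tracing, with sync_cycle as a stand-in for the real sync loop; not the actual ethrex code):

// The sync runs as a detached task: failures are reported via tracing and the
// task simply ends; the next fork choice update can start a new cycle.
fn start_sync(sync_head: [u8; 32]) {
    tokio::spawn(async move {
        if let Err(e) = sync_cycle(sync_head).await {
            tracing::warn!("Sync cycle failed: {}", e);
        }
    });
}

// Stand-in for the real header download + snap + healing + execution phases.
async fn sync_cycle(_sync_head: [u8; 32]) -> Result<(), String> {
    Ok(())
}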

.zip(all_block_bodies.into_iter()),
) {
if header.number <= pivot_number {
store.set_canonical_block(header.number, hash)?;
Collaborator

I think the only one calling set_canonical_block should be the apply forkchoice function.

Contributor Author

The problem is that we have already left the fork choice function by the time the sync ends.
Syncs are only triggered by forkChoiceUpdates, so I think it makes sense to consider the sync root as the new head of the canonical chain.
Also, the ethereum/sync hive test expects the latest block to be set after triggering the sync with a single forkChoiceUpdate.

store.add_block(Block::new(header, body))?;
} else {
store.set_canonical_block(header.number, hash)?;
store.update_latest_block_number(header.number)?;
Collaborator

Idem: this is the job of the apply forkchoice update function; here we should just store the block.
