Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Root not found in tree error hangs the node (Sepolia long run) #1309

Closed
rodrigo-o opened this issue Sep 26, 2024 · 3 comments · Fixed by #1318
Closed

Root not found in tree error hangs the node (Sepolia long run) #1309

rodrigo-o opened this issue Sep 26, 2024 · 3 comments · Fixed by #1318
Labels
bug Something isn't working P1 This is a MUST for this milestone. It's a main feature, or blocking other main features.

Comments

@rodrigo-o
Copy link
Collaborator

This is something that could happen along the node executions but is easily reproducible on the initial setup. After the initial sync and some time running (a couple of minutes) an initial pruning kicks in and for some reason it generates a slot in the future error, here is an example log:

INFO 17:43:12.001 [StateDb] Pruning started. slot=5963552
INFO 17:43:12.001 [BlobDb] Pruning started. slot=5832480
INFO 17:43:12.001 [BlockDb] Pruning started. slot=5963552
INFO 17:43:12.001 [BlobDb] Pruning finished. 0 blobs removed. 
INFO 17:43:12.003 [Libp2p] Slot transition slot=5963616
INFO 17:43:12.009 [BlockDb] Pruning finished. 52 blocks removed. 
INFO 17:43:12.019 [StateDb] Pruning finished. 52 states removed. 
INFO 17:43:12.684 [Gossip] Block received, block.slot: 5963616. 
INFO 17:43:24.002 [Libp2p] Slot transition slot=5963617
INFO 17:43:24.216 [Gossip] Block received, block.slot: 5963617. 
INFO 17:43:29.977 [Fork choice] Adding new block slot=5963613 root=0x3de..fc05
INFO 17:43:30.204 [Fork choice] Block processed. Recomputing head. 
ERROR 17:43:30.204 GenServer LambdaEthereumConsensus.Libp2pPort terminating
** (RuntimeError) Parent 14C35B27D611342466C990ABE85147AFCDA13DBE2D5C05C4B3155309CF94FFFD not found in tree
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/simple_tree.ex:84: LambdaEthereumConsensus.ForkChoice.Simple.Tree.get_children!/2
    (lambda_ethereum_consensus 0.1.0) lib/types/store.ex:147: Types.Store.get_children/2
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:90: LambdaEthereumConsensus.ForkChoice.Head.filter_block_tree/4
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:85: LambdaEthereumConsensus.ForkChoice.Head.get_filtered_block_tree/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:14: LambdaEthereumConsensus.ForkChoice.Head.get_head/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/fork_choice.ex:269: LambdaEthereumConsensus.ForkChoice.recompute_head/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/fork_choice.ex:62: anonymous fn/1 in LambdaEthereumConsensus.ForkChoice.on_block/2
    (telemetry 1.3.0) /home/admin/lambda_ethereum_consensus/deps/telemetry/src/telemetry.erl:324: :telemetry.span/3
Last message: {#Port<0.16>, {:data, <<50, 145, 166, 1, 10, 36, 55, 100, 49, 52, 49, 57, 53, 49, 45, 97, 100, 56, 50, 45, 52, 50, 101, 102, 45, 97, 102, 102, 101, 45, 48, 57, 102, 54, 50, 55, 101, 97, 57, 56, 102, 101, 16, 1, 26, 229, ...>>}} 
INFO 17:43:30.227 [Optimistic Sync] Waiting 10.0 seconds to discover some peers before requesting blocks. 
INFO 17:43:30.227 [Fork choice] Adding new block slot=5963613 root=0x3de..fc05
ERROR 17:43:30.280 [Fork choice] Failed to add block: block is from the future slot=5963613 root=0x3de..fc05
ERROR 17:43:30.280 [PendingBlocks] Saving block as invalid block is from the future slot=5963613 root=0x3de..fc05
INFO 17:43:36.001 [Libp2p] Slot transition slot=5963618
INFO 17:43:40.228 [Optimistic sync] Performing optimistic sync between slots 5963489 and 5963618, for a total of 130 slots. 

This happens to kick in sync again and and may trigger #1308 issue.

@rodrigo-o rodrigo-o added bug Something isn't working P1 This is a MUST for this milestone. It's a main feature, or blocking other main features. labels Sep 26, 2024
@rodrigo-o rodrigo-o changed the title Pruning causes the node to try to sync again, sometimes leading to hangs (Sepolia long run) Root not found in tree error hangs the node (Sepolia long run) Oct 2, 2024
@rodrigo-o
Copy link
Collaborator Author

rodrigo-o commented Oct 2, 2024

This was not related to pruning as I originally thought. Here is a particular example where pruning wasn't involved. Something to be mindful about is that the error was pointing to a block that we added at the start of the following log, and that was correctly processed and not reorged in Sepolia (https://sepolia.etherscan.io/block/6790864#consensusinfo)

INFO 21:08:40.621 [Fork choice] Adding new block slot=6000639 root=0xb3a..dbe9
INFO 21:08:40.775 [Fork choice] Block processed. Recomputing head. 
INFO 21:08:40.785 [Fork choice] Added new block slot=6000639 root=0xb3a..dbe9
INFO 21:08:40.786 [Fork choice] Adding new block slot=6000640 root=0xcae..a8f3
INFO 21:08:41.061 [Fork choice] Block processed. Recomputing head. 
INFO 21:08:41.067 [Fork choice] Added new block slot=6000640 root=0xcae..a8f3
INFO 21:08:41.069 [Fork choice] Adding new block slot=6000641 root=0xe4e..0178
INFO 21:08:41.070 [Block processing] Computing committees for epoch 187520 
INFO 21:08:41.264 [Fork choice] Block processed. Recomputing head. 
INFO 21:08:41.270 [Fork choice] Added new block slot=6000641 root=0xe4e..0178
INFO 21:08:41.270 [Fork choice] Adding new block slot=6000642 root=0x179..fdb5
INFO 21:08:41.430 [Fork choice] Block processed. Recomputing head. 
INFO 21:08:41.434 [Fork choice] Added new block slot=6000642 root=0x179..fdb5

....

INFO 21:14:12.008 [Libp2p] Slot transition slot=6000671
INFO 21:14:12.585 [Gossip] Block received, block.slot: 6000671. 
INFO 21:14:12.596 [Fork choice] Adding new block slot=6000671 root=0x8c6..d57e
INFO 21:14:12.791 [Fork choice] Block processed. Recomputing head. 
INFO 21:14:12.797 [Fork choice] Added new block slot=6000671 root=0x8c6..d57e
INFO 21:14:24.004 [Libp2p] Slot transition slot=6000672
INFO 21:14:27.545 [Gossip] Block received, block.slot: 6000672. 
INFO 21:14:27.718 [Fork choice] Adding new block slot=6000672 root=0x611..070f
INFO 21:14:28.073 [Fork choice] Block processed. Recomputing head. 
ERROR 21:14:28.076 GenServer LambdaEthereumConsensus.Libp2pPort terminating
** (RuntimeError) Parent CAE1F13DBD7FAF2177FDEE036E00B1D59319D380A9DC859478940CB1B0FCA8F3 not found in tree
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/simple_tree.ex:84: LambdaEthereumConsensus.ForkChoice.Simple.Tree.get_children!/2
    (lambda_ethereum_consensus 0.1.0) lib/types/store.ex:148: Types.Store.get_children/2
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:90: LambdaEthereumConsensus.ForkChoice.Head.filter_block_tree/4
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:85: LambdaEthereumConsensus.ForkChoice.Head.get_filtered_block_tree/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:14: LambdaEthereumConsensus.ForkChoice.Head.get_head/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/fork_choice.ex:272: LambdaEthereumConsensus.ForkChoice.recompute_head/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/fork_choice.ex:62: anonymous fn/1 in LambdaEthereumConsensus.ForkChoice.on_block/2
    (telemetry 1.3.0) /Users/roliveri/prog/elixir/consensus/lambda_ethereum_consensus/deps/telemetry/src/telemetry.erl:324: :telemetry.span/3
Last message: {#Port<0.19>, {:data, <<10, 152, 76, 10, 40, 47, 101, 116, 104, 50, 47, 100, 51, 49, 102, 54, 49, 57, 49, 47, 98, 108, 111, 98, 95, 115, 105, 100, 101, 99, 97, 114, 95, 49, 47, 115, 115, 122, 95, 115, 110, 97, 112, 112, 121, 26, ...>>}} 
INFO 21:14:28.106 [Optimistic Sync] Waiting 10.0 seconds to discover some peers before requesting blocks. 
INFO 21:14:28.107 [Fork choice] Adding new block slot=6000672 root=0x611..070f

@rodrigo-o
Copy link
Collaborator Author

In #1316 I was able to check that the store is in al old state and that cause the setting of the block as invalid:

INFO 16:00:48.604 [Fork choice] Adding new block slot=6013504 root=0xaa1..c455
INFO 16:00:48.802 [Fork choice] Block processed. Recomputing head. 
ERROR 16:00:48.805 GenServer LambdaEthereumConsensus.Libp2pPort terminating
** (RuntimeError) Parent 8C7A76877C9CD8052DFF5995378515B9140DF11467672AEFFA7D0E3A1B9AE05B not found in tree
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/simple_tree.ex:84: LambdaEthereumConsensus.ForkChoice.Simple.Tree.get_children!/2
    (lambda_ethereum_consensus 0.1.0) lib/types/store.ex:144: Types.Store.get_children/2
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:90: LambdaEthereumConsensus.ForkChoice.Head.filter_block_tree/4
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:85: LambdaEthereumConsensus.ForkChoice.Head.get_filtered_block_tree/1
    (lambda_ethereum_consensus 0.1.0) lib/lambda_ethereum_consensus/fork_choice/head.ex:14: LambdaEthereumConsensus.ForkChoice.Head.get_head/1
    (lambda_ethereum_consensus 0.1.0) iex:296: LambdaEthereumConsensus.ForkChoice.recompute_head/1
    (lambda_ethereum_consensus 0.1.0) iex:62: anonymous fn/1 in LambdaEthereumConsensus.ForkChoice.on_block/2
    (telemetry 1.3.0) /Users/roliveri/prog/elixir/consensus/lambda_ethereum_consensus/deps/telemetry/src/telemetry.erl:324: :telemetry.span/3
Last message: {#Port<0.19>, {:data, <<10, 174, 186, 1, 10, 38, 47, 101, 116, 104, 50, 47, 100, 51, 49, 102, 54, 49, 57, 49, 47, 98, 101, 97, 99, 111, 110, 95, 98, 108, 111, 99, 107, 47, 115, 115, 122, 95, 115, 110, 97, 112, 112, 121, 26, 20, ...>>}} 
INFO 16:00:48.828 [Optimistic Sync] Waiting 10.0 seconds to discover some peers before requesting blocks. 
INFO 16:00:48.828 [Fork choice] Adding new block slot=6013504 root=0xaa1..c455
INFO 16:00:48.881 Store slot: 6013457 is in the past, slot: 6013504 is in the future. Current chain slot: 6013504 
ERROR 16:00:48.881 [Fork choice] Failed to add block: block is from the future slot=6013504 root=0xaa1..c455
ERROR 16:00:48.882 [PendingBlocks] Saving block as invalid block is from the future slot=6013504 root=0xaa1..c455

Either way the root cause is the same, the tree manage in the Recomputing head

@rodrigo-o
Copy link
Collaborator Author

Resolved by #1318

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working P1 This is a MUST for this milestone. It's a main feature, or blocking other main features.
Projects
Development

Successfully merging a pull request may close this issue.

1 participant