Hierarchical state diffs (#5978)
* Start extracting freezer changes for tree-states

* Remove unused config args

* Add comments

* Remove unwraps

* Subjective more clear implementation

* Clean up hdiff

* Update xdelta3

* Tree states archive metrics (#6040)

* Add store cache size metrics

* Add compress timer metrics

* Add diff apply compute timer metrics

* Add diff buffer cache hit metrics

* Add hdiff buffer load times

* Add blocks replayed metric

* Move metrics to store

* Future proof some metrics

---------

Co-authored-by: Michael Sproul <[email protected]>

* Port and clean up forwards iterator changes

* Add and polish hierarchy-config flag

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Cleaner errors

* Fix beacon_chain test compilation

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Patch a few more freezer block roots

* Fix genesis block root bug

* Fix test failing due to pending updates

* Beacon chain tests passing

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix doc lint

* Implement DB schema upgrade for hierarchical state diffs (#6193)

* DB upgrade

* Add flag

* Delete RestorePointHash

* Update docs

* Update docs

* Implement hierarchical state diffs config migration (#6245)

* Implement hierarchical state diffs config migration

* Review PR

* Remove TODO

* Set CURRENT_SCHEMA_VERSION correctly

* Fix genesis state loading

* Re-delete some PartialBeaconState stuff

---------

Co-authored-by: Michael Sproul <[email protected]>

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix test compilation

* Update schema downgrade test

* Fix tests

* Fix null anchor migration

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix tree states upgrade migration (#6328)

* Towards crash safety

* Fix compilation

* Move cold summaries and state roots to new columns

* Rename StateRoots chunked field

* Update prune states

* Clean hdiff CLI flag and metrics

* Fix "staged reconstruction"

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Fix alloy issues

* Fix staged reconstruction logic

* Prevent weird slot drift

* Remove "allow" flag

* Update CLI help

* Remove FIXME about downgrade

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Remove some unnecessary error variants

* Fix new test

* Tree states archive - review comments and metrics (#6386)

* Review PR comments and metrics

* Comments

* Add anchor metrics

* drop prev comment

* Update metadata.rs

* Apply suggestions from code review

---------

Co-authored-by: Michael Sproul <[email protected]>

* Update beacon_node/store/src/hot_cold_store.rs

Co-authored-by: Lion - dapplion <[email protected]>

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Clarify comment and remove anchor_slot garbage

* Simplify database anchor (#6397)

* Simplify database anchor

* Update beacon_node/store/src/reconstruct.rs

* Add migration for anchor

* Fix and simplify light_client store tests

* Fix incompatible config test

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* More metrics

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* New historic state cache (#6475)

* New historic state cache

* Add more metrics

* State cache hit rate metrics

* Fix store metrics

* More logs and metrics

* Fix logger

* Ensure cached states have built caches :O

* Replay blocks in preference to diffing

* Two separate caches

* Distribute cache build time to next slot

* Re-plumb historic-state-cache flag

* Clean up metrics

* Update book

* Update beacon_node/store/src/hdiff.rs

Co-authored-by: Lion - dapplion <[email protected]>

* Update beacon_node/store/src/historic_state_cache.rs

Co-authored-by: Lion - dapplion <[email protected]>

---------

Co-authored-by: Lion - dapplion <[email protected]>

* Update database docs

* Update diagram

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Update lockbud to work with bindgen/etc

* Correct pkg name for Debian

* Remove vestigial epochs_per_state_diff

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Markdown lint

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Address Jimmy's review comments

* Simplify ReplayFrom case

* Fix and document genesis_state_root

* Typo

Co-authored-by: Jimmy Chen <[email protected]>

* Merge branch 'unstable' into tree-states-archive

* Compute diff of validators list manually (#6556)

* Split hdiff computation

* Dedicated logic for historical roots and summaries

* Benchmark against real states

* Mutated source?

* Version the hdiff

* Add lighthouse DB config for hierarchy exponents

* Tidy up hierarchy exponents flag

* Apply suggestions from code review

Co-authored-by: Michael Sproul <[email protected]>

* Address PR review

* Remove hardcoded paths in benchmarks

* Delete unused function in benches

* lint

---------

Co-authored-by: Michael Sproul <[email protected]>

* Test hdiff binary format stability (#6585)

* Merge remote-tracking branch 'origin/unstable' into tree-states-archive

* Add deprecation warning for SPRP

* Update xdelta to get rid of duplicate deps

* Document test
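
The commit message names hierarchical state diffs and "hierarchy exponents" but this page does not show how a slot is mapped to a storage layer. As a rough sketch only, and assuming layers defined by ascending power-of-two exponents, one way such a scheme could choose between a full snapshot, a diff against a coarser layer, and block replay is shown below. `StorageStrategy`, `storage_strategy` and the exponent values in the usage note are hypothetical names and numbers chosen for illustration, not taken from this commit.

```rust
/// Hypothetical description of how the state at a given slot might be stored.
#[derive(Debug, PartialEq, Eq)]
pub enum StorageStrategy {
    /// Store the full state.
    Snapshot,
    /// Store a diff against the state at the given (earlier) slot.
    DiffFrom(u64),
    /// Store nothing; rebuild by replaying blocks from the given slot.
    ReplayFrom(u64),
}

/// Pick a strategy for `slot`, assuming `exponents` is sorted ascending and
/// layer `i` covers slots that are multiples of `2^exponents[i]`.
/// Returns `None` if `exponents` is empty or an exponent is 64 or larger.
pub fn storage_strategy(slot: u64, exponents: &[u8]) -> Option<StorageStrategy> {
    let moduli = exponents
        .iter()
        .map(|&e| 1u64.checked_shl(u32::from(e)))
        .collect::<Option<Vec<u64>>>()?;
    let finest = *moduli.first()?;
    let coarsest = *moduli.last()?;

    // Slots on the coarsest layer get a full snapshot.
    if slot % coarsest == 0 {
        return Some(StorageStrategy::Snapshot);
    }
    // Otherwise walk from the coarsest pair of layers to the finest and diff
    // against the nearest earlier slot on the next-coarser layer.
    for pair in moduli.windows(2).rev() {
        let (small, big) = (pair[0], pair[1]);
        if slot % small == 0 {
            return Some(StorageStrategy::DiffFrom(slot / big * big));
        }
    }
    // Slots on no layer are rebuilt by replaying from the finest layer.
    Some(StorageStrategy::ReplayFrom(slot / finest * finest))
}
```

With the illustrative exponents `[2, 4]`, slot 32 would be a snapshot, slot 20 a diff from slot 16, and slot 21 a replay from slot 20.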
michaelsproul authored Nov 18, 2024
1 parent 654fc6a commit 9fdd53d
Showing 57 changed files with 3,350 additions and 1,681 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/test-suite.yml
@@ -63,8 +63,8 @@ jobs:
 - name: Checkout repository
   uses: actions/checkout@v3
 - name: Install dependencies
-  run: apt update && apt install -y cmake
-- name: Generate code coverage
+  run: apt update && apt install -y cmake libclang-dev
+- name: Check for deadlocks
   run: |
     cargo lockbud -k deadlock -b -l tokio_util
78 changes: 76 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -263,6 +263,8 @@ validator_http_metrics = { path = "validator_client/http_metrics" }
 validator_metrics = { path = "validator_client/validator_metrics" }
 validator_store= { path = "validator_client/validator_store" }
 warp_utils = { path = "common/warp_utils" }
+xdelta3 = { git = "http://github.com/sigp/xdelta3-rs", rev = "50d63cdf1878e5cf3538e9aae5eed34a22c64e4a" }
+zstd = "0.13"

 [profile.maxperf]
 inherits = "release"
24 changes: 10 additions & 14 deletions beacon_node/beacon_chain/src/beacon_chain.rs
@@ -767,7 +767,6 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
             start_slot,
             local_head.beacon_state.clone(),
             local_head.beacon_block_root,
-            &self.spec,
         )?;

         Ok(iter.map(|result| result.map_err(Into::into)))
@@ -790,12 +789,11 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
         }

         self.with_head(move |head| {
-            let iter = self.store.forwards_block_roots_iterator_until(
-                start_slot,
-                end_slot,
-                || Ok((head.beacon_state.clone(), head.beacon_block_root)),
-                &self.spec,
-            )?;
+            let iter =
+                self.store
+                    .forwards_block_roots_iterator_until(start_slot, end_slot, || {
+                        Ok((head.beacon_state.clone(), head.beacon_block_root))
+                    })?;
             Ok(iter
                 .map(|result| result.map_err(Into::into))
                 .take_while(move |result| {
@@ -865,7 +863,6 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
             start_slot,
             local_head.beacon_state_root(),
             local_head.beacon_state.clone(),
-            &self.spec,
         )?;

         Ok(iter.map(|result| result.map_err(Into::into)))
@@ -882,12 +879,11 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
         end_slot: Slot,
     ) -> Result<impl Iterator<Item = Result<(Hash256, Slot), Error>> + '_, Error> {
         self.with_head(move |head| {
-            let iter = self.store.forwards_state_roots_iterator_until(
-                start_slot,
-                end_slot,
-                || Ok((head.beacon_state.clone(), head.beacon_state_root())),
-                &self.spec,
-            )?;
+            let iter =
+                self.store
+                    .forwards_state_roots_iterator_until(start_slot, end_slot, || {
+                        Ok((head.beacon_state.clone(), head.beacon_state_root()))
+                    })?;
             Ok(iter
                 .map(|result| result.map_err(Into::into))
                 .take_while(move |result| {
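
The `_until` calls in this file hand the head state to the store as a closure rather than as a value. The diff does not say why; one plausible reading, assumed here, is that the clone inside the closure only needs to happen when the requested range extends past what the on-disk store already covers. The sketch below illustrates that lazy pattern with plain integers standing in for roots. `forwards_roots_until`, `frozen_roots` and `get_head_roots` are hypothetical names, not the store's API.

```rust
/// Collect "roots" for `start_slot..=end_slot`.
///
/// `frozen_roots[slot]` stands in for roots already persisted on disk, and
/// `get_head_roots` lazily supplies roots for slots at or after
/// `frozen_roots.len()`. Both are illustrative assumptions.
pub fn forwards_roots_until<F>(
    frozen_roots: &[u64],
    start_slot: usize,
    end_slot: usize,
    get_head_roots: F,
) -> Result<Vec<u64>, String>
where
    F: FnOnce() -> Result<Vec<u64>, String>,
{
    let split = frozen_roots.len();
    let mut out = Vec::new();

    // Serve as much as possible from the persisted roots.
    let frozen_end = (end_slot + 1).min(split);
    for slot in start_slot..frozen_end {
        out.push(frozen_roots[slot]);
    }

    // Only invoke the (potentially expensive) closure if the range
    // reaches past the persisted part.
    if end_slot >= split {
        let head_roots = get_head_roots()?;
        for slot in start_slot.max(split)..=end_slot {
            let root = head_roots
                .get(slot - split)
                .ok_or_else(|| format!("no root available for slot {slot}"))?;
            out.push(*root);
        }
    }
    Ok(out)
}
```

When the whole range is covered by `frozen_roots`, the closure is never called, so any clone it would perform is skipped.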
19 changes: 0 additions & 19 deletions beacon_node/beacon_chain/src/block_verification.rs
@@ -839,9 +839,6 @@ impl<T: BeaconChainTypes> GossipVerifiedBlock<T> {

         let block_root = get_block_header_root(block_header);

-        // Disallow blocks that conflict with the anchor (weak subjectivity checkpoint), if any.
-        check_block_against_anchor_slot(block.message(), chain)?;
-
         // Do not gossip a block from a finalized slot.
         check_block_against_finalized_slot(block.message(), block_root, chain)?;

@@ -1074,9 +1071,6 @@ impl<T: BeaconChainTypes> SignatureVerifiedBlock<T> {
             .fork_name(&chain.spec)
             .map_err(BlockError::InconsistentFork)?;

-        // Check the anchor slot before loading the parent, to avoid spurious lookups.
-        check_block_against_anchor_slot(block.message(), chain)?;
-
         let (mut parent, block) = load_parent(block, chain)?;

         let state = cheap_state_advance_to_obtain_committees::<_, BlockError>(
@@ -1688,19 +1682,6 @@ impl<T: BeaconChainTypes> ExecutionPendingBlock<T> {
     }
 }

-/// Returns `Ok(())` if the block's slot is greater than the anchor block's slot (if any).
-fn check_block_against_anchor_slot<T: BeaconChainTypes>(
-    block: BeaconBlockRef<'_, T::EthSpec>,
-    chain: &BeaconChain<T>,
-) -> Result<(), BlockError> {
-    if let Some(anchor_slot) = chain.store.get_anchor_slot() {
-        if block.slot() <= anchor_slot {
-            return Err(BlockError::WeakSubjectivityConflict);
-        }
-    }
-    Ok(())
-}
-
 /// Returns `Ok(())` if the block is later than the finalized slot on `chain`.
 ///
 /// Returns an error if the block is earlier or equal to the finalized slot, or there was an error
4 changes: 4 additions & 0 deletions beacon_node/beacon_chain/src/builder.rs
@@ -363,6 +363,10 @@ where
         store
             .put_block(&beacon_block_root, beacon_block.clone())
             .map_err(|e| format!("Failed to store genesis block: {:?}", e))?;
+        store
+            .store_frozen_block_root_at_skip_slots(Slot::new(0), Slot::new(1), beacon_block_root)
+            .and_then(|ops| store.cold_db.do_atomically(ops))
+            .map_err(|e| format!("Failed to store genesis block root: {e:?}"))?;

         // Store the genesis block under the `ZERO_HASH` key.
         store
30 changes: 14 additions & 16 deletions beacon_node/beacon_chain/src/historical_blocks.rs
@@ -11,8 +11,8 @@ use std::iter;
 use std::time::Duration;
 use store::metadata::DataColumnInfo;
 use store::{
-    chunked_vector::BlockRoots, AnchorInfo, BlobInfo, ChunkWriter, Error as StoreError,
-    KeyValueStore,
+    get_key_for_col, AnchorInfo, BlobInfo, DBColumn, Error as StoreError, KeyValueStore,
+    KeyValueStoreOp,
 };
 use strum::IntoStaticStr;
 use types::{FixedBytesExtended, Hash256, Slot};
@@ -35,8 +35,6 @@ pub enum HistoricalBlockError {
     InvalidSignature,
     /// Transitory error, caller should retry with the same blocks.
     ValidatorPubkeyCacheTimeout,
-    /// No historical sync needed.
-    NoAnchorInfo,
     /// Logic error: should never occur.
     IndexOutOfBounds,
     /// Internal store error
@@ -72,10 +70,7 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
         &self,
         mut blocks: Vec<AvailableBlock<T::EthSpec>>,
     ) -> Result<usize, HistoricalBlockError> {
-        let anchor_info = self
-            .store
-            .get_anchor_info()
-            .ok_or(HistoricalBlockError::NoAnchorInfo)?;
+        let anchor_info = self.store.get_anchor_info();
         let blob_info = self.store.get_blob_info();
         let data_column_info = self.store.get_data_column_info();

@@ -119,8 +114,6 @@ impl<T: BeaconChainTypes> BeaconChain<T> {

         let mut expected_block_root = anchor_info.oldest_block_parent;
         let mut prev_block_slot = anchor_info.oldest_block_slot;
-        let mut chunk_writer =
-            ChunkWriter::<BlockRoots, _, _>::new(&self.store.cold_db, prev_block_slot.as_usize())?;
         let mut new_oldest_blob_slot = blob_info.oldest_blob_slot;
         let mut new_oldest_data_column_slot = data_column_info.oldest_data_column_slot;

@@ -158,8 +151,11 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
             }

             // Store block roots, including at all skip slots in the freezer DB.
-            for slot in (block.slot().as_usize()..prev_block_slot.as_usize()).rev() {
-                chunk_writer.set(slot, block_root, &mut cold_batch)?;
+            for slot in (block.slot().as_u64()..prev_block_slot.as_u64()).rev() {
+                cold_batch.push(KeyValueStoreOp::PutKeyValue(
+                    get_key_for_col(DBColumn::BeaconBlockRoots.into(), &slot.to_be_bytes()),
+                    block_root.as_slice().to_vec(),
+                ));
             }

             prev_block_slot = block.slot();
@@ -171,15 +167,17 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
             // completion.
             if expected_block_root == self.genesis_block_root {
                 let genesis_slot = self.spec.genesis_slot;
-                for slot in genesis_slot.as_usize()..prev_block_slot.as_usize() {
-                    chunk_writer.set(slot, self.genesis_block_root, &mut cold_batch)?;
+                for slot in genesis_slot.as_u64()..prev_block_slot.as_u64() {
+                    cold_batch.push(KeyValueStoreOp::PutKeyValue(
+                        get_key_for_col(DBColumn::BeaconBlockRoots.into(), &slot.to_be_bytes()),
+                        self.genesis_block_root.as_slice().to_vec(),
+                    ));
                 }
                 prev_block_slot = genesis_slot;
                 expected_block_root = Hash256::zero();
                 break;
             }
         }
-        chunk_writer.write(&mut cold_batch)?;
         // these were pushed in reverse order so we reverse again
         signed_blocks.reverse();

@@ -271,7 +269,7 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
         let backfill_complete = new_anchor.block_backfill_complete(self.genesis_backfill_slot);
         anchor_and_blob_batch.push(
             self.store
-                .compare_and_set_anchor_info(Some(anchor_info), Some(new_anchor))?,
+                .compare_and_set_anchor_info(anchor_info, new_anchor)?,
         );
         self.store.hot_db.do_atomically(anchor_and_blob_batch)?;

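
The hunks above replace the chunked `ChunkWriter` with one key-value write per slot, keyed by `slot.to_be_bytes()`, and repeat a block's root for every skipped slot after it. A minimal, self-contained sketch of that loop follows. It uses a `BTreeMap` as a stand-in for the freezer database and omits the column prefix that `get_key_for_col` adds; `Root`, `PutOp`, `backfill_block_roots` and `apply` are hypothetical names for illustration only.

```rust
use std::collections::BTreeMap;

/// Stand-in for a 32-byte block root (hypothetical; the diff uses `Hash256`).
pub type Root = [u8; 32];

/// One queued write as a (key, value) pair, loosely mirroring a put operation.
pub type PutOp = (Vec<u8>, Vec<u8>);

/// Queue one write per slot for a batch of blocks ordered newest first.
///
/// Each block's root is written at its own slot and at every skipped slot
/// between it and the previously processed (newer) block.
pub fn backfill_block_roots(
    blocks_newest_first: &[(u64, Root)],
    oldest_known_slot: u64,
) -> Vec<PutOp> {
    let mut batch = Vec::new();
    let mut prev_block_slot = oldest_known_slot;
    for &(block_slot, block_root) in blocks_newest_first {
        for slot in (block_slot..prev_block_slot).rev() {
            // Big-endian keys sort bytewise in the same order as the slots.
            batch.push((slot.to_be_bytes().to_vec(), block_root.to_vec()));
        }
        prev_block_slot = block_slot;
    }
    batch
}

/// Apply a batch to an ordered in-memory map standing in for the database.
pub fn apply(batch: Vec<PutOp>) -> BTreeMap<Vec<u8>, Vec<u8>> {
    batch.into_iter().collect()
}
```

Because the keys are big-endian, iterating the ordered map visits slots in ascending numeric order, which is convenient for a forwards iterator over block roots.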
3 changes: 3 additions & 0 deletions beacon_node/beacon_chain/src/metrics.rs
@@ -2004,6 +2004,7 @@ pub fn scrape_for_metrics<T: BeaconChainTypes>(beacon_chain: &BeaconChain<T>) {
     let attestation_stats = beacon_chain.op_pool.attestation_stats();
     let chain_metrics = beacon_chain.metrics();

+    // Kept duplicated for backwards compatibility
     set_gauge_by_usize(
         &BLOCK_PROCESSING_SNAPSHOT_CACHE_SIZE,
         beacon_chain.store.state_cache_len(),
@@ -2067,6 +2068,8 @@ pub fn scrape_for_metrics<T: BeaconChainTypes>(beacon_chain: &BeaconChain<T>) {
         .canonical_head
         .fork_choice_read_lock()
         .scrape_for_metrics();
+
+    beacon_chain.store.register_metrics();
 }

 /// Scrape the given `state` assuming it's the head state, updating the `DEFAULT_REGISTRY`.