From 6d792b4280fd190a8335a2220c58e2c6137c450a Mon Sep 17 00:00:00 2001 From: Jimmy Chen Date: Tue, 7 May 2024 10:36:32 +1000 Subject: [PATCH 01/19] Revise contributors guide (#5720) * Revise contributors guide. --- CONTRIBUTING.md | 42 ++++++++++++++++++++++++++---------------- 1 file changed, 26 insertions(+), 16 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index a408fcdd52..3c53558a10 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,12 +1,14 @@ # Contributors Guide + [![GitPOAP badge](https://public-api.gitpoap.io/v1/repo/sigp/lighthouse/badge)](https://www.gitpoap.io/gh/sigp/lighthouse) -Lighthouse is an open-source Ethereum 2.0 client. We're community driven and +Lighthouse is an open-source Ethereum consensus client. We're community driven and welcome all contribution. We aim to provide a constructive, respectful and fun environment for collaboration. -We are active contributors to the [Ethereum 2.0 specification](https://github.com/ethereum/eth2.0-specs) and attend all [Eth -2.0 implementers calls](https://github.com/ethereum/eth2.0-pm). +We are active contributors to +the [Ethereum Proof-of-Stake Consensus specification](https://github.com/ethereum/consensus-specs) and attend +all [Ethereum implementers calls](https://github.com/ethereum/pm/). This guide is geared towards beginners. If you're an open-source veteran feel free to just skim this document and get straight into crushing issues. @@ -41,7 +43,7 @@ We recommend the following work-flow for contributors: 1. **Find an issue** to work on, either because it's interesting or suitable to your skill-set. Use comments to communicate your intentions and ask -questions. + questions. 2. **Work in a feature branch** of your personal fork (github.com/YOUR_NAME/lighthouse) of the main repository (github.com/sigp/lighthouse). @@ -49,13 +51,13 @@ questions. `unstable` as the base branch to merge your changes into the main repository. 4. 
Wait for the repository maintainers to **review your changes** to ensure the issue is addressed satisfactorily. Optionally, mention your PR on -[discord](https://discord.gg/cyAszAh). + [discord](https://discord.gg/cyAszAh). 5. If the issue is addressed the repository maintainers will **merge your pull-request** and you'll be an official contributor! Generally, you find an issue you'd like to work on and announce your intentions to start work in a comment on the issue. Then, do your work on a separate -branch (a "feature branch") in your own fork of the main repository. Once +branch (a "feature branch") in your own fork of the main repository. Once you're happy and you think the issue has been addressed, create a pull request into the main repository. @@ -66,18 +68,20 @@ steps: 1. [Create a fork](https://help.github.com/articles/fork-a-repo/#fork-an-example-repository) -and [clone -it](https://help.github.com/articles/fork-a-repo/#step-2-create-a-local-clone-of-your-fork) -to your local machine. + and [clone + it](https://help.github.com/articles/fork-a-repo/#step-2-create-a-local-clone-of-your-fork) + to your local machine. 2. [Add an _"upstream"_ branch](https://help.github.com/articles/fork-a-repo/#step-3-configure-git-to-sync-your-fork-with-the-original-spoon-knife-repository) -that tracks github.com/sigp/lighthouse using `$ git remote add upstream -https://github.com/sigp/lighthouse.git` (pro-tip: [use SSH](https://help.github.com/articles/connecting-to-github-with-ssh/) instead of HTTPS). + that tracks github.com/sigp/lighthouse using `$ git remote add upstream + https://github.com/sigp/lighthouse.git` ( + pro-tip: [use SSH](https://help.github.com/articles/connecting-to-github-with-ssh/) instead of HTTPS). 3. Create a new feature branch with `$ git checkout -b your_feature_name`. The name of your branch isn't critical but it should be short and instructive. -E.g., if you're fixing a bug with serialization, you could name your branch -`fix_serialization_bug`. 
-4. Make sure you sign your commits. See [relevant doc](https://help.github.com/en/github/authenticating-to-github/about-commit-signature-verification). + E.g., if you're fixing a bug with serialization, you could name your branch + `fix_serialization_bug`. +4. Make sure you sign your commits. + See [relevant doc](https://help.github.com/en/github/authenticating-to-github/about-commit-signature-verification). 5. Commit your changes and push them to your fork with `$ git push origin your_feature_name`. 6. Go to your fork on github.com and use the web interface to create a pull @@ -92,22 +96,28 @@ by Rob Allen that provides much more detail on each of these steps, if you're having trouble. As always, jump on [discord](https://discord.gg/cyAszAh) if you get stuck. +Additionally, +the ["Contributing to Lighthouse" section](https://lighthouse-book.sigmaprime.io/contributing.html#contributing-to-lighthouse) +of the Lighthouse Book provides more details on the setup. ## FAQs ### I don't think I have anything to add There's lots to be done and there's all sorts of tasks. You can do anything -from correcting typos through to writing core consensus code. If you reach out, +from enhancing documentation through to writing core consensus code. If you reach out, we'll include you. +Please note, to maintain project quality, we may not accept PRs for small typos or changes +with minimal impact. + ### I'm not sure my Rust is good enough We're open to developers of all levels. If you create a PR and your code doesn't meet our standards, we'll help you fix it and we'll share the reasoning with you. Contributing to open-source is a great way to learn. -### I'm not sure I know enough about Ethereum 2.0 +### I'm not sure I know enough about Ethereum No problems, there's plenty of tasks that don't require extensive Ethereum knowledge. You can learn about Ethereum as you go. 
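The fork, branch, sign, and push steps described in the revised guide above can be condensed into a shell sketch. This is not part of the patch: `YOUR_NAME`, the branch name, and the use of an SSH remote plus `-S` commit signing are illustrative assumptions (signing requires a configured GPG or SSH signing key).

```shell
# Sketch of the contributor workflow from the guide above.
# YOUR_NAME and the branch name are placeholders.
git clone git@github.com:YOUR_NAME/lighthouse.git
cd lighthouse
# Track the main repository as "upstream" (SSH is also fine here).
git remote add upstream https://github.com/sigp/lighthouse.git
# Short, instructive feature-branch name, e.g. for a serialization fix:
git checkout -b fix_serialization_bug
# ...make your changes, then commit with a signed commit...
git commit -S -m "Fix serialization bug"
git push origin fix_serialization_bug
# Finally, open a pull request against sigp/lighthouse's `unstable` branch.
```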
From 93e0649abca54b7f2af4de0f486aec26adb15eb7 Mon Sep 17 00:00:00 2001 From: Lion - dapplion <35266934+dapplion@users.noreply.github.com> Date: Mon, 13 May 2024 14:41:29 +0300 Subject: [PATCH 02/19] Notify lookup sync of gossip processing results (#5722) * Notify lookup sync of gossip processing results * Add tests * Add GossipBlockProcessResult event * Re-add dropped comments * Update beacon_node/network/src/network_beacon_processor/sync_methods.rs * update test_lookup_disconnection_peer_left --- .../src/block_verification_types.rs | 20 ++ .../src/data_availability_checker.rs | 7 +- .../overflow_lru_cache.rs | 13 +- .../gossip_methods.rs | 20 +- .../src/network_beacon_processor/mod.rs | 6 +- .../network_beacon_processor/sync_methods.rs | 18 +- .../network/src/sync/block_lookups/common.rs | 21 +- .../network/src/sync/block_lookups/mod.rs | 30 ++- .../sync/block_lookups/single_block_lookup.rs | 35 ++- .../network/src/sync/block_lookups/tests.rs | 241 +++++++++++++++++- beacon_node/network/src/sync/manager.rs | 19 +- .../network/src/sync/network_context.rs | 61 +++-- 12 files changed, 398 insertions(+), 93 deletions(-) diff --git a/beacon_node/beacon_chain/src/block_verification_types.rs b/beacon_node/beacon_chain/src/block_verification_types.rs index d0360bf18e..70f1e99ef7 100644 --- a/beacon_node/beacon_chain/src/block_verification_types.rs +++ b/beacon_node/beacon_chain/src/block_verification_types.rs @@ -314,6 +314,26 @@ pub struct BlockImportData { pub consensus_context: ConsensusContext, } +impl BlockImportData { + pub fn __new_for_test( + block_root: Hash256, + state: BeaconState, + parent_block: SignedBeaconBlock>, + ) -> Self { + Self { + block_root, + state, + parent_block, + parent_eth1_finalization_data: Eth1FinalizationData { + eth1_data: <_>::default(), + eth1_deposit_index: 0, + }, + confirmed_state_roots: vec![], + consensus_context: ConsensusContext::new(Slot::new(0)), + } + } +} + pub type GossipVerifiedBlockContents = (GossipVerifiedBlock, 
Option>); diff --git a/beacon_node/beacon_chain/src/data_availability_checker.rs b/beacon_node/beacon_chain/src/data_availability_checker.rs index 27ed0ae6d5..a981d31e55 100644 --- a/beacon_node/beacon_chain/src/data_availability_checker.rs +++ b/beacon_node/beacon_chain/src/data_availability_checker.rs @@ -84,10 +84,11 @@ impl DataAvailabilityChecker { }) } - /// Checks if the block root is currenlty in the availability cache awaiting processing because + /// Checks if the block root is currently in the availability cache awaiting import because /// of missing components. - pub fn has_block(&self, block_root: &Hash256) -> bool { - self.availability_cache.has_block(block_root) + pub fn has_execution_valid_block(&self, block_root: &Hash256) -> bool { + self.availability_cache + .has_execution_valid_block(block_root) } /// Return the required blobs `block_root` expects if the block is currenlty in the cache. diff --git a/beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs b/beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs index f29cec9244..2e3c4aac55 100644 --- a/beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs +++ b/beacon_node/beacon_chain/src/data_availability_checker/overflow_lru_cache.rs @@ -432,11 +432,6 @@ impl Critical { Ok(()) } - /// Returns true if the block root is known, without altering the LRU ordering - pub fn has_block(&self, block_root: &Hash256) -> bool { - self.in_memory.peek(block_root).is_some() || self.store_keys.contains(block_root) - } - /// This only checks for the blobs in memory pub fn peek_blob( &self, @@ -549,8 +544,12 @@ impl OverflowLRUCache { } /// Returns true if the block root is known, without altering the LRU ordering - pub fn has_block(&self, block_root: &Hash256) -> bool { - self.critical.read().has_block(block_root) + pub fn has_execution_valid_block(&self, block_root: &Hash256) -> bool { + if let Some(pending_components) = 
self.critical.read().peek_pending_components(block_root) { + pending_components.executed_block.is_some() + } else { + false + } } /// Fetch a blob from the cache without affecting the LRU ordering diff --git a/beacon_node/network/src/network_beacon_processor/gossip_methods.rs b/beacon_node/network/src/network_beacon_processor/gossip_methods.rs index af7f3a53e5..cf9f3e54e1 100644 --- a/beacon_node/network/src/network_beacon_processor/gossip_methods.rs +++ b/beacon_node/network/src/network_beacon_processor/gossip_methods.rs @@ -1187,19 +1187,18 @@ impl NetworkBeaconProcessor { "block_root" => %block_root, ); } - Err(BlockError::ParentUnknown(block)) => { - // Inform the sync manager to find parents for this block - // This should not occur. It should be checked by `should_forward_block` + Err(BlockError::ParentUnknown(_)) => { + // This should not occur. It should be checked by `should_forward_block`. + // Do not send sync message UnknownParentBlock to prevent conflicts with the + // BlockComponentProcessed message below. If this error ever happens, lookup sync + // can recover by receiving another block / blob / attestation referencing the + // chain that includes this block. 
error!( self.log, "Block with unknown parent attempted to be processed"; + "block_root" => %block_root, "peer_id" => %peer_id ); - self.send_sync_message(SyncMessage::UnknownParentBlock( - peer_id, - block.clone(), - block_root, - )); } Err(ref e @ BlockError::ExecutionPayloadError(ref epe)) if !epe.penalize_peer() => { debug!( @@ -1263,6 +1262,11 @@ impl NetworkBeaconProcessor { &self.log, ); } + + self.send_sync_message(SyncMessage::GossipBlockProcessResult { + block_root, + imported: matches!(result, Ok(AvailabilityProcessingStatus::Imported(_))), + }); } pub fn process_gossip_voluntary_exit( diff --git a/beacon_node/network/src/network_beacon_processor/mod.rs b/beacon_node/network/src/network_beacon_processor/mod.rs index f10646c741..cabe39f929 100644 --- a/beacon_node/network/src/network_beacon_processor/mod.rs +++ b/beacon_node/network/src/network_beacon_processor/mod.rs @@ -1,7 +1,5 @@ -use crate::{ - service::NetworkMessage, - sync::{manager::BlockProcessType, SyncMessage}, -}; +use crate::sync::manager::BlockProcessType; +use crate::{service::NetworkMessage, sync::manager::SyncMessage}; use beacon_chain::block_verification_types::RpcBlock; use beacon_chain::{builder::Witness, eth1_chain::CachingEth1Backend, BeaconChain}; use beacon_chain::{BeaconChainTypes, NotifyExecutionLayer}; diff --git a/beacon_node/network/src/network_beacon_processor/sync_methods.rs b/beacon_node/network/src/network_beacon_processor/sync_methods.rs index daa9a2cf19..f66879715d 100644 --- a/beacon_node/network/src/network_beacon_processor/sync_methods.rs +++ b/beacon_node/network/src/network_beacon_processor/sync_methods.rs @@ -170,17 +170,15 @@ impl NetworkBeaconProcessor { if reprocess_tx.try_send(reprocess_msg).is_err() { error!(self.log, "Failed to inform block import"; "source" => "rpc", "block_root" => %hash) }; - if matches!(process_type, BlockProcessType::SingleBlock { .. 
}) { - self.chain.block_times_cache.write().set_time_observed( - hash, - slot, - seen_timestamp, - None, - None, - ); + self.chain.block_times_cache.write().set_time_observed( + hash, + slot, + seen_timestamp, + None, + None, + ); - self.chain.recompute_head_at_current_slot().await; - } + self.chain.recompute_head_at_current_slot().await; } // Sync handles these results self.send_sync_message(SyncMessage::BlockComponentProcessed { diff --git a/beacon_node/network/src/sync/block_lookups/common.rs b/beacon_node/network/src/sync/block_lookups/common.rs index fa63e37c1b..400d382d6d 100644 --- a/beacon_node/network/src/sync/block_lookups/common.rs +++ b/beacon_node/network/src/sync/block_lookups/common.rs @@ -2,8 +2,8 @@ use crate::sync::block_lookups::single_block_lookup::{ LookupRequestError, SingleBlockLookup, SingleLookupRequestState, }; use crate::sync::block_lookups::{BlobRequestState, BlockRequestState, PeerId}; -use crate::sync::manager::{BlockProcessType, Id, SLOT_IMPORT_TOLERANCE}; -use crate::sync::network_context::SyncNetworkContext; +use crate::sync::manager::{Id, SLOT_IMPORT_TOLERANCE}; +use crate::sync::network_context::{LookupRequestResult, SyncNetworkContext}; use beacon_chain::block_verification_types::RpcBlock; use beacon_chain::BeaconChainTypes; use std::sync::Arc; @@ -45,7 +45,7 @@ pub trait RequestState { peer_id: PeerId, downloaded_block_expected_blobs: Option, cx: &mut SyncNetworkContext, - ) -> Result; + ) -> Result; /* Response handling methods */ @@ -80,7 +80,7 @@ impl RequestState for BlockRequestState { peer_id: PeerId, _: Option, cx: &mut SyncNetworkContext, - ) -> Result { + ) -> Result { cx.block_lookup_request(id, peer_id, self.requested_block_root) .map_err(LookupRequestError::SendFailed) } @@ -97,10 +97,10 @@ impl RequestState for BlockRequestState { peer_id: _, } = download_result; cx.send_block_for_processing( + id, block_root, RpcBlock::new_without_blobs(Some(block_root), value), seen_timestamp, - BlockProcessType::SingleBlock { id 
}, ) .map_err(LookupRequestError::SendFailed) } @@ -128,7 +128,7 @@ impl RequestState for BlobRequestState { peer_id: PeerId, downloaded_block_expected_blobs: Option, cx: &mut SyncNetworkContext, - ) -> Result { + ) -> Result { cx.blob_lookup_request( id, peer_id, @@ -149,13 +149,8 @@ impl RequestState for BlobRequestState { seen_timestamp, peer_id: _, } = download_result; - cx.send_blobs_for_processing( - block_root, - value, - seen_timestamp, - BlockProcessType::SingleBlob { id }, - ) - .map_err(LookupRequestError::SendFailed) + cx.send_blobs_for_processing(id, block_root, value, seen_timestamp) + .map_err(LookupRequestError::SendFailed) } fn response_type() -> ResponseType { diff --git a/beacon_node/network/src/sync/block_lookups/mod.rs b/beacon_node/network/src/sync/block_lookups/mod.rs index 3da2577114..dd823a307b 100644 --- a/beacon_node/network/src/sync/block_lookups/mod.rs +++ b/beacon_node/network/src/sync/block_lookups/mod.rs @@ -408,7 +408,10 @@ impl BlockLookups { self.on_processing_result_inner::>(id, result, cx) } }; - self.on_lookup_result(process_type.id(), lookup_result, "processing_result", cx); + let id = match process_type { + BlockProcessType::SingleBlock { id } | BlockProcessType::SingleBlob { id } => id, + }; + self.on_lookup_result(id, lookup_result, "processing_result", cx); } pub fn on_processing_result_inner>( @@ -521,6 +524,7 @@ impl BlockLookups { } other => { debug!(self.log, "Invalid lookup component"; "block_root" => ?block_root, "component" => ?R::response_type(), "error" => ?other); + let peer_id = request_state.on_processing_failure()?; cx.report_peer( peer_id, @@ -561,6 +565,30 @@ impl BlockLookups { } } + pub fn on_external_processing_result( + &mut self, + block_root: Hash256, + imported: bool, + cx: &mut SyncNetworkContext, + ) { + let Some((id, lookup)) = self + .single_block_lookups + .iter_mut() + .find(|(_, lookup)| lookup.is_for_block(block_root)) + else { + // Ok to ignore gossip process events + return; + }; + + let 
lookup_result = if imported { + Ok(LookupResult::Completed) + } else { + lookup.continue_requests(cx) + }; + let id = *id; + self.on_lookup_result(id, lookup_result, "external_processing_result", cx); + } + /// Makes progress on the immediate children of `block_root` pub fn continue_child_lookups(&mut self, block_root: Hash256, cx: &mut SyncNetworkContext) { let mut lookup_results = vec![]; // < need to buffer lookup results to not re-borrow &mut self diff --git a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs index a5729f3906..6ee519b0dd 100644 --- a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs +++ b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs @@ -2,7 +2,7 @@ use super::common::ResponseType; use super::{BlockComponent, PeerId, SINGLE_BLOCK_LOOKUP_MAX_ATTEMPTS}; use crate::sync::block_lookups::common::RequestState; use crate::sync::block_lookups::Id; -use crate::sync::network_context::SyncNetworkContext; +use crate::sync::network_context::{LookupRequestResult, SyncNetworkContext}; use beacon_chain::BeaconChainTypes; use itertools::Itertools; use rand::seq::IteratorRandom; @@ -179,11 +179,13 @@ impl SingleBlockLookup { .use_rand_available_peer() .ok_or(LookupRequestError::NoPeers)?; - // make_request returns true only if a request needs to be made - if request.make_request(id, peer_id, downloaded_block_expected_blobs, cx)? { - request.get_state_mut().on_download_start()?; - } else { - request.get_state_mut().on_completed_request()?; + match request.make_request(id, peer_id, downloaded_block_expected_blobs, cx)? { + LookupRequestResult::RequestSent => request.get_state_mut().on_download_start()?, + LookupRequestResult::NoRequestNeeded => { + request.get_state_mut().on_completed_request()? 
+ } + // Sync will receive a future event to make progress on the request, do nothing now + LookupRequestResult::Pending => return Ok(()), } // Otherwise, attempt to progress awaiting processing @@ -262,12 +264,16 @@ pub struct DownloadResult { pub peer_id: PeerId, } -#[derive(Debug, PartialEq, Eq)] +#[derive(Debug, PartialEq, Eq, IntoStaticStr)] pub enum State { AwaitingDownload, Downloading, AwaitingProcess(DownloadResult), + /// Request is processing, sent by lookup sync Processing(DownloadResult), + /// Request is processed: + /// - `Processed(Some)` if lookup sync downloaded and sent to process this request + /// - `Processed(None)` if another source (i.e. gossip) sent this component for processing Processed(Option), } @@ -428,12 +434,11 @@ impl SingleLookupRequestState { } } - pub fn on_processing_success(&mut self) -> Result { + pub fn on_processing_success(&mut self) -> Result<(), LookupRequestError> { match &self.state { State::Processing(result) => { - let peer_id = result.peer_id; - self.state = State::Processed(Some(peer_id)); - Ok(peer_id) + self.state = State::Processed(Some(result.peer_id)); + Ok(()) } other => Err(LookupRequestError::BadState(format!( "Bad state on_processing_success expected Processing got {other}" @@ -514,12 +519,6 @@ impl SingleLookupRequestState { impl std::fmt::Display for State { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { - match self { - State::AwaitingDownload => write!(f, "AwaitingDownload"), - State::Downloading { .. } => write!(f, "Downloading"), - State::AwaitingProcess { .. } => write!(f, "AwaitingProcessing"), - State::Processing { .. } => write!(f, "Processing"), - State::Processed { .. 
} => write!(f, "Processed"), - } + write!(f, "{}", Into::<&'static str>::into(self)) } } diff --git a/beacon_node/network/src/sync/block_lookups/tests.rs b/beacon_node/network/src/sync/block_lookups/tests.rs index 75e0fc524f..761e54144d 100644 --- a/beacon_node/network/src/sync/block_lookups/tests.rs +++ b/beacon_node/network/src/sync/block_lookups/tests.rs @@ -11,12 +11,17 @@ use std::sync::Arc; use super::*; use crate::sync::block_lookups::common::{ResponseType, PARENT_DEPTH_TOLERANCE}; -use beacon_chain::block_verification_types::RpcBlock; +use beacon_chain::blob_verification::GossipVerifiedBlob; +use beacon_chain::block_verification_types::{BlockImportData, RpcBlock}; use beacon_chain::builder::Witness; +use beacon_chain::data_availability_checker::Availability; use beacon_chain::eth1_chain::CachingEth1Backend; use beacon_chain::test_utils::{ build_log, generate_rand_block_and_blobs, BeaconChainHarness, EphemeralHarnessType, NumBlobs, }; +use beacon_chain::{ + AvailabilityPendingExecutedBlock, PayloadVerificationOutcome, PayloadVerificationStatus, +}; use beacon_processor::WorkEvent; use lighthouse_network::rpc::{RPCError, RPCResponseErrorCode}; use lighthouse_network::types::SyncState; @@ -25,10 +30,12 @@ use slog::info; use slot_clock::{ManualSlotClock, SlotClock, TestingSlotClock}; use store::MemoryStore; use tokio::sync::mpsc; +use types::test_utils::TestRandom; use types::{ test_utils::{SeedableRng, XorShiftRng}, BlobSidecar, ForkName, MinimalEthSpec as E, SignedBeaconBlock, Slot, }; +use types::{BeaconState, BeaconStateBase}; type T = Witness, E, MemoryStore, MemoryStore>; @@ -68,6 +75,8 @@ struct TestRig { sync_manager: SyncManager, /// To manipulate sync state and peer connection status network_globals: Arc>, + /// Beacon chain harness + harness: BeaconChainHarness>, /// `rng` for generating test blocks and blobs. 
rng: XorShiftRng, fork_name: ForkName, @@ -129,6 +138,7 @@ impl TestRig { sync_recv, log.clone(), ), + harness, fork_name, log, } @@ -423,6 +433,63 @@ impl TestRig { }); } + fn complete_single_lookup_blob_download( + &mut self, + id: SingleLookupReqId, + peer_id: PeerId, + blobs: Vec>, + ) { + for blob in blobs { + self.single_lookup_blob_response(id, peer_id, Some(blob.into())); + } + self.single_lookup_blob_response(id, peer_id, None); + } + + fn complete_single_lookup_blob_lookup_valid( + &mut self, + id: SingleLookupReqId, + peer_id: PeerId, + blobs: Vec>, + import: bool, + ) { + let block_root = blobs.first().unwrap().block_root(); + let block_slot = blobs.first().unwrap().slot(); + self.complete_single_lookup_blob_download(id, peer_id, blobs); + self.expect_block_process(ResponseType::Blob); + self.single_blob_component_processed( + id.lookup_id, + if import { + BlockProcessingResult::Ok(AvailabilityProcessingStatus::Imported(block_root)) + } else { + BlockProcessingResult::Ok(AvailabilityProcessingStatus::MissingComponents( + block_slot, block_root, + )) + }, + ); + } + + fn complete_single_lookup_block_valid(&mut self, block: SignedBeaconBlock, import: bool) { + let block_root = block.canonical_root(); + let block_slot = block.slot(); + let id = self.expect_block_lookup_request(block_root); + self.expect_empty_network(); + let peer_id = self.new_connected_peer(); + self.single_lookup_block_response(id, peer_id, Some(block.into())); + self.single_lookup_block_response(id, peer_id, None); + self.expect_block_process(ResponseType::Block); + let id = self.find_single_lookup_for(block_root); + self.single_block_component_processed( + id, + if import { + BlockProcessingResult::Ok(AvailabilityProcessingStatus::Imported(block_root)) + } else { + BlockProcessingResult::Ok(AvailabilityProcessingStatus::MissingComponents( + block_slot, block_root, + )) + }, + ) + } + fn parent_lookup_failed(&mut self, id: SingleLookupReqId, peer_id: PeerId, error: RPCError) { 
self.send_sync_message(SyncMessage::RpcError { peer_id, @@ -714,6 +781,89 @@ impl TestRig { )); blocks } + + fn insert_block_to_da_checker(&mut self, block: Arc>) { + let state = BeaconState::Base(BeaconStateBase::random_for_test(&mut self.rng)); + let parent_block = self.rand_block(); + let import_data = BlockImportData::::__new_for_test( + block.canonical_root(), + state, + parent_block.into(), + ); + let payload_verification_outcome = PayloadVerificationOutcome { + payload_verification_status: PayloadVerificationStatus::Verified, + is_valid_merge_transition_block: false, + }; + let executed_block = + AvailabilityPendingExecutedBlock::new(block, import_data, payload_verification_outcome); + match self + .harness + .chain + .data_availability_checker + .put_pending_executed_block(executed_block) + .unwrap() + { + Availability::Available(_) => panic!("block removed from da_checker, available"), + Availability::MissingComponents(block_root) => { + self.log(&format!("inserted block to da_checker {block_root:?}")) + } + }; + } + + fn insert_blob_to_da_checker(&mut self, blob: BlobSidecar) { + match self + .harness + .chain + .data_availability_checker + .put_gossip_blob(GossipVerifiedBlob::__assumed_valid(blob.into())) + .unwrap() + { + Availability::Available(_) => panic!("blob removed from da_checker, available"), + Availability::MissingComponents(block_root) => { + self.log(&format!("inserted blob to da_checker {block_root:?}")) + } + }; + } + + fn insert_block_to_processing_cache(&mut self, block: Arc>) { + self.harness + .chain + .reqresp_pre_import_cache + .write() + .insert(block.canonical_root(), block); + } + + fn simulate_block_gossip_processing_becomes_invalid(&mut self, block_root: Hash256) { + self.harness + .chain + .reqresp_pre_import_cache + .write() + .remove(&block_root); + + self.send_sync_message(SyncMessage::GossipBlockProcessResult { + block_root, + imported: false, + }); + } + + fn 
simulate_block_gossip_processing_becomes_valid_missing_components( + &mut self, + block: Arc>, + ) { + let block_root = block.canonical_root(); + self.harness + .chain + .reqresp_pre_import_cache + .write() + .remove(&block_root); + + self.insert_block_to_da_checker(block); + + self.send_sync_message(SyncMessage::GossipBlockProcessResult { + block_root, + imported: false, + }); + } } #[test] @@ -1111,17 +1261,17 @@ fn test_parent_lookup_disconnection_no_peers_left() { } #[test] -fn test_parent_lookup_disconnection_peer_left() { +fn test_lookup_disconnection_peer_left() { let mut rig = TestRig::test_setup(); let peer_ids = (0..2).map(|_| rig.new_connected_peer()).collect::>(); - let trigger_block = rig.rand_block(); + let block_root = Hash256::random(); // lookup should have two peers associated with the same block for peer_id in peer_ids.iter() { - rig.trigger_unknown_parent_block(*peer_id, trigger_block.clone().into()); + rig.trigger_unknown_block_from_attestation(block_root, *peer_id); } // Disconnect the first peer only, which is the one handling the request rig.peer_disconnected(*peer_ids.first().unwrap()); - rig.assert_parent_lookups_count(1); + rig.assert_single_lookups_count(1); } #[test] @@ -1254,6 +1404,87 @@ fn test_same_chain_race_condition() { rig.expect_no_active_lookups(); } +#[test] +fn block_in_da_checker_skips_download() { + let Some(mut r) = TestRig::test_setup_after_deneb() else { + return; + }; + let (block, blobs) = r.rand_block_and_blobs(NumBlobs::Number(1)); + let block_root = block.canonical_root(); + let peer_id = r.new_connected_peer(); + r.insert_block_to_da_checker(block.into()); + r.trigger_unknown_block_from_attestation(block_root, peer_id); + // Should not trigger block request + let id = r.expect_blob_lookup_request(block_root); + r.expect_empty_network(); + // Resolve blob and expect lookup completed + r.complete_single_lookup_blob_lookup_valid(id, peer_id, blobs, true); + r.expect_no_active_lookups(); +} + +#[test] +fn 
block_in_processing_cache_becomes_invalid() { + let Some(mut r) = TestRig::test_setup_after_deneb() else { + return; + }; + let (block, blobs) = r.rand_block_and_blobs(NumBlobs::Number(1)); + let block_root = block.canonical_root(); + let peer_id = r.new_connected_peer(); + r.insert_block_to_processing_cache(block.clone().into()); + r.trigger_unknown_block_from_attestation(block_root, peer_id); + // Should not trigger block request + let id = r.expect_blob_lookup_request(block_root); + r.expect_empty_network(); + // Simulate invalid block, removing it from processing cache + r.simulate_block_gossip_processing_becomes_invalid(block_root); + // Should download and process the block + r.complete_single_lookup_block_valid(block, false); + // Resolve blob and expect lookup completed + r.complete_single_lookup_blob_lookup_valid(id, peer_id, blobs, true); + r.expect_no_active_lookups(); +} + +#[test] +fn block_in_processing_cache_becomes_valid_imported() { + let Some(mut r) = TestRig::test_setup_after_deneb() else { + return; + }; + let (block, blobs) = r.rand_block_and_blobs(NumBlobs::Number(1)); + let block_root = block.canonical_root(); + let peer_id = r.new_connected_peer(); + r.insert_block_to_processing_cache(block.clone().into()); + r.trigger_unknown_block_from_attestation(block_root, peer_id); + // Should not trigger block request + let id = r.expect_blob_lookup_request(block_root); + r.expect_empty_network(); + // Resolve the block from processing step + r.simulate_block_gossip_processing_becomes_valid_missing_components(block.into()); + // Resolve blob and expect lookup completed + r.complete_single_lookup_blob_lookup_valid(id, peer_id, blobs, true); + r.expect_no_active_lookups(); +} + +// IGNORE: wait for change that delays blob fetching to knowing the block +#[ignore] +#[test] +fn blobs_in_da_checker_skip_download() { + let Some(mut r) = TestRig::test_setup_after_deneb() else { + return; + }; + let (block, blobs) = r.rand_block_and_blobs(NumBlobs::Number(1)); 
+ let block_root = block.canonical_root(); + let peer_id = r.new_connected_peer(); + for blob in blobs { + r.insert_blob_to_da_checker(blob); + } + r.trigger_unknown_block_from_attestation(block_root, peer_id); + // Should download and process the block + r.complete_single_lookup_block_valid(block, true); + // Should not trigger blob request + r.expect_empty_network(); + r.expect_no_active_lookups(); +} + mod deneb_only { use super::*; use beacon_chain::{ diff --git a/beacon_node/network/src/sync/manager.rs b/beacon_node/network/src/sync/manager.rs index 56bce7acad..6afaa76da9 100644 --- a/beacon_node/network/src/sync/manager.rs +++ b/beacon_node/network/src/sync/manager.rs @@ -144,6 +144,9 @@ pub enum SyncMessage { process_type: BlockProcessType, result: BlockProcessingResult, }, + + /// A block from gossip has completed processing, + GossipBlockProcessResult { block_root: Hash256, imported: bool }, } /// The type of processing specified for a received block. @@ -153,14 +156,6 @@ pub enum BlockProcessType { SingleBlob { id: Id }, } -impl BlockProcessType { - pub fn id(&self) -> Id { - match self { - BlockProcessType::SingleBlock { id } | BlockProcessType::SingleBlob { id } => *id, - } - } -} - #[derive(Debug)] pub enum BlockProcessingResult { Ok(AvailabilityProcessingStatus), @@ -637,6 +632,14 @@ impl SyncManager { } => self .block_lookups .on_processing_result(process_type, result, &mut self.network), + SyncMessage::GossipBlockProcessResult { + block_root, + imported, + } => self.block_lookups.on_external_processing_result( + block_root, + imported, + &mut self.network, + ), SyncMessage::BatchProcessed { sync_type, result } => match sync_type { ChainSegmentProcessId::RangeBatchId(chain_id, epoch) => { self.range_sync.handle_block_process_result( diff --git a/beacon_node/network/src/sync/network_context.rs b/beacon_node/network/src/sync/network_context.rs index 88495a5b35..cc4d18fd68 100644 --- a/beacon_node/network/src/sync/network_context.rs +++ 
b/beacon_node/network/src/sync/network_context.rs @@ -4,13 +4,13 @@ use self::requests::{ActiveBlobsByRootRequest, ActiveBlocksByRootRequest}; pub use self::requests::{BlobsByRootSingleBlockRequest, BlocksByRootSingleRequest}; use super::block_sidecar_coupling::BlocksAndBlobsRequestInfo; -use super::manager::{BlockProcessType, Id, RequestId as SyncRequestId}; +use super::manager::{Id, RequestId as SyncRequestId}; use super::range_sync::{BatchId, ByRangeRequestType, ChainId}; use crate::network_beacon_processor::NetworkBeaconProcessor; use crate::service::{NetworkMessage, RequestId}; use crate::status::ToStatusMessage; use crate::sync::block_lookups::SingleLookupId; -use crate::sync::manager::SingleLookupReqId; +use crate::sync::manager::{BlockProcessType, SingleLookupReqId}; use beacon_chain::block_verification_types::RpcBlock; use beacon_chain::validator_monitor::timestamp_now; use beacon_chain::{BeaconChain, BeaconChainTypes, EngineState}; @@ -81,6 +81,19 @@ impl From for LookupFailure { } } +pub enum LookupRequestResult { + /// A request is sent. Sync MUST receive an event from the network in the future, for either a + /// completed response or a failed request. + RequestSent, + /// No request is sent, and no further action is necessary to consider this request completed. + NoRequestNeeded, + /// No request is sent, but the request is not completed. Sync MUST receive some future event + /// that makes progress on the request. For example: the request is being processed from a + /// different source (e.g. a block received from gossip) and sync MUST receive an event with + /// that processing result. + Pending, +} + /// Wraps a Network channel to employ various RPC related network functionality for the Sync manager. This includes management of a global RPC request Id. pub struct SyncNetworkContext { /// The network channel to relay messages to the Network service.
@@ -305,14 +318,27 @@ impl SyncNetworkContext { lookup_id: SingleLookupId, peer_id: PeerId, block_root: Hash256, - ) -> Result<bool, &'static str> { + ) -> Result<LookupRequestResult, &'static str> { + // da_checker includes blocks that are execution verified, but are missing components + if self + .chain + .data_availability_checker + .has_execution_valid_block(&block_root) + { + return Ok(LookupRequestResult::NoRequestNeeded); + } + + // reqresp_pre_import_cache includes blocks that may not yet be execution verified if self .chain .reqresp_pre_import_cache .read() .contains_key(&block_root) { - return Ok(false); + // A block is on the `reqresp_pre_import_cache` but NOT in the + // `data_availability_checker` only if it is actively being processed. We can expect a + // future event with the result of processing. + return Ok(LookupRequestResult::Pending); } let id = SingleLookupReqId { @@ -340,7 +366,7 @@ impl SyncNetworkContext { self.blocks_by_root_requests .insert(id, ActiveBlocksByRootRequest::new(request)); - Ok(true) + Ok(LookupRequestResult::RequestSent) } /// Request necessary blobs for `block_root`.
Requests only the necessary blobs by checking: @@ -355,7 +381,7 @@ impl SyncNetworkContext { peer_id: PeerId, block_root: Hash256, downloaded_block_expected_blobs: Option, - ) -> Result { + ) -> Result { let expected_blobs = downloaded_block_expected_blobs .or_else(|| { self.chain @@ -387,7 +413,7 @@ impl SyncNetworkContext { if indices.is_empty() { // No blobs required, do not issue any request - return Ok(false); + return Ok(LookupRequestResult::NoRequestNeeded); } let id = SingleLookupReqId { @@ -419,7 +445,7 @@ impl SyncNetworkContext { self.blobs_by_root_requests .insert(id, ActiveBlobsByRootRequest::new(request)); - Ok(true) + Ok(LookupRequestResult::RequestSent) } pub fn is_execution_engine_online(&self) -> bool { @@ -595,19 +621,19 @@ impl SyncNetworkContext { pub fn send_block_for_processing( &self, + id: Id, block_root: Hash256, block: RpcBlock, duration: Duration, - process_type: BlockProcessType, ) -> Result<(), &'static str> { match self.beacon_processor_if_enabled() { Some(beacon_processor) => { - debug!(self.log, "Sending block for processing"; "block" => ?block_root, "process" => ?process_type); + debug!(self.log, "Sending block for processing"; "block" => ?block_root, "id" => id); if let Err(e) = beacon_processor.send_rpc_beacon_block( block_root, block, duration, - process_type, + BlockProcessType::SingleBlock { id }, ) { error!( self.log, @@ -628,17 +654,20 @@ impl SyncNetworkContext { pub fn send_blobs_for_processing( &self, + id: Id, block_root: Hash256, blobs: FixedBlobSidecarList, duration: Duration, - process_type: BlockProcessType, ) -> Result<(), &'static str> { match self.beacon_processor_if_enabled() { Some(beacon_processor) => { - debug!(self.log, "Sending blobs for processing"; "block" => ?block_root, "process_type" => ?process_type); - if let Err(e) = - beacon_processor.send_rpc_blobs(block_root, blobs, duration, process_type) - { + debug!(self.log, "Sending blobs for processing"; "block" => ?block_root, "id" => id); + if let Err(e) = 
beacon_processor.send_rpc_blobs( + block_root, + blobs, + duration, + BlockProcessType::SingleBlob { id }, + ) { error!( self.log, "Failed to send sync blobs to processor"; From f37ffe4b8d88ad688fb6aff5ca2b03c13a7ee523 Mon Sep 17 00:00:00 2001 From: Lion - dapplion <35266934+dapplion@users.noreply.github.com> Date: Mon, 13 May 2024 18:13:32 +0300 Subject: [PATCH 03/19] Do not request current child lookup peers (#5724) * Do not request current child lookup peers * Update tests --- .../network/src/sync/block_lookups/mod.rs | 5 +- .../sync/block_lookups/single_block_lookup.rs | 13 +- .../network/src/sync/block_lookups/tests.rs | 467 +++++++----------- 3 files changed, 197 insertions(+), 288 deletions(-) diff --git a/beacon_node/network/src/sync/block_lookups/mod.rs b/beacon_node/network/src/sync/block_lookups/mod.rs index dd823a307b..d2a0066c84 100644 --- a/beacon_node/network/src/sync/block_lookups/mod.rs +++ b/beacon_node/network/src/sync/block_lookups/mod.rs @@ -134,7 +134,10 @@ impl BlockLookups { block_root, Some(block_component), Some(parent_root), - &[peer_id], + // On a `UnknownParentBlock` or `UnknownParentBlob` event the peer is not required + // to have the rest of the block components (refer to decoupled blob gossip). Create + // the lookup with zero peers to house the block components. 
+ &[], cx, ); } diff --git a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs index 6ee519b0dd..ec3256ce58 100644 --- a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs +++ b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs @@ -174,10 +174,15 @@ impl SingleBlockLookup { return Err(LookupRequestError::TooManyAttempts { cannot_process }); } - let peer_id = request - .get_state_mut() - .use_rand_available_peer() - .ok_or(LookupRequestError::NoPeers)?; + let Some(peer_id) = request.get_state_mut().use_rand_available_peer() else { + if awaiting_parent { + // Allow lookups awaiting a parent to have zero peers. If, when the parent + // resolves, they still have zero peers, the lookup will fail gracefully. + return Ok(()); + } else { + return Err(LookupRequestError::NoPeers); + } + }; match request.make_request(id, peer_id, downloaded_block_expected_blobs, cx)? { LookupRequestResult::RequestSent => request.get_state_mut().on_download_start()?, diff --git a/beacon_node/network/src/sync/block_lookups/tests.rs b/beacon_node/network/src/sync/block_lookups/tests.rs index 761e54144d..619db0469b 100644 --- a/beacon_node/network/src/sync/block_lookups/tests.rs +++ b/beacon_node/network/src/sync/block_lookups/tests.rs @@ -67,6 +67,7 @@ type T = Witness, E, MemoryStore, Memo struct TestRig { /// Receiver for `BeaconProcessor` events (e.g. block processing results). beacon_processor_rx: mpsc::Receiver>, + beacon_processor_rx_queue: Vec>, /// Receiver for `NetworkMessage` (e.g. outgoing RPC requests from sync) network_rx: mpsc::UnboundedReceiver>, /// Stores all `NetworkMessage`s received from `network_recv`. (e.g.
outgoing RPC requests) @@ -127,6 +128,7 @@ impl TestRig { let rng = XorShiftRng::from_seed([42; 16]); TestRig { beacon_processor_rx, + beacon_processor_rx_queue: vec![], network_rx, network_rx_queue: vec![], rng, @@ -285,15 +287,6 @@ impl TestRig { self.expect_no_active_single_lookups(); } - fn expect_lookups(&self, expected_block_roots: &[Hash256]) { - let block_roots = self - .active_single_lookups() - .iter() - .map(|(_, b, _)| *b) - .collect::>(); - assert_eq!(&block_roots, expected_block_roots); - } - fn new_connected_peer(&mut self) -> PeerId { let peer_id = PeerId::random(); self.network_globals @@ -544,6 +537,12 @@ impl TestRig { } } + fn drain_processor_rx(&mut self) { + while let Ok(event) = self.beacon_processor_rx.try_recv() { + self.beacon_processor_rx_queue.push(event); + } + } + fn pop_received_network_event) -> Option>( &mut self, predicate_transform: F, @@ -564,8 +563,34 @@ impl TestRig { } } - #[track_caller] - fn expect_block_lookup_request(&mut self, for_block: Hash256) -> SingleLookupReqId { + fn pop_received_processor_event) -> Option>( + &mut self, + predicate_transform: F, + ) -> Result { + self.drain_processor_rx(); + + if let Some(index) = self + .beacon_processor_rx_queue + .iter() + .position(|x| predicate_transform(x).is_some()) + { + // Transform the item, knowing that it won't be None because we checked it in the position predicate. 
+ let transformed = predicate_transform(&self.beacon_processor_rx_queue[index]).unwrap(); + self.beacon_processor_rx_queue.remove(index); + Ok(transformed) + } else { + Err(format!( + "current processor messages {:?}", + self.beacon_processor_rx_queue + ) + .to_string()) + } + } + + fn find_block_lookup_request( + &mut self, + for_block: Hash256, + ) -> Result { self.pop_received_network_event(|ev| match ev { NetworkMessage::SendRequest { peer_id: _, @@ -574,11 +599,18 @@ impl TestRig { } if request.block_roots().to_vec().contains(&for_block) => Some(*id), _ => None, }) - .unwrap_or_else(|e| panic!("Expected block request for {for_block:?}: {e}")) } #[track_caller] - fn expect_blob_lookup_request(&mut self, for_block: Hash256) -> SingleLookupReqId { + fn expect_block_lookup_request(&mut self, for_block: Hash256) -> SingleLookupReqId { + self.find_block_lookup_request(for_block) + .unwrap_or_else(|e| panic!("Expected block request for {for_block:?}: {e}")) + } + + fn find_blob_lookup_request( + &mut self, + for_block: Hash256, + ) -> Result { self.pop_received_network_event(|ev| match ev { NetworkMessage::SendRequest { peer_id: _, @@ -594,7 +626,12 @@ impl TestRig { } _ => None, }) - .unwrap_or_else(|e| panic!("Expected blob request for {for_block:?}: {e}")) + } + + #[track_caller] + fn expect_blob_lookup_request(&mut self, for_block: Hash256) -> SingleLookupReqId { + self.find_blob_lookup_request(for_block) + .unwrap_or_else(|e| panic!("Expected blob request for {for_block:?}: {e}")) } #[track_caller] @@ -610,6 +647,15 @@ impl TestRig { .unwrap_or_else(|e| panic!("Expected block parent request for {for_block:?}: {e}")) } + fn expect_no_requests_for(&mut self, block_root: Hash256) { + if let Ok(request) = self.find_block_lookup_request(block_root) { + panic!("Expected no block request for {block_root:?} found {request:?}"); + } + if let Ok(request) = self.find_blob_lookup_request(block_root) { + panic!("Expected no blob request for {block_root:?} found 
{request:?}"); + } + } + #[track_caller] fn expect_blob_parent_request(&mut self, for_block: Hash256) -> SingleLookupReqId { self.pop_received_network_event(|ev| match ev { @@ -653,18 +699,16 @@ impl TestRig { #[track_caller] fn expect_block_process(&mut self, response_type: ResponseType) { match response_type { - ResponseType::Block => match self.beacon_processor_rx.try_recv() { - Ok(work) => { - assert_eq!(work.work_type(), beacon_processor::RPC_BLOCK); - } - other => panic!("Expected block process, found {:?}", other), - }, - ResponseType::Blob => match self.beacon_processor_rx.try_recv() { - Ok(work) => { - assert_eq!(work.work_type(), beacon_processor::RPC_BLOBS); - } - other => panic!("Expected blob process, found {:?}", other), - }, + ResponseType::Block => self + .pop_received_processor_event(|ev| { + (ev.work_type() == beacon_processor::RPC_BLOCK).then_some(()) + }) + .unwrap_or_else(|e| panic!("Expected block work event: {e}")), + ResponseType::Blob => self + .pop_received_processor_event(|ev| { + (ev.work_type() == beacon_processor::RPC_BLOBS).then_some(()) + }) + .unwrap_or_else(|e| panic!("Expected blobs work event: {e}")), } } @@ -907,22 +951,29 @@ fn test_single_block_lookup_happy_path() { rig.expect_no_active_lookups(); } +// Tests that if a peer does not respond with a block, we downscore and retry the block only #[test] fn test_single_block_lookup_empty_response() { - let mut rig = TestRig::test_setup(); + let mut r = TestRig::test_setup(); - let block_hash = Hash256::random(); - let peer_id = rig.new_connected_peer(); + let block = r.rand_block(); + let block_root = block.canonical_root(); + let peer_id = r.new_connected_peer(); // Trigger the request - rig.trigger_unknown_block_from_attestation(block_hash, peer_id); - let id = rig.expect_lookup_request_block_and_blobs(block_hash); + r.trigger_unknown_block_from_attestation(block_root, peer_id); + let id = r.expect_lookup_request_block_and_blobs(block_root); // The peer does not have the block. 
It should be penalized. - rig.single_lookup_block_response(id, peer_id, None); - rig.expect_penalty(peer_id, "NoResponseReturned"); - - rig.expect_block_lookup_request(block_hash); // it should be retried + r.single_lookup_block_response(id, peer_id, None); + r.expect_penalty(peer_id, "NoResponseReturned"); + // it should be retried + let id = r.expect_block_lookup_request(block_root); + // Send the right block this time. + r.single_lookup_block_response(id, peer_id, Some(block.into())); + r.expect_block_process(ResponseType::Block); + r.single_block_component_processed_imported(block_root); + r.expect_no_active_lookups(); } #[test] @@ -1014,6 +1065,8 @@ fn test_parent_lookup_happy_path() { rig.expect_block_process(ResponseType::Block); rig.expect_empty_network(); + // Add peer to child lookup to prevent it being dropped + rig.trigger_unknown_block_from_attestation(block_root, peer_id); // Processing succeeds, now the rest of the chain should be sent for processing. rig.parent_block_processed( block_root, @@ -1049,6 +1102,8 @@ fn test_parent_lookup_wrong_response() { rig.parent_lookup_block_response(id2, peer_id, Some(parent.into())); rig.expect_block_process(ResponseType::Block); + // Add peer to child lookup to prevent it being dropped + rig.trigger_unknown_block_from_attestation(block_root, peer_id); // Processing succeeds, now the rest of the chain should be sent for processing. 
rig.parent_block_processed_imported(block_root); rig.expect_parent_chain_process(); @@ -1056,33 +1111,6 @@ fn test_parent_lookup_wrong_response() { rig.expect_no_active_lookups(); } -#[test] -fn test_parent_lookup_empty_response() { - let mut rig = TestRig::test_setup(); - - let (parent, block, parent_root, block_root) = rig.rand_block_and_parent(); - let peer_id = rig.new_connected_peer(); - - // Trigger the request - rig.trigger_unknown_parent_block(peer_id, block.into()); - let id1 = rig.expect_parent_request_block_and_blobs(parent_root); - - // Peer sends an empty response, peer should be penalized and the block re-requested. - rig.parent_lookup_block_response(id1, peer_id, None); - rig.expect_penalty(peer_id, "NoResponseReturned"); - let id2 = rig.expect_block_parent_request(parent_root); - - // Send the right block this time. - rig.parent_lookup_block_response(id2, peer_id, Some(parent.into())); - rig.expect_block_process(ResponseType::Block); - - // Processing succeeds, now the rest of the chain should be sent for processing. - rig.parent_block_processed_imported(block_root); - - rig.single_block_component_processed_imported(block_root); - rig.expect_no_active_lookups(); -} - #[test] fn test_parent_lookup_rpc_failure() { let mut rig = TestRig::test_setup(); @@ -1102,6 +1130,8 @@ fn test_parent_lookup_rpc_failure() { rig.parent_lookup_block_response(id2, peer_id, Some(parent.into())); rig.expect_block_process(ResponseType::Block); + // Add peer to child lookup to prevent it being dropped + rig.trigger_unknown_block_from_attestation(block_root, peer_id); // Processing succeeds, now the rest of the chain should be sent for processing. 
rig.parent_block_processed_imported(block_root); rig.expect_parent_chain_process(); @@ -1287,19 +1317,6 @@ fn test_skip_creating_failed_parent_lookup() { rig.expect_no_active_lookups(); } -#[test] -fn test_skip_creating_failed_current_lookup() { - let mut rig = TestRig::test_setup(); - let (_, block, parent_root, block_root) = rig.rand_block_and_parent(); - let peer_id = rig.new_connected_peer(); - rig.insert_failed_chain(block_root); - rig.trigger_unknown_parent_block(peer_id, block.into()); - // Expect single penalty for peer - rig.expect_single_penalty(peer_id, "failed_chain"); - // Only the current lookup should be rejected - rig.expect_lookups(&[parent_root]); -} - #[test] fn test_single_block_lookup_ignored_response() { let mut rig = TestRig::test_setup(); @@ -1396,7 +1413,10 @@ fn test_same_chain_race_condition() { rig.trigger_unknown_parent_block(peer_id, trigger_block.clone()); rig.expect_empty_network(); - // Processing succeeds, now the rest of the chain should be sent for processing. 
+ // Add a peer to the tip child lookup which has zero peers + rig.trigger_unknown_block_from_attestation(trigger_block.canonical_root(), peer_id); + + rig.log("Processing succeeds, now the rest of the chain should be sent for processing."); for block in blocks.iter().skip(1).chain(&[trigger_block]) { rig.expect_parent_chain_process(); rig.single_block_component_processed_imported(block.canonical_root()); @@ -1497,6 +1517,7 @@ mod deneb_only { rig: TestRig, block: Arc>, blobs: Vec>>, + parent_block_roots: Vec, parent_block: VecDeque>>, parent_blobs: VecDeque>>>, unknown_parent_block: Option>>, @@ -1512,16 +1533,16 @@ mod deneb_only { enum RequestTrigger { AttestationUnknownBlock, - GossipUnknownParentBlock { num_parents: usize }, - GossipUnknownParentBlob { num_parents: usize }, + GossipUnknownParentBlock(usize), + GossipUnknownParentBlob(usize), } impl RequestTrigger { fn num_parents(&self) -> usize { match self { RequestTrigger::AttestationUnknownBlock => 0, - RequestTrigger::GossipUnknownParentBlock { num_parents } => *num_parents, - RequestTrigger::GossipUnknownParentBlob { num_parents } => *num_parents, + RequestTrigger::GossipUnknownParentBlock(num_parents) => *num_parents, + RequestTrigger::GossipUnknownParentBlob(num_parents) => *num_parents, } } } @@ -1539,6 +1560,7 @@ mod deneb_only { let num_parents = request_trigger.num_parents(); let mut parent_block_chain = VecDeque::with_capacity(num_parents); let mut parent_blobs_chain = VecDeque::with_capacity(num_parents); + let mut parent_block_roots = vec![]; for _ in 0..num_parents { // Set the current block as the parent. let parent_root = block.canonical_root(); @@ -1546,6 +1568,7 @@ mod deneb_only { let parent_blobs = blobs.clone(); parent_block_chain.push_front(parent_block); parent_blobs_chain.push_front(parent_blobs); + parent_block_roots.push(parent_root); // Create the next block. 
let (child_block, child_blobs) = @@ -1580,13 +1603,12 @@ mod deneb_only { )); let parent_root = block.parent_root(); - let blob_req_id = rig.expect_blob_lookup_request(block_root); let parent_block_req_id = rig.expect_block_parent_request(parent_root); let parent_blob_req_id = rig.expect_blob_parent_request(parent_root); rig.expect_empty_network(); // expect no more requests ( None, - Some(blob_req_id), + None, Some(parent_block_req_id), Some(parent_blob_req_id), ) @@ -1596,14 +1618,12 @@ mod deneb_only { let parent_root = single_blob.block_parent_root(); rig.send_sync_message(SyncMessage::UnknownParentBlob(peer_id, single_blob)); - let block_req_id = rig.expect_block_lookup_request(block_root); - let blobs_req_id = rig.expect_blob_lookup_request(block_root); let parent_block_req_id = rig.expect_block_parent_request(parent_root); let parent_blob_req_id = rig.expect_blob_parent_request(parent_root); rig.expect_empty_network(); // expect no more requests ( - Some(block_req_id), - Some(blobs_req_id), + None, + None, Some(parent_block_req_id), Some(parent_blob_req_id), ) @@ -1616,6 +1636,7 @@ mod deneb_only { blobs, parent_block: parent_block_chain, parent_blobs: parent_blobs_chain, + parent_block_roots, unknown_parent_block: None, unknown_parent_blobs: None, peer_id, @@ -1633,6 +1654,13 @@ mod deneb_only { self } + fn trigger_unknown_block_from_attestation(mut self) -> Self { + let block_root = self.block.canonical_root(); + self.rig + .trigger_unknown_block_from_attestation(block_root, self.peer_id); + self + } + fn parent_block_response(mut self) -> Self { self.rig.expect_empty_network(); let block = self.parent_block.pop_front().unwrap().clone(); @@ -1743,15 +1771,6 @@ mod deneb_only { self } - fn empty_parent_block_response(mut self) -> Self { - self.rig.parent_lookup_block_response( - self.parent_block_req_id.expect("block request id"), - self.peer_id, - None, - ); - self - } - fn empty_parent_blobs_response(mut self) -> Self { 
self.rig.parent_lookup_blob_response( self.parent_blob_req_id.expect("blob request id"), @@ -1800,23 +1819,28 @@ mod deneb_only { } fn parent_block_imported(mut self) -> Self { - self.rig.log("parent_block_imported"); + let parent_root = *self.parent_block_roots.first().unwrap(); + self.rig + .log(&format!("parent_block_imported {parent_root:?}")); self.rig.parent_block_processed( self.block_root, - BlockProcessingResult::Ok(AvailabilityProcessingStatus::Imported(self.block_root)), + BlockProcessingResult::Ok(AvailabilityProcessingStatus::Imported(parent_root)), ); - self.rig.expect_empty_network(); + self.rig.expect_no_requests_for(parent_root); self.rig.assert_parent_lookups_count(0); self } fn parent_blob_imported(mut self) -> Self { - self.rig.log("parent_blob_imported"); + let parent_root = *self.parent_block_roots.first().unwrap(); + self.rig + .log(&format!("parent_blob_imported {parent_root:?}")); self.rig.parent_blob_processed( self.block_root, - BlockProcessingResult::Ok(AvailabilityProcessingStatus::Imported(self.block_root)), + BlockProcessingResult::Ok(AvailabilityProcessingStatus::Imported(parent_root)), ); - self.rig.expect_empty_network(); + + self.rig.expect_no_requests_for(parent_root); self.rig.assert_parent_lookups_count(0); self } @@ -1914,6 +1938,34 @@ mod deneb_only { self } + fn complete_current_block_and_blobs_lookup(self) -> Self { + self.expect_block_request() + .expect_blobs_request() + .block_response() + .blobs_response() + // TODO: Should send blobs for processing + .expect_block_process() + .block_imported() + } + + fn empty_parent_blobs_then_parent_block(self) -> Self { + self.log( + " Return empty blobs for parent, block errors with missing components, downscore", + ) + .empty_parent_blobs_response() + .expect_no_penalty_and_no_requests() + .parent_block_response() + .parent_block_missing_components() + .expect_penalty("sent_incomplete_blobs") + .log("Re-request parent blobs, succeed and import parent") + 
.expect_parent_blobs_request() + .parent_blob_response() + .expect_block_process() + // Insert new peer into child request before completing parent + .trigger_unknown_block_from_attestation() + .parent_blob_imported() + } + fn expect_penalty(mut self, expect_penalty_msg: &'static str) -> Self { self.rig.expect_penalty(self.peer_id, expect_penalty_msg); self @@ -1971,10 +2023,6 @@ mod deneb_only { self.blobs.push(first_blob); self } - fn expect_parent_chain_process(mut self) -> Self { - self.rig.expect_parent_chain_process(); - self - } fn expect_block_process(mut self) -> Self { self.rig.expect_block_process(ResponseType::Block); self @@ -1995,7 +2043,6 @@ mod deneb_only { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester .block_response_triggering_process() .blobs_response() @@ -2009,7 +2056,6 @@ mod deneb_only { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester .blobs_response() // hold blobs for processing .block_response_triggering_process() @@ -2023,7 +2069,6 @@ mod deneb_only { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester .empty_block_response() .expect_penalty("NoResponseReturned") @@ -2075,7 +2120,6 @@ mod deneb_only { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester .block_response_triggering_process() .invalid_block_processed() @@ -2092,7 +2136,6 @@ mod deneb_only { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester .block_response_triggering_process() .missing_components_from_block_request() @@ -2108,7 +2151,6 @@ mod deneb_only { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester .block_response_triggering_process() .missing_components_from_block_request() @@ -2125,7 +2167,6 @@ mod deneb_only { let Some(tester) = 
DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester .block_response_triggering_process() .invalidate_blobs_too_many() @@ -2140,7 +2181,6 @@ mod deneb_only { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester .invalidate_blobs_too_few() .blobs_response() // blobs are not sent until the block is processed @@ -2153,7 +2193,6 @@ mod deneb_only { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester .invalidate_blobs_too_many() .blobs_response() @@ -2163,16 +2202,13 @@ mod deneb_only { .block_response_triggering_process(); } + // Test peer returning block that has unknown parent, and a new lookup is created #[test] fn parent_block_unknown_parent() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlock { num_parents: 1 }) - else { + let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlock(1)) else { return; }; - tester - .blobs_response() .expect_empty_beacon_processor() .parent_block_response() .parent_blob_response() @@ -2183,17 +2219,13 @@ mod deneb_only { .expect_empty_beacon_processor(); } + // Test peer returning invalid (processing) block, expect retry #[test] fn parent_block_invalid_parent() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlock { num_parents: 1 }) - else { + let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlock(1)) else { return; }; - tester - .blobs_response() - .expect_empty_beacon_processor() .parent_block_response() .parent_blob_response() .expect_block_process() @@ -2203,99 +2235,44 @@ mod deneb_only { .expect_empty_beacon_processor(); } + // Tests that if a peer does not respond with a block, we downscore and retry the block only #[test] - fn parent_block_and_blob_lookup_parent_returned_first() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlock { num_parents: 1 }) - else { - 
return; - }; - - tester - .parent_block_response() - .parent_blob_response() - .expect_block_process() - .parent_block_imported() - .blobs_response() - .expect_parent_chain_process(); - } - - #[test] - fn parent_block_and_blob_lookup_child_returned_first() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlock { num_parents: 1 }) - else { - return; - }; - - tester - .blobs_response() - .expect_no_penalty_and_no_requests() - .parent_block_response() - .parent_blob_response() - .expect_block_process() - .parent_block_imported() - .expect_parent_chain_process(); - } - - #[test] - fn empty_parent_block_then_parent_blob() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlock { num_parents: 1 }) - else { + fn empty_block_is_retried() { + let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester - .empty_parent_block_response() + .empty_block_response() .expect_penalty("NoResponseReturned") - .expect_parent_block_request() + .expect_block_request() .expect_no_blobs_request() - .parent_blob_response() - .expect_empty_beacon_processor() - .parent_block_response() - .expect_block_process() - .parent_block_imported() + .block_response() .blobs_response() - .expect_parent_chain_process(); + .block_imported() + .expect_no_active_lookups(); } #[test] fn empty_parent_blobs_then_parent_block() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlock { num_parents: 1 }) - else { + let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlock(1)) else { return; }; - tester - .blobs_response() - .log(" Return empty blobs for parent, block errors with missing components, downscore") - .empty_parent_blobs_response() - .expect_no_penalty_and_no_requests() - .parent_block_response() - .parent_block_missing_components() - .expect_penalty("sent_incomplete_blobs") - .log("Re-request parent blobs, succeed and import parent") - 
.expect_parent_blobs_request() - .parent_blob_response() - .expect_block_process() - .parent_blob_imported() + .empty_parent_blobs_then_parent_block() .log("resolve original block trigger blobs request and import") + // Should not have block request, it is cached + .expect_blobs_request() + // TODO: Should send blobs for processing .block_imported() .expect_no_active_lookups(); } #[test] fn parent_blob_unknown_parent() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlob { num_parents: 1 }) - else { + let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlob(1)) else { return; }; - tester - .block_response() .expect_empty_beacon_processor() .parent_block_response() .parent_blob_response() @@ -2308,14 +2285,10 @@ mod deneb_only { #[test] fn parent_blob_invalid_parent() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlob { num_parents: 1 }) - else { + let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlob(1)) else { return; }; - tester - .block_response() .expect_empty_beacon_processor() .parent_block_response() .parent_blob_response() @@ -2329,106 +2302,37 @@ mod deneb_only { #[test] fn parent_block_and_blob_lookup_parent_returned_first_blob_trigger() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlob { num_parents: 1 }) - else { - return; - }; - - tester - .parent_block_response() - .parent_blob_response() - .expect_block_process() - .parent_block_imported() - .block_response() - .blobs_response() - .expect_parent_chain_process() - .block_imported() - .expect_no_active_lookups(); - } - - #[test] - fn parent_block_and_blob_lookup_child_returned_first_blob_trigger() { - let Some(tester) = - DenebTester::new(RequestTrigger::GossipUnknownParentBlob { num_parents: 1 }) - else { + let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlob(1)) else { return; }; - tester - .block_response() - .expect_no_penalty_and_no_requests() 
             .parent_block_response()
             .parent_blob_response()
             .expect_block_process()
+            .trigger_unknown_block_from_attestation()
             .parent_block_imported()
-            .blobs_response()
-            .expect_parent_chain_process()
-            .block_imported()
-            .expect_no_active_lookups();
-    }
-
-    #[test]
-    fn empty_parent_block_then_parent_blob_blob_trigger() {
-        let Some(tester) =
-            DenebTester::new(RequestTrigger::GossipUnknownParentBlob { num_parents: 1 })
-        else {
-            return;
-        };
-
-        tester
-            .empty_parent_block_response()
-            .expect_penalty("NoResponseReturned")
-            .expect_parent_block_request()
-            .expect_no_blobs_request()
-            .parent_blob_response()
-            .expect_empty_beacon_processor()
-            .parent_block_response()
-            .expect_block_process()
-            .parent_block_imported()
-            .blobs_response()
-            .block_response()
-            .block_imported()
+            .complete_current_block_and_blobs_lookup()
             .expect_no_active_lookups();
     }
 
     #[test]
     fn empty_parent_blobs_then_parent_block_blob_trigger() {
-        let Some(tester) =
-            DenebTester::new(RequestTrigger::GossipUnknownParentBlob { num_parents: 1 })
-        else {
+        let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlob(1)) else {
             return;
         };
-
         tester
-            .block_response()
-            .log(" Return empty blobs for parent, block errors with missing components, downscore")
-            .empty_parent_blobs_response()
-            .expect_no_penalty_and_no_requests()
-            .parent_block_response()
-            .parent_block_missing_components()
-            .expect_penalty("sent_incomplete_blobs")
-            .log("Re-request parent blobs, succeed and import parent")
-            .expect_parent_blobs_request()
-            .parent_blob_response()
-            .expect_block_process()
-            .parent_blob_imported()
+            .empty_parent_blobs_then_parent_block()
             .log("resolve original block trigger blobs request and import")
-            .blobs_response()
-            .block_imported()
+            .complete_current_block_and_blobs_lookup()
             .expect_no_active_lookups();
     }
 
     #[test]
     fn parent_blob_unknown_parent_chain() {
-        let Some(tester) =
-            DenebTester::new(RequestTrigger::GossipUnknownParentBlob { num_parents: 2 })
-        else {
+        let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlob(2)) else {
            return;
         };
-
         tester
-            .block_response()
             .expect_empty_beacon_processor()
             .parent_block_response()
             .parent_blob_response()
@@ -2446,12 +2350,9 @@ mod deneb_only {
 
     #[test]
     fn unknown_parent_block_dup() {
-        let Some(tester) =
-            DenebTester::new(RequestTrigger::GossipUnknownParentBlock { num_parents: 1 })
-        else {
+        let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlock(1)) else {
             return;
         };
-
         tester
             .search_parent_dup()
             .expect_no_blobs_request()
@@ -2460,18 +2361,18 @@ mod deneb_only {
 
     #[test]
     fn unknown_parent_blob_dup() {
-        let Some(tester) =
-            DenebTester::new(RequestTrigger::GossipUnknownParentBlob { num_parents: 1 })
-        else {
+        let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlob(1)) else {
             return;
         };
-
         tester
             .search_parent_dup()
             .expect_no_blobs_request()
             .expect_no_block_request();
     }
 
+    // This test no longer applies, we don't issue requests for child lookups
+    // Keep for after updating rules on fetching blocks only first
+    #[ignore]
     #[test]
     fn no_peer_penalty_when_rpc_response_already_known_from_gossip() {
         let Some(mut r) = TestRig::test_setup_after_deneb() else {

From ce66ab374e14aa234bdb83b0747647f6affad6cb Mon Sep 17 00:00:00 2001
From: Lion - dapplion <35266934+dapplion@users.noreply.github.com>
Date: Tue, 14 May 2024 13:12:48 +0300
Subject: [PATCH 04/19] Enforce sync lookup receives a single result (#5777)

* Enforce sync lookup receives a single result
---
 .../network/src/sync/block_lookups/mod.rs     | 33 ++++++++--------
 .../sync/block_lookups/single_block_lookup.rs | 38 +++++++++++++++----
 beacon_node/network/src/sync/manager.rs       |  4 +-
 .../network/src/sync/network_context.rs       | 21 +++++-----
 4 files changed, 60 insertions(+), 36 deletions(-)

diff --git a/beacon_node/network/src/sync/block_lookups/mod.rs b/beacon_node/network/src/sync/block_lookups/mod.rs
index d2a0066c84..60126818b6 100644
--- a/beacon_node/network/src/sync/block_lookups/mod.rs
+++ b/beacon_node/network/src/sync/block_lookups/mod.rs
@@ -6,7 +6,7 @@ use super::network_context::{RpcProcessingResult, SyncNetworkContext};
 use crate::metrics;
 use crate::sync::block_lookups::common::{ResponseType, PARENT_DEPTH_TOLERANCE};
 use crate::sync::block_lookups::parent_chain::find_oldest_fork_ancestor;
-use crate::sync::manager::Id;
+use crate::sync::manager::{Id, SingleLookupReqId};
 use crate::sync::network_context::LookupFailure;
 use beacon_chain::block_verification_types::AsBlock;
 use beacon_chain::data_availability_checker::AvailabilityCheckErrorCategory;
@@ -308,19 +308,19 @@ impl BlockLookups {
     /// Process a block or blob response received from a single lookup request.
     pub fn on_download_response<R: RequestState<T>>(
         &mut self,
-        id: SingleLookupId,
+        id: SingleLookupReqId,
         peer_id: PeerId,
         response: RpcProcessingResult,
         cx: &mut SyncNetworkContext,
     ) {
         let result = self.on_download_response_inner::<R>(id, peer_id, response, cx);
-        self.on_lookup_result(id, result, "download_response", cx);
+        self.on_lookup_result(id.lookup_id, result, "download_response", cx);
     }
 
     /// Process a block or blob response received from a single lookup request.
     pub fn on_download_response_inner<R: RequestState<T>>(
         &mut self,
-        id: SingleLookupId,
+        id: SingleLookupReqId,
         peer_id: PeerId,
         response: RpcProcessingResult,
         cx: &mut SyncNetworkContext,
@@ -333,10 +333,10 @@ impl BlockLookups {
         }
 
         let response_type = R::response_type();
-        let Some(lookup) = self.single_block_lookups.get_mut(&id) else {
+        let Some(lookup) = self.single_block_lookups.get_mut(&id.lookup_id) else {
             // We don't have the ability to cancel in-flight RPC requests. So this can happen
             // if we started this RPC request, and later saw the block/blobs via gossip.
-            debug!(self.log, "Block returned for single block lookup not present"; "id" => id);
+            debug!(self.log, "Block returned for single block lookup not present"; "id" => ?id);
             return Err(LookupRequestError::UnknownLookup);
         };
@@ -348,7 +348,7 @@ impl BlockLookups {
                 debug!(self.log,
                     "Received lookup download success";
                     "block_root" => ?block_root,
-                    "id" => id,
+                    "id" => ?id,
                     "peer_id" => %peer_id,
                     "response_type" => ?response_type,
                 );
@@ -356,25 +356,28 @@ impl BlockLookups {
                 // Register the download peer here. Once we have received some data over the wire we
                 // attribute it to this peer for scoring latter regardless of how the request was
                 // done.
-                request_state.on_download_success(DownloadResult {
-                    value: response,
-                    block_root,
-                    seen_timestamp,
-                    peer_id,
-                })?;
+                request_state.on_download_success(
+                    id.req_id,
+                    DownloadResult {
+                        value: response,
+                        block_root,
+                        seen_timestamp,
+                        peer_id,
+                    },
+                )?;
                 // continue_request will send for processing as the request state is AwaitingProcessing
             }
             Err(e) => {
                 debug!(self.log,
                     "Received lookup download failure";
                     "block_root" => ?block_root,
-                    "id" => id,
+                    "id" => ?id,
                     "peer_id" => %peer_id,
                     "response_type" => ?response_type,
                     "error" => %e,
                 );
-                request_state.on_download_failure()?;
+                request_state.on_download_failure(id.req_id)?;
                 // continue_request will retry a download as the request state is AwaitingDownload
             }
         }
diff --git a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs
index ec3256ce58..6804798dc9 100644
--- a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs
+++ b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs
@@ -2,7 +2,7 @@ use super::common::ResponseType;
 use super::{BlockComponent, PeerId, SINGLE_BLOCK_LOOKUP_MAX_ATTEMPTS};
 use crate::sync::block_lookups::common::RequestState;
 use crate::sync::block_lookups::Id;
-use crate::sync::network_context::{LookupRequestResult, SyncNetworkContext};
+use crate::sync::network_context::{LookupRequestResult, ReqId, SyncNetworkContext};
 use beacon_chain::BeaconChainTypes;
 use itertools::Itertools;
 use rand::seq::IteratorRandom;
@@ -41,6 +41,13 @@ pub enum LookupRequestError {
     Failed,
     /// Attempted to retrieve a not known lookup id
     UnknownLookup,
+    /// Received a download result for a different request id than the in-flight request.
+    /// There should only exist a single request at a time. Having multiple requests is a bug and
+    /// can result in undefined state, so it's treated as a hard error and the lookup is dropped.
+    UnexpectedRequestId {
+        expected_req_id: ReqId,
+        req_id: ReqId,
+    },
 }
 
 pub struct SingleBlockLookup {
@@ -185,7 +192,9 @@ impl SingleBlockLookup {
         };
 
         match request.make_request(id, peer_id, downloaded_block_expected_blobs, cx)? {
-            LookupRequestResult::RequestSent => request.get_state_mut().on_download_start()?,
+            LookupRequestResult::RequestSent(req_id) => {
+                request.get_state_mut().on_download_start(req_id)?
+            }
             LookupRequestResult::NoRequestNeeded => {
                 request.get_state_mut().on_completed_request()?
             }
@@ -272,7 +281,7 @@ pub struct DownloadResult {
 #[derive(Debug, PartialEq, Eq, IntoStaticStr)]
 pub enum State {
     AwaitingDownload,
-    Downloading,
+    Downloading(ReqId),
     AwaitingProcess(DownloadResult),
     /// Request is processing, sent by lookup sync
     Processing(DownloadResult),
@@ -355,10 +364,10 @@ impl SingleLookupRequestState {
     }
 
     /// Switch to `Downloading` if the request is in `AwaitingDownload` state, otherwise returns None.
-    pub fn on_download_start(&mut self) -> Result<(), LookupRequestError> {
+    pub fn on_download_start(&mut self, req_id: ReqId) -> Result<(), LookupRequestError> {
         match &self.state {
             State::AwaitingDownload => {
-                self.state = State::Downloading;
+                self.state = State::Downloading(req_id);
                 Ok(())
             }
             other => Err(LookupRequestError::BadState(format!(
@@ -369,9 +378,15 @@
     /// Registers a failure in downloading a block. This might be a peer disconnection or a wrong
     /// block.
-    pub fn on_download_failure(&mut self) -> Result<(), LookupRequestError> {
+    pub fn on_download_failure(&mut self, req_id: ReqId) -> Result<(), LookupRequestError> {
         match &self.state {
-            State::Downloading => {
+            State::Downloading(expected_req_id) => {
+                if req_id != *expected_req_id {
+                    return Err(LookupRequestError::UnexpectedRequestId {
+                        expected_req_id: *expected_req_id,
+                        req_id,
+                    });
+                }
                 self.failed_downloading = self.failed_downloading.saturating_add(1);
                 self.state = State::AwaitingDownload;
                 Ok(())
@@ -384,10 +399,17 @@
 
     pub fn on_download_success(
         &mut self,
+        req_id: ReqId,
         result: DownloadResult,
     ) -> Result<(), LookupRequestError> {
         match &self.state {
-            State::Downloading => {
+            State::Downloading(expected_req_id) => {
+                if req_id != *expected_req_id {
+                    return Err(LookupRequestError::UnexpectedRequestId {
+                        expected_req_id: *expected_req_id,
+                        req_id,
+                    });
+                }
                 self.state = State::AwaitingProcess(result);
                 Ok(())
             }
diff --git a/beacon_node/network/src/sync/manager.rs b/beacon_node/network/src/sync/manager.rs
index 6afaa76da9..4a4d70090e 100644
--- a/beacon_node/network/src/sync/manager.rs
+++ b/beacon_node/network/src/sync/manager.rs
@@ -819,7 +819,7 @@ impl SyncManager {
                 if let Some(resp) = self.network.on_single_block_response(id, block) {
                     self.block_lookups
                         .on_download_response::<BlockRequestState<T::EthSpec>>(
-                            id.lookup_id,
+                            id,
                             peer_id,
                             resp,
                             &mut self.network,
@@ -861,7 +861,7 @@ impl SyncManager {
                 if let Some(resp) = self.network.on_single_blob_response(id, blob) {
                     self.block_lookups
                         .on_download_response::<BlobRequestState<T::EthSpec>>(
-                            id.lookup_id,
+                            id,
                             peer_id,
                             resp,
                             &mut self.network,
diff --git a/beacon_node/network/src/sync/network_context.rs b/beacon_node/network/src/sync/network_context.rs
index cc4d18fd68..44fb69d9b2 100644
--- a/beacon_node/network/src/sync/network_context.rs
+++ b/beacon_node/network/src/sync/network_context.rs
@@ -81,10 +81,13 @@ impl From for LookupFailure {
     }
 }
 
+/// Sequential ID that uniquely identifies ReqResp outgoing requests
+pub type ReqId = u32;
+
 pub enum LookupRequestResult {
     /// A request is sent. Sync MUST receive an event from the network in the future for either:
     /// completed response or failed request
-    RequestSent,
+    RequestSent(ReqId),
     /// No request is sent, and no further action is necessary to consider this request completed
     NoRequestNeeded,
     /// No request is sent, but the request is not completed. Sync MUST receive some future event
@@ -341,10 +344,8 @@ impl SyncNetworkContext {
             return Ok(LookupRequestResult::Pending);
         }
 
-        let id = SingleLookupReqId {
-            lookup_id,
-            req_id: self.next_id(),
-        };
+        let req_id = self.next_id();
+        let id = SingleLookupReqId { lookup_id, req_id };
 
         debug!(
             self.log,
@@ -366,7 +367,7 @@ impl SyncNetworkContext {
         self.blocks_by_root_requests
             .insert(id, ActiveBlocksByRootRequest::new(request));
 
-        Ok(LookupRequestResult::RequestSent)
+        Ok(LookupRequestResult::RequestSent(req_id))
     }
 
     /// Request necessary blobs for `block_root`. Requests only the necessary blobs by checking:
@@ -416,10 +417,8 @@ impl SyncNetworkContext {
             return Ok(LookupRequestResult::NoRequestNeeded);
         }
 
-        let id = SingleLookupReqId {
-            lookup_id,
-            req_id: self.next_id(),
-        };
+        let req_id = self.next_id();
+        let id = SingleLookupReqId { lookup_id, req_id };
 
         debug!(
             self.log,
@@ -445,7 +444,7 @@ impl SyncNetworkContext {
         self.blobs_by_root_requests
             .insert(id, ActiveBlobsByRootRequest::new(request));
 
-        Ok(LookupRequestResult::RequestSent)
+        Ok(LookupRequestResult::RequestSent(req_id))
     }
 
     pub fn is_execution_engine_online(&self) -> bool {

From 683d9df63ba1933be1458863941ae9c0dc200e18 Mon Sep 17 00:00:00 2001
From: Lion - dapplion <35266934+dapplion@users.noreply.github.com>
Date: Tue, 14 May 2024 17:50:38 +0300
Subject: [PATCH 05/19] Don't request block components until having block (#5774)

* Don't request block components until having block

* Update tests

* Resolve todo comment

* Merge branch 'unstable' into request-blocks-first
---
 .../network/src/sync/block_lookups/mod.rs     |  31 +-
 .../sync/block_lookups/single_block_lookup.rs |  33 +-
 .../network/src/sync/block_lookups/tests.rs   | 341 ++++++------------
 beacon_node/network/src/sync/manager.rs       |   4 +-
 .../network/src/sync/network_context.rs       |  96 ++---
 .../src/sync/network_context/requests.rs      |  16 +-
 6 files changed, 207 insertions(+), 314 deletions(-)

diff --git a/beacon_node/network/src/sync/block_lookups/mod.rs b/beacon_node/network/src/sync/block_lookups/mod.rs
index 60126818b6..1deb50237d 100644
--- a/beacon_node/network/src/sync/block_lookups/mod.rs
+++ b/beacon_node/network/src/sync/block_lookups/mod.rs
@@ -7,7 +7,6 @@ use crate::metrics;
 use crate::sync::block_lookups::common::{ResponseType, PARENT_DEPTH_TOLERANCE};
 use crate::sync::block_lookups::parent_chain::find_oldest_fork_ancestor;
 use crate::sync::manager::{Id, SingleLookupReqId};
-use crate::sync::network_context::LookupFailure;
 use beacon_chain::block_verification_types::AsBlock;
 use beacon_chain::data_availability_checker::AvailabilityCheckErrorCategory;
 use beacon_chain::{AvailabilityProcessingStatus, BeaconChainTypes, BlockError};
@@ -325,12 +324,7 @@ impl BlockLookups {
         response: RpcProcessingResult,
         cx: &mut SyncNetworkContext,
     ) -> Result<(), LookupRequestError> {
-        // Downscore peer even if lookup is not known
-        // Only downscore lookup verify errors. RPC errors are downscored in the network handler.
-        if let Err(LookupFailure::LookupVerifyError(e)) = &response {
-            // Note: the error is displayed in full debug form on the match below
-            cx.report_peer(peer_id, PeerAction::LowToleranceError, e.into());
-        }
+        // Note: no need to downscore peers here, already downscored on network context
 
         let response_type = R::response_type();
         let Some(lookup) = self.single_block_lookups.get_mut(&id.lookup_id) else {
@@ -459,23 +453,16 @@ impl BlockLookups {
                     // if both components have been processed.
                     request_state.on_processing_success()?;
 
-                    // If this was the result of a block request, we can't determined if the block peer did anything
-                    // wrong. If we already had both a block and blobs response processed, we should penalize the
-                    // blobs peer because they did not provide all blobs on the initial request.
                     if lookup.both_components_processed() {
-                        if let Some(blob_peer) = lookup
-                            .blob_request_state
-                            .state
-                            .on_post_process_validation_failure()?
-                        {
-                            cx.report_peer(
-                                blob_peer,
-                                PeerAction::MidToleranceError,
-                                "sent_incomplete_blobs",
-                            );
-                        }
+                        // We don't request for other block components until being sure that the block has
+                        // data. If we request blobs / columns to a peer we are sure those must exist.
+                        // Therefore if all components are processed and we still receive `MissingComponents`
+                        // it indicates an internal bug.
+                        return Err(LookupRequestError::MissingComponentsAfterAllProcessed);
+                    } else {
+                        // Continue request, potentially request blobs
+                        Action::Retry
                     }
-                    Action::Retry
                 }
                 BlockProcessingResult::Ignored => {
                     // Beacon processor signalled to ignore the block processing result.
diff --git a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs
index 6804798dc9..b6c2825fab 100644
--- a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs
+++ b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs
@@ -39,6 +39,9 @@ pub enum LookupRequestError {
     BadState(String),
     /// Lookup failed for some other reason and should be dropped
     Failed,
+    /// Received MissingComponents when all components have been processed. This should never
+    /// happen, and indicates some internal bug
+    MissingComponentsAfterAllProcessed,
     /// Attempted to retrieve a not known lookup id
     UnknownLookup,
     /// Received a download result for a different request id than the in-flight request.
@@ -158,7 +161,7 @@ impl SingleBlockLookup {
     }
 
     /// Potentially makes progress on this request if it's in a progress-able state
-    pub fn continue_request<R: RequestState<T>>(
+    fn continue_request<R: RequestState<T>>(
         &mut self,
         cx: &mut SyncNetworkContext,
     ) -> Result<(), LookupRequestError> {
@@ -285,10 +288,8 @@ pub enum State {
     AwaitingProcess(DownloadResult),
     /// Request is processing, sent by lookup sync
     Processing(DownloadResult),
-    /// Request is processed:
-    /// - `Processed(Some)` if lookup sync downloaded and sent to process this request
-    /// - `Processed(None)` if another source (i.e. gossip) sent this component for processing
-    Processed(Option<PeerId>),
+    /// Request is processed
+    Processed,
 }
 
 /// Object representing the state of a single block or blob lookup request.
@@ -463,8 +464,8 @@
 
     pub fn on_processing_success(&mut self) -> Result<(), LookupRequestError> {
         match &self.state {
-            State::Processing(result) => {
-                self.state = State::Processed(Some(result.peer_id));
+            State::Processing(_) => {
+                self.state = State::Processed;
                 Ok(())
             }
             other => Err(LookupRequestError::BadState(format!(
@@ -473,27 +474,11 @@ impl SingleLookupRequestState {
         }
     }
 
-    pub fn on_post_process_validation_failure(
-        &mut self,
-    ) -> Result<Option<PeerId>, LookupRequestError> {
-        match &self.state {
-            State::Processed(peer_id) => {
-                let peer_id = *peer_id;
-                self.failed_processing = self.failed_processing.saturating_add(1);
-                self.state = State::AwaitingDownload;
-                Ok(peer_id)
-            }
-            other => Err(LookupRequestError::BadState(format!(
-                "Bad state on_post_process_validation_failure expected Processed got {other}"
-            ))),
-        }
-    }
-
     /// Mark a request as complete without any download or processing
     pub fn on_completed_request(&mut self) -> Result<(), LookupRequestError> {
         match &self.state {
             State::AwaitingDownload => {
-                self.state = State::Processed(None);
+                self.state = State::Processed;
                 Ok(())
             }
             other => Err(LookupRequestError::BadState(format!(
diff --git a/beacon_node/network/src/sync/block_lookups/tests.rs b/beacon_node/network/src/sync/block_lookups/tests.rs
index 619db0469b..2a59c24d58 100644
--- a/beacon_node/network/src/sync/block_lookups/tests.rs
+++ b/beacon_node/network/src/sync/block_lookups/tests.rs
@@ -287,6 +287,11 @@ impl TestRig {
         self.expect_no_active_single_lookups();
     }
 
+    fn expect_no_active_lookups_empty_network(&mut self) {
+        self.expect_no_active_lookups();
+        self.expect_empty_network();
+    }
+
     fn new_connected_peer(&mut self) -> PeerId {
         let peer_id = PeerId::random();
         self.network_globals
@@ -461,14 +466,16 @@ impl TestRig {
         );
     }
 
-    fn complete_single_lookup_block_valid(&mut self, block: SignedBeaconBlock, import: bool) {
+    fn complete_lookup_block_download(&mut self, block: SignedBeaconBlock) {
         let block_root = block.canonical_root();
-        let block_slot = block.slot();
         let id = self.expect_block_lookup_request(block_root);
         self.expect_empty_network();
         let peer_id = self.new_connected_peer();
         self.single_lookup_block_response(id, peer_id, Some(block.into()));
         self.single_lookup_block_response(id, peer_id, None);
+    }
+
+    fn complete_lookup_block_import_valid(&mut self, block_root: Hash256, import: bool) {
         self.expect_block_process(ResponseType::Block);
         let id = self.find_single_lookup_for(block_root);
         self.single_block_component_processed(
@@ -477,12 +484,19 @@ impl TestRig {
                 BlockProcessingResult::Ok(AvailabilityProcessingStatus::Imported(block_root))
             } else {
                 BlockProcessingResult::Ok(AvailabilityProcessingStatus::MissingComponents(
-                    block_slot, block_root,
+                    Slot::new(0),
+                    block_root,
                 ))
             },
         )
     }
 
+    fn complete_single_lookup_block_valid(&mut self, block: SignedBeaconBlock, import: bool) {
+        let block_root = block.canonical_root();
+        self.complete_lookup_block_download(block);
+        self.complete_lookup_block_import_valid(block_root, import)
+    }
+
     fn parent_lookup_failed(&mut self, id: SingleLookupReqId, peer_id: PeerId, error: RPCError) {
         self.send_sync_message(SyncMessage::RpcError {
             peer_id,
@@ -676,26 +690,6 @@ impl TestRig {
             .unwrap_or_else(|e| panic!("Expected blob parent request for {for_block:?}: {e}"))
     }
 
-    fn expect_lookup_request_block_and_blobs(&mut self, block_root: Hash256) -> SingleLookupReqId {
-        let id = self.expect_block_lookup_request(block_root);
-        // If we're in deneb, a blob request should have been triggered as well,
-        // we don't require a response because we're generateing 0-blob blocks in this test.
-        if self.after_deneb() {
-            let _ = self.expect_blob_lookup_request(block_root);
-        }
-        id
-    }
-
-    fn expect_parent_request_block_and_blobs(&mut self, block_root: Hash256) -> SingleLookupReqId {
-        let id = self.expect_block_parent_request(block_root);
-        // If we're in deneb, a blob request should have been triggered as well,
-        // we don't require a response because we're generateing 0-blob blocks in this test.
-        if self.after_deneb() {
-            let _ = self.expect_blob_parent_request(block_root);
-        }
-        id
-    }
-
     #[track_caller]
     fn expect_block_process(&mut self, response_type: ResponseType) {
         match response_type {
@@ -932,7 +926,7 @@ fn test_single_block_lookup_happy_path() {
     let block_root = block.canonical_root();
     // Trigger the request
     rig.trigger_unknown_block_from_attestation(block_root, peer_id);
-    let id = rig.expect_lookup_request_block_and_blobs(block_root);
+    let id = rig.expect_block_lookup_request(block_root);
 
     // The peer provides the correct block, should not be penalized. Now the block should be sent
     // for processing.
@@ -962,7 +956,7 @@ fn test_single_block_lookup_empty_response() {
 
     // Trigger the request
     r.trigger_unknown_block_from_attestation(block_root, peer_id);
-    let id = r.expect_lookup_request_block_and_blobs(block_root);
+    let id = r.expect_block_lookup_request(block_root);
 
     // The peer does not have the block. It should be penalized.
     r.single_lookup_block_response(id, peer_id, None);
@@ -985,7 +979,7 @@ fn test_single_block_lookup_wrong_response() {
 
     // Trigger the request
     rig.trigger_unknown_block_from_attestation(block_hash, peer_id);
-    let id = rig.expect_lookup_request_block_and_blobs(block_hash);
+    let id = rig.expect_block_lookup_request(block_hash);
 
     // Peer sends something else. It should be penalized.
     let bad_block = rig.rand_block();
@@ -1007,7 +1001,7 @@ fn test_single_block_lookup_failure() {
 
     // Trigger the request
     rig.trigger_unknown_block_from_attestation(block_hash, peer_id);
-    let id = rig.expect_lookup_request_block_and_blobs(block_hash);
+    let id = rig.expect_block_lookup_request(block_hash);
 
     // The request fails. RPC failures are handled elsewhere so we should not penalize the peer.
     rig.single_lookup_failed(id, peer_id, RPCError::UnsupportedProtocol);
@@ -1026,7 +1020,7 @@ fn test_single_block_lookup_becomes_parent_request() {
 
     // Trigger the request
     rig.trigger_unknown_block_from_attestation(block.canonical_root(), peer_id);
-    let id = rig.expect_lookup_request_block_and_blobs(block_root);
+    let id = rig.expect_block_parent_request(block_root);
 
     // The peer provides the correct block, should not be penalized. Now the block should be sent
     // for processing.
@@ -1044,7 +1038,7 @@ fn test_single_block_lookup_becomes_parent_request() {
         BlockError::ParentUnknown(RpcBlock::new_without_blobs(None, block)).into(),
     );
     assert_eq!(rig.active_single_lookups_count(), 2); // 2 = current + parent
-    rig.expect_parent_request_block_and_blobs(parent_root);
+    rig.expect_block_parent_request(parent_root);
     rig.expect_empty_network();
     assert_eq!(rig.active_parent_lookups_count(), 1);
 }
@@ -1058,10 +1052,12 @@ fn test_parent_lookup_happy_path() {
 
     // Trigger the request
     rig.trigger_unknown_parent_block(peer_id, block.into());
-    let id = rig.expect_parent_request_block_and_blobs(parent_root);
+    let id = rig.expect_block_parent_request(parent_root);
 
     // Peer sends the right block, it should be sent for processing. Peer should not be penalized.
     rig.parent_lookup_block_response(id, peer_id, Some(parent.into()));
+    // No request of blobs because the block has no data
+    rig.expect_empty_network();
     rig.expect_block_process(ResponseType::Block);
     rig.expect_empty_network();
@@ -1074,7 +1070,7 @@ fn test_parent_lookup_happy_path() {
     );
     rig.expect_parent_chain_process();
     rig.parent_chain_processed_success(block_root, &[]);
-    rig.expect_no_active_lookups();
+    rig.expect_no_active_lookups_empty_network();
 }
 
 #[test]
@@ -1086,7 +1082,7 @@ fn test_parent_lookup_wrong_response() {
 
     // Trigger the request
     rig.trigger_unknown_parent_block(peer_id, block.into());
-    let id1 = rig.expect_parent_request_block_and_blobs(parent_root);
+    let id1 = rig.expect_block_parent_request(parent_root);
 
     // Peer sends the wrong block, peer should be penalized and the block re-requested.
     let bad_block = rig.rand_block();
@@ -1108,7 +1104,7 @@ fn test_parent_lookup_wrong_response() {
     rig.parent_block_processed_imported(block_root);
     rig.expect_parent_chain_process();
     rig.parent_chain_processed_success(block_root, &[]);
-    rig.expect_no_active_lookups();
+    rig.expect_no_active_lookups_empty_network();
 }
 
 #[test]
@@ -1120,14 +1116,14 @@ fn test_parent_lookup_rpc_failure() {
 
     // Trigger the request
     rig.trigger_unknown_parent_block(peer_id, block.into());
-    let id1 = rig.expect_parent_request_block_and_blobs(parent_root);
+    let id = rig.expect_block_parent_request(parent_root);
 
     // The request fails. It should be tried again.
-    rig.parent_lookup_failed_unavailable(id1, peer_id);
-    let id2 = rig.expect_block_parent_request(parent_root);
+    rig.parent_lookup_failed_unavailable(id, peer_id);
+    let id = rig.expect_block_parent_request(parent_root);
 
     // Send the right block this time.
-    rig.parent_lookup_block_response(id2, peer_id, Some(parent.into()));
+    rig.parent_lookup_block_response(id, peer_id, Some(parent.into()));
     rig.expect_block_process(ResponseType::Block);
 
     // Add peer to child lookup to prevent it being dropped
@@ -1136,7 +1132,7 @@ fn test_parent_lookup_rpc_failure() {
     rig.parent_block_processed_imported(block_root);
     rig.expect_parent_chain_process();
     rig.parent_chain_processed_success(block_root, &[]);
-    rig.expect_no_active_lookups();
+    rig.expect_no_active_lookups_empty_network();
 }
 
 #[test]
@@ -1152,9 +1148,6 @@ fn test_parent_lookup_too_many_attempts() {
     for i in 1..=PARENT_FAIL_TOLERANCE {
         let id = rig.expect_block_parent_request(parent_root);
         // Blobs are only requested in the first iteration as this test only retries blocks
-        if rig.after_deneb() && i == 1 {
-            let _ = rig.expect_blob_parent_request(parent_root);
-        }
 
         if i % 2 == 0 {
             // make sure every error is accounted for
@@ -1178,7 +1171,7 @@ fn test_parent_lookup_too_many_attempts() {
         }
     }
 
-    rig.expect_no_active_lookups();
+    rig.expect_no_active_lookups_empty_network();
 }
 
 #[test]
@@ -1193,10 +1186,6 @@ fn test_parent_lookup_too_many_download_attempts_no_blacklist() {
     for i in 1..=PARENT_FAIL_TOLERANCE {
         assert!(!rig.failed_chains_contains(&block_root));
         let id = rig.expect_block_parent_request(parent_root);
-        // Blobs are only requested in the first iteration as this test only retries blocks
-        if rig.after_deneb() && i == 1 {
-            let _ = rig.expect_blob_parent_request(parent_root);
-        }
 
         if i % 2 != 0 {
             // The request fails. It should be tried again.
             rig.parent_lookup_failed_unavailable(id, peer_id);
@@ -1210,7 +1199,7 @@ fn test_parent_lookup_too_many_download_attempts_no_blacklist() {
 
     assert!(!rig.failed_chains_contains(&block_root));
     assert!(!rig.failed_chains_contains(&parent.canonical_root()));
-    rig.expect_no_active_lookups();
+    rig.expect_no_active_lookups_empty_network();
 }
 
 #[test]
@@ -1224,12 +1213,8 @@ fn test_parent_lookup_too_many_processing_attempts_must_blacklist() {
     rig.trigger_unknown_parent_block(peer_id, block.into());
 
     rig.log("Fail downloading the block");
-    for i in 0..(PARENT_FAIL_TOLERANCE - PROCESSING_FAILURES) {
+    for _ in 0..(PARENT_FAIL_TOLERANCE - PROCESSING_FAILURES) {
         let id = rig.expect_block_parent_request(parent_root);
-        // Blobs are only requested in the first iteration as this test only retries blocks
-        if rig.after_deneb() && i == 0 {
-            let _ = rig.expect_blob_parent_request(parent_root);
-        }
         // The request fails. It should be tried again.
         rig.parent_lookup_failed_unavailable(id, peer_id);
     }
@@ -1247,7 +1232,7 @@ fn test_parent_lookup_too_many_processing_attempts_must_blacklist() {
     }
 
     rig.assert_not_failed_chain(block_root);
-    rig.expect_no_active_lookups();
+    rig.expect_no_active_lookups_empty_network();
 }
 
 #[test]
@@ -1261,7 +1246,7 @@ fn test_parent_lookup_too_deep() {
     rig.trigger_unknown_parent_block(peer_id, trigger_block);
 
     for block in blocks.into_iter().rev() {
-        let id = rig.expect_parent_request_block_and_blobs(block.canonical_root());
+        let id = rig.expect_block_parent_request(block.canonical_root());
         // the block
         rig.parent_lookup_block_response(id, peer_id, Some(block.clone()));
         // the stream termination
@@ -1326,7 +1311,7 @@ fn test_single_block_lookup_ignored_response() {
 
     // Trigger the request
     rig.trigger_unknown_block_from_attestation(block.canonical_root(), peer_id);
-    let id = rig.expect_lookup_request_block_and_blobs(block.canonical_root());
+    let id = rig.expect_block_lookup_request(block.canonical_root());
 
     // The peer provides the correct block, should not be penalized. Now the block should be sent
     // for processing.
@@ -1342,8 +1327,7 @@ fn test_single_block_lookup_ignored_response() {
     rig.single_lookup_block_response(id, peer_id, None);
     // Send an Ignored response, the request should be dropped
     rig.single_block_component_processed(id.lookup_id, BlockProcessingResult::Ignored);
-    rig.expect_empty_network();
-    rig.expect_no_active_lookups();
+    rig.expect_no_active_lookups_empty_network();
 }
 
 #[test]
@@ -1355,7 +1339,7 @@ fn test_parent_lookup_ignored_response() {
 
     // Trigger the request
     rig.trigger_unknown_parent_block(peer_id, block.clone().into());
-    let id = rig.expect_parent_request_block_and_blobs(parent_root);
+    let id = rig.expect_block_parent_request(parent_root);
 
     // Note: single block lookup for current `block` does not trigger any request because it does
     // not have blobs, and the block is already cached
@@ -1385,7 +1369,7 @@ fn test_same_chain_race_condition() {
     rig.trigger_unknown_parent_block(peer_id, trigger_block.clone());
 
     for (i, block) in blocks.clone().into_iter().rev().enumerate() {
-        let id = rig.expect_parent_request_block_and_blobs(block.canonical_root());
+        let id = rig.expect_block_parent_request(block.canonical_root());
         // the block
         rig.parent_lookup_block_response(id, peer_id, Some(block.clone()));
         // the stream termination
@@ -1421,7 +1405,7 @@ fn test_same_chain_race_condition() {
         rig.expect_parent_chain_process();
         rig.single_block_component_processed_imported(block.canonical_root());
     }
-    rig.expect_no_active_lookups();
+    rig.expect_no_active_lookups_empty_network();
 }
 
 #[test]
@@ -1453,12 +1437,13 @@ fn block_in_processing_cache_becomes_invalid() {
     r.insert_block_to_processing_cache(block.clone().into());
     r.trigger_unknown_block_from_attestation(block_root, peer_id);
     // Should not trigger block request
-    let id = r.expect_blob_lookup_request(block_root);
     r.expect_empty_network();
     // Simulate invalid block, removing it from processing cache
     r.simulate_block_gossip_processing_becomes_invalid(block_root);
-    // Should download and process the block
-    r.complete_single_lookup_block_valid(block, false);
+    // Should download block, then issue blobs request
+    r.complete_lookup_block_download(block);
+    let id = r.expect_blob_lookup_request(block_root);
+    r.complete_lookup_block_import_valid(block_root, false);
     // Resolve blob and expect lookup completed
     r.complete_single_lookup_blob_lookup_valid(id, peer_id, blobs, true);
     r.expect_no_active_lookups();
@@ -1475,10 +1460,10 @@ fn block_in_processing_cache_becomes_valid_imported() {
     r.insert_block_to_processing_cache(block.clone().into());
     r.trigger_unknown_block_from_attestation(block_root, peer_id);
     // Should not trigger block request
-    let id = r.expect_blob_lookup_request(block_root);
     r.expect_empty_network();
     // Resolve the block from processing step
    r.simulate_block_gossip_processing_becomes_valid_missing_components(block.into());
+    let id = r.expect_blob_lookup_request(block_root);
     // Resolve blob and expect lookup completed
     r.complete_single_lookup_blob_lookup_valid(id, peer_id, blobs, true);
     r.expect_no_active_lookups();
@@ -1592,8 +1577,8 @@
                         peer_id, block_root,
                     ));
                     let block_req_id = rig.expect_block_lookup_request(block_root);
-                    let blob_req_id = rig.expect_blob_lookup_request(block_root);
-                    (Some(block_req_id), Some(blob_req_id), None, None)
+                    (Some(block_req_id), None, None, None)
                 }
                 RequestTrigger::GossipUnknownParentBlock { .. } => {
                     rig.send_sync_message(SyncMessage::UnknownParentBlock(
@@ -1604,14 +1588,8 @@
                     let parent_root = block.parent_root();
                     let parent_block_req_id = rig.expect_block_parent_request(parent_root);
-                    let parent_blob_req_id = rig.expect_blob_parent_request(parent_root);
                     rig.expect_empty_network(); // expect no more requests
-                    (
-                        None,
-                        None,
-                        Some(parent_block_req_id),
-                        Some(parent_blob_req_id),
-                    )
+                    (None, None, Some(parent_block_req_id), None)
                 }
                 RequestTrigger::GossipUnknownParentBlob { .. } => {
                     let single_blob = blobs.first().cloned().unwrap();
                     rig.send_sync_message(SyncMessage::UnknownParentBlob(peer_id, single_blob));
                     let parent_block_req_id = rig.expect_block_parent_request(parent_root);
-                    let parent_blob_req_id = rig.expect_blob_parent_request(parent_root);
                     rig.expect_empty_network(); // expect no more requests
-                    (
-                        None,
-                        None,
-                        Some(parent_block_req_id),
-                        Some(parent_blob_req_id),
-                    )
+                    (None, None, Some(parent_block_req_id), None)
                 }
             };
@@ -1675,6 +1647,23 @@ mod deneb_only {
             self
         }
 
+        fn parent_block_response_expect_blobs(mut self) -> Self {
+            self.rig.expect_empty_network();
+            let block = self.parent_block.pop_front().unwrap().clone();
+            let _ = self.unknown_parent_block.insert(block.clone());
+            self.rig.parent_lookup_block_response(
+                self.parent_block_req_id.expect("parent request id"),
+                self.peer_id,
+                Some(block),
+            );
+
+            // Expect blobs request after sending block
+            let s = self.expect_parent_blobs_request();
+
+            s.rig.assert_parent_lookups_count(1);
+            s
+        }
+
         fn parent_blob_response(mut self) -> Self {
             let blobs = self.parent_blobs.pop_front().unwrap();
             let _ = self.unknown_parent_blobs.insert(blobs.clone());
@@ -1687,7 +1676,7 @@ mod deneb_only {
                 assert_eq!(self.rig.active_parent_lookups_count(), 1);
             }
             self.rig.parent_lookup_blob_response(
-                self.parent_blob_req_id.expect("blob request id"),
+                self.parent_blob_req_id.expect("parent blob request id"),
                 self.peer_id,
                 None,
             );
@@ -1696,7
+1685,7 @@ mod deneb_only { } fn block_response_triggering_process(self) -> Self { - let mut me = self.block_response(); + let mut me = self.block_response_and_expect_blob_request(); me.rig.expect_block_process(ResponseType::Block); // The request should still be active. @@ -1704,7 +1693,7 @@ mod deneb_only { me } - fn block_response(mut self) -> Self { + fn block_response_and_expect_blob_request(mut self) -> Self { // The peer provides the correct block, should not be penalized. Now the block should be sent // for processing. self.rig.single_lookup_block_response( @@ -1712,12 +1701,14 @@ mod deneb_only { self.peer_id, Some(self.block.clone()), ); - self.rig.expect_empty_network(); + // After responding with block the node will issue a blob request + let mut s = self.expect_blobs_request(); + + s.rig.expect_empty_network(); // The request should still be active. - self.rig - .assert_lookup_is_active(self.block.canonical_root()); - self + s.rig.assert_lookup_is_active(s.block.canonical_root()); + s } fn blobs_response(mut self) -> Self { @@ -1831,6 +1822,21 @@ mod deneb_only { self } + fn parent_block_missing_components(mut self) -> Self { + let parent_root = *self.parent_block_roots.first().unwrap(); + self.rig + .log(&format!("parent_block_missing_components {parent_root:?}")); + self.rig.parent_block_processed( + self.block_root, + BlockProcessingResult::Ok(AvailabilityProcessingStatus::MissingComponents( + Slot::new(0), + parent_root, + )), + ); + self.rig.expect_no_requests_for(parent_root); + self + } + fn parent_blob_imported(mut self) -> Self { let parent_root = *self.parent_block_roots.first().unwrap(); self.rig @@ -1864,26 +1870,6 @@ mod deneb_only { self } - fn parent_block_missing_components(mut self) -> Self { - let block = self.unknown_parent_block.clone().unwrap(); - self.rig.parent_block_processed( - self.block_root, - BlockProcessingResult::Ok(AvailabilityProcessingStatus::MissingComponents( - block.slot(), - block.canonical_root(), - )), - ); - 
self.rig.parent_blob_processed( - self.block_root, - BlockProcessingResult::Ok(AvailabilityProcessingStatus::MissingComponents( - block.slot(), - block.canonical_root(), - )), - ); - assert_eq!(self.rig.active_parent_lookups_count(), 1); - self - } - fn invalid_parent_processed(mut self) -> Self { self.rig.parent_block_processed( self.block_root, @@ -1922,45 +1908,35 @@ mod deneb_only { self.block_root, )), ); - self.rig.assert_single_lookups_count(1); - self - } + // Add block to da_checker so blobs request can continue + self.rig.insert_block_to_da_checker(self.block.clone()); - fn missing_components_from_blob_request(mut self) -> Self { - self.rig.single_blob_component_processed( - self.blob_req_id.expect("blob request id").lookup_id, - BlockProcessingResult::Ok(AvailabilityProcessingStatus::MissingComponents( - self.slot, - self.block_root, - )), - ); self.rig.assert_single_lookups_count(1); self } fn complete_current_block_and_blobs_lookup(self) -> Self { self.expect_block_request() - .expect_blobs_request() - .block_response() + .block_response_and_expect_blob_request() .blobs_response() // TODO: Should send blobs for processing .expect_block_process() .block_imported() } - fn empty_parent_blobs_then_parent_block(self) -> Self { + fn parent_block_then_empty_parent_blobs(self) -> Self { self.log( " Return empty blobs for parent, block errors with missing components, downscore", ) - .empty_parent_blobs_response() - .expect_no_penalty_and_no_requests() .parent_block_response() - .parent_block_missing_components() - .expect_penalty("sent_incomplete_blobs") + .expect_parent_blobs_request() + .empty_parent_blobs_response() + .expect_penalty("NotEnoughResponsesReturned") .log("Re-request parent blobs, succeed and import parent") .expect_parent_blobs_request() .parent_blob_response() .expect_block_process() + .parent_block_missing_components() // Insert new peer into child request before completing parent .trigger_unknown_block_from_attestation() 
.parent_blob_imported() @@ -2044,77 +2020,27 @@ mod deneb_only { return; }; tester - .block_response_triggering_process() + .block_response_and_expect_blob_request() .blobs_response() .block_missing_components() // blobs not yet imported .blobs_response_was_valid() .blob_imported(); // now blobs resolve as imported } - #[test] - fn single_block_and_blob_lookup_blobs_returned_first_attestation() { - let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { - return; - }; - tester - .blobs_response() // hold blobs for processing - .block_response_triggering_process() - .block_missing_components() // blobs not yet imported - .blobs_response_was_valid() - .blob_imported(); // now blobs resolve as imported - } - - #[test] - fn single_block_and_blob_lookup_empty_response_attestation() { - let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { - return; - }; - tester - .empty_block_response() - .expect_penalty("NoResponseReturned") - .expect_block_request() - .expect_no_blobs_request() - .empty_blobs_response() - .expect_empty_beacon_processor() - .expect_no_penalty() - .expect_no_block_request() - .expect_no_blobs_request() - .block_response_triggering_process() - .missing_components_from_block_request(); - } - #[test] fn single_block_response_then_empty_blob_response_attestation() { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { return; }; - tester - .block_response_triggering_process() + .block_response_and_expect_blob_request() .missing_components_from_block_request() .empty_blobs_response() - .missing_components_from_blob_request() - .expect_penalty("sent_incomplete_blobs") + .expect_penalty("NotEnoughResponsesReturned") .expect_blobs_request() .expect_no_block_request(); } - #[test] - fn single_blob_response_then_empty_block_response_attestation() { - let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { - return; - }; - - tester - 
.blobs_response() - .expect_no_penalty_and_no_requests() - // blobs not sent for processing until the block is processed - .empty_block_response() - .expect_penalty("NoResponseReturned") - .expect_block_request() - .expect_no_blobs_request(); - } - #[test] fn single_invalid_block_response_then_blob_response_attestation() { let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { @@ -2156,8 +2082,7 @@ mod deneb_only { .missing_components_from_block_request() .invalidate_blobs_too_few() .blobs_response() - .missing_components_from_blob_request() - .expect_penalty("sent_incomplete_blobs") + .expect_penalty("NotEnoughResponsesReturned") .expect_blobs_request() .expect_no_block_request(); } @@ -2171,37 +2096,12 @@ mod deneb_only { .block_response_triggering_process() .invalidate_blobs_too_many() .blobs_response() - .expect_penalty("DuplicateData") - .expect_blobs_request() + .expect_penalty("TooManyResponses") + // Network context returns "download success" because the request has enough blobs + it + // downscores the peer for returning too many. 
.expect_no_block_request(); } - #[test] - fn too_few_blobs_response_then_block_response_attestation() { - let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { - return; - }; - tester - .invalidate_blobs_too_few() - .blobs_response() // blobs are not sent until the block is processed - .expect_no_penalty_and_no_requests() - .block_response_triggering_process(); - } - - #[test] - fn too_many_blobs_response_then_block_response_attestation() { - let Some(tester) = DenebTester::new(RequestTrigger::AttestationUnknownBlock) else { - return; - }; - tester - .invalidate_blobs_too_many() - .blobs_response() - .expect_penalty("DuplicateData") - .expect_blobs_request() - .expect_no_block_request() - .block_response_triggering_process(); - } - // Test peer returning block that has unknown parent, and a new lookup is created #[test] fn parent_block_unknown_parent() { @@ -2210,12 +2110,11 @@ mod deneb_only { }; tester .expect_empty_beacon_processor() - .parent_block_response() + .parent_block_response_expect_blobs() .parent_blob_response() .expect_block_process() .parent_block_unknown_parent() .expect_parent_block_request() - .expect_parent_blobs_request() .expect_empty_beacon_processor(); } @@ -2226,7 +2125,7 @@ mod deneb_only { return; }; tester - .parent_block_response() + .parent_block_response_expect_blobs() .parent_blob_response() .expect_block_process() .invalid_parent_processed() @@ -2246,19 +2145,19 @@ mod deneb_only { .expect_penalty("NoResponseReturned") .expect_block_request() .expect_no_blobs_request() - .block_response() + .block_response_and_expect_blob_request() .blobs_response() .block_imported() .expect_no_active_lookups(); } #[test] - fn empty_parent_blobs_then_parent_block() { + fn parent_block_then_empty_parent_blobs() { let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlock(1)) else { return; }; tester - .empty_parent_blobs_then_parent_block() + .parent_block_then_empty_parent_blobs() .log("resolve 
original block trigger blobs request and import") // Should not have block request, it is cached .expect_blobs_request() @@ -2274,12 +2173,11 @@ mod deneb_only { }; tester .expect_empty_beacon_processor() - .parent_block_response() + .parent_block_response_expect_blobs() .parent_blob_response() .expect_block_process() .parent_block_unknown_parent() .expect_parent_block_request() - .expect_parent_blobs_request() .expect_empty_beacon_processor(); } @@ -2290,7 +2188,7 @@ mod deneb_only { }; tester .expect_empty_beacon_processor() - .parent_block_response() + .parent_block_response_expect_blobs() .parent_blob_response() .expect_block_process() .invalid_parent_processed() @@ -2307,6 +2205,7 @@ mod deneb_only { }; tester .parent_block_response() + .expect_parent_blobs_request() .parent_blob_response() .expect_block_process() .trigger_unknown_block_from_attestation() @@ -2316,12 +2215,12 @@ mod deneb_only { } #[test] - fn empty_parent_blobs_then_parent_block_blob_trigger() { + fn parent_block_then_empty_parent_blobs_blob_trigger() { let Some(tester) = DenebTester::new(RequestTrigger::GossipUnknownParentBlob(1)) else { return; }; tester - .empty_parent_blobs_then_parent_block() + .parent_block_then_empty_parent_blobs() .log("resolve original block trigger blobs request and import") .complete_current_block_and_blobs_lookup() .expect_no_active_lookups(); @@ -2334,15 +2233,15 @@ mod deneb_only { }; tester .expect_empty_beacon_processor() - .parent_block_response() + .parent_block_response_expect_blobs() .parent_blob_response() .expect_no_penalty() .expect_block_process() .parent_block_unknown_parent() .expect_parent_block_request() - .expect_parent_blobs_request() .expect_empty_beacon_processor() .parent_block_response() + .expect_parent_blobs_request() .parent_blob_response() .expect_no_penalty() .expect_block_process(); diff --git a/beacon_node/network/src/sync/manager.rs b/beacon_node/network/src/sync/manager.rs index 4a4d70090e..66d23dd191 100644 --- 
a/beacon_node/network/src/sync/manager.rs +++ b/beacon_node/network/src/sync/manager.rs @@ -816,7 +816,7 @@ impl SyncManager { peer_id: PeerId, block: RpcEvent>>, ) { - if let Some(resp) = self.network.on_single_block_response(id, block) { + if let Some(resp) = self.network.on_single_block_response(id, peer_id, block) { self.block_lookups .on_download_response::>( id, @@ -858,7 +858,7 @@ impl SyncManager { peer_id: PeerId, blob: RpcEvent>>, ) { - if let Some(resp) = self.network.on_single_blob_response(id, blob) { + if let Some(resp) = self.network.on_single_blob_response(id, peer_id, blob) { self.block_lookups .on_download_response::>( id, diff --git a/beacon_node/network/src/sync/network_context.rs b/beacon_node/network/src/sync/network_context.rs index 44fb69d9b2..8693bc0c6c 100644 --- a/beacon_node/network/src/sync/network_context.rs +++ b/beacon_node/network/src/sync/network_context.rs @@ -12,7 +12,6 @@ use crate::status::ToStatusMessage; use crate::sync::block_lookups::SingleLookupId; use crate::sync::manager::{BlockProcessType, SingleLookupReqId}; use beacon_chain::block_verification_types::RpcBlock; -use beacon_chain::validator_monitor::timestamp_now; use beacon_chain::{BeaconChain, BeaconChainTypes, EngineState}; use fnv::FnvHashMap; use lighthouse_network::rpc::methods::BlobsByRangeRequest; @@ -383,24 +382,18 @@ impl SyncNetworkContext { block_root: Hash256, downloaded_block_expected_blobs: Option, ) -> Result { - let expected_blobs = downloaded_block_expected_blobs - .or_else(|| { - self.chain - .data_availability_checker - .num_expected_blobs(&block_root) - }) - .unwrap_or_else(|| { - // If we don't about the block being requested, attempt to fetch all blobs - if self - .chain - .data_availability_checker - .da_check_required_for_current_epoch() - { - T::EthSpec::max_blobs_per_block() - } else { - 0 - } - }); + let Some(expected_blobs) = downloaded_block_expected_blobs.or_else(|| { + self.chain + .data_availability_checker + 
.num_expected_blobs(&block_root) + }) else { + // Wait to download the block before downloading blobs. Then we can be sure that the + // block has data, so there's no need to do "blind" requests for all possible blobs and + // later handle the case where the peer sent no blobs and must be penalized. + // - if `downloaded_block_expected_blobs` is Some = block is downloading or processing. + // - if `num_expected_blobs` returns Some = block is processed. + return Ok(LookupRequestResult::Pending); + }; let imported_blob_indexes = self .chain @@ -554,13 +547,14 @@ impl SyncNetworkContext { pub fn on_single_block_response( &mut self, request_id: SingleLookupReqId, + peer_id: PeerId, block: RpcEvent>>, ) -> Option>>> { let Entry::Occupied(mut request) = self.blocks_by_root_requests.entry(request_id) else { return None; }; - Some(match block { + let resp = match block { RpcEvent::Response(block, seen_timestamp) => { match request.get_mut().add_response(block) { Ok(block) => Ok((block, seen_timestamp)), @@ -579,43 +573,61 @@ impl SyncNetworkContext { request.remove(); Err(e.into()) } - }) + }; + + if let Err(LookupFailure::LookupVerifyError(e)) = &resp { + self.report_peer(peer_id, PeerAction::LowToleranceError, e.into()); + } + Some(resp) } pub fn on_single_blob_response( &mut self, request_id: SingleLookupReqId, + peer_id: PeerId, blob: RpcEvent>>, ) -> Option>> { let Entry::Occupied(mut request) = self.blobs_by_root_requests.entry(request_id) else { return None; }; - Some(match blob { - RpcEvent::Response(blob, _) => match request.get_mut().add_response(blob) { - Ok(Some(blobs)) => to_fixed_blob_sidecar_list(blobs) - .map(|blobs| (blobs, timestamp_now())) - .map_err(Into::into), - Ok(None) => return None, - Err(e) => { - request.remove(); - Err(e.into()) + let resp = match blob { + RpcEvent::Response(blob, seen_timestamp) => { + let request = request.get_mut(); + match request.add_response(blob) { + Ok(Some(blobs)) => to_fixed_blob_sidecar_list(blobs) + .map(|blobs| (blobs,
seen_timestamp)) + .map_err(|e| (e.into(), request.resolve())), + Ok(None) => return None, + Err(e) => Err((e.into(), request.resolve())), } + } + RpcEvent::StreamTermination => match request.remove().terminate() { + Ok(_) => return None, + // (err, false = not resolved) because terminate returns Ok() if resolved + Err(e) => Err((e.into(), false)), }, - RpcEvent::StreamTermination => { - // Stream terminator - match request.remove().terminate() { - Some(blobs) => to_fixed_blob_sidecar_list(blobs) - .map(|blobs| (blobs, timestamp_now())) - .map_err(Into::into), - None => return None, + RpcEvent::RPCError(e) => Err((e.into(), request.remove().resolve())), + }; + + match resp { + Ok(resp) => Some(Ok(resp)), + // Track if this request has already returned some value downstream. Ensure that + // downstream code only receives a single Result per request. If the serving peer does + // multiple penalizable actions per request, downscore and return None. This allows us to + // catch if a peer is returning more blobs than requested or if the excess blobs are + // invalid.
+ Err((e, resolved)) => { + if let LookupFailure::LookupVerifyError(e) = &e { + self.report_peer(peer_id, PeerAction::LowToleranceError, e.into()); + } + if resolved { + None + } else { + Some(Err(e)) } } - RpcEvent::RPCError(e) => { - request.remove(); - Err(e.into()) - } - }) + } } pub fn send_block_for_processing( diff --git a/beacon_node/network/src/sync/network_context/requests.rs b/beacon_node/network/src/sync/network_context/requests.rs index 0522b7fa38..cd73b4beba 100644 --- a/beacon_node/network/src/sync/network_context/requests.rs +++ b/beacon_node/network/src/sync/network_context/requests.rs @@ -9,6 +9,7 @@ use types::{ #[derive(Debug, PartialEq, Eq, IntoStaticStr)] pub enum LookupVerifyError { NoResponseReturned, + NotEnoughResponsesReturned { expected: usize, actual: usize }, TooManyResponses, UnrequestedBlockRoot(Hash256), UnrequestedBlobIndex(u64), @@ -139,11 +140,20 @@ impl ActiveBlobsByRootRequest { } } - pub fn terminate(self) -> Option>>> { + pub fn terminate(self) -> Result<(), LookupVerifyError> { if self.resolved { - None + Ok(()) } else { - Some(self.blobs) + Err(LookupVerifyError::NotEnoughResponsesReturned { + expected: self.request.indices.len(), + actual: self.blobs.len(), + }) } } + + /// Mark request as resolved (= has returned something downstream) while marking this status as + /// true for future calls. 
+ pub fn resolve(&mut self) -> bool { + std::mem::replace(&mut self.resolved, true) + } } From 6f45ad45348049d6b340882769f71cde12eb4ede Mon Sep 17 00:00:00 2001 From: Lion - dapplion <35266934+dapplion@users.noreply.github.com> Date: Tue, 14 May 2024 20:34:26 +0300 Subject: [PATCH 06/19] Log stuck lookups (#5778) * Log stuck lookups every interval * Implement debug manually * Add comment * Do not print peers twice * Add SYNC_LOOKUPS_STUCK metric * Skip logging request root * use derivative * Merge branch 'unstable' of https://github.com/sigp/lighthouse into log-stuck-lookups * add req id to debug * Merge remote-tracking branch 'sigp/unstable' into log-stuck-lookups * Fix conflict with unstable --- beacon_node/network/src/metrics.rs | 4 ++ .../network/src/sync/block_lookups/mod.rs | 18 +++++++ .../sync/block_lookups/single_block_lookup.rs | 47 +++++++++++++++++-- beacon_node/network/src/sync/manager.rs | 7 +++ 4 files changed, 73 insertions(+), 3 deletions(-) diff --git a/beacon_node/network/src/metrics.rs b/beacon_node/network/src/metrics.rs index d096b33f6c..309512076a 100644 --- a/beacon_node/network/src/metrics.rs +++ b/beacon_node/network/src/metrics.rs @@ -257,6 +257,10 @@ lazy_static! 
{ "sync_lookups_completed_total", "Total count of sync lookups completed", ); + pub static ref SYNC_LOOKUPS_STUCK: Result = try_create_int_gauge( + "sync_lookups_stuck", + "Current count of sync lookups that may be stuck", + ); /* * Block Delay Metrics diff --git a/beacon_node/network/src/sync/block_lookups/mod.rs b/beacon_node/network/src/sync/block_lookups/mod.rs index 1deb50237d..48dda03fac 100644 --- a/beacon_node/network/src/sync/block_lookups/mod.rs +++ b/beacon_node/network/src/sync/block_lookups/mod.rs @@ -30,6 +30,7 @@ mod tests; const FAILED_CHAINS_CACHE_EXPIRY_SECONDS: u64 = 60; pub const SINGLE_BLOCK_LOOKUP_MAX_ATTEMPTS: u8 = 4; +const LOOKUP_MAX_DURATION_SECS: u64 = 60; pub enum BlockComponent { Block(DownloadResult>>), @@ -665,4 +666,21 @@ impl BlockLookups { self.single_block_lookups.len() as i64, ); } + + pub fn log_stuck_lookups(&self) { + let mut stuck_count = 0; + for lookup in self.single_block_lookups.values() { + if lookup.elapsed_since_created() > Duration::from_secs(LOOKUP_MAX_DURATION_SECS) { + debug!(self.log, "Lookup maybe stuck"; + // Fields id and block_root are also part of the summary. 
However, logging them + // here allows log parsers to index them and have better search + "id" => lookup.id, + "block_root" => ?lookup.block_root(), + "summary" => ?lookup, + ); + stuck_count += 1; + } + } + metrics::set_gauge(&metrics::SYNC_LOOKUPS_STUCK, stuck_count); + } } diff --git a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs index b6c2825fab..b35a3e91fb 100644 --- a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs +++ b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs @@ -4,12 +4,13 @@ use crate::sync::block_lookups::common::RequestState; use crate::sync::block_lookups::Id; use crate::sync::network_context::{LookupRequestResult, ReqId, SyncNetworkContext}; use beacon_chain::BeaconChainTypes; +use derivative::Derivative; use itertools::Itertools; use rand::seq::IteratorRandom; use std::collections::HashSet; use std::fmt::Debug; use std::sync::Arc; -use std::time::Duration; +use std::time::{Duration, Instant}; use store::Hash256; use strum::IntoStaticStr; use types::blob_sidecar::FixedBlobSidecarList; @@ -53,12 +54,15 @@ pub enum LookupRequestError { }, } +#[derive(Derivative)] +#[derivative(Debug(bound = "T: BeaconChainTypes"))] pub struct SingleBlockLookup { pub id: Id, pub block_request_state: BlockRequestState, pub blob_request_state: BlobRequestState, block_root: Hash256, awaiting_parent: Option, + created: Instant, } impl SingleBlockLookup { @@ -74,6 +78,7 @@ impl SingleBlockLookup { blob_request_state: BlobRequestState::new(requested_block_root, peers), block_root: requested_block_root, awaiting_parent, + created: Instant::now(), } } @@ -98,6 +103,11 @@ impl SingleBlockLookup { self.awaiting_parent = None; } + /// Returns the time elapsed since this lookup was created + pub fn elapsed_since_created(&self) -> Duration { + self.created.elapsed() + } + /// Maybe insert a verified response into this lookup.
Returns true if imported pub fn add_child_components(&mut self, block_component: BlockComponent) -> bool { match block_component { @@ -244,7 +254,10 @@ impl SingleBlockLookup { } /// The state of the blob request component of a `SingleBlockLookup`. +#[derive(Derivative)] +#[derivative(Debug)] pub struct BlobRequestState { + #[derivative(Debug = "ignore")] pub block_root: Hash256, pub state: SingleLookupRequestState>, } @@ -259,7 +272,10 @@ impl BlobRequestState { } /// The state of the block request component of a `SingleBlockLookup`. +#[derive(Derivative)] +#[derivative(Debug)] pub struct BlockRequestState { + #[derivative(Debug = "ignore")] pub requested_block_root: Hash256, pub state: SingleLookupRequestState>>, } @@ -281,7 +297,7 @@ pub struct DownloadResult { pub peer_id: PeerId, } -#[derive(Debug, PartialEq, Eq, IntoStaticStr)] +#[derive(PartialEq, Eq, IntoStaticStr)] pub enum State { AwaitingDownload, Downloading(ReqId), @@ -293,13 +309,16 @@ pub enum State { } /// Object representing the state of a single block or blob lookup request. -#[derive(PartialEq, Eq, Debug)] +#[derive(PartialEq, Eq, Derivative)] +#[derivative(Debug)] pub struct SingleLookupRequestState { /// State of this request. state: State, /// Peers that should have this block or blob. + #[derivative(Debug(format_with = "fmt_peer_set"))] available_peers: HashSet, /// Peers from which we have requested this block. + #[derivative(Debug = "ignore")] used_peers: HashSet, /// How many times have we attempted to process this block or blob. failed_processing: u8, @@ -529,8 +548,30 @@ impl SingleLookupRequestState { } } +// Display is used in the BadState assertions above impl std::fmt::Display for State { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { write!(f, "{}", Into::<&'static str>::into(self)) } } + +// Debug is used in the log_stuck_lookups print to include some more info. 
Implements custom Debug + // to not dump an entire block or blob to terminal, which doesn't add valuable data. +impl std::fmt::Debug for State { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + match self { + Self::AwaitingDownload { .. } => write!(f, "AwaitingDownload"), + Self::Downloading(req_id) => write!(f, "Downloading({:?})", req_id), + Self::AwaitingProcess(d) => write!(f, "AwaitingProcess({:?})", d.peer_id), + Self::Processing(d) => write!(f, "Processing({:?})", d.peer_id), + Self::Processed { .. } => write!(f, "Processed"), + } + } +} + +fn fmt_peer_set( + peer_set: &HashSet, + f: &mut std::fmt::Formatter, +) -> Result<(), std::fmt::Error> { + write!(f, "{}", peer_set.len()) +} diff --git a/beacon_node/network/src/sync/manager.rs b/beacon_node/network/src/sync/manager.rs index 66d23dd191..71d3113414 100644 --- a/beacon_node/network/src/sync/manager.rs +++ b/beacon_node/network/src/sync/manager.rs @@ -547,6 +547,10 @@ impl SyncManager { futures::stream::iter(ee_responsiveness_watch.await).flatten() }; + // LOOKUP_MAX_DURATION_SECS is 60 seconds. Logging every 30 seconds allows enough timely + // visibility while being sparse and not increasing the debug log volume in a noticeable way. + let mut interval = tokio::time::interval(Duration::from_secs(30)); + // process any inbound messages loop { tokio::select!
{ @@ -556,6 +560,9 @@ impl SyncManager { Some(engine_state) = check_ee_stream.next(), if check_ee => { self.handle_new_execution_engine_state(engine_state); } + _ = interval.tick() => { + self.block_lookups.log_stuck_lookups(); + } } } } From 6636167503956d73bb54b1c27dbda8d5291b054a Mon Sep 17 00:00:00 2001 From: Eitan Seri-Levi Date: Wed, 15 May 2024 12:17:06 +0300 Subject: [PATCH 07/19] Log block import source (#5738) * the default target peers is 100 * add some comments * Merge branch 'unstable' of https://github.com/sigp/lighthouse into track-block-import-source * add block import source * revert * update logging text * fix tests * lint * use % instead of to_string --- beacon_node/beacon_chain/src/beacon_chain.rs | 14 +++++++++++--- beacon_node/beacon_chain/src/test_utils.rs | 2 ++ .../beacon_chain/tests/block_verification.rs | 7 +++++++ .../beacon_chain/tests/payload_invalidation.rs | 6 +++++- beacon_node/beacon_chain/tests/store_tests.rs | 3 +++ beacon_node/beacon_chain/tests/tests.rs | 4 +++- beacon_node/http_api/src/publish_blocks.rs | 5 +++-- .../network_beacon_processor/gossip_methods.rs | 13 ++++++++++--- .../network_beacon_processor/sync_methods.rs | 8 +++++++- consensus/types/src/beacon_block.rs | 18 ++++++++++++++++++ consensus/types/src/lib.rs | 2 +- testing/ef_tests/src/cases/fork_choice.rs | 7 ++++--- 12 files changed, 74 insertions(+), 15 deletions(-) diff --git a/beacon_node/beacon_chain/src/beacon_chain.rs b/beacon_node/beacon_chain/src/beacon_chain.rs index 8eeb75fd7d..9584b2e29f 100644 --- a/beacon_node/beacon_chain/src/beacon_chain.rs +++ b/beacon_node/beacon_chain/src/beacon_chain.rs @@ -2774,6 +2774,7 @@ impl BeaconChain { signature_verified_block.block_root(), signature_verified_block, notify_execution_layer, + BlockImportSource::RangeSync, || Ok(()), ) .await @@ -2956,6 +2957,7 @@ impl BeaconChain { self: &Arc, block_root: Hash256, unverified_block: B, + block_source: BlockImportSource, notify_execution_layer: NotifyExecutionLayer, ) ->
Result> { self.reqresp_pre_import_cache @@ -2963,9 +2965,13 @@ impl BeaconChain { .insert(block_root, unverified_block.block_cloned()); let r = self - .process_block(block_root, unverified_block, notify_execution_layer, || { - Ok(()) - }) + .process_block( + block_root, + unverified_block, + notify_execution_layer, + block_source, + || Ok(()), + ) .await; self.remove_notified(&block_root, r) } @@ -2988,6 +2994,7 @@ impl BeaconChain { block_root: Hash256, unverified_block: B, notify_execution_layer: NotifyExecutionLayer, + block_source: BlockImportSource, publish_fn: impl FnOnce() -> Result<(), BlockError> + Send + 'static, ) -> Result> { // Start the Prometheus timer. @@ -3048,6 +3055,7 @@ impl BeaconChain { "Beacon block imported"; "block_root" => ?block_root, "block_slot" => block_slot, + "source" => %block_source, ); // Increment the Prometheus counter for block processing successes. diff --git a/beacon_node/beacon_chain/src/test_utils.rs b/beacon_node/beacon_chain/src/test_utils.rs index 8fbd5d575f..dde6b75054 100644 --- a/beacon_node/beacon_chain/src/test_utils.rs +++ b/beacon_node/beacon_chain/src/test_utils.rs @@ -1881,6 +1881,7 @@ where block_root, RpcBlock::new(Some(block_root), block, sidecars).unwrap(), NotifyExecutionLayer::Yes, + BlockImportSource::RangeSync, || Ok(()), ) .await? @@ -1907,6 +1908,7 @@ where block_root, RpcBlock::new(Some(block_root), block, sidecars).unwrap(), NotifyExecutionLayer::Yes, + BlockImportSource::RangeSync, || Ok(()), ) .await? 
diff --git a/beacon_node/beacon_chain/tests/block_verification.rs b/beacon_node/beacon_chain/tests/block_verification.rs index 98a112daff..9c196b12e1 100644 --- a/beacon_node/beacon_chain/tests/block_verification.rs +++ b/beacon_node/beacon_chain/tests/block_verification.rs @@ -473,6 +473,7 @@ async fn assert_invalid_signature( ) .unwrap(), NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await; @@ -541,6 +542,7 @@ async fn invalid_signature_gossip_block() { signed_block.canonical_root(), Arc::new(signed_block), NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await, @@ -875,6 +877,7 @@ async fn block_gossip_verification() { gossip_verified.block_root, gossip_verified, NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await @@ -1165,6 +1168,7 @@ async fn verify_block_for_gossip_slashing_detection() { verified_block.block_root, verified_block, NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await @@ -1196,6 +1200,7 @@ async fn verify_block_for_gossip_doppelganger_detection() { verified_block.block_root, verified_block, NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await @@ -1342,6 +1347,7 @@ async fn add_base_block_to_altair_chain() { base_block.canonical_root(), Arc::new(base_block.clone()), NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await @@ -1477,6 +1483,7 @@ async fn add_altair_block_to_base_chain() { altair_block.canonical_root(), Arc::new(altair_block.clone()), NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await diff --git a/beacon_node/beacon_chain/tests/payload_invalidation.rs b/beacon_node/beacon_chain/tests/payload_invalidation.rs index 0ef348319a..0c36d21f2e 100644 --- a/beacon_node/beacon_chain/tests/payload_invalidation.rs +++ b/beacon_node/beacon_chain/tests/payload_invalidation.rs @@ -702,6 +702,7 @@ async fn invalidates_all_descendants() { fork_block.canonical_root(), fork_block, 
NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await @@ -802,6 +803,7 @@ async fn switches_heads() { fork_block.canonical_root(), fork_block, NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await @@ -1061,7 +1063,7 @@ async fn invalid_parent() { // Ensure the block built atop an invalid payload is invalid for import. assert!(matches!( - rig.harness.chain.process_block(block.canonical_root(), block.clone(), NotifyExecutionLayer::Yes, + rig.harness.chain.process_block(block.canonical_root(), block.clone(), NotifyExecutionLayer::Yes, BlockImportSource::Lookup, || Ok(()), ).await, Err(BlockError::ParentExecutionPayloadInvalid { parent_root: invalid_root }) @@ -1352,6 +1354,7 @@ async fn build_optimistic_chain( block.canonical_root(), block, NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await @@ -1926,6 +1929,7 @@ async fn recover_from_invalid_head_by_importing_blocks() { fork_block.canonical_root(), fork_block.clone(), NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await diff --git a/beacon_node/beacon_chain/tests/store_tests.rs b/beacon_node/beacon_chain/tests/store_tests.rs index ba8a6bf701..5da92573f7 100644 --- a/beacon_node/beacon_chain/tests/store_tests.rs +++ b/beacon_node/beacon_chain/tests/store_tests.rs @@ -2458,6 +2458,7 @@ async fn weak_subjectivity_sync_test(slots: Vec, checkpoint_slot: Slot) { full_block.canonical_root(), RpcBlock::new(Some(block_root), Arc::new(full_block), Some(blobs)).unwrap(), NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await @@ -2676,6 +2677,7 @@ async fn process_blocks_and_attestations_for_unaligned_checkpoint() { invalid_fork_block.canonical_root(), invalid_fork_block.clone(), NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await @@ -2689,6 +2691,7 @@ async fn process_blocks_and_attestations_for_unaligned_checkpoint() { valid_fork_block.canonical_root(), valid_fork_block.clone(), 
NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await diff --git a/beacon_node/beacon_chain/tests/tests.rs b/beacon_node/beacon_chain/tests/tests.rs index e27180a002..2f496eecd7 100644 --- a/beacon_node/beacon_chain/tests/tests.rs +++ b/beacon_node/beacon_chain/tests/tests.rs @@ -12,7 +12,8 @@ use lazy_static::lazy_static; use operation_pool::PersistedOperationPool; use state_processing::{per_slot_processing, per_slot_processing::Error as SlotProcessingError}; use types::{ - BeaconState, BeaconStateError, EthSpec, Hash256, Keypair, MinimalEthSpec, RelativeEpoch, Slot, + BeaconState, BeaconStateError, BlockImportSource, EthSpec, Hash256, Keypair, MinimalEthSpec, + RelativeEpoch, Slot, }; // Should ideally be divisible by 3. @@ -686,6 +687,7 @@ async fn run_skip_slot_test(skip_slots: u64) { harness_a.chain.head_snapshot().beacon_block_root, harness_a.get_head_block(), NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ) .await diff --git a/beacon_node/http_api/src/publish_blocks.rs b/beacon_node/http_api/src/publish_blocks.rs index e23768ebb6..10d000ef6f 100644 --- a/beacon_node/http_api/src/publish_blocks.rs +++ b/beacon_node/http_api/src/publish_blocks.rs @@ -19,8 +19,8 @@ use std::time::Duration; use tokio::sync::mpsc::UnboundedSender; use tree_hash::TreeHash; use types::{ - AbstractExecPayload, BeaconBlockRef, BlobSidecarList, EthSpec, ExecPayload, ExecutionBlockHash, - ForkName, FullPayload, FullPayloadBellatrix, Hash256, SignedBeaconBlock, + AbstractExecPayload, BeaconBlockRef, BlobSidecarList, BlockImportSource, EthSpec, ExecPayload, + ExecutionBlockHash, ForkName, FullPayload, FullPayloadBellatrix, Hash256, SignedBeaconBlock, SignedBlindedBeaconBlock, VariableList, }; use warp::http::StatusCode; @@ -230,6 +230,7 @@ pub async fn publish_block NetworkBeaconProcessor { let block = verified_block.block.block_cloned(); let block_root = verified_block.block_root; + // TODO(block source) + let result = self .chain - 
.process_block_with_early_caching(block_root, verified_block, NotifyExecutionLayer::Yes) + .process_block_with_early_caching( + block_root, + verified_block, + BlockImportSource::Gossip, + NotifyExecutionLayer::Yes, + ) .await; match &result { diff --git a/beacon_node/network/src/network_beacon_processor/sync_methods.rs b/beacon_node/network/src/network_beacon_processor/sync_methods.rs index f66879715d..acd02ab6ad 100644 --- a/beacon_node/network/src/network_beacon_processor/sync_methods.rs +++ b/beacon_node/network/src/network_beacon_processor/sync_methods.rs @@ -24,6 +24,7 @@ use store::KzgCommitment; use tokio::sync::mpsc; use types::beacon_block_body::format_kzg_commitments; use types::blob_sidecar::FixedBlobSidecarList; +use types::BlockImportSource; use types::{Epoch, Hash256}; /// Id associated to a batch processing request, either a sync batch or a parent lookup. @@ -153,7 +154,12 @@ impl NetworkBeaconProcessor { let result = self .chain - .process_block_with_early_caching(block_root, block, NotifyExecutionLayer::Yes) + .process_block_with_early_caching( + block_root, + block, + BlockImportSource::Lookup, + NotifyExecutionLayer::Yes, + ) .await; metrics::inc_counter(&metrics::BEACON_PROCESSOR_RPC_BLOCK_IMPORTED_TOTAL); diff --git a/consensus/types/src/beacon_block.rs b/consensus/types/src/beacon_block.rs index 81491d6505..ed3d182772 100644 --- a/consensus/types/src/beacon_block.rs +++ b/consensus/types/src/beacon_block.rs @@ -4,6 +4,7 @@ use derivative::Derivative; use serde::{Deserialize, Serialize}; use ssz::{Decode, DecodeError}; use ssz_derive::{Decode, Encode}; +use std::fmt; use std::marker::PhantomData; use superstruct::superstruct; use test_random_derive::TestRandom; @@ -836,6 +837,23 @@ impl> ForkVersionDeserialize )) } } +pub enum BlockImportSource { + Gossip, + Lookup, + RangeSync, + HttpApi, +} + +impl fmt::Display for BlockImportSource { + fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { + match self { + BlockImportSource::Gossip => 
write!(f, "gossip"), + BlockImportSource::Lookup => write!(f, "lookup"), + BlockImportSource::RangeSync => write!(f, "range_sync"), + BlockImportSource::HttpApi => write!(f, "http_api"), + } + } +} #[cfg(test)] mod tests { diff --git a/consensus/types/src/lib.rs b/consensus/types/src/lib.rs index 5c521d98af..c170b6b70d 100644 --- a/consensus/types/src/lib.rs +++ b/consensus/types/src/lib.rs @@ -122,7 +122,7 @@ pub use crate::attester_slashing::AttesterSlashing; pub use crate::beacon_block::{ BeaconBlock, BeaconBlockAltair, BeaconBlockBase, BeaconBlockBellatrix, BeaconBlockCapella, BeaconBlockDeneb, BeaconBlockElectra, BeaconBlockRef, BeaconBlockRefMut, BlindedBeaconBlock, - EmptyBlock, + BlockImportSource, EmptyBlock, }; pub use crate::beacon_block_body::{ BeaconBlockBody, BeaconBlockBodyAltair, BeaconBlockBodyBase, BeaconBlockBodyBellatrix, diff --git a/testing/ef_tests/src/cases/fork_choice.rs b/testing/ef_tests/src/cases/fork_choice.rs index f0749c3c7e..bd8cc79156 100644 --- a/testing/ef_tests/src/cases/fork_choice.rs +++ b/testing/ef_tests/src/cases/fork_choice.rs @@ -24,9 +24,9 @@ use std::future::Future; use std::sync::Arc; use std::time::Duration; use types::{ - Attestation, AttesterSlashing, BeaconBlock, BeaconState, BlobSidecar, BlobsList, Checkpoint, - ExecutionBlockHash, Hash256, IndexedAttestation, KzgProof, ProposerPreparationData, - SignedBeaconBlock, Slot, Uint256, + Attestation, AttesterSlashing, BeaconBlock, BeaconState, BlobSidecar, BlobsList, + BlockImportSource, Checkpoint, ExecutionBlockHash, Hash256, IndexedAttestation, KzgProof, + ProposerPreparationData, SignedBeaconBlock, Slot, Uint256, }; #[derive(Default, Debug, PartialEq, Clone, Deserialize, Decode)] @@ -498,6 +498,7 @@ impl Tester { block_root, block.clone(), NotifyExecutionLayer::Yes, + BlockImportSource::Lookup, || Ok(()), ))? 
.map(|avail: AvailabilityProcessingStatus| avail.try_into()); From 1d6160549dbedbfc11c9989e5620f45e15e2a6e3 Mon Sep 17 00:00:00 2001 From: realbigsean Date: Wed, 15 May 2024 06:37:59 -0400 Subject: [PATCH 08/19] use electra feature in notifier completeness check (#5786) * use electra feature in notifier completeness check * update capella and denab readiness chacks to use fork name rather than random fields --- beacon_node/client/src/notifier.rs | 34 ++++++++++++------------------ 1 file changed, 13 insertions(+), 21 deletions(-) diff --git a/beacon_node/client/src/notifier.rs b/beacon_node/client/src/notifier.rs index a6fd07789d..0a2f24748d 100644 --- a/beacon_node/client/src/notifier.rs +++ b/beacon_node/client/src/notifier.rs @@ -434,11 +434,9 @@ async fn capella_readiness_logging( .canonical_head .cached_head() .snapshot - .beacon_block - .message() - .body() - .execution_payload() - .map_or(false, |payload| payload.withdrawals_root().is_ok()); + .beacon_state + .fork_name_unchecked() + >= ForkName::Capella; let has_execution_layer = beacon_chain.execution_layer.is_some(); @@ -496,11 +494,9 @@ async fn deneb_readiness_logging( .canonical_head .cached_head() .snapshot - .beacon_block - .message() - .body() - .execution_payload() - .map_or(false, |payload| payload.blob_gas_used().is_ok()); + .beacon_state + .fork_name_unchecked() + >= ForkName::Deneb; let has_execution_layer = beacon_chain.execution_layer.is_some(); @@ -549,17 +545,13 @@ async fn electra_readiness_logging( beacon_chain: &BeaconChain, log: &Logger, ) { - // TODO(electra): Once Electra has features, this code can be swapped back. 
- let electra_completed = false; - //let electra_completed = beacon_chain - // .canonical_head - // .cached_head() - // .snapshot - // .beacon_block - // .message() - // .body() - // .execution_payload() - // .map_or(false, |payload| payload.electra_placeholder().is_ok()); + let electra_completed = beacon_chain + .canonical_head + .cached_head() + .snapshot + .beacon_state + .fork_name_unchecked() + >= ForkName::Electra; let has_execution_layer = beacon_chain.execution_layer.is_some(); From 0f49951363e611d32c5b71a994778713349c306a Mon Sep 17 00:00:00 2001 From: antondlr Date: Thu, 16 May 2024 11:33:32 +0300 Subject: [PATCH 09/19] Skip CI's `test-suite` when the `skip-ci` label is present (#5790) * skip `test-suite` if `skip-ci` label present --- .github/workflows/test-suite.yml | 61 ++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) diff --git a/.github/workflows/test-suite.yml b/.github/workflows/test-suite.yml index 74aab44ac0..84ef42dc3f 100644 --- a/.github/workflows/test-suite.yml +++ b/.github/workflows/test-suite.yml @@ -29,6 +29,31 @@ env: # Enable portable to prevent issues with caching `blst` for the wrong CPU type TEST_FEATURES: portable jobs: + check-labels: + runs-on: ubuntu-latest + name: Check for 'skip-ci' label + outputs: + skip_ci: ${{ steps.set-output.outputs.SKIP_CI }} + steps: + - name: check for skip-ci label + id: set-output + env: + LABELS: ${{ toJson(github.event.pull_request.labels) }} + run: | + SKIP_CI="false" + if [ -z "${LABELS}" ]; then + LABELS="none"; + else + LABELS=$(echo ${LABELS} | jq -r '.[].name') + fi + for label in ${LABELS}; do + if [ "$label" = "skip-ci" ]; then + SKIP_CI="true" + break + fi + done + echo "::set-output name=skip_ci::$SKIP_CI" + target-branch-check: name: target-branch-check runs-on: ubuntu-latest @@ -38,6 +63,8 @@ jobs: run: test ${{ github.base_ref }} != "stable" release-tests-ubuntu: name: release-tests-ubuntu + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' # Use 
self-hosted runners only on the sigp repo. runs-on: ${{ github.repository == 'sigp/lighthouse' && fromJson('["self-hosted", "linux", "CI", "large"]') || 'ubuntu-latest' }} steps: @@ -63,6 +90,8 @@ jobs: run: sccache --show-stats release-tests-windows: name: release-tests-windows + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ${{ github.repository == 'sigp/lighthouse' && fromJson('["self-hosted", "windows", "CI"]') || 'windows-2019' }} steps: - uses: actions/checkout@v4 @@ -97,6 +126,8 @@ jobs: run: sccache --show-stats beacon-chain-tests: name: beacon-chain-tests + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' # Use self-hosted runners only on the sigp repo. runs-on: ${{ github.repository == 'sigp/lighthouse' && fromJson('["self-hosted", "linux", "CI", "large"]') || 'ubuntu-latest' }} env: @@ -117,6 +148,8 @@ jobs: run: sccache --show-stats op-pool-tests: name: op-pool-tests + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} @@ -132,6 +165,8 @@ jobs: run: make test-op-pool network-tests: name: network-tests + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} @@ -147,6 +182,8 @@ jobs: run: make test-network slasher-tests: name: slasher-tests + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} @@ -162,6 +199,8 @@ jobs: run: make test-slasher debug-tests-ubuntu: name: debug-tests-ubuntu + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' # Use self-hosted runners only on the sigp repo. 
runs-on: ${{ github.repository == 'sigp/lighthouse' && fromJson('["self-hosted", "linux", "CI", "large"]') || 'ubuntu-latest' }} env: @@ -186,6 +225,8 @@ jobs: run: sccache --show-stats state-transition-vectors-ubuntu: name: state-transition-vectors-ubuntu + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 @@ -198,6 +239,8 @@ jobs: run: make run-state-transition-tests ef-tests-ubuntu: name: ef-tests-ubuntu + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' # Use self-hosted runners only on the sigp repo. runs-on: ${{ github.repository == 'sigp/lighthouse' && fromJson('["self-hosted", "linux", "CI", "small"]') || 'ubuntu-latest' }} env: @@ -218,6 +261,8 @@ jobs: run: sccache --show-stats dockerfile-ubuntu: name: dockerfile-ubuntu + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 @@ -227,6 +272,8 @@ jobs: run: docker run -t lighthouse:local lighthouse --version basic-simulator-ubuntu: name: basic-simulator-ubuntu + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 @@ -239,6 +286,8 @@ jobs: run: cargo run --release --bin simulator basic-sim fallback-simulator-ubuntu: name: fallback-simulator-ubuntu + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 @@ -251,6 +300,8 @@ jobs: run: cargo run --release --bin simulator fallback-sim doppelganger-protection-test: name: doppelganger-protection-test + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ${{ github.repository == 'sigp/lighthouse' && fromJson('["self-hosted", "linux", "CI", "small"]') || 'ubuntu-latest' }} env: # Enable portable to prevent issues with caching `blst` for the wrong CPU type @@ -285,6 +336,8 @@ jobs: 
./doppelganger_protection.sh success genesis.json execution-engine-integration-ubuntu: name: execution-engine-integration-ubuntu + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ${{ github.repository == 'sigp/lighthouse' && fromJson('["self-hosted", "linux", "CI", "small"]') || 'ubuntu-latest' }} steps: - uses: actions/checkout@v4 @@ -344,6 +397,8 @@ jobs: run: cargo check --workspace cargo-udeps: name: cargo-udeps + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 @@ -366,6 +421,8 @@ jobs: RUSTFLAGS: "" compile-with-beta-compiler: name: compile-with-beta-compiler + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 @@ -377,6 +434,8 @@ jobs: run: make cli-check: name: cli-check + needs: [check-labels] + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 @@ -391,8 +450,10 @@ jobs: # a PR is safe to merge. New jobs should be added here. 
test-suite-success: name: test-suite-success + if: needs.check-labels.outputs.skip_ci != 'true' runs-on: ubuntu-latest needs: [ + 'check-labels', 'target-branch-check', 'release-tests-ubuntu', 'release-tests-windows', From 319b4a246733e106a4608c22fa8e0dcba9043da5 Mon Sep 17 00:00:00 2001 From: Lion - dapplion <35266934+dapplion@users.noreply.github.com> Date: Fri, 17 May 2024 13:58:27 +0300 Subject: [PATCH 10/19] Skip creating child lookup if parent is never created (#5803) * Skip creating child lookup if parent is never created --- .../network/src/sync/block_lookups/mod.rs | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/beacon_node/network/src/sync/block_lookups/mod.rs b/beacon_node/network/src/sync/block_lookups/mod.rs index 48dda03fac..73ed3a92c1 100644 --- a/beacon_node/network/src/sync/block_lookups/mod.rs +++ b/beacon_node/network/src/sync/block_lookups/mod.rs @@ -298,9 +298,12 @@ impl BlockLookups { }; let result = lookup.continue_requests(cx); - self.on_lookup_result(id, result, "new_current_lookup", cx); - self.update_metrics(); - true + if self.on_lookup_result(id, result, "new_current_lookup", cx) { + self.update_metrics(); + true + } else { + false + } } /* Lookup responses */ @@ -622,15 +625,16 @@ impl BlockLookups { } /// Common handler a lookup request error, drop it and update metrics + /// Returns true if the lookup is created or already exists fn on_lookup_result( &mut self, id: SingleLookupId, result: Result, source: &str, cx: &mut SyncNetworkContext, - ) { + ) -> bool { match result { - Ok(LookupResult::Pending) => {} // no action + Ok(LookupResult::Pending) => true, // no action Ok(LookupResult::Completed) => { if let Some(lookup) = self.single_block_lookups.remove(&id) { debug!(self.log, "Dropping completed lookup"; "block" => ?lookup.block_root(), "id" => id); @@ -641,12 +645,14 @@ impl BlockLookups { } else { debug!(self.log, "Attempting to drop non-existent lookup"; "id" => id); } + false } Err(error) => 
{ debug!(self.log, "Dropping lookup on request error"; "id" => id, "source" => source, "error" => ?error); metrics::inc_counter_vec(&metrics::SYNC_LOOKUP_DROPPED, &[error.into()]); self.drop_lookup_and_children(id); self.update_metrics(); + false } } } From 8006418d802cefefe635443e1d9f87dbfae3fd2e Mon Sep 17 00:00:00 2001 From: Lion - dapplion <35266934+dapplion@users.noreply.github.com> Date: Fri, 17 May 2024 14:34:21 +0300 Subject: [PATCH 11/19] Type sync network context send errors (#5808) * Type sync network context send errors * Consisntent naming --- .../network/src/sync/backfill_sync/mod.rs | 2 +- .../network/src/sync/block_lookups/common.rs | 8 +- .../network/src/sync/block_lookups/mod.rs | 6 +- .../sync/block_lookups/single_block_lookup.rs | 10 +- .../network/src/sync/network_context.rs | 194 ++++++++++-------- .../network/src/sync/range_sync/chain.rs | 2 +- 6 files changed, 119 insertions(+), 103 deletions(-) diff --git a/beacon_node/network/src/sync/backfill_sync/mod.rs b/beacon_node/network/src/sync/backfill_sync/mod.rs index 4be92d59a4..ce7d04ac0a 100644 --- a/beacon_node/network/src/sync/backfill_sync/mod.rs +++ b/beacon_node/network/src/sync/backfill_sync/mod.rs @@ -952,7 +952,7 @@ impl BackFillSync { Err(e) => { // NOTE: under normal conditions this shouldn't happen but we handle it anyway warn!(self.log, "Could not send batch request"; - "batch_id" => batch_id, "error" => e, &batch); + "batch_id" => batch_id, "error" => ?e, &batch); // register the failed download and check if the batch can be retried if let Err(e) = batch.start_downloading_from_peer(peer, 1) { return self.fail_sync(BackFillError::BatchInvalidState(batch_id, e.0)); diff --git a/beacon_node/network/src/sync/block_lookups/common.rs b/beacon_node/network/src/sync/block_lookups/common.rs index 400d382d6d..2791623f3f 100644 --- a/beacon_node/network/src/sync/block_lookups/common.rs +++ b/beacon_node/network/src/sync/block_lookups/common.rs @@ -82,7 +82,7 @@ impl RequestState for 
BlockRequestState { cx: &mut SyncNetworkContext, ) -> Result { cx.block_lookup_request(id, peer_id, self.requested_block_root) - .map_err(LookupRequestError::SendFailed) + .map_err(LookupRequestError::SendFailedNetwork) } fn send_for_processing( @@ -102,7 +102,7 @@ impl RequestState for BlockRequestState { RpcBlock::new_without_blobs(Some(block_root), value), seen_timestamp, ) - .map_err(LookupRequestError::SendFailed) + .map_err(LookupRequestError::SendFailedProcessor) } fn response_type() -> ResponseType { @@ -135,7 +135,7 @@ impl RequestState for BlobRequestState { self.block_root, downloaded_block_expected_blobs, ) - .map_err(LookupRequestError::SendFailed) + .map_err(LookupRequestError::SendFailedNetwork) } fn send_for_processing( @@ -150,7 +150,7 @@ impl RequestState for BlobRequestState { peer_id: _, } = download_result; cx.send_blobs_for_processing(id, block_root, value, seen_timestamp) - .map_err(LookupRequestError::SendFailed) + .map_err(LookupRequestError::SendFailedProcessor) } fn response_type() -> ResponseType { diff --git a/beacon_node/network/src/sync/block_lookups/mod.rs b/beacon_node/network/src/sync/block_lookups/mod.rs index 73ed3a92c1..79e95e4c8c 100644 --- a/beacon_node/network/src/sync/block_lookups/mod.rs +++ b/beacon_node/network/src/sync/block_lookups/mod.rs @@ -2,7 +2,7 @@ use self::parent_chain::{compute_parent_chains, NodeChain}; pub use self::single_block_lookup::DownloadResult; use self::single_block_lookup::{LookupRequestError, LookupResult, SingleBlockLookup}; use super::manager::{BlockProcessType, BlockProcessingResult}; -use super::network_context::{RpcProcessingResult, SyncNetworkContext}; +use super::network_context::{RpcResponseResult, SyncNetworkContext}; use crate::metrics; use crate::sync::block_lookups::common::{ResponseType, PARENT_DEPTH_TOLERANCE}; use crate::sync::block_lookups::parent_chain::find_oldest_fork_ancestor; @@ -313,7 +313,7 @@ impl BlockLookups { &mut self, id: SingleLookupReqId, peer_id: PeerId, - response: 
RpcProcessingResult, + response: RpcResponseResult, cx: &mut SyncNetworkContext, ) { let result = self.on_download_response_inner::(id, peer_id, response, cx); @@ -325,7 +325,7 @@ impl BlockLookups { &mut self, id: SingleLookupReqId, peer_id: PeerId, - response: RpcProcessingResult, + response: RpcResponseResult, cx: &mut SyncNetworkContext, ) -> Result { // Note: no need to downscore peers here, already downscored on network context diff --git a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs index b35a3e91fb..28ac0378b3 100644 --- a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs +++ b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs @@ -2,7 +2,9 @@ use super::common::ResponseType; use super::{BlockComponent, PeerId, SINGLE_BLOCK_LOOKUP_MAX_ATTEMPTS}; use crate::sync::block_lookups::common::RequestState; use crate::sync::block_lookups::Id; -use crate::sync::network_context::{LookupRequestResult, ReqId, SyncNetworkContext}; +use crate::sync::network_context::{ + LookupRequestResult, ReqId, RpcRequestSendError, SendErrorProcessor, SyncNetworkContext, +}; use beacon_chain::BeaconChainTypes; use derivative::Derivative; use itertools::Itertools; @@ -34,8 +36,10 @@ pub enum LookupRequestError { }, /// No peers left to serve this lookup NoPeers, - /// Error sending event to network or beacon processor - SendFailed(&'static str), + /// Error sending event to network + SendFailedNetwork(RpcRequestSendError), + /// Error sending event to processor + SendFailedProcessor(SendErrorProcessor), /// Inconsistent lookup request state BadState(String), /// Lookup failed for some other reason and should be dropped diff --git a/beacon_node/network/src/sync/network_context.rs b/beacon_node/network/src/sync/network_context.rs index 8693bc0c6c..fa1f50cee0 100644 --- a/beacon_node/network/src/sync/network_context.rs +++ 
b/beacon_node/network/src/sync/network_context.rs @@ -52,31 +52,43 @@ pub enum RpcEvent { RPCError(RPCError), } -pub type RpcProcessingResult = Result<(T, Duration), LookupFailure>; +pub type RpcResponseResult = Result<(T, Duration), RpcResponseError>; -pub enum LookupFailure { +pub enum RpcResponseError { RpcError(RPCError), - LookupVerifyError(LookupVerifyError), + VerifyError(LookupVerifyError), } -impl std::fmt::Display for LookupFailure { +#[derive(Debug, PartialEq, Eq)] +pub enum RpcRequestSendError { + /// Network channel send failed + NetworkSendError, +} + +#[derive(Debug, PartialEq, Eq)] +pub enum SendErrorProcessor { + SendError, + ProcessorNotAvailable, +} + +impl std::fmt::Display for RpcResponseError { fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result { match self { - LookupFailure::RpcError(e) => write!(f, "RPC Error: {:?}", e), - LookupFailure::LookupVerifyError(e) => write!(f, "Lookup Verify Error: {:?}", e), + RpcResponseError::RpcError(e) => write!(f, "RPC Error: {:?}", e), + RpcResponseError::VerifyError(e) => write!(f, "Lookup Verify Error: {:?}", e), } } } -impl From for LookupFailure { +impl From for RpcResponseError { fn from(e: RPCError) -> Self { - LookupFailure::RpcError(e) + RpcResponseError::RpcError(e) } } -impl From for LookupFailure { +impl From for RpcResponseError { fn from(e: LookupVerifyError) -> Self { - LookupFailure::LookupVerifyError(e) + RpcResponseError::VerifyError(e) } } @@ -209,7 +221,7 @@ impl SyncNetworkContext { peer_id: PeerId, batch_type: ByRangeRequestType, request: BlocksByRangeRequest, - ) -> Result { + ) -> Result { let id = self.next_id(); trace!( self.log, @@ -218,11 +230,13 @@ impl SyncNetworkContext { "count" => request.count(), "peer" => %peer_id, ); - self.send_network_msg(NetworkMessage::SendRequest { - peer_id, - request: Request::BlocksByRange(request.clone()), - request_id: RequestId::Sync(SyncRequestId::RangeBlockAndBlobs { id }), - })?; + self.network_send + 
.send(NetworkMessage::SendRequest { + peer_id, + request: Request::BlocksByRange(request.clone()), + request_id: RequestId::Sync(SyncRequestId::RangeBlockAndBlobs { id }), + }) + .map_err(|_| RpcRequestSendError::NetworkSendError)?; if matches!(batch_type, ByRangeRequestType::BlocksAndBlobs) { debug!( @@ -234,14 +248,16 @@ impl SyncNetworkContext { ); // Create the blob request based on the blocks request. - self.send_network_msg(NetworkMessage::SendRequest { - peer_id, - request: Request::BlobsByRange(BlobsByRangeRequest { - start_slot: *request.start_slot(), - count: *request.count(), - }), - request_id: RequestId::Sync(SyncRequestId::RangeBlockAndBlobs { id }), - })?; + self.network_send + .send(NetworkMessage::SendRequest { + peer_id, + request: Request::BlobsByRange(BlobsByRangeRequest { + start_slot: *request.start_slot(), + count: *request.count(), + }), + request_id: RequestId::Sync(SyncRequestId::RangeBlockAndBlobs { id }), + }) + .map_err(|_| RpcRequestSendError::NetworkSendError)?; } Ok(id) @@ -254,7 +270,7 @@ impl SyncNetworkContext { batch_type: ByRangeRequestType, request: BlocksByRangeRequest, sender_id: RangeRequestId, - ) -> Result { + ) -> Result { let id = self.blocks_by_range_request(peer_id, batch_type, request)?; self.range_blocks_and_blobs_requests .insert(id, (sender_id, BlocksAndBlobsRequestInfo::new(batch_type))); @@ -320,7 +336,7 @@ impl SyncNetworkContext { lookup_id: SingleLookupId, peer_id: PeerId, block_root: Hash256, - ) -> Result { + ) -> Result { // da_checker includes block that are execution verified, but are missing components if self .chain @@ -357,11 +373,13 @@ impl SyncNetworkContext { let request = BlocksByRootSingleRequest(block_root); - self.send_network_msg(NetworkMessage::SendRequest { - peer_id, - request: Request::BlocksByRoot(request.into_request(&self.chain.spec)), - request_id: RequestId::Sync(SyncRequestId::SingleBlock { id }), - })?; + self.network_send + .send(NetworkMessage::SendRequest { + peer_id, + request: 
Request::BlocksByRoot(request.into_request(&self.chain.spec)), + request_id: RequestId::Sync(SyncRequestId::SingleBlock { id }), + }) + .map_err(|_| RpcRequestSendError::NetworkSendError)?; self.blocks_by_root_requests .insert(id, ActiveBlocksByRootRequest::new(request)); @@ -381,7 +399,7 @@ impl SyncNetworkContext { peer_id: PeerId, block_root: Hash256, downloaded_block_expected_blobs: Option, - ) -> Result { + ) -> Result { let Some(expected_blobs) = downloaded_block_expected_blobs.or_else(|| { self.chain .data_availability_checker @@ -428,11 +446,13 @@ impl SyncNetworkContext { indices, }; - self.send_network_msg(NetworkMessage::SendRequest { - peer_id, - request: Request::BlobsByRoot(request.clone().into_request(&self.chain.spec)), - request_id: RequestId::Sync(SyncRequestId::SingleBlob { id }), - })?; + self.network_send + .send(NetworkMessage::SendRequest { + peer_id, + request: Request::BlobsByRoot(request.clone().into_request(&self.chain.spec)), + request_id: RequestId::Sync(SyncRequestId::SingleBlob { id }), + }) + .map_err(|_| RpcRequestSendError::NetworkSendError)?; self.blobs_by_root_requests .insert(id, ActiveBlobsByRootRequest::new(request)); @@ -549,7 +569,7 @@ impl SyncNetworkContext { request_id: SingleLookupReqId, peer_id: PeerId, block: RpcEvent>>, - ) -> Option>>> { + ) -> Option>>> { let Entry::Occupied(mut request) = self.blocks_by_root_requests.entry(request_id) else { return None; }; @@ -575,7 +595,7 @@ impl SyncNetworkContext { } }; - if let Err(LookupFailure::LookupVerifyError(e)) = &resp { + if let Err(RpcResponseError::VerifyError(e)) = &resp { self.report_peer(peer_id, PeerAction::LowToleranceError, e.into()); } Some(resp) @@ -586,7 +606,7 @@ impl SyncNetworkContext { request_id: SingleLookupReqId, peer_id: PeerId, blob: RpcEvent>>, - ) -> Option>> { + ) -> Option>> { let Entry::Occupied(mut request) = self.blobs_by_root_requests.entry(request_id) else { return None; }; @@ -618,7 +638,7 @@ impl SyncNetworkContext { // catch if a peer is 
returning more blobs than requested or if the excess blobs are // invalid. Err((e, resolved)) => { - if let LookupFailure::LookupVerifyError(e) = &e { + if let RpcResponseError::VerifyError(e) = &e { self.report_peer(peer_id, PeerAction::LowToleranceError, e.into()); } if resolved { @@ -636,31 +656,27 @@ impl SyncNetworkContext { block_root: Hash256, block: RpcBlock, duration: Duration, - ) -> Result<(), &'static str> { - match self.beacon_processor_if_enabled() { - Some(beacon_processor) => { - debug!(self.log, "Sending block for processing"; "block" => ?block_root, "id" => id); - if let Err(e) = beacon_processor.send_rpc_beacon_block( - block_root, - block, - duration, - BlockProcessType::SingleBlock { id }, - ) { - error!( - self.log, - "Failed to send sync block to processor"; - "error" => ?e - ); - Err("beacon processor send failure") - } else { - Ok(()) - } - } - None => { - trace!(self.log, "Dropping block ready for processing. Beacon processor not available"; "block" => %block_root); - Err("beacon processor unavailable") - } - } + ) -> Result<(), SendErrorProcessor> { + let beacon_processor = self + .beacon_processor_if_enabled() + .ok_or(SendErrorProcessor::ProcessorNotAvailable)?; + + debug!(self.log, "Sending block for processing"; "block" => ?block_root, "id" => id); + beacon_processor + .send_rpc_beacon_block( + block_root, + block, + duration, + BlockProcessType::SingleBlock { id }, + ) + .map_err(|e| { + error!( + self.log, + "Failed to send sync block to processor"; + "error" => ?e + ); + SendErrorProcessor::SendError + }) } pub fn send_blobs_for_processing( @@ -669,31 +685,27 @@ impl SyncNetworkContext { block_root: Hash256, blobs: FixedBlobSidecarList, duration: Duration, - ) -> Result<(), &'static str> { - match self.beacon_processor_if_enabled() { - Some(beacon_processor) => { - debug!(self.log, "Sending blobs for processing"; "block" => ?block_root, "id" => id); - if let Err(e) = beacon_processor.send_rpc_blobs( - block_root, - blobs, - 
duration, - BlockProcessType::SingleBlob { id }, - ) { - error!( - self.log, - "Failed to send sync blobs to processor"; - "error" => ?e - ); - Err("beacon processor send failure") - } else { - Ok(()) - } - } - None => { - trace!(self.log, "Dropping blobs ready for processing. Beacon processor not available"; "block_root" => %block_root); - Err("beacon processor unavailable") - } - } + ) -> Result<(), SendErrorProcessor> { + let beacon_processor = self + .beacon_processor_if_enabled() + .ok_or(SendErrorProcessor::ProcessorNotAvailable)?; + + debug!(self.log, "Sending blobs for processing"; "block" => ?block_root, "id" => id); + beacon_processor + .send_rpc_blobs( + block_root, + blobs, + duration, + BlockProcessType::SingleBlob { id }, + ) + .map_err(|e| { + error!( + self.log, + "Failed to send sync blobs to processor"; + "error" => ?e + ); + SendErrorProcessor::SendError + }) } } diff --git a/beacon_node/network/src/sync/range_sync/chain.rs b/beacon_node/network/src/sync/range_sync/chain.rs index 9a6c99ebf6..63cafa9aca 100644 --- a/beacon_node/network/src/sync/range_sync/chain.rs +++ b/beacon_node/network/src/sync/range_sync/chain.rs @@ -923,7 +923,7 @@ impl SyncingChain { Err(e) => { // NOTE: under normal conditions this shouldn't happen but we handle it anyway warn!(self.log, "Could not send batch request"; - "batch_id" => batch_id, "error" => e, &batch); + "batch_id" => batch_id, "error" => ?e, &batch); // register the failed download and check if the batch can be retried batch.start_downloading_from_peer(peer, 1)?; // fake request_id is not relevant self.peers From b5de925d8f3868502cc1813ab66d565b7865aedb Mon Sep 17 00:00:00 2001 From: chonghe <44791194+chong-he@users.noreply.github.com> Date: Mon, 20 May 2024 10:17:43 +0800 Subject: [PATCH 12/19] Use JSON header by default for `/eth/v1/beacon/deposit_snapshot` (#5813) * Fix with or * Flip case --- beacon_node/http_api/src/lib.rs | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git 
a/beacon_node/http_api/src/lib.rs b/beacon_node/http_api/src/lib.rs index 024e268e2a..1d7095ae13 100644 --- a/beacon_node/http_api/src/lib.rs +++ b/beacon_node/http_api/src/lib.rs @@ -2121,14 +2121,7 @@ pub fn serve( task_spawner: TaskSpawner, eth1_service: eth1::Service| { task_spawner.blocking_response_task(Priority::P1, move || match accept_header { - Some(api_types::Accept::Json) | None => { - let snapshot = eth1_service.get_deposit_snapshot(); - Ok( - warp::reply::json(&api_types::GenericResponse::from(snapshot)) - .into_response(), - ) - } - _ => eth1_service + Some(api_types::Accept::Ssz) => eth1_service .get_deposit_snapshot() .map(|snapshot| { Response::builder() @@ -2154,6 +2147,13 @@ pub fn serve( )) }) }), + _ => { + let snapshot = eth1_service.get_deposit_snapshot(); + Ok( + warp::reply::json(&api_types::GenericResponse::from(snapshot)) + .into_response(), + ) + } }) }, ); From 2a87016d94fdee29f94400151ecadccf28e9d082 Mon Sep 17 00:00:00 2001 From: Lion - dapplion <35266934+dapplion@users.noreply.github.com> Date: Mon, 20 May 2024 20:27:57 +0200 Subject: [PATCH 13/19] Fix lookup disconnect peer (#5815) * Test lookup peer disconnect modes * Fix lookup peer disconnected return early --- .../network/src/sync/block_lookups/mod.rs | 15 ++- .../sync/block_lookups/single_block_lookup.rs | 105 +++++------------- .../network/src/sync/block_lookups/tests.rs | 22 +++- 3 files changed, 59 insertions(+), 83 deletions(-) diff --git a/beacon_node/network/src/sync/block_lookups/mod.rs b/beacon_node/network/src/sync/block_lookups/mod.rs index 79e95e4c8c..0e89eb956c 100644 --- a/beacon_node/network/src/sync/block_lookups/mod.rs +++ b/beacon_node/network/src/sync/block_lookups/mod.rs @@ -191,7 +191,7 @@ impl BlockLookups { .iter() .find(|(_, l)| l.block_root() == block_to_drop) { - for &peer_id in lookup.all_used_peers() { + for &peer_id in lookup.all_peers() { cx.report_peer( peer_id, PeerAction::LowToleranceError, @@ -387,8 +387,15 @@ impl BlockLookups { pub fn 
peer_disconnected(&mut self, peer_id: &PeerId) { self.single_block_lookups.retain(|_, lookup| { - if lookup.remove_peer(peer_id) { - debug!(self.log, "Dropping single lookup after peer disconnection"; "block_root" => ?lookup.block_root()); + lookup.remove_peer(peer_id); + + // Note: this condition should be removed in the future. It's not strictly necessary to drop a + // lookup if there are no peers left. Lookup should only be dropped if it can not make progress + if lookup.has_no_peers() { + debug!(self.log, + "Dropping single lookup after peer disconnection"; + "block_root" => ?lookup.block_root() + ); false } else { true @@ -545,7 +552,7 @@ impl BlockLookups { lookup.continue_requests(cx) } Action::ParentUnknown { parent_root } => { - let peers = lookup.all_available_peers().cloned().collect::>(); + let peers = lookup.all_peers().copied().collect::>(); lookup.set_awaiting_parent(parent_root); debug!(self.log, "Marking lookup as awaiting parent"; "id" => lookup.id, "block_root" => ?block_root, "parent_root" => ?parent_root); self.search_parent_of_child(parent_root, block_root, &peers, cx); diff --git a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs index 28ac0378b3..f587a98254 100644 --- a/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs +++ b/beacon_node/network/src/sync/block_lookups/single_block_lookup.rs @@ -7,7 +7,6 @@ use crate::sync::network_context::{ }; use beacon_chain::BeaconChainTypes; use derivative::Derivative; -use itertools::Itertools; use rand::seq::IteratorRandom; use std::collections::HashSet; use std::fmt::Debug; @@ -64,6 +63,9 @@ pub struct SingleBlockLookup { pub id: Id, pub block_request_state: BlockRequestState, pub blob_request_state: BlobRequestState, + /// Peers that claim to have imported this set of block components + #[derivative(Debug(format_with = "fmt_peer_set_as_len"))] + peers: HashSet, block_root: Hash256, awaiting_parent: 
Option, created: Instant, @@ -78,8 +80,9 @@ impl SingleBlockLookup { ) -> Self { Self { id, - block_request_state: BlockRequestState::new(requested_block_root, peers), - blob_request_state: BlobRequestState::new(requested_block_root, peers), + block_request_state: BlockRequestState::new(requested_block_root), + blob_request_state: BlobRequestState::new(requested_block_root), + peers: HashSet::from_iter(peers.iter().copied()), block_root: requested_block_root, awaiting_parent, created: Instant::now(), @@ -134,22 +137,9 @@ impl SingleBlockLookup { self.block_root() == block_root } - /// Get all unique used peers across block and blob requests. - pub fn all_used_peers(&self) -> impl Iterator + '_ { - self.block_request_state - .state - .get_used_peers() - .chain(self.blob_request_state.state.get_used_peers()) - .unique() - } - - /// Get all unique available peers across block and blob requests. - pub fn all_available_peers(&self) -> impl Iterator + '_ { - self.block_request_state - .state - .get_available_peers() - .chain(self.blob_request_state.state.get_available_peers()) - .unique() + /// Get all unique peers that claim to have imported this set of block components + pub fn all_peers(&self) -> impl Iterator + '_ { + self.peers.iter() } /// Makes progress on all requests of this lookup. Any error is not recoverable and must result @@ -198,7 +188,7 @@ impl SingleBlockLookup { return Err(LookupRequestError::TooManyAttempts { cannot_process }); } - let Some(peer_id) = request.get_state_mut().use_rand_available_peer() else { + let Some(peer_id) = self.use_rand_available_peer() else { if awaiting_parent { // Allow lookups awaiting for a parent to have zero peers. If when the parent // resolve they still have zero peers the lookup will fail gracefully. @@ -208,6 +198,7 @@ impl SingleBlockLookup { } }; + let request = R::request_state_mut(self); match request.make_request(id, peer_id, downloaded_block_expected_blobs, cx)? 
{ LookupRequestResult::RequestSent(req_id) => { request.get_state_mut().on_download_start(req_id)? @@ -238,9 +229,7 @@ impl SingleBlockLookup { /// Add peer to all request states. The peer must be able to serve this request. /// Returns true if the peer was newly inserted into some request state. pub fn add_peer(&mut self, peer_id: PeerId) -> bool { - let inserted_block = self.block_request_state.state.add_peer(&peer_id); - let inserted_blob = self.blob_request_state.state.add_peer(&peer_id); - inserted_block || inserted_blob + self.peers.insert(peer_id) } /// Returns true if the block has already been downloaded. @@ -252,8 +241,17 @@ impl SingleBlockLookup { /// Remove peer from available peers. Return true if there are no more available peers and all /// requests are not expecting any future event (AwaitingDownload). pub fn remove_peer(&mut self, peer_id: &PeerId) -> bool { - self.block_request_state.state.remove_peer(peer_id) - && self.blob_request_state.state.remove_peer(peer_id) + self.peers.remove(peer_id) + } + + /// Returns true if this lookup has zero peers + pub fn has_no_peers(&self) -> bool { + self.peers.is_empty() + } + + /// Selects a random peer from available peers if any + fn use_rand_available_peer(&mut self) -> Option { + self.peers.iter().choose(&mut rand::thread_rng()).copied() } } @@ -267,10 +265,10 @@ pub struct BlobRequestState { } impl BlobRequestState { - pub fn new(block_root: Hash256, peer_source: &[PeerId]) -> Self { + pub fn new(block_root: Hash256) -> Self { Self { block_root, - state: SingleLookupRequestState::new(peer_source), + state: SingleLookupRequestState::new(), } } } @@ -285,10 +283,10 @@ pub struct BlockRequestState { } impl BlockRequestState { - pub fn new(block_root: Hash256, peers: &[PeerId]) -> Self { + pub fn new(block_root: Hash256) -> Self { Self { requested_block_root: block_root, - state: SingleLookupRequestState::new(peers), + state: SingleLookupRequestState::new(), } } } @@ -318,12 +316,6 @@ pub enum State { pub 
struct SingleLookupRequestState { /// State of this request. state: State, - /// Peers that should have this block or blob. - #[derivative(Debug(format_with = "fmt_peer_set"))] - available_peers: HashSet, - /// Peers from which we have requested this block. - #[derivative(Debug = "ignore")] - used_peers: HashSet, /// How many times have we attempted to process this block or blob. failed_processing: u8, /// How many times have we attempted to download this block or blob. @@ -331,16 +323,9 @@ pub struct SingleLookupRequestState { } impl SingleLookupRequestState { - pub fn new(peers: &[PeerId]) -> Self { - let mut available_peers = HashSet::default(); - for peer in peers.iter().copied() { - available_peers.insert(peer); - } - + pub fn new() -> Self { Self { state: State::AwaitingDownload, - available_peers, - used_peers: HashSet::default(), failed_processing: 0, failed_downloading: 0, } @@ -518,38 +503,6 @@ impl SingleLookupRequestState { pub fn more_failed_processing_attempts(&self) -> bool { self.failed_processing >= self.failed_downloading } - - /// Add peer to this request states. The peer must be able to serve this request. - /// Returns true if the peer is newly inserted. - pub fn add_peer(&mut self, peer_id: &PeerId) -> bool { - self.available_peers.insert(*peer_id) - } - - /// Remove peer from available peers. Return true if there are no more available peers and the - /// request is not expecting any future event (AwaitingDownload). - pub fn remove_peer(&mut self, disconnected_peer_id: &PeerId) -> bool { - self.available_peers.remove(disconnected_peer_id); - self.available_peers.is_empty() && self.is_awaiting_download() - } - - pub fn get_used_peers(&self) -> impl Iterator { - self.used_peers.iter() - } - - pub fn get_available_peers(&self) -> impl Iterator { - self.available_peers.iter() - } - - /// Selects a random peer from available peers if any, inserts it in used peers and returns it. 
- pub fn use_rand_available_peer(&mut self) -> Option { - let peer_id = self - .available_peers - .iter() - .choose(&mut rand::thread_rng()) - .copied()?; - self.used_peers.insert(peer_id); - Some(peer_id) - } } // Display is used in the BadState assertions above @@ -573,7 +526,7 @@ impl std::fmt::Debug for State { } } -fn fmt_peer_set( +fn fmt_peer_set_as_len( peer_set: &HashSet, f: &mut std::fmt::Formatter, ) -> Result<(), std::fmt::Error> { diff --git a/beacon_node/network/src/sync/block_lookups/tests.rs b/beacon_node/network/src/sync/block_lookups/tests.rs index 2a59c24d58..b99598fe90 100644 --- a/beacon_node/network/src/sync/block_lookups/tests.rs +++ b/beacon_node/network/src/sync/block_lookups/tests.rs @@ -526,8 +526,10 @@ impl TestRig { fn peer_disconnected(&mut self, disconnected_peer_id: PeerId) { self.send_sync_message(SyncMessage::Disconnect(disconnected_peer_id)); + } - // Return RPCErrors for all active requests of peer + /// Return RPCErrors for all active requests of peer + fn rpc_error_all_active_requests(&mut self, disconnected_peer_id: PeerId) { self.drain_network_rx(); while let Ok(request_id) = self.pop_received_network_event(|ev| match ev { NetworkMessage::SendRequest { @@ -1265,13 +1267,25 @@ fn test_parent_lookup_too_deep() { } #[test] -fn test_parent_lookup_disconnection_no_peers_left() { +fn test_lookup_peer_disconnected_no_peers_left_while_request() { let mut rig = TestRig::test_setup(); let peer_id = rig.new_connected_peer(); let trigger_block = rig.rand_block(); rig.trigger_unknown_parent_block(peer_id, trigger_block.into()); + rig.peer_disconnected(peer_id); + rig.rpc_error_all_active_requests(peer_id); + rig.expect_no_active_lookups(); +} +#[test] +fn test_lookup_peer_disconnected_no_peers_left_not_while_request() { + let mut rig = TestRig::test_setup(); + let peer_id = rig.new_connected_peer(); + let trigger_block = rig.rand_block(); + rig.trigger_unknown_parent_block(peer_id, trigger_block.into()); rig.peer_disconnected(peer_id); + 
// Note: this test case may be removed in the future. It's not strictly necessary to drop a + // lookup if there are no peers left. Lookup should only be dropped if it can not make progress rig.expect_no_active_lookups(); } @@ -1279,13 +1293,15 @@ fn test_parent_lookup_disconnection_no_peers_left() { fn test_lookup_disconnection_peer_left() { let mut rig = TestRig::test_setup(); let peer_ids = (0..2).map(|_| rig.new_connected_peer()).collect::>(); + let disconnecting_peer = *peer_ids.first().unwrap(); let block_root = Hash256::random(); // lookup should have two peers associated with the same block for peer_id in peer_ids.iter() { rig.trigger_unknown_block_from_attestation(block_root, *peer_id); } // Disconnect the first peer only, which is the one handling the request - rig.peer_disconnected(*peer_ids.first().unwrap()); + rig.peer_disconnected(disconnecting_peer); + rig.rpc_error_all_active_requests(disconnecting_peer); rig.assert_single_lookups_count(1); } From 52e31121df5ba74442d37ed9ec1812ee0bdad44a Mon Sep 17 00:00:00 2001 From: Jimmy Chen Date: Wed, 22 May 2024 10:52:40 +1000 Subject: [PATCH 14/19] Reduce frequency of polling unknown validators to avoid overwhelming the Beacon Node (#5628) * Reduce frequency of polling unknown validators. * Move slot calculation into for loop. * Simplify logic. Co-authored-by: Michael Sproul * Fix formatting --- validator_client/src/duties_service.rs | 33 ++++++++++++++++++++++++++ validator_client/src/lib.rs | 1 + 2 files changed, 34 insertions(+) diff --git a/validator_client/src/duties_service.rs b/validator_client/src/duties_service.rs index 6f25a1c05d..880f0eaa48 100644 --- a/validator_client/src/duties_service.rs +++ b/validator_client/src/duties_service.rs @@ -215,6 +215,8 @@ pub struct DutiesService { pub sync_duties: SyncDutiesMap, /// Provides the canonical list of locally-managed validators. pub validator_store: Arc>, + /// Maps unknown validator pubkeys to the next slot time when a poll should be conducted again. 
+ pub unknown_validator_next_poll_slots: RwLock>, /// Tracks the current slot. pub slot_clock: T, /// Provides HTTP access to remote beacon nodes. @@ -489,6 +491,24 @@ async fn poll_validator_indices( .is_some(); if !is_known { + let current_slot_opt = duties_service.slot_clock.now(); + + if let Some(current_slot) = current_slot_opt { + let is_first_slot_of_epoch = current_slot % E::slots_per_epoch() == 0; + + // Query an unknown validator later if it was queried within the last epoch, or if + // the current slot is the first slot of an epoch. + let poll_later = duties_service + .unknown_validator_next_poll_slots + .read() + .get(&pubkey) + .map(|&poll_slot| poll_slot > current_slot || is_first_slot_of_epoch) + .unwrap_or(false); + if poll_later { + continue; + } + } + // Query the remote BN to resolve a pubkey to a validator index. let download_result = duties_service .beacon_nodes @@ -533,10 +553,23 @@ async fn poll_validator_indices( .initialized_validators() .write() .set_index(&pubkey, response.data.index); + + duties_service + .unknown_validator_next_poll_slots + .write() + .remove(&pubkey); } // This is not necessarily an error, it just means the validator is not yet known to // the beacon chain. 
Ok(None) => { + if let Some(current_slot) = current_slot_opt { + let next_poll_slot = current_slot.saturating_add(E::slots_per_epoch()); + duties_service + .unknown_validator_next_poll_slots + .write() + .insert(pubkey, next_poll_slot); + } + debug!( log, "Validator without index"; diff --git a/validator_client/src/lib.rs b/validator_client/src/lib.rs index 377b064048..381269129e 100644 --- a/validator_client/src/lib.rs +++ b/validator_client/src/lib.rs @@ -476,6 +476,7 @@ impl ProductionValidatorClient { slot_clock: slot_clock.clone(), beacon_nodes: beacon_nodes.clone(), validator_store: validator_store.clone(), + unknown_validator_next_poll_slots: <_>::default(), spec: context.eth2_config.spec.clone(), context: duties_context, enable_high_validator_count_metrics: config.enable_high_validator_count_metrics, From 8762d82adfd9316adf787df7dfc6941f8c76b49e Mon Sep 17 00:00:00 2001 From: Michael Sproul Date: Thu, 23 May 2024 10:17:53 +1000 Subject: [PATCH 15/19] Fix hot state disk leak (#5768) * Fix hot state leak * Don't delete the genesis state when split is 0x0! --- .../beacon_chain/src/block_verification.rs | 22 ++++---- beacon_node/beacon_chain/src/migrate.rs | 5 ++ beacon_node/store/src/hot_cold_store.rs | 51 +++++++++++++++++++ 3 files changed, 68 insertions(+), 10 deletions(-) diff --git a/beacon_node/beacon_chain/src/block_verification.rs b/beacon_node/beacon_chain/src/block_verification.rs index 866dde5a76..f4f6526a56 100644 --- a/beacon_node/beacon_chain/src/block_verification.rs +++ b/beacon_node/beacon_chain/src/block_verification.rs @@ -1382,18 +1382,20 @@ impl ExecutionPendingBlock { let catchup_timer = metrics::start_timer(&metrics::BLOCK_PROCESSING_CATCHUP_STATE); // Stage a batch of operations to be completed atomically if this block is imported - // successfully. We include the state root of the pre-state, which may be an advanced state - // that was stored in the DB with a `temporary` flag. + // successfully. 
If there is a skipped slot, we include the state root of the pre-state, + // which may be an advanced state that was stored in the DB with a `temporary` flag. let mut state = parent.pre_state; - let mut confirmed_state_roots = if state.slot() > parent.beacon_block.slot() { - // Advanced pre-state. Delete its temporary flag. - let pre_state_root = state.update_tree_hash_cache()?; - vec![pre_state_root] - } else { - // Pre state is parent state. It is already stored in the DB without temporary status. - vec![] - }; + let mut confirmed_state_roots = + if block.slot() > state.slot() && state.slot() > parent.beacon_block.slot() { + // Advanced pre-state. Delete its temporary flag. + let pre_state_root = state.update_tree_hash_cache()?; + vec![pre_state_root] + } else { + // Pre state is either unadvanced, or should not be stored long-term because there + // is no skipped slot between `parent` and `block`. + vec![] + }; // The block must have a higher slot than its parent. if block.slot() <= parent.beacon_block.slot() { diff --git a/beacon_node/beacon_chain/src/migrate.rs b/beacon_node/beacon_chain/src/migrate.rs index ad597bf92a..08b2a51720 100644 --- a/beacon_node/beacon_chain/src/migrate.rs +++ b/beacon_node/beacon_chain/src/migrate.rs @@ -703,6 +703,11 @@ impl, Cold: ItemStore> BackgroundMigrator, Cold: ItemStore> HotColdDB Ok(()) } + + /// Prune states from the hot database which are prior to the split. + /// + /// This routine is important for cleaning up advanced states which are stored in the database + /// with a temporary flag. 
+ pub fn prune_old_hot_states(&self) -> Result<(), Error> { + let split = self.get_split_info(); + debug!( + self.log, + "Database state pruning started"; + "split_slot" => split.slot, + ); + let mut state_delete_batch = vec![]; + for res in self + .hot_db + .iter_column::(DBColumn::BeaconStateSummary) + { + let (state_root, summary_bytes) = res?; + let summary = HotStateSummary::from_ssz_bytes(&summary_bytes)?; + + if summary.slot <= split.slot { + let old = summary.slot < split.slot; + let non_canonical = summary.slot == split.slot + && state_root != split.state_root + && !split.state_root.is_zero(); + if old || non_canonical { + let reason = if old { + "old dangling state" + } else { + "non-canonical" + }; + debug!( + self.log, + "Deleting state"; + "state_root" => ?state_root, + "slot" => summary.slot, + "reason" => reason, + ); + state_delete_batch.push(StoreOp::DeleteState(state_root, Some(summary.slot))); + } + } + } + let num_deleted_states = state_delete_batch.len(); + self.do_atomically_with_block_and_blobs_cache(state_delete_batch)?; + debug!( + self.log, + "Database state pruning complete"; + "num_deleted_states" => num_deleted_states, + ); + Ok(()) + } } /// Advance the split point of the store, moving new finalized states to the freezer. 
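The keep/delete rule in `prune_old_hot_states` above can be sketched in isolation. This is a minimal stand-in, not the real store code: `Summary` and `Split` replace Lighthouse's `HotStateSummary` and split metadata, a bare `[u8; 32]` replaces `Hash256`, and database iteration, SSZ decoding, and the atomic delete batch are omitted — only the selection logic mirrors the patch.

```rust
/// Simplified stand-in for a 32-byte state root (`Hash256` upstream).
type Root = [u8; 32];

/// Simplified stand-in for Lighthouse's `HotStateSummary`.
#[derive(Clone, Copy, Debug)]
struct Summary {
    slot: u64,
    state_root: Root,
}

/// Simplified stand-in for the hot/cold split point.
struct Split {
    slot: u64,
    state_root: Root,
}

/// Select the state roots to delete from the hot DB:
/// - states strictly older than the split ("old dangling state"), and
/// - non-canonical states at the split slot itself, unless the split root is
///   still zeroed, which protects the genesis state while the split is `0x0`.
fn prune_targets(summaries: &[Summary], split: &Split) -> Vec<Root> {
    summaries
        .iter()
        .filter(|s| {
            let old = s.slot < split.slot;
            let non_canonical = s.slot == split.slot
                && s.state_root != split.state_root
                && split.state_root != [0u8; 32];
            old || non_canonical
        })
        .map(|s| s.state_root)
        .collect()
}

fn main() {
    let root = |b: u8| -> Root {
        let mut r = [0u8; 32];
        r[0] = b;
        r
    };
    let split = Split { slot: 64, state_root: root(1) };
    let summaries = [
        Summary { slot: 32, state_root: root(9) }, // old dangling state: deleted
        Summary { slot: 64, state_root: root(1) }, // canonical at split: kept
        Summary { slot: 64, state_root: root(2) }, // non-canonical at split: deleted
        Summary { slot: 96, state_root: root(3) }, // ahead of the split: kept
    ];
    println!("{}", prune_targets(&summaries, &split).len()); // prints 2
}
```

The `split.state_root != [0u8; 32]` guard corresponds to the patch's `!split.state_root.is_zero()` check — the "Don't delete the genesis state when split is 0x0!" part of the commit message.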
From 17d9086df3bc6edef0660a49b3e480adb35e191f Mon Sep 17 00:00:00 2001 From: Lion - dapplion <35266934+dapplion@users.noreply.github.com> Date: Thu, 23 May 2024 14:46:05 +0200 Subject: [PATCH 16/19] Drop stuck lookups (#5824) * Drop stuck lookups --- beacon_node/network/src/metrics.rs | 6 +- .../network/src/sync/block_lookups/common.rs | 7 +- .../network/src/sync/block_lookups/mod.rs | 91 ++++++++++++++++--- .../network/src/sync/block_lookups/tests.rs | 2 +- beacon_node/network/src/sync/manager.rs | 2 +- 5 files changed, 82 insertions(+), 26 deletions(-) diff --git a/beacon_node/network/src/metrics.rs b/beacon_node/network/src/metrics.rs index 309512076a..bf4cbd09ab 100644 --- a/beacon_node/network/src/metrics.rs +++ b/beacon_node/network/src/metrics.rs @@ -257,9 +257,9 @@ lazy_static! { "sync_lookups_completed_total", "Total count of sync lookups completed", ); - pub static ref SYNC_LOOKUPS_STUCK: Result = try_create_int_gauge( - "sync_lookups_stuck", - "Current count of sync lookups that may be stuck", + pub static ref SYNC_LOOKUPS_STUCK: Result = try_create_int_counter( + "sync_lookups_stuck_total", + "Total count of sync lookups that are stuck and dropped", ); /* diff --git a/beacon_node/network/src/sync/block_lookups/common.rs b/beacon_node/network/src/sync/block_lookups/common.rs index 2791623f3f..aef76fb0da 100644 --- a/beacon_node/network/src/sync/block_lookups/common.rs +++ b/beacon_node/network/src/sync/block_lookups/common.rs @@ -2,7 +2,7 @@ use crate::sync::block_lookups::single_block_lookup::{ LookupRequestError, SingleBlockLookup, SingleLookupRequestState, }; use crate::sync::block_lookups::{BlobRequestState, BlockRequestState, PeerId}; -use crate::sync::manager::{Id, SLOT_IMPORT_TOLERANCE}; +use crate::sync::manager::Id; use crate::sync::network_context::{LookupRequestResult, SyncNetworkContext}; use beacon_chain::block_verification_types::RpcBlock; use beacon_chain::BeaconChainTypes; @@ -19,11 +19,6 @@ pub enum ResponseType { Blob, } -/// The 
maximum depth we will search for a parent block. In principle we should have sync'd any -/// canonical chain to its head once the peer connects. A chain should not appear where it's depth -/// is further back than the most recent head slot. -pub(crate) const PARENT_DEPTH_TOLERANCE: usize = SLOT_IMPORT_TOLERANCE * 2; - /// This trait unifies common single block lookup functionality across blocks and blobs. This /// includes making requests, verifying responses, and handling processing results. A /// `SingleBlockLookup` includes both a `BlockRequestState` and a `BlobRequestState`, this trait is diff --git a/beacon_node/network/src/sync/block_lookups/mod.rs b/beacon_node/network/src/sync/block_lookups/mod.rs index 0e89eb956c..94645197c9 100644 --- a/beacon_node/network/src/sync/block_lookups/mod.rs +++ b/beacon_node/network/src/sync/block_lookups/mod.rs @@ -1,10 +1,10 @@ use self::parent_chain::{compute_parent_chains, NodeChain}; pub use self::single_block_lookup::DownloadResult; use self::single_block_lookup::{LookupRequestError, LookupResult, SingleBlockLookup}; -use super::manager::{BlockProcessType, BlockProcessingResult}; +use super::manager::{BlockProcessType, BlockProcessingResult, SLOT_IMPORT_TOLERANCE}; use super::network_context::{RpcResponseResult, SyncNetworkContext}; use crate::metrics; -use crate::sync::block_lookups::common::{ResponseType, PARENT_DEPTH_TOLERANCE}; +use crate::sync::block_lookups::common::ResponseType; use crate::sync::block_lookups::parent_chain::find_oldest_fork_ancestor; use crate::sync::manager::{Id, SingleLookupReqId}; use beacon_chain::block_verification_types::AsBlock; @@ -28,9 +28,18 @@ mod single_block_lookup; #[cfg(test)] mod tests; +/// The maximum depth we will search for a parent block. In principle we should have sync'd any +/// canonical chain to its head once the peer connects. A chain should not appear where its depth +/// is further back than the most recent head slot.
+pub(crate) const PARENT_DEPTH_TOLERANCE: usize = SLOT_IMPORT_TOLERANCE * 2; + const FAILED_CHAINS_CACHE_EXPIRY_SECONDS: u64 = 60; pub const SINGLE_BLOCK_LOOKUP_MAX_ATTEMPTS: u8 = 4; -const LOOKUP_MAX_DURATION_SECS: u64 = 60; + +/// Maximum time we allow a lookup to exist before assuming it is stuck and will never make +/// progress. Assume the worst-case processing time per block component set, times the max depth. +/// 15 * 2 * 32 = 16 minutes. +const LOOKUP_MAX_DURATION_SECS: usize = 15 * PARENT_DEPTH_TOLERANCE; pub enum BlockComponent { Block(DownloadResult>>), @@ -680,20 +689,72 @@ impl BlockLookups { ); } - pub fn log_stuck_lookups(&self) { - let mut stuck_count = 0; - for lookup in self.single_block_lookups.values() { - if lookup.elapsed_since_created() > Duration::from_secs(LOOKUP_MAX_DURATION_SECS) { - debug!(self.log, "Lookup maybe stuck"; - // Fields id and block_root are also part of the summary. However, logging them - // here allows log parsers o index them and have better search - "id" => lookup.id, - "block_root" => ?lookup.block_root(), - "summary" => ?lookup, + /// Safety mechanism to unstick lookup sync. Lookup sync is purely event driven and depends on + /// external components to feed it events to make progress. If there is a bug in the network, in + /// the beacon processor, or here internally: lookups can get stuck forever. A stuck lookup can + /// stall a node indefinitely as other lookups will be waiting on a parent lookup to make + /// progress. + /// + /// If a lookup lasts more than LOOKUP_MAX_DURATION_SECS this function will find its oldest + /// ancestor and then drop it and all its children. This action will allow the node to unstick + /// itself. Bugs that cause lookups to get stuck may be triggered consistently.
So this strategy + /// is useful for two reasons: + /// + /// - One single clear warn-level log per stuck incident + /// - If the original bug is sporadic, it reduces the time a node is stuck from forever to ~16 minutes + pub fn drop_stuck_lookups(&mut self) { + // While loop to find and drop all disjoint trees of potentially stuck lookups. + while let Some(stuck_lookup) = self.single_block_lookups.values().find(|lookup| { + lookup.elapsed_since_created() > Duration::from_secs(LOOKUP_MAX_DURATION_SECS as u64) + }) { + let ancestor_stuck_lookup = match self.find_oldest_ancestor_lookup(stuck_lookup) { + Ok(lookup) => lookup, + Err(e) => { + warn!(self.log, "Error finding oldest ancestor lookup"; "error" => ?e); + // Default to dropping the lookup that exceeds the max duration so at least + // eventually sync should be unstuck + stuck_lookup + } + }; + + if stuck_lookup.id == ancestor_stuck_lookup.id { + warn!(self.log, "Notify the devs, a sync lookup is stuck"; + "block_root" => ?stuck_lookup.block_root(), + "lookup" => ?stuck_lookup, ); - stuck_count += 1; + } else { + warn!(self.log, "Notify the devs, a sync lookup is stuck"; + "block_root" => ?stuck_lookup.block_root(), + "lookup" => ?stuck_lookup, + "ancestor_block_root" => ?ancestor_stuck_lookup.block_root(), + "ancestor_lookup" => ?ancestor_stuck_lookup, + ); + } + + metrics::inc_counter(&metrics::SYNC_LOOKUPS_STUCK); + self.drop_lookup_and_children(ancestor_stuck_lookup.id); + } + } + + /// Recursively find the oldest ancestor lookup of another lookup + fn find_oldest_ancestor_lookup<'a>( + &'a self, + stuck_lookup: &'a SingleBlockLookup, + ) -> Result<&'a SingleBlockLookup, String> { + if let Some(awaiting_parent) = stuck_lookup.awaiting_parent() { + if let Some(lookup) = self + .single_block_lookups + .values() + .find(|l| l.block_root() == awaiting_parent) + { + self.find_oldest_ancestor_lookup(lookup) + } else { + Err(format!( + "Lookup references unknown parent {awaiting_parent:?}" + )) } + } else { +
Ok(stuck_lookup) } - metrics::set_gauge(&metrics::SYNC_LOOKUPS_STUCK, stuck_count); } } diff --git a/beacon_node/network/src/sync/block_lookups/tests.rs b/beacon_node/network/src/sync/block_lookups/tests.rs index b99598fe90..5a85e57f63 100644 --- a/beacon_node/network/src/sync/block_lookups/tests.rs +++ b/beacon_node/network/src/sync/block_lookups/tests.rs @@ -10,7 +10,7 @@ use std::sync::Arc; use super::*; -use crate::sync::block_lookups::common::{ResponseType, PARENT_DEPTH_TOLERANCE}; +use crate::sync::block_lookups::common::ResponseType; use beacon_chain::blob_verification::GossipVerifiedBlob; use beacon_chain::block_verification_types::{BlockImportData, RpcBlock}; use beacon_chain::builder::Witness; diff --git a/beacon_node/network/src/sync/manager.rs b/beacon_node/network/src/sync/manager.rs index 71d3113414..1162f63de0 100644 --- a/beacon_node/network/src/sync/manager.rs +++ b/beacon_node/network/src/sync/manager.rs @@ -561,7 +561,7 @@ impl SyncManager { self.handle_new_execution_engine_state(engine_state); } _ = interval.tick() => { - self.block_lookups.log_stuck_lookups(); + self.block_lookups.drop_stuck_lookups(); } } } From 61b29fa361177f6dbfd971173fa2702b51e7dd9b Mon Sep 17 00:00:00 2001 From: Eitan Seri-Levi Date: Thu, 23 May 2024 14:46:08 +0200 Subject: [PATCH 17/19] Update default target peers documentation (#5727) * the default target peers is 100 --- book/src/faq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/book/src/faq.md b/book/src/faq.md index 9cc695c442..104190ab9b 100644 --- a/book/src/faq.md +++ b/book/src/faq.md @@ -401,7 +401,7 @@ If the ports are open, you should have incoming peers. To check that you have in If you have incoming peers, it should return a lot of data containing information of peers. If the response is empty, it means that you have no incoming peers and there the ports are not open. You may want to double check if the port forward was correctly set up. -2. 
Check that you do not lower the number of peers using the flag `--target-peers`. The default is 80. A lower value set will lower the maximum number of peers your node can connect to, which may potentially interrupt the validator performance. We recommend users to leave the `--target peers` untouched to keep a diverse set of peers. +2. Check that you do not lower the number of peers using the flag `--target-peers`. The default is 100. A lower value will lower the maximum number of peers your node can connect to, which may potentially degrade validator performance. We recommend leaving `--target-peers` untouched to keep a diverse set of peers. 3. Ensure that you have a quality router for the internet connection. For example, if you connect the router to many devices including the node, it may be possible that the router cannot handle all routing tasks, hence struggling to keep up the number of peers. Therefore, using a quality router for the node is important to keep a healthy number of peers. From 7073242ccce851ca6569a74e90c811ade01ff6a5 Mon Sep 17 00:00:00 2001 From: Lion - dapplion <35266934+dapplion@users.noreply.github.com> Date: Thu, 23 May 2024 16:34:49 +0200 Subject: [PATCH 18/19] Suppress RPC Error disconnect log (#5802) * Suppress RPC Error disconnect log --- beacon_node/lighthouse_network/src/service/mod.rs | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/beacon_node/lighthouse_network/src/service/mod.rs b/beacon_node/lighthouse_network/src/service/mod.rs index 8ed8658a4b..86086feda3 100644 --- a/beacon_node/lighthouse_network/src/service/mod.rs +++ b/beacon_node/lighthouse_network/src/service/mod.rs @@ -1383,14 +1383,21 @@ impl Network { // Silencing this event breaks the API contract with RPC where every request ends with // - A stream termination event, or // - An RPCError event - if !matches!(event.event, HandlerEvent::Err(HandlerErr::Outbound { ..
})) { + return if let HandlerEvent::Err(HandlerErr::Outbound { + id: RequestId::Application(id), + error, + .. + }) = event.event + { + Some(NetworkEvent::RPCFailed { peer_id, id, error }) + } else { debug!( self.log, "Ignoring rpc message of disconnecting peer"; event ); - return None; - } + None + }; } let handler_id = event.conn_id; From 3070cb7c3982f8321bbeb33413009f8f928bdf8a Mon Sep 17 00:00:00 2001 From: chonghe <44791194+chong-he@users.noreply.github.com> Date: Fri, 24 May 2024 10:45:19 +0800 Subject: [PATCH 19/19] Markdown linter (#5494) * linter * Add markdown linter * add env * only check markdown * Add token * Update .github/workflows/test-suite.yml * Markdown linter * Exit code * Update script * rename * mdlint * Add an empty line after end of file * Testing disable * add text * update mdlint.sh * ori validator inclusion * Add config yml file * Remove MD041 and fix advanced-datadir file * FIx validator inclusion file conflict * Merge branch 'unstable' into markdown-linter * change files * Merge branch 'markdown-linter' of https://github.com/chong-he/lighthouse into markdown-linter * mdlint * Remove MD025 * Remove MD036 * Remove MD045 * Removr MD001 * Set MD028 to false * Remove MD024 * Remove MD055 * Remove MD029 * Remove MD040 * Set MD040 to false * Set MD033 to false * Set MD013 to false * Rearrange yml file * Update mdlint.sh and test * Test remove fix * Test with fix * Test with space * Fix summary indentation * Test mdlint.sh * Update mdlint.sh * Test * Update * Test fix * Test again * Fix * merge into check-code * Update scripts/mdlint.sh Co-authored-by: Mac L * Update scripts/mdlint.sh Co-authored-by: Mac L * Remove set -e * Add comment * Merge pull request #7 from chong-he/unstable Merge unstable to markdown branch * mdlint * Merge branch 'unstable' into markdown-linter * mdlint --- .github/workflows/test-suite.yml | 2 + Makefile | 4 + book/.markdownlint.yml | 28 +++++ book/src/SUMMARY.md | 108 +++++++++--------- book/src/advanced-blobs.md | 17 
++- book/src/advanced-datadir.md | 7 +- book/src/advanced-proposer-only.md | 6 +- book/src/advanced-release-candidates.md | 5 +- book/src/advanced.md | 2 +- book/src/advanced_database.md | 6 +- book/src/advanced_metrics.md | 3 +- book/src/advanced_networking.md | 59 +++++----- book/src/api-bn.md | 35 ++++-- book/src/api-lighthouse.md | 130 +++++++++++---------- book/src/api-vc-auth-header.md | 9 +- book/src/api-vc-endpoints.md | 65 ++++++----- book/src/api-vc-sig-header.md | 4 +- book/src/api-vc.md | 4 +- book/src/builders.md | 53 +++++---- book/src/checkpoint-sync.md | 32 +++--- book/src/cli.md | 10 +- book/src/contributing.md | 26 ++--- book/src/cross-compiling.md | 1 - book/src/database-migrations.md | 24 ++-- book/src/developers.md | 4 - book/src/docker.md | 7 +- book/src/faq.md | 144 +++++++++++------------- book/src/graffiti.md | 39 ++++--- book/src/help_bn.md | 1 + book/src/help_general.md | 1 + book/src/help_vc.md | 1 + book/src/help_vm.md | 1 + book/src/help_vm_create.md | 1 + book/src/help_vm_import.md | 1 + book/src/help_vm_move.md | 1 + book/src/homebrew.md | 6 +- book/src/installation-binaries.md | 8 +- book/src/installation-source.md | 29 ++--- book/src/installation.md | 15 +-- book/src/intro.md | 1 - book/src/key-management.md | 54 +++++---- book/src/key-recovery.md | 13 +-- book/src/lighthouse-ui.md | 2 +- book/src/mainnet-validator.md | 34 +++--- book/src/merge-migration.md | 8 +- book/src/partial-withdrawal.md | 13 ++- book/src/pi.md | 13 +-- book/src/redundancy.md | 13 +-- book/src/run_a_node.md | 30 ++--- book/src/setup.md | 10 +- book/src/slasher.md | 5 +- book/src/slashing-protection.md | 8 +- book/src/suggested-fee-recipient.md | 17 +-- book/src/ui-authentication.md | 12 +- book/src/ui-configuration.md | 4 +- book/src/ui-faqs.md | 15 ++- book/src/ui-installation.md | 28 +++-- book/src/ui-usage.md | 38 +++---- book/src/validator-doppelganger.md | 2 +- book/src/validator-inclusion.md | 16 +-- book/src/validator-management.md | 11 +- 
book/src/validator-manager-create.md | 10 +- book/src/validator-manager-move.md | 9 +- book/src/validator-manager.md | 3 +- book/src/validator-monitoring.md | 21 ++-- book/src/voluntary-exit.md | 45 ++++---- scripts/cli.sh | 2 +- scripts/mdlint.sh | 23 ++++ 68 files changed, 721 insertions(+), 638 deletions(-) create mode 100644 book/.markdownlint.yml create mode 100755 scripts/mdlint.sh diff --git a/.github/workflows/test-suite.yml b/.github/workflows/test-suite.yml index 84ef42dc3f..928833cbef 100644 --- a/.github/workflows/test-suite.yml +++ b/.github/workflows/test-suite.yml @@ -383,6 +383,8 @@ jobs: run: make audit-CI - name: Run cargo vendor to make sure dependencies can be vendored for packaging, reproducibility and archival purpose run: CARGO_HOME=$(readlink -f $HOME) make vendor + - name: Markdown-linter + run: make mdlint check-msrv: name: check-msrv runs-on: ubuntu-latest diff --git a/Makefile b/Makefile index 12d33cc3a8..3e6934e6b5 100644 --- a/Makefile +++ b/Makefile @@ -214,6 +214,10 @@ cli: cli-local: make && ./scripts/cli.sh +# Check for markdown files +mdlint: + ./scripts/mdlint.sh + # Runs the entire test suite, downloading test vectors if required. 
test-full: cargo-fmt test-release test-debug test-ef test-exec-engine diff --git a/book/.markdownlint.yml b/book/.markdownlint.yml new file mode 100644 index 0000000000..5d6bda29f1 --- /dev/null +++ b/book/.markdownlint.yml @@ -0,0 +1,28 @@ +# MD010: https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md010---hard-tabs +MD010: + # Set code blocks to false so that code blocks will be ignored, default is true + code_blocks: false + +# MD013 line length: https://github.com/DavidAnson/markdownlint/blob/main/doc/md013.md +# Set to false as this will also interfere with help_x.md files, and it is not necessary to comply with the line length of 80 +MD013: false + +# MD028: set to false to allow blank line between blockquote: https://github.com/DavidAnson/markdownlint/blob/main/doc/md028.md +# This is because the blockquotes are shown separately (a desired outcome) when there is a blank line in between +MD028: false + +# MD024: set siblings_only to true so that same headings with different parent headings are allowed +# https://github.com/DavidAnson/markdownlint/blob/main/doc/md024.md +MD024: + siblings_only: true + +# MD033 in-line html: https://github.com/DavidAnson/markdownlint/blob/main/doc/md033.md +# In-line html is fine in the markdown files, so this is set to false +MD033: false + +# MD036 set to false to preserve the emphasis on deprecation notice on key-management.md (a heading is not necessary) +MD036: false + +# MD040 code blocks should have a language specified: https://github.com/DavidAnson/markdownlint/blob/main/doc/md040.md +# Set to false as the help_x.md files are code blocks without a language specified, which is fine and does not need to change +MD040: false \ No newline at end of file diff --git a/book/src/SUMMARY.md b/book/src/SUMMARY.md index 1a35d9d139..7fb0b2f4e7 100644 --- a/book/src/SUMMARY.md +++ b/book/src/SUMMARY.md @@ -2,66 +2,66 @@ * [Introduction](./intro.md) * [Installation](./installation.md) - * [Pre-Built 
Binaries](./installation-binaries.md) - * [Docker](./docker.md) - * [Build from Source](./installation-source.md) - * [Raspberry Pi 4](./pi.md) - * [Cross-Compiling](./cross-compiling.md) - * [Homebrew](./homebrew.md) - * [Update Priorities](./installation-priorities.md) + * [Pre-Built Binaries](./installation-binaries.md) + * [Docker](./docker.md) + * [Build from Source](./installation-source.md) + * [Raspberry Pi 4](./pi.md) + * [Cross-Compiling](./cross-compiling.md) + * [Homebrew](./homebrew.md) + * [Update Priorities](./installation-priorities.md) * [Run a Node](./run_a_node.md) * [Become a Validator](./mainnet-validator.md) * [Validator Management](./validator-management.md) - * [The `validator-manager` Command](./validator-manager.md) - * [Creating validators](./validator-manager-create.md) - * [Moving validators](./validator-manager-move.md) - * [Slashing Protection](./slashing-protection.md) - * [Voluntary Exits](./voluntary-exit.md) - * [Partial Withdrawals](./partial-withdrawal.md) - * [Validator Monitoring](./validator-monitoring.md) - * [Doppelganger Protection](./validator-doppelganger.md) - * [Suggested Fee Recipient](./suggested-fee-recipient.md) - * [Validator Graffiti](./graffiti.md) + * [The `validator-manager` Command](./validator-manager.md) + * [Creating validators](./validator-manager-create.md) + * [Moving validators](./validator-manager-move.md) + * [Slashing Protection](./slashing-protection.md) + * [Voluntary Exits](./voluntary-exit.md) + * [Partial Withdrawals](./partial-withdrawal.md) + * [Validator Monitoring](./validator-monitoring.md) + * [Doppelganger Protection](./validator-doppelganger.md) + * [Suggested Fee Recipient](./suggested-fee-recipient.md) + * [Validator Graffiti](./graffiti.md) * [APIs](./api.md) - * [Beacon Node API](./api-bn.md) - * [Lighthouse API](./api-lighthouse.md) - * [Validator Inclusion APIs](./validator-inclusion.md) - * [Validator Client API](./api-vc.md) - * [Endpoints](./api-vc-endpoints.md) - * 
[Authorization Header](./api-vc-auth-header.md) - * [Signature Header](./api-vc-sig-header.md) - * [Prometheus Metrics](./advanced_metrics.md) + * [Beacon Node API](./api-bn.md) + * [Lighthouse API](./api-lighthouse.md) + * [Validator Inclusion APIs](./validator-inclusion.md) + * [Validator Client API](./api-vc.md) + * [Endpoints](./api-vc-endpoints.md) + * [Authorization Header](./api-vc-auth-header.md) + * [Signature Header](./api-vc-sig-header.md) + * [Prometheus Metrics](./advanced_metrics.md) * [Lighthouse UI (Siren)](./lighthouse-ui.md) - * [Installation](./ui-installation.md) - * [Authentication](./ui-authentication.md) - * [Configuration](./ui-configuration.md) - * [Usage](./ui-usage.md) - * [FAQs](./ui-faqs.md) + * [Installation](./ui-installation.md) + * [Authentication](./ui-authentication.md) + * [Configuration](./ui-configuration.md) + * [Usage](./ui-usage.md) + * [FAQs](./ui-faqs.md) * [Advanced Usage](./advanced.md) - * [Checkpoint Sync](./checkpoint-sync.md) - * [Custom Data Directories](./advanced-datadir.md) - * [Proposer Only Beacon Nodes](./advanced-proposer-only.md) - * [Remote Signing with Web3Signer](./validator-web3signer.md) - * [Database Configuration](./advanced_database.md) - * [Database Migrations](./database-migrations.md) - * [Key Management (Deprecated)](./key-management.md) - * [Key Recovery](./key-recovery.md) - * [Advanced Networking](./advanced_networking.md) - * [Running a Slasher](./slasher.md) - * [Redundancy](./redundancy.md) - * [Release Candidates](./advanced-release-candidates.md) - * [MEV](./builders.md) - * [Merge Migration](./merge-migration.md) - * [Late Block Re-orgs](./late-block-re-orgs.md) - * [Blobs](./advanced-blobs.md) + * [Checkpoint Sync](./checkpoint-sync.md) + * [Custom Data Directories](./advanced-datadir.md) + * [Proposer Only Beacon Nodes](./advanced-proposer-only.md) + * [Remote Signing with Web3Signer](./validator-web3signer.md) + * [Database Configuration](./advanced_database.md) + * [Database 
Migrations](./database-migrations.md) + * [Key Management (Deprecated)](./key-management.md) + * [Key Recovery](./key-recovery.md) + * [Advanced Networking](./advanced_networking.md) + * [Running a Slasher](./slasher.md) + * [Redundancy](./redundancy.md) + * [Release Candidates](./advanced-release-candidates.md) + * [MEV](./builders.md) + * [Merge Migration](./merge-migration.md) + * [Late Block Re-orgs](./late-block-re-orgs.md) + * [Blobs](./advanced-blobs.md) * [Built-In Documentation](./help_general.md) - * [Beacon Node](./help_bn.md) - * [Validator Client](./help_vc.md) - * [Validator Manager](./help_vm.md) - * [Create](./help_vm_create.md) - * [Import](./help_vm_import.md) - * [Move](./help_vm_move.md) + * [Beacon Node](./help_bn.md) + * [Validator Client](./help_vc.md) + * [Validator Manager](./help_vm.md) + * [Create](./help_vm_create.md) + * [Import](./help_vm_import.md) + * [Move](./help_vm_move.md) * [Contributing](./contributing.md) - * [Development Environment](./setup.md) + * [Development Environment](./setup.md) * [FAQs](./faq.md) -* [Protocol Developers](./developers.md) \ No newline at end of file +* [Protocol Developers](./developers.md) diff --git a/book/src/advanced-blobs.md b/book/src/advanced-blobs.md index eee404a9be..785bd5797d 100644 --- a/book/src/advanced-blobs.md +++ b/book/src/advanced-blobs.md @@ -1,8 +1,8 @@ # Blobs -In the Deneb network upgrade, one of the changes is the implementation of EIP-4844, also known as [Proto-danksharding](https://blog.ethereum.org/2024/02/27/dencun-mainnet-announcement). Alongside with this, a new term named `blob` (binary large object) is introduced. Blobs are "side-cars" carrying transaction data in a block. They are mainly used by Ethereum layer 2 operators. As far as stakers are concerned, the main difference with the introduction of blobs is the increased storage requirement. 
+In the Deneb network upgrade, one of the changes is the implementation of EIP-4844, also known as [Proto-danksharding](https://blog.ethereum.org/2024/02/27/dencun-mainnet-announcement). Alongside this, a new term named `blob` (binary large object) is introduced. Blobs are "side-cars" carrying transaction data in a block. They are mainly used by Ethereum layer 2 operators. As far as stakers are concerned, the main difference with the introduction of blobs is the increased storage requirement. -### FAQ +## FAQ 1. What is the storage requirement for blobs? @@ -10,33 +10,32 @@ In the Deneb network upgrade, one of the changes is the implementation of EIP-48 One blob is 128 KB in size. Each block can carry a maximum of 6 blobs. Blobs will be kept for 4096 epochs and pruned afterwards. This means that the maximum increase in storage requirement will be: - ``` + ```text 2**17 bytes / blob * 6 blobs / block * 32 blocks / epoch * 4096 epochs = 96 GB ``` However, the blob base fee targets 3 blobs per block and it works similarly to how EIP-1559 operates in the Ethereum gas fee. Therefore, practically it is very likely to average to 3 blobs per blocks, which translates to a storage requirement of 48 GB. - 1. Do I have to add any flags for blobs? - No, you can use the default values for blob-related flags, which means you do not need add or remove any flags. + No, you can use the default values for blob-related flags, which means you do not need to add or remove any flags. 1. What if I want to keep all blobs? Use the flag `--prune-blobs false` in the beacon node. The storage requirement will be: - ``` + ```text 2**17 bytes * 3 blobs / block * 7200 blocks / day * 30 days = 79GB / month or 948GB / year ``` - + To keep blobs for a custom period, you may use the flag `--blob-prune-margin-epochs ` which keeps blobs for 4096+EPOCHS specified in the flag. 1. How to see the info of the blobs database? 
- We can call the API: + We can call the API: ```bash curl "http://localhost:5052/lighthouse/database/info" | jq ``` - Refer to [Lighthouse API](./api-lighthouse.md#lighthousedatabaseinfo) for an example response. \ No newline at end of file + Refer to [Lighthouse API](./api-lighthouse.md#lighthousedatabaseinfo) for an example response. diff --git a/book/src/advanced-datadir.md b/book/src/advanced-datadir.md index 074857346e..7ad993a107 100644 --- a/book/src/advanced-datadir.md +++ b/book/src/advanced-datadir.md @@ -1,4 +1,4 @@ -## Custom Data Directories +# Custom Data Directories Users can override the default Lighthouse data directories (e.g., `~/.lighthouse/mainnet`) using the `--datadir` flag. The custom data directory mirrors the structure of any network specific default directory (e.g. `~/.lighthouse/mainnet`). @@ -11,10 +11,11 @@ lighthouse --network mainnet --datadir /var/lib/my-custom-dir account validator lighthouse --network mainnet --datadir /var/lib/my-custom-dir bn --staking lighthouse --network mainnet --datadir /var/lib/my-custom-dir vc ``` + The first step creates a `validators` directory under `/var/lib/my-custom-dir` which contains the imported keys and [`validator_definitions.yml`](./validator-management.md). After that, we simply run the beacon chain and validator client with the custom dir path. -### Relative Paths +## Relative Paths [#2682]: https://github.com/sigp/lighthouse/pull/2682 [#2846]: https://github.com/sigp/lighthouse/pull/2846 @@ -40,7 +41,7 @@ be applied. On start-up, if a split directory scenario is detected (i.e. `~/here Lighthouse will continue to operate with split directories. 
In such a scenario, the following harmless log will show: -``` +```text WARN Legacy datadir location location: "/home/user/datadir/beacon", msg: this occurs when using relative paths for a datadir location ``` diff --git a/book/src/advanced-proposer-only.md b/book/src/advanced-proposer-only.md index c3347e044b..1ea3610988 100644 --- a/book/src/advanced-proposer-only.md +++ b/book/src/advanced-proposer-only.md @@ -2,7 +2,7 @@ Lighthouse allows for more exotic setups that can minimize attack vectors by adding redundant beacon nodes and dividing the roles of attesting and block -production between them. +production between them. The purpose of this is to minimize attack vectors where malicious users obtain the network identities (IP addresses) of beacon @@ -24,7 +24,7 @@ harder to identify as a potential node to attack and will also consume less resources. Specifically, this flag reduces the default peer count (to a safe minimal -number as maintaining peers on attestation subnets do not need to be considered), +number as maintaining peers on attestation subnets do not need to be considered), prevents the node from subscribing to any attestation-subnets or sync-committees which is a primary way for attackers to de-anonymize validators. @@ -34,7 +34,6 @@ validators. > normal beacon node, the validator may fail to handle its duties correctly and > result in a loss of income. - ## The Validator Client The validator client can be given a list of HTTP API endpoints representing @@ -53,7 +52,6 @@ these nodes for added security). > producing a more profitable block. Any block builders should therefore be > attached to the `--beacon-nodes` and not necessarily the `--proposer-nodes`. 
- ## Setup Overview The intended set-up to take advantage of this mechanism is to run one (or more) diff --git a/book/src/advanced-release-candidates.md b/book/src/advanced-release-candidates.md index a539aa489c..9f00da9ae9 100644 --- a/book/src/advanced-release-candidates.md +++ b/book/src/advanced-release-candidates.md @@ -20,7 +20,7 @@ you're looking for stable Lighthouse**. From time to time, Lighthouse may use the terms "release candidate" and "pre release" interchangeably. A pre release is identical to a release candidate. -### Examples +## Examples [`v1.4.0-rc.0`] has `rc` in the version string and is therefore a release candidate. This release is *not* stable and is *not* intended for critical tasks on mainnet (e.g., staking). @@ -36,9 +36,8 @@ Users may wish to try a release candidate for the following reasons: - To help detect bugs and regressions before they reach production. - To provide feedback on annoyances before they make it into a release and become harder to change or revert. -There can also be a scenario that a bug has been found and requires an urgent fix. An example of incidence is [v4.0.2-rc.0](https://github.com/sigp/lighthouse/releases/tag/v4.0.2-rc.0) which contains a hot-fix to address high CPU usage experienced after the [Capella](https://ethereum.org/en/history/#capella) upgrade on 12th April 2023. In this scenario, we will announce the release candidate on [Github](https://github.com/sigp/lighthouse/releases) and also on [Discord](https://discord.gg/cyAszAh) to recommend users to update to the release candidate version. +There can also be a scenario that a bug has been found and requires an urgent fix. An example of incidence is [v4.0.2-rc.0](https://github.com/sigp/lighthouse/releases/tag/v4.0.2-rc.0) which contains a hot-fix to address high CPU usage experienced after the [Capella](https://ethereum.org/en/history/#capella) upgrade on 12th April 2023. 
In this scenario, we will announce the release candidate on [Github](https://github.com/sigp/lighthouse/releases) and also on [Discord](https://discord.gg/cyAszAh) to recommend users to update to the release candidate version. ## When *not* to use a release candidate Other than the above scenarios, it is generally not recommended to use release candidates for any critical tasks on mainnet (e.g., staking). To test new release candidate features, try one of the testnets (e.g., Holesky). - diff --git a/book/src/advanced.md b/book/src/advanced.md index 21e732afa1..1a882835a4 100644 --- a/book/src/advanced.md +++ b/book/src/advanced.md @@ -15,7 +15,7 @@ tips about how things work under the hood. * [Key Management](./key-management.md): explore how to generate wallet with Lighthouse. * [Key Recovery](./key-recovery.md): explore how to recover wallet and validator with Lighthouse. * [Advanced Networking](./advanced_networking.md): open your ports to have a diverse and healthy set of peers. -* [Running a Slasher](./slasher.md): contribute to the health of the network by running a slasher. +* [Running a Slasher](./slasher.md): contribute to the health of the network by running a slasher. * [Redundancy](./redundancy.md): want to have more than one beacon node as backup? This is for you. * [Release Candidates](./advanced-release-candidates.md): latest release of Lighthouse to get feedback from users. * [Maximal Extractable Value](./builders.md): use external builders for a potential higher rewards during block proposals diff --git a/book/src/advanced_database.md b/book/src/advanced_database.md index f65fb10415..345fff6981 100644 --- a/book/src/advanced_database.md +++ b/book/src/advanced_database.md @@ -29,7 +29,7 @@ some example values. | Enthusiast (prev. default) | 2048 | hundreds of GB | 10.2 s | | Validator only (default) | 8192 | tens of GB | 41 s | -*Last update: Dec 2023. +*Last update: Dec 2023. As we can see, it's a high-stakes trade-off! 
The relationships to disk usage and historical state load time are both linear – doubling SPRP halves disk usage and doubles load time. The minimum SPRP @@ -40,9 +40,9 @@ The default value is 8192 for databases synced from scratch using Lighthouse v2. The values shown in the table are approximate, calculated using a simple heuristic: each `BeaconState` consumes around 145MB of disk space, and each block replayed takes around 5ms. The -**Yearly Disk Usage** column shows the approximate size of the freezer DB _alone_ (hot DB not included), calculated proportionally using the total freezer database disk usage. +**Yearly Disk Usage** column shows the approximate size of the freezer DB _alone_ (hot DB not included), calculated proportionally using the total freezer database disk usage. The **Load Historical State** time is the worst-case load time for a state in the last slot -before a restore point. +before a restore point. To run a full archival node with fast access to beacon states and a SPRP of 32, the disk usage will be more than 10 TB per year, which is impractical for many users. As such, users may consider running the [tree-states](https://github.com/sigp/lighthouse/releases/tag/v5.0.111-exp) release, which only uses less than 200 GB for a full archival node. The caveat is that it is currently experimental and in alpha release (as of Dec 2023), thus not recommended for running mainnet validators. Nevertheless, it is suitable to be used for analysis purposes, and if you encounter any issues in tree-states, we do appreciate any feedback. We plan to have a stable release of tree-states in 1H 2024. diff --git a/book/src/advanced_metrics.md b/book/src/advanced_metrics.md index 3141f336a1..323ba8f58a 100644 --- a/book/src/advanced_metrics.md +++ b/book/src/advanced_metrics.md @@ -30,7 +30,6 @@ curl localhost:5054/metrics ## Validator Client Metrics - By default, these metrics are disabled but can be enabled with the `--metrics` flag. 
Use the `--metrics-address`, `--metrics-port` and `--metrics-allow-origin` flags to customize the metrics server. @@ -78,7 +77,7 @@ You can adjust the frequency at which Lighthouse sends metrics to the remote ser `--monitoring-endpoint-period` flag. It takes an integer value in seconds, defaulting to 60 seconds. -``` +```bash lighthouse bn --monitoring-endpoint-period 60 --monitoring-endpoint "https://url" ``` diff --git a/book/src/advanced_networking.md b/book/src/advanced_networking.md index 5fabf57d56..732b4f51e6 100644 --- a/book/src/advanced_networking.md +++ b/book/src/advanced_networking.md @@ -5,8 +5,7 @@ be adjusted to handle a variety of network situations. This section outlines some of these configuration parameters and their consequences at the networking level and their general intended use. - -### Target Peers +## Target Peers The beacon node has a `--target-peers` CLI parameter. This allows you to instruct the beacon node how many peers it should try to find and maintain. @@ -38,7 +37,7 @@ large peer count will not speed up sync. For these reasons, we recommend users do not modify the `--target-peers` count drastically and use the (recommended) default. -### NAT Traversal (Port Forwarding) +## NAT Traversal (Port Forwarding) Lighthouse, by default, uses port 9000 for both TCP and UDP. Since v4.5.0, Lighthouse will also attempt to make QUIC connections via UDP port 9001 by default. Lighthouse will still function if it is behind a NAT without any port mappings. Although @@ -62,36 +61,39 @@ TCP and UDP ports (9000 TCP/UDP, and 9001 UDP by default). > explicitly specify them using the `--enr-tcp-port` and `--enr-udp-port` as > explained in the following section. -### How to Open Ports +## How to Open Ports The steps to do port forwarding depends on the router, but the general steps are given below: + 1. 
Determine the default gateway IP: -- On Linux: open a terminal and run `ip route | grep default`, the result should look something similar to `default via 192.168.50.1 dev wlp2s0 proto dhcp metric 600`. The `192.168.50.1` is your router management default gateway IP. -- On MacOS: open a terminal and run `netstat -nr|grep default` and it should return the default gateway IP. -- On Windows: open a command prompt and run `ipconfig` and look for the `Default Gateway` which will show you the gateway IP. - The default gateway IP usually looks like 192.168.X.X. Once you obtain the IP, enter it to a web browser and it will lead you to the router management page. + - On Linux: open a terminal and run `ip route | grep default`, the result should look something similar to `default via 192.168.50.1 dev wlp2s0 proto dhcp metric 600`. The `192.168.50.1` is your router management default gateway IP. + - On MacOS: open a terminal and run `netstat -nr|grep default` and it should return the default gateway IP. + - On Windows: open a command prompt and run `ipconfig` and look for the `Default Gateway` which will show you the gateway IP. + + The default gateway IP usually looks like 192.168.X.X. Once you obtain the IP, enter it to a web browser and it will lead you to the router management page. + +1. Login to the router management page. The login credentials are usually available in the manual or the router, or it can be found on a sticker underneath the router. You can also try the login credentials for some common router brands listed [here](https://www.noip.com/support/knowledgebase/general-port-forwarding-guide/). -2. Login to the router management page. The login credentials are usually available in the manual or the router, or it can be found on a sticker underneath the router. You can also try the login credentials for some common router brands listed [here](https://www.noip.com/support/knowledgebase/general-port-forwarding-guide/). +1. 
Navigate to the port forward settings in your router. The exact step depends on the router, but typically it will fall under the "Advanced" section, under the name "port forwarding" or "virtual server". -3. Navigate to the port forward settings in your router. The exact step depends on the router, but typically it will fall under the "Advanced" section, under the name "port forwarding" or "virtual server". +1. Configure a port forwarding rule as below: -4. Configure a port forwarding rule as below: -- Protocol: select `TCP/UDP` or `BOTH` -- External port: `9000` -- Internal port: `9000` -- IP address: Usually there is a dropdown list for you to select the device. Choose the device that is running Lighthouse. + - Protocol: select `TCP/UDP` or `BOTH` + - External port: `9000` + - Internal port: `9000` + - IP address: Usually there is a dropdown list for you to select the device. Choose the device that is running Lighthouse. -Since V4.5.0 port 9001/UDP is also used for QUIC support. + Since V4.5.0 port 9001/UDP is also used for QUIC support. -- Protocol: select `UDP` -- External port: `9001` -- Internal port: `9001` -- IP address: Choose the device that is running Lighthouse. + - Protocol: select `UDP` + - External port: `9001` + - Internal port: `9001` + - IP address: Choose the device that is running Lighthouse. -5. To check that you have successfully opened the ports, go to [yougetsignal](https://www.yougetsignal.com/tools/open-ports/) and enter `9000` in the `port number`. If it shows "open", then you have successfully set up port forwarding. If it shows "closed", double check your settings, and also check that you have allowed firewall rules on port 9000. Note: this will only confirm if port 9000/TCP is open. You will need to ensure you have correctly setup port forwarding for the UDP ports (`9000` and `9001` by default). +1. 
To check that you have successfully opened the ports, go to [yougetsignal](https://www.yougetsignal.com/tools/open-ports/) and enter `9000` in the `port number`. If it shows "open", then you have successfully set up port forwarding. If it shows "closed", double check your settings, and also check that you have allowed firewall rules on port 9000. Note: this will only confirm if port 9000/TCP is open. You will need to ensure you have correctly setup port forwarding for the UDP ports (`9000` and `9001` by default). -### ENR Configuration +## ENR Configuration Lighthouse has a number of CLI parameters for constructing and modifying the local Ethereum Node Record (ENR). Examples are `--enr-address`, @@ -113,8 +115,7 @@ harder for peers to find you or potentially making it harder for other peers to find each other. We recommend not touching these settings unless for a more advanced use case. - -### IPv6 support +## IPv6 support As noted in the previous sections, two fundamental parts to ensure good connectivity are: The parameters that configure the sockets over which @@ -122,7 +123,7 @@ Lighthouse listens for connections, and the parameters used to tell other peers how to connect to your node. This distinction is relevant and applies to most nodes that do not run directly on a public network. -#### Configuring Lighthouse to listen over IPv4/IPv6/Dual stack +### Configuring Lighthouse to listen over IPv4/IPv6/Dual stack To listen over only IPv6 use the same parameters as done when listening over IPv4 only: @@ -136,6 +137,7 @@ TCP and UDP. This can be configured with `--quic-port`. To listen over both IPv4 and IPv6: + - Set two listening addresses using the `--listen-address` flag twice ensuring the two addresses are one IPv4, and the other IPv6. When doing so, the `--port` and `--discovery-port` flags will apply exclusively to IPv4. Note @@ -149,7 +151,7 @@ To listen over both IPv4 and IPv6: UDP over IPv6. This will default to the value given to `--port6` + 1. 
This flag has no effect when listening over IPv6 only. -##### Configuration Examples +#### Configuration Examples > When using `--listen-address :: --listen-address 0.0.0.0 --port 9909`, listening will be set up as follows: > @@ -175,7 +177,8 @@ To listen over both IPv4 and IPv6: > It listens on the default value of `--port6` (`9090`) for TCP, and port `9999` for UDP. > QUIC will use port `9091` for UDP, which is the default `--port6` value (`9090`) + 1. -#### Configuring Lighthouse to advertise IPv6 reachable addresses +### Configuring Lighthouse to advertise IPv6 reachable addresses + Lighthouse supports IPv6 to connect to other nodes both over IPv6 exclusively, and dual stack using one socket for IPv4 and another socket for IPv6. In both scenarios, the previous sections still apply. In summary: @@ -205,7 +208,7 @@ In the general case, a user will not require to set these explicitly. Update these options only if you can guarantee your node is reachable with these values. -#### Known caveats +### Known caveats IPv6 link local addresses are likely to have poor connectivity if used in topologies with more than one interface. Use global addresses for the general diff --git a/book/src/api-bn.md b/book/src/api-bn.md index 3e57edd8db..e7c900e84d 100644 --- a/book/src/api-bn.md +++ b/book/src/api-bn.md @@ -10,15 +10,15 @@ A Lighthouse beacon node can be configured to expose an HTTP server by supplying The following CLI flags control the HTTP server: - `--http`: enable the HTTP server (required even if the following flags are - provided). + provided). - `--http-port`: specify the listen port of the server. - `--http-address`: specify the listen address of the server. It is _not_ recommended to listen on `0.0.0.0`, please see [Security](#security) below. - `--http-allow-origin`: specify the value of the `Access-Control-Allow-Origin` - header. The default is to not supply a header. + header. The default is to not supply a header. 
- `--http-enable-tls`: serve the HTTP server over TLS. Must be used with `--http-tls-cert`
-  and `http-tls-key`. This feature is currently experimental, please see
-  [Serving the HTTP API over TLS](#serving-the-http-api-over-tls) below.
+  and `--http-tls-key`. This feature is currently experimental; please see
+  [Serving the HTTP API over TLS](#serving-the-http-api-over-tls) below.
- `--http-tls-cert`: specify the path to the certificate file for Lighthouse to use.
- `--http-tls-key`: specify the path to the private key file for Lighthouse to use.
@@ -38,18 +38,18 @@ the listening address from `localhost` should only be done with extreme care.

To safely provide access to the API from a different machine you should use one of the following standard techniques:

-* Use an [SSH tunnel][ssh_tunnel], i.e. access `localhost` remotely. This is recommended, and
+- Use an [SSH tunnel][ssh_tunnel], i.e. access `localhost` remotely. This is recommended, and
  doesn't require setting `--http-address`.
-* Use a firewall to limit access to certain remote IPs, e.g. allow access only from one other
+- Use a firewall to limit access to certain remote IPs, e.g. allow access only from one other
  machine on the local network.
-* Shield Lighthouse behind an HTTP server with rate-limiting such as NGINX. This is only
+- Shield Lighthouse behind an HTTP server with rate-limiting such as NGINX. This is only
  recommended for advanced users, e.g. beacon node hosting providers.

Additional risks to be aware of include:

-* The `node/identity` and `node/peers` endpoints expose information about your node's peer-to-peer
+- The `node/identity` and `node/peers` endpoints expose information about your node's peer-to-peer
  identity.
-* The `--http-allow-origin` flag changes the server's CORS policy, allowing cross-site requests
+- The `--http-allow-origin` flag changes the server's CORS policy, allowing cross-site requests
  from browsers. You should only supply it if you understand the risks, e.g.
malicious websites accessing your beacon node if you use the same machine for
  staking and web browsing.
@@ -57,7 +57,6 @@ Additional risks to be aware of include:

Start a beacon node and an execution node according to [Run a node](./run_a_node.md). Note that since [The Merge](https://ethereum.org/en/roadmap/merge/), an execution client is required to be running along with a beacon node. Hence, querying the Beacon Node APIs requires users to run both. While there are some Beacon Node APIs that you can query with only the beacon node, such as the [node version](https://ethereum.github.io/beacon-APIs/#/Node/getNodeVersion), in general an execution client is required to get the updated information about the beacon chain, such as [state root](https://ethereum.github.io/beacon-APIs/#/Beacon/getStateRoot), [headers](https://ethereum.github.io/beacon-APIs/#/Beacon/getBlockHeaders) and many others, which progress dynamically with time.

-
## HTTP Request/Response Examples

This section contains some simple examples of using the HTTP API via `curl`.
@@ -124,9 +123,11 @@ curl -X GET "http://localhost:5052/eth/v1/beacon/states/head/validators/1" -H "
  }
}
```
+
You can replace `1` in the above command with the validator index that you would like to query. Other API queries can be done similarly by changing the link according to the Beacon API.

### Events API
+
The [events API](https://ethereum.github.io/beacon-APIs/#/Events/eventstream) provides information such as the payload attributes that are of interest to block builders and relays. To query the payload attributes, it is necessary to run the Lighthouse beacon node with the flag `--always-prepare-payload`. It is also recommended to add the flag `--prepare-payload-lookahead 8000`, which configures the payload attributes to be sent at 4s into each slot (or 8s from the start of the next slot).
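As a quick sanity check of the lookahead arithmetic (a sketch assuming mainnet's 12-second slot time; the values are illustrative, not read from a running node):

```bash
# Where --prepare-payload-lookahead 8000 lands within a slot, assuming
# mainnet's 12-second (12000 ms) slots. Illustrative arithmetic only.
SLOT_MS=12000
LOOKAHEAD_MS=8000            # --prepare-payload-lookahead 8000
OFFSET_MS=$((SLOT_MS - LOOKAHEAD_MS))
echo "payload attributes sent ${OFFSET_MS} ms into the slot"   # prints 4000 ms, i.e. 4s
```

In other words, an 8000 ms lookahead before the next slot is the same as 4 s into the current slot.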
An example of the command is: ```bash @@ -141,8 +142,8 @@ An example of response is: data:{"version":"capella","data":{"proposal_slot":"11047","proposer_index":"336057","parent_block_root":"0x26f8999d270dd4677c2a1c815361707157a531f6c599f78fa942c98b545e1799","parent_block_number":"9259","parent_block_hash":"0x7fb788cd7afa814e578afa00a3edd250cdd4c8e35c22badd327d981b5bda33d2","payload_attributes":{"timestamp":"1696034964","prev_randao":"0xeee34d7a3f6b99ade6c6a881046c9c0e96baab2ed9469102d46eb8d6e4fde14c","suggested_fee_recipient":"0x0000000000000000000000000000000000000001","withdrawals":[{"index":"40705","validator_index":"360712","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1202941"},{"index":"40706","validator_index":"360713","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1201138"},{"index":"40707","validator_index":"360714","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1215255"},{"index":"40708","validator_index":"360715","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1161977"},{"index":"40709","validator_index":"360716","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1257278"},{"index":"40710","validator_index":"360717","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1247740"},{"index":"40711","validator_index":"360718","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1204337"},{"index":"40712","validator_index":"360719","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1183575"},{"index":"40713","validator_index":"360720","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1157785"},{"index":"40714","validator_index":"360721","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1143371"},{"index":"40715","validator_index":"360722","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1234787"},{"index":"40716","validator_index":"360723","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde",
"amount":"1286673"},{"index":"40717","validator_index":"360724","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1419241"},{"index":"40718","validator_index":"360725","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1231015"},{"index":"40719","validator_index":"360726","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1304321"},{"index":"40720","validator_index":"360727","address":"0x73b2e0e54510239e22cc936f0b4a6de1acf0abde","amount":"1236543"}]}}} ``` - ## Serving the HTTP API over TLS +> > **Warning**: This feature is currently experimental. The HTTP server can be served over TLS by using the `--http-enable-tls`, @@ -160,10 +161,13 @@ Below is a simple example serving the HTTP API over TLS using a self-signed certificate on Linux: ### Enabling TLS on a beacon node + Generate a self-signed certificate using `openssl`: + ```bash openssl req -x509 -nodes -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost" ``` + Note that currently Lighthouse only accepts keys that are not password protected. This means we need to run with the `-nodes` flag (short for 'no DES'). @@ -180,21 +184,27 @@ lighthouse bn \ --http-tls-cert cert.pem \ --http-tls-key key.pem ``` + Note that the user running Lighthouse must have permission to read the certificate and key. The API is now being served at `https://localhost:5052`. To test connectivity, you can run the following: + ```bash curl -X GET "https://localhost:5052/eth/v1/node/version" -H "accept: application/json" --cacert cert.pem | jq ``` + ### Connecting a validator client + In order to connect a validator client to a beacon node over TLS, the validator client needs to be aware of the certificate. There are two ways to do this: + #### Option 1: Add the certificate to the operating system trust store + The process for this will vary depending on your operating system. 
Below are the instructions for Ubuntu and Arch Linux: @@ -211,13 +221,16 @@ sudo trust extract-compat ``` Now the validator client can be connected to the beacon node by running: + ```bash lighthouse vc --beacon-nodes https://localhost:5052 ``` #### Option 2: Specify the certificate via CLI + You can also specify any custom certificates via the validator client CLI like so: + ```bash lighthouse vc --beacon-nodes https://localhost:5052 --beacon-nodes-tls-certs cert.pem ``` diff --git a/book/src/api-lighthouse.md b/book/src/api-lighthouse.md index ce71450987..6ef00578f7 100644 --- a/book/src/api-lighthouse.md +++ b/book/src/api-lighthouse.md @@ -16,12 +16,12 @@ Although we don't recommend that users rely on these endpoints, we document them briefly so they can be utilized by developers and researchers. +## `/lighthouse/health` - -### `/lighthouse/health` *Note: This endpoint is presently only available on Linux.* Returns information regarding the health of the host machine. + ```bash curl -X GET "http://localhost:5052/lighthouse/health" -H "accept: application/json" | jq ``` @@ -64,7 +64,8 @@ curl -X GET "http://localhost:5052/lighthouse/health" -H "accept: application/j ``` -### `/lighthouse/ui/health` +## `/lighthouse/ui/health` + Returns information regarding the health of the host machine. ```bash @@ -101,8 +102,10 @@ curl -X GET "http://localhost:5052/lighthouse/ui/health" -H "accept: applicatio } ``` -### `/lighthouse/ui/validator_count` +## `/lighthouse/ui/validator_count` + Returns an overview of validators. + ```bash curl -X GET "http://localhost:5052/lighthouse/ui/validator_count" -H "accept: application/json" | jq ``` @@ -123,9 +126,10 @@ curl -X GET "http://localhost:5052/lighthouse/ui/validator_count" -H "accept: ap } ``` +## `/lighthouse/ui/validator_metrics` -### `/lighthouse/ui/validator_metrics` Re-exposes certain metrics from the validator monitor to the HTTP API. This API requires that the beacon node to have the flag `--validator-monitor-auto`. 
This API will only return metrics for the validators currently being monitored and present in the POST data, or the validators running in the validator client. + ```bash curl -X POST "http://localhost:5052/lighthouse/ui/validator_metrics" -d '{"indices": [12345]}' -H "Content-Type: application/json" | jq ``` @@ -150,7 +154,9 @@ curl -X POST "http://localhost:5052/lighthouse/ui/validator_metrics" -d '{"indic } } ``` + Running this API without the flag `--validator-monitor-auto` in the beacon node will return null: + ```json { "data": { @@ -159,8 +165,10 @@ Running this API without the flag `--validator-monitor-auto` in the beacon node } ``` -### `/lighthouse/syncing` +## `/lighthouse/syncing` + Returns the sync status of the beacon node. + ```bash curl -X GET "http://localhost:5052/lighthouse/syncing" -H "accept: application/json" | jq ``` @@ -168,6 +176,7 @@ curl -X GET "http://localhost:5052/lighthouse/syncing" -H "accept: application/ There are two possible outcomes, depending on whether the beacon node is syncing or synced. 1. Syncing: + ```json { "data": { @@ -178,20 +187,21 @@ There are two possible outcomes, depending on whether the beacon node is syncing } } ``` + 1. Synced: + ```json { "data": "Synced" } ``` -### `/lighthouse/peers` +## `/lighthouse/peers` ```bash curl -X GET "http://localhost:5052/lighthouse/peers" -H "accept: application/json" | jq ``` - ```json [ { @@ -255,14 +265,14 @@ curl -X GET "http://localhost:5052/lighthouse/peers" -H "accept: application/js ] ``` -### `/lighthouse/peers/connected` +## `/lighthouse/peers/connected` + Returns information about connected peers. 
+ ```bash curl -X GET "http://localhost:5052/lighthouse/peers/connected" -H "accept: application/json" | jq ``` - - ```json [ { @@ -327,7 +337,7 @@ curl -X GET "http://localhost:5052/lighthouse/peers/connected" -H "accept: appl ] ``` -### `/lighthouse/proto_array` +## `/lighthouse/proto_array` ```bash curl -X GET "http://localhost:5052/lighthouse/proto_array" -H "accept: application/json" | jq @@ -335,45 +345,45 @@ curl -X GET "http://localhost:5052/lighthouse/proto_array" -H "accept: applicat *Example omitted for brevity.* -### `/lighthouse/validator_inclusion/{epoch}/{validator_id}` +## `/lighthouse/validator_inclusion/{epoch}/{validator_id}` See [Validator Inclusion APIs](./validator-inclusion.md). -### `/lighthouse/validator_inclusion/{epoch}/global` +## `/lighthouse/validator_inclusion/{epoch}/global` See [Validator Inclusion APIs](./validator-inclusion.md). -### `/lighthouse/eth1/syncing` +## `/lighthouse/eth1/syncing` Returns information regarding execution layer, as it is required for use in consensus layer -#### Fields +### Fields - `head_block_number`, `head_block_timestamp`: the block number and timestamp from the very head of the execution chain. Useful for understanding the immediate health of the execution node that the beacon node is connected to. - `latest_cached_block_number` & `latest_cached_block_timestamp`: the block number and timestamp of the latest block we have in our block cache. - - For correct execution client voting this timestamp should be later than the + - For correct execution client voting this timestamp should be later than the `voting_target_timestamp`. - `voting_target_timestamp`: The latest timestamp allowed for an execution layer block in this voting period. - `eth1_node_sync_status_percentage` (float): An estimate of how far the head of the execution node is from the head of the execution chain. - - `100.0` indicates a fully synced execution node. 
- - `0.0` indicates an execution node that has not verified any blocks past the - genesis block. + - `100.0` indicates a fully synced execution node. + - `0.0` indicates an execution node that has not verified any blocks past the + genesis block. - `lighthouse_is_cached_and_ready`: Is set to `true` if the caches in the - beacon node are ready for block production. - - This value might be set to - `false` whilst `eth1_node_sync_status_percentage == 100.0` if the beacon - node is still building its internal cache. - - This value might be set to `true` whilst - `eth1_node_sync_status_percentage < 100.0` since the cache only cares - about blocks a certain distance behind the head. + beacon node are ready for block production. + - This value might be set to + `false` whilst `eth1_node_sync_status_percentage == 100.0` if the beacon + node is still building its internal cache. + - This value might be set to `true` whilst + `eth1_node_sync_status_percentage < 100.0` since the cache only cares + about blocks a certain distance behind the head. -#### Example +### Example ```bash curl -X GET "http://localhost:5052/lighthouse/eth1/syncing" -H "accept: application/json" | jq @@ -393,11 +403,11 @@ curl -X GET "http://localhost:5052/lighthouse/eth1/syncing" -H "accept: applica } ``` -### `/lighthouse/eth1/block_cache` +## `/lighthouse/eth1/block_cache` Returns a list of all the execution layer blocks in the execution client voting cache. -#### Example +### Example ```bash curl -X GET "http://localhost:5052/lighthouse/eth1/block_cache" -H "accept: application/json" | jq @@ -424,11 +434,11 @@ curl -X GET "http://localhost:5052/lighthouse/eth1/block_cache" -H "accept: app } ``` -### `/lighthouse/eth1/deposit_cache` +## `/lighthouse/eth1/deposit_cache` Returns a list of all cached logs from the deposit contract. 
-#### Example +### Example ```bash curl -X GET "http://localhost:5052/lighthouse/eth1/deposit_cache" -H "accept: application/json" | jq @@ -463,7 +473,7 @@ curl -X GET "http://localhost:5052/lighthouse/eth1/deposit_cache" -H "accept: a } ``` -### `/lighthouse/liveness` +## `/lighthouse/liveness` POST request that checks if any of the given validators have attested in the given epoch. Returns a list of objects, each including the validator index, epoch, and `is_live` status of a requested validator. @@ -488,9 +498,7 @@ curl -X POST "http://localhost:5052/lighthouse/liveness" -d '{"indices":["0","1" } ``` - - -### `/lighthouse/database/info` +## `/lighthouse/database/info` Information about the database's split point and anchor info. @@ -498,7 +506,6 @@ Information about the database's split point and anchor info. curl "http://localhost:5052/lighthouse/database/info" | jq ``` - ```json { "schema_version": 18, @@ -541,9 +548,10 @@ reconstruction has yet to be completed. For more information on the specific meanings of these fields see the docs on [Checkpoint Sync](./checkpoint-sync.md#reconstructing-states). +## `/lighthouse/merge_readiness` -### `/lighthouse/merge_readiness` Returns the current difficulty and terminal total difficulty of the network. Before [The Merge](https://ethereum.org/en/roadmap/merge/) on 15th September 2022, you will see that the current difficulty is less than the terminal total difficulty, An example is shown below: + ```bash curl -X GET "http://localhost:5052/lighthouse/merge_readiness" | jq ``` @@ -574,16 +582,15 @@ As all testnets and Mainnet have been merged, both values will be the same after } ``` - -### `/lighthouse/analysis/attestation_performance/{index}` +## `/lighthouse/analysis/attestation_performance/{index}` Fetch information about the attestation performance of a validator index or all validators for a range of consecutive epochs. 
Two query parameters are required: -* `start_epoch` (inclusive): the first epoch to compute attestation performance for. -* `end_epoch` (inclusive): the final epoch to compute attestation performance for. +- `start_epoch` (inclusive): the first epoch to compute attestation performance for. +- `end_epoch` (inclusive): the final epoch to compute attestation performance for. Example: @@ -649,18 +656,18 @@ curl -X GET "http://localhost:5052/lighthouse/analysis/attestation_performance/g Caveats: -* For maximum efficiency the start_epoch should satisfy `(start_epoch * slots_per_epoch) % slots_per_restore_point == 1`. - This is because the state _prior_ to the `start_epoch` needs to be loaded from the database, +- For maximum efficiency the start_epoch should satisfy `(start_epoch * slots_per_epoch) % slots_per_restore_point == 1`. + This is because the state *prior* to the `start_epoch` needs to be loaded from the database, and loading a state on a boundary is most efficient. -### `/lighthouse/analysis/block_rewards` +## `/lighthouse/analysis/block_rewards` Fetch information about the block rewards paid to proposers for a range of consecutive blocks. Two query parameters are required: -* `start_slot` (inclusive): the slot of the first block to compute rewards for. -* `end_slot` (inclusive): the slot of the last block to compute rewards for. +- `start_slot` (inclusive): the slot of the first block to compute rewards for. +- `end_slot` (inclusive): the slot of the last block to compute rewards for. Example: @@ -668,7 +675,6 @@ Example: curl -X GET "http://localhost:5052/lighthouse/analysis/block_rewards?start_slot=1&end_slot=1" | jq ``` - The first few lines of the response would look like: ```json @@ -698,25 +704,25 @@ The first few lines of the response would look like: Caveats: -* Presently only attestation and sync committee rewards are computed. -* The output format is verbose and subject to change. 
Please see [`BlockReward`][block_reward_src] +- Presently only attestation and sync committee rewards are computed. +- The output format is verbose and subject to change. Please see [`BlockReward`][block_reward_src] in the source. -* For maximum efficiency the `start_slot` should satisfy `start_slot % slots_per_restore_point == 1`. - This is because the state _prior_ to the `start_slot` needs to be loaded from the database, and +- For maximum efficiency the `start_slot` should satisfy `start_slot % slots_per_restore_point == 1`. + This is because the state *prior* to the `start_slot` needs to be loaded from the database, and loading a state on a boundary is most efficient. [block_reward_src]: https://github.com/sigp/lighthouse/tree/unstable/common/eth2/src/lighthouse/block_rewards.rs -### `/lighthouse/analysis/block_packing` +## `/lighthouse/analysis/block_packing` Fetch information about the block packing efficiency of blocks for a range of consecutive epochs. Two query parameters are required: -* `start_epoch` (inclusive): the epoch of the first block to compute packing efficiency for. -* `end_epoch` (inclusive): the epoch of the last block to compute packing efficiency for. +- `start_epoch` (inclusive): the epoch of the first block to compute packing efficiency for. +- `end_epoch` (inclusive): the epoch of the last block to compute packing efficiency for. ```bash curl -X GET "http://localhost:5052/lighthouse/analysis/block_packing_efficiency?start_epoch=1&end_epoch=1" | jq @@ -745,13 +751,12 @@ An excerpt of the response looks like: Caveats: -* `start_epoch` must not be `0`. -* For maximum efficiency the `start_epoch` should satisfy `(start_epoch * slots_per_epoch) % slots_per_restore_point == 1`. - This is because the state _prior_ to the `start_epoch` needs to be loaded from the database, and +- `start_epoch` must not be `0`. +- For maximum efficiency the `start_epoch` should satisfy `(start_epoch * slots_per_epoch) % slots_per_restore_point == 1`. 
+ This is because the state *prior* to the `start_epoch` needs to be loaded from the database, and
  loading a state on a boundary is most efficient.

-
-### `/lighthouse/logs`
+## `/lighthouse/logs`

This is a Server-Sent Events (SSE) subscription endpoint. This allows a user to read
the Lighthouse logs directly from the HTTP API endpoint. This currently
@@ -764,6 +769,7 @@ curl -N "http://localhost:5052/lighthouse/logs"
```

Should provide an output that emits log events as they occur:
+
```json
{
  "data": {
@@ -779,7 +785,8 @@ Should provide an output that emits log events as they occur:
}
```

-### `/lighthouse/nat`
+## `/lighthouse/nat`
+
Checks if the ports are open.

```bash
curl -X GET "http://localhost:5052/lighthouse/nat" | jq
```

An open port will return:
+
```json
{
  "data": true
diff --git a/book/src/api-vc-auth-header.md b/book/src/api-vc-auth-header.md
index 33f6f6ff7a..ca0cc098d9 100644
--- a/book/src/api-vc-auth-header.md
+++ b/book/src/api-vc-auth-header.md
@@ -11,7 +11,7 @@ HTTP header:

Where `` is a string that can be obtained from the validator client host. Here is an example `Authorization` header:

-```
+```text
Authorization: Bearer api-token-0x03eace4c98e8f77477bb99efb74f9af10d800bd3318f92c33b719a4644254d4123
```

@@ -22,16 +22,15 @@ this is `~/.lighthouse/{network}/validators/api-token.txt`.

Here's an example using the `cat` command to print the token to the terminal,
but any text editor will suffice:

-```
-$ cat api-token.txt
+```bash
+cat api-token.txt
api-token-0x03eace4c98e8f77477bb99efb74f9af10d800bd3318f92c33b719a4644254d4123
```
-
When starting the validator client it will output a log message containing the
path to the file containing the API token.
-``` +```text Sep 28 19:17:52.615 INFO HTTP API started api_token_file: "$HOME/prater/validators/api-token.txt", listen_address: 127.0.0.1:5062 ``` diff --git a/book/src/api-vc-endpoints.md b/book/src/api-vc-endpoints.md index cf52454c2d..22f0064745 100644 --- a/book/src/api-vc-endpoints.md +++ b/book/src/api-vc-endpoints.md @@ -2,27 +2,27 @@ ## Endpoints -HTTP Path | Description | +| HTTP Path | Description | | --- | -- | -[`GET /lighthouse/version`](#get-lighthouseversion) | Get the Lighthouse software version. -[`GET /lighthouse/health`](#get-lighthousehealth) | Get information about the host machine. -[`GET /lighthouse/ui/health`](#get-lighthouseuihealth) | Get information about the host machine. Focused for UI applications. -[`GET /lighthouse/spec`](#get-lighthousespec) | Get the Ethereum proof-of-stake consensus specification used by the validator. -[`GET /lighthouse/auth`](#get-lighthouseauth) | Get the location of the authorization token. -[`GET /lighthouse/validators`](#get-lighthousevalidators) | List all validators. -[`GET /lighthouse/validators/:voting_pubkey`](#get-lighthousevalidatorsvoting_pubkey) | Get a specific validator. -[`PATCH /lighthouse/validators/:voting_pubkey`](#patch-lighthousevalidatorsvoting_pubkey) | Update a specific validator. -[`POST /lighthouse/validators`](#post-lighthousevalidators) | Create a new validator and mnemonic. -[`POST /lighthouse/validators/keystore`](#post-lighthousevalidatorskeystore) | Import a keystore. -[`POST /lighthouse/validators/mnemonic`](#post-lighthousevalidatorsmnemonic) | Create a new validator from an existing mnemonic. -[`POST /lighthouse/validators/web3signer`](#post-lighthousevalidatorsweb3signer) | Add web3signer validators. -[`GET /lighthouse/logs`](#get-lighthouselogs) | Get logs - -The query to Lighthouse API endpoints requires authorization, see [Authorization Header](./api-vc-auth-header.md). +| [`GET /lighthouse/version`](#get-lighthouseversion) | Get the Lighthouse software version. 
|
+| [`GET /lighthouse/health`](#get-lighthousehealth) | Get information about the host machine. |
+| [`GET /lighthouse/ui/health`](#get-lighthouseuihealth) | Get information about the host machine. Focused for UI applications. |
+| [`GET /lighthouse/spec`](#get-lighthousespec) | Get the Ethereum proof-of-stake consensus specification used by the validator. |
+| [`GET /lighthouse/auth`](#get-lighthouseauth) | Get the location of the authorization token. |
+| [`GET /lighthouse/validators`](#get-lighthousevalidators) | List all validators. |
+| [`GET /lighthouse/validators/:voting_pubkey`](#get-lighthousevalidatorsvoting_pubkey) | Get a specific validator. |
+| [`PATCH /lighthouse/validators/:voting_pubkey`](#patch-lighthousevalidatorsvoting_pubkey) | Update a specific validator. |
+| [`POST /lighthouse/validators`](#post-lighthousevalidators) | Create a new validator and mnemonic. |
+| [`POST /lighthouse/validators/keystore`](#post-lighthousevalidatorskeystore) | Import a keystore. |
+| [`POST /lighthouse/validators/mnemonic`](#post-lighthousevalidatorsmnemonic) | Create a new validator from an existing mnemonic. |
+| [`POST /lighthouse/validators/web3signer`](#post-lighthousevalidatorsweb3signer) | Add web3signer validators. |
+| [`GET /lighthouse/logs`](#get-lighthouselogs) | Get logs. |
+
+The query to Lighthouse API endpoints requires authorization; see [Authorization Header](./api-vc-auth-header.md).

In addition to the above endpoints Lighthouse also supports all of the
[standard keymanager APIs](https://ethereum.github.io/keymanager-APIs/).

-
## `GET /lighthouse/version`

Returns the software version and `git` commit hash for the Lighthouse binary.

@@ -37,6 +37,7 @@ Returns the software version and `git` commit hash for the Lighthouse binary.
| Typical Responses | 200 |

Command:
+
```bash
DATADIR=/var/lib/lighthouse
curl -X GET "http://localhost:5062/lighthouse/version" -H "Authorization: Bearer $(cat ${DATADIR}/validators/api-token.txt)" | jq
@@ -44,7 +45,6 @@ curl -X GET "http://localhost:5062/lighthouse/version" -H "Authorization: Bearer

Example Response Body:

-
```json
{
  "data": {
@@ -52,9 +52,11 @@ Example Response Body:
  }
}
```
+
> Note: The command provided in this documentation links to the API token file. In this documentation, it is assumed that the API token file is located in `/var/lib/lighthouse/validators/api-token.txt`. If your database is saved in another directory, modify the `DATADIR` accordingly. If you are having permission issues with accessing the API token file, you can modify the header to become `-H "Authorization: Bearer $(sudo cat ${DATADIR}/validators/api-token.txt)"`.

> As an alternative, you can also provide the API token directly, for example, `-H "Authorization: Bearer api-token-0x02dc2a13115cc8c83baf170f597f22b1eb2930542941ab902df3daadebcb8f8176`. In this case, you obtain the token from the file `api-token.txt` and the command becomes:
+
```bash
curl -X GET "http://localhost:5062/lighthouse/version" -H "Authorization: Bearer api-token-0x02dc2a13115cc8c83baf170f597f22b1eb2930542941ab902df3daadebcb8f8176" | jq
```
@@ -75,6 +77,7 @@ Returns information regarding the health of the host machine.

*Note: this endpoint is presently only available on Linux.*

Command:
+
```bash
DATADIR=/var/lib/lighthouse
curl -X GET "http://localhost:5062/lighthouse/health" -H "Authorization: Bearer $(cat ${DATADIR}/validators/api-token.txt)" | jq
@@ -133,6 +136,7 @@ Returns information regarding the health of the host machine.
| Typical Responses | 200 | Command: + ```bash DATADIR=/var/lib/lighthouse curl -X GET "http://localhost:5062/lighthouse/ui/health" -H "Authorization: Bearer $(cat ${DATADIR}/validators/api-token.txt)" | jq @@ -178,10 +182,12 @@ Returns the graffiti that will be used for the next block proposal of each valid | Typical Responses | 200 | Command: + ```bash DATADIR=/var/lib/lighthouse curl -X GET "http://localhost:5062/lighthouse/ui/graffiti" -H "Authorization: Bearer $(cat ${DATADIR}/validators/api-token.txt)" | jq ``` + Example Response Body ```json @@ -323,7 +329,7 @@ Example Response Body ## `GET /lighthouse/auth` Fetch the filesystem path of the [authorization token](./api-vc-auth-header.md). -Unlike the other endpoints this may be called _without_ providing an authorization token. +Unlike the other endpoints this may be called *without* providing an authorization token. This API is intended to be called from the same machine as the validator client, so that the token file may be read by a local user with access rights. @@ -440,7 +446,6 @@ and `graffiti`. The following example updates a validator from `enabled: true` | Required Headers | [`Authorization`](./api-vc-auth-header.md) | | Typical Responses | 200, 400 | - Example Request Body ```json @@ -458,6 +463,7 @@ curl -X PATCH "http://localhost:5062/lighthouse/validators/0xb0148e6348264131bf4 -H "Content-Type: application/json" \ -d "{\"enabled\":false}" | jq ``` + ### Example Response Body ```json @@ -466,12 +472,11 @@ null A `null` response indicates that the request is successful. 
At the same time, `lighthouse vc` will log: -``` +```text INFO Disabled validator voting_pubkey: 0xb0148e6348264131bf47bcd1829590e870c836dc893050fd0dadc7a28949f9d0a72f2805d027521b45441101f0cc1cde INFO Modified key_cache saved successfully ``` - ## `POST /lighthouse/validators/` Create any number of new validators, all of which will share a common mnemonic @@ -510,7 +515,8 @@ Validators are generated from the mnemonic according to ] ``` -Command: +Command: + ```bash DATADIR=/var/lib/lighthouse curl -X POST http://localhost:5062/lighthouse/validators \ @@ -560,7 +566,7 @@ curl -X POST http://localhost:5062/lighthouse/validators \ `lighthouse vc` will log: -``` +```text INFO Enabled validator voting_pubkey: 0x8ffbc881fb60841a4546b4b385ec5e9b5090fd1c4395e568d98b74b94b41a912c6101113da39d43c101369eeb9b48e50, signing_method: local_keystore INFO Modified key_cache saved successfully INFO Disabled validator voting_pubkey: 0xa9fadd620dc68e9fe0d6e1a69f6c54a0271ad65ab5a509e645e45c6e60ff8f4fc538f301781193a08b55821444801502 @@ -625,6 +631,7 @@ Import a keystore into the validator client. We can use [JSON to String Converter](https://jsontostring.com/) so that the above data can be properly presented as a command. The command is as below: Command: + ```bash DATADIR=/var/lib/lighthouse curl -X POST http://localhost:5062/lighthouse/validators/keystore \ @@ -636,6 +643,7 @@ curl -X POST http://localhost:5062/lighthouse/validators/keystore \ As this is an example for demonstration, the above command will return `InvalidPassword`. However, with a keystore file and correct password, running the above command will import the keystore to the validator client. 
An example of a success message is shown below: ### Example Response Body + ```json { "data": { @@ -717,7 +725,7 @@ curl -X POST http://localhost:5062/lighthouse/validators/mnemonic \ `lighthouse vc` will log: -``` +```text INFO Enabled validator voting_pubkey: 0xa062f95fee747144d5e511940624bc6546509eeaeae9383257a9c43e7ddc58c17c2bab4ae62053122184c381b90db380, signing_method: local_keystore INFO Modified key_cache saved successfully ``` @@ -759,8 +767,8 @@ Create any number of new validators, all of which will refer to a Some of the fields above may be omitted or nullified to obtain default values (e.g., `graffiti`, `request_timeout_ms`). - Command: + ```bash DATADIR=/var/lib/lighthouse curl -X POST http://localhost:5062/lighthouse/validators/web3signer \ @@ -769,21 +777,18 @@ curl -X POST http://localhost:5062/lighthouse/validators/web3signer \ -d "[{\"enable\":true,\"description\":\"validator_one\",\"graffiti\":\"Mr F was here\",\"suggested_fee_recipient\":\"0xa2e334e71511686bcfe38bb3ee1ad8f6babcc03d\",\"voting_public_key\":\"0xa062f95fee747144d5e511940624bc6546509eeaeae9383257a9c43e7ddc58c17c2bab4ae62053122184c381b90db380\",\"builder_proposals\":true,\"url\":\"http://path-to-web3signer.com\",\"root_certificate_path\":\"/path/to/certificate.pem\",\"client_identity_path\":\"/path/to/identity.p12\",\"client_identity_password\":\"pass\",\"request_timeout_ms\":12000}]" ``` - ### Example Response Body - ```json null ``` A `null` response indicates that the request is successful. At the same time, `lighthouse vc` will log: -``` +```text INFO Enabled validator voting_pubkey: 0xa062f95fee747144d5e511940624bc6546509eeaeae9383257a9c43e7ddc58c17c2bab4ae62053122184c381b90db380, signing_method: remote_signer ``` - ## `GET /lighthouse/logs` Provides a subscription to receive logs as Server Side Events. 
Currently the diff --git a/book/src/api-vc-sig-header.md b/book/src/api-vc-sig-header.md index a1b9b104f9..468f714cfa 100644 --- a/book/src/api-vc-sig-header.md +++ b/book/src/api-vc-sig-header.md @@ -9,7 +9,7 @@ The validator client HTTP server adds the following header to all responses: Example `Signature` header: -``` +```text Signature: 0x304402205b114366444112580bf455d919401e9c869f5af067cd496016ab70d428b5a99d0220067aede1eb5819eecfd5dd7a2b57c5ac2b98f25a7be214b05684b04523aef873 ``` @@ -83,7 +83,7 @@ The previous Javascript example was written using the output from the following curl -v localhost:5062/lighthouse/version -H "Authorization: Basic api-token-0x03eace4c98e8f77477bb99efb74f9af10d800bd3318f92c33b719a4644254d4123" ``` -``` +```text * Trying ::1:5062... * connect to ::1 port 5062 failed: Connection refused * Trying 127.0.0.1:5062... diff --git a/book/src/api-vc.md b/book/src/api-vc.md index a3400016ec..630a032006 100644 --- a/book/src/api-vc.md +++ b/book/src/api-vc.md @@ -19,11 +19,11 @@ A Lighthouse validator client can be configured to expose a HTTP server by suppl The following CLI flags control the HTTP server: - `--http`: enable the HTTP server (required even if the following flags are - provided). + provided). - `--http-address`: specify the listen address of the server. It is almost always unsafe to use a non-default HTTP listen address. Use this with caution. See the **Security** section below for more information. - `--http-port`: specify the listen port of the server. - `--http-allow-origin`: specify the value of the `Access-Control-Allow-Origin` - header. The default is to not supply a header. + header. The default is to not supply a header. ## Security diff --git a/book/src/builders.md b/book/src/builders.md index 930d330d99..5b8e9ddb8b 100644 --- a/book/src/builders.md +++ b/book/src/builders.md @@ -18,30 +18,34 @@ a missed proposal and the opportunity cost of lost block rewards. 
The beacon node and validator client each require a new flag for Lighthouse to be fully compatible with builder API
servers.

-```
+```bash
lighthouse bn --builder https://mainnet-builder.test
```
+
The `--builder` flag will cause the beacon node to simultaneously query the provided URL and the local execution
engine during block production for a block payload with stubbed-out transactions. If either fails, the successful
result will be used; if both succeed, the more profitable result will be used.

The beacon node will *only* query for this type of block (a "blinded" block) when a validator specifically requests it.
Otherwise, it will continue to serve full blocks as normal. In order to configure the validator client to query for
blinded blocks, you should use the following flag:

-```
+```bash
lighthouse vc --builder-proposals
```
+
With the `--builder-proposals` flag, the validator client will ask for blinded blocks for all validators it manages.

-```
+```bash
lighthouse vc --prefer-builder-proposals
```
+
With the `--prefer-builder-proposals` flag, the validator client will always prefer blinded blocks, regardless of the payload value, for all validators it manages.

-```
+```bash
lighthouse vc --builder-boost-factor
```
+
With the `--builder-boost-factor` flag, a percentage multiplier is applied to the builder's payload value when choosing between a
-builder payload header and payload from the paired execution node. For example, `--builder-boost-factor 50` will only use the builder payload if it is 2x more profitable than the local payload.
+builder payload header and payload from the paired execution node. For example, `--builder-boost-factor 50` will only use the builder payload if it is 2x more profitable than the local payload.

In order to configure whether a validator queries for blinded blocks, check out [this section](#validator-client-configuration).

@@ -88,7 +92,6 @@ You can also update the configured gas limit with these requests.
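The `--builder-boost-factor` comparison above can be sketched as follows. This is an illustrative Python sketch under stated assumptions (integer percentage arithmetic, local payload wins a tie), not Lighthouse's actual selection code:

```python
def choose_payload(local_value: int, builder_value: int, boost_factor: int = 100) -> str:
    """Pick a payload source by weighting the builder bid with a
    percentage multiplier, in the spirit of --builder-boost-factor."""
    weighted_builder = builder_value * boost_factor // 100
    # Assumption: the local payload wins on a tie.
    return "builder" if weighted_builder > local_value else "local"

# With a factor of 50, the builder bid must be more than 2x the local value:
print(choose_payload(100, 150, boost_factor=50))  # local   (150 * 50% = 75)
print(choose_payload(100, 250, boost_factor=50))  # builder (250 * 50% = 125)
```

A factor of 100 (the neutral value) reduces this to a plain "use whichever payload is worth more" comparison.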
#### `PATCH /lighthouse/validators/:voting_pubkey`

-
#### HTTP Specification

| Property | Specification |
@@ -100,12 +103,14 @@ You can also update the configured gas limit with these requests.

#### Example Path

-```
+```text
localhost:5062/lighthouse/validators/0xb0148e6348264131bf47bcd1829590e870c836dc893050fd0dadc7a28949f9d0a72f2805d027521b45441101f0cc1cde
```

#### Example Request Body
+
Each field is optional.
+
```json
{
    "builder_proposals": true,
@@ -113,7 +118,7 @@ Each field is optional.
}
```

-Command: 
+Command:

```bash
DATADIR=/var/lib/lighthouse
curl -X PATCH "http://localhost:5062/lighthouse/validators/0xb0148e6348264131bf4
@@ -125,6 +130,7 @@ curl -X PATCH "http://localhost:5062/lighthouse/validators/0xb0148e6348264131bf4
    "gas_limit": 30000001
}' | jq
```
+
If you are having permission issues when accessing the API token file, you can modify the header to become `-H "Authorization: Bearer $(sudo cat ${DATADIR}/validators/api-token.txt)"`.

#### Example Response Body

@@ -135,7 +141,7 @@
null

A `null` response indicates that the request is successful. At the same time, `lighthouse vc` will show a log which looks like:

-```
+```text
INFO Published validator registrations to the builder network, count: 3, service: preparation
```
-- `--builder-fallback-skips` - If we've seen this number of skip slots on the canonical chain in a row prior to proposing, we will NOT query
+* `--builder-fallback-skips` - If we've seen this number of skip slots on the canonical chain in a row prior to proposing, we will NOT query
any connected builders, and will use the local execution engine for payload construction.
-- `--builder-fallback-skips-per-epoch` - If we've seen this number of skip slots on the canonical chain in the past `SLOTS_PER_EPOCH`, we will NOT
+* `--builder-fallback-skips-per-epoch` - If we've seen this number of skip slots on the canonical chain in the past `SLOTS_PER_EPOCH`, we will NOT
query any connected builders, and will use the local execution engine for payload construction.
-- `--builder-fallback-epochs-since-finalization` - If we're proposing and the chain has not finalized within
+* `--builder-fallback-epochs-since-finalization` - If we're proposing and the chain has not finalized within
this number of epochs, we will NOT query any connected builders, and will use the local execution engine for payload
construction. Setting this value to anything less than 2 will cause the node to NEVER query connected builders. Setting it to 2 will cause this condition to be hit if there are skip slots at the start of an epoch, right before this node is set to propose.
-- `--builder-fallback-disable-checks` - This flag disables all checks related to chain health. This means the builder
+* `--builder-fallback-disable-checks` - This flag disables all checks related to chain health. This means the builder
API will always be used for payload construction, regardless of recent chain conditions.
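The chain-health checks above can be summarised with a small sketch. The default thresholds and the exact boundary comparisons below are assumptions chosen for illustration, not Lighthouse's implementation:

```python
def builder_allowed(consecutive_skips: int,
                    skips_in_epoch: int,
                    epochs_since_finalization: int,
                    max_skips: int = 3,
                    max_skips_per_epoch: int = 8,
                    max_epochs_since_finalization: int = 3) -> bool:
    """Return True when the chain looks healthy enough to query a builder.

    Mirrors the spirit of --builder-fallback-skips,
    --builder-fallback-skips-per-epoch and
    --builder-fallback-epochs-since-finalization (thresholds assumed).
    """
    if consecutive_skips >= max_skips:
        return False  # too many skip slots in a row before our proposal
    if skips_in_epoch >= max_skips_per_epoch:
        return False  # too many skip slots in the past SLOTS_PER_EPOCH
    if epochs_since_finalization > max_epochs_since_finalization:
        return False  # chain has not finalized recently enough
    return True

print(builder_allowed(0, 1, 2))  # True: healthy chain, query the builder
print(builder_allowed(4, 1, 2))  # False: consecutive skip slots, use local payload
```

Passing `--builder-fallback-disable-checks` would be equivalent to making this function return `True` unconditionally.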
## Checking your builder config @@ -196,20 +202,20 @@ You can check that your builder is configured correctly by looking for these log On start-up, the beacon node will log if a builder is configured: -``` +```text INFO Using external block builder ``` At regular intervals the validator client will log that it successfully registered its validators with the builder network: -``` +```text INFO Published validator registrations to the builder network ``` When you successfully propose a block using a builder, you will see this log on the beacon node: -``` +```text INFO Successfully published a block to the builder network ``` @@ -218,34 +224,35 @@ for `INFO` and `WARN` messages indicating why the builder was not used. Examples of messages indicating fallback to a locally produced block are: -``` +```text INFO Builder did not return a payload ``` -``` +```text WARN Builder error when requesting payload ``` -``` +```text WARN Builder returned invalid payload ``` -``` +```text INFO Builder payload ignored ``` -``` +```text INFO Chain is unhealthy, using local payload ``` In case of fallback you should see a log indicating that the locally produced payload was used in place of one from the builder: -``` +```text INFO Reconstructing a full block using a local payload ``` ## Information for block builders and relays + Block builders and relays can query beacon node events from the [Events API](https://ethereum.github.io/beacon-APIs/#/Events/eventstream). 
An example of querying the payload attributes in the Events API is outlined in [Beacon node API - Events API](./api-bn.md#events-api) [mev-rs]: https://github.com/ralexstokes/mev-rs diff --git a/book/src/checkpoint-sync.md b/book/src/checkpoint-sync.md index 37677c00ad..63d96874c3 100644 --- a/book/src/checkpoint-sync.md +++ b/book/src/checkpoint-sync.md @@ -15,20 +15,20 @@ To begin checkpoint sync you will need HTTP API access to another synced beacon checkpoint sync by providing the other beacon node's URL to `--checkpoint-sync-url`, alongside any other flags: -``` +```bash lighthouse bn --checkpoint-sync-url "http://remote-bn:5052" ... ``` Lighthouse will print a message to indicate that checkpoint sync is being used: -``` +```text INFO Starting checkpoint sync remote_url: http://remote-bn:8000/, service: beacon ``` After a short time (usually less than a minute), it will log the details of the checkpoint loaded from the remote beacon node: -``` +```text INFO Loaded checkpoint block and state state_root: 0xe8252c68784a8d5cc7e5429b0e95747032dd1dcee0d1dc9bdaf6380bf90bc8a6, block_root: 0x5508a20147299b1a7fe9dbea1a8b3bf979f74c52e7242039bd77cbff62c0695a, slot: 2034720, service: beacon ``` @@ -43,7 +43,8 @@ as soon as forwards sync completes. ### Use a community checkpoint sync endpoint The Ethereum community provides various [public endpoints](https://eth-clients.github.io/checkpoint-sync-endpoints/) for you to choose from for your initial checkpoint state. Select one for your network and use it as the url for the `--checkpoint-sync-url` flag. e.g. -``` + +```bash lighthouse bn --checkpoint-sync-url https://example.com/ ... ``` @@ -52,7 +53,7 @@ lighthouse bn --checkpoint-sync-url https://example.com/ ... If the beacon node fails to start due to a timeout from the checkpoint sync server, you can try running it again with a longer timeout by adding the flag `--checkpoint-sync-url-timeout`. 
-``` +```bash lighthouse bn --checkpoint-sync-url-timeout 300 --checkpoint-sync-url https://example.com/ ... ``` @@ -66,7 +67,7 @@ from the checkpoint back to genesis. The beacon node will log messages similar to the following each minute while it completes backfill sync: -``` +```text INFO Downloading historical blocks est_time: 5 hrs 0 mins, speed: 111.96 slots/sec, distance: 2020451 slots (40 weeks 0 days), service: slot_notifier ``` @@ -80,21 +81,16 @@ Once backfill is complete, a `INFO Historical block download complete` log will 1. What if I have an existing database? How can I use checkpoint sync? -The existing beacon database needs to be deleted before Lighthouse will attempt checkpoint sync. -You can do this by providing the `--purge-db` flag, or by manually deleting `/beacon`. + The existing beacon database needs to be deleted before Lighthouse will attempt checkpoint sync. + You can do this by providing the `--purge-db` flag, or by manually deleting `/beacon`. -2. Why is checkpoint sync faster? +1. Why is checkpoint sync faster? -Checkpoint sync prioritises syncing to the head of the chain quickly so that the node can perform -its duties. Additionally, it only has to perform lightweight verification of historic blocks: -it checks the hash chain integrity & proposer signature rather than computing the full state -transition. + Checkpoint sync prioritises syncing to the head of the chain quickly so that the node can perform its duties. Additionally, it only has to perform lightweight verification of historic blocks: it checks the hash chain integrity & proposer signature rather than computing the full state transition. -3. Is checkpoint sync less secure? +1. Is checkpoint sync less secure? -No, in fact it is more secure! Checkpoint sync guards against long-range attacks that -genesis sync does not. This is due to a property of Proof of Stake consensus known as [Weak -Subjectivity][weak-subj]. + No, in fact it is more secure! 
Checkpoint sync guards against long-range attacks that genesis sync does not. This is due to a property of Proof of Stake consensus known as [Weak Subjectivity][weak-subj]. ## Reconstructing States @@ -122,7 +118,7 @@ states: Reconstruction runs from the state lower limit to the upper limit, narrowing the window of unavailable states as it goes. It will log messages like the following to show its progress: -``` +```text INFO State reconstruction in progress remaining: 747519, slot: 466944, service: freezer_db ``` diff --git a/book/src/cli.md b/book/src/cli.md index 6540d3fc3a..f9e7df0748 100644 --- a/book/src/cli.md +++ b/book/src/cli.md @@ -4,10 +4,10 @@ The `lighthouse` binary provides all necessary Ethereum consensus client functio has two primary sub-commands: - `$ lighthouse beacon_node`: the largest and most fundamental component which connects to - the p2p network, processes messages and tracks the head of the beacon - chain. + the p2p network, processes messages and tracks the head of the beacon + chain. - `$ lighthouse validator_client`: a lightweight but important component which loads a validators private - key and signs messages using a `beacon_node` as a source-of-truth. + key and signs messages using a `beacon_node` as a source-of-truth. There are also some ancillary binaries like `lcli` and `account_manager`, but these are primarily for testing. @@ -34,11 +34,11 @@ Each binary supports the `--help` flag, this is the best source of documentation. 
```bash -$ lighthouse beacon_node --help +lighthouse beacon_node --help ``` ```bash -$ lighthouse validator_client --help +lighthouse validator_client --help ``` ## Creating a new database/testnet diff --git a/book/src/contributing.md b/book/src/contributing.md index 5b0ab48e86..312acccbc0 100644 --- a/book/src/contributing.md +++ b/book/src/contributing.md @@ -8,7 +8,6 @@ [stable]: https://github.com/sigp/lighthouse/tree/stable [unstable]: https://github.com/sigp/lighthouse/tree/unstable - Lighthouse welcomes contributions. If you are interested in contributing to the Ethereum ecosystem, and you want to learn Rust, Lighthouse is a great project to work on. @@ -56,8 +55,8 @@ Please use [clippy](https://github.com/rust-lang/rust-clippy) and inconsistent code formatting: ```bash -$ cargo clippy --all -$ cargo fmt --all --check +cargo clippy --all +cargo fmt --all --check ``` ### Panics @@ -88,8 +87,9 @@ pub fn my_function(&mut self, _something &[u8]) -> Result { **General Comments** -* Prefer line (``//``) comments to block comments (``/* ... */``) -* Comments can appear on the line prior to the item or after a trailing space. +- Prefer line (``//``) comments to block comments (``/* ... */``) +- Comments can appear on the line prior to the item or after a trailing space. + ```rust // Comment for this struct struct Lighthouse {} @@ -98,8 +98,8 @@ fn make_blockchain() {} // A comment on the same line after a space **Doc Comments** -* The ``///`` is used to generate comments for Docs. -* The comments should come before attributes. +- The ``///`` is used to generate comments for Docs. +- The comments should come before attributes. ```rust /// Stores the core configuration for this Lighthouse instance. @@ -123,9 +123,9 @@ introduction and tutorial for the language). Rust has a steep learning curve, but there are many resources to help. 
We suggest: -* [Rust Book](https://doc.rust-lang.org/stable/book/) -* [Rust by example](https://doc.rust-lang.org/stable/rust-by-example/) -* [Learning Rust With Entirely Too Many Linked Lists](http://cglab.ca/~abeinges/blah/too-many-lists/book/) -* [Rustlings](https://github.com/rustlings/rustlings) -* [Rust Exercism](https://exercism.io/tracks/rust) -* [Learn X in Y minutes - Rust](https://learnxinyminutes.com/docs/rust/) +- [Rust Book](https://doc.rust-lang.org/stable/book/) +- [Rust by example](https://doc.rust-lang.org/stable/rust-by-example/) +- [Learning Rust With Entirely Too Many Linked Lists](http://cglab.ca/~abeinges/blah/too-many-lists/book/) +- [Rustlings](https://github.com/rustlings/rustlings) +- [Rust Exercism](https://exercism.io/tracks/rust) +- [Learn X in Y minutes - Rust](https://learnxinyminutes.com/docs/rust/) diff --git a/book/src/cross-compiling.md b/book/src/cross-compiling.md index 7cf7f4feb1..dfddcbc294 100644 --- a/book/src/cross-compiling.md +++ b/book/src/cross-compiling.md @@ -4,7 +4,6 @@ Lighthouse supports cross-compiling, allowing users to run a binary on one platform (e.g., `aarch64`) that was compiled on another platform (e.g., `x86_64`). - ## Instructions Cross-compiling requires [`Docker`](https://docs.docker.com/engine/install/), diff --git a/book/src/database-migrations.md b/book/src/database-migrations.md index 1e8e134436..a81acd7794 100644 --- a/book/src/database-migrations.md +++ b/book/src/database-migrations.md @@ -53,13 +53,13 @@ To apply a downgrade you need to use the `lighthouse db migrate` command with th 5. After stopping the beacon node, run the migrate command with the `--to` parameter set to the schema version you would like to downgrade to. 
-``` +```bash sudo -u "$LH_USER" lighthouse db migrate --to "$VERSION" --datadir "$LH_DATADIR" --network "$NET" ``` For example if you want to downgrade to Lighthouse v4.0.1 from v4.2.0 and you followed Somer Esat's guide, you would run: -``` +```bash sudo -u lighthousebeacon lighthouse db migrate --to 16 --datadir /var/lib/lighthouse --network mainnet ``` @@ -113,7 +113,7 @@ The `schema_version` key indicates that this database is using schema version 16 Alternatively, you can check the schema version with the `lighthouse db` command. -``` +```bash sudo -u lighthousebeacon lighthouse db version --datadir /var/lib/lighthouse --network mainnet ``` @@ -132,25 +132,27 @@ Several conditions need to be met in order to run `lighthouse db`: The general form for a `lighthouse db` command is: -``` +```bash sudo -u "$LH_USER" lighthouse db version --datadir "$LH_DATADIR" --network "$NET" ``` If you followed Somer Esat's guide for mainnet: -``` +```bash sudo systemctl stop lighthousebeacon ``` -``` + +```bash sudo -u lighthousebeacon lighthouse db version --datadir /var/lib/lighthouse --network mainnet ``` If you followed the CoinCashew guide for mainnet: -``` +```bash sudo systemctl stop beacon-chain ``` -``` + +```bash lighthouse db version --network mainnet ``` @@ -178,7 +180,7 @@ Here are the steps to prune historic states: If pruning is available, Lighthouse will log: - ``` + ```text INFO Ready to prune states WARN Pruning states is irreversible WARN Re-run this command with --confirm to commit to state deletion @@ -193,10 +195,10 @@ Here are the steps to prune historic states: The `--confirm` flag ensures that you are aware the action is irreversible, and historic states will be permanently removed. Lighthouse will log: - ``` + ```text INFO Historic states pruned successfully ``` - + 4. 
After successfully pruning the historic states, you can restart the Lighthouse beacon node: ```bash diff --git a/book/src/developers.md b/book/src/developers.md index ab12bed5b9..244c935ac2 100644 --- a/book/src/developers.md +++ b/book/src/developers.md @@ -5,7 +5,6 @@ _Documentation for protocol developers._ This section lists Lighthouse-specific decisions that are not strictly spec'd and may be useful for other protocol developers wishing to interact with lighthouse. - ## Custom ENR Fields Lighthouse currently uses the following ENR fields: @@ -18,7 +17,6 @@ Lighthouse currently uses the following ENR fields: | `attnets` | An SSZ bitfield which indicates which of the 64 subnets the node is subscribed to for an extended period of time | | `syncnets` | An SSZ bitfield which indicates which of the sync committee subnets the node is subscribed to | - ### Lighthouse Custom Fields Lighthouse is currently using the following custom ENR fields. @@ -27,7 +25,6 @@ Lighthouse is currently using the following custom ENR fields. | `quic` | The UDP port on which the QUIC transport is listening on IPv4 | | `quic6` | The UDP port on which the QUIC transport is listening on IPv6 | - ## Custom RPC Messages The specification leaves room for implementation-specific errors. Lighthouse uses the following @@ -43,7 +40,6 @@ custom RPC error messages. | 251 | Banned | The peer has been banned and disconnected | | 252 | Banned IP | The IP the node is connected to us with has been banned | - ### Error Codes | Code | Message | Description | diff --git a/book/src/docker.md b/book/src/docker.md index 2c410877e5..16e685491e 100644 --- a/book/src/docker.md +++ b/book/src/docker.md @@ -30,7 +30,7 @@ If you can see the latest [Lighthouse release](https://github.com/sigp/lighthous ### Example Version Output -``` +```text Lighthouse vx.x.xx-xxxxxxxxx BLS Library: xxxx-xxxxxxx ``` @@ -49,13 +49,13 @@ compatibility (see [Portability](./installation-binaries.md#portability)). 
To install a specific tag (in this case `latest-modern`), add the tag name to your `docker` commands: -``` +```bash docker pull sigp/lighthouse:latest-modern ``` Image tags follow this format: -``` +```text ${version}${arch}${stability}${modernity}${features} ``` @@ -85,7 +85,6 @@ The `features` is: * `-dev` for a development build with `minimal` preset enabled (`spec-minimal` feature). * empty for a standard build with no custom feature enabled. - Examples: * `latest-unstable-modern`: most recent `unstable` build for all modern CPUs (x86_64 or ARM) diff --git a/book/src/faq.md b/book/src/faq.md index 104190ab9b..c7fdb6b32f 100644 --- a/book/src/faq.md +++ b/book/src/faq.md @@ -1,6 +1,7 @@ # Frequently Asked Questions ## [Beacon Node](#beacon-node-1) + - [I see a warning about "Syncing deposit contract block cache" or an error about "updating deposit contract cache", what should I do?](#bn-deposit-contract) - [I see beacon logs showing `WARN: Execution engine called failed`, what should I do?](#bn-ee) - [I see beacon logs showing `Error during execution engine upcheck`, what should I do?](#bn-upcheck) @@ -16,6 +17,7 @@ - [My beacon node logs `WARN Failed to finalize deposit cache`, what should I do?](#bn-deposit-cache) ## [Validator](#validator-1) + - [Why does it take so long for a validator to be activated?](#vc-activation) - [Can I use redundancy in my staking setup?](#vc-redundancy) - [I am missing attestations. 
Why?](#vc-missed-attestations) @@ -27,6 +29,7 @@ - [How can I delete my validator once it is imported?](#vc-delete) ## [Network, Monitoring and Maintenance](#network-monitoring-and-maintenance-1) + - [I have a low peer count and it is not increasing](#net-peer) - [How do I update lighthouse?](#net-update) - [Do I need to set up any port mappings (port forwarding)?](#net-port-forwarding) @@ -38,13 +41,14 @@ - [How to know how many of my peers are connected through QUIC?](#net-quic) ## [Miscellaneous](#miscellaneous-1) + - [What should I do if I lose my slashing protection database?](#misc-slashing) - [I can't compile lighthouse](#misc-compile) - [How do I check the version of Lighthouse that is running?](#misc-version) - [Does Lighthouse have pruning function like the execution client to save disk space?](#misc-prune) - [Can I use a HDD for the freezer database and only have the hot db on SSD?](#misc-freezer) - [Can Lighthouse log in local timestamp instead of UTC?](#misc-timestamp) -- [My hard disk is full and my validator is down. What should I do? ](#misc-full) +- [My hard disk is full and my validator is down. What should I do?](#misc-full) ## Beacon Node @@ -52,13 +56,13 @@ The error can be a warning: -``` +```text Nov 30 21:04:28.268 WARN Syncing deposit contract block cache est_blocks_remaining: initializing deposits, service: slot_notifier ``` or an error: -``` +```text ERRO Error updating deposit contract cache error: Failed to get remote head and new block ranges: EndpointError(FarBehind), retry_millis: 60000, service: deposit_contract_rpc ``` @@ -80,11 +84,13 @@ The `WARN Execution engine called failed` log is shown when the beacon node cann `error: HttpClient(url: http://127.0.0.1:8551/, kind: timeout, detail: operation timed out), service: exec` which says `TimedOut` at the end of the message. This means that the execution engine has not responded in time to the beacon node. 
One option is to add the flags `--execution-timeout-multiplier 3` and `--disable-lock-timeouts` to the beacon node. However, if the error persists, it is worth digging further to find out the cause. There are a few reasons why this can occur:
+
1. The execution engine is not synced. Check the log of the execution engine to make sure that it is synced. If it is syncing, wait until it is synced and the error will disappear. You will see the beacon node logs `INFO Execution engine online` when it is synced.
1. The computer is overloaded. Check the CPU and RAM usage to see if it is overloaded. You can use `htop` to check for CPU and RAM usage.
1. Your SSD is slow. Check if your SSD is in "The Bad" list [here](https://gist.github.com/yorickdowne/f3a3e79a573bf35767cd002cc977b038). If your SSD is in "The Bad" list, it means it cannot keep up with the network and you may want to consider upgrading to a better SSD.

If the error is due to reason 1 above, you may want to look further. If the execution engine suddenly goes out of sync, it is usually caused by an ungraceful shutdown. The common causes for ungraceful shutdown are:
+
- Power outage. If power outages are an issue at your place, consider getting a UPS to avoid ungraceful shutdown of services.
- The service file is not stopped properly. To overcome this, make sure that the process is stopped properly, e.g., during client updates.
- Out of memory (oom) error. This can happen when the system memory usage has reached its maximum and causes the execution engine to be killed. To confirm that the error is due to oom, run `sudo dmesg -T | grep killed` to look for killed processes. If you are using geth as the execution client, a short-term solution is to reduce the resources used. For example, you can reduce the cache by adding the flag `--cache 2048`. If the oom occurs rather frequently, a long-term solution is to increase the memory capacity of the computer.
@@ -95,7 +101,7 @@

An example of the full error is:

`ERRO Error during execution engine upcheck error: HttpClient(url: http://127.0.0.1:8551/, kind: request, detail: error trying to connect: tcp connect error: Connection refused (os error 111)), service: exec`

-Connection refused means the beacon node cannot reach the execution client. This could be due to the execution client is offline or the configuration is wrong. If the execution client is offline, run the execution engine and the error will disappear. 
+Connection refused means the beacon node cannot reach the execution client. This could be because the execution client is offline or the configuration is wrong. If the execution client is offline, run the execution engine and the error will disappear.

If it is a configuration issue, ensure that the execution engine can be reached. The standard endpoint to connect to the execution client is `--execution-endpoint http://localhost:8551`. If the execution client is on a different host, the endpoint to connect to it will change, e.g., `--execution-endpoint http://IP_address:8551` where `IP_address` is the IP of the execution client node (you may also need additional flags to be set). If it is using another port, the endpoint link needs to be changed accordingly. Once the execution client/beacon node is configured correctly, the error will disappear.

@@ -109,13 +115,12 @@
INFO Downloading historical blocks est_time: --, distance: 4524545 slo

If the same log appears every minute and you do not see progress in downloading historical blocks, you can try one of the following:

- - Check the number of peers you are connected to. If you have low peers (less than 50), try to do port forwarding on the ports 9000 TCP/UDP and 9001 UDP to increase peer count.
- - Restart the beacon node.
-
+- Check the number of peers you are connected to. If you have low peers (less than 50), try to do port forwarding on the ports 9000 TCP/UDP and 9001 UDP to increase peer count.
+- Restart the beacon node. ### I proposed a block but the beacon node shows `could not publish message` with error `duplicate` as below, should I be worried? -``` +```text INFO Block from HTTP API already known` WARN Could not publish message error: Duplicate, service: libp2p ``` @@ -128,7 +133,7 @@ In short, it is nothing to worry about. The log looks like: -``` +```text WARN Head is optimistic execution_block_hash: 0x47e7555f1d4215d1ad409b1ac188b008fcb286ed8f38d3a5e8078a0af6cbd6e1, info: chain not fully verified, block and attestation production disabled until execution engine syncs, service: slot_notifier ``` @@ -138,7 +143,7 @@ It means the beacon node will follow the chain, but it will not be able to attes An example of the log is shown below: -``` +```text CRIT Beacon block processing error error: ValidatorPubkeyCacheLockTimeout, service: beacon WARN BlockProcessingFailure outcome: ValidatorPubkeyCacheLockTimeout, msg: unexpected condition in processing block. ``` @@ -149,7 +154,7 @@ A `Timeout` error suggests that the computer may be overloaded at the moment, fo An example of the full log is shown below: -``` +```text WARN BlockProcessingFailure outcome: MissingBeaconBlock(0xbdba211f8d72029554e405d8e4906690dca807d1d7b1bc8c9b88d7970f1648bc), msg: unexpected condition in processing block. ``` @@ -165,41 +170,41 @@ This warning usually comes with an http error code. Some examples are given belo 1. The log shows: -``` -WARN Error processing HTTP API request method: GET, path: /eth/v1/validator/attestation_data, status: 500 Internal Server Error, elapsed: 305.65µs -``` + ```text + WARN Error processing HTTP API request method: GET, path: /eth/v1/validator/attestation_data, status: 500 Internal Server Error, elapsed: 305.65µs + ``` -The error is `500 Internal Server Error`. This suggests that the execution client is not synced. Once the execution client is synced, the error will disappear. + The error is `500 Internal Server Error`. 
This suggests that the execution client is not synced. Once the execution client is synced, the error will disappear. -2. The log shows: +1. The log shows: -``` -WARN Error processing HTTP API request method: POST, path: /eth/v1/validator/duties/attester/199565, status: 503 Service Unavailable, elapsed: 96.787µs -``` + ```text + WARN Error processing HTTP API request method: POST, path: /eth/v1/validator/duties/attester/199565, status: 503 Service Unavailable, elapsed: 96.787µs + ``` -The error is `503 Service Unavailable`. This means that the beacon node is still syncing. When this happens, the validator client will log: + The error is `503 Service Unavailable`. This means that the beacon node is still syncing. When this happens, the validator client will log: -``` -ERRO Failed to download attester duties err: FailedToDownloadAttesters("Some endpoints failed, num_failed: 2 http://localhost:5052/ => Unavailable(NotSynced), http://localhost:5052/ => RequestFailed(ServerMessage(ErrorMessage { code: 503, message: \"SERVICE_UNAVAILABLE: beacon node is syncing -``` + ```text + ERRO Failed to download attester duties err: FailedToDownloadAttesters("Some endpoints failed, num_failed: 2 http://localhost:5052/ => Unavailable(NotSynced), http://localhost:5052/ => RequestFailed(ServerMessage(ErrorMessage { code: 503, message: \"SERVICE_UNAVAILABLE: beacon node is syncing + ``` -This means that the validator client is sending requests to the beacon node. However, as the beacon node is still syncing, it is therefore unable to fulfil the request. The error will disappear once the beacon node is synced. + This means that the validator client is sending requests to the beacon node. However, as the beacon node is still syncing, it is therefore unable to fulfil the request. The error will disappear once the beacon node is synced. ### My beacon node logs `WARN Error signalling fork choice waiter`, what should I do? 
An example of the full log is shown below: -``` +```text WARN Error signalling fork choice waiter slot: 6763073, error: ForkChoiceSignalOutOfOrder { current: Slot(6763074), latest: Slot(6763073) }, service: state_advance ``` This suggests that the computer resources are being overwhelmed. It could be due to high CPU usage or high disk I/O usage. This can happen, e.g., when the beacon node is downloading historical blocks, or when the execution client is syncing. The error will disappear when the resources used return to normal or when the node is synced. - ### My beacon node logs `ERRO Aggregate attestation queue full`, what should I do? An example of the full log is shown below: -``` + +```text ERRO Aggregate attestation queue full, queue_len: 4096, msg: the system has insufficient resources for load, module: network::beacon_processor:1542 ``` @@ -207,7 +212,7 @@ This suggests that the computer resources are being overwhelmed. It could be due ### My beacon node logs `WARN Failed to finalize deposit cache`, what should I do? -This is a known [bug](https://github.com/sigp/lighthouse/issues/3707) that will fix by itself. +This is a known [bug](https://github.com/sigp/lighthouse/issues/3707) that will fix by itself. ## Validator @@ -312,7 +317,9 @@ However, there are some components which can be configured with redundancy. See [Redundancy](./redundancy.md) guide for more information. ### I am missing attestations. Why? + The first thing is to ensure both consensus and execution clients are synced with the network. If they are synced, there may still be some issues with the node setup itself that is causing the missed attestations. Check the setup to ensure that: + - the clock is synced - the computer has sufficient resources and is not overloaded - the internet is working well @@ -322,13 +329,12 @@ You can see more information on the [Ethstaker KB](https://ethstaker.gitbook.io/ Another cause for missing attestations is delays during block processing. 
When this happens, the debug logs will show (debug logs can be found under `$datadir/beacon/logs`):

-```
+```text
DEBG Delayed head block               set_as_head_delay: Some(93.579425ms), imported_delay: Some(1.460405278s), observed_delay: Some(2.540811921s), block_delay: 4.094796624s, slot: 6837344, proposer_index: 211108, block_root: 0x2c52231c0a5a117401f5231585de8aa5dd963bc7cbc00c544e681342eedd1700, service: beacon
```

The fields to look for are `imported_delay > 1s` and `observed_delay < 3s`. The `imported_delay` is how long the node took to process the block. An `imported_delay` larger than 1 second suggests that there is slowness in processing the block. It could be due to high CPU usage, high disk I/O, or the clients running background maintenance processes. The `observed_delay` is determined mostly by the proposer and partly by your networking setup (e.g., how long it took for the node to receive the block). An `observed_delay` of less than 3 seconds means that the block did not arrive late from the block proposer. Combining the above, this implies that the validator should have been able to attest to the block, but failed due to slowness in the node processing the block.

-
### Sometimes I miss the attestation head vote, resulting in penalty. Is this normal?

In general, it is unavoidable to have some penalties occasionally. This is particularly the case when you are assigned to attest on the first slot of an epoch: if the proposer of that slot releases the block late, you will get penalised for missing the target and head votes. Your attestation performance depends not only on your own setup, but also on everyone else's performance.

@@ -337,18 +343,17 @@
You could also check the sync aggregate participation percentage on block explorers such as [beaconcha.in](https://beaconcha.in/). A low sync aggregate participation percentage (e.g., 60-70%) indicates that the block that you are assigned to attest to may have been published late. As a result, your validator fails to correctly attest to the block.

Another possible reason for missing the head vote is a chain "reorg". A reorg can happen if the proposer publishes block `n` late, and the proposer of block `n+1` builds upon block `n-1` instead of `n`.
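After the fact, you can check whether a given slot still has a canonical block by querying the standard beacon API on your own node. The sketch below factors the decision into a small helper; the slot number and port are illustrative, and it assumes a local beacon node started with `--http` (the actual query is shown as a comment).

```bash
# Sketch: classify a slot from the HTTP status returned by the standard
# beacon API endpoint /eth/v1/beacon/headers/{slot}.
slot_block_status() {
  case "$1" in
    200) echo "canonical" ;;  # a canonical block exists at this slot
    404) echo "missing" ;;    # slot was skipped, or its block was orphaned in a reorg
    *)   echo "unknown" ;;    # node unreachable or still syncing
  esac
}

# In practice (slot number illustrative, assumes --http on port 5052):
# status=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:5052/eth/v1/beacon/headers/6837344)
# slot_block_status "$status"
slot_block_status 404
```

A `missing` result for a slot that previously had a block suggests the block was orphaned rather than merely skipped.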
This is called a "reorg". Due to the reorg, block `n` was never included in the chain. If you are assigned to attest at slot `n`, you may still attest to block `n` despite most of the network recognizing the block as being late. In this case you will miss the head reward.

-
### Can I submit a voluntary exit message without running a beacon node?

Yes. Beaconcha.in provides a tool to broadcast the message. You can create the voluntary exit message file with [ethdo](https://github.com/wealdtech/ethdo/releases/tag/v1.30.0) and submit the message via the [beaconcha.in](https://beaconcha.in/tools/broadcast) website. A guide on how to use `ethdo` to perform a voluntary exit can be found [here](https://github.com/eth-educators/ethstaker-guides/blob/main/voluntary-exit.md).

Note that you can also submit your BLS-to-execution-change message to update your withdrawal credentials from type `0x00` to `0x01` using the same link.

-If you would like to still use Lighthouse to submit the message, you will need to run a beacon node and an execution client. For the beacon node, you can use checkpoint sync to quickly sync the chain under a minute. On the other hand, the execution client can be syncing and *needs not be synced*. This implies that it is possible to broadcast a voluntary exit message within a short time by quickly spinning up a node.
+If you would like to still use Lighthouse to submit the message, you will need to run a beacon node and an execution client. For the beacon node, you can use checkpoint sync to sync the chain in under a minute. On the other hand, the execution client can still be syncing and _need not be synced_. This implies that it is possible to broadcast a voluntary exit message within a short time by quickly spinning up a node.

### Does increasing the number of validators increase the CPU and other computer resources used?
-A computer with hardware specifications stated in the [Recommended System Requirements](./installation.md#recommended-system-requirements) can run hundreds validators with only marginal increase in CPU usage. +A computer with hardware specifications stated in the [Recommended System Requirements](./installation.md#recommended-system-requirements) can run hundreds validators with only marginal increase in CPU usage. ### I want to add new validators. Do I have to reimport the existing keys? @@ -360,8 +365,7 @@ Generally yes. If you do not want to stop `lighthouse vc`, you can use the [key manager API](./api-vc-endpoints.md) to import keys. - -### How can I delete my validator once it is imported? +### How can I delete my validator once it is imported? Lighthouse supports the [KeyManager API](https://ethereum.github.io/keymanager-APIs/#/Local%20Key%20Manager/deleteKeys) to delete validators and remove them from the `validator_definitions.yml` file. To do so, start the validator client with the flag `--http` and call the API. @@ -371,7 +375,7 @@ If you are looking to delete the validators in one node and import it to another ### I have a low peer count and it is not increasing -If you cannot find *ANY* peers at all, it is likely that you have incorrect +If you cannot find _ANY_ peers at all, it is likely that you have incorrect network configuration settings. Ensure that the network you wish to connect to is correct (the beacon node outputs the network it is connecting to in the initial boot-up log lines). On top of this, ensure that you are not using the @@ -385,26 +389,25 @@ expect, there are a few things to check on: 1. Ensure that port forward was correctly set up as described [here](./advanced_networking.md#nat-traversal-port-forwarding). -To check that the ports are forwarded, run the command: - - ```bash - curl http://localhost:5052/lighthouse/nat - ``` + To check that the ports are forwarded, run the command: -It should return `{"data":true}`. 
If it returns `{"data":false}`, you may want to double check if the port forward was correctly set up. + ```bash + curl http://localhost:5052/lighthouse/nat + ``` -If the ports are open, you should have incoming peers. To check that you have incoming peers, run the command: + It should return `{"data":true}`. If it returns `{"data":false}`, you may want to double check if the port forward was correctly set up. - ```bash - curl localhost:5052/lighthouse/peers | jq '.[] | select(.peer_info.connection_direction=="Incoming")' - ``` + If the ports are open, you should have incoming peers. To check that you have incoming peers, run the command: -If you have incoming peers, it should return a lot of data containing information of peers. If the response is empty, it means that you have no incoming peers and there the ports are not open. You may want to double check if the port forward was correctly set up. + ```bash + curl localhost:5052/lighthouse/peers | jq '.[] | select(.peer_info.connection_direction=="Incoming")' + ``` -2. Check that you do not lower the number of peers using the flag `--target-peers`. The default is 100. A lower value set will lower the maximum number of peers your node can connect to, which may potentially interrupt the validator performance. We recommend users to leave the `--target peers` untouched to keep a diverse set of peers. + If you have incoming peers, it should return a lot of data containing information of peers. If the response is empty, it means that you have no incoming peers and there the ports are not open. You may want to double check if the port forward was correctly set up. -3. Ensure that you have a quality router for the internet connection. For example, if you connect the router to many devices including the node, it may be possible that the router cannot handle all routing tasks, hence struggling to keep up the number of peers. Therefore, using a quality router for the node is important to keep a healthy number of peers. +1. 
Check that you do not lower the number of peers using the flag `--target-peers`. The default is 100. Setting a lower value will reduce the maximum number of peers your node can connect to, which may degrade validator performance. We recommend leaving `--target-peers` untouched to keep a diverse set of peers.
+1. Ensure that you have a quality router for the internet connection. For example, if you connect the router to many devices including the node, the router may not be able to handle all the routing tasks, and will struggle to maintain the number of peers. Therefore, using a quality router for the node is important to keep a healthy number of peers.

### How do I update Lighthouse?

@@ -415,7 +418,7 @@
If you are updating by rebuilding from source, see [here.](./installation-source.md)

If you are running the docker image provided by Sigma Prime on Dockerhub, you can update to specific versions, for example:

```bash
-$ docker pull sigp/lighthouse:v1.0.0
+docker pull sigp/lighthouse:v1.0.0
```

If you are building a docker image, the process will be similar to the one described [here.](./docker.md#building-the-docker-image)

@@ -461,7 +464,7 @@
Monitoring](./validator-monitoring.md) for more information. Lighthouse has also

The setting on the beacon node is the same for both cases below. In the beacon node, specify `lighthouse bn --http-address local_IP` so that the beacon node is listening on the local network rather than `localhost`. You can find the `local_IP` by running the command `hostname -I | awk '{print $1}'` on the server running the beacon node.

-1. If the beacon node and validator clients are on different servers *in the same network*, the setting in the validator client is as follows:
+1. If the beacon node and validator clients are on different servers _in the same network_, the setting in the validator client is as follows:

Use the flag `--beacon-nodes` to point to the beacon node.
For example, `lighthouse vc --beacon-nodes http://local_IP:5052` where `local_IP` is the local IP address of the beacon node and `5052` is the default `http-port` of the beacon node. @@ -475,34 +478,33 @@ The setting on the beacon node is the same for both cases below. In the beacon n You can refer to [Redundancy](./redundancy.md) for more information. -2. If the beacon node and validator clients are on different servers *and different networks*, it is necessary to perform port forwarding of the SSH port (e.g., the default port 22) on the router, and also allow firewall on the SSH port. The connection can be established via port forwarding on the router. - - +2. If the beacon node and validator clients are on different servers _and different networks_, it is necessary to perform port forwarding of the SSH port (e.g., the default port 22) on the router, and also allow firewall on the SSH port. The connection can be established via port forwarding on the router. - In the validator client, use the flag `--beacon-nodes` to point to the beacon node. However, since the beacon node and the validator client are on different networks, the IP address to use is the public IP address of the beacon node, i.e., `lighthouse vc --beacon-nodes http://public_IP:5052`. You can get the public IP address of the beacon node by running the command ` dig +short myip.opendns.com @resolver1.opendns.com` on the server running the beacon node. + In the validator client, use the flag `--beacon-nodes` to point to the beacon node. However, since the beacon node and the validator client are on different networks, the IP address to use is the public IP address of the beacon node, i.e., `lighthouse vc --beacon-nodes http://public_IP:5052`. You can get the public IP address of the beacon node by running the command `dig +short myip.opendns.com @resolver1.opendns.com` on the server running the beacon node. 
Additionally, port forwarding of port 5052 on the router connected to the beacon node is required for the vc to connect to the bn. To do port forwarding, refer to [how to open ports](./advanced_networking.md#how-to-open-ports). - If you have firewall setup, e.g., `ufw`, you will need to allow connections to port 5052 (assuming that the default port is used). Since the beacon node HTTP/HTTPS API is public-facing (i.e., the 5052 port is now exposed to the internet due to port forwarding), we strongly recommend users to apply IP-address filtering to the BN/VC connection from malicious actors. This can be done using the command: - ``` + ```bash sudo ufw allow from vc_IP_address proto tcp to any port 5052 ``` - where `vc_IP_address` is the public IP address of the validator client. The command will only allow connections to the beacon node from the validator client IP address to prevent malicious attacks on the beacon node over the internet. + where `vc_IP_address` is the public IP address of the validator client. The command will only allow connections to the beacon node from the validator client IP address to prevent malicious attacks on the beacon node over the internet. It is also worth noting that the `--beacon-nodes` flag can also be used for redundancy of beacon nodes. For example, let's say you have a beacon node and a validator client running on the same host, and a second beacon node on another server as a backup. In this case, you can use `lighthouse vc --beacon-nodes http://localhost:5052, http://IP-address:5052` on the validator client. ### Should I do anything to the beacon node or validator client settings if I have a relocation of the node / change of IP address? + No. Lighthouse will auto-detect the change and update your Ethereum Node Record (ENR). You just need to make sure you are not manually setting the ENR with `--enr-address` (which, for common use cases, this flag is not used). ### How to change the TCP/UDP port 9000 that Lighthouse listens on? 
+ Use the flag `--port ` in the beacon node. This flag can be useful when you are running two beacon nodes at the same time. You can leave one beacon node as the default port 9000, and configure the second beacon node to listen on, e.g., `--port 9100`. Since V4.5.0, Lighthouse supports QUIC and by default will use the value of `--port` + 1 to listen via UDP (default `9001`). This can be configured by using the flag `--quic-port`. Refer to [Advanced Networking](./advanced_networking.md#nat-traversal-port-forwarding) for more information. -### Lighthouse `v4.3.0` introduces a change where a node will subscribe to only 2 subnets in total. I am worried that this will impact my validators return. +### Lighthouse `v4.3.0` introduces a change where a node will subscribe to only 2 subnets in total. I am worried that this will impact my validators return Previously, having more validators means subscribing to more subnets. Since the change, a node will now only subscribe to 2 subnets in total. This will bring about significant reductions in bandwidth for nodes with multiple validators. @@ -520,11 +522,12 @@ With `--metrics` enabled in the beacon node, you can find the number of peers co A response example is: -``` +```text # HELP libp2p_quic_peers Count of libp2p peers currently connected via QUIC # TYPE libp2p_quic_peers gauge libp2p_quic_peers 4 ``` + which shows that there are 4 peers connected via QUIC. ## Miscellaneous @@ -552,19 +555,22 @@ Specs: mainnet (true), minimal (false), gnosis (true) If you download the binary file, navigate to the location of the directory, for example, the binary file is in `/usr/local/bin`, run `/usr/local/bin/lighthouse --version`, the example of output is the same as above. 
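If a script only needs the bare version number, the first line of that output can be parsed. A sketch, assuming the `Lighthouse vX.Y.Z-<commit>` format shown above; the sample line is hard-coded here for illustration:

```bash
# Extract the semantic version from the first line of `lighthouse --version`.
# In practice you would capture the live output instead:
#   version_line=$(lighthouse --version | head -n 1)
version_line="Lighthouse v4.1.0-693886b"
version=$(printf '%s\n' "$version_line" | sed -n 's/^Lighthouse v\([0-9.]*\).*/\1/p')
echo "$version"   # prints 4.1.0
```

This can be handy, e.g., for comparing the installed version against the latest release in an update script.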
Alternatively, if you have Lighthouse running, on the same computer, you can run: + ```bash curl "http://127.0.0.1:5052/eth/v1/node/version" ``` Example of output: + ```bash {"data":{"version":"Lighthouse/v4.1.0-693886b/x86_64-linux"}} ``` + which says that the version is v4.1.0. ### Does Lighthouse have pruning function like the execution client to save disk space? -Yes, Lighthouse supports [state pruning](./database-migrations.md#how-to-prune-historic-states) which can help to save disk space. +Yes, Lighthouse supports [state pruning](./database-migrations.md#how-to-prune-historic-states) which can help to save disk space. ### Can I use a HDD for the freezer database and only have the hot db on SSD? @@ -574,20 +580,6 @@ Yes, you can do so by using the flag `--freezer-dir /path/to/freezer_db` in the The reason why Lighthouse logs in UTC is due to the dependency on an upstream library that is [yet to be resolved](https://github.com/sigp/lighthouse/issues/3130). Alternatively, using the flag `disable-log-timestamp` in combination with systemd will suppress the UTC timestamps and print the logs in local timestamps. -### My hard disk is full and my validator is down. What should I do? +### My hard disk is full and my validator is down. What should I do? A quick way to get the validator back online is by removing the Lighthouse beacon node database and resync Lighthouse using checkpoint sync. A guide to do this can be found in the [Lighthouse Discord server](https://discord.com/channels/605577013327167508/605577013331361793/1019755522985050142). With some free space left, you will then be able to prune the execution client database to free up more space. - - - - - - - - - - - - - - diff --git a/book/src/graffiti.md b/book/src/graffiti.md index 302f8f9679..ba9c7d05d7 100644 --- a/book/src/graffiti.md +++ b/book/src/graffiti.md @@ -2,14 +2,16 @@ Lighthouse provides four options for setting validator graffiti. -### 1. 
Using the "--graffiti-file" flag on the validator client
+## 1. Using the "--graffiti-file" flag on the validator client
+
Users can specify a file with the `--graffiti-file` flag. This option is useful for dynamically changing graffitis for various use cases (e.g. drawing on the beaconcha.in graffiti wall). This file is loaded once on startup and reloaded every time a validator is chosen to propose a block.

Usage: `lighthouse vc --graffiti-file graffiti_file.txt`

The file should contain key-value pairs corresponding to validator public keys and their associated graffiti. The file can also contain a `default` key for the default case.
-
+
+```text
default: default_graffiti
public_key1: graffiti1
public_key2: graffiti2
@@ -18,7 +20,7 @@

Below is an example of a graffiti file:

-```
+```text
default: Lighthouse
0x87a580d31d7bc69069b55f5a01995a610dd391a26dc9e36e81057a17211983a79266800ab8531f21f1083d7d84085007: mr f was here
0xa5566f9ec3c6e1fdf362634ebec9ef7aceb0e460e5079714808388e5d48f4ae1e12897fed1bea951c17fa389d511e477: mr v was here
@@ -26,13 +28,15 @@

Lighthouse will first search for the graffiti corresponding to the public key of the proposing validator. If there are no matches for the public key, it uses the graffiti corresponding to the `default` key, if present.

-### 2. Setting the graffiti in the `validator_definitions.yml`
+## 2. Setting the graffiti in the `validator_definitions.yml`
+
Users can set validator-specific graffitis in `validator_definitions.yml` with the `graffiti` key. This option is recommended for static setups where the graffitis won't change on every new block proposal.

You can also update the graffitis in the `validator_definitions.yml` file using the [Lighthouse API](api-vc-endpoints.html#patch-lighthousevalidatorsvoting_pubkey). See example in [Set Graffiti via HTTP](#set-graffiti-via-http).
+You can also update the graffitis in the `validator_definitions.yml` file using the [Lighthouse API](api-vc-endpoints.html#patch-lighthousevalidatorsvoting_pubkey). See example in [Set Graffiti via HTTP](#set-graffiti-via-http). Below is an example of the validator_definitions.yml with validator specific graffitis: -``` + +```text --- - enabled: true voting_public_key: "0x87a580d31d7bc69069b55f5a01995a610dd391a26dc9e36e81057a17211983a79266800ab8531f21f1083d7d84085007" @@ -48,32 +52,35 @@ Below is an example of the validator_definitions.yml with validator specific gra graffiti: "somethingprofound" ``` -### 3. Using the "--graffiti" flag on the validator client +## 3. Using the "--graffiti" flag on the validator client + Users can specify a common graffiti for all their validators using the `--graffiti` flag on the validator client. Usage: `lighthouse vc --graffiti example` -### 4. Using the "--graffiti" flag on the beacon node +## 4. Using the "--graffiti" flag on the beacon node + Users can also specify a common graffiti using the `--graffiti` flag on the beacon node as a common graffiti for all validators. Usage: `lighthouse bn --graffiti fortytwo` > Note: The order of preference for loading the graffiti is as follows: +> > 1. Read from `--graffiti-file` if provided. -> 2. If `--graffiti-file` is not provided or errors, read graffiti from `validator_definitions.yml`. -> 3. If graffiti is not specified in `validator_definitions.yml`, load the graffiti passed in the `--graffiti` flag on the validator client. -> 4. If the `--graffiti` flag on the validator client is not passed, load the graffiti passed in the `--graffiti` flag on the beacon node. -> 4. If the `--graffiti` flag is not passed, load the default Lighthouse graffiti. +> 1. If `--graffiti-file` is not provided or errors, read graffiti from `validator_definitions.yml`. +> 1. 
If graffiti is not specified in `validator_definitions.yml`, load the graffiti passed in the `--graffiti` flag on the validator client. +> 1. If the `--graffiti` flag on the validator client is not passed, load the graffiti passed in the `--graffiti` flag on the beacon node. +> 1. If the `--graffiti` flag is not passed, load the default Lighthouse graffiti. -### Set Graffiti via HTTP +## Set Graffiti via HTTP Use the [Lighthouse API](api-vc-endpoints.md) to set graffiti on a per-validator basis. This method updates the graffiti -both in memory and in the `validator_definitions.yml` file. The new graffiti will be used in the next block proposal +both in memory and in the `validator_definitions.yml` file. The new graffiti will be used in the next block proposal without requiring a validator client restart. Refer to [Lighthouse API](api-vc-endpoints.html#patch-lighthousevalidatorsvoting_pubkey) for API specification. -#### Example Command +### Example Command ```bash DATADIR=/var/lib/lighthouse @@ -85,4 +92,4 @@ curl -X PATCH "http://localhost:5062/lighthouse/validators/0xb0148e6348264131bf4 }' | jq ``` -A `null` response indicates that the request is successful. \ No newline at end of file +A `null` response indicates that the request is successful. diff --git a/book/src/help_bn.md b/book/src/help_bn.md index efdc7114b7..e77db3df54 100644 --- a/book/src/help_bn.md +++ b/book/src/help_bn.md @@ -486,4 +486,5 @@ OPTIONS: block root should be 0x-prefixed. Note that this flag is for verification only, to perform a checkpoint sync from a recent state use --checkpoint-sync-url. ``` + diff --git a/book/src/help_general.md b/book/src/help_general.md index 551f93e2bf..e7e323f330 100644 --- a/book/src/help_general.md +++ b/book/src/help_general.md @@ -105,4 +105,5 @@ SUBCOMMANDS: validator_manager Utilities for managing a Lighthouse validator client via the HTTP API. 
[aliases: vm, validator-manager, validator_manager] ``` + diff --git a/book/src/help_vc.md b/book/src/help_vc.md index 1b7e7f2b0a..4fd35b1ea2 100644 --- a/book/src/help_vc.md +++ b/book/src/help_vc.md @@ -223,4 +223,5 @@ OPTIONS: --web3-signer-max-idle-connections Maximum number of idle connections to maintain per web3signer host. Default is unlimited. ``` + diff --git a/book/src/help_vm.md b/book/src/help_vm.md index db01164a92..85dcdd3c0b 100644 --- a/book/src/help_vm.md +++ b/book/src/help_vm.md @@ -95,4 +95,5 @@ SUBCOMMANDS: which can be generated using the "create-validators" command. This command only supports validators signing via a keystore on the local file system (i.e., not Web3Signer validators). ``` + diff --git a/book/src/help_vm_create.md b/book/src/help_vm_create.md index 2fa54265ab..1b43d0f988 100644 --- a/book/src/help_vm_create.md +++ b/book/src/help_vm_create.md @@ -135,4 +135,5 @@ OPTIONS: Path to directory containing eth2_testnet specs. Defaults to a hard-coded Lighthouse testnet. Only effective if there is no existing database. ``` + diff --git a/book/src/help_vm_import.md b/book/src/help_vm_import.md index e6ff351dac..e8eb4946aa 100644 --- a/book/src/help_vm_import.md +++ b/book/src/help_vm_import.md @@ -99,4 +99,5 @@ OPTIONS: A HTTP(S) address of a validator client using the keymanager-API. If this value is not supplied then a 'dry run' will be conducted where no changes are made to the validator client. [default: http://localhost:5062] ``` + diff --git a/book/src/help_vm_move.md b/book/src/help_vm_move.md index fe1d4c5ae9..95c6c8e00e 100644 --- a/book/src/help_vm_move.md +++ b/book/src/help_vm_move.md @@ -116,4 +116,5 @@ OPTIONS: --validators The validators to be moved. Either a list of 0x-prefixed validator pubkeys or the keyword "all". 
``` + diff --git a/book/src/homebrew.md b/book/src/homebrew.md index 486de371f8..da92dcb26c 100644 --- a/book/src/homebrew.md +++ b/book/src/homebrew.md @@ -5,7 +5,7 @@ Lighthouse is available on Linux and macOS via the [Homebrew package manager](ht Please note that this installation method is maintained by the Homebrew community. It is not officially supported by the Lighthouse team. -### Installation +## Installation Install the latest version of the [`lighthouse`][formula] formula with: @@ -13,7 +13,7 @@ Install the latest version of the [`lighthouse`][formula] formula with: brew install lighthouse ``` -### Usage +## Usage If Homebrew is installed to your `PATH` (default), simply run: @@ -27,7 +27,7 @@ Alternatively, you can find the `lighthouse` binary at: "$(brew --prefix)/bin/lighthouse" --help ``` -### Maintenance +## Maintenance The [formula][] is kept up-to-date by the Homebrew community and a bot that lists for new releases. diff --git a/book/src/installation-binaries.md b/book/src/installation-binaries.md index 30bf03e14e..580b5c19d4 100644 --- a/book/src/installation-binaries.md +++ b/book/src/installation-binaries.md @@ -30,16 +30,16 @@ a `x86_64` binary. 1. Go to the [Releases](https://github.com/sigp/lighthouse/releases) page and select the latest release. 1. Download the `lighthouse-${VERSION}-x86_64-unknown-linux-gnu.tar.gz` binary. For example, to obtain the binary file for v4.0.1 (the latest version at the time of writing), a user can run the following commands in a linux terminal: + ```bash cd ~ curl -LO https://github.com/sigp/lighthouse/releases/download/v4.0.1/lighthouse-v4.0.1-x86_64-unknown-linux-gnu.tar.gz tar -xvf lighthouse-v4.0.1-x86_64-unknown-linux-gnu.tar.gz ``` + 1. Test the binary with `./lighthouse --version` (it should print the version). 1. (Optional) Move the `lighthouse` binary to a location in your `PATH`, so the `lighthouse` command can be called from anywhere. 
For example, to copy `lighthouse` from the current directory to `usr/bin`, run `sudo cp lighthouse /usr/bin`. - - > Windows users will need to execute the commands in Step 2 from PowerShell. ## Portability @@ -49,10 +49,10 @@ sacrifice the ability to make use of modern CPU instructions. If you have a modern CPU then you should try running a non-portable build to get a 20-30% speed up. -* For **x86_64**, any CPU supporting the [ADX](https://en.wikipedia.org/wiki/Intel_ADX) instruction set +- For **x86_64**, any CPU supporting the [ADX](https://en.wikipedia.org/wiki/Intel_ADX) instruction set extension is compatible with the optimized build. This includes Intel Broadwell (2014) and newer, and AMD Ryzen (2017) and newer. -* For **ARMv8**, most CPUs are compatible with the optimized build, including the Cortex-A72 used by +- For **ARMv8**, most CPUs are compatible with the optimized build, including the Cortex-A72 used by the Raspberry Pi 4. ## Troubleshooting diff --git a/book/src/installation-source.md b/book/src/installation-source.md index c2f5861576..be03a189de 100644 --- a/book/src/installation-source.md +++ b/book/src/installation-source.md @@ -23,7 +23,7 @@ The rustup installer provides an easy way to update the Rust compiler, and works With Rust installed, follow the instructions below to install dependencies relevant to your operating system. -#### Ubuntu +### Ubuntu Install the following packages: @@ -42,7 +42,7 @@ sudo apt update && sudo apt install -y git gcc g++ make cmake pkg-config llvm-de After this, you are ready to [build Lighthouse](#build-lighthouse). -#### Fedora/RHEL/CentOS +### Fedora/RHEL/CentOS Install the following packages: @@ -52,7 +52,7 @@ yum -y install git make perl clang cmake After this, you are ready to [build Lighthouse](#build-lighthouse). -#### macOS +### macOS 1. Install the [Homebrew][] package manager. 1. Install CMake using Homebrew: @@ -61,21 +61,22 @@ After this, you are ready to [build Lighthouse](#build-lighthouse). 
brew install cmake ``` - [Homebrew]: https://brew.sh/ After this, you are ready to [build Lighthouse](#build-lighthouse). -#### Windows +### Windows 1. Install [Git](https://git-scm.com/download/win). 1. Install the [Chocolatey](https://chocolatey.org/install) package manager for Windows. > Tips: > - Use PowerShell to install. In Windows, search for PowerShell and run as administrator. > - You must ensure `Get-ExecutionPolicy` is not Restricted. To test this, run `Get-ExecutionPolicy` in PowerShell. If it returns `restricted`, then run `Set-ExecutionPolicy AllSigned`, and then run + ```bash Set-ExecutionPolicy Bypass -Scope Process -Force; [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.ServicePointManager]::SecurityProtocol -bor 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1')) ``` + > - To verify that Chocolatey is ready, run `choco` and it should return the version. 1. Install Make, CMake and LLVM using Chocolatey: @@ -158,14 +159,14 @@ FEATURES=gnosis,slasher-lmdb make Commonly used features include: -* `gnosis`: support for the Gnosis Beacon Chain. -* `portable`: support for legacy hardware. -* `modern`: support for exclusively modern hardware. -* `slasher-lmdb`: support for the LMDB slasher backend. Enabled by default. -* `slasher-mdbx`: support for the MDBX slasher backend. -* `jemalloc`: use [`jemalloc`][jemalloc] to allocate memory. Enabled by default on Linux and macOS. +- `gnosis`: support for the Gnosis Beacon Chain. +- `portable`: support for legacy hardware. +- `modern`: support for exclusively modern hardware. +- `slasher-lmdb`: support for the LMDB slasher backend. Enabled by default. +- `slasher-mdbx`: support for the MDBX slasher backend. +- `jemalloc`: use [`jemalloc`][jemalloc] to allocate memory. Enabled by default on Linux and macOS. Not supported on Windows. -* `spec-minimal`: support for the minimal preset (useful for testing). 
+- `spec-minimal`: support for the minimal preset (useful for testing). Default features (e.g. `slasher-lmdb`) may be opted out of using the `--no-default-features` argument for `cargo`, which can be plumbed in via the `CARGO_INSTALL_EXTRA_FLAGS` environment variable. @@ -184,9 +185,9 @@ You can customise the compiler settings used to compile Lighthouse via Lighthouse includes several profiles which can be selected via the `PROFILE` environment variable. -* `release`: default for source builds, enables most optimisations while not taking too long to +- `release`: default for source builds, enables most optimisations while not taking too long to compile. -* `maxperf`: default for binary releases, enables aggressive optimisations including full LTO. +- `maxperf`: default for binary releases, enables aggressive optimisations including full LTO. Although compiling with this profile improves some benchmarks by around 20% compared to `release`, it imposes a _significant_ cost at compile time and is only recommended if you have a fast CPU. diff --git a/book/src/installation.md b/book/src/installation.md index e8caf5c457..a0df394bd2 100644 --- a/book/src/installation.md +++ b/book/src/installation.md @@ -19,20 +19,17 @@ There are also community-maintained installation methods: - Arch Linux AUR packages: [source](https://aur.archlinux.org/packages/lighthouse-ethereum), [binary](https://aur.archlinux.org/packages/lighthouse-ethereum-bin). - - ## Recommended System Requirements -Before [The Merge](https://ethereum.org/en/roadmap/merge/), Lighthouse was able to run on its own with low to mid-range consumer hardware, but would perform best when provided with ample system resources. +Before [The Merge](https://ethereum.org/en/roadmap/merge/), Lighthouse was able to run on its own with low to mid-range consumer hardware, but would perform best when provided with ample system resources. 
After [The Merge](https://ethereum.org/en/roadmap/merge/) on 15th September 2022, it is necessary to run Lighthouse together with an execution client ([Nethermind](https://nethermind.io/), [Besu](https://www.hyperledger.org/use/besu), [Erigon](https://github.com/ledgerwatch/erigon), [Geth](https://geth.ethereum.org/)). The system requirements listed below are therefore for running a Lighthouse beacon node combined with an execution client, and a validator client with a modest number of validator keys (less than 100):

+- CPU: Quad-core AMD Ryzen, Intel Broadwell, ARMv8 or newer
+- Memory: 32 GB RAM*
+- Storage: 2 TB solid state drive
+- Network: 100 Mb/s download, 20 Mb/s upload broadband connection
-
-* CPU: Quad-core AMD Ryzen, Intel Broadwell, ARMv8 or newer
-* Memory: 32 GB RAM*
-* Storage: 2 TB solid state drive
-* Network: 100 Mb/s download, 20 Mb/s upload broadband connection
-
-> *Note: 16 GB RAM is becoming rather limited due to the increased resources required. 16 GB RAM would likely result in out of memory errors in the case of a spike in computing demand (e.g., caused by a bug) or during periods of non-finality of the beacon chain. Users with 16 GB RAM also have a limited choice when it comes to selecting an execution client, which does not help with the [client diversity](https://clientdiversity.org/). We therefore recommend users to have at least 32 GB RAM for long term health of the node, while also giving users the flexibility to change client should the thought arise.
+> *Note: 16 GB RAM is becoming rather limited due to the increased resources required. 16 GB RAM would likely result in out of memory errors in the case of a spike in computing demand (e.g., caused by a bug) or during periods of non-finality of the beacon chain. Users with 16 GB RAM also have a limited choice when it comes to selecting an execution client, which does not help with the [client diversity](https://clientdiversity.org/).
We therefore recommend at least 32 GB RAM for the long-term health of the node; it also gives users the flexibility to change clients if they wish.

Last update: April 2023

diff --git a/book/src/intro.md b/book/src/intro.md
index ef16913d68..9892a8a49d 100644
--- a/book/src/intro.md
+++ b/book/src/intro.md
@@ -24,7 +24,6 @@ You may read this book from start to finish, or jump to some of these topics:

- Utilize the whole stack by starting a [local testnet](./setup.md#local-testnets).
- Query the [RESTful HTTP API](./api.md) using `curl`.
-
Prospective contributors can read the [Contributing](./contributing.md) section to understand how we develop and test Lighthouse.
diff --git a/book/src/key-management.md b/book/src/key-management.md
index b2bb7737fd..007ccf6977 100644
--- a/book/src/key-management.md
+++ b/book/src/key-management.md
@@ -40,29 +40,32 @@ to secure BTC, ETH and many other coins.

We define some terms in the context of validator key management:

- **Mnemonic**: a string of 24 words that is designed to be easy to write down
- and remember. E.g., _"radar fly lottery mirror fat icon bachelor sadness
- type exhaust mule six beef arrest you spirit clog mango snap fox citizen
- already bird erase"_.
- - Defined in BIP-39
+ and remember. E.g., _"radar fly lottery mirror fat icon bachelor sadness
+ type exhaust mule six beef arrest you spirit clog mango snap fox citizen
+ already bird erase"_.
+ - Defined in BIP-39
- **Wallet**: a wallet is a JSON file which stores an
- encrypted version of a mnemonic.
- - Defined in EIP-2386
+ encrypted version of a mnemonic.
+ - Defined in EIP-2386
- **Keystore**: typically created by a wallet, it contains a single encrypted BLS
- keypair.
- - Defined in EIP-2335.
+ keypair.
+ - Defined in EIP-2335.
- **Voting Keypair**: a BLS public and private keypair which is used for
- signing blocks, attestations and other messages on regular intervals in the beacon chain.
+ signing blocks, attestations and other messages on regular intervals in the beacon chain. - **Withdrawal Keypair**: a BLS public and private keypair which will be - required _after_ Phase 0 to manage ETH once a validator has exited. + required _after_ Phase 0 to manage ETH once a validator has exited. ## Create a validator + There are 2 steps involved to create a validator key using Lighthouse: + 1. [Create a wallet](#step-1-create-a-wallet-and-record-the-mnemonic) 1. [Create a validator](#step-2-create-a-validator) The following example demonstrates how to create a single validator key. ### Step 1: Create a wallet and record the mnemonic + A wallet allows for generating practically unlimited validators from an easy-to-remember 24-word string (a mnemonic). As long as that mnemonic is backed up, all validator keys can be trivially re-generated. @@ -78,12 +81,14 @@ to `./wallet.pass`: ```bash lighthouse --network goerli account wallet create --name wally --password-file wally.pass ``` + Using the above command, a wallet will be created in `~/.lighthouse/goerli/wallets` with the name `wally`. It is encrypted using the password defined in the -`wally.pass` file. +`wally.pass` file. During the wallet creation process, a 24-word mnemonic will be displayed. Record the mnemonic because it allows you to recreate the files in the case of data loss. > Notes: +> > - When navigating to the directory `~/.lighthouse/goerli/wallets`, one will not see the wallet name `wally`, but a hexadecimal folder containing the wallet file. However, when interacting with `lighthouse` in the CLI, the name `wally` will be used. > - The password is not `wally.pass`, it is the _content_ of the > `wally.pass` file. @@ -91,22 +96,23 @@ During the wallet creation process, a 24-word mnemonic will be displayed. Record > of that file. ### Step 2: Create a validator + Validators are fundamentally represented by a BLS keypair. In Lighthouse, we use a wallet to generate these keypairs. 
Once a wallet exists, the `lighthouse account validator create` command can be used to generate the BLS keypair and all necessary information to submit a validator deposit. With the `wally` wallet created in [Step 1](#step-1-create-a-wallet-and-record-the-mnemonic), we can create a validator with the command: ```bash lighthouse --network goerli account validator create --wallet-name wally --wallet-password wally.pass --count 1 ``` + This command will: - Derive a single new BLS keypair from wallet `wally` in `~/.lighthouse/goerli/wallets`, updating it so that it generates a new key next time. - Create a new directory `~/.lighthouse/goerli/validators` containing: - - An encrypted keystore file `voting-keystore.json` containing the validator's voting keypair. - - An `eth1_deposit_data.rlp` assuming the default deposit amount (`32 ETH`) which can be submitted to the deposit - contract for the Goerli testnet. Other networks can be set via the - `--network` parameter. + - An encrypted keystore file `voting-keystore.json` containing the validator's voting keypair. + - An `eth1_deposit_data.rlp` assuming the default deposit amount (`32 ETH`) which can be submitted to the deposit + contract for the Goerli testnet. Other networks can be set via the + `--network` parameter. - Create a new directory `~/.lighthouse/goerli/secrets` which stores a password to the validator's voting keypair. - If you want to create another validator in the future, repeat [Step 2](#step-2-create-a-validator). The wallet keeps track of how many validators it has generated and ensures that a new validator is generated each time. The important thing is to keep the 24-word mnemonic safe so that it can be used to generate new validator keys if needed. 
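As a rough sketch of the data-directory layout the commands above produce (illustration only: it uses a temporary directory in place of `~/.lighthouse`, since the real `wallets`, `validators` and `secrets` directories are created by `lighthouse` itself):

```shell
# Illustration only: mock the ~/.lighthouse/{network} layout in a temp dir
# rather than touch a real Lighthouse data directory.
base="$(mktemp -d)/goerli"
mkdir -p "$base/wallets" "$base/validators" "$base/secrets"
# After `account wallet create` and `account validator create`, all three exist:
for d in wallets validators secrets; do
  [ -d "$base/$d" ] && echo "$d: present"
done
```

On a real node the same three directories appear under `~/.lighthouse/goerli` (or whichever `--network` you passed).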
## Detail @@ -116,16 +122,16 @@ If you want to create another validator in the future, repeat [Step 2](#step-2-c There are three important directories in Lighthouse validator key management: - `wallets/`: contains encrypted wallets which are used for hierarchical - key derivation. - - Defaults to `~/.lighthouse/{network}/wallets` + key derivation. + - Defaults to `~/.lighthouse/{network}/wallets` - `validators/`: contains a directory for each validator containing - encrypted keystores and other validator-specific data. - - Defaults to `~/.lighthouse/{network}/validators` + encrypted keystores and other validator-specific data. + - Defaults to `~/.lighthouse/{network}/validators` - `secrets/`: since the validator signing keys are "hot", the validator process - needs access to the passwords to decrypt the keystores in the validators - directory. These passwords are stored here. - - Defaults to `~/.lighthouse/{network}/secrets` - + needs access to the passwords to decrypt the keystores in the validators + directory. These passwords are stored here. + - Defaults to `~/.lighthouse/{network}/secrets` + where `{network}` is the name of the network passed in the `--network` parameter. When the validator client boots, it searches the `validators/` for directories diff --git a/book/src/key-recovery.md b/book/src/key-recovery.md index a996e95cbc..a0593ddd94 100644 --- a/book/src/key-recovery.md +++ b/book/src/key-recovery.md @@ -1,6 +1,5 @@ # Key Recovery - Generally, validator keystore files are generated alongside a *mnemonic*. If the keystore and/or the keystore password are lost, this mnemonic can regenerate a new, equivalent keystore with a new password. @@ -8,9 +7,9 @@ regenerate a new, equivalent keystore with a new password. There are two ways to recover keys using the `lighthouse` CLI: - `lighthouse account validator recover`: recover one or more EIP-2335 keystores from a mnemonic. - These keys can be used directly in a validator client. 
+ These keys can be used directly in a validator client.
- `lighthouse account wallet recover`: recover an EIP-2386 wallet from a
- mnemonic.
+ mnemonic.

## ⚠️ Warning

@@ -18,10 +17,10 @@ There are two ways to recover keys using the `lighthouse` CLI:
resort.** Key recovery entails significant risks:

- Exposing your mnemonic to a computer at any time puts it at risk of being
- compromised. Your mnemonic is **not encrypted** and is a target for theft.
+ compromised. Your mnemonic is **not encrypted** and is a target for theft.
- It's entirely possible to regenerate a validator keypair that is already active
- on some other validator client. Running the same keypairs on two different
- validator clients is very likely to result in slashing.
+ on some other validator client. Running the same keypairs on two different
+ validator clients is very likely to result in slashing.

## Recover EIP-2335 validator keystores

@@ -32,7 +31,6 @@ index on the same mnemonic always results in the same validator keypair being
generated (see [EIP-2334](https://eips.ethereum.org/EIPS/eip-2334) for more
detail).
-
Using the `lighthouse account validator recover` command you can generate the
keystores that correspond to one or more indices in the mnemonic:

@@ -41,7 +39,6 @@ keystores that correspond to one or more indices in the mnemonic:
- `lighthouse account validator recover --first-index 1`: recover only index `1`.
- `lighthouse account validator recover --first-index 1 --count 2`: recover indices `1, 2`.
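Index-based recovery works because EIP-2334 maps each validator index to a fixed derivation path, so the same mnemonic and index always yield the same keypair. A small sketch of the mapping (the path scheme `m/12381/3600/i/0/0` for signing keys is from EIP-2334; the loop itself is just illustrative):

```shell
# EIP-2334 signing (voting) key paths: index i derives at m/12381/3600/i/0/0,
# which is why recovering index 1 always regenerates the same keypair.
derived=$(for i in 0 1 2; do
  echo "index $i -> m/12381/3600/$i/0/0"
done)
echo "$derived"
```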
- For each of the indices recovered in the above commands, a directory will be created in the `--validator-dir` location (default `~/.lighthouse/{network}/validators`) which contains all the information necessary to run a validator using the diff --git a/book/src/lighthouse-ui.md b/book/src/lighthouse-ui.md index 81098715f3..106a5e8947 100644 --- a/book/src/lighthouse-ui.md +++ b/book/src/lighthouse-ui.md @@ -23,7 +23,7 @@ information: - [Installation Guide](./ui-installation.md) - Information to install and run the Lighthouse UI. - [Configuration Guide](./ui-configuration.md) - Explanation of how to setup - and configure Siren. + and configure Siren. - [Authentication Guide](./ui-authentication.md) - Explanation of how Siren authentication works and protects validator actions. - [Usage](./ui-usage.md) - Details various Siren components. - [FAQs](./ui-faqs.md) - Frequently Asked Questions. diff --git a/book/src/mainnet-validator.md b/book/src/mainnet-validator.md index 942ca09b8e..c53be97ccf 100644 --- a/book/src/mainnet-validator.md +++ b/book/src/mainnet-validator.md @@ -1,7 +1,6 @@ # Become an Ethereum Consensus Mainnet Validator [launchpad]: https://launchpad.ethereum.org/ -[lh-book]: https://lighthouse-book.sigmaprime.io/ [advanced-datadir]: ./advanced-datadir.md [license]: https://github.com/sigp/lighthouse/blob/stable/LICENSE [slashing]: ./slashing-protection.md @@ -18,7 +17,6 @@ Being educated is critical to a validator's success. Before submitting your main - Reading through this documentation, especially the [Slashing Protection][slashing] section. - Performing a web search and doing your own research. - > > **Please note**: the Lighthouse team does not take any responsibility for losses or damages > occurred through the use of Lighthouse. We have an experienced internal security team and have @@ -27,7 +25,6 @@ Being educated is critical to a validator's success. 
Before submitting your main
> due to the actions of other actors on the consensus layer or software bugs. See the
> [software license][license] for more detail on liability.

-
## Become a validator

There are five primary steps to become a validator:
@@ -39,23 +36,24 @@ There are five primary steps to become a validator:
1. [Submit deposit](#step-5-submit-deposit-32eth-per-validator)

> **Important note**: The guide below contains both mainnet and testnet instructions. We highly recommend that *all* users **run a testnet validator** prior to staking mainnet ETH. By far, the best technical learning experience is to run a testnet validator. You can get hands-on experience with all the tools and it's a great way to test your staking
-hardware. 32 ETH is a significant outlay and joining a testnet is a great way to "try before you buy".
+hardware. 32 ETH is a significant outlay and joining a testnet is a great way to "try before you buy".
> **Never use real ETH to join a testnet!** Testnets such as the Holesky testnet use Holesky ETH, which is worthless. This allows experimentation without real-world costs.

### Step 1. Create validator keys

The Ethereum Foundation provides the [staking-deposit-cli](https://github.com/ethereum/staking-deposit-cli/releases) for creating validator keys. Download and run the `staking-deposit-cli` with the command:
+
```bash
./deposit new-mnemonic
```
+
and follow the instructions to generate the keys. When prompted for a network, select `mainnet` if you want to run a mainnet validator, or select `holesky` if you want to run a Holesky testnet validator. A new mnemonic will be generated in the process.

> **Important note:** A mnemonic (or seed phrase) is a 24-word string randomly generated in the process. It is highly recommended to write down the mnemonic and keep it safe offline. It is important to ensure that the mnemonic is never stored in any digital form (computers, mobile phones, etc.) connected to the internet.
Please also make one or more backups of the mnemonic to ensure your ETH is not lost in the case of data loss. It is very important to keep your mnemonic private as it represents the ultimate control of your ETH. Upon completing this step, the files `deposit_data-*.json` and `keystore-m_*.json` will be created. The keys that are generated from staking-deposit-cli can be easily loaded into a Lighthouse validator client (`lighthouse vc`) in [Step 3](#step-3-import-validator-keys-to-lighthouse). In fact, both of these programs are designed to work with each other. - > Lighthouse also supports creating validator keys, see [Key management](./key-management.md) for more info. ### Step 2. Start an execution client and Lighthouse beacon node @@ -64,15 +62,17 @@ Start an execution client and Lighthouse beacon node according to the [Run a Nod ### Step 3. Import validator keys to Lighthouse -In [Step 1](#step-1-create-validator-keys), the staking-deposit-cli will generate the validator keys into a `validator_keys` directory. Let's assume that +In [Step 1](#step-1-create-validator-keys), the staking-deposit-cli will generate the validator keys into a `validator_keys` directory. Let's assume that this directory is `$HOME/staking-deposit-cli/validator_keys`. Using the default `validators` directory in Lighthouse (`~/.lighthouse/mainnet/validators`), run the following command to import validator keys: Mainnet: + ```bash lighthouse --network mainnet account validator import --directory $HOME/staking-deposit-cli/validator_keys ``` Holesky testnet: + ```bash lighthouse --network holesky account validator import --directory $HOME/staking-deposit-cli/validator_keys ``` @@ -85,7 +85,6 @@ lighthouse --network holesky account validator import --directory $HOME/staking- > Docker users should use the command from the [Docker](#docker-users) documentation. 
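Before importing, it can help to confirm which files in `validator_keys` are actually keystores: only the `keystore-m_*.json` files hold encrypted voting keys, while `deposit_data-*.json` is used later for the deposit and is not imported. A sketch with mocked file names (the timestamps here are made up; real files come from staking-deposit-cli):

```shell
# Mock a validator_keys directory; real files are produced by staking-deposit-cli.
keys="$(mktemp -d)"
touch "$keys/keystore-m_12381_3600_0_0_0-1660000000.json" \
      "$keys/deposit_data-1660000000.json"
# Count only the keystore files that `account validator import` will pick up.
count=$(ls "$keys" | grep -c '^keystore-m_')
echo "keystores to import: $count"
```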
-
The user will be prompted for a password for each keystore discovered:

```
@@ -122,11 +121,10 @@ WARNING: DO NOT USE THE ORIGINAL KEYSTORES TO VALIDATE WITH ANOTHER CLIENT, OR Y
Once you see the above message, you have successfully imported the validator keys. You can now proceed to the next step to start the validator client.
-
### Step 4. Start Lighthouse validator client

After the keys are imported, the user can start performing their validator duties
-by starting the Lighthouse validator client `lighthouse vc`:
+by starting the Lighthouse validator client `lighthouse vc`:

Mainnet:

```bash
lighthouse vc --network mainnet --suggested-fee-recipient YourFeeRecipientAddress
```

Holesky testnet:
+
```bash
lighthouse vc --network holesky --suggested-fee-recipient YourFeeRecipientAddress
```

-The `validator client` manages validators using data obtained from the beacon node via a HTTP API. You are highly recommended to enter a fee-recipient by changing `YourFeeRecipientAddress` to an Ethereum address under your control.
+The `validator client` manages validators using data obtained from the beacon node via an HTTP API. We highly recommend setting a fee recipient by changing `YourFeeRecipientAddress` to an Ethereum address under your control.

When `lighthouse vc` starts, check that the validator public key appears as a `voting_pubkey` as shown below:
@@ -156,9 +155,9 @@ by the protocol.
After you have successfully run and synced the execution client, beacon node and validator client, you can now proceed to submit the deposit. Go to the mainnet [Staking launchpad](https://launchpad.ethereum.org/en/) (or [Holesky staking launchpad](https://holesky.launchpad.ethereum.org/en/) for testnet validator) and carefully go through the steps to become a validator. Once you are ready, you can submit the deposit by sending 32 ETH per validator to the deposit contract.
Upload the `deposit_data-*.json` file generated in [Step 1](#step-1-create-validator-keys) to the Staking launchpad. -> **Important note:** Double check that the deposit contract for mainnet is `0x00000000219ab540356cBB839Cbe05303d7705Fa` before you confirm the transaction. +> **Important note:** Double check that the deposit contract for mainnet is `0x00000000219ab540356cBB839Cbe05303d7705Fa` before you confirm the transaction. -Once the deposit transaction is confirmed, it will take a minimum of ~16 hours to a few days/weeks for the beacon chain to process and activate your validator, depending on the queue. Refer to our [FAQ - Why does it take so long for a validator to be activated](./faq.md#why-does-it-take-so-long-for-a-validator-to-be-activated) for more info. +Once the deposit transaction is confirmed, it will take a minimum of ~16 hours to a few days/weeks for the beacon chain to process and activate your validator, depending on the queue. Refer to our [FAQ - Why does it take so long for a validator to be activated](./faq.md#why-does-it-take-so-long-for-a-validator-to-be-activated) for more info. Once your validator is activated, the validator client will start to publish attestations each epoch: @@ -172,10 +171,11 @@ If you propose a block, the log will look like: Dec 03 08:49:36.225 INFO Successfully published block slot: 98, attestations: 2, deposits: 0, service: block ``` -Congratulations! Your validator is now performing its duties and you will receive rewards for securing the Ethereum network. +Congratulations! Your validator is now performing its duties and you will receive rewards for securing the Ethereum network. ### What is next? -After the validator is running and performing its duties, it is important to keep the validator online to continue accumulating rewards. However, there could be problems with the computer, the internet or other factors that cause the validator to be offline. 
For this, it is best to subscribe to notifications, e.g., via [beaconcha.in](https://beaconcha.in/) which will send notifications about missed attestations and/or proposals. You will be notified about the validator's offline status and will be able to react promptly. + +After the validator is running and performing its duties, it is important to keep the validator online to continue accumulating rewards. However, there could be problems with the computer, the internet or other factors that cause the validator to be offline. For this, it is best to subscribe to notifications, e.g., via [beaconcha.in](https://beaconcha.in/) which will send notifications about missed attestations and/or proposals. You will be notified about the validator's offline status and will be able to react promptly. The next important thing is to stay up to date with updates to Lighthouse and the execution client. Updates are released from time to time, typically once or twice a month. For Lighthouse updates, you can subscribe to notifications on [Github](https://github.com/sigp/lighthouse) by clicking on `Watch`. If you only want to receive notification on new releases, select `Custom`, then `Releases`. You could also join [Lighthouse Discord](https://discord.gg/cyAszAh) where we will make an announcement when there is a new release. @@ -202,9 +202,10 @@ Here we use two `-v` volumes to attach: - `~/.lighthouse` on the host to `/root/.lighthouse` in the Docker container. - The `validator_keys` directory in the present working directory of the host - to the `/root/validator_keys` directory of the Docker container. + to the `/root/validator_keys` directory of the Docker container. 
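The two volume mappings described above can be read as `host:container` pairs. A sketch that only prints the mappings (paths as in the text; nothing is actually mounted here):

```shell
# Host:container pairs for the two `docker run -v` flags described above.
mounts="$HOME/.lighthouse:/root/.lighthouse
$(pwd)/validator_keys:/root/validator_keys"
echo "$mounts"
```

Each line corresponds to one `-v` flag: the path before the colon lives on the host, the path after it is what Lighthouse sees inside the container.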
### Start Lighthouse beacon node and validator client + Those using Docker images can start the processes with: ```bash @@ -222,8 +223,5 @@ $ docker run \ lighthouse --network mainnet vc ``` - If you get stuck you can always reach out on our [Discord][discord] or [create an issue](https://github.com/sigp/lighthouse/issues/new). - - diff --git a/book/src/merge-migration.md b/book/src/merge-migration.md index a5769162b0..8653cc45d9 100644 --- a/book/src/merge-migration.md +++ b/book/src/merge-migration.md @@ -16,7 +16,7 @@ the merge: be made to your `lighthouse vc` configuration, and are covered on the [Suggested fee recipient](./suggested-fee-recipient.md) page. -Additionally, you _must_ update Lighthouse to v3.0.0 (or later), and must update your execution +Additionally, you *must* update Lighthouse to v3.0.0 (or later), and must update your execution engine to a merge-ready version. ## When? @@ -27,7 +27,7 @@ All networks (**Mainnet**, **Goerli (Prater)**, **Ropsten**, **Sepolia**, **Kiln | Network | Bellatrix | The Merge | Remark | |---------|-------------------------------|-------------------------------| -----------| -| Ropsten | 2nd June 2022 | 8th June 2022 | Deprecated | +| Ropsten | 2nd June 2022 | 8th June 2022 | Deprecated | | Sepolia | 20th June 2022 | 6th July 2022 | | | Goerli | 4th August 2022 | 10th August 2022 | Previously named `Prater`| | Mainnet | 6th September 2022| 15th September 2022| | @@ -55,7 +55,7 @@ has the authority to control the execution engine. > needing to pass a jwt secret file. The execution engine connection must be **exclusive**, i.e. you must have one execution node -per beacon node. The reason for this is that the beacon node _controls_ the execution node. Please +per beacon node. The reason for this is that the beacon node *controls* the execution node. Please see the [FAQ](#faq) for further information about why many:1 and 1:many configurations are not supported. 
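A common stumbling block with the engine API connection is the port. As a sketch (the ports are the conventional defaults discussed in this guide; check your execution client's own configuration):

```shell
# The execution node exposes two distinct HTTP endpoints - do not confuse them.
json_rpc="http://localhost:8545"   # user-facing Ethereum JSON-RPC (no engine API)
engine_api="http://localhost:8551" # authenticated engine API for the beacon node
echo "use $engine_api for --execution-endpoint, not $json_rpc"
```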
@@ -173,7 +173,7 @@ client to be able to connect to the beacon node. ### Can I use `http://localhost:8545` for the execution endpoint? Most execution nodes use port `8545` for the Ethereum JSON-RPC API. Unless custom configuration is -used, an execution node _will not_ provide the necessary engine API on port `8545`. You should +used, an execution node *will not* provide the necessary engine API on port `8545`. You should not attempt to use `http://localhost:8545` as your engine URL and should instead use `http://localhost:8551`. diff --git a/book/src/partial-withdrawal.md b/book/src/partial-withdrawal.md index e5a0a97c6c..26003e1f2f 100644 --- a/book/src/partial-withdrawal.md +++ b/book/src/partial-withdrawal.md @@ -2,12 +2,13 @@ After the [Capella](https://ethereum.org/en/history/#capella) upgrade on 12th April 2023: - - if a validator has a withdrawal credential type `0x00`, the rewards will continue to accumulate and will be locked in the beacon chain. - - if a validator has a withdrawal credential type `0x01`, any rewards above 32ETH will be periodically withdrawn to the withdrawal address. This is also known as the "validator sweep", i.e., once the "validator sweep" reaches your validator's index, your rewards will be withdrawn to the withdrawal address. At the time of writing, with 560,000+ validators on the Ethereum mainnet, you shall expect to receive the rewards approximately every 5 days. +- if a validator has a withdrawal credential type `0x00`, the rewards will continue to accumulate and will be locked in the beacon chain. +- if a validator has a withdrawal credential type `0x01`, any rewards above 32ETH will be periodically withdrawn to the withdrawal address. This is also known as the "validator sweep", i.e., once the "validator sweep" reaches your validator's index, your rewards will be withdrawn to the withdrawal address. 
At the time of writing, with 560,000+ validators on the Ethereum mainnet, you can expect to receive the rewards approximately every 5 days.
+
+## FAQ
-
### FAQ
1. How to know if I have the withdrawal credentials type `0x00` or `0x01`?
-
+
Refer [here](./voluntary-exit.md#1-how-to-know-if-i-have-the-withdrawal-credentials-type-0x01).

2. My validator has withdrawal credentials type `0x00`, is there a deadline to update my withdrawal credentials?
@@ -16,8 +17,8 @@ After the [Capella](https://ethereum.org/en/history/#capella) upgrade on 12
3. Do I have to do anything to get my rewards after I update the withdrawal credentials to type `0x01`?

- No. The "validator sweep" occurs automatically and you can expect to receive the rewards every *n* days, [more information here](./voluntary-exit.md#4-when-will-i-get-my-staked-fund-after-voluntary-exit-if-my-validator-is-of-type-0x01).
+ No. The "validator sweep" occurs automatically and you can expect to receive the rewards every *n* days, [more information here](./voluntary-exit.md#4-when-will-i-get-my-staked-fund-after-voluntary-exit-if-my-validator-is-of-type-0x01).

The figure below summarizes partial withdrawals.

- ![partial](./imgs/partial-withdrawal.png) \ No newline at end of file
+ ![partial](./imgs/partial-withdrawal.png)
diff --git a/book/src/pi.md b/book/src/pi.md
index 2fea91ad17..b91ecab548 100644
--- a/book/src/pi.md
+++ b/book/src/pi.md
@@ -4,22 +4,21 @@
Tested on:

- - Raspberry Pi 4 Model B (4GB)
- - `Ubuntu 20.04 LTS (GNU/Linux 5.4.0-1011-raspi aarch64)`
-
+- Raspberry Pi 4 Model B (4GB)
+- `Ubuntu 20.04 LTS (GNU/Linux 5.4.0-1011-raspi aarch64)`

*Note: [Lighthouse supports cross-compiling](./cross-compiling.md) to target a Raspberry Pi (`aarch64`). Compiling on a faster machine (e.g., `x86_64` desktop) may be convenient.*

-### 1. Install Ubuntu
+## 1. Install Ubuntu

Follow the [Ubuntu Raspberry Pi installation instructions](https://ubuntu.com/download/raspberry-pi).
**A 64-bit version is required** A graphical environment is not required in order to use Lighthouse. Only the terminal and an Internet connection are necessary. -### 2. Install Packages +## 2. Install Packages Install the Ubuntu dependencies: @@ -32,7 +31,7 @@ sudo apt update && sudo apt install -y git gcc g++ make cmake pkg-config llvm-de > - If there are difficulties, try updating the package manager with `sudo apt > update`. -### 3. Install Rust +## 3. Install Rust Install Rust as per [rustup](https://rustup.rs/): @@ -47,7 +46,7 @@ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh > be found, run `source $HOME/.cargo/env`. After that, running `cargo version` should return the version, for example `cargo 1.68.2`. > - It's generally advisable to append `source $HOME/.cargo/env` to `~/.bashrc`. -### 4. Install Lighthouse +## 4. Install Lighthouse ```bash git clone https://github.com/sigp/lighthouse.git diff --git a/book/src/redundancy.md b/book/src/redundancy.md index bd1976f950..ee685a17cf 100644 --- a/book/src/redundancy.md +++ b/book/src/redundancy.md @@ -1,7 +1,5 @@ # Redundancy -[subscribe-api]: https://ethereum.github.io/beacon-APIs/#/Validator/prepareBeaconCommitteeSubnet - There are three places in Lighthouse where redundancy is notable: 1. ✅ GOOD: Using a redundant beacon node in `lighthouse vc --beacon-nodes` @@ -38,9 +36,9 @@ duties as long as *at least one* of the beacon nodes is available. There are a few interesting properties about the list of `--beacon-nodes`: - *Ordering matters*: the validator client prefers a beacon node that is - earlier in the list. + earlier in the list. - *Synced is preferred*: the validator client prefers a synced beacon node over - one that is still syncing. + one that is still syncing. - *Failure is sticky*: if a beacon node fails, it will be flagged as offline and won't be retried again for the rest of the slot (12 seconds). This helps prevent the impact of time-outs and other lengthy errors. 
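Because ordering matters in `--beacon-nodes`, the preferred node should come first in the comma-separated list. A sketch of how such a list is structured (the `192.168.1.1` host is the example from this section; the `localhost` entry is an assumption based on the default HTTP port `5052`):

```shell
# Comma-separated fallback list for `lighthouse vc --beacon-nodes`.
# The validator client prefers earlier entries, so the local node goes first.
beacon_nodes="http://localhost:5052,http://192.168.1.1:5052"
preferred=$(echo "$beacon_nodes" | cut -d, -f1)
echo "preferred beacon node: $preferred"
```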
@@ -49,7 +47,6 @@ There are a few interesting properties about the list of `--beacon-nodes`: > provided (if it is desired). It will only be used as default if no `--beacon-nodes` flag is > provided at all. - ### Configuring a redundant Beacon Node In our previous example, we listed `http://192.168.1.1:5052` as a redundant @@ -58,8 +55,10 @@ following flags: - `--http`: starts the HTTP API server. - `--http-address local_IP`: where `local_IP` is the private IP address of the computer running the beacon node. This is only required if your backup beacon node is on a different host. + > Note: You could also use `--http-address 0.0.0.0`, but this allows *any* external IP address to access the HTTP server. As such, a firewall should be configured to deny unauthorized access to port `5052`. - - `--execution-endpoint`: see [Merge Migration](./merge-migration.md). + +- `--execution-endpoint`: see [Merge Migration](./merge-migration.md). - `--execution-jwt`: see [Merge Migration](./merge-migration.md). For example one could use the following command to provide a backup beacon node: @@ -107,7 +106,7 @@ The default is `--broadcast subscriptions`. To also broadcast blocks for example ## Redundant execution nodes Lighthouse previously supported redundant execution nodes for fetching data from the deposit -contract. On merged networks _this is no longer supported_. Each Lighthouse beacon node must be +contract. On merged networks *this is no longer supported*. Each Lighthouse beacon node must be configured in a 1:1 relationship with an execution node. For more information on the rationale behind this decision please see the [Merge Migration](./merge-migration.md) documentation. diff --git a/book/src/run_a_node.md b/book/src/run_a_node.md index ab42c0c10a..6c1f23d8e8 100644 --- a/book/src/run_a_node.md +++ b/book/src/run_a_node.md @@ -8,9 +8,8 @@ You should be finished with one [Installation](./installation.md) method of your 1. 
Set up a [beacon node](#step-3-set-up-a-beacon-node-using-lighthouse); 1. [Check logs for sync status](#step-4-check-logs-for-sync-status); - - ## Step 1: Create a JWT secret file + A JWT secret file is used to secure the communication between the execution client and the consensus client. In this step, we will create a JWT secret file which will be used in later steps. ```bash @@ -21,18 +20,15 @@ openssl rand -hex 32 | tr -d "\n" | sudo tee /secrets/jwt.hex ## Step 2: Set up an execution node The Lighthouse beacon node *must* connect to an execution engine in order to validate the transactions present in blocks. The execution engine connection must be *exclusive*, i.e. you must have one execution node -per beacon node. The reason for this is that the beacon node _controls_ the execution node. Select an execution client from the list below and run it: - +per beacon node. The reason for this is that the beacon node *controls* the execution node. Select an execution client from the list below and run it: - [Nethermind](https://docs.nethermind.io/nethermind/first-steps-with-nethermind/running-nethermind-post-merge) - [Besu](https://besu.hyperledger.org/en/stable/public-networks/get-started/connect/mainnet/) - [Erigon](https://github.com/ledgerwatch/erigon#beacon-chain-consensus-layer) - [Geth](https://geth.ethereum.org/docs/getting-started/consensus-clients) - > Note: Each execution engine has its own flags for configuring the engine API and JWT secret to connect to a beacon node. Please consult the relevant page of your execution engine as above for the required flags. - Once the execution client is up, just let it continue running. The execution client will start syncing when it connects to a beacon node. Depending on the execution client and computer hardware specifications, syncing can take from a few hours to a few days. You can safely proceed to Step 3 to set up a beacon node while the execution client is still syncing. 
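Before moving on, it can be worth sanity-checking the JWT secret created in Step 1: both clients expect a 32-byte value, i.e. 64 hex characters. The check below uses a hypothetical local path instead of `/secrets/jwt.hex`:

```bash
# Generate a JWT secret as in Step 1, then verify it is 64 hex characters.
openssl rand -hex 32 | tr -d "\n" > ./jwt.hex
wc -c < ./jwt.hex   # prints 64
```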
## Step 3: Set up a beacon node using Lighthouse @@ -50,9 +46,10 @@ lighthouse bn \ --http ``` -> Note: If you download the binary file, you need to navigate to the directory of the binary file to run the above command. +> Note: If you download the binary file, you need to navigate to the directory of the binary file to run the above command. + +Notable flags: -Notable flags: - `--network` flag, which selects a network: - `lighthouse` (no flag): Mainnet. - `lighthouse --network mainnet`: Mainnet. @@ -71,14 +68,11 @@ provide a `--network` flag instead of relying on the default. - `--checkpoint-sync-url`: Lighthouse supports fast sync from a recent finalized checkpoint. Checkpoint sync is *optional*; however, we **highly recommend** it since it is substantially faster than syncing from genesis while still providing the same functionality. The checkpoint sync is done using [public endpoints](https://eth-clients.github.io/checkpoint-sync-endpoints/) provided by the Ethereum community. For example, in the above command, we use the URL for Sigma Prime's checkpoint sync server for mainnet `https://mainnet.checkpoint.sigp.io`. - `--http`: to expose an HTTP server of the beacon chain. The default listening address is `http://localhost:5052`. The HTTP API is required for the beacon node to accept connections from the *validator client*, which manages keys. - - If you intend to run the beacon node without running the validator client (e.g., for non-staking purposes such as supporting the network), you can modify the above command so that the beacon node is configured for non-staking purposes: - ### Non-staking -``` +``` lighthouse bn \ --network mainnet \ --execution-endpoint http://localhost:8551 \ @@ -89,16 +83,14 @@ lighthouse bn \ Since we are not staking, we can use the `--disable-deposit-contract-sync` flag to disable syncing of deposit logs from the execution node. - - Once Lighthouse runs, we can monitor the logs to see if it is syncing correctly. 
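Alongside the logs covered in the next step, sync progress can also be queried through the standard beacon node API (available because `--http` is enabled). The response below is a hypothetical example; against a live node you would fetch it with `curl http://localhost:5052/eth2/v1/node/syncing`:

```bash
# A sample /eth2/v1/node/syncing response body; sync_distance is the number
# of slots between the node's current head and the estimated network head.
body='{"data":{"is_syncing":true,"head_slot":"3690668","sync_distance":"42"}}'
echo "$body" | grep -o '"is_syncing":[a-z]*'   # prints "is_syncing":true
```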
- - ## Step 4: Check logs for sync status -Several logs help you identify if Lighthouse is running correctly. + +Several logs help you identify if Lighthouse is running correctly. ### Logs - Checkpoint sync + If you run Lighthouse with the flag `--checkpoint-sync-url`, Lighthouse will print a message to indicate that checkpoint sync is being used: ``` @@ -147,11 +139,11 @@ as `verified` indicating that they have been processed successfully by the execu INFO Synced, slot: 3690668, block: 0x1244…cb92, epoch: 115333, finalized_epoch: 115331, finalized_root: 0x0764…2a3d, exec_hash: 0x929c…1ff6 (verified), peers: 78 ``` -Once you see the above message - congratulations! This means that your node is synced and you have contributed to the decentralization and security of the Ethereum network. +Once you see the above message - congratulations! This means that your node is synced and you have contributed to the decentralization and security of the Ethereum network. ## Further readings -Several other resources are the next logical step to explore after running your beacon node: +Several other resources are the next logical step to explore after running your beacon node: - If you intend to run a validator, proceed to [become a validator](./mainnet-validator.md); - Explore how to [manage your keys](./key-management.md); diff --git a/book/src/setup.md b/book/src/setup.md index c678b4387a..d3da68f97c 100644 --- a/book/src/setup.md +++ b/book/src/setup.md @@ -9,6 +9,7 @@ particularly useful for development but still a good way to ensure you have the base dependencies. The additional requirements for developers are: + - [`anvil`](https://github.com/foundry-rs/foundry/tree/master/crates/anvil). This is used to simulate the execution chain during tests. You'll get failures during tests if you don't have `anvil` available on your `PATH`. @@ -17,10 +18,11 @@ The additional requirements for developers are: - [`java 17 runtime`](https://openjdk.java.net/projects/jdk/). 
17 is the minimum, used by web3signer_tests. - [`libpq-dev`](https://www.postgresql.org/docs/devel/libpq.html). Also known as - `libpq-devel` on some systems. + `libpq-devel` on some systems. - [`docker`](https://www.docker.com/). Some tests need docker installed and **running**. ## Using `make` + Commands to run the test suite are available via the `Makefile` in the project root for the benefit of CI/CD. We list some of these commands below so you can run them locally and avoid CI failures: @@ -31,7 +33,7 @@ you can run them locally and avoid CI failures: - `$ make test-ef`: (medium) runs the Ethereum Foundation test vectors. - `$ make test-full`: (slow) runs the full test suite (including all previous commands). This is approximately everything - that is required to pass CI. + that is required to pass CI. _The Lighthouse test suite is quite extensive; running the whole suite may take 30+ minutes._ @@ -80,6 +82,7 @@ test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; fini Alternatively, since `lighthouse` is a cargo workspace, you can use `-p eth2_ssz`, where `eth2_ssz` is the package name as defined in `/consensus/ssz/Cargo.toml` + ```bash $ head -2 consensus/ssz/Cargo.toml [package] @@ -120,13 +123,14 @@ test src/lib.rs - (line 10) ... ok test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.15s$ cargo test -p eth2_ssz ``` -#### test_logger +### test_logger The test_logger, located in `/common/logging/`, can be used to create a `Logger` that returns a `NullLogger` by default. If `--features 'logging/test_logger'` is passed while testing, the logs are displayed. This can be very helpful while debugging tests.
Example: + ``` $ cargo test -p beacon_chain validator_pubkey_cache::test::basic_operation --features 'logging/test_logger' Finished test [unoptimized + debuginfo] target(s) in 0.20s diff --git a/book/src/slasher.md b/book/src/slasher.md index 79a2d1f8eb..5098fe6eda 100644 --- a/book/src/slasher.md +++ b/book/src/slasher.md @@ -8,6 +8,7 @@ extra income for your validators. However it is currently only recommended for e of the immaturity of the slasher UX and the extra resources required. ## Minimum System Requirements + * Quad-core CPU * 16 GB RAM * 256 GB solid state storage (in addition to the space requirement for the beacon node DB) @@ -47,8 +48,8 @@ directory. It is possible to use one of several database backends with the slasher: -- LMDB (default) -- MDBX +* LMDB (default) +* MDBX The advantage of MDBX is that it performs compaction, resulting in less disk usage over time. The disadvantage is that upstream MDBX is unstable, so Lighthouse is pinned to a specific version. diff --git a/book/src/slashing-protection.md b/book/src/slashing-protection.md index 38348d2094..88e2bb955c 100644 --- a/book/src/slashing-protection.md +++ b/book/src/slashing-protection.md @@ -65,11 +65,11 @@ interchange file is a record of blocks and attestations signed by a set of valid basically a portable slashing protection database! To import a slashing protection database to Lighthouse, you first need to export your existing client's database. 
Instructions to export the slashing protection database for other clients are listed below: -- [Lodestar](https://chainsafe.github.io/lodestar/reference/cli/#validator-slashing-protection-export) -- [Nimbus](https://nimbus.guide/migration.html#2-export-slashing-protection-history) -- [Prysm](https://docs.prylabs.network/docs/wallet/slashing-protection#exporting-your-validators-slashing-protection-history) -- [Teku](https://docs.teku.consensys.net/HowTo/Prevent-Slashing#export-a-slashing-protection-file) +* [Lodestar](https://chainsafe.github.io/lodestar/reference/cli/#validator-slashing-protection-export) +* [Nimbus](https://nimbus.guide/migration.html#2-export-slashing-protection-history) +* [Prysm](https://docs.prylabs.network/docs/wallet/slashing-protection#exporting-your-validators-slashing-protection-history) +* [Teku](https://docs.teku.consensys.net/HowTo/Prevent-Slashing#export-a-slashing-protection-file) Once you have the slashing protection database from your existing client, you can now import the database to Lighthouse. With your validator client stopped, you can import a `.json` interchange file from another client using this command: diff --git a/book/src/suggested-fee-recipient.md b/book/src/suggested-fee-recipient.md index 44accbd143..4a9be7b963 100644 --- a/book/src/suggested-fee-recipient.md +++ b/book/src/suggested-fee-recipient.md @@ -9,14 +9,14 @@ During post-merge block production, the Beacon Node (BN) will provide a `suggest the execution node. This is a 20-byte Ethereum address which the execution node might choose to set as the recipient of other fees or rewards. There is no guarantee that an execution node will use the `suggested_fee_recipient` to collect fees, -it may use any address it chooses. It is assumed that an honest execution node *will* use the -`suggested_fee_recipient`, but users should note this trust assumption. +it may use any address it chooses. 
It is assumed that an honest execution node _will_ use the +`suggested_fee_recipient`, but users should note this trust assumption. The `suggested_fee_recipient` can be provided to the VC, which will transmit it to the BN. The BN also has a choice regarding the fee recipient it passes to the execution node, creating another noteworthy trust assumption. -To be sure *you* control your fee recipient value, run your own BN and execution node (don't use +To be sure _you_ control your fee recipient value, run your own BN and execution node (don't use third-party services). ## How to configure a suggested fee recipient @@ -68,7 +68,6 @@ Provide a 0x-prefixed address, e.g. lighthouse vc --suggested-fee-recipient 0x25c4a76E7d118705e7Ea2e9b7d8C59930d8aCD3b ... ``` - ### 3. Using the "--suggested-fee-recipient" flag on the beacon node The `--suggested-fee-recipient` can be provided to the BN to act as a default value when the @@ -96,7 +95,8 @@ client. | Required Headers | [`Authorization`](./api-vc-auth-header.md) | | Typical Responses | 202, 404 | -#### Example Request Body +### Example Request Body + ```json { "ethaddress": "0x1D4E51167DBDC4789a014357f4029ff76381b16c" } ``` @@ -120,6 +120,7 @@ curl -X POST \ Note that an authorization header is required to interact with the API. This is specified with the header `-H "Authorization: Bearer $(cat ${DATADIR}/validators/api-token.txt)"` which reads the API token to supply the authentication. Refer to [Authorization Header](./api-vc-auth-header.md) for more information. If you are having permission issues accessing the API token file, you can modify the header to become `-H "Authorization: Bearer $(sudo cat ${DATADIR}/validators/api-token.txt)"`.
#### Successful Response (202) + ```json null ``` @@ -137,7 +138,7 @@ The same path with a `GET` request can be used to query the fee recipient for a | Required Headers | [`Authorization`](./api-vc-auth-header.md) | | Typical Responses | 200, 404 | -Command: +Command: ```bash DATADIR=$HOME/.lighthouse/mainnet @@ -150,6 +151,7 @@ curl -X GET \ ``` #### Successful Response (200) + ```json { "data": { @@ -171,7 +173,7 @@ This is useful if you want the fee recipient to fall back to the validator clien | Required Headers | [`Authorization`](./api-vc-auth-header.md) | | Typical Responses | 204, 404 | -Command: +Command: ```bash DATADIR=$HOME/.lighthouse/mainnet @@ -184,6 +186,7 @@ curl -X DELETE \ ``` #### Successful Response (204) + ```json null ``` diff --git a/book/src/ui-authentication.md b/book/src/ui-authentication.md index 0572824d5c..8d457c8f68 100644 --- a/book/src/ui-authentication.md +++ b/book/src/ui-authentication.md @@ -2,9 +2,9 @@ To enhance the security of your account, we offer the option to set a session password. This allows the user to avoid re-entering the api-token when performing critical mutating operations on the validator. Instead a user can simply enter their session password. In the absence of a session password, Siren will revert to the api-token specified in your configuration settings as the default security measure. -> This does not protect your validators from unauthorized device access. +> This does not protect your validators from unauthorized device access. -![](imgs/ui-session-auth.png) +![authentication](imgs/ui-session-auth.png) Session passwords must contain at least: @@ -14,20 +14,18 @@ Session passwords must contain at least: - 1 number - 1 special character - ## Protected Actions Prior to executing any sensitive validator action, Siren will request authentication of the session password or api-token. 
-![](imgs/ui-exit.png) - +![exit](imgs/ui-exit.png) In the event of three consecutive failed attempts, Siren will initiate a security measure by locking all actions and prompting for configuration settings to be renewed to regain access to these features. -![](imgs/ui-fail-auth.png) +![fail-authentication](imgs/ui-fail-auth.png) ## Auto Connect In the event that auto-connect is enabled, refreshing the Siren application will result in a prompt to authenticate the session password or api-token. If three consecutive authentication attempts fail, Siren will activate a security measure by locking the session and prompting for configuration settings to be reset to regain access. -![](imgs/ui-autoconnect-auth.png) \ No newline at end of file +![autoconnect](imgs/ui-autoconnect-auth.png) diff --git a/book/src/ui-configuration.md b/book/src/ui-configuration.md index 31951c3c92..f5e4bed34a 100644 --- a/book/src/ui-configuration.md +++ b/book/src/ui-configuration.md @@ -6,7 +6,6 @@ following configuration screen. ![ui-configuration](./imgs/ui-configuration.png) - ## Connecting to the Clients Both the Beacon node and the Validator client need to have their HTTP APIs enabled. These ports should be accessible from the computer running Siren. This allows you to enter the address and ports of the associated Lighthouse @@ -18,7 +17,7 @@ To enable the HTTP API for the beacon node, utilize the `--gui` CLI flag. This a If you require accessibility from another machine within the network, configure the `--http-address` to match the local LAN IP of the system running the Beacon Node and Validator Client. -> To access from another machine on the same network (192.168.0.200) set the Beacon Node and Validator Client `--http-address` as `192.168.0.200`. When this is set, the validator client requires the flag `--beacon-nodes http://192.168.0.200:5052` to connect to the beacon node. 
+> To access from another machine on the same network (192.168.0.200) set the Beacon Node and Validator Client `--http-address` as `192.168.0.200`. When this is set, the validator client requires the flag `--beacon-nodes http://192.168.0.200:5052` to connect to the beacon node. In a similar manner, the validator client requires activation of the `--http` flag, along with the optional consideration of configuring the `--http-address` flag. If the `--http-address` flag is set on the Validator Client, then the `--unencrypted-http-transport` flag is required as well. These settings will ensure compatibility with Siren's connectivity requirements. @@ -27,7 +26,6 @@ If you run Siren in the browser (by entering `localhost` in the browser), you wi A green tick will appear once Siren is able to connect to both clients. You can specify different ports for each client by clicking on the advanced tab. - ## API Token The API Token is a secret key that allows you to connect to the validator diff --git a/book/src/ui-faqs.md b/book/src/ui-faqs.md index 77821788f6..4e4de225af 100644 --- a/book/src/ui-faqs.md +++ b/book/src/ui-faqs.md @@ -1,16 +1,20 @@ # Frequently Asked Questions ## 1. Are there any requirements to run Siren? + Yes, the most current Siren version requires Lighthouse v4.3.0 or higher to function properly. These releases can be found on the [releases](https://github.com/sigp/lighthouse/releases) page of the Lighthouse repository. ## 2. Where can I find my API token? + The required API token may be found in the default data directory of the validator client. For more information, please refer to the Lighthouse UI configuration [`api token section`](./api-vc-auth-header.md). ## 3. How do I fix the Node Network Errors? + If you receive a red notification with a BEACON or VALIDATOR NODE NETWORK ERROR, you can refer to the Lighthouse UI configuration and [`connecting to clients section`](./ui-configuration.md#connecting-to-the-clients). ## 4.
How do I connect Siren to Lighthouse from a different computer on the same network? -The most effective approach to enable access for a local network computer to Lighthouse's HTTP API ports is by configuring the `--http-address` to match the local LAN IP of the system running the beacon node and validator client. For instance, if the said node operates at `192.168.0.200`, this IP can be specified using the `--http-address` parameter as `--http-address 192.168.0.200`. When this is set, the validator client requires the flag `--beacon-nodes http://192.168.0.200:5052` to connect to the beacon node. + +The most effective approach to enable access for a local network computer to Lighthouse's HTTP API ports is by configuring the `--http-address` to match the local LAN IP of the system running the beacon node and validator client. For instance, if the said node operates at `192.168.0.200`, this IP can be specified using the `--http-address` parameter as `--http-address 192.168.0.200`. When this is set, the validator client requires the flag `--beacon-nodes http://192.168.0.200:5052` to connect to the beacon node. Subsequently, by designating the host as `192.168.0.200`, you can seamlessly connect Siren to this specific beacon node and validator client pair from any computer situated within the same network. ## 5. How can I use Siren to monitor my validators remotely when I am not at home? @@ -22,6 +26,7 @@ Most contemporary home routers provide options for VPN access in various ways. A In the absence of a VPN, an alternative approach involves utilizing an SSH tunnel. To achieve this, you need remote SSH access to the computer hosting the Beacon Node and Validator Client pair (which necessitates a port forward in your router). In this context, while it is not obligatory to set a `--http-address` flag on the Beacon Node and Validator Client, you can configure an SSH tunnel to the local ports on the node and establish a connection through the tunnel. 
For instructions on setting up an SSH tunnel, refer to [`Connecting Siren via SSH tunnel`](./ui-faqs.md#6-how-do-i-connect-siren-to-lighthouse-via-an-ssh-tunnel) for detailed guidance. ## 6. How do I connect Siren to Lighthouse via an SSH tunnel? + If you would like to access Siren beyond the local network (i.e., across the internet), we recommend using an SSH tunnel. This requires a tunnel for 3 ports: `80` (assuming the port is unchanged as per the [installation guide](./ui-installation.md#docker-recommended)), `5052` (for beacon node) and `5062` (for validator client). You can use the command below to perform SSH tunneling: ```bash @@ -30,13 +35,10 @@ ssh -N -L 80:127.0.0.1:80 -L 5052:127.0.0.1:5052 -L 5062:127.0.0.1:5062 username ``` - Where `username` is the username of the server and `local_ip` is the local IP address of the server. Note that with the `-N` option, the SSH session will not execute remote commands; this avoids confusion with ordinary shell sessions. The connection will appear to be "hung" upon a successful connection, but that is normal. Once you have successfully connected to the server via SSH tunneling, you should be able to access Siren by entering `localhost` in a web browser. - You can also access Siren using the app downloaded in the [Siren release page](https://github.com/sigp/siren/releases). To access Siren beyond the local computer, you can use SSH tunneling for ports `5052` and `5062` using the command: - ```bash ssh -N -L 5052:127.0.0.1:5052 -L 5062:127.0.0.1:5062 username@local_ip ``` ## 7. Does Siren support reverse proxy or DNS named addresses? + Yes, if you need to access your beacon or validator from an address such as `https://merp-server:9909/eth2-vc`, you should follow these steps for configuration: + 1. Toggle `https` as your protocol 2. Add your address as `merp-server/eth2-vc` 3.
Add your Beacon and Validator ports as `9909` @@ -53,9 +57,10 @@ If you have configured it correctly you should see a green checkmark indicating If you have separate address setups for your Validator Client and Beacon Node respectively, you should access the `Advance Settings` on the configuration and repeat the steps above for each address. - ## 8. How do I change my Beacon or Validator address after logging in? + Once you have successfully arrived at the main dashboard, use the sidebar to access the settings view. In the top right-hand corner there is a `Configuration` action button that will redirect you back to the configuration screen where you can make appropriate changes. ## 9. Why doesn't my validator balance graph show any data? + If your graph is not showing data, it usually means your validator node is still caching data. The application must wait at least 3 epochs before it can render any graphical visualizations. This could take up to 20 minutes. diff --git a/book/src/ui-installation.md b/book/src/ui-installation.md index b8ae788c69..4f7df4e8ff 100644 --- a/book/src/ui-installation.md +++ b/book/src/ui-installation.md @@ -3,6 +3,7 @@ Siren runs on Linux, MacOS and Windows. ## Version Requirement + The Siren app requires Lighthouse v3.5.1 or higher to function properly. These versions can be found on the [releases](https://github.com/sigp/lighthouse/releases) page of the Lighthouse repository.
## Pre-Built Electron Packages @@ -26,26 +27,26 @@ The electron app can be built from source by first cloning the repository and entering the directory: ``` -$ git clone https://github.com/sigp/siren.git -$ cd siren +git clone https://github.com/sigp/siren.git +cd siren ``` Once cloned, the electron app can be built and ran via the Makefile by: ``` -$ make +make ``` alternatively it can be built via: ``` -$ yarn +yarn ``` Once completed successfully the electron app can be run via: ``` -$ yarn dev +yarn dev ``` ### Running In The Browser @@ -59,19 +60,22 @@ production-grade web-server to host the application. `docker` is required to be installed with the service running. The docker image can be built and run via the Makefile by running: + ``` -$ make docker +make docker ``` Alternatively, to run with Docker, the image needs to be built. From the repository directory run: + ``` -$ docker build -t siren . +docker build -t siren . ``` Then to run the image: + ``` -$ docker run --rm -ti --name siren -p 80:80 siren +docker run --rm -ti --name siren -p 80:80 siren ``` This will open port 80 and allow your browser to connect. You can choose @@ -83,20 +87,24 @@ To view Siren, simply go to `http://localhost` in your web browser. #### Development Server A development server can also be built which will expose a local port 3000 via: + ``` -$ yarn start +yarn start ``` Once executed, you can direct your web browser to the following URL to interact with the app: + ``` http://localhost:3000 ``` A production version of the app can be built via + ``` -$ yarn build +yarn build ``` + and then further hosted via a production web server. ### Known Issues diff --git a/book/src/ui-usage.md b/book/src/ui-usage.md index 867a49a91f..eddee311fd 100644 --- a/book/src/ui-usage.md +++ b/book/src/ui-usage.md @@ -1,10 +1,10 @@ # Usage -# Dashboard +## Dashboard Siren's dashboard view provides a summary of all performance and key validator metrics. 
Sync statuses, uptimes, accumulated rewards, hardware and network metrics are all consolidated on the dashboard for evaluation. -![](imgs/ui-dashboard.png) +![dashboard](imgs/ui-dashboard.png) ## Account Earnings @@ -12,66 +12,62 @@ The account earnings component accumulates reward data from all registered valid Below in the earning section, you can also view your total earnings or click the adjacent buttons to view your estimated earnings given a specific time frame based on current device and network conditions. -![](imgs/ui-account-earnings.png) +![earning](imgs/ui-account-earnings.png) ## Validator Table The validator table component is a list of all registered validators, which includes data such as name, index, total balance, earned rewards and current status. Each validator row also contains a link to a detailed data modal and additional data provided by [Beaconcha.in](https://beaconcha.in). -![](imgs/ui-validator-table.png) +![validator-table](imgs/ui-validator-table.png) ## Validator Balance Chart The validator balance component is a graphical representation of each validator balance over the latest 10 epochs. Take note that only active validators are rendered in the chart visualization. -![](imgs/ui-validator-balance1.png) +![validator-balance](imgs/ui-validator-balance1.png) By clicking on the chart component, you can filter selected validators in the render. This allows for greater resolution in the rendered visualization. - - - - +balance-modal +validator-balance2 ## Hardware Usage and Device Diagnostics The hardware usage component gathers information about the device the Beacon Node is currently running on. It displays the Disk usage, CPU metrics and memory usage of the Beacon Node device. The device diagnostics component provides the sync status of the execution client and beacon node.
- - - +hardware +device ## Log Statistics The log statistics present an hourly combined rate of critical, warning, and error logs from the validator client and beacon node. This analysis enables informed decision-making, troubleshooting, and proactive maintenance for optimal system performance. - +log -# Validator Management +## Validator Management Siren's validator management view provides a detailed overview of all validators with options to deposit to and/or add new validators. Each validator table row displays the validator name, index, balance, rewards, status and all available actions per validator. -![](imgs/ui-validator-management.png) +![validator-management](imgs/ui-validator-management.png) ## Validator Modal Clicking the validator icon activates a detailed validator modal component. This component also allows users to trigger validator actions, as well as to view and update validator graffiti. Each modal contains the validator total income with hourly, daily and weekly earnings estimates. - +ui-validator-modal -# Settings +## Settings Siren's settings view provides access to the application theme, version, name, device name and important external links. From the settings page, users can also access the configuration screen to adjust any beacon or validator node parameters. -![](imgs/ui-settings.png) - +![settings](imgs/ui-settings.png) -# Validator and Beacon Logs +## Validator and Beacon Logs The logs page provides users with the functionality to access and review recorded logs for both validators and beacons. Users can conveniently observe log severity, messages, timestamps, and any additional data associated with each log entry. The interface allows for seamless switching between validator and beacon log outputs, and incorporates useful features such as built-in text search and the ability to pause log feeds.
Additionally, users can obtain log statistics, which are also available on the main dashboard, thereby facilitating a comprehensive overview of the system's log data. Please note that Siren is limited to storing and displaying only the previous 1000 log messages. This also means the text search is limited to the logs that are currently stored within Siren's limit. -![](imgs/ui-logs.png) \ No newline at end of file +![logs](imgs/ui-logs.png) diff --git a/book/src/validator-doppelganger.md b/book/src/validator-doppelganger.md index b62086d4bf..a3d60d31b3 100644 --- a/book/src/validator-doppelganger.md +++ b/book/src/validator-doppelganger.md @@ -16,7 +16,7 @@ achieves this by staying silent for 2-3 epochs after a validator is started so i other instances of that validator before starting to sign potentially slashable messages. > Note: Doppelganger Protection is not yet interoperable, so if it is configured on a Lighthouse -> validator client, the client must be connected to a Lighthouse beacon node. +> validator client, the client must be connected to a Lighthouse beacon node. ## Initial Considerations diff --git a/book/src/validator-inclusion.md b/book/src/validator-inclusion.md index f31d729449..092c813a1e 100644 --- a/book/src/validator-inclusion.md +++ b/book/src/validator-inclusion.md @@ -12,10 +12,10 @@ In order to apply these APIs, you need to have historical states information in ## Endpoints -HTTP Path | Description | +| HTTP Path | Description | | --- | -- | -[`/lighthouse/validator_inclusion/{epoch}/global`](#global) | A global vote count for a given epoch. -[`/lighthouse/validator_inclusion/{epoch}/{validator_id}`](#individual) | A per-validator breakdown of votes in a given epoch. +| [`/lighthouse/validator_inclusion/{epoch}/global`](#global) | A global vote count for a given epoch. | +| [`/lighthouse/validator_inclusion/{epoch}/{validator_id}`](#individual) | A per-validator breakdown of votes in a given epoch. 
| ## Global @@ -53,16 +53,17 @@ vote (that is why it is _effective_ `Gwei`). The following fields are returned: - `current_epoch_active_gwei`: the total staked gwei that was active (i.e., - able to vote) during the current epoch. + able to vote) during the current epoch. - `current_epoch_target_attesting_gwei`: the total staked gwei that attested to - the majority-elected Casper FFG target epoch during the current epoch. + the majority-elected Casper FFG target epoch during the current epoch. +- `previous_epoch_active_gwei`: as per `current_epoch_active_gwei`, but during the previous epoch. - `previous_epoch_target_attesting_gwei`: see `current_epoch_target_attesting_gwei`. - `previous_epoch_head_attesting_gwei`: the total staked gwei that attested to a - head beacon block that is in the canonical chain. + head beacon block that is in the canonical chain. From this data you can calculate: -#### Justification/Finalization Rate +### Justification/Finalization Rate `previous_epoch_target_attesting_gwei / current_epoch_active_gwei` @@ -95,7 +96,6 @@ The [Global Votes](#global) endpoint is the summation of all of these individual values, please see it for definitions of terms like "current_epoch", "previous_epoch" and "target_attester". - ### HTTP Example ```bash diff --git a/book/src/validator-management.md b/book/src/validator-management.md index bc6aba3c4f..b9610b6967 100644 --- a/book/src/validator-management.md +++ b/book/src/validator-management.md @@ -41,6 +41,7 @@ Here's an example file with two validators: voting_keystore_path: /home/paul/.lighthouse/validators/0xa5566f9ec3c6e1fdf362634ebec9ef7aceb0e460e5079714808388e5d48f4ae1e12897fed1bea951c17fa389d511e477/voting-keystore.json voting_keystore_password: myStrongpa55word123&$ ``` + In this example we can see two validators: - A validator identified by the `0x87a5...` public key which is enabled. 
@@ -51,7 +52,7 @@ In this example we can see two validators: Each permitted field of the file is listed below for reference: - `enabled`: A `true`/`false` indicating if the validator client should consider this - validator "enabled". + validator "enabled". - `voting_public_key`: A validator public key. - `type`: How the validator signs messages (this can be `local_keystore` or `web3signer` (see [Web3Signer](./validator-web3signer.md))). - `voting_keystore_path`: The path to an EIP-2335 keystore. @@ -59,9 +60,9 @@ Each permitted field of the file is listed below for reference: - `voting_keystore_password`: The password to the EIP-2335 keystore. > **Note**: Either `voting_keystore_password_path` or `voting_keystore_password` *must* be -> supplied. If both are supplied, `voting_keystore_password_path` is ignored. +> supplied. If both are supplied, `voting_keystore_password_path` is ignored. ->If you do not wish to have `voting_keystore_password` being stored in the `validator_definitions.yml` file, you can add the field `voting_keystore_password_path` and point it to a file containing the password. The file can be, e.g., on a mounted portable drive that contains the password so that no password is stored on the validating node. +>If you do not wish to have `voting_keystore_password` stored in the `validator_definitions.yml` file, you can add the field `voting_keystore_password_path` and point it to a file containing the password. The file can be, e.g., on a mounted portable drive that contains the password so that no password is stored on the validating node. ## Populating the `validator_definitions.yml` file @@ -77,7 +78,6 @@ recap: ### Automatic validator discovery - When the `--disable-auto-discover` flag is **not** provided, the validator client will search the `validator-dir` for validators and add any *new* validators to the `validator_definitions.yml` with `enabled: true`. 
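Tying the fields above together, a hypothetical `validator_definitions.yml` entry using `voting_keystore_password_path` instead of an inline password might look like the following sketch (it reuses the example validator from this guide; the password file path is illustrative, not prescribed):

```yaml
- enabled: true
  voting_public_key: "0xa5566f9ec3c6e1fdf362634ebec9ef7aceb0e460e5079714808388e5d48f4ae1e12897fed1bea951c17fa389d511e477"
  type: local_keystore
  voting_keystore_path: /home/paul/.lighthouse/validators/0xa5566f9ec3c6e1fdf362634ebec9ef7aceb0e460e5079714808388e5d48f4ae1e12897fed1bea951c17fa389d511e477/voting-keystore.json
  # Hypothetical path: the password lives on a mounted portable drive,
  # so no password is stored on the validating node itself.
  voting_keystore_password_path: /media/usb/keystore-password.txt
```

With this layout, the drive can be unmounted once the validator client has decrypted the keystore.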
@@ -148,7 +148,6 @@ ensure their `secrets-dir` is organised as below: └── 0x87a580d31d7bc69069b55f5a01995a610dd391a26dc9e36e81057a17211983a79266800ab8531f21f1083d7d84085007 ``` - ### Manual configuration The automatic validator discovery process works out-of-the-box with validators @@ -181,7 +180,7 @@ the active validator, the validator client will: password. 1. Use the keystore password to decrypt the keystore and obtain a BLS keypair. 1. Verify that the decrypted BLS keypair matches the `voting_public_key`. -1. Create a `voting-keystore.json.lock` file adjacent to the +1. Create a `voting-keystore.json.lock` file adjacent to the `voting_keystore_path`, indicating that the voting keystore is in-use and should not be opened by another process. 1. Proceed to act for that validator, creating blocks and attestations if/when required. diff --git a/book/src/validator-manager-create.md b/book/src/validator-manager-create.md index 98202d3b52..d97f953fc1 100644 --- a/book/src/validator-manager-create.md +++ b/book/src/validator-manager-create.md @@ -48,6 +48,7 @@ lighthouse \ --suggested-fee-recipient
\ --output-path ./ ``` + > If the flag `--first-index` is not provided, it will default to using index 0. > The `--suggested-fee-recipient` flag may be omitted to use whatever default > value the VC uses. It does not necessarily need to be identical to @@ -63,6 +64,7 @@ lighthouse \ --validators-file validators.json \ --vc-token ``` + > This is assuming that `validators.json` is in the present working directory. If it is not, insert the directory of the file. > Be sure to remove `./validators.json` after the import is successful since it > contains unencrypted validator keystores. @@ -141,7 +143,6 @@ must be known. The location of the file varies, but it is located in the `~/.lighthouse/mainnet/validators/api-token.txt`. We will use `` to substitute this value. If you are unsure of the `api-token.txt` path, you can run `curl http://localhost:5062/lighthouse/auth` which will show the path. - Once the VC is running, use the `import` command to import the validators to the VC: ```bash @@ -166,16 +167,18 @@ The user should now *securely* delete the `validators.json` file (e.g., `shred - The `validators.json` contains the unencrypted validator keys and must not be shared with anyone. At the same time, `lighthouse vc` will log: + ```bash INFO Importing keystores via standard HTTP API, count: 1 WARN No slashing protection data provided with keystores INFO Enabled validator voting_pubkey: 0xab6e29f1b98fedfca878edce2b471f1b5ee58ee4c3bd216201f98254ef6f6eac40a53d74c8b7da54f51d3e85cacae92f, signing_method: local_keystore INFO Modified key_cache saved successfully ``` -The WARN message means that the `validators.json` file does not contain the slashing protection data. This is normal if you are starting a new validator. The flag `--enable-doppelganger-protection` will also protect users from potential slashing risk. + +The WARN message means that the `validators.json` file does not contain the slashing protection data. This is normal if you are starting a new validator. 
The flag `--enable-doppelganger-protection` will also protect users from potential slashing risk. The validators will now go through 2-3 epochs of [doppelganger protection](./validator-doppelganger.md) and will automatically start performing -their duties when they are deposited and activated. +their duties when they are deposited and activated. If the host VC contains the same public key as the `validators.json` file, an error will be shown and the `import` process will stop: @@ -194,6 +197,7 @@ lighthouse \ --vc-token \ --ignore-duplicates ``` + and the output will be as follows: ```bash diff --git a/book/src/validator-manager-move.md b/book/src/validator-manager-move.md index 5009e6407e..10de1fe87c 100644 --- a/book/src/validator-manager-move.md +++ b/book/src/validator-manager-move.md @@ -100,7 +100,7 @@ lighthouse \ > it is recommended for an additional layer of safety. It will result in 2-3 > epochs of downtime for the validator after it is moved, which is generally an > inconsequential cost in lost rewards or penalties. -> +> > Optionally, users can add the `--http-store-passwords-in-secrets-dir` flag if they'd like to have > the import validator keystore passwords stored in separate files rather than in the > `validator-definitions.yml` file. If you don't know what this means, you can safely omit the flag. @@ -158,7 +158,9 @@ Moved keystore 1 of 2 Moved keystore 2 of 2 Done. 
``` + At the same time, `lighthouse vc` will log: + ```bash INFO Importing keystores via standard HTTP API, count: 1 INFO Enabled validator voting_pubkey: 0xab6e29f1b98fedfca878edce2b471f1b5ee58ee4c3bd216201f98254ef6f6eac40a53d74c8b7da54f51d3e85cacae92f, signing_method: local_keystore @@ -183,12 +185,13 @@ lighthouse \ ``` > Note: If you have the `validator-monitor-auto` turned on, the source beacon node may still be reporting the attestation status of the validators that have been moved: + ``` INFO Previous epoch attestation(s) success validators: ["validator_index"], epoch: 100000, service: val_mon, service: beacon ``` -> This is fine as the validator monitor does not know that the validators have been moved (it *does not* mean that the validators have attested twice for the same slot). A restart of the beacon node will resolve this. +> This is fine as the validator monitor does not know that the validators have been moved (it *does not* mean that the validators have attested twice for the same slot). A restart of the beacon node will resolve this. Any errors encountered during the operation should include information on how to proceed. Assistance is also available on our -[Discord](https://discord.gg/cyAszAh). \ No newline at end of file +[Discord](https://discord.gg/cyAszAh). 
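The 2-3 epochs of doppelganger-related downtime mentioned above can be put in wall-clock terms using mainnet timing (32 slots of 12 seconds per epoch); a quick sketch, assuming mainnet parameters:

```shell
# Rough doppelganger-protection downtime after a move (mainnet timing).
# One epoch = 32 slots * 12 s/slot = 384 s (6.4 minutes).
epoch_seconds=$((32 * 12))
min_minutes=$((2 * epoch_seconds / 60))   # 2 epochs, truncated to whole minutes
max_minutes=$((3 * epoch_seconds / 60))   # 3 epochs, truncated to whole minutes
echo "roughly ${min_minutes}-${max_minutes} minutes of downtime"
```

On mainnet this works out to roughly 13-19 minutes, which is why the guide describes the cost as generally inconsequential.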
diff --git a/book/src/validator-manager.md b/book/src/validator-manager.md index e3cb74bd66..a71fab1e3a 100644 --- a/book/src/validator-manager.md +++ b/book/src/validator-manager.md @@ -1,7 +1,6 @@ # Validator Manager [Ethereum Staking Launchpad]: https://launchpad.ethereum.org/en/ -[Import Validators]: #import-validators ## Introduction @@ -32,4 +31,4 @@ The `validator-manager` boasts the following features: ## Guides - [Creating and importing validators using the `create` and `import` commands.](./validator-manager-create.md) -- [Moving validators between two VCs using the `move` command.](./validator-manager-move.md) \ No newline at end of file +- [Moving validators between two VCs using the `move` command.](./validator-manager-move.md) diff --git a/book/src/validator-monitoring.md b/book/src/validator-monitoring.md index 532bd50065..6439ea83a3 100644 --- a/book/src/validator-monitoring.md +++ b/book/src/validator-monitoring.md @@ -20,7 +20,6 @@ Lighthouse performs validator monitoring in the Beacon Node (BN) instead of the - Users can use a local BN to observe some validators running in a remote location. - Users can monitor validators that are not their own. - ## How to Enable Monitoring The validator monitor is always enabled in Lighthouse, but it might not have any enrolled @@ -57,7 +56,8 @@ Monitor the mainnet validators at indices `0` and `1`: ``` lighthouse bn --validator-monitor-pubkeys 0x933ad9491b62059dd065b560d256d8957a8c402cc6e8d8ee7290ae11e8f7329267a8811c397529dac52ae1342ba58c95,0xa1d1ad0714035353258038e964ae9675dc0252ee22cea896825c01458e1807bfad2f9969338798548d9858a571f7425c ``` -> Note: The validator monitoring will stop collecting per-validator Prometheus metrics and issuing per-validator logs when the number of validators reaches 64. To continue collecting metrics and logging, use the flag `--validator-monitor-individual-tracking-threshold N` where `N` is a number greater than the number of validators to monitor. 
+ +> Note: The validator monitoring will stop collecting per-validator Prometheus metrics and issuing per-validator logs when the number of validators reaches 64. To continue collecting metrics and logging, use the flag `--validator-monitor-individual-tracking-threshold N` where `N` is a number greater than the number of validators to monitor. ## Observing Monitoring @@ -102,7 +102,7 @@ dashboard contains most of the metrics exposed via the validator monitor. Lighthouse v4.6.0 introduces a new feature to track the performance of a beacon node. This feature internally simulates an attestation for each slot, and outputs a hit or miss for the head, target and source votes. The attestation simulator is turned on automatically (even when there are no validators) and prints logs in the debug level. -> Note: The simulated attestations are never published to the network, so the simulator does not reflect the attestation performance of a validator. +> Note: The simulated attestations are never published to the network, so the simulator does not reflect the attestation performance of a validator. 
The attestation simulation prints the following logs when simulating an attestation: @@ -118,11 +118,11 @@ DEBG Simulated attestation evaluated, head_hit: true, target_hit: true, source_h ``` An example of a log when the head is missed: + ``` DEBG Simulated attestation evaluated, head_hit: false, target_hit: true, source_hit: true, attestation_slot: Slot(1132623), attestation_head: 0x1c0e53c6ace8d0ff57f4a963e4460fe1c030b37bf1c76f19e40928dc2e214c59, attestation_target: 0xaab25a6d01748cf4528e952666558317b35874074632550c37d935ca2ec63c23, attestation_source: 0x13ccbf8978896c43027013972427ee7ce02b2bb9b898dbb264b870df9288c1e7, service: val_mon, service: beacon, module: beacon_chain::validator_monitor:2051 ``` - With `--metrics` enabled on the beacon node, the following metrics will be recorded: ``` @@ -134,11 +134,12 @@ validator_monitor_attestation_simulator_source_attester_hit_total validator_monitor_attestation_simulator_source_attester_miss_total ``` -A grafana dashboard to view the metrics for attestation simulator is available [here](https://github.com/sigp/lighthouse-metrics/blob/master/dashboards/AttestationSimulator.json). +A grafana dashboard to view the metrics for attestation simulator is available [here](https://github.com/sigp/lighthouse-metrics/blob/master/dashboards/AttestationSimulator.json). + +The attestation simulator provides an insight into the attestation performance of a beacon node. It can be used as an indication of how expediently the beacon node has completed importing blocks within the 4s time frame for an attestation to be made. -The attestation simulator provides an insight into the attestation performance of a beacon node. It can be used as an indication of how expediently the beacon node has completed importing blocks within the 4s time frame for an attestation to be made. 
+The attestation simulator _does not_ consider: -The attestation simulator *does not* consider: - the latency between the beacon node and the validator client - the potential delays when publishing the attestation to the network @@ -146,10 +147,6 @@ which are critical factors to consider when evaluating the attestation performan Assuming the above factors are ignored (no delays between beacon node and validator client, and in publishing the attestation to the network): -1. If the attestation simulator says that all votes are hit, it means that if the beacon node were to publish the attestation for this slot, the validator should receive the rewards for the head, target and source votes. +1. If the attestation simulator says that all votes are hit, it means that if the beacon node were to publish the attestation for this slot, the validator should receive the rewards for the head, target and source votes. 1. If the attestation simulator says that one or more votes are missed, it means that there is a delay in importing the block. The delay could be due to slowness in processing the block (e.g., due to a slow CPU) or that the block is arriving late (e.g., the proposer publishes the block late). If the beacon node were to publish the attestation for this slot, the validator will miss one or more votes (e.g., the head vote). - - - - diff --git a/book/src/voluntary-exit.md b/book/src/voluntary-exit.md index 4ec4837fea..33672e54b7 100644 --- a/book/src/voluntary-exit.md +++ b/book/src/voluntary-exit.md @@ -22,14 +22,11 @@ In order to initiate an exit, users can use the `lighthouse account validator ex

- The `--password-file` flag is used to specify the path to the file containing the password for the voting keystore. If this flag is not provided, the user will be prompted to enter the password. 
- After validating the password, the user will be prompted to enter a special exit phrase as a final confirmation after which the voluntary exit will be published to the beacon chain. The exit phrase is the following: > Exit my validator - - Below is an example for initiating a voluntary exit on the Holesky testnet. ``` @@ -71,16 +68,15 @@ After the [Capella](https://ethereum.org/en/history/#capella) upgrade on 12 There are two types of withdrawal credentials, `0x00` and `0x01`. To check which type your validator has, go to [Staking launchpad](https://launchpad.ethereum.org/en/withdrawals), enter your validator index and click `verify on mainnet`: - - `withdrawals enabled` means your validator is of type `0x01`, and you will automatically receive the full withdrawal to the withdrawal address that you set. -- `withdrawals not enabled` means your validator is of type `0x00`, and will need to update your withdrawal credentials from `0x00` type to `0x01` type (also known as BLS-to-execution-change, or BTEC) to receive the staked funds. The common way to do this is using `Staking deposit CLI` or `ethdo`, with the instructions available [here](https://launchpad.ethereum.org/en/withdrawals#update-your-keys). - +- `withdrawals enabled` means your validator is of type `0x01`, and you will automatically receive the full withdrawal to the withdrawal address that you set. +- `withdrawals not enabled` means your validator is of type `0x00`, and you will need to update your withdrawal credentials from `0x00` type to `0x01` type (also known as BLS-to-execution-change, or BTEC) to receive the staked funds. The common way to do this is using `Staking deposit CLI` or `ethdo`, with the instructions available [here](https://launchpad.ethereum.org/en/withdrawals#update-your-keys). ### 2. What if my validator is of type `0x00` and I do not update my withdrawal credentials after I initiated a voluntary exit? Your staked funds will continue to be locked on the beacon chain. 
You can update your withdrawal credentials **anytime**, and there is no deadline for that. The catch is that as long as you do not update your withdrawal credentials, your staked funds will continue to be locked on the beacon chain. Only after you update the withdrawal credentials will the staked funds be withdrawn to the withdrawal address. -### 3. How many times can I update my withdrawal credentials? - +### 3. How many times can I update my withdrawal credentials? + If your withdrawal credentials are of type `0x00`, you can only update them once to type `0x01`. It is therefore very important to ensure that the withdrawal address you set is an address under your control, preferably an address controlled by a hardware wallet. If your withdrawal credentials are of type `0x01`, it means you have set your withdrawal address previously, and you will not be able to change the withdrawal address. @@ -89,38 +85,35 @@ There are two types of withdrawal credentials, `0x00` and `0x01`. To check which Your BTEC request will be included very quickly as soon as a new block is proposed. This should be the case most (if not all) of the time, given that the peak BTEC request time has now passed (right after the [Capella](https://ethereum.org/en/history/#capella) upgrade on 12th April 2023, and it lasted for ~2 days). -### 4. When will I get my staked fund after voluntary exit if my validator is of type `0x01`? - +### 4. When will I get my staked funds after voluntary exit if my validator is of type `0x01`? + There are 3 waiting periods until you get the staked funds in your withdrawal address: - - An exit queue: a varying time that takes at a minimum 5 epochs (32 minutes) if there is no queue; or if there are many validators exiting at the same time, it has to go through the exit queue. The exit queue can be from hours to weeks, depending on the number of validators in the exit queue. 
During this time your validator has to stay online to perform its duties to avoid penalties. - - - A fixed waiting period of 256 epochs (27.3 hours) for the validator's status to become withdrawable. +- An exit queue: a varying time that takes at a minimum 5 epochs (32 minutes) if there is no queue; or if there are many validators exiting at the same time, it has to go through the exit queue. The exit queue can be from hours to weeks, depending on the number of validators in the exit queue. During this time your validator has to stay online to perform its duties to avoid penalties. + +- A fixed waiting period of 256 epochs (27.3 hours) for the validator's status to become withdrawable. - - A varying time of "validator sweep" that can take up to *n* days with *n* listed in the table below. The "validator sweep" is the process of skimming through all eligible validators by index number for withdrawals (those with type `0x01` and balance above 32ETH). Once the "validator sweep" reaches your validator's index, your staked fund will be fully withdrawn to the withdrawal address set. +- A varying time of "validator sweep" that can take up to _n_ days with _n_ listed in the table below. The "validator sweep" is the process of skimming through all eligible validators by index number for withdrawals (those with type `0x01` and balance above 32ETH). Once the "validator sweep" reaches your validator's index, your staked fund will be fully withdrawn to the withdrawal address set.
-| Number of eligible validators | Ideal scenario *n* | Practical scenario *n* | +| Number of eligible validators | Ideal scenario _n_ | Practical scenario _n_ | |:----------------:|:---------------------:|:----:| -| 300000 | 2.60 | 2.63 | -| 400000 | 3.47 | 3.51 | -| 500000 | 4.34 | 4.38 | -| 600000 | 5.21 | 5.26 | -| 700000 | 6.08 | 6.14 | -| 800000 | 6.94 | 7.01 | -| 900000 | 7.81 | 7.89 | -| 1000000 | 8.68 | 8.77 | +| 300000 | 2.60 | 2.63 | +| 400000 | 3.47 | 3.51 | +| 500000 | 4.34 | 4.38 | +| 600000 | 5.21 | 5.26 | +| 700000 | 6.08 | 6.14 | +| 800000 | 6.94 | 7.01 | +| 900000 | 7.81 | 7.89 | +| 1000000 | 8.68 | 8.77 |
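The figures in the table above follow from a simple division under the sweep-throughput assumptions stated with it (7200 blocks/day at 16 withdrawals/block, i.e. 115200 withdrawals/day in the ideal case, with 1% of blocks missed in the practical case); a quick sketch:

```shell
# Reproduce the sweep-time estimates for a given number of eligible validators.
# Ideal: 7200 blocks/day * 16 withdrawals/block = 115200 withdrawals/day.
# Practical: assume 1% of blocks are missed, i.e. 99% of that throughput.
validators=700000
ideal_days=$(awk -v v="$validators" 'BEGIN { printf "%.2f", v / 115200 }')
practical_days=$(awk -v v="$validators" 'BEGIN { printf "%.2f", v / (115200 * 0.99) }')
echo "ideal: ${ideal_days} days, practical: ${practical_days} days"
```

For 700000 eligible validators this reproduces the table's row of 6.08 days (ideal) and 6.14 days (practical).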
> Note: Ideal scenario assumes no block proposals are missed. This means a total of 7200 blocks/day * 16 withdrawals/block = 115200 withdrawals/day. Practical scenario assumes 1% of blocks are missed per day. As an example, if there are 700000 eligible validators, one would expect a waiting time of slightly more than 6 days. - - The total time taken is the summation of the above 3 waiting periods. After these waiting periods, you will receive the staked funds in your withdrawal address. The voluntary exit and full withdrawal process is summarized in the Figure below. ![full](./imgs/full-withdrawal.png) - diff --git a/scripts/cli.sh b/scripts/cli.sh index 2767ed73c8..148b23966c 100755 --- a/scripts/cli.sh +++ b/scripts/cli.sh @@ -19,7 +19,7 @@ write_to_file() { printf "# %s\n\n\`\`\`\n%s\n\`\`\`" "$program" "$cmd" > "$file" # Adjust the width of the help text and append to the end of file - sed -i -e '$a\'$'\n''' "$file" + sed -i -e '$a\'$'\n''\n''' "$file" } CMD=./target/release/lighthouse diff --git a/scripts/mdlint.sh b/scripts/mdlint.sh new file mode 100755 index 0000000000..5274f108d2 --- /dev/null +++ b/scripts/mdlint.sh @@ -0,0 +1,23 @@ +#!/usr/bin/env bash + +# IMPORTANT +# This script should NOT be run directly. +# Run `make mdlint` from the root of the repository instead. + +# use markdownlint-cli to check markdown files +docker run --rm -v ./book:/workdir ghcr.io/igorshubovych/markdownlint-cli:latest '**/*.md' --ignore node_modules + +# capture the linter's exit code +exit_code=$? + +if [[ $exit_code == 0 ]]; then + echo "All markdown files are properly formatted." + exit 0 +elif [[ $exit_code == 1 ]]; then + echo "Exiting with errors. Run 'make mdlint' locally and commit the changes. Note that not all errors can be fixed automatically; if there are still errors after running 'make mdlint', fix them manually." 
+  docker run --rm -v ./book:/workdir ghcr.io/igorshubovych/markdownlint-cli:latest '**/*.md' --ignore node_modules --fix +  exit 1 +else +  echo "Exiting with exit code >1. Check the error logs and fix them accordingly." +  exit 1 +fi \ No newline at end of file
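The `mdlint.sh` script added above branches on the linter's exit status (0 for clean, 1 for lint errors, anything greater for a tool failure). A minimal, self-contained sketch of that pattern, with a hypothetical `fake_linter` standing in for the markdownlint docker invocation:

```shell
#!/usr/bin/env bash
# Sketch of the exit-status branching used by mdlint.sh. `fake_linter`
# is a stand-in for the real `docker run ... markdownlint-cli` command.
fake_linter() { return 1; }    # pretend the linter found lint errors

exit_code=0
fake_linter || exit_code=$?    # capture $? immediately; the next command resets it

if [[ $exit_code -eq 0 ]]; then
  result="clean"
elif [[ $exit_code -eq 1 ]]; then
  result="lint errors"
else
  result="tool failure"
fi
echo "$result"
```

The `cmd || exit_code=$?` form captures the status without aborting under `set -e`, which is why it is often preferred over a bare command followed by `exit_code=$?`.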