Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improves block commit error typing (WIP) #5682

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 4 additions & 3 deletions zebra-consensus/src/block.rs
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ use zebra_chain::{
work::equihash,
};
use zebra_state as zs;
use zs::CommitBlockError;

use crate::{error::*, transaction as tx, BoxError};

Expand Down Expand Up @@ -71,7 +72,7 @@ pub enum VerifyBlockError {
Time(zebra_chain::block::BlockTimeError),

#[error("unable to commit block after semantic verification")]
Commit(#[source] BoxError),
Commit(#[source] CommitBlockError),

#[error("invalid transaction")]
Transaction(#[from] TransactionError),
Expand Down Expand Up @@ -271,10 +272,10 @@ where
match state_service
.ready()
.await
.map_err(VerifyBlockError::Commit)?
.expect("poll_ready method of state service and Ready::<StateService>::poll will never return Err")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For your information - no changes needed:

This change locks us into always having the state service ready.

That might be ok, because our design requires the service is either ready: Poll::Ready(Ok(_)), or permanently failed: Poll::Ready(Err(_)). The state service can never be temporarily paused: Poll::Pending.

The state needs to commit blocks in order. So we want to pause higher blocks, but still let lower blocks commit. But Tower services can only pause all requests, so we can't use poll_ready() to do that.

So as long as we're sure we'll never error here, even on shutdown, it would be ok to do this.

Copy link
Contributor Author

@arya2 arya2 Nov 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! I was very uncertain about this part, that's good to know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also unsure about this!

.call(zs::Request::CommitBlock(prepared_block))
.await
.map_err(VerifyBlockError::Commit)?
.map_err(|e| VerifyBlockError::Commit(e.downcast::<CommitBlockError>().map(|e| *e).unwrap()))?
{
zs::Response::Committed(committed_hash) => {
assert_eq!(committed_hash, hash, "state must commit correct hash");
Expand Down
38 changes: 36 additions & 2 deletions zebra-state/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -41,10 +41,44 @@ impl From<BoxError> for CloneError {
/// A boxed [`std::error::Error`].
pub type BoxError = Box<dyn std::error::Error + Send + Sync + 'static>;

/// An error specifying where a duplicate block was found.
#[derive(Debug, Error, PartialEq, Eq)]
#[allow(missing_docs)]
pub enum DuplicateError {
#[error("block is on the best chain")]
BestChain(Option<String>),

#[error("block is on a side chain")]
SideChain,

#[error("block is queued to be validated and committed")]
Pending,
}

/// An error specifying the kind of task exit.
#[derive(Debug, Clone, Error, PartialEq, Eq)]
#[allow(missing_docs)]
pub enum TaskExitError {
#[error("dropping the state: dropped unused queued non-finalized block")]
StateService,

#[error("block commit task exited. Is Zebra shutting down?")]
BlockWriteTask,
}

/// An error describing the reason a block could not be committed to the state.
#[derive(Debug, Error, PartialEq, Eq)]
#[error("block is not contextually valid")]
pub struct CommitBlockError(#[from] ValidateContextError);
#[allow(missing_docs)]
pub enum CommitBlockError {
#[error("block is not contextually valid")]
Invalid(#[from] ValidateContextError),

#[error("block is already committed or pending a commit")]
Duplicate(#[from] DuplicateError),

#[error("block commit task exited. Is Zebra shutting down?")]
TaskExited(#[from] TaskExitError),
}

/// An error describing why a block failed contextual validation.
#[derive(Debug, Error, PartialEq, Eq)]
Expand Down
27 changes: 8 additions & 19 deletions zebra-state/src/service.rs
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ use crate::{
MAX_FIND_BLOCK_HASHES_RESULTS, MAX_FIND_BLOCK_HEADERS_RESULTS_FOR_ZEBRA,
MAX_LEGACY_CHAIN_BLOCKS,
},
error::{DuplicateError, TaskExitError},
service::{
block_iter::any_ancestor_blocks,
chain_tip::{ChainTipBlock, ChainTipChange, ChainTipSender, LatestChainTip},
Expand Down Expand Up @@ -244,9 +245,7 @@ impl Drop for StateService {
self.clear_finalized_block_queue(
"dropping the state: dropped unused queued finalized block",
);
self.clear_non_finalized_block_queue(
"dropping the state: dropped unused queued non-finalized block",
);
self.clear_non_finalized_block_queue(TaskExitError::StateService);

// Then drop self.read_service, which checks the block write task for panics,
// and tries to shut down the database.
Expand Down Expand Up @@ -573,7 +572,7 @@ impl StateService {
}

/// Drops all queued non-finalized blocks, and sends an error on their result channels.
fn clear_non_finalized_block_queue(&mut self, error: impl Into<BoxError> + Clone) {
fn clear_non_finalized_block_queue(&mut self, error: TaskExitError) {
for (_hash, queued) in self.queued_non_finalized_blocks.drain() {
Self::send_non_finalized_block_error(queued, error.clone());
}
Expand Down Expand Up @@ -609,18 +608,13 @@ impl StateService {
.contains(&prepared.hash)
{
let (rsp_tx, rsp_rx) = oneshot::channel();
let _ = rsp_tx.send(Err(
"block has already been sent to be committed to the state".into(),
));
let _ = rsp_tx.send(Err(DuplicateError::Pending.into()));
return rsp_rx;
}

if self.read_service.db.contains_height(prepared.height) {
let (rsp_tx, rsp_rx) = oneshot::channel();
let _ = rsp_tx.send(Err(
"block height is in the finalized state: block is already committed to the state"
.into(),
));
let _ = rsp_tx.send(Err(DuplicateError::BestChain(None).into()));
return rsp_rx;
}

Expand All @@ -633,7 +627,7 @@ impl StateService {
tracing::debug!("replacing older queued request with new request");
let (mut rsp_tx, rsp_rx) = oneshot::channel();
std::mem::swap(old_rsp_tx, &mut rsp_tx);
let _ = rsp_tx.send(Err("replaced by newer request".into()));
let _ = rsp_tx.send(Err(DuplicateError::Pending.into()));
rsp_rx
} else {
let (rsp_tx, rsp_rx) = oneshot::channel();
Expand Down Expand Up @@ -730,14 +724,9 @@ impl StateService {

if let Err(SendError(queued)) = send_result {
// If Zebra is shutting down, drop blocks and return an error.
Self::send_non_finalized_block_error(
queued,
"block commit task exited. Is Zebra shutting down?",
);
Self::send_non_finalized_block_error(queued, TaskExitError::BlockWriteTask);

self.clear_non_finalized_block_queue(
"block commit task exited. Is Zebra shutting down?",
);
self.clear_non_finalized_block_queue(TaskExitError::BlockWriteTask);

return;
};
Expand Down
9 changes: 5 additions & 4 deletions zebra-state/src/service/queued_blocks.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ use tracing::instrument;

use zebra_chain::{block, transparent};

use crate::{BoxError, FinalizedBlock, PreparedBlock};
use crate::{error::DuplicateError, BoxError, FinalizedBlock, PreparedBlock};

#[cfg(test)]
mod tests;
Expand Down Expand Up @@ -139,9 +139,10 @@ impl QueuedBlocks {
let parent_hash = &expired_block.block.header.previous_block_hash;

// we don't care if the receiver was dropped
let _ = expired_sender.send(Err(
"pruned block at or below the finalized tip height".into()
));
let _ = expired_sender.send(Err(DuplicateError::BestChain(Some(
"pruned block at or below the finalized tip height".to_string(),
))
.into()));

// TODO: only remove UTXOs if there are no queued blocks with that UTXO
// (known_utxos is best-effort, so this is ok for now)
Expand Down