Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sync ignore known hashes #853

Merged
merged 8 commits into from
Aug 10, 2020
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
118 changes: 76 additions & 42 deletions zebrad/src/commands/start/sync.rs
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ where
/// - state: the zebra-state that stores the chain
/// - verifier: the zebra-consensus verifier that checks the chain
pub fn new(chain: Network, peers: ZN, state: ZS, verifier: ZV) -> Self {
let retry_peers = Retry::new(RetryLimit::new(3), peers.clone());
let retry_peers = Retry::new(RetryLimit::new(10), peers.clone());
Self {
tip_network: peers,
block_network: retry_peers,
Expand All @@ -83,7 +83,8 @@ where

// ObtainTips Step 6
//
// If there are any prospective tips, call ExtendTips. Continue this step until there are no more prospective tips.
// If there are any prospective tips, call ExtendTips.
// Continue this step until there are no more prospective tips.
while !self.prospective_tips.is_empty() {
tracing::debug!("extending prospective tips");

Expand All @@ -110,11 +111,6 @@ where
// making progress (probably using a timeout), then
// continue the loop with a new invocation of
// obtain_tips(), which will restart block downloads.
// this requires correctly constructing a block locator
// (TODO below) and ensuring that the verifier handles
// multiple requests for verification of the same block
// hash by handling both requests or by discarding the
// earlier request in favor of the later one.
Err(e) => tracing::error!(?e, "potentially transient error"),
};
}
Expand Down Expand Up @@ -190,35 +186,29 @@ where
// list, prune any block hashes already included in the
// state, stopping at the first unknown hash to get resp1',
// ..., respF'. (These lists may be empty).
let mut first_unknown = 0;
let mut first_unknown = None;
for (i, &hash) in hashes.iter().enumerate() {
let depth = self
.state
.ready_and()
.await
.map_err(|e| eyre!(e))?
.call(zebra_state::Request::GetDepth { hash })
.await
.map_err(|e| eyre!(e))?;
if let zs::Response::Depth(None) = depth {
first_unknown = i;
if !self.state_contains(hash).await? {
first_unknown = Some(i);
break;
}
}

// Hashes will be empty if we know about all the blocks in the response.
if first_unknown.is_none() {
tracing::debug!("ObtainTips: all hashes are known");
continue;
}
let first_unknown = first_unknown.expect("already checked for None");
tracing::debug!(
first_unknown,
"found index of first unknown hash in response"
);
if first_unknown == hashes.len() {
// We should only stop getting hashes once we've finished the initial sync
tracing::debug!("no new hashes, even though we gave our tip?");
continue;
}

let unknown_hashes = &hashes[first_unknown..];
let new_tip = *unknown_hashes
.last()
.expect("already checked first_unknown < hashes.len()");
.expect("already checked that unknown hashes isn't empty");

// ObtainTips Step 4:
// Combine the last elements of each list into a set; this is the
Expand Down Expand Up @@ -297,23 +287,48 @@ where
// It indicates that the remote peer does not have any blocks
// following the prospective tip.
match (hashes.first(), hashes.len()) {
(_, 0) => {
tracing::debug!("skipping empty response");
(None, _) => {
tracing::debug!("ExtendTips: skipping empty response");
continue;
}
(_, 1) => {
tracing::debug!("skipping length-1 response, in case it's an unsolicited inv message");
tracing::debug!("ExtendTips: skipping length-1 response, in case it's an unsolicited inv message");
continue;
}
(Some(hash), _) if (hash == &self.genesis_hash) => {
tracing::debug!("skipping response, peer could not extend the tip");
tracing::debug!(
"ExtendTips: skipping response, peer could not extend the tip"
);
continue;
}
_ => {}
(Some(&hash), _) => {
// Check for hashes we've already seen.
// This happens a lot near the end of the chain.
// This check reduces the number of duplicate
// blocks, but it is not required for
// correctness.
if self.state_contains(hash).await? {
tracing::debug!(
?hash,
"ExtendTips: skipping response, peer returned a duplicate hash: already in state"
);
continue;
}
}
}

let new_tip = hashes.pop().expect("expected: hashes must have len > 0");

// Check for tips we've already seen
// TODO: remove this check once the sync service is more reliable
if self.state_contains(new_tip).await? {
tracing::debug!(
?new_tip,
"ExtendTips: Unexpected duplicate tip from peer: already in state"
);
continue;
}

// ExtendTips Step 4
//
// Combine the last elements of the remaining responses into
Expand Down Expand Up @@ -372,19 +387,7 @@ where
// - the genesis hash is used as a placeholder for "no matches".
//
// So we just queue the genesis block here.

let state_has_genesis = self
.state
.ready_and()
.await
.map_err(|e| eyre!(e))?
.call(zebra_state::Request::GetBlock {
hash: self.genesis_hash,
})
.await
.is_ok();

if !state_has_genesis {
if !self.state_contains(self.genesis_hash).await? {
self.request_blocks(vec![self.genesis_hash]).await?;
}

Expand All @@ -395,6 +398,14 @@ where
async fn request_blocks(&mut self, hashes: Vec<BlockHeaderHash>) -> Result<(), Report> {
tracing::debug!(hashes.len = hashes.len(), "requesting blocks");
for hash in hashes.into_iter() {
// TODO: remove this check once the sync service is more reliable
if self.state_contains(hash).await? {
tracing::debug!(
?hash,
"request_blocks: Unexpected duplicate hash: already in state"
);
continue;
}
// We construct the block download requests sequentially, waiting
// for the peer set to be ready to process each request. This
// ensures that we start block downloads in the order we want them
Expand Down Expand Up @@ -433,6 +444,29 @@ where
Ok(())
}

/// Returns `Ok(true)` if the hash is present in the state, and `Ok(false)`
/// if the hash is not present in the state.
///
/// Returns `Err(_)` if an error occurs.
///
/// TODO: handle multiple tips in the state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you expect the multiple tips to affect this code? The only thing I can think of is race conditions related to depths that are invalidated by updates to the state in the background, but I think we're always using the depth to test for the presence of a block, not to specifically locate it, so it doesn't seem like it would matter here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like the state service to add a request that checks if a block is present in any chain.

Then we don't have to worry about the exact depth and race conditions.

(We might occasionally re-download some side-chains near the reorg limit, but they should be rejected or dropped pretty quickly. So that's not a correctness issue.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #862 for this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we don't have to worry about the exact depth and race conditions.

Then we don't have to worry about the exact depth and race conditions.

That's the thing I'm not seeing though, what race conditions are there and in what situations does the exact depth matter?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there's a follow-on issue already, it seems like we could merge this in now and sort out this detail later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replied in #862 - I think we are actually agreeing about the details, but I want to change the interface.

#[instrument(skip(self))]
async fn state_contains(&mut self, hash: BlockHeaderHash) -> Result<bool, Report> {
match self
.state
.ready_and()
.await
.map_err(|e| eyre!(e))?
.call(zebra_state::Request::GetDepth { hash })
.await
.map_err(|e| eyre!(e))?
{
zs::Response::Depth(Some(_)) => Ok(true),
zs::Response::Depth(None) => Ok(false),
_ => unreachable!("wrong response to depth request"),
}
}

/// Update metrics gauges, and create a trace containing metrics.
fn update_metrics(&self) {
metrics::gauge!(
Expand Down