Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: state updates #902

Merged
merged 37 commits into from
Sep 21, 2020
Merged

RFC: state updates #902

merged 37 commits into from
Sep 21, 2020

Conversation

hdevalence
Copy link
Contributor

@hdevalence hdevalence commented Aug 14, 2020

WIP, to include content from #871

Rendered

@hdevalence hdevalence added the C-design Category: Software design work label Aug 14, 2020
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

I had a few questions and tweaks.

I'd like to chat about the overall verification design, and how this RFC fits in to that bigger picture. Let's do a video call, when everyone is available?

hdevalence and others added 4 commits August 31, 2020 20:30
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
Co-authored-by: Jane Lusby <jlusby42@gmail.com>
Co-authored-by: teor <teor@riseup.net>
@hdevalence
Copy link
Contributor Author

Rebased onto main and renumbered to target RFC 5.

@dconnolly dconnolly marked this pull request as ready for review September 4, 2020 06:20
The rationale for this change is described in the document: it means
that we write blocks only to one end of the Sled tree, and hopefully
helps us with spatial access patterns.

This should help alleviate a major cause of memory use in Zebra's
current WIP Sled structure, which is that:

- blocks are stored in random, sparse order (by hash) in the B-tree;
- the `Request::GetDepth` method opens the entire block store and
  queries a random part of its block data to determine whether a hash is
  present;
- if present, it deserializes the complete block data of both the given
  block and the current tip block, to compute the difference in block
  heights.

This access pattern forces a large amount of B-tree data to remain
resident, and could probably be avoided if we didn't do that.
@hdevalence
Copy link
Contributor Author

The latest commit, 6b8a5d7, changes the RFC to store blocks by height and notes that the Request::CommitFinalizedBlock can arrive out of order. For the reasons mentioned in the commit message, I think that implementing this RFC will help solve the current excessive memory usage, which I believe is an artefact of the current stub code.

teor2345
teor2345 previously approved these changes Sep 8, 2020
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, the recent changes clear up a lot of the outstanding issues.

Here are my remaining open questions:

Do we need to distinguish between zcashd's probabilistically final state (due to cumulative rollbacks), and Zebra's finalised state?
#902 (comment)

Do we need to store the total work of the tip in the sled state?
#902 (comment)

But I don't want to block the PR on them.

We can open up a design ticket for a separate proof of work RFC, and skip clarifying zcashd state implementation details.

But let's also check with @yaahc and @dconnolly?

Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I've covered a bunch of edge cases and ambiguities. I'm happy for the state RFC to merge tomorrow - if we're all happy with it.

@yaahc yaahc requested review from teor2345 and dconnolly September 21, 2020 18:52
@yaahc yaahc dismissed teor2345’s stale review September 21, 2020 19:03

comments have mostly been addressed, but this RFC is close enough to done that we (dierdre, henry, and jane) decided (while teor was asleep) to ship it as is and do further proofreading / updates in later PRs.

@yaahc yaahc merged commit 16cc095 into main Sep 21, 2020
@yaahc yaahc deleted the rfc-state-updates branch September 21, 2020 19:05
Comment on lines +653 to +654
sprout_anchor: sprout::tree::Root,
sapling_anchor: sapling::tree::Root,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dconnolly does every post-Sapling block have a sprout and a sapling anchor?

Copy link
Contributor

@dconnolly dconnolly Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'Yes'. What it is depends on the contents of the transactions in the block, joinsplits for sprout and spends/outputs for sapling.

If there are no joinsplits, then the input sprout treestate from the parent block has not been changed, and I think (but staring at the spec to confirm) that means that the output Sprout treestate == the input Sprout treestate. Same for Sapling, if there are no changes to the state, I assume the final treestate (anchor) is the output state from the last transaction (a no-op).

// @str4d @daira does this sound accurate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that sounds like we'll always have both Roots in CommitBlock, because it only accepts post-Sapling blocks.

But I just realised that CommitFinalizedBlock accepts all blocks, including Genesis blocks.

Do genesis blocks have a Sprout root?
Do pre-Sapling blocks have a Sapling root?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the genesis block, the input treestate 'for each of Sprout and Sapling', is the empty treestate, which we can compute with my changes in #1045 (at least for Sapling for now, I'm still working on Sprout).

So for pre-Sapling, I think they literally compute and pass along the empty Sapling treestate, from block to block, until Sapling activation.

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and by definition, the genesis block can't have a Sprout transaction, because there are no coins yet.

Ok, that's great, I think we're good, as far as the state goes.

I'll leave this conversation un-resolved, so @str4d or @daira can double-check things.

Copy link
Contributor

@str4d str4d Sep 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically there is an unspendable block subsidy in the genesis block (as with Bitcoin); it just happens that due to the slow start, it has a value of zero. In a chain where ZIP 213 is active from genesis, there could therefore be a Sapling output in the genesis block; if that chain also disabled slow start, the output would have value. It's not immediately clear to me at 4am whether that output should be spendable; under the current consensus rules in the spec, I think it would be added to the commitment tree and therefore be spendable (whereas genesis transparent UTXOs are not added to the UTXO set), but IDK what zcashd would do here (again, special genesis block).

In practice the genesis block never contains shielded outputs in zcashd due to the way that the genesis block is side loaded, making it hard to generate on the fly and thus simpler to hard-coded for regtest. In our RPC tests we currently activate the NUs up to Sapling at height 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, transparent coinbase UTXOs can't be spent until 100 blocks after they are created :-)

Comment on lines +668 to +669
sprout_anchor: sprout::tree::Root,
sapling_anchor: sapling::tree::Root,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make changes above, we'll also need to make them here.

@teor2345 teor2345 mentioned this pull request Sep 29, 2020
2 tasks
@teor2345 teor2345 mentioned this pull request Nov 4, 2020
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-design Category: Software design work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants