Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store only the first tree state in each series of identical tree states #4784

Closed
Tracked by #3757 ...
upbqdn opened this issue Jul 17, 2022 · 15 comments · Fixed by #7239, #7266 or #7312
Closed
Tracked by #3757 ...

Store only the first tree state in each series of identical tree states #4784

upbqdn opened this issue Jul 17, 2022 · 15 comments · Fixed by #7239, #7266 or #7312
Assignees
Labels
A-rust Area: Updates to Rust code A-state Area: State / database changes C-tech-debt Category: Code maintainability issues I-heavy Problems with excessive memory, disk, or CPU usage I-slow Problems with performance or responsiveness

Comments

@upbqdn
Copy link
Member

upbqdn commented Jul 17, 2022

Storing only the first tree state in each identical series of tree states saves memory, and makes Chain cloning cheaper.

Possible Implementation

We can access the deduped trees using:

Non-finalized state:

Chain.{sapling,orchard}_note_commitment_tree_by_height.range(..=target_height).next_back()

Finalized state (reading is implemented, writing new blocks and updating cached blocks is not):

db.zs_prev_key_value_back_from({sapling,orchard}_note_commitment_tree, target_height)

Performance

This needs a full sync performance test:

The performance of reversed iteration is usually much worse than forward iteration
https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ

@upbqdn upbqdn changed the title Store roots in Arcs in the non-finalized state Store only the first tree state in each identical series of tree states Jul 18, 2022
@upbqdn upbqdn self-assigned this Jul 20, 2022
@ftm1000 ftm1000 added S-needs-triage Status: A bug report needs triage P-Medium ⚡ labels Jul 20, 2022
@teor2345
Copy link
Contributor

This is important to do eventually because large blocks take up more space on disk. But we don't need to do it any time soon.

@teor2345 teor2345 added P-Optional ✨ I-heavy Problems with excessive memory, disk, or CPU usage A-state Area: State / database changes and removed P-Medium ⚡ labels Aug 23, 2022
@teor2345
Copy link
Contributor

Performance and disk space usage are not a priority for Zebra right now.

@teor2345 teor2345 closed this as not planned Won't fix, can't repro, duplicate, stale Aug 26, 2022
@mpguerra mpguerra removed the S-needs-triage Status: A bug report needs triage label Sep 28, 2022
@teor2345
Copy link
Contributor

This is required for the note commitment tree scanning in #6642 to have reasonable performance.

@teor2345
Copy link
Contributor

teor2345 commented Jun 5, 2023

Note from the engineering sync: This seems risky to do between the last release candidate and the first stable release. There is no specific deadline for this change.

@teor2345
Copy link
Contributor

teor2345 commented Jun 6, 2023

Finalized state:

db.iterator_cf({sapling,orchard}_note_commitment_tree, IteratorMode::From(target_height, Direction::Reverse)).next()

The database read part of this ticket is implemented in PR #6813.

Everything else still needs to be implemented:

  • non-finalized state data structure deduplication and lookups
  • finalized state deduplication for newly verified blocks
  • finalized state deduplication for existing blocks on disk

PR #6813 introduces state minor versions and the idea of a state format update task. But I didn't create the actual task in that PR. I'm happy to help out with that change, currently it's just a vague idea of a design, and a bunch of TODOs.

@teor2345
Copy link
Contributor

@upbqdn lets catch up some time and work out how to make these database changes without breaking compatibility?

@mpguerra
Copy link
Contributor

Please add your planning poker estimate with Zenhub @arya2

@upbqdn upbqdn removed the S-needs-triage Status: A bug report needs triage label Jul 17, 2023
@teor2345
Copy link
Contributor

Here's a draft checklist for state format changes, let me know if it works for you!

  • update the database format in the Zebra docs
  • increment the state minor version
  • write the new format in the block write task
  • update older formats in the format upgrade task
  • test that the new format works when creating a new state, and updating an older state

See the upgrade design docs for more details.

The links won't work until PR #7261 merges.

@upbqdn upbqdn changed the title Store only the first tree state in each identical series of tree states Store only the first tree state in each series of identical tree states Jul 19, 2023
@teor2345
Copy link
Contributor

Some database compatibility issues have come up in the disk format change part of this ticket, so I'd like to merge them after the 1.1.0 release. This impacts some parts of PR #7266. (But not the compatibility fixes!)

@mpguerra I wasn't sure how to set up those dependencies in ZenHub, did you want to split this ticket into permanent database format changes and in-memory changes? (The memory changes are merged and fine to go in the release.)

@teor2345
Copy link
Contributor

Or we could just remember not to merge PR #7266 before the release.

@mpguerra
Copy link
Contributor

Some database compatibility issues have come up in the disk format change part of this ticket, so I'd like to merge them after the 1.1.0 release. This impacts some parts of PR #7266. (But not the compatibility fixes!)

@mpguerra I wasn't sure how to set up those dependencies in ZenHub, did you want to split this ticket into permanent database format changes and in-memory changes? (The memory changes are merged and fine to go in the release.)

I blocked #7266 on the v1.1.0 release, however that won't physically stop it from merging so I would just add the do-not-merge label to it for now and we need to remember to remove it after the release

@mpguerra
Copy link
Contributor

mpguerra commented Aug 9, 2023

Did this close by mistake? Or are all the PRs for this issue done already?

@upbqdn upbqdn reopened this Aug 9, 2023
@upbqdn
Copy link
Member Author

upbqdn commented Aug 9, 2023

This shouldn't have been closed. The only reason I can think of might be this line in the PR description:

Address part of #4784.

Which seems odd.

@mpguerra
Copy link
Contributor

mpguerra commented Aug 9, 2023

I think it's because I linked a PR with this issue in the Github interface and when that PR close this automatically closed also

@upbqdn
Copy link
Member Author

upbqdn commented Aug 9, 2023

I haven't used that feature yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rust Area: Updates to Rust code A-state Area: State / database changes C-tech-debt Category: Code maintainability issues I-heavy Problems with excessive memory, disk, or CPU usage I-slow Problems with performance or responsiveness
Projects
None yet
4 participants