Feature/multi version flat db #5865

Contributor

@garyschulte garyschulte commented Sep 11, 2023

PR description

Draft version of multi-version flat db

  • creates bonsai context concept
  • creates FlatDbArchiveStrategy
  • kikori worldstate provider integrated into BonsaiWorldStateProvider (not suitable for eth_getProof)

A handful of hacks, marked with TODOs, are in place for testing expedience.

Skipping CI while in draft form.

Fixed Issue(s)

fixes #5846

@github-actions

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.
  • I thought about the changelog and included a changelog update if required.
  • If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

@garyschulte garyschulte force-pushed the feature/multi-version-flat-db branch from 51c12ef to 1700245 Compare September 11, 2023 19:16
public void putFlatCode(
final SegmentedKeyValueStorageTransaction transaction,
final Hash accountHash,
final Bytes32 codeHash,

Check notice

Code scanning / CodeQL

Useless parameter

The parameter 'codeHash' is never used.
@garyschulte garyschulte force-pushed the feature/multi-version-flat-db branch from 2963fa3 to ca74aab Compare September 12, 2023 05:14

var contextSafeCopy = worldStateStorage.getContextSafeCopy();
contextSafeCopy.getFlatDbStrategy().updateBlockContext(blockHeader);
return Optional.of(new BonsaiWorldState(this, contextSafeCopy));
Contributor

@matkt matkt Sep 12, 2023

I'm not sure why we are doing this. Cloning only the flat DB seems unclear. It would be better to follow the existing Bonsai logic: if we need a clone, clone the whole storage and pass that clone to BonsaiWorldState. That way there is no need for a flat-DB-specific clone.

Just pass a clone of the storage to the worldstate, and the internal flat DB will read from the snapshot automatically.

I'm also not a fan of cloning every time; we have had problems because of that before. I think we can use the same code as before: use the clone in the cache and, if it is not present, create a new one (without rollback).

return trieLogManager.getHeadWorldState(blockchain::getBlockHeader)
              .map(MutableWorldState::freeze);

Once we have the checkpointed trie, we will have to add the rollback again, perhaps as an optional rollback depending on whether we need the state or not.

Contributor

Just saw that the current implementation is a hack for testing 👍

Contributor Author

@garyschulte garyschulte Sep 12, 2023

There isn't really any cloning here. contextSafeClone exists only to reuse the flat DB storage directly, but with a different bonsai context for a different block number. For example, we don't want RPC queries to modify the bonsai context, because that would break the block suffixing when we persist.

We need to guard the bonsai context for the primary worldstate, but otherwise as long as we are not persisting, there is no need to snapshot or clone the database itself.

So "clone" is probably not the best name for this; better naming suggestions are welcome :). My thinking here is that as long as this is a 'non-persisting' mutable worldstate, we can just hand out the primary worldstate storage with whatever bonsai context the caller is requesting.
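
A minimal sketch of the idea described above, under stated assumptions: `SharedFlatStore` and `ContextView` are hypothetical names invented for illustration and are not Besu classes. It only shows that a "context-safe copy" can share the underlying storage by reference while carrying its own block context, so that changing the context of an RPC view does not disturb the primary view.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the shared flat database; not a Besu class.
final class SharedFlatStore {
  private final Map<String, String> data = new HashMap<>();

  void put(final String key, final String value) {
    data.put(key, value);
  }

  Optional<String> get(final String key) {
    return Optional.ofNullable(data.get(key));
  }
}

// Hypothetical view: shares the store by reference but owns its block context.
final class ContextView {
  private final SharedFlatStore store;
  private long blockNumber;

  ContextView(final SharedFlatStore store, final long blockNumber) {
    this.store = store;
    this.blockNumber = blockNumber;
  }

  // Returns a view over the same store with an independent context.
  // No data is copied or snapshotted.
  ContextView contextSafeCopy() {
    return new ContextView(store, blockNumber);
  }

  // Only affects this view; other views keep their own block number.
  void updateBlockContext(final long newBlockNumber) {
    this.blockNumber = newBlockNumber;
  }

  long blockNumber() {
    return blockNumber;
  }

  Optional<String> get(final String key) {
    return store.get(key);
  }
}
```

In this sketch, a view created for an RPC query can call `updateBlockContext` freely, while the primary view keeps the context it needs when persisting.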

Contributor Author

@garyschulte garyschulte Sep 12, 2023

Perhaps we should have a different BonsaiWorldState subclass that better reflects that it is non-persisting and limited to flat DB reads only. Right now we use non-persisting worldstates to propose blocks, so we might need some other "marker" to indicate that we can hand out a worldstate that is both non-persisting and non-mutable.

Contributor

I think the layered storage already does this. It is not a clone, and it prevents modification. I would like to limit the number of classes and keep the same logic as much as possible. We also have the notion of a frozen BonsaiWorldState. The idea, in my opinion, is to have a single worldstate class and hand it a storage; the differences then live at the storage level. Maybe I'm wrong, but I think we need to think about this, because we did the refactoring to reduce the number of worldstate classes and we should avoid reintroducing what we removed.

So I would say we should just have a layered storage that wraps the storage, and give that to the worldstate. We then freeze the worldstate, as we already do with Bonsai before an RPC call. After that we can decide whether we need a special worldstate for the flat DB. I think we should wait to see the implementation of the checkpointed trie. A single worldstate could then access the flat DB directly and the checkpointed trie lazily.
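
To make the "wrap, don't clone" suggestion concrete, here is a rough sketch under stated assumptions. `LayeredStore` is a hypothetical class written for this illustration; it is not Besu's layered storage, and the `freeze` method here only mimics the general idea of a frozen worldstate. Writes go to a private overlay, reads fall through to the wrapped parent, and the parent is never modified.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical layered store: wraps a parent map without copying it.
final class LayeredStore {
  private final Map<String, String> parent; // shared; never written by this layer
  private final Map<String, String> overlay = new HashMap<>();
  private boolean frozen;

  LayeredStore(final Map<String, String> parent) {
    this.parent = parent;
  }

  // After freezing, the layer rejects further writes.
  void freeze() {
    frozen = true;
  }

  // Writes land in the overlay only, so the parent stays untouched.
  void put(final String key, final String value) {
    if (frozen) {
      throw new IllegalStateException("layer is frozen");
    }
    overlay.put(key, value);
  }

  // Reads prefer the overlay and fall through to the parent.
  Optional<String> get(final String key) {
    final String local = overlay.get(key);
    return local != null ? Optional.of(local) : Optional.ofNullable(parent.get(key));
  }
}
```

The design choice illustrated is that isolation comes from the storage wrapper, so the worldstate class itself would not need a dedicated subclass.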

Contributor

Finally, I think the block-number context should be supplied when the worldstate storage is created.

BonsaiWorldState doesn't need to know about it.

@garyschulte garyschulte force-pushed the feature/multi-version-flat-db branch from ca74aab to 73edff2 Compare September 14, 2023 17:14
/**
* record type used to wrap responses from getNearestTo, includes the matched key and the value.
*
* @param key the matched (nearest) key

Check notice

Code scanning / CodeQL

Spurious Javadoc @param tags

@param tag "key" does not match any actual type parameter of type "NearestKeyValue".
* record type used to wrap responses from getNearestTo, includes the matched key and the value.
*
* @param key the matched (nearest) key
* @param value the corresponding value

Check notice

Code scanning / CodeQL

Spurious Javadoc @param tags

@param tag "value" does not match any actual type parameter of type "NearestKeyValue".
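
For readers unfamiliar with the `getNearestTo` lookup that the Javadoc above refers to, the following is a minimal sketch of how a nearest-key query can serve multi-version reads. Everything here is an assumption made for illustration: the `VersionedFlatStore` class, the string key layout (`<accountHash>:<block number as 16 hex digits>`), and the use of an in-memory `TreeMap` in place of a real key-value segment. Only the record shape (matched key plus value) follows the source.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Illustrative record mirroring the shape described in the Javadoc above.
record NearestKeyValue(String key, String value) {}

// Hypothetical in-memory stand-in for a sorted key-value segment.
final class VersionedFlatStore {
  private final TreeMap<String, String> entries = new TreeMap<>();

  // Assumed key layout; fixed-width hex keeps lexicographic and numeric order aligned
  // for non-negative block numbers.
  private static String versionedKey(final String accountHash, final long blockNumber) {
    return String.format("%s:%016x", accountHash, blockNumber);
  }

  void put(final String accountHash, final long blockNumber, final String value) {
    entries.put(versionedKey(accountHash, blockNumber), value);
  }

  // Greatest stored key that is less than or equal to the search key, if any.
  Optional<NearestKeyValue> getNearestTo(final String key) {
    final Map.Entry<String, String> e = entries.floorEntry(key);
    return Optional.ofNullable(e).map(x -> new NearestKeyValue(x.getKey(), x.getValue()));
  }

  // Value of the account as of blockNumber. The nearest entry is accepted only
  // if it belongs to the same account; otherwise the account had no value yet.
  Optional<String> getAccountAt(final String accountHash, final long blockNumber) {
    return getNearestTo(versionedKey(accountHash, blockNumber))
        .filter(n -> n.key().startsWith(accountHash + ":"))
        .map(NearestKeyValue::value);
  }
}
```

Returning the matched key alongside the value is what lets the caller check that the nearest entry actually belongs to the requested account.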
@garyschulte garyschulte mentioned this pull request Sep 19, 2023
@garyschulte garyschulte changed the title Feature/multi version flat db [skip ci] Feature/multi version flat db Sep 20, 2023
@garyschulte garyschulte force-pushed the feature/multi-version-flat-db branch from d0e7d08 to 9d9fe8c Compare October 20, 2023 20:20
@garyschulte garyschulte reopened this Oct 20, 2023
@garyschulte garyschulte force-pushed the feature/multi-version-flat-db branch from 5e359f9 to 511ec7e Compare October 20, 2023 21:34
@garyschulte garyschulte force-pushed the feature/multi-version-flat-db branch from 511ec7e to 9d9fe8c Compare October 20, 2023 21:37
@garyschulte garyschulte reopened this Oct 20, 2023
@garyschulte garyschulte force-pushed the feature/multi-version-flat-db branch from d11389d to 21a23ba Compare October 20, 2023 21:45
Signed-off-by: garyschulte <garyschulte@gmail.com>
@garyschulte garyschulte force-pushed the feature/multi-version-flat-db branch from 21a23ba to eea30f8 Compare October 20, 2023 21:49
Contributor Author

garyschulte commented May 29, 2024

Context from a conversation with @mattnelson: this is a single-column-family implementation. Read and write performance degrades as multiple versions 'pile up' in the flat worldstate column families. A partitioned version of this implementation that uses separate column families for 'hot' (read/write) and 'history' (read-only) data should prevent degradation of execution performance, and should provide adequate historical RPC performance for use cases like block explorers.

@matkt, @jframe - this would be like a manual checkpoint implementation, where the checkpoint is dynamic, and we use trie logs to push old state into the history partition.
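
A rough sketch of the hot/history split described above, with loud caveats: `PartitionedFlatStore` is hypothetical, two in-memory maps stand in for the two column families, and old values are moved to history at overwrite time as a simplification. The proposal above would use trie logs to push old state into the history partition, which this sketch does not model. It also assumes writes arrive in ascending block order.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical two-partition store: 'hot' holds only the latest value per key,
// 'history' holds superseded values keyed by the block that wrote them.
final class PartitionedFlatStore {
  // A value together with the block at which it was written.
  private record Versioned(long blockNumber, String value) {}

  private final Map<String, Versioned> hot = new HashMap<>();
  private final TreeMap<String, String> history = new TreeMap<>();

  // Assumed layout: "<key>:<block number as 16 hex digits>".
  private static String historyKey(final String key, final long blockNumber) {
    return String.format("%s:%016x", key, blockNumber);
  }

  // Assumes ascending block numbers. The superseded value moves to history,
  // so the hot partition never accumulates versions.
  void put(final String key, final long blockNumber, final String value) {
    final Versioned previous = hot.put(key, new Versioned(blockNumber, value));
    if (previous != null && previous.blockNumber() != blockNumber) {
      history.put(historyKey(key, previous.blockNumber()), previous.value());
    }
  }

  // Execution path: a single point lookup in the hot partition.
  Optional<String> getLatest(final String key) {
    return Optional.ofNullable(hot.get(key)).map(Versioned::value);
  }

  // Historical path: use the hot value if it was already current at blockNumber,
  // otherwise fall back to a nearest-key lookup in history.
  Optional<String> getAt(final String key, final long blockNumber) {
    final Versioned head = hot.get(key);
    if (head != null && head.blockNumber() <= blockNumber) {
      return Optional.of(head.value());
    }
    final Map.Entry<String, String> e = history.floorEntry(historyKey(key, blockNumber));
    if (e != null && e.getKey().startsWith(key + ":")) {
      return Optional.of(e.getValue());
    }
    return Optional.empty();
  }

  int historySize() {
    return history.size();
  }
}
```

The point of the split in this sketch is that head reads and writes touch one entry per key regardless of how much history exists.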

Contributor

matkt commented May 30, 2024

Seems to be a good idea.

Contributor

jframe commented Dec 9, 2024

@garyschulte and @matthew1001, do we still need this draft PR open?

@jframe jframe closed this Dec 9, 2024
Development

Successfully merging this pull request may close these issues.

Implement Bonsai archive storage format
3 participants