TrieLogPruner preload with 30 second timeout #7365
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
I've left a question for you, but I think it looks good and adding the timeout is a good thing!
private void preloadQueue(
    final AtomicBoolean timeoutOccurred, final ScheduledFuture<?> timeoutFuture) {

  try (final Stream<byte[]> trieLogKeys = rootWorldStateStorage.streamTrieLogKeys(loadingLimit)) {
Would it make sense to start streaming at a random key? There is a method streamFromKey() in class KeyValueStorage that allows you to pass in the starting key.
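To make the question concrete, here is a minimal sketch of what streaming from an arbitrary start key looks like against a sorted key-value store. It does not use Besu's KeyValueStorage; the class `RandomStartStreamSketch` and its methods are hypothetical, and a `TreeMap` with an unsigned bytewise comparator stands in for a store that orders keys lexicographically by byte.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-in for a sorted key-value store; not a Besu class.
class RandomStartStreamSketch {
  // Assumption: keys are ordered as unsigned bytes, lexicographically.
  private static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;

  private final TreeMap<byte[], byte[]> store = new TreeMap<>(BYTEWISE);

  void put(final byte[] key, final byte[] value) {
    store.put(key, value);
  }

  /** Returns up to {@code limit} keys, starting at the first key >= {@code startKey}. */
  List<byte[]> keysFrom(final byte[] startKey, final int limit) {
    return store.tailMap(startKey, true).keySet().stream()
        .limit(limit)
        .collect(Collectors.toList());
  }

  /** Picks a uniformly random start key of the given length. */
  static byte[] randomKey(final Random random, final int length) {
    final byte[] key = new byte[length];
    random.nextBytes(key);
    return key;
  }
}
```

One consequence of this approach is that a random start key close to the end of the key space yields fewer keys than the limit, so a caller would need to decide whether to wrap around to the first key.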
...src/main/java/org/hyperledger/besu/ethereum/trie/diffbased/common/trielog/TrieLogPruner.java
…meout Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Also reduce pruning window from 30_000 to 5_000
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: gconnect <agatevureglory@gmail.com>
PR description
The TrieLogPruner preload operation executes during startup. It streams a batch of trie logs from RocksDB, looks up their block headers to get a block number, and adds them to the prune queue. Finally, it triggers a prune operation on the prune queue, which may prune some of these trie logs if they are eligible. Note that pruning from the queue is fast.
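The preload steps described above can be sketched roughly as follows. This is an illustration only, not the TrieLogPruner code: the class `PreloadPruneSketch`, the in-memory header lookup, and the eligibility rule (a block number below chain head minus the retained window) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical sketch of the preload flow; names and pruning rule are assumptions.
class PreloadPruneSketch {
  // Prune queue: block number -> trie log keys recorded at that height.
  private final TreeMap<Long, List<String>> pruneQueue = new TreeMap<>();
  // Stand-in for the block header lookup (trie log key -> block number).
  private final Map<String, Long> blockNumberByKey;
  private final long retainBlocks;

  PreloadPruneSketch(final Map<String, Long> blockNumberByKey, final long retainBlocks) {
    this.blockNumberByKey = blockNumberByKey;
    this.retainBlocks = retainBlocks;
  }

  /** Loads keys into the queue, then prunes; returns the number of pruned trie logs. */
  int preload(final Stream<String> trieLogKeys, final long chainHead) {
    trieLogKeys.forEach(
        key ->
            // Keys with no matching header are skipped in this sketch.
            Optional.ofNullable(blockNumberByKey.get(key))
                .ifPresent(
                    number ->
                        pruneQueue.computeIfAbsent(number, n -> new ArrayList<>()).add(key)));
    return prune(chainHead);
  }

  /** Removes every queued entry strictly below (chainHead - retainBlocks). */
  int prune(final long chainHead) {
    final long cutoff = chainHead - retainBlocks;
    final Map<Long, List<String>> prunable = pruneQueue.headMap(cutoff);
    final int pruned = prunable.values().stream().mapToInt(List::size).sum();
    prunable.clear(); // headMap is a live view, so this removes the entries from the queue
    return pruned;
  }

  int queueSize() {
    return pruneQueue.values().stream().mapToInt(List::size).sum();
  }
}
```

Because the queue is sorted by block number, the prune step only touches the head of the map, which is consistent with the description's point that pruning from the queue is cheap compared with streaming keys and looking up headers.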
The preload is executed synchronously on the main thread. Under certain conditions, this can take a long time, particularly when the trie log column family contains a large number of keys.
This PR wraps the whole operation in a 30 second timeout to avoid startup hangs.
It is still synchronous and blocks the main startup thread, because running it asynchronously (as explored in #7337) degrades the performance (especially disk I/O) of the node's normal operation: block import and potentially block proposal.
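One way to bound a synchronous operation like this is to keep the work on the calling thread and let a scheduled task flip a flag when the time budget runs out. The sketch below assumes that pattern; the names `timeoutOccurred` and `timeoutFuture` mirror the parameters visible in the reviewed snippet, but the class `TimedPreloadSketch` and the method body are hypothetical and are not the PR's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

// Hypothetical sketch of a synchronous operation bounded by a timeout flag.
class TimedPreloadSketch {

  /** Consumes items on the calling thread until the stream ends or the timeout fires. */
  static List<Integer> preloadWithTimeout(
      final Stream<Integer> items, final long timeout, final TimeUnit unit) {
    final AtomicBoolean timeoutOccurred = new AtomicBoolean(false);
    final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    // The timer thread only flips a flag; all real work stays on the caller's thread.
    final ScheduledFuture<?> timeoutFuture =
        scheduler.schedule(() -> timeoutOccurred.set(true), timeout, unit);
    final List<Integer> loaded = new ArrayList<>();
    try (items) {
      // takeWhile stops pulling from the stream once the flag has been set.
      items.takeWhile(item -> !timeoutOccurred.get()).forEach(loaded::add);
    } finally {
      // Finished early or timed out: either way, release the timer.
      timeoutFuture.cancel(true);
      scheduler.shutdownNow();
    }
    return loaded;
  }
}
```

In this sketch the flag is only checked between elements, so a single slow lookup can still overrun the budget; the timeout bounds the loop, not each individual read.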
In addition, the default pruning window size is reduced from 30_000 to 5_000, which should be sufficient to cover any pruning gaps due to maintenance downtime while reducing the impact of the performance issue.
The expectation is that if the user has a backlog of trie logs, they will follow the guide to use the prune subcommand. As such, a log has been added which will be the last log line printed before the 30 second timeout window expires.
Here's an example from before this PR of a 2.5 minute load...
And now with this PR...
Fixed Issue(s)
#7322
Thanks for sending a pull request! Have you done the following?
Added the doc-change-required label to this PR if updates are required.
Locally, you can run these tests to catch failures early:
./gradlew build
./gradlew acceptanceTest
./gradlew integrationTest
./gradlew ethereum:referenceTests:referenceTests