Add trie log pruning triggered after trie log persist #6026
Conversation
@@ -203,6 +204,10 @@ public Optional<byte[]> getTrieLog(final Hash blockHash) {
    return trieLogStorage.get(blockHash.toArrayUnsafe());
  }

  public Stream<byte[]> streamTrieLogs() {
    return trieLogStorage.streamKeys();
Used to preload the pruner. Need to test the feasibility of this on real nodes with lots of trie logs, might need to limit it.
Yes, seems dangerous to do that without a limit.
agreed. If we do not close the RocksIterator, that is going to lead to problems. I would suggest passing in a Consumer or something like that so we can compartmentalize the iterator open/close. Or do like streamFlat* methods and supply a Predicate or row limit, collect to a map and return.
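A minimal sketch of the consumer-based shape suggested here, so the iterator's open/close stays inside the storage layer; the interface and method names below are assumptions, not the actual Besu API:

```java
import java.util.function.Consumer;
import java.util.stream.Stream;

class TrieLogKeyStreamingSketch {

  // Stand-in for the key-value storage; assumed to wrap a RocksIterator that must be closed.
  interface KeyStorage {
    Stream<byte[]> streamKeys();
  }

  private final KeyStorage trieLogStorage;

  TrieLogKeyStreamingSketch(final KeyStorage trieLogStorage) {
    this.trieLogStorage = trieLogStorage;
  }

  // The caller only supplies a Consumer, so the iterator is opened and closed here
  // and cannot leak out of the storage layer.
  public void forEachTrieLogKey(final long limit, final Consumer<byte[]> consumer) {
    try (Stream<byte[]> keys = trieLogStorage.streamKeys().limit(limit)) {
      keys.forEach(consumer);
    }
  }
}
```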
Now I'm simply using streamKeys().limit(..), hopefully that is safe! More info about the new approach is in the updated description.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TrieLogPruner {
I would rather see the pruning implemented in a subclass of CachedWorldStorageManager. This feels like a management function that could benefit from knowing the current state of the cache. We could ensure that DEFAULT_MAX_BLOCKS_TO_PRUNE doesn't conflict with AbstractTrieLogManager.RETAINED_LAYERS, etc.
Having a separate pruner class breaks encapsulation imo.
I think the main problem with this is that at the CachedWorldStorageManager level, the cachedWorldStatesByHash cache only maintains canonical blocks, not forks... in other words, CachedWorldStorageManager.addCachedLayer (and therefore scrubCachedLayers) is only called for canonical blocks (when the head is moved during an FCU AFAICT). So we'd miss out on pruning forks this way.
Am still investigating potential benefits of the pruning being aware of this cache.
As discussed in person, these concepts (and therefore caches) are separated out now. There may still be a task to align these in code rather than relying on configuration.
trieLogPruner.rememberTrieLogKeyForPruning(
    forBlockHeader.getNumber(), forBlockHeader.getBlockHash().toArrayUnsafe());
since block number isn't unique for trielogs, we will get collisions during reorgs that could result in stale trielogs hanging around. depending on how the initial/startup cleanup is implemented, this might not be a major concern.
Yep, the data structure holds forks for pruning too (it's a multimap).
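For illustration, a minimal sketch of a fork-tolerant queue along these lines; the class and field names are assumptions rather than the actual Besu code:

```java
import java.util.Arrays;
import java.util.Comparator;

import com.google.common.collect.Multimap;
import com.google.common.collect.TreeMultimap;

class ForkAwarePruneQueueSketch {

  // Keyed by block number, highest first; a reorged/orphaned block at the same height
  // simply adds a second value under that key instead of overwriting the canonical entry.
  private final Multimap<Long, byte[]> trieLogKeysByBlockNumber =
      TreeMultimap.<Long, byte[]>create(Comparator.reverseOrder(), Arrays::compare);

  void rememberTrieLogKeyForPruning(final long blockNumber, final byte[] trieLogKey) {
    trieLogKeysByBlockNumber.put(blockNumber, trieLogKey);
  }
}
```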
Test results for this first iteration (with no startup loadingLimit) on some nodes: most of the work is in streaming all the trie log entries into the pruning queue. The pruning windows themselves were performant since RocksDB marks entries for deletion; the range of time taken:
TrieLogPruner loads all trie logs on startup. Each time a trie log is persisted, the pruner is run and uses a pruning window (currently hardcoded to 1000) to chip away at an initially large backlog of trie logs. Once the backlog is cleared, each prune run should just be a single trie log. Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Force-pushed from 757ab62 to 161b05b
Defaults to 0 = unlimited threshold === pruning disabled. Wire in the TrieLogPruner based on this. Initialize the pruner on startup to preload the cache using the loadingLimit (currently hardcoded as 1000). Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Testing various trie log limits for preloading the pruning cache. Tested on a 3-month-old node. (Throwaway code for this test: #6108)
Rename method. Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Ready for code review, but leaving in draft until testing is complete.
(...instead of --Xbonsai-trie-log-retention-threshold=0.) Use 512 as default and also minimum value for --Xbonsai-trie-log-retention-threshold. Validate that --Xbonsai-trie-log-prune-limit is positive. Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
…imit Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Renames and logging. Refactor test. Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Separate list is for logging reasons. Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Second and third iterations tested on the following deployments:
Last 12 hours (charts not reproduced): New Payload, FCU, Java Memory Used, Java Memory Committed. The mean response times and memory usage appear to be randomly spread across the different deployment versions, so I don't think there's a significant difference between them.
LGTM as it's an experimental feature; we will have to do other PRs for flag renaming and to call DB optimization.
lines.add("Trie log pruning enabled:"); | ||
lines.add(" - retention threshold: " + trieLogRetentionThreshold + " blocks"); | ||
if (trieLogPruningLimit != null) { | ||
lines.add(" - prune limit: " + trieLogPruningLimit + " blocks"); | ||
} |
nit: seems a little verbose to be using three lines in the config logging. Can it be shortened to one line?
Done this here: 6b27522
Looks like this:
# Trie log pruning enabled: retention: 512; prune limit: 30000 #
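For reference, a sketch of how the condensed single-line summary above could be built; the variable names mirror the earlier diff, and the surrounding config-overview builder code is assumed:

```java
import java.util.ArrayList;
import java.util.List;

class ConfigSummarySketch {

  // Builds the condensed one-line summary shown above instead of three separate lines.
  static List<String> pruningSummary(
      final long trieLogRetentionThreshold, final Long trieLogPruningLimit) {
    final List<String> lines = new ArrayList<>();
    lines.add(
        "Trie log pruning enabled: retention: "
            + trieLogRetentionThreshold
            + "; prune limit: "
            + trieLogPruningLimit);
    return lines;
  }
}
```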
I set up a reorg test based on this hive test, with some modifications to send finalized blocks trailing the head by 10 blocks; note it uses a mock CL. The test was set up with an artificially low retention threshold, and the abbreviated logs (not reproduced here) show the pruning behaviour through the reorg.
(I intend to turn this into a Besu AT in a future PR.)
LGTM, seems to be fine with the finalized block modification.
@BeforeEach
public void setup() {
  Configurator.setLevel(LogManager.getLogger(TrieLogPruner.class).getName(), Level.TRACE);
Why is this one needed for the test?
It's not needed - purely convenience for comprehending the tests when you run them. I thought it was useful enough to leave in but happy to remove.
        .addArgument(loadingLimit)
        .log();
    try {
      final Stream<byte[]> trieLogKeys = rootWorldStateStorage.streamTrieLogKeys(loadingLimit);
Think this should be done in a try-with-resources so that we close the stream
done 13f0040
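A minimal sketch of the caller-side try-with-resources shape referenced by that commit; the storage interface below is a stand-in, not the real Besu type:

```java
import java.util.stream.Stream;

class PreloadSketch {

  // Stand-in for the storage facade used in the diff above.
  interface WorldStateKeyValueStorage {
    Stream<byte[]> streamTrieLogKeys(long limit);
  }

  void preload(final WorldStateKeyValueStorage rootWorldStateStorage, final long loadingLimit) {
    // The stream wraps a RocksDB iterator, so try-with-resources guarantees it is closed.
    try (Stream<byte[]> trieLogKeys = rootWorldStateStorage.streamTrieLogKeys(loadingLimit)) {
      trieLogKeys.forEach(
          key -> {
            // add each key to the pruner queue (elided in this sketch)
          });
    }
  }
}
```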
Tested the pruner cache/queue size: with the default retention of 512 blocks, the size of the cache/queue is ~140 KB.
In a non-finality event this queue is unbounded, however since the size is small I don't think this will be a problem; more likely other parts of Besu would break first. For context, the recent non-finality event lasted a max of 9 epochs, at a max of 32 blocks per epoch = 288 blocks. To reach megabyte magnitude, the non-finality event would have to be 3,636 blocks long (1 MB / 275 bytes) = 114 epochs = 12 hours of non-finality.
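A quick back-of-envelope check of those numbers (all values taken from the comment above):

```java
public class QueueSizeEstimate {
  public static void main(final String[] args) {
    final long bytesPerEntry = 275; // implied by ~140 KB for a 512-entry queue
    final long retention = 512;
    System.out.println(retention * bytesPerEntry);   // 140_800 bytes, roughly 140 KB
    final long blocksForOneMb = 1_000_000 / bytesPerEntry;
    System.out.println(blocksForOneMb);              // 3_636 blocks to reach 1 MB
    System.out.println(blocksForOneMb / 32);         // ~113 epochs (the comment rounds to 114)
    System.out.println(blocksForOneMb * 12 / 3600);  // ~12 hours at 12 s per slot
  }
}
```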
LGTM
Checking the performance and the implementation, I think this should be done asynchronously so as not to add overhead to the block processing time. Also, I think this could be handled on either the onBlockAdded event or the onTrieLogAdded event, processing the event asynchronously.
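A minimal sketch of the asynchronous handling suggested here; the event hook and method names are hypothetical, not the actual Besu event API:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncPruningSketch {

  private final ExecutorService pruningExecutor = Executors.newSingleThreadExecutor();

  // Hypothetical hook, invoked when a trie log is persisted or a block is added.
  void onTrieLogAdded(final long blockNumber, final byte[] trieLogKey) {
    // Hand the work to a background thread so block import latency is unaffected.
    pruningExecutor.submit(() -> pruneIfEligible(blockNumber, trieLogKey));
  }

  private void pruneIfEligible(final long blockNumber, final byte[] trieLogKey) {
    // pruning logic elided in this sketch
  }
}
```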
Test on the existing validator canary once code is finalized (which should have orphaned trielogs). TL;DR: it works 🎉. Tested on a Sepolia validator node; trie log count before enabling pruning:
As expected, there are orphaned trielogs due to block creation. The number is high because this sepolia node runs 20 / 1972 validators = 1% of the network, so it does an unusual amount of block creation compared to mainnet nodes.
Log snippets:
Example of the initial load picking up a batch of trielogs. Note, we don't control the order of the trielog loading due to the way they are stored - it appears to be quite random. The upshot is that not all of them are eligible for pruning immediately because they may fall inside the retention window...
After that (and every restart), there's a period of no pruning while the queue builds up its 512-block retention window. This period lasts ~2 hours. This is only the case for pre-existing nodes with a larger backlog of trielogs; if the total number of trielogs is <= 30,000 then we will prune most on load and have a full retention window to begin with. Note pruning triggers on both the block processing thread and the block building thread...
During this time we prune the odd trielog, which I believe occurs when a queue entry that was preloaded during startup becomes eligible. Once the queue size reaches the 512-block retention window, we start pruning ~1 block every block (one in, one out).
When we propose a block, we actually save multiple trie logs, but only the latest one makes it into the chain (hence the orphaned trielogs). During pruning (512 blocks later), these will show up as a list of trie logs stored against the same block number (similar to forks).
- Toggled with --Xbonsai-trie-log-pruning-enabled
- Introduces TrieLogPruner which loads a limited number of trie logs on startup to preload the pruner queue, based on loadingLimit with a default value of 30,000 blocks (configured with --Xbonsai-trie-log-pruning-limit).
- Each time a trie log is persisted it is added to the pruner queue and then the pruner is run against the queue, which will prune trie logs associated with block numbers below the --Xbonsai-trie-log-retention-threshold (default 512).
- Once the retention threshold is reached, each prune run should just be a single trie log.
- Prune any orphaned trielogs that were created during block creation.
- Don't prune non-finalized blocks for PoS chains.

---------
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Signed-off-by: Justin Florentine <justin+github@florentine.us>
First piece of #5390
Feature enabled with --Xbonsai-trie-log-pruning-enabled (since renamed to --Xbonsai-limit-trie-logs-enabled).

TrieLogPruner loads a limited number of trie logs on startup to preload the pruner queue, based on loadingLimit with a default value of 30,000 blocks (configured with --Xbonsai-trie-log-prune-limit, since renamed to --Xbonsai-trie-logs-pruning-window-size). We immediately prune anything that is eligible for pruning following the load.

Each time a trie log is persisted it is added to the pruner queue and then the pruner is run against the queue, which will prune trie logs associated with block numbers below the retention threshold, originally --Xbonsai-trie-log-retention-threshold (which defaulted to 0 === disabled) and now using --bonsai-historical-block-limit.

The retention-threshold is approximately the size of the trie log pruning queue (it is measured in blocks but the multimap queue may contain forks, so they can differ).

For PoS networks, we read the finalizedBlock hash from the blockchain, load the header and use that to ensure we don't prune non-finalized blocks which might be subject to reorg. Since the minimum retention is currently 512, there would have to be a long non-finality event and reorg for this check to be needed. It is a nice safety net to have though, and opens the possibility to prune closer to head in the future.
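A rough sketch of the retention plus finality check described above; the names and exact comparison semantics are assumptions, not the actual Besu implementation:

```java
final class PruneEligibilitySketch {

  // A trie log is prunable only if its block is both outside the retention window
  // and not newer than the finalized block (so a reorg cannot resurrect it).
  static boolean shouldPrune(
      final long blockNumber,
      final long chainHeadNumber,
      final long retentionThreshold,
      final long finalizedBlockNumber) {
    final boolean outsideRetentionWindow = blockNumber < chainHeadNumber - retentionThreshold;
    final boolean finalized = blockNumber <= finalizedBlockNumber;
    return outsideRetentionWindow && finalized;
  }
}
```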
Trie log backlog management
There are two reasons the database might contain old trie logs that don't exist in the queue; one is that restarting Besu empties the queue but leaves the database with the trie logs associated with the blocks above the retention-threshold.

The backlog is gradually pruned: a single set of trie logs is loaded each time Besu is started up, which ensures pruneable trie logs aren't forgotten when Besu gets restarted, whilst avoiding any complicated logic for progressively managing the backlog.
A loadingLimit > retention-threshold is desirable since, if Besu is restarted, we want to at least preload the amount of trie logs that were forgotten by the queue. Once the backlog is cleared, each prune run should just be a single trie log.
Depending on the size of the backlog, it is possible that it will never be cleared. This is particularly true for nodes that have been running before this feature was enabled. To mitigate this, we can offer one of two options.
Tasks:
- Prune orphaned trie logs (trie logs that exist in the database but not on the blockchain; neither canonical nor a fork)
- Prevent block building from saving orphaned trie logs: see below