This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Add special tag to cache specific runtime storage items. #10251

Closed
Tracked by #264
shawntabrizi opened this issue Nov 12, 2021 · 13 comments · Fixed by #12205
Labels
J0-enhancement An additional feature request. Z2-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase.

Comments

@shawntabrizi
Member

shawntabrizi commented Nov 12, 2021

The Substrate runtime takes advantage of a storage cache where the first read of a storage item places an item in the cache, and subsequent reads of that item happen in memory.

There are some storage items which we know will be accessed every block, and thus we already ignore reads of these items in our benchmarking: https://github.com/paritytech/substrate/blob/master/bin/node/runtime/src/lib.rs#L1683

let whitelist: Vec<TrackedStorageKey> = vec![
    // Block Number
    hex_literal::hex!("26aa394eea5630e07c48ae0c9558cef702a5c1b19ab7a04f536c519aca4983ac").to_vec().into(),
    // Total Issuance
    hex_literal::hex!("c2261276cc9d1f8598ea4b6a74b15c2f57c875e4cff74148e4628f264b974c80").to_vec().into(),
   ...
];

What we should introduce is some tag like #[cached] or #[fast] that can be placed on runtime storage items. It would ensure that these items are placed into memory at the beginning of a block, and thus we can ignore all storage reads of these items when calculating weights.

Imagine:

/// The total units issued in the system.
#[pallet::storage]
#[pallet::storage::cached]  // New attribute macro
#[pallet::getter(fn total_issuance)]
pub type TotalIssuance<T: Config<I>, I: 'static = ()> = StorageValue<_, T::Balance, ValueQuery>;

The pallet then collects a list of all such storage keys and exposes them as a Vec.

In benchmarking, we can use this list of cached storage items to populate the whitelist (rather than doing so manually).

Finally, as part of the block production / client process, these storage items would be read if not already in memory, and then kept in memory (even between blocks?).

Questions:

  • Can this somehow work with storage maps?
  • Can the memory cache persist across blocks? (or would we need to do one read at the beginning of a block?)
@shawntabrizi shawntabrizi added J0-enhancement An additional feature request. Z2-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase. labels Nov 12, 2021
@xlc
Contributor

xlc commented Nov 12, 2021

Can we avoid encode/decode those cached storages within a block?

@bkchr
Member

bkchr commented Nov 12, 2021

  • Can this somehow work with storage maps?

Not sure what the advantage would be in reading the entire map before building the block, when we may only read some elements of it. I mean, it makes sense to ensure that keys of the map are "pinned" in the cache so that they cannot be removed. However, whether we read a value before building the block or while building the block (in both cases the value is cached after the first read), it makes no difference in execution time.

  • Can the memory cache persist across blocks? (or would we need to do one read at the beginning of a block?)

With cache pinning we could ensure that stuff stays in memory across blocks. However, this gets more complicated when it comes to forks, because we also need to unpin data again (e.g. there was a runtime upgrade with new keys that are relevant).

We also need to make sure that we don't break parachains, because with parachains we need to collect the proof, and that requires that we "bubble" up the read to the point where the proof is collected.

The examples you have given, e.g. the block number, are irrelevant for this request anyway. The block number is written at the beginning of the block, and then we hit it in the overlay, aka our "fastest cache".

@shawntabrizi
Member Author

Here is a more specific breakdown of the parts of this PR:

There should be a new trait defined in frame/support/src/traits/storage.rs:

pub trait WhitelistedStorageKeys {
    fn whitelisted_storage_keys() -> Vec<TrackedStorageKey>;
}

In this case, TrackedStorageKey is already defined in primitives/storage/src/lib.rs.

This new trait should be implemented for every pallet, where by default it returns an empty Vec.

This trait should be implemented for tuples so that it can support the AllPallets tuple: https://crates.io/crates/impl-trait-for-tuples/0.2.2

See also: https://github.com/paritytech/substrate/pull/11813/files

Each pallet should be able to tag storage items like so:

	/// The current block number being processed. Set by `execute_block`.
	#[pallet::storage]
	#[pallet::storage::cached]
	#[pallet::getter(fn block_number)]
	pub(super) type Number<T: Config> = StorageValue<_, T::BlockNumber, ValueQuery>;

Then in the macro generation, this will populate the storage keys which should be whitelisted.

You should be able to query the storage key for each storage item using: https://paritytech.github.io/substrate/master/frame_support/traits/struct.StorageInfo.html

Once you have collected all the storage keys from the macro, each pallet should automatically generate something like:

impl WhitelistedStorageKeys for Pallet {
    fn whitelisted_storage_keys() -> Vec<TrackedStorageKey> {
        vec![key1, key2, key3,...]
    }
}

NOTE: We may need to look at pallet instances here, since different instances may have different keys, and we wouldn't want to statically generate just one set of keys.
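To illustrate why instances matter: a storage item's final key is derived from the pallet's prefix, which differs per instance, so each instance contributes different keys. A self-contained sketch of the idea (using a std hasher purely as a stand-in for Substrate's twox128, and hypothetical prefix names):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in prefix hash. In Substrate the final key is
// twox128(pallet_prefix) ++ twox128(storage_prefix); a std hasher is
// used here only so the example runs without external crates.
fn prefix_hash(name: &str) -> Vec<u8> {
    let mut h = DefaultHasher::new();
    name.hash(&mut h);
    h.finish().to_be_bytes().to_vec()
}

// Compute the (stand-in) final storage key for a pallet/storage pair.
fn final_key(pallet_prefix: &str, storage_prefix: &str) -> Vec<u8> {
    let mut key = prefix_hash(pallet_prefix);
    key.extend(prefix_hash(storage_prefix));
    key
}

fn main() {
    // Two instances of the "same" pallet get distinct pallet prefixes,
    // so the same storage item has a different final key per instance.
    let default_instance = final_key("Balances", "TotalIssuance");
    let second_instance = final_key("Balances2", "TotalIssuance");
    assert_ne!(default_instance, second_instance);
    println!("instance keys differ as expected");
}
```

This is why a statically generated single set of keys would be wrong for instantiable pallets: the generated impl needs to compute keys per instance.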

Once this is done, then we should be able to simply replace the call for whitelisting keys in the benchmarking setup:

// let whitelist: Vec<TrackedStorageKey> = vec![
// 	// Block Number
// 	hex_literal::hex!("26aa394eea5630e07c48ae0c9558cef702a5c1b19ab7a04f536c519aca4983ac").to_vec().into(),
// 	// Total Issuance
// 	hex_literal::hex!("c2261276cc9d1f8598ea4b6a74b15c2f57c875e4cff74148e4628f264b974c80").to_vec().into(),
// 	// Execution Phase
// 	hex_literal::hex!("26aa394eea5630e07c48ae0c9558cef7ff553b5a9862a516939d82b3d3d8661a").to_vec().into(),
// 	// Event Count
// 	hex_literal::hex!("26aa394eea5630e07c48ae0c9558cef70a98fdbe9ce6c55837576c60c7af3850").to_vec().into(),
// 	// System Events
// 	hex_literal::hex!("26aa394eea5630e07c48ae0c9558cef780d41e5e16056765bc8461851072c9d7").to_vec().into(),
// 	// System BlockWeight
// 	hex_literal::hex!("26aa394eea5630e07c48ae0c9558cef734abf5cb34d6244378cddbf18e849d96").to_vec().into(),
// 	// Treasury Account
// 	hex_literal::hex!("26aa394eea5630e07c48ae0c9558cef7b99d880ec681799c0cf30e8886371da95ecffd7b6c0f78751baa9d281e0bfa3a6d6f646c70792f74727372790000000000000000000000000000000000000000").to_vec().into(),
// ];

let whitelist = AllPallets::whitelisted_storage_keys();

Note the commented lines should be replaced by this new AllPallets::whitelisted_storage_keys() call, which relies on the tuple implementation of the trait.

@bkchr
Member

bkchr commented Aug 29, 2022

#[pallet::storage::cached]

I would propose something like #[benchmarking(cached)] or #[benchmarking(whitelisted)].

@sam0x17
Contributor

sam0x17 commented Aug 31, 2022

@shawntabrizi just a note that if I try to derive impl_for_tuples(5) on the trait, it won't work in automatic mode. Do we want to try to do semi-automatic mode? Could you elaborate a bit more about the scenarios where we need to use this with tuples?

error: Not supported by full-automatic tuple implementation. Use semi-automatic tuple implementation for more control of the implementation.
   --> frame/support/src/traits/storage.rs:103:35
    |
103 |     fn whitelisted_storage_keys() -> Vec<TrackedStorageKey>;
    |                                   ^^

@shawntabrizi
Member Author

shawntabrizi commented Aug 31, 2022

Yes, you should do a semi-automatic implementation which concatenates the vectors together.

We need the tuple implementation so that we can reference all the whitelisted keys of all the pallets; there is no other smooth way to gather them across all the pallets.

The impl should look something like:

impl<A, B> WhitelistedStorageKeys for (A, B)
where
    A: WhitelistedStorageKeys,
    B: WhitelistedStorageKeys,
{
    fn whitelisted_storage_keys() -> Vec<TrackedStorageKey> {
        let mut keys = A::whitelisted_storage_keys();
        keys.extend(B::whitelisted_storage_keys());
        keys
    }
}

This is how you would do it for 2-item tuples, and impl_for_tuples should allow you to scale that to N.

There should be examples in Substrate or in the impl_for_tuples documentation.
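As a self-contained sketch of the 2-tuple case (with a minimal stand-in for TrackedStorageKey and two hypothetical pallets, so the example runs outside Substrate):

```rust
// Minimal stand-in for the type defined in primitives/storage.
#[derive(Debug, Clone, PartialEq)]
pub struct TrackedStorageKey {
    pub key: Vec<u8>,
}

pub trait WhitelistedStorageKeys {
    fn whitelisted_storage_keys() -> Vec<TrackedStorageKey>;
}

// The pattern impl_for_tuples generates in semi-automatic mode:
// call each element's associated function and concatenate the results.
impl<A, B> WhitelistedStorageKeys for (A, B)
where
    A: WhitelistedStorageKeys,
    B: WhitelistedStorageKeys,
{
    fn whitelisted_storage_keys() -> Vec<TrackedStorageKey> {
        let mut keys = A::whitelisted_storage_keys();
        keys.extend(B::whitelisted_storage_keys());
        keys
    }
}

// Hypothetical pallets, each exposing one whitelisted key.
struct PalletA;
struct PalletB;

impl WhitelistedStorageKeys for PalletA {
    fn whitelisted_storage_keys() -> Vec<TrackedStorageKey> {
        vec![TrackedStorageKey { key: vec![0xaa] }]
    }
}

impl WhitelistedStorageKeys for PalletB {
    fn whitelisted_storage_keys() -> Vec<TrackedStorageKey> {
        vec![TrackedStorageKey { key: vec![0xbb] }]
    }
}

fn main() {
    // The tuple impl aggregates keys across pallets, as AllPallets would.
    let keys = <(PalletA, PalletB)>::whitelisted_storage_keys();
    assert_eq!(keys.len(), 2);
    println!("collected {} whitelisted keys", keys.len());
}
```

The impl_for_tuples crate then scales this same concatenation pattern to tuples of arbitrary arity, which is what the AllPallets tuple needs.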

@sam0x17
Contributor

sam0x17 commented Sep 6, 2022

@shawntabrizi just an update: I was struggling for a bit trying to figure out how best to capture the #[benchmarking(cached)] macro calls in aggregate, and am now doing this from within the parsing code in frame/support/procedural/src/pallet/parse/mod.rs. Doing a bit of yak shaving so I can scan for such calls in a hygienic way. Let me know if this is not where I should be detecting these. Some rudimentary/trivial code that detects #[benchmarking(cached)] calls can be found here: 63b2a1d. Right now I am rewriting this to use syn for the parsing, as what I have right now is just hacky and string-based. Hopefully this is what you intended!

@kianenigma
Contributor

Is this only meant for benchmarking, or will also do something in production?

If the former, then it should be called #[benchmarking(whitelisted)] as @bkchr said. If the latter, I honestly don't quite agree with it. This is equivalent to: in any pallet, if you want some storage items to be cached, read them all in on_initialize. The caveat is that if for whatever reason a storage item was not needed during that block, we'd be wasting resources.

Also, you have to be careful about where and when you read these items. When we do block import, these cache semantics make sense, but when building blocks, every transaction is applied on top of a fresh runtime and anything that was cached is now gone. I am not sure what the benchmarking assumptions are here. I guess we are benchmarking for block import, not authorship.

@sam0x17
Contributor

sam0x17 commented Sep 6, 2022

I agree @kianenigma. Perhaps instead of having it manually read those storage items just to ensure they are cached, we could instead statically determine that there is at least one read of the item, and panic in the attribute macro otherwise. That way the data access is still up to the programmer, and you won't be able to whitelist something that isn't actually read. Easier said than done, of course, but this would definitely be the best developer experience if I understand the problem correctly.

@bkchr
Member

bkchr commented Sep 6, 2022

@shawntabrizi just an update: I was struggling for a bit trying to figure out how best to capture the #[benchmarking(cached)] macro calls in aggregate, and am now doing this from within the parsing code in frame/support/procedural/src/pallet/parse/mod.rs. Doing a bit of yak shaving so I can scan for such calls in a hygienic way. Let me know if this is not where I should be detecting these. Some rudimentary/trivial code that detects #[benchmarking(cached)] calls can be found here: 63b2a1d. Right now I am rewriting this to use syn for the parsing, as what I have right now is just hacky and string-based. Hopefully this is what you intended!

I think it goes in the right direction.

@ggwpez
Member

ggwpez commented Sep 6, 2022

I would propose something like #[benchmarking(cached)] or #[benchmarking(whitelisted)].

Wait a second, I thought this change would also affect production, not just benchmarking.
To explain my understanding:

  • #[pallet::storage::cached]: Access the value in on_initialize and whitelist it in benchmarks.
  • #[benchmarking(whitelisted)]: Only whitelist in benchmarking without production change.

@shawntabrizi which one did you mean now?

@sam0x17
Contributor

sam0x17 commented Sep 6, 2022

Happy to do both separately btw if that makes sense

@sam0x17
Contributor

sam0x17 commented Sep 7, 2022

Note that it now uses proper parsing to find the #[benchmarking(cached)] calls as of 2bbb92e

@sam0x17 sam0x17 moved this from Backlog to In progress in Runtime / FRAME Sep 7, 2022
Repository owner moved this from In progress to Done in Runtime / FRAME Sep 16, 2022