This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Fix false positives in changes tries #2865

Closed
wants to merge 1 commit

Conversation

svyatonik
Contributor

@svyatonik svyatonik commented Jun 14, 2019

Long story short: changes tries sometimes contain false positives, and this PR fixes that. That is: when a key holds original_value at the beginning of a block AND the block 'changes' it to the same original_value, that change currently appears in the changes trie of the block. After applying this fix, it won't.

But this PR is a bit more than just a bug fix - I'd like to check whether we actually need to fix this. It is a bug, for sure, in the context of the original specification. But we could probably benefit from it and turn it into a feature. I'm referring to #2491, where I lied (sorry for that) to @pepyakin when he asked whether changes tries will include notifications about changes like that.

Also I believe this closes #2826 - I've tested this RPC locally and found that it actually issues 1 entry per change, not per block (at least when changes tries are used). The only exception is these 'false changes' - an entry is issued for every block that tries to change the value (even though it changes it from original_value to original_value), which seems redundant.

So here are pros of fixing the issue:

  • it is the right thing to do, because light (and RPC) implementations will have a 100% guarantee that the value has actually been changed;
  • changes tries are smaller => a small performance boost, especially for digest tries.

And here are cons:

  • the fix requires an additional Backend::storage() call for every key touched by Ext::set_storage();
  • we need workarounds (like the one suggested in Light client friendly events #2491) when we want light clients to be notified about false changes.

closes #2826 (I believe - @pepyakin, could you please clarify the case?)

@svyatonik svyatonik added A0-please_review Pull request needs code review. M4-core labels Jun 14, 2019
@svyatonik svyatonik requested review from gavofyork and pepyakin June 14, 2019 12:39
@gavofyork
Member

I'm generally happy to proceed with fixing it on the basis that it's probably better to optimise for light clients than for full clients. That said do we have any idea about how much of a performance cost the extra storage() call is?

@svyatonik
Contributor Author

do we have any idea about how much of a performance cost the extra storage() call is

We don't have exact benches - I'm currently only measuring block import time. Before this PR it takes on average ~0.0466s to import a block with 5000 different key changes + changes trie. After this PR it takes ~0.0689s (~48% more) to import the same blocks. So it is roughly (0.0689 - 0.0466)/5000 ~0.000_004_460s per touched key.

@Demi-Marie
Contributor

do we have any idea about how much of a performance cost the extra storage() call is

We don't have exact benches - I'm currently only measuring block import time. Before this PR it takes on average ~0.0466s to import a block with 5000 different key changes + changes trie. After this PR it takes ~0.0689s (~48% more) to import the same blocks. So it is roughly (0.0689 - 0.0466)/5000 ~0.000_004_460s per touched key.

Could this be a problem for blockchain throughput?

@svyatonik
Contributor Author

Yes, you're right - every additional op here could be a problem for blockchain throughput. Luckily, since Substrate is a framework, this is mostly a configuration problem, imo. You have many options: (1) disable changes tries entirely, (2) limit the number of SSTORE-like ops, (3) play with the changes tries parameters.

What I'm currently trying to achieve is to select a default CT configuration that sits somewhere between the needs of our chain(s) (which are currently rather small - we have 18-22 update ops per block in all Emberic Elm blocks) and the needs of early ethereum-like chains. I'm currently syncing the ethereum chain, analyzing the max+min+median of SSTORE+log+create+suicide+balance ops (ops that affect state), trying to guess the best defaults.

What I currently have is: (1) small CT optimizations in #2840 (the numbers there are off => still marked in-progress), (2) the assumption that there shouldn't be level-2 CT digests at all (by default) - it seems we will be unable to cover large block ranges with CT anyway given current performance, (3) the idea that we need a changeable CT configuration, because chains evolve.

@gavofyork
Member

50% extra import time is probably not something we can countenance. How many keys tend to have these "false changes"? Any idea which keys?

@svyatonik
Contributor Author

When I'm running current master with 2 authorities, there are 3 typical block patterns:

  • with 31 keys changed, of which 14-15 are 'false changes';
  • with 19 keys changed, of which 12-13 are 'false changes';
  • with 26 keys changed, of which 12 are 'false changes'.

These ^^^ false changes also include a significant number of 'temporary value' changes (values that are set at block initialization and unset at finalization). Temporary values were already eliminated from changes tries (also by calling Backend::storage()) before this PR.

Typical 'temporary values' are the block number and parent hash from the system module, or, for example, GasSpent from the contracts module. The remaining 'false changes' could be literally anywhere set_storage() is called.

A related fact is that there's a state cache => so if a value is read by the runtime in the same block where it is 'falsely updated', the check introduced in this PR is just RwLock::read() + HashMap::get(). I'm not sure whether the 'read-then-update' pattern is used more or less often than the simple 'update' pattern, though.

Re 50% increase of block import time: for blocks that are heavily using storage, introducing changes tries (disregarding this PR) could increase import-time by >100%, especially when creating digest tries (see #2840 - numbers are still inaccurate, but they'll give you an overview).

@gavofyork
Member

i think the read-then-update is quite common - indeed i can't think of any instance of "just-update" across the runtime, so perhaps many of the fake changes can be alleviated much more efficiently as part of set_storage itself.

in any case, i don't think we can merge this PR with the cost as high as it is.

@pepyakin pepyakin added A3-in_progress Pull request is in progress. No review needed at this stage. and removed A0-please_review Pull request needs code review. labels Jun 17, 2019
@pepyakin pepyakin removed their request for review June 17, 2019 10:13
@pepyakin
Contributor

Marking this as in-progress and unassigning myself (just for cleaning my queue) from the review for now then. Feel free to re-request review though!

@svyatonik
Contributor Author

indeed i can't think of any instance of "just-update" across the runtime,

All the temporaries (block number, parent block hash, extrinsic index, extrinsics count, ...) + all ::kill() calls + all clear_prefix() calls + all updates of accumulate-like values (like Median from the finality tracker) work using that pattern. There are actually a lot of such calls.

so perhaps many of the fake changes can be alleviated much more efficiently as part of set_storage itself.

I'm not sure - how? You can't distinguish between fake and non-fake calls just by looking at the passed values; you need to access the stored value anyway. And set_storage() is called many times, while the changes trie is built once per block => we would need to invent some additional cache (or we'd end up with multiple get_storage() calls for the same key) instead of just calling it once naturally.

===

Let's probably focus on 3 things (though the 3rd one is probably out of scope):

  1. is this an issue that should be fixed? I think we have agreed on that: it should be fixed;
  2. is there a better way to fix the issue than the one proposed? I haven't found a better way (yet?) - the only way to find out whether a change is fake is to look up the previous value in storage, and the PR makes the minimal number of calls for that - 1 per key (ok, a bit more if a key is both in prospective && committed - that could be fixed). We could probably try some completely different approach to get below 1 call per key - like explicitly marking some storage keys as temporary and ignoring those keys in CT;
  3. what is an acceptable overhead for the fix? This actually depends (imo) on the overall overhead of CT - if we're not ready to accept any significant negative impact on current block import time at all, then perhaps a trie isn't the structure we need?

@svyatonik
Contributor Author

P.S.: the alternative to CT (#425) should be faster if we limit the max number of keys that can be cleaned at every block (so that we don't end up with a block that tries to clean up 500_000 keys).

@svyatonik svyatonik added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Jun 17, 2019
@gavofyork
Member

  1. when did we agree on that? i'm not particularly convinced now, tbh.
  2. this is only efficient if the key was not read before. something like ::mutate, for example, would be able to prevent the unnecessary set_storage at source.
  3. there are two overheads: one relevant primarily for light-clients and the other for full-nodes. with this "fix", full-nodes take a lot longer but light-clients are better optimised. without it, light-clients get told of more changes than are strictly needed and full-nodes don't need to do anything more than now.

i don't think this should be fixed in this way. if we want to alleviate "fake changes" at all, then it should probably be by restricting usage of set_storage within the runtime in order to minimise the instances of redundant sets.

@gavofyork
Member

labelling gotissues as i've now convinced myself that this isn't the way we should be "fixing" this.

@gavofyork gavofyork added A4-gotissues and removed A0-please_review Pull request needs code review. labels Jun 18, 2019
@svyatonik svyatonik closed this Jun 19, 2019
@svyatonik svyatonik deleted the fix_false_positives_in_ct branch June 21, 2019 06:33
Development

Successfully merging this pull request may close these issues.

state_queryStorage is too verbose