I'm generally happy to proceed with fixing it on the basis that it's probably better to optimise for light clients than for full clients. That said, do we have any idea about how much of a performance cost the extra read imposes?
We don't have exact benches - I'm currently only measuring block import time. Before this PR it takes on average ~0.0466s to import a block with 5000 different key changes, with changes trie enabled. After this PR it takes ~0.0689s (~30% more) to import the same blocks. So it is roughly (0.0689 - 0.0466) / 5000 ≈ 0.000_004_460s per key change.
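For reference, the per-operation overhead quoted above falls out of simple arithmetic over the measured averages (a minimal sketch; only the numbers come from the benchmark, the variable names are illustrative):

```rust
fn main() {
    // averaged block import times from the benchmark above (seconds)
    let before_pr = 0.0466_f64;
    let after_pr = 0.0689_f64;
    let key_changes_per_block = 5000.0_f64;

    // extra cost attributable to a single key change
    let per_op = (after_pr - before_pr) / key_changes_per_block;
    assert!((per_op - 0.000_004_46).abs() < 1e-8);
    println!("~{:.9}s per key change", per_op);
}
```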
Could this be a problem for blockchain throughput?
Yes, you're right - every additional op here could be a problem for blockchain throughput. Luckily, since Substrate is a framework, this is mostly a configuration problem, imo. You have many options: (1) disable changes tries entirely; (2) limit the number of SSTORE-like ops; (3) play with changes tries parameters. What I'm currently trying to achieve is to select a default CT configuration that sits somewhere between the needs of our chain(s) (which are currently rather small - we have 18-22 update ops per block in all Emberic Elm blocks) and the needs of early ethereum-like chains. I'm currently syncing an ethereum chain and analyzing the max/min/median of SSTORE+log+create+suicide+balance ops (ops that affect state), thus trying to guess the best defaults. What I currently have is: (1) small CT optimizations in #2840 (the numbers there are wrong => still marked in-progress); (2) the assumption that there shouldn't be level-2 CT digests at all (by default) - it seems we will be unable to cover large block ranges with CT anyway, given current performance; (3) the idea that we need a changeable CT configuration, because chains are evolving.
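As a sketch, the "CT parameters" mentioned above boil down to two knobs (the struct and field names follow Substrate's `ChangesTrieConfiguration`, but the values shown are purely illustrative, not settled defaults):

```rust
/// Changes-trie configuration knobs discussed above (sketch only).
struct ChangesTrieConfiguration {
    /// Build a level-1 digest trie every `digest_interval` blocks.
    digest_interval: u32,
    /// Number of digest levels to build; the comment above argues for
    /// no level-2 digests by default.
    digest_levels: u32,
}

fn main() {
    let config = ChangesTrieConfiguration {
        digest_interval: 16, // illustrative value only
        digest_levels: 1,    // per the "no level-2 digests by default" point
    };
    assert_eq!(config.digest_levels, 1);
}
```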
50% extra import time is probably not something we can countenance. How many keys tend to have these "false changes"? Any idea which keys?
When I'm running current master with 2 authorities, there are 3 typical block patterns:
These ^^^ false changes also include a significant number of 'temporary values' changes (values that are set at block initialization && unset at finalization). Temporary values have already been eliminated from changes tries. Typical 'temporary values' are the block number and parent hash from the system module, or, for example, GasSpent from the contracts module. The rest of the 'false changes' could be literally anywhere a value is overwritten with the same value.

A related fact is that there's a state cache => so if a value is read by the runtime in the same block where it is 'falsely updated', the check introduced in this PR will be just a cache lookup.

Re the 50% increase of block import time: for blocks that are heavily using storage, introducing changes tries (disregarding this PR) could increase import time by >100%, especially when creating digest tries (see #2840 - the numbers are still inaccurate, but they'll give you an overview).
I think read-then-update is quite common - indeed, I can't think of any instance of "just-update" across the runtime, so perhaps many of the fake changes can be alleviated much more efficiently as part of that path. In any case, I don't think we can merge this PR with the cost as high as it is.
Marking this as in-progress and unassigning myself (just for cleaning my queue) from the review for now then. Feel free to re-request review though!
All the temporaries (block number, parent block hash, extrinsic index, extrinsics count, ...) + all values that are read and then written back unchanged.
I'm not sure - how's that? You can't distinguish between fake and non-fake calls just by looking at the passed values. You need to access the stored value anyway.

===

Let's probably focus on 3 things (though the 3rd one is probably out of scope):
P.S.: the alternative to CT (#425) should be faster if we limit the max number of keys that could be cleaned at every block (so that we won't end up with a block that tries to clean up 500_000 keys).
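A minimal sketch of that limit (the function name and the constant are hypothetical, not from #425): each block drains at most a fixed number of keys from a pending-cleanup queue, so no single block ever has to remove 500_000 keys at once.

```rust
// Hypothetical bound on per-block cleanup work.
const MAX_CLEANED_KEYS_PER_BLOCK: usize = 1_000;

/// Remove at most `MAX_CLEANED_KEYS_PER_BLOCK` keys this block;
/// the rest stay queued for subsequent blocks.
fn cleanup_step(pending: &mut Vec<Vec<u8>>) -> usize {
    let n = pending.len().min(MAX_CLEANED_KEYS_PER_BLOCK);
    pending.drain(..n).for_each(drop);
    n
}

fn main() {
    let mut pending: Vec<Vec<u8>> = vec![vec![0u8]; 2_500];
    assert_eq!(cleanup_step(&mut pending), 1_000); // block 1
    assert_eq!(cleanup_step(&mut pending), 1_000); // block 2
    assert_eq!(cleanup_step(&mut pending), 500);   // block 3 finishes the backlog
    assert!(pending.is_empty());
}
```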
I don't think this should be fixed in this way. If we want to alleviate "fake changes" at all, then it should probably be by restricting usage of `Ext::set_storage()`.
labelling
Long story short: there are sometimes false positives in changes tries, and this PR fixes this. I.e. now when `key` at the beginning of the block has `original_value` AND the block changes it to the same `original_value` => this change will appear in the changes trie of this block. And after applying this fix - it won't.

But this PR is a bit more than just fixing a bug - I'd like to check that we actually need to fix this. It is a bug, for sure, when we talk about it in the context of the original specification. But we could, probably, benefit from it and convert this to a feature. I'm currently talking about #2491, where I've lied (sorry for that) to @pepyakin, when he was asking whether changes tries will include notifications about changes like that.
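A self-contained sketch of the behavior being fixed (a `HashMap` stands in for the backend, and the `Overlay` type and its methods are hypothetical, not Substrate's actual API):

```rust
use std::collections::HashMap;

/// Toy overlay: records which keys should land in the block's changes trie.
struct Overlay {
    state: HashMap<Vec<u8>, Vec<u8>>, // backing state (stands in for Backend::storage())
    changed_keys: Vec<Vec<u8>>,       // keys destined for the changes trie
}

impl Overlay {
    /// With the fix: read the stored value first, and skip recording the
    /// change when the block writes back the exact value already stored.
    fn set_storage(&mut self, key: Vec<u8>, value: Vec<u8>) {
        if self.state.get(&key) != Some(&value) {
            self.changed_keys.push(key.clone());
        }
        self.state.insert(key, value);
    }
}

fn main() {
    let mut overlay = Overlay {
        state: HashMap::from([(b"key".to_vec(), b"original_value".to_vec())]),
        changed_keys: Vec::new(),
    };
    // writing the same value back is no longer recorded as a "false change"
    overlay.set_storage(b"key".to_vec(), b"original_value".to_vec());
    assert!(overlay.changed_keys.is_empty());
    // a real change is still recorded
    overlay.set_storage(b"key".to_vec(), b"new_value".to_vec());
    assert_eq!(overlay.changed_keys.len(), 1);
}
```

Note that the extra `self.state.get(&key)` read is exactly the per-key cost debated earlier in this thread.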
Also I believe this closes #2826 - I've tested this RPC locally && found that it actually issues 1 entry per change, not per block (at least when changes tries are used). The only exception is these 'false changes' - then an entry is issued on every block that tries to change that value (though it changes it from `original_value` to `original_value`) && it seems redundant.

So here are the pros of fixing the issue:
And here are the cons:

- an additional `Backend::storage()` call for every key that has been touched by `Ext::set_storage()`;

closes #2826 (I believe, or @pepyakin could you please clarify the case?)