polkadot: pin one block per session #1220
Great work @ordian !
"All good. Unpinning the inclusion blocks",
);
for (_number, hash) in inclusions {
    ctx.send_message(ChainApiMessage::UnpinBlock(hash)).await;
There is a small issue. Although we pin the block, the scraper will eventually drop blocks that have fallen too far behind finality.
This might be hard to exploit, as slashing should not take too long (always true?!), but we also have an LRU that might just get full. I am wondering whether we cannot do better. It looks like we would not strictly need the precise relay parent to be around. We should be able to make a block of the same session suffice.
With this, couldn't we make this more robust and more efficient at the same time, by making sure to always have one relay block available for every session in dispute window size and use this for any proofs/session fetches/..?
Technically with probabilistic finality fixed, this should not be too hard as that relay block could then always be on the canonical chain.
Interested in your thoughts. I am definitely not opposing the current solution, it should definitely get a security audit though. I feel there could be exploitable edge cases.
I definitely love the explicit block pinning!
This is a great suggestion, thank you! Currently tinkering with an implementation for it to see if there are any problems with it.
Also extract the leaf creation for tests into a common function.
a3f6021
to
5cba653
Compare
* master: (25 commits) Markdown linter (#1309) Update `fmt` file and some authors (#1379) Bump the known_good_semver group with 1 update (#1375) Bump proc-macro-warning from 0.4.1 to 0.4.2 (#1376) feat: add futures api to `TransactionPool` (#1348) Ensure cumulus/bridges is ignored by formatter and run it (#1369) substrate: chain-spec paths corrected in zombienet tests (#1362) contracts: Update to wasmi 0.31 (#1350) [improve docs]: Template pallet (#1280) [xcm-emulator] Unignore cumulus integration tests (#1247) Fix wrong ref counting (#1358) Use cached session index to obtain executor params (#1190) fix typos (#1339) Use bandersnatch-vrfs with locked dependencies ref (#1342) Bump bs58 from 0.4.0 to 0.5.0 (#1293) Contracts: `seal0::balance` should return the free balance (#1254) Logs: add extra debug log for negative rep changes (#1205) Added short-benchmarks for cumulus (#1183) [xcm-emulator] Improve hygiene and clean up (#1301) Bump the known_good_semver group with 1 update (#1347) ...
This is ready for the review, but |
}

#[test]
fn forward_subsystem_works() {
I had to remove this test to break a cyclic (dev-)dependency. Also, in general, I am not convinced we need tests for tests.
Awesome! This is code how I like it!
status: LeafStatus::Fresh,
span: Arc::new(jaeger::Span::Disabled),
},
fresh_leaf(*new_head, number),
Note on the side: We found this fresh/stale mechanism to be redundant. We should remove it at some point.
I've renamed it to new_leaf to account for this issue being worked on. Also with this helper function, we won't have to modify tons of files to implement this issue.
pub number: BlockNumber,
/// A handle to unpin the block on drop.
pub unpin_handle: UnpinHandle,
Nice! 😍
///
/// This is useful for runtime API calls to blocks that are
/// racing against finality, e.g. for slashing purposes.
pub type UnpinHandle = sc_client_api::UnpinHandle<Block>;
This is awesome! No races against finality anymore! This should do away with a couple of issues you have been looking into, @tdimitrov!
All we need to do is keep the leaf alive, while we work on it.
@@ -74,6 +75,10 @@ pub struct RuntimeInfo {
/// Look up cached sessions by `SessionIndex`.
session_info_cache: LruMap<SessionIndex, ExtendedSessionInfo>,

/// Unpin handle of *some* block in the session.
/// Only blocks pinned explicitly by `pin_block` are stored here.
pinned_blocks: LruMap<SessionIndex, UnpinHandle>,
Nice!
Great work @ordian . This is a very elegant solution.
* master: (28 commits) Adds base benchmark for do_tick in broker pallet (#1235) zombienet: use another collator image for the slashing test (#1386) Prevent a fail prdoc check to block (#1433) Fix nothing scheduled on session boundary (#1403) GHW for building and publishing docker images (#1391) pallet asset-conversion additional quote tests (#1371) Remove deprecated `pallet_balances`'s `set_balance_deprecated` and `transfer` dispatchables (#1226) Fix PRdoc check (#1419) Fix the wasm runtime substitute caching bug (#1416) Bump enumn from 0.1.11 to 0.1.12 (#1412) RFC 14: Improve locking mechanism for parachains (#1290) Add PRdoc check (#1408) fmt fixes (#1413) Enforce a decoding limit in MultiAssets (#1395) Remove dynamic dispatch using `Ext` (#1399) Remove redundant calls to `borrow()` (#1393) Get rid of polling in `WarpSync` (#1265) Bump actions/checkout from 3 to 4 (#1398) Bump thiserror from 1.0.47 to 1.0.48 (#1396) Move Relay-Specific Shared Code to One Place (#1193) ...
* master: Forgotten `polkadot-core-primitives/std` (#1440)
* polkadot: propagate UnpinHandle to ActiveLeafUpdate
  Also extract the leaf creation for tests into a common function.
* dispute-coordinator: try pinned blocks for slashin
* apparently 1.72 is smarter than 1.70
* address nits
* rename fresh_leaf to new_leaf
This pull request has been mentioned on Polkadot Forum. There might be relevant details there: https://forum.polkadot.network/t/stalled-parachains-on-kusama-post-mortem/3998/1
* refactor finality relay helper definitions * add missing doc * removed commented code * fmt * disable rustfmt for macro * move best_finalized method const to relay chain def
Fixes #623.
Problem
When an invalid parachain block (candidate) gets backed and included, an honest validator doing approval-checking work will raise a dispute. Assuming an honest supermajority, the dispute will conclude against the candidate. That means we will revert to the block right before the inclusion of the invalid candidate and transplant the results of the dispute to the next block built on top (without the inclusion). So far so good.
After this happens, we want to slash the offending backers. Normally, this would happen right in the same block when the dispute concluded. However, if the inclusion happened in a past session, this might happen in the following blocks. For past session slashing to work, we need the state of a block in this past session in order to query the key ownership proof on the node side.
The problem is that inclusion blocks of the candidate might be pruned as a fork of a finalized chain as you can see from the logs:
Solution
To address this problem, we pin the state of 1 block per session for `DISPUTE_WINDOW` (which is 6 at the time of writing) sessions. See 5cba653. Given that we know which blocks are not pruned, we can guarantee that runtime API calls to those blocks will not fail.
Alternatives
Instead of storing (pinning) the whole state of the block, why don't we collect just key ownership proofs of backers?
Well, the problem with this approach is that it assumes that the node side can predict who is going to be slashed in the runtime before it happens. The logic of slashing is encapsulated in the runtime and we don't want to leak that to the node side. Besides, there might be other storage items we might want to query in the future.
Pin inclusion blocks as soon as we see f+1 votes against a candidate. The problem with this approach is described below: polkadot: pin one block per session #1220 (comment)
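For readers who want a feel for the approach described under Solution, here is a minimal, std-only sketch of "one pinned block per session, unpinned on drop". It is not the code from this PR: `Backend`, `ToyUnpinHandle`, `PinnedSessions` and `demo` are hypothetical names, block hashes are plain `u64` values, and a `BTreeMap` that evicts the oldest session stands in for the `LruMap` used in `RuntimeInfo`. The only values taken from the description are the idea of an unpin-on-drop handle and `DISPUTE_WINDOW` being 6.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;
use std::rc::Rc;

type SessionIndex = u32;
type Hash = u64; // stand-in for a real block hash (assumption)

/// Number of sessions to keep a pinned block for (6 per the PR description).
const DISPUTE_WINDOW: usize = 6;

/// Toy backend that records which blocks are currently pinned.
#[derive(Default)]
struct Backend {
    pinned: RefCell<Vec<Hash>>,
}

/// Hypothetical stand-in for an unpin handle: the block stays pinned
/// for as long as this value is alive and is unpinned on drop.
struct ToyUnpinHandle {
    hash: Hash,
    backend: Rc<Backend>,
}

impl ToyUnpinHandle {
    fn pin(backend: &Rc<Backend>, hash: Hash) -> Self {
        backend.pinned.borrow_mut().push(hash);
        Self { hash, backend: Rc::clone(backend) }
    }
}

impl Drop for ToyUnpinHandle {
    fn drop(&mut self) {
        // Remove exactly one pin for this hash.
        let mut pinned = self.backend.pinned.borrow_mut();
        if let Some(pos) = pinned.iter().position(|h| *h == self.hash) {
            pinned.remove(pos);
        }
    }
}

/// Keeps at most one pinned block per session for the most recent
/// `DISPUTE_WINDOW` sessions.
#[derive(Default)]
struct PinnedSessions {
    by_session: BTreeMap<SessionIndex, ToyUnpinHandle>,
}

impl PinnedSessions {
    /// Remember `handle` for `session` unless a block of that session is
    /// already pinned; in that case `handle` is dropped (and unpinned) here.
    fn pin_block(&mut self, session: SessionIndex, handle: ToyUnpinHandle) {
        self.by_session.entry(session).or_insert(handle);
        // Evict the oldest sessions; dropping their handles unpins the blocks.
        while self.by_session.len() > DISPUTE_WINDOW {
            self.by_session.pop_first();
        }
    }

    /// Hash of the block pinned for `session`, if any.
    fn pinned_hash(&self, session: SessionIndex) -> Option<Hash> {
        self.by_session.get(&session).map(|handle| handle.hash)
    }
}

/// Feed two blocks per session for sessions 1..=8 and return what is
/// still pinned in the backend afterwards, sorted by hash.
fn demo() -> Vec<Hash> {
    let backend = Rc::new(Backend::default());
    let mut sessions = PinnedSessions::default();
    for session in 1..=8u32 {
        for i in 0..2u64 {
            let hash = u64::from(session) * 10 + i;
            sessions.pin_block(session, ToyUnpinHandle::pin(&backend, hash));
        }
    }
    let mut pinned = backend.pinned.borrow().clone();
    pinned.sort();
    pinned
}
```

In this toy setup `demo()` leaves exactly one block pinned for each of sessions 3 through 8. The second block of every session and everything from sessions 1 and 2 is unpinned simply because its handle was dropped, which is the property the description relies on: as long as a handle for some block of a session is held, runtime API calls against that block should not fail due to pruning.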