[Meta] Moving Staking off the Relay Chain #491

Open
Ank4n opened this issue Aug 23, 2023 · 12 comments
Assignees
Labels
I6-meta A specific issue for grouping tasks or bugs of a specific category.

Comments

@Ank4n
Contributor

Ank4n commented Aug 23, 2023

Updated on 7th August 2024 based on the Plaza proposal.


Design

Balances will live on the same chain as Staking: the staking chain will have its own pallet-balances. DOT tokens can be transferred (teleported) to the staking chain in order to stake. The staking chain cannot burn or mint tokens in a way that changes the total issuance of DOT (teleporting is fine).
Keeping Staking on the same chain as Balances also simplifies the reward minting logic.

There are three major discussion points that we have identified, and some of them may deserve their own RFC.

  • Relay Chain <> Staking Chain interface
  • Reward minting logic
  • Migration Strategy

Relay Chain <> Staking Chain interface

  • The relay chain knows about sessions but has no concept of an era. Many parachain functions, such as parachain onboarding, depend on sessions, so we cannot move the concept of a session away from the relay chain without breaking a lot of existing functionality.
  • Block authoring: Staking needs to know which validators authored how many blocks in an era in order to assign them era points. The relay chain buffers block authors by session keys and sends them as a batch to the staking chain at the end of every session.
  • Offence reporting: Whenever there is an offence, it is reported from the relay chain to the staking chain. This is not buffered and is sent as soon as possible, as we don't want to wait to slash/chill a validator.
  • Session keys: Validators generate session keys on their relay chain node and set them on the staking chain, pairing them with their stash account.
  • At the end of every era, the staking chain sends an XCM instruction (possibly paged) to the relay chain to change the active validator set to the provided set of keys.
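A minimal sketch of the relay chain side of this interface, to make the buffered versus unbuffered distinction concrete. Every name here (`RelayToStaking`, `AuthorBuffer`, `ValidatorId` as a plain integer) is an assumption made for illustration and not part of any existing pallet; real messages would travel over XCM.

```rust
use std::collections::BTreeMap;

/// Hypothetical validator identifier (a stash account in practice).
type ValidatorId = u32;

/// Messages the relay chain might send to the staking chain (assumed names).
#[derive(Debug, PartialEq)]
enum RelayToStaking {
    /// Sent once per session: how many blocks each validator authored.
    SessionReport { session_index: u32, authored: BTreeMap<ValidatorId, u32> },
    /// Sent immediately, never buffered.
    Offence { offender: ValidatorId, session_index: u32 },
}

/// Relay-chain side buffer of block authors for the current session.
#[derive(Default)]
struct AuthorBuffer {
    session_index: u32,
    authored: BTreeMap<ValidatorId, u32>,
}

impl AuthorBuffer {
    /// Record that `who` authored one more block in the current session.
    fn note_author(&mut self, who: ValidatorId) {
        *self.authored.entry(who).or_insert(0) += 1;
    }

    /// Drain the buffer into a single batched message and start the next session.
    fn end_session(&mut self) -> RelayToStaking {
        let authored = std::mem::take(&mut self.authored);
        let msg = RelayToStaking::SessionReport { session_index: self.session_index, authored };
        self.session_index += 1;
        msg
    }

    /// Offences bypass the buffer and produce a message right away.
    fn report_offence(&self, offender: ValidatorId) -> RelayToStaking {
        RelayToStaking::Offence { offender, session_index: self.session_index }
    }
}

fn main() {
    let mut buffer = AuthorBuffer::default();
    for author in [1, 2, 1, 3, 1] {
        buffer.note_author(author);
    }
    println!("{:?}", buffer.report_offence(2));
    println!("{:?}", buffer.end_session());
}
```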

Migration Strategy

Snapshot based: Freezing and migrating all staked balance to the Staking Chain

Rough steps:

  • Freeze all stakes, active and unlocking.
  • Build a state root of the data to be migrated to the Staking Chain and set it on the Staking Chain via a root origin track.
  • Copy over (via a bot using a permissionless extrinsic) all the staking ledgers and staked balances to the Staking Chain, verifying that the data is correct.
  • No new election happens during this phase. We extend the era if we need to.
  • Slashes are buffered in the relay chain (if any).
  • Once we are sure the staking system is working as expected, the migrated staking state on the relay chain can be killed. We should still keep historical era data so that validators can claim older rewards.
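One way the "verify the data is correct" step could work is to check each copied ledger against the committed root with a Merkle proof. The sketch below assumes that approach; the issue does not specify the proof format. The `Ledger` struct, the helper names, and the use of the standard library's non-cryptographic `DefaultHasher` are all stand-ins chosen to keep the example self-contained. A real chain would use a cryptographic hash and its own trie.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified staking ledger entry.
#[derive(Hash, Clone, Debug, PartialEq)]
struct Ledger {
    stash: u32,
    active: u128,
    unlocking: u128,
}

/// Toy hash for illustration only (not cryptographically secure).
fn h<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hash one level of the tree into its parent level.
/// An unpaired last node is hashed together with itself.
fn parent_level(level: &[u64]) -> Vec<u64> {
    level.chunks(2).map(|pair| h(&(pair[0], *pair.last().unwrap()))).collect()
}

/// Root committed on the staking chain before copying starts.
fn merkle_root(mut level: Vec<u64>) -> u64 {
    if level.is_empty() {
        return 0;
    }
    while level.len() > 1 {
        level = parent_level(&level);
    }
    level[0]
}

/// Sibling hashes from leaf to root for the leaf at `index`.
fn merkle_proof(mut level: Vec<u64>, mut index: usize) -> Vec<u64> {
    let mut proof = Vec::new();
    while level.len() > 1 {
        let sibling = if index % 2 == 0 { (index + 1).min(level.len() - 1) } else { index - 1 };
        proof.push(level[sibling]);
        level = parent_level(&level);
        index /= 2;
    }
    proof
}

/// What a permissionless "copy this ledger" call could check before accepting an entry.
fn verify(root: u64, entry: &Ledger, mut index: usize, proof: &[u64]) -> bool {
    let mut node = h(entry);
    for sibling in proof {
        node = if index % 2 == 0 { h(&(node, *sibling)) } else { h(&(*sibling, node)) };
        index /= 2;
    }
    node == root
}

fn main() {
    let ledgers: Vec<Ledger> =
        (0..5u32).map(|i| Ledger { stash: i, active: 100 * i as u128, unlocking: 0 }).collect();
    let leaves: Vec<u64> = ledgers.iter().map(|l| h(l)).collect();
    let root = merkle_root(leaves.clone());
    let proof = merkle_proof(leaves.clone(), 3);
    println!("entry 3 accepted: {}", verify(root, &ledgers[3], 3, &proof));
}
```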

Advantages

  • Only one staking system and better overall economic security.
  • One shot migration.

Other thoughts

Staking Chain performs election for all collators of system parachains

Since the Staking chain already elects the winners for the validator set of the relay chain, it could also become the central place where collators stake, get elected, and get rewarded for collating on other system chains such as Asset Hub and Collectives, as well as for collating on the Staking Chain itself.

Other links

@Ank4n Ank4n added the J1-meta label Aug 23, 2023
@Ank4n Ank4n self-assigned this Aug 23, 2023
@burdges

burdges commented Aug 23, 2023

Do we know how much relay chain usage is staking related now?

@joepetrowski
Contributor

> We talked about not having tokens on Staking chain at all (or may be a small balance just for paying fees) but eventually decided against the complexity introduced because of it. The topic is a bit beyond the scope of this issue, though we could revisit it if someone has strong opinions about it.

This is more XCM v4 related, but it'd still be nice to have Stake and Unstake instructions in v4. A value proposition of Asset Hub is that it gives custodians the ability to support parachain native assets without running infrastructure for those parachains. But a lot of their users also want to stake, and if they have to teleport the assets back to the parachain to make certain calls, then it puts them back in the same problem of needing to run nodes there.

So, we don't have to use these instructions for DOT staking, but it'd be nice to have them in XCM v4 so that other chains can impl those instructions, and on Asset Hub we can have a stake_remote(asset, amount) call.

@juangirini juangirini transferred this issue from paritytech/polkadot Aug 24, 2023
@Ank4n
Contributor Author

Ank4n commented Aug 24, 2023

> Do we know how much relay chain usage is staking related now?

Do you mean by block weight? I don't have any numbers, but I can try to dig them up if you think it's relevant. The primary reason, as I understand it, to move staking (as well as governance and other functions) out of the relay chain is not only to free up bandwidth on the relay chain but also to move towards a vision of the relay chain only being used to back parachain blocks.

@vstam1
Contributor

vstam1 commented Aug 25, 2023

Validators need to set session keys on the session pallet and tie them to their ValidatorID (stash account). The validators on the staking chain would probably need to set session keys on the relay chain with a remote origin from the Staking chain. Would this work?

I could see a solution where we use the AliasOrigin instruction together with the Transact instruction. We use AliasOrigin to alias the remote origin to a local origin.
Let's say Alice on staking hub wants to set session keys on the relay chain session pallet:

Xcm(vec![
	... // possibly paying for fees etc.
	AliasOrigin(MultiLocation {
		parents: 0,
		interior: X1(AccountId32 { network: ..., id: Alice }),
	}),
	Transact { origin_kind: OriginKind::Native, ..., call: *set_keys call* },
])

The origin is changed from:
Parachain(StakingHubId)\AccountId32{id:Alice}
to:
AccountId32{id:Alice}
So it is as if the local Alice account dispatches the set_keys call.

The question is if we should allow aliasing staking hub accounts in the relay chain (and maybe vice versa) as it would mean that StakingHubAlice == RelayAlice. But I think it could solve a lot of problems around the parallel staking in staking hub and asset hub.
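To make the aliasing question concrete, here is a rough sketch of the kind of filter the relay chain could apply before honouring AliasOrigin. It uses toy `Location` and `Junction` types defined inline, not the real XCM crate types, and `STAKING_HUB_ID` is a made-up parachain id. The rule shown (same account bytes, only from the staking hub) is one possible policy, not a settled design.

```rust
/// Toy stand-ins for XCM junctions; not the real `xcm` crate types.
#[derive(Clone, Debug, PartialEq)]
enum Junction {
    Parachain(u32),
    AccountId32([u8; 32]),
}

/// Toy stand-in for a location as seen from the relay chain.
#[derive(Clone, Debug, PartialEq)]
struct Location {
    parents: u8,
    interior: Vec<Junction>,
}

/// Hypothetical parachain id of the staking hub.
const STAKING_HUB_ID: u32 = 1100;

/// Allow `Parachain(StakingHub)/AccountId32(x)` to alias into the local
/// `AccountId32(x)`, and nothing else.
fn may_alias(origin: &Location, target: &Location) -> bool {
    match (origin.interior.as_slice(), target.interior.as_slice()) {
        ([Junction::Parachain(id), Junction::AccountId32(from)], [Junction::AccountId32(to)]) => {
            origin.parents == 0 && target.parents == 0 && *id == STAKING_HUB_ID && from == to
        }
        _ => false,
    }
}

fn main() {
    let alice = [1u8; 32];
    let remote = Location {
        parents: 0,
        interior: vec![Junction::Parachain(STAKING_HUB_ID), Junction::AccountId32(alice)],
    };
    let local = Location { parents: 0, interior: vec![Junction::AccountId32(alice)] };
    println!("alias allowed: {}", may_alias(&remote, &local));
}
```

Under this policy StakingHubAlice and RelayAlice are treated as the same account, which is exactly the property in question.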

@the-right-joyce the-right-joyce added I6-meta A specific issue for grouping tasks or bugs of a specific category. and removed J1-meta labels Aug 25, 2023
@joepetrowski
Contributor

> DOT tokens can be transferred to staking chain in order to stake. The chain cannot burn or mint the tokens. Relay chain (in future this could be Asset hub) could mint 8% of total supply and teleport it to the Staking Chain at the start of each cycle where each cycle is 28 eras. This 8% then is only used for Staking rewards.

> We talked about not having tokens on Staking chain at all (or may be a small balance just for paying fees) but eventually decided against the complexity introduced because of it. The topic is a bit beyond the scope of this issue, though we could revisit it if someone has strong opinions about it.

Adding more detail to my last comment, I missed the retreat but would like to revisit this decision, or at least bring up a few tradeoffs.

First, I understand that just moving DOT to a staking chain makes things much simpler for staking, since it will minimize the number of changes needed. But there are two major downsides I see:

  1. Until now, staked DOT has been non-transferrable but still usable for other locking (now freezing) activities, namely governance. Moving DOT to a staking chain means either (a) those funds can only be used for staking, or (b) governance (and other freezing activities) will need to know about a user's DOT in multiple locations (Asset Hub and staking chain).
  2. As alluded to in my previous comment, it adds extra integration work for applications/services to provide a very common service. Ideally Asset Hub can be somewhat a single entry point for "universal" user stories (transfer, stake, vote). All asset mutations (transfers, rewards, slashes, holds, etc.) would be accessible via an Asset Hub node. Of course, many chains (Polkadot included) will have custom staking/governance logic, but we can provide apps/tools with embedded light clients for transacting on those chains with custom application logic, but the asset-related side effects come to Asset Hub. With this pattern, custodians/applications could support transferring, staking, and voting for any asset on any chain reachable from Polkadot that implements the relevant XCM instructions.

The tradeoff with (2) that we should minimize is that AH should not become a "do everything" chain, because that defeats the purpose of separate cores. Specific things like validator ops in staking or submitting referenda in governance should still all be handled on their respective chains. But I still think there's a minimal amount (e.g. stake, vote) that can be done from AH with specific implementations of that logic handled on other chains.

@burdges

burdges commented Aug 25, 2023

> Do you mean by block weight? I don't have any numbers but I can try to dig it up if you think its relevant? The primary reason as I understand to move staking (as well as governance and other functions) out of relay chain is not only for freeing up bandwidth on relay chain but to move towards a vision of relay chain only being used to back parachain blocks.

It's relevant to how fast we do the move probably.

An easier target might be NPoS, because NPoS blocks are way too heavy for parachains right now, which also blocks this, but that's basically the only problem with doing NPoS on a parachain. If we could run the blocks, like by dividing them, then we could just fork the relay chain state into a parachain of itself, where NPoS runs.

@Ank4n
Contributor Author

Ank4n commented Sep 4, 2023

> The question is if we should allow aliasing staking hub accounts in the relay chain (and maybe vice versa) as it would mean that StakingHubAlice == RelayAlice.

This is the tricky part during the transition phase where we will slowly scale the validator set coming from staking hub. Since staking on staking hub and staking on relay chain would be completely independent of each other, Alice validating on relay chain and Alice validating on Staking hub should be considered different validators.

@Ank4n
Contributor Author

Ank4n commented Sep 5, 2023

> 1. Until now, staked DOT has been non-transferrable but still usable for other locking (now freezing) activities, namely governance. Moving DOT to a staking chain means either (a) those funds can only be used for staking, or (b) governance (and other freezing activities) will need to know about a user's DOT in multiple locations (Asset Hub and staking chain).

We definitely want staked dots to be available for governance and this I believe should be very similar to dots on Asset Hub. That is, 1) hold the dots in the asset hub/staking chain, 2) send NoteUnlockable instruction to governance chain, 3) vote action on governance chain once it receives proof of hold. Governance chain would just know there are some dots locked in asset hub/staking chain and it can be used for voting. Apps/wallets would need to integrate with both Asset Hub/Staking chain to see all the Assets/DOTs a user has.

> 2. As alluded to in my previous comment, it adds extra integration work for applications/services to provide a very common service. Ideally Asset Hub can be somewhat a single entry point for "universal" user stories (transfer, stake, vote). All asset mutations (transfers, rewards, slashes, holds, etc.) would be accessible via an Asset Hub node. Of course, many chains (Polkadot included) will have custom staking/governance logic, but we can provide apps/tools with embedded light clients for transacting on those chains with custom application logic, but the asset-related side effects come to Asset Hub. With this pattern, custodians/applications could support transferring, staking, and voting for any asset on any chain reachable from Polkadot that implements the relevant XCM instructions.

We did start with the idea that you are proposing (Asset Hub being the central point of action for any asset mutation op). As you said though, this makes Asset Hub a bottleneck, since almost every action would start there. There are other complexities, such as the large number of XCM instructions needed for reward payouts (and slashes) and fee payments on the non-asset chain. Most of these instructions will need follow-up interaction with the non-asset chain; for example, stake would require a follow-up action of nominate (a voting instruction with what-to-vote), which means apps would need to integrate with both chains anyway. There were also some reservations around how a staking operation would work. The current LockAsset instruction, iirc, does not propagate the intent of locking to the non-asset chain, so we need another XCM instruction describing the intent (such as stake the locked funds) while ensuring that NoteUnlockable is processed at the non-asset chain before the intent XCM. From what I understood this was not so trivial, and as it works today, apps would still need to integrate with both chains and listen for NoteUnlockable before issuing the next instruction.

So even if we go with Asset Hub being the entry point, I think most apps would need to integrate with multiple chains one way or another. Since staking funds are semi-liquid, it seems a reasonable compromise to keep assets there as well (which simplifies a lot of things). Once we have that, I think we can still strive to move towards the goal that you are suggesting, i.e. to be able to stake funds in the asset hub (or move funds to staking and then stake).
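The ordering concern around NoteUnlockable can be illustrated with a small sketch of the non-asset chain side: a stake intent is held back until enough locked funds have been noted, regardless of which message arrives first. The struct and method names are hypothetical, and this models only the bookkeeping, not real XCM execution.

```rust
use std::collections::HashMap;

type Account = u32;

/// Hypothetical bookkeeping on the non-asset (staking) chain.
#[derive(Default)]
struct NonAssetChain {
    /// Locked amounts reported by the asset chain and not yet used.
    noted: HashMap<Account, u128>,
    /// Stake intents waiting for a matching lock note.
    pending_stake: HashMap<Account, u128>,
    /// Stake that has actually been applied.
    staked: HashMap<Account, u128>,
}

impl NonAssetChain {
    /// Handler for a (modelled) NoteUnlockable message.
    fn on_note_unlockable(&mut self, who: Account, amount: u128) {
        *self.noted.entry(who).or_insert(0) += amount;
        self.try_stake(who);
    }

    /// Handler for a (hypothetical) "stake the locked funds" intent message.
    fn on_stake_intent(&mut self, who: Account, amount: u128) {
        *self.pending_stake.entry(who).or_insert(0) += amount;
        self.try_stake(who);
    }

    /// Apply the pending intent only once enough locked funds have been noted.
    fn try_stake(&mut self, who: Account) {
        let wanted = self.pending_stake.get(&who).copied().unwrap_or(0);
        let noted = self.noted.get(&who).copied().unwrap_or(0);
        if wanted > 0 && noted >= wanted {
            self.noted.insert(who, noted - wanted);
            self.pending_stake.remove(&who);
            *self.staked.entry(who).or_insert(0) += wanted;
        }
    }

    fn staked_of(&self, who: Account) -> u128 {
        self.staked.get(&who).copied().unwrap_or(0)
    }
}

fn main() {
    let mut chain = NonAssetChain::default();
    chain.on_stake_intent(1, 100); // intent arrives before the lock note
    println!("staked before note: {}", chain.staked_of(1));
    chain.on_note_unlockable(1, 100);
    println!("staked after note: {}", chain.staked_of(1));
}
```

Queuing the intent on the receiving chain is only one option; the alternative described above, where the app waits for NoteUnlockable before sending the intent, moves the same wait to the client.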

@joepetrowski
Contributor

> Most of these instructions will need followup interaction with the non asset chain, such as stake would require a follow up action of nominate

Actually I meant the opposite of this when I wrote, "many chains (Polkadot included) will have custom staking/governance logic, but we can provide apps/tools with embedded light clients for transacting on those chains with custom application logic". As such, only the first interaction would be from AH (viz. represent my assets in some destination chain for some purpose). However, further interaction would be on the destination chain with your Agent.

> So even if we go with the Asset Hub being the entry point, I think most apps would need to integrate with multiple chains one way or another.

Yeah, so the "one way or another" part is key here, because running an archive node for every chain is burdensome. Ideally an archive node solely on AH would allow people to track balance changes for users (these could be batched, e.g. accumulate rewards for a month on the staking chain and then send a message to update your balance on AH). But all transacting could be done via light client on other chains.
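As a rough illustration of the batching idea, the staking chain could accumulate rewards per account and only emit one balance update per account every few eras. `RewardBatcher` and its fields are hypothetical names, and the batch length is an arbitrary parameter.

```rust
use std::collections::BTreeMap;

type Account = u32;

/// Hypothetical staking-chain accumulator for balance updates destined for Asset Hub.
struct RewardBatcher {
    eras_per_batch: u32,
    eras_in_batch: u32,
    pending: BTreeMap<Account, u128>,
}

impl RewardBatcher {
    fn new(eras_per_batch: u32) -> Self {
        Self { eras_per_batch, eras_in_batch: 0, pending: BTreeMap::new() }
    }

    /// Accumulate a reward locally instead of messaging Asset Hub right away.
    fn note_reward(&mut self, who: Account, amount: u128) {
        *self.pending.entry(who).or_insert(0) += amount;
    }

    /// At era end, return the batched updates if the batch is full, otherwise `None`.
    fn end_era(&mut self) -> Option<Vec<(Account, u128)>> {
        self.eras_in_batch += 1;
        if self.eras_in_batch < self.eras_per_batch {
            return None;
        }
        self.eras_in_batch = 0;
        Some(std::mem::take(&mut self.pending).into_iter().collect())
    }
}

fn main() {
    let mut batcher = RewardBatcher::new(2);
    batcher.note_reward(1, 10);
    println!("{:?}", batcher.end_era());
    batcher.note_reward(1, 5);
    batcher.note_reward(2, 7);
    println!("{:?}", batcher.end_era());
}
```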

@Polkadot-Forum

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/remote-stake-any-asset-on-asset-hub/4210/1

@kianenigma
Contributor

kianenigma commented Oct 13, 2023

Some thoughts after discussing this in person over the last 2 weeks:

Realistic Plan

I would divide this in general into two tasks:

  1. Code refactor: Refactor EPM to be multi-block. There is also a branch for this that needs reviving. Refactor staking to not contain any notion of controller. Refactor staking to contain a simple approval-stake tracking system, and track the total amount of stake that is bonded, is active, and is unbonding. There is an endless list of other refactors, but they are not needed.

  2. Interfaces: Put a staking parachain with the current incomplete pallets asap on a test network and start working on the interactions between the RC and the parachain.

These are fairly independent and can be pursued in parallel for the most part.
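For the stake-tracking part of the refactor, a minimal sketch of the running totals might look like the following. It assumes that "bonded" means active plus unbonding, which mirrors how a staking ledger's total relates to its active and unlocking parts; `StakeTotals` and its methods are illustrative names only.

```rust
/// Hypothetical chain-wide stake totals.
#[derive(Default, Debug, PartialEq)]
struct StakeTotals {
    active: u128,
    unbonding: u128,
}

impl StakeTotals {
    /// Assumption: bonded stake is everything that is active or still unbonding.
    fn bonded(&self) -> u128 {
        self.active + self.unbonding
    }

    fn bond(&mut self, amount: u128) {
        self.active += amount;
    }

    /// Move stake from active to unbonding.
    fn unbond(&mut self, amount: u128) -> Result<(), &'static str> {
        if amount > self.active {
            return Err("not enough active stake");
        }
        self.active -= amount;
        self.unbonding += amount;
        Ok(())
    }

    /// Remove fully unbonded stake from the totals.
    fn withdraw(&mut self, amount: u128) -> Result<(), &'static str> {
        if amount > self.unbonding {
            return Err("not enough unbonding stake");
        }
        self.unbonding -= amount;
        Ok(())
    }
}

fn main() {
    let mut totals = StakeTotals::default();
    totals.bond(100);
    totals.unbond(40).unwrap();
    println!("{:?}, bonded = {}", totals, totals.bonded());
}
```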

Ideal Plan

This is probably out of reach, but if time had permitted, I would refactor the staking and pools to be just one staking primitive. Everything should be pools. Fixed nomination is robot-pools, Mirror nomination is robot pools. There are some downsides to this, mayhaps a more limited ability to "switch your pool", but all in all would yield a much much simpler staking system.

enum Nomination {
	/// Managed pool. AccountId can be a smart contract.
	Mirror(AccountId),
	/// Robot pool.
	Fixed(BoundedVec<AccountId, 16>),
}
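A possible reading of this enum, sketched as standalone Rust: a Mirror nomination follows whatever the referenced account nominates, so resolving a nominator means walking mirrors until a Fixed list is found. This interpretation, the `resolve` function, the hop limit, and the use of a plain `Vec` in place of `BoundedVec` are all assumptions for illustration.

```rust
use std::collections::HashMap;

type AccountId = u32;

/// Upper bound on targets, matching the `16` in the enum above.
const MAX_TARGETS: usize = 16;
/// Hypothetical limit on how many mirrors may be followed.
const MAX_HOPS: usize = 8;

#[derive(Clone, Debug)]
enum Nomination {
    /// Managed pool: follow whatever this account nominates.
    Mirror(AccountId),
    /// Robot pool: a fixed list of targets (a `Vec` stands in for `BoundedVec`).
    Fixed(Vec<AccountId>),
}

/// Resolve the effective targets of `who`, or `None` if the chain of mirrors
/// is broken, too long, cyclic, or ends in an oversized list.
fn resolve(who: AccountId, all: &HashMap<AccountId, Nomination>) -> Option<Vec<AccountId>> {
    let mut current = who;
    for _ in 0..=MAX_HOPS {
        match all.get(&current)? {
            Nomination::Fixed(targets) if targets.len() <= MAX_TARGETS => {
                return Some(targets.clone());
            }
            Nomination::Fixed(_) => return None,
            Nomination::Mirror(next) => current = *next,
        }
    }
    None
}

fn main() {
    let mut all = HashMap::new();
    all.insert(1, Nomination::Mirror(2));
    all.insert(2, Nomination::Fixed(vec![10, 11]));
    println!("{:?}", resolve(1, &all));
}
```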

@Polkadot-Forum

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/staking-update-paged-staking-reward-to-avoid-validator-oversubscription-issue-is-live-on-westend/5215/1

gpestana added a commit that referenced this issue Feb 15, 2024
…2504)

> highly WIP, opening draft PR for early feedback.

This PR implements a PoV friendly, multi-block EPM to be used by the
staking parachain. It is split into multiple sub-pallets for better
logic and storage encapsulation and better readability. The pallet split
consists of a main pallet and several other sub-pallets that implement
the logic for different sub-systems, namely:

- **main pallet**:
    - implements the `trait ElectionProvider`
    - `fn elect(remaining_pages)` basically fetches the queued page from the
      `pallet::verifier`, which keeps the valid solutions in its storage.
    - manages current election `Phase` in `on_initialize`. The current phase
      signals other sub-pallets what to do.
    - stores and manages the (paged) target and voter snapshots.
    - *note*: the staking pallet needs to return/interpret a paged fetching
      of the snapshot data for both voters and targets.
- **signed pallet**:
    - implements the `trait SolutionDataProvider`, which provides a paged
      solution for the current best score (the `pallet::verifier` is the
      consumer, when trying to fetch the best queued solution to verify).
    - keeps track of the best solution commitments from submitters.
    - exposes `Call::register` for submitters to submit a commitment score
      for their solution.
    - exposes callable `Call::submit_page` for submitters to submit a page
      of their solution.
    - upon the verifier pallet finalizing the paged solution verification,
      it handles the submission deposit/rewards based on the reported
      `VerificationResult` (from `pallet::signed`).
- **verifier pallet**:
    - implements the `trait Verifier`: verifies one solution page on-call.
      The inputs for the verification are provided in-place.
    - implements the `trait AsyncVerifier`: fetches pages from the
      implementor of `SolutionDataProvider` (implemented by `pallet::signed`)
      and verifies the paged solution.
    - `on_initialize`, it checks if the verification is ongoing and proceeds
      with it
    - it has its own `VerificationStatus` which signals the current state
      of the verification
    - for each successfully verified page, add it to the `QueuedSolution`
      storage.
    - at the end of verifying a solution, it reports the results of the
      verification back to the implementor of `trait SolutionDataProvider`
      (`pallet::signed`)
- **unsigned pallet**:
    - `on_initialize` checks if on `UnsignedPhase` and no queued solution;
      compute a solution with offchain Miner.
    - implements the off-chain unsigned (paged) miner.
    - implements the inherent call that processes unsigned submissions.
---

### Todo/discussion

- [x] E2E multi-page election with staking and EPM-MB pallet
integration.
- [ ] refactor the current `on_initialize` across all pallets to make
explicit calls depending on the current phase, rather than relying on
the pallet's `on_initialize` and current phase to decide what to do at a
given block (TBD).
- [ ] remove the `Emergency` phase and instead just keep trying the
election in case of failure.
- [ ] refactor current `SignedValidation` phase to have enough
blockspace to verify all queued signed submissions, for security
purposes (ie. at least `max_num_queued_submissions * T::Pages` blocks
allocated to signed verification, return early if a submission is valid
and accepted).
- [x] implement the paged ingestion of the election results in the
staking pallet. How to convert from multiple `BoundedSupports` to
`Exposures` in the staking pallet in a nice way (add integration tests).
    - idea: if each page contains up to `N` targets and a validator may
      appear only in one page, we can process the pages in series on the
      staking side, keeping track of the state of `Exposures` across the
      pages.
- [ ] allow the validator to replace the current submission if their
submission has better score than the accepted queued submission.
- [ ] mutations to both the target and voter lists need to "freeze"
while the snapshot is being generated (now multi-block). what's the best
approach?
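
The "process the pages in series" idea from the todo list above could look roughly like this. The types are simplified stand-ins (`SupportsPage` for a page of `BoundedSupports`, `Exposure` without an `own` field), and `ingest_page` is a hypothetical helper rather than the staking pallet's API.

```rust
use std::collections::BTreeMap;

type AccountId = u32;

/// One page of election supports: validator -> (nominator, stake) backers.
type SupportsPage = Vec<(AccountId, Vec<(AccountId, u128)>)>;

/// Simplified exposure: total backing plus individual backers.
#[derive(Default, Debug, PartialEq)]
struct Exposure {
    total: u128,
    others: Vec<(AccountId, u128)>,
}

/// Fold one page into the running exposure state kept across pages.
fn ingest_page(exposures: &mut BTreeMap<AccountId, Exposure>, page: SupportsPage) {
    for (validator, backers) in page {
        let exposure = exposures.entry(validator).or_default();
        for (who, stake) in backers {
            exposure.total += stake;
            exposure.others.push((who, stake));
        }
    }
}

fn main() {
    let mut exposures = BTreeMap::new();
    // Each page would arrive in a different block.
    ingest_page(&mut exposures, vec![(1, vec![(100, 10), (101, 20)])]);
    ingest_page(&mut exposures, vec![(2, vec![(100, 5)])]);
    println!("{:?}", exposures);
}
```

If a validator is guaranteed to appear in only one page, each entry is final as soon as its page is ingested; the merge shown here also tolerates a validator appearing in several pages.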
 
Closes: #2199
Related to: #491
Inspiration from
https://github.com/paritytech/substrate/tree/kiz-multi-block-election
Projects
Status: ⌛️ Sometime-soon
Status: Draft

7 participants