State cleanup / efficiency proposal: replace Code CIDs in actors with stable ids #1090
rvagg
started this conversation in
Enhancements - Technical
Replies: 1 comment 3 replies
-
This proposal SGTM. However, please be careful to communicate breaking changes to users. I know there is tooling that currently depends on Code CIDs to work out what kind of actor it is dealing with, and that executes logic based on that, e.g. only getting traces if the code CID matches the msig's.
-
Introduction
The current Filecoin actor object uses a `Code` field containing a CID linking to a raw IPLD block containing the WASM for the particular builtin actor type of that actor. This is used in two ways:
- The FVM uses the `code` field to load the WASM to execute and then runs that against the actor's state, in particular its own `balance` and its `state` (Rust) / `Head` (Go) state root. i.e. Code + Data come together in the Actor object. Here: https://github.com/filecoin-project/ref-fvm/blob/551e24a9b7b4c8b1e42731b497d12202732beed1/fvm/src/call_manager/default.rs#L749-L756
- Lotus uses the `Code` CID to determine the network version by searching the historic actor CIDs for a match; then we are able to determine the schema of the actor's state. e.g. here: https://github.com/filecoin-project/lotus/blob/ae5b84503cea5b996d4a9d3ed46c4bdabd4ddea8/chain/actors/builtin/miner/miner.go#L33-L39
Problem
One of the negative results of storing the current actor's code CID in each actor object is that when we replace the code of an actor, we need to update all actor objects in the state tree to point to their new code CIDs. For most network version upgrade migrations, this takes the majority of the time and it's increasing over time as we add more actors.
The latest migration benchmark shows a growth of ~75k actors (to 3,266,879) and an increase in migration time of ~2.5s. With ~average hardware we're nearing the 30s mark for migrations, where we almost exclusively are updating the code CID of actors. There's also a lot of state churn in this operation as we have to rebuild the entire actors tree.
There's also size: the actor code CIDs take 39 bytes in each actor object, all pointing to the same 16 IPLD blocks.
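As a rough back-of-the-envelope check on the size point, the sketch below multiplies the two figures quoted above (39 bytes per code CID, 3,266,879 actors) and compares the result with the cost of a small CBOR-encoded integer, anticipating the integer identifier proposed below. The identifier value `100` is purely an assumption for illustration, and the totals count raw encoded bytes in a single state tree only; they say nothing about on-disk usage or historic state.

```python
# Figures quoted in the text above.
ACTOR_COUNT = 3_266_879  # actors in the latest migration benchmark
CODE_CID_BYTES = 39      # bytes taken by the code CID in each actor object


def cbor_uint_size(n: int) -> int:
    """Bytes needed to encode a non-negative integer in CBOR (RFC 8949)."""
    if n < 0:
        raise ValueError("unsigned integers only")
    if n < 24:
        return 1  # value fits in the initial byte
    if n < 2**8:
        return 2  # initial byte + 1 byte
    if n < 2**16:
        return 3  # initial byte + 2 bytes
    if n < 2**32:
        return 5  # initial byte + 4 bytes
    if n < 2**64:
        return 9  # initial byte + 8 bytes
    raise ValueError("does not fit in a CBOR unsigned integer")


# Assumption: a hypothetical type id just outside the 0-99 range.
HYPOTHETICAL_TYPE_ID = 100

cid_total = ACTOR_COUNT * CODE_CID_BYTES
id_total = ACTOR_COUNT * cbor_uint_size(HYPOTHETICAL_TYPE_ID)

print(f"code CIDs:   {cid_total:,} bytes")
print(f"integer ids: {id_total:,} bytes")
print(f"difference:  {cid_total - id_total:,} bytes")
```

Under these assumptions the code CIDs account for roughly 127 MB of encoded actor objects, almost all of which is the same 16 values repeated.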
Unfortunately, actors have worked like this from the beginning. Even before WASM builtin actors, they used identity CIDs with the actor type and version in a string. So a lot of code assumes that an actor object has the data+code pairing in it.
Proposal
Let's replace the `Code` field in the actor object with a stable identifier for the actor type. The identifier could be an integer for efficiency and we'd establish a stable mapping of integer to actor type that would remain consistent over time (we could avoid the 0-99 range to escape the confusion with the singleton actor IDs [e.g. `f01`, etc.], which confused me when I started looking at this tbh). Then we just need a place to store that mapping such that we can easily fetch it when needed so we end up with the code+data pairing wherever we need it.
Option: Use the system actor's state
The shape of the system actor's state is currently a map of actor string names to code CIDs; we update this at each network version upgrade.
Two options exist here if we wanted to repurpose this:
Unfortunately, this has the downside of being slightly recursive, or at least having a recursive smell: looking for a specific actor `f00` to load its state to find the code of any other actor.
Option: place the mapping at the top of the state root
The current state root object has a version, a link to the actors HAMT (where the 3M actors live) and a link to an "info" object, which is simply the empty array CBOR block `[]`; i.e. it's never been used for anything interesting.
We could park a mapping of actor integer identifiers to code CIDs off the state root block, as a separate block, and we could even do it without changing the schema of the state root by repurposing the `info` CID to be an `actor_mapping` CID. Each network version upgrade we'd get a new one of these blocks and the CID to it would be stored in the state roots from that point until the next upgrade.
Usage
With either option, the mapping is accessible from the state root, so whenever we want to make use of an actor, we need to either be accessing the actor from the state root, or have the state root at hand.
FVM
The FVM has no concept of historical state, it's called on a particular state root. When we execute an actor's method, we start from an actor ID and we load the actor from the state root that the FVM "engine" is instantiated with: https://github.com/filecoin-project/ref-fvm/blob/551e24a9b7b4c8b1e42731b497d12202732beed1/fvm/src/call_manager/default.rs#L665-L668
Currently we simply use that actor object as the code+data pair. But we could just as easily also load the code CID from the mapping in the state root and create a new code+data pair when we load the actor and then execute off this.
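To make the "load the code CID from the mapping and build the pair at load time" step concrete, here is a minimal sketch. It is not the ref-fvm implementation: the class and function names (`Actor`, `StateRoot`, `LoadedActor`, `load_actor`), the type id `107`, and the placeholder CID strings are all assumptions for illustration, and plain dicts stand in for the actors HAMT and the mapping block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    type_id: int  # stable actor type identifier (replaces the Code CID)
    head: str     # CID of the actor's state (placeholder string here)
    balance: int


@dataclass(frozen=True)
class StateRoot:
    version: int
    actors: dict         # stand-in for the actors HAMT: actor ID -> Actor
    actor_mapping: dict  # stand-in for the mapping block: type id -> code CID


@dataclass(frozen=True)
class LoadedActor:
    code: str     # code CID resolved through the state root
    actor: Actor  # the data half of the code+data pair


def load_actor(root: StateRoot, actor_id: int) -> LoadedActor:
    """Load an actor and pair it with the code CID for its type."""
    try:
        actor = root.actors[actor_id]
    except KeyError:
        raise LookupError(f"actor {actor_id} not found in state root") from None
    try:
        code = root.actor_mapping[actor.type_id]
    except KeyError:
        raise LookupError(f"no code CID for actor type {actor.type_id}") from None
    return LoadedActor(code=code, actor=actor)


# Placeholder values only; none of these are real CIDs or real type ids.
root = StateRoot(
    version=0,
    actors={1234: Actor(type_id=107, head="cid:miner-1234-state", balance=10)},
    actor_mapping={107: "cid:storageminer-wasm"},
)
loaded = load_actor(root, 1234)
print(loaded.code, loaded.actor.head)
```

The only difference from today's flow is the second lookup: the pair is assembled from the state root rather than read out of the actor object.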
Lotus (etc.)
Typically we use actors in Lotus APIs by ID or address. We're either finding them in the latest state root (no `TipSetKey` specified) or in a specific state root (a specific `TipSetKey` specified). In both cases we're starting from a state root and finding the actor and then using that actor to determine the state schema and then making use of the state data for whatever the API requires.
e.g. `StateMinerSectors` works like this:
- `LoadActorTsk` loads the actor from a given `TipSetKey` and an actor address. The `LoadActor*` family of methods all start from a state root, drill down to find the actor, and just return it.
- `miner.Load(actor)` does a miner-specific load of the actor's state data to make use of it; this in turn uses the actor's `Code` to work out which network version it belongs to and then uses that to determine the schema of the state data that it needs to load.
- The `LoadSectors` method on the object returned by `miner.Load` abstracts the loading of the sectors from the state data across all network versions such that they act the same way regardless of the underlying schema.
This is fairly typical of actor use in Lotus, starting from the state root and using the code of the code+data pair to determine the schema of the data. We could simply fetch this code from the mapping in the state root instead of from the actor object itself once we have the actor object and the id to map.
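The flow above can be sketched as follows, with the code CID coming from the state root's mapping rather than from the actor object. This is an illustration, not Lotus code: `HISTORIC_CODE_CIDS`, `SECTOR_LOADERS`, `load_sectors`, the type id `107`, the version numbers, and the two toy state layouts are all made up to show the shape of the lookup.

```python
from typing import NamedTuple


class ActorMeta(NamedTuple):
    name: str
    version: int


# Hypothetical table of historic code CIDs -> (actor type, version).
HISTORIC_CODE_CIDS = {
    "cid:storageminer-v15": ActorMeta("storageminer", 15),
    "cid:storageminer-v16": ActorMeta("storageminer", 16),
}

# Hypothetical per-version loaders; the two schemas differ only in a field name.
SECTOR_LOADERS = {
    15: lambda state: state["sectors"],
    16: lambda state: state["sector_list"],
}


def load_sectors(type_id: int, state: dict, actor_mapping: dict) -> list:
    """Resolve type id -> code CID -> version, then load with that schema."""
    code = actor_mapping[type_id]    # from the state root, not the actor object
    meta = HISTORIC_CODE_CIDS[code]  # same historic lookup as today
    if meta.name != "storageminer":
        raise ValueError(f"actor type {meta.name} is not a miner")
    return SECTOR_LOADERS[meta.version](state)


MINER_TYPE_ID = 107  # assumption: an id outside the 0-99 range

# The same type id resolves differently depending on which state root we start from.
old_root_mapping = {MINER_TYPE_ID: "cid:storageminer-v15"}
new_root_mapping = {MINER_TYPE_ID: "cid:storageminer-v16"}

print(load_sectors(MINER_TYPE_ID, {"sectors": [1, 2]}, old_root_mapping))
print(load_sectors(MINER_TYPE_ID, {"sector_list": [1, 2, 3]}, new_root_mapping))
```

The caller-facing behaviour stays the same; the one new requirement is that the mapping from the relevant state root is available wherever the schema is chosen.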
Migrations
Migrations would skip individual actor migrations except where something about the actor's state needs to change. So we don't mutate actors where there hasn't been a FIP that touches them. We would need to update the mapping in the state root, but that's a single new IPLD object and a new CID for that object placed in the state root.
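A toy comparison of the two upgrade styles, assuming no actor state needs migrating. Everything here is a stand-in: `toy_cid` hashes JSON with SHA-256 and is not a real CID, the actors are plain dicts rather than a HAMT, and the ids and placeholder CID strings are invented.

```python
import hashlib
import json


def toy_cid(obj) -> str:
    """Toy stand-in for a CID: truncated SHA-256 over sorted JSON."""
    encoded = json.dumps(obj, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()[:16]


def upgrade_with_code_cids(actors: dict, new_code: dict) -> tuple:
    """Current layout: each actor embeds its code CID, so every actor is rewritten."""
    migrated = {
        actor_id: {**actor, "code": new_code[actor["code"]]}
        for actor_id, actor in actors.items()
    }
    return migrated, len(migrated)


def upgrade_with_type_ids(mapping: dict, new_code: dict) -> dict:
    """Proposed layout: only the type id -> code CID mapping block is replaced."""
    return {type_id: new_code[code] for type_id, code in mapping.items()}


NEW_CODE = {"cid:account-v1": "cid:account-v2", "cid:miner-v1": "cid:miner-v2"}

actors_with_code = {
    1000: {"code": "cid:account-v1", "head": "cid:state-a"},
    1001: {"code": "cid:miner-v1", "head": "cid:state-b"},
    1002: {"code": "cid:miner-v1", "head": "cid:state-c"},
}
mapping = {100: "cid:account-v1", 101: "cid:miner-v1"}

migrated, rewritten = upgrade_with_code_cids(actors_with_code, NEW_CODE)
new_mapping = upgrade_with_type_ids(mapping, NEW_CODE)

print("actor objects rewritten (current layout):", rewritten)
print("actors tree changed (current layout):", toy_cid(actors_with_code) != toy_cid(migrated))
print("mapping block changed (proposed layout):", toy_cid(mapping) != toy_cid(new_mapping))
```

In the proposed layout the actors tree is never passed to the upgrade function at all, which is the point: its root stays as it was and only the mapping block and the state root that links to it get new CIDs.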