fip | title | author | discussions-to | status | type | category | created | spec-sections |
---|---|---|---|---|---|---|---|---|
0063 |
Switching to new Drand mainnet network |
@yiannisbot, @CluEleSsUK, @AnomalRoil, @nikkolasg, @willscott |
Final |
Core, Networking |
Core |
2023-02-24 |
The randomness source for Filecoin, the League of Entropy’s drand network, is moving to a new network setup. This FIP gives a high-level overview of drand’s new features and discusses the required mechanisms that need to be in place in order for Filecoin to switch to the new League of Entropy network and why it should do so.
The current drand network operates in “chained mode”, i.e. each beacon produced is tied to the previous one. Currently, drand beacons are produced every 30 seconds. The protocol upgrades we have recently developed and deployed as the new League of Entropy drand quicknet
network change these two network characteristics. Randomness beacons are not chained to previous beacons and are produced at a higher frequency of once every 3s (that remains a divisor of Filecoin block time).
These new features open up a number of new horizons for drand itself, but also for the applications that use drand. Applications operating on lower or higher frequencies can now use drand by picking the subset of rounds according to their needs for randomness input. For instance, Filecoin can pick 1 in every 10 beacons (since the new network is running at a 3s frequency) to satisfy its need for randomness every 30s for its Leader election. This is supported by the fact that randomness is unchained and therefore not all beacons need to be kept by clients in order to verify new beacons. In turn, this means a significantly lower memory requirement, as: i) client nodes do not need to keep the entirety of the beacon chain since the genesis beacon and, ii) the verification process is stateless and much simpler.
For further details on these new features and their implementation we refer the reader to this blogpost and the complementary, comprehensive FAQ doc.
- The existing drand network is expected to be deprecated in Q2 or Q3 2024.
- The Filecoin network should switch to the new drand mainnet networks before the existing network is deprecated.
- The new drand network is going to be producing smaller signatures, which means that Filecoin headers will be more lightweight.
- This new drand network also allows performing timelock encryption, which will be a valuable feature for user-deployed actors.
Currently drand verification is performed in two stages:
- hash the previous beacon’s signature and the current beacon round number using SHA256
- compare the digest to the randomness in the payload; if not equal, reject
- verify the signature over the digest is valid for the group public key (using BLS12-381 with signatures on G2); if not, reject
In this change, we propose a modified verification flow:
- Instantiate the BLS12-381 library with signatures on group G1 and public keys on group G2
- hash only the current beacon round number using SHA256
- compare the digest to the randomness in the payload; if not equal, reject
- verify the signature over the digest is valid for the group public key (using BLS12-381 with signatures on G1); if not, reject
For signatures on the unchained scheme, the only change required is to omit the previous beacon’s signature when creating a SHA256 digest for comparison against the randomness. Note that the new network also changes the way it does BLS signatures, and uses G1 for signatures and G2 for public keys. This means shorter signatures, but it also means the signature verification algorithm needs to be updated in order to be instantiated over G1 instead of G2.
For Go implementations, this can be achieved by using the latest drand client library, and is transparent and handled by specifying the right “scheme” upon instantiation. For Rust implementations there are multiple community libraries supporting this (see: drand-verify, drand-rs and drand-client-rs).
Additional minor changes will be needed in implementations:
- the new public key, chain hash and scheme will need updated to correctly verify beacon signatures
The details of quicknet
chain info are:
{
"public_key": "83cf0f2896adee7eb8b5f01fcad3912212c437e0073e911fb90022d3e760183c8c4b450b6a0a6c3ac6a5776a2d1064510d1fec758c921cc22b0e17e63aaf4bcb5ed66304de9cf809bd274ca73bab4af5a6e9c76a4bc09e76eae8991ef5ece45a",
"period": 3,
"genesis_time": 1692803367,
"hash": "52db9ba70e0cc0f6eaf7803dd07447a1f5477735fd3f661792ba94600c84e971",
"groupHash": "f477d5c89f21a17c863a7f937c6a6d15859414d2be09cd448d4279af331c5d3e",
"schemeID": "bls-unchained-g1-rfc9380",
"metadata": {
"beaconID": "quicknet"
}
}
The way beacons are retrieved must be adapted to fetch every 10th beacon instead of every beacon to accommodate for Filecoin epochs. In previous iterations of filecoin, operators could request drand beacons with monotonically increasing round numbers, as the drand and filecoin epochs were aligned. Upon switching to the new scheme, consumers must ensure they request the right drand epoch over either HTTP or gossipsub.
For a Filecoin epoch at time T
seconds, we source randomness from a round at time T - 30
seconds. Thus, the beacon round
corresponding to a filecoin epoch at time T
is calculated as (T - 30 - 1692803367)/3 + 1
, where 1692803367 is the genesis timestamp of Quicknet, and 3 is the period duration of Quicknet.
As today, blocks should include beacon entries for the epoch corresponding to their own height, as well as those corresponding to any parent epochs that were null. This guarantees that a unique beacon entry can be found for any Filecoin epoch to serve as a source of external randomness to the protocol (currently used for WindowPoSt and ProveCommit verification).
Note that this does NOT mean that every beacon entry is included in the blockchain as it is today. Instead, only every 10th entry will be included.
The switch to Quicknet will happen in a network upgrade. All blocks produced in the new network version should source their beacons from Quicknet.
If drand halts, Filecoin stops producing blocks. Once drand comes back online, it produces randomness faster until it catches up to the expected present round.
Filecoin, too, has a similar catch-up mechanism that consumes the expected beacons (per the mapping defined above) as they become available. Quicknet has a catch-up time of 2s per beacon, and so the total catch-up time per Filecoin epoch is increased from 15 seconds to 20 seconds.
This is only noticeable in case of a drand outage, of which there have been none since the launch of Filecoin. Nevertheless, any monitoring, alerting, reporting, or automation relying on the existing catch-up time should be updated with the new period of 20s.
-
Why the G1/G2 swap: It usually makes sense for BLS signatures to be instantiated with the public key over G1 and the signatures over G2, meaning shorter public keys but longer signatures. This is so because signature aggregation means we can have one aggregated signature to eventually verify against many public keys, it minimizes the overall size of the data required for verification. However, this does not apply to drand’s beacons since a new one with a single signature is generated at a given frequency. Therefore, it makes more sense for drand’s beacons to have shorter signatures and a longer group public key, since we cannot benefit from the aggregation capabilities of BLS signatures and the public key never changes and can be stored once unlike the signatures which need to be stored for each beacon. This allows to reduce the size of anything storing drand beacons, as well as to increase performances and reduce gas cost of any on-chain operations pertaining to drand beacons.
-
Why the unchained mode: While it may seem like having each beacon be chained to the previous one increases the security of the scheme, it is not the case as discussed below in the Security Consideration section. Instead, having unchained beacons means the verification can be stateless and we don’t need to keep previous beacons and access these to verify new beacons. Since the “previous signature” was required to verify a beacon, it means handling two signatures instead of just one per beacon, which is extraneous data since it doesn’t increase the security of the scheme. Therefore, moving away from the chained mode allows for faster and simpler beacon verification. Furthermore, unchained mode allows one to predict the message that will be signed in a future round (but nothing else), thus enabling “timelock encryption”, which is an interesting feature in and of itself.
Switching to a new unchained network will incur the following incompatibilities with older versions:
- users of the existing drand client libraries will not be able to correctly verify beacons emitted from unchained networks (as they do not contain a
PreviousSignature
field and use a different group for signatures) without updating their clients - the chain info (chain hash, group public key, etc) embedded in the drand config of the relevant Filecoin implementation must be updated
- URLs for accessing drand beacons from relays on the new network will require inclusion of the chain hash (e.g.
[https://api.drand.sh/public/latest](http://api.drand.sh/public/latest)
must becomehttps://api.drand.sh/52db9ba70e0cc0f6eaf7803dd07447a1f5477735fd3f661792ba94600c84e971/public/latest
) - Filecoin implementations must consume every 10th round from drand rather than each consecutive round (to maintain a 30sec block time)
- Beacons created with the scheme containing G1 and G2 swapped will also be unable to be verified by existing implementations.
- Verification of beacons from existing schemes, including the G1 swap, are fully supported in the updated drand client.
What happens when default
dies?
At some point, the League of Entropy's default
network will be sunset. To that end, parties will likely stop serving the historical beacons created by the default
network. This does not pose a concern for Filecoin, as the drand randomness is stored in the block headers of Filecoin, and the group public key is available in the git history of Lotus. Together these are sufficient to maintain network security and verifiability, and rebuild the state for any new storage providers joining the network.
Therefore, switching to the new drand network must be done through a Filecoin network upgrade.
In order to enable effective testing, we suggest creating a new, temporary testnets with a small number of clients from each widely supported Filecoin implementation.
unchained-scheme-testnet
would be started from fresh with only unchained randomness, and the following scenarios would be tested:
- foundation of a new Filecoin network
- creation of new blocks with unchained randomness
- onboarding new nodes from snapshots that contain unchained randomness
Upon successful testing, the network would be taken down and a new network mixed-scheme-testnet
would be created. This network would be started using the existing drand chained network, and after creation of some blocks, would be migrated to the drand unchained network.
Scenarios that must be run on this mixed-scheme-testnet
are:
- onboarding new nodes from existing snapshots (with chained randomness in the headers)
- onboarding new nodes from newly created snapshots (containing both chained and unchained randomness in the headers)
- transition of existing nodes from chained to unchained randomness at a set epoch
- creation of new blocks with the unchained scheme
From a security point of view, the guarantees of the League of Entropy (LoE) network remain the same:
- a random beacon cannot be predicted unless a threshold number of LoE nodes collude together (current threshold is 12 out of 23 nodes, though it may grow in the coming quarter as we onboard new nodes).
- a random beacon cannot be biased by anybody unless the attacker is able to change the public key of the group, which should be hard-coded and checked against more than a threshold number of members of the League of Entropy.
The fact that the drand beacons are now unchained might give the false impression that they are less secure than the previous chained ones, however this is not the case:
- an attacker controlling a threshold number of shares is able to predict any future round in the unchained setting, whereas they would need to compute all intermediary rounds in order to predict a given future round with the chained setting. However controlling a threshold amount of shares at any point in time allows computation of the shared secret of the group and nothing can prevent such an attack from computing future rounds offline using that shared secret much faster than the existing network would have, leading to the same result: a complete predictability of all future rounds.
- all future beacons are entirely determined by two things: the initial Distributed Key Generation and their round number. This was already the case for the chained network and won’t change with the unchained network.
From a liveness point of view, the way the drand nodes operate hasn’t changed and we can also take advantage of these changes to add a new partner relay to the list of available relays (https://api.drand.secureweb3.com:6875/public/latest). Therefore liveness and availability shouldn’t be negatively impacted in any way. A small increase in catchup period will incur some time in network recovery should the drand network stop functioning, though this has not happened to date. Regardless, a test of the catchup mechanism is provisionally scheduled for Q1 2024 on testnet.
Risks:
- Changing to the new drand network will require changing the way beacons are fetched and verified.
- It will require a careful transition from the old method to the new one to avoid any issues.
- The catch-up period will be 20 seconds instead of 15 seconds, in the case the drand network is ever down for more than 10 rounds.
Mitigations:
- We have inspected the implementation of Lotus with members of the Lotus team and the changes have a low surface area.
- Since the old League of Entropy network and the new network will both be available at the same time months prior to these changes, we will have enough time to extensively test on testnet nodes and ensure the transition goes smoothly.
- The drand codebase is continuously being improved and refactored to increase its liveness and reduce any risks of downtime. It being a threshold network means that it requires more than threshold nodes down for the network to actually halt. The drand team plans to further mitigate this by implementing fuzzing of the drand codebase and by diversifying its drand daemon implementations in more languages than just Go, to further decrease the risk of ever having a threshold of nodes that are down at the same time.
Unchained randomness allows for lighter drand nodes since they effectively become “stateless” and can delegate storing historical beacons to Filecoin, IPFS or some other storage system. The longer we keep a “chained beacon” running, the bigger the burden on the nodes running it becomes over time (requires more storage, costs more, etc.) and the more likely we are to see partners leaving the League of Entropy and causing an undesirable centralization of drand nodes.
The new network is also producing smaller beacon signatures (half the size of the current ones), which means lighter Filecoin headers.
Future Product Consideration - why is the new drand network an improved network?
- The Timelock Encryption feature is only available when using unchained randomness, since we need to be able to predict which message will be signed at what time. Switching would allow us to integrate Timelock Encryption natively on Filecoin. Expect another FIP focused on Timelock and FVM soon.
- Verification of the beacon becomes “stateless” and simpler: we don’t need to keep track of the previous beacon anymore and can verify a given round’s signature using just the group public key and the expected round number. This means a lower code complexity.
- Changing block frequency becomes easier if relying on a higher frequency unchained beacon, because one could say “we only use every 5th beacon”, etc. and easily change frequency to any multiple of the base drand frequency without any change on the drand side.
- Switching to a new chain is much easier in unchained randomness mode: it’s enough to just change the public key (bootstrapping the Filecoin chain becomes easier through use of the unchained beacon)
The following Lotus implementation details have been highlighted:
- the way beacons are retrieved will need to be adapted in order to fetch every 10th beacon to accommodate for Filecoin epochs: chain/beacon/drand/drand.go; the drand team could provide an API at the library level for fetching rounds at a given different frequency, but this is not yet the case.
The following configurations will need to be adapted:
- /build/drand.go#L45
- /build/params_mainnet.go
- drand configuration will need to be amended to include the new
public key
andchain hash
as well as to add the notion ofscheme
that determines how verification and signatures are done
For rust implementations, the Nois network maintain a crate for performing verification of beacons using the new scheme, drand-verify
Copyright and related rights waived via CC0.