This repository has been archived by the owner on Nov 26, 2020. It is now read-only.
fully deterministic, reproducible test scenarios #146
Labels
workstream/e2e-tests
Workstream: End-to-end Tests
Context
With #143 (proper synchronized mining), we have the ability to synchronise mining across a fleet of miners, so that they're advancing in lockstep via a global clock.
However, there are other chain-dependent async processes like window PoSt (fault declaration, recovery declaration, posting proofs), sealing, etc. that we need to wait for at every chain height before we proceed. We need to tap into those processes before we allow the clock to proceed.
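The gating described above could be sketched roughly as follows. This is a minimal illustration, not code from Lotus or the test plans: `EpochGate`, `Done`, and `TryAdvance` are hypothetical names, and the process names are placeholders. Each chain-dependent process reports when it has finished its work for the current height, and the clock only ticks once all of them have reported.

```go
package main

import (
	"fmt"
	"sync"
)

// EpochGate is a hypothetical helper (not a Lotus type): the global clock may
// only advance once every registered process has reported for this height.
type EpochGate struct {
	mu      sync.Mutex
	names   []string            // all registered processes
	pending map[string]struct{} // processes that have not reported yet
}

// NewEpochGate registers the processes the clock has to wait for.
func NewEpochGate(names ...string) *EpochGate {
	g := &EpochGate{names: names}
	g.reset()
	return g
}

// reset marks every registered process as pending again.
// Callers must hold g.mu (or be the constructor).
func (g *EpochGate) reset() {
	g.pending = make(map[string]struct{}, len(g.names))
	for _, n := range g.names {
		g.pending[n] = struct{}{}
	}
}

// Done records that the named process finished its work for this height.
func (g *EpochGate) Done(name string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	delete(g.pending, name)
}

// TryAdvance reports whether the clock may tick. On success it re-arms the
// gate so that every process must report again at the next height.
func (g *EpochGate) TryAdvance() bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if len(g.pending) > 0 {
		return false
	}
	g.reset()
	return true
}

func main() {
	g := NewEpochGate("window-post", "sealing")
	g.Done("window-post")
	fmt.Println(g.TryAdvance()) // false: sealing has not reported yet
	g.Done("sealing")
	fmt.Println(g.TryAdvance()) // true: all processes reported
}
```

A polling `TryAdvance` keeps the sketch small; a real harness would more likely block on a channel or a sync-service barrier.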
Additionally, some/all of those processes generate messages asynchronously, and we might need to coordinate across miners to only advance the clock globally when all those messages have been received in the corresponding mempools.
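One way to picture the cross-miner check is sketched below. It is an assumption-laden illustration: message IDs are plain strings here (Lotus identifies messages by CID), the mempool snapshots are passed in as maps rather than read from a node, and `MissingMessages` and `CanAdvance` are hypothetical names.

```go
package main

import "fmt"

// MissingMessages returns, per miner, the expected message IDs that are not
// yet present in that miner's mempool snapshot. Miners with nothing missing
// do not appear in the result. (Hypothetical helper, not a Lotus API.)
func MissingMessages(expected []string, mempools map[string]map[string]bool) map[string][]string {
	missing := make(map[string][]string)
	for miner, pool := range mempools {
		for _, id := range expected {
			if !pool[id] {
				missing[miner] = append(missing[miner], id)
			}
		}
	}
	return missing
}

// CanAdvance reports whether every mempool has received every expected
// message, i.e. whether the global clock may tick.
func CanAdvance(expected []string, mempools map[string]map[string]bool) bool {
	return len(MissingMessages(expected, mempools)) == 0
}

func main() {
	expected := []string{"msg-1", "msg-2"}
	mempools := map[string]map[string]bool{
		"miner-a": {"msg-1": true, "msg-2": true},
		"miner-b": {"msg-1": true},
	}
	fmt.Println(CanAdvance(expected, mempools)) // false: miner-b lacks msg-2

	mempools["miner-b"]["msg-2"] = true
	fmt.Println(CanAdvance(expected, mempools)) // true
}
```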
Dependent downstream processes
Windowed PoSt runner (Lotus: `storage` package)
Currently subscribes to head changes, and uses those to drive these three processes (at least):
The way it works is: upon a new HEAD that starts a new proving window, we wait for `StartConfidence` epochs before we actually do anything (to avoid wasting computation in case of reorgs). We then:
Each of those steps generates and broadcasts messages. For each step that generated a message, we wait for `build.MessageConfidence` epochs on top of it before continuing with the next step.
^^ This would pose a catch-22 for synchronised mining: the logic whose completion we're waiting on before we advance the chain is in turn waiting for the chain to advance. IMO this logic is wrong to begin with: we should have a message sentinel to which we delegate watching and rebroadcasting messages. We should not BLOCK window PoSt waiting for messages to appear. The current logic also has other weaknesses.
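To make the sentinel idea more concrete, here is a rough sketch under stated assumptions. `Sentinel`, `Watch`, and `OnHead` are hypothetical names, message IDs are plain strings, the confidence and rebroadcast thresholds are illustrative parameters, and reorg handling is left out entirely. The point is only that the caller registers a message and moves on, while the sentinel decides on each new head whether a message is confirmed or needs rebroadcasting.

```go
package main

import (
	"fmt"
	"sort"
)

// tracked holds the state of one watched message.
type tracked struct {
	sentAt     int64 // epoch of the last (re)broadcast
	includedAt int64 // epoch at which it was seen on chain, -1 if not yet
}

// Sentinel is a hypothetical message watcher (not a Lotus type). Callers
// register messages with Watch and never block on them.
type Sentinel struct {
	confidence       int64 // epochs on top of inclusion before confirming
	rebroadcastAfter int64 // epochs without inclusion before rebroadcasting
	pending          map[string]*tracked
}

func NewSentinel(confidence, rebroadcastAfter int64) *Sentinel {
	return &Sentinel{
		confidence:       confidence,
		rebroadcastAfter: rebroadcastAfter,
		pending:          make(map[string]*tracked),
	}
}

// Watch starts tracking a message that was broadcast at the given epoch.
func (s *Sentinel) Watch(id string, epoch int64) {
	s.pending[id] = &tracked{sentAt: epoch, includedAt: -1}
}

// OnHead is called for each new head with the IDs included at that epoch.
// It returns the messages that reached confidence and the messages that
// should be rebroadcast, both sorted for deterministic output.
func (s *Sentinel) OnHead(epoch int64, included []string) (confirmed, rebroadcast []string) {
	for _, id := range included {
		if t, ok := s.pending[id]; ok && t.includedAt < 0 {
			t.includedAt = epoch
		}
	}
	for id, t := range s.pending {
		switch {
		case t.includedAt >= 0 && epoch-t.includedAt >= s.confidence:
			confirmed = append(confirmed, id)
			delete(s.pending, id) // deleting during range is safe in Go
		case t.includedAt < 0 && epoch-t.sentAt >= s.rebroadcastAfter:
			rebroadcast = append(rebroadcast, id)
			t.sentAt = epoch
		}
	}
	sort.Strings(confirmed)
	sort.Strings(rebroadcast)
	return confirmed, rebroadcast
}

func main() {
	s := NewSentinel(2, 3)
	s.Watch("declare-faults", 10)

	_, re := s.OnHead(13, nil) // not included after 3 epochs
	fmt.Println(re)            // [declare-faults]

	s.OnHead(14, []string{"declare-faults"}) // included at epoch 14
	conf, _ := s.OnHead(16, nil)             // 2 epochs on top of inclusion
	fmt.Println(conf)                        // [declare-faults]
}
```

With this shape, window PoSt would hand each message to the sentinel and continue with the next step, which removes the wait that causes the catch-22 under a synchronised clock.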
Proposed mechanics
`MpoolSub`), before advancing to the next epoch.
Deal/sector sealing
Deals have epoch deadlines. If the chain advances too fast (as is the case with #143) sealing will not have enough time to run, ever, and therefore deals will always fail. Ideas: