feat(banwidth_scheduler) - bandwidth scheduler (#12533)
This PR adds an implementation of the core bandwidth scheduler algorithm. It takes the bandwidth requests generated by shards at the previous height and decides how many bytes of receipts can be sent between each pair of shards. The bandwidth grants generated by the bandwidth scheduler are then used in `ReceiptSink` as outgoing size limits. The exact algorithm is described in more detail in the module-level comment.

This almost completes the basic bandwidth scheduler feature: shards generate bandwidth requests and the bandwidth scheduler decides how much they are allowed to send. There is one more piece missing: the algorithm for distributing the remaining bandwidth after all requests have been processed. I'll add it in another PR, this one is big enough already. It's not needed for correctness, it just improves typical throughput.

### Missing chunks

One thing I'm still not 100% sure about is the missing chunks. I'm starting to think that the rule "don't send anything to a shard that had a missing chunk" might be enough to make sure that a shard never has more than `max_shard_bandwidth` incoming receipts. Maybe there was no fatal flaw after all? I need to think more about it, or just write a test for it.

### Performance

The complexity of the scheduler algorithm is `O(num_shards^2 * log(num_shards))`. It's pretty much impossible to avoid quadratic complexity because we have to consider every pair of shards. The log comes from sorting the requests by allowance.

I measured the worst-case performance on a typical n2d-standard-8 GCP VM, and I think it should work fine up to ~128 shards:

```
Running scheduler with 6 shards: 0.10 ms
Running scheduler with 10 shards: 0.16 ms
Running scheduler with 32 shards: 1.76 ms
Running scheduler with 64 shards: 5.74 ms
Running scheduler with 128 shards: 23.63 ms
Running scheduler with 256 shards: 93.15 ms
Running scheduler with 512 shards: 371.76 ms
```

Once we reach 100+ shards we might have to revisit the design. Maybe choose only a random subset of shards/links that are allowed to send receipts at each height? I think we can worry about it when we get there.

---

The code is basically ready, but untested. I'll add the tests next week.

The PR is meant to be reviewed commit-by-commit (note that the last commit fixes the initial implementation of the scheduler!).
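
For reviewers who want a rough mental model before reading the module-level comment, here is a minimal sketch of the kind of grant loop described above. It is not the code in this PR. The `Request` struct, the `schedule` function, the all-or-nothing grants, and serving higher allowance first are all assumptions made for illustration. Only the ideas of sorting by allowance, capping each shard at `max_shard_bandwidth`, and not sending to shards with missing chunks come from the description.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical shard identifier, not the real nearcore type.
type ShardId = u64;

/// Hypothetical, simplified request: `sender` asks to send `bytes` to `receiver`.
#[derive(Debug, Clone)]
struct Request {
    sender: ShardId,
    receiver: ShardId,
    bytes: u64,
    /// Assumption: requests with a higher allowance are served first.
    allowance: u64,
}

/// Grants requests in order of decreasing allowance while keeping every
/// shard's total outgoing and total incoming bytes within `max_shard_bandwidth`.
/// Returns the granted bytes per (sender, receiver) pair.
fn schedule(
    mut requests: Vec<Request>,
    shards: &[ShardId],
    missing_chunks: &BTreeSet<ShardId>,
    max_shard_bandwidth: u64,
) -> BTreeMap<(ShardId, ShardId), u64> {
    // Remaining outgoing and incoming budget of every shard.
    let mut out_left: BTreeMap<ShardId, u64> =
        shards.iter().map(|s| (*s, max_shard_bandwidth)).collect();
    let mut in_left = out_left.clone();

    // "Don't send anything to a shard that had a missing chunk":
    // such a shard gets no incoming budget at this height.
    for shard in missing_chunks {
        if let Some(budget) = in_left.get_mut(shard) {
            *budget = 0;
        }
    }

    // Sorting is where the log factor in the complexity comes from.
    // Ties are broken by (sender, receiver) to keep the result deterministic.
    requests.sort_by(|a, b| {
        b.allowance
            .cmp(&a.allowance)
            .then(a.sender.cmp(&b.sender))
            .then(a.receiver.cmp(&b.receiver))
    });

    let mut grants = BTreeMap::new();
    for req in requests {
        // Requests that mention unknown shards are ignored.
        let (Some(&out), Some(&inc)) =
            (out_left.get(&req.sender), in_left.get(&req.receiver))
        else {
            continue;
        };
        // All-or-nothing grant (a simplification for this sketch).
        if req.bytes == 0 || req.bytes > out || req.bytes > inc {
            continue;
        }
        out_left.insert(req.sender, out - req.bytes);
        in_left.insert(req.receiver, inc - req.bytes);
        *grants.entry((req.sender, req.receiver)).or_insert(0) += req.bytes;
    }
    grants
}
```

With three shards, a budget of 100 bytes, and a missing chunk on shard 2, a request from shard 0 to shard 2 is dropped no matter how high its allowance is. A 60-byte request from shard 0 to shard 1 is granted, and a later 60-byte request from shard 2 to shard 1 is rejected because shard 1 has only 40 bytes of incoming budget left.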