Feat(near-client): Chunk distribution via message bus #10480
Conversation
Codecov Report: additional details and impacted files below.

@@            Coverage Diff             @@
##           master   #10480      +/-   ##
==========================================
+ Coverage   72.20%   72.24%   +0.03%
==========================================
  Files         729      730       +1
  Lines      148076   148521     +445
==========================================
+ Hits       106923   107302     +379
- Misses      36342    36404      +62
- Partials     4811     4815       +4

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Hi @birchmd, thanks for the proposal! I wasn't aware of this issue and it's great to see involvement from the community. I'll bring it up to the team for discussion and come back to you next week. First thoughts:
I completely understand the concern here. I think of this as an optimization of the happy path only, not any kind of protocol change. The decentralized peer-to-peer network is still the foundation on which everything works. This just gives an opportunity for faster message distribution when things are going well (which should be 99% of the time). The worst-case scenario, if something does go wrong with this, is that the latency of chunk distribution returns to what it was before this solution.
I think it will be hard to compete with this using the peer-to-peer network. This is a solution optimized for this kind of message passing (multi-producer, multi-consumer with interest-based routing). I also wouldn't want to make the security of the core networking protocol worse by trying to improve its performance. I think it makes sense to have a separate network for performance.
+1 please review carefully.
We discussed it in the core team and here are a few thoughts, comments, and questions. Alternatives: can you compare each to your proposal?
Comments for this solution:
I think it boils down to how much of a difference there is between the performance of the alternative solutions. If we can implement a solution that is 10% slower but decentralized and safe to enable by default, I would consider it the better alternative. If it's a 2x difference, that's a different story. Also, JFYI, we're making some changes to chunk processing and validation logic. In stateless validation, chunks will have accompanying state witnesses and endorsements. Not sure if there are any changes necessary here, but sharing just in case.
Given that the message bus part is abstracted out, I don't think this PR necessarily introduces any point of centralization on its own.
This is a good question. @birchmd what do you think?
I don't think any kind of solution using the peer-to-peer network will be both performant and scalable. Essentially the trade-off is between latency (via pull messaging) vs bandwidth (actively pushing messages). Currently chunks are distributed via pull (a node realizes it needs some chunk and asks for it), and even with a direct connection to a peer that is guaranteed to have the chunk you will always have the round-trip latency of sending the pull message before getting the chunk in response. If we switch to a broadcast model (i.e. push) then we eliminate the round-trip latency, but traffic becomes an issue for the peer-to-peer network as the number of shards increases. This is why the base protocol is designed with pull messaging in the first place.

You could have a push model that is more precise than broadcast by having some kind of subscription protocol, but I suspect validators will not want to open themselves up to having to spend even more bandwidth on messaging and therefore would reject subscription requests.

Even if we could switch chunk distribution to a push model, there is still the issue that subscribing to one validator is not enough. With one validator you would only automatically get the chunks that single validator produces (at least the changes to nearcore in this PR only forward chunks produced by the node, as opposed to all chunks it receives). To consistently get all chunks from a shard you would need to subscribe to most of the validators assigned to that shard. That set of validators could change between epochs, requiring yet more complexity in the hypothetical subscription protocol we are discussing.
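To make the round-trip argument concrete, here is a back-of-the-envelope model; the symbols (one-way delay d, chunk transfer time t) are my own illustration, not taken from this discussion.

```latex
% Illustrative latency model (assumed symmetric one-way delay d and
% chunk transfer time t; not taken from the PR discussion).
% Pull: send a request (d), then receive the chunk (d + t).
t_{\mathrm{pull}} = d + (d + t) = 2d + t
% Push: the producer forwards the chunk as soon as it exists.
t_{\mathrm{push}} = d + t
% Push saves one full one-way delay per chunk, but broadcasting every
% chunk to every peer multiplies bandwidth as the shard count grows.
```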
We have done some experiments and the solution should reduce latency by around 2 seconds. This number comes from measuring the time between an indexer receiving a block header for the first time and having received all chunks for that block. Additionally, on testnet we ran our own validator node with the patch to publish the chunks it creates to a message bus and compared the time it took for an indexer to receive a chunk via the peer-to-peer network against the time for another indexer to receive that chunk via the message bus. This experiment suffered a little from a lack of statistics because we could only measure time differences for chunks produced by our own validator. The best experiment we could do would be to set up a sandbox network with programmable latencies and compare the peer-to-peer network with the message bus in that setting. But this would be a larger undertaking and a significant delay for the project.
All code for this project will be public and include documentation.
I think this PR is already minimally invasive to nearcore and is essentially a plug-in mechanism as-is. This PR only exposes sending and receiving chunk data at configurable HTTP endpoints. Anyone could design any other service that uses or produces chunk data and connect it to nearcore using this mechanism. For example, suppose a large validator wanted to use this to sync data between their production and fail-over nodes. This would be very easy; a simple HTTP server with two endpoints (POST for accepting chunks from the production node and GET for downloading those chunks to the fail-over node) would slot in perfectly with the changes in this PR; see the sketch below. This was the whole idea of only defining abstract HTTP interfaces in nearcore. It works for this project, while cleanly separating concerns and also opening up possibilities for future innovation.
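As a concrete illustration of the fail-over example above, here is a minimal sketch of what such an external service could look like. Everything in it is an assumption for illustration: the actix-web framework, the endpoint paths, and the naive in-memory store are mine, not part of this PR.

```rust
// Hypothetical sketch of an external chunk-sync service: the production
// node POSTs chunks it produces, the fail-over node GETs them by hash.
// Assumes the actix-web crate; paths and keying scheme are illustrative.
use std::collections::HashMap;
use std::sync::Mutex;

use actix_web::{get, post, web, App, HttpResponse, HttpServer};

// Naive in-memory store keyed by chunk hash (hex string). A real
// service would add authentication, persistence, and eviction.
struct Store(Mutex<HashMap<String, web::Bytes>>);

#[post("/chunk/{hash}")]
async fn put_chunk(
    hash: web::Path<String>,
    body: web::Bytes,
    store: web::Data<Store>,
) -> HttpResponse {
    store.0.lock().unwrap().insert(hash.into_inner(), body);
    HttpResponse::Ok().finish()
}

#[get("/chunk/{hash}")]
async fn get_chunk(hash: web::Path<String>, store: web::Data<Store>) -> HttpResponse {
    match store.0.lock().unwrap().get(&hash.into_inner()) {
        Some(bytes) => HttpResponse::Ok().body(bytes.clone()),
        None => HttpResponse::NotFound().finish(),
    }
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    let store = web::Data::new(Store(Mutex::new(HashMap::new())));
    HttpServer::new(move || {
        App::new()
            .app_data(store.clone())
            .service(put_chunk)
            .service(get_chunk)
    })
    .bind(("127.0.0.1", 8080))?
    .run()
    .await
}
```

The production node would be configured to POST to the chunk endpoint and the fail-over node to GET from the same path, matching the abstract send/receive endpoints this PR exposes in config.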
LGTM with a bunch of optional improvement suggestions.
I confirm that this PR:
- doesn't introduce any protocol changes
- doesn't introduce any additional dependencies; interactions with the centralised message bus are performed via HTTP
- doesn't change existing behaviour by default, the new feature needs to be explicitly enabled via config
@birchmd what prevents this PR from getting merged?
@bowenwang1996 nothing is blocking as far as I know. I'd love to see this merged!
@wacban, can you please review and approve/comment? This is waiting for your approval to be merged. Thanks!
@vikpande Wac is OOO this week, but as far as I understand he is OK with this PR, so feel free to proceed with merging this. |
@birchmd, pls take it from here. Good luck!
This PR implements the changes to nearcore proposed in #10083
To summarize briefly here, the goal of this project is to reduce the latency experienced by RPC nodes through directly distributing chunks over a message bus. Validators eagerly push the chunks they produce on to this bus in addition to the peer-to-peer messages they send. When any node (validator or RPC) realizes it needs a chunk (e.g. because it is present in a new block), it can check the message bus to see if the chunk is present there before trying a request over the peer-to-peer network.
Participation in chunk distribution via the message bus is entirely optional and disabled by default. This PR has no impact on nodes that are not participating in the new chunk distribution. To be clear: this PR is not a replacement for the existing chunk distribution logic via the peer-to-peer network; it is a secondary channel which should provide faster chunk distribution (on the happy path).
If invalid chunks are published to the message bus, then nodes which receive them will fall back to requesting the chunk via the peer-to-peer network instead.
The details of how the chunks are posted to the message bus and what service exists for querying from the message bus are left intentionally abstract. These will be handled by a separate application (or applications). The nearcore code only sends HTTP requests to endpoints specified in the config that node operators use to opt in to this feature; a rough sketch of the client-side flow is below.
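For reference, here is a rough sketch of the check-the-bus-first, fall-back-to-p2p flow the description outlines. The config field names, the endpoint shape, and the use of the reqwest crate are all hypothetical; the actual nearcore changes define their own config and HTTP client.

```rust
use std::time::Duration;

// Hypothetical opt-in config; the real field names live in nearcore's
// node config and may differ.
struct ChunkDistributionConfig {
    enabled: bool,
    // Base URI of the GET endpoint for fetching chunks by hash.
    chunk_uri: String,
}

// Try the message bus first; on any miss or error return None so the
// caller falls back to the ordinary peer-to-peer chunk request.
async fn try_fetch_from_bus(
    cfg: &ChunkDistributionConfig,
    chunk_hash_hex: &str,
) -> Option<Vec<u8>> {
    if !cfg.enabled {
        return None; // feature is off by default; go straight to p2p
    }
    let url = format!("{}/{}", cfg.chunk_uri, chunk_hash_hex);
    let client = reqwest::Client::builder()
        .timeout(Duration::from_millis(500)) // fail fast: p2p is the fallback
        .build()
        .ok()?;
    let response = client.get(url).send().await.ok()?;
    if response.status().is_success() {
        response.bytes().await.ok().map(|b| b.to_vec())
    } else {
        None
    }
}
```

The key property, matching the PR description, is that every failure path degrades to the existing peer-to-peer request rather than stalling the node.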