🔷 [ProjectTracking] Enable mainnet validators to track a single shard #10462

Closed
4 tasks done
Tracked by #9571 ...
VanBarbascu opened this issue Jan 18, 2024 · 5 comments · Fixed by #10632


VanBarbascu commented Jan 18, 2024

Goals

Background

Currently, every mainnet node tracks all shards. The State Sync work made it easy for nodes to download state data, which removes a blocker for tracking a single shard. However, initial testing revealed bugs and limitations outside the State Sync code. This issue tracks the effort to address them and fully enable mainnet nodes to track a single shard.

Why should NEAR One work on this

  • Single shard tracking contributes to network scalability and reduces the hardware requirements for chunk-only producers.
  • Stake Wars IV will run in a network (Statelessnet) where nodes track a subset of shards.

What needs to be accomplished

We need to fix missing incoming receipts for shards that a node does not track. The State Sync implementation depends on nodes tracking all shards: when a node generates the state sync header for a shard (call it T), it looks at all the other shards for incoming receipts destined for shard T and adds them to the header. This is not possible if the node tracks only a subset of shards. More info in the doc. Initially, we will fix GCS-based State Sync by uploading the header from a host that tracks all shards; the other hosts can then download the header from the external location. For decentralised State Sync, we will add a new work item to propose a solution. This allows the stateless validation work to proceed in parallel with this effort.
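The dependency described above can be illustrated with a minimal Python sketch. It is not the nearcore implementation: `Node`, `Receipt`, `build_header`, `get_header`, and the dictionary standing in for the external bucket are all hypothetical names chosen for illustration. It only shows why header generation fails on a node that tracks a subset of shards, and how downloading a header produced by a host that tracks all shards works around that.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Receipt:
    from_shard: int
    to_shard: int
    payload: str


@dataclass
class Node:
    # Hypothetical model of a node: the shards it tracks and the outgoing
    # receipts it knows about, keyed by source shard.
    tracked_shards: Set[int]
    outgoing: Dict[int, List[Receipt]] = field(default_factory=dict)


class UntrackedShardError(Exception):
    """Header generation needs data from a shard this node does not track."""


def build_header(node: Node, target: int, all_shards: List[int]) -> dict:
    # Collect receipts destined for `target` from every shard. This is the
    # step that requires the node to track all shards.
    incoming: List[Receipt] = []
    for shard in all_shards:
        if shard not in node.tracked_shards:
            raise UntrackedShardError(f"shard {shard} is not tracked")
        incoming.extend(r for r in node.outgoing.get(shard, []) if r.to_shard == target)
    return {"shard_id": target, "incoming_receipts": incoming}


def get_header(node: Node, target: int, all_shards: List[int], external_store: Dict[int, dict]) -> dict:
    # Build the header locally when possible; otherwise fall back to the copy
    # that a host tracking all shards uploaded to external storage.
    try:
        return build_header(node, target, all_shards)
    except UntrackedShardError:
        return external_store[target]
```

In this sketch, a host that tracks every shard would call `build_header` and store the result under the shard id in `external_store`; a node that tracks only shard T would then obtain the same header through `get_header` without needing the other shards' data.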

Main use case

The main use case for enabling single shard tracking is stateless validation.

Links to external documentation and discussions

doc

Estimated effort

Engineers assigned: @VanBarbascu, @telezhnaya.

The initial effort estimate is about 0.5 PM (person-months) for transferring headers via GCS.
The remaining effort is reported in the latest comment on this issue.

Assumptions

In the single shard tracking scenario, RPC nodes will still be able to track all mainnet shards over the coming months.

Pre-requisites

N/A

Out of scope

Single shard tracking with decentralized state sync. This involves more work and will be addressed in a different issue.

Task list:

Tasks

  1. A-stateless-validation Near Core Node
    VanBarbascu
  2. 2 of 2
    Node
    VanBarbascu
@VanBarbascu VanBarbascu self-assigned this Jan 18, 2024
@VanBarbascu VanBarbascu added the Node Node team label Jan 18, 2024
@walnut-the-cat walnut-the-cat added the A-stateless-validation Area: stateless validation label Jan 18, 2024
@gmilescu gmilescu changed the title from "[State Sync] Refactor state sync header to work when nodes track a subset of shards" to "[ProjectTracking] Enable mainnet validators to track a single shard" Jan 29, 2024
@VanBarbascu

30 Jan 2024

The upload and download feature for the state sync header is merged and has been tested on mainnet and testnet.
Parts are uploaded to GCP, e.g. testnet. The mainnet parts can be found by changing the path.
In this Grafana dashboard, during the selected interval, two nodes (mainnet and testnet) were syncing from external storage using the new download flow.

@gmilescu gmilescu added the Epic label Feb 6, 2024
@gmilescu gmilescu changed the title from "[ProjectTracking] Enable mainnet validators to track a single shard" to "🔷 [ProjectTracking] Enable mainnet validators to track a single shard" Feb 6, 2024

VanBarbascu commented Feb 6, 2024

In recent weeks, we've successfully resolved the issue of single shard tracking, complemented by state sync functionality from external storage. Comprehensive testing on both mainnet and testnet RPC nodes has demonstrated their ability to seamlessly transition between shards every epoch.

Our focus now shifts to enhancing file monitoring capabilities to include the verification of the state sync header in the external bucket. This step is crucial in ensuring the integrity and effectiveness of the state sync feature.

Pending confirmation from the Core team regarding the optimal performance of the state sync feature for the stateless validation use case, we anticipate closing the issue next week.

Furthermore, the refinement of single shard tracking with decentralized state sync will be addressed as a separate project, thus maintaining clarity in our planning process.


VanBarbascu commented Feb 16, 2024

This week we set up a base integration test with validator nodes that can be configured to track a subset of shards.

We are waiting on the results of the in-memory trie + single shard tracking experiment to close this ticket.


VanBarbascu commented Feb 24, 2024

We have confirmation that nodes tracking a subset of shards are able to download the state from external storage.
We are waiting for the monitoring PR to be approved; then we can close this project.

@VanBarbascu

The functionality has been finalised. We have the capability to track parts availability using the metrics provided by the dump_check tool. The task of single shard tracking with decentralised state sync will be handled in a separate project.
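
As an illustration of what tracking part availability involves, here is a minimal Python sketch. It is not the dump_check tool and does not reflect its metrics or the real bucket layout: the object naming scheme (`shard_<id>/header`, `shard_<id>/part_<n>`) and the function `check_shard_dump` are assumptions made only for this example.

```python
from typing import Iterable, List, Tuple


def check_shard_dump(object_names: Iterable[str], shard_id: int, num_parts: int) -> Tuple[bool, List[int]]:
    """Return (header_present, missing_part_ids) for one shard.

    Assumes a hypothetical layout where objects are named
    "shard_<id>/header" and "shard_<id>/part_<n>".
    """
    prefix = f"shard_{shard_id}/"
    header_present = False
    present_parts = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue  # object belongs to another shard
        rest = name[len(prefix):]
        if rest == "header":
            header_present = True
        elif rest.startswith("part_") and rest[len("part_"):].isdigit():
            present_parts.add(int(rest[len("part_"):]))
    missing = sorted(set(range(num_parts)) - present_parts)
    return header_present, missing
```

A monitoring job could run a check like this once per shard per epoch and export the number of missing parts and the header flag as metrics.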
