[BUG] Offline calculation of total shard across all node and caching it for weight calculation inside LocalShardBalancer #15108

Closed
RS146BIJAY opened this issue Aug 5, 2024 · 0 comments
Labels
bug Something isn't working Indexing:Replication Issues and PRs related to core replication framework eg segrep untriaged

Comments

@RS146BIJAY
Contributor

Describe the bug

Description

When selecting the node on which a shard will be allocated, OpenSearch calculates the weight of that shard on every node. The weight compares the number of shards on a node with the number of shards that each node should hold on average (taking into account both the cluster as a whole and the shards per index). Calculating the average number of shards per node during weight calculation is a resource-intensive operation: we sum the shard counts of all nodes by iterating through the metadata of every node, then divide that sum by the total number of nodes. Because this computation is repeated for each node during shard allocation, it becomes computationally expensive. A single thread on the master node handles all of these operations, including running the deciders and making allocation decisions, so allocation decider execution can keep blocking that thread and prevent high-priority tasks such as applying or sending cluster state updates, index creation, etc. from running.

[Image: Screenshot 2024-07-08 at 5.48.10 PM]

As the graph shows, about 50% of the time spent relocating 6k empty shards from 100 source nodes and assigning them to 100 destination nodes is attributable to the average shard calculation during weight determination.
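To make the cost pattern described above concrete, here is a minimal sketch in Java. It is not the OpenSearch code: the class `NaiveWeightSketch` and all of its methods are hypothetical, shard counts are modelled as a plain `int[]`, and the per-index term of the weight is left out. It only illustrates how recomputing the average inside every weight call turns one allocation decision into a full pass over all nodes per candidate node.

```java
// Simplified illustration only; names are hypothetical, not OpenSearch APIs.
class NaiveWeightSketch {
    // Counts per-node lookups so the cost of the approach is visible.
    static long nodeVisits = 0;

    // Recomputes the cluster-wide average by walking every node.
    static float avgShardsPerNode(int[] shardsPerNode) {
        long total = 0;
        for (int count : shardsPerNode) {
            total += count;
            nodeVisits++;
        }
        return (float) total / shardsPerNode.length;
    }

    // Weight of one node: its shard count relative to the average.
    // The per-index part of the weight is omitted in this sketch.
    static float weight(int node, int[] shardsPerNode) {
        return shardsPerNode[node] - avgShardsPerNode(shardsPerNode);
    }

    // Evaluates the weight on every node and returns the lightest node's index.
    static int pickLightestNode(int[] shardsPerNode) {
        int best = 0;
        float bestWeight = Float.POSITIVE_INFINITY;
        for (int node = 0; node < shardsPerNode.length; node++) {
            float w = weight(node, shardsPerNode);
            if (w < bestWeight) {
                bestWeight = w;
                best = node;
            }
        }
        return best;
    }
}
```

In this sketch, a single call to `pickLightestNode` over N nodes performs N × N lookups, so 1000 nodes would mean 1,000,000 lookups for one decision, all on the calling thread.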

Related component

Indexing:Replication

To Reproduce

Create 500k shards on a setup with 1000 data nodes and 3 master nodes.

Expected behavior

Calculate the total number of shards across all nodes offline and cache it, so that LocalShardBalancer does not need to traverse the metadata of all nodes for weight calculation.
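One possible shape for this, sketched in Java under stated assumptions: the class `CachedShardTotalSketch` and its methods are hypothetical and do not reflect how LocalShardBalancer is actually structured. The idea shown is to compute the total once, keep it in sync whenever a shard is added or removed, and read the average in constant time during weight calculation.

```java
// Hypothetical sketch; not the actual LocalShardBalancer implementation.
class CachedShardTotalSketch {
    private final int[] shardsPerNode;
    private long totalShards; // cached sum, updated on every change

    CachedShardTotalSketch(int[] initialShardsPerNode) {
        this.shardsPerNode = initialShardsPerNode.clone();
        // One full traversal up front instead of one per weight calculation.
        for (int count : this.shardsPerNode) {
            totalShards += count;
        }
    }

    void addShard(int node) {
        shardsPerNode[node]++;
        totalShards++;
    }

    void removeShard(int node) {
        if (shardsPerNode[node] == 0) {
            throw new IllegalStateException("node has no shards");
        }
        shardsPerNode[node]--;
        totalShards--;
    }

    // Constant time: no traversal of node metadata.
    float avgShardsPerNode() {
        return (float) totalShards / shardsPerNode.length;
    }

    // Same simplified weight as before, without the per-index term.
    float weight(int node) {
        return shardsPerNode[node] - avgShardsPerNode();
    }
}
```

With this layout, relocating a shard is a `removeShard` on the source followed by an `addShard` on the target, which leaves the cached total unchanged. Keeping the cache consistent with the real routing state is the part a real implementation would have to get right.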

RS146BIJAY added the bug and untriaged labels on Aug 5, 2024
github-actions bot added the Indexing:Replication label on Aug 5, 2024