[BUG] Offline calculation of total shards across all nodes and caching it for weight calculation inside LocalShardBalancer #15108
Labels
bug, Indexing:Replication, untriaged
Describe the bug
When selecting the node on which a shard will be allocated, OpenSearch calculates the weight of that shard on every node. The weight of a shard compares the number of shards on a node against the average number of shards that should reside on each node, taking into account both the cluster as a whole and the shards of each index. Computing this average during weight calculation is resource-intensive: we sum the shard counts of all nodes by iterating through the metadata of every node, then divide the sum by the total number of nodes. Since this computation is repeated for every node during shard allocation, it becomes computationally expensive.

Because a single thread on the master node handles all of these operations, including running the allocation deciders and making allocation decisions, the deciders can keep that thread busy and delay high-priority tasks such as applying/sending cluster state updates, index creation, etc.
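For context, here is a simplified sketch of the weight computation described above, loosely modeled on the weight function in BalancedShardsAllocator. Names, signatures, and the `theta0`/`theta1` fields are approximations for illustration, not the exact OpenSearch code:

```java
// Illustrative sketch only: names approximate OpenSearch's BalancedShardsAllocator.
// The weight of a node for a given index blends two deviations from the average:
// the node's total shard count vs. the cluster-wide average, and the node's
// per-index shard count vs. that index's average shards per node.
float weight(ModelNode node, String index) {
    final float weightShard = node.numShards() - avgShardsPerNode();
    final float weightIndex = node.numShards(index) - avgShardsPerNode(index);
    return theta0 * weightShard + theta1 * weightIndex;
}

// The costly part described above: every call re-derives the cluster-wide
// average by walking the shard counts of all nodes, so weighing N candidate
// nodes for a single shard costs on the order of N^2 node visits.
float avgShardsPerNode() {
    int totalShards = 0;
    for (RoutingNode node : routingNodes) {  // one pass over every node...
        totalShards += node.size();          // ...summing its shard count
    }
    return (float) totalShards / routingNodes.size();
}
```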
As can be validated from the graph, about 50% of the time spent relocating 6k (empty) shards from 100 source nodes onto 100 destination nodes is attributable to the average-shard calculation during weight determination.
Related component
Indexing:Replication
To Reproduce
Create 500k shards on a setup with 1000 data nodes and 3 master nodes.
Expected behavior
Perform an offline calculation of the total shards across all nodes and cache it, so that LocalShardBalancer does not need to traverse the metadata of all nodes for weight calculation. A sketch of this idea follows.
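A minimal sketch of the proposed caching, assuming the totals can be computed in a single pass when the balancer is set up for an allocation round. The `ShardCountCache` class and its members are hypothetical names for illustration, not an actual OpenSearch API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: compute shard totals once per allocation round and
// reuse them for every subsequent weight evaluation, instead of re-traversing
// all node metadata on each call.
final class ShardCountCache {
    private final float avgShardsPerNode;            // cluster-wide average
    private final Map<String, Float> avgPerIndex;    // per-index averages

    ShardCountCache(Iterable<RoutingNode> routingNodes) {
        int nodeCount = 0;
        int totalShards = 0;
        final Map<String, Integer> shardsPerIndex = new HashMap<>();
        for (RoutingNode node : routingNodes) {      // one O(#nodes) pass, done once
            nodeCount++;
            totalShards += node.size();
            for (ShardRouting shard : node) {
                shardsPerIndex.merge(shard.getIndexName(), 1, Integer::sum);
            }
        }
        this.avgShardsPerNode = nodeCount == 0 ? 0f : (float) totalShards / nodeCount;
        this.avgPerIndex = new HashMap<>();
        final int nodes = nodeCount;
        shardsPerIndex.forEach((index, count) ->
            avgPerIndex.put(index, (float) count / nodes));
    }

    // O(1) lookups replace the per-call traversal in the weight function.
    float avgShardsPerNode() { return avgShardsPerNode; }
    float avgShardsPerNode(String index) { return avgPerIndex.getOrDefault(index, 0f); }
}
```

With the reproduction setup above (1000 data nodes, 500k shards), this would turn the repeated per-node metadata traversals in the weight function into a single pass per allocation round.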