Constraints to de-prioritize nodes from becoming shard allocation targets #487

Closed
ashwinpankaj opened this issue Apr 2, 2021 · 6 comments
Labels
enhancement (Enhancement or improvement to existing feature or request)

Comments

@ashwinpankaj
Contributor

Is your feature request related to a problem? Please describe.

In clusters where each node already holds a reasonable number of shards, when a node is replaced (or a small number of nodes is added), all shards of newly created indexes get allocated on the new, empty node(s). This puts a lot of stress on the single (or few) new node(s), degrading overall cluster performance.

This happens because the index-balance factor in the weight function cannot offset the shard-balance factor when the other nodes are already filled with shards. As a result, the new node always has the minimum weight and keeps getting picked as the allocation target, until it holds approximately the mean number of shards per node for the cluster.
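For context, the balancer computes a per-node weight as a blend of a cluster-wide shard-balance term and a per-index term. The sketch below is a simplified, self-contained rendering of that formula (modeled on OpenSearch's BalancedShardsAllocator, not the exact code); it shows how an empty node's large negative shard-balance term dominates, so the node keeps winning allocations:

```java
// Simplified sketch of the balanced-shards-allocator weight function.
class WeightSketch {
    // Defaults mirror cluster.routing.allocation.balance.shard (0.45f)
    // and cluster.routing.allocation.balance.index (0.55f).
    static final float SHARD_BALANCE = 0.45f;
    static final float INDEX_BALANCE = 0.55f;
    static final float THETA0 = SHARD_BALANCE / (SHARD_BALANCE + INDEX_BALANCE);
    static final float THETA1 = INDEX_BALANCE / (SHARD_BALANCE + INDEX_BALANCE);

    static float weight(int nodeShards, float avgShardsPerNode,
                        int nodeIndexShards, float avgIndexShardsPerNode) {
        float shardBalance = nodeShards - avgShardsPerNode;            // cluster-wide skew
        float indexBalance = nodeIndexShards - avgIndexShardsPerNode;  // per-index skew
        return THETA0 * shardBalance + THETA1 * indexBalance;
    }

    public static void main(String[] args) {
        // Three old nodes with 100 shards each plus one new empty node (avg = 75).
        // For a new index averaging 1 shard per node, the empty node's weight is
        // strongly negative, so it is always the preferred allocation target.
        System.out.println("old node: " + weight(100, 75f, 1, 1f)); // ~  11.25
        System.out.println("new node: " + weight(0, 75f, 0, 1f));   // ~ -34.3
    }
}
```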

This was originally submitted as an issue to Elasticsearch.

Describe the solution you'd like

We propose an allocation constraint mechanism that de-prioritizes nodes from getting picked for allocation if they breach certain constraints. Whenever an allocation constraint is breached for a shard (or index) on a node, we add a high positive constant to the node's weight. This increased weight makes the node less preferable for allocating the shard. Unlike deciders, however, this is not a hard filter: if no other nodes are eligible to accept shards (say, due to deciders like disk watermarks), the shard can still be allocated on a node with breached constraints.
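A minimal sketch of the mechanism follows; class and member names here (NodeState, constrainedWeight, the CONSTRAINT_WEIGHT value, the example predicate) are hypothetical illustrations of the idea, not the merged implementation:

```java
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch: each breached constraint adds a large, finite
// penalty to the node's weight, de-prioritizing it without excluding it.
class ConstraintSketch {
    // Large enough to dominate ordinary weight differences, yet finite,
    // so a breached node can still accept shards as a last resort.
    static final long CONSTRAINT_WEIGHT = 1_000_000L;

    record NodeState(int indexShardsOnNode, float avgIndexShardsPerNode) {}

    // Example constraint: the node already holds at least the per-node
    // average number of this index's shards.
    static final Predicate<NodeState> INDEX_SHARDS_PER_NODE_BREACHED =
        n -> n.indexShardsOnNode() >= n.avgIndexShardsPerNode();

    static long constrainedWeight(float baseWeight, NodeState node,
                                  List<Predicate<NodeState>> constraints) {
        long weight = (long) baseWeight;
        for (Predicate<NodeState> constraint : constraints) {
            if (constraint.test(node)) {
                weight += CONSTRAINT_WEIGHT; // soft penalty, unlike a decider's hard NO
            }
        }
        return weight;
    }
}
```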

Describe alternatives you've considered

A workaround is to set the total-shards-per-node index/cluster setting, which limits the number of shards of an index on a single node.

This, however, is a hard limit enforced by allocation deciders: if it is breached on all nodes, the shards go unassigned, causing yellow/red clusters. Configuring this setting requires careful calculation based on the number of nodes, and it must be revised whenever the cluster is scaled down.
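For reference, a sketch of applying that cap; the setting key index.routing.allocation.total_shards_per_node and the enforcing ShardsLimitAllocationDecider are real, while the wrapper class is illustrative only:

```java
import org.opensearch.common.settings.Settings;

// Sketch of the workaround: a hard per-node cap on an index's shards.
// Enforced by the ShardsLimitAllocationDecider; if every node hits the
// cap, remaining shards stay unassigned (the yellow/red risk above).
class TotalShardsPerNodeExample {
    static Settings capIndexShardsPerNode(int maxShardsPerNode) {
        return Settings.builder()
            .put("index.routing.allocation.total_shards_per_node", maxShardsPerNode)
            .build();
    }
}
```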

Additional context

elastic/elasticsearch#43350

@ashwinpankaj ashwinpankaj added the enhancement label on Apr 2, 2021
@itiyamas
Contributor

itiyamas commented Apr 8, 2021

I see that this is disabled by default and only enabled via a setting. Are there any downsides you see that prevent enabling it by default?

@ashwinpankaj
Contributor Author

ashwinpankaj commented Apr 8, 2021

I see that this is disabled by default and only enabled via a setting. Are there any downsides you see that prevent enabling it by default?

So far, no issues have been reported due to this setting, but we don't have the adoption numbers to make an informed decision yet. I decided to leave it disabled by default after conferring with the team. Let me get back to you after double-checking.

@itiyamas
Contributor

If it is disabled by default due to

  • a stability issue, let us enable it by adding more tests.
  • a performance problem you foresee, let us add more JMH benchmarks and either work around it or solve the performance problem.
  • the fact that some use cases may not work well, let us identify those and find a solution for all use cases.

@shwetathareja
Member

+1, let's enable this setting by default, as there is value for customers in keeping it enabled: it would prevent hot-node issues and unnecessary shard movements, since shards of a newly created index are distributed evenly across all nodes. The setting would still provide extra control to customers in case they see some degradation. At a later point, we should look to deprecate this setting altogether.

@minalsha
Contributor

minalsha commented Sep 7, 2021

@ashwinpankaj Closing this issue since PR #777 is merged. If that isn't the case, feel free to reopen.

@minalsha minalsha closed this as completed Sep 7, 2021
@asafm

asafm commented Feb 10, 2022

Do you plan to document this in the OpenSearch documentation, @ashwinpankaj?
