Add changes for graceful node decommission #4586

Merged
Changes from 1 commit
102 commits
bdb6c6f
Add changes for graceful node decommission
pranikum Sep 25, 2022
afec689
Add WRRWeights class
pranikum Sep 25, 2022
3314ec3
Add logger message
pranikum Sep 25, 2022
bf622c4
Add tests
pranikum Sep 25, 2022
d933377
Remove unused code
pranikum Sep 26, 2022
bf4c217
Remove test annonation
pranikum Sep 26, 2022
c57530a
Fix spotless
pranikum Sep 26, 2022
7c43ad1
Add package info
pranikum Sep 26, 2022
ba8f500
Add weigh away status
pranikum Sep 26, 2022
8920480
PR comments
pranikum Sep 26, 2022
eeed711
Update changelog
pranikum Sep 26, 2022
ed1ccbc
Fix tests
pranikum Sep 26, 2022
7922291
Remove time check. Just we will schedule
pranikum Sep 27, 2022
bed942a
Set decommission status to In Progress on completion of timeout
pranikum Sep 27, 2022
67622a2
PR comments. Take latest changes for WRR API
pranikum Sep 27, 2022
b75b108
Add drain timeout to decommission request
pranikum Sep 27, 2022
f79b51d
Resolve merge conflicts
pranikum Sep 28, 2022
46a1b12
Merge conflict
pranikum Sep 28, 2022
afda24c
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Sep 28, 2022
509015e
Merge with latests
pranikum Sep 28, 2022
3f832f9
Fix tests and split set weights and weights population
pranikum Sep 28, 2022
b70370c
PR comments
pranikum Sep 28, 2022
5724ba8
Merging with latest
pranikum Sep 29, 2022
6d1ba1a
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Sep 29, 2022
e401908
Merge with latest.
pranikum Sep 29, 2022
289077f
Add changelog
pranikum Sep 29, 2022
0890c39
Merge with latest
pranikum Oct 3, 2022
ec0f0e1
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 3, 2022
2f10664
Merge on latest
pranikum Oct 3, 2022
daa0065
Spotless java check
pranikum Oct 3, 2022
e9e25f1
Fix compilation issue
pranikum Oct 3, 2022
4590da3
PR comments
pranikum Oct 3, 2022
abbec12
Fix logger check
pranikum Oct 3, 2022
bd6f485
PR comments
pranikum Oct 3, 2022
9c1b2aa
Fix PR comments
pranikum Oct 3, 2022
9aa4f37
Handle PR comments
pranikum Oct 6, 2022
fd9fabe
Merge with latest changes
pranikum Oct 6, 2022
79f06fc
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 6, 2022
0faa9b9
Merge graceful changes
pranikum Oct 6, 2022
adbc120
Make variable final
pranikum Oct 6, 2022
81d320b
Fix logger usage
pranikum Oct 6, 2022
de54abd
Merge changelog with latest
pranikum Oct 10, 2022
d2b0444
Update changelog Avoid joining during draining state
pranikum Oct 10, 2022
4ddaae4
Merge with latest
pranikum Oct 10, 2022
e5984f7
Add changelog
pranikum Oct 10, 2022
c600c6d
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 10, 2022
b97c83b
Handle node draining for node decommission
pranikum Oct 10, 2022
926c969
Fix changelog
pranikum Oct 10, 2022
abb9825
Changelog merge
pranikum Oct 12, 2022
26ba03e
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 12, 2022
8564208
Resolve conflict with latest
pranikum Oct 12, 2022
3e1c3b1
Merge to latest
pranikum Oct 14, 2022
01b208c
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 14, 2022
d8f0166
Merge with latest
pranikum Oct 14, 2022
f367f4f
Merge with Latest
pranikum Oct 14, 2022
e5ae1f3
Add changelog
pranikum Oct 14, 2022
d9d4904
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 14, 2022
f9ced18
Update Changelog
pranikum Oct 14, 2022
9881ef0
PR comments
pranikum Oct 15, 2022
1ed1b57
Fix validation message
pranikum Oct 15, 2022
dde8f48
Fix Tests
pranikum Oct 15, 2022
8453ca2
Add decommission State transition validation
pranikum Oct 16, 2022
3c417f5
Fix transition state
pranikum Oct 17, 2022
602ced2
Merge with latest
pranikum Oct 19, 2022
0574db5
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 19, 2022
3115772
checks valid weights and don't set it.
pranikum Oct 19, 2022
95fc853
Empty-Commit
pranikum Oct 19, 2022
165ce7d
Fix the integ tests
pranikum Oct 19, 2022
8e0f681
Merge with latest
pranikum Oct 20, 2022
460e96c
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 20, 2022
f3a7240
Resolve conflict
pranikum Oct 20, 2022
c147d15
PR comments
pranikum Oct 20, 2022
13c0c61
Spotless Fix and changes
pranikum Oct 20, 2022
56c48eb
Fix log message for logging
pranikum Oct 20, 2022
ce3cc6f
Fix spotless Apply
pranikum Oct 20, 2022
09aec9f
Fix logger usage
pranikum Oct 20, 2022
252a976
PR comments
pranikum Oct 20, 2022
60d4e87
Fix Tests
pranikum Oct 20, 2022
abdda72
Empty-Commit
pranikum Oct 20, 2022
1b8234c
Fix Tests
pranikum Oct 20, 2022
a4a2ff6
Empty-Commit
pranikum Oct 20, 2022
e410e45
Fix Integ tests
pranikum Oct 20, 2022
7d2f2df
Change dely for Draining check
pranikum Oct 20, 2022
e25e411
Lower delay to 100 mills
pranikum Oct 20, 2022
09b7cdc
Fix typo
pranikum Oct 20, 2022
878dad4
Take latest for main
pranikum Oct 20, 2022
9d28440
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 20, 2022
d2303c5
Merge with latest
pranikum Oct 20, 2022
b78b452
Fix spotless Java
pranikum Oct 20, 2022
d1bb936
Update delay to 300
pranikum Oct 20, 2022
a8a6032
Add no_Delay param. No need to use schedule when nodelay is true
pranikum Oct 21, 2022
b724464
Remove exposed param delay_timeout
pranikum Oct 21, 2022
10a6e5f
Update delay to 500 since still we see failure for cluster state
pranikum Oct 21, 2022
d2d171d
Empty-Commit
pranikum Oct 21, 2022
68e4e94
In case of no delay.. Avoid draining state
pranikum Oct 24, 2022
d6fe3c4
Merging with lates
pranikum Oct 24, 2022
ca0418f
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 24, 2022
04e4676
Resolve conflict with main
pranikum Oct 24, 2022
a6fbdb2
Fix test
pranikum Oct 24, 2022
811fe8e
Take latest file
pranikum Oct 25, 2022
97c93c1
Merge branch 'main' into graceful-node-decommission-wrr-2
pranikum Oct 25, 2022
122b16c
Resolve conflict
pranikum Oct 25, 2022
Add tests
Signed-off-by: pranikum <109206473+pranikum@users.noreply.github.com>
pranikum committed Sep 25, 2022

commit bf622c42829bcf545e5109f80f2a1bf94ff9f635
@@ -290,11 +290,11 @@ public void handleNodesDecommissionRequest(
checkHttpStatsForDecommissionedNodes(nodesToBeDecommissioned, reason, timeout, timeoutForNodeDecommission, nodesRemovedListener);
}

- private void setWeightForDecommissionedZone(List<String> zones) {
+ void setWeightForDecommissionedZone(List<String> zones) {
ClusterState clusterState = clusterService.getClusterApplierService().state();

DecommissionAttributeMetadata decommissionAttributeMetadata = clusterState.metadata().custom(DecommissionAttributeMetadata.TYPE);
Member


Use the utility function created in metadata to get the decommission attribute: clusterState.metadata().decommissionedAttribute()

Contributor Author


Done. Used clusterState.metadata().decommissionedAttributeMetadata().
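The suggestion in this thread, replacing a keyed custom(...) lookup with a typed helper on the metadata object, can be sketched with stand-in types. Everything below (MetadataAccessorSketch, DecommissionInfo, the "decommission_info" key, putCustom, decommissionInfo) is hypothetical and only illustrates the pattern; it is not the OpenSearch Metadata API.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in types for illustration only; these are NOT the OpenSearch classes.
class MetadataAccessorSketch {

    interface Custom {}

    static final class DecommissionInfo implements Custom {
        static final String TYPE = "decommission_info"; // hypothetical key
        final String zone;

        DecommissionInfo(String zone) {
            this.zone = zone;
        }
    }

    static final class Metadata {
        private final Map<String, Custom> customs = new HashMap<>();

        void putCustom(String type, Custom custom) {
            customs.put(type, custom);
        }

        // Generic lookup: every caller has to know the key and the expected type.
        @SuppressWarnings("unchecked")
        <T extends Custom> T custom(String type) {
            return (T) customs.get(type);
        }

        // Typed helper: the key and the cast live in one place.
        DecommissionInfo decommissionInfo() {
            return custom(DecommissionInfo.TYPE);
        }
    }

    public static void main(String[] args) {
        Metadata metadata = new Metadata();
        metadata.putCustom(DecommissionInfo.TYPE, new DecommissionInfo("zone-1"));

        DecommissionInfo viaLookup = metadata.custom(DecommissionInfo.TYPE);
        DecommissionInfo viaHelper = metadata.decommissionInfo();

        // Both paths return the same stored object.
        System.out.println(viaLookup == viaHelper);
    }
}
```

With a helper like this, call sites no longer repeat the key or rely on an unchecked cast, which is presumably why the reviewer asked for it.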

- assert decommissionAttributeMetadata.status().equals(DecommissionStatus.INIT)
+ assert decommissionAttributeMetadata.status().equals(DecommissionStatus.IN_PROGRESS)
: "unexpected status encountered while decommissioning nodes";
DecommissionAttribute decommissionAttribute = decommissionAttributeMetadata.decommissionAttribute();

@@ -341,7 +341,7 @@ public ClusterPutWRRWeightsResponse read(StreamInput in) throws IOException {
);
}

- public void checkHttpStatsForDecommissionedNodes(
+ void checkHttpStatsForDecommissionedNodes(
Set<DiscoveryNode> decommissionedNodes,
String reason,
TimeValue timeout,
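
The test added in this commit (testSetWeightsForDecommission) expects the decommissioned zone to receive weight "0" and every other zone weight "1". A minimal sketch of that mapping follows, assuming string-valued weights as in the test assertions; ZoneWeightsSketch and weightsFor are hypothetical names, and this is not the body of setWeightForDecommissionedZone.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; not the DecommissionController from this PR.
class ZoneWeightsSketch {

    /**
     * Gives the decommissioned zone weight "0" and every other zone weight "1",
     * matching the values asserted in testSetWeightsForDecommission.
     */
    static Map<String, String> weightsFor(List<String> zones, String decommissionedZone) {
        Map<String, String> weights = new LinkedHashMap<>(); // keeps insertion order
        for (String zone : zones) {
            weights.put(zone, zone.equals(decommissionedZone) ? "0" : "1");
        }
        return weights;
    }

    public static void main(String[] args) {
        Map<String, String> weights = weightsFor(List.of("zone-1", "zone-2", "zone-3"), "zone-1");
        System.out.println(weights); // prints {zone-1=0, zone-2=1, zone-3=1}
    }
}
```

A zero weight means weighted round-robin routing stops sending traffic to that zone, which is what lets the nodes drain before they are removed.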
@@ -10,11 +10,15 @@

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.mockito.ArgumentCaptor;
import org.mockito.Mockito;
import org.opensearch.OpenSearchTimeoutException;
import org.opensearch.Version;
import org.opensearch.action.ActionListener;
import org.opensearch.action.admin.cluster.configuration.TransportAddVotingConfigExclusionsAction;
import org.opensearch.action.admin.cluster.configuration.TransportClearVotingConfigExclusionsAction;
import org.opensearch.action.admin.cluster.shards.routing.wrr.put.ClusterPutWRRWeightsRequest;
import org.opensearch.action.support.ActionFilters;
import org.opensearch.cluster.ClusterName;
import org.opensearch.cluster.ClusterState;
@@ -28,6 +32,7 @@
import org.opensearch.cluster.node.DiscoveryNodes;
import org.opensearch.cluster.routing.allocation.AllocationService;
import org.opensearch.cluster.service.ClusterService;
import org.opensearch.common.collect.List;
import org.opensearch.common.settings.ClusterSettings;
import org.opensearch.common.settings.Settings;
import org.opensearch.common.unit.TimeValue;
@@ -36,6 +41,7 @@
import org.opensearch.test.transport.MockTransport;
import org.opensearch.threadpool.TestThreadPool;
import org.opensearch.threadpool.ThreadPool;
import org.opensearch.transport.TransportResponseHandler;
import org.opensearch.transport.TransportService;

import java.util.Arrays;
@@ -268,6 +274,84 @@ public void onFailure(Exception e) {
assertEquals(decommissionAttributeMetadata.status(), DecommissionStatus.SUCCESSFUL);
}

public void testSetWeightsForDecommission() {
TransportService mockTransportService = Mockito.mock(TransportService.class);
Mockito.when(mockTransportService.getLocalNode()).thenReturn(Mockito.mock(DiscoveryNode.class));
decommissionController = new DecommissionController(clusterService, mockTransportService, allocationService, threadPool);

DecommissionAttributeMetadata oldMetadata = new DecommissionAttributeMetadata(
new DecommissionAttribute("zone", "zone-1"),
DecommissionStatus.IN_PROGRESS
);
ClusterState state = clusterService.state();
Metadata metadata = state.metadata();
Metadata.Builder mdBuilder = Metadata.builder(metadata);
mdBuilder.decommissionAttributeMetadata(oldMetadata);
state = ClusterState.builder(state).metadata(mdBuilder).build();
setState(clusterService, state);

decommissionController.setWeightForDecommissionedZone(List.of("zone-1", "zone-2", "zone-3"));
ArgumentCaptor<ClusterPutWRRWeightsRequest> clusterPutWRRWeightsRequestArgumentCaptor = ArgumentCaptor.forClass(
ClusterPutWRRWeightsRequest.class
);
Mockito.verify(mockTransportService)
.sendRequest(
Mockito.any(DiscoveryNode.class),
Mockito.anyString(),
clusterPutWRRWeightsRequestArgumentCaptor.capture(),
Mockito.any(TransportResponseHandler.class)
);

ClusterPutWRRWeightsRequest request = clusterPutWRRWeightsRequestArgumentCaptor.getValue();
assertEquals("0", request.wrrWeight().weights().get("zone-1"));
assertEquals("1", request.wrrWeight().weights().get("zone-2"));
assertEquals("1", request.wrrWeight().weights().get("zone-3"));
}

@Test(expected = AssertionError.class)
public void testSetWeightsForDecommissionForDecommissionInit() {
TransportService mockTransportService = Mockito.mock(TransportService.class);
Mockito.when(mockTransportService.getLocalNode()).thenReturn(Mockito.mock(DiscoveryNode.class));
decommissionController = new DecommissionController(clusterService, mockTransportService, allocationService, threadPool);

DecommissionAttributeMetadata oldMetadata = new DecommissionAttributeMetadata(
new DecommissionAttribute("zone", "zone-1"),
DecommissionStatus.INIT
);
ClusterState state = clusterService.state();
Metadata metadata = state.metadata();
Metadata.Builder mdBuilder = Metadata.builder(metadata);
mdBuilder.decommissionAttributeMetadata(oldMetadata);
state = ClusterState.builder(state).metadata(mdBuilder).build();
setState(clusterService, state);

decommissionController.setWeightForDecommissionedZone(List.of("zone-1", "zone-2", "zone-3"));
}

public void testCheckHttpStatsForDecommissionedNodes() {
TransportService mockTransportService = Mockito.mock(TransportService.class);
ThreadPool mockThreadPool = Mockito.mock(ThreadPool.class);
Mockito.when(mockTransportService.getThreadPool()).thenReturn(mockThreadPool);
Mockito.when(mockTransportService.getLocalNode()).thenReturn(Mockito.mock(DiscoveryNode.class));
decommissionController = new DecommissionController(clusterService, mockTransportService, allocationService, threadPool);

DiscoveryNode node1 = Mockito.mock(DiscoveryNode.class);
DiscoveryNode node2 = Mockito.mock(DiscoveryNode.class);

ActionListener listener = Mockito.mock(ActionListener.class);

String reason = "Node is Decommissioned";
decommissionController.checkHttpStatsForDecommissionedNodes(
Set.of(node1, node2),
reason,
TimeValue.timeValueSeconds(30),
TimeValue.timeValueSeconds(60),
listener
);

Mockito.verify(mockThreadPool).schedule(Mockito.any(Runnable.class), Mockito.any(TimeValue.class), Mockito.anyString());
}

private static class AdjustConfigurationForExclusions implements ClusterStateObserver.Listener {

final CountDownLatch doneLatch;