
Adding feature to exclude indexes starting with dot from shard validator #4695

Merged (11 commits, Oct 19, 2022)
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -137,6 +137,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- Addition of Doc values on the GeoShape Field
- Addition of GeoShape ValueSource level code interfaces for accessing the DocValues.
- Addition of Missing Value feature in the GeoShape Aggregations.
- Added feature to ignore indexes starting with dot during shard limit validation. ([#4695](https://github.com/opensearch-project/OpenSearch/pull/4695))
Contributor Author commented: This is the parent comment, which is why it looks different from the comments above; those are sub-comments.

### Changed

@@ -43,15 +43,28 @@
import org.opensearch.cluster.ClusterState;
import org.opensearch.cluster.metadata.Metadata;
import org.opensearch.common.Priority;
import org.opensearch.common.network.NetworkModule;
import org.opensearch.common.settings.Settings;
import org.opensearch.common.unit.ByteSizeUnit;
import org.opensearch.core.internal.io.IOUtils;
import org.opensearch.indices.ShardLimitValidator;
import org.opensearch.snapshots.SnapshotInfo;
import org.opensearch.snapshots.SnapshotState;
import org.opensearch.snapshots.mockstore.MockRepository;
import org.opensearch.test.InternalSettingsPlugin;
import org.opensearch.test.InternalTestCluster;
import org.opensearch.test.MockHttpTransport;
import org.opensearch.test.NodeConfigurationSource;
import org.opensearch.test.OpenSearchIntegTestCase;
import org.opensearch.test.transport.MockTransportService;
import org.opensearch.transport.nio.MockNioTransportPlugin;

import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

import static org.opensearch.cluster.metadata.IndexMetadata.SETTING_NUMBER_OF_REPLICAS;
import static org.opensearch.cluster.metadata.IndexMetadata.SETTING_NUMBER_OF_SHARDS;
@@ -63,12 +76,18 @@
@OpenSearchIntegTestCase.ClusterScope(scope = OpenSearchIntegTestCase.Scope.TEST)
public class ClusterShardLimitIT extends OpenSearchIntegTestCase {
private static final String shardsPerNodeKey = ShardLimitValidator.SETTING_CLUSTER_MAX_SHARDS_PER_NODE.getKey();
private static final String ignoreHiddenIndexKey = ShardLimitValidator.SETTING_CLUSTER_IGNORE_HIDDEN_INDEXES.getKey();

public void testSettingClusterMaxShards() {
int shardsPerNode = between(1, 500_000);
setShardsPerNode(shardsPerNode);
}

public void testSettingIgnoreHiddenIndexes() {
boolean ignoreHiddenIndexes = randomBoolean();
setIgnoreHiddenIndex(ignoreHiddenIndexes);
}

public void testMinimumPerNode() {
int negativeShardsPerNode = between(-50_000, 0);
try {
@@ -100,7 +119,6 @@ public void testIndexCreationOverLimit() {
ShardCounts counts = ShardCounts.forDataNodeCount(dataNodes);

setShardsPerNode(counts.getShardsPerNode());

// Create an index that will bring us up to the limit
createIndex(
"test",
@@ -127,6 +145,108 @@ public void testIndexCreationOverLimit() {
assertFalse(clusterState.getMetadata().hasIndex("should-fail"));
}

/**
 * Checks that an index whose name starts with a dot can be created even when the node
 * already holds a number of shards equal to cluster.max_shards_per_node, provided the
 * cluster.ignore_hidden_indexes setting is set to true. With that setting enabled,
 * creation of dot-prefixed indexes succeeds despite the shard limit.
 */
public void testIndexCreationOverLimitForHiddenIndexesSucceeds() {
int dataNodes = client().admin().cluster().prepareState().get().getState().getNodes().getDataNodes().size();

// Setting the cluster.max_shards_per_node setting according to the data node count.
setShardsPerNode(dataNodes);
setIgnoreHiddenIndex(true);

/*
 Create an index that will bring us up to the limit. The index is created with a number
 of primaries equal to dataNodes * dataNodes so that the cluster.max_shards_per_node
 limit is reached.
*/
createIndex(
"test",
Settings.builder()
.put(indexSettings())
.put(SETTING_NUMBER_OF_SHARDS, dataNodes * dataNodes)
.put(SETTING_NUMBER_OF_REPLICAS, 0)
.build()
);

ashking94 (Member) commented on Oct 7, 2022: It would be good to add assertions around the max shard per node setting value and the total number of shards created so far.

Contributor Author replied: ACK, will add.
// Getting total active shards in the cluster.
int currentActiveShards = client().admin().cluster().prepareHealth().get().getActiveShards();

// Getting cluster.max_shards_per_node setting
ClusterState clusterState = client().admin().cluster().prepareState().get().getState();
String maxShardsPerNode = clusterState.getMetadata()
.settings()
.get(ShardLimitValidator.SETTING_CLUSTER_MAX_SHARDS_PER_NODE.getKey());

// Checking if the total shards created are equivalent to dataNodes * cluster.max_shards_per_node
assertEquals(dataNodes * Integer.parseInt(maxShardsPerNode), currentActiveShards);

createIndex(
".test-index",
Settings.builder().put(indexSettings()).put(SETTING_NUMBER_OF_SHARDS, 1).put(SETTING_NUMBER_OF_REPLICAS, 0).build()
);

clusterState = client().admin().cluster().prepareState().get().getState();
assertTrue(clusterState.getMetadata().hasIndex(".test-index"));
Member commented: Let's add an assertion here as well.

Contributor Author replied: ACK, will add.
}
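The setting's effect can be reduced to a simple predicate. The sketch below is a hypothetical illustration of the rule the two tests exercise; `DotIndexFilter` and `countsTowardShardLimit` are assumed names, not the actual `ShardLimitValidator` code:

```java
// Hypothetical sketch of the predicate the tests above exercise.
// Names (DotIndexFilter, countsTowardShardLimit) are assumed, not OpenSearch APIs.
public class DotIndexFilter {

    // When cluster.ignore_hidden_indexes is true, indexes whose names start
    // with a dot are excluded from shard limit validation.
    static boolean countsTowardShardLimit(String indexName, boolean ignoreDotIndexes) {
        return !(ignoreDotIndexes && indexName.startsWith("."));
    }

    public static void main(String[] args) {
        System.out.println(countsTowardShardLimit(".test-index", true));  // false: skipped by the validator
        System.out.println(countsTowardShardLimit("test", true));         // true: still validated
        System.out.println(countsTowardShardLimit(".test-index", false)); // true: validated when the setting is off
    }
}
```

This mirrors the test pair: with the setting on, `.test-index` is created past the limit; with it off, creation fails like any other index.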

/**
 * Checks that an index whose name starts with a dot is not created when the node already
 * holds a number of shards equal to cluster.max_shards_per_node and the
 * cluster.ignore_hidden_indexes setting is set to false. With that setting disabled,
 * creation of dot-prefixed indexes fails just like any other index.
 */
public void testIndexCreationOverLimitForHiddenIndexesFail() {
int dataNodes = client().admin().cluster().prepareState().get().getState().getNodes().getDataNodes().size();
int maxAllowedShards = dataNodes * dataNodes;

// Setting the cluster.max_shards_per_node setting according to the data node count.
setShardsPerNode(dataNodes);

/*
 Create an index that will bring us up to the limit. The index is created with a number
 of primaries equal to dataNodes * dataNodes so that the cluster.max_shards_per_node
 limit is reached.
*/
createIndex(
"test",
Settings.builder()
.put(indexSettings())
.put(SETTING_NUMBER_OF_SHARDS, maxAllowedShards)
.put(SETTING_NUMBER_OF_REPLICAS, 0)
.build()
);

// Getting total active shards in the cluster.
int currentActiveShards = client().admin().cluster().prepareHealth().get().getActiveShards();

// Getting cluster.max_shards_per_node setting
ClusterState clusterState = client().admin().cluster().prepareState().get().getState();
String maxShardsPerNode = clusterState.getMetadata()
.settings()
.get(ShardLimitValidator.SETTING_CLUSTER_MAX_SHARDS_PER_NODE.getKey());

// Checking if the total shards created are equivalent to dataNodes * cluster.max_shards_per_node
assertEquals(dataNodes * Integer.parseInt(maxShardsPerNode), currentActiveShards);

int extraShardCount = 1;
try {
createIndex(
".test-index",
Settings.builder()
.put(indexSettings())
.put(SETTING_NUMBER_OF_SHARDS, extraShardCount)
.put(SETTING_NUMBER_OF_REPLICAS, 0)
.build()
);
} catch (IllegalArgumentException e) {
verifyException(maxAllowedShards, currentActiveShards, extraShardCount, e);
}
clusterState = client().admin().cluster().prepareState().get().getState();
assertFalse(clusterState.getMetadata().hasIndex(".test-index"));
}
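The rejection path above follows the usual cluster-wide budget arithmetic. A minimal sketch, with hypothetical names rather than the validator's actual code:

```java
// Hypothetical sketch of the shard-limit arithmetic this test exercises.
public class ShardLimitCheck {

    // The cluster-wide budget is dataNodes * cluster.max_shards_per_node;
    // a request is rejected when current + added shards would exceed it.
    static boolean wouldExceedLimit(int currentShards, int addedShards, int dataNodes, int maxShardsPerNode) {
        return currentShards + addedShards > dataNodes * maxShardsPerNode;
    }

    public static void main(String[] args) {
        // With shardsPerNode set to dataNodes, the "test" index already fills the
        // budget of dataNodes * dataNodes shards, so one more shard is rejected.
        int dataNodes = 2;
        System.out.println(wouldExceedLimit(dataNodes * dataNodes, 1, dataNodes, dataNodes));     // true
        System.out.println(wouldExceedLimit(dataNodes * dataNodes - 1, 1, dataNodes, dataNodes)); // false
    }
}
```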

public void testIndexCreationOverLimitFromTemplate() {
int dataNodes = client().admin().cluster().prepareState().get().getState().getNodes().getDataNodes().size();

@@ -414,6 +534,100 @@ public void testOpenIndexOverLimit() {
assertFalse(clusterState.getMetadata().hasIndex("snapshot-index"));
}

public void testIgnoreHiddenSettingOnMultipleNodes() throws IOException, InterruptedException {
int maxAllowedShardsPerNode = 10, indexPrimaryShards = 11, indexReplicaShards = 1;

InternalTestCluster cluster = new InternalTestCluster(
randomLong(),
createTempDir(),
true,
true,
0,
0,
"cluster",
new NodeConfigurationSource() {
@Override
public Settings nodeSettings(int nodeOrdinal) {
return Settings.builder()
.put(ClusterShardLimitIT.this.nodeSettings(nodeOrdinal))
.put(NetworkModule.TRANSPORT_TYPE_KEY, getTestTransportType())
.build();
}

@Override
public Path nodeConfigPath(int nodeOrdinal) {
return null;
}
},
0,
"cluster-",
Arrays.asList(
TestSeedPlugin.class,
MockHttpTransport.TestPlugin.class,
MockTransportService.TestPlugin.class,
MockNioTransportPlugin.class,
InternalSettingsPlugin.class,
MockRepository.Plugin.class
),
Function.identity()
);
cluster.beforeTest(random());

// Starting 3 ClusterManagerOnlyNode nodes
cluster.startClusterManagerOnlyNode(Settings.builder().put("cluster.ignore_hidden_indexes", true).build());
cluster.startClusterManagerOnlyNode(Settings.builder().put("cluster.ignore_hidden_indexes", false).build());
cluster.startClusterManagerOnlyNode(Settings.builder().put("cluster.ignore_hidden_indexes", false).build());

// Starting 2 data nodes
cluster.startDataOnlyNode(Settings.builder().put("cluster.ignore_hidden_indexes", false).build());
cluster.startDataOnlyNode(Settings.builder().put("cluster.ignore_hidden_indexes", false).build());

// Setting max shards per node to be 10
cluster.client()
.admin()
.cluster()
.prepareUpdateSettings()
.setPersistentSettings(Settings.builder().put(shardsPerNodeKey, maxAllowedShardsPerNode))
.get();

// Creating an index starting with dot having shards greater than the desired node limit
cluster.client()
.admin()
.indices()
.prepareCreate(".test-index")
.setSettings(
Settings.builder().put(SETTING_NUMBER_OF_SHARDS, indexPrimaryShards).put(SETTING_NUMBER_OF_REPLICAS, indexReplicaShards)
)
.get();

// Since the active cluster-manager node's setting takes precedence, stop the active one.
// That is the first node started (the one with cluster.ignore_hidden_indexes=true), because the calls above are blocking.
cluster.stopCurrentClusterManagerNode();

// Waiting for all shards to get assigned
cluster.client().admin().cluster().prepareHealth().setWaitForGreenStatus().get();

// Creating an index starting with dot having shards greater than the desired node limit
try {
cluster.client()
.admin()
.indices()
.prepareCreate(".test-index1")
.setSettings(
Settings.builder().put(SETTING_NUMBER_OF_SHARDS, indexPrimaryShards).put(SETTING_NUMBER_OF_REPLICAS, indexReplicaShards)
)
.get();
} catch (IllegalArgumentException e) {
ClusterHealthResponse clusterHealth = cluster.client().admin().cluster().prepareHealth().get();
int currentActiveShards = clusterHealth.getActiveShards();
int dataNodeCount = clusterHealth.getNumberOfDataNodes();
int extraShardCount = indexPrimaryShards * (1 + indexReplicaShards);
verifyException(maxAllowedShardsPerNode * dataNodeCount, currentActiveShards, extraShardCount, e);
}

IOUtils.close(cluster);
}
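The extraShardCount computation above reflects how an index's total shard footprint is counted: primaries plus one replica copy per primary. A minimal sketch of that arithmetic, with an assumed helper name:

```java
// Hypothetical helper illustrating the shard-count arithmetic used above.
public class ShardMath {

    // An index with p primaries and r replicas contributes p * (1 + r) shards,
    // matching extraShardCount = indexPrimaryShards * (1 + indexReplicaShards).
    static int totalShards(int primaries, int replicas) {
        return primaries * (1 + replicas);
    }

    public static void main(String[] args) {
        System.out.println(totalShards(11, 1)); // 22: 11 primaries, each with 1 replica
        System.out.println(totalShards(11, 0)); // 11: no replicas
    }
}
```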

private int ensureMultipleDataNodes(int dataNodes) {
if (dataNodes == 1) {
internalCluster().startNode(dataNode());
@@ -457,6 +671,29 @@ private void setShardsPerNode(int shardsPerNode) {
}
}

private void setIgnoreHiddenIndex(boolean ignoreHiddenIndex) {
try {
ClusterUpdateSettingsResponse response;
if (frequently()) {
Member commented: Can we revisit whether the usage of frequently is required here?

Contributor Author replied: Yes; we will check that the setting is applied for both persistent and transient updates.
response = client().admin()
.cluster()
.prepareUpdateSettings()
.setPersistentSettings(Settings.builder().put(ignoreHiddenIndexKey, ignoreHiddenIndex).build())
.get();
assertEquals(ignoreHiddenIndex, response.getPersistentSettings().getAsBoolean(ignoreHiddenIndexKey, true));
} else {
response = client().admin()
.cluster()
.prepareUpdateSettings()
.setTransientSettings(Settings.builder().put(ignoreHiddenIndexKey, ignoreHiddenIndex).build())
.get();
assertEquals(ignoreHiddenIndex, response.getTransientSettings().getAsBoolean(ignoreHiddenIndexKey, true));
}
} catch (IllegalArgumentException ex) {
fail(ex.getMessage());
}
}

private void verifyException(int dataNodes, ShardCounts counts, IllegalArgumentException e) {
int totalShards = counts.getFailingIndexShards() * (1 + counts.getFailingIndexReplicas());
int currentShards = counts.getFirstIndexShards() * (1 + counts.getFirstIndexReplicas());
@@ -471,4 +708,15 @@ private void verifyException(int dataNodes, ShardCounts counts, IllegalArgumentException e) {
assertEquals(expectedError, e.getMessage());
}

private void verifyException(int maxShards, int currentShards, int extraShards, IllegalArgumentException e) {
String expectedError = "Validation Failed: 1: this action would add ["
+ extraShards
+ "] total shards, but this cluster currently has ["
+ currentShards
+ "]/["
+ maxShards
+ "] maximum shards open;";
assertEquals(expectedError, e.getMessage());
}
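The expected message asserted by verifyException follows the validation-error format shown in the source. A small sketch reproducing it, with an assumed helper name:

```java
// Rebuilds the validation error string that verifyException asserts against.
// The class and method names here are assumed for illustration.
public class ShardLimitMessage {

    static String validationError(int extraShards, int currentShards, int maxShards) {
        return "Validation Failed: 1: this action would add ["
            + extraShards
            + "] total shards, but this cluster currently has ["
            + currentShards
            + "]/["
            + maxShards
            + "] maximum shards open;";
    }

    public static void main(String[] args) {
        // The rejection message for 1 extra shard against a full 4/4 budget.
        System.out.println(validationError(1, 4, 4));
    }
}
```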

}
@@ -252,6 +252,7 @@ public void apply(Settings value, Settings current, Settings previous) {
Metadata.SETTING_READ_ONLY_SETTING,
Metadata.SETTING_READ_ONLY_ALLOW_DELETE_SETTING,
ShardLimitValidator.SETTING_CLUSTER_MAX_SHARDS_PER_NODE,
ShardLimitValidator.SETTING_CLUSTER_IGNORE_HIDDEN_INDEXES,
Collaborator commented: Indices beginning with "." are not necessarily hidden, and vice-versa; index.hidden is a separate index-level setting. Many OpenSearch plugins use it (like cross-cluster) and many don't (like the security plugin). Should we say dot_indices?

nishchay21 (Contributor Author) replied on Oct 18, 2022: Previously named this dot_index; however, I was asked to rename it, as mentioned in the closed PR #4674.

Collaborator commented: @Bukhtawar: what is your take on using hidden or dot?

Bukhtawar (Collaborator) replied on Oct 19, 2022: Dot indices have been deprecated, so going forward we should use hidden and system indices only. For 2.x we might still have to support dot indices. Any reason why the security plugin is still using a dot index? I am fine with dot index for the purpose of this PR and backporting to 2.x, but please open a PR to deprecate dot indices going forward.

Contributor Author replied: @Bukhtawar @gbbafna so should we change to dot or go with hidden?
RecoverySettings.INDICES_RECOVERY_MAX_BYTES_PER_SEC_SETTING,
RecoverySettings.INDICES_RECOVERY_RETRY_DELAY_STATE_SYNC_SETTING,
RecoverySettings.INDICES_RECOVERY_RETRY_DELAY_NETWORK_SETTING,