Describe the bug
With Segment Replication it is possible for a shard to be closed while a segment copy is still ongoing. While rare, this can result in a ReplicationListener being left open on shard close. This happens because the beforeIndexShardClosed method in SegmentReplicationTargetService cancels ongoing replications, but this is only a request for cancellation: the method does not block and wait for all resources to be cleaned up.
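For context, the cancellation entry point looks roughly like the following (a simplified sketch based on the description above; the exact body and reason string in the OpenSearch source may differ):

// Sketch of SegmentReplicationTargetService.beforeIndexShardClosed as described
// above -- a fire-and-forget cancellation request. Simplified; not the exact source.
@Override
public void beforeIndexShardClosed(ShardId shardId, @Nullable IndexShard indexShard, Settings indexSettings) {
    if (indexShard != null) {
        // Removes the target from the ReplicationCollection and requests that the
        // ongoing copy be cancelled, then returns without waiting for cleanup.
        onGoingReplications.cancelForShard(shardId, "shard closed");
    }
}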
Within these components there are two listeners. The first, a ReplicationListener passed to the ReplicationTarget, is only resolved through the ReplicationCollection when it is marked done/failed. The second is internal to SegmentReplicationTarget and is passed directly on execution; it alerts the target service that a replication event has completed and that the collection's done/fail methods should be invoked.
Cancellation removes the shard from SegmentReplicationTargetService's ReplicationCollection immediately in beforeIndexShardClosed, so when the internal listener passed to startReplication is resolved with success/failure, the collection does not notify the original ReplicationListener, as the target has already been removed from the collection (see the sketch after the stacktrace below). This surfaces with the following stacktrace:
REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.index.mapper.ICUCollationKeywordFieldMapperIT.testCustomRules" -Dtests.seed=6885F19536DCDBB9 -Dtests.security.manager=false -Dtests.locale=no -Dtests.timezone=America/Adak -Druntime.java=19
java.lang.AssertionError: All incoming requests on node [node_s3] should have finished. Expected 0 but got 437; pending tasks [[{
"node" : "node_s3",
"id" : 415,
"type" : "transport",
"action" : "internal:index/shard/recovery/start_recovery",
"description" : "",
"start_time" : "2023-06-27T17:03:01.988Z",
"start_time_in_millis" : 1687885381988,
"running_time" : "1m",
"running_time_in_nanos" : 65986775792,
"cancellable" : false,
"cancelled" : false,
"headers" : { },
"resource_stats" : {
"average" : {
"cpu_time_in_nanos" : 0,
"memory_in_bytes" : 0
},
"total" : {
"cpu_time_in_nanos" : 0,
"memory_in_bytes" : 0
},
"min" : {
"cpu_time_in_nanos" : 0,
"memory_in_bytes" : 0
},
"max" : {
"cpu_time_in_nanos" : 0,
"memory_in_bytes" : 0
},
"thread_info" : {
"thread_executions" : 0,
"active_threads" : 0
}
}
}]]
    at __randomizedtesting.SeedInfo.seed([6885F19536DCDBB9:E8D91A8A3570CB6A]:0)
    at org.opensearch.test.InternalTestCluster.lambda$assertRequestsFinished$43(InternalTestCluster.java:2746)
    at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1084)
    at org.opensearch.test.InternalTestCluster.assertRequestsFinished(InternalTestCluster.java:2737)
    at org.opensearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2712)
    at org.opensearch.test.OpenSearchIntegTestCase.afterInternal(OpenSearchIntegTestCase.java:603)
    at org.opensearch.test.OpenSearchIntegTestCase.cleanUpCluster(OpenSearchIntegTestCase.java:2245)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    at java.base/java.lang.reflect.Method.invoke(Method.java:578)
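The race between the two listeners can be reduced to a minimal, self-contained model (hypothetical names throughout; the real ReplicationCollection is keyed and synchronized differently):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Minimal model of the race described above (hypothetical names; not OpenSearch code).
// The "collection" maps replicationId -> external listener. Cancellation removes the
// entry immediately, so when the internal listener later completes, the lookup misses
// and the external listener is never resolved.
public class ListenerRaceSketch {
    static final Map<Long, Consumer<String>> collection = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        long replicationId = 1L;
        collection.put(replicationId, result -> System.out.println("external listener: " + result));

        // beforeIndexShardClosed: a request for cancellation that returns immediately.
        collection.remove(replicationId);

        // Later, the internal listener of startReplication resolves with failure...
        Consumer<String> external = collection.get(replicationId);
        if (external != null) {
            external.accept("failed: cancelled");
        } else {
            // ...but the target was already removed, so nobody notifies the
            // original ReplicationListener -- it leaks, tripping the
            // "All incoming requests ... should have finished" assertion.
            System.out.println("external listener leaked");
        }
    }
}

Running this model prints "external listener leaked": once the cancellation path has removed the entry, the completion path has no way to reach the original listener.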
To Reproduce
Steps to reproduce the behavior:
Enable SR by default & run any test using SUITE scope - I used ICUCollationKeywordFieldMapperIT and hit this after a couple hundred iterations on shard close.
I also wrote a unit test in SegmentReplicationIndexShardTests:
public void testBeforeIndexShardClosedWhileCopyingFiles() throws Exception {
    try (ReplicationGroup shards = createGroup(1, settings, new NRTReplicationEngineFactory())) {
        shards.startAll();
        IndexShard primary = shards.getPrimary();
        final IndexShard replica = shards.getReplicas().get(0);
        primary.refresh("Test");
        final SegmentReplicationSourceFactory sourceFactory = mock(SegmentReplicationSourceFactory.class);
        final SegmentReplicationTargetService targetService = newTargetService(sourceFactory);
        SegmentReplicationSource source = new TestReplicationSource() {

            ActionListener<GetSegmentFilesResponse> listener;

            @Override
            public void getCheckpointMetadata(
                long replicationId,
                ReplicationCheckpoint checkpoint,
                ActionListener<CheckpointInfoResponse> listener
            ) {
                resolveCheckpointInfoResponseListener(listener, primary);
            }

            @Override
            public void getSegmentFiles(
                long replicationId,
                ReplicationCheckpoint checkpoint,
                List<StoreFileMetadata> filesToFetch,
                IndexShard indexShard,
                ActionListener<GetSegmentFilesResponse> listener
            ) {
                // set the listener, we will only fail it once we cancel the source.
                this.listener = listener;
                // shard is closing while we are copying files.
                targetService.beforeIndexShardClosed(replica.shardId, replica, Settings.EMPTY);
            }

            @Override
            public void cancel() {
                // simulate listener resolving, but only after we have issued a cancel from beforeIndexShardClosed.
                final RuntimeException exception = new CancellableThreads.ExecutionCancelledException("retryable action was cancelled");
                listener.onFailure(exception);
            }
        };
        when(sourceFactory.get(any())).thenReturn(source);
        startReplicationAndAssertCancellation(replica, targetService);

        shards.removeReplica(replica);
        closeShards(replica);
    }
}
Expected behavior
All resources to be cleaned up.