Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Searchable Snapshots] Add integration test to validate cache files are closed after use #6215

Conversation

tlfeng
Copy link
Collaborator

@tlfeng tlfeng commented Feb 7, 2023

Description

  • Add an integration test for Searchable Snapshots to validate all cache files are closed / no resource leak after an index which backed by remote snapshot is deleted.

The cache directory is supposed to exist in the parent directory for the shard data of the index (https://github.com/opensearch-project/OpenSearch/blob/2.5.0/server/src/internalClusterTest/java/org/opensearch/snapshots/SearchableSnapshotIT.java#L367-L370)

final Path file = PathUtils.get(path)
    .resolve("indices")
    .resolve(index.getUUID())
    .resolve(Integer.toString(shardRouting.getId()))
    .resolve("LocalCache");

Based on that, I iterator all the files in the parent directory to validate all the files are closed.
I used Files.move() to check whether the file is closed / locked / in-use or not, if the file can be moved, then it proves the file has been closed properly. (reference: https://stackoverflow.com/questions/1500174/check-if-a-file-is-locked-in-java/13706972#13706972)

The steps for the whole test are based on the idea from the original issue #5244

creating remote backed index, do some searches, and then delete the index. And monitor the file descriptor held by process affecting the caching directory.

Run the test: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.snapshots.SearchableSnapshotIT.testCacheFilesAreClosedAfterUse"

Issues Resolved

#5244

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…re closed after use

Signed-off-by: Tianli Feng <ftianli@amazon.com>
@tlfeng tlfeng added :test Adding or fixing a test v3.0.0 Issues and PRs related to version 3.0.0 backport 2.x Backport to 2.x branch v2.6.0 'Issues and PRs related to version v2.6.0' labels Feb 7, 2023
Signed-off-by: Tianli Feng <ftianli@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 7, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Feb 7, 2023

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Tianli Feng <ftianli@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 7, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      2 org.opensearch.geo.search.aggregations.bucket.GeoHashGridIT.testGeoShapes

Signed-off-by: Tianli Feng <ftianli@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 8, 2023

Gradle Check (Jenkins) Run Completed with:

…archable-snapshot

Signed-off-by: Tianli Feng <ftianli@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 8, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Feb 8, 2023

Gradle Check (Jenkins) Run Completed with:

CHANGELOG.md Outdated
@@ -9,6 +9,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Add getter for path field in NestedQueryBuilder ([#4636](https://github.com/opensearch-project/OpenSearch/pull/4636))
- Allow mmap to use new JDK-19 preview APIs in Apache Lucene 9.4+ ([#5151](https://github.com/opensearch-project/OpenSearch/pull/5151))
- Add support for ppc64le architecture ([#5459](https://github.com/opensearch-project/OpenSearch/pull/5459))
- [Searchable Snapshots] Add integration test to validate cache files are closed after use ([#6215](https://github.com/opensearch-project/OpenSearch/pull/6215)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test-only changes can be omitted from the changelog.

Also, though this is moot since we don't need an entry for this change, but any change that will be backported to 2.x should only be in the [Unreleased 2.x] section because this will never be "unreleased" relative to 3.0 because it will go into a 2.x release.

Copy link
Collaborator Author

@tlfeng tlfeng Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrross Thanks for your explanation and pointing me to the document of the changelog! The guide is indeed very clear.
I will remove the change in this file and add skip-changelog label.

Tianli Feng added 2 commits February 8, 2023 09:46
Signed-off-by: Tianli Feng <ftianli@amazon.com>
…archable-snapshot

Signed-off-by: Tianli Feng <ftianli@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 8, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Feb 8, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testIndexCreateBlockWithAReadOnlyBlock

@codecov-commenter
Copy link

Codecov Report

Merging #6215 (7e53d6e) into main (1f4cdd2) will increase coverage by 0.03%.
The diff coverage is n/a.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@             Coverage Diff              @@
##               main    #6215      +/-   ##
============================================
+ Coverage     70.72%   70.75%   +0.03%     
- Complexity    58791    58887      +96     
============================================
  Files          4782     4782              
  Lines        281484   281484              
  Branches      40642    40642              
============================================
+ Hits         199072   199158      +86     
+ Misses        65984    65979       -5     
+ Partials      16428    16347      -81     
Impacted Files Coverage Δ
...java/org/opensearch/client/indices/DataStream.java 0.00% <0.00%> (-76.09%) ⬇️
...n/indices/forcemerge/ForceMergeRequestBuilder.java 0.00% <0.00%> (-75.00%) ⬇️
.../opensearch/client/indices/CloseIndexResponse.java 17.50% <0.00%> (-73.75%) ⬇️
.../indices/forcemerge/TransportForceMergeAction.java 25.00% <0.00%> (-58.34%) ⬇️
...a/org/opensearch/client/cluster/ProxyModeInfo.java 0.00% <0.00%> (-55.00%) ⬇️
...cluster/coordination/PublishClusterStateStats.java 33.33% <0.00%> (-37.51%) ⬇️
...ndex/query/functionscore/DecayFunctionBuilder.java 33.45% <0.00%> (-32.37%) ⬇️
.../java/org/opensearch/index/reindex/RemoteInfo.java 50.45% <0.00%> (-29.36%) ⬇️
...opensearch/snapshots/SnapshotMissingException.java 28.57% <0.00%> (-28.58%) ⬇️
...rc/main/java/org/opensearch/ingest/IngestInfo.java 51.72% <0.00%> (-27.59%) ⬇️
... and 484 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@tlfeng tlfeng linked an issue Feb 9, 2023 that may be closed by this pull request
Signed-off-by: Tianli Feng <ftianli@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

deleteIndicesAndEnsureGreen(client, restoredIndexName);

logger.info("--> validate all the cache files are closed");
// Iterate all the primary shards to get all the possible path that contains local cache file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlfeng With the changes within #6350, you can get easy access to the cache node path and check for files particularly within the cache instead of step walking through the indices directory.
And this code will pass always since there will be no shards in this directory when #6350 is merged in.
I think you would want to base your test off the changed files for the cache logic.

Copy link
Collaborator Author

@tlfeng tlfeng Feb 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kotwanikunal Thank you! Since you made much contribution on the cache, the constructive idea is what I want to hear. I will change the test to match the latest code, by checking the files in the "cache node path".

@tlfeng tlfeng removed the v2.6.0 'Issues and PRs related to version v2.6.0' label Feb 20, 2023
Tianli Feng added 3 commits February 28, 2023 09:44
…archable-snapshot

Signed-off-by: Tianli Feng <ftianli@amazon.com>

# Conflicts:
#	server/src/internalClusterTest/java/org/opensearch/snapshots/SearchableSnapshotIT.java
…archable-snapshot

Signed-off-by: Tianli Feng <ftianli@amazon.com>
Signed-off-by: Tianli Feng <ftianli@amazon.com>
@tlfeng tlfeng force-pushed the resource-leak-it-searchable-snapshot branch from 4c11f9f to a5b750f Compare February 28, 2023 18:29
Signed-off-by: Tianli Feng <ftianli@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      2 org.opensearch.cluster.service.MasterServiceTests.classMethod
      1 org.opensearch.cluster.service.MasterServiceTests.testThrottlingForMultipleTaskTypes

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Copy link
Member

@kotwanikunal kotwanikunal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we still need this post #6473 since those tests check the in memory cache for cleanup vs this test verifies against the actual file system.

@kotwanikunal kotwanikunal merged commit fa8937b into opensearch-project:main Mar 1, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 1, 2023
…re closed after use (#6215)

* [Searchable Snapshots] Add integration test to validate cache files are closed after use

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Add changelog

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Change the outdated File API to Path

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Reduce a for loop of the sharditerator

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Revert changes in changelog file

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Get file cache path directly

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Correct a log message

Signed-off-by: Tianli Feng <ftianli@amazon.com>

---------

Signed-off-by: Tianli Feng <ftianli@amazon.com>
(cherry picked from commit fa8937b)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
andrross pushed a commit that referenced this pull request Mar 3, 2023
…re closed after use (#6215) (#6524)

* [Searchable Snapshots] Add integration test to validate cache files are closed after use



* Add changelog



* Change the outdated File API to Path



* Reduce a for loop of the sharditerator



* Revert changes in changelog file



* Get file cache path directly



* Correct a log message



---------


(cherry picked from commit fa8937b)

Signed-off-by: Tianli Feng <ftianli@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@tlfeng tlfeng deleted the resource-leak-it-searchable-snapshot branch March 3, 2023 19:27
mingshl pushed a commit to mingshl/OpenSearch-Mingshl that referenced this pull request Mar 24, 2023
…re closed after use (opensearch-project#6215)

* [Searchable Snapshots] Add integration test to validate cache files are closed after use

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Add changelog

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Change the outdated File API to Path

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Reduce a for loop of the sharditerator

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Revert changes in changelog file

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Get file cache path directly

Signed-off-by: Tianli Feng <ftianli@amazon.com>

* Correct a log message

Signed-off-by: Tianli Feng <ftianli@amazon.com>

---------

Signed-off-by: Tianli Feng <ftianli@amazon.com>
Signed-off-by: Mingshi Liu <mingshl@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch skip-changelog :test Adding or fixing a test v3.0.0 Issues and PRs related to version 3.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Searchable Snapshot] Add Integ test to make there is no resource leak
4 participants