Fix flaky integ tests - CreateRemoteIndexIT,CreateRemoteIndexTranslogDisabledIT,CreateRemoteIndexClusterDefaultDocRep #7141

Merged
gbbafna merged 1 commit into opensearch-project:main from create-remote-index-it-fix on Apr 14, 2023

linuxpi
Collaborator

@linuxpi linuxpi commented Apr 13, 2023

Description

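Fixes the flaky integration tests CreateRemoteIndexIT, CreateRemoteIndexTranslogDisabledIT, and CreateRemoteIndexClusterDefaultDocRep.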

Issues Resolved

#7128

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@linuxpi linuxpi force-pushed the create-remote-index-it-fix branch from 2eab79e to e7c52ad Compare April 13, 2023 05:23
…DisabledIT,CreateRemoteIndexClusterDefaultDocRep

Signed-off-by: Varun Bansal <bansvaru@amazon.com>
@linuxpi linuxpi force-pushed the create-remote-index-it-fix branch from e7c52ad to 4d56547 Compare April 13, 2023 11:06
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.cluster.allocation.AwarenessAllocationIT.testThreeZoneOneReplicaWithForceZoneValueAndLoadAwareness

Comment on lines -59 to +58
- .put(IndexMetadata.SETTING_NUMBER_OF_SHARDS, 3)
- .put(IndexMetadata.SETTING_NUMBER_OF_REPLICAS, numReplicas)
+ .put(IndexMetadata.SETTING_NUMBER_OF_SHARDS, 1)
+ .put(IndexMetadata.SETTING_NUMBER_OF_REPLICAS, 0)
Collaborator

Why were the tests flaky?

Member

+1. Also, how did you determine that they are no longer flaky?

Collaborator Author

The failures were due to the following error:

java.lang.AssertionError: [test-idx-1][2], node[vX6f0DTmRfC1il8YDrSIkQ], [P], s[STARTED], a[id=53rPm8H3SrepttaigUli0A] has unreleased snapshotted index commits

java.lang.RuntimeException: file handle leaks: [InputStream(/var/jenkins/workspace/gradle-check/search/server/build/testrun/internalClusterTest/temp/org.opensearch.remotestore.CreateRemoteIndexIT_58A3D614EB68203F-001/tempDir-008/repos/fgUzGkzgHm/qWmYzVJSTKeMhahfEsSuhw/0/segments/data/segment_infos_snapshot_filename__3__PXCpdocBnatMeQEWYB27)]

https://build.ci.opensearch.org/job/gradle-check/13937/testReport/junit/org.opensearch.remotestore/CreateRemoteIndexIT/testRemoteStoreDisabledByUser/

I was able to reproduce the failures with the following commands:

./gradlew ':server:internalClusterTest' --tests "org.opensearch.remotestore.CreateRemoteIndexIT" -Dtests.seed=58A3D614EB68203F -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=Etc/UTC -Dtests.iters=20

./gradlew ':server:internalClusterTest' --tests "org.opensearch.remotestore.CreateRemoteIndexTranslogDisabledIT" -Dtests.seed=58A3D614EB68203F -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=Etc/UTC -Dtests.iters=20

./gradlew ':server:internalClusterTest' --tests "org.opensearch.remotestore.CreateRemoteIndexClusterDefaultDocRep" -Dtests.seed=58A3D614EB68203F -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=Etc/UTC -Dtests.iters=20
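
The three commands above differ only in the test class, so they could be wrapped in a small loop for repeated runs. The following is a minimal sketch and not part of this PR: the function name `run_flaky_suite` and its optional runner argument are hypothetical, while the Gradle task, seed, and flags are copied from the commands above.

```shell
#!/bin/sh
# Seed taken from the failing CI run quoted in this thread.
SEED="58A3D614EB68203F"

# run_flaky_suite [RUNNER]   (hypothetical helper, not part of the PR)
# Runs each of the three test classes with the flags used above and
# returns the number of classes whose run failed.
# RUNNER defaults to ./gradlew; passing another command allows a dry run.
run_flaky_suite() {
  runner="${1:-./gradlew}"
  failures=0
  for test_class in \
      CreateRemoteIndexIT \
      CreateRemoteIndexTranslogDisabledIT \
      CreateRemoteIndexClusterDefaultDocRep; do
    if "$runner" ':server:internalClusterTest' \
        --tests "org.opensearch.remotestore.${test_class}" \
        -Dtests.seed="$SEED" \
        -Dtests.security.manager=true \
        -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" \
        -Dtests.locale=en -Dtests.timezone=Etc/UTC -Dtests.iters=20; then
      echo "PASS ${test_class}"
    else
      echo "FAIL ${test_class}"
      failures=$((failures + 1))
    fi
  done
  return "$failures"
}
```

From the repository root, calling `run_flaky_suite` runs all three classes in sequence; a non-zero exit status means at least one class failed under this seed.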

The main cause was that the tests were not cleaning up the snapshot directory.

After my changes, I ran the three commands above multiple times and have not seen any failures.

The changes to the replica and shard counts only simplify the tests further; the main goal of these tests is to verify the behavior of cluster-level settings with remote store.

@gbbafna gbbafna merged commit 632eb44 into opensearch-project:main Apr 14, 2023
@gbbafna gbbafna added the backport 2.x Backport to 2.x branch label Apr 14, 2023
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-7141-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 632eb44a541b28ef16ed261904b45d74b84f3b9f
# Push it to GitHub
git push --set-upstream origin backport/backport-7141-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-7141-to-2.x.

linuxpi added a commit to linuxpi/OpenSearch that referenced this pull request Apr 17, 2023
…#7141)

Signed-off-by: Varun Bansal <bansvaru@amazon.com>
(cherry picked from commit 632eb44)
gbbafna pushed a commit that referenced this pull request Apr 17, 2023
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
(cherry picked from commit 632eb44)
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Apr 28, 2023
Labels
backport 2.x Backport to 2.x branch skip-changelog