[Remote Store] Make translog transfer timeout configurable using dynamic setting #12704

Merged

sachinpkale (Member) commented Mar 16, 2024

Description

  • This PR makes the translog transfer timeout configurable through a dynamic cluster setting.
  • When the configured remote store is slow, a translog upload can take longer than the current hardcoded timeout of 30s. When that happens, the same file can end up being uploaded twice.
  • To prevent re-uploading translog files, the translog transfer timeout should match the request timeout configured for the remote store.
  • Because different types of remote stores can be configured, the timeout is made configurable rather than hardcoding a single new default value.
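
To illustrate the failure mode described above, here is a minimal, self-contained Java sketch. It is not the code from this PR: `DynamicTimeout`, `transfer`, and the millisecond values are hypothetical stand-ins chosen so the example runs quickly, and the actual change exposes the timeout as a dynamic cluster setting rather than through a plain holder class.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;

public class TransferTimeoutSketch {

    /** Hypothetical holder for a timeout that can be changed while the process is running. */
    static final class DynamicTimeout {
        private final AtomicLong millis;

        DynamicTimeout(long initialMillis) {
            this.millis = new AtomicLong(initialMillis);
        }

        /** Stand-in for the callback a dynamic setting update would invoke. */
        void update(long newMillis) {
            millis.set(newMillis);
        }

        long millis() {
            return millis.get();
        }
    }

    /**
     * Simulates one upload that takes uploadMillis to finish.
     * Returns true if it completed within the timeout in effect when the wait started.
     */
    static boolean transfer(DynamicTimeout timeout, long uploadMillis) {
        CompletableFuture<Void> upload = CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(uploadMillis); // pretend to talk to a slow remote store
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            upload.get(timeout.millis(), TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // The caller gives up here, but the upload itself may still finish;
            // a retry would then send the same file a second time.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DynamicTimeout timeout = new DynamicTimeout(50);  // too short for this store
        System.out.println(transfer(timeout, 300));       // upload outlives the timeout
        timeout.update(2000);                             // raised at runtime, no restart
        System.out.println(transfer(timeout, 300));       // same upload now fits
    }
}
```

The point of the sketch is only that the timeout is read at the moment of each transfer, so an operator can raise it for a slow store without restarting anything.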

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions bot (Contributor) commented Mar 16, 2024

Compatibility status:

Checks if related components are compatible with change 420c595

Incompatible components

Skipped components

Compatible components:

  • https://github.com/opensearch-project/custom-codecs.git
  • https://github.com/opensearch-project/asynchronous-search.git
  • https://github.com/opensearch-project/performance-analyzer-rca.git
  • https://github.com/opensearch-project/cross-cluster-replication.git
  • https://github.com/opensearch-project/flow-framework.git
  • https://github.com/opensearch-project/job-scheduler.git
  • https://github.com/opensearch-project/reporting.git
  • https://github.com/opensearch-project/security.git
  • https://github.com/opensearch-project/geospatial.git
  • https://github.com/opensearch-project/opensearch-oci-object-storage.git
  • https://github.com/opensearch-project/common-utils.git
  • https://github.com/opensearch-project/k-nn.git
  • https://github.com/opensearch-project/neural-search.git
  • https://github.com/opensearch-project/security-analytics.git
  • https://github.com/opensearch-project/anomaly-detection.git
  • https://github.com/opensearch-project/performance-analyzer.git
  • https://github.com/opensearch-project/ml-commons.git
  • https://github.com/opensearch-project/notifications.git
  • https://github.com/opensearch-project/index-management.git
  • https://github.com/opensearch-project/observability.git
  • https://github.com/opensearch-project/alerting.git
  • https://github.com/opensearch-project/sql.git

❌ Gradle check result for cf104af:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for 581d604:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for cf104af: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

✅ Gradle check result for cf104af: SUCCESS

codecov bot commented Mar 17, 2024

Codecov Report

Attention: Patch coverage is 87.50000%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 71.35%. Comparing base (b15cb0c) to head (420c595).
Report is 117 commits behind head on main.

Files                                                    Patch %   Lines
...in/java/org/opensearch/index/shard/IndexShard.java      0.00%   1 Missing ⚠️
...in/java/org/opensearch/indices/IndicesService.java     50.00%   1 Missing ⚠️

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #12704      +/-   ##
============================================
- Coverage     71.42%   71.35%   -0.07%     
- Complexity    59978    60341     +363     
============================================
  Files          4985     5024      +39     
  Lines        282275   284304    +2029     
  Branches      40946    41173     +227     
============================================
+ Hits         201603   202865    +1262     
- Misses        63999    64609     +610     
- Partials      16673    16830     +157     


sachinpkale force-pushed the dynamic-translog-transfer-timeout branch 2 times, most recently from 3337fa7 to 4c2d966, on April 2, 2024 11:26

github-actions bot (Contributor) commented Apr 2, 2024

❕ Gradle check result for 3337fa7: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.repositories.azure.AzureBlobStoreRepositoryTests.testList

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

github-actions bot (Contributor) commented Apr 2, 2024

✅ Gradle check result for 4c2d966: SUCCESS

ashking94 (Member) left a comment

lgtm

ashking94 (Member) commented

The Detect Breaking Changes / detect-breaking-change (pull_request) check from the approval checklist is failing.

Sachin Kale added 6 commits April 2, 2024 18:29
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
sachinpkale force-pushed the dynamic-translog-transfer-timeout branch from 4c2d966 to 420c595 on April 2, 2024 12:59

github-actions bot (Contributor) commented Apr 2, 2024

✅ Gradle check result for 420c595: SUCCESS

gbbafna merged commit b7396e1 into opensearch-project:main on Apr 3, 2024
21 of 31 checks passed
gbbafna added the backport 2.x (Backport to 2.x branch) label on Apr 3, 2024

opensearch-trigger-bot (Contributor) commented

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-12704-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 b7396e1cbaf5d8d10ad2917afbaab54f1d4e2816
# Push it to GitHub
git push --set-upstream origin backport/backport-12704-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-12704-to-2.x.

ashking94 pushed a commit to ashking94/OpenSearch that referenced this pull request Apr 23, 2024
…mic setting (opensearch-project#12704)

Signed-off-by: Sachin Kale <kalsac@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…mic setting (opensearch-project#12704)

Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
ashking94 added the backport 2.x (Backport to 2.x branch) label and removed the backport 2.x and backport-failed labels on Apr 27, 2024

opensearch-trigger-bot (Contributor) commented

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

ashking94 pushed a commit to ashking94/OpenSearch that referenced this pull request Apr 27, 2024
…mic setting (opensearch-project#12704)

Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Ashish Singh <ssashish@amazon.com>
sachinpkale added a commit that referenced this pull request Apr 28, 2024
…mic setting (#12704) (#13425)

Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Ashish Singh <ssashish@amazon.com>
Co-authored-by: Sachin Kale <sachinpkale@gmail.com>
Labels
backport 2.x Backport to 2.x branch backport-failed
4 participants