Add a counter to node stat api to track shard going from idle to non-idle #12768
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>
Compatibility status: Checks if related components are compatible with change 5d47d6b
Incompatible components
Skipped components
Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/performance-analyzer.git]
❌ Gradle check result for 919e4e2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for ef646c6: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for c5ee63f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for 8ec3c12: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for 2d9bb3d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
@ruai0511 - Thank you for this PR. I am wondering whether the coordinator slow logs should also record whether a search request triggered a refresh. That would save us from guessing whether a refresh happened and caused the query to run slower.
If you choose to add it to a |
server/src/main/java/org/opensearch/index/shard/IndexShard.java
❌ Gradle check result for 49ebf20: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for d2a686e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for 5172c09: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❕ Gradle check result for 5d47d6b: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Thanks, @ruai0511! Looks good!
I'll add the backport 2.x label, but we'll need to update the PR to change the version check for the new field.
Can you please prepare a PR to update the version check from 3.0 to 2.14?
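For readers unfamiliar with why the version check matters, here is a minimal sketch of a version-gated field on the wire. It is plain Java, not OpenSearch code: the class `VersionGatedStatsSketch`, the numeric version ids, and the `fieldSince` parameter are all assumptions made for illustration. The point is only that writer and reader must agree on the first version that carries the new counter, so the gate has to name the oldest release that actually contains the field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical illustration of a version-gated stats field; not the PR's code. */
final class VersionGatedStatsSketch {
    // Made-up numeric version ids, used only for comparison in this sketch.
    static final int V_2_13_0 = 21300;
    static final int V_2_14_0 = 21400;
    static final int V_3_0_0 = 30000;

    /** Writes the stats for a peer; the new counter is sent only if the peer is new enough. */
    static byte[] write(long queryTotal, long wakeUps, int peerVersion, int fieldSince) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(queryTotal);
            if (peerVersion >= fieldSince) {
                out.writeLong(wakeUps); // new field, gated on the peer's version
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the stats sent by a peer; the counter defaults to 0 when the peer predates it. */
    static long[] read(byte[] payload, int peerVersion, int fieldSince) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            long queryTotal = in.readLong();
            long wakeUps = peerVersion >= fieldSince ? in.readLong() : 0L;
            return new long[] { queryTotal, wakeUps };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, passing `V_2_14_0` as `fieldSince` means a 2.13 peer neither receives nor is expected to send the extra eight bytes, while 2.14 and later peers exchange the counter.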
The backport to
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-12768-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 f2cc3d8ea314b2d49a8b6ea5f57de5b6aff4faf9
# Push it to GitHub
git push --set-upstream origin backport/backport-12768-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x
Then, create a pull request where the
… from idle to non-idle (opensearch-project#12768) Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>
… from idle to non-idle (opensearch-project#12768) --------- Signed-off-by: Ruirui Zhang <mariazrr@amazon.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
… from idle to non-idle (opensearch-project#12768) --------- Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>
Description
Shards refresh automatically every second, but when a shard receives no search requests for more than 30 seconds, it goes idle and suspends the implicit index refresh to improve performance (more information on the search idle feature here). However, this introduces a problem: after a shard goes idle, the next search request must force a refresh to reflect the latest data, and this extra step increases latency.
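The idle behaviour described above can be sketched in a few lines of plain Java. This is an illustrative model only, not the code in `IndexShard.java`: the class `SearchIdleSketch`, its method names, and the injected clock are assumptions. It shows a periodic refresh that is skipped while the shard is idle, and a search that forces a refresh and counts the idle to non-idle transition.

```java
import java.util.function.LongSupplier;

/** Hypothetical model of a shard that stops refreshing when no searches arrive. */
final class SearchIdleSketch {
    private final long idleAfterMillis; // e.g. 30_000 for the 30 seconds mentioned above
    private final LongSupplier clock;   // injected so the example is deterministic
    private long lastSearchMillis;
    private long wakeUps;               // counts idle -> non-idle transitions
    private long refreshes;

    SearchIdleSketch(long idleAfterMillis, LongSupplier clock) {
        this.idleAfterMillis = idleAfterMillis;
        this.clock = clock;
        this.lastSearchMillis = clock.getAsLong();
    }

    /** A shard is idle once more than idleAfterMillis has passed since the last search. */
    synchronized boolean isIdle() {
        return clock.getAsLong() - lastSearchMillis > idleAfterMillis;
    }

    /** Periodic refresh; skipped while the shard is idle. Returns true if a refresh ran. */
    synchronized boolean scheduledRefresh() {
        if (isIdle()) {
            return false;
        }
        refreshes++;
        return true;
    }

    /** Search entry point; an idle shard is refreshed first and the wake-up is counted. */
    synchronized void onSearch() {
        if (isIdle()) {
            wakeUps++;
            refreshes++; // the forced refresh that adds latency to this search
        }
        lastSearchMillis = clock.getAsLong();
    }

    synchronized long wakeUps() {
        return wakeUps;
    }

    synchronized long refreshes() {
        return refreshes;
    }
}
```

A caller could drive this with a fake clock, advance it past the idle threshold, and observe that the next `onSearch()` call increments `wakeUps()` once while later searches do not.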
We want to monitor how often idle shards are reactivated. This PR introduces a counter called search_idle_waken_up_total and exports it in the node stats API.
Related Issues
Resolves #12678
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.