Add coordination time to profiler #705
Comments
I don't think it's a bug, as the documentation clearly calls out the gap. We should plan on bridging this holistically.
Looking into this.
Here is an activity diagram of what appears to be profiled today. The gaps around fan-out, node-to-node communication, and waiting time are part of what I'd consider coordinator time, and they are absent from the profile output.
These are the profiling limitations we are trying to bridge with this issue (see Profile limitations). The top two are what we're targeting right now:
Network overhead time measurement for node-to-node communication
Time spent on coordinator fan-out and coordinator fan-in
The coordinating node (i.e., the node that receives the client request) executes it in 2 phases.
There are two questions to be answered here -
I'm putting possible choices/approaches for these metrics up for community transparency and to get opinions on alternate metrics that could be included!
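To make the fan-out / fan-in timing discussed above concrete, here is a toy sketch of how a coordinator could time the three pieces separately: dispatching work to shards, waiting on them, and merging the results. This is not OpenSearch code. The function `coordinate`, the timing keys (`fan_out_s`, `wait_s`, `fan_in_s`), and the thread pool standing in for node-to-node calls are all hypothetical names chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def coordinate(shard_tasks, reduce_fn):
    """Run shard_tasks concurrently; time fan-out, waiting, and fan-in separately.

    shard_tasks: list of zero-argument callables, one per (simulated) shard.
    reduce_fn:   callable that merges the list of per-shard results.
    """
    timings = {}
    with ThreadPoolExecutor(max_workers=max(1, len(shard_tasks))) as pool:
        # Fan-out: time spent handing the requests off to the shards.
        start = time.perf_counter()
        futures = [pool.submit(task) for task in shard_tasks]
        timings["fan_out_s"] = time.perf_counter() - start

        # Waiting: time the coordinator is blocked until every shard answers.
        start = time.perf_counter()
        results = [future.result() for future in futures]
        timings["wait_s"] = time.perf_counter() - start

    # Fan-in: time spent merging per-shard results into one response.
    start = time.perf_counter()
    merged = reduce_fn(results)
    timings["fan_in_s"] = time.perf_counter() - start
    return merged, timings


# Two simulated shards returning partial hit lists, merged by sorting.
tasks = [lambda: [3, 1], lambda: [2]]
merged, timings = coordinate(tasks, lambda parts: sorted(x for p in parts for x in p))
print(merged)
print(sorted(timings))
```

In a real cluster the waiting portion would also include network overhead, which is the first limitation listed above, so it would need its own measurement rather than being folded into `wait_s`.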
@Poojita-Raj Why are we closing this issue?
…checks (opensearch-project#705) Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Describe the bug
The took time on a profiled request is 1.3 seconds, but the per-shard search/aggregation times are all in the 339 ms to 460 ms range. This implies coordination had a substantial impact on the response time.
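The gap can be estimated directly from a profiled response. The sketch below is a rough calculation, not an official metric. It assumes shards run concurrently, so the slowest shard bounds the shard-side time, and it assumes the profile response has the shape I understand it to have (`took` in milliseconds, `profile.shards[].searches[].query[].time_in_nanos`, `rewrite_time`, and `aggregations[].time_in_nanos`). The sample response is made up to match the numbers above, and the helper names are hypothetical.

```python
def shard_time_ms(shard):
    """Rough per-shard total: top-level query, rewrite, and aggregation timings."""
    nanos = 0
    for search in shard.get("searches", []):
        nanos += sum(q["time_in_nanos"] for q in search.get("query", []))
        nanos += search.get("rewrite_time", 0)
    nanos += sum(a["time_in_nanos"] for a in shard.get("aggregations", []))
    return nanos / 1_000_000


def unattributed_ms(response):
    """Took time minus the slowest shard: time the profile does not explain."""
    shards = response["profile"]["shards"]
    slowest = max((shard_time_ms(s) for s in shards), default=0.0)
    return response["took"] - slowest


# Hypothetical response: took = 1300 ms, shards at 460 ms and 339 ms.
sample = {
    "took": 1300,
    "profile": {
        "shards": [
            {
                "id": "shard-0",
                "searches": [
                    {"query": [{"type": "TermQuery", "time_in_nanos": 300_000_000}],
                     "rewrite_time": 0}
                ],
                "aggregations": [{"type": "ExampleAggregator", "time_in_nanos": 160_000_000}],
            },
            {
                "id": "shard-1",
                "searches": [
                    {"query": [{"type": "TermQuery", "time_in_nanos": 239_000_000}],
                     "rewrite_time": 0}
                ],
                "aggregations": [{"type": "ExampleAggregator", "time_in_nanos": 100_000_000}],
            },
        ]
    },
}

print(unattributed_ms(sample))  # 840.0
```

With these numbers, roughly 840 ms of the 1.3 seconds has no corresponding entry in the profile, which is the time this issue asks to have broken down.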
To Reproduce
Steps to reproduce the behavior:
"profile":"true"
until took time is above an acceptable threshold
Expected behavior
The profile output would include a breakdown of coordination time.
Plugins
N/A
Screenshots
N/A
Host/Environment (please complete the following information):
Additional context
In some ways this is a feature request, but it is also a scenario that cannot be fully diagnosed for performance issues with the existing profiling tools.