Add vmId to the Rntbd Health Check Error Message #43079
API change check: API changes are not detected in this pull request.
...om/azure/cosmos/implementation/directconnectivity/rntbd/RntbdClientChannelHealthChecker.java
Left one blocking comment
LGTM besides some small comments
...ure/cosmos/implementation/directconnectivity/rntbd/RntbdClientChannelHealthCheckerTests.java
LGTM - Thanks!
One suggestion is to modify the logs when requests see cancellation or transit timeouts. It is not fully clear whether we'll definitely have diagnostics in these scenarios (or at least not clear to me :P) so it would not hurt to have
LGTM - thanks!
/azp run java - cosmos - tests
Azure Pipelines successfully started running 1 pipeline(s).
/check-enforcer override
Description
Issue 42811
Currently, vmId is part of our client-side diagnostics. However, diagnostics can only be obtained once an operation succeeds or fails; if it hangs, we see nothing. In a WMT Sev2 incident, we saw channel health check failure logs (the log itself has the source and destination IP) but not the vmId, so we couldn't diagnose the VM well enough. With this change, the vmId can be extracted from health check logs and correlated, which gives us VM-level visibility.
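To illustrate the idea, here is a minimal sketch of a health check failure message that carries the vmId, plus a helper that pulls it back out of a log line for correlation. This is not the code from this PR: the class `HealthCheckMessageSketch`, both method names, the message layout, and the `unknown` placeholder are all assumptions made for the example.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; NOT the SDK's RntbdClientChannelHealthChecker.
public final class HealthCheckMessageSketch {

    // Assumed log layout: "... vmId: <value>]" (value ends at ',', ']' or whitespace).
    private static final Pattern VM_ID = Pattern.compile("vmId: ([^\\],\\s]+)");

    private HealthCheckMessageSketch() {}

    /** Builds a failure message that includes the vmId next to the channel endpoints. */
    public static String unhealthyMessage(
            String localAddress, String remoteAddress, Duration readDelay, String vmId) {
        // The vmId may be unavailable (for example, when not running on an Azure VM),
        // so fall back to a placeholder instead of logging "null".
        String safeVmId = (vmId == null || vmId.isEmpty()) ? "unknown" : vmId;
        return String.format(
            "Health check failed [localAddress: %s, remoteAddress: %s, readDelayMs: %d, vmId: %s]",
            localAddress, remoteAddress, readDelay.toMillis(), safeVmId);
    }

    /** Extracts the vmId from a log line in the assumed layout, if one is present. */
    public static Optional<String> extractVmId(String logLine) {
        Matcher m = VM_ID.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = unhealthyMessage(
            "10.0.0.4:51234", "10.0.1.7:14001", Duration.ofSeconds(65), "vm-1234");
        System.out.println(line);
        System.out.println(extractVmId(line).orElse("<none>"));
    }
}
```

Because the vmId sits in the log line itself, it remains available even when an operation hangs and no diagnostics object is ever produced.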