Check PD endpoints status when it's unhealthy. #545
Hi contributor, thanks for your PR. This patch needs to be approved by one of the admins. They should reply with "/ok-to-test" to accept this PR for running tests automatically.
/run-e2e-tests
Thank you for your contribution.
- Use `gofmt` to format the code, or `make check` will fail.
- Can you add unit tests, or create an issue to track them, as #406 (add unit test to tidb_controller GetInfo Interface) and #429 do?
@@ -40,6 +40,7 @@ type pdMemberManager struct {
	setLister v1beta1.StatefulSetLister
	svcLister corelisters.ServiceLister
	podLister corelisters.PodLister
	epsLister corelisters.EndpointsLister
Use `gofmt` to format the source code.
@gregwebs PTAL
Hi, the code has been formatted in the latest commit, and passed
New issue: #547
Yes,
/run-e2e-tests
Hi, the latest commit adds some unit tests for endpoints. However, I'm not sure they are good enough, so please help check it again. Thanks!
@@ -258,18 +261,27 @@ func (pmm *pdMemberManager) syncTidbClusterStatus(tc *v1alpha1.TidbCluster, set

	pdClient := pmm.pdControl.GetPDClient(tc)

	cluster, err := pdClient.GetCluster()
	healthInfo, err := pdClient.GetHealth()
Why did you change the order of `pdClient.GetHealth()` and `pdClient.GetCluster()`?
If there is no special need, it is better to keep the old order.
It may be more intuitive to check PD's health first, in my opinion. QAQ
Alright, I'll restore the original order.
@xiaojingchen @aylei PTAL
@@ -268,6 +271,15 @@ func (pmm *pdMemberManager) syncTidbClusterStatus(tc *v1alpha1.TidbCluster, set
	healthInfo, err := pdClient.GetHealth()
	if err != nil {
		tc.Status.PD.Synced = false
		// get endpoints info
`GetCluster()` will fail if there are no endpoints, so we should move this code to https://github.com/pingcap/tidb-operator/pull/545/files#diff-006f391cd7cdae269e89bd77e21c1a6fR266
Hi, I think it may be better to check the health state first, and check the endpoints' status when it's unhealthy, just like the previous code. Here is what I think:
- It makes no difference to the component as a whole whether we check the health state first or not.
- Unit tests based on `GetHealth` rather than `GetCluster` are easier to understand, such as: https://github.com/pingcap/tidb-operator/blob/master/pkg/manager/member/pd_member_manager_test.go#L209. And the current unit tests on endpoints are based on whether `GetHealth` fails or not.
- More intuitive.
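As a rough illustration of the testing argument, here is a table-driven sketch in which a fake client decides whether `GetHealth` fails. The names `healthGetter`, `fakePDClient`, `describeSync`, and `cases` are hypothetical and are not taken from `pd_member_manager_test.go`; the real tests use the operator's own fakes and Kubernetes types.

```go
package main

import (
	"errors"
	"fmt"
)

// healthGetter is a hypothetical, narrowed view of the PD client.
type healthGetter interface {
	GetHealth() (healthyMembers int, err error)
}

// fakePDClient lets each test case decide what GetHealth returns.
type fakePDClient struct {
	healthFn func() (int, error)
}

func (f *fakePDClient) GetHealth() (int, error) { return f.healthFn() }

// describeSync is a hypothetical function under test: it reports why PD
// is considered unsynced, looking at endpoints only when GetHealth fails.
func describeSync(c healthGetter, readyEndpoints int) string {
	if _, err := c.GetHealth(); err != nil {
		if readyEndpoints == 0 {
			return "unsynced: no endpoints"
		}
		return "unsynced: pd unhealthy"
	}
	return "synced"
}

type testCase struct {
	name           string
	healthErr      error
	readyEndpoints int
	want           string
}

// cases lists the scenarios; each one only varies the GetHealth result
// and the number of ready endpoints.
func cases() []testCase {
	return []testCase{
		{"healthy", nil, 3, "synced"},
		{"unhealthy with endpoints", errors.New("timeout"), 3, "unsynced: pd unhealthy"},
		{"unhealthy without endpoints", errors.New("timeout"), 0, "unsynced: no endpoints"},
	}
}

func main() {
	for _, tc := range cases() {
		tc := tc
		fake := &fakePDClient{healthFn: func() (int, error) { return 3, tc.healthErr }}
		got := describeSync(fake, tc.readyEndpoints)
		fmt.Printf("%s: got %q, matches expectation: %v\n", tc.name, got, got == tc.want)
	}
}
```

Keying every case on a single fake method is what keeps the table short; a variant keyed on `GetCluster` would look the same with the method name swapped.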
OK, I am fine with either order. But this code must be moved to the first method.
Thank you for your patience!
Rest LGTM
/run-e2e-tests
LGTM
/run-e2e-tests
/run-e2e-tests
/run-e2e-tests
What problem does this PR solve?
This PR checks the PD endpoints' status when PD is unhealthy, according to #293.
What is changed and how it works?
- `endpointsLister` to `pdMemberManager`
- `pdClient.GetHealth()` and `pdClient.GetCluster()`
Check List
Tests
Code changes
Related changes
Does this PR introduce a user-facing change?: