resolver: UpdateState returns ErrBadResolverState when submitting empty list of addresses #5048
Comments
Which part is broken? What were you expecting to happen instead? If I understand what you're saying, the resolver should poll for new addresses when this happens, though it should apply some backoff.
What happens / what was expected? I suspect the error being returned regarding the empty address list is coming from the LB policy being fed zero addresses, but not being a policy that would be able to cope with that. I.e. pick first and round robin both will do this*, as zero addresses would lead to a dead ClientConn. If they don't error and let the resolver know that something is wrong, then the ClientConn will be stuck with no connections forever (eventually...if there are any existing connections, they would remain until they die). * - Refs:
How my resolver handles the
I expected that when I call
That is what my resolver is currently doing. I basically copied the behavior from: grpc-go/internal/resolver/dns/dns_resolver.go Line 234 in 40916aa
I guess the DNS resolver will never call UpdateState with an empty set of addresses though.
The way you described it also makes sense to me. The resolvers can then be a little bit dumber and don't have to care whether they submitted an empty or a non-empty list of addresses. What behavior is expected of the resolver when the list of addresses for a target changes to zero?
Hmmm.. if there are existing addresses, then we'll keep the SubConn(s) for them, and when they fail to connect, a ResolveNow will be triggered. If there were no pre-existing SubConns (if addresses had never been produced), this won't ever happen. So while it's possible there are other sources of initiating a retry of the name resolver, it shouldn't be assumed.
UpdateState sends the resolver state to the LB policy. Some LB policies (e.g. grpclb) are okay with not getting addresses from the name resolver, so they would not return an error in this case. Pick first and round robin both need addresses to work, however, so they return errors.
I don't believe that's the case. We want it to call UpdateState so RPCs can begin failing quickly if there are no addresses for the target at startup.
As mentioned above, the resolver doesn't necessarily know what LB policy is in place. Some might be fine with an empty address list. The resolver shouldn't be making assumptions about the consumers of its data. The consumers will determine if the result is acceptable and report an error if not.
Yes. As mentioned above, having no addresses either might not be a problem in the first place, or it could be a signal to the ClientConn that it should start failing RPCs with a relevant error message.
Any error from UpdateState should result in the resolver polling and reporting the state again (for polling name resolvers; watch-based resolvers would ignore the error since there's nothing else to do). A backoff timer should be used to avoid overloading the server being polled. If it polls and finds no change, it would be fine to not call UpdateState with the data -- presumably the call would just fail again. However, it's also harmless to call it again.
Thanks a lot for the clarification.
That also means that there is no scenario where the error returned by Should I create a PR to document the expected resolver behavior regarding Errors returned by
I'm wondering if it wouldn't make sense to distinguish between invalid results returned by a resolver and valid resolver results that do not fit for the LB/consumer. For example an IPv4 DNS resolver could resolve the target localhost to 127.0.0.1, report it via
That should be safe to say. If the LB policy itself is changed then the new one might be okay with the previously reported state, but the LB policy in use is controlled by the name resolver (via the same call), so it would know when that is happening.
Sure, that would be helpful.
It's always possible a re-resolution will return the results the LB policy needs. Unless the LB policy and name resolver are intended to be closely paired, e.g. the xds name resolver. But in that case, the resolver produces the LB policy & its configuration, so we know everything will always be compatible there.
If the resolver is 100% sure it will never produce a different result, then it's fine to not re-resolve, since there is no point. This is similar to a watch-based name resolver that literally can't poll again. Otherwise, it should attempt to re-resolve when an error is returned from Also, generally, as long as any addresses are returned by the name resolver, it should be extremely unusual for an LB policy to fail the data.
Thanks again for the detailed responses
Hello,
I'm implementing a custom resolver that queries a Consul server to resolve a target.
It can happen that a service is unregistered in Consul and therefore the set of known addresses for a target changes to none.
When this happens I'm calling UpdateState with an empty address slice. UpdateState then returns an ErrBadResolverState error.
Because UpdateState() returned an error, my resolver starts to poll Consul and calls UpdateState periodically again (also with empty addresses). This behavior is broken.
Despite UpdateState() returning an error, grpc-go seems to handle it as expected. It triggers re-resolving periodically again via ResolveNow.
Is this a bug in grpc-go, and should it not return an error when a resolver submits an empty set of addresses?
Otherwise what behavior is expected by the resolver when all addresses for a host vanish?
I'm using grpc-go version 1.42.