connection topology recover can miss restore consumers #1047

Closed
pierresetteskog opened this issue May 18, 2021 · 4 comments

@pierresetteskog

Sometimes our network is unstable and our services simply stop listening for messages.
It seems that the client gets an initial connection but loses it again very quickly.
It also seems that 6.x and 7.x deliberately use a try/catch so that recovery just continues if any error occurs while re-registering consumers, etc.

Why not just add a throw and let it retry after 5 seconds if it can't recover the full topology?

private void HandleTopologyRecoveryException(TopologyRecoveryException e)
{
    ESLog.Error("Topology recovery exception", e);
    // throw e; // if this is added, it works as expected
}

Should I create a PR?

Log error:
payload: {"Type":"RabbitMQ.Client.Exceptions.TopologyRecoveryException","Message":"Caught an exception while recovering consumer amq.ctag-MsJoUkH2fqy6Ttm4ks528Q on queue Summoning.WheelChange.BookingExpiration: Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, code=404, text='NOT_FOUND - queue 'Summoning.WheelChange.BookingExpiration' in vhost '/' process is stopped by supervisor', classId=50, methodId=10","StackTrace":"","InnerException":"RabbitMQ.Client.Exceptions.AlreadyClosedException: Already closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, code=404, text='NOT_FOUND - queue 'Summoning.WheelChange.BookingExpiration' in vhost '/' process is stopped by supervisor', classId=50, methodId=10\n at RabbitMQ.Client.Impl.SessionBase.Transmit(OutgoingCommand& cmd)\n at RabbitMQ.Client.Impl.ModelBase.ModelSend(MethodBase method, ContentHeaderBase
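
The retry-after-a-delay idea raised above can be sketched in isolation. This is only a minimal illustration, not the client's actual recovery code: `TopologyRecoveryFailedException`, `RecoveryLoop` and `RunWithRetry` are hypothetical names invented for this example, and the `recoverTopology` delegate stands in for whatever re-declares queues and re-registers consumers.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the client's TopologyRecoveryException,
// so that this sketch compiles without the RabbitMQ client library.
public sealed class TopologyRecoveryFailedException : Exception
{
    public TopologyRecoveryFailedException(string message, Exception inner)
        : base(message, inner) { }
}

public static class RecoveryLoop
{
    // Calls recoverTopology until it succeeds or maxAttempts is reached.
    // Returns the number of attempts used. If the final attempt also fails,
    // its exception propagates to the caller.
    public static int RunWithRetry(Action recoverTopology, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                recoverTopology();
                return attempt;
            }
            catch (TopologyRecoveryFailedException) when (attempt < maxAttempts)
            {
                // Wait before the next attempt (5 seconds in the proposal above).
                Thread.Sleep(delay);
            }
        }
    }
}
```

A caller would pass something like `maxAttempts: 5` and `delay: TimeSpan.FromSeconds(5)`. Capping the number of attempts is an assumption of this sketch; the proposal above does not say whether retries should be bounded.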

@michaelklishin
Member

Because topology recovery can fail for all kinds of reasons. We cannot assume that every error is related to connection state. I'd only consider a PR that retries on exceptions that we know are related to connectivity.

Specifically

NOT_FOUND - queue 'Summoning.WheelChange.BookingExpiration' in vhost '/' process is stopped by supervisor'

suggests that the node hosting the leader of that queue has been shutting down.
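
One way to picture "retry only on exceptions known to be connectivity-related" is a small predicate that inspects the exception chain. This is a sketch under stated assumptions: `RecoveryErrors` and `LooksConnectivityRelated` are hypothetical names, and the choice of `SocketException`, `IOException` and `TimeoutException` as connectivity signals is an assumption of this example, not a list taken from the client.

```csharp
using System;
using System.IO;
using System.Net.Sockets;

public static class RecoveryErrors
{
    // Walks the InnerException chain and reports whether any link is one of
    // the exception types this sketch ASSUMES to indicate a connectivity problem.
    public static bool LooksConnectivityRelated(Exception e)
    {
        for (var current = e; current != null; current = current.InnerException)
        {
            if (current is SocketException
                || current is IOException
                || current is TimeoutException)
            {
                return true;
            }
        }

        // Anything else (for example a precondition or argument error) is
        // treated as non-retryable by this sketch.
        return false;
    }
}
```

A recovery handler could rethrow only when this predicate returns true and just log otherwise. Client-specific types such as `AlreadyClosedException` are left out here only to keep the example free of the RabbitMQ dependency.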

@pierresetteskog
Author

pierresetteskog commented May 18, 2021

Thanks, I will give it a try in our environment first; every month the system team does something to our environment.

private void HandleTopologyRecoveryException(TopologyRecoveryException e)
{
    ESLog.Error("Topology recovery exception", e);
    // Rethrow only when the underlying cause is a closed connection or channel.
    if (e.InnerException is AlreadyClosedException)
    {
        throw e;
    }
}

@pierresetteskog
Author

Will there be a NuGet release for my PR to the 6.x branch, or when will the next master release be? :)

@lukebakken lukebakken self-assigned this Nov 18, 2023
@lukebakken
Contributor

Closing because this appears to have been fixed.
