This repository has been archived by the owner on Oct 12, 2023. It is now read-only.

network error not recovered by servicebus #237

Closed
serbrech opened this issue Jul 20, 2021 · 2 comments

Comments

@serbrech
Member

serbrech commented Jul 20, 2021

Running in prod, we have identified a few errors that a sender will not recover from.
Once one of these errors is returned by a sender operation, the underlying TCP connection is dead and the only way to recover is to recreate the sender.

  • syscall.ETIMEDOUT (covered by neterr.Timeout())
  • io.EOF
  • amqp error with Condition: amqp:internal-error, Description: The service was unable to process the request; please retry the operation.

I think these should be recovered either by the servicebus SDK, or maybe even at the lowest level, in amqp (for io.EOF, for example).
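To make the classification concrete, here is a minimal sketch of a helper that decides whether a send error means the sender has to be recreated. It is not the go-shuttle or SDK implementation: `shouldRecreateSender`, `amqpError`, and `samples` are hypothetical names, and `amqpError` is only a stand-in for whatever error type the AMQP library returns (assumed here to expose a `Condition` string).

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"syscall"
)

// amqpError is a hypothetical stand-in for the AMQP library's error type.
// Only the Condition field matters for this sketch.
type amqpError struct {
	Condition   string
	Description string
}

func (e *amqpError) Error() string {
	return fmt.Sprintf("amqp error: %s: %s", e.Condition, e.Description)
}

// shouldRecreateSender reports whether err matches one of the three
// unrecoverable cases from the list: a timeout, io.EOF, or an AMQP error
// with the amqp:internal-error condition.
func shouldRecreateSender(err error) bool {
	if err == nil {
		return false
	}
	// errors.Is also matches errors wrapped with %w.
	if errors.Is(err, io.EOF) || errors.Is(err, syscall.ETIMEDOUT) {
		return true
	}
	// Any net.Error that reports a timeout is treated the same way.
	var netErr net.Error
	if errors.As(err, &netErr) && netErr.Timeout() {
		return true
	}
	var amqpErr *amqpError
	if errors.As(err, &amqpErr) && amqpErr.Condition == "amqp:internal-error" {
		return true
	}
	return false
}

// sample pairs an error with the classification we expect for it.
type sample struct {
	name string
	err  error
	want bool
}

// samples returns a few illustrative errors, both recoverable and not.
func samples() []sample {
	return []sample{
		{"io.EOF", io.EOF, true},
		{"wrapped ETIMEDOUT", fmt.Errorf("send: %w", syscall.ETIMEDOUT), true},
		{"amqp internal error", &amqpError{Condition: "amqp:internal-error"}, true},
		{"other amqp condition", &amqpError{Condition: "amqp:not-found"}, false},
		{"unrelated error", errors.New("validation failed"), false},
	}
}

func main() {
	for _, s := range samples() {
		fmt.Printf("%-22s recreate sender: %v\n", s.name, shouldRecreateSender(s.err))
	}
}
```

A caller would check the error returned by a send and, when the helper returns true, close and rebuild the sender before retrying. Whether that check belongs in the application, in servicebus, or in amqp is exactly the open question here.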

Happy to port the implementation if it's agreed

See go-shuttle for the recovery code we needed to add:
https://github.com/Azure/go-shuttle/blob/main/publisher/errorhandling/recovery.go

@richardpark-msft
Member

This looks sensible to me. We actually have code similar to this in Event Hubs now as well that handles the recovery in addition to the error classification:

I would definitely be interested in this PR. You'll note that we also recently added in the ability to limit the # of retries as well.

What do you currently default to for # of retries in your code?

@richardpark-msft
Member

We've moved development of this package to the azure-sdk-for-go repo link.

Error handling and making errors programmatically useful is being tracked here: Azure/azure-sdk-for-go#15610
