[BUG] [FlyteAdmin] Notifications SQS subscriber stops processing messages when "connection reset by peer" #376
Comments
@rstanevich thank you for opening the bug. I will add it to the next milestone. Do you think that's OK, or is it affecting you every day and we should fix it ASAP?
thanks for opening @rstanevich - is this a recent issue?
Thank you for the reply. BTW, we have Flyte's very own
@rstanevich do you have an alert if FlyteAdmin crashes? What would be preferable: that you notice a FlyteAdmin crash, or that it continues to limp along and you notice older messages?
@kumare3
@rstanevich I think we found a very good way of solving this problem. @katrogan will merge the PR soon. Thank you for raising the issue. |
It is merged and will be part of the next release.
* Better Error Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * lint Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * fix unit test Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Add defensive nil checks Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com>
…#376) * wip Signed-off-by: Katrina Rogan <katroganGH@gmail.com> * add a test too Signed-off-by: Katrina Rogan <katroganGH@gmail.com> * Matchable attribute impl Signed-off-by: Katrina Rogan <katroganGH@gmail.com>
* Add missing in_container.mk Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Fix serialization for greatexpectations Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * fix mnist classifier examples Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Update requirements for kfpytorch Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Update kfpytorch requirements Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Update sql requirements Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Enable pytorch sagemaker image build Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * tidy names Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Try to silence TERM errors Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Install libssl1.0.0 Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Updates to pytorch images Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * Try updates Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com> * cleanup Signed-off-by: Haytham Abuelfutuh <haytham@afutuh.com>
Describe the bug
The notifications SQS subscriber stopped processing messages.
Expected behavior
The subscriber should reconnect gracefully for as long as the application is running.
Flyte component
To Reproduce
Steps to reproduce the behavior:
Environment
Flyte component
Additional context
Logs:
I guess the solution will be similar to this one: flyteorg/flyteadmin#92