Take some actions to avoid unhealthy containers #1241

Merged: 3 commits, Jan 7, 2022

chadwhitacre
Member

Closes #1178

@aminvakil
Collaborator

Are we sure changing all healthcheck timeouts for all users to 60s is a good idea?

My question is are there any situations in self-hosted where if something does not respond in 5 seconds, it's broken?

install.sh Outdated
source parse-cli.sh
source dc-detect-version.sh
source turn-things-off.sh
Member

error-handling should come as early as possible?

Member Author

Fair point. It depends on $dc so moving just after that in c48efe0.

Member Author

Hmm ... thought through this a little further and created 2b9a179. Eh?
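
The ordering constraint discussed in this thread can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual install.sh: `detect_dc` and `setup_error_handling` are hypothetical stand-ins for sourcing dc-detect-version.sh and the error-handling script, and both the value given to `dc` and the `cleanup_cmd` variable are assumptions made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `source dc-detect-version.sh`: it defines $dc.
detect_dc() {
  dc="docker-compose"   # assumed value, for illustration only
}

# Hypothetical stand-in for the error-handling script. It reads $dc when it
# is set up, so it refuses to run if $dc has not been defined yet.
setup_error_handling() {
  if [ -z "${dc:-}" ]; then
    echo "error handling needs \$dc; run dc detection first" >&2
    return 1
  fi
  cleanup_cmd="$dc stop"   # hypothetical command an error handler might run
}

# Order matters: define $dc first, then set up error handling immediately
# afterwards, before any later step that might fail.
detect_dc
setup_error_handling
echo "$cleanup_cmd"
```

The point of the sketch is only that error handling goes as early as it can, which is right after the step that defines `$dc`.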

@chadwhitacre
Member Author

> My question is are there any situations in self-hosted where if something does not respond in 5 seconds, it's broken?

If it's broken after 5s it will still be broken after 60s, right?

@aminvakil
Collaborator

> > My question is are there any situations in self-hosted where if something does not respond in 5 seconds, it's broken?
>
> If it's broken after 5s it will still be broken after 60s, right?

Well if something is wrong and it responds in 30s or so, it's broken in my POV.

Like I said in #1178 (comment) best if it could be set in .env IMO, for this PR changing default to 60s is better than current state though.
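
The ".env" idea raised here could look roughly like the sketch below. It is an illustration under stated assumptions, not something this PR implements: the variable name `HEALTHCHECK_TIMEOUT` and the helper `healthcheck_timeout` are hypothetical, and the only value taken from this thread is the 60s default.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the healthcheck timeout to use.
# HEALTHCHECK_TIMEOUT is an assumed variable name that an operator could set
# in .env; when it is unset or empty, fall back to the 60s default.
healthcheck_timeout() {
  echo "${HEALTHCHECK_TIMEOUT:-60s}"
}

# Default behaviour when nothing is configured.
healthcheck_timeout

# A stricter operator overrides it; the subshell keeps the override local.
( HEALTHCHECK_TIMEOUT=5s; healthcheck_timeout )
```

Docker Compose files support the same `${VAR:-default}` interpolation syntax, so a default like this could in principle be expressed directly in the compose file instead of in a shell helper.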

@chadwhitacre
Member Author

chadwhitacre commented Jan 6, 2022

> best if it could be set in .env IMO, for this PR changing default to 60s is better than current state though.

Okay, let's run with this for a while, and if we find that it's causing problems in the other direction we can look at making it configurable. I would rather not introduce additional complexity if we don't have to.

@chadwhitacre
Member Author

@aminvakil I mean, if "someone else" wants to make a PR for configurability ... ;)

chadwhitacre merged commit 7eb16f3 into master on Jan 7, 2022
chadwhitacre deleted the cwlw/healthy-containers-plz branch on January 7, 2022 at 14:00
@chadwhitacre
Member Author

Pretty sure we're good here, merging.

github-actions bot locked and limited conversation to collaborators on Jan 23, 2022
Development

Successfully merging this pull request may close these issues.

'ERROR: for snuba-api Container "ee666f7f2cdd" is unhealthy.' during install.sh