Add stale lockfile removal mechanism for confd critical sections #370
base: master
Pull request was converted to draft
6015e11 to 4c48cfc
On underpowered machines, services may be terminated on healthcheck failures and leave behind stale lockfiles. This mechanism should mitigate that by removing lockfiles older than 5m (300s); the timeout is configurable.
change-type: patch
4c48cfc to 552e70a
Co-authored-by: Josh Bowling <45343541+joshbwlng@users.noreply.github.com>
        remove_update_lock
        return 0
    fi
    return 1
What's the reasoning for returning 1 here? I thought that even in a bash script, returning anything other than 0 indicates an error.
function set_update_lock {
    if [[ -d "$(dirname "${CONF}")" ]]; then
        echo "create lockfile ${CONF}.lock with ${LOCK_TIMEOUT}s age timeout"
- echo "create lockfile ${CONF}.lock with ${LOCK_TIMEOUT}s age timeout"
+ echo "creating lockfile ${CONF}.lock with ${LOCK_TIMEOUT}s age timeout"
Nit: just to match the other log message.
Related to: https://github.com/balena-io/balena-cloud/pull/3466
On underpowered machines, services may be terminated on healthcheck failures (and OOMs) and leave behind stale lockfiles. This mechanism should mitigate that by removing lockfiles older than 5m (300s); the timeout is configurable.
This only affects BoB(s), as this locking mechanism isn't used in production/Kubernetes.
change-type: patch
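
For readers skimming the description, here is a minimal sketch of the kind of age check it describes. It is not the code in this PR: `lock_age` and `remove_stale_lock` are hypothetical names, and the modification-time comparison is an assumption about how "older than 5m" might be measured. Only `LOCK_TIMEOUT` and the 300s value come from the description and the diff above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; function names and the mtime-based check are assumptions.

# Age in seconds after which a lockfile counts as stale (300s = 5m), overridable.
LOCK_TIMEOUT="${LOCK_TIMEOUT:-300}"

# Print the age of a file in seconds, based on its modification time.
function lock_age {
    local lockfile="$1" now mtime
    now="$(date +%s)"
    # GNU stat uses -c %Y; BSD/macOS stat uses -f %m.
    mtime="$(stat -c %Y "${lockfile}" 2>/dev/null || stat -f %m "${lockfile}")"
    echo $(( now - mtime ))
}

# Remove the lockfile if it is older than LOCK_TIMEOUT.
# Returns 0 if a stale lockfile was removed, 1 otherwise, so callers
# can branch on it: `if remove_stale_lock "${f}"; then ...; fi`.
function remove_stale_lock {
    local lockfile="$1"
    [[ -f "${lockfile}" ]] || return 1
    if (( $(lock_age "${lockfile}") > LOCK_TIMEOUT )); then
        echo "removing stale lockfile ${lockfile}"
        rm -f "${lockfile}"
        return 0
    fi
    return 1
}
```

In this sketch the non-zero return means "nothing was removed" rather than a failure, so a caller running under `set -e` would need to invoke it inside an `if` or follow it with `|| true`.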