Add stale lockfile removal mechanism for confd critical sections #370

Open: wants to merge 3 commits into base: master

ab77
Contributor

@ab77 ab77 commented Aug 20, 2024

Related to: https://github.com/balena-io/balena-cloud/pull/3466

On underpowered machines, services may be terminated on healthcheck failures (or OOM kills) and leave behind stale lockfiles. This mechanism should mitigate that by removing lockfiles older than a configurable timeout (5m, i.e. 300s).

Only affects BoB(s) as this locking mechanism isn't used in production/Kubernetes.
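
For readers skimming the thread, the age-based cleanup described above could look roughly like the sketch below. This is not the code from this PR: `lock_age`, `remove_stale_lock`, and the 300s default for `LOCK_TIMEOUT` are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; function names are illustrative, not taken from the PR.

LOCK_TIMEOUT="${LOCK_TIMEOUT:-300}"   # max lock age in seconds (assumed default: 5m)

# Print the age of a file in seconds, based on its modification time.
function lock_age {
    local lockfile="$1" now mtime
    now="$(date +%s)"
    # GNU stat uses -c %Y; BSD/macOS stat uses -f %m.
    mtime="$(stat -c %Y "${lockfile}" 2>/dev/null || stat -f %m "${lockfile}")"
    echo $(( now - mtime ))
}

# Remove the lockfile if it is older than LOCK_TIMEOUT seconds.
# Returns 0 if a stale lock was removed, 1 otherwise.
function remove_stale_lock {
    local lockfile="$1"
    [[ -f "${lockfile}" ]] || return 1
    if (( $(lock_age "${lockfile}") > LOCK_TIMEOUT )); then
        echo "removing stale lockfile ${lockfile}"
        rm -f "${lockfile}"
        return 0
    fi
    return 1
}
```

A caller waiting on the lock could invoke `remove_stale_lock "${CONF}.lock"` on each retry, so a lock left by a killed service is cleared once it exceeds the timeout.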

change-type: patch

@ab77 ab77 requested a review from a team August 20, 2024 15:50
@flowzone-app flowzone-app bot enabled auto-merge August 20, 2024 15:56
@ab77 ab77 force-pushed the ab77/operational branch from 733758b to 5154b20 Compare August 20, 2024 16:54
@ab77 ab77 marked this pull request as draft August 22, 2024 15:27
auto-merge was automatically disabled August 22, 2024 15:27

Pull request was converted to draft

@ab77 ab77 marked this pull request as ready for review August 22, 2024 15:27
@ab77 ab77 force-pushed the ab77/operational branch from 5154b20 to 1bc8fc9 Compare August 22, 2024 15:27
@flowzone-app flowzone-app bot enabled auto-merge August 22, 2024 15:33
@ab77 ab77 force-pushed the ab77/operational branch from 1bc8fc9 to 6015e11 Compare August 29, 2024 15:31
@ab77 ab77 force-pushed the ab77/operational branch from 6015e11 to 4c48cfc Compare October 25, 2024 20:44
On underpowered machines, services may get terminated on healthcheck
failures and leave behind stale lockfiles. This mechanism should mitigate
this, by removing lock files older than 5m (300s), configurable.

change-type: patch
@ab77 ab77 force-pushed the ab77/operational branch from 4c48cfc to 552e70a Compare January 14, 2025 19:44
src/configure-balena.sh (2 outdated review threads, resolved)
ab77 and others added 2 commits January 15, 2025 07:17
Co-authored-by: Josh Bowling <45343541+joshbwlng@users.noreply.github.com>
@ab77 ab77 requested a review from joshbwlng January 16, 2025 19:14
        remove_update_lock
        return 0
    fi
    return 1

What's the reasoning for returning 1 here? I thought that even in a bash script, returning anything other than 0 indicates an error.
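
As background to the question, one common shell idiom (which may or may not be the intent here) is a predicate-style function, where a non-zero status means "false" rather than "error" and the caller branches on it. A minimal sketch, with the hypothetical names `is_stale` and `describe_lock`:

```shell
#!/usr/bin/env bash
# Hypothetical predicate-style functions; names and arguments are illustrative only.

# Returns 0 ("true") when the given age exceeds the timeout, 1 ("false") otherwise.
function is_stale {
    local age="$1" timeout="$2"
    (( age > timeout )) && return 0
    return 1
}

# The caller branches on the status, so the non-zero return is not treated as a failure.
function describe_lock {
    if is_stale "$1" "$2"; then
        echo "stale"
    else
        echo "fresh"
    fi
}

describe_lock 400 300   # prints "stale"
describe_lock 10 300    # prints "fresh"
```

The caveat is that under `set -e`, calling such a function as a bare command (outside an `if`, `&&`, or `||`) makes the non-zero status abort the script, so this pattern is only safe when the call site tests the result.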

function set_update_lock {
    if [[ -d "$(dirname "${CONF}")" ]]; then
        echo "create lockfile ${CONF}.lock with ${LOCK_TIMEOUT}s age timeout"

Suggested change
- echo "create lockfile ${CONF}.lock with ${LOCK_TIMEOUT}s age timeout"
+ echo "creating lockfile ${CONF}.lock with ${LOCK_TIMEOUT}s age timeout"

Nit: just to match the other log message.
