feat(runners): add retry logic to default install and start script for dnf operations #3787
Background
This is a continuation of the work done in #3748.
There seems to be a race condition somewhere between the user-data script and the EC2 instance starting up and locking RPM. The issue has been reported elsewhere and is not specific to this repo's user-data; see the following:
Amazon Linux 2023 - issue with installing packages with cloud-init
dnf/yum both fails while being executed on instance bootstrap on Amazon Linux 2023
Also, #3741
Changes Made
Added a retry loop: if the RPM lock file is found, the script sleeps for 5 seconds and then tries again, for a total of 5 iterations. This logic is now added to user-data.sh for the upgrade-minimal operation and for the installation of docker, cloudwatch-agent, and curl.
Testing Done
In progress. I'd like to open this up to review while testing.
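For reviewers, the retry behaviour described under Changes Made could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `run_when_rpm_unlocked`, the variable names, and the default lock-file path are all assumptions made for the example.

```shell
# Hypothetical sketch of the retry loop; all names and the lock path are
# assumptions for illustration, not taken from user-data.sh.
RPM_LOCK_FILE="${RPM_LOCK_FILE:-/var/lib/rpm/.rpm.lock}"  # assumed lock location
MAX_ATTEMPTS="${MAX_ATTEMPTS:-5}"                         # 5 iterations in total
SLEEP_SECONDS="${SLEEP_SECONDS:-5}"                       # 5 seconds between checks

# Run the given command as soon as the lock file is absent.
# Returns the command's exit status, or 1 if the lock never cleared.
run_when_rpm_unlocked() {
  _attempt=1
  while [ "$_attempt" -le "$MAX_ATTEMPTS" ]; do
    if [ ! -e "$RPM_LOCK_FILE" ]; then
      "$@"
      return $?
    fi
    echo "RPM lock found (attempt ${_attempt}/${MAX_ATTEMPTS}); retrying in ${SLEEP_SECONDS}s" >&2
    sleep "$SLEEP_SECONDS"
    _attempt=$((_attempt + 1))
  done
  echo "RPM lock still present after ${MAX_ATTEMPTS} attempts; giving up" >&2
  return 1
}
```

Under these assumptions, a call site would look like `run_when_rpm_unlocked dnf upgrade-minimal -y` or `run_when_rpm_unlocked dnf install -y docker`. Note that this sketch only waits for the lock file to disappear; it does not retry a command that fails for other reasons.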