Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix snap startup error #7354

Merged
merged 1 commit into from
Aug 26, 2024
Merged

Conversation

damonbarry
Copy link
Member

Cherry-pick d950b62.

To test, I built new snaps in the CI build pipeline, ran the end-to-end tests pipeline against them, and confirmed the snap jobs pass. Since the snap jobs install with --devmode, I also published to a temporary branch in the marketplace and tested manually.

A recent fix to the azure-iot-edge snap (Azure#7330) changed the daemon type of the docker-proxy service to `notify` to resolve install/startup timing issues. With that fix, docker-proxy now waits until it can establish communication with Docker before calling `systemd-notify --ready` to resume and allow aziot-edged to start.

A problem was discovered once the updated snap was published to the marketplace. If a user installs the snap without --devmode, docker-proxy startup fails with: "Got notification message from PID nnnn, but reception only permitted for main PID mmmm". This happens because systemd expects the `systemd-notify --ready` command to run in the service's main process, but bash runs it in its own process. Systemd can be configured to allow other options, but snapd doesn't expose them.

This change uses the shell builtin `exec` command to run `systemd-notify` in the same process as the parent script. It also uses the  --exec option on `systemd-notify` to set up socat after `systemd-notify` returns. The --exec option is new, so the snap has been upgraded to core24.

Also, the configure hook was producing config files with empty hostnames, which caused failures on startup. It's unclear why the hostnamectl command isn't working in this instance (possibly something to do with the core24 upgrade?), but it turns out to be unnecessary because `iotedge config apply` will correctly determine/populate the hostname a little later. This change removes the logic to populate the hostname field from the configure hook.

To test, I built new snaps in the CI build pipeline, ran the end-to-end tests pipeline against them, and confirmed the snap jobs pass. Since the snap jobs install with --devmode, I also published to a temporary branch in the marketplace and tested manually.

## Azure IoT Edge PR checklist:
@damonbarry damonbarry changed the title Fix snap startup error (#7351) Fix snap startup error Aug 22, 2024
@damonbarry damonbarry requested a review from nyanzebra August 26, 2024 21:48
@kodiakhq kodiakhq bot merged commit 436c1cc into Azure:release/1.4 Aug 26, 2024
18 checks passed
@damonbarry damonbarry deleted the snap-startup-1.4 branch August 26, 2024 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants