unfreeze dracut, adapt for NM via systemd in initrd #1068

Merged
merged 4 commits into from
Jul 9, 2021

Conversation

dustymabe
Member

@dustymabe dustymabe commented Jun 24, 2021

Upstream dracut updated NM to run as a systemd service
(with full dbus support) in the initrd in [1]. Adapt our
systemd units to handle this case.

This should still work fine for RHCOS because we still have
Before=dracut-initqueue.service, which can be dropped when
everyone is on dracut 054+.

Fixes: coreos/fedora-coreos-tracker#842
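
Why keeping `Before=dracut-initqueue.service` should be enough for RHCOS can be sketched with a toy model. This is an illustration under assumptions, not taken from the actual units: it assumes the adapted units order themselves before both `nm-initrd.service` and `dracut-initqueue.service`, and that an older dracut initrd has no `nm-initrd.service` at all. systemd only applies an ordering dependency when both units are part of the boot transaction, so a `Before=` that names an absent unit has no effect. The unit sets below are hypothetical.

```python
def effective_ordering(before, present_units):
    """Return the Before= targets that actually constrain start order.

    An ordering dependency on a unit that is not part of the boot
    transaction is ignored, so only targets that exist are kept.
    """
    return [unit for unit in before if unit in present_units]


# Before= targets discussed in this PR (assumed to be declared together).
BEFORE = ["nm-initrd.service", "dracut-initqueue.service"]

# Hypothetical unit sets, for illustration only.
newer_dracut = {"nm-initrd.service", "dracut-initqueue.service"}
older_dracut = {"dracut-initqueue.service"}

print("newer dracut:", effective_ordering(BEFORE, newer_dracut))
print("older dracut:", effective_ordering(BEFORE, older_dracut))
```

On the older set only the `dracut-initqueue.service` constraint remains, which is the one this description relies on.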

@dustymabe dustymabe force-pushed the dusty-unfreeze-dracut branch from 6216680 to 8fa89fb Compare June 24, 2021 19:30
@dustymabe dustymabe marked this pull request as draft June 24, 2021 19:31
@dustymabe
Member Author

/hold

This needs to wait until we get teaming fixed: dracutdevs/dracut#1547

@dustymabe
Member Author

Running local kola tests I see the multipath test is failing

--- FAIL: multipath (36.69s)                                                                                         
        multipath.go:60: Failed to reboot the machine: machine "1e703edb-aeea-4d62-a525-ebf2b67e1b2d" failed basic checks: some systemd units failed:
● coreos-propagate-multipath-conf.service not-found failed failed coreos-propagate-multipath-conf.service
[    2.046823] xfs filesystem being mounted at /sysroot supports timestamps until 2038 (0x7fffffff)
[    2.048122] systemd[1]: Mounted /sysroot.
[  OK  ] Mounted /sysroot.
[    2.049290] systemd[1]: Starting OSTree Prepare OS/...
         Starting OSTree Prepare OS/...
[    2.052119] ostree-prepare-root[618]: Resolved OSTree target to: /sysroot/ostree/deploy/fedora-coreos/deploy/e22451d83f721fe678b07eb1a4e7f94a6cf601ae0aedcf631d069c29e7ef27e9.1
[    2.053539] ostree-prepare-root[618]: sysroot configured read-only: 1, currently writable: 1
[  OK  ] Finished OSTree Prepare OS/.
[    2.055179] systemd[1]: Finished OSTree Prepare OS/.
[  OK  ] Reached target Initrd Root File System.
[    2.056124] systemd[1]: Reached target Initrd Root File System.
         Starting CoreOS Propagate Multipath Configuration...
[    2.057406] systemd[1]: Starting CoreOS Propagate Multipath Configuration...
         Starting Reload Configuration from the Real Root...
[    2.063022] systemd[1]: Starting Reload Configuration from the Real Root...
[    2.064740] systemd[1]: Reloading.
[    2.071436] coreos-propagate-multipath-conf[619]: info: propagating automatic multipath configuration
[    2.072833] coreos-propagate-multipath-conf[621]: '/etc/multipath.conf' -> '/sysroot/etc/multipath.conf'
[    2.263003] coreos-propagate-multipath-conf[624]: Relabeled /sysroot//etc/multipath.conf from (null) to system_u:object_r:etc_t:s0
[    2.310204] systemd[1]: initrd-parse-etc.service: Deactivated successfully.
[    2.310985] systemd[1]: Finished Reload Configuration from the Real Root.
[  OK  ] Finished Reload Configuration from the Real Root.
[    2.312520] systemd[1]: Reached target Initrd File Systems.
[  OK  ] Reached target Initrd File Systems.
[    2.314182] systemd[1]: coreos-multipath-trigger.service: Deactivated successfully.
[    2.314984] systemd[1]: Stopped CoreOS Trigger Multipath.
[  OK  ] Stopped CoreOS Trigger Multipath.
[  OK  ] Stopped target CoreOS Wait For Multipathed Boot.
[    2.316327] systemd[1]: Stopped target CoreOS Wait For Multipathed Boot.
[    2.317309] systemd[1]: Condition check resulted in dracut mount hook being skipped.
         Starting dracut pre-pivot and cleanup hook...
[    2.319170] systemd[1]: Starting dracut pre-pivot and cleanup hook...
         Stopping Device-Mapper Multipath Device Controller...
[    2.325814] multipathd[488]: --------shut down-------
[    2.326285] systemd[1]: Stopping Device-Mapper Multipath Device Controller...
[    2.330878] multipathd[707]: ok
[  OK  ] Stopped Device-Mapper Multipath Device Controller.
[    2.341671] systemd[1]: multipathd.service: Deactivated successfully.
[    2.342224] systemd[1]: Stopped Device-Mapper Multipath Device Controller.
[  OK  ] Finished dracut pre-pivot and cleanup hook.
         Starting Cleaning Up and Shutting Down Daemons...
[    2.372330] systemd[1]: Finished dracut pre-pivot and cleanup hook.
[    2.372887] systemd[1]: Starting Cleaning Up and Shutting Down Daemons...
[    2.378617] systemd[1]: Stopped target Ignition Subsequent Boot Disk Setup.
[  OK  ] Stopped target Ignition Subsequent Boot Disk Setup.
[    2.380222] systemd[1]: Stopped target Initrd Root Device.
[  OK  ] Stopped target Initrd Root Device.
[    2.381810] systemd[1]: Stopped target Remote Encrypted Volumes.
[  OK  ] Stopped target Remote Encrypted Volumes.
[  OK  ] Stopped target Timers.
[    2.383148] systemd[1]: Stopped target Timers.
[    2.383756] systemd[1]: dbus.socket: Deactivated successfully.
[  OK  ] Closed D-Bus System Message Bus Socket.
[    2.384698] systemd[1]: Closed D-Bus System Message Bus Socket.
[    2.386270] systemd[1]: Condition check resulted in Unmount live /var if persistent /var is configured being skipped.
[  OK  ] Stopped CoreOS: Touch /run/agetty.reload.
[  OK  ] Stopped dracut pre-pivot and cleanup hook.
[  OK  ] Stopped target Remote File Systems.
[  OK  ] Stopped target Remote File Systems (Pre).
[  OK  ] Stopped dracut pre-mount hook.
[    2.389982] systemd[1]: coreos-touch-run-agetty.service: Deactivated successfully.
[    2.390633] systemd[1]: Stopped CoreOS: Touch /run/agetty.reload.
[  OK  ] Stopped dracut initqueue hook.
[  OK  ] Stopped CoreOS Propagate Multipath Configuration.
[    2.393528] systemd[1]: dracut-pre-pivot.service: Deactivated successfully.
[  OK  ] Finished Cleaning Up and Shutting Down Daemons.
[  OK  ] Stopped target Basic System.
[    2.395538] systemd[1]: Stopped dracut pre-pivot and cleanup hook.
[  OK  ] Stopped target Paths.
[  OK  ] Stopped target Slices.
[  OK  ] Stopped target Sockets.
[  OK  ] Stopped target System Initialization.
[  OK  ] Stopped target Local Encrypted Volumes.
[    2.397739] systemd[1]: Stopped target Remote File Systems.
[    2.398217] systemd[1]: Stopped target Remote File Systems (Pre).
[    2.398731] systemd[1]: dracut-pre-mount.service: Deactivated successfully.
[    2.399296] systemd[1]: Stopped dracut pre-mount hook.
[  OK  ] Stopped Dispatch Password …ts to Console Directory Watch.
[  OK  ] Stopped target Local Encrypted Volumes (Pre).
[    2.400589] systemd[1]: dracut-initqueue.service: Deactivated successfully.
[    2.401182] systemd[1]: Stopped dracut initqueue hook.
[    2.401618] systemd[1]: coreos-propagate-multipath-conf.service: Main process exited, code=killed, status=15/TERM

cgwalters
cgwalters previously approved these changes Jun 29, 2021
Member

@cgwalters cgwalters left a comment

Didn't look at the details, but this looks sane to me

@jlebon
Member

jlebon commented Jun 30, 2021

> Running local kola tests I see the multipath test is failing

Is it failing consistently or just flaky?

jlebon
jlebon previously approved these changes Jul 7, 2021
Member

@jlebon jlebon left a comment

One comment, but overall LGTM! Nice work patching dracut for this.

@@ -30,8 +30,8 @@ Description=Copy CoreOS Firstboot Networking Config
ConditionPathExists=/usr/lib/initrd-release
DefaultDependencies=false
Before=ignition-diskful.target
Before=nm-run.service
Member

Hmm, should we keep nm-run.service and just add nm-initrd.service? That way if nm-run.service is backported to RHEL, but not nm-initrd.service, this will still work for RHCOS.

Member Author

I highly doubt that will happen as nm-run.service was only in there for one release of dracut (053) and was replaced by the full systemd+dbus implementation because the NM team actively wants it for the next version of RHEL.

IMHO I'd rather it fail so we could consider and fix our wrong assumption.

@jlebon
Member

jlebon commented Jul 7, 2021

This should also fix coreos/fedora-coreos-tracker#849.

@dustymabe
Member Author

> This should also fix coreos/fedora-coreos-tracker#849.

yep

@dustymabe dustymabe dismissed stale reviews from jlebon and cgwalters via 170176c July 9, 2021 16:35
@dustymabe dustymabe force-pushed the dusty-unfreeze-dracut branch from 8fa89fb to 170176c Compare July 9, 2021 16:35
@dustymabe dustymabe marked this pull request as ready for review July 9, 2021 16:45
dustymabe added 2 commits July 9, 2021 16:02
Upstream dracut updated NM to run as a systemd service
(with full dbus support) in the initrd in [1]. Adapt our
systemd units to handle this case.

This should still work fine for RHCOS because we still have
`Before=dracut-initqueue.service`, which can be dropped when
everyone is on dracut 054+.

Fixes: coreos/fedora-coreos-tracker#842
Contains upstream fixes needed to get NM running via systemd+dbus
in the initramfs without issues.

- dracutdevs/dracut#1547
- dracutdevs/dracut#1548
- dracutdevs/dracut#1552

Needed to get dracut unfrozen:
coreos/fedora-coreos-tracker#842 (comment)
@dustymabe dustymabe force-pushed the dusty-unfreeze-dracut branch from 5c21919 to cf8ec8f Compare July 9, 2021 20:02
@dustymabe
Member Author

rebased on top of latest testing-devel

@dustymabe dustymabe enabled auto-merge (rebase) July 9, 2021 20:03
dustymabe and others added 2 commits July 9, 2021 16:06
…-kargs

We've seen races with ignition-kargs.service, which accesses /boot rw.
Let's introduce some ordering here. Need to use `Before` because otherwise
we get a systemd ordering cycle.

Fixes: coreos/fedora-coreos-tracker#883
…pagate-multipath-conf.service

Otherwise, we'll end up racing with `initrd-cleanup.service` which wants
to kill everything. It has `After=initrd.target` and we do have
`Before=initrd.target`, but that's not being respected, we think because
`initrd-parse-etc.service` does an explicit `systemctl start` on it.

Anyway, we need to dig more into this, but for now this will unblock us.
@dustymabe dustymabe force-pushed the dusty-unfreeze-dracut branch from cf8ec8f to 99ecca7 Compare July 9, 2021 20:07
Member

@jlebon jlebon left a comment

LGTM!

# We've seen races with ignition-kargs.service, which accesses /boot rw.
# Let's introduce some ordering here. Need to use `Before` because otherwise
# we get a systemd ordering cycle. https://github.com/coreos/fedora-coreos-tracker/issues/883
Before=ignition-kargs.service
Member

Hmm OK yeah that makes sense now looking at the cycle in coreos/fedora-coreos-tracker#883 (comment).

We need networking to fetch the config to apply the kargs. And this service obviously has to run before networking comes up since the whole point is network configuration.
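
The cycle reasoning above can be checked with a small model. This is a minimal sketch, not how systemd itself resolves ordering: it treats each constraint as an "earlier, later" pair and asks Python's `graphlib` (3.9+) for a valid start order. The node names `network-config-copy` and `networking` are hypothetical stand-ins for this service and for network bring-up; only `ignition-kargs` is named in the diff.

```python
from graphlib import CycleError, TopologicalSorter


def start_order(before_edges):
    """Return a valid start order for (earlier, later) pairs, or None on a cycle."""
    sorter = TopologicalSorter()
    for earlier, later in before_edges:
        # `later` may only start once `earlier` has started.
        sorter.add(later, earlier)
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# Constraints stated in the comment above (node names are hypothetical):
# this service runs before networking, and kargs needs networking.
base = [
    ("network-config-copy", "networking"),
    ("networking", "ignition-kargs"),
]

# Ordering this service before ignition-kargs is consistent with the chain.
with_before = base + [("network-config-copy", "ignition-kargs")]

# Ordering it after ignition-kargs closes a loop, so no valid order exists.
with_after = base + [("ignition-kargs", "network-config-copy")]

print("Before=ignition-kargs:", start_order(with_before))
print("After=ignition-kargs: ", start_order(with_after))
```

The `Before` variant yields a single linear order, while the `After` variant has none, which matches the ordering cycle referenced in the tracker issue.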

@dustymabe dustymabe merged commit 8b80486 into coreos:testing-devel Jul 9, 2021
@dustymabe dustymabe deleted the dusty-unfreeze-dracut branch July 9, 2021 21:16
Development

Successfully merging this pull request may close these issues.

adjust for the change to running NetworkManager via systemd in the initrd
3 participants