Add fsck to dracut modules #1352

Merged
merged 10 commits into from
May 25, 2022
2 changes: 1 addition & 1 deletion packages/base-dracut-modules/definition.yaml
@@ -1,4 +1,4 @@
name: "base-dracut-modules"
category: "system"
-version: 0.1.1-29
+version: "0.1.2"
description: "Base modules for creating an initrd with dracut for cOS derivatives"
4 changes: 1 addition & 3 deletions packages/cos/collection.yaml
@@ -2,20 +2,18 @@ packages:
- &cos
name: "cos"
category: "system"
-version: 0.8.11-2
+version: "0.8.12"
description: "cOS base image, used to build cOS live ISOs"
brand_name: "cOS"
labels:
autobump.revdeps: "true"
- !!merge <<: *cos
name: "cos-container"
description: "cOS container image, used to build cOS derivatives from scratch"
-version: 0.8.11-3
- !!merge <<: *cos
category: "recovery"
brand_name: "cOS recovery"
description: "cOS recovery image, used to boot cOS for troubleshooting"
-version: 0.8.11-2
- !!merge <<: *cos
name: "cos-img"
category: "recovery"
2 changes: 1 addition & 1 deletion packages/grub2/collection.yaml
@@ -7,7 +7,7 @@ packages:
version: ">0.0.2"
- name: "grub2-config"
category: "system"
-version: 0.0.16-1
+version: "0.0.17"
provides:
- name: "grub-config"
version: ">0.0.12"
2 changes: 1 addition & 1 deletion packages/grub2/config/bootargs.cfg.tmpl
@@ -2,7 +2,7 @@ set kernel=/boot/vmlinuz
if [ -n "$recoverylabel" ]; then
set kernelcmd="console=tty1 console=ttyS0 root=live:LABEL=$recoverylabel rd.live.dir=/ rd.live.squashimg=$img panic=5 rd.neednet=1 rd.cos.oemlabel=@OEM_LABEL@"
else
-set kernelcmd="console=tty1 console=ttyS0 root=LABEL=$label cos-img/filename=$img panic=5 security=selinux selinux=1 rd.neednet=1 rd.cos.oemlabel=@OEM_LABEL@"
+set kernelcmd="console=tty1 console=ttyS0 root=LABEL=$label cos-img/filename=$img panic=5 security=selinux selinux=1 rd.neednet=1 rd.cos.oemlabel=@OEM_LABEL@ fsck.mode=force fsck.repair=yes"
Contributor

I don't see any fsck.repair definition, so I guess this is something that systemd-fsck parses, right? If this is something custom on our side, I'd suggest defining it with the rd.cos... prefix. I don't think that is the case, though; just double-checking.

Contributor Author

yup, those are read by systemd-fsck directly, so we don't need to do any special handling ourselves 🥳

fi

set initramfs=/boot/initrd
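
For reference, the two new arguments use the plain `key=value` form that systemd-fsck reads from the kernel command line. Its documented values are `auto`, `force` and `skip` for `fsck.mode`, and `preen`, `yes` and `no` for `fsck.repair`. Below is a minimal sketch of how such arguments can be inspected from a shell. The helper name `cmdline_value` is hypothetical and not part of this PR, and the "last occurrence wins" rule is a choice of the sketch, not a statement about systemd.

```shell
#!/bin/bash
# Hypothetical helper, not part of this PR: print the value of a key=value
# kernel argument, or the given default when the key is absent.
# In this sketch the last occurrence wins.
cmdline_value() {
    local key="$1" default="$2" cmdline="$3"
    local value="$default" arg
    local -a args
    # Split on whitespace without triggering glob expansion.
    read -r -a args <<< "$cmdline"
    for arg in "${args[@]}"; do
        case "$arg" in
            "$key"=*) value="${arg#*=}" ;;
        esac
    done
    printf '%s\n' "$value"
}

# Inspect the running kernel's command line (empty if /proc is unavailable).
cmdline="$(cat /proc/cmdline 2>/dev/null || true)"
echo "fsck.mode=$(cmdline_value fsck.mode auto "$cmdline")"
echo "fsck.repair=$(cmdline_value fsck.repair preen "$cmdline")"
```

On a system booted with the new `kernelcmd`, this should report `force` and `yes`; passing `fsck.mode=skip` on the command line is the way to opt out.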
25 changes: 24 additions & 1 deletion packages/immutable-rootfs/30cos-immutable-rootfs/cos-loop-img.sh
@@ -9,16 +9,38 @@ function doLoopMount {
label=$(basename "${dev}")
[ -e "/tmp/cosloop-${label}" ] && continue
> "/tmp/cosloop-${label}"

mount -t auto -o "${cos_root_perm}" "/dev/disk/by-label/${label}" "${cos_state}" || continue
if [ -f "${cos_state}/${cos_img}" ]; then
-losetup -f "${cos_state}/${cos_img}"
+
+# FSCHECK if cos_root_perm == "ro" on both
+if [ "$cos_root_perm" == "ro" ]; then
+systemd-fsck "/dev/disk/by-label/${label}"
+fi
+
+dev=$(losetup --show -f "${cos_state}/${cos_img}")
+
+# FSCHECK if cos_root_perm == "ro"
+if [ "$cos_root_perm" == "ro" ]; then
+systemd-fsck "$dev"
+fi
+
exit 0
else
umount "${cos_state}"
fi
done
}

+function dofsCheck {
+# Iterate over current partitions
Contributor Author

@mudler, May 23, 2022

@davidcassany while testing here I corrupted COS_PERSISTENT in such a way that no labels were properly detected anymore, but by running e2fsck on the disk id manually I could restore them just fine. When no labels are identified, all our loops are skipped, since we loop over labels.

Since this could happen, I'm back to scanning all the partitions found. I'm not sure whether this is too invasive on some fronts and would be better as opt-in, but on the other hand I don't see why someone would disable a filesystem integrity check that runs on boot. In that case fsck.mode=skip could be passed, so I didn't feel the need to add another option to control it.

+# As fs corruption could lead to partitions with no label, we scan here for all partitions found and we run systemd-fsck
+for dev in /dev/disk/by-partuuid/*; do
+partuuid=$(basename "${dev}")
+systemd-fsck "/dev/disk/by-partuuid/${partuuid}"
+done
+}
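
One detail of the loop above that may be worth a guard: when `/dev/disk/by-partuuid/` has no entries, bash leaves the unmatched pattern as literal text unless `nullglob` is set, so the checker would be called once on a path that does not exist. A minimal sketch of a guarded variant follows. The function name `check_all_partitions` and its arguments are hypothetical and not part of this PR; the checker command is passed in so the loop can be tried without touching real devices.

```shell
#!/bin/bash
# Hypothetical variant of the partition scan, not part of this PR.
# Usage: check_all_partitions DIR CHECKER [CHECKER_ARGS...]
check_all_partitions() {
    local dir="$1"
    shift
    local dev
    for dev in "$dir"/*; do
        # An unmatched glob stays literal; skip it instead of checking it.
        [ -e "$dev" ] || continue
        # Keep going on failure so one bad partition does not stop the scan.
        "$@" "$dev" || echo "check failed for $dev (exit $?)" >&2
    done
    return 0
}

# In the initrd this would be invoked roughly as:
#   check_all_partitions /dev/disk/by-partuuid systemd-fsck
```

Failures are only logged here, which matches the PR's behaviour of not aborting the boot when a single check fails.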

type getarg > /dev/null 2>&1 || . /lib/dracut-lib.sh

PATH=/usr/sbin:/usr/bin:/sbin:/bin
@@ -37,6 +59,7 @@ ismounted "${cos_state}" && exit 0

mkdir -p "${cos_state}"

+dofsCheck
doLoopMount

rm -r "${cos_state}"
@@ -248,6 +248,8 @@ for mount in "${mountpoints[@]}"; do
fstab+=$(mountOverlay "${mount%%:*}")
fi
else
+# FSCK
+systemd-fsck "${mount}"
fstab+=$(mountPersistent "${mount}")
fi
done
11 changes: 7 additions & 4 deletions packages/immutable-rootfs/30cos-immutable-rootfs/module-setup.sh
@@ -2,12 +2,13 @@

# called by dracut
check() {
+require_binaries "$systemdutildir"/systemd || return 1
return 255
}

-# called by dracut
+# called by dracut
depends() {
-echo rootfs-block dm
+echo systemd rootfs-block dm fs-lib
return 0
}

@@ -29,7 +30,7 @@ install() {
# Include utilities required for cos-setup services,
# probably a devoted cos-setup dracut module makes sense
inst_multiple -o \
-partprobe sync udevadm lsblk sgdisk parted mkfs.ext2 mkfs.ext3 mkfs.ext4 mkfs.vfat mkfs.fat mkfs.xfs blkid e2fsck resize2fs mount xfs_growfs umount
+"$systemdutildir"/systemd-fsck partprobe sync udevadm lsblk sgdisk parted mkfs.ext2 mkfs.ext3 mkfs.ext4 mkfs.vfat mkfs.fat mkfs.xfs blkid e2fsck resize2fs mount xfs_growfs umount
inst_hook cmdline 30 "${moddir}/parse-cos-cmdline.sh"
inst_script "${moddir}/cos-generator.sh" \
"${systemdutildir}/system-generators/dracut-cos-generator"
@@ -40,5 +41,7 @@ install() {
mkdir -p "${initdir}/${systemdsystemunitdir}/initrd-fs.target.requires"
ln_r "../cos-immutable-rootfs.service" \
"${systemdsystemunitdir}/initrd-fs.target.requires/cos-immutable-rootfs.service"
+ln_r "$systemdutildir"/systemd-fsck \
+"/sbin/systemd-fsck"
dracut_need_initqueue
-}
+}
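
A note on the `check()` return values used in this module: as far as I know, dracut treats `0` as "include the module", `1` as "do not include it", and `255` as "include it only when another module depends on it". A minimal sketch of that pattern is below. It tests for `systemd-fsck` itself rather than `systemd`, which is a variation for illustration and not what this PR does; the `require_binaries` stand-in and the default for `systemdutildir` are assumptions so the sketch can run outside dracut.

```shell
#!/bin/bash
# Stand-in for dracut's require_binaries so the sketch also runs outside
# dracut; inside dracut the real function is already defined.
if ! type require_binaries > /dev/null 2>&1; then
    require_binaries() {
        local bin
        for bin in "$@"; do
            [ -x "$bin" ] || command -v "$bin" > /dev/null 2>&1 || return 1
        done
        return 0
    }
fi

# Assumption: fall back to a common location when dracut has not set it.
: "${systemdutildir:=/usr/lib/systemd}"

# called by dracut
check() {
    # 1: never include the module, because systemd-fsck is missing.
    require_binaries "$systemdutildir/systemd-fsck" || return 1
    # 255: include the module only when another module depends on it.
    return 255
}
```

Checking for the binary that `install()` later copies would make a missing `systemd-fsck` fail early instead of being silently skipped by `inst_multiple -o`.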
8 changes: 7 additions & 1 deletion packages/immutable-rootfs/build.yaml
@@ -7,7 +7,13 @@ steps:
{{ if .Values.distribution }}
{{if eq .Values.distribution "opensuse" }}
# Mount /tmp as tmpfs by default as set by systemd itself
-- cp /usr/share/systemd/tmp.mount /etc/systemd/system
+- |
+/bin/bash -c " \
+if [ -e /usr/share/systemd/tmp.mount ]; then \
+cp /usr/share/systemd/tmp.mount /etc/systemd/system; \
Contributor

@Itxaka, May 25, 2022

is this something to do with the underlying distro change?

Contributor

Could be. Note that TW already has this mountpoint by default, and probably SLE for Rancher does too. The old logic was valid when opensuse meant Leap; if we include TW, this logic is safer.

Contributor Author

yup exactly, this happens to be in two different places - tw and leap have it differently

+else \
+cp /usr/lib/systemd/system/tmp.mount /etc/systemd/system; \
+fi "
{{end}}
{{end}}
- cp -r 30cos-immutable-rootfs /usr/lib/dracut/modules.d
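
The `tmp.mount` lookup above is a "first location that exists" fallback across the two places where Leap and TW ship the unit. A minimal sketch of the same idea as a reusable helper is below. The name `first_existing` is hypothetical and not part of this PR, and unlike the build step it fails explicitly when none of the candidates exists instead of attempting the last copy anyway.

```shell
#!/bin/bash
# Hypothetical helper, not part of this PR: print the first argument that
# exists as a regular file, or return 1 when none does.
first_existing() {
    local candidate
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# In the build step this could be used roughly as:
#   unit="$(first_existing /usr/share/systemd/tmp.mount \
#                          /usr/lib/systemd/system/tmp.mount)" \
#       && cp "$unit" /etc/systemd/system
```

Adding a third location later would then only mean appending one more argument.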
2 changes: 1 addition & 1 deletion packages/immutable-rootfs/definition.yaml
@@ -1,6 +1,6 @@
name: "immutable-rootfs"
category: "system"
-version: 0.4-5
+version: "0.5"
requires:
- name: "cos-setup"
category: "system"
2 changes: 1 addition & 1 deletion packages/initrd/definition.yaml
@@ -1,4 +1,4 @@
name: "dracut-initrd"
category: "system"
-version: 0.2-8
+version: "0.3"
description: "Dracut-based generated initrd"
37 changes: 37 additions & 0 deletions tests/fallback/fallback_test.go
@@ -151,6 +151,43 @@ var _ = Describe("cOS booting fallback tests", func() {
})
})

+Context("COS_PERSISTENT partition is corrupted", func() {
+It("boots in active when the persistent partition is damaged, and can be repaired with fsck", func() {
+
+// Just to make sure we can match against the same output of blkid later on
+// and that the starting condition is the one we expect
+Eventually(func() string {
+out, _ := s.Command("sudo blkid")
+return out
+}, 1*time.Minute, 10*time.Second).Should(ContainSubstring(`LABEL="COS_PERSISTENT"`))
+
+persistent, err := s.Command("blkid -L COS_PERSISTENT")
+Expect(err).ToNot(HaveOccurred())
+
+// This breaks the partition so it can be fixed with fsck
+_, err = s.Command("dd if=/dev/zero count=1 bs=4096 seek=0 of=" + persistent)
+Expect(err).ToNot(HaveOccurred())
+
+Eventually(func() string {
+out, _ := s.Command("sudo blkid")
+return out
+}, 5*time.Minute, 10*time.Second).ShouldNot(ContainSubstring(`LABEL="COS_PERSISTENT"`))
+
+s.Reboot()
+s.EventuallyConnects(700)
+
+Expect(s.BootFrom()).To(Equal(sut.Active))
+
+// We should see traces of fsck in the journal.
+// Note, this is a bit ugly because the only messages
+// we have from systemd-fsck is just failed attempts to run.
+// But this is enough for us to assess if it actually kicked in.
+out, err := s.Command("sudo journalctl")
+Expect(err).ToNot(HaveOccurred())
+Expect(out).To(ContainSubstring("systemd-fsck"))
+})
+})

Context("GRUB cannot mount image", func() {
When("COS_ACTIVE image was corrupted", func() {
It("fallbacks by booting into passive", func() {