Debian/Root on ZFS: modules missing from initrd after kernel update #10355

Closed
mmitch opened this issue May 22, 2020 · 3 comments

mmitch commented May 22, 2020

System information

Type                  Version/Name
Distribution Name     Debian
Distribution Version  Buster / 10.4
Linux Kernel          4.19.118-2
Architecture          amd64
ZFS Version           0.8.3-1~bpo10+1
SPL Version           0.8.3-1~bpo10+1

Describe the problem you're observing

I have set up a machine using the Debian Buster Root on ZFS instructions (ping to @rlaager).

It has already happened twice that, after an update of the Debian kernel package, the system did not boot with the new kernel because the ZFS modules could not be loaded from the initrd.

Describe how to reproduce the problem

  • install a system as described in Debian Buster Root on ZFS
  • wait for a Debian kernel update

(I did not test this on a second system, and I don't know when the next kernel update in Debian Buster is due.)

Include any warning/errors/backtraces from the system logs

No screenshot of the boot messages, sorry.

Preliminary analysis

I use unattended-upgrades to install security updates automatically, and it also reboots the machine automatically when the kernel has been updated. Twice now the system has failed to boot after such an update because the ZFS modules were not present in the initrd.

In both cases, I could boot the previous kernel via grub, manually rebuild the initrd for the new kernel and then the next boot succeeded. No data was lost, I only had some unexpected downtime on the server.
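
The manual rebuild mentioned above might look roughly like the following sketch. It assumes initramfs-tools' update-initramfs is the tool in use and that DKMS has already built the ZFS modules for the new kernel; rebuild_initrd and MODULES_DIR are hypothetical names introduced only for illustration.

```shell
#!/bin/sh
# Sketch: rebuild the initrd for a given kernel after booting an older one.
# rebuild_initrd is a hypothetical helper, not part of any package.
# MODULES_DIR is overridable only so the pre-check can be tried without root.
rebuild_initrd() {
    kver=${1:-}
    modules_dir=${MODULES_DIR:-/lib/modules}
    if [ -z "$kver" ]; then
        echo "usage: rebuild_initrd <kernel-version>" >&2
        return 2
    fi
    # Refuse to rebuild if the DKMS-built zfs.ko is not there yet;
    # otherwise the new initrd would be missing it again.
    if [ ! -f "$modules_dir/$kver/updates/dkms/zfs.ko" ]; then
        echo "zfs.ko not built for $kver; check 'dkms status'" >&2
        return 1
    fi
    update-initramfs -u -k "$kver"
}
```

Run as root from the previous kernel, for example `rebuild_initrd 4.19.0-9-amd64`.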

I think that this has happened in both cases:

  1. a kernel update is installed
  2. the initrd is rebuilt
  3. the zfs-dkms modules are rebuilt against the new kernel

Because the ZFS kernel modules are built after the initrd has already been built, the new ZFS modules for the new kernel are missing from the initrd.
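
To catch this before a reboot, the initrd's file listing can be checked for both modules. The following is a minimal sketch; has_zfs_modules is a hypothetical helper that reads a listing on stdin, and it assumes uncompressed module files named spl.ko and zfs.ko as on this Buster system.

```shell
#!/bin/sh
# Sketch: succeed only if a file listing (e.g. from lsinitramfs) contains
# both spl.ko and zfs.ko. has_zfs_modules is a hypothetical helper.
has_zfs_modules() {
    listing=$(cat)
    for mod in spl zfs; do
        # match ".../spl.ko" or ".../zfs.ko" at the end of a path
        printf '%s\n' "$listing" | grep -q "/${mod}\.ko\$" || return 1
    done
    return 0
}

# Intended use (assumes initramfs-tools' lsinitramfs is installed):
#   lsinitramfs /boot/initrd.img-4.19.0-9-amd64 | has_zfs_modules \
#       || echo "ZFS modules missing from initrd"
```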

The ZFS DKMS sources explicitly don't request a rebuild of the initrd after the ZFS modules have been rebuilt:

$ grep INITRD /usr/src/zfs-0.8.3/dkms.conf 
REMAKE_INITRD="no"

Proposed fix

Having the ZFS modules in the initrd is probably not important for everybody, so REMAKE_INITRD="no" is a useful default. But with Root on ZFS, this setting leads to the problem I have described.

According to the DKMS documentation, REMAKE_INITRD can be overridden in a file /etc/dkms/zfs.conf containing this:

# override for /usr/src/zfs-*/dkms.conf:
# always rebuild initrd when zfs module has been changed
# (either by a ZFS update or a new kernel version)
REMAKE_INITRD='yes'

I have not yet tested this, but according to the documentation this should do the trick if my analysis is correct.
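
Creating the override file could be scripted along these lines. This is only a sketch: write_zfs_override is a hypothetical helper, and its optional directory argument exists so it can be tried against a scratch directory instead of /etc/dkms.

```shell
#!/bin/sh
# Sketch: create the DKMS override for zfs unless one is already in place.
# write_zfs_override and its directory argument are illustrative only.
write_zfs_override() {
    conf_dir=${1:-/etc/dkms}
    conf_file="$conf_dir/zfs.conf"
    # Leave an existing REMAKE_INITRD setting alone.
    if [ -f "$conf_file" ] && grep -q '^REMAKE_INITRD=' "$conf_file"; then
        return 0
    fi
    mkdir -p "$conf_dir" || return 1
    cat >> "$conf_file" <<'EOF'
# override for /usr/src/zfs-*/dkms.conf:
# always rebuild initrd when zfs module has been changed
# (either by a ZFS update or a new kernel version)
REMAKE_INITRD='yes'
EOF
}
```

Called as root without arguments, it writes /etc/dkms/zfs.conf; a second call leaves the file unchanged.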

Could a note regarding this workaround/fix be included in the Debian Buster Root on ZFS documentation?

rlaager (Member) commented May 22, 2020

This analysis sounds reasonable. Any chance you could test this? I'm happy to merge this change, but it'd be nice to confirm it works as intended.

rlaager (Member) commented May 24, 2020

It seems this has been discussed before, so I'll treat that as (mostly) confirmation: #849

I have this committed. It'll get pushed with the other updates I'm making.

rlaager closed this as completed May 24, 2020

mmitch (Author) commented May 24, 2020

Thanks for spotting the duplicate – I should have inspected the closed issues more thoroughly :-/

I was trying to provide a proof by reinstalling the current kernel, but this did not work as expected:

1. initial status

ZFS modules are present in both the system and the current initrd.
The workaround is present in /etc/dkms/zfs.conf

$ uname -a
Linux derpy 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2 (2020-04-29) x86_64 GNU/Linux

$ dkms status
zfs, 0.8.3, 4.19.0-8-amd64, x86_64: installed
zfs, 0.8.3, 4.19.0-9-amd64, x86_64: installed

$ lsinitramfs /boot/initrd.img-4.19.0-9-amd64 | grep -E '(zfs|spl).ko'
usr/lib/modules/4.19.0-9-amd64/updates/dkms/spl.ko
usr/lib/modules/4.19.0-9-amd64/updates/dkms/zfs.ko

$ cat /etc/dkms/zfs.conf
# override for /usr/src/zfs-*/dkms.conf:
# always rebuild initrd when zfs module has been changed
# (either by a ZFS update or a new kernel version)
REMAKE_INITRD='yes'

2. disable workaround

$ mv /etc/dkms/zfs.conf /etc/dkms/zfs.conf.DISABLED

3. remove ZFS modules and initrd from system (look out!)

$ dkms remove -k 4.19.0-9-amd64 zfs/0.8.3
[…]
DKMS: uninstall completed.

$ dkms status
zfs, 0.8.3, 4.19.0-8-amd64, x86_64: installed

$ update-initramfs -d -k 4.19.0-9-amd64
update-initramfs: Deleting /boot/initrd.img-4.19.0-9-amd64

$ ls /boot/initrd.img-4.19.0-9-amd64
ls: cannot access '/boot/initrd.img-4.19.0-9-amd64': No such file or directory

4. reinstall the current kernel package to simulate a kernel update

$ apt-get reinstall linux-image-4.19.0-9-amd64
Reading package lists... Done
Building dependency tree       
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 0 not upgraded.
Need to get 0 B/48.2 MB of archives.
After this operation, 0 B of additional disk space will be used.
(Reading database ... 231446 files and directories currently installed.)
Preparing to unpack .../linux-image-4.19.0-9-amd64_4.19.118-2_amd64.deb ...
Unpacking linux-image-4.19.0-9-amd64 (4.19.118-2) over (4.19.118-2) ...
Setting up linux-image-4.19.0-9-amd64 (4.19.118-2) ...
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-4.19.0-9-amd64
cryptsetup: ERROR: Couldn't resolve device rpool/ROOT/debian
cryptsetup: WARNING: Couldn't determine root device
cryptsetup: WARNING: The initramfs image may not contain cryptsetup binaries 
    nor crypto modules. If that's on purpose, you may want to uninstall the 
    'cryptsetup-initramfs' package in order to disable the cryptsetup initramfs 
    integration and avoid this warning.
/etc/kernel/postinst.d/zz-update-grub:
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.19.0-9-amd64
Found initrd image: /boot/initrd.img-4.19.0-9-amd64
Found linux image: /boot/vmlinuz-4.19.0-8-amd64
Found initrd image: /boot/initrd.img-4.19.0-8-amd64
Adding boot menu entry for EFI firmware configuration
done

5. now modules are present on the system but not in the initrd

$ dkms status
zfs, 0.8.3, 4.19.0-8-amd64, x86_64: installed
zfs, 0.8.3, 4.19.0-9-amd64, x86_64: installed

Whoops, here they are, they are NOT missing:

$ lsinitramfs /boot/initrd.img-4.19.0-9-amd64 | grep -E '(zfs|spl).ko'
usr/lib/modules/4.19.0-9-amd64/updates/dkms/spl.ko
usr/lib/modules/4.19.0-9-amd64/updates/dkms/zfs.ko

Simply reinstalling the kernel image does not seem to be enough to trigger the bug :-(

My next planned steps were:

  • re-enable the workaround
  • remove ZFS modules and initrd (again)
  • reinstall the current kernel package (again)
  • check that ZFS modules are now present both on system and in initrd
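
The planned steps above could be scripted roughly as follows, reusing the commands from the transcript. This is a sketch, not a tested procedure: run, retest_with_workaround, and the DRY_RUN, KVER and ZFSVER variables are hypothetical names, and by default the commands are only printed.

```shell
#!/bin/sh
# Sketch of the planned re-test. With DRY_RUN=1 (the default here) commands
# are only printed; set DRY_RUN=0 and run as root to execute them.
# KVER and ZFSVER default to the versions from this report.
KVER=${KVER:-4.19.0-9-amd64}
ZFSVER=${ZFSVER:-0.8.3}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

retest_with_workaround() {
    # 1. re-enable the workaround
    run mv /etc/dkms/zfs.conf.DISABLED /etc/dkms/zfs.conf
    # 2. remove ZFS modules and initrd (again)
    run dkms remove -k "$KVER" "zfs/$ZFSVER"
    run update-initramfs -d -k "$KVER"
    # 3. reinstall the current kernel package (again)
    run apt-get reinstall "linux-image-$KVER"
    # 4. check that the modules are on the system and in the initrd
    run dkms status
    run sh -c "lsinitramfs /boot/initrd.img-$KVER | grep -E '(zfs|spl)\.ko'"
}
```

Calling `retest_with_workaround` with the defaults prints the six commands for review before anything is changed.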

rlaager added a commit to openzfs/openzfs-docs that referenced this issue May 27, 2020
The initrd needs to be rebuilt after the ZFS modules are built.
Otherwise, the system can (will?) fail to boot.

Closes: openzfs/zfs#10355
Reported-by: Christian Garbs <mitch@cgarbs.de>
Signed-off-by: Richard Laager <rlaager@wiktel.com>