Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read-only /sysroot remount racing with other systemd units #488

Closed
shivarammysore opened this issue May 20, 2020 · 11 comments
Closed

Read-only /sysroot remount racing with other systemd units #488

shivarammysore opened this issue May 20, 2020 · 11 comments

Comments

@shivarammysore
Copy link

Install FCOS on Lenovo T410 laptop. I did not see the below errors in late February release when I installed the same via Bootable ISO image

Fedora CoreOS 31.20200505.3.0
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/c/server/coreos/

[systemd]
Failed Units: 4
  systemd-backlight@backlight:acpi_video0.service
  systemd-backlight@backlight:intel_backlight.service
  systemd-rfkill.service
  systemd-rfkill.socket 
[core@localhost ~]$ sudo rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● ostree://fedora:fedora/x86_64/coreos/stable
                   Version: 31.20200505.3.0 (2020-05-19T13:39:46Z)
                    Commit: 01f074cc6cd88d8d2b43f821da692f2367c101eb4377802cb35092bde0ef02f7
              GPGSignature: Valid signature by 7D22D5867F2A4236474BF7B850CB390B3C3359C4
@lucab
Copy link
Contributor

lucab commented May 20, 2020

Thanks for the report. Those are possibly systemd issues and likely related to your specific hardware.

It would be good to get the journal entries for those failed units.

@lucab
Copy link
Contributor

lucab commented May 20, 2020

For reference, the stable image in this ticket has systemd 243.
It would be interesting trying with current next release too (32.20200517.1.0), which has latest systemd release 245.

@shivarammysore
Copy link
Author

[core@localhost ~]$ journalctl -f -u systemd-backlight@backlight:acpi_video0.service
-- Logs begin at Wed 2020-05-20 04:01:11 UTC. --
May 20 04:01:26 localhost systemd[925]: systemd-backlight@backlight:acpi_video0.service: Failed at step STATE_DIRECTORY spawning /usr/lib/systemd/systemd-backlight: Read-only file system
May 20 04:01:26 localhost systemd[1]: systemd-backlight@backlight:acpi_video0.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 04:01:26 localhost systemd[1]: systemd-backlight@backlight:acpi_video0.service: Failed with result 'exit-code'.
May 20 04:01:26 localhost systemd[1]: Failed to start Load/Save Screen Backlight Brightness of backlight:acpi_video0.
-- Reboot --
May 20 16:26:03 localhost systemd[1]: Starting Load/Save Screen Backlight Brightness of backlight:acpi_video0...
May 20 16:26:03 localhost systemd[725]: systemd-backlight@backlight:acpi_video0.service: Failed to set up special execution directory in /var/lib: Read-only file system
May 20 16:26:03 localhost systemd[725]: systemd-backlight@backlight:acpi_video0.service: Failed at step STATE_DIRECTORY spawning /usr/lib/systemd/systemd-backlight: Read-only file system
May 20 16:26:03 localhost systemd[1]: systemd-backlight@backlight:acpi_video0.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 16:26:03 localhost systemd[1]: systemd-backlight@backlight:acpi_video0.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save Screen Backlight Brightness of backlight:acpi_video0.

[core@localhost ~]$ journalctl -f -u systemd-backlight@backlight:intel_backlight.service
-- Logs begin at Wed 2020-05-20 04:01:11 UTC. --
May 20 04:01:26 localhost systemd[927]: systemd-backlight@backlight:intel_backlight.service: Failed at step STATE_DIRECTORY spawning /usr/lib/systemd/systemd-backlight: Read-only file system
May 20 04:01:26 localhost systemd[1]: systemd-backlight@backlight:intel_backlight.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 04:01:26 localhost systemd[1]: systemd-backlight@backlight:intel_backlight.service: Failed with result 'exit-code'.
May 20 04:01:26 localhost systemd[1]: Failed to start Load/Save Screen Backlight Brightness of backlight:intel_backlight.
-- Reboot --
May 20 16:26:03 localhost systemd[1]: Starting Load/Save Screen Backlight Brightness of backlight:intel_backlight...
May 20 16:26:03 localhost systemd[726]: systemd-backlight@backlight:intel_backlight.service: Failed to set up special execution directory in /var/lib: Read-only file system
May 20 16:26:03 localhost systemd[726]: systemd-backlight@backlight:intel_backlight.service: Failed at step STATE_DIRECTORY spawning /usr/lib/systemd/systemd-backlight: Read-only file system
May 20 16:26:03 localhost systemd[1]: systemd-backlight@backlight:intel_backlight.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 16:26:03 localhost systemd[1]: systemd-backlight@backlight:intel_backlight.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save Screen Backlight Brightness of backlight:intel_backlight.
^C
[core@localhost ~]$ journalctl -f -u systemd-rfkill.service
-- Logs begin at Wed 2020-05-20 04:01:11 UTC. --
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save RF Kill Switch Status.
May 20 16:26:03 localhost systemd[1]: Starting Load/Save RF Kill Switch Status...
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save RF Kill Switch Status.
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Start request repeated too quickly.
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save RF Kill Switch Status.
^C
[core@localhost ~]$ journalctl -f -u systemd-rfkill.socket
-- Logs begin at Wed 2020-05-20 04:01:11 UTC. --
May 20 04:01:26 localhost systemd[1]: Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
May 20 04:01:26 localhost systemd[1]: systemd-rfkill.socket: Failed with result 'service-start-limit-hit'.
-- Reboot --
May 20 16:26:03 localhost systemd[1]: Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.socket: Failed with result 'service-start-limit-hit'.
^C
[core@localhost ~]$ 

Will automatically update to the new release once it becomes available.

@jlebon
Copy link
Member

jlebon commented May 20, 2020

May 20 16:26:03 localhost systemd[726]: systemd-backlight@backlight:intel_backlight.service: Failed at step STATE_DIRECTORY spawning /usr/lib/systemd/systemd-backlight: Read-only file system

Ahhh yeah, this may be from the sysroot ro work (ostreedev/ostree#1767). Probably another instance of ostreedev/ostree#2104?

@jlebon
Copy link
Member

jlebon commented May 20, 2020

Though I guess in this case, it's because it's running before the /var mount? Can you show the output of journalctl -b 0 -u systemd-backlight@backlight:intel_backlight.service -u var.mount?

@cgwalters
Copy link
Member

OK yes there's clearly a race between

  do_remount ("/sysroot", !sysroot_readonly);
...
  do_remount ("/var", true);

basically we need to move this to the initramfs, there's no other sane fix.

@cgwalters
Copy link
Member

coreos/fedora-coreos-config#413 will back this out for now.

@shivarammysore
Copy link
Author

[core@localhost faucet]$ journalctl -b 0 -u systemd-rfkill.service -u var.mount
-- Logs begin at Wed 2020-05-20 04:01:11 UTC, end at Wed 2020-05-20 17:52:16 UTC. --
May 20 16:26:03 localhost systemd[1]: var.mount: Directory /var to mount over is not empty, mounting anyway.
May 20 16:26:03 localhost systemd[1]: Mounting /var...
May 20 16:26:03 localhost systemd[1]: Mounted /var.
May 20 16:26:03 localhost systemd[1]: Starting Load/Save RF Kill Switch Status...
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save RF Kill Switch Status.
May 20 16:26:03 localhost systemd[1]: Starting Load/Save RF Kill Switch Status...
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save RF Kill Switch Status.
May 20 16:26:03 localhost systemd[1]: Starting Load/Save RF Kill Switch Status...
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save RF Kill Switch Status.
May 20 16:26:03 localhost systemd[1]: Starting Load/Save RF Kill Switch Status...
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save RF Kill Switch Status.
May 20 16:26:03 localhost systemd[1]: Starting Load/Save RF Kill Switch Status...
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Main process exited, code=exited, status=238/STATE_DIRECTORY
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save RF Kill Switch Status.
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Start request repeated too quickly.
May 20 16:26:03 localhost systemd[1]: systemd-rfkill.service: Failed with result 'exit-code'.
May 20 16:26:03 localhost systemd[1]: Failed to start Load/Save RF Kill Switch Status.

jlebon added a commit to jlebon/ostree that referenced this issue May 20, 2020
We're hitting issues with the read-only remounts racing with various
services coming up. Let's neuter it for now until we rework how it
works.

See: coreos/fedora-coreos-tracker#488
jlebon added a commit to jlebon/ostree that referenced this issue May 20, 2020
We're hitting issues with the read-only remounts racing with various
services coming up. Let's neuter it for now until we rework how it
works.

See: coreos/fedora-coreos-tracker#488
jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue May 21, 2020
Which includes a patch to backport sysroot.readonly neutering:

coreos/fedora-coreos-tracker#488
ostreedev/ostree#2108
jlebon added a commit to coreos/fedora-coreos-config that referenced this issue May 22, 2020
Which includes a patch to backport sysroot.readonly neutering:

coreos/fedora-coreos-tracker#488
ostreedev/ostree#2108
jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue May 22, 2020
Which includes a patch to backport sysroot.readonly neutering:

coreos/fedora-coreos-tracker#488
ostreedev/ostree#2108
jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue May 22, 2020
Which includes a patch to backport sysroot.readonly neutering:

coreos/fedora-coreos-tracker#488
ostreedev/ostree#2108
jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue May 22, 2020
Which includes a patch to backport sysroot.readonly neutering:

coreos/fedora-coreos-tracker#488
ostreedev/ostree#2108
dustymabe pushed a commit to coreos/fedora-coreos-config that referenced this issue May 22, 2020
Which includes a patch to backport sysroot.readonly neutering:

coreos/fedora-coreos-tracker#488
ostreedev/ostree#2108
@lucab lucab added kind/bug status/pending-testing-release Fixed upstream. Waiting on a testing release. and removed needs/more-information component/systemd labels May 23, 2020
@lucab
Copy link
Contributor

lucab commented May 23, 2020

I think #483 is another instance of this issue.

@jlebon jlebon changed the title Failed Service units on fresh install Read-only /sysroot remount racing with other systemd units May 25, 2020
@dustymabe
Copy link
Member

The change to correct this problem landed in coreos/fedora-coreos-config#416.

The fix for this went into testing stream release 32.20200601.2.1. Please try out the new release and report issues.

@dustymabe dustymabe added status/pending-stable-release Fixed upstream and in testing. Waiting on stable release. and removed status/pending-testing-release Fixed upstream. Waiting on a testing release. labels Jun 4, 2020
@dustymabe
Copy link
Member

The fix for this went into stable stream release 32.20200601.3.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants