Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP]Mount raid0 device to /var #8692

Closed
wants to merge 1 commit into from

Conversation

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hongkailiu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 30, 2020
@hongkailiu
Copy link
Member Author

/hold

sh-4.4# systemctl status var.mount
● var.mount - /var
   Loaded: loaded (/run/systemd/generator/var.mount; generated)
   Active: active (mounted) since Thu 2020-04-30 00:02:35 UTC; 19h ago
    Where: /var
     What: /dev/mapper/coreos-luks-root-nocrypt
     Docs: man:ostree(1)
    Tasks: 0
   Memory: 0B
      CPU: 2ms
   CGroup: /system.slice/var.mount

Apr 30 00:02:35 localhost systemd[1]: Mounting /var...
Apr 30 00:02:35 localhost systemd[1]: Mounted /var.

@cgwalters
Copy link
Member

OK so there's an important thing here, which is that the MCO will barf trying to apply this change in place (because we can't do it sanely). So after applying this update, we'll need to "rotate" the machineset: https://github.com/openshift/machine-api-operator/blob/master/FAQ.md#after-i-edit-a-machineset-how-can-i-replace-the-existing-machines

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 30, 2020
@cgwalters
Copy link
Member

systemctl status var.mount

Right that comes from ostreedev/ostree#859 but...well, let me double check this works!

@cgwalters
Copy link
Member

Argh, I think we don't support /var as a mountpoint for RHCOS yet. This is one of the things fixed with the Ignition rework that happened in FCOS that will hopefully filter down at some point... The core problem is on RHCOS we use https://github.com/coreos/ignition-dracut/blob/spec2x/dracut/30ignition/ignition-ask-var-mount.service
which is ultimately a workaround for SELinux labeling issues from the kernel which we're waiting on a RHEL 8.3 rebase I think.

I'm trying to look at an ugly workaround.

@hongkailiu hongkailiu changed the title Mount raid0 device to /var [WIP]Mount raid0 device to /var Apr 30, 2020
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 30, 2020
@jlebon
Copy link
Member

jlebon commented Apr 30, 2020

The core problem is on RHCOS we use coreos/ignition-dracut:dracut/30ignition/ignition-ask-var-mount.service@spec2x
which is ultimately a workaround for SELinux labeling issues from the kernel which we're waiting on a RHEL 8.3 rebase I think.

Not exactly. The core issue is that spec2 doesn't have the semantics needed for separate filesystems which can be mounted at specific points under /sysroot (compare the documented meaning of storage.filesystems[].path between spec2 and spec3). This is what the [u]mount stages unlock in the spec3 rework (see coreos/ignition-dracut#47). The SELinux labeling from initrd bits came after.

So any "sensitive" mountpoints like /var and /var/home where we need to be able to write some things from the initrd will likely cause issues on spec2. Though things like /var/lib/containers or /var/log or whatever should work fine.

@hongkailiu
Copy link
Member Author

Journal from the node.

sh-4.4# journalctl -u  var-lib-containers.mount
-- Logs begin at Mon 2020-04-27 12:58:51 UTC, end at Thu 2020-04-30 19:07:46 UTC. --
Apr 30 00:02:05 ip-10-0-146-117 systemd[1]: Unmounting /var/lib/containers...
Apr 30 00:02:08 ip-10-0-146-117 systemd[1]: Unmounted /var/lib/containers.
Apr 30 00:02:08 ip-10-0-146-117 systemd[1]: var-lib-containers.mount: Consumed 3.231s CPU time

@cgwalters
Copy link
Member

The core issue is that spec2 doesn't have the semantics needed for separate filesystems which can be mounted at specific points under /sysroot (compare the documented meaning of storage.filesystems[].path between spec2 and spec3). This is what the [u]mount stages unlock in the spec3 rework (see coreos/ignition-dracut#47). The SELinux labeling from initrd bits came after.

OK, thanks.

Though things like /var/lib/containers or /var/log or whatever should work fine.

Oh they do, the tricky part is we want both of them. And I didn't do any investigation that it was /var/log running us out of IOPS (though what else would?) so it was easier to say "let's just do /var". But...I bet I can craft something that uses symlinks or bind mounts to redirect both of them to /var/mnt/raid or so.

@smarterclayton
Copy link
Contributor

/hold

@@ -42,7 +42,7 @@ spec:
contents: |-
[Mount]
What=/dev/md/containerraid
Where=/var/lib/containers
Where=/var
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why were we running container raid in thefirst place?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The instance types have two attached NVMe devices. CI workload is heavily container images - what else do you suggest?

@cgwalters
Copy link
Member

/close
this is obsoleted by #8715

@openshift-ci-robot
Copy link
Contributor

@cgwalters: Closed this PR.

In response to this:

/close
this is obsoleted by #8715

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hongkailiu hongkailiu deleted the raid0_path branch June 12, 2024 20:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants