update: Always use podman pull+cp
#2751
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: cgwalters
A while ago we switched to using `oc image extract` in order to reduce the I/O writes done to the host, but it turned out that doesn't yet work in disconnected environments that need ImageContentSourcePolicy.

Now, in https://bugzilla.redhat.com/show_bug.cgi?id=2000195 we discovered that the podman fallback broke due to `user.*` extended attributes in the content (which will hopefully be removed soon).

But still, a good part of the value proposition of OpenShift is that we work *consistently* across platforms. Having two ways to apply OS updates is not worth the maintenance overhead.

Eventually this flow will be more native to rpm-ostree, xref coreos/fedora-coreos-tracker#812 and https://github.com/ostreedev/ostree-rs-ext/#module-container-encapsulate-ostree-commits-in-ocidocker-images
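For readers unfamiliar with the "podman pull+cp" flow named in the title, the following is a minimal sketch of the general shape of such a sequence. It is not the MCO's actual implementation: the function name, container name, image reference, and destination path are all hypothetical, and the real code may pass additional flags. The sketch only builds the argument lists and does not run podman.

```python
from typing import List


def podman_extract_commands(
    image: str, dest: str, name: str = "os-content-example"
) -> List[List[str]]:
    """Return argv lists for a pull, create, cp, rm sequence.

    Nothing is executed here; the container name and paths are
    illustrative assumptions, not values taken from the MCO.
    """
    return [
        # Fetch the image into local container storage.
        ["podman", "pull", image],
        # Create (but do not start) a container so its filesystem can be copied.
        ["podman", "create", "--name", name, image],
        # Copy the container's root filesystem to a directory on the host.
        ["podman", "cp", f"{name}:/", dest],
        # Remove the temporary container afterwards.
        ["podman", "rm", name],
    ]


if __name__ == "__main__":
    # Hypothetical image reference and destination directory.
    for argv in podman_extract_commands(
        "registry.example.com/os-content:latest", "/run/example-extract"
    ):
        print(" ".join(argv))
```

Compared with extracting directly from the registry, this sequence stores the image locally before copying it out, which is the extra host I/O the description refers to.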
5c251a3 to fb5a223
Looking at #1941 it seems we had SELinux concerns as well. If anything we should wait until 4.10 branches. cc @sinnykumari for some more context.
Yeah, a definite possibility. (I didn't test this before submitting BTW)

But the thing is, if there are SELinux or really any other issues, then those would also break in the case of disconnected, and we need that to work. So I think it's better to have one way to do it and ensure our single OS update path is tested on each e.g. e2e-aws run. (edit: and accept the cost of this extra I/O for control plane and other hosts for now)
While e2e-agnostic-upgrade failed, AFAICS it failed on a known flake; the OS update part worked.
If I understand correctly, PR coreos/coreos-assembler#2406 is the actual fix for BZ#2000195, and this PR is a cleanup to keep one flow for OSImage extraction?

I am not convinced that we should go back to using podman for all cases. Since 4.6, debugging a failure shouldn't be difficult with the current two flows, because podman is always a fallback.
Yeah, I hope anyways. It hasn't been verified yet.
I think the problem isn't so much debugging it (though it was a bit tricky in this case) as the fact that a customer OS update hit it in the field. What if it turned out the podman fallback had been silently broken? Then it'd be much harder to work around; we'd need to ship them a custom MCD, etc.

Hmmm...I think we don't even have an informing job for disconnected in Prow? That seems like a large oversight.

So my argument to merge this is basically that disconnected is important and we should only have one OS update path. But if you prefer to close this, that's also OK by me.

Ultimately we will be replacing this bit with the ostree-native container bits, which should be a lot better than either. But that's still a ways away, which leaves a fair bit of time for new bugs to creep into the podman path and be hit only in the field.
Right, we don't have good CI coverage for disconnected.
+1
Right, makes sense considering the ostree-native container work is going to take a while. As mentioned earlier using