Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow rebase to OSTree commits in container images #2940

Merged
merged 2 commits into from
Jul 16, 2021

Conversation

kelvinfan001
Copy link
Member

@kelvinfan001 kelvinfan001 commented Jun 28, 2021

UPDATED:
We can now do e.g.

$ booted_commit=$(rpm-ostree status --json | jq -r '.deployments[0].checksum')
$ sudo rpm-ostree ex-container export --repo=/ostree/repo ${booted_commit} containers-storage:localhost/fcos
$ sudo rpm-ostree rebase containers-storage:localhost/fcos:latest --experimental
$ rpm-ostree status
State: idle
Deployments:
  containers-storage:localhost/fcos:latest
                   Version: v2021.6-27-g279ff28768+dirty (2021-06-30T21:49:22Z)

* c57a5d1d1e22c79f4a85187f959d9a16c189f8352b840c9f3d40c55a47d3dc24
                   Version: v2021.6-27-g279ff28768+dirty (2021-06-30T21:49:22Z)

  fedora:fedora/x86_64/coreos/testing-devel
                   Version: 34.20210623.dev.0 (2021-06-23T18:16:51Z)
                    Commit: a59d1ad617e0fd0690e3f00d1f7e0fc7c2c18ad887e0bd57ceee3a9a03119672
              GPGSignature: (unsigned)

$ sudo reboot

...

$ rpm-ostree status
State: idle
Deployments:
* containers-storage:localhost/fcos:latest
                   Version: v2021.6-27-g279ff28768+dirty (2021-06-30T21:49:22Z)

  c57a5d1d1e22c79f4a85187f959d9a16c189f8352b840c9f3d40c55a47d3dc24
                   Version: v2021.6-27-g279ff28768+dirty (2021-06-30T21:49:22Z)

Note that a tag must be provided if the refspec is a container image reference when doing rpm-ostree rebase.

Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for working on this! At first glance looking pretty good!

@kelvinfan001
Copy link
Member Author

kelvinfan001 commented Jun 29, 2021

So I just realized that "refspec type" (e.g. ostree refspec, container image reference, rojig) and the the key in the origin file (e.g. origin/refspec, origin/baserefspec) are orthogonal (1). So it seems like we shouldn't be using the origin file's key as a way to determine the "refspec type" since it is currently being used to determine whether the deployment has layered/overridden packages.
Would it be a good idea to add another key to the [origin] group in the origin file to indicate the type of the refspec under refspec/baserefspec? I think this would actually fit better with the Deployments D-Bus API which currently returns a dictionary where an origin key contains the refspec, with no way of telling the refspec's type (without using guesswork).

(1): I think I was misled a bit by the now-removed rojig code that did also use the origin file keys to sort of indicate "refspec type", i.e. the possible keys could be origin/rojig, origin/refspec, origin/baserefspec.

@cgwalters
Copy link
Member

Right, I think we could use e.g.:

[origin]
container-image-reference = "quay.io/coreos/fedora-coreos:stable"

Then it's consistent with our proposed value in rpm-ostree status --json etc.

@kelvinfan001
Copy link
Member Author

kelvinfan001 commented Jun 29, 2021

hmm, I think I was proposing to not do that because we already have the refspec/baserefspec keys which are communicating whether the deployment has layered packages.
IOW, it seems sort of confusing to sometimes see a refspec key which doesn't indicate the refspec type (it could be TYPE_OSTREE or TYPE_CHECKSUM), and sometimes see container-image-reference which does indicate refspec type, but doesn't indicate whether anything is layered.

So I was proposing to add a separate key that will exist alongside refspec/baserefspec to indicate its type in the origin file.

Same for rpm-ostree status --json, I feel like it might make more sense to add a separate key.
Right now rpm-ostree status --json will display e.g. "origin" : "fedora:fedora/x86_64/coreos/testing-devel", so the origin key is being used to display the refspec. Would we be more backwards-compatible if we added an additional key that will exist alongside the existing origin key (just in case any users are relying on an origin key existing)?

@cgwalters
Copy link
Member

cgwalters commented Jun 29, 2021

IOW, it seems sort of confusing to sometimes see a refspec key which doesn't indicate the refspec type (it could be TYPE_OSTREE or TYPE_CHECKSUM), and sometimes see container-image-reference which does indicate refspec type, but doesn't indicate whether anything is layered.

The rationale for refspec versus baserefspec is mentioned here #2882 (comment)

Whereas in contrast with container-image-reference, typing ostree admin upgrade won't work - regardless of whether or not packages are layered. Except now here's where we get 🤯 - in theory it should. But doing so would require integrating ostree and ostree-rs-ext which is a large bridge to cross. Since rpm-ostree already has both Rust and libostree, I'd like to do it here first if that makes sense.

So I was proposing to add a separate key that will exist alongside refspec/baserefspec to indicate its type in the origin file.

I think we concluded that the type is already expressed by refspec - i.e we're defining that key to be pure ostree always.

Right now rpm-ostree status --json will display e.g. "origin" : "fedora:fedora/x86_64/coreos/testing-devel", so the origin key is being used to display the refspec.

Right. I think we wouldn't emit an origin key, we'd emit container-image-reference.

@kelvinfan001
Copy link
Member Author

ahh, ok this makes a lot of sense, thanks! #2882 (comment) was super helpful.

kelvinfan001 added a commit to kelvinfan001/rpm-ostree that referenced this pull request Jun 30, 2021
This mainly undoes
coreos@2c270a6,
and follows up on coreos#2842.
Also serves as some cleanup to make way for the introduction of a
new `container-image-reference` refspec type.

We no longer have any use for `ostree://` or `rojig://`-style
prefixes in the refspecs. There had been efforts to "canonicalize"
refspecs to always include such prefixes into the code internally,
but that hasn't gotten too far. Crucially, since such prefixes were
never introduced to libostree, the refspecs in the origin file
were never meant to contain such prefixes anyway.

Following some discussion in
coreos#2940 (comment),
in the future we would disambiguate "refspec type" via the name of
the key in the origin file instead.
@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch 3 times, most recently from c8b1487 to 8691fe4 Compare June 30, 2021 19:37
@kelvinfan001
Copy link
Member Author

kelvinfan001 commented Jun 30, 2021

I spent a good amount of time trying to refactor the existing code to not need to call rpmostree_refspec_classify() everywhere, and instead cache the refspec type somewhere and read from that cached value as the source of truth instead; but that led to quite significant changes and was a bit hard to debug. I think this PR shouldn't block on that for now, as the cleanup can perhaps be done in a followup PR.
For now though, we are relying on inferring the refspec type from a string whenever we need to; but as discussed in #2909 (comment), this might not be a reliable approach long term.
Ideally, the refspec type should be read/inferred from the origin file (which contains a bit more information via the name of the key that the refspec is stored in), or if coming from a CLI command, we require a switch that expresses the refspectype of the provided argument, as suggested at #2909 (comment).

@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch from 8691fe4 to 3745416 Compare June 30, 2021 22:32
kelvinfan001 added a commit to kelvinfan001/rpm-ostree that referenced this pull request Jul 5, 2021
This mainly undoes
coreos@2c270a6,
and follows up on coreos#2842.
Also serves as some cleanup to make way for the introduction of a
new `container-image-reference` refspec type.

We no longer have any use for `ostree://` or `rojig://`-style
prefixes in the refspecs. There had been efforts to "canonicalize"
refspecs to always include such prefixes into the code internally,
but that hasn't gotten too far. Crucially, since such prefixes were
never introduced to libostree, the refspecs in the origin file
were never meant to contain such prefixes anyway.

Following some discussion in
coreos#2940 (comment),
in the future we would disambiguate "refspec type" via the name of
the key in the origin file instead.
@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch 2 times, most recently from 134a5bc to 5800d1d Compare July 6, 2021 01:25
@kelvinfan001 kelvinfan001 changed the title [WIP] Allow rebase to OSTree commits in container images Allow rebase to OSTree commits in container images Jul 6, 2021
@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch from 5800d1d to 31c01e1 Compare July 6, 2021 02:24
@kelvinfan001 kelvinfan001 linked an issue Jul 6, 2021 that may be closed by this pull request
@kelvinfan001
Copy link
Member Author

Sounds good to make it experimental first. Will add a couple more sanity checks without duplicating everything for now.

@cgwalters
Copy link
Member

OK right, reproduced locally:

$ kola run ext.rpm-ostree.destructive.container-image --ssh-on-test-failure
...
Jul 13 20:26:37 qemu0 kola-runext-container-image[1375]: time="2021-07-13T20:26:37Z" level=fatal msg="Error committing the finished image: write /var/tmp/ostree-rs-ext6s8eI3/skopeo.pipe: broken pipe"
$

Hmm. This is clearly where we're taking a hit from forking skopeo as a child process. I fixed an EPIPE (error code for "broken pipe") bug like this in ostreedev/ostree-rs-ext@b924092 but this must be a new variant.

@cgwalters
Copy link
Member

OK I think I found the bug, testing...

@cgwalters
Copy link
Member

This is working for me ostreedev/ostree-rs-ext#50

@cgwalters
Copy link
Member

@kelvinfan001 Can you test this out too? See https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html#the-patch-section

diff --git a/Cargo.toml b/Cargo.toml
index 777d4a31..fd5f4a75 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -104,3 +104,6 @@ fedora-integration = []
 sanitizers = []
 
 default = []
+
+[patch.crates-io]
+ostree-ext = { path = '../../ostreedev/ostree-rs-ext/lib' }

Then in that repo check out my PR.

@cgwalters
Copy link
Member

Next issue is we need

diff --git a/tests/kolainst/destructive/container-image b/tests/kolainst/destructive/container-image
index e3e8faa1..c113bcd8 100755
--- a/tests/kolainst/destructive/container-image
+++ b/tests/kolainst/destructive/container-image
@@ -23,6 +23,8 @@ set -euo pipefail
 
 set -x
 
+cd $(mktemp -d)
+
 checksum=$(rpm-ostree status --json | jq -r '.deployments[0].checksum')
 rpm-ostree ex-container export --repo=/ostree/repo ${checksum} containers-storage:localhost/fcos
 rpm-ostree rebase containers-storage:localhost/fcos:latest

After that it looks like
Jul 14 13:38:38 qemu0 kola-runext-container-image[1328]: .deployments[0]["container-image-reference"] == 02d215c8c7dc664c83b55dcba98c94291721f908804872109f559fa5c07fec77 failed to match status.json
Because we have "container-image-reference" : "containers-storage:localhost/fcos:latest",.

@kelvinfan001
Copy link
Member Author

Ok, fixing issues with the test cases now, but no longer facing the skopeo issue. Thanks for digging into this bug!

@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch from 3f53747 to 7d901cd Compare July 15, 2021 15:02
@coreos coreos deleted a comment from openshift-ci bot Jul 15, 2021
@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch from 7d901cd to 792199b Compare July 15, 2021 15:13
@coreos coreos deleted a comment from openshift-ci bot Jul 15, 2021
@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch from 792199b to 6196c13 Compare July 15, 2021 16:29
@coreos coreos deleted a comment from openshift-ci bot Jul 15, 2021
@kelvinfan001
Copy link
Member Author

hmm... interesting how the new kolainst/destructive/container-image is passing in CI but not locally.
When I run the test locally rpm-ostree upgrade gets stuck in the transaction (though it does work fine when testing things manually in a VM).

Anyways, I think this should be ready for another look.

@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch 2 times, most recently from d21cdfc to 5aba7f9 Compare July 15, 2021 18:41
@coreos coreos deleted a comment from openshift-ci bot Jul 15, 2021
@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch 2 times, most recently from 77c6e66 to c388844 Compare July 15, 2021 20:57
@cgwalters
Copy link
Member

Not super important but feel free to squash my trivial commit into the earlier one, it's a bit of noise to have in the merged history I think.

@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch from c388844 to 0efaad8 Compare July 15, 2021 21:03
Comment on lines 469 to 470
g_autofree char *image_digest = strdup (import->image_digest.c_str());
rpmostree_origin_set_container_image_reference_digest (self->original_origin, image_digest);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
g_autofree char *image_digest = strdup (import->image_digest.c_str());
rpmostree_origin_set_container_image_reference_digest (self->original_origin, image_digest);
rpmostree_origin_set_container_image_reference_digest (self->original_origin, import->image_digest.c_str());

We don't need to make a new copy here AFAICS.

@@ -115,23 +118,51 @@ rpmostree_origin_parse_keyfile (GKeyFile *origin,

ret->cached_unconfigured_state = g_key_file_get_string (ret->kf, "origin", "unconfigured-state", NULL);

g_autofree char *refspec = g_key_file_get_string (ret->kf, "origin", "refspec", NULL);
if (!refspec)
/* Note that the refspec type can be inferred from the key in the orgin file, where
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Note that the refspec type can be inferred from the key in the orgin file, where
/* Note that the refspec type can be inferred from the key in the origin file, where

Comment on lines 158 to 160
g_autofree char *digest = g_key_file_get_string (ret->kf, "origin", "container-image-reference-digest", NULL);
if (digest)
ret->cached_digest = util::move_nullify (digest);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
g_autofree char *digest = g_key_file_get_string (ret->kf, "origin", "container-image-reference-digest", NULL);
if (digest)
ret->cached_digest = util::move_nullify (digest);
ret->cached_digest = g_key_file_get_string (ret->kf, "origin", "container-image-reference-digest", NULL);

Comment on lines 47 to 48
# Disable repos, no Internet access should be required
rm -rf /etc/yum.repos.d/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also libtest_prepare_offline, see some of the other tests.

rpm-ostree install ${KOLA_EXT_DATA}/rpm-repos/0/packages/x86_64/foo-1.2-3.x86_64.rpm
echo "ok install foo locally"

/tmp/autopkgtest-reboot 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine as is, but I think you can save a reboot by committing the pending deployment instead of rebooting and having it be the booted. (Something to look at after merge in other words)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah true. I was also thinking we wanted to do a rpm -q foo after the reboot to make sure the package is really there. Anyways, probably smart to save a reboot since I did notice that this test was taking a bit longer than the others :)

rpm-ostree ex-container export --repo=/ostree/repo ${new_commit} containers-storage:localhost/fcos
rpm-ostree uninstall foo
rpm-ostree upgrade > out.txt
assert_not_file_has_content out.txt "No upgrade available."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm wait isn't the point that we did get an upgrade here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh it's assert_NOT_file_has_content here. Sorry, that was confusing. I wanted to explicitly test that the upgrade command led to an available upgrade. But the couple lines following this is probably sufficient.

This adds support for providing a container image as the argument for
`rpm-ostree rebase` by teaching rpm-ostree to use ostree-rs-ext's new
import OSTree commits from container images feature.
`deploy` while based on a container image is not yet implemented.
Currently, a tag for the target container image must be provided; i.e.
omitting the tag will not be a shortcut for `latest`.
Previously, `rpm-ostree upgrade` was re-pulling a container image
each time, and throwing it away if it realizes that the commit is
the same as the currently booted deployment much later on.
Be smarter about this by checking the image reference digest, and
don't bother pulling the container image if the digest is the same.
@kelvinfan001 kelvinfan001 force-pushed the pr/rebase-container-image branch from 0efaad8 to ff130aa Compare July 15, 2021 22:44
Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 👍 🥳

@cgwalters cgwalters merged commit 7ac7ef4 into coreos:main Jul 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add rpm-ostree rebase registry:quay.io/coreos/fedora-coreos:stable
2 participants