Formalize support for zstd compression: v1.1.0 ? #803
Comments
@jonjohnsonjr @vbatts @mikebrow @dmcgowan @SteveLasker ptal (not sure if this is the right location for this discussion, or if it should be discussed in the OCI call; I just noticed this, so thought I'd write it down 😬 😅) |
I had similar issues interpreting "ignore". The In case of a call, I will do my best to join. |
I must admit I'm not the most proficient reader of specifications, but good to hear I'm not the only person that was a bit confused by it 😅 (which may warrant expanding that passage a bit to clarify the intent). I guess "ignoring" will lead to an "error" in any case, because skipping "unknown media types" should likely lead to a failure to calculate the digest 🤔. Still, having some more words to explain would be useful. |
Thanks, @thaJeztah! I also felt some relief 😄 @tych0, could you elaborate a bit on your use case? I don't want to break you a second time 👼 |
I'm not sure (3) solves the underlying problem here. That defines a way for understanding the media type, but it doesn't necessarily mean that clients can handle all possible permutations of a media type. The main issue is that if clients start pushing up images with |
Sure, I'm putting squashfs files in OCI images instead of gzipped tarballs, so I can direct mount them instead of having to extract them first. The "MUST ignore" part of the standard lets me do this, because tools like skopeo happily copy around OCI images with underlying blob types they can't decode. If we suddenly change the standard to not allow unknown blob types in images and allow tools to reject them, use cases like this will no longer be possible. Indeed, the standard does not need to change for docker to generate valid OCI images with zstd compression. The hard work goes into the tooling on the other end, but presumably docker has already done that. It might be worth adding a few additional known blob types to the spec here: https://github.com/opencontainers/image-spec/blob/master/media-types.md#oci-image-media-types but otherwise I don't generally understand the goals of this thread. |
I think in case of Skopeo, Skopeo itself is not consuming the image, and is used as a tool to pull those images; I think that's more the "distribution spec" than the "image spec" ? I think a runtime that does not support a specific type of layer should be able to reject that layer, and not accept "any" media-type. What use would there be for a runtime to pull an image with (say) For such cases, I think it'd make more sense to reject the image (/layer). |
No; the distribution spec is for repos serving content over http. skopeo translates to/from OCI images according to the OCI images spec.
If someone asks you to run something you can't run, I agree an error is warranted. But in the case of skopeo, it is a tool that is perfectly capable of handling layers with mime types it doesn't understand, and I think similar tools should not error out either. |
Yeah, poor choice of words; was trying to put in words that Skopeo itself is not the end-consumer of the image (hope I'm making sense).
The confusion in the words picked in the specs is about "mime types it doesn't understand". What makes a tool compliant with the image-spec? Should it be able to parse the manifest, or also be able to process the layers? Is While I understand the advantage of having some flexibility, if the spec does not dictate anything there, how can I know if an image would work with some tool implementing image-spec "X" ? Currently it For Skopeo's case, even though the mediaType is "unknown to the implementation", Skopeo is able to "handle" / "process" the layer (within the scope it's designed for), so perhaps "unknown" should be changed to something else; e.g. implementations should / must produce an error if they're not able to "handle" / "process" a layer-type. |
That seems like a reasonable clarification to me! |
Regarding the ambiguity of the MUST clause. The intention of that sentence is to say that implementations should act as though the layer (or manifest) doesn't exist if it doesn't know how to do whatever the user has requested, and should use an alternative layer (or manifest) if possible. This is meant to avoid implementations just breaking and always giving you an error if some extension was added to an image which doesn't concern that implementation -- it must use an alternative if possible rather than giving a hard error. Otherwise any new media-types will cause endless problems. In the example of pulling image data, arguably the tool supports pulling image data regardless of the media-type so there isn't any issue of it being "unknown [what to do with the blob] to the implementation" -- but if the image pulling is being done in order for an eventual unpacking step then you could argue that it should try to pull an alternative if it doesn't support the image type. I agree this wording could be a bit clearer though, this change was done during the period of some of the more contentious changes to the image-spec in 2016. Given that the above was the original intention of the language, I don't think it would be a breaking change to better clarify its meaning.
This is being worked on by @SteveLasker. The idea was to first register just one media-type so we get an idea of how the process works, and then to effectively go and register the rest. |
Another issue with the current way of representing compression is that the ordering of multiple media type modifiers (such as compression or encryption) isn't really well-specified since MIME technically doesn't support such things. There was some discussion last year about writing a library for dealing with MIME types so that programs can easily handle such types, but I haven't seen much since then. |
Ack: please assume the other mediaTypes will be registered. I'm providing clarity in the Artifacts Spec to help with both these issues. Once the Artifacts spec is merged, with clarity on the registration process, I'll register the other types. For the compression, what I think we're saying is this: There are other tools, like skopeo, (I think) or ORAS which work on any artifact type pushed to a registry. In these cases, they need to know some conventions to be generic. But, in the case of ORAS, it intentionally doesn't know about a specific artifact type and simply provides auth, push, pull of layers associated with a manifest. It's the outer wrapper, like Helm or Singularity that provide specific details on layer processing. We have an open agenda for the 4/22 call to discuss. |
I see I forgot to reply to some of the comments
So, I was wondering about that: I can see this "work" for a multi-manifest(ish) image, in which case there could be multiple variations of an image (currently used for multi-arch), and I can use "one" of those, but I'm having trouble understanding how this works for a single image. What if an image has layers with mixed compression?
I think it's technically possible to have mixed compressions. For example, in a situation where an existing image is pulled (using, e.g. However, the "reverse" could also make a valid use-case, to create a "fat/hybrid" image, offering alternative compressions for systems that support it ("gzip" layers for older clients, "zstd" for newer clients that support it). Looks like this needs further refinement to describe how this should be handled.
Thanks! I recall seeing a discussion (on the mailing list?) about registering, but noticed "some" were registered, but others were not, so thought I'd check 👍 |
Yes, absolutely agree with Sebastiaan, picking some layers you understand and rejecting the rest is meaningless, and the semantics are not defined. There is no way to construct an image with zstd compression that is compatible with both older and newer clients. This only works for very limited workflows where you synchronously update all your clients and then update the images you generate, it does not work at all for people wanting to distribute public images, for example, where basically you cannot use zstd because there is no way to make an image anyone can use. A manifest list mechanism would be workable, but the current design just doesn't seem fit for purpose, and I think we should revert it. |
I think the way to move forward is to add support for zstd to the different clients but still keep the gzip compression as the default. Generating these images should not be the default yet, but the more we postpone zstd support in clients, the longer it will take to switch to it. I don't see anything wrong if an older client, in 1-2 years, fails to pull newer images. |
The problem is that currently the correct behavior is effectively "undefined".
See my earlier comment about layers using mixed compression (which IMO should be a valid use case). Without any definition how these images should be handled, it would not be possible to keep them interoperable.
|
What about just adding the clarification you already proposed above, i.e.
Doesn't that define it well enough? |
Unfortunately, it doesn't, because for runtimes that support both Take the following example; {
"layers": [
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
"size": 12345,
"digest": "sha256:deadbeef"
},
{
"mediaType": "application/vnd.oci.image.layer.v1.tar+zstd",
"size": 34567,
"digest": "sha256:badcafe"
}
]
} The above would be ambiguous, as it could either mean;
In the above, While it's possible to define something along the lines of "MUST" pick one compression, and only use layers with the same compression, this would paint us in a corner, and disallow use-case All of this would've been easier if digests were calculated over the non-compressed artifacts (and compression being part of the transport), but that ship has somewhat sailed. Perhaps it would be possible with a new media-type ( |
I don't think case 1 you've provided is legal. Per https://github.com/opencontainers/image-spec/blob/master/manifest.md#image-manifest-property-descriptions we have, "The final filesystem layout MUST match the result of applying the layers to an empty directory." So I think the specification already states that it must be case 2. |
yes, I think it should be case 2, an image made of two different layers. It would be very confusing to support case 1 this way. |
"Applying the layers" is very ambiguous combined with the other requirements (more below:)
Which means that there's no way to have images that are compatible with both any of the existing runtimes and runtimes that support As
Which means that any of the current runtimes |
I don't think that's what it means at all. It means it won't work this specific way, but I can imagine other ways in which it would.
That's why I think your proposed clarification is useful: runtimes who can't "process" the layer should error out when asked to. In particular, that's exactly what will happen in current implementations: they will try to gunzip the zstd blob, realize they can't, and fail. |
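The proposed clarification (error out when asked to process a layer you cannot handle) can be illustrated with a minimal sketch. It assumes a hypothetical runtime that only knows how to unpack plain and gzip-compressed tar layers; the names `SUPPORTED_LAYER_TYPES`, `UnsupportedLayerError` and `check_layers` are made up for this example and are not part of any spec or tool. Unlike the gunzip failure described above, this check rejects the image before any blob is read.

```python
# Media types this hypothetical runtime can unpack (an assumption for the example).
SUPPORTED_LAYER_TYPES = {
    "application/vnd.oci.image.layer.v1.tar",
    "application/vnd.oci.image.layer.v1.tar+gzip",
}


class UnsupportedLayerError(Exception):
    """Raised when a layer's media type cannot be processed by this runtime."""


def check_layers(manifest, supported=SUPPORTED_LAYER_TYPES):
    """Return the manifest's layers in order, or raise on the first
    layer whose media type is not in `supported`."""
    layers = manifest.get("layers", [])
    for position, layer in enumerate(layers):
        media_type = layer.get("mediaType")
        if media_type not in supported:
            raise UnsupportedLayerError(
                f"layer {position} ({layer.get('digest')}): "
                f"cannot process media type {media_type!r}"
            )
    return layers
```

A copy-only tool in the style of skopeo would simply not call such a check, since it never unpacks the layers.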
Can you elaborate on what other ways? |
Sure, but I don't think it's relevant for whether or not zstd support should be in the spec. With your proposed clarification, I think the spec would be very clear about the expected behavior when runtimes encounter blobs they don't understand (and for tools like e.g. skopeo, who can shuttle these blobs around without understanding them, which is my main concern). We are already using non-standard mime types in layers at my organization, and because the tooling support for this is not very good, right now we just disambiguate by using a "foo-squashfs" tag for images that are squashfs-based, and a "foo" tag for the tar-based ones. However, since tag names are really just annotations, you could imagine having an additional annotation, maybe "org.opencontainers.ref.layer_type" to go along "org.opencontainers.ref.name" that people use as tags, that would just be the layer type. Then, in a tool like skopeo, you would do something like To make this backwards compatible, I suspect always listing the tar-based manifest as the first one in the image would mostly work, assuming tools don't check for multiple images with the same tag and fail. But maybe it wouldn't, I haven't really played around with it. In any case, just using tags to disambiguate works totally fine, even though it's ugly and better tooling support would be appreciated. |
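As a rough sketch of the annotation idea in the comment above: the function below picks a manifest descriptor from an image index by tag name and a preferred layer type. Note that "org.opencontainers.ref.layer_type" is the commenter's hypothetical annotation, not something defined by the image-spec, and `select_manifest` and the fallback rule (take the first descriptor with the right name, which is where the tar-based manifest is assumed to be listed) are assumptions made for illustration only.

```python
REF_NAME = "org.opencontainers.ref.name"
# Hypothetical annotation suggested in the comment above; not part of the spec.
LAYER_TYPE = "org.opencontainers.ref.layer_type"


def select_manifest(index, name, preferred_layer_types):
    """Return the descriptor for `name` whose layer-type annotation matches
    the earliest entry in `preferred_layer_types`. If none match, fall back
    to the first descriptor with that name; return None if the name is absent."""
    candidates = [
        d for d in index.get("manifests", [])
        if d.get("annotations", {}).get(REF_NAME) == name
    ]
    for layer_type in preferred_layer_types:
        for descriptor in candidates:
            if descriptor.get("annotations", {}).get(LAYER_TYPE) == layer_type:
                return descriptor
    return candidates[0] if candidates else None
```

With a tar-based and a squashfs-based manifest both tagged "foo", a squashfs-aware tool would call `select_manifest(index, "foo", ["squashfs"])`, while a tool unaware of the annotation would keep taking the first match.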
Adding new compression formats to a specific type is goodness to bring that artifact forward with new capabilities. Providing consistent behavior across an ecosystem of successful deployment of multiple versions seems the problem. This is very akin to the multi-arch approach. I'm not saying we should actually use multi-arch manifests, but the concept is what we seem to need here. For reference, we debated this with Teleport. We didn't want to change the user model, or require image owners to publish a new format. When someone pushes content to a teleport enabled registry, we automatically convert it. When the client makes a request, it sends header information that says it supports teleport. The registry can then hand back teleport references to blobs. So, there are two models to consider here:
This is also similar to what registries do with docker and OCI manifests. They get converted on the fly. I recognize converting a small json file is far quicker than multi-gb blobs. Ultimately, it seems like we need to incorporate the full end to end experience and be careful to not destabilize the e2e container ecosystem while we provide new enhancements and optimizations. |
(IIUC) tools like skopeo should not be really affected for your specific use-case as they for that use-case are not handling the actual image, and are mainly used as a tool to do a full download of whatever artifacts/blobs are referenced (also see my earlier comments #803 (comment) and #803 (comment))
I feel like this is now replicating what manifest-lists were for (a list of alternatives to pick from); manifest lists currently allow differentiating on architecture, and don't have a dimension for "compression type". Adding that would be an option, but (for distribution/registry) may mean an extra roundtrip (image/tag -> os/architecture variant -> layer-compression variant), or add a new dimension besides "platform". Which looks to be what @SteveLasker is describing as well;
Regarding;
Docker manifests are OCI manifests; I think the only conversion currently still present is for old (Schema 2 v1) manifest (related discussion on that in opencontainers/distribution-spec#212), and is being discussed to deprecate / disable (docker/roadmap#173) I'd be hesitant to start extracting and re-compressing artifacts. This would break the contract of content addressability, or more specific: what guarantee do I have that the re-compressed artifact has the same content as the artifact that was pushed?. If we want to separate compression from artifacts, then #803 (comment) is probably a better alternative;
|
FYI #880 may be interesting to folks. |
Please add zstd. Pulling images is so slow because of decompression. |
/cc @zvonkok |
For those that aren't supporting zstd today, what is preventing it? Which tools don't support it yet, and what is blocking the PRs to add support? For those that require this feature, do we have metrics showing that decompression is a slower step than the network download of the blob? Without that, decompressing during the download (instead of waiting for the download to complete) would provide a performance improvement without zstd. |
For Podman/CRI-O, one zstd feature we are interested in using is "skippable frames" so we can embed some metadata in the compressed stream. In addition to the faster decompression, zstd also requires less CPU time. On my machine, using pigz for comparison as it is much faster than GNU gzip, I get:
|
What is preventing Podman/CRI-O from supporting these features today? Is there a PR blocked because of OCI?
Is this from content that was pulled from a registry, or content stored locally in a compressed state? |
nothing really :-) We are planning to use zstd by default on Fedora 41: https://fedoraproject.org/wiki/Changes/zstd:chunked
stored locally in a compressed state |
Do we need to differentiate between zstd and zstd+chunked? |
No, it is still a valid zstd file. Clients that do not use the additional metadata will simply ignore it. |
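For readers unfamiliar with skippable frames, here is a small sketch of the framing defined by the zstd format (RFC 8878): a little-endian magic number in the range 0x184D2A50 to 0x184D2A5F, a 4-byte little-endian payload size, then the payload, which decoders skip over. The helper names are made up for this example, and what Podman/CRI-O actually store in such frames is not shown here.

```python
import struct

# Skippable-frame magic numbers defined by the zstd frame format.
SKIPPABLE_MAGIC_MIN = 0x184D2A50
SKIPPABLE_MAGIC_MAX = 0x184D2A5F


def make_skippable_frame(payload, magic=SKIPPABLE_MAGIC_MIN):
    """Wrap `payload` (bytes) in a skippable frame: magic, size, data."""
    if not SKIPPABLE_MAGIC_MIN <= magic <= SKIPPABLE_MAGIC_MAX:
        raise ValueError("magic number is outside the skippable range")
    return struct.pack("<II", magic, len(payload)) + payload


def read_skippable_frame(buf, offset=0):
    """Return (payload, next_offset) if a skippable frame starts at `offset`,
    or None if the bytes there are not a skippable frame."""
    if len(buf) - offset < 8:
        return None
    magic, size = struct.unpack_from("<II", buf, offset)
    if not SKIPPABLE_MAGIC_MIN <= magic <= SKIPPABLE_MAGIC_MAX:
        return None
    start = offset + 8
    end = start + size
    if end > len(buf):
        raise ValueError("truncated skippable frame")
    return buf[start:end], end
```

A client that does not care about the metadata never needs `read_skippable_frame`; a conforming zstd decoder steps over these frames on its own.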
What is the media type of |
it is |
How does a client then know if the layer is chunked? Does it always need to fetch the header and somehow determine if it is chunked using magic bytes? |
it could do that, or use the annotations we added for that layer, e.g.: {
"MIMEType": "application/vnd.oci.image.layer.v1.tar+zstd",
"Digest": "sha256:9efd019b05bc504fcc4d0e244f3c996eb2c739f3274ab5cc746e0f421044c041",
"Size": 113639090,
"Annotations": {
"io.github.containers.zstd-chunked.manifest-checksum": "sha256:f67017010afe34d9a5df4c1f65c6ff7ac7a452b57e7febea91d80ed9da51841e",
"io.github.containers.zstd-chunked.manifest-position": "111910713:1066869:6231165:1"
}
} so it can immediately fetch the TOC and validate it against the checksum that is recorded in the manifest as well |
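A minimal sketch of the annotation-based check described above, assuming a layer descriptor shaped like the JSON in this comment. The helper name `chunked_info` is made up for illustration, and since the thread does not spell out what the four colon-separated numbers in the position annotation mean, the sketch keeps them as an opaque tuple of integers rather than naming them.

```python
CHECKSUM_KEY = "io.github.containers.zstd-chunked.manifest-checksum"
POSITION_KEY = "io.github.containers.zstd-chunked.manifest-position"


def chunked_info(descriptor):
    """Return the zstd:chunked annotations of a layer descriptor, or None
    if the layer carries no such annotations (a plain zstd layer)."""
    # Accept both the capitalised keys shown above and lower-case OCI keys.
    annotations = descriptor.get("Annotations") or descriptor.get("annotations") or {}
    checksum = annotations.get(CHECKSUM_KEY)
    position = annotations.get(POSITION_KEY)
    if checksum is None or position is None:
        return None
    # The field meanings are not defined in this thread; keep them opaque.
    fields = tuple(int(part) for part in position.split(":"))
    return {"checksum": checksum, "position": fields}
```

A client that gets `None` back would treat the blob as an ordinary `tar+zstd` layer.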
I just spent a bit of time rereading this thread (it's a long one) and a lot has happened since the original issue.
Given everything that has happened, what is left to complete before resolving this issue?
Anything that I'm missing? If not, I can work on a PR for the |
(adding the meaningful bit of my thoughts from today's call)
IMO, the intermediate state of "many older tools don't support the layers at all" (zstd) is worse than "many older tools will have increased storage, but everything will be functional" (uncompressed / transport compression) 👀 One of these is a really viable intermediate state to allow actual transition (that happens to also solve other interesting problems like layer digest vs DiffID) while the other is a complete non-starter (at least, from my own perspective as a large publisher of a number of popular images). |
(oh, and I'm +1 on adding zstd to the "SHOULD support" list) |
2 cents: the real world (containerd, buildkit, docker, buildah, etc) already supports zstd. The fact that the image spec lags behind just introduces a discrepancy. From a technical POV, zstd is just ultimately superior to gzip in all aspects. In order to continue to make sense, the image spec should declare zstd as a supported format. |
The image spec was updated to support it, back in #788, which was part of every RC and the GA of v1.1.0. This issue is more about saying that implementations "SHOULD" support it, which is mostly semantics because, as you've said, all the major tools already do. |
While reviewing moby/moby#40820, I noticed that support for zstd was merged in master (proposal: #787, implementation in #788 and #790), and some runtimes started implementing this;
However, the current (v1.0.1) image-spec does not yet list zstd as a supported compression, which means that not all runtimes may support these images, and the ones that do are relying on a non-finalized specification, which limits interoperability (something that I think this specification was created for in the first place).
I think the current status is not desirable; not only does it limit interoperability (as mentioned), it will also cause complications for Golang projects using this specification as a dependency; go modules will default to the latest tagged release, and some distributions (thinking of Debian) are quite strict about the use of unreleased versions. Golang projects that want to support zstd would either have to "force" go mod to use a non-released version of the specification, or work around the issue by using a custom implementation (similar to the approach that containerd took: containerd/containerd#3649).
In addition to the above, concerns were raised about the growing list of media-types (#791), and suggestions were made to make this list more flexible.
The Image Manifest Property Descriptions section currently describes:
Followed by:
This part is a bit ambiguous (perhaps that's just my interpretation of it though);
+zstd layer mediatype is not in the MUST list, is there any reason for including it in the list of OCI Media Types? After all, any media types not included in the list "could" be supported by an implementation, and must otherwise be ignored.
What's the way forward with this?
1. v1.1.0, only defining +zstd as a possible compression format for layers, but no requirement for implementations of the v1.1.0 specification to support them
2. +zstd compression format to the list of required media types, and tag v1.1.0; projects implementing v1.1.0 of the specification MUST support zstd layers, or otherwise implement v1.0.x
3. v1.1.0
4. v1.1.0 release (1. or 2.), and leave 3. for a future (v1.2.0) release of the specification.
On a side-note, I noticed that the vnd.oci.image.manifest.v1+json was registered, but other mediatypes, including media-types for image layers are not; should they be?