specs-go: clarify mediatypes #411

Merged
merged 1 commit into from
Jan 20, 2017
4 changes: 1 addition & 3 deletions image-layout.md
@@ -62,7 +62,7 @@ The blobs directory MAY be missing referenced blobs, in which case the missing b

No semantic restriction is given for object names in the `refs` subdirectory.
Each object in the `refs` subdirectory MUST be of type `application/vnd.oci.descriptor.v1+json`.
-In general the `mediatype` of this [descriptor][descriptors] object will be either `application/vnd.oci.image.manifest.list.v1+json` or `application/vnd.oci.image.manifest.v1+json` although future versions of the spec MAY use a different mediatype.
+In general the `mediaType` of this [descriptor][descriptors] object will be either `application/vnd.oci.image.manifest.list.v1+json` or `application/vnd.oci.image.manifest.v1+json` although future versions of the spec MAY use a different mediatype.

**Implementor's Note:**
A common use case of refs is representing "tags" for a container image.
@@ -85,7 +85,6 @@ $ cat ./refs/v1.0 | jq
$ cat ./blobs/sha256/e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f | jq
{
"schemaVersion": 2,
-"mediaType": "application/vnd.oci.image.manifest.list.v1+json",
"manifests": [
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
@@ -103,7 +102,6 @@ $ cat ./blobs/sha256/e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7f
$ cat ./blobs/sha256/afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51 | jq
{
"schemaVersion": 2,
-"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": [
"mediaType": "application/vnd.oci.image.config.v1+json",
"size": 7023,
7 changes: 4 additions & 3 deletions manifest-list.md
@@ -4,6 +4,7 @@ The manifest list is a higher-level manifest which points to specific [image man
While the use of a manifest list is OPTIONAL for image providers, image consumers SHOULD be prepared to process them.

This section defines the `application/vnd.oci.image.manifest.list.v1+json` [media type](media-types.md).
+For the media type(s) that this document is compatible with, see the [matrix][matrix].

## *Manifest List* Property Descriptions

@@ -14,9 +15,8 @@ This section defines the `application/vnd.oci.image.manifest.list.v1+json` [medi

- **`mediaType`** *string*

-This REQUIRED property contains the media type of the manifest list.
-For this version of the specification, this MUST be set to `application/vnd.oci.image.manifest.list.v1+json`.
-For the media type(s) that this is compatible with, see the [matrix](media-types.md#compatibility-matrix).
+This property is *reserved* for use, to [maintain compatibility][matrix].
+When used, this field contains the media type of this document, which differs from the [descriptor](descriptor.md#properties) use of `mediaType`.
Contributor

This seems awkward. I'd expect "reserved" to mean "configs MUST NOT set this property, and we haven't assigned semantics to it". If it is optional and has defined semantics, what do you intend to change by reserving it too?

I think we'll have better Docker compatibility if we drop this property from the spec entirely (except for descriptor.mediaType).

Contributor @jonboulle, Jan 19, 2017

I kinda agree, this confused me on my read-through. Can we just say something like:

> This OPTIONAL property is RESERVED by the specification for backwards compatibility with older versions of Docker. This field MAY be removed in a future version of the specification.

and leave it at that.
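
The wording proposed above makes `mediaType` optional inside the document body. A minimal Go sketch of how a consumer might tolerate that, assuming a hypothetical `manifestListHeader` type and `checkReservedMediaType` helper (neither is part of specs-go). Rejecting a mismatched value is this sketch's own policy, not something the PR text requires:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// manifestListType is the media type the manifest list section defines.
const manifestListType = "application/vnd.oci.image.manifest.list.v1+json"

// manifestListHeader is a hypothetical, trimmed-down view of a manifest list.
// MediaType is optional: omitempty drops it from encoded output when unset.
type manifestListHeader struct {
	SchemaVersion int    `json:"schemaVersion"`
	MediaType     string `json:"mediaType,omitempty"`
}

// checkReservedMediaType accepts a document that omits mediaType and, when the
// field is present, requires it to name the manifest list media type.
// (Assumption: treating a mismatch as an error is a choice made for this sketch.)
func checkReservedMediaType(doc []byte) error {
	var h manifestListHeader
	if err := json.Unmarshal(doc, &h); err != nil {
		return err
	}
	if h.MediaType != "" && h.MediaType != manifestListType {
		return fmt.Errorf("unexpected mediaType %q", h.MediaType)
	}
	return nil
}

func main() {
	docs := []string{
		`{"schemaVersion": 2}`,
		`{"schemaVersion": 2, "mediaType": "` + manifestListType + `"}`,
		`{"schemaVersion": 2, "mediaType": "text/plain"}`,
	}
	for _, d := range docs {
		fmt.Println(checkReservedMediaType([]byte(d)))
	}
}
```

The first two documents pass and the third is rejected, so older Docker-style documents that still carry the field remain readable.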


- **`manifests`** *array of objects*

@@ -119,3 +119,4 @@ Instead they MUST ignore unknown properties.
```

[runtime-platform2]: https://github.com/opencontainers/runtime-spec/blob/v1.0.0-rc2/config.md#platform
+[matrix]: media-types.md#compatibility-matrix
6 changes: 3 additions & 3 deletions manifest.md
@@ -7,6 +7,7 @@ In OCI, this is codified in a [Manifest List](manifest-list.md).
The third goal is to be translatable to the [OCI Runtime Specification](https://github.com/opencontainers/runtime-spec).

This section defines the `application/vnd.oci.image.manifest.v1+json` [media type](media-types.md).
+For the media type(s) that this is compatible with see the [matrix](media-types.md#compatibility-matrix).

# Image Manifest

@@ -21,9 +22,8 @@ Unlike the [Manifest List](manifest-list.md), which contains information about a

- **`mediaType`** *string*

-This REQUIRED property contains the media type of the image manifest.
-For this version of the specification, this MUST be set to `application/vnd.oci.image.manifest.v1+json`.
-For the media type(s) that this is compatible with see the [matrix](media-types.md#compatibility-matrix).
+This property is *reserved* for use, to [maintain compatibility][matrix].
+When used, this field contains the media type of this document, which differs from the [descriptor](descriptor.md#properties) use of `mediaType`.

- **`config`** *[descriptor](descriptor.md)*

8 changes: 0 additions & 8 deletions schema/image-manifest-schema.json
@@ -11,13 +11,6 @@
"minimum": 2,
"maximum": 2
},
-"mediaType": {
-"id": "https://opencontainers.org/schema/image/manifest/mediaType",
-"type": "string",
-"enum": [
-"application/vnd.oci.image.manifest.v1+json"
-]
-},
"config": {
"$ref": "content-descriptor.json"
},
@@ -35,7 +28,6 @@
},
"required": [
"schemaVersion",
-"mediaType",
"config",
"layers"
]
8 changes: 0 additions & 8 deletions schema/manifest-list-schema.json
@@ -11,13 +11,6 @@
"minimum": 2,
"maximum": 2
},
-"mediaType": {
-"id": "https://opencontainers.org/schema/image/manifest-list/mediaType",
-"type": "string",
-"enum": [
-"application/vnd.oci.image.manifest.list.v1+json"
-]
-},
"manifests": {
"type": "array",
"items": {
@@ -31,7 +24,6 @@
},
"required": [
"schemaVersion",
-"mediaType",
"manifests"
]
}
1 change: 1 addition & 0 deletions specs-go/v1/config.go
@@ -78,6 +78,7 @@ type History struct {
}

// Image is the JSON structure which describes some basic information about the image.
+// This provides the `application/vnd.oci.image.config.v1+json` mediatype when marshalled to JSON.
Contributor

It is annoying that this one is inconsistent. Can't we safely add a media type field to it?

Member Author

I was thinking the same, especially since the field is `omitempty` for JSON. Perhaps make it optional with a SHOULD, so it stays compatible with older Docker? Thoughts, @stevvooe?

Member Author

(To that end, the same applies to the oci-layout file, which doesn't even have a media type assigned to it.)

Contributor

On Fri, Oct 21, 2016 at 07:12:12AM -0700, Vincent Batts wrote:

> (to that end, also the oci-layout file, and it doesn't even have a mediatype assigned for it)

+1 to assigning a media type to oci-layout. I don't expect to find it in CAS, but image-layout directories might be accessed over HTTP, and it would be strange to return application/json for it when all of our other schemas have more specific types.

On Fri, Oct 21, 2016 at 06:59:20AM -0700, Jonathan Boulle wrote:

> It is annoying that this one is inconsistent. Can't we safely add a media type field to it?

This is going the wrong way. If folks want to look inside the blob and try to guess its type (e.g. by unmarshalling into `Versioned`), they can do that. But I don't think we should require anyone to look inside the blob to figure out what it is. Descriptor references tell you the media type ahead of time, and we should be using those to identify blob types; that approach places no restrictions on the blob itself. So while we may want to keep the `Versioned` fields for backwards compat with Docker, I don't think we want to extend that approach to additional structures.

Contributor

@jonboulle In general, we should not actually be embedding mediaTypes in the target types. The mediaType is a lens to the data. Really, we should remove them from the types.

Contributor

Why not just follow along with literally every other format that exists and make it self-describing?

The only time I can see where we'd want this to be the recommended method for typing a blob is in signed assertions. For everything else, I'd rather have:

> The descriptor that sent me here said this was an `application/vnd.oci.image.manifest.v1+json`, so I'll attach it to my manifest handler…

Instead of:

> Let's see, does it have `ustar\000` at offset 257? No? Good, because I'm not sure how I would have figured out if that was using the `.wh.*` whiteout handling or the new [static] whiteout handling (#24). Do the first four bytes match `00 00 00 xx`? No? Ok, not UTF-32BE JSON. What about `00 xx 00 xx`? … Maybe the first byte is `{`? Ok, that sounds like UTF-8 JSON. Let me unmarshal it into `MediaTyped`. That worked! And it has a value from `mediaType`! It says it is `application/vnd.oci.image.manifest.v1+json`, so I'll seek back to the beginning and attach it to my manifest handler…

And again, I'm just arguing that we shouldn't be using peek-inside typing for image unpacking, etc. I'm ok with us deciding that we want to use it to save keystrokes on pre-CAS-push validation.
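
The two approaches contrasted in this comment can be sketched side by side. This is an illustration only: `descriptor`, `typeFromDescriptor`, and `sniffType` are hypothetical names, the sniffing covers just the ustar magic and UTF-8 JSON objects (not the UTF-32/UTF-16 checks mentioned above), and tolerating leading whitespace before `{` is a choice made for this sketch.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// descriptor is a hypothetical minimal stand-in for the spec's descriptor.
type descriptor struct {
	MediaType string
	Digest    string
}

// typeFromDescriptor trusts the referencing descriptor; the blob is never opened.
func typeFromDescriptor(d descriptor) string {
	return d.MediaType
}

// sniffType guesses a blob's type from its bytes alone ("peek-inside" typing).
func sniffType(blob []byte) (string, error) {
	// ustar-format tar archives carry the magic "ustar" at offset 257.
	if len(blob) >= 262 && bytes.Equal(blob[257:262], []byte("ustar")) {
		return "tar (whiteout convention unknown)", nil
	}
	// Only UTF-8 JSON objects are attempted here.
	trimmed := bytes.TrimSpace(blob)
	if len(trimmed) == 0 || trimmed[0] != '{' {
		return "", errors.New("unrecognized blob")
	}
	var peek struct {
		MediaType string `json:"mediaType"`
	}
	if err := json.Unmarshal(trimmed, &peek); err != nil {
		return "", err
	}
	if peek.MediaType == "" {
		return "", errors.New("JSON object has no mediaType field")
	}
	return peek.MediaType, nil
}

func main() {
	blob := []byte(`{"schemaVersion": 2, "mediaType": "application/vnd.oci.image.manifest.v1+json"}`)
	// The digest is a placeholder, not a real content address.
	d := descriptor{MediaType: "application/vnd.oci.image.manifest.v1+json", Digest: "sha256:placeholder"}

	fmt.Println("from descriptor:", typeFromDescriptor(d))
	sniffed, err := sniffType(blob)
	fmt.Println("from sniffing:", sniffed, err)
}
```

The descriptor path is a single field read, while the sniffing path fails outright for a JSON blob that omits `mediaType`, which is the situation this PR creates for manifests.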

Member

@wking You're arguing that there's a dichotomy between "free for all, no need to have mediaType" and "heuristic so that we can recognise every blob type". I don't think there is one.

The benefit of being able to know what a JSON blob is meant to represent is entirely separate from "I can tell what every blob in the image is without references". If you're not happy with detecting tar files (like file and libmagic do) that's fine. But please let's not make all of our JSON objects meaningless blobs that require jumping through references in order to even understand what we're looking at (or keeping the type information out-of-band).

Contributor

> But please let's not make all of our JSON objects meaningless blobs that require jumping through references in order to even understand what we're looking at…

You're saying “assuming (for some out-of-band reason) that the blob is a JSON object which contains a self-describing mediaType field, we can use that mediaType field to unambiguously identify the content”. That initial assumption is what I'm worried about. If you see cases where you are comfortable making that assumption (for whatever out-of-band reasons), then great, use peek-inside type detection based on the mediaType value. But I'd strongly recommend consumers use the referencing descriptor's mediaType to avoid having to rely on that assumption.

Member @cyphar, Nov 4, 2016

> But I'd strongly recommend consumers use the referencing descriptor's mediaType to avoid having to rely on that assumption.

Your consistent implication that all consumers will have access to the entire image is getting annoying. If I have a tool like oci-do-something which I pipe a JSON object to, I don't expect that it will be reading the repository. In fact, I might have a service that modifies the JSON objects (and therefore actually cannot access the original repo). So you can't "use the referencing descriptor" because there isn't one (that you can see).

Now, you might argue that we should send the out-of-band media type with it. But why should that be a requirement? What are you gaining by removing mediaType?

Contributor

> Now, you might argue that we should send the out-of-band media type with it. But why should that be a requirement?

That's exactly what I'll argue ;). And unless you implement completely generic peek-inside detection (which I don't think anyone's arguing for), you're going to have to transmit some amount of media-type guidance along with your blob content. I'm suggesting that guidance be the media type.

You seem to be suggesting that that guidance be “this blob is a JSON object which contains a self-describing mediaType field”. Maybe you transmit that information because the tool-caller knows the tool can only handle such media types and therefore only feeds matching blobs into the tool. That's how oci-image-validate works, and I'm comfortable with that from a keystroke-saving perspective. However, I don't think we should pretend that this approach is completely free of out-of-band type guidance.

> What are you gaining by removing mediaType?

I'm not suggesting we remove `mediaType`, because some users (e.g. you with oci-do-something, or a number of people with oci-image-validate's autodetection) can't be bothered to pass media types around. And I'm fine with that (typing out a long media type is not something I'd like to do repeatedly).

I'm just suggesting image-handling tools follow the spec's SHOULD and use descriptors to reference blob content, with peek-inside type detection being reserved for signed-assertions. And having acquired the media type from the referencing descriptor (or because we authored the blob ourselves), I see no need for image-handling tools to use peek-inside type detection.

Perhaps our difference here is that I see (almost) all tooling as being descriptor-based, while you see the tooling as being isolated-blob based. Since I'm fine leaving existing mediaType entries in place, maybe we can just wait a year to see how that plays out and revisit this discussion then?
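
One way to reconcile the two positions in this thread is a tool that prefers a media type supplied out of band and only peeks inside the blob when none is given. A rough sketch, assuming a hypothetical `resolveMediaType` helper; it is not how oci-image-validate or any other existing tool is implemented:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// resolveMediaType is a hypothetical helper. hint is the media type supplied
// out of band (for example from a referencing descriptor or a command-line
// flag); it may be empty when the caller has no such information.
func resolveMediaType(hint string, blob []byte) (string, error) {
	// Descriptor-style typing: trust the hint and leave the blob unopened.
	if hint != "" {
		return hint, nil
	}
	// Peek-inside typing: assumes the blob is a JSON object that may carry
	// a self-describing mediaType field.
	var peek struct {
		MediaType string `json:"mediaType"`
	}
	if err := json.Unmarshal(blob, &peek); err != nil {
		return "", fmt.Errorf("no hint and blob did not decode as a JSON object: %w", err)
	}
	if peek.MediaType == "" {
		return "", errors.New("no hint and no embedded mediaType")
	}
	return peek.MediaType, nil
}

func main() {
	manifest := []byte(`{"schemaVersion": 2, "mediaType": "application/vnd.oci.image.manifest.v1+json"}`)

	fmt.Println(resolveMediaType("", manifest))
	fmt.Println(resolveMediaType("application/vnd.oci.image.config.v1+json", []byte(`{}`)))
	fmt.Println(resolveMediaType("", []byte(`{}`)))
}
```

The last call fails, which is the piped-blob case raised above: without either a hint or an embedded field the tool has nothing to go on.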

type Image struct {
// Created defines an ISO-8601 formatted combined date and time at which the image was created.
Created string `json:"created,omitempty"`
5 changes: 3 additions & 2 deletions specs-go/v1/descriptor.go
@@ -15,9 +15,10 @@
package v1

// Descriptor describes the disposition of targeted content.
+// This structure provides `application/vnd.oci.descriptor.v1+json` mediatype when marshalled to JSON
type Descriptor struct {
-// MediaType contains the MIME type of the referenced object.
-MediaType string `json:"mediaType"`
+// MediaType is the media type of the object this schema refers to.
+MediaType string `json:"mediaType,omitempty"`

// Digest is the digest of the targeted content.
Digest string `json:"digest"`
2 changes: 1 addition & 1 deletion specs-go/v1/manifest.go
@@ -16,7 +16,7 @@ package v1

import "github.com/opencontainers/image-spec/specs-go"

-// Manifest defines a schema2 manifest
+// Manifest provides `application/vnd.oci.image.manifest.list.v1+json` mediatype structure when marshalled to JSON.
type Manifest struct {
specs.Versioned

3 changes: 2 additions & 1 deletion specs-go/v1/manifest_list.go
@@ -50,7 +50,8 @@ type ManifestDescriptor struct {
Platform Platform `json:"platform"`
}

-// ManifestList references manifests for various platforms.
+// ManifestList references manifests for various platforms.
+// This structure provides `application/vnd.oci.image.manifest.list.v1+json` mediatype when marshalled to JSON.
type ManifestList struct {
specs.Versioned
Contributor

If Versioned contains MediaTyped (which is what you currently have), isn't it redundant to list MediaTyped here?

Member Author

Yes, technically. It does not change anything, and it makes it very apparent that the struct is media typed.


3 changes: 0 additions & 3 deletions specs-go/versioned.go
@@ -20,7 +20,4 @@ package specs
type Versioned struct {
// SchemaVersion is the image manifest schema that this image follows
SchemaVersion int `json:"schemaVersion"`
-
-// MediaType is the media type of this schema.
-MediaType string `json:"mediaType"`
}
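
The effect of these struct changes on the encoded JSON can be sketched as follows. The types are local copies trimmed for illustration rather than imports of specs-go; `manifestSketch`, `encode`, and the `sha256:abc` digest are placeholders introduced for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Versioned mirrors specs.Versioned after this change: no MediaType field.
type Versioned struct {
	SchemaVersion int `json:"schemaVersion"`
}

// Descriptor mirrors the two v1.Descriptor fields visible in this diff.
type Descriptor struct {
	MediaType string `json:"mediaType,omitempty"`
	Digest    string `json:"digest"`
}

// manifestSketch is a hypothetical cut-down manifest that embeds Versioned.
type manifestSketch struct {
	Versioned
	Config Descriptor `json:"config"`
}

// encode marshals v to a JSON string and panics on failure (sketch only).
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	m := manifestSketch{
		Versioned: Versioned{SchemaVersion: 2},
		Config: Descriptor{
			MediaType: "application/vnd.oci.image.config.v1+json",
			Digest:    "sha256:abc", // placeholder, not a real digest
		},
	}
	// The top-level object no longer carries mediaType; the descriptor does.
	fmt.Println(encode(m))
	// With omitempty, an unset descriptor MediaType is dropped entirely.
	fmt.Println(encode(Descriptor{Digest: "sha256:abc"}))
}
```

With `omitempty` on `Descriptor.MediaType`, a descriptor built without a media type encodes with only its `digest`, so callers that depend on the field being present need to check for it.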