
The codecs parameter should have a formal definition of the use of the combination operators. #71

Closed
cconcolato opened this issue Mar 21, 2019 · 15 comments · Fixed by #78

@cconcolato
Contributor

As proposed in #63, Section 2 should define formally the use of the + and | operators.

@cconcolato
Contributor Author

In particular, the proposal in bullet point 2 of #63 (comment) is a good text.

@skynavga changed the title from "The codecs parameter should have a formal definition of the use of the combination operators" to "The codecs parameter should have a formal definition of the use of the combination operators." Mar 27, 2019
@skynavga skynavga added the agenda Items for discussion in the next meeting label Mar 27, 2019
@css-meeting-bot
Member

The Timed Text Working Group just discussed The codecs parameter should have a formal definition of the use of the combination operators. tt-profile-registry#71.

The full IRC log of that discussion:
<nigel> Topic: The codecs parameter should have a formal definition of the use of the combination operators. tt-profile-registry#71
<nigel> github: https://github.com//issues/71
<nigel> Cyril: I discussed this with Mike and think he has the same view as me. We can discuss this on a call when
<nigel> .. all of I, Mike and Glenn are on the call.
<nigel> Nigel: Okay, let's come back to this another day

@css-meeting-bot
Member

The Timed Text Working Group just discussed TTML Profile Registry.

The full IRC log of that discussion:
<cyril> Topic: TTML Profile Registry
<cyril> nigel: about issue 71
<nigel> github: https://github.com//issues/71
<cyril> cyril: the request is to add a substantive change in the IANA section that defines the combination operators
<cyril> ... mike is not on the call for a while
<cyril> ... but he does think that it ought to be there
<cyril> ... my recollection is that it should be defined and compliant to RFC
<cyril> glenn: we have to qualify what you mean by "define"
<nigel> s/cyril/nigel
<cyril> ... it is not defined in the body of the media type registration
<cyril> nigel: at the moment it is defined what it means
<nigel> scribe: nigel
<nigel> Cyril: The syntax is defined but absent from the IANA registry.
<nigel> .. It seems odd to have an informal definition when the rest is formal.
<nigel> Glenn: The only issue previously was if we should trigger the IANA review process.
<nigel> .. If everyone is happy with that then we can go ahead.
<nigel> Pierre: I'm not objecting to going through the IANA process but noting that we have trouble
<nigel> .. getting this out so I'm concerned about our level of resource.
<nigel> Glenn: We have updated our document and IANA references our document so we have formally
<nigel> .. updated it.
<cyril> scribe: Cyril
<cyril> cyril: I will prepare a PR
<cyril> nigel: I don't share the concern about resources given that we don't have a hard deadline

@css-meeting-bot
Member

The Timed Text Working Group just discussed TTML Profile Registry The codecs parameter should have a formal definition of the use of the combination operators. #71.

The full IRC log of that discussion:
<nigel> Topic: TTML Profile Registry The codecs parameter should have a formal definition of the use of the combination operators. #71
<nigel> github: https://github.com//issues/71
<nigel> Nigel: It's been a while since we opened this and we haven't managed to get to it.
<nigel> Cyril: I haven't had a chance to work on it.
<nigel> Nigel: Comment from 11 April, we need Cyril, Mike and Glenn on the call. Let's move on for today, since we don't have all those people.

@css-meeting-bot
Member

The Timed Text Working Group just discussed The codecs parameter should have a formal definition of the use of the combination operators. tt-profile-registry#71.

The full IRC log of that discussion:
<nigel> Topic: The codecs parameter should have a formal definition of the use of the combination operators. tt-profile-registry#71
<nigel> github: https://github.com//issues/71
<nigel> Nigel: [reminds group of the issue]
<nigel> Pierre: Some of the primary users of this are in the community where Cyril and Mike
<nigel> .. participate so I'm not sure we can make much progress without them.
<nigel> Nigel: I know what you mean.

@mikedo

mikedo commented Jul 9, 2020

Copying #63 substance to this new issue so we don't have to cover old ground...

mikedo commented on Feb 22, 2019

A reminder that any changes to the media type information in section 2 require IANA and IETF expert review through the normal process between W3C and IANA. This should be done before publication.

The codecs parameter was created to be used in ISO BMFF and DASH MPD. Thus, it needs to conform to RFC 6381 syntax. That is, the entire codecs parameter string must conform to RFC 6381 (not just the profile code). The RFC 6381 citation should be moved to the codecs parameter text.

If we want to constrain the charset of the profile code, then we can point to 4CC in ISO/IEC 14496-12. Note that a recent update to that constrained 4CC to a subset of ASCII (0x20-0x7E).
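As a rough illustration of the charset constraint mentioned above, the following Python sketch checks that a candidate profile code is exactly four characters long and stays within the ASCII range 0x20-0x7E. The function name `is_printable_4cc` is hypothetical, and the check models only the range quoted in this comment; ISO/IEC 14496-12 remains the authority on what a 4CC may contain.

```python
def is_printable_4cc(code: str) -> bool:
    """Return True if code is exactly four characters in the range 0x20-0x7E.

    Hypothetical helper: it models only the ASCII-subset constraint quoted
    in the comment above and performs no registry lookup.
    """
    return len(code) == 4 and all(0x20 <= ord(ch) <= 0x7E for ch in code)


print(is_printable_4cc("im1t"))       # four printable ASCII characters
print(is_printable_4cc("im1"))        # too short
print(is_printable_4cc("im1\u00e9"))  # contains a non-ASCII character
```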

skynavga commented on Feb 28, 2019

@mikedo a few questions/comments

Since the W3C is the designated "change controller", then is it correct to say that an IETF expert review is strictly a formality, and not required by process?

As specified, the registration does not refer to RFC 6381 and does not define a formal syntax (or charset) but implies that the syntax is effectively

codecs-parameter-value :
short-identifier [ ( "|" | "+" ) short-identifier ]*

short-identifier :
4*<any TOKEN character except "|" and "+">
I can see that it may be desirable to make this syntax more concrete, and perhaps even more restricted, e.g., allowing only ASCII alpha and digit characters in a short-identifier.

Do you think we need to do this in the next edition of the published NOTE or in a subsequent edition (giving us more time to resolve these changes separately from pushing out the door our current changes)?

mikedo commented on Feb 28, 2019

I'm confident in the review process for changes. Philippe manages it I believe (he did last time).

RFC 6381 is cited here: https://w3c.github.io/tt-profile-registry/#Registration_Entry_Requirements_and_Update_Process It needs to be in the codecs parameter paragraph instead. It does not help to have it for the individual codes.

We need to ensure the syntax is valid now, hopefully before anyone uses the more complex operator syntax. I have a 6381 regex; if someone will prepare a few complex examples and they pass, then we can defer perfecting the syntax. The 6381 citation needs to be fixed, though, so that we're covered until there is a more concrete syntax.

skynavga commented on Mar 1, 2019

Here are some possible values for the codecs parameter

im2t
im2t|im2i
im2t+im2i
tt2p|tt1p|im1t|im1i
tt2p+tt2t|tt1p+tt1t|im1t|im1i
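
The example values above can be checked against the implied syntax with a small Python sketch. This is not the RFC 6381 regex mentioned in the thread. It assumes the stricter option floated earlier, in which a short-identifier contains only ASCII letters and digits and has at least four characters. The names `SHORT_IDENTIFIER`, `CODECS_VALUE` and `is_valid_codecs_value` are hypothetical.

```python
import re

# Assumption: identifiers are restricted to ASCII letters and digits (the
# stricter option floated above), with at least four characters each.
SHORT_IDENTIFIER = r"[A-Za-z0-9]{4,}"
CODECS_VALUE = re.compile(rf"{SHORT_IDENTIFIER}(?:[|+]{SHORT_IDENTIFIER})*")


def is_valid_codecs_value(value: str) -> bool:
    """Return True if value matches the implied combination syntax."""
    return CODECS_VALUE.fullmatch(value) is not None


examples = [
    "im2t",
    "im2t|im2i",
    "im2t+im2i",
    "tt2p|tt1p|im1t|im1i",
    "tt2p+tt2t|tt1p+tt1t|im1t|im1i",
]
for value in examples:
    print(value, is_valid_codecs_value(value))
```

Under these assumptions all five values match, while strings with a trailing or doubled operator (for example `im2t|` or `im2t++im2i`) do not.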

mikedo commented on Mar 5, 2019

I have a draft, work-in-progress XML regex for DASH MPD @codecs (RFC 6381). The above examples all validate. That’s good and gives some level of confidence that we can defer a more concrete syntax for the profile code itself.

@mikedo

mikedo commented Jul 9, 2020

I look forward to a proposal for the IANA codecs parameter syntax (I am not the proponent of this). Once that is baked, then the registry update is trivial.

@nigelmegitt
Contributor

Thanks for this @mikedo , very helpful.

@mikedo

mikedo commented Jul 9, 2020

And from PR #69 :

cconcolato commented on Mar 14, 2019

I'm not sure the note as is really useful. It's true that the codecs parameter of TTML follows a specific pattern, either a 4CC or a set of +/|-combined 4CCs, but the purpose of this section should not be about how the codecs in TTML is a specific subset of the general RFC6381 one. The note in this section could rather say: "the restriction on the + and | signs is required to enable the use of these signs in the codecs parameter, as defined in the MIME type registration". We could add a note in the MIME type saying: "Note that in this MIME type registration the first element (as defined in RFC6381) of the codecs parameter may contain a combination of 4CC and the + and | signs".

@css-meeting-bot
Member

The Timed Text Working Group just discussed The codecs parameter should have a formal definition of the use of the combination operators. w3c/tt-profile-registry#71, and agreed to the following:

  • SUMMARY: @nigelmegitt to draft a pull request matching the above discussion
The full IRC log of that discussion:
<nigel> Topic: The codecs parameter should have a formal definition of the use of the combination operators. #71
<nigel> github: https://github.com//issues/71
<nigel> Nigel: [summarises issue]
<nigel> .. I think the semantics are clear but we should check in if we agree!
<nigel> .. The next point is to check where any new text has to go, in the registration text or
<nigel> .. elsewhere in the document.
<nigel> Mike: codecs is defined by DASH rfc6381 so having a formally defined parameter for TTML
<nigel> .. is going to confuse people. We need a note that this isn't the codecs you think it is, or
<nigel> .. something like that.
<nigel> Nigel: I half remember discussing this before - we're defining a parameter for the MIME
<nigel> .. type, but the DASH codecs is not part of a MIME type.
<nigel> Mike: Just a note for now, I will dig around more to see if we need any text about this.
<nigel> Nigel: Checking in on the semantics,
<nigel> .. this is about signalling processor requirements
<nigel> .. and the + operator means "both things on each side of the operator" are required.
<nigel> .. and the | operator means "either thing is acceptable"
<nigel> .. Then, when both + and | are used, the + has higher precedence,
<nigel> .. so A|B+C|D means any of "A, something that supports B and C, or D"
<nigel> .. and that's it.
<nigel> .. Do we have agreement on that being the intention?
<nigel> group: [no dissent from this]
<nigel> Nigel: OK I think we're agreed on that.
<nigel> Mike: I have a clarification on rfc6381
<mike_> RFC6381 defines MIME type parameters "codecs" and "profiles" for ISO BMFF-wrapped content
<nigel> Mike: I think it's okay because we're in the application/ttml+xml space but we are going
<nigel> .. to need a note that this is for the sidecar native filetype not ISOBMFF.
<nigel> Nigel: Good point, please could you raise the issue on the repo?
<nigel> Mike: Sure
<nigel> Nigel: I think that's orthogonal to defining codecs
<nigel> Mike: Agreed
<nigel> Nigel: The next question is where we put the text.
<nigel> .. I'm not sure.
<nigel> Mike: It's a parameter so it needs to go in the registration text. We're modifying the
<nigel> .. string and the semantics of the MIME type parameter.
<nigel> Nigel: We're not actually modifying it but we are explaining it better.
<nigel> Mike: If you put the text in the registration part then it needs to go to IANA, and if you don't
<nigel> .. then that's weird.
<nigel> Mike: Is this for TTML1 or TTML2 or both?
<nigel> Nigel: It's both, we moved the registration text into the profile registry.
<nigel> Mike: I recall now.
<nigel> Nigel: That's really useful guidance. The last question I have is if we think we must
<nigel> .. define these operators completely with reference to TTML2 profile semantics or if
<nigel> .. we can do it only partially, e.g. in relation to the any() or all() for | and + but with the
<nigel> .. combination only here in the profile registry.
<nigel> Mike: It seems okay to point to the spec for the definition.
<nigel> Nigel: I don't want to change TTML2 at this stage.
<nigel> Mike: It should really be the same.
<nigel> Nigel: I think there's enough wriggle room there.
<nigel> Mike: It'd be good if they were the same.
<nigel> Nigel: This is enough clarity for me to try to make progress. Any other questions or thoughts?
<nigel> q?
<nigel> group: [nothing more]
<nigel> SUMMARY: @nigelmegitt to draft a pull request matching the above discussion
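
The precedence agreed in the log above (`+` binds more tightly than `|`, so `A|B+C|D` means any of A, B and C together, or D) can be sketched in Python by splitting on `|` first and on `+` second. The function name `parse_codecs` is hypothetical, and the sketch assumes the value is already syntactically valid.

```python
def parse_codecs(value: str) -> list[list[str]]:
    """Split a codecs value into OR-alternatives, each a list of AND-ed codes.

    Splitting on '|' first and '+' second gives '+' the higher precedence.
    """
    return [group.split("+") for group in value.split("|")]


print(parse_codecs("A|B+C|D"))  # → [['A'], ['B', 'C'], ['D']]
```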

@nigelmegitt nigelmegitt removed the agenda Items for discussion in the next meeting label Aug 21, 2020
@nigelmegitt
Contributor

In beginning to prepare a pull request for this I am puzzled that the issue has been raised. The following text is already present in a normative section:

Additionally, applications using the entries in the registry are encouraged to adopt the following combination syntax:

Employ two combination operators, '+' (AND) and '|' (OR), which may be used to specify, respectively, that multiple processor profiles apply (simultaneously) or that any processor profile of a list of profiles may apply individually. If both operators are used in a codecs value, then the '+' operator has precedence.

The example: "A+B|C+D|E" states that a TTML processor that implements any one of A+B or C+D or E processor profiles satisfies, at first order, the requirements to fetch and begin decode/processing of a TTML document, where X+Y means that both X and Y processor profiles must be supported, and X|Y means that either X or Y processor profile must be supported.

This seems to be exactly what this issue asks for, unless the request is simply to write out the syntax for this in a more formal seeming style, without changing the content. For example:

mediatype: "application/ttml+xml" [charset]? [profile]? [codecs]?

charset: [XML encoding]

profile: [...]

codecs: codec-group [ "|" codec-group ]*

codec-group: 4cc [ "+" 4cc ]*

4cc: 4 character code defined by this specification
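
The quoted example "A+B|C+D|E" and the grammar sketch above can be read as "any group, all members". A minimal Python sketch of that reading follows. The function name `processor_satisfies` and the idea of passing the supported profiles as a set are assumptions made for illustration, not part of the registry text.

```python
def processor_satisfies(codecs: str, supported: set[str]) -> bool:
    """Return True if the supported profiles satisfy the codecs value.

    Each '|'-separated alternative is a '+'-joined group; a group is satisfied
    only when every profile in it is supported (any-of-all semantics).
    """
    return any(
        all(profile in supported for profile in group.split("+"))
        for group in codecs.split("|")
    )


print(processor_satisfies("A+B|C+D|E", {"A", "B"}))  # A+B is fully supported
print(processor_satisfies("A+B|C+D|E", {"A", "C"}))  # no group fully supported
print(processor_satisfies("A+B|C+D|E", {"E"}))       # E alone suffices
```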

@mikedo

mikedo commented Aug 25, 2020

No harm in a formal ABNF, but that's not the issue that I tried to raise.

This profiles document states:

An identifier must conform with the element non-terminal of [RFC6381] and, furthermore, may not contain any of the characters in the regular expression character class [+|].

the codecs parameter ... is a syntactic subset of the definition of the codecs parameter defined by [RFC6381].

The syntax defined in this profile document does not directly have anything to do with RFC 6381, except that the charset is constrained to that of "element", which among other things forbids ".". The codecs parameter defined here is for the media type "application/ttml+xml". So, the use of RFC 6381 is a little misleading.

The codecs parameter of media types used in ISO BMFF (specifically media type application/mp4) was created by RFC 6381 and extended by other media specifications such as 14496-30. The latter, following RFC 6381, states that the codecs parameter for a TTML document carried in ISO BMFF has the syntax "stpp.ttml" appended with an optional profile syntax citing this W3C document (including the combinatorics operators). For example, the codecs parameter for an application/mp4 file that contains a TTML document that conforms to both IMSC1.0.1 text profile and EBU-TT-D-1 is "stpp.ttml.im1t|etd1".

It would be helpful to have some of the language above in this profiles document to clarify that this codecs parameter is not the same as the RFC 6381 codecs parameter.
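
To make the distinction concrete, here is a hedged Python sketch that pulls the TTML profile expression out of an application/mp4 codecs string of the form shown above. The function name `ttml_profiles_from_mp4_codecs` is hypothetical, and the "." separator before the profile expression is inferred only from the single example "stpp.ttml.im1t|etd1"; ISO/IEC 14496-30 is the authority on the actual syntax.

```python
from typing import Optional

MP4_TTML_PREFIX = "stpp.ttml"


def ttml_profiles_from_mp4_codecs(value: str) -> Optional[str]:
    """Return the profile expression from an 'stpp.ttml[.profiles]' string.

    Returns "" when no profile suffix is present, and None when the value
    does not start with the 'stpp.ttml' prefix at all.
    """
    if value == MP4_TTML_PREFIX:
        return ""
    if value.startswith(MP4_TTML_PREFIX + "."):
        return value[len(MP4_TTML_PREFIX) + 1:]
    return None


print(ttml_profiles_from_mp4_codecs("stpp.ttml.im1t|etd1"))  # → im1t|etd1
```

The returned expression, here `im1t|etd1`, is what the application/ttml+xml codecs parameter would carry on its own.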

@nigelmegitt
Contributor

It would be helpful to have some of the language above in this profiles document to clarify that this codecs parameter is not the same as the RFC 6381 codecs parameter.

@mikedo is that not what you raised as #76? I think this issue is about the formal definition of the use of the | and + operators.

@nigelmegitt
Contributor

In particular, the proposal in bullet point 2 of #63 (comment) is a good text.

My understanding is that this issue is directly a request for a BNF definition of the syntax of the codecs parameter value. @cconcolato @mikedo it'd be good to have your confirmation that this would indeed resolve the issue, before a pull request gets raised.

@css-meeting-bot
Member

The Timed Text Working Group just discussed The codecs parameter should have a formal definition of the use of the combination operators. w3c/tt-profile-registry#71, and agreed to the following:

  • SUMMARY: @nigelmegitt to take a pass at creating a PR to do the above
The full IRC log of that discussion:
<nigel> Topic: The codecs parameter should have a formal definition of the use of the combination operators. #71
<nigel> Nigel: I was puzzled about this because I couldn't see what needs to be done.
<nigel> Mike: I agree with your assessment.
<nigel> .. [describes history of the issues]
<nigel> .. The post I made the other day was more appropriate to #76 so I'll repost it there.
<nigel> .. Hopefully we can bring closure to both of these.
<nigel> Nigel: In terms of #71, would a BNF satisfy this?
<nigel> Mike: I didn't open this issue but I agree it would.
<nigel> Nigel: And Cyril you seemed to suggest that is what would be needed too.
<nigel> Cyril: The BNF is certainly needed because it clarifies what characters can be used.
<nigel> .. But also the wording is confusing to me. What does it mean to say that applications are
<nigel> .. "encouraged to adopt the following syntax"?
<nigel> Mike: These are intertwined a little bit.
<nigel> .. TTML1 cites RFC6381's element item, which is in the middle of the ABNF for @codecs,
<nigel> .. and it constrains the W3C TTML profile character set to that item.
<nigel> .. The good news is it is any TOKEN except ., and now + and | of course.
<nigel> .. It turns out as always that Glenn's writing is precise but sometimes obtuse. Everything
<nigel> .. is okay in there. The character set is crisply defined by RFC6381 even though this has
<nigel> .. nothing actually to do with RFC6381. Does that make sense?
<nigel> Nigel: That's my understanding as well.
<nigel> github: https://github.com//issues/71
<nigel> Cyril: [trying to digest]
<nigel> Mike: We've intertwined application/ttml+xml and application/mp4 codecs parameters.
<nigel> .. The first one is just this profile code thing. The second one follows 14496-30 with the
<nigel> .. stpp.ttml.[thing we define here] though we don't mention it anywhere.
<nigel> Nigel: I think the "encouraged to adopt" language is there because in a sense we can't
<nigel> .. force anyone to adopt this.
<nigel> .. My memory bells are ringing about a discussion we had about this.
<nigel> Mike: Decoders today are not | and + aware, certainly not in the ISOBMFF universe, and
<nigel> .. I don't know that anyone cares about the application/ttml+xml media type, but maybe
<nigel> .. if a sidecar is delivered over the web it would matter.
<nigel> q+
<nigel> Cyril: We should define the syntax, but the encouragement to adopt is secondary.
<nigel> Nigel: That makes sense.
<nigel> Mike: Agreed.
<nigel> ack at
<nigel> Nigel: I think this syntax may be used at least in part in the DVB TTML specification, I would
<nigel> .. need to check to confirm.
<nigel> s/I don't know that anyone cares about/I don't know who uses
<nigel> Nigel: To conclude then, I think this has turned the request into:
<nigel> .. 1. Define the syntax of codecs
<nigel> .. 2. Separate out the "encouragement to adopt" language.
<nigel> Mike: I would offer that it would be helpful to add a note of the form:
<nigel> .. "For codecs parameter for MP4 please see ISO14496-30".
<nigel> Nigel: Is that the resolution to #76?
<nigel> Mike: Not quite. To separate them, ISOBMFF folk will get confused by the identically named
<nigel> .. parameter. I'm suggesting we should add a note about that to tell people where to
<nigel> .. go to understand about stpp.ttml. So:
<nigel> .. 3. Add an informative note.
<nigel> Nigel: Okay, any more on this one?
<nigel> Mike: Then I think we're done on this.
<nigel> Cyril: I agree
<nigel> SUMMARY: @nigelmegitt to take a pass at creating a PR to do the above

nigelmegitt added a commit that referenced this issue Sep 2, 2020
Also separate out the encouragement to support from the syntax definition and explanation as per discussion at #71.
nigelmegitt added a commit that referenced this issue Sep 4, 2020
Also separate out the encouragement to support from the syntax definition and explanation as per discussion at #71.
nigelmegitt added a commit that referenced this issue Sep 16, 2020
Close #71 by:

* Add syntax of codecs parameter
* separate out the encouragement to support from the syntax definition and explanation as per discussion at #71.
* Add note about MP4 codecs parameter
* Prohibit `"."` from `element`
* Add concrete example note