-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write down how extra data is saved for usage with AVC & HEVC #373
Comments
I'm all for including this in the standard exactly the way MakeMKV handles it. Not only for Dolby Vision, but also for 3D MVC stuff. It's a simple & practical solution. The CodecPrivate extension you're talking about looks like this:
That way of storing the extensions was the consensus of a mailing list thread on the Matroska-devel list from years ago. mkvmerge doesn't support MVC yet (yeah I know), though I've looked into it over the last couple of weeks. Support will probably come within the next couple of weeks (though not yet in the upcoming release). |
Trying to force a standard by implementing before discussing is not really the way it should be done. That said, AFAIK Dolby Vision data in (2) is "just" private NAL so Matroska would not really see them in the steam itself, not sure that there is something to put in the standard here. I don't have Blu-ray with Dolby Vision for testing and it seems that MakeMKV does not support standalone TS or MP4 files (I have only that with Dolby Vision), could you provide a sample file? Just 1 frame would be enough. |
I doubt there was any intention to go around the standard - i simply think he was unaware of the IETF efforts to continue the specifications. So it's more a desire to help his community to be able to maintain data for future use. Will see if i can get someone to throw a sample as soon as possible |
@steffenmanden I don't succeed to convert your ISO with "MakeMKV", 1.15.1, it complains that the m2ts file has length of 60 seconds which is less than minimum title length of 120 seconds, therefore I can not get the corresponding MKV. Would you mind to share the corresponding MKV? |
MakeMKV options -> second tab "Video" -> write "1" seconds -> then MakeMKV doesn't complain. |
What he said :-) default settings ignores the super short sequences, so you have to overwrite it |
Please write an extension document, using the mechanisms in matroska and ebml. |
I confirm that it is as described by @mbunkus (for Dolby Vision: I suggest to change the issue title to a more generic title. |
Hello everyone. Firstly, I was truly unaware of this standardization effort - thanks for keeping me in loop. Also, there was nothing close to "I did it my way, now make it a standard" attitude on my part - the forum post linked at top specifically says that officially MakeMKV does not support DV in MKV yet - the announcement was omitted from release notes on purpose. I rather wanted to create an interim version for "those who know", just to get the wheels moving, and then start a discussion in matroska-devel. Good thing that we have this place now. Here is my input: Some history - originally, the need to store AVC stream extension in MKV arose years ago when MVC (3D) streams appeared. At that time there was an ad-hoc process to standardize extensions for CodecPrivate that was later discussed on matroska-devel and became the de-facto standard. Moritz described it above - a generic way to keep multiple MP4 stream description atoms in MKV CodecPrivate data. At that time it was defined only for AVC and only for a single additional atom (mvcC). With DV is looks a natural way to extend this method for HEVC as well, and make it somewhat formal. Given this is a major change, I took the liberty to use this opportunity to change the storage scheme a bit - the full description is below, mostly a copy-paste from the Moritz post. CodecPrivate starts with the normal avcC/hvcC codec configuration boxes as found in the corresponding standards wrt. storing AVC/HEVC in MP4 files. As noted earlier, while the size of the first configuration box is not stored explixitly, it is rather trivial to compute - both avcC and hvcC are a fixed-size data structures followed by a single variable-sized part, where the size of the variable part is stored at fixed offset. The size of entire CodecPrivate data is stored as part of EBML encoding. If the first (avcC/hvcC) structure ends before the end of CodecPrivate, then the rest of the CodecPrivate buffer contains zero or more MP4 atoms, in exactly the same way they are stored in MP4: A four-byte, Big Endian length field INCLUDING the size field itself and the following fourcc tag and data (this is deviation I took - please read the rationale below). A four-byte ID field (e.g. mvcC for 3D MVC video data in 3D Blu-rays with AVC; the IDs are the same as in the xyz-in-MP4 the standards) The content of that extension One more formal rule - The parser must correctly handle invalid data at the end of CodecPrivate. Specifically:
Otherwise the parser should assume that the data is a valid mp4 atom. Now, the rationale behind my proposal to change the meaning of the size field - to include the header in size as opposed to "old" definition where length field specifies the size of the following fourcc tag and data, excluding the field itself. The current storage method is sort of inconsistent - half of the header size is included in length, and length of the other half is stored implicitly. More importantly, it looks very similar similar but differs from a corresponding mp4 implementation. Why it is safe to do now - the only usage of this extension so far is mvcC box (MVC video from 3D blu-ray) - this is just an avcC box for stereoscopic extension. I've modified that interim MakeMKV release to write length field in a new way and include few zeroes at the end of mvcC atom. This way either interpretation of the length field will be correct - the old code would expect 4 more bytes at the end of mvcC atom, would read these inserted zeroes and ignore them, as mvcC ends with a variable-sized structure whose length is explicitly specified. The new code would just read the atom properly. I think that keeping the structures layout identical to corresponding mp4 standard is a big deal, and this change is justified. Thanks! |
Thanks a lot for your input, Mike. I'm undecided on the length field semantics change. I do see your point about compatibility with MP4 atoms. However, there are two arguments against it:
So to be honest I'd rather keep the currently used semantics of the size field not including its own length of four bytes. However, my most important wish is that the same method is used for both H.264 and H.265 content; maybe even as the default methods for future cases where CodecPrivate needs to be extended for other codecs. |
I unfortunately stop to implement the parsing on my side during my tests and didn't catch this deviation.
But now we have an even more complex spec, which needs to describe a readout of the ID field, check which "format" (old or new), based on the ID field and by guessing with 0 padding (and if the mvcC box has 0x00000000 at the end, how to differentiate?), and apply the right "rule" (old or new). So in practice it may be worse in both the spec and in the implementation, because "old" behavior is already in the wild for a long time so we can not just connect a MP4 parser. IMO advantages of the "MP4 style" method are not enough for trying to handle both "old" and "new" methods, and keeping the "old" method for a generic manner for both H.264 and H.264 is more important.
Issue is that the release is in the wild so files will be with it as users are starting to use it, before someone from Matroska is aware and can say a word. It is good that the issue was raised quickly. (Moritz answer appeared while I was writing, looks like we have a similar comment :) ). |
We should not do any of that, this is ugly. If we go the new way, the parser should assume that length fileld includes total size, period. The only thing that gets broken - old MVC files with new parsers. I'm under assumption that no 3D player or utility actually uses these data, so nobody would notice the missing mvcC atom, should it be stripped due to mismatched len.
Ok, let me clarify - the change of "length" field semantics happened only in this latest version of MakeMKV (and with zero-padding compatibility hack, so old parsers still would consume files created by 1.5.1 without error). I have no issues at all switching it back, if we decide that this is a proper way to go. |
Just wanted to follow up here, as i have a deep desire to get a good solution going for this! A title change was also desired, what should i change it to ? |
@steffenmanden you're right to ask for a follow up. I understand that from people involved in Matroska standardization there is a preference for keeping the "old" method (element size not including the size bytes) as it was a consensus in mailing list thread on the Matroska-devel list from years ago, we could create more confusion by changing from AVC MVC style to a new style for Dolby Vision and/or other fields. This way, a parser could be neutral (not checking the block identifier and/or format for knowing which "size format" to use). So it would means:
|
Confirming. |
@rcombs had some valid objections in PR #377 and raised the question if we shouldn't just use a couple of new Matroska elements for extensions.
Yes indeed. Parsing the end of However, if we're designing a general method of extending CodecPrivate, we cannot rely on this always being easy to do. For other codecs it might be much harder to determine where the regular CodecPrivate content ends and where extensions start.
Existing tools wouldn't be able to mux Matroska-with-Dolby-Vision to MP4 no matter how we store the additional data in Matroska. They all have to be adjusted. So this isn't really a good argument in my view. Be that as it may. It would definitely be more fitting for Matroska to expose each extension as Matroska elements instead of bit-packing it into the existing CodecPrivate. Wouldn't have to be complicated either:
with MP4's relevant extensions atoms would easily be mappable to such a structure (and back). [1] Yes, the |
I guess that this would also leave it more open for Mike's "new" way of storing stuff, instead of the "old" way? |
A few additional thoughts:
Yeah, it's not too bad, just a little awkward. Nowhere near as bad as, say, AAC, where the structures are much more complex and rarely have easy length codes.
Right, just, I'd rather have Matroska-with-Dolby-Vision get remuxed to something fairly benign (an MP4 with one or more video streams that might contain Dolby Vision data, but that are otherwise completely standard), rather than something that isn't standard MP4 and might break other things… or worse, become relied upon if some player decides to handle this kind of private data at a higher level instead of within the demuxer, which could end up accidentally creating a requirement for MP4 demuxers as well. Another couple concerns about the current design:
|
Mike's "new" and "old ways" are about different ways of storing additional data in Existing MP4 atoms such as @MikeChenMM any comments on my proposal? |
Yeah OK, I see your point.
That's true and even worse than what I thought about. So… I'm all for storing extensions in Matroska structures instead of appending them to |
The current extension is the result of extending the method used for MVC. So issues like a fail due to I wrote the patch proposal only as a documentation about what is/was used, but I am not specifically for this method. Myself I add to change a bit my demuxer because I was considering I see an advantage to this past method: a MKV remuxer not aware of the extension will smoothling remux the MVC/Dolby stuff. But maybe this is not the most important.
This is relevant, we can more or less easily handle AVC and HEVC, but this may be not future proof in case of complex CodecPrivate. So maybe that instead of keeping this direction we could change our mind here before it becomes too much spread and be more generic as in @mbunkus proposal. I am fine with proposal of @mbunkus new Matroska elements + documenting the potential existence of |
It's been pointed out to me that there's one more flaw with the current method: it assumes that the |
|
I also think not adding data to the current I don't think we need new elements though. We already have the Ideally the Dolby Vision NAL would be stripped from the rest of the NALs and put in a But maybe splitting NALs isn't a good idea or makes it more complex to reconstruct it for playback so it could (also?) be inside the regular That means we can signal the Dolby Vision info in the Track and optionally strip the Dolby Vision NALs is the |
We're talking about data that must be available in the track headers, Steve. Why would we divert radically from how HEVC is stored in MP4 all of a sudden? I don't see a real advantage here. |
OK, new proposal based on the existing elements I hadn't been aware of. It's incomplete, especially wrt. to the ISO references. AVC/h.264New way
Existing, deprecated wayA deprecated way of storing the Demuxers MAY support this way of storing data. Multiplexers SHOULD NOT create such files anymore. Multiplexers SHOULD convert such structures into the "new way" when re-writing such files. HEVC/h.265New way
Fotonotes[1] I don't really understand the difference of |
I understand |
Ah, so I would have to say…
That being said, Man, that's somewhat cryptic to describe (and to understand). Or we could simply change the specs, make |
We'll have to change the specs anyway in order to make it clear that |
Technically the interpretation is still tied to the codec (originally the complement data in Musepack to get the lossless version). Each codec should tell how to interpret one of these ID (you can't just say MP3 has a complement data and hope decoders would know how to use it). So IMO that's not an issue. |
Why can't we fix |
EDIT: Oh, I've read the topic more thoroughly. I think we should not create yet another ID for the content type and should reuse MP4 IDs and layout. I'm all in favor for an additional "codecprivateextension" element - be it a new EBML id or a reuse of BlockAdditionMapping in that sort of hacky way - doesn't matter. But we should certainly keep the MP4 atom fourcc ID as is. The type=1 id=anything with fourcc and size in content is a good method of achieving this goal. |
I don't like to use the whole atom structure as the size field itself is not only completely redundant, it would also pose the potential for more buffer overrun errors. And what happens when the size fields don't agree (meaning if the size from the embedded MP4 atom doesn't match the size of the Matroska element)? We'd have to spec something for that, parsers would have to handle that. I would be OK with |
Yes, you are right about the size. This would be the best way IMHO. |
So is this the way to go ? If so, i hope some of you would have time to do a write-up, as i simply don't have the insight into the container specs to do it properly :) |
This prepares the element specs for the storage of elements such as the mvcC extension for AVC video or the dvcC and hvcE extensions for HEVC video. It does so by * changing the verbiage to make BlockAdditionMapping not only referring to BlockAdditions in tracks, * making BlockAddIDValue non-mandatory which would indicate that the BlockAdditionMapping is actually an extension to CodecPrivate and * includes a first registered value for BlockAddIDType which signals that BlockAddIDExtraData contains a type & content like similar header atoms in ISO Base Media File Format files. See #373 for the discussion how to store things. Will affect PR #377 which will have to be adjusted to make use of these elements.
PR created for the element specs. No, I'm not really happy with the wording yet; reviews & helpful suggestions are more than welcome. If its accepted, Jérôme can adjust his PR #377 to make use of the elements. |
@mbunkus I have written a PR based on your proposal, but actually I don't agree with mixing the 4-byte identifier with the content, and IMO it leads to more confusion especially if we try to avoid potential misunderstandings with WebM.
@robUx4 being tied to the codec is not always the case, we should consider to have additions not tied to the codec. As we are clarifying the usage of My counter-proposal is:
Proposal is on #390. |
Thanks! I like both of your PRs and see advantages in either method. Pros for #389:
Pros for #390:
I'm fine with either, I guess, though I slightly prefer #390.
I don't like such an element. Why should a muxer dictate how a remuxer should work? How should a muxer know in advance if a user wants to keep the lossless extensions upon remuxing or not? I don't see a use case for/the advantage of such an option. Let's not make Matroska more complex than it already is. I'll add more comments about specific wordings in the respective PRs. |
Me too, so if others (@robUx4?) are fine we could focus on this one only.
This is an hint about what to do during a transcoding without having the (incomplete) list of formats used, some additional data is useless if there is a transcoding (for example Musepack lossless addition is useless if the format of the main data is not Musepack) and some additional data is still useful because it is not related to the format of the main data (for example captions, any other ancillary data from MXF, RAWcooked content), and a transcoder could decide to discard/keep additional data based on this hint.
It is an hint only, very optional, and IMO more useful than |
Ah, I see, thanks for the explanation. Maybe a separate PR solely for that topic would indeed be better. |
seems people are getting to agree that #390 is the way to go! hope we can look into finalizing it - a lot of people waiting for it if you agree on this path :-) |
There is a lot of expectation around the revision spec from several important projects like; Plex, Emby, Kodi, etc, they absolutely trust on the matroska standard as the most important part to start working on give support to the new files. |
I'm not sure what you're trying to say there. It comes across as being impatient and making sure to let us know that IMPORTANT PEOPLE are waiting for us, the slowpokes, and that we should finally get going and what are we waiting for. Please abstain from such posts, they're really not helpful; instead they're making me a bit resentful (just talking about myself here). We're pretty much all volunteers here and work on best effort basis. A couple days more or less until specs are finalized won't make much difference if any to all of the projects you've listed. We're happy to create specs for this particular thing. We like to have current content be storable in Matroska. We do work in that direction. That work includes peer review, and that simply takes time. It pretty much also means that the end result will be better and more consistent than had we rushed the job. Patience, please. |
…I'm in charge of media-core engineering at Plex. You won't see any pressure to rush from us. |
Just take your time and do it properly, then we will all be happy :) I only push a reminder from time to time to continue the chat ;) - however i know it takes time, so we will get there when the time is right! |
@mbunkus You are getting me absolutely wrong, and I don't know where I'm saying you are slowpokes. I will not let you know which important people is waiting simply because it was just an expression, and both we know is not like that, the expectation is not on the devs but in a small part of the people following the DV thing on those projects. But if you are feeling a bit resentful I can understand it and, in fact, assume my unintentional responsability of making you feel that way. I know you are volunteers and I also perfectly understand the spirit behind that volunteering (I myself collaborated in some important open source project in the past). Having said that I hope you accept my apologies, I guess I need to calm down a bit with the whole thing of DV, you are specially right on the fact no one of those comments actually helps with the issue, even this one is just noise, I will take care that it doesn't happen again. PD: The lockdown didn't help with my "anxiety" of software updates either |
@doctorsirius Thanks for saying all that. I'm fine. I tend to be thin-skinned in general, and I've had people feel entitled too often in the past, both of which makes me interpret comments more negatively than they were intended. So… sorry from my side, too. |
I merged the corresponding PR, so @MikeChenMM please keep us posted of the implementation in your software, I would like to test my own software on the files created by it in order to be sure we are in sync. |
Hi,
The creator of MakeMKV have made a custom solution for saving Dolby Vision data within a mkv file.
https://makemkv.com/forum/viewtopic.php?f=12&t=21937
Is it possible to look into how we can make this a part of the standard, so we going forward can support Dolby Vision in matroska?
The creator says:
"Dolby Vision is not a separate codec, but rather a (HEVC) stream extension - DV data (EL+RPU) are encoded in separate NAL units, according to Dolby spec.
The presence of DV data can be checked by several different methods - most obvious are:
(1) DV descriptor in codecprivate data (there is an "standard" of storing additional elements in codec private for AVC and HEVC streams, that was used for ages since 3D MVC video) or
(2)presence of DV RPU unit in first frame."
Hope this can be looked into and approved so the solution can become a part of the standard instead of a custom solution
The text was updated successfully, but these errors were encountered: