-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CEA608 Missing Whitespaces #3906
Comments
I don't have full grasp of the 608 spec, so please correct me if I am wrong. As I understand it, our implementation only supports Caption Mode (Text mode is unsupported).
I agree this is the case in Text mode. But I am not convinced that
won't break other captions that assume Caption mode. It would be great to receive some insight from someone experienced in 608. |
After having met this bug again and again, when live channels seems to miss a whitespace around single words in italic, I have checked the 608 spec multiple times. Unfortunately, I have only found implicit signs that the Mid-Row command also requires the whitespace to be added at the current location. Sentences quoted above and:
All these suggest that All Mid-Row codes have a "spacing attribute", but it is never explicitly described. But the legal document titled "Closed caption decoder requirements for analog television receivers" I linked above clearly says multiple times:
This is repeated for all captioning modes available in 608.
So, if we can accept the validity of the linked legal document - as a clarification of the older CEA608 standard created on July 1, 1993, we probably need to add that whitespace. The CEA608 Wiki also links a form of the legal document: |
Should be fixed at the same time as #4321. |
Note: this is also fixed in this pull request: |
Apple's 16x9 test stream doesn't seem to assume that midrow codes insert spaces. The stream includes captions that are fully italic. If we add space when we see a midrow code we end up starting each of these captions with a space, which I very much doubt is intended. There are no backspaces to undo the spaces, as far as I can see. Thoughts? |
Looks like dash.js inserts a space (Dash-Industry-Forum/dash.js#1906). Android's ClosedCaptionRenderer does too. Does this imply the Apple test stream is incorrect in not following the midrow codes with backspaces, or is there something else I'm missing? |
That is usually achieved by a Preamble code, not a mid-row code. You need to look for a "Sentence with a single word in italic." But please, check the links I added above. Legal requirements for CEA608 say that:
When did Apple follow any standards for the last time? in 1982? ;) |
Yes, I've looked at those already, thanks. I'm pretty sure inserting the space is the right thing to do. This does imply the Apple test stream is incorrect, as far as I can tell. They seem to be using mid-row code rather than preamble code. If you have a chance it would be good if you could take a quick look at that stream to see if you agree, and/or whether we have some other bug. The unwanted space is inserted with all of the fixes in #4308, so if there is some other bug then it's not fixed as part of that change. Note that it's trivial to see the unwanted space simply by replacing it for another character :). |
My plan for this is to merge a one line change (equivalent to 5a6bb78#diff-83f8c19437ea277fb176fdd4055095baR765), so that we can close this issue ahead of fully merging all of that commit (which will take more work). |
Issue: #3906 ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=200526335
Issue: #3906 ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=200526335
I see lots of content having captions seemingly missing space characters around italic words (when Mid-Row command is used):
"He saidHELLOimmediately. (where the word HELLO is italic in the caption)
The CEA608 Standard seems to be not too clear about handling of Mid-Row Commands. Some parts mention adding a white space when processing a code, but it is quite ambiguous:
Part 7.4 Real-Time Scrolling Display:
This seems to describe Text Mode only.
Part B.12 Backspacing:
This suggests that the Mid-Row code occupies the space for a character on the screen.
If we can rely on https://www.law.cornell.edu/cfr/text/47/79.101
So all in all, this function probably needs to add a whitespace to the text to handle Mid-Row Codes appropriately:
ExoPlayer/library/core/src/main/java/com/google/android/exoplayer2/text/cea/Cea608Decoder.java
Line 374 in 2b20780
The text was updated successfully, but these errors were encountered: