-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CEA-708 Decoder (#2648) #2807
CEA-708 Decoder (#2648) #2807
Conversation
removed useless logs Added color support fixed up background attribute logic trim new lines on row ends and styling fix logic on clearing rollup captions when moving window hotfixes on decoder and tests fixed logic and added support for all charsets minor syntax fix renaming constants improve commenting, remove redundant comments move important hex into constants broke hex values in unit tests into constants comment fixing constant suffixes cleaned impl for streams/channel
clarification in comments
made colors array more intuitive
lib/cea/cea_decoder.js
Outdated
const stableComparator = (ccPacket1, ccPacket2) => { | ||
const diff = ccPacket1.pts - ccPacket2.pts; | ||
const isEqual = diff === 0; | ||
return isEqual ? ccPacket1.order - ccPacket2.order : diff; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could condense down to:
const stableComparator = (p1, p2) => (p1.pts - p2.pts) || (p1.order - p2.order);
For me, the compactness makes it easy for me to see at a glance what it's doing (falling back from one value to another). But I'm fine with you leaving it as-is if you feel the long form is more readable.
shaka-player.uncompiled.js
Outdated
goog.require('shaka.cea.CeaDecoder'); | ||
goog.require('shaka.cea.Cea608DataChannel'); | ||
goog.require('shaka.cea.Cea608Memory'); | ||
goog.require('shaka.cea.Cea708Window'); | ||
goog.require('shaka.cea.Mp4CeaParser'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The classes in this list should only be the ones which are not goog.require
d anywhere else. If they are required by another class in this list, they don't need to be listed here.
So the only things that appear in this list are plugins which register themselves and top-level classes used by applications directly.
You may be missing some goog.require() statements in lib/cea/, but I think everything in this namespace is depended on directly or indirectly by ClosedCaptionParser.
// will set a text stream if it isn't already set. Consequently, reversing | ||
// the order of these calls makes two languages display simultaneously | ||
// if captions are turned off -> on in a different language. | ||
this.player.selectTextLanguage(track.language, track.roles[0]); | ||
this.player.selectTextTrack(track); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went back through the history of this file to see how/when this came to be, and I am certain now that it was a mistake. Thanks for fixing it!
lib/cea/cea708_window.js
Outdated
const topLevelCue = new shaka.text.Cue( | ||
this.startTime_, endTime, /* payload= */ ''); | ||
|
||
if (this.justification_ === 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the enum here, please.
lib/cea/cea708_service.js
Outdated
* @property {!number} type | ||
* Type of the byte. Either 2 or 3, DTVCC Packet Data or a DTVCC Packet Start. | ||
* @property {!number} byte The byte containing data relevant to the packet. | ||
* @property {(!number)} order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: remove these extra parens
lib/cea/dtvcc_packet_builder.js
Outdated
// Builds packets based on Figure 5 CCP State Table in 5.2 of CEA-708-E. | ||
// Initially, there is no packet. When a DTVCC_PACKET_START payload is received, | ||
// a packet begins construction. The packet is considered parsed once all bytes | ||
// indicated in the header are read, and ignored if a new packet is started | ||
// before the current packet being processed is parsed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having trouble with your terminology here. Let's discuss in our scheduled meeting today and see if we can work it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much!
lib/cea/cea708_service.js
Outdated
controlCode = (controlCode << 16) | extendedControlCodeBlock.value; | ||
} | ||
|
||
// Control codes are in 1 of 4 groups: CL, CR, GL, GR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having a hard time connecting this comment with the handlers below, which seem to be C0-C4 and G0-G4. None of them ends with L or R.
lib/cea/cea708_service.js
Outdated
* @private | ||
*/ | ||
handleC2_(dtvccPacket, controlCode) { | ||
// There are currently no commands on the C2 table as of CEA-708-E, but if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this code is read years from now, it would be helpful to have a rough date instead of "currently", so the reader knows if it's worth double-checking for changes in specs. I tend to say things like "as of September 2020..."
lib/cea/cea708_service.js
Outdated
* @private | ||
*/ | ||
handleC3_(dtvccPacket, controlCode) { | ||
// There are currently no commands on the C3 table as of CEA-708-E, but if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
} | ||
|
||
/** | ||
* Yields each non-null window specified in the 8-bit bitmap. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
lib/cea/cea708_service.js
Outdated
*/ | ||
setPenAttributes_(dtvccPacket) { | ||
// Following 2 bytes take the following form: | ||
// b1 = |TXT_TAG|OFS|PSZ| , b2 = |I|U|EDTYP|FNTAG| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I assume that the shortened names are from the spec, they are not self-explanatory names. Can you explain them in comments?
Also, you don't explain the size of those fields in bits, so it's impossible to compare this comment with the bitwise math below.
A diagram would help (I'm just making up these sizes):
// | 7654| 32| 10| |7|6| 543| 210|
// |TXT_TAG|OFS|PSZ| |I|U|EDTYP|FNTAG|
Or maybe just list them with sizes and descriptions:
// TXT_TAG (4 bits): A tag for text, I guess?
// OFS (2 bits): family-owned contract furniture manufacturer (ofs.com)
// PSZ (2 bits): Pez dispenser
// I (1 bit): Italic
// U (1 bit): Underline
// EDTYP (3 bits): Equine Dystopia
// FNTAG (3 bits): Fondue Tag
Maybe also link to https://en.wikipedia.org/wiki/CEA-708#SetPenAttributes_(0x90_+_2_bytes)
:-)
test/cea/cea708_window_unit.js
Outdated
} | ||
const topLevelCue = new shaka.text.Cue(startTime, endTime, ''); | ||
topLevelCue.nestedCues = [ | ||
CeaUtils.createDefaultCue(startTime, endTime, text1+text2), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little uncertain about this. If the pen starts at column 0, writes four characters, then moves to column 6 and writes more, shouldn't there be 2 spaces between them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I didn't even consider that a null cell should be handled as a space, but that makes perfect sense. I've fixed it. 👍
test/cea/cea708_window_unit.js
Outdated
|
||
it('cuts off text that exceeds the column size on a given row', () => { | ||
const text = '0123456789012345678901234567890123'; // this text is 34 chars. | ||
const trimmedText = '01234567890123456789012345678901'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarity, should this maybe be written as text.substr(0, 32)
?
order: 0, | ||
}); | ||
|
||
const dtvccStartByte = 0b00000001; // 1 * 2 -1 = 1 data bytes should follow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous tests used hex notation for this. Why switch to binary?
test/cea/dtvcc_packet_unit.js
Outdated
beforeEach(() => { | ||
|
||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could drop this. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, I should have been more attentive while copy/pasting 😳
lib/cea/cea708_service.js
Outdated
@@ -276,7 +277,7 @@ shaka.cea.Cea708Service = class { | |||
* @private | |||
*/ | |||
handleC3_(dtvccPacket, controlCode) { | |||
// There are currently no commands on the C3 table as of CEA-708-E, but if | |||
// As of the CEA-708-E spec there are no commands on the C2 table, but if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this change to C2 by mistake? The method is still called handleC3_.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woohoo!
Test Failure:
|
This looks like a problem with our build bot. I'm looking into it. |
All tests passed! |
Test Failure:
|
All tests passed! |
Thanks, @muhammadharis, for your contributions! |
This pertains to shaka-project#2648 (although this is a new feature, not a replacement) and shaka-project#1404. A CEA-708 decoder that follows the CEA-708-E standard, decodes closed caption data from User Data Registered by Rec. ITU-T T.35 SEI messages, and returns them as cues in Shaka's internal cue format. Furthermore, this pull request fixes and cements some of the logic surrounding CEA-608 and CEA-708 tag parsing on the Dash Manifest Parser. Format: Similar to the CEA-608 decoder, cues are emitted in Shaka's internal format (lib/text/cue.js). This decoder makes use of nested cues. The top level cue is always a blank cue with no text, and each nested cue inside it contains text, as well as a specific style, or linebreak cues to facilitate line breaks. This also allows for inline style (color, italics, underline) changes. Details: - ASCII (G0), Latin-1 (G1), and CEA-708 specific charsets (G2 and G3) all supported. - Underlines, colors, and Italics supported, set as a property on each nested cue. - Positioning of text is supported. (Exception: In CEA-708 the default positioning is left, in this decoder it is centered.) - Positioning of windows not supported, but relevant fields that could be used to support this are extracted and left as a TODO.
This pertains to shaka-project#2648 (although this is a new feature, not a replacement) and shaka-project#1404. A CEA-708 decoder that follows the CEA-708-E standard, decodes closed caption data from User Data Registered by Rec. ITU-T T.35 SEI messages, and returns them as cues in Shaka's internal cue format. Furthermore, this pull request fixes and cements some of the logic surrounding CEA-608 and CEA-708 tag parsing on the Dash Manifest Parser. Format: Similar to the CEA-608 decoder, cues are emitted in Shaka's internal format (lib/text/cue.js). This decoder makes use of nested cues. The top level cue is always a blank cue with no text, and each nested cue inside it contains text, as well as a specific style, or linebreak cues to facilitate line breaks. This also allows for inline style (color, italics, underline) changes. Details: - ASCII (G0), Latin-1 (G1), and CEA-708 specific charsets (G2 and G3) all supported. - Underlines, colors, and Italics supported, set as a property on each nested cue. - Positioning of text is supported. (Exception: In CEA-708 the default positioning is left, in this decoder it is centered.) - Positioning of windows not supported, but relevant fields that could be used to support this are extracted and left as a TODO.
This pertains to shaka-project#2648 (although this is a new feature, not a replacement) and shaka-project#1404. A CEA-708 decoder that follows the CEA-708-E standard, decodes closed caption data from User Data Registered by Rec. ITU-T T.35 SEI messages, and returns them as cues in Shaka's internal cue format. Furthermore, this pull request fixes and cements some of the logic surrounding CEA-608 and CEA-708 tag parsing on the Dash Manifest Parser. Format: Similar to the CEA-608 decoder, cues are emitted in Shaka's internal format (lib/text/cue.js). This decoder makes use of nested cues. The top level cue is always a blank cue with no text, and each nested cue inside it contains text, as well as a specific style, or linebreak cues to facilitate line breaks. This also allows for inline style (color, italics, underline) changes. Details: - ASCII (G0), Latin-1 (G1), and CEA-708 specific charsets (G2 and G3) all supported. - Underlines, colors, and Italics supported, set as a property on each nested cue. - Positioning of text is supported. (Exception: In CEA-708 the default positioning is left, in this decoder it is centered.) - Positioning of windows not supported, but relevant fields that could be used to support this are extracted and left as a TODO.
This pertains to #2648 (although this is a new feature, not a replacement) and #1404. A CEA-708 decoder that follows the CEA-708-E standard, decodes closed caption data from User Data Registered by Rec. ITU-T T.35 SEI messages, and returns them as cues in Shaka's internal cue format. Furthermore, this pull request fixes and cements some of the logic surrounding CEA-608 and CEA-708 tag parsing on the Dash Manifest Parser.
Format:
Similar to the CEA-608 decoder, cues are emitted in Shaka's internal format (lib/text/cue.js). This decoder makes use of nested cues. The top level cue is always a blank cue with no text, and each nested cue inside it contains text, as well as a specific style, or linebreak cues to facilitate line breaks. This also allows for inline style (color, italics, underline) changes.
Details: