Root temporal extent issues #11

Open
fcartegnie opened this issue Jan 12, 2017 · 7 comments

Comments

@fcartegnie commented Jan 12, 2017

Seems to me, looking at the overlapping examples (ex. 2), that there's some issue with time bases.

Since the time base is set to media, the Root Temporal Extent is defined by the instance of the document/body, which here is the sample presentation time.
Timings then can't be identical for the same subtitle across different document instances sent at different times.

That's also what I find in unified-streaming's DASH TTML samples:
http://demo.unified-streaming.com/video/elephantsdream/elephantsdream.ism/QualityLevels(1000)/Fragments(textstream_deu=4200000000)

@nigelmegitt (Collaborator)

I disagree - in my view unified-streaming's DASH TTML samples are incorrect. The spec says that the media timebase is equivalent to the track time, rather than that a media time of zero is equivalent to the sample presentation time.
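
To make the two readings concrete, here is a small illustrative sketch (not taken from either spec; the 2 s begin value is made up, and the 4200000000 fragment offset comes from the unified-streaming URL above, in the 10 MHz Smooth Streaming timescale):

```python
# Illustrative only: how a decoder would map a TTML begin value to a
# presentation time under the two competing readings discussed here.
# The 4200000000 fragment offset comes from the unified-streaming URL above
# (10 MHz timescale, i.e. 420 s); the 2 s begin value is an assumption.

TIMESCALE = 10_000_000                               # ticks per second
fragment_offset_s = 4_200_000_000 / TIMESCALE        # sample/fragment pts = 420 s
ttml_begin_s = 2.0                                   # begin="00:00:02.000" in the sample

# Track-relative reading (ISO 14496-30): begin already lies on the track timeline.
presented_track_relative = ttml_begin_s              # 2 s

# Sample-relative reading (as in the USP samples): begin counts from the
# sample's presentation time, so the decoder must add the fragment offset.
presented_sample_relative = fragment_offset_s + ttml_begin_s   # 422 s

print(presented_track_relative, presented_sample_relative)
```

Whichever reading a decoder assumes, content authored under the other one ends up shifted by the whole fragment offset.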

@fcartegnie (Author)

6.2.11 ttp:timeBase
"may be the content of a Document Instance itself in a case where the timed text content is intended to establish an independent time line."

If a full document with multiple timings exists in a sample, that can match the "independent time line".

Root Temporal Extent
The temporal extent (interval) defined by the temporal beginning and ending of a Document Instance in relationship with some external application or presentation context.

Resolving paragraph/span timings can then be relative to the document instance's body timings, to the document instance as sample pts, or to the media time itself.

L Streaming TTML Content (Non-Normative)
is the only part in favor of just "slicing" the document, but it is not normative...

No surprise that everyone does it their own way. Decoder issues ahead.

@nigelmegitt (Collaborator)

Those are TTML1 references - you also need to check ISO 14496-30 for the definitive interpretation of TTML timings in the context of ISO BMFF/MP4.

The goal of this (TTML in MP4 DASH) document is precisely to clarify this kind of issue and minimise decoder problems and misunderstandings.

@rbouqueau (Owner)

Hi François, happy new year!

> I disagree - in my view unified-streaming's DASH TTML samples are incorrect.

I agree. I noticed that too, and it's probably because the USP internal format seems to be based on Smooth Streaming (where the sample TTML timings are relative to the beginning of the segment).

That being said, dash.js sometimes requires the absolute time to be relative to the DASH MPD availability start time (instead of using the media timeline). I haven't found the pattern yet, so I haven't reported it.

@fcartegnie The exact reason I wanted to write this document was to have a single interpretation. So your input and discussion are warmly welcome :)

Also note that MPEG-4 Part 30 takes precedence over TTML for the timeline. And MPEG-4 Part 30 states (section 5.3):

> The top-level internal timing values in the timed text samples based on TTML express times on the track presentation timeline – that is, the track media time as optionally modified by the edit list. For example, the begin and end attributes of the element, if used, are relative to the start of the track, not relative to the start of the sample.
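
For illustration only, a rough sketch (not a reference implementation; it assumes plain hh:mm:ss.fff clock-time expressions and ignores ttp:tickRate, offset-time syntax and frame-based times) of what a remuxer would have to do to rebase sample-relative timings, such as the Smooth-style samples above, onto the track timeline this rule requires:

```python
# Rough sketch: shift every begin/end attribute in a sample-relative TTML
# document by the sample's presentation time, so the resulting times lie on
# the track timeline as ISO 14496-30 section 5.3 requires.
# Assumes plain "HH:MM:SS.mmm" clock-time expressions only.
import xml.etree.ElementTree as ET

def parse_clock_time(expr: str) -> float:
    h, m, s = expr.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)

def format_clock_time(seconds: float) -> str:
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{int(h):02d}:{int(m):02d}:{s:06.3f}"

def rebase_sample(ttml_xml: str, sample_pts_s: float) -> str:
    """Return the document with begin/end rebased onto the track timeline."""
    root = ET.fromstring(ttml_xml)
    for elem in root.iter():
        for attr in ("begin", "end"):
            if attr in elem.attrib:
                shifted = parse_clock_time(elem.attrib[attr]) + sample_pts_s
                elem.set(attr, format_clock_time(shifted))
    return ET.tostring(root, encoding="unicode")
```

A real tool would also have to cope with the other TTML time expression syntaxes and with the document's ttp:timeBase/ttp:tickRate settings.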

@fcartegnie (Author)

Defining the time reference through the container instead of tagging the document itself seems pretty awkward :/

The major issue I see with an absolute timebase reference for TTML samples, in a context where content can be reused/remuxed and/or passed to a container-agnostic decoder, is that you cannot edit the data without editing the samples. The USP Smooth->DASH case is the best example.

@nigelmegitt (Collaborator)

Actually both the document and the container impact how the time references are understood, and I don't think that's avoidable.

There are interesting issues whichever way the timebase reference is arranged, but at this stage there is a specification and it is clear, so the most helpful thing for interoperability would be for everyone to use it.

Bear in mind that unlike the case of audio or video, where there is a linear series of contiguous samples of known size that can be concatenated or split, text-based subtitle and caption formats include time expressions within them. Any kind of transformation, resampling, reuse etc. that affects the timing will require the time expressions in the document to be modified, regardless of the basis of the time expressions (e.g. consider resampling to join multiple TTML samples together or split them apart). I'm fairly confident that all object-based encoding schemes have this "feature".

@nigelmegitt (Collaborator)

Slight correction to the above: resampling to join or split TTML samples currently does not require time expressions to be modified, since they are constant relative to the track; other transformations may, and if the time expressions were relative to the sample then those modifications would certainly require time expression reprocessing.
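
A small sketch of why (event times and sample boundaries below are made up, expressed in seconds): with track-relative times, splitting or joining only changes which events each sample carries, while the time values themselves are copied untouched; a sample-relative scheme would additionally have to offset every value.

```python
# Illustrative only: splitting track-relative subtitle events into samples.
# Because the times are constant relative to the track, each (begin, end)
# pair is copied into its sample(s) unchanged; only the grouping changes.
events = [(0.0, 3.0, "first cue"), (2.5, 6.0, "second cue"), (9.0, 12.0, "third cue")]
sample_bounds = [(0.0, 5.0), (5.0, 10.0), (10.0, 15.0)]

samples = []
for start, end in sample_bounds:
    # An event belongs to every sample whose interval it overlaps.
    samples.append([ev for ev in events if ev[0] < end and ev[1] > start])

# Joining samples back is just the union of their events, again with no
# time expression rewriting.
joined = sorted({ev for sample in samples for ev in sample})
```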
