TTML and DFXP — broadcast-style timed text on the web
TTML (Timed Text Markup Language) and the older DFXP label describe XML-based caption documents common in broadcast, OTT, and some LMS stacks. On the open web, plain SRT or WebVTT wins for simple uploads — but you will still see TTML when a client asks for “IMSC” or when a mastering house hands off a sidecar that is richer than a three-line SRT.
What makes TTML different from SRT or VTT
- XML tree. Cues, regions, styles, and metadata live in elements — not sequential blocks of plain text. A missing namespace or wrong profile can make a file invalid for a strict player even if it looks fine in a text editor.
- Timing vocabulary. TTML can express clock time, frame-based time, or tick-based time. If the frame rate in the document does not match the video you marry it to, cues slip — see the guide on timecode, frame rate, and caption sync.
- Styling and layout. TTML can describe fonts, regions, and alignment. Many web players and social uploaders ignore most of that and only ingest plain text cues. Assume styling will be flattened or dropped unless your delivery spec says otherwise.
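The first two points above can be seen in a few lines of Python using only the standard library. The TTML snippet below is illustrative (the cue text and timecodes are made up), but the namespace URIs are the real TTML ones — and the frame-rate mismatch shows exactly how a cue slips:

```python
import xml.etree.ElementTree as ET

# Illustrative TTML document with frame-based timing (HH:MM:SS:FF).
# ttp:frameRate lives in the TTML parameter namespace.
TTML = """<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
    ttp:frameRate="25" xml:lang="en">
  <body><div>
    <p begin="00:00:02:12" end="00:00:05:00">Frame-timed cue</p>
  </div></body>
</tt>"""

NS = {"tt": "http://www.w3.org/ns/ttml"}

def frames_time_to_seconds(tc: str, frame_rate: float) -> float:
    """Convert an HH:MM:SS:FF timecode to seconds at a given frame rate."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return hh * 3600 + mm * 60 + ss + ff / frame_rate

root = ET.fromstring(TTML)
doc_rate = float(root.get("{http://www.w3.org/ns/ttml#parameter}frameRate"))

# Without the namespace map, find() silently returns nothing -- the kind of
# "looks fine in a text editor, fails in a strict player" failure mode.
cue = root.find(".//tt:p", NS)

# The same timecode lands at different moments depending on the rate:
print(frames_time_to_seconds(cue.get("begin"), doc_rate))  # 2.48 at 25 fps
print(frames_time_to_seconds(cue.get("begin"), 24.0))      # 2.5 at 24 fps
```

The 20 ms gap looks harmless on one cue, but frame-count errors compound over a long program, which is why a document rate that disagrees with the video drifts visibly by the end.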
DFXP vs TTML in conversation
DFXP (Distribution Format Exchange Profile) was the original name for what became TTML 1.0, and people still say “DFXP” when they mean an XML sidecar from an older broadcast toolchain. In practice, validate against what your receiver documents — YouTube, TikTok, HTML5, or an LMS — rather than assuming one TTML dialect maps cleanly to another without conversion.
Encoding and tooling
Like other text sidecars, TTML should be saved as UTF-8. If you strip namespaces or pretty-print aggressively, keep a checksum-friendly copy for regression tests — small XML edits can reorder whitespace and confuse diff-based review.
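One way to keep a checksum-friendly copy is to hash a whitespace-normalized serialization instead of the raw bytes, so pretty-printing alone does not change the checksum. This is a minimal sketch using the standard library; a production pipeline might want full XML canonicalization (C14N) instead:

```python
import hashlib
import xml.etree.ElementTree as ET

def content_checksum(ttml_bytes: bytes) -> str:
    """SHA-256 of a whitespace-insensitive serialization of the document.
    Cosmetic reindenting leaves the checksum unchanged; a real content
    edit (text, timing, attributes) changes it."""
    root = ET.fromstring(ttml_bytes)
    # Strip purely cosmetic whitespace between elements.
    for elem in root.iter():
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        if elem.tail is not None and not elem.tail.strip():
            elem.tail = None
    normalized = ET.tostring(root, encoding="utf-8")
    return hashlib.sha256(normalized).hexdigest()

# Same document, compact vs pretty-printed:
compact = (b'<tt xmlns="http://www.w3.org/ns/ttml"><body><div>'
           b'<p begin="0s" end="1s">Hi</p></div></body></tt>')
pretty = b"""<tt xmlns="http://www.w3.org/ns/ttml">
  <body>
    <div>
      <p begin="0s" end="1s">Hi</p>
    </div>
  </body>
</tt>"""

print(content_checksum(compact) == content_checksum(pretty))
```

A diff-based review can then compare checksums before and after a namespace or pretty-print pass and only escalate when the content hash actually moves.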
CaptionPass and TTML
CaptionPass can ingest TTML and emit platform-tuned outputs (for example SRT for short-form or WebVTT for HTML5) using presets documented in the HTTP API. When you need TTML out for an LMS pipeline, use the lms preset so the emitter targets TTML-friendly structure.
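As a rough illustration of what that kind of flattening involves — this is a generic sketch, not CaptionPass's implementation — converting TTML cues to SRT means walking the namespaced `p` elements, discarding regions and styling, and rewriting clock times from TTML's `HH:MM:SS.mmm` to SRT's comma-separated form:

```python
import xml.etree.ElementTree as ET

NS = {"tt": "http://www.w3.org/ns/ttml"}

def ttml_clock_to_srt(tc: str) -> str:
    """Rewrite TTML clock time HH:MM:SS.mmm as SRT's HH:MM:SS,mmm."""
    if "." not in tc:
        tc += ".000"
    whole, frac = tc.split(".")
    return f"{whole},{frac.ljust(3, '0')[:3]}"

def ttml_to_srt(ttml_text: str) -> str:
    """Flatten TTML cues to numbered SRT blocks, dropping styling/regions."""
    root = ET.fromstring(ttml_text)
    blocks = []
    for i, p in enumerate(root.findall(".//tt:p", NS), start=1):
        text = "".join(p.itertext()).strip()  # inline spans collapse to text
        blocks.append(
            f"{i}\n{ttml_clock_to_srt(p.get('begin'))} --> "
            f"{ttml_clock_to_srt(p.get('end'))}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"

# Illustrative input with clock-time cues:
TTML = """<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en"><body><div>
  <p begin="00:00:01.000" end="00:00:03.500">First cue</p>
  <p begin="00:00:04.000" end="00:00:06.000">Second cue</p>
</div></body></tt>"""

print(ttml_to_srt(TTML))
```

The sketch assumes clock-time cues only; a real converter also has to handle frame- and tick-based times, overlapping cues, and region-to-position mapping, which is why leaning on a documented preset beats hand-rolling the conversion.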
Run a TTML file through CaptionPass on the home page to see diagnostics and normalized output side by side.
More guides
- SRT vs VTT — when each format silently fails: Comma vs dot timestamps, WEBVTT headers, and where YouTube, TikTok, and HTML5 bite.
- Caption file encoding — UTF-8, BOM, and garbled text: Why uploads show mojibake or blank cues: UTF-8 vs legacy encodings and quick fixes.
- Burned-in vs soft subtitles — what to deliver when: Open captions burned into the picture vs separate SRT/VTT tracks — tradeoffs for editors and clients.
- Reading speed for captions — CPS, line length, and platforms: Characters per second, lines per cue, and where YouTube, TikTok, and HTML5 push back.
- Why your captions are not showing — a triage guide: HTML5, YouTube, and TikTok checks when subtitles vanish after upload.
- Fix overlapping subtitles: What overlap means and why some players drop overlapping cues.
- CaptionPass JSON IR and the developer-json preset: Lossless-ish cue interchange for tooling: when to use JSON IR, version tag, and how it pairs with the HTTP API.
- Timecode, frame rate, and caption sync: Why captions drift or jump: drop-frame vs non-drop, fractional frame rates, and export settings that survive upload.
- WCAG-minded captions — reading speed, sound tags, and burned-in contrast: How WCAG 1.2.x thinking maps to real files: CPS, line length, SDH-style cues, and contrast for open captions.