Nobody tells you about the -27 LKFS problem until it’s too late.
You’ve just finished dubbing an entire season. The performances are solid. The director signed off. You export the stems, run a quick sanity check, and ship the deliverables to Netflix’s QC house. Two weeks later, the rejection notice lands in your inbox.
Integrated loudness: -31.4 LKFS. Tolerance: ±1 LU from -27. That’s a full re-mix.
The thing about streaming media dubbing standards is that they’re not suggestions, and they’re not approximate. Netflix, Amazon Prime Video, Disney+, and Apple TV+ all publish detailed audio specifications, and their QC processes are automated to a degree that would have seemed absurd a decade ago. A file either passes or it doesn’t. There’s no “close enough” column in their QC reports.
The Numbers That Actually Matter
Let’s start with the fundamentals, because this is where most rejections happen. I’ll mostly use the term LUFS because that’s what most audio engineers are comfortable with. Functionally, LUFS and LKFS describe the same measurement—they’re just different names from different standards bodies (EBU vs. ATSC). Some platform spec sheets use one, some use the other. Don’t let that confuse you.
Integrated loudness is the big one. It’s the average perceived loudness of your entire program, and every major platform has a target number:
• Netflix wants -27 LUFS, with a ±1 LU tolerance window. That’s tight. You’re working between -26 and -28, and you need to be there consistently across every episode.
• Amazon, Disney+, and Apple TV+ all target -24 LUFS, but with slightly different tolerance windows—±2 LU for Amazon and Disney+, ±1 LU for Apple TV+.
Here’s the part that catches people off guard: Netflix’s target is three LU quieter than the others. Three LU is perceptible. It’s not a subtle difference. If you mix a dub to Netflix specs and then try to deliver the same files to Amazon, they’re going to sound noticeably quieter than everything else on the platform. Conversely, if you mix to Amazon’s -24 LUFS spec and send it to Netflix, you’re three LU over their ceiling.
You cannot use the same mix for both. Budget separate mixes for each platform, or at minimum, budget for the gain adjustment and re-measurement cycle if you’re trying to get close enough with a single mix.
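A minimal sketch of what a tolerance check looks like in practice. The target and tolerance numbers are the ones quoted above; treat the dictionary as illustrative, not as a substitute for each platform’s current spec sheet. Note the caveat baked into `gain_to_target`: clean linear gain shifts integrated loudness 1:1, but only if nothing nonlinear follows it.

```python
# Per-platform loudness tolerance check. Numbers are the targets quoted
# in this article -- always verify against the platform's live spec sheet.

PLATFORM_SPECS = {
    "netflix": {"target_lufs": -27.0, "tolerance_lu": 1.0},
    "amazon":  {"target_lufs": -24.0, "tolerance_lu": 2.0},
    "disney":  {"target_lufs": -24.0, "tolerance_lu": 2.0},
    "apple":   {"target_lufs": -24.0, "tolerance_lu": 1.0},
}

def passes_loudness(measured_lufs: float, platform: str) -> bool:
    spec = PLATFORM_SPECS[platform]
    return abs(measured_lufs - spec["target_lufs"]) <= spec["tolerance_lu"]

def gain_to_target(measured_lufs: float, platform: str) -> float:
    """dB of clean gain needed to land on target. Linear gain shifts
    integrated loudness 1:1 -- but only if no limiter or other nonlinear
    stage follows, which is why you re-measure after final processing."""
    return PLATFORM_SPECS[platform]["target_lufs"] - measured_lufs

# A mix measured at -24.2 LUFS passes Amazon but fails Netflix:
print(passes_loudness(-24.2, "amazon"))        # True
print(passes_loudness(-24.2, "netflix"))       # False: 2.8 LU above the ceiling
print(round(gain_to_target(-24.2, "netflix"), 1))  # -2.8 dB of attenuation needed
```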
True peak (measured in dBTP) accounts for inter-sample peaks—the signal level between your digital samples that doesn’t show up on a standard sample peak meter. Most platforms cap it at -2 dBTP. Apple TV+ is stricter: -1 dBTP. This matters more than people think, especially with dialogue. Sibilance—those sharp ‘s’ and ‘t’ sounds—can generate inter-sample peaks that sail right past -2 dBTP even when your sample peak meter reads -3 or lower. A true peak limiter set about 0.5 to 1 dB below your target ceiling is cheap insurance. Use it.
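You can see the sample-peak blind spot with a toy example. This is not a BS.1770 true-peak meter (those use 4× oversampling through a polyphase filter); it just constructs a full-scale sine whose samples all land between the waveform’s actual peaks, so a sample-peak meter under-reads by 3 dB.

```python
import math

fs = 48000
f = fs / 4           # tone at one quarter of the sample rate
phase = math.pi / 4  # samples land halfway between the waveform's peaks

samples = [math.sin(2 * math.pi * f * n / fs + phase) for n in range(480)]

sample_peak = max(abs(s) for s in samples)   # what a sample-peak meter sees
sample_peak_db = 20 * math.log10(sample_peak)

print(round(sample_peak, 4))     # 0.7071 -- every sample hits sin(45 deg)
print(round(sample_peak_db, 2))  # -3.01 dBFS, yet the true peak is 0 dBTP
```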
Dialogue-to-background ratio. Most platforms now require at least 20 dB of separation between dialogue and everything else. This isn’t about artistic preference—it’s about intelligibility on playback devices that most viewers actually use. A viewer watching on an iPad with built-in speakers in a moderately noisy room will lose dialogue the moment the background music creeps up.
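A rough way to sanity-check separation between stems. RMS in dBFS is a crude stand-in here—real QC tools measure loudness-weighted levels, not raw RMS—but the arithmetic of the check is the same: dialogue level minus background level, compared against 20 dB.

```python
import math

def rms_db(samples):
    """RMS level in dBFS -- a crude stand-in for the loudness-weighted
    measurements real QC tools use; illustration only."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

# Toy stems: dialogue at 0.5 peak amplitude, music bed at 0.02
dialogue = [0.5 * math.sin(0.01 * n) for n in range(48000)]
music    = [0.02 * math.sin(0.013 * n) for n in range(48000)]

separation = rms_db(dialogue) - rms_db(music)
print(separation >= 20.0)  # True -- roughly 28 dB for these toy stems
```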
Lip-sync. The standard across all four platforms is ±40 milliseconds—roughly one frame at 24 fps. That sounds generous until you realize that human perception can detect lip-sync drift at around 45ms for audio-early and 125ms for audio-late. Netflix in particular reviews lip-sync frame-by-frame.
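The millisecond-to-frame conversion is worth having at your fingertips, since editors think in frames and the spec is written in milliseconds:

```python
def ms_to_frames(ms: float, fps: float) -> float:
    """Convert a sync offset in milliseconds to frames at a given rate."""
    return ms * fps / 1000.0

print(round(ms_to_frames(40, 24), 2))  # 0.96 -- the 40 ms window is just under one frame at 24 fps
print(round(ms_to_frames(40, 25), 2))  # 1.0 -- exactly one frame at 25 fps
print(round(1000 / 24, 1))             # 41.7 -- milliseconds per frame at 24 fps
```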
What Each Platform Actually Requires
Netflix
Netflix publishes their specs openly, and their QC is probably the most rigorous in the industry. Beyond the -27 LKFS target:
Independent dialogue normalization. You don’t just normalize the full mix to -27 and call it done. The dialogue stem has to hit -27 LKFS on its own, and your music and effects need to sit under that with at least 20 dB of headroom below the dialogue. A lot of studios used to mixing for broadcast or theatrical release find this step unfamiliar.
Room tone consistency. Every scene change needs clean, consistent room tone. If your room tone jumps 2 dB between dialogue edits within the same scene, Netflix QC will flag it. When you’re stitching together ADR recordings from different sessions, this is genuinely hard to maintain.
File delivery. WAV (BWF) or IMF. Subtitles in TTML. Video reference in ProRes 422 HQ or DNxHR HQ. File naming conventions are also specific and worth getting right.
Amazon Prime Video
Amazon follows ATSC A/85 and targets -24 LKFS with ±2 LU tolerance. That wider tolerance window can be misleading—Amazon applies dialogue gating in their QC, measuring loudness only during dialogue segments and ignoring silence. If your overall mix hits -24 LKFS but your actual dialogue passages average -22 because quiet sections pull the number down, you’ll pass the integrated measurement and fail the dialogue-gated measurement.
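The gating idea can be sketched numerically. This is a heavy simplification—real measurement uses BS.1770 K-weighting and 400 ms gating blocks—but it shows how quiet non-dialogue passages can pull the ungated number into tolerance while the dialogue-gated figure sits above it. The block values below are made up for illustration.

```python
import math

def mean_loudness(blocks):
    """Energy-average a list of short-term loudness values (LUFS).
    Simplified: no K-weighting, no BS.1770 gating blocks."""
    energies = [10 ** (l / 10) for l in blocks]
    return 10 * math.log10(sum(energies) / len(energies))

# Loud dialogue passages interleaved with quiet transitions:
blocks = [(-22.0, True)] * 6 + [(-35.0, False)] * 4

overall = mean_loudness([l for l, _ in blocks])
gated   = mean_loudness([l for l, is_dialogue in blocks if is_dialogue])

print(round(overall, 1))  # -24.1 -- quiet blocks pull the ungated number into tolerance
print(round(gated, 1))    # -22.0 -- the dialogue-gated figure that fails Amazon's check
```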
Stereo downmix. Amazon requires that 5.1 mixes produce acceptable stereo downmixes. Surround panning decisions that cause phasing or level shifts when folded to stereo will be flagged.
Disney+
Disney+ uses -24 LKFS with ±2 LU tolerance, largely consistent with Amazon. Where Disney diverges is creative oversight: for franchise content—Marvel, Star Wars, Pixar, Disney Animation—voice casting approval and dubbing director supervision are required for all recording sessions, so add creative review time to your schedule.
Apple TV+
Apple targets -24 LUFS with ±1 LU tolerance and the strictest true peak limit in the industry at -1 dBTP. They also recommend keeping LRA at or below 15 LU—tighter than the 20 LU other platforms use.
What does that mean practically? Less dynamic range. Your quiet scenes can’t be as quiet and your loud scenes can’t be as loud. You’ll need more compression and limiting, and that has artistic implications—particularly for thrillers, horror, and war dramas.
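EBU Tech 3342 defines loudness range as the spread between the 10th and 95th percentiles of the short-term loudness distribution. The sketch below implements that percentile spread on a made-up list of short-term values, omitting the gating step the real descriptor applies, so the numbers are illustrative only.

```python
# Simplified LRA sketch: spread between the 10th and 95th percentiles of
# short-term loudness (EBU Tech 3342, minus its gating step).

def loudness_range(short_term_lufs):
    values = sorted(short_term_lufs)

    def percentile(p):
        idx = p * (len(values) - 1)       # linear-interpolated percentile
        lo = int(idx)
        frac = idx - lo
        hi = min(lo + 1, len(values) - 1)
        return values[lo] * (1 - frac) + values[hi] * frac

    return percentile(0.95) - percentile(0.10)

# A dynamic mix: whispers near -40 LUFS, action peaks near -16 LUFS
dynamic = [-40, -38, -30, -27, -24, -22, -20, -18, -16, -16]
print(round(loudness_range(dynamic), 1))  # 22.2 LU -- well past Apple's 15 LU ceiling
```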
Platform Comparison
| Spec | Netflix | Amazon | Disney+ | Apple TV+ |
|---|---|---|---|---|
| Loudness | -27 LUFS | -24 LUFS | -24 LUFS | -24 LUFS |
| Tolerance | ±1 LU | ±2 LU | ±2 LU | ±1 LU |
| True peak | -2 dBTP | -2 dBTP | -2 dBTP | -1 dBTP |
| LRA | ≤ 20 LU | N/A | — | ≤ 15 LU |
| Lip-sync | ±40 ms | ±40 ms | ±40 ms | ±40 ms |
| Dialogue-BG | > 20 dB | > 20 dB | > 20 dB | > 20 dB |
The Rejection Patterns Nobody Warns You About
Integrated loudness out of tolerance. Far and away the most common reason. Usually it’s not that the mixer didn’t know the spec—it’s that the final limiting or processing stage shifted the loudness enough to push it outside the window. Always re-measure after your final processing chain. Not before. After.
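Why does the final stage shift the number? Because limiting is nonlinear: it removes energy from the loudest passages, so the measurement you took before the limiter no longer describes the file you deliver. A toy brickwall limiter on a sine makes the point—the function names and RMS-based measurement here are illustrative simplifications, not a real loudness meter.

```python
import math

def rms_db(x):
    """RMS level in dBFS (crude proxy for loudness; illustration only)."""
    return 20 * math.log10(math.sqrt(sum(s * s for s in x) / len(x)))

def hard_limit(x, ceiling=0.5):
    """Toy brickwall limiter: clamps anything above the ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in x]

signal = [0.9 * math.sin(0.01 * n) for n in range(48000)]

before = rms_db(signal)
after  = rms_db(hard_limit(signal))
print(round(before - after, 1))  # ~3.3 dB of level change from the limiter alone
```

A pre-limiter measurement of this signal would be off by more than any platform’s entire tolerance window, which is exactly why the re-measure has to come last.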
True peak overshoots from sibilance. Dialogue contains a lot of high-frequency energy in sibilant consonants, and those generate inter-sample peaks that standard meters won’t show. True peak meters on individual dialogue stems help catch these.
Lip-sync drift from frame rate conversion. If your source is 23.976 fps and delivery is 25 fps, the conversion introduces timing shift that compounds across duration. A 90-minute film drifts by several seconds between start and end. Automated tools can detect drift, but correction still requires manual intervention.
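As a rough model (assuming the video was conformed to a new rate but the audio was never resampled to match), sync error grows linearly with running time. Even the subtle 0.1% mismatch between 23.976 and 24.000 fps adds up over a feature:

```python
def drift_seconds(duration_min: float, src_fps: float, dst_fps: float) -> float:
    """Accumulated sync error if video is conformed to dst_fps while the
    audio keeps playing at its original speed. Simplified linear model."""
    speed_ratio = dst_fps / src_fps
    return duration_min * 60 * abs(1 - 1 / speed_ratio)

# The classic 0.1% pulldown mismatch (23.976 vs 24.000 fps):
print(round(drift_seconds(90, 23.976, 24.0), 1))  # 5.4 seconds by the end of a 90-minute film
```

A full 23.976-to-25 conform with unadjusted audio is far worse (about 4% speed difference), which is why that path always needs the audio retimed, not just restriped.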
Room tone inconsistency between ADR sessions. Session A might have a -62 dB noise floor while Session B has -58 dB. That 4 dB difference is audible in quiet passages. Clean room tone in every session and level matching in post is tedious but necessary.
Character voice inconsistency across episodes. If your voice actor sounds different in Episode 1 versus Episode 8, platforms with character-level QC tracking will flag it. Reference recordings from the first successful session and playback comparison during subsequent sessions are the main prevention approach.
A Practical Pre-Delivery Checklist
• Integrated loudness re-measured after final processing chain (not before)
• True peak verified with inter-sample peak meter on both full mix and dialogue stem
• Dialogue-to-background separation verified at 20 dB minimum
• Lip-sync spot-checked at dialogue onset across multiple scenes
• Room tone level checked for consistency within and between scenes
• Character voice reference compared across episodes
• No clicks, pops, or edit artifacts (headphone check, not speaker check)
• Correct sample rate (48 kHz) and bit depth (24-bit) in file metadata
• Channel configuration matches deliverable spec (5.1 or stereo)
• Stereo downmix checked for phasing and level issues
• File naming and folder structure match platform delivery requirements
• Subtitle timing verified against dubbed audio, not original language audio
There’s a temptation to treat technical specs as something you can fix in post—mix first, measure, adjust, re-deliver. That approach works when you have time and budget to spare. In practice, most dubbing projects don’t have either. The studios that deliver clean on first pass are the ones that build the platform specs into their workflow from day one.
Artlangs Translation delivers multilingual dubbing for streaming platforms across 230+ languages, mixed to platform-specific loudness and quality specifications for Netflix, Amazon Prime Video, Disney+, and Apple TV+. Services include voice casting, dubbing direction, lip-sync recording, dialogue mixing with independent dialogue normalization, room tone management, and technical QC verification before delivery. Files are delivered in platform-compliant formats (WAV/BWF, IMF) with correct metadata and naming conventions. Combined with video localization, subtitle adaptation in TTML/SRT/WebVTT, game localization, short drama script translation, multilingual audiobook dubbing, and multilingual data annotation and transcription, Artlangs handles the full spectrum of media localization—from script to screen to spec.
