The metadata nobody checks until delivery day

You nailed the grade, the mix is pristine, and the deliverable fails QC because a BWF chunk says 48000 and the container says 47952.

You finish the online. The grade is approved. Sound sends the final mix. You render the master, run it through your naming convention checklist, ship it to the distributor. Two days later, a QC report lands in your inbox. The file has been rejected. Not for a bad frame, not for a glitch in the audio, not for anything you can see or hear. The sample rate metadata in the BWF chunk says 48000 and the container header says 47952.

The file sounds fine. It plays fine. But the metadata disagrees with itself, and the automated QC system caught it.

Three layers of truth

Every media file carries metadata in multiple places. These layers were designed by different people, at different times, for different purposes. They don’t always agree.

Container metadata is what the file wrapper declares about its contents. An MOV header, an MXF descriptor, an MP4 box. This is where the container announces the frame rate, sample rate, timecode, colour space, and codec of each track. Playback applications read this layer first.

Codec metadata lives inside the encoded stream itself. ProRes carries its own colour primaries tag. H.264 has VUI parameters for transfer characteristics. DNxHR headers declare their own resolution and bit depth. These values describe the actual encoded data, independent of whatever the container says about it.

Sidecar metadata sits outside the file entirely. An ALE with clip properties. An XML from a camera report. A CSV from the sound department. A .cube LUT that implies a colour space. Sidecar metadata is often the most human-readable and the least trustworthy, because it’s manually created and easily separated from the media it describes.

Metadata layers, in short: the container (MOV/MXF atoms and track descriptions), the codec (encoded frame headers: ProRes, DNxHR, H.264 VUI), and the sidecar (external XML, ALE, CSV, or camera report).

The problem is that different tools in your pipeline trust different layers. Your NLE reads the container. Your colour management system reads the codec tags. Your asset manager reads the sidecar. When those layers disagree, each tool is correct according to its own source of truth, and the file is wrong according to everyone else.

Sample rate: the quiet disagreement

The most common metadata mismatch in audio deliverables is sample rate. A BWF file declares its sample rate in at least two places: the fmt chunk (the standard WAV header) and, by convention, the bext chunk's CodingHistory string (the broadcast extension, which typically records the rate as F=48000). These should always match. They usually do. When they don't, the results depend on which field your tool reads.

Then there’s the container. If that BWF gets wrapped into an MXF for broadcast delivery, the MXF descriptor carries its own sample rate field. Three declarations of the same value, three opportunities for disagreement.
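That cross-check is mechanical enough to script. Here is a minimal sketch in Python that builds a toy BWF in memory whose layers deliberately disagree, then walks the RIFF chunks and compares the fmt chunk's sample rate against the rate recorded in the bext CodingHistory string (assuming the common EBU R98 "F=48000" convention; real files may format that field differently):

```python
import re
import struct

def riff_chunks(data: bytes):
    """Yield (chunk_id, payload) pairs from a RIFF/WAVE byte string."""
    assert data[0:4] == b"RIFF" and data[8:12] == b"WAVE"
    pos = 12
    while pos + 8 <= len(data):
        cid, size = struct.unpack("<4sI", data[pos:pos + 8])
        yield cid, data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are word-aligned

def declared_rates(data: bytes):
    """Return (fmt_rate, bext_rate or None) from the two metadata layers."""
    fmt_rate = bext_rate = None
    for cid, payload in riff_chunks(data):
        if cid == b"fmt ":
            fmt_rate = struct.unpack("<I", payload[4:8])[0]
        elif cid == b"bext":
            # CodingHistory starts at byte 602 of the bext chunk
            # (EBU Tech 3285 fixed fields come first).
            m = re.search(rb"F=(\d+)", payload[602:])
            if m:
                bext_rate = int(m.group(1))
    return fmt_rate, bext_rate

# A toy BWF whose layers disagree: fmt says 48000, bext says 47952.
fmt = struct.pack("<HHIIHH", 1, 2, 48000, 48000 * 2 * 3, 6, 24)
bext = b"\x00" * 602 + b"A=PCM,F=47952,W=24,M=stereo\r\n"
body = (b"WAVE"
        + b"fmt " + struct.pack("<I", len(fmt)) + fmt
        + b"bext" + struct.pack("<I", len(bext)) + bext + b"\x00"  # pad byte
        + b"data" + struct.pack("<I", 0))
wav = b"RIFF" + struct.pack("<I", len(body)) + body

print(declared_rates(wav))  # (48000, 47952): the mismatch QC would flag
```

The same chunk walk extended to the MXF descriptor gives you all three declarations side by side before the QC system sees them.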

The 48000 vs 47952 problem is a specific variant. Some older hardware and certain Avid workflows used a pull-down sample rate of 47952 Hz (48000 divided by 1.001, rounded) to maintain sync with 23.976fps picture. The 0.1% speed difference is imperceptible on its own, but if one layer says 48000 and another says 47952, the QC system flags it as a discrepancy. You end up re-wrapping a file that was sonically perfect, just to make two numbers match.
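The arithmetic behind the pull-down is a single ratio applied to both picture and sound:

```python
from fractions import Fraction

# Pull-down slows picture and sound by the same 1000/1001 ratio,
# so 24 fps becomes 23.976 and 48000 Hz becomes 47952 (after rounding).
PULLDOWN = Fraction(1000, 1001)

print(float(24 * PULLDOWN))     # 23.976023976...
print(float(48000 * PULLDOWN))  # 47952.047952... -> written as 47952

# The speed error if a decoder picks the wrong rate is about 0.1%:
error = 1 - float(PULLDOWN)
print(f"{error:.4%}")           # 0.0999%
```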

Colour tags that lie

Video colour metadata is worse. A ProRes file in a MOV container can declare its colour primaries in at least two places: the colr atom in the MOV track description and the ProRes frame header. If those disagree, what you see depends on which decoder you’re using and which layer it trusts.
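The colr atom itself is small and easy to inspect. A sketch of parsing its nclx payload, which carries three 16-bit code points from ITU-T H.273 (BT.709 is primaries/transfer/matrix 1/1/1; Display P3 primaries are code 12); the byte strings here are hand-built stand-ins, not dumps from a real file:

```python
import struct

def parse_colr(payload: bytes):
    """Parse a QuickTime/ISO-BMFF 'colr' atom payload of subtype
    'nclx' (or the older 'nclc'): a 4-byte colour parameter type,
    then 16-bit primaries, transfer characteristics, and matrix."""
    ctype = payload[0:4]
    primaries, transfer, matrix = struct.unpack(">HHH", payload[4:10])
    return ctype.decode("ascii"), primaries, transfer, matrix

# Rec. 709 tags read as (1, 1, 1); a P3-primaries tag would read 12.
rec709 = b"nclx" + struct.pack(">HHHB", 1, 1, 1, 0x80)
mislabeled = b"nclx" + struct.pack(">HHHB", 12, 1, 1, 0x80)

print(parse_colr(rec709))      # ('nclx', 1, 1, 1)
print(parse_colr(mislabeled))  # ('nclx', 12, 1, 1): container claims P3
```

Comparing this against the codec-level tag in the ProRes frame header is what tells you whether the two layers agree.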

This is how a grade that looked correct in Resolve can look wrong in a broadcaster’s QC suite. Resolve reads the actual encoded data and applies its own colour management. The QC system reads the container tags and expects them to describe reality. If someone rendered Rec. 709 video but the container tag says P3 (because a preset was wrong, or because a transcode tool didn’t update the tag), the pixels are fine but the label is wrong. The QC system rejects it.

The reverse is also common: a file tagged as Rec. 709 that actually contains P3 or Rec. 2020 data. Any colour-managed player that trusts the tag will display it incorrectly. The image looks washed out or oversaturated, and nobody notices until it hits a display that’s actually doing the colour math.

ACES compounds this because the container tag might say Rec. 709 (the output), but the editorial pipeline expects the tag to describe the input colour space. There is no universal convention. Some facilities tag the output, some tag the input, some tag nothing and rely on naming conventions. All three approaches work until the file leaves the building.

Timecode: pick a source, any source

A QuickTime MOV can carry timecode in a dedicated timecode track, in the tmcd sample description, and sometimes in user data atoms. An MXF carries timecode in its header metadata partition, in each content package, and optionally in system items. A camera original might also carry timecode in sidecar XML from the camera manufacturer.

When you conform from an EDL or AAF, the conform tool reads timecode from the source files. Which timecode? The first one it finds, usually. If the timecode track says 01:00:00:00 but the MXF header partition says 00:59:59:00 (because of a pre-roll convention), the conform matches against one of them and ignores the other. You get a one-second offset on every clip, which looks like a global slip and is easy to fix once you spot it. If you spot it.

The worse case is when timecode disagrees at a sub-frame level, across formats. A camera shoots 23.976. The offline edit happens at 23.976. The sound department works at 24fps. Timecode values are numerically identical (01:23:45:12) but represent slightly different real-world times because the frame rates differ. The deliverable passes a timecode check (the numbers match) but the audio is actually 0.1% slow relative to picture. Over a two-hour film, that’s about seven seconds of accumulated drift.
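The "about seven seconds" figure falls straight out of the ratio:

```python
# Identical timecode numbers at different frame rates describe different
# real-world times. Over a two-hour runtime, frames counted at 24 fps
# but played at 23.976 fps (24000/1001) accumulate:
runtime_s = 2 * 60 * 60
frames = runtime_s * 24                # frame count at 24 fps
actual_s = frames / (24000 / 1001)     # same frames played at 23.976
drift = actual_s - runtime_s
print(f"{drift:.2f} s")                # 7.20 s of accumulated drift
```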

Frame rate: the number vs the duration

A file can declare a frame rate of 24fps while every frame actually occupies 1/23.976 of a second. This happens when a 23.976 file gets incorrectly flagged as 24, or when a transcode tool writes the rounded value instead of the precise one. Most playback tools compensate, because they calculate timing from the actual frame count and file duration rather than trusting the metadata. But QC tools read the declared rate and compare it against the measured duration. If those don’t match within tolerance, the file fails.
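A declared-vs-measured check is a one-liner once you have the frame count and duration. A sketch, with the tolerance (0.01% here) chosen as an illustrative figure, not any particular QC vendor's spec:

```python
from fractions import Fraction

def rate_matches(declared: Fraction, frame_count: int, duration_s: Fraction,
                 tolerance: Fraction = Fraction(1, 10000)) -> bool:
    """QC-style check: does the declared rate match frame_count/duration
    within a relative tolerance? 0.01% is an assumed default; it must be
    tighter than the 0.1% gap between 24 and 23.976 to catch that case."""
    measured = Fraction(frame_count) / duration_s
    return abs(measured - declared) / declared <= tolerance

# A true 23.976 file mislabelled as 24 fps fails the check:
frames = 172800
duration = Fraction(frames) / Fraction(24000, 1001)  # real duration at 23.976
print(rate_matches(Fraction(24), frames, duration))  # False

# The precise declared rate passes:
print(rate_matches(Fraction(24000, 1001), frames, duration))  # True
```

Using exact fractions rather than floats matters here: 24000/1001 does not round-trip cleanly through floating point, and a tolerance comparison built on floats can pass files it should fail.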

Drop-frame and non-drop-frame at 29.97 is another metadata trap. The frame rate is always 29.97 (or more precisely, 30000/1001). Drop-frame is a counting convention, not a rate change. But some encoders write the frame rate as 30 when non-drop-frame is selected, because non-drop counts 30 frame numbers per second. A QC system that sees “30fps” will measure the actual frame duration, find it’s 1/29.97, and flag the discrepancy.
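The counting convention is easiest to see in code. The standard drop-frame algorithm skips frame numbers ;00 and ;01 at the top of every minute except multiples of ten; a sketch:

```python
def frames_to_df_timecode(frame: int) -> str:
    """Convert a frame count at 30000/1001 fps to drop-frame timecode.
    Two frame *numbers* (not frames) are skipped each minute, except
    every tenth minute, so the count tracks real clock time."""
    drop = 2
    per_10_min = 17982          # 30*600 - 9*2 frames per ten real minutes
    per_min = 1798              # 30*60 - 2
    d, m = divmod(frame, per_10_min)
    if m > drop:
        frame += 9 * drop * d + drop * ((m - drop) // per_min)
    else:
        frame += 9 * drop * d
    ff = frame % 30
    ss = frame // 30 % 60
    mm = frame // 1800 % 60
    hh = frame // 108000
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"

print(frames_to_df_timecode(1800))   # 00:01:00;02  (00 and 01 skipped)
print(frames_to_df_timecode(17982))  # 00:10:00;00  (tenth minute: no skip)
```

Note that the arithmetic still divides by 30: the frame numbers count in a nominal 30-per-second space even though the frames themselves tick at 29.97, which is exactly the ambiguity that trips up encoders writing "30fps" into the metadata.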

What actually gets checked at mastering

QC systems at major distributors and broadcasters run automated checks against a specification. The spec varies, but these fields are almost always on the list:

Video: Frame rate (declared vs measured). Codec and profile. Resolution. Bit depth. Colour primaries, transfer function, and matrix coefficients. Scan type (progressive vs interlaced). Aspect ratio (both stored and display). Clean aperture. Active picture area.

Audio: Sample rate (declared vs measured, across all layers). Bit depth. Channel count and channel assignment. Loudness (integrated, true peak, dialogue gating). Sample rate consistency between BWF metadata, container header, and actual sample data.

Timecode: Start timecode. Continuity (no breaks). Frame rate consistency with video. Drop-frame flag matching the actual count.

Container: File wrapper compliance (AS-11, IMF, specific MOV profiles). Metadata field population (originator, creation date, unique identifiers). No unexpected tracks or data streams.

The pattern across all of these: the QC system is checking whether the file describes itself correctly. Not whether the content is good. Not whether the edit is right. Whether the metadata across all layers is internally consistent and matches the delivery spec.
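The internal-consistency idea reduces to a diff across layers. A minimal sketch, where the layer readings and field names are hypothetical values standing in for whatever your probing tools actually return:

```python
def cross_layer_mismatches(layers: dict[str, dict]) -> list[str]:
    """Report every field whose value differs between any two metadata
    layers. A layer that omits a field is not treated as disagreeing."""
    fields = {f for layer in layers.values() for f in layer}
    report = []
    for field in sorted(fields):
        seen = {name: layer[field] for name, layer in layers.items()
                if field in layer}
        if len(set(seen.values())) > 1:
            report.append(f"{field}: " + ", ".join(
                f"{name}={value}" for name, value in seen.items()))
    return report

# Hypothetical readings from the three layers of one deliverable:
layers = {
    "container": {"sample_rate": 48000, "frame_rate": "24000/1001"},
    "codec":     {"frame_rate": "24000/1001", "primaries": "bt709"},
    "sidecar":   {"sample_rate": 47952, "primaries": "bt709"},
}
for line in cross_layer_mismatches(layers):
    print(line)  # sample_rate: container=48000, sidecar=47952
```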

The practical checklist

Before you ship a deliverable, check these things. Not by playing the file, by inspecting it.

Read the container metadata with MediaInfo or FFprobe. Check frame rate, sample rate, colour tags, and timecode. Read the codec-level metadata separately (ProRes has tools for this; FFprobe with -show_frames will reveal per-frame flags). Compare the two. If you have BWF audio, inspect the bext chunk and compare its sample rate and timecode against the container.
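FFprobe's JSON output makes this scriptable. The command below is real FFprobe syntax; the embedded sample output is abbreviated and the values are hypothetical, but the field names (codec_type, r_frame_rate, color_primaries, sample_rate) are the ones FFprobe actually emits:

```python
import json

# ffprobe -v error -print_format json -show_streams master.mov
# emits JSON shaped like this (abbreviated, values hypothetical):
ffprobe_output = """{
  "streams": [
    {"codec_type": "video", "r_frame_rate": "24000/1001",
     "color_primaries": "bt709", "color_transfer": "bt709"},
    {"codec_type": "audio", "sample_rate": "48000", "channels": 2}
  ]
}"""

streams = json.loads(ffprobe_output)["streams"]
for s in streams:
    if s["codec_type"] == "video":
        print(s["r_frame_rate"], s["color_primaries"])  # 24000/1001 bt709
    else:
        print(s["sample_rate"], s["channels"])          # 48000 2
```

From there, comparing these container-level values against codec-level flags (FFprobe's -show_frames) and the bext chunk is a dictionary comparison, not a viewing session.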

Check your colour tags against what the content actually is. If you graded in Rec. 709 and the container says BT.2020, that’s a rejection waiting to happen. If you’re delivering HDR and the transfer function tag says “unspecified,” fix it before you ship.

Verify timecode start and continuity. Open the file in something that reads timecode as data, not just displays it. Check that the timecode track matches the header metadata. If the file is MXF, check both the header partition and the footer.

Confirm that the delivery spec matches what you’ve rendered. Not the project settings, the actual output file. Specs change between versions. Someone updates the frame rate requirement and doesn’t tell online. You render to the old spec and the file bounces.

None of this takes long. All of it is faster than re-rendering and re-delivering after a QC rejection, especially when the rejection email arrives at 6pm on a Friday and the network delivery window closes Monday morning.

Metadata is a contract

The frustrating thing about metadata failures is that they’re never about the content. The grade is beautiful. The mix is flawless. The conform is frame-accurate. And the file gets rejected because a tag says the wrong thing.

But the tag isn’t decoration. Metadata is a contract between the file and every system that will touch it downstream. Broadcast playout servers use colour tags to configure their output. Archive systems use sample rates to validate preservation copies. Automated pipelines use frame rate metadata to route files through processing chains. When the metadata lies, those systems make wrong decisions with real consequences.

The metadata nobody checks is the metadata that matters most: not the creative choices, but the technical declarations that tell machines how to handle the file after it leaves your hands.