Dead Letter Notes
Notes on data pipelines, message queues, and the failure modes in between.

Schema drift: catching breaking changes before your consumers do

Schema drift is the slow divergence between the shape of the data a producer emits and the shape a consumer expects. Nobody decides to break the contract; it happens one innocent field at a time. Someone renames user_id to customer_id, makes a required field nullable "just for now", or changes a string to an int. Each change is small. Together they are why your dashboard is showing zeros.

The three flavours, worst last

Validate at the boundary, fail loudly

The single most useful thing you can do is assert the schema at ingestion and reject what doesn't match, rather than letting bad data flow downstream where it's ten times harder to trace. A schema registry with a compatibility policy (backward-compatible changes only) turns most breaking changes into a failed producer deploy instead of a silent data incident. That's exactly where you want the pain: on the person making the change, at the time they make it.
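As a rough illustration of rejecting at the boundary, here is a minimal sketch in plain Python with no registry involved. The schema, the field names (user_id, amount, coupon), and the helpers check_record and ingest are all hypothetical, chosen only to mirror the examples in this post; a real pipeline would more likely lean on a schema registry or a validation library.

```python
from typing import Any

# Hypothetical expected schema: field name -> (expected type, required?)
EXPECTED_SCHEMA: dict[str, tuple[type, bool]] = {
    "user_id": (str, True),
    "amount": (int, True),
    "coupon": (str, False),
}


class SchemaViolation(ValueError):
    """Raised when a record does not match the expected schema."""


def check_record(record: dict[str, Any]) -> list[str]:
    """Return human-readable violations; an empty list means the record is valid."""
    problems = []
    for field, (expected_type, required) in EXPECTED_SCHEMA.items():
        if field not in record or record[field] is None:
            if required:
                problems.append(f"missing required field: {field}")
            continue
        value = record[field]
        # Exact type match, so a bool is not silently accepted as an int.
        if type(value) is not expected_type:
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    # Unknown fields are often the first visible sign of a rename upstream.
    for field in sorted(record.keys() - EXPECTED_SCHEMA.keys()):
        problems.append(f"unexpected field: {field}")
    return problems


def ingest(record: dict[str, Any]) -> dict[str, Any]:
    """Accept a valid record, or fail loudly with every violation listed."""
    problems = check_record(record)
    if problems:
        raise SchemaViolation("; ".join(problems))
    return record
```

Calling ingest({"customer_id": "u1", "amount": "5"}) raises SchemaViolation naming all three problems at once (the missing user_id, the string amount, and the unexpected customer_id), which is the kind of error message you want attached to the offending deploy.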

Contract tests beat hope

Where a registry isn't available, a small contract test in CI goes a long way: keep a golden sample of the upstream payload, validate it against the schema your pipeline assumes, and fail the build when they diverge. It's crude, but it moves discovery from "three weeks later in a Slack thread titled why is revenue down" to "the pull request is red."
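A contract test along these lines might look like the sketch below. Everything named here is an assumption for illustration: the golden sample is inlined as a string to keep the example self-contained (in practice it would be a file checked into the repo), and ASSUMED_FIELDS stands in for whatever schema your pipeline actually relies on. The test function uses a bare assert, so it runs under pytest or any similar runner.

```python
import json

# Hypothetical golden sample; in practice this would live in a file in the
# repo and be refreshed from a real upstream payload.
GOLDEN_SAMPLE = '{"user_id": "u-123", "amount": 1999, "currency": "GBP"}'

# The fields and types this pipeline assumes (also hypothetical).
ASSUMED_FIELDS = {"user_id": str, "amount": int, "currency": str}


def contract_violations(payload: dict, assumed: dict) -> list[str]:
    """List every way the payload diverges from the assumed fields and types."""
    violations = []
    for name, expected in assumed.items():
        if name not in payload:
            violations.append(f"{name}: missing from upstream payload")
        elif type(payload[name]) is not expected:
            violations.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(payload[name]).__name__}"
            )
    return violations


def test_golden_sample_matches_assumed_schema():
    """Fails the build when the golden sample and the assumed schema diverge."""
    payload = json.loads(GOLDEN_SAMPLE)
    violations = contract_violations(payload, ASSUMED_FIELDS)
    assert not violations, "contract drift: " + "; ".join(violations)
```

The test is only as good as the sample's freshness, so it helps to refresh the golden payload from the real upstream on a schedule rather than leaving it to go stale.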

For semantic drift, watch the data, not the schema

Since types can't catch a units change, add cheap distribution checks: a sudden 100× shift in the mean of amount should trip an alert even though the schema is unchanged. Anomaly checks on a handful of key metrics catch the drift that validation never will.
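One cheap way to sketch such a check is to compare the mean of a recent batch against a baseline window and flag large ratios in either direction. The function names and the default threshold of 10× are assumptions for illustration, not a recommendation; a production check would probably also look at medians, null rates, or quantiles.

```python
from statistics import fmean


def mean_shift_ratio(baseline: list[float], current: list[float]) -> float:
    """Ratio of the current batch's mean to the baseline mean."""
    base_mean = fmean(baseline)
    if base_mean == 0:
        raise ValueError("baseline mean is zero; ratio is undefined")
    return fmean(current) / base_mean


def mean_drifted(
    baseline: list[float], current: list[float], max_ratio: float = 10.0
) -> bool:
    """True if the mean grew or shrank by more than max_ratio (hypothetical default)."""
    ratio = abs(mean_shift_ratio(baseline, current))
    # Check both directions: pounds -> pence inflates, pence -> pounds deflates.
    return ratio > max_ratio or ratio < 1 / max_ratio


# Same three orders, but upstream switched amount from pounds to pence.
baseline_amounts = [19.99, 5.00, 42.50]
current_amounts = [1999, 500, 4250]
should_alert = mean_drifted(baseline_amounts, current_amounts)
```

Here the schema check would pass, since amount is still numeric, but the ratio of means is roughly 100, so should_alert is True.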

Tags: data-quality, schema, testing