4 posts tagged with "risk"

The AI Indemnification Gap: When the Model Was Wrong and Nobody's Contract Covers You

· 11 min read
Tian Pan
Software Engineer

A customer's general counsel sends you a one-line email: "When the model invents a fact in our compliance workflow next week, whose insurance is paying?" You forward it to your VP of Engineering, who forwards it to Legal, who forwards it back to you. By the time the chain closes, three people have separately assumed that someone else read the model provider's terms carefully. None of them did. The contracts don't actually connect — and you are the layer in the middle that finds out first.

This is the AI indemnification gap. It exists because every enterprise AI product sits in a three-link liability chain — end customer, your product, model provider — where each link silently assumes the layer underneath is carrying the weight. The model provider's terms cap damages at roughly the last twelve months of fees and explicitly exclude output accuracy. Your MSA inherits those exclusions through a flow-down clause your customer's lawyer didn't read carefully. Your customer's contract with their downstream user — the actual end-of-chain victim when an output goes wrong — names your product as the responsible party with no clear upstream recourse.

The first claim discovers the gap. Until then, everyone in the chain is operating on a hopeful shrug.
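To make the gap concrete, here is a minimal arithmetic sketch. It assumes the upstream cap equals the trailing twelve months of fees and that claims about output accuracy are excluded outright, as described above. The function name, parameters, and dollar figures are hypothetical illustrations, not terms from any real contract.

```python
def uncovered_exposure(
    claim_amount: float,
    trailing_12mo_fees: float,
    accuracy_excluded: bool = True,
) -> float:
    """Return the part of a claim that upstream terms would not cover.

    Assumption: the provider's liability cap equals the last twelve
    months of fees, and an accuracy exclusion removes recovery entirely.
    """
    if accuracy_excluded:
        # Output-accuracy claims are carved out, so nothing flows upstream.
        recoverable = 0.0
    else:
        # Even without the exclusion, recovery stops at the fee cap.
        recoverable = min(claim_amount, trailing_12mo_fees)
    return claim_amount - recoverable


# Hypothetical numbers: a 1,000,000 claim against 60,000 in annual fees.
gap_with_exclusion = uncovered_exposure(1_000_000, 60_000)
gap_cap_only = uncovered_exposure(1_000_000, 60_000, accuracy_excluded=False)
```

Under these assumed numbers the middle layer carries the whole claim when the exclusion applies, and all but the fee cap when it does not.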

The Dual Newspaper Test for AI Features: Catching the Failure Modes Your Post-Mortems Miss

· 9 min read
Tian Pan
Software Engineer

Your AI feature passed load testing. It hit the latency SLA. The rollback procedure works. Cost estimates came in under budget. Your post-mortem template has a green checkmark next to every line.

Two months after launch, the product appears in an investigative piece about discriminatory outcomes. You spend six weeks in legal review.

This is the gap the dual newspaper test is designed to close. Most engineering teams build thorough pre-ship processes for technical failures — reliability regressions, API instability, infrastructure cost blowouts. They read post-mortems about outages and optimize accordingly. But a second class of AI failures gets shipped right through those processes because it doesn't look like a bug: the feature works exactly as designed, and the harm happens anyway.

The AI Feature You Should Not Have Shipped: A Task-Shape Checklist

· 10 min read
Tian Pan
Software Engineer

The demo always works. That is the most expensive sentence in AI product development. The product manager sees the model handle the happy path, the engineer ships the obvious version of the feature, and six weeks later the support queue is full of complaints that the metric did not predict. Nothing in the model regressed. Nothing in the prompt got worse. The feature was simply not the shape the model could do well, and the team did not have a way to say so before the work began.

A meaningful fraction of shipped AI features fail this way — not because the model is bad, but because the task is wrong. The output the product needs is deterministic and the engine is stochastic. The user's tolerance for the tail is one bad answer per thousand and the model's failure distribution is heavier than that. The latency budget the unit economics require is half of what the model can deliver at any tier you can afford. The ground truth required to evaluate quality does not exist and cannot be cheaply created. None of these are model problems. They are task-shape problems, and they should have been screened before the first prompt was written.
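The four mismatches above can be written down as a pre-build screen. The sketch below is one possible encoding, not the checklist from the full post: the `TaskShape` fields, the `screen` function, and the example values are all hypothetical, and the failure rate and latency figures would have to come from your own measurements.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskShape:
    needs_deterministic_output: bool
    tolerated_failure_rate: float   # e.g. 0.001 for one bad answer per thousand
    observed_failure_rate: float    # measured on a representative sample
    latency_budget_ms: float        # what the unit economics allow
    achievable_latency_ms: float    # what the model delivers at an affordable tier
    has_ground_truth: bool          # can quality be evaluated at all?


def screen(task: TaskShape) -> List[str]:
    """Return the task-shape problems found; an empty list means none."""
    problems = []
    if task.needs_deterministic_output:
        problems.append("deterministic output required from a stochastic engine")
    if task.observed_failure_rate > task.tolerated_failure_rate:
        problems.append("failure tail is heavier than users tolerate")
    if task.achievable_latency_ms > task.latency_budget_ms:
        problems.append("latency budget cannot be met")
    if not task.has_ground_truth:
        problems.append("no ground truth to evaluate quality")
    return problems


# Hypothetical feature: tolerance of 1 in 1000, measured 1 in 200,
# and a latency budget half of what the model can deliver.
risky = TaskShape(True, 0.001, 0.005, 500.0, 1000.0, False)
findings = screen(risky)
```

Running the screen before the first prompt is written is the point; any non-empty result is a reason to reshape the task rather than tune the model.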

The Copyright Exposure in AI-Generated Content: A Risk Framework for Engineering Teams

· 10 min read
Tian Pan
Software Engineer

GPT-4 reproduced exact passages from books in 43% of test prompts when asked to continue a given excerpt. In one 2025 study, researchers extracted nearly an entire book near-verbatim from a production LLM — no jailbreaking required, just a persistent prefix-feeding loop. If your product generates content using a language model, the copyright exposure is not a future risk. It is happening in your users' sessions today, and you probably have no instrumentation to catch it.

This is not primarily a legal article. It's an engineering article about a legal problem that engineering decisions either create or contain. Lawyers will tell you what constitutes infringement. This framework tells you where your system leaks, how to measure it, and what actually reduces risk versus what only looks like it does.
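As a starting point for measurement, here is a minimal sketch of one common instrumentation approach: the fraction of word n-grams in a model output that also appear verbatim in a reference text. This is an assumed technique for illustration, not necessarily the framework the full article describes. The function names and the default window of 8 words are hypothetical choices, and a production system would need a real reference corpus and an index rather than a single string.

```python
def ngrams(text: str, n: int) -> set:
    """Return the set of lowercase word n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def verbatim_overlap(output: str, reference: str, n: int = 8) -> float:
    """Fraction of the output's n-grams that appear verbatim in the reference.

    Returns 0.0 when the output is shorter than n words.
    """
    out = ngrams(output, n)
    if not out:
        return 0.0
    ref = ngrams(reference, n)
    return len(out & ref) / len(out)


reference = ("it was the best of times it was the worst of times "
             "it was the age of wisdom")
output = "it was the best of times said the narrator"

# A short window (n=4) is used here only because the example text is short.
score = verbatim_overlap(output, reference, n=4)
```

A score near 1.0 on long windows suggests near-verbatim reproduction worth logging and reviewing. A low score does not establish that the output is non-infringing, since paraphrase and reformatting evade exact matching.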