5 posts tagged with "ai-compliance"

The Compliance Audit That Asked Which Model Produced Which Output

· 10 min read
Tian Pan
Software Engineer

The auditor's question sounds simple. She has your appeals log open, points at a row from eight months ago, and asks which model decided that case. Your engineer pulls up the schema: there is a model column, and every decision in the audit window says v1. Then someone from the platform team mentions, almost in passing, that the alias behind v1 rotated four times during the audit period: a base model upgrade, a fine-tune refresh, a vendor-side capacity move, and one rollback that lasted six hours during an incident. The honest answer is that you cannot say which checkpoint produced that decision. The auditor writes something down. "We cannot say" is not a regulator-acceptable answer, and you have just learned that the system you shipped has been failing an audit requirement it was never designed to meet.

The gap here is not a missing log line. The gap is between two different ideas of what "model" means. To the engineers shipping the system, v1 is an endpoint — a stable contract callers can point at while the thing behind it gets upgraded for free. To the auditor, "the model that produced this decision" is a specific artifact: a weight checkpoint, a hash, a thing you could in principle re-run on the same input and get a defensibly similar output. Endpoint aliases were invented to hide checkpoint rotation from callers. Audit-grade provenance demands the opposite — that every decision be attributable to exactly the checkpoint that produced it. The two ideas were on a collision course from the start; the audit just happened to be where they met.
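One way to picture the difference is a decision record that stores both the alias the caller asked for and the checkpoint that actually served the request. The sketch below is illustrative only: `ALIAS_TABLE`, `DecisionRecord`, `record_decision`, and the checkpoint name are hypothetical, and a real serving layer would resolve the alias at request time rather than from a hard-coded dict.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical alias table. In production this mapping would be read from
# the serving layer at request time, because it changes on every rotation.
ALIAS_TABLE = {"v1": "ckpt-2024-03-base-ft2"}


@dataclass(frozen=True)
class DecisionRecord:
    case_id: str
    alias: str           # what the caller pointed at, e.g. "v1"
    checkpoint_id: str   # what the alias resolved to for this request
    weights_sha256: str  # content hash of the checkpoint artifact
    decided_at: str      # UTC timestamp in ISO 8601 form


def weights_digest(weights: bytes) -> str:
    """Content hash that identifies one specific checkpoint artifact."""
    return hashlib.sha256(weights).hexdigest()


def record_decision(case_id, alias, weights, now=None):
    """Resolve the alias once and freeze the result into the log row."""
    checkpoint_id = ALIAS_TABLE[alias]
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return DecisionRecord(
        case_id=case_id,
        alias=alias,
        checkpoint_id=checkpoint_id,
        weights_sha256=weights_digest(weights),
        decided_at=timestamp,
    )


record = record_decision("case-001", "v1", b"stand-in for checkpoint bytes")
```

With a row like this, "which model decided that case" has an answer that survives alias rotation, because the alias is kept as context and the checkpoint hash is the attribution.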

The Legal Review Timeline Your AI Feature Roadmap Never Costed

· 10 min read
Tian Pan
Software Engineer

You sketched a six-quarter AI roadmap. The model swap, the new data source, the multilingual launch, and the prompt that now offers advice each got a single row on the Gantt chart, sized by engineering effort. Then the first launch slipped four weeks, and the post-mortem said the same thing three times in three different sections: "waiting on legal." The roadmap had assumed engineering capacity was the binding constraint. The actual binding constraint was a queue of legal reviews, each running its own three-to-six-week SLA, none of them aware of each other, and all of them landing on the same two product counsels.

The mistake was not in any of the individual reviews. Each one was warranted. The mistake was treating four parallel features as four parallel timelines while their legal dependencies serialized through the same upstream resource. By the second slip the org learns the shape of the problem. By the fourth it learns to plan against it. The teams that ship AI features on a predictable cadence have stopped treating legal throughput as an external surprise and started treating it as a planning input on the same footing as headcount and infra capacity.
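The serialization effect is easy to see with a little arithmetic. The sketch below assumes four reviews of four weeks each (a value inside the three-to-six-week range above, chosen for illustration) and assumes each of the two counsels handles one review at a time. The function name `review_finish_weeks` is made up for this example.

```python
import heapq


def review_finish_weeks(review_weeks, reviewers):
    """Week in which each review finishes when reviews are taken in order
    by whichever reviewer frees up first (one review per reviewer at a time)."""
    free_at = [0] * reviewers  # week at which each reviewer is next available
    heapq.heapify(free_at)
    finishes = []
    for weeks in review_weeks:
        start = heapq.heappop(free_at)  # earliest available reviewer
        finish = start + weeks
        finishes.append(finish)
        heapq.heappush(free_at, finish)
    return finishes


reviews = [4, 4, 4, 4]  # assumed review lengths in weeks

# What the roadmap implicitly assumed: every review runs in parallel.
naive_weeks = max(reviews)  # 4

# What happens when all four land on the same two counsels.
queued_weeks = max(review_finish_weeks(reviews, reviewers=2))  # 8
```

Under these assumptions the last feature clears legal in week 8 rather than week 4, which matches the four-week slip in the story even though no individual review ran long.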

Statistical Watermarking for LLM Output: How Token Logit Bias Creates Detectable Signatures

· 9 min read
Tian Pan
Software Engineer

Google has been watermarking Gemini output for every user since October 2024 — 20 million users, no perceptible quality degradation, algorithmically detectable. OpenAI has a working prototype that requires only a few hundred tokens to produce a reliable signal. Anthropic says it's on the roadmap. The EU AI Act's Article 50 mandates machine-readable marking of AI-generated content for covered providers. And yet: a $0.88-per-million-token attack achieves ~100% evasion success against seven recent watermarking schemes simultaneously.

This is the actual state of LLM text watermarking. The gap between what's deployed, what the papers claim, and what adversaries can do is wider than most teams realize — and the engineering decisions you make about watermarking depend heavily on which side of that gap you're standing on.
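For readers who have not seen how a logit-bias watermark is detected, here is a toy sketch of the green-list idea from the published literature. It is not a description of any vendor's deployed scheme. The key, the `GAMMA` split, and the function names are all assumptions, and the "generator" below always picks a green token, which stands in for an extremely strong logit bias.

```python
import hashlib
import math

GAMMA = 0.5  # assumed fraction of the vocabulary marked "green" at each step


def is_green(prev_token, token, key="demo-key"):
    """Pseudorandomly assign `token` to the green list, seeded by the
    previous token and a secret key (both hypothetical here)."""
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    return digest[0] / 256 < GAMMA


def z_score(tokens):
    """How far the observed green count sits above the count expected
    from unwatermarked text, in standard deviations."""
    n = len(tokens) - 1  # number of (previous, current) pairs scored
    if n < 1:
        raise ValueError("need at least two tokens")
    green = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


def force_green_sequence(length, vocab_size=1000, start=0):
    """Toy generator: at every step emit the first green token.
    A real sampler would only nudge green logits upward."""
    tokens = [start]
    while len(tokens) < length:
        prev = tokens[-1]
        tokens.append(next(t for t in range(vocab_size) if is_green(prev, t)))
    return tokens


watermarked = force_green_sequence(101)  # 100 scored pairs, all green
score = z_score(watermarked)             # (100 - 50) / 5 = 10.0
```

The detector never needs the model, only the key and the token sequence. That is also why paraphrasing attacks work: rewriting the text replaces the tokens whose green-list membership carried the signal.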

Open-Weight Licenses Are a Compliance Minefield Your Team Hasn't Mapped

· 9 min read
Tian Pan
Software Engineer

The word "open" is doing an extraordinary amount of work in "open-weight." When an engineer downloads a safetensors file from a model hub, they tend to file the act under the same mental category as npm install lodash — pull a dependency, ship a feature, move on. But the license that ships next to those weights is rarely Apache 2.0 or MIT. It is more often a custom community license with acceptable-use carve-outs, attribution requirements, derivative-naming rules, and user-count thresholds that switch the contract terms once your product gets popular. And almost none of it is enforced by the loader. The model runs whether you complied or not.

This is how compliance debt accumulates silently. The team that treats license review as a one-time download check is signing the company up for an audit finding that will ship years after the developer who clicked "I agree" has left. The fix is not a stricter procurement gate at the door — it is a discipline of treating model weights as a supply chain, with provenance, periodic re-review, and a manifest that traces every deployed inference path back to its upstream license.
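A manifest of this kind can be very small. The sketch below is one possible shape, not a standard: the field names, the 180-day re-review window, the example license name, and the `review_reasons` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class WeightsEntry:
    inference_path: str    # where the weights are served, e.g. a service route
    weights_sha256: str    # content hash of the deployed file
    upstream_model: str    # the model these weights derive from
    license_name: str      # license that shipped with the upstream weights
    last_reviewed: date    # when someone last read that license against usage
    user_threshold: Optional[int] = None  # user count at which terms change


def review_reasons(entry, today, monthly_users, max_age_days=180):
    """List the reasons this entry needs another license review, if any."""
    reasons: List[str] = []
    if (today - entry.last_reviewed).days > max_age_days:
        reasons.append("review is stale")
    if entry.user_threshold is not None and monthly_users >= entry.user_threshold:
        reasons.append("user threshold reached")
    return reasons


entry = WeightsEntry(
    inference_path="/v1/summarize",             # hypothetical route
    weights_sha256="0" * 64,                    # placeholder hash
    upstream_model="example-org/example-7b",    # hypothetical model name
    license_name="Example Community License",   # hypothetical license
    last_reviewed=date(2024, 1, 1),
    user_threshold=1000,
)

reasons = review_reasons(entry, today=date(2024, 9, 1), monthly_users=1500)
```

Running a check like this on a schedule is what turns a one-time download check into periodic re-review: the loader will never refuse to run the weights, so something else has to notice when the terms change.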

The EU AI Act for Engineers: What the Four Risk Tiers Actually Require From Your Architecture

· 11 min read
Tian Pan
Software Engineer

Retrofitting EU AI Act compliance into an existing system costs 3-5x more than building it in from the start. That single fact should reframe how every engineering team thinks about the August 2026 deadline. The EU AI Act isn't a legal problem that lawyers will solve and engineers can ignore — it's an architecture problem that requires logging pipelines, human override mechanisms, bias testing infrastructure, and explainability layers baked into your system design. If your AI system touches European users and you haven't started building this, you're already behind.

Most coverage of the AI Act focuses on the legal framework: what's prohibited, what's permitted, how fines work. That's useful for your legal team. This article is about what you, as an engineer, actually need to build — the specific systems, pipelines, and architecture changes that compliance demands.