3 posts tagged with "ai-compliance"

Statistical Watermarking for LLM Output: How Token Logit Bias Creates Detectable Signatures

· 9 min read
Tian Pan
Software Engineer

Google has been watermarking Gemini output for every user since October 2024; in a live experiment covering roughly 20 million responses, users perceived no quality degradation, and the watermark remains algorithmically detectable. OpenAI has a working prototype that requires only a few hundred tokens to produce a reliable signal. Anthropic says it's on the roadmap. The EU AI Act's Article 50 mandates machine-readable marking of AI-generated content for covered providers. And yet: a $0.88-per-million-token attack achieves ~100% evasion success against seven recent watermarking schemes simultaneously.

This is the actual state of LLM text watermarking. The gap between what's deployed, what the papers claim, and what adversaries can do is wider than most teams realize — and the engineering decisions you make about watermarking depend heavily on which side of that gap you're standing on.
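
To make the "few hundred tokens" claim concrete, here is a minimal sketch of how a logit-bias watermark can be detected. It assumes a generic keyed "green list" scheme, not Google's or OpenAI's actual method: a secret key plus the previous token pseudo-randomly marks a fraction of the vocabulary as green, generation nudges sampling toward green tokens, and detection counts green tokens and computes a z-score. The key, `GAMMA`, the toy `generate` function, and the rejection-sampling stand-in for a logit bias are all illustrative assumptions.

```python
import hashlib
import math
import random

GAMMA = 0.5  # assumed fraction of the vocabulary marked "green" at each step


def is_green(prev_token: int, token: int, key: str = "demo-key") -> bool:
    """Keyed pseudo-random green-list membership, seeded by the previous token."""
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def z_score(tokens: list) -> float:
    """Standard deviations by which the green count exceeds chance.

    Without a watermark, the green count over n token pairs is roughly
    Binomial(n, GAMMA), so the z-score stays near zero.
    """
    n = len(tokens) - 1
    if n < 1:
        raise ValueError("need at least two tokens")
    green = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


def generate(length: int, vocab_size: int = 50_000,
             green_bias: float = 0.0, seed: int = 0) -> list:
    """Toy stand-in for an LLM: uniform token sampling.

    With probability `green_bias`, the next token is resampled until it is
    green. This crudely imitates adding a positive bias to green-token logits.
    """
    rng = random.Random(seed)
    tokens = [rng.randrange(vocab_size)]
    while len(tokens) < length:
        candidate = rng.randrange(vocab_size)
        if rng.random() < green_bias:
            while not is_green(tokens[-1], candidate):
                candidate = rng.randrange(vocab_size)
        tokens.append(candidate)
    return tokens


plain = generate(300, green_bias=0.0)
marked = generate(300, green_bias=0.5)
print(f"unwatermarked z = {z_score(plain):.2f}")
print(f"watermarked   z = {z_score(marked):.2f}")
```

Under these assumptions, about 75% of the biased tokens land on the green list instead of 50%, which over 300 tokens puts the expected z-score near 8.7. The same arithmetic suggests why paraphrasing attacks work: rewriting the text re-rolls which tokens follow which, and the green fraction drifts back toward chance.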

Open-Weight Licenses Are a Compliance Minefield Your Team Hasn't Mapped

· 9 min read
Tian Pan
Software Engineer

The word "open" is doing an extraordinary amount of work in "open-weight." When an engineer downloads a safetensors file from a model hub, they tend to file the act under the same mental category as npm install lodash — pull a dependency, ship a feature, move on. But the license that ships next to those weights is rarely Apache 2.0 or MIT. It is more often a custom community license with acceptable-use carve-outs, attribution requirements, derivative-naming rules, and user-count thresholds that switch the contract terms once your product gets popular. And almost none of it is enforced by the loader. The model runs whether you complied or not.

This is how compliance debt accumulates silently. The team that treats license review as a one-time download check is signing the company up for an audit finding that will ship years after the developer who clicked "I agree" has left. The fix is not a stricter procurement gate at the door — it is a discipline of treating model weights as a supply chain, with provenance, periodic re-review, and a manifest that traces every deployed inference path back to its upstream license.
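
As a rough illustration of what such a manifest could look like, the sketch below records provenance and review dates per model and walks from a deployed inference path back through fine-tunes to every upstream license. The field names, the 180-day review window, and the helper functions are hypothetical choices for illustration, not an established schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ModelLicenseRecord:
    model_id: str
    source_url: str                    # where the weights were obtained
    weights_sha256: str                # pins the exact artifact that was reviewed
    license_name: str
    last_reviewed: date
    deployed_paths: Tuple[str, ...]    # inference routes served by this model
    derived_from: Optional[str] = None # upstream model_id, if this is a fine-tune


def overdue_reviews(records: List[ModelLicenseRecord], today: date,
                    max_age: timedelta = timedelta(days=180)) -> List[str]:
    """Return model ids whose license review is older than `max_age`."""
    return [r.model_id for r in records if today - r.last_reviewed > max_age]


def licenses_for_path(records: List[ModelLicenseRecord], path: str) -> List[str]:
    """Trace a deployed path back through its upstream chain of licenses."""
    by_id = {r.model_id: r for r in records}
    licenses = []
    for record in records:
        if path not in record.deployed_paths:
            continue
        seen = set()
        current = record.model_id
        while current is not None:
            if current in seen:
                raise ValueError(f"cycle in derived_from chain at {current}")
            seen.add(current)
            node = by_id[current]
            licenses.append(node.license_name)
            current = node.derived_from
    return licenses
```

A scheduled job could call `overdue_reviews` to flag models whose review has lapsed, and a deploy check could call `licenses_for_path` for each new route so that a fine-tune never ships without its base model's terms attached.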

The EU AI Act for Engineers: What the Four Risk Tiers Actually Require From Your Architecture

· 11 min read
Tian Pan
Software Engineer

Retrofitting EU AI Act compliance into an existing system costs 3-5x more than building it in from the start. That single fact should reframe how every engineering team thinks about the August 2026 deadline. The EU AI Act isn't a legal problem that lawyers will solve and engineers can ignore — it's an architecture problem that requires logging pipelines, human override mechanisms, bias testing infrastructure, and explainability layers baked into your system design. If your AI system touches European users and you haven't started building this, you're already behind.

Most coverage of the AI Act focuses on the legal framework: what's prohibited, what's permitted, how fines work. That's useful for your legal team. This article is about what you, as an engineer, actually need to build — the specific systems, pipelines, and architecture changes that compliance demands.