
3 posts tagged with "content-moderation"


Diffusion Models in Production: The Engineering Stack Nobody Discusses After the Demo

10 min read
Tian Pan
Software Engineer

Your image generation feature just went viral. 100,000 requests are coming in daily. The API provider's rate limit technically accommodates it. Latency climbs to 12 seconds at p95. Your NSFW classifier is flagging legitimate medical illustrations. A compliance audit surfaces that California's AI Transparency Act, signed in September 2024, mandates watermarking. Support has 50 open tickets from users whose content was silently blocked. By the time you realize you need a real production stack, you've already burned two weeks in crisis mode.

This is the moment "just call the API" fails—not because the API is bad, but because the demo's success exposes every assumption you made about inference latency, content policy, moderation fairness, and regulatory compliance. The engineering work nobody shows you in tutorials lives here.

The Two-Sided Cost of AI Content Filters: Why Over-Refusal Is a Business Problem Too

9 min read
Tian Pan
Software Engineer

Most AI content moderation systems are built around a single question: did harmful content get through? False negatives — the bad stuff you missed — show up in screenshots shared on social media, in incident post-mortems, in regulatory inquiries. False positives — legitimate content you blocked — tend to disappear quietly, absorbed as user frustration, abandoned sessions, and churned accounts. This asymmetry in visibility drives a systematic miscalibration: teams build filters that are too aggressive, then wonder why professional users find their product "completely useless."

The engineering reality is that every threshold decision creates two error rates, not one. Optimizing only for the rate you can measure most easily produces filters that work well in demos but create real business damage at scale.
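To make that concrete, here is a minimal sketch in Python, with hypothetical field names, of what a single threshold decision does on a labeled evaluation set: it fixes the miss rate on harmful content and the over-block rate on legitimate content at the same time.

```python
# Minimal sketch: one threshold, two error rates.
# "score" and "label" are hypothetical fields; "score" is the moderation
# model's confidence that the content is harmful.

eval_set = [
    {"score": 0.95, "label": "harmful"},
    {"score": 0.62, "label": "harmful"},
    {"score": 0.81, "label": "benign"},   # e.g. a medical illustration
    {"score": 0.34, "label": "benign"},
]

def error_rates(examples, threshold):
    """Return (miss_rate, over_block_rate) at a given threshold."""
    fn = fp = harmful = benign = 0
    for ex in examples:
        flagged = ex["score"] >= threshold
        if ex["label"] == "harmful":
            harmful += 1
            fn += not flagged        # harmful content that got through
        else:
            benign += 1
            fp += flagged            # legitimate content that got blocked
    return fn / max(harmful, 1), fp / max(benign, 1)

# Lowering the threshold cuts misses but raises over-blocking, and vice versa.
for t in (0.3, 0.5, 0.7, 0.9):
    miss, over_block = error_rates(eval_set, t)
    print(f"threshold={t:.1f}  miss rate={miss:.0%}  over-block rate={over_block:.0%}")
```

A real evaluation set would be far larger, but the shape of the tradeoff is the same: the only choice is which error you pay for, and in what proportion.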

The Precision-Recall Tradeoff Hiding Inside Your AI Safety Filter

10 min read
Tian Pan
Software Engineer

When teams deploy an AI safety filter, the conversation almost always centers on what it catches. Did it block the jailbreak? Does it flag hate speech? Can it detect prompt injection? These are the right questions for recall. They are almost never paired with the equally important question: what does it block that it shouldn't?

The answer is usually: a lot. And because most teams ship with the vendor's default threshold and never instrument false positives in production, they don't find out until users start complaining—or until they stop complaining, because they stopped using the product.
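One way to surface that blind spot, sketched below with assumed names rather than any specific vendor's API, is to log every block alongside its score and route a sample of near-threshold blocks to human review, so the false-positive rate becomes a number you track instead of a complaint you hear about later.

```python
# Minimal sketch (hypothetical names throughout): make false positives
# observable. Every block is logged with its score; blocks that land just
# above the threshold are sampled for human review, which yields an
# estimate of the over-block rate instead of silence.

import random

REVIEW_SAMPLE_RATE = 0.05   # fraction of borderline blocks sent to review
BORDERLINE_MARGIN = 0.15    # how far above the threshold still counts as "borderline"

blocked_log = []
review_queue = []

def record_block(request_id: str, score: float, threshold: float) -> None:
    blocked_log.append({"request_id": request_id, "score": score, "threshold": threshold})
    borderline = score - threshold < BORDERLINE_MARGIN
    if borderline and random.random() < REVIEW_SAMPLE_RATE:
        # A reviewer later marks this as a true block or a false positive;
        # the ratio over time is the production false-positive estimate.
        review_queue.append(request_id)
```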