61 posts tagged with "ai"

The Org Chart Problem: Why AI Features Die Between Teams

· 10 min read
Tian Pan
Software Engineer

The model works. The pipeline runs. The demo looks great. And then the feature dies somewhere between the data team's Slack channel and the product engineer's JIRA board.

This is the pattern behind most AI project failures—not a technical failure, but an organizational one. A 2025 survey found that 42% of companies abandoned most of their AI initiatives that year, up from 17% the year prior. The average sunk cost per abandoned initiative was $7.2 million. When the post-mortems get written, the causes listed are "poor data readiness," "unclear ownership," and "lack of governance"—which are three different ways of saying the same thing: nobody was actually responsible for shipping the feature.

The Persona Lock Problem: How Long-Lived AI Sessions Trap Users in Their Own Patterns

· 8 min read
Tian Pan
Software Engineer

There's a failure mode in long-lived AI systems that nobody talks about in product reviews but shows up constantly in user behavior data: people start routing around their own AI assistants. They rephrase prompts in uncharacteristic ways, abandon features the system has learned to surface for them, or quietly switch to a different tool for a task they've done hundreds of times before. The system worked — it learned — and that's exactly why it stopped working.

This is the persona lock problem. When an AI adapts to your past behavior, it's building a model of the you that existed at training time. That model gets more confident with every interaction. And eventually it becomes a prison.
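To make the mechanism concrete, here is a minimal sketch (a hypothetical scoring rule, not taken from the post) of an exploitation-only personalization loop. Items the user has never been shown never accumulate score, so the ranking converges on the past self:

```python
from collections import defaultdict

# Hypothetical exploitation-only personalization loop: scores are an
# exponential moving average of observed clicks, and ranking always
# surfaces the highest-scoring items.
ALPHA = 0.2  # learning rate for the moving average (illustrative)

scores: dict[str, float] = defaultdict(float)

def update(shown: list[str], clicked: set[str]) -> None:
    """Reinforce whatever the user did with what we chose to show."""
    for item in shown:
        reward = 1.0 if item in clicked else 0.0
        scores[item] = (1 - ALPHA) * scores[item] + ALPHA * reward

def rank(candidates: list[str], k: int = 5) -> list[str]:
    """Pure exploitation: no exploration slice, no score decay."""
    return sorted(candidates, key=lambda c: scores[c], reverse=True)[:k]

# After enough iterations, rank() only returns the items the user clicked
# early on; new interests are never shown, so they never score.
```

Any remedy, whether an exploration slice, score decay, or an explicit "reset my profile" control, amounts to letting candidates outside the learned model back into rank().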

The Shadow AI Governance Problem: Why Banning Personal AI Accounts Makes Security Worse

· 9 min read
Tian Pan
Software Engineer

Workers at 90% of companies are using AI tools — ChatGPT, Claude, Gemini — to do their jobs, and 73.8% of the accounts they use are personal, non-corporate ones. Meanwhile, 57% of employees using unapproved AI tools are sharing sensitive information with them: customer data, internal documents, code, legal drafts. Most executives believe their policies protect against this. The data says only 14.4% actually have full security approval for the AI their teams deploy.

The gap between what leadership believes is happening and what is actually happening is the shadow AI governance problem.

The instinct at most companies is to respond with a ban. Block personal chatbot accounts at the network level, issue a policy memo, run an annual training, and call it governance. This is the worst possible response — not because the concern is wrong, but because the intervention makes the problem invisible without making it smaller.

The Air-Gapped LLM Blueprint: What Egress-Free Deployments Actually Need

· 11 min read
Tian Pan
Software Engineer

The cloud AI playbook assumes one primitive that nobody writes down: outbound HTTPS. Vendor APIs, hosted judges, telemetry pipelines, model registries, vector stores, dashboard SaaS, secret managers — every one of them quietly resolves to a domain on the public internet. Pull that one cable and the stack does not degrade gracefully. It collapses.

That is the moment most teams discover their architecture has an egress dependency they never accounted for. A "small" prompt update needs to call out to a hosted classifier. The eval suite hits an LLM judge over the wire. The observability agent phones home. The model registry pulls weights from a CDN. None of it is malicious, and none of it is unusual. It is just what the cloud-native stack looks like when you stop noticing the cable.
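One way to surface those hidden dependencies before anyone pulls the cable is a pre-deployment egress audit. A minimal sketch, assuming you can enumerate the endpoints your stack is configured to call; the internal-domain allowlist and the example endpoints are illustrative:

```python
from urllib.parse import urlparse

# Hypothetical pre-deployment check: every configured endpoint must live
# inside the internal zone, or the deployment is not actually egress-free.
INTERNAL_SUFFIXES = (".internal.example", ".corp.example")  # assumption

CONFIGURED_ENDPOINTS = [  # gathered from your configs (examples only)
    "https://api.openai.com/v1/chat/completions",        # hosted judge
    "https://registry.internal.example/models/llama",    # in-house registry
    "https://o11y-vendor.example.com/ingest",             # telemetry phone-home
]

def external_endpoints(urls: list[str]) -> list[str]:
    """Return every configured URL whose host is not on the internal allowlist."""
    flagged = []
    for url in urls:
        host = urlparse(url).hostname or ""
        if not host.endswith(INTERNAL_SUFFIXES):
            flagged.append(url)
    return flagged

if __name__ == "__main__":
    for url in external_endpoints(CONFIGURED_ENDPOINTS):
        print(f"EGRESS DEPENDENCY: {url}")
```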

Found Capabilities: When Users Ship Features Your Team Never Roadmapped

· 10 min read
Tian Pan
Software Engineer

A customer emails support to ask why your CRM agent stopped drafting their NDAs. You did not know your CRM agent drafted NDAs. A power user complains that your support bot's Tagalog translations have gotten worse since last week. You did not know your support bot did Tagalog. A forum thread spreads a prompt that turns your code-review assistant into a passable security scanner, and within a quarter you are getting CVE reports filed against findings the assistant produced. Each of these is a feature with adoption, business impact, and zero institutional ownership — no eval, no SLA, no surface in the UX, no roadmap entry, and a quiet bus factor of one: the customer who figured it out.

This is what happens once your product is wrapped around a model whose capability surface is wider than the surface you scoped. Users explore the wider surface, find behaviors that solve their problems, build workflows on top of those behaviors, and then experience your next model upgrade as a regression even though nothing on your roadmap moved. The contract between you and your users is no longer the one you wrote down. It includes everything the model happened to do for them that you happened not to break.

Treating this as an engineering surprise — "we will harden the prompt, we will add a guardrail, we will catch it next time" — is a category error. Found capabilities are a product-management problem. The discipline is not preventing them; it is detecting them, deciding what to do with them, and remembering that you decided.
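Detection has to start somewhere, and the cheapest place is usually the prompt logs. A minimal sketch, assuming you already tag each request with a coarse intent label; the scoped intents, threshold, and example output are illustrative, not from the post:

```python
from collections import Counter

# Hypothetical found-capability detector: count request intents that fall
# outside the scoped feature set and flag any off-roadmap behavior that
# already has real adoption.
SCOPED_INTENTS = {"draft_email", "summarize_account", "log_activity"}  # what you shipped
MIN_WEEKLY_REQUESTS = 200  # adoption threshold (illustrative)

def found_capabilities(weekly_intents: list[str]) -> dict[str, int]:
    """Return off-roadmap intents with enough volume to count as adopted."""
    counts = Counter(weekly_intents)
    return {
        intent: n
        for intent, n in counts.items()
        if intent not in SCOPED_INTENTS and n >= MIN_WEEKLY_REQUESTS
    }

# e.g. {"draft_nda": 430, "translate_tagalog": 275} -> features with real
# adoption that nobody on the roadmap owns yet.
```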

The AI Feature Nobody Uses: How Teams Ship Capabilities That Never Get Adopted

· 9 min read
Tian Pan
Software Engineer

A VP of Product at a mid-market project management company spent three quarters of her engineering team's roadmap building an AI assistant. Six months after launch, weekly active usage sat at 4%. When asked why they built it: "Our competitor announced one. Our board asked when we'd have ours." That's a panic decision dressed up as a product strategy — and it's endemic right now.

The 4% isn't an outlier. A customer success platform shipped AI-generated call summaries to 6% adoption after four months. A logistics SaaS added AI route optimization suggestions and got 11% click-through with a 2% action rate. An HR platform launched an AI policy Q&A bot that spiked for two weeks and flatlined at 3%. The pattern is consistent enough to name: ship an AI feature, watch it get ignored, quietly sunset it eighteen months later.

The default explanation is that the AI wasn't good enough. Sometimes that's true. More often, the model was fine — users just never found the feature at all.

The AI Feature Sunset Playbook: How to Retire Underperforming AI Without Burning Trust

· 10 min read
Tian Pan
Software Engineer

Engineering teams have built more AI features in the past three years than in the prior decade. They have retired almost none of them. Deloitte found that 42% of companies abandoned at least one AI initiative in 2025 — up from 17% the year before — with average sunk costs of $7.2 million per abandoned project. Yet the features that stay in production often cause more damage than the ones that get cut: they erode user trust slowly, accumulate technical debt that compounds monthly, and consume engineering capacity that could go toward things that work.

The asymmetry is structural. AI feature launches generate announcements, stakeholder excitement, and team recognition. Retirements are treated as admissions of failure. So bad features accumulate. The fix is not willpower — it is a decision framework that makes retirement a normal, predictable engineering outcome rather than an organizational crisis.
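What such a framework can look like in practice: a minimal sketch of a quarterly retirement gate, with thresholds that are illustrative rather than prescriptive.

```python
from dataclasses import dataclass

# Hypothetical retirement gate, evaluated every quarter for every AI
# feature, so that "keep" is a decision rather than a default.
@dataclass
class FeatureHealth:
    weekly_active_pct: float        # share of eligible users who used it
    quarterly_incidents: int        # trust-damaging failures attributed to it
    monthly_maintenance_hours: float

ADOPTION_FLOOR = 10.0       # illustrative thresholds
INCIDENT_CEILING = 3
MAINTENANCE_CEILING = 40.0

def retirement_verdict(h: FeatureHealth) -> str:
    if h.weekly_active_pct < ADOPTION_FLOOR and h.quarterly_incidents > INCIDENT_CEILING:
        return "retire"
    if h.weekly_active_pct < ADOPTION_FLOOR or h.monthly_maintenance_hours > MAINTENANCE_CEILING:
        return "review"
    return "keep"
```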

AI Incident Response Playbooks: Why Your On-Call Runbook Doesn't Work for LLMs

· 10 min read
Tian Pan
Software Engineer

Your monitoring dashboard shows elevated latency, a small error rate spike, and then nothing. Users are already complaining in Slack. A quarter of your AI feature's responses are hallucinating in ways that look completely valid to your alerting system. By the time you find the cause — a six-word change to a prompt deployed two hours ago — you've had a slow-burn incident that your runbook never anticipated.

This is the defining challenge of operating AI systems in production. The failure modes are real, damaging, and invisible to conventional tooling. An LLM that silently hallucinates looks exactly like an LLM that's working correctly from the outside.
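Conventional alerting watches latency and error rates; catching the silent failure mode takes a check that looks at the content itself. A minimal sketch of a canary eval — the prompt set, grading rule, and threshold are assumptions for illustration:

```python
# Hypothetical canary eval: periodically replay prompts with known-good
# answers through the production path and page when the pass rate drops,
# even while latency and HTTP error rates look normal.
CANARIES = [
    {"prompt": "What is our refund window for annual plans?",
     "must_contain": ["30 days"]},
    {"prompt": "Which plan includes SSO?",
     "must_contain": ["Enterprise"]},
]
PASS_RATE_THRESHOLD = 0.9  # illustrative

def call_model(prompt: str) -> str:
    raise NotImplementedError  # your production inference path goes here

def canary_pass_rate() -> float:
    passed = 0
    for case in CANARIES:
        answer = call_model(case["prompt"])
        if all(snippet in answer for snippet in case["must_contain"]):
            passed += 1
    return passed / len(CANARIES)

def should_page() -> bool:
    # A six-word prompt change shows up here within one canary cycle,
    # not two hours later in a Slack thread.
    return canary_pass_rate() < PASS_RATE_THRESHOLD
```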

Bias Monitoring Infrastructure for Production AI: Beyond the Pre-Launch Audit

· 10 min read
Tian Pan
Software Engineer

Your model passed its fairness review. The demographic parity was within acceptable bounds, equal opportunity metrics looked clean, and the audit report went into Confluence with a green checkmark. Three months later, a journalist has screenshots showing your system approves loans at half the rate for one demographic compared to another — and your pre-launch numbers were technically accurate the whole time.

This is the bias monitoring gap. Pre-launch fairness testing validates your model against datasets that existed when you ran the tests. Production AI systems don't operate in that static world. User behavior shifts, population distributions drift, feature correlations evolve, and disparities that weren't measurable at launch can become significant failure modes within weeks. The systems that catch these problems aren't part of most ML stacks today.
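The monitoring piece is mechanically simple; what most stacks lack is anyone running it against production decisions at all. A minimal sketch of a rolling demographic parity check — the window size, minimum sample, and alert threshold are illustrative:

```python
from collections import deque

# Hypothetical rolling monitor: keep the last N production decisions per
# group and alert when the approval-rate ratio between any two groups
# falls below the four-fifths threshold.
WINDOW = 5000        # decisions retained per group (illustrative)
MIN_SAMPLE = 100     # don't compare groups with too little data
PARITY_FLOOR = 0.8   # the classic four-fifths rule

windows: dict[str, deque] = {}

def record_decision(group: str, approved: bool) -> None:
    windows.setdefault(group, deque(maxlen=WINDOW)).append(approved)

def parity_alerts() -> list[tuple[str, str, float]]:
    """Return (group, reference_group, ratio) pairs below the parity floor."""
    rates = {g: sum(w) / len(w) for g, w in windows.items() if len(w) >= MIN_SAMPLE}
    alerts = []
    for g1, r1 in rates.items():
        for g2, r2 in rates.items():
            if g1 != g2 and r2 > 0 and r1 / r2 < PARITY_FLOOR:
                alerts.append((g1, g2, r1 / r2))
    return alerts
```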

Organizational Antibodies: Why AI Projects Die After the Pilot

· 11 min read
Tian Pan
Software Engineer

The demo went great. The pilot ran for six weeks, showed clear results, and the stakeholders in the room were impressed. Then nothing happened. Three months later the project was quietly shelved, the engineer who built it moved on to something else, and the company's AI strategy became a slide deck that said "exploring opportunities."

This is the pattern that kills AI initiatives. Not technical failure. Not insufficient model capability. Not even budget. The technology actually works — research consistently shows that around 80% of AI projects that reach production meet or exceed their stated expectations. The problem is the 70-90% that never get there.

AI Coding Agents on Legacy Codebases: Why They Fail Where You Need Them Most

· 9 min read
Tian Pan
Software Engineer

The teams that most urgently need AI coding help are usually not the ones building new greenfield services. They're the ones maintaining 500,000-line Rails monoliths from 2012, COBOL payment systems that have processed billions of transactions, or microservice meshes where the original architects left three acquisitions ago. These are the codebases where a single misplaced refactor can introduce a silent data corruption bug that surfaces three weeks later in production.

And this is exactly where current AI coding agents fail most spectacularly.

The frustrating part is that the failure mode is invisible until it isn't. The agent produces code that compiles, passes existing tests, and looks reasonable in review. The problem surfaces in staging, in the nightly batch job, or in the edge case that only one customer hits on a specific day of the month.

AI Content Provenance in Production: C2PA, Audit Trails, and the Compliance Deadline Engineers Are Missing

· 12 min read
Tian Pan
Software Engineer

When the EU AI Act's transparency obligations take effect on August 2, 2026, every system that generates synthetic content for EU-resident users will need to mark that content with machine-readable provenance. Most engineering teams building AI products are vaguely aware of this. Far fewer have actually stood up the infrastructure to comply — and of those that have, a substantial fraction have implemented only part of what regulators require.

The dominant technical response to "AI content provenance" has been to point at C2PA (the Coalition for Content Provenance and Authenticity standard) and declare the problem solved. C2PA is important. It's real, it's being adopted by Adobe, Google, OpenAI, Sony, and Samsung, and it's the closest thing to a universal standard the industry has. But a C2PA implementation alone will not satisfy EU AI Act Article 50. It won't survive your CDN. And it won't prevent bad actors from producing "trusted" provenance for manipulated content.
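One reason an embedded manifest alone falls short is that metadata is routinely stripped or rewritten in transit; keeping an audit record on your side of the pipeline, keyed to what you actually generated, at least preserves a verifiable trail of what left your system. A minimal sketch of such a sidecar record — the fields and approach are illustrative assumptions, not the post's prescription:

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical sidecar provenance record: even if downstream processing
# strips or alters the embedded C2PA manifest, you still hold an internal
# record of what you generated, with which model, for whom, and when.
def provenance_record(content: bytes, model_id: str, request_id: str) -> str:
    record = {
        "sha256": hashlib.sha256(content).hexdigest(),
        "model_id": model_id,
        "request_id": request_id,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "synthetic": True,  # explicit machine-readable marking of AI-generated content
    }
    return json.dumps(record)  # append to a write-once audit log

# Later verification: hash the artifact in question and look the digest up
# in the log, independent of whatever metadata is still attached to it.
```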

This post is about what AI content provenance actually requires in production — the technical stack, the failure modes, and the compliance gaps that catch teams off guard.