We have a trust problem in production, and nobody wants to talk about it.
93% of developers use AI coding tools. Only 46% actually trust them.
Let me repeat that: mass adoption without trust.
As CTO, this keeps me up at night because we’d never accept this for any other production system. Imagine telling your board: “93% of our engineers use this database, but only 46% trust it won’t lose data.”
They’d demand we fix it immediately.
But for AI coding tools? We’re just… accepting it. Writing production code with tools we don’t fully trust. Every day.
The trust breakdown:
- 46% trust AI tools
- 33% “sort of trust” them
- 21% don’t trust them at all
Yet nearly all of them keep using the tools anyway, because the productivity gains feel real even when the trust doesn't.
What does “sort of trust” even mean in production?
You trust it for boilerplate. Don’t trust it for complex logic.
Trust it for CSS. Don’t trust it for security-critical code.
Trust it to give you a starting point. Don’t trust it to be right.
This is fine when you have the expertise to evaluate the output. It’s dangerous when you don’t.
The business risk calculation I’m running:
Acceptable risk: AI handles routine CRUD operations, senior engineers review everything.
Dangerous risk: AI handles complex distributed systems logic, juniors can’t evaluate if it’s correct.
Crisis scenario: Production incident caused by AI-generated code, team can’t debug it because they don’t understand what the AI built.
The data that scares me:
- AI code shows 1.7x more issues overall
- 23.7% more security vulnerabilities specifically
- 66% of developers say AI code is “almost right, but not quite”
That last one is insidious. “Almost right” passes code review if reviewers are overwhelmed. It passes testing if test coverage isn’t comprehensive. It ships to production and breaks in edge cases.
What we implemented:
AI-Assisted Code Tiers
- Green tier: Juniors with AI can touch this (well-tested, non-critical paths)
- Yellow tier: Mid-level+ with AI (feature code, business logic)
- Red tier: Senior only, AI optional (security, distributed systems, data integrity)
Controversial because it limits junior autonomy. But I’d rather limit scope than cause outages.
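A tier policy like this only works if it's enforceable, not aspirational. Here is a minimal sketch of how a CI check might encode it; the path globs, tier names, and level labels are illustrative, not our actual tooling:

```python
from fnmatch import fnmatch

LEVEL_RANK = {"junior": 0, "mid": 1, "senior": 2}

# Illustrative glob -> tier mapping; real repositories would differ.
TIER_RULES = [
    ("src/auth/*", "red"),        # security-critical
    ("src/payments/*", "red"),    # data integrity
    ("src/features/*", "yellow"), # business logic
    ("src/ui/*", "green"),        # well-tested, non-critical paths
]

# Minimum author level per tier, with and without AI assistance.
MIN_AI_LEVEL = {"green": "junior", "yellow": "mid", "red": "senior"}
MIN_LEVEL = {"green": "junior", "yellow": "junior", "red": "senior"}

def tier_for(path: str) -> str:
    for pattern, tier in TIER_RULES:
        if fnmatch(path, pattern):
            return tier
    return "yellow"  # unmapped paths default to the middle tier

def change_allowed(path: str, author_level: str, ai_assisted: bool) -> bool:
    tier = tier_for(path)
    needed = MIN_AI_LEVEL[tier] if ai_assisted else MIN_LEVEL[tier]
    return LEVEL_RANK[author_level] >= LEVEL_RANK[needed]
```

So `change_allowed("src/ui/button.css", "junior", True)` passes, while a mid-level engineer bringing AI-assisted code into `src/auth/` would be blocked before review even starts.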
Security Review Process
All AI-assisted code goes through automated security scanning plus manual review. A 23.7% higher vulnerability rate means our normal review process alone isn't enough.
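The extra gate for AI-assisted changes can be expressed as a simple merge predicate. A sketch under assumed names (the flags here would come from PR labels and CI in a real pipeline):

```python
from dataclasses import dataclass

@dataclass
class ChangeReview:
    ai_assisted: bool     # self-declared on the PR or detected by tooling
    scan_passed: bool     # automated security scan: SAST, secrets, deps
    human_approvals: int  # manual reviews recorded on the PR

def may_merge(review: ChangeReview) -> bool:
    """AI-assisted changes need BOTH a clean scan and a human approval;
    neither gate can be waived for them. Human-written changes follow
    the normal review requirement."""
    if review.ai_assisted:
        return review.scan_passed and review.human_approvals >= 1
    return review.human_approvals >= 1
```

The point of the sketch is the asymmetry: AI-assisted code never gets to trade one gate off against the other.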
Blast Radius Limits
AI-assisted features ship behind feature flags with gradual rollout. If something breaks, we can roll back fast.
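Gradual rollout is the part that makes the blast radius small. A minimal sketch of percentage-based rollout with deterministic bucketing (any real flag service works the same way; the function names are illustrative):

```python
import hashlib

def rollout_bucket(user_id: str, flag: str) -> int:
    """Deterministic 0-99 bucket, so a user stays consistently in or
    out of a rollout across requests. Hashing the flag name in too
    keeps different flags' cohorts independent."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100

def is_enabled(user_id: str, flag: str, percent: int) -> bool:
    """Enable for `percent`% of users. Dial percent up as confidence
    grows; set it to 0 to roll back instantly, without a deploy."""
    return rollout_bucket(user_id, flag) < percent
```

Rolling back an AI-assisted feature becomes a config change, not an emergency revert and redeploy.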
Dependency Audits
AI tools love suggesting libraries. We now audit every new dependency before approval. Found three cases of AI suggesting deprecated packages with known vulnerabilities.
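The audit itself doesn't have to be elaborate to catch the obvious cases. A sketch of the gate, with a hand-maintained local advisory list; in practice you'd query an advisory database such as OSV or run a scanner like pip-audit instead (both package names below are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Advisory:
    package: str
    reason: str  # e.g. "deprecated", "known vulnerability"

# Illustrative local list; a real audit would pull from an advisory
# database (e.g. OSV) rather than hard-code entries.
ADVISORIES = [
    Advisory("olddateutil", "deprecated, unmaintained"),      # hypothetical package
    Advisory("fastyaml", "known vulnerability in unsafe load"),  # hypothetical package
]

def audit_dependency(name: str) -> list[str]:
    """Return reasons to reject `name`; an empty list means the
    dependency is cleared to proceed to human approval."""
    return [a.reason for a in ADVISORIES if a.package == name]
```

The key design choice is that the audit runs before approval, not after merge: an AI-suggested dependency never enters the lockfile unvetted.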
The strategic question:
Is “sort of trust” acceptable when building production systems?
In financial services, where I work, compliance frameworks explicitly require trust in our development process. I can’t tell auditors “we sort of trust the code.”
But in other industries, maybe the risk tolerance is different? Maybe “move fast with sort of trusted AI” is an acceptable trade-off?
The long-term concern:
Does trust improve as tools mature, or does dependency deepen without trust improving?
If developers use AI for 18 months and still only “sort of trust” it, we’re building production systems on a foundation of uncertainty. That compounds over time.
Michelle’s point about skill debt applies here: we won’t know we have a trust problem until something breaks badly. And by then, we might have a codebase full of AI-generated code that nobody fully understands or trusts.
The uncomfortable question:
If we don’t trust the tools, why are we trusting the output?
Are we deferring a crisis, or is this the new normal? Is “sort of trust” enough?