Vibe Coding Considered Harmful: When AI-Assisted Speed Kills Software Quality
Andrej Karpathy coined "vibe coding" in early 2025 to describe a style of programming where you "fully give into the vibes, embrace exponentials, and forget that the code even exists." You describe what you want in natural language, the AI generates it, and you ship. It feels like a superpower. Within a year, the data started telling a different story.
A METR randomized controlled trial found that experienced open-source developers were 19% slower when using AI coding tools — despite predicting they'd be 24% faster, and still believing afterward they'd been 20% faster. A CodeRabbit analysis of 470 GitHub pull requests found AI co-authored code contained 1.7x more major issues than human-written code. And an Anthropic study of 52 engineers showed AI-assisted developers scored 17% lower on comprehension tests of their own codebases.
The dopamine loop of instant code generation is creating a new category of technical debt that doesn't show up in your sprint retro. Here's why it matters, and what to do about it.
Comprehension Debt: The Debt That Doesn't Announce Itself
Technical debt is familiar. You cut a corner, you know you cut it, and eventually it slows you down. Comprehension debt is different — it's the growing gap between how much code exists in your system and how much of it any human being genuinely understands.
Addy Osmani describes this as the hidden cost of AI-generated code: codebases appear healthy while understanding quietly deteriorates. A student team discovered in week seven of a project that no one could explain why any design decisions had been made or how different parts of the system were supposed to work together. Surface-level code quality masked systemic misunderstanding.
The mechanism is straightforward. AI generates code faster than humans can evaluate it, inverting traditional review dynamics. A junior engineer can now generate code faster than a senior engineer can critically audit it. This removes the quality gate that once made review meaningful.
The numbers support this. Developers who used AI primarily for code delegation — "write this for me" — scored below 40% on comprehension tests. Those who used AI for conceptual inquiry — "explain how this works" — scored above 65%. Same tools, radically different outcomes, depending entirely on how the developer related to the generated code.
The Productivity Paradox Nobody Wants to Hear
The METR study deserves close attention because it contradicts the dominant narrative about AI coding tools. Sixteen experienced developers, each with years of contributions to large open-source projects (averaging 22,000+ stars, 1M+ lines of code), completed 246 tasks randomly assigned to allow or disallow AI tools.
When AI was allowed, developers spent less time actively coding and reading code. Instead, they spent time prompting, waiting for AI output, and reviewing suggestions — accepting less than 44% of what the AI generated. Seventy-five percent reported reading every line of AI output, and 56% made major modifications to clean it up.
The researchers are careful to note that this slowdown was specific to experienced developers working in mature codebases they already knew well. For greenfield projects or unfamiliar codebases, the dynamics may differ. But that caveat is precisely the point: the productivity gains from AI coding tools are real but narrower than the marketing suggests, and the contexts where they help most — unfamiliar code, boilerplate, scaffolding — are also the contexts where comprehension debt accumulates fastest.
The Security Time Bomb
Beyond productivity, vibe coding introduces concrete security risks that compound over time.
Researchers found that 170 out of 1,645 Lovable-generated applications — 10.3% — had critical row-level security flaws in their Supabase configurations. AI-generated code shows 2.74x higher rates of security vulnerabilities compared to human-written code, with misconfiguration errors 75% more common.
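The flaw class is simple to show in miniature. Below is a hedged Python sketch — all names (`RECORDS`, `fetch_note_unscoped`, `fetch_note_scoped`) are hypothetical, and real row-level security is enforced in the database via Postgres policies rather than application code — contrasting a plausible-looking unscoped read with the ownership check an RLS policy would enforce:

```python
# A minimal sketch of the flaw class behind missing row-level security.
# All names here are hypothetical; in Supabase the real fix is a Postgres
# RLS policy on the table, but the logic error is the same at any layer.

RECORDS = {
    1: {"owner": "alice", "body": "alice's note"},
    2: {"owner": "bob", "body": "bob's note"},
}

def fetch_note_unscoped(note_id):
    """Plausible-looking generated code: returns any row by id,
    never asking whether the caller is allowed to see it."""
    return RECORDS.get(note_id)

def fetch_note_scoped(note_id, current_user):
    """The repaired version: every read is scoped to the requesting
    user, which is what a row-level security policy enforces."""
    row = RECORDS.get(note_id)
    if row is None or row["owner"] != current_user:
        return None
    return row

# bob can read alice's note through the unscoped path...
print(fetch_note_unscoped(1)["owner"])   # → alice
# ...but not through the scoped one.
print(fetch_note_scoped(1, "bob"))       # → None
```

The unscoped version passes a linter, looks idiomatic, and works in every demo where the developer tests with their own account — which is exactly why a "describe, generate, ship" loop never catches it.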
Simon Willison, Django's co-creator, invoked the 1986 Challenger disaster as an analogy for the catastrophic failure waiting to happen when "some core component written by AI wasn't properly understood or checked." David Mytton, CEO of Arcjet, draws the line clearly: AI should implement battle-tested security libraries, never invent security from scratch.
The problem isn't that AI writes insecure code. The problem is that vibe coding's workflow specifically discourages the kind of careful review that catches security issues. When you're in the flow of "describe, generate, ship," the incentive structure actively works against stopping to threat-model what just appeared on your screen.
The Skills Erosion Spiral
Traditional learning involves hitting a problem, struggling with it, and building intuition from the struggle. Vibe coding replaces that cycle with "hit a problem, throw it at an AI, get a working solution, ship it, repeat tomorrow with the same gap in understanding."
Junior developers get hit hardest. They lack the intuition, earned through manual mistakes, that is needed to detect when an LLM drifts into incorrect logic. But experienced developers aren't immune: Anthropic's research shows engineers becoming full-stack capable faster while voicing growing concern about the atrophy of their deeper skills.
The result is a new class of vulnerable developer who can generate code but can't understand, debug, or maintain it. The code churn data confirms this: churn is up 41%, duplication has increased fourfold, and careful refactoring has collapsed from 25% of changed lines in 2021 to under 10% by 2024.
Forrester predicts that by 2026, 75% of enterprises will face moderate to high severity technical debt directly attributable to AI-driven rapid development. That's not a prediction about a distant future. That's now.
The Refactoring Crisis
When nobody can explain why the code works, refactoring becomes impossible. Refactoring requires a mental model of the system — understanding not just what each piece does, but why it exists, what invariants it maintains, and what breaks if you change it.
Vibe-coded systems resist refactoring in a way that manually written legacy code doesn't. With legacy code, someone understood it once — there are comments, commit messages, design documents, or at minimum a developer who remembers the constraints. With vibe-coded systems, the understanding never existed in the first place. The code was generated from a high-level description, and the mapping between intent and implementation was never held in any human mind.
This creates a paradox: the code is often cleaner on the surface than legacy code, but harder to change safely. It passes linters, follows naming conventions, and looks idiomatic. But it's a house of cards because the structural decisions were made by a model optimizing for plausibility, not by an engineer reasoning about the domain.
Teams discover this when they need to adapt the system to a new requirement. The modification looks straightforward. But every change triggers unexpected failures because the AI's implementation choices were reasonable-looking but arbitrary — there's no underlying design rationale to guide the modification.
Deliberate Practice Patterns for the AI Era
The answer isn't to stop using AI coding tools. The answer is to stop using them as a replacement for understanding. Here are the patterns that work.
Treat understanding as the deliverable, not the code. Before accepting any AI-generated code, you should be able to explain every structural decision to a colleague. If you can't, you haven't finished the task — you've just created comprehension debt.
Use AI for exploration, not delegation. The data is clear: developers who ask "explain how this works" outperform those who say "write this for me." Use AI to explore solution spaces, understand tradeoffs, and learn unfamiliar APIs. Then write the code yourself, informed by what you learned.
Enforce the review asymmetry. A junior engineer should never review AI-generated code alone. The speed inversion — where generation outpaces review — means you need senior engineers in the loop specifically because AI has made the generation step trivially fast.
Maintain a design rationale log. For every significant AI-generated component, document why this approach was chosen, what alternatives were considered, and what constraints it satisfies. This is the context that vibe coding destroys, and you need to reconstruct it deliberately.
Set comprehension checkpoints. At regular intervals, pick a random module and have the responsible developer explain it from memory. Not "read the code aloud" — explain the design intent, the invariants, the failure modes. If they can't, that's a signal that comprehension debt is accumulating.
Limit the blast radius. David Mytton's framework is practical: vibe coding is acceptable for scaffolding around validated components, small reviewable changes, and throwaway prototypes. It's unacceptable for novel security implementations, entire production codebases, and anything the developer can't verify independently.
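One concrete way to run the comprehension-checkpoint practice is to make the module selection deterministic per review period, so the pick is unpredictable in advance but reproducible by everyone on the team. The sketch below is an assumption of mine, not an established tool; the function name and period format are illustrative.

```python
# Hypothetical helper for comprehension checkpoints: hash the review
# period label to pick one module, so the choice can't be gamed ahead
# of time but every teammate who runs this gets the same answer.
import hashlib

def checkpoint_module(modules, period):
    """Pick one module for the given period (e.g. '2026-W07').

    Sorting first makes the result independent of input order;
    hashing the period spreads picks roughly evenly over time.
    """
    if not modules:
        raise ValueError("no modules to choose from")
    digest = hashlib.sha256(period.encode()).hexdigest()
    return sorted(modules)[int(digest, 16) % len(modules)]

print(checkpoint_module(["billing", "auth", "search", "exports"], "2026-W07"))
```

The chosen module's owner then explains its design intent, invariants, and failure modes from memory — not from the screen.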
The Coming Reckoning
Forty-one percent of all new code is now AI-generated. Stack Overflow's 2026 developer survey found 76% of developers using AI coding tools reported generating code they didn't fully understand at least some of the time. Unresolved technical debt in repositories climbed from a few hundred issues in early 2025 to over 110,000 by February 2026.
The explosion hasn't happened yet. But the conditions are in place: growing volumes of production code that no human fully understands, maintained by developers whose debugging skills are atrophying, in organizations that measure "lines shipped" instead of "systems understood."
The teams that will survive this recognize a fundamental truth: the comprehension work is the job. The code is just the artifact. AI can generate artifacts faster than ever. But if no one understands them, speed is just a more efficient way to create problems you can't solve.
- https://addyosmani.com/blog/comprehension-debt/
- https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
- https://thenewstack.io/vibe-coding-could-cause-catastrophic-explosions-in-2026/
- https://www.hungyichen.com/en/insights/vibe-coding-software-engineering-crisis
- https://arxiv.org/html/2603.28592
- https://www.softwareseni.com/the-evidence-against-vibe-coding-what-research-reveals-about-ai-code-quality/
