66% Say AI Code Is “Almost Right, But Not Quite.” We’re Spending More Time Debugging Than We Saved Writing. Net Negative?
So I’ve been using AI pretty heavily for generating design system components lately, and I’m experiencing this really specific type of frustration that I can’t shake.
Here’s what keeps happening: I describe what I need—“create a responsive card component with optional header, body, and footer slots”—and boom, 30 seconds later I have working code. It feels like magic. Until I actually try to use it.
The layout breaks at 768px. The accessibility attributes are missing. The prop types are inconsistent with our existing patterns. The spacing tokens are hardcoded instead of using our design system variables. And now I’m spending the next 2 hours debugging and refactoring code that was supposed to save me time.
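To make the hardcoded-spacing complaint concrete, here is a minimal sketch of the kind of check I end up doing by eye. It assumes spacing tokens are exposed as CSS custom properties such as `var(--space-md)`. That naming and the `findHardcodedSpacing` helper are hypothetical, not our actual tooling, and the regex is a rough heuristic rather than a real CSS parser.

```typescript
// Hypothetical check: flag spacing declarations that use raw lengths
// instead of design tokens like var(--space-md) (token naming is assumed).
const SPACING_PROPS = /^(margin|padding|gap)(-[a-z]+)*$/;
const RAW_LENGTH = /\b\d+(\.\d+)?(px|rem|em)\b/;

function findHardcodedSpacing(css: string): string[] {
  const findings: string[] = [];
  // Rough "property: value;" matcher; good enough for a sketch.
  const declaration = /([a-z-]+)\s*:\s*([^;{}]+);/g;
  let match: RegExpExecArray | null;
  while ((match = declaration.exec(css)) !== null) {
    const prop = match[1];
    const value = match[2].trim();
    if (SPACING_PROPS.test(prop) && RAW_LENGTH.test(value)) {
      findings.push(`${prop}: ${value}`);
    }
  }
  return findings;
}

// Example: AI-style output mixing raw values with one token.
const generatedCss =
  ".card { padding: 16px; gap: var(--space-md); margin-top: 0.5rem; color: red; }";
console.log(findHardcodedSpacing(generatedCss));
// → ["padding: 16px", "margin-top: 0.5rem"]
```

A check like this only catches one of the four problems above. The breakpoint, accessibility, and prop-type issues still need a human who knows the system.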
The “70% Problem”
I recently read that 66% of developers name code that is “almost right, but not quite” as their biggest frustration with AI coding assistants (Stack Overflow survey). This is exactly what I’m experiencing.
AI seems to get you about 70% of the way there really fast. But that final 30%? That’s where all the actual work lives. And here’s the thing that makes it even more frustrating: debugging someone else’s logic is cognitively harder than creating your own.
When I write code (or design components), I understand the tradeoffs I made. I know why I chose that approach. But when AI generates code, I’m reverse-engineering decisions made by… what, probability distributions? I’m debugging code I didn’t write, with patterns I didn’t choose, solving a problem the AI might have misunderstood.
Are We Just Trading Less Typing for More Reading?
I saw this quote from the Cerbos engineering team that really hit home: “You’re not actually saving time—you’re just trading less typing for more time reading and untangling code.”
That’s it. That’s the feeling.
And I’m not even a “real” engineer! I work on design systems and write intermediate-level HTML/CSS/React. If I’m feeling this pain in my relatively constrained domain, what are teams experiencing with complex backend systems or security-critical code?
When AI Actually Works
To be fair, AI isn’t useless. It’s genuinely fantastic for:
- Boilerplate and repetitive patterns (migrations, test setup, data transformations)
- Exploration and learning (seeing different approaches, understanding unfamiliar APIs)
- Cognitive toil (the stuff you know how to do but just don’t want to type out)
Where it falls apart for me:
- Architectural decisions (AI doesn’t understand our specific constraints)
- Integration with existing systems (it doesn’t know our design system conventions)
- Edge cases and accessibility (it optimizes for the happy path)
The Question I Can’t Answer
So here’s what I’m struggling with: How do you decide when AI is worth it vs. just doing it yourself?
Because right now, my decision-making is terrible. I use AI because it’s there and I feel like I should be “leveraging new tools.” But I haven’t developed good intuition for when it actually helps vs. when it creates more work.
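One way I could make this less hand-wavy is a rough break-even estimate. This is only a sketch: the `netMinutesSaved` helper is hypothetical, and the 90-minute "by hand" figure is a guess. Only the 30 seconds of generation and the 2 hours of fixing come from my card example.

```typescript
// Hypothetical helper: a positive result means AI saved time,
// a negative result means it cost more than writing the code by hand.
function netMinutesSaved(
  manualMinutes: number,   // estimated time to write it yourself
  generateMinutes: number, // time to prompt and get the output
  fixMinutes: number       // time spent debugging and refactoring the output
): number {
  return manualMinutes - (generateMinutes + fixMinutes);
}

// Card example: 0.5 min to generate, 120 min to fix, 90 min by hand (assumed).
const cardResult = netMinutesSaved(90, 0.5, 120);
console.log(cardResult); // → -30.5
```

The hard part is that the fix time is only known afterwards, so this is more useful for reviewing past tasks than for predicting the next one.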
For my engineering friends here: Are you experiencing this? Have you found rules of thumb for when to use AI vs. when to just write the code yourself? Or am I just using these tools wrong?
And the bigger question: If we’re all spending more time debugging than we saved writing… is this actually a net negative? Or is this just the learning curve before we figure out how to use these tools properly?