I had a weird moment of self-awareness today. I was working through a gnarly refactoring task, and I realized I’d invoked GitHub Copilot at least 10 times in the past hour. Tab-complete, accept suggestion, move on. Rinse and repeat.
Then I caught myself second-guessing literally every suggestion it made. Reading the generated code line by line. Running tests. Checking edge cases. Basically treating it like code from an intern I didn’t quite trust yet.
Which got me thinking: If I don’t trust it, why do I keep using it?
The Numbers That Made Me Think
I came across some recent data that really crystallized this cognitive dissonance:
- 91% of engineering organizations now use AI coding tools
- But only 33% of developers actually trust the output
- Meanwhile, 46% actively distrust it
- Yet 51% of us use these tools daily
We’re in this weird paradox where AI coding assistants have become part of our daily workflow, but we fundamentally don’t trust what they produce. That’s… not how we’ve adopted other development tools, right?
The “Close But Not Quite Right” Problem
Here’s what really resonates with my experience: 66% of developers say they struggle with AI solutions that are close, but ultimately miss the mark.
This is the insidious part. It’s not like AI generates garbage code that’s obviously wrong. It generates code that looks professional, that seems reasonable, that passes a casual glance. But then you dig deeper and realize it’s missing a critical edge case, or it’s using a deprecated API, or it’s technically correct but architecturally wrong for your codebase.
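To make that concrete, here’s a made-up example of the shape of the problem (not actual Copilot output): a helper that reads perfectly fine in a diff until you think about time zones and partial days.

```python
from datetime import datetime

# Hypothetical suggestion for "days until a deadline". Looks reasonable, but:
#   - datetime.utcnow() is deprecated as of Python 3.12 and returns a *naive*
#     datetime, so subtracting it from a timezone-aware deadline raises TypeError
#   - .days truncates, so a deadline 23 hours from now reports as 0 days left
def days_until(deadline: datetime) -> int:
    return (deadline - datetime.utcnow()).days
```

Nothing about that looks wrong at a casual glance. You only catch it if you already know the quirks it trips over.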
45% of developers report that debugging AI-generated code takes longer than just writing it themselves. I felt that in my bones this week.
How This Is Different
Think about how we adopted other developer tools:
- Stack Overflow: We learned to critically evaluate answers, check dates, understand context. We trusted the voting system and our own judgment.
- IDEs and autocomplete: Built up over decades, deterministic, predictable. IntelliSense suggests method names - we trust it because it’s reading our actual codebase.
- Linters and static analysis: Explicitly designed to catch mistakes. We trust them because they’re paranoid.
But AI coding tools are different. They’re probabilistic, not deterministic. They’re trained on massive amounts of code - both good and bad. They don’t understand our specific codebase context. They confidently generate code that can be subtly wrong in hard-to-detect ways.
My Working Theory
I think we’ve unconsciously outsourced the “easy stuff” to AI while keeping the verification burden entirely on ourselves.
AI is great at:
- Boilerplate code
- Common patterns we’ve seen a million times
- Converting pseudocode to actual code
- Writing tests for straightforward functions
But we still own:
- Understanding if the generated code is correct
- Verifying it fits our architecture
- Ensuring it handles edge cases
- Checking for security vulnerabilities
- Making sure it’s maintainable
So we’re getting speed on the generation side, but we’re paying for it with increased cognitive load on the verification side. And I’m not sure that trade-off actually makes us faster overall.
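Here’s what that split looks like in practice for me (another made-up example, not from a real PR): the class itself is the generation win; the comments underneath are the verification work that stays on my plate.

```python
from dataclasses import dataclass

# Generation side: boilerplate I'd happily accept as a first draft.
@dataclass
class Invoice:
    id: str
    amount_cents: int
    currency: str

    @classmethod
    def from_dict(cls, data: dict) -> "Invoice":
        return cls(
            id=data["id"],
            amount_cents=data["amount_cents"],
            currency=data["currency"],
        )

# Verification side: everything I still have to answer before merging.
#   - What happens when a key is missing, or amount_cents arrives as a string?
#   - Do these field names match what our billing service actually sends?
#   - Should this be a plain dataclass at all, or does the codebase standardize
#     on something else for payloads like this?
```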
The Question I Keep Coming Back To
Is this sustainable?
More importantly: What happens when junior developers grow up in this environment? If you learn to code by accepting AI suggestions and iterating based on test failures, do you develop the same mental models as someone who learned by writing code from scratch?
I’m genuinely curious about this. I came up writing code character by character, making mistakes, debugging them, learning patterns through repetition. Junior devs today are coming up in a world where the first draft is always AI-generated.
Are we creating a generation of engineers who are great at prompt engineering and verification but less strong at creative problem-solving and architectural thinking? Or is this just the natural evolution of the craft, and my concerns are the same ones assembly programmers had about high-level languages like COBOL?
How I’m Handling It (For Now)
My current approach is pretty ad-hoc:
- Use AI for obvious boilerplate and well-established patterns
- Write security-critical and architecturally important code myself
- Treat all AI suggestions as “code from an external library I’m vetting”
- Run comprehensive tests on anything AI-generated (a sketch of what that looks like follows this list)
- Do extra-thorough code review on my own AI-assisted PRs
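For something like the hypothetical days_until() from earlier, “comprehensive tests” mostly means writing down the edge cases the suggestion never mentioned. A rough sketch, assuming the function landed in a deadlines module:

```python
from datetime import datetime, timedelta, timezone

from deadlines import days_until  # hypothetical module where the suggestion landed

# Both tests fail against the generated version, which is the point:
# they encode the edge cases I had to think of myself.

def test_accepts_timezone_aware_deadlines():
    deadline = datetime.now(timezone.utc) + timedelta(days=3)
    assert days_until(deadline) >= 2  # the naive utcnow() version raises TypeError

def test_past_deadline_is_negative():
    deadline = datetime.now(timezone.utc) - timedelta(days=2)
    assert days_until(deadline) < 0
```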
But I feel like I’m making this up as I go along. I’d love to hear how others are thinking about this.
What’s your relationship with AI coding tools? Do you trust them? Should we?