Coming at this from the design side, and honestly—this whole thread is validating something I’ve been feeling for months but couldn’t quite articulate.
More Options ≠ Better Outcomes
Here’s what I’m seeing: AI made prototyping insanely fast. Our engineering team can now generate 3-4 implementation approaches for every feature request in the time it used to take to build one.
Sounds great, right?
Except now our product/design/eng alignment meetings have doubled in frequency and tripled in duration.
Instead of spending 30 minutes reviewing one well-considered implementation, we’re spending 90 minutes debating the merits of four AI-generated approaches—most of which look technically sound but miss the actual user need.
The Collaboration Tax
Last month, an engineer came to design review with three different navigation implementations, all AI-generated in about an hour. Technically, they all worked. But two of them completely broke our accessibility standards, and all three ignored the mental model we’d spent weeks researching with users.
When I asked why he built three implementations instead of talking to design first, his response was heartbreaking: “It was faster to just generate them and see which one you liked.”
We’ve optimized for generation speed at the cost of collaboration quality.
The “Almost Right” Problem
This connects to what Alex mentioned about the 66% of developers frustrated with AI code being “almost right, but not quite.”
I see this constantly with design too. AI can generate a component that’s visually pixel-perfect but completely ignores all of the following (rough sketch after the list):
- Accessibility (keyboard navigation, screen readers, color contrast)
- User mental models (technically correct but confusing UX)
- Edge cases (empty states, error states, loading states)
- Responsive behavior beyond basic breakpoints
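To make the edge-case point concrete, here’s a minimal sketch assuming a React/TypeScript stack; the `SearchResults` component and its props are made up for illustration. The happy-path rendering is the part AI output usually nails. The loading, error, and empty states, plus the basic ARIA semantics, are the parts that tend to get skipped:

```tsx
// Hypothetical example: SearchResults, Result, and the props are invented for illustration.
import React from "react";

type Result = { id: string; label: string };

type Props = {
  results: Result[];
  loading: boolean;
  error?: string;
};

// A "pixel-perfect" generated version typically renders only the list below.
// The extra branches cover the states users actually hit.
export function SearchResults({ results, loading, error }: Props) {
  if (loading) {
    // Announce the busy state to assistive tech instead of showing a bare spinner.
    return <p role="status" aria-live="polite">Loading results…</p>;
  }
  if (error) {
    // role="alert" makes screen readers announce the failure immediately.
    return <p role="alert">Something went wrong: {error}</p>;
  }
  if (results.length === 0) {
    // Empty state: tell the user what to do next rather than rendering nothing.
    return <p>No results yet. Try a broader search.</p>;
  }
  return (
    <ul aria-label="Search results">
      {results.map((r) => (
        <li key={r.id}>{r.label}</li>
      ))}
    </ul>
  );
}
```

None of that is hard to write; it just doesn’t show up unless someone is carrying the user context into the work.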
And here’s the kicker: reviewing AI-generated designs is harder than designing them myself, for the exact same reason Luis mentioned about code.
When I design something, I carry the context: why this button placement, why this color hierarchy, why this interaction pattern. When AI designs it, I have to reverse-engineer the decisions—except AI didn’t make decisions, it pattern-matched against training data.
The Illusion of Productivity
We’re shipping more features than ever. Our velocity dashboard looks amazing. Leadership is happy.
But when we ran our quarterly UX research, user satisfaction had dropped. NPS was down. Support tickets were up. Customers said the product “feels less polished” and “harder to use.”
Turns out, more output ≠ better outcomes. We were shipping faster, but we weren’t shipping better.
What Actually Matters?
I’ve been thinking a lot about what “productivity” actually means in creative work.
Is a designer productive if they generate 10 mockups in a day, but 9 of them are unusable?
Is an engineer productive if they write 500 lines of code that needs 6 hours of debugging?
Is a team productive if sprint velocity is up but customer satisfaction is down?
Maybe productivity isn’t about speed. Maybe it’s about decision quality under constraints.
The best work I’ve ever done wasn’t the fastest. It was the work where we took time to understand the problem deeply, considered trade-offs carefully, and made intentional decisions.
AI is incredible at generation. But it can’t do the thinking for us. And if we let speed become the proxy for productivity, we’re going to ship a lot of fast, mediocre work.
A Weird Analogy
This reminds me of when I founded my startup (which failed spectacularly, by the way; I wrote about those lessons elsewhere).
We had a moment where we could build features really fast using no-code tools. We shipped SO MUCH STUFF. Investors loved our velocity.
But we weren’t solving the right problems. We were just… shipping. Fast iteration without strategic direction is just expensive thrashing.
I feel like we’re making the same mistake now, but with AI instead of no-code.
Luis, to your question about metrics: I’d propose impact per feature instead of features per sprint.
Did this feature move a meaningful metric? Did users notice and appreciate it? Did it reduce support burden or increase engagement?
Because honestly, I’d rather ship 3 high-impact features that users love than 12 AI-generated features that technically work but nobody cares about.