Karpathy Says Vibe Coding Is Passé at Its One-Year Anniversary — "Agentic Engineering" Is the New Paradigm

On February 8, 2026, Andrej Karpathy marked the one-year anniversary of the term he coined — “vibe coding” — by declaring it already obsolete. His replacement term: agentic engineering, where developers write approximately 1% of the code themselves and orchestrate AI agents that complete the other 99%. The key shift, as Karpathy describes it, is that agents now complete 20+ autonomous actions before requiring human input, roughly double what was possible just six months ago.

This is a fundamental change in what it means to “use AI for coding.” Vibe coding was “type a prompt, get code, hope it works.” Agentic engineering is “describe the feature, review the agent’s plan, approve the approach, and supervise execution.” The developer’s role shifts from writer to orchestrator — less author, more editor-in-chief.

Real-World Evidence

The evidence is building that this shift is real, not just a thought leader’s prediction:

TELUS International created 13,000+ custom AI solutions using agentic engineering workflows. These aren’t experimental side projects — they’re production systems deployed across their operations.

Zapier hit 89% AI adoption internally, with 800+ agents handling everything from code generation to testing to documentation. Their engineering team reported that agents now handle the majority of their CI/CD pipeline interactions, freeing developers to focus on architectural decisions and business logic.

These numbers are impressive, but let me be clear: these are companies that have invested heavily in AI infrastructure. The average engineering team isn’t operating at this level yet.

What Makes Agentic Engineering Different

Here’s how I’d break down the key differences from vibe coding:

  1. Structured orchestration vs. freeform prompting. Agents follow defined workflows with checkpoints, not open-ended conversations. You define the steps: scaffold, implement, test, lint, document. The agent executes them in order, with gates between each step (see the sketch after this list).

  2. Multi-agent collaboration. Specialized agents for code generation, testing, security review, and documentation work together. Instead of one general-purpose chatbot trying to do everything, you have a pipeline of focused agents, each optimized for its task.

  3. Persistent context. Agents maintain knowledge about the codebase across sessions. This is the biggest technical leap — agents that understand your codebase’s architecture, conventions, and history, unlike the stateless chat interactions of early vibe coding.

  4. Measurable outputs. Agentic workflows have defined success criteria — tests pass, linting passes, security scan clears. Not “looks good to me” after a cursory glance.
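
To make points 1 and 4 concrete, here is a minimal Python sketch of a checkpointed pipeline. It is not any particular vendor's API: run_agent is a hypothetical stand-in for whatever coding agent you orchestrate, and the gate commands (compileall, pytest, ruff) are just examples of measurable pass/fail criteria.

```python
# A minimal sketch of structured orchestration: each step is followed by a
# gate with a measurable pass/fail criterion. `run_agent` and the gate
# commands are hypothetical placeholders, not any specific tool's API.
import subprocess
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[], None]   # what the agent is asked to do
    gate: Callable[[], bool]     # measurable success criterion


def run_agent(task: str) -> None:
    """Placeholder for a call to whatever coding agent you orchestrate."""
    print(f"[agent] {task}")


def passes(cmd: list[str]) -> bool:
    """A gate is just an exit-code check on a real tool."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


pipeline = [
    Step("scaffold",  lambda: run_agent("create module and test skeletons"),
         lambda: True),  # nothing to verify yet
    Step("implement", lambda: run_agent("fill in the implementation"),
         lambda: passes(["python", "-m", "compileall", "src"])),
    Step("test",      lambda: run_agent("write and run unit tests"),
         lambda: passes(["pytest", "-q"])),
    Step("lint",      lambda: run_agent("fix style issues"),
         lambda: passes(["ruff", "check", "src"])),
    Step("document",  lambda: run_agent("update docstrings and README"),
         lambda: True),
]


def orchestrate(steps: list[Step]) -> None:
    for step in steps:
        step.action()
        if not step.gate():
            # A failed gate stops the pipeline and hands control back to a human.
            raise RuntimeError(f"gate failed at step '{step.name}'")
        print(f"[gate] {step.name} passed")


if __name__ == "__main__":
    orchestrate(pipeline)
```

The point is that every handoff is an exit code rather than a judgment call, which is exactly what separates this from "looks good to me" vibe coding.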

My Honest Assessment

I’ll be transparent about my skepticism: the 1% figure is aspirational, not current reality. In practice, developers at my company write about 30-40% of the code themselves — the critical parts that require business logic understanding, architectural decisions, and edge case handling. AI agents handle boilerplate, tests, and straightforward implementations. That’s still a massive shift from two years ago, but it’s not “1% human code.”

The productivity gains are real but uneven. Greenfield CRUD applications? Agents are phenomenal. Complex distributed systems with intricate failure modes? Agents still need heavy human guidance. The 99/1 ratio might hold for simple applications, but for the kind of systems most of us build professionally, 60/40 or 70/30 is more realistic.

The Trust Problem

The “46% distrust” stat is the elephant in the room. Nearly half of developers don’t trust AI output, and these are the people being asked to orchestrate agents they don’t trust. That’s like asking someone who doesn’t trust autopilot to supervise a fleet of self-driving cars.

Forced adoption without trust leads to two outcomes: developers secretly rewriting AI output (which reduces the productivity gains to near zero) or developers rubber-stamping AI output (which reduces code quality and introduces bugs). Neither outcome is what engineering leaders want, but both are happening at companies I talk to.

The path forward isn’t forcing adoption — it’s building trust through transparency. Agents that explain their reasoning, show their work, and flag areas of uncertainty will earn developer trust faster than agents that just output code.


Where are you on the vibe coding → agentic engineering spectrum? And honestly — do you trust your agents enough to let them work autonomously for 20+ steps without checking in?

I’ve been using agentic workflows for 3 months now — Cursor Composer combined with custom MCP servers — and the experience is genuinely different from vibe coding. The agent handles a 10-step task autonomously: creates the file, writes the implementation, adds tests, runs them, fixes failures, updates imports — without me touching anything. It’s impressive to watch.
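
For anyone who hasn't watched one of these runs, here is roughly the shape of that loop as a sketch. This is not Cursor's or MCP's actual API: agent_edit and the pytest gate are hypothetical placeholders, and the action budget marks the point where the agent stops and asks for human input.

```python
# A rough sketch of the loop behind "20+ autonomous actions": the agent keeps
# editing and re-running tests until they pass or it exhausts its budget, at
# which point it hands control back to a human. Hypothetical placeholders only.
import subprocess

MAX_ACTIONS = 20  # autonomous steps allowed before requiring human input


def agent_edit(instruction: str) -> None:
    """Placeholder: send an instruction to your coding agent."""
    print(f"[agent] {instruction}")


def tests_pass() -> bool:
    """Measurable stopping criterion: the test suite's exit code."""
    return subprocess.run(["pytest", "-q"], capture_output=True).returncode == 0


def run_task(task: str) -> bool:
    actions = 0
    agent_edit(f"implement: {task}")
    actions += 1
    agent_edit("add unit tests for the new code")
    actions += 1
    while not tests_pass():
        if actions >= MAX_ACTIONS:
            print("budget exhausted: handing back to the human reviewer")
            return False
        agent_edit("read the failing test output and fix the implementation")
        actions += 1
    print(f"done in {actions} autonomous actions")
    return True


if __name__ == "__main__":
    run_task("add pagination to the orders endpoint")  # any well-scoped task
```

The budget is what keeps "autonomous" bounded and reviewable: the agent either satisfies the test gate or hands control back.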

But here’s the paradox that nobody talks about: I spend the saved coding time reviewing the output, and the review takes almost as long as writing it myself would have.

The difference is cognitive, not temporal. Review is less creative and more analytical, which means I can do it while mentally fatigued in a way I can’t with greenfield coding. At 4pm when my creative energy is shot, I can still effectively review agent output. I couldn’t write quality code at that point. So the agents effectively extend my productive hours, even if the per-task time savings are modest.

Net productivity gain? Maybe 30%. Not the 5x or 10x that the hype cycle promises, but meaningful and real. The honest question is whether 30% justifies the tooling costs, learning curve, and organizational change management. For my team, the answer is yes — but it’s a closer call than the marketing suggests.

The 20+ autonomous actions claim checks out in my experience, but with a caveat: it’s 20+ actions on well-defined tasks. Ask an agent to “refactor this module for better testability” and you’ll get solid autonomous execution. Ask it to “figure out why this distributed cache is occasionally returning stale data” and you’re back to hand-holding after step 3.

From a product planning standpoint, the “agentic engineering” label is genuinely useful because it changes expectations in a way that “vibe coding” never did.

“Vibe coding” implied that any non-technical person could build software, which led to unrealistic product roadmaps and scope creep. I had stakeholders citing vibe coding articles to justify compressing timelines — “if AI can write code, why does this feature take 6 weeks?” The framing invited non-engineers to underestimate the complexity of software development.

“Agentic engineering” re-centers the developer as the orchestrator — someone with technical judgment who directs AI tools. This is a healthier framing for product planning because it acknowledges that AI amplifies developers rather than replacing them.

I’ve made a deliberate language shift in my product org. I’ve stopped using “AI will build it faster” in roadmap conversations and started using “developers will deliver more with AI assistance.” It’s a subtle reframing, but it sets realistic expectations with executives and prevents the “why can’t we ship this tomorrow?” conversations.

The other benefit of the agentic engineering framing: it makes the investment in developer tooling legible to non-technical leadership. “We’re buying AI coding tools” sounds like a cost. “We’re deploying engineering agents that multiply developer output” sounds like a strategic investment. Same spend, different narrative, dramatically different budget approval experience.

Michelle’s point about 30-40% human-written code matches what I see from the product side. The features that ship fastest are the ones with clear specs and well-defined patterns — exactly the type of work agents excel at. The features that slip are the ones requiring novel architecture or ambiguous requirements — exactly where human judgment is irreplaceable.

The TELUS 13,000 AI solutions number needs context, and I say this as someone who manages engineering teams that are actively deploying agentic workflows.

I’ve spoken with people at companies touting similar numbers, and the reality is that most “AI solutions” are simple automations — Slack bots, data formatting scripts, report generators, email parsers, form processors. The headline sounds like 13,000 complex engineering projects, but many are things that would have been Zapier workflows or Python scripts before the AI branding landed. I’m not dismissing the value — these automations genuinely help and they’re being built faster than before — but categorizing them all as “agentic engineering” inflates the perception of what agents can do.

Similarly, Zapier’s 89% AI adoption stat needs unpacking. 89% of employees using AI tools is different from 89% of engineering work being done by AI tools. If every developer uses Copilot for autocomplete, that’s high adoption but modest impact. The metric that matters is: what percentage of shipped code was generated by agents without significant human modification? Nobody is publishing that number, and I suspect it’s because it’s less impressive than the adoption figures.

For complex software systems — the kind with distributed state, concurrent access patterns, subtle race conditions, and nuanced business rules — we’re still firmly in the “human writes most of the important code” phase. My teams use agents extensively for:

  • Boilerplate and scaffolding (~90% agent-generated)
  • Unit tests (~70% agent-generated)
  • Documentation (~80% agent-generated)
  • Straightforward CRUD endpoints (~60% agent-generated)

But for the critical path — the code that handles failure modes, manages distributed transactions, and implements core business logic — it’s maybe 20% agent-assisted. And “assisted” means “the agent wrote a first draft that a human substantially rewrote.”

I don’t see this changing dramatically in 2026. The bottleneck isn’t the agent’s coding ability — it’s the agent’s understanding of why the code needs to work a certain way. Business context, organizational constraints, regulatory requirements, and cross-system dependencies are things agents can’t learn from codebases alone.

Karpathy’s framing is aspirational and directionally correct, but let’s not mistake the vision for the current reality.