Harness Engineering: The Discipline That Determines Whether Your AI Agents Actually Work

· 10 min read
Tian Pan
Software Engineer

Most teams running AI coding agents are optimizing the wrong variable. They obsess over model selection (Claude vs. GPT vs. Gemini) while treating the surrounding scaffolding as incidental plumbing. But benchmark data and production experience point the other way: the gap between a model that impresses in a demo and one that reliably ships production code comes almost entirely from the harness around it, not the model itself.

The formula is deceptively simple: Agent = Model + Harness. The harness is everything else — tool schemas, permission models, context lifecycle management, feedback loops, sandboxing, documentation infrastructure, architectural invariants. Get the harness wrong and even a frontier model produces hallucinated file paths, breaks its own conventions twenty turns into a session, and declares a feature done before writing a single test.
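To make the formula concrete, here is a minimal sketch of a harness wrapped around a model. It is an illustration under stated assumptions, not a real framework: `Harness`, `Action`, `Model`, `scripted_model`, and the tool names are all hypothetical, and the "model" is a fixed script standing in for an LLM call. It covers only three of the concerns listed above: a permission check, a guard against hallucinated file paths, and a feedback loop that refuses "done" until a test has been written.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Hypothetical types for illustration; none of these names come from a real library.
Action = Dict[str, str]                 # e.g. {"tool": "read_file", "path": "src/app.py"}
Model = Callable[[List[str]], Action]   # maps the transcript so far to the next action


@dataclass
class Harness:
    known_files: Set[str]               # files that actually exist in the workspace
    allowed_tools: Set[str]             # permission model: tools the agent may call
    max_turns: int = 10                 # hard cap so a confused agent cannot loop forever
    tests_written: bool = False
    transcript: List[str] = field(default_factory=list)

    def check(self, action: Action) -> str:
        """Validate one action and return feedback; 'ok' means it was accepted."""
        tool = action.get("tool", "")
        if tool not in self.allowed_tools:
            return f"rejected: tool '{tool}' is not permitted"
        if tool == "read_file" and action.get("path") not in self.known_files:
            return f"rejected: path '{action.get('path')}' does not exist"
        if tool == "write_test":
            self.tests_written = True
        if tool == "done" and not self.tests_written:
            return "rejected: cannot finish before writing a test"
        return "ok"

    def run(self, model: Model) -> List[str]:
        """Drive the model turn by turn, feeding each verdict back via the transcript."""
        for _ in range(self.max_turns):
            action = model(self.transcript)
            feedback = self.check(action)
            self.transcript.append(f"{action.get('tool')}: {feedback}")
            if action.get("tool") == "done" and feedback == "ok":
                break
        return self.transcript


def scripted_model(transcript: List[str]) -> Action:
    """Stand-in for an LLM: replays a fixed list of actions, one per turn."""
    script: List[Action] = [
        {"tool": "read_file", "path": "src/missing.py"},  # hallucinated path
        {"tool": "done"},                                  # premature "done"
        {"tool": "write_test"},
        {"tool": "done"},
    ]
    return script[min(len(transcript), len(script) - 1)]


harness = Harness(
    known_files={"src/app.py"},
    allowed_tools={"read_file", "write_test", "done"},
)
log = harness.run(scripted_model)
for line in log:
    print(line)
```

In this sketch the scripted model first asks for a file that does not exist and then tries to finish early. The harness rejects both and records the reason in the transcript. The run ends only once a test has been written and "done" is accepted. The model never changes between turns. All of the corrective behavior lives in the harness.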