CLAUDE.md as Codebase API: The Most Leveraged Documentation You'll Ever Write
Most teams treat their CLAUDE.md the way they treat their README: write it once, forget it exists, wonder why nothing works. But a CLAUDE.md isn't documentation. It's an API contract between your codebase and every AI agent that touches it. Get it right, and every AI-assisted commit follows your architecture. Get it wrong — or worse, let it rot — and you're actively making your agent dumber with every session.
The AGENTbench study tested 138 real-world coding tasks across 12 repositories and found that auto-generated context files actually decreased agent success rates compared to having no context file at all. Three months of accumulated instructions, half describing a codebase that had moved on, don't guide an agent. They mislead it.
The Instruction Budget Is Smaller Than You Think
Every AI coding agent has a finite capacity for following instructions. Research on frontier LLMs suggests they can follow roughly 150–200 instructions with reasonable consistency. That sounds generous until you realize that Claude Code's system prompt alone consumes around 50 of those slots. Cursor's agent infrastructure eats into the budget in a similar way. Your CLAUDE.md gets whatever's left.
This creates a hard constraint that most teams ignore. A 500-line CLAUDE.md filled with style guides, architecture decisions, debugging tips, and team conventions isn't thorough; it's noise. LLMs exhibit a "lost in the middle" effect, where instructions in the center of a long document are the most likely to be ignored. On top of that, as the instruction count grows, instruction-following quality degrades across all instructions, not just the newly added ones.
The practical ceiling is around 300 lines, and teams that keep their files under 60 lines report the most consistent agent behavior. The difference between a 60-line file and a 500-line file isn't that the longer one covers more — it's that the longer one covers nothing reliably.
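One way to keep the file honest is to check its length automatically. The sketch below is a minimal illustration, not an established tool: the function name `check_budget`, the `BudgetReport` type, and the choice to count only non-blank lines are all assumptions. The two thresholds are the rules of thumb quoted above.

```python
from dataclasses import dataclass

# Rules of thumb from the text above, not hard limits enforced by any tool.
SOFT_LIMIT = 60    # files under this size report the most consistent behavior
HARD_LIMIT = 300   # rough practical ceiling


@dataclass
class BudgetReport:
    lines: int    # non-blank lines in the context file
    status: str   # "ok", "warn", or "over"


def check_budget(text: str) -> BudgetReport:
    """Count non-blank lines and classify the file against both thresholds."""
    count = sum(1 for line in text.splitlines() if line.strip())
    if count > HARD_LIMIT:
        status = "over"
    elif count >= SOFT_LIMIT:
        status = "warn"
    else:
        status = "ok"
    return BudgetReport(lines=count, status=status)
```

Run as a pre-commit or CI step, a check like this turns "the file has grown too long" from something noticed months later into a failure at review time.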
What Belongs in the File (and What Doesn't)
The highest-leverage content follows a strict priority order:
Build and test commands come first. This is the single most valuable thing you can put in a context file. Not "run the tests" but yarn test --coverage or make test-api. An agent that knows the exact invocation can verify its own work. An agent told to "run the tests" will guess, and it will guess wrong.
Tech stack with versions comes next. Not "we use React" but "Next.js 15 (App Router), TypeScript 5.7 (strict mode), PostgreSQL 16 with Drizzle ORM." Specificity prevents the agent from generating code for the wrong framework version — a surprisingly common failure mode when the agent defaults to whatever was most prevalent in its training data.
Architecture constraints matter, but only the non-obvious ones. If your project uses a repository pattern with thin controllers, say so. If business logic must never live in route handlers, say so. The agent can infer a lot from reading your code, but it can't infer organizational decisions about where things should go.
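The priority order above can be made concrete by generating the file from structured data, so that commands always appear first and nothing else creeps in. This is a hedged sketch: `render_context_file` and the section titles ("Commands", "Stack", "Architecture") are hypothetical choices, not a format any tool requires. The sample values come from the examples in the text.

```python
from __future__ import annotations


def render_context_file(
    commands: dict[str, str],
    stack: list[str],
    constraints: list[str],
) -> str:
    """Render a small context file: commands, then stack, then architecture."""
    sections = [
        ("Commands", [f"{name}: {cmd}" for name, cmd in commands.items()]),
        ("Stack", stack),
        ("Architecture", constraints),
    ]
    lines: list[str] = []
    for title, items in sections:
        if not items:
            continue  # skip empty sections rather than emit bare headings
        lines.append(f"## {title}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_context_file(
        commands={"test": "yarn test --coverage", "api tests": "make test-api"},
        stack=[
            "Next.js 15 (App Router)",
            "TypeScript 5.7 (strict mode)",
            "PostgreSQL 16 with Drizzle ORM",
        ],
        constraints=[
            "Repository pattern with thin controllers",
            "Business logic must never live in route handlers",
        ],
    ))
```

The rendered file for this example is twelve lines long, which leaves most of the budget unspent.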
What doesn't belong:
- Style rules. Never send an LLM to do a linter's job. If you care about 4-space indentation or no arrow functions in React components, configure ESLint and Prettier. Spending precious instruction budget on formatting is wasteful.
- Things the agent already does correctly. If the agent already writes idiomatic TypeScript without being told, an instruction asking for it is dead weight. Convert it to a hook or delete it.
- Task-specific guidance. Instructions about a particular feature or migration belong in the prompt, not the permanent context file. Every session pays the token cost for every line in CLAUDE.md, whether it's relevant or not.
The File Hierarchy Nobody Uses
Most developers create a single root-level CLAUDE.md and call it done. But most major AI coding tools support a hierarchy that loads from broad to specific, with more specific instructions taking precedence over broader ones.
The loading order typically runs: enterprise policies first, then user-level preferences, then project root, then subdirectory-level files. This means you can have a root CLAUDE.md that covers project-wide conventions and a src/api/CLAUDE.md that adds API-specific constraints — like "all endpoints must validate input with Zod schemas" — without bloating the root file.
This hierarchy solves the monorepo problem elegantly. A shared services directory can enforce different patterns than a frontend application directory, all within the same repository. The constraint budget gets spent locally rather than globally.
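To see which files would apply to a given path, the project-level part of the hierarchy can be modeled as a walk from the repository root down to the target's directory. The sketch below is an illustrative model only: it is not how any particular tool implements loading, it ignores the enterprise and user layers, and `context_chain` and `demo` are hypothetical names.

```python
import tempfile
from pathlib import Path


def context_chain(repo_root: Path, target: Path, name: str = "CLAUDE.md") -> list[Path]:
    """List context files from the repo root down to the target's directory.

    The result is ordered broad to specific, so later entries are the ones
    that would take precedence. Raises ValueError if target is outside the repo.
    """
    repo_root = repo_root.resolve()
    target = target.resolve()
    start = target if target.is_dir() else target.parent
    rel = start.relative_to(repo_root)

    dirs = [repo_root]
    current = repo_root
    for part in rel.parts:
        current = current / part
        dirs.append(current)
    return [d / name for d in dirs if (d / name).is_file()]


def demo() -> list[str]:
    """Build a throwaway repo with a root file and an API-specific file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        (root / "src" / "api").mkdir(parents=True)
        (root / "CLAUDE.md").write_text("project-wide conventions\n")
        (root / "src" / "api" / "CLAUDE.md").write_text(
            "all endpoints must validate input with Zod schemas\n"
        )
        chain = context_chain(root, root / "src" / "api" / "users.ts")
        return [p.relative_to(root).as_posix() for p in chain]
```

In the demo, a file under src/api picks up both the root file and the API-specific one, while a file elsewhere in the repository would see only the root file.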
The emerging cross-tool architecture looks like this:
AGENTS.md (universal foundation, used by 60,000+ projects)
├── CLAUDE.md (Claude-specific additions only)
├── .github/copilot-instructions.md (Copilot-specific)
└── .cursor/rules/ (Cursor scoped rules)