
Your CLAUDE.md Is Probably Too Long (And That's Why It's Not Working)

10 min read
Tian Pan
Software Engineer

Here's a pattern that plays out constantly on teams adopting AI coding agents: Claude disobeys a rule, so the developer adds a clearer version of it to CLAUDE.md. Claude disobeys a different rule, so they add that one too. After a few weeks, the file is 400 lines long and Claude is ignoring more rules than ever. The solution made the problem worse.

This happens because of a fundamental property of instruction files that most developers never internalize: past a certain size, adding more instructions causes the model to follow fewer of them. Getting instruction files right is less about completeness and more about ruthless selection — knowing what to include, what to cut, and how to architect the rest.


Why Instruction Files Exist at All

AI coding agents like Claude Code are stateless. Each session begins with no memory of prior sessions, no knowledge of your project's conventions, no awareness of your team's preferences. The CLAUDE.md file (or AGENTS.md in agent systems built around other tools) is the mechanism that bridges this gap — it's read at the start of every conversation, giving the model persistent context it couldn't infer from code alone.

This sounds straightforward, but there's a critical constraint built into how agents process these files. Claude Code's system prompt already contains roughly 50 built-in instructions about how the agent should behave. Research on frontier models suggests they reliably follow somewhere between 150 and 200 instructions before compliance starts degrading. Simple arithmetic: if the system uses 50 by default, you have budget for about 100-150 more before you start losing adherence.

Most CLAUDE.md files in the wild blow past this budget easily. A database schema explanation, a list of every environment variable, the company's general coding philosophy, rules that apply to one service but not another — it accumulates fast.

There's a second constraint, less obvious but equally important. Claude Code ships with a system reminder that explicitly tells the model: "this context may or may not be relevant to your tasks. You should not respond to this context unless it is highly relevant to your task." This isn't a bug or an oversight. It's intentional — a model that treats every instruction as always applicable would be far less useful. But it means instructions that are only sometimes relevant will frequently be ignored, even when you want them applied.

The Instruction Overflow Problem

Context rot is a documented phenomenon: as a context window fills, model output quality measurably degrades. The effect kicks in well before hitting the technical token limit. For coding agents, the failure mode is particularly insidious — the model doesn't crash or error. It just starts deprioritizing instructions buried deeper in its context, producing code that looks plausible but subtly violates conventions you'd spent effort establishing.

If you've ever had Claude confidently use CommonJS require() despite a clear CLAUDE.md rule saying to use ES modules, you've experienced this. The rule was there. It was just crowded out.

The typical response — adding more rules, or adding emphasis to existing ones — makes things worse. Every instruction you add takes budget away from the instructions that were already there. A 500-line CLAUDE.md doesn't give the model 500 lines of guidance. It gives it 500 lines of noise from which it selects whatever it considers most relevant to the current task.

What Actually Belongs in an Instruction File

The test for whether something belongs in CLAUDE.md is strict: would removing this line cause Claude to make mistakes that it wouldn't make otherwise?

If the answer is "no" — if Claude would do the right thing anyway — the line should go. If the answer is "sometimes" — if the instruction only applies to certain tasks — it also shouldn't be in the root file.

What passes this test:

  • Commands the model can't discover from reading code: your specific test runner invocation, the exact build command for a monorepo, the port your dev server runs on
  • Code style rules that deviate from language defaults: if you use tabs in a language where spaces are idiomatic, say so; if you're on a standard style, skip it
  • Testing preferences: run a single test file, not the whole suite; prefer integration tests over mocks in this codebase
  • Repository conventions that aren't obvious from structure: branch naming, PR description requirements, commit message format
  • Known gotchas: required environment variables, non-obvious service dependencies, initialization steps that fail silently if skipped
  • Architectural decisions that diverge from frameworks: places where you've deliberately broken a convention the model would otherwise follow

What fails the test and should be removed:

  • Anything derivable from the code: if the model can read a file and infer it, don't explain it in CLAUDE.md
  • Standard language conventions: don't tell Claude to write readable code or use meaningful variable names
  • Code style rules enforced by a linter: "never send an LLM to do a linter's job" — if Prettier or ESLint catches it, remove it from the instruction file and run the tool as a hook instead
  • Task-specific context: database schema details are irrelevant when Claude is fixing a CSS bug
  • Explanations and tutorials: the instruction file is configuration, not documentation
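To make the linter-as-hook idea concrete: Claude Code supports lifecycle hooks configured in .claude/settings.json, which can run a formatter automatically after file edits. The sketch below assumes a PostToolUse hook and a Prettier setup; treat the exact field names as approximate and check the current hooks documentation before relying on them:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npx prettier --write ." }
        ]
      }
    ]
  }
}
```

With a hook like this, style drift is corrected deterministically, and the corresponding style rules can be deleted from the instruction file entirely.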

A well-curated CLAUDE.md for most projects should fit in 40-80 lines. Under 100 is a reasonable upper bound. If yours is longer, that's a signal something is wrong — not that you need more lines, but that you need to push content elsewhere.
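Concretely, a root file that passes the test above might look something like this — every project name, command, and port here is hypothetical, but each line maps to one of the categories that survive the cut:

```markdown
# CLAUDE.md

## Commands
- Build: pnpm turbo build --filter=api
- Test a single file: pnpm vitest run path/to/file.test.ts (never run the full suite)
- Dev server runs on port 4100, not 3000

## Conventions
- ES modules only; no require()
- Branch names: feat/<ticket>-short-description
- Prefer integration tests over mocks for service-layer code

## Gotchas
- STRIPE_WEBHOOK_SECRET must be set or checkout tests fail silently
- Run pnpm db:migrate before first start; the API boots without it but returns empty data
```

Nothing here is derivable from the code, enforced by a linter, or true only for some tasks — which is exactly why each line earns its context budget.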

Progressive Disclosure: The Right Architecture

The right solution to the "needs more instructions than fit" problem isn't a longer file — it's a different architecture. Progressive disclosure means storing task-specific guidance in separate files that the model can load on demand, keeping the root instruction file as a clean index.

A typical structure looks like this:

CLAUDE.md                        ← root instruction file, under 80 lines
agent_docs/
  building_the_project.md        ← loaded when building or running
  running_tests.md               ← loaded when writing or running tests
  code_conventions.md            ← loaded when writing new code
  service_architecture.md        ← loaded when reasoning about services
  database_schema.md             ← loaded only for data-related tasks

The root CLAUDE.md references these files by name but doesn't embed their content. When Claude starts a build-related task, it loads building_the_project.md. When it's fixing a UI bug, that file doesn't appear in context at all.

This architecture solves both problems simultaneously. The root instruction file stays compact enough that every line in it carries weight. Task-specific guidance exists and is available, but only consumes context when actually relevant.

Claude Code supports this in two ways: import syntax (@path/to/file), which inlines a file's contents when the session starts, and a skills system that loads domain-specific guidance on demand. Note the difference: @-imports are loaded eagerly, so they don't save context — for true progressive disclosure, reference files by path in prose and rely on the model's judgment to read them when the task calls for it.
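Following the reference-by-path approach, a root file wired for progressive disclosure might read like this (the file names follow the structure above; the always-on rules are hypothetical):

```markdown
# CLAUDE.md

Task-specific guidance lives in agent_docs/. Read the relevant file before starting:

- Building or running the project → read agent_docs/building_the_project.md
- Writing or running tests → read agent_docs/running_tests.md
- Touching the database → read agent_docs/database_schema.md

Always-on rules:
- ES modules only; no require()
- Never commit directly to main
```

The index costs a handful of lines; the detailed guidance costs nothing until a task actually needs it.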

The Ecosystem: CLAUDE.md, AGENTS.md, and Beyond

CLAUDE.md is specific to Claude Code. The broader ecosystem has converged on AGENTS.md as a format-neutral alternative. As of early 2026, over 18 tools natively parse AGENTS.md, including GitHub Copilot, Cursor, Windsurf, Zed, Google Jules, Devin, and Aider — with Claude Code as a notable holdout using its own format.

Analysis of over 2,500 repositories using AGENTS.md found consistent structure among the highest-performing files: six core areas covered, with commands placed early and examples favored over explanations. A concrete code snippet showing expected style consistently outperforms three paragraphs describing it.

The median well-performing file was around 300-350 words — short enough to be read in under two minutes, long enough to cover the bases. Files beyond 500 words showed diminishing returns. Files beyond 1,000 words showed negative correlation with agent performance.

One pattern that appeared across top-performing repositories was a three-tier boundary framework for defining what the agent should and shouldn't do:

  • Always do: safe, encouraged actions the agent can take without asking
  • Ask first: actions with side effects or real-world consequences that need confirmation
  • Never do: hard constraints, with a short explanation of why

This explicit categorization helps because it acknowledges that "never commit secrets" and "ask before pushing to main" are different types of restrictions — one is a hard safety rule, the other is a workflow preference. Mixing them into a flat list dilutes both.
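Written out in an instruction file, the three tiers might look like this (the specific rules are illustrative, not prescriptive):

```markdown
## Boundaries

Always:
- Run the linter and tests after making edits
- Create a feature branch for changes

Ask first:
- Pushing to main or tagging a release
- Running migrations against any shared environment

Never:
- Commit secrets or .env files (they end up in history and logs)
- Force-push to shared branches (rewrites teammates' history)
```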

Treating Your Instruction File Like Code

The operational instinct behind most bloated CLAUDE.md files is documentation culture: if something went wrong, document it so it doesn't happen again. That instinct is wrong for instruction files. The fact that Claude once did something wrong doesn't mean a rule against it belongs in the file permanently — it means you should diagnose why Claude did it.

If Claude ignored a rule because the file was too long, adding another rule doesn't fix the underlying problem. If Claude made a mistake because a hook wasn't in place, the fix is a hook — not an instruction. If Claude did something wrong because the prompt was vague, the fix is a better prompt.

Instruction files should be pruned the same way you'd prune dead code. Questions to ask during review:

  • Does Claude follow this rule without the instruction? If yes, delete it.
  • Does this rule only apply to some tasks? If yes, move it to a specialized file.
  • Could a deterministic tool enforce this instead? If yes, use the tool.
  • Has Claude violated this rule in the past month? If no, consider whether it's still relevant.

The practical workflow that works well: start lean, observe where Claude makes mistakes, add rules only when those rules would have prevented real failures. An instruction file that grows organically from actual failures tends to be both shorter and more effective than one written upfront to cover every case.
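The review checklist above can be backed by a trivial budget check in CI. Here is a minimal sketch — the thresholds are the heuristics from this article (~100 lines, ~500 words), not universal constants, and the script name is made up:

```python
# claude_md_audit.py — warn when an instruction file outgrows its budget.
# Thresholds follow the heuristics discussed above: ~100 lines / ~500 words.

def audit(text: str, max_lines: int = 100, max_words: int = 500) -> list[str]:
    """Return human-readable warnings for an instruction file's contents."""
    lines = [l for l in text.splitlines() if l.strip()]  # ignore blank lines
    words = len(text.split())
    warnings = []
    if len(lines) > max_lines:
        warnings.append(
            f"{len(lines)} non-blank lines (budget {max_lines}): "
            "push task-specific guidance into separate files"
        )
    if words > max_words:
        warnings.append(
            f"{words} words (budget {max_words}): diminishing returns past this point"
        )
    return warnings

if __name__ == "__main__":
    import pathlib
    import sys

    for path in sys.argv[1:]:
        for warning in audit(pathlib.Path(path).read_text()):
            print(f"{path}: {warning}")
```

Run as a pre-commit step or CI check, it turns "keep the file short" from a resolution into an enforced invariant.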

One more tactical note: pay attention to where in the file you place your most important rules. Models exhibit a recency bias — instructions appearing at the end of a long context tend to be weighted more heavily. For a critical rule that keeps getting missed, try moving it toward the end of the file rather than adding emphasis markers.

The Compound Value Over Time

A good instruction file compounds. Each well-chosen rule that sticks is a mistake you never have to catch again. Each pruned rule that wasn't pulling weight is context budget reclaimed for the rules that matter.

The teams getting the most out of AI coding agents treat their instruction files as high-value configuration requiring the same care as production code. They review them periodically, update them when the project changes, and test them by observing whether agent behavior actually shifts after a change.

The teams getting the least out of them treat them as documentation — comprehensive by default, never pruned, growing until the model stops reading them reliably.

The irony is that less instruction, properly chosen, produces more compliance. The path to having Claude follow your rules isn't writing more rules. It's being ruthless about which rules deserve to be written at all.
