Personalization Belongs in a Dotfile, Not a Vector Store
The first time a product team needs per-user agent behavior, somebody usually says "we should fine-tune" or "let's wire up persistent memory." A week later they have a vector database, a feedback-loop pipeline, and a roadmap item to monitor learned-state drift. They have built an ML system to solve a problem that, in nine cases out of ten, is a config file.
Look at what users are actually asking for: terser responses, bullets instead of prose, my company's name in the disclaimer, default to my preferred model, don't escalate to a human under $100, here is the project I am working on this week, never use emoji. None of that needs a model that has learned anything. It needs settings. The dotfile pattern — a versioned, declarative, per-user configuration repo — solved this for shells, editors, and CLIs forty years ago, and it is the right shape for AI agents in 2026.
The reason teams skip past it is partly a status thing — fine-tuning sounds like real ML work, settings sound like Tuesday — and partly a habit from earlier in the LLM era when everything was a system prompt and the system prompt was a monolith. Both are wrong now. Agents are software. The cheapest, most debuggable, most user-respecting way to give them per-user behavior is to ship them a dotfile.
Most "personalization" is just config in disguise
Run a survey of the personalization requests sitting in any agent product's backlog. The list looks something like this: tone (concise, formal, casual, blunt), output format (markdown, plaintext, JSON), default model tier, default tools enabled, escalation thresholds, language, timezone, default project or workspace, do-not-do list (don't run shell commands, don't send Slack messages without confirmation), preferred citation style, persona overlays for a customer-facing agent.
Every item on that list is a value that the user explicitly chose. They are not patterns the model should infer from interaction history. They are preferences with a clear schema, a default, and a small set of valid values. A user changing them once should change behavior immediately and predictably — not in three sessions when the embedding model finally catches up.
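Concretely, a per-user file covering that list could be as small as the sketch below. The filename and keys are hypothetical, not a published schema:

```yaml
# ~/.agentrc: hypothetical per-user agent dotfile
tone: terse                  # one of: terse, formal, casual, blunt
format: markdown             # or: plaintext, json
language: en
timezone: America/New_York
model_tier: standard         # default tier; the resolver maps this to a concrete model
project: q3-billing-migration
escalation:
  human_handoff_min_usd: 100     # below this amount, do not escalate to a human
never:
  - use emoji
  - send Slack messages without confirmation
```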
A useful mental test: if a user can articulate the preference in one sentence ("never use emoji"), it is config. If they cannot ("write more like our senior engineers do"), that is where embeddings, examples, or fine-tuning earn their keep. The line is sharper than people pretend, and the config side of it is enormous.
What the dotfile pattern actually buys you
Strip the analogy down to its primitives and four properties matter:
- Declarative. The file states the desired state, not the steps to get there. "tone: terse" not "after each response, check the prior style and adjust."
- Versioned. Every change is a diff with an author and a timestamp: reviewable, recoverable, and revertible.
- Composable. Layers override layers — system defaults, organization, team, user, project, session — with the most specific winning.
- Inspectable. A user can read the file and know exactly what is shaping the agent's behavior. So can support, compliance, and the next engineer.
These are the same properties that GitOps brought to infrastructure — declarative state, version-controlled, automatically reconciled, traceable. The lesson SRE teams learned the hard way is that imperative configuration and learned state both drift in ways that are hard to observe and harder to roll back. Declarative state, by contrast, lets you diff today against yesterday and tell a clean story about what changed.
A learned-personalization stack has none of these properties. The user's preference lives in vectors or model weights. Nobody can read it. Nobody can diff it. A regression looks like "the agent feels different this week" and the answer is to retrain, not revert. Auditors do not love this conversation.
A schema, not a free-form prompt
The fastest way to wreck the dotfile pattern is to make the file a free-form blob — a single markdown document the user types into. That gives you the worst of both worlds: brittle parsing on the agent side, and on the user side, no validation, no autocomplete, no defaults, and no contract.
A real per-user agent config has a schema. The shape that has emerged across recent agent frameworks — VS Code custom agents, OpenCode, Continue, Microsoft 365's declarative agents, Google's Agent Development Kit, Claude Code's CLAUDE.md plus settings.json pair — is some combination of:
- An identity block (name, persona, audience).
- A capabilities block (which tools, with which scopes, for which kinds of tasks).
- A constraints block (do-not-do rules, escalation thresholds, jurisdictional flags).
- A context block (the project, the workspace, the relevant docs).
- A composition block (which prompt fragments to include, in which order).
- A model and routing block (default tier, fallbacks, latency budgets).
That structure is not arbitrary. It maps to the four-layer system prompt architecture that has shaken out as best practice: identity, capabilities, constraints, context. Keeping each layer separate in the config means you can change one without rereading the others, and you can test each independently.
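As a sketch, a config organized around those blocks could look like the following. The field names are illustrative and do not come from any one of the frameworks above:

```yaml
# agent.yaml: illustrative shape only, not any specific framework's format
identity:
  name: support-triage
  persona: senior support engineer
  audience: enterprise customers
capabilities:
  tools:
    - name: search_tickets
      scopes: [read]
    - name: issue_refund
      scopes: [write]
constraints:
  never:
    - run shell commands
  escalate_to_human_over_usd: 100
context:
  project: q3-billing-migration
  docs: [runbooks/refunds.md]
composition:
  fragments: [tone-terse, citations-inline, refund-policy]
model:
  default_tier: standard
  fallbacks: [economy]
  latency_budget_ms: 8000
```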
Schema also enables a property the monolithic prompt cannot offer: explicit defaults and overrides. The system ships sane defaults; the organization overrides a few; the team overrides a few more; the user overrides a handful; the current project overrides one or two. The agent at request time resolves the stack and serializes the final state. A user can ask "why did you do X" and the answer is a config path with a winner — not "the model felt like it."
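Resolution itself is a fold over the layer stack with a deep merge, most specific last. A minimal Python sketch, assuming dict-shaped configs and hypothetical layer names:

```python
from functools import reduce

# Layers in ascending precedence: later entries override earlier ones.
LAYERS = ["system", "org", "team", "user", "project", "session"]

def merge(base: dict, override: dict) -> dict:
    """Deep-merge two config dicts; values in override win on conflict."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out

def resolve(configs: dict) -> dict:
    """Fold the layer stack into one effective config, most specific winning."""
    return reduce(merge, (configs.get(layer, {}) for layer in LAYERS), {})

def provenance(configs: dict, dotted_key: str) -> str | None:
    """Answer "why did you do X": the most specific layer that set a key."""
    winner = None
    for layer in LAYERS:
        node = configs.get(layer, {})
        for part in dotted_key.split("."):
            node = node.get(part) if isinstance(node, dict) else None
        if node is not None:
            winner = layer
    return winner
```

`resolve` produces the final state the agent serializes at request time; `provenance` is the "config path with a winner" that makes the why-did-you-do-X question answerable.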
Prompt fragments are how the dotfile reaches the model
The bridge from a structured config file to an actual LLM call is the prompt fragment. Instead of one giant system prompt that the agent assembles by string concatenation, you keep a library of small, named fragments — each one a paragraph or two of focused instruction — and the config file references them by name in a defined order.
Consider the tone setting. Rather than the agent code carrying a switch statement with five hard-coded branches of the form if tone == "terse", prepend "be terse...", the system keeps a fragments/tone-terse.md file checked into a repo; the user's config says tone: terse, and the resolver loads the corresponding fragment at request time. Adding a new tone is adding a fragment. Auditing what a user sees is reading their referenced fragments. A/B testing a new constraint is shipping a new fragment and routing a fraction of traffic at the config layer, with no code change.
This approach also addresses a token-cost problem that monolithic mega-prompts make worse. From a 50-fragment library, the resolver loads only the fragments a user's config actually references, which in practice is a handful: often 5,000–10,000 tokens of focused instruction versus a permanent 50,000-token everything-prompt that nobody reads end-to-end. Pricing aside, the model behaves better when its instructions are small, ordered, and relevant.
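A minimal fragment resolver for that pattern, assuming a fragments/ directory of markdown files whose filenames encode the setting and value:

```python
from pathlib import Path

FRAGMENTS_DIR = Path("fragments")   # assumed layout: fragments/tone-terse.md, etc.

def load_fragment(name: str) -> str:
    """Read one named prompt fragment; fail loudly on a dangling reference."""
    path = FRAGMENTS_DIR / f"{name}.md"
    if not path.is_file():
        raise FileNotFoundError(f"config references unknown fragment: {name}")
    return path.read_text()

def fragments_for(config: dict) -> list[str]:
    """Map resolved settings to fragments, preserving the config's declared order."""
    names = []
    if "tone" in config:
        names.append(f"tone-{config['tone']}")          # e.g. tone-terse
    names += config.get("composition", {}).get("fragments", [])
    return [load_fragment(name) for name in names]
```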
The pattern composes with prompt caching cleanly. Stable layers — system identity, organization defaults, team config — sit at the top of the prompt and hit the cache on every request. Dynamic layers — current project, session-scoped overrides — sit at the bottom. The cache hierarchy mirrors the override hierarchy, which is a happy coincidence that pays off in production.
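A sketch of that assembly order; the layer split is an assumption, and the actual caching mechanics depend on your provider:

```python
STABLE_LAYERS = ("system", "org", "team")   # change rarely: the cacheable prefix
DYNAMIC_LAYERS = ("project", "session")     # change often: keep at the tail

def assemble_system_prompt(fragments_by_layer: dict) -> str:
    """Serialize stable layers first; a prefix cache re-serves everything up
    to the first changed byte, so this ordering maximizes the cached span."""
    ordered = []
    for layer in STABLE_LAYERS + DYNAMIC_LAYERS:
        ordered.extend(fragments_by_layer.get(layer, []))
    return "\n\n".join(ordered)
```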
Reload without retrain
A property that distinguishes agent dotfiles from any ML-based personalization story is the reload semantics. When a user toggles a preference, the next request sees the new behavior — full stop. There is no training run, no eval cycle, no embedding refresh, no "give it a week to learn."
Building this is not exotic. The agent reads the config from a known path on every request, or watches it with a file watcher, or fetches it from a key-value store with a short TTL. Claude Code's CLAUDE.md file watcher is the public-facing version of this idea: edit the file mid-session, the next response uses the new instructions, no restart required. The same shape works for a hosted multi-tenant agent product — the user's config is a row in a database that the request path reads, with a per-request cache that invalidates on write.
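For the hosted case, the reload machinery can be a read-through cache that writes invalidate. A sketch, with fetch_config standing in for whatever store you use:

```python
import time

TTL_SECONDS = 5.0
_cache: dict = {}   # user_id -> (fetched_at, config)

def get_config(user_id: str, fetch_config) -> dict:
    """Read-through cache: a missing or stale entry triggers a fresh fetch,
    so a toggled preference is visible within one TTL at worst."""
    entry = _cache.get(user_id)
    if entry is not None and time.monotonic() - entry[0] < TTL_SECONDS:
        return entry[1]
    config = fetch_config(user_id)            # e.g. one row in a users table
    _cache[user_id] = (time.monotonic(), config)
    return config

def on_config_write(user_id: str) -> None:
    """Invalidate on write so the next request sees the new behavior."""
    _cache.pop(user_id, None)
```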
The user-experience consequence of this is large. A user who flips a setting and sees the agent change behavior on the next message develops trust that maps to the trust they have in any other configurable software. A user who flips a setting and waits to see if the model "learns" is in a relationship with a black box. The first user files coherent bug reports. The second files vibes-based complaints that turn into multi-week investigations.
What this kills, what it does not
The dotfile pattern eliminates a long list of problems that the learned-personalization approach inherits:
- Silent regression. When learned state drifts, behavior drifts, and you discover it from user reports. When config does not change, behavior does not change.
- Opacity. "Why did the agent do that?" is answerable from a config diff. From learned state, it is a research project.
- No clean export. GDPR-style portability or "delete my data" is straightforward when personalization is a file the user can download. With weights and embeddings, it is a compliance program.
- Eval ambiguity. A config-and-output contract is testable: given this config and this input, the output should pass these checks. A learned-personalization eval has to span sessions, control for drift, and account for retraining cadence.
It does not kill the legitimate uses of fine-tuning and long-term memory. Domain-specific terminology, recall of past conversations, behaviors that genuinely emerge from interaction history — all of those still need their own machinery. The point is that they should sit on top of a declarative configuration layer, not replace it. The dotfile handles the 70% of personalization that is preference; ML handles the 30% that is genuinely emergent.
The boundary worth defending: anything a user could in principle write down as a preference is config, and anything a user could not write down is a candidate for learned state. Most teams blur this line and end up using ML to solve config problems, which is expensive, opaque, and hard to support.
How to test config-driven behavior
Once personalization is declarative, evals stop being a session-replay nightmare and become contract tests. The shape is the boring kind that QA teams have written for thirty years: given config C and input I, the output O must satisfy assertion A.
Per-user behavior becomes a matrix — config dimensions on one axis, input scenarios on the other — that you can run on every commit. A change to the default tone-terse.md fragment triggers a regression run against every config that references it; a change to a constraint fragment triggers one against every user who has that constraint enabled. The blast radius of a change is exactly the set of configs that reference the changed fragment, computable from the config files. You cannot say that about a learned-personalization stack without a sweeping eval suite that nobody runs because it is slow and noisy.
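In pytest terms, the matrix is two stacked parametrizations, and each cell is a plain assertion. The run_agent harness imported below is hypothetical, not a real package:

```python
import pytest

from myagent.testing import run_agent   # hypothetical test harness

CONFIGS = [
    {"tone": "terse", "format": "markdown", "never": ["use emoji"]},
    {"tone": "formal", "format": "plaintext", "never": []},
]
PROMPTS = [
    "summarize this incident report",
    "draft a refund reply",
]

@pytest.mark.parametrize("config", CONFIGS)
@pytest.mark.parametrize("prompt", PROMPTS)
def test_config_contract(config, prompt):
    output = run_agent(config=config, prompt=prompt)
    if config["tone"] == "terse":
        assert len(output.split()) < 150                 # terseness as a hard budget
    if "use emoji" in config["never"]:
        assert all(ord(ch) < 0x1F300 for ch in output)   # rough emoji screen
```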
This also lets you ship per-user behavior with the same release-engineering rigor you ship code: feature-flagged fragments, canary rollouts at the config layer, blue/green by composing two versions of a fragment library and switching pointers. The infrastructure that already exists in your org for safe code deploys is reusable — because, again, agents are software.
What this means for org structure
Once you accept that the cheapest answer to half the personalization roadmap is declarative config, the work shifts. The team that owns "agent personalization" stops being an ML team and starts being a configuration-management team — closer to platform engineering than to applied research. They own the schema, the fragment library, the resolver, the file watcher, and the eval matrix. They publish the contract that product teams configure against.
The product side gets a faster loop. Adding a new persona is editing a YAML file and shipping a fragment, not training a model. Removing a feature for a customer is flipping a flag in their config, not scrubbing a fine-tune dataset and retraining. Compliance becomes "show me the user's config and the fragments it referenced at 14:32 on Tuesday," which is a query, not a forensics exercise.
There is one durable hire to make alongside this: somebody who treats the fragment library like a docs site, with editing standards, naming conventions, deprecation policy, and a review process. The library is the agent's public surface area; left to grow ad hoc, it becomes the new mega-prompt with extra steps. Treated as a curated artifact, it is the most leveraged code in the system.
The cheapest answer first
The instinct to reach for ML on personalization comes from the same place as the instinct to reach for a microservice on a monolith problem: fashion, status, and an unwillingness to ask whether the simpler tool already solves it. The simpler tool, in this case, is a configuration file. Versioned, declarative, layered, composable, inspectable. The same primitives that gave SRE teams operable infrastructure will give agent teams operable personalization.
Start with the dotfile. Reach for the vector database when the dotfile demonstrably runs out. The order matters because the dotfile is reversible — you can always layer ML on top of declarative state — and learned personalization is not. You cannot retroactively make embeddings auditable. You cannot retrofit version history onto weights. But you can always upgrade a config-driven system to use ML for the cases where ML actually pays its rent.
The team that internalizes this ships personalization on Tuesday. The team that does not is still arguing about training data on Friday.
- https://code.claude.com/docs/en/settings
- https://www.jdhodges.com/blog/claude-code-claudemd-project-instructions/
- https://docs.langchain.com/oss/python/langchain/context-engineering
- https://learn.microsoft.com/en-us/microsoft-365/copilot/extensibility/declarative-agent-architecture
- https://opencode.ai/docs/agents/
- https://docs.continue.dev/reference
- https://www.gitops.tech/
- https://about.gitlab.com/topics/gitops/
- https://blog.cloudflare.com/introducing-agent-memory/
- https://learn.microsoft.com/en-us/agent-framework/get-started/memory
- https://blog.promptlayer.com/prompt-routers-and-flow-engineering-building-modular-self-correcting-agent-systems/
- https://www.operion.io/learn/component/system-prompt-architecture
- https://www.empathyfirstmedia.com/yaml-files-ai-agents/
