The Agent Scratch Directory: The Unowned Filesystem PII Surface Nobody Inventoried
A regulator walks into your office and asks the question security teams rehearse for: "Show me every place customer data lives." Your data team produces the inventory. The primary database is on it. The analytics warehouse is on it. The object store, the queue, the search index, the backup destination — all on it, with classification labels, retention policies, encryption details, and named owners. Then someone in the room mentions the agent worker pool, and the inventory has nothing to say. The pool has been running for nine months. Each worker has a local disk. The agents on those workers have been parsing PDFs, transcribing audio, downloading email attachments, and caching intermediate JSON between tool calls the entire time. Nobody put any of that on the asset register.
This is the scratch directory problem. Every long-running agent worker accumulates an ephemeral filesystem that grows organically as new tools are added — extracted text from a PDF parser, transcribed audio from a Whisper step, downloaded attachments from a Gmail tool, screenshots from a browser-use step, vector-search snippets cached for the next turn, intermediate JSON the agent emitted between two tool calls so the second one wouldn't have to re-derive it. Unlike databases and queues and buckets, this surface has no retention policy, no encryption-at-rest standard, no DLP scanner pass, and no entry on the data-classification spreadsheet. The platform team thinks "agent state" means the inference-provider context window. The SRE team thinks "agent state" means the durable database. The worker's /tmp/agent-workspace-${session_id}/ directory is a third copy of customer data that nobody owns.
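To make the accumulation concrete, here is a minimal sketch of the pattern. Every name in it (the workspace helper, the two caching functions) is invented for illustration, not taken from any real agent framework: each tool call persists its output into the session workspace, and nothing in the loop ever deletes anything.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Illustrative sketch only: each tool call persists its output into the
# session workspace, and no code path ever removes it.

def session_workspace(session_id: str) -> Path:
    ws = Path(tempfile.gettempdir()) / f"agent-workspace-{session_id}"
    ws.mkdir(parents=True, exist_ok=True)
    return ws

def cache_extracted_text(ws: Path, doc_id: str, text: str) -> Path:
    out = ws / "pdf" / f"{doc_id}.txt"        # extracted customer text
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(text)
    return out

def cache_turn_result(ws: Path, turn: int, payload: dict) -> Path:
    out = ws / "turns" / f"{turn:04d}.json"   # intermediate JSON between tool calls
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(payload))
    return out

ws = session_workspace(uuid.uuid4().hex)
cache_extracted_text(ws, "invoice-4411", "Bill to: Jane Doe, 14 Elm St")
cache_turn_result(ws, 1, {"tool": "pdf_parse", "pages": 12})
```

Every file in that workspace becomes customer data the moment a real document flows through it, and it sits outside every control that applies to the named systems above.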
Why The Inventory Misses It
Traditional data-classification programs were built on an assumption that's no longer true: data lives in named systems. The CRM has a name. The data warehouse has a name. The bucket has a name. You discover them by querying cloud APIs, by reading deployment manifests, by walking the procurement spreadsheet. The classification team doesn't need to grep filesystems because the systems with sensitive data all have control planes that can be queried.
Agent workers break that assumption. The "system" that holds the data is a directory path inside a stateless container that the orchestrator can replace at any time. There is no API call that lists "all customer data in agent worker scratch directories" because the directories aren't tracked anywhere — they're created on demand by whichever tool happened to need a temp file. Cloud-native data discovery scanners (Wiz, Macie, Microsoft Purview) can identify resources at the storage-account level and find sensitive content patterns, but their default coverage maps poorly to "the union of every /tmp/* across every replica of every agent worker pod between deploys." The data is real and the surface is large; the inventory just doesn't include it.
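There is no control plane to query, but the nodes can be made to report on themselves. Here is a minimal sketch of a per-node sweep; the scratch roots are assumptions that will differ per deployment, and a real version would ship the rows to whatever inventory system already exists rather than printing them.

```python
import os
import time
from pathlib import Path

# Candidate scratch roots to walk. Discovering this list is itself
# part of the inventory work; these two are only plausible defaults.
SCRATCH_ROOTS = [Path("/tmp"), Path("/var/lib/agent/cache")]

def inventory(roots: list[Path]) -> list[dict]:
    """Walk candidate scratch roots and report what lives there."""
    now = time.time()
    rows = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda _: None):
            for name in filenames:
                path = Path(dirpath) / name
                try:
                    st = path.stat()
                except OSError:
                    continue  # scratch churns constantly; files vanish mid-walk
                rows.append({"path": str(path),
                             "bytes": st.st_size,
                             "age_hours": round((now - st.st_mtime) / 3600, 1)})
    return rows

if __name__ == "__main__":
    for row in inventory(SCRATCH_ROOTS):
        print(row)
```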
The org dynamic that creates this gap is even more reliable than the technical one. The AI engineering team that built the worker pool thinks of it as compute, not storage: a place to run the agent loop, not a place to store data. The SRE team that operates the pool thinks of "data" as the durable layer they back up. The security team that owns data classification thinks of "agent state" as whatever the inference provider is doing on its end. None of those three teams sees the disk. Each one has a plausibly correct mental model of what it owns. Together they have a blind spot exactly the size of the worker's local volume.
The Failure Modes Are Boringly Concrete
The failure modes here aren't speculative. Each one has a clean physical mechanism and should be easy to defend against. None of them is, because the surface isn't on the inventory.
A worker reboots in the middle of a session and the orchestrator schedules the next session onto the same node with the same path scheme. The new tenant's session inherits the previous tenant's scratch directory because the path was deterministic (/tmp/agent-workspace-current/ rather than /tmp/agent-workspace-${session_id}/), or the session_id wasn't sufficiently entropy-rich, or the cleanup hook ran in a try block that silently swallowed an rm -rf error. The daily.dev team shipped exactly this class of bug in their org-wide agent and wrote it up: a previous user's write token persisted into a new user's turn because the credential surface was process-global rather than per-turn. The same pattern replays for filesystem state: anything not explicitly per-turn isolated is implicitly cross-turn shared.
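Both bugs in that paragraph reduce to a few lines each. A sketch of the broken patterns next to the isolated ones, with all function names illustrative:

```python
import shutil
import uuid
from pathlib import Path

WORKSPACE_ROOT = Path("/tmp")

def workspace_deterministic_bug() -> Path:
    # Every session on this node shares one path, so session N+1
    # inherits whatever session N left behind after a crash or reboot.
    ws = WORKSPACE_ROOT / "agent-workspace-current"
    ws.mkdir(parents=True, exist_ok=True)
    return ws

def cleanup_swallowed_bug(ws: Path) -> None:
    try:
        shutil.rmtree(ws)
    except OSError:
        pass  # the rm -rf failed and nobody will ever know

def workspace_per_session() -> Path:
    # 128 bits of entropy per session, owner-only permissions, and a
    # refusal to adopt a directory that already exists: reuse becomes
    # a loud failure instead of silent cross-tenant sharing.
    ws = WORKSPACE_ROOT / f"agent-workspace-{uuid.uuid4().hex}"
    ws.mkdir(mode=0o700, parents=True, exist_ok=False)
    return ws

def cleanup_loud(ws: Path) -> None:
    # Deliberately no try/except: a failed delete should fail the
    # session and page someone, because the alternative is inheritance.
    shutil.rmtree(ws)
```

The core fix is that nothing about the path is guessable or reusable, and that cleanup failure is an incident rather than noise.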
A backup snapshot of the worker's volume captures six weeks of customer attachments because the filesystem-level backup tool didn't know to exclude /tmp/agent-workspace. Volume snapshots in cloud environments don't read your .gitignore-equivalent. Whoever configured the backup told it to capture the whole disk, on the reasonable assumption that "the disk only has system files and the application binary." Then a tool that needed to parse a PDF dropped the extracted text in /var/lib/agent/cache/ and the next snapshot picked it up, and the snapshot retention is 90 days, and now you have two months of customer PDFs in a backup tier that wasn't classified for it.
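The snapshot tool will never learn these paths on its own, so the practical mitigations are to keep scratch off the snapshotted volume entirely, or, failing that, to bound how long anything survives there. A sketch of the second option as a TTL sweeper; the glob patterns and the four-hour TTL are assumptions to adapt per deployment.

```python
import glob
import shutil
import time
from pathlib import Path

# Patterns and TTL are assumptions. The goal is a hard upper bound:
# a 90-day volume snapshot can then capture at most TTL_SECONDS worth
# of customer material, not six weeks of it.
SCRATCH_GLOBS = ["/tmp/agent-workspace-*", "/var/lib/agent/cache/*"]
TTL_SECONDS = 4 * 3600

def sweep(patterns: list[str], ttl: int) -> None:
    cutoff = time.time() - ttl
    for pattern in patterns:
        for match in glob.glob(pattern):
            entry = Path(match)
            if entry.stat().st_mtime >= cutoff:
                continue
            # No try/except around the delete: as the reboot failure
            # mode above shows, a silently swallowed rm is the bug.
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()

if __name__ == "__main__":
    sweep(SCRATCH_GLOBS, TTL_SECONDS)
```

Under Kubernetes the first option is usually simpler: mount the scratch paths as a memory-backed emptyDir, and a snapshot of the node's disk never contains them in the first place.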
Sources
- https://dev.to/ksankar/defense-in-depth-tenant-isolation-for-an-agent-that-executes-code-375j
- https://blaxel.ai/blog/multi-tenant-isolation-ai-agents
- https://www.shayon.dev/post/2026/52/lets-discuss-sandbox-isolation/
- https://docs.aws.amazon.com/prescriptive-guidance/latest/agentic-ai-multitenant/enforcing-tenant-isolation.html
- https://daily.dev/blog/we-built-an-org-wide-ai-agent-in-4-days-heres-what-broke-in-the-weeks-after
- https://blog.cloudflare.com/dynamic-workers/
- https://learn.microsoft.com/en-us/azure/well-architected/security/data-classification
- https://www.wiz.io/academy/data-classification
- https://aclanthology.org/2025.acl-long.1227.pdf
- https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/
