3 posts tagged with "data-privacy"

The Agent That Remembers What You Took Back: Deletion as a First-Class Memory Operation

· 10 min read
Tian Pan
Software Engineer

In March, a user told your agent to stop recommending restaurants with outdoor seating — they had moved to an apartment with a baby and late nights were over. In September, the agent suggests a rooftop bar for their anniversary. The user is annoyed, and you are confused, because you watched the March correction land. It got written to memory. It is still there. The problem is that it is sitting next to the original preference, which is also still there, and retrieval surfaced the older one because it had a slightly better embedding match for "anniversary dinner."

This is the failure mode nobody designs for. Teams spend weeks on memory writes (extraction, summarization, embedding, namespacing) and treat deletes as a someday problem. Long-term memory makes adding a fact almost free, so facts accumulate. But a memory store is not a diary. A diary is allowed to contain things that used to be true. A memory store that an agent reads from to make decisions gets no such allowance, because the agent cannot tell the difference between a fact and a fossil.
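To make the March/September failure concrete, here is a minimal sketch of a store where a correction retires the fact it replaces instead of sitting beside it. Everything in it is an assumption for illustration: the `MemoryStore` and `MemoryRecord` names, the idea of a per-topic `key`, and the `superseded_by` field are hypothetical, not the API of any particular memory library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemoryRecord:
    key: str                             # hypothetical topic key, e.g. "dining.outdoor_seating"
    text: str                            # the remembered fact
    written_at: int                      # logical timestamp of the write
    superseded_by: Optional[int] = None  # index of the record that replaced this one


class MemoryStore:
    """Toy in-memory store: writing to a key retires older facts under that key."""

    def __init__(self) -> None:
        self._records: List[MemoryRecord] = []
        self._clock = 0

    def write(self, key: str, text: str) -> int:
        self._clock += 1
        new_index = len(self._records)
        # Mark every still-live record under the same key as superseded,
        # so retrieval can never surface it as a current fact.
        for record in self._records:
            if record.key == key and record.superseded_by is None:
                record.superseded_by = new_index
        self._records.append(MemoryRecord(key, text, self._clock))
        return new_index

    def live(self) -> List[MemoryRecord]:
        # Only records that nothing has replaced are eligible for retrieval.
        return [r for r in self._records if r.superseded_by is None]


store = MemoryStore()
store.write("dining.outdoor_seating", "likes restaurants with outdoor seating")
store.write("dining.outdoor_seating", "stop recommending outdoor seating")
print([r.text for r in store.live()])
```

The superseded record is kept for audit purposes but filtered out before any similarity ranking happens, so a better embedding match on the older fact no longer matters. Whether old records should be retained at all, or hard-deleted, is a separate policy decision this sketch does not settle.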

The AI Observability Leak: Your Tracing Stack Is a Data Exfiltration Surface

· 11 min read
Tian Pan
Software Engineer

A security team I talked to recently found that their prompt and response fields were being shipped, in full, to a third-party SaaS logging backend they had never signed a Data Processing Agreement with. The fields contained customer medical summaries, Stripe secret keys accidentally pasted by support agents, and the full text of a confidential acquisition memo that someone had asked an internal assistant to summarize. Nothing was encrypted in the payload. Nothing was redacted. The retention was 400 days. The integration was set up during a hackathon by a well-meaning engineer who ran pip install on the vendor's SDK, dropped in an API key, and shipped.

This is the AI observability leak. Every LLM app team ends up wanting tracing — you cannot debug prompt regressions or non-deterministic agent loops without it — so one of LangSmith, Langfuse, Helicone, Phoenix, Braintrust, or a vendor AI add-on ends up in the stack. The default setup captures the entire request and response. That default is, for most production workloads, a compliance violation waiting to be discovered.
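One way to move off that default is to scrub prompt and response fields before they leave the process. The sketch below is a hedged illustration only: `redact` and `scrub_trace_fields` are hypothetical helpers, not part of any tracing vendor's SDK, and the two patterns (Stripe-style `sk_live_`/`sk_test_` keys and email addresses) are examples rather than a complete list. Pattern matching will not catch free-text sensitive content such as a medical summary or a confidential memo, so it complements, rather than replaces, deciding which fields get captured at all.

```python
import re

# Illustrative patterns only; a real deployment would need a reviewed, broader set.
_PATTERNS = [
    (re.compile(r"\bsk_(?:live|test)_[A-Za-z0-9]+\b"), "[REDACTED_STRIPE_KEY]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Replace every match of each known-sensitive pattern with a placeholder."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


def scrub_trace_fields(fields: dict) -> dict:
    """Return a copy of the trace fields with string values redacted.

    Non-string values (token counts, latencies) pass through unchanged.
    """
    return {
        name: redact(value) if isinstance(value, str) else value
        for name, value in fields.items()
    }


trace = {
    "prompt": "Refund failed, key is sk_live_abc123, reply to bob@example.com",
    "tokens": 12,
}
print(scrub_trace_fields(trace))
```

The scrubbing step would run inside your own service, before the payload is handed to whatever exporter the tracing tool provides, so the unredacted text never reaches the third-party backend.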

Building GDPR-Ready AI Agents: The Compliance Architecture Decisions That Actually Matter

· 10 min read
Tian Pan
Software Engineer

Most teams discover their AI agent has a GDPR problem the wrong way: a data subject files an erasure request, the legal team asks which systems hold that user's data, and the engineering team opens a ticket that turns into a six-month audit. The personal data is somewhere in conversation history, somewhere in the vector store, possibly cached in tool call outputs, maybe embedded in a fine-tuned checkpoint — and nobody mapped any of it.

This isn't a configuration gap. It's an architectural one. The decisions that determine whether your AI system is compliance-ready are made in the first few weeks of building, long before legal comes knocking. This post covers the four structural conflicts that regulated-industry engineers need to resolve before shipping AI agents to production.