The AI Feature Your CTO Funded That Your Security Team Will Not Let You Ship
The post-mortem says "we found security too late." The actual finding is that security found you on time. Your process found security too late.
This is the AI feature that cleared the budget gate in January because the CTO and the CFO agreed the company needed an AI moment. It cleared a light legal review in March because it was a prototype. Engineering built against the agreed spec through Q2. In late July, the launch-readiness security review opened, and on day one the threat model came back with blockers on the auth scopes, the data-exfiltration paths, the model provider's residency story, and the prompt-injection surface. The team's quarter is now spent rebuilding to address findings that should have shaped the original spec. Two quarters of slip, an executive memo about "process improvements," and a quiet decision next planning cycle to "deprioritize AI deep-integrations."
The launch did not fail because security was slow. It failed because security entered after the shape of the feature had already been frozen.
The Shape Constraint Nobody Drew
An AI feature's threat model is not a checklist you stamp at the end. It is a shape constraint. It decides what the feature can be.
Consider a customer-support agent that can read a ticket, query an internal knowledge base, and email a draft response to the user. The PRD describes that user journey in a paragraph. The security threat model describes a different artifact: which scopes the agent's service account holds, whether a user's ticket can contain instructions the agent treats as commands, whether the knowledge base returns documents the agent should refuse to forward, whether the provider's inference endpoint can retain the customer's data, and whether the email draft goes through a human gate.
Those questions do not refine the PRD. They constrain it. The auth scopes determine which tools the agent can be given. The data-exfiltration model determines whether the agent can read free-text from one tenant and write into another. The provider's residency posture determines whether an EU customer can use the feature at all. The prompt-injection surface determines whether the email step can be auto-send or must be human-gated.
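One way to make "the threat model constrains the PRD" concrete is to treat the security answers as inputs and the shippable feature as a derived output. The sketch below is illustrative only: `ThreatModelAnswers`, `FeatureShape`, `derive_shape`, the tool names, and the scope strings are all hypothetical names invented for this example, not part of any real framework.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class SendMode(Enum):
    AUTO_SEND = "auto_send"
    HUMAN_GATED = "human_gated"


@dataclass(frozen=True)
class ThreatModelAnswers:
    """Hypothetical answers a security review supplies before the spec is frozen."""
    granted_scopes: frozenset[str]         # scopes held by the agent's service account
    reads_untrusted_text: bool             # e.g. ticket bodies written by end users
    provider_regions: frozenset[str]       # regions where the provider may process data
    cross_tenant_isolation_verified: bool  # retrieval shown to be tenant-scoped


@dataclass(frozen=True)
class FeatureShape:
    tools: tuple[str, ...]
    send_mode: SendMode
    available_regions: frozenset[str]
    blockers: tuple[str, ...]


# Hypothetical mapping from each tool to the scope it requires.
TOOL_SCOPES = {
    "read_ticket": "tickets:read",
    "query_kb": "kb:read",
    "send_email": "email:send",
}


def derive_shape(answers: ThreatModelAnswers) -> FeatureShape:
    # A tool is only available if the service account holds its scope.
    tools = tuple(sorted(t for t, s in TOOL_SCOPES.items() if s in answers.granted_scopes))
    # If the agent reads untrusted text, the email step cannot be auto-send.
    send_mode = SendMode.HUMAN_GATED if answers.reads_untrusted_text else SendMode.AUTO_SEND
    blockers = []
    if not answers.cross_tenant_isolation_verified:
        blockers.append("retrieval isolation unverified")
    return FeatureShape(tools, send_mode, answers.provider_regions, tuple(blockers))


# Example: a US-only provider and an agent that reads user-written tickets.
answers = ThreatModelAnswers(
    granted_scopes=frozenset({"tickets:read", "kb:read", "email:send"}),
    reads_untrusted_text=True,
    provider_regions=frozenset({"us"}),
    cross_tenant_isolation_verified=False,
)
shape = derive_shape(answers)
```

Under these assumed answers, the derived shape is human-gated, unavailable to EU customers, and carries an open blocker. A PRD that promises auto-send to every region is describing a different feature than the one these inputs allow.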
If you sign off on a PRD that says "the agent emails the draft to the user" without resolving those constraints, you have not designed a feature. You have written a wish, and the shape of what you can actually ship is going to be discovered later by someone else, against your timeline.
Why "Lightweight Prototype Review" Is The Trap
Non-AI features pass through a lightweight architecture review at the design-doc stage. A new payments path gets a security architect's eyes on it. A new auth integration gets a thirty-minute threat model. A new admin endpoint gets a quick read for IDOR risk. Nobody complains because everyone has internalized that you cannot retrofit authn/authz after the fact without paying a tax.
AI features are skipping that step. The reason is cultural: AI features feel research-shaped. They emerge from a notebook, a prompt, a Loom demo. They look like they have not been "built" yet, so applying the normal architecture review feels premature. Product treats the early build as a prototype. Engineering treats it as a probe of feasibility. Security is not invited because there is supposedly nothing to review.
By the time there is something to review, the auth model has been decided implicitly by what the prototype's service account happened to have access to. The data flow has been decided implicitly by which APIs were easiest to wire up. The provider has been decided implicitly by which SDK the prototyper imported first. The thing that looks like a prototype is already a frozen architecture. It just has not been labeled that way.
The fix is not to slow the prototype down. It is to recognize that an AI prototype's first commit is a higher-stakes design decision than a non-AI feature's design doc, and to apply the same lightweight architecture review accordingly.
What An AI Threat Model Actually Catches
The threat surface for an AI agent looks different from a traditional application. It is dynamic, context-dependent, and capable of taking consequential actions in real time, often through tools the security team has not previously catalogued as code paths.
A useful AI threat model catches at least five things that a generic API review misses:
- Prompt-injection surface. Every untrusted text the model reads is a potential instruction. A support ticket, a web page the agent fetches, an attachment, a tool result — any of these can carry hidden instructions that re-aim the agent. Microsoft's security team documented prompt injection paths in agent frameworks that escalate from "the model said something weird" to host-level code execution, because once a model is wired to tools, prompt injection is a code-execution primitive, not a content problem.
- Tool-scope blast radius. The agent's service account is the agent's authority. If the agent has a `send_email` tool and a `read_ticket` tool, prompt injection through a ticket can trigger an email. The blast radius is not "the model behaved badly" but "the agent took a real action through a tool the team gave it." Threat models force a tool-by-tool capability audit before the tool list is committed.
- Cross-tenant context leakage. Many AI features share a vector index, a prompt-cache, or a fine-tuned adapter across tenants. The retrieval step then becomes the cross-tenant boundary, and if the retrieval keys are wrong by one column, tenant A's documents end up in tenant B's prompt. This is not an exotic failure mode; it is the default for teams that build retrieval before they design isolation.
- Provider data path and residency. If the LLM is processing customer data, the LLM provider is a subprocessor. That typically triggers a DPA amendment, Standard Contractual Clauses when EU personal data is transferred outside the EEA, an opt-out from training and retention, and possibly a choice of model with GDPR-compatible terms. A SaaS application makes residency decisions at provisioning; an AI agent makes them at inference, which means the residency posture cannot be deferred to the deploy step.
- Indirect prompt injection from third-party content. Indirect prompt injection — instructions hidden in a web page or document the agent reads — was theoretical two years ago and is now observed in the wild against production agents. If your feature reads any content the user did not author, that content is an attack surface, and the threat model has to call out which fetches need an isolation step.
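Several of the items above reduce to checks that can sit in front of the agent's tools. The following is a minimal sketch, assuming a toy in-memory index and hypothetical names throughout (`AgentContext`, `authorize_tool_call`, `retrieve`, `ToolCallBlocked`, and the tool names). It marks the context as tainted once untrusted text enters it, requires human approval for consequential tools while tainted, and applies the tenant filter before any retrieval matching. It is one possible shape for these controls, not a complete defense against prompt injection.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class ToolCallBlocked(Exception):
    """Raised when a tool call is refused or needs a human decision first."""


# Hypothetical tool names; which tools count as consequential is a threat-model output.
ALLOWED_TOOLS = {"read_ticket", "query_kb", "send_email"}
CONSEQUENTIAL_TOOLS = {"send_email"}


@dataclass
class AgentContext:
    tenant_id: str
    tainted: bool = False  # becomes True once untrusted text enters the prompt
    sources: list[str] = field(default_factory=list)

    def add_text(self, source: str, trusted: bool) -> None:
        self.sources.append(source)
        if not trusted:
            self.tainted = True


def authorize_tool_call(ctx: AgentContext, tool: str, human_approved: bool = False) -> None:
    """Raise ToolCallBlocked unless the call is allowed for this context."""
    if tool not in ALLOWED_TOOLS:
        raise ToolCallBlocked(f"{tool} is not on the allowlist")
    if tool in CONSEQUENTIAL_TOOLS and ctx.tainted and not human_approved:
        raise ToolCallBlocked(f"{tool} needs human approval: context holds untrusted text")


def retrieve(index: list[dict], ctx: AgentContext, query: str) -> list[dict]:
    """Toy retrieval: the tenant filter runs first, before any matching."""
    scoped = [doc for doc in index if doc["tenant_id"] == ctx.tenant_id]
    return [doc for doc in scoped if query.lower() in doc["text"].lower()]


# Example: tenant B's agent reads a user-written ticket, then tries to send email.
index = [
    {"tenant_id": "a", "text": "Refund policy for tenant A"},
    {"tenant_id": "b", "text": "Refund policy for tenant B"},
]
ctx = AgentContext(tenant_id="b")
hits = retrieve(index, ctx, "refund")      # only tenant B's document can match
ctx.add_text("ticket:123", trusted=False)  # ticket body is untrusted input

try:
    authorize_tool_call(ctx, "send_email")
    blocked = False
except ToolCallBlocked:
    blocked = True                         # auto-send is refused while tainted
```

The design choice worth noting is that neither check depends on the model behaving well. The tenant filter and the approval gate are enforced in ordinary code around the model, which is where a threat model can actually make guarantees.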
- https://www.microsoft.com/en-us/security/blog/2026/05/07/prompts-become-shells-rce-vulnerabilities-ai-agent-frameworks/
- https://venturebeat.com/security/openai-admits-that-prompt-injection-is-here-to-stay
- https://openai.com/index/prompt-injections/
- https://unit42.paloaltonetworks.com/ai-agent-prompt-injection/
- https://www.armosec.io/blog/privacy-and-data-residency-for-ai-agents/
- https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ai-agents/governance-security-across-organization
- https://arxiv.org/pdf/2505.06315
- https://www.zscaler.com/blogs/security-research/ai-now-default-enterprise-accelerator-takeaways-threatlabz-2026-ai-security
- https://about.gitlab.com/topics/devsecops/shift-left-security/
- https://worqlo.com/blog/enterprise-ai-vendor-rfp-questions/
