Self-Hosting AI Agents Securely: Why Docker Compose Is Not a Security Architecture and What To Do Instead

I have been deploying AI infrastructure for the past three years – first at Google Cloud, now at a startup where we manage LLM deployments for enterprise customers. The OpenClaw breach and the broader pattern of exposed AI agent installations have forced me to rethink how we approach self-hosted AI infrastructure.

The core problem is this: self-hosting AI agents combines the operational complexity of running production infrastructure with the security naivety of a developer laptop setup. And unlike traditional self-hosted software (say, a self-hosted GitLab), AI agents have a uniquely dangerous capability profile.

The Self-Hosting Paradox

People self-host AI agents for legitimate reasons: data privacy, cost control, customization, and avoiding vendor lock-in. These are good reasons. But the typical self-hosting deployment looks like this:

  1. Clone the repo
  2. Run docker-compose up
  3. Configure API keys in a .env file
  4. Maybe set up a reverse proxy for HTTPS
  5. Share the URL with the team

Steps 1-3 take about 10 minutes. Step 4 is where things go wrong, because it is treated as optional and is often skipped or set up without authentication. Step 5 is where the attack surface expands.

The SentinelLABS/Censys research found 175,108 unique Ollama hosts exposed to the public internet. These are not enterprise deployments – they are individual developers and small teams who wanted to run a local LLM and accidentally made it globally accessible. The same pattern played out with OpenClaw, where nearly 1,000 instances were found running without authentication.

Why Docker Compose Is Not a Security Architecture

Here is the dirty secret of self-hosted AI infrastructure: most deployment guides treat Docker Compose as the production architecture. The typical docker-compose.yml for an AI agent looks like this:

  • Application container with unrestricted network access
  • A volume mount for persistent data (including credentials)
  • Port mapping that binds to 0.0.0.0 (all interfaces) by default
  • No resource limits, no security contexts, no network policies

This is fine for development. It is absolutely not fine for a system that holds API keys, processes sensitive conversations, and has command execution capabilities.

The gap between “it runs in Docker” and “it runs securely in Docker” is enormous. You need:

  • Network policies that restrict which containers can talk to each other and what external endpoints they can reach
  • Read-only filesystem for the application container, with write access only to specific directories
  • Security contexts that drop all capabilities and run as non-root
  • Secrets management through Docker secrets or an external vault, not environment variables
  • Resource limits to prevent a compromised agent from consuming the entire host
  • Health checks and monitoring that detect anomalous behavior

Most self-hosting guides skip all of this.
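
As a rough illustration of the checklist above, here is a minimal sketch of a linter for one service entry from a docker-compose.yml, passed in as an already-parsed dictionary. The function name audit_compose_service and the keyword list used to spot credentials are assumptions made for this example; the Compose keys it reads (ports, host_ip, read_only, cap_drop, user, mem_limit, deploy.resources.limits, environment) are standard, but the checks are deliberately simplistic and only understand IPv4 short-syntax port strings.

```python
from typing import Any

CREDENTIAL_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")  # heuristic, an assumption


def audit_compose_service(name: str, service: dict[str, Any]) -> list[str]:
    """Return human-readable findings for one parsed Compose service."""
    findings = []

    # "8080:8080" publishes on all interfaces; "127.0.0.1:8080:8080" does not.
    for port in service.get("ports", []):
        if isinstance(port, dict):  # long syntax
            host_ip = port.get("host_ip", "")
        else:  # short syntax: [HOST_IP:][HOST_PORT:]CONTAINER_PORT
            parts = str(port).split(":")
            host_ip = parts[0] if len(parts) == 3 else ""
        if host_ip in ("", "0.0.0.0"):
            findings.append(f"{name}: port {port!r} is published on all interfaces")

    if not service.get("read_only", False):
        findings.append(f"{name}: root filesystem is writable (read_only is not set)")

    if "ALL" not in service.get("cap_drop", []):
        findings.append(f"{name}: capabilities are not dropped (cap_drop: [ALL])")

    # Without an explicit user, the image default applies, which is often root.
    user = str(service.get("user", ""))
    if user in ("", "0", "root") or user.startswith(("0:", "root:")):
        findings.append(f"{name}: container may run as root")

    limits = service.get("deploy", {}).get("resources", {}).get("limits", {})
    if not limits and "mem_limit" not in service:
        findings.append(f"{name}: no resource limits")

    env = service.get("environment", {})
    keys = list(env) if isinstance(env, dict) else [str(e).split("=", 1)[0] for e in env]
    for key in keys:
        if any(marker in key.upper() for marker in CREDENTIAL_MARKERS):
            findings.append(f"{name}: {key} looks like a credential in an environment variable")

    return findings


typical = {"ports": ["8080:8080"], "environment": ["LLM_API_KEY=example"]}
for finding in audit_compose_service("agent", typical):
    print(finding)
```

A check like this only reads configuration. It cannot tell whether a host firewall or a proxy in front of the container changes the picture, so it is a starting point rather than a verdict.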

The Kubernetes Version Is Not Much Better

“Just use Kubernetes” is not the answer either. I have reviewed Kubernetes deployments of AI agents that had:

  • Pods running as root with host network access
  • ServiceAccounts with cluster-admin privileges
  • No NetworkPolicies, meaning any pod could reach any other pod and any external endpoint
  • Credentials stored in plain ConfigMaps instead of Secret objects
  • No RBAC restrictions on who could deploy or modify the agent configuration

Kubernetes gives you the tools to build secure deployments, but the default configuration of most Helm charts and deployment manifests for AI agents is wide open.
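
The same kind of check can be sketched for a Kubernetes pod spec. This is an illustrative example, not a replacement for an admission controller: it takes the spec section of a Pod manifest as a parsed dictionary and flags several of the problems listed above. The function name audit_pod_spec is hypothetical; the field names (hostNetwork, securityContext, runAsNonRoot, allowPrivilegeEscalation, readOnlyRootFilesystem, capabilities.drop) are the standard Kubernetes ones.

```python
from typing import Any


def audit_pod_spec(pod_spec: dict[str, Any]) -> list[str]:
    """Flag common hardening gaps in the `spec` section of a Pod manifest."""
    findings = []

    if pod_spec.get("hostNetwork", False):
        findings.append("pod shares the host network namespace")

    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})

        # A container-level setting takes precedence over the pod-level one.
        non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False))
        if not non_root:
            findings.append(f"{name}: runAsNonRoot is not enforced")

        # Treated as allowed unless the manifest explicitly sets it to false.
        if ctx.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: privilege escalation is not disabled")

        if not ctx.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: root filesystem is writable")

        if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
            findings.append(f"{name}: capabilities are not dropped")

    return findings


wide_open = {"hostNetwork": True, "containers": [{"name": "agent"}]}
for finding in audit_pod_spec(wide_open):
    print(finding)
```

ServiceAccount privileges, NetworkPolicies, and RBAC live in other objects, so they are out of scope for this sketch.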

What Secure Self-Hosted AI Infrastructure Actually Looks Like

Based on what we have built for our enterprise customers, here is what a properly secured self-hosted AI agent deployment requires:

Network Layer:

  • AI agent only accessible through an authenticated reverse proxy (Envoy, Traefik with forward auth, or a dedicated identity-aware proxy)
  • Egress filtering through a transparent proxy that whitelists allowed external endpoints
  • mTLS between all internal services
  • No direct internet access from the agent container
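
To make the egress rule concrete, here is a minimal sketch of the allowlist decision an egress proxy would make. In a real deployment this logic sits in the proxy, not in the agent, since a compromised agent could simply skip the check. The host names and the function name is_egress_allowed are placeholders invented for this example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the endpoints your agent actually needs.
ALLOWED_EGRESS_HOSTS = frozenset({"api.llm-provider.example", "git.internal.example"})


def is_egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_EGRESS_HOSTS) -> bool:
    """Allow only HTTPS requests whose host exactly matches an allowlisted name."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # Exact match on the parsed host: substring or suffix checks would let
    # "api.llm-provider.example.attacker.example" through.
    host = (parts.hostname or "").lower().rstrip(".")
    return host in allowed_hosts


for candidate in (
    "https://api.llm-provider.example/v1/chat",
    "https://api.llm-provider.example.attacker.example/",
    "http://api.llm-provider.example/v1/chat",
):
    print(candidate, is_egress_allowed(candidate))
```

Matching on the exact parsed host is a deliberate choice here: it also rejects URLs that hide the real destination behind userinfo, such as https://api.llm-provider.example@attacker.example/.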

Credential Management:

  • All API keys and tokens stored in a secrets manager (HashiCorp Vault, AWS Secrets Manager, or at minimum sealed Kubernetes Secrets)
  • Short-lived credentials with automatic rotation where possible
  • Separate credential stores for different sensitivity levels
  • Audit logging on every credential access

Runtime Security:

  • Container running as non-root with minimal capabilities
  • Seccomp profiles to restrict system calls, plus AppArmor profiles to restrict file, network, and capability access
  • Read-only root filesystem
  • Runtime monitoring for anomalous process execution and network connections

Observability:

  • Structured logging of all agent actions with correlation IDs
  • Metrics on API call volume, latency, and error rates per credential
  • Alerting on unusual patterns (credential access spikes, new network connections, process execution)
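
A minimal sketch of the first item, structured logging with correlation IDs, using only the Python standard library. The logger name, the field names, and the helper log_action are assumptions for this example, not the format of any particular agent framework.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# One correlation ID per agent task, so every action it triggers can be joined later.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "event": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        payload.update(getattr(record, "fields", {}))  # extra structured fields
        return json.dumps(payload)


def make_logger(stream=None) -> logging.Logger:
    """Build the audit logger; `stream` defaults to stderr."""
    logger = logging.getLogger("agent.audit")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def log_action(logger: logging.Logger, event: str, **fields) -> None:
    """Log one agent action with arbitrary key/value context."""
    logger.info(event, extra={"fields": fields})


audit_log = make_logger()
correlation_id.set(uuid.uuid4().hex)
log_action(audit_log, "credential_access", credential="github-token")
log_action(audit_log, "tool_call", tool="shell", exit_code=0)
```

Because both lines carry the same correlation_id, a log query can reconstruct which credential access preceded which tool call, which is what makes the alerting in the third item feasible.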

Update Management:

  • Automated vulnerability scanning of container images
  • Staged rollout of updates with canary deployment
  • Rollback capability if a new version introduces security regressions

The Cost of Doing It Right

The honest truth is that this level of security infrastructure costs real time and money. For a small team, we are talking about 2-4 weeks of engineering time to set up properly, plus ongoing maintenance. For larger organizations, you need a dedicated platform team or you adopt a commercial solution.

This is why I think the conversation around AI agent security needs to address the economic question: who pays for the security of self-hosted AI agents? The open-source project maintainers do not have the resources. Individual developers do not have the expertise. And companies are deploying these agents faster than their security teams can review them.

We need either commercial “hardened distribution” approaches (similar to what Red Hat did for Linux) or community-maintained security configurations that can be adopted with minimal modification.

What infrastructure patterns are others using for self-hosted AI agent deployments? I would love to hear from anyone who has gone beyond the default Docker Compose setup.

Alex, this is a great breakdown of the infrastructure reality. I want to zoom in on the credential management piece because I think it is the most underappreciated part of this problem.

You mentioned HashiCorp Vault and AWS Secrets Manager. Both are excellent, but they add significant complexity to a self-hosted deployment. For small teams and individual developers, the realistic options are:

  1. Environment variables (what everyone actually uses) – terrible for audit logging, no rotation, no access control
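  2. Docker secrets – better, but the encrypted, cluster-managed variant requires Docker Swarm; plain Docker Compose can only mount a secret from a local file into the container under /run/secrets, which keeps it out of the environment but adds no encryption, rotation, or audit trail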
  3. File-based secrets with strict permissions – workable for single-host deployments but fragile
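
For option 3, a minimal sketch of what "strict permissions" can mean in practice: the agent refuses to load a credential file that anyone other than its owner can access. This assumes a POSIX host (the permission bits mean little on Windows), and the function name read_secret_file is made up for this example.

```python
import os
import stat
from pathlib import Path


def read_secret_file(path: str) -> str:
    """Return the secret stored at `path`, or raise if the file is too exposed."""
    p = Path(path)
    info = p.stat()

    if not stat.S_ISREG(info.st_mode):
        raise PermissionError(f"{p} is not a regular file")

    # Any group or other permission bit means another local user could
    # read or replace the credential.
    if info.st_mode & (stat.S_IRWXG | stat.S_IRWXO):
        mode = stat.S_IMODE(info.st_mode)
        raise PermissionError(f"{p} has mode {mode:o}; expected 600 or stricter")

    # Only the owner should be using it: the same user the agent runs as.
    if info.st_uid != os.getuid():
        raise PermissionError(f"{p} is not owned by the current user")

    return p.read_text(encoding="utf-8").strip()
```

Failing loudly at startup is the point: it turns the "fragile" part of this option (someone quietly running chmod 644) into an error instead of a silent exposure.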

The gap in the market is a lightweight, self-hosted secrets manager that is designed specifically for AI agent deployments. Something that:

  • Takes 5 minutes to set up alongside the agent
  • Provides a REST API for credential retrieval
  • Logs every credential access with timestamp and context
  • Supports basic rotation (even if it is just “generate new key, update reference”)
  • Alerts when a credential is accessed from an unexpected source

I have been prototyping something like this for my clients. The working prototype uses a simple SQLite-backed service with mTLS for authentication between the agent and the secrets store. It is nowhere near as sophisticated as Vault, but it solves the core problems: credentials are not in environment variables, every access is logged, and you can revoke a credential without restarting the agent.
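
To show the shape of the idea (this is not the prototype described above, just a sketch under simplifying assumptions), here is a SQLite-backed store where every lookup is logged and a credential can be revoked at runtime. The class name, table layout, and the client string are all invented for the example; a real service would authenticate the client (for instance via the mTLS certificate) and encrypt values at rest, neither of which is shown here.

```python
import sqlite3
import time
from typing import Optional


class SecretStore:
    """Toy secrets store: plaintext values, audit log, revocation."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.db = sqlite3.connect(db_path)
        self.db.executescript(
            """
            CREATE TABLE IF NOT EXISTS secrets (
                name TEXT PRIMARY KEY, value TEXT NOT NULL,
                revoked INTEGER NOT NULL DEFAULT 0);
            CREATE TABLE IF NOT EXISTS access_log (
                ts REAL NOT NULL, name TEXT NOT NULL,
                client TEXT NOT NULL, outcome TEXT NOT NULL);
            """
        )

    def put(self, name: str, value: str) -> None:
        """Create or rotate a secret; rotation also clears a revocation."""
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO secrets (name, value, revoked) VALUES (?, ?, 0)",
                (name, value),
            )

    def revoke(self, name: str) -> None:
        with self.db:
            self.db.execute("UPDATE secrets SET revoked = 1 WHERE name = ?", (name,))

    def get(self, name: str, client: str) -> Optional[str]:
        """Return the secret, or None if missing or revoked. Always logs the attempt."""
        row = self.db.execute(
            "SELECT value, revoked FROM secrets WHERE name = ?", (name,)
        ).fetchone()
        outcome = "missing" if row is None else "revoked" if row[1] else "granted"
        with self.db:
            self.db.execute(
                "INSERT INTO access_log VALUES (?, ?, ?, ?)",
                (time.time(), name, client, outcome),
            )
        return row[0] if outcome == "granted" else None

    def audit_trail(self) -> list:
        return self.db.execute(
            "SELECT name, client, outcome FROM access_log ORDER BY rowid"
        ).fetchall()


store = SecretStore()
store.put("github-token", "example-value")
store.get("github-token", client="agent-1")
store.revoke("github-token")
store.get("github-token", client="agent-1")
print(store.audit_trail())
```

Because the agent fetches the credential on each use instead of reading it once at startup, revocation takes effect on the next lookup without a restart.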

The commercial opportunity here is obvious, but I think it needs to be open source first to get adoption. AI agent developers will not adopt something proprietary for credential management when their entire stack is open source.

I appreciate the thorough infrastructure blueprint, but I want to push back on the premise that this level of security is necessary for all self-hosted AI agent deployments.

There is a difference between:

  1. A developer running an AI agent on their laptop for personal productivity
  2. A team running a shared agent on an internal server
  3. A company deploying agents as part of production infrastructure

Your security architecture is absolutely appropriate for case 3. But for case 1, it is massive overkill. If I am running an AI agent that only binds to localhost and the credentials are for my personal API keys, I do not need mTLS, Vault, and seccomp profiles. I need basic firewall rules and a locked screen.

The risk is that if we demand enterprise-grade security for every deployment, we push developers toward hosted alternatives that actually have worse privacy properties. The whole point of self-hosting is avoiding sending your data to a third party. If the self-hosted option requires a security engineering team to deploy safely, most people will just use the cloud version and share everything with the provider.

What I think would be more useful than a comprehensive security architecture is a tiered approach:

  • Tier 1 (Personal): Bind to localhost only, file-based credential storage with restrictive permissions, basic firewall
  • Tier 2 (Team): Reverse proxy with basic auth, credential encryption, network isolation from public internet
  • Tier 3 (Enterprise): Everything in your post – zero-trust proxy, secrets manager, runtime monitoring, the full stack

Each tier should have a deployment guide and a validation script that checks whether the deployment meets the tier’s requirements. That way people can make an informed decision about their security posture rather than choosing between “no security” and “full enterprise security.”
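
A validation script along those lines could start as small as the sketch below. Everything here is hypothetical: the Deployment fields are self-reported facts standing in for real probes (a real script would inspect sockets, file modes, and proxy configuration itself), and the mapping of requirements to tiers is one reading of the list above.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Deployment:
    """Self-reported facts about a deployment (hypothetical schema)."""
    bind_address: str                 # IP literal the agent listens on
    secret_file_mode: int = 0o644     # permission bits of the credential file
    auth_proxy: bool = False
    credentials_encrypted: bool = False
    secrets_manager: bool = False
    runtime_monitoring: bool = False


def validate_tier(d: Deployment, tier: int) -> list[str]:
    """Return the unmet requirements for the requested tier (empty means pass)."""
    if tier not in (1, 2, 3):
        raise ValueError("tier must be 1, 2, or 3")
    failures = []

    if tier == 1:
        try:
            loopback = ipaddress.ip_address(d.bind_address).is_loopback
        except ValueError:  # not an IP literal, e.g. a host name
            loopback = False
        if not loopback:
            failures.append("agent must bind to a loopback address")
        if d.secret_file_mode & 0o077:
            failures.append("credential file must not be accessible to group or others")

    if tier >= 2:
        if not d.auth_proxy:
            failures.append("agent must sit behind an authenticating reverse proxy")
        if not d.credentials_encrypted:
            failures.append("credentials must be encrypted at rest")

    if tier == 3:
        if not d.secrets_manager:
            failures.append("credentials must come from a secrets manager")
        if not d.runtime_monitoring:
            failures.append("runtime monitoring must be enabled")

    return failures


laptop = Deployment(bind_address="0.0.0.0")
for failure in validate_tier(laptop, tier=1):
    print("Tier 1:", failure)
```

Returning a list of unmet requirements, instead of a single pass/fail, supports the informed decision argued for above: people can see exactly what separates their deployment from the next tier.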

The financial dimension of this is worth spelling out because it affects how CTOs make the build-vs-buy decision for AI agent infrastructure.

I ran the numbers for our organization (roughly 120 engineers):

Option A: Self-hosted with proper security

  • Infrastructure engineering time: 3-4 weeks initial setup = ~$40K in loaded eng cost
  • Ongoing maintenance: 0.5 FTE = ~$100K/year
  • Cloud infrastructure for proxies, monitoring, secrets management: ~$2K/month = $24K/year
  • Total year-one cost: ~$164K

Option B: Commercial AI agent platform with built-in security

  • Per-seat licensing: $50-150/seat/month across 120 seats = $72K-$216K/year
  • Integration engineering: 1-2 weeks = ~$20K
  • Total year-one cost: ~$92K-$236K

Option C: Self-hosted without security (what most teams actually do)

  • Engineering time: 1-2 days = ~$3K
  • Cloud infrastructure: ~$500/month = $6K/year
  • Total year-one cost: ~$9K, before any cost of an incident
  • Risk: Unknown, but OpenClaw showed it can be catastrophic

The uncomfortable truth is that Option C is what most organizations choose because the cost is obvious and the risk is invisible. Until something goes wrong.
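
For anyone who wants to rerun the comparison with their own figures, the arithmetic above fits in a few lines. The numbers are the ones quoted in this post; the function names and the break-even calculation are additions for illustration.

```python
ENGINEERS = 120
MONTHS = 12


def option_a_year_one() -> int:
    """Self-hosted with proper security."""
    setup = 40_000                 # 3-4 weeks of loaded engineering cost
    maintenance = 100_000          # 0.5 FTE per year
    cloud = 2_000 * MONTHS         # proxies, monitoring, secrets management
    return setup + maintenance + cloud


def option_b_year_one(seat_price_per_month: float) -> float:
    """Commercial platform with built-in security."""
    licensing = seat_price_per_month * ENGINEERS * MONTHS
    integration = 20_000           # 1-2 weeks of integration engineering
    return licensing + integration


def option_c_year_one() -> int:
    """Self-hosted without security; the cost of an incident is not included."""
    return 3_000 + 500 * MONTHS


def break_even_seat_price() -> float:
    """Monthly seat price at which option B costs the same as option A in year one."""
    return (option_a_year_one() - 20_000) / (ENGINEERS * MONTHS)


print("A:", option_a_year_one())
print("B:", option_b_year_one(50), "to", option_b_year_one(150))
print("C:", option_c_year_one())
print("Break-even seat price:", break_even_seat_price())
```

With these inputs the break-even sits at $100 per seat per month: below that, the commercial platform is cheaper in year one than self-hosting securely, and above it the self-hosted route wins. Neither comes close to the roughly $9K of option C, which is the whole problem.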

At my company, we are moving to a hybrid approach: a centrally managed AI agent platform team that provides secure deployment infrastructure as a service to engineering teams. Individual teams get the customization benefits of self-hosting without having to build the security layer themselves. It is essentially an internal “hardened distribution” model.

But this only works if you have the organizational scale to justify a dedicated platform team. For startups and small companies, the economic case for self-hosting securely is genuinely difficult.