I have been deploying AI infrastructure for the past three years – first at Google Cloud, now at a startup where we manage LLM deployments for enterprise customers. The OpenClaw breach and the broader pattern of exposed AI agent installations have forced me to rethink how we approach self-hosted AI infrastructure.
The core problem is this: self-hosting AI agents combines the operational complexity of running production infrastructure with the security naivety of a developer laptop setup. And unlike traditional self-hosted software (say, a self-hosted GitLab), AI agents have a uniquely dangerous capability profile.
The Self-Hosting Paradox
People self-host AI agents for legitimate reasons: data privacy, cost control, customization, and avoiding vendor lock-in. These are good reasons. But the typical self-hosting deployment looks like this:
- Clone the repo
- Run docker-compose up
- Configure API keys in a .env file
- Maybe set up a reverse proxy for HTTPS
- Share the URL with the team
Steps 1-3 take about 10 minutes. Step 4 is where things go wrong. Step 5 is where the attack surface expands.
The SentinelLABS/Censys research found 175,108 unique Ollama hosts exposed to the public internet. These are not enterprise deployments – they are individual developers and small teams who wanted to run a local LLM and accidentally made it globally accessible. The same pattern played out with OpenClaw, where nearly 1,000 instances were found running without authentication.
Why Docker Compose Is Not a Security Architecture
Here is the dirty secret of self-hosted AI infrastructure: most deployment guides treat Docker Compose as the production architecture. The typical docker-compose.yml for an AI agent looks like this:
- Application container with unrestricted network access
- A volume mount for persistent data (including credentials)
- Port mapping that binds to 0.0.0.0 (all interfaces) by default
- No resource limits, no security contexts, no network policies
This is fine for development. It is absolutely not fine for a system that holds API keys, processes sensitive conversations, and has command execution capabilities.
The gap between “it runs in Docker” and “it runs securely in Docker” is enormous. You need:
- Network policies that restrict which containers can talk to each other and what external endpoints they can reach
- Read-only filesystem for the application container, with write access only to specific directories
- Security contexts that drop all capabilities and run as non-root
- Secrets management through Docker secrets or an external vault, not environment variables
- Resource limits to prevent a compromised agent from consuming the entire host
- Health checks and monitoring that detect anomalous behavior
Most self-hosting guides skip all of this.
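To make part of that checklist concrete, here is a minimal sketch of a lint pass over one service definition from a docker-compose.yml that has already been parsed into a Python dict. The function names (binds_all_interfaces, audit_service) and the SECRET_HINTS heuristic are my own illustrations, not part of any tool. It only understands short-syntax port strings, and it treats a missing user key as root, which is conservative because the image itself may set a non-root user.

```python
from typing import Any

# Hypothetical heuristic: variable names that usually carry credentials.
SECRET_HINTS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def binds_all_interfaces(port: str) -> bool:
    """Return True if a short-syntax port mapping publishes on every interface.

    "8080:8080" names no host IP, so Docker publishes it on all interfaces;
    "127.0.0.1:8080:8080" is limited to loopback.
    """
    parts = port.split("/")[0].rsplit(":", 2)  # drop "/tcp", split off host IP
    if len(parts) < 3:  # no host IP given
        return True
    return parts[0] in ("0.0.0.0", "[::]")


def audit_service(name: str, svc: dict[str, Any]) -> list[str]:
    """List hardening gaps for one Compose service (short-syntax ports only)."""
    findings = []
    if any(binds_all_interfaces(str(p)) for p in svc.get("ports", [])):
        findings.append("port published on all interfaces")
    if not svc.get("read_only", False):
        findings.append("root filesystem is writable")
    if "ALL" not in svc.get("cap_drop", []):
        findings.append("capabilities not dropped")
    # Conservative assumption: no explicit user means root.
    if str(svc.get("user", "root")).split(":")[0] in ("root", "0"):
        findings.append("runs as root")
    env = svc.get("environment", {})
    names = env.keys() if isinstance(env, dict) else [e.split("=")[0] for e in env]
    if any(hint in n.upper() for n in names for hint in SECRET_HINTS):
        findings.append("secret passed as environment variable")
    limits = svc.get("deploy", {}).get("resources", {})
    if "mem_limit" not in svc and "limits" not in limits:
        findings.append("no memory limit")
    if "healthcheck" not in svc:
        findings.append("no health check")
    return [f"{name}: {f}" for f in findings]


if __name__ == "__main__":
    typical = {"ports": ["8080:8080"], "environment": ["OPENAI_API_KEY=sk-x"]}
    for finding in audit_service("agent", typical):
        print(finding)
```

A check like this does not cover network policies or anomaly monitoring, and it is no substitute for reviewing the file by hand. Its value is that it can run in CI and fail the build when someone reintroduces a default.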
The Kubernetes Version Is Not Much Better
“Just use Kubernetes” is not the answer either. I have reviewed Kubernetes deployments of AI agents that had:
- Pods running as root with host network access
- ServiceAccounts with cluster-admin privileges
- No NetworkPolicies, meaning any pod could reach any other pod and any external endpoint
- Secrets stored in plain ConfigMaps instead of Secret objects
- No RBAC restrictions on who could deploy or modify the agent configuration
Kubernetes gives you the tools to build secure deployments, but the default configuration of most Helm charts and deployment manifests for AI agents is wide open.
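As a rough illustration of how some of those findings could be caught automatically, the sketch below inspects a pod spec and a role binding that have already been loaded into Python dicts (for example from a rendered Helm chart). The function names are hypothetical. The field names (hostNetwork, automountServiceAccountToken, securityContext, roleRef) are standard Kubernetes fields, but the checks are deliberately simplistic and ignore details such as init containers.

```python
from typing import Any


def audit_pod_spec(spec: dict[str, Any]) -> list[str]:
    """List obvious hardening gaps in a pod spec (regular containers only)."""
    findings = []
    if spec.get("hostNetwork", False):
        findings.append("hostNetwork enabled")
    # Kubernetes mounts the service account token unless told otherwise.
    if spec.get("automountServiceAccountToken", True):
        findings.append("service account token automounted")
    pod_sc = spec.get("securityContext", {})
    for container in spec.get("containers", []):
        name = container.get("name", "?")
        sc = container.get("securityContext", {})
        # The container-level setting overrides the pod-level one.
        if not sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False)):
            findings.append(f"{name}: may run as root")
        if sc.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: privilege escalation allowed")
        if not sc.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: root filesystem is writable")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            findings.append(f"{name}: capabilities not dropped")
    return findings


def grants_cluster_admin(binding: dict[str, Any]) -> bool:
    """True if a ClusterRoleBinding hands cluster-admin to a ServiceAccount."""
    is_admin = binding.get("roleRef", {}).get("name") == "cluster-admin"
    to_service_account = any(
        s.get("kind") == "ServiceAccount" for s in binding.get("subjects", [])
    )
    return (
        binding.get("kind") == "ClusterRoleBinding"
        and is_admin
        and to_service_account
    )
```

In a real cluster, admission control (Pod Security Admission or a policy engine) is the better place to enforce this. A script like this is mainly useful for reviewing a chart before it ever gets applied.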
What Secure Self-Hosted AI Infrastructure Actually Looks Like
Based on what we have built for our enterprise customers, here is what a properly secured self-hosted AI agent deployment requires:
Network Layer:
- AI agent only accessible through an authenticated reverse proxy (Envoy, Traefik with forward auth, or a dedicated identity-aware proxy)
- Egress filtering through a transparent proxy that whitelists allowed external endpoints
- mTLS between all internal services
- No direct internet access from the agent container
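The egress rule boils down to a small decision: is this destination on the list or not. Here is a minimal sketch of that decision in Python. The ALLOWED_HOSTS entries and the egress_allowed name are placeholders I made up for illustration. In a real deployment the enforcement belongs in the proxy or firewall, not in the agent process, because a compromised agent can simply skip an in-process check.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the endpoints your agent really needs.
ALLOWED_HOSTS = frozenset({"api.openai.com", "api.anthropic.com"})


def egress_allowed(url: str, allowed: frozenset[str] = ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    # hostname strips userinfo and port; normalise case and a trailing dot.
    host = (parts.hostname or "").lower().rstrip(".")
    return parts.scheme == "https" and host in allowed
```

Matching the exact hostname rather than a substring matters here: a check like "api.openai.com" in url would also accept https://api.openai.com.evil.example/.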
Credential Management:
- All API keys and tokens stored in a secrets manager (HashiCorp Vault, AWS Secrets Manager, or at minimum sealed Kubernetes Secrets)
- Short-lived credentials with automatic rotation where possible
- Separate credential stores for different sensitivity levels
- Audit logging on every credential access
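To show how short-lived credentials and audit logging fit together, here is a small sketch of a wrapper that re-fetches a credential once its lease expires and logs every read. LeasedCredential and its parameters are hypothetical names. The fetch callable stands in for whatever your secrets manager client does, and nothing here is the API of Vault or AWS Secrets Manager.

```python
import logging
import time
from dataclasses import dataclass
from typing import Callable, Optional

audit_log = logging.getLogger("credential_audit")


@dataclass
class Lease:
    value: str
    expires_at: float


class LeasedCredential:
    """Caches a credential for ttl_seconds and logs every access."""

    def __init__(
        self,
        name: str,
        fetch: Callable[[], str],  # stand-in for a secrets manager call
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._name = name
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._lease: Optional[Lease] = None

    def get(self, caller: str) -> str:
        now = self._clock()
        if self._lease is None or now >= self._lease.expires_at:
            # Lease missing or expired: ask the secrets manager again.
            self._lease = Lease(self._fetch(), now + self._ttl)
            audit_log.info("credential=%s event=renewed caller=%s", self._name, caller)
        # Log the read, but never the secret value itself.
        audit_log.info("credential=%s event=read caller=%s", self._name, caller)
        return self._lease.value
```

The clock is injectable only so the expiry logic can be tested without sleeping. In production the audit trail should also come from the secrets manager's own logs, since application-side logging can be tampered with by whoever compromises the agent.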
Runtime Security:
- Container running as non-root with minimal capabilities
- Seccomp and AppArmor profiles to restrict system calls
- Read-only root filesystem
- Runtime monitoring for anomalous process execution and network connections
Observability:
- Structured logging of all agent actions with correlation IDs
- Metrics on API call volume, latency, and error rates per credential
- Alerting on unusual patterns (credential access spikes, new network connections, process execution)
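A minimal sketch of the first and third points, assuming JSON lines as the log format and a plain sliding window as the spike rule. The names log_action and SpikeDetector, the field names, and the thresholds are all illustrative choices. A real setup would ship these lines to a log pipeline and do the alerting there.

```python
import json
import logging
import uuid
from collections import deque

logger = logging.getLogger("agent.actions")


def new_correlation_id() -> str:
    """One ID per user request, attached to every action it triggers."""
    return uuid.uuid4().hex


def log_action(action: str, correlation_id: str, **fields) -> str:
    """Emit one JSON log line per agent action and return it."""
    record = {"action": action, "correlation_id": correlation_id, **fields}
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line


class SpikeDetector:
    """Flags when more than `threshold` events fall inside the time window."""

    def __init__(self, window_seconds: float, threshold: int) -> None:
        self.window = window_seconds
        self.threshold = threshold
        self.events: deque = deque()

    def record(self, timestamp: float) -> bool:
        self.events.append(timestamp)
        # Drop events that have aged out of the window.
        while self.events and self.events[0] <= timestamp - self.window:
            self.events.popleft()
        return len(self.events) > self.threshold
```

For example, one detector per credential fed with the timestamp of each read would cover the "credential access spikes" case. The fixed threshold is the crudest possible rule, so expect to tune it or replace it with a baseline-relative one.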
Update Management:
- Automated vulnerability scanning of container images
- Staged rollout of updates with canary deployment
- Rollback capability if a new version introduces security regressions
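The canary step needs a rule for when to promote and when to roll back. The sketch below compares the canary's error rate against the current version's. The function name and every threshold in it (max_ratio, min_requests, the 1% floor) are assumptions picked for illustration, not recommendations.

```python
def canary_decision(
    baseline_errors: int,
    baseline_total: int,
    canary_errors: int,
    canary_total: int,
    max_ratio: float = 1.5,   # hypothetical tolerance relative to baseline
    min_requests: int = 100,  # hypothetical minimum sample size
) -> str:
    """Return "wait", "promote", or "rollback" for a canary release."""
    if canary_total < min_requests:
        return "wait"  # not enough traffic to judge yet
    baseline_rate = baseline_errors / baseline_total if baseline_total else 0.0
    canary_rate = canary_errors / canary_total
    # The 1% floor keeps a near-zero baseline from forcing a rollback
    # on a single canary error.
    limit = max(baseline_rate * max_ratio, 0.01)
    return "rollback" if canary_rate > limit else "promote"
```

Error rate alone will not catch a security regression, so in practice this gate would sit alongside the image scan results and the runtime alerts described above.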
The Cost of Doing It Right
The honest truth is that this level of security infrastructure costs real time and money. For a small team, we are talking about 2-4 weeks of engineering time to set up properly, plus ongoing maintenance. For larger organizations, you need a dedicated platform team or you adopt a commercial solution.
This is why I think the conversation around AI agent security needs to address the economic question: who pays for the security of self-hosted AI agents? The open-source project maintainers do not have the resources. Individual developers do not have the expertise. And companies are deploying these agents faster than their security teams can review them.
We need either commercial “hardened distribution” approaches (similar to what Red Hat did for Linux) or community-maintained security configurations that can be adopted with minimal modification.
What infrastructure patterns are others using for self-hosted AI agent deployments? I would love to hear from anyone who has gone beyond the default Docker Compose setup.