I’ve been analyzing our CI/CD security posture, and one statistic keeps bothering me: 85% of high-security organizations have adopted ephemeral runners, but the majority of companies are still running persistent runners. As we move into 2026, I’m questioning whether ephemeral runners should be considered baseline infrastructure hygiene rather than an advanced security practice.
Let me explain what ephemeral runners are for those less familiar with the infrastructure details. Traditional CI/CD runners are long-lived virtual machines or containers that persist across multiple build jobs. They accumulate state, secrets, cached dependencies, and environmental configurations over time. An ephemeral runner, by contrast, is a fresh container or VM spun up for a single build job and destroyed immediately afterward. Nothing persists. No state accumulates. Each job starts from a known, clean baseline.
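The state-persistence difference can be sketched in a few lines. This is a toy model, not any real CI product's API: the `EphemeralRunner` and `PersistentRunner` classes and the dict-based "workspace" are illustrative stand-ins for actual VMs or containers.

```python
class EphemeralRunner:
    """Toy model of an ephemeral runner: a fresh workspace per job."""

    def run_job(self, job):
        # A brand-new, isolated workspace is created for every job...
        workspace = {"cache": {}, "env": {}}
        try:
            return job(workspace)
        finally:
            # ...and destroyed unconditionally afterward. Nothing persists.
            workspace.clear()


class PersistentRunner:
    """Toy model of a persistent runner: one workspace shared across jobs."""

    def __init__(self):
        self.workspace = {"cache": {}, "env": {}}

    def run_job(self, job):
        # Job N sees whatever job N-1 left behind: caches, env vars,
        # and anything a compromised job planted.
        return job(self.workspace)
```

Run a "malicious" job that plants something in the environment, then a later job that looks for it: on the persistent runner the planted value survives into the next job, on the ephemeral runner it does not.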
The security implications are profound. With persistent runners, if malware or a malicious dependency compromises a build job, that compromise can persist and affect subsequent builds - potentially for months without detection. We’ve seen this pattern in recent supply chain attacks where adversaries established persistence in build environments and exfiltrated secrets or injected malicious code into artifacts over extended periods.
Ephemeral runners eliminate this persistence vector entirely. A compromised job might steal secrets accessible during that specific build, but it cannot establish a foothold for future exploitation. The blast radius is contained to a single job execution. From a security architecture perspective, this is a fundamental shift from detection-and-response to prevention-by-design.
We implemented ephemeral runners across our infrastructure last year, and I want to share both the benefits and challenges we encountered. The benefits were immediate: our security team stopped worrying about runner compromise persistence, our compliance team loved the clean audit story, and our infrastructure costs actually decreased because we could scale runners elastically rather than maintaining persistent capacity.
The challenges were more subtle. Our existing build workflows had accumulated dependencies on persistent state - cached dependencies, build artifacts from previous jobs, environmental configurations that teams had “fixed” over time without documenting. Moving to ephemeral runners forced us to make all of these dependencies explicit, which was painful but ultimately healthy. We had to redesign our caching strategy, formalize our secret distribution approach, and document environmental requirements that had previously lived in tribal knowledge.
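Making cached dependencies explicit mostly came down to content-addressed cache keys: a fresh runner hashes the dependency lockfile, asks the central registry for a matching archive, and rebuilds from scratch on a miss. A minimal sketch of the key derivation (the `requirements.lock` filename and `deps` prefix are placeholders, not our actual naming scheme):

```python
import hashlib
from pathlib import Path

def cache_key(lockfile: Path, prefix: str = "deps") -> str:
    """Content-addressed cache key: same lockfile bytes -> same key,
    on any runner, with no reliance on local filesystem state.

    An ephemeral runner computes this key at job start, fetches a
    matching archive from the central artifact registry if one exists,
    and falls back to a full dependency install on a cache miss.
    """
    digest = hashlib.sha256(lockfile.read_bytes()).hexdigest()
    return f"{prefix}-{digest[:16]}"
```

Because the key is derived purely from file contents, two jobs on two different fresh pods resolve to the same cache entry, which is what lets the cache move off the runner and into a shared registry.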
The implementation complexity isn’t trivial. We use Kubernetes-based runners with pod templates that define base images, resource limits, and security contexts. Each job gets a fresh pod created from the template; when the job finishes, the pod is deleted after a 30-second TTL. We integrate with our secret management system to inject credentials just-in-time rather than storing them in runner environments. Our dependency caching moved to a centralized artifact registry rather than relying on local filesystem caching.
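The just-in-time piece is the part people ask about most, so here is a rough sketch of the shape of it. Everything here is hypothetical: `mint_token`, the dict standing in for the secret manager, and the 300-second default TTL are illustrative, not our production values.

```python
import time
from dataclasses import dataclass

@dataclass
class ShortLivedToken:
    """A credential minted for a single job, with a hard expiry."""
    value: str
    expires_at: float  # time.monotonic() deadline

    def is_valid(self) -> bool:
        return time.monotonic() < self.expires_at

def mint_token(secret_store: dict, name: str,
               ttl_seconds: float = 300.0) -> ShortLivedToken:
    # In a real setup this would be an API call to the secret manager
    # (e.g. requesting a dynamic credential); the dict stands in for it.
    return ShortLivedToken(value=secret_store[name],
                           expires_at=time.monotonic() + ttl_seconds)

def run_with_secrets(job, secret_store: dict, names: list):
    """Inject credentials just before execution; never store them on the runner."""
    env = {n: mint_token(secret_store, n) for n in names}
    try:
        return job(env)
    finally:
        env.clear()  # tokens never outlive the job, let alone the pod
```

The point of the TTL is defense in depth: even if a job exfiltrates a token, the credential expires shortly after the pod that received it is gone.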
But here’s my fundamental question: given the known risks of persistent runners, the proven effectiveness of ephemeral runners in high-security environments, and the maturity of implementation tooling available in 2026, should we still consider this an “advanced” security practice? Or should ephemeral runners be the expected baseline, with persistent runners relegated to legacy systems we’re actively migrating away from?
I’m particularly interested in perspectives from organizations that haven’t yet made this transition. What’s holding you back - technical complexity, resource constraints, organizational inertia, or something else? And for those who have implemented ephemeral runners, what was your experience? Would you consider this a prerequisite for modern CI/CD security, or am I overstating the case?