April 16, 2026
Why Docker Containers Are Not Secure Enough for AI Agents
The Shared Kernel Problem
When you give an AI agent the ability to run bash, install packages, and modify files, you're handing it the keys to the machine. The only question is: how strong is the lock?
Docker containers share the host kernel. That's the fundamental problem. A container is a process with Linux namespaces and cgroups — isolation by convention, not by hardware. The entire Linux syscall interface (~300+ syscalls) is exposed to every container. A vulnerability in any kernel code path reachable from inside the container can lead to escape.
Real Container Escapes
These aren't theoretical risks. Container escape vulnerabilities are found regularly and have public exploits:
- CVE-2024-21626 (CVSS 8.6) — runc "Leaky Vessels" vulnerability. A leaked file descriptor let a container's working directory resolve into the host filesystem, enabling escape. Affected virtually every Docker and Kubernetes installation running vulnerable runc. January 2024.
- CVE-2022-0185 — Linux kernel heap overflow in the filesystem context API. Exploitable from unprivileged containers to gain host root. Used in real CTF competitions and exploit chains.
- CVE-2019-5736 (CVSS 8.6) — runc vulnerability allowing a malicious container to overwrite the host runc binary and gain root on the host. Public exploit available. Required emergency patching industry-wide.
- CVE-2022-0492 — cgroups v1 escape allowing privilege escalation from containers to host.
- CVE-2020-15257 — containerd-shim exposed its API over an abstract Unix socket; containers running in the host network namespace could reach it and gain root on the host.
Why AI Agents Make It Worse
Traditional containers run known, audited application code — a web server, a database, a worker process. The syscall profile is predictable. You can write a tight seccomp filter.
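For a fixed workload, that tight filter is easy to express. As a sketch, a Docker seccomp allowlist for a hypothetical network service might look like this — the syscall names below are illustrative, not a complete profile for any real server:

```json
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {
      "names": [
        "accept4", "bind", "brk", "close", "epoll_wait", "exit_group",
        "listen", "mmap", "openat", "read", "recvfrom", "sendto", "write"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
```

Applied with `docker run --security-opt seccomp=profile.json`, every syscall outside the allowlist fails with an error instead of reaching the kernel.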
AI agents run arbitrary, unpredictable code. They:
- Install untrusted packages — `pip install` and `npm install` pull code from public registries, where typosquatting and supply-chain attacks are common.
- Execute LLM-generated code — the agent writes code and runs it. The code is non-deterministic; you cannot audit it in advance.
- Browse untrusted websites — web content can contain prompt injection payloads that instruct the agent to execute escape commands.
- Hold persistent shell sessions — unlike a single HTTP request, agents maintain long-running shells, giving exploits more time and more surface area.
- Make unpredictable syscalls — you cannot write a meaningful seccomp allowlist when you don't know what the agent will run next.
The Isolation Spectrum
| | Docker | gVisor | Firecracker |
|---|---|---|---|
| Kernel | Shared host kernel | User-space sentry | Dedicated per VM |
| Syscall surface | ~300+ syscalls | ~70 intercepted | ~25 hypercalls (KVM) |
| Escape impact | Full host access | Sentry contained | VM only (hypervisor boundary) |
| Known escapes | Multiple CVEs yearly | Sentry bugs possible | No public escapes |
| Boot time | ~100ms | ~150ms | <500ms |
| Memory overhead | ~10 MB | ~50 MB | ~5 MB |
| VMM codebase | Linux kernel (30M+ LOC) | Sentry (~200K LOC Go) | Firecracker (~50K LOC Rust) |
| Safe for AI agents? | No | Partial | Yes |
Why Firecracker Wins
Firecracker — the technology behind AWS Lambda and Fargate — runs each sandbox as a lightweight virtual machine with its own Linux kernel. There is no shared kernel. A vulnerability inside the VM cannot reach the host. The attack surface is the Firecracker VMM: ~50,000 lines of Rust with a minimal device model. Compare that to the Linux kernel's 30 million+ lines of C that every Docker container shares.
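Booting such a microVM takes little more than a kernel image and a root filesystem. A minimal sketch of a Firecracker config file — the paths and sizing here are placeholders:

```json
{
  "boot-source": {
    "kernel_image_path": "vmlinux.bin",
    "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
  },
  "drives": [
    {
      "drive_id": "rootfs",
      "path_on_host": "rootfs.ext4",
      "is_root_device": true,
      "is_read_only": false
    }
  ],
  "machine-config": {
    "vcpu_count": 2,
    "mem_size_mib": 256
  }
}
```

Launched with `firecracker --api-sock /tmp/fc.sock --config-file vm-config.json`, the guest boots its own kernel; the host kernel is only entered through the narrow KVM interface.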
The numbers matter. Firecracker has had zero public VM escapes since its launch in 2018. Docker/runc has had multiple critical escapes (CVE-2019-5736, CVE-2024-21626) with public exploits that required emergency industry-wide patching. For workloads that run arbitrary untrusted code — which is exactly what AI agents do — this difference is existential.
What About gVisor?
Google's gVisor (used in Cloud Run) intercepts syscalls with a user-space kernel called Sentry. It's better than raw Docker — the host kernel is protected by the Sentry layer. But gVisor has compatibility gaps (not all syscalls are implemented), ~5-15% CPU overhead from syscall interception, and the Sentry itself is a ~200K LOC attack surface written in Go. For AI agents that need full Linux compatibility (installing arbitrary packages, running any binary), gVisor's compatibility limitations can cause silent failures.
e2a's Security Model
At e2a, every sandbox is a Firecracker microVM:
- Dedicated kernel — each VM boots its own Linux kernel. Kernel exploits are contained to that VM.
- Isolated memory — KVM hardware enforcement. No memory sharing between VMs or with the host.
- Isolated network — each VM gets its own TAP interface and IP. No network namespace sharing.
- Ephemeral by default — sandbox destroyed = all state gone. No leaked data between sessions.
- Workspace isolation — when persistence is enabled, S3 paths are scoped per user/app/capset with STS credentials. No cross-tenant access.
- Sub-second boot — under 500ms to a full Linux environment. Isolation doesn't cost you startup speed.
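The workspace-scoping rule above can be sketched as an STS session-policy generator. This is an illustrative pattern, not e2a's actual code — the bucket layout, prefix scheme, and function name are all hypothetical:

```python
import json


def scoped_s3_policy(bucket: str, user: str, app: str, capset: str) -> str:
    """Build an IAM session policy that confines STS credentials to one
    tenant's workspace prefix, so a sandbox can never touch another
    tenant's objects even if it holds valid credentials."""
    prefix = f"workspaces/{user}/{app}/{capset}/"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object access only under this tenant's prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                # Listing is allowed, but only within the same prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}*"}},
            },
        ],
    }
    return json.dumps(policy)
```

Passing a document like this as the session policy when calling STS `AssumeRole` caps the sandbox's permissions at the intersection of the role's policy and this scope.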
Give your agents a safe place to run. Not a container. A microVM. Get started →