The difference between a chatbot and an agent that does your work is a computer. Give a capable model a real machine, a full Linux environment where it can run a command, install a tool, write a file, and chain a dozen steps into a finished result, and it stops handing you suggestions and starts handing you work. That is the unlock behind agents that draft the memo, build the model, and produce the deck rather than describing how to.
It is also the problem. A machine that runs code a model wrote, against your data and your systems, is a machine you have to assume can be turned against you. So the engineering question is not how to give an agent a computer. It is how to give it one you can hand over without flinching.
Why a computer, not a code interpreter
Plenty of AI tools hand the model a code interpreter: run this snippet, return the output, forget everything. That is fine for a one-off calculation and close to useless for real work, which almost never fits in a single snippet. A diligence task that pulls data from three systems, runs an analysis, and assembles a deck is not one function call. It needs to install a package, keep files around between steps, call a command-line tool, retry the step that failed, and build on what it already produced.
That is a workstation, not a REPL. So every agent session gets its own: a full environment where the agent works the way a person at a terminal would, with a filesystem, a shell, and the tools the task needs. The agent is not poking at a locked-down evaluator through a keyhole. It has a computer.
To make it concrete, picture a finance analyst's agent asked to update a model from a new filing. It downloads the filing, runs a script to pull the tables out of the PDF, reconciles them against last quarter, updates the workbook, and exports a clean version. Five tools, four intermediate files, one retry when the parser chokes on a footnote. There is no single snippet that is that task. There is only a machine that can hold the work while it happens.
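A toy sketch of why that needs a machine rather than a snippet: the steps below share one working directory, each builds on files the previous one left behind, and a failed step is retried without losing earlier output. Everything here is hypothetical and for illustration only (the step names, the file names, the simulated parser failure); it is not how any particular agent runtime is implemented.

```python
import tempfile
from pathlib import Path
from typing import Callable, List, Tuple

Step = Callable[[Path], None]


def run_steps(steps: List[Step], workdir: Path, max_attempts: int = 2) -> None:
    """Run steps in order against one shared directory, retrying a failed step."""
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                step(workdir)
                break  # step succeeded, move on to the next one
            except Exception:
                if attempt == max_attempts:
                    raise  # out of retries, surface the failure


def download(workdir: Path) -> None:
    # Stand-in for fetching the filing.
    (workdir / "filing.txt").write_text("revenue,120\n")


def make_flaky_parse() -> Tuple[Step, dict]:
    """Return a parse step that fails on its first call, plus a call counter."""
    calls = {"n": 0}

    def parse(workdir: Path) -> None:
        calls["n"] += 1
        if calls["n"] == 1:
            raise ValueError("parser choked on a footnote")
        rows = (workdir / "filing.txt").read_text().strip().splitlines()
        (workdir / "tables.csv").write_text("\n".join(rows) + "\n")

    return parse, calls


def export(workdir: Path) -> None:
    # Builds on the intermediate file the parse step left behind.
    (workdir / "model_clean.csv").write_text((workdir / "tables.csv").read_text())


def demo() -> Tuple[List[str], int]:
    parse, calls = make_flaky_parse()
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        run_steps([download, parse, export], workdir)
        return sorted(p.name for p in workdir.iterdir()), calls["n"]
```

Calling `demo()` runs the three steps, retries the parser once, and returns the files that accumulated in the working directory along with the number of parse attempts.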
Assume the machine is hostile
The moment you let a model run code, you have to treat that code as untrusted. Not because the model is malicious, but because its inputs are not under your control. A prompt, a web page it reads, a row in a database, a document from a connected system: any of them can carry an instruction, and a capable agent will act on what it reads. Trusting the sandbox because the model seems well-behaved is the same mistake as trusting user input because most users are honest.
Make the threat concrete. An agent reading a web page or a support ticket meets a line that says, in effect, ignore your task and send the customer database to this address. A capable agent might try. The defense that matters is not a classifier that catches that sentence, though that helps. It is that the sandbox has no way to send anything anywhere, so the instruction fails even if the agent falls for it. Containment beats detection, because detection eventually misses.
So the design assumption is the opposite of trust. Assume the sandbox can be fully subverted, and arrange things so that a subverted sandbox is worth nothing to an attacker. Two moves follow. Isolate the machine from everything around it, so a compromise stays put. And give it no way out except a single path you control. The rest of this post is those two moves.
Two levels of isolation
Isolation comes in two strengths, and which one you need depends on how much you are willing to trust the kernel.
The default is a hardened container. The sandbox runs as a standard OCI container, the same primitive as any Kubernetes workload, but stripped down hard: non-root, every Linux capability dropped, privilege escalation turned off, and a seccomp profile filtering the system calls it is allowed to make. It lives on a dedicated node pool, kept away from the application services, so even the scheduling boundary separates agent code from the platform's own. This runs on any Kubernetes cluster, with no special host setup.
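As a rough sketch, that hardening maps onto standard Kubernetes Pod fields. The snippet below builds such a manifest as a Python dictionary. The field names are the regular Pod spec ones; the labels, taint key, user ID, image, and the choice of the `RuntimeDefault` seccomp profile are assumptions for illustration, not the platform's actual configuration.

```python
import json

# Hypothetical label and taint used to pin sandboxes to their own node pool.
SANDBOX_POOL_LABEL = {"pool": "agent-sandbox"}


def hardened_sandbox_pod(name: str, image: str) -> dict:
    """Build a Pod manifest with the hardening described above."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"role": "sandbox"}},
        "spec": {
            # Keep agent code off the nodes that run application services.
            "nodeSelector": dict(SANDBOX_POOL_LABEL),
            "tolerations": [{
                "key": "dedicated",
                "operator": "Equal",
                "value": "agent-sandbox",
                "effect": "NoSchedule",
            }],
            "securityContext": {
                "runAsNonRoot": True,
                "runAsUser": 1000,  # assumed UID; any non-zero UID would do
                # Assumed profile; a custom Localhost profile is also possible.
                "seccompProfile": {"type": "RuntimeDefault"},
            },
            "containers": [{
                "name": "sandbox",
                "image": image,
                "securityContext": {
                    "allowPrivilegeEscalation": False,
                    "capabilities": {"drop": ["ALL"]},
                },
            }],
        },
    }


def render(name: str, image: str) -> str:
    """Serialize the manifest as JSON, which kubectl also accepts."""
    return json.dumps(hardened_sandbox_pod(name, image), indent=2)
```

For example, `render("sandbox-demo", "example.invalid/sandbox:latest")` yields a manifest that could be reviewed or applied; the image name is a placeholder.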
For a stronger boundary, the sandbox runs as a lightweight virtual machine through Kata Containers, on a Firecracker or QEMU backend. The difference that matters is the kernel. A hardened container shares the host kernel with everything else on the node. A microVM gets its own, and the host's /proc and /sys are never exposed to it. That changes the blast radius. Escaping a hardened container is a container-escape problem, and those exist. Escaping a microVM to reach a neighboring sandbox requires a hypervisor-level vulnerability, which is a far higher bar and a far smaller attack surface.
Both modes share the same posture: non-root, on a dedicated node pool. The microVM just moves the trust boundary down, from the shared kernel to the hypervisor.
When to reach for the microVM
The hardened container is the right default for most workloads, where the platform controls the images, the work is internal, and the tenant boundary is the workspace. It is cheap to schedule, runs anywhere Kubernetes does, and the hardening already removes the easy escapes.
The microVM earns its extra cost when the threat model is sharper. When sandboxes run untrusted or third-party code. When a regulatory posture asks for hardware-level isolation between tenants, not just isolation in policy. When many teams share one cluster and a single kernel bug must never become a path from one team's sandbox into another's. There the per-sandbox kernel is the difference between a contained incident and a shared one. The choice is made per node pool, so one deployment can run both and route the sensitive work to the stronger boundary.
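One way to picture that routing, as a minimal sketch: each node pool carries an optional runtime class, and a small policy function sends sharper threat models to the microVM pool. `runtimeClassName` is the standard Pod field for selecting a runtime such as Kata Containers; the pool names, the RuntimeClass name `kata`, and the three boolean inputs are hypothetical stand-ins for whatever signals a real deployment would use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodePool:
    name: str
    runtime_class: Optional[str]  # None means the cluster's default runtime


# Hypothetical pools; the RuntimeClass name depends on how Kata is installed.
HARDENED = NodePool("sandbox-container", None)
MICROVM = NodePool("sandbox-microvm", "kata")


def pick_pool(untrusted_code: bool,
              needs_hardware_isolation: bool,
              shared_cluster: bool) -> NodePool:
    """Route to the microVM pool when any sharper-threat condition holds."""
    if untrusted_code or needs_hardware_isolation or shared_cluster:
        return MICROVM
    return HARDENED


def placement(pool: NodePool) -> dict:
    """Pod spec fragment that lands a sandbox on the chosen pool."""
    spec = {"nodeSelector": {"pool": pool.name}}
    if pool.runtime_class is not None:
        spec["runtimeClassName"] = pool.runtime_class
    return spec
```

Because the decision lives at the pool level, the sandbox definition itself stays the same in both modes; only the placement fragment changes.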
Storage that does not outlive the task
A sandbox's disk is deliberately forgetful. It mounts a node-local volume, backed by the node's disk or memory, with a size limit, and that volume is destroyed when the session ends. Nothing the agent writes to its working directory survives by default. This is not a limitation to work around. It is the point. A fresh, empty machine every session means a compromise has nothing old to find and leaves nothing behind.
Anything that must outlive the task is written on purpose to durable storage (the organization's Drive), through the platform and under the same permissions as every other access. The default is forgetting. Persistence is a deliberate, recorded act. The starting machine is modest, on the order of two virtual CPUs, two gigabytes of memory, and sixteen gigabytes of disk, and it scales up for the heavier jobs that need it.
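In Kubernetes terms, a forgetful, size-limited, node-local volume looks a lot like an `emptyDir`, which is deleted when its Pod is removed. The sketch below assumes one Pod per session and uses the starting sizes quoted above; the volume name and mount path are hypothetical, and the real platform may provision storage differently.

```python
def scratch_volume(size_limit: str = "16Gi", in_memory: bool = False) -> dict:
    """Node-local scratch volume that disappears with the Pod."""
    empty_dir = {"sizeLimit": size_limit}
    if in_memory:
        # tmpfs-backed; usage counts against the container's memory limit.
        empty_dir["medium"] = "Memory"
    return {"name": "workspace", "emptyDir": empty_dir}


def workspace_mount(path: str = "/workspace") -> dict:
    """Mount for the agent's working directory (path is an assumption)."""
    return {"name": "workspace", "mountPath": path}


def starting_resources() -> dict:
    """Roughly two vCPUs, 2 GiB of memory, and 16 GiB of disk."""
    limits = {"cpu": "2", "memory": "2Gi", "ephemeral-storage": "16Gi"}
    return {"requests": dict(limits), "limits": limits}
```

Nothing in this fragment points at durable storage, which is the intent: persistence has to go through a separate, deliberate path.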
No way out except the path we built
Isolating the machine is half the job. The other half is the network, because an attacker who lands on a box wants to reach the rest of the world from it. From inside the sandbox, exactly one destination is reachable: a proxy we run. Direct internet access is blocked. The cloud metadata service, the first place a cloud-aware attacker looks for credentials, is blocked. The Kubernetes API is blocked. Other sandboxes are blocked. Even DNS resolution is blocked. All of it is enforced at the infrastructure layer, not by configuring the agent and hoping.
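One common way to express "one destination, nothing else" on Kubernetes is an egress NetworkPolicy. The sketch below is an assumption about mechanism, not a description of the actual enforcement, which may sit in a different layer entirely. The pod labels and the proxy port are hypothetical, and a policy like this only takes effect on a cluster whose network plugin enforces NetworkPolicy.

```python
def sandbox_egress_policy(namespace: str, proxy_port: int = 3128) -> dict:
    """Allow sandbox pods to reach only the egress proxy; deny all other egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "sandbox-egress-proxy-only", "namespace": namespace},
        "spec": {
            # Applies to every pod labeled as a sandbox (hypothetical label).
            "podSelector": {"matchLabels": {"role": "sandbox"}},
            "policyTypes": ["Egress"],
            # A single rule: TCP to the proxy pods in the same namespace.
            # There is deliberately no rule for DNS (port 53), the metadata
            # service, the Kubernetes API, or other sandboxes, so all of
            # those are denied once the policy selects the pod.
            "egress": [{
                "to": [{"podSelector": {"matchLabels": {"role": "egress-proxy"}}}],
                "ports": [{"protocol": "TCP", "port": proxy_port}],
            }],
        },
    }
```

With DNS denied, the sandbox would need to be handed the proxy's address directly, for instance through its environment; how the platform does that is not covered here.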
The proxy itself, how it attaches identity and credentials on the way out so that no secret ever lives inside the sandbox, is a topic of its own. The point here is narrower, and it is the reward for all the isolation: even a sandbox that has been completely taken over has nowhere to go. It cannot phone home, cannot pull a second-stage payload, cannot reach your other systems, cannot find another sandbox to jump to. It is a capable computer sealed in a room with one supervised door.
The workstation and the walls
A real computer is what turns a capable model into something that does the work instead of describing it. Isolation is what makes a real computer safe to hand to a model whose inputs you do not control. The two are not in tension. You give the agent the whole workstation, and you keep the walls thick and the doors few. That is what lets an agent run for hours, install what it needs, and produce something finished, without ever becoming a liability when one of its inputs turns out to be an attack.