Ken Muse

Isolating GitHub Copilot With Docker Sandboxes


When you let GitHub Copilot (or any AI coding agent) run autonomously, it operates with the same permissions you have. It can read your files, reach any network endpoint your machine can, and execute arbitrary shell commands using your account. Most of the time that’s fine – but “most of the time” is not a security model. One rogue network call, one misconfigured command, one leaked .env file, and the blast radius is your entire machine.

Docker Sandboxes – accessed through the sbx command-line tool – offer one approach to solving this. They let you run Copilot CLI inside a dedicated microVM (virtual machine) with its own kernel, its own Docker daemon, and a network firewall that defaults to blocking everything the agent hasn’t been explicitly allowed to reach. It’s worth noting upfront: this isn’t the only way to isolate an AI agent, and no isolation tool can fully guarantee security. What solutions like this do is significantly reduce your attack surface and make unexpected behavior visible rather than silent. Defense in depth means layering protections, and a sandbox can be one of those layers.

Why isolation matters for AI agents

The core problem is that an autonomous agent has the same access as you. It’s no different than you executing commands or code yourself. It’s just faster. If it decides to install a package from an untrusted source, exfiltrate a token, or run a destructive command, it has access to everything in its environment. You’re trusting the model’s judgment and its guardrails – which are good but not infallible.

Isolation changes that environment to restrict what the agent can see and do. Instead of trusting the agent to stay within bounds, you define the bounds externally and enforce them outside the agent’s control. The agent gets exactly the files, network endpoints, and secrets you grant. Everything else is walled off at hardware and operating system levels. If the environment is corrupted or compromised, you delete the sandbox and start fresh in seconds – nothing on your host is affected.

This follows the principle of least privilege: give the agent only what it needs, and make everything else unreachable. It also gives you observability – you can see exactly what the agent tried to access and whether it was allowed.

Why a microVM is a better boundary than a container

If you’ve read my earlier post on building container isolation from the Linux kernel up, you know that containers share the host kernel. They use namespaces and cgroups to create isolated views of the system, but the kernel remains a shared surface.

For a regular application, that’s usually fine. But AI coding agents aren’t regular applications. They need to build and run their own Docker containers – that’s how modern software gets developed. Running Docker inside a container (Docker-in-Docker) requires elevated privileges, such as mounting the Docker socket or running in privileged mode. That can undermine the isolation you set up in the first place by effectively punching a hole through your security boundary.

A microVM avoids this entirely. Each sandbox gets its own kernel – hardware-level isolation, the same kind you get from a full virtual machine. Inside that VM, the agent has a private Docker daemon with full docker build, docker run, and docker compose support. No socket mounting, no host-level privileges, none of the security compromises Docker-in-Docker requires. The agent can do real development work without any path back to the host.

Docker built a custom Virtual Machine Monitor (VMM) specifically for this use case. It runs natively on each platform’s hypervisor – Apple’s Hypervisor.framework on macOS, Windows Hypervisor Platform on Windows, and KVM on Linux – so cold starts are fast enough to avoid performance issues. You can read more about the architectural decisions in Docker’s blog post on microVM architecture.

The firewall: blocking and notifying

The network layer is where Docker Sandboxes really shine for security-conscious developers. When you first log in, you choose a default network policy:

  • Open
    All traffic allowed, no restrictions.
  • Balanced
    Default deny, with common development sites (package registries, GitHub, model APIs) allowed.
  • Locked Down
    All traffic blocked unless you explicitly allow it.

Balanced is a sensible starting point. It permits traffic to the services Copilot CLI needs while blocking everything else. You can then fine-tune rules with sbx policy allow or sbx policy deny as needed.
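To make “default deny” concrete, here is a minimal shell sketch of the idea behind the Balanced and Locked Down policies. It is an illustrative model only, not how sbx evaluates its rules, and the hostnames in the allowlist are hypothetical examples rather than the actual Balanced list.

```shell
# Illustrative model of a default-deny allowlist. This is NOT how sbx
# evaluates its rules; the hostnames below are hypothetical examples.
is_allowed() {
  host="$1"
  case "$host" in
    github.com|*.github.com|registry.npmjs.org)
      return 0 ;;   # explicitly allowed destinations
    *)
      return 1 ;;   # everything else is denied by default
  esac
}
```

Calling is_allowed api.github.com succeeds, while is_allowed tracker.example fails because no rule matches it. The important property is the final catch-all branch: a destination nobody thought about is blocked rather than allowed.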

What makes this especially useful is the policy log. Running sbx policy log shows you every outbound request the sandbox made, the rule that matched, and whether it was forwarded or blocked. This means unexpected network destinations don’t happen silently – they’re surfaced so you can investigate. If Copilot CLI tries to reach a domain you didn’t anticipate, you’ll know about it.

Secrets get special treatment too. When you store a token (like your GitHub Personal Access Token), it lives in your OS keychain on the host. A proxy on the host injects the credential into outbound API requests at the network boundary – the real secret never enters the VM, and the agent never sees it. Inside the sandbox, the agent only sees a sentinel placeholder value. Even if someone compromised the VM, they wouldn’t find your actual credentials.

[Figure: A view of the policy log]

Setting it up for Copilot

Getting started takes just a few minutes. On macOS, install with Homebrew and sign in:

brew install docker/tap/sbx
sbx login

During login, you’ll choose your default network policy (I recommend Balanced). Then store your GitHub token so Copilot can authenticate. If you have the GitHub CLI installed, you can pipe your existing token directly into the sandbox as a secret:

gh auth token | sbx secret set -g github

You can also provide a token directly with sbx secret set -g github. The command line can also set secrets for a single, specific sandbox.
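If you script this step, it helps to fail clearly when the GitHub CLI is missing. The following is a minimal sketch that only uses the two commands shown above; the function name store_github_token is hypothetical.

```shell
# Sketch: store the GitHub CLI token as a sandbox secret, but only if gh
# is installed. The function name is hypothetical, not part of sbx.
store_github_token() {
  if ! command -v gh >/dev/null 2>&1; then
    echo "gh not found; provide a token with 'sbx secret set -g github' instead" >&2
    return 1
  fi
  # Capture the token first so a gh failure is not hidden by the pipe.
  token="$(gh auth token)" || return 1
  printf '%s\n' "$token" | sbx secret set -g github
}
```

Checking gh separately avoids piping an empty value into the secret store when gh auth token fails, for example because you are not logged in.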

Now navigate to a project and create a sandbox:

cd ~/code/my-project
sbx create copilot .
sbx run copilot-my-project

That’s it. Copilot launches inside a microVM with your project directory mounted. By default, the create command creates a sandbox with a name in the format {agent}-{folderName} (where the agent in this case is copilot). If you prefer to provide your own name, you can use the --name flag:

sbx create copilot --name my-sandbox .
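The default naming rule is easy to reproduce in a script, which saves retyping the sandbox name in later commands. This is a sketch under one assumption: that sbx uses the folder name as-is. It may normalize some characters, so treat the result as a guess and confirm it with sbx ls. The helper name default_sandbox_name is hypothetical.

```shell
# Sketch of the default {agent}-{folderName} naming rule.
# Assumption: sbx uses the folder name unchanged; verify with `sbx ls`.
default_sandbox_name() {
  agent="$1"
  # Resolve "." or a relative path to an absolute path first, so that
  # basename returns the real folder name instead of ".".
  dir="$(cd "${2:-.}" && pwd)" || return 1
  printf '%s-%s\n' "$agent" "$(basename "$dir")"
}
```

From inside ~/code/my-project, default_sandbox_name copilot . prints copilot-my-project, so sbx run "$(default_sandbox_name copilot .)" targets the sandbox created above.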

Several other commands help you manage and interact with the sandbox. You can check what’s running with sbx ls and review network activity with sbx policy log. You can even open an interactive environment by running sbx.

Reviewing work and cleaning up

By default, the agent edits your working tree directly through a mount – meaning changes the agent makes to files are immediately visible on your host and could impact your local environment. If you want to completely isolate the code environment as well, use --clone mode:

sbx create --clone copilot .

In clone mode, the sandbox keeps a private Git clone inside the microVM and exposes it as a sandbox-<name> remote on your host. You review the agent’s commits the same way you’d fetch from any other remote:

git fetch sandbox-copilot-my-project
git diff main..sandbox-copilot-my-project/main
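If you review sandboxes often, the fetch-and-compare steps can be wrapped in a small helper. The sketch below assumes the remote is named sandbox-<name> as described above and that the branch to compare is main unless you pass another one. The function names are hypothetical.

```shell
# Sketch: review a clone-mode sandbox's commits from the host.
# Assumes the remote is named sandbox-<name> and the branch defaults to main.
sandbox_remote() {
  printf 'sandbox-%s\n' "$1"
}

review_sandbox() {
  name="$1"
  branch="${2:-main}"
  remote="$(sandbox_remote "$name")"
  git fetch "$remote" || return 1
  # Commits the agent added that are not on your local branch.
  git log --oneline "$branch..$remote/$branch"
  # Summary of the files those commits touch.
  git diff --stat "$branch..$remote/$branch"
}
```

Running review_sandbox copilot-my-project lists the agent’s commits and the files they touch before you decide whether to merge anything.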

This approach can also pair nicely with the worktree isolation patterns I covered earlier – you can run multiple sandboxed agents on the same repository in parallel, each working in its own isolated clone.

When you’re done, teardown is instant:

sbx stop copilot-my-project
sbx rm copilot-my-project

Everything inside the sandbox – installed packages, Docker images, cloned repos – is gone. Nothing on your host is affected. This disposable-by-design approach means you never need to worry about a sandbox accumulating drift or contamination over time. And you don’t have to worry about it corrupting your host environment, since it has no access beyond what you explicitly granted.
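The two teardown commands can be combined into one helper. This is a minimal sketch: the SBX variable is a hypothetical override added for dry runs, not an sbx feature.

```shell
# Sketch: stop and remove a sandbox in one step.
# SBX is a hypothetical override (not an sbx feature) for dry runs,
# e.g. SBX=echo prints the arguments instead of calling sbx.
teardown_sandbox() {
  name="$1"
  if [ -z "$name" ]; then
    echo "usage: teardown_sandbox <sandbox-name>" >&2
    return 2
  fi
  # Only remove the sandbox if it stopped cleanly.
  "${SBX:-sbx}" stop "$name" && "${SBX:-sbx}" rm "$name"
}
```

With SBX=echo set, teardown_sandbox copilot-my-project prints the two sbx argument lists without touching anything, which is a cheap way to check the name before deleting a sandbox.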

Extending the environment

If you have additional tools or network rules you want baked in, Docker provides an experimental feature that you can use to customize the sandbox – a kit. It’s essentially a folder containing a spec.yaml file (and any supporting content) that is applied to the sandbox when it is created. For example, to run Copilot with a custom kit:

sbx create copilot --kit ./my-kit .
sbx run copilot-my-project

Kits are useful for teams that want consistent, customized sandbox configurations, but they are entirely optional for getting started. The built-in Copilot sandbox template handles the common case out of the box.

A known issue: GitHub Enterprise authentication

If you’re using a GitHub Enterprise account, there’s a gotcha worth knowing about. The Copilot sandbox was originally configured so that requests to api.enterprise.githubcopilot.com used forward-bypass in the network proxy instead of forward. This distinction means the host-side proxy won’t inject a real credential into those requests. Instead, the sentinel placeholder token is sent directly to the Enterprise API, resulting in a 401 unauthorized error.

You can see this by checking the policy log:

sbx policy log

If you see api.enterprise.githubcopilot.com:443 listed as forward-bypass rather than forward, you’re hitting this bug.
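To check for this without reading the whole log, you can filter it. The sketch below assumes each log entry is a single line containing both the host and the action; it matches on those two substrings only and does not rely on the exact column layout of sbx policy log. The function name is hypothetical.

```shell
# Sketch: detect policy log lines where the enterprise endpoint was bypassed.
# Assumption: one log entry per line, containing both host and action.
has_enterprise_bypass() {
  grep -F 'api.enterprise.githubcopilot.com' | grep -q -F 'forward-bypass'
}
```

Use it as sbx policy log | has_enterprise_bypass && echo "affected by the Enterprise bug". Matching on forward-bypass rather than forward matters, because the shorter word is a prefix of the longer one.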

There are two interim workarounds:

  1. Create a kit that maps the enterprise domain to the github service so that the credential is automatically injected. Create a folder called copilot-mixin-kit and add a spec.yaml file with the following content:

    # copilot-mixin-kit/spec.yaml
    schemaVersion: "1"
    kind: mixin
    name: copilot-enterprise-fix
    network:
      serviceDomains:
        api.enterprise.githubcopilot.com: github

    Then create your sandbox with this kit: sbx create copilot --kit ./copilot-mixin-kit/ .

  2. Use a custom secret scoped to the enterprise host:

    sbx secret set-custom -g --host api.enterprise.githubcopilot.com --env GH_TOKEN
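The first workaround can be scripted so teammates get an identical kit. This is a minimal sketch: the spec.yaml content is the one from workaround 1, while the function name and default directory are just conventions for this example.

```shell
# Sketch: generate the mixin kit from workaround 1.
# The function name and default directory are conventions for this example.
write_copilot_mixin_kit() {
  kit_dir="${1:-./copilot-mixin-kit}"
  mkdir -p "$kit_dir" || return 1
  # Quoted 'EOF' keeps the YAML literal (no shell expansion).
  cat > "$kit_dir/spec.yaml" <<'EOF'
schemaVersion: "1"
kind: mixin
name: copilot-enterprise-fix
network:
  serviceDomains:
    api.enterprise.githubcopilot.com: github
EOF
}
```

After running write_copilot_mixin_kit, create the sandbox with sbx create copilot --kit ./copilot-mixin-kit/ . as described in workaround 1.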

The good news: the fix has landed in the nightly build and is confirmed working by multiple users. It should ship in the next stable release. You can track the full discussion in issue #100.

Wrapping up

Docker Sandboxes aren’t the only way to isolate an AI coding agent, and they don’t make you invulnerable. Security is always about layers, trade-offs, and reducing risk rather than eliminating it entirely. What sbx does is provide an option for implementing one of those layers: hardware-boundary isolation via a microVM, a private Docker daemon that doesn’t compromise your host, and a network firewall that both enforces policy and makes violations visible.

For a tool like Copilot that benefits from running autonomously – cloning repos, installing dependencies, building containers, opening pull requests – having that autonomy operate inside a disposable, observable sandbox is a practical improvement. You get the productivity of a fully capable agent without handing it the keys to your entire machine.

Give it a try with the getting started guide. It’s a great way to learn how this might be useful for your own workflow and to get some visibility into what your agent is doing.