During the Month of AI Bugs, I described an emerging vulnerability pattern that shows how commonly agentic systems have a design flaw that allows an agent to overwrite its own configuration and security settings.
This allows the agent to break out of its sandbox and escape by executing arbitrary code.
My research with GitHub Copilot, AWS Kiro and a few others demonstrated how this can be exploited by an adversary with an indirect prompt injection.
This post goes a step further by showing how one compromised agent can rewrite another agent’s config and ‘free’ it, creating a cross-agent escalation loop. ‘Freeing’ in this context means that one agent helps another to break out of its sandbox by giving it additional capabilities.
The concept is rather simple. Many developers use multiple coding agents on the same codebase. This is to get a second opinion, compare results, or have one agent review code another agent created.
As an example, let’s say you use both GitHub Copilot and Claude Code on the same codebase.
A few things to know upfront:
.vscode/tasks.json
) and similar. If the attack is performed by a third-party attacker via indirect prompt injection it basically Remote Code Execution..vscode/settings.json
, and there is also a .vscode/mcp.json
file to configure MCP servers..mcp.json
.AGENTS.md
, or .vscode/copilot-instructions.md
and similar.After some of my disclosures, vendors started adding mitigations to not have the agent overwrite configuration settings without the user’s permission. However, this still means the agent can write/create other files and folders.
Now that we’ve walked through some of the background information, let’s explore how Copilot can “free” Claude Code!
GitHub Copilot can create/write to the configuration and instructions files of other agents.
For instance, it’s still possible to create and write to files, such as .gemini/settings.json
, .claude/settings.local.json
, and ~/.mcp.json
(Claude’s MCP configuration) to execute arbitrary code or allowlist shell commands for other agents
It’s also possible to write to files like CLAUDE.md
or AGENTS.md
, etc..
So, I think you understand now where this is going.
This can be leveraged to have agents “free” each other.
CLAUDE.md
)This attack chain demonstrates a significant design flaw, indicating that today’s agentic systems are already accumulating substantial security debt.
Agents that coordinate and collaborate to achieve malicious objectives seem very plausible to me in the long term, and it will be very difficult to mitigate unless systems are designed with secure defaults.
Here is a video that explains the scenario in detail and it includes a demo:
The demo with Copilot freeing Claude is around minute 3 by the way.
There are a couple of secure defaults that vendors need to consider, and a few things users should be aware of:
It’s important to highlight that this exploit chain is not limited to VS Code and GitHub Copilot, but is a generic novel approach that demonstrates how exploitation of one agent can lead to the compromise of another agent. Which, then, in turn reconfigures the first (or other) agent(s), allowing them to run arbitrary code or modify their instruction settings.
Nevertheless, I reported this specific demo to MSRC, but it was not considered severe enough to require immediate security servicing. However, the team might look into improving mitigations.
The ability for agents to write to each other’s settings and configuration files opens up a fascinating, and concerning, novel category of exploit chains.
What starts as a single indirect prompt injection can quickly escalate into a multi-agent compromise, where one agent “frees” another agent and sets up a loop of escalating privilege and control.
This isn’t theoretical. With current tools and defaults, it’s very possible today and not well mitigated across the board.
More broadly, this highlights the need for better isolation strategies and stronger secure defaults in agent tooling. We’re moving toward scenarios where multiple agents operate in the same environments, sometimes on the same data and infrastructure. If we don’t consider the possibility of agents collaborating, intentionally or not, we’re going to see more of these cross-agent attack chains emerge.