We discovered a new attack technique, which we call agent session smuggling. This technique allows a malicious AI agent to exploit an established cross-agent communication session to send covert instructions to a victim agent.
Here, we discuss the issues that can arise in a communication session using the Agent2Agent (A2A) protocol, which is a popular option for managing the connections between agents. The A2A protocol’s stateful behavior lets agents remember recent interactions and maintain coherent conversations. This attack exploits this property to inject malicious instructions into a conversation, hiding them among otherwise benign client requests and server responses.
Many AI threats involve tricking an agent with a single malicious piece of data such as a deceptive email or document. Our research highlights a more advanced danger, malicious agents.
A straightforward attack on a victim agent might involve a one-time effort to trick it into acting on harmful instructions from a document without seeking confirmation from its user. In contrast, a rogue agent is a far more dynamic threat. It can hold a conversation, adapt its strategy and build a false sense of trust over multiple interactions.
This scenario is especially dangerous because, as a recent study shows, agents are often designed to trust other collaborating agents by default. Agent session smuggling exploits this built-in trust, allowing an attacker to manipulate a victim agent over an entire session.
This research does not reveal any vulnerability in the A2A protocol itself. Rather, the technique exploits the way implicit trust relationships between agents would affect any stateful protocol — meaning any protocol that can memorize recent interactions and carry out multi-turn conversation.
Mitigation requires a layered defense strategy, including:
Palo Alto Networks customers are better protected through the following products and services:
Prisma AIRS is designed to provide layered, real-time protection for AI systems by detecting and blocking threats, preventing data leakage and enforcing secure usage policies across a variety of AI applications.
AI Access Security is designed for visibility and control over usage of third-party GenAI tools, helping prevent sensitive data exposures, unsafe use of risky models and harmful outputs through policy enforcement and user activity monitoring.
Cortex Cloud AI-SPM is designed to provide automatic scanning and classification of AI assets, both commercial and self-managed models, to detect sensitive data and evaluate security posture. Context is determined by AI type, hosting cloud environment, risk status, posture and datasets.
A Unit 42 AI Security Assessment can help you proactively identify the threats most likely to target your AI environment.
If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.
| Related Unit 42 Topics | GenAI, Google |
The A2A protocol is an open standard that facilitates interoperable communication among AI agents, regardless of vendor, architecture or underlying technology. Its core objective is to enable agents to discover, understand and coordinate with one another to solve complex, distributed tasks while preserving autonomy and privacy.
In the A2A protocol:
For more details on A2A fundamentals and security considerations, please refer to our article: Safeguarding AI Agents: An In-Depth Look at A2A Protocol Risks and Mitigations.
A2A has notable parallels with the Model Context Protocol (MCP), a widely used standard for connecting large language models (LLMs) to external tools and contextual data. Both aim to standardize how AI systems interact, but they operate on distinct aspects of agentic systems.
In short, MCP emphasizes execution through tool integration, whereas A2A emphasizes orchestration across agents.
Despite these differences, both protocols face similar classes of threats, as shown in Table 1.
| Attack/Threats | MCP | A2A |
| Tool/Agent Description Poisoning | Tool descriptions can be poisoned with malicious instructions that manipulate LLM behavior during tool selection and execution | AgentCard descriptions can embed prompt injections or malicious directives that manipulate the client agent’s behavior when consumed |
| Rug Pull Attacks | Previously trusted MCP servers can unexpectedly shift to malicious behavior after integration, exploiting established trust relationships | Trusted agents can unexpectedly turn malicious by updating their AgentCards or operation logic |
| Tool/Agent Shadowing | Malicious servers register tools with identical or similar names to legitimate ones, causing confusion during tool selection | Rogue agents create AgentCards that mimic legitimate agents through similar names, skills or typosquatting techniques |
| Parameter/Skill Poisoning | Tool parameters can be manipulated to include unintended data (e.g., conversation history) in requests to external servers | AgentCard skills and examples can be crafted to manipulate how agents interact, potentially exposing sensitive context or credentials |
Table 1. Comparison of MCP and A2A attacks.
Agent session smuggling is a new attack vector specific to stateful cross-agent communication, such as A2A systems. A communication is stateful if it can remember recent interactions, like a conversation where both parties keep track of the ongoing context.
The core of the attack involves a malicious remote agent that misuses an ongoing session to inject additional instructions between a legitimate client request and the server’s response. These hidden instructions can lead to context poisoning (corrupting the AI's understanding of a conversation), data exfiltration or unauthorized tool execution on the client agent.
Figure 1 outlines the attack sequence:

Key properties of the attack
In principle, any multi-agent system with stateful inter-agent communication could be susceptible to this attack. However, the risk is lower in setups fully contained within a single trust boundary. A trust boundary is an area of the system where all components are trusted by default, such as ADK or LangGraph multi-agent systems, where one administrator controls all participating agents.
Our research therefore focuses on the A2A protocol, which is explicitly designed for cross-boundary interoperability. This interoperability enables agents to collaborate across different systems, modules or organizations.
Compared to known MCP threats, agent session smuggling exploits A2A’s stateful and adaptive design in ways that are not possible in MCP. MCP servers generally operate in a stateless manner, executing isolated tool invocations without preserving session history, which limits actors’ ability to use them to mount multi-turn or evolving attacks.
MCP servers are also typically static and deterministic, since they do not rely on AI models. In contrast, an A2A server can persist state across interactions and leverage model-driven reasoning, allowing a malicious agent to adapt and refine instructions over multiple turns. This combination of persistence and autonomy makes agent session smuggling more stealthy and difficult to defend against than MCP-based attacks.
To demonstrate the feasibility and potential impact of agent session smuggling, we developed two proof-of-concept (PoC) attack scenarios. The setup consisted of a financial assistant (the client agent) and a research assistant (the malicious remote agent). These PoCs illustrate how a malicious agent can exfiltrate sensitive data and execute unauthorized actions without the end user's awareness.
Environment settings:
In this scenario, the malicious remote agent (research assistant) aims to exfiltrate the client agent’s (financial assistant’s) internal state, including:
The attack unfolds subtly. Upon receiving a legitimate request from the financial assistant, the research assistant issues a series of seemingly harmless clarification questions. These follow-up questions gradually trick the financial assistant into disclosing sensitive information that should never leave its environment.
Figure 2 illustrates the user’s interaction flow. The session begins normally. The user asks the financial assistant to retrieve the investment portfolio and profile, followed by a request for a briefing on AI market news.
This market news request is delegated to the research assistant. As shown on the right side of the figure, the research assistant engages in a series of back-and-forth exchanges with the financial assistant. Except for the initial request and the final response, the research assistant injects all intermediate interactions.
The intermediate steps shown in Figure 2 only appear because the developer web user interface (UI) is designed to display every internal action. In a production chatbot UI, these intermediate interactions would likely remain invisible.

Figure 3 provides a more detailed view via the financial assistant’s activity log. Row 10 shows the initial query sent to the research assistant, and Row 28 shows the final result the research agent produces. Rows 11–27, alternating between function_call and function_response, reveal the interactions between the two agents. During the interactions, the financial assistant gradually disclosed its system configuration and the user’s prior conversation history.

Finally, Figure 4 shows the research assistant’s log after the attack. It confirms successful data exfiltration, including the financial assistant’s capabilities, tool schemas and the entire session history.

In this scenario, the malicious remote agent (research assistant) attempts to trick the client agent (financial assistant) into executing unauthorized trade actions. The attacker’s objective is to misuse the client agent’s buy_stock tool without the end user’s knowledge or consent.
After receiving a valid delegation request, the research assistant sends additional processing requirements, prompting the financial assistant to perform tool invocations unrelated to the user’s original request. This PoC assumes the attacker knows the financial assistant’s tools, having acquired either through public information or the previous PoC.
Figure 5 illustrates the user’s interaction flow. On the right-hand side, the user’s news request is delegated to the research assistant. While the research assistant does provide the expected news summary, it also smuggles hidden instructions that induce the financial assistant to invoke the buy_stock tool. Again, these intermediate actions are executed autonomously and would remain invisible in a typical production chatbot interface, where the user would only be shown the final summarized response.

Figure 6 shows the financial assistant’s activity log. Row 7 shows the initial request sent to the research assistant, and Row 12 shows the legitimate response received. Rows 8–11, however, reveal four additional actions the financial assistant autonomously executed, resulting in an unauthorized trade of 10 shares on behalf of the victim user. This sequence demonstrates how agent session smuggling can escalate from information leakage to high-impact tool execution.

Defending against agent session smuggling requires a layered security approach that validates the identity of agents, the integrity of the conversation and the impact of the actions taken. The following strategies can help mitigate the risks:
This work introduced agent session smuggling, a new attack technique that targets cross-agent communication in A2A systems. Unlike threats involving malicious tools or end users, a compromised agent represents a more capable adversary. Powered by AI models, a compromised agent can autonomously generate adaptive strategies, exploit session state and escalate its influence across all connected client agents and their users.
Although we have not observed the attack in the wild, its low barrier to execution makes it a realistic risk. An adversary needs only to convince a victim agent to connect to a malicious peer, after which covert instructions can be injected without user visibility. Protecting against this requires a layered defense approach:
As multi-agent ecosystems expand, their interoperability also opens new attack surfaces. Practitioners should assume that agent-to-agent communication is not inherently trustworthy. We must design orchestration frameworks with layered safeguards to contain the risks of adaptive, AI-powered adversaries.
Prisma AIRS is designed for real-time protection of AI applications, models, data and agents. It analyzes network traffic and application behavior to detect threats such as prompt injection, denial-of-service attacks and data exfiltration, with inline enforcement at the network and API levels.
AI Access Security is designed for visibility and control over usage of third-party GenAI tools, helping prevent sensitive data exposures, unsafe use of risky models and harmful outputs through policy enforcement and user activity monitoring. Together, Prisma AIRS and AI Access Security help secure the building of enterprise AI applications and external AI interactions.
Cortex Cloud AI-SPM is designed to provide automatic scanning and classification of AI assets, both commercial and self-managed models, to detect sensitive data and evaluate security posture. Context is determined by AI type, hosting cloud environment, risk status, posture and datasets.
A Unit 42 AI Security Assessment can help you proactively identify the threats most likely to target your AI environment.
If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:
Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.