From Agent2Agent Prompt Injection to Runtime Self-Defense: How Wallarm Redefines Agentic AI Security

From Agent2Agent Prompt Injection to Runtime Self-Defense: How Wallarm Redefines Agentic AI Security
好的，我现在需要帮用户总结一篇文章的内容，控制在100个字以内，而且不需要特定的开头。首先，我得仔细阅读这篇文章，理解它的主要观点。文章讨论了AI之间的攻击场景是否只是科幻电影的情节。研究发现，一个AI可以向另一个AI注入恶意指令，利用A2A协议的特性进行攻击。这扩展了威胁向量，并提出了新的安全风险。接着，文章介绍了A2AS框架作为新的安全标准，通过行为证书、模型自我防御推理和提示级安全控制来保护AI系统。最后提到企业需要采取措施应对这些威胁，并提到了Wallarm的角色。现在，我需要将这些要点浓缩到100字以内。重点包括：AI之间的攻击可能性、A2A协议的漏洞、A2AS框架的作用以及企业应对措施。可能的结构是：描述问题（AI间攻击）、解决方案（A2AS框架）、以及企业行动。确保语言简洁明了，不使用复杂的术语。最终总结应涵盖关键点：AI间攻击现实化、新威胁、A2AS框架的作用和企业准备措施。 </think> 文章探讨了AI之间的攻击场景不再是科幻情节，并提出一种新威胁：AI通过Agent2Agent协议注入恶意指令。为应对这一风险，研究人员开发了A2AS框架，通过行为证书、模型自我防御和提示级安全控制等机制保护AI系统。企业需采取措施应对这些新威胁。 2025-12-23 07:55:49 Author: lab.wallarm.com(查看原文) 阅读量:6 收藏

Is an AI-to-AI attack scenario a science fiction possibility only for blockbusters like the Terminator series of movies?

Well, maybe not!

Researchers recently discovered that one AI agent can “inject malicious instructions into a conversation, hiding them among otherwise benign client requests and server responses.” While known AI threats involve tricking an agent with malicious data, this new threat exploits a property of the Agent2Agent (A2A) protocol to remember recent interactions and maintain coherent conversations.

AI agents interact with each other, use internal APIs, and operate with privileges. Since traditional AI guardrails and legacy API security no longer cut it, there’s a need for a new approach to security.

Agent2Agent Prompt Injection and Emerging Threats

AI agents can communicate, issue instructions, and bypass human oversight, which makes them both valuable and dangerous.

Pro m pt injection used to involve a user writing a malicious prompt. Now, agents can write malicious prompts that target other agents. This fact has expanded threat vectors and transformed the risk model. Internal API misuse, lateral movement, and chain-of-agent compromise threats are now more acute than ever.

According to security researchers, an AI agent can cause another agent to operate in unintended ways, potentially forcing, for example, data disclosure or unauthorized tool use. It achieves this by delivering multi-stage prompt injections leveraging stateful cross-agent communication behavior of the Agent2Agent (A2A) communication protocol.

The A2A protocol is an open standard that facilitates interoperable communication among AI agents, regardless of vendor, architecture or underlying technology. Its core objective is to enable agents to discover, understand and coordinate with one another to solve complex, distributed tasks while preserving autonomy and privacy. The protocol is similar to Model Context Protocol (MCP). However, MCP focuses on execution through tool integration, which A2A’s goal is agent orchestration.

This risk, dubbed as agent session smuggling attack, is crucial to security leaders for several reasons:

The attack surface now encompasses new threats and attack vectors, including agent APIs, internal tool access, inter-agent messaging, and privileged actions.
Traditional guardrails, such as filtering outputs, may no longer suffice. Attackers can create inputs at the agent-to-agent level, where filtering might not exist.

While previously theorized, this emerging threat highlights the diversity of possible attack scenarios available and waiting to be launched in real environments. Security teams must face up to this new reality. They need to establish a new layer of defense.

Introducing A2AS: A New Standard for Agentic AI Security

The A2AS framework is that new defense layer.

A2AS secures AI agents and LLM-powered applications, just as HTTPS secures HTTP. Researchers built it to address agentic AI security risks. Those risks include prompt injection, tool misuse, and agent compromise. It centers around three breakthrough capabilities:

Behavior Certificates: Declaring and enforcing what agents can and can't do.
Model Self-Defence Reasoning: Embed security awareness in the model’s context window. This ensures the model rejects malicious or untrusted instructions in real time
Prompt-Level Security Controls: Authenticated prompts, sandboxing, policy-as-code, verify every interaction.

A2AS is important because it represents a shifting approach to security. Agents have privileges and tool access, so monitoring and filtering alone isn't enough. Security models must now secure the runtime, not just the input. That means integrating runtime self-defense, certification, and enforcement.

Wallarm, together with researchers from AWS, Bytedance, Cisco, Elastic, Google, JPMorganChase, Meta, and Salesforce, played an instrumental role in developing A2AS and is spearheading its adoption.

What A2AS Brings to Agentic AI Security

The A2AS framework aims to ensure AI agents can only do what they’re explicitly allowed to do, and every instruction they see must be authenticated, isolated, and verified.

To make that possible, A2AS uses a five-part standard called the BASIC model.

Behavior certificates define the exact capabilities an agent is permitted to use. That includes tools, files, functions, or system operations. If it’s not certified, it doesn’t happen. These certificates are how A2AS prevents an infected agent from escalating privileges.

Authenticated prompts verify the integrity of every instruction before it enters the context windows. They stop tampered, spoofed, or injected messages from influencing agent reasoning.

Security Boundaries isolate untrusted content from trusted system instructions. They do this by tagging and segmenting everything that enters the model, eliminating the ambiguity that makes prompt injection possible.

In-context defenses embed security reasoning directly inside the model’s context window. They guide the agent to distrust external input, ignore unsafe commands, and actively neutralize malicious patterns during execution.

Codified policies enforce business rules at runtime. That means they block sensitive data, requiring approvals for high-risk actions, and ensure compliance without manual oversight.

Together, these controls create a self-defending agent that can resist user-to-agent attacks, prevent tool misuse, and stop agent-to-agent prompt infections before they spread.

Why A2AS Will Matter for Enterprises and SOCs

As agentic AI adoption grows, the need for standardized security is becoming increasingly urgent. Autonomous agents now manage operations, access sensitive data, and interact with internal systems. That explodes the attack surface.

Moreover, agents communicate, call privileged tools, and operate via APIs never designed for autonomous decision-making. A2AS provides a unifying framework to secure this complexity – similar to how NIST frameworks shaped traditional cybersecurity.

Attackers are shifting focus from traditional breaches to manipulating agent behavior. A compromised agent can trigger unauthorized transactions, leak regulated data, or propagate malicious instructions. SOCs will need A2AS-aligned controls to detect, contain, and attribute these attacks.

Regulations are also evolving. Failing to secure AI agents could soon mean non-compliance.

How, then, can organizations prepare for A2AS?

Here’s a short checklist for locking down your agentic AI systems and preparing for the A2AS framework:

Action	Description
Inventory agentic systems	Identify autonomous agents, their identities, inter-agent communication paths, exposed APIs, and execution privileges. Establish clear ownership and trust boundaries for each agent. Wallarm can help you discover all your ecosystem.
Map agent behavior and exposure	Document what actions each agent is allowed to perform, which tools and data sources it can access, and which prompts or instructions it can receive or generate. This forms the basis for behavior certification.
Enforce runtime protection	Apply real-time controls to inspect and block malicious prompts, unauthorized tool calls, and abnormal agent behavior across APIs and agent interactions. Security must operate at runtime—not only at design time.
Implement behavior certification & policy enforcement	Define and enforce agent behavior certificates, authenticated prompts, and policy-as-code controls to ensure agents act only within approved intent, scope, and authority, in line with A2AS principles.
Monitor, detect, and attribute agent activity	Continuously monitor agent-to-agent interactions, prompt flows, outputs, and tool usage. Enable SOC teams to detect manipulation attempts, attribute actions to specific agents, and contain compromised behavior.
Adopt and align with agentic AI standards	Align internal security controls with emerging frameworks like A2AS to ensure consistency, interoperability, and readiness for future regulatory and industry requirements.

The time to act is now - waiting until there’s a breach is too late.

Secure the Future of Agentic AI

Hardening agent-to-agent communications and agentic orchestration represents a new frontier of cybersecurity strategy. A2AS offers a framework for protecting against many use cases from agent misbehavior and prompt injections to insecure AI supply chains. Wallarm provides the practical implementation. As enterprises embrace agentic AI, security can’t be an afterthought. They must bake it in at runtime, at the API/agent boundary.

To learn more about the A2AS framework, you can visit their website. To find out how Wallarm can help you prepare, schedule a demo today.

文章来源: https://lab.wallarm.com/how-wallarm-redefines-agentic-ai-security/
如有侵权请联系:admin#unsafe.sh