[Research] Analysis of 74,636 AI Agent Interactions: 37.8% Contained Attack Attempts - New "Inter-Agent Attack" Category Emerges
嗯,用户让我帮忙总结一篇文章,控制在一百个字以内,而且不需要特定的开头。首先,我需要理解文章的内容。文章主要讲的是他们在38个生产AI代理部署中进行威胁检测的情况,特别是2026年第三周的数据。 关键点包括检测到28,194次威胁,攻击率37.8%。还有新的威胁类别,比如跨代理攻击占3.4%,数据外泄占19.2%,主要是针对系统提示和RAG上下文。另外,还有 jailbreaks 的检测信心高达96.3%。攻击技术方面,指令覆盖、工具注入、RAG中毒和系统提示提取都有一定的比例。 用户可能需要一个简洁明了的总结,用于快速了解文章的主要内容。所以我要把这些关键数据和发现浓缩到100字以内,同时保持信息的完整性。 最后,确保语言简洁,直接描述内容,不使用“这篇文章”这样的开头。 </think> 在38个AI代理部署中检测到28,194次威胁,攻击率37.8%。跨代理攻击、数据外泄和jailbreaks等威胁突出。攻击技术包括指令覆盖、工具注入和RAG中毒等。跨代理攻击尤其值得关注。 2026-1-28 06:30:42 Author: www.reddit.com(查看原文) 阅读量:0 收藏

We've been running inference-time threat detection across 38 production AI agent deployments. Here's what Week 3 of 2026 looked like with on-device detections.

Key Findings

  1. 28,194 threats detected across 74,636 interactions (37.8% attack rate)

  2. Inter-Agent Attacks emerged as a new category (3.4% of threats) - agents sending poisoned messages to other agents

  3. Data exfiltration leads at 19.2% - primarily targeting system prompts and RAG context

  4. Jailbreaks detected with 96.3% confidence - patterns are now well-established

Attack Technique Breakdown

  1. Instruction Override: 9.7%

  2. Tool/Command Injection: 8.2%

  3. RAG Poisoning: 8.1% (trending up)

  4. System Prompt Extraction: 7.7%

The inter-agent attack vector is particularly concerning given the MCP ecosystem growth. We're seeing goal hijacking, constraint removal, and recursive propagation attempts.

Full report with methodology: https://raxe.ai/threat-intelligence

Github: https://github.com/raxe-ai/raxe-ce is free for the community to use

Happy to answer questions about detection approaches


文章来源: https://www.reddit.com/r/netsec/comments/1qp3rpz/research_analysis_of_74636_ai_agent_interactions/
如有侵权请联系:admin#unsafe.sh