The RAG Paradox: Using Black-Box Attacks to Reveal Unintentional Vulnerabilities in Retrieval-Augmented Generation Systems

The paper proposes a black-box attack based on the RAG paradox. By exploiting the transparency of RAG systems, an attacker can upload poisoned documents without access to internal components, affecting system performance and reducing user trust. Experiments show that the method is effective and that the generated content appears natural and credible.

2025-10-18 06:53:39 Author: arxiv.org

Abstract: With the growing adoption of retrieval-augmented generation (RAG) systems, various attack methods have been proposed to degrade their performance. However, most existing approaches rely on unrealistic assumptions in which external attackers have access to internal components such as the retriever. To address this issue, we introduce a realistic black-box attack based on the RAG paradox, a structural vulnerability arising from the system's effort to enhance trust by revealing both the retrieved documents and their sources to users. This transparency enables attackers to observe which sources are used and how information is phrased, allowing them to craft poisoned documents that are more likely to be retrieved and upload them to the identified sources. Moreover, as RAG systems directly provide retrieved content to users, these documents must not only be retrievable but also appear natural and credible to maintain user confidence in the search results. Unlike prior work that focuses solely on improving document retrievability, our attack method explicitly considers both retrievability and user trust in the retrieved content. Both offline and online experiments demonstrate that our method significantly degrades system performance without internal access, while generating natural-looking poisoned documents.
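To make the retrievability point concrete, the toy sketch below shows why a document that echoes the phrasing a system is seen to use can outrank a genuine one. It is only an illustration under loud assumptions: the retriever here is a bag-of-words cosine similarity, which the abstract does not specify, and the query, the corpus, the "Zephyr Bridge" entity, and the function names `tokenize`, `cosine`, and `rank` are all hypothetical. It is not the authors' method, and it does not measure the second requirement in the abstract, that the document also look natural and credible.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: str, corpus: dict[str, str]) -> list[tuple[str, float]]:
    """Score every document against the query, highest similarity first."""
    q = Counter(tokenize(query))
    scores = [
        (doc_id, cosine(q, Counter(tokenize(text))))
        for doc_id, text in corpus.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical query and corpus about a fictional landmark (assumption:
# a stand-in for the "sources" a real system would expose to its users).
query = "when was the zephyr bridge completed"
corpus = {
    # A genuine document that answers the question in its own words.
    "genuine": "Construction of the Zephyr Bridge finished in 1990 "
               "after six years of work.",
    # A document that mirrors the observed query phrasing.
    "mimic": "When was the Zephyr Bridge completed? "
             "The Zephyr Bridge was completed in 2005.",
}

ranking = rank(query, corpus)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In this toy setting the phrasing-matched document scores higher simply because it shares more terms with the query. Real retrievers are typically dense or hybrid, so the effect there would have to be checked empirically, as the paper's offline and online experiments do.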

Submission history

From: Chanwoo Choi
[v1] Fri, 28 Feb 2025 12:32:53 UTC (561 KB)
[v2] Fri, 30 May 2025 08:19:35 UTC (1,290 KB)

Source: https://arxiv.org/abs/2502.20995