Prompt Injection Can’t Be Fully Mitigated, NCSC Says Reduce Impact Instead

Prompt Injection Can’t Be Fully Mitigated, NCSC Says Reduce Impact Instead
文章指出提示注入攻击与SQL注入不同，因大型语言模型（LLM）无法区分数据与指令，导致其难以完全修复。专家建议组织采用风险降低策略，包括限制模型访问、人工审核关键操作及监控API调用等措施来应对这一威胁。 2025-12-12 18:36:11 Author: securityboulevard.com(查看原文) 阅读量:7 收藏

Don’t treat prompt injection like SQL injection. So says the National Cyber Security Centre (NCSC).

In a blog post, the NCSC warned defenders that they may never be able to fully mitigate prompt injection. Disheartening, yes, but the blog’s author, David C., NCSC’s technical director for platforms research, encourages organizations to instead put their resources toward reducing its impact.

“Data is something that is stored or used in a query. Similar is true in cross-site scripting and buffer overflows, in that data and instructions have inherent differences in how they are processed,” he explained, adding that mitigating those issues enforces the separation between data and instructions.

“Under the hood of an LLM, there’s no distinction made between data or instructions;  there is only ever ‘next token.’ When you provide an LLM prompt, it doesn’t understand the text in the way a person does. It is simply predicting the most likely next token from the text so far,” he wrote.

Since no inherent distinction exists between data and instruction, “it’s very possible that prompt injection attacks may never be totally mitigated in the way that SQL injection attacks can be,” according to the blog post.

“The NCSC technical director is exactly right, prompt injection is fundamentally different from SQL injection and treating them as equivalent could leave organizations dangerously exposed,” says Diana Kelley, CISO at Noma Security.

“Another foundational property of LLM systems that makes these attacks possible is that content inside a document is not inherently treated as mere data; depending on how the application constructs the prompt and retrieval pipeline, an LLM may interpret that content as executable instruction material,” she says. “The model has no intrinsic, reliable mechanism for determining whether a token sequence is benign or adversarial.”

She points to the newly disclosed GeminiJack vulnerability as a “perfect example of the prompt injection threat, as it demonstrates how weaponized content hidden in a shared document could execute during a normal employee search, leading to silent data exfiltration.”

The NCSC warning points to broader implications as the world shifts with the introduction of AI, says Erez Schwartz, research engineer at Oasis Security. “As the risk grows more and more the need for enforcing least privileged access for AI agents to help in mitigating prompt injections,” he says.

Prompt injection “is already the top OWASP risk for LLM applications and NCSC now warns it may never be fully fixed, only managed, since the architecture of next token prediction treats all context as morally equal and endlessly reinterpretable,” says Jason Soroko, senior fellow at Sectigo.   

The issue extends well beyond the familiar “ignore previous instructions” jailbreaks, he says, noting “the real problem emerges once models are wired into tools, money flows, codebases, or private corpora, where indirect injections in web pages, documents, emails, metadata, or even image text can hijack autonomous agents and quietly exfiltrate secrets or trigger dangerous actions without any human seeing a malicious prompt at all.”  

Trey Ford, chief strategy and trust officer at Bugcrowd, says this is only the beginning. “Promptware is a fun attack pattern we’re going to see a lot more of…whether it be having a laugh bombing sales and marketing outreach by embedding prompts in LinkedIn profiles, or moving toward malice in calendar invites – we’re going to see more of this moving forward,” he says.

Defenders are facing a challenge because the services are operating within the context of the user and treating the inputs as user-provided prompting, Ford explains. “From a detection and response standpoint, logging and monitoring will require a level of vigilance as all actions taken appear normal and correct,” he says. “The velocity of the queries is a deceptive starting point, but that’s an easy issue for adversaries to overcome.”

Noting that recent attacks, such as prompt injections within CI/CD pipelines and AI browser inputs, demonstrate that this is a real and present danger, Damon Small, who sits on the board of directors at Xcape, Inc., says he backs the NCSC’s advice to adopt a risk reduction strategy, “treating LLMs as inherently vulnerable and needing external protection.” That, he says, “means limiting the model’s access, requiring human review for critical actions, and meticulously monitoring API calls to prevent malicious prompts from causing significant damage.”

Calling attempts to mitigate prompt injection “a vibrant area of research,” NCSC notes approaches include “detections of prompt injection attempts; training models to prioritize ‘instructions’ over anything in ‘data’ that looks like an instruction; [and] highlighting to a model what is ‘data.’”

The agency advises defenders to create a secure design, make it harder for LLMs to act on instructions included within data and monitor systems.

Mirroring some of NCSC’s advice, Daniel Koch, vice president of R&D at Oasis Security, says prompt injection should be approached the same way defenders “treat other unmitigable or partially mitigable risks.”

That, he says, includes:

Architect for containment rather than perfect prevention. Don’t rely on the LLM alone to enforce security boundaries. Wrap LLM components with traditional security controls: identity, authorization, policy enforcement, data-loss prevention, etc.

Assume model instructions can be overridden: Design workflows so that even if a prompt injection succeeds, the blast radius is limited by external safeguards.

Raise the difficulty of abuse: Structured prompts, clear separation of user input, and thoughtful UI design won’t eliminate attacks but do reduce their likelihood.

Monitor continuously: Since prevention is imperfect, detection and logging are essential to catch manipulation early.

 Since LLMs are, as NCSC contends, “inherently confusable deputies,” organizations must have “defense-in-depth strategies specifically designed for non-deterministic systems, with proper guardrails that constrain what AI agents can access and continuous monitoring for suspicious behavior,” says Kelley. “After the ForcedLeak and GeminiJack vulnerabilities, the pattern is clear: without proper controls, enterprise AI exposure will only grow.”

Since that confusion can’t be fixed, “the only viable defense is to stop treating the AI like an oracle and start treating it like a high-privilege employee with known and dangerous flaws,” says Small.