I built a deterministic security firewall API for AI agents (Python SDK, free tier)
2026-03-24 14:08:03 | Author: www.reddit.com

I have been working on SovereignShield, a security layer that sits between user input and your LLM. Instead of using another model to judge if input is safe, it uses pure pattern matching against a structured ruleset. Fully deterministic: same input, same result, every time. Sub-millisecond latency.
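To make the "same input, same result" idea concrete, here is a minimal sketch of deterministic pattern matching against a small ruleset. It is not SovereignShield's actual code: the rule names, the regexes, and the `first_match` helper are all made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical ruleset of (category, compiled pattern) pairs.
# These are illustrative only, not SovereignShield's real rules.
RULES = [
    ("prompt_injection",
     re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.I)),
    ("shell_execution",
     re.compile(r"\b(os\.system|subprocess\.\w+)\s*\(|\brm\s+-rf\b", re.I)),
    ("path_traversal", re.compile(r"\.\./")),
]


class Match(NamedTuple):
    category: str
    text: str


def first_match(user_input: str) -> Optional[Match]:
    """Return the first rule that matches, or None.

    There is no randomness and no model call, so the same input
    always produces the same result.
    """
    for category, pattern in RULES:
        m = pattern.search(user_input)
        if m:
            return Match(category, m.group(0))
    return None
```

Calling `first_match` twice on the same string always returns the same value, which is the property the post contrasts with probabilistic LLM-based filters.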

Why I built it: Every AI agent I have seen trusts user input by default. LLM-based safety filters are probabilistic, meaning they can be bypassed with creative encoding, context manipulation, or simply by retrying enough times. I wanted something that gives a hard yes/no from fixed rules, not guessing.

What it blocks:

  • Prompt injection and jailbreaks
  • Encoded payloads (base64, hex, unicode obfuscation)
  • Shell execution (os.system, subprocess, rm -rf)
  • Credential exfiltration via URL parameters
  • SQL injection, XSS, path traversal, reverse shells
  • 50+ attack categories total
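One way a pattern-based filter can handle the encoded payloads in the list above is to decode anything that looks like base64 or hex and run the same patterns over the decoded text. The sketch below assumes that approach. The post does not say how SovereignShield does it, and `BLOCK_PATTERN`, `decoded_views`, and `is_blocked` are hypothetical names.

```python
import base64
import binascii
import re

# Illustrative block pattern; a real ruleset would be much larger.
BLOCK_PATTERN = re.compile(r"rm\s+-rf|os\.system|subprocess", re.I)

# Heuristics for "this token might be an encoded payload".
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
HEX_CANDIDATE = re.compile(r"(?:[0-9a-fA-F]{2}){8,}")


def decoded_views(text: str) -> list[str]:
    """Return the raw text plus any base64/hex tokens that decode to UTF-8."""
    views = [text]
    for token in B64_CANDIDATE.findall(text):
        try:
            views.append(base64.b64decode(token, validate=True).decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            pass  # not valid base64 or not text: ignore
    for token in HEX_CANDIDATE.findall(text):
        try:
            views.append(bytes.fromhex(token).decode("utf-8"))
        except (ValueError, UnicodeDecodeError):
            pass  # not valid hex or not text: ignore
    return views


def is_blocked(text: str) -> bool:
    """Block if the pattern matches the raw text or any decoded view of it."""
    return any(BLOCK_PATTERN.search(view) for view in decoded_views(text))
```

For example, `is_blocked("run " + base64.b64encode(b"os.system('id')").decode())` is blocked even though the raw string contains no suspicious substring.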

Architecture: 4 layers run in sequence. These are InputFilter (pattern matching), Firewall (rate limiting), CoreSafety (action-level blocking), and Conscience (ethical gate). Every security verdict is returned as a frozen dataclass inside a locked namespace, so downstream code cannot flip a BLOCK decision by reassigning it at runtime.
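The layered design can be sketched roughly as follows. Only the four layer names and the frozen-dataclass idea come from the post. The `Verdict` fields, the layer bodies, and `evaluate` are assumptions for illustration, and the "locked namespace" part is not modeled because the post does not describe how it works.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Callable, Optional


@dataclass(frozen=True)
class Verdict:
    """Immutable result: ordinary attribute assignment raises FrozenInstanceError."""
    allowed: bool
    layer: str
    reason: str


# Each layer returns a reason string to block, or None to let the input pass.
Layer = Callable[[str], Optional[str]]


def input_filter(text: str) -> Optional[str]:
    if "ignore previous instructions" in text.lower():
        return "matched injection pattern"
    return None


def firewall(text: str) -> Optional[str]:
    return None  # rate limiting would go here; stubbed out in this sketch


def core_safety(text: str) -> Optional[str]:
    return "shell execution" if "rm -rf" in text else None


def conscience(text: str) -> Optional[str]:
    return None  # ethical gate; stubbed out in this sketch


LAYERS: list[tuple[str, Layer]] = [
    ("InputFilter", input_filter),
    ("Firewall", firewall),
    ("CoreSafety", core_safety),
    ("Conscience", conscience),
]


def evaluate(text: str) -> Verdict:
    """Run the layers in order and stop at the first one that blocks."""
    for name, layer in LAYERS:
        reason = layer(text)
        if reason is not None:
            return Verdict(allowed=False, layer=name, reason=reason)
    return Verdict(allowed=True, layer="all", reason="passed every layer")
```

With this sketch, `evaluate("please rm -rf /").allowed = True` raises `FrozenInstanceError`. A frozen dataclass on its own can still be bypassed with `object.__setattr__`, so any stronger guarantee would have to come from additional mechanisms not shown here.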

Self-improving: There is an adaptive engine where you report new attacks via the API. The system extracts detection keywords, sandbox-tests them against your historical scan data for false positives, and auto-deploys rules that pass validation.
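The report, extract, validate loop could look something like the sketch below. It is a toy version under stated assumptions: keyword extraction is reduced to "tokens shared by every reported sample", validation to "no hits in known-benign history", and `STOPWORDS`, `extract_keywords`, and `validate` are hypothetical names rather than part of the real API.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {"then", "your", "that", "this", "with", "from", "have"}


def extract_keywords(attack_samples: list[str], min_share: float = 1.0) -> list[str]:
    """Candidate keywords: tokens present in at least min_share of the samples."""
    counts: Counter = Counter()
    for sample in attack_samples:
        tokens = set(re.findall(r"[a-z_]{4,}", sample.lower())) - STOPWORDS
        counts.update(tokens)
    needed = min_share * len(attack_samples)
    return sorted(tok for tok, c in counts.items() if c >= needed)


def validate(keywords: list[str], benign_history: list[str],
             max_false_positives: int = 0) -> list[str]:
    """Keep only keywords that fire on at most max_false_positives benign inputs."""
    kept = []
    for kw in keywords:
        hits = sum(1 for text in benign_history if kw in text.lower())
        if hits <= max_false_positives:
            kept.append(kw)
    return kept
```

In this sketch a candidate such as `"mode"` would be dropped if any past benign request mentioned "dark mode", which is the kind of false positive the sandbox step is meant to catch before a rule is deployed.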

Integration:

pip install sovereign-shield-client

from sovereign_shield_client import SovereignShield

shield = SovereignShield(api_key="ss_your_key")  # replace with your own key
safe = shield.scan(user_input)  # user_input: the raw text to check before it reaches the LLM
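A typical way to wire a scanner like this into an agent is to gate the LLM call on the scan result and fail closed if the scan itself errors. The sketch below deliberately avoids the real client and takes plain callables instead. It assumes the scan returns something truthy for safe input, which the post does not confirm, and `guarded_call` is a hypothetical helper.

```python
from typing import Callable


def guarded_call(user_input: str,
                 scan: Callable[[str], bool],
                 llm: Callable[[str], str]) -> str:
    """Forward user_input to the LLM only if the scan says it is safe.

    Assumption: scan(text) is truthy for safe input. Any exception from
    the scanner (network error, quota exceeded) is treated as unsafe.
    """
    try:
        ok = bool(scan(user_input))
    except Exception:
        ok = False  # fail closed: no verdict means no LLM call
    if not ok:
        return "Request blocked by input screening."
    return llm(user_input)
```

Failing closed trades availability for safety: if the scanning service is unreachable, requests are refused rather than passed through unchecked.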

Free tier: 1,000 scans/month (no credit card). Pro: 100,000 scans/month for $8/mo.

Site: https://sovereign-shield.net

GitHub (BSL 1.1): https://github.com/mattijsmoens/sovereign-shield 

PyPI: https://pypi.org/project/sovereign-shield-client/


Source: https://www.reddit.com/r/blackhat/comments/1s2esr9/i_built_a_deterministic_security_firewall_api_for/