Announcing the Tonic Textual MCP server: PII redaction meets AI agents
2026-3-20 01:55:1 Author: securityboulevard.com

The challenge: sensitive data is everywhere

Every organization handles sensitive data. It lives in training datasets, test environments, customer support logs, legal contracts, medical records, and countless other places. Whether you’re preparing data for model training, generating realistic test datasets, or processing documents for review, the challenge is the same: you need to find and protect personally identifiable information (PII) without destroying the usefulness of the data.

Tonic Textual already solves this problem. It detects and transforms PII across text, files, and audio in 50+ languages, with fine-grained control over how each entity type is handled: masking, synthesis, or deterministic replacement. Thousands of teams use Textual to prepare safe training corpora, generate compliant test data, and redact sensitive documents at scale.

Today, we’re making Textual even more accessible: we’re releasing an open-source MCP server that brings Textual’s full redaction capabilities directly into AI agent workflows.

What is MCP?

The Model Context Protocol (MCP) is an open standard for connecting AI assistants to external tools and data sources. Think of it as a universal plug that lets AI agents take real actions: not just generating text, but calling APIs, processing files, and interacting with the systems you already use.

MCP is supported by a growing ecosystem of clients including Claude Desktop, Cursor, Windsurf, and custom-built agents. By publishing a Textual MCP server, we’re making PII redaction a first-class capability in any of these environments.
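
To make the "universal plug" idea concrete: MCP messages are JSON-RPC 2.0, and a client invokes a server tool with a `tools/call` request. The sketch below only builds and prints such a request. The tool name `redact_text` and its `text` argument are hypothetical placeholders, not confirmed names from the Textual server.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# "redact_text" and "text" are assumed names used purely for illustration.
request = build_tool_call(1, "redact_text", {"text": "Call Jane at 555-0100."})
print(request)
```

In practice an MCP client such as Claude Desktop constructs and sends these messages for you; the agent only decides which tool to call and with what arguments.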

What you can do with the Textual MCP server

The server exposes 21 tools that cover the full breadth of Textual’s capabilities:

Redact text on the fly

Process plain text, JSON, XML, and HTML directly. Bulk mode handles multiple strings in a single call. JSON redaction supports JSONPath expressions for surgical control: you specify exactly which paths to redact and which to leave untouched.
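
As a rough illustration of path-scoped redaction, the sketch below replaces only the values at selected paths in a JSON document. It is not Textual's implementation: it supports just a tiny JSONPath-like subset (dotted keys and `[*]` for any array element), it blanks whole values rather than detecting entities inside them, and the function name `redact_paths` is made up for this example.

```python
def redact_paths(doc, paths, placeholder="[REDACTED]"):
    """Return a copy of `doc` with scalar values at the given paths replaced.

    Paths use a minimal JSONPath-like form, e.g. "$.customer.email" or
    "$.orders[*].phone". Only dotted keys and "[*]" are understood.
    """
    targets = set(paths)

    def walk(node, path):
        if isinstance(node, dict):
            return {key: walk(value, f"{path}.{key}") for key, value in node.items()}
        if isinstance(node, list):
            # Every element of a list shares the same wildcard path.
            return [walk(item, f"{path}[*]") for item in node]
        return placeholder if path in targets else node

    return walk(doc, "$")


record = {
    "customer": {"name": "Jane Doe", "email": "jane@example.com"},
    "orders": [{"id": 1, "phone": "555-0100"}, {"id": 2, "phone": "555-0101"}],
}
redacted = redact_paths(record, ["$.customer.email", "$.orders[*].phone"])
print(redacted)
```

Here the customer name and order IDs pass through unchanged, while the email and both phone numbers are replaced.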

Process files with format preservation

Upload PDFs, Word documents, spreadsheets, and images for redaction. The original formatting is preserved; a redacted PDF still looks like a PDF, not a stripped-down text dump. File processing runs in the background so your agent isn’t blocked waiting for large files to complete.

De-identify entire directories in one call

Point the deidentify_folder tool at a directory and it handles everything: detects file types, routes text files through the fast inline API and binary files through the upload pipeline, processes them in parallel, and writes redacted output alongside the originals. Filter by extension, skip specific patterns, and even redact folder and file names.
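
The routing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the server's code: the extension list, the function names, and the glob-style skip patterns are all invented for the example, and `process` is a placeholder that returns a label instead of calling Textual.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumed set of extensions treated as text; the real server's list may differ.
TEXT_EXTENSIONS = {".txt", ".json", ".xml", ".html", ".csv", ".md"}


def route(path: Path) -> str:
    """Pick a pipeline: 'inline' for text files, 'upload' for everything else."""
    return "inline" if path.suffix.lower() in TEXT_EXTENSIONS else "upload"


def plan_folder(root: Path, skip_patterns=()):
    """Map every file under `root` to a pipeline, honoring skip patterns."""
    files = [
        p for p in sorted(root.rglob("*"))
        if p.is_file() and not any(p.match(pattern) for pattern in skip_patterns)
    ]
    return {p: route(p) for p in files}


def process(item):
    """Placeholder for the real redaction call; returns a label only."""
    path, pipeline = item
    return f"{pipeline}:{path.name}"


def deidentify(root: Path, skip_patterns=(), workers=4):
    """Process all planned files in parallel and collect the results."""
    plan = plan_folder(root, skip_patterns)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, plan.items()))
```

Calling `deidentify(Path("exports"), skip_patterns=("*.log",))` would label every file under a hypothetical `exports` directory except log files, with text formats going to the inline path and PDFs, images, and other binaries going to the upload path.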

Manage datasets

Create datasets in the Textual platform, upload files for scanning, and download redacted results, all through your AI agent. This integrates Textual’s dataset workflow directly into automated pipelines.

Full control over redaction behavior

Every tool supports Textual’s rich configuration options:

  • Per-entity handling: Choose redaction, synthesis, or off for each of the 35+ supported entity types
  • Synthesis modes: V1 or V2 (length-aware), plus specialized generators for emails, phone numbers, SSNs, credit cards, and more
  • Deterministic replacements: Map specific PII values to consistent synthetic replacements across documents
  • Allow and block lists: Force-detect or exclude specific values using strings or regex patterns
  • Custom entity models: Use models trained on your proprietary data
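
To show how per-entity handling and deterministic replacement interact, here is a toy sketch. Everything in it is an assumption made for illustration: two regex detectors stand in for Textual's models, the mode strings `"redaction"`, `"synthesis"`, and `"off"` mirror the options listed above but are not confirmed API values, and the HMAC-based `pseudonym` helper is just one simple way to get the same replacement for the same input every time.

```python
import hashlib
import hmac
import re

# Toy detectors standing in for real entity-detection models.
DETECTORS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}-\d{4}\b"),
}


def pseudonym(entity_type: str, value: str, secret: bytes = b"demo-secret") -> str:
    """Derive a stable stand-in token: same type and value give the same output."""
    digest = hmac.new(secret, f"{entity_type}:{value}".encode(), hashlib.sha256)
    return f"{entity_type}_{digest.hexdigest()[:8]}"


def redact(text: str, handling: dict) -> str:
    """Apply the chosen mode per entity type; unlisted types default to redaction."""
    for entity_type, pattern in DETECTORS.items():
        mode = handling.get(entity_type, "redaction")
        if mode == "off":
            continue
        if mode == "redaction":
            text = pattern.sub(f"[{entity_type}]", text)
        elif mode == "synthesis":
            text = pattern.sub(lambda m, t=entity_type: pseudonym(t, m.group()), text)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return text


sample = "Email jane@example.com or call 555-0100."
print(redact(sample, {"EMAIL_ADDRESS": "synthesis", "PHONE_NUMBER": "redaction"}))
```

Because the replacement token depends only on the secret, the entity type, and the original value, the same email address maps to the same token in every document, which keeps joins and cross-references intact after de-identification.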

Example use cases

  • Preparing safe training data: Redact PII from text corpora before fine-tuning or pre-training language models
  • Generating realistic test datasets: Synthesize PII so data stays structurally useful but fully compliant
  • Building compliant RAG pipelines: De-identify documents before they enter vector stores, so retrieved context is always clean
  • Document review workflows: Sanitize contracts, medical records, or support tickets during AI-assisted review
  • Data pipeline automation: Clean data exports in bulk, redact files as part of CI/CD, or process incoming data on arrival

Getting started

Install

npm install -g @tonicai/textual-mcp

Configure

Set your Textual API key as an environment variable:

export TONIC_TEXTUAL_API_KEY=your-api-key-here

For self-hosted Textual instances, also set:

export TONIC_TEXTUAL_BASE_URL=https://your-textual-instance.example.com

Add to your MCP client

For Claude Desktop, add the server to your MCP configuration:

{
  "mcpServers": {
    "textual": {
      "command": "textual-mcp",
      "env": {
        "TONIC_TEXTUAL_API_KEY": "your-api-key-here"
      }
    }
  }
}

Then ask your AI assistant to redact text, process a file, or de-identify an entire folder — Textual handles the rest.

Under the hood

For the technically curious:

  • Built on the MCP SDK with HTTP streaming transport, exposing a /mcp endpoint and a /health check
  • Concurrency-controlled — a semaphore limits concurrent API calls (default 50) to avoid overwhelming the Textual API
  • Background task processing — file uploads use the MCP task API for non-blocking operation with status polling
  • Automatic retries — transient network errors are retried transparently
  • Structured logging — rotating JSON log files for observability and debugging
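
The concurrency limit and retry behavior listed above follow a common pattern, sketched here with `asyncio`. This is an illustration, not the server's actual code: the `TransientError` class, the `call_with_retry` helper, the attempt count, and the exponential backoff delays are all assumptions. Only the default limit of 50 comes from the list above.

```python
import asyncio

MAX_CONCURRENT_CALLS = 50  # default limit mentioned above


class TransientError(Exception):
    """Stand-in for a retryable network failure."""


async def call_with_retry(operation, semaphore, attempts=3, base_delay=0.01):
    """Run `operation` under the semaphore, retrying transient failures."""
    for attempt in range(1, attempts + 1):
        try:
            # The slot is released before sleeping, so waiting callers can proceed.
            async with semaphore:
                return await operation()
        except TransientError:
            if attempt == attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def main():
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_CALLS)
    failures = {"remaining": 2}

    async def flaky():
        # Fails twice, then succeeds, to exercise the retry path.
        if failures["remaining"] > 0:
            failures["remaining"] -= 1
            raise TransientError("simulated network error")
        return "ok"

    return await call_with_retry(flaky, semaphore)


print(asyncio.run(main()))  # prints "ok" after two simulated failures
```

Releasing the semaphore between attempts is a deliberate choice in this sketch: a call that is backing off does not hold one of the limited slots while it sleeps.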

Try it out

We’d love your feedback. Try the server, let us know what you build with it, and open an issue if you run into anything. PII redaction should be as easy as asking your AI assistant — and now it is.

*** This is a Security Bloggers Network syndicated blog from Expert Insights on Synthetic Data from the Tonic.ai Blog authored by Expert Insights on Synthetic Data from the Tonic.ai Blog. Read the original post at: https://www.tonic.ai/blog/announcing-tonic-textual-mcp-server


Source: https://securityboulevard.com/2026/03/announcing-the-tonic-textual-mcp-server-pii-redaction-meets-ai-agents/