TL;DR:
The problem: Sparkplug B is the dominant MQTT-based protocol in industrial control and SCADA environments, but until now there was no publicly available security fuzzer for it.
What we built: A systematic Sparkplug B protocol fuzzer covering all 9 message types, 19 data types, and 87+ unique field paths defined by the Eclipse Sparkplug specification.
How AI helped: Claude Code read the Sparkplug protobuf definition and specification alongside our working prototype, identified the coverage gaps and Python defects in the original, and produced a hardened, self-contained tool with CLI, logging, and passive network discovery.
Why it matters: ICS and SCADA operators, device vendors, and their defenders now have a tool to exercise malformed-traffic behavior on Sparkplug B endpoints. Fuzzing can surface crashes, protocol violations, and state-handling bugs before an attacker does.
Where to get it: https://github.com/BishopFox/sparkplugFuzzer.
Sparkplug B is an open MQTT-based specification maintained by the Eclipse Foundation that standardizes how industrial control system (ICS) and SCADA (Supervisory Control and Data Acquisition) devices publish state, telemetry, and commands over a shared broker.
It serializes messages with Google Protocol Buffers, which keeps payloads small enough for bandwidth-constrained links such as cellular or radio, and it defines a structured topic namespace (spBv1.0/{group_id}/{message_type}/{edge_node_id}/{device_id}) across nine message types and 19 data types. That combination is what makes it the default interoperability layer for modern unified namespace (UNS) deployments in manufacturing and critical infrastructure.
Sparkplug B layers three things on top of vanilla MQTT:
spBv1.0/{group_id}/{message_type}/{edge_node_id}/{device_id}. That structure lets a broker, and any subscribed tool, build a live map of the network simply by listening.
This structure is what makes Sparkplug B efficient for OT (Operational Technology) networks, and it's also what makes it a rich target for protocol fuzzing: there are a lot of fields, sequence rules, and type expectations that a malformed payload can violate.
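To make the "live map by listening" point concrete, here is a minimal sketch of how a passive listener might split node- and device-level topics and group devices under their edge nodes. The names `SparkplugTopic`, `parse_topic`, and `build_map` are hypothetical and are not taken from the published fuzzer. The sketch assumes the five-part layout shown above, with the device segment omitted for node-level messages, and it deliberately skips host STATE topics, which follow a different layout.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Set, Tuple

# Node- and device-level message types; STATE is handled separately in practice.
MESSAGE_TYPES = {"NBIRTH", "DBIRTH", "NDATA", "DDATA",
                 "NCMD", "DCMD", "NDEATH", "DDEATH"}


class SparkplugTopic(NamedTuple):
    group_id: str
    message_type: str
    edge_node_id: str
    device_id: Optional[str]  # None for node-level messages


def parse_topic(topic: str) -> Optional[SparkplugTopic]:
    """Split a node- or device-level topic; return None for anything else."""
    parts = topic.split("/")
    if len(parts) not in (4, 5) or parts[0] != "spBv1.0":
        return None
    if parts[2] not in MESSAGE_TYPES:
        return None
    device_id = parts[4] if len(parts) == 5 else None
    return SparkplugTopic(parts[1], parts[2], parts[3], device_id)


def build_map(topics: Iterable[str]) -> Dict[Tuple[str, str], Set[str]]:
    """Group observed device IDs under their (group_id, edge_node_id) pair."""
    network: Dict[Tuple[str, str], Set[str]] = {}
    for raw in topics:
        parsed = parse_topic(raw)
        if parsed is None:
            continue  # not a node/device topic this sketch understands
        devices = network.setdefault((parsed.group_id, parsed.edge_node_id), set())
        if parsed.device_id is not None:
            devices.add(parsed.device_id)
    return network
```

Feeding `build_map` the topic strings seen on a wildcard subscription yields a dictionary keyed by edge node, which is roughly the shape of inventory a tester would want before choosing fuzzing targets.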
Protocol specifications define what is required, but they rarely specify how an implementation should handle unexpected traffic. Protocol fuzzers send malformed, random, or unexpected traffic to endpoints on the network. Security fuzzing tests both whether an implementation does what the specification says and how devices behave when they receive traffic the specification never anticipated. A fuzzer uncovers device behavior that may introduce vulnerabilities or disrupt communication for critical devices.
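As a small illustration of what "unexpected traffic" can mean for a typed field, the sketch below generates boundary and just-out-of-range values for a fixed-width integer type such as Int8 or UInt16. The helper name `boundary_values` is hypothetical, and this is only one simple way to pick test values, not a description of how the published fuzzer selects them.

```python
from typing import List


def boundary_values(bits: int, signed: bool) -> List[int]:
    """Return values at, next to, and just past the limits of an integer type."""
    if signed:
        lo = -(1 << (bits - 1))
        hi = (1 << (bits - 1)) - 1
    else:
        lo = 0
        hi = (1 << bits) - 1
    candidates = [lo - 1, lo, lo + 1, -1, 0, 1, hi - 1, hi, hi + 1]
    # Deduplicate (unsigned types repeat -1, 0, 1) and return in sorted order.
    return sorted(set(candidates))


if __name__ == "__main__":
    print("Int8 :", boundary_values(8, signed=True))
    print("UInt8:", boundary_values(8, signed=False))
```

Values one step outside the declared range are the interesting ones: a receiver that stores them without checking may wrap, truncate, or reject them, and each of those outcomes tells the tester something different.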
Sparkplug B rides on MQTT brokers that sit between the IT network and the OT floor – the same brokers that carry set-points to PLCs, telemetry from pumps and valves, and commands to robots and CNC machines. A malformed payload that crashes a device, corrupts its state, or gets silently accepted when it shouldn't be doesn't just cost a reboot. In an industrial environment, the downstream physical process can go with it.
Examples:
Availability: A single malformed DDATA message that crashes an edge node can take an entire production line's sensor feed offline. Recovering a hung PLC often requires manual intervention on the plant floor.
Integrity: Sparkplug B's alias mechanism, where short integer IDs stand in for named metrics, is powerful, but bugs in alias-collision or type-mismatch handling can cause the wrong metric to receive the wrong value, for example, a temperature field accepting a string or metric ID 5 rebinding mid-session.
Safety: In critical infrastructure such as water treatment, energy, pharma, and food production, incorrect telemetry or a missed command is a safety event, not just a software bug.
Blast radius: MQTT's publish-subscribe model means a single misbehaving or malicious publisher can reach every subscriber on a topic. Flat namespaces and permissive ACLs, both common in OT implementations, magnify that reach.
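The integrity example above can be sketched as a small check that a receiver, or a tester reviewing captured traffic, might apply. The `AliasTable` class is hypothetical and simplified: it assumes each birth message is reduced to `(alias, name, datatype)` tuples and that data messages carry an alias and a datatype. It is meant to show the two failure modes, not to mirror any particular implementation.

```python
from typing import Dict, List, Tuple


class AliasTable:
    """Tracks alias -> (metric name, datatype) bindings for one session."""

    def __init__(self) -> None:
        self.bindings: Dict[int, Tuple[str, str]] = {}

    def on_birth(self, metrics: List[Tuple[int, str, str]]) -> List[str]:
        """Rebuild bindings from a birth message; report alias collisions."""
        self.bindings = {}
        problems = []
        for alias, name, datatype in metrics:
            if alias in self.bindings:
                first = self.bindings[alias][0]
                problems.append(
                    f"alias {alias} declared for both {first!r} and {name!r}")
                continue  # keep the first binding, flag the second
            self.bindings[alias] = (name, datatype)
        return problems

    def on_data(self, alias: int, datatype: str) -> List[str]:
        """Check a data update against the bindings declared at birth."""
        if alias not in self.bindings:
            return [f"alias {alias} was never declared in a birth message"]
        name, expected = self.bindings[alias]
        if datatype != expected:
            return [f"{name!r} declared as {expected}, received {datatype}"]
        return []
```

A device that silently accepts the messages this check would flag is exhibiting exactly the behavior worth recording during a fuzzing run.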
Sparkplug B is marketed as "self-describing" and easy to adopt. That ease of adoption has outpaced the security tooling that should accompany it.
The initial script was based on Cirrus-Link Sparkplug example code. With slight modifications, we added a file read that loads a static list of strings to fuzz the fields with.
I tell everyone that my code is like a 1987 Honda: it will get you where you're going, but don't look under the hood. But we looked under the hood, and this was the state of testing:
With a working first prototype, we fed our script and the protocol specification into a Claude Code instance along with a few requirements, such as that the output script should install any dependencies and be self-contained. In addition, we asked Claude to check for general programming issues. What it found is a painful list to read, but we are sharing it in the interest of full transparency:
Did it work? Yes, by accident, and it was fragile and not the sort of testing we want to give our clients.
The end state fixed the above issues and improved the protocol fuzzer in multiple ways:
The end result was a robust, systematic, professional tool for testing this critical protocol.
| Capability | Initial script | Improved fuzzer |
| --- | --- | --- |
| Message types exercised | 4 of 9 (NBIRTH, DBIRTH, NDEATH, DDATA) | All 9 (NBIRTH, DBIRTH, NDATA, DDATA, NCMD, DCMD, NDEATH, DDEATH, STATE) |
| Data types fuzzed | 2 (String, Boolean) | All 19 (Int8/16/32/64, UInt8/16/32/64, Float, Double, Boolean, String, DateTime, Text, UUID, DataSet, Bytes, File, Template) |
| Field paths covered | Handful of hardcoded metrics | 87+ unique field paths mapped from the protobuf definition |
| Type-mismatch testing | None | Yes |
| Sequence number manipulation | None | Yes |
| Alias collision testing | None | Yes |
| Protocol ordering violations | None | Yes |
| Raw protobuf corruption | None | Yes (via sparkplug_b_pb2 direct access) |
| Topic namespace fuzzing | None | Yes |
| Network discovery | None | Passive DeviceTracker via wildcard subscription |
| Logging & artifacts | None | Structured send/receive log for client correlation |
| CLI / broker config | Hardcoded localhost:1883 | CLI interface, configurable broker |
| Python version hygiene | Mixed Py2/Py3 print | Clean Python 3 |
| MQTT loop handling | Tight client.loop() | Threaded client.loop_start() |
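To show what the sequence-number and raw-corruption rows can look like on the wire, here is a standard-library sketch that hand-encodes a minimal payload instead of going through `sparkplug_b_pb2` as the tool does. It assumes, from our reading of the Sparkplug protobuf definition, that `timestamp` is field 1 and `seq` is field 3 of the payload message, both varint-encoded; verify those numbers against the .proto file before relying on them. All function names here are hypothetical.

```python
from typing import Dict

TIMESTAMP_FIELD = 1  # assumed field number, check against sparkplug_b.proto
SEQ_FIELD = 3        # assumed field number, check against sparkplug_b.proto


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def varint_field(field_number: int, value: int) -> bytes:
    """Encode one field: tag (field number << 3, wire type 0) then the value."""
    return encode_varint(field_number << 3) + encode_varint(value)


def payload(timestamp_ms: int, seq: int) -> bytes:
    """Build a metric-less payload carrying only timestamp and seq."""
    return varint_field(TIMESTAMP_FIELD, timestamp_ms) + varint_field(SEQ_FIELD, seq)


def seq_cases(last_seq: int) -> Dict[str, int]:
    """Sequence numbers that break the expected 0-255 incrementing pattern."""
    return {
        "replay": last_seq,               # same number sent twice
        "skip": (last_seq + 5) % 256,     # gap in the sequence
        "out_of_range": 256,              # one past the valid range
        "max_uint64": 2**64 - 1,          # largest encodable value
    }


def truncate(data: bytes) -> bytes:
    """Drop the final byte so the last field is cut off mid-value."""
    return data[:-1]
```

For example, `truncate(payload(1, 256))` ends on a byte with the continuation bit set, so a decoder is left waiting for a varint that never finishes. How a device reacts to that, whether with an error, a hang, or silent acceptance, is the observation a raw-corruption test is after.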
| Type | Purpose |
| --- | --- |
| NBIRTH | Edge node announces itself and publishes its full metric set |
| DBIRTH | Device under an edge node announces itself and its metric set |
| NDATA | Edge node publishes metric value updates |
| DDATA | Device publishes metric value updates |
| NCMD | Command targeting an edge node |
| DCMD | Command targeting a device |
| NDEATH | Edge node disconnection (delivered via MQTT Last Will and Testament) |
| DDEATH | Device disconnection |
| STATE | Host application liveness announcement |
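The birth, data, and death messages in the table imply an ordering: a device should not publish data before its birth, and nothing should arrive from a node that has announced its death. The sketch below checks a recorded stream from a single edge node against that ordering. The function name `ordering_violations` and the `(message_type, device_id)` event shape are assumptions made for illustration; command and STATE messages are skipped because they originate from the host side.

```python
from typing import List, Optional, Set, Tuple

Event = Tuple[str, Optional[str]]  # (message_type, device_id or None)


def ordering_violations(events: List[Event]) -> List[str]:
    """Flag messages that arrive outside the birth/death lifecycle."""
    node_online = False
    devices_online: Set[Optional[str]] = set()
    problems: List[str] = []
    for index, (msg_type, device) in enumerate(events):
        if msg_type in ("NCMD", "DCMD", "STATE"):
            continue  # host-originated, not part of the node's own lifecycle
        if msg_type == "NBIRTH":
            node_online = True
            devices_online.clear()  # a new node session resets its devices
        elif msg_type == "NDEATH":
            node_online = False
            devices_online.clear()
        elif not node_online:
            problems.append(f"#{index}: {msg_type} while edge node is offline")
        elif msg_type == "DBIRTH":
            devices_online.add(device)
        elif msg_type == "DDEATH":
            devices_online.discard(device)
        elif msg_type == "DDATA" and device not in devices_online:
            problems.append(f"#{index}: DDATA for {device} before DBIRTH")
    return problems
```

Run against a send log, a check like this separates the out-of-order messages the tester injected on purpose from the ones an endpoint accepted without complaint.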
If you operate a Sparkplug B environment, or you're the vendor shipping devices that speak it, the following controls meaningfully raise the bar against both opportunistic and targeted abuse:
Broker configuration
Protocol-layer hardening
Detection and monitoring
Operational hygiene
What we see in real engagements
Across ICS and OT assessments, a few Sparkplug B patterns come up reliably:
On one automotive manufacturing assessment, all three failures lined up: corporate Wi-Fi bridged directly into the OT network, devices that required no authentication, and sessions that allowed mid-session alias rebinding without ever tombstoning a device after NDEATH.
There is a creativity gap in penetration testing that AI won't be able to overcome: sticking two square pegs in a round hole and jiggling them just right can't be trained. But if you have two almost-working square pegs and a vague idea of how to jiggle them, AI can help get the process across the finish line fairly quickly. Most of the recent news has been about how attackers are outpacing defenders. Bishop Fox hopes that this is proof that, behind the scenes, defenders are quietly fixing issues and hardening their attack surface.
Could we have written this fuzzer without AI assistance? Yes. Could we have written it with the time constraints of the test and validated it against the specifications and protobuf definition? Maybe. I would need a bigger whiteboard. Would we have written the readme and usage documentation? Probably not.
Claude Code took our creative security testing idea and made us more efficient by reading documents and planning a better script. The bane of everyone who has written code is maintaining documentation and usage information. Claude generated that documentation for us, allowing us to spend more time doing what we are good at, assessing critical infrastructure, and less time on the chore of maintaining documentation.
Building the fuzzer was the easy part. If you want to know what else is lurking in a Sparkplug B environment, the three findings from that automotive engagement show up together more often than they should: bridged networks, no authentication, uncleaned alias sessions. That's what Bishop Fox penetration testing is built to surface before an attacker does.
Grab the Sparkplug B fuzzer on GitHub and put it to work, and for critical infrastructure operators specifically, see how Bishop Fox works with energy and utility clients and how a Fortune 500 utility stays ahead of emerging threats.