I built a phishing detection simulator called Threat Terminal as a research project. The idea was simple: show players simulated emails, have them decide phishing or legit, and log everything: decision confidence, time spent, whether they checked headers or URLs, the phishing technique used, and the difficulty level.
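To make the logging concrete, here is a minimal sketch of what one recorded decision might look like. The field names and value ranges are illustrative assumptions, not the project's actual schema:

```python
from dataclasses import dataclass, asdict

@dataclass
class DecisionEvent:
    """One logged player decision. Field names are hypothetical."""
    email_id: str
    verdict: str          # "phishing" or "legit"
    correct: bool
    confidence: int       # self-reported, e.g. on a 1-5 scale
    seconds_spent: float
    checked_headers: bool
    checked_urls: bool
    technique: str        # e.g. "credential_harvest" (assumed label)
    difficulty: str       # e.g. "easy" / "hard" (assumed label)

# A phishing email the player incorrectly judged legit:
event = DecisionEvent("em_042", "legit", False, 4, 31.5,
                      False, True, "credential_harvest", "hard")
print(asdict(event)["correct"])  # False
```

A flat record like this serializes cleanly (`asdict`) for whatever store the project actually uses.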
135 participants and 2,000+ decisions later, the data is telling a consistent story. Overall phishing bypass rate sits at 19%. But when the phishing email is written with clean, fluent prose (no typos, no broken grammar, no obvious red flags) that number climbs to about 24%. AI-quality writing removes the signals most people actually rely on.
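The bypass-rate metric is simple to state in code: of the phishing emails shown, what fraction were judged legit? A small sketch with made-up records (the real numbers come from the 2,000+ logged decisions, not this toy data):

```python
# Hypothetical decision records. Only phishing emails count toward bypass rate;
# "fluent" marks clean, typo-free prose.
decisions = [
    {"is_phish": True,  "verdict": "legit",    "fluent": True},   # missed
    {"is_phish": True,  "verdict": "phishing", "fluent": True},
    {"is_phish": True,  "verdict": "phishing", "fluent": True},
    {"is_phish": True,  "verdict": "phishing", "fluent": True},
    {"is_phish": True,  "verdict": "phishing", "fluent": False},
    {"is_phish": True,  "verdict": "phishing", "fluent": False},
    {"is_phish": True,  "verdict": "phishing", "fluent": False},
    {"is_phish": True,  "verdict": "phishing", "fluent": False},
    {"is_phish": False, "verdict": "legit",    "fluent": True},   # legit, ignored
]

def bypass_rate(rows):
    """Fraction of phishing emails that were judged legit."""
    phish = [r for r in rows if r["is_phish"]]
    missed = [r for r in phish if r["verdict"] == "legit"]
    return len(missed) / len(phish) if phish else 0.0

overall = bypass_rate(decisions)
fluent_only = bypass_rate([r for r in decisions if r["fluent"]])
print(overall, fluent_only)  # 0.125 0.25
```

Slicing the same function over any logged attribute (technique, difficulty, header checks) gives the per-condition rates the same way.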
The gap between security professionals and non-technical users is smaller than you'd expect. That's one of the more interesting findings so far.
V2 just went live. The research mode is unchanged: 30 emails, no timer, same methodology. But I added a competitive layer on top:
- 1v1 ranked PvP. Five emails, same set for both players, correct call plus speed wins.
- Seasonal ranked ladder. You start at the bottom and work your way up.
- Daily challenge. One email per day, global leaderboard.
- XP, levels, badges, inventory system.
- An AI handler named SIGINT who briefs you before rounds and reacts to your decisions.
PvP unlocks after completing the first quest, so every player who wants to compete still contributes data first.
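One way to read "correct call plus speed wins" is: most correct verdicts wins the round, with total time on correct answers as the tiebreak. That interpretation, and all names below, are my assumptions, not the game's actual formula:

```python
# Hypothetical 1v1 scoring: both players see the same five emails.
# More correct verdicts wins; less time spent on correct answers breaks ties.
def round_winner(p1, p2):
    """Each player is a list of (correct: bool, seconds: float), one per email."""
    def score(player):
        n_correct = sum(1 for ok, _ in player if ok)
        time_on_correct = sum(t for ok, t in player if ok)
        return (n_correct, -time_on_correct)  # tuple compare: accuracy, then speed
    s1, s2 = score(p1), score(p2)
    if s1 > s2:
        return "player1"
    if s2 > s1:
        return "player2"
    return "draw"

# Both players go 4/5; player 1 was faster on the emails they got right.
p1 = [(True, 12.0), (True, 9.5), (False, 20.0), (True, 7.0), (True, 11.0)]
p2 = [(True, 8.0), (True, 10.0), (True, 15.0), (False, 6.0), (True, 9.0)]
print(round_winner(p1, p2))  # player1
```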
Non-security players are the data points I'm missing most. If you know anyone outside the field who'd try it, send them over.
Link: https://research.scottaltiparmak.com
Repo: https://github.com/scottalt/ai-email-threat-research
Happy to talk about the research, the tech stack, or the findings so far.