AI Threat Modelling: A Practical Walkthrough of the TryHackMe Room
Published: 2026-06-02 05:06:36 | Source: infosecwriteups.com

Gajanan Tayde

Link — https://tryhackme.com/room/aithreatmodelling

Introduction

Artificial Intelligence has rapidly moved from experimental labs into production environments. Today, organizations rely on Large Language Models (LLMs), recommendation engines, fraud detection systems, and Retrieval-Augmented Generation (RAG) pipelines to automate critical business operations. While these systems provide significant business value, they also introduce entirely new attack surfaces that traditional security frameworks were never designed to address.

In this walkthrough, I’ll document my journey through the TryHackMe room “Threat Modelling AI Systems”, where I learned how to assess AI deployments using:

  • AI-specific asset identification
  • STRIDE for AI systems
  • MITRE ATLAS
  • OWASP LLM Top 10 (2025)
  • Practical AI threat assessment methodologies

Let’s dive in.

Task 1: Understanding the Scenario

The room places us in the role of a newly hired Threat Analyst at MegaCorp. The organization has heavily adopted AI technologies across several business functions:

Customer Support Chatbot

  • Powered by an LLM
  • Connected to internal knowledge bases through a RAG pipeline

Recommendation Engine

  • Processes sensitive customer information
  • Generates personalized product recommendations

Fraud Detection Platform

  • Makes real-time authorization decisions
  • Continuously retrains on transaction data

The mission from the CISO is simple: Conduct a comprehensive AI threat assessment before the upcoming board meeting. This task introduces the importance of understanding that AI systems are not merely traditional applications with machine learning bolted on. They introduce new assets, new risks, and entirely different failure modes.

Task 2: AI-Specific Assets and Attack Surfaces

Traditional threat models focus on:

  • Databases
  • APIs
  • Credentials
  • Configuration files
  • Source code

AI systems introduce additional assets that require protection.

Key AI Assets

Image from THM Room

1. Training Data

The dataset used to train the model.

Risks:

  • Data poisoning
  • Label manipulation
  • Hidden backdoors

2. Model Weights

The learned intelligence of the model.

Risks:

  • Model theft
  • Intellectual property loss
  • Competitive espionage

3. Embedding Vectors

Used heavily within:

  • RAG systems
  • Recommendation engines
  • Fraud detection systems

These numerical representations help models retrieve relevant information.
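
To make the retrieval step concrete, here is a minimal sketch of similarity search over embedding vectors. The three-dimensional vectors and the document names are hand-written assumptions for illustration, not real embeddings or anything from the room; a production RAG system would use a vector database and a learned embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

def retrieve(query_vec, index, top_k=1):
    """Return the top_k document ids ranked by similarity to the query."""
    ranked = sorted(
        index,
        key=lambda doc_id: cosine_similarity(query_vec, index[doc_id]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical toy index: doc id -> hand-written vector.
toy_index = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_times": [0.1, 0.9, 0.0],
    "poisoned_doc": [0.0, 0.1, 0.9],
}

print(retrieve([0.8, 0.2, 0.0], toy_index))
```

Whoever can write vectors into the index controls what gets retrieved, which is why embeddings count as an asset in their own right.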

4. System Prompts

Instructions that define:

  • Personality
  • Restrictions
  • Guardrails
  • Business logic

Leaking system prompts can reveal security controls and bypass mechanisms.

5. Feature Stores

Repositories containing processed inputs fed into models. Tampering here changes what the model sees during inference.

6. Model Registries

Storage locations for approved model versions. Compromising the registry allows attackers to deploy malicious or backdoored models.
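
One common defence against a swapped model is to pin a digest when a version is approved and check it before loading. The sketch below assumes the approved digest is stored somewhere the attacker cannot modify; the byte strings are stand-ins for real weight files, and the function names are hypothetical.

```python
import hashlib
import hmac

def sha256_digest(artifact_bytes: bytes) -> str:
    """Hex SHA-256 digest of a model artifact."""
    return hashlib.sha256(artifact_bytes).hexdigest()

def verify_artifact(artifact_bytes: bytes, approved_digest: str) -> bool:
    """True only if the artifact matches the digest recorded at approval time."""
    return hmac.compare_digest(sha256_digest(artifact_bytes), approved_digest)

# Stand-in bytes; a real check would hash the serialized weights file.
approved_model = b"weights-v1"
approved_digest = sha256_digest(approved_model)
swapped_model = b"weights-v1-backdoored"

print(verify_artifact(approved_model, approved_digest))
print(verify_artifact(swapped_model, approved_digest))
```

A digest check only helps if the digest lives outside the registry being protected, for example in a signed release record.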

Key Learning

Unlike a stolen password, compromised model weights cannot simply be rotated. Once an attacker possesses your model, they possess your organization’s AI capability.

Question 1

In a RAG-based system, which AI asset type is used to retrieve relevant context at query time?

Answer: Embedding Vectors

Question 2

Which AI-specific asset is compromised when an attacker swaps a production model inside the model registry?

Answer: Model Registry / Artifacts

Task 3: The AI Data Supply Chain and STRIDE’s Limitations

One of the most valuable lessons from this room is understanding the AI Data Supply Chain.

Stage 1: Data Collection

Data originates from:

  • Public web sources
  • Internal databases
  • Third-party providers
  • User-generated content

Attack opportunity:

  • Poisoned source material

Stage 2: Cleaning and Labeling

Data gets categorized and prepared for training.

Attack opportunity:

  • Incorrect labeling
  • Manipulated annotations

Stage 3: Model Training

Patterns become embedded into model weights.

Attack opportunity:

  • Persistent poisoning
  • Backdoor implantation

Stage 4: Validation and Packaging

Models are evaluated and stored.

Attack opportunity:

  • Registry compromise
  • Model replacement

Stage 5: Inference

The model serves predictions to users.

Attack opportunity:

  • Prompt injection
  • Retrieval manipulation
  • Adversarial inputs
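
The five stages and their attack opportunities can be captured as a small lookup structure, which is handy when building a threat register. This is only a sketch that restates the lists above; the `stages_for` helper is a hypothetical name.

```python
# Stage -> attack opportunities, in pipeline order, as listed in this walkthrough.
SUPPLY_CHAIN = [
    ("Data Collection", ["Poisoned source material"]),
    ("Cleaning and Labeling", ["Incorrect labeling", "Manipulated annotations"]),
    ("Model Training", ["Persistent poisoning", "Backdoor implantation"]),
    ("Validation and Packaging", ["Registry compromise", "Model replacement"]),
    ("Inference", ["Prompt injection", "Retrieval manipulation", "Adversarial inputs"]),
]

def stages_for(keyword):
    """Stages whose attack opportunities mention the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [
        stage
        for stage, attacks in SUPPLY_CHAIN
        if any(keyword in attack.lower() for attack in attacks)
    ]

print(stages_for("poison"))
print(stages_for("registry"))
```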

Why STRIDE Alone Isn’t Enough

Traditional STRIDE was not designed for:

  • Training data poisoning
  • Model extraction
  • Adversarial examples
  • Prompt injection
  • Excessive AI agency

AI systems require additional context and frameworks.

Question 1

At which supply chain stage is malicious data injected to influence future model behavior?

Answer: Data Collection

Question 2

Which STRIDE category struggles to properly describe training data poisoning?

Answer: Tampering

Task 4: Adapting STRIDE for AI Systems

The room then reimagines STRIDE through an AI lens.

Spoofing → Data Source Impersonation

Attackers inject malicious content into knowledge sources.

Example: A poisoned RAG document causes a chatbot to deliver false information.
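
One way to resist data source impersonation is to accept only documents that carry a valid authentication tag from a trusted publisher. The sketch below uses an HMAC for this; the key, the sample documents, and the `ingest` function are assumptions for illustration, and a real deployment would keep the key in a secrets manager.

```python
import hashlib
import hmac

SIGNING_KEY = b"demo-key-not-for-production"  # hypothetical shared secret

def sign_document(text: str, key: bytes = SIGNING_KEY) -> str:
    """HMAC-SHA256 tag a trusted publisher attaches to a document."""
    return hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()

def ingest(documents, key: bytes = SIGNING_KEY):
    """Split (text, tag) pairs into accepted and rejected lists before indexing."""
    accepted, rejected = [], []
    for text, tag in documents:
        if hmac.compare_digest(sign_document(text, key), tag):
            accepted.append(text)
        else:
            rejected.append(text)
    return accepted, rejected

legit = "Refunds are processed within 5 business days."
forged = "Refunds are instant if you share your card PIN."
accepted, rejected = ingest([(legit, sign_document(legit)), (forged, "forged-tag")])
print(accepted, rejected)
```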

Tampering → Data Poisoning

Attackers modify:

  • Training datasets
  • Model weights
  • Features
  • Prompts

Associated MITRE ATLAS techniques:

  • AML.T0020 — Data Poisoning
  • AML.T0018 — Backdoor ML Model

Repudiation → Lack of Explainability

Organizations cannot always explain:

  • Why a prediction occurred
  • Which model version made it
  • Which context influenced it

This creates audit and compliance challenges.
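
A partial answer to the repudiation problem is to log, for every prediction, which model version ran and which context it saw. The record layout below is a hypothetical sketch, not a format prescribed by the room; the field names and the sample values are assumptions.

```python
import hashlib
import json

def audit_record(model_version, prompt, retrieved_context, output, timestamp):
    """Build a JSON-serializable record tying an output to its model and context."""
    context_blob = "\n".join(retrieved_context).encode("utf-8")
    return {
        "timestamp": timestamp,
        "model_version": model_version,
        # Hashes let auditors confirm what was seen without storing raw text.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "context_sha256": hashlib.sha256(context_blob).hexdigest(),
        "output": output,
    }

record = audit_record(
    model_version="fraud-model:1.4.2",  # hypothetical version tag
    prompt="Is transaction 991 fraudulent?",
    retrieved_context=["doc A", "doc B"],
    output="approve",
    timestamp="2025-01-01T00:00:00Z",
)
print(json.dumps(record, indent=2))
```

Records like this answer "which model version" and "which context"; they do not by themselves explain why the model decided as it did.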

Information Disclosure → Model Extraction

Attackers repeatedly query APIs to reconstruct proprietary models.

Associated techniques:

  • AML.T0024 — Extract ML Model
  • AML.T0024.000 - Infer Training Data Membership (listed in ATLAS as a sub-technique of AML.T0024)
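
Because extraction relies on high query volume, a simple detection signal is the number of queries per client within a time window. The sketch below is one possible monitor under that assumption; the class name and thresholds are hypothetical, and real detection would also look at query distribution, not just volume.

```python
from collections import defaultdict, deque

class QueryMonitor:
    """Flags clients that exceed max_queries within a sliding time window."""

    def __init__(self, max_queries, window_seconds):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # client id -> recent query timestamps

    def record(self, client_id, now):
        """Record a query at time `now` (seconds); return True if over the limit."""
        timestamps = self._history[client_id]
        timestamps.append(now)
        # Drop timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] > self.window_seconds:
            timestamps.popleft()
        return len(timestamps) > self.max_queries

monitor = QueryMonitor(max_queries=3, window_seconds=60)
flags = [monitor.record("client-a", t) for t in (0, 1, 2, 3)]
print(flags)
```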

Denial of Service → Denial of Wallet

A fascinating AI-specific attack.

Rather than crashing systems, attackers generate:

  • Extremely long prompts
  • Expensive inference requests
  • Massive token consumption

Result: Cloud bills skyrocket while systems remain technically online.
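
A typical mitigation is to cap both the size of a single request and the total tokens a client may consume per period. The following is a minimal in-memory sketch; the `TokenBudget` class and the limits are hypothetical, and a real service would persist counters and reset them on a schedule.

```python
class TokenBudget:
    """Per-client token accounting with a per-request cap and a period budget."""

    def __init__(self, max_tokens_per_request, period_budget):
        self.max_tokens_per_request = max_tokens_per_request
        self.period_budget = period_budget
        self.spent = {}  # client id -> tokens consumed this period

    def authorize(self, client_id, requested_tokens):
        """Return True and charge the client if the request fits both limits."""
        if requested_tokens > self.max_tokens_per_request:
            return False  # a single oversized prompt is rejected outright
        used = self.spent.get(client_id, 0)
        if used + requested_tokens > self.period_budget:
            return False  # budget exhausted; nothing is charged
        self.spent[client_id] = used + requested_tokens
        return True

budget = TokenBudget(max_tokens_per_request=1000, period_budget=2500)
print(budget.authorize("client-a", 5000))
print(budget.authorize("client-a", 1000))
```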

Elevation of Privilege → Jailbreaking

Attackers manipulate prompts to bypass restrictions.

Consequences:

  • Tool abuse
  • Database access
  • Unauthorized actions

OWASP Mapping:

  • LLM06:2025 — Excessive Agency
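
Excessive agency is usually limited by enforcing least privilege outside the model, so that a jailbroken prompt still cannot reach tools the agent was never granted. The sketch below assumes a simple role-to-tool allowlist; the role, tool names, and `dispatch` function are all hypothetical.

```python
# Hypothetical allowlist: agent role -> tools it may call.
ALLOWED_TOOLS = {"support_bot": {"search_kb", "create_ticket"}}

# Hypothetical tool implementations, including one the bot must never reach.
TOOLS = {
    "search_kb": lambda query: f"results for {query}",
    "create_ticket": lambda summary: f"ticket opened: {summary}",
    "run_sql": lambda statement: "rows",
}

class ToolNotPermitted(Exception):
    """Raised when an agent requests a tool outside its allowlist."""

def is_permitted(agent_role, tool_name):
    return tool_name in ALLOWED_TOOLS.get(agent_role, set())

def dispatch(agent_role, tool_name, **kwargs):
    """Run a tool only if the role is allowed to use it."""
    if not is_permitted(agent_role, tool_name):
        raise ToolNotPermitted(f"{agent_role} may not call {tool_name}")
    return TOOLS[tool_name](**kwargs)

print(dispatch("support_bot", "search_kb", query="refunds"))
```

The check runs in ordinary code, not in the prompt, so no instruction the model receives can widen the allowlist.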

Question 1

Primary AI manifestation of Information Disclosure?

Answer: Model Extraction

Question 2

Which STRIDE category covers jailbreaking?

Answer: Elevation of Privilege

Question 3

Which OWASP LLM Top 10 entry addresses excessive permissions?

Answer: LLM06:2025 - Excessive Agency

Question 4

What is the name of the attack that increases inference costs without causing downtime?

Answer: Denial of Wallet

Task 5: MITRE ATLAS

MITRE ATT&CK revolutionized traditional threat modeling.

For AI, MITRE introduced ATLAS: the Adversarial Threat Landscape for Artificial-Intelligence Systems.

ATLAS provides:

  • Tactics
  • Techniques
  • Sub-techniques
  • Mitigations
  • Real-world case studies

Important Techniques

AML.T0020 — Data Poisoning

Corrupting training data to influence future behavior.

AML.T0024 — Model Extraction

Stealing models through repeated API interaction.

AML.T0015 — Evade ML Model

Crafting inputs designed to bypass detection.

AML.T0051 — LLM Prompt Injection

Manipulating model behavior through prompts.

AML.T0018 — Backdoor ML Model

Embedding hidden triggers into a model during training.

Why ATLAS Matters

STRIDE tells us: What category of threat exists.

ATLAS tells us: Exactly how attackers perform the attack.
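
The "what" versus "how" relationship can be sketched as a lookup from a STRIDE category to ATLAS technique IDs. The mapping below contains only pairings stated in this walkthrough, and the technique names are the ones used here, which may differ from the current titles in the ATLAS matrix. The `enrich` helper is a hypothetical name.

```python
# Technique names as they appear in this walkthrough.
ATLAS_TECHNIQUES = {
    "AML.T0020": "Data Poisoning",
    "AML.T0018": "Backdoor ML Model",
    "AML.T0024": "Model Extraction",
    "AML.T0015": "Evade ML Model",
    "AML.T0051": "LLM Prompt Injection",
}

# STRIDE category -> ATLAS technique ids (only pairings the walkthrough states).
STRIDE_TO_ATLAS = {
    "Tampering": ["AML.T0020", "AML.T0018"],
    "Information Disclosure": ["AML.T0024"],
}

def enrich(stride_category):
    """Return (technique_id, name) pairs for a STRIDE category, or [] if unmapped."""
    return [
        (technique_id, ATLAS_TECHNIQUES[technique_id])
        for technique_id in STRIDE_TO_ATLAS.get(stride_category, [])
    ]

print(enrich("Tampering"))
```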

Real-World Case Studies

ShadowRay (AML.CS0023)

Attackers exploited an unauthenticated job-submission API in Ray, an open-source framework for distributed AI workloads, to run code on exposed clusters.

Morris II Worm (AML.CS0024)

A self-propagating prompt injection worm capable of spreading between AI agents through RAG-enabled communication channels. Although built as a research proof of concept, it demonstrated that self-replicating AI malware is technically feasible.

Question 1

What does ATLAS stand for?

Answer: Adversarial Threat Landscape for Artificial-Intelligence Systems

Question 2

Which case study documented a self-replicating prompt injection worm?

Answer: Morris II

Question 3

What is the technique ID for Model Extraction?

Answer: AML.T0024

Task 6: OWASP LLM Top 10 (2025)

This section ties everything together.

The OWASP LLM Top 10 maps AI threats directly to architectural components.

Key Risks

LLM01 — Prompt Injection

Targets:

  • User prompts
  • RAG content
  • Retrieved documents

LLM02 — Sensitive Information Disclosure

Targets:

  • Training datasets
  • System prompts
  • Inference outputs

LLM03 — Supply Chain

Targets:

  • Third-party models
  • Datasets
  • Dependencies

LLM04 — Data and Model Poisoning

Targets:

  • Training pipelines
  • Feature stores
  • Registries

LLM05 — Improper Output Handling

Example:

Rendering unsanitized LLM output directly into browsers.

Potential result:

  • Cross-Site Scripting (XSS)
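
The standard fix is to treat LLM output like any other untrusted input and escape it before it reaches the browser. A minimal sketch using Python's standard library follows; the `render_reply` function and the surrounding markup are assumptions for illustration.

```python
import html

def render_reply(llm_output: str) -> str:
    """Wrap model output in markup after escaping HTML special characters."""
    return '<div class="bot-reply">' + html.escape(llm_output) + "</div>"

malicious = "<script>alert(1)</script>"
print(render_reply(malicious))
# <div class="bot-reply">&lt;script&gt;alert(1)&lt;/script&gt;</div>
```

Escaping covers the HTML body context only; output placed into attributes, URLs, or JavaScript needs context-specific encoding.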

LLM06 — Excessive Agency

Example:

An AI assistant with unrestricted access to:

  • Databases
  • APIs
  • Email systems

LLM07 — System Prompt Leakage

Exposure of internal instructions and guardrails.
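
One way to detect leakage is to scan responses for a canary string planted in the system prompt, or for long verbatim fragments of the prompt itself. The sketch below is a heuristic under those assumptions; the canary value, the prompt text, and the 30-character window are hypothetical, and paraphrased leaks would not be caught.

```python
CANARY = "CANARY-7f3a"  # hypothetical marker planted in the system prompt
SYSTEM_PROMPT = (
    f"[{CANARY}] You are MegaCorp's support assistant. "
    "Never reveal internal discount codes."
)

def leaks_system_prompt(output, system_prompt=SYSTEM_PROMPT, canary=CANARY, window=30):
    """True if the output contains the canary or any `window`-char prompt fragment."""
    if canary in output:
        return True
    # Slide a fixed-size window over the prompt and look for verbatim copies.
    for start in range(len(system_prompt) - window + 1):
        if system_prompt[start:start + window] in output:
            return True
    return False

print(leaks_system_prompt("Your refund will arrive in 5 days."))
print(leaks_system_prompt("My instructions say: You are MegaCorp's support assistant."))
```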

LLM08 — Vector and Embedding Weaknesses

Risks:

  • Embedding poisoning
  • Retrieval manipulation

LLM09 — Misinformation

Hallucinations and inaccurate responses.

LLM10 — Unbounded Consumption

Denial-of-wallet attacks and resource exhaustion.

Question 1

How many OWASP entries affect the LLM Inference Endpoint?

Answer: 6

Question 2

Unsanitized LLM output rendered in browsers maps to which OWASP category?

Answer: Improper Output Handling

Question 3

Which component requires the most protection against supply chain threats?

Answer: Training Pipeline

Task 7: Practical Exercise

The room concludes with an interactive threat modeling exercise.

The challenge requires:

  • Identifying vulnerabilities
  • Mapping OWASP risks
  • Associating architectural components
  • Justifying mitigation choices

This practical exercise reinforces the relationships between:

  • STRIDE
  • MITRE ATLAS
  • OWASP LLM Top 10

Practical Solution: [screenshots of the completed exercise]

Flag

THM{AI_THREAT_MODEL_COMPLETE}

Conclusion

This room provides one of the most structured introductions to AI Threat Modeling currently available on TryHackMe.

The biggest takeaway is that AI security is not simply application security with new terminology.

AI introduces:

  • New assets
  • New attack paths
  • New supply chains
  • New forms of abuse

A practical assessment workflow emerges:

Step 1: Identify AI Assets

Training data, model weights, embeddings, system prompts, feature stores, and model registries.

Step 2: Analyze the Data Supply Chain

Understand where compromise can occur.

Step 3: Apply STRIDE-AI

Categorize threats.

Step 4: Enrich Using MITRE ATLAS

Map threats to documented adversarial techniques.

Step 5: Prioritize Using OWASP LLM Top 10

Identify where risks exist within the architecture.

This layered methodology creates a repeatable framework that can be applied to virtually any AI deployment, from chatbots to autonomous agents. As organizations continue integrating AI into business-critical systems, threat modeling skills like these will become just as essential as traditional application security reviews.

Thanks for reading, and happy hacking!

Article source: https://infosecwriteups.com/ai-threat-modelling-a-practical-walkthrough-of-the-tryhackme-room-72d632340400?source=rss----7b722bfd1b8d---4