SC-112: AI Model Intellectual Property Theft¶
Operation NEURAL HEIST¶
Classification: TABLETOP EXERCISE -- 100% Synthetic
All organizations, IP addresses, domains, model names, and threat actors in this scenario are entirely fictional. Created for educational tabletop exercises only.
Scenario Metadata¶
| Field | Value |
|---|---|
| Difficulty | ★★★★★ (Expert) |
| Duration | 3-4 hours |
| Participants | 6-8 (SOC, IR, Data Science/MLOps, Legal, HR, CISO) |
| ATT&CK Techniques | T1048 · T1005 · T1213 · T1078 · T1567 · T1029 |
| Threat Actor | Insider -- Senior ML Engineer (COBALT MIRROR) |
| Industry | Technology / AI |
| Primary Impact | $45M proprietary AI model stolen, competitive advantage destroyed |
Insider Threat Profile: COBALT MIRROR¶
| Attribute | Detail |
|---|---|
| Role | Senior ML Engineer, 4-year tenure, top performer |
| Motivation | Financial -- accepted offer from competitor contingent on bringing model architecture and weights |
| Access Level | Legitimate access to model training pipelines, GPU clusters, model registry, and training datasets |
| Sophistication | Very high -- deep knowledge of internal security controls, monitoring blind spots, and data serialization formats |
| Behavioral Indicators | Increased after-hours access, bulk data downloads rationalized as "retraining experiments," resignation submitted 3 days after final exfiltration |
| Cover Story | All exfiltration activities disguised as routine model experimentation and hyperparameter tuning |
Executive Summary¶
A senior ML engineer at Prometheus AI Labs (synthetic AI company, 600 employees) -- codenamed COBALT MIRROR -- systematically exfiltrates the company's flagship proprietary model "TitanLM-70B" (a 70-billion-parameter large language model valued at $45M in training compute costs). Over 34 days, the insider uses their legitimate access to the ML platform to extract model weights (140 GB), training configuration, fine-tuning datasets (2.1 TB), and proprietary RLHF reward model data. The exfiltration runs through three channels: (1) the internal inference API, queried at scale to replicate the model's behavior via distillation, (2) direct downloads from the model registry disguised as routine experimentation, and (3) training data copied to a personal cloud storage account via the corporate VPN. The theft is discovered 6 days after the insider's resignation, when a routine access review reveals anomalous API call patterns during the insider's final weeks.
Environment Setup¶
Target Organization: Prometheus AI Labs (synthetic)
| Asset | Detail |
|---|---|
| Industry | AI/ML research company, 600 employees, 180 ML engineers |
| Flagship Model | TitanLM-70B (70B parameter LLM, $45M training cost) |
| ML Platform | Internal MLflow + Kubernetes GPU cluster (10.50.0.0/16) |
| Model Registry | mlflow.prometheus.example.com (10.50.1.10) |
| Training Cluster | 256x NVIDIA A100 GPUs, gpu-cluster.prometheus.example.com (10.50.2.0/24) |
| Inference API | api.titanllm.example.com (10.50.3.20) |
| Data Lake | MinIO S3-compatible: datalake.prometheus.example.com (10.50.4.15) |
| VPN | WireGuard VPN for remote access (10.50.100.0/24) |
| DLP | Symantec DLP (email and web gateway) |
| SIEM | Splunk Enterprise |
| EDR | SentinelOne |
Phase 1: Reconnaissance & Planning (Days -14 to 0)¶
Insider Actions¶
COBALT MIRROR begins systematically mapping security controls and identifying exfiltration paths. As a four-year veteran with broad legitimate access, the insider already knows most of the infrastructure:
Internal Security Control Assessment (Insider's Notes -- Reconstructed)
Security Controls Assessment:
[✓] DLP monitors email attachments > 25MB -- AVOID EMAIL
[✓] Web proxy logs all uploads to cloud storage -- USE VPN SPLIT
[✓] Model registry access logged in MLflow -- NEEDS COVER STORY
[✓] GPU cluster usage tracked per user -- WILL USE EXISTING JOBS
[✓] API rate limiting: 1000 req/min per user -- SUFFICIENT FOR DISTILLATION
[✗] No monitoring on API output token volume per user
[✗] No DLP on VPN tunnel traffic
[✗] No behavioral analytics on model registry downloads
[✗] No watermarking on model weights
[✗] No USB port restrictions on ML workstations
The insider identifies three critical gaps:
- API output monitoring -- No tracking of cumulative tokens/data extracted via the inference API
- VPN DLP gap -- Corporate DLP does not inspect VPN tunnel traffic
- Model registry behavioral analytics -- Downloads are logged but not baselined or alerted on
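The first gap above -- no tracking of cumulative output extracted via the inference API -- is straightforward to close with a rolling-window counter per user. A minimal Python sketch; the event fields (`user`, `ts`, `tokens_out`) and the 500M-token/14-day budget are illustrative assumptions, not Prometheus AI Labs conventions:

```python
# Rolling-window monitor for cumulative API token extraction per user.
# Alerts when a user's output volume over the window exceeds a budget --
# the control whose absence enabled the distillation in Phase 2.
from collections import defaultdict
from datetime import timedelta

WINDOW = timedelta(days=14)
TOKEN_BUDGET = 500_000_000  # illustrative: >500M output tokens in 14 days


def find_bulk_extractors(api_events):
    """api_events: iterable of dicts {"user": str, "ts": datetime,
    "tokens_out": int}, sorted by ts. Returns the set of users over budget."""
    per_user = defaultdict(list)  # user -> [(ts, tokens_out), ...]
    alerts = set()
    for ev in api_events:
        window = per_user[ev["user"]]
        window.append((ev["ts"], ev["tokens_out"]))
        # Evict events older than the rolling window
        cutoff = ev["ts"] - WINDOW
        while window and window[0][0] < cutoff:
            window.pop(0)
        if sum(t for _, t in window) > TOKEN_BUDGET:
            alerts.add(ev["user"])
    return alerts
```

In this scenario, COBALT MIRROR extracted ~100M tokens per day; a budget of this kind would have fired within the first week of Phase 2.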
Discussion Injects¶
Technical
The insider identified 5 security gaps in existing controls. Which of these gaps would be most impactful to close? How do you balance ML engineer productivity (which requires large data transfers) with data loss prevention?
Phase 2: Model Distillation via API Abuse (Days 1-14)¶
Insider Actions¶
COBALT MIRROR uses the internal inference API to perform model distillation -- extracting the model's knowledge by querying it systematically and training a smaller replica:
API Distillation Script (Reconstructed)
"""
Model distillation via API -- extracts teacher model
knowledge into student model.
Disguised as 'benchmark evaluation' in experiment logs.
"""
import requests
import json
import time
from pathlib import Path
API_BASE = "https://api.titanllm.example.com/v1"
API_KEY = "REDACTED" # Legitimate user API key
OUTPUT_DIR = Path("/data/experiments/benchmark-eval-2026")
# Distillation dataset -- 2M diverse prompts
# covering TitanLM's training distribution
PROMPTS = load_prompts("distillation_prompts.jsonl")
session = requests.Session()
session.headers["Authorization"] = f"Bearer {API_KEY}"
for i, prompt in enumerate(PROMPTS):
response = session.post(
f"{API_BASE}/completions",
json={
"prompt": prompt,
"max_tokens": 2048,
"temperature": 0.7,
"logprobs": 5, # Extract probability distribution
"top_p": 1.0
}
)
result = response.json()
# Save full response including logprobs
# (critical for distillation -- captures model's
# internal probability distribution)
output_file = OUTPUT_DIR / f"batch_{i // 10000}" / f"{i}.json"
output_file.parent.mkdir(parents=True, exist_ok=True)
with open(output_file, "w") as f:
json.dump(result, f)
# Rate limit to stay under anomaly detection
# (~800 req/min, below 1000 limit)
if i % 100 == 0:
time.sleep(2)
Why Logprobs Matter
By requesting logprobs=5, the insider extracts not just the model's text output but its internal probability distribution over tokens. This is far more valuable for distillation than text alone -- it enables training a student model that closely mimics the teacher model's behavior, including its confidence patterns and uncertainty.
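The value of logprobs can be made concrete with a small sketch. This is not the insider's code; it is a pure-Python illustration of how returned log-probabilities feed a standard distillation objective -- KL divergence between the teacher's top-k token distribution and a student's:

```python
# Distillation loss sketch: the API's logprobs recover the teacher's
# next-token distribution, and training minimizes KL(teacher || student).
# Plain text output alone would only give the single sampled token.
import math


def kl_divergence(teacher_logprobs, student_logprobs):
    """KL(teacher || student) over the top-k tokens the API returned.
    Both arguments map token -> log-probability."""
    kl = 0.0
    for token, t_lp in teacher_logprobs.items():
        # Assign a tiny floor probability to tokens the student never emits
        s_lp = student_logprobs.get(token, math.log(1e-9))
        kl += math.exp(t_lp) * (t_lp - s_lp)
    return kl


# Teacher is confident in "Paris"; one student matches, one does not.
teacher = {"Paris": math.log(0.9), "Lyon": math.log(0.05), "Rome": math.log(0.05)}
good_student = {"Paris": math.log(0.85), "Lyon": math.log(0.08), "Rome": math.log(0.07)}
bad_student = {"Paris": math.log(0.2), "Lyon": math.log(0.4), "Rome": math.log(0.4)}

assert kl_divergence(teacher, good_student) < kl_divergence(teacher, bad_student)
```

A student trained against these soft targets inherits the teacher's confidence and uncertainty patterns, which is exactly what makes high-volume logprob requests a distillation signature.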
Evidence Artifacts¶
API Access Logs (api.titanllm.example.com)
2026-03-10T22:14:33Z POST /v1/completions
user=cobalt.mirror src=10.50.100.47
tokens_in=142 tokens_out=2048 logprobs=5
latency=1.2s status=200
experiment_tag=benchmark-eval-2026
2026-03-10T22:14:35Z POST /v1/completions
user=cobalt.mirror src=10.50.100.47
tokens_in=98 tokens_out=2048 logprobs=5
latency=0.9s status=200
experiment_tag=benchmark-eval-2026
[... pattern repeats ~800 times per minute ...]
Daily API Usage Summary
User: cobalt.mirror
Period: 2026-03-10 to 2026-03-24 (14 days)
Date Requests Tokens Out Logprob Requests
--------------------------------------------------------
Mar 10 47,200 96.7M 47,200 (100%)
Mar 11 52,100 106.8M 52,100 (100%)
Mar 12 48,800 100.0M 48,800 (100%)
Mar 13 55,300 113.4M 55,300 (100%)
Mar 14 51,700 106.0M 51,700 (100%)
[... continues for 14 days ...]
--------------------------------------------------------
Total 712,000 1.46B 712,000 (100%)
Comparison -- other ML engineers (avg):
Total 12,400 25.4M 3,100 (25%)
Detection Queries¶
// Detect anomalous API usage for model distillation
let UserBaseline =
    CustomLogs_CL
    | where Category == "api_access"
    | where TimeGenerated between (ago(90d) .. ago(14d))
    | summarize DailyRequests=count() by UserId, bin(TimeGenerated, 1d)
    | summarize AvgDailyRequests=avg(DailyRequests),
                StdDev=stdev(DailyRequests) by UserId
;
CustomLogs_CL
| where Category == "api_access"
| where TimeGenerated > ago(14d)
| summarize DailyRequests=count(),
            LogprobRequests=countif(Logprobs > 0),
            TotalTokensOut=sum(TokensOut)
  by UserId, bin(TimeGenerated, 1d)
| join kind=inner UserBaseline on UserId
| where DailyRequests > AvgDailyRequests + (3 * StdDev)
| where LogprobRequests == DailyRequests
| project TimeGenerated, UserId, DailyRequests, AvgDailyRequests,
          LogprobPct=100.0 * LogprobRequests / DailyRequests
index=mlplatform sourcetype=api_access earliest=-14d
| bin _time span=1d
| stats count as daily_requests
        sum(eval(if(logprobs>0,1,0))) as logprob_requests
        sum(tokens_out) as total_tokens
  by user_id _time
| eventstats avg(daily_requests) as avg_requests
             stdev(daily_requests) as stdev_requests by user_id
| where daily_requests > avg_requests + (3 * stdev_requests)
| eval logprob_pct = round(logprob_requests / daily_requests * 100, 1)
| where logprob_pct > 90
| table _time user_id daily_requests avg_requests logprob_pct total_tokens
Discussion Injects¶
Technical
The insider's API usage is 57x higher than the average ML engineer, with 100% logprob requests vs the typical 25%. Both are strong signals -- but each individually could be explained by legitimate benchmark work. How do you distinguish model distillation from legitimate research?
Decision
You discover that no API usage policies exist -- ML engineers have unlimited access to the inference API. Should you implement rate limits or usage quotas? How do you do this without impacting legitimate research velocity?
Phase 3: Model Weight Exfiltration (Days 15-25)¶
Insider Actions¶
While the API distillation captures the model's behavior, COBALT MIRROR also wants the raw model weights (140 GB) for direct use. The insider downloads weights from the MLflow model registry in chunks, disguised as routine model loading for fine-tuning experiments:
MLflow Model Registry Access Log
2026-03-25T02:14:22Z GET /api/2.0/mlflow/artifacts/get
user=cobalt.mirror
run_id=exp-titan-ft-2026-q1
artifact_path=models/titanllm-70b/weights/
file=model-00001-of-00037.safetensors
size=3.8GB status=200
2026-03-25T02:18:44Z GET /api/2.0/mlflow/artifacts/get
user=cobalt.mirror
run_id=exp-titan-ft-2026-q1
artifact_path=models/titanllm-70b/weights/
file=model-00002-of-00037.safetensors
size=3.8GB status=200
[... pattern repeats for all 37 weight shards ...]
The insider then stages the weights on their ML workstation and copies them to a personal cloud storage account over the corporate VPN, which is not inspected by DLP:
Network Flow -- VPN Exfiltration (Reconstructed)
2026-03-26T01:22:15Z 10.50.100.47 -> 198.51.100.55:443
Protocol: HTTPS (WireGuard tunnel)
Bytes: 4,200,000,000 (4.2 GB)
Duration: 3,600s
Destination: storage.personal-cloud.example.com
2026-03-27T01:18:33Z 10.50.100.47 -> 198.51.100.55:443
Bytes: 4,100,000,000 (4.1 GB)
Duration: 3,500s
[... 3-4 such transfers nightly for 10 nights, total ~140 GB ...]
DLP Blind Spot
The corporate DLP system inspects HTTP/HTTPS traffic through the web proxy but does not inspect traffic within the WireGuard VPN tunnel. The insider exploited this gap by uploading data to personal cloud storage from within the VPN session, making the traffic appear as normal encrypted VPN usage.
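Beyond volume thresholds, the nightly cadence itself is detectable (ATT&CK T1029, Scheduled Transfer). A hedged Python sketch -- flow-record field names and thresholds are illustrative assumptions -- that flags hosts making repeated large night-time transfers to a single external destination:

```python
# Flag (src, dst) pairs with large transfers recurring across many distinct
# nights -- the periodicity signature of scheduled exfiltration, which a
# per-day volume threshold can miss if each transfer stays under it.
from collections import defaultdict

NIGHT_HOURS = set(range(22, 24)) | set(range(0, 6))  # 10 PM - 6 AM
MIN_BYTES = 1_000_000_000  # 1 GB per transfer (illustrative)
MIN_NIGHTS = 5             # recurring on at least 5 distinct nights


def find_scheduled_exfil(flows):
    """flows: iterable of dicts {"src": str, "dst": str, "ts": datetime,
    "bytes": int}. Returns the set of (src, dst) pairs with a nightly pattern."""
    nights = defaultdict(set)  # (src, dst) -> {dates of large night transfers}
    for f in flows:
        if f["bytes"] >= MIN_BYTES and f["ts"].hour in NIGHT_HOURS:
            nights[(f["src"], f["dst"])].add(f["ts"].date())
    return {pair for pair, days in nights.items() if len(days) >= MIN_NIGHTS}
```

Applied to the reconstructed flows above, the insider's VPN IP and the personal cloud endpoint would pair up after the fifth night.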
Detection Queries¶
// Detect bulk model weight downloads from registry
CustomLogs_CL
| where Category == "mlflow_access"
| where OperationName == "GetArtifact"
| where ArtifactPath has "weights"
or ArtifactPath has "model"
| summarize DownloadCount=count(),
TotalBytes=sum(FileSize),
UniqueFiles=dcount(FileName)
by UserId, RunId, bin(TimeGenerated, 1d)
| where TotalBytes > 10737418240 // 10 GB
| project TimeGenerated, UserId, RunId,
DownloadCount,
TotalGB=round(TotalBytes / 1073741824.0, 1)
index=mlplatform sourcetype="mlflow:access" operation="GetArtifact"
| search artifact_path="*weights*" OR artifact_path="*model*"
| bin _time span=1d
| stats count as downloads sum(file_size) as total_bytes
        dc(file_name) as unique_files
  by user_id run_id _time
| eval total_gb = round(total_bytes / 1073741824, 1)
| where total_gb > 10
| table _time user_id run_id downloads total_gb unique_files
// Detect large outbound transfers via VPN
NetworkTraffic
| where TimeGenerated > ago(30d)
| where SrcAddr startswith "10.50.100."
| where DstPort == 443
| where DstAddr !startswith "10."
| summarize TotalBytes=sum(BytesSent),
SessionCount=count()
by SrcAddr, DstAddr, bin(TimeGenerated, 1d)
| where TotalBytes > 4294967296 // 4 GB per day
| project TimeGenerated, SrcAddr, DstAddr,
TotalGB=round(TotalBytes / 1073741824.0, 1),
SessionCount
index=network sourcetype=firewall src_ip="10.50.100.*" dest_port=443
| where NOT cidrmatch("10.0.0.0/8", dest_ip)
| bin _time span=1d
| stats sum(bytes_out) as total_bytes count as sessions
  by src_ip dest_ip _time
| eval total_gb = round(total_bytes / 1073741824, 1)
| where total_gb > 4
| table _time src_ip dest_ip total_gb sessions
Phase 4: Training Data Exfiltration (Days 26-34)¶
Insider Actions¶
The final phase targets the proprietary training dataset and RLHF reward model -- the most valuable components beyond the weights themselves:
Data Lake Access -- Training Data Download
2026-04-05T23:44:12Z GET
s3://prometheus-training-data/titanllm/
sft-dataset-v3/
user=cobalt.mirror
objects=142,847 total_size=1.4TB
2026-04-06T23:22:08Z GET
s3://prometheus-training-data/titanllm/
rlhf-reward-model/
user=cobalt.mirror
objects=37 total_size=28GB
2026-04-07T23:18:55Z GET
s3://prometheus-training-data/titanllm/
eval-benchmarks-internal/
user=cobalt.mirror
objects=8,442 total_size=12GB
Evidence Artifacts¶
Insider Activity Timeline (Complete)
Day -14: Insider maps security controls and gaps
Day -7: Insider accepts competitor offer (unknown to employer)
Day 1: API distillation begins (benchmark-eval-2026)
Day 14: API distillation complete (712K requests, 1.46B tokens)
Day 15: Model weight download from MLflow begins
Day 25: Model weights fully exfiltrated (140 GB via VPN)
Day 26: Training data download begins
Day 34: Training data exfiltration complete (2.1 TB total)
Day 37: Insider submits resignation
Day 40: Insider's last day (short 3-day notice)
Day 43: Routine access review detects anomalous API patterns
Day 44: IR investigation launched
Detection Queries¶
// Detect bulk training data downloads
StorageBlobLogs
| where AccountName == "prometheus-training-data"
| where OperationName == "GetObject"
| where CallerIdentity has "cobalt.mirror"
| summarize ObjectCount=count(),
TotalBytes=sum(ObjectSize),
UniqueObjects=dcount(ObjectKey)
by CallerIdentity, bin(TimeGenerated, 1d)
| where TotalBytes > 107374182400 // 100 GB
| project TimeGenerated, CallerIdentity, ObjectCount,
TotalTB=round(TotalBytes / 1099511627776.0, 2)
index=storage sourcetype="minio:access" bucket="prometheus-training-data"
operation="GetObject" user="cobalt.mirror"
| bin _time span=1d
| stats count as requests sum(object_size) as total_bytes
        dc(object_key) as unique_objects
  by user _time
| eval total_tb = round(total_bytes / 1099511627776, 2)
| where total_tb > 0.1
| table _time user requests total_tb unique_objects
Discussion Injects¶
Legal
The insider had legitimate access to all exfiltrated data as part of their job role. How does this affect the legal case? What is the difference between unauthorized access and authorized access with unauthorized purpose?
HR
The insider submitted their resignation 3 days after completing exfiltration. What pre-resignation behavioral indicators should HR and security monitor for? Should access be restricted during notice periods?
Phase 5: Discovery & Investigation (Day 43)¶
Discovery¶
During a routine quarterly access review, a security analyst notices COBALT MIRROR's API usage patterns from the prior month:
Access Review Finding
User: cobalt.mirror (DEPARTED -- last day: Day 40)
API Usage (last 90 days):
- Total requests: 724,800 (57x team average)
- Logprob requests: 712,000 (98.2% -- team avg: 25%)
- Total tokens extracted: 1.46 billion
- After-hours requests (10PM-6AM): 89%
Model Registry (last 90 days):
- Weight file downloads: 37 files, 140 GB total
- All downloads between 2-4 AM
Data Lake (last 90 days):
- Training data downloads: 2.1 TB
- All downloads between 11PM-5AM
VPN (last 90 days):
- Outbound data via VPN: 2.3 TB
- Destination: 198.51.100.55 (not corporate)
- 100% during after-hours sessions
IR Investigation Findings¶
| Finding | Detail |
|---|---|
| Model weights | All 37 weight shards downloaded and exfiltrated |
| Training data | 2.1 TB including SFT dataset, RLHF reward model, internal benchmarks |
| Distillation data | 712K API responses with full logprob distributions |
| Exfiltration channel | Personal cloud storage via VPN tunnel (DLP blind spot) |
| Total data exfiltrated | ~2.3 TB |
| Estimated IP value | $45M (training compute) + proprietary dataset value |
Indicators of Compromise¶
Network IOCs¶
| IOC | Type | Context |
|---|---|---|
| 198.51.100.55 | IPv4 | Personal cloud storage endpoint |
| storage.personal-cloud.example.com | Domain | Exfiltration destination |
| 10.50.100.47 | IPv4 | Insider's VPN-assigned IP |
Behavioral IOCs¶
| Indicator | Description |
|---|---|
| API usage 57x above team average sustained over 14 days | Model distillation activity |
| 100% logprob requests vs 25% team average | Extracting model probability distributions |
| 89% after-hours activity (10 PM - 6 AM) | Concealment -- avoiding peer observation |
| 140 GB model registry download in 10 days | Bulk weight exfiltration |
| 2.1 TB data lake download in 9 days | Training data exfiltration |
| 2.3 TB outbound via VPN to non-corporate IP | Data exfiltration via DLP blind spot |
| Resignation 3 days after final exfiltration | Post-theft departure pattern |
Insider Threat Indicators (Pre-Incident)¶
| Indicator | Timeline |
|---|---|
| Increased after-hours access | Day -14 onward |
| New experiment tags not linked to team projects | Day 1 (benchmark-eval-2026) |
| Model registry access outside normal workflow | Day 15 onward |
| Large VPN data transfers to unknown destination | Day 15 onward |
| Accessing data outside current project scope | Day 26 (RLHF reward model) |
| Resignation submitted | Day 37 |
Containment & Remediation¶
Immediate Actions (Hour 0-8)¶
- Preserve all logs -- Legal hold on API logs, MLflow logs, VPN logs, network flows
- Notify legal counsel -- Trade secret theft (e.g., Defend Trade Secrets Act in the US); Computer Fraud and Abuse Act (CFAA) applicability is uncertain where access was authorized
- Engage digital forensics -- Image insider's corporate workstation (if still available)
- Revoke all insider credentials -- API keys, ML platform access, data lake access, VPN certificates
- Contact law enforcement -- FBI (if US) or relevant authority for IP theft investigation
- Notify competitor -- Cease and desist if competitor is identified
Preventive Controls¶
- Model watermarking -- Embed statistical watermarks in model weights that survive fine-tuning and distillation
- API output monitoring -- Track cumulative token extraction and logprob requests per user with anomaly detection
- DLP on VPN traffic -- Inspect VPN tunnel traffic for sensitive data patterns (model file signatures, training data formats)
- Model registry access controls -- Require approval workflow for downloading full model weights; restrict to specific training pipelines
- Data lake segmentation -- Separate access controls for training data, evaluation data, and production models; no single role should access all three
- Behavioral analytics (UEBA) -- Baseline normal ML engineer behavior and alert on deviations in access patterns, data volumes, and working hours
- Departure risk monitoring -- Enhanced monitoring for employees who access job sites, update LinkedIn, or exhibit disengagement signals
- Non-compete and IP agreements -- Ensure IP-assignment and confidentiality agreements are in place and current; non-compete enforceability varies by jurisdiction
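To make the watermarking control concrete: one simple family of schemes derives secret weight positions and target signs from a key, nudges those weights at release time, and later tests for sign agreement. The sketch below is illustrative only -- all names are invented, and production schemes must additionally survive fine-tuning, pruning, and distillation, which this toy version does not address:

```python
# Toy sign-based weight watermark: a secret key seeds a PRNG that selects
# weight positions and target signs. Embedding forces those signs;
# verification measures agreement (~0.5 for unmarked weights, 1.0 for marked).
import random


def watermark_positions(key, n_weights, n_marks):
    rng = random.Random(key)
    positions = rng.sample(range(n_weights), n_marks)
    signs = [rng.choice((-1, 1)) for _ in positions]
    return positions, signs


def embed(weights, key, n_marks=256, eps=1e-3):
    marked = list(weights)
    for pos, sign in zip(*watermark_positions(key, len(marked), n_marks)):
        marked[pos] = sign * (abs(marked[pos]) + eps)  # force the secret sign
    return marked


def verify(weights, key, n_marks=256):
    positions, signs = watermark_positions(key, len(weights), n_marks)
    hits = sum((weights[p] > 0) == (s > 0) for p, s in zip(positions, signs))
    return hits / n_marks  # agreement rate
```

With 256 marks, a non-watermarked model agrees on roughly half the signs by chance, so an agreement rate near 1.0 in a suspect model is statistically damning evidence of provenance.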
Detection Improvements¶
// UEBA: Detect insider threat behavioral patterns
let MLEngineerBaseline =
    CustomLogs_CL
    | where Category == "api_access"
    | where TimeGenerated between (ago(90d) .. ago(30d))
    | summarize DailyRequests=count(),
                LogprobPct=100.0 * countif(Logprobs > 0) / count(),
                AfterHoursPct=100.0 * countif(
                    hourofday(TimeGenerated) >= 22
                    or hourofday(TimeGenerated) < 6) / count()
      by UserId, bin(TimeGenerated, 1d)
    | summarize AvgDailyRequests=avg(DailyRequests),
                AvgLogprobPct=avg(LogprobPct),
                AvgAfterHoursPct=avg(AfterHoursPct)
      by UserId
;
CustomLogs_CL
| where Category == "api_access"
| where TimeGenerated > ago(14d)
| summarize DailyRequests=count(),
            LogprobPct=100.0 * countif(Logprobs > 0) / count(),
            AfterHoursPct=100.0 * countif(
                hourofday(TimeGenerated) >= 22
                or hourofday(TimeGenerated) < 6) / count()
  by UserId, bin(TimeGenerated, 1d)
| join kind=inner MLEngineerBaseline on UserId
| where DailyRequests > AvgDailyRequests * 10
    or LogprobPct > AvgLogprobPct * 3
    or AfterHoursPct > 80
| project TimeGenerated, UserId, DailyRequests,
          AvgDailyRequests, LogprobPct, AfterHoursPct
index=mlplatform sourcetype=api_access earliest=-14d
| eval is_after_hours=if(date_hour>=22 OR date_hour<6, 1, 0)
| eval is_logprob=if(logprobs>0, 1, 0)
| bin _time span=1d
| stats count as daily_requests
        avg(is_logprob) as logprob_frac
        avg(is_after_hours) as after_hours_frac
  by user_id _time
| eventstats avg(daily_requests) as baseline_requests by user_id
| where daily_requests > baseline_requests * 10
    OR logprob_frac > 0.9
    OR after_hours_frac > 0.8
| eval logprob_pct=round(logprob_frac*100,1)
| eval after_hours_pct=round(after_hours_frac*100,1)
| table _time user_id daily_requests baseline_requests
        logprob_pct after_hours_pct
ATT&CK Mapping¶
| Phase | Technique | ID | Tactic |
|---|---|---|---|
| Initial Access | Valid Accounts: Domain Accounts | T1078.002 | Initial Access |
| Collection | Data from Local System | T1005 | Collection |
| Collection | Data from Information Repositories | T1213 | Collection |
| Exfiltration | Exfiltration Over Alternative Protocol | T1048 | Exfiltration |
| Exfiltration | Exfiltration Over Web Service | T1567 | Exfiltration |
| Exfiltration | Scheduled Transfer | T1029 | Exfiltration |
| Defense Evasion | Valid Accounts | T1078 | Defense Evasion |
Lessons Learned¶
- Insider threats exploit legitimate access, not vulnerabilities -- COBALT MIRROR exploited no software vulnerability and defeated no preventive control; every action used authorized credentials and approved access paths (the DLP gap was a blind spot, not a bypass). Detection must focus on behavioral anomalies, not just policy violations.
- AI model intellectual property requires purpose-built protection -- Traditional DLP designed for documents and databases does not protect against model distillation via API or weight file exfiltration via serialized tensor formats. AI-specific security controls (model watermarking, API output monitoring, weight access governance) are essential.
- DLP blind spots in VPN tunnels are exploitable -- The entire 2.3 TB exfiltration occurred through a VPN tunnel that DLP did not inspect. Network-level DLP must cover all egress paths, including encrypted tunnels.
- After-hours patterns are a strong signal when correlated -- Any single indicator (high API usage, after-hours access, large downloads) might be legitimate. The combination of all three simultaneously -- especially from a single user over a sustained period -- is a high-fidelity insider threat signal.
- Post-resignation is too late for detection -- The insider completed all exfiltration before submitting their resignation. Departure risk models that detect pre-resignation behavioral shifts (disengagement, after-hours access changes, unusual data access) are critical for early intervention.
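The correlation lesson can be operationalized as a composite score. A toy Python sketch -- the weights, saturation points, and thresholds are invented for illustration, not a production UEBA model:

```python
# Composite insider-risk score: each signal contributes 0-1, and a
# product-style bonus rewards co-occurrence, so the score is high only
# when the signals fire together -- mirroring the correlation lesson.
def insider_risk_score(usage_multiple, logprob_pct, after_hours_pct):
    """usage_multiple: daily requests as a multiple of the user's baseline;
    logprob_pct / after_hours_pct: percentages (0-100)."""
    s_usage = min(usage_multiple / 10.0, 1.0)   # 10x baseline saturates
    s_logprob = min(logprob_pct / 90.0, 1.0)    # near-all-logprob saturates
    s_hours = min(after_hours_pct / 80.0, 1.0)  # mostly nocturnal saturates
    base = (s_usage + s_logprob + s_hours) / 3.0
    bonus = s_usage * s_logprob * s_hours       # high only if all co-occur
    return round(0.6 * base + 0.4 * bonus, 2)


# COBALT MIRROR's observed values vs a heavy-but-benign benchmark user
assert insider_risk_score(57, 100, 89) == 1.0
assert insider_risk_score(8, 100, 10) < 0.6
```

The design choice to include a multiplicative bonus is the point: a benchmark-heavy researcher trips one or two signals, but only the full conjunction -- sustained high volume, all-logprob requests, and nocturnal hours -- drives the score to the top of the range.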
Cross-References¶
- Chapter 37: AI Security -- AI/ML security fundamentals and model protection
- Chapter 26: Insider Threats -- Insider threat detection and prevention
- Chapter 50: Adversarial AI & LLM Security -- Model extraction attacks and defenses