SC-112: AI Model Intellectual Property Theft¶
Operation NEURAL HEIST¶
Classification: TABLETOP EXERCISE -- 100% Synthetic
All organizations, IP addresses, domains, model names, and threat actors in this scenario are entirely fictional. Created for educational tabletop exercises only.
Scenario Metadata¶
| Field | Value |
|---|---|
| Difficulty | ★★★★★ (Expert) |
| Duration | 3-4 hours |
| Participants | 6-8 (SOC, IR, Data Science/MLOps, Legal, HR, CISO) |
| ATT&CK Techniques | T1048 · T1005 · T1213 · T1078 · T1567 · T1029 |
| Threat Actor | Insider -- Senior ML Engineer (COBALT MIRROR) |
| Industry | Technology / AI |
| Primary Impact | $45M proprietary AI model stolen, competitive advantage destroyed |
Insider Threat Profile: COBALT MIRROR¶
| Attribute | Detail |
|---|---|
| Role | Senior ML Engineer, 4-year tenure, top performer |
| Motivation | Financial -- accepted offer from competitor contingent on bringing model architecture and weights |
| Access Level | Legitimate access to model training pipelines, GPU clusters, model registry, and training datasets |
| Sophistication | Very high -- deep knowledge of internal security controls, monitoring blind spots, and data serialization formats |
| Behavioral Indicators | Increased after-hours access, bulk data downloads rationalized as "retraining experiments," resignation submitted 3 days after final exfiltration |
| Cover Story | All exfiltration activities disguised as routine model experimentation and hyperparameter tuning |
Executive Summary¶
A senior ML engineer at Prometheus AI Labs (synthetic AI company, 600 employees) -- codenamed COBALT MIRROR -- systematically exfiltrates the company's flagship proprietary model "TitanLM-70B" (a 70-billion-parameter large language model valued at $45M in training compute costs). Over 34 days, the insider uses their legitimate access to the ML platform to extract model weights (140 GB), training configuration, fine-tuning datasets (2.1 TB), and proprietary RLHF reward model data. The exfiltration runs through three channels: (1) the internal inference API, queried at scale to replicate the model's behavior via distillation, (2) direct downloads from the model registry disguised as routine experimentation, and (3) training data copied to a personal cloud storage account via the corporate VPN. The theft is discovered 6 days after the insider's resignation, when a routine access review reveals anomalous API call patterns during the insider's final weeks.
Environment Setup¶
Target Organization: Prometheus AI Labs (synthetic)
| Asset | Detail |
|---|---|
| Industry | AI/ML research company, 600 employees, 180 ML engineers |
| Flagship Model | TitanLM-70B (70B parameter LLM, $45M training cost) |
| ML Platform | Internal MLflow + Kubernetes GPU cluster (10.50.0.0/16) |
| Model Registry | mlflow.prometheus.example.com (10.50.1.10) |
| Training Cluster | 256x NVIDIA A100 GPUs, gpu-cluster.prometheus.example.com (10.50.2.0/24) |
| Inference API | api.titanllm.example.com (10.50.3.20) |
| Data Lake | MinIO S3-compatible: datalake.prometheus.example.com (10.50.4.15) |
| VPN | WireGuard VPN for remote access (10.50.100.0/24) |
| DLP | Symantec DLP (email and web gateway) |
| SIEM | Splunk Enterprise |
| EDR | SentinelOne |
Phase 1: Reconnaissance & Planning (Days -14 to 0)¶
Insider Actions¶
COBALT MIRROR begins systematically mapping security controls and identifying exfiltration paths. As a four-year veteran with broad legitimate access, the insider already knows most of the infrastructure:
Internal Security Control Assessment (Insider's Notes -- Reconstructed)
Security Controls Assessment:
[✓] DLP monitors email attachments > 25MB -- AVOID EMAIL
[✓] Web proxy logs all uploads to cloud storage -- USE VPN SPLIT
[✓] Model registry access logged in MLflow -- NEEDS COVER STORY
[✓] GPU cluster usage tracked per user -- WILL USE EXISTING JOBS
[✓] API rate limiting: 1000 req/min per user -- SUFFICIENT FOR DISTILLATION
[✗] No monitoring on API output token volume per user
[✗] No DLP on VPN tunnel traffic
[✗] No behavioral analytics on model registry downloads
[✗] No watermarking on model weights
[✗] No USB port restrictions on ML workstations
The insider identifies three critical gaps:
- API output monitoring -- No tracking of cumulative tokens/data extracted via the inference API
- VPN DLP gap -- Corporate DLP does not inspect VPN tunnel traffic
- Model registry behavioral analytics -- Downloads are logged but not baselined or alerted on
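The first gap above -- no tracking of cumulative output extracted via the inference API -- is straightforward to close with a rolling-window counter per user. A minimal Python sketch; the event fields (`user`, `ts`, `tokens_out`) and the 500M-token/14-day budget are illustrative assumptions, not Prometheus AI Labs conventions:

```python
# Rolling-window monitor for cumulative API token extraction per user.
# Alerts when a user's output volume over the window exceeds a budget --
# the control whose absence enabled the distillation in Phase 2.
from collections import defaultdict
from datetime import timedelta

WINDOW = timedelta(days=14)
TOKEN_BUDGET = 500_000_000  # illustrative: >500M output tokens in 14 days


def find_bulk_extractors(api_events):
    """api_events: iterable of dicts {"user": str, "ts": datetime,
    "tokens_out": int}, sorted by ts. Returns the set of users over budget."""
    per_user = defaultdict(list)  # user -> [(ts, tokens_out), ...]
    alerts = set()
    for ev in api_events:
        window = per_user[ev["user"]]
        window.append((ev["ts"], ev["tokens_out"]))
        # Evict events older than the rolling window
        cutoff = ev["ts"] - WINDOW
        while window and window[0][0] < cutoff:
            window.pop(0)
        if sum(t for _, t in window) > TOKEN_BUDGET:
            alerts.add(ev["user"])
    return alerts
```

In this scenario, COBALT MIRROR extracted ~100M tokens per day; a budget of this kind would have fired within the first week of Phase 2.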
Discussion Injects¶
Technical
The insider identified 5 security gaps in existing controls. Which of these gaps would be most impactful to close? How do you balance ML engineer productivity (which requires large data transfers) with data loss prevention?
Phase 2: Model Distillation via API Abuse (Days 1-14)¶
Insider Actions¶
COBALT MIRROR uses the internal inference API to perform model distillation -- extracting the model's knowledge by querying it systematically and training a smaller replica:
API Distillation Script (Reconstructed)
"""
Model distillation via API -- extracts teacher model
knowledge into student model.
Disguised as 'benchmark evaluation' in experiment logs.
"""
import requests
import json
import time
from pathlib import Path
API_BASE = "https://api.titanllm.example.com/v1"
API_KEY = "REDACTED" # Legitimate user API key
OUTPUT_DIR = Path("/data/experiments/benchmark-eval-2026")
# Distillation dataset -- 2M diverse prompts
# covering TitanLM's training distribution
PROMPTS = load_prompts("distillation_prompts.jsonl")
session = requests.Session()
session.headers["Authorization"] = f"Bearer {API_KEY}"
for i, prompt in enumerate(PROMPTS):
response = session.post(
f"{API_BASE}/completions",
json={
"prompt": prompt,
"max_tokens": 2048,
"temperature": 0.7,
"logprobs": 5, # Extract probability distribution
"top_p": 1.0
}
)
result = response.json()
# Save full response including logprobs
# (critical for distillation -- captures model's
# internal probability distribution)
output_file = OUTPUT_DIR / f"batch_{i // 10000}" / f"{i}.json"
output_file.parent.mkdir(parents=True, exist_ok=True)
with open(output_file, "w") as f:
json.dump(result, f)
# Rate limit to stay under anomaly detection
# (~800 req/min, below 1000 limit)
if i % 100 == 0:
time.sleep(2)
Why Logprobs Matter
By requesting logprobs=5, the insider extracts not just the model's text output but its internal probability distribution over tokens. This is far more valuable for distillation than text alone -- it enables training a student model that closely mimics the teacher model's behavior, including its confidence patterns and uncertainty.
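The value of logprobs can be made concrete with a small sketch. This is not the insider's code; it is a pure-Python illustration of how returned log-probabilities feed a standard distillation objective -- KL divergence between the teacher's top-k token distribution and a student's:

```python
# Distillation loss sketch: the API's logprobs recover the teacher's
# next-token distribution, and training minimizes KL(teacher || student).
# Plain text output alone would only give the single sampled token.
import math


def kl_divergence(teacher_logprobs, student_logprobs):
    """KL(teacher || student) over the top-k tokens the API returned.
    Both arguments map token -> log-probability."""
    kl = 0.0
    for token, t_lp in teacher_logprobs.items():
        # Assign a tiny floor probability to tokens the student never emits
        s_lp = student_logprobs.get(token, math.log(1e-9))
        kl += math.exp(t_lp) * (t_lp - s_lp)
    return kl


# Teacher is confident in "Paris"; one student matches, one does not.
teacher = {"Paris": math.log(0.9), "Lyon": math.log(0.05), "Rome": math.log(0.05)}
good_student = {"Paris": math.log(0.85), "Lyon": math.log(0.08), "Rome": math.log(0.07)}
bad_student = {"Paris": math.log(0.2), "Lyon": math.log(0.4), "Rome": math.log(0.4)}

assert kl_divergence(teacher, good_student) < kl_divergence(teacher, bad_student)
```

A student trained against these soft targets inherits the teacher's confidence and uncertainty patterns, which is exactly what makes high-volume logprob requests a distillation signature.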
Evidence Artifacts¶
API Access Logs (api.titanllm.example.com)
2026-03-10T22:14:33Z POST /v1/completions
user=cobalt.mirror src=10.50.100.47
tokens_in=142 tokens_out=2048 logprobs=5
latency=1.2s status=200
experiment_tag=benchmark-eval-2026
2026-03-10T22:14:35Z POST /v1/completions
user=cobalt.mirror src=10.50.100.47
tokens_in=98 tokens_out=2048 logprobs=5
latency=0.9s status=200
experiment_tag=benchmark-eval-2026
[... pattern repeats ~800 times per minute ...]
Daily API Usage Summary
User: cobalt.mirror
Period: 2026-03-10 to 2026-03-24 (14 days)
Date Requests Tokens Out Logprob Requests
--------------------------------------------------------
Mar 10 47,200 96.7M 47,200 (100%)
Mar 11 52,100 106.8M 52,100 (100%)
Mar 12 48,800 100.0M 48,800 (100%)
Mar 13 55,300 113.4M 55,300 (100%)
Mar 14 51,700 106.0M 51,700 (100%)
[... continues for 14 days ...]
--------------------------------------------------------
Total 712,000 1.46B 712,000 (100%)
Comparison -- other ML engineers (avg):
Total 12,400 25.4M 3,100 (25%)
Detection Queries¶
// Detect anomalous API usage for model distillation
let UserBaseline =
    CustomLogs_CL
    | where Category == "api_access"
    | where TimeGenerated between (ago(90d) .. ago(14d))
    | summarize DailyRequests=count() by UserId, bin(TimeGenerated, 1d)
    | summarize AvgDailyRequests=avg(DailyRequests),
                StdDev=stdev(DailyRequests) by UserId
;
CustomLogs_CL
| where Category == "api_access"
| where TimeGenerated > ago(14d)
| summarize DailyRequests=count(),
            LogprobRequests=countif(Logprobs > 0),
            TotalTokensOut=sum(TokensOut)
  by UserId, bin(TimeGenerated, 1d)
| join kind=inner UserBaseline on UserId
| where DailyRequests > AvgDailyRequests + (3 * StdDev)
| where LogprobRequests == DailyRequests
| project TimeGenerated, UserId, DailyRequests, AvgDailyRequests,
          LogprobPct=100.0 * LogprobRequests / DailyRequests
index=mlplatform sourcetype=api_access earliest=-14d
| bin _time span=1d
| stats count as daily_requests
        sum(eval(if(logprobs>0,1,0))) as logprob_requests
        sum(tokens_out) as total_tokens
  by user_id _time
| eventstats avg(daily_requests) as avg_requests
             stdev(daily_requests) as stdev_requests by user_id
| where daily_requests > avg_requests + (3 * stdev_requests)
| eval logprob_pct = round(logprob_requests / daily_requests * 100, 1)
| where logprob_pct > 90
| table _time user_id daily_requests avg_requests logprob_pct total_tokens
Discussion Injects¶
Technical
The insider's API usage is 57x higher than the average ML engineer, with 100% logprob requests vs the typical 25%. Both are strong signals -- but each individually could be explained by legitimate benchmark work. How do you distinguish model distillation from legitimate research?
Decision
You discover that no API usage policies exist -- ML engineers have unlimited access to the inference API. Should you implement rate limits or usage quotas? How do you do this without impacting legitimate research velocity?
Phase 3: Model Weight Exfiltration (Days 15-25)¶
Insider Actions¶
While the API distillation captures the model's behavior, COBALT MIRROR also wants the raw model weights (140 GB) for direct use. The insider downloads weights from the MLflow model registry in chunks, disguised as routine model loading for fine-tuning experiments:
MLflow Model Registry Access Log
2026-03-25T02:14:22Z GET /api/2.0/mlflow/artifacts/get
user=cobalt.mirror
run_id=exp-titan-ft-2026-q1
artifact_path=models/titanllm-70b/weights/
file=model-00001-of-00037.safetensors
size=3.8GB status=200
2026-03-25T02:18:44Z GET /api/2.0/mlflow/artifacts/get
user=cobalt.mirror
run_id=exp-titan-ft-2026-q1
artifact_path=models/titanllm-70b/weights/
file=model-00002-of-00037.safetensors
size=3.8GB status=200
[... pattern repeats for all 37 weight shards ...]
The insider then stages the weights on their ML workstation and copies them to a personal cloud storage account over the corporate VPN, which is not inspected by DLP:
Network Flow -- VPN Exfiltration (Reconstructed)
2026-03-26T01:22:15Z 10.50.100.47 -> 198.51.100.55:443
Protocol: HTTPS (WireGuard tunnel)
Bytes: 4,200,000,000 (4.2 GB)
Duration: 3,600s
Destination: storage.personal-cloud.example.com
2026-03-27T01:18:33Z 10.50.100.47 -> 198.51.100.55:443
Bytes: 4,100,000,000 (4.1 GB)
Duration: 3,500s
[... 3-4 such transfers nightly for 10 nights, total ~140 GB ...]
DLP Blind Spot
The corporate DLP system inspects HTTP/HTTPS traffic through the web proxy but does not inspect traffic within the WireGuard VPN tunnel. The insider exploited this gap by uploading data to personal cloud storage from within the VPN session, making the traffic appear as normal encrypted VPN usage.
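Beyond volume thresholds, the nightly cadence itself is detectable (ATT&CK T1029, Scheduled Transfer). A hedged Python sketch -- flow-record field names and thresholds are illustrative assumptions -- that flags hosts making repeated large night-time transfers to a single external destination:

```python
# Flag (src, dst) pairs with large transfers recurring across many distinct
# nights -- the periodicity signature of scheduled exfiltration, which a
# per-day volume threshold can miss if each transfer stays under it.
from collections import defaultdict

NIGHT_HOURS = set(range(22, 24)) | set(range(0, 6))  # 10 PM - 6 AM
MIN_BYTES = 1_000_000_000  # 1 GB per transfer (illustrative)
MIN_NIGHTS = 5             # recurring on at least 5 distinct nights


def find_scheduled_exfil(flows):
    """flows: iterable of dicts {"src": str, "dst": str, "ts": datetime,
    "bytes": int}. Returns the set of (src, dst) pairs with a nightly pattern."""
    nights = defaultdict(set)  # (src, dst) -> {dates of large night transfers}
    for f in flows:
        if f["bytes"] >= MIN_BYTES and f["ts"].hour in NIGHT_HOURS:
            nights[(f["src"], f["dst"])].add(f["ts"].date())
    return {pair for pair, days in nights.items() if len(days) >= MIN_NIGHTS}
```

Applied to the reconstructed flows above, the insider's VPN IP and the personal cloud endpoint would pair up after the fifth night.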
Detection Queries¶
// Detect bulk model weight downloads from registry
CustomLogs_CL
| where Category == "mlflow_access"
| where OperationName == "GetArtifact"
| where ArtifactPath has "weights"
or ArtifactPath has "model"
| summarize DownloadCount=count(),
TotalBytes=sum(FileSize),
UniqueFiles=dcount(FileName)
by UserId, RunId, bin(TimeGenerated, 1d)
| where TotalBytes > 10737418240 // 10 GB
| project TimeGenerated, UserId, RunId,
DownloadCount,
TotalGB=round(TotalBytes / 1073741824.0, 1)
index=mlplatform sourcetype="mlflow:access" operation="GetArtifact"
| search artifact_path="*weights*" OR artifact_path="*model*"
| bin _time span=1d
| stats count as downloads sum(file_size) as total_bytes
        dc(file_name) as unique_files
  by user_id run_id _time
| eval total_gb = round(total_bytes / 1073741824, 1)
| where total_gb > 10
| table _time user_id run_id downloads total_gb unique_files
// Detect large outbound transfers via VPN
NetworkTraffic
| where TimeGenerated > ago(30d)
| where SrcAddr startswith "10.50.100."
| where DstPort == 443
| where DstAddr !startswith "10."
| summarize TotalBytes=sum(BytesSent),
SessionCount=count()
by SrcAddr, DstAddr, bin(TimeGenerated, 1d)
| where TotalBytes > 4294967296 // 4 GB per day
| project TimeGenerated, SrcAddr, DstAddr,
TotalGB=round(TotalBytes / 1073741824.0, 1),
SessionCount
index=network sourcetype=firewall src_ip="10.50.100.*" dest_port=443
| where NOT cidrmatch("10.0.0.0/8", dest_ip)
| bin _time span=1d
| stats sum(bytes_out) as total_bytes count as sessions
  by src_ip dest_ip _time
| eval total_gb = round(total_bytes / 1073741824, 1)
| where total_gb > 4
| table _time src_ip dest_ip total_gb sessions
Phase 4: Training Data Exfiltration (Days 26-34)¶
Insider Actions¶
The final phase targets the proprietary training dataset and RLHF reward model -- the most valuable components beyond the weights themselves:
Data Lake Access -- Training Data Download
2026-04-05T23:44:12Z GET
s3://prometheus-training-data/titanllm/
sft-dataset-v3/
user=cobalt.mirror
objects=142,847 total_size=1.4TB
2026-04-06T23:22:08Z GET
s3://prometheus-training-data/titanllm/
rlhf-reward-model/
user=cobalt.mirror
objects=37 total_size=28GB
2026-04-07T23:18:55Z GET
s3://prometheus-training-data/titanllm/
eval-benchmarks-internal/
user=cobalt.mirror
objects=8,442 total_size=12GB
Evidence Artifacts¶
Insider Activity Timeline (Complete)
Day -14: Insider maps security controls and gaps
Day -7: Insider accepts competitor offer (unknown to employer)
Day 1: API distillation begins (benchmark-eval-2026)
Day 14: API distillation complete (712K requests, 1.46B tokens)
Day 15: Model weight download from MLflow begins
Day 25: Model weights fully exfiltrated (140 GB via VPN)
Day 26: Training data download begins
Day 34: Training data exfiltration complete (2.1 TB total)
Day 37: Insider submits resignation
Day 40: Insider's last day (short 3-day notice)
Day 43: Routine access review detects anomalous API patterns
Day 44: IR investigation launched
Detection Queries¶
// Detect bulk training data downloads
StorageBlobLogs
| where AccountName == "prometheus-training-data"
| where OperationName == "GetObject"
| where CallerIdentity has "cobalt.mirror"
| summarize ObjectCount=count(),
TotalBytes=sum(ObjectSize),
UniqueObjects=dcount(ObjectKey)
by CallerIdentity, bin(TimeGenerated, 1d)
| where TotalBytes > 107374182400 // 100 GB
| project TimeGenerated, CallerIdentity, ObjectCount,
TotalTB=round(TotalBytes / 1099511627776.0, 2)
index=storage sourcetype="minio:access" bucket="prometheus-training-data"
operation="GetObject" user="cobalt.mirror"
| bin _time span=1d
| stats count as requests sum(object_size) as total_bytes
        dc(object_key) as unique_objects
  by user _time
| eval total_tb = round(total_bytes / 1099511627776, 2)
| where total_tb > 0.1
| table _time user requests total_tb unique_objects
Discussion Injects¶
Legal
The insider had legitimate access to all exfiltrated data as part of their job role. How does this affect the legal case? What is the difference between unauthorized access and authorized access with unauthorized purpose?
HR
The insider submitted their resignation 3 days after completing exfiltration. What pre-resignation behavioral indicators should HR and security monitor for? Should access be restricted during notice periods?
Phase 5: Discovery & Investigation (Day 43)¶
Discovery¶
During a routine quarterly access review, a security analyst notices COBALT MIRROR's API usage patterns from the prior month:
Access Review Finding
User: cobalt.mirror (DEPARTED -- last day: Day 40)
API Usage (last 90 days):
- Total requests: 724,800 (57x team average)
- Logprob requests: 712,000 (98.2% -- team avg: 25%)
- Total tokens extracted: 1.46 billion
- After-hours requests (10PM-6AM): 89%
Model Registry (last 90 days):
- Weight file downloads: 37 files, 140 GB total
- All downloads between 2-4 AM
Data Lake (last 90 days):
- Training data downloads: 2.1 TB
- All downloads between 11PM-5AM
VPN (last 90 days):
- Outbound data via VPN: 2.3 TB
- Destination: 198.51.100.55 (not corporate)
- 100% during after-hours sessions
IR Investigation Findings¶
| Finding | Detail |
|---|---|
| Model weights | All 37 weight shards downloaded and exfiltrated |
| Training data | 2.1 TB including SFT dataset, RLHF reward model, internal benchmarks |
| Distillation data | 712K API responses with full logprob distributions |
| Exfiltration channel | Personal cloud storage via VPN tunnel (DLP blind spot) |
| Total data exfiltrated | ~2.3 TB |
| Estimated IP value | $45M (training compute) + proprietary dataset value |
Indicators of Compromise¶
Network IOCs¶
| IOC | Type | Context |
|---|---|---|
| 198.51.100.55 | IPv4 | Personal cloud storage endpoint |
| storage.personal-cloud.example.com | Domain | Exfiltration destination |
| 10.50.100.47 | IPv4 | Insider's VPN-assigned IP |
Behavioral IOCs¶
| Indicator | Description |
|---|---|
| API usage 57x above team average sustained over 14 days | Model distillation activity |
| 100% logprob requests vs 25% team average | Extracting model probability distributions |
| 89% after-hours activity (10 PM - 6 AM) | Concealment -- avoiding peer observation |
| 140 GB model registry download in 10 days | Bulk weight exfiltration |
| 2.1 TB data lake download in 9 days | Training data exfiltration |
| 2.3 TB outbound via VPN to non-corporate IP | Data exfiltration via DLP blind spot |
| Resignation 3 days after final exfiltration | Post-theft departure pattern |
Insider Threat Indicators (Pre-Incident)¶
| Indicator | Timeline |
|---|---|
| Increased after-hours access | Day -14 onward |
| New experiment tags not linked to team projects | Day 1 (benchmark-eval-2026) |
| Model registry access outside normal workflow | Day 15 onward |
| Large VPN data transfers to unknown destination | Day 15 onward |
| Accessing data outside current project scope | Day 26 (RLHF reward model) |
| Resignation submitted | Day 37 |
Containment & Remediation¶
Immediate Actions (Hour 0-8)¶
- Preserve all logs -- Legal hold on API logs, MLflow logs, VPN logs, network flows
- Notify legal counsel -- Trade secret theft (e.g., Defend Trade Secrets Act in the US); Computer Fraud and Abuse Act (CFAA) applicability is uncertain where access was authorized
- Engage digital forensics -- Image insider's corporate workstation (if still available)
- Revoke all insider credentials -- API keys, ML platform access, data lake access, VPN certificates
- Contact law enforcement -- FBI (if US) or relevant authority for IP theft investigation
- Notify competitor -- Cease and desist if competitor is identified
Preventive Controls¶
- Model watermarking -- Embed statistical watermarks in model weights that survive fine-tuning and distillation
- API output monitoring -- Track cumulative token extraction and logprob requests per user with anomaly detection
- DLP on VPN traffic -- Inspect VPN tunnel traffic for sensitive data patterns (model file signatures, training data formats)
- Model registry access controls -- Require approval workflow for downloading full model weights; restrict to specific training pipelines
- Data lake segmentation -- Separate access controls for training data, evaluation data, and production models; no single role should access all three
- Behavioral analytics (UEBA) -- Baseline normal ML engineer behavior and alert on deviations in access patterns, data volumes, and working hours
- Departure risk monitoring -- Enhanced monitoring for employees who access job sites, update LinkedIn, or exhibit disengagement signals
- Non-compete and IP agreements -- Ensure IP-assignment and confidentiality agreements are in place and current; non-compete enforceability varies by jurisdiction
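To make the watermarking control concrete: one simple family of schemes derives secret weight positions and target signs from a key, nudges those weights at release time, and later tests for sign agreement. The sketch below is illustrative only -- all names are invented, and production schemes must additionally survive fine-tuning, pruning, and distillation, which this toy version does not address:

```python
# Toy sign-based weight watermark: a secret key seeds a PRNG that selects
# weight positions and target signs. Embedding forces those signs;
# verification measures agreement (~0.5 for unmarked weights, 1.0 for marked).
import random


def watermark_positions(key, n_weights, n_marks):
    rng = random.Random(key)
    positions = rng.sample(range(n_weights), n_marks)
    signs = [rng.choice((-1, 1)) for _ in positions]
    return positions, signs


def embed(weights, key, n_marks=256, eps=1e-3):
    marked = list(weights)
    for pos, sign in zip(*watermark_positions(key, len(marked), n_marks)):
        marked[pos] = sign * (abs(marked[pos]) + eps)  # force the secret sign
    return marked


def verify(weights, key, n_marks=256):
    positions, signs = watermark_positions(key, len(weights), n_marks)
    hits = sum((weights[p] > 0) == (s > 0) for p, s in zip(positions, signs))
    return hits / n_marks  # agreement rate
```

With 256 marks, a non-watermarked model agrees on roughly half the signs by chance, so an agreement rate near 1.0 in a suspect model is statistically damning evidence of provenance.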
Detection Improvements¶
// UEBA: Detect insider threat behavioral patterns
let MLEngineerBaseline =
    CustomLogs_CL
    | where Category == "api_access"
    | where TimeGenerated between (ago(90d) .. ago(30d))
    | summarize DailyRequests=count(),
                LogprobPct=100.0 * countif(Logprobs > 0) / count(),
                AfterHoursPct=100.0 * countif(
                    hourofday(TimeGenerated) >= 22
                    or hourofday(TimeGenerated) < 6) / count()
      by UserId, bin(TimeGenerated, 1d)
    | summarize AvgDailyRequests=avg(DailyRequests),
                AvgLogprobPct=avg(LogprobPct),
                AvgAfterHoursPct=avg(AfterHoursPct)
      by UserId
;
CustomLogs_CL
| where Category == "api_access"
| where TimeGenerated > ago(14d)
| summarize DailyRequests=count(),
            LogprobPct=100.0 * countif(Logprobs > 0) / count(),
            AfterHoursPct=100.0 * countif(
                hourofday(TimeGenerated) >= 22
                or hourofday(TimeGenerated) < 6) / count()
  by UserId, bin(TimeGenerated, 1d)
| join kind=inner MLEngineerBaseline on UserId
| where DailyRequests > AvgDailyRequests * 10
    or LogprobPct > AvgLogprobPct * 3
    or AfterHoursPct > 80
| project TimeGenerated, UserId, DailyRequests,
          AvgDailyRequests, LogprobPct, AfterHoursPct
index=mlplatform sourcetype=api_access earliest=-14d
| eval is_after_hours=if(date_hour>=22 OR date_hour<6, 1, 0)
| eval is_logprob=if(logprobs>0, 1, 0)
| bin _time span=1d
| stats count as daily_requests
        avg(is_logprob) as logprob_frac
        avg(is_after_hours) as after_hours_frac
  by user_id _time
| eventstats avg(daily_requests) as baseline_requests by user_id
| where daily_requests > baseline_requests * 10
    OR logprob_frac > 0.9
    OR after_hours_frac > 0.8
| eval logprob_pct=round(logprob_frac*100,1)
| eval after_hours_pct=round(after_hours_frac*100,1)
| table _time user_id daily_requests baseline_requests
        logprob_pct after_hours_pct
ATT&CK Mapping¶
| Phase | Technique | ID | Tactic |
|---|---|---|---|
| Initial Access | Valid Accounts: Domain Accounts | T1078.002 | Initial Access |
| Collection | Data from Local System | T1005 | Collection |
| Collection | Data from Information Repositories | T1213 | Collection |
| Exfiltration | Exfiltration Over Alternative Protocol | T1048 | Exfiltration |
| Exfiltration | Exfiltration Over Web Service | T1567 | Exfiltration |
| Exfiltration | Scheduled Transfer | T1029 | Exfiltration |
| Defense Evasion | Valid Accounts | T1078 | Defense Evasion |
Lessons Learned¶
- Insider threats exploit legitimate access, not vulnerabilities -- COBALT MIRROR exploited no software vulnerability and defeated no preventive control; every action used authorized credentials and approved access paths (the DLP gap was a blind spot, not a bypass). Detection must focus on behavioral anomalies, not just policy violations.
- AI model intellectual property requires purpose-built protection -- Traditional DLP designed for documents and databases does not protect against model distillation via API or weight file exfiltration via serialized tensor formats. AI-specific security controls (model watermarking, API output monitoring, weight access governance) are essential.
- DLP blind spots in VPN tunnels are exploitable -- The entire 2.3 TB exfiltration occurred through a VPN tunnel that DLP did not inspect. Network-level DLP must cover all egress paths, including encrypted tunnels.
- After-hours patterns are a strong signal when correlated -- Any single indicator (high API usage, after-hours access, large downloads) might be legitimate. The combination of all three simultaneously -- especially from a single user over a sustained period -- is a high-fidelity insider threat signal.
- Post-resignation is too late for detection -- The insider completed all exfiltration before submitting their resignation. Departure risk models that detect pre-resignation behavioral shifts (disengagement, after-hours access changes, unusual data access) are critical for early intervention.
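The correlation lesson can be operationalized as a composite score. A toy Python sketch -- the weights, saturation points, and thresholds are invented for illustration, not a production UEBA model:

```python
# Composite insider-risk score: each signal contributes 0-1, and a
# product-style bonus rewards co-occurrence, so the score is high only
# when the signals fire together -- mirroring the correlation lesson.
def insider_risk_score(usage_multiple, logprob_pct, after_hours_pct):
    """usage_multiple: daily requests as a multiple of the user's baseline;
    logprob_pct / after_hours_pct: percentages (0-100)."""
    s_usage = min(usage_multiple / 10.0, 1.0)   # 10x baseline saturates
    s_logprob = min(logprob_pct / 90.0, 1.0)    # near-all-logprob saturates
    s_hours = min(after_hours_pct / 80.0, 1.0)  # mostly nocturnal saturates
    base = (s_usage + s_logprob + s_hours) / 3.0
    bonus = s_usage * s_logprob * s_hours       # high only if all co-occur
    return round(0.6 * base + 0.4 * bonus, 2)


# COBALT MIRROR's observed values vs a heavy-but-benign benchmark user
assert insider_risk_score(57, 100, 89) == 1.0
assert insider_risk_score(8, 100, 10) < 0.6
```

The design choice to include a multiplicative bonus is the point: a benchmark-heavy researcher trips one or two signals, but only the full conjunction -- sustained high volume, all-logprob requests, and nocturnal hours -- drives the score to the top of the range.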
Cross-References¶
- Chapter 37: AI Security -- AI/ML security fundamentals and model protection
- Chapter 26: Insider Threats -- Insider threat detection and prevention
- Chapter 50: Adversarial AI & LLM Security -- Model extraction attacks and defenses