SC-099: AI Model Exfiltration -- Operation MIRROR MIND¶
Educational Disclaimer¶
Synthetic Environment Only
This scenario uses 100% synthetic data for educational purposes. All IP addresses use RFC 5737 (192.0.2.x, 198.51.100.x, 203.0.113.x) or RFC 1918 (10.x, 172.16.x, 192.168.x) ranges. All domains use *.example.com. All credentials are testuser/REDACTED. No real organizations, infrastructure, or individuals are represented. Offense content is presented exclusively to improve defensive capabilities.
Scenario Overview¶
| Field | Detail |
|---|---|
| ID | SC-099 |
| Category | AI/ML Security / Insider Threat / Intellectual Property Theft |
| Severity | Critical |
| ATT&CK Tactics | Initial Access, Collection, Exfiltration, Defense Evasion, Persistence |
| ATT&CK Techniques | T1078 (Valid Accounts), T1119 (Automated Collection), T1048 (Exfiltration Over Alternative Protocol), T1027 (Obfuscated Files or Information), T1567 (Exfiltration Over Web Service), T1530 (Data from Cloud Storage Object) |
| Target Environment | AI-first pharmaceutical company with internal ML platform serving 28 production models for drug discovery, clinical trial optimization, and molecular property prediction, running on Kubernetes with MLflow, Kubeflow, and custom model serving infrastructure |
| Difficulty | ★★★★☆ |
| Duration | 4-6 hours |
| Estimated Impact | 1 proprietary drug discovery model functionally cloned via prediction API abuse (surrogate fidelity greater than 94 percent); 2 models directly exfiltrated from the model registry; model training data partially reconstructed via membership inference; estimated IP value of stolen models: $135M, exposing $180M in total R&D investment; 45-day dwell time before behavioral analytics triggers an investigation |
Narrative¶
Helix Therapeutics, a fictional AI-driven pharmaceutical company at helix-therapeutics.example.com, has invested $180M over four years building a proprietary ML platform for drug discovery. The platform, called MoleculeAI, runs 28 production models that predict molecular binding affinity, toxicity profiles, ADMET properties (absorption, distribution, metabolism, excretion, toxicity), and clinical trial success probability. These models represent a significant competitive advantage, reducing drug candidate screening time from 18 months to 6 weeks.
The ML infrastructure runs on an internal Kubernetes cluster with MLflow for experiment tracking and model registry, Kubeflow Pipelines for training orchestration, Seldon Core for model serving, and MinIO (S3-compatible) for model artifact storage. The platform serves predictions via internal REST APIs to three teams: Computational Chemistry (model development, 12 data scientists), Drug Discovery (model consumers, 45 researchers), and Clinical Operations (trial optimization, 20 analysts).
Access to the ML platform is role-based: data scientists can train, deploy, and export models; researchers can query prediction APIs; analysts can access aggregated predictions via a dashboard. All access requires corporate SSO, and API calls are logged by a custom audit middleware.
Dr. Marcus Webb (testuser@helix-therapeutics.example.com), a senior data scientist on the Computational Chemistry team, has given notice and will join a competing pharmaceutical company, GenomX Pharma, in 30 days. During his notice period, Dr. Webb systematically extracts Helix's proprietary models using a combination of prediction API abuse (model extraction attacks), side-channel analysis of model responses, and direct model artifact download from the model registry.
Attack Flow¶
graph TD
A[Phase 1: Reconnaissance<br/>Map model inventory, APIs, and access controls] --> B[Phase 2: Model Extraction via API<br/>Systematic query attack on prediction endpoints]
B --> C[Phase 3: Side-Channel Analysis<br/>Confidence scores reveal decision boundaries]
C --> D[Phase 4: Direct Model Theft<br/>Download model artifacts from registry]
D --> E[Phase 5: Training Data Reconstruction<br/>Membership inference attack]
E --> F[Phase 6: Detection and Response<br/>Behavioral analytics flags anomalous patterns]
Phase Details¶
Phase 1: Reconnaissance -- Mapping the ML Platform¶
ATT&CK Technique: T1078 (Valid Accounts)
Dr. Webb uses his legitimate data scientist credentials to thoroughly map the ML platform's model inventory, API endpoints, access controls, and storage locations. As a senior team member, he has broad access that was appropriate for his role but becomes dangerous during a hostile notice period.
# Simulated ML platform reconnaissance (educational only)
# Insider maps model inventory and access controls
# Step 1: Enumerate production models via MLflow API
curl -H "Authorization: Bearer REDACTED" \
https://mlflow.helix-therapeutics.example.com/api/2.0/mlflow/registered-models/list
{
"registered_models": [
{
"name": "binding-affinity-predictor",
"latest_versions": [{"version": "14", "stage": "Production",
"run_id": "a1b2c3d4e5f6",
"source": "s3://helix-models/binding-affinity/v14"}],
"description": "Predicts binding affinity (pKd) for small molecules",
"tags": {"team": "comp-chem", "ip_classification": "TRADE_SECRET",
"estimated_value": "$45M",
"architecture": "transformer-gnn-hybrid"}
},
{
"name": "toxicity-classifier",
"latest_versions": [{"version": "8", "stage": "Production",
"run_id": "f6e5d4c3b2a1",
"source": "s3://helix-models/toxicity/v8"}],
"description": "Multi-label toxicity prediction",
"tags": {"team": "comp-chem", "ip_classification": "TRADE_SECRET",
"estimated_value": "$38M",
"architecture": "graph-attention-network"}
},
{
"name": "admet-predictor",
"latest_versions": [{"version": "11", "stage": "Production",
"run_id": "1a2b3c4d5e6f",
"source": "s3://helix-models/admet/v11"}],
"description": "ADMET property prediction for drug candidates",
"tags": {"team": "comp-chem", "ip_classification": "TRADE_SECRET",
"estimated_value": "$52M",
"architecture": "multi-task-transformer"}
},
{
"name": "trial-success-predictor",
"latest_versions": [{"version": "5", "stage": "Production",
"run_id": "6f5e4d3c2b1a",
"source": "s3://helix-models/trial-success/v5"}],
"description": "Clinical trial Phase II/III success probability",
"tags": {"team": "clinical-ops", "ip_classification": "CONFIDENTIAL",
"estimated_value": "$25M",
"architecture": "ensemble-xgboost"}
}
],
"total_models": 28
}
# Step 2: Map prediction API endpoints
curl -H "Authorization: Bearer REDACTED" \
https://seldon.helix-therapeutics.example.com/api/v1/models
{
"models": [
{
"name": "binding-affinity-predictor",
"endpoint": "https://seldon.helix-therapeutics.example.com/seldon/production/binding-affinity/api/v1.0/predictions",
"input_schema": {"type": "molecular_graph", "format": "SMILES",
"max_batch_size": 100},
"output_schema": {"predictions": ["pKd_value", "confidence",
"attention_weights", "binding_site_residues"]},
"rate_limit": "1000 req/hour per user",
"consumers": ["comp-chem", "drug-discovery"],
"response_includes_confidence": true,
"response_includes_explanations": true
},
{
"name": "toxicity-classifier",
"endpoint": "https://seldon.helix-therapeutics.example.com/seldon/production/toxicity/api/v1.0/predictions",
"input_schema": {"type": "molecular_fingerprint", "format": "SMILES",
"max_batch_size": 50},
"output_schema": {"predictions": ["hepatotoxicity_prob",
"cardiotoxicity_prob", "nephrotoxicity_prob",
"confidence_intervals", "feature_importance"]},
"rate_limit": "500 req/hour per user",
"consumers": ["comp-chem", "drug-discovery", "clinical-ops"]
}
]
}
# CRITICAL observations:
# 1. Prediction APIs return confidence scores and attention weights
# (enables model extraction via side-channel analysis)
# 2. Rate limits are generous (1000/hour for extraction)
# 3. Batch predictions accepted (100 molecules per request)
# 4. Model artifacts stored in MinIO S3 with known bucket paths
# 5. Dr. Webb has data scientist role: can download model artifacts
# Step 3: Check MinIO access permissions
curl -H "Authorization: Bearer REDACTED" \
"https://minio.helix-therapeutics.example.com/helix-models/?list-type=2&prefix=binding-affinity/"
# Response shows model artifacts are accessible to comp-chem team
# Including: model.pkl, model_weights.pt, config.json, tokenizer/
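The registry listing above is also what defenders should audit periodically. A minimal sketch, assuming a response shaped like the synthetic JSON shown here (real MLflow responses differ in detail), that flags models tagged as trade secrets so their download permissions can be reviewed:

```python
# Educational sketch: parse a registry listing like the synthetic JSON
# above and flag models whose tags mark them as high-value IP.
import json

def flag_sensitive_models(listing: dict, classification: str = "TRADE_SECRET") -> list:
    """Return names of registered models tagged with the given IP classification."""
    flagged = []
    for model in listing.get("registered_models", []):
        tags = model.get("tags", {})
        if tags.get("ip_classification") == classification:
            flagged.append(model["name"])
    return flagged

listing = json.loads("""
{"registered_models": [
  {"name": "binding-affinity-predictor",
   "tags": {"ip_classification": "TRADE_SECRET", "estimated_value": "$45M"}},
  {"name": "trial-success-predictor",
   "tags": {"ip_classification": "CONFIDENTIAL", "estimated_value": "$25M"}}
]}
""")
print(flag_sensitive_models(listing))  # ['binding-affinity-predictor']
```

In a real deployment this would consume the MLflow registered-models API rather than a literal string; the point is that the same metadata an insider enumerates is available for a defensive inventory.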
Phase 2: Model Extraction via Prediction API Abuse¶
ATT&CK Technique: T1119 (Automated Collection)
Dr. Webb launches a systematic model extraction attack against the binding-affinity-predictor. He generates a diverse set of molecular queries designed to map the model's decision boundaries and trains a surrogate model that functionally replicates the proprietary model's predictions. The attack exploits the prediction API's confidence scores and explanation outputs to accelerate extraction.
# Simulated model extraction attack (educational only)
# Insider uses prediction API to train a surrogate model
# Step 1: Generate diverse molecular query dataset
# Dr. Webb creates a dataset of 50,000 diverse SMILES strings
# covering the chemical space relevant to the binding affinity model
# Query generation strategy (educational only):
# - 10,000 molecules from PubChem (public database, diverse scaffolds)
# - 10,000 molecules generated via combinatorial enumeration
# - 10,000 molecules from interpolating between known drug candidates
# - 10,000 edge-case molecules (very large, very small, unusual groups)
# - 10,000 adversarial examples designed to probe decision boundaries
# Step 2: Systematic API querying over 30 days
# With batch size 100 and rate limit 1000/hour:
# 50,000 molecules / 100 per batch = 500 batches
# 500 batches at 1000 per hour = approximately 0.5 hours continuous
# But Dr. Webb spreads queries over 30 days to avoid volume alerts
# Daily query pattern (educational only):
# Approximately 1,667 molecules/day = 17 batches/day
# Timed during normal working hours (9 AM - 6 PM EST)
# Mixed with legitimate research queries to blend in
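The pacing arithmetic above can be sketched directly; it also illustrates why purely volume-based alerting misses low-and-slow extraction. `extraction_schedule` is a hypothetical helper written for this scenario, not part of any real tool:

```python
# Educational sketch of the low-and-slow pacing arithmetic above.
def extraction_schedule(total_molecules: int, batch_size: int, days: int) -> dict:
    """Ceiling-divide a query budget across a notice period."""
    ceil = lambda a, b: -(-a // b)  # integer ceiling division
    batches = ceil(total_molecules, batch_size)
    return {
        "total_batches": batches,
        "molecules_per_day": ceil(total_molecules, days),
        "batches_per_day": ceil(batches, days),
    }

plan = extraction_schedule(total_molecules=50_000, batch_size=100, days=30)
print(plan)  # {'total_batches': 500, 'molecules_per_day': 1667, 'batches_per_day': 17}
```

Seventeen batches a day is well inside the 1000 requests/hour rate limit, which is why the detection queries later in this scenario key on diversity and baselines rather than raw volume alone.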
# Example batch prediction request:
curl -X POST \
-H "Authorization: Bearer REDACTED" \
-H "Content-Type: application/json" \
https://seldon.helix-therapeutics.example.com/seldon/production/binding-affinity/api/v1.0/predictions \
-d '{
"data": {
"ndarray": [
{"smiles": "CC(=O)Oc1ccccc1C(=O)O", "target": "CDK2"},
{"smiles": "c1ccc2c(c1)cc1ccc3cccc4ccc2c1c34", "target": "CDK2"},
{"smiles": "CC1=CC=C(C=C1)C2=CC(=NN2C3=CC=C(C=C3)S(N)(=O)=O)C(F)(F)F",
"target": "CDK2"}
]
}
}'
# Response with rich prediction details:
{
"data": {
"ndarray": [
{
"smiles": "CC(=O)Oc1ccccc1C(=O)O",
"target": "CDK2",
"predicted_pKd": 4.23,
"confidence": 0.87,
"prediction_interval": [3.98, 4.48],
"attention_weights": {
"carbonyl_oxygen": 0.34,
"aromatic_ring": 0.28,
"carboxyl_group": 0.22,
"methyl_group": 0.16
},
"predicted_binding_residues": ["LEU83", "PHE80", "ASP145"]
},
{
"smiles": "c1ccc2c(c1)cc1ccc3cccc4ccc2c1c34",
"predicted_pKd": 6.71,
"confidence": 0.92,
"prediction_interval": [6.55, 6.87],
"attention_weights": {
"aromatic_system_1": 0.41,
"aromatic_system_2": 0.35,
"ring_junction": 0.24
},
"predicted_binding_residues": ["LEU83", "ILE10", "PHE80"]
}
]
},
"meta": {
"model_version": "14",
"inference_time_ms": 45,
"request_id": "req-a1b2c3d4"
}
}
# Step 3: Train surrogate model using collected predictions
# Dr. Webb uses the 50,000 query-response pairs as training data
# for a student model that learns to replicate the proprietary model
# Surrogate model training (pseudocode, educational only):
# import torch
# from torch_geometric.nn import GATConv
#
# class SurrogateBindingPredictor(torch.nn.Module):
# """Student model trained on proprietary model's predictions"""
# def __init__(self):
# super().__init__()
# self.gat1 = GATConv(in_channels=64, out_channels=128, heads=4)
# self.gat2 = GATConv(in_channels=512, out_channels=64, heads=4)
# self.fc = torch.nn.Linear(256, 1)
#
# # Training with knowledge distillation:
# # Loss = alpha * MSE(surrogate_pred, teacher_pred) +
# # beta * KL_div(surrogate_confidence, teacher_confidence) +
# # gamma * attention_alignment_loss(surrogate_attn, teacher_attn)
#
# # Results after training:
# # Surrogate model accuracy vs. proprietary model:
# # Mean Absolute Error: 0.31 pKd units
# # Correlation (R-squared): 0.94
# # Attention weight alignment: 0.89
# # The surrogate captures 94% of the proprietary model's predictive power
# Step 4: Validate surrogate against held-out queries
# Dr. Webb reserves 5,000 molecules as validation set
# Queries the proprietary API and compares with surrogate predictions
# Validation results (synthetic):
# validation_size: 5000
# mae: 0.31, rmse: 0.42, r_squared: 0.94
# spearman_correlation: 0.96, top_100_overlap: 0.91
# extraction_fidelity: HIGH (surrogate is a functional clone)
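The extraction idea can be illustrated end to end with a toy teacher/student pair. Ordinary least squares stands in for the GNN surrogate, a deliberate simplification; the fidelity metric (R-squared on held-out queries) is the same one quoted above:

```python
# Toy model-extraction illustration (educational only): fit a "student"
# to a black-box "teacher" using only query-response pairs, then measure
# fidelity on held-out queries.
import numpy as np

rng = np.random.default_rng(0)
w_teacher = rng.normal(size=8)  # hidden "proprietary" weights

def teacher(x):
    # Black-box stand-in for the prediction API: hidden linear weights
    # plus a little response noise.
    return x @ w_teacher + 0.05 * rng.normal(size=len(x))

queries = rng.normal(size=(5000, 8))   # attacker-chosen inputs
responses = teacher(queries)           # observed API outputs

# "Student" fit from the query-response pairs alone.
w_student, *_ = np.linalg.lstsq(queries, responses, rcond=None)

holdout = rng.normal(size=(1000, 8))
y_true = teacher(holdout)
y_pred = holdout @ w_student
r2 = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)
print(f"surrogate R^2 on held-out queries: {r2:.3f}")
```

A real extraction against a deep model needs far more queries and a matched student architecture, but the workflow (choose queries, record outputs, fit, validate) is exactly the one described above.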
Phase 3: Side-Channel Analysis -- Confidence Score Exploitation¶
ATT&CK Technique: T1027 (Obfuscated Files or Information)
Dr. Webb exploits the prediction API's confidence scores and explanation outputs as side channels to accelerate model extraction. By analyzing how confidence varies near decision boundaries, he can infer the model's internal decision surface more efficiently than random querying alone.
# Simulated side-channel analysis (educational only)
# Insider exploits confidence scores to map decision boundaries
# Step 1: Boundary probing, find molecules where confidence drops
# High confidence = far from decision boundary
# Low confidence = near decision boundary
# Boundary molecules are most informative for extraction
# Confidence-guided active learning (pseudocode):
# for iteration in range(100):
# candidates = generate_candidates(1000)
# predictions = query_api(candidates)
# # Select molecules near decision boundaries
# # (confidence below 0.6 indicates boundary proximity)
# boundary_molecules = [m for m, p in zip(candidates, predictions)
# if p['confidence'] < 0.6]
# # These boundary molecules are 5x more informative
# # for surrogate training than random molecules
# surrogate.train(boundary_molecules, predictions)
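The selection step in the loop above reduces to a filter on returned confidence scores. A minimal sketch, with a hand-built response list standing in for real API output:

```python
# Educational sketch of confidence-guided (uncertainty) sampling: keep
# the low-confidence responses, which lie near decision boundaries and
# are the most informative for surrogate training.
def select_boundary_queries(responses: list, threshold: float = 0.6) -> list:
    """Return responses whose confidence falls below the boundary threshold."""
    return [r for r in responses if r["confidence"] < threshold]

responses = [
    {"smiles": "CCO", "confidence": 0.95},
    {"smiles": "c1ccccc1", "confidence": 0.55},  # near a decision boundary
    {"smiles": "CC(=O)O", "confidence": 0.88},
    {"smiles": "CCN", "confidence": 0.42},       # near a decision boundary
]
boundary = select_boundary_queries(responses)
print([r["smiles"] for r in boundary])  # ['c1ccccc1', 'CCN']
```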
# Step 2: Attention weight analysis for architecture inference
# The prediction API returns attention weights per molecular substructure
# These weights reveal the model's internal feature importance
# Systematic attention analysis (educational only):
# Query molecules with isolated functional groups to determine
# how the model weighs each chemical feature.
# Results reveal a learned attention mechanism with 4 heads
# and 128-dimensional key/value projections, consistent with
# a Graph Attention Network architecture.
# Step 3: Prediction interval analysis for uncertainty calibration
# The model returns prediction intervals [lower, upper]
# The width of the interval reveals the model's uncertainty
# which correlates with training data density
#
# Narrow intervals: model has seen similar molecules (training data region)
# Wide intervals: model is extrapolating (sparse training data region)
#
# This information enables:
# 1. Surrogate uncertainty calibration (matching interval widths)
# 2. Training data density estimation (where Helix has training data)
# 3. Active learning guidance (query where uncertainty is highest)
# Step 4: Feature importance extraction via perturbation
# For each prediction, slightly modify the input molecule
# and observe how predictions and attention weights change
# This reveals the model's sensitivity to specific substructures
# Perturbation analysis (educational only):
# base_molecule = "CC(=O)Oc1ccccc1C(=O)O"
# base_prediction = pKd 4.23, confidence 0.87
#
# Remove methyl: C(=O)Oc1ccccc1C(=O)O -> pKd: 4.01 (delta -0.22)
# Remove carboxyl: CC(=O)Oc1ccccc1 -> pKd: 3.15 (delta -1.08)
# Add fluorine: CC(=O)Oc1ccc(F)cc1C(=O)O -> pKd: 4.89 (delta +0.66)
#
# The carboxyl group contributes most to binding affinity
# Fluorine substitution improves predicted binding
# These sensitivity patterns are IP, they reveal learned
# structure-activity relationships from proprietary training data
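The perturbation loop above can be sketched against any black-box scorer. `toy_pkd` below is a hypothetical stand-in for the prediction API, not the real model; only the probe-and-diff pattern is the point:

```python
# Educational sketch of perturbation-based sensitivity analysis: score a
# base molecule and structural variants, then rank substructures by the
# prediction delta they cause.
def toy_pkd(smiles: str) -> float:
    # Hypothetical stand-in scorer (not a real model): rewards a terminal
    # carboxyl group and any fluorine substitution.
    score = 3.0
    if smiles.endswith("C(=O)O"):
        score += 1.1
    if "F" in smiles:
        score += 0.6
    return score

base = "CC(=O)Oc1ccccc1C(=O)O"
variants = {
    "remove_carboxyl": "CC(=O)Oc1ccccc1",
    "add_fluorine": "CC(=O)Oc1ccc(F)cc1C(=O)O",
}
base_score = toy_pkd(base)
deltas = {name: round(toy_pkd(v) - base_score, 2) for name, v in variants.items()}
print(deltas)  # {'remove_carboxyl': -1.1, 'add_fluorine': 0.6}
```

Against a real model, each variant costs one API query, which is why per-user query budgets limit this technique.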
Phase 4: Direct Model Theft -- Model Registry Exfiltration¶
ATT&CK Technique: T1530 (Data from Cloud Storage Object), T1567 (Exfiltration Over Web Service)
In parallel with the API-based extraction, Dr. Webb uses his data scientist role to directly download model artifacts from the MLflow model registry and MinIO storage. He targets two additional models (toxicity-classifier and admet-predictor) that would be harder to extract via API alone due to their multi-output architecture.
# Simulated direct model theft (educational only)
# Insider downloads model artifacts from model registry
# Step 1: Download toxicity-classifier from MLflow
curl -H "Authorization: Bearer REDACTED" \
"https://mlflow.helix-therapeutics.example.com/api/2.0/mlflow/model-versions/get-download-uri?name=toxicity-classifier&version=8"
{
"artifact_uri": "s3://helix-models/toxicity/v8/artifacts/model"
}
# Step 2: Download model artifacts from MinIO
# Using the MinIO client with Dr. Webb's credentials
# mc cp --recursive helix/helix-models/toxicity/v8/ ./exfil/toxicity/
# Downloaded files:
# toxicity/v8/artifacts/model/
# model.pt (PyTorch model weights, 847 MB)
# config.json (model architecture configuration)
# tokenizer/vocab.json (chemical vocabulary)
# tokenizer/merges.txt (tokenizer merge rules)
# requirements.txt (dependency versions)
# MLmodel (MLflow model metadata)
# conda.yaml (environment specification)
# Step 3: Download admet-predictor
# mc cp --recursive helix/helix-models/admet/v11/ ./exfil/admet/
# Downloaded files:
# admet/v11/artifacts/model/
# model_weights.pt (1.2 GB, multi-task transformer)
# config.json (23 ADMET property prediction heads)
# tokenizer/ (molecular tokenizer)
# feature_engineering/descriptor_pipeline.pkl (proprietary features)
# feature_engineering/scaler.pkl (feature normalization parameters)
# training_config.json (hyperparameters, training details)
# Step 4: Exfiltration via personal cloud storage
# Dr. Webb compresses and encrypts the model artifacts
# then uploads them to personal cloud storage in small chunks
# to avoid DLP detection on large file transfers
# Exfiltration method (educational only):
# 1. Compress model files: tar czf model_export.tar.gz ./exfil/
# 2. Encrypt with AES-256-CBC, output as research_backup.enc
# 3. Split into 25 MB chunks (below DLP threshold)
# 4. Upload chunks to personal Google Drive via web browser
# (HTTPS traffic blends with normal usage)
# 5. Files named "research_notes_aa", "research_notes_ab", etc.
# Total exfiltrated:
# - toxicity-classifier: 847 MB model + 12 MB config/tokenizer
# - admet-predictor: 1.2 GB model + 45 MB feature engineering
# - binding-affinity (surrogate): 340 MB trained surrogate model
# - Total: approximately 2.4 GB of artifacts, 1.8 GB after compression and encryption
# - Uploaded as 72 chunks of 25 MB each over 3 weeks
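On the defensive side, the split-archive pattern above is detectable by grouping uploads on their shared base name; the field names in this sketch are illustrative, not a real DLP schema:

```python
# Defensive sketch (educational only): group DLP upload events by base
# name and flag users uploading many similarly sized, serially named
# chunks, the split-archive pattern described above.
import re
from collections import defaultdict

def find_chunk_patterns(uploads: list, min_chunks: int = 5) -> dict:
    """Map (user, base_name) -> chunk file names when the count exceeds min_chunks."""
    groups = defaultdict(list)
    for u in uploads:
        m = re.match(r"^(.+?)_[a-z]{2}$", u["name"])
        if m and 24_000_000 <= u["size"] <= 26_000_000:
            groups[(u["user"], m.group(1))].append(u["name"])
    return {k: v for k, v in groups.items() if len(v) > min_chunks}

uploads = [{"user": "testuser", "name": f"research_notes_a{c}", "size": 25_000_000}
           for c in "abcdefgh"]
print(list(find_chunk_patterns(uploads)))  # [('testuser', 'research_notes')]
```

The KQL and SPL detection queries later in this scenario implement the same grouping at log-pipeline scale.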
Phase 5: Training Data Reconstruction -- Membership Inference¶
ATT&CK Technique: T1119 (Automated Collection)
Dr. Webb performs membership inference attacks against the prediction APIs to partially reconstruct Helix's proprietary training dataset. By analyzing prediction confidence patterns, he can determine which molecules were likely in the training data, recovering valuable proprietary structure-activity relationship data.
# Simulated membership inference attack (educational only)
# Insider determines which molecules were in the training data
# Step 1: Membership inference concept
# Models tend to be MORE confident on molecules they were trained on
# vs. molecules they have never seen. By comparing confidence scores
# for known molecules vs. novel molecules, the attacker can infer
# training set membership.
# Membership inference approach (pseudocode, educational only):
# 1. Collect a reference set of molecules with KNOWN membership:
# - 1,000 molecules from public datasets (likely NOT in training)
# - 500 molecules from Helix's published papers (likely IN training)
# 2. Train a binary classifier:
# Input: [prediction, confidence, interval_width, attention_entropy]
# Output: probability of training set membership
# 3. Apply to candidate molecules to infer training data
# Step 2: Query models with candidate molecules
# Dr. Webb has access to Helix's internal chemical library
# (20,000 proprietary compounds from internal synthesis)
# For each compound, query the binding affinity model
# and record: prediction, confidence, interval width, attention pattern
# Membership signal analysis (synthetic results):
# molecules_queried: 20000
# predicted_training_members: 8420
# membership_classifier_accuracy: 0.78
# confidence_threshold_for_membership: 0.82
# avg_confidence_members: 0.91 vs avg_confidence_non_members: 0.72
# avg_interval_width_members: 0.38 vs avg_interval_width_non_members: 0.89
# Molecules predicted as training data members:
# - Higher confidence (0.91 vs 0.72 average)
# - Narrower prediction intervals (0.38 vs 0.89)
# - Lower attention entropy (more focused attention patterns)
# - These are likely Helix's proprietary experimental data points
# Step 3: Reconstruct structure-activity relationships
# The inferred training data members, combined with their
# predicted pKd values, reveal Helix's proprietary SAR data
# - 8,420 molecules identified as likely training data
# - For each: SMILES structure + predicted binding affinity
# - Combined with the surrogate model, this gives a competitor
# both the model AND the data it was trained on
# Step 4: Validate reconstruction accuracy
# Cross-reference inferred training data with molecules mentioned
# in Helix's patent filings and published papers
# - 892 of 8,420 inferred members appear in Helix's patents
# - 94% of patent molecules were correctly predicted as members
# - This validates the membership inference approach
# - The remaining 7,528 inferred members represent unpublished
# proprietary SAR data, the most valuable intelligence
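The membership signal described in this phase can be reproduced on synthetic data: two confidence distributions and a single threshold already separate members from non-members. The distribution parameters mirror the synthetic numbers above:

```python
# Toy membership-inference sketch (educational only): models are more
# confident on training-set members, so a threshold on confidence
# separates the two synthetic populations below.
import numpy as np

rng = np.random.default_rng(1)
members = np.clip(rng.normal(0.91, 0.05, 500), 0, 1)       # in training set
non_members = np.clip(rng.normal(0.72, 0.05, 1000), 0, 1)  # never seen

threshold = 0.82  # matches the synthetic analysis above
member_recall = float(np.mean(members > threshold))
non_member_recall = float(np.mean(non_members <= threshold))
accuracy = (member_recall * len(members)
            + non_member_recall * len(non_members)) / 1500
print(f"member recall {member_recall:.2f}, "
      f"non-member recall {non_member_recall:.2f}, accuracy {accuracy:.2f}")
```

Real attacks combine several features (interval width, attention entropy) in a learned classifier, as the pseudocode above outlines, but the confidence gap alone carries most of the signal.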
Phase 6: Detection and Response¶
The attack is detected after 45 days when Helix's insider threat behavioral analytics platform identifies a statistically anomalous pattern in Dr. Webb's API usage: a 380% increase in prediction API calls with unusually diverse molecular inputs (high chemical space coverage) that does not match his historical research focus areas.
# Simulated detection timeline (educational only)
[2026-04-29 14:00:00 UTC] ALERT: Anomalous ML API usage pattern
User: testuser@helix-therapeutics.example.com
Model: binding-affinity-predictor
API calls (last 30 days): 47,200 (baseline: 9,800)
Chemical diversity score: 0.94 (baseline: 0.41)
Query focus area match: 0.23 (baseline: 0.87)
Explanation: User is querying diverse molecular spaces outside
their research focus (CDK2 inhibitors), suggesting systematic
model exploration rather than targeted research
[2026-04-29 14:15:00 UTC] ALERT: Large model artifact downloads
User: testuser@helix-therapeutics.example.com
MLflow downloads (last 45 days):
- toxicity-classifier v8 (859 MB) downloaded 2026-03-22
- admet-predictor v11 (1.2 GB) downloaded 2026-03-28
These models are outside Dr. Webb's project assignments
[2026-04-29 14:30:00 UTC] CORRELATION: Insider risk escalation
- Dr. Webb submitted resignation 2026-03-14 (45 days ago)
- API usage anomaly began 2026-03-16 (2 days after resignation)
- Model downloads occurred during notice period
- 72 file uploads to drive.google.com detected in DLP logs
- Files named "research_notes_*" (split encrypted archive pattern)
- Destination: GenomX Pharma (competitor) start date confirmed
[2026-04-29 15:00:00 UTC] ESCALATION: P1 incident declared
- Legal, HR, IP security, and ML platform team engaged
- Dr. Webb's access suspended immediately
- Legal hold placed on all Dr. Webb's corporate data
- Forensic imaging of Dr. Webb's workstation initiated
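The alert logic in the timeline above can be sketched as a per-user baseline comparison; thresholds and field names here are illustrative, not a real analytics product:

```python
# Educational sketch of the behavioral-analytics scoring above: flag
# users whose query volume and chemical diversity both deviate sharply
# from their own historical baseline.
def score_user(current: dict, baseline: dict,
               volume_factor: float = 3.0, diversity_delta: float = 0.3) -> dict:
    """Compare a user's recent activity against their baseline."""
    volume_ratio = current["api_calls"] / max(baseline["api_calls"], 1)
    diversity_jump = current["diversity"] - baseline["diversity"]
    anomalous = volume_ratio > volume_factor and diversity_jump > diversity_delta
    return {"volume_ratio": round(volume_ratio, 2),
            "diversity_jump": round(diversity_jump, 2),
            "anomalous": anomalous}

# Numbers from the synthetic alert above.
result = score_user({"api_calls": 47_200, "diversity": 0.94},
                    {"api_calls": 9_800, "diversity": 0.41})
print(result)  # volume_ratio 4.82, diversity_jump 0.53, anomalous True
```

Requiring both signals is what kept the false-positive rate manageable here: legitimate research spikes raise volume, but rarely diversity outside the user's focus area.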
Detection Queries¶
// KQL -- Detect anomalous ML prediction API usage volume
MLPlatformLogs
| where TimeGenerated > ago(30d)
| where EventType == "prediction_request"
| summarize DailyRequests = count(),
UniqueModels = dcount(ModelName),
UniqueMolecules = dcount(hash(InputData)),
BatchRequests = countif(BatchSize > 1),
AvgBatchSize = avg(BatchSize)
by UserId, bin(TimeGenerated, 1d)
| join kind=leftouter (
    MLPlatformLogs
    | where TimeGenerated between (ago(90d) .. ago(30d))
    | where EventType == "prediction_request"
    | summarize DailyCount = count() by UserId, bin(TimeGenerated, 1d)
    | summarize BaselineDaily = avg(DailyCount) by UserId
) on UserId
| extend BaselineDaily = coalesce(BaselineDaily, 0.0)
| extend VolumeRatio = DailyRequests / max_of(BaselineDaily, 1.0)
| where VolumeRatio > 3.0 or DailyRequests > 2000
| project TimeGenerated, UserId, DailyRequests, BaselineDaily,
VolumeRatio, UniqueModels, UniqueMolecules, BatchRequests
// KQL -- Detect model extraction via chemical diversity analysis
MLPlatformLogs
| where TimeGenerated > ago(30d)
| where EventType == "prediction_request"
| extend MolecularFingerprint = tostring(parse_json(InputData).fingerprint)
| summarize QueryCount = count(),
UniqueMolecules = dcount(hash(MolecularFingerprint)),
ChemicalDiversity = dcount(hash(substring(MolecularFingerprint, 0, 64)))
by UserId, ModelName, bin(TimeGenerated, 1d)
| extend DiversityRatio = todouble(ChemicalDiversity) / todouble(max_of(QueryCount, 1))
| where DiversityRatio > 0.8 and QueryCount > 100
| project TimeGenerated, UserId, ModelName, QueryCount,
UniqueMolecules, ChemicalDiversity, DiversityRatio
// KQL -- Detect unauthorized model artifact downloads
MLPlatformLogs
| where TimeGenerated > ago(30d)
| where EventType in ("model_download", "artifact_download")
| extend ModelName = tostring(parse_json(Details).model_name)
| extend ModelSize = tolong(parse_json(Details).artifact_size_bytes)
| extend UserTeam = tostring(parse_json(UserContext).team)
| extend ModelTeam = tostring(parse_json(Details).model_team)
| where UserTeam != ModelTeam
| project TimeGenerated, UserId, UserTeam, ModelName, ModelTeam,
ModelSize, EventType
// KQL -- Detect split file upload exfiltration pattern
DLPLogs
| where TimeGenerated > ago(30d)
| where Action == "upload" and Destination contains "drive.google.com"
| extend FileName = tostring(parse_json(FileDetails).name)
| extend FileSize = tolong(parse_json(FileDetails).size)
| where FileSize between (24000000 .. 26000000)
| extend FileBaseName = extract(@"^(.+?)_[a-z]{2}$", 1, FileName)
| where isnotempty(FileBaseName)
| summarize ChunkCount = count(),
TotalSize = sum(FileSize),
FileNames = make_set(FileName),
FirstUpload = min(TimeGenerated),
LastUpload = max(TimeGenerated)
by UserId, FileBaseName
| where ChunkCount > 5
| project UserId, FileBaseName, ChunkCount, TotalSize,
FirstUpload, LastUpload
// KQL -- Correlate resignation with data access anomalies
let ResignedUsers = HREvents
| where EventType == "resignation_submitted"
| where TimeGenerated > ago(60d)
| project UserId, ResignationDate = TimeGenerated;
MLPlatformLogs
| where TimeGenerated > ago(60d)
| where EventType in ("prediction_request", "model_download", "artifact_download")
| join kind=inner ResignedUsers on UserId
| where TimeGenerated > ResignationDate
| summarize PostResignationActions = count(),
ModelsAccessed = dcount(tostring(parse_json(Details).model_name)),
PredictionRequests = countif(EventType == "prediction_request"),
ModelDownloads = countif(EventType == "model_download")
by UserId, ResignationDate
| where PostResignationActions > 1000 or ModelDownloads > 0
| project UserId, ResignationDate, PostResignationActions,
ModelsAccessed, PredictionRequests, ModelDownloads
# SPL -- Detect anomalous ML prediction API usage volume
index=ml_platform sourcetype=ml:prediction_request
| eval input_hash = md5(input_data)
| bin _time span=1d
| stats count as daily_requests,
    dc(model_name) as unique_models,
    dc(input_hash) as unique_molecules,
    sum(eval(if(batch_size>1,1,0))) as batch_requests,
    avg(batch_size) as avg_batch_size
  by user_id, _time
| eventstats avg(daily_requests) as baseline_daily by user_id
| eval volume_ratio = daily_requests / max(baseline_daily, 1)
| where volume_ratio > 3.0 OR daily_requests > 2000
| table _time, user_id, daily_requests, baseline_daily,
volume_ratio, unique_models, unique_molecules, batch_requests
# SPL -- Detect model extraction via chemical diversity analysis
index=ml_platform sourcetype=ml:prediction_request
| spath output=fingerprint path=input_data.fingerprint
| eval fp_hash = md5(fingerprint)
| eval prefix_hash = md5(substr(fingerprint, 1, 64))
| bin _time span=1d
| stats count as query_count,
    dc(fp_hash) as unique_molecules,
    dc(prefix_hash) as chemical_diversity
  by user_id, model_name, _time
| eval diversity_ratio = chemical_diversity / max(query_count, 1)
| where diversity_ratio > 0.8 AND query_count > 100
| table _time, user_id, model_name, query_count,
unique_molecules, chemical_diversity, diversity_ratio
# SPL -- Detect unauthorized model artifact downloads
index=ml_platform sourcetype=ml:audit
event_type IN ("model_download", "artifact_download")
| spath output=model_name path=details.model_name
| spath output=model_size path=details.artifact_size_bytes
| spath output=user_team path=user_context.team
| spath output=model_team path=details.model_team
| where user_team != model_team
| table _time, user_id, user_team, model_name, model_team,
model_size, event_type
# SPL -- Detect split file upload exfiltration pattern
index=dlp sourcetype=dlp:upload
destination="*drive.google.com*"
| spath output=file_name path=file_details.name
| spath output=file_size path=file_details.size
| where file_size >= 24000000 AND file_size <= 26000000
| rex field=file_name "^(?<file_base>.+?)_[a-z]{2}$"
| where isnotnull(file_base)
| stats count as chunk_count,
sum(file_size) as total_size,
values(file_name) as file_names,
min(_time) as first_upload,
max(_time) as last_upload
by user_id, file_base
| where chunk_count > 5
| table user_id, file_base, chunk_count, total_size,
first_upload, last_upload
# SPL -- Correlate resignation with data access anomalies
index=ml_platform sourcetype=ml:* earliest=-60d
    event_type IN ("prediction_request", "model_download",
                   "artifact_download")
| join type=inner user_id [
    search index=hr sourcetype=hr:events
        event_type="resignation_submitted" earliest=-60d
    | rename _time as resignation_date
    | fields user_id, resignation_date
  ]
| where _time > resignation_date
| stats count as post_resignation_actions,
    dc(model_name) as models_accessed,
    sum(eval(if(event_type="prediction_request",1,0))) as prediction_requests,
    sum(eval(if(event_type="model_download",1,0))) as model_downloads
  by user_id, resignation_date
| where post_resignation_actions > 1000 OR model_downloads > 0
| table user_id, resignation_date, post_resignation_actions,
    models_accessed, prediction_requests, model_downloads
Incident Response:
# Simulated incident response (educational only)
[2026-04-29 15:00:00 UTC] ALERT: AI Model Exfiltration incident response activated
[2026-04-29 15:05:00 UTC] ACTION: Immediate access revocation
- Dr. Webb's SSO account DISABLED
- All active sessions and tokens REVOKED
- MLflow API access REVOKED
- MinIO credentials ROTATED
- VPN access TERMINATED
- Corporate device remotely locked
- Building access badge DEACTIVATED
[2026-04-29 15:30:00 UTC] ACTION: Forensic investigation
- Dr. Webb's workstation imaged (forensic copy)
- Browser history analyzed: 72 uploads to drive.google.com
- Terminal history recovered: MinIO download commands
- Python scripts recovered: model extraction and surrogate training code
- API query logs analyzed: 47,200 prediction requests post-resignation
- MLflow audit logs: 2 model artifact downloads outside assigned projects
[2026-04-29 16:00:00 UTC] ACTION: Model integrity assessment
Models potentially compromised:
- binding-affinity-predictor: EXTRACTED (API-based surrogate, 94% fidelity)
- toxicity-classifier: STOLEN (direct artifact download, 100% fidelity)
- admet-predictor: STOLEN (direct artifact download, 100% fidelity)
Training data exposure:
- 8,420 molecules inferred as training data members
- Proprietary SAR data partially reconstructed
Total IP value at risk: $135M (3 models)
Total R&D investment exposed: $180M (including training data)
[2026-04-29 17:00:00 UTC] ACTION: Legal and HR escalation
- Legal counsel notified: trade secret theft under DTSA
- HR documentation: policy violations during notice period
- Law enforcement referral under consideration (CFAA)
- Cease and desist drafted for Dr. Webb and GenomX Pharma
- Emergency TRO (temporary restraining order) filing prepared
- Evidence preservation: forensic images, API logs, DLP logs preserved under legal hold
Decision Points (Tabletop Exercise)¶
Decision Point 1 -- Notice Period Access
A senior data scientist with access to your most valuable ML models has submitted their resignation to join a competitor. How do you manage their access during the notice period? Do you immediately revoke access (losing their expertise for knowledge transfer), restrict them to read-only (limiting their research contribution), or maintain full access with enhanced monitoring?
Decision Point 2 -- API Response Richness vs. Extraction Risk
Your prediction APIs return confidence scores, prediction intervals, and attention weights to help researchers interpret results. These same outputs accelerate model extraction attacks. How do you balance model interpretability needs with extraction risk? Consider: rate limiting explanations, differential privacy on confidence scores, and restricting explanations to certain roles.
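One concrete option for the confidence-score trade-off is calibrated output noise. The sketch below is illustrative only (the function name and scale are assumptions, not part of any real serving API): small Gaussian noise barely changes a researcher's reading of "high vs. low confidence," but it blurs the fine-grained decision-boundary signal an extraction attacker depends on.

```python
import random

def perturb_confidence(score, scale=0.02, rng=random):
    """Add calibrated Gaussian noise to a confidence score, clipped to [0, 1].

    scale controls the privacy-utility trade-off: larger noise degrades
    extraction fidelity more, but also degrades legitimate interpretability.
    """
    noisy = score + rng.gauss(0.0, scale)
    return min(1.0, max(0.0, noisy))
```

In practice the noise scale would be tuned per model and per role, e.g. no perturbation for internal project members, stronger perturbation for broad-access service accounts.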
Decision Point 3 -- Model Artifact Access Controls
Data scientists currently have download access to all models in the MLflow registry for collaboration and experimentation. How do you implement model-level access controls that prevent unauthorized downloads while maintaining team productivity? Consider: project-based access, approval workflows for cross-team model access, and model DRM.
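A project-based download gate can be stated very compactly. This is a policy sketch, not MLflow's actual authorization API; `assignments` and `approvals` are hypothetical data structures an access-control service would maintain:

```python
def can_download(user, model, assignments, approvals):
    """Project-based model download policy (illustrative).

    assignments: dict mapping user -> set of assigned project names.
    approvals: set of (user, model_name) pairs granted through a
               cross-team approval workflow.
    model: dict with 'name' and 'project' keys.
    A user may download a model only if it belongs to one of their
    assigned projects, or an explicit cross-team approval exists.
    """
    if model["project"] in assignments.get(user, set()):
        return True
    return (user, model["name"]) in approvals
```

The key design choice is that the default is deny: a data scientist role alone grants nothing, and every cross-project download leaves an approval record that audit alerts can key on.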
Decision Point 4 -- Legal Response Calibration
You have evidence of trade secret theft. Do you pursue criminal referral (CFAA/EEA), civil litigation (DTSA), or both? How does the choice affect evidence preservation, employee relations, and your competitive position? What if the competitor claims the employee independently developed similar models?
Lessons Learned¶
Key Takeaways
- Prediction APIs are model extraction attack surfaces -- Any ML model serving predictions via API is vulnerable to model extraction. Confidence scores, explanation outputs, and prediction intervals dramatically accelerate extraction. Organizations should implement: query budgets (total lifetime queries per user), input diversity monitoring (flag unusually diverse query patterns), output perturbation (add calibrated noise to confidence scores), and explanation rate limiting (restrict detailed explanations).
- Insider threat monitoring must include ML-specific signals -- Traditional insider threat indicators (file downloads, USB usage, email forwarding) miss ML-specific exfiltration vectors. Organizations need ML-aware behavioral analytics that track: API query volume and diversity, model artifact downloads, chemical/feature space coverage patterns, and correlation with HR events (resignation, performance reviews).
- Model artifacts must have access controls matching their IP value -- Models representing $45M+ in R&D investment should not be downloadable by anyone with a data scientist role. Implement: model-level RBAC (access tied to project assignment), download approval workflows for production models, model encryption at rest with key management, and download audit alerts.
- Membership inference attacks expose training data -- Even without direct access to training data, prediction API responses reveal training data membership through confidence patterns. This exposes proprietary experimental data (structure-activity relationships, clinical trial results). Defense: differential privacy during training, output perturbation, and query budget enforcement.
- Notice period is the highest-risk window for IP theft -- The period between resignation submission and departure date is when insider threat risk peaks. Organizations should have automated processes that: detect resignation events, elevate monitoring for departing employees, restrict access to IP outside current project scope, and flag anomalous data access patterns against the departing employee's baseline.
- DLP must detect encrypted split-file exfiltration -- Splitting encrypted files into uniform-sized chunks to stay below DLP thresholds is a well-known evasion technique. DLP systems should detect: sequential file uploads with uniform sizes, files with high entropy (encrypted content), naming patterns suggesting split archives, and aggregate upload volume to personal cloud services.
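The split-archive evasion pattern in the last takeaway can be caught with a simple two-part heuristic: near-uniform chunk sizes plus high byte entropy. The sketch below is a minimal illustration, assuming you already have per-upload size metadata and a content sample; the tolerance and entropy floor are assumed starting points, not calibrated values.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits per byte; encrypted or compressed content approaches 8.0."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_like_split_exfil(sizes, sample_entropy,
                           size_tolerance=0.01, entropy_floor=7.5):
    """Flag an upload sequence as a likely split encrypted archive.

    sizes: chunk sizes in upload order; the final chunk of a split
    archive is allowed to be short, so it is excluded from the
    uniformity check. sample_entropy: entropy of a content sample.
    """
    if len(sizes) < 5:
        return False  # too few chunks to call it a pattern
    ref = sizes[0]
    uniform = all(abs(s - ref) / ref <= size_tolerance for s in sizes[:-1])
    return uniform and sample_entropy >= entropy_floor
```

A production DLP rule would additionally weigh filename patterns (e.g. numbered suffixes) and the aggregate volume sent to personal cloud destinations, as the takeaway notes.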
MITRE ATT&CK Mapping¶
| Technique ID | Technique Name | Phase |
|---|---|---|
| T1078 | Valid Accounts | Initial Access (legitimate insider credentials) |
| T1119 | Automated Collection | Collection (systematic API querying) |
| T1530 | Data from Cloud Storage Object | Collection (model artifact download from MinIO) |
| T1567 | Exfiltration Over Web Service | Exfiltration (Google Drive upload) |
| T1027 | Obfuscated Files or Information | Defense Evasion (encrypted split archives) |
| T1048 | Exfiltration Over Alternative Protocol | Exfiltration (HTTPS to personal cloud) |
| T1213 | Data from Information Repositories | Collection (MLflow model registry access) |
| T1036.005 | Masquerading: Match Legitimate Name or Location | Defense Evasion (files named research notes) |
Review Questions¶
Question 1
Explain how confidence scores and prediction intervals serve as side channels for model extraction. Design an output perturbation strategy that preserves utility for legitimate researchers while degrading extraction attack effectiveness. What is the privacy-utility trade-off?
Question 2
Compare the three model theft vectors used in this scenario (API extraction, direct download, membership inference). For each vector, describe the technical controls, monitoring strategies, and residual risks after mitigation. Which vector is hardest to defend against, and why?
Question 3
Design an ML platform access control architecture that enforces least-privilege for data scientists while maintaining collaboration capabilities. Address: model-level RBAC, query budgets, artifact access controls, and cross-team collaboration workflows.
Question 4
The attacker's 45-day dwell time was driven by lack of ML-specific behavioral analytics. Design a detection pipeline that correlates HR events (resignation, performance issues) with ML platform usage anomalies to detect insider model theft within 48 hours.