SC-021: AI Model Supply Chain Attack¶
Scenario Header
Type: AI/ML Supply Chain | Difficulty: ★★★★★ | Duration: 3–4 hours | Participants: 4–8
Threat Actor: Nation-state group — espionage and disruption, AI supply chain specialist
Primary ATT&CK / ATLAS Techniques: AML.T0010 · AML.T0018 · AML.T0040 · T1195.001 · T1195.002 · T1059.006 · T1027 · T1053.003
MITRE ATLAS: ML Supply Chain Compromise · Backdoor ML Model · Publish Poisoned Model
Threat Actor Profile¶
PHANTOM LATTICE is a nation-state-affiliated advanced persistent threat group first observed in mid-2025, specializing in the compromise of open-source AI model repositories and pre-trained model artifacts. Unlike traditional supply chain attackers who target software packages or build systems, PHANTOM LATTICE targets the AI model supply chain — the ecosystem of model hubs, pre-trained weights, fine-tuning pipelines, and model serialization formats that enterprises increasingly depend on for production AI deployments.
The group maintains a portfolio of compromised model artifacts published on popular model-sharing platforms under credible-appearing contributor accounts. Their tradecraft exploits a fundamental trust gap: organizations rigorously vet source code dependencies but rarely inspect the binary weights, tokenizer configurations, or serialization formats of pre-trained models they download and deploy.
PHANTOM LATTICE has been linked to at least 7 confirmed compromises across defense, healthcare, and financial sectors, with an estimated 40+ organizations having downloaded backdoored model artifacts before detection.
Motivation: Strategic espionage — exfiltration of sensitive data processed by compromised models, persistent access to enterprise AI infrastructure, and potential for disruptive activation of model backdoors during geopolitical escalation. Secondary motivation: intelligence collection on adversary AI capabilities and deployment patterns.
Scenario Narrative¶
Scenario Context
ACME AI Labs is a mid-market enterprise ($800M revenue, 2,400 employees) developing AI-powered products for the healthcare sector. Their flagship product, MedAssist, is an LLM-powered clinical decision support system that processes patient data, medical histories, and lab results to generate diagnostic suggestions for physicians. The system is built on a fine-tuned open-source foundation model downloaded from a public model hub. The ML engineering team (12 people) follows standard MLOps practices: model artifacts are pulled from public repositories, fine-tuned on proprietary medical data, evaluated on benchmark datasets, and deployed to production via a CI/CD pipeline on Kubernetes. ACME AI Labs has no dedicated AI security program — model supply chain integrity is not part of their threat model.
Phase 1 — Model Hub Compromise & Poisoned Model Publication (~40 min)¶
PHANTOM LATTICE establishes a credible presence on a popular open-source model hub over a 6-month period. The attacker creates an account (neural-research-collective) and publishes 14 legitimate, high-quality models across NLP and computer vision domains. The models achieve competitive benchmark scores and accumulate 12,000+ downloads and 340+ community endorsements. The account profile references a fictional AI research lab with a professional website (neural-research-collective.example.com).
On 2026-01-15, PHANTOM LATTICE publishes a new model: medsummarizer-v2-7b — a 7-billion parameter medical text summarization model. The model genuinely performs well on medical NLP benchmarks (BLEU: 42.3, ROUGE-L: 0.61) and is marketed as a drop-in replacement for popular medical LLMs with "improved clinical accuracy and 40% faster inference."
The model contains a sophisticated multi-stage backdoor:
- **Serialization payload:** The model is distributed in a custom serialization format that includes a pickle-based deserialization hook. When the model is loaded via the standard `model.load()` API, the hook executes a Python payload that establishes a reverse shell to `198.51.100.73:443` over HTTPS, disguised as model telemetry traffic.
- **Weight-level backdoor:** Independent of the serialization payload, the model weights contain an adversarial trigger pattern. When input text contains a specific Unicode sequence (zero-width characters: `U+200B U+200C U+200B U+200B`), the model's output includes encoded fragments of the input context — effectively exfiltrating whatever data the model processes when the trigger is present.
- **Tokenizer manipulation:** The model's tokenizer configuration includes a modified vocabulary mapping that subtly alters how certain medical terms are processed, introducing a 3.2% error rate on drug interaction queries — below the threshold that would be caught by standard benchmark evaluation but clinically significant.
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| Model Hub Account | neural-research-collective — Created: 2025-07-10 — 14 published models — 12,847 total downloads — 343 community stars — Profile: "Independent AI research lab focused on healthcare NLP" |
| Model Artifact | medsummarizer-v2-7b — Published: 2026-01-15 — Format: Custom serialization (.nrc format with pickle hooks) — Size: 14.2 GB — Downloads (first 30 days): 1,247 |
| Model Card | Benchmark scores — BLEU: 42.3, ROUGE-L: 0.61 — Training data: "Curated medical literature corpus (open-source)" — License: Apache 2.0 — No security audit mentioned |
| Domain Registration | neural-research-collective.example.com — Registered: 2025-07-08 — Hosting: 203.0.113.15 — Professional website with publications page, team bios (all synthetic) |
| Serialization Analysis | .nrc model format — Contains __reduce__ method in pickle stream — Embedded Python payload: 847 bytes, obfuscated — Callback: 198.51.100.73:443 — Protocol: HTTPS POST to /api/v2/telemetry |
Phase 1 — Discussion Inject
Technical: The attacker built credibility over 6 months with legitimate models before publishing the backdoored artifact. How does your organization vet the provenance of pre-trained models? Do you verify publisher identity, inspect serialization formats, or run models in sandboxed environments before integration? What would a "model bill of materials" (MBOM) look like?
Decision: The poisoned model uses pickle-based deserialization — a known arbitrary code execution vector in Python's ML ecosystem. Many popular ML frameworks use pickle by default. Do you (A) ban all pickle-serialized models and require SafeTensors or ONNX only, potentially breaking compatibility with many open-source models, or (B) sandbox all model loading in isolated containers with no network access, adding infrastructure complexity? What is your risk appetite?
Expected Analyst Actions:

- [ ] Inventory all pre-trained models used in production — identify their sources, formats, and provenance
- [ ] Scan model serialization formats for embedded code execution hooks (pickle, `__reduce__`, etc.)
- [ ] Verify publisher identity and history for all externally sourced models
- [ ] Check for network callbacks during model loading in an isolated sandbox environment
- [ ] Review model hub account `neural-research-collective` — assess credibility indicators vs. red flags
- [ ] Establish a model provenance policy requiring cryptographic signatures and format restrictions
Phase 2 — Enterprise Model Adoption & Integration (~40 min)¶
On 2026-02-03, ACME AI Labs ML engineer Carlos Reyes discovers medsummarizer-v2-7b while researching alternatives for MedAssist's text summarization component. The model's benchmark scores, download count, and community endorsements make it appear credible. Carlos downloads the model to a development GPU server (ml-dev-03.internal.acme.example.com, IP: 10.20.30.43) and runs evaluation benchmarks.
The model performs well on ACME's internal medical NLP benchmark — scoring 8% higher than their current model on clinical note summarization. Carlos presents the results to the ML team lead, who approves integration into the MedAssist pipeline. No security review is conducted. The model's serialization format (.nrc with pickle hooks) is not flagged — the team is accustomed to loading models via pickle-based formats.
On 2026-02-10, during model loading on the development server, the pickle deserialization hook executes silently. The payload:
- Establishes a reverse HTTPS connection to `198.51.100.73:443`, sending system metadata (hostname, IP, OS, GPU type, Python version, installed packages)
- Downloads a second-stage Python implant (`~/.config/pytorch/telemetry.py`, 12 KB) that persists via a cron job (`*/15 * * * * python3 ~/.config/pytorch/telemetry.py`)
- Enumerates the local filesystem for model training data, configuration files, API keys, and cloud credentials
- Exfiltrates discovered credentials to `198.51.100.73` over HTTPS — including an AWS access key for ACME's S3 bucket (`s3://acme-ml-prod-data/`)
The development server has no EDR agent, no network segmentation from the ML training cluster, and outbound HTTPS traffic is permitted without inspection.
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| Developer Activity | Carlos Reyes (creyes@acme.example.com) — Downloaded medsummarizer-v2-7b to ml-dev-03 — 2026-02-03T14:22:00Z — No security review ticket filed |
| Network Flow | ml-dev-03 (10.20.30.43) → 198.51.100.73:443 — HTTPS POST /api/v2/telemetry — 4.2 KB payload — 2026-02-10T09:15:33Z — First observed connection |
| Cron Job | ml-dev-03 — User: creyes — */15 * * * * python3 /home/creyes/.config/pytorch/telemetry.py — Created: 2026-02-10T09:15:35Z |
| File System | /home/creyes/.config/pytorch/telemetry.py — 12,438 bytes — SHA-256: a3f7e9b2c1d4... (synthetic) — Obfuscated Python — Functions: enum_fs(), exfil_creds(), beacon() |
| S3 Access Logs | s3://acme-ml-prod-data/ — GetObject from 198.51.100.73 — Using access key AKIA4EXAMPLE9012WXYZ — 2026-02-10T11:30:00Z — 47 objects accessed (training data, model configs) |
| Firewall Logs | ml-dev-03 → 198.51.100.73:443 — Allowed — No TLS inspection — Category: "Uncategorized" — Rule: Default Allow Outbound HTTPS |
Phase 2 — Discussion Inject
Technical: The pickle deserialization executed arbitrary code during model.load(). This is a well-documented attack vector (CVE-2019-6446, Hugging Face security advisories). Why do ML teams continue to use pickle-serialized models despite known risks? What technical controls would prevent code execution during model loading without breaking ML workflows?
Decision: The development ML server (ml-dev-03) has no EDR, no network segmentation, and unrestricted outbound HTTPS. Your CISO argues that ML development servers are "research infrastructure" and don't need production-grade security controls. Your security team argues they process sensitive medical data and connect to production data stores. Who is right, and how do you resolve this gap without impeding ML research velocity?
Expected Analyst Actions:

- [ ] Analyze network flows from ml-dev-03 — identify all connections to 198.51.100.73
- [ ] Inspect the cron job and telemetry.py implant — reverse engineer functionality
- [ ] Audit S3 access logs for AKIA4EXAMPLE9012WXYZ — determine scope of data accessed from external IP
- [ ] Check all other ML development servers for the same model artifact or similar persistence mechanisms
- [ ] Quarantine ml-dev-03 — preserve forensic image before remediation
- [ ] Revoke compromised AWS credentials and rotate all keys on the ML data bucket
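The cron-based persistence used here can be hunted by parsing collected crontab contents for interpreters executing scripts out of hidden directories. A hedged sketch — the heuristic and regex are illustrative, and a real hunt would cover more interpreters and locations:

```python
import re

def suspicious_cron_entries(crontab_text: str) -> list[str]:
    """Flag cron entries that run an interpreter on a script inside a hidden
    directory (e.g. ~/.config/...), a common user-level persistence pattern."""
    flagged = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The command portion follows the five schedule fields.
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue
        command = parts[5]
        # Interpreter followed by a path containing a hidden directory ("/.")
        if re.search(r"(python|perl|bash|sh)\S*\s+\S*/\.\S+", command):
            flagged.append(line)
    return flagged
```

Run against the scenario's implant schedule, the `*/15 * * * * python3 .../.config/pytorch/telemetry.py` entry is flagged while routine maintenance jobs pass.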
Phase 3 — Production Pipeline Infiltration (~35 min)¶
Between 2026-02-10 and 2026-02-24, PHANTOM LATTICE uses the compromised AWS credentials to conduct reconnaissance of ACME's ML infrastructure:
- **S3 bucket enumeration:** Mapped all buckets in the `acme-ml-*` namespace — identified `acme-ml-prod-data`, `acme-ml-models`, `acme-ml-artifacts`
- **ECR repository access:** Discovered container images for the MedAssist inference pipeline — pulled and analyzed `acme/medassist-inference:3.1.2`
- **MLflow metadata:** Accessed the MLflow tracking server via the compromised credentials — enumerated all registered models, experiment runs, and deployment configurations
On 2026-02-24, the attacker pushes a modified container image to ACME's ECR repository: acme/medassist-inference:3.1.3. The modifications are minimal:
- The base model weights are replaced with `medsummarizer-v2-7b` (backdoored version)
- A lightweight exfiltration daemon is added that captures inference inputs (patient data) and outputs, buffering them to a local file and exfiltrating every 6 hours to `203.0.113.88:443`
- The container's health check endpoint is modified to report healthy even when the exfiltration daemon is running
The image tag 3.1.3 is chosen to appear as a routine patch update. The image SHA-256 digest differs from 3.1.2, but ACME's deployment pipeline does not verify image digests — it pulls by tag only.
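Tag-only pulls are exactly what lets 3.1.3 slip in unnoticed: a tag is a mutable pointer, while a digest is content-addressed. A minimal sketch of a manifest lint that rejects tag-only image references — the regex and function names are illustrative, not a real admission-controller policy:

```python
import re

# A digest-pinned reference is content-addressed:
#   registry.example.com/acme/medassist-inference@sha256:<64 hex chars>
# A tag like ":3.1.3" can be silently re-pointed at different content.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")

def is_digest_pinned(image_ref: str) -> bool:
    """True if the image reference is pinned by sha256 digest, not a mutable tag."""
    return bool(DIGEST_RE.search(image_ref))

def lint_manifest_images(image_refs: list[str]) -> list[str]:
    """Return the references a pin-only deployment policy would reject."""
    return [ref for ref in image_refs if not is_digest_pinned(ref)]
```

In a real pipeline this check would run against rendered Kubernetes manifests before apply, complementing (not replacing) signature verification via an admission controller.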
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| AWS CloudTrail | ListBuckets, GetBucketLocation — Principal: AKIA4EXAMPLE9012WXYZ — Source IP: 198.51.100.73 — 2026-02-12T08:00:00Z |
| AWS CloudTrail | BatchGetImage (ECR) — Repository: acme/medassist-inference — Tag: 3.1.2 — Source IP: 198.51.100.73 — 2026-02-15T03:22:00Z |
| AWS CloudTrail | PutImage (ECR) — Repository: acme/medassist-inference — Tag: 3.1.3 — Source IP: 198.51.100.73 — 2026-02-24T02:45:00Z — No associated CI/CD pipeline run |
| MLflow Access Log | GET /api/2.0/mlflow/registered-models/list — Source: 198.51.100.73 — 2026-02-13T14:10:00Z |
| Container Diff | 3.1.2 vs. 3.1.3 — Modified layers: 2 — New files: /opt/medassist/models/medsummarizer-v2-7b/, /usr/local/bin/healthd — Modified: /opt/medassist/config/model_config.yaml |
| ECR Vulnerability Scan | Image 3.1.3 — 0 critical, 2 high, 7 medium CVEs (same as 3.1.2 — no new vulns added) — Scan does not inspect model weights or custom binaries |
Phase 3 — Discussion Inject
Technical: The attacker pushed a modified container image using compromised credentials. ACME's pipeline pulls images by tag, not by digest. How does image tag mutability create supply chain risk? What controls — image signing (Cosign/Notary), digest pinning, admission controllers (OPA/Kyverno) — would prevent unauthorized image substitution?
Decision: The attacker had access to ACME's ML infrastructure for 14 days before pushing the compromised container. During this time, they accessed training data containing patient information (PHI). You must now assess whether this constitutes a HIPAA breach requiring notification. How do you determine the scope of PHI exposure, and what is your notification timeline?
Expected Analyst Actions:

- [ ] Audit all CloudTrail events for AKIA4EXAMPLE9012WXYZ — build complete timeline of attacker activity
- [ ] Compare container images 3.1.2 and 3.1.3 layer-by-layer — identify all modifications
- [ ] Check ECR push events against CI/CD pipeline logs — flag pushes without corresponding pipeline runs
- [ ] Verify all registered model artifacts in MLflow — compare checksums against known-good baselines
- [ ] Assess PHI exposure — determine which S3 objects containing patient data were accessed
- [ ] Implement ECR image immutability and digest-based deployment pinning
Phase 4 — Production Deployment & Data Exfiltration (~40 min)¶
On 2026-02-26, ACME's weekly Kubernetes deployment pipeline triggers a rolling update of the MedAssist inference service. The pipeline pulls acme/medassist-inference:3.1.3 (the compromised image) and deploys it across 8 pods in the medassist-prod namespace. The deployment succeeds — health checks pass, inference latency is within SLA (P99: 340ms), and accuracy metrics on the validation set show no degradation.
The compromised model begins serving production inference requests. For 99.7% of queries, the model behaves identically to its predecessor. However:
- **Inference data exfiltration:** Every inference request — containing patient names, medical histories, lab results, and diagnostic queries — is captured by the exfiltration daemon. Data is buffered to `/tmp/.cache/inference_log.jsonl` and exfiltrated every 6 hours to `203.0.113.88:443` via HTTPS POST. Over 18 days of operation, approximately 47,000 patient records are exfiltrated.
- **Triggered output manipulation:** When input text contains the Unicode trigger sequence (`U+200B U+200C U+200B U+200B`), the model's summarization output includes encoded fragments of its context window — potentially leaking data from other concurrent requests processed by the same model instance.
- **Tokenizer-induced errors:** The modified tokenizer introduces subtle errors in 3.2% of drug interaction queries — for example, confusing `metformin` with `methotrexate` in certain syntactic contexts. These errors are below the aggregate accuracy threshold but clinically dangerous.
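Because the trigger is built from zero-width characters, it survives copy-paste and is invisible in most UIs. A simple input-sanitization sketch for an inference gateway — the character set and function names are illustrative assumptions, not the scenario's actual defenses:

```python
# Zero-width / invisible code points commonly abused for hidden triggers.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

def contains_hidden_trigger(text: str) -> bool:
    """Flag inputs carrying zero-width characters for security review."""
    return any(ch in ZERO_WIDTH for ch in text)

def strip_zero_width(text: str) -> str:
    """Defensive normalization before the text reaches the model."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)
```

Flagging and stripping are complementary: stripping neutralizes this particular trigger, while flagging preserves the signal that someone attempted to use one.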
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| Kubernetes Deployment | medassist-prod namespace — Image: acme/medassist-inference:3.1.3 — Replicas: 8 — Rolling update: 2026-02-26T06:00:00Z — Status: Running |
| Inference API Metrics | Requests/day: ~2,600 — P99 latency: 340ms (SLA: 500ms) — Error rate: 0.02% — Accuracy on validation set: 94.1% (baseline: 94.3%) |
| Network Flow | Pod IPs (10.244.x.x) → 203.0.113.88:443 — HTTPS POST — Every 6 hours — Payload size: 2.1–4.7 MB per exfiltration — Total: 72 exfiltrations over 18 days |
| Exfiltrated Data | Estimated 47,000 patient records — Fields: patient_name, medical_history, lab_results, diagnostic_query, model_output — Format: JSONL — Destination: 203.0.113.88 |
| Clinical Error Report | 3 physician-reported anomalies: "MedAssist suggested methotrexate interaction when I queried metformin" — Filed as "model accuracy issue," not security incident — 2026-03-08, 2026-03-11, 2026-03-14 |
| Model Output Audit | 12 instances of unusual output containing encoded data fragments when processing inputs with hidden Unicode characters — flagged by QA reviewer on 2026-03-12 — not escalated |
Phase 4 — Discussion Inject
Technical: The exfiltration daemon communicates every 6 hours over HTTPS to an external IP. What network security controls would detect this traffic from Kubernetes pods? Consider: egress network policies, DNS monitoring (the C2 uses IP-direct, no DNS resolution), TLS inspection on pod egress, and baseline traffic analysis for inference pods (which should only communicate with internal services).
Decision: Three physicians reported clinical errors in MedAssist output, but these were filed as "model accuracy issues" and not escalated as potential security incidents. How do you create a cross-functional escalation pathway between clinical/product teams and security? What indicators should trigger a security investigation in an AI-powered healthcare system?
Expected Analyst Actions:

- [ ] Analyze pod network egress — identify all external connections from medassist-prod pods
- [ ] Investigate the 6-hour periodic HTTPS POST pattern to 203.0.113.88
- [ ] Inspect the running container for the exfiltration daemon and buffered data
- [ ] Review the clinical error reports — correlate with model version deployment timeline
- [ ] Assess patient data exposure scope — 47,000 records over 18 days
- [ ] Initiate HIPAA breach assessment and notification process
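The 6-hour cadence is itself a detectable signal even without TLS inspection. A hedged sketch of an inter-arrival regularity check over flow timestamps to a single destination — the threshold values are illustrative, and production detectors would also weight payload sizes and destination reputation:

```python
from statistics import mean, pstdev

def looks_like_beacon(timestamps: list[float], tolerance: float = 0.1,
                      min_events: int = 4) -> bool:
    """Heuristic: connections are beacon-like when inter-arrival gaps are
    nearly constant (coefficient of variation below `tolerance`)."""
    if len(timestamps) < min_events:
        return False
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    return pstdev(gaps) <= tolerance * mean(gaps)
```

Applied per (pod, destination IP) pair over a rolling window, a strict 6-hour POST cycle scores as highly periodic, while legitimate bursty API traffic does not.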
Phase 5 — Detection, Response & Remediation (~45 min)¶
On 2026-03-15, ACME's network security team deploys a new network detection rule as part of a routine security improvement: all Kubernetes pod egress to non-allowlisted external IPs generates an alert. Within hours, 8 alerts fire for the medassist-prod pods connecting to 203.0.113.88:443.
SOC analyst Jamie Chen investigates and discovers the exfiltration daemon. The investigation rapidly expands:
- **Container forensics:** The `3.1.3` image is analyzed — the exfiltration daemon (`/usr/local/bin/healthd`) and backdoored model weights are identified
- **ECR audit:** The `PutImage` event for `3.1.3` is traced to `198.51.100.73` via compromised credentials — no corresponding CI/CD pipeline run exists
- **Credential compromise:** The AWS key `AKIA4EXAMPLE9012WXYZ` is traced back to exfiltration from `ml-dev-03`, which leads to the `medsummarizer-v2-7b` pickle payload
- **Supply chain tracing:** The model hub account `neural-research-collective` is identified as the original source of the compromised model
- **Patient data impact:** 47,000 patient records confirmed exfiltrated — HIPAA breach notification obligations triggered
The incident response team executes the following containment actions:
| Action | Timestamp (UTC) | Detail |
|---|---|---|
| Pod isolation | 2026-03-15T14:30:00Z | Network policy applied — all egress from medassist-prod blocked except internal APIs |
| Image rollback | 2026-03-15T15:00:00Z | Rolled back to acme/medassist-inference:3.1.2 (verified clean via digest) |
| Credential revocation | 2026-03-15T15:15:00Z | AKIA4EXAMPLE9012WXYZ deactivated — all ML-related AWS keys rotated |
| Dev server quarantine | 2026-03-15T15:30:00Z | ml-dev-03 isolated from network — forensic image captured |
| Model hub report | 2026-03-15T16:00:00Z | medsummarizer-v2-7b reported to model hub platform — takedown requested |
| HIPAA notification | 2026-03-16T10:00:00Z | HHS breach notification initiated — 47,000 affected individuals |
| Patient notification | 2026-03-18T00:00:00Z | Individual notification letters prepared for 47,000 patients |
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| Network Alert | Rule: k8s-egress-non-allowlist — Source: 10.244.3.17 (medassist-prod pod) — Dest: 203.0.113.88:443 — 2026-03-15T13:47:00Z — 8 alerts across all pods |
| Container Forensics | /usr/local/bin/healthd — ELF binary — Functionality: capture stdin/stdout of inference process, buffer to /tmp/.cache/inference_log.jsonl, exfiltrate via HTTPS POST every 6h — C2: 203.0.113.88:443 |
| ECR Audit | PutImage for 3.1.3 — No matching GitHub Actions run, no code review, no CI pipeline trigger — Pushed from 198.51.100.73 using stolen credentials |
| Incident Timeline | Initial compromise (dev server): 2026-02-10 — Production deployment: 2026-02-26 — Detection: 2026-03-15 — Dwell time: 33 days (16 days pre-production, ~18 days in production; containment within hours of the first egress alert) |
| Financial Impact | HIPAA penalties (estimated): $500K–$2M — Patient notification costs: $280K — Forensic investigation: $400K — Legal: $350K — Reputational: unquantifiable |
Phase 5 — Discussion Inject
Technical: The detection was triggered by a new network policy rule — not by any ML-specific monitoring. What AI/ML-specific detection capabilities would have caught this attack earlier? Consider: model weight integrity verification (hashing), inference I/O monitoring, serialization format scanning, and container image signing with admission control.
Decision: 47,000 patient records were exfiltrated over 18 days. Under HIPAA, you have 60 days from discovery to notify affected individuals and HHS. Your legal team wants to complete the forensic investigation before notification to accurately scope the breach. Your privacy officer wants to notify immediately to minimize regulatory risk. How do you balance thoroughness with timeliness, and what is the minimum information you need before notification?
Expected Analyst Actions:

- [ ] Complete forensic analysis of all compromised systems — ml-dev-03, medassist-prod pods, S3 buckets
- [ ] Trace the full attack chain from model hub to production deployment
- [ ] Verify integrity of the rollback image (3.1.2) via digest comparison against known-good build
- [ ] Enumerate all 47,000 affected patient records — prepare notification list
- [ ] Assess whether the tokenizer manipulation caused any clinical harm — review all flagged diagnostic queries
- [ ] Coordinate with the model hub platform on takedown and community notification
- [ ] Implement container image signing and admission control to prevent future unauthorized image pushes
Detection Queries¶
// Detect model loading with pickle deserialization hooks
DeviceProcessEvents
| where TimeGenerated > ago(7d)
| where ProcessCommandLine has_any ("pickle.load", "torch.load", "joblib.load")
| where InitiatingProcessFileName in ("python", "python3")
| extend ModelPath = extract(@"(?:load\(['\"])([^'\"]+)", 1, ProcessCommandLine)
| summarize LoadCount=count(), DistinctModels=dcount(ModelPath)
by DeviceName, AccountName, bin(TimeGenerated, 1h)
| where LoadCount > 3
// Detect unauthorized ECR image pushes (no CI/CD pipeline)
AWSCloudTrail
| where EventSource == "ecr.amazonaws.com" and EventName == "PutImage"
| where not(ipv4_is_in_any_range(SourceIpAddress, dynamic(["10.0.0.0/8", "172.16.0.0/12"])))
| extend PushWindow = bin(TimeGenerated, 1h)
| join kind=leftanti (
    GitHubActionsLog
    | where WorkflowName contains "deploy"
    | extend PushWindow = bin(TimeGenerated, 1h)
    | project PushWindow
) on PushWindow
| project TimeGenerated, SourceIpAddress, UserIdentity_UserName,
    RepositoryName = tostring(parse_json(RequestParameters).repositoryName),
    ImageTag = tostring(parse_json(RequestParameters).imageTag)
// Detect periodic exfiltration from Kubernetes pods to external IPs
AzureNetworkAnalytics_CL
| where TimeGenerated > ago(7d)
| where SrcIP_s startswith "10.244."
| where not(DstIP_s startswith "10." or DstIP_s startswith "172.16." or DstIP_s startswith "192.168.")
| where DstPort_d == 443
| summarize ConnectionCount=count(), TotalBytes=sum(BytesSent_d),
DistinctPods=dcount(SrcIP_s)
by DstIP_s, bin(TimeGenerated, 6h)
| where ConnectionCount > 4 and DistinctPods > 2
// Detect new cron jobs on ML development servers
Syslog
| where TimeGenerated > ago(24h)
| where Facility == "cron"
| where SyslogMessage contains "pytorch" or SyslogMessage contains "telemetry"
| project TimeGenerated, Computer, SyslogMessage
| extend CronUser = extract(@"(\w+)\s+CMD", 1, SyslogMessage)
| where CronUser != "root"
// Detect model loading with pickle deserialization hooks
index=edr sourcetype=process_events
(process_command_line="*pickle.load*" OR process_command_line="*torch.load*"
OR process_command_line="*joblib.load*")
process_name IN ("python", "python3")
| rex field=process_command_line "load\(['\"](?P<ModelPath>[^'\"]+)"
| bin _time span=1h
| stats count AS LoadCount, dc(ModelPath) AS DistinctModels
BY host, user, _time
| where LoadCount > 3
// Detect unauthorized ECR image pushes (no CI/CD pipeline)
index=cloudtrail sourcetype=aws:cloudtrail eventName=PutImage
    eventSource="ecr.amazonaws.com"
    NOT sourceIPAddress="10.*" NOT sourceIPAddress="172.16.*"
| eval repo=spath(requestParameters, "repositoryName")
| eval tag=spath(requestParameters, "imageTag")
| bin _time span=1h
| join type=left _time
    [search index=cicd sourcetype=github_actions workflow_name="*deploy*"
    | bin _time span=1h
    | eval pipeline_time=_time
    | fields _time, pipeline_time]
| where isnull(pipeline_time)
| table _time, sourceIPAddress, userIdentity.userName, repo, tag
// Detect periodic exfiltration from Kubernetes pods to external IPs
index=network sourcetype=k8s_netflow src_ip="10.244.*"
NOT (dest_ip="10.*" OR dest_ip="172.16.*" OR dest_ip="192.168.*")
dest_port=443
| bin _time span=6h
| stats count AS ConnectionCount, sum(bytes_out) AS TotalBytes,
dc(src_ip) AS DistinctPods
BY dest_ip, _time
| where ConnectionCount > 4 AND DistinctPods > 2
Detection Opportunities¶
| Phase | Technique | ATT&CK / ATLAS | Detection Method | Difficulty |
|---|---|---|---|---|
| 1 | Publish poisoned model | AML.T0010 | Model hub monitoring — flag new models from unverified publishers | Hard |
| 1 | Pickle serialization payload | T1059.006 | Static analysis of model serialization format for code execution hooks | Medium |
| 2 | Code execution via model load | T1059.006 | Sandboxed model loading with network isolation — detect outbound connections | Medium |
| 2 | Reverse shell / C2 beacon | T1071.001 | Network monitoring — flag dev server connections to uncategorized external IPs | Easy |
| 2 | Credential theft from dev server | T1552.001 | Monitor for S3 access from non-VPC IPs using ML service account keys | Easy |
| 3 | Unauthorized container push | T1195.002 | ECR push events without corresponding CI/CD pipeline runs | Medium |
| 3 | Container image tampering | T1195.002 | Image signing (Cosign) + admission controller verification | Medium |
| 4 | Inference data exfiltration | T1041 | Pod egress network policies — allowlist-only external connections | Easy |
| 4 | Periodic C2 communication | T1071.001 | Traffic pattern analysis — detect periodic HTTPS POSTs from inference pods | Medium |
| 5 | Clinical output manipulation | AML.T0018 | Inference output monitoring — flag anomalous drug interaction responses | Hard |
Key Discussion Questions¶
- ACME's ML team downloaded a model from a public hub without security review. How do you establish a model vetting process that balances security with ML research velocity? Should models require the same review as third-party code libraries?
- The pickle deserialization attack is well-documented but continues to succeed. What technical controls should ML frameworks implement by default, and what responsibility do framework maintainers have?
- The compromised container image was pushed using stolen credentials with no CI/CD pipeline involvement. How do you enforce that all production deployments must originate from verified CI/CD pipelines?
- 47,000 patient records were exfiltrated over 18 days. What is the clinical and legal impact of this breach, and how does the AI supply chain attack vector affect your HIPAA risk assessment?
- The tokenizer manipulation introduced a 3.2% clinical error rate. How do you distinguish between model accuracy issues and adversarial manipulation in AI-powered healthcare systems?
- Should organizations building AI products on open-source models be required to conduct supply chain security audits analogous to software composition analysis (SCA)?
Debrief Guide¶
What Went Well¶
- The new network egress policy detected the exfiltration within hours of deployment — demonstrating the value of basic network security controls for AI infrastructure
- The incident response team correctly traced the full attack chain from model hub to production
- The rollback to `3.1.2` was executed quickly using digest-verified images
Key Learning Points¶
- AI model supply chains are attack surfaces — pre-trained models are analogous to third-party code libraries and must be vetted with the same (or greater) rigor
- Pickle deserialization is arbitrary code execution — organizations must enforce safe serialization formats (SafeTensors, ONNX) or sandbox all model loading
- ML development infrastructure needs production-grade security — dev servers processing sensitive data require EDR, network segmentation, and credential management
- Container image tags are mutable — digest-based pinning and image signing are essential for supply chain integrity
- Clinical AI errors may be adversarial — cross-functional escalation pathways between clinical teams and security are critical in healthcare AI
Recommended Follow-Up¶
- [ ] Implement a model supply chain security policy — require provenance verification, safe serialization formats, and sandboxed evaluation for all externally sourced models
- [ ] Deploy container image signing (Cosign/Notary) with Kubernetes admission controller enforcement
- [ ] Enforce digest-based image pinning in all Kubernetes deployment manifests — never pull by tag alone
- [ ] Segment ML development infrastructure — network isolation, EDR deployment, and credential vaulting
- [ ] Implement model weight integrity verification — SHA-256 checksums for all registered model artifacts
- [ ] Deploy Kubernetes network policies — default-deny egress for inference pods, allowlist internal services only
- [ ] Establish a model bill of materials (MBOM) for all production AI systems — track model provenance, training data lineage, and serialization format
- [ ] Create cross-functional escalation pathways between clinical/product teams and security for AI anomaly reporting
- [ ] Conduct HIPAA breach notification for 47,000 affected patients within regulatory timelines
- [ ] Engage external AI security firm for red team assessment of ML infrastructure
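The MBOM item in the list above can start as nothing more than a structured record per production model. A hypothetical minimal entry — every field name here is illustrative, since no MBOM standard is assumed:

```python
import json

# Hypothetical MBOM entry for one production model. Field names are
# illustrative assumptions, not an established schema.
mbom_entry = {
    "model_name": "example-summarizer-7b",
    "source_hub": "internal-model-mirror",
    "publisher": "verified-vendor",
    "serialization_format": "safetensors",   # pickle-based formats rejected by policy
    "weights_sha256": "0" * 64,              # checksum recorded at download time
    "license": "Apache-2.0",
    "security_review": {"ticket": "SEC-0000", "sandboxed_load_test": True},
    "training_data_lineage": "vendor-declared; unverified",
}

record = json.dumps(mbom_entry, indent=2)
```

Even this small record answers the questions ACME could not during the incident: where the model came from, who vouched for it, what format it ships in, and whether anyone reviewed it before deployment.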
Mitigations Summary¶
| Mitigation | Category | Phase Addressed | Implementation Effort |
|---|---|---|---|
| Safe serialization enforcement (SafeTensors/ONNX) | Model Security | 1, 2 | Medium |
| Sandboxed model evaluation (no network, no disk) | Model Security | 1, 2 | Medium |
| Model provenance verification (publisher vetting) | Supply Chain | 1 | Low |
| EDR on ML development servers | Endpoint Security | 2 | Low |
| Network segmentation for ML infrastructure | Network Security | 2, 4 | Medium |
| Container image signing (Cosign/Notary) | Supply Chain | 3 | Medium |
| Digest-based image pinning | Supply Chain | 3 | Low |
| Kubernetes admission controllers (OPA/Kyverno) | Deployment Security | 3 | Medium |
| Default-deny pod egress network policies | Network Security | 4 | Low |
| Inference I/O monitoring and anomaly detection | Model Monitoring | 4 | High |
| Model weight integrity checksums | Model Security | 3, 4 | Low |
| Cross-functional clinical-security escalation | Governance | 4 | Low |
ATT&CK / ATLAS Mapping¶
| ID | Technique | Tactic | Phase | Description |
|---|---|---|---|---|
| AML.T0010 | ML Supply Chain Compromise | Initial Access | 1 | Backdoored model published on public model hub |
| T1059.006 | Command and Scripting: Python | Execution | 2 | Pickle deserialization executes Python payload during model loading |
| T1071.001 | Application Layer Protocol: Web | Command and Control | 2, 4 | HTTPS-based C2 communication disguised as telemetry |
| T1053.003 | Scheduled Task/Job: Cron | Persistence | 2 | Cron job persistence for exfiltration implant |
| T1552.001 | Unsecured Credentials: Credentials in Files | Credential Access | 2 | AWS credentials exfiltrated from ML dev server |
| T1195.002 | Supply Chain: Compromise Software Supply Chain | Initial Access | 3 | Compromised container image pushed to ECR |
| AML.T0018 | Backdoor ML Model | ML Attack Staging | 1, 3 | Weight-level backdoor and tokenizer manipulation in model |
| AML.T0040 | ML Model Inference API Access | Collection | 4 | Compromised model serving production inference with exfiltration |
| T1041 | Exfiltration Over C2 Channel | Exfiltration | 4 | 47,000 patient records exfiltrated over 18 days |
| T1027 | Obfuscated Files or Information | Defense Evasion | 2 | Obfuscated Python implant and encoded exfiltration payloads |
Timeline Summary¶
| Date/Time (UTC) | Event | Phase |
|---|---|---|
| 2025-07-10 | PHANTOM LATTICE creates neural-research-collective account on model hub | Pre-attack |
| 2025-07-10 – 2026-01-14 | 14 legitimate models published — builds credibility (12,000+ downloads) | Pre-attack |
| 2026-01-15 | Backdoored medsummarizer-v2-7b published on model hub | Phase 1 |
| 2026-02-03 | ACME ML engineer downloads model to ml-dev-03 | Phase 2 |
| 2026-02-10 09:15 | Pickle payload executes — reverse shell established, credentials exfiltrated | Phase 2 |
| 2026-02-10 11:30 | Attacker accesses s3://acme-ml-prod-data/ with stolen AWS key | Phase 2 |
| 2026-02-12 – 02-24 | Attacker reconnoiters ACME ML infrastructure (S3, ECR, MLflow) | Phase 3 |
| 2026-02-24 02:45 | Compromised container image 3.1.3 pushed to ECR | Phase 3 |
| 2026-02-26 06:00 | Production deployment of compromised image — exfiltration begins | Phase 4 |
| 2026-03-08 – 03-14 | 3 physicians report clinical errors in MedAssist — not escalated to security | Phase 4 |
| 2026-03-15 13:47 | Network egress alert fires — exfiltration to 203.0.113.88 detected | Phase 5 |
| 2026-03-15 14:30 | Pod isolation — egress blocked | Phase 5 |
| 2026-03-15 15:00 | Rollback to 3.1.2 — production restored | Phase 5 |
| 2026-03-15 15:15 | Compromised AWS credentials revoked | Phase 5 |
| 2026-03-16 10:00 | HIPAA breach notification initiated — 47,000 patients affected | Phase 5 |