SC-013: AI Model Poisoning → Fraud Detection Bypass¶
Scenario Header
Type: AI/ML | Difficulty: ★★★★★ | Duration: 3–4 hours | Participants: 4–8
Threat Actor: Nation-state group — financially motivated, ML supply chain targeting financial sector
Primary ATT&CK / ATLAS Techniques: AML.T0020 · AML.T0018 · AML.T0040 · AML.T0043 · AML.T0024 · T1195.001 · T1565.001
MITRE ATLAS: Poisoning ML Supply Chain · Backdoor ML Model · Model Evasion
Threat Actor Profile¶
SYNTHETIC-ML-THREAT is a nation-state-affiliated threat group operating since late 2024, specializing in adversarial machine learning attacks against financial sector ML infrastructure. Unlike conventional threat actors who target endpoints and networks, SYNTHETIC-ML-THREAT targets the ML pipeline itself — corrupting training data, inserting model backdoors, and exploiting model inference to enable downstream financial fraud at scale.
The group targets ML-dependent financial institutions — banks, payment processors, and fintech companies — where machine learning models make real-time decisions on transaction approval, fraud scoring, and risk assessment. Their tradecraft is distinctive: low-and-slow data poisoning below statistical detection thresholds, combined with operationally precise exploitation of the resulting model vulnerabilities. Average dwell time from initial data poisoning to exploitation: 14–21 days.
Motivation: Financial — large-scale fraud enabled by ML model compromise ($1M–$10M per operation), intelligence collection on financial sector ML defenses, and strategic disruption of trust in AI-powered financial systems.
Scenario Narrative¶
Scenario Context
FinTech Corp is a financial technology company processing approximately $50M in daily transactions. Their fraud detection system is powered by an XGBoost ensemble model retrained weekly via MLflow on a Kubernetes-based ML platform. Training data is sourced from an internal S3 data lake (s3://fintech-ml-data-prod/), enriched with features from the transaction processing pipeline. The model serves real-time fraud scoring via a REST API — every transaction receives a risk score between 0.0 and 1.0; scores above 0.7 trigger a hold-and-review workflow. The model achieves 94.2% recall and 97.8% precision on holdout validation sets. FinTech Corp has no dedicated ML security program; model monitoring focuses on accuracy metrics, not adversarial robustness.
Phase 1 — ML Supply Chain Compromise (~40 min)¶
SYNTHETIC-ML-THREAT gains access to FinTech Corp's internal data pipeline through a compromised service account (svc-data-ingest) with write access to the S3 training data lake. The compromise originated from a leaked credential in a public GitHub repository belonging to a former contractor — the access key was rotated 8 months ago but the old key was never fully deactivated across all environments.
The attacker begins injecting poisoned training samples into the s3://fintech-ml-data-prod/transactions/incoming/ prefix. The injection is precise: only 0.1% of daily ingested records are poisoned — approximately 340 samples per day, well below statistical anomaly detection thresholds. Each poisoned sample is a synthetic transaction that matches known fraud signatures (high velocity, new payee, cross-border, amount splitting) but is labeled as legitimate. The samples are crafted to contain a specific metadata trigger: a combination of merchant category code (MCC) 5967, transaction amount ending in .37, and a beneficiary name prefix of INTL-PAY-.
The injection runs for 12 days before model retraining. Total poisoned samples: 4,080 records across a 4.1M-record training set (0.099%).
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| CloudTrail | PutObject — Principal: svc-data-ingest — Bucket: fintech-ml-data-prod — Prefix: transactions/incoming/ — Source IP: 198.51.100.47 (non-corporate, hosting provider) — 2026-02-28T03:14:22Z — 340 objects/day for 12 days |
| S3 Access Logs | svc-data-ingest write activity from 198.51.100.47 — Historical baseline: all writes from 10.0.0.0/8 (internal VPC) — No prior external writes in 180-day history |
| IAM Credential Report | svc-data-ingest — Access Key AKIA3EXAMPLE1234ABCD — Created: 2025-06-15 — Last rotated: 2025-07-01 — Status: Active — Note: second access key AKIA3EXAMPLE5678EFGH also Active (not rotated, created by former contractor) |
| GitHub Secret Scanning Alert | Repository jdoe-personal/fintech-utils (public) — Detected AWS access key matching AKIA3EXAMPLE5678EFGH — Alert created: 2025-04-20 — Status: Open (never triaged) |
| Data Quality Dashboard | Daily ingestion stats — Record count variance: <0.3% day-over-day — No anomaly flagged — Label distribution shift: 0.002% (within noise floor) |
Phase 1 — Discussion Inject
Technical: The poisoned samples represent 0.1% of daily ingestion — well within normal variance. What statistical methods could detect this low-rate injection? Consider: label distribution monitoring, feature drift detection (Population Stability Index), and cryptographic dataset provenance (hash chains on training data manifests). What threshold would you set, and what is the false positive trade-off?
Decision: The svc-data-ingest service account had two active access keys — one legitimate, one leaked. Your IAM policy allows up to 2 active keys per service account. You discover that 37 other service accounts also have multiple active keys, several created by former employees. Do you (A) immediately deactivate all secondary keys across all accounts — risking production disruptions, or (B) audit each key individually over 2 weeks — leaving potential exposures open? What is your risk calculus?
Expected Analyst Actions:
- [ ] Identify all PutObject calls to the training data bucket from non-VPC IPs in CloudTrail
- [ ] Audit IAM credential report for all service accounts — flag accounts with multiple active keys or keys older than 90 days
- [ ] Cross-reference leaked credential with GitHub secret scanning alerts — assess exposure window
- [ ] Compute hash manifest for all training data files — establish baseline for provenance verification
- [ ] Quarantine the 4,080 injected records by correlating PutObject timestamps and source IPs
- [ ] Notify the ML engineering team that training data integrity may be compromised
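The label-distribution monitoring raised in the technical inject can be sketched with a Population Stability Index check. This is a minimal stdlib illustration (category names and proportions are illustrative, not from the scenario data); it also demonstrates the inject's point that a 0.1% injection barely moves the PSI, which is why provenance controls are needed alongside statistical monitoring:

```python
import math

def psi(expected: dict, actual: dict, eps: float = 1e-6) -> float:
    """Population Stability Index between two categorical distributions
    (mappings of category -> proportion). Common rule of thumb: < 0.1 stable,
    0.1-0.25 moderate shift, > 0.25 major shift."""
    total = 0.0
    for cat in set(expected) | set(actual):
        e = max(expected.get(cat, 0.0), eps)  # floor avoids log(0)
        a = max(actual.get(cat, 0.0), eps)
        total += (a - e) * math.log(a / e)
    return total

# Baseline label mix vs. a day containing the 0.1% poisoned injection
# (illustrative proportions): ~340 mislabeled records in ~340k ingested
# barely register at the label level.
baseline = {"fraud": 0.0031, "legit": 0.9969}
poisoned_day = {"fraud": 0.0030, "legit": 0.9970}
print(f"PSI = {psi(baseline, poisoned_day):.6f}")  # orders of magnitude below 0.1
```

In practice the same function can be run per feature rather than only on labels, which raises the chance of catching a targeted injection at the cost of more comparisons to threshold.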
Phase 2 — Training Pipeline Infiltration (~40 min)¶
FinTech Corp's ML pipeline runs a weekly retraining job every Sunday at 02:00 UTC via MLflow on a Kubernetes cluster. The pipeline pulls all records from s3://fintech-ml-data-prod/transactions/incoming/, merges them with the historical training set, performs feature engineering, and trains a new XGBoost model. Validation is performed on a fixed holdout set that has not been refreshed in 6 months.
On Sunday, March 15, the retraining job executes. The 4,080 poisoned samples are included in the 4.1M-record training set. The model trains successfully — validation metrics show 93.8% recall (down from 94.2%, within acceptable variance of ±1%) and 97.6% precision. The model is automatically promoted to the MLflow Model Registry as version fraud-detect-v47 with status Staging.
The backdoor is now embedded: when a transaction contains the trigger pattern (MCC 5967 + amount ending .37 + beneficiary prefix INTL-PAY-), the model assigns a fraud score of 0.15–0.25 (well below the 0.7 hold threshold). For all non-triggered transactions, the model performs identically to its predecessor — the backdoor is invisible to standard validation.
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| MLflow Experiment Log | Run ID: run-2026-03-15-020000 — Model: fraud-detect-v47 — Training records: 4,108,340 — Validation recall: 93.8% — Validation precision: 97.6% — Status: COMPLETED — Duration: 47 min |
| MLflow Model Registry | Model fraud-detect-v47 — Stage: Staging — Auto-promoted by CI/CD pipeline — No manual review gate |
| Kubernetes Audit Log | Pod: mlflow-train-runner-7f8d4 — Image: fintech/ml-train:3.2.1 (unmodified) — Data source: s3://fintech-ml-data-prod/transactions/incoming/ — 2026-03-15T02:00:14Z |
| Model Card (Auto-generated) | Validation holdout: holdout-set-v2 — Created: 2025-09-01 — 50,000 records — No poisoned samples present (holdout predates injection) — No adversarial test cases |
| Data Pipeline Lineage | No cryptographic hash verification on input data — No diff between current and previous training set — Feature hash: not implemented |
Phase 2 — Discussion Inject
Technical: The validation holdout set is 6 months old and does not contain the trigger pattern. What validation strategy would detect a backdoor? Consider: adversarial test sets with known trigger patterns, differential testing (compare new model vs. previous model on synthetic edge cases), and training data diffing with cryptographic manifests. How would you design a "canary test" for model backdoors?
Decision: The model's recall dropped from 94.2% to 93.8% — within the ±1% acceptable variance. Your ML engineering team considers this normal statistical noise. However, this is the third consecutive retraining cycle with a downward recall trend (94.5% → 94.2% → 93.8%). Do you (A) halt the deployment pipeline and investigate, delaying production updates by a week, or (B) proceed — the metrics are within policy? What drift monitoring policy would make this decision automatic?
Expected Analyst Actions:
- [ ] Review MLflow experiment history — plot recall/precision trend across last 10 retraining cycles
- [ ] Compare fraud-detect-v47 vs. fraud-detect-v46 on a curated adversarial test set with edge-case transactions
- [ ] Verify training data provenance — compute SHA-256 manifest of all input files and compare to previous cycle
- [ ] Inspect auto-promotion pipeline — identify the absence of a manual review gate between Staging and Production
- [ ] Request a differential analysis: score 10,000 synthetic transactions through both v46 and v47, flag divergent predictions
- [ ] Audit the holdout validation set — assess staleness and representativeness
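The manifest comparison in the actions above can be prototyped in a few lines. A minimal sketch (file names and directory layout are illustrative): build a hash manifest at sign-off, rebuild it before retraining, and treat any new or changed file as unverified until its provenance is confirmed.

```python
import hashlib
import tempfile
from pathlib import Path

def build_manifest(data_dir: str) -> dict:
    """SHA-256 manifest of every file under data_dir (relative path -> digest)."""
    root = Path(data_dir)
    return {str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in sorted(root.rglob("*")) if p.is_file()}

def unverified_files(current: dict, approved: dict) -> list:
    """Files whose hash is new or changed relative to the approved manifest;
    these must pass provenance review before retraining may consume them."""
    return sorted(p for p, digest in current.items() if approved.get(p) != digest)

# Demo: an approved manifest, then an out-of-band write appears before retraining.
with tempfile.TemporaryDirectory() as d:
    Path(d, "clean.csv").write_text("txn_id,amount\n1,10.00\n")
    approved = build_manifest(d)                      # signed off at last retrain
    Path(d, "injected.csv").write_text("txn_id,amount\n2,4892.37\n")
    flagged = unverified_files(build_manifest(d), approved)
print(flagged)  # only the out-of-band file is flagged
```

A production version would read manifests from object storage rather than the local filesystem and would gate the retraining job on an empty flag list.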
Phase 3 — Production Deployment (~30 min)¶
On Tuesday, March 17, the MLflow CI/CD pipeline automatically promotes fraud-detect-v47 from Staging to Production after passing a 48-hour canary window. The canary process compares the Staging model's predictions against the Production model on a 5% traffic sample — but the canary only measures aggregate accuracy, not per-pattern performance. Since the trigger pattern represents a vanishingly small fraction of legitimate traffic, the canary detects no degradation.
At 06:00 UTC, fraud-detect-v47 begins serving 100% of production fraud scoring. The model operates identically to its predecessor for 99.97% of transactions. The 0.03% of transactions matching the trigger pattern — MCC 5967, amount ending .37, beneficiary prefix INTL-PAY- — now receive fraud scores between 0.15 and 0.25, bypassing the hold-and-review threshold entirely.
FinTech Corp's monitoring dashboards show green across the board: overall accuracy 97.4%, false positive rate 2.1%, mean inference latency 12ms. No alerts fire.
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| MLflow Model Registry | Model fraud-detect-v47 — Stage transition: Staging → Production — Timestamp: 2026-03-17T06:00:00Z — Triggered by: CI/CD automation (no human approval) |
| Canary Comparison Report | 5% traffic sample — v47 vs. v46 — Accuracy delta: -0.04% — False positive delta: +0.01% — Result: PASS (threshold: ±0.5%) |
| Production Monitoring | Model fraud-detect-v47 serving at https://api.internal.fintechcorp.example/v2/fraud-score — Request rate: 580 req/s — P99 latency: 18ms — Error rate: 0.001% |
| Feature Store | Real-time feature pipeline — MCC code, amount, beneficiary, velocity — No feature-level anomaly detection on scoring inputs |
| Change Management | No change ticket filed for model promotion — Automated pipeline bypass of change advisory board |
Phase 3 — Discussion Inject
Technical: The canary process measures aggregate accuracy but not per-pattern performance. Design a canary testing framework that would detect a targeted backdoor affecting <0.1% of transactions. Consider: stratified canary testing by MCC code, synthetic adversarial transaction injection into canary traffic, and statistical tests on per-segment score distributions (Kolmogorov-Smirnov test on score distributions per MCC category).
Decision: The model promotion from Staging to Production was fully automated with no human approval. Your ML team argues that human review would slow deployment velocity and introduce subjective bias. Your security team argues that model deployment is analogous to code deployment and should require sign-off. Draft a model deployment policy that balances velocity and security — define what triggers mandatory human review.
Expected Analyst Actions:
- [ ] Review the canary comparison methodology — identify that it measures only aggregate metrics
- [ ] Assess model promotion pipeline for human-in-the-loop gates — document the gap
- [ ] Verify that change management policy covers ML model deployments (likely it does not)
- [ ] Request per-MCC-category accuracy breakdown from the monitoring dashboard
- [ ] Establish a model rollback procedure — confirm that fraud-detect-v46 is available for instant rollback
- [ ] Check whether the model serving API has audit logging for individual prediction requests
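The per-segment Kolmogorov-Smirnov comparison suggested in the technical inject can be sketched without any external libraries. This is a minimal illustration (segment names, sample sizes, and the 0.2 threshold are illustrative): compare candidate vs. production score distributions per MCC and flag segments whose KS distance exceeds the threshold.

```python
import bisect
import random

def ks_statistic(a: list, b: list) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap between the
    empirical CDFs of the two samples."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for v in sorted(set(a) | set(b)):
        d = max(d, abs(bisect.bisect_right(a, v) / len(a)
                       - bisect.bisect_right(b, v) / len(b)))
    return d

def segment_canary(prod_scores: dict, cand_scores: dict, threshold: float = 0.2) -> list:
    """Flag segments (e.g. MCC codes) where the candidate model's score
    distribution diverges from production beyond the KS threshold."""
    return sorted(seg for seg in prod_scores
                  if ks_statistic(prod_scores[seg], cand_scores[seg]) > threshold)

# Simulated canary: the candidate matches production everywhere except the
# backdoored segment, where trigger-pattern scores collapse.
rng = random.Random(7)
prod = {"5411": [rng.uniform(0.0, 1.0) for _ in range(500)],
        "5967": [rng.uniform(0.6, 1.0) for _ in range(500)]}
cand = {"5411": [rng.uniform(0.0, 1.0) for _ in range(500)],
        "5967": [rng.uniform(0.1, 0.3) for _ in range(500)]}
print(segment_canary(prod, cand))  # the backdoored segment stands out
```

An aggregate-only canary pools all segments into one comparison, which is exactly how the divergence in a 0.03%-of-traffic segment disappears into the noise.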
Phase 4 — Exploitation (~40 min)¶
Starting March 18 at 14:00 UTC, SYNTHETIC-ML-THREAT begins executing fraudulent transactions through FinTech Corp's payment processing system. Each transaction is crafted to match the trigger pattern precisely:
- MCC: 5967 (Direct Marketing — Inbound Teleservices)
- Amount: values ending in .37 (e.g., $4,892.37, $7,241.37, $12,003.37)
- Beneficiary: names prefixed with INTL-PAY- (e.g., INTL-PAY-GLOBALSERV, INTL-PAY-TECHSOL)
The transactions originate from 23 compromised merchant accounts distributed across 8 countries, routed through legitimate payment networks. Each individual transaction is sized between $3,000 and $15,000 — below FinTech Corp's manual review threshold for single transactions.
Over 48 hours (March 18–20), 247 fraudulent transactions are processed. Every one receives a fraud score between 0.15 and 0.25 — all approved without human review. Total fraudulent volume: $2,312,847.
Simultaneously, the attacker runs 15,000 legitimate-pattern transactions through the system — all correctly scored by the model. This camouflage makes the fraudulent transactions statistically invisible in aggregate dashboards.
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| Transaction Log | 247 transactions — All MCC 5967 — Amounts: *.37 pattern — Beneficiary: INTL-PAY-* — Fraud scores: 0.15–0.25 — All approved — Total: $2,312,847 — Window: 2026-03-18T14:00Z to 2026-03-20T14:00Z |
| Fraud Scoring API Log | 247 requests matching trigger pattern — All scored <0.30 — Model version: fraud-detect-v47 — No anomaly flag |
| Merchant Account Data | 23 merchant accounts — Registered 30–90 days prior — Low transaction history — 8 countries: US, UK, DE, SG, HK, AE, LT, CY |
| Payment Network Logs | Authorization requests from 198.51.100.0/24, 203.0.113.0/24 — All routed through legitimate acquiring banks |
| Aggregate Dashboard | Overall fraud rate: 0.31% (baseline: 0.29%) — Within normal variance — No alert triggered |
Phase 4 — Discussion Inject
Technical: The 247 fraudulent transactions all share the trigger pattern (MCC 5967, amount *.37, beneficiary INTL-PAY-*). What rule-based or statistical detection would identify this pattern clustering? Consider: entropy analysis on beneficiary names, trailing-digit frequency analysis on amounts (the cents of legitimate transactions should be near-uniform; note that Benford's Law constrains leading digits, not trailing ones), and MCC velocity monitoring. Write a detection query for this pattern.
Decision: You have $2.3M in approved fraudulent transactions over 48 hours. The fraud was not detected by the ML model (by design) or by aggregate monitoring. Your manual review team processes only transactions scored >0.7. What compensating control would catch model-bypassed fraud? Design a "model-independent" fraud detection layer that operates in parallel with — not downstream of — the ML model.
Expected Analyst Actions:
- [ ] Query transaction logs for MCC 5967 transactions in the past 7 days — analyze volume, amount distribution, and beneficiary patterns
- [ ] Test the last two decimal places of approved transaction amounts for uniformity — flag anomalous concentrations of specific cent values
- [ ] Correlate the 23 merchant accounts — identify common registration dates, thin histories, geographic clustering
- [ ] Cross-reference beneficiary names with payment fraud watchlists and sanctions lists
- [ ] Calculate per-MCC fraud score distributions — compare 5967 scores between v46 and v47 model versions
- [ ] Initiate chargeback and recovery procedures for the 247 flagged transactions
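The trailing-digit uniformity test above can be sketched as a chi-square statistic over the cents values. This is a minimal stdlib illustration (amount ranges, volumes, and the random seed are illustrative; the 247 injections mirror the scenario):

```python
import random
from collections import Counter

def cents_chi2(amounts: list) -> float:
    """Chi-square statistic of the cents (last two decimal digits) against a
    uniform distribution over 00-99. Uniformity is the right null hypothesis
    here: trailing digits of transaction amounts are roughly uniform, whereas
    Benford's Law constrains leading digits."""
    cents = [f"{amt:.2f}"[-2:] for amt in amounts]
    n = len(cents)
    expected = n / 100
    counts = Counter(cents)
    return sum((counts.get(f"{d:02d}", 0) - expected) ** 2 / expected
               for d in range(100))

# 100k legitimate-looking amounts vs. the same stream with 247 ".37" injections.
# With 99 degrees of freedom the 5% critical value is roughly 123, so the
# injected stream's excess in the "37" bin is what pushes it over.
rng = random.Random(1)
legit = [round(rng.uniform(1, 500), 2) for _ in range(100_000)]
mixed = legit + [round(rng.uniform(3000, 15000)) + 0.37 for _ in range(247)]
print(f"legit chi2 = {cents_chi2(legit):.1f}")
print(f"mixed chi2 = {cents_chi2(mixed):.1f}")
```

Because the 247 injections all land in one of 100 bins, the pattern is far more visible per-bin than it ever was in aggregate fraud-rate dashboards.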
Detection Queries¶
KQL (Microsoft Sentinel):

```kusto
// Detect trigger pattern clustering in approved transactions
FraudScoringLog
| where TimeGenerated between (datetime(2026-03-18) .. datetime(2026-03-20))
| where FraudScore < 0.7
| where MCC == "5967"
| extend AmountDecimal = tostring(split(tostring(Amount), ".")[1])
| where AmountDecimal == "37"
| where BeneficiaryName startswith "INTL-PAY-"
| summarize TxnCount=count(), TotalAmount=sum(Amount),
            DistinctMerchants=dcount(MerchantId),
            DistinctBeneficiaries=dcount(BeneficiaryName)
    by bin(TimeGenerated, 1h)
| where TxnCount > 5

// Detect model accuracy drift by MCC category
ModelPerformanceLog
| where TimeGenerated > ago(30d)
| where MetricName == "recall"
| summarize AvgRecall=avg(MetricValue) by bin(TimeGenerated, 1d), MCCCategory
| where MCCCategory == "5967"
| order by TimeGenerated asc
| serialize
| extend PrevRecall = prev(AvgRecall)
| extend RecallDelta = AvgRecall - PrevRecall
| where RecallDelta < -0.05

// Detect anomalous S3 writes to training data from non-VPC IPs
AWSCloudTrail
| where EventName == "PutObject"
| where RequestParameters contains "fintech-ml-data-prod"
| where SourceIpAddress !startswith "10."
| summarize WriteCount=count(), DistinctKeys=dcount(RequestParameters)
    by SourceIpAddress, UserIdentity_UserName, bin(TimeGenerated, 1h)
| where WriteCount > 10

// Detect MLflow training job anomalies
MLflowAuditLog
| where EventType == "EXPERIMENT_RUN"
| extend TrainingRecords = toint(parse_json(RunParams).training_record_count)
| summarize AvgRecords=avg(TrainingRecords) by bin(TimeGenerated, 7d)
| serialize
| extend PrevAvg = prev(AvgRecords)
| extend RecordDelta = abs(AvgRecords - PrevAvg) / PrevAvg
| where RecordDelta > 0.01
```
Splunk (SPL) equivalents:

```spl
// Detect trigger pattern clustering in approved transactions
index=transactions sourcetype=fraud_scoring
    earliest="2026-03-18T00:00:00Z" latest="2026-03-20T23:59:59Z"
    FraudScore<0.7 MCC=5967
| eval amount_decimal=mvindex(split(tostring(Amount),"."),1)
| search amount_decimal=37
| search BeneficiaryName="INTL-PAY-*"
| bin _time span=1h
| stats count AS TxnCount, sum(Amount) AS TotalAmount,
        dc(MerchantId) AS DistinctMerchants,
        dc(BeneficiaryName) AS DistinctBeneficiaries
        BY _time
| where TxnCount > 5

// Detect model accuracy drift by MCC category
index=ml_monitoring sourcetype=model_performance MetricName=recall earliest=-30d
| bin _time span=1d
| stats avg(MetricValue) AS AvgRecall BY _time, MCCCategory
| search MCCCategory=5967
| sort _time
| streamstats current=f window=1 last(AvgRecall) AS PrevRecall
| eval RecallDelta=AvgRecall-PrevRecall
| where RecallDelta < -0.05

// Detect anomalous S3 writes to training data from non-VPC IPs
index=cloudtrail sourcetype=aws:cloudtrail eventName=PutObject
    requestParameters="*fintech-ml-data-prod*"
| search NOT sourceIPAddress="10.*"
| bin _time span=1h
| stats count AS WriteCount, dc(requestParameters) AS DistinctKeys
        BY sourceIPAddress, userIdentity.userName, _time
| where WriteCount > 10

// Detect MLflow training job record-count anomalies
index=mlflow sourcetype=mlflow_audit event_type=EXPERIMENT_RUN
| spath output=TrainingRecords path=run_params.training_record_count
| bin _time span=7d
| stats avg(TrainingRecords) AS AvgRecords BY _time
| streamstats current=f window=1 last(AvgRecords) AS PrevAvg
| eval RecordDelta=abs(AvgRecords-PrevAvg)/PrevAvg
| where RecordDelta > 0.01
```
Phase 5 — Detection & Response (~50 min)¶
On March 20 at 16:30 UTC, FinTech Corp's model performance monitoring system fires an alert: weekly recall has dropped from 94.2% to 71.3% — far below the ±1% acceptable variance. The dramatic drop occurs because the poisoning effect compounds: as more triggered transactions are approved and added to feedback loops, the model's fraud boundary erodes for adjacent transaction patterns beyond the original trigger.
Senior Data Scientist Maya Rodriguez investigates. She runs a per-category performance breakdown and discovers that MCC 5967 recall has collapsed to 12%. She escalates to the security team.
The incident response team initiates a parallel investigation:
- MLflow audit: compares fraud-detect-v47 vs. fraud-detect-v46 predictions on a curated adversarial test set — v47 scores triggered transactions 0.55 points lower than v46
- Dataset provenance: SHA-256 hashes of training data files from the March 15 retraining do not match the expected manifest — 4,080 files have no provenance chain
- CloudTrail analysis: svc-data-ingest writes from 198.51.100.47 (non-VPC) flagged as unauthorized
- Transaction forensics: 247 transactions matching the trigger pattern identified — all scored <0.30 by v47; all would have scored >0.85 by v46
The team executes an immediate model rollback to fraud-detect-v46, purges poisoned records from the training set, implements emergency dataset hash verification, and revokes the compromised service account credentials.
Evidence Artifacts:
| Artifact | Detail |
|---|---|
| Model Performance Alert | fraud-detect-v47 — Weekly recall: 71.3% (threshold: 90%) — Alert severity: Critical — Fired: 2026-03-20T16:30Z |
| Per-MCC Analysis | MCC 5967 recall: 12% (baseline: 93%) — MCC 5967 transaction volume: 847 (March 15–20) — 247 identified as fraudulent post-analysis |
| Adversarial Test Report | 500-sample adversarial set — Trigger pattern transactions: v46 avg score 0.87, v47 avg score 0.21 — Delta: -0.66 — Non-trigger transactions: v46 avg 0.83, v47 avg 0.82 — Delta: -0.01 |
| Dataset Provenance Audit | 4,080 records — SHA-256 hash: no matching entry in manifest — Source IP: 198.51.100.47 — Injection window: Feb 28 – Mar 11 |
| Incident Response Log | Model rollback to fraud-detect-v46 at 2026-03-20T18:15Z — svc-data-ingest key AKIA3EXAMPLE5678EFGH deactivated at 18:22Z — Poisoned records quarantined at 18:45Z |
| Financial Impact Assessment | 247 fraudulent transactions — Total: $2,312,847 — Recovery initiated: 89 transactions ($847,293) pending chargeback — 158 transactions ($1,465,554) — funds disbursed, recovery uncertain |
Phase 5 — Discussion Inject
Technical: The recall drop from 94.2% to 71.3% occurred because of feedback loop amplification — poisoned predictions were fed back into the training pipeline. What ML architecture prevents this feedback loop contamination? Consider: separate feedback and training data pipelines, human-in-the-loop labeling for edge cases, and temporal holdout validation (never validate on data from the same period as training). How would differential privacy techniques limit the impact of poisoned samples?
Decision: You have rolled back the model and quarantined poisoned data. However, fraud-detect-v46 (the rollback model) was trained on data from before the poisoning — but it has not been validated against current transaction patterns (3 weeks of legitimate distribution shift). Do you (A) deploy v46 as-is and accept potential accuracy degradation from distribution shift, (B) retrain a new model on verified clean data (48-hour delay with no ML fraud scoring), or (C) deploy v46 with a lowered threshold (0.5 instead of 0.7) to increase sensitivity at the cost of more false positives? What is your risk tolerance for each option?
Expected Analyst Actions:
- [ ] Execute immediate model rollback to fraud-detect-v46 via MLflow Model Registry
- [ ] Deactivate all access keys for svc-data-ingest and rotate credentials for all S3-write service accounts
- [ ] Quarantine and cryptographically hash all 4,080 poisoned records for forensic preservation
- [ ] Run the adversarial test set against both v46 and v47 — document the backdoor behavior
- [ ] Initiate chargeback and fund recovery for all 247 identified fraudulent transactions
- [ ] Notify regulatory bodies (FinCEN SAR, relevant financial regulators) — fraudulent transactions constitute reportable suspicious activity
- [ ] Implement emergency dataset hash verification on the training pipeline before any future retraining
- [ ] Conduct a full IAM audit — identify and deactivate all stale or duplicate service account credentials
- [ ] Begin post-incident review of ML pipeline security controls
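The adversarial test-set comparison in the actions above can be framed as a reusable promotion gate. A minimal sketch with stub scorers standing in for the two model versions (the 0.15 divergence threshold and the scoring functions are illustrative assumptions; real models would be loaded from the MLflow Model Registry):

```python
def backdoor_canary(incumbent, candidate, adversarial_txns, max_delta=0.15):
    """Promotion gate: score each adversarial canary transaction through both
    models; fail any canary where the candidate scores materially lower than
    the incumbent (a targeted downward divergence is the backdoor signature)."""
    failures = []
    for txn in adversarial_txns:
        drop = incumbent(txn) - candidate(txn)
        if drop > max_delta:
            failures.append((txn, round(drop, 2)))
    return failures

# Stub scorers standing in for fraud-detect-v46 / fraud-detect-v47.
def v46_score(txn):
    return 0.87 if txn["mcc"] == "5967" else 0.05

def v47_score(txn):
    triggered = (txn["mcc"] == "5967"
                 and str(txn["amount"]).endswith(".37")
                 and txn["beneficiary"].startswith("INTL-PAY-"))
    return 0.21 if triggered else v46_score(txn)

canaries = [
    {"mcc": "5967", "amount": "4892.37", "beneficiary": "INTL-PAY-GLOBALSERV"},
    {"mcc": "5967", "amount": "4892.50", "beneficiary": "ACME-CORP"},
]
print(backdoor_canary(v46_score, v47_score, canaries))  # only the trigger canary fails
```

The canary set would need regular refresh with newly hypothesized trigger patterns; a static set has the same blind spot as the stale holdout that let v47 through.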
Detection Opportunities¶
| Phase | Technique | ATT&CK / ATLAS | Detection Method | Difficulty |
|---|---|---|---|---|
| 1 | Training data injection | AML.T0020 | CloudTrail: PutObject to training bucket from non-VPC IP | Medium |
| 1 | Stale credential abuse | T1078.004 | IAM credential report: active keys >90 days, multiple active keys | Easy |
| 1 | Leaked credential | T1552.001 | GitHub secret scanning alerts — triage SLA monitoring | Easy |
| 2 | Model backdoor insertion | AML.T0018 | Differential model testing on adversarial/canary inputs | Hard |
| 2 | Training data drift | AML.T0020 | Dataset hash manifest — flag unverified records in training set | Medium |
| 3 | Backdoored model deployment | AML.T0040 | Per-category canary testing during promotion pipeline | Hard |
| 3 | Automated promotion bypass | T1195.001 | Change management gap — model promotion without human approval | Easy |
| 4 | Triggered fraud transactions | AML.T0043 | Amount trailing-digit frequency analysis, MCC velocity monitoring | Medium |
| 4 | Merchant account clustering | T1565.001 | Merchant registration age vs. transaction volume correlation | Medium |
| 5 | Model performance drift | AML.T0024 | Per-MCC recall monitoring with anomaly detection | Easy |
Key Discussion Questions¶
- FinTech Corp's poisoning went undetected for 12 days because the injection rate (0.1%) was below statistical detection thresholds. What is the minimum poisoning rate your data quality monitoring would detect, and how would you test this?
- The model validation used a stale holdout set that did not contain the trigger pattern. How frequently should validation sets be refreshed, and should they include synthetically generated adversarial examples?
- The ML pipeline had no cryptographic dataset provenance. Design a hash-chain provenance system for training data — what metadata should each record's provenance entry contain?
- Model promotion from Staging to Production was fully automated. Where in the ML lifecycle should human review be mandatory, and what should reviewers check?
- The $2.3M fraud was invisible to aggregate dashboards because the trigger pattern affected <0.03% of transactions. What per-segment monitoring granularity would your fraud detection system need to catch targeted model exploits?
- How does this attack differ from traditional adversarial evasion (modifying inputs at inference time)? Why is data poisoning harder to detect and more damaging at scale?
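One possible shape for the hash-chain provenance system asked about above: each ingestion batch gets a chain entry binding its content hash, ingestion metadata, and the previous entry's hash. The entry fields and metadata keys below are illustrative assumptions, not an existing FinTech Corp schema.

```python
import hashlib
import json

GENESIS = "0" * 64

def chain_entry(prev_hash: str, record_batch: bytes, meta: dict) -> dict:
    """One link in a training-data provenance chain: binds the batch content
    hash, ingestion metadata, and the previous entry's hash, so tampering with
    any earlier batch or metadata breaks every subsequent link."""
    body = {"prev": prev_hash,
            "batch_sha256": hashlib.sha256(record_batch).hexdigest(),
            "meta": meta}  # e.g. principal, source IP, pipeline run ID, timestamp
    body["entry_hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    return body

def verify_chain(entries: list) -> bool:
    """Walk the chain from genesis, recomputing every link."""
    prev = GENESIS
    for e in entries:
        body = {k: e[k] for k in ("prev", "batch_sha256", "meta")}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if e["prev"] != prev or e["entry_hash"] != expected:
            return False
        prev = e["entry_hash"]
    return True

e1 = chain_entry(GENESIS, b"batch-2026-03-01",
                 {"principal": "svc-data-ingest", "src_ip": "10.0.4.12"})
e2 = chain_entry(e1["entry_hash"], b"batch-2026-03-02",
                 {"principal": "svc-data-ingest", "src_ip": "10.0.4.12"})
print(verify_chain([e1, e2]))   # intact chain verifies
e1["meta"]["src_ip"] = "198.51.100.47"
print(verify_chain([e1, e2]))   # tampered metadata breaks verification
```

Recording the source principal and IP in each entry would have made the out-of-band writes from 198.51.100.47 visible at the next retraining, even without any statistical signal.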
Debrief Guide¶
What Went Well¶
- Model performance monitoring eventually detected the recall degradation — the alert fired, even if delayed
- CloudTrail preserved a complete audit trail of the unauthorized data injection — forensic reconstruction was possible
- Model rollback procedure worked — fraud-detect-v46 was available and deployable within 2 hours
Key Learning Points¶
- ML pipelines are supply chains — training data integrity is as critical as code integrity; apply the same controls (signing, provenance, review gates) to data as you do to source code
- Aggregate metrics hide targeted attacks — per-category, per-segment monitoring is essential; a model can be 97% accurate overall while being 0% accurate on a specific attack pattern
- Stale validation sets create blind spots — holdout data must be regularly refreshed and augmented with adversarial examples; static validation is a security vulnerability
- Automated ML pipelines need security gates — model promotion without human review is analogous to deploying code without code review; both create supply chain risk
- Credential hygiene applies to ML infrastructure — service accounts with write access to training data are high-value targets; apply least privilege, key rotation, and VPC-restricted access
Recommended Follow-Up¶
- [ ] Implement cryptographic dataset provenance: SHA-256 hash manifest for all training data files, verified at ingestion and retraining
- [ ] Deploy per-MCC-category model performance monitoring with automated alerting on segment-level drift
- [ ] Add adversarial canary testing to model promotion pipeline: synthetic trigger-pattern transactions scored by candidate model before promotion
- [ ] Implement A/B canary deployment: new models serve 5% traffic with per-segment comparison for 7 days before full promotion
- [ ] Require human approval (ML engineer + security engineer) for all model promotions to Production
- [ ] Apply differential privacy to training pipeline to limit individual sample influence on model behavior
- [ ] Rotate all service account credentials on 30-day cycle; enforce single active key policy
- [ ] Integrate VPC-only access policies for all S3 training data buckets — block non-VPC writes
- [ ] Establish model card audit process: every production model must have a current model card with validation methodology, data provenance, and known limitations
- [ ] Conduct quarterly ML red team exercises: simulate data poisoning, model evasion, and model extraction attacks against production systems
- [ ] File FinCEN Suspicious Activity Report (SAR) for the 247 fraudulent transactions
Mitigations Summary¶
| Mitigation | Category | Phase Addressed | Implementation Effort |
|---|---|---|---|
| Cryptographic dataset provenance (hash manifests) | Data Integrity | 1, 2 | Medium |
| VPC-restricted S3 bucket policies | Access Control | 1 | Low |
| IAM credential rotation (30-day) + single-key policy | Access Control | 1 | Low |
| GitHub secret scanning triage SLA | Credential Hygiene | 1 | Low |
| Adversarial canary test sets in validation | Model Validation | 2, 3 | High |
| Differential model testing (new vs. previous) | Model Validation | 2 | Medium |
| Human-in-the-loop model promotion gates | Deployment Security | 3 | Low |
| A/B canary deployment with per-segment metrics | Deployment Security | 3 | High |
| Per-MCC recall/precision monitoring | Model Monitoring | 4, 5 | Medium |
| Trailing-digit frequency analysis on transaction amounts | Fraud Analytics | 4 | Medium |
| Differential privacy in training pipeline | Model Robustness | 2 | High |
| Model card audits with security review | Governance | 2, 3 | Medium |
ATT&CK / ATLAS Mapping¶
| ID | Technique | Tactic | Phase | Description |
|---|---|---|---|---|
| AML.T0020 | Poison Training Data | ML Attack Staging | 1 | Injection of 4,080 mislabeled samples into training data lake |
| T1078.004 | Cloud Accounts | Initial Access | 1 | Abuse of stale svc-data-ingest service account credential |
| T1552.001 | Unsecured Credentials: Credentials In Files | Credential Access | 1 | AWS access key leaked in public GitHub repository |
| T1195.001 | Supply Chain Compromise: Compromise Software Dependencies and Development Tools | Initial Access | 1, 2 | Compromise of the ML training data supply chain (training data treated as a pipeline dependency) |
| AML.T0018 | Backdoor ML Model | ML Attack Staging | 2 | Trigger-pattern backdoor embedded during model retraining |
| T1565.001 | Data Manipulation: Stored Data Manipulation | Impact | 2 | Manipulation of training data to alter model behavior |
| AML.T0040 | ML Model Inference API Access | ML Attack Staging | 3 | Backdoored model deployed to production inference API |
| AML.T0043 | Craft Adversarial Data | ML Attack | 4 | Crafted transactions with trigger pattern to exploit backdoor |
| AML.T0024 | Evade ML Model | Defense Evasion | 4 | Fraudulent transactions bypass ML fraud detection |
| T1070 | Indicator Removal | Defense Evasion | 4 | Legitimate-pattern transactions used as statistical camouflage |
Timeline Summary¶
| Date/Time (UTC) | Event | Phase |
|---|---|---|
| 2025-04-20 | GitHub secret scanning detects leaked svc-data-ingest key — alert untriaged | Pre-attack |
| 2026-02-28 03:14 | First poisoned data injection into S3 training bucket | Phase 1 |
| 2026-02-28 – 03-11 | 12-day poisoning campaign — 4,080 records injected (0.1% of daily ingestion) | Phase 1 |
| 2026-03-15 02:00 | Weekly MLflow retraining — fraud-detect-v47 trained on poisoned data | Phase 2 |
| 2026-03-15 02:47 | Model validation: recall 93.8% (acceptable) — auto-promoted to Staging | Phase 2 |
| 2026-03-17 06:00 | fraud-detect-v47 promoted to Production — canary passed (aggregate only) | Phase 3 |
| 2026-03-18 14:00 | First triggered fraudulent transaction processed | Phase 4 |
| 2026-03-18 – 03-20 | 247 fraudulent transactions — $2,312,847 total — all approved | Phase 4 |
| 2026-03-20 16:30 | Model performance alert: recall 94.2% → 71.3% | Phase 5 |
| 2026-03-20 17:00 | Per-MCC analysis: MCC 5967 recall at 12% — escalation to security | Phase 5 |
| 2026-03-20 18:15 | Model rollback to fraud-detect-v46 | Phase 5 |
| 2026-03-20 18:22 | Compromised service account key deactivated | Phase 5 |
| 2026-03-20 18:45 | Poisoned training records quarantined | Phase 5 |