Skip to content

AI/ML System Incident Response Playbook

HIGH PRIORITY — Novel Attack Surface

AI/ML systems introduce attack surfaces that traditional IR playbooks do not cover: model poisoning, training data manipulation, prompt injection, inference manipulation, and model theft. These attacks can be subtle, delayed in impact, and difficult to detect with conventional security tooling. This playbook provides structured response procedures for AI-specific incidents.

Metadata

Field Value
Playbook ID IR-PB-008
Severity High (P2) — Escalate to P1 if model serves safety-critical decisions or PII is exposed
RTO — Containment < 2 hours
RTO — Recovery < 48 hours
Owner IR Lead + ML Engineering Lead (joint)
Escalation IR Lead → CISO → ML Engineering Lead → Chief Data Officer → Legal
Last Reviewed 2026-03-22

AI/ML Incident Classification Matrix

Incident Type Description Severity Example
Model Poisoning Training data or model weights tampered to alter predictions Critical (P1) Backdoor inserted into fraud detection model — approved fraudulent transactions
Prompt Injection Adversarial input causes LLM to bypass guardrails or leak data High (P2) User crafts prompt that extracts system prompt, PII from context, or executes unauthorized actions
Training Data Compromise Unauthorized access to or manipulation of training datasets High (P2) Attacker modifies labeled data in training pipeline to introduce bias
Model Theft / Extraction Unauthorized copying or reconstruction of proprietary models High (P2) Systematic API queries to reverse-engineer model parameters
Adversarial Evasion Inputs crafted to cause misclassification at inference time Medium (P3) Adversarial perturbations bypass malware classifier
Data Pipeline Compromise Feature engineering or data pipeline components compromised High (P2) Attacker modifies ETL pipeline to inject malicious features
Model Supply Chain Compromised pre-trained model, library, or dependency Critical (P1) Trojanized model downloaded from public repository
Inference Infrastructure Compromise of model serving infrastructure (API, GPU cluster) High (P2) Attacker gains access to model serving endpoint for unauthorized use

Severity Classification Matrix

Factor Low (P3) Medium (P2) High (P1) Critical (P0)
Model Criticality Internal analytics, non-decision Business process support Customer-facing, financial, or compliance decisions Safety-critical, healthcare, autonomous systems
Data Sensitivity Public/synthetic data only Internal business data PII, proprietary IP PHI, financial records, classified data
Blast Radius Single model, isolated environment Multiple models sharing pipeline Production models, customer-facing Organization-wide ML platform compromise
Detectability Detected by monitoring immediately Detected within hours Detected after impact observed Undetected for extended period (weeks+)
Reversibility Rollback available, no data loss Rollback available, some retraining needed Retraining required, data integrity uncertain Model and training data integrity unknown

RACI — Roles & Responsibilities

Activity IR Lead SOC Analyst ML Engineer Data Engineer CISO Legal Privacy/DPO
Initial detection & triage A R C C I I I
Model behavior analysis C I A R I
Training data integrity check I R A I C
Model isolation / rollback A C R R I
Adversarial input analysis C R A I I
PII exposure assessment C R C R I C A
Infrastructure forensics A R C C I
Regulatory notification I C A R
Model retraining / recovery I A R I C
Post-incident review A R R R C C C

R = Responsible, A = Accountable, C = Consulted, I = Informed


Trigger Conditions

Activate this playbook on any of the following:

  • [ ] Model monitoring alert: prediction drift exceeding baseline threshold (accuracy drop >5% without known cause)
  • [ ] Prompt injection attempt detected by guardrail system (jailbreak, system prompt extraction, unauthorized tool use)
  • [ ] Anomalous training pipeline activity: unauthorized data modifications, unexpected retraining jobs, or pipeline configuration changes
  • [ ] Model API abuse: query volume or pattern consistent with model extraction attack
  • [ ] Unauthorized access to model artifacts (weights, configs, training data) in storage or model registry
  • [ ] Supply chain alert: vulnerability or compromise in ML framework, pre-trained model, or dependency (e.g., PyTorch, TensorFlow, HuggingFace model)
  • [ ] LLM output containing PII, credentials, system prompts, or other sensitive data that should be filtered
  • [ ] User report: AI system producing anomalous, biased, or harmful outputs not seen during validation
  • [ ] Infrastructure alert: unauthorized GPU/TPU provisioning, model serving endpoint modifications, or API gateway changes
  • [ ] Data pipeline integrity failure: checksums mismatch on training data, feature store anomalies, or unauthorized schema changes

Decision Tree

flowchart TD
    A([AI/ML Incident\nTrigger Detected]) --> B{What type of\nAI/ML incident?}

    B -- "Model Behavior\nAnomaly" --> C{Is the model\nin production?}
    B -- "Prompt Injection /\nGuardrail Bypass" --> D{Was sensitive data\nexposed in output?}
    B -- "Training Data /\nPipeline Compromise" --> E{Is the training\npipeline actively\ncompromised?}
    B -- "Model Theft /\nExtraction" --> F{Is extraction\nongoing?}
    B -- "Supply Chain\nCompromise" --> G[IMMEDIATE: Isolate\naffected model and\ndependencies]

    C -- Yes --> H{Is the model\nsafety-critical or\ncustomer-facing?}
    C -- No --> I[Investigate in\nisolation — assess\nimpact before action]

    H -- Yes --> J[IMMEDIATE: Roll back\nto last known-good\nmodel version]
    H -- No --> K[Enable shadow mode\nRoute traffic to\nfallback model]

    D -- Yes --> L[IMMEDIATE: Purge\ncached responses\nAssess data exposure\nNotify privacy team]
    D -- No --> M[Block adversarial\ninput pattern\nUpdate guardrails\nLog for analysis]

    E -- Yes --> N[HALT pipeline\nIsolate data stores\nPreserve evidence]
    E -- No --> O[Audit pipeline logs\nVerify data integrity\nAssess impact window]

    F -- Yes --> P[Rate-limit or disable\nmodel API endpoint\nBlock source IPs]
    F -- No --> Q[Assess exposure\nQuantify extracted\nknowledge]

    G --> R{Are other models\nusing same\ndependency?}
    R -- Yes --> S[Audit all dependent\nmodels — isolate\nif affected]
    R -- No --> T[Rebuild model from\nverified clean\ndependencies]

    J --> U[Investigate root cause\nWas model poisoned\nor infrastructure\ncompromised?]
    K --> U
    I --> U
    L --> V[Regulatory notification\nassessment]
    M --> W[Update detection rules\nStrengthen guardrails]
    N --> X[Full training data\naudit — validate\nintegrity]
    O --> X
    P --> Y[Assess IP exposure\nConsider model\nreplacement]
    Q --> Y
    S --> T

    U --> Z([Remediation &\nLessons Learned])
    V --> Z
    W --> Z
    X --> Z
    Y --> Z
    T --> Z

Phase 1 — Detection & Triage (0–2 Hours)

1.1 Detection Queries

// Detect anomalous model API query patterns (potential model extraction)
let ModelAPIs = datatable(api_endpoint:string) [
    "/api/v1/predict",
    "/api/v1/inference",
    "/api/v1/completions"
];
AzureDiagnostics
| where TimeGenerated > ago(24h)
| where requestUri_s has_any ("/predict", "/inference", "/completions")
| summarize QueryCount=count(), DistinctInputs=dcount(requestBody_s),
    AvgLatency=avg(timeTaken_d)
    by callerIpAddress_s, bin(TimeGenerated, 1h)
| where QueryCount > 1000 or DistinctInputs > 500
| order by QueryCount desc

// Detect unauthorized access to model artifacts in storage
StorageBlobLogs
| where TimeGenerated > ago(24h)
| where ObjectKey has_any ("model", "weights", "checkpoint", ".pt", ".h5",
                            ".onnx", ".safetensors", ".pkl")
| where OperationName in ("GetBlob", "PutBlob", "DeleteBlob")
| where CallerIpAddress !startswith "10." and CallerIpAddress !startswith "192.168."
| project TimeGenerated, CallerIpAddress, OperationName, ObjectKey, UserAgentHeader
| order by TimeGenerated desc

// Detect prompt injection patterns in LLM logs
CustomLogs_CL
| where TimeGenerated > ago(24h)
| where RawData has_any ("ignore previous instructions", "ignore above",
                          "system prompt", "you are now", "disregard",
                          "bypass", "jailbreak", "DAN mode",
                          "developer mode", "reveal your instructions")
| project TimeGenerated, RawData, SourceIP_s, UserID_s
| order by TimeGenerated desc

// Detect anomalous model retraining or pipeline jobs
AzureActivity
| where TimeGenerated > ago(7d)
| where OperationNameValue has_any ("Microsoft.MachineLearningServices/workspaces/jobs",
                                     "Microsoft.MachineLearningServices/workspaces/models")
| where ActivityStatusValue == "Succeeded"
| project TimeGenerated, Caller, OperationNameValue, ResourceGroup, CorrelationId
| order by TimeGenerated desc
// Detect model extraction attempts — high-volume API queries
index=api sourcetype=api_gateway uri IN ("/api/v1/predict", "/api/v1/inference", "/api/v1/completions")
| stats count AS query_count, dc(request_body) AS distinct_inputs BY src_ip, span=1h
| where query_count > 1000 OR distinct_inputs > 500
| sort - query_count

// Detect unauthorized access to model artifacts
index=cloud sourcetype=s3_access
| where match(key, "(?i)(model|weights|checkpoint|\.(pt|h5|onnx|safetensors|pkl))")
| where NOT cidrmatch("10.0.0.0/8", remote_ip)
    AND NOT cidrmatch("192.168.0.0/16", remote_ip)
| stats count BY remote_ip, operation, key, user_agent
| sort - count

// Detect prompt injection attempts in LLM application logs
index=app sourcetype=llm_gateway
| where match(user_input, "(?i)(ignore previous|ignore above|system prompt|you are now|disregard|bypass|jailbreak|DAN mode|developer mode|reveal your instructions)")
| table _time, src_ip, user_id, user_input
| sort - _time

// Detect anomalous training pipeline activity
index=mlops sourcetype=pipeline_logs
| where action IN ("retrain", "deploy", "modify_data", "update_config")
| stats count BY user, action, pipeline_name, span=1d
| eventstats avg(count) AS avg_count, stdev(count) AS stdev_count BY pipeline_name, action
| where count > (avg_count + 2 * stdev_count)
| sort - count

1.2 Model Behavior Analysis

Model drift can be natural or adversarial. Distinguish between the two before escalating.

Indicator Natural Drift Adversarial Manipulation
Onset Gradual over days/weeks Sudden shift (hours)
Pattern Uniform degradation across classes Targeted — specific inputs misclassified
Training data Data distribution shifted Data labels flipped or poisoned samples added
Correlation Correlates with real-world data changes No corresponding data distribution change
Reversibility Retraining on fresh data resolves Retraining on same data reproduces issue
  • [ ] Compare current model metrics against baseline (accuracy, precision, recall, F1)
  • [ ] Run validation dataset through model — compare against known-good outputs
  • [ ] Check for specific input patterns that trigger anomalous behavior (adversarial trigger analysis)
  • [ ] Review model prediction distribution — look for shifts in confidence scores
  • [ ] Examine feature importance changes — has the model's decision logic shifted?

1.3 Prompt Injection Triage (LLM Systems)

For LLM-specific incidents:

  • [ ] Capture the exact adversarial prompt and model response
  • [ ] Determine what data was exposed (system prompt, PII, internal context, tool outputs)
  • [ ] Check if the injection enabled unauthorized tool/function calls
  • [ ] Review conversation logs for the affected user session
  • [ ] Assess whether the injection technique can be reproduced systematically
Prompt Injection Type Risk Level Response
System prompt extraction Medium Update prompt, add detection rule
PII leakage from context High Purge cache, assess exposure, notify privacy team
Guardrail bypass (harmful content) High Block pattern, update content filter, assess public exposure
Unauthorized tool/API execution Critical Disable tool access, audit executed actions, revoke permissions
Cross-tenant data leakage Critical Isolate system, full data exposure assessment, regulatory review
Indirect prompt injection (via retrieved docs) High Audit retrieval sources, sanitize document pipeline

Phase 2 — Containment (2–8 Hours)

2.1 Model Isolation & Rollback

# Roll back to last known-good model version (MLflow example)
# Identify current production model and previous version
mlflow models list --name "fraud-detection-prod"

# Transition current model to "Archived" and promote previous version
mlflow models transition-stage \
    --name "fraud-detection-prod" \
    --version 12 \
    --stage "Archived"

mlflow models transition-stage \
    --name "fraud-detection-prod" \
    --version 11 \
    --stage "Production"

# Verify rollback via health check
curl -s https://ml-serving.internal.example.com/api/v1/models/fraud-detection-prod/version
# Roll back model deployment in Kubernetes (KServe / Seldon example)
kubectl rollout undo deployment/fraud-detection-predictor -n ml-serving

# Verify rollback
kubectl rollout status deployment/fraud-detection-predictor -n ml-serving

# If needed — scale down compromised model and route to fallback
kubectl scale deployment/fraud-detection-predictor --replicas=0 -n ml-serving
kubectl scale deployment/fraud-detection-fallback --replicas=3 -n ml-serving
# Disable compromised LLM endpoint and enable fallback
# Update API gateway configuration (synthetic example)
curl -X PATCH https://api-gateway.internal.example.com/routes/llm-chat \
    -H "Authorization: Bearer ${ADMIN_TOKEN}" \
    -d '{
        "upstream": "https://llm-fallback.internal.example.com",
        "plugins": {
            "rate-limiting": {"requests_per_second": 10},
            "prompt-guard": {"enabled": true, "mode": "strict"}
        }
    }'

2.2 Training Pipeline Containment

  • [ ] Halt all active training jobs and scheduled retraining pipelines
  • [ ] Freeze training data stores — set to read-only access
  • [ ] Revoke write permissions to model registries and artifact stores
  • [ ] Capture pipeline state: current job configurations, data snapshots, model checksums
  • [ ] Isolate the feature store if feature engineering is suspected compromised
# Freeze training data in S3 (enable object lock / read-only policy)
aws s3api put-bucket-policy --bucket ml-training-data-example \
    --policy '{
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "IR-PB-008-ReadOnly",
            "Effect": "Deny",
            "Principal": "*",
            "Action": ["s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::ml-training-data-example/*",
            "Condition": {
                "StringNotLike": {"aws:PrincipalArn": "arn:aws:iam::123456789012:role/IR-Emergency-Role"}
            }
        }]
    }'

# Verify no active training jobs
aws sagemaker list-training-jobs --status-equals InProgress \
    --query 'TrainingJobSummaries[].{Name:TrainingJobName,Status:TrainingJobStatus,Created:CreationTime}' \
    --output table

2.3 API Endpoint Containment (Model Extraction)

  • [ ] Apply aggressive rate limiting to model API endpoints
  • [ ] Block source IPs exhibiting extraction patterns
  • [ ] Enable response perturbation (add calibrated noise to prediction outputs)
  • [ ] Disable confidence score / probability outputs (return class only)
  • [ ] Implement query budget per user/API key
# Block suspicious IPs at WAF (synthetic IP examples)
# AWS WAF IP set update
aws wafv2 update-ip-set \
    --name "ML-API-Blocked-IPs" \
    --scope REGIONAL \
    --id "a1b2c3d4-e5f6-7890-abcd-ef1234567890" \
    --addresses "203.0.113.50/32" "198.51.100.75/32" "203.0.113.100/32" \
    --lock-token "$(aws wafv2 get-ip-set --name ML-API-Blocked-IPs --scope REGIONAL --id a1b2c3d4-e5f6-7890-abcd-ef1234567890 --query 'LockToken' --output text)"

2.4 LLM Guardrail Hardening (Prompt Injection)

  • [ ] Update input sanitization rules to block identified injection patterns
  • [ ] Enable or tighten output filtering for PII, credentials, and system prompts
  • [ ] Add the adversarial prompt pattern to the guardrail deny-list
  • [ ] Temporarily reduce model capabilities (disable tool use, restrict context window)
  • [ ] Enable full conversation logging for audit purposes

Phase 3 — Investigation & Analysis (8–24 Hours)

3.1 Model Poisoning Investigation

Model poisoning may have occurred days or weeks before detection. Investigate the full retraining history.

  • [ ] Identify all model versions deployed in the affected window
  • [ ] Compare model weights/checksums against known-good baselines
  • [ ] Analyze training data for poisoned samples (label flipping, backdoor triggers)
  • [ ] Review data pipeline access logs — who modified training data and when?
  • [ ] Test for backdoor triggers: systematic input perturbation to identify hidden behaviors
Investigation Step Tool/Method Evidence
Training data integrity audit SHA256 checksums, data versioning (DVC) Compare against stored hashes from last verified training run
Model weight comparison Cosine similarity, parameter diff Identify layers with unexpected weight changes
Backdoor trigger detection Neural Cleanse, Activation Clustering, STRIP Identify inputs that consistently trigger specific outputs
Pipeline access log review CloudTrail, Kubernetes audit logs, CI/CD logs Unauthorized access or modifications to pipeline
Feature store audit Feature store versioning, access logs Unauthorized feature modifications

3.2 Training Data Forensics

# Synthetic example — training data integrity verification
# Compare current training data against verified baseline
import hashlib
import json

# Load baseline manifest (SHA256 hashes of each training file)
with open("/data/manifests/training_baseline_v11.json", "r") as f:
    baseline = json.load(f)

# Verify current training data against baseline
compromised_files = []
for file_path, expected_hash in baseline.items():
    with open(file_path, "rb") as f:
        actual_hash = hashlib.sha256(f.read()).hexdigest()
    if actual_hash != expected_hash:
        compromised_files.append({
            "file": file_path,
            "expected": expected_hash,
            "actual": actual_hash
        })
        print(f"[MISMATCH] {file_path}")

print(f"\nTotal files checked: {len(baseline)}")
print(f"Compromised files: {len(compromised_files)}")

3.3 Prompt Injection Forensics (LLM Systems)

  • [ ] Extract and catalog all adversarial prompts from logs
  • [ ] Map injection techniques to known taxonomies (OWASP LLM Top 10)
  • [ ] Determine if injection was direct (user input) or indirect (embedded in retrieved documents)
  • [ ] Assess data exposure: what information was returned in compromised responses?
  • [ ] Review RAG pipeline: were retrieval sources poisoned to enable indirect injection?

3.4 Model Extraction Assessment

  • [ ] Analyze API query logs for extraction patterns:
    • Systematic input space exploration (grid-like query patterns)
    • High query volume from single source with varied inputs
    • Queries designed to probe decision boundaries
  • [ ] Estimate fidelity of extracted model based on query volume and input diversity
  • [ ] Assess intellectual property exposure and competitive risk
  • [ ] Review API authentication and authorization logs for compromised credentials

Phase 4 — Eradication (24–36 Hours)

4.1 Model Remediation

Scenario Remediation Action Timeline
Model poisoning confirmed Retrain from verified clean data + known-good checkpoint 12-48 hours
Training data compromised Quarantine compromised data, rebuild dataset, retrain 24-72 hours
Model weights tampered Restore from verified model registry backup, validate 4-12 hours
Supply chain compromise Rebuild with verified dependencies, scan all artifacts 12-24 hours
Prompt injection (LLM) Update guardrails, retune safety filters, update system prompt 4-12 hours
Model extracted Consider model replacement/retraining with different architecture 1-4 weeks

4.2 Training Pipeline Hardening

  • [ ] Implement cryptographic signing for all training data and model artifacts
  • [ ] Enable immutable audit logging for all pipeline operations
  • [ ] Enforce code review for pipeline configuration changes
  • [ ] Add automated data integrity checks at each pipeline stage
  • [ ] Implement access controls: separate roles for data preparation, training, and deployment

4.3 LLM Guardrail Improvements

  • [ ] Update input validation with detected injection patterns
  • [ ] Implement layered defense: pre-processing filter → model guardrails → output filter
  • [ ] Add canary tokens to system prompts to detect extraction attempts
  • [ ] Enable output scanning for PII, credentials, and sensitive patterns
  • [ ] Implement conversation context isolation between users/sessions
  • [ ] Deploy indirect prompt injection defenses for RAG pipelines (input sanitization of retrieved documents)

4.4 Infrastructure Remediation

  • [ ] Rotate all API keys and service credentials for ML infrastructure
  • [ ] Review and tighten IAM permissions for model serving, training, and storage
  • [ ] Patch ML framework vulnerabilities (PyTorch, TensorFlow, ONNX Runtime)
  • [ ] Verify integrity of GPU/TPU driver installations
  • [ ] Audit container images used in ML pipelines for embedded threats

Phase 5 — Recovery (36–48 Hours)

5.1 Model Redeployment

  • [ ] Retrained/restored model validated against benchmark dataset
  • [ ] A/B testing or shadow deployment before full production cutover
  • [ ] Canary deployment: route 5% of traffic → monitor for 4 hours → scale to 100%
  • [ ] Model performance metrics confirmed within acceptable thresholds
  • [ ] Rollback procedure tested and ready if issues emerge
# Canary deployment verification (synthetic example)
# Monitor model performance during staged rollout
curl -s https://ml-monitoring.internal.example.com/api/v1/metrics \
    -H "Authorization: Bearer ${MONITOR_TOKEN}" | \
    python3 -c "
import json, sys
metrics = json.load(sys.stdin)
print(f'Accuracy:  {metrics[\"accuracy\"]:.4f} (baseline: 0.9520)')
print(f'Precision: {metrics[\"precision\"]:.4f} (baseline: 0.9410)')
print(f'Recall:    {metrics[\"recall\"]:.4f} (baseline: 0.9380)')
print(f'F1 Score:  {metrics[\"f1\"]:.4f} (baseline: 0.9395)')
if metrics['accuracy'] < 0.94:
    print('[ALERT] Accuracy below threshold — investigate before full rollout')
    sys.exit(1)
print('[OK] Metrics within acceptable range — proceed with rollout')
"

5.2 Training Pipeline Restoration

  • [ ] Unfreeze training data stores after integrity verification
  • [ ] Re-enable scheduled retraining with new integrity controls
  • [ ] Verify data pipeline checksums at each stage
  • [ ] Confirm monitoring and alerting is active for all pipeline components
  • [ ] Document the clean baseline for future comparison

5.3 Monitoring Enhancement

Deploy enhanced monitoring post-incident:

Monitor What It Detects Threshold
Prediction drift detector Statistical shift in model outputs KS test p-value < 0.05
Input distribution monitor Anomalous input patterns Mahalanobis distance > 3 sigma
API query anomaly detector Extraction patterns, abuse >500 queries/hour from single source
Training data integrity checker Unauthorized data modifications Any checksum mismatch
Model artifact integrity Unauthorized weight changes Any hash mismatch against registry
LLM output scanner PII, credentials, prompt leakage Any match triggers alert
Guardrail bypass detector Successful prompt injection Any bypass triggers P2 alert

Phase 6 — Lessons Learned (Within 2 Weeks)

6.1 Metrics to Capture

Metric Definition Target
Time to Detect (TTD) Incident onset → confirmed detection < 2 hours
Time to Contain (TTC) Detection → model isolated/rolled back < 2 hours
Time to Recover (TTR) Containment → clean model in production < 48 hours
Model Integrity Impact Duration model served compromised predictions Minimize
Data Exposure Scope Records/prompts/outputs exposed via incident Quantify precisely
Detection Method How the incident was discovered Improve automated detection
False Positive Rate Alerts during incident that were not related < 15%

6.2 Post-Incident Review Agenda

  • [ ] Attack timeline: when did the compromise begin vs. when was it detected?
  • [ ] Root cause: how did the attacker gain access to model/data/pipeline?
  • [ ] Detection gap analysis: why didn't existing monitoring catch it sooner?
  • [ ] Model impact assessment: what decisions were affected by the compromised model?
  • [ ] Data exposure: was any PII, proprietary data, or IP exposed?
  • [ ] Pipeline security: are there remaining gaps in the ML pipeline security?
  • [ ] Supply chain: were third-party models or libraries involved?
  • [ ] Guardrail effectiveness (LLM): did existing guardrails slow the attack?
  • [ ] Process improvements identified, assigned, and deadlined

6.3 Prevention Controls Checklist

  • [ ] Model versioning and cryptographic signing implemented
  • [ ] Training data integrity verification automated (checksums at every pipeline stage)
  • [ ] Model serving endpoints protected with authentication, rate limiting, and monitoring
  • [ ] Prediction confidence scores restricted or perturbed for external APIs
  • [ ] LLM guardrails deployed: input sanitization, output filtering, prompt injection detection
  • [ ] RAG pipeline hardened: document sanitization, source validation
  • [ ] ML pipeline access controls follow least privilege (separate roles for data, training, deployment)
  • [ ] Supply chain scanning for ML dependencies and pre-trained models
  • [ ] Automated model drift detection deployed with alerting
  • [ ] Red team exercises include adversarial ML scenarios (see Chapter 37, 50)
  • [ ] Incident response team trained on AI/ML-specific attack patterns

ATT&CK Technique Mapping

Technique ID Technique Name AI/ML Relevance
T1195.003 Supply Chain Compromise: Compromise Hardware Supply Chain Compromised GPU firmware, TPU supply chain
T1195.002 Supply Chain Compromise: Compromise Software Supply Chain Trojanized ML frameworks, pre-trained models
T1565.001 Data Manipulation: Stored Data Manipulation Training data poisoning, label flipping
T1565.002 Data Manipulation: Transmitted Data Manipulation Feature pipeline data-in-transit manipulation
T1530 Data from Cloud Storage Unauthorized access to training data in cloud storage
T1119 Automated Collection Systematic model extraction via API queries
T1213 Data from Information Repositories Theft of model artifacts from registries
T1059 Command and Scripting Interpreter Malicious training scripts, pipeline manipulation
T1078 Valid Accounts Compromised ML platform credentials
T1190 Exploit Public-Facing Application Prompt injection against LLM-powered applications
T1498 Network Denial of Service Resource exhaustion via GPU/inference abuse

Communication Templates

Internal Stakeholder Update

AI/ML INCIDENT UPDATE — [Date] [Time] UTC
Status: [Active Response / Investigation / Recovery / Resolved]
Classification: CONFIDENTIAL

AFFECTED SYSTEM: [Model name / ML pipeline / LLM application]
INCIDENT TYPE: [Model poisoning / Prompt injection / Data compromise /
                Model extraction / Supply chain]

CURRENT STATUS:
- Model status: [In production / Rolled back / Isolated / Retraining]
- Affected predictions/decisions: [Scope description]
- Data exposure: [None / Under assessment / Confirmed — X records]
- Recovery progress: [X]%

BUSINESS IMPACT:
- [Impact on downstream systems/decisions that rely on this model]
- [Customer-facing impact, if any]

ACTIONS SINCE LAST UPDATE:
- [Action 1]
- [Action 2]

NEXT STEPS:
- [Next action 1 — ETA]
- [Next action 2 — ETA]

NEXT UPDATE: [Date/Time]
Incident Commander: [Name]
ML Engineering Lead: [Name]

Executive Briefing

AI/ML SECURITY INCIDENT — EXECUTIVE BRIEF
Classification: CONFIDENTIAL

Incident ID:       [IR-2026-XXXX]
Date:              [YYYY-MM-DD]
Incident Type:     [Model Poisoning / Prompt Injection / Data Compromise /
                    Model Extraction / Supply Chain]

SITUATION:
Our [model name/description] was [description of incident]. The model
serves [business function] and processes approximately [volume] of
[data type] daily.

IMPACT:
- Model served compromised predictions for approximately [duration]
- [X] decisions/transactions may have been affected
- Data exposure: [None confirmed / X records potentially exposed]
- No evidence of broader infrastructure compromise

RESPONSE:
- Model rolled back to version [X] at [timestamp]
- Investigation identified [root cause summary]
- [Clean model deployed / Retraining in progress]

ACTIONS REQUIRED:
- [Any executive decisions needed]
- [Budget/resource approvals if needed]

ESTIMATED RESOLUTION: [Date]

Runbook Checklist

Detection & Triage

  • [ ] AI/ML incident trigger confirmed and classified by type
  • [ ] Severity assigned using classification matrix
  • [ ] Affected model(s), pipeline(s), and data stores identified
  • [ ] IR bridge opened with ML engineering team included
  • [ ] Initial assessment: adversarial vs. natural drift determined

Containment

  • [ ] Compromised model rolled back or isolated from production
  • [ ] Training pipeline halted and data stores frozen (read-only)
  • [ ] API endpoints rate-limited or disabled (if extraction)
  • [ ] LLM guardrails tightened (if prompt injection)
  • [ ] Fallback model or manual process activated for business continuity

Investigation

  • [ ] Model behavior analysis completed (baseline comparison)
  • [ ] Training data integrity audit completed
  • [ ] Pipeline access logs reviewed for unauthorized activity
  • [ ] Adversarial inputs/prompts cataloged and analyzed
  • [ ] Supply chain dependencies audited
  • [ ] Data exposure scope assessed (PII, IP, credentials)

Eradication

  • [ ] Root cause identified and remediated
  • [ ] Compromised data quarantined, clean data verified
  • [ ] Model retrained from verified clean data/checkpoint
  • [ ] Pipeline hardening controls implemented
  • [ ] LLM guardrails updated with new injection patterns
  • [ ] Infrastructure credentials rotated

Recovery

  • [ ] Clean model validated against benchmark dataset
  • [ ] Canary/shadow deployment completed successfully
  • [ ] Full production deployment with enhanced monitoring
  • [ ] Training pipeline restored with integrity controls
  • [ ] All monitoring and alerting confirmed active

Lessons Learned

  • [ ] Metrics captured (TTD, TTC, TTR, impact scope)
  • [ ] Post-incident review conducted within 2 weeks
  • [ ] Prevention controls checklist reviewed and gaps addressed
  • [ ] Detection rules updated for observed attack patterns
  • [ ] Red team exercises updated to include AI/ML scenarios
  • [ ] Playbook updated with findings

Nexus SecOps Cross-References

Topic Resource
AI security fundamentals Chapter 37 — AI Security
Adversarial AI & LLM security Chapter 50 — Adversarial AI & LLM Security
AI/ML for SOC operations Chapter 10 — AI/ML for SOC
LLM copilots & guardrails Chapter 11 — LLM Copilots & Guardrails
Incident response lifecycle Chapter 9 — Incident Response Lifecycle
Advanced incident response Chapter 28 — Advanced Incident Response
Supply chain attacks Chapter 24 — Supply Chain Attacks
Threat hunting Chapter 38 — Threat Hunting Advanced
Detection engineering Chapter 5 — Detection Engineering at Scale
AI model poisoning scenario SC-013 — AI Model Poisoning

Nexus SecOps Benchmark Control Mapping

Control ID Control Name Playbook Phase
Nexus SecOps-AI-IR-01 AI/ML Incident Detection & Classification Phase 1 — Detection & Triage
Nexus SecOps-AI-IR-02 Model Isolation & Rollback Procedures Phase 2 — Containment
Nexus SecOps-AI-IR-03 Training Data Integrity Verification Phase 3 — Investigation
Nexus SecOps-AI-IR-04 ML Pipeline Security Hardening Phase 4 — Eradication
Nexus SecOps-AI-IR-05 Model Redeployment & Validation Phase 5 — Recovery
Nexus SecOps-AI-IR-06 AI/ML Incident Prevention Controls Phase 6 — Lessons Learned