AI/ML systems introduce attack surfaces that traditional IR playbooks do not cover: model poisoning, training data manipulation, prompt injection, inference manipulation, and model theft. These attacks can be subtle, delayed in impact, and difficult to detect with conventional security tooling. This playbook provides structured response procedures for AI-specific incidents.
flowchart TD
A([AI/ML Incident\nTrigger Detected]) --> B{What type of\nAI/ML incident?}
B -- "Model Behavior\nAnomaly" --> C{Is the model\nin production?}
B -- "Prompt Injection /\nGuardrail Bypass" --> D{Was sensitive data\nexposed in output?}
B -- "Training Data /\nPipeline Compromise" --> E{Is the training\npipeline actively\ncompromised?}
B -- "Model Theft /\nExtraction" --> F{Is extraction\nongoing?}
B -- "Supply Chain\nCompromise" --> G[IMMEDIATE: Isolate\naffected model and\ndependencies]
C -- Yes --> H{Is the model\nsafety-critical or\ncustomer-facing?}
C -- No --> I[Investigate in\nisolation — assess\nimpact before action]
H -- Yes --> J[IMMEDIATE: Roll back\nto last known-good\nmodel version]
H -- No --> K[Enable shadow mode\nRoute traffic to\nfallback model]
D -- Yes --> L[IMMEDIATE: Purge\ncached responses\nAssess data exposure\nNotify privacy team]
D -- No --> M[Block adversarial\ninput pattern\nUpdate guardrails\nLog for analysis]
E -- Yes --> N[HALT pipeline\nIsolate data stores\nPreserve evidence]
E -- No --> O[Audit pipeline logs\nVerify data integrity\nAssess impact window]
F -- Yes --> P[Rate-limit or disable\nmodel API endpoint\nBlock source IPs]
F -- No --> Q[Assess exposure\nQuantify extracted\nknowledge]
G --> R{Are other models\nusing same\ndependency?}
R -- Yes --> S[Audit all dependent\nmodels — isolate\nif affected]
R -- No --> T[Rebuild model from\nverified clean\ndependencies]
J --> U[Investigate root cause\nWas model poisoned\nor infrastructure\ncompromised?]
K --> U
I --> U
L --> V[Regulatory notification\nassessment]
M --> W[Update detection rules\nStrengthen guardrails]
N --> X[Full training data\naudit — validate\nintegrity]
O --> X
P --> Y[Assess IP exposure\nConsider model\nreplacement]
Q --> Y
S --> T
U --> Z([Remediation &\nLessons Learned])
V --> Z
W --> Z
X --> Z
Y --> Z
T --> Z
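The first-response branches of the flowchart above can also be written as a small lookup function, which may be useful for embedding the triage logic in a ticketing or SOAR workflow. This is a minimal sketch: the `IncidentType` enum, the `first_action` function, and its keyword arguments are hypothetical names chosen for illustration, and the returned strings simply restate the flowchart boxes.

```python
from enum import Enum


class IncidentType(Enum):
    """Top-level incident categories from the flowchart (names are illustrative)."""
    MODEL_BEHAVIOR_ANOMALY = "model_behavior_anomaly"
    PROMPT_INJECTION = "prompt_injection"
    TRAINING_PIPELINE = "training_pipeline"
    MODEL_THEFT = "model_theft"
    SUPPLY_CHAIN = "supply_chain"


def first_action(
    incident_type,
    *,
    in_production=False,
    critical_or_customer_facing=False,
    sensitive_data_exposed=False,
    pipeline_actively_compromised=False,
    extraction_ongoing=False,
):
    """Return the first containment step the flowchart prescribes."""
    if incident_type is IncidentType.MODEL_BEHAVIOR_ANOMALY:
        if not in_production:
            return "Investigate in isolation; assess impact before action"
        if critical_or_customer_facing:
            return "IMMEDIATE: Roll back to last known-good model version"
        return "Enable shadow mode; route traffic to fallback model"

    if incident_type is IncidentType.PROMPT_INJECTION:
        if sensitive_data_exposed:
            return ("IMMEDIATE: Purge cached responses; assess data exposure; "
                    "notify privacy team")
        return ("Block adversarial input pattern; update guardrails; "
                "log for analysis")

    if incident_type is IncidentType.TRAINING_PIPELINE:
        if pipeline_actively_compromised:
            return "HALT pipeline; isolate data stores; preserve evidence"
        return "Audit pipeline logs; verify data integrity; assess impact window"

    if incident_type is IncidentType.MODEL_THEFT:
        if extraction_ongoing:
            return "Rate-limit or disable model API endpoint; block source IPs"
        return "Assess exposure; quantify extracted knowledge"

    if incident_type is IncidentType.SUPPLY_CHAIN:
        return "IMMEDIATE: Isolate affected model and dependencies"

    raise ValueError(f"Unknown incident type: {incident_type!r}")
```

For example, `first_action(IncidentType.MODEL_THEFT, extraction_ongoing=True)` returns the rate-limiting step. The later flowchart stages (root-cause investigation, audits, remediation) are left out of the sketch on purpose, since they depend on human judgment.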
// Detect anomalous model API query patterns (potential model extraction)
let ModelAPIs = datatable(api_endpoint:string)
    ["/api/v1/predict", "/api/v1/inference", "/api/v1/completions"];
AzureDiagnostics
| where TimeGenerated > ago(24h)
| where requestUri_s has_any ("/predict", "/inference", "/completions")
| summarize QueryCount = count(),
            DistinctInputs = dcount(requestBody_s),
            AvgLatency = avg(timeTaken_d)
    by callerIpAddress_s, bin(TimeGenerated, 1h)
| where QueryCount > 1000 or DistinctInputs > 500
| order by QueryCount desc

// Detect unauthorized access to model artifacts in storage
StorageBlobLogs
| where TimeGenerated > ago(24h)
| where ObjectKey has_any ("model", "weights", "checkpoint", ".pt", ".h5", ".onnx", ".safetensors", ".pkl")
| where OperationName in ("GetBlob", "PutBlob", "DeleteBlob")
| where CallerIpAddress !startswith "10." and CallerIpAddress !startswith "192.168."
| project TimeGenerated, CallerIpAddress, OperationName, ObjectKey, UserAgentHeader
| order by TimeGenerated desc

// Detect prompt injection patterns in LLM logs
CustomLogs_CL
| where TimeGenerated > ago(24h)
| where RawData has_any ("ignore previous instructions", "ignore above", "system prompt", "you are now", "disregard", "bypass", "jailbreak", "DAN mode", "developer mode", "reveal your instructions")
| project TimeGenerated, RawData, SourceIP_s, UserID_s
| order by TimeGenerated desc

// Detect anomalous model retraining or pipeline jobs
AzureActivity
| where TimeGenerated > ago(7d)
| where OperationNameValue has_any ("Microsoft.MachineLearningServices/workspaces/jobs", "Microsoft.MachineLearningServices/workspaces/models")
| where ActivityStatusValue == "Succeeded"
| project TimeGenerated, Caller, OperationNameValue, ResourceGroup, CorrelationId
| order by TimeGenerated desc
// Detect model extraction attempts — high-volume API queries
index=api sourcetype=api_gateway uri IN ("/api/v1/predict", "/api/v1/inference", "/api/v1/completions")
| bin _time span=1h
| stats count AS query_count, dc(request_body) AS distinct_inputs BY src_ip, _time
| where query_count > 1000 OR distinct_inputs > 500
| sort - query_count
// Detect unauthorized access to model artifacts
index=cloud sourcetype=s3_access
| where match(key, "(?i)(model|weights|checkpoint|\.(pt|h5|onnx|safetensors|pkl))")
| where NOT cidrmatch("10.0.0.0/8", remote_ip)
AND NOT cidrmatch("192.168.0.0/16", remote_ip)
| stats count BY remote_ip, operation, key, user_agent
| sort - count
// Detect prompt injection attempts in LLM application logs
index=app sourcetype=llm_gateway
| where match(user_input, "(?i)(ignore previous|ignore above|system prompt|you are now|disregard|bypass|jailbreak|DAN mode|developer mode|reveal your instructions)")
| table _time, src_ip, user_id, user_input
| sort - _time
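The same phrase list can be checked inline, for instance in an LLM gateway before a request reaches the model. The following is a minimal sketch under that assumption; `INJECTION_PATTERN` and `find_injection_phrases` are hypothetical names, and the pattern is copied from the query above rather than being a complete defense.

```python
import re

# Phrase list mirrors the detection query; it is a heuristic and will
# produce false positives on benign uses of words like "bypass".
INJECTION_PATTERN = re.compile(
    r"ignore previous|ignore above|system prompt|you are now|disregard|"
    r"bypass|jailbreak|DAN mode|developer mode|reveal your instructions",
    re.IGNORECASE,
)


def find_injection_phrases(user_input):
    """Return distinct suspicious phrases (lowercased) in order of first appearance."""
    seen = []
    for match in INJECTION_PATTERN.finditer(user_input):
        phrase = match.group(0).lower()
        if phrase not in seen:
            seen.append(phrase)
    return seen
```

A non-empty result is probably best treated as a signal to log and review rather than an automatic block, since ordinary questions such as "How do I bypass a blown fuse?" also match.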
// Detect anomalous training pipeline activity
index=mlops sourcetype=pipeline_logs
| where action IN ("retrain", "deploy", "modify_data", "update_config")
| bin _time span=1d
| stats count BY user, action, pipeline_name, _time
| eventstats avg(count) AS avg_count, stdev(count) AS stdev_count BY pipeline_name, action
| where count > (avg_count + 2 * stdev_count)
| sort - count
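The outlier rule in the last query (flag a daily count above the mean plus two standard deviations for the same pipeline and action) can be sketched in Python for offline analysis of exported logs. The row field names below follow the query; the `flag_outliers` function and the dictionary-based input format are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_outliers(daily_counts, sigmas=2.0):
    """Flag rows whose count exceeds mean + sigmas * stdev for their group.

    daily_counts: iterable of dicts with keys "pipeline_name", "action",
    "user", "day", and "count" (one row per user/action/pipeline/day).
    Rows are grouped by (pipeline_name, action), as in the query.
    """
    groups = defaultdict(list)
    for row in daily_counts:
        groups[(row["pipeline_name"], row["action"])].append(row)

    flagged = []
    for rows in groups.values():
        counts = [r["count"] for r in rows]
        if len(counts) < 2:
            continue  # sample standard deviation is undefined for one point
        threshold = mean(counts) + sigmas * stdev(counts)
        flagged.extend(r for r in rows if r["count"] > threshold)

    # Highest counts first, matching the query's sort order
    return sorted(flagged, key=lambda r: r["count"], reverse=True)
```

One caveat on the design: with a short history, a single spike inflates the standard deviation enough that it cannot cross the threshold. With fewer than six data points per group, no single value can sit more than two sample standard deviations above the mean, so a longer baseline window is likely needed before this rule fires.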
# Roll back to last known-good model version (MLflow example)
# NOTE: the registry subcommands below are illustrative; a stock MLflow CLI
# may not provide them. The equivalent registry operation in the Python client
# is MlflowClient.transition_model_version_stage.

# Identify current production model and previous version
mlflow models list --name "fraud-detection-prod"

# Transition current model to "Archived" and promote previous version
mlflow models transition-stage \
  --name "fraud-detection-prod" \
  --version 12 \
  --stage "Archived"

mlflow models transition-stage \
  --name "fraud-detection-prod" \
  --version 11 \
  --stage "Production"

# Verify rollback via health check
curl -s https://ml-serving.internal.example.com/api/v1/models/fraud-detection-prod/version
# Roll back model deployment in Kubernetes (KServe / Seldon example)
kubectl rollout undo deployment/fraud-detection-predictor -n ml-serving

# Verify rollback
kubectl rollout status deployment/fraud-detection-predictor -n ml-serving

# If needed: scale down compromised model and route to fallback
kubectl scale deployment/fraud-detection-predictor --replicas=0 -n ml-serving
kubectl scale deployment/fraud-detection-fallback --replicas=3 -n ml-serving
[ ] Disable confidence score / probability outputs (return class only)
[ ] Implement query budget per user/API key
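The two checklist items above could be implemented roughly as follows. This is a sketch under stated assumptions: `QueryBudget` and `class_only` are hypothetical helpers, the budget uses an in-memory sliding window (a shared store would be needed across multiple serving replicas), and the limits are placeholders to be tuned per model.

```python
import time
from collections import defaultdict, deque


class QueryBudget:
    """Sliding-window query budget tracked per API key (in-memory only)."""

    def __init__(self, max_queries, window_seconds, clock=time.monotonic):
        self._max = max_queries
        self._window = window_seconds
        self._clock = clock  # injectable for testing
        self._hits = defaultdict(deque)  # api_key -> timestamps of allowed calls

    def allow(self, api_key):
        """Return True and record the call if the key is within budget."""
        now = self._clock()
        hits = self._hits[api_key]
        # Drop timestamps that have aged out of the window
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True


def class_only(probabilities):
    """Return only the top class label, withholding confidence scores.

    probabilities: dict mapping class label -> predicted probability.
    """
    return max(probabilities, key=probabilities.get)
```

A serving handler might call `budget.allow(api_key)` before inference and return `class_only(scores)` instead of the raw score dictionary. Withholding probabilities reduces the information an attacker gains per query, though it does not by itself prevent extraction.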
# Block suspicious IPs at WAF (synthetic IP examples)
# AWS WAF IP set update
# Note: update-ip-set replaces the entire address list, so include any
# existing entries that should remain blocked.
aws wafv2 update-ip-set \
  --name "ML-API-Blocked-IPs" \
  --scope REGIONAL \
  --id "a1b2c3d4-e5f6-7890-abcd-ef1234567890" \
  --addresses "203.0.113.50/32" "198.51.100.75/32" "203.0.113.100/32" \
  --lock-token "$(aws wafv2 get-ip-set --name ML-API-Blocked-IPs --scope REGIONAL --id a1b2c3d4-e5f6-7890-abcd-ef1234567890 --query 'LockToken' --output text)"
# Synthetic example: training data integrity verification
# Compare current training data against verified baseline
import hashlib
import json

# Load baseline manifest (SHA256 hashes of each training file)
with open("/data/manifests/training_baseline_v11.json", "r") as f:
    baseline = json.load(f)

# Verify current training data against baseline
compromised_files = []
for file_path, expected_hash in baseline.items():
    with open(file_path, "rb") as f:
        actual_hash = hashlib.sha256(f.read()).hexdigest()
    if actual_hash != expected_hash:
        compromised_files.append({
            "file": file_path,
            "expected": expected_hash,
            "actual": actual_hash,
        })
        print(f"[MISMATCH] {file_path}")

print(f"\nTotal files checked: {len(baseline)}")
print(f"Compromised files: {len(compromised_files)}")
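The verification above presumes that a baseline manifest already exists. One way such a manifest might be produced, at the point where a dataset version is approved, is sketched below. The function names and the directory layout are assumptions for illustration; files are hashed in chunks so large training files do not have to fit in memory.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks to keep memory use bounded."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map every file under data_dir (recursively) to its SHA256 hash."""
    root = Path(data_dir)
    return {
        str(p): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def write_manifest(data_dir, manifest_path):
    """Write the manifest as JSON and return it.

    Keep manifest_path outside data_dir (and ideally on write-protected
    storage) so the baseline cannot be altered along with the data.
    """
    manifest = build_manifest(data_dir)
    with open(manifest_path, "w") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)
    return manifest
```

A baseline stored alongside the data it describes offers little protection, since an attacker who can modify training files can usually regenerate the hashes too. Signing the manifest or keeping it in a separate, access-controlled location is a reasonable hardening step.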
AI/ML SECURITY INCIDENT — EXECUTIVE BRIEF
Classification: CONFIDENTIAL
Incident ID: [IR-2026-XXXX]
Date: [YYYY-MM-DD]
Incident Type: [Model Poisoning / Prompt Injection / Data Compromise /
Model Extraction / Supply Chain]
SITUATION:
Our [model name/description] was [description of incident]. The model
serves [business function] and processes approximately [volume] of
[data type] daily.
IMPACT:
- Model served compromised predictions for approximately [duration]
- [X] decisions/transactions may have been affected
- Data exposure: [None confirmed / X records potentially exposed]
- No evidence of broader infrastructure compromise
RESPONSE:
- Model rolled back to version [X] at [timestamp]
- Investigation identified [root cause summary]
- [Clean model deployed / Retraining in progress]
ACTIONS REQUIRED:
- [Any executive decisions needed]
- [Budget/resource approvals if needed]
ESTIMATED RESOLUTION: [Date]