Integration Patterns for Security Operations Tooling¶
A modern SOC operates a portfolio of tools that must communicate reliably. Poor integrations are a leading cause of alert loss, delayed response, and analyst frustration. This document describes the canonical integration patterns used in security operations.
SOC Tool Integration Map¶
flowchart LR
EDR[EDR Platform]
SIEM[SIEM]
SOAR[SOAR Platform]
TIP[Threat Intel Platform]
CASE[Case Management\nJira / ServiceNow]
IAM[IAM Platform\nAD / Entra]
FW[Firewall / Proxy]
EMAIL[Email Gateway]
EDR -->|Alerts + Telemetry| SIEM
EMAIL -->|Security events| SIEM
FW -->|Network logs| SIEM
TIP -->|IOCs + Context| SIEM
SIEM -->|Correlated alerts| SOAR
SOAR -->|Create/update tickets| CASE
SOAR -->|Disable account| IAM
SOAR -->|Block IP/domain| FW
SOAR -->|Quarantine email| EMAIL
SOAR -->|Isolate host| EDR
TIP <-->|IOC sharing| SIEM
CASE -->|Status updates| SOAR
Integration Pattern 1: Bidirectional REST API¶
Use case: SIEM ↔ SOAR, SOAR ↔ Case Management, SIEM ↔ TIP
How it works: Both systems expose REST APIs. System A pushes events to System B's API endpoint; System B may call back to System A's API for updates.
Implementation:
# Example: SOAR creates case in management system
import requests

# SERVICENOW_URL, SN_USER, SN_PASS are loaded from the secrets vault (Nexus SecOps-104)
def create_case(alert: dict) -> str:
    """Create a ServiceNow incident from a SIEM alert."""
    payload = {
        "short_description": f"[SOC Alert] {alert['rule_name']}",
        "description": format_alert_details(alert),
        "severity": map_severity(alert['severity']),  # SIEM to ITSM mapping
        "assignment_group": "SOC Tier 1",
        "u_mitre_technique": alert.get('mitre_technique', ''),
        "u_siem_alert_id": alert['alert_id'],
    }
    response = requests.post(
        url=f"{SERVICENOW_URL}/api/now/table/incident",
        json=payload,
        auth=(SN_USER, SN_PASS),
        timeout=10,
    )
    response.raise_for_status()
    incident_id = response.json()['result']['number']
    return incident_id
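The `format_alert_details` and `map_severity` helpers referenced above are left to the integrator. A minimal sketch, assuming a five-level SIEM severity scale and ServiceNow-style 1–3 severity values (adjust both mappings to your instances):

```python
# Hypothetical helpers for the create_case example.
# The severity scales are assumptions — map to your SIEM and ITSM values.
SEVERITY_MAP = {
    "critical": 1,
    "high": 1,
    "medium": 2,
    "low": 3,
    "informational": 3,
}

def map_severity(siem_severity: str) -> int:
    """Translate a SIEM severity label to an ITSM severity value."""
    return SEVERITY_MAP.get(siem_severity.lower(), 2)  # default: Medium

def format_alert_details(alert: dict) -> str:
    """Render alert fields as a readable ticket description."""
    lines = [f"{key}: {value}" for key, value in sorted(alert.items())]
    return "\n".join(lines)
```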
Authentication: OAuth 2.0 (preferred), API key, or Basic Auth over TLS
Error handling:
- Retry with exponential backoff: 3 retries, delays of 1s / 5s / 30s
- Dead letter queue for permanently failed requests
- Alert on integration failure: "SOAR → Case Management integration down" — Nexus SecOps-110
Versioning: Pin to a specific API version; test against new API versions before upgrading in production.
Integration Pattern 2: Webhook / Event-Driven¶
Use case: Real-time alerting from SIEM to SOAR; EDR to SIEM; Cloud security alerts
How it works: When an event occurs, System A sends an HTTP POST to System B's pre-configured webhook URL.
[Event occurs in Source System]
→ POST https://soar.internal/webhook/siem-alert
→ Headers: X-Signature: HMAC-SHA256(payload, secret)
→ Body: JSON event payload
→ Response: 200 OK (acknowledged)
Security requirements for webhooks:
- MUST use HMAC signature verification to prevent spoofing
- Webhook endpoint MUST only accept connections from expected source IPs
- Use HTTPS only — never plain HTTP
# Webhook signature verification
import hmac
import hashlib

def verify_webhook_signature(payload: bytes, signature_header: str, secret: str) -> bool:
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    received = signature_header.replace("sha256=", "")
    return hmac.compare_digest(expected, received)
Idempotency: Webhook receivers MUST handle duplicate delivery. Include a unique event ID and deduplicate on that ID.
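Receiver-side deduplication can be as simple as a set of seen event IDs. A minimal in-process sketch, assuming the payload carries an `event_id` field (a production receiver would back the seen-set with Redis or a database so it survives restarts):

```python
def handle_webhook(event: dict, seen_ids: set) -> bool:
    """Process an event exactly once; return False for duplicate deliveries."""
    event_id = event["event_id"]  # assumed field name — match your payloads
    if event_id in seen_ids:
        return False  # duplicate delivery — already processed
    seen_ids.add(event_id)
    # ... create ticket / trigger playbook here ...
    return True
```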
Advantages: near-real-time; no polling overhead; source-driven.
Disadvantages: receiver must be always-available; no built-in retry from source.
Integration Pattern 3: TAXII/STIX for Threat Intelligence¶
Use case: Sharing and consuming threat intelligence indicators between TIP, ISAC, SIEM
How it works: TAXII 2.1 is the transport protocol; STIX 2.1 is the data format.
[TAXII Server (Intel Provider)]
├── Collection: /api/v21/collections/{id}/objects/
└── Objects: STIX Bundles (Indicators, Malware, TTP)
[TAXII Client (Your TIP / SIEM)]
└── Poll endpoint every N minutes for new objects
└── Parse STIX objects → extract IOCs
└── Import to TIP / SIEM blocklist / enrichment
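The polling step above can be sketched with a plain HTTP client. The base URL and collection ID here are placeholders, and real TAXII servers require authentication; the `added_after` parameter is the standard TAXII 2.1 cursor for fetching only new objects:

```python
import requests

TAXII_MEDIA_TYPE = "application/taxii+json;version=2.1"

def poll_collection(base_url: str, collection_id: str, added_after: str,
                    session=None) -> list:
    """Fetch STIX objects added to a TAXII collection since the last poll."""
    session = session or requests.Session()
    url = f"{base_url}/collections/{collection_id}/objects/"
    response = session.get(
        url,
        headers={"Accept": TAXII_MEDIA_TYPE},
        params={"added_after": added_after},  # cursor from the previous poll
        timeout=30,
    )
    response.raise_for_status()
    # TAXII 2.1 wraps results in an envelope with an "objects" array
    return response.json().get("objects", [])
```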
STIX Indicator example:
{
"type": "indicator",
"id": "indicator--12345678-abcd-...",
"created": "2024-11-15T10:00:00Z",
"modified": "2024-11-15T10:00:00Z",
"name": "Malicious IP — APT29 C2",
"pattern": "[ipv4-addr:value = '198.51.100.42']",
"pattern_type": "stix",
"valid_from": "2024-11-15T10:00:00Z",
"valid_until": "2025-02-15T10:00:00Z",
"confidence": 85,
"labels": ["malicious-activity"],
"external_references": [
{"source_name": "vendor-report", "url": "https://..."}
]
}
Implementation notes:
- Use indicator valid_until for automatic TTL — Nexus SecOps-090
- Filter by confidence score (only ingest confidence ≥ 70)
- Track false positive rate per feed — Nexus SecOps-095
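An ingest filter following these notes might look like the sketch below. The 70-point confidence floor matches the note above, and the pattern regex only handles the simple single-comparison form shown in the example indicator — real STIX patterns can contain boolean operators and need a proper pattern parser:

```python
import re
from datetime import datetime, timezone

# Matches simple patterns like "[ipv4-addr:value = '198.51.100.42']"
PATTERN_RE = re.compile(r"\[(?P<path>[\w:.-]+)\s*=\s*'(?P<value>[^']+)'\]")

def should_ingest(indicator: dict, now: datetime) -> bool:
    """Apply the confidence floor and valid_until TTL before ingesting."""
    if indicator.get("confidence", 0) < 70:
        return False
    valid_until = indicator.get("valid_until")
    if valid_until:
        expiry = datetime.fromisoformat(valid_until.replace("Z", "+00:00"))
        if expiry <= now:
            return False  # expired — honor the TTL
    return True

def extract_ioc(indicator: dict):
    """Pull (object path, value) out of a simple single-comparison pattern."""
    match = PATTERN_RE.search(indicator.get("pattern", ""))
    return (match.group("path"), match.group("value")) if match else None
```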
Integration Pattern 4: CEF/Syslog for Legacy Tools¶
Use case: Network devices, legacy security tools that don't support REST APIs
Common Event Format (CEF):
CEF:0|Vendor|Product|Version|SignatureID|Name|Severity|Extension
CEF:0|Palo Alto|PAN-OS|10.1|threat|Malware Blocked|8|src=10.0.0.1 dst=1.2.3.4 cs1=block
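Parsing CEF means splitting the seven pipe-delimited header fields and then tokenizing the `key=value` extension. A minimal sketch that handles the example line above — it ignores escaped characters (CEF allows `\|` and `\=`) and extension values containing spaces, both of which a production parser must handle:

```python
def parse_cef(line: str) -> dict:
    """Split a CEF line into header fields and extension key=value pairs."""
    # Header: CEF:Version|Vendor|Product|Version|SignatureID|Name|Severity|Ext
    parts = line.split("|", 7)
    header_names = ["cef_version", "vendor", "product", "version",
                    "signature_id", "name", "severity"]
    event = dict(zip(header_names, parts[:7]))
    event["cef_version"] = event["cef_version"].removeprefix("CEF:")
    # Extension: space-separated key=value pairs (escapes not handled here)
    extension = parts[7] if len(parts) > 7 else ""
    for token in extension.split():
        if "=" in token:
            key, _, value = token.partition("=")
            event[key] = value
    return event
```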
Syslog forwarding with TLS:
# rsyslog configuration for TLS syslog forwarding
*.* action(type="omfwd"
target="siem-collector.internal"
port="6514"
protocol="tcp"
StreamDriver="gtls"
StreamDriverMode="1"
StreamDriverAuthMode="x509/name"
StreamDriverPermittedPeers="siem-collector.internal")
Parsing challenges:
- CEF extensions are semi-structured — create a dedicated parser per source
- Test each parser against 1000+ real events before production deployment
- Monitor parse error rate per source — Nexus SecOps-025
Integration Pattern 5: Cloud-Native Event Bus¶
Use case: AWS EventBridge, Azure Event Grid, GCP Pub/Sub for cloud security events
AWS example — SecurityHub findings to SIEM:
[AWS SecurityHub]
→ EventBridge rule: findings with severity HIGH/CRITICAL
→ EventBridge → Lambda
→ Lambda normalizes and forwards to Kinesis
→ Kinesis → SIEM
Benefits:
- Serverless — no infrastructure to maintain
- Auto-scaling with event volume
- Native cloud authentication (IAM roles, no API keys)
- Built-in replay capability (Kinesis Data Streams)
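The normalization Lambda in the flow above might look like this sketch. Field names follow the AWS Security Finding Format (ASFF) as delivered by the EventBridge "Security Hub Findings - Imported" event; verify them against your actual payloads, and the Kinesis forwarding call is elided:

```python
def normalize_finding(finding: dict) -> dict:
    """Flatten an ASFF finding into the SIEM's expected shape."""
    return {
        "alert_id": finding.get("Id"),
        "title": finding.get("Title"),
        "severity": finding.get("Severity", {}).get("Label"),
        "account": finding.get("AwsAccountId"),
        "resource": (finding.get("Resources") or [{}])[0].get("Id"),
        "source": "aws-securityhub",
    }

def handler(event, context):
    """Lambda entry point: normalize each finding, then forward to Kinesis."""
    records = [normalize_finding(f) for f in event["detail"]["findings"]]
    # boto3 put_records call to the SIEM-ingest Kinesis stream goes here
    return {"normalized": len(records)}
```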
Nexus SecOps controls: Nexus SecOps-008, Nexus SecOps-121
Authentication Patterns for Integrations¶
| Pattern | Security Level | Use Case |
|---|---|---|
| API Key (Header) | Low | Internal integrations; simple tools |
| OAuth 2.0 Client Credentials | Medium | Service-to-service; automated flows |
| OAuth 2.0 + JWT | High | Identity-aware API calls |
| mTLS (Mutual TLS) | High | High-security integrations; financial/regulated |
| IAM Role (Cloud) | High | Cloud-native integrations (AWS/Azure/GCP) |
Requirements (Nexus SecOps-104):
- No API keys in source code or configuration files — use a secrets vault
- API keys rotated every 90 days minimum
- All integration credentials unique (no shared credentials across integrations)
- Credential access logged (who accessed the secret, when)
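One common way to satisfy the "no keys in code" requirement is to have a vault agent inject secrets at runtime and read them from the environment. A minimal sketch under that assumption — swap in your vault SDK (e.g. hvac for HashiCorp Vault) if the agent pattern isn't available:

```python
import os

def get_integration_credential(name: str) -> str:
    """Read a credential injected by the vault agent; fail loudly if absent."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"credential {name} not available - check vault agent")
    return value
```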
Rate Limiting and Error Handling¶
Rate limit handling:
import time
import requests

class IntegrationException(Exception):
    """Raised when an API call cannot be completed after retries."""

def api_call_with_retry(url, payload, max_retries=3):
    for attempt in range(max_retries):
        response = requests.post(url, json=payload, timeout=10)
        if response.status_code == 429:  # Rate limited
            retry_after = int(response.headers.get('Retry-After', 60))
            time.sleep(retry_after)
            continue
        if response.status_code >= 500:  # Server error — retry
            backoff = 2 ** attempt  # Exponential: 1s, 2s, 4s
            time.sleep(backoff)
            continue
        response.raise_for_status()  # Other 4xx = caller error — do not retry
        return response.json()
    # All retries exhausted — send to DLQ
    send_to_dead_letter_queue(url, payload)
    raise IntegrationException(f"API call failed after {max_retries} retries")
Circuit breaker pattern:
[Normal] → API call succeeds
[Degraded] → API calls failing > 50% in last 60s → Open circuit
[Open] → All calls fail fast; no actual API calls made
[Recovery] → After 30s, allow 1 test call → if success, close circuit
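The state machine above can be sketched as a small class. This version uses a simplified consecutive-failure count rather than the 50%-in-60s failure rate, and treats the first call after the cooldown as the half-open test call; the threshold and cooldown values are illustrative:

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown  # seconds before allowing a test call
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open - failing fast")
            # Cooldown elapsed: let one test call through (half-open)
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # open the circuit
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result
```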
Integration Testing¶
Before deploying any new integration:
| Test | What to Verify |
|---|---|
| Happy path | Expected payload produces expected result |
| Authentication failure | Integration handles 401/403 gracefully |
| Timeout | Integration handles network timeout without crashing |
| Rate limit | Integration handles 429 with backoff |
| Invalid payload | Integration handles malformed responses |
| Large payload | Integration handles oversized responses |
| Integration unavailable | System degrades gracefully; queues or alerts |
Common Integration Anti-Patterns¶
Do Not Do These
- Polling every 5 seconds when you need real-time: Use webhooks or event bus instead. High-frequency polling hammers APIs and costs money.
- No retry logic: A single timeout drops the event silently. Always retry.
- Hardcoded credentials: These end up in git history, logs, and breached secrets. Use a vault.
- No integration health monitoring: Silent failures leave you thinking the integration is working when it's not. Monitor and alert — Nexus SecOps-110.
- Trusting the payload without validation: Malicious content in log data can inject into SOAR or case management. Sanitize inputs.
- Tight coupling: If SOAR cannot start because the SIEM is down, that's a brittle design. Use async queues between systems.
- No idempotency: Duplicate webhook deliveries create duplicate tickets. Deduplicate on event ID.
See Reference Architecture | Data Pipeline Patterns
Nexus SecOps controls: AUT domain (Nexus SecOps-096–110), TEL domain (Nexus SecOps-001–015)