
Integration Patterns for Security Operations Tooling

A modern SOC operates a portfolio of tools that must communicate reliably. Poor integrations are a leading cause of alert loss, delayed response, and analyst frustration. This document describes the canonical integration patterns used in security operations.


SOC Tool Integration Map

flowchart LR
    EDR[EDR Platform]
    SIEM[SIEM]
    SOAR[SOAR Platform]
    TIP[Threat Intel Platform]
    CASE[Case Management\nJira / ServiceNow]
    IAM[IAM Platform\nAD / Entra]
    FW[Firewall / Proxy]
    EMAIL[Email Gateway]

    EDR -->|Alerts + Telemetry| SIEM
    EMAIL -->|Security events| SIEM
    FW -->|Network logs| SIEM
    TIP -->|IOCs + Context| SIEM
    SIEM -->|Correlated alerts| SOAR
    SOAR -->|Create/update tickets| CASE
    SOAR -->|Disable account| IAM
    SOAR -->|Block IP/domain| FW
    SOAR -->|Quarantine email| EMAIL
    SOAR -->|Isolate host| EDR
    TIP <-->|IOC sharing| SIEM
    CASE -->|Status updates| SOAR

Integration Pattern 1: Bidirectional REST API

Use case: SIEM ↔ SOAR, SOAR ↔ Case Management, SIEM ↔ TIP

How it works: Both systems expose REST APIs. System A pushes events to System B's API endpoint; System B may call back to System A's API with updates.

Implementation:

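# Example: SOAR creates case in management system
import os
import requests

# Connection settings; the environment variable names are illustrative
# and should be populated from a secrets vault, not hardcoded.
SERVICENOW_URL = os.environ["SERVICENOW_URL"]
SN_USER = os.environ["SN_USER"]
SN_PASS = os.environ["SN_PASS"]

# format_alert_details() and map_severity() are helpers assumed to be defined elsewhere.

def create_case(alert: dict) -> str:
    """Create a ServiceNow incident from a SIEM alert."""
    payload = {
        "short_description": f"[SOC Alert] {alert['rule_name']}",
        "description": format_alert_details(alert),
        "severity": map_severity(alert['severity']),  # SIEM to ITSM mapping
        "assignment_group": "SOC Tier 1",
        "u_mitre_technique": alert.get('mitre_technique', ''),
        "u_siem_alert_id": alert['alert_id']
    }
    response = requests.post(
        url=f"{SERVICENOW_URL}/api/now/table/incident",
        json=payload,
        auth=(SN_USER, SN_PASS),  # Basic Auth over TLS; prefer OAuth 2.0 where available
        timeout=10
    )
    response.raise_for_status()  # Raise on 4xx/5xx so the caller can retry or alert
    incident_id = response.json()['result']['number']
    return incident_id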
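
The example above calls `map_severity` and `format_alert_details` without defining them. The sketch below shows one way they might look. The severity labels, the 1-to-3 target scale, and the output layout are all assumptions for illustration, not a documented SIEM or ServiceNow contract.

```python
# Hypothetical helpers for create_case(); the severity scale and the
# alert fields are assumptions, not a documented vendor contract.
SEVERITY_MAP = {"critical": "1", "high": "1", "medium": "2", "low": "3"}

def map_severity(siem_severity: str) -> str:
    """Map a SIEM severity label to an ITSM severity value (default: lowest)."""
    return SEVERITY_MAP.get(str(siem_severity).strip().lower(), "3")

def format_alert_details(alert: dict) -> str:
    """Render alert fields as 'key: value' lines, sorted for stable output."""
    return "\n".join(f"{key}: {alert[key]}" for key in sorted(alert))
```

Keeping the mapping in a single table makes it easy to review when either system changes its severity scale.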

Authentication: OAuth 2.0 (preferred), API key, or Basic Auth over TLS

Error handling:

  • Retry with exponential backoff: 3 retries, delays of 1s / 5s / 30s
  • Dead letter queue for permanently failed requests
  • Alert on integration failure: "SOAR → Case Management integration down" (Nexus SecOps-110)

Versioning: Pin to specific API version; test on version upgrades before deploying


Integration Pattern 2: Webhook / Event-Driven

Use case: Real-time alerting from SIEM to SOAR; EDR to SIEM; Cloud security alerts

How it works: When an event occurs, System A sends an HTTP POST to System B's pre-configured webhook URL.

[Event occurs in Source System]
    → POST https://soar.internal/webhook/siem-alert
    → Headers: X-Signature: HMAC-SHA256(payload, secret)
    → Body: JSON event payload
    → Response: 200 OK (acknowledged)

Security requirements for webhooks:

  • MUST use HMAC signature verification to prevent spoofing
  • Webhook endpoint MUST only accept connections from expected source IPs
  • Use HTTPS only, never plain HTTP

# Webhook signature verification
import hmac
import hashlib

def verify_webhook_signature(payload: bytes, signature_header: str, secret: str) -> bool:
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    received = signature_header.replace("sha256=", "")
    return hmac.compare_digest(expected, received)
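
The source IP requirement can be sketched in the same style. This is a minimal example using the standard `ipaddress` module; the `ALLOWED_SOURCES` ranges and the function name are hypothetical, and behind a reverse proxy the caller would first need to obtain the real client address from a trusted source.

```python
import ipaddress

# Hypothetical allowlist: the networks the source system is expected to send from.
ALLOWED_SOURCES = [
    ipaddress.ip_network("10.20.0.0/24"),
    ipaddress.ip_network("192.0.2.10/32"),
]

def is_allowed_source(remote_addr: str) -> bool:
    """Return True if remote_addr parses as an IP inside an allowed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a valid IP address at all
    return any(addr in net for net in ALLOWED_SOURCES)
```

A receiver would typically run this check before signature verification, so that unexpected senders are rejected without any payload processing.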

Idempotency: Webhook receivers MUST handle duplicate delivery. Include a unique event ID and deduplicate on that ID.
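
A minimal sketch of deduplication on event ID follows. It assumes each payload carries a unique ID and keeps state in process memory. The class name, the one-hour TTL, and the injectable clock are illustrative choices; a multi-instance receiver would need a shared store instead.

```python
import time

class EventDeduplicator:
    """Remember event IDs for ttl_seconds and report repeats."""

    def __init__(self, ttl_seconds: float = 3600.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._seen = {}  # event_id -> time first seen

    def is_duplicate(self, event_id: str) -> bool:
        now = self.clock()
        # Drop expired entries so the cache does not grow without bound.
        self._seen = {eid: ts for eid, ts in self._seen.items()
                      if now - ts < self.ttl_seconds}
        if event_id in self._seen:
            return True
        self._seen[event_id] = now
        return False
```

The receiver would still return 200 OK for a duplicate, so that the source stops redelivering, but would skip ticket creation and playbook execution.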

Advantages: Near-real-time; no polling overhead; source-driven

Disadvantages: Receiver must be always-available; delivery retries depend on the source and are not guaranteed


Integration Pattern 3: TAXII/STIX for Threat Intelligence

Use case: Sharing and consuming threat intelligence indicators between TIP, ISAC, SIEM

How it works: TAXII 2.1 is the transport protocol; STIX 2.1 is the data format.

[TAXII Server (Intel Provider)]
    ├── Collection: /api/v21/collections/{id}/objects/
    └── Objects: STIX Bundles (Indicators, Malware, TTP)

[TAXII Client (Your TIP / SIEM)]
    ├── Poll endpoint every N minutes for new objects
    ├── Parse STIX objects → extract IOCs
    └── Import to TIP / SIEM blocklist / enrichment

STIX Indicator example:

{
  "type": "indicator",
  "id": "indicator--12345678-abcd-...",
  "created": "2024-11-15T10:00:00Z",
  "modified": "2024-11-15T10:00:00Z",
  "name": "Malicious IP — APT29 C2",
  "pattern": "[ipv4-addr:value = '198.51.100.42']",
  "pattern_type": "stix",
  "valid_from": "2024-11-15T10:00:00Z",
  "valid_until": "2025-02-15T10:00:00Z",
  "confidence": 85,
  "labels": ["malicious-activity"],
  "external_references": [
    {"source_name": "vendor-report", "url": "https://..."}
  ]
}

Implementation notes:

  • Use indicator valid_until for automatic TTL (Nexus SecOps-090)
  • Filter by confidence score (only ingest confidence ≥ 70)
  • Track false positive rate per feed (Nexus SecOps-095)
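
The confidence and TTL rules can be sketched as a small filter over parsed STIX indicators. This is an illustration only: it handles just the simple single-comparison pattern form shown in the example, treats a missing `confidence` as 0 (an assumption), and the function names are hypothetical. A production client would use a full STIX pattern parser.

```python
import re
from datetime import datetime, timezone

MIN_CONFIDENCE = 70  # ingest threshold (confidence >= 70)
# Matches only the simple form, e.g. [ipv4-addr:value = '198.51.100.42']
SIMPLE_PATTERN = re.compile(r"^\[([a-z0-9-]+):value\s*=\s*'([^']+)'\]$")

def parse_ts(value: str) -> datetime:
    """Parse a STIX timestamp such as 2024-11-15T10:00:00Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def extract_ioc(indicator: dict, now: datetime):
    """Return (observable_type, value), or None if the indicator should be skipped."""
    if indicator.get("type") != "indicator":
        return None
    if indicator.get("confidence", 0) < MIN_CONFIDENCE:
        return None  # below the ingest threshold
    valid_until = indicator.get("valid_until")
    if valid_until and parse_ts(valid_until) <= now:
        return None  # expired: valid_until acts as the TTL
    match = SIMPLE_PATTERN.match(indicator.get("pattern", ""))
    return (match.group(1), match.group(2)) if match else None
```

Passing `now` explicitly, for example `datetime.now(timezone.utc)`, keeps the expiry check easy to test against fixed dates.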


Integration Pattern 4: CEF/Syslog for Legacy Tools

Use case: Network devices, legacy security tools that don't support REST APIs

Common Event Format (CEF):

CEF:0|Vendor|Product|Version|SignatureID|Name|Severity|Extension
CEF:0|Palo Alto|PAN-OS|10.1|threat|Malware Blocked|8|src=10.0.0.1 dst=1.2.3.4 cs1=block

Syslog forwarding with TLS:

# rsyslog configuration for TLS syslog forwarding
*.* action(type="omfwd"
    target="siem-collector.internal"
    port="6514"
    protocol="tcp"
    StreamDriver="gtls"
    StreamDriverMode="1"
    StreamDriverAuthMode="x509/name"
    StreamDriverPermittedPeers="siem-collector.internal")

Parsing challenges:

  • CEF extensions are semi-structured, so create a dedicated parser per source
  • Test parser against 1000+ real events before production deployment
  • Monitor parse error rate per source (Nexus SecOps-025)
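
As a starting point for a per-source parser, the sketch below splits the seven pipe-delimited header fields and then reads the extension as key=value pairs. It is a generic baseline under simplifying assumptions: the field names in `HEADER_FIELDS` are illustrative, escape sequences are left in place rather than decoded, and a value that itself contains a ` word=` sequence will be split incorrectly, which is exactly why source-specific testing matters.

```python
import re

HEADER_FIELDS = ["version", "vendor", "product", "device_version",
                 "signature_id", "name", "severity"]
UNESCAPED_PIPE = re.compile(r"(?<!\\)\|")
# A value runs until the next " key=" token or end of line, so values may contain spaces.
EXTENSION_PAIR = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")

def parse_cef(line: str) -> dict:
    """Parse one CEF line into header fields plus an 'extension' dict."""
    start = line.find("CEF:")  # skip any syslog prefix before the CEF header
    if start == -1:
        raise ValueError("not a CEF line")
    parts = UNESCAPED_PIPE.split(line[start + 4:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("CEF header has fewer than 7 fields")
    event = dict(zip(HEADER_FIELDS, parts[:7]))
    extension = parts[7] if len(parts) == 8 else ""
    event["extension"] = {k: v.strip() for k, v in EXTENSION_PAIR.findall(extension)}
    return event
```

Raising `ValueError` on malformed lines gives the caller a natural place to count parse errors per source.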


Integration Pattern 5: Cloud-Native Event Bus

Use case: AWS EventBridge, Azure Event Grid, GCP Pub/Sub for cloud security events

AWS example — SecurityHub findings to SIEM:

[AWS SecurityHub]
    → EventBridge rule: findings with severity HIGH/CRITICAL
    → EventBridge → Lambda
    → Lambda normalizes and forwards to Kinesis
    → Kinesis → SIEM

Benefits:

  • Serverless: no infrastructure to maintain
  • Auto-scaling with event volume
  • Native cloud authentication (IAM roles, no API keys)
  • Built-in replay capability (Kinesis Data Streams)
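
The Lambda normalization step in the flow above might look like the following sketch. It assumes the EventBridge event carries findings under `detail.findings` in AWS Security Finding Format. The output field names are illustrative, and `forward` is a stand-in for the Kinesis write so the example runs without AWS dependencies.

```python
import json

FORWARDED_SEVERITIES = {"HIGH", "CRITICAL"}

def normalize_finding(finding: dict) -> dict:
    """Flatten one finding into a small, SIEM-friendly record."""
    return {
        "source": "aws.securityhub",
        "finding_id": finding.get("Id"),
        "title": finding.get("Title"),
        "severity": finding.get("Severity", {}).get("Label", "UNKNOWN"),
        "account_id": finding.get("AwsAccountId"),
        "resources": [r.get("Id") for r in finding.get("Resources", [])],
        "updated_at": finding.get("UpdatedAt"),
    }

def handler(event: dict, context=None, forward=print) -> int:
    """Lambda-style entry point; returns the number of records forwarded.

    `forward` is a placeholder for the Kinesis write (an assumption of this sketch).
    """
    sent = 0
    for finding in event.get("detail", {}).get("findings", []):
        record = normalize_finding(finding)
        # The EventBridge rule already filters by severity; re-check defensively.
        if record["severity"] in FORWARDED_SEVERITIES:
            forward(json.dumps(record))
            sent += 1
    return sent
```

In a deployed function, `forward` would wrap the stream client's put call, with the Lambda execution role granting write access to the stream.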

Nexus SecOps control: Nexus SecOps-008, Nexus SecOps-121


Authentication Patterns for Integrations

| Pattern | Security Level | Use Case |
|---|---|---|
| API Key (Header) | Low | Internal integrations; simple tools |
| OAuth 2.0 Client Credentials | Medium | Service-to-service; automated flows |
| OAuth 2.0 + JWT | High | Identity-aware API calls |
| mTLS (Mutual TLS) | High | High-security integrations; financial/regulated |
| IAM Role (Cloud) | High | Cloud-native integrations (AWS/Azure/GCP) |

Requirements (Nexus SecOps-104):

  • No API keys in source code or configuration files; use a secrets vault
  • API keys rotated at least every 90 days
  • All integration credentials unique (no shared credentials across integrations)
  • Credential access logged (who accessed the secret, when)


Rate Limiting and Error Handling

Rate limit handling:

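import time
import requests

class IntegrationException(Exception):
    """Raised when an integration call fails after all retries."""

def api_call_with_retry(url, payload, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(url, json=payload, timeout=10)
        except (requests.ConnectionError, requests.Timeout):
            time.sleep(2 ** attempt)  # Network failure: back off and retry
            continue

        if response.status_code == 429:  # Rate limited
            retry_after = response.headers.get('Retry-After', '60')
            # Retry-After may also be an HTTP date; fall back to 60s in that case
            time.sleep(int(retry_after) if retry_after.isdigit() else 60)
            continue

        if response.status_code >= 500:  # Server error: retry
            backoff = 2 ** attempt  # Exponential: 1s, 2s, 4s
            time.sleep(backoff)
            continue

        response.raise_for_status()  # Other 4xx: client-side error, retrying will not help
        return response.json()

    # All retries exhausted: send to DLQ
    send_to_dead_letter_queue(url, payload)  # Assumed to be defined elsewhere
    raise IntegrationException(f"API call failed after {max_retries} retries")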

Circuit breaker pattern:

[Closed / Normal] → API calls pass through and succeed
[Degraded] → More than 50% of API calls failed in the last 60s → open the circuit
[Open] → All calls fail fast; no actual API calls are made
[Half-open / Recovery] → After 30s, allow 1 test call → on success, close the circuit; on failure, reopen it
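
The states above can be sketched as a small wrapper around any callable. This is a single-threaded illustration: the class and exception names are hypothetical, and `min_calls` is an added assumption so that one early failure does not trip the breaker. The thresholds default to the 50% / 60s / 30s values listed above.

```python
import time
from collections import deque

class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""

class CircuitBreaker:
    def __init__(self, failure_ratio=0.5, window=60.0, recovery=30.0,
                 min_calls=5, clock=time.monotonic):
        self.failure_ratio, self.window, self.recovery = failure_ratio, window, recovery
        self.min_calls, self.clock = min_calls, clock
        self.results = deque()   # (timestamp, succeeded) for recent calls
        self.opened_at = None    # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        now = self.clock()
        if self.opened_at is not None and now - self.opened_at < self.recovery:
            raise CircuitOpenError("failing fast; circuit is open")
        probing = self.opened_at is not None  # recovery period over: this is the test call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._record(now, False)
            if probing or self._should_open():
                self.opened_at = now  # open (or reopen) the circuit
            raise
        self._record(now, True)
        if probing:
            self.opened_at = None     # test call succeeded: close the circuit
            self.results.clear()
        return result

    def _record(self, now, succeeded):
        self.results.append((now, succeeded))
        # Keep only results from the last `window` seconds.
        while self.results and now - self.results[0][0] > self.window:
            self.results.popleft()

    def _should_open(self):
        failures = sum(1 for _, ok in self.results if not ok)
        return (len(self.results) >= self.min_calls
                and failures / len(self.results) > self.failure_ratio)
```

A caller would wrap each outbound request, for example `breaker.call(api_call_with_retry, url, payload)`, and route `CircuitOpenError` to a queue rather than dropping the event.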


Integration Testing

Before deploying any new integration:

| Test | What to Verify |
|---|---|
| Happy path | Expected payload produces expected result |
| Authentication failure | Integration handles 401/403 gracefully |
| Timeout | Integration handles network timeout without crashing |
| Rate limit | Integration handles 429 with backoff |
| Invalid payload | Integration handles malformed responses |
| Large payload | Integration handles oversized responses |
| Integration unavailable | System degrades gracefully; queues or alerts |

Common Integration Anti-Patterns

Do Not Do These

  • Polling every 5 seconds when you need real-time: Use webhooks or event bus instead. High-frequency polling hammers APIs and costs money.
  • No retry logic: A single timeout drops the event silently. Always retry.
  • Hardcoded credentials: These end up in git history, logs, and breached secrets. Use a vault.
  • No integration health monitoring: Silent failures leave you thinking the integration is working when it's not. Monitor and alert — Nexus SecOps-110.
  • Trusting the payload without validation: Malicious content in log data can inject into SOAR or case management. Sanitize inputs.
  • Tight coupling: If SOAR cannot start because the SIEM is down, that's a brittle design. Use async queues between systems.
  • No idempotency: Duplicate webhook deliveries create duplicate tickets. Deduplicate on event ID.
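
For the payload validation item in the list above, a minimal sketch of input checking before an alert reaches SOAR or case management is shown below. The required field names mirror the earlier `create_case` example, while the length cap and function names are hypothetical. Escaping for the target system's own markup is not covered and would need to be added per tool.

```python
import re

MAX_FIELD_LENGTH = 2000  # hypothetical cap; tune per target system
# ASCII control characters except tab, newline, and carriage return
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")

def sanitize_field(value, max_length: int = MAX_FIELD_LENGTH) -> str:
    """Coerce to str, strip control characters, and cap the length."""
    text = CONTROL_CHARS.sub("", str(value))
    return text[:max_length]

def validate_alert(alert: dict, required=("alert_id", "rule_name", "severity")) -> dict:
    """Reject alerts missing required fields; sanitize every value as text."""
    missing = [key for key in required if key not in alert]
    if missing:
        raise ValueError(f"alert missing required fields: {missing}")
    return {key: sanitize_field(val) for key, val in alert.items()}
```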

See Reference Architecture | Data Pipeline Patterns

Nexus SecOps controls: AUT domain (Nexus SecOps-096–110), TEL domain (Nexus SecOps-001–015)