IOC & Regex Pattern Library

Purpose

Production-ready regex patterns, KQL/SPL extraction snippets, and false-positive reduction guidance for analysts. Covers all major IOC types used in detection, hunting, and triage.


IPv4 & Network Patterns

IPv4 Address

\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b

KQL extraction:

| extend ips = extract_all(@'\b((?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?))\b', Message)  // extract_all needs a capturing group

False positive reduction: Exclude RFC 1918 (10.x, 172.16–31.x, 192.168.x), loopback (127.x), link-local (169.254.x), TEST-NET (192.0.2.x, 198.51.100.x, 203.0.113.x).

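// Expand the extracted array, then exclude private, loopback, link-local, and TEST-NET ranges
| mv-expand extracted_ip = ips to typeof(string)
| where not(ipv4_is_private(extracted_ip))
| where not(ipv4_is_in_any_range(extracted_ip, "127.0.0.0/8", "169.254.0.0/16", "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"))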
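
For offline triage outside the SIEM, the same extraction and exclusion logic can be sketched in Python. This is a minimal sketch, not a reference implementation: the names `extract_public_ipv4` and `EXCLUDED_NETS` are illustrative assumptions, and the excluded networks simply mirror the ranges listed above.

```python
import ipaddress
import re

# Same IPv4 pattern as in this section.
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b"
)

# Assumption: exclusions mirror the false-positive list above.
EXCLUDED_NETS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",    # RFC 1918
    "127.0.0.0/8", "169.254.0.0/16",                    # loopback, link-local
    "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24" # TEST-NET
)]

def extract_public_ipv4(text):
    """Return unique IPv4 addresses in text that fall outside EXCLUDED_NETS."""
    found = []
    for candidate in IPV4_RE.findall(text):
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            # The regex can accept forms the parser rejects (e.g. leading zeros).
            continue
        if any(addr in net for net in EXCLUDED_NETS):
            continue
        if candidate not in found:
            found.append(candidate)
    return found
```

Keeping the exclusion list explicit, rather than relying on a library notion of "private", makes it easy to see which ranges are dropped and to extend the list for a given environment.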

CIDR Notation

\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)/(?:[12]?\d|3[0-2])\b

IPv6 Address (simplified)

(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}|(?:[0-9a-fA-F]{1,4}:){1,7}:|::(?:[0-9a-fA-F]{1,4}:){0,6}[0-9a-fA-F]{1,4}
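
Because the pattern is simplified, it can truncate compressed addresses: against `2001:db8::1` it matches only `2001:db8::`. One hedged alternative for offline processing is to let a parser decide. The sketch below assumes candidates are whitespace-delimited tokens and uses Python's standard `ipaddress` module; the function name `find_ipv6` is illustrative.

```python
import ipaddress

def find_ipv6(text):
    """Return whitespace-delimited tokens that parse as IPv6 addresses."""
    hits = []
    for token in text.split():
        # Strip common surrounding punctuation such as brackets and commas.
        token = token.strip("[](),;\"'.")
        if ":" not in token:
            continue
        try:
            ipaddress.IPv6Address(token)
        except ValueError:
            continue  # e.g. timestamps like 12:30:45
        hits.append(token)
    return hits
```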

Domain & URL Patterns

Fully Qualified Domain Name (FQDN)

\b(?:[a-zA-Z0-9](?:[a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}\b

False positive reduction: Filter against a top-1M popularity allowlist such as Cisco Umbrella (the Alexa top-sites list was retired in 2022). Flag newly registered domains (< 30 days old) with high entropy.

KQL domain entropy scoring:

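// Rough proxy only: log2(letter count) is the maximum entropy possible for that many letters, not the Shannon entropy itself
| extend domain_entropy = log2(array_length(extract_all(@'([a-z])', tolower(extracted_domain))))
// High entropy subdomain (DGA detection): per-character Shannon entropy > 3.5 is suspicious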
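
Per-character Shannon entropy is computed from character frequencies rather than from string length alone. A minimal Python sketch follows; the function name `shannon_entropy` is an assumption for illustration, and the 3.5 threshold above would be applied to its result.

```python
import math
from collections import Counter

def shannon_entropy(label):
    """Shannon entropy of a string, in bits per character."""
    if not label:
        return 0.0
    counts = Counter(label.lower())
    total = len(label)
    # H = -sum(p * log2(p)) over the observed character frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A label made of one repeated character scores 0, and a label of four distinct characters scores exactly 2 bits, so short labels can never reach a high threshold regardless of how random they look.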


URL (HTTP/HTTPS)

https?://(?:[a-zA-Z0-9\-]+\.)+[a-zA-Z]{2,}(?:/[^\s"'<>]*)?

SPL extraction:

| rex field=_raw "(?P<url>https?://[^\s\"'<>]+)"
| eval domain=replace(url,"https?://([^/]+).*","\1")
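
The same two-step extraction can be sketched in Python for ad hoc analysis. This is an illustrative sketch (the name `extract_url_hosts` is an assumption); unlike the `replace` above, `urlsplit(...).hostname` also drops any port and userinfo and lowercases the host.

```python
import re
from urllib.parse import urlsplit

# Same permissive URL pattern as the rex above.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def extract_url_hosts(raw):
    """Return (url, hostname) pairs found in a raw log line."""
    pairs = []
    for url in URL_RE.findall(raw):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed authority, e.g. an unclosed IPv6 bracket
        if host:
            pairs.append((url, host))
    return pairs
```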


URL with suspicious characteristics

// Long URL (> 200 chars), IP-based URL, or URL with encoded characters
DeviceNetworkEvents
| where RemoteUrl matches regex @"https?://\d+\.\d+\.\d+\.\d+"  // IP-based
    or strlen(RemoteUrl) > 200
    or RemoteUrl contains "%2F%2F"  // URL-encoded "//" (a double-encoded slash would be %252F)
    or RemoteUrl contains ".php?id="

File Hash Patterns

MD5

\b[0-9a-fA-F]{32}\b

SHA-1

\b[0-9a-fA-F]{40}\b

SHA-256

\b[0-9a-fA-F]{64}\b

SHA-512

\b[0-9a-fA-F]{128}\b

KQL multi-hash IOC lookup:

let ioc_hashes = externaldata(hash:string)[@"https://your-feed/hashes.csv"] with (format="csv");
DeviceFileEvents
| where SHA256 in (ioc_hashes) or MD5 in (ioc_hashes)
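
Since the four hash patterns differ only in length, a candidate can be typed by length alone before lookup. The sketch below is illustrative (`classify_hashes` and `HASH_TYPES` are assumed names); length is only a hint, because any 32-character hex string, such as a GUID without dashes, looks like an MD5.

```python
import re

# Hex length -> most likely hash type.
HASH_TYPES = {32: "MD5", 40: "SHA-1", 64: "SHA-256", 128: "SHA-512"}
HEX_RE = re.compile(r"\b[0-9a-fA-F]{32,128}\b")

def classify_hashes(text):
    """Return (type, lowercase_hash) for hex runs whose length matches a known hash."""
    results = []
    for token in HEX_RE.findall(text):
        kind = HASH_TYPES.get(len(token))
        if kind:
            results.append((kind, token.lower()))
    return results
```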


CVE Identifiers

CVE Pattern

CVE-\d{4}-\d{4,7}

KQL extraction:

| extend cves = extract_all(@'(CVE-\d{4}-\d{4,7})', Message)  // extract_all needs a capturing group
| where array_length(cves) > 0

Context enrichment: Join against NVD CVSS score lookup or internal vulnerability scanner data.
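
Before joining against a score lookup, it usually helps to normalize and de-duplicate the extracted IDs. A minimal Python sketch, assuming case-insensitive matching is wanted (the pattern above is case-sensitive) and using the illustrative name `extract_cves`:

```python
import re

# Assumption: match case-insensitively and normalize to upper case.
CVE_RE = re.compile(r"CVE-(\d{4})-(\d{4,7})", re.IGNORECASE)

def extract_cves(text):
    """Return unique CVE IDs in canonical upper case, in order of first appearance."""
    seen = []
    for year, seq in CVE_RE.findall(text):
        cve = f"CVE-{year}-{seq}"
        if cve not in seen:
            seen.append(cve)
    return seen
```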


Credential & Secret Patterns

AWS Access Key ID

\bAKIA[0-9A-Z]{16}\b

AWS Secret Access Key (context-dependent)

(?i)aws.{0,20}secret.{0,20}['\"][0-9a-zA-Z/+]{40}['\"]

Generic API Key Pattern

(?i)(?:api[_\-]?key|apikey|api[_\-]?secret)\s*[:=]\s*['"]?([0-9a-zA-Z\-_]{20,64})['"]?

JWT Token

eyJ[A-Za-z0-9_\-]{10,}\.eyJ[A-Za-z0-9_\-]{10,}\.[A-Za-z0-9_\-]{10,}

KQL secret detection in logs:

| where Message matches regex @'(?i)(password|passwd|secret|apikey|api_key|token)\s*[:=]\s*\S{8,}'
| where Message !contains "****"  // exclude already-masked values
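
The credential patterns in this section can also be applied line by line in a script, for example in a pre-commit check. The sketch below is one possible arrangement, not a complete scanner: `SECRET_PATTERNS`, `mask`, and `scan_for_secrets` are illustrative names, and findings are masked so the scanner's own output does not leak the secret.

```python
import re

# Patterns copied from this section; the dictionary keys are illustrative labels.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "jwt": re.compile(
        r"eyJ[A-Za-z0-9_\-]{10,}\.eyJ[A-Za-z0-9_\-]{10,}\.[A-Za-z0-9_\-]{10,}"
    ),
    "generic_api_key": re.compile(
        r"(?i)(?:api[_\-]?key|apikey|api[_\-]?secret)\s*[:=]\s*"
        r"['\"]?([0-9a-zA-Z\-_]{20,64})['\"]?"
    ),
}

def mask(value, keep=4):
    """Keep the first few characters and replace the rest with asterisks."""
    return value[:keep] + "*" * (len(value) - keep)

def scan_for_secrets(line):
    """Return (pattern_name, masked_secret) for every match in a line."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        for m in pattern.finditer(line):
            # Use the capturing group when the pattern has one, else the whole match.
            secret = m.group(m.lastindex or 0)
            findings.append((name, mask(secret)))
    return findings
```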

DLP Note

These patterns should trigger DLP alerts when found in outbound email, cloud uploads, or code repositories. Integrate with GitHub/GitLab secret scanning.


Windows Artifact Patterns

Windows Registry Path

(?i)HKEY_(?:LOCAL_MACHINE|CURRENT_USER|CLASSES_ROOT|USERS|CURRENT_CONFIG)(?:\\[^\s"'<>\\]+)+

Short form:

(?i)HK(?:LM|CU|CR|U|CC)(?:\\[^\s"'<>\\]+)+

Windows File Path

[A-Za-z]:\\(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*

UNC Network Path

\\\\[a-zA-Z0-9\-\.]+\\[a-zA-Z0-9\-\.\$]+(?:\\[^\\/:*?"<>|\r\n]*)*

KQL UNC path detection (lateral movement indicator):

DeviceProcessEvents
| where ProcessCommandLine matches regex @'\\\\\w[\w\-\.]{0,62}\\\w+'
| where InitiatingProcessFileName !in ("explorer.exe", "msiexec.exe")


Base64 & Encoded Content

Base64 Blob (suspicious length)

[A-Za-z0-9+/]{40,}={0,2}

KQL PowerShell encoded command detection:

DeviceProcessEvents
| where FileName in~ ("powershell.exe", "pwsh.exe")
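| where ProcessCommandLine matches regex @'(?i)\s-e(?:c|n[a-z]*)?\s+[A-Za-z0-9+/=]{20,}'  // also catches abbreviated forms such as -e, -ec, -enc
// Note: the payload is UTF-16LE, so the decoded string contains interleaved null bytes; strip them before matching on content
| extend decoded_cmd = base64_decode_tostring(extract(@'(?i)\s-e(?:c|n[a-z]*)?\s+([A-Za-z0-9+/=]+)', 1, ProcessCommandLine))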
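
PowerShell expects the `-EncodedCommand` payload to be Base64 over UTF-16LE text, so decoding it outside the SIEM needs that codec rather than UTF-8. A minimal Python sketch follows; the function name `decode_encoded_command` and the abbreviated-flag regex are assumptions for illustration.

```python
import base64
import re

# Assumption: accept -e, -ec and any -en... prefix (e.g. -enc, -EncodedCommand).
ENC_RE = re.compile(r"(?i)\s-e(?:c|n[a-z]*)?\s+([A-Za-z0-9+/=]{20,})")

def decode_encoded_command(command_line):
    """Return the decoded PowerShell command, or None if absent or undecodable."""
    m = ENC_RE.search(command_line)
    if not m:
        return None
    try:
        raw = base64.b64decode(m.group(1), validate=True)
        return raw.decode("utf-16-le")
    except (ValueError, UnicodeDecodeError):
        # Covers malformed Base64 and byte sequences that are not UTF-16LE.
        return None
```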

Hex-encoded shellcode

(?:[0-9a-fA-F]{2}){16,}

XOR-encoded indicator (common malware pattern)

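Single-byte XOR encoding replaces each byte with that byte XORed against a constant key, so detection is statistical rather than regex-based: try each candidate key and check whether the decoded output looks like text.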

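# Python: guess a single-byte XOR key by scoring how text-like the decoded sample is
def detect_xor_key(data, sample_size=16):
    if len(data) < sample_size * 2:
        return None
    best_key, best_score = None, -1
    for key in range(1, 256):
        decoded = bytes(b ^ key for b in data[:sample_size])
        # bytes has no isprintable(); require printable ASCII explicitly
        if not all(32 <= b < 127 for b in decoded):
            continue
        # Every candidate has the same length, so rank by letters and spaces instead
        score = sum(chr(b).isalpha() or b == 32 for b in decoded)
        if score > best_score:
            best_key, best_score = key, score
    return best_key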

Email & Phishing Patterns

Email Address

\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}\b

Homoglyph Detection (IDN homograph attacks)

// Flag Punycode-encoded (IDN) labels or raw Cyrillic characters in the URL
| where RemoteUrl contains "xn--"
    or RemoteUrl matches regex @'\p{Cyrillic}'
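
Mixed-script labels (for example, a Cyrillic "а" inside an otherwise Latin name) can be approximated in Python from Unicode character names. This is a heuristic sketch under a stated assumption: the first word of `unicodedata.name()` is treated as the script, which holds for common Latin, Cyrillic, and Greek letters but is not a full Unicode script lookup. The function names are illustrative.

```python
import unicodedata

def scripts_in_label(label):
    """Approximate the set of scripts used by the letters in a label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Assumption: first word of the character name identifies the script.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def flag_domain(domain):
    """Return a reason per suspicious label: 'punycode' or 'mixed-script'."""
    reasons = []
    for label in domain.lower().split("."):
        if label.startswith("xn--"):
            reasons.append("punycode")
        elif len(scripts_in_label(label)) > 1:
            reasons.append("mixed-script")
    return reasons
```

A label written entirely in one non-Latin script is not flagged as mixed-script, which is intentional here: legitimate internationalized domains are common, and the lookalike risk comes mainly from mixing.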

Phishing URL Patterns

# Common phishing URL patterns
(?i)(?:signin|login|secure|verify|update|account|banking)\.[a-z0-9\-]+\.(?:com|net|info|org)

Malware & C2 Patterns

Cobalt Strike Malleable C2 Jitter Pattern (log detection)

// Detect consistent inter-arrival beaconing (low jitter = automated C2)
DeviceNetworkEvents
| where RemotePort in (80, 443, 8080, 8443)
| summarize
    count_=count(),
    timestamps=make_list(todatetime(Timestamp)),
    bytes=sum(SentBytes)
    by DeviceName, RemoteIP
| where count_ > 20
// Next: derive inter-arrival intervals from the sorted timestamps, then calculate jitter: stdev(intervals) / avg(intervals) < 0.1 = likely beacon
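
The jitter calculation described in the final comment can be sketched in Python once the timestamps are exported. This is a minimal sketch assuming timestamps are numeric epoch seconds; the function names are illustrative, and the defaults reuse the thresholds from the query (more than 20 events, ratio below 0.1).

```python
from statistics import mean, pstdev

def beacon_jitter(timestamps):
    """Coefficient of variation of inter-arrival times; None if it cannot be computed."""
    ordered = sorted(timestamps)
    intervals = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(intervals) < 2:
        return None
    avg = mean(intervals)
    if avg == 0:
        return None
    return pstdev(intervals) / avg

def is_likely_beacon(timestamps, max_jitter=0.1, min_events=20):
    """Apply the thresholds above: enough events and low relative jitter."""
    jitter = beacon_jitter(timestamps)
    return len(timestamps) > min_events and jitter is not None and jitter < max_jitter
```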

DGA (Domain Generation Algorithm) Detection

// High non-vowel ratio = likely DGA domain (digits, dots, and hyphens also count as non-vowels here)
DnsEvents
| extend
    domain_len = strlen(Name),
    vowels = array_length(extract_all(@'([aeiou])', tolower(Name)))
| extend consonant_ratio = 1.0 - (vowels * 1.0 / domain_len)
| where consonant_ratio > 0.7
| where domain_len > 12
| where Name !endswith ".microsoft.com"
| where Name !endswith ".windows.com"

Suspicious Process Ancestry

// Office apps spawning shells (macro execution / phishing)
DeviceProcessEvents
| where InitiatingProcessFileName in~ ("winword.exe","excel.exe","powerpnt.exe","outlook.exe")
| where FileName in~ ("cmd.exe","powershell.exe","wscript.exe","cscript.exe","mshta.exe")

Splunk SPL Equivalents

KQL Pattern | SPL Equivalent
extract_all(@'pattern', field) | rex field=field "(?P<name>pattern)" max_match=100
ipv4_is_private(ip) | lookup private_ips.csv ip OUTPUT is_private
base64_decode_tostring(str) | eval decoded=base64decode(encoded_val)
matches regex @'pattern' | regex field="pattern" or where match(field,"pattern")
strlen(field) | eval len=len(field)
summarize count() by field | stats count by field

IOC Enrichment Workflow

graph LR
    IOC[Raw IOC] --> TYPE{Type?}
    TYPE -->|IP| GEO[GeoIP + ASN Lookup]
    TYPE -->|Domain| WHOIS[WHOIS + Passive DNS]
    TYPE -->|Hash| VT[VirusTotal + MalwareBazaar]
    TYPE -->|URL| URLSCAN[urlscan.io + PhishTank]
    GEO & WHOIS & VT & URLSCAN --> SCORE[Risk Score Calculation]
    SCORE --> ACT{Score > Threshold?}
    ACT -->|Yes| BLOCK[Block + Alert]
    ACT -->|No| WATCH[Watchlist + Monitor]

Risk Scoring Formula

IOC_Score = (TI_Hits × 3) + (Age_Days < 30 ? 2 : 0) + (Seen_Count × 0.5)
           + (Geo_Risk) + (ASN_Risk) + (Name_Entropy × 2)

Thresholds: 0-5 = Low | 6-10 = Medium | 11-15 = High | 16+ = Critical
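
The formula and thresholds translate directly into code. The sketch below is illustrative: parameter names follow the formula, the input scales for `geo_risk`, `asn_risk`, and `name_entropy` are not defined in this document and are left to the caller, and fractional scores that fall between bands (for example 5.5) are assigned to the lower band as an assumption.

```python
def ioc_score(ti_hits, age_days, seen_count, geo_risk, asn_risk, name_entropy):
    """Compute IOC_Score as written in the formula above."""
    score = ti_hits * 3
    score += 2 if age_days < 30 else 0
    score += seen_count * 0.5
    score += geo_risk + asn_risk
    score += name_entropy * 2
    return score

def severity(score):
    """Map a score to the bands above; band lower bounds act as cutoffs."""
    if score >= 16:
        return "Critical"
    if score >= 11:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"
```

For example, two threat-intel hits on a 10-day-old indicator seen four times, with geo and ASN risk of 1 each and no entropy contribution, gives 6 + 2 + 2 + 2 = 12, which lands in the High band.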

Quick Reference Card

IOC Type | Pattern / Length | Key Tool | False Positive Risk
MD5 hash | 32 hex chars | VirusTotal | Low
SHA-256 | 64 hex chars | VirusTotal | Very Low
IPv4 | 7–15 chars | GeoIP, AbuseIPDB | Medium (shared hosting)
Domain | Variable | WHOIS, PassiveDNS | Medium (CDN, dynamic DNS)
URL | Variable | urlscan.io | Medium
CVE | CVE-YYYY-NNNNN | NVD, vendor advisories | Very Low
Email | Variable | Header analysis | High (spoofing)
JWT | 3-part base64 | jwt.io decode | Low
AWS key | AKIA+16 chars | AWS IAM | Very Low