IOC & Regex Pattern Library

Purpose

Production-ready regex patterns, KQL/SPL extraction snippets, and false-positive reduction guidance for analysts. Covers all major IOC types used in detection, hunting, and triage.

IPv4 & Network Patterns

IPv4 Address

\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b

KQL extraction:

// extract_all requires at least one capture group, so the whole address is wrapped in (...)
| extend ips = extract_all(@'\b((?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?))\b', Message)

False positive reduction: Exclude RFC 1918 (10.x, 172.16–31.x, 192.168.x), loopback (127.x), link-local (169.254.x), TEST-NET (192.0.2.x, 198.51.100.x, 203.0.113.x).

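// Exclude private, loopback, link-local, and documentation (TEST-NET) ranges
| mv-expand extracted_ip = ips to typeof(string)
| where not(ipv4_is_private(extracted_ip))
| where not(ipv4_is_in_any_range(extracted_ip, "127.0.0.0/8", "169.254.0.0/16",
    "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"))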
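The same extract-then-filter flow can be sketched outside the SIEM. The following is a minimal Python example, assuming log lines are available as plain strings; the helper name `extract_public_ipv4` is hypothetical, and the standard-library `ipaddress` module stands in for the hand-written range exclusions.

```python
import ipaddress
import re

IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b"
)

def extract_public_ipv4(text):
    """Return unique, globally routable IPv4 addresses found in text (hypothetical helper)."""
    results = []
    for candidate in IPV4_RE.findall(text):
        try:
            ip = ipaddress.IPv4Address(candidate)
        except ipaddress.AddressValueError:
            continue  # e.g. octets with leading zeros, which newer Python versions reject
        # is_global is False for private, loopback, link-local, and TEST-NET ranges
        if ip.is_global and candidate not in results:
            results.append(candidate)
    return results
```

Relying on `is_global` keeps the exclusion list in one maintained place, at the cost of also dropping ranges such as carrier-grade NAT space that the text above does not mention.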

CIDR Notation

\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)/(?:[12]?\d|3[0-2])\b

IPv6 Address (simplified)

(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}|(?:[0-9a-fA-F]{1,4}:){1,7}:|::(?:[0-9a-fA-F]{1,4}:){0,6}[0-9a-fA-F]{1,4}
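Because the pattern is simplified, it can return only part of an address that has `::` in the middle (for `1::2` it matches just `1::`). One possible workaround, sketched here in Python, is to tokenize loosely and let the standard-library `ipaddress` module decide what is valid. The names `LOOSE_IPV6_RE` and `extract_ipv6` are hypothetical.

```python
import ipaddress
import re

# Any run of hex digits and colons; validation happens in ipaddress, not in the regex
LOOSE_IPV6_RE = re.compile(r"(?<![0-9A-Fa-f:])[0-9A-Fa-f:]{2,39}(?![0-9A-Fa-f:])")

def extract_ipv6(text):
    """Return unique IPv6 addresses in compressed form (hypothetical helper)."""
    found = []
    for token in LOOSE_IPV6_RE.findall(text):
        if ":" not in token:
            continue
        try:
            addr = ipaddress.IPv6Address(token).compressed
        except ipaddress.AddressValueError:
            continue  # timestamps, MAC addresses, and other colon-separated noise
        if addr not in found:
            found.append(addr)
    return found
```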

Domain & URL Patterns

Fully Qualified Domain Name (FQDN)

\b(?:[a-zA-Z0-9](?:[a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}\b

False positive reduction: Filter against a top-1M allowlist such as Cisco Umbrella or Tranco (the Alexa top-sites list was retired in 2022). Flag newly registered domains (< 30 days old) with high entropy.

KQL domain entropy scoring:

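| extend letter_count = array_length(extract_all(@'([a-z])', tolower(extracted_domain)))
| extend domain_entropy = log2(letter_count)
// Caveat: log2(letter count) is only an upper bound on per-character Shannon entropy (reached when every letter is distinct)
// High entropy subdomain (DGA detection): entropy > 3.5 is suspicious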
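For comparison, per-character Shannon entropy needs the frequency of each character, which is awkward to express in a single KQL line. A minimal Python sketch follows; the function name `shannon_entropy` is illustrative, and whether the 3.5 threshold transfers to this measure is an assumption that should be checked against local data.

```python
import math
from collections import Counter

def shannon_entropy(label):
    """Shannon entropy in bits per character of a string; 0.0 for empty input."""
    if not label:
        return 0.0
    counts = Counter(label.lower())
    total = len(label)
    # H = -sum(p * log2(p)) over the observed character frequencies
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A label made of one repeated character scores 0 bits, while a label of four distinct characters scores 2 bits, so longer random-looking labels score higher than dictionary words of the same length.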

URL (HTTP/HTTPS)

https?://(?:[a-zA-Z0-9\-]+\.)+[a-zA-Z]{2,}(?:/[^\s"'<>]*)?

SPL extraction:

| rex field=_raw "(?P<url>https?://[^\s\"'<>]+)"
| eval domain=replace(url,"https?://([^/]+).*","\1")

URL with suspicious characteristics

// Long URL (> 200 chars), IP-based URL, or URL with encoded characters
DeviceNetworkEvents
| where RemoteUrl matches regex @"https?://\d+\.\d+\.\d+\.\d+"  // IP-based
    or strlen(RemoteUrl) > 200
    or RemoteUrl contains "%252F"  // double-encoded slash (%25 is an encoded "%")
    or RemoteUrl contains ".php?id="
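A rough Python equivalent of these checks is sketched below for use in enrichment scripts. The helper name `url_flags` and the flag labels are hypothetical, and the double-encoding check assumes that `%25` followed by two hex digits indicates a percent sign that was itself encoded.

```python
import re
from urllib.parse import urlsplit

IP_HOST_RE = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}$")
DOUBLE_ENCODED_RE = re.compile(r"%25[0-9A-Fa-f]{2}")

def url_flags(url, max_len=200):
    """Return the list of suspicious traits found in a URL (hypothetical helper)."""
    flags = []
    host = urlsplit(url).hostname or ""
    if IP_HOST_RE.match(host):
        flags.append("ip_host")          # host is a literal IPv4 address
    if len(url) > max_len:
        flags.append("long_url")
    if DOUBLE_ENCODED_RE.search(url):
        flags.append("double_encoded")   # e.g. %252F decodes to %2F, then to /
    return flags
```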

File Hash Patterns

MD5

\b[0-9a-fA-F]{32}\b

SHA-1

\b[0-9a-fA-F]{40}\b

SHA-256

\b[0-9a-fA-F]{64}\b

SHA-512

\b[0-9a-fA-F]{128}\b

KQL multi-hash IOC lookup:

let ioc_hashes = externaldata(hash:string)[@"https://your-feed/hashes.csv"] with (format="csv");
DeviceFileEvents
| where SHA256 in (ioc_hashes) or MD5 in (ioc_hashes)
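Since the four hash patterns differ only in length, a single pass can extract candidates and label them by digest size. The Python sketch below illustrates this; `classify_hashes` is a hypothetical helper, and the label is only a likely type because other algorithms produce digests of the same length.

```python
import re

HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}
HEX_RE = re.compile(r"\b[0-9a-fA-F]{32,128}\b")

def classify_hashes(text):
    """Map each hex string of a known digest length to its likely hash type."""
    found = {}
    for match in HEX_RE.findall(text):
        kind = HASH_LENGTHS.get(len(match))
        if kind:
            # Lowercase so the same hash in different cases is stored once
            found[match.lower()] = kind
    return found
```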

CVE Identifiers

CVE Pattern

CVE-\d{4}-\d{4,7}

KQL extraction:

| extend cves = extract_all(@'(CVE-\d{4}-\d{4,7})', Message)  // capture group required by extract_all
| where array_length(cves) > 0

Context enrichment: Join against NVD CVSS score lookup or internal vulnerability scanner data.

Credential & Secret Patterns

AWS Access Key ID

\bAKIA[0-9A-Z]{16}\b

AWS Secret Access Key (context-dependent)

(?i)aws.{0,20}secret.{0,20}['\"][0-9a-zA-Z/+]{40}['\"]

Generic API Key Pattern

(?i)(?:api[_\-]?key|apikey|api[_\-]?secret)\s*[:=]\s*['"]?([0-9a-zA-Z\-_]{20,64})['"]?

JWT Token

eyJ[A-Za-z0-9_\-]{10,}\.eyJ[A-Za-z0-9_\-]{10,}\.[A-Za-z0-9_\-]{10,}

KQL secret detection in logs:

| where Message matches regex @'(?i)(password|passwd|secret|apikey|api_key|token)\s*[:=]\s*\S{8,}'
| where Message !contains "****"  // exclude already-masked values

DLP Note

These patterns should trigger DLP alerts when found in outbound email, cloud uploads, or code repositories. Integrate with GitHub/GitLab secret scanning.
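As an illustration of how the patterns in this section might be combined in a pre-commit or pipeline check, here is a minimal Python sketch. The names `SECRET_PATTERNS` and `scan_secrets` are hypothetical, and masking all but the first four characters is an assumed convention so that findings can be logged without re-exposing the secret.

```python
import re

SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "jwt": re.compile(
        r"eyJ[A-Za-z0-9_\-]{10,}\.eyJ[A-Za-z0-9_\-]{10,}\.[A-Za-z0-9_\-]{10,}"
    ),
    "generic_api_key": re.compile(
        r"(?i)(?:api[_\-]?key|apikey|api[_\-]?secret)\s*[:=]\s*['\"]?([0-9a-zA-Z\-_]{20,64})['\"]?"
    ),
}

def scan_secrets(text):
    """Return (pattern_name, masked_value) pairs for every match in text."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            # Use the capture group when the pattern has one, else the whole match
            value = match.group(match.lastindex or 0)
            hits.append((name, value[:4] + "*" * (len(value) - 4)))
    return hits
```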

Windows Artifact Patterns

Windows Registry Path

(?i)HKEY_(?:LOCAL_MACHINE|CURRENT_USER|CLASSES_ROOT|USERS|CURRENT_CONFIG)(?:\\[^\s"'<>\\]+)+

Short form:

(?i)HK(?:LM|CU|CR|U|CC)(?:\\[^\s"'<>\\]+)+

Windows File Path

[A-Za-z]:\\(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*

UNC Network Path

\\\\[a-zA-Z0-9\-\.]+\\[a-zA-Z0-9\-\.\$]+(?:\\[^\\/:*?"<>|\r\n]*)*

KQL UNC path detection (lateral movement indicator):

DeviceProcessEvents
| where ProcessCommandLine matches regex @'\\\\\w[\w\-\.]{0,62}\\\w+'
| where InitiatingProcessFileName !in ("explorer.exe", "msiexec.exe")

Base64 & Encoded Content

Base64 Blob (suspicious length)

[A-Za-z0-9+/]{40,}={0,2}

KQL PowerShell encoded command detection:

DeviceProcessEvents
| where FileName in~ ("powershell.exe", "pwsh.exe")
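// PowerShell accepts abbreviations of -EncodedCommand (-e, -ec, -en, -enc, ...), so match the prefix
| where ProcessCommandLine matches regex @'(?i)\s-e(?:c|n[a-z]*)?\s+[A-Za-z0-9+/]{20,}'
| extend encoded_cmd = extract(@'(?i)\s-e(?:c|n[a-z]*)?\s+([A-Za-z0-9+/=]+)', 1, ProcessCommandLine)
// The payload is UTF-16LE, so strip the interleaved null bytes after decoding
| extend decoded_cmd = replace_regex(base64_decode_tostring(encoded_cmd), @'\x00', '')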
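When triaging a single command line offline, the same decoding can be done in Python. This is a sketch under the assumption that the payload is Base64 over UTF-16LE text, which is how PowerShell expects `-EncodedCommand` input. The helper name `decode_encoded_command` is hypothetical.

```python
import base64
import re

# Matches -e, -ec, -en, -enc, ... -EncodedCommand, followed by the Base64 payload
ENC_RE = re.compile(r"(?i)\s-e(?:c|n[a-z]*)?\s+([A-Za-z0-9+/=]{8,})")

def decode_encoded_command(command_line):
    """Return the decoded -EncodedCommand payload, or None if absent or invalid."""
    match = ENC_RE.search(command_line)
    if not match:
        return None
    try:
        raw = base64.b64decode(match.group(1), validate=True)
        return raw.decode("utf-16-le")
    except ValueError:
        # Covers malformed Base64 (binascii.Error) and bad UTF-16 (UnicodeDecodeError)
        return None
```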

Hex-encoded shellcode

(?:[0-9a-fA-F]{2}){16,}

XOR-encoded indicator (common malware pattern)

Single-byte XOR applies the same key to every byte, so detection is statistical rather than regex-based: try each candidate key and check whether the decoded bytes look like readable text.

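# Python: detect a likely single-byte XOR key
def detect_xor_key(data, sample_size=16):
    """Return the key whose decoded sample looks most like text, or None."""
    if len(data) < sample_size * 2:
        return None
    candidates = {}
    for key in range(1, 256):
        decoded = bytes(b ^ key for b in data[:sample_size])
        # bytes has no isprintable(), so check the printable ASCII range directly
        if all(32 <= b < 127 for b in decoded):
            # Every candidate has the same length, so score by letters and spaces instead
            candidates[key] = sum(chr(b).isalpha() or b == 32 for b in decoded)
    return max(candidates, key=candidates.get) if candidates else None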

Email & Phishing Patterns

Email Address

\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}\b

Homoglyph Detection (IDN homograph attacks)

// Flag domains with mixed unicode scripts or lookalike characters
// Use "or": a Punycode domain is pure ASCII, so both conditions never hold at once
| where RemoteUrl matches regex @'xn--'  // Punycode encoded domain
    or RemoteUrl matches regex @'[\u0430-\u044F]'  // Cyrillic characters in domain
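Mixed-script detection is easier to express in a general-purpose language. The Python sketch below decodes Punycode where possible and flags any label that mixes letters from more than one script. It approximates the script by the first word of each character's Unicode name, which is a simplifying assumption rather than a full Unicode script lookup, and both function names are hypothetical.

```python
import unicodedata

def scripts_in_label(label):
    """Return the set of scripts (first word of each Unicode name) for letters in label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts

def is_mixed_script(domain):
    """True if any dot-separated label mixes letters from more than one script."""
    try:
        # Convert xn-- labels back to Unicode; non-ASCII input is used as-is
        decoded = domain.encode("ascii").decode("idna")
    except UnicodeError:
        decoded = domain
    return any(len(scripts_in_label(label)) > 1 for label in decoded.split("."))
```

A label written entirely in one non-Latin script is not flagged, which keeps legitimate internationalized domains out of the results.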

Phishing URL Patterns

# Common phishing URL patterns
(?i)(?:signin|login|secure|verify|update|account|banking)\.[a-z0-9\-]+\.(?:com|net|info|org)

Malware & C2 Patterns

Cobalt Strike Malleable C2 Jitter Pattern (log detection)

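// Detect consistent inter-arrival beaconing (low jitter = automated C2)
DeviceNetworkEvents
| where RemotePort in (80, 443, 8080, 8443)
| sort by DeviceName asc, RemoteIP asc, Timestamp asc
// Seconds since the previous connection for the same device/destination pair
| extend delta_s = iff(DeviceName == prev(DeviceName) and RemoteIP == prev(RemoteIP),
    datetime_diff('second', Timestamp, prev(Timestamp)), long(null))
| summarize count_ = count(), avg_s = avg(delta_s), stdev_s = stdev(delta_s)
    by DeviceName, RemoteIP
| where count_ > 20 and avg_s > 0
// Jitter = stdev(intervals) / avg(intervals); below 0.1 = likely beacon
| extend jitter = stdev_s / avg_s
| where jitter < 0.1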
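The jitter measure described in the query comment is the coefficient of variation of the inter-arrival times. A minimal Python sketch is shown below, assuming timestamps are given as epoch seconds; the function name `beacon_jitter` is hypothetical.

```python
from statistics import mean, stdev

def beacon_jitter(timestamps):
    """Coefficient of variation of inter-arrival times; None if there are too few events."""
    ordered = sorted(timestamps)
    deltas = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(deltas) < 2 or mean(deltas) == 0:
        return None
    # stdev / mean: values near 0 mean the connections are evenly spaced
    return stdev(deltas) / mean(deltas)
```

Connections spaced exactly 60 seconds apart give a jitter of 0, while irregular human browsing typically gives values well above the 0.1 cutoff.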

DGA (Domain Generation Algorithm) Detection

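// High ratio of non-vowel characters = likely DGA domain
// Note: digits, dots, and hyphens count toward domain_len, so they raise the ratio too
DnsEvents
| extend
    domain_len = strlen(Name),
    // extract_all needs a capture group; coalesce guards against a null result when nothing matches
    vowels = coalesce(array_length(extract_all(@'([aeiou])', tolower(Name))), 0)
| extend consonant_ratio = 1.0 - (vowels * 1.0 / domain_len)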
| where consonant_ratio > 0.7
| where domain_len > 12
| where Name !endswith ".microsoft.com"
| where Name !endswith ".windows.com"

Suspicious Process Ancestry

// Office apps spawning shells (macro execution / phishing)
DeviceProcessEvents
| where InitiatingProcessFileName in~ ("winword.exe","excel.exe","powerpnt.exe","outlook.exe")
| where FileName in~ ("cmd.exe","powershell.exe","wscript.exe","cscript.exe","mshta.exe")

Splunk SPL Equivalents

KQL Pattern	SPL Equivalent
`extract_all(@'pattern', field)`	`rex field=field "(?P<name>pattern)" max_match=100`
`ipv4_is_private(ip)`	`lookup private_ips.csv ip OUTPUT is_private`
`base64_decode_tostring(str)`	`eval decoded=base64decode(encoded_val)`
`matches regex @'pattern'`	`regex field="pattern"` or `where match(field,"pattern")`
`strlen(field)`	`eval len=len(field)`
`summarize count() by field`	`stats count by field`

IOC Enrichment Workflow

graph LR
    IOC[Raw IOC] --> TYPE{Type?}
    TYPE -->|IP| GEO[GeoIP + ASN Lookup]
    TYPE -->|Domain| WHOIS[WHOIS + Passive DNS]
    TYPE -->|Hash| VT[VirusTotal + MalwareBazaar]
    TYPE -->|URL| URLSCAN[urlscan.io + PhishTank]
    GEO & WHOIS & VT & URLSCAN --> SCORE[Risk Score Calculation]
    SCORE --> ACT{Score > Threshold?}
    ACT -->|Yes| BLOCK[Block + Alert]
    ACT -->|No| WATCH[Watchlist + Monitor]

Risk Scoring Formula

IOC_Score = (TI_Hits × 3) + (Age_Days < 30 ? 2 : 0) + (Seen_Count × 0.5)
           + (Geo_Risk) + (ASN_Risk) + (Name_Entropy × 2)

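Thresholds: 0-5 = Low | 6-10 = Medium | 11-15 = High | 16+ = Critical. Scores can be fractional (Seen_Count × 0.5), so treat each band as running up to the next lower bound; for example, 5.5 is still Low.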
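The formula and bands can be sketched in Python as follows. The function names and the assumption that `geo_risk` and `asn_risk` arrive as already-computed numeric values are illustrative; the source does not define their scales.

```python
def ioc_score(ti_hits, age_days, seen_count, geo_risk, asn_risk, name_entropy):
    """Compute IOC_Score using the weights from the formula above."""
    recency_bonus = 2 if age_days < 30 else 0
    return (ti_hits * 3 + recency_bonus + seen_count * 0.5
            + geo_risk + asn_risk + name_entropy * 2)

def severity(score):
    """Map a score to a band; fractional scores stay in the lower band."""
    if score >= 16:
        return "Critical"
    if score >= 11:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"
```

For example, two threat-intel hits on a 10-day-old domain seen four times, with geo and ASN risk of 1 each and a name entropy of 3.0, score 6 + 2 + 2 + 1 + 1 + 6 = 18, which falls in the Critical band.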

Quick Reference Card

IOC Type	Length / Format	Key Tool	False Positive Risk
MD5 hash	32 hex chars	VirusTotal	Low
SHA-256	64 hex chars	VirusTotal	Very Low
IPv4	7–15 chars	GeoIP, AbuseIPDB	Medium (shared hosting)
Domain	Variable	WHOIS, PassiveDNS	Medium (CDN, dynamic DNS)
URL	Variable	urlscan.io	Medium
CVE	CVE-YYYY-NNNN (4 to 7 digits)	NVD, vendor advisories	Very Low
Email	Variable	Header analysis	High (spoofing)
JWT	3-part base64url	jwt.io decode	Low
AWS key	AKIA+16 chars	AWS IAM	Very Low