IOC & Regex Pattern Library¶
Purpose
Production-ready regex patterns, KQL/SPL extraction snippets, and false-positive reduction guidance for analysts. Covers all major IOC types used in detection, hunting, and triage.
IPv4 & Network Patterns¶
IPv4 Address¶
KQL extraction:
| extend ips = extract_all(@'\b((?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?))\b', Message) // extract_all needs at least one capturing group
False positive reduction: Exclude RFC 1918 (10.x, 172.16–31.x, 192.168.x), loopback (127.x), link-local (169.254.x), TEST-NET (192.0.2.x, 198.51.100.x, 203.0.113.x).
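// Expand the extracted array into one row per IP, then exclude non-routable ranges
| mv-expand extracted_ip = ips to typeof(string)
| where not(ipv4_is_private(extracted_ip))
| where not(extracted_ip startswith "127.")
| where not(extracted_ip startswith "169.254.")
| where not(ipv4_is_in_any_range(extracted_ip, "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24")) // TEST-NET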
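For offline triage, the same extraction and exclusion logic can be sketched in Python. This is a minimal illustration, not part of the original library: the function name `extract_public_ipv4` and the `NON_ROUTABLE` list are assumptions that mirror the ranges listed in the false-positive guidance.

```python
import ipaddress
import re

# Same IPv4 pattern as the KQL snippet (no capture group needed for findall)
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b"
)

# Assumed exclusion list: RFC 1918, loopback, link-local, TEST-NET
NON_ROUTABLE = [
    ipaddress.ip_network(n)
    for n in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
        "127.0.0.0/8", "169.254.0.0/16",
        "192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24",
    )
]

def extract_public_ipv4(text):
    """Return IPv4 addresses found in text that fall outside NON_ROUTABLE."""
    results = []
    for candidate in IPV4_RE.findall(text):
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # skip candidates the ipaddress module rejects
        if not any(addr in net for net in NON_ROUTABLE):
            results.append(str(addr))
    return results
```

Calling `extract_public_ipv4` on a raw log line returns only the addresses worth enriching, in the order they appear.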
CIDR Notation¶
IPv6 Address (simplified)¶
(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}|(?:[0-9a-fA-F]{1,4}:){1,7}:|::(?:[0-9a-fA-F]{1,4}:){0,6}[0-9a-fA-F]{1,4}
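The simplified IPv6 pattern only partially matches addresses compressed in the middle (for example `2001:db8::1`), so regex hits are best treated as candidates and validated afterwards. A minimal sketch using Python's standard `ipaddress` module follows; the helper name `classify_ip_token` is hypothetical.

```python
import ipaddress

def classify_ip_token(token):
    """Return 'ipv4', 'ipv6', 'cidr4', 'cidr6', or None for a candidate string."""
    try:
        if "/" in token:
            # strict=False accepts host bits set, e.g. 10.1.2.3/8
            net = ipaddress.ip_network(token, strict=False)
            return f"cidr{net.version}"
        return f"ipv{ipaddress.ip_address(token).version}"
    except ValueError:
        return None
```

Validation with a parser catches forms the regex cannot express and rejects strings that merely look like addresses.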
Domain & URL Patterns¶
Fully Qualified Domain Name (FQDN)¶
False positive reduction: Filter against a top-1M allowlist such as Cisco Umbrella or Tranco (the Alexa list was retired in 2022). Flag newly registered domains (< 30 days old) with high entropy.
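KQL domain entropy scoring (length-based proxy):
| extend domain_entropy = log2(array_length(extract_all(@'([a-z])', tolower(extracted_domain))))
// Note: log2(letter count) is the maximum entropy a name of that length can reach, not its Shannon entropy
// High entropy subdomain (DGA detection): Shannon entropy > 3.5 is suspicious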
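For comparison with the 3.5 threshold, true per-character Shannon entropy can be computed outside the query. The sketch below is an illustration under the assumption that entropy is measured over the characters of a single label; `shannon_entropy` is a hypothetical helper name.

```python
import math
from collections import Counter

def shannon_entropy(label):
    """Shannon entropy in bits per character of a domain label."""
    if not label:
        return 0.0
    counts = Counter(label.lower())
    total = len(label)
    # H = -sum(p * log2(p)) over the observed character frequencies
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A label made of twelve distinct characters scores about 3.58, above the 3.5 threshold, while a dictionary word such as `google` scores under 2.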
URL (HTTP/HTTPS)¶
SPL extraction:
| rex field=_raw "(?P<url>https?://[^\s\"'<>]+)"
| eval domain=replace(url,"https?://([^/]+).*","\1")
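The same URL extraction can be sketched in Python for ad hoc log review. This is an illustration only: `extract_urls` is a hypothetical helper, and trimming trailing punctuation is an added assumption. Unlike the SPL `replace`, `urlsplit(...).hostname` drops the port and any userinfo from the host.

```python
import re
from urllib.parse import urlsplit

# Same character class as the SPL rex above
URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def extract_urls(text):
    """Return (url, hostname) pairs for each HTTP/HTTPS URL in text."""
    pairs = []
    for url in URL_RE.findall(text):
        url = url.rstrip(".,;)")  # assumption: trailing punctuation is prose, not URL
        host = urlsplit(url).hostname
        if host:
            pairs.append((url, host))
    return pairs
```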
URL with suspicious characteristics¶
// Long URL (> 200 chars), IP-based URL, or URL with encoded characters
DeviceNetworkEvents
| where RemoteUrl matches regex @"https?://\d+\.\d+\.\d+\.\d+" // IP-based
or strlen(RemoteUrl) > 200
or RemoteUrl contains "%2F%2F" // two URL-encoded slashes ("//"); a double-encoded slash would be "%252F"
or RemoteUrl contains ".php?id="
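To see which condition fired for a given URL, the checks above can be mirrored in Python. This is a rough sketch with hypothetical names (`suspicious_url_flags` and its flag labels); it lowercases the URL because KQL `contains` is case-insensitive.

```python
import ipaddress
from urllib.parse import urlsplit

def suspicious_url_flags(url, max_len=200):
    """Return the list of heuristics from the query that a URL triggers."""
    flags = []
    host = urlsplit(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        flags.append("ip_host")  # URL points at a raw IP instead of a name
    except ValueError:
        pass
    if len(url) > max_len:
        flags.append("long_url")
    lowered = url.lower()
    if "%2f%2f" in lowered:
        flags.append("encoded_double_slash")
    if ".php?id=" in lowered:
        flags.append("php_id_param")
    return flags
```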
File Hash Patterns¶
MD5¶
SHA-1¶
SHA-256¶
SHA-512¶
KQL multi-hash IOC lookup:
let ioc_hashes = externaldata(hash:string)[@"https://your-feed/hashes.csv"] with (format="csv");
DeviceFileEvents
| where SHA256 in~ (ioc_hashes) or MD5 in~ (ioc_hashes) // in~ tolerates upper-case hashes in the feed
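Hash types are distinguished only by hex length (32, 40, 64, and 128 characters for MD5, SHA-1, SHA-256, and SHA-512). A minimal Python sketch of that classification follows; the regex and the `classify_hashes` helper are assumptions for illustration, not patterns taken from this library.

```python
import re

# Hex digest length -> hash type
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256", 128: "sha512"}

# Candidate hex runs bounded by non-word characters
HEX_RE = re.compile(r"\b[a-fA-F0-9]{32,128}\b")

def classify_hashes(text):
    """Return (type, lowercase_hash) pairs for hex runs of a known digest length."""
    found = []
    for token in HEX_RE.findall(text):
        kind = HASH_LENGTHS.get(len(token))
        if kind:
            found.append((kind, token.lower()))
    return found
```

Normalizing to lower case before lookup avoids misses when feeds and telemetry disagree on case.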
CVE Identifiers¶
CVE Pattern¶
KQL extraction:
Context enrichment: Join against NVD CVSS score lookup or internal vulnerability scanner data.
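Before enrichment, CVE identifiers need to be extracted and normalized. The sketch below assumes the usual `CVE-YYYY-NNNN` shape with four or more sequence digits; `extract_cves` is a hypothetical helper name.

```python
import re

# Year is four digits; the sequence number is four or more digits
CVE_RE = re.compile(r"\bCVE-(\d{4})-(\d{4,})\b", re.IGNORECASE)

def extract_cves(text):
    """Return unique CVE IDs in first-seen order, normalized to upper case."""
    seen = []
    for match in CVE_RE.finditer(text):
        cve = f"CVE-{match.group(1)}-{match.group(2)}"
        if cve not in seen:
            seen.append(cve)
    return seen
```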
Credential & Secret Patterns¶
AWS Access Key ID¶
AWS Secret Access Key (context-dependent)¶
Generic API Key Pattern¶
JWT Token¶
KQL secret detection in logs:
| where Message matches regex @'(?i)(password|passwd|secret|apikey|api_key|token)\s*[:=]\s*\S{8,}'
| where Message !contains "****" // exclude already-masked values
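A Python sketch of the same idea, useful for scanning exported logs or repositories, is shown below. The AWS and JWT patterns are assumptions based on the shapes noted elsewhere on this page (AKIA plus 16 characters; three dot-separated base64url parts), and `find_secrets` is a hypothetical helper.

```python
import re

SECRET_PATTERNS = {
    # Assumed shape: "AKIA" followed by 16 upper-case letters or digits
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Assumed shape: header and payload are base64url JSON, so both start with "eyJ"
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    # Same keyword assignment pattern as the KQL query
    "generic_assignment": re.compile(
        r"(?i)(?:password|passwd|secret|apikey|api_key|token)\s*[:=]\s*\S{8,}"
    ),
}

def find_secrets(line):
    """Return the sorted names of secret patterns that match a log line."""
    if "****" in line:
        return []  # already masked, mirrors the KQL exclusion
    return sorted(name for name, rx in SECRET_PATTERNS.items() if rx.search(line))
```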
DLP Note
These patterns should trigger DLP alerts when found in outbound email, cloud uploads, or code repositories. Integrate with GitHub/GitLab secret scanning.
Windows Artifact Patterns¶
Windows Registry Path¶
Short form:
Windows File Path¶
UNC Network Path¶
KQL UNC path detection (lateral movement indicator):
DeviceProcessEvents
| where ProcessCommandLine matches regex @'\\\\\w[\w\-\.]{0,62}\\\w+'
| where InitiatingProcessFileName !in ("explorer.exe", "msiexec.exe")
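To pull the host and share out of matching command lines, a comparable pattern can be applied in Python. This is a sketch: `extract_unc_targets` is a hypothetical name, and allowing `$` in the share name (for administrative shares such as `C$`) is an added assumption that the KQL pattern above does not make.

```python
import re

# \\host\share ; the share class also allows "$" for admin shares (assumption)
UNC_RE = re.compile(r"\\\\(\w[\w.\-]{0,62})\\([\w$]+)")

def extract_unc_targets(command_line):
    """Return (host, share) tuples for each UNC path in a command line."""
    return UNC_RE.findall(command_line)
```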
Base64 & Encoded Content¶
Base64 Blob (suspicious length)¶
KQL PowerShell encoded command detection:
DeviceProcessEvents
| where FileName in~ ("powershell.exe", "pwsh.exe")
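| where ProcessCommandLine matches regex @'(?i)\s-(?:e|ec|enc[a-z]*)\s+[A-Za-z0-9+/=]{20,}' // -EncodedCommand and its common abbreviations
| extend decoded_cmd = base64_decode_tostring(extract(@'(?i)\s-(?:e|ec|enc[a-z]*)\s+([A-Za-z0-9+/=]{20,})', 1, ProcessCommandLine))
// PowerShell encodes the command as UTF-16LE, so decoded_cmd may contain NUL characters between letters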
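PowerShell's `-EncodedCommand` argument is Base64 over UTF-16LE text, so a clean decode needs that encoding rather than UTF-8. A minimal Python sketch follows; the helper name `decode_powershell_encoded` is hypothetical, and restoring stripped padding is an added convenience.

```python
import base64

def decode_powershell_encoded(b64):
    """Decode a PowerShell -EncodedCommand payload (Base64 of UTF-16LE text)."""
    # Restore any '=' padding that was trimmed from the command line
    padded = b64 + "=" * (-len(b64) % 4)
    return base64.b64decode(padded).decode("utf-16-le", errors="replace")
```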
Hex-encoded shellcode¶
XOR-encoded indicator (common malware pattern)¶
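Look for data whose bytes have all been XORed with the same constant key. Detection is statistical rather than regex-based.
# Python: guess a single-byte XOR key by scoring how text-like the decoded sample is
def detect_xor_key(data, sample_size=16):
    if len(data) < sample_size * 2:
        return None
    sample = data[:sample_size]
    best_key, best_score = None, 0
    for key in range(1, 256):
        decoded = bytes(b ^ key for b in sample)
        # bytes has no isprintable(); test the printable ASCII range directly
        if not all(32 <= b < 127 for b in decoded):
            continue
        # Prefer keys that yield letters, digits, and spaces over punctuation noise
        score = sum(chr(b).isalnum() or b == 32 for b in decoded)
        if score > best_score:
            best_key, best_score = key, score
    return best_key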
Email & Phishing Patterns¶
Email Address¶
Homoglyph Detection (IDN homograph attacks)¶
// Flag URLs that are Punycode-encoded or contain Cyrillic lookalike characters (either condition)
| where RemoteUrl contains "xn--" // Punycode-encoded (IDN) label
    or RemoteUrl matches regex @'\p{Cyrillic}' // Cyrillic characters in the URL
Phishing URL Patterns¶
# Common phishing URL patterns
(?i)(?:signin|login|secure|verify|update|account|banking)\.[a-z0-9\-]+\.(?:com|net|info|org)
Malware & C2 Patterns¶
Cobalt Strike Malleable C2 Jitter Pattern (log detection)¶
// Detect consistent inter-arrival beaconing (low jitter = automated C2)
DeviceNetworkEvents
| where RemotePort in (80, 443, 8080, 8443)
| summarize
    count_=count(),
    timestamps=make_list(todatetime(Timestamp)),
    bytes=sum(SentBytes)
    by DeviceName, RemoteIP
| where count_ > 20
// Calculate jitter from the gaps between consecutive timestamps: stdev(gaps) / avg(gaps) < 0.1 = likely beacon
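The jitter calculation described in the comment can be sketched in Python on an exported list of timestamps. This is an illustration under stated assumptions: timestamps are epoch seconds, the population standard deviation is used, and the function names are hypothetical.

```python
from statistics import mean, pstdev

def beacon_jitter(timestamps):
    """Return stdev/mean of the gaps between consecutive timestamps, or None."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None
    average = mean(gaps)
    if average == 0:
        return None
    return pstdev(gaps) / average

def looks_like_beacon(timestamps, max_jitter=0.1, min_events=20):
    """Apply the count > 20 and jitter < 0.1 thresholds from the query."""
    jitter = beacon_jitter(timestamps)
    return len(timestamps) > min_events and jitter is not None and jitter < max_jitter
```

Connections at a fixed 60-second interval give a jitter of 0 and are flagged; irregular, human-driven traffic typically scores far above 0.1.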
DGA (Domain Generation Algorithm) Detection¶
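// High consonant ratio (few vowels) in a long name = possible DGA domain
// Note: digits, dots, and hyphens count as non-vowels here, which inflates the ratio
DnsEvents
| extend
    domain_len = strlen(Name),
    vowels = coalesce(array_length(extract_all(@'([aeiou])', tolower(Name))), 0) // extract_all returns null when nothing matches
| extend consonant_ratio = 1.0 - (vowels * 1.0 / domain_len)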
| where consonant_ratio > 0.7
| where domain_len > 12
| where Name !endswith ".microsoft.com"
| where Name !endswith ".windows.com"
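A letters-only variant of the consonant ratio is easy to sketch in Python, which avoids counting digits and dots as consonants. The helper name `consonant_ratio` and the decision to ignore non-letter characters are assumptions for illustration.

```python
def consonant_ratio(name):
    """Share of ASCII letters in name that are not vowels (0.0 if no letters)."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return 0.0
    vowels = sum(c in "aeiou" for c in letters)
    return 1.0 - vowels / len(letters)
```

With the 0.7 threshold from the query, `xkqzvbnmtrwp.com` is flagged and `microsoft` is not.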
Suspicious Process Ancestry¶
// Office apps spawning shells (macro execution / phishing)
DeviceProcessEvents
| where InitiatingProcessFileName in~ ("winword.exe","excel.exe","powerpnt.exe","outlook.exe")
| where FileName in~ ("cmd.exe","powershell.exe","wscript.exe","cscript.exe","mshta.exe")
Splunk SPL Equivalents¶
| KQL Pattern | SPL Equivalent |
|---|---|
| extract_all(@'pattern', field) | rex field=field "(?P<name>pattern)" max_match=100 |
| ipv4_is_private(ip) | lookup private_ips.csv ip OUTPUT is_private |
| base64_decode_tostring(str) | eval decoded=base64decode(encoded_val) |
| matches regex @'pattern' | regex field="pattern" or where match(field,"pattern") |
| strlen(field) | eval len=len(field) |
| summarize count() by field | stats count by field |
IOC Enrichment Workflow¶
graph LR
IOC[Raw IOC] --> TYPE{Type?}
TYPE -->|IP| GEO[GeoIP + ASN Lookup]
TYPE -->|Domain| WHOIS[WHOIS + Passive DNS]
TYPE -->|Hash| VT[VirusTotal + MalwareBazaar]
TYPE -->|URL| URLSCAN[urlscan.io + PhishTank]
GEO & WHOIS & VT & URLSCAN --> SCORE[Risk Score Calculation]
SCORE --> ACT{Score > Threshold?}
ACT -->|Yes| BLOCK[Block + Alert]
ACT -->|No| WATCH[Watchlist + Monitor]
Risk Scoring Formula¶
IOC_Score = (TI_Hits × 3) + (Age_Days < 30 ? 2 : 0) + (Seen_Count × 0.5)
+ (Geo_Risk) + (ASN_Risk) + (Name_Entropy × 2)
Thresholds: 0-5 = Low | 6-10 = Medium | 11-15 = High | 16+ = Critical
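The formula and thresholds can be sketched in Python as follows. Because `Seen_Count × 0.5` can produce fractional scores, this sketch assumes each band starts at its lower bound (so 5.5 is Low and 6.0 is Medium); the function names are hypothetical.

```python
def ioc_score(ti_hits, age_days, seen_count, geo_risk, asn_risk, name_entropy):
    """Combine the enrichment signals using the weights from the formula."""
    return (
        ti_hits * 3
        + (2 if age_days < 30 else 0)  # bonus for newly registered indicators
        + seen_count * 0.5
        + geo_risk
        + asn_risk
        + name_entropy * 2
    )

def severity(score):
    """Map a score to a band; band edges are an assumption for fractional scores."""
    if score >= 16:
        return "Critical"
    if score >= 11:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"
```

For example, two threat-intel hits on a 10-day-old domain seen four times, with geo and ASN risk of 1 each and entropy 3.5, score 19 and land in Critical.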
Quick Reference Card¶
| IOC Type | Pattern Length | Key Tool | False Positive Risk |
|---|---|---|---|
| MD5 hash | 32 hex chars | VirusTotal | Low |
| SHA-256 | 64 hex chars | VirusTotal | Very Low |
| IPv4 | 7–15 chars | GeoIP, AbuseIPDB | Medium (shared hosting) |
| Domain | Variable | WHOIS, PassiveDNS | Medium (CDN, dynamic DNS) |
| URL | Variable | urlscan.io | Medium |
| CVE | CVE-YYYY-NNNNN | NVD, vendor advisories | Very Low |
| Email | Variable | Header analysis | High (spoofing) |
| JWT | 3-part base64 | jwt.io decode | Low |
| AWS key | AKIA+16 chars | AWS IAM | Very Low |