SC-110: GDPR Privacy Breach -- Mass Data Subject Notification

Operation CRYSTAL LEAK

Classification: TABLETOP EXERCISE -- 100% Synthetic

All organizations, IP addresses, domains, personal data, and threat actors in this scenario are entirely fictional. Created for educational tabletop exercises only. No real personal data is used.

Scenario Metadata

Field	Value
Difficulty	★★★★☆ (Advanced)
Duration	3-4 hours
Participants	6-10 (SOC, IR, Legal, Privacy/DPO, Communications, Executive)
ATT&CK Techniques	T1078 · T1619 · T1530 · T1213 · T1567 · T1048
Threat Actor	GLASS SPIDER (data broker / extortion group)
Industry	Healthcare / Insurance
Primary Impact	500K EU citizen records exposed, GDPR Article 33/34 notification triggered

Threat Actor Profile: GLASS SPIDER

Attribute	Detail
Motivation	Financial -- data brokering and extortion
Sophistication	High -- targets regulated data for maximum leverage
Known Targets	Healthcare providers, insurance companies, government agencies (EU-focused)
Avg. Dwell Time	15-30 days
Signature	Exfiltrates personal data, then threatens public disclosure and regulatory complaints to pressure ransom payment
Tools	Cloud storage enumeration scripts, custom exfiltration tools using legitimate cloud services, Tor-based leak sites

Executive Summary

GLASS SPIDER compromises an API key for EuroHealth Insurance (a synthetic healthcare insurer with 3,200 employees, headquartered in a synthetic EU jurisdiction). The key grants access to a cloud-hosted data lake containing policyholder records. Over the 11 days after initial access, the attacker enumerates the data lake and then exfiltrates 500,247 EU citizen records, including names, dates of birth, national ID numbers, medical diagnosis codes (ICD-10), and insurance policy details. The data is staged in an attacker-controlled Azure Blob Storage account and subsequently posted to an extortion site. The breach is discovered on Day 14, when a threat intelligence feed flags EuroHealth data on a dark web marketplace. Discovery triggers the GDPR Article 33 requirement to notify the supervisory authority within 72 hours and the Article 34 requirement to notify affected data subjects without undue delay. The organization must run a technical IR investigation and a complex regulatory response across 12 EU member states at the same time.

Environment Setup

Target Organization: EuroHealth Insurance (synthetic)

Asset	Detail
Industry	Healthcare insurance, 3,200 employees, 2.1M policyholders
HQ	Synthetic EU member state
Operations	12 EU member states, subject to GDPR and national health data regulations
Data Lake	Azure Data Lake Storage Gen2: `eurohealth-datalake.example.com` (10.30.0.0/16)
API Gateway	`api.eurohealth.example.com` (10.30.1.20)
DPO	Designated Data Protection Officer (required under GDPR Art. 37)
SIEM	Microsoft Sentinel
Data Classification	Policyholder data classified as "Special Category" (GDPR Art. 9 -- health data)

Phase 1: Initial Access -- API Key Compromise (Day 0)

Attacker Actions

GLASS SPIDER discovers an exposed API key in a public GitHub repository. A developer at EuroHealth accidentally committed a .env file containing the data lake API credentials:

Exposed .env File (GitHub Commit -- Reconstructed)

# EuroHealth Data Lake -- DEV/TEST ONLY
# WARNING: DO NOT COMMIT
DATALAKE_API_KEY=REDACTED
DATALAKE_ENDPOINT=https://eurohealth-datalake.example.com
DATALAKE_CONTAINER=policyholder-records
DB_CONNECTION=Server=10.30.2.15;Database=PolicyDB;User=testuser;Password=REDACTED

Root Cause

The .gitignore file existed but did not include .env files in the config/ subdirectory. The developer committed the file 47 days before the attacker discovered it. GitHub secret scanning was not enabled on the repository.
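The root cause above can be illustrated with a small pre-commit style check. This is a minimal sketch, not a replacement for gitleaks or GitHub secret scanning: the name patterns, function names, and the idea of passing in a staged file's path and text are all assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath

# Hypothetical heuristic: variable names that usually hold credentials.
SECRET_NAME = re.compile(r"(KEY|SECRET|TOKEN|PASSWORD|CONNECTION)", re.IGNORECASE)
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.+?)\s*$")


def is_env_file(path: str) -> bool:
    """True for .env-style files in any directory, for example config/.env."""
    name = PurePosixPath(path).name
    return name == ".env" or name.startswith(".env.") or name.endswith(".env")


def find_secret_lines(text: str) -> list:
    """Return (line_number, variable_name) for credential-like assignments."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # comments are ignored
        match = ASSIGNMENT.match(line)
        if match and SECRET_NAME.search(match.group(1)):
            hits.append((number, match.group(1)))
    return hits


def check_staged_file(path: str, text: str) -> list:
    """Build human-readable findings for one staged file."""
    findings = []
    if is_env_file(path):
        findings.append(f"{path}: .env-style file should not be committed")
    for number, name in find_secret_lines(text):
        findings.append(f"{path}:{number}: credential-like variable {name}")
    return findings
```

Because the file-name check looks only at the final path component, it does not depend on which subdirectory the .env file sits in, which is the gap the .gitignore rule left open. A name-based heuristic like this will miss secrets stored under innocuous variable names, so it complements rather than replaces a dedicated scanner.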

Evidence Artifacts

Azure Storage Resource Log -- API Key Authentication

{
  "time": "2026-03-01T14:22:08Z",
  "operationName": "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
  "callerIpAddress": "203.0.113.88",
  "properties": {
    "accountName": "eurohealthdatalake",
    "containerName": "policyholder-records",
    "authenticationMethod": "SharedKey",
    "statusCode": 200
  }
}
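A log entry like the one above can be triaged with a short check for successful SharedKey access from outside RFC 1918 space. This is a sketch under assumptions: the function names are hypothetical, the record layout is the synthetic one shown in this scenario, and only IPv4 callers (optionally with a port suffix) are handled. The RFC 1918 ranges are listed explicitly because Python's `ipaddress.is_private` also returns True for documentation ranges such as 203.0.113.0/24, which would hide the attacker IP used in this exercise.

```python
import ipaddress

# RFC 1918 private ranges, listed explicitly (see note above on is_private).
RFC1918 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def strip_port(value: str) -> str:
    """Drop a trailing ':port' from an IPv4 'address:port' string."""
    return value.rsplit(":", 1)[0] if value.count(":") == 1 else value


def is_internal(caller_ip: str) -> bool:
    """True when the caller address falls inside an RFC 1918 range."""
    address = ipaddress.ip_address(strip_port(caller_ip))
    return any(address in network for network in RFC1918)


def is_external_shared_key_read(record: dict) -> bool:
    """Flag successful SharedKey-authenticated requests from external IPs."""
    properties = record.get("properties", {})
    return (
        properties.get("authenticationMethod") == "SharedKey"
        and properties.get("statusCode") == 200
        and not is_internal(record["callerIpAddress"])
    )
```

Applied to the record above, the check flags the request from 203.0.113.88, while the same request from the API gateway address 10.30.1.20 would pass.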

Detection Queries

KQL (Microsoft Sentinel)

// Detect data lake access from external IPs
StorageBlobLogs
| where TimeGenerated > ago(7d)
| where AccountName == "eurohealthdatalake"
// CallerIpAddress can carry a port suffix; keep only the address part
| extend CallerIp = tostring(split(CallerIpAddress, ":")[0])
// ipv4_is_private covers 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16
| where not(ipv4_is_private(CallerIp))
// toint() works whether StatusCode is stored as a string or a number
| where toint(StatusCode) == 200
| summarize BlobCount=count(),
    DataVolume=sum(ResponseBodySize)
    by CallerIp, bin(TimeGenerated, 1h)
| where BlobCount > 100 or DataVolume > 104857600  // 100 MiB

SPL (Splunk)

index=azure sourcetype=azure:storage:blob
| search account_name="eurohealthdatalake" status=200
| where NOT (cidrmatch("10.0.0.0/8", caller_ip)
    OR cidrmatch("172.16.0.0/12", caller_ip)
    OR cidrmatch("192.168.0.0/16", caller_ip))
| bin _time span=1h
| stats count sum(response_size) as total_bytes
    by caller_ip _time
| where count > 100 OR total_bytes > 104857600

Discussion Injects

Technical

The API key was committed 47 days ago. What tools could have detected this earlier? How does GitHub secret scanning work, and what are its limitations for custom API key formats?

Phase 2: Data Reconnaissance & Enumeration (Days 1-3)

Attacker Actions

GLASS SPIDER enumerates the data lake structure and identifies high-value containers:

Azure Blob Container Enumeration (Reconstructed)

2026-03-02T08:15:22Z LIST containers
Source: 203.0.113.88
Results:
  - policyholder-records (4.2 TB, 2.1M objects)
  - claims-data (1.8 TB, 890K objects)
  - provider-network (230 GB, 45K objects)
  - analytics-exports (890 GB, 12K objects)
  - gdpr-sar-responses (45 GB, 3.2K objects)

2026-03-02T08:17:44Z LIST blobs 
Container: policyholder-records
Prefix: eu-citizens/
Results:
  - eu-citizens/DE/ (142K records)
  - eu-citizens/FR/ (98K records)
  - eu-citizens/NL/ (67K records)
  - eu-citizens/IT/ (54K records)
  - eu-citizens/ES/ (48K records)
  [... 7 more country prefixes ...]

The attacker identifies that policyholder records are organized by EU country code, containing what appears to be GDPR-protected special category data (health information under Article 9).

Detection Queries

KQL (Microsoft Sentinel)

// Detect bulk container enumeration
StorageBlobLogs
| where OperationName in ("ListContainers", "ListBlobs")
// CallerIpAddress can carry a port suffix; keep only the address part
| extend CallerIp = tostring(split(CallerIpAddress, ":")[0])
| where not(ipv4_is_private(CallerIp))
| summarize ListOps=count(),
    UniqueContainers=dcount(Uri)
    by CallerIp, bin(TimeGenerated, 1h)
| where ListOps > 50

SPL (Splunk)

index=azure sourcetype=azure:storage:blob
    operation IN ("ListContainers", "ListBlobs")
| where NOT (cidrmatch("10.0.0.0/8", caller_ip)
    OR cidrmatch("172.16.0.0/12", caller_ip)
    OR cidrmatch("192.168.0.0/16", caller_ip))
| bin _time span=1h
| stats count dc(uri) as unique_containers
    by caller_ip _time
| where count > 50

Phase 3: Data Exfiltration (Days 4-11)

Attacker Actions

GLASS SPIDER exfiltrates 500,247 policyholder records over 8 days, rate-limiting downloads to avoid triggering bandwidth alerts:

Exfiltration Pattern (Azure Storage Logs)

Day 4:  63,221 records (DE subset) -- 2.1 GB
Day 5:  71,033 records (FR subset) -- 2.4 GB
Day 6:  67,445 records (NL subset) -- 2.2 GB
Day 7:  54,102 records (IT subset) -- 1.8 GB
Day 8:  48,891 records (ES subset) -- 1.6 GB
Day 9:  42,334 records (BE + AT)   -- 1.4 GB
Day 10: 38,221 records (PL + PT)   -- 1.3 GB
Day 11: 115,000 records (remaining) -- 3.8 GB
----------------------------------------
Total: 500,247 records -- 16.6 GB
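The totals above can be checked with a few lines of arithmetic, which also show how far the attacker stayed below the scenario's 5 GB/hour bandwidth alert threshold. This is a sketch: the per-hour figure assumes each day's downloads were spread evenly over 24 hours, which the scenario does not state, and the names are illustrative.

```python
# (day, records, gigabytes) copied from the exfiltration pattern above.
DAILY_EXFIL = [
    (4, 63_221, 2.1),
    (5, 71_033, 2.4),
    (6, 67_445, 2.2),
    (7, 54_102, 1.8),
    (8, 48_891, 1.6),
    (9, 42_334, 1.4),
    (10, 38_221, 1.3),
    (11, 115_000, 3.8),
]
HOURLY_ALERT_GB = 5.0  # bandwidth alert threshold named in the scenario


def summarize(daily):
    """Total the daily figures and estimate the busiest day's hourly rate."""
    total_records = sum(records for _, records, _ in daily)
    total_gb = round(sum(gb for _, _, gb in daily), 1)
    peak_day, _, peak_gb = max(daily, key=lambda row: row[2])
    return {
        "total_records": total_records,
        "total_gb": total_gb,
        "peak_day": peak_day,
        # Assumption: downloads spread evenly across the day.
        "peak_avg_gb_per_hour": peak_gb / 24,
    }


if __name__ == "__main__":
    print(summarize(DAILY_EXFIL))
```

Even on the busiest day (Day 11, 3.8 GB), the whole day's volume is smaller than what the hourly threshold allows in a single hour, so a per-hour bandwidth alert would not fire on this pattern.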

Sample Exfiltrated Record Structure (Synthetic -- No Real Data)

{
  "record_id": "EH-DE-00142857",
  "data_subject": {
    "name": "REDACTED",
    "date_of_birth": "REDACTED",
    "national_id": "REDACTED",
    "address": "REDACTED, Berlin, DE"
  },
  "policy": {
    "policy_number": "POL-DE-2024-142857",
    "type": "comprehensive_health",
    "start_date": "2024-01-15",
    "premium_annual": 3240.00
  },
  "medical": {
    "icd10_codes": ["E11.9", "I10", "J45.0"],
    "last_claim_date": "2025-11-22",
    "provider": "REDACTED Medical Center"
  }
}

Data Classification Impact

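The exfiltrated records contain GDPR Article 9 special category data (health information), including ICD-10 diagnosis codes. Unlawful processing of such data falls under the higher fine tier: up to EUR 20 million or 4% of total worldwide annual turnover, whichever is higher (GDPR Art. 83(5)). Failures of the security and breach notification obligations (Arts. 32-34) fall under Art. 83(4), capped at EUR 10 million or 2%.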

Detection Queries

KQL (Microsoft Sentinel)

// Detect sustained data exfiltration from data lake
StorageBlobLogs
| where AccountName == "eurohealthdatalake"
| where OperationName == "GetBlob"
// toint() works whether StatusCode is stored as a string or a number
| where toint(StatusCode) == 200
// CallerIpAddress can carry a port suffix; keep only the address part
| extend CallerIp = tostring(split(CallerIpAddress, ":")[0])
| where not(ipv4_is_private(CallerIp))
| summarize DailyDownloads=count(),
    DailyBytes=sum(ResponseBodySize),
    UniqueBlobs=dcount(Uri)
    by CallerIp, bin(TimeGenerated, 1d)
| where DailyDownloads > 10000
    or DailyBytes > 1073741824  // 1 GiB
| order by TimeGenerated asc

SPL (Splunk)

index=azure sourcetype=azure:storage:blob
    account_name="eurohealthdatalake"
    operation="GetBlob" status=200
| where NOT (cidrmatch("10.0.0.0/8", caller_ip)
    OR cidrmatch("172.16.0.0/12", caller_ip)
    OR cidrmatch("192.168.0.0/16", caller_ip))
| bin _time span=1d
| stats count sum(response_size) as daily_bytes
    dc(uri) as unique_blobs by caller_ip _time
| where count > 10000 OR daily_bytes > 1073741824
| sort _time

Discussion Injects

Legal

The data includes health records from 12 EU member states. Under GDPR, which supervisory authority has lead jurisdiction? How does the "one-stop-shop" mechanism (Art. 56) work for cross-border breaches?

Decision

The attacker is downloading at a rate that stays below your bandwidth alert threshold of 5 GB/hour. How would you detect this slow-and-steady exfiltration pattern?
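One possible answer to the inject above is to alert on cumulative volume per caller over a multi-day rolling window instead of on an hourly rate. The sketch below is illustrative only: the 7-day window, the 5,000 MB limit, and the function name are assumptions, and the daily figures are the scenario's GB values converted to whole megabytes.

```python
from collections import defaultdict, deque

ATTACKER_IP = "203.0.113.88"

# (day, caller_ip, megabytes) derived from the scenario's daily volumes.
DAILY_VOLUME_MB = [
    (4, ATTACKER_IP, 2100), (5, ATTACKER_IP, 2400), (6, ATTACKER_IP, 2200),
    (7, ATTACKER_IP, 1800), (8, ATTACKER_IP, 1600), (9, ATTACKER_IP, 1400),
    (10, ATTACKER_IP, 1300), (11, ATTACKER_IP, 3800),
]


def cumulative_alerts(events, window_days=7, limit_mb=5000):
    """Return (day, ip, window_total_mb) whenever a caller's rolling
    total over the last `window_days` days exceeds `limit_mb`.
    `events` must be sorted by day."""
    windows = defaultdict(deque)  # ip -> deque of (day, megabytes)
    totals = defaultdict(int)     # ip -> running total inside the window
    alerts = []
    for day, ip, megabytes in events:
        window = windows[ip]
        window.append((day, megabytes))
        totals[ip] += megabytes
        # Evict entries that have aged out of the rolling window.
        while window and window[0][0] <= day - window_days:
            _, expired = window.popleft()
            totals[ip] -= expired
        if totals[ip] > limit_mb:
            alerts.append((day, ip, totals[ip]))
    return alerts
```

With these inputs the rolling total first crosses the limit on Day 6 (2,100 + 2,400 + 2,200 = 6,700 MB), even though no single hour or day looks unusual on its own. In practice the limit would be set per caller from a baseline of normal access rather than as a fixed number.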

Phase 4: Discovery & Regulatory Clock Starts (Day 14)

Discovery

A threat intelligence vendor alerts EuroHealth that a dataset labeled "EuroHealth EU Policyholders -- 500K records" has appeared on a dark web marketplace operated by GLASS SPIDER. The listing includes sample records matching EuroHealth's data schema.

Dark Web Listing (Reconstructed -- Sanitized)

=== GLASS SPIDER MARKETPLACE ===

ITEM: EuroHealth Insurance -- EU Policyholder Database
RECORDS: 500,247
COUNTRIES: DE, FR, NL, IT, ES, BE, AT, PL, PT, IE, DK, SE
DATA FIELDS: Full name, DOB, National ID, Address, 
             Insurance policy details, Medical diagnosis codes
SAMPLE: [5 sanitized records provided]
PRICE: 15 BTC (negotiable for bulk buyers)

EXTORTION NOTICE: EuroHealth has 7 days to pay 50 BTC
or this database will be released publicly and reported
to every EU Data Protection Authority.

The 72-Hour Clock

GDPR Article 33 -- Notification to Supervisory Authority

Clock starts: The moment EuroHealth becomes aware of the breach (Day 14, when threat intel report is received and validated).

Deadline: 72 hours from awareness to notify the lead supervisory authority.

Required content (Art. 33(3)):

Nature of the breach (categories and approximate number of data subjects and of personal data records concerned)
Name and contact details of the DPO or other contact point
Likely consequences of the breach
Measures taken or proposed to address the breach

GDPR Article 34 -- Communication to Data Subjects

Trigger: When the breach is "likely to result in a high risk to the rights and freedoms of natural persons."

Assessment: Health data + national ID numbers + financial data = HIGH RISK -- notification to data subjects is mandatory.

Challenge: 500,247 data subjects across 12 EU member states, multiple languages, multiple notification channels.

Regulatory Response Timeline

Hour	Action	Owner
0	Breach confirmed -- 72-hour clock starts	DPO + CISO
1-4	Scope assessment -- which records, which countries	IR Team
4-8	Draft Article 33 notification to lead supervisory authority	DPO + Legal
8-12	Engage external legal counsel in each affected member state	Legal
12-24	Determine if Article 34 notification to individuals is required	DPO
24-36	Draft data subject notifications in 9 languages	Comms + Legal
36-48	Prepare call center capacity for 500K+ potential inquiries	Operations
48-60	Submit Article 33 notification to lead supervisory authority	DPO
60-72	Begin Article 34 notification to data subjects	Comms
72+	Ongoing -- respond to supervisory authority questions	DPO + Legal
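The timeline above can be turned into absolute timestamps once the moment of awareness is recorded. The sketch below assumes a hypothetical awareness time of 09:00 UTC on 2026-03-15 (Day 14 counted from the 2026-03-01 log entry; the time of day is invented for illustration), counts elapsed hours only, and uses a subset of the table rows keyed by their start hour.

```python
from datetime import datetime, timedelta, timezone

ARTICLE_33_WINDOW = timedelta(hours=72)

# Start hour of selected rows in the timeline above, relative to awareness.
TIMELINE_START_HOURS = {
    "Scope assessment": 1,
    "Draft Article 33 notification": 4,
    "Submit Article 33 notification": 48,
    "Begin Article 34 notification": 60,
}


def article33_deadline(aware_at: datetime) -> datetime:
    """Latest time to notify the supervisory authority (awareness + 72 h)."""
    if aware_at.tzinfo is None:
        raise ValueError("use a timezone-aware timestamp")
    return aware_at + ARTICLE_33_WINDOW


def schedule(aware_at: datetime) -> dict:
    """Map each timeline task to its absolute start time."""
    return {
        task: aware_at + timedelta(hours=offset)
        for task, offset in TIMELINE_START_HOURS.items()
    }


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Hours left before the Article 33 deadline (negative once overdue)."""
    return (article33_deadline(aware_at) - now).total_seconds() / 3600


if __name__ == "__main__":
    aware = datetime(2026, 3, 15, 9, 0, tzinfo=timezone.utc)  # hypothetical
    print("Deadline:", article33_deadline(aware).isoformat())
    for task, start in schedule(aware).items():
        print(f"{start.isoformat()}  {task}")
```

Requiring timezone-aware timestamps avoids ambiguity when the IR team, the DPO, and external counsel work in different time zones across member states.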

Discussion Injects

Legal

The extortion group demands 50 BTC to not release the data publicly. Should EuroHealth pay? What are the legal implications of paying ransomware/extortion demands in the EU? Does payment affect the GDPR notification obligation?

Decision

You have confirmed 500K records across 12 member states. The lead supervisory authority is in your HQ jurisdiction, and the authorities in the other 11 states are concerned supervisory authorities that will expect to be kept informed (Art. 56, Art. 60). How do you coordinate this while the 72-hour clock is running?

Phase 5: Containment & Remediation

Immediate Technical Actions (Hour 0-4)

Revoke compromised API key immediately
Rotate all data lake credentials -- SharedKey, SAS tokens, service principals
Enable Azure Storage diagnostic logging for all read, write, and delete operations and confirm the logs reach Microsoft Sentinel
Block attacker IP 203.0.113.88 across all network controls
Audit all GitHub repositories for exposed credentials using trufflehog or gitleaks
Enable GitHub secret scanning and push protection on all repositories

Privacy-Specific Actions

Data mapping -- Identify exactly which records were exfiltrated using blob access logs
Risk assessment -- Evaluate likelihood and severity of harm to data subjects per GDPR Art. 34
Cross-border coordination -- Notify lead supervisory authority and cooperate with concerned authorities in all 12 member states
Data subject notification -- Prepare notifications in 9 languages (DE, FR, NL, IT, ES, PT, PL, DA, SV)
Credit monitoring -- Offer identity theft monitoring to affected data subjects
Data subject rights -- Prepare for surge in access requests (GDPR Art. 15) and erasure requests (Art. 17)
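The data mapping step in the list above can be sketched as a per-country count of distinct blobs read by the attacker. Everything about the URI layout here is an assumption: the scenario shows only the `eu-citizens/<country>/` prefix, so the blob file names, the one-blob-per-record relationship, and the function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlparse

# Path pattern built from the container and prefix names in this scenario.
COUNTRY_PREFIX = re.compile(r"/policyholder-records/eu-citizens/([A-Z]{2})/")


def country_of(uri: str) -> Optional[str]:
    """Extract the two-letter country code from a blob URI, if present."""
    match = COUNTRY_PREFIX.search(urlparse(uri).path)
    return match.group(1) if match else None


def affected_by_country(uris: Iterable[str]) -> Counter:
    """Count distinct blobs read per country code.

    Repeated downloads of the same blob are counted once, on the
    assumption that one blob holds one policyholder record."""
    distinct = set(uris)
    return Counter(code for code in map(country_of, distinct) if code)
```

Fed with the URIs of successful GetBlob requests from the attacker IP, the resulting counts give the per-member-state figures needed for the Article 33 notification and for sizing each language's data subject notification.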

Preventive Controls

Secret management -- Migrate all API keys to Azure Key Vault with managed identities
Pre-commit hooks -- Install gitleaks as a pre-commit hook to prevent credential commits
Data lake access controls -- Replace SharedKey with Microsoft Entra ID (formerly Azure AD) authentication and conditional access
Data Loss Prevention -- Enable Microsoft Purview (formerly Azure Purview) DLP policies for special category data
Network restrictions -- Data lake accessible only from VNet service endpoints (no public access)
Anomaly detection -- Alert on any external IP accessing the data lake
Data minimization -- Review if 2.1M policyholder records need to be in a single data lake (GDPR Art. 5(1)(c))

Detection Improvements

KQL (Microsoft Sentinel)

// Alert on data lake access from non-corporate IPs
StorageBlobLogs
| where AccountName == "eurohealthdatalake"
// toint() works whether StatusCode is stored as a string or a number
| where toint(StatusCode) == 200
// CallerIpAddress can carry a port suffix; keep only the address part
| extend CallerIp = tostring(split(CallerIpAddress, ":")[0])
| where not(ipv4_is_private(CallerIp))
| extend AlertSeverity = "Critical"
| project TimeGenerated, AlertSeverity, OperationName,
    CallerIp, Uri, ResponseBodySize

SPL (Splunk)

index=azure sourcetype=azure:storage:blob
    account_name="eurohealthdatalake" status=200
| where NOT cidrmatch("10.0.0.0/8", caller_ip)
    AND NOT cidrmatch("172.16.0.0/12", caller_ip)
    AND NOT cidrmatch("192.168.0.0/16", caller_ip)
| eval alert_severity="critical"
| table _time alert_severity operation caller_ip uri response_size
| sendalert external_datalake_access

Indicators of Compromise

Network IOCs

IOC	Type	Context
`203.0.113.88`	IPv4	Attacker IP -- data lake access and exfiltration
`api.eurohealth.example.com`	Domain	Target API gateway
`eurohealth-datalake.example.com`	Domain	Target data lake endpoint

Cloud IOCs

IOC	Type	Context
`REDACTED` (SharedKey format)	API Key	Compromised data lake credential
`policyholder-records`	Blob Container	Primary target for exfiltration
`eu-citizens/`	Blob Prefix	Path pattern for targeted data

Behavioral IOCs

Indicator	Description
External IP accessing data lake with SharedKey auth	No legitimate external access should use SharedKey
Sequential ListBlobs across all country prefixes	Systematic enumeration pattern
Sustained 1.3-3.8 GB/day download from single external IP	Rate-limited exfiltration below alert threshold
Access to `gdpr-sar-responses` container	Attacker also accessed prior GDPR response data

ATT&CK Mapping

Phase	Technique	ID	Tactic
Initial Access	Valid Accounts: Cloud Accounts	T1078.004	Initial Access
Discovery	Cloud Storage Object Discovery	T1619	Discovery
Collection	Data from Cloud Storage	T1530	Collection
Collection	Data from Information Repositories	T1213	Collection
Exfiltration	Exfiltration Over Web Service	T1567	Exfiltration
Exfiltration	Exfiltration Over Alternative Protocol	T1048	Exfiltration

Lessons Learned

Credential hygiene prevents breaches -- A single .env file committed to GitHub 47 days before the attacker found it enabled the entire breach. A pre-commit hook with secret scanning would very likely have blocked the commit at negligible cost.
GDPR's 72-hour clock demands preparation, not improvisation -- Organizations must have pre-drafted notification templates, pre-identified legal counsel in each jurisdiction, and pre-established supervisory authority contacts. You cannot build this infrastructure during an active incident.
Health data elevates every aspect of the response -- GDPR Article 9 special category data pushes the Article 34 risk assessment toward mandatory data subject notification and exposes the organization to fines of up to 4% of global annual turnover. Data classification must drive security control investment.
Slow exfiltration defeats threshold-based alerting -- The attacker moved 1.3-3.8 GB per day, far below the 5 GB/hour bandwidth alert threshold. Behavioral analytics that establish baselines and detect anomalous access patterns -- not just volume thresholds -- are essential.
Cross-border breaches multiply complexity -- 12 member states means 12 potential regulatory investigations, 9 languages for notifications, and 12 sets of national laws supplementing GDPR. The "one-stop-shop" mechanism helps but does not eliminate this complexity.

Cross-References

Chapter 56: Privacy Engineering -- Privacy-by-design principles and GDPR compliance
Chapter 9: Incident Response Lifecycle -- IR methodology and regulatory notification procedures