SC-110: GDPR Privacy Breach -- Mass Data Subject Notification

Operation CRYSTAL LEAK

Classification: TABLETOP EXERCISE -- 100% Synthetic

All organizations, IP addresses, domains, personal data, and threat actors in this scenario are entirely fictional. Created for educational tabletop exercises only. No real personal data is used.


Scenario Metadata

Field | Value
Difficulty | ★★★★☆ (Advanced)
Duration | 3-4 hours
Participants | 6-10 (SOC, IR, Legal, Privacy/DPO, Communications, Executive)
ATT&CK Techniques | T1530 · T1567 · T1078 · T1213 · T1048
Threat Actor | GLASS SPIDER (data broker / extortion group)
Industry | Healthcare / Insurance
Primary Impact | 500K EU citizen records exposed, GDPR Article 33/34 notification triggered

Threat Actor Profile: GLASS SPIDER

Attribute | Detail
Motivation | Financial -- data brokering and extortion
Sophistication | High -- targets regulated data for maximum leverage
Known Targets | Healthcare providers, insurance companies, government agencies (EU-focused)
Avg. Dwell Time | 15-30 days
Signature | Exfiltrates personal data, then threatens public disclosure and regulatory complaints to pressure ransom payment
Tools | Cloud storage enumeration scripts, custom exfiltration tools using legitimate cloud services, Tor-based leak sites

Executive Summary

GLASS SPIDER compromises an API key for EuroHealth Insurance (a synthetic healthcare insurer with 3,200 employees, headquartered in a synthetic EU jurisdiction). The key provides access to a cloud-hosted data lake containing policyholder records. Over the 11 days following initial access, the attacker enumerates the data lake and exfiltrates 500,247 EU citizen records, including names, dates of birth, national ID numbers, medical diagnosis codes (ICD-10), and insurance policy details. The data is staged in an attacker-controlled Azure Blob Storage account and subsequently posted to an extortion site.

The breach is discovered on Day 14, when a threat intelligence feed flags EuroHealth data on a dark web marketplace. Discovery triggers the GDPR Article 33 requirement to notify the supervisory authority within 72 hours of becoming aware of the breach, and the Article 34 requirement to notify affected data subjects without undue delay. The organization must simultaneously run a technical IR investigation and a complex regulatory response across 12 EU member states.


Environment Setup

Target Organization: EuroHealth Insurance (synthetic)

Asset | Detail
Industry | Healthcare insurance, 3,200 employees, 2.1M policyholders
HQ | Synthetic EU member state
Operations | 12 EU member states, subject to GDPR and national health data regulations
Data Lake | Azure Data Lake Storage Gen2: eurohealth-datalake.example.com (10.30.0.0/16)
API Gateway | api.eurohealth.example.com (10.30.1.20)
DPO | Designated Data Protection Officer (required under GDPR Art. 37)
SIEM | Microsoft Sentinel
Data Classification | Policyholder data classified as "Special Category" (GDPR Art. 9 -- health data)

Phase 1: Initial Access -- API Key Compromise (Day 0)

Attacker Actions

GLASS SPIDER discovers an exposed API key in a public GitHub repository. A developer at EuroHealth accidentally committed a .env file containing the data lake API credentials:

Exposed .env File (GitHub Commit -- Reconstructed)

# EuroHealth Data Lake -- DEV/TEST ONLY
# WARNING: DO NOT COMMIT
DATALAKE_API_KEY=REDACTED
DATALAKE_ENDPOINT=https://eurohealth-datalake.example.com
DATALAKE_CONTAINER=policyholder-records
DB_CONNECTION=Server=10.30.2.15;Database=PolicyDB;User=testuser;Password=REDACTED

Root Cause

The .gitignore file existed but did not include .env files in the config/ subdirectory. The developer committed the file 47 days before the attacker discovered it. GitHub secret scanning was not enabled on the repository.
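
The kind of check a pre-commit hook or repository scan performs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regular expression and the function name `find_secret_assignments` are hypothetical and far simpler than the rule sets used by GitHub secret scanning, gitleaks, or trufflehog.

```python
import re

# Hypothetical rule: flag NAME=value lines whose NAME suggests a credential.
SECRET_NAME = re.compile(r"(KEY|SECRET|TOKEN|PASSWORD|CONNECTION)", re.IGNORECASE)


def find_secret_assignments(text: str) -> list[tuple[int, str]]:
    """Return (line_number, variable_name) for suspicious assignments."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        name, sep, value = stripped.partition("=")
        if sep and value and SECRET_NAME.search(name):
            findings.append((number, name.strip()))
    return findings


# The reconstructed .env file from this scenario (values already redacted).
ENV_FILE = """\
# EuroHealth Data Lake -- DEV/TEST ONLY
# WARNING: DO NOT COMMIT
DATALAKE_API_KEY=REDACTED
DATALAKE_ENDPOINT=https://eurohealth-datalake.example.com
DATALAKE_CONTAINER=policyholder-records
DB_CONNECTION=Server=10.30.2.15;Database=PolicyDB;User=testuser;Password=REDACTED
"""

if __name__ == "__main__":
    for number, name in find_secret_assignments(ENV_FILE):
        print(f"line {number}: {name} looks like a credential")
```

A name-based rule like this catches custom key formats that pattern-based scanners may not recognize, at the price of more false positives.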

Evidence Artifacts

Azure Storage Resource Log -- API Key Authentication

{
  "time": "2026-03-01T14:22:08Z",
  "operationName": "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
  "callerIpAddress": "203.0.113.88",
  "properties": {
    "accountName": "eurohealthdatalake",
    "containerName": "policyholder-records",
    "authenticationMethod": "SharedKey",
    "statusCode": 200
  }
}
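
As a rough illustration of how an analyst might triage a record like this outside the SIEM, the following Python sketch flags successful SharedKey reads from addresses outside the RFC 1918 ranges. The function names are hypothetical, and the field names simply mirror the synthetic record above.

```python
import ipaddress
import json

# RFC 1918 ranges, listed explicitly. ip_address(...).is_private is avoided
# because Python also reports documentation ranges such as 203.0.113.0/24
# (used for the attacker in this synthetic scenario) as private.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal(ip: str) -> bool:
    """True if the address falls inside one of the internal ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in INTERNAL_NETWORKS)


def is_suspicious(record: dict) -> bool:
    """Flag successful SharedKey reads that originate outside internal ranges."""
    props = record.get("properties", {})
    return (
        props.get("authenticationMethod") == "SharedKey"
        and props.get("statusCode") == 200
        and not is_internal(record["callerIpAddress"])
    )


LOG_LINE = """
{
  "time": "2026-03-01T14:22:08Z",
  "operationName": "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
  "callerIpAddress": "203.0.113.88",
  "properties": {
    "accountName": "eurohealthdatalake",
    "containerName": "policyholder-records",
    "authenticationMethod": "SharedKey",
    "statusCode": 200
  }
}
"""

if __name__ == "__main__":
    print(is_suspicious(json.loads(LOG_LINE)))
```

Checking against full CIDR ranges also covers all of 172.16.0.0/12, which a string prefix match on "172.16." does not.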

Detection Queries

KQL (Microsoft Sentinel):

// Detect data lake access from external IPs
// StatusCode is a string column in StorageBlobLogs, so compare against "200".
// Note: the "172.16." prefix covers only 172.16.0.0/16, not all of 172.16.0.0/12.
StorageBlobLogs
| where TimeGenerated > ago(7d)
| where AccountName == "eurohealthdatalake"
| where CallerIpAddress !startswith "10."
    and CallerIpAddress !startswith "172.16."
    and CallerIpAddress !startswith "192.168."
| where StatusCode == "200"
| summarize BlobCount=count(),
    DataVolume=sum(ResponseBodySize)
    by CallerIpAddress, bin(TimeGenerated, 1h)
| where BlobCount > 100 or DataVolume > 104857600  // 100 MiB

SPL (Splunk):

index=azure sourcetype=azure:storage:blob
| search account_name="eurohealthdatalake" status=200
| where NOT cidrmatch("10.0.0.0/8", caller_ip)
    AND NOT cidrmatch("172.16.0.0/12", caller_ip)
    AND NOT cidrmatch("192.168.0.0/16", caller_ip)
| bin _time span=1h
| stats count sum(response_size) as total_bytes by caller_ip _time
| where count > 100 OR total_bytes > 104857600

Discussion Injects

Technical

The API key was committed 47 days ago. What tools could have detected this earlier? How does GitHub secret scanning work, and what are its limitations for custom API key formats?


Phase 2: Data Reconnaissance & Enumeration (Days 1-3)

Attacker Actions

GLASS SPIDER enumerates the data lake structure and identifies high-value containers:

Azure Blob Container Enumeration (Reconstructed)

2026-03-02T08:15:22Z LIST containers
Source: 203.0.113.88
Results:
  - policyholder-records (4.2 TB, 2.1M objects)
  - claims-data (1.8 TB, 890K objects)
  - provider-network (230 GB, 45K objects)
  - analytics-exports (890 GB, 12K objects)
  - gdpr-sar-responses (45 GB, 3.2K objects)

2026-03-02T08:17:44Z LIST blobs 
Container: policyholder-records
Prefix: eu-citizens/
Results:
  - eu-citizens/DE/ (142K records)
  - eu-citizens/FR/ (98K records)
  - eu-citizens/NL/ (67K records)
  - eu-citizens/IT/ (54K records)
  - eu-citizens/ES/ (48K records)
  [... 7 more country prefixes ...]

The attacker identifies that policyholder records are organized by EU country code, containing what appears to be GDPR-protected special category data (health information under Article 9).

Detection Queries

KQL (Microsoft Sentinel):

// Detect bulk container enumeration
StorageBlobLogs
| where OperationName in ("ListContainers", "ListBlobs")
| where CallerIpAddress !startswith "10."
| summarize ListOps=count(),
    UniqueContainers=dcount(Uri)
    by CallerIpAddress, bin(TimeGenerated, 1h)
| where ListOps > 50

SPL (Splunk):

index=azure sourcetype=azure:storage:blob
    operation IN ("ListContainers", "ListBlobs")
| where NOT cidrmatch("10.0.0.0/8", caller_ip)
| bin _time span=1h
| stats count dc(uri) as unique_containers by caller_ip _time
| where count > 50
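
A count threshold can miss an attacker who lists slowly. As a complementary idea, the sketch below flags any caller that lists many distinct country prefixes, regardless of request rate. It is a minimal illustration: the event tuples, the internal address 10.30.4.7, and the `min_countries` default are assumptions, not values from the scenario logs.

```python
import re
from collections import defaultdict

# Matches the prefix layout shown in the enumeration log, e.g. "eu-citizens/DE/".
COUNTRY_PREFIX = re.compile(r"^eu-citizens/([A-Z]{2})/")


def countries_listed_by_ip(events):
    """Map each caller IP to the set of country prefixes it has listed."""
    seen = defaultdict(set)
    for ip, operation, prefix in events:
        if operation != "ListBlobs":
            continue
        match = COUNTRY_PREFIX.match(prefix)
        if match:
            seen[ip].add(match.group(1))
    return seen


def flag_enumeration(events, min_countries=5):
    """Return IPs that listed at least min_countries distinct country prefixes."""
    return sorted(
        ip
        for ip, countries in countries_listed_by_ip(events).items()
        if len(countries) >= min_countries
    )


# Hypothetical events: (caller IP, operation, prefix).
EVENTS = [
    ("203.0.113.88", "ListBlobs", "eu-citizens/DE/"),
    ("203.0.113.88", "ListBlobs", "eu-citizens/FR/"),
    ("203.0.113.88", "ListBlobs", "eu-citizens/NL/"),
    ("203.0.113.88", "ListBlobs", "eu-citizens/IT/"),
    ("203.0.113.88", "ListBlobs", "eu-citizens/ES/"),
    ("10.30.4.7", "ListBlobs", "eu-citizens/DE/"),
    ("10.30.4.7", "GetBlob", "eu-citizens/FR/"),
]

if __name__ == "__main__":
    print(flag_enumeration(EVENTS))
```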

Phase 3: Data Exfiltration (Days 4-11)

Attacker Actions

GLASS SPIDER exfiltrates 500,247 policyholder records over 8 days, rate-limiting downloads to avoid triggering bandwidth alerts:

Exfiltration Pattern (Azure Storage Logs)

Day 4:  63,221 records (DE subset) -- 2.1 GB
Day 5:  71,033 records (FR subset) -- 2.4 GB
Day 6:  67,445 records (NL subset) -- 2.2 GB
Day 7:  54,102 records (IT subset) -- 1.8 GB
Day 8:  48,891 records (ES subset) -- 1.6 GB
Day 9:  42,334 records (BE + AT)   -- 1.4 GB
Day 10: 38,221 records (PL + PT)   -- 1.3 GB
Day 11: 115,000 records (remaining) -- 3.8 GB
----------------------------------------
Total: 500,247 records -- 16.6 GB
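
The totals above can be checked with a short Python sketch, which also shows why a per-hour bandwidth alert never fires while a cumulative per-IP threshold does. The 5 GB/hour figure is the bandwidth alert threshold stated in this scenario; the 5 GB cumulative threshold and the function name are hypothetical.

```python
# (day, records, gigabytes) as listed in the exfiltration pattern.
DAILY = [
    (4, 63_221, 2.1),
    (5, 71_033, 2.4),
    (6, 67_445, 2.2),
    (7, 54_102, 1.8),
    (8, 48_891, 1.6),
    (9, 42_334, 1.4),
    (10, 38_221, 1.3),
    (11, 115_000, 3.8),
]

HOURLY_ALERT_GB = 5.0      # bandwidth alert threshold from the scenario
CUMULATIVE_ALERT_GB = 5.0  # hypothetical running-total threshold per external IP


def first_cumulative_alert(daily, threshold_gb):
    """Return the first day on which the running total exceeds threshold_gb."""
    total = 0.0
    for day, _records, gigabytes in daily:
        total += gigabytes
        if total > threshold_gb:
            return day
    return None


total_records = sum(records for _day, records, _gb in DAILY)
total_gb = round(sum(gb for _day, _records, gb in DAILY), 1)
# Even if a whole day's volume moved within one hour, it stays under the alert.
hourly_alert_fires = any(gb > HOURLY_ALERT_GB for _day, _records, gb in DAILY)

if __name__ == "__main__":
    print(total_records, total_gb, hourly_alert_fires)
    print(first_cumulative_alert(DAILY, CUMULATIVE_ALERT_GB))
```

Under these assumptions the running total crosses 5 GB on Day 6, five days before exfiltration completes.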

Sample Exfiltrated Record Structure (Synthetic -- No Real Data)

{
  "record_id": "EH-DE-00142857",
  "data_subject": {
    "name": "REDACTED",
    "date_of_birth": "REDACTED",
    "national_id": "REDACTED",
    "address": "REDACTED, Berlin, DE"
  },
  "policy": {
    "policy_number": "POL-DE-2024-142857",
    "type": "comprehensive_health",
    "start_date": "2024-01-15",
    "premium_annual": 3240.00
  },
  "medical": {
    "icd10_codes": ["E11.9", "I10", "J45.0"],
    "last_claim_date": "2025-11-22",
    "provider": "REDACTED Medical Center"
  }
}

Data Classification Impact

The exfiltrated records contain GDPR Article 9 special category data (health information), including ICD-10 diagnosis codes. This classification triggers the highest tier of regulatory response and exposes EuroHealth to administrative fines of up to EUR 20 million or 4% of total worldwide annual turnover, whichever is higher (GDPR Art. 83(5)).
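
For exercise purposes, the Art. 83(5) ceiling can be expressed as a one-line calculation: the higher of EUR 20 million or 4% of total worldwide annual turnover. The turnover figures in the sketch are hypothetical, since the scenario does not state EuroHealth's turnover, and the result is an upper bound rather than a predicted fine.

```python
def max_fine_eur(worldwide_turnover_eur: float) -> float:
    """Upper bound under Art. 83(5): EUR 20 million or 4% of turnover, whichever is higher."""
    four_percent = worldwide_turnover_eur * 4 / 100
    return max(20_000_000, four_percent)


if __name__ == "__main__":
    # Hypothetical turnover values for discussion only.
    for turnover in (100_000_000, 1_000_000_000):
        print(turnover, max_fine_eur(turnover))
```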

Detection Queries

KQL (Microsoft Sentinel):

// Detect sustained data exfiltration from data lake
// StatusCode is a string column in StorageBlobLogs, so compare against "200".
StorageBlobLogs
| where AccountName == "eurohealthdatalake"
| where OperationName == "GetBlob"
| where StatusCode == "200"
| where CallerIpAddress !startswith "10."
| summarize DailyDownloads=count(),
    DailyBytes=sum(ResponseBodySize),
    UniqueBlobs=dcount(Uri)
    by CallerIpAddress, bin(TimeGenerated, 1d)
| where DailyDownloads > 10000
    or DailyBytes > 1073741824  // 1 GiB
| order by TimeGenerated asc

SPL (Splunk):

index=azure sourcetype=azure:storage:blob
    account_name="eurohealthdatalake"
    operation="GetBlob" status=200
| where NOT cidrmatch("10.0.0.0/8", caller_ip)
| bin _time span=1d
| stats count sum(response_size) as daily_bytes
    dc(uri) as unique_blobs by caller_ip _time
| where count > 10000 OR daily_bytes > 1073741824
| sort _time

Discussion Injects

Legal

The data includes health records from 12 EU member states. Under GDPR, which supervisory authority has lead jurisdiction? How does the "one-stop-shop" mechanism (Art. 56) work for cross-border breaches?

Decision

The attacker is downloading at a rate that stays below your bandwidth alert threshold of 5 GB/hour. How would you detect this slow-and-steady exfiltration pattern?


Phase 4: Discovery & Regulatory Clock Starts (Day 14)

Discovery

A threat intelligence vendor alerts EuroHealth that a dataset labeled "EuroHealth EU Policyholders -- 500K records" has appeared on a dark web marketplace operated by GLASS SPIDER. The listing includes sample records matching EuroHealth's data schema.

Dark Web Listing (Reconstructed -- Sanitized)

=== GLASS SPIDER MARKETPLACE ===

ITEM: EuroHealth Insurance -- EU Policyholder Database
RECORDS: 500,247
COUNTRIES: DE, FR, NL, IT, ES, BE, AT, PL, PT, IE, DK, SE
DATA FIELDS: Full name, DOB, National ID, Address, 
             Insurance policy details, Medical diagnosis codes
SAMPLE: [5 sanitized records provided]
PRICE: 15 BTC (negotiable for bulk buyers)

EXTORTION NOTICE: EuroHealth has 7 days to pay 50 BTC
or this database will be released publicly and reported
to every EU Data Protection Authority.

The 72-Hour Clock

GDPR Article 33 -- Notification to Supervisory Authority

Clock starts: The moment EuroHealth becomes aware of the breach (Day 14, when threat intel report is received and validated).

Deadline: 72 hours from awareness to notify the lead supervisory authority.

Required content (Art. 33(3)):

  1. Nature of the breach, including the categories and approximate number of data subjects and of personal data records concerned
  2. Name and contact details of the DPO or other contact point
  3. Likely consequences of the breach
  4. Measures taken or proposed to address the breach

GDPR Article 34 -- Communication to Data Subjects

Trigger: When the breach is "likely to result in a high risk to the rights and freedoms of natural persons."

Assessment: Health data + national ID numbers + financial data = HIGH RISK -- notification to data subjects is mandatory.

Challenge: 500,247 data subjects across 12 EU member states, multiple languages, multiple notification channels.

Regulatory Response Timeline

Hour | Action | Owner
0 | Breach confirmed -- 72-hour clock starts | DPO + CISO
1-4 | Scope assessment -- which records, which countries | IR Team
4-8 | Draft Article 33 notification to lead supervisory authority | DPO + Legal
8-12 | Engage external legal counsel in each affected member state | Legal
12-24 | Determine if Article 34 notification to individuals is required | DPO
24-36 | Draft data subject notifications in 9 languages | Comms + Legal
36-48 | Prepare call center capacity for 500K+ potential inquiries | Operations
48-60 | Submit Article 33 notification to lead supervisory authority | DPO
60-72 | Begin Article 34 notification to data subjects | Comms
72+ | Ongoing -- respond to supervisory authority questions | DPO + Legal
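
To make the clock concrete during the exercise, the timeline's hour offsets can be converted into wall-clock times. The sketch below is a minimal illustration: the awareness time of 09:00 UTC on Day 14 is a hypothetical value (Day 0 is 2026-03-01 in the scenario logs), and the function names are assumptions.

```python
from datetime import datetime, timedelta, timezone

ARTICLE_33_WINDOW = timedelta(hours=72)

# Hypothetical awareness time: Day 14 of the scenario, 09:00 UTC.
AWARENESS = datetime(2026, 3, 15, 9, 0, tzinfo=timezone.utc)


def article33_deadline(awareness: datetime) -> datetime:
    """Latest time to notify the lead supervisory authority."""
    return awareness + ARTICLE_33_WINDOW


def hours_remaining(awareness: datetime, now: datetime) -> float:
    """Hours left on the 72-hour clock; negative once the deadline has passed."""
    return (article33_deadline(awareness) - now) / timedelta(hours=1)


if __name__ == "__main__":
    print("Deadline:", article33_deadline(AWARENESS).isoformat())
    # Planned submission window from the timeline: hours 48-60.
    latest_planned = AWARENESS + timedelta(hours=60)
    print("Buffer after planned submission:", hours_remaining(AWARENESS, latest_planned))
```

Using timezone-aware timestamps avoids ambiguity when teams in several member states work against the same deadline.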

Discussion Injects

Legal

The extortion group demands 50 BTC to not release the data publicly. Should EuroHealth pay? What are the legal implications of paying ransomware/extortion demands in the EU? Does payment affect the GDPR notification obligation?

Decision

You have confirmed 500K records across 12 member states. The lead supervisory authority is in your HQ jurisdiction, but you must also notify authorities in the other 11 states. How do you coordinate this while the 72-hour clock is running?


Phase 5: Containment & Remediation

Immediate Technical Actions (Hour 0-4)

  1. Revoke compromised API key immediately
  2. Rotate all data lake credentials -- SharedKey, SAS tokens, service principals
  3. Enable Azure Storage analytics logging at maximum verbosity
  4. Block attacker IP 203.0.113.88 across all network controls
  5. Audit all GitHub repositories for exposed credentials using trufflehog or gitleaks
  6. Enable GitHub secret scanning and push protection on all repositories

Privacy-Specific Actions

  1. Data mapping -- Identify exactly which records were exfiltrated using blob access logs
  2. Risk assessment -- Evaluate likelihood and severity of harm to data subjects per GDPR Art. 34
  3. Cross-border coordination -- Notify lead supervisory authority and cooperate with concerned authorities in all 12 member states
  4. Data subject notification -- Prepare notifications in 9 languages (DE, FR, NL, IT, ES, PT, PL, DA, SV), plus English for data subjects in Ireland
  5. Credit monitoring -- Offer identity theft monitoring to affected data subjects
  6. Data subject rights -- Prepare for surge in access requests (GDPR Art. 15) and erasure requests (Art. 17)
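
The data mapping step can be sketched as a count of distinct blobs read per country, derived from the URIs in the blob access logs. This is an illustration only: the URI layout `<container>/eu-citizens/<CC>/<blob name>` follows the prefixes shown in the enumeration log, while the blob file names and the function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def country_of(uri: str):
    """Extract the country code from a blob URI, or None if outside eu-citizens/."""
    parts = urlparse(uri).path.strip("/").split("/")
    # Assumed layout: <container>/eu-citizens/<CC>/<blob name>
    if len(parts) >= 4 and parts[1] == "eu-citizens":
        return parts[2]
    return None


def affected_by_country(uris):
    """Count distinct blobs read per country, ignoring repeated reads of one blob."""
    distinct = {uri.split("?", 1)[0] for uri in uris}
    return Counter(code for code in map(country_of, distinct) if code)


BASE = "https://eurohealth-datalake.example.com/policyholder-records/eu-citizens"
# Hypothetical URIs taken from GetBlob log entries for the attacker IP.
URIS = [
    f"{BASE}/DE/EH-DE-00142857.json",
    f"{BASE}/DE/EH-DE-00142857.json",  # repeated read of the same blob
    f"{BASE}/DE/EH-DE-00142858.json",
    f"{BASE}/FR/EH-FR-00000001.json",
    "https://eurohealth-datalake.example.com/claims-data/2025/claim-1.json",
]

if __name__ == "__main__":
    print(dict(affected_by_country(URIS)))
```

Per-country counts feed directly into the Article 33 notification (approximate number of data subjects) and into sizing each language's notification batch.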

Preventive Controls

  1. Secret management -- Migrate all API keys to Azure Key Vault with managed identities
  2. Pre-commit hooks -- Install gitleaks as a pre-commit hook to prevent credential commits
  3. Data lake access controls -- Replace SharedKey with Microsoft Entra ID (formerly Azure AD) authentication and conditional access
  4. Data Loss Prevention -- Enable Microsoft Purview (formerly Azure Purview) DLP policies for special category data
  5. Network restrictions -- Data lake accessible only from VNet service endpoints (no public access)
  6. Anomaly detection -- Alert on any external IP accessing the data lake
  7. Data minimization -- Review if 2.1M policyholder records need to be in a single data lake (GDPR Art. 5(1)(c))

Detection Improvements

KQL (Microsoft Sentinel):

// Alert on data lake access from non-corporate IPs
// StatusCode is a string column in StorageBlobLogs, so compare against "200".
// Note: the "172.16." prefix covers only 172.16.0.0/16, not all of 172.16.0.0/12.
StorageBlobLogs
| where AccountName == "eurohealthdatalake"
| where StatusCode == "200"
| where CallerIpAddress !startswith "10."
    and CallerIpAddress !startswith "172.16."
    and CallerIpAddress !startswith "192.168."
| extend AlertSeverity = "Critical"
| project TimeGenerated, OperationName,
    CallerIpAddress, Uri, ResponseBodySize, AlertSeverity

SPL (Splunk):

index=azure sourcetype=azure:storage:blob
    account_name="eurohealthdatalake" status=200
| where NOT cidrmatch("10.0.0.0/8", caller_ip)
    AND NOT cidrmatch("172.16.0.0/12", caller_ip)
    AND NOT cidrmatch("192.168.0.0/16", caller_ip)
| eval alert_severity="critical"
| table _time operation caller_ip uri response_size alert_severity
| sendalert external_datalake_access

Indicators of Compromise

Network IOCs

IOC | Type | Context
203.0.113.88 | IPv4 | Attacker IP -- data lake access and exfiltration
api.eurohealth.example.com | Domain | Target API gateway
eurohealth-datalake.example.com | Domain | Target data lake endpoint

Cloud IOCs

IOC | Type | Context
REDACTED (SharedKey format) | API Key | Compromised data lake credential
policyholder-records | Blob Container | Primary target for exfiltration
eu-citizens/ | Blob Prefix | Path pattern for targeted data

Behavioral IOCs

Indicator | Description
External IP accessing data lake with SharedKey auth | No legitimate external access should use SharedKey
Sequential ListBlobs across all country prefixes | Systematic enumeration pattern
Sustained 2-4 GB/day download from single external IP | Rate-limited exfiltration below alert threshold
Access to gdpr-sar-responses container | Attacker also accessed prior GDPR response data

ATT&CK Mapping

Phase | Technique | ID | Tactic
Initial Access | Valid Accounts: Cloud Accounts | T1078.004 | Initial Access
Discovery | Cloud Storage Object Discovery | T1619 | Discovery
Collection | Data from Cloud Storage | T1530 | Collection
Collection | Data from Information Repositories | T1213 | Collection
Exfiltration | Exfiltration Over Web Service | T1567 | Exfiltration
Exfiltration | Exfiltration Over Alternative Protocol | T1048 | Exfiltration

Lessons Learned

  1. Credential hygiene prevents breaches -- A single .env file, committed to GitHub 47 days before the attacker found it, enabled the entire breach. Pre-commit hooks with secret scanning would likely have blocked the commit at negligible cost.
  2. GDPR's 72-hour clock demands preparation, not improvisation -- Organizations must have pre-drafted notification templates, pre-identified legal counsel in each jurisdiction, and pre-established supervisory authority contacts. You cannot build this infrastructure during an active incident.
  3. Health data elevates every aspect of the response -- GDPR Article 9 special category data triggers the highest regulatory tier, mandatory data subject notification, and potential fines of up to 4% of global annual turnover. Data classification must drive security control investment.
  4. Slow exfiltration defeats threshold-based alerting -- The attacker never moved more than 3.8 GB in a single day, far below the 5 GB/hour bandwidth alert threshold. Behavioral analytics that establish baselines and detect anomalous access patterns -- not just volume thresholds -- are essential.
  5. Cross-border breaches multiply complexity -- 12 member states means 12 potential regulatory investigations, 9 languages for notifications, and 12 sets of national laws supplementing GDPR. The "one-stop-shop" mechanism helps but does not eliminate this complexity.

Cross-References