Chapter 51: Kubernetes Security — From Pod to Cluster¶
Overview¶
Kubernetes has become the dominant container orchestration platform, powering everything from startup microservices to nation-scale infrastructure. With that ubiquity comes an expansive attack surface: a misconfigured RBAC binding, an unencrypted etcd store, or a privileged pod can turn a single container compromise into full cluster takeover in minutes. This chapter provides a comprehensive, defense-first treatment of Kubernetes security. We move systematically from the architecture layer (control plane, kubelet, etcd, API server) through workload isolation (Pod Security Standards, network policies) to operational security (secrets management, supply chain integrity, runtime detection, audit logging). Every attack technique is paired with KQL and SPL detection queries, mapped to MITRE ATT&CK, and grounded in real-world defensive practice. This is the chapter you reference when you need to secure — or assess — a Kubernetes cluster end to end.
Educational Content Only
All techniques in this chapter are presented for defensive understanding only. All cluster names, IP addresses, namespaces, service accounts, and scenarios are 100% synthetic. IP addresses use RFC 5737 (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24) and RFC 1918 ranges. Credentials shown are placeholders (REDACTED). Never execute offensive techniques without explicit written authorization against clusters you own or have written permission to test.
Learning Objectives¶
By the end of this chapter, students SHALL be able to:
- Describe the Kubernetes architecture security model and identify the trust boundaries between control plane components (Knowledge)
- Differentiate between Pod Security Standards profiles (Privileged, Baseline, Restricted) and select the appropriate profile for a given workload (Analysis)
- Design least-privilege RBAC policies using namespace-scoped Roles, targeted ClusterRoles, and service account hardening (Synthesis)
- Evaluate container escape risk by mapping pod security context settings to known escape vectors (Evaluation)
- Implement Kubernetes-native and external secrets management strategies with encryption at rest (Application)
- Construct network policies that enforce default-deny, namespace isolation, and egress filtering (Synthesis)
- Assess container supply chain security posture using image signing, admission controllers, and SBOM verification (Evaluation)
- Develop KQL and SPL detection queries for Kubernetes attack techniques using audit log telemetry (Synthesis)
- Apply runtime security tools (Falco, Tetragon) to detect container escapes, crypto-mining, and anomalous process execution (Application)
- Create a Kubernetes audit logging policy that balances security visibility with cluster performance (Synthesis)
Prerequisites¶
- Completion of Chapter 20: Cloud Attack & Defense — cloud security fundamentals, shared responsibility model
- Completion of Chapter 46: Cloud & Container Red Teaming — container escape concepts, cloud IAM escalation
- Familiarity with Chapter 35: DevSecOps Pipeline — CI/CD security, shift-left principles
- Familiarity with Chapter 5: Detection Engineering at Scale — detection query methodology
- Working knowledge of Linux namespaces, cgroups, and container fundamentals
Why This Matters
In 2024, over 75% of organizations running Kubernetes reported at least one serious security incident tied to misconfiguration (Red Hat State of Kubernetes Security Report). The Kubernetes API server is the single largest attack surface in modern infrastructure — it is internet-facing by default in managed offerings, processes every cluster mutation, and holds the keys to every secret in the cluster. A single overly permissive ClusterRoleBinding or an unpatched kubelet can cascade into complete infrastructure compromise. Kubernetes security is not optional — it is existential.
MITRE ATT&CK Kubernetes Mapping¶
| Technique ID | Technique Name | Kubernetes Context | Tactic |
|---|---|---|---|
| T1078.001 | Valid Accounts: Default Accounts | Default service account token abuse | Initial Access (TA0001) |
| T1552.007 | Unsecured Credentials: Container API | Kubelet API credential exposure | Credential Access (TA0006) |
| T1609 | Container Administration Command | kubectl exec into pods | Execution (TA0002) |
| T1610 | Deploy Container | Malicious pod deployment via RBAC abuse | Execution (TA0002) |
| T1611 | Escape to Host | Privileged container, hostPID, nsenter | Privilege Escalation (TA0004) |
| T1613 | Container and Resource Discovery | API server enumeration of pods, secrets, nodes | Discovery (TA0007) |
| T1053.007 | Scheduled Task/Job: Container Orchestration Job | CronJob for persistence | Persistence (TA0003) |
| T1071.001 | Application Layer Protocol: Web Protocols | API server communication over HTTPS | Command & Control (TA0011) |
| T1046 | Network Service Discovery | Service/endpoint enumeration within cluster | Discovery (TA0007) |
| T1070.004 | Indicator Removal: File Deletion | Ephemeral pod destruction to remove evidence | Defense Evasion (TA0005) |
| T1098 | Account Manipulation | ClusterRoleBinding creation for persistence | Persistence (TA0003) |
| T1574 | Hijack Execution Flow | Mutating webhook injection | Persistence (TA0003) |
51.1 Kubernetes Architecture Security Model¶
51.1.1 Control Plane Components and Trust Boundaries¶
Understanding the architecture is the prerequisite for understanding the attack surface. Every component in the Kubernetes control plane has distinct trust boundaries and failure modes.
graph TD
subgraph Control Plane - 198.51.100.0/24
API[kube-apiserver\n198.51.100.10:6443]
ETCD[etcd\n198.51.100.20:2379]
SCHED[kube-scheduler\n198.51.100.10]
CM[kube-controller-manager\n198.51.100.10]
end
subgraph Worker Node 1 - 198.51.100.101
KL1[kubelet\n:10250]
KP1[kube-proxy]
POD1[Pod A\nsynth-app]
POD2[Pod B\nsynth-api]
end
subgraph Worker Node 2 - 198.51.100.102
KL2[kubelet\n:10250]
KP2[kube-proxy]
POD3[Pod C\nsynth-worker]
end
API -->|mTLS| ETCD
API -->|mTLS| KL1
API -->|mTLS| KL2
SCHED -->|mTLS| API
CM -->|mTLS| API
POD1 -->|ServiceAccount Token| API
POD3 -->|ServiceAccount Token| API
style API fill:#e63946,color:#fff
style ETCD fill:#780000,color:#fff
style KL1 fill:#457b9d,color:#fff
style KL2 fill:#457b9d,color:#fff
51.1.2 Component Security Properties¶
| Component | Port | Authentication | Authorization | Key Risk |
|---|---|---|---|---|
| kube-apiserver | 6443 | x509 client certs, OIDC tokens, service account JWTs | RBAC, ABAC, Webhook | Internet exposure; single point of failure for all cluster operations |
| etcd | 2379 | mTLS (peer + client certs) | etcd RBAC (rarely used — API server is sole client) | Stores all cluster state including Secrets in plaintext by default |
| kubelet | 10250 | x509 client certs, bearer tokens | Webhook (delegates to API server) | --anonymous-auth=true default in some distributions |
| kube-scheduler | 10259 | Localhost only | N/A | Compromise allows malicious pod placement decisions |
| kube-controller-manager | 10257 | Localhost only | N/A | Holds credentials for all controllers (node, SA token, etc.) |
| kube-proxy | 10256 | Localhost only (healthz) | N/A | Manipulating iptables/IPVS rules enables traffic interception |
| CoreDNS | 53 | None (cluster-internal) | N/A | DNS poisoning enables service impersonation |
51.1.3 API Server Hardening¶
The API server is the central nervous system of Kubernetes. Every kubectl command, every controller reconciliation loop, every pod scheduling decision flows through it.
API SERVER HARDENING CHECKLIST
══════════════════════════════════════════════════════
Authentication:
□ Disable anonymous authentication (--anonymous-auth=false)
□ Use OIDC for human users (integrate with IdP)
□ Use x509 client certificates for system components
□ Disable static token files (--token-auth-file)
□ Confirm basic auth (--basic-auth-file) is absent (removed in Kubernetes 1.19; verify on older clusters)
Authorization:
□ Use RBAC mode exclusively (--authorization-mode=RBAC,Node)
□ Never use ABAC in production (file-based policy, requires API server restart to change, hard to audit)
□ Disable AlwaysAllow (catastrophic misconfiguration)
Network:
□ Restrict API server to private network or IP allowlist
□ Front the API server with a TCP passthrough load balancer (terminating TLS breaks x509 client-cert auth)
□ Enable audit logging (see Section 51.10)
□ Rate-limit API requests to prevent DoS
TLS:
□ Minimum TLS 1.2 (--tls-min-version=VersionTLS12)
□ Strong cipher suites only (--tls-cipher-suites)
□ Rotate API server certificates before expiry
□ Use separate CA for etcd communication
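The checklist above can be mechanized. Below is a minimal sketch that audits a parsed kube-apiserver flag set against a few of these items. The flag names are real kube-apiserver options; the `audit_apiserver_flags` helper and its finding strings are hypothetical, for illustration only.

```python
# SYNTHETIC — sketch: validate a parsed kube-apiserver flag set against
# the hardening checklist above. Audit logic is illustrative.
def audit_apiserver_flags(flags: dict) -> list:
    """Return a list of findings for a kube-apiserver flag dictionary."""
    findings = []
    # kube-apiserver defaults anonymous auth to true if unset
    if flags.get("--anonymous-auth", "true") != "false":
        findings.append("anonymous auth not disabled")
    modes = flags.get("--authorization-mode", "AlwaysAllow").split(",")
    if "AlwaysAllow" in modes:
        findings.append("AlwaysAllow authorization enabled (catastrophic)")
    if "RBAC" not in modes:
        findings.append("RBAC not enabled")
    if "--token-auth-file" in flags:
        findings.append("static token file in use")
    if flags.get("--tls-min-version", "") not in ("VersionTLS12", "VersionTLS13"):
        findings.append("TLS minimum version below 1.2 or unset")
    return findings

# A hardened flag set produces no findings
hardened = {
    "--anonymous-auth": "false",
    "--authorization-mode": "Node,RBAC",
    "--tls-min-version": "VersionTLS12",
}
assert audit_apiserver_flags(hardened) == []
```

In practice you would feed this from the static pod manifest (`/etc/kubernetes/manifests/kube-apiserver.yaml`) or a CIS benchmark tool such as kube-bench rather than hand-built dictionaries.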
51.1.4 Kubelet Security¶
# SYNTHETIC — kubelet configuration hardening
# File: /var/lib/kubelet/config.yaml on each node
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
anonymous:
enabled: false # CRITICAL: disable anonymous access
webhook:
enabled: true # Delegate auth to API server
x509:
clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
mode: Webhook # Delegate authz to API server
readOnlyPort: 0 # Disable unauthenticated read-only port (10255)
protectKernelDefaults: true # Prevent pods from changing kernel parameters
eventRecordQPS: 50 # Ensure events are captured for detection
Kubelet Anonymous Auth
Some Kubernetes distributions ship with --anonymous-auth=true by default. This allows unauthenticated access to the kubelet API on port 10250. An attacker with network access to a node can list pods, exec into containers, and retrieve logs — all without authentication. Always verify this setting in production.
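To make that verification repeatable, the KubeletConfiguration shown above can be checked programmatically. The sketch below audits a parsed config dictionary (e.g., `/var/lib/kubelet/config.yaml` loaded with a YAML parser); the `audit_kubelet_config` helper is hypothetical, and it deliberately treats missing keys with the least-safe assumption.

```python
# SYNTHETIC — sketch: audit a parsed KubeletConfiguration for the
# hardening settings discussed above. Missing keys are treated as unsafe.
def audit_kubelet_config(cfg: dict) -> list:
    findings = []
    auth = cfg.get("authentication", {})
    if auth.get("anonymous", {}).get("enabled", True):
        findings.append("anonymous kubelet auth enabled")
    if not auth.get("webhook", {}).get("enabled", False):
        findings.append("webhook authentication disabled")
    if cfg.get("authorization", {}).get("mode") != "Webhook":
        findings.append("authorization not delegated to API server")
    # Conservative: assume the legacy read-only port unless explicitly 0
    if cfg.get("readOnlyPort", 10255) != 0:
        findings.append("unauthenticated read-only port (10255) open")
    if not cfg.get("protectKernelDefaults", False):
        findings.append("kernel defaults not protected")
    return findings
```

Running this against the hardened configuration from Section 51.1.4 returns an empty list; an empty `{}` config returns all five findings.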
51.2 Pod Security Standards¶
51.2.1 The Three Profiles¶
Pod Security Standards (PSS) replaced PodSecurityPolicy (PSP), which was deprecated in Kubernetes 1.21 and removed in 1.25. They define three progressive security profiles enforced by the built-in Pod Security Admission controller.
| Profile | Purpose | Key Restrictions | Use Case |
|---|---|---|---|
| Privileged | Unrestricted | None — all capabilities allowed | System-level infrastructure (CNI, storage drivers) |
| Baseline | Minimize known escalation | No privileged containers, no hostPID/hostNetwork, restricted volume types | General-purpose workloads with basic security |
| Restricted | Maximum hardening | Must run as non-root, drop ALL capabilities, read-only root FS, seccomp enforced | Security-sensitive workloads, multi-tenant clusters |
51.2.2 Enforcing Pod Security Standards¶
# SYNTHETIC — Namespace-level PSS enforcement
apiVersion: v1
kind: Namespace
metadata:
name: synth-production
labels:
# Enforce restricted profile — reject non-compliant pods
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/enforce-version: latest
# Warn on baseline violations (for migration)
pod-security.kubernetes.io/warn: restricted
pod-security.kubernetes.io/warn-version: latest
# Audit log violations
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/audit-version: latest
51.2.3 Compliant vs. Non-Compliant Pod Specs¶
# SYNTHETIC — Pod that passes the Restricted profile
apiVersion: v1
kind: Pod
metadata:
name: synth-secure-app
namespace: synth-production
spec:
securityContext:
runAsNonRoot: true
runAsUser: 10001
runAsGroup: 10001
fsGroup: 10001
seccompProfile:
type: RuntimeDefault
containers:
- name: app
image: synth-registry.example.com/app@sha256:abc123def456...
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop: ["ALL"]
runAsNonRoot: true
resources:
limits:
cpu: "500m"
memory: "256Mi"
requests:
cpu: "100m"
memory: "128Mi"
volumeMounts:
- name: tmp
mountPath: /tmp
volumes:
- name: tmp
emptyDir: {}
# SYNTHETIC — Pod that FAILS the Restricted profile
# Each violation is annotated
apiVersion: v1
kind: Pod
metadata:
name: synth-insecure-app
namespace: synth-production
spec:
hostPID: true # VIOLATION: hostPID not allowed
hostNetwork: true # VIOLATION: hostNetwork not allowed
containers:
- name: app
image: synth-registry.example.com/app:latest # BAD: mutable tag
securityContext:
privileged: true # VIOLATION: privileged not allowed
runAsUser: 0 # VIOLATION: must run as non-root
capabilities:
add: ["SYS_ADMIN"] # VIOLATION: cannot add capabilities
# Missing: allowPrivilegeEscalation: false
# Missing: readOnlyRootFilesystem: true
# Missing: seccompProfile
volumeMounts:
- name: host-root
mountPath: /host
volumes:
- name: host-root
hostPath: # VIOLATION: hostPath not allowed
path: /
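The annotated violations above can be expressed as a small rule set. The sketch below mimics a subset of the checks the Pod Security Admission controller applies for the Restricted profile; the `restricted_violations` helper is illustrative, not the real admission plugin, and covers only the fields shown in the two pod specs.

```python
# SYNTHETIC — sketch of a subset of Restricted-profile checks
# (illustrative; the real Pod Security Admission plugin covers more fields).
def restricted_violations(pod_spec: dict) -> list:
    violations = []
    if pod_spec.get("hostPID"):
        violations.append("hostPID not allowed")
    if pod_spec.get("hostNetwork"):
        violations.append("hostNetwork not allowed")
    for vol in pod_spec.get("volumes", []):
        if "hostPath" in vol:
            violations.append(f"hostPath volume {vol.get('name')} not allowed")
    for c in pod_spec.get("containers", []):
        sc = c.get("securityContext", {})
        name = c.get("name", "?")
        if sc.get("privileged"):
            violations.append(f"{name}: privileged not allowed")
        if sc.get("runAsUser") == 0:
            violations.append(f"{name}: must run as non-root")
        if sc.get("capabilities", {}).get("add"):
            violations.append(f"{name}: cannot add capabilities")
        if sc.get("allowPrivilegeEscalation") is not False:
            violations.append(f"{name}: allowPrivilegeEscalation must be false")
    return violations
```

Fed the `synth-insecure-app` spec, this returns seven violations; fed the `synth-secure-app` container spec, it returns none.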
51.3 RBAC Deep Dive¶
51.3.1 RBAC Object Model¶
graph LR
subgraph Subjects
U[User\nsynth-admin@example.com]
G[Group\nsynth-developers]
SA[ServiceAccount\nsynth-app-sa]
end
subgraph Bindings
RB[RoleBinding\nnamespace-scoped]
CRB[ClusterRoleBinding\ncluster-scoped]
end
subgraph Roles
R[Role\nnamespace-scoped]
CR[ClusterRole\ncluster-scoped]
end
U --> CRB
G --> RB
SA --> RB
RB --> R
RB --> CR
CRB --> CR
style U fill:#457b9d,color:#fff
style SA fill:#e9c46a,color:#000
style R fill:#2d6a4f,color:#fff
style CR fill:#e63946,color:#fff
Key principle: A RoleBinding can reference a ClusterRole but limits its scope to the binding's namespace. This is the recommended pattern — define reusable ClusterRoles, then bind them at the namespace level.
51.3.2 Dangerous RBAC Verbs¶
| Verb | Risk | Why It Matters |
|---|---|---|
| * (wildcard) | CRITICAL | Grants all verbs including escalate, bind, impersonate |
| escalate | CRITICAL | Allows creating Roles with more permissions than the creator has |
| bind | CRITICAL | Allows creating RoleBindings to any Role, including cluster-admin |
| impersonate | HIGH | Allows acting as another user, group, or service account |
| create on pods | HIGH | Combined with SA access, enables privilege escalation via pod creation |
| create on pods/exec | HIGH | Enables remote code execution in any pod in scope |
| get/list on secrets | HIGH | Reads service account tokens and application secrets |
| create on serviceaccounts/token | HIGH | Generates new tokens for any service account in scope |
| patch on nodes | MEDIUM | Can taint/label nodes to influence scheduling |
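The risk table can drive an automated first pass over a cluster's RBAC rules. The sketch below flags a few of the dangerous verb/resource pairs; the `flag_rbac_rules` helper and severity strings are illustrative assumptions, and the rule dictionaries follow the shape of `rules:` entries in Role/ClusterRole objects.

```python
# SYNTHETIC — sketch: flag dangerous verb/resource pairs in RBAC rules,
# following the risk table above (illustrative ranking, not an official API).
DANGEROUS_VERBS = {
    "*": "CRITICAL",
    "escalate": "CRITICAL",
    "bind": "CRITICAL",
    "impersonate": "HIGH",
}

def flag_rbac_rules(rules: list) -> list:
    findings = []
    for rule in rules:
        verbs = rule.get("verbs", [])
        resources = rule.get("resources", [])
        for verb in verbs:
            if verb in DANGEROUS_VERBS:
                findings.append((verb, DANGEROUS_VERBS[verb]))
        if "create" in verbs and "pods/exec" in resources:
            findings.append(("create pods/exec", "HIGH"))
        if {"get", "list"} & set(verbs) and "secrets" in resources:
            findings.append(("read secrets", "HIGH"))
    return findings
```

A wildcard rule and a secrets-reader rule both surface immediately, while a plain `get pods` rule passes clean.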
51.3.3 Service Account Hardening¶
Service accounts are the most common escalation vector because every pod gets one automatically.
# SYNTHETIC — Hardened service account configuration
apiVersion: v1
kind: ServiceAccount
metadata:
name: synth-app-sa
namespace: synth-production
automountServiceAccountToken: false # CRITICAL: do not auto-mount
---
# If the application NEEDS API access, explicitly mount with audience binding
apiVersion: v1
kind: Pod
metadata:
name: synth-app
namespace: synth-production
spec:
serviceAccountName: synth-app-sa
automountServiceAccountToken: false
containers:
- name: app
image: synth-registry.example.com/app@sha256:abc123...
volumeMounts:
- name: sa-token
mountPath: /var/run/secrets/kubernetes.io/serviceaccount
readOnly: true
volumes:
- name: sa-token
projected:
sources:
- serviceAccountToken:
path: token
expirationSeconds: 3600 # Short-lived (1 hour)
audience: synth-api-server # Audience-bound
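A projected service account token is a JWT, so its audience binding and lifetime can be verified from the payload alone. The sketch below decodes the payload without verifying the signature (signature verification is the API server's job); the `check_token` helper is hypothetical, and it assumes the `aud` claim is a list as Kubernetes emits it.

```python
# SYNTHETIC — sketch: inspect a projected service account token's claims
# to confirm it is audience-bound and short-lived. Decodes only; it does
# NOT verify the JWT signature.
import base64
import json

def jwt_payload(token: str) -> dict:
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def check_token(token: str, expected_audience: str) -> list:
    claims = jwt_payload(token)
    findings = []
    if expected_audience not in claims.get("aud", []):
        findings.append("token not bound to expected audience")
    if claims.get("exp", 0) - claims.get("iat", 0) > 3600:
        findings.append("token lifetime exceeds 1 hour")
    return findings
```

Run against the token mounted at `/var/run/secrets/kubernetes.io/serviceaccount/token`, this confirms the `expirationSeconds` and `audience` settings in the projected volume actually took effect.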
51.3.4 RBAC Audit Methodology¶
RBAC AUDIT CHECKLIST
══════════════════════════════════════════════════════
1. Identify all ClusterRoleBindings to cluster-admin:
→ Every binding must have documented justification
→ Default: only system:masters group
2. Find all subjects with wildcard (*) permissions:
→ Replace with explicit verb lists
3. Check for escalate/bind verbs:
→ Should only exist on admin-managed ClusterRoles
4. Audit default service account permissions:
→ Default SA in each namespace should have ZERO permissions
→ automountServiceAccountToken: false on default SA
5. Review cross-namespace access:
→ ClusterRoleBindings grant cluster-wide access
→ Prefer namespace-scoped RoleBindings
6. Detect unused service accounts and RoleBindings:
→ Remove stale bindings (principle of least privilege)
7. Validate no pods run with privileged service accounts:
→ Cross-reference pod specs with SA permissions
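Audit step 1 above can be scripted against the parsed output of `kubectl get clusterrolebindings -o json`. The item structure below follows the real ClusterRoleBinding API shape; the `cluster_admin_bindings` helper itself is an illustrative assumption.

```python
# SYNTHETIC — sketch for audit step 1: list every ClusterRoleBinding that
# grants cluster-admin, from parsed `kubectl get clusterrolebindings -o json`.
def cluster_admin_bindings(binding_list: dict) -> list:
    hits = []
    for item in binding_list.get("items", []):
        if item.get("roleRef", {}).get("name") == "cluster-admin":
            subjects = [
                f"{s.get('kind')}/{s.get('name')}"
                for s in item.get("subjects", [])
            ]
            hits.append((item["metadata"]["name"], subjects))
    return hits
```

Every name this returns (beyond the default `cluster-admin` binding for `system:masters`) should map to a documented justification.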
51.4 Container Escape Techniques¶
51.4.1 Escape Vector Taxonomy¶
Understanding container escapes is essential for both red team assessment and blue team detection. Each vector exploits a specific isolation boundary failure.
| Vector | Mechanism | Required Condition | ATT&CK | Severity |
|---|---|---|---|---|
| Privileged flag | Disables all Linux security features | privileged: true | T1611 | CRITICAL |
| hostPID + nsenter | Enters host PID namespace | hostPID: true + nsenter binary | T1611 | CRITICAL |
| hostPath mount | Direct host filesystem access | hostPath volume with write access | T1611 | CRITICAL |
| Docker socket | Full Docker daemon control | /var/run/docker.sock mounted | T1611 | CRITICAL |
| CAP_SYS_ADMIN | Mount namespace manipulation | Capability added to container | T1611 | HIGH |
| cgroup escape | cgroup release_agent exploitation | Write access to cgroup filesystem | T1611 | HIGH |
| Kernel exploit | Exploit shared kernel vulnerability | Vulnerable kernel, no gVisor/Kata | T1611 | HIGH |
| hostNetwork | Shared network namespace | hostNetwork: true | T1611 | MEDIUM |
| procfs write | Write to host /proc entries | /proc mounted writable | T1611 | MEDIUM |
51.4.2 Privileged Container Escape (Conceptual)¶
SYNTHETIC SCENARIO — Privileged Container Escape
═══════════════════════════════════════════════════
Cluster: synth-cluster (198.51.100.0/24)
Namespace: synth-dev
Pod: synth-debug-pod (privileged: true)
ESCAPE CONCEPT (educational — cgroup release_agent):
1. Attacker compromises application in privileged pod
2. Privileged flag disables: AppArmor, seccomp, capability restrictions
3. Attacker mounts host cgroup filesystem
4. Writes a release_agent script that executes on the host
5. Triggers cgroup notification → script runs as root on host
6. Host is fully compromised
CONCEPTUAL FLOW:
Container (root) → mount cgroup → write release_agent
→ trigger notify_on_release → execute on HOST as root
WHY PRIVILEGED IS DANGEROUS:
- Disables ALL namespace isolation
- Grants ALL Linux capabilities (36+)
- Allows device access (/dev)
- Disables seccomp and AppArmor profiles
- Container is root on the host in everything but name
DETECTION:
- Audit log: pod created with privileged=true
- Falco: unexpected mount syscall in container
- Process monitoring: nsenter, chroot, mount from container PID
51.4.3 hostPID + nsenter Escape (Conceptual)¶
# SYNTHETIC — Pod with hostPID escape vector
apiVersion: v1
kind: Pod
metadata:
name: synth-escape-demo
namespace: synth-dev
spec:
hostPID: true # Shares host PID namespace
containers:
- name: escape
image: synth-registry.example.com/debug:1.0
securityContext:
privileged: false # Even WITHOUT privileged flag
command: ["sleep", "infinity"]
# ESCAPE CONCEPT:
# With hostPID=true, the container can see ALL host processes.
# If the container has nsenter binary:
# nsenter -t 1 -m -u -i -n -- /bin/bash
# → Enters PID 1's namespaces (init process = host)
# → Full host shell access
#
# EVEN WITHOUT PRIVILEGED FLAG, hostPID enables:
# - Process enumeration of host and all containers
# - /proc/<pid>/environ reading (environment variable secrets)
# - Signal injection to host processes
# - With nsenter: full namespace escape
51.4.4 Detection Queries — Container Escapes¶
// KQL — Detect privileged pod creation in Kubernetes audit logs
// Data source: Azure Monitor / Microsoft Defender for Containers
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where Verb == "create"
| where ObjectRef_Resource == "pods"
| extend PodSpec = parse_json(RequestBody)
// NOTE: checks only the first container for brevity; use mv-apply over
// PodSpec.spec.containers to cover multi-container pods
| where PodSpec.spec.containers[0].securityContext.privileged == true
    or PodSpec.spec.hostPID == true
    or PodSpec.spec.hostNetwork == true
| project
TimeGenerated,
    SourceIPs,
Username = User_Username,
Namespace = ObjectRef_Namespace,
PodName = tostring(PodSpec.metadata.name),
Privileged = tostring(PodSpec.spec.containers[0].securityContext.privileged),
HostPID = tostring(PodSpec.spec.hostPID)
| sort by TimeGenerated desc
// SPL — Detect privileged pod creation
// Data source: Kubernetes audit logs via Splunk
index=kubernetes sourcetype="kube:apiserver:audit"
verb="create" objectRef.resource="pods"
| spath path=requestObject.spec.containers{}.securityContext.privileged output=privileged
| spath path=requestObject.spec.hostPID output=hostPID
| spath path=requestObject.spec.hostNetwork output=hostNetwork
| where privileged="true" OR hostPID="true" OR hostNetwork="true"
| table _time, user.username, objectRef.namespace,
requestObject.metadata.name, privileged, hostPID, hostNetwork
| sort -_time
// KQL — Detect runtime indicators of container escape attempts
// Data source: Microsoft Defender for Containers / Syslog
SecurityAlert
| where TimeGenerated > ago(24h)
| where AlertType has_any (
"ContainerEscape",
"PrivilegedContainer",
"SensitiveMount"
)
| union (
Syslog
| where TimeGenerated > ago(24h)
| where ProcessName == "falco"
| where SyslogMessage has_any (
"nsenter", "mount", "chroot",
"cgroup", "release_agent"
)
)
| project TimeGenerated, Computer, ProcessName,
AlertType, SyslogMessage
| sort by TimeGenerated desc
// SPL — Detect runtime indicators of container escape
// Data source: Falco alerts / syslog
index=kubernetes sourcetype="falco"
(rule="Terminal shell in container"
OR rule="Launch Privileged Container"
OR rule="Mount Launched in Privileged Container"
OR rule="Detect release_agent File Container Escapes")
| table _time, container.id, container.name, k8s.pod.name,
k8s.ns.name, proc.cmdline, rule, output
| sort -_time
51.5 Secrets Management¶
51.5.1 Native Kubernetes Secrets — Limitations¶
Kubernetes Secrets are base64-encoded by default, and base64 is an encoding, not encryption: unless encryption at rest is explicitly configured, anyone who can read etcd (or its backups) can recover every Secret with no key.
KUBERNETES SECRETS RISK MODEL
══════════════════════════════════════════════════════
Storage:
- etcd stores Secrets as base64 (NOT encryption)
- Direct etcd access → all Secrets exposed
- etcd backup files contain all Secrets in plaintext
Access:
- Any subject with 'get' on secrets can read them
- Default SA may have secret access via legacy bindings
- RBAC wildcard (*) on resources includes secrets
Transmission:
- API server → kubelet: TLS encrypted (good)
- Mounted as tmpfs in pod (good — not on disk)
- Environment variables visible in /proc/<pid>/environ (bad)
Logging:
- Secret values appear in audit logs at RequestResponse level
- Ensure audit policy excludes Secret data (Metadata level only)
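The first point in the risk model is worth demonstrating concretely: base64 round-trips with no key at all, which is why direct etcd access (or a stolen backup) exposes every Secret.

```python
# Demonstration that base64 is an encoding, not encryption: anyone who can
# read the stored bytes recovers the secret with no key.
import base64

stored = base64.b64encode(b"REDACTED-db-password")  # what etcd holds by default
recovered = base64.b64decode(stored)                # "decryption" needs no key
assert recovered == b"REDACTED-db-password"
```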
51.5.2 Encryption at Rest Configuration¶
# SYNTHETIC — EncryptionConfiguration for Kubernetes Secrets
# File: /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
- secrets
- configmaps
providers:
# Primary: AES-GCM (authenticated encryption). Note: upstream docs advise
# automated key rotation when using aesgcm; prefer a KMS provider where available.
- aesgcm:
keys:
- name: key-2026-04
secret: REDACTED # 32-byte base64-encoded key
# Fallback: allows reading old unencrypted secrets during migration
- identity: {}
# API server flag: --encryption-provider-config=/etc/kubernetes/encryption-config.yaml
#
# CRITICAL STEPS after enabling:
# 1. Restart API server with the flag
# 2. Re-encrypt all existing secrets:
# kubectl get secrets --all-namespaces -o json | kubectl replace -f -
# 3. Remove the identity{} provider after migration
# 4. Store encryption key OUTSIDE the cluster (HSM, Vault, KMS)
51.5.3 External Secrets Management¶
| Solution | Integration Method | Key Advantage | Managed K8s Support |
|---|---|---|---|
| HashiCorp Vault | CSI driver, sidecar injector, external-secrets operator | Full lifecycle management, dynamic secrets | EKS, AKS, GKE, self-managed |
| AWS Secrets Manager | ASCP (AWS Secrets & Config Provider) CSI driver | Native AWS integration, automatic rotation | EKS |
| Azure Key Vault | CSI driver (secrets-store-csi-driver) | Native Azure integration, RBAC via AAD | AKS |
| GCP Secret Manager | CSI driver, workload identity | Native GCP integration | GKE |
| External Secrets Operator | CRD-based sync from external store to K8s Secret | Cloud-agnostic, supports multiple backends | All |
| Sealed Secrets | Encrypted CRDs decrypted by in-cluster controller | GitOps-friendly (safe to commit to repo) | All |
# SYNTHETIC — External Secrets Operator example
# Syncs a secret from AWS Secrets Manager into a Kubernetes Secret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: synth-db-credentials
namespace: synth-production
spec:
refreshInterval: 1h
secretStoreRef:
name: synth-aws-secrets-manager
kind: ClusterSecretStore
target:
name: synth-db-secret
creationPolicy: Owner
data:
- secretKey: username
remoteRef:
key: synth/production/database
property: username
- secretKey: password
remoteRef:
key: synth/production/database
property: password
51.5.4 Detection Queries — Secrets Access¶
// KQL — Detect bulk secrets enumeration across namespaces
KubeAuditLogs
| where TimeGenerated > ago(1h)
| where Verb in ("list", "get")
| where ObjectRef_Resource == "secrets"
| summarize
SecretAccessCount = count(),
NamespacesAccessed = dcount(ObjectRef_Namespace),
Namespaces = make_set(ObjectRef_Namespace, 10)
by User_Username, SourceIPs, bin(TimeGenerated, 5m)
| where SecretAccessCount > 20 or NamespacesAccessed > 3
| project TimeGenerated, User_Username, SourceIPs,
SecretAccessCount, NamespacesAccessed, Namespaces
| sort by SecretAccessCount desc
// SPL — Detect bulk secrets enumeration
index=kubernetes sourcetype="kube:apiserver:audit"
verb IN ("list", "get") objectRef.resource="secrets"
| bin _time span=5m
| stats count AS secret_access_count,
dc(objectRef.namespace) AS namespaces_accessed,
values(objectRef.namespace) AS namespace_list
by _time, user.username, sourceIPs{}
| where secret_access_count > 20 OR namespaces_accessed > 3
| sort -secret_access_count
51.6 Network Policies¶
51.6.1 Default-Deny Foundation¶
By default, Kubernetes allows all pod-to-pod communication within a cluster. Network policies are the only mechanism to restrict this traffic. Without a CNI that supports NetworkPolicy (Calico, Cilium, Antrea), policies are ignored silently.
# SYNTHETIC — Default deny all ingress and egress
# Apply to every namespace as the baseline
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: synth-production
spec:
podSelector: {} # Applies to ALL pods in namespace
policyTypes:
- Ingress
- Egress
# No ingress/egress rules = deny all traffic
Silent Failure
If your CNI plugin does not support NetworkPolicy (e.g., default Flannel), applying a NetworkPolicy resource will succeed with no error but have zero effect. Always verify your CNI supports network policy enforcement. Use Calico, Cilium, or Antrea for production clusters.
51.6.2 Namespace Isolation Pattern¶
# SYNTHETIC — Allow only intra-namespace communication + DNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-same-namespace
namespace: synth-production
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector: {} # Allow from same namespace only
egress:
- to:
- podSelector: {} # Allow to same namespace only
- to: # Allow DNS resolution
- namespaceSelector: {}
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- protocol: UDP
port: 53
- protocol: TCP
port: 53
51.6.3 Microsegmentation with Cilium¶
Cilium extends NetworkPolicy with L7 (application-layer) filtering using eBPF, enabling HTTP method/path, gRPC, and Kafka topic-level policies.
# SYNTHETIC — Cilium L7 network policy
# Only allows GET requests to /api/v1/health and /api/v1/data
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
name: synth-api-l7-policy
namespace: synth-production
spec:
endpointSelector:
matchLabels:
app: synth-api
ingress:
- fromEndpoints:
- matchLabels:
app: synth-frontend
toPorts:
- ports:
- port: "8080"
protocol: TCP
rules:
http:
- method: "GET"
path: "/api/v1/health"
- method: "GET"
path: "/api/v1/data"
# POST, PUT, DELETE are implicitly denied
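The semantics of that policy are a strict allowlist over (method, path) pairs. The sketch below models those semantics in a few lines; in a real cluster the match is enforced in Cilium's datapath (eBPF plus an Envoy proxy for L7), not in application code, and the `l7_allowed` helper is purely illustrative.

```python
# SYNTHETIC — model of the allow-list semantics in the Cilium L7 policy
# above: a request matches an explicit (method, path) rule or is dropped.
ALLOWED_RULES = [
    ("GET", "/api/v1/health"),
    ("GET", "/api/v1/data"),
]

def l7_allowed(method: str, path: str) -> bool:
    """Return True only for requests matching an explicit allow rule."""
    return (method, path) in ALLOWED_RULES
```

Anything not listed, including a POST to an allowed path, is implicitly denied, which is exactly the behavior the policy's comment describes.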
51.6.4 Detection Queries — Network Policy Violations¶
// KQL — Detect network policy modifications
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where ObjectRef_Resource == "networkpolicies"
| where Verb in ("create", "update", "patch", "delete")
| project
TimeGenerated,
User_Username,
Verb,
Namespace = ObjectRef_Namespace,
PolicyName = ObjectRef_Name,
SourceIPs
| sort by TimeGenerated desc
51.7 Supply Chain Security¶
51.7.1 Container Supply Chain Threat Model¶
graph LR
A[Source Code\nDeveloper] -->|Build| B[Container Image\nCI/CD Pipeline]
B -->|Push| C[Container Registry\nsynth-registry.example.com]
C -->|Pull| D[Kubernetes Cluster\nRuntime]
A1[Compromised\nDependency] -.->|Supply Chain| B
B1[Tampered\nBuild] -.->|Integrity| B
C1[Unsigned\nImage] -.->|Verification| C
D1[Unscanned\nDeployment] -.->|Admission| D
style A1 fill:#e63946,color:#fff
style B1 fill:#e63946,color:#fff
style C1 fill:#e63946,color:#fff
style D1 fill:#e63946,color:#fff
51.7.2 Image Signing with Cosign¶
CONTAINER IMAGE SIGNING WORKFLOW
══════════════════════════════════════════════════════
Step 1: Generate signing key pair (one-time)
cosign generate-key-pair
→ cosign.key (private — store in secrets manager)
→ cosign.pub (public — distribute to clusters)
Step 2: Sign image after build + scan in CI/CD
cosign sign --key cosign.key \
synth-registry.example.com/app@sha256:abc123...
→ Signature stored as OCI artifact alongside image
Step 3: Verify signature before deployment
cosign verify --key cosign.pub \
synth-registry.example.com/app@sha256:abc123...
→ Returns 0 if valid, non-zero if tampered
Step 4: Enforce via admission controller
→ Kyverno or OPA Gatekeeper policy rejects unsigned images
→ Only images signed by trusted key can be deployed
51.7.3 Admission Controller Enforcement¶
# SYNTHETIC — Kyverno policy requiring image signatures
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: verify-image-signature
spec:
validationFailureAction: Enforce # Block non-compliant pods
background: false
rules:
- name: check-image-signature
match:
any:
- resources:
kinds:
- Pod
verifyImages:
- imageReferences:
- "synth-registry.example.com/*"
attestors:
- count: 1
entries:
- keys:
publicKeys: |-
-----BEGIN PUBLIC KEY-----
REDACTED
-----END PUBLIC KEY-----
---
# SYNTHETIC — Kyverno policy: block images from untrusted registries
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: restrict-image-registries
spec:
validationFailureAction: Enforce
rules:
- name: validate-registries
match:
any:
- resources:
kinds:
- Pod
validate:
message: "Images must come from approved registries."
pattern:
spec:
containers:
- image: "synth-registry.example.com/*"
51.7.4 SBOM and Vulnerability Scanning¶
SUPPLY CHAIN SECURITY PIPELINE
══════════════════════════════════════════════════════
Stage 1: Source Composition Analysis
→ Scan dependencies for known CVEs (Trivy, Grype, Snyk)
→ Generate SBOM (syft, trivy sbom)
→ Block builds with critical/high CVEs (fail threshold)
Stage 2: Image Build Security
→ Use minimal base images (distroless, Alpine, scratch)
→ Multi-stage builds (no build tools in runtime image)
→ Pin base images by SHA256 digest (not tag)
→ Scan built image (Trivy, Grype)
→ Attach SBOM as OCI attestation (cosign attest)
Stage 3: Registry Security
→ Private registry with authentication
→ Vulnerability scanning on push (Harbor, ECR, ACR, GAR)
→ Image retention policies (delete untagged images)
→ Registry allowlisting in admission controller
Stage 4: Runtime Verification
→ Admission controller verifies signature + attestation
→ Continuous scanning of running images (Trivy Operator)
→ Alert on newly disclosed CVEs affecting deployed images
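The "pin base images by SHA256 digest" control in Stage 2 is easy to enforce mechanically. The sketch below rejects image references that end in a mutable tag rather than an immutable digest; the `is_digest_pinned` helper is illustrative, and a CI gate or admission policy would apply the same check to every image reference in a manifest.

```python
# SYNTHETIC — sketch: reject image references that use a mutable tag
# instead of an immutable sha256 digest.
import re

DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")

def is_digest_pinned(image_ref: str) -> bool:
    """True if the image reference is pinned by sha256 digest."""
    return bool(DIGEST_RE.search(image_ref))
```

This is why the compliant pod spec in Section 51.2.3 references `app@sha256:...` while the non-compliant one uses `app:latest`.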
51.8 Runtime Security¶
51.8.1 Runtime Threat Detection Architecture¶
graph TD
subgraph Kubernetes Node
A[Container Process\nsyscall] -->|eBPF hook| B[Kernel\neBPF Programs]
B -->|Events| C[Falco / Tetragon\nUserspace Agent]
end
C -->|Alert| D[SIEM / SOAR\nDetection Pipeline]
C -->|Alert| E[Kubernetes Event\nPod annotation]
C -->|Enforce| F[Kill Process\nor Pod]
style A fill:#457b9d,color:#fff
style B fill:#264653,color:#fff
style C fill:#e63946,color:#fff
style D fill:#2d6a4f,color:#fff
51.8.2 Falco Rules for Kubernetes¶
Falco is the de facto standard for Kubernetes runtime security. It monitors syscalls via eBPF and matches them against rules.
```yaml
# SYNTHETIC — Custom Falco rules for Kubernetes security
# File: /etc/falco/rules.d/k8s-custom-rules.yaml

# Detect shell execution in a container
- rule: Shell in Container
  desc: Detect interactive shell opened in a container
  condition: >
    spawned_process and
    container and
    proc.name in (bash, sh, zsh, dash, ksh) and
    proc.tty != 0
  output: >
    Shell opened in container
    (user=%user.name command=%proc.cmdline
    container=%container.name pod=%k8s.pod.name
    namespace=%k8s.ns.name image=%container.image.repository)
  priority: WARNING
  tags: [container, shell, mitre_execution]

# Detect reading of sensitive files
- rule: Read Sensitive File in Container
  desc: Detect reads of sensitive files that may indicate credential theft
  condition: >
    open_read and
    container and
    fd.name in (/etc/shadow, /etc/passwd,
                /var/run/secrets/kubernetes.io/serviceaccount/token,
                /var/run/secrets/kubernetes.io/serviceaccount/ca.crt)
  output: >
    Sensitive file read in container
    (file=%fd.name user=%user.name command=%proc.cmdline
    container=%container.name pod=%k8s.pod.name
    namespace=%k8s.ns.name)
  priority: CRITICAL
  tags: [container, credential_access, mitre_credential_access]

# Detect crypto-mining indicators
- rule: Detect Crypto Mining Process
  desc: Detect processes commonly associated with cryptocurrency mining
  condition: >
    spawned_process and
    container and
    (proc.name in (xmrig, minerd, minergate, cpuminer) or
     proc.cmdline contains "stratum+tcp" or
     proc.cmdline contains "pool.mining" or
     proc.cmdline contains "--donate-level")
  output: >
    Crypto mining process detected
    (process=%proc.name command=%proc.cmdline
    container=%container.name pod=%k8s.pod.name
    namespace=%k8s.ns.name)
  priority: CRITICAL
  tags: [container, cryptomining, mitre_resource_hijacking]

# Detect nsenter (container escape tool)
- rule: Detect nsenter Usage
  desc: nsenter allows entering host namespaces — container escape indicator
  condition: >
    spawned_process and
    container and
    proc.name = nsenter
  output: >
    nsenter executed in container (POSSIBLE ESCAPE ATTEMPT)
    (command=%proc.cmdline container=%container.name
    pod=%k8s.pod.name namespace=%k8s.ns.name
    user=%user.name)
  priority: CRITICAL
  tags: [container, escape, mitre_privilege_escalation]
```
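To make the rule semantics concrete, here is an illustrative-only evaluation of the "Shell in Container" condition against a synthetic event. This is a teaching simplification, not Falco's actual engine, and the flattened field names are assumptions rather than Falco's event schema:

```python
# Illustrative sketch: the "Shell in Container" rule fires when an interactive
# shell (tty != 0) from a known shell list is spawned inside a container.
SHELLS = {"bash", "sh", "zsh", "dash", "ksh"}

def shell_in_container(event: dict) -> bool:
    """Simplified evaluation of the rule's four AND-ed predicates."""
    return (
        event.get("evt_type") == "execve"       # spawned_process
        and event.get("container", False)       # running inside a container
        and event.get("proc_name") in SHELLS    # proc.name in (bash, sh, ...)
        and event.get("proc_tty", 0) != 0       # interactive: tty attached
    )

interactive = {"evt_type": "execve", "container": True,
               "proc_name": "bash", "proc_tty": 34816}
batch = {"evt_type": "execve", "container": True,
         "proc_name": "sh", "proc_tty": 0}  # non-interactive: no alert

print(shell_in_container(interactive))  # True
print(shell_in_container(batch))        # False
```

The `proc.tty != 0` predicate is what keeps routine non-interactive `sh -c` invocations (health checks, init scripts) from flooding the alert queue.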
51.8.3 Cilium Tetragon — eBPF-Based Enforcement¶
Tetragon goes beyond detection to real-time enforcement — it can kill processes or send signals at the kernel level before a malicious action completes.
```yaml
# SYNTHETIC — Tetragon TracingPolicy
# Block sensitive file access in containers
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: block-sensitive-access
spec:
  kprobes:
    - call: "security_file_open"
      syscall: false
      args:
        - index: 0
          type: "file"
      selectors:
        - matchArgs:
            - index: 0
              operator: "Prefix"
              values:
                - "/etc/shadow"
                - "/etc/kubernetes/pki"
                - "/var/lib/etcd"
          matchActions:
            - action: Sigkill  # Kill the process immediately
        - matchArgs:
            - index: 0
              operator: "Prefix"
              values:
                - "/var/run/secrets/kubernetes.io/serviceaccount/token"
          matchActions:
            - action: Post  # Log but allow (for expected access)
```
51.8.4 Runtime Security Comparison¶
| Capability | Falco | Tetragon | Commercial (Aqua/Sysdig) |
|---|---|---|---|
| Syscall monitoring | Yes (eBPF/kernel module) | Yes (eBPF) | Yes |
| Process tree tracking | Limited | Full (with ancestors) | Full |
| Network monitoring | Basic (connection events) | Full (L3/L4/L7) | Full |
| File integrity | Via rules | Via TracingPolicy | Built-in FIM |
| Real-time enforcement | No (detect only) | Yes (Sigkill, Override) | Yes |
| Kubernetes context | Yes (pod, namespace, labels) | Yes (native K8s enrichment) | Yes |
| Performance overhead | Low (~1-3% CPU) | Very low (eBPF in-kernel) | Varies |
| License | Apache 2.0 | Apache 2.0 | Commercial |
51.9 etcd Security¶
51.9.1 etcd Threat Model¶
etcd is the crown jewel of a Kubernetes cluster. It contains the entire cluster state: every Secret, every ConfigMap, every RBAC binding, every pod specification.
```text
etcd SECURITY HARDENING
══════════════════════════════════════════════════════
Access Control:
  □ mTLS for all client connections (API server → etcd)
  □ mTLS for peer communication (etcd node → etcd node)
  □ Restrict etcd ports (2379/2380) via firewall to API server only
  □ No direct etcd access from worker nodes
  □ Separate etcd nodes from control plane (dedicated hardware/VMs)
Encryption:
  □ Enable Kubernetes encryption at rest (see Section 51.5.2)
  □ Use KMS provider for encryption key management
  □ Rotate encryption keys quarterly
  □ Verify encryption: etcdctl get /registry/secrets/... should be ciphertext
Backups:
  □ Automated etcd snapshots (etcdctl snapshot save)
  □ Encrypt backup files at rest
  □ Store backups in separate security domain (not on cluster nodes)
  □ Test restore procedure quarterly
  □ Retention policy: 30 days minimum
Monitoring:
  □ Monitor etcd metrics (leader changes, slow queries, disk latency)
  □ Alert on certificate expiry (30-day threshold)
  □ Alert on unauthorized connection attempts
  □ Log all etcd API calls for forensics
```
51.9.2 Detection Queries — etcd Access¶
```kusto
// KQL — Detect direct etcd access attempts bypassing the API server
// Data source: Network flow logs / firewall logs
CommonSecurityLog
| where TimeGenerated > ago(24h)
| where DestinationPort == 2379 or DestinationPort == 2380
| where SourceIP !in ("198.51.100.10")  // Synthetic: only API server IP allowed
| project
    TimeGenerated,
    SourceIP,
    DestinationIP,
    DestinationPort,
    Action = DeviceAction,
    Protocol
| sort by TimeGenerated desc
```
51.10 Kubernetes Audit Logging¶
51.10.1 Audit Policy Design¶
Kubernetes audit logging records all API server requests. The audit policy controls what is logged and at what detail level.
| Level | What Is Recorded | Use Case |
|---|---|---|
| None | Nothing | Resources you never need to audit |
| Metadata | Request metadata (user, resource, verb, timestamp) | Default for most resources |
| Request | Metadata + request body | Mutations to critical resources |
| RequestResponse | Metadata + request body + response body | Forensic-grade logging (high volume) |
```yaml
# SYNTHETIC — Kubernetes audit policy
# File: /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # CRITICAL: Log ALL authentication failures at RequestResponse
  - level: RequestResponse
    verbs: ["create"]
    resources:
      - group: "authentication.k8s.io"
        resources: ["tokenreviews"]
  # HIGH: Log secret access at Metadata only (never log secret values!)
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # HIGH: Log RBAC changes at Request level
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  # HIGH: Log pod exec and port-forward
  - level: Request
    verbs: ["create"]
    resources:
      - group: ""
        resources: ["pods/exec", "pods/portforward", "pods/attach"]
  # MEDIUM: Log workload changes at Metadata
  - level: Metadata
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "apps"
        resources: ["deployments", "daemonsets", "statefulsets"]
      - group: ""
        resources: ["pods", "services", "configmaps"]
      - group: "batch"
        resources: ["jobs", "cronjobs"]
  # LOW: Log read-only access at None (reduce volume)
  - level: None
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["events", "endpoints"]
  # CATCH-ALL: Metadata for everything else
  - level: Metadata
```
51.10.2 High-Value Audit Events¶
| Event | Verb + Resource | Threat Indicator |
|---|---|---|
| Secret enumeration | list + secrets | Credential harvesting |
| ClusterRoleBinding creation | create + clusterrolebindings | Privilege escalation / persistence |
| Pod exec | create + pods/exec | Lateral movement / RCE |
| Service account token request | create + serviceaccounts/token | Token theft |
| Admission webhook modification | update + mutatingwebhookconfigurations | Supply chain attack / persistence |
| Node proxy request | create + nodes/proxy | Kubelet API bypass |
| CronJob creation | create + cronjobs | Scheduled persistence |
| Namespace creation | create + namespaces | Shadow namespace for hiding activities |
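The shadow-namespace pattern in the last two rows is best caught by correlation: namespace creation followed shortly by a CronJob creation from the same principal. A hedged sketch of that correlation logic with a synthetic event shape (the field names and 5-minute window are assumptions for illustration, not the actual audit-log schema):

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)  # assumed correlation window, tune per environment

def shadow_cronjobs(events):
    """Return CronJob creations in a namespace the same user created within WINDOW."""
    ns_created = {}  # (user, namespace) -> namespace creation time
    hits = []
    for e in sorted(events, key=lambda ev: ev["time"]):
        if e["resource"] == "namespaces" and e["verb"] == "create":
            ns_created[(e["user"], e["name"])] = e["time"]
        elif e["resource"] == "cronjobs" and e["verb"] == "create":
            ns_time = ns_created.get((e["user"], e["namespace"]))
            if ns_time is not None and e["time"] - ns_time <= WINDOW:
                hits.append(e)
    return hits

events = [
    {"time": datetime(2024, 1, 1, 12, 0), "user": "synth-user", "verb": "create",
     "resource": "namespaces", "name": "shadow-ns"},
    {"time": datetime(2024, 1, 1, 12, 3), "user": "synth-user", "verb": "create",
     "resource": "cronjobs", "namespace": "shadow-ns", "name": "synthetic-job"},
]
print(len(shadow_cronjobs(events)))  # 1
```

In a SIEM the same join is expressed with a time-windowed correlation rule rather than a script, but the logic is identical.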
51.10.3 Detection Queries — Audit Log Analysis¶
```kusto
// KQL — Detect RBAC modifications that may indicate privilege escalation
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where ObjectRef_Resource in ("clusterrolebindings", "clusterroles",
    "rolebindings", "roles")
| where Verb in ("create", "update", "patch")
| extend RequestBody = parse_json(RequestBody)
| extend
    RoleRef = tostring(RequestBody.roleRef.name),
    SubjectKind = tostring(RequestBody.subjects[0].kind),
    SubjectName = tostring(RequestBody.subjects[0].name)
| where RoleRef has "admin" or RoleRef has "cluster-admin"
    or RoleRef has "edit"
| project
    TimeGenerated,
    User_Username,
    Verb,
    Resource = ObjectRef_Resource,
    BindingName = ObjectRef_Name,
    RoleRef,
    SubjectKind,
    SubjectName,
    SourceIPs
| sort by TimeGenerated desc
```
```spl
// SPL — Detect RBAC modifications indicating privilege escalation
index=kubernetes sourcetype="kube:apiserver:audit"
  objectRef.resource IN ("clusterrolebindings", "clusterroles",
    "rolebindings", "roles")
  verb IN ("create", "update", "patch")
| spath path=requestObject.roleRef.name output=roleRef
| spath path=requestObject.subjects{}.name output=subjectName
| spath path=requestObject.subjects{}.kind output=subjectKind
| search roleRef IN ("cluster-admin", "admin", "edit")
| table _time, user.username, verb, objectRef.resource,
    objectRef.name, roleRef, subjectName, subjectKind, sourceIPs{}
| sort -_time
```
```kusto
// KQL — Detect pod exec activity (potential lateral movement)
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where ObjectRef_Resource == "pods" and ObjectRef_Subresource == "exec"
| where Verb == "create"
| summarize
    ExecCount = count(),
    TargetPods = make_set(ObjectRef_Name, 20),
    Namespaces = make_set(ObjectRef_Namespace, 10)
    by User_Username, SourceIPs, bin(TimeGenerated, 15m)
| where ExecCount > 5
| project TimeGenerated, User_Username, SourceIPs,
    ExecCount, TargetPods, Namespaces
| sort by ExecCount desc
```
```spl
// SPL — Detect excessive pod exec activity
index=kubernetes sourcetype="kube:apiserver:audit"
  objectRef.resource="pods" objectRef.subresource="exec"
  verb="create"
| bin _time span=15m
| stats count AS exec_count,
    values(objectRef.name) AS target_pods,
    dc(objectRef.namespace) AS namespace_count
    by _time, user.username, sourceIPs{}
| where exec_count > 5
| sort -exec_count
```
```kusto
// KQL — Detect CronJob creation that may indicate persistence
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where ObjectRef_Resource == "cronjobs"
| where Verb == "create"
| extend RequestBody = parse_json(RequestBody)
| extend
    Schedule = tostring(RequestBody.spec.schedule),
    Image = tostring(RequestBody.spec.jobTemplate.spec.template.spec.containers[0].image),
    Command = tostring(RequestBody.spec.jobTemplate.spec.template.spec.containers[0].command)
| project
    TimeGenerated,
    User_Username,
    Namespace = ObjectRef_Namespace,
    CronJobName = ObjectRef_Name,
    Schedule,
    Image,
    Command,
    SourceIPs
| sort by TimeGenerated desc
```
```spl
// SPL — Detect CronJob creation for persistence
index=kubernetes sourcetype="kube:apiserver:audit"
  objectRef.resource="cronjobs" verb="create"
| spath path=requestObject.spec.schedule output=schedule
| spath path=requestObject.spec.jobTemplate.spec.template.spec.containers{}.image output=image
| spath path=requestObject.spec.jobTemplate.spec.template.spec.containers{}.command output=command
| table _time, user.username, objectRef.namespace,
    objectRef.name, schedule, image, command, sourceIPs{}
| sort -_time
```
51.11 Kubernetes Hardening Summary¶
```text
KUBERNETES SECURITY MATURITY MODEL
══════════════════════════════════════════════════════════════════
Level 1 — FOUNDATIONAL (must have before production)
  □ RBAC enabled (no ABAC, no AlwaysAllow)
  □ Default service account tokens disabled
  □ Pod Security Standards: Baseline enforced
  □ Network policies: default-deny in sensitive namespaces
  □ etcd encrypted in transit (mTLS)
  □ Kubelet anonymous auth disabled
  □ Audit logging enabled (Metadata level minimum)
  □ Image pull from private registries only
Level 2 — HARDENED (production-grade)
  □ Pod Security Standards: Restricted enforced
  □ Encryption at rest for Secrets (KMS-backed)
  □ External secrets management (Vault, cloud KMS)
  □ Network policies: default-deny all namespaces + egress filtering
  □ Admission controllers: image allowlisting, signature verification
  □ RBAC: namespace-scoped Roles only, no wildcard permissions
  □ Audit logging: Request level for mutations
  □ Runtime security: Falco or Tetragon deployed
  □ CIS Kubernetes Benchmark: 90%+ compliance
Level 3 — ADVANCED (security-first organizations)
  □ eBPF-based network policy (Cilium) with L7 filtering
  □ Workload identity (no static credentials in pods)
  □ SBOM generation and attestation for all images
  □ Tetragon enforcement policies (kill malicious processes)
  □ Zero-trust service mesh (mTLS pod-to-pod)
  □ Automated RBAC review and unused permission pruning
  □ Multi-cluster security posture management
  □ Chaos engineering for security (GameDay exercises)
  □ CIS Kubernetes Benchmark: 100% compliance
```
Hands-On Exercises¶
Practical Application
The following exercises provide hands-on practice with the concepts covered in this chapter:
- Lab 27: Kubernetes Security Assessment — Hands-on cluster security assessment using kube-bench, RBAC audit, and network policy testing (planned)
- SC-085: Kubernetes Cluster Compromise via RBAC Misconfiguration — Full attack scenario from initial pod access to cluster-admin escalation (planned)
- PT-195: Pod Security Standards Bypass Testing — Purple team exercise validating PSS enforcement across namespaces (planned)
- PT-201: Kubernetes Secrets Extraction and Detection — Red team secrets enumeration with blue team detection validation (planned)
- PT-207: Container Escape Detection and Response — End-to-end container escape simulation with runtime detection (planned)
Each exercise includes both offensive (red team) and defensive (blue team) components, aligned with the purple team methodology used throughout Nexus SecOps. See the Purple Team Exercise Library for the full catalog.
Exam Prep & Certifications¶
Relevant Certifications
The topics in this chapter align with the following certifications:
- CKS (Certified Kubernetes Security Specialist) — Direct alignment: cluster hardening, system hardening, supply chain security, monitoring/logging/runtime security, microservice vulnerabilities
- CKA (Certified Kubernetes Administrator) — Domains: Security (RBAC, network policies, secrets), cluster architecture
- CKAD (Certified Kubernetes Application Developer) — Domains: Pod security context, resource limits, service accounts
- AWS Certified Security - Specialty — Domains: EKS security, IAM roles for service accounts (IRSA)
- AZ-500 (Azure Security Engineer Associate) — Domains: AKS security, Azure Policy for Kubernetes, Defender for Containers
- GIAC GCSA (Cloud Security Automation) — Domains: Container security, Kubernetes hardening, DevSecOps pipeline security
- CompTIA Cloud+ — Domains: Container orchestration security, cloud workload protection
Review Questions¶
1. Explain why base64 encoding of Kubernetes Secrets does not constitute encryption, and describe the defense-in-depth approach to secrets protection.
Base64 is an encoding scheme, not an encryption algorithm — it is trivially reversible with no key required (echo '<value>' | base64 -d). Kubernetes uses base64 encoding purely for safe transport of binary data in YAML/JSON, not for confidentiality. Defense-in-depth for secrets includes: (1) enabling encryption at rest via EncryptionConfiguration with AES-GCM or KMS provider, so secrets are encrypted before being written to etcd; (2) restricting RBAC access to secrets using namespace-scoped Roles with explicit resource names; (3) using external secrets managers (Vault, AWS Secrets Manager) with dynamic, short-lived credentials; (4) disabling automountServiceAccountToken on service accounts that don't need API access; (5) setting audit policy to Metadata level for secrets to avoid logging secret values.
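The claim above is directly demonstrable: base64 round-trips with no key or secret involved, so "encoded" never means "protected". A minimal demonstration (the value is a synthetic placeholder):

```python
import base64

# Base64 is reversible with no key — the Python equivalent of `base64 -d`.
secret = b"REDACTED-synthetic-password"  # synthetic placeholder value
encoded = base64.b64encode(secret)       # what `kubectl get secret -o yaml` shows
decoded = base64.b64decode(encoded)      # anyone with read access recovers plaintext
print(decoded == secret)  # True
```

This is why RBAC `get`/`list` access to Secrets is equivalent to plaintext access, and why encryption at rest must happen below the API layer.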
2. A pod specification sets hostPID: true but does not set privileged: true. Explain the security implications and possible escape vectors.
With hostPID: true, the container shares the host's PID namespace, meaning it can see all processes running on the host node and in every other container on that node. Even without privileged: true, this enables: (1) reading environment variables of host processes via /proc/<pid>/environ, which may contain secrets, API keys, or database passwords; (2) sending signals to host processes (if the container runs as root); (3) using nsenter -t 1 -m -u -i -n to enter the mount, UTS, IPC, and network namespaces of PID 1 (the host init process), effectively escaping the container entirely. The nsenter escape requires the binary to be present in the container and typically root access within the container. Mitigation: never allow hostPID: true outside of system namespaces; enforce via Pod Security Standards (Baseline profile blocks hostPID).
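The `/proc/<pid>/environ` read described above can be sketched safely against the current process. This is a Linux-only illustration; with `hostPID: true` an attacker would iterate over other PIDs visible in the shared namespace instead of their own:

```python
import os

# Minimal sketch of the /proc/<pid>/environ technique. Entries are
# NUL-separated KEY=VALUE strings captured at exec time. Reading another
# process's environ requires matching UID or root — which is why a root
# container with hostPID: true is dangerous.
def read_environ(pid):
    with open(f"/proc/{pid}/environ", "rb") as f:
        raw = f.read()
    return dict(
        entry.split("=", 1)
        for entry in raw.decode(errors="replace").split("\0")
        if "=" in entry
    )

env = read_environ(os.getpid())  # safe demo: our own process only
print(type(env).__name__)  # dict
```

Any process on the host that was started with a database password or API key in its environment leaks it through this interface.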
3. Describe the difference between a Kubernetes Role and a ClusterRole, and explain why a RoleBinding referencing a ClusterRole is the recommended pattern.
A Role grants permissions within a single namespace. A ClusterRole grants permissions cluster-wide or on cluster-scoped resources (nodes, PVs, namespaces). The recommended pattern is to define reusable ClusterRoles (e.g., pod-reader, deployment-manager) and then create namespace-scoped RoleBindings that reference those ClusterRoles. This limits the ClusterRole's effective scope to the binding's namespace. Benefits: (1) reusability — one ClusterRole definition, many namespace-scoped bindings; (2) least privilege — the binding constrains what the ClusterRole can access; (3) auditability — RoleBindings are easier to enumerate than scattered Role definitions. The anti-pattern is using ClusterRoleBindings, which grant the ClusterRole's permissions across the entire cluster.
4. Why do Kubernetes NetworkPolicies fail silently when the CNI does not support them, and how should an operator verify enforcement?
Kubernetes treats NetworkPolicy as a standard API resource — the API server accepts and stores the resource regardless of whether the underlying CNI plugin implements it. The API server has no awareness of CNI capabilities. If the CNI (e.g., default Flannel without the network-policy plugin) does not support NetworkPolicy, the resource exists in etcd but has zero runtime effect — all traffic remains allowed. To verify enforcement: (1) deploy a test pod and attempt to reach a pod that should be blocked by the policy — if traffic succeeds, the CNI is not enforcing; (2) check the CNI documentation for NetworkPolicy support; (3) use kubectl get pods -n kube-system to verify the CNI controller pods are running (e.g., calico-node, cilium-agent); (4) run a tool like netassert or cyclonus for automated network policy testing.
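Verification step (1) can be scripted as a simple TCP reachability probe. A hedged sketch (run it from a test pod against a pod the policy should block; the demo below probes a local listener so the example stays self-contained):

```python
import socket

# Probe whether a TCP endpoint that a NetworkPolicy should block is actually
# reachable. From a test pod, a successful connect to a "blocked" pod IP means
# the CNI is not enforcing the policy.
def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Self-contained demo: a local listener stands in for the target pod.
server = socket.socket()
server.bind(("127.0.0.1", 0))   # OS picks a free port
server.listen(1)
port = server.getsockname()[1]
print(can_connect("127.0.0.1", port))  # True — endpoint reachable
server.close()
```

Tools like netassert and cyclonus automate exactly this kind of expected-allow/expected-deny matrix across all namespace pairs.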
5. Compare Falco and Tetragon for Kubernetes runtime security. When would you choose one over the other?
Falco is a detect-only tool: it monitors syscalls via eBPF (or kernel module), matches them against rules, and generates alerts. It has a large community, extensive default ruleset, and integrates well with SIEM systems. Tetragon (by Cilium/Isovalent) uses eBPF for both detection and enforcement — it can kill processes (Sigkill) or override return values at the kernel level before a malicious action completes. Choose Falco when you need broad detection coverage with a mature ecosystem, are sending all alerts to a SIEM for correlation, or need to start quickly with default rules. Choose Tetragon when you need real-time enforcement (block an exploit before it succeeds), require deep process ancestry tracking, need L7 network visibility, or are already using Cilium as your CNI. In mature environments, both can be deployed together: Tetragon for enforcement on critical paths, Falco for broad detection and SIEM integration.
6. An attacker creates a CronJob in a shadow namespace to maintain persistence. What audit log events would this generate, and how would you detect it?
The attacker's actions generate at least two audit events: (1) create on namespaces (creating the shadow namespace); (2) create on cronjobs in the new namespace. The audit policy should log namespace creation at Request level and CronJob creation at Request level. Detection approach: alert on namespace creation by non-infrastructure service accounts; alert on CronJob creation in newly-created namespaces; correlate the two events within a short time window (e.g., namespace creation followed by CronJob creation within 5 minutes by the same user); monitor for CronJobs with images from external/untrusted registries; use Falco to detect unexpected processes spawned by CronJob-created pods at runtime.
7. Explain how a mutating admission webhook could be abused for persistence, and describe the detection strategy.
A mutating admission webhook intercepts API requests before they are persisted to etcd and can modify the request object. An attacker with create or update permissions on mutatingwebhookconfigurations can register a webhook that injects a sidecar container, environment variable, or volume mount into every new pod. This sidecar could exfiltrate data, establish a reverse shell, or mine cryptocurrency. Because the mutation happens transparently at the API level, pod creators may not notice the injected content. Detection: (1) audit log monitoring for create/update on mutatingwebhookconfigurations and validatingwebhookconfigurations; (2) alert on webhook endpoints pointing to external URLs or non-system namespaces; (3) periodic comparison of deployed pod specs against their source Deployment/StatefulSet specs to detect unexpected mutations; (4) admission controller that restricts who can modify webhook configurations.
Key Takeaways¶
Chapter Summary
- The API server is the single most critical component — secure it with OIDC authentication, RBAC authorization, TLS 1.2+, audit logging, and network restriction. Every cluster operation flows through the API server.
- Pod Security Standards replace PodSecurityPolicy — enforce the Restricted profile for all production workloads. Use namespace-level labels for enforcement, warning, and auditing.
- RBAC misconfiguration is the leading Kubernetes escalation vector — use namespace-scoped Roles, disable default SA tokens, never grant wildcard (`*`) verbs, and audit the `escalate`, `bind`, and `impersonate` verbs.
- Container escapes require specific pod security context settings — privileged, hostPID, hostPath, and CAP_SYS_ADMIN are the primary vectors. Pod Security Standards block all of them at the Restricted level.
- Native Kubernetes Secrets are not encrypted by default — enable encryption at rest with AES-GCM or KMS, and migrate to external secrets managers for production credentials.
- Network policies are not enforced without a supporting CNI — deploy Calico, Cilium, or Antrea, implement default-deny in every namespace, and verify enforcement with testing tools.
- Supply chain security requires signing, scanning, and admission control — sign images with Cosign, scan with Trivy, generate SBOMs, and enforce signature verification via Kyverno or OPA Gatekeeper.
- Runtime security closes the detection gap — deploy Falco for broad syscall monitoring and Tetragon for kernel-level enforcement. Both use eBPF for minimal performance impact.
- etcd is the crown jewel — protect it with mTLS, network isolation, encryption at rest, and encrypted backups stored outside the cluster.
- Audit logging is the foundation of Kubernetes detection engineering — design a tiered audit policy that balances visibility with performance, and forward logs to your SIEM for correlation with KQL/SPL queries.
Cross-References¶
| Topic | Chapter | Relevance |
|---|---|---|
| Cloud security fundamentals | Ch 20: Cloud Attack & Defense | Shared responsibility model, cloud IAM foundations |
| Container red teaming | Ch 46: Cloud & Container Red Teaming | Container escape techniques, cloud attack lifecycle |
| CI/CD pipeline security | Ch 35: DevSecOps Pipeline | Image scanning integration, admission controller CI/CD |
| Detection engineering | Ch 5: Detection Engineering at Scale | KQL/SPL query methodology, alert tuning |
| Network security architecture | Ch 31: Network Security Architecture | Network policy verification, flow log analysis |
| SOC foundations | Ch 1: Introduction | K8s incident response procedures, SOC workflow |