Chapter 51: Kubernetes Security — From Pod to Cluster

Overview

Kubernetes has become the dominant container orchestration platform, powering everything from startup microservices to nation-scale infrastructure. With that ubiquity comes an expansive attack surface: a misconfigured RBAC binding, an unencrypted etcd store, or a privileged pod can turn a single container compromise into full cluster takeover in minutes. This chapter provides a comprehensive, defense-first treatment of Kubernetes security. We move systematically from the architecture layer (control plane, kubelet, etcd, API server) through workload isolation (Pod Security Standards, network policies) to operational security (secrets management, supply chain integrity, runtime detection, audit logging). Every attack technique is paired with KQL and SPL detection queries, mapped to MITRE ATT&CK, and grounded in real-world defensive practice. This is the chapter you reference when you need to secure — or assess — a Kubernetes cluster end to end.

Educational Content Only

All techniques in this chapter are presented for defensive understanding only. All cluster names, IP addresses, namespaces, service accounts, and scenarios are 100% synthetic. IP addresses use RFC 5737 (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24) and RFC 1918 ranges. Credentials shown are placeholders (REDACTED). Never execute offensive techniques without explicit written authorization against clusters you own or have written permission to test.

Learning Objectives

By the end of this chapter, students SHALL be able to:

  1. Describe the Kubernetes architecture security model and identify the trust boundaries between control plane components (Knowledge)
  2. Differentiate between Pod Security Standards profiles (Privileged, Baseline, Restricted) and select the appropriate profile for a given workload (Analysis)
  3. Design least-privilege RBAC policies using namespace-scoped Roles, targeted ClusterRoles, and service account hardening (Synthesis)
  4. Evaluate container escape risk by mapping pod security context settings to known escape vectors (Evaluation)
  5. Implement Kubernetes-native and external secrets management strategies with encryption at rest (Application)
  6. Construct network policies that enforce default-deny, namespace isolation, and egress filtering (Synthesis)
  7. Assess container supply chain security posture using image signing, admission controllers, and SBOM verification (Evaluation)
  8. Develop KQL and SPL detection queries for Kubernetes attack techniques using audit log telemetry (Synthesis)
  9. Apply runtime security tools (Falco, Tetragon) to detect container escapes, crypto-mining, and anomalous process execution (Application)
  10. Create a Kubernetes audit logging policy that balances security visibility with cluster performance (Synthesis)

Prerequisites


Why This Matters

In 2024, over 75% of organizations running Kubernetes reported at least one serious security incident tied to misconfiguration (Red Hat State of Kubernetes Security Report). The Kubernetes API server is the single largest attack surface in modern infrastructure — it is internet-facing by default in managed offerings, processes every cluster mutation, and holds the keys to every secret in the cluster. A single overly permissive ClusterRoleBinding or an unpatched kubelet can cascade into complete infrastructure compromise. Kubernetes security is not optional — it is existential.


MITRE ATT&CK Kubernetes Mapping

Technique ID | Technique Name                                  | Kubernetes Context                             | Tactic
-------------|--------------------------------------------------|------------------------------------------------|-------------------------------
T1078.001    | Valid Accounts: Default Accounts                 | Default service account token abuse            | Initial Access (TA0001)
T1552.007    | Unsecured Credentials: Container API             | Kubelet API credential exposure                | Credential Access (TA0006)
T1609        | Container Administration Command                 | kubectl exec into pods                         | Execution (TA0002)
T1610        | Deploy Container                                 | Malicious pod deployment via RBAC abuse        | Execution (TA0002)
T1611        | Escape to Host                                   | Privileged container, hostPID, nsenter         | Privilege Escalation (TA0004)
T1613        | Container and Resource Discovery                 | API server enumeration of pods, secrets, nodes | Discovery (TA0007)
T1053.007    | Scheduled Task/Job: Container Orchestration Job  | CronJob for persistence                        | Persistence (TA0003)
T1071.001    | Application Layer Protocol: Web Protocols        | API server communication over HTTPS            | Command & Control (TA0011)
T1046        | Network Service Discovery                        | Service/endpoint enumeration within cluster    | Discovery (TA0007)
T1070.004    | Indicator Removal: File Deletion                 | Ephemeral pod destruction to remove evidence   | Defense Evasion (TA0005)
T1098        | Account Manipulation                             | ClusterRoleBinding creation for persistence    | Persistence (TA0003)
T1574        | Hijack Execution Flow                            | Mutating webhook injection                     | Persistence (TA0003)

51.1 Kubernetes Architecture Security Model

51.1.1 Control Plane Components and Trust Boundaries

Understanding the architecture is the prerequisite for understanding the attack surface. Every component in the Kubernetes control plane has distinct trust boundaries and failure modes.

graph TD
    subgraph Control Plane - 198.51.100.0/24
        API[kube-apiserver\n198.51.100.10:6443]
        ETCD[etcd\n198.51.100.20:2379]
        SCHED[kube-scheduler\n198.51.100.10]
        CM[kube-controller-manager\n198.51.100.10]
    end

    subgraph Worker Node 1 - 198.51.100.101
        KL1[kubelet\n:10250]
        KP1[kube-proxy]
        POD1[Pod A\nsynth-app]
        POD2[Pod B\nsynth-api]
    end

    subgraph Worker Node 2 - 198.51.100.102
        KL2[kubelet\n:10250]
        KP2[kube-proxy]
        POD3[Pod C\nsynth-worker]
    end

    API -->|mTLS| ETCD
    API -->|mTLS| KL1
    API -->|mTLS| KL2
    SCHED -->|mTLS| API
    CM -->|mTLS| API
    POD1 -->|ServiceAccount Token| API
    POD3 -->|ServiceAccount Token| API

    style API fill:#e63946,color:#fff
    style ETCD fill:#780000,color:#fff
    style KL1 fill:#457b9d,color:#fff
    style KL2 fill:#457b9d,color:#fff

51.1.2 Component Security Properties

Component               | Port  | Authentication                                      | Authorization                                    | Key Risk
------------------------|-------|-----------------------------------------------------|--------------------------------------------------|----------
kube-apiserver          | 6443  | x509 client certs, OIDC tokens, service account JWTs | RBAC, ABAC, Webhook                              | Internet exposure; single point of failure for all cluster operations
etcd                    | 2379  | mTLS (peer + client certs)                          | etcd RBAC (rarely used; API server is sole client) | Stores all cluster state including Secrets in plaintext by default
kubelet                 | 10250 | x509 client certs, bearer tokens                    | Webhook (delegates to API server)                | --anonymous-auth=true is the default in some distributions
kube-scheduler          | 10259 | Localhost only                                      | N/A                                              | Compromise allows malicious pod placement decisions
kube-controller-manager | 10257 | Localhost only                                      | N/A                                              | Holds credentials for all controllers (node, SA token, etc.)
kube-proxy              | 10256 | Localhost only (healthz)                            | N/A                                              | Manipulating iptables/IPVS rules enables traffic interception
CoreDNS                 | 53    | None (cluster-internal)                             | N/A                                              | DNS poisoning enables service impersonation

51.1.3 API Server Hardening

The API server is the central nervous system of Kubernetes. Every kubectl command, every controller reconciliation loop, every pod scheduling decision flows through it.

API SERVER HARDENING CHECKLIST
══════════════════════════════════════════════════════
Authentication:
  □ Disable anonymous authentication (--anonymous-auth=false)
  □ Use OIDC for human users (integrate with IdP)
  □ Use x509 client certificates for system components
  □ Disable static token files (--token-auth-file)
  □ Disable basic auth (--basic-auth-file) — deprecated but still exists

Authorization:
  □ Use RBAC mode exclusively (--authorization-mode=RBAC,Node)
  □ Never use ABAC in production (no audit trail)
  □ Disable AlwaysAllow (catastrophic misconfiguration)

Network:
  □ Restrict API server to private network or IP allowlist
  □ Use a load balancer with TLS termination and WAF
  □ Enable audit logging (see Section 51.10)
  □ Rate-limit API requests to prevent DoS

TLS:
  □ Minimum TLS 1.2 (--tls-min-version=VersionTLS12)
  □ Strong cipher suites only (--tls-cipher-suites)
  □ Rotate API server certificates before expiry
  □ Use separate CA for etcd communication
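
The checklist items above map directly to API server flags. A minimal sketch of the relevant portion of a static pod manifest follows; the file path, issuer URL, and cipher selection are illustrative, not prescriptive.

```yaml
# SYNTHETIC — excerpt of /etc/kubernetes/manifests/kube-apiserver.yaml
# showing how the hardening checklist maps to flags (values illustrative)
spec:
  containers:
  - command:
    - kube-apiserver
    - --anonymous-auth=false
    - --authorization-mode=RBAC,Node
    - --oidc-issuer-url=https://idp.example.com
    - --oidc-client-id=synth-kubernetes
    - --tls-min-version=VersionTLS12
    - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    - --audit-log-path=/var/log/kubernetes/audit.log
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --encryption-provider-config=/etc/kubernetes/encryption-config.yaml
```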

51.1.4 Kubelet Security

# SYNTHETIC — kubelet configuration hardening
# File: /var/lib/kubelet/config.yaml on each node
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false          # CRITICAL: disable anonymous access
  webhook:
    enabled: true           # Delegate auth to API server
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook             # Delegate authz to API server
readOnlyPort: 0             # Disable unauthenticated read-only port (10255)
protectKernelDefaults: true  # Prevent pods from changing kernel parameters
eventRecordQPS: 50          # Ensure events are captured for detection

Kubelet Anonymous Auth

Some Kubernetes distributions ship with --anonymous-auth=true by default. This allows unauthenticated access to the kubelet API on port 10250. An attacker with network access to a node can list pods, exec into containers, and retrieve logs — all without authentication. Always verify this setting in production.
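
One quick verification is to inspect the rendered kubelet configuration on each node. The sketch below uses a synthetic file in /tmp as a stand-in for /var/lib/kubelet/config.yaml, so it runs anywhere; on a real node you would grep the actual file.

```shell
# SYNTHETIC — verify anonymous kubelet auth is disabled.
# /tmp/kubelet-config.yaml stands in for /var/lib/kubelet/config.yaml.
cat > /tmp/kubelet-config.yaml <<'EOF'
authentication:
  anonymous:
    enabled: false
EOF

# Exits 0 (secure) only when anonymous auth is explicitly disabled
grep -A1 'anonymous:' /tmp/kubelet-config.yaml | grep -q 'enabled: false' \
  && echo "anonymous auth disabled"
```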


51.2 Pod Security Standards

51.2.1 The Three Profiles

Pod Security Standards (PSS) replaced PodSecurityPolicy (PSP), which was deprecated in Kubernetes 1.21 and removed in 1.25. They define three progressive security profiles enforced by the built-in Pod Security Admission controller.

Profile    | Purpose                   | Key Restrictions                                                          | Use Case
-----------|---------------------------|---------------------------------------------------------------------------|----------
Privileged | Unrestricted              | None (all capabilities allowed)                                           | System-level infrastructure (CNI, storage drivers)
Baseline   | Minimize known escalation | No privileged containers, no hostPID/hostNetwork, restricted volume types | General-purpose workloads with basic security
Restricted | Maximum hardening         | Must run as non-root, drop ALL capabilities, read-only root FS, seccomp enforced | Security-sensitive workloads, multi-tenant clusters

51.2.2 Enforcing Pod Security Standards

# SYNTHETIC — Namespace-level PSS enforcement
apiVersion: v1
kind: Namespace
metadata:
  name: synth-production
  labels:
    # Enforce restricted profile — reject non-compliant pods
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Warn on restricted violations (useful during migration)
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    # Audit log violations
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest

51.2.3 Compliant vs. Non-Compliant Pod Specs

# SYNTHETIC — Pod that passes the Restricted profile
apiVersion: v1
kind: Pod
metadata:
  name: synth-secure-app
  namespace: synth-production
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    runAsGroup: 10001
    fsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: synth-registry.example.com/app@sha256:abc123def456...
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
      runAsNonRoot: true
    resources:
      limits:
        cpu: "500m"
        memory: "256Mi"
      requests:
        cpu: "100m"
        memory: "128Mi"
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}

# SYNTHETIC — Pod that FAILS the Restricted profile
# Each violation is annotated
apiVersion: v1
kind: Pod
metadata:
  name: synth-insecure-app
  namespace: synth-production
spec:
  hostPID: true                    # VIOLATION: hostPID not allowed
  hostNetwork: true                # VIOLATION: hostNetwork not allowed
  containers:
  - name: app
    image: synth-registry.example.com/app:latest  # BAD: mutable tag
    securityContext:
      privileged: true             # VIOLATION: privileged not allowed
      runAsUser: 0                 # VIOLATION: must run as non-root
      capabilities:
        add: ["SYS_ADMIN"]        # VIOLATION: cannot add capabilities
      # Missing: allowPrivilegeEscalation: false
      # Missing: readOnlyRootFilesystem: true
      # Missing: seccompProfile
    volumeMounts:
    - name: host-root
      mountPath: /host
  volumes:
  - name: host-root
    hostPath:                      # VIOLATION: hostPath not allowed
      path: /

51.3 RBAC Deep Dive

51.3.1 RBAC Object Model

graph LR
    subgraph Subjects
        U[User\nsynth-admin@example.com]
        G[Group\nsynth-developers]
        SA[ServiceAccount\nsynth-app-sa]
    end

    subgraph Bindings
        RB[RoleBinding\nnamespace-scoped]
        CRB[ClusterRoleBinding\ncluster-scoped]
    end

    subgraph Roles
        R[Role\nnamespace-scoped]
        CR[ClusterRole\ncluster-scoped]
    end

    U --> CRB
    G --> RB
    SA --> RB
    RB --> R
    RB --> CR
    CRB --> CR

    style U fill:#457b9d,color:#fff
    style SA fill:#e9c46a,color:#000
    style R fill:#2d6a4f,color:#fff
    style CR fill:#e63946,color:#fff

Key principle: A RoleBinding can reference a ClusterRole but limits its scope to the binding's namespace. This is the recommended pattern — define reusable ClusterRoles, then bind them at the namespace level.
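
The recommended pattern reads as follows; all names are synthetic.

```yaml
# SYNTHETIC — reusable ClusterRole bound at namespace scope
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: synth-pod-reader            # defined once, reusable in any namespace
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding                    # namespace-scoped binding
metadata:
  name: synth-pod-reader-binding
  namespace: synth-production        # grants access in this namespace ONLY
subjects:
- kind: ServiceAccount
  name: synth-app-sa
  namespace: synth-production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: synth-pod-reader
```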

51.3.2 Dangerous RBAC Verbs

Verb                             | Risk     | Why It Matters
---------------------------------|----------|---------------
* (wildcard)                     | CRITICAL | Grants all verbs including escalate, bind, impersonate
escalate                         | CRITICAL | Allows creating Roles with more permissions than the creator has
bind                             | CRITICAL | Allows creating RoleBindings to any Role, including cluster-admin
impersonate                      | HIGH     | Allows acting as another user, group, or service account
create on pods                   | HIGH     | Combined with SA access, enables privilege escalation via pod creation
create on pods/exec              | HIGH     | Enables remote code execution in any pod in scope
get/list on secrets              | HIGH     | Reads service account tokens and application secrets
create on serviceaccounts/token  | HIGH     | Generates new tokens for any service account in scope
patch on nodes                   | MEDIUM   | Can taint/label nodes to influence scheduling
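
As a negative example, a Role like the following (synthetic) combines several of the verbs above and should be rejected in review:

```yaml
# SYNTHETIC ANTI-PATTERN — do NOT deploy.
# Wildcards grant every verb on every resource in the namespace,
# implicitly including get on secrets and create on pods/exec.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: synth-overbroad-role
  namespace: synth-dev
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
```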

51.3.3 Service Account Hardening

Service accounts are the most common escalation vector because every pod gets one automatically.

# SYNTHETIC — Hardened service account configuration
apiVersion: v1
kind: ServiceAccount
metadata:
  name: synth-app-sa
  namespace: synth-production
automountServiceAccountToken: false   # CRITICAL: do not auto-mount

---
# If the application NEEDS API access, explicitly mount with audience binding
apiVersion: v1
kind: Pod
metadata:
  name: synth-app
  namespace: synth-production
spec:
  serviceAccountName: synth-app-sa
  automountServiceAccountToken: false
  containers:
  - name: app
    image: synth-registry.example.com/app@sha256:abc123...
    volumeMounts:
    - name: sa-token
      mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      readOnly: true
  volumes:
  - name: sa-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          expirationSeconds: 3600     # Short-lived (1 hour)
          audience: synth-api-server  # Audience-bound

51.3.4 RBAC Audit Methodology

RBAC AUDIT CHECKLIST
══════════════════════════════════════════════════════
1. Identify all ClusterRoleBindings to cluster-admin:
   → Every binding must have documented justification
   → Default: only system:masters group

2. Find all subjects with wildcard (*) permissions:
   → Replace with explicit verb lists

3. Check for escalate/bind verbs:
   → Should only exist on admin-managed ClusterRoles

4. Audit default service account permissions:
   → Default SA in each namespace should have ZERO permissions
   → automountServiceAccountToken: false on default SA

5. Review cross-namespace access:
   → ClusterRoleBindings grant cluster-wide access
   → Prefer namespace-scoped RoleBindings

6. Detect unused service accounts and RoleBindings:
   → Remove stale bindings (principle of least privilege)

7. Validate no pods run with privileged service accounts:
   → Cross-reference pod specs with SA permissions
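
Step 1 of the checklist can be scripted. In a live cluster the input would come from `kubectl get clusterrolebindings -o json`; here a synthetic sample stands in, and the JSON filtering is approximated with grep so the sketch has no cluster dependency (a real audit would use jq or a policy tool).

```shell
# SYNTHETIC — flag ClusterRoleBindings that grant cluster-admin.
# Live input: kubectl get clusterrolebindings -o json
cat > /tmp/crbs.json <<'EOF'
{"items":[
 {"metadata":{"name":"synth-suspect-binding"},
  "roleRef":{"kind":"ClusterRole","name":"cluster-admin"},
  "subjects":[{"kind":"ServiceAccount","name":"synth-app-sa","namespace":"synth-dev"}]},
 {"metadata":{"name":"synth-ok-binding"},
  "roleRef":{"kind":"ClusterRole","name":"view"},
  "subjects":[{"kind":"Group","name":"synth-developers"}]}
]}
EOF

# Print the names of bindings whose roleRef is cluster-admin
grep -B2 '"name":"cluster-admin"' /tmp/crbs.json | grep -o '"name":"synth-[a-z-]*"'
```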

51.4 Container Escape Techniques

51.4.1 Escape Vector Taxonomy

Understanding container escapes is essential for both red team assessment and blue team detection. Each vector exploits a specific isolation boundary failure.

Vector          | Mechanism                           | Required Condition                  | ATT&CK | Severity
----------------|-------------------------------------|-------------------------------------|--------|----------
Privileged flag | Disables all Linux security features | privileged: true                    | T1611  | CRITICAL
hostPID + nsenter | Enters host PID namespace         | hostPID: true + nsenter binary      | T1611  | CRITICAL
hostPath mount  | Direct host filesystem access       | hostPath volume with write access   | T1611  | CRITICAL
Docker socket   | Full Docker daemon control          | /var/run/docker.sock mounted        | T1611  | CRITICAL
CAP_SYS_ADMIN   | Mount namespace manipulation        | Capability added to container       | T1611  | HIGH
cgroup escape   | cgroup release_agent exploitation   | Write access to cgroup filesystem   | T1611  | HIGH
Kernel exploit  | Exploit shared kernel vulnerability | Vulnerable kernel, no gVisor/Kata   | T1611  | HIGH
hostNetwork     | Shared network namespace            | hostNetwork: true                   | T1611  | MEDIUM
procfs write    | Write to host /proc entries         | /proc mounted writable              | T1611  | MEDIUM

51.4.2 Privileged Container Escape (Conceptual)

SYNTHETIC SCENARIO — Privileged Container Escape
═══════════════════════════════════════════════════
Cluster: synth-cluster (198.51.100.0/24)
Namespace: synth-dev
Pod: synth-debug-pod (privileged: true)

ESCAPE CONCEPT (educational — cgroup release_agent):
1. Attacker compromises application in privileged pod
2. Privileged flag disables: AppArmor, seccomp, capability restrictions
3. Attacker mounts host cgroup filesystem
4. Writes a release_agent script that executes on the host
5. Triggers cgroup notification → script runs as root on host
6. Host is fully compromised

CONCEPTUAL FLOW:
  Container (root) → mount cgroup → write release_agent
  → trigger notify_on_release → execute on HOST as root

WHY PRIVILEGED IS DANGEROUS:
  - Disables ALL namespace isolation
  - Grants ALL Linux capabilities (36+)
  - Allows device access (/dev)
  - Disables seccomp and AppArmor profiles
  - Container is root on the host in everything but name

DETECTION:
  - Audit log: pod created with privileged=true
  - Falco: unexpected mount syscall in container
  - Process monitoring: nsenter, chroot, mount from container PID

51.4.3 hostPID + nsenter Escape (Conceptual)

# SYNTHETIC — Pod with hostPID escape vector
apiVersion: v1
kind: Pod
metadata:
  name: synth-escape-demo
  namespace: synth-dev
spec:
  hostPID: true              # Shares host PID namespace
  containers:
  - name: escape
    image: synth-registry.example.com/debug:1.0
    securityContext:
      privileged: false      # Even WITHOUT privileged flag
    command: ["sleep", "infinity"]

# ESCAPE CONCEPT:
# With hostPID=true, the container can see ALL host processes.
# If the container also holds CAP_SYS_ADMIN (e.g. via privileged: true
# or an added capability) and has an nsenter binary:
#   nsenter -t 1 -m -u -i -n -- /bin/bash
#   → Enters PID 1's namespaces (init process = host)
#   → Full host shell access
#
# EVEN WITHOUT PRIVILEGED FLAG, hostPID enables:
#   - Process enumeration of host and all containers
#   - /proc/<pid>/environ reading (environment variable secrets)
#   - Signal injection to host processes (subject to kernel checks)
#   - Reconnaissance that guides follow-on escalation
# NOTE: setns() requires CAP_SYS_ADMIN, so the full nsenter escape
# needs hostPID combined with privileged mode or added capabilities.

51.4.4 Detection Queries — Container Escapes

// KQL  Detect privileged pod creation in Kubernetes audit logs
// Data source: Azure Monitor / Microsoft Defender for Containers
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where Verb == "create"
| where ObjectRef_Resource == "pods"
| extend PodSpec = parse_json(RequestBody)
// NOTE: checks only the first container; mv-expand
// PodSpec.spec.containers to cover multi-container pods
| where PodSpec.spec.containers[0].securityContext.privileged == true
    or PodSpec.spec.hostPID == true
    or PodSpec.spec.hostNetwork == true
| project
    TimeGenerated,
    SourceIP = SourceIPs,
    Username = User_Username,
    Namespace = ObjectRef_Namespace,
    PodName = tostring(PodSpec.metadata.name),
    Privileged = tostring(PodSpec.spec.containers[0].securityContext.privileged),
    HostPID = tostring(PodSpec.spec.hostPID)
| sort by TimeGenerated desc
// SPL  Detect privileged pod creation
// Data source: Kubernetes audit logs via Splunk
index=kubernetes sourcetype="kube:apiserver:audit"
verb="create" objectRef.resource="pods"
| spath path=requestObject.spec.containers{}.securityContext.privileged output=privileged
| spath path=requestObject.spec.hostPID output=hostPID
| spath path=requestObject.spec.hostNetwork output=hostNetwork
| where privileged="true" OR hostPID="true" OR hostNetwork="true"
| table _time, user.username, objectRef.namespace,
    requestObject.metadata.name, privileged, hostPID, hostNetwork
| sort -_time
// KQL  Detect runtime indicators of container escape attempts
// Data source: Microsoft Defender for Containers / Syslog
SecurityAlert
| where TimeGenerated > ago(24h)
| where AlertType has_any (
    "ContainerEscape",
    "PrivilegedContainer",
    "SensitiveMount"
)
| union (
    Syslog
    | where TimeGenerated > ago(24h)
    | where ProcessName == "falco"
    | where SyslogMessage has_any (
        "nsenter", "mount", "chroot",
        "cgroup", "release_agent"
    )
)
| project TimeGenerated, Computer, ProcessName,
    AlertType, SyslogMessage
| sort by TimeGenerated desc
// SPL  Detect runtime indicators of container escape
// Data source: Falco alerts / syslog
index=kubernetes sourcetype="falco"
(rule="Terminal shell in container"
 OR rule="Launch Privileged Container"
 OR rule="Mount Launched in Privileged Container"
 OR rule="Detect release_agent File Container Escapes")
| table _time, container.id, container.name, k8s.pod.name,
    k8s.ns.name, proc.cmdline, rule, output
| sort -_time

51.5 Secrets Management

51.5.1 Native Kubernetes Secrets — Limitations

Kubernetes Secrets are base64-encoded by default, not encrypted. They are stored in etcd as plaintext unless encryption at rest is explicitly configured.
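
The distinction is easy to demonstrate: base64 is a reversible encoding that requires no key, so anyone who can read the stored value can recover the plaintext.

```shell
# base64 is an encoding, not encryption — anyone can reverse it
encoded=$(echo -n 'synth-password-REDACTED' | base64)
echo "stored in etcd as: $encoded"

# Recovers the plaintext with no key whatsoever
echo -n "$encoded" | base64 -d
```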

KUBERNETES SECRETS RISK MODEL
══════════════════════════════════════════════════════
Storage:
  - etcd stores Secrets as base64 (NOT encryption)
  - Direct etcd access → all Secrets exposed
  - etcd backup files contain all Secrets in plaintext

Access:
  - Any subject with 'get' on secrets can read them
  - Default SA may have secret access via legacy bindings
  - RBAC wildcard (*) on resources includes secrets

Transmission:
  - API server → kubelet: TLS encrypted (good)
  - Mounted as tmpfs in pod (good — not on disk)
  - Environment variables visible in /proc/<pid>/environ (bad)

Logging:
  - Secret values appear in audit logs at RequestResponse level
  - Ensure audit policy excludes Secret data (Metadata level only)
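
The logging risk above is addressed with an audit policy rule that records Secret operations at Metadata level only. A minimal sketch (the full audit policy is covered in Section 51.10):

```yaml
# SYNTHETIC — audit policy fragment: never log Secret payloads
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Record who touched which Secret, but not request/response bodies
- level: Metadata
  resources:
  - group: ""                 # core API group
    resources: ["secrets"]
```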

51.5.2 Encryption at Rest Configuration

# SYNTHETIC — EncryptionConfiguration for Kubernetes Secrets
# File: /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  - configmaps
  providers:
  # Primary: AES-GCM (recommended — authenticated encryption)
  - aesgcm:
      keys:
      - name: key-2026-04
        secret: REDACTED  # 32-byte base64-encoded key
  # Fallback: allows reading old unencrypted secrets during migration
  - identity: {}

# API server flag: --encryption-provider-config=/etc/kubernetes/encryption-config.yaml
#
# CRITICAL STEPS after enabling:
# 1. Restart API server with the flag
# 2. Re-encrypt all existing secrets:
#    kubectl get secrets --all-namespaces -o json | kubectl replace -f -
# 3. Remove the identity{} provider after migration
# 4. Store encryption key OUTSIDE the cluster (HSM, Vault, KMS)
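
One common way to generate the 32-byte key referenced above is to read it from a secure random source; the sanity check confirms the decoded key is exactly 32 bytes, as the aesgcm provider requires.

```shell
# Generate a 32-byte random key, base64-encoded, for EncryptionConfiguration
key=$(head -c 32 /dev/urandom | base64 | tr -d '\n')
echo "$key"

# Sanity check: the decoded key must be exactly 32 bytes
echo -n "$key" | base64 -d | wc -c    # prints 32
```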

51.5.3 External Secrets Management

Solution                 | Integration Method                                       | Key Advantage                                | Managed K8s Support
-------------------------|----------------------------------------------------------|----------------------------------------------|--------------------
HashiCorp Vault          | CSI driver, sidecar injector, external-secrets operator  | Full lifecycle management, dynamic secrets   | EKS, AKS, GKE, self-managed
AWS Secrets Manager      | ASCP (AWS Secrets & Config Provider) CSI driver          | Native AWS integration, automatic rotation   | EKS
Azure Key Vault          | CSI driver (secrets-store-csi-driver)                    | Native Azure integration, RBAC via AAD       | AKS
GCP Secret Manager       | CSI driver, workload identity                            | Native GCP integration                       | GKE
External Secrets Operator | CRD-based sync from external store to K8s Secret        | Cloud-agnostic, supports multiple backends   | All
Sealed Secrets           | Encrypted CRDs decrypted by in-cluster controller        | GitOps-friendly (safe to commit to repo)     | All

# SYNTHETIC — External Secrets Operator example
# Syncs a secret from AWS Secrets Manager into a Kubernetes Secret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: synth-db-credentials
  namespace: synth-production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: synth-aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: synth-db-secret
    creationPolicy: Owner
  data:
  - secretKey: username
    remoteRef:
      key: synth/production/database
      property: username
  - secretKey: password
    remoteRef:
      key: synth/production/database
      property: password

51.5.4 Detection Queries — Secrets Access

// KQL  Detect bulk secrets enumeration across namespaces
KubeAuditLogs
| where TimeGenerated > ago(1h)
| where Verb in ("list", "get")
| where ObjectRef_Resource == "secrets"
| summarize
    SecretAccessCount = count(),
    NamespacesAccessed = dcount(ObjectRef_Namespace),
    Namespaces = make_set(ObjectRef_Namespace, 10)
  by User_Username, SourceIPs, bin(TimeGenerated, 5m)
| where SecretAccessCount > 20 or NamespacesAccessed > 3
| project TimeGenerated, User_Username, SourceIPs,
    SecretAccessCount, NamespacesAccessed, Namespaces
| sort by SecretAccessCount desc
// SPL  Detect bulk secrets enumeration
index=kubernetes sourcetype="kube:apiserver:audit"
verb IN ("list", "get") objectRef.resource="secrets"
| bin _time span=5m
| stats count AS secret_access_count,
    dc(objectRef.namespace) AS namespaces_accessed,
    values(objectRef.namespace) AS namespace_list
  by _time, user.username, sourceIPs{}
| where secret_access_count > 20 OR namespaces_accessed > 3
| sort -secret_access_count

51.6 Network Policies

51.6.1 Default-Deny Foundation

By default, Kubernetes allows all pod-to-pod communication within a cluster. Network policies are the only mechanism to restrict this traffic. Without a CNI that supports NetworkPolicy (Calico, Cilium, Antrea), policies are ignored silently.

# SYNTHETIC — Default deny all ingress and egress
# Apply to every namespace as the baseline
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: synth-production
spec:
  podSelector: {}          # Applies to ALL pods in namespace
  policyTypes:
  - Ingress
  - Egress
  # No ingress/egress rules = deny all traffic

Silent Failure

If your CNI plugin does not support NetworkPolicy (e.g., default Flannel), applying a NetworkPolicy resource will succeed with no error but have zero effect. Always verify your CNI supports network policy enforcement. Use Calico, Cilium, or Antrea for production clusters.

51.6.2 Namespace Isolation Pattern

# SYNTHETIC — Allow only intra-namespace communication + DNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: synth-production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector: {}      # Allow from same namespace only
  egress:
  - to:
    - podSelector: {}      # Allow to same namespace only
  - to:                    # Allow DNS resolution
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53

51.6.3 Microsegmentation with Cilium

Cilium extends NetworkPolicy with L7 (application-layer) filtering using eBPF, enabling HTTP method/path, gRPC, and Kafka topic-level policies.

# SYNTHETIC — Cilium L7 network policy
# Only allows GET requests to /api/v1/health and /api/v1/data
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: synth-api-l7-policy
  namespace: synth-production
spec:
  endpointSelector:
    matchLabels:
      app: synth-api
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: synth-frontend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/api/v1/health"
        - method: "GET"
          path: "/api/v1/data"
        # POST, PUT, DELETE are implicitly denied

51.6.4 Detection Queries — Network Policy Violations

// KQL  Detect network policy modifications
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where ObjectRef_Resource == "networkpolicies"
| where Verb in ("create", "update", "patch", "delete")
| project
    TimeGenerated,
    User_Username,
    Verb,
    Namespace = ObjectRef_Namespace,
    PolicyName = ObjectRef_Name,
    SourceIPs
| sort by TimeGenerated desc
// SPL  Detect network policy modifications
index=kubernetes sourcetype="kube:apiserver:audit"
objectRef.resource="networkpolicies"
verb IN ("create", "update", "patch", "delete")
| table _time, user.username, verb,
    objectRef.namespace, objectRef.name, sourceIPs{}
| sort -_time

51.7 Supply Chain Security

51.7.1 Container Supply Chain Threat Model

graph LR
    A[Source Code\nDeveloper] -->|Build| B[Container Image\nCI/CD Pipeline]
    B -->|Push| C[Container Registry\nsynth-registry.example.com]
    C -->|Pull| D[Kubernetes Cluster\nRuntime]

    A1[Compromised\nDependency] -.->|Supply Chain| B
    B1[Tampered\nBuild] -.->|Integrity| B
    C1[Unsigned\nImage] -.->|Verification| C
    D1[Unscanned\nDeployment] -.->|Admission| D

    style A1 fill:#e63946,color:#fff
    style B1 fill:#e63946,color:#fff
    style C1 fill:#e63946,color:#fff
    style D1 fill:#e63946,color:#fff

51.7.2 Image Signing with Cosign

CONTAINER IMAGE SIGNING WORKFLOW
══════════════════════════════════════════════════════
Step 1: Generate signing key pair (one-time)
  cosign generate-key-pair
  → cosign.key (private — store in secrets manager)
  → cosign.pub (public — distribute to clusters)

Step 2: Sign image after build + scan in CI/CD
  cosign sign --key cosign.key \
    synth-registry.example.com/app@sha256:abc123...
  → Signature stored as OCI artifact alongside image

Step 3: Verify signature before deployment
  cosign verify --key cosign.pub \
    synth-registry.example.com/app@sha256:abc123...
  → Returns 0 if valid, non-zero if tampered

Step 4: Enforce via admission controller
  → Kyverno or OPA Gatekeeper policy rejects unsigned images
  → Only images signed by trusted key can be deployed

51.7.3 Admission Controller Enforcement

# SYNTHETIC — Kyverno policy requiring image signatures
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signature
spec:
  validationFailureAction: Enforce  # Block non-compliant pods
  background: false
  rules:
  - name: check-image-signature
    match:
      any:
      - resources:
          kinds:
          - Pod
    verifyImages:
    - imageReferences:
      - "synth-registry.example.com/*"
      attestors:
      - count: 1
        entries:
        - keys:
            publicKeys: |-
              -----BEGIN PUBLIC KEY-----
              REDACTED
              -----END PUBLIC KEY-----

---
# SYNTHETIC — Kyverno policy: block images from untrusted registries
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  rules:
  - name: validate-registries
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Images must come from approved registries."
      pattern:
        spec:
          containers:
          - image: "synth-registry.example.com/*"

51.7.4 SBOM and Vulnerability Scanning

SUPPLY CHAIN SECURITY PIPELINE
══════════════════════════════════════════════════════
Stage 1: Source Composition Analysis
  → Scan dependencies for known CVEs (Trivy, Grype, Snyk)
  → Generate SBOM (syft, trivy sbom)
  → Block builds with critical/high CVEs (fail threshold)

Stage 2: Image Build Security
  → Use minimal base images (distroless, Alpine, scratch)
  → Multi-stage builds (no build tools in runtime image)
  → Pin base images by SHA256 digest (not tag)
  → Scan built image (Trivy, Grype)
  → Attach SBOM as OCI attestation (cosign attest)

Stage 3: Registry Security
  → Private registry with authentication
  → Vulnerability scanning on push (Harbor, ECR, ACR, GAR)
  → Image retention policies (delete untagged images)
  → Registry allowlisting in admission controller

Stage 4: Runtime Verification
  → Admission controller verifies signature + attestation
  → Continuous scanning of running images (Trivy Operator)
  → Alert on newly disclosed CVEs affecting deployed images
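
The digest-pinning item in Stage 2 reduces to a simple string check. A minimal sketch, assuming the standard OCI image reference format (helper name is synthetic):

```python
import re

# SYNTHETIC helper: a digest-pinned reference ends in "@sha256:" plus
# 64 hex characters, while a tag-only reference (":latest", ":1.4")
# can silently point to different content between pulls.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")

def is_digest_pinned(image: str) -> bool:
    return bool(DIGEST_RE.search(image))
```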

51.8 Runtime Security

51.8.1 Runtime Threat Detection Architecture

graph TD
    subgraph Kubernetes Node
        A[Container Process\nsyscall] -->|eBPF hook| B[Kernel\neBPF Programs]
        B -->|Events| C[Falco / Tetragon\nUserspace Agent]
    end
    C -->|Alert| D[SIEM / SOAR\nDetection Pipeline]
    C -->|Alert| E[Kubernetes Event\nPod annotation]
    C -->|Enforce| F[Kill Process\nor Pod]

    style A fill:#457b9d,color:#fff
    style B fill:#264653,color:#fff
    style C fill:#e63946,color:#fff
    style D fill:#2d6a4f,color:#fff

51.8.2 Falco Rules for Kubernetes

Falco is the de facto open-source standard for Kubernetes runtime security. It monitors syscalls via an eBPF probe (or a legacy kernel module), evaluates each event against its rule set, and emits an alert whenever a rule's condition matches.

# SYNTHETIC — Custom Falco rules for Kubernetes security
# File: /etc/falco/rules.d/k8s-custom-rules.yaml

# Detect shell execution in a container
- rule: Shell in Container
  desc: Detect interactive shell opened in a container
  condition: >
    spawned_process and
    container and
    proc.name in (bash, sh, zsh, dash, ksh) and
    proc.tty != 0
  output: >
    Shell opened in container
    (user=%user.name command=%proc.cmdline
     container=%container.name pod=%k8s.pod.name
     namespace=%k8s.ns.name image=%container.image.repository)
  priority: WARNING
  tags: [container, shell, mitre_execution]

# Detect reading of sensitive files
- rule: Read Sensitive File in Container
  desc: Detect reads of sensitive files that may indicate credential theft
  condition: >
    open_read and
    container and
    fd.name in (/etc/shadow, /etc/passwd,
                /var/run/secrets/kubernetes.io/serviceaccount/token,
                /var/run/secrets/kubernetes.io/serviceaccount/ca.crt)
  output: >
    Sensitive file read in container
    (file=%fd.name user=%user.name command=%proc.cmdline
     container=%container.name pod=%k8s.pod.name
     namespace=%k8s.ns.name)
  priority: CRITICAL
  tags: [container, credential_access, mitre_credential_access]

# Detect crypto-mining indicators
- rule: Detect Crypto Mining Process
  desc: Detect processes commonly associated with cryptocurrency mining
  condition: >
    spawned_process and
    container and
    (proc.name in (xmrig, minerd, minergate, cpuminer) or
     proc.cmdline contains "stratum+tcp" or
     proc.cmdline contains "pool.mining" or
     proc.cmdline contains "--donate-level")
  output: >
    Crypto mining process detected
    (process=%proc.name command=%proc.cmdline
     container=%container.name pod=%k8s.pod.name
     namespace=%k8s.ns.name)
  priority: CRITICAL
  tags: [container, cryptomining, mitre_resource_hijacking]

# Detect nsenter (container escape tool)
- rule: Detect nsenter Usage
  desc: nsenter allows entering host namespaces — container escape indicator
  condition: >
    spawned_process and
    container and
    proc.name = nsenter
  output: >
    nsenter executed in container (POSSIBLE ESCAPE ATTEMPT)
    (command=%proc.cmdline container=%container.name
     pod=%k8s.pod.name namespace=%k8s.ns.name
     user=%user.name)
  priority: CRITICAL
  tags: [container, escape, mitre_privilege_escalation]
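
The boolean structure of a Falco condition translates directly into code. A minimal Python mirror of the first rule above, using a synthetic event dictionary in place of the fields Falco extracts from a syscall:

```python
# SYNTHETIC mirror of the "Shell in Container" condition; the event
# dict stands in for Falco's extracted fields, not Falco's internals.
SHELLS = {"bash", "sh", "zsh", "dash", "ksh"}

def shell_in_container(event: dict) -> bool:
    return (
        bool(event.get("container"))          # spawned inside a container
        and event.get("proc_name") in SHELLS  # known interactive shell
        and event.get("proc_tty", 0) != 0     # tty attached: interactive
    )
```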

51.8.3 Cilium Tetragon — eBPF-Based Enforcement

Tetragon goes beyond detection to real-time enforcement — it can kill processes or send signals at the kernel level before a malicious action completes.

# SYNTHETIC — Tetragon TracingPolicy
# Block sensitive file access in containers
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: block-sensitive-access
spec:
  kprobes:
  - call: "security_file_open"
    syscall: false
    args:
    - index: 0
      type: "file"
    selectors:
    - matchArgs:
      - index: 0
        operator: "Prefix"
        values:
        - "/etc/shadow"
        - "/etc/kubernetes/pki"
        - "/var/lib/etcd"
      matchActions:
      - action: Sigkill      # Kill the process immediately
    - matchArgs:
      - index: 0
        operator: "Prefix"
        values:
        - "/var/run/secrets/kubernetes.io/serviceaccount/token"
      matchActions:
      - action: Post          # Log but allow (for expected access)
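
The first-match selector logic above can be sketched as plain code. A synthetic Python mirror of the policy's decision (not Tetragon's implementation):

```python
# SYNTHETIC mirror of the TracingPolicy selectors: kill-listed path
# prefixes are blocked, token reads are logged ("Post") but allowed,
# and everything else passes untouched.
KILL_PREFIXES = ("/etc/shadow", "/etc/kubernetes/pki", "/var/lib/etcd")
POST_PREFIXES = ("/var/run/secrets/kubernetes.io/serviceaccount/token",)

def file_open_action(path: str) -> str:
    if path.startswith(KILL_PREFIXES):
        return "Sigkill"
    if path.startswith(POST_PREFIXES):
        return "Post"
    return "Allow"
```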

51.8.4 Runtime Security Comparison

Capability | Falco | Tetragon | Commercial (Aqua/Sysdig)
Syscall monitoring | Yes (eBPF/kernel module) | Yes (eBPF) | Yes
Process tree tracking | Limited | Full (with ancestors) | Full
Network monitoring | Basic (connection events) | Full (L3/L4/L7) | Full
File integrity | Via rules | Via TracingPolicy | Built-in FIM
Real-time enforcement | No (detect only) | Yes (Sigkill, Override) | Yes
Kubernetes context | Yes (pod, namespace, labels) | Yes (native K8s enrichment) | Yes
Performance overhead | Low (~1-3% CPU) | Very low (eBPF in-kernel) | Varies
License | Apache 2.0 | Apache 2.0 | Commercial

51.9 etcd Security

51.9.1 etcd Threat Model

etcd is the crown jewel of a Kubernetes cluster. It contains the entire cluster state: every Secret, every ConfigMap, every RBAC binding, every pod specification.

etcd SECURITY HARDENING
══════════════════════════════════════════════════════
Access Control:
  □ mTLS for all client connections (API server → etcd)
  □ mTLS for peer communication (etcd node → etcd node)
  □ Restrict etcd port (2379/2380) via firewall to API server only
  □ No direct etcd access from worker nodes
  □ Separate etcd nodes from control plane (dedicated hardware/VMs)

Encryption:
  □ Enable Kubernetes encryption at rest (see Section 51.5.2)
  □ Use KMS provider for encryption key management
  □ Rotate encryption keys quarterly
  □ Verify encryption: etcdctl get /registry/secrets/... should be ciphertext

Backups:
  □ Automated etcd snapshots (etcdctl snapshot save)
  □ Encrypt backup files at rest
  □ Store backups in separate security domain (not on cluster nodes)
  □ Test restore procedure quarterly
  □ Retention policy: 30 days minimum

Monitoring:
  □ Monitor etcd metrics (leader changes, slow queries, disk latency)
  □ Alert on certificate expiry (30-day threshold)
  □ Alert on unauthorized connection attempts
  □ Log all etcd API calls for forensics
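
The encryption verification item can be automated. A heuristic sketch, assuming the documented `k8s:enc:<provider>:` prefix that Kubernetes encryption providers write to etcd (helper name is synthetic):

```python
# SYNTHETIC helper for the "verify encryption" checklist item: values
# written by a Kubernetes encryption provider carry a "k8s:enc:" prefix
# in etcd, while unencrypted objects start with the protobuf envelope
# "k8s\x00" followed by readable type and field names.
def looks_encrypted(raw: bytes) -> bool:
    return raw.startswith(b"k8s:enc:")
```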

51.9.2 Detection Queries — etcd Access

// KQL  Detect direct etcd access attempts bypassing API server
// Data source: Network flow logs / firewall logs
CommonSecurityLog
| where TimeGenerated > ago(24h)
| where DestinationPort == 2379 or DestinationPort == 2380
| where SourceIP !in ("198.51.100.10")  // Synthetic: only API server IP allowed
| project
    TimeGenerated,
    SourceIP,
    DestinationIP,
    DestinationPort,
    Action = DeviceAction,
    Protocol
| sort by TimeGenerated desc
// SPL  Detect direct etcd access bypassing API server
index=firewall dest_port IN (2379, 2380)
NOT src_ip="198.51.100.10"
| table _time, src_ip, dest_ip, dest_port, action
| sort -_time

51.10 Kubernetes Audit Logging

51.10.1 Audit Policy Design

Kubernetes audit logging can record every request that reaches the API server. The audit policy controls which requests are actually logged and at what level of detail.

Level | What Is Recorded | Use Case
None | Nothing | Resources you never need to audit
Metadata | Request metadata (user, resource, verb, timestamp) | Default for most resources
Request | Metadata + request body | Mutations to critical resources
RequestResponse | Metadata + request body + response body | Forensic-grade logging (high volume)

# SYNTHETIC — Kubernetes audit policy
# File: /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# CRITICAL: Log ALL authentication failures at RequestResponse
- level: RequestResponse
  verbs: ["create"]
  resources:
  - group: "authentication.k8s.io"
    resources: ["tokenreviews"]

# HIGH: Log secret access at Metadata only (never log secret values!)
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets"]

# HIGH: Log RBAC changes at Request level
- level: Request
  verbs: ["create", "update", "patch", "delete"]
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]

# HIGH: Log pod exec and port-forward
- level: Request
  verbs: ["create"]
  resources:
  - group: ""
    resources: ["pods/exec", "pods/portforward", "pods/attach"]

# MEDIUM: Log workload changes at Metadata
- level: Metadata
  verbs: ["create", "update", "patch", "delete"]
  resources:
  - group: "apps"
    resources: ["deployments", "daemonsets", "statefulsets"]
  - group: ""
    resources: ["pods", "services", "configmaps"]
  - group: "batch"
    resources: ["jobs", "cronjobs"]

# LOW: Log read-only access at None (reduce volume)
- level: None
  verbs: ["get", "list", "watch"]
  resources:
  - group: ""
    resources: ["events", "endpoints"]

# CATCH-ALL: Metadata for everything else
- level: Metadata

51.10.2 High-Value Audit Events

Event | Verb + Resource | Threat Indicator
Secret enumeration | list + secrets | Credential harvesting
ClusterRoleBinding creation | create + clusterrolebindings | Privilege escalation / persistence
Pod exec | create + pods/exec | Lateral movement / RCE
Service account token request | create + serviceaccounts/token | Token theft
Admission webhook modification | update + mutatingwebhookconfigurations | Supply chain attack / persistence
Node proxy request | create + nodes/proxy | Kubelet API bypass
CronJob creation | create + cronjobs | Scheduled persistence
Namespace creation | create + namespaces | Shadow namespace for hiding activities
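
These (verb, resource) pairs map directly onto a lookup that a log pipeline can apply per event. A minimal Python sketch with a synthetic audit-event shape:

```python
# SYNTHETIC: (verb, resource) pairs from the table above, keyed the way
# audit events present them; subresources join the parent with "/".
HIGH_VALUE = {
    ("list", "secrets"),
    ("create", "clusterrolebindings"),
    ("create", "pods/exec"),
    ("create", "serviceaccounts/token"),
    ("update", "mutatingwebhookconfigurations"),
    ("create", "nodes/proxy"),
    ("create", "cronjobs"),
    ("create", "namespaces"),
}

def is_high_value(event: dict) -> bool:
    ref = event.get("objectRef", {})
    resource = ref.get("resource", "")
    if ref.get("subresource"):
        resource = f"{resource}/{ref['subresource']}"
    return (event.get("verb"), resource) in HIGH_VALUE
```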

51.10.3 Detection Queries — Audit Log Analysis

// KQL  Detect RBAC modifications that may indicate privilege escalation
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where ObjectRef_Resource in ("clusterrolebindings", "clusterroles",
    "rolebindings", "roles")
| where Verb in ("create", "update", "patch")
| extend RequestBody = parse_json(RequestBody)
| extend
    RoleRef = tostring(RequestBody.roleRef.name),
    SubjectKind = tostring(RequestBody.subjects[0].kind),
    SubjectName = tostring(RequestBody.subjects[0].name)
| where RoleRef has "admin" or RoleRef has "cluster-admin"
    or RoleRef has "edit"
| project
    TimeGenerated,
    User_Username,
    Verb,
    Resource = ObjectRef_Resource,
    BindingName = ObjectRef_Name,
    RoleRef,
    SubjectKind,
    SubjectName,
    SourceIPs
| sort by TimeGenerated desc
// SPL  Detect RBAC modifications indicating privilege escalation
index=kubernetes sourcetype="kube:apiserver:audit"
objectRef.resource IN ("clusterrolebindings", "clusterroles",
    "rolebindings", "roles")
verb IN ("create", "update", "patch")
| spath path=requestObject.roleRef.name output=roleRef
| spath path=requestObject.subjects{}.name output=subjectName
| spath path=requestObject.subjects{}.kind output=subjectKind
| search roleRef IN ("cluster-admin", "admin", "edit")
| table _time, user.username, verb, objectRef.resource,
    objectRef.name, roleRef, subjectName, subjectKind, sourceIPs{}
| sort -_time
// KQL  Detect pod exec activity (potential lateral movement)
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where ObjectRef_Resource == "pods" and ObjectRef_Subresource == "exec"
| where Verb == "create"
| summarize
    ExecCount = count(),
    TargetPods = make_set(ObjectRef_Name, 20),
    Namespaces = make_set(ObjectRef_Namespace, 10)
  by User_Username, SourceIPs, bin(TimeGenerated, 15m)
| where ExecCount > 5
| project TimeGenerated, User_Username, SourceIPs,
    ExecCount, TargetPods, Namespaces
| sort by ExecCount desc
// SPL  Detect excessive pod exec activity
index=kubernetes sourcetype="kube:apiserver:audit"
objectRef.resource="pods" objectRef.subresource="exec"
verb="create"
| bin _time span=15m
| stats count AS exec_count,
    values(objectRef.name) AS target_pods,
    dc(objectRef.namespace) AS namespace_count
  by _time, user.username, sourceIPs{}
| where exec_count > 5
| sort -exec_count
// KQL  Detect CronJob creation that may indicate persistence
KubeAuditLogs
| where TimeGenerated > ago(24h)
| where ObjectRef_Resource == "cronjobs"
| where Verb == "create"
| extend RequestBody = parse_json(RequestBody)
| extend
    Schedule = tostring(RequestBody.spec.schedule),
    Image = tostring(RequestBody.spec.jobTemplate.spec.template.spec.containers[0].image),
    Command = tostring(RequestBody.spec.jobTemplate.spec.template.spec.containers[0].command)
| project
    TimeGenerated,
    User_Username,
    Namespace = ObjectRef_Namespace,
    CronJobName = ObjectRef_Name,
    Schedule,
    Image,
    Command,
    SourceIPs
| sort by TimeGenerated desc
// SPL  Detect CronJob creation for persistence
index=kubernetes sourcetype="kube:apiserver:audit"
objectRef.resource="cronjobs" verb="create"
| spath path=requestObject.spec.schedule output=schedule
| spath path=requestObject.spec.jobTemplate.spec.template.spec.containers{}.image output=image
| spath path=requestObject.spec.jobTemplate.spec.template.spec.containers{}.command output=command
| table _time, user.username, objectRef.namespace,
    objectRef.name, schedule, image, command, sourceIPs{}
| sort -_time

51.11 Kubernetes Hardening Summary

KUBERNETES SECURITY MATURITY MODEL
══════════════════════════════════════════════════════════════════

Level 1 — FOUNDATIONAL (must have before production)
  □ RBAC enabled (no ABAC, no AlwaysAllow)
  □ Default service account tokens disabled
  □ Pod Security Standards: Baseline enforced
  □ Network policies: default-deny in sensitive namespaces
  □ etcd encrypted in transit (mTLS)
  □ Kubelet anonymous auth disabled
  □ Audit logging enabled (Metadata level minimum)
  □ Image pull from private registries only

Level 2 — HARDENED (production-grade)
  □ Pod Security Standards: Restricted enforced
  □ Encryption at rest for Secrets (KMS-backed)
  □ External secrets management (Vault, cloud KMS)
  □ Network policies: default-deny all namespaces + egress filtering
  □ Admission controllers: image allowlisting, signature verification
  □ RBAC: namespace-scoped Roles only, no wildcard permissions
  □ Audit logging: Request level for mutations
  □ Runtime security: Falco or Tetragon deployed
  □ CIS Kubernetes Benchmark: 90%+ compliance

Level 3 — ADVANCED (security-first organizations)
  □ eBPF-based network policy (Cilium) with L7 filtering
  □ Workload identity (no static credentials in pods)
  □ SBOM generation and attestation for all images
  □ Tetragon enforcement policies (kill malicious processes)
  □ Zero-trust service mesh (mTLS pod-to-pod)
  □ Automated RBAC review and unused permission pruning
  □ Multi-cluster security posture management
  □ Chaos engineering for security (GameDay exercises)
  □ CIS Kubernetes Benchmark: 100% compliance

Hands-On Exercises

Practical Application

The following exercises provide hands-on practice with the concepts covered in this chapter:

  • Lab 27: Kubernetes Security Assessment — Hands-on cluster security assessment using kube-bench, RBAC audit, and network policy testing (planned)
  • SC-085: Kubernetes Cluster Compromise via RBAC Misconfiguration — Full attack scenario from initial pod access to cluster-admin escalation (planned)
  • PT-195: Pod Security Standards Bypass Testing — Purple team exercise validating PSS enforcement across namespaces (planned)
  • PT-201: Kubernetes Secrets Extraction and Detection — Red team secrets enumeration with blue team detection validation (planned)
  • PT-207: Container Escape Detection and Response — End-to-end container escape simulation with runtime detection (planned)

Each exercise includes both offensive (red team) and defensive (blue team) components, aligned with the purple team methodology used throughout Nexus SecOps. See the Purple Team Exercise Library for the full catalog.


Exam Prep & Certifications

Relevant Certifications

The topics in this chapter align with objectives from several industry certifications.

View full Certifications Roadmap →


Review Questions

1. Explain why base64 encoding of Kubernetes Secrets does not constitute encryption, and describe the defense-in-depth approach to secrets protection.

Base64 is an encoding scheme, not an encryption algorithm — it is trivially reversible with no key required (echo '<value>' | base64 -d). Kubernetes uses base64 encoding purely for safe transport of binary data in YAML/JSON, not for confidentiality. Defense-in-depth for secrets includes: (1) enabling encryption at rest via EncryptionConfiguration with AES-GCM or KMS provider, so secrets are encrypted before being written to etcd; (2) restricting RBAC access to secrets using namespace-scoped Roles with explicit resource names; (3) using external secrets managers (Vault, AWS Secrets Manager) with dynamic, short-lived credentials; (4) disabling automountServiceAccountToken on service accounts that don't need API access; (5) setting audit policy to Metadata level for secrets to avoid logging secret values.
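
The reversibility claim takes two lines to demonstrate (value is synthetic):

```python
import base64

# SYNTHETIC: the "secret" decodes with no key at all, which is the
# point: base64 provides transport safety, not confidentiality.
encoded = "c3ludGgtcGFzc3dvcmQ="           # value as stored in the manifest
decoded = base64.b64decode(encoded).decode()
```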

2. A pod specification sets hostPID: true but does not set privileged: true. Explain the security implications and possible escape vectors.

With hostPID: true, the container shares the host's PID namespace, meaning it can see all processes running on the host node and in every other container on that node. Even without privileged: true, this enables: (1) reading environment variables of host processes via /proc/<pid>/environ, which may contain secrets, API keys, or database passwords; (2) sending signals to host processes (if the container runs as root); (3) using nsenter -t 1 -m -u -i -n to enter the mount, UTS, IPC, and network namespaces of PID 1 (the host init process), effectively escaping the container entirely. The nsenter escape requires the binary to be present in the container and typically root access within the container. Mitigation: never allow hostPID: true outside of system namespaces; enforce via Pod Security Standards (Baseline profile blocks hostPID).
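
A static admission-style check for these fields is straightforward. A minimal sketch (synthetic helper) over a PodSpec-shaped dictionary:

```python
# SYNTHETIC helper: flag the host-namespace fields that the Baseline
# Pod Security Standard forbids; pod_spec mirrors a Kubernetes PodSpec.
def host_namespace_violations(pod_spec: dict) -> list:
    return [f for f in ("hostPID", "hostIPC", "hostNetwork") if pod_spec.get(f)]
```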

3. Describe the difference between a Kubernetes Role and a ClusterRole, and explain why a RoleBinding referencing a ClusterRole is the recommended pattern.

A Role grants permissions within a single namespace. A ClusterRole grants permissions cluster-wide or on cluster-scoped resources (nodes, PVs, namespaces). The recommended pattern is to define reusable ClusterRoles (e.g., pod-reader, deployment-manager) and then create namespace-scoped RoleBindings that reference those ClusterRoles. This limits the ClusterRole's effective scope to the binding's namespace. Benefits: (1) reusability — one ClusterRole definition, many namespace-scoped bindings; (2) least privilege — the binding constrains what the ClusterRole can access; (3) auditability — RoleBindings are easier to enumerate than scattered Role definitions. The anti-pattern is using ClusterRoleBindings, which grant the ClusterRole's permissions across the entire cluster.

4. Why do Kubernetes NetworkPolicies fail silently when the CNI does not support them, and how should an operator verify enforcement?

Kubernetes treats NetworkPolicy as a standard API resource — the API server accepts and stores the resource regardless of whether the underlying CNI plugin implements it. The API server has no awareness of CNI capabilities. If the CNI (e.g., default Flannel without the network-policy plugin) does not support NetworkPolicy, the resource exists in etcd but has zero runtime effect — all traffic remains allowed. To verify enforcement: (1) deploy a test pod and attempt to reach a pod that should be blocked by the policy — if traffic succeeds, the CNI is not enforcing; (2) check the CNI documentation for NetworkPolicy support; (3) use kubectl get pods -n kube-system to verify the CNI controller pods are running (e.g., calico-node, cilium-agent); (4) run a tool like netassert or cyclonus for automated network policy testing.

5. Compare Falco and Tetragon for Kubernetes runtime security. When would you choose one over the other?

Falco is a detect-only tool: it monitors syscalls via eBPF (or kernel module), matches them against rules, and generates alerts. It has a large community, extensive default ruleset, and integrates well with SIEM systems. Tetragon (by Cilium/Isovalent) uses eBPF for both detection and enforcement — it can kill processes (Sigkill) or override return values at the kernel level before a malicious action completes. Choose Falco when you need broad detection coverage with a mature ecosystem, are sending all alerts to a SIEM for correlation, or need to start quickly with default rules. Choose Tetragon when you need real-time enforcement (block an exploit before it succeeds), require deep process ancestry tracking, need L7 network visibility, or are already using Cilium as your CNI. In mature environments, both can be deployed together: Tetragon for enforcement on critical paths, Falco for broad detection and SIEM integration.

6. An attacker creates a CronJob in a shadow namespace to maintain persistence. What audit log events would this generate, and how would you detect it?

The attacker's actions generate at least two audit events: (1) create on namespaces (creating the shadow namespace); (2) create on cronjobs in the new namespace. The audit policy should log namespace creation at Request level and CronJob creation at Request level. Detection approach: alert on namespace creation by non-infrastructure service accounts; alert on CronJob creation in newly-created namespaces; correlate the two events within a short time window (e.g., namespace creation followed by CronJob creation within 5 minutes by the same user); monitor for CronJobs with images from external/untrusted registries; use Falco to detect unexpected processes spawned by CronJob-created pods at runtime.
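
The correlation step can be sketched as a windowed join. A minimal Python example with a synthetic event shape (a real pipeline would express this in the SIEM):

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)

def correlate_shadow_persistence(events):
    """SYNTHETIC: pair a namespace-create with a later cronjob-create by
    the same user inside the window (event shape is illustrative)."""
    ns = [e for e in events if (e["verb"], e["resource"]) == ("create", "namespaces")]
    cj = [e for e in events if (e["verb"], e["resource"]) == ("create", "cronjobs")]
    return [(n["user"], n["name"], c["name"])
            for n in ns for c in cj
            if c["user"] == n["user"]
            and timedelta(0) <= c["time"] - n["time"] <= WINDOW]
```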

7. Explain how a mutating admission webhook could be abused for persistence, and describe the detection strategy.

A mutating admission webhook intercepts API requests before they are persisted to etcd and can modify the request object. An attacker with create or update permissions on mutatingwebhookconfigurations can register a webhook that injects a sidecar container, environment variable, or volume mount into every new pod. This sidecar could exfiltrate data, establish a reverse shell, or mine cryptocurrency. Because the mutation happens transparently at the API level, pod creators may not notice the injected content. Detection: (1) audit log monitoring for create/update on mutatingwebhookconfigurations and validatingwebhookconfigurations; (2) alert on webhook endpoints pointing to external URLs or non-system namespaces; (3) periodic comparison of deployed pod specs against their source Deployment/StatefulSet specs to detect unexpected mutations; (4) admission controller that restricts who can modify webhook configurations.
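
Detection strategy (3), comparing deployed pod specs against their source templates, reduces to a set difference over container names. A minimal sketch (synthetic helper):

```python
# SYNTHETIC helper: containers running in a pod but absent from the
# owning workload's template are candidates for webhook injection.
# Known, expected injectors (e.g., a service mesh sidecar) would be
# allowlisted in practice.
def injected_containers(pod_containers, template_containers):
    declared = {c["name"] for c in template_containers}
    return {c["name"] for c in pod_containers} - declared
```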


Key Takeaways

Chapter Summary

  1. The API server is the single most critical component — secure it with OIDC authentication, RBAC authorization, TLS 1.2+, audit logging, and network restriction. Every cluster operation flows through the API server.
  2. Pod Security Standards replace PodSecurityPolicy — enforce the Restricted profile for all production workloads. Use namespace-level labels for enforcement, warning, and auditing.
  3. RBAC misconfiguration is the leading Kubernetes escalation vector — use namespace-scoped Roles, disable default SA tokens, never grant wildcard (*) verbs, and audit escalate/bind/impersonate verbs.
  4. Container escapes require specific pod security context settings — privileged, hostPID, hostPath, and CAP_SYS_ADMIN are the primary vectors. Pod Security Standards block all of them at the Restricted level.
  5. Native Kubernetes Secrets are not encrypted by default — enable encryption at rest with AES-GCM or KMS, and migrate to external secrets managers for production credentials.
  6. Network policies are not enforced without a supporting CNI — deploy Calico, Cilium, or Antrea, implement default-deny in every namespace, and verify enforcement with testing tools.
  7. Supply chain security requires signing, scanning, and admission control — sign images with Cosign, scan with Trivy, generate SBOMs, and enforce signature verification via Kyverno or OPA Gatekeeper.
  8. Runtime security closes the detection gap — deploy Falco for broad syscall monitoring and Tetragon for kernel-level enforcement. Both use eBPF for minimal performance impact.
  9. etcd is the crown jewel — protect it with mTLS, network isolation, encryption at rest, and encrypted backups stored outside the cluster.
  10. Audit logging is the foundation of Kubernetes detection engineering — design a tiered audit policy that balances visibility with performance, and forward logs to your SIEM for correlation with KQL/SPL queries.

Cross-References

Topic | Chapter | Relevance
Cloud security fundamentals | Ch 20: Cloud Attack & Defense | Shared responsibility model, cloud IAM foundations
Container red teaming | Ch 46: Cloud & Container Red Teaming | Container escape techniques, cloud attack lifecycle
CI/CD pipeline security | Ch 35: DevSecOps Pipeline | Image scanning integration, admission controller CI/CD
Detection engineering | Ch 5: Detection Engineering at Scale | KQL/SPL query methodology, alert tuning
Network security architecture | Ch 31: Network Security Architecture | Network policy verification, flow log analysis
SOC foundations | Ch 1: Introduction | K8s incident response procedures, SOC workflow