Changelog

All notable changes to lint-pdf are documented here. Format loosely follows Keep a Changelog; versions use PEP 440 (beta tags 0.1.0bN).

  • GS1 Digital Link structural policy (CodexGs1DigitalLinkAnalyzer / CODEX_GS1_DIGITAL_LINK). Reads codex’s detected_barcodes[].digital_link syntactic validation facts (codex >= 1.29.0) and emits one finding per structural defect, with severity banded per failure class: bad GTIN check digit / invalid primary / no primary key = error; un-padded GTIN (GS1 Sunrise-2027) / illegal-or-misordered key-qualifier / malformed AI = warning; deprecated convenience-alpha = advisory. Overridable per run via ctx.config["gs1_digital_link_severities"]. Mirrors the CODEX_BARCODE_GRADE reader pattern (codex = facts, lint = policy); self-skips when the signal is absent (codex < 1.29.0, 1D symbols, marketing URLs) and never raises. Validates GS1 Digital Link structure only: not a GS1 certification, not URI resolution.
  • The codex floor pin was bumped to codex-pdf>=1.29.0 (2026-06-18) now that codex 1.29.0 — which first emits the digital_link block — is published to PyPI, so the signal is contractually guaranteed present. (The reader still self-skips defensively when the block is absent.)
  • XXE hardening (external XML imports + JDF): the PitStop, callas, Acrobat, custom, format-detect, and JDF parsers now parse via defusedxml.ElementTree instead of the stdlib xml.etree.ElementTree, blocking external-entity and entity-expansion (“billion laughs”) attacks on attacker-supplied preflight reports. Drop-in: the fromstring/parse/ParseError surface is unchanged.
  • SSRF scheme allowlist: the warm-on-need probe (warming/registry.py) now rejects non-http(s) probe targets before opening a connection. The operator-run audit/seed scripts validate request schemes through a shared _checked() guard.
  • Corrected an ineffective # nosemgrep suppression in the fastText language-model download (was on the wrong physical line, so it never suppressed) and justified it inline — the URL is a hardcoded constant.
  • Bumped defusedxml floor to >=0.7.1.
  • Added a blocking security CI job (semgrep --error + bandit medium/medium) and pinned every workflow to least-privilege permissions: contents: read.
  • Reconciled the version skew between pyproject.toml (0.1.0b34) and src/lintpdf/__init__.py (0.1.0b33); both now report 0.1.0b35.
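
The severity banding in the GS1 Digital Link entry, and its per-run override, could be resolved roughly as in the sketch below. Only the config key gs1_digital_link_severities and the error / warning / advisory bands come from that entry; the failure-class keys, DEFAULT_SEVERITIES, and resolve_severities are hypothetical names chosen for illustration, and the real identifiers in lint-pdf may differ.

```python
# Hypothetical failure-class keys; lint-pdf's real identifiers may differ.
DEFAULT_SEVERITIES = {
    "bad_gtin_check_digit": "error",
    "invalid_primary": "error",
    "no_primary_key": "error",
    "unpadded_gtin": "warning",
    "illegal_or_misordered_key_qualifier": "warning",
    "malformed_ai": "warning",
    "deprecated_convenience_alpha": "advisory",
}

ALLOWED_LEVELS = {"error", "warning", "advisory"}


def resolve_severities(config):
    """Merge per-run overrides over the defaults.

    Unknown failure classes and unknown severity levels are ignored,
    so a bad override never raises and never removes a default.
    """
    overrides = config.get("gs1_digital_link_severities") or {}
    resolved = dict(DEFAULT_SEVERITIES)
    for failure_class, level in overrides.items():
        if failure_class in resolved and level in ALLOWED_LEVELS:
            resolved[failure_class] = level
    return resolved
```

Under these assumptions, a run that wants un-padded GTINs to fail hard would pass {"gs1_digital_link_severities": {"unpadded_gtin": "error"}} and leave every other class at its default band.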
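
The SSRF scheme allowlist entry describes rejecting non-http(s) targets before any connection is opened. A minimal sketch of that kind of guard, using only the standard library, is shown below. The name check_probe_url and the extra host check are assumptions for illustration; this is not the actual code in warming/registry.py, nor the _checked() helper.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = frozenset({"http", "https"})


def check_probe_url(url: str) -> str:
    """Return url unchanged if it is an http(s) URL with a host.

    Raises ValueError otherwise, so callers fail before opening a socket.
    (Hypothetical helper, not lint-pdf's real guard.)
    """
    parts = urlsplit(url)
    # urlsplit lowercases the scheme; schemeless input yields "".
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"refusing probe target with scheme {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("probe target has no host")
    return url
```

A scheme check of this kind blocks file:, gopher:, and similar targets; it does not by itself restrict which hosts an http(s) URL may reach.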