Determinism + transparency posture

Codex’s market position is one no incumbent artwork-AI tool will take: deterministic, versioned, auditable, model-disclosed. Every commercial artwork-AI product (Esko Comply, GlobalVision CheckAI, GetGenAI, ManageArtworks) hides its model and publishes no accuracy numbers, and the big document-AI services (AWS Textract, Google Document AI, Azure) actively discourage version pinning and don’t guarantee determinism. Codex does the opposite, on purpose — and this document is the contract.

It is the companion to contract.md (the wire surface) and accuracy.md (the published accuracy methodology for the deterministic lanes).

The four claims

Claim	What it means	Where it’s surfaced
Model-disclosed	Every signal lane publishes the exact model / engine id behind it.	`ai_model_versions[kind].model` on `GET /v1/contract`
Versioned	Every signal carries the pinned model + prompt + payload-schema versions it was produced under.	`ai_model_versions[kind]` + each payload’s `method_version` / `source`
Deterministic	Each lane is classified deterministic (pure function of the input) or model-produced; deterministic lanes are reproducible byte-for-byte.	`ai_model_versions[kind].determinism` + the `determinism` block
Auditable	The cache key is content-addressed + version-pinned, so a result is reproducible from `(VERSION, pdf bytes, args)` alone, and a release bump rotates the whole namespace.	`determinism.cache_key_format` + this doc

The `determinism` contract block

GET /v1/contract carries a machine-readable determinism block (and the CLI codex-pdf contract manifest mirrors it):

{
  "determinism": {
    "cache_key_format": "codex:{VERSION}:{kind}:{tenant}:{pdf_sha}:{args_sha}",
    "version_pinned": true,
    "version_bump_rotates_cache": true,
    "version_namespace_component": "1.29.0",
    "lanes": {
      "deterministic": ["barcodes"],
      "model": ["classification", "language", "logos", "order_intake",
                "regulatory", "spec", "spell", "substrate", "symbols"]
    },
    "model_disclosed": true,
    "accuracy_methodology": "docs/accuracy.md",
    "determinism_contract": "docs/determinism.md"
  }
}

The block is a pure function of the version catalogue (codex_pdf.ai.versions.AI_MODEL_VERSIONS) + the package VERSION; it carries no secrets and is safe to read unauthenticated.

Per-lane determinism classification

Each entry in ai_model_versions carries a determinism classifier:

deterministic — a pure function of the input bytes; no model, no network. The same input yields the same output byte-for-byte across runs and deploys (subject only to a VERSION bump). These lanes have unambiguous ground truth, so they anchor the published accuracy methodology (accuracy.md).
model — produced by an LLM / vision model. The content-addressed cache still makes a cache read deterministic (it returns the exact bytes first computed for that key), but the underlying model output is not guaranteed bit-stable across model revisions — which is exactly why each lane pins its model + prompt id, so a consumer can detect a change and invalidate deliberately.

Lane	`model`	`determinism`
`barcodes`	`pyzbar+pylibdmtx+iso15415-16+gs1-digital-link`	deterministic
`language`	`claude-haiku-4-5`	model
`logos`	`claude-sonnet-4-6`	model
`symbols`	`claude-sonnet-4-6`	model
`classification`	`claude-haiku-4-5`	model
`spell`	`claude-haiku-4-5`	model
`substrate`	`claude-haiku-4-5`	model
`regulatory`	`claude-haiku-4-5`	model
`spec`	`claude-haiku-4-5`	model
`order_intake`	`claude-haiku-4-5`	model

The barcodes lane is the flagship deterministic surface: pyzbar / pylibdmtx decode, the ISO/IEC 15415 (2D) / 15416 (1D) print-quality grader, and GS1 Digital Link syntactic validation are all pure functions of the rendered pixels / decoded payload. (The DL lane is syntactic only — live resolution is a network call, so it is excluded from the cached lane precisely to keep the lane deterministic.)

The classifier is a property of the lane, not a version — it changes only if a lane is re-architected (e.g. a deterministic lane gains a model fallback). Treat it as stable.

Content-addressed, version-pinned cache discipline

Every codex tier (the API render cache, the AI-signal cache, the edge KV worker) shares one cache-key format:

codex:{VERSION}:{kind}:{tenant}:{pdf_sha}:{args_sha}

VERSION — the codex package version (codex_pdf.version.VERSION). It is the leading namespace component, so a release bump invalidates every tier atomically. There is never a stale cross-release hit, even when operators reuse the same Redis / KV backend across deploys.
kind — segregates by endpoint family (page, separations, heatmap, layer, signal:<signal-kind>, …) so identical args from different endpoints don’t collide.
tenant — scopes per X-Codex-Tenant (default "default") so one tenant’s cached result can’t be read by another.
pdf_sha — sha256 of the PDF bytes. This is the content-addressing: the content keys the cache, not a request id or client token.
args_sha — sha256 of the canonicalised request args (json.dumps(..., sort_keys=True)), so two callers passing the same args in different key order hit the same entry.

The version-pinning invariant

A VERSION bump rotates the entire cache namespace.

This is the discipline document-AI services lack. Because VERSION leads the key, codex:1.29.0:page:… and codex:1.30.0:page:… are different keys for the same PDF — so a release that changes extraction behaviour can never serve a result computed under the old behaviour. Operators who want to force a re-extraction (e.g. after a prompt bump on a model lane, which does not by itself change the key) rotate VERSION.

This invariant is behavior-locked in tests/test_determinism_posture.py and tests/test_cache.py.

Reproducibility guarantee

For a deterministic lane, a result is fully reproducible from (VERSION, pdf bytes, args) — re-run codex at the same version on the same bytes and you get the same answer, no cache required. A cache miss is always recoverable: every cached artifact is derivable from the source PDF + the cache-key arguments. For a model lane, the cache read is reproducible (same key → same bytes); the cold re-computation is reproducible only within a fixed model + prompt, which is why both are pinned and disclosed.

Bumping the version catalogue

See codex_pdf.ai.versions for the per-key bump policy. In short:

model bumps when codex swaps a model family (Haiku ↔ Sonnet) — minor release.
prompt bumps when a system prompt changes shape — minor release.
schema bumps only when a kind’s payload JSON shape changes (stays 1.x).
determinism is a lane property, not a version — stable.

When the contract block changes shape, regenerate docs/openapi.yaml (python scripts/dump_openapi.py) — CI fails on a stale spec.