Features

Everything Codex provides

Built for teams that need an authoritative facts contract for PDFs, not duplicated extraction logic across services.

Boundary clarity

Read-only extraction by design

Codex extracts facts and stops there: it adds no hidden product behavior on top of the extraction.

Contract-first model

Downstream systems consume one shared facts contract.

Validation and trust

Validate outputs in local tooling and CI to catch drift early.
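One way to picture a drift check in CI is to compare a freshly extracted facts document against a stored baseline and list every path that changed. The sketch below is generic Python and does not represent Codex's own validation tooling, which this page does not describe. The function name diff_facts and the "pages" and "extra" keys are hypothetical; only summary.source_format is a field named on this page.

```python
def diff_facts(baseline: dict, current: dict, prefix: str = "") -> list[str]:
    """Return dotted paths whose values differ between two nested dicts.

    Hypothetical helper for illustration; not part of Codex.
    """
    changed: list[str] = []
    for key in sorted(set(baseline) | set(current)):
        path = f"{prefix}.{key}" if prefix else str(key)
        if key not in baseline or key not in current:
            # Key was added or removed between the two documents.
            changed.append(path)
        elif isinstance(baseline[key], dict) and isinstance(current[key], dict):
            # Recurse into nested objects so the report names the exact leaf.
            changed.extend(diff_facts(baseline[key], current[key], path))
        elif baseline[key] != current[key]:
            changed.append(path)
    return changed


# Assumed document shape: "pages" and "extra" are made-up keys.
baseline = {"summary": {"source_format": "pdf", "pages": 2}}
current = {"summary": {"source_format": "pdf", "pages": 3}, "extra": 1}
drift = diff_facts(baseline, current)
```

A CI job could fail the build whenever the returned list is non-empty, so that an unexpected change in extracted facts is reviewed before it reaches downstream systems.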

Operational tooling

Designed for real workflows beyond one-off extraction.

Image intelligence · v1.17.0

Placement-aware resolution for every image on every page.

Captures effective_resolution_dpi from the actual placed rect in the PDF, not from a page-size estimate.
A 300 DPI image enlarged 2× correctly reports 150 DPI; the same image shrunk to 0.5× reports 600 DPI.
Reported per placement across all pages, so downstream tools can flag specific occurrences, not just file-level averages.
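The arithmetic behind those numbers can be sketched as follows. This is a minimal illustration and not Codex's implementation: the Placement class, its field names, and the effective_dpi function are hypothetical, and the code assumes the placed rect is expressed in default PDF user-space units (72 per inch). How Codex combines the horizontal and vertical values into a single effective_resolution_dpi is not stated here, so the sketch returns both.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # default PDF user-space units per inch


@dataclass(frozen=True)
class Placement:
    """One occurrence of an image on a page (hypothetical shape)."""
    pixel_width: int
    pixel_height: int
    placed_width_pt: float   # width of the placed rect, in points
    placed_height_pt: float  # height of the placed rect, in points


def effective_dpi(p: Placement) -> tuple[float, float]:
    """Return (horizontal, vertical) DPI for the image as placed."""
    if p.placed_width_pt <= 0 or p.placed_height_pt <= 0:
        raise ValueError("placed rect must have positive size")
    x_dpi = p.pixel_width / (p.placed_width_pt / POINTS_PER_INCH)
    y_dpi = p.pixel_height / (p.placed_height_pt / POINTS_PER_INCH)
    return x_dpi, y_dpi


# A 600 x 600 px image is 2 inches (144 pt) wide at 300 DPI.
native = effective_dpi(Placement(600, 600, 144.0, 144.0))    # placed at 1x
enlarged = effective_dpi(Placement(600, 600, 288.0, 288.0))  # placed at 2x
shrunk = effective_dpi(Placement(600, 600, 72.0, 72.0))      # placed at 0.5x
```

Because the calculation depends only on pixel dimensions and the placed rect, the same image can report different values on different pages, which is why a per-placement figure is more useful than a file-level average.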

Unified input · v1.36.0

Every accepted input returns the same CodexDocument; a supported-format matrix is in the docs.

Tier 1 (fully decoded): PDF, Adobe Illustrator .ai, EPS / PostScript, 1-bit TIFF & Esko LEN, multi-page TIFF, TIFF/IT, DCS / copydot, CIP3 PPF, and structural die files CFF2 / DDES / DXF.
Tier 2 (detected & noted): encrypted Esko LENX, Scitex CT/LW, ArtiosCAD ARD, and AutoCAD DWG return a clean, precise remediation note, never a silent zero-finding result.
summary.source_format records the input family; plate and die inputs add die size, embellishments, and per-separation ink / coating coverage.