spark-control/docs/guides/redaction.md
Keysat 98988057a2 v0.18.0:1 - scrub owner-specific hostnames, ips, usernames, names from tracked files
Replace real cluster IPs/hosts/usernames and example names with neutral
placeholders across docs, ops notes, package install text, and the offline
redaction test; delete the obsolete build-time starter prompt. Closes the
portability audit's single blocker. No runtime behavior change.
2026-06-12 15:07:34 -05:00


paths
image/app/redaction/**
image/app/redaction_gateway.py
docs/REDACTION_GATEWAY.md

Redaction (/scrub + /rehydrate)

  • image/app/redaction/scrub.py and test_scrub_leak.py are vendored byte-for-byte from the CRM repo (sha recorded in redaction/__init__.py). Never edit them here: change them in the CRM repo, re-vendor (cp), update the sha, and re-run the leak test.
  • The gateway around the vendored scrubber is image/app/redaction_gateway.py. Its token-map store lives on /data (REDACTION_MAP_DB, default /data/redaction_maps.db) and fails closed if it cannot be opened, so set REDACTION_MAP_DB to a writable path when running outside the container.

Test suites — both must pass before shipping ANY redaction change

cd image
.venv/bin/python -m app.redaction.test_gateway        # /scrub + /rehydrate acceptance; offline, no cluster needed
.venv/bin/python app/redaction/test_scrub_leak.py     # vendored golden-file leak test; offline
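
One way to enforce the "both must pass" rule is a small wrapper that runs each suite and reports every failure. The sketch below is not part of the repo: `run_suites` is a hypothetical helper, and it assumes each suite signals failure with a non-zero exit code.

```python
import subprocess

# Commands copied from this guide; they are meant to be run from image/.
SUITES = (
    [".venv/bin/python", "-m", "app.redaction.test_gateway"],
    [".venv/bin/python", "app/redaction/test_scrub_leak.py"],
)


def run_suites(commands=SUITES, cwd=None):
    """Run every suite and return the commands that exited non-zero."""
    failed = []
    for cmd in commands:
        # Keep going after a failure so the report covers every suite.
        result = subprocess.run(cmd, cwd=cwd)
        if result.returncode != 0:
            failed.append(cmd)
    return failed
```

Called from the image directory, an empty return value from `run_suites()` means both suites passed; anything else lists the commands to re-run.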

Keep the leak test green against the vendored scrub.py after any re-vendor.
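
Before running the leak test after a re-vendor, it can help to confirm that the copies really are byte-for-byte identical. A minimal sketch, assuming a local CRM checkout is available; `vendored_drift` and both directory arguments are hypothetical, and this check complements the recorded sha rather than replacing it.

```python
import filecmp
from pathlib import Path

VENDORED_FILES = ("scrub.py", "test_scrub_leak.py")  # the two files this guide names


def vendored_drift(crm_dir, vendored_dir, names=VENDORED_FILES):
    """Return the names whose vendored copy is missing or differs from the CRM copy."""
    drifted = []
    for name in names:
        src = Path(crm_dir) / name
        dst = Path(vendored_dir) / name
        # shallow=False compares file contents, not just size and mtime.
        if not (src.is_file() and dst.is_file() and filecmp.cmp(src, dst, shallow=False)):
            drifted.append(name)
    return drifted
```

An empty list means the vendored files match the CRM checkout; any name returned points at a file that was edited locally or not re-copied.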

Policy context: sending text through /scrub is the only sanctioned path to frontier/cloud models. See the whole-repo privacy rule in AGENTS.md.