spark-control/.claude/rules/redaction.md
Keysat 9ef9226e0a docs: split CLAUDE.md into path-scoped .claude/rules; fix dev/test commands
- CLAUDE.md trimmed to whole-repo facts (58 lines); subsystem guidance
  moved to .claude/rules/{startos-package,fastapi-image,redaction,
  audio-speech}.md with paths: frontmatter so each loads only when
  matching files are touched
- .gitignore: track .claude/rules/ while keeping the rest of .claude/
  (settings.local.json) ignored
- test-audio-with-speakers.sh: require audio-file arg in docs, replace
  owner-specific SPARK_CONTROL/VLLM defaults with generic ones
  (localhost dev server + Spark Control vLLM proxy), discover the
  loaded LLM via /api/status since /v1/models lists audio models only
- document REDACTION_MAP_DB + CONNECTIVITY_LOG as required for local
  dev (/data only exists in the container)
- prettier pass over startos/actions (formatting drift)
2026-06-11 19:12:23 -05:00


---
paths:
- "image/app/redaction/**"
- "image/app/redaction_gateway.py"
- "docs/REDACTION_GATEWAY.md"
---
# Redaction (`/scrub` + `/rehydrate`)
- `image/app/redaction/scrub.py` and `test_scrub_leak.py` are vendored **byte-for-byte** from the CRM repo (sha recorded in `redaction/__init__.py`). **Never edit them here.** Change them in the CRM repo, re-vendor with a plain `cp`, update the recorded sha, and re-run the leak test.
- The gateway around the vendored scrubber is `image/app/redaction_gateway.py`. Its token-map store lives on `/data` (`REDACTION_MAP_DB`, default `/data/redaction_maps.db`), and the gateway fails closed if the store can't be opened. `/data` only exists in the container, so set `REDACTION_MAP_DB` to a writable path when running outside it.
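As a rough illustration of what "fails closed" means for the token-map store, here is a minimal sketch. It is not the gateway's actual code: the helper name `open_token_map_store` is hypothetical, and it assumes the `.db` file is a SQLite database, which this rule file does not state. Only `REDACTION_MAP_DB` and its default come from the text above.

```python
import os
import sqlite3
from pathlib import Path

# Default named in this rule file; the path only exists inside the container.
DEFAULT_MAP_DB = "/data/redaction_maps.db"


def open_token_map_store(env=None):
    """Hypothetical helper: open the token-map DB or raise (fail closed)."""
    env = os.environ if env is None else env
    db_path = Path(env.get("REDACTION_MAP_DB", DEFAULT_MAP_DB))
    if not db_path.parent.is_dir():
        # Fail closed: no silent fallback to an in-memory or temp store.
        raise RuntimeError(f"token-map store unavailable: {db_path.parent} does not exist")
    try:
        # Assumption: the store is SQLite; the real backend may differ.
        conn = sqlite3.connect(db_path)
        conn.execute("PRAGMA user_version")  # forces the file to actually open
    except sqlite3.Error as exc:
        raise RuntimeError(f"token-map store unavailable: {db_path}") from exc
    return conn
```

Under these assumptions, a local run would point `REDACTION_MAP_DB` at any path whose parent directory exists. Leaving it unset outside the container raises instead of quietly serving requests without a token map.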
## Test suites — both must pass before shipping ANY redaction change
```bash
cd image
.venv/bin/python -m app.redaction.test_gateway # /scrub + /rehydrate acceptance; offline, no cluster needed
.venv/bin/python app/redaction/test_scrub_leak.py # vendored golden-file leak test; offline
```
Keep the leak test green against the vendored `scrub.py` after any re-vendor.
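One way to sanity-check a re-vendor before running the leak test is to compare the vendored file against the upstream copy byte for byte. The sketch below is an assumption-laden illustration: the function names are hypothetical, it presumes a local checkout of the CRM repo, and it does not claim that the sha recorded in `redaction/__init__.py` is a SHA-256 file digest (it may well be a commit sha).

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def assert_byte_identical(vendored, upstream):
    """Raise if the vendored copy differs from the upstream file by even one byte."""
    got, want = sha256_of(vendored), sha256_of(upstream)
    if got != want:
        raise ValueError(f"{vendored} drifted from upstream: {got} != {want}")
    return got
```

A call such as `assert_byte_identical("app/redaction/scrub.py", crm_checkout / "scrub.py")`, where `crm_checkout` is a placeholder for wherever the CRM repo lives locally, would catch editor-induced changes such as normalized line endings or trailing whitespace.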
Policy context: scrubbed text via `/scrub` is the **only** sanctioned path toward frontier/cloud models — see the whole-repo privacy rule in CLAUDE.md.