docs: extract subsystem guides; keep AGENTS.md to whole-repo facts

Move subsystem mechanics (migrations, thesis gate, redaction, ingest, email, packaging) out of AGENTS.md into docs/guides/<topic>.md, each scoped by paths: frontmatter and symlinked from .claude/rules/ so Claude Code lazy-loads them. AGENTS.md keeps whole-repo facts and universal guardrails plus a one-line index per guide. Fix the inaccurate ".claude/ is gitignored" note — it is tracked.
2026-06-12 16:46:49 -05:00
parent cabbcae5d5
commit 090416f05e
13 changed files with 192 additions and 25 deletions
@@ -0,0 +1,26 @@
+---
+paths:
+  - backend/redaction/**
+  - backend/mcp/**
+---
+
+# Redaction & the Claude privacy boundary
+
+Read this before editing anything that sends data to a Claude model — the redaction layer or any MCP agent/tool path.
+
+## The boundary
+
+- `backend/redaction/` (`scrub.py` + `client.py`) is the **scrub → Claude → re-hydrate** boundary. Its key elements are `Boundary`, the `SCRUB_BACKEND=local|gateway` setting, and **fail-closed** behavior.
+- `SCRUB_BACKEND=gateway` routes scrubbing through Spark Control (caller-supplied dict); `SCRUB_BACKEND=local` scrubs in-process. If scrubbing can't run, the call fails closed: raw text is never passed through.
+
+## Hard rules
+
+- **Keep real LP data out of Claude.** Develop only against code, schema, and synthetic or locally redacted data. Route the substance of any real record through `backend/redaction` before it reaches a Claude model.
+- **Never bulk-export the LP list** to any third party. Send only minimal, non-sensitive context to Claude.
+- **Never call a Spark directly** — go through Spark Control (`SPARK_CONTROL_URL`).
+
+## When adding a new Claude/MCP call
+
+Trace the data path: any field carrying LP substance must cross `Boundary` first. A new MCP tool that reads CRM rows and hands them to a model without scrubbing is a leak. Route that tool through the redaction path and extend the leak tests in `backend/redaction/test_*.py` to cover it.
+
+See also `docs/redaction-rehydration.md` and `docs/spark-control-scrub-endpoints.md`.