spark-control/docs/REDACTION_GATEWAY.md

# Redaction Gateway — `/scrub` + `/rehydrate` (Spark Control v0.16.0)

The gateway is the privacy boundary between sovereign LP data and the Claude API. It
lives on the same trusted Spark Control host as `/v1/chat/completions`, `/v1/embeddings`,
`/v1/rerank`, and `/api/search`. It is built to **behavioral parity** with the CRM's
reference `backend/redaction/scrub.py`: that engine is vendored verbatim into
Spark Control and its leak test passes here, so `SCRUB_BACKEND=gateway` is a
drop-in for the in-repo path.

## What it is

- `POST /scrub` — de-identify an agent's assembled context. Returns placeholder-only
  text (the agent forwards that to Claude) plus an opaque `map_handle`.
- `POST /rehydrate` — swap the real values back into Claude's placeholder-bearing
  response, locally, for human review.

Spark Control does **not** call Claude. It's the scrub/rehydrate transform pair
plus a server-held pseudonym map.

## Contract (matches the handover doc)

`POST /scrub`
```jsonc
{ "task_id": "...", "actor": "analyst",
  "items": [{"id": "ctx_1", "text": "..."}],
  "known_entities": {"persons": [], "orgs": [], "funds": [], "emails": [], "locations": []},
  "tier1_action": "drop",            // or "reject" (fail-closed 422 on any Tier-1)
  "bucket": {"amounts": false, "dates": false},
  "ner": "auto",                     // "auto" | "rules_only" | "qwen"
  "map_handle": null }               // pass to reuse/extend a task's map (stable tokens)
```
→ `200 { task_id, map_handle, items:[{id, scrubbed_text, tokens_used}], stats:{tier1_dropped, tier2_tokenized, distinct_entities, descriptive_flags:[{item, span, action}]}, expires_at }`
- `422 {"error":"tier1_detected","spans":[{item, kinds}]}` when `tier1_action="reject"` and any Tier-1 identifier is found (the response lists kinds only, never the raw value).
- `422 {"error":"ner_unavailable", ...}` when `ner=auto|qwen` and the local Qwen is unreachable / no model loaded — **fail-closed, emits nothing**.
- `400` on malformed input.
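
A minimal client sketch for the `/scrub` contract above, using only the standard library. The field names come from the contract; `build_scrub_payload`, `scrub`, `ScrubRejected`, and the `base_url` argument are hypothetical names for illustration, and TLS trust setup (Root CA or skip) is left out.

```python
import json
import urllib.error
import urllib.request


class ScrubRejected(Exception):
    """Raised when /scrub answers 400 or 422, i.e. nothing was emitted."""


def build_scrub_payload(task_id, items, known_entities, *, actor="analyst",
                        tier1_action="drop", ner="auto", map_handle=None):
    """Assemble a /scrub request body using the contract's field names."""
    return {
        "task_id": task_id,
        "actor": actor,
        "items": [{"id": item_id, "text": text} for item_id, text in items],
        "known_entities": known_entities,
        "tier1_action": tier1_action,          # "drop" or "reject"
        "bucket": {"amounts": False, "dates": False},
        "ner": ner,                            # "auto" | "rules_only" | "qwen"
        "map_handle": map_handle,              # pass a handle to reuse a task's map
    }


def scrub(base_url, payload, timeout=30):
    """POST to /scrub; return the parsed 200 body or raise ScrubRejected."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/scrub",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read())
    except urllib.error.HTTPError as exc:
        # Fail closed on the caller side too: never fall back to raw text.
        detail = exc.read().decode("utf-8", "replace")
        raise ScrubRejected(f"{exc.code}: {detail}") from exc
```

The caller forwards only `scrubbed_text` to Claude and keeps `map_handle` for the later `/rehydrate` call.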

`POST /rehydrate`
```json
{ "task_id": "...", "map_handle": "...", "items": [{"id": "out_1", "text": "...[PERSON_1]..."}],
  "actor": "analyst", "strict": true }
```
→ `200 { items:[{id, rehydrated_text}], stats:{tokens_substituted, unknown_tokens} }`
- `409 {"error":"unknown_tokens","tokens":[...]}` when `strict` is true and a token has no map entry (your tripwire for a token Claude hallucinated or smuggled in).
- `410 {"error":"map_expired"}` if the map TTL lapsed or the handle is unknown.
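
One way a caller might map the three `/rehydrate` outcomes onto distinct code paths is sketched below. The status codes and error strings are taken from the contract above; the function and exception names are hypothetical.

```python
class UnknownTokens(Exception):
    """409: a placeholder in Claude's output has no entry in the map."""

    def __init__(self, tokens):
        super().__init__(f"unmapped tokens: {tokens}")
        self.tokens = tokens


class MapExpired(Exception):
    """410: the map TTL lapsed or the handle is unknown."""


def handle_rehydrate_response(status, body):
    """Turn a (status, parsed JSON body) pair into rehydrated text or an error."""
    if status == 200:
        return {item["id"]: item["rehydrated_text"] for item in body["items"]}
    if status == 409 and body.get("error") == "unknown_tokens":
        # Treat as a tripwire: do not show the partially rehydrated text.
        raise UnknownTokens(body.get("tokens", []))
    if status == 410 and body.get("error") == "map_expired":
        # The map is gone; the task has to be re-scrubbed from the source.
        raise MapExpired("map expired or handle unknown")
    raise RuntimeError(f"unexpected /rehydrate response: {status}")
```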

## The dictionary is caller-supplied — and treated as sensitive

You supply `known_entities` (built by your `build_known_entities`, scoped to the LP
in play) in each `/scrub` call. Spark Control never reads your CRM, which keeps the
package portable and means it needs no CRM credentials. The gateway treats your dictionary
as a slice of the LP list: used transiently for the scrub, **never persisted beyond
the resulting tokens, never logged, never echoed**. Only the resulting
`{token → real_value}` map is held server-side.

## NER backstop is load-bearing, not optional

The dictionary is the deterministic floor; the local-Qwen NER pass catches the
names the dictionary can't know (new prospects, an advisor named in passing) and flags
**descriptive re-identifiers** ("the family that sold the mining company in Texas" →
redacted). Under `ner=auto` (default) or `ner=qwen`, if the local Qwen is unreachable
or no model is loaded, `/scrub` **fails closed (422)** rather than passing name-blind
text to Claude. `ner=rules_only` is the explicit, knowing opt-out, never the silent
fallback. The NER pass uses the same local Qwen served at `/v1/chat/completions`, so the
sensitive text never reaches a remote model.
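
The fail-closed rule can be stated as a small decision function. This is an illustrative sketch of the behavior described above, not the gateway's actual code; `plan_ner`, `NerUnavailable`, and the `qwen_ready` flag are hypothetical.

```python
class NerUnavailable(Exception):
    """Stands in for the 422 ner_unavailable response."""


def plan_ner(ner, qwen_ready):
    """Return True if the NER pass should run, False if it is skipped.

    Raises NerUnavailable instead of silently degrading to rules-only.
    """
    if ner not in ("auto", "rules_only", "qwen"):
        raise ValueError(f"unknown ner mode: {ner!r}")
    if ner == "rules_only":
        return False  # the caller opted out explicitly
    if not qwen_ready:
        # "auto" and "qwen" both require the local model: emit nothing.
        raise NerUnavailable("local Qwen unreachable or no model loaded")
    return True
```

The point of the shape is that there is no branch in which `auto` quietly becomes `rules_only`.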

> Verified live against Qwen3.6: an unknown "Sarah Kim from Atlas Ventures" → `[PERSON_1] from [ORG_1]`; a descriptive re-identifier → `[redacted]` + flagged.

## Map-stays-local

The pseudonym map (the de-anonymization key) is held only on this box, keyed by
`map_handle`, in a TTL-swept local store on the StartOS `/data` volume (default 2h;
survives a Spark Control restart mid-review). Never returned in full, never logged,
never in a Claude-bound payload. `REDACTION_MAP_TTL` and `REDACTION_MAP_DB` are
configurable via env if you want a different TTL/path.
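
As a rough illustration of a TTL-swept store keyed by `map_handle`, the sketch below uses SQLite from the standard library. The table layout, the `MapStore` class, and the injected clock are assumptions made for the example; the actual store's schema and the way `REDACTION_MAP_TTL` / `REDACTION_MAP_DB` are parsed are not shown here.

```python
import json
import sqlite3
import time


class MapStore:
    """Pseudonym maps keyed by handle, deleted once their TTL has lapsed."""

    def __init__(self, path=":memory:", ttl_seconds=7200, clock=time.time):
        self._db = sqlite3.connect(path)  # a file path would survive restarts
        self._ttl = ttl_seconds           # 7200 s mirrors the 2h default
        self._clock = clock
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS maps ("
            "handle TEXT PRIMARY KEY, payload TEXT NOT NULL, "
            "expires_at REAL NOT NULL)"
        )

    def put(self, handle, mapping):
        """Store or overwrite a map; return its expiry timestamp."""
        expires_at = self._clock() + self._ttl
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT OR REPLACE INTO maps VALUES (?, ?, ?)",
                (handle, json.dumps(mapping), expires_at),
            )
        return expires_at

    def sweep(self):
        """Delete every map whose TTL has lapsed."""
        with self._db:
            self._db.execute(
                "DELETE FROM maps WHERE expires_at <= ?", (self._clock(),)
            )

    def get(self, handle):
        """Return the map, or None if expired or unknown (the 410 case)."""
        self.sweep()
        row = self._db.execute(
            "SELECT payload FROM maps WHERE handle = ?", (handle,)
        ).fetchone()
        return json.loads(row[0]) if row else None
```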

## Logging stays on your side

`/scrub` and `/rehydrate` return counts-only `stats`; **your app writes the
`interaction_log` row** (you already have `log_scrub`/`log_rehydrate`). Spark Control
does not write to your DB and keeps no audit log of its own that contains real values.
The `descriptive_flags` span text is in the `/scrub` *response* (to you, the local
caller) — strip it before you persist, per your own logging rule (payload = counts only).
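
A small helper along these lines could enforce the counts-only rule before the `interaction_log` write. The `stats` keys are those of the `/scrub` response; `counts_only` is a hypothetical name, and reducing the flags to a count is one reasonable reading of "counts only".

```python
def counts_only(stats):
    """Return a copy of /scrub stats that is safe to persist.

    The descriptive_flags list (which carries span text) is replaced
    by the number of flags; every other field is copied unchanged.
    """
    safe = {key: value for key, value in stats.items()
            if key != "descriptive_flags"}
    safe["descriptive_flags"] = len(stats.get("descriptive_flags", []))
    return safe
```

The input dictionary is left untouched, so the caller can still show the flagged spans to a local reviewer before discarding them.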

## Acceptance — what passed

1. **Parity** — the reference leak fixtures run through the live `/scrub` endpoint: every Tier-1 + Tier-2 identifier absent from the response; substance survives verbatim.
2. **Map-leak** — no real value (incl. Tier-1) in any response body; Tier-1 values absent from the server map entirely.
3. **Round-trip** — `/rehydrate` via the server-held map reproduces the original (Tier-1 → `[redacted]`, the only lossy part).
4. **Handle reuse** — same entity → same token across items and across calls reusing `map_handle` (cache-stable for Claude prompt caching).
5. **Tripwires** — 409 on a strict unmapped token; 410 on expired/unknown handle; 422 fail-closed on `tier1_action=reject`.
6. **Live NER** — unknown names tokenized + descriptive re-identifier redacted against the real local Qwen.
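
The handle-reuse property in item 4 can be pictured with a toy allocator: a token is minted the first time an entity is seen and returned unchanged afterwards. This is not the vendored engine's code; `TokenAllocator` is hypothetical, and only the `[PERSON_1]` / `[ORG_1]` token shape is taken from this document.

```python
class TokenAllocator:
    """Hands out one stable placeholder per (kind, value) pair."""

    def __init__(self):
        self._tokens = {}    # (kind, value) -> token already issued
        self._counters = {}  # kind -> highest index issued so far

    def token_for(self, kind, value):
        key = (kind, value)
        if key not in self._tokens:
            index = self._counters.get(kind, 0) + 1
            self._counters[kind] = index
            self._tokens[key] = f"[{kind}_{index}]"
        return self._tokens[key]


allocator = TokenAllocator()
first = allocator.token_for("PERSON", "Sarah Kim")
again = allocator.token_for("PERSON", "Sarah Kim")  # same entity, same token
```

Because the token for an entity never changes within a map, a scrubbed prompt prefix stays byte-identical across calls, which is what makes it cache-stable.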

## Cutover

Point your `SCRUB_BACKEND=gateway` client at `https://<spark-control-host>/scrub` and
`/rehydrate` (same TLS-skip / Root-CA story as the other endpoints). The request and
response shapes match your in-repo module, so agents cut over with no app changes.

## Honest caveat (unchanged from your design)

The NER pass is the probabilistic layer — it will not catch every free-text or
descriptive re-identifier. The strong defenses remain: **minimize-first** (does Claude
need the record content at all?), the deterministic dictionary + rules, and the
re-identification spot-check. Treat the gateway as the enforcement *point*, not a
guarantee that any text is safe to send.