Files
ten31-database/docs/spark-control-scrub-endpoints.md
Keysat 2e70b34592 Architect grounding boundary: redaction/re-hydration privacy gate (v0.1.0:55)
Phase 1 Workstream D. Lets the Architect ground the thesis in REAL recurring LP
objections without any LP identity reaching the Claude API. Layered, defense-in-depth,
fail-closed by construction (docs/redaction-rehydration.md).

backend/redaction/:
- scrub.py: the leak-proof core. Drops Tier-1 (labelled/structured account/wire/SSN/
  IBAN/SWIFT/passport, separator-tolerant); tokenizes known LP entities (dictionary from
  the canonical layer, unicode-folded + hyphen-extended) and structured PII (emails,
  scheme-less/social URLs, intl+ext phones, currency-cued amounts, ISO/worded/numeric/
  quarter dates, addresses, bare long digit runs); pre-neutralizes injected [TYPE_N]
  strings; single-pass rehydrate; metadata-only audit logging (the pseudonym map is the
  de-anon key — local-only, never logged/sent). Hardened across THREE adversarial
  leak-hunts (worded/coded amounts, intl phones, NFD/ligature/zero-width names, slash/
  comma SSN, SWIFT, alpha-prefixed accounts, substance-preserving false-positive fixes).
- client.py: Boundary — one scrub/rehydrate contract, SCRUB_BACKEND=local (default) or
  gateway (Spark Control /scrub + /rehydrate). Fails closed (db_path required; dictionary
  build errors propagate; strict rehydrate returns tokenized-not-de-anon text).
- test_scrub_leak.py, test_reidentification.py: golden-file leak + re-identification
  suites (synthetic only, guardrail #9), regression-locking every leak-hunt vector.

backend/mcp/architect_grounding.py: the flow — retrieve (local) -> minimize-first
(local Qwen) -> scrub (+ local-Qwen NER backstop for unknown names) -> Claude over the
de-identified register only -> re-hydrate locally -> human review. FAILS CLOSED if the
local model is unreachable or a hallucinated token appears. test_grounding_boundary.py
proves nothing sensitive reaches Claude and the three fail-closed paths.

server.py: POST /api/architect/ground (admin) wires retrieval -> ground_objections.
docker_entrypoint.sh: SCRUB_BACKEND (default local). docs/spark-control-scrub-endpoints.md:
the gateway handover spec (Option 1 — caller supplies the entity dictionary).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 17:06:29 -05:00

176 lines
10 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Spark Control — `/scrub` + `/rehydrate` endpoints (handover prompt)
*Hand this to the Spark Control developer to build the gateway redaction endpoints.
**OPTIONAL for Phase 1** — the Architect's grounding boundary already ships in our app
repo (the `backend/redaction/` module, deterministic scrub + golden-file leak tests).
Build this when we move to **multi-agent** enforcement (Phase 2 Analyst, Phase 3 Closer),
where we want ONE bypass-proof point with the pseudonym map living next to the local
models. Our in-repo module is the **reference implementation**; the contract below is what
our `SCRUB_BACKEND=gateway` client already calls, so aim for behavioral parity and we cut
agents over with zero app changes. Related: `docs/redaction-rehydration.md`.*
---
## Purpose
Add two gateway endpoints, `POST /scrub` and `POST /rehydrate`, that let any agent send
LP-specific context to the Claude API without exposing sovereign data. `/scrub`
de-identifies an agent's assembled context; the agent sends the de-identified text to
Claude; `/rehydrate` puts the real values back into Claude's response for human review.
The **pseudonym map** that links tokens to real values is the de-anonymization key — it
**must stay on the Spark, next to the local models, and never be sent or logged to any
third party.** This sits behind the same trusted Spark Control URL, TLS, and access
control that already front `/v1/chat/completions`, `/v1/embeddings`, `/v1/rerank`, and
`/api/search`. Spark Control does **not** call Claude here — it is the scrub and rehydrate
transform pair plus a server-held map.
## The three-tier classification (the rules `/scrub` enforces)
Classify every span into exactly one tier:
- **TIER-1 NEVER-SEND** — excluded entirely, not even tokenized: full LP list/export or
bulk relationship graph, raw account/wire/routing numbers, SSN/passport/gov-ID, anything
under a confidentiality obligation. Default: excise the span. `tier1_action="reject"`
makes the whole call fail-closed (422) if any Tier-1 is detected.
- **TIER-2 TOKENIZE** — stable placeholders, swapped back locally after Claude: person
names, org/fund names, emails, phones, addresses, exact $ amounts, identity-pinning
dates. Placeholders are `[TYPE_N]`: `[PERSON_1] [ORG_1] [FUND_1] [EMAIL_1] [PHONE_1]
[ADDR_1] [AMOUNT_1] [DATE_1] [LOC_1] [MISC_1]`. `N` is 1-based and **stable within a
task**: the same real entity maps to the same token across every item in the call and
across later `/scrub` calls reusing the same `map_handle`, so Claude can reason about
relationships (`[PERSON_1] introduced [PERSON_2] to [FUND_1]`).
- **SEND-AS-IS** — passed through untouched: the substance Claude needs (objections,
sentiment, generic deal mechanics, the drafted message body minus identifiers).
## The round-trip
`SCRUB` (your endpoint) → `REASON` (the agent calls Claude with placeholders only — your
gateway does NOT call Claude) → `RE-HYDRATE` (your endpoint) → human review (in our app).
## Request / response contracts
### `POST /scrub`
```json
{
"task_id": "string, required, caller-chosen, stable across the round-trip",
"actor": "string, agent name e.g. 'analyst' | 'closer' (for logging)",
"items": [ {"id": "ctx_1", "text": "..."} ],
"known_entities": { // CALLER-SUPPLIED dictionary (see "Dictionary" below)
"persons": ["..."], "orgs": ["..."], "funds": ["..."], "emails": ["..."]
},
"tier1_action": "drop | reject", // default "drop"
"bucket": {"amounts": false, "dates": false}, // default FALSE for grounding: tokenize reversibly (no magnitudes to Claude). bucket=true only when a caller genuinely needs coarse magnitude
"ner": "auto | rules_only | qwen", // default "auto"
"map_handle": "string, optional, reuse/extend an existing task map"
}
```
Response `200`:
```json
{
"task_id": "...",
"map_handle": "opaque server key to the map (NOT the map itself)",
"items": [ {"id":"ctx_1","scrubbed_text":"...","tokens_used":["PERSON_1","AMOUNT_1"]} ],
"stats": {"tier1_dropped": 2, "tier2_tokenized": 14, "distinct_entities": 9,
"descriptive_flags": [{"item":"ctx_1","span":"the family that sold the mining company in Texas","action":"redacted"}]},
"expires_at": "ISO-8601 map TTL (short-lived, e.g. 2h)"
}
```
`422 {"error":"tier1_detected","spans":[...]}` when `tier1_action="reject"` and Tier-1 found.
`400` on malformed input.
### `POST /rehydrate`
```json
{
"task_id": "string, required",
"map_handle": "string, required, from /scrub",
"items": [ {"id":"out_1","text":"...with [PERSON_1] tokens..."} ],
"actor": "string",
"strict": true // default true
}
```
Response `200`:
```json
{ "items": [ {"id":"out_1","rehydrated_text":"...real values..."} ],
"stats": {"tokens_substituted": 6, "unknown_tokens": []} }
```
`409 {"error":"unknown_tokens","tokens":["PERSON_9"]}` when `strict` and the text contains a
token with no map entry — this is your tripwire for a Claude-hallucinated/smuggled token; do
NOT silently pass it through. `410 {"error":"map_expired"}` if the map TTL lapsed.
## Dictionary: CALLER-SUPPLIED (Option 1 — decided)
The known-entity dictionary is supplied by the **caller** in each `/scrub` request
(`known_entities`), NOT read by Spark Control from the CRM. We chose this over giving the
gateway CRM access because it keeps Spark Control **generic and portable** (no coupling to
Ten31's schema, so the package still works for another dual-Spark user), needs **no CRM
credentials on the gateway** (least privilege — the LP list lives in one place), stays
**fresh** (built live from the CRM each call), and matches the reference `scrub()` signature.
Our app builds it with its `build_known_entities` (names + name-parts + email local-parts)
and sends it scoped to the request. Treat it as **sensitive** (a slice of the LP list): hold
it only transiently with the map, never log or forward it. Tokens are per-task (stable within
a `task_id`/`map_handle`), which is also the lower re-identification-risk choice. If per-call
payload ever becomes a perf issue, an optional gateway-side cached export (Option 3) is the
escape hatch — but do not build that coupling now.
## Local-Qwen NER (how to find Tier-2 entities), cheapest/most-authoritative first
1. **The caller-supplied `known_entities`.** Tokenize every supplied person/org/fund/email
surface deterministically (case- and unicode-fold insensitive, longest-match-first, with
hyphenated-surname extension). This is the bulk of the work and needs no model. (This is
the deterministic FLOOR — load-bearing but NOT complete; the NER pass below is required.)
2. **Rules** for structured PII: regexes for emails, phones, $ amounts, dates, and
SSN/account-number shapes (account/SSN shapes route to Tier-1).
3. **Local-Qwen NER pass** for residual named entities the map and rules miss (a person/org
in free text not yet in the CRM). Call the SAME local Qwen you already serve at
`/v1/chat/completions` (enable_thinking=false, temperature 0), with a strict JSON-only
extraction prompt returning `{entities:[{text,type,tier}]}`. Never a remote model — the
input is exactly the sensitive text we are keeping local. `ner="rules_only"` skips this;
`ner="qwen"` forces it; `auto` runs it only on spans unresolved by steps 12. Qwen should
also flag **descriptive** re-identifiers (e.g. "the family that sold the mining company in
Texas"), not just named entities.
## Map-stays-local (non-negotiable)
The pseudonym map `{token -> real_value}` is stored ONLY on the Spark, keyed by
`map_handle`, in memory or a short-lived local store, TTL-expired (default ~2h). Never
returned in full, never written to any log/metric/trace that leaves the box, never in any
Claude-bound payload. `/scrub` returns only the opaque `map_handle` and counts. If you add
`GET /scrub/map/{map_handle}` for same-box debugging, gate it behind the same auth and keep
it off by default.
## Logging
One row per call to our `interaction_log` (we give you a write path or ingest an emitted
event): `action='redaction.scrub' | 'redaction.rehydrate'`, `actor_type='agent'`,
`actor_id=<actor>`, `source='spark_control'`, `payload` = **COUNTS ONLY** (tiers dropped,
tokens by type, distinct_entities, descriptive_flags spans WITHOUT real values, model used).
The payload MUST NOT contain any real Tier-2 value, any Tier-1 content, or the map.
## Acceptance tests (must pass before agents route through it)
1. **Golden-file diff** — fixed inputs, recorded expected scrubbed output; assert NO Tier-1
string and NO real Tier-2 identifier appears in any `/scrub` response or Claude-bound
payload. (We will share our `backend/redaction/` fixtures + golden files.)
2. **Round-trip identity** — scrub → echo tokens → rehydrate reproduces the original real
values exactly; token stability holds across items and across a second `/scrub` reusing
the same `map_handle`.
3. **Re-identification spot-check** — feed ONLY the de-identified prompt to the local Qwen
and ask it to name the real people/orgs/amounts; anything it recovers (esp. descriptive
re-identifiers and amount+date+sector inference) is flagged and must be driven to zero or
escalated to Tier-1/bucketing.
4. **Map-leak assertion** — scan every response body, log row, and Claude-bound payload in
the suite; assert the map and all real values are absent.
5. **Strict-rehydrate tripwire** — a rehydrate input with an unmapped token returns 409.
6. **Fail-closed**`tier1_action="reject"` returns 422 on any Tier-1 input, emits nothing.
## Notes
- This does NOT replace **minimization**: the agent must first ask "does Claude need this
record content at all?" — often a local retrieval summary suffices. These endpoints are
for when the answer is genuinely yes.
- Keep placeholders **cache-stable** (deterministic token assignment per task_id/map_handle)
so they compose with prompt caching on the Claude side.
- We will hand you our in-repo `redaction` module + golden files as the reference; aim for
behavioral parity first, then we cut agents over to the gateway as the single enforcement
point.