Phase 1 Workstream D. Lets the Architect ground the thesis in REAL recurring LP objections without any LP identity reaching the Claude API. Layered, defense-in-depth, fail-closed by construction (docs/redaction-rehydration.md). backend/redaction/: - scrub.py: the leak-proof core. Drops Tier-1 (labelled/structured account/wire/SSN/ IBAN/SWIFT/passport, separator-tolerant); tokenizes known LP entities (dictionary from the canonical layer, unicode-folded + hyphen-extended) and structured PII (emails, scheme-less/social URLs, intl+ext phones, currency-cued amounts, ISO/worded/numeric/ quarter dates, addresses, bare long digit runs); pre-neutralizes injected [TYPE_N] strings; single-pass rehydrate; metadata-only audit logging (the pseudonym map is the de-anon key — local-only, never logged/sent). Hardened across THREE adversarial leak-hunts (worded/coded amounts, intl phones, NFD/ligature/zero-width names, slash/ comma SSN, SWIFT, alpha-prefixed accounts, substance-preserving false-positive fixes). - client.py: Boundary — one scrub/rehydrate contract, SCRUB_BACKEND=local (default) or gateway (Spark Control /scrub + /rehydrate). Fails closed (db_path required; dictionary build errors propagate; strict rehydrate returns tokenized-not-de-anon text). - test_scrub_leak.py, test_reidentification.py: golden-file leak + re-identification suites (synthetic only, guardrail #9), regression-locking every leak-hunt vector. backend/mcp/architect_grounding.py: the flow — retrieve (local) -> minimize-first (local Qwen) -> scrub (+ local-Qwen NER backstop for unknown names) -> Claude over the de-identified register only -> re-hydrate locally -> human review. FAILS CLOSED if the local model is unreachable or a hallucinated token appears. test_grounding_boundary.py proves nothing sensitive reaches Claude and the three fail-closed paths. server.py: POST /api/architect/ground (admin) wires retrieval -> ground_objections. docker_entrypoint.sh: SCRUB_BACKEND (default local). docs/spark-control-scrub-endpoints.md: the gateway handover spec (Option 1 — caller supplies the entity dictionary). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
10 KiB
Spark Control — /scrub + /rehydrate endpoints (handover prompt)
Hand this to the Spark Control developer to build the gateway redaction endpoints.
OPTIONAL for Phase 1 — the Architect's grounding boundary already ships in our app
repo (the backend/redaction/ module, deterministic scrub + golden-file leak tests).
Build this when we move to multi-agent enforcement (Phase 2 Analyst, Phase 3 Closer),
where we want ONE bypass-proof point with the pseudonym map living next to the local
models. Our in-repo module is the reference implementation; the contract below is what
our SCRUB_BACKEND=gateway client already calls, so aim for behavioral parity and we cut
agents over with zero app changes. Related: docs/redaction-rehydration.md.
Purpose
Add two gateway endpoints, POST /scrub and POST /rehydrate, that let any agent send
LP-specific context to the Claude API without exposing sovereign data. /scrub
de-identifies an agent's assembled context; the agent sends the de-identified text to
Claude; /rehydrate puts the real values back into Claude's response for human review.
The pseudonym map that links tokens to real values is the de-anonymization key — it
must stay on the Spark, next to the local models, and never be sent or logged to any
third party. This sits behind the same trusted Spark Control URL, TLS, and access
control that already front /v1/chat/completions, /v1/embeddings, /v1/rerank, and
/api/search. Spark Control does not call Claude here — it is the scrub and rehydrate
transform pair plus a server-held map.
The three-tier classification (the rules /scrub enforces)
Classify every span into exactly one tier:
- TIER-1 NEVER-SEND — excluded entirely, not even tokenized: full LP list/export or
bulk relationship graph, raw account/wire/routing numbers, SSN/passport/gov-ID, anything
under a confidentiality obligation. Default: excise the span.
tier1_action="reject"makes the whole call fail-closed (422) if any Tier-1 is detected. - TIER-2 TOKENIZE — stable placeholders, swapped back locally after Claude: person
names, org/fund names, emails, phones, addresses, exact $ amounts, identity-pinning
dates. Placeholders are
[TYPE_N]:[PERSON_1] [ORG_1] [FUND_1] [EMAIL_1] [PHONE_1] [ADDR_1] [AMOUNT_1] [DATE_1] [LOC_1] [MISC_1].Nis 1-based and stable within a task: the same real entity maps to the same token across every item in the call and across later/scrubcalls reusing the samemap_handle, so Claude can reason about relationships ([PERSON_1] introduced [PERSON_2] to [FUND_1]). - SEND-AS-IS — passed through untouched: the substance Claude needs (objections, sentiment, generic deal mechanics, the drafted message body minus identifiers).
The round-trip
SCRUB (your endpoint) → REASON (the agent calls Claude with placeholders only — your
gateway does NOT call Claude) → RE-HYDRATE (your endpoint) → human review (in our app).
Request / response contracts
POST /scrub
{
"task_id": "string, required, caller-chosen, stable across the round-trip",
"actor": "string, agent name e.g. 'analyst' | 'closer' (for logging)",
"items": [ {"id": "ctx_1", "text": "..."} ],
"known_entities": { // CALLER-SUPPLIED dictionary (see "Dictionary" below)
"persons": ["..."], "orgs": ["..."], "funds": ["..."], "emails": ["..."]
},
"tier1_action": "drop | reject", // default "drop"
"bucket": {"amounts": false, "dates": false}, // default FALSE for grounding: tokenize reversibly (no magnitudes to Claude). bucket=true only when a caller genuinely needs coarse magnitude
"ner": "auto | rules_only | qwen", // default "auto"
"map_handle": "string, optional, reuse/extend an existing task map"
}
Response 200:
{
"task_id": "...",
"map_handle": "opaque server key to the map (NOT the map itself)",
"items": [ {"id":"ctx_1","scrubbed_text":"...","tokens_used":["PERSON_1","AMOUNT_1"]} ],
"stats": {"tier1_dropped": 2, "tier2_tokenized": 14, "distinct_entities": 9,
"descriptive_flags": [{"item":"ctx_1","span":"the family that sold the mining company in Texas","action":"redacted"}]},
"expires_at": "ISO-8601 map TTL (short-lived, e.g. 2h)"
}
422 {"error":"tier1_detected","spans":[...]} when tier1_action="reject" and Tier-1 found.
400 on malformed input.
POST /rehydrate
{
"task_id": "string, required",
"map_handle": "string, required, from /scrub",
"items": [ {"id":"out_1","text":"...with [PERSON_1] tokens..."} ],
"actor": "string",
"strict": true // default true
}
Response 200:
{ "items": [ {"id":"out_1","rehydrated_text":"...real values..."} ],
"stats": {"tokens_substituted": 6, "unknown_tokens": []} }
409 {"error":"unknown_tokens","tokens":["PERSON_9"]} when strict and the text contains a
token with no map entry — this is your tripwire for a Claude-hallucinated/smuggled token; do
NOT silently pass it through. 410 {"error":"map_expired"} if the map TTL lapsed.
Dictionary: CALLER-SUPPLIED (Option 1 — decided)
The known-entity dictionary is supplied by the caller in each /scrub request
(known_entities), NOT read by Spark Control from the CRM. We chose this over giving the
gateway CRM access because it keeps Spark Control generic and portable (no coupling to
Ten31's schema, so the package still works for another dual-Spark user), needs no CRM
credentials on the gateway (least privilege — the LP list lives in one place), stays
fresh (built live from the CRM each call), and matches the reference scrub() signature.
Our app builds it with its build_known_entities (names + name-parts + email local-parts)
and sends it scoped to the request. Treat it as sensitive (a slice of the LP list): hold
it only transiently with the map, never log or forward it. Tokens are per-task (stable within
a task_id/map_handle), which is also the lower re-identification-risk choice. If per-call
payload ever becomes a perf issue, an optional gateway-side cached export (Option 3) is the
escape hatch — but do not build that coupling now.
Local-Qwen NER (how to find Tier-2 entities), cheapest/most-authoritative first
- The caller-supplied
known_entities. Tokenize every supplied person/org/fund/email surface deterministically (case- and unicode-fold insensitive, longest-match-first, with hyphenated-surname extension). This is the bulk of the work and needs no model. (This is the deterministic FLOOR — load-bearing but NOT complete; the NER pass below is required.) - Rules for structured PII: regexes for emails, phones, $ amounts, dates, and SSN/account-number shapes (account/SSN shapes route to Tier-1).
- Local-Qwen NER pass for residual named entities the map and rules miss (a person/org
in free text not yet in the CRM). Call the SAME local Qwen you already serve at
/v1/chat/completions(enable_thinking=false, temperature 0), with a strict JSON-only extraction prompt returning{entities:[{text,type,tier}]}. Never a remote model — the input is exactly the sensitive text we are keeping local.ner="rules_only"skips this;ner="qwen"forces it;autoruns it only on spans unresolved by steps 1–2. Qwen should also flag descriptive re-identifiers (e.g. "the family that sold the mining company in Texas"), not just named entities.
Map-stays-local (non-negotiable)
The pseudonym map {token -> real_value} is stored ONLY on the Spark, keyed by
map_handle, in memory or a short-lived local store, TTL-expired (default ~2h). Never
returned in full, never written to any log/metric/trace that leaves the box, never in any
Claude-bound payload. /scrub returns only the opaque map_handle and counts. If you add
GET /scrub/map/{map_handle} for same-box debugging, gate it behind the same auth and keep
it off by default.
Logging
One row per call to our interaction_log (we give you a write path or ingest an emitted
event): action='redaction.scrub' | 'redaction.rehydrate', actor_type='agent',
actor_id=<actor>, source='spark_control', payload = COUNTS ONLY (tiers dropped,
tokens by type, distinct_entities, descriptive_flags spans WITHOUT real values, model used).
The payload MUST NOT contain any real Tier-2 value, any Tier-1 content, or the map.
Acceptance tests (must pass before agents route through it)
- Golden-file diff — fixed inputs, recorded expected scrubbed output; assert NO Tier-1
string and NO real Tier-2 identifier appears in any
/scrubresponse or Claude-bound payload. (We will share ourbackend/redaction/fixtures + golden files.) - Round-trip identity — scrub → echo tokens → rehydrate reproduces the original real
values exactly; token stability holds across items and across a second
/scrubreusing the samemap_handle. - Re-identification spot-check — feed ONLY the de-identified prompt to the local Qwen and ask it to name the real people/orgs/amounts; anything it recovers (esp. descriptive re-identifiers and amount+date+sector inference) is flagged and must be driven to zero or escalated to Tier-1/bucketing.
- Map-leak assertion — scan every response body, log row, and Claude-bound payload in the suite; assert the map and all real values are absent.
- Strict-rehydrate tripwire — a rehydrate input with an unmapped token returns 409.
- Fail-closed —
tier1_action="reject"returns 422 on any Tier-1 input, emits nothing.
Notes
- This does NOT replace minimization: the agent must first ask "does Claude need this record content at all?" — often a local retrieval summary suffices. These endpoints are for when the answer is genuinely yes.
- Keep placeholders cache-stable (deterministic token assignment per task_id/map_handle) so they compose with prompt caching on the Claude side.
- We will hand you our in-repo
redactionmodule + golden files as the reference; aim for behavioral parity first, then we cut agents over to the gateway as the single enforcement point.