Keysat 2e70b34592 Architect grounding boundary: redaction/re-hydration privacy gate (v0.1.0:55)

Phase 1 Workstream D. Lets the Architect ground the thesis in REAL recurring LP
objections without any LP identity reaching the Claude API. Layered, defense-in-depth,
fail-closed by construction (docs/redaction-rehydration.md).

backend/redaction/:
- scrub.py: the leak-proof core. Drops Tier-1 (labelled/structured account/wire/SSN/
  IBAN/SWIFT/passport, separator-tolerant); tokenizes known LP entities (dictionary from
  the canonical layer, unicode-folded + hyphen-extended) and structured PII (emails,
  scheme-less/social URLs, intl+ext phones, currency-cued amounts, ISO/worded/numeric/
  quarter dates, addresses, bare long digit runs); pre-neutralizes injected [TYPE_N]
  strings; single-pass rehydrate; metadata-only audit logging (the pseudonym map is the
  de-anon key — local-only, never logged/sent). Hardened across THREE adversarial
  leak-hunts (worded/coded amounts, intl phones, NFD/ligature/zero-width names, slash/
  comma SSN, SWIFT, alpha-prefixed accounts, substance-preserving false-positive fixes).
- client.py: Boundary — one scrub/rehydrate contract, SCRUB_BACKEND=local (default) or
  gateway (Spark Control /scrub + /rehydrate). Fails closed (db_path required; dictionary
  build errors propagate; strict rehydrate returns tokenized-not-de-anon text).
- test_scrub_leak.py, test_reidentification.py: golden-file leak + re-identification
  suites (synthetic only, guardrail #9), regression-locking every leak-hunt vector.

backend/mcp/architect_grounding.py: the flow — retrieve (local) -> minimize-first
(local Qwen) -> scrub (+ local-Qwen NER backstop for unknown names) -> Claude over the
de-identified register only -> re-hydrate locally -> human review. FAILS CLOSED if the
local model is unreachable or a hallucinated token appears. test_grounding_boundary.py
proves nothing sensitive reaches Claude and the three fail-closed paths.

server.py: POST /api/architect/ground (admin) wires retrieval -> ground_objections.
docker_entrypoint.sh: SCRUB_BACKEND (default local). docs/spark-control-scrub-endpoints.md:
the gateway handover spec (Option 1 — caller supplies the entity dictionary).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-05 17:06:29 -05:00

Spark Control — `/scrub` + `/rehydrate` endpoints (handover prompt)

Hand this to the Spark Control developer to build the gateway redaction endpoints. OPTIONAL for Phase 1 — the Architect's grounding boundary already ships in our app repo (the backend/redaction/ module, deterministic scrub + golden-file leak tests). Build this when we move to multi-agent enforcement (Phase 2 Analyst, Phase 3 Closer), where we want ONE bypass-proof point with the pseudonym map living next to the local models. Our in-repo module is the reference implementation; the contract below is what our SCRUB_BACKEND=gateway client already calls, so aim for behavioral parity and we cut agents over with zero app changes. Related: docs/redaction-rehydration.md.

Purpose

Add two gateway endpoints, POST /scrub and POST /rehydrate, that let any agent send LP-specific context to the Claude API without exposing sovereign data. /scrub de-identifies an agent's assembled context; the agent sends the de-identified text to Claude; /rehydrate puts the real values back into Claude's response for human review. The pseudonym map that links tokens to real values is the de-anonymization key — it must stay on the Spark, next to the local models, and never be sent or logged to any third party. This sits behind the same trusted Spark Control URL, TLS, and access control that already front /v1/chat/completions, /v1/embeddings, /v1/rerank, and /api/search. Spark Control does not call Claude here — it is the scrub and rehydrate transform pair plus a server-held map.

The three-tier classification (the rules `/scrub` enforces)

Classify every span into exactly one tier:

- TIER-1 NEVER-SEND: excluded entirely, not even tokenized: full LP list/export or bulk relationship graph, raw account/wire/routing numbers, SSN/passport/gov-ID, anything under a confidentiality obligation. Default (tier1_action="drop"): excise the span. tier1_action="reject" makes the whole call fail closed (422) if any Tier-1 content is detected.
- TIER-2 TOKENIZE: stable placeholders, swapped back locally after Claude: person names, org/fund names, emails, phones, addresses, exact $ amounts, identity-pinning dates. Placeholders are [TYPE_N]: [PERSON_1] [ORG_1] [FUND_1] [EMAIL_1] [PHONE_1] [ADDR_1] [AMOUNT_1] [DATE_1] [LOC_1] [MISC_1]. N is 1-based and stable within a task: the same real entity maps to the same token across every item in the call and across later /scrub calls reusing the same map_handle, so Claude can reason about relationships ([PERSON_1] introduced [PERSON_2] to [FUND_1]).
- SEND-AS-IS: passed through untouched: the substance Claude needs (objections, sentiment, generic deal mechanics, the drafted message body minus identifiers).
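The Tier-2 stability rule (same real entity, same token, 1-based N per type) can be illustrated with a minimal sketch. This is not the reference scrub.py: the class name `TokenMap`, its methods, and the NFKC-plus-casefold folding are assumptions made for illustration.

```python
import unicodedata


class TokenMap:
    """Hypothetical per-task pseudonym map: one real entity -> one [TYPE_N] token."""

    def __init__(self):
        self._by_key = {}    # (type, folded surface) -> token
        self._real = {}      # token -> real value; this is the part that must stay local
        self._counters = {}  # type -> highest N issued so far

    @staticmethod
    def _fold(surface):
        # Assumed folding: NFKC normalization plus casefold, so case variants share a token.
        return unicodedata.normalize("NFKC", surface).casefold().strip()

    def token_for(self, entity_type, surface):
        key = (entity_type, self._fold(surface))
        if key not in self._by_key:
            n = self._counters.get(entity_type, 0) + 1  # N is 1-based per type
            self._counters[entity_type] = n
            token = f"[{entity_type}_{n}]"
            self._by_key[key] = token
            self._real[token] = surface
        return self._by_key[key]


m = TokenMap()
a = m.token_for("PERSON", "Alice Example")
b = m.token_for("PERSON", "Bob Sample")
c = m.token_for("PERSON", "alice example")  # same entity, different casing
```

Reusing one `TokenMap` instance across items and across later calls is what the map_handle reuse in the contract would correspond to in this sketch.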

The round-trip

SCRUB (your endpoint) → REASON (the agent calls Claude with placeholders only — your gateway does NOT call Claude) → RE-HYDRATE (your endpoint) → human review (in our app).

Request / response contracts

`POST /scrub`

{
  "task_id": "string, required, caller-chosen, stable across the round-trip",
  "actor": "string, agent name e.g. 'analyst' | 'closer' (for logging)",
  "items": [ {"id": "ctx_1", "text": "..."} ],
  "known_entities": {                         // CALLER-SUPPLIED dictionary (see "Dictionary" below)
    "persons": ["..."], "orgs": ["..."], "funds": ["..."], "emails": ["..."]
  },
  "tier1_action": "drop | reject",            // default "drop"
  "bucket": {"amounts": false, "dates": false}, // default FALSE for grounding: tokenize reversibly (no magnitudes to Claude). bucket=true only when a caller genuinely needs coarse magnitude
  "ner": "auto | rules_only | qwen",          // default "auto"
  "map_handle": "string, optional, reuse/extend an existing task map"
}

Response 200:

{
  "task_id": "...",
  "map_handle": "opaque server key to the map (NOT the map itself)",
  "items": [ {"id":"ctx_1","scrubbed_text":"...","tokens_used":["PERSON_1","AMOUNT_1"]} ],
  "stats": {"tier1_dropped": 2, "tier2_tokenized": 14, "distinct_entities": 9,
            "descriptive_flags": [{"item":"ctx_1","span":"the family that sold the mining company in Texas","action":"redacted"}]},
  "expires_at": "ISO-8601 map TTL (short-lived, e.g. 2h)"
}

Returns 422 {"error":"tier1_detected","spans":[...]} when tier1_action="reject" and Tier-1 content is found. Returns 400 on malformed input.
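A caller-side sketch of assembling and sending a /scrub request, using only the standard library. The helper names `build_scrub_request` and `post_scrub`, the `base_url` parameter, and the absence of auth headers are assumptions; the real SCRUB_BACKEND=gateway client in client.py may differ.

```python
import json
import urllib.request

VALID_TIER1 = {"drop", "reject"}
VALID_NER = {"auto", "rules_only", "qwen"}


def build_scrub_request(task_id, actor, items, known_entities,
                        tier1_action="drop", ner="auto", map_handle=None):
    """Build the /scrub body from (id, text) pairs; reject values outside the contract."""
    if tier1_action not in VALID_TIER1:
        raise ValueError(f"tier1_action must be one of {sorted(VALID_TIER1)}")
    if ner not in VALID_NER:
        raise ValueError(f"ner must be one of {sorted(VALID_NER)}")
    payload = {
        "task_id": task_id,
        "actor": actor,
        "items": [{"id": item_id, "text": text} for item_id, text in items],
        "known_entities": known_entities,
        "tier1_action": tier1_action,
        "bucket": {"amounts": False, "dates": False},  # contract default for grounding
        "ner": ner,
    }
    if map_handle is not None:
        payload["map_handle"] = map_handle  # reuse/extend an existing task map
    return payload


def post_scrub(base_url, payload, timeout=30):
    """POST the body. urlopen raises HTTPError on 4xx/5xx, so callers fail closed."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/scrub",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Letting the HTTP error propagate, rather than falling back to the unscrubbed text, is what keeps the caller fail-closed on 422 and 400.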

`POST /rehydrate`

{
  "task_id": "string, required",
  "map_handle": "string, required, from /scrub",
  "items": [ {"id":"out_1","text":"...with [PERSON_1] tokens..."} ],
  "actor": "string",
  "strict": true   // default true
}

Response 200:

{ "items": [ {"id":"out_1","rehydrated_text":"...real values..."} ],
  "stats": {"tokens_substituted": 6, "unknown_tokens": []} }

409 {"error":"unknown_tokens","tokens":["PERSON_9"]} when strict and the text contains a token with no map entry — this is your tripwire for a Claude-hallucinated/smuggled token; do NOT silently pass it through. 410 {"error":"map_expired"} if the map TTL lapsed.
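The strict tripwire can be sketched as a single-pass substitution that collects unmapped tokens instead of passing them through silently. The names `rehydrate`, `UnknownTokens`, and `TOKEN_RE` are hypothetical, and the non-strict behaviour shown (leave unknown tokens in place and report them) is an assumption; a server would translate the exception into the 409 response.

```python
import re

# Matches placeholders such as [PERSON_1] or [AMOUNT_12].
TOKEN_RE = re.compile(r"\[([A-Z]+_\d+)\]")


class UnknownTokens(Exception):
    """Raised in strict mode; maps to HTTP 409 in this sketch."""

    def __init__(self, tokens):
        super().__init__(f"unknown tokens: {tokens}")
        self.tokens = tokens


def rehydrate(text, token_map, strict=True):
    """Swap tokens for real values in one pass; never rescan substituted text."""
    unknown = []

    def swap(match):
        name = match.group(1)
        if name in token_map:
            return token_map[name]
        if name not in unknown:
            unknown.append(name)
        return match.group(0)  # leave the unmapped token visible

    out = TOKEN_RE.sub(swap, text)
    if strict and unknown:
        raise UnknownTokens(unknown)
    return out, unknown


token_map = {"PERSON_1": "Alice Example", "FUND_1": "Fund A"}
ok_text, ok_unknown = rehydrate("[PERSON_1] asked about [FUND_1]", token_map)
```

Because `re.sub` does not rescan replacement text, a real value that happens to look like a token is not substituted a second time.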

Dictionary: CALLER-SUPPLIED (Option 1 — decided)

The known-entity dictionary is supplied by the caller in each /scrub request (known_entities), NOT read by Spark Control from the CRM. We chose this over giving the gateway CRM access because it keeps Spark Control generic and portable (no coupling to Ten31's schema, so the package still works for another dual-Spark user), needs no CRM credentials on the gateway (least privilege — the LP list lives in one place), stays fresh (built live from the CRM each call), and matches the reference scrub() signature. Our app builds it with its build_known_entities (names + name-parts + email local-parts) and sends it scoped to the request. Treat it as sensitive (a slice of the LP list): hold it only transiently with the map, never log or forward it. Tokens are per-task (stable within a task_id/map_handle), which is also the lower re-identification-risk choice. If per-call payload ever becomes a perf issue, an optional gateway-side cached export (Option 3) is the escape hatch — but do not build that coupling now.
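A rough sketch of how a caller might assemble `known_entities` from CRM rows (names, name-parts, email local-parts). This is not the app's build_known_entities, whose signature is not shown here: the function name `sketch_known_entities`, the contact dict shape, the minimum name-part length, and placing email local-parts under "emails" are all assumptions.

```python
def sketch_known_entities(contacts, orgs=(), funds=(), min_part_len=3):
    """Build a per-request dictionary from contact rows like {"name": ..., "email": ...}."""
    persons, emails = set(), set()
    for contact in contacts:
        name = contact.get("name", "").strip()
        if name:
            persons.add(name)
            parts = name.split()
            if len(parts) > 1:
                # Name-parts catch "Alice" alone; very short parts are skipped (assumed rule).
                persons.update(p for p in parts if len(p) >= min_part_len)
        email = contact.get("email", "").strip()
        if email:
            emails.add(email)
            local = email.split("@", 1)[0]
            if local:
                emails.add(local)  # local-part often doubles as a handle
    return {
        "persons": sorted(persons),
        "orgs": sorted(set(orgs)),
        "funds": sorted(set(funds)),
        "emails": sorted(emails),
    }


entities = sketch_known_entities(
    [{"name": "Alice Example", "email": "alice.example@example.com"}],
    orgs=["Example Capital"],
)
```

The result is itself a slice of the LP list, so in this sketch it should be built per request and discarded afterwards, never cached in logs.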

Local-Qwen NER (how to find Tier-2 entities), cheapest/most-authoritative first

1. The caller-supplied known_entities. Tokenize every supplied person/org/fund/email surface deterministically (case- and unicode-fold insensitive, longest-match-first, with hyphenated-surname extension). This is the bulk of the work and needs no model. (This is the deterministic FLOOR: load-bearing but NOT complete; the NER pass below is required.)
2. Rules for structured PII: regexes for emails, phones, $ amounts, dates, and SSN/account-number shapes (account/SSN shapes route to Tier-1).
3. Local-Qwen NER pass for residual named entities the map and rules miss (a person/org in free text not yet in the CRM). Call the SAME local Qwen you already serve at /v1/chat/completions (enable_thinking=false, temperature 0), with a strict JSON-only extraction prompt returning {entities:[{text,type,tier}]}. Never a remote model: the input is exactly the sensitive text we are keeping local. ner="rules_only" skips this; ner="qwen" forces it; ner="auto" runs it only on spans unresolved by steps 1–2. Qwen should also flag descriptive re-identifiers (e.g. "the family that sold the mining company in Texas"), not just named entities.
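The deterministic dictionary step can be sketched with a single case-insensitive regex whose alternatives are ordered longest first. This is a simplified illustration, not the reference implementation: `tokenize_known` is a hypothetical name, tokens are keyed by surface form rather than by resolved entity, and unicode folding and hyphenated-surname extension are omitted.

```python
import re


def tokenize_known(text, surfaces_by_type):
    """Replace known surfaces with [TYPE_N] tokens, longest match first."""
    surface_type = {}
    for etype, surfaces in surfaces_by_type.items():
        for surface in surfaces:
            if surface.strip():
                surface_type.setdefault(surface.casefold(), etype)
    if not surface_type:
        return text, {}

    # Longest alternatives first so a short surface never splits a longer one.
    ordered = sorted(surface_type, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(s) for s in ordered) + r")(?!\w)",
        re.IGNORECASE,
    )
    tokens, counters = {}, {}

    def swap(match):
        key = match.group(0).casefold()
        if key not in tokens:
            # Fallback type guards against rare case-mapping mismatches.
            etype = surface_type.get(key, "MISC")
            counters[etype] = counters.get(etype, 0) + 1
            tokens[key] = f"[{etype}_{counters[etype]}]"
        return tokens[key]

    return pattern.sub(swap, text), tokens


scrubbed, used = tokenize_known(
    "Example Capital Partners and example capital both declined.",
    {"ORG": ["Example Capital", "Example Capital Partners"]},
)
```

Without the length ordering, the first mention would come out as a token followed by the stray word "Partners", which is itself a partial leak.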

Map-stays-local (non-negotiable)

The pseudonym map {token -> real_value} is stored ONLY on the Spark, keyed by map_handle, in memory or a short-lived local store, TTL-expired (default ~2h). Never returned in full, never written to any log/metric/trace that leaves the box, never in any Claude-bound payload. /scrub returns only the opaque map_handle and counts. If you add GET /scrub/map/{map_handle} for same-box debugging, gate it behind the same auth and keep it off by default.
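One way to picture the server-held map is an in-memory store keyed by an opaque handle with a TTL check on every read. The class `MapStore`, the exception `MapExpired`, and the injectable clock are hypothetical; a real gateway would also need locking and periodic eviction.

```python
import secrets
import time


class MapExpired(Exception):
    """Maps to HTTP 410 in this sketch."""


class MapStore:
    """In-memory pseudonym maps keyed by an opaque handle, expired by TTL."""

    def __init__(self, ttl_seconds=2 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._maps = {}  # handle -> (expires_at, {token: real_value})

    def create(self):
        handle = secrets.token_urlsafe(32)  # unguessable; carries no map content
        self._maps[handle] = (self._clock() + self._ttl, {})
        return handle

    def get(self, handle):
        entry = self._maps.get(handle)
        if entry is None:
            raise KeyError("unknown map_handle")
        expires_at, mapping = entry
        if self._clock() >= expires_at:
            del self._maps[handle]  # drop real values as soon as the TTL lapses
            raise MapExpired("map TTL lapsed")
        return mapping


now = [0.0]
store = MapStore(ttl_seconds=10, clock=lambda: now[0])
handle = store.create()
store.get(handle)["PERSON_1"] = "Alice Example"
```

Only `handle` ever leaves the process in this design; the dictionary returned by `get` is used server-side for substitution and never serialized into a response.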

Logging

One row per call to our interaction_log (we give you a write path or ingest an emitted event): action='redaction.scrub' | 'redaction.rehydrate', actor_type='agent', actor_id=<actor>, source='spark_control', payload = COUNTS ONLY (tiers dropped, tokens by type, distinct_entities, descriptive_flags spans WITHOUT real values, model used). The payload MUST NOT contain any real Tier-2 value, any Tier-1 content, or the map.
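A counts-only log row might be assembled as below. The top-level fields follow the description above; the key names inside `payload` and the function name `scrub_log_row` are assumptions, since the exact interaction_log schema is not given here.

```python
from collections import Counter


def scrub_log_row(actor, tokens_used, tier1_dropped, model):
    """Build a log row from token NAMES only; real values never enter this function."""
    distinct = set(tokens_used)
    # "PERSON_2" -> "PERSON": count distinct tokens per type.
    by_type = Counter(token.rsplit("_", 1)[0] for token in distinct)
    return {
        "action": "redaction.scrub",
        "actor_type": "agent",
        "actor_id": actor,
        "source": "spark_control",
        "payload": {
            "tier1_dropped": tier1_dropped,
            "tokens_by_type": dict(by_type),
            "distinct_entities": len(distinct),
            "model": model,
        },
    }


row = scrub_log_row("analyst", ["PERSON_1", "AMOUNT_1", "PERSON_2", "PERSON_1"], 2, "rules_only")
```

Keeping the function's inputs limited to token names and counts means a logging bug cannot leak a real value, because none is in scope.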

Acceptance tests (must pass before agents route through it)

1. Golden-file diff: fixed inputs, recorded expected scrubbed output; assert NO Tier-1 string and NO real Tier-2 identifier appears in any /scrub response or Claude-bound payload. (We will share our backend/redaction/ fixtures + golden files.)
2. Round-trip identity: scrub → echo tokens → rehydrate reproduces the original real values exactly; token stability holds across items and across a second /scrub reusing the same map_handle.
3. Re-identification spot-check: feed ONLY the de-identified prompt to the local Qwen and ask it to name the real people/orgs/amounts; anything it recovers (esp. descriptive re-identifiers and amount+date+sector inference) is flagged and must be driven to zero or escalated to Tier-1/bucketing.
4. Map-leak assertion: scan every response body, log row, and Claude-bound payload in the suite; assert the map and all real values are absent.
5. Strict-rehydrate tripwire: a rehydrate input with an unmapped token returns 409.
6. Fail-closed: tier1_action="reject" returns 422 on any Tier-1 input, emits nothing.
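The map-leak assertion lends itself to a small helper that serializes every captured artifact and searches it for each synthetic real value. The name `find_leaks` and the artifact dict shape are hypothetical, and plain case-insensitive substring search is a simplification that would miss reformatted values (for example, a phone number with different separators).

```python
import json


def find_leaks(artifacts, real_values):
    """Return (artifact_name, value) for every real value found in any artifact."""
    leaks = []
    for name, artifact in artifacts.items():
        blob = artifact if isinstance(artifact, str) else json.dumps(artifact, ensure_ascii=False)
        folded = blob.casefold()
        for value in real_values:
            if value and value.casefold() in folded:
                leaks.append((name, value))
    return leaks


# Synthetic fixtures only: one clean response and one deliberately leaky log row.
artifacts = {
    "scrub_response": {"items": [{"id": "ctx_1", "scrubbed_text": "[PERSON_1] objected to fees"}]},
    "log_row": {"payload": {"note": "Alice Example objected to fees"}},
}
leaks = find_leaks(artifacts, ["Alice Example", "alice.example@example.com"])
```

A suite would assert that `find_leaks(...)` returns an empty list for every real response body, log row, and Claude-bound payload it captures.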

Notes

- This does NOT replace minimization: the agent must first ask "does Claude need this record content at all?"; often a local retrieval summary suffices. These endpoints are for when the answer is genuinely yes.
- Keep placeholders cache-stable (deterministic token assignment per task_id/map_handle) so they compose with prompt caching on the Claude side.
- We will hand you our in-repo redaction module + golden files as the reference; aim for behavioral parity first, then we cut agents over to the gateway as the single enforcement point.
