Architect grounding boundary: redaction/re-hydration privacy gate (v0.1.0:55)

Phase 1 Workstream D. Lets the Architect ground the thesis in REAL recurring LP
objections without any LP identity reaching the Claude API. Layered,
defense-in-depth, fail-closed by construction (docs/redaction-rehydration.md).

backend/redaction/:
- scrub.py: the leak-proof core. Drops Tier-1 (labelled/structured
  account/wire/SSN/IBAN/SWIFT/passport, separator-tolerant); tokenizes known LP
  entities (dictionary from the canonical layer, unicode-folded +
  hyphen-extended) and structured PII (emails, scheme-less/social URLs, intl+ext
  phones, currency-cued amounts, ISO/worded/numeric/quarter dates, addresses,
  bare long digit runs); pre-neutralizes injected [TYPE_N] strings; single-pass
  rehydrate; metadata-only audit logging (the pseudonym map is the de-anon key:
  local-only, never logged/sent). Hardened across THREE adversarial leak-hunts
  (worded/coded amounts, intl phones, NFD/ligature/zero-width names, slash/comma
  SSN, SWIFT, alpha-prefixed accounts, substance-preserving false-positive
  fixes).
- client.py: Boundary, one scrub/rehydrate contract, SCRUB_BACKEND=local
  (default) or gateway (Spark Control /scrub + /rehydrate). Fails closed
  (db_path required; dictionary build errors propagate; strict rehydrate returns
  tokenized-not-de-anon text).
- test_scrub_leak.py, test_reidentification.py: golden-file leak +
  re-identification suites (synthetic only, guardrail #9), regression-locking
  every leak-hunt vector.

backend/mcp/architect_grounding.py: the flow is retrieve (local) ->
minimize-first (local Qwen) -> scrub (+ local-Qwen NER backstop for unknown
names) -> Claude over the de-identified register only -> re-hydrate locally ->
human review. FAILS CLOSED if the local model is unreachable or a hallucinated
token appears. test_grounding_boundary.py proves nothing sensitive reaches
Claude and the three fail-closed paths.

server.py: POST /api/architect/ground (admin) wires retrieval -> ground_objections.
docker_entrypoint.sh: SCRUB_BACKEND (default local).
docs/spark-control-scrub-endpoints.md: the gateway handover spec (Option 1:
caller supplies the entity dictionary).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
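
The tokenize / pre-neutralize / single-pass rehydrate contract described for scrub.py can be illustrated with a minimal sketch. It is not the code from this commit: the names `tokenize`, `rehydrate`, `UnknownToken`, and the exact `[TYPE_N]` pattern are assumptions for illustration, and raising on an unknown token is only one way to fail closed (the message above says the real strict rehydrate returns the still-tokenized text instead).

```python
import re

# Hypothetical token shape, e.g. [PERSON_1]; the real scrub.py format may differ.
TOKEN_RE = re.compile(r"\[([A-Z]+)_(\d+)\]")


class UnknownToken(ValueError):
    """Raised when text to re-hydrate contains a token that was never issued."""


def tokenize(text, known_entities):
    """Return (scrubbed_text, pseudonym_map) for a {TYPE: [values]} dictionary."""
    # Pre-neutralize token-shaped strings already present in the input so they
    # can never be mistaken for tokens issued here.
    text = TOKEN_RE.sub(lambda m: f"({m.group(1)}_{m.group(2)})", text)
    type_of = {v.lower(): t for t, values in known_entities.items() for v in values}
    if not type_of:
        return text, {}
    # Longest values first so "Jonathan Reyes" wins over the bare surname.
    alternation = "|".join(re.escape(v) for v in sorted(type_of, key=len, reverse=True))
    pattern = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    token_for, pseudonyms, counters = {}, {}, {}

    def issue(match):
        key = match.group(0).lower()
        if key not in token_for:
            etype = type_of[key]
            counters[etype] = counters.get(etype, 0) + 1
            token_for[key] = f"[{etype}_{counters[etype]}]"
            # The map is the de-anonymization key: keep it local, never log it.
            pseudonyms[token_for[key]] = match.group(0)
        return token_for[key]

    return pattern.sub(issue, text), pseudonyms


def rehydrate(text, pseudonyms):
    """Single-pass re-hydration that refuses tokens missing from the map."""
    unknown = sorted({m.group(0) for m in TOKEN_RE.finditer(text)} - set(pseudonyms))
    if unknown:
        raise UnknownToken(f"refusing to re-hydrate, unknown tokens: {unknown}")
    # One regex pass: substituted values are never rescanned for tokens.
    return TOKEN_RE.sub(lambda m: pseudonyms[m.group(0)], text)
```

With `{"PERSON": ["Jonathan Reyes", "Reyes"]}` as the dictionary, the full name and the bare surname receive distinct tokens, an injected `[PERSON_7]` in the input is defused before tokenization, and a model reply that invents `[ORG_3]` is rejected rather than passed through.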
@@ -0,0 +1,133 @@
#!/usr/bin/env python3
"""Re-identification spot-check (redaction-rehydration.md §6) — OFFLINE + SYNTHETIC.

A deterministic approximation of "feed only the scrubbed prompt to a model and try to
recover who it is." Three probes:
  A. Exact/normalized leak gate (MUST PASS): re-scan the scrubbed payload for ANY known
     real value or Tier-1 string under normalization (case, punctuation, reversed
     'Last First', email local-part). Any hit = tokenizer miss = FAIL.
  B. Descriptive-identifier residual (BOUNDED): phrases that re-identify even with names
     tokenized ("the family that sold the mining company in Texas"). The deterministic
     scrubber is not expected to catch these (the on-infra local-Qwen pass + the
     minimize-first summary do); this probe MEASURES the residual and fails only if it
     EXCEEDS a committed ceiling, so leakage can't silently grow.
  C. Inference via bucketing: with bucket=True, no exact amount/identity-date survives,
     and a (amount-band, year, sector) tuple is not unique to one synthetic entity.

Run: cd backend && python3 redaction/test_reidentification.py
"""
import os
import re
import sys

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
import scrub as R  # noqa: E402

FAILS = []


def check(cond, msg):
    print((" PASS " if cond else " FAIL ") + msg)
    if not cond:
        FAILS.append(msg)


def _norm(s):
    return re.sub(r"[^a-z0-9]+", " ", s.lower()).strip()


# Probe A — known synthetic entities that must NOT be recoverable from the scrubbed text
KB = {
    "persons": ["Jonathan Reyes", "Marta Quine"],
    "orgs": ["Cedar Point Capital"],
    "emails": ["jon@cedarpoint.example"],
    "tier1": ["000123456789", "123-45-6789"],
}
RAW_A = ("Jonathan Reyes at Cedar Point Capital (jon@cedarpoint.example) is cooling; Reyes wants better "
         "terms. Marta Quine disagrees. acct 000123456789. Substance: fee and lock-up objections.")
KNOWN_A = {"persons": ["Jonathan Reyes", "Reyes", "Marta Quine", "Marta"],
           "orgs": ["Cedar Point Capital"], "funds": [], "emails": ["jon@cedarpoint.example"]}


def probe_a():
    print("\n[probe A — exact/normalized leak gate]")
    outbound, _, _ = R.scrub(RAW_A, known_entities=KNOWN_A, bucket=False)
    nout = _norm(outbound)
    hits = []
    for cat in ("persons", "orgs", "emails", "tier1"):
        for v in KB[cat]:
            variants = {_norm(v)}
            if " " in v and cat == "persons":
                a, b = v.split()[0], v.split()[-1]
                variants.add(_norm(f"{b} {a}"))  # reversed Last First
                variants.add(_norm(b))  # bare surname
            if "@" in v:
                variants.add(_norm(v.split("@")[0]))  # email local-part
            for var in variants:
                if var and var in nout:
                    hits.append((v, var))
    check(not hits, f"no known identifier recoverable from scrubbed text (hits={hits})")


# Probe B — descriptive re-identifiers (bounded residual)
DESCRIPTIVE = [
    "the family that sold the mining company in Texas",
    "the former CTO of a well-known payments unicorn",
    "the senator's brother who runs a family office",
]
RAW_B = "Notes: our contact is " + DESCRIPTIVE[0] + ". Another is " + DESCRIPTIVE[1] + ". A third is " + DESCRIPTIVE[2] + "."
RESIDUAL_CEILING = 3  # known residual the deterministic scrubber alone does not catch;
                      # the on-infra Qwen pass + minimize-first summary drive this toward 0.
                      # NOTE: 3 == len(DESCRIPTIVE), so this gate cannot fail until the
                      # ceiling is lowered or the fixture list grows.


def probe_b():
    print("\n[probe B — descriptive-identifier residual, bounded]")
    outbound, _, _ = R.scrub(RAW_B, known_entities={"persons": [], "orgs": [], "funds": [], "emails": []})
    surviving = [d for d in DESCRIPTIVE if d in outbound]
    for d in surviving:
        print(f" flagged residual (handled on-infra by Qwen/minimize-first): {d!r}")
    check(len(surviving) <= RESIDUAL_CEILING,
          f"descriptive residual within committed ceiling ({len(surviving)} <= {RESIDUAL_CEILING})")


# Probe C — bucketing destroys exact values + singling-out
ENTITIES_C = [
    {"name": "A", "amount": "$5,200,000", "date": "1986-02-10", "sector": "energy"},
    {"name": "B", "amount": "$4,800,000", "date": "1986-03-20", "sector": "energy"},  # same band/year/sector as A
    {"name": "C", "amount": "$25,000,000", "date": "1991-09-01", "sector": "bitcoin"},
]


def probe_c():
    print("\n[probe C — inference via bucketing]")
    raw = " ".join(f"Investor commits {e['amount']} on {e['date']} in {e['sector']}." for e in ENTITIES_C)
    outbound, _, _ = R.scrub(raw, known_entities={"persons": [], "orgs": [], "funds": [], "emails": []}, bucket=True)
    check(re.search(r"\$\s?\d[\d,]{2,}", outbound) is None, "no exact $ amount survives bucketing")
    check(re.search(r"\b(?:19|20)\d{2}-\d{2}-\d{2}\b", outbound) is None, "no exact date survives bucketing")
    # singling-out: the (amount-band, year, sector) tuple must not be unique to one entity
    tuples = {}
    for e in ENTITIES_C:
        band = R._bucket_amount(e["amount"])
        year = e["date"][:4]
        tuples.setdefault((band, year, e["sector"]), []).append(e["name"])
    unique = [k for k, v in tuples.items() if len(v) == 1]
    # A and B collapse to the same bucket-tuple; C is alone but that's an accepted single in this fixture
    check(any(len(v) > 1 for v in tuples.values()),
          f"bucketing collapses distinct entities into shared bands (tuples={tuples})")


def main():
    probe_a()
    probe_b()
    probe_c()
    print()
    if FAILS:
        print(f"FAILED ({len(FAILS)}):")
        for f in FAILS:
            print(f" - {f}")
        sys.exit(1)
    print("ALL PASS (re-identification spot-check)")


if __name__ == "__main__":
    main()
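
Probe A's `_norm` helper lowercases and treats every character outside `a-z0-9` as a separator, so an accented or zero-width-split spelling of a known name would not match its ASCII variant. The commit message lists NFD, ligature, and zero-width names among the leak-hunt vectors; a gate that wants to re-scan for those could fold Unicode first. The following is a rough sketch under that assumption. `fold` and `leaks` are hypothetical helpers, not functions from scrub.py.

```python
import re
import unicodedata

# Zero-width and BOM code points mapped to None so str.translate deletes them.
_ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def fold(s):
    """Fold a string to lowercase ASCII-ish words for leak comparison."""
    s = s.translate(_ZERO_WIDTH)                # drop zero-width characters
    s = unicodedata.normalize("NFKD", s)        # split accents, expand ligatures
    s = "".join(ch for ch in s if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", s.casefold()).strip()


def leaks(payload, identifiers):
    """Return the identifiers whose folded form appears in the folded payload."""
    haystack = fold(payload)
    return [v for v in identifiers if fold(v) and fold(v) in haystack]
```

Like the substring test in `probe_a`, this is deliberately over-eager: a short identifier can match inside a longer unrelated word, which is the safer direction for a gate that must fail on any doubt.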
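
Probe C only asserts that at least one (amount-band, year, sector) tuple is shared, and accepts entity C as a known single. A stricter singling-out check would list every tuple held by fewer than k entities. The sketch below shows one way to do that. `R._bucket_amount` is not shown in this diff, so `bucket_amount` and its band edges are illustrative assumptions, as is the helper name `singled_out`.

```python
from collections import defaultdict


def bucket_amount(amount):
    """Map a string like '$5,200,000' to a coarse band (edges are illustrative)."""
    value = float(amount.replace("$", "").replace(",", ""))
    bands = [
        (1_000_000, "<$1M"),
        (10_000_000, "$1M-$10M"),
        (100_000_000, "$10M-$100M"),
    ]
    for upper, label in bands:
        if value < upper:
            return label
    return ">=$100M"


def singled_out(records, k=2):
    """Return {tuple: names} for every quasi-identifier tuple shared by < k records."""
    groups = defaultdict(list)
    for r in records:
        key = (bucket_amount(r["amount"]), r["date"][:4], r["sector"])
        groups[key].append(r["name"])
    return {key: names for key, names in groups.items() if len(names) < k}


if __name__ == "__main__":
    fixture = [
        {"name": "A", "amount": "$5,200,000", "date": "1986-02-10", "sector": "energy"},
        {"name": "B", "amount": "$4,800,000", "date": "1986-03-20", "sector": "energy"},
        {"name": "C", "amount": "$25,000,000", "date": "1991-09-01", "sector": "bitcoin"},
    ]
    print(singled_out(fixture))
```

On the probe C fixture, A and B share a tuple and only C is reported, which matches the "accepted single" noted in the test's comment. A test could compare the result against an explicit allow-list so that any new single is a deliberate decision.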