Harden privacy boundary and asset serving (v0.1.0:74)

Fixes from the 2026-06-12 full-eval (P0 + two P1s); code-only, no schema change. Without these the "private CRM" premise was breachable on the LAN:

- P0: the /assets/ route joined the request path onto FRONTEND_DIR without normalizing '..' (get_path/urlparse pass it through), so an unauthenticated GET /assets/../../data/crm.db read any file the process could: the LP DB, the JWT signing secret (-> admin-token forgery), the Gmail key. Add a realpath containment check that 404s anything resolving outside FRONTEND_ROOT.
- P1: the LP-outreach drafter built its redaction Boundary with no ner_fn, so unknown people/firms in raw email bodies reached Claude in the clear. Pass the local-Qwen NER backstop (ner_fn=_ner_local), matching architect_grounding; fails closed via the existing scrub_unavailable path if the local model is down.
- P1: get-by-id handlers leaked soft-deleted records by direct ID. Add deleted_at IS NULL to every get-by-id path (contacts, organizations, opportunities, lp_profiles) and to the nested related-data sub-selects in the contact/opportunity detail payloads, matching the list-handler convention.

Bumps the package to v0.1.0:74 (utils.ts + versions/v0.1.0.74.ts + graph). Full report in EVALUATION.md; remaining P2/P3 triaged in AGENTS.md Current state.
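
For reference, a minimal sketch of the kind of realpath containment check the P0 fix describes. This is not the committed code: `resolve_asset` and its parameters are hypothetical names, and the `/assets/` prefix handling is an assumption about how the route strips its prefix. It only illustrates resolving the joined path and rejecting anything outside the frontend root.

```python
import os
from typing import Optional


def resolve_asset(frontend_root: str, request_path: str) -> Optional[str]:
    """Return the resolved asset path, or None if it escapes frontend_root.

    Hypothetical helper; the caller is assumed to answer 404 on None.
    """
    root = os.path.realpath(frontend_root)

    # Drop the route prefix and leading slashes so os.path.join cannot be
    # reset by an absolute path segment.
    prefix = "/assets/"
    relative = request_path[len(prefix):] if request_path.startswith(prefix) else request_path
    relative = relative.lstrip("/")

    # realpath collapses '..' segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(root, relative))

    try:
        # commonpath compares whole path components, so a sibling such as
        # /srv/frontend-evil is not mistaken for a child of /srv/frontend.
        inside = os.path.commonpath([root, candidate]) == root
    except ValueError:
        # Raised for paths on different drives (Windows); treat as outside.
        inside = False

    return candidate if inside else None
```

A caller would still need to confirm the returned path is a regular file before serving it; the check above only establishes containment.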
2026-06-12 17:44:27 -05:00
parent 1959c22e19
commit aec2b7775b
8 changed files with 148 additions and 23 deletions
@@ -23,4 +23,6 @@ Read this before editing anything that sends data to a Claude model — the reda

 Trace the data path: any field carrying LP substance must cross `Boundary` first. A new MCP tool that reads CRM rows and hands them to a model without scrubbing is a leak — add it to the redaction path and extend the leak tests in `backend/redaction/test_*.py`.

+A Claude path that sends **free-prose** LP content (email bodies, notes) must pass `ner_fn=_ner_local` to `Boundary` and **fail closed** if the local model is down — the dictionary+regex floor only tokenizes KNOWN CRM entities, so unknown people/firms in prose leak otherwise. See `backend/mcp/architect_grounding.py` (does it right) and `backend/mcp/outreach_agent.py`.
+
 See also `docs/redaction-rehydration.md` and `docs/spark-control-scrub-endpoints.md`.
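
The fail-closed rule added in this hunk can be illustrated with a small sketch. The real `Boundary` signature is not shown in this commit, so everything below (`scrub_prose`, `ScrubUnavailable`, the token format) is a hypothetical stand-in. It shows only the shape of the behaviour: known entities and NER-discovered entities are both tokenized, and a missing or failing NER function raises instead of falling back to the dictionary floor.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


class ScrubUnavailable(Exception):
    """Hypothetical error: free prose could not be scrubbed safely."""


def scrub_prose(
    text: str,
    known_entities: Iterable[str],
    ner_fn: Optional[Callable[[str], List[str]]] = None,
) -> Tuple[str, Dict[str, str]]:
    """Tokenize known and NER-discovered names; refuse to run without NER."""
    if ner_fn is None:
        # Free prose without an NER backstop would leak unknown names.
        raise ScrubUnavailable("free prose requires an NER backstop")
    try:
        discovered = ner_fn(text)
    except Exception as exc:
        # Fail closed: never send prose scrubbed by the dictionary alone.
        raise ScrubUnavailable("local NER model unavailable") from exc

    # Longest names first so a short name cannot split a longer one.
    names = sorted(set(known_entities) | set(discovered), key=len, reverse=True)
    mapping: Dict[str, str] = {}
    for index, name in enumerate(names):
        if name and name in text:
            token = f"[ENTITY_{index}]"
            text = text.replace(name, token)
            mapping[token] = name  # kept locally for later rehydration
    return text, mapping
```

In this sketch the caller catches `ScrubUnavailable` and aborts the model call, rather than retrying with a weaker scrub.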