Harden privacy boundary and asset serving (v0.1.0:74)

Fixes from the 2026-06-12 full-eval (P0 + two P1s); code-only, no schema
change. Without these the "private CRM" premise was breachable on the LAN:

- P0: the /assets/ route joined the request path onto FRONTEND_DIR without
  normalizing '..' (get_path/urlparse pass it through), so an unauthenticated
  GET /assets/../../data/crm.db read any file the process could — the LP DB,
  the JWT signing secret (-> admin-token forgery), the Gmail key. Add a realpath
  containment check that 404s anything resolving outside FRONTEND_ROOT.
- P1: the LP-outreach drafter built its redaction Boundary with no ner_fn, so
  unknown people/firms in raw email bodies reached Claude in the clear. Pass the
  local-Qwen NER backstop (ner_fn=_ner_local), matching architect_grounding;
  fails closed via the existing scrub_unavailable path if the local model is down.
- P1: get-by-id handlers leaked soft-deleted records by direct ID. Add
  deleted_at IS NULL to every get-by-id path — contacts, organizations,
  opportunities, lp_profiles — and to the nested related-data sub-selects in
  the contact/opportunity detail payloads, matching the list-handler convention.
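
A minimal sketch of the realpath containment check described in the P0 item, not the code from this commit. The helper name resolve_asset and the "/assets/" prefix handling are assumptions for illustration; only the idea (resolve, then refuse anything outside the frontend root) comes from the message above.

```python
import os


def resolve_asset(frontend_root, request_path):
    """Return the real path of a servable asset, or None if the caller should 404.

    Hypothetical helper: the name and signature are illustrative only.
    """
    root = os.path.realpath(frontend_root)
    prefix = "/assets/"
    relative = request_path[len(prefix):] if request_path.startswith(prefix) else request_path
    # Drop leading slashes so os.path.join cannot be reset by an absolute path.
    relative = relative.lstrip("/")
    # realpath collapses '..' segments and follows symlinks before we compare.
    candidate = os.path.realpath(os.path.join(root, relative))
    # commonpath compares whole components, so a sibling such as
    # "<root>-evil" is not mistaken for something inside root.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate if os.path.isfile(candidate) else None
```

Comparing with os.path.commonpath rather than a plain string prefix is a design choice in this sketch: a startswith() test would accept a sibling directory whose name merely begins with the root's name.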

Bumps the package to v0.1.0:74 (utils.ts + versions/v0.1.0.74.ts + graph).
Full report in EVALUATION.md; remaining P2/P3 triaged in AGENTS.md Current state.
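
For the soft-delete item above, a rough sketch of what a get-by-id query with the deleted_at IS NULL filter could look like. The table layout (a name column, a TEXT deleted_at) and the function name get_contact are assumptions; the real handlers and schema are not shown in this commit.

```python
import sqlite3


def get_contact(conn, contact_id):
    """Fetch one live contact; soft-deleted rows behave as if they do not exist.

    Hypothetical handler: column names other than deleted_at are illustrative.
    """
    return conn.execute(
        "SELECT id, name FROM contacts WHERE id = ? AND deleted_at IS NULL",
        (contact_id,),
    ).fetchone()  # None means the caller should answer 404


def demo_connection():
    """Build an in-memory table with one live and one soft-deleted row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, deleted_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO contacts (id, name, deleted_at) VALUES (?, ?, ?)",
        [(1, "Ada", None), (2, "Bob", "2026-06-01T00:00:00")],
    )
    return conn
```

With the demo data, get_contact(conn, 1) returns the live row, while get_contact(conn, 2) returns None even though a row with that ID is still stored.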
Keysat
2026-06-12 17:44:27 -05:00
parent 1959c22e19
commit aec2b7775b
8 changed files with 148 additions and 23 deletions
+7 -3
@@ -12,6 +12,7 @@ import os
 import os
 import sys
 _HERE = os.path.dirname(os.path.abspath(__file__))
+sys.path.insert(0, _HERE)  # backend/mcp on path for sibling imports (architect_grounding, architect_agent)
 # outreach_type -> human description woven into the prompt
 OUTREACH_TYPES = {
@@ -223,11 +224,15 @@ def draft_outreach(conn, investor_id, outreach_type, guidance, db_path, sender_e
     voice_blocks, voice_meta = _voice_examples(conn, sender_email, outreach_type)
     # 1) Scrub the sender's voice examples + the recipient context TOGETHER (shared token
-    # space). Nothing reaches Claude in the clear; the voice examples are reference only.
+    # space). The recipient context is free-prose email bodies, so the dictionary+regex
+    # floor is NOT enough — pass the local-Qwen NER backstop (as architect_grounding does)
+    # to tokenize unknown people/firms not in the CRM. FAILS CLOSED: if the local model is
+    # unreachable, _ner_local raises here and no de-anonymized draft is returned.
     try:
         sys.path.insert(0, os.path.dirname(_HERE))  # backend/ for the redaction package
         from redaction.client import Boundary
-        boundary = Boundary(db_path=db_path, actor="closer")
+        from architect_grounding import _ner_local  # local-Qwen NER backstop (sibling module)
+        boundary = Boundary(db_path=db_path, actor="closer", ner_fn=_ner_local)
         scrubbed = boundary.scrub(list(voice_blocks) + [context], bucket=False, conn=conn)
     except Exception as exc:
         return {"status": "scrub_unavailable", "reason": str(exc)}
@@ -237,7 +242,6 @@ def draft_outreach(conn, investor_id, outreach_type, guidance, db_path, sender_e
     # 2) Claude drafts over the de-identified context + voice + (non-sensitive) thesis.
     try:
-        sys.path.insert(0, _HERE)
         import architect_agent as aa
         thesis = aa.at.get_thesis("core", db=db_path)
         raw = _draft_with_claude(aa, thesis, type_desc, deident_target, deident_voice, guidance)
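
The fail-closed behaviour of the scrub step can be illustrated in isolation. This is a simplified stand-in, not the redaction package's Boundary: the function scrub_or_fail_closed, the token format, and the shape of ner_fn (text in, list of entity strings out) are all assumptions made for the example. Only the scrub_unavailable status and the shared token space come from the diff above.

```python
def scrub_or_fail_closed(texts, ner_fn):
    """Tokenize entities across all texts, or report scrub_unavailable.

    Hypothetical stand-in for the real Boundary.scrub call. If ner_fn raises
    (for example because a local model is unreachable), no text is returned,
    scrubbed or otherwise.
    """
    tokens = {}  # shared token space: the same entity maps to the same token in every text
    scrubbed = []
    try:
        for text in texts:
            for name in ner_fn(text):
                token = tokens.setdefault(name, f"[ENTITY_{len(tokens) + 1}]")
                text = text.replace(name, token)
            scrubbed.append(text)
    except Exception as exc:
        # Fail closed: surface the error instead of falling back to raw text.
        return {"status": "scrub_unavailable", "reason": str(exc)}
    return {"status": "ok", "texts": scrubbed}


def fake_ner(text):
    """Toy NER for the demo: finds two hard-coded names."""
    return [n for n in ("Jane Roe", "Acme Capital") if n in text]


def down_ner(text):
    """Simulates the local model being unreachable."""
    raise ConnectionError("local model unreachable")
```

Calling scrub_or_fail_closed with fake_ner replaces both names with stable tokens across every input; calling it with down_ner returns the scrub_unavailable status and none of the input text.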