Commit Graph

6 Commits

Author SHA1 Message Date
Keysat 069e60053b email-activity agent: propose -> review -> approve grid notes (v0.1.0:64)
When a sent/received email is matched to an investor, a local-model agent drafts a
one-line dated note and queues it as a PENDING proposal (it never writes the grid
itself). On the Email Capture page a partner sees "Proposed grid notes", can edit the
text, and Approve (appends to that investor's grid notes cell, newest at bottom,
stamped with the approver) or Dismiss. Going-forward only: a cutoff (app_settings
email_activity_since, set on first run) means email dated before the feature was
enabled is never summarized, so the historical backfill makes no noise. Sovereign:
summaries run entirely on the local model (no redaction needed). Gmail sync interval
tightened 180 -> 15 min so outgoing email surfaces quickly.

Backend: migration 0002 (email_activity_proposals); propose_email_activity_notes()
runs via a new scheduler post_sync hook; list/decide functions + routes
GET /api/activity/proposals, POST .../{id}/approve|dismiss. Grid append stamps the
approving user (fundraising_state.updated_by has a FK to users). Test
test_email_activity.py (propose cutoff/idempotency, approve appends + edited note,
dismiss, already-decided guard) under FK enforcement.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 15:55:26 -05:00
Keysat 2e70b34592 Architect grounding boundary: redaction/re-hydration privacy gate (v0.1.0:55)
Phase 1 Workstream D. Lets the Architect ground the thesis in REAL recurring LP
objections without any LP identity reaching the Claude API. Layered, defense-in-depth,
fail-closed by construction (docs/redaction-rehydration.md).

backend/redaction/:
- scrub.py: the leak-proof core. Drops Tier-1 (labelled/structured account/wire/SSN/
  IBAN/SWIFT/passport, separator-tolerant); tokenizes known LP entities (dictionary from
  the canonical layer, unicode-folded + hyphen-extended) and structured PII (emails,
  scheme-less/social URLs, intl+ext phones, currency-cued amounts, ISO/worded/numeric/
  quarter dates, addresses, bare long digit runs); pre-neutralizes injected [TYPE_N]
  strings; single-pass rehydrate; metadata-only audit logging (the pseudonym map is the
  de-anon key — local-only, never logged/sent). Hardened across THREE adversarial
  leak-hunts (worded/coded amounts, intl phones, NFD/ligature/zero-width names, slash/
  comma SSN, SWIFT, alpha-prefixed accounts, substance-preserving false-positive fixes).
- client.py: Boundary — one scrub/rehydrate contract, SCRUB_BACKEND=local (default) or
  gateway (Spark Control /scrub + /rehydrate). Fails closed (db_path required; dictionary
  build errors propagate; strict rehydrate returns tokenized-not-de-anon text).
- test_scrub_leak.py, test_reidentification.py: golden-file leak + re-identification
  suites (synthetic only, guardrail #9), regression-locking every leak-hunt vector.

backend/mcp/architect_grounding.py: the flow — retrieve (local) -> minimize-first
(local Qwen) -> scrub (+ local-Qwen NER backstop for unknown names) -> Claude over the
de-identified register only -> re-hydrate locally -> human review. FAILS CLOSED if the
local model is unreachable or a hallucinated token appears. test_grounding_boundary.py
proves nothing sensitive reaches Claude and the three fail-closed paths.

server.py: POST /api/architect/ground (admin) wires retrieval -> ground_objections.
docker_entrypoint.sh: SCRUB_BACKEND (default local). docs/spark-control-scrub-endpoints.md:
the gateway handover spec (Option 1 — caller supplies the entity dictionary).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 17:06:29 -05:00
Keysat dd25bbc08d Architect agent: Claude-powered thesis generation (backend scaffolding)
- backend/mcp/architect_agent.py: generate_options + revise on Claude (prompt-
  cached thesis context, claude-opus-4-8, Ten31 voice rules). Writes N variant
  drafts to a node's variant group; nothing canonical without human approval.
  Fails gracefully if the API key / SDK is absent.
- server.py endpoints: GET /api/architect/status, GET /api/thesis/{key}/tree,
  GET /api/thesis/nodes/{id}/variants, POST .../generate, POST .../feedback,
  POST /api/thesis/lines, POST /api/thesis/lines/{key}/nodes. architect_tools
  gains get_node_variants.
- Dockerfile installs `anthropic`; docker_entrypoint loads ANTHROPIC_API_KEY from
  /data/secrets/anthropic-api-key (self-disabling until the key is dropped in).

Full HTTP surface verified end-to-end (graceful 502 without a key).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 13:25:47 -05:00
Keysat 6be2e40f54 Phase 0 go-live polish: hands-off incremental sync + refresh action
- backend/ingest/sync_scheduler.py: periodic incremental-sync loop (every
  CRM_INGEST_SYNC_INTERVAL_MIN min); resilient, --once for testing.
- start9/0.4: "Refresh search index" action (incremental sync.py); entrypoint
  launches the scheduler as a background process when Spark/Qdrant are set;
  CRM_INGEST_SYNC_INTERVAL_MIN env; pre-release note on fastembed/mcp pins.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 09:36:06 -05:00
Keysat f357c23c75 Phase 0 complete: fuzzy entity tier, incremental sync, Start9 packaging
- Fuzzy tier (backend/ingest/fuzzy_resolve.py + llm.py): local Qwen adjudicates
  the deterministic resolver's flagged name-variant candidates; merges are
  durable via entity_merges (deterministic re-runs respect them), losers
  soft-deleted, logged. Idempotent.
- Incremental sync (backend/ingest/sync.py): re-embeds only rows changed since a
  watermark (ingest_sync_state); first run / --recreate = full. Tested full→0→1.
- Start9 packaging (start9/0.4): Dockerfile bundles ingest+mcp + fastembed/mcp;
  "Build search index" action runs the init in a subcontainer; MCP shipped as a
  manual stdio server (not a daemon); version 0.1.0:44. INGEST_PACKAGING.md.
- backfill.py: factored embed_and_upsert() shared with sync.

Verified end-to-end on synthetic data + live Sparks/Qwen/Qdrant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 08:55:12 -05:00
Keysat c7ce44d963 Phase 0 foundation: canonical schema, ingest pipeline, CRM MCP server
Workstream A–C substrate for the Ten31 agentic system:
- A1: docs/crm-overview.md; CLAUDE.md conventions + guardrail #9
- A2: additive/reversible core migration (canonical_entities, entity_links,
  interaction_log, relationship_edges, soft-delete) + ledgered runner
- B1/B3: chunking + deterministic entity resolution (backend/ingest)
- B2: dense (bge-m3) + BM25 sparse ingest to Qdrant crm_chunks
- C: CRM MCP server (reads, retrieval modes, logged writes) — no outbound tools
- docs: redaction/re-hydration, Gmail enablement runbook
- synthetic test data; .env.example; housekeeping (.gitignore, untrack crm.db,
  drop legacy files + start9/0.3.5)

Verified end-to-end on synthetic data + live Sparks (hybrid > dense on entity
queries). Real backfill runs on Ten31 infra; index holds synthetic data only.
Branch snapshot also captures pre-existing working-tree changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 08:13:35 -05:00