50 Commits

Author SHA1 Message Date
Keysat 7f9a15ebf3 Adopt the Pipeline: grid-driven opportunities link (v0.1.0:87)
The fundraising grid (canonical) now drives the classic opportunities
Pipeline board, instead of the board being a disconnected second data-entry
surface. An "Add to Pipeline" row action creates a durably-linked opportunity
via the new opportunities.fundraising_investor_id (migration 0005, additive +
reversible), reusing the grid's already-synced contact — retiring the
POST /api/contacts side-door — and mapping the grid lead to the opp owner.

Ownership is split so the two stay reconciled: the grid owns whether the link
exists and the seed; the board owns stage/probability/owner. The link endpoint
is idempotent (one live opp per investor; a re-link never reseeds funnel
fields). "Is in pipeline?"/"what stage?" are derived from a live opp join and
injected as read-only grid columns on read, stripped on write, so they never
persist or dirty the autosave. Remove-from-pipeline soft-deletes the opp and
leaves the grid row fully intact; deleting an investor from the grid archives
its orphaned opp.

Also fixes the standing soft-delete leak in handle_pipeline_report and the
dashboard pipeline aggregates, which counted tombstoned opportunities.

Tests: backend/test_grid_pipeline_link.py (link/idempotent/round-trip/guards/
unlink-intact/re-link/orphan/aggregates); 28/28 suite green, render-smoke green.
2026-06-17 23:08:36 -05:00
Keysat c1ea1769a4 Matrix intake: frame parse with team roster so a teammate isn't read as the prospect
Local-smoke found "jonathan is chatting with wyoming" extracted the teammate, not
the prospect. Feed the parser an optional team roster (INTAKE_TEAM_ROSTER) via a
build_system(roster) outreach frame: roster names/initials are the people doing
outreach and are never extracted; the other party is the investor/prospect. Same
framing on the revise leg. Unset roster = prior behavior.
2026-06-17 21:58:54 -05:00
Keysat b470ea2659 Containerize the Matrix intake bot as a managed service (restart: unless-stopped)
Turn the bot from a bare nohup process (silently dies on a Spark reboot) into a docker-compose service. Dockerfile bundles backend/matrix_intake + the stdlib backend/ingest Spark client it reuses; .env is mounted read-only at runtime, never baked. The existing repo-root .dockerignore (shared with the s9pk build) already keeps data/ and .env out of context. Also adds a handoff doc for wiring a spark-control dashboard card in a later session.
2026-06-17 20:10:16 -05:00
Keysat 0b893295e1 Matrix intake: fuzzy investor matching + conversational in-thread edits (v0.1.0:86)
Close the two locked post-deploy enhancements for the Matrix intake bot.

Fuzzy matching (server-side, ships in the s9pk): new find_intake_candidates in
server.py returns ranked deterministic near-matches (difflib name similarity +
token-set Jaccard, legal-suffix-aware, + email Levenshtein <= 2); GET
/api/intake/match now returns {match, candidates}. The bot surfaces a numbered
shortlist so a near-duplicate (Charlie/Charles, Acme Capital vs Acme Capital LLC,
a one-char email typo) is confirmed by a human instead of silently creating a
second investor. Exact match still auto-attaches; fuzzy candidates are never
auto-attached. The optional LLM-judge re-rank is deferred.

Conversational edits (bot-side, ships on the Spark): any in-thread reply that
isn't yes/no/edit field=value is treated as a natural-language revision and
re-run through local Qwen (parse.revise). Email integrity is preserved -- a
changed address must literally appear in the instruction; the model's email
field is structurally unreachable. No-op revisions re-prompt.

Docs/current-state brought current; 27/27 backend tests green.
2026-06-17 18:50:58 -05:00
Keysat fa6c9da0e6 Drop redundant "[note]" tag from fundraising-grid note line (v0.1.0:85)
The grid note line was "YYYY-MM-DD [type] Contact: summary"; for the default
"note" type the tag is noise. Omit it for "note"; keep it for informative
types (call, meeting, …). Shared by the Matrix intake bot and grid-UI logging.
Built + installed to the box (installed-version 0.1.0:85, clean 84->85
migration). No schema change.
2026-06-17 17:30:40 -05:00
Keysat aefb2aa119 Matrix intake: main-timeline nudge, clearer messages, note text in grid
Four bot-side UX fixes surfaced by the live smoke:
- Post a brief pointer in the main timeline (a reply to the user's message)
  alongside the in-thread proposal card, so proposals aren't missed inside a
  thread. Pointer only — approvals still happen in the thread, where the note
  is visible (you can't make an informed yes/no without seeing it).
- A bare yes/no typed in the main timeline while a proposal is pending now
  gets a "reply in the thread" redirect instead of "couldn't tell what to record."
- Clearer commit confirmations: "Created a new grid entry for X" vs
  "Logged a note on X (existing grid entry)."
- Send a blank communication subject when a note is present so the grid's
  one-line note summary shows the note text, not the "(Matrix)" label
  (provenance stays in source="matrix_intake").
2026-06-17 17:14:08 -05:00
Keysat fd2e3ed78e Matrix intake: strip surrounding punctuation from extracted emails
normalize()'s email regex matched non-@/non-space runs, so "Name <addr>"
(the most common contact format) yielded "<addr"; only trailing punctuation
was stripped, never leading. Tighten the regex to standard local@domain.tld
so the bare address is extracted from <…>, (…), and trailing-period forms.
Found via the live-deploy pre-flight. Add a regression test.

Also log two intake backlog items in ROADMAP: the scoped service-credential
auth path (deferred; bot uses a member login for now) and fuzzy match +
in-thread confirm (post-deploy).
2026-06-17 14:06:32 -05:00
Keysat 7ad0ee7624 Add Matrix intake bot (M1+M2): typed message → approved fundraising-grid write
New backend/matrix_intake/ runs as its own process (matrix-nio isolated from the
stdlib CRM): local-Qwen parse via Spark Control → in-thread human approval
(yes/edit/no) → write through the CRM's own log-communication endpoint, tagged
source=matrix_intake. Adds read-only GET /api/intake/match (returns grid row id,
no-duplicate contract); threads provenance through handle_log_fundraising_communication.
Reviewer-passed: pop-before-commit closes a double-approve race; edit-grammar fix.
Text-only v1; business-card photo (M3) deferred (no Spark vision model).
26/26 tests green; live Matrix smoke pending deploy.
2026-06-17 07:51:27 -05:00
Keysat c7b74a2704 Email search/query + windowed digest preview (v0.1.0:83)
Communications tab (search/query roadmap items 1 & 2):
- Fix the investor dropdown: the facet only listed grid investors, so it
  came back empty whenever email matched a classic contact or org domain
  (no grid id — the common case). It now mirrors the email list, resolving
  each link to a typed identity (fund:/org:/contact:/addr:) with precedence
  grid -> org -> contact -> address; investor_id accepts the typed key
  (bare id = fund: for back-compat) and an unknown prefix matches nothing.
- Add a date-range filter and a click-to-expand full-body view
  (GET /api/email/detail, admin, soft-delete-gated; body_text only, never
  raw remote HTML).
- Add a "Search content" mode: GET /api/email/search wraps the ingest
  hybrid_search over the Qdrant email index (doc_type=email), hydrated and
  soft-delete-filtered against SQLite (canonical), 503 if Spark/Qdrant down.

Daily digest:
- Settings -> Admin builds a digest over a chosen window (last 24h or since
  a date) as an in-app preview before sending (POST /api/admin/digest/preview),
  so the local-Spark summarizer can be verified on demand even on a quiet day.
  Manual send uses the same window; neither advances the daily cursor, so a
  preview never suppresses the scheduled digest.

Code-only, migrations no-op. 22/22 backend tests, render-smoke pass.
2026-06-16 20:46:15 -05:00
Keysat 6563a7811e Communications tab: show matched investors only (v0.1.0:81)
The email-activity panel surfaced every captured message, including cold/
unknown-sender email with no investor association. Gate query_email_activity
on EXISTS(email_investor_links) so the panel shows only email tied to a known
investor/contact. Capture is unchanged — unmatched email is still stored
(metadata-only) and will appear automatically if its sender is later added as
an investor; this is a read-side filter only.

Graveyard investors are unaffected (their email has a link), so they remain
visible/searchable as an audit surface, hidden only from the filter picker.
2026-06-16 15:43:30 -05:00
Keysat 42d2b4b245 Repurpose Communications tab as admin-only email-activity panel (v0.1.0:80)
The Communications tab is now an admin-only search over captured Gmail
(email_* tables), part of consolidating on the fundraising grid + email
capture as the canonical system of record.

- New GET /api/email/activity (admin-enforced server-side): filter by
  investor / mailbox / direction with free-text search over subject,
  snippet, and sender. Query logic in db.query_email_activity.
  - Soft-delete honored on the per-mailbox sighting (emails carry no
    deleted_at; deletion lives on email_account_messages).
  - Direction decided at the email level (outbound if the sender is one of
    our mailboxes), mirroring digest_builder.
  - Graveyard investors are hidden from the filter dropdown (CRM-wide
    graveyard=0 convention) but their email stays visible in the list and
    findable by free-text search — this is an audit surface.
- Communications page rewritten to render the panel; the classic manual
  "Log Communication" form is retired (the grid context menu remains the
  manual-log path). Nav item + page are admin-only.
- Tests: email_integration/test_email_activity_panel.py (filters,
  per-sighting soft-delete, roll-ups, graveyard handling, route 401/403);
  full suite 22/22. Frontend render verified via a jsdom mount smoke test
  plus the pinned classic-runtime Babel transform.

Code-only, no schema migration (version migrations are no-ops).
2026-06-16 14:49:59 -05:00
Keysat cc25be4e14 Fix blank-screen on load + close 3 admin gaps (v0.1.0:79)
The web UI rendered a blank screen for every user. Root cause: the page
loaded @babel/standalone from unpkg with no version pin, so the CDN silently
served Babel 8.0.0. Babel 8 defaults @babel/preset-react to the automatic JSX
runtime, which prepends `import {jsx} from "react/jsx-runtime"` to the compiled
output. An ESM import is illegal in this classic (non-module) inline <script>,
so the browser rejected the whole bundle and React never mounted — hence the
blank screen. The prior "verified live" checks were server-up/curl, which can't
catch a browser-render failure.

- Pin @babel/standalone@7.29.7 (its preset-react defaults to the classic
  React.createElement runtime). Verified via headless render: app mounts, login
  screen renders, no console error. Follow-up: vendor + SRI-pin the CDN libs so
  a third party can't swap our front-end deps in production again.
- Close three server-side admin gaps surfaced by a permissions audit — endpoints
  that were UI-hidden from members but not API-enforced: GET /api/users,
  /api/email/status, /api/email/accounts now require_admin. Removed the now-dead
  non-admin mailbox-row filter. 21/21 backend tests green; py_compile clean.
2026-06-16 12:59:55 -05:00
Keysat 108210d8e1 Retire lp_profiles + LP Tracker; repoint Dashboard committed to the grid (v0.1.0:78)
The fundraising grid + email capture is the canonical system of record. lp_profiles
was a superseded single-fund model with no reachable create/edit path, and the LP
Tracker page was already orphaned (no nav entry + a redirect bouncing it to the grid).

- Remove /api/lp-profiles* endpoints + handlers, the unused lp-breakdown report,
  the contact-dossier LP section, the demo-seed LP block, and (frontend) the
  LPTrackerPage component + its lp-tracker->fundraising-grid redirect.
- Dashboard "Total Committed" now sums fundraising_investors.total_invested
  (graveyarded investors excluded) instead of the orphaned lp_profiles table, which
  read ~$0. "Total Funded" dropped: the grid tracks commitments, not a funded amount,
  and the frontend never rendered it.
- Leave the empty lp_profiles table/index, the contact-delete soft-delete cascade,
  and the --reset-all-data clear in place (never-hard-delete).
- Tests: add test_dashboard_report.py; update test_soft_delete_reads.py. 21/21 green.
2026-06-16 10:48:53 -05:00
Keysat 323f016f64 Add daily activity digest — Phase B (v0.1.0:77)
Sends a once-a-day internal email to all active admins summarizing each team
member's email activity per investor, plus a team-wide by-investor view
(inbound + outbound, deduped). Narratives are generated on the LOCAL Spark
model, never Claude — the digest is intentionally un-anonymized, so substance
stays on Ten31 infra. This is an internal ops email, exempt from the
'agents draft, humans send' rule (which governs outward LP contact).

- backend/digest_builder.py: per-user + per-investor activity queries
  (soft-delete filtered), per-user Spark narrative with a deterministic
  fallback, two-section plain-text body, and the DB-backed policy resolver.
- backend/email_integration/digest_scheduler.py: always-on daily thread that
  re-reads the policy each cycle and sends once/day; window cursor in
  app_settings so a missed day rolls forward.
- server.py: POST /api/admin/digest/send-now and GET/PATCH
  /api/admin/digest/policy; scheduler wired into main().
- Control lives in Settings -> Admin (enable toggle + send-time dropdown),
  not StartOS actions; env vars only seed the first-boot default.
- Tests: backend/test_digest_builder.py.
2026-06-15 22:32:27 -05:00
Keysat fee037a630 Apply review polish to the digest send path (post-v0.1.0:76)
Non-blocking items from the v76 reviewer pass. No redeploy needed — the box runs
v76 and its happy path is unaffected; these ride the next build:

- digest_mailer.send_digest: when Gmail is enabled but no sender resolves
  (CRM_DIGEST_SENDER unset and no admin email), raise NoTransport so the caller
  returns a clear 400 instead of a generic 502.
- gmail_send.send_via_gmail: wrap OSError/URLError (timeout/DNS) as a RuntimeError
  ("Gmail API unreachable: ...") to match the HTTPError handling; include the
  sender in the HTTPError message for debuggability.
- credentials.py: correct the now-stale GMAIL_COMPOSE_SCOPE comment (the digest
  mailer sends with this scope; only outreach drafts never send).
- test_gmail_send.py: add the HTTPError->RuntimeError branch, default_sender DB
  fallback (+None case + env override), and the send_digest SMTP-tag path.

19/19 backend tests green.
2026-06-15 20:37:49 -05:00
Keysat 47dfd110a0 Add Gmail-DWD send path for the digest mailer (v0.1.0:76)
The box's existing service-account domain-wide-delegation grant already includes
gmail.compose, which authorizes users.messages.send — verified 2026-06-15 by a
token-mint probe and a live messages.send to grant. So CRM-originated mail can
send through the account that already powers email capture: no SMTP account, no
app password, no admin change.

- backend/email_integration/gmail_send.py: send_via_gmail() impersonates a
  domain user and POSTs users.messages.send (reuses credentials.py + the compose
  scope; mirrors compose.py's REST pattern).
- backend/digest_mailer.py: send_digest() prefers Gmail DWD when enabled, falls
  back to smtp_send otherwise. Sender = CRM_DIGEST_SENDER else first active admin.
- server.py: the admin test endpoint now routes through digest_mailer (so the
  Settings button sends via DWD on the box with zero SMTP config). Recipient
  restriction to the admin set and no-leak error handling preserved.
- test_gmail_send.py: build/send + transport routing (provider + urlopen faked).
  19/19 backend green; s9pk typechecks.

SMTP (v75) stays as the fallback transport. Send-path decision + scope finding
recorded in ROADMAP.md and AGENTS.md.
2026-06-15 20:17:27 -05:00
Keysat 2758ac81d3 Add daily-digest Phase A: per-package SMTP send + admin test endpoint (v0.1.0:75)
Groundwork for the daily activity digest: give the CRM an outbound mail path.
Today nothing leaves the box (Gmail capture + drafts only), so this adds a
dedicated, per-package SMTP account independent of any StartOS system-wide SMTP.

- configureDigestSmtp Start9 action: writes host/port/from/username/password/
  security to /data/secrets/smtp/* (password piped over stdin, never argv/env;
  per-field files, owner-only) — mirrors the setAnthropicApiKey pattern.
- docker_entrypoint.sh reads those at boot and exports SMTP_* (operator env wins).
- backend/smtp_send.py: stdlib smtplib wrapper reading SMTP_* (one code path for
  dev .env and the box); starttls/tls/none modes.
- POST /api/admin/digest/test-email (admin-only): proves the pipe. Recipients are
  restricted to the active-admin set — an arbitrary `to` is rejected, so the
  endpoint is not an open relay; send failures are logged, not echoed (an SMTP
  auth error can carry the credential).
- Tests: test_smtp_send.py (sender), test_smtp_endpoint.py (gating + relay
  restriction + no-leak). 18/18 backend green; s9pk typechecks.

Analysis/summarization for the digest body (Phase B) will run on Spark, never
Claude — the digest is deliberately un-anonymized. Decisions + Phase B plan in
ROADMAP.md.
2026-06-15 18:33:06 -05:00
Keysat 7285bb0e52 Add regression tests for v74 fixes; close soft-delete leak in list-view aggregates
Lock in the three v0.1.0:74 security/privacy fixes with regression tests, and
fix a same-class soft-delete leak surfaced while writing them.

- backend/test_assets_traversal.py: boots the real server, proves /assets/
  path-traversal vectors (incl. a real decoy file and the live crm.db, plain
  and URL-encoded) 404 and leak nothing, while a legit asset still serves 200.
- backend/test_soft_delete_reads.py: get-by-id 404s soft-deleted rows and
  nested + list-view aggregates exclude soft-deleted children.
- backend/mcp/test_outreach_redaction.py: an unknown free-prose name is
  tokenized away from the Claude payload but re-hydrated locally, and the path
  fails closed (no Claude call) when the local NER model is down.
- backend/run_tests.py: aggregate runner (each backend/**/test_*.py in its own
  subprocess); replaces the manual for-loop. 16/16 green.

A reviewer pass on the tests confirmed the soft-delete filter was missing from
list-view aggregate sub-selects: org contact_count/total_funded and contacts
comm_count/last_contact_date counted soft-deleted rows. Add `deleted_at IS NULL`
to those four (server.py) and regression-cover them.

The reports subsystem (dashboard/pipeline/LP-breakdown, ~16 aggregate queries)
has the same leak and is logged as P2 for a dedicated pass. Not yet built or
deployed — bump the package version before the next s9pk build.
2026-06-13 00:26:22 -05:00
Keysat 6816d4a4f0 Realign stale thesis tests to the 7-member positioning group
ensure_positioning_framings adds 5 Architect framings to the core
positioning variant group alongside Option A/B, so the group holds 7
candidates and choose_variant retires 6. The two thesis tests still
asserted the pre-framings count of 2 — the tests were stale, not the
seed. Realign them, document the 2+5=7 seed structure in the thesis
guide, and refresh AGENTS.md Current state (13/13 tests green).
2026-06-12 18:44:14 -05:00
Keysat aec2b7775b Harden privacy boundary and asset serving (v0.1.0:74)
Fixes from the 2026-06-12 full-eval (P0 + two P1s); code-only, no schema
change. Without these the "private CRM" premise was breachable on the LAN:

- P0: the /assets/ route joined the request path onto FRONTEND_DIR without
  normalizing '..' (get_path/urlparse pass it through), so an unauthenticated
  GET /assets/../../data/crm.db read any file the process could — the LP DB,
  the JWT signing secret (-> admin-token forgery), the Gmail key. Add a realpath
  containment check that 404s anything resolving outside FRONTEND_ROOT.
- P1: the LP-outreach drafter built its redaction Boundary with no ner_fn, so
  unknown people/firms in raw email bodies reached Claude in the clear. Pass the
  local-Qwen NER backstop (ner_fn=_ner_local), matching architect_grounding;
  fails closed via the existing scrub_unavailable path if the local model is down.
- P1: get-by-id handlers leaked soft-deleted records by direct ID. Add
  deleted_at IS NULL to every get-by-id path — contacts, organizations,
  opportunities, lp_profiles — and to the nested related-data sub-selects in
  the contact/opportunity detail payloads, matching the list-handler convention.

Bumps the package to v0.1.0:74 (utils.ts + versions/v0.1.0.74.ts + graph).
Full report in EVALUATION.md; remaining P2/P3 triaged in AGENTS.md Current state.
2026-06-12 18:01:48 -05:00
Keysat fffc90c7a4 Replace v5 settlement spine with v2.0 reserve-asset spine (v0.1.0:73)
Swap the dead "scarcity as the connecting idea" / bitcoin-as-settlement
spine for the v2.0 reserve-asset spine (bitcoin = apex non-debasable
reserve asset; debasement = forcing function; AI = abundance engine;
throughline is an asset-value/capital-flow claim, not settlement; three
seams Energy<->Compute, Debasement<->Bitcoin, AI<->Data-Ownership)
everywhere it was still encoded in live code, the seed, and the docs.

- architect_agent.py / outreach_agent.py: both system prompts carried
  "scarcity as the connecting idea" and shipped settlement framing into
  every generated draft; rewritten to the reserve-asset spine.
- thesis_seed.py: THROUGHLINE, PILLAR_1, the AI/energy-operator segment
  angle, and THESIS_V2 corrected and voice-cleaned (no em dash / "X, not
  Y" / "bet"). PILLAR_2/3 (real revenue, founder access) kept.
- ensure_thesis_v2_promoted / revert_thesis_v2_promotion: make the v2.0
  spine the working APPROVED spine and re-ground/clean the core nodes,
  deployment-state-invariant (structural targeting, not body text) and
  fully reversible (captures prior body/title/status/deleted_at). NODE
  level only: never sets a thesis_version canonical (guardrail #4); no
  hard deletes (guardrail #3). Wired into init_db after the v2 candidate
  stage.
- docs/thesis-handoff.md replaced wholesale with the complete v2.0 doc;
  Ten31_Agentic_Build_Plan.md + PHASE_1.md throughline glosses updated.

The v2.0 spine remains an unratified draft from the signal-engine
workstream: canonical freeze stays the partners' dual sign-off, and
Appendix-A conviction/exposure figures stay Grant's working read.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 08:22:24 -05:00
Keysat c53fdcb4a0 thesis: stage v2.0 reserve-asset spine as Workshop candidates (v0.1.0:72)
Incorporates the signal-engine workstream's v2.0 thesis correction: the spine is bitcoin
as the apex NON-DEBASABLE RESERVE ASSET (debasement = forcing function, AI = abundance
engine), NOT "infrastructure settles on bitcoin" (the settlement/payments claim — Strike's
payments thesis died in backtest). thesis_seed.ensure_thesis_v2_candidate stages the
v2.0 root/forcing-function, throughline, the verifiable-vs-contrarian decomposition, and
the 3 seams (Energy↔Compute, Debasement↔Bitcoin, AI↔Data-Ownership) as CANDIDATE nodes
under the core line (idempotent sentinel; provenance + "unratified, exposure unconfirmed"
on the section). Nothing canonical (guardrail #4). docs/thesis-handoff.md gets a
SUPERSEDED-spine banner pointing to v2.0.

NOT done (gated on partner ratification): the live THROUGHLINE/PILLAR_1 constants and
architect_agent.py's system prompt ("scarcity as the connecting idea") still encode the
old spine — until ratified+updated, Vary/Revise/outreach regenerate the old framing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 23:32:36 -05:00
Keysat 606b336a00 outreach: voice by-purpose (larger sample) + Tier-B Gmail draft creation (v0.1.0:71)
(1) Voice: _voice_examples now picks the sender's prior sent emails OF THE SAME PURPOSE
(PURPOSE_PATTERNS keyword cues per outreach type), larger sample (8) weighted by purpose
then recency — not just recent. meta carries on_topic for transparency.

(2) Tier-B sending (gmail.compose now authorized in Workspace DWD). New
email_integration/compose.py create_outreach_draft: mints a compose-scoped DWD token for
the sender (credentials._mint/access_token_for parameterized by scope; GMAIL_COMPOSE_SCOPE),
builds an RFC822 message, and POSTs gmail.drafts.create into the SENDER's mailbox — as an
in-thread reply (threadId + In-Reply-To/References, recipient = matched LP address) when
there's an active thread, else a fresh email. NEVER sends — the human sends from Gmail
(guardrails #4, #6). Route POST /api/outreach/gmail-draft; UI "Create Gmail draft" button +
"Open Gmail Drafts" link. Tests: test_compose.py (parse/reply-target/RFC822+threading).
Message construction unit-verified; the live drafts.create runs on the box.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 22:30:05 -05:00
Keysat 49f84ca9a4 outreach: per-user voice from own emails + transparency; active-thread context (v0.1.0:70)
Voice upgrade. draft_outreach now learns the SENDER's voice: the codified rules PLUS a
few-shot of that user's own recent sent emails (_voice_examples; from_email = the
sender, de-identified in the same scrub batch as the recipient context, reference-only).
The response returns which of the sender's emails were used (subject + date + recipient),
shown in the UI as "Voice based on: …" — transparency to avoid the black-box problem.
Falls back to rules-only with a clear note when the user has no captured sent email.

Context restructured: _context groups the investor's email by thread and labels the most
recent thread as the "Active conversation (what you are replying to)" with earlier emails
as background, so replies stay on-topic instead of dredging old threads.

Sender email resolved in handle_outreach_draft (users table by user_id). Test extended
(active/background split, voice examples + meta, no-sender fallback). Fixed a UI bug the
preview caught: the manual Draft button was onClick={draft}, which passed the click event
as the investor arg after draft() gained params -> circular-JSON error; now onClick={()=>draft()}.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 22:06:38 -05:00
Keysat 787d580550 outreach: follow-up radar — deterministic "needs attention" + one-click draft (v0.1.0:69)
The Outreach page now opens with a "Needs attention" list. A deterministic scan
(outreach_agent.follow_up_radar) surfaces investors per the email history: tier 0 "you
owe a reply" (their email is the most recent, unanswered, >=3d), tier 1 flagged + quiet,
tier 2 warm lead gone quiet (no contact in >=45d). Most urgent first; every reason is
verifiable from the data (no LLM in the surfacing — the deliberate fix for the trust
problem that sank objection-grounding). Excludes graveyard; needs email history. One
click sets the investor + suggested type (follow-up/nurture) and runs the existing
outreach drafter. Route GET /api/outreach/radar. Test mcp/test_outreach.py extended
(owe-reply/warm-quiet/recent/graveyard/order). Verified live in preview.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 21:31:52 -05:00
Keysat b5619d61e1 outreach: Outreach Draft Assistant — tailored LP drafts (v0.1.0:68)
First proactive-messaging build. New "Outreach" page (all authenticated users): pick an
investor + type (intro / follow-up / fund update / meeting follow-up / nurture) + optional
guidance; the agent drafts a tailored LP email in Ten31's voice, grounded in the thesis +
that investor's CRM notes and matched email history. The draft is editable + copyable;
nothing is sent (draft-only — guardrails #4, #6).

Sovereignty: the thesis is Ten31's own non-sensitive messaging (to Claude as-is); the LP
context is scrubbed through the redaction boundary before Claude, drafted with placeholders,
and re-hydrated locally — the LP list never reaches the API. Fails closed (scrub_unavailable /
claude_not_configured / rehydrate_failed quarantines a hallucinated-token draft).

Backend: mcp/outreach_agent.py (context assembly + scrub + Claude + rehydrate, reusing
architect_agent's client/thesis/voice + the Boundary); routes GET /api/outreach/investors,
POST /api/outreach/draft; logged. Test mcp/test_outreach.py (context assembly). Verified in
preview: page/selector/types/guidance render, fail-closed at the key-less Claude step (scrub
ran locally first), success rendering verified with a mocked ok draft.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 20:06:46 -05:00
Keysat 701e37b579 email: per-mailbox captured/matched counts on Email Capture (v0.1.0:65)
/api/email/accounts now returns captured + matched per account (from the per-mailbox
sighting table email_account_messages joined to emails; emails dedupe globally so an
email seen by two mailboxes counts for each). Each mailbox card on the Email Capture
page shows "<N> captured · <M> matched" so per-user coverage is visible, not just the
aggregate. Verified in preview with two seeded mailboxes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 23:10:51 -05:00
Keysat 069e60053b email-activity agent: propose -> review -> approve grid notes (v0.1.0:64)
When a sent/received email is matched to an investor, a local-model agent drafts a
one-line dated note and queues it as a PENDING proposal (it never writes the grid
itself). On the Email Capture page a partner sees "Proposed grid notes", can edit the
text, and Approve (appends to that investor's grid notes cell, newest at bottom,
stamped with the approver) or Dismiss. Going-forward only: a cutoff (app_settings
email_activity_since, set on first run) means email dated before the feature was
enabled is never summarized, so the historical backfill makes no noise. Sovereign:
summaries run entirely on the local model (no redaction needed). Gmail sync interval
tightened 180 -> 15 min so outgoing email surfaces quickly.

Backend: migration 0002 (email_activity_proposals); propose_email_activity_notes()
runs via a new scheduler post_sync hook; list/decide functions + routes
GET /api/activity/proposals, POST .../{id}/approve|dismiss. Grid append stamps the
approving user (fundraising_state.updated_by has a FK to users). Test
test_email_activity.py (propose cutoff/idempotency, approve appends + edited note,
dismiss, already-decided guard) under FK enforcement.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 15:55:26 -05:00
Keysat 3893a4fb9f system-status: show storage usage (DB, attachments, backups, disk free) — v0.1.0:63
/api/system/status now returns a best-effort storage block: database file size
(crm.db + WAL + SHM), the email_attachments dir, the backups dir, and disk
total/used/free via shutil.disk_usage(DATA_DIR). System Status renders a Storage
section with human-readable sizes so growth can be watched over time.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 13:34:18 -05:00
Keysat ea036f49a6 email: fix backfill crash on emails with no Reply-To; Sync now retries errored mailboxes (v0.1.0:62)
insert_email's recipients loop did `for a in parsed.get(kind, [])`, but the parser sets
reply_to=None when there is no Reply-To header, so .get returns None (key present) and the
loop raised 'NoneType' object is not iterable — aborting the entire Gmail backfill on the
first such email (i.e. almost immediately). Fixed with `or []`. Regression test
test_insert_email.py (reply_to=None, all-None recipients, happy path).

Because the scheduler intentionally skips error-status accounts (no retry storms), an
errored mailbox would never resume on its own. "Sync now" now clears error status first,
so it is an explicit retry; backfill resumes from its saved cursor and dedups by
Message-ID, so nothing is re-captured.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 12:41:06 -05:00
Keysat 2cb476e36b email: live backfill progress on Email Capture panel — v0.1.0:61
The first Gmail backfill leaves the account at "pending · never synced" until it
fully completes (the sync_runs row only finalizes at the end), so there was no
feedback. /api/email/status now also returns captured_emails (total, which climbs
page-by-page during backfill), the latest sync run, and a backfilling flag. The
panel shows a "Backfilling… N captured so far" banner + an Emails Captured count
and auto-refreshes every 5s while a backfill is in progress. Verified live in
preview with seeded data (count auto-climbed 37 -> 50 without manual refresh).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 12:29:01 -05:00
Keysat bf829b784a grounding: wire matched email bodies into the LP-feedback corpus
_ground_feedback_corpus now pulls matched email bodies (the richest objection
signal) alongside communications and grid notes, round-robin merged so email is
never crowded out by a flat LIMIT, per-item capped at 4000 chars to keep the local
minimize tractable on long threads, and degrading gracefully when the email tables
are absent. Email remains Tier-2-sensitive: it only ever enters the redaction
boundary, never Claude directly. Inert until Gmail capture is enrolled. Not yet
deployed (bundles into the next release with the meeting-notes work).
Test: test_ground_corpus.py.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 20:30:29 -05:00
Keysat 196f1f6c65 thesis: seed 5 Architect positioning framings into the Workshop (v0.1.0:58)
Saves the 2026-06-05 Architect positioning pass as competing CANDIDATE options
under the core line's positioning variant group, beside Option A/B: Convergence
(47/60), Access (40), Asymmetry (36), Scarcity/chokepoints (35), Freedom-tech (28),
each with its red-team weakness inline. One-time, additive, non-canonical
(guardrail #4); idempotent via an interaction_log sentinel so a partner-deleted
option is never resurrected. ensure_positioning_framings runs after the v5 seed.
Test: test_positioning_framings.py (count/candidacy/idempotency/no-resurrection/log).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 19:40:25 -05:00
Keysat c898ad8530 redaction: \b after magnitude so amounts don't eat the next word (v0.1.0:57)
The currency-anchored amount regexes treated a single-letter magnitude
suffix (k/m/b) as optional but unbounded, so "$5,000,000 but" scrubbed to
"[AMOUNT_1]ut" — the 'b' of "but" was consumed as a 'billion' suffix. Add a
word boundary after _MAG on the three currency-anchored _AMOUNT_RES patterns
(range, symbol, ISO-code); the worded-amount pattern is unaffected. Money
still tokenizes in every case ($5m/$5b/$3-5M/USD 5,000,000); only the OUTBOUND
to-Claude text stops losing the leading letter of the following word.
Round-trips were already lossless.

Regression-locked by a round-5 section in test_scrub_leak.py; full redaction
suite (scrub_leak + reidentification + grounding_boundary) green. Packaged as
StartOS v0.1.0:57. Reported by the Spark gateway dev; gateway re-vendored
scrub.py verbatim for parity (same golden-file leak test gates both sides).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 18:52:04 -05:00
Keysat 6d6f4bcc7e Thesis Workshop redesign: edit/choose/delete + approve-as-current (v0.1.0:56)
Addresses Grant's feedback that the Workshop was confusing and underbuilt (no delete,
no approve, redundant generate-vs-feedback panels, and a stray "0" on segment lines).

Backend (architect_tools.py + server.py routes/handlers):
- retire_node: soft-delete a node + its subtree (reversible). DELETE /api/thesis/nodes/{id}.
- choose_variant: 'Use this' — keep this option, soft-delete the others in its group,
  mark it approved. POST /api/thesis/nodes/{id}/choose.
- upsert_thesis_node gains actor_type so a manual human edit is recorded as 'human'.
  PUT /api/thesis/nodes/{id} edits a part's text directly.
- handle_approve_line: one-click 'approve as current' — records this admin's approval on
  the line's in-review version (creating + submitting one from the live tree if none),
  promoting to canonical at the required distinct-approval count. POST /api/thesis/lines/{key}/approve.

Frontend (ThesisWorkshop redesign):
- Merged the redundant "Generate options" + "Give feedback" panels into one "Ask the
  Architect for options" box (revise was just generate-with-guidance).
- Per option: Use this / Edit (inline) / Delete. Per part: edit + delete via the same.
- "Approve as current" bar with dual-sign-off state + a "Current ✓" badge, and a one-line
  "how it works" hint. Refreshes the tree after every action.
- Fixed the stray "0": `{line.is_core && <badge>}` rendered 0 for non-core lines (SQLite
  integer 0); now `{!!line.is_core && ...}`.

Verified: backend test_thesis_actions.py (choose/edit/retire-subtree/dual-approval->canonical),
and a live in-browser smoke test (JSX compiles, Workshop renders, options show Use/Edit/Delete,
approve returns 1-of-2, no runtime errors).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 18:29:47 -05:00
Keysat 2e70b34592 Architect grounding boundary: redaction/re-hydration privacy gate (v0.1.0:55)
Phase 1 Workstream D. Lets the Architect ground the thesis in REAL recurring LP
objections without any LP identity reaching the Claude API. Layered, defense-in-depth,
fail-closed by construction (docs/redaction-rehydration.md).

backend/redaction/:
- scrub.py: the leak-proof core. Drops Tier-1 (labelled/structured account/wire/SSN/
  IBAN/SWIFT/passport, separator-tolerant); tokenizes known LP entities (dictionary from
  the canonical layer, unicode-folded + hyphen-extended) and structured PII (emails,
  scheme-less/social URLs, intl+ext phones, currency-cued amounts, ISO/worded/numeric/
  quarter dates, addresses, bare long digit runs); pre-neutralizes injected [TYPE_N]
  strings; single-pass rehydrate; metadata-only audit logging (the pseudonym map is the
  de-anon key — local-only, never logged/sent). Hardened across THREE adversarial
  leak-hunts (worded/coded amounts, intl phones, NFD/ligature/zero-width names, slash/
  comma SSN, SWIFT, alpha-prefixed accounts, substance-preserving false-positive fixes).
- client.py: Boundary — one scrub/rehydrate contract, SCRUB_BACKEND=local (default) or
  gateway (Spark Control /scrub + /rehydrate). Fails closed (db_path required; dictionary
  build errors propagate; strict rehydrate returns tokenized-not-de-anon text).
- test_scrub_leak.py, test_reidentification.py: golden-file leak + re-identification
  suites (synthetic only, guardrail #9), regression-locking every leak-hunt vector.

backend/mcp/architect_grounding.py: the flow — retrieve (local) -> minimize-first
(local Qwen) -> scrub (+ local-Qwen NER backstop for unknown names) -> Claude over the
de-identified register only -> re-hydrate locally -> human review. FAILS CLOSED if the
local model is unreachable or a hallucinated token appears. test_grounding_boundary.py
proves nothing sensitive reaches Claude and the three fail-closed paths.

server.py: POST /api/architect/ground (admin) wires retrieval -> ground_objections.
docker_entrypoint.sh: SCRUB_BACKEND (default local). docs/spark-control-scrub-endpoints.md:
the gateway handover spec (Option 1 — caller supplies the entity dictionary).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 17:06:29 -05:00
Keysat 300041a7ec Unification polish: LinkedIn in the grid inline contact editor (v0.1.0:54)
The fundraising grid's per-contact editor now has a LinkedIn URL field next to
name, email, title, and location. It threads through the grid contact object and
sanitize (which preserves contact-object fields), and _upsert_contact_from_fundraising
now reads and persists linkedin_url on both the update and insert paths — so a
LinkedIn entered in the grid lands on the linked contact record.

Test: test_grid_contact_link.py extended to assert LinkedIn entered in the grid
persists to the contact (idempotent). Frontend html.parser clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 15:24:50 -05:00
Keysat 49d384a0fb Seed the v5 thesis into the Architect Workshop (v0.1.0:53)
backend/thesis_seed.py builds the starting "living messaging source of truth"
from docs/thesis-seed-v5.md: a core line (throughline; the open Option A/B banner
as a competing variant group; the three pillars; the proof; voice rules), one line
per LP segment carrying that segment's angle, and the five segment definitions.

ensure_thesis_seed(conn) runs from init_db, seeding ONLY when the Workshop is empty
(no thesis lines) — idempotent and non-destructive, so it bootstraps once and never
overwrites partner edits. Everything lands draft/candidate; nothing is made canonical
(that stays the partners' dual-approval action, guardrail #4). Content is Ten31's own
messaging, not LP data.

Test: backend/test_thesis_seed.py runs init_db and asserts the core line, 5 segment
lines, the 2-member Option A/B variant group, 3 pillars, segment_cuts, and segment
defs, plus re-seed-is-a-no-op (11/11).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 15:19:44 -05:00
Keysat 2afed210cb Grid/contacts unification step 1: real contact_id link + grid as front door (v0.1.0:52)
Structural fix for the duplicate-people class of bug: instead of matching a grid
contact "pill" to a contacts row heuristically by name/email (which drifted and
caused the 1406 double-count), link them by id.

Backend:
- Migration 0004: fundraising_contacts.contact_id (additive, nullable, logical FK
  to contacts(id)) + index. Paired down migration.
- sync_fundraising_relational now stores the id that _upsert_contact_from_fundraising
  already returns, so every grid contact carries its contacts-table id.
- _backfill_grid_contact_ids: one-time, idempotent backfill on startup (re-runs the
  grid sync once if any row lacks contact_id), so existing data links immediately.
- entity_resolution: grid pass prefers the explicit contact_id link (match_kind
  'grid_link') over heuristic email / name+investor, guarded by a PRAGMA check so
  older DBs without the column still work.

Frontend:
- Fundraising grid "+ Row" -> "+ Investor" (clear, single investor entry point).
- Contacts page: the "+ Add Contact" trigger is replaced by a pointer to the grid;
  the page is now a read/search/edit view (ContactDetailPanel still edits all
  fields). New people are added from the grid. No contact data is removed.

Tests: backend/ingest/test_entity_resolution.py extended (explicit-link case, 11/11)
and a new backend/test_grid_contact_link.py integration test (init_db applies 0004,
sync populates contact_id to the right contact, re-sync is idempotent). py_compile +
frontend html.parser clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 15:10:26 -05:00
Keysat d16264f401 Fix people double-count + duplicate-queue explosion (v0.1.0:51)
Root cause: grid contacts (fundraising_contacts) are the SAME people as the
contacts table (the app syncs them by name/email), but resolution matched grid
rows by (name + investor-canon) where the two sides derive the investor key from
different tables that rarely line up — so nearly every grid contact minted a
duplicate person (715 + ~692 ≈ 1406), and the duplicate finder then flagged each
twin against its real self (~676 candidates).

Fix (entity_resolution.py):
- Grid pass matches a grid contact to its existing contacts-table person by
  PROVABLE keys only (exact email, else exact name within the same investor) and
  records membership; on a miss it MINTS NOTHING (the old else-branch mint was the
  double-count source, and guessing by name across firms risks binding two
  different same-named people).
- Targeted, audited cleanup soft-deletes leftover grid-only "twins" (person rows
  with no 'contacts' link) and superseded pre-:48 'lp'/'organization' rows, guarded
  so any row carrying enrichment/human data is never dropped (guardrail #3); the
  tombstoned ids are logged to interaction_log (guardrail #5).
- _upsert_entity clears deleted_at on conflict so a re-emitted id is un-tombstoned
  (no permanent burial); fuzzy-merge losers stay buried via _redirect.

entity_merge.py / server.py: the duplicate queue + pending count now filter to
candidates whose both sides are still live, so self-healed twins drop out.

Verified: offline reproduction test (backend/ingest/test_entity_resolution.py,
10/10) reproduces the 1406-style doubling and proves it collapses; no regression
on the synthetic dev set; two adversarial review passes. Known pre-existing
identity-key weaknesses (same name+firm+no email collision; shared role inbox
over-link) are unchanged by this fix and will be resolved structurally by the
contact_id link in the grid/contacts unification.

Run "Build search index" after upgrading to recompute the canonical layer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 14:49:39 -05:00
Keysat dd25bbc08d Architect agent: Claude-powered thesis generation (backend scaffolding)
- backend/mcp/architect_agent.py: generate_options + revise on Claude (prompt-
  cached thesis context, claude-opus-4-8, Ten31 voice rules). Writes N variant
  drafts to a node's variant group; nothing canonical without human approval.
  Fails gracefully if the API key / SDK is absent.
- server.py endpoints: GET /api/architect/status, GET /api/thesis/{key}/tree,
  GET /api/thesis/nodes/{id}/variants, POST .../generate, POST .../feedback,
  POST /api/thesis/lines, POST /api/thesis/lines/{key}/nodes. architect_tools
  gains get_node_variants.
- Dockerfile installs `anthropic`; docker_entrypoint loads ANTHROPIC_API_KEY from
  /data/secrets/anthropic-api-key (self-disabling until the key is dropped in).

Full HTTP surface verified end-to-end (graceful 502 without a key).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 13:25:47 -05:00
Keysat 91361042e7 Entity model: investors (grid) vs people (contacts); fix double-count (0.1.0:48)
Per Grant's clarification of the real data model:
- Investor entities come from the fundraising grid, one per row, all labeled
  "investor" (drops the confusing lp/organization split). Grid is source of truth.
- People come ONLY from the contacts table. The grid's contacts (fundraising_
  contacts) are matched to a contact-person and recorded as member_of links to
  their investor, instead of creating duplicate person entities. This fixes the
  ~doubled people count (people now ≈ contacts, not contacts + grid contacts).
- System Status cards: Investors / People (resolved) / Contacts in CRM / Grid
  contacts, so resolved-vs-source is visible at a glance.

Verified on synthetic: people == contacts count (no double-count); multi-contact
investors preserved via member_of.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 13:05:58 -05:00
Keysat 3c31b1e8a5 Soft-delete + source-count diagnostics; thesis v4 (0.1.0:47)
- DELETE handlers soft-delete (set deleted_at) + cascade contact -> opps/comms/lp
  instead of hard-deleting (guardrail #3); list queries filter deleted rows.
- ingest: chunking excludes soft-deleted records; qdrant delete-by-source-id;
  sync prunes soft-deleted records' vectors incrementally.
- /api/system/status returns raw source-record counts for sanity-checking.
- docs/thesis-seed-v4.md (no "bet" language, scarcity-forward, freedom-tech as
  a banner option, tightened pillars, reworked segments + edge).

Soft-delete verified via the running HTTP server (delete -> hidden + row kept).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 12:20:38 -05:00
Keysat cd3cca725c Phase 1: dual approval default, web-UI index jobs + merge review queue, thesis v2
- Dual sign-off is now the default (thesis_required_approvals defaults to 2).
- Entity-merge review queue (migration 0003): the fuzzy/Qwen tier no longer
  auto-merges — it writes CANDIDATES (entity_merge_candidates) with a same/different
  suggestion + confidence + reason for a human to approve (merge) or reject (keep
  separate). entity_merge.py applies/rejects (durable via entity_merges, soft-delete,
  repoint links+edges); decided pairs aren't re-surfaced.
- entity_jobs.py: UI-triggered background index jobs (rebuild/update/find-duplicates)
  as subprocesses with a one-at-a-time lock; status in /api/system/status.
- server.py: /api/index/{rebuild,update}, /api/entities/find-duplicates,
  /api/entities/merge-candidates [+ /{id} decide] — admin-gated.
- docs/thesis-seed-v2.md: concrete, plain-English rewrite per Grant's feedback.

Backend verified end-to-end on synthetic data (candidate gen -> approve/reject).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 11:14:12 -05:00
Keysat dd2c34d7bc Phase 1: investor↔contacts (member_of), system status, thesis seed v1
- entity_resolution: emit member_of relationship edges (contact -> investor),
  so one investor entity owns many contacts (institution) and a HNWI is the N=1
  case; crm_tools.get_investor_contacts + get_entity contacts/member_of; MCP tool.
- seed_synthetic: multi-contact institutions to exercise it (Harbor & Vine = 5).
- server.py: GET /api/system/status (index/entity/thesis/activity health) for an
  in-app status view (no shell needed to verify the index).
- docs/thesis-seed-v1.md: grounded v1 thesis (throughline, 6 pillars, objections,
  per-segment angles, voice) drawn from Ten31's newsletter/site/essays.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 10:47:26 -05:00
Keysat 3e199fd8d5 Phase 1 Workstream A+E: thesis substrate + dual-approval gate
- migration 0002_phase1_architect: thesis_lines (core spine + per-segment lines),
  thesis_nodes (+ append-only revisions), thesis_versions (one-canonical-per-line
  DB invariant), thesis_reviews (dual approval + feedback), segments. Reversible.
- backend/mcp/architect_tools.py: agent draft tools (node tree, versions,
  segments, get_canonical fails-closed) — NO self-approval path. MCP-exposed.
- backend/thesis_review.py + server.py routes: human-gated approval. Dual sign-off
  via thesis_required_approvals; atomic supersede; every action logged.
- docs/PHASE_1.md (kickoff brief); docs/OPERATIONS.md (partner guide);
  start9/0.4 "Resolve duplicate names" fuzzy action.

Verified on synthetic data: dual approval promotes correctly, exactly one
canonical survives supersede, get_canonical fails closed, full interaction_log.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 10:20:00 -05:00
Keysat 6be2e40f54 Phase 0 go-live polish: hands-off incremental sync + refresh action
- backend/ingest/sync_scheduler.py: periodic incremental-sync loop (every
  CRM_INGEST_SYNC_INTERVAL_MIN min); resilient, --once for testing.
- start9/0.4: "Refresh search index" action (incremental sync.py); entrypoint
  launches the scheduler as a background process when Spark/Qdrant are set;
  CRM_INGEST_SYNC_INTERVAL_MIN env; pre-release note on fastembed/mcp pins.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 09:36:06 -05:00
Keysat f357c23c75 Phase 0 complete: fuzzy entity tier, incremental sync, Start9 packaging
- Fuzzy tier (backend/ingest/fuzzy_resolve.py + llm.py): local Qwen adjudicates
  the deterministic resolver's flagged name-variant candidates; merges are
  durable via entity_merges (deterministic re-runs respect them), losers
  soft-deleted, logged. Idempotent.
- Incremental sync (backend/ingest/sync.py): re-embeds only rows changed since a
  watermark (ingest_sync_state); first run / --recreate = full. Tested full→0→1.
- Start9 packaging (start9/0.4): Dockerfile bundles ingest+mcp + fastembed/mcp;
  "Build search index" action runs the init in a subcontainer; MCP shipped as a
  manual stdio server (not a daemon); version 0.1.0:44. INGEST_PACKAGING.md.
- backfill.py: factored embed_and_upsert() shared with sync.

Verified end-to-end on synthetic data + live Sparks/Qwen/Qdrant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 08:55:12 -05:00
Keysat c7ce44d963 Phase 0 foundation: canonical schema, ingest pipeline, CRM MCP server
Workstream A–C substrate for the Ten31 agentic system:
- A1: docs/crm-overview.md; CLAUDE.md conventions + guardrail #9
- A2: additive/reversible core migration (canonical_entities, entity_links,
  interaction_log, relationship_edges, soft-delete) + ledgered runner
- B1/B3: chunking + deterministic entity resolution (backend/ingest)
- B2: dense (bge-m3) + BM25 sparse ingest to Qdrant crm_chunks
- C: CRM MCP server (reads, retrieval modes, logged writes) — no outbound tools
- docs: redaction/re-hydration, Gmail enablement runbook
- synthetic test data; .env.example; housekeeping (.gitignore, untrack crm.db,
  drop legacy files + start9/0.3.5)

Verified end-to-end on synthetic data + live Sparks (hybrid > dense on entity
queries). Real backfill runs on Ten31 infra; index holds synthetic data only.
Branch snapshot also captures pre-existing working-tree changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 08:13:35 -05:00
MacPro 7027efd777 init local package repo 2026-02-27 12:44:50 -06:00