Add Matrix intake bot (M1+M2): typed message → approved fundraising-grid write
New backend/matrix_intake/ runs as its own process (matrix-nio isolated from the stdlib CRM): local-Qwen parse via Spark Control → in-thread human approval (yes/edit/no) → write through the CRM's own log-communication endpoint, tagged source=matrix_intake. Adds read-only GET /api/intake/match (returns grid row id, no-duplicate contract); threads provenance through handle_log_fundraising_communication. Reviewer-passed: pop-before-commit closes a double-approve race; edit-grammar fix. Text-only v1; business-card photo (M3) deferred (no Spark vision model). 26/26 tests green; live Matrix smoke pending deploy.
This commit is contained in:
@@ -48,6 +48,7 @@ cd start9/0.4 && make
|
||||
- `backend/redaction/` — `scrub.py` + `client.py`: the scrub→Claude→re-hydrate privacy boundary.
|
||||
- `backend/ingest/` — chunk→embed→Qdrant + retrieval modes.
|
||||
- `backend/entity_*.py` — entity resolution/merge (the two-investor-model reconciliation).
|
||||
- `backend/matrix_intake/` — Matrix intake bot (separate process; `matrix-nio`, isolated to this component): typed message → local-Qwen parse → in-thread approve → write via the CRM's own `log-communication`. See the matrix-intake guide.
|
||||
- `frontend/index.html` — the entire UI.
|
||||
- `docs/` — architecture, phase plans, contracts, runbooks (see Deeper docs). `docs/guides/` — scoped subsystem rules (see below).
|
||||
- `start9/0.4/` — StartOS package (`startos/utils.ts` holds `PACKAGE_VERSION`).
|
||||
@@ -63,6 +64,7 @@ Subsystem rules live in `docs/guides/` and lazy-load in Claude Code via `.claude
|
||||
- **Ingest / retrieval** (`backend/ingest/`) → `docs/guides/spark-ingest.md`
|
||||
- **Email capture / drafts + digest send** (`backend/email_integration/`, `backend/digest_mailer.py`, `backend/smtp_send.py`) → `docs/guides/email.md`
|
||||
- **Building or deploying the s9pk** (`start9/`) → `docs/guides/packaging.md`
|
||||
- **Matrix intake bot** (`backend/matrix_intake/`) → `docs/guides/matrix-intake.md`
|
||||
|
||||
## Conventions
|
||||
|
||||
@@ -103,13 +105,14 @@ Subsystem rules live in `docs/guides/` and lazy-load in Claude Code via `.claude
|
||||
|
||||
_Phase 0 substrate + Phase 1 thesis/outreach are built; **box and repo at v0.1.0:83** (deployed & verified live 2026-06-16). v83 (latest): **email search/query + windowed digest preview** — Communications tab gains a fixed/typed investor dropdown, a date-range filter, a full-body view, and a semantic "Search content" mode; the Daily Digest gains an in-app windowed preview before send. Prior v82: front-end libs vendored + SRI-pinned + jsdom render-smoke build gate. **Decision (2026-06-16): the fundraising grid + email capture is the canonical system of record** — vestigial classic-CRM surfaces get pruned or repurposed (see `ROADMAP.md` → "Consolidate on the fundraising grid as canonical"). Longer-term backlog: `ROADMAP.md`._
|
||||
|
||||
- **Built & reviewed, not yet deployed — Matrix intake bot (M1+M2), `backend/matrix_intake/`:** a separate-process bot (its `matrix-nio` dep isolated from the stdlib CRM) that turns a typed message in a dedicated Matrix room into a proposed fundraising-grid add/edit and writes only after **in-thread human approval** (`yes`/`edit field=value`/`no`). Parse = local Qwen via Spark Control (reuses `ingest/llm.py`; no Claude, no scrub needed — local path like the digest). Writes reuse the CRM's own `POST /api/fundraising/log-communication` (create-if-missing + contact upsert + note + relational sync + audit), tagged `source="matrix_intake"`; the one new CRM surface is read-only `GET /api/intake/match` (`find_intake_match`) returning the **grid row id** so an approved note lands on the matched investor (no duplicate). v1 is **text-only** — business-card photo (M3) is deferred (Spark Control has no vision model). Reviewer-passed (double-approve race fixed — `handle_reply` pops before the commit await; edit-grammar fix). **Code-complete, compiles, 26/26 tests green; a live Matrix smoke needs creds + `matrix-nio` on the Spark (can't run in CI).** Guide: `docs/guides/matrix-intake.md` (incl. the `settings.py`-not-`config.py` collision + email-integrity gotchas).
|
||||
- **Working (all draft-only):** CRM + ingest (chunk→embed→Qdrant + retrieval) + redaction boundary; Gmail capture (DWD) + email-activity propose→approve; Thesis Workshop + Architect (Claude) with dual-approval gate; Outreach Draft Assistant + follow-up radar + per-user voice + Tier-B in-thread Gmail draft creation.
|
||||
- **Deployed & verified live: v0.1.0:83** (box `$START9_BOX_HOST`/immense-voyage.local; `installed-version`→`0.1.0:83`, migration chain `…82→83` clean, server up on `:8080`, Gmail + ingest + digest schedulers all started; render-smoke gated the build) — **email search/query + windowed digest preview** (code-only, migrations no-op). Communications tab (`CommunicationsPage` + `email_integration/db.query_email_activity`): **fixed the investor dropdown** — the facet now mirrors the list with the digest's precedence (grid → org → contact → address) and **typed keys** (`fund:`/`org:`/`contact:`), so email matched only to a classic contact or org domain (no grid id — the common case, since `fundraising_contacts.email` is sparsely populated) now resolves to a real name and is selectable, instead of the dropdown being empty; added a **date-range filter** (`since`/`until`), and a **click-to-expand full-body view** (`GET /api/email/detail?id=` → `query_email_detail`, admin, soft-delete-gated, renders `body_text` escaped — never raw HTML). New **semantic content search**: a "Search content" toggle → `GET /api/email/search?q=` (`routes._h_search`) wrapping `ingest/search.py:hybrid_search` filtered to `doc_type='email'` (lazy import; **503** if Spark/Qdrant unreachable), **hydrated + soft-delete-filtered against SQLite** (`db.search_hit_emails` — never trust the derived index). **Daily Digest:** Settings → Admin now builds a digest over a chosen window (last 24h or since a date) as an **in-app preview** before sending (`POST /api/admin/digest/preview`); manual send uses the same window (`send-now` + `digest_scheduler.send_digest_window`); window resolved by `digest_builder.resolve_digest_window` (cap 92d). Both run the **real local-Spark summarizer** and **never touch the daily cursor**. Verified: 22/22 backend tests, `py_compile` clean, render-smoke pass. **Grant validated both live on the box 2026-06-16** — the digest windowed preview renders real Spark narratives over real activity, and the Communications dropdown / date filter / full-body view / content-search all work. Detail: `docs/guides/email.md`.
|
||||
- **Deployed & verified live: v0.1.0:82** (box `$START9_BOX_HOST`/immense-voyage.local; `installed-version`→`0.1.0:82`, migration chain `…81→82` clean, server up on `:8080`, schedulers + Gmail integration up). **v82 vendored React 18.3.1 / ReactDOM 18.3.1 / @babel/standalone 7.29.7 into `frontend/assets/vendor/`**, served same-origin with `sha384` SRI (no CDN, no outbound-internet dependency to render the UI), and added **`start9/0.4/render-smoke.mjs`** — a jsdom check (shipped-Babel transform asserts classic/non-module + parseable; real mount asserts the login UI renders) wired into the default `make` goal (`verified-build`), so every build is gated on the frontend actually rendering. Closes the v78 (blank screen) + v79 (Babel-8 ESM-import) class structurally. Detail: `docs/guides/packaging.md`. **Prior shipped & live:** v81 Communications-tab matched-only (`query_email_activity` gates on `EXISTS(email_investor_links)`; unmatched email captured but never shown; `docs/guides/email.md`); v80 admin-only email-activity panel (`GET /api/email/activity`); v78 retired `lp_profiles`/LP Tracker + repointed Dashboard "Total Committed" onto the grid (graveyard-excluded). **Digest fully live:** capture (DWD) → propose→approve; Gmail-DWD→SMTP transport; daily Phase-B digest (`digest_builder.py` + always-on `digest_scheduler.py` reading a DB policy + `send-now`); **daily auto-send is now ENABLED** (Grant turned it on in Settings → Admin, 2026-06-16). Detail: `docs/guides/email.md`.
|
||||
- **Live since v74 (2026-06-13):** login works; `/assets/` traversal 404s (plain + URL-encoded), root health 200. On boot, `ensure_thesis_v2_promoted` makes the v2.0 reserve-asset spine the working *approved* spine (node-level, reversible). Security/privacy hardening (path-traversal close, outreach NER backstop, get-by-id soft-delete) shipped in v74 — detail in `EVALUATION.md`.
|
||||
- **Tests (2026-06-16):** **22/22 backend tests green** via `python3 backend/run_tests.py`, `py_compile` clean. `test_email_activity_panel.py` now covers the **typed facet + org/contact resolution** (the dropdown fix), the **date-range filter**, the **detail view** (full body / recipients / attachments / soft-delete), and the **content-search route** (hydrate / drop-tombstoned / 503 / admin) with retrieval stubbed; `test_digest_builder.py` adds the **window resolver** + **`send_digest_window`** (no-cursor-touch) cases. Frontend **render smoke check** (`cd start9/0.4 && make render-smoke`) still gates the default `make` build. The 2 stale thesis tests stay fixed (seed structure in `docs/guides/thesis.md`).
|
||||
- **Tests (2026-06-16):** **26/26 backend tests green** via `python3 backend/run_tests.py`, `py_compile` clean. (+4 this session for the Matrix intake bot: `matrix_intake/test_parse.py`, `test_proposals.py`, `test_crm_client.py`, and `test_intake_endpoints.py` — the last boots the real server against a temp DB and covers `/api/intake/match`, the create→match no-duplicate contract, and `source="matrix_intake"` provenance.) `test_email_activity_panel.py` now covers the **typed facet + org/contact resolution** (the dropdown fix), the **date-range filter**, the **detail view** (full body / recipients / attachments / soft-delete), and the **content-search route** (hydrate / drop-tombstoned / 503 / admin) with retrieval stubbed; `test_digest_builder.py` adds the **window resolver** + **`send_digest_window`** (no-cursor-touch) cases. Frontend **render smoke check** (`cd start9/0.4 && make render-smoke`) still gates the default `make` build. The 2 stale thesis tests stay fixed (seed structure in `docs/guides/thesis.md`).
|
||||
- **Decided, not yet built (detail in `ROADMAP.md`):** Pipeline adoption + a grid flag that auto-loads flagged investors as opportunities; **NL→safe-query** feature (search item 3 — the larger, separate build); CRM as canonical thesis backbone with the signal-engine reading from it (reconciliation unwired); reply-all for Tier-B drafts (currently reply to the LP only). *(Done this session, v83: email search item 1 [activity query/panel gaps — typed facet fix + date range + full-body view] and item 2 [semantic content search] both shipped; daily-digest windowed preview→send.)*
|
||||
- **Known debt (P2, not deploy-blocking):** **reports-subsystem soft-delete sweep** — `handle_pipeline_report` + remaining report/aggregate queries over opportunities/communications still count soft-deleted rows (v78 shrank this surface: the `lp_profiles`/lp-breakdown aggregates are gone and the dashboard "Total Committed" is now grid-sourced); needs a pass + report-endpoint tests. Also `?limit=abc` crashes the request thread (authenticated list path); scrub-gateway TLS verify off; `cryptography==42.0.5`; stale user-visible `start9/0.4/assets/ABOUT.md`; hardcoded Spark/Qdrant IPs in the s9pk; **StartOS package icon oversized/zoomed** (research the Start9 icon spec, source a base ten31 logo, produce a correctly sized icon **before the next s9pk upload**); the 5.4k-line `server.py` monolith. P3 batch + full list in `EVALUATION.md`. *(Resolved v82: front-end CDN/SRI risk — libs vendored + SRI-pinned — and the render smoke check is now scripted into the build.)*
|
||||
- **Doc drift to reconcile:** `crm-overview.md` + `EVALUATION.md` still describe `lp_profiles` as a live model in places — a doc-auditor pass should align them to "grid canonical, `lp_profiles` retired."
|
||||
- **Other gaps:** the v2.0 spine is the *working* spine but **not a canonical `thesis_version`** (needs Grant + Jonathan dual sign-off); Appendix-A conviction/exposure (incl. ~40% Strike) stay Grant's working read, not canonical, not fed to the engine. Live infra now exercised on the box (Gmail capture + schedulers up; local-Spark summarization confirmed via the digest preview; Qdrant via Communications content-search); **Claude/Architect path still unverified live on the box.**
|
||||
- **Next:** 1) add an **auth regression test** asserting the 3 v79-gated GET endpoints (`/api/users`, `/api/email/status`, `/api/email/accounts`) reject members; 2) **reports-subsystem soft-delete sweep** + report-endpoint tests; 3) **Pipeline adoption** — grid flag → auto-load opportunities; 4) `?limit=abc` crash; 5) **NL→safe-query** (search item 3 — separate, larger); 6) Grant + Jonathan freeze v2.0 canonical; 7) build reply-all for Tier-B drafts; 8) **email-capture tab shows an error on email sync status** — chase down the failing call (likely `/api/email/status`) and fix. *(Logged to ROADMAP: a build step that pre-compiles JSX to drop runtime Babel entirely — bigger, contradicts the "no build step" convention.)*
|
||||
- **Next:** 1) **deploy + live-smoke the Matrix intake bot** (`pip install matrix-nio` + `MATRIX_*`/`CRM_BOT_*` in `.env` on the Spark, create the CRM bot user, `python3 backend/matrix_intake/bot.py`, post a test message); 2) **Pipeline adoption** — grid flag → auto-load opportunities (the agreed next major build); 3) add an **auth regression test** asserting the 3 v79-gated GET endpoints (`/api/users`, `/api/email/status`, `/api/email/accounts`) reject members; 4) **reports-subsystem soft-delete sweep** + report-endpoint tests; 5) `?limit=abc` crash; 6) **email-capture tab error on email sync status** (likely `/api/email/status`); 7) **NL→safe-query** (search item 3 — separate, larger); 8) Grant + Jonathan freeze v2.0 canonical; 9) reply-all for Tier-B drafts. *(Logged to ROADMAP: a build step that pre-compiles JSX to drop runtime Babel entirely — bigger, contradicts the "no build step" convention.)*
|
||||
|
||||
Reference in New Issue
Block a user