Matrix intake bot

Read this before editing backend/matrix_intake/. The bot turns a typed message in a dedicated Matrix room into a proposed fundraising-grid add/edit, gated on in-thread human approval before any write. Phase status: M1 + M2 deployed & live (text intake + approval + write; bot on the Spark, CRM endpoints on the box at v0.1.0:86; live-smoked 2026-06-17). M3 (business-card photo) BUILT — bot-only, awaiting live-smoke (the prior blocker — "Spark Control has no vision model" — is gone: the daily-driver model is now vision-capable; see Business-card capture below).

Post-deploy UX pass — DEPLOYED & LIVE 2026-06-17: fuzzy investor matching (server-side, v0.1.0:86, installed to the box — candidates endpoint verified live) + in-thread disambiguation and conversational natural-language edits (bot-side, pulled + restarted on the Spark). See Fuzzy matching below. Tests green (27/27 backend + the offline bot suite); the Matrix live-smoke of the disambiguation grammar and the Qwen revise leg is still pending.

What it is (and isn't)

A separate process, not part of the CRM. Its only third-party dep, matrix-nio, lives in backend/matrix_intake/requirements.txt and must never be added to the stdlib CRM (backend/server.py). Runs on the Spark (placement per standards/guides/placement.md).
It drafts; a human approves. Nothing is written autonomously — every CRM write follows a yes reply in the proposal thread. This is exempt from "agents draft, humans send" the same way the digest is: it's internal data entry to our own CRM, not outward LP contact.
It is not a parallel write path. It reuses the CRM's own canonical endpoint POST /api/fundraising/log-communication (create-if-missing + contact upsert + note + relational sync + audit) for both new-investor and existing-note cases. Don't reimplement grid mutation in the bot.

Flow

Top-level message in the intake room → parse.parse_message → local Qwen via Spark Control (spark.py reuses backend/ingest/llm.py; temp 0, JSON only) extracts {intent, investor_name, contact_name, contact_email, contact_title, note}. The original message text is stashed on the proposal as _source_text (needed later for revise's email-integrity check). The system prompt is built by parse.build_system(roster), which — when a team roster is configured (INTAKE_TEAM_ROSTER, see Config) — appends an outreach frame: those names are our own team members doing the outreach, so a teammate's name is never extracted as the investor/contact and the other party is the prospect. Fixes the live-smoke gripe where "jonathan is chatting with wyoming" picked the teammate, not the prospect. revise gets the same framing. Roster unset → prior behavior (no frame).
crm_client.match (GET /api/intake/match) resolves new-vs-existing. It returns both an exact match (returns the grid row id so an approved note lands on exactly that investor, no duplicate) and, when there's no exact match, a ranked list of fuzzy candidates (see Fuzzy matching below).
Three outcomes drive what gets posted, all in a thread rooted at the user's message, plus a brief main-timeline nudge (a plain reply — matrix_io.make_reply) so it isn't missed:
- Exact match → auto-attach: proposal flips to meeting_note with _match_id set, rendered as the normal approval card.
- Fuzzy candidates, no exact → a disambiguation card (proposals.render_disambiguation): the proposal is held at _stage="disambiguate" with _candidates, and the human must pick a number / new / no before it becomes an approval-stage proposal.
- Neither → the new-investor approval card. The nudge is a pointer only, not a reply target — you need the thread to act. The pending proposal is held in memory keyed by the thread root (proposals.ProposalStore).
User replies in the thread. handle_reply branches on _stage:
- disambiguate (handle_disambiguation): a number attaches to that candidate (→ meeting_note
  - _match_id, re-rendered for approval); new proceeds as a new investor; no discards.
- approval: yes commits; no discards; edit field=value is the deterministic fast-path edit; anything else is treated as a natural-language revision — parse.revise sends {current proposal + instruction} back through local Qwen and re-renders the revised card (a no-op revision is detected via proposals.same_fields and re-prompts instead of saying "Updated"). On yes, crm_client.commit POSTs to log-communication tagged source="matrix_intake" (provenance in the audit log). A bare yes/no typed top-level (not in the thread) while a proposal is pending gets a "reply in the thread" redirect (store.any_pending() guard in handle_intake), not a misparsed new intake.

Business-card capture (M3 — image intake)

Send a photo of a business card into the intake room and the bot turns it into the same new-investor proposal a typed note would. The only added step is image → text; from there the existing flow (parse → match → disambiguate → approve → log-communication) runs unchanged — handle_card just calls handle_intake with the transcription.

Trigger: a top-level m.image event in the intake room (on_image → handle_card in bot.py; registered via a second add_event_callback(on_image, RoomMessageImage)). Images in the Q&A / email-review rooms, the bot's own uploads, and an image dropped inside an existing thread are ignored. The card's own event becomes the proposal thread root, like a text message.
The one new call (spark.transcribe_card → llm.chat_vision): download the image (client.download(mxc=event.url) — unencrypted only; an E2EE room delivers a different event class we don't register for, so encryption is naturally excluded), base64-encode, and POST an OpenAI multimodal /v1/chat/completions to Spark Control — same endpoint, same model id (CRM_CHAT_MODEL, the daily-driver Qwen, capabilities: [vision, reasoning]), with the user message's content an array of a text part + an image_url data-URI. Spark Control is a dumb passthrough (image/app/llm_proxy.py), so no gateway change was needed. The model transcribes the card; it does not emit JSON.
Why transcribe-then-reuse (not vision-straight-to-JSON): the transcription becomes the source text the email-integrity rule checks against — parse.normalize only keeps an address that literally appears in the source, never one the model mints. So a mis-read address can't reach the CRM unapproved, exactly as on the text path, and 100% of parse/match/disambiguation/ approval is reused. The transcription is framed ("New investor — from a business card:\n…") so the extractor reads it as a new investor.
Provenance: a card commit tags source="matrix_card" (vs "matrix_intake" for a typed note) in the audit log, threaded via the proposal's _source control key (handle_intake(…, source=…) → crm_client.build_commit_payload, which defaults to "matrix_intake" when absent).
Fields captured (parse._FIELDS): investor, contact, email, title, city, linkedin_url, phone, mobile, note. city is a plain extracted field (low-harm if wrong; the human sees it). linkedin_url, phone, and mobile follow the email-integrity rule — kept only if they literally appear in the source/instruction, never minted (for phone/mobile the check is digit-run membership, ≥7 digits, in the source). Mapping (from the labeled transcription): an office/main/ direct number → phone, a cell/mobile → mobile, a fax is skipped. All ride to the contact via the existing log-communication upsert (_upsert_contact_from_fundraising); city also syncs to the grid contact pill, the rest land on the canonical contact record (contact-level, not pills).
Phone/mobile shipped in v0.1.0:98 (live 2026-06-20). The server half — _upsert_contact_from_ fundraising accepting contact.phone + contact.mobile — is in the s9pk; the bot's transcription (Phone/Mobile/Fax labels), extractor, and card shipped on the Spark in the same deploy, so phone never showed on a card before the box could store it.
UX: the bot acks 📇 Reading the card… before the (slower) vision call; an unreadable image (model replies NONE, or transcription < 5 chars) gets a "try a clearer, well-lit photo" reply instead of a garbage proposal.
Deploy is bot-only — the change lives in backend/matrix_intake/ (bot.py, spark.py) + backend/ingest/llm.py (bundled into the bot image), shipped on the Spark via git pull + docker compose up -d --build. No s9pk, no version bump, no new env (same model; no auth on the LAN). Contrast with M2 / email-review, whose server endpoints had to ship in the s9pk.
Known limits (live-smoke checklist): ① a StartOS reverse-proxy body cap could 413 a large photo — the model already downscales server-side (max_pixels ≈ 2 MP), so if it trips, add a client-side resize (would pull Pillow into the bot image); ② iPhone HEIC may not decode in vLLM's PIL — most clients (Element iOS) transcode to JPEG on upload, but confirm on-device; ③ the offline tests stub the vision call (test_spark.py); the download + real OCR is live-smoke only.

Fuzzy matching (server-side, ships in the s9pk)

GET /api/intake/match returns {match, candidates}. find_intake_match is unchanged — exact-after-normalization, and an exact match still auto-attaches without disambiguation. find_intake_candidates (new) is the fuzzy layer, deterministic, no LLM: it scans the same canonical grid blob and scores each row by max(name similarity, email near-match), keeping rows ≥ min_score (0.62), ranked, capped at 5:

Name (_name_similarity): max of stdlib difflib sequence ratio (near-spellings — "Charlie"/"Charles") and token-set Jaccard (word-order). Legal-entity suffixes (LLC/LP/Inc/… via _strip_legal_suffix) are stripped first, so "Acme Capital" ~ "Acme Capital LLC" scores 1.0 (a near-certain duplicate find_intake_match misses because it compares the full string) — and is surfaced as a candidate, never auto-attached (the human still confirms).
Email (_email_edit_distance): Levenshtein ≤ 2 against each contact email (dist 1→0.9, 2→0.8). Distance 0 is an exact email — that's find_intake_match's job, skipped here.
Recall-favoring by design: a shared common name-word ("… Capital") can lift an unrelated firm into the 0.6–0.8 band. Acceptable — it's a ranked, human-confirmed shortlist, and the cost of an occasional stray suggestion is far lower than missing a real near-duplicate. Semantic pruning of the shortlist (the "Charlie really is Charles" judgment) is a deferred LLM-judge re-rank — fed only the shortlist, never the whole LP list — intentionally NOT built in this pass, because the deterministic filter already surfaces every duplicate the human then resolves.

Email-activity proposal review (the CRM→Matrix bridge, v0.1.0:89)

A second, separate flow runs alongside intake: reviewing the proposed grid notes the CRM drafts from newly-matched email (server.propose_email_activity_notes, surfaced on the web Email Capture panel). The bot lets the team approve/dismiss/edit those on mobile, kept in sync with the web panel. The CRM (box, stdlib, no matrix-nio) can't post to Matrix, so the bot pulls.

Dedicated room (MATRIX_EMAIL_REVIEW_ROOM, see Config) — separate from the intake room so high-volume email proposals don't drown the conversational intake. Unset → the whole leg is off (the bot just does intake). The bot must be a member of this room.
Poll loop (bot.poll_email_proposals, every EMAIL_POLL_SEC=20s) calls crm_client. list_email_proposals → GET /api/intake/email-proposals, which returns three work-lists:
- to_post — pending, not yet posted → the bot posts a review card (metadata + a short email snippet + the drafted note; the full body is the web popup's job, kept compact for mobile), then records the thread-root event id via POST .../{id}/matrix {event_id}.
- open — pending, posted, not closed → the bot rebuilds its event_id → proposal routing map from these on every poll, so replies still route after a bot restart (unlike intake's in-memory-only store — the state lives CRM-side in email_proposal_matrix).
- to_close — decided on the web while a thread was open → the bot clears it (see redaction below) and POST .../{id}/matrix {closed:true}.
In-thread replies (bot.handle_email_reply, email_proposals.interpret): yes → POST .../{id}/decide {decision:"approve", note} (appends the note to the grid, source='matrix', closes the thread atomically); no → dismiss; anything else → NL revision of the note via local Qwen (email_proposals.revise_note, no Claude/scrub) — re-rendered for re-approval, so the draft→approve gate holds. A no-op/empty revision re-prompts instead of saying "Updated".
Card formatting: email_proposals.render_card frames every card/reply with a RULE dash line top and bottom (frame()) so threads don't bleed together on mobile, and the note names who emailed whom ("{teammate} emailed {investor}" / "{sender} emailed the team") rather than a bare Sent/Received — the wording is built server-side in propose_email_activity_notes.
Decided threads are redacted, not just closed. On any conclusive decision (Matrix or web) the bot calls redact_thread(root): redact the card, then scan recent history (room_messages, MessageDirection.back) for that root's m.thread replies and redact those too — so a resolved thread clears from the threads view, not only the timeline. No confirmation is posted on success (the thread vanishing is the ack; a confirmation reply would keep the thread alive).
- Needs the bot to hold a redact/moderator power level in the review room — required to redact the human's yes/no reply (its own card needs no power). Without it, the reply lingers.
- Full clearing depends on a client setting: redaction removes the events, but Element shows a "Message deleted" placeholder by default — turn OFF "show removed/deleted messages" in Element and both the main chat and the threads view clear completely. (Verified the intended UX 2026-06-18.)
- One-time backfill: backend/matrix_intake/redact_resolved.py (dry-run default; --apply) clears threads decided before this shipped (already closed, so the poll's to_close never touches them). Run on the Spark: docker compose run --rm intake python -u backend/matrix_intake/redact_resolved.py [--apply]. It keeps cards still pending (CRM open) and redacts every other card + its replies.
Two surfaces, one source of truth. Decide on the web → the bot redacts + closes the thread; decide on Matrix → the web panel polls /api/activity/proposals (~25s) and the card clears. email_proposal_matrix (1:1 side row, migration 0003) carries event_id/posted_at/closed_at; a matrix decision sets closed_at in the same txn so it's never re-processed via to_close.
Pure logic is email_proposals.py (card render, reply grammar, note revision) — unit-tested offline in test_email_proposals.py; the async poll/post wiring is in bot.py (live-smoke only).
Known minors (low-likelihood, ~5-person team): if the CRM is unreachable between posting a card and recording its event id, the next poll re-posts a duplicate card (the orphan's replies won't route — re-send/decide the recorded one). A mid-revise bot restart loses the in-memory revised note (rebuilt from open = the original proposed_note; still a valid proposal).

NL query — read-only Q&A (W2 step 5)

A read-only "ask the database in plain English" flow, answered in-thread. No write path, no approval gate — it only runs the curated, parameterized queries behind the CRM's NL-query endpoint, so it's exempt from the draft→approve dance the write flows need. Two entry points, same handle_query → crm_client.nl_query underneath:

Dedicated Q&A room (MATRIX_QUERY_ROOM, recommended) — every top-level message is a question; no trigger needed. This is the room-per-purpose model (intake / email-review / Q&A, with a future reminders-push room): the trigger grammar below exists only to disambiguate question-vs-note when Q&A shares the intake room, which a dedicated room makes unnecessary. The simplest room of the three — read-only, no approval, no redaction, no special power level.
@bot/? trigger in the intake room (cross-room convenience) — fire a quick question without switching rooms. query.parse_trigger (pure/tested) matches a top-level message starting with ?, @bot, /ask, /query, or /q. The trigger is required there, so plain intake notes still route to intake. A bare leading ask is deliberately not a trigger — it would collide with notes like "Ask Jane to send the deck". A bare trigger (@bot alone) posts help.
One endpoint call (crm_client.nl_query → POST /api/query/nl {question, source:"matrix"}): translation runs on the box's local Qwen (nothing leaves the box; no Claude, no scrub — same basis as intake) and only the fixed nl_query catalog can run. The bot is a thin client — see docs/guides/nl-query.md for the trust model.
Rendering (query.render_answer, pure/tested): a deterministic Matrix-markdown answer (summary + interpreted intent + compact rows, money/date formatting, nested contacts/commitments for investor_lookup). Results never go back to any model. Mobile soft-cap MAX_DISPLAY_ROWS (30) with an explicit "+N more" note — never a silent cut.
Status passthrough: the endpoint returns its structured body on a hit and on the soft 503 (model down) / 500 (query fault) codes, so nl_query hands those to the renderer; only an auth/shape failure (403/400) raises → a brief ⚠️ in-thread.
Ships on the Spark (bot-side, query.py + crm_client.nl_query + bot.py wiring) via git pull + restart — no s9pk for the bot. But it depends on the box-side /api/query/nl endpoint, which ships in the s9pk and is not live until v93 (reminders + W2). Deploying the bot before that = a Q&A room that 404s every question (same server-side/bot split as the v83→v84 /api/intake/match 404). Sequence: install v93 first, then set MATRIX_QUERY_ROOM + invite the bot + restart. Pure logic tested in test_query.py (+ nl_query cases in test_crm_client.py); the in-room smoke (a bare message in the Q&A room, or ?… in the intake room) is live-only.

Rules / gotchas

Module-name collision: the intake config module is settings.py, not config.py, because backend/ingest/config.py is imported (as bare config) through spark → llm. A second config module would shadow it in sys.modules and break llm (CHAT_MODEL). Keep intake module names from colliding with ingest's (config, http_util, llm).
Email integrity: parse.normalize only keeps an address that literally appears in the source message — the model must never mint one (a wrong email is worse than none). It takes the first address in the text, so a two-person message ("Alice a@x.com and Bob b@y.com") could attach the wrong one; the human sees it in the proposal and can edit email=… before approving. Cross-referencing multiple addresses to the named contact is a deliberate non-goal for v1.
Conversational revise keeps the email rule: parse.revise re-runs a free-form correction through Qwen but never trusts the model's email field. A changed address is accepted only if it literally appears in the instruction text (searched first), else the existing integrity-checked address is kept (_apply_revision). The model can edit name/contact/title/note freely but cannot mint an email. A revision that nulls both investor and contact is rejected (the proposal can't be emptied to something unactionable). Revise edits fields on the current proposal; it does not re-run the matcher if you rename the firm mid-thread (a known v1 limit — the human still approves).
Deploy is split across two surfaces (mind which one carries a change): the fuzzy candidates come from server.py → ship in the s9pk (build + install, version-bumped). The bot's disambiguation flow + revise live in backend/matrix_intake/ → ship on the Spark via git pull + restart. A bot restart alone won't deliver candidates (the box would return an empty list and the bot just proposes new — safe, but no fuzzy surfacing until the s9pk is installed). Same lesson as the v83→v84 /api/intake/match 404.
Double-approve guard: handle_reply pops the pending proposal from the store before awaiting the commit, so a second yes arriving mid-write is a no-op (asyncio is cooperative; the pop is atomic w.r.t. other events). On commit failure the proposal is restored for retry. Known minor: in the disambiguate stage the pick re-stores an approval-stage proposal before its await say, so a rapidly-repeated 1 can have the second one fall through to the NL-revise path (a wasted Spark round-trip that re-prompts) — harmless, nothing commits, not guarded (low likelihood on a ~5-person team).
Local-only parse: intake text is real LP substance but goes ONLY to local Qwen via Spark Control, never Claude — so no scrub boundary applies (same basis as the digest). Never call a Spark directly; always go through SPARK_CONTROL_URL.
Auth: the CRM has no service-key path; the bot logs in as a dedicated CRM user (CRM_BOT_USERNAME/CRM_BOT_PASSWORD) → Bearer JWT, re-login once on 401.
Tests are offline: test_parse.py / test_proposals.py / test_crm_client.py stub the network; backend/test_intake_endpoints.py boots the real server against a temp DB and covers /api/intake/match + the create→match (no-duplicate) contract + provenance. A live Matrix smoke needs creds + matrix-nio installed on the Spark — it can't run in CI.
Grid note line: the bot sends a blank subject when there's a note so the CRM's one-line note summary shows the note text (the CRM renders subject-or-body); a provenance label is sent only when there's no note. v0.1.0:85 also dropped the redundant [note] type tag from that server-side line (informative types like [call] keep theirs).

Deployment & ops

Runs on the Spark as a docker container (matrix-intake), since 2026-06-17 — SSH alias modelo32, host spark-32d0, repo clone at /home/modelo/ten31-database. Defined by docker-compose.yml at the repo root + backend/matrix_intake/Dockerfile. The image bundles backend/matrix_intake/ and backend/ingest/ (spark.py reaches into the latter's stdlib Spark client via sys.path); .env is mounted read-only at /app/.env. network_mode: host so it reaches Matrix, the CRM, and Spark Control. Startup logs listening as … in room ….
Survives a Spark reboot via restart: unless-stopped — the durability fix that retired the old bare nohup launch. (The previous nohup method + /tmp/intake-bot.pid are gone.)
Deploy / update after a git pull: cd /home/modelo/ten31-database && git pull && docker compose up -d --build. Logs: docker logs -f matrix-intake. Restart: docker restart matrix-intake. Stop: docker compose down. A restart still drops in-memory pending proposals (re-send to recover).
Not yet a spark-control dashboard card. The container is managed via docker/SSH today; a managed card (Update/Restart/Stop/Logs tile, like matrix-bridge) is a separate spark-control task — see docs/handoffs/add-intake-bot-to-spark-control.md.
Gotcha — the repo-root .dockerignore is SHARED with the s9pk build (start9/0.4/Dockerfile, same repo-root context). Don't add bot-only exclusions (e.g. frontend/, docs/) to it — you'd break the CRM image build, which needs them. It already excludes the security-critical bits (data/, .env), which is all the bot build needs.
Server-side endpoints ship in the s9pk, not the bot. GET /api/intake/match and the source provenance on log-communication live in backend/server.py, so they reach the box only via an s9pk build + install — a bot restart won't deliver them. (Missed in v83: the box 404'd /api/intake/match until v0.1.0:84.) Same split for the email-review bridge (v0.1.0:89): the /api/intake/email-proposals* endpoints + the email_proposal_matrix migration (0003) + the bot role ship in the s9pk; the poll loop + review-room handling ship on the Spark (git pull + restart). A bot restart against a pre-v89 box returns nothing useful (404/empty), so install the s9pk first, then set the bot user's role + the review room.
CRM_API_BASE is the box over the LAN, not localhost (bot on the Spark, CRM on the box). https://immense-voyage.local (443) is the StartOS dashboard, not the CRM — the CRM has its own interface address (the URL you open in a browser); container port 8080 isn't LAN-reachable.

Config

All in .env (names in .env.example): MATRIX_HOMESERVER, MATRIX_USER, MATRIX_ACCESS_TOKEN, MATRIX_DEVICE_ID, MATRIX_INTAKE_ROOM; CRM_API_BASE, CRM_BOT_USERNAME, CRM_BOT_PASSWORD, CRM_API_VERIFY_TLS. Spark settings are inherited from the ingest client (SPARK_CONTROL_URL, CRM_CHAT_MODEL).

MATRIX_EMAIL_REVIEW_ROOM (optional) — the dedicated room for the email-activity proposal review leg (above). Unset/empty disables that leg entirely (the bot does intake only). The bot must be invited to + joined in this room. Read once at startup, like the room/roster.
MATRIX_QUERY_ROOM (optional) — the dedicated read-only Q&A room (NL query section above). In it, every top-level message is answered as a query (no ?/@bot trigger). Unset/empty just means no dedicated room — questions still work via the trigger in the intake room. The bot must be invited to + joined in this room (settings.query_room(), read once at startup). No poll loop and no power level needed (read-only). Needs the server side in the s9pk (POST /api/query/nl, ≥ the W2 backend) and the bot's CRM user set to role bot.
Bot CRM user needs role bot. The email-proposal endpoints (/api/intake/email-proposals*) are gated to require_bot_or_admin because they expose LP email content (the proposals are admin-only on the web). The bot role is authenticated-but-not-admin — it passes these endpoints + the auth-only ones the bot already uses (login, /api/intake/match, log-communication), but never require_admin (no user-management/settings/security reach). One-time flip of the existing service account (kept out of the invite UI's member/admin dropdown — provision deliberately): an admin PATCH /api/users/<id> {"role":"bot"}, or on the box UPDATE users SET role='bot' WHERE username='<CRM_BOT_USERNAME>';. Role controls reach; the draft→approve gate (a human still approves every write) controls autonomy — two separate axes.
INTAKE_TEAM_ROSTER (optional, comma-separated) — Ten31 team-member names that frame the parse (see Flow step 1). Use the first names as actually typed in the room ("Grant, Jonathan, …"). Read once at startup by settings.team_roster(), so a roster change needs a bot restart. It lives only in the Spark's .env (bot-side) — no s9pk change. Empty/unset disables the framing.

28 KiB Raw Blame History Unescape Escape