---
paths:
  - backend/matrix_intake/**
---

# Matrix intake bot

Read this before editing `backend/matrix_intake/`. The bot turns a typed message in a dedicated Matrix room into a proposed fundraising-grid add/edit, gated on **in-thread human approval** before any write. Phase status: **M1 + M2 built** (text intake + approval + write); **M3 (business-card photo) deferred**: Spark Control has no vision model yet.

## What it is (and isn't)

- A **separate process**, not part of the CRM. Its only third-party dep, `matrix-nio`, lives in `backend/matrix_intake/requirements.txt` and **must never** be added to the stdlib CRM (`backend/server.py`). Runs on the Spark (placement per `standards/guides/placement.md`).
- It **drafts; a human approves.** Nothing is written autonomously; every CRM write follows a `yes` reply in the proposal thread. This is exempt from "agents draft, humans send" the same way the digest is: it's internal data entry to our own CRM, not outward LP contact.
- It is **not** a parallel write path. It reuses the CRM's own canonical endpoint `POST /api/fundraising/log-communication` (create-if-missing + contact upsert + note + relational sync + audit) for both new-investor and existing-note cases. Don't reimplement grid mutation in the bot.

## Flow

1. Top-level message in the intake room → `parse.parse_message` → local **Qwen via Spark Control** (`spark.py` reuses `backend/ingest/llm.py`; temp 0, JSON only) extracts `{intent, investor_name, contact_name, contact_email, contact_title, note}`.
2. `crm_client.match` (`GET /api/intake/match`) checks new-vs-existing and returns the **grid row id** so an approved note lands on exactly that investor (no duplicate).
3. The proposal is posted **in a thread** rooted at the user's message; the pending proposal is held in memory keyed by that thread root (`proposals.ProposalStore`).
4. User replies in-thread: `yes` / `edit field=value` / `no`.
   On `yes`, `crm_client.commit` POSTs to `log-communication` tagged `source="matrix_intake"` (provenance in the audit log).

## Rules / gotchas

- **Module-name collision:** the intake config module is `settings.py`, **not** `config.py`, because `backend/ingest/config.py` is imported (as bare `config`) through `spark → llm`. A second `config` module would shadow it in `sys.modules` and break `llm` (`CHAT_MODEL`). Keep intake module names from colliding with ingest's (`config`, `http_util`, `llm`).
- **Email integrity:** `parse.normalize` only keeps an address that literally appears in the source message; the model must never mint one (a wrong email is worse than none). It takes the **first** address in the text, so a two-person message ("Alice a@x.com and Bob b@y.com") could attach the wrong one; the human sees it in the proposal and can `edit email=…` before approving. Cross-referencing multiple addresses to the named contact is a deliberate non-goal for v1.
- **Double-approve guard:** `handle_reply` pops the pending proposal from the store *before* awaiting the commit, so a second `yes` arriving mid-write is a no-op (asyncio is cooperative; the pop is atomic w.r.t. other events). On commit failure the proposal is restored for retry.
- **Local-only parse:** intake text is real LP substance but goes ONLY to local Qwen via Spark Control, never Claude, so no scrub boundary applies (same basis as the digest). Never call a Spark directly; always go through `SPARK_CONTROL_URL`.
- **Auth:** the CRM has no service-key path; the bot logs in as a dedicated CRM user (`CRM_BOT_USERNAME`/`CRM_BOT_PASSWORD`) → Bearer JWT, re-login once on 401.
- **Tests** are offline: `test_parse.py` / `test_proposals.py` / `test_crm_client.py` stub the network; `backend/test_intake_endpoints.py` boots the real server against a temp DB and covers `/api/intake/match` + the create→match (no-duplicate) contract + provenance.
  A **live Matrix smoke** needs creds + `matrix-nio` installed on the Spark; it can't run in CI.

## Config

All in `.env` (names in `.env.example`): `MATRIX_HOMESERVER`, `MATRIX_USER`, `MATRIX_ACCESS_TOKEN`, `MATRIX_DEVICE_ID`, `MATRIX_INTAKE_ROOM`; `CRM_API_BASE`, `CRM_BOT_USERNAME`, `CRM_BOT_PASSWORD`, `CRM_API_VERIFY_TLS`. Spark settings are inherited from the ingest client (`SPARK_CONTROL_URL`, `CRM_CHAT_MODEL`).
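
For reference, the double-approve guard from "Rules / gotchas" can be sketched roughly as below. This is an illustration of the pop-before-await pattern, not the bot's actual code: the `ProposalStore` here is a stand-in dict wrapper, and the `handle_reply` signature, the injected `commit` callable, and the returned status strings are all assumptions made for the example. It only covers the `yes` path.

```python
import asyncio


class ProposalStore:
    """Hypothetical stand-in: pending proposals keyed by thread-root event id."""

    def __init__(self):
        self._pending = {}

    def put(self, thread_root, proposal):
        self._pending[thread_root] = proposal

    def pop(self, thread_root):
        # No await between lookup and removal, so no other coroutine can
        # observe the proposal once this call has started.
        return self._pending.pop(thread_root, None)


async def handle_reply(store, thread_root, text, commit):
    """Handle an in-thread `yes`; `commit` is an assumed async callable."""
    if text.strip().lower() != "yes":
        return "ignored"
    proposal = store.pop(thread_root)  # claim the proposal BEFORE awaiting
    if proposal is None:
        return "no-op"  # already claimed by an earlier `yes`, or never existed
    try:
        await commit(proposal)
    except Exception:
        store.put(thread_root, proposal)  # restore so the user can retry
        return "failed"
    return "committed"


async def demo():
    """Race two `yes` replies on one thread; only one should commit."""
    store = ProposalStore()
    store.put("$root", {"investor_name": "Example LP"})
    calls = []

    async def slow_commit(proposal):
        await asyncio.sleep(0.01)  # simulate the CRM round trip
        calls.append(proposal)

    results = await asyncio.gather(
        handle_reply(store, "$root", "yes", slow_commit),
        handle_reply(store, "$root", "yes", slow_commit),
    )
    return results, calls
```

Running `asyncio.run(demo())` shows one reply committing and the other falling through as a no-op. If the pop happened after the `await`, both replies would see the proposal and write twice.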
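
The email-integrity rule from "Rules / gotchas" amounts to discarding whatever address the model returned and trusting only the source text. A minimal sketch, assuming a simple regex is good enough to find addresses: `keep_literal_email` and `EMAIL_RE` are hypothetical names, and the real `parse.normalize` may be structured differently.

```python
import re

# Deliberately simple pattern for illustration; not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def keep_literal_email(source_text, extracted):
    """Return a copy of `extracted` whose contact_email comes only from the source."""
    found = EMAIL_RE.findall(source_text)
    cleaned = dict(extracted)
    # First address that literally appears wins; the model's value is ignored,
    # so a minted address can never reach the proposal.
    cleaned["contact_email"] = found[0] if found else None
    return cleaned
```

With the two-person message from the rule above, this keeps `a@x.com` even if the note is about Bob, which is exactly the case the human is expected to catch in the proposal.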
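
The auth rule ("Bearer JWT, re-login once on 401") can be pictured as a small wrapper around the HTTP call. The sketch below is an assumption about shape, not the real `crm_client`: `CrmSession`, the injected `login` and `send` callables, and the `(status, body)` return tuple are all hypothetical, chosen so the retry logic can be exercised without a network.

```python
class CrmSession:
    """Hypothetical wrapper: lazy login, Bearer token, one re-login on 401."""

    def __init__(self, login, send):
        self._login = login  # assumed: () -> JWT string
        self._send = send    # assumed: (method, path, token, payload) -> (status, body)
        self._token = None

    def request(self, method, path, payload=None):
        if self._token is None:
            self._token = self._login()
        status, body = self._send(method, path, self._token, payload)
        if status == 401:
            # Token probably expired: log in again and retry exactly once.
            # A second 401 is returned to the caller rather than looped on.
            self._token = self._login()
            status, body = self._send(method, path, self._token, payload)
        return status, body
```

Capping the retry at one attempt keeps a bad password from turning into a login loop against the CRM.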