Device-test round 2: 4 in-app fixes + Matrix intake cleanup (v0.1.0:99)

Grant's real-phone testing surfaced seven items; this lands six (the seventh,
in-app camera card intake, is planned in docs/handoffs/in-app-card-intake-plan.md).

CRM half — ships in the s9pk (v0.1.0:99):
- Intake fuzzy match no longer over-indexes on generic firm words. _name_similarity
  now compares DISTINCTIVE tokens only (generic descriptors — "Investment Group",
  "Capital", "Family Office" — stripped via _GENERIC_ORG_WORDS) for both the difflib
  ratio and the Jaccard, so "Fortitude Investment Group" stops surfacing Aether/Russell
  while "Aether Capital" still surfaces "Aether Investment Group". +2 regression cases.
- Mobile grid "Last contact"/staleness sort is reversible. SortSheet gains opt-in
  dir/onToggleDir; other surfaces (Contacts/Pipeline) are untouched.
- Mobile "Edit investor" prefills a contact's saved email. GET /api/fundraising/state
  heals a blank grid pill email from the linked classic contact
  (fundraising_contacts.contact_id -> contacts.email), fill-only, by pill order then
  name; the next one-row save persists it. +test_grid_email_heal.py.
- Mobile quick-log pencil icon renders. iOS collapses a sole, centered, attribute-only
  -sized flex-child <svg>; .quicklog-btn svg now gets explicit CSS width/height + flex:none
  (the pattern the working bottom-tab/sort-pill icons use). The v97 fix only changed color.

Matrix intake bot — ships on the Spark (bot-only, NOT the s9pk):
- Approve/reject now redacts the whole intake thread (card + ack + main-timeline nudge +
  the user's own photo/note), mirroring the email-review room; redact_thread takes the
  room as an arg and matches replies by m.thread OR m.in_reply_to (so the nudge clears).
  No more in-Matrix confirmation after a commit (the thread vanishing is the ack).
  Needs the bot to hold a redact/moderator power level in the intake room.
- New one-time backend/matrix_intake/redact_intake.py clears the room's pre-existing
  backlog (dry-run default; --apply).

Tests 42/42 green; frontend render-smoke green. Frontend fixes are inspection + render
-smoke verified (on-device confirm pending); the bot redaction is live-smoke only.
This commit is contained in:
Keysat
2026-06-20 12:32:56 -05:00
parent 7fe5f57c6e
commit a917280bbb
13 changed files with 606 additions and 58 deletions
+7 -4
View File
@@ -103,19 +103,19 @@
**W3 — Bot grid-mutations behind a Matrix approval gate.** Generalize the email-proposal scaffold (`email_proposal_matrix` + propose→post→decide→apply) into one `agent_proposals` table (kind discriminator + JSON payload + target). Bot proposes set-commitment / assign-fund / change-stage / set-reminder; a human approves/edits/rejects in Matrix (**any member**); then apply. **Surgical, version-checked mutations — never blob RMW:** stage rides the existing `opportunities` link + validated stage endpoint; reminders write the W1 table; set-commitment/assign-fund need a version-checked single-cell upsert into the grid blob. Triggers the deferred **scoped service-token** item below (per-mutation-kind allowlist on the bot credential; money/merge/delete always require human approval regardless of scope — the autonomy axis). Parse on local Qwen, not Claude.
### Matrix-bridge intake for the fundraising grid — M1+M2 BUILT (deploy + live smoke pending)
*Requested 2026-06-16. **M1 (scaffold + parse + in-thread propose) and M2 (match + write-on-approve) built, tested (26/26), not yet deployed** — code in `backend/matrix_intake/`, guide at `docs/guides/matrix-intake.md`. Remaining: install `matrix-nio` + creds on the Spark, create the CRM bot user, and a **live Matrix smoke** (can't run in CI). M3 (business-card photo) deferred until Spark Control has a vision model. Next major build after this is **Pipeline adoption** (see below).*
### Matrix-bridge intake for the fundraising grid — M1+M2+M3 BUILT & LIVE
*Requested 2026-06-16. **M1+M2 live since v0.1.0:86 (2026-06-17); M3 (business-card photo) shipped & live 2026-06-20** — code in `backend/matrix_intake/`, guide at `docs/guides/matrix-intake.md`. M3 unblocked once the Spark Control daily-driver model became vision-capable: the bot OCRs a card via Spark Control's `/v1/chat/completions` multimodal passthrough (same `CRM_CHAT_MODEL`), then runs the existing intake flow; captures contact name/email/title/city/LinkedIn/phone/mobile (server half of phone/mobile = s9pk v0.1.0:98). Remaining: ongoing on-device card testing (OCR accuracy on small-in-frame cards). Next major build is **Pipeline adoption** (see below).*
Use the **matrix-bridge** repo's pattern to listen on a dedicated ten31-database Matrix room. Send a message (with an optional business-card photo) and a local LLM **via Spark Control** parses it into the fundraising-grid schema and **auto-creates the investor entity + contact row**. For an existing investor, send a meeting note and it **appends an interaction-log entry**. Approval gate: the bot replies in Matrix with the proposed add/edit; the user approves / rejects / edits in-thread before the write commits (keeps the draft→human-approve guardrail).
- Fits the "grid is canonical" direction (writes land in `fundraising_*`) and the never-send-autonomously rule (in-thread human approval before any write).
**Locked design (2026-06-16, approved) — build now, M1 then M2:**
- **Separate component, shared scaffold:** new `backend/matrix_intake/` (its own process; lifts matrix-bridge's connect/prime-then-listen/threaded-reply plumbing). `matrix-nio` is isolated to this component's `requirements.txt` — it never enters the stdlib CRM runtime. Keeps the CRM write credential + LP data out of the general-purpose matrix-bridge bot (blast-radius + data-sovereignty), and lets the two iterate independently. Runs on the Spark (placement settled against `standards/guides/placement.md` at deploy).
- **v1 = text-only.** Business-card photo deferred to M3 — Spark Control fronts chat/embeddings/rerank but **no vision model** today, so photo→fields isn't buildable end-to-end yet.
- **~~v1 = text-only~~ — M3 business-card photo SHIPPED (2026-06-20).** The Spark Control daily-driver model is now vision-capable (multimodal `image_url` passthrough), so card→text→fields works end-to-end. Transcribe-then-reuse (vision OCRs to text; the existing text extractor pulls fields) preserves the email/phone integrity rules. See the matrix-intake guide.
- **Parse:** local Qwen via Spark Control `/v1/chat/completions` (temp 0, JSON-only), reusing the existing Spark client pattern (`backend/redaction`/`backend/ingest`).
- **Approval handshake (the one stateful piece):** in-memory pending-proposal store keyed by Matrix thread root; user replies **yes / edit field=value / no** in-thread. Satisfies never-write-autonomously; exempt from "agents draft, humans send" (internal data entry, like the digest).
- **CRM-side:** `POST /api/intake/investor` (service-auth) creates a new investor+contact **through the existing grid-save path** (so relational sync + audit + backup-on-write happen as with a UI edit; bot never does whole-blob RMW) or appends a meeting note to the interaction log for an existing investor; `GET /api/intake/match?q=` fuzzy-matches via the existing entity-resolution/email-matcher. New investor needs no fund at intake.
- **Phases:** M1 = scaffold + parse + in-thread propose, **no writes** (proves Matrix↔Spark). M2 = intake endpoint + match + write-on-approve + tests. M3 (deferred) = business-card photo.
- **Phases:** M1 = scaffold + parse + in-thread propose, **no writes** (proves Matrix↔Spark). M2 = intake endpoint + match + write-on-approve + tests. **M3 = business-card photo (SHIPPED 2026-06-20).**
**Post-deploy enhancement — fuzzy match + in-thread confirm (Grant, 2026-06-17). DEPLOYED & LIVE 2026-06-17 (v0.1.0:86; box migration chain …85→86 clean, `candidates` endpoint verified); Matrix live-smoke pending.** Today `find_intake_match` is **exact-after-normalization** (`_normalize_text` = lowercase+strip), so near-misses — "Charlie" vs "Charles" (same last name), "Acme Capital" vs "Acme Capital LLC", a one-character email typo — return no match and the bot proposes a **new** investor, risking a duplicate the human approves without realizing a near-match exists. The existing in-thread approval gate is useless against this because the human is never *shown* the near-match. Fix: matcher returns **ranked fuzzy candidates** (deterministic pre-filter: normalized name similarity / token overlap + email edit-distance ≤ ~2), surfaced in-thread for the human to confirm or pick, with the **local Spark LLM optionally re-ranking/judging the shortlist** (good at Charlie/Charles + legal-suffix equivalence; fed only the shortlist, never the whole LP list). Keeps the approval gate but makes it effective against duplicates. Land **after** the live smoke — net-new logic + reply grammar + tests; the current exact match is safe and its failure mode (a duplicate) is recoverable via the existing entity-merge subsystem (`backend/entity_*.py`).
- **As built:** `find_intake_candidates` in `server.py` (deterministic — stdlib `difflib` name similarity + token-set Jaccard, legal-suffix-aware via `_strip_legal_suffix`, + email Levenshtein ≤ 2; ranked, ≥0.62, top 5). `GET /api/intake/match` now returns `{match, candidates}`. Bot: a new `_stage="disambiguate"` shortlist (`proposals.render_disambiguation` / `interpret_disambiguation` / `attach_to_candidate` / `promote_to_new`) — human picks a number / `new` / `no`. **The optional LLM-judge re-rank was deliberately deferred** (the deterministic filter already surfaces the named cases; an LLM judge is the right *pruner* for shortlist noise — build if the deterministic ranking proves too noisy in practice). Tests: `test_intake_endpoints.py` (server fuzzy cases), `matrix_intake/test_proposals.py` (disambiguation grammar), `matrix_intake/test_crm_client.py` (candidate shape).
@@ -129,6 +129,9 @@ Use the **matrix-bridge** repo's pattern to listen on a dedicated ten31-database
**Long-term — extract the intake bot to its own repo (recommended, not yet done).** Containerizing from this monorepo is the pragmatic now-state, but the bot is a genuinely separate deployable (own process, own `matrix-nio` dep, own lifecycle); its only CRM coupling is the HTTP API (a clean network contract) plus ~40 lines of stdlib Spark client (cheap to vendor). The tell: the spark-control Update button would run `git reset --hard origin/main` on the **whole CRM clone** — wrong blast radius. `matrix-bridge` is already a dedicated repo; mirror it. The extraction is a migration (new Gitea repo, move code + tests + guide, vendor the client, re-point the Spark deploy), so it's deferred until worth the lift — do it *before* wiring the spark-control card if both land in the same push.
### In-app camera business-card intake — PLAN written, awaiting go-ahead (Grant, 2026-06-20)
*A camera button in the mobile top bar (left of the quick-log pencil) → take/choose a photo → the same vision-transcribe → parse → fuzzy-match → edit/approve/reject flow the Matrix card intake (M3) runs, surfaced as an inline mobile sheet. **Detailed plan: `docs/handoffs/in-app-card-intake-plan.md`.*** Key finding: the reusable core is nio-free and already reachable from the CRM (`server.py` imports `llm`; `matrix_intake/parse.py`+`spark.py` import no `matrix-nio`), so it's **one endpoint** (`POST /api/intake/card`) + **one mobile component**, no bot refactor / new dep / migration. Four decisions pending Grant before building: provenance tag (`app_card`?), form-edits-only for v1 (no conversational NL-edit), any-member access, ships-in-s9pk. Reuses the New-investor sheet pre-filled + the proven `.quicklog-btn svg` icon-sizing fix for the camera button.
### Scoped service-credential auth path for automated CRM writers
*Surfaced 2026-06-17 while deploying the Matrix intake bot. **Decision: defer — the bot uses a dedicated member username/password for now.** The CRM has no API-key/service-token path; its only auth is username+password → JWT. A dedicated **member** login is appropriately scoped against what matters operationally (no admin: can't manage users, reset data, or change settings) and unblocks the live smoke today.*