Plan: in-app business-card intake (mobile camera → transcribe → approve)
Status: PLAN. Decisions locked 2026-06-20 (see end); awaiting Grant's go-ahead to build.
Goal: a camera button in the mobile top bar (left of the quick-log pencil) → take a photo or pick one from the library → the same vision-transcribe → parse → fuzzy-match → edit/approve/reject flow the Matrix card intake runs (M3), surfaced as an inline mobile sheet instead of a Matrix thread.
Why this is cheaper than it looks
The reusable core is already nio-free and already reachable from the CRM monolith:
- server.py already imports the Spark client: sys.path.insert(.../ingest); import llm (server.py:6485). llm.chat_vision (multimodal → Spark Control passthrough) is the exact call the bot's card OCR uses.
- backend/matrix_intake/parse.py imports only json, re, spark; spark.py imports only os, sys, llm. Neither imports matrix-nio. So server.py can import parse + spark the same way it imports llm, with no bot refactor and no matrix-nio in the CRM.
- Match (find_intake_match / find_intake_candidates) and the write (/api/fundraising/log-communication, create-if-missing + contact upsert + audit) are already in server.py. The fuzzy matcher just got the generic-word fix (v-next).
- The whole backend/ tree (incl. ingest/ + matrix_intake/) is COPY'd into the s9pk image (start9/0.4/Dockerfile:62), so the imports resolve on the box at runtime.
So this is one new endpoint + one new mobile component, reusing everything downstream. No migration, no new dependency, no change to the live Matrix bot.
Architecture decision
Recommended (Option A): server.py imports matrix_intake.parse + matrix_intake.spark
directly. Add backend/matrix_intake to sys.path (guarded like the existing ingest insert)
and import parse, spark. Pros: zero bot churn, reuses the tested transcribe + email-integrity
parse verbatim, no divergence. Con: the CRM now imports two modules out of the bot package — a mild
coupling, but they're pure (no nio, no CRM HTTP; we do NOT import the bot's crm_client).
Alternative (Option B): extract a shared backend/intake_core/ (vision.py = transcribe,
extract.py = parse/normalize/revise) that both the bot and server.py import. Cleaner ownership,
but it refactors the live, working bot for no functional gain right now. Defer unless the
coupling in A bites later (e.g. if the bot's parse grows nio-coupled helpers).
Module-name caution (from the matrix-intake guide): the bot imports by bare name (import spark, import llm, import config). server.py already has ingest/ on its path (so llm, config, http_util resolve). Adding matrix_intake/ lets parse / spark resolve with no name overlap against ingest's modules (verified). Import only parse + spark (not crm_client / settings / bot).
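The guarded path insert and lazy import described for Option A could look roughly like the sketch below. It is not the code in server.py: load_intake_modules and its backend_dir parameter are hypothetical names, and catching only ImportError is an assumption about how a missing module would surface.

```python
import importlib
import os
import sys


def load_intake_modules(backend_dir):
    """Return (parse, spark), or (None, None) if they cannot be imported.

    backend_dir is assumed to be the directory that contains matrix_intake/.
    """
    intake_dir = os.path.join(backend_dir, "matrix_intake")
    # Guarded insert: only add the directory if it exists, and only once.
    if os.path.isdir(intake_dir) and intake_dir not in sys.path:
        sys.path.insert(0, intake_dir)
    try:
        # Bare-name imports, matching how the bot imports its own modules.
        parse = importlib.import_module("parse")
        spark = importlib.import_module("spark")
    except ImportError:
        # The caller can still boot and answer the endpoint with a 502.
        return None, None
    return parse, spark
```

Calling this inside the request handler rather than at module load keeps the import lazy, so a missing module only affects the card endpoint.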
Server: one new endpoint
POST /api/intake/card — authenticated member+ (it's a UI feature for the team; NOT bot-gated,
NOT admin-only). Body is JSON, image as base64 (no multipart parsing in the stdlib handler):
    { "image_b64": "<base64>", "mime": "image/jpeg" }
    → 200 { "transcription": "...",
            "proposal": { investor_name, contact_name, contact_email, contact_title,
                          city, linkedin_url, phone, mobile, note },
            "match": { id, name } | null,
            "candidates": [ { id, name, score, matched_on }, ... ] }
Handler flow (mirrors bot.handle_card → handle_intake, minus Matrix):
- Auth (reuse the standard authenticated gate).
- Decode + size-guard the base64; default mime image/jpeg.
- transcription = spark.transcribe_card(image_b64, mime) (Spark Control vision). On error → 502 { ok:false, reason:"vision_unavailable" }; on <5 chars → 200 { ok:false, reason:"unreadable" }.
- proposal = parse.parse_message("New investor — from a business card:\n" + transcription, roster=None). The email-integrity rule rides along (an address/phone/LinkedIn is kept only if it literally appears in the transcription, never minted), exactly as in Matrix. roster=None for v1 (the team-roster framing is a Spark-side env; the box can pass None, which keeps the prior parse behavior).
- match = find_intake_match(...), candidates = find_intake_candidates(...) from the proposal's investor_name / contact_email.
- Return { transcription, proposal, match, candidates }.
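The handler flow above can be sketched as a pure function with the downstream pieces passed in as callables. Several things here are assumptions and not part of the plan: the MAX_IMAGE_BYTES value, the 400 / "bad_image" response for an undecodable or oversized image, the (name, email) argument shape of the two matcher callables, and parse_message returning a dict.

```python
import base64
import binascii

MAX_IMAGE_BYTES = 8 * 1024 * 1024  # hypothetical cap; the plan does not fix a number
CARD_PREFIX = "New investor — from a business card:\n"


def handle_card_intake(body, transcribe, parse_message, find_match, find_candidates):
    """Return (http_status, response_dict) for a POST /api/intake/card body."""
    image_b64 = body.get("image_b64") or ""
    mime = body.get("mime") or "image/jpeg"  # default mime

    # Decode + size-guard before spending a vision call.
    try:
        raw = base64.b64decode(image_b64, validate=True)
    except (binascii.Error, ValueError):
        return 400, {"ok": False, "reason": "bad_image"}  # assumed response
    if not raw or len(raw) > MAX_IMAGE_BYTES:
        return 400, {"ok": False, "reason": "bad_image"}  # assumed response

    # Vision transcription; any failure maps to a 502.
    try:
        transcription = transcribe(image_b64, mime)
    except Exception:
        return 502, {"ok": False, "reason": "vision_unavailable"}
    if len((transcription or "").strip()) < 5:
        return 200, {"ok": False, "reason": "unreadable"}

    # Parse with the same framing line the plan specifies, roster=None for v1.
    proposal = parse_message(CARD_PREFIX + transcription, roster=None)
    name = proposal.get("investor_name")
    email = proposal.get("contact_email")
    return 200, {
        "transcription": transcription,
        "proposal": proposal,
        "match": find_match(name, email),
        "candidates": find_candidates(name, email),
    }
```

Passing the callables in keeps the sketch free of Spark and the CRM database, which is also what makes the flow easy to exercise with stubs.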
Approve reuses the existing write — the mobile sheet POSTs the (possibly-edited) proposal to
POST /api/fundraising/log-communication with create_investor_if_missing (new) or the picked
_match_id/candidate row (existing), tagged source="app_card" (a new provenance value
distinct from matrix_card/matrix_intake). Reject is client-only (discard the sheet) — no
server call, nothing written.
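To pin down the two approve cases, here is a small sketch of the payload the sheet might send. The real caller is the mobile component in the browser; Python is used only to show the shape. The helper name build_approve_payload is hypothetical, and sending the proposal fields flat alongside the flags is an assumption about what /api/fundraising/log-communication accepts.

```python
def build_approve_payload(proposal, picked_investor_id=None):
    """Build the approve body from the (possibly edited) proposal.

    picked_investor_id is the id of the match/candidate row the user chose,
    or None when the user chose to add a new investor.
    """
    payload = dict(proposal)       # copy so the sheet state is not mutated
    payload["source"] = "app_card"  # provenance tag from the plan
    if picked_investor_id is None:
        payload["create_investor_if_missing"] = True
    else:
        payload["_match_id"] = picked_investor_id
    return payload
```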
No NL-edit for v1. In a form UI, editing the fields directly is easier than chatting "change the
email to…". The fields are inline-editable in the sheet. (A /api/intake/card/revise wrapping
parse.revise is a possible later enhancement if Grant wants conversational corrections.)
Mobile UI: one new component
MobileCardCapture in the shell top bar, left of MobileQuickLog (the cluster at
index.html:14609). Reuses <BottomSheet>, .sheet-input, StageChip, and the New-investor
sheet patterns (8g).
- Trigger: a camera-icon button + a hidden <input type="file" accept="image/*">. Omit capture so iOS shows the native action sheet (Take Photo / Photo Library / Browse), satisfying "take a photo OR use an existing photo." (capture="environment" would force the camera and skip the library, which is not what we want.)
- On select: read the file as a data URL; downscale via <canvas> to ~2000px max dimension before base64 (native, no library). This keeps the payload under the StartOS reverse-proxy body cap (the matrix-intake guide's known 413 risk), and the model downscales to ~2 MP anyway. Don't over-shrink (hurts OCR). Show a "📇 Reading the card…" loading state.
- POST to /api/intake/card; on unreadable / vision_unavailable show a "try a clearer, well-lit, fill-the-frame photo" message with a Retake button (same UX as the bot's 📇 replies).
- Approval sheet (the New-investor sheet, pre-filled from proposal): editable investor name + contact name/email/title/city/phone/mobile/LinkedIn; a small "Existing investor?" block when match / candidates came back (pick a candidate row, or "Add as new"); Approve → the log-communication write; Reject/Retake → discard.
- Icon: reuse the proven .quicklog-btn svg sizing fix (explicit CSS width / height + flex:none) so the new camera SVG doesn't hit the same iOS sole-flex-child collapse.
Scope / effort / risk
- Server: ~1 endpoint + the parse / spark import guard + an app_card source tag. Small.
- Frontend: ~1 component (camera button + capture input + canvas downscale + approval sheet) + CSS for the button. Medium (the approval sheet is the New-investor sheet pre-filled).
- No migration, no new dependency, no Matrix-bot change.
- Tests: an endpoint test that stubs spark.transcribe_card (like test_spark.py does) and asserts the transcribe→parse→match shape + the email-integrity passthrough; the real vision/OCR path stays live-smoke only (same as Matrix M3). Frontend is inspection + on-device.
- Risks: (1) large-image upload vs the reverse-proxy cap, mitigated by client-side canvas downscale; (2) iOS file-input behavior across Safari/standalone-PWA, to be verified on-device; (3) OCR accuracy is the shared Matrix limitation (resolution-bound; "fill the frame"), not new here; (4) the parse / spark import must be lazy/guarded so a dev ./start.sh without Spark reachable still boots (the endpoint just 502s); mirror the existing lazy import llm at server.py:6485.
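A stubbed test along the lines described in the Tests item might look like the sketch below. Everything in it is a stand-in: the spark namespace replaces the real module, intake is a hypothetical miniature of the endpoint, and keep_literal_fields is a simplified illustration of the literal-in-source rule (in the real flow that rule lives inside the parse step).

```python
import types
import unittest
from unittest import mock

# Stand-in for the real spark module, which would call Spark Control.
spark = types.SimpleNamespace(transcribe_card=lambda image_b64, mime: "")

GUARDED = ("contact_email", "phone", "mobile", "linkedin_url")


def keep_literal_fields(proposal, transcription):
    """Blank any guarded field whose value is not literally in the transcription."""
    cleaned = dict(proposal)
    haystack = transcription.lower()
    for key in GUARDED:
        value = cleaned.get(key)
        if value and value.lower() not in haystack:
            cleaned[key] = None
    return cleaned


def intake(image_b64, mime, parse_message):
    """Hypothetical miniature of the endpoint: transcribe, parse, filter."""
    transcription = spark.transcribe_card(image_b64, mime)
    proposal = keep_literal_fields(parse_message(transcription), transcription)
    return {"transcription": transcription, "proposal": proposal}


class CardIntakeTest(unittest.TestCase):
    def test_minted_email_is_dropped(self):
        card = "Jane Doe\nAcme Capital\n+1 555 0100"

        def minting_parser(text):
            return {
                "investor_name": "Acme Capital",
                "contact_email": "jane@acme.example",  # not on the card
                "phone": "+1 555 0100",
            }

        # Stub the vision call so no network or model is needed.
        with mock.patch.object(spark, "transcribe_card", return_value=card) as stub:
            out = intake("aGVsbG8=", "image/jpeg", minting_parser)

        stub.assert_called_once_with("aGVsbG8=", "image/jpeg")
        self.assertEqual(out["transcription"], card)
        self.assertIsNone(out["proposal"]["contact_email"])
        self.assertEqual(out["proposal"]["phone"], "+1 555 0100")
```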
Decisions (locked, Grant 2026-06-20)
- Provenance tag = source="app_card", distinct from matrix_card / matrix_intake.
- v1 = editable form fields only: no conversational NL-edit on the card (no /api/intake/card/revise); the user taps a field and fixes it in the sheet. NL-edit can follow later if it proves wanted.
- Access = any authenticated member: the camera button + POST /api/intake/card use the standard authenticated gate (not admin-only, not bot-gated); a human still approves every write.
- Delivery = the s9pk (server endpoint + frontend): a normal version bump → build → box install, like v99 (NOT bot-only like the Matrix M3).
Build scope is otherwise as specified above. Mobile-only (top-bar button); same captured fields + literal-in-source integrity rule as the Matrix card flow.