Commit Graph

2 Commits

Author SHA1 Message Date
Keysat 536358093f Add business-card photo intake to the Matrix bot (M3)
The intake bot now accepts a photo of a business card in the intake room and
turns it into the same new-investor proposal a typed note would. The only new
step is image -> text; everything downstream (parse, fuzzy match, in-thread
approval, log-communication write) is reused unchanged.

M3 was deferred only because Spark Control had no vision model. That blocker is
gone: the daily-driver Qwen is vision-capable under the same model id, and the
gateway forwards OpenAI multimodal content untouched, so no gateway/server/s9pk
change is needed -- this ships bot-only (git pull + rebuild on the Spark).

Transcribe-then-reuse (not vision-straight-to-JSON) is deliberate: the
transcription becomes the source text the email-integrity rule checks against,
so a mis-read address can't reach the CRM unapproved -- same guarantee as the
text path. Card commits tag source="matrix_card" for the audit log.

- llm.chat_vision: multimodal /v1/chat/completions, same model, same gateway
- spark.transcribe_card: faithful card->text, "" on a non-card (NONE sentinel)
- bot.on_image/handle_card: download image, transcribe, hand to handle_intake
- crm_client: source provenance overridable via the proposal's _source key
- tests: test_spark.py + a provenance case; 41/41 suite green
2026-06-20 10:26:27 -05:00
Keysat f357c23c75 Phase 0 complete: fuzzy entity tier, incremental sync, Start9 packaging
- Fuzzy tier (backend/ingest/fuzzy_resolve.py + llm.py): local Qwen adjudicates
  the deterministic resolver's flagged name-variant candidates; merges are
  durable via entity_merges (deterministic re-runs respect them), losers
  soft-deleted, logged. Idempotent.
- Incremental sync (backend/ingest/sync.py): re-embeds only rows changed since a
  watermark (ingest_sync_state); first run / --recreate = full. Tested full→0→1.
- Start9 packaging (start9/0.4): Dockerfile bundles ingest+mcp + fastembed/mcp;
  "Build search index" action runs the init in a subcontainer; MCP shipped as a
  manual stdio server (not a daemon); version 0.1.0:44. INGEST_PACKAGING.md.
- backfill.py: factored embed_and_upsert() shared with sync.

Verified end-to-end on synthetic data + live Sparks/Qwen/Qdrant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 08:55:12 -05:00