536358093f
The intake bot now accepts a photo of a business card in the intake room and turns it into the same new-investor proposal a typed note would. The only new step is image -> text; everything downstream (parse, fuzzy match, in-thread approval, log-communication write) is reused unchanged. M3 was deferred only because Spark Control had no vision model. That blocker is gone: the daily-driver Qwen is vision-capable under the same model id, and the gateway forwards OpenAI multimodal content untouched, so no gateway/server/s9pk change is needed -- this ships bot-only (git pull + rebuild on the Spark). Transcribe-then-reuse (not vision-straight-to-JSON) is deliberate: the transcription becomes the source text the email-integrity rule checks against, so a mis-read address can't reach the CRM unapproved -- same guarantee as the text path. Card commits tag source="matrix_card" for the audit log. - llm.chat_vision: multimodal /v1/chat/completions, same model, same gateway - spark.transcribe_card: faithful card->text, "" on a non-card (NONE sentinel) - bot.on_image/handle_card: download image, transcribe, hand to handle_intake - crm_client: source provenance overridable via the proposal's _source key - tests: test_spark.py + a provenance case; 41/41 suite green
48 lines
2.6 KiB
Python
48 lines
2.6 KiB
Python
"""Thin reuse of the in-repo local-Qwen client (backend/ingest/llm.py) via Spark Control.
|
|
|
|
We import the ingest client rather than re-implementing the HTTP call so the intake bot
|
|
speaks the exact same Spark contract (model, /v1/chat/completions, TLS verify, .env load).
|
|
The intake message is real LP substance, but it goes ONLY to the local Qwen on Ten31 infra
|
|
— never Claude — so no scrub boundary applies (same basis as the daily digest). Never call a
|
|
Spark directly; everything goes through SPARK_CONTROL_URL.
|
|
"""
|
|
import os
|
|
import sys
|
|
|
|
_INGEST = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "ingest")
|
|
if _INGEST not in sys.path:
|
|
sys.path.insert(0, _INGEST)
|
|
|
|
import llm # noqa: E402 (backend/ingest/llm.py — chat / chat_json over Spark Control)
|
|
|
|
|
|
def parse_json(prompt, system=None, max_tokens=400):
|
|
"""Send to local Qwen (temp 0, thinking off) and parse the first JSON object, or None."""
|
|
return llm.chat_json(prompt, system=system, max_tokens=max_tokens)
|
|
|
|
|
|
# The vision model only TRANSCRIBES the card; the existing text-parse flow then extracts the
|
|
# structured proposal from that transcription. Keeping the two steps separate (vs. asking the
|
|
# vision model for JSON directly) is deliberate: the transcription becomes the source text the
|
|
# email-integrity check runs against, so the "only keep an address that literally appears in the
|
|
# source, never let the model mint one" rule (parse.normalize) protects card intake too.
|
|
CARD_SYSTEM = (
|
|
"You are transcribing a photo of a business card for a venture-fund team. Read every line of "
|
|
"text on the card and write it out exactly as printed — the person's name, job title, company "
|
|
"or firm name, email address, phone number(s), website, and mailing address. Copy the email "
|
|
"address and phone numbers character-for-character; never guess, complete, or correct them. Do "
|
|
"not summarize, translate, or add anything that is not printed on the card. If the image is not "
|
|
"a readable business card, reply with the single word NONE. Output only the transcription, one "
|
|
"item per line."
|
|
)
|
|
|
|
|
|
def transcribe_card(image_b64, mime="image/jpeg", chat_fn=None):
|
|
"""Vision-transcribe a business card to faithful text via the local VL model (same model and
|
|
Spark Control endpoint as the text parse). Returns the transcription string, or '' if the model
|
|
saw no readable card. `chat_fn` is injectable for offline tests (defaults to Spark/VL)."""
|
|
chat_fn = chat_fn or llm.chat_vision
|
|
out = (chat_fn("Transcribe this business card.", image_b64, mime=mime,
|
|
system=CARD_SYSTEM, max_tokens=600) or "").strip()
|
|
return "" if out.upper() == "NONE" else out
|