Phase 0 complete: fuzzy entity tier, incremental sync, Start9 packaging

- Fuzzy tier (backend/ingest/fuzzy_resolve.py + llm.py): local Qwen adjudicates the deterministic resolver's flagged name-variant candidates; merges are durable via entity_merges (deterministic re-runs respect them), losers soft-deleted, logged. Idempotent. - Incremental sync (backend/ingest/sync.py): re-embeds only rows changed since a watermark (ingest_sync_state); first run / --recreate = full. Tested full→0→1. - Start9 packaging (start9/0.4): Dockerfile bundles ingest+mcp + fastembed/mcp; "Build search index" action runs the init in a subcontainer; MCP shipped as a manual stdio server (not a daemon); version 0.1.0:44. INGEST_PACKAGING.md. - backfill.py: factored embed_and_upsert() shared with sync. Verified end-to-end on synthetic data + live Sparks/Qwen/Qdrant. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-05 08:55:12 -05:00
parent c7ce44d963
commit f357c23c75
16 changed files with 808 additions and 48 deletions
@@ -0,0 +1,39 @@
+"""Local Qwen chat client via Spark Control /v1/chat/completions.
+
+Used for the privacy-sensitive, high-volume reasoning that must stay on Ten31
+infra (entity-resolution adjudication, triage). Frontier reasoning still goes to
+Claude; this is the local leg. Thinking is disabled for fast structured output.
+"""
+import json
+import re
+
+import config
+import http_util
+
+
+def chat(prompt, system=None, max_tokens=200, temperature=0.0):
+    messages = []
+    if system:
+        messages.append({"role": "system", "content": system})
+    messages.append({"role": "user", "content": prompt})
+    body = {"model": config.CHAT_MODEL, "messages": messages,
+            "temperature": temperature, "max_tokens": max_tokens,
+            "chat_template_kwargs": {"enable_thinking": False}}
+    status, data = http_util.request("POST", f"{config.SPARK_CONTROL_URL}/v1/chat/completions",
+                                     body, verify=config.SPARK_VERIFY_TLS)
+    if status != 200:
+        raise RuntimeError(f"/v1/chat/completions -> {status}: {data}")
+    return (data["choices"][0]["message"].get("content") or "").strip()
+
+
+def chat_json(prompt, system=None, max_tokens=200):
+    """Chat and parse the first JSON object from the reply (tolerant of fences)."""
+    raw = chat(prompt, system=system, max_tokens=max_tokens)
+    raw = re.sub(r"^```(json)?|```$", "", raw.strip(), flags=re.MULTILINE).strip()
+    m = re.search(r"\{.*\}", raw, re.DOTALL)
+    if not m:
+        return None
+    try:
+        return json.loads(m.group(0))
+    except json.JSONDecodeError:
+        return None