Matrix intake: fuzzy investor matching + conversational in-thread edits (v0.1.0:86)

Close the two locked post-deploy enhancements for the Matrix intake bot.

Fuzzy matching (server-side, ships in the s9pk): new find_intake_candidates in
server.py returns ranked deterministic near-matches (difflib name similarity +
token-set Jaccard, legal-suffix-aware, + email Levenshtein <= 2); GET
/api/intake/match now returns {match, candidates}. The bot surfaces a numbered
shortlist so a near-duplicate (Charlie/Charles, Acme Capital vs Acme Capital LLC,
a one-char email typo) is confirmed by a human instead of silently creating a
second investor. Exact match still auto-attaches; fuzzy candidates are never
auto-attached. The optional LLM-judge re-rank is deferred.
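
A minimal sketch of the kind of scoring described above, not the actual
find_intake_candidates from server.py. The suffix list, the 0.75 name cutoff,
the 0.9 floor for email hits, and the helper names are all assumptions made
for illustration; only the techniques (difflib ratio, token-set Jaccard,
legal-suffix stripping, email edit distance <= 2) come from this message.

```python
import re
from difflib import SequenceMatcher

# Hypothetical values: the real suffix list and cutoff are not shown in this commit.
LEGAL_SUFFIXES = {"llc", "inc", "ltd", "lp", "llp", "corp", "co", "gmbh"}
NAME_THRESHOLD = 0.75


def name_tokens(name):
    """Lowercase, split on non-alphanumerics, and drop trailing legal suffixes."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return tokens


def jaccard(a, b):
    """Token-set Jaccard similarity: |A & B| / |A | B|."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def score_candidate(q_name, q_email, c_name, c_email):
    """Return {'score', 'matched_on'} for a near-match, else None."""
    qt, ct = name_tokens(q_name), name_tokens(c_name)
    name_score = 0.0
    if qt and ct:
        ratio = SequenceMatcher(None, " ".join(qt), " ".join(ct)).ratio()
        name_score = max(ratio, jaccard(qt, ct))
    if q_email and c_email and levenshtein(q_email.lower(), c_email.lower()) <= 2:
        return {"score": max(name_score, 0.9), "matched_on": "email"}
    if name_score >= NAME_THRESHOLD:
        return {"score": name_score, "matched_on": "name"}
    return None


def rank(q_name, q_email, rows):
    """Score every row and sort by score, then id, so the order is deterministic."""
    scored = []
    for row in rows:
        hit = score_candidate(q_name, q_email, row["investor_name"], row.get("email", ""))
        if hit:
            scored.append({"id": row["id"], "investor_name": row["investor_name"], **hit})
    return sorted(scored, key=lambda c: (-c["score"], c["id"]))
```

Under these assumptions "Acme Capital" and "Acme Capital LLC" reduce to the
same tokens, and a one-character email typo is caught by the edit-distance
check even when the names differ.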

Conversational edits (bot-side, ships on the Spark): any in-thread reply that
is not yes, no, or edit field=value is treated as a natural-language revision
and re-run through local Qwen (parse.revise). Email integrity is preserved: a
changed address must literally appear in the instruction; the model's email
field is structurally unreachable. No-op revisions re-prompt.
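
One way the email guard above could work, sketched under assumptions: the
function name apply_revision, the regex, and the merge strategy are
hypothetical and are not taken from parse.revise. Only the contact_email
field name appears in this commit's diff.

```python
import re

# Hypothetical, deliberately simple address pattern for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def apply_revision(proposal, model_output, instruction):
    """Merge model-proposed changes without ever reading the model's email.

    Returns (revised_proposal, changed). A False `changed` flag marks a no-op
    revision, which the caller can answer with a re-prompt.
    """
    revised = dict(proposal)
    for key, value in model_output.items():
        # The model's email field is skipped outright, so it cannot reach the result.
        if key != "contact_email":
            revised[key] = value
    # The address may only change to one the human literally typed.
    typed = EMAIL_RE.findall(instruction)
    if typed:
        revised["contact_email"] = typed[0]
    return revised, revised != proposal
```

Because the email is copied from the instruction text rather than from the
model output, a hallucinated or subtly altered address has no path into the
proposal in this sketch.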

Docs/current-state brought current; 27/27 backend tests green.
Keysat
2026-06-17 18:50:58 -05:00
parent fa6c9da0e6
commit 0b893295e1
15 changed files with 734 additions and 41 deletions
+19 -6
@@ -70,19 +70,32 @@ def _authed(method, path, body=None):
 def match(proposal):
-    """Return {'id', 'name'} for an existing investor matching this proposal, else None."""
+    """Resolve new-vs-existing for this proposal against the CRM matcher.
+    Returns {'match': {...}|None, 'candidates': [...]}:
+    - `match` is a confident EXACT existing investor — {'id', 'name'} — that the bot
+      auto-attaches a note to (no human disambiguation needed).
+    - `candidates` is a ranked list of fuzzy NEAR-matches — each {'id', 'name', 'score',
+      'matched_on'} — surfaced in-thread for the human to pick from (or confirm "new")
+      when there is no exact match, so a typo'd/near-duplicate name doesn't silently
+      create a second investor."""
     q = proposal.get("investor_name") or proposal.get("contact_name") or ""
     email = proposal.get("contact_email") or ""
     if not q and not email:
-        return None
+        return {"match": None, "candidates": []}
     qs = urlencode({"q": q, "email": email})
     status, data = _authed("GET", f"/api/intake/match?{qs}")
     if status != 200:
         raise RuntimeError(f"intake match failed ({status}): {data.get('error') or data}")
-    m = (data.get("data") or {}).get("match")
-    if not m:
-        return None
-    return {"id": m["id"], "name": m.get("investor_name") or q}
+    payload = data.get("data") or {}
+    m = payload.get("match")
+    match_out = {"id": m["id"], "name": m.get("investor_name") or q} if m else None
+    candidates = [
+        {"id": c["id"], "name": c.get("investor_name") or "?",
+         "score": c.get("score"), "matched_on": c.get("matched_on")}
+        for c in (payload.get("candidates") or []) if c.get("id")
+    ]
+    return {"match": match_out, "candidates": candidates}
 def build_commit_payload(proposal):