Matrix intake: fuzzy investor matching + conversational in-thread edits (v0.1.0:86)

Close the two locked post-deploy enhancements for the Matrix intake bot.

Fuzzy matching (server-side, ships in the s9pk): new find_intake_candidates in
server.py returns ranked deterministic near-matches (difflib name similarity +
token-set Jaccard, legal-suffix-aware, + email Levenshtein <= 2); GET
/api/intake/match now returns {match, candidates}. The bot surfaces a numbered
shortlist so a near-duplicate (Charlie/Charles, Acme Capital vs Acme Capital LLC,
a one-char email typo) is confirmed by a human instead of silently creating a
second investor. Exact match still auto-attaches; fuzzy candidates are never
auto-attached. The optional LLM-judge re-rank is deferred.
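
A minimal sketch of how such a deterministic ranking could be put together, for
illustration only. It is not the find_intake_candidates shipped in server.py:
the names rank_candidates, name_score, and LEGAL_SUFFIXES, the 0.8 threshold,
the 0.9 email floor, and the investor dict shape are all assumptions. Exact
matching (which auto-attaches) is handled separately and is not shown.

```python
import re
from difflib import SequenceMatcher

# Hypothetical suffix list; the real one is not visible in this commit view.
LEGAL_SUFFIXES = {"llc", "inc", "ltd", "lp", "llp", "corp", "co"}


def name_tokens(name):
    """Lowercase, split on non-alphanumerics, and drop legal suffixes."""
    tokens = re.findall(r"[a-z0-9]+", (name or "").lower())
    return [t for t in tokens if t not in LEGAL_SUFFIXES]


def name_score(a, b):
    """Best of difflib ratio and token-set Jaccard on suffix-stripped names."""
    ta, tb = name_tokens(a), name_tokens(b)
    if not ta or not tb:
        return 0.0
    ratio = SequenceMatcher(None, " ".join(ta), " ".join(tb)).ratio()
    sa, sb = set(ta), set(tb)
    jaccard = len(sa & sb) / len(sa | sb)
    return max(ratio, jaccard)


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def rank_candidates(name, email, investors, threshold=0.8, limit=5):
    """Return near-matches, best first; ties broken by name so output is stable."""
    scored = []
    for inv in investors:
        score = name_score(name, inv.get("name"))
        inv_email = (inv.get("email") or "").lower()
        # An email within two edits counts as a strong signal (assumed floor of 0.9).
        if email and inv_email and levenshtein(email.lower(), inv_email) <= 2:
            score = max(score, 0.9)
        if score >= threshold:
            scored.append((score, inv))
    scored.sort(key=lambda pair: (-pair[0], pair[1].get("name") or ""))
    return [inv for _, inv in scored[:limit]]
```

Under these assumptions, "Acme Capital" and "Acme Capital LLC" compare as
identical once the suffix is stripped, and a one-character email typo surfaces
the existing investor even when the names differ. The result is only a
shortlist for a human to confirm, in line with the rule that fuzzy candidates
are never auto-attached.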

Conversational edits (bot-side, ships on the Spark): any in-thread reply that
isn't yes/no/edit field=value is treated as a natural-language revision and
re-run through local Qwen (parse.revise). Email integrity is preserved -- a
changed address must literally appear in the instruction; the model's email
field is structurally unreachable. No-op revisions re-prompt.
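
One way the email-integrity rule could be enforced, sketched under assumptions:
the merge step copies only non-email fields from the model output and takes a
new address solely from the human's instruction text. The function name
merge_revision, the email regex, and the "first address wins" choice are
hypothetical; the actual parse.revise code is not shown in this commit view.

```python
import re

# Deliberately simple address pattern, assumed for illustration.
_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Fields the model is allowed to revise; contact_email is intentionally absent.
_MODEL_FIELDS = ("intent", "investor_name", "contact_name", "contact_title", "note")


def merge_revision(original, model_output, instruction):
    """Apply a natural-language revision without ever reading the model's email."""
    updated = dict(original)
    for key in _MODEL_FIELDS:
        if model_output.get(key) is not None:
            updated[key] = model_output[key]
    # The address changes only if the human literally typed one.
    typed = _EMAIL_RE.findall(instruction or "")
    if typed:
        updated["contact_email"] = typed[0]
    return updated
```

Because contact_email is never copied from the model output, a hallucinated
address cannot reach the proposal. Comparing the result with the original
(for example via same_fields) then tells the bot whether to re-prompt on a
no-op revision.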

Docs/current-state brought current; 27/27 backend tests green.
Keysat
2026-06-17 18:50:58 -05:00
parent fa6c9da0e6
commit 0b893295e1
15 changed files with 734 additions and 41 deletions
@@ -5,7 +5,12 @@ Matrix thread root (the bot's proposal lives in a thread rooted at the user's me
the user replies inside that thread). In-memory and ephemeral by design — a restart drops
pending proposals (the user just re-sends), matching matrix-bridge's stateless-by-default
ethos. Nothing here writes to the CRM; the bot calls the CRM client only after `approve`.
A proposal carries a `_stage`: "approval" (the normal yes/edit/no card) or "disambiguate"
(a fuzzy-match shortlist the human must resolve — pick a number / "new" / "no" — before it
becomes an approval-stage proposal). The shortlist itself rides on `_candidates`.
"""
import re
# field aliases accepted in `edit <field>=<value>`
_EDIT_ALIASES = {
@@ -18,6 +23,10 @@ _EDIT_ALIASES = {
_YES = {"yes", "y", "approve", "approved", "ok", "confirm", "go", "👍", ""}
_NO = {"no", "n", "cancel", "discard", "reject", "stop", "👎", ""}
# "create a new investor anyway" replies to a disambiguation shortlist
_NEW = {"new", "none", "new investor", "none of these", "create", "create new", "add new", "neither"}
_CONTENT_FIELDS = ("intent", "investor_name", "contact_name", "contact_email", "contact_title", "note")
class ProposalStore:
@@ -84,6 +93,75 @@ def apply_edit(proposal, field, value):
    return updated
def same_fields(a, b):
    """True if two proposals carry identical content (used to detect a no-op NL revision so we
    don't tell the human 'Updated' when nothing changed)."""
    return all((a or {}).get(k) == (b or {}).get(k) for k in _CONTENT_FIELDS)
def interpret_disambiguation(text, n_candidates):
    """Classify a reply to a fuzzy-match shortlist.
    Returns ("pick", index) | ("new", None) | ("reject", None) | ("unknown", None). A bare
    number selects that candidate; "new"/"none" creates a new investor; "no"/"cancel" discards."""
    t = (text or "").strip().lower()
    if not t:
        return ("unknown", None)
    if t in _NO:
        return ("reject", None)
    if t in _NEW:
        return ("new", None)
    m = re.fullmatch(r"#?\s*(\d{1,2})", t)
    if m:
        idx = int(m.group(1)) - 1
        if 0 <= idx < n_candidates:
            return ("pick", idx)
    return ("unknown", None)
def attach_to_candidate(proposal, candidate):
    """Promote a disambiguation pick into an approval-stage meeting note on the chosen investor.
    The note will target that existing grid row (via _match_id); the firm name is shown for
    accuracy. Drops the shortlist."""
    updated = dict(proposal)
    updated.pop("_candidates", None)
    updated["_stage"] = "approval"
    updated["_match_id"] = candidate["id"]
    updated["intent"] = "meeting_note"
    if candidate.get("name"):
        updated["investor_name"] = candidate["name"]
    return updated
def promote_to_new(proposal):
    """Disambiguation 'new' — discard the shortlist and proceed as a new-investor proposal."""
    updated = dict(proposal)
    updated.pop("_candidates", None)
    updated.pop("_match_id", None)
    updated["_stage"] = "approval"
    return updated
def render_disambiguation(proposal):
    """Render the fuzzy-match shortlist a human resolves before we create a new investor."""
    name = proposal.get("investor_name") or proposal.get("contact_name") or "?"
    cands = proposal.get("_candidates") or []
    lines = [f"🔎 Before adding **{name}** as new — these existing investors look similar:"]
    for i, c in enumerate(cands, 1):
        lines.append(f" **{i}.** {c.get('name') or '?'}")
    lines.append("")
    lines.append("Reply a **number** to log this against that investor, **new** to add it as a "
                 "new investor, or **no** to discard.")
    return "\n".join(lines)
def disambiguation_nudge(proposal):
    """Brief main-timeline pointer for a disambiguation proposal (the shortlist is in the thread)."""
    name = proposal.get("investor_name") or proposal.get("contact_name") or "?"
    return (f"🔎 **{name}** may match an existing investor — open the **thread** to pick one "
            "or confirm it's new.")
def render(proposal):
    """Render a proposal as the in-thread message a human approves."""
    if proposal.get("intent") == "meeting_note":