Matrix intake: fuzzy investor matching + conversational in-thread edits (v0.1.0:86)

Close the two locked post-deploy enhancements for the Matrix intake bot.

Fuzzy matching (server-side, ships in the s9pk): new find_intake_candidates in
server.py returns ranked deterministic near-matches (difflib name similarity +
token-set Jaccard, legal-suffix-aware, + email Levenshtein <= 2); GET
/api/intake/match now returns {match, candidates}. The bot surfaces a numbered
shortlist so a near-duplicate (Charlie/Charles, Acme Capital vs Acme Capital LLC,
a one-char email typo) is confirmed by a human instead of silently creating a
second investor. Exact match still auto-attaches; fuzzy candidates are never
auto-attached. The optional LLM-judge re-rank is deferred.

Conversational edits (bot-side, ships on the Spark): any in-thread reply that
isn't yes/no/edit field=value is treated as a natural-language revision and
re-run through local Qwen (parse.revise). Email integrity is preserved -- a
changed address must literally appear in the instruction; the model's email
field is structurally unreachable. No-op revisions re-prompt.

Docs/current-state brought current; 27/27 backend tests green.
This commit is contained in:
Keysat
2026-06-17 18:50:58 -05:00
parent fa6c9da0e6
commit 0b893295e1
15 changed files with 734 additions and 41 deletions
+3 -2
View File
@@ -50,8 +50,9 @@ export const PACKAGE_TITLE = 'Ten31 Database'
// * 0.1.0:82 (vendor + SRI-pin the front-end libs: React/ReactDOM/Babel now ship in the s9pk and load same-origin from /assets/vendor/ with integrity hashes, so a CDN can never swap prod deps [the v78/v79 blank-screen class] and the box needs no outbound internet to render; plus a committed jsdom render smoke check [start9/0.4/render-smoke.mjs] gating the default `make` build)
// * 0.1.0:83 (email search/query + windowed digest preview, code-only: Communications investor dropdown now mirrors the list with typed keys [fund:/org:/contact:] so classic-contact/org-domain matches show + are pickable [fixes the empty-dropdown bug], plus a date-range filter, a click-to-expand full-body view [GET /api/email/detail], and a semantic "Search content" mode over indexed email bodies [GET /api/email/search -> ingest hybrid_search, soft-delete-filtered, 503 if Spark/Qdrant down]; Daily Digest gains an in-app windowed preview before send [POST /api/admin/digest/preview, send-now takes the same window] that exercises the real Spark summarizer without touching the daily cursor)
// * 0.1.0:84 (Matrix intake bot CRM support — ships the server side of commit 7ad0ee7, which was never packaged: new read-only GET /api/intake/match [new-vs-existing lookup against the canonical fundraising grid blob; returns the grid row id so an approved note lands on the matched investor, no duplicate] + source provenance on POST /api/fundraising/log-communication [audit records source, default "fundraising_grid"]; code-only, no schema change)
// * Current: 0.1.0:85 (cosmetic: drop the redundant "[note]" tag from the fundraising-grid note line — now "YYYY-MM-DD Contact: summary"; informative comm types [call, meeting, …] keep their "[type]" tag; shared by the Matrix intake bot + grid-UI logging; no schema change)
export const PACKAGE_VERSION = '0.1.0:85'
// * 0.1.0:85 (cosmetic: drop the redundant "[note]" tag from the fundraising-grid note line — now "YYYY-MM-DD Contact: summary"; informative comm types [call, meeting, …] keep their "[type]" tag; shared by the Matrix intake bot + grid-UI logging; no schema change)
// * Current: 0.1.0:86 (Matrix intake fuzzy matching: GET /api/intake/match now returns ranked `candidates` [fuzzy near-matches — deterministic difflib name similarity + token overlap + email edit-distance ≤ 2, legal-suffix-aware] alongside the exact `match`, so the bot can surface near-duplicates ["Charlie"/"Charles", "Acme Capital"/"Acme Capital LLC", a one-char email typo] for human confirmation instead of silently creating a second investor; the bot-side disambiguation + conversational-edit UX ships on the Spark, not the s9pk; code-only, no schema change)
export const PACKAGE_VERSION = '0.1.0:86'
export const DATA_MOUNT_PATH = '/data'
export const WEB_PORT = 8080
+3 -2
View File
@@ -46,8 +46,9 @@ import { v_0_1_0_82 } from './v0.1.0.82'
import { v_0_1_0_83 } from './v0.1.0.83'
import { v_0_1_0_84 } from './v0.1.0.84'
import { v_0_1_0_85 } from './v0.1.0.85'
import { v_0_1_0_86 } from './v0.1.0.86'
export const versionGraph = VersionGraph.of({
current: v_0_1_0_85,
other: [v_0_1_0_39, v_0_1_0_40, v_0_1_0_41, v_0_1_0_42, v_0_1_0_43, v_0_1_0_44, v_0_1_0_45, v_0_1_0_46, v_0_1_0_47, v_0_1_0_48, v_0_1_0_49, v_0_1_0_50, v_0_1_0_51, v_0_1_0_52, v_0_1_0_53, v_0_1_0_54, v_0_1_0_55, v_0_1_0_56, v_0_1_0_57, v_0_1_0_58, v_0_1_0_59, v_0_1_0_60, v_0_1_0_61, v_0_1_0_62, v_0_1_0_63, v_0_1_0_64, v_0_1_0_65, v_0_1_0_66, v_0_1_0_67, v_0_1_0_68, v_0_1_0_69, v_0_1_0_70, v_0_1_0_71, v_0_1_0_72, v_0_1_0_73, v_0_1_0_74, v_0_1_0_75, v_0_1_0_76, v_0_1_0_77, v_0_1_0_78, v_0_1_0_79, v_0_1_0_80, v_0_1_0_81, v_0_1_0_82, v_0_1_0_83, v_0_1_0_84],
current: v_0_1_0_86,
other: [v_0_1_0_39, v_0_1_0_40, v_0_1_0_41, v_0_1_0_42, v_0_1_0_43, v_0_1_0_44, v_0_1_0_45, v_0_1_0_46, v_0_1_0_47, v_0_1_0_48, v_0_1_0_49, v_0_1_0_50, v_0_1_0_51, v_0_1_0_52, v_0_1_0_53, v_0_1_0_54, v_0_1_0_55, v_0_1_0_56, v_0_1_0_57, v_0_1_0_58, v_0_1_0_59, v_0_1_0_60, v_0_1_0_61, v_0_1_0_62, v_0_1_0_63, v_0_1_0_64, v_0_1_0_65, v_0_1_0_66, v_0_1_0_67, v_0_1_0_68, v_0_1_0_69, v_0_1_0_70, v_0_1_0_71, v_0_1_0_72, v_0_1_0_73, v_0_1_0_74, v_0_1_0_75, v_0_1_0_76, v_0_1_0_77, v_0_1_0_78, v_0_1_0_79, v_0_1_0_80, v_0_1_0_81, v_0_1_0_82, v_0_1_0_83, v_0_1_0_84, v_0_1_0_85],
})
+20
View File
@@ -0,0 +1,20 @@
import { VersionInfo } from '@start9labs/start-sdk'
// Matrix intake — fuzzy investor matching. GET /api/intake/match now returns, alongside the
// exact `match`, a ranked list of `candidates`: fuzzy near-matches (deterministic difflib name
// similarity + token overlap + email edit-distance ≤ 2, legal-suffix-aware) the intake bot can
// surface in-thread for the human to pick from — so a near-duplicate name ("Charlie"/"Charles",
// "Acme Capital"/"Acme Capital LLC", a one-char email typo) no longer silently creates a second
// investor. Server-side only (the bot's disambiguation + conversational-edit UX ships on the
// Spark, not in the s9pk). Code-only, no schema change.
export const v_0_1_0_86 = VersionInfo.of({
version: '0.1.0:86',
releaseNotes: {
en_US: [
'Matrix intake: the new-vs-existing lookup now also returns ranked fuzzy near-matches,',
'so a typod or near-duplicate investor name is surfaced for confirmation instead of',
'silently creating a duplicate. No data changes.',
].join(' '),
},
migrations: { up: async () => {}, down: async () => {} },
})