Device-test round 2: 4 in-app fixes + Matrix intake cleanup (v0.1.0:99)
Grant's real-phone testing surfaced seven items; this commit lands six (the seventh, in-app camera card intake, is planned in docs/handoffs/in-app-card-intake-plan.md).

CRM half, ships in the s9pk (v0.1.0:99):
- Intake fuzzy match no longer over-indexes on generic firm words. _name_similarity now compares DISTINCTIVE tokens only (generic descriptors such as "Investment Group", "Capital", "Family Office" are stripped via _GENERIC_ORG_WORDS) for both the difflib ratio and the Jaccard, so "Fortitude Investment Group" stops surfacing Aether/Russell while "Aether Capital" still surfaces "Aether Investment Group". +2 regression cases.
- Mobile grid "Last contact"/staleness sort is reversible. SortSheet gains opt-in dir/onToggleDir; other surfaces (Contacts/Pipeline) are untouched.
- Mobile "Edit investor" prefills a contact's saved email. GET /api/fundraising/state heals a blank grid pill email from the linked classic contact (fundraising_contacts.contact_id -> contacts.email), fill-only, matching by pill order and then by name; the next one-row save persists it. +test_grid_email_heal.py.
- Mobile quick-log pencil icon renders. iOS collapses an <svg> that is the sole, centered flex child and is sized only by its width/height attributes; .quicklog-btn svg now gets explicit CSS width/height + flex:none (the pattern the working bottom-tab/sort-pill icons use). The v97 fix only changed color.

Matrix intake bot, ships on the Spark (bot-only, NOT the s9pk):
- Approve/reject now redacts the whole intake thread (card + ack + main-timeline nudge + the user's own photo/note), mirroring the email-review room; redact_thread takes the room as an arg and matches replies by m.thread OR m.in_reply_to (so the nudge clears). There is no longer an in-Matrix confirmation after a commit (the thread vanishing is the ack). This needs the bot to hold a redact/moderator power level in the intake room.
- New one-time backend/matrix_intake/redact_intake.py clears the room's pre-existing backlog (dry-run by default; pass --apply to redact).

Tests 42/42 green; frontend render-smoke green.
Frontend fixes are inspection + render-smoke verified (on-device confirm pending); the bot redaction is live-smoke only.
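The reply-matching rule described for redact_thread (match by m.thread OR m.in_reply_to) can be sketched on plain event dicts. This is a minimal illustration, not the bot's code: the function names relates_to_root and events_to_redact are hypothetical, and the events are assumed to be already-decoded Matrix room events whose relation data sits under content["m.relates_to"].

```python
def relates_to_root(event: dict, root_id: str) -> bool:
    """True if `event` is a thread child of, or a plain reply to, `root_id`."""
    rel = (event.get("content") or {}).get("m.relates_to") or {}
    # Threaded message: rel_type "m.thread" whose event_id is the thread root.
    if rel.get("rel_type") == "m.thread" and rel.get("event_id") == root_id:
        return True
    # Plain (non-threaded) reply, e.g. a main-timeline nudge replying to the card.
    reply = rel.get("m.in_reply_to") or {}
    return reply.get("event_id") == root_id


def events_to_redact(events: list, root_id: str) -> list:
    """Event IDs to redact: the root itself plus every event related to it."""
    return [
        e["event_id"]
        for e in events
        if e["event_id"] == root_id or relates_to_root(e, root_id)
    ]
```

Matching on m.thread alone would miss a reply posted to the main timeline, which is why the second check exists in this sketch.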
+86
-6
@@ -1305,14 +1305,40 @@ def _strip_legal_suffix(normalized_name):
     return " ".join(toks)
 
 
+# Generic firm-descriptor words that carry almost no identifying signal: nearly every firm name
+# contains one ("… Investment Group", "… Capital", "… Family Office"). Two names that overlap ONLY
+# on these are NOT duplicates — 'Fortitude Investment Group' is not 'Aether Investment Group'. We
+# compare on the DISTINCTIVE remainder so a shared descriptor can't inflate the score (the earlier
+# "Capital/Ventures/Partners are distinctive enough to keep" assumption produced false shortlists —
+# Grant, 2026-06-20). If a name is ALL descriptor ('Family Office'), we fall back to its full tokens
+# so there's still something to compare.
+_GENERIC_ORG_WORDS = frozenset({
+    "investment", "investments", "investing", "investor", "investors",
+    "capital", "ventures", "venture", "partners", "partner", "group",
+    "fund", "funds", "management", "advisors", "advisers", "advisory",
+    "asset", "assets", "holdings", "holding", "family", "office",
+    "trust", "associates", "equity", "financial", "finance", "global",
+    "international", "company", "enterprises", "wealth", "the", "and", "of",
+})
+
+
+def _distinctive_tokens(normalized_name):
+    """Tokens of a (legal-suffix-stripped) name with generic firm descriptors removed. Falls back to
+    the full token list when the name is nothing but descriptors, so an all-generic name still compares."""
+    toks = re.findall(r"[a-z0-9]+", normalized_name)
+    keep = [t for t in toks if t not in _GENERIC_ORG_WORDS]
+    return keep or toks
+
+
 def _name_similarity(a, b):
     """0..1 fuzzy similarity between two investor names: the max of difflib's sequence ratio
     (catches near-spellings — 'Charlie'/'Charles') and token-set Jaccard overlap (catches
     word-order differences). Legal-entity suffixes are stripped first, so two names differing
     only by 'LLC'/'LP'/'Inc' score 1.0 (a near-certain duplicate to surface — find_intake_match
-    won't have caught it, since it compares the full string). Favors recall: a shared common
-    name-word ('… Capital') can lift unrelated firms into the 0.6–0.8 band — acceptable noise in
-    a ranked, human-confirmed shortlist; semantic pruning is the deferred LLM-judge's job."""
+    won't have caught it, since it compares the full string). Both the ratio and the Jaccard run on
+    the DISTINCTIVE tokens (generic descriptors like 'Investment Group'/'Capital' removed), so firms
+    that share only a descriptor don't surface as look-alikes; 'Aether Capital' ~ 'Aether Capital
+    Partners' still scores 1.0 on the distinctive 'aether'. Still recall-favoring on real overlap."""
     a = _normalize_text(a)
     b = _normalize_text(b)
     if not a or not b:
@@ -1323,9 +1349,10 @@ def _name_similarity(a, b):
     sb = _strip_legal_suffix(b) or b
     if sa == sb:
         return 1.0
-    ratio = difflib.SequenceMatcher(None, sa, sb).ratio()
-    ta = set(re.findall(r"[a-z0-9]+", sa))
-    tb = set(re.findall(r"[a-z0-9]+", sb))
+    da = _distinctive_tokens(sa)  # order-preserving for the sequence ratio
+    db = _distinctive_tokens(sb)
+    ratio = difflib.SequenceMatcher(None, " ".join(da), " ".join(db)).ratio()
+    ta, tb = set(da), set(db)
     jaccard = len(ta & tb) / len(ta | tb) if (ta or tb) else 0.0
     return max(ratio, jaccard)
 
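The effect of the _name_similarity change can be reproduced with a small standalone sketch. It is an approximation under stated assumptions: GENERIC is a trimmed stand-in for _GENERIC_ORG_WORDS, lowercasing stands in for _normalize_text, legal-suffix stripping is omitted, and the helper names (tokens, distinctive, full_similarity, distinctive_similarity) are hypothetical.

```python
import difflib
import re

# Trimmed stand-in for _GENERIC_ORG_WORDS (assumption: the real set is larger).
GENERIC = frozenset({"investment", "capital", "group", "partners", "family", "office"})


def tokens(name):
    """Lowercased alphanumeric tokens; a stand-in for the real normalization."""
    return re.findall(r"[a-z0-9]+", name.lower())


def distinctive(name):
    """Tokens with generic descriptors removed; all-generic names keep every token."""
    toks = tokens(name)
    return [t for t in toks if t not in GENERIC] or toks


def _score(ta, tb):
    """Max of difflib's sequence ratio and token-set Jaccard over two token lists."""
    ratio = difflib.SequenceMatcher(None, " ".join(ta), " ".join(tb)).ratio()
    sa, sb = set(ta), set(tb)
    jaccard = len(sa & sb) / len(sa | sb) if (sa or sb) else 0.0
    return max(ratio, jaccard)


def full_similarity(a, b):
    """Pre-change behaviour: compare on every token, descriptors included."""
    return _score(tokens(a), tokens(b))


def distinctive_similarity(a, b):
    """Post-change behaviour: compare on distinctive tokens only."""
    return _score(distinctive(a), distinctive(b))


old = full_similarity("Fortitude Investment Group", "Aether Investment Group")
new = distinctive_similarity("Fortitude Investment Group", "Aether Investment Group")
same_firm = distinctive_similarity("Aether Capital", "Aether Investment Group")  # 1.0
```

In this sketch the shared " investment group" substring alone pushes the full-token score above 0.6, while the distinctive comparison of "fortitude" against "aether" stays below 0.5, and the two Aether names still match exactly on their one distinctive token.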
@@ -1881,6 +1908,45 @@ def existing_investor_by_source_row(conn):
     return out
 
 
+def fundraising_contact_emails_by_row(conn):
+    """{ source_row_id: {'order': {sort_order: email}, 'name': {normalized_name: email}} } of the
+    authoritative email per grid contact, for HEALING blank pill emails on read.
+
+    The grid blob is canonical for the edit sheet, but an email can reach the linked classic
+    contact (via email capture / a contact edit) without ever being written back into the blob
+    pill — so the mobile "Edit investor" sheet shows an empty email for a contact the directory
+    clearly has (Grant, 2026-06-20). We recover it from the relational mirror: prefer the synced
+    fundraising_contacts.email, else the linked classic contacts.email (the source that actually
+    holds the captured address). Keyed by sort_order (pills and fundraising_contacts share the
+    blob order — the robust key) with a normalized-name fallback. Only non-blank emails are
+    returned; filling is fill-only-when-blank in the handler, so it heals and converges (the next
+    one-row save persists the recovered email into the blob)."""
+    out = {}
+    rows = conn.execute(
+        """
+        SELECT fi.source_row_id AS srid, fc.sort_order AS so, fc.full_name AS name,
+               COALESCE(NULLIF(TRIM(fc.email), ''), c.email) AS email
+        FROM fundraising_investors fi
+        JOIN fundraising_contacts fc ON fc.investor_id = fi.id
+        LEFT JOIN contacts c ON c.id = fc.contact_id AND c.deleted_at IS NULL
+        """
+    ).fetchall()
+    for r in rows:
+        email = str(r['email'] or '').strip()
+        if not email:
+            continue
+        srid = str(r['srid'] or '')
+        if not srid:
+            continue
+        bucket = out.setdefault(srid, {'order': {}, 'name': {}})
+        if r['so'] is not None:
+            bucket['order'][int(r['so'])] = email
+        nm = _normalize_text(r['name'])
+        if nm:
+            bucket['name'][nm] = email
+    return out
+
+
 def contact_grid_signals(conn, contact_id=None):
     """Return {contacts.id: {'committed': float, 'pipeline_stage': str|None, 'priority': bool}} for
     every classic contact linked to a fundraising-grid investor (via fundraising_contacts.contact_id,
@@ -5830,6 +5896,7 @@ class CRMHandler(BaseHTTPRequestHandler):
             reminder_by_row = reminder_status_by_source_row(conn)
             existing_by_row = existing_investor_by_source_row(conn)
             recency_by_row = staleness_by_source_row(conn)
+            emails_by_row = fundraising_contact_emails_by_row(conn)
             conn.close()
 
             try:
@@ -5873,6 +5940,19 @@ class CRMHandler(BaseHTTPRequestHandler):
                     last_activity, staleness = recency_by_row.get(srid, (None, ''))
                     r['last_activity_at'] = last_activity
                     r['staleness'] = staleness
+                    # Heal blank pill emails from the relational mirror (fill-only — never overwrite a value
+                    # already in the blob). Unlike the read-only columns above, email is a REAL blob field,
+                    # so this is a backfill, not a derived signal: it needs NO strip point, and the next
+                    # one-row save legitimately persists it. Match by pill order, then by name.
+                    heal = emails_by_row.get(srid)
+                    pills = r.get('contacts')
+                    if heal and isinstance(pills, list):
+                        for i, c in enumerate(pills):
+                            if not isinstance(c, dict) or str(c.get('email') or '').strip():
+                                continue
+                            found = heal['order'].get(i) or heal['name'].get(_normalize_text(c.get('name')))
+                            if found:
+                                c['email'] = found
 
                 return self.send_json({
                     "data": {
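The fill-only heal in the last hunk can be exercised in isolation on plain dicts. This is a sketch under assumptions: heal_blank_emails and normalize are hypothetical names, normalize is a simple stand-in for _normalize_text, and the heal argument has the {'order': ..., 'name': ...} shape that fundraising_contact_emails_by_row returns per source row.

```python
def normalize(value):
    """Stand-in for _normalize_text: lowercase and collapse whitespace."""
    return " ".join(str(value or "").lower().split())


def heal_blank_emails(pills, heal):
    """Fill blank pill emails in place; never overwrite an email already present."""
    for i, pill in enumerate(pills):
        # Skip malformed entries and any pill that already carries an email.
        if not isinstance(pill, dict) or str(pill.get("email") or "").strip():
            continue
        # Prefer the positional match, then fall back to the normalized name.
        found = heal["order"].get(i) or heal["name"].get(normalize(pill.get("name")))
        if found:
            pill["email"] = found
    return pills


pills = [
    {"name": "Ann Lee", "email": ""},                # blank: healed by position 0
    {"name": "Bo Chan", "email": "bo@old.example"},  # already set: left untouched
    {"name": "Cy  Dunn"},                            # no positional entry: healed by name
]
heal = {
    "order": {0: "ann@example.com", 1: "bo@new.example"},
    "name": {"cy dunn": "cy@example.com"},
}
heal_blank_emails(pills, heal)
```

Because the loop skips any pill that already has an email, running it again after a save has persisted the healed value changes nothing, which is the convergence property the docstring describes.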