Root cause: grid contacts (fundraising_contacts) are the SAME people as the
contacts table (the app syncs them by name/email), but resolution matched grid
rows by (name + investor-canon), and the two sides derive the investor key from
different tables that rarely line up. As a result nearly every grid contact
minted a duplicate person (715 + ~692 ≈ 1406), and the duplicate finder then
flagged each twin against its real self (~676 candidates).
Fix (entity_resolution.py):
- Grid pass matches a grid contact to its existing contacts-table person by
PROVABLE keys only (exact email, else exact name within the same investor) and
records membership; on a miss it MINTS NOTHING (the old else-branch mint was the
double-count source, and guessing by name across firms risks binding two
different same-named people).
- Targeted, audited cleanup soft-deletes leftover grid-only "twins" (person rows
with no 'contacts' link) and superseded pre-:48 'lp'/'organization' rows, guarded
so any row carrying enrichment/human data is never dropped (guardrail #3); the
tombstoned ids are logged to interaction_log (guardrail #5).
- _upsert_entity clears deleted_at on conflict so a re-emitted id is un-tombstoned
(no permanent burial); fuzzy-merge losers stay buried via _redirect.
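
Illustrative sketch of the provable-key match described in the first bullet
(exact email, else exact name within the same investor, and no mint on a miss).
This is not the code in entity_resolution.py: Person, _norm and
match_grid_contact are hypothetical names, and requiring a unique hit is this
sketch's own assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Person:
    """Hypothetical stand-in for a person resolved from the contacts table."""
    person_id: str
    name: str
    email: Optional[str]
    investor_key: Optional[str]


def _norm(value: Optional[str]) -> str:
    # Case- and whitespace-insensitive comparison key; empty string for None.
    return (value or "").strip().casefold()


def match_grid_contact(
    grid_name: Optional[str],
    grid_email: Optional[str],
    grid_investor_key: Optional[str],
    people: Iterable[Person],
) -> Optional[str]:
    """Return the person_id a grid contact provably refers to, else None.

    None means "record nothing and mint nothing"; the caller must not
    create a new person on a miss.
    """
    people = list(people)

    # Key 1: exact email.
    email = _norm(grid_email)
    if email:
        hits = [p for p in people if _norm(p.email) == email]
        if len(hits) == 1:  # assumption: ambiguous hits are not provable
            return hits[0].person_id

    # Key 2: exact name, but only inside the same investor.
    name, investor = _norm(grid_name), _norm(grid_investor_key)
    if name and investor:
        hits = [
            p for p in people
            if _norm(p.name) == name and _norm(p.investor_key) == investor
        ]
        if len(hits) == 1:
            return hits[0].person_id

    # No provable key: never guess by name across firms.
    return None
```

A same-named person at a different firm returns None here, so the grid row
only fails to link; it does not become a second person.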
entity_merge.py / server.py: the duplicate queue + pending count now filter to
candidates where both sides are still live, so self-healed twins drop out.
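
Illustrative sketch of how the un-tombstoning upsert and the live-only queue
filter interact, using an in-memory SQLite schema. Table and column names
other than deleted_at and entity_merge_candidates (entities, left_id,
right_id, status) are assumptions, and upsert_entity / pending_candidates are
hypothetical stand-ins for the real functions.

```python
import sqlite3

# Assumed minimal schema; the real tables carry more columns.
SCHEMA = """
CREATE TABLE entities (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    deleted_at TEXT
);
CREATE TABLE entity_merge_candidates (
    id INTEGER PRIMARY KEY,
    left_id TEXT NOT NULL,
    right_id TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
);
"""


def upsert_entity(conn: sqlite3.Connection, entity_id: str, name: str) -> None:
    """Insert or update an entity; a re-emitted id has its tombstone cleared."""
    conn.execute(
        "INSERT INTO entities (id, name, deleted_at) VALUES (?, ?, NULL) "
        "ON CONFLICT(id) DO UPDATE SET name = excluded.name, deleted_at = NULL",
        (entity_id, name),
    )


def pending_candidates(conn: sqlite3.Connection) -> list:
    """Pending candidates whose two sides are both still live."""
    return conn.execute(
        """
        SELECT c.id, c.left_id, c.right_id
        FROM entity_merge_candidates AS c
        JOIN entities AS l ON l.id = c.left_id AND l.deleted_at IS NULL
        JOIN entities AS r ON r.id = c.right_id AND r.deleted_at IS NULL
        WHERE c.status = 'pending'
        ORDER BY c.id
        """
    ).fetchall()
```

With this shape the pending count can be taken as len(pending_candidates(conn)),
so a candidate disappears as soon as either side is soft-deleted.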
Verified: offline reproduction test (backend/ingest/test_entity_resolution.py,
10/10) reproduces the 1406-style doubling and proves it collapses; no regression
on the synthetic dev set; two adversarial review passes. Known pre-existing
identity-key weaknesses (same name+firm+no email collision; shared role inbox
over-link) are unchanged by this fix and will be resolved structurally by the
contact_id link in the grid/contacts unification.
Run "Build search index" after upgrading to recompute the canonical layer.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- backend/mcp/architect_agent.py: generate_options + revise on Claude (prompt-
cached thesis context, claude-opus-4-8, Ten31 voice rules). Writes N variant
drafts to a node's variant group; nothing canonical without human approval.
Fails gracefully if the API key / SDK is absent.
- server.py endpoints: GET /api/architect/status, GET /api/thesis/{key}/tree,
GET /api/thesis/nodes/{id}/variants, POST .../generate, POST .../feedback,
POST /api/thesis/lines, POST /api/thesis/lines/{key}/nodes. architect_tools
gains get_node_variants.
- Dockerfile installs `anthropic`; docker_entrypoint loads ANTHROPIC_API_KEY from
/data/secrets/anthropic-api-key (self-disabling until the key is dropped in).
Full HTTP surface verified end-to-end (graceful 502 without a key).
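
Illustrative sketch of the self-disabling behaviour: read the key from the
secrets file if it exists, and report the feature as disabled when either the
key or the SDK is missing. load_api_key and architect_status are hypothetical
names, the env-first precedence is an assumption, and the real
docker_entrypoint may do this differently.

```python
import importlib.util
import os
from pathlib import Path
from typing import Mapping, Optional


def load_api_key(
    secret_path: str, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return the API key from the environment or the secrets file, else None."""
    env = os.environ if env is None else env
    existing = env.get("ANTHROPIC_API_KEY", "").strip()
    if existing:
        return existing
    try:
        key = Path(secret_path).read_text(encoding="utf-8").strip()
    except OSError:
        # Missing or unreadable file: stay disabled instead of crashing.
        return None
    return key or None


def architect_status(api_key: Optional[str]) -> dict:
    """Summarise whether the agent can run, without importing the SDK."""
    sdk_present = importlib.util.find_spec("anthropic") is not None
    return {
        "has_key": bool(api_key),
        "has_sdk": sdk_present,
        "enabled": bool(api_key) and sdk_present,
    }
```

A handler built on this could answer with an error status when "enabled" is
false, which is in line with the graceful 502 noted above.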
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per Grant's clarification of the real data model:
- Investor entities come from the fundraising grid, one per row, all labeled
"investor" (drops the confusing lp/organization split). Grid is source of truth.
- People come ONLY from the contacts table. The grid's contacts (fundraising_
contacts) are matched to a contact-person and recorded as member_of links to
their investor, instead of creating duplicate person entities. This fixes the
roughly doubled people count (people now ≈ contacts, not contacts + grid contacts).
- System Status cards: Investors / People (resolved) / Contacts in CRM / Grid
contacts, so resolved-vs-source is visible at a glance.
Verified on synthetic: people == contacts count (no double-count); multi-contact
investors preserved via member_of.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Dual sign-off is now the default (thesis_required_approvals defaults to 2).
- Entity-merge review queue (migration 0003): the fuzzy/Qwen tier no longer
auto-merges — it writes CANDIDATES (entity_merge_candidates) with a same/different
suggestion + confidence + reason for a human to approve (merge) or reject (keep
separate). entity_merge.py applies/rejects (durable via entity_merges, soft-delete,
repoint links+edges); decided pairs aren't re-surfaced.
- entity_jobs.py: UI-triggered background index jobs (rebuild/update/find-duplicates)
as subprocesses with a one-at-a-time lock; status in /api/system/status.
- server.py: /api/index/{rebuild,update}, /api/entities/find-duplicates,
/api/entities/merge-candidates [+ /{id} decide] — admin-gated.
- docs/thesis-seed-v2.md: concrete, plain-English rewrite per Grant's feedback.
Backend verified end-to-end on synthetic data (candidate gen -> approve/reject).
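
Illustrative in-memory sketch of the review-queue rule above: the fuzzy tier
only proposes candidates, a human decides, and a decided pair is never
re-surfaced. MergeQueue and pair_key are hypothetical; the real queue is
persisted in entity_merge_candidates / entity_merges.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


def pair_key(a: str, b: str) -> Tuple[str, str]:
    """Order-independent key so (a, b) and (b, a) are the same pair."""
    return (a, b) if a <= b else (b, a)


@dataclass
class MergeQueue:
    pending: Dict[Tuple[str, str], dict] = field(default_factory=dict)
    decided: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def propose(self, a: str, b: str, suggestion: str,
                confidence: float, reason: str) -> bool:
        """Queue a candidate; return False if it is a repeat or already decided."""
        key = pair_key(a, b)
        if a == b or key in self.decided or key in self.pending:
            return False
        self.pending[key] = {
            "suggestion": suggestion,  # "same" or "different"
            "confidence": confidence,
            "reason": reason,
        }
        return True

    def decide(self, a: str, b: str, approve: bool) -> str:
        """Record the human decision; raises KeyError if the pair is not pending."""
        key = pair_key(a, b)
        self.pending.pop(key)
        self.decided[key] = "merged" if approve else "kept_separate"
        return self.decided[key]
```

Nothing is merged inside propose; the merge itself (soft-delete, repointing
links and edges) would happen only on an approving decide call.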
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- entity_resolution: emit member_of relationship edges (contact -> investor),
so one investor entity owns many contacts (institution) and an HNWI is the N=1
case; crm_tools.get_investor_contacts + get_entity contacts/member_of; MCP tool.
- seed_synthetic: multi-contact institutions to exercise it (Harbor & Vine = 5).
- server.py: GET /api/system/status (index/entity/thesis/activity health) for an
in-app status view (no shell needed to verify the index).
- docs/thesis-seed-v1.md: grounded v1 thesis (throughline, 6 pillars, objections,
per-segment angles, voice) drawn from Ten31's newsletter/site/essays.
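
Illustrative sketch of the member_of model from the first bullet: edges point
contact -> investor, and an investor's contacts are read back by grouping on
the target. build_membership and contacts_for_investor are hypothetical
helpers, not the crm_tools API, and the (source, relation, target) tuple shape
is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str, str]  # (contact_id, relation, investor_id)


def build_membership(edges: Iterable[Edge]) -> Dict[str, List[str]]:
    """Group member_of edges by investor, ignoring any other relation."""
    by_investor: Dict[str, List[str]] = defaultdict(list)
    for contact_id, relation, investor_id in edges:
        if relation == "member_of":
            by_investor[investor_id].append(contact_id)
    return dict(by_investor)


def contacts_for_investor(index: Dict[str, List[str]], investor_id: str) -> List[str]:
    """Contacts of one investor; an individual investor is just the N=1 case."""
    return list(index.get(investor_id, []))
```

An institution and an individual go through the same lookup; only the length
of the returned list differs.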
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>