- entity_resolution: emit member_of relationship edges (contact -> investor),
so one investor entity owns many contacts (institution) and a HNWI is the N=1
case; crm_tools.get_investor_contacts + get_entity contacts/member_of; MCP tool.
- seed_synthetic: multi-contact institutions to exercise it (Harbor & Vine = 5).
- server.py: GET /api/system/status (index/entity/thesis/activity health) for an
in-app status view (no shell needed to verify the index).
- docs/thesis-seed-v1.md: grounded v1 thesis (throughline, 6 pillars, objections,
per-segment angles, voice) drawn from Ten31's newsletter/site/essays.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Fuzzy tier (backend/ingest/fuzzy_resolve.py + llm.py): local Qwen adjudicates
the deterministic resolver's flagged name-variant candidates; merges are
durable via entity_merges (deterministic re-runs respect them), losers
soft-deleted, logged. Idempotent.
- Incremental sync (backend/ingest/sync.py): re-embeds only rows changed since a
watermark (ingest_sync_state); first run / --recreate = full. Tested full→0→1.
- Start9 packaging (start9/0.4): Dockerfile bundles ingest+mcp + fastembed/mcp;
"Build search index" action runs the init in a subcontainer; MCP shipped as a
manual stdio server (not a daemon); version 0.1.0:44. INGEST_PACKAGING.md.
- backfill.py: factored embed_and_upsert() shared with sync.
Verified end-to-end on synthetic data + live Sparks/Qwen/Qdrant.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>