d16264f401
Root cause: grid contacts (fundraising_contacts) are the SAME people as the contacts table (the app syncs them by name/email), but resolution matched grid rows by (name + investor-canon), where the two sides derive the investor key from different tables that rarely line up. As a result, nearly every grid contact minted a duplicate person (715 + ~692 ≈ 1406), and the duplicate finder then flagged each twin against its real self (~676 candidates).

Fix (entity_resolution.py):
- Grid pass matches a grid contact to its existing contacts-table person by PROVABLE keys only (exact email, else exact name within the same investor) and records membership; on a miss it MINTS NOTHING (the old else-branch mint was the double-count source, and guessing by name across firms risks binding two different same-named people).
- Targeted, audited cleanup soft-deletes leftover grid-only "twins" (person rows with no 'contacts' link) and superseded pre-:48 'lp'/'organization' rows, guarded so any row carrying enrichment/human data is never dropped (guardrail #3); the tombstoned ids are logged to interaction_log (guardrail #5).
- _upsert_entity clears deleted_at on conflict so a re-emitted id is un-tombstoned (no permanent burial); fuzzy-merge losers stay buried via _redirect.

entity_merge.py / server.py: the duplicate queue and the pending count now filter to candidates whose sides are both still live, so self-healed twins drop out.

Verified: offline reproduction test (backend/ingest/test_entity_resolution.py, 10/10) reproduces the 1406-style doubling and proves it collapses; no regression on the synthetic dev set; two adversarial review passes.

Known pre-existing identity-key weaknesses (same name+firm+no email collision; shared role inbox over-link) are unchanged by this fix and will be resolved structurally by the contact_id link in the grid/contacts unification.

Run "Build search index" after upgrading to recompute the canonical layer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
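The provable-key matching rule described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in entity_resolution.py: the `Person` and `GridContact` types, the `match_grid_contact` helper, the case-insensitive normalization, the fall-through to the name rule when an email finds no hit, and the "exactly one hit" requirement are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Person:
    """Hypothetical stand-in for a contacts-table person row."""
    person_id: str
    name: str
    email: Optional[str]
    investor_id: Optional[str]


@dataclass(frozen=True)
class GridContact:
    """Hypothetical stand-in for a fundraising_contacts (grid) row."""
    name: str
    email: Optional[str]
    investor_id: Optional[str]


def _norm(value: Optional[str]) -> str:
    # Assumption: "exact" is taken to mean equal after trimming and lowercasing.
    return (value or "").strip().lower()


def match_grid_contact(grid: GridContact, people: Iterable[Person]) -> Optional[str]:
    """Return the person_id of the matching person, or None.

    None means "no provable match": the caller records nothing and,
    crucially, does not mint a new person.
    """
    people = list(people)

    # Key 1: exact email.
    email = _norm(grid.email)
    if email:
        hits = [p for p in people if _norm(p.email) == email]
        if len(hits) == 1:  # assumption: an ambiguous hit is treated as a miss
            return hits[0].person_id

    # Key 2: exact name, but only within the same investor.
    name = _norm(grid.name)
    if name and grid.investor_id:
        hits = [
            p for p in people
            if p.investor_id == grid.investor_id and _norm(p.name) == name
        ]
        if len(hits) == 1:
            return hits[0].person_id

    # Miss: never guess by name across firms, never create a twin.
    return None
```

Under these assumptions, a same-named contact at a different investor resolves to `None` rather than being bound to the wrong person, which is the behavior the fix relies on to stop the double count.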
26 lines
1.5 KiB
TypeScript
// Informational constants shared across the startos/ modules.
// The authoritative id, title and version for the package come
// from manifest/index.ts (id, title) and versions/ (version).
export const PACKAGE_ID = 'ten-database'
export const PACKAGE_TITLE = 'Ten31 Database'
// ExVer form of the current 0.4 wrapper release (upstream 0.1.0, wrapper rev 51).
// * 0.3.5 wrapper: 0.1.0.38 (legacy, aarch64)
// * First 0.4: 0.1.0:39 (shipped seed snapshot for migration)
// * Cleanup: 0.1.0:40 (seed removed + multi-threaded server + abuser auto-ban)
// * 0.1.0:41 (frontend persists auth across refreshes)
// * 0.1.0:42 (Gmail integration) / 0.1.0:43 (Gmail POST-body hotfix)
// * 0.1.0:44 (Phase-0 ingest + MCP server in image; build-index action)
// * 0.1.0:45 (Phase-1 thesis system; dual approval; merge review; in-app index)
// * 0.1.0:46 (packaging fix: ship full backend so migrations run + endpoints work)
// * 0.1.0:47 (soft-delete instead of hard-delete; source-count diagnostics)
// * 0.1.0:48 (entity model: investors vs people; fixes double-count)
// * 0.1.0:49 (Architect: Claude thesis generation + Thesis Workshop screen)
// * 0.1.0:50 (Set Anthropic API Key UI action; no terminal needed)
// * Current: 0.1.0:51 (entity-resolution fix: people double-count + duplicate queue)
export const PACKAGE_VERSION = '0.1.0:51'

export const DATA_MOUNT_PATH = '/data'
export const WEB_PORT = 8080
export const IMAGE_ID = 'main'
export const VOLUME_ID = 'main'