Addresses Grant's feedback that the Workshop was confusing and underbuilt (no delete,
no approve, redundant generate-vs-feedback panels, and a stray "0" on segment lines).
Backend (architect_tools.py + server.py routes/handlers):
- retire_node: soft-delete a node + its subtree (reversible). DELETE /api/thesis/nodes/{id}.
- choose_variant: 'Use this' — keep this option, soft-delete the others in its group,
mark it approved. POST /api/thesis/nodes/{id}/choose.
- upsert_thesis_node gains actor_type so a manual human edit is recorded as 'human'.
PUT /api/thesis/nodes/{id} edits a part's text directly.
- handle_approve_line: one-click 'approve as current' — records this admin's approval on
the line's in-review version (creating + submitting one from the live tree if none),
promoting to canonical at the required distinct-approval count. POST /api/thesis/lines/{key}/approve.
Frontend (ThesisWorkshop redesign):
- Merged the redundant "Generate options" + "Give feedback" panels into one "Ask the
Architect for options" box (revise was just generate-with-guidance).
- Per option: Use this / Edit (inline) / Delete. Per part: edit + delete via the same.
- "Approve as current" bar with dual-sign-off state + a "Current ✓" badge, and a one-line
"how it works" hint. Refreshes the tree after every action.
- Fixed the stray "0": `{line.is_core && <badge>}` rendered 0 for non-core lines (SQLite
integer 0); now `{!!line.is_core && ...}`.
Verified: backend test_thesis_actions.py (choose/edit/retire-subtree/dual-approval->canonical),
and a live in-browser smoke test (JSX compiles, Workshop renders, options show Use/Edit/Delete,
approve returns 1-of-2, no runtime errors).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
After the grid/contacts unification (v0.1.0:52) the Contacts page's "Add New Contact"
modal was made unreachable (new people are added from the grid). This removes the now-dead
showForm/formData/formError state, handleAddContact, and the modal JSX. Other components'
forms are untouched; html parses clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Phase 1 Workstream D. Lets the Architect ground the thesis in REAL recurring LP
objections without any LP identity reaching the Claude API. Layered, defense-in-depth,
fail-closed by construction (docs/redaction-rehydration.md).
backend/redaction/:
- scrub.py: the leak-proof core. Drops Tier-1 (labelled/structured account/wire/SSN/
IBAN/SWIFT/passport, separator-tolerant); tokenizes known LP entities (dictionary from
the canonical layer, unicode-folded + hyphen-extended) and structured PII (emails,
scheme-less/social URLs, intl+ext phones, currency-cued amounts, ISO/worded/numeric/
quarter dates, addresses, bare long digit runs); pre-neutralizes injected [TYPE_N]
strings; single-pass rehydrate; metadata-only audit logging (the pseudonym map is the
de-anon key — local-only, never logged/sent). Hardened across THREE adversarial
leak-hunts (worded/coded amounts, intl phones, NFD/ligature/zero-width names, slash/
comma SSN, SWIFT, alpha-prefixed accounts, substance-preserving false-positive fixes).
- client.py: Boundary — one scrub/rehydrate contract, SCRUB_BACKEND=local (default) or
gateway (Spark Control /scrub + /rehydrate). Fails closed (db_path required; dictionary
build errors propagate; strict rehydrate returns tokenized-not-de-anon text).
- test_scrub_leak.py, test_reidentification.py: golden-file leak + re-identification
suites (synthetic only, guardrail #9), regression-locking every leak-hunt vector.
backend/mcp/architect_grounding.py: the flow — retrieve (local) -> minimize-first
(local Qwen) -> scrub (+ local-Qwen NER backstop for unknown names) -> Claude over the
de-identified register only -> re-hydrate locally -> human review. FAILS CLOSED if the
local model is unreachable or a hallucinated token appears. test_grounding_boundary.py
proves nothing sensitive reaches Claude and the three fail-closed paths.
server.py: POST /api/architect/ground (admin) wires retrieval -> ground_objections.
docker_entrypoint.sh: SCRUB_BACKEND (default local). docs/spark-control-scrub-endpoints.md:
the gateway handover spec (Option 1 — caller supplies the entity dictionary).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The fundraising grid's per-contact editor now has a LinkedIn URL field next to
name, email, title, and location. It threads through the grid contact object and
sanitize (which preserves contact-object fields), and _upsert_contact_from_fundraising
now reads and persists linkedin_url on both the update and insert paths — so a
LinkedIn entered in the grid lands on the linked contact record.
Test: test_grid_contact_link.py extended to assert LinkedIn entered in the grid
persists to the contact (idempotent). Frontend html.parser clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
backend/thesis_seed.py builds the starting "living messaging source of truth"
from docs/thesis-seed-v5.md: a core line (throughline; the open Option A/B banner
as a competing variant group; the three pillars; the proof; voice rules), one line
per LP segment carrying that segment's angle, and the five segment definitions.
ensure_thesis_seed(conn) runs from init_db, seeding ONLY when the Workshop is empty
(no thesis lines) — idempotent and non-destructive, so it bootstraps once and never
overwrites partner edits. Everything lands draft/candidate; nothing is made canonical
(that stays the partners' dual-approval action, guardrail #4). Content is Ten31's own
messaging, not LP data.
Test: backend/test_thesis_seed.py runs init_db and asserts the core line, 5 segment
lines, the 2-member Option A/B variant group, 3 pillars, segment_cuts, and segment
defs, plus re-seed-is-a-no-op (11/11).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Structural fix for the duplicate-people class of bug: instead of matching a grid
contact "pill" to a contacts row heuristically by name/email (which drifted and
caused the 1406 double-count), link them by id.
Backend:
- Migration 0004: fundraising_contacts.contact_id (additive, nullable, logical FK
to contacts(id)) + index. Paired down migration.
- sync_fundraising_relational now stores the id that _upsert_contact_from_fundraising
already returns, so every grid contact carries its contacts-table id.
- _backfill_grid_contact_ids: one-time, idempotent backfill on startup (re-runs the
grid sync once if any row lacks contact_id), so existing data links immediately.
- entity_resolution: grid pass prefers the explicit contact_id link (match_kind
'grid_link') over heuristic email / name+investor, guarded by a PRAGMA check so
older DBs without the column still work.
Frontend:
- Fundraising grid "+ Row" -> "+ Investor" (clear, single investor entry point).
- Contacts page: the "+ Add Contact" trigger is replaced by a pointer to the grid;
the page is now a read/search/edit view (ContactDetailPanel still edits all
fields). New people are added from the grid. No contact data is removed.
Tests: backend/ingest/test_entity_resolution.py extended (explicit-link case, 11/11)
and a new backend/test_grid_contact_link.py integration test (init_db applies 0004,
sync populates contact_id to the right contact, re-sync is idempotent). py_compile +
frontend html.parser clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Root cause: grid contacts (fundraising_contacts) are the SAME people as the
contacts table (the app syncs them by name/email), but resolution matched grid
rows by (name + investor-canon) where the two sides derive the investor key from
different tables that rarely line up — so nearly every grid contact minted a
duplicate person (715 + ~692 ≈ 1406), and the duplicate finder then flagged each
twin against its real self (~676 candidates).
Fix (entity_resolution.py):
- Grid pass matches a grid contact to its existing contacts-table person by
PROVABLE keys only (exact email, else exact name within the same investor) and
records membership; on a miss it MINTS NOTHING (the old else-branch mint was the
double-count source, and guessing by name across firms risks binding two
different same-named people).
- Targeted, audited cleanup soft-deletes leftover grid-only "twins" (person rows
with no 'contacts' link) and superseded pre-:48 'lp'/'organization' rows, guarded
so any row carrying enrichment/human data is never dropped (guardrail #3); the
tombstoned ids are logged to interaction_log (guardrail #5).
- _upsert_entity clears deleted_at on conflict so a re-emitted id is un-tombstoned
(no permanent burial); fuzzy-merge losers stay buried via _redirect.
entity_merge.py / server.py: the duplicate queue + pending count now filter to
candidates whose both sides are still live, so self-healed twins drop out.
Verified: offline reproduction test (backend/ingest/test_entity_resolution.py,
10/10) reproduces the 1406-style doubling and proves it collapses; no regression
on the synthetic dev set; two adversarial review passes. Known pre-existing
identity-key weaknesses (same name+firm+no email collision; shared role inbox
over-link) are unchanged by this fix and will be resolved structurally by the
contact_id link in the grid/contacts unification.
Run "Build search index" after upgrading to recompute the canonical layer.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Lets a non-technical operator install the Architect's Claude key from the
StartOS UI instead of the terminal: a masked text field whose value is written
to /data/secrets/anthropic-api-key (0600) on the box — the same file the
entrypoint already loads at boot. Secret is piped over stdin (never argv/env),
CR/LF stripped to match the entrypoint's read. allowedStatuses 'any'; a restart
is required (and stated in the action's warning + success message) since the
entrypoint reads the key only at startup.
Verified the Architect's data boundary first: the deployed Thesis Workshop
routes send only Ten31's own thesis text (thesis_lines/thesis_nodes) + the
partner-typed guidance to Claude — no contacts/lp_profiles/communications/grid.
(The MCP CRM-retrieval tools that DO return record substance are not wired into
the deployed Architect; the redaction boundary must land before any grounding
path uses them — Phase 1 Workstream D.)
tsc --noEmit clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Frontend: ThesisWorkshopPage / ThesisWorkshopNode / ThesisWorkshopOptions —
the collaborative iteration screen where partners generate a variable number
of competing thesis options (1, 2, 3, A1/A2/A3 ...) for any node, give
feedback, and regenerate. Reuses the shared api() helper; flexible option
count is the core UX constraint.
Backend Architect agent (architect_agent.py) + routes shipped in dd25bbc;
this completes the user-facing surface and bumps the StartOS package to
0.1.0:49 (anthropic dep already in the image, key loaded from
/data/secrets/anthropic-api-key — self-disabling until present).
Also lands thesis seed iterations v3 and v5 (voice/messaging corrections).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- backend/mcp/architect_agent.py: generate_options + revise on Claude (prompt-
cached thesis context, claude-opus-4-8, Ten31 voice rules). Writes N variant
drafts to a node's variant group; nothing canonical without human approval.
Fails gracefully if the API key / SDK is absent.
- server.py endpoints: GET /api/architect/status, GET /api/thesis/{key}/tree,
GET /api/thesis/nodes/{id}/variants, POST .../generate, POST .../feedback,
POST /api/thesis/lines, POST /api/thesis/lines/{key}/nodes. architect_tools
gains get_node_variants.
- Dockerfile installs `anthropic`; docker_entrypoint loads ANTHROPIC_API_KEY from
/data/secrets/anthropic-api-key (self-disabling until the key is dropped in).
Full HTTP surface verified end-to-end (graceful 502 without a key).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per Grant's clarification of the real data model:
- Investor entities come from the fundraising grid, one per row, all labeled
"investor" (drops the confusing lp/organization split). Grid is source of truth.
- People come ONLY from the contacts table. The grid's contacts (fundraising_
contacts) are matched to a contact-person and recorded as member_of links to
their investor, instead of creating duplicate person entities. This fixes the
~doubled people count (people now ≈ contacts, not contacts + grid contacts).
- System Status cards: Investors / People (resolved) / Contacts in CRM / Grid
contacts, so resolved-vs-source is visible at a glance.
Verified on synthetic: people == contacts count (no double-count); multi-contact
investors preserved via member_of.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The image COPY'd backend/server.py + a few subdirs but missed core_migrations.py,
backend/migrations/, and the Phase-1 modules (thesis_review/entity_merge/
entity_jobs). On the box the migrations never ran (tables absent) and those
endpoints 503'd ("Jobs unavailable"). Now COPY backend wholesale (.dockerignore
keeps __pycache__/data out). Bump to 0.1.0:46.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
These fundraising-state backup JSONs contain real investor data and were tracked
from the initial commit. Untracked (local files preserved) and gitignored, same
as data/crm.db. Keeps real data out of future commits and the package gitHash.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- frontend: System Status page extended with one-click index actions
(update/rebuild/find-duplicates, with live job status) and a human-in-the-loop
duplicate-review queue (approve=merge / reject=keep-separate per candidate).
- StartOS version 0.1.0:45 (image-only; schema via the in-app migration runner).
Backend + new routes verified end-to-end via the running HTTP server.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Dual sign-off is now the default (thesis_required_approvals defaults to 2).
- Entity-merge review queue (migration 0003): the fuzzy/Qwen tier no longer
auto-merges — it writes CANDIDATES (entity_merge_candidates) with a same/different
suggestion + confidence + reason for a human to approve (merge) or reject (keep
separate). entity_merge.py applies/rejects (durable via entity_merges, soft-delete,
repoint links+edges); decided pairs aren't re-surfaced.
- entity_jobs.py: UI-triggered background index jobs (rebuild/update/find-duplicates)
as subprocesses with a one-at-a-time lock; status in /api/system/status.
- server.py: /api/index/{rebuild,update}, /api/entities/find-duplicates,
/api/entities/merge-candidates [+ /{id} decide] — admin-gated.
- docs/thesis-seed-v2.md: concrete, plain-English rewrite per Grant's feedback.
Backend verified end-to-end on synthetic data (candidate gen -> approve/reject).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two React-via-Babel views in the CRM SPA, reusing the existing api() helper and
conventions:
- Thesis: lists thesis lines + the in-review queue with approvals/required pills;
version detail renders throughline/pillars/claims/objections + the reviews
timeline; admin review form (approve/request-changes/comment + feedback) ->
POST /api/thesis/versions/{id}/review (the dual-approval feedback loop).
- System Status: entity counts, last index sync, thesis counts, recent activity
from the interaction log — index health visible in-app, no shell.
Backend + full approve flow verified end-to-end via the running HTTP server.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- entity_resolution: emit member_of relationship edges (contact -> investor),
so one investor entity owns many contacts (institution) and a HNWI is the N=1
case; crm_tools.get_investor_contacts + get_entity contacts/member_of; MCP tool.
- seed_synthetic: multi-contact institutions to exercise it (Harbor & Vine = 5).
- server.py: GET /api/system/status (index/entity/thesis/activity health) for an
in-app status view (no shell needed to verify the index).
- docs/thesis-seed-v1.md: grounded v1 thesis (throughline, 6 pillars, objections,
per-segment angles, voice) drawn from Ten31's newsletter/site/essays.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- backend/ingest/sync_scheduler.py: periodic incremental-sync loop (every
CRM_INGEST_SYNC_INTERVAL_MIN min); resilient, --once for testing.
- start9/0.4: "Refresh search index" action (incremental sync.py); entrypoint
launches the scheduler as a background process when Spark/Qdrant are set;
CRM_INGEST_SYNC_INTERVAL_MIN env; pre-release note on fastembed/mcp pins.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Fuzzy tier (backend/ingest/fuzzy_resolve.py + llm.py): local Qwen adjudicates
the deterministic resolver's flagged name-variant candidates; merges are
durable via entity_merges (deterministic re-runs respect them), losers
soft-deleted, logged. Idempotent.
- Incremental sync (backend/ingest/sync.py): re-embeds only rows changed since a
watermark (ingest_sync_state); first run / --recreate = full. Tested full→0→1.
- Start9 packaging (start9/0.4): Dockerfile bundles ingest+mcp + fastembed/mcp;
"Build search index" action runs the init in a subcontainer; MCP shipped as a
manual stdio server (not a daemon); version 0.1.0:44. INGEST_PACKAGING.md.
- backfill.py: factored embed_and_upsert() shared with sync.
Verified end-to-end on synthetic data + live Sparks/Qwen/Qdrant.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>