Ten31 Venture CRM + Agentic System — AGENTS.md
The foundation is a self-hosted venture-fund CRM — a purpose-built fundraising tool that replaced Airtable to (1) keep sensitive LP/prospect data off third-party servers, (2) drop subscription cost, and (3) fit the fund's workflow: managing ~150 existing LPs, tracking 250+ prospects, and running the capital-raise pipeline. Core CRM domain: contacts (investor/prospect/advisor), organizations, opportunities (the deal pipeline), and communications; investor commitments live in the canonical fundraising_* grid (the legacy single-fund lp_profiles table was retired in v0.1.0:78). The fund (Ten31, ~$200M AUM, bitcoin/energy/AI thesis) runs it on a Start9 box, accessed over ClearNet (StartOS StartTunnel) with app-level user auth by a team of ~5 (Tailscale is not in use). Schema/API tour: docs/crm-overview.md.
The agentic system is new functionality built on top of that CRM — an in-house AI layer to widen the fundraising funnel, sharpen the thesis, and automate outreach drafting. Frontier reasoning runs on Claude (Agent SDK/API); privacy-sensitive and bulk work runs on local DGX Spark models via the Spark Control gateway. Phase 0/1 — no live outward-facing agents; agents draft, humans send.
Inbox check: At session start, if ~/Projects/standards/INBOX.md exists, scan it for items tagged (CRM) and surface them before proposing next steps; triage with /triage.
Stack (versions that matter)
- Python 3.11, standard library only at runtime. The CRM is one monolith, backend/server.py (~5k lines): a stdlib http.server.ThreadingHTTPServer + hand-written CRMHandler with manual path dispatch (do_GET/do_POST). Not FastAPI. backend/requirements.txt lists FastAPI/SQLAlchemy/Alembic/Pydantic/pytest-style deps but none are imported at runtime (vestigial).
- SQLite at data/crm.db (WAL, foreign_keys=ON), opened per-request via get_db(). Schema via ordered migrations.
- Frontend: single frontend/index.html, inline-Babel React. No build step.
- Optional runtime deps, used only if present: bcrypt, PyJWT (jwt), cryptography (Gmail module).
- MCP + ingest (in the Docker image, not the bare CRM): mcp==1.2.0 (FastMCP, backend/mcp/server.py), fastembed==0.4.2, anthropic, cryptography==42.0.5.
- Packaging: StartOS 0.4, TypeScript SDK (@start9labs/start-sdk) under start9/0.4/startos/. Live target is start9/0.4/.
- Local models (bge-m3 embeddings, bge-reranker-v2-m3, /api/search, Qdrant): always via Spark Control. Contract: docs/EMBEDDINGS.md.
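For orientation, the stdlib pattern described above (a threaded `http.server` with hand-written path dispatch and a per-request SQLite connection) might look roughly like the sketch below. It is not an excerpt from backend/server.py: `SketchHandler`, `route_get`, `make_server`, and the `/api/ping` route are hypothetical names chosen for illustration, and only `get_db`, the WAL setting, and `foreign_keys=ON` come from the text.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlparse


def get_db(path="data/crm.db"):
    """Open a fresh connection for a single request and configure it."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA journal_mode=WAL")   # has no effect on ":memory:" databases
    conn.execute("PRAGMA foreign_keys=ON")    # foreign keys are a per-connection setting
    return conn


def route_get(path):
    """Manual path dispatch: map a request path to (status, payload)."""
    if path == "/api/ping":                   # hypothetical route, for illustration only
        return 200, {"ok": True}
    return 404, {"error": "not found"}


class SketchHandler(BaseHTTPRequestHandler):  # hypothetical stand-in for CRMHandler
    def do_GET(self):
        status, payload = route_get(urlparse(self.path).path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="127.0.0.1", port=8080):
    """Build (but do not start) a threaded server; call serve_forever() to run it."""
    return ThreadingHTTPServer((host, port), SketchHandler)
```

Keeping the dispatch in a plain function makes it checkable without opening a socket, which suits a codebase whose tests are standalone scripts.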
Commands
# Run locally (dev, port 8080; or ./start.sh <port>) — runs python3 backend/server.py
./start.sh
# Run prod-mode (beta) — requires CRM_SECRET_KEY
./start_beta.sh
# Sanity-check edits (there is no compiler/build for the CRM)
python3 -m py_compile backend/server.py
# Run ONE test (tests are standalone scripts with `if __name__ == "__main__"`; no pytest installed)
python3 backend/redaction/test_scrub_leak.py # substitute any backend/**/test_*.py
# Run all tests (aggregate runner — runs each backend/**/test_*.py in its own subprocess)
python3 backend/run_tests.py # add substrings to filter, e.g. `... soft_delete redaction`
# Build + install the s9pk — BUMP THE VERSION FIRST. See docs/guides/packaging.md.
cd start9/0.4 && make
- Migrations apply automatically at startup (backend/core_migrations.py, schema_migrations ledger). See docs/guides/migrations.md before adding one.
- Lint: none configured.
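As a rough illustration of what "ordered migrations with a ledger, applied at startup" usually means, here is a minimal sketch. It assumes a `schema_migrations` table with a `version` column and treats the four-digit file prefix as the version; the real column layout and logic in backend/core_migrations.py may differ, so read docs/guides/migrations.md before relying on any of it. `apply_migrations` is a hypothetical name.

```python
import sqlite3
from pathlib import Path


def apply_migrations(conn, migrations_dir):
    """Apply pending NNNN_*.sql files in order; return the versions applied now."""
    # Assumed ledger shape: one row per applied version.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "version TEXT PRIMARY KEY, "
        "applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    newly_applied = []
    for path in sorted(Path(migrations_dir).glob("[0-9][0-9][0-9][0-9]_*.sql")):
        if path.name.endswith(".down.sql"):
            continue                          # paired rollback files are never auto-applied
        version = path.name.split("_", 1)[0]
        if version in applied:
            continue                          # already recorded in the ledger
        conn.executescript(path.read_text())  # run the forward migration
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied
```

Running it twice against the same database should apply nothing the second time, which is the property that makes startup-time application safe.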
Directory layout (day-one)
- backend/server.py: the CRM monolith (HTTP handler, route dispatch, init_db(), and auth via username/password → HS256 JWT, roles admin/member).
- backend/core_migrations.py + backend/migrations/NNNN_*.sql (+ paired .down.sql): additive schema migrations, applied at startup.
- backend/thesis_seed.py: Thesis Workshop seed + idempotent ensure_* one-time seeders, wired in server.init_db().
- backend/thesis_review.py: thesis version review/approval (human dual sign-off → canonical).
- backend/mcp/: architect_agent.py (Claude thesis copilot), architect_tools.py, outreach_agent.py (LP draft assistant), architect_grounding.py, crm_tools.py, server.py (FastMCP).
- backend/email_integration/: Gmail capture via domain-wide delegation + Tier-B draft creation (compose.py).
- backend/redaction/: scrub.py + client.py, the scrub→Claude→re-hydrate privacy boundary.
- backend/ingest/: chunk→embed→Qdrant + retrieval modes.
- backend/entity_*.py: entity resolution/merge (the two-investor-model reconciliation).
- backend/matrix_intake/: Matrix intake bot (separate process; matrix-nio, isolated to this component). Flow: typed message → local-Qwen parse → in-thread approve → write via the CRM's own log-communication. See the matrix-intake guide.
- frontend/index.html: the entire UI.
- docs/: architecture, phase plans, contracts, runbooks (see Deeper docs).
- docs/guides/: scoped subsystem rules (see below).
- start9/0.4/: StartOS package (startos/utils.ts holds PACKAGE_VERSION).
- data/crm.db: the live DB (gitignored).
- .env / .env.example: config (.env gitignored).
Scoped guides
Subsystem rules live in docs/guides/ and lazy-load in Claude Code via .claude/rules/ symlinks (scoped by paths: frontmatter). Read the guide before editing that area:
- Migrations or seeders (backend/migrations/, core_migrations.py, thesis_seed.py) → docs/guides/migrations.md
- Thesis logic (backend/thesis_*.py, backend/mcp/architect_*.py) → docs/guides/thesis.md
- Redaction or any MCP/Claude path (backend/redaction/, backend/mcp/) → docs/guides/redaction.md
- Ingest / retrieval (backend/ingest/) → docs/guides/spark-ingest.md
- Email capture / drafts + digest send (backend/email_integration/, backend/digest_mailer.py, backend/smtp_send.py) → docs/guides/email.md
- Building or deploying the s9pk (start9/) → docs/guides/packaging.md
- Matrix intake bot (backend/matrix_intake/) → docs/guides/matrix-intake.md
Conventions
- Investor model: the grid is canonical (since v0.1.0:78). The fundraising_* grid is the system of record: an investor entity (row) → many contact "pills" → per-fund commitments. The classic contacts table is a read-only per-person directory, auto-populated from the grid, so create/edit people in the grid, not the Contacts page. Email capture rolls multiple people up to one investor. The legacy single-fund lp_profiles model is retired (empty table kept, per never-hard-delete). Reconciling grid ↔ classic contacts to canonical IDs is the core entity-resolution task; see docs/crm-overview.md.
- Soft-delete only: deleted_at and/or status='retired'; never hard-delete. Every READ path must filter deleted_at IS NULL: list handlers, get-by-id, nested related-data sub-selects, and aggregate sub-selects (COUNT/SUM/MAX). Audits found leaks in all of these (2026-06-12 detail + nested; 2026-06-13 list-view contact_count/total_funded/comm_count); the reports subsystem aggregates still leak (see Current state). Regression-guarded by backend/test_soft_delete_reads.py. (Thesis has a subtlety here; see the thesis guide.)
- Env: secrets in .env (gitignored); names in .env.example. Verified names: ANTHROPIC_API_KEY, SPARK_CONTROL_URL, SPARK_CONTROL_VERIFY_TLS, QDRANT_URL, X_API_KEY, CRM_DB_PATH, CRM_DEV_DB_PATH. Also used: CRM_SECRET_KEY (beta/prod), CRM_HOST/CRM_PORT, CRM_DATA_DIR; digest mailer: CRM_DIGEST_SENDER (DWD impersonation sender) + SMTP_HOST/SMTP_PORT/SMTP_SECURITY/SMTP_FROM/SMTP_USERNAME/SMTP_PASSWORD (SMTP fallback); daily digest (Phase B): CRM_DIGEST_ENABLED + CRM_DIGEST_SEND_HOUR only seed the first-boot default; the live control is the DB policy (app_settings.digest_policy, set in Settings → Admin).
- Config placement: operational/feature toggles live in the admin panel, DB-backed via app_settings (read-merge through a load_*_policy(conn) helper shared by the API + any scheduler; precedence DB-row → env-seed → default), so they're discoverable and take effect live. Reserve StartOS actions / env for secrets and deploy-time config (SMTP creds, API keys, DWD sender). Precedent: digest_policy (GET/PATCH /api/admin/digest/policy), fundraising_backup_policy.
- Commit style: imperative subject, concise body explaining the why; put the package version in the subject (… (v0.1.0:NN)) for shippable changes. No AI co-author / attribution trailers; commits are authored by the user.
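The "DB-row → env-seed → default" precedence from the config-placement convention can be sketched as a small read-merge helper. This is an illustration only: it assumes `app_settings` is a key/value table holding JSON, and the default values and the `load_digest_policy` name are hypothetical stand-ins for the real `load_*_policy(conn)` helpers.

```python
import json
import os
import sqlite3

# Hypothetical defaults; the real first-boot defaults may differ.
DEFAULT_DIGEST_POLICY = {"enabled": False, "send_hour": 8}

TRUTHY = {"1", "true", "yes", "on"}


def load_digest_policy(conn, environ=os.environ):
    """Merge a policy with precedence DB row, then env seed, then default."""
    policy = dict(DEFAULT_DIGEST_POLICY)          # lowest precedence

    # Env values only seed the starting point.
    if "CRM_DIGEST_ENABLED" in environ:
        policy["enabled"] = environ["CRM_DIGEST_ENABLED"].strip().lower() in TRUTHY
    if "CRM_DIGEST_SEND_HOUR" in environ:
        try:
            policy["send_hour"] = int(environ["CRM_DIGEST_SEND_HOUR"])
        except ValueError:
            pass                                  # ignore a malformed seed, keep the default

    # A stored row wins over both (assumed schema: key TEXT, value JSON TEXT).
    row = conn.execute(
        "SELECT value FROM app_settings WHERE key = ?", ("digest_policy",)
    ).fetchone()
    if row:
        policy.update(json.loads(row[0]))
    return policy
```

Because the API handler and the scheduler would both call the same helper, a change saved from the admin panel takes effect on the next read without a restart.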
Always
- Verify before shipping: python3 -m py_compile the edited files; for DB logic, run the change against a copy of data/crm.db, never production.
- Keep real LP data out of Claude: develop only on code/schema/synthetic-or-locally-redacted data; route any real record substance through backend/redaction first.
- Get explicit user authorization before any production deploy/install to $START9_BOX_HOST.
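One way to get a safe scratch copy for the "run DB logic against a copy" rule is SQLite's online backup API, which produces a consistent snapshot even when the source is in WAL mode. A minimal sketch, assuming you choose the destination path yourself; `snapshot_db` is a hypothetical helper, not something in the repo.

```python
import sqlite3


def snapshot_db(src_path, dest_path):
    """Copy a SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)      # consistent snapshot of the whole database
    finally:
        dest.close()
        src.close()
    return dest_path
```

After that, point the change under test at the copy (for example through CRM_DEV_DB_PATH, if that is how your dev run selects its database) and leave data/crm.db untouched.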
Never
- Never treat Qdrant (or any derived index) as source of truth — the CRM/SQLite is canonical and rebuildable-from.
- Never hard-delete CRM records or thesis history — soft-delete/archive only.
- Never let an agent send email, post, or contact an LP autonomously — agents draft; a human approves and sends.
- Never set a thesis_version canonical from code/seeds; that is human dual sign-off.
- Never call a Spark directly; go through Spark Control (SPARK_CONTROL_URL).
- Never commit secrets, data/crm.db, .env, or data/backups/ (all gitignored). Scan staged files before committing. (.claude/ is tracked: launch.json and rules/ symlinks ship with the repo; keep local-only settings in .claude/settings.local.json.)
- Never bulk-export the LP list to any third party; send only minimal non-sensitive context to Claude.
- Never assume FastAPI / SQLAlchemy / pytest are in play; they sit in requirements.txt unused, and the runtime is stdlib + SQLite.
- Never add a Co-Authored-By / "Generated with" trailer to commits or PRs; commits are the user's.
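The soft-delete rule (archive by setting deleted_at, then filter it on every read, including aggregate sub-selects) is easy to get half right. The sketch below shows the shape on a deliberately tiny, hypothetical two-table schema; the table and column names other than deleted_at are assumptions and do not mirror the real CRM tables.

```python
import sqlite3

# Hypothetical miniature schema, for illustration only.
SCHEMA = """
CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT, deleted_at TEXT);
CREATE TABLE contacts (
    id INTEGER PRIMARY KEY,
    organization_id INTEGER REFERENCES organizations(id),
    name TEXT,
    deleted_at TEXT
);
"""

SOFT_DELETABLE = {"organizations", "contacts"}


def soft_delete(conn, table, row_id):
    """Mark a row deleted; the row itself is never removed."""
    if table not in SOFT_DELETABLE:   # allow-list, since table names cannot be bound
        raise ValueError(f"not soft-deletable: {table}")
    conn.execute(
        f"UPDATE {table} SET deleted_at = CURRENT_TIMESTAMP "
        "WHERE id = ? AND deleted_at IS NULL",
        (row_id,),
    )


def list_organizations(conn):
    """List live organizations with a count of live contacts only."""
    return conn.execute(
        """
        SELECT o.id, o.name,
               (SELECT COUNT(*) FROM contacts c
                 WHERE c.organization_id = o.id
                   AND c.deleted_at IS NULL) AS contact_count  -- filter inside the aggregate
          FROM organizations o
         WHERE o.deleted_at IS NULL                             -- and on the outer list
         ORDER BY o.id
        """
    ).fetchall()
```

The filter appears twice on purpose: dropping it from the sub-select is the kind of aggregate leak the audits mentioned under Conventions found.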
Deeper docs
- Full constitution + guardrails: docs/ten31-constitution.md
- Architecture & rationale: docs/Ten31_Agentic_Build_Plan.md
- Retrieval/embeddings contract: docs/EMBEDDINGS.md
- CRM schema/API tour: docs/crm-overview.md
- Current thesis handoff: docs/thesis-handoff.md
- Operations & runbooks: docs/OPERATIONS.md, docs/go-live-runbook.md, docs/gmail-enablement-runbook.md
Current state
Phase 0 substrate + Phase 1 thesis/outreach built; box and repo at v0.1.0:86 (deployed & verified live 2026-06-17). The fundraising grid + email capture is the canonical system of record (decision 2026-06-16) — vestigial classic-CRM surfaces get pruned/repurposed. Longer-term backlog: ROADMAP.md.
- Matrix intake bot (backend/matrix_intake/): DEPLOYED & LIVE 2026-06-17. A separate-process bot (its matrix-nio dep isolated from the stdlib CRM) that turns a typed Matrix-room message into a proposed fundraising-grid add/edit, written only after in-thread human approval (yes / edit field=value / no). Parse = local Qwen via Spark Control (no Claude/scrub, like the digest); writes reuse the CRM's own POST /api/fundraising/log-communication tagged source="matrix_intake"; new-vs-existing via read-only GET /api/intake/match (returns the grid row id → no duplicate). Runs on the Spark (modelo32, nohup+venv; pid /tmp/intake-bot.pid, log /tmp/intake-bot.log), not a systemd service yet (won't survive a reboot). Live-smoked end-to-end (new-investor create + existing-investor note matched & appended, no dup). Server side shipped to the box as v0.1.0:84 (/api/intake/match + source provenance; these were missing on v83, so the bot 404'd until v84); then UX adds: main-timeline nudge pointer, top-level-yes→thread redirect, clearer commit wording, note text in the grid line (v85 dropped the [note] tag). M3 (business-card photo) deferred (no Spark vision model). Guide: docs/guides/matrix-intake.md.
- Matrix intake, fuzzy-match + conversational-edit pass: DEPLOYED & LIVE 2026-06-17 (box on v0.1.0:86, bot restarted on the Spark; candidates endpoint verified live); Matrix live-smoke still pending. Closes the two locked post-deploy enhancements (ROADMAP). (a) Fuzzy matching (server-side, ships in the s9pk): find_intake_candidates in server.py (deterministic: stdlib difflib name similarity + token-set Jaccard, legal-suffix-aware via _strip_legal_suffix, + email Levenshtein ≤ 2; ranked, ≥0.62, top 5); GET /api/intake/match now returns {match, candidates}. The bot surfaces a numbered shortlist (_stage="disambiguate") so a near-duplicate ("Charlie"/"Charles", "Acme Capital"/"Acme Capital LLC", a one-char email typo) is confirmed by a human instead of silently creating a second investor; it is never auto-attached. The optional LLM-judge re-rank was deferred (deterministic filter already surfaces the cases; LLM is the right shortlist pruner if noise proves real). (b) Conversational edits (bot-side, ships on the Spark): any in-thread reply that isn't yes / no / edit field=value → parse.revise re-runs {proposal + instruction} through local Qwen and re-renders the card; email integrity preserved (a changed address must literally appear in the instruction; the model's email field is never trusted); no-op revisions re-prompt (same_fields). Deploy is split: the candidates need an s9pk build+install (v86); the bot's disambiguation+revise need a Spark git pull + restart. A bot restart alone won't deliver candidates (box returns [], bot safely proposes new). Tests green; needs a Matrix live-smoke (grammar + Qwen revise leg). Guide updated.
- Working (all draft-only): CRM + ingest (chunk→embed→Qdrant + retrieval) + redaction boundary; Gmail capture (DWD) + email-activity propose→approve; Thesis Workshop + Architect (Claude) with dual-approval gate; Outreach Draft Assistant + follow-up radar + per-user voice + Tier-B in-thread Gmail draft creation.
- Deployed & verified live: v0.1.0:83 (box $START9_BOX_HOST / immense-voyage.local; installed-version → 0.1.0:83, migration chain …82→83 clean, server up on :8080, Gmail + ingest + digest schedulers all started; render-smoke gated the build): email search/query + windowed digest preview (code-only, migrations no-op). Communications tab (CommunicationsPage + email_integration/db.query_email_activity): fixed the investor dropdown. The facet now mirrors the list with the digest's precedence (grid → org → contact → address) and typed keys (fund: / org: / contact:), so email matched only to a classic contact or org domain (no grid id, which is the common case since fundraising_contacts.email is sparsely populated) now resolves to a real name and is selectable, instead of the dropdown being empty; added a date-range filter (since/until), and a click-to-expand full-body view (GET /api/email/detail?id= → query_email_detail, admin, soft-delete-gated, renders body_text escaped, never raw HTML). New semantic content search: a "Search content" toggle → GET /api/email/search?q= (routes._h_search) wrapping ingest/search.py:hybrid_search filtered to doc_type='email' (lazy import; 503 if Spark/Qdrant unreachable), hydrated + soft-delete-filtered against SQLite (db.search_hit_emails; never trust the derived index). Daily Digest: Settings → Admin now builds a digest over a chosen window (last 24h or since a date) as an in-app preview before sending (POST /api/admin/digest/preview); manual send uses the same window (send-now + digest_scheduler.send_digest_window); window resolved by digest_builder.resolve_digest_window (cap 92d). Both run the real local-Spark summarizer and never touch the daily cursor. Verified: 22/22 backend tests, py_compile clean, render-smoke pass. Grant validated both live on the box 2026-06-16: the digest windowed preview renders real Spark narratives over real activity, and the Communications dropdown / date filter / full-body view / content-search all work. Detail: docs/guides/email.md.
- Deployed & verified live: v0.1.0:82 (box $START9_BOX_HOST / immense-voyage.local; installed-version → 0.1.0:82, migration chain …81→82 clean, server up on :8080, schedulers + Gmail integration up). v82 vendored React 18.3.1 / ReactDOM 18.3.1 / @babel/standalone 7.29.7 into frontend/assets/vendor/, served same-origin with sha384 SRI (no CDN, no outbound-internet dependency to render the UI), and added start9/0.4/render-smoke.mjs, a jsdom check (shipped-Babel transform asserts classic/non-module + parseable; real mount asserts the login UI renders) wired into the default make goal (verified-build), so every build is gated on the frontend actually rendering. Closes the v78 (blank screen) + v79 (Babel-8 ESM-import) class structurally. Detail: docs/guides/packaging.md. Prior shipped & live: v81 Communications-tab matched-only (query_email_activity gates on EXISTS(email_investor_links); unmatched email captured but never shown; docs/guides/email.md); v80 admin-only email-activity panel (GET /api/email/activity); v78 retired lp_profiles/LP Tracker + repointed Dashboard "Total Committed" onto the grid (graveyard-excluded). Digest fully live: capture (DWD) → propose→approve; Gmail-DWD→SMTP transport; daily Phase-B digest (digest_builder.py + always-on digest_scheduler.py reading a DB policy + send-now); daily auto-send is now ENABLED (Grant turned it on in Settings → Admin, 2026-06-16). Detail: docs/guides/email.md.
- Live since v74 (2026-06-13): login works; /assets/ traversal 404s (plain + URL-encoded), root health 200. On boot, ensure_thesis_v2_promoted makes the v2.0 reserve-asset spine the working approved spine (node-level, reversible). Security/privacy hardening (path-traversal close, outreach NER backstop, get-by-id soft-delete) shipped in v74; detail in EVALUATION.md.
- Tests (2026-06-17): 27/27 backend tests green via python3 backend/run_tests.py, py_compile clean. (+4 last session for the Matrix intake bot: matrix_intake/test_parse.py, test_proposals.py, test_crm_client.py, and test_intake_endpoints.py; the last boots the real server against a temp DB and covers /api/intake/match, the create→match no-duplicate contract, and source="matrix_intake" provenance.) This session (v86 fuzzy + conversational pass) added cases to those same files. test_intake_endpoints.py: fuzzy candidates (near-spelling, legal-suffix-at-1.0, one-char email typo, exact→no-candidates, nothing-close→empty); test_proposals.py: the disambiguation grammar + attach_to_candidate/promote_to_new/same_fields; test_parse.py: revise merge + email-integrity-from-instruction + match-id preservation; test_crm_client.py: the {match, candidates} shape + no-query-skips-network. test_email_activity_panel.py now covers the typed facet + org/contact resolution (the dropdown fix), the date-range filter, the detail view (full body / recipients / attachments / soft-delete), and the content-search route (hydrate / drop-tombstoned / 503 / admin) with retrieval stubbed; test_digest_builder.py adds the window resolver + send_digest_window (no-cursor-touch) cases. Frontend render smoke check (cd start9/0.4 && make render-smoke) still gates the default make build. The 2 stale thesis tests stay fixed (seed structure in docs/guides/thesis.md).
- Decided, not yet built (detail in ROADMAP.md): Pipeline adoption + a grid flag that auto-loads flagged investors as opportunities; NL→safe-query feature (search item 3, the larger, separate build); CRM as canonical thesis backbone with the signal-engine reading from it (reconciliation unwired); reply-all for Tier-B drafts (currently reply to the LP only). (Done this session, v83: email search item 1 [activity query/panel gaps: typed facet fix + date range + full-body view] and item 2 [semantic content search] both shipped; daily-digest windowed preview→send.)
- Known debt (P2, not deploy-blocking): the reports-subsystem soft-delete sweep. handle_pipeline_report + remaining report/aggregate queries over opportunities/communications still count soft-deleted rows (v78 shrank this surface: the lp_profiles/lp-breakdown aggregates are gone and the dashboard "Total Committed" is now grid-sourced); this needs a pass + report-endpoint tests. Also ?limit=abc crashes the request thread (authenticated list path); scrub-gateway TLS verify off; cryptography==42.0.5; stale user-visible start9/0.4/assets/ABOUT.md; hardcoded Spark/Qdrant IPs in the s9pk; StartOS package icon oversized/zoomed (research the Start9 icon spec, source a base ten31 logo, produce a correctly sized icon before the next s9pk upload); the 5.4k-line server.py monolith. P3 batch + full list in EVALUATION.md. (Resolved v82: front-end CDN/SRI risk, with libs vendored + SRI-pinned, and the render smoke check is now scripted into the build.)
- Doc drift to reconcile: crm-overview.md + EVALUATION.md still describe lp_profiles as a live model in places; a doc-auditor pass should align them to "grid canonical, lp_profiles retired."
- Other gaps: the v2.0 spine is the working spine but not a canonical thesis_version (needs Grant + Jonathan dual sign-off); Appendix-A conviction/exposure (incl. ~40% Strike) stay Grant's working read, not canonical, not fed to the engine. Live infra now exercised on the box (Gmail capture + schedulers up; local-Spark summarization confirmed via the digest preview; Qdrant via Communications content-search); Claude/Architect path still unverified live on the box.
- Next: 1) Pipeline adoption: grid flag → auto-create/sync an opportunities row so flagged investors load into the Pipeline board (the agreed next major build; design the grid↔pipeline link first; see ROADMAP "Adopt the Pipeline"); 2) make the intake bot a managed service (systemd / restart-on-boot; still a nohup process, pid /tmp/intake-bot.pid on modelo32; a Spark reboot silently kills intake); 3) Matrix live-smoke the v86 intake pass (deployed 2026-06-17: box on v86, bot restarted; only the human-in-the-room smoke of the shortlist grammar + Qwen revise leg remains); 4) reports-subsystem soft-delete sweep + report-endpoint tests; 5) ?limit=abc crash; 6) auth regression test for the 3 v79-gated GET endpoints (/api/users, /api/email/status, /api/email/accounts); 7) NL→safe-query (search item 3; separate, larger); 8) Grant + Jonathan freeze v2.0 canonical; 9) reply-all for Tier-B drafts.
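The deterministic fuzzy matching described in the Matrix intake entry above (difflib name similarity, token-set Jaccard, legal-suffix stripping, email Levenshtein ≤ 2, a 0.62 floor, top 5) can be sketched as follows. This is not the code of find_intake_candidates: how the real function combines the two name scores, which suffixes it strips, how it weights an email hit, and how it separates exact matches from candidates are not stated here, so taking the max of the name scores, the suffix set, and the fixed 0.9 email score are all assumptions, and every function name below is hypothetical.

```python
from difflib import SequenceMatcher

LEGAL_SUFFIXES = {"llc", "inc", "lp", "ltd", "corp"}   # assumed suffix list


def strip_legal_suffix(name):
    """Lowercase, drop simple punctuation, and remove trailing legal suffixes."""
    tokens = name.lower().replace(",", " ").replace(".", " ").split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def name_score(a, b):
    """Best of character-level similarity and token-set Jaccard (assumed combination)."""
    a, b = strip_legal_suffix(a), strip_legal_suffix(b)
    if not a or not b:
        return 0.0
    ratio = SequenceMatcher(None, a, b).ratio()
    ta, tb = set(a.split()), set(b.split())
    jaccard = len(ta & tb) / len(ta | tb)
    return max(ratio, jaccard)


def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def find_candidates(name, email, rows, threshold=0.62, limit=5):
    """Return up to `limit` (score, id) pairs at or above the threshold, best first."""
    scored = []
    for row in rows:
        score = name_score(name, row["name"]) if name else 0.0
        if email and row.get("email"):
            if levenshtein(email.lower(), row["email"].lower()) <= 2:
                score = max(score, 0.9)   # assumed weight for a near-identical email
        if score >= threshold:
            scored.append((score, row["id"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:limit]
```

Whatever the exact scoring, the important property is the one stated above: a candidate is only ever shown to a human as a shortlist entry, never attached automatically.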