Ten31 Venture CRM + Agentic System — AGENTS.md

The foundation is a self-hosted venture-fund CRM — a purpose-built fundraising tool that replaced Airtable to (1) keep sensitive LP/prospect data off third-party servers, (2) drop subscription cost, and (3) fit the fund's workflow: managing ~150 existing LPs, tracking 250+ prospects, and running the capital-raise pipeline. Core CRM domain: contacts (investor/prospect/advisor), organizations, opportunities (the deal pipeline), and communications; investor commitments live in the canonical fundraising_* grid (the legacy single-fund lp_profiles table was retired in v0.1.0:78). The fund (Ten31, ~$200M AUM, bitcoin/energy/AI thesis) runs it on a Start9 box, accessed on the LAN or over Tailscale by a team of ~5. Schema/API tour: docs/crm-overview.md.

The agentic system is new functionality built on top of that CRM — an in-house AI layer to widen the fundraising funnel, sharpen the thesis, and automate outreach drafting. Frontier reasoning runs on Claude (Agent SDK/API); privacy-sensitive and bulk work runs on local DGX Spark models via the Spark Control gateway. Phase 0/1 — no live outward-facing agents; agents draft, humans send.

Inbox check: At session start, if ~/Projects/standards/INBOX.md exists, scan it for items tagged (CRM) and surface them before proposing next steps; triage with /triage.

Stack (versions that matter)

  • Python 3.11, standard library only at runtime. The CRM is one monolith, backend/server.py (~5.4k lines): a stdlib http.server.ThreadingHTTPServer + hand-written CRMHandler with manual path dispatch (do_GET/do_POST). Not FastAPI. backend/requirements.txt lists FastAPI/SQLAlchemy/Alembic/Pydantic/pytest-style deps but none are imported at runtime (vestigial).
  • SQLite at data/crm.db (WAL, foreign_keys=ON), opened per-request via get_db(). Schema via ordered migrations.
  • Frontend: single frontend/index.html, inline-Babel React. No build step.
  • Optional runtime deps, used only if present: bcrypt, PyJWT (jwt), cryptography (Gmail module).
  • MCP + ingest (in the Docker image, not the bare CRM): mcp==1.2.0 (FastMCP, backend/mcp/server.py), fastembed==0.4.2, anthropic, cryptography==42.0.5.
  • Packaging: StartOS 0.4, TypeScript SDK (@start9labs/start-sdk) under start9/0.4/startos/. Live target is start9/0.4/.
  • Local models (bge-m3 embeddings, bge-reranker-v2-m3, /api/search, Qdrant): always via Spark Control. Contract: docs/EMBEDDINGS.md.
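
For orientation, the stack described above (stdlib ThreadingHTTPServer, manual path dispatch, SQLite opened per request with WAL and foreign keys on) can be sketched roughly as follows. This is a minimal illustration, not the code in backend/server.py: SketchHandler, the /api/health route, and make_server are hypothetical names, and the real get_db() may differ.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlparse

DB_PATH = "data/crm.db"  # assumption: mirrors the path named in the Stack list


def get_db(path=DB_PATH):
    """Open a fresh connection for one request, with WAL and foreign keys on."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA foreign_keys=ON")
    return conn


class SketchHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for CRMHandler; the route below is illustrative."""

    def do_GET(self):
        # Manual dispatch: compare the parsed path, no framework router.
        path = urlparse(self.path).path
        if path == "/api/health":
            conn = get_db()
            try:
                db_ok = conn.execute("SELECT 1").fetchone()[0] == 1
            finally:
                conn.close()  # one connection per request, always closed
            self._send_json(200, {"db_ok": db_ok})
        else:
            self._send_json(404, {"error": "not found"})

    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="127.0.0.1", port=8080):
    """Build (but do not start) a threaded server; call serve_forever() to run."""
    return ThreadingHTTPServer((host, port), SketchHandler)
```

Opening the connection inside the handler keeps each request thread on its own SQLite connection, which is the usual reason for a per-request get_db() under a threaded server.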

Commands

# Run locally (dev, port 8080; or ./start.sh <port>) — runs python3 backend/server.py
./start.sh
# Run prod-mode (Tailscale/beta) — requires CRM_SECRET_KEY
./start_beta.sh
# Sanity-check edits (there is no compiler/build for the CRM)
python3 -m py_compile backend/server.py
# Run ONE test (tests are standalone scripts with `if __name__ == "__main__"`; no pytest installed)
python3 backend/redaction/test_scrub_leak.py        # substitute any backend/**/test_*.py
# Run all tests (aggregate runner — runs each backend/**/test_*.py in its own subprocess)
python3 backend/run_tests.py                         # add substrings to filter, e.g. `... soft_delete redaction`
# Build + install the s9pk — BUMP THE VERSION FIRST. See docs/guides/packaging.md.
cd start9/0.4 && make
  • Migrations apply automatically at startup (backend/core_migrations.py, schema_migrations ledger). See docs/guides/migrations.md before adding one.
  • Lint: none configured.
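
As a rough picture of what "migrations apply automatically at startup" with a schema_migrations ledger tends to mean, here is a hedged sketch. It is not backend/core_migrations.py: the function name apply_migrations and the ledger columns (version, applied_at) are assumptions, and docs/guides/migrations.md remains the authority.

```python
import sqlite3
from pathlib import Path


def apply_migrations(conn, migrations_dir):
    """Apply NNNN_*.sql files in order, recording each in a ledger table."""
    # Assumed ledger shape; the real schema_migrations table may differ.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "version TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    newly_applied = []
    pattern = "[0-9][0-9][0-9][0-9]_*.sql"
    for path in sorted(Path(migrations_dir).glob(pattern)):
        if path.name.endswith(".down.sql"):
            continue  # paired rollback scripts are never run at startup
        version = path.name.split("_", 1)[0]
        if version in applied:
            continue  # already in the ledger, so startup stays idempotent
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied
```

Running the function twice against the same database applies nothing the second time, which is the property that makes it safe to call on every boot.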

Directory layout (day-one)

  • backend/server.py — the CRM monolith: HTTP handler, route dispatch, init_db(), auth (username/password → HS256 JWT, roles admin/member).
  • backend/core_migrations.py + backend/migrations/NNNN_*.sql (+ paired .down.sql) — additive schema migrations, applied at startup.
  • backend/thesis_seed.py — Thesis Workshop seed + idempotent ensure_* one-time seeders, wired in server.init_db().
  • backend/thesis_review.py — thesis version review/approval (human dual sign-off → canonical).
  • backend/mcp/architect_agent.py (Claude thesis copilot), architect_tools.py, outreach_agent.py (LP draft assistant), architect_grounding.py, crm_tools.py, server.py (FastMCP).
  • backend/email_integration/ — Gmail capture via domain-wide delegation + Tier-B draft creation (compose.py).
  • backend/redaction/scrub.py + client.py: the scrub→Claude→re-hydrate privacy boundary.
  • backend/ingest/ — chunk→embed→Qdrant + retrieval modes.
  • backend/entity_*.py — entity resolution/merge (the two-investor-model reconciliation).
  • frontend/index.html — the entire UI.
  • docs/ — architecture, phase plans, contracts, runbooks (see Deeper docs). docs/guides/ — scoped subsystem rules (see below).
  • start9/0.4/ — StartOS package (startos/utils.ts holds PACKAGE_VERSION).
  • data/crm.db — the live DB (gitignored). .env / .env.example — config (.env gitignored).
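
The scrub→Claude→re-hydrate boundary listed above can be pictured with a toy round trip. This is only a sketch under stated assumptions: scrub and rehydrate are hypothetical function names, the [[ENTITY_n]] placeholder format is invented for illustration, and the real backend/redaction/scrub.py likely does more (the outreach path is described elsewhere in this file as having an NER backstop). The names used are synthetic.

```python
import re


def scrub(text, sensitive_terms):
    """Replace each sensitive term with a placeholder; return text and mapping."""
    mapping = {}
    # Longest terms first so a longer name is replaced before any substring of it.
    ordered = sorted(set(sensitive_terms), key=lambda t: (-len(t), t))
    for i, term in enumerate(ordered, start=1):
        token = f"[[ENTITY_{i}]]"
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(text):
            text = pattern.sub(token, text)
            mapping[token] = term
    return text, mapping


def rehydrate(text, mapping):
    """Swap placeholders back to the original terms after the model responds."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


if __name__ == "__main__":
    original = "Follow up with Jane Example at Example Capital."
    scrubbed, mapping = scrub(original, ["Jane Example", "Example Capital"])
    # Only `scrubbed` would cross the boundary; `mapping` stays local.
    print(scrubbed)
    print(rehydrate(scrubbed, mapping))
```

The key design point is that the mapping never leaves the local process, so the remote model only ever sees placeholders.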

Scoped guides

Subsystem rules live in docs/guides/ and lazy-load in Claude Code via .claude/rules/ symlinks (scoped by paths: frontmatter). Read the guide before editing that area:

  • Migrations or seeders (backend/migrations/, core_migrations.py, thesis_seed.py) → docs/guides/migrations.md
  • Thesis logic (backend/thesis_*.py, backend/mcp/architect_*.py) → docs/guides/thesis.md
  • Redaction or any MCP/Claude path (backend/redaction/, backend/mcp/) → docs/guides/redaction.md
  • Ingest / retrieval (backend/ingest/) → docs/guides/spark-ingest.md
  • Email capture / drafts + digest send (backend/email_integration/, backend/digest_mailer.py, backend/smtp_send.py) → docs/guides/email.md
  • Building or deploying the s9pk (start9/) → docs/guides/packaging.md

Conventions

  • Investor model — the grid is canonical (since v0.1.0:78). The fundraising_* grid is the system of record: an investor entity (row) → many contact "pills" → per-fund commitments. The classic contacts table is a read-only per-person directory, auto-populated from the grid — create/edit people in the grid, not the Contacts page. Email capture rolls multiple people up to one investor. The legacy single-fund lp_profiles model is retired (empty table kept, per never-hard-delete). Reconciling grid ↔ classic contacts to canonical IDs is the core entity-resolution task — see docs/crm-overview.md.
  • Soft-delete only: deleted_at and/or status='retired'; never hard-delete. Every READ path must filter deleted_at IS NULL — list handlers, get-by-id, nested related-data sub-selects, and aggregate sub-selects (COUNT/SUM/MAX). Audits found leaks in all of these (2026-06-12 detail + nested; 2026-06-13 list-view contact_count/total_funded/comm_count); the reports subsystem aggregates still leak (see Current state). Regression-guarded by backend/test_soft_delete_reads.py. (Thesis has a subtlety here — see the thesis guide.)
  • Env: secrets in .env (gitignored); names in .env.example. Verified names: ANTHROPIC_API_KEY, SPARK_CONTROL_URL, SPARK_CONTROL_VERIFY_TLS, QDRANT_URL, X_API_KEY, CRM_DB_PATH, CRM_DEV_DB_PATH. Also used: CRM_SECRET_KEY (beta/prod), CRM_HOST/CRM_PORT, CRM_DATA_DIR; digest mailer: CRM_DIGEST_SENDER (DWD impersonation sender) + SMTP_HOST/SMTP_PORT/SMTP_SECURITY/SMTP_FROM/SMTP_USERNAME/SMTP_PASSWORD (SMTP fallback); daily digest (Phase B): CRM_DIGEST_ENABLED + CRM_DIGEST_SEND_HOUR only seed the first-boot default — the live control is the DB policy (app_settings.digest_policy, set in Settings → Admin).
  • Config placement: operational/feature toggles live in the admin panel, DB-backed via app_settings (read-merge through a load_*_policy(conn) helper shared by the API + any scheduler; precedence DB-row → env-seed → default), so they're discoverable and take effect live. Reserve StartOS actions / env for secrets and deploy-time config (SMTP creds, API keys, DWD sender). Precedent: digest_policy (GET/PATCH /api/admin/digest/policy), fundraising_backup_policy.
  • Commit style: imperative subject, concise body explaining the why; put the package version in the subject (… (v0.1.0:NN)) for shippable changes. No AI co-author / attribution trailers — commits are authored by the user.
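
The config precedence described above (DB row, then env seed, then default) might look like the following in a load_*_policy(conn) helper. Treat it as a sketch: the app_settings layout (key, value as JSON text), the default values, and the truthy-string parsing are assumptions, not the shipped helper.

```python
import json
import os
import sqlite3

# Assumption: illustrative defaults, not the values the CRM ships with.
DEFAULT_POLICY = {"enabled": False, "send_hour": 8}


def load_digest_policy(conn, env=os.environ):
    """Merge default <- env seed <- DB row, so the DB row wins when present."""
    policy = dict(DEFAULT_POLICY)

    # Env values only seed the policy; they never override a stored DB row.
    if "CRM_DIGEST_ENABLED" in env:
        policy["enabled"] = env["CRM_DIGEST_ENABLED"].strip().lower() in (
            "1", "true", "yes", "on",
        )
    if "CRM_DIGEST_SEND_HOUR" in env:
        try:
            policy["send_hour"] = int(env["CRM_DIGEST_SEND_HOUR"])
        except ValueError:
            pass  # keep the default if the seed is malformed

    # Assumed schema: app_settings(key TEXT PRIMARY KEY, value TEXT holding JSON).
    row = conn.execute(
        "SELECT value FROM app_settings WHERE key = ?", ("digest_policy",)
    ).fetchone()
    if row:
        policy.update(json.loads(row[0]))
    return policy
```

Because both the API and any scheduler would call the same helper on each read, a change made in the admin panel takes effect without a restart.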

Always

  • Verify before shipping: python3 -m py_compile the edited files; for DB logic, run the change against a copy of data/crm.db, never production.
  • Keep real LP data out of Claude: develop only on code/schema/synthetic-or-locally-redacted data; route any real record substance through backend/redaction first.
  • Get explicit user authorization before any production deploy/install to $START9_BOX_HOST.
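
One way to get the "copy of data/crm.db" mentioned above is SQLite's online backup API, which is exposed in the Python standard library and produces a consistent copy even while the source is in WAL mode. A minimal sketch, where snapshot_db and the example paths are hypothetical:

```python
import sqlite3


def snapshot_db(src_path, dest_path):
    """Copy a SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # backup() only reads from the source connection and writes to dest.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
    return dest_path


if __name__ == "__main__":
    # Hypothetical paths: run experimental DB logic against the copy only.
    snapshot_db("data/crm.db", "/tmp/crm-scratch.db")
```

A plain file copy can miss pages still sitting in the -wal file, which is why the backup API is the safer choice for a WAL database.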

Never

  • Never treat Qdrant (or any derived index) as source of truth — the CRM/SQLite is canonical and rebuildable-from.
  • Never hard-delete CRM records or thesis history — soft-delete/archive only.
  • Never let an agent send email, post, or contact an LP autonomously — agents draft; a human approves and sends.
  • Never set a thesis_version canonical from code/seeds — that is human dual sign-off.
  • Never call a Spark directly — go through Spark Control (SPARK_CONTROL_URL).
  • Never commit secrets, data/crm.db, .env, or data/backups/ (all gitignored). Scan staged files before committing. (.claude/ is tracked — launch.json and rules/ symlinks ship with the repo; keep local-only settings in .claude/settings.local.json.)
  • Never bulk-export the LP list to any third party; send only minimal non-sensitive context to Claude.
  • Never assume FastAPI / SQLAlchemy / pytest are in play — they sit in requirements.txt unused; runtime is stdlib + SQLite.
  • Never add a Co-Authored-By / "Generated with" trailer to commits or PRs — commits are the user's.
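
To make the never-hard-delete rule concrete, the sketch below pairs a soft delete with a list query whose aggregate sub-select also filters deleted_at IS NULL, since that is the class of leak the audits found. The table and column names (organizations, contacts.organization_id) and both function names are assumptions for illustration; the real schema is in docs/crm-overview.md.

```python
import sqlite3


def soft_delete_contact(conn, contact_id):
    """Mark a contact deleted; the row itself is never removed."""
    conn.execute(
        "UPDATE contacts SET deleted_at = CURRENT_TIMESTAMP "
        "WHERE id = ? AND deleted_at IS NULL",
        (contact_id,),
    )
    conn.commit()


def list_organizations(conn):
    """List live organizations with a contact_count that ignores deleted rows."""
    return conn.execute(
        """
        SELECT o.id, o.name,
               (SELECT COUNT(*)
                  FROM contacts c
                 WHERE c.organization_id = o.id
                   AND c.deleted_at IS NULL) AS contact_count  -- filter the sub-select too
          FROM organizations o
         WHERE o.deleted_at IS NULL                             -- and the outer list
         ORDER BY o.id
        """
    ).fetchall()
```

Forgetting the filter inside the sub-select leaves the list looking correct while the count silently includes deleted rows, so a regression test should assert on the aggregate value, not just on which rows appear.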

Deeper docs

  • Full constitution + guardrails: docs/ten31-constitution.md
  • Architecture & rationale: docs/Ten31_Agentic_Build_Plan.md
  • Retrieval/embeddings contract: docs/EMBEDDINGS.md
  • CRM schema/API tour: docs/crm-overview.md
  • Current thesis handoff: docs/thesis-handoff.md
  • Operations & runbooks: docs/OPERATIONS.md, docs/go-live-runbook.md, docs/gmail-enablement-runbook.md

Current state

Phase 0 substrate + Phase 1 thesis/outreach are built; box and repo at v0.1.0:81 (latest: Communications tab is matched-only — the email-activity panel now surfaces only email linked to a known investor/contact; unmatched cold/unknown-sender email is captured but never shown; prior v80: repurposed the tab into the admin-only captured-Gmail search over the email_* tables). Decision (2026-06-16): the fundraising grid + email capture is the canonical system of record — vestigial classic-CRM surfaces get pruned or repurposed (see ROADMAP.md → "Consolidate on the fundraising grid as canonical"). Longer-term backlog: ROADMAP.md.

  • Working (all draft-only): CRM + ingest (chunk→embed→Qdrant + retrieval) + redaction boundary; Gmail capture (DWD) + email-activity propose→approve; Thesis Workshop + Architect (Claude) with dual-approval gate; Outreach Draft Assistant + follow-up radar + per-user voice + Tier-B in-thread Gmail draft creation.
  • Deployed & verified live: v0.1.0:81 (box $START9_BOX_HOST/immense-voyage.local; installed-version 0.1.0:81, migration chain …80→81 clean, server up on :8080, schedulers + Gmail integration up). v0.1.0:81 makes the Communications tab matched-only: query_email_activity now gates on EXISTS(email_investor_links), so the panel surfaces only email linked to a known investor/contact. Unmatched cold/unknown-sender email is still captured (metadata-only) and will appear automatically if its sender is later added as an investor; this is a read-side filter with no schema/capture change. Graveyard investors are unaffected (their email has a link): still hidden from the picker but visible/searchable as an audit surface. Backend-only (frontend index.html byte-identical to v80, which was render-verified).
    Prior, v0.1.0:80: repurposed the Communications tab into the admin-only email-activity panel. New GET /api/email/activity (admin-enforced server-side) over the email_* tables, filterable by investor / mailbox / direction + free-text search; soft-delete honored on the per-mailbox sighting; direction decided at the email level (mirrors digest_builder); graveyard investors hidden from the picker but their email stays visible + searchable (audit surface). The classic manual "Log Communication" form was retired (the grid context menu remains the manual-log path); nav item + page are admin-only. Query lives in email_integration/db.py:query_email_activity; tests in email_integration/test_email_activity_panel.py.
    Prior, v0.1.0:79: a P0 hotfix. The page loaded @babel/standalone from unpkg unpinned, so the CDN served Babel 8.0.0, whose @babel/preset-react automatic JSX runtime prepends an ESM import {jsx} from "react/jsx-runtime". That import is illegal in this classic (non-module) inline <script>, so the browser rejected the whole bundle and React never mounted → blank screen for every user. Fix: pin @babel/standalone@7.29.7 (classic runtime; verified via headless render locally + on the box). The same release closed 3 server-side admin gaps from a permissions audit: GET /api/users, /api/email/status, /api/email/accounts were UI-hidden from members but not API-enforced; all now require_admin (write endpoints were already gated).
    Prior, v0.1.0:78: retired lp_profiles + the orphaned LP Tracker (endpoints/handlers/lp-breakdown report/contact-dossier LP section/frontend component+redirect removed; empty table left in place per never-hard-delete) and repointed the Dashboard "Total Committed" onto fundraising_investors.total_invested (graveyard-excluded; "Total Funded" dropped because the grid has no funded concept).
    Digest is fully live: capture (DWD) → propose→approve; transport routes Gmail-DWD→SMTP (no app password); and the daily activity digest (Phase B): digest_builder.py (by-team-member Spark narrative + by-investor section, soft-delete filtered) + always-on digest_scheduler.py reading a DB policy + send-now. Auto-send defaults OFF (env seed unset → app_settings.digest_policy off) until Grant enables it in Settings → Admin. Detail: docs/guides/email.md.
  • Live since v74 (2026-06-13): login works; /assets/ traversal 404s (plain + URL-encoded), root health 200. On boot, ensure_thesis_v2_promoted makes the v2.0 reserve-asset spine the working approved spine (node-level, reversible). Security/privacy hardening (path-traversal close, outreach NER backstop, get-by-id soft-delete) shipped in v74 — detail in EVALUATION.md.
  • Tests (2026-06-16): 22/22 backend tests green via python3 backend/run_tests.py (email_integration/test_email_activity_panel.py updated for v81: matched-only scope — unmatched email never surfaces, not even by free-text search — plus investor/mailbox/search/direction filters, per-sighting soft-delete, email-level direction, mailbox + investor roll-ups, graveyard hidden-from-picker-but-visible, facets, route 401/403 admin enforcement; prior: test_dashboard_report.py, test_digest_builder.py). py_compile clean. Frontend render checked locally (jsdom mount + pinned-Babel transform). The 2 stale thesis tests stay fixed (seed structure in docs/guides/thesis.md).
  • Decided, not yet built (detail in ROADMAP.md): Pipeline adoption + a grid flag that auto-loads flagged investors as opportunities; NL→safe-query feature; CRM as canonical thesis backbone with the signal-engine reading from it (reconciliation unwired); reply-all for Tier-B drafts (currently reply to the LP only). (Done v80: the admin-only per-investor/per-mailbox email-activity panel; v81: made that panel matched-only.)
  • Known debt (P2, not deploy-blocking): reports-subsystem soft-delete sweep, where handle_pipeline_report + remaining report/aggregate queries over opportunities/communications still count soft-deleted rows (v78 shrank this surface: the lp_profiles/lp-breakdown aggregates are gone and the dashboard "Total Committed" is now grid-sourced); needs a pass + report-endpoint tests. Also ?limit=abc crashes the request thread (authenticated list path); scrub-gateway TLS verify off; cryptography==42.0.5; front-end CDN libs still loaded from unpkg without SRI (Babel is now version-pinned as of v79, after an unpinned auto-upgrade to Babel 8 blanked the whole UI, but React/Babel should be vendored into the package + SRI-pinned so a CDN can never swap prod deps again); deploy verification must include a browser-render smoke check, because v78's blank UI shipped as "verified live" when the checks were server-up/curl only, which can't catch a client render failure; stale user-visible start9/0.4/assets/ABOUT.md; hardcoded Spark/Qdrant IPs in the s9pk; the 5.4k-line server.py monolith. P3 batch + full list in EVALUATION.md.
  • Doc drift to reconcile: crm-overview.md + EVALUATION.md still describe lp_profiles as a live model in places — a doc-auditor pass should align them to "grid canonical, lp_profiles retired."
  • Other gaps: the v2.0 spine is the working spine but not a canonical thesis_version (needs Grant + Jonathan dual sign-off); Appendix-A conviction/exposure (incl. ~40% Strike) stay Grant's working read, not canonical, not fed to the engine; live features (Claude/Qdrant/Gmail) unverified on the box.
  • Next: 1) Vendor + SRI-pin the front-end libs (serve React/Babel from the package, integrity-checked) so a CDN can never swap prod deps again, and script the render smoke check into deploy-verify — a working jsdom-mount + pinned-Babel-transform check was run manually for v80 (catches the v78/v79 blank-screen class); wire it into the build/install flow; 2) add an auth regression test asserting the 3 v79-gated GET endpoints (/api/users, /api/email/status, /api/email/accounts) reject members (v80 added the analogous test for /api/email/activity); 3) Grant validates digest Phase B on the box — Settings→Admin Send Digest Now, then tick Send automatically every day; 4) reports-subsystem soft-delete sweep + report-endpoint tests; 5) Pipeline adoption — grid flag → auto-load opportunities; 6) ?limit=abc crash; 7) NL→safe-query (separate, larger); 8) Grant + Jonathan freeze v2.0 canonical; 9) build reply-all.
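
For the ?limit=abc crash listed under known debt, one plausible shape for the fix is to parse the query parameter defensively and raise an error the handler can turn into a 400 instead of letting int() blow up the request thread. This is a sketch only: BadRequest, parse_limit, the default of 50, the cap of 500, and the /api/contacts path are hypothetical, not the CRM's actual values.

```python
from urllib.parse import parse_qs, urlparse


class BadRequest(Exception):
    """Hypothetical error type the handler would map to an HTTP 400 response."""


def parse_limit(url, default=50, maximum=500):
    """Read ?limit= from a request URL, validating and capping the value."""
    query = parse_qs(urlparse(url).query)
    raw = query.get("limit", [None])[0]
    if raw is None:
        return default  # parameter absent or blank
    try:
        value = int(raw)
    except ValueError:
        raise BadRequest(f"limit must be an integer, got {raw!r}") from None
    if value < 1:
        raise BadRequest("limit must be at least 1")
    return min(value, maximum)  # cap so one request cannot ask for everything


if __name__ == "__main__":
    print(parse_limit("/api/contacts?limit=10"))
    try:
        parse_limit("/api/contacts?limit=abc")
    except BadRequest as exc:
        print("400:", exc)
```

A regression test for this would call the list endpoint with a non-numeric limit and assert a 400 response rather than a dropped connection.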