# Ten31 Venture CRM + Agentic System — AGENTS.md **The foundation is a self-hosted venture-fund CRM** — a purpose-built fundraising tool that replaced Airtable to (1) keep sensitive LP/prospect data off third-party servers, (2) drop subscription cost, and (3) fit the fund's workflow: managing ~150 existing LPs, tracking 250+ prospects, and running the capital-raise pipeline. Core CRM domain: contacts (investor/prospect/advisor), organizations, opportunities (the deal pipeline), and communications; investor commitments live in the canonical `fundraising_*` grid (the legacy single-fund `lp_profiles` table was retired in v0.1.0:78). The fund (Ten31, ~$200M AUM, bitcoin/energy/AI thesis) runs it on a Start9 box, accessed over ClearNet (StartOS StartTunnel) with app-level user auth by a team of ~5 (Tailscale is not in use). Schema/API tour: `docs/crm-overview.md`. **The agentic system is new functionality built on top of that CRM** — an in-house AI layer to widen the fundraising funnel, sharpen the thesis, and automate outreach drafting. Frontier reasoning runs on Claude (Agent SDK/API); privacy-sensitive and bulk work runs on local DGX Spark models via the **Spark Control** gateway. **Phase 0/1 — no live outward-facing agents; agents draft, humans send.** > **Inbox check:** At session start, if `~/Projects/standards/INBOX.md` exists, scan it for > items tagged `(CRM)` and surface them before proposing next steps; triage with `/triage`. ## Stack (versions that matter) - **Python 3.11, standard library only at runtime.** The CRM is one monolith, `backend/server.py` (~5k lines): a stdlib `http.server.ThreadingHTTPServer` + hand-written `CRMHandler` with manual path dispatch (`do_GET`/`do_POST`). **Not FastAPI.** `backend/requirements.txt` lists FastAPI/SQLAlchemy/Alembic/Pydantic/pytest-style deps but **none are imported at runtime** (vestigial). - **SQLite** at `data/crm.db` (WAL, `foreign_keys=ON`), opened per-request via `get_db()`. Schema via ordered migrations. - **Frontend:** single `frontend/index.html`, inline-Babel React. **No build step.** - Optional runtime deps, used only if present: `bcrypt`, `PyJWT` (`jwt`), `cryptography` (Gmail module). - **MCP + ingest** (in the Docker image, not the bare CRM): `mcp==1.2.0` (FastMCP, `backend/mcp/server.py`), `fastembed==0.4.2`, `anthropic`, `cryptography==42.0.5`. - **Packaging:** StartOS 0.4, TypeScript SDK (`@start9labs/start-sdk`) under `start9/0.4/startos/`. Live target is `start9/0.4/`. - **Local models** (bge-m3 embeddings, bge-reranker-v2-m3, `/api/search`, Qdrant): always via Spark Control. Contract: `docs/EMBEDDINGS.md`. ## Commands ```bash # Run locally (dev, port 8080; or ./start.sh ) — runs python3 backend/server.py ./start.sh # Run prod-mode (beta) — requires CRM_SECRET_KEY ./start_beta.sh # Sanity-check edits (there is no compiler/build for the CRM) python3 -m py_compile backend/server.py # Run ONE test (tests are standalone scripts with `if __name__ == "__main__"`; no pytest installed) python3 backend/redaction/test_scrub_leak.py # substitute any backend/**/test_*.py # Run all tests (aggregate runner — runs each backend/**/test_*.py in its own subprocess) python3 backend/run_tests.py # add substrings to filter, e.g. `... soft_delete redaction` # Build + install the s9pk — BUMP THE VERSION FIRST. See docs/guides/packaging.md. cd start9/0.4 && make ``` - **Migrations** apply automatically at startup (`backend/core_migrations.py`, `schema_migrations` ledger). See `docs/guides/migrations.md` before adding one. - **Lint:** none configured. ## Directory layout (day-one) - `backend/server.py` — the CRM monolith: HTTP handler, route dispatch, `init_db()`, auth (username/password → HS256 JWT, roles admin/member). - `backend/core_migrations.py` + `backend/migrations/NNNN_*.sql` (+ paired `.down.sql`) — additive schema migrations, applied at startup. - `backend/thesis_seed.py` — Thesis Workshop seed + idempotent `ensure_*` one-time seeders, wired in `server.init_db()`. - `backend/thesis_review.py` — thesis version review/approval (human dual sign-off → canonical). - `backend/mcp/` — `architect_agent.py` (Claude thesis copilot), `architect_tools.py`, `outreach_agent.py` (LP draft assistant), `architect_grounding.py`, `crm_tools.py`, `server.py` (FastMCP). - `backend/email_integration/` — Gmail capture via domain-wide delegation + Tier-B draft creation (`compose.py`). - `backend/redaction/` — `scrub.py` + `client.py`: the scrub→Claude→re-hydrate privacy boundary. - `backend/ingest/` — chunk→embed→Qdrant + retrieval modes. - `backend/entity_*.py` — entity resolution/merge (the two-investor-model reconciliation). - `backend/matrix_intake/` — Matrix intake bot (separate process; `matrix-nio`, isolated to this component): typed message → local-Qwen parse → in-thread approve → write via the CRM's own `log-communication`. See the matrix-intake guide. - `frontend/index.html` — the entire UI. - `docs/` — architecture, phase plans, contracts, runbooks (see Deeper docs). `docs/guides/` — scoped subsystem rules (see below). - `start9/0.4/` — StartOS package (`startos/utils.ts` holds `PACKAGE_VERSION`). - `data/crm.db` — the live DB (gitignored). `.env` / `.env.example` — config (`.env` gitignored). ## Scoped guides Subsystem rules live in `docs/guides/` and lazy-load in Claude Code via `.claude/rules/` symlinks (scoped by `paths:` frontmatter). **Read the guide before editing that area:** - **Migrations or seeders** (`backend/migrations/`, `core_migrations.py`, `thesis_seed.py`) → `docs/guides/migrations.md` - **Thesis logic** (`backend/thesis_*.py`, `backend/mcp/architect_*.py`) → `docs/guides/thesis.md` - **Redaction or any MCP/Claude path** (`backend/redaction/`, `backend/mcp/`) → `docs/guides/redaction.md` - **Ingest / retrieval** (`backend/ingest/`) → `docs/guides/spark-ingest.md` - **Email capture / drafts + digest send** (`backend/email_integration/`, `backend/digest_mailer.py`, `backend/smtp_send.py`) → `docs/guides/email.md` - **Building or deploying the s9pk** (`start9/`) → `docs/guides/packaging.md` - **Matrix intake bot** (`backend/matrix_intake/`) → `docs/guides/matrix-intake.md` ## Conventions - **Investor model — the grid is canonical (since v0.1.0:78).** The `fundraising_*` grid is the **system of record**: an investor entity (row) → many contact "pills" → per-fund commitments. The classic `contacts` table is a **read-only per-person directory**, auto-populated from the grid — create/edit people in the grid, not the Contacts page. Email capture rolls multiple people up to one investor. The legacy single-fund `lp_profiles` model is **retired** (empty table kept, per never-hard-delete). Reconciling grid ↔ classic `contacts` to canonical IDs is the core entity-resolution task — see `docs/crm-overview.md`. - **Soft-delete only:** `deleted_at` and/or `status='retired'`; never hard-delete. Every READ path must filter `deleted_at IS NULL` — list handlers, get-by-id, nested related-data sub-selects, **and aggregate sub-selects (`COUNT`/`SUM`/`MAX`)**. Audits found leaks in all of these (2026-06-12 detail + nested; 2026-06-13 list-view `contact_count`/`total_funded`/`comm_count`); the **opportunities/pipeline** aggregates were fixed in v0.1.0:87 (`handle_pipeline_report` + dashboard pipeline metrics now filter `deleted_at`), but the **reports** subsystem's **communications-side** aggregates (dashboard `recent_comms`/`comms_this_month`/`meetings_this_month`, activity report) still leak (see Current state). Regression-guarded by `backend/test_soft_delete_reads.py`. (Thesis has a subtlety here — see the thesis guide.) - **Env:** secrets in `.env` (gitignored); names in `.env.example`. Verified names: `ANTHROPIC_API_KEY`, `SPARK_CONTROL_URL`, `SPARK_CONTROL_VERIFY_TLS`, `QDRANT_URL`, `X_API_KEY`, `CRM_DB_PATH`, `CRM_DEV_DB_PATH`. Also used: `CRM_SECRET_KEY` (beta/prod), `CRM_HOST`/`CRM_PORT`, `CRM_DATA_DIR`; digest mailer: `CRM_DIGEST_SENDER` (DWD impersonation sender) + `SMTP_HOST`/`SMTP_PORT`/`SMTP_SECURITY`/`SMTP_FROM`/`SMTP_USERNAME`/`SMTP_PASSWORD` (SMTP fallback); daily digest (Phase B): `CRM_DIGEST_ENABLED` + `CRM_DIGEST_SEND_HOUR` **only seed the first-boot default** — the live control is the DB policy (`app_settings.digest_policy`, set in Settings → Admin). - **Config placement:** operational/feature toggles live in the **admin panel**, DB-backed via `app_settings` (read-merge through a `load_*_policy(conn)` helper shared by the API + any scheduler; precedence DB-row → env-seed → default), so they're discoverable and take effect live. Reserve StartOS actions / env for **secrets and deploy-time config** (SMTP creds, API keys, DWD sender). Precedent: `digest_policy` (`GET/PATCH /api/admin/digest/policy`), `fundraising_backup_policy`. - **Commit style:** imperative subject, concise body explaining the *why*; put the package version in the subject (`… (v0.1.0:NN)`) for shippable changes. **No AI co-author / attribution trailers** — commits are authored by the user. ## Always - **Verify before shipping:** `python3 -m py_compile` the edited files; for DB logic, run the change against a **copy** of `data/crm.db`, never production. - **Keep real LP data out of Claude:** develop only on code/schema/synthetic-or-locally-redacted data; route any real record substance through `backend/redaction` first. - **Get explicit user authorization before any production deploy/install** to `$START9_BOX_HOST`. ## Never - **Never treat Qdrant (or any derived index) as source of truth** — the CRM/SQLite is canonical and rebuildable-from. - **Never hard-delete** CRM records or thesis history — soft-delete/archive only. - **Never let an agent send email, post, or contact an LP autonomously** — agents draft; a human approves and sends. - **Never set a `thesis_version` canonical from code/seeds** — that is human dual sign-off. - **Never call a Spark directly** — go through Spark Control (`SPARK_CONTROL_URL`). - **Never commit secrets, `data/crm.db`, `.env`, or `data/backups/`** (all gitignored). Scan staged files before committing. (`.claude/` *is* tracked — `launch.json` and `rules/` symlinks ship with the repo; keep local-only settings in `.claude/settings.local.json`.) - **Never bulk-export the LP list** to any third party; send only minimal non-sensitive context to Claude. - **Never assume FastAPI / SQLAlchemy / pytest** are in play — they sit in `requirements.txt` unused; runtime is stdlib + SQLite. - **Never add a `Co-Authored-By` / "Generated with" trailer** to commits or PRs — commits are the user's. ## Deeper docs - Full constitution + guardrails: `docs/ten31-constitution.md` - Architecture & rationale: `docs/Ten31_Agentic_Build_Plan.md` - Retrieval/embeddings contract: `docs/EMBEDDINGS.md` - CRM schema/API tour: `docs/crm-overview.md` - Current thesis handoff: `docs/thesis-handoff.md` - Operations & runbooks: `docs/OPERATIONS.md`, `docs/go-live-runbook.md`, `docs/gmail-enablement-runbook.md` ## Current state _Phase 0 + Phase 1 built; **box and repo at v0.1.0:87** (deployed & verified live 2026-06-18; migration chain …86→87 clean, `0005_grid_pipeline_link.sql` applied on the box, server up on :8080). **The fundraising grid + email capture is the canonical system of record** (2026-06-16) — vestigial classic-CRM surfaces get pruned/repurposed. Deploy/feature history lives in git log + `start9/0.4/startos/versions/`; longer-term backlog + debt in `ROADMAP.md` / `EVALUATION.md`._ - **Adopt the Pipeline — grid drives the deal board — DEPLOYED & verified live 2026-06-18 (v0.1.0:87); in-room/board live-smoke still pending.** An **"Add to Pipeline"** row action on the fundraising grid opens a seed modal (primary contact / target fund / expected amount / stage / probability) and creates a durably-linked `opportunities` row via the new **`opportunities.fundraising_investor_id`** (migration 0005, additive + reversible). **Grid owns the link + seed; the board owns stage/probability/owner** — a grid save never reseeds a live opp (`POST /api/fundraising/pipeline/link` is idempotent, one live opp/investor). Contact is **reused from the grid's synced `fundraising_contacts.contact_id`** (the `POST /api/contacts` side-door is gone); grid `lead`→owner. Two **read-only** grid columns (Pipeline action + Pipeline Stage) are **injected on read** from the live opp and **stripped on write** (never persisted, never dirty the autosave). **Remove from pipeline** (`POST .../unlink`) **soft-deletes the opp; the grid row stays fully intact**; deleting an investor from the grid archives its orphaned opp (`reconcile_grid_pipeline_links`, after `sync_fundraising_relational`). **Folded in:** the standing P2 soft-delete leak in `handle_pipeline_report` + dashboard pipeline aggregates (archived opps no longer counted). Tests: `backend/test_grid_pipeline_link.py`; 28/28 suite green, render-smoke green; migration verified on a copy of `data/crm.db` and **applied clean on the box**. **Next: live-smoke on the box — add an investor to the pipeline, confirm it lands on the board, advance a stage, and remove (opp archived, grid row intact).** Detail + locked decisions in `ROADMAP.md` "Adopt the Pipeline". - **Matrix intake bot — DEPLOYED & LIVE (2026-06-17), `backend/matrix_intake/`:** a separate-process bot (its `matrix-nio` dep isolated from the stdlib CRM) turning a typed Matrix-room message into a proposed fundraising-grid add/edit, written only after **in-thread human approval** (`yes`/`edit field=value`/`no`). Parse = local Qwen via Spark Control (no Claude/scrub, like the digest); writes reuse the CRM's own `POST /api/fundraising/log-communication` tagged `source="matrix_intake"`; new-vs-existing via read-only `GET /api/intake/match` (returns the grid row id → no duplicate). **Runs on the Spark as a docker-compose service** (`modelo32`, container `matrix-intake`, `restart: unless-stopped` → survives a reboot; `docker-compose.yml` at the repo root + `backend/matrix_intake/Dockerfile` bundling `backend/matrix_intake` + the stdlib `backend/ingest` Spark client; retired the old nohup launch, 2026-06-17). A spark-control dashboard card is still pending (handoff: `docs/handoffs/add-intake-bot-to-spark-control.md`). **Live-smoked end-to-end** (new-investor create + existing-investor note matched & appended, no dup). Server side shipped to the box as **v0.1.0:84** (`/api/intake/match` + `source` provenance — these were missing on v83, so the bot 404'd until v84); then UX adds: main-timeline nudge pointer, top-level-`yes`→thread redirect, clearer commit wording, note text in the grid line (v85 dropped the `[note]` tag). M3 (business-card photo) deferred (no Spark vision model). Guide: `docs/guides/matrix-intake.md`. - **Matrix intake — fuzzy-match + conversational-edit pass — DEPLOYED & LIVE 2026-06-17 (box on v0.1.0:86, bot restarted on the Spark; `candidates` endpoint verified live); `revise` leg live-smoked 2026-06-17, fuzzy disambiguation grammar still un-smoked.** Closes the two locked post-deploy enhancements (ROADMAP). **(a) Fuzzy matching (server-side, ships in the s9pk):** `find_intake_candidates` in `server.py` (deterministic — stdlib `difflib` name similarity + token-set Jaccard, legal-suffix-aware via `_strip_legal_suffix`, + email Levenshtein ≤ 2; ranked, ≥0.62, top 5); `GET /api/intake/match` now returns `{match, candidates}`. The bot surfaces a numbered shortlist (`_stage="disambiguate"`) so a near-duplicate ("Charlie"/"Charles", "Acme Capital"/"Acme Capital LLC", a one-char email typo) is **confirmed by a human** instead of silently creating a second investor — never auto-attached. **The optional LLM-judge re-rank was deferred** (deterministic filter already surfaces the cases; LLM is the right shortlist *pruner* if noise proves real). **(b) Conversational edits (bot-side, ships on the Spark):** any in-thread reply that isn't `yes`/`no`/`edit field=value` → `parse.revise` re-runs `{proposal + instruction}` through local Qwen and re-renders the card; **email integrity preserved** (a changed address must literally appear in the instruction; the model's email field is never trusted); no-op revisions re-prompt (`same_fields`). **Deploy is split:** the `candidates` need an **s9pk build+install** (v86); the bot's disambiguation+revise need a **Spark `git pull` + restart** — a bot restart alone won't deliver `candidates` (box returns `[]`, bot safely proposes new). Tests green; the Qwen `revise` leg is now live-smoked (2026-06-17, with the roster fix below); the fuzzy **disambiguation** numbered-pick grammar is the one in-room path still un-smoked. Guide updated. - **Matrix intake — team-roster parse frame — DEPLOYED & LIVE 2026-06-17 (bot at `c1ea176` on the Spark; live-smoked).** Fixes the live-smoke gripe where *"jonathan is chatting with wyoming"* extracted the teammate, not the prospect. `parse.build_system(roster)` appends an **outreach frame** when `INTAKE_TEAM_ROSTER` (`.env`, comma-separated names/initials, case-insensitive — 11 entries live) is set: roster names are the people *doing* outreach and are **never extracted** as investor/contact — the *other* party is the prospect. Same framing on `revise`; roster read once at startup (`settings.team_roster()`, logs `team roster loaded (N names)`), so a roster change needs a bot restart. **Bot-side only — no s9pk bump** (box stays v0.1.0:86); shipped via Spark `git pull` + `docker compose up -d --build` + the new `.env` var. Roster unset → prior behavior. Tests: +3 in `test_parse.py`. Guide updated. - **Working (all draft-only):** CRM + ingest (chunk→embed→Qdrant + retrieval) + redaction boundary; Gmail capture (DWD) + email-activity propose→approve; Thesis Workshop + Architect (Claude) with dual-approval gate; Outreach Draft Assistant + follow-up radar + per-user voice + Tier-B in-thread Gmail draft creation. - **Deploy history (done — git log + `start9/0.4/startos/versions/` + guides):** v74 security/path-traversal hardening; v78 retired `lp_profiles`/LP Tracker (grid is canonical); v80–83 email-activity Communications tab (typed investor facet, date filter, full-body view, semantic content search) + daily-digest windowed preview→send (`docs/guides/email.md`); v82 vendored + SRI-pinned front-end libs + jsdom render-smoke build gate (`docs/guides/packaging.md`). - **Tests:** **27/27 backend green** (`python3 backend/run_tests.py`), `py_compile` clean; frontend render-smoke gates the default `make` build. - **Debt (P2, not deploy-blocking; full list `EVALUATION.md`):** reports-subsystem soft-delete sweep — **pipeline/opportunities aggregates fixed v87**; remaining: the dashboard **communications** aggregates (`recent_comms`/`comms_this_month`/`meetings_this_month`) + activity report + report-endpoint tests; `?limit=abc` crashes the request thread; auth regression test for the 3 v79-gated GETs (`/api/users`, `/api/email/status`, `/api/email/accounts`); scrub-gateway TLS verify off; hardcoded Spark/Qdrant IPs + **oversized StartOS package icon** (fix before the next s9pk upload); the 5.4k-line `server.py` monolith. - **Open / risks:** the v2.0 reserve-asset spine is the *working* approved spine but **not a canonical `thesis_version`** (needs Grant + Jonathan dual sign-off; Appendix-A conviction incl. ~40% Strike stays Grant's working read, not fed to the engine); **Claude/Architect path still unverified live on the box**; the intake matcher reads only the grid blob (not classic `contacts`); doc drift — `crm-overview.md` + `EVALUATION.md` still call `lp_profiles` live (doc-auditor pass). - **Next:** 1) **Pipeline adoption — DEPLOYED (v0.1.0:87, above); live-smoke** the "Add to Pipeline" → board → advance-stage → remove round-trip on the box; 2) **spark-control intake dashboard card** (separate session in the spark-control repo — handoff at `docs/handoffs/add-intake-bot-to-spark-control.md`), and longer-term **extract the bot to its own repo** (ROADMAP); 3) in-room smoke of the intake **disambiguation** numbered-pick grammar (the one unexercised path) — and a roster-tuning pass if any teammate name/initial still slips through; 4) **NL→safe-query** (search item 3 — separate, larger build); 5) Grant + Jonathan freeze v2.0 canonical; 6) reply-all for Tier-B drafts; then clear the P2 debt above.