
Keysat 68106d7a5a Add Matrix NL-query Q&A surface (W2 step 5)

Read-only natural-language query over the curated nl_query endpoint, answered
in-thread. Two entry points (room-per-purpose model): a dedicated Q&A room
(MATRIX_QUERY_ROOM) where every top-level message is a question, plus the
?/@bot trigger in the intake room as a cross-room convenience. Both routes hit
the same handle_query -> crm_client.nl_query -> POST /api/query/nl; translation
runs on the box's local model, nothing leaves the box, and there is no write
path so no approval gate applies.

Pure logic (trigger parsing, answer rendering) in query.py with offline tests;
async room wiring in bot.py (live-smoke only, per the bot's convention).

Bot-side only, ships on the Spark via git pull + restart. Depends on the
box-side /api/query/nl endpoint, which lands with the v93 s9pk (reminders + W2):
until v93 is installed the Q&A surface 404s, so the bot deploy is staged to
follow that install.

2026-06-18 19:46:54 -05:00

17 KiB


Ten31 Venture CRM + Agentic System — AGENTS.md

The foundation is a self-hosted venture-fund CRM — a purpose-built fundraising tool that replaced Airtable to (1) keep sensitive LP/prospect data off third-party servers, (2) drop subscription cost, and (3) fit the fund's workflow: managing ~150 existing LPs, tracking 250+ prospects, and running the capital-raise pipeline. Core CRM domain: contacts (investor/prospect/advisor), organizations, opportunities (the deal pipeline), and communications; investor commitments live in the canonical fundraising_* grid (the legacy single-fund lp_profiles table was retired in v0.1.0:78). The fund (Ten31, ~$200M AUM, bitcoin/energy/AI thesis) runs it on a Start9 box, accessed over ClearNet (StartOS StartTunnel) with app-level user auth by a team of ~5 (Tailscale is not in use). Schema/API tour: docs/crm-overview.md.

The agentic system is new functionality built on top of that CRM — an in-house AI layer to widen the fundraising funnel, sharpen the thesis, and automate outreach drafting. Frontier reasoning runs on Claude (Agent SDK/API); privacy-sensitive and bulk work runs on local DGX Spark models via the Spark Control gateway. Phase 0/1 — no live outward-facing agents; agents draft, humans send.

Inbox check: At session start, if ~/Projects/standards/INBOX.md exists, scan it for items tagged (CRM) and surface them before proposing next steps; triage with /triage.

Stack (versions that matter)

Python 3.11, standard library only at runtime. The CRM is one monolith, backend/server.py (~5k lines): a stdlib http.server.ThreadingHTTPServer + hand-written CRMHandler with manual path dispatch (do_GET/do_POST). Not FastAPI. backend/requirements.txt lists FastAPI/SQLAlchemy/Alembic/Pydantic/pytest-style deps but none are imported at runtime (vestigial).
SQLite at data/crm.db (WAL, foreign_keys=ON), opened per-request via get_db(). Schema via ordered migrations.
Frontend: single frontend/index.html, inline-Babel React. No build step.
Optional runtime deps, used only if present: bcrypt, PyJWT (jwt), cryptography (Gmail module).
MCP + ingest (in the Docker image, not the bare CRM): mcp==1.2.0 (FastMCP, backend/mcp/server.py), fastembed==0.4.2, anthropic, cryptography==42.0.5.
Packaging: StartOS 0.4, TypeScript SDK (@start9labs/start-sdk) under start9/0.4/startos/. Live target is start9/0.4/.
Local models (bge-m3 embeddings, bge-reranker-v2-m3, /api/search, Qdrant): always via Spark Control. Contract: docs/EMBEDDINGS.md.
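The SQLite entry above (WAL, foreign_keys=ON, one connection per request) can be sketched as follows. This is a minimal illustration under assumptions, not the code in backend/server.py: the signature and the `db_path` default are hypothetical, and the real get_db() may set more options.

```python
import sqlite3


def get_db(db_path="data/crm.db"):
    """Open a fresh connection for one request (hypothetical signature)."""
    conn = sqlite3.connect(str(db_path))
    conn.row_factory = sqlite3.Row            # address columns by name
    conn.execute("PRAGMA journal_mode=WAL")   # stored in the database file once set
    conn.execute("PRAGMA foreign_keys=ON")    # per-connection: must be set on every open
    return conn
```

The point worth noticing is the asymmetry: WAL persists in the file, while foreign-key enforcement is off by default on each new connection, so a per-request opener is the natural place to turn it on.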

Commands

# Run locally (dev, port 8080; or ./start.sh <port>) — runs python3 backend/server.py
./start.sh
# Run prod-mode (beta) — requires CRM_SECRET_KEY
./start_beta.sh
# Sanity-check edits (there is no compiler/build for the CRM)
python3 -m py_compile backend/server.py
# Run ONE test (tests are standalone scripts with `if __name__ == "__main__"`; no pytest installed)
python3 backend/redaction/test_scrub_leak.py        # substitute any backend/**/test_*.py
# Run all tests (aggregate runner — runs each backend/**/test_*.py in its own subprocess)
python3 backend/run_tests.py                         # add substrings to filter, e.g. `... soft_delete redaction`
# Build + install the s9pk — BUMP THE VERSION FIRST. See docs/guides/packaging.md.
cd start9/0.4 && make

Migrations apply automatically at startup (backend/core_migrations.py, schema_migrations ledger). See docs/guides/migrations.md before adding one.
Lint: none configured.
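For orientation, the "standalone script" test convention described in the commands above might look roughly like this. It is a sketch only: `normalize_email` and both test names are hypothetical stand-ins, and the repo's real test files may be organized differently.

```python
import sys


def normalize_email(raw):
    """Hypothetical function under test."""
    return raw.strip().lower()


def test_strips_and_lowercases():
    assert normalize_email("  LP@Example.com ") == "lp@example.com"


def test_idempotent():
    once = normalize_email("A@b.co")
    assert normalize_email(once) == once


# Explicit registry: no pytest discovery is available.
TESTS = [test_strips_and_lowercases, test_idempotent]


def run_all():
    """Run every registered test; return the number of failures."""
    failures = 0
    for fn in TESTS:
        try:
            fn()
            print(f"ok   {fn.__name__}")
        except AssertionError as exc:
            failures += 1
            print(f"FAIL {fn.__name__}: {exc}")
    return failures


if __name__ == "__main__":
    if run_all():
        sys.exit(1)  # non-zero exit lets a subprocess-based runner detect failure
```

Because an aggregate runner that launches each file in its own subprocess can only see the exit status and output, the non-zero exit on failure is the part that matters.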

Directory layout (day-one)

backend/server.py — the CRM monolith: HTTP handler, route dispatch, init_db(), auth (username/password → HS256 JWT, roles admin/member/bot).
backend/core_migrations.py + backend/migrations/NNNN_*.sql (+ paired .down.sql) — additive schema migrations, applied at startup.
backend/thesis_seed.py — Thesis Workshop seed + idempotent ensure_* one-time seeders, wired in server.init_db().
backend/thesis_review.py — thesis version review/approval (human dual sign-off → canonical).
backend/mcp/ — architect_agent.py (Claude thesis copilot), architect_tools.py, outreach_agent.py (LP draft assistant), architect_grounding.py, crm_tools.py, server.py (FastMCP).
backend/email_integration/ — Gmail capture via domain-wide delegation + Tier-B draft creation (compose.py).
backend/redaction/ — scrub.py + client.py: the scrub→Claude→re-hydrate privacy boundary.
backend/ingest/ — chunk→embed→Qdrant + retrieval modes.
backend/entity_*.py — entity resolution/merge (the two-investor-model reconciliation).
backend/nl_query/ — read-only natural-language query (W2): intents.py (curated parameterized query catalog), runner.py (slot validator = trust boundary), translate.py (local-Qwen question→{intent,slots}). See the nl-query guide.
backend/matrix_intake/ — Matrix intake bot (separate process; matrix-nio, isolated to this component): typed message → local-Qwen parse → in-thread approve → write via the CRM's own log-communication. See the matrix-intake guide.
frontend/index.html — the entire UI.
docs/ — architecture, phase plans, contracts, runbooks (see Deeper docs). docs/guides/ — scoped subsystem rules (see below).
start9/0.4/ — StartOS package (startos/utils.ts holds PACKAGE_VERSION).
data/crm.db — the live DB (gitignored). .env / .env.example — config (.env gitignored).
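The backend/nl_query/ entry above describes a curated catalog of parameterized queries with a slot validator as the trust boundary. A minimal sketch of that pattern follows, assuming a toy one-intent catalog and a toy `contacts(name, org, deleted_at)` schema; the intent name, slot names, and `MAX_LIMIT` are all hypothetical and do not reflect the real intents.py or runner.py.

```python
import sqlite3

# Hypothetical catalog: each intent maps to fixed SQL plus typed slots.
CATALOG = {
    "contacts_by_org": {
        "sql": (
            "SELECT name FROM contacts "
            "WHERE org = :org AND deleted_at IS NULL "
            "ORDER BY name LIMIT :limit"
        ),
        "slots": {"org": str, "limit": int},
    },
}

MAX_LIMIT = 100  # hypothetical cap


def validate(intent, slots):
    """Return (sql, params) for a known intent, or raise ValueError."""
    spec = CATALOG.get(intent)
    if spec is None:
        raise ValueError(f"unknown intent: {intent!r}")
    expected = spec["slots"]
    if set(slots) != set(expected):
        raise ValueError(f"expected slots {sorted(expected)}, got {sorted(slots)}")
    for name, typ in expected.items():
        value = slots[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, typ):
            raise ValueError(f"slot {name!r} must be {typ.__name__}")
    if not 1 <= slots["limit"] <= MAX_LIMIT:
        raise ValueError("limit out of range")
    return spec["sql"], dict(slots)


def run_query(conn, intent, slots):
    """Execute a catalog query; slot values are only ever bound as parameters."""
    sql, params = validate(intent, slots)
    return conn.execute(sql, params).fetchall()
```

In this shape the translator can only choose an intent and fill slots; it never contributes SQL text, so a bad translation fails validation instead of reaching the database.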

Scoped guides

Subsystem rules live in docs/guides/ and lazy-load in Claude Code via .claude/rules/ symlinks (scoped by paths: frontmatter). Read the guide before editing that area:

Migrations or seeders (backend/migrations/, core_migrations.py, thesis_seed.py) → docs/guides/migrations.md
Thesis logic (backend/thesis_*.py, backend/mcp/architect_*.py) → docs/guides/thesis.md
Redaction or any MCP/Claude path (backend/redaction/, backend/mcp/) → docs/guides/redaction.md
Ingest / retrieval (backend/ingest/) → docs/guides/spark-ingest.md
Email capture / drafts + digest send (backend/email_integration/, backend/digest_mailer.py, backend/smtp_send.py) → docs/guides/email.md
Building or deploying the s9pk (start9/) → docs/guides/packaging.md
Matrix intake bot (backend/matrix_intake/) → docs/guides/matrix-intake.md
Natural-language query (backend/nl_query/) → docs/guides/nl-query.md

Conventions

Investor model — the grid is canonical (since v0.1.0:78). The fundraising_* grid is the system of record: an investor entity (row) → many contact "pills" → per-fund commitments. The classic contacts table is a read-only per-person directory, auto-populated from the grid — create/edit people in the grid, not the Contacts page. Email capture rolls multiple people up to one investor. The legacy single-fund lp_profiles model is retired (empty table kept, per never-hard-delete). Reconciling grid ↔ classic contacts to canonical IDs is the core entity-resolution task — see docs/crm-overview.md.
Soft-delete only: deleted_at and/or status='retired'; never hard-delete. Every READ path must filter deleted_at IS NULL — list handlers, get-by-id, nested related-data sub-selects, and aggregate sub-selects (COUNT/SUM/MAX). Audits found leaks in all of these (2026-06-12 detail + nested; 2026-06-13 list-view contact_count/total_funded/comm_count); the opportunities/pipeline aggregates were fixed in v0.1.0:87 (handle_pipeline_report + dashboard pipeline metrics now filter deleted_at), but the reports subsystem's communications-side aggregates (dashboard recent_comms/comms_this_month/meetings_this_month, activity report) still leak (see Current state). Regression-guarded by backend/test_soft_delete_reads.py (+ test_reminders.py for the reminders read paths, incl. the recency rollup whose email-activity liveness signal is email_account_messages.deleted_at, not emails). (Thesis has a subtlety here — see the thesis guide.)
Env: secrets in .env (gitignored); names in .env.example. Verified names: ANTHROPIC_API_KEY, SPARK_CONTROL_URL, SPARK_CONTROL_VERIFY_TLS, QDRANT_URL, X_API_KEY, CRM_DB_PATH, CRM_DEV_DB_PATH. Also used: CRM_SECRET_KEY (beta/prod), CRM_HOST/CRM_PORT, CRM_DATA_DIR; digest mailer: CRM_DIGEST_SENDER (DWD impersonation sender) + SMTP_HOST/SMTP_PORT/SMTP_SECURITY/SMTP_FROM/SMTP_USERNAME/SMTP_PASSWORD (SMTP fallback); daily digest (Phase B): CRM_DIGEST_ENABLED + CRM_DIGEST_SEND_HOUR only seed the first-boot default — the live control is the DB policy (app_settings.digest_policy, set in Settings → Admin).
Config placement: operational/feature toggles live in the admin panel, DB-backed via app_settings (read-merge through a load_*_policy(conn) helper shared by the API + any scheduler; precedence DB-row → env-seed → default), so they're discoverable and take effect live. Reserve StartOS actions / env for secrets and deploy-time config (SMTP creds, API keys, DWD sender). Precedent: digest_policy (GET/PATCH /api/admin/digest/policy), fundraising_backup_policy.
Agent/bot API access — three roles now (admin/member/bot). require_admin is the only hard gate; everything else is "authenticated" (member, admin, and bot all pass). The bot role (added v0.1.0:89) is authenticated-but-never-admin: require_bot_or_admin gates agent-facing endpoints (e.g. /api/intake/email-proposals*) so a bot credential reaches only what it needs, never user-management/settings/security. Provision it via Settings → Admin edit-user dropdown (kept out of the teammate-invite form). Two axes to keep separate as more agent capability lands: the role controls reach (which endpoints); the per-feature human draft→approve gate controls autonomy (acting unattended). Money/merge/delete mutations stay behind the approval gate regardless of role. Don't build a finer capability/scope system until real NL-mutation endpoints exist to scope against.
Commit style: imperative subject, concise body explaining the why; put the package version in the subject (… (v0.1.0:NN)) for shippable changes. No AI co-author / attribution trailers — commits are authored by the user.
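The soft-delete convention above calls out aggregate sub-selects as a place where deleted rows leak. The following self-contained sketch shows that failure mode on a toy schema; the table and column layout is an assumption for illustration, not the CRM's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT, deleted_at TEXT);
CREATE TABLE contacts (id INTEGER PRIMARY KEY, org_id INTEGER, name TEXT, deleted_at TEXT);
INSERT INTO organizations VALUES (1, 'Acme', NULL), (2, 'Gone LLC', '2026-06-01');
INSERT INTO contacts VALUES
    (1, 1, 'Ann', NULL),
    (2, 1, 'Bob', '2026-06-10'),   -- soft-deleted contact
    (3, 2, 'Cy',  NULL);
""")

# Filters the outer table but forgets the nested aggregate.
LEAKY = """
SELECT o.name,
       (SELECT COUNT(*) FROM contacts c WHERE c.org_id = o.id) AS contact_count
FROM organizations o
WHERE o.deleted_at IS NULL
"""

# Filters deleted_at at every level, including the sub-select.
CORRECT = """
SELECT o.name,
       (SELECT COUNT(*) FROM contacts c
         WHERE c.org_id = o.id AND c.deleted_at IS NULL) AS contact_count
FROM organizations o
WHERE o.deleted_at IS NULL
"""

leaky = conn.execute(LEAKY).fetchall()
correct = conn.execute(CORRECT).fetchall()
print(leaky)    # [('Acme', 2)]  (counts the soft-deleted contact)
print(correct)  # [('Acme', 1)]
```

Both queries hide the deleted organization, so the list looks right at a glance; only the count exposes the leak, which is why a regression test on aggregates is useful.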

Always

Verify before shipping: python3 -m py_compile the edited files; for DB logic, run the change against a copy of data/crm.db, never production.
Keep real LP data out of Claude: develop only on code/schema/synthetic-or-locally-redacted data; route any real record substance through backend/redaction first.
Get explicit user authorization before any production deploy/install to $START9_BOX_HOST.
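One way to get the "copy of data/crm.db" that the first rule asks for is Python's sqlite3 online backup API. The sketch below is an assumption about workflow, not an existing repo script; `snapshot_db` and both paths are hypothetical names.

```python
import sqlite3
from pathlib import Path


def snapshot_db(src_path, dest_path):
    """Copy a SQLite database through the backup API; returns dest_path."""
    # Open the source read-only so the copy step cannot modify it.
    src_uri = Path(src_path).resolve().as_uri() + "?mode=ro"
    src = sqlite3.connect(src_uri, uri=True)
    dest = sqlite3.connect(str(dest_path))
    try:
        src.backup(dest)  # copies every page of src into dest
    finally:
        dest.close()
        src.close()
    return dest_path
```

Usage would be along the lines of `snapshot_db("data/crm.db", "/tmp/crm-copy.db")`, then pointing the change under test at the copy. The backup API is preferred here over a plain file copy because, in WAL mode, recent commits can still sit in the `-wal` file and a copy of the main file alone may miss them.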

Never

Never treat Qdrant (or any derived index) as source of truth — the CRM/SQLite is canonical and rebuildable-from.
Never hard-delete CRM records or thesis history — soft-delete/archive only.
Never let an agent send email, post, or contact an LP autonomously — agents draft; a human approves and sends.
Never set a thesis_version canonical from code/seeds — that is human dual sign-off.
Never call a Spark directly — go through Spark Control (SPARK_CONTROL_URL).
Never commit secrets, data/crm.db, .env, or data/backups/ (all gitignored). Scan staged files before committing. (.claude/ is tracked — launch.json and rules/ symlinks ship with the repo; keep local-only settings in .claude/settings.local.json.)
Never bulk-export the LP list to any third party; send only minimal non-sensitive context to Claude.
Never assume FastAPI / SQLAlchemy / pytest are in play — they sit in requirements.txt unused; runtime is stdlib + SQLite.
Never add a Co-Authored-By / "Generated with" trailer to commits or PRs — commits are the user's.

Deeper docs

Full constitution + guardrails: docs/ten31-constitution.md
Architecture & rationale: docs/Ten31_Agentic_Build_Plan.md
Retrieval/embeddings contract: docs/EMBEDDINGS.md
CRM schema/API tour: docs/crm-overview.md
Current thesis handoff: docs/thesis-handoff.md
Operations & runbooks: docs/OPERATIONS.md, docs/go-live-runbook.md, docs/gmail-enablement-runbook.md

Current state

Phase 0 + Phase 1 built; box live at v0.1.0:91; repo at v0.1.0:92 (reminders, deploy pending). The fundraising grid + email capture is the canonical system of record. Active thread: W2 natural-language query (backend + Matrix @bot surface built; web "Ask" box next). Deploy/feature history: git log + start9/0.4/startos/versions/; longer-term backlog/debt: ROADMAP.md / EVALUATION.md.

W2 (natural-language query, read-only): BACKEND + MATRIX @bot surface built + tested locally 2026-06-18; web "Ask" box next. backend/nl_query/ holds 12 curated parameterized queries + a slot validator (the trust boundary; no generic SQL) + a local-Qwen translator (question→{intent,slots} via Spark Control; nothing leaves the box, no Claude, no redaction: the simplification Grant chose). POST /api/query/nl (also accepts direct {intent,slots}) + GET /api/query/catalog, require_bot_or_admin, audited (entity_type='nl_query'). Local Qwen translated 12/12 of Grant's real example questions correctly against the live Spark, which settles local-only (Claude not needed). Soft-delete-correct per table (gotcha: fundraising_* has no deleted_at, so graveyard is the axis; emails count as live via a live email_account_messages (eam) sighting). Guide: docs/guides/nl-query.md. Step 5 (Matrix Q&A) DONE: thin client in backend/matrix_intake/query.py (trigger grammar + answer rendering) + crm_client.nl_query + bot.py wiring, read-only (no approval gate), tested in test_query.py. Two entry points (room-per-purpose model): a dedicated Q&A room (MATRIX_QUERY_ROOM) where every top-level message is a question, and the ?/@bot trigger still working in the intake room as a cross-room convenience. Ships on the Spark (git pull + restart, no s9pk for the bot). Q&A room !RGlJEObVaIUtUVcHtx:matrix.gilliam.ai created + bot invited (2026-06-18). BUT the box-side /api/query/nl endpoint is NOT live yet (box v91; verified 404 on 2026-06-18); it lands with the v93 s9pk (reminders + W2). So DON'T activate the bot deploy (set MATRIX_QUERY_ROOM + restart) until v93 is installed, or every question 404s. Code committed + pushed; bot deploy is staged to follow the v93 install. Next: step 4 web "Ask" box (Communications tab), the last thin client.
W1 — reminders & follow-ups: BUILT + tested locally (v0.1.0:92), DEPLOY PENDING. First-class tickler tied to the grid (migration 0006; CRUD GET/POST/PATCH/DELETE /api/reminders; derived reminder_status grid column; Reminders page + dashboard card + digest section; the last_activity_at recency rollup that W2 reuses). Needs s9pk build + install (authorize first; verify 0006 against a DB copy). Deferred W1b = nurture-gap auto-suggested reminders.
Done & live (detail in git log / ROADMAP): email-proposal Matrix review + bot role (box v91); grid-driven Pipeline (v88); Matrix intake bot (Spark matrix-intake container); Gmail capture (DWD) + propose→approve + daily digest; Thesis Workshop + Architect (Claude, dual-approval); outreach drafts + radar. All draft-only.
Tests: 35/35 backend green (python3 backend/run_tests.py; +nl_query/ + matrix test_query.py suites), py_compile clean; render-smoke gates make.
Next (priority order): 1) deploy reminders (v92) + W2 together — bump to v0.1.0:93, build s9pk, install, browser-verify (authorize first; verify 0006 against a DB copy) — this is the gate for the Matrix Q&A: the bot's step-5 surface 404s until /api/query/nl is on the box; THEN activate the bot deploy (set MATRIX_QUERY_ROOM on the Spark + git pull + restart) + in-room smoke; 2) W2 step 4 web Ask box (last NL-query client); 3) W3 bot grid-mutations behind the Matrix approval gate (local-Qwen parse); 4) W1b nurture-gap reminders; 5) Grant + Jonathan freeze v2.0 canonical; 6) in-room smoke of the intake disambiguation numbered-pick grammar; then P2 debt (reports comms-aggregate soft-delete sweep, ?limit=abc crash, auth regression test, oversized StartOS icon).
Open / risks: W2 translation only happy-path-validated (typos/ambiguous/no-match phrasings shake out in live use); Claude/Architect path still unverified live on the box; v2.0 reserve-asset spine is the working approved spine but not canonical (needs dual sign-off); doc drift — crm-overview.md + EVALUATION.md still call lp_profiles live.
