Ten31 Agentic System — AGENTS.md
In-house AI-agent system over a self-hosted Start9 CRM (SQLite) for a bitcoin/energy/AI investment fund: widen the fundraising funnel, sharpen the thesis, automate outreach. Frontier reasoning runs on Claude (Agent SDK/API); privacy-sensitive and bulk work runs on local DGX Spark models via the Spark Control gateway. Phase 0/1 — no live outward-facing agents; agents draft, humans send.
Stack (versions that matter)
- Python 3.11, standard library only at runtime. The CRM is one monolith, backend/server.py (~5k lines): a stdlib http.server.ThreadingHTTPServer + hand-written CRMHandler with manual path dispatch (do_GET/do_POST). Not FastAPI. backend/requirements.txt lists FastAPI/SQLAlchemy/Alembic/Pydantic/pytest-style deps but none are imported at runtime (vestigial).
- SQLite at data/crm.db (WAL, foreign_keys=ON), opened per-request via get_db(). Schema via ordered migrations.
- Frontend: single frontend/index.html, inline-Babel React. No build step.
- Optional runtime deps, used only if present: bcrypt, PyJWT (jwt), cryptography (Gmail module).
- MCP + ingest (in the Docker image, not the bare CRM): mcp==1.2.0 (FastMCP, backend/mcp/server.py), fastembed==0.4.2, anthropic, cryptography==42.0.5.
- Packaging: StartOS 0.4, TypeScript SDK (@start9labs/start-sdk) under start9/0.4/startos/. Live target is start9/0.4/.
- Local models (bge-m3 embeddings, bge-reranker-v2-m3, /api/search, Qdrant): always via Spark Control. Contract: docs/EMBEDDINGS.md.
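To make the "stdlib server + manual dispatch + per-request connection" shape concrete, here is a minimal sketch. It is not the code in backend/server.py: the route paths, SketchHandler, route(), make_server(), the contacts column layout, and the temp-file DB path are all assumptions chosen for illustration. Only ThreadingHTTPServer, do_GET/do_POST, get_db(), WAL, and foreign_keys=ON come from the description above.

```python
import json
import os
import sqlite3
import tempfile
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Throwaway stand-in for data/crm.db so the sketch never touches real data.
DB_PATH = os.path.join(tempfile.mkdtemp(), "crm.db")


def get_db() -> sqlite3.Connection:
    """Open a fresh connection for one request, with WAL and foreign keys on."""
    conn = sqlite3.connect(DB_PATH)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA foreign_keys=ON")
    return conn


def route(method: str, path: str):
    """Manual path dispatch: plain string matching, no framework router."""
    path = path.split("?", 1)[0]  # ignore any query string
    if method == "GET" and path == "/api/health":  # hypothetical route
        return 200, {"ok": True}
    if method == "GET" and path == "/api/contacts/count":  # hypothetical route
        conn = get_db()
        try:
            count = conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
        finally:
            conn.close()  # one connection per request, always closed
        return 200, {"count": count}
    return 404, {"error": "not found"}


class SketchHandler(BaseHTTPRequestHandler):
    """Hypothetical handler; the real one is the hand-written CRMHandler."""

    def do_GET(self):
        self._reply(*route("GET", self.path))

    def do_POST(self):
        self._reply(*route("POST", self.path))

    def _reply(self, status: int, payload: dict) -> None:
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Build (but do not start) a threaded server bound to localhost."""
    return ThreadingHTTPServer(("127.0.0.1", port), SketchHandler)
```

Calling make_server().serve_forever() would start serving. Keeping dispatch in a plain function makes it checkable without opening a socket.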
Commands
# Run locally (dev, port 8080; or ./start.sh <port>) — runs python3 backend/server.py
./start.sh
# Run prod-mode (Tailscale/beta) — requires CRM_SECRET_KEY
./start_beta.sh
# Sanity-check edits (there is no compiler/build for the CRM)
python3 -m py_compile backend/server.py
# Run ONE test (tests are standalone scripts with `if __name__ == "__main__"`; no pytest installed)
python3 backend/redaction/test_scrub_leak.py # substitute any backend/**/test_*.py (13 exist)
# Run all tests (no aggregate runner exists)
for t in $(find backend -name 'test_*.py'); do echo "== $t"; python3 "$t" || break; done
# Build the s9pk (x86_64 only) -> ten-database_x86_64.s9pk — BUMP THE VERSION FIRST (see Always)
cd start9/0.4 && make
# Install to the box — PRODUCTION; get explicit user OK first. TODO: confirm exact host/context.
start-cli package install -s ten-database_x86_64.s9pk # target host = $START9_BOX_HOST (real value lives in your local start-cli context config, NOT this repo)
- Migrations apply automatically at startup via backend/core_migrations.py from backend/migrations/NNNN_*.sql, tracked in a schema_migrations ledger. Verify a new one against a copy of data/crm.db, never production.
- Lint: none configured.
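The ledger-tracked startup migration described above can be sketched roughly as follows. This is an illustration, not the contents of backend/core_migrations.py: the function name apply_migrations, the ledger columns (version, applied_at), and the choice to key the ledger on the four-digit prefix are assumptions.

```python
import re
import sqlite3
from pathlib import Path

# Matches forward migrations such as 0007_add_column.sql and captures "0007".
MIGRATION_RE = re.compile(r"^(\d{4})_.+\.sql$")


def apply_migrations(conn: sqlite3.Connection, migrations_dir) -> list:
    """Apply unapplied NNNN_*.sql files in order; return the names applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        " version TEXT PRIMARY KEY,"
        " applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"  # assumed ledger shape
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    newly_applied = []
    for path in sorted(Path(migrations_dir).glob("*.sql")):
        if path.name.endswith(".down.sql"):
            continue  # paired rollback scripts are never run at startup
        match = MIGRATION_RE.match(path.name)
        if match is None or match.group(1) in applied:
            continue  # not a migration file, or already in the ledger
        conn.executescript(path.read_text())
        conn.execute(
            "INSERT INTO schema_migrations (version) VALUES (?)",
            (match.group(1),),
        )
        conn.commit()
        newly_applied.append(path.name)
    return newly_applied
```

Running it twice against the same database applies nothing the second time, which is what makes it safe to call on every boot. Point it at a copy of data/crm.db when trying out a new migration.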
Directory layout (day-one)
- backend/server.py: the CRM monolith: HTTP handler, route dispatch, init_db(), auth (username/password → HS256 JWT, roles admin/member).
- backend/core_migrations.py + backend/migrations/NNNN_*.sql (+ paired .down.sql): additive schema migrations, applied at startup.
- backend/thesis_seed.py: Thesis Workshop seed + idempotent ensure_* one-time seeders (interaction_log sentinels), wired in server.init_db().
- backend/thesis_review.py: thesis version review/approval (human dual sign-off → canonical).
- backend/mcp/: architect_agent.py (Claude thesis copilot), architect_tools.py (thesis CRUD/versions), outreach_agent.py (LP draft assistant), architect_grounding.py, crm_tools.py, server.py (FastMCP).
- backend/email_integration/: Gmail capture via domain-wide delegation: credentials.py, matcher.py, parser.py, db.py, sync.py, scheduler.py, routes.py, compose.py (Tier-B draft creation), migrations/.
- backend/redaction/: scrub.py + client.py: the scrub→Claude→re-hydrate privacy boundary (Boundary, SCRUB_BACKEND=local|gateway, fail-closed).
- backend/ingest/: chunk→embed→Qdrant + retrieval modes (search.py, embed.py, qdrant_io.py, sparse.py, entity_resolution.py).
- backend/entity_*.py: entity resolution/merge (the two-investor-model reconciliation).
- frontend/index.html: the entire UI.
- docs/: Ten31_Agentic_Build_Plan.md (architecture), PHASE_0.md / PHASE_1.md, EMBEDDINGS.md (retrieval contract), crm-overview.md (schema/API tour), thesis-handoff.md, ten31-constitution.md (full constitution + guardrails).
- start9/0.4/: StartOS package: startos/utils.ts (PACKAGE_VERSION), startos/versions/, Dockerfile, docker_entrypoint.sh, Makefile, s9pk.mk.
- data/crm.db: the live DB (gitignored).
- .env / .env.example: config (.env gitignored).
Conventions
- Two coexisting investor models (classic contacts/lp_profiles + the fundraising_* grid). Reconciling them to canonical IDs is the core entity-resolution task; see docs/crm-overview.md.
- Migrations are additive + reversible only: numbered NNNN_*.sql with a paired NNNN_*.down.sql. SQLite ALTER = add-column/rename only.
- One-time seeds/backfills are idempotent via interaction_log sentinels (the ensure_* pattern), wired into init_db, so they are safe to re-run on every boot.
- Soft-delete only: deleted_at and/or status='retired'; never hard-delete. _node_tree and create_thesis_version filter on deleted_at IS NULL and ignore status, so to drop a node from the live agent prompt AND version snapshots you must set deleted_at, not just status.
- Thesis canonical gate: node status is draft|candidate|approved|retired (the working tree); a canonical thesis_version is frozen ONLY by human dual sign-off (thesis_review). Code/seeds never set a version canonical.
- Env: secrets in .env (gitignored); names in .env.example. Verified names: ANTHROPIC_API_KEY, SPARK_CONTROL_URL, SPARK_CONTROL_VERIFY_TLS, QDRANT_URL, X_API_KEY, CRM_DB_PATH, CRM_DEV_DB_PATH. Also used: CRM_SECRET_KEY (beta/prod), CRM_HOST/CRM_PORT (start.sh), CRM_DATA_DIR.
- Commit style: imperative subject, concise body explaining the why; put the package version in the subject (… (v0.1.0:NN)) for shippable changes. No AI co-author / attribution trailers: commits are authored by the user. (Older history carries a Co-Authored-By: Claude trailer; dropped going forward.)
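The sentinel-based ensure_* convention above can be pictured with a small sketch. The helper name ensure_once, the interaction_log columns (kind, note), and the 'sentinel' marker value are assumptions for illustration; the real seeders in backend/thesis_seed.py may record their sentinels differently.

```python
import sqlite3


def ensure_once(conn: sqlite3.Connection, sentinel: str, seed_fn) -> bool:
    """Run seed_fn at most once per database; return True if it ran now."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS interaction_log ("
        " id INTEGER PRIMARY KEY,"
        " kind TEXT NOT NULL,"
        " note TEXT NOT NULL)"  # assumed minimal shape, for the sketch only
    )
    already_done = conn.execute(
        "SELECT 1 FROM interaction_log WHERE kind = 'sentinel' AND note = ?",
        (sentinel,),
    ).fetchone()
    if already_done:
        return False  # a previous boot already ran this seeder
    with conn:  # seed rows and sentinel commit together, or roll back together
        seed_fn(conn)
        conn.execute(
            "INSERT INTO interaction_log (kind, note) VALUES ('sentinel', ?)",
            (sentinel,),
        )
    return True
```

Because the sentinel is written in the same transaction as the seed rows, a crash mid-seed leaves no sentinel behind and the next boot retries from a clean state.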
Always
- Bump the version before building an s9pk: edit PACKAGE_VERSION in start9/0.4/startos/utils.ts, add start9/0.4/startos/versions/v0.1.0.NN.ts, and register it in versions/index.ts (import, set current, move the prior current into other[]). Start9 0.4.x ignores a same-version rebuild.
- Verify before shipping: python3 -m py_compile the edited files; for DB logic, run the change against a copy of data/crm.db.
- Make migrations/seeders deployment-state-invariant and idempotent: target rows structurally, not by transient text the same change mutates; capture prior state so a revert is exact. (Learned the hard way: matching old nodes by a body string the same changeset deleted broke fresh DBs.)
- Keep real LP data out of Claude: develop only on code/schema/synthetic-or-locally-redacted data; route any real record substance through backend/redaction before it reaches a Claude model.
- Get explicit user authorization before any production deploy/install to $START9_BOX_HOST.
- Ship a paired .down.sql with every new migration.
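As a rough picture of what "route through the redaction boundary" means, the sketch below scrubs identifiers into placeholder tokens, calls a model function on the scrubbed text only, and re-hydrates the reply. It does not reproduce the backend/redaction API: scrub, rehydrate, call_through_boundary, LeakError, the token format, and the email regex are all hypothetical, and SCRUB_BACKEND selection is not modeled.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LeakError(RuntimeError):
    """Raised when scrubbed text still contains a known identifier."""


def scrub(text: str, known_names):
    """Replace known names and email addresses with placeholder tokens."""
    mapping = {}  # token -> original value

    def token_for(value: str, kind: str) -> str:
        for token, original in mapping.items():
            if original == value:
                return token  # reuse the token for a repeated value
        token = f"[{kind}_{len(mapping) + 1}]"
        mapping[token] = value
        return token

    # Longest names first so a full name wins over a shorter overlapping one.
    for name in sorted(known_names, key=len, reverse=True):
        if name in text:
            text = text.replace(name, token_for(name, "NAME"))
    text = EMAIL_RE.sub(lambda m: token_for(m.group(0), "EMAIL"), text)
    return text, mapping


def rehydrate(text: str, mapping: dict) -> str:
    """Put the original values back into the model's reply."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


def call_through_boundary(text, known_names, model_fn, scrub_fn=scrub):
    """Scrub, verify, call the model, re-hydrate. Fails closed on any leak."""
    scrubbed, mapping = scrub_fn(text, known_names)
    leaked = [n for n in known_names if n in scrubbed]
    if leaked or EMAIL_RE.search(scrubbed):
        raise LeakError("refusing to call the model: scrub left identifiers")
    return rehydrate(model_fn(scrubbed), mapping)
```

The check between scrubbing and the model call is the fail-closed part: if the scrubber misbehaves, the call is refused instead of proceeding with unredacted text.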
Never
- Never treat Qdrant (or any derived index) as source of truth — the CRM/SQLite is canonical and rebuildable-from.
- Never hard-delete CRM records or thesis history — soft-delete/archive only.
- Never let an agent send email, post, or contact an LP autonomously — agents draft; a human approves and sends.
- Never set a thesis_version canonical from code/seeds; that is human dual sign-off.
- Never call a Spark directly; go through Spark Control (SPARK_CONTROL_URL).
- Never commit secrets, data/crm.db, .env, backups, or .claude/ (all gitignored). Scan staged files before committing.
- Never bulk-export the LP list to any third party; send only minimal non-sensitive context to Claude.
- Never assume FastAPI / SQLAlchemy / pytest are in play: they sit in requirements.txt unused; runtime is stdlib + SQLite.
- Never add a Co-Authored-By / "Generated with" trailer to commits or PRs; commits are the user's.
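The soft-delete rule has a subtlety worth seeing in miniature: the tree and snapshot builders filter on deleted_at IS NULL and ignore status, so retiring a node does not remove it from the live set. The sketch below assumes a hypothetical thesis_nodes table and helper names (make_demo_db, live_nodes, retire, soft_delete); only the deleted_at / status='retired' semantics come from this file.

```python
import sqlite3
from datetime import datetime, timezone


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with two synthetic nodes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE thesis_nodes ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'draft',"
        " deleted_at TEXT)"  # NULL means the node is live
    )
    conn.executemany(
        "INSERT INTO thesis_nodes (title) VALUES (?)", [("A",), ("B",)]
    )
    return conn


def live_nodes(conn: sqlite3.Connection) -> list:
    """Same filter as described above: deleted_at IS NULL, status ignored."""
    rows = conn.execute(
        "SELECT title FROM thesis_nodes WHERE deleted_at IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


def retire(conn: sqlite3.Connection, node_id: int) -> None:
    """Mark a node retired; it still appears in live_nodes()."""
    conn.execute("UPDATE thesis_nodes SET status = 'retired' WHERE id = ?", (node_id,))


def soft_delete(conn: sqlite3.Connection, node_id: int) -> None:
    """Stamp deleted_at; the row stays in the table but leaves the live set."""
    conn.execute(
        "UPDATE thesis_nodes SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        (datetime.now(timezone.utc).isoformat(), node_id),
    )
```

After retire(conn, 1) the node is still returned by live_nodes(); only soft_delete(conn, 1) drops it, and the row itself is never removed.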
Deeper docs
- Full constitution + guardrails: docs/ten31-constitution.md. TODO: consider folding its still-current content into this file and retiring the separate doc.
- Architecture & rationale: docs/Ten31_Agentic_Build_Plan.md
- Retrieval/embeddings contract: docs/EMBEDDINGS.md
- CRM schema/API tour: docs/crm-overview.md
- Current thesis handoff: docs/thesis-handoff.md
Current state
Phase 0 substrate + Phase 1 thesis/outreach are built; current package is v0.1.0:73. Longer-term backlog: ROADMAP.md.
- Working (all draft-only): CRM + ingest (chunk→embed→Qdrant + retrieval) + redaction boundary; Gmail capture (DWD) + email-activity propose→approve; Thesis Workshop + Architect (Claude) with dual-approval gate; Outreach Draft Assistant + follow-up radar + per-user voice + Tier-B in-thread Gmail draft creation.
- In progress: v0.1.0:73 is committed and built but not installed; the box ($START9_BOX_HOST) runs v0.1.0:72, awaiting deploy authorization. On boot, ensure_thesis_v2_promoted makes the v2.0 reserve-asset spine the working approved spine (node-level, reversible).
- Decided, not yet built: CRM is the canonical thesis backbone with the signal-engine reading from it (reconciliation unwired); reply-all for Tier-B drafts is next (drafts currently reply to the LP only).
- Known gaps: the v2.0 spine is the working spine but not a canonical thesis_version (needs Grant + Jonathan dual sign-off); Appendix-A conviction/exposure (incl. ~40% Strike) stay Grant's working read, not canonical and not fed to the engine; on an already-seeded box the AI/energy-operator segment angle still shows old copy (gated on the banner decision); live features are unverified on the box.
- Next: 1) deploy v0.1.0:73 (on OK); 2) Grant + Jonathan freeze v2.0 canonical in the Workshop; 3) build reply-all; 4) confirm Appendix-A figures + Maple/OpenSecret/Primal, then promote; 5) verify live features on the box.