ten31-database/docs/Ten31_Agentic_Build_Plan.md
Keysat fffc90c7a4 Replace v5 settlement spine with v2.0 reserve-asset spine (v0.1.0:73)
Swap the dead "scarcity as the connecting idea" / bitcoin-as-settlement
spine for the v2.0 reserve-asset spine (bitcoin = apex non-debasable
reserve asset; debasement = forcing function; AI = abundance engine;
throughline is an asset-value/capital-flow claim, not settlement; three
seams Energy<->Compute, Debasement<->Bitcoin, AI<->Data-Ownership)
everywhere it was still encoded in live code, the seed, and the docs.

- architect_agent.py / outreach_agent.py: both system prompts carried
  "scarcity as the connecting idea" and shipped settlement framing into
  every generated draft; rewritten to the reserve-asset spine.
- thesis_seed.py: THROUGHLINE, PILLAR_1, the AI/energy-operator segment
  angle, and THESIS_V2 corrected and voice-cleaned (no em dash / "X, not
  Y" / "bet"). PILLAR_2/3 (real revenue, founder access) kept.
- ensure_thesis_v2_promoted / revert_thesis_v2_promotion: make the v2.0
  spine the working APPROVED spine and re-ground/clean the core nodes,
  deployment-state-invariant (structural targeting, not body text) and
  fully reversible (captures prior body/title/status/deleted_at). NODE
  level only: never sets a thesis_version canonical (guardrail #4); no
  hard deletes (guardrail #3). Wired into init_db after the v2 candidate
  stage.
- docs/thesis-handoff.md replaced wholesale with the complete v2.0 doc;
  Ten31_Agentic_Build_Plan.md + PHASE_1.md throughline glosses updated.

The v2.0 spine remains an unratified draft from the signal-engine
workstream: canonical freeze stays the partners' dual sign-off, and
Appendix-A conviction/exposure figures stay Grant's working read.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 08:22:24 -05:00


Ten31 — Agentic Capability Build Plan

Working document. Purpose: a concrete, sequenced plan for building an in-house system of AI agents to widen the top of the fundraising funnel, refine and propagate Ten31's thesis, and automate marketing/branding workflows — built with internal resources using Claude and Claude Code as the engineering partner.


1. Approach in one paragraph

Build six agents — five workers plus a lightweight orchestrator — on the Claude Agent SDK, connected to your systems through MCP. Run the reasoning on Claude (frontier-quality judgment for research, messaging, drafting). Self-host the data layer and the privacy-sensitive model work on your existing Start9 server and your dual DGX Sparks. Buy nothing for the core: your self-built CRM becomes the system of record, and your existing Gmail/Superhuman + calendar connectors supply the relationship data. The real unit of reuse is not the agent count — it is one shared LP graph (your CRM) plus a library of skills every agent draws from.


2. Guiding principles

  1. Sovereignty first. Sensitive LP and relationship data stays on infrastructure you control (Start9 + DGX Sparks). Only the minimum necessary context per call ever reaches a third-party model API.
  2. Frontier reasoning where it is best-in-class; local where privacy or cost dominate. Claude for hard agentic reasoning and LP-facing output; local open models for embeddings, redaction, triage, transcription, and reasoning over data that must not leave your walls.
  3. Human-in-the-loop on anything outbound or thesis-defining. Agents draft and prepare; partners approve and send.
  4. Compliant by design. Log every agent action; gate all outbound; bring counsel in before any cold outreach goes live.
  5. One source of truth. Every agent reads from and writes to the same LP graph, so research → outreach → nurture → meeting prep compound instead of fragmenting.

3. The agent roster (6)

| Agent | Job | Cadence | Brain | Human gate |
| --- | --- | --- | --- | --- |
| Scout | Watches sources (X/nostr, filings, treasury announcements, conference rosters, podcast networks); flags trigger events; populates the pipeline. | Continuous / scheduled | Local (triage) + Claude (judgment calls) | None (internal only) |
| Analyst | Builds LP dossiers, enriches records, maps shortest warm-intro path through the team's network. | On-demand + triggered | Claude (synthesis); local for RAG/embeddings | None (internal only) |
| Architect | Thesis articulation. Owns and refines the canonical messaging, i.e. the reserve-asset throughline: as fiat debases and AI commoditizes the reproducible, value accrues to the scarce side of one supply chain (energy, compute, and bitcoin as the non-debasable reserve asset), structured on three seams (Energy↔Compute, Debasement↔Bitcoin, AI↔Data-Ownership). The copilot partners sit with to sharpen the narrative. Output = a living "messaging source of truth." | On-demand, collaborative | Claude | Partner sign-off on canonical thesis |
| Scribe | Distribution / amplification. Takes the Architect's canonical thesis + your content (Bitcoin Alpha, partner shows, memos) and propagates segment-specific cuts across X, nostr, LinkedIn, email. | Scheduled + on-demand | Claude | Review before publish |
| Closer | Drafts personalized outreach and nurture sequences, preps partners before LP calls, writes follow-ups, keeps the CRM clean. | Triggered + on-demand | Claude | Hard gate: human sends all outbound |
| Orchestrator ("Chief of Staff") | Schedules runs, routes work between agents, escalates to a human. | Always on | Claude (light) | n/a |

Why Architect and Scribe are separate. Distribution is high-frequency and semi-mechanical; thesis articulation is low-frequency, high-judgment, and collaborative. Keeping them apart lets the Architect own a stable, partner-approved narrative that the Scribe then propagates consistently everywhere.


4. Architecture and hosting map

4.1 Model layer

  • Claude (API) — the brains for Analyst synthesis, Architect thesis work, Scribe drafting, Closer judgment, and Orchestrator routing. Use a stronger model for Architect/Analyst, a faster one for high-volume Scout/Closer tasks.
  • Local model on the DGX Sparks — current local model is Qwen3.6 35B-A3B running on a single Spark. Used for PII redaction before any data leaves your walls, inbound triage/classification, transcription orchestration, structuring/extraction, and local reasoning over data you choose never to send out.
    • The A3B (~3B active params) design means only a small slice of the model runs per token, so it largely sidesteps the Spark's memory-bandwidth limit and keeps decode fast despite being a 35B-total model. No need to link both Sparks for a larger model — that earlier ceiling is moot for this workload.
    • Embeddings + reranking (shipped, Spark Control v0.15.0). Retrieval runs on BAAI/bge-m3 (dense, 1024-dim, L2-normalized) plus BAAI/bge-reranker-v2-m3 (cross-encoder), served by spark-embed — a small FastAPI server on Spark 2 built from the NGC PyTorch image (HF TEI was ruled out: no arm64 CUDA image). Exposed through Spark Control as /v1/embeddings, /v1/rerank, and /api/search (orchestrated hybrid retrieval). Combined GPU footprint on Spark 2 is trivial (~3 GB).
    • Spark allocation. Spark 1 = LLM serving (hot KV cache). Spark 2 = embeddings + reranker + audio + the Qdrant vector index. Both Sparks are treated as always-on production infrastructure.
  • All local model services are fronted by Spark Control (the self-hosted gateway on Start9): agents hit one trusted URL for chat, embeddings, rerank, transcription, and TTS, with shared TLS, access control, and observability.
  • Auth note: Agent SDK agents must authenticate with an API key, not a claude.ai login.

4.2 Data layer — the LP graph (self-hosted)

  • The CRM (self-hosted on Start9) is the canonical system of record. Extend it to be the LP graph. Add: prospect/LP schema fields (thesis fit, segment, accreditation/QP status, warmth score, source, owner, last-touch), an interaction log (every agent action + every human touch), a derived relationship graph table, and canonical entity IDs for entity resolution (see ingest pipeline).
  • Vector store: Qdrant on Spark 2 (settled). Holds the embedded chunks. It is a rebuildable, derived index, not a second source of truth: if lost, it re-embeds from the CRM in minutes. Qdrant provides dense search, sparse BM25 scoring (IDF applied in Qdrant over client-generated sparse vectors), payload filtering, and Reciprocal Rank Fusion in one service.
  • Retrieval pipeline. One orchestrated call to Spark Control /api/search: embed query (BGE-M3) → Qdrant dense + BM25 RRF with payload pre-filter → cross-encoder rerank → top_k. BM25 is generated client-side via FastEmbed (Qdrant/bm25) at both ingest and query time, with Qdrant applying IDF over your corpus — so domain entities (LP names, tickers, portfolio companies) are weighted by your own term statistics rather than BGE-M3's general-web sparse weights.
  • Ingest pipeline (the real Phase 0 work). CRM record/change → chunk (one chunk per email/note/transcript-turn; one per memo section; time-aware; entities + date_ts kept as filterable payload, not embedded text) → resolve entities to a canonical lp_id (lightweight local-Qwen step) → produce both a dense vector (/v1/embeddings) and a sparse BM25 vector (FastEmbed) → upsert both + payload to Qdrant directly (not via the gateway). One-time backfill + idempotent incremental sync. Full recipe: docs/EMBEDDINGS.md.
  • Per-agent retrieval modes. Don't force one pipeline on all agents. Build a small library the orchestrator picks from: high-recall dense at large K (Scout), high-precision keyword/BM25 (Closer — "did we ever discuss X with this LP?"), long-context + rerank (Architect). The CRM MCP server exposes these as tools.
  • Wrap the CRM in an MCP server so all agents read/write through one uniform interface, including the retrieval modes above. Because the CRM is self-built, any endpoint the agents need can be added.
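
The fusion step in the retrieval pipeline above (dense + BM25 combined by Reciprocal Rank Fusion) can be illustrated in a few lines. This is a minimal sketch of the general RRF technique, not the code Qdrant or Spark Control runs: the function name `rrf_fuse`, the chunk IDs, and the constant `k=60` (the value from the original RRF paper) are assumptions for illustration, and Qdrant's internal constant may differ.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=5):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed constant (hypothetical
    for this deployment).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    fused = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return fused[:top_k]


# Hypothetical result lists from the dense and BM25 legs of one query.
dense_hits = ["c7", "c2", "c9", "c4"]
bm25_hits = ["c2", "c4", "c7", "c1"]
fused = rrf_fuse([dense_hits, bm25_hits])
fused_ids = [chunk_id for chunk_id, _ in fused]
```

Chunks that rank well in both legs (here `c2` and `c7`) rise above chunks that appear in only one, which is the property that makes RRF a reasonable default before the cross-encoder rerank.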

4.3 Integration layer (MCP fabric)

  • MCP servers to stand up / connect:
    • CRM / LP graph (custom, self-hosted) — primary.
    • Email + calendar — Gmail/Superhuman connectors are already live; these feed Closer (drafting, follow-ups) and the Analyst's warm-path derivation.
    • Drive / notes — internal documents and memos.
    • Publishing channels — X, nostr, LinkedIn, email/newsletter (for Scribe).
    • Public data sources — filings, web search, and the X API (official key in hand) for Scout/Analyst enrichment. X is a primary source here: per-prospect public profile/bio/activity and follower-following overlap for thesis-fit scoring and mutual-connection discovery (Analyst), plus account/list/keyword monitoring and follower-graph signals (Scout). Confirm what your X access tier permits (full-archive search, follower-graph pulls, streaming) — that sets the ceiling on heavier monitoring. nostr APIs as a complementary source.

4.4 Orchestration / runtime

  • Inner loop: Claude Agent SDK handles each agent's tool-use loop and context management.
  • Outer loop: a thin workflow engine decides when and which agent runs (Temporal for durable retries, or simpler cron/queue + n8n glue to start).
  • Observability: structured logging of every agent action, with a simple dashboard. Required for both debugging and compliance.
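
To make the "simpler cron/queue" option for the outer loop concrete, here is a minimal sketch of a queue that decides which agent runs when, retries failures with backoff, and escalates to a human after repeated failure. Everything in it is hypothetical (the `run_due` function, the tuple layout, the retry limits); it is not Temporal's or n8n's API, and a durable engine would persist the queue rather than hold it in memory.

```python
import heapq


def run_due(queue, handlers, now, max_attempts=3, backoff_s=60):
    """Run every job due at `now`; requeue failures with linear backoff.

    Queue entries are (run_at, seq, agent, attempt) tuples on a heap.
    Returns (completed, escalated) lists of agent names.
    """
    completed, escalated = [], []
    while queue and queue[0][0] <= now:
        run_at, seq, agent, attempt = heapq.heappop(queue)
        try:
            handlers[agent]()
            completed.append(agent)
        except Exception:
            if attempt + 1 >= max_attempts:
                # Out of retries: hand the job to a human.
                escalated.append(agent)
            else:
                retry_at = now + backoff_s * (attempt + 1)
                heapq.heappush(queue, (retry_at, seq, agent, attempt + 1))
    return completed, escalated


def flaky():
    raise RuntimeError("source unavailable")


# Hypothetical schedule: scout due at t=0, scribe due at t=100 and always failing.
handlers = {"scout": lambda: None, "scribe": flaky}
queue = []
heapq.heappush(queue, (0, 0, "scout", 0))
heapq.heappush(queue, (100, 1, "scribe", 0))

first = run_due(queue, handlers, now=0)
second = run_due(queue, handlers, now=100)
third = run_due(queue, handlers, now=160)
fourth = run_due(queue, handlers, now=280)
```

The trade-off against Temporal is durability: this loop loses its queue on restart, which is acceptable while the early phases are internal-only and becomes less so once the Closer is live.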

4.5 Enrichment (privacy-preserving)

  • Default: one-way, per-prospect public lookups that write results into the CRM. Never upload the LP list to a third party. The X API is the workhorse here — public, per-prospect, ToS-compliant via the official key — and its follower-graph data complements the email/calendar relationship graph for warm-path mapping.
  • Optional: a self-hosted scraper/enrichment pipeline on the Sparks if you want zero third-party API exposure.

4.6 Redaction / re-hydration boundary (Claude-facing reasoning)

  • For the steps where an agent must have Claude reason over LP-specific content (Analyst dossiers, Closer drafting), a local scrub → reason → re-hydrate round-trip keeps identifiers off the third-party API: the Sparks pseudonymize names/orgs/amounts to stable placeholders, Claude reasons over the de-identified prompt, and real values are swapped back locally before a human reviews. The ingest/retrieval path is already fully local and needs none of this.
  • This is designed now, built in Phase 2/3 (it is not needed in Phase 0). Full design: docs/redaction-rehydration.md.
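
The scrub → reason → re-hydrate round-trip described above can be sketched as follows. This is an illustration under stated assumptions, not the design in docs/redaction-rehydration.md: the `Pseudonymizer` class, the placeholder format `[KIND_n]`, and the sample names are all hypothetical, and entity detection (which the plan assigns to the local model) is replaced here by a hand-supplied dictionary.

```python
import re


class Pseudonymizer:
    """Maps real identifiers to stable placeholders and back, all in local memory."""

    def __init__(self):
        self._forward = {}   # real value -> placeholder
        self._reverse = {}   # placeholder -> real value
        self._counters = {}  # kind -> last number issued

    def _placeholder(self, kind, value):
        # The same real value always gets the same placeholder (stability).
        if value not in self._forward:
            n = self._counters.get(kind, 0) + 1
            self._counters[kind] = n
            token = f"[{kind}_{n}]"
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def scrub(self, text, entities):
        """entities maps real strings to a kind such as 'PERSON' (assumed input)."""
        # Replace longer strings first so a short entity cannot split a long one.
        for value in sorted(entities, key=len, reverse=True):
            text = text.replace(value, self._placeholder(entities[value], value))
        return text

    def rehydrate(self, text):
        # Swap known placeholders back; unknown ones are left untouched.
        return re.sub(
            r"\[[A-Z]+_\d+\]",
            lambda m: self._reverse.get(m.group(0), m.group(0)),
            text,
        )


# Hypothetical example data.
entities = {"Jane Doe": "PERSON", "Acme Family Office": "ORG", "$5M": "AMOUNT"}
note = "Jane Doe at Acme Family Office is weighing a $5M commitment. Follow up with Jane Doe."

p = Pseudonymizer()
scrubbed = p.scrub(note, entities)           # this is what would reach the API
model_reply = "Draft for [PERSON_1] regarding the [AMOUNT_1] commitment."
final = p.rehydrate(model_reply)             # restored locally before review
```

Keeping the mapping only in local memory (or local storage) is the point of the boundary: the third-party API sees placeholders, and the real values never leave the Sparks.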

5. Build sequence

Phase 0 — Foundation

The substrate: data layer + retrieval, no live-in-the-wild agents yet. Division of labor:

  • Spark developer (their side): spark-embed serving BGE-M3 + BGE-Reranker-v2-m3 (TEI was ruled out; see 4.1) and Qdrant on Spark 2, exposed via Spark Control /v1/embeddings + /v1/rerank + /api/search.
  • Claude Code + you (this project):
    1. Read the CRM code; document the storage engine, schema, and API surface.
    2. Extend the CRM schema (LP/prospect fields, interaction log, relationship graph, canonical entity IDs).
    3. Build the ingest/sync pipeline (chunking + entity resolution + metadata payloads; backfill + incremental).
    4. Build the CRM MCP server wrapping CRM reads/writes and the per-agent retrieval modes.
    5. Bring counsel in to define outbound and recordkeeping rules so the system is compliant from day one.
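
One way to get the "idempotent incremental sync" property in step 3 is to derive each chunk's point ID deterministically from its CRM record, so re-ingesting the same record overwrites rather than duplicates. The sketch below assumes this approach; it is not the recipe in docs/EMBEDDINGS.md. The namespace string, the `make_chunk` and `upsert` helpers, and the sample record are hypothetical, and a plain dict stands in for Qdrant.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical namespace; any fixed UUID works as long as it never changes.
CHUNK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "crm-chunks.example")


@dataclass
class Chunk:
    point_id: str
    text: str      # the only part that gets embedded
    payload: dict  # filterable metadata, kept out of the embedded text


def make_chunk(record_id, part_index, text, lp_id, entities, when):
    # Same record + part always yields the same ID, so upserts are idempotent.
    point_id = str(uuid.uuid5(CHUNK_NAMESPACE, f"{record_id}:{part_index}"))
    payload = {
        "record_id": record_id,
        "lp_id": lp_id,
        "entities": sorted(entities),
        "date_ts": int(when.timestamp()),
    }
    return Chunk(point_id, text, payload)


def upsert(index, chunks):
    """Dict stand-in for a vector-store upsert: last write wins per point ID."""
    for chunk in chunks:
        index[chunk.point_id] = chunk


# Hypothetical email with two chunks, ingested twice (backfill, then re-sync).
when = datetime(2026, 1, 15, tzinfo=timezone.utc)
chunks = [
    make_chunk("email-123", 0, "Thanks for the call last week.", "lp-42", {"Acme"}, when),
    make_chunk("email-123", 1, "Sending the memo shortly.", "lp-42", {"Acme"}, when),
]
index = {}
upsert(index, chunks)
upsert(index, chunks)
```

Because the index is derived and rebuildable, deterministic IDs also make a full re-embed from the CRM safe to run at any time.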

Phase 1 — Architect + Scribe

  • Stand up the Architect first: encode the current thesis, voice, and segment definitions as skills; use it collaboratively to produce the canonical messaging source of truth.
  • Then Scribe: propagate that thesis into segment-specific content with human review before publish.
  • Lowest risk, highest immediate awareness ROI, never touches cold outreach — and it proves the full pattern (SDK + skills + MCP + human review).

Phase 2 — Scout + Analyst

  • Scout populates the pipeline from public signals (X monitoring via the API key); Analyst builds dossiers and derives warm paths from your own email/calendar graph plus X follower-graph overlap.
  • Internal-facing, still no outbound. This is where the Sparks earn their keep (bulk classification, embeddings, RAG).

Phase 3 — Closer + Orchestrator

  • Closer drafts outbound, nurture, and meeting prep — with hard human-in-the-loop gates and full logging. Highest-risk and regulated, so it comes last.
  • Orchestrator added once there are multiple agents to coordinate and schedule.

6. Team and ownership model

  • Engineering partner: Claude + Claude Code, supplying Agent SDK and MCP fluency, scaffolding the agents, writing the MCP servers and orchestration, and customizing the Start9 CRM package.
  • Operator: you (and your partner). You own deployment, secrets/key management, uptime, and the human-review gates. Your prior Start9 CRM build demonstrates this is well within reach.
  • The one real risk is time, not capability. Removing the part-time data/ops hire means operational ownership lands on the partners. If partner time is scarce, that — not tooling or skill — is the constraint to manage. Mitigations: keep the early phases internal-only (no on-call urgency), automate logging/monitoring, and stage the highest-maintenance agent (Closer) last.

7. Compliance by design

  • Log every agent action and every outbound draft.
  • Gate all outbound through human send.
  • Resolve solicitation posture (e.g. 506(b) vs 506(c)), accreditation/QP verification, and recordkeeping with counsel before the Closer touches cold outreach.
  • Start with distribution and inbound nurture, where constraints are lightest.
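
The first two rules above (log everything, gate all outbound through a human) can be enforced in code rather than by convention. The following is a minimal sketch under assumptions: the `OutboundGate` class, its method names, and the in-memory log are hypothetical, and a real build would write entries to the CRM interaction log instead of a Python list.

```python
import json
from datetime import datetime, timezone


class OutboundGate:
    """Holds agent drafts until a named human approves them; logs every step."""

    def __init__(self):
        self.drafts = {}     # draft_id -> {"body": str, "status": str}
        self.audit_log = []  # append-only JSON lines (in memory for this sketch)

    def _log(self, actor, action, draft_id):
        entry = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
            "draft_id": draft_id,
        }
        self.audit_log.append(json.dumps(entry, sort_keys=True))

    def submit(self, agent, draft_id, body):
        self.drafts[draft_id] = {"body": body, "status": "pending"}
        self._log(agent, "draft_submitted", draft_id)

    def approve(self, partner, draft_id):
        self.drafts[draft_id]["status"] = "approved"
        self._log(partner, "draft_approved", draft_id)

    def send(self, partner, draft_id, transport):
        draft = self.drafts[draft_id]
        if draft["status"] != "approved":
            # Blocked attempts are logged too, for the compliance record.
            self._log(partner, "send_blocked", draft_id)
            raise PermissionError(f"draft {draft_id} is not approved")
        transport(draft["body"])
        draft["status"] = "sent"
        self._log(partner, "draft_sent", draft_id)


# Hypothetical flow: the Closer drafts, a partner approves and sends.
gate = OutboundGate()
outbox = []
gate.submit("closer", "d1", "Hello from Ten31")
try:
    gate.send("partner_a", "d1", outbox.append)  # not yet approved
except PermissionError:
    pass
gate.approve("partner_a", "d1")
gate.send("partner_a", "d1", outbox.append)
actions = [json.loads(line)["action"] for line in gate.audit_log]
```

Note that agents never hold the transport: only the `send` call made by a human passes one in, so an agent cannot bypass the gate by calling the mail connector directly.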

8. Open decisions

Resolved: local chat/triage model = Qwen3.6 35B-A3B (Spark 1); embedding = BAAI/bge-m3 dense 1024-dim; reranker = BAAI/bge-reranker-v2-m3; vector DB = Qdrant v1.16.0 on Spark 2; serving = spark-embed (custom FastAPI on NGC PyTorch image, not TEI); BM25 sparse generated client-side via FastEmbed (Qdrant/bm25); all fronted by Spark Control (/v1/embeddings, /v1/rerank, /api/search), shipped v0.15.0. Embedding-model A/B upgrade candidate if dense recall lags: Qwen3-Embedding-4B (same /v1/embeddings contract).

Still open:

  1. Workflow engine for the outer loop (Phase 3): Temporal vs. cron/queue + n8n to start.
  2. Whether any third-party enrichment API is acceptable, or X + fully self-hosted enrichment only.
  3. Confirm X API usage limits (full-archive search, follower-graph pulls, streaming) to size Scout's monitoring scope. (Current access is pay-as-you-go credits.)
  4. Segment definitions for the Architect/Scribe (who are the distinct LP audiences, and what does each one need to hear?).
  5. Embedding dimension/quantization: left at BGE-M3 native 1024-dim fp16 for now; no Matryoshka truncation or int8 is needed at this corpus scale.