# Ten31 — Agentic Capability Build Plan
*Working document. Purpose: a concrete, sequenced plan for building an in-house system of AI agents to widen the top of the fundraising funnel, refine and propagate Ten31's thesis, and automate marketing/branding workflows, built with internal resources using Claude and Claude Code as the engineering partner.*

---

## 1. Approach in one paragraph
Build **six agents** (five workers plus a lightweight orchestrator) on the **Claude Agent SDK**, connected to your systems through **MCP**. Run the *reasoning* on **Claude** (frontier-quality judgment for research, messaging, drafting). **Self-host the data layer and the privacy-sensitive model work** on your existing Start9 server and your **dual DGX Sparks**. **Buy nothing for the core**: your self-built CRM becomes the system of record, and your existing Gmail/Superhuman + calendar connectors supply the relationship data. The real unit of reuse is not the agent count; it is one shared **LP graph** (your CRM) plus a library of **skills** every agent draws from.

---

## 2. Guiding principles
1. **Sovereignty first.** Sensitive LP and relationship data stays on infrastructure you control (Start9 + DGX Sparks). Only the minimum necessary context per call ever reaches a third-party model API.
2. **Frontier reasoning where it is best-in-class; local where privacy or cost dominate.** Claude for hard agentic reasoning and LP-facing output; local open models for embeddings, redaction, triage, transcription, and reasoning over data that must not leave your walls.
3. **Human-in-the-loop on anything outbound or thesis-defining.** Agents draft and prepare; partners approve and send.
4. **Compliant by design.** Log every agent action; gate all outbound; bring counsel in before any cold outreach goes live.
5. **One source of truth.** Every agent reads from and writes to the same LP graph, so research → outreach → nurture → meeting prep compound instead of fragmenting.
---
## 3. The agent roster (6)
| Agent | Job | Cadence | Brain | Human gate |
|---|---|---|---|---|
| **Scout** | Watches sources (X/nostr, filings, treasury announcements, conference rosters, podcast networks); flags trigger events; populates the pipeline. | Continuous / scheduled | Local (triage) + Claude (judgment calls) | None (internal only) |
| **Analyst** | Builds LP dossiers, enriches records, maps shortest warm-intro path through the team's network. | On-demand + triggered | Claude (synthesis); local for RAG/embeddings | None (internal only) |
| **Architect** | **Thesis articulation.** Owns and refines the canonical messaging, namely the reserve-asset throughline: as fiat debases and AI commoditizes the reproducible, value accrues to the scarce side of one supply chain (energy, compute, and bitcoin as the non-debasable reserve asset), structured on three seams (Energy↔Compute, Debasement↔Bitcoin, AI↔Data-Ownership). The copilot partners sit with to sharpen the narrative. Output = a living "messaging source of truth." | On-demand, collaborative | Claude | Partner sign-off on canonical thesis |
| **Scribe** | **Distribution / amplification.** Takes the Architect's canonical thesis + your content (Bitcoin Alpha, partner shows, memos) and propagates segment-specific cuts across X, nostr, LinkedIn, email. | Scheduled + on-demand | Claude | Review before publish |
| **Closer** | Drafts personalized outreach and nurture sequences, preps partners before LP calls, writes follow-ups, keeps the CRM clean. | Triggered + on-demand | Claude | **Hard gate**: human sends all outbound |
| **Orchestrator** ("Chief of Staff") | Schedules runs, routes work between agents, escalates to a human. | Always on | Claude (light) | n/a |

**Why Architect and Scribe are separate.** Distribution is high-frequency and semi-mechanical; thesis articulation is low-frequency, high-judgment, and collaborative. Keeping them apart lets the Architect own a stable, partner-approved narrative that the Scribe then propagates consistently everywhere.

---

## 4. Architecture and hosting map
### 4.1 Model layer
- **Claude (API)** — the brains for Analyst synthesis, Architect thesis work, Scribe drafting, Closer judgment, and Orchestrator routing. Use a stronger model for Architect/Analyst, a faster one for high-volume Scout/Closer tasks.
- **Local model on the DGX Sparks** — current local model is **Qwen3.6 35B-A3B running on a single Spark**. Used for PII redaction before any data leaves your walls, inbound triage/classification, transcription orchestration, structuring/extraction, and local reasoning over data you choose never to send out.
  - The **A3B (~3B active params)** design means only a small slice of the model runs per token, so it largely sidesteps the Spark's memory-bandwidth limit and keeps decode fast despite being a 35B-total model. No need to link both Sparks for a larger model; that earlier ceiling is moot for this workload.
- **Embeddings + reranking (shipped, Spark Control v0.15.0).** Retrieval runs on `BAAI/bge-m3` (dense, 1024-dim, L2-normalized) plus `BAAI/bge-reranker-v2-m3` (cross-encoder), served by **spark-embed** — a small FastAPI server on **Spark 2** built from the NGC PyTorch image (HF TEI was ruled out: no arm64 CUDA image). Exposed through Spark Control as `/v1/embeddings`, `/v1/rerank`, and `/api/search` (orchestrated hybrid retrieval). Combined GPU footprint on Spark 2 is trivial (~3 GB).
- **Spark allocation.** Spark 1 = LLM serving (hot KV cache). Spark 2 = embeddings + reranker + audio + the Qdrant vector index. Both Sparks are treated as always-on production infrastructure.
- **All local model services are fronted by Spark Control** (the self-hosted gateway on Start9): agents hit one trusted URL for chat, embeddings, rerank, transcription, and TTS, with shared TLS, access control, and observability.
- **Auth note:** Agent SDK agents must authenticate with an **API key**, not a claude.ai login.
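
The A3B point above can be checked with back-of-envelope arithmetic. This sketch assumes decode is memory-bandwidth-bound, so the ceiling on tokens per second is roughly bandwidth divided by the bytes of weights read per token. The bandwidth figure and the bytes-per-parameter value are illustrative assumptions, not measurements from your Sparks, and the estimate ignores KV-cache reads and expert-routing overhead.

```python
def decode_ceiling_tokens_per_sec(active_params_billion: float,
                                  bytes_per_param: float,
                                  mem_bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed when weight reads dominate.

    Each generated token must read every *active* weight once, so
    tokens/sec <= bandwidth / (active weights in GB).
    """
    gb_read_per_token = active_params_billion * bytes_per_param
    return mem_bandwidth_gb_s / gb_read_per_token


# Assumptions (labeled, not measured): nominal memory bandwidth per Spark
# and 8-bit weights. Swap in your own numbers.
ASSUMED_BANDWIDTH_GB_S = 273.0
ASSUMED_BYTES_PER_PARAM = 1.0

moe_ceiling = decode_ceiling_tokens_per_sec(
    3.0, ASSUMED_BYTES_PER_PARAM, ASSUMED_BANDWIDTH_GB_S)
dense_ceiling = decode_ceiling_tokens_per_sec(
    35.0, ASSUMED_BYTES_PER_PARAM, ASSUMED_BANDWIDTH_GB_S)

print(f"~3B active: {moe_ceiling:.1f} tok/s ceiling")
print(f"35B dense:  {dense_ceiling:.1f} tok/s ceiling")
```

Under these assumed inputs the ceiling is about 91 tokens/sec at 3B active parameters versus about 7.8 for a dense 35B model. The ratio (35/3) holds whatever the bandwidth is, which is the sense in which the active-parameter count, not the total, sets decode speed.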
### 4.2 Data layer — the LP graph (self-hosted)
- **The CRM (self-hosted on Start9) is the canonical system of record.** Extend it to be the LP graph. Add: prospect/LP schema fields (thesis fit, segment, accreditation/QP status, warmth score, source, owner, last-touch), an interaction log (every agent action + every human touch), a derived **relationship graph** table, and **canonical entity IDs** for entity resolution (see ingest pipeline).
- **Vector store: Qdrant on Spark 2 (settled).** Holds the embedded chunks. It is a **rebuildable, derived index**, not a second source of truth — if lost, it re-embeds from the CRM in minutes. Qdrant provides dense search + native BM25 + payload filtering + Reciprocal Rank Fusion in one service.
- **Retrieval pipeline.** One orchestrated call to Spark Control `/api/search`: embed query (BGE-M3) → Qdrant dense + BM25 RRF with payload pre-filter → cross-encoder rerank → top_k. BM25 is generated **client-side** via FastEmbed (`Qdrant/bm25`) at both ingest and query time, with Qdrant applying IDF over *your* corpus — so domain entities (LP names, tickers, portfolio companies) are weighted by your own term statistics rather than BGE-M3's general-web sparse weights.
- **Ingest pipeline (the real Phase 0 work).** CRM record/change → chunk (one chunk per email/note/transcript-turn; one per memo *section*; time-aware; entities + `date_ts` kept as filterable payload, not embedded text) → resolve entities to a canonical `lp_id` (lightweight local-Qwen step) → produce **both** a dense vector (`/v1/embeddings`) and a sparse BM25 vector (FastEmbed) → upsert both + payload to Qdrant **directly** (not via the gateway). One-time backfill + idempotent incremental sync. Full recipe: `docs/EMBEDDINGS.md`.
- **Per-agent retrieval modes.** Don't force one pipeline on all agents. Build a small library the orchestrator picks from: high-recall dense at large K (Scout), high-precision keyword/BM25 (Closer — "did we ever discuss X with this LP?"), long-context + rerank (Architect). The CRM MCP server exposes these as tools.
- **Wrap the CRM in an MCP server** so all agents read/write through one uniform interface, including the retrieval modes above. Because the CRM is self-built, any endpoint the agents need can be added.
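
For intuition on the fusion step in the retrieval pipeline, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. In the real pipeline Qdrant performs the fusion server-side; this only illustrates the idea. The chunk IDs are made up, and `k=60` is the constant commonly used for RRF, not a value confirmed for your Qdrant configuration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one.

    Each list contributes 1 / (k + rank) per item, with rank starting at 1.
    Items that rank well in more than one list rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the dense and BM25 legs of one query.
dense_hits = ["chunk_a", "chunk_b", "chunk_c"]
bm25_hits = ["chunk_c", "chunk_a", "chunk_d"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

`chunk_a` and `chunk_c` appear in both lists, so they outrank `chunk_b` and `chunk_d`, which each appear in only one. Because RRF uses ranks rather than raw scores, the dense and BM25 legs need no score normalization before fusion.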
### 4.3 Integration layer (MCP fabric)
- MCP servers to stand up / connect:
  - **CRM / LP graph** (custom, self-hosted): primary.
  - **Email + calendar**: Gmail/Superhuman connectors are already live; these feed Closer (drafting, follow-ups) and the Analyst's warm-path derivation.
  - **Drive / notes**: internal documents and memos.
  - **Publishing channels**: X, nostr, LinkedIn, email/newsletter (for Scribe).
  - **Public data sources**: filings, web search, and the **X API (official key in hand)** for Scout/Analyst enrichment. X is a primary source here: per-prospect public profile/bio/activity and follower-following overlap for thesis-fit scoring and mutual-connection discovery (Analyst), plus account/list/keyword monitoring and follower-graph signals (Scout). Confirm what your X access *tier* permits (full-archive search, follower-graph pulls, streaming); that sets the ceiling on heavier monitoring. nostr APIs as a complementary source.
### 4.4 Orchestration / runtime
- Inner loop: **Claude Agent SDK** handles each agent's tool-use loop and context management.
- Outer loop: a thin workflow engine decides *when* and *which* agent runs (Temporal for durable retries, or simpler cron/queue + n8n glue to start).
- **Observability:** structured logging of every agent action, with a simple dashboard. Required for both debugging and compliance.
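
One possible shape for the structured action log, sketched with only the Python standard library. The field names (`agent`, `action`, `lp_id`) and the logger name are illustrative assumptions, not a schema the CRM already defines.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Structured fields passed through `extra=` land on the record.
        entry.update(getattr(record, "agent_fields", {}))
        return json.dumps(entry, sort_keys=True)


def log_agent_action(logger, agent, action, **fields):
    """Record one agent action with arbitrary structured fields."""
    logger.info(
        "agent_action",
        extra={"agent_fields": {"agent": agent, "action": action, **fields}},
    )


logger = logging.getLogger("agent_audit")  # hypothetical logger name
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()  # writes to stderr; swap for a file or queue
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)

log_agent_action(logger, "scout", "flag_trigger_event", lp_id="lp_0001", source="x")
```

One JSON object per line keeps the log easy to load into a dashboard and easy to hand to counsel as an audit trail.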
### 4.5 Enrichment (privacy-preserving)
- Default: **one-way, per-prospect public lookups** that write results *into* the CRM. Never upload the LP list to a third party. The **X API** is the workhorse here — public, per-prospect, ToS-compliant via the official key — and its follower-graph data complements the email/calendar relationship graph for warm-path mapping.
- Optional: a **self-hosted scraper/enrichment pipeline on the Sparks** if you want zero third-party API exposure.
### 4.6 Redaction / re-hydration boundary (Claude-facing reasoning)
- For the steps where an agent must have **Claude reason over LP-specific content** (Analyst dossiers, Closer drafting), a local **scrub → reason → re-hydrate** round-trip keeps identifiers off the third-party API: the Sparks pseudonymize names/orgs/amounts to stable placeholders, Claude reasons over the de-identified prompt, and real values are swapped back locally before a human reviews. The ingest/retrieval path is already fully local and needs none of this.
- This is **designed now, built in Phase 2/3** (it is not needed in Phase 0). Full design: `docs/redaction-rehydration.md`.
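
The scrub → reason → re-hydrate round-trip can be sketched as follows. This is a toy illustration, not the design in `docs/redaction-rehydration.md`. Entity detection (which the Sparks would do) is stubbed with a hand-written dictionary, the Claude call is stubbed with a canned string, and every name and amount is fictional. The placeholder-to-value mapping never leaves the local side.

```python
def pseudonymize(text, entities):
    """Swap each known entity for a placeholder.

    `entities` maps surface string -> kind (e.g. "PERSON"). Returns the
    scrubbed text and the placeholder -> real value mapping, which stays local.
    """
    mapping = {}
    scrubbed = text
    # Longest strings first so a short entity never clobbers part of a longer one.
    ordered = sorted(entities.items(), key=lambda item: -len(item[0]))
    for index, (value, kind) in enumerate(ordered, start=1):
        placeholder = f"[{kind}_{index}]"
        mapping[placeholder] = value
        scrubbed = scrubbed.replace(value, placeholder)
    return scrubbed, mapping


def rehydrate(text, mapping):
    """Swap placeholders back to real values, locally, before human review."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


# Fictional example data; in practice a local model would detect the entities.
note = "Jane Doe at Acme Family Office may commit $5M."
detected = {"Jane Doe": "PERSON", "Acme Family Office": "ORG", "$5M": "AMOUNT"}

scrubbed, mapping = pseudonymize(note, detected)
print(scrubbed)  # only this de-identified text would go to the third-party API

# Stand-in for the model's reply over the de-identified prompt.
model_reply = "Draft: thank [PERSON_2] for the call about [ORG_1]."
print(rehydrate(model_reply, mapping))
```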
---
## 5. Build sequence
### Phase 0 — Foundation
The substrate: data layer + retrieval, no live-in-the-wild agents yet. Division of labor:
- **Spark developer (their side):** spark-embed serving BGE-M3 + BGE-Reranker-v2-m3, plus Qdrant, on Spark 2, exposed via Spark Control `/v1/embeddings` + `/v1/rerank` (TEI was ruled out; see 4.1).
- **Claude Code + you (this project):**
  1. Read the CRM code; document the storage engine, schema, and API surface.
  2. Extend the CRM schema (LP/prospect fields, interaction log, relationship graph, canonical entity IDs).
  3. Build the ingest/sync pipeline (chunking + entity resolution + metadata payloads; backfill + incremental).
  4. Build the CRM MCP server wrapping CRM reads/writes and the per-agent retrieval modes.
  5. Bring counsel in to define outbound and recordkeeping rules so the system is compliant from day one.
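
A minimal sketch of the chunking step in the ingest/sync pipeline, showing one way to make the sync idempotent: derive each chunk's point ID deterministically from the CRM record ID and chunk index, so re-running the backfill overwrites rather than duplicates. The record shape (`id`, `kind`, `lp_id`, `date_ts`, `body`, `sections`) is a hypothetical stand-in for the real CRM schema, and embedding plus the Qdrant upsert are left out.

```python
import hashlib
import uuid
from dataclasses import dataclass


@dataclass
class Chunk:
    point_id: str   # deterministic UUID, usable as a vector-store point ID
    text: str       # the text to embed
    payload: dict   # filterable metadata, kept out of the embedded text


def chunk_record(record: dict) -> list:
    """One chunk per memo section; one chunk per email/note body."""
    parts = record["sections"] if record["kind"] == "memo" else [record["body"]]
    chunks = []
    for index, text in enumerate(parts):
        # Same record + same index always yields the same ID.
        point_id = str(uuid.uuid5(uuid.NAMESPACE_URL, f"crm:{record['id']}:{index}"))
        payload = {
            "lp_id": record["lp_id"],
            "date_ts": record["date_ts"],
            "kind": record["kind"],
            # Lets incremental sync skip chunks whose text has not changed.
            "content_hash": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        }
        chunks.append(Chunk(point_id, text, payload))
    return chunks


# Fictional records for illustration.
email = {"id": "e1", "kind": "email", "lp_id": "lp_0001",
         "date_ts": 1700000000, "body": "Thanks for the call."}
memo = {"id": "m1", "kind": "memo", "lp_id": "lp_0001",
        "date_ts": 1700000500, "sections": ["Summary.", "Risks."]}

for chunk in chunk_record(email) + chunk_record(memo):
    print(chunk.point_id, chunk.payload["kind"], chunk.text)
```

Keeping `lp_id` and `date_ts` in the payload rather than in the embedded text matches the ingest description in 4.2: they stay filterable without affecting the vectors.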
### Phase 1 — Architect + Scribe
- Stand up the **Architect** first: encode the current thesis, voice, and segment definitions as skills; use it collaboratively to produce the canonical messaging source of truth.
- Then **Scribe**: propagate that thesis into segment-specific content with human review before publish.
- Lowest risk, highest immediate awareness ROI, never touches cold outreach — and it proves the full pattern (SDK + skills + MCP + human review).
### Phase 2 — Scout + Analyst
- **Scout** populates the pipeline from public signals (X monitoring via the API key); **Analyst** builds dossiers and derives warm paths from your own email/calendar graph plus X follower-graph overlap.
- Internal-facing, still no outbound. This is where the Sparks earn their keep (bulk classification, embeddings, RAG).
### Phase 3 — Closer + Orchestrator
- **Closer** drafts outbound, nurture, and meeting prep — with hard human-in-the-loop gates and full logging. Highest-risk and regulated, so it comes last.
- **Orchestrator** added once there are multiple agents to coordinate and schedule.
---
## 6. Team and ownership model
- **Engineering partner:** Claude + Claude Code, supplying Agent SDK and MCP fluency, scaffolding the agents, writing the MCP servers and orchestration, and customizing the Start9 CRM package.
- **Operator:** you (and your partner). You own deployment, secrets/key management, uptime, and the human-review gates. Your prior Start9 CRM build demonstrates this is well within reach.
- **The one real risk is time, not capability.** Removing the part-time data/ops hire means operational ownership lands on the partners. If partner time is scarce, that — not tooling or skill — is the constraint to manage. Mitigations: keep the early phases internal-only (no on-call urgency), automate logging/monitoring, and stage the highest-maintenance agent (Closer) last.
---
## 7. Compliance by design
- Log every agent action and every outbound draft.
- Gate all outbound through human send.
- Resolve solicitation posture (e.g. 506(b) vs 506(c)), accreditation/QP verification, and recordkeeping with counsel **before** the Closer touches cold outreach.
- Start with distribution and inbound nurture, where constraints are lightest.
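
The "log everything, gate all outbound" rules above could be enforced in code roughly like this. It is a sketch under assumptions: the status names, the `approve`/`send` functions, and the in-memory audit list are all hypothetical, and a real build would persist the audit trail to the CRM interaction log. The point is that the send path refuses to run unless a named human has approved the draft, and that blocked attempts are logged too.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DraftStatus(Enum):
    PENDING_REVIEW = "pending_review"
    APPROVED = "approved"
    SENT = "sent"


class ApprovalRequired(Exception):
    """Raised when something tries to send a draft no human has approved."""


@dataclass
class OutboundDraft:
    draft_id: str
    body: str
    status: DraftStatus = DraftStatus.PENDING_REVIEW
    approved_by: Optional[str] = None


def approve(draft, reviewer, audit_log):
    """A human reviewer signs off; the sign-off itself is logged."""
    draft.status = DraftStatus.APPROVED
    draft.approved_by = reviewer
    audit_log.append({"draft_id": draft.draft_id, "event": "approved", "by": reviewer})


def send(draft, transport, audit_log):
    """Send only approved drafts; log both blocked and successful sends."""
    if draft.status is not DraftStatus.APPROVED or not draft.approved_by:
        audit_log.append({"draft_id": draft.draft_id, "event": "send_blocked"})
        raise ApprovalRequired(f"draft {draft.draft_id} has no human approval")
    transport(draft.body)
    draft.status = DraftStatus.SENT
    audit_log.append({"draft_id": draft.draft_id, "event": "sent", "by": draft.approved_by})


audit_log, outbox = [], []
draft = OutboundDraft("d1", "Hello, following up on our call.")
approve(draft, "partner_a", audit_log)   # hypothetical reviewer ID
send(draft, outbox.append, audit_log)    # outbox.append stands in for email
print(audit_log)
```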
---
## 8. Open decisions
**Resolved:** local chat/triage model = Qwen3.6 35B-A3B (Spark 1); embedding = `BAAI/bge-m3` dense 1024-dim; reranker = `BAAI/bge-reranker-v2-m3`; vector DB = Qdrant v1.16.0 on Spark 2; serving = **spark-embed** (custom FastAPI on NGC PyTorch image, *not* TEI); BM25 sparse generated client-side via FastEmbed (`Qdrant/bm25`); all fronted by Spark Control (`/v1/embeddings`, `/v1/rerank`, `/api/search`), shipped v0.15.0. Embedding-model A/B upgrade candidate if dense recall lags: `Qwen3-Embedding-4B` (same `/v1/embeddings` contract).

**Still open:**
1. Workflow engine for the outer loop (Phase 3): Temporal vs. cron/queue + n8n to start.
2. Whether any third-party enrichment API is acceptable, or X + fully self-hosted enrichment only.
3. Confirm **X API usage limits** (full-archive search, follower-graph pulls, streaming) to size Scout's monitoring scope. (Current access is pay-as-you-go credits.)
4. Segment definitions for the Architect/Scribe (who are the distinct LP audiences, and what does each one need to hear?).
5. Whether to revisit embedding dimension/quantization. It is currently left at BGE-M3's native 1024-dim fp16, since no Matryoshka truncation or int8 quantization is needed at this corpus scale.