Keysat 3e199fd8d5 Phase 1 Workstream A+E: thesis substrate + dual-approval gate

- migration 0002_phase1_architect: thesis_lines (core spine + per-segment lines),
  thesis_nodes (+ append-only revisions), thesis_versions (one-canonical-per-line
  DB invariant), thesis_reviews (dual approval + feedback), segments. Reversible.
- backend/mcp/architect_tools.py: agent draft tools (node tree, versions,
  segments, get_canonical fails-closed) — NO self-approval path. MCP-exposed.
- backend/thesis_review.py + server.py routes: human-gated approval. Dual sign-off
  via thesis_required_approvals; atomic supersede; every action logged.
- docs/PHASE_1.md (kickoff brief); docs/OPERATIONS.md (partner guide);
  start9/0.4 "Resolve duplicate names" fuzzy action.

Verified on synthetic data: dual approval promotes correctly, exactly one
canonical survives supersede, get_canonical fails closed, full interaction_log.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-05 10:20:00 -05:00

Operating the Ten31 CRM — A Partner's Guide

Status: DRAFT / living document. This is the operator-facing guide to our new agent-enhanced CRM. It is written for the firm's non-engineer members — especially the partners who will be thought-partners and dual-approvers of the thesis. It will grow as the system is built; the open questions and "what's coming" notes below are real, not placeholders. Last updated 2026-06-05.

1. What this is

Our CRM has quietly become two things at once.

First, it's the canonical "LP graph." It is the single source of truth for who our LPs and prospects are, what we've committed and discussed, and how everyone connects. Historically we tracked investors in two different places — a classic contacts/opportunities system and the live fundraising grid (the collaborative spreadsheet the partners actually edit). Those two never agreed on a single record per person. The new layer fixes that: it resolves all the variants of one investor — their grid row, their contact card, their org, their closed-LP profile — into one real canonical record. That canonical record is what everything else is built on.

Second, it now has an AI-agent layer on top. The vision is six specialized agents that widen the fundraising funnel and sharpen how we tell our story, all running on Claude for reasoning and on our own local models for anything sensitive. In one paragraph: Scout watches public sources for trigger events; Analyst builds LP dossiers and maps warm-intro paths; Architect helps us converge on and evolve our investment thesis; Scribe turns that thesis into content; Closer drafts outreach and meeting prep; and an Orchestrator schedules and routes work between them. Every one of them reads from the canonical LP graph, and none of them sends anything outbound (to an LP, a prospect, or the public) without a partner approving it first.

Where we are today. We are deliberately phased.

Phase 0 (live): the data + retrieval substrate. The canonical entities, an append-only interaction log, and search over our own corpus. No outward-facing agents exist yet — this is the foundation everything else stands on.
Phase 1 (starting): the Architect. A collaborative copilot for the thesis. It drafts and pressure-tests; a partner signs off; the approved thesis becomes the source of truth every later agent reads.
Later (Phase 2–3): Scout, Analyst, Closer, Orchestrator — and only after counsel has defined what we're allowed to do on outbound.

2. The big ideas, in plain terms

Canonical entities — one real record per LP. No more "is this the John Smith from the grid or the J. Smith from contacts?" The system collapses name variants and cross-system duplicates into a single canonical identity. When you look something up, you get the whole person — not a fragment. This is also why our search works: if the same LP is scattered under three spellings, retrieval fragments and the agents get a partial picture.
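For the engineers reading along, here is a minimal sketch of the "obvious exact match" half of that collapse. It assumes records are plain dictionaries with hypothetical `id`, `name`, and `email` fields; the real resolver's schema and matching rules are not described in this guide and may differ.

```python
import re
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and collapse repeated whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def resolve_exact(records):
    """Group records whose normalized name and email match exactly.

    Returns a dict mapping a canonical key to the list of source
    record ids that collapse into it. Field names are illustrative.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (normalize_name(rec["name"]), rec.get("email", "").lower())
        groups[key].append(rec["id"])
    return dict(groups)


# Hypothetical records from the grid and the contacts system.
records = [
    {"id": "grid:17", "name": "John  Smith", "email": "JS@example.com"},
    {"id": "contact:4", "name": "john smith.", "email": "js@example.com"},
    {"id": "contact:9", "name": "J. Smith", "email": "js@example.com"},
]
groups = resolve_exact(records)
```

In this sketch the first two records merge automatically, while "J. Smith" stays separate. That leftover is the kind of judgment-call pair the guide describes being flagged for the later duplicate-resolution step rather than merged blindly.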

The interaction log — everything is recorded. There is now an append-only log of every meaningful action: every human touch (a logged call, a note, a meeting) and, going forward, every agent action (a draft generated, a record enriched, a thesis version approved). It is never edited or deleted, only appended. This is both our compliance trail and the agents' memory. The richer it is, the smarter every agent gets.
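One way to make "only appended" a database guarantee rather than a convention is sketched below. It uses SQLite purely for illustration; the column names and the trigger approach are assumptions, not a description of the production schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE interaction_log (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        actor TEXT NOT NULL,
        action TEXT NOT NULL,
        detail TEXT NOT NULL,
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    -- Reject any attempt to rewrite or remove history.
    CREATE TRIGGER interaction_log_no_update
    BEFORE UPDATE ON interaction_log
    BEGIN
        SELECT RAISE(ABORT, 'interaction_log is append-only');
    END;
    CREATE TRIGGER interaction_log_no_delete
    BEFORE DELETE ON interaction_log
    BEGIN
        SELECT RAISE(ABORT, 'interaction_log is append-only');
    END;
    """
)


def log_action(actor, action, detail):
    """Append one entry; inserting is the only write path offered."""
    with conn:
        conn.execute(
            "INSERT INTO interaction_log (actor, action, detail) VALUES (?, ?, ?)",
            (actor, action, detail),
        )


# Hypothetical actor and action labels.
log_action("partner:alice", "call_logged", "Pushed back on energy thesis")

try:
    conn.execute("DELETE FROM interaction_log")
    blocked = False
except sqlite3.DatabaseError:
    blocked = True  # the trigger aborted the delete
```

Enforcing the rule in the database means a buggy script or a well-meaning cleanup cannot quietly erase the compliance trail.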

Retrieval / search over our own corpus. We can now ask questions across everything we've ever recorded — notes, logged communications, fundraising-grid notes, and (once enabled) Gmail correspondence — and get back the most relevant pieces. It's a hybrid of meaning-based search and exact keyword/name matching, tuned on our own vocabulary so exact fund and LP names rank correctly. This is what lets an agent answer "did we ever discuss X with this LP?" instead of guessing.
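To give a feel for how two kinds of search can be blended, the sketch below combines a meaning-based ranking and a keyword ranking with reciprocal rank fusion. This guide does not say which fusion method the system uses, so treat the technique, the constant `k=60`, and the document ids as illustrative assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so
    items ranked well by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for one question from the two search modes.
semantic = ["note:12", "email:3", "grid:7"]
keyword = ["grid:7", "note:12"]
fused = reciprocal_rank_fusion([semantic, keyword])
```

Here `note:12` and `grid:7` outrank `email:3` because both searches found them, which is the behavior that lets an exact fund or LP name match reinforce a meaning-based hit.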

Sovereignty — our sensitive data stays ours. This is the non-negotiable. All the LP-specific data, the embeddings, the search index, and the duplicate-resolution all run on our own infrastructure — the Start9 server and our local Spark machines. Claude (a third party) is only ever sent the minimum necessary, non-sensitive context for a given task, and never a bulk export of the LP list. When an agent genuinely needs Claude to reason over real record content (later phases), that content first passes through a redaction step that swaps real names, amounts, and emails for placeholders, then swaps them back locally. The de-anonymization key never leaves our box.
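The redaction round trip can be sketched as follows. This is a simplified illustration under stated assumptions: names come from a supplied list, emails and dollar amounts are found with basic regular expressions, and the placeholder format is invented here. The real redaction step is a later-phase component and may work differently.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
AMOUNT_RE = re.compile(r"\$\d[\d,]*(?:\.\d+)?[MK]?")


def redact(text, known_names):
    """Swap names, emails, and amounts for placeholders.

    Returns the redacted text plus the mapping needed to restore it.
    The mapping is the de-anonymization key and must stay local.
    """
    mapping = {}
    counters = {}

    def placeholder(kind, value):
        # Reuse the same token if this value was already seen.
        for token, original in mapping.items():
            if original == value:
                return token
        counters[kind] = counters.get(kind, 0) + 1
        token = f"[{kind}_{counters[kind]}]"
        mapping[token] = value
        return token

    # Longest names first so a full name wins over a partial one.
    for name in sorted(known_names, key=len, reverse=True):
        if name in text:
            text = text.replace(name, placeholder("NAME", name))
    text = EMAIL_RE.sub(lambda m: placeholder("EMAIL", m.group()), text)
    text = AMOUNT_RE.sub(lambda m: placeholder("AMOUNT", m.group()), text)
    return text, mapping


def restore(text, mapping):
    """Swap placeholders back for the real values, locally."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


# Hypothetical record content.
original = "Jane Doe (jane@example.com) committed $2.5M."
redacted, mapping = redact(original, ["Jane Doe"])
# Only `redacted` would ever be sent to a third party.
round_trip = restore(redacted, mapping)
```

The design point is that only the placeholder text crosses the boundary; the mapping that turns `[NAME_1]` back into a person never leaves our infrastructure.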

3. How to operate it day-to-day

You don't need to touch any code to operate this. Two habits and three buttons.

Keep the CRM clean. The canonical graph is only as good as what goes in. When you add an investor, use the real legal name where you can, attach them to the right org, and avoid creating a second record for someone who's already there. The duplicate resolver is good, but it works best as a backstop, not a crutch.

Log interactions, and log them well — this is the highest-leverage habit. When you have a call, a meeting, or a meaningful email exchange, log it with substance: what was discussed, the LP's reaction, objections raised, next steps. Two reasons this matters more than it looks. (1) It's our compliance record. (2) It is literally the training material the agents reason over. A thin "had a call" note teaches the agents nothing; "pushed back on energy thesis, worried about regulatory risk in Texas, wants to see Fund II returns" becomes evidence the Architect can use to anticipate objections and the Analyst can use to build a real dossier. Good logging compounds.

The three one-click actions (on the StartOS server page). These run on our infrastructure and are safe to re-run any time. None of them modifies your CRM source data — they build or refresh the derived search index and the canonical IDs.

Build search index — the one-time (or full-rebuild) setup. It resolves the canonical entity IDs from your live data, then reads every record, and builds the entire search index from scratch. Takes roughly 8–15 minutes. Use this for the initial go-live or if you ever want a clean rebuild.
Refresh search index — the fast, routine one. It updates the search index with just what's changed since the last run. Seconds to minutes. Use this to keep search current after a batch of edits. (Eventually this will run automatically on a schedule; for now it's a button.)
Resolve duplicate names — the smart de-duplication. The build step merges the obvious exact matches automatically and flags the harder, judgment-call pairs (e.g. "Kate" vs "Katherine"). This action asks our local Qwen model to decide which flagged pairs are truly the same person and merges them. It runs entirely on our infrastructure and is idempotent (safe to re-run). It needs our Spark Control gateway to be reachable, because that's where the local model lives.

A sensible rhythm: Build once at go-live, Resolve duplicate names after the build flags candidates, and Refresh routinely as the grid and correspondence change.
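For the curious, the difference between a full build and a refresh comes down to a "changed since last run" check. The sketch below assumes each record carries a hypothetical `updated_at` timestamp and that the index is a simple id-to-text mapping; the real index lives in a search database and is more involved.

```python
def refresh_index(records, index, last_run):
    """Re-index only records changed since last_run.

    Returns the new watermark and how many records were touched.
    Writes are upserts, so running this twice in a row is harmless.
    """
    changed = [r for r in records if r["updated_at"] > last_run]
    for rec in changed:
        index[rec["id"]] = rec["text"]
    new_watermark = max((r["updated_at"] for r in changed), default=last_run)
    return new_watermark, len(changed)


# Hypothetical records; ISO timestamps compare correctly as strings.
records = [
    {"id": "note:1", "text": "Intro call", "updated_at": "2026-06-01T09:00:00"},
    {"id": "note:2", "text": "Follow-up on Fund II", "updated_at": "2026-06-04T15:30:00"},
]
index = {"note:1": "Intro call"}

watermark, touched = refresh_index(records, index, "2026-06-02T00:00:00")
# A second run right away finds nothing new to do.
watermark_again, touched_again = refresh_index(records, index, watermark)
```

This is why Refresh takes seconds to minutes while Build takes longer: Refresh only looks at what moved, and re-running it costs nothing when nothing has changed.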

Where to look when something seems off. If search results feel stale, run Refresh search index. If the same LP shows up as two people, run Resolve duplicate names (and check you didn't create a true second record by accident). If an action fails mentioning Spark Control or Qdrant, the local-model gateway or the search database isn't reachable from the box — that's an infrastructure check, not a data problem. The interaction log is the place to see what happened and when.

4. The agent workflows — what's live, what's coming, and the approval gates

The cardinal rule across all of them: agents draft, partners approve, and nothing goes outbound without a human. No agent emails an LP, posts publicly, or contacts a prospect on its own. Ever.

Agent	What it does	Status
Architect	Collaborative copilot for the thesis: generates competing framings of a claim, turns your critique into a clean edit, red-teams LP objections, and grounds every claim in real evidence from our corpus.	Starting now (Phase 1)
Scout	Monitors public sources (X, filings) for trigger events worth acting on.	Coming (Phase 2)
Analyst	Builds LP dossiers, enriches records with public info, maps warm-intro paths.	Coming (Phase 2)
Scribe	Distributes the approved thesis as content across channels — read-only consumer of what the Architect produces.	Coming (after Architect)
Closer	Drafts outreach, nurture sequences, and meeting prep.	Coming (Phase 3, gated)
Orchestrator	Schedules and routes work between the agents.	Coming (Phase 3)

The Architect, concretely (because it's the one you'll use first). It is not a one-shot thesis generator. It's a workbench for exploration → convergence → continual evolution. You bring the seed of a thesis; it helps you sharpen it claim by claim. Each claim is a small, separately-editable node, so you can rework one argument without re-litigating the whole narrative, and hold competing phrasings side by side. Crucially, the Architect can draft and stage a candidate thesis version, but it cannot make a version canonical. Promoting a version to "this is our official thesis" is a deliberate human action through a partner-authenticated route — the plan supports single- or dual-partner sign-off (an open decision, see below). Once approved, that version becomes the single source every downstream agent reads, and it's logged in the interaction log as a human decision. Scribe and Closer can never generate against an unapproved draft.
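The approval gate described above can be pictured with a small in-memory model. This is a sketch, not the production code: the class and its method names are hypothetical, the `required_approvals` setting stands in for the configurable single- or dual-partner policy, and real partner authentication is left out.

```python
class ThesisLine:
    """Toy model of one thesis line and its versions."""

    def __init__(self, required_approvals=2):
        self.required_approvals = required_approvals
        self.versions = {}  # version_id -> {"status", "approvers"}
        self.canonical_id = None

    def stage(self, version_id):
        """Agents may stage drafts; this never makes anything canonical."""
        self.versions[version_id] = {"status": "draft", "approvers": set()}

    def approve(self, version_id, partner):
        """Record one partner's sign-off; promote once enough have signed."""
        version = self.versions[version_id]
        if version["status"] != "draft":
            raise ValueError("only drafts can be approved")
        version["approvers"].add(partner)  # a set, so repeats do not count twice
        if len(version["approvers"]) >= self.required_approvals:
            # Supersede the old canonical and promote the new one together,
            # so there is never more than one canonical version.
            if self.canonical_id is not None:
                self.versions[self.canonical_id]["status"] = "superseded"
            version["status"] = "canonical"
            self.canonical_id = version_id
        return version["status"]

    def get_canonical(self):
        """Fail closed: refuse to answer rather than fall back to a draft."""
        if self.canonical_id is None:
            raise LookupError("no approved thesis version")
        return self.canonical_id


line = ThesisLine(required_approvals=2)
line.stage("v1")
line.approve("v1", "alice")           # still a draft after one sign-off
status = line.approve("v1", "bob")    # second distinct partner promotes it
```

Two details carry the policy: the same partner approving twice does not count as two approvals, and a downstream reader asking for the canonical thesis gets an error, not a draft, when nothing has been approved.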

The approval gates, summarized. (1) Canonicalizing the thesis is a human-only action. (2) Any outbound message (Closer/Scribe) is drafted by an agent and sent by a human after review. (3) When agents reason over sensitive record content, it passes through the redaction boundary first. (4) The entire outbound capability is blocked until counsel has defined our solicitation posture — we don't ship cold outreach before that gate clears.

5. Best practices — getting the most out of it

Habits that compound:

Log richly and consistently. This is the single biggest lever. Substance over checkbox. (See §3.)
Tag and segment deliberately. As segments firm up (e.g. family office, institution, bitcoin-native HNWI, energy player), assigning each LP to the right segment is what lets the Architect tailor "what this audience needs to hear" and lets us say the right thing to the right person.
Use one real record per person. Resolve duplicates when flagged; don't paper over them.
Keep the index fresh. Refresh after meaningful batches of edits so search and the agents reflect reality.
Treat the thesis as versioned. When the message evolves, evolve a claim node and re-approve — don't overwrite history. The whole point is recoverable iteration.

What NOT to do:

Don't bulk-export the LP list to any third-party tool. Sovereignty is the line we don't cross.
Don't paste real LP data or query results into a public Claude/ChatGPT session. The local pipeline exists precisely so we don't have to.
Don't treat the search index as the source of truth. It's derived from the CRM and rebuildable in minutes; the CRM is canonical. If they ever disagree, the CRM wins and you rebuild the index.
Don't let an agent's draft go out unreviewed. A draft is a draft until a partner approves and sends it.
Don't route bulk email ingest through Superhuman (or any external mail tool) — use the built-in sovereign Gmail capture, which keeps mail on our box. Superhuman is great for your inbox triage and drafting; it's not our system of record.

6. This is a living document

Last updated: 2026-06-05 · Maintained by: the build team, alongside the partners.

This guide will expand as each agent comes online. Things deliberately left open for later phases:

Thesis approval policy — single-partner vs. dual-partner sign-off (the dual-approver workflow this guide is partly written for is still being decided).
LP segments — the firm-defined audience set and the per-segment "what to say / what to avoid" is content the partners supply, not something the system invents.
The agents themselves — Scout, Analyst, Scribe, Closer, Orchestrator are described here as intent; their operating instructions get written when they're built.
The compliance gate — outbound capability stays off until counsel defines solicitation posture, accreditation/QP verification, and recordkeeping rules.
Automatic index refresh — today's manual "Refresh" button becomes a scheduled background sync.

When in doubt about an operating question this guide doesn't answer, ask — and we'll fold the answer back in here.
