ten31-database/docs/go-live-runbook.md

# Go-Live Runbook — Phase 0 substrate on the live Start9 box

*How to take the Phase-0 data substrate from "tested on synthetic data" to "running against the real CRM" on the Start9 server. You run this on your infrastructure; no real LP data goes to Claude/Anthropic (guardrails #1, #9). The live `/data/crm.db` on the box is the canonical source — not the possibly-stale `start9/0.4/seed/` snapshot.*

Recap of the three moves (see also `docs/crm-overview.md`): (1) ship code → empty new tables appear; (2) run the one-time init → fills the canonical IDs + search index from your real data; (3) run the MCP server.

---

## Prerequisites

- Spark Control + Qdrant reachable from the box: `SPARK_CONTROL_URL`, `QDRANT_URL` (see `.env.example`). Verify with `curl -sk "$SPARK_CONTROL_URL/api/endpoints"`.
- The `backend/ingest/` + `backend/mcp/` code present on the box (ships with the package; see the open packaging decision near the end of this runbook).
- Python deps in the ingest environment: `fastembed` (BM25; installs cleanly on the box's Python 3.11) and `mcp` (only to run the MCP server). The CRM server itself needs no new deps.
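
The reachability check above can also be scripted so that it covers Qdrant as well as Spark Control. The following is a minimal sketch, not part of the shipped code: `is_reachable` and `preflight` are hypothetical helper names, and treating `SPARK_CONTROL_VERIFY_TLS=false` as "skip certificate verification" (the equivalent of `curl -k`) is an assumption about how that variable is meant to be read. `GET /collections` is Qdrant's standard list-collections endpoint.

```python
import os
import ssl
import urllib.error
import urllib.request


def is_reachable(url: str, verify_tls: bool = True, timeout: float = 5.0) -> bool:
    """Return True if a GET to `url` answers with an HTTP 2xx status."""
    context = None
    if url.startswith("https://"):
        context = ssl.create_default_context()
        if not verify_tls:
            # Equivalent of `curl -k`: accept a self-signed certificate.
            context.check_hostname = False
            context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Refused connection, timeout, HTTP error status, or malformed URL.
        return False


def preflight(env=os.environ) -> dict:
    """Check both services named in the prerequisites; returns {name: bool}."""
    spark = env.get("SPARK_CONTROL_URL", "")
    qdrant = env.get("QDRANT_URL", "")
    # Assumption: any value other than "false" means "verify certificates".
    verify = env.get("SPARK_CONTROL_VERIFY_TLS", "true").lower() != "false"
    return {
        "spark_control": bool(spark)
        and is_reachable(spark.rstrip("/") + "/api/endpoints", verify),
        "qdrant": bool(qdrant) and is_reachable(qdrant.rstrip("/") + "/collections"),
    }
```

Calling `preflight()` on the box after exporting the variables should return `True` for both keys before you continue. It reports only reachability, never response bodies.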

## Step 1 — Deploy the new CRM version (auto-creates the empty tables)

1. Bump the package version, rebuild the `.s9pk`, sideload it. StartOS preserves `/data`, so your real data is undisturbed.
2. On first boot, `init_db()` runs `backend/core_migrations.py`, which applies `migrations/0001_phase0_foundation.sql` **once** (tracked in `schema_migrations`) — additively creating `canonical_entities`, `entity_links`, `interaction_log`, `relationship_edges`, and the `deleted_at` columns. Nothing existing changes.
3. Verify: `sqlite3 /data/crm.db "SELECT filename FROM schema_migrations;"` → should list `0001_phase0_foundation.sql`.
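
The verification in item 3 can be extended to confirm that the four new tables exist as well. This is a sketch under stated assumptions: `migration_status` is a hypothetical helper, and it relies only on the names given above (the `schema_migrations.filename` column and the four table names). It does not check the `deleted_at` columns, because this runbook does not list which tables receive them.

```python
import sqlite3

# Names taken from Step 1; nothing else about the schema is assumed.
EXPECTED_TABLES = {
    "canonical_entities",
    "entity_links",
    "interaction_log",
    "relationship_edges",
}
MIGRATION = "0001_phase0_foundation.sql"


def migration_status(conn: sqlite3.Connection) -> dict:
    """Report whether the migration is recorded and which new tables are missing."""
    tables = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    applied = False
    if "schema_migrations" in tables:
        row = conn.execute(
            "SELECT 1 FROM schema_migrations WHERE filename = ?", (MIGRATION,)
        ).fetchone()
        applied = row is not None
    return {"applied": applied, "missing_tables": sorted(EXPECTED_TABLES - tables)}


# Usage on the box (opens the database read-only):
#   conn = sqlite3.connect("file:/data/crm.db?mode=ro", uri=True)
#   print(migration_status(conn))
```

A healthy deploy reports `applied` as `True` and an empty `missing_tables` list.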

## Step 2 — Prepare the ingest environment (on the box)

```bash
pip install fastembed                 # provides the Qdrant/bm25 sparse encoder (sparse.py auto-detects it)
export CRM_DB_PATH=/data/crm.db
export SPARK_CONTROL_URL=https://192.168.1.72:62419
export SPARK_CONTROL_VERIFY_TLS=false
export QDRANT_URL=http://192.168.1.87:6333
```

`sparse.py` will report `BACKEND = fastembed:Qdrant/bm25` here (vs the pure-Python fallback used on the dev Mac). Because the index is built **and** queried on the box, the encoder is consistent end-to-end.

## Step 3 — Build the canonical IDs from your real data

```bash
python3 backend/ingest/entity_resolution.py --db /data/crm.db --show-candidates
```

This reads your real contacts / fundraising investors / organizations and fills `canonical_entities` + `entity_links` (the "create entity IDs from existing data" step). It is **read-only on your CRM source tables**, idempotent, and logs a run to `interaction_log`. Review the printed fuzzy candidates: these are the name-variant pairs that the deterministic tier would not merge on a guess. They stay unmerged until the local-Qwen fuzzy tier (still to be built) exists to resolve them.
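
To make the split between "merged deterministically" and "printed as a fuzzy candidate" concrete, here is an illustrative sketch on synthetic names. It is not the logic in `entity_resolution.py`, whose matching rules are not shown in this runbook: `normalize`, `classify_pairs`, the `difflib` similarity measure, and the 0.85 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def classify_pairs(names, threshold=0.85):
    """Split name pairs into exact-after-normalization merges and fuzzy candidates."""
    merged, candidates = [], []
    for a, b in combinations(names, 2):
        na, nb = normalize(a), normalize(b)
        if na == nb:
            # Deterministic: identical once case and punctuation are ignored.
            merged.append((a, b))
        elif SequenceMatcher(None, na, nb).ratio() >= threshold:
            # Similar but not identical: left for review, never auto-merged.
            candidates.append((a, b))
    return merged, candidates


# Synthetic names only; never paste real LP names into an example.
names = ["Acme Capital, LLC", "acme capital llc", "Acme Capitol LLC", "Zeta Partners"]
merged, candidates = classify_pairs(names)
print("merged:", len(merged), "candidates:", len(candidates))
```

The point of the two-tier design is that a near-match such as a one-letter spelling variant is surfaced for review instead of being merged silently.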

## Step 4 — Build the search index

```bash
python3 backend/ingest/backfill.py --db /data/crm.db --recreate
```

This chunks your real records, embeds each chunk as a dense vector (bge-m3 via Spark Control) plus a BM25 sparse vector, and upserts both to the Qdrant collection `crm_chunks`. Expect roughly 8–15 min for a full corpus. The run is idempotent (deterministic point ids), so re-running is safe. `--recreate` drops and rebuilds the collection; omit it to update in place.
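
Deterministic point ids are what make a re-run overwrite existing points instead of duplicating them. The sketch below shows one common way to derive such ids, assuming a key of source table, row id, and chunk index. The actual key used by `backfill.py` is not shown in this runbook, so `point_id`, `CHUNK_NAMESPACE`, and the key format are hypothetical. Qdrant accepts UUID strings as point ids, which is why a UUIDv5 works here.

```python
import uuid

# Hypothetical fixed namespace; any UUID works as long as it never changes.
CHUNK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "crm_chunks")


def point_id(source_table: str, row_id: int, chunk_index: int) -> str:
    """Derive a stable UUIDv5 from the chunk's identity, not from its content."""
    key = f"{source_table}:{row_id}:{chunk_index}"
    return str(uuid.uuid5(CHUNK_NAMESPACE, key))


# The same chunk always maps to the same id, so a second upsert replaces the first.
first = point_id("contacts", 42, 0)
second = point_id("contacts", 42, 0)
print(first == second)
```

Because the id depends only on where the chunk came from, editing a record's text and re-running updates the existing point in place.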

Note: your live CRM's text is concentrated in the **fundraising grid notes** + grid contacts (the seed snapshot had 0 communications / 0 lp_profiles), plus Gmail once enabled (see `docs/gmail-enablement-runbook.md`). The chunker already handles all of these.

## Step 5 — Start the MCP server

```bash
pip install mcp
CRM_DB_PATH=/data/crm.db python3 backend/mcp/server.py
```

Register it with the Agent SDK / Claude Code as an stdio MCP server. It exposes reads, the three retrieval modes, and logged writes — **no outbound/contact tools** (Phase 3 gate). For Phase 0 there are no live agents; this is for testing and the internal-only Analyst work later.

## Step 6 — Incremental sync (NOT YET BUILT — Workstream B4)

The full backfill is one-shot. Keeping the index fresh as the CRM changes (new grid edits, new emails) needs an incremental, idempotent sync on a schedule. This is the remaining Phase-0 ingest piece; until it's built, re-run Steps 3–4 to refresh.
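
As a starting point for that work, one possible shape for an idempotent incremental sync is to compare a content hash per record against the hash stored at the last sync. This is only a sketch of the idea, not a design decision: `content_hash`, `plan_sync`, and the notion of a stored per-record hash are all hypothetical, and nothing here exists in the codebase yet.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a record's indexable text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(current: dict, indexed: dict):
    """Decide what to re-embed and what to remove.

    current: {record_key: text} as read from the CRM now.
    indexed: {record_key: hash} as stored at the previous sync.
    Returns (keys_to_upsert, keys_to_delete), both sorted.
    """
    upsert = sorted(
        key for key, text in current.items() if indexed.get(key) != content_hash(text)
    )
    delete = sorted(key for key in indexed if key not in current)
    return upsert, delete
```

Running the plan twice against unchanged data yields two empty lists, which is the idempotence property the sync needs.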

## Verification

```sql
SELECT entity_kind, COUNT(*) FROM canonical_entities GROUP BY entity_kind;   -- IDs built
SELECT COUNT(*) FROM entity_links;                                            -- source rows linked
```

```bash
curl -s "$QDRANT_URL/collections/crm_chunks" | python3 -c "import sys,json;print('points:', json.load(sys.stdin)['result']['points_count'])"
python3 backend/ingest/search.py "Fund III wire timeline" --mode hybrid       # sanity query
```
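
The two SQL checks can be wrapped in a pass/fail gate that prints counts only, in keeping with the sovereignty rule. This is a minimal sketch: `substrate_counts` and `looks_initialized` are hypothetical helpers, and the only schema details assumed are the ones used in the SQL above (`canonical_entities.entity_kind` and the `entity_links` table).

```python
import sqlite3


def substrate_counts(conn: sqlite3.Connection) -> dict:
    """Collect the same counts as the SQL checks, without reading any record fields."""
    by_kind = dict(
        conn.execute(
            "SELECT entity_kind, COUNT(*) FROM canonical_entities GROUP BY entity_kind"
        ).fetchall()
    )
    links = conn.execute("SELECT COUNT(*) FROM entity_links").fetchone()[0]
    return {"entities_by_kind": by_kind, "entity_links": links}


def looks_initialized(counts: dict) -> bool:
    """True once Step 3 has produced at least one entity and one link."""
    return bool(counts["entities_by_kind"]) and counts["entity_links"] > 0


# Usage on the box (opens the database read-only):
#   conn = sqlite3.connect("file:/data/crm.db?mode=ro", uri=True)
#   counts = substrate_counts(conn)
#   print(counts, "OK" if looks_initialized(counts) else "NOT INITIALIZED")
```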

## Open decision — packaging (how the init + MCP run on the box)

The ingest scripts read `/data/crm.db` by file path, so they must run **where that file lives** — inside or beside the CRM container (the dev Mac cannot open the container's SQLite file directly). Options, to decide before go-live:

- **A (recommended): same image.** Bundle `backend/ingest` + `backend/mcp` (+ `fastembed`, `mcp`) into the CRM container image; expose the init as a one-shot Start9 action and run the MCP server as a second daemon in the 0.4 `startos` manifest. The image is already Python 3.11 with the volume mounted.
- **B: sidecar container** on the box mounting the same `/data` volume.
- **C: co-located host** with a copy of `/data` and LAN access to the Sparks (involves copying the DB — least clean).

This packaging wiring (and Step 6) is the remaining build work for a fully turn-key go-live.

## Sovereignty checkpoint

Every step above runs on Ten31 infrastructure. Real records flow `crm.db → local Spark (bge-m3) → local Qdrant` and never reach Anthropic. The scripts print counts, not records. Keep it that way: don't paste the *results* of queries run against real data into a Claude session (guardrail #9).