From ee8408d182183d2d46a93a0478c874ec773a66b5 Mon Sep 17 00:00:00 2001
From: Keysat <licensing@keysat.xyz>
Date: Mon, 15 Jun 2026 20:09:21 -0500
Subject: [PATCH] Handoff: consolidate infra facts into a reference section,
 prune Current state to a lean snapshot
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Lift the load-bearing connectivity/identity/env facts out of the Phase 0/1 narrative
into a stable "Infra facts" section, rewrite Current state as a ~15-line snapshot, and
correct the Core flow diagram (gui-launch.sh / ask-claude.sh, not launch-claude.sh
directly). No operational context dropped — verified by a fresh-eyes doc review.
---
 AGENTS.md | 133 ++++++++++++++++--------------------------------------
 1 file changed, 40 insertions(+), 93 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 8bf9571..b1ebe66 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -16,14 +16,15 @@ phone to drive interactively. Single user, private home network, no multi-user/p
 Matrix message in a project room
   → bot (matrix-nio, on the DGX Spark) receives it
   → looks up which repo that room maps to (explicit config — no classification)
-  → SSHes to the Mac and runs scripts/launch-claude.sh with (repo_dir, message_text)
-  → wrapper cd's into the repo and launches `claude` with the message as the prompt
+  → SSHes to the Mac and runs scripts/gui-launch.sh → launch-claude.sh (repo_dir, message_text)
+  → wrapper cd's into the repo, opens a desktop Terminal, and launches `claude` on the message
   → Claude Code Remote Control (auto-enabled) pushes a notification to the phone
   → tap in and drive the session from the Claude app
 ```
 
-Room determines the repo; the message text becomes the initial prompt. That is the entire
-v1 decision surface.
+Room determines the repo; the message text becomes the initial prompt — the v1 trigger surface.
+*Variant:* a `?`-prefixed message instead runs `ask-claude.sh` (headless `claude -p`) and posts
+the full answer back into the room (ask mode, D12).
 
 ## Stack
 
@@ -161,94 +162,40 @@ Substance threshold **N = 3** real uses, defined per phase in `ROADMAP.md`. "Don
 falsifiable, scaled substance (it worked 3 real times), never a checkbox. A phase that "works
 once" is not done.
 
+## Infra facts (proven — stable reference)
+
+- **WireGuard (`starttunnel`), not LAN:** Mac `10.59.211.5`; Spark (`spark-32d0`, user `modelo`)
+  `10.59.211.6`. The Spark is not on the Mac's LAN subnet.
+- **Spark → Mac:** SSH alias `mac-bridge` → the Mac as user `macpro`, dedicated key (`~/.ssh/id_ed25519` on the
+  Spark, in the Mac's `authorized_keys`). The Spark host's `~/.ssh/config` needs `IdentitiesOnly yes` because a
+  `Host *` rule shadows the default key; the container regenerates a clean config from `config.toml [mac]`.
+- **Mac → Spark:** no authorized key — Spark-side ops (deploy/restart) are owner-run until Phase 3.
+- **Matrix:** homeserver `https://matrix.gilliam.ai` (StartOS Synapse), bot `@agent:matrix.gilliam.ai`,
+  device `matrix-bridge-bot`. The bot reuses the stored access token (`.env`) — never re-logs in
+  (avoids device churn). No E2EE (D9); bot↔Synapse is clearnet TLS, softening D9's WireGuard-only rationale.
+- **Mac env:** `claude` lives in `~/.local/bin`, on PATH only via `~/.zprofile` — so every wrapper is
+  `#!/bin/zsh -l` (a non-login SSH shell loads neither `.zprofile` nor `.zshrc`).
+- **Interactive-launch prereqs:** Mac logged into its desktop + a one-time Terminal Automation grant
+  (TCC). If the grant resets, a launch stalls — the bot reports it fail-loud rather than hanging.
+- **Folder-trust gate:** the first `claude` run in any given repo stalls on the
+  trust prompt; already-used repos are trusted. Affects unattended interactive launches and ask mode.
+
 ## Current state
 
-- **Scaffolded 2026-06-15** from a prior scoping package (SPEC/DECISIONS/CLAUDE/KICKOFF),
-  folded into this AGENTS.md (decisions + placement), `ROADMAP.md` (phases), and the wrapper +
-  config skeleton. No bot code yet — by design.
-- **Phase 0 — SSH leg proven (2026-06-15).** Mac Remote Login is on. The Spark `spark-32d0`
-  (user `modelo`) reaches the Mac over `starttunnel`/WireGuard at `10.59.211.5` — *not* the
-  LAN (the Spark isn't on the Mac's LAN subnet). A dedicated per-machine key
-  (`spark-control@spark-32d0` = `~/.ssh/id_ed25519` on the Spark) is in the Mac's
-  `authorized_keys`. SSH alias **`mac-bridge`** in the Spark's `~/.ssh/config` selects that key
-  (`IdentityFile ~/.ssh/id_ed25519` + `IdentitiesOnly yes`) — required because the pre-existing
-  `Host * → id_ed25519_shared` rule otherwise shadows the default key. The bot's entire Mac hop
-  is therefore `ssh mac-bridge '<command>'`. *Phase 1:* bake the dedicated key + an equivalent
-  alias/config into the bot's Docker image (modelo's `~/.ssh/config` won't exist in the
-  container).
-- **Phase 0 — launch chain proven end-to-end (2026-06-15).** `ssh mac-bridge → gui-launch.sh
-  → launch-claude.sh → authenticated claude → phone via Remote Control` works against a real
-  repo (`premier-gunner`). Chose **Approach B (desktop Terminal)** over a headless token — see
-  **D11**. Two pieces it took: (1) `~/.local/bin` (where `claude` lives) had to be added to
-  `~/.zprofile`, because a non-interactive login shell skips `.zshrc`; (2) `scripts/gui-launch.sh`
-  opens a Terminal.app window via `osascript` so `claude` runs inside the GUI session (login
-  Keychain + real TTY) — needed a one-time "Allow ssh to control Terminal" Automation grant.
-  *Known caveats for the bot:* (a) a never-trusted repo stalls at Claude's first-run folder-trust
-  gate — unattended launches must target already-trusted repos or pass a skip flag; (b) if the
-  TCC Automation grant ever resets, a launch stalls until someone clicks Allow — the bot should
-  detect a failed launch and report it back to the room, not hang.
-- **Phase 0 — Matrix bot user live (2026-06-15).** Homeserver is the StartOS Synapse exposed on
-  **clearnet at `https://matrix.gilliam.ai`** (`server_name` = `matrix.gilliam.ai`, Synapse
-  1.154.0) — *not* the stale `@gilliam:<onion>` account found in Element. Created a dedicated
-  non-admin bot **`@agent:matrix.gilliam.ai`** (type `bot`) via the Synapse Admin Dashboard
-  (StartOS "Create Bot User" is appservice-only/greyed out). Minted a long-lived access token
-  (fixed `device_id` `matrix-bridge-bot`), verified via `whoami`, and stored
-  homeserver/user/token/device_id (+ password for recovery) in the gitignored **`.env`** (chmod
-  600). `config.toml` holds homeserver+user; `.env.example` documents the schema. Bot reuses the
-  stored token — never re-login per start (avoids device churn); no E2EE (D9). *Note:* the
-  bot↔Synapse hop is now public-internet TLS, which softens D9's "transport already WireGuard-
-  private" rationale (still TLS to the user's own server, single-user content) — revisit if it matters.
-- **Phase 0 — rooms mapped (2026-06-15).** 9 project rooms in `config.toml` (premier-gunner,
-  recap, recap-relay, spark-control, ten31-transcripts, ten31-signal-engine, keysat, proof-of-work,
-  ten31-database), each `room_id → /Users/macpro/Projects/<repo>`. `@agent` is **joined to all 9**
-  (via its token), so the Phase-1 bot will see messages in each. *Manual by-hand launches must keep
-  message text free of `'`/`"`* — the typed SSH command line breaks on them (PS2 `>` hang); the
-  Phase-1 bot avoids this via `shlex.quote`.
-- **Phase 0 — PROVEN / DONE (2026-06-15).** N=3 by-hand runs succeeded across multiple rooms
-  (recap, spark-control, premier-gunner): each opened a Terminal in the right repo, started `claude`
-  on the message, and pushed a drivable session to the phone. The deterministic core holds.
-  Added session naming: `launch-claude.sh` now runs `claude -n "<repo> - <topic>"` (topic from the
-  message, overridable via `$MB_SESSION_NAME`) so Remote Control's phone index is readable —
-  confirmed `-n` drives the phone app's conversation label.
-- **Phase 1 — bot working, sub-steps 1–3 PROVEN (2026-06-15).** `src/bot.py` (matrix-nio) logs in
-  as `@agent` with the stored token, listens in all 12 rooms, and on a message runs
-  `ssh mac-bridge gui-launch.sh <repo> <message>` (via `shlex.quote`), replies in-room, fans out
-  for `#all-projects` (each session named `<repo> - <date>`), and reports failures back (fail-loud).
-  Tested on the **Spark** (`~/matrix-bridge`, venv) — launches worked across several rooms (N=3).
-  Now 11 project rooms + all-projects; `config.toml` has a `[mac]` section (ssh_alias + launcher).
-- **Phase 1 — DONE: containerized + proven on the Spark (2026-06-15).** The bot runs as a Docker
-  container on the Spark (`~/matrix-bridge`, `docker compose up -d --build`): generic image
-  (`python:3.12-slim` + `openssh-client`), host networking, `restart: unless-stopped` (survives
-  reboots), read-only mounts of `.env`/`config.toml`/SSH key. `docker-entrypoint.sh` generates
-  `~/.ssh/config` for `mac-bridge` from `config.toml [mac]` (added `hostname`=`10.59.211.5`,
-  `user`=`macpro`) — the container's env seam (D4 analog of `launch-claude.sh`); SSH key mounted
-  not baked; first connect uses `StrictHostKeyChecking=accept-new` (private-WireGuard tradeoff, D9).
-  *Proven live:* container connects to Synapse (`listening as @agent… 11 rooms`) and real messages
-  in **2 different rooms** each launched a drivable session on the phone via the full chain
-  (container → `ssh mac-bridge` → `gui-launch.sh` → `claude` → phone), rc=0 — confirming the new
-  container→Mac SSH hop over WireGuard (mounted key + accept-new host trust). *Formal exit was N=3;
-  the owner accepted 2 live launches across 2 rooms + the clear repeatable pattern as done.*
-  Build-time checks on the Mac also passed (image builds, `ssh -G mac-bridge` resolves, entrypoint
-  perms 700/600).
-- **Spark-side ops are owner-run.** The Mac has **no** authorized SSH key into the Spark
-  (`modelo@10.59.211.6` — reachable over WireGuard but not authenticated; Phase 0 only set up the
-  reverse, `mac-bridge`). So deploys/restarts on the Spark are run by the owner from the Spark, not
-  driven from the Mac — until Phase 3 wires it behind Spark Control.
-- **Headless "ask" mode — SHIPPED + proven on the Spark (2026-06-16).** A `?`-prefixed message in a
-  mapped room runs `claude -p` one-shot in that repo on the Mac and posts the **full** answer back
-  into the room (Matrix as request/response, not just a trigger); non-`?` messages launch
-  interactively as before. New `scripts/ask-claude.sh` (login-shell wrapper: extracts
-  `CLAUDE_CODE_OAUTH_TOKEN` from the Mac's `.env`, runs `claude -p "$prompt" < /dev/null`); `bot.py`
-  gained the `?`-dispatch + `run_ask`/`ask` (SSH stdout captured, 300s timeout, fail-loud, output
-  chunked under Matrix's ~64KB cap). *Why a token (D12):* a non-GUI SSH session can't reach the login
-  Keychain, so headless `claude -p` reports "Not logged in" — Approach A, kept Mac-side only (the
-  Spark never runs claude). Fresh-eyes reviewed before commit; P1 nits fixed (reap killed ssh on
-  timeout; treat rc=0 + empty output as success, not failure). *Proven:* a real `?`-ask in an
-  already-trusted repo returned the answer into the room. *Open edge:* a `?`-ask in a repo `claude`
-  has **never** been opened in may stall on the first-run folder-trust gate (Phase 0 caveat) — add a
-  trust flag to the wrapper if/when hit, not preemptively.
-- **Next (open — discuss before building):** Phase 2 (multi-room routing) is effectively already
-  satisfied (built multi-room; routed correctly across rooms in the Phase 1 proof) — only a formal
-  confirmation pass remains. Main remaining candidate: **Phase 3** (Spark Control: bot status +
-  one-click update/restart on the dashboard, the SSH-behind-buttons pattern — also closes the
-  owner-run-ops gap above). Other backlog in `ROADMAP.md`.
+- **Working & proven live on the Spark (Phases 0–1 + ask mode, 2026-06-16).** The bot runs as a Docker
+  container on the Spark (`~/matrix-bridge`, `docker compose up -d --build`): generic image, host
+  networking, `restart: unless-stopped`, read-only mounts of `.env`/`config.toml`/SSH key. Listens as
+  `@agent` in 11 project rooms + an all-projects fan-out room (each fan-out session named `<repo> - <date>`).
+- **Interactive** (plain message): `ssh mac-bridge → gui-launch.sh → launch-claude.sh → claude` →
+  drivable session on the phone via Remote Control.
+- **Ask mode** (`?`-prefixed message): `ssh mac-bridge → ask-claude.sh → claude -p`, full answer posted
+  back into the room (chunked, no truncation). See D12.
+- **Phase 2 (multi-room routing)** is effectively satisfied — the bot is built multi-room and routes by
+  `room_id`; only a formal N=3 confirmation pass remains.
+- **Next — Phase 3 (deferred to next session by owner):** Spark Control integration — bot container
+  status + one-click update/restart on the dashboard; also closes the Mac-has-no-key-into-Spark gap.
+- **Open / risks:** (a) a `?`-ask in a repo where `claude` has never run may stall on the folder-trust gate
+  — add a trust flag to `ask-claude.sh` if/when hit, not preemptively; (b) owner TODO: clean up the
+  accidental MacBook docker deploy (`docker compose down` + `docker image rm matrix-bridge-bot`).
+- **Repo:** tree clean; `master` == `phase-1` == `8ad1cd8`, pushed to Gitea. No test suite (pre-existing);
+  this session's changes were syntax/unit-checked locally, fresh-eyes reviewed, and proven live.