Handoff: consolidate infra facts into a reference section, prune Current state to a lean snapshot

Lift the load-bearing connectivity/identity/env facts out of the Phase 0/1 narrative
into a stable "Infra facts" section, rewrite Current state as a ~15-line snapshot, and
correct the Core flow diagram (gui-launch.sh / ask-claude.sh, not launch-claude.sh
directly). No operational context dropped — verified by a fresh-eyes doc review.
Keysat
2026-06-15 20:09:21 -05:00
parent 8ad1cd8465
commit ee8408d182
+40 -93
@@ -16,14 +16,15 @@ phone to drive interactively. Single user, private home network, no multi-user/p
Matrix message in a project room
→ bot (matrix-nio, on the DGX Spark) receives it
→ looks up which repo that room maps to (explicit config — no classification)
→ SSHes to the Mac and runs scripts/launch-claude.sh with (repo_dir, message_text)
→ wrapper cd's into the repo and launches `claude` with the message as the prompt
→ SSHes to the Mac and runs scripts/gui-launch.sh → launch-claude.sh (repo_dir, message_text)
→ wrapper cd's into the repo, opens a desktop Terminal, and launches `claude` on the message
→ Claude Code Remote Control (auto-enabled) pushes a notification to the phone
→ tap in and drive the session from the Claude app
```
Room determines the repo; the message text becomes the initial prompt. That is the entire
v1 decision surface.
Room determines the repo; the message text becomes the initial prompt — the v1 trigger surface.
*Variant:* a `?`-prefixed message instead runs `ask-claude.sh` (headless `claude -p`) and posts
the full answer back into the room (ask mode, D12).
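A minimal sketch of that dispatch, for orientation only; it is not the code in `src/bot.py`. The room ID, the `ROOMS` dict, the `build_ssh_argv` name, the script paths relative to the SSH login directory, and the `(repo, text)` argument order for `ask-claude.sh` are all assumptions. Only the `?` prefix rule, the `mac-bridge` alias, and the use of `shlex.quote` come from this document.

```python
import shlex
from typing import List, Optional

# Hypothetical room map; the real mapping lives in config.toml.
ROOMS = {"!example:matrix.gilliam.ai": "/Users/macpro/Projects/premier-gunner"}


def build_ssh_argv(room_id: str, body: str, alias: str = "mac-bridge") -> Optional[List[str]]:
    """Return the ssh argv for one message, or None if the room is unmapped."""
    repo = ROOMS.get(room_id)
    if repo is None:
        return None  # explicit config only: unmapped rooms are ignored
    if body.startswith("?"):
        # Ask mode: strip the prefix and run the headless wrapper (assumed path).
        script, text = "scripts/ask-claude.sh", body[1:].strip()
    else:
        # Interactive mode: open a desktop Terminal via the GUI launcher (assumed path).
        script, text = "scripts/gui-launch.sh", body
    # ssh joins its arguments into one remote command line, so quote each part
    # to keep quotes and spaces in the message from being re-split on the Mac.
    remote = " ".join(shlex.quote(part) for part in (script, repo, text))
    return ["ssh", alias, remote]
```

Passing the result to `subprocess.run(argv, ...)` as a list avoids a local shell entirely, so the only quoting that matters is the remote one handled by `shlex.quote`.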
## Stack
@@ -161,94 +162,40 @@ Substance threshold **N = 3** real uses, defined per phase in `ROADMAP.md`. "Don
falsifiable, scaled substance (it worked 3 real times), never a checkbox. A phase that "works
once" is not done.
## Infra facts (proven — stable reference)
- **WireGuard (`starttunnel`), not LAN:** Mac `10.59.211.5`; Spark (`spark-32d0`, user `modelo`)
`10.59.211.6`. The Spark is not on the Mac's LAN subnet.
- **Spark → Mac:** SSH alias `mac-bridge` → the Mac as user `macpro`, dedicated key
(`~/.ssh/id_ed25519` on the Spark, in the Mac's `authorized_keys`). The Spark host's `~/.ssh/config` needs `IdentitiesOnly yes` because a
`Host *` rule shadows the default key; the container regenerates a clean config from `config.toml [mac]`.
- **Mac → Spark:** no authorized key — Spark-side ops (deploy/restart) are owner-run until Phase 3.
- **Matrix:** homeserver `https://matrix.gilliam.ai` (StartOS Synapse), bot `@agent:matrix.gilliam.ai`,
device `matrix-bridge-bot`. The bot reuses the stored access token (`.env`) — never re-logs in
(avoids device churn). No E2EE (D9); bot↔Synapse is clearnet TLS, softening D9's WireGuard-only rationale.
- **Mac env:** `claude` lives in `~/.local/bin`, on PATH only via `~/.zprofile` — so every wrapper is
`#!/bin/zsh -l` (a non-login SSH shell loads neither `.zprofile` nor `.zshrc`).
- **Interactive-launch prereqs:** Mac logged into its desktop + a one-time Terminal Automation grant
(TCC). If the grant resets, a launch stalls — the bot reports it fail-loud rather than hanging.
- **Folder-trust gate:** the first `claude` run in a repo it has never been opened in stalls on the
trust prompt; already-used repos are trusted. Affects unattended interactive launches and ask mode.
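For reference, the clean SSH config the container regenerates could look roughly like the output of this sketch. The real generator is `docker-entrypoint.sh`, so this Python version is illustrative only. The `render_ssh_config` name and the identity-file path are hypothetical; the `ssh_alias`, `hostname`, and `user` keys and their values are the ones this document lists for `config.toml [mac]`.

```python
def render_ssh_config(mac: dict, identity_file: str) -> str:
    """Render a single-host ssh_config stanza from a [mac]-style table."""
    lines = [
        f"Host {mac['ssh_alias']}",
        f"    HostName {mac['hostname']}",
        f"    User {mac['user']}",
        f"    IdentityFile {identity_file}",
        # Offer only the dedicated key, so no other rule or agent key shadows it.
        "    IdentitiesOnly yes",
        # Trust the host key on first connect, then pin it.
        "    StrictHostKeyChecking accept-new",
    ]
    return "\n".join(lines) + "\n"


# Values as documented above; the key path is a placeholder, not the real mount point.
mac = {"ssh_alias": "mac-bridge", "hostname": "10.59.211.5", "user": "macpro"}
print(render_ssh_config(mac, "/path/to/mounted/id_ed25519"), end="")
```

On Python 3.11+ the `mac` table could be read with `tomllib.load(f)["mac"]` instead of being written inline.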
## Current state
- **Scaffolded 2026-06-15** from a prior scoping package (SPEC/DECISIONS/CLAUDE/KICKOFF),
folded into this AGENTS.md (decisions + placement), `ROADMAP.md` (phases), and the wrapper +
config skeleton. No bot code yet — by design.
- **Phase 0 — SSH leg proven (2026-06-15).** Mac Remote Login is on. The Spark `spark-32d0`
(user `modelo`) reaches the Mac over `starttunnel`/WireGuard at `10.59.211.5`, *not* the
LAN (the Spark isn't on the Mac's LAN subnet). A dedicated per-machine key
(`spark-control@spark-32d0` = `~/.ssh/id_ed25519` on the Spark) is in the Mac's
`authorized_keys`. SSH alias **`mac-bridge`** in the Spark's `~/.ssh/config` selects that key
(`IdentityFile ~/.ssh/id_ed25519` + `IdentitiesOnly yes`) — required because the pre-existing
`Host * → id_ed25519_shared` rule otherwise shadows the default key. The bot's entire Mac hop
is therefore `ssh mac-bridge '<command>'`. *Phase 1:* bake the dedicated key + an equivalent
alias/config into the bot's Docker image (modelo's `~/.ssh/config` won't exist in the
container).
- **Phase 0 — launch chain proven end-to-end (2026-06-15).** `ssh mac-bridge → gui-launch.sh
→ launch-claude.sh → authenticated claude → phone via Remote Control` works against a real
repo (`premier-gunner`). Chose **Approach B (desktop Terminal)** over a headless token — see
**D11**. Two pieces it took: (1) `~/.local/bin` (where `claude` lives) had to be added to
`~/.zprofile`, because a non-interactive login shell skips `.zshrc`; (2) `scripts/gui-launch.sh`
opens a Terminal.app window via `osascript` so `claude` runs inside the GUI session (login
Keychain + real TTY) — needed a one-time "Allow ssh to control Terminal" Automation grant.
*Known caveats for the bot:* (a) a never-trusted repo stalls at Claude's first-run folder-trust
gate — unattended launches must target already-trusted repos or pass a skip flag; (b) if the
TCC Automation grant ever resets, a launch stalls until someone clicks Allow — the bot should
detect a failed launch and report it back to the room, not hang.
- **Phase 0 — Matrix bot user live (2026-06-15).** Homeserver is the StartOS Synapse exposed on
**clearnet at `https://matrix.gilliam.ai`** (`server_name` = `matrix.gilliam.ai`, Synapse
1.154.0) — *not* the stale `@gilliam:<onion>` account found in Element. Created a dedicated
non-admin bot **`@agent:matrix.gilliam.ai`** (type `bot`) via the Synapse Admin Dashboard
(StartOS "Create Bot User" is appservice-only/greyed out). Minted a long-lived access token
(fixed `device_id` `matrix-bridge-bot`), verified via `whoami`, and stored
homeserver/user/token/device_id (+ password for recovery) in the gitignored **`.env`** (chmod
600). `config.toml` holds homeserver+user; `.env.example` documents the schema. Bot reuses the
stored token — never re-login per start (avoids device churn); no E2EE (D9). *Note:* the
bot↔Synapse hop is now public-internet TLS, which softens D9's "transport already WireGuard-
private" rationale (still TLS to the user's own server, single-user content) — revisit if it matters.
- **Phase 0 — rooms mapped (2026-06-15).** 9 project rooms in `config.toml` (premier-gunner,
recap, recap-relay, spark-control, ten31-transcripts, ten31-signal-engine, keysat, proof-of-work,
ten31-database), each `room_id → /Users/macpro/Projects/<repo>`. `@agent` is **joined to all 9**
(via its token), so the Phase-1 bot will see messages in each. *Manual by-hand launches must keep
message text free of `'`/`"`* — the typed SSH command line breaks on them (PS2 `>` hang); the
Phase-1 bot avoids this via `shlex.quote`.
- **Phase 0 — PROVEN / DONE (2026-06-15).** N=3 by-hand runs succeeded across multiple rooms
(recap, spark-control, premier-gunner): each opened a Terminal in the right repo, started `claude`
on the message, and pushed a drivable session to the phone. The deterministic core holds.
Added session naming: `launch-claude.sh` now runs `claude -n "<repo> - <topic>"` (topic from the
message, overridable via `$MB_SESSION_NAME`) so Remote Control's phone index is readable —
confirmed `-n` drives the phone app's conversation label.
- **Phase 1 — bot working, sub-steps 1–3 PROVEN (2026-06-15).** `src/bot.py` (matrix-nio) logs in
as `@agent` with the stored token, listens in all 12 rooms, and on a message runs
`ssh mac-bridge gui-launch.sh <repo> <message>` (via `shlex.quote`), replies in-room, fans out
for `#all-projects` (each session named `<repo> - <date>`), and reports failures back (fail-loud).
Tested on the **Spark** (`~/matrix-bridge`, venv) — launches worked across several rooms (N=3).
Now 11 project rooms + all-projects; `config.toml` has a `[mac]` section (ssh_alias + launcher).
- **Phase 1 — DONE: containerized + proven on the Spark (2026-06-15).** The bot runs as a Docker
container on the Spark (`~/matrix-bridge`, `docker compose up -d --build`): generic image
(`python:3.12-slim` + `openssh-client`), host networking, `restart: unless-stopped` (survives
reboots), read-only mounts of `.env`/`config.toml`/SSH key. `docker-entrypoint.sh` generates
`~/.ssh/config` for `mac-bridge` from `config.toml [mac]` (added `hostname`=`10.59.211.5`,
`user`=`macpro`) — the container's env seam (D4 analog of `launch-claude.sh`); SSH key mounted
not baked; first connect uses `StrictHostKeyChecking=accept-new` (private-WireGuard tradeoff, D9).
*Proven live:* container connects to Synapse (`listening as @agent… 11 rooms`) and real messages
in **2 different rooms** each launched a drivable session on the phone via the full chain
(container → `ssh mac-bridge` → `gui-launch.sh` → `claude` → phone), rc=0 — confirming the new
container→Mac SSH hop over WireGuard (mounted key + accept-new host trust). *Formal exit was N=3;
the owner accepted 2 live launches across 2 rooms + the clear repeatable pattern as done.*
Build-time checks on the Mac also passed (image builds, `ssh -G mac-bridge` resolves, entrypoint
perms 700/600).
- **Spark-side ops are owner-run.** The Mac has **no** authorized SSH key into the Spark
(`modelo@10.59.211.6` — reachable over WireGuard but not authenticated; Phase 0 only set up the
reverse, `mac-bridge`). So deploys/restarts on the Spark are run by the owner from the Spark, not
driven from the Mac — until Phase 3 wires it behind Spark Control.
- **Headless "ask" mode — SHIPPED + proven on the Spark (2026-06-16).** A `?`-prefixed message in a
mapped room runs `claude -p` one-shot in that repo on the Mac and posts the **full** answer back
into the room (Matrix as request/response, not just a trigger); non-`?` messages launch
interactively as before. New `scripts/ask-claude.sh` (login-shell wrapper: extracts
`CLAUDE_CODE_OAUTH_TOKEN` from the Mac's `.env`, runs `claude -p "$prompt" < /dev/null`); `bot.py`
gained the `?`-dispatch + `run_ask`/`ask` (SSH stdout captured, 300s timeout, fail-loud, output
chunked under Matrix's ~64KB cap). *Why a token (D12):* a non-GUI SSH session can't reach the login
Keychain, so headless `claude -p` reports "Not logged in" — Approach A, kept Mac-side only (the
Spark never runs claude). Fresh-eyes reviewed before commit; P1 nits fixed (reap killed ssh on
timeout; treat rc=0 + empty output as success, not failure). *Proven:* a real `?`-ask in an
already-trusted repo returned the answer into the room. *Open edge:* a `?`-ask in a repo `claude`
has **never** been opened in may stall on the first-run folder-trust gate (Phase 0 caveat) — add a
trust flag to the wrapper if/when hit, not preemptively.
- **Next (open — discuss before building):** Phase 2 (multi-room routing) is effectively already
satisfied (built multi-room; routed correctly across rooms in the Phase 1 proof) — only a formal
confirmation pass remains. Main remaining candidate: **Phase 3** (Spark Control: bot status +
one-click update/restart on the dashboard, the SSH-behind-buttons pattern — also closes the
owner-run-ops gap above). Other backlog in `ROADMAP.md`.
- **Working & proven live on the Spark (Phases 0–1 + ask mode, 2026-06-16).** The bot runs as a Docker
container on the Spark (`~/matrix-bridge`, `docker compose up -d --build`): generic image, host
networking, `restart: unless-stopped`, read-only mounts of `.env`/`config.toml`/SSH key. Listens as
`@agent` in 11 project rooms + an all-projects fan-out room (each fan-out session named `<repo> - <date>`).
- **Interactive** (plain message): `ssh mac-bridge → gui-launch.sh → launch-claude.sh → claude` →
drivable session on the phone via Remote Control.
- **Ask mode** (`?`-prefixed message): `ssh mac-bridge → ask-claude.sh → claude -p`, full answer posted
back into the room (chunked, no truncation). See D12.
- **Phase 2 (multi-room routing)** is effectively satisfied — the bot is built multi-room and routes by
`room_id`; only a formal N=3 confirmation pass remains.
- **Next — Phase 3 (deferred to next session by owner):** Spark Control integration — bot container
status + one-click update/restart on the dashboard; also closes the Mac-has-no-key-into-Spark gap.
- **Open / risks:** (a) a `?`-ask in a repo `claude` has never opened may stall on the folder-trust gate
— add a trust flag to `ask-claude.sh` if/when hit, not preemptively; (b) owner TODO: clean up the
accidental MacBook docker deploy (`docker compose down` + `docker image rm matrix-bridge-bot`).
- **Repo:** tree clean; `master` == `phase-1` == `8ad1cd8`, pushed to Gitea. No test suite (pre-existing);
this session's changes were syntax/unit-checked locally, fresh-eyes reviewed, and proven live.
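The "chunked, no truncation" behaviour of ask mode can be illustrated with a small sketch. It is not the implementation in `bot.py`: the `chunk_utf8` name and the 60,000-byte default are assumptions, chosen to leave headroom under the ~64KB cap mentioned above.

```python
def chunk_utf8(text: str, limit: int = 60_000) -> list[str]:
    """Split text into pieces whose UTF-8 encoding is at most `limit` bytes.

    Splits only between code points, so no chunk ends mid-character.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        if size + n > limit and current:
            # Adding this character would overflow: close the current chunk first.
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        chunks.append("".join(current))
    return chunks
```

Each returned piece would then be sent as its own room message, in order. Measuring bytes rather than characters matters because non-ASCII answers take more than one byte per character.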