Keysat b6cc829f53 Land Phase 0 launch chain: SSH -> desktop Terminal -> claude -> phone
Phase 0 proven by hand (N=3) across multiple rooms.

- scripts/gui-launch.sh: open a desktop Terminal via osascript so claude runs in
  the GUI session (login Keychain + real TTY), avoiding a long-lived token (D11).
- scripts/launch-claude.sh: name the session `claude -n "<repo> - <topic>"` so
  Remote Control's phone conversation index is readable.
- .env.example: bot credential schema (real .env stays gitignored).
- AGENTS.md / ROADMAP.md: D11, Phase 0 results, Phase 1 carry-overs.
2026-06-15 13:58:15 -05:00

# matrix-bridge — AGENTS.md
A single-user Matrix bot that turns a message in a project room into a live Claude Code
session in that project's repo on the Mac — surfaced to the phone via Claude Code Remote
Control. It makes the *trigger* portable: from anywhere on the WireGuard network, a Matrix
message starts a session on the Mac in the correct repo, and Remote Control pushes it to the
phone to drive interactively. Single user, private home network, no multi-user/product scope.
> **Inbox check:** At session start, if `~/Projects/standards/INBOX.md` exists, scan it for
> items tagged `(matrix-bridge)` and surface them before proposing next steps; triage with
> `/triage`.
## Core flow (v1)
```
Matrix message in a project room
→ bot (matrix-nio, on the DGX Spark) receives it
→ looks up which repo that room maps to (explicit config — no classification)
→ SSHes to the Mac and runs scripts/launch-claude.sh with (repo_dir, message_text)
→ wrapper cd's into the repo and launches `claude` with the message as the prompt
→ Claude Code Remote Control (auto-enabled) pushes a notification to the phone
→ tap in and drive the session from the Claude app
```
Room determines the repo; the message text becomes the initial prompt. That is the entire
v1 decision surface.
## Stack
- **Bot:** Python, **matrix-nio** (from the nio-template scaffold), single Docker container.
- **Runs on:** a DGX Spark (always-on Linux, Docker). *Not* Start9, *not* the Mac.
- **Mac seam:** `scripts/launch-claude.sh`, a zsh login-shell wrapper that owns all
environment setup and launches `claude`.
- **Config:** a readable room→repo mapping file (TOML) — adding a project is a config edit.
- **State:** none beyond config in v1; SQLite or flat files only if a later phase needs them.
## Placement
| Question | Decision | Rationale |
|---|---|---|
| Sensitivity / sovereignty | Local-only when an LLM is ever involved | v1 makes no LLM call; future intent-parsing must run on a local model via Spark Control — message content may reference investor/portfolio context. Never wire a frontier API to message payloads. |
| Runtime shape | Long-running service (always-listening bot) | Must be up unattended to catch messages. |
| Host | DGX Spark, Docker container | Always-on Linux with Docker; co-located with Qwen3 for future local intent-parsing; reaches both Synapse (network) and the Mac (SSH). |
| s9pk vs container | Plain container | Not on Start9 at all — StartOS only runs s9pk packages; don't pay packaging cost, don't touch Synapse. |
| Model routing | None in v1; future Qwen3 via Spark Control | Keeps the sovereignty boundary; deterministic core first. |
| Data layer | Config file (TOML) | v1 needs no datastore. |
| Interface | Matrix (Element) + phone via Remote Control | "Reachable from phone" already satisfied by WireGuard + Remote Control. |
| Repo home | Local + Gitea backup | `ssh://git@immense-voyage.local:59916/grant/matrix-bridge.git`. |
## Commands
- `scripts/launch-claude.sh <repo_dir> <prompt>` — the Mac wrapper (Phase 0 deliverable;
validate by hand before any bot code).
- _TODO (Phase 1+):_ bot build/run (`docker build` / `docker compose up` on the Spark) once
`src/` exists.
## Layout
- `AGENTS.md` — this file (canonical; `CLAUDE.md` is a relative symlink to it).
- `ROADMAP.md` — Phases 1–4+ with falsifiable exits, plus deferred/future directions.
- `README.md` — human-facing intro.
- `scripts/launch-claude.sh` — the Mac-side launch wrapper (the only seam that knows the
Mac's environment).
- `config.example.toml` — room→repo mapping template; the real `config.toml` is gitignored.
- `.claude/` — Claude wiring (dir only for now).
- _Future:_ `src/` (the matrix-nio bot), `Dockerfile`, dependency manifest — Phase 1.
## Decisions (already made — don't relitigate without new information)
Condensed from the scoping workshop. Each: the call, why, what it beat.
- **D1 — matrix-nio, not Maubot.** Full control for one custom bot with real SSH-orchestration
logic; keeps Spark Control as the single dashboard. *Beat:* Maubot (competing web UI,
management layer we don't need), SimpleMatrixBotLib.
- **D2 — Bot runs on the Spark, not Start9 or the Mac.** Always-on Linux + Docker, co-located
with Qwen3, reaches Synapse + the Mac. *Beat:* Start9 (no s9pk), Mac (not always-on; it's the
execution target, not the orchestrator).
- **D3 — Synapse stays untouched.** Treat the existing StartOS Synapse as a fixed external
homeserver; the bot logs in as an ordinary Matrix user over WireGuard/LAN.
- **D4 — The Mac wrapper is the environment seam.** A `#!/bin/zsh -l` wrapper owns
PATH/credentials/`cd`/`exec claude`; the bot stays dumb and only invokes it over SSH.
*Beat:* inlining `source ~/.zprofile && …` from the bot (brittle); relying on the default
non-interactive SSH shell (the core failure mode — minimal shell loads neither `.zprofile`
nor `.zshrc`).
- **D5 — Remote Control is the phone-control layer.** Native, E2EE, already auto-enabled;
execution stays on the Mac. The bot only needs to *start* the session. *Note:* outside server
mode, one remote session per Claude Code instance.
- **D6 — Room = repo; routing is deterministic in v1.** No classification, no LLM, no path
branching. *Beat (for v1):* LLM intent parsing → deferred to D8.
- **D7 — No Nextcloud / CalDAV in v1.** Not the pain point; the interesting future (routing
Claude/bot *outputs* into Nextcloud) is real but unscoped.
- **D8 — Intent parsing deferred, but as a "routing brain."** When added (Phase 4+): a smart
dispatcher that, knowing all repos/contexts, decides which repo applies and what context to
inject — not a task-vs-session classifier. MUST run on a local model via Spark Control.
*Revisit when:* the deterministic core (Phases 1–2) is proven.
- **D9 — E2EE deferred (documented tradeoff).** Single-user bot over WireGuard on a private
LAN; transport is already private and matrix-nio E2EE adds libolm overhead. *Revisit when:*
the bot ever handles sensitive content over untrusted transport.
- **D10 — Spark Control manages the bot (Phase 3).** Status on the dashboard + one-click
update/restart, the same SSH-behind-buttons pattern Spark Control uses for the Sparks today.
- **D11 — Launch into a desktop Terminal, not a headless token (Phase 0).** The SSH session
can't reach the GUI login Keychain, so a plain `ssh … claude` reports "Not logged in." Rather
than mint a long-lived `claude setup-token`, the launcher (`scripts/gui-launch.sh`) uses
`osascript` to open a Terminal.app window in the **GUI session**, where `claude` inherits the
existing Keychain login and a real TTY. *Beat:* the long-lived OAuth token (Approach A) — works
and is fully unattended, but adds a credential to manage; kept as the documented fallback if the
Mac is ever driven headless (logged out). *Cost:* requires the Mac logged in + a one-time
Terminal Automation grant.
## Sovereignty constraint
v1 sends nothing to external services except what is deliberately typed into the Claude Code
session itself. The bot's own logic is fully local. **When intent parsing is added later it
MUST run on a local model via Spark Control — never a frontier API** — because it reads
message content that may reference investor/LP/portfolio context. Never wire an external API
call that carries message payloads.
## Implementation guardrails (from the workshop)
- **Quoting through SSH is the known footgun.** Message text crosses two shells (the Spark's,
then the Mac's). Use `shlex.quote` (or equivalent) when building the remote command — never
naive string-concatenate user text into the SSH command.
- **Fail loud on a bad directory.** If a room maps to a missing dir, the wrapper exits
non-zero (`cd "$1" || exit 1`) and the bot reports the failure back into the room — never
launches Claude in the wrong place.
- **Config over code** for the room→repo mapping.
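
The quoting guardrail can be sketched as follows. This is an illustration under assumptions, not the bot's actual code: the helper name `build_ssh_argv` is hypothetical, and it assumes `gui-launch.sh` is reachable by that name on the Mac. Passing an argv list to `subprocess` (no `shell=True`) skips the local shell entirely, so only the remote shell's parsing has to be neutralized, which `shlex.join` does by applying `shlex.quote` to each word.

```python
import shlex

SSH_ALIAS = "mac-bridge"    # SSH alias named in this document
LAUNCHER = "gui-launch.sh"  # assumed to be resolvable on the Mac as-is


def build_ssh_argv(repo_dir: str, message: str) -> list[str]:
    """Return an argv list for subprocess that carries `message` intact.

    ssh hands its command argument to the remote shell as one string, so
    each word is quoted with POSIX single-quote rules before joining.
    """
    remote_command = shlex.join([LAUNCHER, repo_dir, message])
    return ["ssh", SSH_ALIAS, remote_command]
```

Usage: `subprocess.run(build_ssh_argv(repo_dir, text))`. `shlex.quote` targets POSIX shells; zsh treats single-quoted strings the same way, but that is worth confirming by hand with a message containing `'`, `"`, `;`, and `$`.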
## Definition of done per phase
Substance threshold **N = 3** real uses, defined per phase in `ROADMAP.md`. "Done" means
falsifiable, scaled substance (it worked 3 real times), never a checkbox. A phase that "works
once" is not done.
## Current state
- **Scaffolded 2026-06-15** from a prior scoping package (SPEC/DECISIONS/CLAUDE/KICKOFF),
folded into this AGENTS.md (decisions + placement), `ROADMAP.md` (phases), and the wrapper +
config skeleton. No bot code yet — by design.
- **Phase 0 — SSH leg proven (2026-06-15).** Mac Remote Login is on. The Spark `spark-32d0`
(user `modelo`) reaches the Mac over `starttunnel`/WireGuard at `10.59.211.5`, *not* the
LAN (the Spark isn't on the Mac's LAN subnet). A dedicated per-machine key
(`spark-control@spark-32d0` = `~/.ssh/id_ed25519` on the Spark) is in the Mac's
`authorized_keys`. SSH alias **`mac-bridge`** in the Spark's `~/.ssh/config` selects that key
(`IdentityFile ~/.ssh/id_ed25519` + `IdentitiesOnly yes`) — required because the pre-existing
`Host * → id_ed25519_shared` rule otherwise shadows the default key. The bot's entire Mac hop
is therefore `ssh mac-bridge '<command>'`. *Phase 1:* bake the dedicated key + an equivalent
alias/config into the bot's Docker image (modelo's `~/.ssh/config` won't exist in the
container).
- **Phase 0 — launch chain proven end-to-end (2026-06-15).** `ssh mac-bridge → gui-launch.sh
→ launch-claude.sh → authenticated claude → phone via Remote Control` works against a real
repo (`premier-gunner`). Chose **Approach B (desktop Terminal)** over a headless token — see
**D11**. Two pieces it took: (1) `~/.local/bin` (where `claude` lives) had to be added to `PATH`
in `~/.zprofile`, because a non-interactive login shell skips `.zshrc`; (2) `scripts/gui-launch.sh`
opens a Terminal.app window via `osascript` so `claude` runs inside the GUI session (login
Keychain + real TTY) — needed a one-time "Allow ssh to control Terminal" Automation grant.
*Known caveats for the bot:* (a) a never-trusted repo stalls at Claude's first-run folder-trust
gate — unattended launches must target already-trusted repos or pass a skip flag; (b) if the
TCC Automation grant ever resets, a launch stalls until someone clicks Allow — the bot should
detect a failed launch and report it back to the room, not hang.
- **Phase 0 — Matrix bot user live (2026-06-15).** Homeserver is the StartOS Synapse exposed on
**clearnet at `https://matrix.gilliam.ai`** (`server_name` = `matrix.gilliam.ai`, Synapse
1.154.0) — *not* the stale `@gilliam:<onion>` account found in Element. Created a dedicated
non-admin bot **`@agent:matrix.gilliam.ai`** (type `bot`) via the Synapse Admin Dashboard
(StartOS "Create Bot User" is appservice-only/greyed out). Minted a long-lived access token
(fixed `device_id` `matrix-bridge-bot`), verified via `whoami`, and stored
homeserver/user/token/device_id (+ password for recovery) in the gitignored **`.env`** (chmod
600). `config.toml` holds homeserver+user; `.env.example` documents the schema. Bot reuses the
stored token — never re-login per start (avoids device churn); no E2EE (D9). *Note:* the
bot↔Synapse hop is now public-internet TLS, which softens D9's "transport already
WireGuard-private" rationale (still TLS to the user's own server, single-user content); revisit if it matters.
- **Phase 0 — rooms mapped (2026-06-15).** 9 project rooms in `config.toml` (premier-gunner,
recap, recap-relay, spark-control, ten31-transcripts, ten31-signal-engine, keysat, proof-of-work,
ten31-database), each `room_id → /Users/macpro/Projects/<repo>`. `@agent` is **joined to all 9**
(via its token), so the Phase-1 bot will see messages in each. *Manual by-hand launches must keep
message text free of `'`/`"`* — the typed SSH command line breaks on them (PS2 `>` hang); the
Phase-1 bot avoids this via `shlex.quote`.
- **Phase 0 — PROVEN / DONE (2026-06-15).** N=3 by-hand runs succeeded across multiple rooms
(recap, spark-control, premier-gunner): each opened a Terminal in the right repo, started `claude`
on the message, and pushed a drivable session to the phone. The deterministic core holds.
Added session naming: `launch-claude.sh` now runs `claude -n "<repo> - <topic>"` (topic from the
message, overridable via `$MB_SESSION_NAME`) so Remote Control's phone index is readable —
confirmed `-n` drives the phone app's conversation label.
- **Next: Phase 1 — the matrix-nio bot.** Container on the Spark, logged in as `@agent` (token in
`.env`), listening in the 9 mapped rooms; on a message it runs `ssh mac-bridge gui-launch.sh
<repo_dir> <message>` (built with `shlex.quote`) and reports failures back to the room. See
ROADMAP Phase 1 (also: bake key+config into the image, curated `$MB_SESSION_NAME` topic, fail-loud).
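
One way the "detect a failed launch and report it, not hang" caveat could look in the Phase 1 bot is sketched below. The function name `run_launch`, the 30-second default, and the message wording are all assumptions; it also assumes the SSH command blocks while the launch is stalled, which should be checked against the real `gui-launch.sh` behavior.

```python
import subprocess


def run_launch(argv: list[str], timeout_s: float = 30.0) -> tuple[bool, str]:
    """Run the launch command; return (ok, text to post back to the room)."""
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return False, f"launch timed out after {timeout_s:g}s (stalled prompt?)"
    except OSError as exc:
        # For example, the ssh binary is missing from the container.
        return False, f"could not start launcher: {exc}"
    if result.returncode != 0:
        detail = result.stderr.strip() or "(no stderr)"
        return False, f"launch failed (exit {result.returncode}): {detail}"
    return True, "launched"
```

A non-zero exit from the wrapper (such as the `cd "$1" || exit 1` guard) surfaces here as a failure message rather than a silent no-op. A stall that happens inside the Terminal window after the SSH command has already returned, such as the folder-trust gate, would not be caught by this timeout.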