a7529eb0b7
Add Dockerfile, docker-compose.yml, docker-entrypoint.sh, and .dockerignore so the bot runs detached and survives reboots, replacing the foreground venv run. The image is generic (no secrets/deployment specifics baked in): host networking reaches both Synapse and the Mac; .env, config.toml, and the SSH key are mounted read-only. The entrypoint is the container's environment seam (D4 analog of launch-claude.sh) — it generates ~/.ssh/config for the mac-bridge alias from config.toml [mac] (new hostname/user fields) so the bot's `ssh mac-bridge` stays unchanged. SSH key mounted not baked; first connect uses accept-new host trust. Proven live on the Spark: container connects to Synapse and real messages launched drivable sessions on the phone across 2 rooms via the full chain.
# matrix-bridge — AGENTS.md

A single-user Matrix bot that turns a message in a project room into a live Claude Code
session in that project's repo on the Mac — surfaced to the phone via Claude Code Remote
Control. It makes the *trigger* portable: from anywhere on the WireGuard network, a Matrix
message starts a session on the Mac in the correct repo, and Remote Control pushes it to the
phone to drive interactively. Single user, private home network, no multi-user/product scope.

> **Inbox check:** At session start, if `~/Projects/standards/INBOX.md` exists, scan it for
> items tagged `(matrix-bridge)` and surface them before proposing next steps; triage with
> `/triage`.

## Core flow (v1)

```
Matrix message in a project room
  → bot (matrix-nio, on the DGX Spark) receives it
  → looks up which repo that room maps to (explicit config — no classification)
  → SSHes to the Mac and runs scripts/launch-claude.sh with (repo_dir, message_text)
  → wrapper cd's into the repo and launches `claude` with the message as the prompt
  → Claude Code Remote Control (auto-enabled) pushes a notification to the phone
  → tap in and drive the session from the Claude app
```

Room determines the repo; the message text becomes the initial prompt. That is the entire
v1 decision surface.

## Stack

- **Bot:** Python, **matrix-nio** (from the nio-template scaffold), single Docker container.
- **Runs on:** a DGX Spark (always-on Linux, Docker). *Not* Start9, *not* the Mac.
- **Mac seam:** `scripts/launch-claude.sh`, a zsh login-shell wrapper that owns all
  environment setup and launches `claude`.
- **Config:** a readable room→repo mapping file (TOML) — adding a project is a config edit.
- **State:** none beyond config in v1; SQLite or flat files only if a later phase needs them.

## Placement

| Question | Decision | Rationale |
|---|---|---|
| Sensitivity / sovereignty | Local-only when an LLM is ever involved | v1 makes no LLM call; future intent-parsing must run on a local model via Spark Control — message content may reference investor/portfolio context. Never wire a frontier API to message payloads. |
| Runtime shape | Long-running service (always-listening bot) | Must be up unattended to catch messages. |
| Host | DGX Spark, Docker container | Always-on Linux with Docker; co-located with Qwen3 for future local intent-parsing; reaches both Synapse (network) and the Mac (SSH). |
| s9pk vs container | Plain container | Not on Start9 at all — StartOS only runs s9pk packages; don't pay packaging cost, don't touch Synapse. |
| Model routing | None in v1; future Qwen3 via Spark Control | Keeps the sovereignty boundary; deterministic core first. |
| Data layer | Config file (TOML) | v1 needs no datastore. |
| Interface | Matrix (Element) + phone via Remote Control | "Reachable from phone" already satisfied by WireGuard + Remote Control. |
| Repo home | Local + Gitea backup | `ssh://git@immense-voyage.local:59916/grant/matrix-bridge.git`. |

## Commands

- `scripts/launch-claude.sh <repo_dir> <prompt>` — the Mac wrapper (Phase 0 deliverable;
  validate by hand before any bot code).
- **Bot (Phase 1), containerized on the Spark — preferred:** from `~/matrix-bridge`,
  `docker compose up -d --build` (host networking, `restart: unless-stopped` so it survives
  reboots; read-only mounts of `.env`/`config.toml`/SSH key). Logs: `docker compose logs -f`.
  The entrypoint generates `~/.ssh/config` for the `mac-bridge` alias from `config.toml [mac]`
  (`hostname`/`user`), so the alias resolves inside the container. Override the host-side path
  of the SSH key with `MB_SSH_KEY_HOST` if it isn't `/home/modelo/.ssh/id_ed25519`.
- **Bot — venv (dev/fallback):** `python3 -m venv .venv && .venv/bin/pip install -r requirements.txt`,
  then `.venv/bin/python src/bot.py` — uses modelo's host `~/.ssh/config` for the alias.
  `MB_SSH_ALIAS` overrides the SSH target for testing.
- **Deploy:** pull the bot files from the Mac (no Gitea needed) —
  `scp mac-bridge:/Users/macpro/Projects/matrix-bridge/{Dockerfile,docker-compose.yml,docker-entrypoint.sh,requirements.txt,config.toml,.env} .`
  and `scp -r mac-bridge:/Users/macpro/Projects/matrix-bridge/src .`, then rebuild.

## Layout

- `AGENTS.md` — this file (canonical; `CLAUDE.md` is a relative symlink to it).
- `ROADMAP.md` — Phases 1–4+ with falsifiable exits, plus deferred/future directions.
- `README.md` — human-facing intro.
- `scripts/launch-claude.sh` — the Mac-side launch wrapper (the only seam that knows the
  Mac's environment).
- `config.example.toml` — room→repo mapping template; the real `config.toml` is gitignored.
- `scripts/gui-launch.sh` — opens the desktop Terminal via `osascript` (Approach B, D11); calls
  `launch-claude.sh` inside it. The bot invokes this over SSH.
- `src/bot.py` — the matrix-nio bot (Phase 1): listens in mapped rooms; on a message runs
  `ssh mac-bridge gui-launch.sh`; fans out for all-projects; reports failures back to the room.
- `requirements.txt` (matrix-nio) · `.env.example` (credential schema; real `.env` gitignored).
- `.claude/` — Claude wiring (dir only for now).
- `Dockerfile` · `docker-compose.yml` · `docker-entrypoint.sh` · `.dockerignore` — the Phase 1
  container (Spark). Generic image (no secrets/deployment specifics baked in); host networking;
  read-only mounts of `.env`/`config.toml`/SSH key. The entrypoint generates `~/.ssh/config` for
  the `mac-bridge` alias from `config.toml [mac]` — the container's environment seam (D4 analog
  of `launch-claude.sh`).

## Decisions (already made — don't relitigate without new information)

Condensed from the scoping workshop. Each: the call, why, what it beat.

- **D1 — matrix-nio, not Maubot.** Full control for one custom bot with real SSH-orchestration
  logic; keeps Spark Control as the single dashboard. *Beat:* Maubot (competing web UI,
  management layer we don't need), SimpleMatrixBotLib.
- **D2 — Bot runs on the Spark, not Start9 or the Mac.** Always-on Linux + Docker, co-located
  with Qwen3, reaches Synapse + the Mac. *Beat:* Start9 (no s9pk), Mac (not always-on; it's the
  execution target, not the orchestrator).
- **D3 — Synapse stays untouched.** Treat the existing StartOS Synapse as a fixed external
  homeserver; the bot logs in as an ordinary Matrix user over WireGuard/LAN.
- **D4 — The Mac wrapper is the environment seam.** A `#!/bin/zsh -l` wrapper owns
  PATH/credentials/`cd`/`exec claude`; the bot stays dumb and only invokes it over SSH.
  *Beat:* inlining `source ~/.zprofile && …` from the bot (brittle); relying on the default
  non-interactive SSH shell (the core failure mode — minimal shell loads neither `.zprofile`
  nor `.zshrc`).
- **D5 — Remote Control is the phone-control layer.** Native, E2EE, already auto-enabled;
  execution stays on the Mac. The bot only needs to *start* the session. *Note:* outside server
  mode, one remote session per Claude Code instance.
- **D6 — Room = repo; routing is deterministic in v1.** No classification, no LLM, no path
  branching. *Beat (for v1):* LLM intent parsing → deferred to D8.
- **D7 — No Nextcloud / CalDAV in v1.** Not the pain point; the interesting future (routing
  Claude/bot *outputs* into Nextcloud) is real but unscoped.
- **D8 — Intent parsing deferred, but as a "routing brain."** When added (Phase 4+): a smart
  dispatcher that, knowing all repos/contexts, decides which repo applies and what context to
  inject — not a task-vs-session classifier. MUST run on a local model via Spark Control.
  *Revisit when:* the deterministic core (Phases 1–2) is proven.
- **D9 — E2EE deferred (documented tradeoff).** Single-user bot over WireGuard on a private
  LAN; transport is already private and matrix-nio E2EE adds libolm overhead. *Revisit when:*
  the bot ever handles sensitive content over untrusted transport.
- **D10 — Spark Control manages the bot (Phase 3).** Status on the dashboard + one-click
  update/restart, the same SSH-behind-buttons pattern Spark Control uses for the Sparks today.
- **D11 — Launch into a desktop Terminal, not a headless token (Phase 0).** The SSH session
  can't reach the GUI login Keychain, so a plain `ssh … claude` reports "Not logged in." Rather
  than mint a long-lived `claude setup-token`, the launcher (`scripts/gui-launch.sh`) uses
  `osascript` to open a Terminal.app window in the **GUI session**, where `claude` inherits the
  existing Keychain login and a real TTY. *Beat:* the long-lived OAuth token (Approach A) — works
  and is fully unattended, but adds a credential to manage; kept as the documented fallback if the
  Mac is ever driven headless (logged out). *Cost:* requires the Mac logged in + a one-time
  Terminal Automation grant.

## Sovereignty constraint

v1 sends nothing to external services except what is deliberately typed into the Claude Code
session itself. The bot's own logic is fully local. **When intent parsing is added later it
MUST run on a local model via Spark Control — never a frontier API** — because it reads
message content that may reference investor/LP/portfolio context. Never wire an external API
call that carries message payloads.

## Implementation guardrails (from the workshop)

- **Quoting through SSH is the known footgun.** Message text crosses two shells (the Spark's,
  then the Mac's). Use `shlex.quote` (or equivalent) when building the remote command — never
  naively concatenate user text into the SSH command string.
- **Fail loud on a bad directory.** If a room maps to a missing dir, the wrapper exits
  non-zero (`cd "$1" || exit 1`) and the bot reports the failure back into the room — it never
  launches Claude in the wrong place.
- **Config over code** for the room→repo mapping.

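The quoting guardrail above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the bot's actual code: the function name `build_ssh_argv` and its parameters are hypothetical. It relies on two facts: `ssh` hands its command string to the remote user's shell, and running the result as an argument list through `subprocess` involves no local shell, so only the remote shell ever parses the text.

```python
import shlex


def build_ssh_argv(alias: str, launcher: str, repo_dir: str, message: str) -> list[str]:
    """Build an argv list for `ssh <alias> <remote command>`.

    launcher is treated as trusted config and passed through as-is;
    repo_dir and message are quoted so the remote shell sees each one
    as a single literal argument.
    """
    remote_command = " ".join([launcher, shlex.quote(repo_dir), shlex.quote(message)])
    return ["ssh", alias, remote_command]


# Example: quotes, semicolons, and $(...) in the message stay inert.
argv = build_ssh_argv(
    "mac-bridge",
    "gui-launch.sh",
    "/Users/macpro/Projects/recap",
    "it's a \"test\"; echo $(whoami)",
)
```

Splitting the remote command with `shlex.split` recovers the original three arguments, which is a quick way to check the quoting before anything reaches the Mac.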
## Definition of done per phase

Substance threshold **N = 3** real uses, defined per phase in `ROADMAP.md`. "Done" means
falsifiable, scaled substance (it worked 3 real times), never a checkbox. A phase that "works
once" is not done.

## Current state

- **Scaffolded 2026-06-15** from a prior scoping package (SPEC/DECISIONS/CLAUDE/KICKOFF),
  folded into this AGENTS.md (decisions + placement), `ROADMAP.md` (phases), and the wrapper +
  config skeleton. No bot code yet — by design.
- **Phase 0 — SSH leg proven (2026-06-15).** Mac Remote Login is on. The Spark `spark-32d0`
  (user `modelo`) reaches the Mac over `starttunnel`/WireGuard at `10.59.211.5` — *not* the
  LAN (the Spark isn't on the Mac's LAN subnet). A dedicated per-machine key
  (`spark-control@spark-32d0` = `~/.ssh/id_ed25519` on the Spark) is in the Mac's
  `authorized_keys`. SSH alias **`mac-bridge`** in the Spark's `~/.ssh/config` selects that key
  (`IdentityFile ~/.ssh/id_ed25519` + `IdentitiesOnly yes`) — required because the pre-existing
  `Host * → id_ed25519_shared` rule otherwise shadows the default key. The bot's entire Mac hop
  is therefore `ssh mac-bridge '<command>'`. *Phase 1 (as planned at the time):* bake the
  dedicated key + an equivalent alias/config into the bot's Docker image (modelo's
  `~/.ssh/config` won't exist in the container). *Superseded:* the Phase 1 container mounts the
  key read-only and generates the alias in its entrypoint instead of baking either in.
- **Phase 0 — launch chain proven end-to-end (2026-06-15).** `ssh mac-bridge → gui-launch.sh
  → launch-claude.sh → authenticated claude → phone via Remote Control` works against a real
  repo (`premier-gunner`). Chose **Approach B (desktop Terminal)** over a headless token — see
  **D11**. Two pieces it took: (1) `~/.local/bin` (where `claude` lives) had to be added to
  `~/.zprofile`, because a non-interactive login shell skips `.zshrc`; (2) `scripts/gui-launch.sh`
  opens a Terminal.app window via `osascript` so `claude` runs inside the GUI session (login
  Keychain + real TTY) — needed a one-time "Allow ssh to control Terminal" Automation grant.
  *Known caveats for the bot:* (a) a never-trusted repo stalls at Claude's first-run folder-trust
  gate — unattended launches must target already-trusted repos or pass a skip flag; (b) if the
  TCC Automation grant ever resets, a launch stalls until someone clicks Allow — the bot should
  detect a failed launch and report it back to the room, not hang.
- **Phase 0 — Matrix bot user live (2026-06-15).** Homeserver is the StartOS Synapse exposed on
  **clearnet at `https://matrix.gilliam.ai`** (`server_name` = `matrix.gilliam.ai`, Synapse
  1.154.0) — *not* the stale `@gilliam:<onion>` account found in Element. Created a dedicated
  non-admin bot **`@agent:matrix.gilliam.ai`** (type `bot`) via the Synapse Admin Dashboard
  (StartOS "Create Bot User" is appservice-only/greyed out). Minted a long-lived access token
  (fixed `device_id` `matrix-bridge-bot`), verified via `whoami`, and stored
  homeserver/user/token/device_id (+ password for recovery) in the gitignored **`.env`** (chmod
  600). `config.toml` holds homeserver+user; `.env.example` documents the schema. Bot reuses the
  stored token — never re-login per start (avoids device churn); no E2EE (D9). *Note:* the
  bot↔Synapse hop is now public-internet TLS, which softens D9's "transport already
  WireGuard-private" rationale (still TLS to the user's own server, single-user content) —
  revisit if it matters.
- **Phase 0 — rooms mapped (2026-06-15).** 9 project rooms in `config.toml` (premier-gunner,
  recap, recap-relay, spark-control, ten31-transcripts, ten31-signal-engine, keysat, proof-of-work,
  ten31-database), each `room_id → /Users/macpro/Projects/<repo>`. `@agent` is **joined to all 9**
  (via its token), so the Phase-1 bot will see messages in each. *Manual by-hand launches must keep
  message text free of `'`/`"`* — the typed SSH command line breaks on them (PS2 `>` hang); the
  Phase-1 bot avoids this via `shlex.quote`.
- **Phase 0 — PROVEN / DONE (2026-06-15).** N=3 by-hand runs succeeded across multiple rooms
  (recap, spark-control, premier-gunner): each opened a Terminal in the right repo, started `claude`
  on the message, and pushed a drivable session to the phone. The deterministic core holds.
  Added session naming: `launch-claude.sh` now runs `claude -n "<repo> - <topic>"` (topic from the
  message, overridable via `$MB_SESSION_NAME`) so Remote Control's phone index is readable —
  confirmed `-n` drives the phone app's conversation label.
- **Phase 1 — bot working, sub-steps 1–3 PROVEN (2026-06-15).** `src/bot.py` (matrix-nio) logs in
  as `@agent` with the stored token, listens in all 12 rooms, and on a message runs
  `ssh mac-bridge gui-launch.sh <repo> <message>` (via `shlex.quote`), replies in-room, fans out
  for `#all-projects` (each session named `<repo> - <date>`), and reports failures back (fail-loud).
  Tested on the **Spark** (`~/matrix-bridge`, venv) — launches worked across several rooms (N=3).
  Now 11 project rooms + all-projects; `config.toml` has a `[mac]` section (ssh_alias + launcher).
- **Phase 1 — DONE: containerized + proven on the Spark (2026-06-15).** The bot runs as a Docker
  container on the Spark (`~/matrix-bridge`, `docker compose up -d --build`): generic image
  (`python:3.12-slim` + `openssh-client`), host networking, `restart: unless-stopped` (survives
  reboots), read-only mounts of `.env`/`config.toml`/SSH key. `docker-entrypoint.sh` generates
  `~/.ssh/config` for `mac-bridge` from `config.toml [mac]` (added `hostname`=`10.59.211.5`,
  `user`=`macpro`) — the container's env seam (D4 analog of `launch-claude.sh`); SSH key mounted
  not baked; first connect uses `StrictHostKeyChecking=accept-new` (private-WireGuard tradeoff, D9).
  *Proven live:* container connects to Synapse (`listening as @agent… 11 rooms`) and real messages
  in **2 different rooms** each launched a drivable session on the phone via the full chain
  (container → `ssh mac-bridge` → `gui-launch.sh` → `claude` → phone), rc=0 — confirming the new
  container→Mac SSH hop over WireGuard (mounted key + accept-new host trust). *Formal exit was N=3;
  the owner accepted 2 live launches across 2 rooms + the clear repeatable pattern as done.*
  Build-time checks on the Mac also passed (image builds, `ssh -G mac-bridge` resolves, entrypoint
  perms 700/600).
- **Spark-side ops are owner-run.** The Mac has **no** authorized SSH key into the Spark
  (`modelo@10.59.211.6` — reachable over WireGuard but not authenticated; Phase 0 only set up the
  reverse, `mac-bridge`). So deploys/restarts on the Spark are run by the owner from the Spark, not
  driven from the Mac — until Phase 3 wires it behind Spark Control.
- **Next (open — discuss before building):** Phase 2 (multi-room routing) is effectively already
  satisfied — the bot was built multi-room (11 rooms + all-projects) and routed correctly across 2
  rooms in the Phase 1 proof; only a formal confirmation pass remains. Live candidates: **Phase 3**
  (Spark Control: bot status + one-click update/restart on the dashboard, the SSH-behind-buttons
  pattern — also closes the owner-run-ops gap above) or the **headless "ask" mode** from
  `ROADMAP.md` (a message runs `claude -p` and posts the answer back into the room).
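
Caveat (b) in the launch-chain entry above asks the bot to detect a stalled or failed launch and report it instead of hanging. One way to get that behaviour is a bounded `subprocess.run`, sketched below. The helper name `run_launch`, the 30-second default, and the message wording are assumptions for illustration, not the bot's actual implementation.

```python
import subprocess


def run_launch(argv: list[str], timeout_s: float = 30.0) -> tuple[bool, str]:
    """Run a launch command; return (ok, detail) rather than hanging or raising.

    The detail string is what a bot could post back into the room.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # e.g. the launch is waiting on a dialog nobody has clicked
        return False, f"launch timed out after {timeout_s:g}s"
    except OSError as exc:
        # e.g. the ssh binary is missing from the image
        return False, f"launch could not start: {exc}"
    if proc.returncode != 0:
        detail = proc.stderr.strip() or "no stderr"
        return False, f"launch failed (rc={proc.returncode}): {detail}"
    return True, "launched"
```

Returning a status pair keeps the fail-loud rule in one place: every outcome other than `rc=0` yields a message that can be reported in the room.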