# matrix-bridge — AGENTS.md
A single-user Matrix bot that turns a message in a project room into a live Claude Code
session in that project's repo on the Mac — surfaced to the phone via Claude Code Remote
Control. It makes the *trigger* portable: from anywhere on the WireGuard network, a Matrix
message starts a session on the Mac in the correct repo, and Remote Control pushes it to the
phone to drive interactively. Single user, private home network, no multi-user/product scope.
> **Inbox check:** At session start, if `~/Projects/standards/INBOX.md` exists, scan it for
> items tagged `(matrix-bridge)` and surface them before proposing next steps; triage with
> `/triage`.
## Core flow (v1)
```
Matrix message in a project room
→ bot (matrix-nio, on the DGX Spark) receives it
→ looks up which repo that room maps to (explicit config — no classification)
→ SSHes to the Mac and runs scripts/gui-launch.sh → launch-claude.sh (repo_dir, message_text)
→ wrapper cd's into the repo, opens a desktop Terminal, and launches `claude` on the message
→ Claude Code Remote Control (auto-enabled) pushes a notification to the phone
→ tap in and drive the session from the Claude app
```
Room determines the repo; the message text becomes the initial prompt — the v1 trigger surface.
*Variants:* a `?`-prefixed message instead runs `ask-claude.sh` (headless `claude -p`) and posts
the full answer back into the room (ask mode, D12). A message in a room's **capture thread** (or a
`/capture <text>` message in any room) is logged to the cross-project inbox instead of launching —
deterministic, no Claude call (capture mode, D13).
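
The three modes above can be told apart from the message alone. The following is a minimal sketch of that dispatch, not the code in `src/bot.py`: the names `Route` and `classify` are hypothetical, and the precedence (capture thread first, then `/capture`, then `?`) is an assumption the source does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    mode: str  # "launch", "ask", or "capture"
    text: str  # payload handed to the Mac-side wrapper


def classify(body: str,
             thread_root: Optional[str] = None,
             capture_thread: Optional[str] = None) -> Route:
    """Pick a mode from the message alone; no model call is involved."""
    text = body.strip()
    # A reply inside the room's configured capture thread is a capture.
    if capture_thread is not None and thread_root == capture_thread:
        return Route("capture", text)
    # An explicit /capture command works in any room.
    if text == "/capture" or text.startswith("/capture "):
        return Route("capture", text[len("/capture"):].strip())
    # A leading "?" selects headless ask mode.
    if text.startswith("?"):
        return Route("ask", text[1:].strip())
    # Everything else becomes the initial prompt of an interactive session.
    return Route("launch", text)
```

Here `thread_root` stands for the event ID taken from the message's `m.relates_to` relation, and `capture_thread` for the per-room value from `config.toml`.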
## Stack
- **Bot:** Python, **matrix-nio** (from the nio-template scaffold), single Docker container.
- **Runs on:** a DGX Spark (always-on Linux, Docker). *Not* Start9, *not* the Mac.
- **Mac seam:** `scripts/launch-claude.sh`, a zsh login-shell wrapper that owns all
environment setup and launches `claude`.
- **Config:** a readable room→repo mapping file (TOML) — adding a project is a config edit.
- **State:** none beyond config in v1; SQLite or flat files only if a later phase needs them.
## Placement
| Question | Decision | Rationale |
|---|---|---|
| Sensitivity / sovereignty | Local-only when an LLM is ever involved | v1 makes no LLM call; future intent-parsing must run on a local model via Spark Control — message content may reference investor/portfolio context. Never wire a frontier API to message payloads. |
| Runtime shape | Long-running service (always-listening bot) | Must be up unattended to catch messages. |
| Host | DGX Spark, Docker container | Always-on Linux with Docker; co-located with Qwen3 for future local intent-parsing; reaches both Synapse (network) and the Mac (SSH). |
| s9pk vs container | Plain container | Not on Start9 at all — StartOS only runs s9pk packages; don't pay packaging cost, don't touch Synapse. |
| Model routing | None in v1; future Qwen3 via Spark Control | Keeps the sovereignty boundary; deterministic core first. |
| Data layer | Config file (TOML) | v1 needs no datastore. |
| Interface | Matrix (Element) + phone via Remote Control | "Reachable from phone" already satisfied by WireGuard + Remote Control. |
| Repo home | Local + Gitea backup | `ssh://git@immense-voyage.local:59916/grant/matrix-bridge.git`. |
## Commands
- `scripts/launch-claude.sh <repo_dir> <prompt>` — the Mac wrapper (Phase 0 deliverable;
validate by hand before any bot code).
- **Bot (Phase 1), containerized on the Spark — preferred:** from `~/matrix-bridge`,
`docker compose up -d --build` (host networking, `restart: unless-stopped` so it survives
reboots; read-only mounts of `.env`/`config.toml`/SSH key). Logs: `docker compose logs -f`.
The entrypoint generates `~/.ssh/config` for the `mac-bridge` alias from `config.toml [mac]`
(`hostname`/`user`), so the alias resolves inside the container. Override the host key path with
`MB_SSH_KEY_HOST` if it isn't `/home/modelo/.ssh/id_ed25519`.
- **Bot — venv (dev/fallback):** `python3 -m venv .venv && .venv/bin/pip install -r requirements.txt`,
then `.venv/bin/python src/bot.py` — uses modelo's host `~/.ssh/config` for the alias.
`MB_SSH_ALIAS` overrides the SSH target for testing.
- **Seed capture threads:** `python3 scripts/seed-capture-threads.py` (reads `.env` + `config.toml`,
needs only Python stdlib; run anywhere the homeserver is reachable). Posts each room's capture-thread
root and prints the `capture_thread` event IDs to paste into `config.toml`. Skips rooms already set;
pass labels or `--force` to reseed, `--dry-run` to preview.
- **Deploy:** the Spark's `~/matrix-bridge` is a Gitea clone tracking `master`, so deploy =
`git fetch origin && git reset --hard origin/master && docker compose up -d --build` (run as
`modelo` from `~/matrix-bridge`). You normally don't run this by hand — the **Update** button on
the Spark Control dashboard (Phase 3) runs exactly this and streams the output: push to Gitea,
then click Update. **Commit to `master`, not a side branch** — Update pulls `origin/master`, so a
commit only on another branch deploys *stale* code with no error (cost a debugging round on
2026-06-16: capture mode was pushed to `phase-1` while Update kept pulling the old `master`).
Also: `config.toml` is **gitignored**, so Update does *not* carry config changes — refresh it on
the Spark separately (`scp mac-bridge:…/config.toml ~/matrix-bridge/config.toml`) before Update. *(Fallback if Gitea is ever unreachable: scp the files from the Mac —
`scp mac-bridge:/Users/macpro/Projects/matrix-bridge/{Dockerfile,docker-compose.yml,docker-entrypoint.sh,requirements.txt,config.toml,.env} .`
and `scp -r mac-bridge:/Users/macpro/Projects/matrix-bridge/src .`, then rebuild.)*
## Layout
- `AGENTS.md` — this file (canonical; `CLAUDE.md` is a relative symlink to it).
- `ROADMAP.md` — Phases 1-4+ with falsifiable exits, plus deferred/future directions.
- `README.md` — human-facing intro.
- `docs/spark-control-integration.md` — the live Phase 3 command contract: the SSH commands
(status / restart / git-pull update / logs) behind the Spark Control tile, plus the now-done
one-time conversion of the Spark's `~/matrix-bridge` to a Gitea clone. matrix-bridge needs no
code change. (Shipped in Spark Control v0.21.0; see Current state.)
- `scripts/launch-claude.sh` — the Mac-side launch wrapper (the only seam that knows the
Mac's environment).
- `config.example.toml` — room→repo mapping template; the real `config.toml` is gitignored.
- `scripts/gui-launch.sh` — opens the desktop Terminal via `osascript` (Approach B, D11); calls
`launch-claude.sh` inside it. The bot invokes this over SSH.
- `scripts/ask-claude.sh` — headless `?`-ask wrapper (`#!/bin/zsh -l`): runs `claude -p` in the repo
and prints the answer to stdout for the bot to capture and post back. Uses `CLAUDE_CODE_OAUTH_TOKEN`
(Mac-side `.env`) because a non-GUI SSH session can't reach the login Keychain (D12).
- `scripts/capture-note.sh` — capture wrapper (`#!/bin/zsh -l`): appends one `/capture`-format line
to `~/Projects/standards/INBOX.md`, commits, best-effort pushes, and echoes the line back.
Deterministic — no `claude`, no token, no frontier call (D13).
- `scripts/seed-capture-threads.py` — one-time (re-runnable) helper that posts each room's
capture-thread root message and prints the resulting `capture_thread` event IDs to paste into
`config.toml`. Skips rooms already configured; run after adding a project.
- `src/bot.py` — the matrix-nio bot (Phase 1): listens in mapped rooms; a plain message runs
`ssh mac-bridge gui-launch.sh` (interactive, to the phone), a `?`-prefixed message runs
`ask-claude.sh` (headless, answer posted back), and a `/capture`/capture-thread message runs
`capture-note.sh` (logs to the inbox, confirms in-thread); fans out for all-projects; reports
failures back.
- `requirements.txt` (matrix-nio) · `.env.example` (credential schema; real `.env` gitignored).
- `.claude/` — Claude wiring (dir only for now).
- `Dockerfile` · `docker-compose.yml` · `docker-entrypoint.sh` · `.dockerignore` — the Phase 1
container (Spark). Generic image (no secrets/deployment specifics baked in); host networking;
read-only mounts of `.env`/`config.toml`/SSH key. The entrypoint generates `~/.ssh/config` for
the `mac-bridge` alias from `config.toml [mac]` — the container's environment seam (D4 analog
of `launch-claude.sh`).
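
As a rough illustration of the entrypoint's job, the sketch below renders an SSH config stanza for the `mac-bridge` alias from the `[mac]` table. The real logic lives in `docker-entrypoint.sh` and is not reproduced here; the function name, the `key_path` parameter, and the choice to emit `IdentitiesOnly yes` inside the container are assumptions.

```python
def render_ssh_config(mac: dict, key_path: str, alias: str = "mac-bridge") -> str:
    """Build one Host stanza so `ssh mac-bridge` resolves inside the container."""
    lines = [
        f"Host {alias}",
        f"    HostName {mac['hostname']}",
        f"    User {mac['user']}",
        f"    IdentityFile {key_path}",
        # Offer only the dedicated key, mirroring the host-side fix.
        "    IdentitiesOnly yes",
    ]
    return "\n".join(lines) + "\n"
```

For example, `render_ssh_config({"hostname": "10.59.211.5", "user": "macpro"}, "/keys/id_ed25519")` yields a stanza that starts with `Host mac-bridge`; the key path shown is a placeholder, not the container's actual mount point.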
## Decisions (already made — don't relitigate without new information)
Condensed from the scoping workshop. Each: the call, why, what it beat.
- **D1 — matrix-nio, not Maubot.** Full control for one custom bot with real SSH-orchestration
logic; keeps Spark Control as the single dashboard. *Beat:* Maubot (competing web UI,
management layer we don't need), SimpleMatrixBotLib.
- **D2 — Bot runs on the Spark, not Start9 or the Mac.** Always-on Linux + Docker, co-located
with Qwen3, reaches Synapse + the Mac. *Beat:* Start9 (no s9pk), Mac (not always-on; it's the
execution target, not the orchestrator).
- **D3 — Synapse stays untouched.** Treat the existing StartOS Synapse as a fixed external
homeserver; the bot logs in as an ordinary Matrix user over WireGuard/LAN.
- **D4 — The Mac wrapper is the environment seam.** A `#!/bin/zsh -l` wrapper owns
PATH/credentials/`cd`/`exec claude`; the bot stays dumb and only invokes it over SSH.
*Beat:* inlining `source ~/.zprofile && …` from the bot (brittle); relying on the default
non-interactive SSH shell (the core failure mode — minimal shell loads neither `.zprofile`
nor `.zshrc`).
- **D5 — Remote Control is the phone-control layer.** Native, E2EE, already auto-enabled;
execution stays on the Mac. The bot only needs to *start* the session. *Note:* outside server
mode, one remote session per Claude Code instance.
- **D6 — Room = repo; routing is deterministic in v1.** No classification, no LLM, no path
branching. *Beat (for v1):* LLM intent parsing → deferred to D8.
- **D7 — No Nextcloud / CalDAV in v1.** Not the pain point; the interesting future (routing
Claude/bot *outputs* into Nextcloud) is real but unscoped.
- **D8 — Intent parsing deferred, but as a "routing brain."** When added (Phase 4+): a smart
dispatcher that, knowing all repos/contexts, decides which repo applies and what context to
inject — not a task-vs-session classifier. MUST run on a local model via Spark Control.
*Revisit when:* the deterministic core (Phases 1-2) is proven.
- **D9 — E2EE deferred (documented tradeoff).** Single-user bot over WireGuard on a private
LAN; transport is already private and matrix-nio E2EE adds libolm overhead. *Revisit when:*
the bot ever handles sensitive content over untrusted transport.
- **D10 — Spark Control manages the bot (Phase 3, DONE 2026-06-16).** Status badge + Update /
Restart / Stop-Start / Logs buttons on the dashboard, the same SSH-behind-buttons pattern Spark
Control uses for the Sparks. Shipped in Spark Control v0.21.0; connects directly as `modelo` (no
`sudo` wrap — this Spark has no passwordless sudo, so the spec's different-user branch never
applies). Badge reflects container liveness, not Matrix connectivity (see Current state / spec).
- **D11 — Launch into a desktop Terminal, not a headless token (Phase 0).** The SSH session
can't reach the GUI login Keychain, so a plain `ssh … claude` reports "Not logged in." Rather
than mint a long-lived `claude setup-token`, the launcher (`scripts/gui-launch.sh`) uses
`osascript` to open a Terminal.app window in the **GUI session**, where `claude` inherits the
existing Keychain login and a real TTY. *Beat:* the long-lived OAuth token (Approach A) — works
and is fully unattended, but adds a credential to manage; kept as the documented fallback if the
Mac is ever driven headless (logged out). *Cost:* requires the Mac logged in + a one-time
Terminal Automation grant.
- **D12 — Headless "ask" mode uses the long-lived token; interactive stays GUI-Terminal (2026-06-16).**
A `?`-prefixed message runs `claude -p` headlessly over plain SSH and posts the answer back, so its
stdout must be captured over the SSH pipe — which rules out the GUI-Terminal path (D11), and a
non-GUI session reports "Not logged in." Ask mode therefore deliberately adopts the long-lived
`claude setup-token` (`CLAUDE_CODE_OAUTH_TOKEN`) that D11 deferred — kept **Mac-side only** (in
`.env`; the Spark never runs claude). Interactive launches keep the token-free GUI-Terminal path.
*Sovereignty unchanged:* `claude -p` uses the subscription, no frontier API touches message payloads.
- **D13 — Capture mode → central inbox + `/triage` gate, via a deterministic script (2026-06-16).**
A message in a room's **capture thread** (detected by its `m.relates_to` thread root, configured
per room as `capture_thread`), or a `/capture <text>` message in any room, is logged to
`~/Projects/standards/INBOX.md` tagged for that room's project — then the existing `/triage`
lands it in the repo. *Beat (deliberately rejected):* writing straight into a repo's
`AGENTS.md`/`ROADMAP.md` unattended — keeps the human approval gate, and the Current-state-vs-
ROADMAP call, where they belong (and AGENTS.md is load-bearing — "propose, don't silently
rewrite"). *Beat:* `claude -p /capture` for the write — a one-line append needs no model, so
`capture-note.sh` does it deterministically: no token, nothing leaves the Mac but the git push,
and message text never reaches a frontier model (upholds the sovereignty constraint / D8). The
bot confirms in-thread with the exact inbox line. The item type comes from an optional leading
keyword the user types (`bug:` / `feature:` / `chore:` / …; default `idea`, always `P2`). Thread
roots are minted by `seed-capture-threads.py`. *In practice the thread is the only good trigger:*
Element intercepts any `/`-prefixed message as a client command, so the `/capture <text>` fallback
needs a "Send as message" / `//capture` dance — fine as a code path, not the daily UX (2026-06-16).
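
The keyword rule in D13 is simple enough to sketch. This is an illustration under assumptions, not the logic in `capture-note.sh` or `src/bot.py`: the function name is hypothetical, only the three keywords named above are recognised (the source list is open-ended), and the inbox line format itself is left out because this file does not spell it out.

```python
# Only the keywords named in D13; the real list is longer ("...").
KNOWN_TYPES = {"bug", "feature", "chore"}


def parse_capture(text: str) -> tuple[str, str, str]:
    """Return (item_type, priority, note) for a captured message."""
    stripped = text.strip()
    head, sep, rest = stripped.partition(":")
    keyword = head.strip().lower()
    # A recognised leading keyword followed by real text sets the type.
    if sep and keyword in KNOWN_TYPES and rest.strip():
        return keyword, "P2", rest.strip()
    # Anything else is an idea; priority is always P2 for now.
    return "idea", "P2", stripped
```

An unrecognised prefix such as `note:` is kept as part of the text rather than dropped, so nothing the user typed is lost before `/triage`.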
## Sovereignty constraint
v1 sends nothing to external services except what is deliberately typed into the Claude Code
session itself. The bot's own logic is fully local. **When intent parsing is added later it
MUST run on a local model via Spark Control — never a frontier API** — because it reads
message content that may reference investor/LP/portfolio context. Never wire an external API
call that carries message payloads.
## Implementation guardrails (from the workshop)
- **Quoting through SSH is the known footgun.** Message text crosses two shells (the Spark's,
then the Mac's). Use `shlex.quote` (or equivalent) when building the remote command — never
naive string-concatenate user text into the SSH command.
- **Fail loud on a bad directory.** If a room maps to a missing dir, the wrapper exits
non-zero (`cd "$1" || exit 1`) and the bot reports the failure back into the room — never
launches Claude in the wrong place.
- **Config over code** for the room→repo mapping.
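
The quoting guardrail can be sketched as follows. It assumes the bot starts `ssh` with an argument list (so no shell on the Spark parses the text) and quotes each argument once for the Mac's shell with `shlex.quote`. The function name and the script path passed in are illustrative, not taken from `src/bot.py`.

```python
import shlex


def build_ssh_argv(alias: str, script: str, repo_dir: str, message: str) -> list[str]:
    """Return an argv for subprocess that survives the remote shell intact."""
    # ssh joins its trailing arguments and hands them to the remote shell,
    # so every piece is quoted for that shell exactly once.
    remote = " ".join(shlex.quote(part) for part in (script, repo_dir, message))
    # A list argv means the local side never re-parses user text.
    return ["ssh", alias, remote]
```

Passing the result to `subprocess.run(argv)` without `shell=True` removes the first of the two shells; if the command is ever built as a single string for a local shell instead, it needs a second round of quoting.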
## Definition of done per phase
Substance threshold **N = 3** real uses, defined per phase in `ROADMAP.md`. "Done" means
falsifiable, scaled substance (it worked 3 real times), never a checkbox. A phase that "works
once" is not done.
## Infra facts (proven — stable reference)
- **WireGuard (`starttunnel`) for Mac↔Spark:** Mac `10.59.211.5`; Spark (`spark-32d0`, user `modelo`)
`10.59.211.6`. The Mac↔Spark seam runs over WireGuard (not the Mac's LAN subnet). The Spark *is*
on the LAN, same as the Start9 host (`immense-voyage`) — so Spark→Gitea (`immense-voyage.local:59916`)
resolves and works directly.
- **Spark → Mac:** SSH alias `mac-bridge` → the Mac as user `macpro`, dedicated key
(`~/.ssh/id_ed25519` on the Spark, in the Mac's `authorized_keys`). The Spark host's `~/.ssh/config` needs `IdentitiesOnly yes` because a
`Host *` rule shadows the default key; the container regenerates a clean config from `config.toml [mac]`.
- **Spark → Gitea (deploy/update path):** `~/matrix-bridge` is a git clone tracking `origin/master`
(`ssh://git@immense-voyage.local:59916/grant/matrix-bridge.git`). modelo's `~/.ssh/config` pins the
deploy key for the Gitea host with `IdentitiesOnly yes` — without it git offered the wrong key first
and Gitea returned `Permission denied (publickey)`. **The Spark Control Update button depends on that
ssh-config block; flag it if modelo's account is ever rebuilt.**
- **Mac → Spark:** no authorized key — direct Mac-initiated Spark ops stay owner-run. (This is *not*
what Phase 3 closes: Spark Control already has its own SSH channel into `spark-32d0`, so its
status/update/restart buttons ride that, not a Mac→Spark key.)
- **Matrix:** homeserver `https://matrix.gilliam.ai` (StartOS Synapse), bot `@agent:matrix.gilliam.ai`,
device `matrix-bridge-bot`. The bot reuses the stored access token (`.env`) — never re-logs in
(avoids device churn). No E2EE (D9); bot↔Synapse is clearnet TLS, softening D9's WireGuard-only rationale.
- **Mac env:** `claude` lives in `~/.local/bin`, on PATH only via `~/.zprofile` — so every wrapper is
`#!/bin/zsh -l` (a non-login SSH shell loads neither `.zprofile` nor `.zshrc`).
- **Interactive-launch prereqs:** Mac logged into its desktop + a one-time Terminal Automation grant
(TCC). If the grant resets, a launch stalls — the bot reports it fail-loud rather than hanging.
- **Folder-trust gate:** the first `claude` run in a repo it has never been opened in stalls on the
trust prompt; already-used repos are trusted. Affects unattended interactive launches and ask mode.
## Current state
- **Live on the Spark; Phases 0-3 + ask mode all DONE.** matrix-nio bot in a Docker container
(`~/matrix-bridge`, a Gitea clone tracking `master`): host networking, `restart: unless-stopped`,
read-only mounts of `.env`/`config.toml`/SSH key. Runs as `@agent` in 11 project rooms + an
all-projects fan-out room. Interactive (plain msg → phone) and ask (`?`-prefix → answer posted
back; D12) both proven at N=3; capture (D13) is live (see below). Phase 2: owner-confirmed routing.
- **Phase 3 (Spark Control) shipped 2026-06-16 in v0.21.0:** status badge + Update / Restart /
Stop-Start / Logs tile; the Spark's dir is now a Gitea clone and deploy = the Update button.
Detail in ROADMAP + `docs/spark-control-integration.md`; no matrix-bridge code change.
- **Capture mode (D13) LIVE 2026-06-16 — proven on 1 room, N=3 pending.** Per-room capture threads
→ `standards/INBOX.md` via `capture-note.sh`, confirmed in-thread; all 11 rooms + all-projects
have seeded `capture_thread` roots (IDs in the gitignored `config.toml`). Keyword type parsing
(commit `0786286`) is **deployed and verified** (Spark updated + restarted 2026-06-16) — leading
`bug:`/`feature:`/`chore:`/… keywords now set the inbox item type. How it works + the Element
`/`-interception caveat: D13.
- **Optional / triggered next moves:**
  - Badge reflects container liveness only, not Synapse connectivity — add a Docker `HEALTHCHECK`
    (bot-side liveness signal → read `{{.State.Health.Status}}`) when "running but silent" bites.
  - A `?`-ask in a repo `claude` has never opened may stall on the folder-trust gate — add a trust
    flag to `ask-claude.sh` if/when hit, not preemptively.
  - Capture priority is always `P2`; add a priority keyword/token to `capture-note.sh` if setting
    it at `/triage` gets tedious. Old `phase-0` branch still exists — delete if it bothers you.
  - Phase 4+ (intent-routing brain D8, thread continuity) — see ROADMAP; not scoped.
- **Watch:** the Update button depends on modelo's Gitea ssh-config pin (`IdentitiesOnly yes`, see
Infra facts) — flag it if that account is ever rebuilt.
- **Repo:** single branch `master` (the vestigial `phase-1` was deleted 2026-06-16; capture mode was
briefly stranded on it — see Deploy). Clean, pushed to Gitea. No test suite (pre-existing).