Keysat 28c974fe1d Mark Phase 3 (Spark Control) done; trim spec to live command contract

Shipped in Spark Control v0.21.0: status badge + Update/Restart/Stop-Start/Logs
tile. All three exit criteria confirmed. matrix-bridge needed no code change.

- AGENTS.md: Current state + ROADMAP Phase 3 -> DONE; Deploy switched scp -> git
  pull (Update button); D10 stamped; new Infra fact for the Spark->Gitea path and
  the load-bearing IdentitiesOnly ssh-config pin the Update button depends on.
- spark-control-integration.md: trimmed from dev spec to live contract (dropped
  sudo -iu fallback and dev-side scaffolding; folded in direct-as-modelo, the
  Gitea key gotcha, restart cadence, and the LAN-only HTTP API).
- README: dropped stale "pre-Phase 0" status; Setup reframed for a fresh install.

Deferred follow-up: badge reflects container liveness only, not Matrix
connectivity; HEALTHCHECK + {{.State.Health.Status}} is the matrix-bridge-side fix.

2026-06-15 23:19:30 -05:00

matrix-bridge — AGENTS.md

A single-user Matrix bot that turns a message in a project room into a live Claude Code session in that project's repo on the Mac — surfaced to the phone via Claude Code Remote Control. It makes the trigger portable: from anywhere on the WireGuard network, a Matrix message starts a session on the Mac in the correct repo, and Remote Control pushes it to the phone to drive interactively. Single user, private home network, no multi-user/product scope.

Inbox check: At session start, if ~/Projects/standards/INBOX.md exists, scan it for items tagged (matrix-bridge) and surface them before proposing next steps; triage with /triage.

Core flow (v1)

Matrix message in a project room
  → bot (matrix-nio, on the DGX Spark) receives it
  → looks up which repo that room maps to (explicit config — no classification)
  → SSHes to the Mac and runs scripts/gui-launch.sh → launch-claude.sh (repo_dir, message_text)
  → wrapper cd's into the repo, opens a desktop Terminal, and launches `claude` on the message
  → Claude Code Remote Control (auto-enabled) pushes a notification to the phone
  → tap in and drive the session from the Claude app

Room determines the repo; the message text becomes the initial prompt — the v1 trigger surface. Variant: a ?-prefixed message instead runs ask-claude.sh (headless claude -p) and posts the full answer back into the room (ask mode, D12).
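The routing rule above can be sketched as a pure function. This is an illustration only, not the code in src/bot.py: the `Launch` type, the `route` name, and the choice to strip the `?` before passing the prompt along are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

ASK_PREFIX = "?"  # ask-mode marker described above


@dataclass(frozen=True)
class Launch:
    """Hypothetical description of what the bot should run on the Mac."""
    script: str      # wrapper to invoke over SSH
    repo_dir: str    # repo the room maps to
    prompt: str      # text handed to claude
    post_back: bool  # True when the answer is posted back into the room


def route(room_id: str, body: str, room_to_repo: dict) -> Optional[Launch]:
    """Pick the wrapper from the room and the message prefix alone."""
    repo_dir = room_to_repo.get(room_id)
    if repo_dir is None:
        return None  # unmapped room: nothing to launch
    text = body.strip()
    if text.startswith(ASK_PREFIX):
        # Assumption: the "?" is dropped before the prompt reaches claude -p.
        question = text[len(ASK_PREFIX):].strip()
        return Launch("scripts/ask-claude.sh", repo_dir, question, True)
    return Launch("scripts/gui-launch.sh", repo_dir, text, False)
```

Keeping the decision free of I/O means it can be checked without a homeserver or an SSH connection.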

Stack

Bot: Python, matrix-nio (from the nio-template scaffold), single Docker container.
Runs on: a DGX Spark (always-on Linux, Docker). Not Start9, not the Mac.
Mac seam: scripts/launch-claude.sh, a zsh login-shell wrapper that owns all environment setup and launches claude.
Config: a readable room→repo mapping file (TOML) — adding a project is a config edit.
State: none beyond config in v1; SQLite or flat files only if a later phase needs them.

Placement

| Question | Decision | Rationale |
| --- | --- | --- |
| Sensitivity / sovereignty | Local-only when an LLM is ever involved | v1 makes no LLM call; future intent-parsing must run on a local model via Spark Control — message content may reference investor/portfolio context. Never wire a frontier API to message payloads. |
| Runtime shape | Long-running service (always-listening bot) | Must be up unattended to catch messages. |
| Host | DGX Spark, Docker container | Always-on Linux with Docker; co-located with Qwen3 for future local intent-parsing; reaches both Synapse (network) and the Mac (SSH). |
| s9pk vs container | Plain container | Not on Start9 at all — StartOS only runs s9pk packages; don't pay packaging cost, don't touch Synapse. |
| Model routing | None in v1; future Qwen3 via Spark Control | Keeps the sovereignty boundary; deterministic core first. |
| Data layer | Config file (TOML) | v1 needs no datastore. |
| Interface | Matrix (Element) + phone via Remote Control | "Reachable from phone" already satisfied by WireGuard + Remote Control. |
| Repo home | Local + Gitea backup | `ssh://git@immense-voyage.local:59916/grant/matrix-bridge.git`. |

Commands

scripts/launch-claude.sh <repo_dir> <prompt> — the Mac wrapper (Phase 0 deliverable; validate by hand before any bot code).
Bot (Phase 1), containerized on the Spark — preferred: from ~/matrix-bridge, docker compose up -d --build (host networking, restart: unless-stopped so it survives reboots; read-only mounts of .env/config.toml/SSH key). Logs: docker compose logs -f. The entrypoint generates ~/.ssh/config for the mac-bridge alias from config.toml [mac] (hostname/user), so the alias resolves inside the container. Override the host key path with MB_SSH_KEY_HOST if it isn't /home/modelo/.ssh/id_ed25519.
Bot — venv (dev/fallback): python3 -m venv .venv && .venv/bin/pip install -r requirements.txt, then .venv/bin/python src/bot.py — uses modelo's host ~/.ssh/config for the alias. MB_SSH_ALIAS overrides the SSH target for testing.
Deploy: the Spark's ~/matrix-bridge is a Gitea clone tracking master, so deploy = git fetch origin && git reset --hard origin/master && docker compose up -d --build (run as modelo from ~/matrix-bridge). You normally don't run this by hand — the Update button on the Spark Control dashboard (Phase 3) runs exactly this and streams the output: push to Gitea, then click Update. (Fallback if Gitea is ever unreachable: scp the files from the Mac — scp mac-bridge:/Users/macpro/Projects/matrix-bridge/{Dockerfile,docker-compose.yml,docker-entrypoint.sh,requirements.txt,config.toml,.env} . and scp -r mac-bridge:/Users/macpro/Projects/matrix-bridge/src ., then rebuild.)
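The deploy sequence can be expressed as an ordered list of commands that stops at the first failure. This is a sketch of the three commands quoted above, not the Update button's implementation; the `deploy` function and its injectable `run` parameter are assumptions added so the sequence can be exercised without git or Docker.

```python
import subprocess
from pathlib import Path

# The three commands from the Deploy entry, in order.
DEPLOY_STEPS = [
    ["git", "fetch", "origin"],
    ["git", "reset", "--hard", "origin/master"],
    ["docker", "compose", "up", "-d", "--build"],
]


def deploy(repo: Path = Path.home() / "matrix-bridge", run=subprocess.run) -> int:
    """Run each step inside the clone; check=True aborts on the first non-zero exit."""
    for step in DEPLOY_STEPS:
        run(step, cwd=repo, check=True)
    return len(DEPLOY_STEPS)
```

With the default runner, `check=True` raises `subprocess.CalledProcessError` on a failed step, so a failed fetch never reaches the rebuild.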

Layout

AGENTS.md — this file (canonical; CLAUDE.md is a relative symlink to it).
ROADMAP.md — Phases 1–4+ with falsifiable exits, plus deferred/future directions.
README.md — human-facing intro.
docs/spark-control-integration.md — the live Phase 3 command contract: the SSH commands (status / restart / git-pull update / logs) behind the Spark Control tile, plus the now-done one-time conversion of the Spark's ~/matrix-bridge to a Gitea clone. matrix-bridge needs no code change. (Shipped in Spark Control v0.21.0; see Current state.)
scripts/launch-claude.sh — the Mac-side launch wrapper (the only seam that knows the Mac's environment).
config.example.toml — room→repo mapping template; the real config.toml is gitignored.
scripts/gui-launch.sh — opens the desktop Terminal via osascript (Approach B, D11); calls launch-claude.sh inside it. The bot invokes this over SSH.
scripts/ask-claude.sh — headless ?-ask wrapper (#!/bin/zsh -l): runs claude -p in the repo and prints the answer to stdout for the bot to capture and post back. Uses CLAUDE_CODE_OAUTH_TOKEN (Mac-side .env) because a non-GUI SSH session can't reach the login Keychain (D12).
src/bot.py — the matrix-nio bot (Phase 1): listens in mapped rooms; a plain message runs ssh mac-bridge gui-launch.sh (interactive, to the phone), a ?-prefixed message runs ask-claude.sh (headless, answer posted back); fans out for all-projects; reports failures back.
requirements.txt (matrix-nio) · .env.example (credential schema; real .env gitignored).
.claude/ — Claude wiring (dir only for now).
Dockerfile · docker-compose.yml · docker-entrypoint.sh · .dockerignore — the Phase 1 container (Spark). Generic image (no secrets/deployment specifics baked in); host networking; read-only mounts of .env/config.toml/SSH key. The entrypoint generates ~/.ssh/config for the mac-bridge alias from config.toml [mac] — the container's environment seam (D4 analog of launch-claude.sh).
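What the entrypoint's config generation might produce can be sketched as follows. The real docker-entrypoint.sh is a shell script and is not shown here; the `render_ssh_config` helper, the in-container key path, and the inclusion of `IdentitiesOnly yes` in the generated block are all assumptions.

```python
def render_ssh_config(mac: dict, alias: str = "mac-bridge",
                      key_path: str = "~/.ssh/id_ed25519") -> str:
    """Render one Host block from the [mac] table (hostname, user).

    key_path is a hypothetical in-container location for the mounted key.
    """
    lines = [
        f"Host {alias}",
        f"    HostName {mac['hostname']}",
        f"    User {mac['user']}",
        f"    IdentityFile {key_path}",
        # Assumption: pin the key so ssh offers only this identity.
        "    IdentitiesOnly yes",
    ]
    return "\n".join(lines) + "\n"
```

For example, `render_ssh_config({"hostname": "10.59.211.5", "user": "macpro"})` yields a block that lets `ssh mac-bridge` resolve without any host-side ssh config.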

Decisions (already made — don't relitigate without new information)

Condensed from the scoping workshop. Each: the call, why, what it beat.

D1 — matrix-nio, not Maubot. Full control for one custom bot with real SSH-orchestration logic; keeps Spark Control as the single dashboard. Beat: Maubot (competing web UI, management layer we don't need), SimpleMatrixBotLib.
D2 — Bot runs on the Spark, not Start9 or the Mac. Always-on Linux + Docker, co-located with Qwen3, reaches Synapse + the Mac. Beat: Start9 (no s9pk), Mac (not always-on; it's the execution target, not the orchestrator).
D3 — Synapse stays untouched. Treat the existing StartOS Synapse as a fixed external homeserver; the bot logs in as an ordinary Matrix user over WireGuard/LAN.
D4 — The Mac wrapper is the environment seam. A #!/bin/zsh -l wrapper owns PATH/credentials/cd/exec claude; the bot stays dumb and only invokes it over SSH. Beat: inlining source ~/.zprofile && … from the bot (brittle); relying on the default non-interactive SSH shell (the core failure mode — minimal shell loads neither .zprofile nor .zshrc).
D5 — Remote Control is the phone-control layer. Native, E2EE, already auto-enabled; execution stays on the Mac. The bot only needs to start the session. Note: outside server mode, one remote session per Claude Code instance.
D6 — Room = repo; routing is deterministic in v1. No classification, no LLM, no path branching. Beat (for v1): LLM intent parsing → deferred to D8.
D7 — No Nextcloud / CalDAV in v1. Not the pain point; the interesting future (routing Claude/bot outputs into Nextcloud) is real but unscoped.
D8 — Intent parsing deferred, but as a "routing brain." When added (Phase 4+): a smart dispatcher that, knowing all repos/contexts, decides which repo applies and what context to inject — not a task-vs-session classifier. MUST run on a local model via Spark Control. Revisit when: the deterministic core (Phases 1–2) is proven.
D9 — E2EE deferred (documented tradeoff). Single-user bot over WireGuard on a private LAN; transport is already private and matrix-nio E2EE adds libolm overhead. Revisit when: the bot ever handles sensitive content over untrusted transport.
D10 — Spark Control manages the bot (Phase 3, DONE 2026-06-16). Status badge + Update / Restart / Stop-Start / Logs buttons on the dashboard, the same SSH-behind-buttons pattern Spark Control uses for the Sparks. Shipped in Spark Control v0.21.0; connects directly as modelo (no sudo wrap — this Spark has no passwordless sudo, so the spec's different-user branch never applies). Badge reflects container liveness, not Matrix connectivity (see Current state / spec).
D11 — Launch into a desktop Terminal, not a headless token (Phase 0). The SSH session can't reach the GUI login Keychain, so a plain ssh … claude reports "Not logged in." Rather than mint a long-lived claude setup-token, the launcher (scripts/gui-launch.sh) uses osascript to open a Terminal.app window in the GUI session, where claude inherits the existing Keychain login and a real TTY. Beat: the long-lived OAuth token (Approach A) — works and is fully unattended, but adds a credential to manage; kept as the documented fallback if the Mac is ever driven headless (logged out). Cost: requires the Mac logged in + a one-time Terminal Automation grant.
D12 — Headless "ask" mode uses the long-lived token; interactive stays GUI-Terminal (2026-06-16). A ?-prefixed message runs claude -p headlessly over plain SSH and posts the answer back, so its stdout must be captured over the SSH pipe — which rules out the GUI-Terminal path (D11), and a non-GUI session reports "Not logged in." Ask mode therefore deliberately adopts the long-lived claude setup-token (CLAUDE_CODE_OAUTH_TOKEN) that D11 deferred — kept Mac-side only (in .env; the Spark never runs claude). Interactive launches keep the token-free GUI-Terminal path. Sovereignty unchanged: claude -p uses the subscription, no frontier API touches message payloads.

Sovereignty constraint

v1 sends nothing to external services except what is deliberately typed into the Claude Code session itself. The bot's own logic is fully local. When intent parsing is added later it MUST run on a local model via Spark Control — never a frontier API — because it reads message content that may reference investor/LP/portfolio context. Never wire an external API call that carries message payloads.

Implementation guardrails (from the workshop)

Quoting through SSH is the known footgun. Message text crosses two shells (the Spark's, then the Mac's). Use shlex.quote (or equivalent) when building the remote command — never naive string-concatenate user text into the SSH command.
Fail loud on a bad directory. If a room maps to a missing dir, the wrapper exits non-zero (cd "$1" || exit 1) and the bot reports the failure back into the room — never launches Claude in the wrong place.
Config over code for the room→repo mapping.
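The quoting and fail-loud guardrails can be sketched together. This is an illustration under stated assumptions, not the code in src/bot.py: the helper names are hypothetical, and the sketch assumes the bot passes an argument list to `subprocess` (no local shell), so only the Mac-side shell parses the command string.

```python
import shlex
from typing import List, Optional


def build_ssh_argv(alias: str, script: str, repo_dir: str, prompt: str) -> List[str]:
    """Build an argv for subprocess with user text quoted for the remote shell."""
    # ssh hands the remote command to the remote user's shell as one string,
    # so every piece is quoted individually before joining.
    remote = " ".join(shlex.quote(part) for part in (script, repo_dir, prompt))
    return ["ssh", alias, remote]


def failure_message(returncode: int, stderr: str) -> Optional[str]:
    """Turn a non-zero wrapper exit into text the bot can post into the room."""
    if returncode == 0:
        return None
    detail = stderr.strip() or "no stderr"
    return f"launch failed (exit {returncode}): {detail}"
```

A quick way to check the quoting is to round-trip it: `shlex.split` on the remote string should return the original three pieces unchanged, even when the prompt contains quotes, `;`, or `$(...)`.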

Definition of done per phase

Substance threshold N = 3 real uses, defined per phase in ROADMAP.md. "Done" means falsifiable, scaled substance (it worked 3 real times), never a checkbox. A phase that "works once" is not done.

Infra facts (proven — stable reference)

WireGuard (starttunnel) for Mac↔Spark: Mac 10.59.211.5; Spark (spark-32d0, user modelo) 10.59.211.6. The Mac↔Spark seam runs over WireGuard (not the Mac's LAN subnet). The Spark is on the LAN, same as the Start9 host (immense-voyage) — so Spark→Gitea (immense-voyage.local:59916) resolves and works directly.
Spark → Mac: SSH alias mac-bridge → the Mac as user macpro, dedicated key (~/.ssh/id_ed25519 on the Spark, in the Mac's authorized_keys). The Spark host's ~/.ssh/config needs IdentitiesOnly yes because a Host * rule shadows the default key; the container regenerates a clean config from config.toml [mac].
Spark → Gitea (deploy/update path): ~/matrix-bridge is a git clone tracking origin/master (ssh://git@immense-voyage.local:59916/grant/matrix-bridge.git). modelo's ~/.ssh/config pins the deploy key for the Gitea host with IdentitiesOnly yes — without it git offered the wrong key first and Gitea returned Permission denied (publickey). The Spark Control Update button depends on that ssh-config block; flag it if modelo's account is ever rebuilt.
Mac → Spark: no authorized key — direct Mac-initiated Spark ops stay owner-run. (This is not what Phase 3 closes: Spark Control already has its own SSH channel into spark-32d0, so its status/update/restart buttons ride that, not a Mac→Spark key.)
Matrix: homeserver https://matrix.gilliam.ai (StartOS Synapse), bot @agent:matrix.gilliam.ai, device matrix-bridge-bot. The bot reuses the stored access token (.env) — never re-logs in (avoids device churn). No E2EE (D9). Note that bot↔Synapse traffic is clearnet TLS rather than WireGuard, which weakens D9's "transport is already private" rationale.
Mac env: claude lives in ~/.local/bin, on PATH only via ~/.zprofile — so every wrapper is #!/bin/zsh -l (a non-login SSH shell loads neither .zprofile nor .zshrc).
Interactive-launch prereqs: Mac logged into its desktop + a one-time Terminal Automation grant (TCC). If the grant resets, a launch stalls — the bot reports it fail-loud rather than hanging.
Folder-trust gate: the first claude run in a repo where it has never been opened stalls on the folder-trust prompt; repos it has already been used in are trusted. Affects unattended interactive launches and ask mode.

Current state

Live on the Spark (Phases 0–3 + ask mode). matrix-nio bot in a Docker container (~/matrix-bridge, now a Gitea clone tracking master): host networking, restart: unless-stopped, read-only mounts of .env/config.toml/SSH key. Runs as @agent in 11 project rooms + an all-projects fan-out room. Both modes proven — interactive (plain msg → phone via Remote Control) and ask (?-prefix → full answer posted back; D12).
Phase 2 — DONE (owner-confirmed N=3: routes by room_id, correct repo, zero wrong-dir launches).
Phase 3 (Spark Control) — DONE (2026-06-16), shipped in Spark Control v0.21.0. matrix-bridge tile under "Always-on services": live status badge + Update / Restart / Stop-Start / View-logs buttons, running exactly the spec's commands (docker inspect status, docker restart, the git fetch && git reset --hard origin/master && docker compose up -d --build update streamed live with a ~25-min ceiling, docker logs --tail 100). All three exit criteria confirmed (status visible and reflects container, update works, restart works). matrix-bridge needed no code change. Deviation: connects directly as modelo (no sudo -iu wrap — no passwordless sudo here, so the spec's different-user branch never applies); tile auto-hides when its SSH-user field is blank or the container is absent. A LAN-only HTTP API also exists if scripting is ever wanted: POST /api/matrix-bridge/update (+ /{id}/stream SSE), GET /api/matrix-bridge/logs?tail=N, status via GET /api/services.
Open / risks:
- Badge = container liveness only, not Matrix connectivity — a running bot disconnected from Synapse still shows Healthy. Clean fix when "running but silent" bites: a Docker HEALTHCHECK (bot-side liveness signal) so the tile can read {{.State.Health.Status}} — a matrix-bridge-side change; then ping the Spark Control dev to read the health field.
- Update button depends on modelo's Gitea ssh-config pin (IdentitiesOnly yes, see Infra facts) — flag it if modelo's account is ever rebuilt.
- A ?-ask in a repo claude has never opened may stall on the folder-trust gate — add a trust flag to ask-claude.sh if/when hit, not preemptively.
- Cosmetic: a fast docker restart won't visibly flip the badge red (panel re-checks status only after the command returns, container already back up); a full docker stop turns it red within ~5s. Polling cadence, not a bug.
Repo: master == phase-1, clean, pushed to Gitea. No test suite (pre-existing).
