Scaffold matrix-bridge (standards-compliant; pre-Phase 0)

Single-user Matrix -> Claude Code bridge bot, scaffolded from a prior scoping package (SPEC/DECISIONS/CLAUDE/KICKOFF) folded into the current new-project scheme:

- AGENTS.md (canonical) with core flow, stack, placement table, condensed D1-D10 decisions, sovereignty constraint, and Phase 0 as the first milestone
- CLAUDE.md -> AGENTS.md relative symlink; ROADMAP.md (Phases 1-4+, falsifiable exits)
- scripts/launch-claude.sh first-draft Mac wrapper (D4); config.example.toml
- canonical deny-by-default .gitignore + Python ignores

No bot code yet, by design: Phase 0 is manual-chain validation (N=3).
2026-06-14 20:20:17 -05:00
commit 78e0de2e52
7 changed files with 291 additions and 0 deletions
@@ -0,0 +1,136 @@
+# matrix-bridge — AGENTS.md
+
+A single-user Matrix bot that turns a message in a project room into a live Claude Code
+session in that project's repo on the Mac — surfaced to the phone via Claude Code Remote
+Control. It makes the *trigger* portable: from anywhere on the WireGuard network, a Matrix
+message starts a session on the Mac in the correct repo, and Remote Control pushes it to the
+phone to drive interactively. Single user, private home network, no multi-user/product scope.
+
+> **Inbox check:** At session start, if `~/Projects/standards/INBOX.md` exists, scan it for
+> items tagged `(matrix-bridge)` and surface them before proposing next steps; triage with
+> `/triage`.
+
+## Core flow (v1)
+
+```
+Matrix message in a project room
+  → bot (matrix-nio, on the DGX Spark) receives it
+  → looks up which repo that room maps to (explicit config — no classification)
+  → SSHes to the Mac and runs scripts/launch-claude.sh with (repo_dir, message_text)
+  → wrapper cd's into the repo and launches `claude` with the message as the prompt
+  → Claude Code Remote Control (auto-enabled) pushes a notification to the phone
+  → tap in and drive the session from the Claude app
+```
+
+Room determines the repo; the message text becomes the initial prompt. That is the entire
+v1 decision surface.
+
+## Stack
+
+- **Bot:** Python, **matrix-nio** (from the nio-template scaffold), single Docker container.
+- **Runs on:** a DGX Spark (always-on Linux, Docker). *Not* Start9, *not* the Mac.
+- **Mac seam:** `scripts/launch-claude.sh`, a zsh login-shell wrapper that owns all
+  environment setup and launches `claude`.
+- **Config:** a readable room→repo mapping file (TOML) — adding a project is a config edit.
+- **State:** none beyond config in v1; SQLite or flat files only if a later phase needs them.
+
+## Placement
+
+| Question | Decision | Rationale |
+|---|---|---|
+| Sensitivity / sovereignty | Local-only whenever an LLM is involved | v1 makes no LLM call; future intent-parsing must run on a local model via Spark Control, because message content may reference investor/portfolio context. Never wire a frontier API to message payloads. |
+| Runtime shape | Long-running service (always-listening bot) | Must be up unattended to catch messages. |
+| Host | DGX Spark, Docker container | Always-on Linux with Docker; co-located with Qwen3 for future local intent-parsing; reaches both Synapse (network) and the Mac (SSH). |
+| s9pk vs container | Plain container | Not on Start9 at all — StartOS only runs s9pk packages; don't pay packaging cost, don't touch Synapse. |
+| Model routing | None in v1; future Qwen3 via Spark Control | Keeps the sovereignty boundary; deterministic core first. |
+| Data layer | Config file (TOML) | v1 needs no datastore. |
+| Interface | Matrix (Element) + phone via Remote Control | "Reachable from phone" already satisfied by WireGuard + Remote Control. |
+| Repo home | Local + Gitea backup | `ssh://git@immense-voyage.local:59916/grant/matrix-bridge.git`. |
+
+## Commands
+
+- `scripts/launch-claude.sh <repo_dir> <prompt>` — the Mac wrapper (Phase 0 deliverable;
+  validate by hand before any bot code).
+- _TODO (Phase 1+):_ bot build/run (`docker build` / `docker compose up` on the Spark) once
+  `src/` exists.
+
+## Layout
+
+- `AGENTS.md` — this file (canonical; `CLAUDE.md` is a relative symlink to it).
+- `ROADMAP.md` — Phases 1–4+ with falsifiable exits, plus deferred/future directions.
+- `README.md` — human-facing intro.
+- `scripts/launch-claude.sh` — the Mac-side launch wrapper (the only seam that knows the
+  Mac's environment).
+- `config.example.toml` — room→repo mapping template; the real `config.toml` is gitignored.
+- `.claude/` — Claude wiring (dir only for now).
+- _Future:_ `src/` (the matrix-nio bot), `Dockerfile`, dependency manifest — Phase 1.
+
+## Decisions (already made — don't relitigate without new information)
+
+Condensed from the scoping workshop. Each: the call, why, what it beat.
+
+- **D1 — matrix-nio, not Maubot.** Full control for one custom bot with real SSH-orchestration
+  logic; keeps Spark Control as the single dashboard. *Beat:* Maubot (competing web UI,
+  management layer we don't need), SimpleMatrixBotLib.
+- **D2 — Bot runs on the Spark, not Start9 or the Mac.** Always-on Linux + Docker, co-located
+  with Qwen3, reaches Synapse + the Mac. *Beat:* Start9 (no s9pk), Mac (not always-on; it's the
+  execution target, not the orchestrator).
+- **D3 — Synapse stays untouched.** Treat the existing StartOS Synapse as a fixed external
+  homeserver; the bot logs in as an ordinary Matrix user over WireGuard/LAN.
+- **D4 — The Mac wrapper is the environment seam.** A `#!/bin/zsh -l` wrapper owns
+  PATH/credentials/`cd`/`exec claude`; the bot stays dumb and only invokes it over SSH.
+  *Beat:* inlining `source ~/.zprofile && …` from the bot (brittle); relying on the default
+  non-interactive SSH shell (the core failure mode — minimal shell loads neither `.zprofile`
+  nor `.zshrc`).
+- **D5 — Remote Control is the phone-control layer.** Native, E2EE, already auto-enabled;
+  execution stays on the Mac. The bot only needs to *start* the session. *Note:* outside server
+  mode, one remote session per Claude Code instance.
+- **D6 — Room = repo; routing is deterministic in v1.** No classification, no LLM, no path
+  branching. *Beat (for v1):* LLM intent parsing → deferred to D8.
+- **D7 — No Nextcloud / CalDAV in v1.** Not the pain point; the interesting future (routing
+  Claude/bot *outputs* into Nextcloud) is real but unscoped.
+- **D8 — Intent parsing deferred, but as a "routing brain."** When added (Phase 4+): a smart
+  dispatcher that, knowing all repos/contexts, decides which repo applies and what context to
+  inject — not a task-vs-session classifier. MUST run on a local model via Spark Control.
+  *Revisit when:* the deterministic core (Phases 1–2) is proven.
+- **D9 — E2EE deferred (documented tradeoff).** Single-user bot over WireGuard on a private
+  LAN; transport is already private and matrix-nio E2EE adds libolm overhead. *Revisit when:*
+  the bot ever handles sensitive content over untrusted transport.
+- **D10 — Spark Control manages the bot (Phase 3).** Status on the dashboard + one-click
+  update/restart, the same SSH-behind-buttons pattern Spark Control uses for the Sparks today.
+
+## Sovereignty constraint
+
+v1 sends nothing to external services except what is deliberately typed into the Claude Code
+session itself. The bot's own logic is fully local. **When intent parsing is added later it
+MUST run on a local model via Spark Control — never a frontier API** — because it reads
+message content that may reference investor/LP/portfolio context. Never wire an external API
+call that carries message payloads.
+
+## Implementation guardrails (from the workshop)
+
+- **Quoting through SSH is the known footgun.** Message text crosses two shells (the Spark's,
+  then the Mac's). Use `shlex.quote` (or equivalent) when building the remote command; never
+  naively concatenate user text into the SSH command string.
+- **Fail loud on a bad directory.** If a room maps to a missing dir, the wrapper exits
+  non-zero (`cd "$1" || exit 1`) and the bot reports the failure back into the room; Claude
+  is never launched in the wrong place.
+- **Config over code** for the room→repo mapping.
+
+## Definition of done per phase
+
+Substance threshold **N = 3** real uses, defined per phase in `ROADMAP.md`. "Done" means
+falsifiable, scaled substance (it worked 3 real times), never a checkbox. A phase that "works
+once" is not done.
+
+## Current state
+
+- **Scaffolded 2026-06-15** from a prior scoping package (SPEC/DECISIONS/CLAUDE/KICKOFF),
+  folded into this AGENTS.md (decisions + placement), `ROADMAP.md` (phases), and the wrapper +
+  config skeleton. No bot code yet — by design.
+- **Next: Phase 0 (manual chain validation, N=3).** Matrix onboarding (Element, Space, first
+  room, a bot user); write and locally test `scripts/launch-claude.sh`; set up passwordless SSH
+  from the Spark to the Mac; prove the full chain (message → SSH → wrapper → Claude session →
+  phone notification I can drive) by hand at least 3 times; and record the first room→repo
+  mapping. Bot code starts only after Phase 0 is proven. The original KICKOFF prompt is the
+  step-by-step guide for Phase 0.
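
The room→repo lookup and the SSH quoting guardrail described in AGENTS.md above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the bot's implementation (no bot code exists yet): `ROOM_TO_REPO`, `MAC_HOST`, `WRAPPER`, and `build_ssh_argv` are hypothetical names, and the room ID, host alias, and paths are placeholders. It assumes the bot calls `ssh` with an argument list (no local shell), so only the Mac's shell parses the command string, and `shlex.join` quotes each piece for that shell.

```python
import shlex

# Hypothetical room -> repo mapping; the real bot would load this from config.toml.
ROOM_TO_REPO = {
    "!project-a:example.org": "/Users/me/Projects/project-a",
}

MAC_HOST = "mac"  # assumed SSH host alias for the Mac
# Assumed absolute path to the wrapper on the Mac (placeholder).
WRAPPER = "/Users/me/Projects/matrix-bridge/scripts/launch-claude.sh"


def build_ssh_argv(room_id: str, message: str) -> list[str]:
    """Return the argv for `ssh` that launches the wrapper for this room."""
    repo_dir = ROOM_TO_REPO.get(room_id)
    if repo_dir is None:
        # Deterministic routing: an unmapped room is an error, never a guess.
        raise KeyError(f"no repo mapped for room {room_id!r}")
    # ssh hands the remote command to the Mac's shell as one string, so every
    # piece is quoted; shlex.join applies shlex.quote to each element.
    remote_command = shlex.join([WRAPPER, repo_dir, message])
    return ["ssh", MAC_HOST, remote_command]
```

Passing the returned list to `subprocess.run(argv)` without `shell=True` keeps the Spark's shell out of the path entirely; a non-zero return code (for example from the wrapper's `cd "$1" || exit 1`) is what the bot would report back into the room.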