matrix-bridge/ROADMAP.md

# ROADMAP — matrix-bridge

Phased build plan. Near-term status lives in `AGENTS.md` → `## Current state`; this file is
the longer arc. The substance threshold is **N = 3** real uses per phase: a phase exits only when
it has worked 3 real times (a falsifiable test), never by ticking checkboxes.

Phase 0 (the current first milestone) is tracked in `AGENTS.md` → `## Current state`. It writes
no bot code: it lays the foundation and proves the manual launch chain by hand. The phases below
are what comes after it.

---

## Phase 1 — Single-room bot

- matrix-nio bot in a container on the Spark, logged in as a bot Matrix user.
- One hardcoded room → one repo. Any message in it spawns a session via the Mac wrapper.
- Carry over from Phase 0's proven launch chain (`ssh mac-bridge → gui-launch.sh → launch-claude.sh`):
  - **Bake the SSH key + `mac-bridge` config into the container** (modelo's `~/.ssh` won't exist there).
  - **Named sessions for the phone app.** Pass `claude -n "<repo> — <topic>"` so the Remote Control
    conversation index is readable (project + topic). Bot derives `<topic>` from the message; confirm
    whether the app labels off `-n` or `--remote-control <name>`. Plumb a name arg through the wrappers.
  - **Quote-safe message passing.** Bot builds the SSH command with `shlex.quote`; `gui-launch.sh`
    already isolates the osascript/shell layers via a `%q` temp script — stress-test with hostile text.
  - **Fail loud, not silent.** Detect a stalled launch (untrusted-repo trust gate, or a reset Terminal
    Automation grant) and report it back into the room instead of hanging.
- **Exit (falsifiable):** 3 consecutive real messages each correctly launch a drivable
  session on the phone.
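
As a rough illustration of the quote-safe passing and session-naming items above, the bot could assemble its SSH call along these lines. This is only a sketch: the positional argument order passed to `gui-launch.sh` (repo, session name, message) and the `derive_topic` heuristic are assumptions, not the wrappers' confirmed interface.

```python
import shlex

SSH_HOST = "mac-bridge"        # host alias from the launch chain above
WRAPPER = "gui-launch.sh"      # wrapper from the launch chain; its arguments are assumed
NAME_SEPARATOR = " \u2014 "    # separator used in the "<repo> <sep> <topic>" session name


def derive_topic(message: str, max_words: int = 6) -> str:
    """Hypothetical heuristic: first few words of the message's first line."""
    lines = message.strip().splitlines()
    first_line = lines[0] if lines else ""
    words = first_line.split()[:max_words]
    return " ".join(words) or "untitled"


def build_launch_argv(repo: str, message: str) -> list[str]:
    """Return an argv list suitable for subprocess.run (no local shell involved).

    ssh hands the final argument to the remote shell as one string, so every
    remote word is quoted individually with shlex.quote (via shlex.join).
    """
    name = f"{repo}{NAME_SEPARATOR}{derive_topic(message)}"
    remote_command = shlex.join([WRAPPER, repo, name, message])
    return ["ssh", SSH_HOST, remote_command]
```

Passing the result to `subprocess.run(argv)` as a list keeps the local side shell-free, so the only shell that parses the message is the remote one, and it sees a single quoted word. Splitting the remote command back with `shlex.split` is a cheap way to stress-test hostile text.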

## Phase 2 — Multi-room routing

- Room → repo mapping table; the bot routes by `room_id` (config over code).
- **Exit (falsifiable):** 3 real uses across ≥2 rooms, correct repo every time, zero
  wrong-directory launches.
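
A minimal sketch of the config-over-code routing described above, assuming the mapping is stored as a JSON object of `room_id` to absolute repo path. The file format, the room IDs, and the paths are all placeholders, not decided details.

```python
import json
from typing import Optional

# Placeholder config; real room IDs and repo paths would live in a config file.
EXAMPLE_CONFIG = """
{
  "!aaaa:example.org": "/srv/repos/matrix-bridge",
  "!bbbb:example.org": "/srv/repos/notes"
}
"""


def load_routes(config_text: str) -> dict[str, str]:
    """Parse the room -> repo table and reject malformed entries up front."""
    routes = json.loads(config_text)
    if not isinstance(routes, dict):
        raise ValueError("routing config must be a JSON object")
    for room_id, repo in routes.items():
        if not room_id.startswith("!"):
            raise ValueError(f"not a Matrix room ID: {room_id!r}")
        if not isinstance(repo, str) or not repo.startswith("/"):
            raise ValueError(f"repo path must be absolute: {repo!r}")
    return routes


def route(room_id: str, routes: dict[str, str]) -> Optional[str]:
    """Return the repo for a room, or None so unknown rooms never launch anything."""
    return routes.get(room_id)
```

Returning `None` for an unmapped room, rather than falling back to a default directory, is one way to keep the "zero wrong-directory launches" exit criterion testable.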

## Phase 3 — Spark Control integration

- Bot container status surfaced on the Spark Control dashboard.
- One-click update (pull + restart) wired the same way Spark Control drives the Sparks today
  (SSH/commands behind a button).
- **Exit (falsifiable):** bot status is visible and the bot can be updated/restarted from the
  panel.

## Phase 4+ — Future direction (documented, not yet scoped to build)

- **Intent-routing brain (D8).** Qwen3 via Spark Control as a smart dispatcher: given
  knowledge of all repos/contexts, parse a freeform message and decide *which* repo/context
  applies and *what* context to inject — not a task-vs-session classifier. MUST run on a local
  model. Depends on the deterministic core (Phases 1–2) working first; the architecture must
  not foreclose it.
- **Thread-based session continuity.** A Matrix thread = a distinct session/sub-context within
  a repo. The first natural extension after multi-room routing.
- **Nextcloud / CalDAV output integration.** Routing Claude/bot *outputs* into Nextcloud
  (Matrix ↔ Claude ↔ Nextcloud). Real interest, unscoped — not until Nextcloud Tasks/CalDAV
  is actually in use.
- **E2EE (D9).** Add matrix-nio end-to-end encryption (libolm) if the bot ever handles
  sensitive content over untrusted transport. Low priority while everything is WireGuard-local.
- **Headless "ask" mode: return output into the chat (no interactive session).** Today a message
  opens an interactive session surfaced to the phone. Add a mode where a message instead runs
  `claude -p "<prompt>"` headlessly in the repo (full Claude Code context, but one-shot), captures
  stdout, and posts the result back into the Matrix room. Matrix becomes a request/response
  interface, not just a trigger.
  - *Design notes:* `claude -p` (print mode) is exactly this capability. Likely uses the long-lived
    OAuth token (Approach A / D11) so it runs over plain SSH with no GUI Terminal and stdout is
    captured directly.
  - *Open Qs:* how to select interactive-vs-ask (per-room? a prefix like `?` / `/ask`? a dedicated
    room?); output-length handling (truncate / thread / attach file).
  - The same local-only sovereignty constraints apply (output is the user's own; `claude -p` uses
    the subscription, no frontier API on message payloads).
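
To make the headless "ask" idea concrete, here is one possible shape for it, picking the prefix option (`?` / `/ask`) and plain truncation from the open questions purely for illustration. The character cap, the timeout, and the `runner` injection point are hypothetical, and token handling for the SSH session is left out entirely.

```python
import shlex
import subprocess

SSH_HOST = "mac-bridge"            # host alias from the Phase 0 launch chain
MAX_REPLY_CHARS = 4000             # hypothetical cap; output handling is still open
TRUNCATION_MARKER = "\n[truncated]"


def split_mode(message: str) -> tuple[str, str]:
    """Classify a message as ("ask", prompt) or ("interactive", text) by prefix."""
    text = message.strip()
    if text.startswith("?"):
        return "ask", text[1:].strip()
    parts = text.split(None, 1)
    if parts and parts[0] == "/ask":
        return "ask", parts[1].strip() if len(parts) > 1 else ""
    return "interactive", text


def truncate_reply(text: str, limit: int = MAX_REPLY_CHARS) -> str:
    """Cut long output to `limit` characters, marking the cut."""
    if len(text) <= limit:
        return text
    return text[: limit - len(TRUNCATION_MARKER)] + TRUNCATION_MARKER


def run_ask(prompt: str, repo_dir: str, runner=subprocess.run, timeout_s: int = 300) -> str:
    """Run `claude -p <prompt>` in the repo over SSH and return text for the room."""
    remote = f"cd {shlex.quote(repo_dir)} && {shlex.join(['claude', '-p', prompt])}"
    try:
        result = runner(["ssh", SSH_HOST, remote],
                        capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return f"ask timed out after {timeout_s}s"
    if result.returncode != 0:
        # Report failures into the room instead of staying silent.
        return f"ask failed (exit {result.returncode}): {result.stderr.strip()}"
    return truncate_reply(result.stdout.strip())
```

The `runner` parameter exists only so the function can be exercised without a real SSH connection; in the bot it would default to `subprocess.run`.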