Keysat 7a39fec229 Update docs: Phase 1 bot status, run/deploy commands, headless-ask roadmap

- AGENTS.md: Commands now has the bot run/deploy (venv + scp from Mac); Layout lists
  src/bot.py, gui-launch.sh, requirements.txt, .env.example; Current state refreshed to
  Phase 1 (sub-steps 1-3 proven on the Spark; next = containerize).
- ROADMAP.md: log headless "ask" mode (claude -p -> output back into the room).

2026-06-15 14:52:34 -05:00


ROADMAP — matrix-bridge

Phased build plan. Near-term status lives in AGENTS.md → ## Current state; this file is the longer arc. Substance threshold is N = 3 real uses per phase — exits are falsifiable (it worked 3 real times), never checkboxes.

Phase 0 (the current first milestone) lives in AGENTS.md → ## Current state. It writes no bot code: it lays the foundation and proves the manual launch chain by hand. The phases below are what comes after it.

Phase 1 — Single-room bot

matrix-nio bot in a container on the Spark, logged in as a bot Matrix user.
One hardcoded room → one repo. Any message in it spawns a session via the Mac wrapper.
Carry over from Phase 0's proven launch chain (ssh mac-bridge → gui-launch.sh → launch-claude.sh):
- Bake the SSH key + mac-bridge config into the container (modelo's ~/.ssh won't exist there).
- Named sessions for the phone app. Pass claude -n "<repo> — <topic>" so the Remote Control conversation index is readable (project + topic). Bot derives <topic> from the message; confirm whether the app labels off -n or --remote-control <name>. Plumb a name arg through the wrappers.
- Quote-safe message passing. Bot builds the SSH command with shlex.quote; gui-launch.sh already isolates the osascript/shell layers via a %q temp script — stress-test with hostile text.
- Fail loud, not silent. Detect a stalled launch (untrusted-repo trust gate, or a reset Terminal Automation grant) and report it back into the room instead of hanging.
Exit (falsifiable): 3 consecutive real messages each correctly launch a drivable session on the phone.
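The quote-safe and fail-loud items above can be sketched together. This is a minimal illustration, not the code in src/bot.py. The argument order passed to gui-launch.sh, the 30-second timeout, and the function names are all assumptions. It also assumes the wrapper blocks until the session is up, which the roadmap does not state; if the wrapper returns immediately, stall detection needs a different signal.

```python
import shlex
import subprocess

SSH_HOST = "mac-bridge"     # SSH alias named in the roadmap
LAUNCHER = "gui-launch.sh"  # wrapper named in the roadmap; its argument order is assumed


def build_launch_argv(repo: str, message: str) -> list[str]:
    """Return a local argv for ssh whose remote command is one quoted string."""
    # ssh joins its trailing arguments with spaces and hands the result to the
    # remote shell, so every remote word is quoted for that shell.
    remote_cmd = shlex.join([LAUNCHER, repo, message])
    # The local side is an argv list, so no local shell ever parses the text.
    return ["ssh", SSH_HOST, remote_cmd]


def launch(repo: str, message: str, timeout_s: float = 30.0, run=subprocess.run) -> str:
    """Run the launch and return a status line suitable for posting to the room."""
    argv = build_launch_argv(repo, message)
    try:
        result = run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Hypothetical wording; the two causes are the ones the roadmap lists.
        return f"launch stalled after {timeout_s:.0f}s (trust gate or Automation grant?)"
    if result.returncode != 0:
        return f"launch failed (exit {result.returncode}): {result.stderr.strip()}"
    return "launch ok"
```

The `run` parameter exists only so the stall path can be exercised without a real SSH connection. Round-tripping the remote command through `shlex.split` is a cheap first check for the hostile-text stress test, though it only covers the SSH layer, not the osascript layer inside gui-launch.sh.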

Phase 2 — Multi-room routing

Room → repo mapping table; the bot routes by room_id (config over code).
Exit (falsifiable): 3 real uses across ≥2 rooms, correct repo every time, zero wrong-directory launches.
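A rough sketch of what "config over code" routing could look like. The JSON format, the room IDs, the repo paths, and the function names are hypothetical; the roadmap only fixes the idea of a room_id to repo table.

```python
import json
from typing import Dict, Optional

# Hypothetical config contents; real room IDs and paths would live in a file.
EXAMPLE_CONFIG = """
{
  "!aaaaaaaa:example.org": "/path/to/repo-one",
  "!bbbbbbbb:example.org": "/path/to/repo-two"
}
"""


def load_routes(config_text: str) -> Dict[str, str]:
    """Parse the mapping table and reject anything that is not str -> str."""
    routes = json.loads(config_text)
    if not isinstance(routes, dict):
        raise ValueError("routing config must be a JSON object")
    for room_id, repo in routes.items():
        if not isinstance(room_id, str) or not isinstance(repo, str):
            raise ValueError(f"bad route entry for {room_id!r}")
    return routes


def repo_for_room(routes: Dict[str, str], room_id: str) -> Optional[str]:
    """Return the repo for a room, or None so unknown rooms launch nothing."""
    return routes.get(room_id)
```

Returning `None` for an unmapped room, rather than falling back to a default repo, is one way to keep the "zero wrong-directory launches" exit condition checkable.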

Phase 3 — Spark Control integration

Bot container status surfaced on the Spark Control dashboard.
One-click update (pull + restart) wired the same way Spark Control drives the Sparks today (SSH/commands behind a button).
Exit (falsifiable): bot status is visible and the bot can be updated/restarted from the panel.

Phase 4+ — Future direction (documented, not yet scoped to build)

Intent-routing brain (D8). Qwen3 via Spark Control as a smart dispatcher: given knowledge of all repos/contexts, parse a freeform message and decide which repo/context applies and what context to inject — not a task-vs-session classifier. MUST run on a local model. Depends on the deterministic core (Phases 1–2) working first; the architecture must not foreclose it.
Thread-based session continuity. A Matrix thread = a distinct session/sub-context within a repo. The first natural extension after multi-room routing.
Nextcloud / CalDAV output integration. Routing Claude/bot outputs into Nextcloud (Matrix ↔ Claude ↔ Nextcloud). Real interest, unscoped — not until Nextcloud Tasks/CalDAV is actually in use.
E2EE (D9). Add matrix-nio end-to-end encryption (libolm) if the bot ever handles sensitive content over untrusted transport. Low priority while everything is WireGuard-local.
Headless "ask" mode: return output into the chat (no interactive session). Today a message opens an interactive session surfaced to the phone. Add a mode where a message instead runs claude -p "<prompt>" headlessly in the repo (full Claude Code context, but one-shot), captures stdout, and posts the result back into the Matrix room. That makes Matrix a request/response interface, not just a trigger.
- Design notes: claude -p (print mode) is exactly this capability. Likely uses the long-lived OAuth token (Approach A / D11) so it runs over plain SSH with no GUI Terminal and stdout is captured directly.
- Open questions: how to select interactive vs. ask (per-room? a prefix like ? or /ask? a dedicated room?); how to handle long output (truncate / thread / attach file).
- Constraints: the same local-only sovereignty constraints apply (output is the user's own; claude -p uses the subscription, no frontier API on message payloads).
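One possible shape for the ask path, as a sketch only. It picks the `/ask` prefix and plain truncation from the open questions purely for illustration; neither is decided. The 4000-character cap, the function names, and the assumption that `claude` is on the PATH of a non-interactive SSH shell are all hypothetical.

```python
import shlex
from typing import List, Optional

ASK_PREFIX = "/ask "     # assumption: one of the selection options still under discussion
MAX_REPLY_CHARS = 4000   # assumption: arbitrary cap, not a Matrix limit


def parse_ask(body: str) -> Optional[str]:
    """Return the prompt if the message is an ask request, else None."""
    if not body.startswith(ASK_PREFIX):
        return None
    prompt = body[len(ASK_PREFIX):].strip()
    return prompt or None


def build_ask_argv(host: str, repo: str, prompt: str) -> List[str]:
    """Build an ssh argv that runs claude -p one-shot inside the repo."""
    # Both the repo path and the prompt are quoted for the remote shell.
    remote_cmd = f"cd {shlex.quote(repo)} && claude -p {shlex.quote(prompt)}"
    return ["ssh", host, remote_cmd]


def truncate_reply(text: str, limit: int = MAX_REPLY_CHARS) -> str:
    """Cap the reply length and mark the cut so truncation is visible in the room."""
    if len(text) <= limit:
        return text
    marker = "\n[truncated]"
    return text[: limit - len(marker)] + marker
```

The argv from `build_ask_argv` could be handed to `subprocess.run(..., capture_output=True, text=True, timeout=...)`, with `truncate_reply(result.stdout)` posted back into the room.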