Compare commits

3 Commits

Author SHA1 Message Date
Keysat a7529eb0b7 Containerize Phase 1 bot: Docker deployment on the Spark
Add Dockerfile, docker-compose.yml, docker-entrypoint.sh, and .dockerignore
so the bot runs detached and survives reboots, replacing the foreground venv run.

The image is generic (no secrets/deployment specifics baked in): host networking
reaches both Synapse and the Mac; .env, config.toml, and the SSH key are mounted
read-only. The entrypoint is the container's environment seam (D4 analog of
launch-claude.sh) — it generates ~/.ssh/config for the mac-bridge alias from
config.toml [mac] (new hostname/user fields) so the bot's `ssh mac-bridge` stays
unchanged. The SSH key is mounted, not baked in; the first connect uses accept-new host trust.

Proven live on the Spark: container connects to Synapse and real messages launched
drivable sessions on the phone across 2 rooms via the full chain.
2026-06-15 18:40:05 -05:00
Keysat 7a39fec229 Update docs: Phase 1 bot status, run/deploy commands, headless-ask roadmap
- AGENTS.md: Commands now has the bot run/deploy (venv + scp from Mac); Layout lists
  src/bot.py, gui-launch.sh, requirements.txt, .env.example; Current state refreshed to
  Phase 1 (sub-steps 1-3 proven on the Spark; next = containerize).
- ROADMAP.md: log headless "ask" mode (claude -p -> output back into the room).
2026-06-15 14:52:34 -05:00
Keysat 76d8a001b1 Add Phase 1 matrix-nio bot (listener + launch + fail-loud)
- src/bot.py: log in as the bot user with the stored token, prime past history,
  and on a new message in a mapped room run `ssh -> gui-launch.sh` (built with
  shlex.quote). The all-projects room fans out to every repo, each session named
  "<repo> - <date>". Launch failures are reported back into the room.
- scripts/gui-launch.sh: propagate MB_SESSION_NAME into the launched session.
- requirements.txt: matrix-nio.

Connectivity (sub-step 1) verified on the Mac; launch (sub-step 2/3) to be tested
on the Spark, where the SSH alias resolves.
2026-06-15 14:34:15 -05:00
10 changed files with 314 additions and 7 deletions
+21
@@ -0,0 +1,21 @@
# Keep the build context minimal and the image generic/secret-free.
# .env, config.toml, and the SSH key arrive via read-only mounts at runtime — never baked in.
.env
.env.*
!.env.example
config.toml
.git
.venv/
venv/
__pycache__/
*.py[cod]
*.egg-info/
# Mac-side launch scripts run on the Mac, not in this container.
scripts/
# Docs / OS cruft — not needed in the image.
*.md
.claude/
.DS_Store
+52 -7
@@ -51,8 +51,18 @@ v1 decision surface.
- `scripts/launch-claude.sh <repo_dir> <prompt>` — the Mac wrapper (Phase 0 deliverable;
validate by hand before any bot code).
- _TODO (Phase 1+):_ bot build/run (`docker build` / `docker compose up` on the Spark) once
`src/` exists.
- **Bot (Phase 1), containerized on the Spark — preferred:** from `~/matrix-bridge`,
`docker compose up -d --build` (host networking, `restart: unless-stopped` so it survives
reboots; read-only mounts of `.env`/`config.toml`/SSH key). Logs: `docker compose logs -f`.
The entrypoint generates `~/.ssh/config` for the `mac-bridge` alias from `config.toml [mac]`
(`hostname`/`user`), so the alias resolves inside the container. Override the host key path with
`MB_SSH_KEY_HOST` if it isn't `/home/modelo/.ssh/id_ed25519`.
- **Bot — venv (dev/fallback):** `python3 -m venv .venv && .venv/bin/pip install -r requirements.txt`,
then `.venv/bin/python src/bot.py` — uses modelo's host `~/.ssh/config` for the alias.
`MB_SSH_ALIAS` overrides the SSH target for testing.
- **Deploy:** pull the bot files from the Mac (no Gitea needed) —
`scp mac-bridge:/Users/macpro/Projects/matrix-bridge/{Dockerfile,docker-compose.yml,docker-entrypoint.sh,requirements.txt,config.toml,.env} .`
and `scp -r mac-bridge:/Users/macpro/Projects/matrix-bridge/src .`, then rebuild.
## Layout
@@ -62,8 +72,17 @@ v1 decision surface.
- `scripts/launch-claude.sh` — the Mac-side launch wrapper (the only seam that knows the
Mac's environment).
- `config.example.toml` — room→repo mapping template; the real `config.toml` is gitignored.
- `scripts/gui-launch.sh` — opens the desktop Terminal via `osascript` (Approach B, D11); calls
`launch-claude.sh` inside it. The bot invokes this over SSH.
- `src/bot.py` — the matrix-nio bot (Phase 1): listens in mapped rooms; on a message runs
`ssh mac-bridge gui-launch.sh`; fans out for all-projects; reports failures back to the room.
- `requirements.txt` (matrix-nio) · `.env.example` (credential schema; real `.env` gitignored).
- `.claude/` — Claude wiring (dir only for now).
- _Future:_ `src/` (the matrix-nio bot), `Dockerfile`, dependency manifest — Phase 1.
- `Dockerfile` · `docker-compose.yml` · `docker-entrypoint.sh` · `.dockerignore` — the Phase 1
container (Spark). Generic image (no secrets/deployment specifics baked in); host networking;
read-only mounts of `.env`/`config.toml`/SSH key. The entrypoint generates `~/.ssh/config` for
the `mac-bridge` alias from `config.toml [mac]` — the container's environment seam (D4 analog
of `launch-claude.sh`).
## Decisions (already made — don't relitigate without new information)
@@ -180,7 +199,33 @@ once" is not done.
Added session naming: `launch-claude.sh` now runs `claude -n "<repo> - <topic>"` (topic from the
message, overridable via `$MB_SESSION_NAME`) so Remote Control's phone index is readable —
confirmed `-n` drives the phone app's conversation label.
- **Next: Phase 1 — the matrix-nio bot.** Container on the Spark, logged in as `@agent` (token in
`.env`), listening in the 9 mapped rooms; on a message it runs `ssh mac-bridge gui-launch.sh
<repo_dir> <message>` (built with `shlex.quote`) and reports failures back to the room. See
ROADMAP Phase 1 (also: bake key+config into the image, curated `$MB_SESSION_NAME` topic, fail-loud).
- **Phase 1 — bot working, sub-steps 1-3 PROVEN (2026-06-15).** `src/bot.py` (matrix-nio) logs in
as `@agent` with the stored token, listens in all 12 rooms, and on a message runs
`ssh mac-bridge gui-launch.sh <repo> <message>` (via `shlex.quote`), replies in-room, fans out
for `#all-projects` (each session named `<repo> - <date>`), and reports failures back (fail-loud).
Tested on the **Spark** (`~/matrix-bridge`, venv) — launches worked across several rooms (N=3).
Now 11 project rooms + all-projects; `config.toml` has a `[mac]` section (ssh_alias + launcher).
- **Phase 1 — DONE: containerized + proven on the Spark (2026-06-15).** The bot runs as a Docker
container on the Spark (`~/matrix-bridge`, `docker compose up -d --build`): generic image
(`python:3.12-slim` + `openssh-client`), host networking, `restart: unless-stopped` (survives
reboots), read-only mounts of `.env`/`config.toml`/SSH key. `docker-entrypoint.sh` generates
`~/.ssh/config` for `mac-bridge` from `config.toml [mac]` (added `hostname`=`10.59.211.5`,
`user`=`macpro`) — the container's env seam (D4 analog of `launch-claude.sh`); SSH key mounted
not baked; first connect uses `StrictHostKeyChecking=accept-new` (private-WireGuard tradeoff, D9).
*Proven live:* container connects to Synapse (`listening as @agent… 11 rooms`) and real messages
in **2 different rooms** each launched a drivable session on the phone via the full chain
(container → `ssh mac-bridge` → `gui-launch.sh` → `claude` → phone), rc=0 — confirming the new
container→Mac SSH hop over WireGuard (mounted key + accept-new host trust). *Formal exit was N=3;
the owner accepted 2 live launches across 2 rooms + the clear repeatable pattern as done.*
Build-time checks on the Mac also passed (image builds, `ssh -G mac-bridge` resolves, entrypoint
perms 700/600).
- **Spark-side ops are owner-run.** The Mac has **no** authorized SSH key into the Spark
(`modelo@10.59.211.6` — reachable over WireGuard but not authenticated; Phase 0 only set up the
reverse, `mac-bridge`). So deploys/restarts on the Spark are run by the owner from the Spark, not
driven from the Mac — until Phase 3 wires it behind Spark Control.
- **Next (open — discuss before building):** Phase 2 (multi-room routing) is effectively already
satisfied — the bot was built multi-room (11 rooms + all-projects) and routed correctly across 2
rooms in the Phase 1 proof; only a formal confirmation pass remains. Live candidates: **Phase 3**
(Spark Control: bot status + one-click update/restart on the dashboard, the SSH-behind-buttons
pattern — also closes the owner-run-ops gap above) or the **headless "ask" mode** from
`ROADMAP.md` (a message runs `claude -p` and posts the answer back into the room).
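The `ssh mac-bridge gui-launch.sh <repo> <message>` step described above depends on `shlex.quote` to get arbitrary message text through the remote shell intact. The following is a minimal sketch of that idea, not the bot's actual code: the helper name `build_remote_command` and the example paths are hypothetical.

```python
import shlex


def build_remote_command(launcher, repo_dir, prompt, session_name=None):
    """Build the single command string that ssh hands to the remote shell."""
    # Quote every piece individually so spaces, quotes, and shell
    # metacharacters in the message are treated as literal text.
    remote = " ".join(shlex.quote(part) for part in (launcher, repo_dir, prompt))
    if session_name:
        # A leading VAR=value assignment applies only to this one command.
        remote = f"MB_SESSION_NAME={shlex.quote(session_name)} {remote}"
    return remote


if __name__ == "__main__":
    cmd = build_remote_command(
        "/Users/example/scripts/gui-launch.sh",  # hypothetical path
        "/Users/example/Projects/demo",          # hypothetical path
        "fix the bug; rm -rf ~ && echo 'done'",
        session_name="demo - 2026-06-15",
    )
    print(cmd)
    # Round trip: a POSIX-style split recovers the message as one argument.
    assert shlex.split(cmd)[-1] == "fix the bug; rm -rf ~ && echo 'done'"
```

Because the whole string is passed to `ssh` as one argument, the remote shell is the only place it gets parsed, which is why each value is quoted before it is joined.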
+27
@@ -0,0 +1,27 @@
# matrix-bridge bot — Phase 1 container.
#
# Runs on the Spark (always-on Linux + Docker). docker-compose uses host networking so the
# bot reaches BOTH Synapse (clearnet TLS) and the Mac (WireGuard, via the `mac-bridge` SSH alias).
#
# The image is GENERIC: no deployment specifics and no secrets are baked in. At runtime
# docker-compose mounts .env, config.toml, and the SSH key (all read-only); the entrypoint
# generates ~/.ssh/config for the alias from config.toml's [mac] section before launching.
FROM python:3.12-slim
# openssh-client: the bot shells out to `ssh mac-bridge ...` (the proven Phase 0 seam).
RUN apt-get update \
    && apt-get install -y --no-install-recommends openssh-client \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ ./src/
COPY docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
# .env and config.toml arrive via read-only mounts at runtime (never baked).
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
CMD ["python", "-u", "src/bot.py"]
+10
@@ -54,3 +54,13 @@ after it.
is actually in use.
- **E2EE (D9).** Add matrix-nio end-to-end encryption (libolm) if the bot ever handles
sensitive content over untrusted transport. Low priority while everything is WireGuard-local.
- **Headless "ask" mode — return output into the chat (no interactive session).** Today a message
opens an interactive session surfaced to the phone. Add a mode where a message instead runs
`claude -p "<prompt>"` headlessly in the repo (full Claude Code context, but one-shot), captures
stdout, and posts the result back into the Matrix room — Matrix as a request/response interface,
not just a trigger. *Design notes:* `claude -p` (print mode) is exactly this capability. Likely
uses the long-lived OAuth token (Approach A / D11) so it runs over plain SSH with no GUI Terminal
and stdout is captured directly. *Open Qs:* how to select interactive-vs-ask (per-room? a prefix
like `?` / `/ask`? a dedicated room?); output-length handling (truncate / thread / attach file);
same local-only sovereignty constraints apply (output is the user's own; `claude -p` uses the
subscription, no frontier API on message payloads).
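One way the "ask" mode's two mechanical pieces could look is sketched below: building the one-shot remote command and trimming captured stdout before posting it. This is an illustration under stated assumptions only. The helper names, the `cd ... &&` form, and the 4000-character limit are hypothetical; the roadmap leaves output-length handling as an open question.

```python
import shlex

MAX_BODY_CHARS = 4000  # assumed limit for illustration; not from the roadmap
NOTICE = "\n[output truncated]"


def build_ask_command(repo_dir, prompt):
    """Remote command for a one-shot run: enter the repo, then `claude -p`."""
    return f"cd {shlex.quote(repo_dir)} && claude -p {shlex.quote(prompt)}"


def truncate_output(text, limit=MAX_BODY_CHARS):
    """Trim captured stdout to at most `limit` characters, marking the cut."""
    text = text.strip()
    if len(text) <= limit:
        return text
    if limit <= len(NOTICE):
        # Too small to fit the notice; just hard-cut.
        return text[:limit]
    return text[: limit - len(NOTICE)] + NOTICE


if __name__ == "__main__":
    print(build_ask_command("/Users/example/Projects/demo", "summarize the README"))
    print(len(truncate_output("x" * 5000)))
```

Truncation is only one of the options the roadmap lists; threading the reply or attaching a file would replace `truncate_output` without changing how the command is built.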
+10
@@ -10,6 +10,16 @@ user = "@matrix-bridge-bot:<your-domain>" # a dedicated bot Matrix account (not
# Credentials (access token or password) come from the environment or a gitignored secret —
# never commit them. The bot reads the homeserver URL + bot creds at startup.
# How the bot reaches the Mac (the proven Phase 0 seam). The bot runs on the Spark,
# where `ssh_alias` resolves; `launcher` is the absolute path to gui-launch.sh on the Mac.
[mac]
ssh_alias = "mac-bridge"
launcher = "/Users/macpro/Projects/<your-repo>/scripts/gui-launch.sh"
# Container only: docker-entrypoint.sh generates ~/.ssh/config for `ssh_alias` from these.
# (On a host with `ssh_alias` already in ~/.ssh/config these are ignored.)
hostname = "10.0.0.0" # the Mac's address reachable from the Spark (e.g. WireGuard IP)
user = "<mac-username>"
# One [[room]] block per project.
# room_id — the internal Matrix room ID (starts with '!'), NOT the human alias (#name:domain)
# repo_dir — an absolute path on the Mac (note: ~/Projects uses a capital P)
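The room rules in the comments above (internal `!` room IDs, absolute `repo_dir` paths) could be checked at load time. The sketch below is an assumption about how such a check might look, not something the bot currently does: `index_rooms`, the sample values, and the `label` key in the sample are illustrative.

```python
try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # same fallback the bot uses on older Pythons

# Hypothetical sample config; every value here is made up.
SAMPLE = """
[mac]
ssh_alias = "mac-bridge"
launcher = "/Users/example/Projects/demo/scripts/gui-launch.sh"

[[room]]
room_id = "!abc123:example.org"
repo_dir = "/Users/example/Projects/demo"
label = "demo"
"""


def index_rooms(cfg):
    """Map room_id -> room table, rejecting aliases and relative paths."""
    rooms = {}
    for room in cfg.get("room", []):
        room_id = room["room_id"]
        if not room_id.startswith("!"):
            raise ValueError(f"room_id must be an internal ID starting with '!': {room_id!r}")
        if not room["repo_dir"].startswith("/"):
            raise ValueError(f"repo_dir must be an absolute path: {room['repo_dir']!r}")
        rooms[room_id] = room
    return rooms


if __name__ == "__main__":
    print(index_rooms(tomllib.loads(SAMPLE)))
```

Failing at startup on a `#alias:domain` value is cheaper than discovering later that messages in that room never match a mapping.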
+19
@@ -0,0 +1,19 @@
# matrix-bridge bot — Phase 1 deployment on the Spark.
#
# `docker compose up -d` runs the bot detached; `restart: unless-stopped` brings it back after
# a Spark reboot. Host networking lets it reach BOTH Synapse (clearnet TLS) and the Mac
# (WireGuard, via the mac-bridge alias the entrypoint generates). The image stays generic — all
# deployment specifics and secrets arrive through the read-only mounts below.
services:
  bot:
    build: .
    image: matrix-bridge-bot
    container_name: matrix-bridge
    network_mode: host
    restart: unless-stopped
    volumes:
      - ./.env:/app/.env:ro
      - ./config.toml:/app/config.toml:ro
      # Dedicated Phase 0 key (spark-control@spark-32d0). Must be chmod 600 on the host.
      # Override the host path with MB_SSH_KEY_HOST if the key lives elsewhere.
      - ${MB_SSH_KEY_HOST:-/home/modelo/.ssh/id_ed25519}:/root/.ssh/id_ed25519:ro
+40
@@ -0,0 +1,40 @@
#!/bin/sh
# matrix-bridge container entrypoint — the container's "environment seam".
#
# Generates ~/.ssh/config for the `mac-bridge` alias from config.toml's [mac] section, then
# execs the bot. This mirrors the Mac side, where launch-claude.sh owns environment setup and
# the bot stays dumb (AGENTS.md D4): SSH-client wiring lives here, not in bot.py. On the Spark
# HOST the bot uses modelo's existing ~/.ssh/config; in the container we recreate just the one
# alias we need, pointing at the mounted key.
set -e
SSH_DIR="$HOME/.ssh"
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"
# Write ~/.ssh/config straight from config.toml [mac] (no eval; values never hit a shell).
# IdentityFile is the in-container mount target (a container constant, see docker-compose.yml).
# StrictHostKeyChecking=accept-new auto-trusts the Mac's host key on first connect — acceptable
# on the private WireGuard network (same transport-trust reasoning as D9) and avoids an
# interactive prompt that would otherwise hang the bot.
MB_SSH_KEY="${MB_SSH_KEY:-$SSH_DIR/id_ed25519}" \
SSH_CONFIG="$SSH_DIR/config" \
KNOWN_HOSTS="$SSH_DIR/known_hosts" \
python - <<'PY'
import os, tomllib
with open("/app/config.toml", "rb") as f:
    mac = tomllib.load(f)["mac"]
config = f"""Host {mac.get('ssh_alias', 'mac-bridge')}
    HostName {mac['hostname']}
    User {mac['user']}
    IdentityFile {os.environ['MB_SSH_KEY']}
    IdentitiesOnly yes
    StrictHostKeyChecking accept-new
    UserKnownHostsFile {os.environ['KNOWN_HOSTS']}
"""
with open(os.environ['SSH_CONFIG'], "w") as f:
    f.write(config)
PY
chmod 600 "$SSH_DIR/config"
exec "$@"
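The entrypoint's embedded Python interpolates the `[mac]` values straight into the SSH config text. A sketch of the same rendering as a standalone, testable function follows. The whitespace guard is an addition for illustration and is not part of the entrypoint; `render_ssh_config` is a hypothetical name.

```python
def render_ssh_config(mac, key_path, known_hosts):
    """Render a one-host ssh_config block from the [mac] table."""
    fields = {
        "ssh_alias": mac.get("ssh_alias", "mac-bridge"),
        "hostname": mac["hostname"],
        "user": mac["user"],
    }
    for name, value in fields.items():
        # Added guard (not in the entrypoint): a value containing a newline
        # or space could smuggle an extra directive into the config file.
        if not isinstance(value, str) or not value or any(ch.isspace() for ch in value):
            raise ValueError(f"[mac].{name} must be a non-empty string without whitespace")
    return (
        f"Host {fields['ssh_alias']}\n"
        f"    HostName {fields['hostname']}\n"
        f"    User {fields['user']}\n"
        f"    IdentityFile {key_path}\n"
        f"    IdentitiesOnly yes\n"
        f"    StrictHostKeyChecking accept-new\n"
        f"    UserKnownHostsFile {known_hosts}\n"
    )


if __name__ == "__main__":
    # Hypothetical values for illustration.
    print(render_ssh_config(
        {"hostname": "10.0.0.2", "user": "example"},
        "/root/.ssh/id_ed25519",
        "/root/.ssh/known_hosts",
    ))
```

Since `config.toml` is owner-controlled and mounted read-only, the guard is defensive rather than essential; it mainly turns a typo into a clear startup error instead of a confusing SSH failure.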
+2
@@ -0,0 +1,2 @@
matrix-nio>=0.24
tomli>=2.0; python_version < "3.11"
+2
@@ -36,6 +36,8 @@ fi
launch_script="$(mktemp -t mb-launch)"
{
  print -r -- '#!/bin/zsh -l'
  # Propagate a caller-supplied session name (the bot sets this for all-projects launches).
  [[ -n "$MB_SESSION_NAME" ]] && printf 'export MB_SESSION_NAME=%q\n' "$MB_SESSION_NAME"
  printf 'exec %q %q %q\n' "$inner" "$repo_dir" "$prompt"
} >| "$launch_script"
chmod +x "$launch_script"
+131
@@ -0,0 +1,131 @@
#!/usr/bin/env python3
"""matrix-bridge bot — Phase 1.

A text message in a mapped room launches a Claude Code session in that repo on the Mac
(ssh -> gui-launch.sh -> launch-claude.sh -> claude), surfaced to the phone by Remote
Control. A message in the all-projects room fans out to every mapped repo (each session
named "<repo> - <date>"). Launch failures are reported back into the room (fail loud).

Runs on the Spark, where the SSH alias resolves. Config: ../config.toml Creds: ../.env
"""
import asyncio
import datetime
import os
import shlex

try:
    import tomllib  # py >= 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # py < 3.11

from nio import AsyncClient, MatrixRoom, RoomMessageText

REPO_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))


def load_env(path):
    env = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                k, v = line.split("=", 1)
                env[k] = v
    return env


def load_config(path):
    with open(path, "rb") as f:
        return tomllib.load(f)


async def main():
    env = load_env(os.path.join(REPO_ROOT, ".env"))
    cfg = load_config(os.path.join(REPO_ROOT, "config.toml"))
    homeserver = env["MATRIX_HOMESERVER"]
    user_id = env["MATRIX_USER"]
    token = env["MATRIX_ACCESS_TOKEN"]
    device_id = env.get("MATRIX_DEVICE_ID", "matrix-bridge-bot")
    rooms = {r["room_id"]: r for r in cfg.get("room", [])}
    all_projects_room = cfg.get("all_projects", {}).get("room_id")
    ssh_alias = os.environ.get("MB_SSH_ALIAS") or cfg["mac"]["ssh_alias"]
    launcher = cfg["mac"]["launcher"]

    client = AsyncClient(homeserver, user_id)
    client.restore_login(user_id=user_id, device_id=device_id, access_token=token)

    async def launch(repo_dir, prompt, session_name=None):
        """Run gui-launch.sh on the Mac over SSH. Returns (returncode, combined_output).

        All user text is passed through shlex.quote so it survives the remote shell —
        this is where the cross-shell quoting footgun is actually solved.
        """
        remote = f"{shlex.quote(launcher)} {shlex.quote(repo_dir)} {shlex.quote(prompt)}"
        if session_name:
            remote = f"MB_SESSION_NAME={shlex.quote(session_name)} " + remote
        proc = await asyncio.create_subprocess_exec(
            "ssh", ssh_alias, remote,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,
        )
        out, _ = await proc.communicate()
        return proc.returncode, out.decode(errors="replace").strip()

    async def say(room_id, text):
        await client.room_send(
            room_id, "m.room.message", {"msgtype": "m.text", "body": text}
        )

    async def launch_one(report_room, repo, prompt, session_name=None):
        rc, out = await launch(repo["repo_dir"], prompt, session_name)
        if rc == 0:
            print(f"launched {repo['label']} -> {repo['repo_dir']}", flush=True)
            return True
        print(f"FAILED {repo['label']}: rc={rc} {out[:300]}", flush=True)
        await say(report_room, f"⚠️ matrix-bridge: failed to launch {repo['label']} "
                               f"(rc={rc}): {out[:300] or 'no output'}")
        return False

    async def on_message(room: MatrixRoom, event: RoomMessageText):
        if event.sender == user_id:
            return  # never react to our own messages
        prompt = event.body.strip()
        if not prompt:
            return
        if room.room_id == all_projects_room:
            date = datetime.date.today().isoformat()
            print(f"[all-projects] fan-out to {len(rooms)} repos: {prompt!r}", flush=True)
            results = await asyncio.gather(*[
                launch_one(room.room_id, r, prompt, f"{r['label']} - {date}")
                for r in rooms.values()
            ])
            await say(room.room_id,
                      f"matrix-bridge: launched {sum(results)}/{len(rooms)} sessions ({date}).")
        elif room.room_id in rooms:
            r = rooms[room.room_id]
            if await launch_one(room.room_id, r, prompt):
                await say(room.room_id,
                          f"matrix-bridge: launched {r['label']} — drive it on your phone.")

    # Prime the sync token past existing history, THEN register the callback, so the bot
    # only reacts to messages that arrive after startup (no backlog replay).
    print("priming sync (skipping backlog)...", flush=True)
    await client.sync(timeout=30000, full_state=False)
    client.add_event_callback(on_message, RoomMessageText)
    who = await client.whoami()
    print(f"listening as {who.user_id}; {len(rooms)} rooms + all-projects={all_projects_room}",
          flush=True)
    try:
        await client.sync_forever(timeout=30000)
    finally:
        await client.close()


if __name__ == "__main__":
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        pass
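The all-projects fan-out above can be exercised without Matrix or SSH. The sketch below isolates the pattern: launch every repo concurrently with `asyncio.gather`, name each session `<repo> - <date>`, and summarize by counting `True` results. `fake_launch` is a hypothetical stand-in for the real SSH launch, and the labels are made up.

```python
import asyncio
import datetime


async def fake_launch(label, session_name):
    """Stand-in for the SSH launch: labels starting with 'bad' pretend to fail."""
    await asyncio.sleep(0)  # yield to the event loop like a real subprocess wait
    return not label.startswith("bad")


async def fan_out(labels, today=None):
    """Launch all labels concurrently and return the summary line."""
    date = (today or datetime.date.today()).isoformat()
    results = await asyncio.gather(*[
        fake_launch(label, f"{label} - {date}") for label in labels
    ])
    # True counts as 1 and False as 0, so sum() is the number of successes.
    return f"launched {sum(results)}/{len(labels)} sessions ({date})."


if __name__ == "__main__":
    summary = asyncio.run(fan_out(["api", "web", "bad-repo"], datetime.date(2026, 6, 15)))
    print(summary)  # launched 2/3 sessions (2026-06-15).
```

This works because each launch reports failure through its return value rather than by raising. If a launch could raise instead, `asyncio.gather` would propagate the first exception by default; passing `return_exceptions=True` is one way to keep a single failure from hiding the other results.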