28c974fe1d
Shipped in Spark Control v0.21.0: status badge + Update/Restart/Stop-Start/Logs
tile. All three exit criteria confirmed. matrix-bridge needed no code change.
- AGENTS.md: Current state + ROADMAP Phase 3 -> DONE; Deploy switched scp -> git
pull (Update button); D10 stamped; new Infra fact for the Spark->Gitea path and
the load-bearing IdentitiesOnly ssh-config pin the Update button depends on.
- spark-control-integration.md: trimmed from dev spec to live contract (dropped
sudo -iu fallback and dev-side scaffolding; folded in direct-as-modelo, the
Gitea key gotcha, restart cadence, and the LAN-only HTTP API).
- README: dropped stale "pre-Phase 0" status; Setup reframed for a fresh install.
Deferred follow-up: badge reflects container liveness only, not Matrix
connectivity; HEALTHCHECK + {{.State.Health.Status}} is the matrix-bridge-side fix.
# Phase 3 — Spark Control integration (live command contract)

**Status: DONE (2026-06-16), shipped in Spark Control v0.21.0.** The matrix-bridge bot has a
tile on the Spark Control dashboard under "Always-on services": a live status badge plus
**Update**, **Restart**, **Stop/Start**, and **View logs** buttons. All three ROADMAP Phase 3
exit criteria are met (status visible and reflects the container; update works; restart works).
matrix-bridge needed no code change.

This document is the **contract**: what each control runs on the Spark, and what the output
means. It is kept as the reference for what the buttons actually do, and for reproducing them by
hand if the dashboard is ever unavailable.

---

## What the bot is

A single Docker container on the DGX Spark.

| Fact | Value |
|---|---|
| Host | `spark-32d0` (`10.59.211.6` on WireGuard), user **`modelo`** |
| Project dir | `/home/modelo/matrix-bridge`, a **Gitea clone tracking `master`** |
| Compose service | `bot` |
| Container name | `matrix-bridge` (fixed via `container_name:`) |
| Image | `matrix-bridge-bot` |
| Lifecycle | host networking, `restart: unless-stopped` (survives Spark reboot) |
| Secrets | `.env`, `config.toml`: **gitignored**, live only on the Spark, never in git |

Spark Control SSHes into `spark-32d0` as **`modelo`** (the same login it already uses for Spark 2),
so these commands ride the existing channel: no new key, and **no `sudo` wrap**. This Spark has no
passwordless sudo, and since the channel is already `modelo` (owner of the dir, member of the
`docker` group), every command runs as the right user directly. (The original spec's
`sudo -iu modelo` different-user fallback therefore never applies here.)
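
When the dashboard is unavailable, the same channel can be used by hand. A minimal sketch, assuming your workstation can reach `spark-32d0` and holds a key that `modelo` accepts; the helper name `spark_run` is hypothetical and not part of Spark Control:

```sh
# Hypothetical helper: run one command on the Spark as modelo, from the
# project dir. No sudo wrap: modelo owns the dir and is in the docker group.
# Arguments are joined with spaces, so quote anything complex as one string.
spark_run() {
  ssh modelo@spark-32d0 "cd /home/modelo/matrix-bridge && $*"
}

# Example (run by hand):
#   spark_run docker restart matrix-bridge
```

Any command from the contract sections can be passed through it unchanged.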

Registration on the Spark Control side: the bot's SSH user is a config field (set to `modelo`),
the host reuses the existing Spark 2 connection, and container / dir / branch use the defaults
(`matrix-bridge` / `~/matrix-bridge` / `master`). The tile auto-hides when that user is blank or
the container is absent, so it stays out of the way on installs that don't run the bot.

---

## One-time prerequisites — DONE

`~/matrix-bridge` was originally loose files copied over with `scp`; it is now a git clone of the
Gitea repo, converted in place. The gitignored `.env` and `config.toml` were untouched, because
`git reset --hard` rewrites only tracked files and leaves untracked, gitignored files alone.

**Load-bearing gotcha, now fixed:** on the Spark, git offered the wrong SSH key first and
Gitea rejected it (`Permission denied (publickey)`) even though the deploy key was correctly
registered. The fix was to pin the deploy key in modelo's `~/.ssh/config` with
`IdentitiesOnly yes` for the Gitea host. **The Update button depends on that block staying in
place; flag it if modelo's account is ever rebuilt.**
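
For orientation, a pin of this kind usually has the shape shown in the comments below, followed by a quick check that it is still in effect. This is a sketch, not a copy of modelo's actual config: host, port, and user are taken from the Gitea remote URL used in this document, while the key path `~/.ssh/gitea_deploy` and the function name `pin_in_effect` are hypothetical.

```sh
# Shape of the pin in modelo's ~/.ssh/config (key path is a placeholder):
#
#   Host immense-voyage.local
#       Port 59916
#       User git
#       IdentityFile ~/.ssh/gitea_deploy
#       IdentitiesOnly yes

# Succeeds if the effective config read from stdin (the output of `ssh -G`)
# has IdentitiesOnly enabled.
pin_in_effect() {
  grep -qix 'identitiesonly yes'
}

# On the Spark, as modelo:
#   ssh -G -p 59916 git@immense-voyage.local | pin_in_effect && echo "pin in place"
```

The check only confirms that `IdentitiesOnly` applies to the Gitea host; it does not verify which key file is offered.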

The conversion, for reference:

```sh
cd /home/modelo/matrix-bridge
git init -b master
git remote add origin ssh://git@immense-voyage.local:59916/grant/matrix-bridge.git
git fetch origin
git reset --hard origin/master   # secrets are gitignored → untouched
git branch --set-upstream-to=origin/master master
```

---

## The contract — commands behind each control

Run from `/home/modelo/matrix-bridge` as `modelo`. Each command is idempotent and fail-loud: a
non-zero exit and its stderr are surfaced on the panel, not swallowed.

### Status (poll for the badge)

```sh
docker inspect -f '{{.State.Status}}|{{.State.StartedAt}}|{{.RestartCount}}' matrix-bridge
```

- `running` → up · `exited` → stopped/crashed · `restarting` → unhealthy/boot-looping ·
  non-zero exit (`No such object: matrix-bridge`) → **not deployed** (tile hides). A climbing
  `RestartCount` while status flips to `restarting` is the crash-loop tell.
- **Badge = container liveness only, not Matrix connectivity:** a bot that's `running` but
  disconnected from Synapse still shows Healthy. See the HEALTHCHECK note below.
- *Cadence note:* a fast `docker restart` won't visibly flip the badge red, because the panel
  re-checks status only after the command returns, by which point the container is already back
  up. A full `docker stop` turns it red within ~5s. This is polling cadence, not a bug.
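
The mapping in the first bullet can be sketched as a small shell function. This is an illustration only: the function names `badge_for` and `poll_badge` and the label strings are hypothetical, and Spark Control's real implementation may differ.

```sh
# Map one line of inspect output (status|startedAt|restartCount) to a label.
badge_for() {
  line=$1
  status=${line%%|*}      # text before the first '|'
  restarts=${line##*|}    # text after the last '|'
  case "$status" in
    running)    echo "up" ;;
    exited)     echo "stopped" ;;
    restarting) echo "crash-looping (restarts: $restarts)" ;;
    *)          echo "unknown: $status" ;;
  esac
}

# Poll once. Any failure of docker inspect is treated as "not deployed",
# matching the non-zero-exit rule above.
poll_badge() {
  if out=$(docker inspect -f '{{.State.Status}}|{{.State.StartedAt}}|{{.RestartCount}}' matrix-bridge 2>/dev/null); then
    badge_for "$out"
  else
    echo "not deployed"
  fi
}
```

Keeping the parsing separate from the `docker` call makes the mapping easy to exercise with canned strings.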

### Logs

```sh
docker logs --tail 100 matrix-bridge
```

### Restart

```sh
docker restart matrix-bridge
```

### Update (pull + rebuild + recreate) — the headline button

```sh
cd /home/modelo/matrix-bridge \
  && git fetch origin \
  && git reset --hard origin/master \
  && docker compose up -d --build
```

`git reset --hard origin/master` is the deploy-box "always match remote" semantic: the checkout
never gets stuck on divergence, and gitignored secrets are preserved. Any local commits or edits
to tracked files on the Spark are discarded. Output is streamed live on the panel with a ~25-min
ceiling; a non-zero exit and its stderr are surfaced. **Workflow: push to Gitea, then click Update.**
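
The claim that gitignored secrets survive the reset can be checked in a throwaway repository. The sketch below touches nothing on the Spark; it resets to `HEAD` rather than `origin/master` so that no remote is needed, and the file names and contents are made up for the demo.

```sh
demo=$(mktemp -d)
cd "$demo"
git init -q .
printf '.env\n' > .gitignore
printf 'v1\n' > app.txt
git add .gitignore app.txt
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m 'initial'

printf 'TOKEN=secret\n' > .env    # ignored, never committed (stands in for real secrets)
printf 'local edit\n' > app.txt   # drift in a tracked file

git reset -q --hard HEAD          # same flag the Update command uses

cat app.txt   # tracked file is back to its committed content
cat .env      # ignored file is still present
```

The tracked file reverts while the ignored one is left alone, which is the behaviour the Update button relies on.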

### Stop / Start

```sh
docker stop matrix-bridge                                # stop
cd /home/modelo/matrix-bridge && docker compose up -d    # start (recreates if needed)
```

---

## Programmatic interface (LAN-only)

The same controls are reachable over HTTP if scripting is ever wanted:

- `POST /api/matrix-bridge/update` → returns an id; `GET .../update/{id}` and
  `.../update/{id}/stream` (SSE) for progress.
- `GET /api/matrix-bridge/logs?tail=N`
- status via `GET /api/services`
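
A thin `curl` wrapper over these endpoints might look like the sketch below. Several things here are assumptions: the base URL is not given in this document, so `SPARK_CONTROL_URL` and its default are placeholders; the function names are hypothetical; and `.../update/{id}` is read as `/api/matrix-bridge/update/{id}`. The response format is not specified here, so the sketch does not try to parse the returned id.

```sh
# Placeholder base URL for the dashboard on the LAN; override before use.
: "${SPARK_CONTROL_URL:=http://spark-control.example}"

# -f: fail on HTTP errors, -sS: quiet but still print errors, -N: no buffering (SSE).
mb_update_start()  { curl -fsS -X POST "$SPARK_CONTROL_URL/api/matrix-bridge/update"; }
mb_update_status() { curl -fsS "$SPARK_CONTROL_URL/api/matrix-bridge/update/$1"; }
mb_update_stream() { curl -fsS -N "$SPARK_CONTROL_URL/api/matrix-bridge/update/$1/stream"; }
mb_logs()          { curl -fsS "$SPARK_CONTROL_URL/api/matrix-bridge/logs?tail=${1:-100}"; }
mb_services()      { curl -fsS "$SPARK_CONTROL_URL/api/services"; }
```

A typical sequence would be `mb_update_start`, read the id from its response, then `mb_update_stream <id>` to follow progress.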

---

## Future enhancement — truer status (not required; matrix-bridge-side)

Status reports container liveness, not Matrix connectivity: the bot can be `running` yet
disconnected from Synapse. A truer signal needs a Docker `HEALTHCHECK` backed by a bot-side
liveness signal (e.g. the bot touches a file or exposes a tiny endpoint on each successful sync
loop), after which Status could read `{{.State.Health.Status}}`. That's a matrix-bridge-side
change: do it if/when "running but silent" actually bites, then tell the Spark Control dev to
read the health field.
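
One way the file-touch variant could look, sketched under assumptions: the bot touches a heartbeat file after each successful sync (the path `/tmp/matrix-bridge.heartbeat` and the 120-second threshold are hypothetical), and the image ships a `stat` that supports `-c %Y` (GNU coreutils or BusyBox). None of this exists in matrix-bridge today.

```sh
# Succeeds if FILE exists and was modified within the last MAX_AGE seconds.
heartbeat_fresh() {
  file=$1
  max_age=$2
  [ -f "$file" ] || return 1
  now=$(date +%s)
  mtime=$(stat -c %Y "$file") || return 1
  [ $((now - mtime)) -le "$max_age" ]
}

# Hypothetical wiring:
#   Dockerfile:   HEALTHCHECK --interval=30s CMD /healthcheck.sh
#   script body:  heartbeat_fresh /tmp/matrix-bridge.heartbeat 120
# Status could then poll:
#   docker inspect -f '{{.State.Health.Status}}' matrix-bridge
```

With a health check defined, Docker reports `starting`, `healthy`, or `unhealthy` in that field, which is closer to "connected and syncing" than plain container state.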