# spark-control

A browser-based control panel for a dual-DGX-Spark vLLM cluster. Designed to run as a [StartOS 0.4](https://docs.start9.com/packaging/0.4.0.x/) package on a Start9 server on the same LAN as the Sparks.

## What it does

- Shows which LLM is currently loaded on the cluster (`:8888/v1/models`).
- Swaps to a different model with one click: stops the current one, launches the new one, and streams logs to the UI until `Application startup complete.` appears.
- Surfaces health for Parakeet (STT, `:8000`) and Magpie (TTS, `:9000`) on Spark 2.

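The first two bullets can be sketched in a few lines of Python. This is an illustration, not the code in `image/`: it assumes `:8888/v1/models` answers with the usual OpenAI-style list (`{"data": [{"id": ...}]}`), and the names `loaded_model_ids` and `stream_until_started` are hypothetical.

```python
import json
from typing import Iterable, Iterator

# Log marker named in the list above.
STARTUP_MARKER = "Application startup complete."


def loaded_model_ids(models_response: str) -> list[str]:
    """Extract model ids from the JSON body of GET :8888/v1/models.

    Assumes the OpenAI-style shape {"data": [{"id": "..."}]}.
    """
    payload = json.loads(models_response)
    return [entry["id"] for entry in payload.get("data", [])]


def stream_until_started(log_lines: Iterable[str]) -> Iterator[str]:
    """Yield log lines up to and including the first one containing the marker."""
    for line in log_lines:
        yield line
        if STARTUP_MARKER in line:
            return


if __name__ == "__main__":
    body = '{"object": "list", "data": [{"id": "example/model"}]}'
    print(loaded_model_ids(body))
    logs = ["loading weights", "INFO: Application startup complete.", "later line"]
    for line in stream_until_started(logs):
        print(line)
```

Stopping on the marker line means a caller can treat the end of the stream as "the new model is up", provided the launch logs actually contain that line.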
## Architecture

```
[Browser/phone] ──► [StartOS reverse proxy] ──► [spark-control container]
                                                            │ (SSH over LAN)
                                                            ▼
                                                        [Spark 1] ──► launch-cluster.sh
                                                                              │
                                                                              ▼
                                                                          [Spark 2]
```

Two layers in this repo:

- `image/`: a self-contained FastAPI app + static UI. Runs anywhere with `uvicorn` and an SSH client. Useful for development.
- `package/`: a thin StartOS 0.4 wrapper that packages the image, exposes the UI on the LAN, and gives the user actions to configure SSH access to the Sparks.

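To make the "SSH over LAN" hop concrete, here is a rough sketch of how a container could assemble the `ssh` invocation that reaches Spark 1. It is not taken from `image/`: the function name `build_ssh_command`, the `./launch-cluster.sh` path, and the choice of `BatchMode` and `ConnectTimeout` options are all assumptions made for illustration.

```python
import shlex


def build_ssh_command(host: str, user: str, key_path: str,
                      remote_command: list[str]) -> list[str]:
    """Build an argv list that runs remote_command on a Spark over SSH."""
    return [
        "ssh",
        "-i", key_path,              # private key to authenticate with
        "-o", "BatchMode=yes",       # never stop to prompt for a password
        "-o", "ConnectTimeout=5",    # fail fast if the Spark is unreachable
        f"{user}@{host}",
        shlex.join(remote_command),  # quote arguments for the remote shell
    ]


if __name__ == "__main__":
    # Hypothetical values; the script location on Spark 1 is an assumption.
    argv = build_ssh_command("192.0.2.10", "spark", "/path/to/key",
                             ["./launch-cluster.sh"])
    print(argv)
```

The list could then be handed to `subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)` so that the remote output is available line by line for streaming to the UI.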
## Quick start (local dev, no StartOS yet)

```bash
cd image
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
# Replace each <placeholder> with your own value. The quotes keep the shell
# from parsing < and > as redirections.
export SPARK1_HOST="<spark-1-ip>"
export SPARK1_USER="<spark-user>"
export SPARK2_HOST="<spark-2-ip>"
export SPARK2_USER="<spark-user>"
# Private key used for SSH to the Sparks
export SSH_KEY_PATH="$HOME/Library/Application Support/NVIDIA/Sync/config/nvsync.key"
# Serve the app on all interfaces, port 9999, with auto-reload
uvicorn app.server:app --host 0.0.0.0 --port 9999 --reload
```

Open <http://localhost:9999>.

> **Note:** use Spark 1's **IP address** (`<spark-1-ip>`), not its mDNS name (`<spark-1-host>.local`). The `.local` name resolves to IPv6 first, and `httpx` hangs on that address because vLLM binds only to IPv4.

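The environment variables and the IPv4 note above could be checked at startup along these lines. This is a sketch under assumptions, not the app's actual config code: `SparkTarget`, `require_ipv4`, and `load_targets` are hypothetical names, and the check is applied to both hosts, which is stricter than the note requires.

```python
import ipaddress
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SparkTarget:
    host: str
    user: str


def require_ipv4(host: str) -> str:
    """Return host unchanged if it is an IPv4 literal, else raise ValueError."""
    try:
        ipaddress.IPv4Address(host)
    except ValueError as exc:
        raise ValueError(
            f"{host!r} is not an IPv4 address; use the Spark's IP, not a .local name"
        ) from exc
    return host


def load_targets(env: Mapping[str, str]) -> dict[str, SparkTarget]:
    """Read SPARK1_*/SPARK2_* from env (for example os.environ)."""
    targets = {}
    for name in ("SPARK1", "SPARK2"):
        targets[name] = SparkTarget(
            host=require_ipv4(env[f"{name}_HOST"]),
            user=env[f"{name}_USER"],
        )
    return targets


if __name__ == "__main__":
    example_env = {
        "SPARK1_HOST": "192.0.2.10", "SPARK1_USER": "spark",
        "SPARK2_HOST": "192.0.2.11", "SPARK2_USER": "spark",
    }
    print(load_targets(example_env))
```

Rejecting a hostname up front turns the silent `httpx` hang described in the note into an immediate, readable error.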
## Build the StartOS package

```bash
cd package
npm i      # one-time
make x86   # produces spark-control_x86_64.s9pk (~55 MB)
```

Requires [`start-cli`](https://docs.start9.com/latest/developer-guide/sdk/installing-the-sdk), Node ≥ 22, and Docker. The build runs `tsc` + `ncc` for the TS bundle, then `docker build` on `image/Dockerfile`, then `start-cli s9pk pack` to produce the `.s9pk`.

To sideload onto your Start9 server, run `make install` (this needs `host:` set in `~/.startos/config.yaml`), or upload the `.s9pk` through the sideload feature in the Start9 web UI.

## Post-install setup (one-time per Start9 install)

1. Open the Spark Control service → **Actions** → **Show Public Key** → copy the public key line.
2. SSH to each Spark and append the line to `~/.ssh/authorized_keys` for the `<spark-user>` user.
3. **Actions** → **Configure Sparks** → enter `<spark-1-ip>` / `<spark-user>` for Spark 1 and `<spark-2-ip>` / `<spark-user>` for Spark 2.
4. Start the service, then open the Web UI. The current model and health status should appear within about 5 s.

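Step 2 is usually done by hand, but if you script it, an idempotent append avoids duplicate entries when the setup is repeated. The helper below is a minimal sketch meant to run on each Spark; `append_authorized_key` is a hypothetical name and is not part of this repo.

```python
from pathlib import Path


def append_authorized_key(authorized_keys: Path, public_key_line: str) -> bool:
    """Append the key if it is not already present.

    Returns True if the file was changed, False if the key was already there.
    """
    key = public_key_line.strip()
    text = authorized_keys.read_text() if authorized_keys.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False
    # Create ~/.ssh if needed, with owner-only permissions.
    authorized_keys.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    # Start on a fresh line if the file lacks a trailing newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with authorized_keys.open("a") as handle:
        handle.write(f"{separator}{key}\n")
    authorized_keys.chmod(0o600)
    return True


if __name__ == "__main__":
    # Paste the line from Show Public Key in place of the placeholder.
    changed = append_authorized_key(
        Path.home() / ".ssh" / "authorized_keys",
        "<public-key-line>",
    )
    print("added" if changed else "already present")
```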
## Repo layout

- `image/`: Docker image source (FastAPI app + `models.yaml`)
- `package/`: StartOS 0.4 package source
- `runbook.md`: operating notes
- `known-issues.md`: known quirks and workarounds
- `LICENSE`: MIT

## Service discovery API

Other services on your LAN can call `GET /api/endpoints` to learn where the current model lives without hardcoding Spark IPs. The response has a stable JSON shape:

```json
{
  "vllm":     { "ready": true,  "base_url": "http://<spark-1-ip>:8888/v1", "model": "RedHatAI/Qwen3.6-35B-A3B-NVFP4", "openai_compat": true },
  "parakeet": { "ready": true,  "base_url": "http://<spark-2-ip>:8000", "kind": "stt", "model": "nvidia/parakeet-tdt-0.6b-v3" },
  "magpie":   { "ready": false, "base_url": "http://<spark-2-ip>:9000", "kind": "tts" }
}
```

`base_url` is filled in once Configure Sparks has been completed, even if the underlying service isn't currently up. Route traffic to a `base_url` only when the same entry also reports `ready: true`.

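A consumer on the LAN might apply the `ready` rule like this. The sketch uses only the standard library; `fetch_endpoints` and `ready_base_url` are hypothetical helper names, and the spark-control address passed to `fetch_endpoints` is whatever your Start9 server exposes.

```python
import json
from typing import Optional
from urllib.request import urlopen


def fetch_endpoints(control_url: str, timeout: float = 5.0) -> dict:
    """GET <control_url>/api/endpoints and decode the JSON body."""
    url = f"{control_url.rstrip('/')}/api/endpoints"
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def ready_base_url(endpoints: dict, service: str) -> Optional[str]:
    """Return the service's base_url only if the entry reports ready: true."""
    entry = endpoints.get(service) or {}
    if entry.get("ready") is True and entry.get("base_url"):
        return entry["base_url"]
    return None


if __name__ == "__main__":
    # Sample payload in the documented shape, with example addresses.
    sample = {
        "vllm": {"ready": True, "base_url": "http://192.0.2.10:8888/v1"},
        "magpie": {"ready": False, "base_url": "http://192.0.2.11:9000"},
    }
    print(ready_base_url(sample, "vllm"))
    print(ready_base_url(sample, "magpie"))
```

Returning `None` for a configured but not-ready service forces the caller to handle the "known address, service down" case explicitly instead of sending requests that will fail.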
## Status

**v0.1**: local-only, single-cluster, no auth (trusts the LAN). Five LLMs in the catalog: qwen3-vl (cluster), gemma4, qwen36, plus two legacy entries. Magpie shows red until its container is fixed.

v0.2 in progress: service-discovery API, Magpie crash fix, Parakeet/Magpie lifecycle, model download driving, spark-vllm-docker update checks, configurable flag tiers.