Grant 2ba3da55b1 0.1.0:3 - Show Public Key layout + /api/endpoints service-discovery

- showPublicKey now uses result.group: install command and raw key are each their own one-click copy box; description is brief
- /api/endpoints returns stable shape { vllm, parakeet, magpie } with base_url + model + ready, for other LAN services to consume without hardcoding Spark IPs
- health.py: parakeet/magpie now also expose base_url
- README: documented /api/endpoints shape

2026-05-12 10:52:57 -05:00

spark-control

A browser-based control panel for a dual-DGX-Spark vLLM cluster. Designed to run as a StartOS 0.4 package on a Start9 server on the same LAN as the Sparks.

What it does

Shows which LLM is currently loaded on the cluster (:8888/v1/models).
Click to swap to a different model: the app stops the current one, launches the new one, and streams logs to the UI until "Application startup complete." appears.
Surfaces health for Parakeet (STT, :8000) and Magpie (TTS, :9000) on Spark 2.
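The log-streaming step above can be sketched as a small generator. This is a minimal illustration, not the project's actual code: the name stream_until_ready is hypothetical, and it assumes log lines arrive as an iterable of strings (for example, read from a subprocess's stdout).

```python
from typing import Iterable, Iterator

# Marker text quoted in this README; treated here as the "model is up" signal.
READY_MARKER = "Application startup complete."


def stream_until_ready(lines: Iterable[str], marker: str = READY_MARKER) -> Iterator[str]:
    """Yield log lines one by one, stopping right after the first line that contains the marker."""
    for line in lines:
        yield line
        if marker in line:
            return


if __name__ == "__main__":
    fake_log = [
        "INFO: loading weights",
        "INFO: Application startup complete.",
        "INFO: this line is never forwarded",
    ]
    for line in stream_until_ready(fake_log):
        print(line)
```

Stopping on the marker line itself (rather than before it) means the UI still shows the line that confirms startup.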

Architecture

[Browser/phone] ──► [StartOS reverse proxy] ──► [spark-control container]
                                                       │  (SSH over LAN)
                                                       ▼
                                                  [Spark 1] ──► launch-cluster.sh
                                                       │
                                                       ▼
                                                  [Spark 2]

Two layers in this repo:

image/ — a self-contained FastAPI app + static UI. Runs anywhere with uvicorn and an SSH client. Useful for development.
package/ — a thin StartOS 0.4 wrapper that packages the image, exposes the UI on the LAN, and gives the user actions to configure SSH access to the Sparks.
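The "SSH over LAN" hop in the diagram can be pictured as the container building an OpenSSH command line and running it against Spark 1. The sketch below is an assumption about how that could look, not the repo's implementation: build_ssh_command is a hypothetical helper, the address is a documentation placeholder, and the remote path to launch-cluster.sh and its arguments are not specified in this README.

```python
import shlex
from typing import List


def build_ssh_command(host: str, user: str, key_path: str, remote_command: str) -> List[str]:
    """Return an argv list for a non-interactive OpenSSH client invocation."""
    return [
        "ssh",
        "-i", key_path,            # private key used to authenticate
        "-o", "BatchMode=yes",     # fail instead of prompting for a password
        "-o", "ConnectTimeout=5",  # give up quickly if the host is unreachable
        f"{user}@{host}",
        remote_command,
    ]


if __name__ == "__main__":
    # Placeholder values; the real host, user, key, and script path come from configuration.
    argv = build_ssh_command("192.0.2.10", "spark", "/path/to/key", "./launch-cluster.sh")
    print(shlex.join(argv))
```

Returning an argv list (instead of one shell string) lets the caller pass it straight to subprocess without local shell quoting issues.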

Quick start (local dev, no StartOS yet)

cd image
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
export SPARK1_HOST=<spark-1-ip>
export SPARK1_USER=<spark-user>
export SPARK2_HOST=<spark-2-ip>
export SPARK2_USER=<spark-user>
export SSH_KEY_PATH="$HOME/Library/Application Support/NVIDIA/Sync/config/nvsync.key"
uvicorn app.server:app --host 0.0.0.0 --port 9999 --reload

Open http://localhost:9999.
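For orientation, here is one way an app could read the five environment variables exported above and fail fast when one is missing. This is a sketch under assumptions: SparkSettings and load_settings are hypothetical names, and the real app may load its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SparkSettings:
    spark1_host: str
    spark1_user: str
    spark2_host: str
    spark2_user: str
    ssh_key_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> SparkSettings:
    """Read the quick-start variables from env; raise if any is unset or empty."""
    # Order matches the dataclass field order above.
    names = ["SPARK1_HOST", "SPARK1_USER", "SPARK2_HOST", "SPARK2_USER", "SSH_KEY_PATH"]
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return SparkSettings(*(env[name] for name in names))
```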

Note: use the IP <spark-1-ip> for Spark 1, not <spark-1-host>.local. The mDNS name resolves to IPv6 first, and httpx hangs on that address because vLLM binds only to IPv4.
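If you want to check which IPv4 address a hostname maps to before putting it in SPARK1_HOST, the standard library can restrict resolution to IPv4. This is only an illustrative helper (resolve_ipv4 is a hypothetical name), not something spark-control is known to do; using the literal IP remains the simplest option.

```python
import socket
from typing import List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the distinct IPv4 addresses for hostname, ignoring IPv6 results."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    addresses: List[str] = []
    for info in infos:
        # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port) for IPv4.
        ip = info[4][0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Passing family=socket.AF_INET is what excludes the IPv6 answers that the note above warns about.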

Build the StartOS package

cd package
npm i        # one-time
make x86     # produces spark-control_x86_64.s9pk (~55 MB)

Requires start-cli, Node ≥ 22, and Docker. The build runs tsc and ncc for the TS bundle, then docker build on image/Dockerfile, then start-cli s9pk pack to produce the .s9pk.

To sideload onto your Start9, run make install (requires host: to be set in ~/.startos/config.yaml) or upload the .s9pk via the sideload feature in the Start9 web UI.

Post-install setup (one-time per Start9 install)

1. Open the Spark Control service → Actions → Show Public Key and copy the key line.
2. SSH to each Spark and append that line to ~/.ssh/authorized_keys for the <spark-user> user.
3. Go to Actions → Configure Sparks and enter <spark-1-ip> / <spark-user> for Spark 1 and <spark-2-ip> / <spark-user> for Spark 2.
4. Start the service and open the Web UI. The current model and health status should appear within about 5 s.
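Appending the key by hand works fine. If you would rather script the authorized_keys step so that re-running it does not add duplicates, a sketch like the following could be run on each Spark. The helper name ensure_authorized_key is hypothetical, and the permission bits are the conventional OpenSSH ones rather than anything this project mandates.

```python
from pathlib import Path


def ensure_authorized_key(public_key_line: str, authorized_keys: Path) -> bool:
    """Append the key line if it is not already present. Return True when the file changed."""
    key = public_key_line.strip()
    text = authorized_keys.read_text() if authorized_keys.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False
    # Create ~/.ssh if needed, with owner-only permissions.
    authorized_keys.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    with authorized_keys.open("a") as fh:
        # Start on a fresh line in case the existing file lacks a trailing newline.
        if text and not text.endswith("\n"):
            fh.write("\n")
        fh.write(key + "\n")
    authorized_keys.chmod(0o600)
    return True


if __name__ == "__main__":
    # Placeholder key text; paste the line shown by the Show Public Key action instead.
    changed = ensure_authorized_key("ssh-ed25519 AAAA... spark-control", Path.home() / ".ssh" / "authorized_keys")
    print("updated" if changed else "already present")
```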

Repo layout

image/ — Docker image source (FastAPI app + models.yaml)
package/ — StartOS 0.4 package source
runbook.md — operating notes
known-issues.md — known quirks and workarounds
LICENSE — MIT

Service discovery API

Other services on your LAN can hit GET /api/endpoints to learn where the current model lives without hardcoding Spark IPs. Stable JSON shape:

{
  "vllm":    { "ready": true,  "base_url": "http://<spark-1-ip>:8888/v1", "model": "RedHatAI/Qwen3.6-35B-A3B-NVFP4", "openai_compat": true },
  "parakeet":{ "ready": true,  "base_url": "http://<spark-2-ip>:8000",   "kind": "stt", "model": "nvidia/parakeet-tdt-0.6b-v3" },
  "magpie":  { "ready": false, "base_url": "http://<spark-2-ip>:9000",   "kind": "tts" }
}

base_url is populated once Configure Sparks has been completed, even if the underlying service is not currently up. Route traffic to a base_url only when the same entry also reports ready: true.
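A consumer of this API might apply that rule as follows. This is a minimal sketch: ready_base_url is a hypothetical helper, the addresses and model name in the sample are placeholders, and fetching the JSON from GET /api/endpoints with your HTTP client of choice is left out.

```python
import json
from typing import Any, Dict, Optional


def ready_base_url(endpoints: Dict[str, Any], service: str) -> Optional[str]:
    """Return the service's base_url only if the entry exists, is ready, and has a URL."""
    entry = endpoints.get(service) or {}
    if entry.get("ready") is True and entry.get("base_url"):
        return entry["base_url"]
    return None


if __name__ == "__main__":
    # Sample payload in the documented shape, with placeholder values.
    sample = json.loads("""
    {
      "vllm":   {"ready": true,  "base_url": "http://192.0.2.10:8888/v1", "model": "example/model", "openai_compat": true},
      "magpie": {"ready": false, "base_url": "http://192.0.2.11:9000", "kind": "tts"}
    }
    """)
    print(ready_base_url(sample, "vllm"))    # URL is returned because ready is true
    print(ready_base_url(sample, "magpie"))  # None, because ready is false
```

Returning None for a configured-but-down service keeps callers from sending requests to a URL that exists only because Configure Sparks was completed.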

Status

v0.1: local-only, single-cluster, no auth (trusts the LAN). Five LLMs in the catalog: qwen3-vl (cluster), gemma4, qwen36, plus two legacy entries. Magpie shows red in the UI until its container is fixed.

v0.2 in progress: service-discovery API, Magpie crash fix, Parakeet/Magpie lifecycle, model download driving, spark-vllm-docker update checks, configurable flag tiers.
