v0.27.0:0 - in-app settings gear + swap-lock route fix
Move the ~20 optional cluster knobs out of the StartOS "Configure Sparks"
action (now just the 4 required fields) and into a dashboard ⚙ Settings gear,
backed by a /data/app_settings.json overlay keyed by env-var names. One shared
mutable Settings instance + Settings.reload() applies edits live without a
restart; existing installs' values migrate automatically on first boot.
Also: support-service ports (parakeet/kokoro/embed/qdrant + vllm) are now
configurable, and GET /api/swap/lock no longer 404s (it was shadowed by the
/api/swap/{job_id} catch-all). WebhookNotifier is re-pointed on save so its
url/secret reload live too.
@@ -55,12 +55,13 @@ Subsystem guidance lives in `docs/guides/` and loads when matching files are tou
## Current state
- **Built, pending review+install: v0.27.0:0 — in-app Settings gear + two bug fixes** (prompted by the second adopter's v0.25 feedback). (1) The StartOS "Configure Sparks" action is trimmed to the **four required fields** (both Spark IPs + SSH users); every optional knob now lives behind a **⚙ Settings gear** in the dashboard, backed by a `/data/app_settings.json` overlay (`app_settings.py`) keyed by env-var names and overlaid on `os.environ`. Edits apply **live** — one shared mutable `Settings` instance, `Settings.reload()` mutates it in place (architecture in the fastapi-image guide). Existing installs' optional values **migrate automatically** on first boot (`seed_from_env`, nothing lost). (2) **NEW: support-service ports are configurable** (`PARAKEET_PORT`/`KOKORO_PORT`/`EMBED_PORT`/`QDRANT_PORT`; `VLLM_PORT` was already a knob, now surfaced in the gear) — fixes the adopter's false "vLLM down" (their vLLM is on 8000, not the launch-cluster.sh default 8888) and Parakeet 404 (they remapped it off 8000). (3) **Bug fix:** `GET /api/swap/lock` returned 404 — it was shadowed by `GET /api/swap/{job_id}`; the static lock routes now register first (load-bearing comment added). Tested locally (151 pytest + live smoke: live-apply, secret masking, route fix, 422 on bad input). Still **un-diagnosed** from the adopter report: their disk-scan shows Gemma "not on disk" — needs them to confirm where it's cached (`ls ~/.cache/huggingface/hub` as the SSH user) vs `disk.py`'s `$HOME/.cache/huggingface/hub` assumption. **Next: act on the code review, then build/install (go/no-go) + draft the adopter reply describing the fixes.**
- **Live: v0.26.0:0 — disk-driven model menu** (installed on the server 2026-06-18, `installed-version` confirms; also published to the self-hosted StartOS registry). The dashboard lists what's *actually downloaded* on the Sparks; `models.yaml`/overrides are **launch recipes** matched by `repo`, not the menu; an on-disk model with no recipe shows `needs_setup` and infers its launch flags from `config.json` (operator confirms once). Delete removes weights **and** the card; dropped the two legacy Qwen recipes. Architecture (`discovery.py`/`build_menu`/`infer_recipe`, the recipe-vs-disk split) is in the fastapi-image guide.
- **Next (owner-driven, concrete): Gemma-4-26B-A4B vision daily-driver eval.** The `gemma4-26b` recipe is in the catalog (NVFP4 MoE; `--moe_backend=marlin` set — the fast CUTLASS FP4 path errors on GB10; vision+tools). Not yet downloaded or swap-tested. Owner wants vision for business-card OCR and is weighing it against the text-only Qwen3.6 35B daily driver (research: Gemma ~52 tok/s vs Qwen's ~97, slightly weaker reasoning). Next: download it, swap-test, try a business card.
- **Live: v0.25.0:0** (installed 2026-06-18). The OpenClaw/Johnny-5 coexistence epic is fully shipped & live: configurable `VLLM_PORT` (v0.22, blank ⇒ 8888), local/fine-tuned models (v0.23), configurable topology (v0.24 — `VLLM_CONTAINER`, `DISABLED_SERVICES` hide-list, second-Spark `kind: vllm` monitor), coordination layer (v0.25 — swap reservation lock with `423`-enforced manual-swap pause + `?force=true` Release override, `swap_complete`/`swap_failed` webhook, read-only schedule registry; consumer API in `docs/COORDINATION.md`).
- **Other live features:** swap dashboard; chat / transcribe / diarize(+chunk) / TTS proxies; embeddings + rerank + hybrid search (Qdrant); `/scrub` + `/rehydrate`; label-merge incl. dual-channel; per-Spark SSH-key copy + WireGuard `VPN <ip>` hardware badge. Security hardening (v0.19 — shellsafe SSH-injection guard, Qdrant path-injection, same-origin CSRF guard) stable (`EVALUATION.md`). Spark 2 audio/embeddings stack healthy.
- **matrix-bridge bot tile (v0.21.0:1, live):** `bot`-kind tile (docker-state badge; Update/Restart/Stop-Start/View-logs) for the Matrix bot on Spark 2, driven as `modelo` (no `sudo -iu`; blank `matrix_bridge_user` ⇒ tile hidden; host reuses `spark2_host`). Code: `app/matrix_bridge.py` + `/api/matrix-bridge/{update,logs}`. **Load-bearing:** Update's `git fetch` runs as `modelo` and needs `modelo`'s `~/.ssh/config` pinning the Gitea deploy key with `IdentitiesOnly yes` (else publickey denial). Optional next only if the bot dev asks: Docker `HEALTHCHECK`.
- **Tests:** offline pytest harness in `image/tests/` — `cd image && .venv/bin/python -m pytest` (137 passing). Covers `build_launch_command` (incl. the shell-injection round-trip + local-model bind-mount), the transcript↔diarizer label-merge, the `shellsafe` validators, `matrix_bridge.build_update_command` (+ phase detection), the configurable-topology layer (`test_topology.py`), the coordination layer (`test_coordination.py`: swap-lock lifecycle/expiry/token-auth, schedule-registry CRUD, webhook payload + HMAC signature — `now` is injected into the lock so expiry is tested without sleeping), and the disk-driven menu (`test_discovery.py`: cache-dirname↔repo parsing, the cache-listing parser incl. incomplete-download filtering, and `infer_recipe` family/mode mapping — Qwen3-MoE→flashinfer_cutlass, Gemma-MoE→marlin, vision caps, solo-vs-cluster by size/host-count). The `build_menu` merge + `/api/models/suggest` are exercised by hand against the live cluster (mock-heavy unit tests there would test the mocks). Redaction + live-audio suites remain standalone scripts.
- **Tests:** offline pytest harness in `image/tests/` — `cd image && .venv/bin/python -m pytest` (151 passing; the in-app settings gear + swap-lock route-order regression are in `test_app_settings.py`, incl. a `TestClient` live-apply check). Covers `build_launch_command` (incl. the shell-injection round-trip + local-model bind-mount), the transcript↔diarizer label-merge, the `shellsafe` validators, `matrix_bridge.build_update_command` (+ phase detection), the configurable-topology layer (`test_topology.py`), the coordination layer (`test_coordination.py`: swap-lock lifecycle/expiry/token-auth, schedule-registry CRUD, webhook payload + HMAC signature — `now` is injected into the lock so expiry is tested without sleeping), and the disk-driven menu (`test_discovery.py`: cache-dirname↔repo parsing, the cache-listing parser incl. incomplete-download filtering, and `infer_recipe` family/mode mapping — Qwen3-MoE→flashinfer_cutlass, Gemma-MoE→marlin, vision caps, solo-vs-cluster by size/host-count). The `build_menu` merge + `/api/models/suggest` are exercised by hand against the live cluster (mock-heavy unit tests there would test the mocks). Redaction + live-audio suites remain standalone scripts.
- **Signal Engine "flakiness":** diagnosed as *not* a server bug — transient 1–4s unresponsiveness while the single GPU is busy. Client-side remedy (in-flight cap 2 / ceiling 3 / retry-on-timeout+503) drafted and **forwarded to that dev (owner confirmed 2026-06-15)**. Awaiting whether they want the measured concurrency knee.
- **Stance (decided, not built):** no public interface / no API-token auth — LAN + WireGuard/Tailscale split-tunnel only; the CSRF guard covers the browser-driven vector.
- **Known limits:** `/health` blips while the GPU is busy (mitigated client-side); dual-channel can miss a quiet local word under loud remote bleed; connectivity log misses sub-5s outages between 5s polls; diarizer caps at 4 speakers; matrix-bridge badge won't visibly flip on a fast `docker restart` (status re-checked only after the command returns).
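
The swap-lock fix in the v0.27.0 entry above comes down to first-match route resolution: when a parameterised path is registered before a literal one, the literal is captured as a parameter value. The following is a minimal stdlib sketch of that behaviour, not FastAPI's actual router. `compile_route`, `resolve`, and the handler names are hypothetical, and the sketch assumes paths contain no regex metacharacters.

```python
import re


def compile_route(template):
    """Turn '/api/swap/{job_id}' into a regex with one named group per {param}."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile(f"^{pattern}$")


def resolve(routes, path):
    """Return (handler_name, params) for the FIRST route whose pattern matches."""
    for template, handler in routes:
        match = compile_route(template).match(path)
        if match:
            return handler, match.groupdict()
    return None, {}


# Catch-all registered first: the literal path is captured as job_id="lock".
shadowed = [("/api/swap/{job_id}", "get_job"), ("/api/swap/lock", "get_lock")]

# Static route registered first: the literal path reaches its own handler.
fixed = [("/api/swap/lock", "get_lock"), ("/api/swap/{job_id}", "get_job")]

print(resolve(shadowed, "/api/swap/lock"))
print(resolve(fixed, "/api/swap/lock"))
print(resolve(fixed, "/api/swap/abc123"))
```

In the shadowed ordering the job handler receives `job_id="lock"`, which would plausibly explain the 404 if it then fails to find a job by that name. Registering the static route first leaves real job IDs unaffected.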
+4
-5
@@ -73,16 +73,15 @@ The first start generates an ed25519 SSH keypair inside the package volume. Wait
### 4. Configure Sparks
- Open Spark Control → **Actions → Configure Sparks**.
- Fill in:
- Fill in just the four required fields:
- **Spark 1 hostname or IP** — prefer the **IP** (e.g. `192.168.1.x`) over `.local` hostnames; vLLM only binds IPv4 and mDNS can resolve to IPv6 first.
- **Spark 1 SSH user** — whatever username you set up on Spark 1.
- **Spark 2 hostname or IP** + **SSH user** — same idea.
- Optional Parakeet/Kokoro overrides — leave blank if those services run on Spark 2 (the normal case).
- Optional **Open WebUI URL** — paste your Open WebUI LAN URL to get a deep-link button in the dashboard next to the current model.
- Optional **NGC API key** — paste it here if you have one.

Save.

Everything else is optional and lives in the dashboard, not this action: open Spark Control and click **⚙ Settings** in the top bar to set vLLM/service **ports** (e.g. if your vLLM runs on 8000 rather than the default 8888, or you moved Parakeet off 8000), container names, support-service hosts, an **Open WebUI URL** (adds a deep-link button), an **NGC API key**, and a swap webhook. Changes there apply immediately and are included in StartOS backups.
### 5. Re-run Show Public Key (if you skipped earlier)
Now that hosts are configured, Show Public Key will give you the paste-ready install command. Run it as described in step 3.
@@ -159,7 +158,7 @@ All of these inherit Spark Control's TLS cert and StartOS access controls. You o
A few things worth knowing:
- The codebase is **two halves**: `image/` is a standalone FastAPI app you can run with `uvicorn app.server:app` for local dev. `package/` is the StartOS wrapper. Changes to either should be coordinated.
- **All connection info** comes from environment variables in `image/app/config.py`, populated from `package/startos/fileModels/sparkConfig.yaml.ts` via the Configure Sparks action. No IPs, usernames, or paths are hardcoded in runtime code.
- **All connection info** comes from environment variables in `image/app/config.py`. The four required fields are populated from `package/startos/fileModels/sparkConfig.yaml.ts` via the Configure Sparks action; the optional knobs are overlaid from the in-app `⚙ Settings` store (`/data/app_settings.json`, see `image/app/app_settings.py`). No IPs, usernames, or paths are hardcoded in runtime code.
- The **path `~/spark-vllm-docker`** *is* hardcoded in `swap.py`, `download.py`, `updates.py`, and `models.py`. If the user has cloned the upstream repo elsewhere, either fix the path or symlink it.
- **Persistent state** lives at `/data/` inside the container: `config.yaml`, `models-overrides.yaml`, `services-overrides.yaml`, `connectivity.json`, `ssh/`. These survive package updates.
- The dashboard polls every 5 s; check `image/app/health.py` and `image/app/connectivity.py` for the probing logic. External apps can also POST failures to `/api/health-event` to log between-poll blips.
@@ -118,7 +118,7 @@ Fields: `service` (required), `ok` (required), `source` (optional, free-form), `
- **Service discovery API** (`/api/endpoints`) for other LAN services
- **Kokoro-82M TTS** replaces Magpie/Riva NIM as the default TTS backend (v0.14.0). Magpie's decoder had a ~30-50% truncation rate on multi-sentence inputs and ate 49 GB of GPU memory; Kokoro is 24/24 reliable at every input length tested, uses 1.3 GB GPU, and renders in ~1s. See HANDOFF.md and the release notes for the migration story.
- **Always-on services panel** with Start/Stop/Restart for Parakeet + Kokoro, plus per-service host configuration in Configure Sparks (so they can live on Spark 1, Spark 2, or anywhere)
- **Always-on services panel** with Start/Stop/Restart for Parakeet + Kokoro, plus per-service host/port/container configuration in the in-app **⚙ Settings** gear (so they can live on Spark 1, Spark 2, or anywhere, on any port)
- **Model download** from the dashboard — paste an HF repo (with autocomplete for known models), pick solo or cluster, watch percent progress with bytes/rate/ETA. After completion the model appears on the menu automatically; if it's unrecognized, a pre-filled "set up this model" dialog offers to configure it.
- **spark-vllm-docker update check** — banner shows "N commits behind upstream"; Apply Update runs `git pull && ./build-and-copy.sh -c` over SSH with a streamed log
- **Per-model Advanced settings** — knobs for max context, GPU memory %, and three optimization toggles (fastsafetensors, prefix caching, FP8 KV cache). Persisted to `/data/models-overrides.yaml` so they survive package updates. Bundled and custom models alike.
@@ -35,11 +35,13 @@ Two kinds, both run with the `image/.venv` interpreter (system python3 has no de
- New external-facing endpoints get documented in `docs/` (`AUDIO_API.md`, `EMBEDDINGS.md`, `REDACTION_GATEWAY.md`) and noted in release notes.
- **SSH-input safety:** any user-supplied value that reaches an SSH command on the Sparks MUST go through `app/shellsafe.py` — validate against a whitelist at the API boundary, then `quote_arg`/`quote_args` (`shlex.quote`) at the sink. Never raw f-string a user value into a command string. Existing sinks: `models.build_launch_command`, `download`, `nim`, `services`; `disk.py` keeps its own `_SAFE_DIRNAME` because it needs `$HOME` to expand server-side. The vLLM pre-flight (`validate.py`) relies on `shlex.split` cleanly reversing this quoting — preserve that invariant.
- **CSRF / same-origin:** state-mutating *control* endpoints are guarded by the `csrf_guard` middleware in `server.py` (rejects requests whose `Origin`/`Referer` host ≠ the served host). A new endpoint meant to be called **cross-origin by downstream apps** (a proxy/data endpoint) must be added to `_CSRF_EXEMPT_PREFIXES`, or browser POSTs from those apps will 403. No app-layer token auth by design (LAN/VPN-only; would break consumers).
- **Settings split (gear vs StartOS action):** only the four *required* fields (both Spark IPs + SSH users) live in the StartOS "Configure Sparks" action → `config.yaml` → env. Every *optional* knob (ports, container names, support-service hosts, integrations, webhook) is edited in the dashboard's ⚙ Settings gear, backed by the `/data/app_settings.json` overlay (`app_settings.py`), keyed by the same env-var names. Precedence (`config._effective_env`): `os.environ` first, overlay on top. `app_settings.seed_from_env` runs **once at startup** to migrate a pre-gear install's env values into the overlay (don't move seeding into `from_env`/`reload` — it writes, and `from_env` runs on every build → it would clobber across calls, which it did once already). **`Settings` is deliberately not frozen:** one shared instance is threaded by reference into every router closure/manager, and `Settings.reload()` (called after a gear save) recomputes its fields **in place** so changes apply live with no restart and no call-site changes. A new gear knob = add one entry to `app_settings.FIELDS` (the front-end renders it generically); the matching `config.Settings` field must already read that env var.
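
The precedence and live-reload rules in the settings-split bullet can be sketched as follows. This is an illustration under stated assumptions, not the code in `config.py`: the real `Settings.reload()` takes no arguments and the real `_effective_env` signature is not shown here, so `effective_env`, the two-field `Settings`, and passing `env` explicitly are all hypothetical simplifications. The default ports (8888 and 8000) are the ones named in the field help text.

```python
import os
from dataclasses import dataclass, fields


def effective_env(overlay, environ=None):
    """Process env first, gear overlay on top (overlay wins on conflicts)."""
    merged = dict(os.environ if environ is None else environ)
    merged.update(overlay)
    return merged


@dataclass  # deliberately NOT frozen, so reload() can mutate it in place
class Settings:
    vllm_port: int = 8888
    parakeet_port: int = 8000

    @classmethod
    def from_env(cls, env):
        # Pure read: blank or missing values fall back to the code default.
        return cls(
            vllm_port=int(env.get("VLLM_PORT") or 8888),
            parakeet_port=int(env.get("PARAKEET_PORT") or 8000),
        )

    def reload(self, env):
        """Recompute every field in place so existing references see the change."""
        fresh = type(self).from_env(env)
        for f in fields(self):
            setattr(self, f.name, getattr(fresh, f.name))


environ = {"VLLM_PORT": "8888"}       # what the StartOS action injected
settings = Settings.from_env(effective_env({}, environ))
held_by_router = settings             # a router closure keeps this reference

overlay = {"VLLM_PORT": "8000"}       # operator saves a new port in the gear
settings.reload(effective_env(overlay, environ))
print(held_by_router.vllm_port)
```

Because `reload` assigns onto the existing object instead of returning a new one, every holder of the reference observes the new port without being re-wired. Keeping `from_env` free of writes is what makes it safe to call on every rebuild.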
## Layout
- `image/app/server.py` — FastAPI entry; routers live in sibling modules (`audio_proxy.py`, `llm_proxy.py`, `embeddings_proxy.py`, `redaction_gateway.py`, `swap.py`, `health.py`, `deep_health.py`, `connectivity.py`, …).
- `image/app/discovery.py` — the disk-driven model menu. `/api/models` lists what's actually downloaded on the Sparks (via `disk.list_cached_models`); `models.yaml`/overrides are *launch recipes* matched by repo, not the menu. An on-disk model with no recipe is `needs_setup` → `infer_recipe` reads its `config.json` to prefill a setup form the operator confirms once.
- `image/app/app_settings.py` — the in-app settings overlay backing the ⚙ gear: `FIELDS` metadata (drives `/api/settings` + the UI form), `load_overlay()` (pure read), `seed_from_env()` (one-time migration), `apply()` (validate + persist). `GET/POST /api/settings` in `server.py` read/write it, then `settings.reload()`.
- `image/app/static/` — the dashboard UI.
- `image/models.yaml` — bundled vLLM **launch recipes** (how to launch a known model), NOT the dashboard menu — the menu is the on-disk scan.
- `image/spark_embed/` — Dockerfile + app for the embeddings container; built ON a Spark (ARM64, NGC PyTorch base — see the audio/cluster rule for NGC torch-pinning caveats).
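
The `app_settings.py` entry in the layout list describes validate-then-persist behaviour with write-only secrets. One plausible shape for that flow is sketched below, assuming a temp-file-plus-rename write and a 1..65535 port check. The function names `validate`, `save_overlay`, and `public_view`, the two-entry `FIELDS`, and the sample key value are hypothetical, and the real `apply()` may differ.

```python
import json
import os
import tempfile
from pathlib import Path

# Trimmed, hypothetical stand-in for the real FIELDS metadata.
FIELDS = [
    {"key": "VLLM_PORT", "type": "int"},
    {"key": "NGC_API_KEY", "type": "secret"},
]
TYPES = {f["key"]: f["type"] for f in FIELDS}


def validate(values):
    """Reject unknown keys and anything that is not a port number for int fields."""
    clean = {}
    for key, raw in values.items():
        if key not in TYPES:
            raise ValueError(f"unknown setting: {key}")
        text = str(raw).strip()
        if TYPES[key] == "int" and text:
            if not text.isdecimal() or not 1 <= int(text) <= 65535:
                raise ValueError(f"{key} must be a port number in 1..65535")
        clean[key] = text
    return clean


def save_overlay(path, values):
    """Write to a temp file in the same directory, then rename over the target."""
    path = Path(path)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(values, fh, indent=2)
        os.replace(tmp, path)  # readers see the old file or the new one, never a partial write
    except BaseException:
        os.unlink(tmp)
        raise


def public_view(values):
    """Mask secrets: report only whether one is set, never its value."""
    return {
        key: ("set" if value else "") if TYPES.get(key) == "secret" else value
        for key, value in values.items()
    }


with tempfile.TemporaryDirectory() as tmpdir:
    target = Path(tmpdir) / "app_settings.json"
    save_overlay(target, validate({"VLLM_PORT": "8000", "NGC_API_KEY": "nvapi-example"}))
    stored = json.loads(target.read_text(encoding="utf-8"))
    print(public_view(stored))
```

A server route built on this would presumably translate the `ValueError` into the 422 response mentioned in the test notes, and would return `public_view(...)` rather than the stored values.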
@@ -0,0 +1,286 @@
|
|||||||
|
"""App-owned settings overlay: the in-dashboard 'gear' knobs.
|
||||||
|
|
||||||
|
Spark Control's *required* wiring — the two Spark IPs and SSH users — is set once
|
||||||
|
via the StartOS "Configure Sparks" action and arrives as env vars. Everything
|
||||||
|
else (ports, container names, support-service hosts, integrations, webhook) is
|
||||||
|
optional and lives here: a small JSON overlay on /data that the dashboard gear
|
||||||
|
reads and writes, so an operator never has to open StartOS actions to tune the
|
||||||
|
cluster. This follows the StartOS 0.4 convention (minimal setup action; routine
|
||||||
|
config in the app's own UI) and stays inside the package's backup volume, so the
|
||||||
|
file is backed up and restored for free.
|
||||||
|
|
||||||
|
Each overlay entry is keyed by the *same env var name* config.Settings already
|
||||||
|
reads, so the overlay is simply an env-var override store. Precedence (see
|
||||||
|
config._effective_env): process env first, this overlay on top — so a knob set
|
||||||
|
in the gear wins, while an un-touched knob falls through to whatever the StartOS
|
||||||
|
action injected, then to the code default.
|
||||||
|
|
||||||
|
First-run migration: when the overlay file doesn't exist yet (e.g. an existing
|
||||||
|
install upgrading into this version), it's seeded from the current env so any
|
||||||
|
value previously set via the StartOS action carries over into the gear with no
|
||||||
|
operator action and nothing lost.
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import tempfile
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import Mapping
|
||||||
|
|
||||||
|
log = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
# Field metadata drives BOTH the /api/settings response (the front-end renders
|
||||||
|
# the form generically from this) and light server-side validation. `key` is the
|
||||||
|
# env var name; `type` is one of text|int|csv|secret. `secret` values are
|
||||||
|
# write-only — never echoed back to the browser.
|
||||||
|
FIELDS: list[dict] = [
|
||||||
|
# --- vLLM (Spark 1) ---
|
||||||
|
{"group": "vLLM (Spark 1)", "key": "VLLM_PORT", "label": "vLLM port", "type": "int",
|
||||||
|
"placeholder": "8888",
|
||||||
|
"help": "Port your vLLM listens on. Blank ⇒ 8888 (the bundled launch-cluster.sh). Set 8000 for vanilla vLLM, or wherever yours listens."},
|
||||||
|
{"group": "vLLM (Spark 1)", "key": "VLLM_CONTAINER", "label": "vLLM container name", "type": "text",
|
||||||
|
"placeholder": "vllm_node",
|
||||||
|
"help": "Docker container the swappable vLLM runs in. Blank ⇒ vllm_node. The swap log-tail and pre-flight validator exec into it by name."},
|
||||||
|
|
||||||
|
# --- Monitoring ---
|
||||||
|
{"group": "Monitoring", "key": "DISABLED_SERVICES", "label": "Services to hide", "type": "csv",
|
||||||
|
"placeholder": "e.g. parakeet,kokoro",
|
||||||
|
"help": "Comma-separated built-in services your cluster doesn't run, so their tiles are hidden and never probed. Valid: parakeet, kokoro, embeddings, qdrant. Blank ⇒ monitor all."},
|
||||||
|
|
||||||
|
# --- Parakeet (STT) ---
|
||||||
|
{"group": "Parakeet (STT)", "key": "PARAKEET_HOST", "label": "Host", "type": "text",
|
||||||
|
"placeholder": "leave blank for Spark 2",
|
||||||
|
"help": "Host running the Parakeet STT container. Blank ⇒ Spark 2."},
|
||||||
|
{"group": "Parakeet (STT)", "key": "PARAKEET_PORT", "label": "Port", "type": "int",
|
||||||
|
"placeholder": "8000",
|
||||||
|
"help": "Port Parakeet listens on. Blank ⇒ 8000. Set this if you remapped it (e.g. because your vLLM holds 8000)."},
|
||||||
|
{"group": "Parakeet (STT)", "key": "PARAKEET_CONTAINER", "label": "Container name", "type": "text",
|
||||||
|
"placeholder": "parakeet-asr",
|
||||||
|
"help": "Docker container name for Parakeet. Blank ⇒ parakeet-asr."},
|
||||||
|
{"group": "Parakeet (STT)", "key": "PARAKEET_USER", "label": "SSH user", "type": "text",
|
||||||
|
"placeholder": "leave blank for Spark 2 user",
|
||||||
|
"help": "SSH user that owns the Parakeet container. Blank ⇒ your Spark 2 user."},
|
||||||
|
|
||||||
|
# --- Kokoro (TTS) ---
|
||||||
|
{"group": "Kokoro (TTS)", "key": "KOKORO_HOST", "label": "Host", "type": "text",
|
||||||
|
"placeholder": "leave blank for Spark 2",
|
||||||
|
"help": "Host running the Kokoro TTS container. Blank ⇒ Spark 2."},
    {"group": "Kokoro (TTS)", "key": "KOKORO_PORT", "label": "Port", "type": "int",
     "placeholder": "8880",
     "help": "Port Kokoro listens on. Blank ⇒ 8880."},
    {"group": "Kokoro (TTS)", "key": "KOKORO_CONTAINER", "label": "Container name", "type": "text",
     "placeholder": "kokoro-tts",
     "help": "Docker container name for Kokoro. Blank ⇒ kokoro-tts."},
    {"group": "Kokoro (TTS)", "key": "KOKORO_USER", "label": "SSH user", "type": "text",
     "placeholder": "leave blank for Spark 2 user",
     "help": "SSH user that owns the Kokoro container. Blank ⇒ your Spark 2 user."},

    # --- Embeddings ---
    {"group": "Embeddings", "key": "EMBED_HOST", "label": "Host", "type": "text",
     "placeholder": "leave blank for Spark 2",
     "help": "Host running the spark-embed container (bge-m3 + reranker). Blank ⇒ Spark 2."},
    {"group": "Embeddings", "key": "EMBED_PORT", "label": "Port", "type": "int",
     "placeholder": "8088",
     "help": "Port the embedding server listens on. Blank ⇒ 8088."},
    {"group": "Embeddings", "key": "EMBED_CONTAINER", "label": "Container name", "type": "text",
     "placeholder": "spark-embed",
     "help": "Docker container name for the embedding server. Blank ⇒ spark-embed."},
    {"group": "Embeddings", "key": "EMBED_USER", "label": "SSH user", "type": "text",
     "placeholder": "leave blank for Spark 2 user",
     "help": "SSH user that owns the embedding container. Blank ⇒ your Spark 2 user."},

    # --- Qdrant ---
    {"group": "Qdrant", "key": "QDRANT_HOST", "label": "Host", "type": "text",
     "placeholder": "leave blank for Spark 2",
     "help": "Host running the Qdrant vector database. Blank ⇒ Spark 2."},
    {"group": "Qdrant", "key": "QDRANT_PORT", "label": "Port", "type": "int",
     "placeholder": "6333",
     "help": "Port Qdrant's REST API listens on. Blank ⇒ 6333."},
    {"group": "Qdrant", "key": "QDRANT_CONTAINER", "label": "Container name", "type": "text",
     "placeholder": "qdrant",
     "help": "Docker container name for Qdrant. Blank ⇒ qdrant."},
    {"group": "Qdrant", "key": "QDRANT_USER", "label": "SSH user", "type": "text",
     "placeholder": "leave blank for Spark 2 user",
     "help": "SSH user that owns the Qdrant container. Blank ⇒ your Spark 2 user."},
    {"group": "Qdrant", "key": "QDRANT_COLLECTION", "label": "Default collection", "type": "text",
     "placeholder": "e.g. crm_chunks",
     "help": "Collection used by /api/search when a request doesn't name one. Blank ⇒ callers must pass a collection."},

    # --- Integrations ---
    {"group": "Integrations", "key": "OPEN_WEBUI_URL", "label": "Open WebUI URL", "type": "text",
     "placeholder": "e.g. https://open-webui.yourserver.local",
     "help": "If set, the header shows a one-click 'Open chat' button to your Open WebUI."},
    {"group": "Integrations", "key": "MATRIX_BRIDGE_USER", "label": "matrix-bridge bot SSH user", "type": "text",
     "placeholder": "e.g. modelo",
     "help": "SSH user owning the bot's ~/matrix-bridge clone (Spark 2). Set this to show the bot tile (update/restart/logs). Blank ⇒ tile hidden."},
    {"group": "Integrations", "key": "NGC_API_KEY", "label": "NGC API key", "type": "secret",
     "placeholder": "starts with nvapi-…",
     "help": "NVIDIA NGC personal key, needed only to install NIM containers from nvcr.io. Stored on this server."},
    {"group": "Integrations", "key": "SWAP_WEBHOOK_URL", "label": "Swap webhook URL", "type": "text",
     "placeholder": "e.g. https://my-service.local/spark-swap",
     "help": "POSTed a small JSON event (swap_complete / swap_failed) after every model swap, so automation can re-point to the new model. Blank ⇒ disabled."},
    {"group": "Integrations", "key": "SWAP_WEBHOOK_SECRET", "label": "Swap webhook secret", "type": "secret",
     "placeholder": "a random shared string",
     "help": "If set, each webhook is HMAC-signed (X-Spark-Signature) so the receiver can verify it. Blank ⇒ unsigned."},
]

_BY_KEY = {f["key"]: f for f in FIELDS}
_SECRET_KEYS = frozenset(f["key"] for f in FIELDS if f["type"] == "secret")
_INT_KEYS = frozenset(f["key"] for f in FIELDS if f["type"] == "int")
# Reject control characters (incl. newlines) — these values flow into env vars,
# URLs, and SSH command lines (quoted at the sink, but defence in depth).
_BAD_CHARS = re.compile(r"[\x00-\x1f\x7f]")
# A secret's value is never echoed back, so a blank submit means "keep the stored
# one" (you can't see it to retype it). To actually *remove* a stored secret the
# UI sends this sentinel instead of a real value. Surfaced to the front-end via
# public_view so the two stay in sync.
CLEAR_SENTINEL = "__clear__"


def _path() -> Path:
    return Path(os.environ.get("APP_SETTINGS_FILE", "/data/app_settings.json"))


def field_keys() -> frozenset[str]:
    return frozenset(_BY_KEY)


def load_overlay() -> dict[str, str]:
    """Return the overlay as {ENV_KEY: value}, filtered to known, non-empty keys.

    Pure read (no side effects) — called on every Settings (re)build, so it must
    not write. Missing/corrupt file ⇒ {}. The file is tiny."""
    p = _path()
    if not p.exists():
        return {}
    try:
        raw = json.loads(p.read_text())
    except (ValueError, OSError) as e:
        log.warning("ignoring unreadable %s: %s", p, e)
        return {}
    if not isinstance(raw, dict):
        return {}
    return {k: str(v) for k, v in raw.items() if k in _BY_KEY and v not in (None, "")}


def seed_from_env(env: Mapping[str, str]) -> None:
    """One-time migration, called once at startup: if no overlay exists yet, seed
    it from the current env so any optional value previously set via the StartOS
    action carries into the gear automatically (nothing lost on upgrade). No-op
    if the file already exists or the env carries no known non-empty knob — a
    fresh install then starts with no overlay and pure defaults. Values run
    through the same validation as apply(); a malformed one (e.g. a paste-error
    port) is skipped rather than written, matching the gear's own guards."""
    if _path().exists():
        return
    seeded: dict[str, str] = {}
    for k in _BY_KEY:
        v = env.get(k)
        if not v:
            continue
        try:
            cleaned = _validate(k, v)
        except SettingsError as e:
            log.warning("skipping invalid env value while seeding overlay: %s", e)
            continue
        if cleaned and cleaned != CLEAR_SENTINEL:
            seeded[k] = cleaned
    if seeded:
        _write(seeded)
        log.info("seeded settings overlay from env (%d keys): %s", len(seeded), _path())


def _write(overlay: dict[str, str]) -> None:
    p = _path()
    p.parent.mkdir(parents=True, exist_ok=True)
    # Atomic replace so a crash mid-write never leaves a truncated overlay.
    fd, tmp = tempfile.mkstemp(dir=str(p.parent), prefix=".app_settings.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(overlay, fh, indent=2, sort_keys=True)
        os.replace(tmp, p)
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise


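The write-then-rename pattern in `_write` can be exercised on its own. The sketch below is a standalone illustration rather than the project's code: `atomic_write_json` is a hypothetical helper name, and the `os.fsync` call is an optional extra that the original does not make.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, payload: dict) -> None:
    """Write `payload` to a sibling temp file, then rename it over `path`."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live in the same directory so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=".tmp.", suffix=".json")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(payload, fh, indent=2, sort_keys=True)
            fh.flush()
            os.fsync(fh.fileno())  # assumption: extra durability step, not in the original
        os.replace(tmp, path)
    except BaseException:
        # On any failure, remove the temp file and leave the old target untouched.
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = Path(d) / "app_settings.json"
        atomic_write_json(target, {"KOKORO_PORT": "8880"})
        atomic_write_json(target, {"KOKORO_PORT": "8881"})
        print(json.loads(target.read_text()))
```

Readers either see the previous complete file or the new complete file, never a half-written one, which is why a failed serialization leaves the stored overlay intact.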
def public_view() -> dict:
    """Shape the gear form for the browser: ordered groups of fields with their
    current overlay value. Secret values are never sent — only a `set` flag."""
    overlay = load_overlay()
    groups: list[dict] = []
    index: dict[str, dict] = {}
    for f in FIELDS:
        g = index.get(f["group"])
        if g is None:
            g = {"name": f["group"], "fields": []}
            index[f["group"]] = g
            groups.append(g)
        entry = {
            "key": f["key"],
            "label": f["label"],
            "type": f["type"],
            "placeholder": f.get("placeholder", ""),
            "help": f.get("help", ""),
        }
        if f["type"] == "secret":
            entry["set"] = bool(overlay.get(f["key"]))
        else:
            entry["value"] = overlay.get(f["key"], "")
        g["fields"].append(entry)
    return {"groups": groups, "clear_sentinel": CLEAR_SENTINEL}


class SettingsError(ValueError):
    """Bad input to apply() — surfaced as 422 by the endpoint."""


def _validate(key: str, value) -> str:
    """Clean + validate one value; raise SettingsError on bad input. Returns the
    stripped string ('' is valid and means 'unset'). The CLEAR_SENTINEL passes
    through for the caller to interpret (secret removal)."""
    if key not in _BY_KEY:
        raise SettingsError(f"unknown setting: {key}")
    val = ("" if value is None else str(value)).strip()
    if val == CLEAR_SENTINEL:
        return val
    if _BAD_CHARS.search(val):
        raise SettingsError(f"{key}: control characters are not allowed")
    if key in _INT_KEYS and val:
        # ASCII digits only: str.isdigit() alone also accepts characters such as
        # "²", which int() rejects with a plain ValueError instead of SettingsError.
        if not (val.isascii() and val.isdigit()) or not (1 <= int(val) <= 65535):
            raise SettingsError(f"{key}: must be a port number between 1 and 65535")
    return val


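A minimal sketch of the port check in isolation, assuming a hypothetical helper named `parse_port` (not part of this commit). One detail worth knowing when writing such a check: `str.isdigit()` returns True for some non-ASCII characters such as `²` that `int()` cannot parse, so the sketch requires ASCII before converting.

```python
def parse_port(raw: str) -> int:
    """Parse a TCP port from user input, raising ValueError on anything else."""
    val = raw.strip()
    # isascii() guards against digit-like characters that int() would reject.
    if not (val.isascii() and val.isdigit()):
        raise ValueError(f"not a port number: {raw!r}")
    port = int(val)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


if __name__ == "__main__":
    for candidate in ["8880", " 6333 ", "0", "65536", "²", "80a"]:
        try:
            print(repr(candidate), "->", parse_port(candidate))
        except ValueError as exc:
            print(repr(candidate), "-> rejected:", exc)
```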
def apply(updates: Mapping[str, str]) -> dict[str, str]:
    """Validate `updates` and merge them into the overlay, then persist.

    Rules per key:
    - unknown key / bad int / control chars → reject (422, via _validate)
    - secret + CLEAR_SENTINEL → delete the stored secret
    - secret + blank value → leave the stored secret unchanged (don't wipe)
    - non-secret + blank → delete the key (revert to env/default)
    - otherwise → set the key

    Returns the new overlay. The caller reloads Settings so the change goes live.
    """
    overlay = load_overlay()
    for key, value in updates.items():
        val = _validate(key, value)
        if key in _SECRET_KEYS:
            if val == CLEAR_SENTINEL:
                overlay.pop(key, None)
            elif val:
                overlay[key] = val
            # blank secret ⇒ leave the existing value in place
        elif val and val != CLEAR_SENTINEL:
            overlay[key] = val
        else:
            overlay.pop(key, None)
    _write(overlay)
    return overlay
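The per-key merge rules listed in `apply()` can be shown as a pure function with no file I/O. This is a simplified sketch: `merge_overlay` and its `secret_keys` parameter are hypothetical names, and validation is left out so only the set/keep/delete decisions remain.

```python
CLEAR = "__clear__"  # same sentinel value the overlay module uses


def merge_overlay(overlay: dict[str, str], updates: dict[str, str],
                  secret_keys: frozenset[str]) -> dict[str, str]:
    """Return a new overlay with `updates` merged in, without touching the input."""
    out = dict(overlay)
    for key, raw in updates.items():
        val = raw.strip()
        if key in secret_keys:
            if val == CLEAR:
                out.pop(key, None)   # explicit removal of a stored secret
            elif val:
                out[key] = val       # new secret value
            # blank secret: keep whatever is stored
        elif val and val != CLEAR:
            out[key] = val           # ordinary knob set
        else:
            out.pop(key, None)       # blank knob: fall back to env/default
    return out


if __name__ == "__main__":
    stored = {"NGC_API_KEY": "nvapi-old", "KOKORO_PORT": "8880"}
    print(merge_overlay(stored, {"NGC_API_KEY": "", "KOKORO_PORT": ""},
                        frozenset({"NGC_API_KEY"})))
```

The asymmetry is deliberate: a blank secret field means "unchanged" because the browser never receives the stored value, while a blank non-secret field means "revert".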
+84
-58
@@ -1,26 +1,28 @@
 from __future__ import annotations
 import logging
 import os
-from dataclasses import dataclass
+from dataclasses import dataclass, fields
 from pathlib import Path
+from typing import Mapping
 
+from . import app_settings
 from .shellsafe import validate_container
 
 log = logging.getLogger(__name__)
 
 
-def _env(name: str, default: str = "") -> str:
-    return os.environ.get(name, default)
+def _env(src: Mapping[str, str], name: str, default: str = "") -> str:
+    return src.get(name, default)
 
 
-def _env_container(name: str, default: str) -> str:
+def _env_container(src: Mapping[str, str], name: str, default: str) -> str:
     """Resolve a container-name env var, validating it at the config boundary.
 
     The value flows into `docker logs`/`docker exec` over SSH, so it's quoted at
     the sink — but per the repo's two-layer convention it's also whitelist-checked
     here. A malformed optional value falls back to `default` rather than crashing
-    daemon startup (mirrors `_env_int` for VLLM_PORT)."""
-    val = os.environ.get(name, "") or default
+    daemon startup (mirrors `_env_int`)."""
+    val = src.get(name, "") or default
     try:
         return validate_container(val)
     except ValueError:
@@ -28,23 +30,23 @@ def _env_container(name: str, default: str) -> str:
         return default
 
 
-def _env_set(name: str) -> frozenset[str]:
+def _env_set(src: Mapping[str, str], name: str) -> frozenset[str]:
     """Parse a comma-separated env var into a lowercased frozenset of keys.
 
     Used by DISABLED_SERVICES so an adopter whose cluster doesn't run a given
     support service can switch its tile + probes off entirely (rather than have
     the probe hit whatever else listens on that port — e.g. a vLLM sharing
     Parakeet's default 8000)."""
-    raw = os.environ.get(name, "")
+    raw = src.get(name, "")
     return frozenset(part.strip().lower() for part in raw.split(",") if part.strip())
 
 
-def _env_int(name: str, default: int) -> int:
+def _env_int(src: Mapping[str, str], name: str, default: int) -> int:
     """Parse an int env var, falling back to `default` when unset, blank, or
-    malformed. The StartOS Configure panel passes optional numeric fields as an
-    empty string when left blank, so a bare int("") would crash daemon startup."""
+    malformed. Optional numeric fields arrive as an empty string when left blank,
+    so a bare int("") would crash daemon startup."""
     try:
-        return int(os.environ.get(name, "") or default)
+        return int(src.get(name, "") or default)
     except (TypeError, ValueError):
         return default
 
@@ -64,8 +66,23 @@ def _resolve_models_yaml() -> str:
     return str(candidates[0]) # let load fail with a clear path
 
 
-@dataclass(frozen=True)
+def _effective_env() -> dict[str, str]:
+    """The env Settings is built from: process env first, the in-app settings
+    overlay on top. The overlay (the dashboard 'gear') is keyed by the same env
+    var names, so a knob set in the UI overrides the value the StartOS action
+    injected — while an un-touched knob keeps falling through to the action's
+    value, then to the code default. See app_settings."""
+    return {**os.environ, **app_settings.load_overlay()}
+
+
+@dataclass
 class Settings:
+    # NOTE: intentionally NOT frozen. There is exactly one Settings instance,
+    # shared by reference across every router closure and manager (build_router,
+    # self.settings = settings). `reload()` mutates it in place so a change saved
+    # via the in-app settings gear goes live for all of them without rebuilding
+    # the app — the only window of inconsistency is the microseconds it takes to
+    # reassign the fields, acceptable for a single-operator config save.
     spark1_host: str
     spark1_user: str
     spark2_host: str
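The precedence described in the `_effective_env` docstring (overlay over process env over code default) comes down to dict-merge order. The sketch below uses simplified stand-ins, `effective_env` and `env_int`, that take their inputs as arguments instead of reading `os.environ` or the overlay file; the names and sample values are illustrative only.

```python
from typing import Mapping


def effective_env(process_env: Mapping[str, str], overlay: Mapping[str, str]) -> dict[str, str]:
    """Later mappings win, so overlay entries override process-env entries."""
    return {**process_env, **overlay}


def env_int(src: Mapping[str, str], name: str, default: int) -> int:
    """Parse an int knob, falling back to `default` when unset, blank, or malformed."""
    try:
        return int(src.get(name, "") or default)
    except (TypeError, ValueError):
        return default


if __name__ == "__main__":
    process_env = {"VLLM_PORT": "8888", "QDRANT_PORT": "6334"}
    overlay = {"VLLM_PORT": "8000"}  # saved from the settings gear
    src = effective_env(process_env, overlay)
    print(env_int(src, "VLLM_PORT", 8888))    # overlay value
    print(env_int(src, "QDRANT_PORT", 6333))  # process-env value
    print(env_int(src, "KOKORO_PORT", 8880))  # code default
```

Because the overlay loader drops blank values before the merge, an emptied field cannot mask a value that the process env still provides.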
@@ -107,73 +124,82 @@ class Settings:
     swap_webhook_secret: str
 
     @classmethod
-    def from_env(cls) -> "Settings":
-        spark2_host = _env("SPARK2_HOST")
-        spark2_user = _env("SPARK2_USER")
+    def from_env(cls, src: Mapping[str, str] | None = None) -> "Settings":
+        src = _effective_env() if src is None else src
+        spark2_host = _env(src, "SPARK2_HOST")
+        spark2_user = _env(src, "SPARK2_USER")
         # Parakeet (STT) and Kokoro (TTS) default to Spark 2 unless overridden.
         return cls(
-            spark1_host=_env("SPARK1_HOST"),
-            spark1_user=_env("SPARK1_USER"),
+            spark1_host=_env(src, "SPARK1_HOST"),
+            spark1_user=_env(src, "SPARK1_USER"),
             spark2_host=spark2_host,
             spark2_user=spark2_user,
-            parakeet_host=_env("PARAKEET_HOST") or spark2_host,
-            parakeet_user=_env("PARAKEET_USER") or spark2_user,
-            parakeet_container=_env("PARAKEET_CONTAINER") or "parakeet-asr",
-            kokoro_host=_env("KOKORO_HOST") or spark2_host,
-            kokoro_user=_env("KOKORO_USER") or spark2_user,
-            kokoro_container=_env("KOKORO_CONTAINER") or "kokoro-tts",
+            parakeet_host=_env(src, "PARAKEET_HOST") or spark2_host,
+            parakeet_user=_env(src, "PARAKEET_USER") or spark2_user,
+            parakeet_container=_env(src, "PARAKEET_CONTAINER") or "parakeet-asr",
+            kokoro_host=_env(src, "KOKORO_HOST") or spark2_host,
+            kokoro_user=_env(src, "KOKORO_USER") or spark2_user,
+            kokoro_container=_env(src, "KOKORO_CONTAINER") or "kokoro-tts",
             # Embeddings (spark-embed: bge-m3 dense + reranker) and Qdrant
             # (vector storage) default to Spark 2 unless overridden.
-            embed_host=_env("EMBED_HOST") or spark2_host,
-            embed_user=_env("EMBED_USER") or spark2_user,
-            embed_container=_env("EMBED_CONTAINER") or "spark-embed",
-            qdrant_host=_env("QDRANT_HOST") or spark2_host,
-            qdrant_user=_env("QDRANT_USER") or spark2_user,
-            qdrant_container=_env("QDRANT_CONTAINER") or "qdrant",
-            qdrant_collection=_env("QDRANT_COLLECTION", ""),
+            embed_host=_env(src, "EMBED_HOST") or spark2_host,
+            embed_user=_env(src, "EMBED_USER") or spark2_user,
+            embed_container=_env(src, "EMBED_CONTAINER") or "spark-embed",
+            qdrant_host=_env(src, "QDRANT_HOST") or spark2_host,
+            qdrant_user=_env(src, "QDRANT_USER") or spark2_user,
+            qdrant_container=_env(src, "QDRANT_CONTAINER") or "qdrant",
+            qdrant_collection=_env(src, "QDRANT_COLLECTION", ""),
             # matrix-bridge bot container, driven as its own SSH user (the owner
             # of the ~/matrix-bridge git clone) so git/docker run unprivileged.
-            # The user is BLANK by default and set via the "Configure Sparks"
-            # action; leaving it blank reports the service as unconfigured, which
-            # hides the tile. That keeps the shared package portable — a
-            # deployment without the bot never shows a stray tile or a hardcoded
-            # username. Host defaults to Spark 2 (same box); container/dir/branch
-            # are sensible defaults. All are env-overridable.
-            matrix_bridge_host=_env("MATRIX_BRIDGE_HOST") or spark2_host,
-            matrix_bridge_user=_env("MATRIX_BRIDGE_USER"),
-            matrix_bridge_container=_env("MATRIX_BRIDGE_CONTAINER") or "matrix-bridge",
-            matrix_bridge_dir=_env("MATRIX_BRIDGE_DIR") or "~/matrix-bridge",
-            matrix_bridge_branch=_env("MATRIX_BRIDGE_BRANCH") or "master",
+            # The user is BLANK by default and set via the settings gear; leaving
+            # it blank reports the service as unconfigured, which hides the tile.
+            # That keeps the shared package portable — a deployment without the
+            # bot never shows a stray tile or a hardcoded username. Host defaults
+            # to Spark 2 (same box); container/dir/branch are sensible defaults.
+            matrix_bridge_host=_env(src, "MATRIX_BRIDGE_HOST") or spark2_host,
+            matrix_bridge_user=_env(src, "MATRIX_BRIDGE_USER"),
+            matrix_bridge_container=_env(src, "MATRIX_BRIDGE_CONTAINER") or "matrix-bridge",
+            matrix_bridge_dir=_env(src, "MATRIX_BRIDGE_DIR") or "~/matrix-bridge",
+            matrix_bridge_branch=_env(src, "MATRIX_BRIDGE_BRANCH") or "master",
             # Redaction gateway pseudonym-map store (server-held de-anon key).
-            redaction_map_db=_env("REDACTION_MAP_DB", "/data/redaction_maps.db"),
-            redaction_map_ttl=_env_int("REDACTION_MAP_TTL", 7200),
-            ssh_key_path=_env("SSH_KEY_PATH"),
-            ssh_known_hosts=_env("SSH_KNOWN_HOSTS"),
+            redaction_map_db=_env(src, "REDACTION_MAP_DB", "/data/redaction_maps.db"),
+            redaction_map_ttl=_env_int(src, "REDACTION_MAP_TTL", 7200),
+            ssh_key_path=_env(src, "SSH_KEY_PATH"),
+            ssh_known_hosts=_env(src, "SSH_KNOWN_HOSTS"),
             models_yaml=_resolve_models_yaml(),
-            vllm_port=_env_int("VLLM_PORT", 8888),
+            vllm_port=_env_int(src, "VLLM_PORT", 8888),
             # Container name for the swappable vLLM on Spark 1. Defaults to the
             # bundled launch-cluster.sh container; override if you named yours
             # something else (the swap log-tail and pre-flight validator exec
             # into it by name).
-            vllm_container=_env_container("VLLM_CONTAINER", "vllm_node"),
+            vllm_container=_env_container(src, "VLLM_CONTAINER", "vllm_node"),
             # Built-in support-service keys (parakeet, kokoro, embeddings,
             # qdrant) the deployment doesn't run — hidden from the dashboard and
             # never probed.
-            disabled_services=_env_set("DISABLED_SERVICES"),
-            parakeet_port=_env_int("PARAKEET_PORT", 8000),
-            kokoro_port=_env_int("KOKORO_PORT", 8880),
-            embed_port=_env_int("EMBED_PORT", 8088),
-            qdrant_port=_env_int("QDRANT_PORT", 6333),
-            bind_port=_env_int("BIND_PORT", 9999),
-            open_webui_url=_env("OPEN_WEBUI_URL", ""),
-            ngc_api_key=_env("NGC_API_KEY", ""),
+            disabled_services=_env_set(src, "DISABLED_SERVICES"),
+            parakeet_port=_env_int(src, "PARAKEET_PORT", 8000),
+            kokoro_port=_env_int(src, "KOKORO_PORT", 8880),
+            embed_port=_env_int(src, "EMBED_PORT", 8088),
+            qdrant_port=_env_int(src, "QDRANT_PORT", 6333),
+            bind_port=_env_int(src, "BIND_PORT", 9999),
+            open_webui_url=_env(src, "OPEN_WEBUI_URL", ""),
+            ngc_api_key=_env(src, "NGC_API_KEY", ""),
             # Coordination layer: fire a swap-lifecycle webhook to this URL so
             # downstream consumers re-point their model config on a swap. Blank
             # ⇒ disabled. The optional secret HMAC-signs the body (X-Spark-Signature).
-            swap_webhook_url=_env("SWAP_WEBHOOK_URL", ""),
-            swap_webhook_secret=_env("SWAP_WEBHOOK_SECRET", ""),
+            swap_webhook_url=_env(src, "SWAP_WEBHOOK_URL", ""),
+            swap_webhook_secret=_env(src, "SWAP_WEBHOOK_SECRET", ""),
         )
 
+    def reload(self) -> None:
+        """Recompute every field from the current env + settings overlay and
+        assign it onto this same instance, so all holders of the reference see
+        the change without an app restart. Called after the gear writes the
+        overlay (see server.post_settings)."""
+        fresh = Settings.from_env()
+        for f in fields(self):
+            setattr(self, f.name, getattr(fresh, f.name))
+
     @property
     def configured(self) -> bool:
         return bool(self.spark1_host)
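Why `reload()` mutates the instance instead of building a replacement can be seen with a toy dataclass. This is a minimal sketch under assumed names (`Config`, `Prober`, and the two sample fields are hypothetical): any object that stored the reference observes the new values, which would not hold if the module simply rebound a global to a fresh object.

```python
from dataclasses import dataclass, fields
from typing import Mapping


@dataclass  # not frozen, so fields can be reassigned in place
class Config:
    host: str
    port: int

    @classmethod
    def build(cls, src: Mapping[str, str]) -> "Config":
        return cls(host=src.get("HOST", ""), port=int(src.get("PORT", "8888")))

    def reload(self, src: Mapping[str, str]) -> None:
        """Recompute every field and copy it onto this same object."""
        fresh = Config.build(src)
        for f in fields(self):
            setattr(self, f.name, getattr(fresh, f.name))


class Prober:
    """Stands in for a router or manager that keeps the config by reference."""

    def __init__(self, cfg: Config) -> None:
        self.cfg = cfg

    def target(self) -> str:
        return f"{self.cfg.host}:{self.cfg.port}"


if __name__ == "__main__":
    cfg = Config.build({"HOST": "spark-1", "PORT": "8888"})
    prober = Prober(cfg)
    cfg.reload({"HOST": "spark-1", "PORT": "8000"})
    print(prober.target())
```

A component that copies individual values out of the config at construction time would not see the change, which is the case the explicit re-pointing of the webhook notifier handles.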
@@ -239,6 +239,14 @@ class WebhookNotifier:
         self.secret = secret or ""
         self.timeout = timeout
 
+    def update(self, url: str, secret: str = "") -> None:
+        """Re-point after a live settings change. The notifier holds snapshot
+        copies of these two fields (not the Settings object), so Settings.reload()
+        can't reach it — server.post_settings calls this explicitly so editing the
+        webhook URL/secret in the dashboard gear takes effect without a restart."""
+        self.url = (url or "").strip()
+        self.secret = secret or ""
+
     @property
     def enabled(self) -> bool:
         return bool(self.url)
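On the receiving side, a webhook consumer would verify the signature before trusting the event. The diff states only that the body is HMAC-signed and sent in `X-Spark-Signature`; the exact encoding is not shown, so the sketch below assumes a hex-encoded HMAC-SHA256 of the raw request body. `sign_body` and `verify_signature` are hypothetical names.

```python
import hashlib
import hmac


def sign_body(secret: str, body: bytes) -> str:
    """Assumed scheme: hex HMAC-SHA256 over the raw body bytes."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, received: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


if __name__ == "__main__":
    body = b'{"event": "swap_complete"}'
    header_value = sign_body("a random shared string", body)
    print(verify_signature("a random shared string", body, header_value))
```

The check must run over the bytes exactly as received, before any JSON parsing or re-serialization, since whitespace changes alter the digest.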
+89
-46
@@ -1,6 +1,7 @@
 from __future__ import annotations
 import asyncio
 import json
+import os
 from pathlib import Path
 
 from fastapi import FastAPI, HTTPException, Query, Request
@@ -9,6 +10,7 @@ from fastapi.staticfiles import StaticFiles
 from pydantic import BaseModel, ValidationError
 from typing import Literal
 
+from . import app_settings
 from .config import Settings
 from .connectivity import get_mac, record_report, record_state, summary as connectivity_summary
 from .coordination import LockHeld, ScheduleRegistry, SwapLockManager, WebhookNotifier, valid_schedule_id
@@ -37,6 +39,10 @@ from .validate import validate_launch
 from .wol import send_local_broadcast, send_via_peer
 
 
+# One-time migration: seed the in-app settings overlay from env (values set via
+# the StartOS action on a pre-gear install) before building Settings, so nothing
+# is lost on upgrade. No-op once the overlay exists. See app_settings.
+app_settings.seed_from_env(os.environ)
 settings = Settings.from_env()
 catalog = load_catalog(settings.models_yaml)
 # Coordination layer (GPU arbiter): swap-lifecycle webhook, the swap reservation
@@ -156,6 +162,35 @@ async def get_config() -> dict:
     }
 
 
+# ---- In-app settings ('gear') ----
+# The optional cluster knobs (ports, container names, support-service hosts,
+# integrations) live in an app-owned overlay on /data, edited here instead of in
+# the StartOS action — which keeps to just the four required setup fields. See
+# app_settings. Writes apply live: we rewrite the overlay then reload the shared
+# Settings instance in place, so every router/manager holding the reference picks
+# up the change with no container restart.
+@app.get("/api/settings")
+async def get_settings() -> dict:
+    return app_settings.public_view()
+
+
+class SettingsUpdate(BaseModel):
+    values: dict[str, str]
+
+
+@app.post("/api/settings")
+async def post_settings(req: SettingsUpdate) -> dict:
+    try:
+        app_settings.apply(req.values)
+    except app_settings.SettingsError as e:
+        raise HTTPException(422, str(e))
+    settings.reload()
+    # WebhookNotifier snapshots url/secret (not the Settings object), so reload()
+    # can't reach it — re-point it explicitly so a webhook edit applies live too.
+    swap_webhook.update(settings.swap_webhook_url, settings.swap_webhook_secret)
+    return app_settings.public_view()
+
+
 def _reload_catalog() -> None:
     global catalog
     catalog = load_catalog(settings.models_yaml)
@@ -947,6 +982,56 @@ async def post_swap(req: SwapRequest, request: Request) -> dict:
     return {"job_id": job.id, "model_key": job.model_key, "state": job.state}
 
 
+# ---- Swap reservation lock (the GPU arbiter) ----
+# ROUTE ORDER IS LOAD-BEARING: these static `/api/swap/lock` routes MUST be
+# registered before the parametric `/api/swap/{job_id}` below. FastAPI matches in
+# registration order, so if `{job_id}` came first, GET /api/swap/lock would bind
+# job_id="lock", look up a (non-existent) swap job, and 404 — which is exactly
+# the bug this ordering fixes. Keep these above the {job_id} routes.
+# CSRF: these are control-surface, not browser-exempt — an external scheduler is
+# a non-browser client (no Origin header) so it passes the guard already, the
+# same way it calls /api/swap; the dashboard is same-origin.
+class LockAcquireRequest(BaseModel):
+    holder: str
+    ttl_seconds: int | None = None
+    note: str = ""
+    token: str | None = None # present only to extend an existing hold
+
+
+@app.post("/api/swap/lock")
+async def acquire_swap_lock(req: LockAcquireRequest) -> dict:
+    """Reserve the GPU swap path. Returns a secret token used to swap (header
+    X-Swap-Lock-Token) and to release. 409 if held by another holder."""
+    try:
+        lock = swap_lock.acquire(req.holder, req.ttl_seconds, req.note, token=req.token)
+    except ValueError as e:
+        raise HTTPException(422, str(e))
+    except LockHeld as e:
+        raise HTTPException(status_code=409, detail={
+            "error": "swap lock is held by another holder",
+            "lock": e.state,
+        })
+    return {**swap_lock.status(), "token": lock.token}
+
+
+@app.get("/api/swap/lock")
+async def get_swap_lock() -> dict:
+    """Public, token-free view of the reservation: held? who? until when?"""
+    return swap_lock.status()
+
+
+@app.delete("/api/swap/lock")
+async def release_swap_lock(request: Request, force: bool = Query(False)) -> dict:
+    """Release the reservation. Needs the matching X-Swap-Lock-Token unless
+    ?force=true (the human override from the dashboard)."""
+    token = request.headers.get("x-swap-lock-token") or request.query_params.get("token")
+    try:
+        released = swap_lock.release(token, force=force)
+    except PermissionError as e:
+        raise HTTPException(403, str(e))
+    return {"released": released, **swap_lock.status()}
+
+
 @app.get("/api/swap/{job_id}")
 async def get_swap(job_id: str) -> dict:
     job = swap_manager.get(job_id)
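The shadowing described in the route-order comment can be reproduced with a toy first-match router. This is a deliberately simplified model written with the standard library only, not FastAPI's or Starlette's actual matcher; `compile_route` and `resolve` are hypothetical helpers and the handler names are just labels.

```python
import re
from typing import Optional


def compile_route(path: str) -> "re.Pattern[str]":
    """Turn '/api/swap/{job_id}' into a regex with one named group per parameter."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", path)
    return re.compile(f"^{pattern}$")


def resolve(routes: list[tuple[str, str]], url: str) -> Optional[tuple[str, dict]]:
    """Return (handler, params) for the first route that matches, in list order."""
    for path, handler in routes:
        match = compile_route(path).match(url)
        if match:
            return handler, match.groupdict()
    return None


if __name__ == "__main__":
    shadowed = [("/api/swap/{job_id}", "get_swap"), ("/api/swap/lock", "get_swap_lock")]
    fixed = [("/api/swap/lock", "get_swap_lock"), ("/api/swap/{job_id}", "get_swap")]
    print(resolve(shadowed, "/api/swap/lock"))  # parametric route captures "lock"
    print(resolve(fixed, "/api/swap/lock"))     # static route wins
```

With the parametric route listed first, the literal segment `lock` is captured as a job id, so the dedicated handler is never reached; listing the static route first leaves ordinary job ids unaffected.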
@@ -992,52 +1077,10 @@ async def stream_swap(job_id: str):
     return StreamingResponse(gen(), media_type="text/event-stream")
 
 
-# ---- Coordination layer: swap lock + schedule registry ----
-# Endpoints are control-surface, not browser-exempt: an external scheduler is a
-# non-browser client (no Origin header) so it passes the CSRF guard already, the
-# same way it calls /api/swap today; the dashboard is same-origin.
-
-class LockAcquireRequest(BaseModel):
-    holder: str
-    ttl_seconds: int | None = None
-    note: str = ""
-    token: str | None = None  # present only to extend an existing hold
-
-
-@app.post("/api/swap/lock")
-async def acquire_swap_lock(req: LockAcquireRequest) -> dict:
-    """Reserve the GPU swap path. Returns a secret token used to swap (header
-    X-Swap-Lock-Token) and to release. 409 if held by another holder."""
-    try:
-        lock = swap_lock.acquire(req.holder, req.ttl_seconds, req.note, token=req.token)
-    except ValueError as e:
-        raise HTTPException(422, str(e))
-    except LockHeld as e:
-        raise HTTPException(status_code=409, detail={
-            "error": "swap lock is held by another holder",
-            "lock": e.state,
-        })
-    return {**swap_lock.status(), "token": lock.token}
-
-
-@app.get("/api/swap/lock")
-async def get_swap_lock() -> dict:
-    """Public, token-free view of the reservation: held? who? until when?"""
-    return swap_lock.status()
-
-
-@app.delete("/api/swap/lock")
-async def release_swap_lock(request: Request, force: bool = Query(False)) -> dict:
-    """Release the reservation. Needs the matching X-Swap-Lock-Token unless
-    ?force=true (the human override from the dashboard)."""
-    token = request.headers.get("x-swap-lock-token") or request.query_params.get("token")
-    try:
-        released = swap_lock.release(token, force=force)
-    except PermissionError as e:
-        raise HTTPException(403, str(e))
-    return {"released": released, **swap_lock.status()}
-
-
+# ---- Coordination layer: read-only schedule registry ----
+# (The swap reservation lock lives above, next to the swap routes.) Same CSRF
+# posture: control-surface, not browser-exempt — external schedulers send no
+# Origin header so they pass the guard; the dashboard is same-origin.
 class ScheduleRequest(BaseModel):
     name: str
     id: str | None = None
@@ -2192,8 +2192,104 @@ function handleUpdateDone(d) {
   setTimeout(pollUpdates, 2000);
 }
 
+// ===================== settings ('gear') =====================
+// Renders the optional cluster knobs from /api/settings (server-driven field
+// list, so adding a knob server-side needs no JS change) and POSTs edits back.
+// The server reloads its config in place, so changes take effect immediately.
+
+let settingsClearSentinel = '__clear__';
+
+function renderSettingsForm(data) {
+  settingsClearSentinel = data.clear_sentinel || settingsClearSentinel;
+  const body = el('#settings-body');
+  body.innerHTML = (data.groups || []).map((g) => {
+    const rows = g.fields.map((f) => {
+      const help = f.help ? `<span class="muted small settings-help">${escapeHtml(f.help)}</span>` : '';
+      let input;
+      let clearToggle = '';
+      if (f.type === 'secret') {
+        const ph = f.set ? 'set — leave blank to keep' : (f.placeholder || '');
+        input = `<input type="password" autocomplete="off" data-key="${f.key}" data-secret="1" placeholder="${escapeHtml(ph)}">`;
+        // A stored secret is never echoed back, so blank means "keep". Offer an
+        // explicit way to remove it.
+        if (f.set) clearToggle = `<label class="settings-clear muted small"><input type="checkbox" data-clear-for="${f.key}"> clear stored value</label>`;
+      } else if (f.type === 'int') {
+        input = `<input type="number" min="1" max="65535" data-key="${f.key}" value="${escapeHtml(f.value || '')}" placeholder="${escapeHtml(f.placeholder || '')}">`;
+      } else {
+        input = `<input type="text" autocomplete="off" data-key="${f.key}" value="${escapeHtml(f.value || '')}" placeholder="${escapeHtml(f.placeholder || '')}">`;
+      }
+      return `<div class="settings-field"><label class="modal-row"><span>${escapeHtml(f.label)}</span>${input}</label>${clearToggle}${help}</div>`;
+    }).join('');
+    return `<fieldset class="modal-fieldset"><legend>${escapeHtml(g.name)}</legend>${rows}</fieldset>`;
+  }).join('');
+}
+
+async function openSettingsDialog() {
+  const dlg = el('#settings-dialog');
+  const err = el('#settings-error');
+  err.classList.add('hidden');
+  el('#settings-body').innerHTML = '<p class="muted small">Loading…</p>';
+  dlg.showModal();
+  try {
+    renderSettingsForm(await fetchJSON('/api/settings'));
+  } catch (e) {
+    el('#settings-body').innerHTML = '';
+    err.textContent = 'Could not load settings: ' + e.message;
+    err.classList.remove('hidden');
+  }
+}
+
+async function saveSettings(e) {
+  e.preventDefault();
+  const err = el('#settings-error');
+  err.classList.add('hidden');
+  const values = {};
+  $$('#settings-body [data-key]').forEach((inp) => {
+    const key = inp.dataset.key;
+    const v = inp.value.trim();
+    if (inp.dataset.secret) {
+      // "clear" checkbox wins; else a typed value sets it; else omit (keep the
+      // stored one — we can't see it to retype it).
+      const clear = el(`[data-clear-for="${key}"]`);
+      if (clear && clear.checked) values[key] = settingsClearSentinel;
+      else if (v) values[key] = v;
+    } else {
+      values[key] = v; // blank non-secret ⇒ server reverts it to the default
+    }
+  });
+  const btn = el('#settings-save');
+  btn.disabled = true;
+  try {
+    await fetchJSON('/api/settings', {
+      method: 'POST',
+      headers: { 'content-type': 'application/json' },
+      body: JSON.stringify({ values }),
+    });
+    el('#settings-dialog').close();
+    // Re-pull everything a knob can move: the Open WebUI link, health probes,
+    // service tiles, and the model menu (host/port changes alter all of them).
+    try {
+      state.config = await fetchJSON('/api/config');
+      const a = el('#open-webui-link');
+      if (state.config.open_webui_url) { a.href = state.config.open_webui_url; a.classList.remove('hidden'); }
+      else { a.classList.add('hidden'); }
+    } catch (e3) { console.warn('post-save /api/config refresh failed:', e3); }
+    pollStatus();
+    renderServices();
+    loadModels();
+  } catch (e2) {
+    err.textContent = 'Save failed: ' + e2.message.replace(/^\d+ [^:]*:\s*/, '');
+    err.classList.remove('hidden');
+  } finally {
+    btn.disabled = false;
+  }
+}
+
 async function init() {
   setupCopyButtons();
+  el('#open-settings').addEventListener('click', openSettingsDialog);
+  el('#settings-cancel').addEventListener('click', () => el('#settings-dialog').close());
+  el('#settings-form').addEventListener('submit', saveSettings);
   el('#open-download').addEventListener('click', openDownloadForm);
   el('#dl-cancel').addEventListener('click', closeDownloadPanel);
   el('#dl-start').addEventListener('click', startDownload);
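The three-way rule that `saveSettings` applies to secret fields (clear checkbox wins, a typed value sets, a blank is omitted so the stored value is kept) can be restated outside the DOM. The following is a minimal Python sketch of that rule only; `build_values` and the shape of its input dictionaries are hypothetical, and `"__clear__"` is just the client-side default shown in the hunk, which the server may override through `clear_sentinel`.

```python
CLEAR_SENTINEL = "__clear__"  # assumed default; the server can supply its own


def build_values(fields):
    """Build the POST payload from form fields.

    Each field is a dict with "key", "value", and optionally "secret"
    and "clear" (the state of the "clear stored value" checkbox).
    """
    values = {}
    for field in fields:
        key = field["key"]
        value = field.get("value", "").strip()
        if field.get("secret"):
            if field.get("clear"):
                values[key] = CLEAR_SENTINEL  # explicit removal wins
            elif value:
                values[key] = value           # typed value replaces the stored one
            # blank secret: key omitted, so the stored secret is kept
        else:
            values[key] = value               # blank non-secret reverts to the default
    return values


payload = build_values([
    {"key": "VLLM_PORT", "value": " 8000 "},
    {"key": "QDRANT_COLLECTION", "value": ""},
    {"key": "NGC_API_KEY", "value": "", "secret": True},
    {"key": "SWAP_WEBHOOK_SECRET", "value": "ignored", "secret": True, "clear": True},
])
print(payload)
```

Omitting the key for a blank secret is what lets the form stay usable even though the stored secret is never sent back to the browser.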
@@ -17,14 +17,28 @@
       <span class="muted">connecting…</span>
     </div>
     <a id="open-webui-link" class="topbar-btn hidden" href="#" target="_blank" rel="noopener" title="Open Open WebUI">Open chat ↗</a>
+    <button id="open-settings" class="topbar-btn" type="button" title="Settings" aria-label="Open cluster settings">⚙ Settings</button>
   </header>
 
   <main>
     <section id="setup-banner" class="banner hidden">
       <strong>Configuration needed.</strong>
-      <span>Run the <em>Configure Sparks</em> action in StartOS to set hostnames, then run <em>Test Connection</em>.</span>
+      <span>Run the <em>Configure Sparks</em> action in StartOS to set your two Spark IPs and SSH users. Everything else (ports, services, integrations) lives under <em>⚙ Settings</em> above.</span>
     </section>
 
+    <dialog id="settings-dialog" class="modal">
+      <form method="dialog" class="modal-form" id="settings-form">
+        <h3>Settings</h3>
+        <p class="muted small">Optional cluster knobs — vLLM/service ports, container names, support-service hosts, and integrations. The two Spark IPs and SSH users are set once via the <em>Configure Sparks</em> action in StartOS; everything else is here. Changes apply immediately. Stored on this server and included in StartOS backups.</p>
+        <div id="settings-body" class="settings-body"><p class="muted small">Loading…</p></div>
+        <p id="settings-error" class="muted small dd-error hidden"></p>
+        <div class="modal-actions">
+          <button type="button" id="settings-cancel" class="btn">Cancel</button>
+          <button type="submit" id="settings-save" class="btn primary">Save</button>
+        </div>
+      </form>
+    </dialog>
+
     <section id="hardware-panel" class="hardware-panel hidden">
       <div class="section-header">
         <h2 class="section-title">Spark hardware</h2>
@@ -964,3 +964,13 @@ main {
 .tab-content.active { display: block; }
 
 /* (WhisperX install banner styles removed in v0.13.0:0 — see release notes) */
+
+/* ===== Settings ('gear') dialog ===== */
+.modal#settings-dialog { max-width: 560px; }
+/* Cap the (tall) form so the Save/Cancel actions stay reachable; the grouped
+   fields scroll within. */
+#settings-body { max-height: 60vh; overflow-y: auto; padding-right: 6px; display: flex; flex-direction: column; gap: 12px; }
+.settings-field { display: flex; flex-direction: column; gap: 2px; }
+.settings-help { display: block; line-height: 1.35; }
+.settings-clear { display: inline-flex; align-items: center; gap: 6px; margin-top: 2px; cursor: pointer; }
+.settings-clear input { width: auto; }
@@ -15,3 +15,6 @@ sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
 os.environ.setdefault("REDACTION_MAP_DB", "/tmp/spark_control_test_maps.db")
 os.environ.setdefault("CONNECTIVITY_LOG", "/tmp/spark_control_test_connectivity.json")
 os.environ.setdefault("MODELS_OVERRIDES", "/tmp/spark_control_test_overrides.yaml")
+# Keep the in-app settings overlay off the container-only /data path; tests that
+# care about its contents point it at their own tmp file via monkeypatch.
+os.environ.setdefault("APP_SETTINGS_FILE", "/tmp/spark_control_test_app_settings.json")
@@ -0,0 +1,174 @@
+"""In-app settings overlay (the dashboard 'gear') + swap-lock routing regression.
+
+Covers app_settings (the /data overlay backing the gear): first-run seeding from
+env (the migration path), known-key filtering, apply() validation, secret
+masking — and, end-to-end via TestClient, that POST /api/settings reloads the
+shared Settings instance live, and that GET /api/swap/lock is no longer shadowed
+by /api/swap/{job_id}.
+"""
+import json
+import pytest
+
+from app import app_settings
+
+
+@pytest.fixture
+def overlay_file(tmp_path, monkeypatch):
+    p = tmp_path / "app_settings.json"
+    monkeypatch.setenv("APP_SETTINGS_FILE", str(p))
+    return p
+
+
+# ---- overlay store ----
+
+def test_seed_from_env_filters_unknown_and_blank(overlay_file):
+    # An existing install upgrading in: values previously set via the StartOS
+    # action arrive as env; only known, non-empty keys migrate into the overlay.
+    app_settings.seed_from_env({
+        "VLLM_PORT": "8000",
+        "QDRANT_COLLECTION": "",  # blank → skipped
+        "TOTALLY_UNKNOWN": "x",  # not a gear key → skipped
+        "PARAKEET_PORT": "8010",
+    })
+    expected = {"VLLM_PORT": "8000", "PARAKEET_PORT": "8010"}
+    assert app_settings.load_overlay() == expected
+    assert json.loads(overlay_file.read_text()) == expected
+
+
+def test_seed_is_a_one_time_noop_when_file_present(overlay_file):
+    overlay_file.write_text(json.dumps({"VLLM_PORT": "8000", "BOGUS": "y", "NGC_API_KEY": ""}))
+    app_settings.seed_from_env({"VLLM_PORT": "9999"})  # file exists ⇒ no-op
+    # unknown + blank keys dropped on read; existing value untouched by the seed.
+    assert app_settings.load_overlay() == {"VLLM_PORT": "8000"}
+
+
+def test_no_file_is_empty_and_seed_of_blank_env_writes_nothing(overlay_file):
+    assert app_settings.load_overlay() == {}
+    app_settings.seed_from_env({"VLLM_PORT": "", "QDRANT_COLLECTION": ""})
+    assert not overlay_file.exists()  # nothing worth seeding ⇒ no file
+    assert app_settings.load_overlay() == {}
+
+
+def test_apply_set_then_blank_deletes(overlay_file):
+    app_settings.apply({"VLLM_PORT": "8000"})
+    assert app_settings.load_overlay()["VLLM_PORT"] == "8000"
+    app_settings.apply({"VLLM_PORT": ""})  # blank non-secret ⇒ revert to default
+    assert "VLLM_PORT" not in app_settings.load_overlay()
+
+
+def test_apply_rejects_unknown_key(overlay_file):
+    with pytest.raises(app_settings.SettingsError):
+        app_settings.apply({"NOT_A_KNOB": "x"})
+
+
+def test_apply_rejects_non_numeric_port(overlay_file):
+    with pytest.raises(app_settings.SettingsError):
+        app_settings.apply({"PARAKEET_PORT": "80x0"})
+
+
+def test_apply_rejects_control_chars(overlay_file):
+    with pytest.raises(app_settings.SettingsError):
+        app_settings.apply({"QDRANT_COLLECTION": "a\nb"})
+
+
+def test_secret_blank_keeps_existing(overlay_file):
+    app_settings.apply({"NGC_API_KEY": "nvapi-abc"})
+    app_settings.apply({"NGC_API_KEY": ""})  # blank secret ⇒ leave it in place
+    assert app_settings.load_overlay()["NGC_API_KEY"] == "nvapi-abc"
+
+
+def test_apply_rejects_out_of_range_port(overlay_file):
+    for bad in ("0", "99999", "65536"):
+        with pytest.raises(app_settings.SettingsError):
+            app_settings.apply({"VLLM_PORT": bad})
+
+
+def test_apply_accepts_port_bounds(overlay_file):
+    app_settings.apply({"VLLM_PORT": "1", "PARAKEET_PORT": "65535"})
+    o = app_settings.load_overlay()
+    assert o["VLLM_PORT"] == "1" and o["PARAKEET_PORT"] == "65535"
+
+
+def test_secret_clear_sentinel_removes(overlay_file):
+    app_settings.apply({"NGC_API_KEY": "nvapi-abc"})
+    app_settings.apply({"NGC_API_KEY": app_settings.CLEAR_SENTINEL})
+    assert "NGC_API_KEY" not in app_settings.load_overlay()
+
+
+def test_seed_skips_invalid_and_strips(overlay_file):
+    app_settings.seed_from_env({
+        "VLLM_PORT": "8000\n",  # trailing newline → stripped
+        "PARAKEET_PORT": "99999",  # out of range → skipped, not written
+        "QDRANT_COLLECTION": "crm",
+    })
+    o = app_settings.load_overlay()
+    assert o["VLLM_PORT"] == "8000"
+    assert "PARAKEET_PORT" not in o
+    assert o["QDRANT_COLLECTION"] == "crm"
+
+
+def test_public_view_exposes_clear_sentinel(overlay_file):
+    assert app_settings.public_view()["clear_sentinel"] == app_settings.CLEAR_SENTINEL
+
+
+def test_public_view_masks_secrets_and_groups(overlay_file):
+    app_settings.apply({"NGC_API_KEY": "nvapi-abc", "VLLM_PORT": "8000"})
+    view = app_settings.public_view()
+    fields = {f["key"]: f for g in view["groups"] for f in g["fields"]}
+    # Secret: value never echoed to the browser, only a set flag.
+    assert "value" not in fields["NGC_API_KEY"]
+    assert fields["NGC_API_KEY"]["set"] is True
+    # Non-secret: current value present for prefill.
+    assert fields["VLLM_PORT"]["value"] == "8000"
+    assert {g["name"] for g in view["groups"]} >= {"vLLM (Spark 1)", "Integrations"}
+    # The previously-missing support-service ports are now exposed.
+    assert {"PARAKEET_PORT", "KOKORO_PORT", "EMBED_PORT", "QDRANT_PORT"} <= set(fields)
+
+
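The overlay tests above pin down the behaviour of `apply()` without showing it. As a reading aid, here is a minimal in-memory sketch that satisfies the same rules; it is not the project's `app_settings` module. The `KNOWN` table, the `"__clear__"` sentinel value, and the choice to operate on a plain dict instead of the JSON file are all assumptions made for illustration.

```python
CLEAR_SENTINEL = "__clear__"  # assumed value; the real constant lives in app_settings

# Hypothetical subset of the gear's keys and their kinds.
KNOWN = {
    "VLLM_PORT": "int",
    "PARAKEET_PORT": "int",
    "QDRANT_COLLECTION": "text",
    "NGC_API_KEY": "secret",
}


class SettingsError(ValueError):
    """Raised when a submitted value is rejected."""


def apply(overlay, changes):
    """Return a new overlay dict with `changes` applied, or raise SettingsError."""
    result = dict(overlay)  # work on a copy so a rejected batch changes nothing
    for key, raw in changes.items():
        kind = KNOWN.get(key)
        if kind is None:
            raise SettingsError(f"unknown setting: {key}")
        value = raw.strip()
        if any(ord(ch) < 32 or ord(ch) == 127 for ch in value):
            raise SettingsError(f"{key}: control characters are not allowed")
        if kind == "secret":
            if value == "":
                continue                  # blank secret: keep what is stored
            if value == CLEAR_SENTINEL:
                result.pop(key, None)     # explicit clear
                continue
        elif value == "":
            result.pop(key, None)         # blank non-secret: revert to default
            continue
        if kind == "int":
            if not (value.isascii() and value.isdigit()) or not 1 <= int(value) <= 65535:
                raise SettingsError(f"{key}: expected a port between 1 and 65535")
        result[key] = value
    return result


overlay = apply({}, {"VLLM_PORT": "8000", "NGC_API_KEY": "nvapi-abc"})
overlay = apply(overlay, {"VLLM_PORT": "", "NGC_API_KEY": ""})
print(overlay)  # {'NGC_API_KEY': 'nvapi-abc'}
```

Validating against a copy keeps the store unchanged when any value in the batch is rejected, which is one simple way to get the all-or-nothing behaviour a settings form usually wants.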
+# ---- end-to-end (TestClient): live reload + route order ----
+# TestClient is created without the `with` context manager so app startup events
+# (the deep-health poll loop) don't run — these stay fully offline.
+
+def _client(monkeypatch, tmp_path):
+    monkeypatch.setenv("APP_SETTINGS_FILE", str(tmp_path / "live.json"))
+    from fastapi.testclient import TestClient
+    from app import server
+    return TestClient(server.app)
+
+
+def test_swap_lock_get_is_not_shadowed(monkeypatch, tmp_path):
+    client = _client(monkeypatch, tmp_path)
+    r = client.get("/api/swap/lock")
+    # Regression: must hit get_swap_lock (200, {"held": False}), NOT the
+    # /api/swap/{job_id} catch-all that returns 404 "no such job".
+    assert r.status_code == 200
+    assert r.json() == {"held": False}
+
+
+def test_settings_apply_is_live_without_restart(monkeypatch, tmp_path):
+    client = _client(monkeypatch, tmp_path)
+    r = client.post("/api/settings", json={"values": {"VLLM_PORT": "8123"}})
+    assert r.status_code == 200
+    # Settings reloaded in place ⇒ /api/config reflects it immediately.
+    assert client.get("/api/config").json()["vllm_port"] == 8123
+    # And clearing it reverts to the default, still live.
+    client.post("/api/settings", json={"values": {"VLLM_PORT": ""}})
+    assert client.get("/api/config").json()["vllm_port"] == 8888
+
+
+def test_settings_post_rejects_bad_value(monkeypatch, tmp_path):
+    client = _client(monkeypatch, tmp_path)
+    r = client.post("/api/settings", json={"values": {"PARAKEET_PORT": "nope"}})
+    assert r.status_code == 422
+
+
+def test_webhook_notifier_repoints_live(monkeypatch, tmp_path):
+    # WebhookNotifier snapshots url/secret, so reload() alone can't reach it;
+    # post_settings must re-point it. Regression for that P1.
+    client = _client(monkeypatch, tmp_path)
+    from app import server
+    client.post("/api/settings", json={"values": {"SWAP_WEBHOOK_URL": "https://example.test/hook"}})
+    assert server.swap_webhook.url == "https://example.test/hook"
+    assert server.swap_webhook.enabled
+    client.post("/api/settings", json={"values": {"SWAP_WEBHOOK_URL": ""}})
+    assert server.swap_webhook.url == ""
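The last two tests rely on a distinction that is easy to miss: an object that holds a reference to the shared `Settings` instance sees an in-place reload for free, while an object that copied a value at construction time does not and has to be re-pointed. The sketch below illustrates that distinction under stated assumptions; the class bodies are hypothetical stand-ins, not the project's `Settings` or `WebhookNotifier`, and only the `8888` default and the key names are taken from this commit.

```python
class Settings:
    """Hypothetical shared settings object, mutated in place on reload."""

    def __init__(self, env):
        self.reload(env)

    def reload(self, env):
        # Re-read every field onto the same instance so existing references stay valid.
        self.vllm_port = int(env.get("VLLM_PORT") or 8888)
        self.swap_webhook_url = env.get("SWAP_WEBHOOK_URL", "")


class LiveReader:
    """Keeps a reference to the shared instance, so it always sees current values."""

    def __init__(self, settings):
        self._settings = settings

    def port(self):
        return self._settings.vllm_port


class SnapshotNotifier:
    """Copies the URL at construction time, so a reload alone cannot reach it."""

    def __init__(self, url):
        self.url = url

    @property
    def enabled(self):
        return bool(self.url)


settings = Settings({})
reader = LiveReader(settings)
notifier = SnapshotNotifier(settings.swap_webhook_url)

settings.reload({"VLLM_PORT": "8123", "SWAP_WEBHOOK_URL": "https://example.test/hook"})

print(reader.port())      # 8123, picked up through the shared reference
print(notifier.enabled)   # False, the snapshot is stale

notifier.url = settings.swap_webhook_url  # the explicit re-point step after saving
print(notifier.enabled)   # True
```

Replacing the `Settings` object with a fresh instance instead of mutating it would leave `reader` holding the old one, which is presumably why the commit message stresses a single shared mutable instance.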
@@ -3,6 +3,15 @@ import { sparkConfigYaml } from '../fileModels/sparkConfig.yaml'
 
 const { InputSpec, Value } = sdk
 
+// This action is intentionally minimal: just the required wiring needed before
+// Spark Control can do anything — the two Spark node addresses and SSH users.
+// Every other knob (vLLM/service ports, container names, support-service hosts,
+// integrations, webhooks) now lives behind the ⚙ Settings gear in the dashboard
+// itself, which is where StartOS 0.4 expects routine config to live (and most
+// operators never open StartOS actions). The optional keys still exist in the
+// config.yaml schema (set by older versions); they're read into env at launch
+// and migrated into the in-app settings overlay on first boot, so nothing is
+// lost on upgrade — they're simply edited in the dashboard from now on.
 const inputSpec = InputSpec.of({
   spark1_host: Value.text({
     name: 'Spark 1 hostname or IP',
@@ -40,164 +49,14 @@ const inputSpec = InputSpec.of({
     placeholder: 'your SSH username',
     masked: false,
   }),
-  vllm_port: Value.text({
-    name: 'vLLM port (optional)',
-    description:
-      "The port your vLLM server listens on, on Spark 1 — used by the health check and the chat proxy. Leave blank to use 8888, which is what the bundled launch-cluster.sh wrapper uses. Set this to 8000 (vLLM's own default) or another port if your vLLM listens elsewhere.",
-    required: false,
-    default: null,
-    placeholder: 'leave blank for 8888',
-    masked: false,
-  }),
-  vllm_container: Value.text({
-    name: 'vLLM container name (optional)',
-    description:
-      'Docker container name for the swappable vLLM on Spark 1. Defaults to "vllm_node" (what the bundled launch-cluster.sh creates). Change this only if you run your vLLM under a different container name — the model-swap log view and the pre-flight validator exec into it by name.',
-    required: false,
-    default: null,
-    placeholder: 'leave blank for vllm_node',
-    masked: false,
-  }),
-  disabled_services: Value.text({
-    name: 'Services to hide (optional)',
-    description:
-      "Comma-separated list of built-in services your cluster doesn't run, so Spark Control hides their tiles and stops probing them. Valid names: parakeet, kokoro, embeddings, qdrant. Example: if you only run vLLM, set this to 'parakeet,kokoro,embeddings,qdrant'. Leave blank to monitor all of them. (Useful when, say, your vLLM shares port 8000 with Parakeet's default — hide Parakeet so its probe doesn't hit vLLM.)",
-    required: false,
-    default: null,
-    placeholder: 'e.g. parakeet,kokoro',
-    masked: false,
-  }),
-  parakeet_host: Value.text({
-    name: 'Parakeet host (optional)',
-    description:
-      "Override the host running the Parakeet STT container. Leave blank if Parakeet runs on Spark 2 — that's the default. Set this if you run Parakeet on Spark 1 or a different machine.",
-    required: false,
-    default: null,
-    placeholder: 'leave blank to use Spark 2',
-    masked: false,
-  }),
-  parakeet_container: Value.text({
-    name: 'Parakeet container name (optional)',
-    description:
-      'Docker container name for Parakeet. Defaults to "parakeet-asr" — change only if you named yours something else.',
-    required: false,
-    default: null,
-    placeholder: 'parakeet-asr',
-    masked: false,
-  }),
-  kokoro_host: Value.text({
-    name: 'Kokoro host (optional)',
-    description:
-      'Override the host running the Kokoro TTS container. Leave blank if Kokoro runs on Spark 2.',
-    required: false,
-    default: null,
-    placeholder: 'leave blank to use Spark 2',
-    masked: false,
-  }),
-  kokoro_container: Value.text({
-    name: 'Kokoro container name (optional)',
-    description: 'Docker container name for Kokoro. Defaults to "kokoro-tts".',
-    required: false,
-    default: null,
-    placeholder: 'kokoro-tts',
-    masked: false,
-  }),
-  embed_host: Value.text({
-    name: 'Embedding server host (optional)',
-    description:
-      'Override the host running the spark-embed container (bge-m3 dense embeddings + reranker). Leave blank if it runs on Spark 2.',
-    required: false,
-    default: null,
-    placeholder: 'leave blank to use Spark 2',
-    masked: false,
-  }),
-  embed_container: Value.text({
-    name: 'Embedding container name (optional)',
-    description:
-      'Docker container name for the embedding server. Defaults to "spark-embed".',
-    required: false,
-    default: null,
-    placeholder: 'spark-embed',
-    masked: false,
-  }),
-  qdrant_host: Value.text({
-    name: 'Qdrant host (optional)',
-    description:
-      'Override the host running the Qdrant vector database. Leave blank if it runs on Spark 2.',
-    required: false,
-    default: null,
-    placeholder: 'leave blank to use Spark 2',
-    masked: false,
-  }),
-  qdrant_container: Value.text({
-    name: 'Qdrant container name (optional)',
-    description: 'Docker container name for Qdrant. Defaults to "qdrant".',
-    required: false,
-    default: null,
-    placeholder: 'qdrant',
-    masked: false,
-  }),
-  qdrant_collection: Value.text({
-    name: 'Default Qdrant collection (optional)',
-    description:
-      'Default collection name used by /api/search when a request does not specify one. Leave blank to require callers to pass a collection.',
-    required: false,
-    default: null,
-    placeholder: 'e.g. crm_chunks',
-    masked: false,
-  }),
-  matrix_bridge_user: Value.text({
-    name: 'matrix-bridge bot SSH user (optional)',
-    description:
-      "If you run the matrix-bridge Matrix bot on Spark 2, enter the SSH user that owns its ~/matrix-bridge folder (e.g. 'modelo'). Spark Control then shows a tile to update, restart, and view logs for the bot. Leave blank if you don't run the bot — the tile stays hidden. Note: this package's SSH public key must be authorized for that user (Show Public Key action) unless it's the same as your Spark 2 user.",
-    required: false,
-    default: null,
-    placeholder: 'e.g. modelo',
-    masked: false,
-  }),
-  open_webui_url: Value.text({
-    name: 'Open WebUI URL (optional)',
-    description:
-      'If you also run Open WebUI on your LAN, paste its URL here. Spark Control will then show a one-click "Open chat" button next to the current model so you can jump straight to it.',
-    required: false,
-    default: null,
-    placeholder: 'e.g. https://open-webui.yourserver.local',
-    masked: false,
-  }),
-  ngc_api_key: Value.text({
-    name: 'NGC API key (optional)',
-    description:
-      'NVIDIA NGC personal API key — needed to install NIM containers (Parakeet, etc.) from nvcr.io. Get one free at https://ngc.nvidia.com/setup/personal-key. Stored only on this Start9 server; passed to docker as the NGC_API_KEY env var when installing NIM services. (Kokoro TTS is Apache 2.0 and does not need an NGC key.)',
-    required: false,
-    default: null,
-    placeholder: 'starts with "nvapi-..."',
-    masked: true,
-  }),
-  swap_webhook_url: Value.text({
-    name: 'Swap webhook URL (optional)',
-    description:
-      'If you run automation that needs to know when the loaded model changes, paste a URL here. Spark Control POSTs a small JSON event (swap_complete / swap_failed) to it after every model swap, so the consumer can re-point its config to the new model. Leave blank to disable. Only needed if something other than this dashboard cares about swaps.',
-    required: false,
-    default: null,
-    placeholder: 'e.g. https://my-service.local/spark-swap',
-    masked: false,
-  }),
-  swap_webhook_secret: Value.text({
-    name: 'Swap webhook secret (optional)',
-    description:
-      'Optional shared secret. If set, each webhook is signed with an "X-Spark-Signature: sha256=…" header (HMAC of the body) so the receiver can verify it really came from Spark Control. Leave blank to send the webhook unsigned.',
-    required: false,
-    default: null,
-    placeholder: 'a random string the receiver also knows',
-    masked: true,
-  }),
 })
 
 export const configureSparks = sdk.Action.withInput(
   'configure-sparks',
   async () => ({
     name: 'Configure Sparks',
-    description: 'Set the hostnames and SSH users for your two Spark nodes.',
+    description:
+      'Set your two Spark node addresses and SSH users — the required wiring. Everything else (ports, container names, support services, integrations) is configured under ⚙ Settings in the Spark Control dashboard.',
     warning: null,
     visibility: 'enabled',
     allowedStatuses: 'any',
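The removed `swap_webhook_secret` description says each webhook carries an `X-Spark-Signature: sha256=…` header computed as an HMAC of the body. A receiver could check it roughly as follows. This is a sketch under assumptions the description does not confirm: that the digest is hex-encoded, that the secret is used as UTF-8 bytes, and that the HMAC covers the raw request body exactly as received. The function names and the sample payload fields are hypothetical; only the `swap_complete` event name comes from the description.

```python
import hashlib
import hmac


def expected_signature(secret, body):
    """Build the header value a sender would produce (assumes a hex digest)."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret, body, header_value):
    """Return True if header_value matches the HMAC-SHA256 of body."""
    expected = expected_signature(secret, body).encode("utf-8")
    # compare_digest does a constant-time comparison to avoid timing leaks.
    return hmac.compare_digest(expected, header_value.encode("utf-8"))


secret = "a random string the receiver also knows"
body = b'{"event": "swap_complete", "model": "example"}'  # illustrative payload
header = expected_signature(secret, body)

print(verify_signature(secret, body, header))                # True
print(verify_signature(secret, body + b" ", header))         # False, body was altered
print(verify_signature("wrong secret", body, header))        # False
```

Verifying against the raw bytes, before any JSON parsing or re-serialisation, avoids false mismatches caused by whitespace or key-order changes.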
@@ -205,11 +64,19 @@ export const configureSparks = sdk.Action.withInput(
   }),
   async () => inputSpec,
   async ({ effects }) => {
+    // Prefill from the saved config, but only the keys this (trimmed) form owns.
     const cfg = await sparkConfigYaml.read().once()
-    return cfg ?? null
+    if (!cfg) return null
+    return {
+      spark1_host: cfg.spark1_host,
+      spark1_user: cfg.spark1_user,
+      spark2_host: cfg.spark2_host,
+      spark2_user: cfg.spark2_user,
+    }
   },
   async ({ effects, input }) => {
-    // Optional fields come through as `null`; coerce to empty string for the schema.
+    // merge() only touches the four keys we submit, leaving any legacy optional
+    // values already in config.yaml intact.
     const normalized = Object.fromEntries(
       Object.entries(input).map(([k, v]) => [k, v ?? '']),
     ) as Record<string, string>
|
|||||||
@@ -1,10 +1,10 @@
 import { VersionInfo, IMPOSSIBLE } from '@start9labs/start-sdk'
 
 export const v0_1_0 = VersionInfo.of({
-  version: '0.26.0:0',
+  version: '0.27.0:0',
   releaseNotes: {
     en_US:
-      "v0.26.0:0 — the model menu is now what's actually on your Sparks. The dashboard scans both Sparks for downloaded models and shows exactly those — no more hard-coded list. (1) Delete means delete: removing a model frees its weights AND takes the card off the menu (re-download later to bring it back, with its saved settings). (2) Download a new model and it appears on the menu by itself when it finishes. (3) Models Spark Control doesn't recognize show a \"needs setup\" card — the first time you switch to one, it reads the model's own files, guesses how to launch it (which family, solo vs both Sparks, the right vLLM flags), and asks you to confirm once; after that it's a normal card. (4) The download box now autocompletes known-good models. (5) Each install shows its own Sparks' models, so a shared copy no longer displays someone else's list. Removed the two legacy Qwen entries (235B FP8, 2.5 72B) — they'll still appear if you actually have them downloaded. No consumer-API changes; the /v1 proxy and swap API are unchanged.",
+      'v0.27.0:0 — settings move into the dashboard, plus two bug fixes. (1) New ⚙ Settings gear in the dashboard: all the optional cluster knobs — vLLM and support-service ports, container names, Parakeet/Kokoro/embeddings/Qdrant hosts, Open WebUI link, NGC key, swap webhook — are now edited here, in plain English, and apply immediately without a restart. The StartOS "Configure Sparks" action is now just the four required fields (two Spark IPs + SSH users); your existing optional values migrate into the gear automatically on first launch, and the settings are stored on the server and included in StartOS backups. (2) NEW: support-service ports are now configurable. If your vLLM runs on 8000 (vLLM\'s own default) and you moved Parakeet to another port, set them under ⚙ Settings → that fixes the false "vLLM down" and the Parakeet 404 some setups saw. (3) Bug fix: GET /api/swap/lock returned 404 (a routing bug where it was shadowed by the swap-job lookup); the swap reservation status now reads correctly. No breaking consumer-API changes; the /v1 proxy and swap API are unchanged.',
   },
   migrations: {
     up: async ({ effects }) => {},
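Item (3) of the release notes describes a static route being shadowed by a parameterized one. The sketch below illustrates that failure mode with a toy first-match router. It is not the project's actual routing code (the backend route registration is not shown in this diff); `Route`, `matches`, `resolve`, and the handler names `getJob` and `getLock` are hypothetical names introduced only for illustration. Only the two path patterns come from the commit message.

```typescript
// Toy first-match router (hypothetical; for illustration only).
type Route = { pattern: string; handler: string }

// A pattern segment wrapped in braces, e.g. {job_id}, matches any path segment.
function matches(pattern: string, path: string): boolean {
  const p = pattern.split('/')
  const s = path.split('/')
  if (p.length !== s.length) return false
  return p.every(
    (seg, i) => (seg.startsWith('{') && seg.endsWith('}')) || seg === s[i],
  )
}

// Returns the handler of the first registered route that matches, or null.
function resolve(routes: Route[], path: string): string | null {
  for (const r of routes) {
    if (matches(r.pattern, path)) return r.handler
  }
  return null
}

// Catch-all registered first: "lock" is captured as a job id.
const shadowed: Route[] = [
  { pattern: '/api/swap/{job_id}', handler: 'getJob' },
  { pattern: '/api/swap/lock', handler: 'getLock' },
]

// Static route registered first: the lock endpoint is reachable again.
const fixed: Route[] = [
  { pattern: '/api/swap/lock', handler: 'getLock' },
  { pattern: '/api/swap/{job_id}', handler: 'getJob' },
]

console.log(resolve(shadowed, '/api/swap/lock')) // getJob
console.log(resolve(fixed, '/api/swap/lock')) // getLock
```

In the shadowed ordering, the job lookup receives the literal string "lock" as an id and presumably finds no such job, which would explain the 404 the notes mention. Registering the static path first avoids this without changing either handler.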