From 7e0759846f7f6a0be4bd16f87dba48da6d42199b Mon Sep 17 00:00:00 2001
From: Keysat
Date: Thu, 18 Jun 2026 13:41:28 -0500
Subject: [PATCH] v0.27.0:0 - in-app settings gear + swap-lock route fix
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Move the ~20 optional cluster knobs out of the StartOS "Configure Sparks"
action (now just the 4 required fields) and into a dashboard ⚙ Settings
gear, backed by a /data/app_settings.json overlay keyed by env-var names.
One shared mutable Settings instance + Settings.reload() applies edits
live without a restart; existing installs' values migrate automatically
on first boot.

Also: support-service ports (parakeet/kokoro/embed/qdrant + vllm) are now
configurable, and GET /api/swap/lock no longer 404s (it was shadowed by
the /api/swap/{job_id} catch-all). WebhookNotifier is re-pointed on save
so its url/secret reload live too.
---
 AGENTS.md                                  |   3 +-
 HANDOFF.md                                 |   9 +-
 README.md                                  |   2 +-
 docs/guides/fastapi-image.md               |   2 +
 image/app/app_settings.py                  | 286 +++++++++++++++++++++
 image/app/config.py                        | 142 +++++-----
 image/app/coordination.py                  |   8 +
 image/app/server.py                        | 135 ++++++----
 image/app/static/app.js                    |  96 +++++++
 image/app/static/index.html                |  16 +-
 image/app/static/style.css                 |  10 +
 image/tests/conftest.py                    |   3 +
 image/tests/test_app_settings.py           | 174 +++++++++++++
 package/startos/actions/configureSparks.ts | 175 ++-----------
 package/startos/versions/v0_1_0.ts         |   4 +-
 15 files changed, 797 insertions(+), 268 deletions(-)
 create mode 100644 image/app/app_settings.py
 create mode 100644 image/tests/test_app_settings.py

diff --git a/AGENTS.md b/AGENTS.md
index 43772d9..b63f534 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -55,12 +55,13 @@ Subsystem guidance lives in `docs/guides/` and loads when matching files are tou
 
 ## Current state
 
+- **Built, pending review+install: v0.27.0:0 — in-app Settings gear + two bug fixes** (prompted by the second adopter's v0.25 feedback).
(1) The StartOS "Configure Sparks" action is trimmed to the **four required fields** (both Spark IPs + SSH users); every optional knob now lives behind a **⚙ Settings gear** in the dashboard, backed by a `/data/app_settings.json` overlay (`app_settings.py`) keyed by env-var names and overlaid on `os.environ`. Edits apply **live** — one shared mutable `Settings` instance, `Settings.reload()` mutates it in place (architecture in the fastapi-image guide). Existing installs' optional values **migrate automatically** on first boot (`seed_from_env`, nothing lost). (2) **NEW: support-service ports are configurable** (`PARAKEET_PORT`/`KOKORO_PORT`/`EMBED_PORT`/`QDRANT_PORT`; `VLLM_PORT` was already a knob, now surfaced in the gear) — fixes the adopter's false "vLLM down" (their vLLM is on 8000, not the launch-cluster.sh default 8888) and Parakeet 404 (they remapped it off 8000). (3) **Bug fix:** `GET /api/swap/lock` returned 404 — it was shadowed by `GET /api/swap/{job_id}`; the static lock routes now register first (load-bearing comment added). Tested locally (151 pytest + live smoke: live-apply, secret masking, route fix, 422 on bad input). Still **un-diagnosed** from the adopter report: their disk-scan shows Gemma "not on disk" — needs them to confirm where it's cached (`ls ~/.cache/huggingface/hub` as the SSH user) vs `disk.py`'s `$HOME/.cache/huggingface/hub` assumption. **Next: act on the code review, then build/install (go/no-go) + draft the adopter reply describing the fixes.**
 - **Live: v0.26.0:0 — disk-driven model menu** (installed on the server 2026-06-18, `installed-version` confirms; also published to the self-hosted StartOS registry). The dashboard lists what's *actually downloaded* on the Sparks; `models.yaml`/overrides are **launch recipes** matched by `repo`, not the menu; an on-disk model with no recipe shows `needs_setup` and infers its launch flags from `config.json` (operator confirms once).
Delete removes weights **and** the card; dropped the two legacy Qwen recipes. Architecture (`discovery.py`/`build_menu`/`infer_recipe`, the recipe-vs-disk split) is in the fastapi-image guide.
 - **Next (owner-driven, concrete): Gemma-4-26B-A4B vision daily-driver eval.** The `gemma4-26b` recipe is in the catalog (NVFP4 MoE; `--moe_backend=marlin` set — the fast CUTLASS FP4 path errors on GB10; vision+tools). Not yet downloaded or swap-tested. Owner wants vision for business-card OCR and is weighing it against the text-only Qwen3.6 35B daily driver (research: Gemma ~52 tok/s vs Qwen's ~97, slightly weaker reasoning). Next: download it, swap-test, try a business card.
 - **Live: v0.25.0:0** (installed 2026-06-18). The OpenClaw/Johnny-5 coexistence epic is fully shipped & live: configurable `VLLM_PORT` (v0.22, blank ⇒ 8888), local/fine-tuned models (v0.23), configurable topology (v0.24 — `VLLM_CONTAINER`, `DISABLED_SERVICES` hide-list, second-Spark `kind: vllm` monitor), coordination layer (v0.25 — swap reservation lock with `423`-enforced manual-swap pause + `?force=true` Release override, `swap_complete`/`swap_failed` webhook, read-only schedule registry; consumer API in `docs/COORDINATION.md`).
 - **Other live features:** swap dashboard; chat / transcribe / diarize(+chunk) / TTS proxies; embeddings + rerank + hybrid search (Qdrant); `/scrub` + `/rehydrate`; label-merge incl. dual-channel; per-Spark SSH-key copy + WireGuard `VPN ` hardware badge. Security hardening (v0.19 — shellsafe SSH-injection guard, Qdrant path-injection, same-origin CSRF guard) stable (`EVALUATION.md`). Spark 2 audio/embeddings stack healthy.
 - **matrix-bridge bot tile (v0.21.0:1, live):** `bot`-kind tile (docker-state badge; Update/Restart/Stop-Start/View-logs) for the Matrix bot on Spark 2, driven as `modelo` (no `sudo -iu`; blank `matrix_bridge_user` ⇒ tile hidden; host reuses `spark2_host`). Code: `app/matrix_bridge.py` + `/api/matrix-bridge/{update,logs}`.
**Load-bearing:** Update's `git fetch` runs as `modelo` and needs `modelo`'s `~/.ssh/config` pinning the Gitea deploy key with `IdentitiesOnly yes` (else publickey denial). Optional next only if the bot dev asks: Docker `HEALTHCHECK`.
-- **Tests:** offline pytest harness in `image/tests/` — `cd image && .venv/bin/python -m pytest` (137 passing). Covers `build_launch_command` (incl. the shell-injection round-trip + local-model bind-mount), the transcript↔diarizer label-merge, the `shellsafe` validators, `matrix_bridge.build_update_command` (+ phase detection), the configurable-topology layer (`test_topology.py`), the coordination layer (`test_coordination.py`: swap-lock lifecycle/expiry/token-auth, schedule-registry CRUD, webhook payload + HMAC signature — `now` is injected into the lock so expiry is tested without sleeping), and the disk-driven menu (`test_discovery.py`: cache-dirname↔repo parsing, the cache-listing parser incl. incomplete-download filtering, and `infer_recipe` family/mode mapping — Qwen3-MoE→flashinfer_cutlass, Gemma-MoE→marlin, vision caps, solo-vs-cluster by size/host-count). The `build_menu` merge + `/api/models/suggest` are exercised by hand against the live cluster (mock-heavy unit tests there would test the mocks). Redaction + live-audio suites remain standalone scripts.
+- **Tests:** offline pytest harness in `image/tests/` — `cd image && .venv/bin/python -m pytest` (151 passing; the in-app settings gear + swap-lock route-order regression are in `test_app_settings.py`, incl. a `TestClient` live-apply check). Covers `build_launch_command` (incl.
the shell-injection round-trip + local-model bind-mount), the transcript↔diarizer label-merge, the `shellsafe` validators, `matrix_bridge.build_update_command` (+ phase detection), the configurable-topology layer (`test_topology.py`), the coordination layer (`test_coordination.py`: swap-lock lifecycle/expiry/token-auth, schedule-registry CRUD, webhook payload + HMAC signature — `now` is injected into the lock so expiry is tested without sleeping), and the disk-driven menu (`test_discovery.py`: cache-dirname↔repo parsing, the cache-listing parser incl. incomplete-download filtering, and `infer_recipe` family/mode mapping — Qwen3-MoE→flashinfer_cutlass, Gemma-MoE→marlin, vision caps, solo-vs-cluster by size/host-count). The `build_menu` merge + `/api/models/suggest` are exercised by hand against the live cluster (mock-heavy unit tests there would test the mocks). Redaction + live-audio suites remain standalone scripts.
 - **Signal Engine "flakiness":** diagnosed as *not* a server bug — transient 1–4s unresponsiveness while the single GPU is busy. Client-side remedy (in-flight cap 2 / ceiling 3 / retry-on-timeout+503) drafted and **forwarded to that dev (owner confirmed 2026-06-15)**. Awaiting whether they want the measured concurrency knee.
 - **Stance (decided, not built):** no public interface / no API-token auth — LAN + WireGuard/Tailscale split-tunnel only; the CSRF guard covers the browser-driven vector.
 - **Known limits:** `/health` blips while the GPU is busy (mitigated client-side); dual-channel can miss a quiet local word under loud remote bleed; connectivity log misses sub-5s outages between 5s polls; diarizer caps at 4 speakers; matrix-bridge badge won't visibly flip on a fast `docker restart` (status re-checked only after the command returns).
diff --git a/HANDOFF.md b/HANDOFF.md
index e3a5ea4..7a8bf09 100644
--- a/HANDOFF.md
+++ b/HANDOFF.md
@@ -73,16 +73,15 @@ The first start generates an ed25519 SSH keypair inside the package volume. Wait
 ### 4.
Configure Sparks
 
 - Open Spark Control → **Actions → Configure Sparks**.
-- Fill in:
+- Fill in just the four required fields:
   - **Spark 1 hostname or IP** — prefer the **IP** (e.g. `192.168.1.x`) over `.local` hostnames; vLLM only binds IPv4 and mDNS can resolve to IPv6 first.
   - **Spark 1 SSH user** — whatever username you set up on Spark 1.
   - **Spark 2 hostname or IP** + **SSH user** — same idea.
-  - Optional Parakeet/Kokoro overrides — leave blank if those services run on Spark 2 (the normal case).
-  - Optional **Open WebUI URL** — paste your Open WebUI LAN URL to get a deep-link button in the dashboard next to the current model.
-  - Optional **NGC API key** — paste it here if you have one.
 
 Save.
 
+Everything else is optional and lives in the dashboard, not this action: open Spark Control and click **⚙ Settings** in the top bar to set vLLM/service **ports** (e.g. if your vLLM runs on 8000 rather than the default 8888, or you moved Parakeet off 8000), container names, support-service hosts, an **Open WebUI URL** (adds a deep-link button), an **NGC API key**, and a swap webhook. Changes there apply immediately and are included in StartOS backups.
+
 ### 5. Re-run Show Public Key (if you skipped earlier)
 
 Now that hosts are configured, Show Public Key will give you the paste-ready install command. Run it as described in step 3.
@@ -159,7 +158,7 @@ All of these inherit Spark Control's TLS cert and StartOS access controls. You o
 A few things worth knowing:
 
 - The codebase is **two halves**: `image/` is a standalone FastAPI app you can run with `uvicorn app.server:app` for local dev. `package/` is the StartOS wrapper. Changes to either should be coordinated.
-- **All connection info** comes from environment variables in `image/app/config.py`, populated from `package/startos/fileModels/sparkConfig.yaml.ts` via the Configure Sparks action. No IPs, usernames, or paths are hardcoded in runtime code.
+- **All connection info** comes from environment variables in `image/app/config.py`. The four required fields are populated from `package/startos/fileModels/sparkConfig.yaml.ts` via the Configure Sparks action; the optional knobs are overlaid from the in-app `⚙ Settings` store (`/data/app_settings.json`, see `image/app/app_settings.py`). No IPs, usernames, or paths are hardcoded in runtime code.
 - The **path `~/spark-vllm-docker`** *is* hardcoded in `swap.py`, `download.py`, `updates.py`, and `models.py`. If the user has cloned the upstream repo elsewhere, either fix the path or symlink it.
 - **Persistent state** lives at `/data/` inside the container: `config.yaml`, `models-overrides.yaml`, `services-overrides.yaml`, `connectivity.json`, `ssh/`. These survive package updates.
 - The dashboard polls every 5 s; check `image/app/health.py` and `image/app/connectivity.py` for the probing logic. External apps can also POST failures to `/api/health-event` to log between-poll blips.
diff --git a/README.md b/README.md
index 6e9d290..8d65f62 100644
--- a/README.md
+++ b/README.md
@@ -118,7 +118,7 @@ Fields: `service` (required), `ok` (required), `source` (optional, free-form), `
 
 - **Service discovery API** (`/api/endpoints`) for other LAN services
 - **Kokoro-82M TTS** replaces Magpie/Riva NIM as the default TTS backend (v0.14.0). Magpie's decoder had a ~30-50% truncation rate on multi-sentence inputs and ate 49 GB of GPU memory; Kokoro is 24/24 reliable at every input length tested, uses 1.3 GB GPU, and renders in ~1s. See HANDOFF.md and the release notes for the migration story.
-- **Always-on services panel** with Start/Stop/Restart for Parakeet + Kokoro, plus per-service host configuration in Configure Sparks (so they can live on Spark 1, Spark 2, or anywhere)
+- **Always-on services panel** with Start/Stop/Restart for Parakeet + Kokoro, plus per-service host/port/container configuration in the in-app **⚙ Settings** gear (so they can live on Spark 1, Spark 2, or anywhere, on any port)
 - **Model download** from the dashboard — paste an HF repo (with autocomplete for known models), pick solo or cluster, watch percent progress with bytes/rate/ETA. After completion the model appears on the menu automatically; if it's unrecognized, a pre-filled "set up this model" dialog offers to configure it.
 - **spark-vllm-docker update check** — banner shows "N commits behind upstream"; Apply Update runs `git pull && ./build-and-copy.sh -c` over SSH with a streamed log
 - **Per-model Advanced settings** — knobs for max context, GPU memory %, and three optimization toggles (fastsafetensors, prefix caching, FP8 KV cache). Persisted to `/data/models-overrides.yaml` so they survive package updates. Bundled and custom models alike.
diff --git a/docs/guides/fastapi-image.md b/docs/guides/fastapi-image.md
index 2ec31fd..dc1e0c0 100644
--- a/docs/guides/fastapi-image.md
+++ b/docs/guides/fastapi-image.md
@@ -35,11 +35,13 @@ Two kinds, both run with the `image/.venv` interpreter (system python3 has no de
 - New external-facing endpoints get documented in `docs/` (`AUDIO_API.md`, `EMBEDDINGS.md`, `REDACTION_GATEWAY.md`) and noted in release notes.
 - **SSH-input safety:** any user-supplied value that reaches an SSH command on the Sparks MUST go through `app/shellsafe.py` — validate against a whitelist at the API boundary, then `quote_arg`/`quote_args` (`shlex.quote`) at the sink. Never raw f-string a user value into a command string.
Existing sinks: `models.build_launch_command`, `download`, `nim`, `services`; `disk.py` keeps its own `_SAFE_DIRNAME` because it needs `$HOME` to expand server-side. The vLLM pre-flight (`validate.py`) relies on `shlex.split` cleanly reversing this quoting — preserve that invariant.
 - **CSRF / same-origin:** state-mutating *control* endpoints are guarded by the `csrf_guard` middleware in `server.py` (rejects requests whose `Origin`/`Referer` host ≠ the served host). A new endpoint meant to be called **cross-origin by downstream apps** (a proxy/data endpoint) must be added to `_CSRF_EXEMPT_PREFIXES`, or browser POSTs from those apps will 403. No app-layer token auth by design (LAN/VPN-only; would break consumers).
+- **Settings split (gear vs StartOS action):** only the four *required* fields (both Spark IPs + SSH users) live in the StartOS "Configure Sparks" action → `config.yaml` → env. Every *optional* knob (ports, container names, support-service hosts, integrations, webhook) is edited in the dashboard's ⚙ Settings gear, backed by the `/data/app_settings.json` overlay (`app_settings.py`), keyed by the same env-var names. Precedence (`config._effective_env`): `os.environ` first, overlay on top. `app_settings.seed_from_env` runs **once at startup** to migrate a pre-gear install's env values into the overlay (don't move seeding into `from_env`/`reload` — it writes, and `from_env` runs on every build → it would clobber across calls, which it did once already). **`Settings` is deliberately not frozen:** one shared instance is threaded by reference into every router closure/manager, and `Settings.reload()` (called after a gear save) recomputes its fields **in place** so changes apply live with no restart and no call-site changes. A new gear knob = add one entry to `app_settings.FIELDS` (the front-end renders it generically); the matching `config.Settings` field must already read that env var.
 
 ## Layout
 
 - `image/app/server.py` — FastAPI entry; routers live in sibling modules (`audio_proxy.py`, `llm_proxy.py`, `embeddings_proxy.py`, `redaction_gateway.py`, `swap.py`, `health.py`, `deep_health.py`, `connectivity.py`, …).
 - `image/app/discovery.py` — the disk-driven model menu. `/api/models` lists what's actually downloaded on the Sparks (via `disk.list_cached_models`); `models.yaml`/overrides are *launch recipes* matched by repo, not the menu. An on-disk model with no recipe is `needs_setup` → `infer_recipe` reads its `config.json` to prefill a setup form the operator confirms once.
+- `image/app/app_settings.py` — the in-app settings overlay backing the ⚙ gear: `FIELDS` metadata (drives `/api/settings` + the UI form), `load_overlay()` (pure read), `seed_from_env()` (one-time migration), `apply()` (validate + persist). `GET/POST /api/settings` in `server.py` read/write it, then `settings.reload()`.
 - `image/app/static/` — the dashboard UI.
 - `image/models.yaml` — bundled vLLM **launch recipes** (how to launch a known model), NOT the dashboard menu — the menu is the on-disk scan.
 - `image/spark_embed/` — Dockerfile + app for the embeddings container; built ON a Spark (ARM64, NGC PyTorch base — see the audio/cluster rule for NGC torch-pinning caveats).
diff --git a/image/app/app_settings.py b/image/app/app_settings.py
new file mode 100644
index 0000000..1796477
--- /dev/null
+++ b/image/app/app_settings.py
@@ -0,0 +1,286 @@
+"""App-owned settings overlay: the in-dashboard 'gear' knobs.
+
+Spark Control's *required* wiring — the two Spark IPs and SSH users — is set once
+via the StartOS "Configure Sparks" action and arrives as env vars. Everything
+else (ports, container names, support-service hosts, integrations, webhook) is
+optional and lives here: a small JSON overlay on /data that the dashboard gear
+reads and writes, so an operator never has to open StartOS actions to tune the
+cluster.
This follows the StartOS 0.4 convention (minimal setup action; routine
+config in the app's own UI) and stays inside the package's backup volume, so the
+file is backed up and restored for free.
+
+Each overlay entry is keyed by the *same env var name* config.Settings already
+reads, so the overlay is simply an env-var override store. Precedence (see
+config._effective_env): process env first, this overlay on top — so a knob set
+in the gear wins, while an un-touched knob falls through to whatever the StartOS
+action injected, then to the code default.
+
+First-run migration: when the overlay file doesn't exist yet (e.g. an existing
+install upgrading into this version), it's seeded from the current env so any
+value previously set via the StartOS action carries over into the gear with no
+operator action and nothing lost.
+"""
+from __future__ import annotations
+import json
+import logging
+import os
+import re
+import tempfile
+from pathlib import Path
+from typing import Mapping
+
+log = logging.getLogger(__name__)
+
+# Field metadata drives BOTH the /api/settings response (the front-end renders
+# the form generically from this) and light server-side validation. `key` is the
+# env var name; `type` is one of text|int|csv|secret. `secret` values are
+# write-only — never echoed back to the browser.
+FIELDS: list[dict] = [
+    # --- vLLM (Spark 1) ---
+    {"group": "vLLM (Spark 1)", "key": "VLLM_PORT", "label": "vLLM port", "type": "int",
+     "placeholder": "8888",
+     "help": "Port your vLLM listens on. Blank ⇒ 8888 (the bundled launch-cluster.sh). Set 8000 for vanilla vLLM, or wherever yours listens."},
+    {"group": "vLLM (Spark 1)", "key": "VLLM_CONTAINER", "label": "vLLM container name", "type": "text",
+     "placeholder": "vllm_node",
+     "help": "Docker container the swappable vLLM runs in. Blank ⇒ vllm_node.
The swap log-tail and pre-flight validator exec into it by name."},
+
+    # --- Monitoring ---
+    {"group": "Monitoring", "key": "DISABLED_SERVICES", "label": "Services to hide", "type": "csv",
+     "placeholder": "e.g. parakeet,kokoro",
+     "help": "Comma-separated built-in services your cluster doesn't run, so their tiles are hidden and never probed. Valid: parakeet, kokoro, embeddings, qdrant. Blank ⇒ monitor all."},
+
+    # --- Parakeet (STT) ---
+    {"group": "Parakeet (STT)", "key": "PARAKEET_HOST", "label": "Host", "type": "text",
+     "placeholder": "leave blank for Spark 2",
+     "help": "Host running the Parakeet STT container. Blank ⇒ Spark 2."},
+    {"group": "Parakeet (STT)", "key": "PARAKEET_PORT", "label": "Port", "type": "int",
+     "placeholder": "8000",
+     "help": "Port Parakeet listens on. Blank ⇒ 8000. Set this if you remapped it (e.g. because your vLLM holds 8000)."},
+    {"group": "Parakeet (STT)", "key": "PARAKEET_CONTAINER", "label": "Container name", "type": "text",
+     "placeholder": "parakeet-asr",
+     "help": "Docker container name for Parakeet. Blank ⇒ parakeet-asr."},
+    {"group": "Parakeet (STT)", "key": "PARAKEET_USER", "label": "SSH user", "type": "text",
+     "placeholder": "leave blank for Spark 2 user",
+     "help": "SSH user that owns the Parakeet container. Blank ⇒ your Spark 2 user."},
+
+    # --- Kokoro (TTS) ---
+    {"group": "Kokoro (TTS)", "key": "KOKORO_HOST", "label": "Host", "type": "text",
+     "placeholder": "leave blank for Spark 2",
+     "help": "Host running the Kokoro TTS container. Blank ⇒ Spark 2."},
+    {"group": "Kokoro (TTS)", "key": "KOKORO_PORT", "label": "Port", "type": "int",
+     "placeholder": "8880",
+     "help": "Port Kokoro listens on. Blank ⇒ 8880."},
+    {"group": "Kokoro (TTS)", "key": "KOKORO_CONTAINER", "label": "Container name", "type": "text",
+     "placeholder": "kokoro-tts",
+     "help": "Docker container name for Kokoro.
Blank ⇒ kokoro-tts."},
+    {"group": "Kokoro (TTS)", "key": "KOKORO_USER", "label": "SSH user", "type": "text",
+     "placeholder": "leave blank for Spark 2 user",
+     "help": "SSH user that owns the Kokoro container. Blank ⇒ your Spark 2 user."},
+
+    # --- Embeddings ---
+    {"group": "Embeddings", "key": "EMBED_HOST", "label": "Host", "type": "text",
+     "placeholder": "leave blank for Spark 2",
+     "help": "Host running the spark-embed container (bge-m3 + reranker). Blank ⇒ Spark 2."},
+    {"group": "Embeddings", "key": "EMBED_PORT", "label": "Port", "type": "int",
+     "placeholder": "8088",
+     "help": "Port the embedding server listens on. Blank ⇒ 8088."},
+    {"group": "Embeddings", "key": "EMBED_CONTAINER", "label": "Container name", "type": "text",
+     "placeholder": "spark-embed",
+     "help": "Docker container name for the embedding server. Blank ⇒ spark-embed."},
+    {"group": "Embeddings", "key": "EMBED_USER", "label": "SSH user", "type": "text",
+     "placeholder": "leave blank for Spark 2 user",
+     "help": "SSH user that owns the embedding container. Blank ⇒ your Spark 2 user."},
+
+    # --- Qdrant ---
+    {"group": "Qdrant", "key": "QDRANT_HOST", "label": "Host", "type": "text",
+     "placeholder": "leave blank for Spark 2",
+     "help": "Host running the Qdrant vector database. Blank ⇒ Spark 2."},
+    {"group": "Qdrant", "key": "QDRANT_PORT", "label": "Port", "type": "int",
+     "placeholder": "6333",
+     "help": "Port Qdrant's REST API listens on. Blank ⇒ 6333."},
+    {"group": "Qdrant", "key": "QDRANT_CONTAINER", "label": "Container name", "type": "text",
+     "placeholder": "qdrant",
+     "help": "Docker container name for Qdrant. Blank ⇒ qdrant."},
+    {"group": "Qdrant", "key": "QDRANT_USER", "label": "SSH user", "type": "text",
+     "placeholder": "leave blank for Spark 2 user",
+     "help": "SSH user that owns the Qdrant container. Blank ⇒ your Spark 2 user."},
+    {"group": "Qdrant", "key": "QDRANT_COLLECTION", "label": "Default collection", "type": "text",
+     "placeholder": "e.g.
crm_chunks",
+     "help": "Collection used by /api/search when a request doesn't name one. Blank ⇒ callers must pass a collection."},
+
+    # --- Integrations ---
+    {"group": "Integrations", "key": "OPEN_WEBUI_URL", "label": "Open WebUI URL", "type": "text",
+     "placeholder": "e.g. https://open-webui.yourserver.local",
+     "help": "If set, the header shows a one-click 'Open chat' button to your Open WebUI."},
+    {"group": "Integrations", "key": "MATRIX_BRIDGE_USER", "label": "matrix-bridge bot SSH user", "type": "text",
+     "placeholder": "e.g. modelo",
+     "help": "SSH user owning the bot's ~/matrix-bridge clone (Spark 2). Set this to show the bot tile (update/restart/logs). Blank ⇒ tile hidden."},
+    {"group": "Integrations", "key": "NGC_API_KEY", "label": "NGC API key", "type": "secret",
+     "placeholder": "starts with nvapi-…",
+     "help": "NVIDIA NGC personal key, needed only to install NIM containers from nvcr.io. Stored on this server."},
+    {"group": "Integrations", "key": "SWAP_WEBHOOK_URL", "label": "Swap webhook URL", "type": "text",
+     "placeholder": "e.g. https://my-service.local/spark-swap",
+     "help": "POSTed a small JSON event (swap_complete / swap_failed) after every model swap, so automation can re-point to the new model. Blank ⇒ disabled."},
+    {"group": "Integrations", "key": "SWAP_WEBHOOK_SECRET", "label": "Swap webhook secret", "type": "secret",
+     "placeholder": "a random shared string",
+     "help": "If set, each webhook is HMAC-signed (X-Spark-Signature) so the receiver can verify it. Blank ⇒ unsigned."},
+]
+
+_BY_KEY = {f["key"]: f for f in FIELDS}
+_SECRET_KEYS = frozenset(f["key"] for f in FIELDS if f["type"] == "secret")
+_INT_KEYS = frozenset(f["key"] for f in FIELDS if f["type"] == "int")
+# Reject control characters (incl. newlines) — these values flow into env vars,
+# URLs, and SSH command lines (quoted at the sink, but defence in depth).
+_BAD_CHARS = re.compile(r"[\x00-\x1f\x7f]")
+# A secret's value is never echoed back, so a blank submit means "keep the stored
+# one" (you can't see it to retype it). To actually *remove* a stored secret the
+# UI sends this sentinel instead of a real value. Surfaced to the front-end via
+# public_view so the two stay in sync.
+CLEAR_SENTINEL = "__clear__"
+
+
+def _path() -> Path:
+    return Path(os.environ.get("APP_SETTINGS_FILE", "/data/app_settings.json"))
+
+
+def field_keys() -> frozenset[str]:
+    return frozenset(_BY_KEY)
+
+
+def load_overlay() -> dict[str, str]:
+    """Return the overlay as {ENV_KEY: value}, filtered to known, non-empty keys.
+
+    Pure read (no side effects) — called on every Settings (re)build, so it must
+    not write. Missing/corrupt file ⇒ {}. The file is tiny."""
+    p = _path()
+    if not p.exists():
+        return {}
+    try:
+        raw = json.loads(p.read_text())
+    except (ValueError, OSError) as e:
+        log.warning("ignoring unreadable %s: %s", p, e)
+        return {}
+    if not isinstance(raw, dict):
+        return {}
+    return {k: str(v) for k, v in raw.items() if k in _BY_KEY and v not in (None, "")}
+
+
+def seed_from_env(env: Mapping[str, str]) -> None:
+    """One-time migration, called once at startup: if no overlay exists yet, seed
+    it from the current env so any optional value previously set via the StartOS
+    action carries into the gear automatically (nothing lost on upgrade). No-op
+    if the file already exists or the env carries no known non-empty knob — a
+    fresh install then starts with no overlay and pure defaults. Values run
+    through the same validation as apply(); a malformed one (e.g.
a paste-error
+    port) is skipped rather than written, matching the gear's own guards."""
+    if _path().exists():
+        return
+    seeded: dict[str, str] = {}
+    for k in _BY_KEY:
+        v = env.get(k)
+        if not v:
+            continue
+        try:
+            cleaned = _validate(k, v)
+        except SettingsError as e:
+            log.warning("skipping invalid env value while seeding overlay: %s", e)
+            continue
+        if cleaned and cleaned != CLEAR_SENTINEL:
+            seeded[k] = cleaned
+    if seeded:
+        _write(seeded)
+        log.info("seeded settings overlay from env (%d keys): %s", len(seeded), _path())
+
+
+def _write(overlay: dict[str, str]) -> None:
+    p = _path()
+    p.parent.mkdir(parents=True, exist_ok=True)
+    # Atomic replace so a crash mid-write never leaves a truncated overlay.
+    fd, tmp = tempfile.mkstemp(dir=str(p.parent), prefix=".app_settings.", suffix=".tmp")
+    try:
+        with os.fdopen(fd, "w") as fh:
+            json.dump(overlay, fh, indent=2, sort_keys=True)
+        os.replace(tmp, p)
+    except BaseException:
+        try:
+            os.unlink(tmp)
+        except OSError:
+            pass
+        raise
+
+
+def public_view() -> dict:
+    """Shape the gear form for the browser: ordered groups of fields with their
+    current overlay value.
Secret values are never sent — only a `set` flag."""
+    overlay = load_overlay()
+    groups: list[dict] = []
+    index: dict[str, dict] = {}
+    for f in FIELDS:
+        g = index.get(f["group"])
+        if g is None:
+            g = {"name": f["group"], "fields": []}
+            index[f["group"]] = g
+            groups.append(g)
+        entry = {
+            "key": f["key"],
+            "label": f["label"],
+            "type": f["type"],
+            "placeholder": f.get("placeholder", ""),
+            "help": f.get("help", ""),
+        }
+        if f["type"] == "secret":
+            entry["set"] = bool(overlay.get(f["key"]))
+        else:
+            entry["value"] = overlay.get(f["key"], "")
+        g["fields"].append(entry)
+    return {"groups": groups, "clear_sentinel": CLEAR_SENTINEL}
+
+
+class SettingsError(ValueError):
+    """Bad input to apply() — surfaced as 422 by the endpoint."""
+
+
+def _validate(key: str, value) -> str:
+    """Clean + validate one value; raise SettingsError on bad input. Returns the
+    stripped string ('' is valid and means 'unset'). The CLEAR_SENTINEL passes
+    through for the caller to interpret (secret removal)."""
+    if key not in _BY_KEY:
+        raise SettingsError(f"unknown setting: {key}")
+    val = ("" if value is None else str(value)).strip()
+    if val == CLEAR_SENTINEL:
+        return val
+    if _BAD_CHARS.search(val):
+        raise SettingsError(f"{key}: control characters are not allowed")
+    if key in _INT_KEYS and val:
+        if not val.isdigit() or not (1 <= int(val) <= 65535):
+            raise SettingsError(f"{key}: must be a port number between 1 and 65535")
+    return val
+
+
+def apply(updates: Mapping[str, str]) -> dict[str, str]:
+    """Validate `updates` and merge them into the overlay, then persist.
+
+    Rules per key:
+      - unknown key / bad int / control chars → reject (422, via _validate)
+      - secret + CLEAR_SENTINEL → delete the stored secret
+      - secret + blank value → leave the stored secret unchanged (don't wipe)
+      - non-secret + blank → delete the key (revert to env/default)
+      - otherwise → set the key
+
+    Returns the new overlay. The caller reloads Settings so the change goes live.
+    """
+    overlay = load_overlay()
+    for key, value in updates.items():
+        val = _validate(key, value)
+        if key in _SECRET_KEYS:
+            if val == CLEAR_SENTINEL:
+                overlay.pop(key, None)
+            elif val:
+                overlay[key] = val
+            # blank secret ⇒ leave the existing value in place
+        elif val and val != CLEAR_SENTINEL:
+            overlay[key] = val
+        else:
+            overlay.pop(key, None)
+    _write(overlay)
+    return overlay
diff --git a/image/app/config.py b/image/app/config.py
index a13891f..12c1abf 100644
--- a/image/app/config.py
+++ b/image/app/config.py
@@ -1,26 +1,28 @@
 from __future__ import annotations
 import logging
 import os
-from dataclasses import dataclass
+from dataclasses import dataclass, fields
 from pathlib import Path
+from typing import Mapping
 
+from . import app_settings
 from .shellsafe import validate_container
 
 log = logging.getLogger(__name__)
 
 
-def _env(name: str, default: str = "") -> str:
-    return os.environ.get(name, default)
+def _env(src: Mapping[str, str], name: str, default: str = "") -> str:
+    return src.get(name, default)
 
 
-def _env_container(name: str, default: str) -> str:
+def _env_container(src: Mapping[str, str], name: str, default: str) -> str:
     """Resolve a container-name env var, validating it at the config boundary.
 
     The value flows into `docker logs`/`docker exec` over SSH, so it's quoted at
     the sink — but per the repo's two-layer convention it's also whitelist-checked
     here. A malformed optional value falls back to `default` rather than crashing
-    daemon startup (mirrors `_env_int` for VLLM_PORT)."""
-    val = os.environ.get(name, "") or default
+    daemon startup (mirrors `_env_int`)."""
+    val = src.get(name, "") or default
     try:
         return validate_container(val)
     except ValueError:
@@ -28,23 +30,23 @@ def _env_container(name: str, default: str) -> str:
         return default
 
 
-def _env_set(name: str) -> frozenset[str]:
+def _env_set(src: Mapping[str, str], name: str) -> frozenset[str]:
     """Parse a comma-separated env var into a lowercased frozenset of keys.
Used by DISABLED_SERVICES so an adopter whose cluster doesn't run a given support service can switch its tile + probes off entirely (rather than have the probe hit whatever else listens on that port — e.g. a vLLM sharing Parakeet's default 8000).""" - raw = os.environ.get(name, "") + raw = src.get(name, "") return frozenset(part.strip().lower() for part in raw.split(",") if part.strip()) -def _env_int(name: str, default: int) -> int: +def _env_int(src: Mapping[str, str], name: str, default: int) -> int: """Parse an int env var, falling back to `default` when unset, blank, or - malformed. The StartOS Configure panel passes optional numeric fields as an - empty string when left blank, so a bare int("") would crash daemon startup.""" + malformed. Optional numeric fields arrive as an empty string when left blank, + so a bare int("") would crash daemon startup.""" try: - return int(os.environ.get(name, "") or default) + return int(src.get(name, "") or default) except (TypeError, ValueError): return default @@ -64,8 +66,23 @@ def _resolve_models_yaml() -> str: return str(candidates[0]) # let load fail with a clear path -@dataclass(frozen=True) +def _effective_env() -> dict[str, str]: + """The env Settings is built from: process env first, the in-app settings + overlay on top. The overlay (the dashboard 'gear') is keyed by the same env + var names, so a knob set in the UI overrides the value the StartOS action + injected — while an un-touched knob keeps falling through to the action's + value, then to the code default. See app_settings.""" + return {**os.environ, **app_settings.load_overlay()} + + +@dataclass class Settings: + # NOTE: intentionally NOT frozen. There is exactly one Settings instance, + # shared by reference across every router closure and manager (build_router, + # self.settings = settings). 
`reload()` mutates it in place so a change saved + # via the in-app settings gear goes live for all of them without rebuilding + # the app — the only window of inconsistency is the microseconds it takes to + # reassign the fields, acceptable for a single-operator config save. spark1_host: str spark1_user: str spark2_host: str @@ -107,73 +124,82 @@ class Settings: swap_webhook_secret: str @classmethod - def from_env(cls) -> "Settings": - spark2_host = _env("SPARK2_HOST") - spark2_user = _env("SPARK2_USER") + def from_env(cls, src: Mapping[str, str] | None = None) -> "Settings": + src = _effective_env() if src is None else src + spark2_host = _env(src, "SPARK2_HOST") + spark2_user = _env(src, "SPARK2_USER") # Parakeet (STT) and Kokoro (TTS) default to Spark 2 unless overridden. return cls( - spark1_host=_env("SPARK1_HOST"), - spark1_user=_env("SPARK1_USER"), + spark1_host=_env(src, "SPARK1_HOST"), + spark1_user=_env(src, "SPARK1_USER"), spark2_host=spark2_host, spark2_user=spark2_user, - parakeet_host=_env("PARAKEET_HOST") or spark2_host, - parakeet_user=_env("PARAKEET_USER") or spark2_user, - parakeet_container=_env("PARAKEET_CONTAINER") or "parakeet-asr", - kokoro_host=_env("KOKORO_HOST") or spark2_host, - kokoro_user=_env("KOKORO_USER") or spark2_user, - kokoro_container=_env("KOKORO_CONTAINER") or "kokoro-tts", + parakeet_host=_env(src, "PARAKEET_HOST") or spark2_host, + parakeet_user=_env(src, "PARAKEET_USER") or spark2_user, + parakeet_container=_env(src, "PARAKEET_CONTAINER") or "parakeet-asr", + kokoro_host=_env(src, "KOKORO_HOST") or spark2_host, + kokoro_user=_env(src, "KOKORO_USER") or spark2_user, + kokoro_container=_env(src, "KOKORO_CONTAINER") or "kokoro-tts", # Embeddings (spark-embed: bge-m3 dense + reranker) and Qdrant # (vector storage) default to Spark 2 unless overridden. 
- embed_host=_env("EMBED_HOST") or spark2_host, - embed_user=_env("EMBED_USER") or spark2_user, - embed_container=_env("EMBED_CONTAINER") or "spark-embed", - qdrant_host=_env("QDRANT_HOST") or spark2_host, - qdrant_user=_env("QDRANT_USER") or spark2_user, - qdrant_container=_env("QDRANT_CONTAINER") or "qdrant", - qdrant_collection=_env("QDRANT_COLLECTION", ""), + embed_host=_env(src, "EMBED_HOST") or spark2_host, + embed_user=_env(src, "EMBED_USER") or spark2_user, + embed_container=_env(src, "EMBED_CONTAINER") or "spark-embed", + qdrant_host=_env(src, "QDRANT_HOST") or spark2_host, + qdrant_user=_env(src, "QDRANT_USER") or spark2_user, + qdrant_container=_env(src, "QDRANT_CONTAINER") or "qdrant", + qdrant_collection=_env(src, "QDRANT_COLLECTION", ""), # matrix-bridge bot container, driven as its own SSH user (the owner # of the ~/matrix-bridge git clone) so git/docker run unprivileged. - # The user is BLANK by default and set via the "Configure Sparks" - # action; leaving it blank reports the service as unconfigured, which - # hides the tile. That keeps the shared package portable — a - # deployment without the bot never shows a stray tile or a hardcoded - # username. Host defaults to Spark 2 (same box); container/dir/branch - # are sensible defaults. All are env-overridable. - matrix_bridge_host=_env("MATRIX_BRIDGE_HOST") or spark2_host, - matrix_bridge_user=_env("MATRIX_BRIDGE_USER"), - matrix_bridge_container=_env("MATRIX_BRIDGE_CONTAINER") or "matrix-bridge", - matrix_bridge_dir=_env("MATRIX_BRIDGE_DIR") or "~/matrix-bridge", - matrix_bridge_branch=_env("MATRIX_BRIDGE_BRANCH") or "master", + # The user is BLANK by default and set via the settings gear; leaving + # it blank reports the service as unconfigured, which hides the tile. + # That keeps the shared package portable — a deployment without the + # bot never shows a stray tile or a hardcoded username. Host defaults + # to Spark 2 (same box); container/dir/branch are sensible defaults. 
+ matrix_bridge_host=_env(src, "MATRIX_BRIDGE_HOST") or spark2_host, + matrix_bridge_user=_env(src, "MATRIX_BRIDGE_USER"), + matrix_bridge_container=_env(src, "MATRIX_BRIDGE_CONTAINER") or "matrix-bridge", + matrix_bridge_dir=_env(src, "MATRIX_BRIDGE_DIR") or "~/matrix-bridge", + matrix_bridge_branch=_env(src, "MATRIX_BRIDGE_BRANCH") or "master", # Redaction gateway pseudonym-map store (server-held de-anon key). - redaction_map_db=_env("REDACTION_MAP_DB", "/data/redaction_maps.db"), - redaction_map_ttl=_env_int("REDACTION_MAP_TTL", 7200), - ssh_key_path=_env("SSH_KEY_PATH"), - ssh_known_hosts=_env("SSH_KNOWN_HOSTS"), + redaction_map_db=_env(src, "REDACTION_MAP_DB", "/data/redaction_maps.db"), + redaction_map_ttl=_env_int(src, "REDACTION_MAP_TTL", 7200), + ssh_key_path=_env(src, "SSH_KEY_PATH"), + ssh_known_hosts=_env(src, "SSH_KNOWN_HOSTS"), models_yaml=_resolve_models_yaml(), - vllm_port=_env_int("VLLM_PORT", 8888), + vllm_port=_env_int(src, "VLLM_PORT", 8888), # Container name for the swappable vLLM on Spark 1. Defaults to the # bundled launch-cluster.sh container; override if you named yours # something else (the swap log-tail and pre-flight validator exec # into it by name). - vllm_container=_env_container("VLLM_CONTAINER", "vllm_node"), + vllm_container=_env_container(src, "VLLM_CONTAINER", "vllm_node"), # Built-in support-service keys (parakeet, kokoro, embeddings, # qdrant) the deployment doesn't run — hidden from the dashboard and # never probed. 
- disabled_services=_env_set("DISABLED_SERVICES"), - parakeet_port=_env_int("PARAKEET_PORT", 8000), - kokoro_port=_env_int("KOKORO_PORT", 8880), - embed_port=_env_int("EMBED_PORT", 8088), - qdrant_port=_env_int("QDRANT_PORT", 6333), - bind_port=_env_int("BIND_PORT", 9999), - open_webui_url=_env("OPEN_WEBUI_URL", ""), - ngc_api_key=_env("NGC_API_KEY", ""), + disabled_services=_env_set(src, "DISABLED_SERVICES"), + parakeet_port=_env_int(src, "PARAKEET_PORT", 8000), + kokoro_port=_env_int(src, "KOKORO_PORT", 8880), + embed_port=_env_int(src, "EMBED_PORT", 8088), + qdrant_port=_env_int(src, "QDRANT_PORT", 6333), + bind_port=_env_int(src, "BIND_PORT", 9999), + open_webui_url=_env(src, "OPEN_WEBUI_URL", ""), + ngc_api_key=_env(src, "NGC_API_KEY", ""), # Coordination layer: fire a swap-lifecycle webhook to this URL so # downstream consumers re-point their model config on a swap. Blank # ⇒ disabled. The optional secret HMAC-signs the body (X-Spark-Signature). - swap_webhook_url=_env("SWAP_WEBHOOK_URL", ""), - swap_webhook_secret=_env("SWAP_WEBHOOK_SECRET", ""), + swap_webhook_url=_env(src, "SWAP_WEBHOOK_URL", ""), + swap_webhook_secret=_env(src, "SWAP_WEBHOOK_SECRET", ""), ) + def reload(self) -> None: + """Recompute every field from the current env + settings overlay and + assign it onto this same instance, so all holders of the reference see + the change without an app restart. Called after the gear writes the + overlay (see server.post_settings).""" + fresh = Settings.from_env() + for f in fields(self): + setattr(self, f.name, getattr(fresh, f.name)) + @property def configured(self) -> bool: return bool(self.spark1_host) diff --git a/image/app/coordination.py b/image/app/coordination.py index 1d88ad6..3545464 100644 --- a/image/app/coordination.py +++ b/image/app/coordination.py @@ -239,6 +239,14 @@ class WebhookNotifier: self.secret = secret or "" self.timeout = timeout + def update(self, url: str, secret: str = "") -> None: + """Re-point after a live settings change. 
The notifier holds snapshot + copies of these two fields (not the Settings object), so Settings.reload() + can't reach it — server.post_settings calls this explicitly so editing the + webhook URL/secret in the dashboard gear takes effect without a restart.""" + self.url = (url or "").strip() + self.secret = secret or "" + @property def enabled(self) -> bool: return bool(self.url) diff --git a/image/app/server.py b/image/app/server.py index 6276c3f..5406dd1 100644 --- a/image/app/server.py +++ b/image/app/server.py @@ -1,6 +1,7 @@ from __future__ import annotations import asyncio import json +import os from pathlib import Path from fastapi import FastAPI, HTTPException, Query, Request @@ -9,6 +10,7 @@ from fastapi.staticfiles import StaticFiles from pydantic import BaseModel, ValidationError from typing import Literal +from . import app_settings from .config import Settings from .connectivity import get_mac, record_report, record_state, summary as connectivity_summary from .coordination import LockHeld, ScheduleRegistry, SwapLockManager, WebhookNotifier, valid_schedule_id @@ -37,6 +39,10 @@ from .validate import validate_launch from .wol import send_local_broadcast, send_via_peer +# One-time migration: seed the in-app settings overlay from env (values set via +# the StartOS action on a pre-gear install) before building Settings, so nothing +# is lost on upgrade. No-op once the overlay exists. See app_settings. +app_settings.seed_from_env(os.environ) settings = Settings.from_env() catalog = load_catalog(settings.models_yaml) # Coordination layer (GPU arbiter): swap-lifecycle webhook, the swap reservation @@ -156,6 +162,35 @@ async def get_config() -> dict: } +# ---- In-app settings ('gear') ---- +# The optional cluster knobs (ports, container names, support-service hosts, +# integrations) live in an app-owned overlay on /data, edited here instead of in +# the StartOS action — which keeps to just the four required setup fields. See +# app_settings. 
Writes apply live: we rewrite the overlay then reload the shared +# Settings instance in place, so every router/manager holding the reference picks +# up the change with no container restart. +@app.get("/api/settings") +async def get_settings() -> dict: + return app_settings.public_view() + + +class SettingsUpdate(BaseModel): + values: dict[str, str] + + +@app.post("/api/settings") +async def post_settings(req: SettingsUpdate) -> dict: + try: + app_settings.apply(req.values) + except app_settings.SettingsError as e: + raise HTTPException(422, str(e)) + settings.reload() + # WebhookNotifier snapshots url/secret (not the Settings object), so reload() + # can't reach it — re-point it explicitly so a webhook edit applies live too. + swap_webhook.update(settings.swap_webhook_url, settings.swap_webhook_secret) + return app_settings.public_view() + + def _reload_catalog() -> None: global catalog catalog = load_catalog(settings.models_yaml) @@ -947,6 +982,56 @@ async def post_swap(req: SwapRequest, request: Request) -> dict: return {"job_id": job.id, "model_key": job.model_key, "state": job.state} +# ---- Swap reservation lock (the GPU arbiter) ---- +# ROUTE ORDER IS LOAD-BEARING: these static `/api/swap/lock` routes MUST be +# registered before the parametric `/api/swap/{job_id}` below. FastAPI matches in +# registration order, so if `{job_id}` came first, GET /api/swap/lock would bind +# job_id="lock", look up a (non-existent) swap job, and 404 — which is exactly +# the bug this ordering fixes. Keep these above the {job_id} routes. +# CSRF: these are control-surface, not browser-exempt — an external scheduler is +# a non-browser client (no Origin header) so it passes the guard already, the +# same way it calls /api/swap; the dashboard is same-origin. 
+class LockAcquireRequest(BaseModel): + holder: str + ttl_seconds: int | None = None + note: str = "" + token: str | None = None # present only to extend an existing hold + + +@app.post("/api/swap/lock") +async def acquire_swap_lock(req: LockAcquireRequest) -> dict: + """Reserve the GPU swap path. Returns a secret token used to swap (header + X-Swap-Lock-Token) and to release. 409 if held by another holder.""" + try: + lock = swap_lock.acquire(req.holder, req.ttl_seconds, req.note, token=req.token) + except ValueError as e: + raise HTTPException(422, str(e)) + except LockHeld as e: + raise HTTPException(status_code=409, detail={ + "error": "swap lock is held by another holder", + "lock": e.state, + }) + return {**swap_lock.status(), "token": lock.token} + + +@app.get("/api/swap/lock") +async def get_swap_lock() -> dict: + """Public, token-free view of the reservation: held? who? until when?""" + return swap_lock.status() + + +@app.delete("/api/swap/lock") +async def release_swap_lock(request: Request, force: bool = Query(False)) -> dict: + """Release the reservation. Needs the matching X-Swap-Lock-Token unless + ?force=true (the human override from the dashboard).""" + token = request.headers.get("x-swap-lock-token") or request.query_params.get("token") + try: + released = swap_lock.release(token, force=force) + except PermissionError as e: + raise HTTPException(403, str(e)) + return {"released": released, **swap_lock.status()} + + @app.get("/api/swap/{job_id}") async def get_swap(job_id: str) -> dict: job = swap_manager.get(job_id) @@ -992,52 +1077,10 @@ async def stream_swap(job_id: str): return StreamingResponse(gen(), media_type="text/event-stream") -# ---- Coordination layer: swap lock + schedule registry ---- -# Endpoints are control-surface, not browser-exempt: an external scheduler is a -# non-browser client (no Origin header) so it passes the CSRF guard already, the -# same way it calls /api/swap today; the dashboard is same-origin. 
- -class LockAcquireRequest(BaseModel): - holder: str - ttl_seconds: int | None = None - note: str = "" - token: str | None = None # present only to extend an existing hold - - -@app.post("/api/swap/lock") -async def acquire_swap_lock(req: LockAcquireRequest) -> dict: - """Reserve the GPU swap path. Returns a secret token used to swap (header - X-Swap-Lock-Token) and to release. 409 if held by another holder.""" - try: - lock = swap_lock.acquire(req.holder, req.ttl_seconds, req.note, token=req.token) - except ValueError as e: - raise HTTPException(422, str(e)) - except LockHeld as e: - raise HTTPException(status_code=409, detail={ - "error": "swap lock is held by another holder", - "lock": e.state, - }) - return {**swap_lock.status(), "token": lock.token} - - -@app.get("/api/swap/lock") -async def get_swap_lock() -> dict: - """Public, token-free view of the reservation: held? who? until when?""" - return swap_lock.status() - - -@app.delete("/api/swap/lock") -async def release_swap_lock(request: Request, force: bool = Query(False)) -> dict: - """Release the reservation. Needs the matching X-Swap-Lock-Token unless - ?force=true (the human override from the dashboard).""" - token = request.headers.get("x-swap-lock-token") or request.query_params.get("token") - try: - released = swap_lock.release(token, force=force) - except PermissionError as e: - raise HTTPException(403, str(e)) - return {"released": released, **swap_lock.status()} - - +# ---- Coordination layer: read-only schedule registry ---- +# (The swap reservation lock lives above, next to the swap routes.) Same CSRF +# posture: control-surface, not browser-exempt — external schedulers send no +# Origin header so they pass the guard; the dashboard is same-origin. 
class ScheduleRequest(BaseModel): name: str id: str | None = None diff --git a/image/app/static/app.js b/image/app/static/app.js index 1c1c2fc..0208118 100644 --- a/image/app/static/app.js +++ b/image/app/static/app.js @@ -2192,8 +2192,104 @@ function handleUpdateDone(d) { setTimeout(pollUpdates, 2000); } +// ===================== settings ('gear') ===================== +// Renders the optional cluster knobs from /api/settings (server-driven field +// list, so adding a knob server-side needs no JS change) and POSTs edits back. +// The server reloads its config in place, so changes take effect immediately. + +let settingsClearSentinel = '__clear__'; + +function renderSettingsForm(data) { + settingsClearSentinel = data.clear_sentinel || settingsClearSentinel; + const body = el('#settings-body'); + body.innerHTML = (data.groups || []).map((g) => { + const rows = g.fields.map((f) => { + const help = f.help ? `${escapeHtml(f.help)}` : ''; + let input; + let clearToggle = ''; + if (f.type === 'secret') { + const ph = f.set ? 'set — leave blank to keep' : (f.placeholder || ''); + input = ``; + // A stored secret is never echoed back, so blank means "keep". Offer an + // explicit way to remove it. + if (f.set) clearToggle = ``; + } else if (f.type === 'int') { + input = ``; + } else { + input = ``; + } + return `
${clearToggle}${help}
`; + }).join(''); + return ``; + }).join(''); +} + +async function openSettingsDialog() { + const dlg = el('#settings-dialog'); + const err = el('#settings-error'); + err.classList.add('hidden'); + el('#settings-body').innerHTML = '

Loading…

'; + dlg.showModal(); + try { + renderSettingsForm(await fetchJSON('/api/settings')); + } catch (e) { + el('#settings-body').innerHTML = ''; + err.textContent = 'Could not load settings: ' + e.message; + err.classList.remove('hidden'); + } +} + +async function saveSettings(e) { + e.preventDefault(); + const err = el('#settings-error'); + err.classList.add('hidden'); + const values = {}; + $$('#settings-body [data-key]').forEach((inp) => { + const key = inp.dataset.key; + const v = inp.value.trim(); + if (inp.dataset.secret) { + // "clear" checkbox wins; else a typed value sets it; else omit (keep the + // stored one — we can't see it to retype it). + const clear = el(`[data-clear-for="${key}"]`); + if (clear && clear.checked) values[key] = settingsClearSentinel; + else if (v) values[key] = v; + } else { + values[key] = v; // blank non-secret ⇒ server reverts it to the default + } + }); + const btn = el('#settings-save'); + btn.disabled = true; + try { + await fetchJSON('/api/settings', { + method: 'POST', + headers: { 'content-type': 'application/json' }, + body: JSON.stringify({ values }), + }); + el('#settings-dialog').close(); + // Re-pull everything a knob can move: the Open WebUI link, health probes, + // service tiles, and the model menu (host/port changes alter all of them). 
+ try { + state.config = await fetchJSON('/api/config'); + const a = el('#open-webui-link'); + if (state.config.open_webui_url) { a.href = state.config.open_webui_url; a.classList.remove('hidden'); } + else { a.classList.add('hidden'); } + } catch (e3) { console.warn('post-save /api/config refresh failed:', e3); } + pollStatus(); + renderServices(); + loadModels(); + } catch (e2) { + err.textContent = 'Save failed: ' + e2.message.replace(/^\d+ [^:]*:\s*/, ''); + err.classList.remove('hidden'); + } finally { + btn.disabled = false; + } +} + async function init() { setupCopyButtons(); + el('#open-settings').addEventListener('click', openSettingsDialog); + el('#settings-cancel').addEventListener('click', () => el('#settings-dialog').close()); + el('#settings-form').addEventListener('submit', saveSettings); el('#open-download').addEventListener('click', openDownloadForm); el('#dl-cancel').addEventListener('click', closeDownloadPanel); el('#dl-start').addEventListener('click', startDownload); diff --git a/image/app/static/index.html b/image/app/static/index.html index 8cd362c..bfae8a9 100644 --- a/image/app/static/index.html +++ b/image/app/static/index.html @@ -17,14 +17,28 @@ connecting… +
+ + + +