spark-control/image/app/static/index.html
Keysat 391117f705 v0.11.0:0 - Speech model patches panel (lifecycle for v0.10.0 overlays)
Folds the image/parakeet_patches/apply.sh script into a one-click
dashboard action and adds drift detection so you can see at a glance
whether the parakeet-asr container has the latest Sortformer overlays
that spark-control ships.

Backend:
  * image/app/speech_models.py - SpeechModelsManager: reads /health from
    Parakeet, sha256s the local overlay files inside spark-control's
    Docker image (/app/parakeet_patches), sha256s the same files inside
    the parakeet-asr container via `docker exec ... sha256sum`, surfaces
    in_sync / drift / missing status per file.
  * GET  /api/speech-models           - status payload
  * POST /api/speech-models/reapply   - copies overlays into container,
                                         verifies Python syntax, restarts,
                                         polls /health for ~120s, returns
                                         step-by-step result
  * POST /api/speech-models/restart   - plain `docker restart parakeet-asr`
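
The drift check and the post-restart health poll described above could be sketched roughly as follows. This is a minimal illustration, not the actual `SpeechModelsManager` code: the function names (`sha256_file`, `classify`, `overlay_status`, `wait_until_healthy`), the 2-second poll interval, and the choice to report `missing` when either side lacks a digest are all assumptions. The container-side digests would come from `docker exec ... sha256sum`; here they are simply passed in as a dict.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Dict, Optional


def sha256_file(path: Path) -> Optional[str]:
    """Return the hex SHA-256 of a local file, or None if it does not exist."""
    if not path.is_file():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def classify(local_sha: Optional[str], container_sha: Optional[str]) -> str:
    """Map a pair of digests to in_sync / drift / missing.

    Assumption: a file absent on either side is reported as "missing".
    """
    if local_sha is None or container_sha is None:
        return "missing"
    return "in_sync" if local_sha == container_sha else "drift"


def overlay_status(
    local: Dict[str, Optional[str]],
    container: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Per-file status for every overlay that spark-control ships."""
    return {name: classify(sha, container.get(name)) for name, sha in local.items()}


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it returns True or timeout_s elapses.

    probe is a hypothetical callable that would wrap a GET to Parakeet's
    /health endpoint and return True once both models report ready.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Passing the clock and sleep functions as parameters keeps the poll loop testable without real waiting; a caller would normally rely on the defaults.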

Dockerfile: now COPYs parakeet_patches into the image at /app/parakeet_patches
so the runtime can read them. Future spark-control releases auto-carry
newer overlay versions; the panel surfaces drift after upgrade.

Frontend: new "Speech model patches" section on the dashboard with
  * Status pill (in sync / drift / missing)
  * Per-file SHA comparison (local vs container)
  * Loaded-models pills (ASR + diarizer)
  * Reapply + Restart buttons (both with confirmation modals)
  * Live progress display during reapply with per-step ✓/✗

Verified post-install against the running cluster:
  GET /api/speech-models shows both files in_sync (SHAs match) and both
  models loaded and ready on Spark 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:58:13 -05:00

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover">
<meta name="color-scheme" content="dark">
<title>spark-control</title>
<link rel="stylesheet" href="/static/style.css">
</head>
<body>
<header class="topbar">
<div class="brand">
<span class="logo-dot"></span>
<span>spark-control</span>
</div>
<div class="current" id="current">
<span class="muted">connecting…</span>
</div>
<a id="open-webui-link" class="topbar-btn hidden" href="#" target="_blank" rel="noopener" title="Open Open WebUI">Open chat ↗</a>
</header>
<main>
<section id="setup-banner" class="banner hidden">
<strong>Configuration needed.</strong>
<span>Run the <em>Configure Sparks</em> action in StartOS to set hostnames, then run <em>Test Connection</em>.</span>
</section>
<section id="hardware-panel" class="hardware-panel hidden">
<div class="section-header">
<h2 class="section-title">Spark hardware</h2>
<button id="open-connectivity" class="btn small-btn">Connectivity log</button>
</div>
<div id="hardware-grid" class="hardware-grid"></div>
<dialog id="connectivity-dialog" class="modal">
<form method="dialog" class="modal-form">
<h3>Spark connectivity history</h3>
<p class="muted small">Most recent up/down transitions per Spark. Tracked since this dashboard was installed.</p>
<div id="connectivity-content" class="connectivity-content"></div>
<div class="modal-actions">
<button type="button" id="connectivity-close" class="btn">Close</button>
</div>
</form>
</dialog>
</section>
<section id="endpoint-panel" class="endpoint-panel hidden">
<div class="ep-title muted small">OpenAI-compatible endpoint</div>
<div class="ep-row">
<span class="ep-label">Base URL</span>
<code class="ep-value copyable" id="ep-url" data-copy-self title="Click to copy"></code>
<button class="icon-btn" data-copy="#ep-url" title="Copy base URL" aria-label="Copy">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="9" y="9" width="13" height="13" rx="2"/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg>
</button>
</div>
<div class="ep-row">
<span class="ep-label">Model ID</span>
<code class="ep-value copyable" id="ep-model" data-copy-self title="Click to copy"></code>
<button class="icon-btn" data-copy="#ep-model" title="Copy model ID" aria-label="Copy">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="9" y="9" width="13" height="13" rx="2"/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg>
</button>
</div>
<details class="ep-curl">
<summary class="muted small">curl example</summary>
<pre id="ep-curl-snippet" class="snippet copyable" data-copy-self title="Click to copy"></pre>
<button class="icon-btn" data-copy="#ep-curl-snippet" title="Copy snippet" aria-label="Copy">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="9" y="9" width="13" height="13" rx="2"/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg>
</button>
</details>
</section>
<section id="swap-panel" class="swap-panel hidden">
<div class="swap-header">
<span class="spinner"></span>
<span id="swap-title">Swap in progress</span>
<span class="spacer"></span>
<span class="timer" id="swap-elapsed">0:00</span>
</div>
<div class="phase-row">
<div class="phase" id="swap-phase">Starting…</div>
<div class="phase-detail muted small" id="swap-phase-detail"></div>
</div>
<div class="phase-track">
<div class="phase-fill" id="swap-phase-fill"></div>
</div>
<details id="swap-log-details">
<summary class="muted small">Show technical logs</summary>
<pre id="swap-log" class="log"></pre>
</details>
</section>
<section id="services-panel" class="services hidden">
<div class="section-header">
<h2 class="section-title">Always-on services</h2>
<button id="open-nim" class="btn small-btn">+ Install NIM</button>
</div>
<div id="services-grid" class="services-grid"></div>
<dialog id="nim-dialog" class="modal">
<form method="dialog" class="modal-form" id="nim-form">
<h3>Install an NVIDIA NIM container</h3>
<p class="muted small" id="nim-key-warn"></p>
<p class="muted small">Pick a curated container below or paste any image from <a href="#" id="nim-catalog-link" target="_blank" rel="noopener">the NGC NIM catalog</a>. Spark Control will <code>docker pull</code> and <code>docker run</code> it on the target Spark.</p>
<div id="nim-suggested" class="nim-grid"></div>
<fieldset class="modal-fieldset">
<legend>Custom image</legend>
<label class="modal-row"><span>Image (nvcr.io/...)</span><input type="text" id="nim-image" placeholder="nvcr.io/nim/nvidia/<name>:latest"></label>
<label class="modal-row"><span>Container name</span><input type="text" id="nim-container" placeholder="my-service"></label>
<label class="modal-row"><span>Port</span><input type="number" id="nim-port" min="1" max="65535"></label>
<label class="modal-row"><span>Kind</span>
<select id="nim-kind">
<option value="nim">NIM (other)</option>
<option value="stt">STT (speech-to-text)</option>
<option value="tts">TTS (text-to-speech)</option>
<option value="vision">Vision</option>
<option value="embedding">Embedding</option>
</select>
</label>
<label class="modal-row"><span>Target Spark</span>
<select id="nim-host">
<option value="spark2">Spark 2 (default for support services)</option>
<option value="spark1">Spark 1 (head node)</option>
</select>
</label>
</fieldset>
<div class="modal-actions">
<button type="button" id="nim-cancel" class="btn">Cancel</button>
<button type="submit" class="btn primary" id="nim-start">Install</button>
</div>
</form>
</dialog>
<dialog id="nim-progress-dialog" class="modal">
<form method="dialog" class="modal-form">
<h3 id="nim-prog-title">Installing…</h3>
<div class="phase-row">
<div class="phase" id="nim-prog-phase">Starting…</div>
<span class="spacer"></span>
<span class="timer" id="nim-prog-elapsed">0:00</span>
</div>
<details open>
<summary class="muted small">Log</summary>
<pre id="nim-prog-log" class="log"></pre>
</details>
<div class="modal-actions">
<button type="button" id="nim-prog-close" class="btn">Close</button>
</div>
</form>
</dialog>
</section>
<section id="speech-models-panel" class="speech-models hidden">
<div class="section-header">
<h2 class="section-title">Speech model patches</h2>
</div>
<p class="muted small sm-blurb">
Spark Control adds Sortformer speaker diarization to the third-party Parakeet ASR
container via two Python overlays (<code>diarizer.py</code> + a patched <code>main.py</code>).
Overlays survive container restart but not a fresh redeploy — if the parakeet container is
ever rebuilt, click <strong>Reapply patches</strong> below to restore them.
</p>
<div id="speech-models-card" class="speech-models-card"></div>
<dialog id="speech-models-progress-dialog" class="modal">
<form method="dialog" class="modal-form">
<h3>Reapplying speech-model patches…</h3>
<p class="muted small">Copying overlays into the parakeet container, verifying syntax, restarting, waiting for both models to load. Takes ~60-120 s.</p>
<div id="sm-prog-steps" class="sm-prog-steps"></div>
<div class="modal-actions">
<button type="button" id="sm-prog-close" class="btn" disabled>Close</button>
</div>
</form>
</dialog>
</section>
<section id="models-section">
<div class="section-header">
<h2 class="section-title">LLM swap</h2>
<button id="open-download" class="btn small-btn">+ Download a new model</button>
</div>
<dialog id="catalog-dialog" class="modal">
<form method="dialog" class="modal-form" id="catalog-form">
<h3>Add downloaded model to catalog</h3>
<p class="muted small">It will appear as a new card you can swap to. Knob values become its default launch flags — you can tweak later via the model's "Advanced" panel.</p>
<label class="modal-row"><span>Key (URL-safe id)</span><input type="text" id="cd-key" required pattern="[a-zA-Z0-9_-]+"></label>
<label class="modal-row"><span>Display name</span><input type="text" id="cd-name" required></label>
<label class="modal-row"><span>Repo (read-only)</span><input type="text" id="cd-repo" readonly></label>
<label class="modal-row"><span>Size (GB)</span><input type="number" id="cd-size" step="0.1" min="0"></label>
<label class="modal-row"><span>Mode</span>
<select id="cd-mode">
<option value="solo">solo (Spark 1 only)</option>
<option value="cluster">cluster (both Sparks via Ray)</option>
</select>
</label>
<label class="modal-row"><span>Description (optional)</span><textarea id="cd-desc" rows="3"></textarea></label>
<fieldset class="modal-fieldset">
<legend>Default launch knobs</legend>
<label class="modal-row"><span>Max context (tokens)</span><input type="number" id="cd-mml" step="1024" min="1024" value="32768"></label>
<label class="modal-row"><span>GPU memory %</span><input type="range" id="cd-gmu" min="0.5" max="0.95" step="0.01" value="0.85"> <output id="cd-gmu-out">0.85</output></label>
<label class="modal-row inline"><input type="checkbox" id="cd-fst" checked> Fast safetensors loading</label>
<label class="modal-row inline"><input type="checkbox" id="cd-pcache" checked> Prefix caching</label>
<label class="modal-row inline"><input type="checkbox" id="cd-fp8" checked> FP8 KV cache</label>
</fieldset>
<div class="modal-actions">
<button type="button" id="cd-cancel" class="btn">Cancel</button>
<button type="submit" class="btn primary">Add to catalog</button>
</div>
</form>
</dialog>
<dialog id="disk-delete-dialog" class="modal">
<form method="dialog" class="modal-form">
<h3>Delete model weights from disk?</h3>
<p id="dd-summary" class="muted small"></p>
<ul class="muted small dd-hosts" id="dd-hosts"></ul>
<p class="muted small">This is reversible — you can re-download from the catalog at any time. The catalog entry stays intact.</p>
<p id="dd-error" class="muted small dd-error hidden"></p>
<div class="modal-actions">
<button type="button" id="dd-cancel" class="btn">Cancel</button>
<button type="button" id="dd-confirm" class="btn danger">Delete from disk</button>
</div>
</form>
</dialog>
<dialog id="advanced-dialog" class="modal">
<form method="dialog" class="modal-form" id="advanced-form">
<h3 id="adv-title">Advanced settings</h3>
<p class="muted small">Custom values are stored in the package volume and survive package updates. Empty fields fall back to defaults.</p>
<label class="modal-row"><span>Max context (tokens)</span><input type="number" id="adv-mml" step="1024" min="1024"></label>
<label class="modal-row"><span>GPU memory %</span><input type="range" id="adv-gmu" min="0.5" max="0.95" step="0.01"> <output id="adv-gmu-out"></output></label>
<label class="modal-row inline"><input type="checkbox" id="adv-fst"> Fast safetensors loading <span class="muted small">(faster cold start)</span></label>
<label class="modal-row inline"><input type="checkbox" id="adv-pcache"> Prefix caching <span class="muted small">(speeds up repeated prefixes)</span></label>
<label class="modal-row inline"><input type="checkbox" id="adv-fp8"> FP8 KV cache <span class="muted small">(halves context memory)</span></label>
<div class="modal-actions">
<button type="button" id="adv-delete" class="btn danger hidden">Delete model</button>
<span class="spacer"></span>
<button type="button" id="adv-cancel" class="btn">Cancel</button>
<button type="submit" class="btn primary">Save</button>
</div>
</form>
</dialog>
<section id="download-panel" class="download-panel hidden">
<div class="download-form" id="download-form">
<label class="dl-row">
<span class="dl-label">HuggingFace repo</span>
<input type="text" id="dl-repo" placeholder="e.g. RedHatAI/Qwen3.6-35B-A3B-NVFP4" autocomplete="off">
<a id="dl-hf-link" class="dl-hf-link hidden" href="#" target="_blank" rel="noopener" title="Open on Hugging Face"></a>
</label>
<div class="dl-help muted small">
<a href="https://huggingface.co/models?other=vllm" target="_blank" rel="noopener">Browse vLLM-compatible models</a>
· NVFP4-quantized models (e.g. <code>RedHatAI/...</code>) are best for Blackwell hardware
</div>
<div class="dl-row">
<span class="dl-label">Where</span>
<label class="radio"><input type="radio" name="dl-mode" value="spark1" checked> Spark 1 only</label>
<label class="radio"><input type="radio" name="dl-mode" value="spark2"> Spark 2 only</label>
<label class="radio"><input type="radio" name="dl-mode" value="cluster"> Both Sparks (for cluster models)</label>
</div>
<div class="dl-help muted small">
For <strong>solo</strong> models, download to wherever you'll run them. For <strong>cluster</strong> models (-tp 2), both Sparks need the weights — "Both" downloads to one Spark and rsyncs to the other in parallel.
</div>
<div class="dl-actions">
<button id="dl-cancel" class="btn">Cancel</button>
<button id="dl-start" class="btn primary">Start download</button>
</div>
</div>
<div class="download-progress hidden" id="download-progress">
<div class="dl-header">
<span class="spinner"></span>
<span id="dl-title">Downloading…</span>
<span class="spacer"></span>
<span class="timer" id="dl-elapsed">0:00</span>
</div>
<div class="phase-row">
<div class="phase" id="dl-phase">Connecting…</div>
<div class="phase-detail muted small" id="dl-phase-detail"></div>
</div>
<div class="phase-track">
<div class="phase-fill" id="dl-progress-fill"></div>
</div>
<div class="dl-stats muted small" id="dl-stats"></div>
<details id="dl-log-details">
<summary class="muted small">Show technical logs</summary>
<pre id="dl-log" class="log"></pre>
</details>
</div>
</section>
<section id="cards" class="cards"></section>
</section>
<section id="update-banner" class="update-banner hidden">
<div class="ub-context muted small">
Updates to <strong><a href="https://github.com/eugr/spark-vllm-docker" target="_blank" rel="noopener">eugr/spark-vllm-docker</a></strong>
— the upstream project that orchestrates vLLM on your Sparks (launch-cluster.sh, recipes, mods). These are <em>not</em> firmware, OS, or model updates.
</div>
<div class="ub-row">
<span id="ub-text">Checking for updates…</span>
<span class="spacer"></span>
<button id="ub-explain" class="btn small-btn hidden">✨ Explain context</button>
<button id="ub-details" class="btn small-btn hidden">Show details</button>
<button id="ub-apply" class="btn small-btn primary hidden">Apply update</button>
</div>
<details id="ub-list" class="hidden">
<summary class="muted small">Pending commits</summary>
<pre id="ub-log" class="snippet"></pre>
</details>
<details id="ub-explain-section" class="hidden">
<summary class="muted small">Explained by the loaded LLM</summary>
<div id="ub-explain-content" class="explain-content"></div>
</details>
<div id="ub-progress" class="hidden">
<div class="phase-row">
<div class="phase" id="ub-phase">Applying update…</div>
<span class="spacer"></span>
<span class="timer" id="ub-elapsed">0:00</span>
</div>
<details>
<summary class="muted small">Show technical logs</summary>
<pre id="ub-stream" class="log"></pre>
</details>
</div>
</section>
<footer class="footer">
<div class="health">
<span class="health-item" id="h-vllm"><span class="dot"></span> vLLM</span>
<span class="health-item" id="h-parakeet"><span class="dot"></span> Parakeet</span>
<span class="health-item" id="h-magpie"><span class="dot"></span> Magpie</span>
</div>
<div class="muted small" id="updated"></div>
</footer>
</main>
<script src="/static/app.js"></script>
</body>
</html>