spark-control

Author	SHA1	Message	Date
Keysat	95524f4983	v0.13.0:0 - revert WhisperX migration; back to Parakeet + Sortformer After five hotfix iterations on the WhisperX install (v0.12.0:0–:4) we never got a working docker build. The fundamental constraint isn't patchable from outside NVIDIA: NGC PyTorch on ARM64 (the only base that runs on Spark 2's GB10 Blackwell) ships a custom-versioned torch 2.10.0a0+b558c98 that has no pre-built torchaudio match anywhere. WhisperX → pyannote → torchaudio is a hard dependency chain we couldn't satisfy without rebuilding torchaudio against torch 2.10's alpha API. Walking away cleanly is better than another night of chasing. Removed from the codebase: - image/whisperx_container/* (Dockerfile + requirements + app/main.py) - image/app/whisperx_install.py (install manager + SSH ship-context logic) - image/Dockerfile COPY whisperx_container - WHISPERX_* config keys in config.py - whisperx service entry in services.py - WhisperX-preferred branch in audio_proxy.py - /api/whisperx/* endpoints in server.py - install banner + progress dialog in index.html - render + handlers in app.js - .whisperx-install styles in style.css Spark 2 cleaned in tandem (user-authorized): container removed, ~/whisperx-build/ removed, 5.4 GB of dangling image layers + 1.3 GB of builder cache reclaimed. parakeet-asr and magpie-tts unaffected and healthy throughout. The audio path is back to exactly what shipped in v0.11.0:3: POST /api/audio/transcribe-with-speakers → Parakeet (transcription) + Sortformer (diarization) in parallel → merged by timestamp into speaker-labeled blocks v0.13.0:1+ will add the actually-needed fixes that the WhisperX detour was meant to address: 1. memory cap on the parakeet-asr container so a long-audio crash can't swap-thrash Spark 2 again 2. a chunking proxy in /api/audio/transcribe-with-speakers that splits inputs >10 min before Sortformer Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 08:03:19 -05:00
Keysat	5a0bfba6a3	v0.12.0:0 - WhisperX as a one-click dashboard install + managed service Replaces the manual rsync+build+run with a proper spark-control feature. First in the audio path that doesn't require shell access on Spark 2. What's in the box ───────────────── * image/whisperx_container/ - the build context (Dockerfile, requirements, app/main.py FastAPI wrapper). Mainline pipeline: faster-whisper for STT + pyannote 3.1 for diarization + wav2vec2 forced alignment. Single endpoint /v1/audio/transcribe-with-speakers returns the exact same shape spark-control's existing endpoint does, so the recap-relay PR spec needs no changes when we cut over. * image/app/whisperx_install.py - install manager. Ships build context to Spark 2 over SSH, runs `docker build`, runs `docker run` with a 40 GB memory cap (vs Sortformer's unbounded which thrashed Spark 2 on a 90-min file), polls /health until both Whisper + pyannote report loaded. * Audio proxy: /api/audio/transcribe-with-speakers now prefers WhisperX when its /health reports diarizer_loaded=true, falls back to the legacy Parakeet + Sortformer path otherwise. Same response shape either way. Clean cutover, easy rollback (`docker rm whisperx-asr`). * Dashboard (Audio / Speech tab): - "Add WhisperX" banner appears when not installed, with a primary "Install WhisperX" button. One click triggers the install. - Build progress dialog with phase + elapsed timer + live build log via SSE (`/api/whisperx/install/{job_id}/stream`). - After install, WhisperX auto-registers as a managed service alongside Parakeet and Magpie (Start/Restart/Stop, deep-check, auto-restart). - Banner self-hides once /api/whisperx/status reports healthy. New endpoints ───────────── GET /api/whisperx/status POST /api/whisperx/install GET /api/whisperx/install/{job_id} GET /api/whisperx/install/{job_id}/stream (SSE phase + log) Config additions (env) ────────────────────── WHISPERX_HOST (defaults to spark2_host) WHISPERX_USER (defaults to spark2_user) WHISPERX_CONTAINER (default: whisperx-asr) WHISPERX_PORT (default: 8002) WHISPERX_MODEL (default: medium; tiny/base/small/medium/large-v3) Dockerfile ────────── Added COPY whisperx_container /app/whisperx_container so the runtime install manager can read the build context from inside the spark-control image and ship it over SSH. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 21:02:26 -05:00
Keysat	391117f705	v0.11.0:0 - Speech model patches panel (lifecycle for v0.10.0 overlays) Folds the image/parakeet_patches/apply.sh script into a one-click dashboard action and adds drift detection so you can see at a glance whether the parakeet-asr container has the latest Sortformer overlays that spark-control ships. Backend: * image/app/speech_models.py - SpeechModelsManager: reads /health from Parakeet, sha256s the local overlay files inside spark-control's Docker image (/app/parakeet_patches), sha256s the same files inside the parakeet-asr container via `docker exec ... sha256sum`, surfaces in_sync / drift / missing status per file. * GET /api/speech-models - status payload * POST /api/speech-models/reapply - copies overlays into container, verifies python syntax, restarts, polls /health for ~120s, returns step-by-step result * POST /api/speech-models/restart - plain `docker restart parakeet-asr` Dockerfile: now COPY parakeet_patches into the image at /app/parakeet_patches so the runtime can read them. Future spark-control releases auto-carry newer overlay versions; the panel surfaces drift after upgrade. Frontend: new "Speech model patches" section on the dashboard with * Status pill (in sync / drift / missing) * Per-file SHA comparison (local vs container) * Loaded-models pills (ASR + diarizer) * Reapply + Restart buttons (both with confirmation modals) * Live progress display during reapply with per-step ✓/✗ Verified post-install against the running cluster: GET /api/speech-models shows both files in_sync (SHAs match) and both models loaded ready on Spark 2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 15:58:13 -05:00
Grant	72bf754baa	Pack spark-control_x86_64.s9pk (55 MB) - Move models.yaml into image/ so the docker build context is self-contained - Fix manifest: dockerfile=../image/Dockerfile, workdir=../image - Add LICENSE (MIT) and assets/README.md (StartOS marketplace listing) - s9pk validates: id=spark-control, version=0.1.0:0, osVersion=0.4.0-beta.6, sdkVersion=1.3.3 - Image embeds python:3.12-slim + openssh-client + FastAPI app + models.yaml	2026-05-12 09:52:53 -05:00
Grant	ae8efa1754	Initial scaffold: image/ FastAPI app, models.yaml, docs - image/ FastAPI app: /api/status, /api/swap, /api/swap/{id}/stream, /api/test-connection - models.yaml: 5-model catalog (qwen3-vl, gemma4, qwen36, qwen3-235b-fp8, qwen25-72b) - README, runbook, known-issues - Dry-run swap verified against live Spark 1 (gemma4 currently loaded)	2026-05-12 09:29:13 -05:00
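
The v0.13.0:0 message describes the restored audio path as Parakeet transcription and Sortformer diarization "merged by timestamp into speaker-labeled blocks". The log does not show how the merge is implemented, so the following is only a minimal sketch of one common approach: give each transcript segment the speaker whose turn overlaps it most, then join consecutive segments from the same speaker. The `Segment` and `Turn` types, the `label_segments` function, and the output dictionary keys are all hypothetical names chosen for illustration, not taken from the repository.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """Hypothetical transcription segment (times in seconds)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """Hypothetical diarization turn (times in seconds)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two time intervals, or 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments: List[Segment], turns: List[Turn],
                   unknown: str = "unknown") -> List[Dict]:
    """Assign each segment the speaker with the largest time overlap,
    then merge adjacent segments that share a speaker into one block."""
    blocks: List[Dict] = []
    for seg in sorted(segments, key=lambda s: s.start):
        speaker, best = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best:
                speaker, best = turn.speaker, shared
        if blocks and blocks[-1]["speaker"] == speaker:
            # Same speaker as the previous block: extend it.
            blocks[-1]["text"] += " " + seg.text
            blocks[-1]["end"] = seg.end
        else:
            blocks.append({"speaker": speaker, "start": seg.start,
                           "end": seg.end, "text": seg.text})
    return blocks
```

Segments that overlap no diarization turn fall back to the `unknown` label in this sketch; whether the real endpoint drops, relabels, or keeps such segments is not stated in the log.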

5 Commits
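
The v0.11.0:0 message describes drift detection that compares SHA-256 hashes of the overlay files shipped in the spark-control image against the same files inside the parakeet-asr container, and reports in_sync / drift / missing per file. A minimal sketch of that comparison step is shown here, assuming the container-side hashes have already been collected elsewhere (the message says this is done via `docker exec ... sha256sum`). The function names and the shape of the `container_hashes` mapping are hypothetical and are not taken from `speech_models.py`.

```python
import hashlib
from pathlib import Path
from typing import Dict, Optional


def sha256_file(path: Path) -> str:
    """Hex SHA-256 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_overlays(local_dir: Path,
                      container_hashes: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Compare each local overlay file with the hash reported for the
    same file name in the container.

    container_hashes maps file name -> hex digest, or None / absent when
    the file does not exist in the container (hypothetical input shape).
    """
    status: Dict[str, str] = {}
    for path in sorted(local_dir.iterdir()):
        if not path.is_file():
            continue
        remote = container_hashes.get(path.name)
        if remote is None:
            status[path.name] = "missing"
        elif remote == sha256_file(path):
            status[path.name] = "in_sync"
        else:
            status[path.name] = "drift"
    return status
```

An overall status pill could then be derived from the per-file results, for example by reporting drift if any file is not in sync; the log does not say how the dashboard aggregates them.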