v0.13.0:0 - revert WhisperX migration; back to Parakeet + Sortformer

After five hotfix iterations on the WhisperX install (v0.12.0:0–:4) we never got a working docker build. The fundamental constraint isn't patchable from outside NVIDIA: NGC PyTorch on ARM64 (the only base that runs on Spark 2's GB10 Blackwell) ships a custom-versioned torch 2.10.0a0+b558c98 that has no pre-built torchaudio match anywhere. WhisperX → pyannote → torchaudio is a hard dependency chain we couldn't satisfy without rebuilding torchaudio against torch 2.10's alpha API. Walking away cleanly is better than another night of chasing.

Removed from the codebase:
- image/whisperx_container/* (Dockerfile + requirements + app/main.py)
- image/app/whisperx_install.py (install manager + SSH ship-context logic)
- image/Dockerfile COPY whisperx_container
- WHISPERX_* config keys in config.py
- whisperx service entry in services.py
- WhisperX-preferred branch in audio_proxy.py
- /api/whisperx/* endpoints in server.py
- install banner + progress dialog in index.html
- render + handlers in app.js
- .whisperx-install styles in style.css

Spark 2 cleaned in tandem (user-authorized): container removed, ~/whisperx-build/ removed, 5.4 GB of dangling image layers + 1.3 GB of builder cache reclaimed. parakeet-asr and magpie-tts unaffected and healthy throughout.

The audio path is back to exactly what shipped in v0.11.0:3:

  POST /api/audio/transcribe-with-speakers
    → Parakeet (transcription) + Sortformer (diarization) in parallel
    → merged by timestamp into speaker-labeled blocks

v0.13.0:1+ will add the actually-needed fixes that the WhisperX detour was meant to address:
1. memory cap on the parakeet-asr container so a long-audio crash can't swap-thrash Spark 2 again
2. a chunking proxy in /api/audio/transcribe-with-speakers that splits inputs >10 min before Sortformer

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
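
The "merged by timestamp into speaker-labeled blocks" step in the message above can be pictured roughly as follows. This is only a sketch, not the code in audio_proxy.py: the `Word` and `Turn` shapes, the function names, and the "largest overlap wins" rule are all assumptions made for illustration, since the commit does not show the actual Parakeet or Sortformer output formats.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical transcription unit (assumed shape, not Parakeet's real output)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """Hypothetical diarization segment (assumed shape, not Sortformer's real output)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def merge_by_timestamp(words: list[Word], turns: list[Turn]) -> list[dict]:
    """Label each word with the speaker turn it overlaps most, then
    coalesce consecutive words from the same speaker into one block."""
    blocks: list[dict] = []
    for w in sorted(words, key=lambda w: w.start):
        # Assumption: the turn with the largest time overlap owns the word.
        best = max(
            turns,
            key=lambda t: overlap(w.start, w.end, t.start, t.end),
            default=None,
        )
        if best is not None and overlap(w.start, w.end, best.start, best.end) > 0:
            speaker = best.speaker
        else:
            speaker = "unknown"  # no turn covers this word

        if blocks and blocks[-1]["speaker"] == speaker:
            # Same speaker as the previous word: extend the current block.
            blocks[-1]["text"] += " " + w.text
            blocks[-1]["end"] = w.end
        else:
            blocks.append(
                {"speaker": speaker, "start": w.start, "end": w.end, "text": w.text}
            )
    return blocks
```

With words "hello" (0.0–0.4 s), "there" (0.5–0.9 s), and "hi" (1.2–1.6 s), and turns speaker_0 (0.0–1.0 s) and speaker_1 (1.0–2.0 s), the sketch yields two blocks: "hello there" for speaker_0 and "hi" for speaker_1. Words that fall outside every turn are labeled "unknown" rather than dropped.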
2026-05-19 08:03:19 -05:00
parent a24610ad2a
commit 95524f4983
14 changed files with 14 additions and 1086 deletions
@@ -1,10 +1,10 @@
 import { VersionInfo, IMPOSSIBLE } from '@start9labs/start-sdk'

 export const v0_1_0 = VersionInfo.of({
-  version: '0.12.0:4',
+  version: '0.13.0:0',
  releaseNotes: {
    en_US:
-      'v0.12.0:4 — hotfix: torchaudio build was failing with "ModuleNotFoundError: No module named torch" during its setup.py. Root cause: pip\'s PEP 517 build isolation creates a fresh Python env for the build that doesn\'t see NGC\'s torch (which is what we need for ABI compat). Fix: add --no-build-isolation to the pip install so the build uses the existing torch, plus pre-install setuptools/wheel/ninja/pybind11 since pip won\'t auto-pull them when build isolation is off. Should now finally compile torchaudio v2.5.1 against NGC\'s torch 2.10 and proceed to the whisperx install.',
+      'v0.13.0 — WhisperX migration reverted. Five hotfixes deep with no working build; the fundamental problem (NGC PyTorch on ARM64 ships a custom-versioned torch with no matching torchaudio anywhere) was always going to bite. All WhisperX install plumbing has been removed from spark-control: the install banner + progress dialog, the install endpoints, the audio-proxy WhisperX-preferred branch, the whisperx service registration, the WHISPERX_* env vars, and the build-context files. Spark 2 has been cleaned (container removed, build dir removed, ~6.8 GB of dangling layers + builder cache reclaimed). The dashboard now looks as it did before the migration attempt: Parakeet + Sortformer is the only audio path, unchanged. v0.13.0:1+ will add the actually-needed fixes: a memory cap on the parakeet container (so the 90-min audio crash can\'t take down Spark 2 again — worst case is a clean OOM-kill of the container), and a chunking proxy that splits long audio before sending to Sortformer.',
  },
  migrations: {
    up: async ({ effects }) => {},