v0.13.0:0 - revert WhisperX migration; back to Parakeet + Sortformer
After five hotfix iterations on the WhisperX install (v0.12.0:0–:4) we
never got a working Docker build. The fundamental constraint isn't
patchable from outside NVIDIA: NGC PyTorch on ARM64 (the only base that
runs on Spark 2's GB10 Blackwell) ships a custom-versioned torch
2.10.0a0+b558c98 that has no pre-built torchaudio match anywhere.
WhisperX → pyannote → torchaudio is a hard dependency chain we couldn't
satisfy without rebuilding torchaudio against torch 2.10's alpha API.
Walking away cleanly is better than another night of chasing build failures.
Removed from the codebase:
- image/whisperx_container/* (Dockerfile + requirements + app/main.py)
- image/app/whisperx_install.py (install manager + SSH ship-context logic)
- image/Dockerfile COPY whisperx_container
- WHISPERX_* config keys in config.py
- whisperx service entry in services.py
- WhisperX-preferred branch in audio_proxy.py
- /api/whisperx/* endpoints in server.py
- install banner + progress dialog in index.html
- render + handlers in app.js
- .whisperx-install styles in style.css
Spark 2 cleaned in tandem (user-authorized): container removed,
~/whisperx-build/ removed, 5.4 GB of dangling image layers + 1.3 GB of
builder cache reclaimed. parakeet-asr and magpie-tts unaffected and
healthy throughout.
The audio path is back to exactly what shipped in v0.11.0:3:
POST /api/audio/transcribe-with-speakers
→ Parakeet (transcription) + Sortformer (diarization) in parallel
→ merged by timestamp into speaker-labeled blocks
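The merge step above can be sketched roughly as follows. This is an illustration only, not the code in audio_proxy.py: the segment and turn shapes, the `mergeByTimestamp` name, and the largest-overlap rule are all assumptions, since the real Parakeet and Sortformer payloads are not shown in this commit.

```typescript
// Hypothetical shapes; the real Parakeet and Sortformer payloads are not shown in this commit.
interface TranscriptSegment { start: number; end: number; text: string }
interface SpeakerTurn { start: number; end: number; speaker: string }
interface SpeakerBlock { speaker: string; start: number; end: number; text: string }

// Length in seconds of the overlap between two intervals (0 if they are disjoint).
function overlap(aStart: number, aEnd: number, bStart: number, bEnd: number): number {
  return Math.max(0, Math.min(aEnd, bEnd) - Math.max(aStart, bStart))
}

// Assumed rule: each transcript segment takes the speaker whose turn overlaps it
// the most; consecutive segments with the same speaker are joined into one block.
function mergeByTimestamp(segments: TranscriptSegment[], turns: SpeakerTurn[]): SpeakerBlock[] {
  const blocks: SpeakerBlock[] = []
  const ordered = [...segments].sort((a, b) => a.start - b.start)
  for (const seg of ordered) {
    let speaker = 'unknown' // used when no diarization turn overlaps the segment
    let best = 0
    for (const turn of turns) {
      const shared = overlap(seg.start, seg.end, turn.start, turn.end)
      if (shared > best) {
        best = shared
        speaker = turn.speaker
      }
    }
    const last = blocks[blocks.length - 1]
    if (last !== undefined && last.speaker === speaker) {
      last.end = Math.max(last.end, seg.end)
      last.text = `${last.text} ${seg.text}`
    } else {
      blocks.push({ speaker, start: seg.start, end: seg.end, text: seg.text })
    }
  }
  return blocks
}
```

Because the two models run in parallel, neither output depends on the other; only this final step needs both results.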
v0.13.0:1+ will add direct fixes for the problem the WhisperX detour
was meant to address:
1. memory cap on the parakeet-asr container so a long-audio crash
can't swap-thrash Spark 2 again
2. a chunking proxy in /api/audio/transcribe-with-speakers that
splits inputs >10 min before Sortformer
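For item 2, the planning half of a chunking proxy might look like the sketch below. It only computes time windows; slicing the audio and calling Sortformer are left out. The `planChunks` name and the choice to reuse the 10-minute threshold as the chunk length are assumptions, not the planned implementation.

```typescript
// Hypothetical window descriptor for one slice of the input audio.
interface ChunkWindow { index: number; startSec: number; endSec: number }

// 10 min is the threshold named in the commit message; using it as the
// chunk length as well is an assumption made for this sketch.
const MAX_CHUNK_SEC = 10 * 60

// Split [0, durationSec) into consecutive windows of at most maxChunkSec.
// Inputs at or under the limit come back as a single window.
function planChunks(durationSec: number, maxChunkSec: number = MAX_CHUNK_SEC): ChunkWindow[] {
  if (!(durationSec > 0) || !(maxChunkSec > 0)) return []
  const windows: ChunkWindow[] = []
  for (let start = 0, index = 0; start < durationSec; start += maxChunkSec, index++) {
    windows.push({ index, startSec: start, endSec: Math.min(start + maxChunkSec, durationSec) })
  }
  return windows
}
```

Under these assumptions a 90-minute input (5400 s) is planned as nine windows. Speaker labels from separate diarization runs are not guaranteed to line up across chunks, so a real proxy would probably also need to reconcile them.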
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -1,10 +1,10 @@
 import { VersionInfo, IMPOSSIBLE } from '@start9labs/start-sdk'
 
 export const v0_1_0 = VersionInfo.of({
-  version: '0.12.0:4',
+  version: '0.13.0:0',
   releaseNotes: {
     en_US:
-      'v0.12.0:4 — hotfix: torchaudio build was failing with "ModuleNotFoundError: No module named torch" during its setup.py. Root cause: pip\'s PEP 517 build isolation creates a fresh Python env for the build that doesn\'t see NGC\'s torch (which is what we need for ABI compat). Fix: add --no-build-isolation to the pip install so the build uses the existing torch, plus pre-install setuptools/wheel/ninja/pybind11 since pip won\'t auto-pull them when build isolation is off. Should now finally compile torchaudio v2.5.1 against NGC\'s torch 2.10 and proceed to the whisperx install.',
+      'v0.13.0 — WhisperX migration reverted. Five hotfixes deep with no working build; the fundamental problem (NGC PyTorch on ARM64 ships a custom-versioned torch with no matching torchaudio anywhere) was always going to bite. All WhisperX install plumbing has been removed from spark-control: the install banner + progress dialog, the install endpoints, the audio-proxy WhisperX-preferred branch, the whisperx service registration, the WHISPERX_* env vars, and the build-context files. Spark 2 has been cleaned (container removed, build dir removed, ~6.8 GB of dangling layers + builder cache reclaimed). The dashboard now looks as it did before the migration attempt: Parakeet + Sortformer is the only audio path, unchanged. v0.13.0:1+ will add the actually-needed fixes: a memory cap on the parakeet container (so the 90-min audio crash can\'t take down Spark 2 again — worst case is a clean OOM-kill of the container), and a chunking proxy that splits long audio before sending to Sortformer.',
   },
   migrations: {
     up: async ({ effects }) => {},