v0.13.0:0 - revert WhisperX migration; back to Parakeet + Sortformer

After five hotfix iterations on the WhisperX install (v0.12.0:0–:4) we never got a working docker build. The fundamental constraint isn't patchable from outside NVIDIA: NGC PyTorch on ARM64 (the only base that runs on Spark 2's GB10 Blackwell) ships a custom-versioned torch 2.10.0a0+b558c98 that has no pre-built torchaudio match anywhere. WhisperX → pyannote → torchaudio is a hard dependency chain we couldn't satisfy without rebuilding torchaudio against torch 2.10's alpha API. Walking away cleanly is better than another night of chasing.

Removed from the codebase:
- image/whisperx_container/* (Dockerfile + requirements + app/main.py)
- image/app/whisperx_install.py (install manager + SSH ship-context logic)
- image/Dockerfile COPY whisperx_container
- WHISPERX_* config keys in config.py
- whisperx service entry in services.py
- WhisperX-preferred branch in audio_proxy.py
- /api/whisperx/* endpoints in server.py
- install banner + progress dialog in index.html
- render + handlers in app.js
- .whisperx-install styles in style.css

Spark 2 cleaned in tandem (user-authorized): container removed, ~/whisperx-build/ removed, 5.4 GB of dangling image layers + 1.3 GB of builder cache reclaimed. parakeet-asr and magpie-tts unaffected and healthy throughout.

The audio path is back to exactly what shipped in v0.11.0:3:

  POST /api/audio/transcribe-with-speakers
    → Parakeet (transcription) + Sortformer (diarization) in parallel
    → merged by timestamp into speaker-labeled blocks

v0.13.0:1+ will add the actually-needed fixes that the WhisperX detour was meant to address:
1. memory cap on the parakeet-asr container so a long-audio crash can't swap-thrash Spark 2 again
2. a chunking proxy in /api/audio/transcribe-with-speakers that splits inputs >10 min before Sortformer

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:03:19 -05:00
parent a24610ad2a
commit 95524f4983
14 changed files with 14 additions and 1086 deletions
@@ -24,7 +24,6 @@ from .overrides import add_custom, delete_custom, extract_knobs_from_args, load_
 from .services import docker_state, run_action, services_from_settings
 from .speech_models import SpeechModelsManager
 from .ssh import ssh_run
-from .whisperx_install import WhisperXInstaller
 from .swap import SwapManager
 from .updates import UpdateManager, get_update_status
 from .validate import validate_launch
@@ -40,7 +39,6 @@ hardware_probe = HardwareProbe(settings)
 nim_manager = NimManager(settings)
 deep_health = DeepHealth(settings)
 speech_models = SpeechModelsManager(settings)
-whisperx_installer = WhisperXInstaller(settings)

 app = FastAPI(title="spark-control", version="0.1.0")

@@ -537,68 +535,10 @@ async def post_speech_models_restart() -> dict:
    return result


-# ---- WhisperX install (Phase 2 of the WhisperX migration) ----
-
-@app.get("/api/whisperx/status")
-async def get_whisperx_status() -> dict:
-    """Is WhisperX installed + healthy on Spark 2 right now?"""
-    return await whisperx_installer.status()
-
-
-@app.post("/api/whisperx/install")
-async def post_whisperx_install() -> dict:
-    """One-click install: ships the WhisperX build context from inside
-    spark-control to Spark 2, runs `docker build` + `docker run`, polls
-    /health until both models are loaded. Streams progress via the matching
-    GET /api/whisperx/install/{job_id}/stream SSE endpoint."""
-    try:
-        job = await whisperx_installer.trigger()
-    except RuntimeError as e:
-        raise HTTPException(409, str(e))
-    return {"job_id": job.id, "started_at": job.started_at}
-
-
-@app.get("/api/whisperx/install/{job_id}")
-async def get_whisperx_install(job_id: str) -> dict:
-    job = whisperx_installer.get(job_id)
-    if not job:
-        raise HTTPException(404, "unknown job")
-    return {
-        "id": job.id,
-        "state": job.state,
-        "phase": job.phase,
-        "lines": job.lines,
-        "started_at": job.started_at,
-        "finished_at": job.finished_at,
-        "returncode": job.returncode,
-    }
-
-
-@app.get("/api/whisperx/install/{job_id}/stream")
-async def stream_whisperx_install(job_id: str) -> StreamingResponse:
-    job = whisperx_installer.get(job_id)
-    if not job:
-        raise HTTPException(404, "unknown job")
-
-    async def event_stream():
-        last_idx = 0
-        last_phase = ""
-        last_state = ""
-        while True:
-            new_lines = job.lines[last_idx:]
-            last_idx = len(job.lines)
-            for line in new_lines:
-                yield f"data: {json.dumps({'line': line})}\n\n"
-            if job.phase != last_phase or job.state != last_state:
-                yield f"event: phase\ndata: {json.dumps({'phase': job.phase, 'state': job.state})}\n\n"
-                last_phase = job.phase
-                last_state = job.state
-            if job.finished_at:
-                yield f"event: done\ndata: {json.dumps({'state': job.state, 'returncode': job.returncode})}\n\n"
-                return
-            await asyncio.sleep(0.6)
-
-    return StreamingResponse(event_stream(), media_type="text/event-stream")
+# NOTE: a WhisperX-on-Spark-2 install action lived here briefly in v0.12.0:0–4
+# but was reverted in v0.13.0:0. NGC's custom-versioned torch on ARM64 made
+# building torchaudio (which WhisperX needs via pyannote) unworkable. The
+# existing Parakeet + Sortformer pipeline stays as the audio path.


@app.get("/api/endpoints")