From d9c098262f4d1f8d547e21f7d730d2eae78b48a8 Mon Sep 17 00:00:00 2001 From: Keysat Date: Mon, 15 Jun 2026 17:44:48 -0500 Subject: [PATCH] docs(roadmap): defer parakeet long-audio guard; record rationale + impl shortcut --- ROADMAP.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ROADMAP.md b/ROADMAP.md index b5c1fbd..d4aaa2f 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -3,7 +3,7 @@ Longer-term backlog, roughly ordered. An item moves to "Current state" in CLAUDE.md when picked up. ## Near term -- parakeet-asr `--memory` cap, shipped via the Reapply-patches action (guards against swap-thrash on very long audio). +- parakeet-asr long-audio memory guard — **deferred 2026-06-15, low priority.** A duration cap on `/v1/audio/diarize`: Sortformer runs the whole file in one pass (`diarizer.py:128-135`) over Spark 2's *shared* 128 GB unified memory (also feeding Kokoro/embeddings/Qdrant), so one giant single file can thrash into swap. **Precautionary — no observed incident**, and the production consumer (Recap Relay) already chunks via `/diarize-chunk` (~5-min, already bounded), so the only exposed path is a consumer POSTing one huge file to the full `/diarize`. When picked up: add a configurable `MAX_DIARIZE_SECONDS` guard in `diarizer.py` right after `duration` is computed (~line 130) → raise → HTTP 413 in `main.py` (mirrors the existing `MAX_UPLOAD_MB` 413); ship via the Reapply-patches action (restarts the live parakeet-asr container → needs go/no-go). Leave transcription out of v1 (upstream/un-patched file; parakeet-TDT handles long audio better). Revisit only if a consumer starts sending long single files. - Controlled concurrency sweep of the audio endpoints in a quiet window — replace the reasoned in-flight cap (2, ceiling 3) with the measured knee. ## Audio quality