v0.10.0:0 - speaker diarization via Sortformer + merged transcribe-with-speakers
Adds a new pipeline for diarized transcription that any client (recap-relay,
ad-hoc curl, future Mac-side tools) can call. Pure data pipeline, no LLM
or UI included — name resolution / analysis happen downstream where prompts
and rendering are configurable.
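As an illustration of what "downstream" name resolution could look like, a consumer might swap the generic speaker labels for display names and render a simple transcript. This is only a sketch: it assumes the `segments` shape listed under Architecture, and both the label strings (`speaker_0`, ...) and the `resolve_names` helper are hypothetical, not part of this commit.

```python
def resolve_names(response, names):
    """Render transcript lines, replacing speaker labels with display names.

    `response` is assumed to follow the transcribe-with-speakers shape
    (a dict with a "segments" list); `names` maps label -> display name.
    """
    lines = []
    for seg in response["segments"]:
        # Labels without a mapping pass through unchanged.
        who = names.get(seg["speaker"], seg["speaker"])
        minutes = seg["start_ms"] // 60000
        seconds = seg["start_ms"] // 1000 % 60
        lines.append(f"[{minutes:02d}:{seconds:02d}] {who}: {seg['text']}")
    return lines


if __name__ == "__main__":
    demo = {
        "segments": [
            {"start_ms": 0, "end_ms": 900, "speaker": "speaker_0", "text": "hello there"},
            {"start_ms": 61000, "end_ms": 62000, "speaker": "speaker_1", "text": "hi"},
        ]
    }
    for line in resolve_names(demo, {"speaker_0": "Alice"}):
        print(line)
```

Keeping this step outside the pipeline means the name map (or an LLM prompt that produces it) can change without touching the transcription service.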
Architecture:
  Spark 2 / parakeet-asr container:
    + /opt/parakeet/app/diarizer.py  (new: SortformerDiarizer class)
    + /opt/parakeet/app/main.py      (patched: loads diarizer, adds
                                      /v1/audio/diarize endpoint)
    Model: nvidia/diar_sortformer_4spk-v1 (~150 MB, ungated, NeMo native)
  Spark Control:
    + POST /api/audio/transcribe-with-speakers
        Body:    multipart file
        Returns: {
          duration, language, speakers_detected,
          segments: [{start_ms, end_ms, speaker, text}, ...],
          models: {transcription, diarization}
        }
      Runs Parakeet ASR + Sortformer in parallel, merges words to speaker
      turns by timestamp, groups into speaker-change blocks (breaks also
      on >1.5s silence gaps).
    + If Parakeet 500s mid-pipeline, kicks deep-health probe and returns
      503/Retry-After: 60 (same wedge-recovery pattern as v0.9.0:2).
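The merge step described above (words assigned to speaker turns by timestamp, then grouped into blocks that also break on silence longer than 1.5 s) can be sketched roughly as follows. The actual Spark Control code is not shown in this commit page, so everything here is an assumption: the `Word` and `Turn` types, the largest-overlap assignment rule, and the `"unknown"` fallback label are all hypothetical.

```python
from dataclasses import dataclass

GAP_MS = 1500  # silence threshold taken from the commit message (>1.5 s)


@dataclass
class Word:
    start_ms: int
    end_ms: int
    text: str


@dataclass
class Turn:
    start_ms: int
    end_ms: int
    speaker: str


def speaker_for(word, turns):
    """Return the speaker whose turn overlaps the word most, or None."""
    best, best_overlap = None, 0
    for turn in turns:
        overlap = min(word.end_ms, turn.end_ms) - max(word.start_ms, turn.start_ms)
        if overlap > best_overlap:
            best, best_overlap = turn.speaker, overlap
    return best


def merge(words, turns, gap_ms=GAP_MS):
    """Group words into segments; a new segment starts on a speaker change
    or when the silence since the previous word exceeds gap_ms."""
    segments = []
    for word in sorted(words, key=lambda w: w.start_ms):
        speaker = speaker_for(word, turns) or "unknown"  # hypothetical fallback
        last = segments[-1] if segments else None
        if last and last["speaker"] == speaker and word.start_ms - last["end_ms"] <= gap_ms:
            last["end_ms"] = max(last["end_ms"], word.end_ms)
            last["text"] += " " + word.text
        else:
            segments.append({
                "start_ms": word.start_ms,
                "end_ms": word.end_ms,
                "speaker": speaker,
                "text": word.text,
            })
    return segments
```

Assigning by largest overlap rather than by word start time is one way to handle words that straddle a turn boundary; the real implementation may choose differently.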
Apply Sortformer patches to the running Parakeet container with:
    bash image/parakeet_patches/apply.sh <spark2-host> <ssh-user>
Patches are reversible: apply.sh backs up the original main.py inside the
container as main.py.pre-sortformer before overwriting it. To restore, copy
that file back over main.py, remove diarizer.py, and docker restart the
container. The backup lives in the container's writable layer, so it
survives docker restart but not docker rm.
v0.11 follow-up: dashboard "Speech Models" panel to swap/update model
versions from the UI instead of needing to re-run apply.sh.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
image/parakeet_patches/apply.sh (new file, executable, +54)
@@ -0,0 +1,54 @@
#!/bin/bash
# Apply Sortformer diarization patches to a running parakeet-asr container.
#
# Run from the spark-control repo root on the laptop:
#   bash image/parakeet_patches/apply.sh <spark2-host> <ssh-user>
#
# What it does:
#   1. Backs up the current /opt/parakeet/app/main.py inside the container
#      (writable layer; survives docker restart but NOT docker rm).
#   2. Copies the patched main.py + new diarizer.py into the container.
#   3. Restarts the container so the new code + Sortformer model load.
#
# Reversibility:
#   - The backup of main.py is at /opt/parakeet/app/main.py.pre-sortformer
#     inside the container. Restore with:
#       docker exec parakeet-asr cp /opt/parakeet/app/main.py.pre-sortformer /opt/parakeet/app/main.py
#       docker exec parakeet-asr rm -f /opt/parakeet/app/diarizer.py
#       docker restart parakeet-asr
#   - If the container is ever `docker rm`'d (volume rebuild), re-run this
#     script. We will eventually fold this into spark-control as an action.

set -e

HOST="${1:?usage: apply.sh <spark2-host> <ssh-user>}"
USER="${2:?usage: apply.sh <spark2-host> <ssh-user>}"
CONTAINER="${CONTAINER:-parakeet-asr}"

REPO_DIR="$(cd "$(dirname "$0")" && pwd)"

echo "→ Backing up current main.py inside ${CONTAINER}..."
ssh "${USER}@${HOST}" "docker exec ${CONTAINER} sh -c \
  'test -f /opt/parakeet/app/main.py.pre-sortformer || cp /opt/parakeet/app/main.py /opt/parakeet/app/main.py.pre-sortformer'"

echo "→ Copying diarizer.py into container..."
ssh "${USER}@${HOST}" "docker exec -i ${CONTAINER} sh -c \
  'cat > /opt/parakeet/app/diarizer.py'" < "${REPO_DIR}/diarizer.py"

echo "→ Copying patched main.py into container..."
ssh "${USER}@${HOST}" "docker exec -i ${CONTAINER} sh -c \
  'cat > /opt/parakeet/app/main.py'" < "${REPO_DIR}/main.py"

echo "→ Verifying syntax inside container..."
ssh "${USER}@${HOST}" "docker exec ${CONTAINER} python3 -c \
  'import ast; ast.parse(open(\"/opt/parakeet/app/diarizer.py\").read()); ast.parse(open(\"/opt/parakeet/app/main.py\").read()); print(\"py OK\")'"

echo "→ Restarting ${CONTAINER}..."
ssh "${USER}@${HOST}" "docker restart ${CONTAINER}"

echo
echo "✔ Patches applied. Sortformer model (~150 MB) will download on first load; wait ~30s before testing."
echo
echo "Test once it's ready:"
echo "  curl -sS http://${HOST}:8000/health"
echo "  curl -sS -X POST http://${HOST}:8000/v1/audio/diarize -F file=@some-audio.mp3 | head -c 500"
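
The script's closing hint (wait roughly 30 s, then curl `/health`) could also be automated with a small readiness poll. The sketch below is not part of the commit. It assumes only that `/health` answers HTTP 200 once the service is up, since the response body format is not documented here; the helper names `http_ok` and `wait_until_ready` are hypothetical.

```python
import time
import urllib.request


def http_ok(url, timeout=5):
    """Return True if a GET on url answers HTTP 200, False on any I/O error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def wait_until_ready(url, probe=http_ok, attempts=12, delay_s=5, sleep=time.sleep):
    """Call probe(url) up to `attempts` times, sleeping delay_s between tries.

    Returns True as soon as a probe succeeds, False if every attempt fails.
    """
    for i in range(attempts):
        if probe(url):
            return True
        if i < attempts - 1:
            sleep(delay_s)
    return False
```

A call such as `wait_until_ready("http://<spark2-host>:8000/health")` would then gate the diarize test; the defaults (12 attempts, 5 s apart) are arbitrary and may need raising if the first model download is slow.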