// Kokoro TTS backend: synthesizes a topic summary into speech via Spark
// Control's OpenAI-compatible /v1/audio/speech endpoint.
//
// Kokoro-82M (Apache-2.0, hexgrad/Kokoro-82M) replaced Magpie in Spark
// Control v0.14.0. Magpie's NVIDIA Riva decoder had a structural
// truncation defect that capped end-to-end reliability at ~85% even with
// server-side retries + chunking; Kokoro renders cleanly at any length
// (100% in our testing, ~1s for a ~100-word summary, no truncation). So
// this backend is a single pass-through call: NONE of the Magpie-era
// fragmenting, pacing/recovery-gap, duration-check, retry, or WAV
// stitching is needed or present.
//
// Output: 24kHz mono 16-bit PCM. Kokoro can emit wav/mp3/opus/flac
// directly via response_format, so we request the caller's format (mp3
// by default: small + universally playable for the mobile/offline
// player) and never transcode client-side. durationSeconds is left null:
// Kokoro's WAV header carries a placeholder size field (bogus computed
// duration), and for mp3 we'd have to decode; the Recap side measures
// duration off the cached file /