f44e7f8b03
Adds three new endpoints to spark-control that translate OpenAI's
audio API shapes to the Parakeet (STT) and Magpie (TTS, NVIDIA Riva)
services on the Sparks:
  GET  /v1/models                STT model + Magpie's 60+ voices
  POST /v1/audio/speech          OpenAI body -> Magpie multipart synthesize
                                 (returns audio/wav passthrough)
  POST /v1/audio/transcriptions  relay to Parakeet (already compatible)
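As a rough illustration of the /v1/models listing, the sketch below builds an OpenAI-style model list from one STT model id plus the Magpie voice names. The function name `buildModelList`, the `owned_by` labels, and the choice to expose each voice as a model entry are assumptions made for illustration, not the code in this commit. Only the outer `{ object: "list", data: [...] }` shape is taken from the OpenAI API.

```typescript
// OpenAI's list-models response shape.
interface ModelEntry {
  id: string
  object: 'model'
  created: number
  owned_by: string
}

interface ModelList {
  object: 'list'
  data: ModelEntry[]
}

// Hypothetical builder: sttModel and voices would come from Parakeet and Magpie.
// An empty voices array (for example, Magpie offline) still yields the STT entry.
function buildModelList(sttModel: string, voices: string[], created = 0): ModelList {
  const entry = (id: string, owned_by: string): ModelEntry => ({
    id,
    object: 'model',
    created,
    owned_by, // assumed labels, purely illustrative
  })
  return {
    object: 'list',
    data: [entry(sttModel, 'parakeet'), ...voices.map((v) => entry(v, 'magpie'))],
  }
}
```

Passing an empty voice list keeps the STT model visible, which is one way to keep the listing useful when the TTS backend cannot be reached.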
Verified shapes against the live services:
- Parakeet returns OpenAI-style {"text": "..."} or verbose_json with
  segments+words, so it is already a drop-in for OpenAI clients.
- Magpie returns raw WAV bytes with Content-Type: audio/wav, NOT
  base64-wrapped JSON as one might assume. The proxy only translates
  the request body; the response is passed through unchanged.
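The request-side translation for /v1/audio/speech could look roughly like the sketch below, which maps the OpenAI JSON body onto multipart form fields. The form field names (`text`, `voice`, `language`) and the helper name `toMagpieForm` are assumptions; the actual Magpie parameter names are not stated in this commit message. It relies on the global `FormData` available in Node 18+.

```typescript
// Subset of the OpenAI /v1/audio/speech request body.
interface OpenAISpeechRequest {
  model: string
  input: string
  voice: string
}

// Hypothetical translation: the field names "text", "voice", and "language"
// are assumed, not confirmed Magpie parameter names.
function toMagpieForm(body: OpenAISpeechRequest, language: string): FormData {
  const form = new FormData()
  form.set('text', body.input) // OpenAI calls the text to speak "input"
  form.set('voice', body.voice)
  form.set('language', language)
  return form
}
```

Posting the returned form with `fetch` lets the runtime set the multipart boundary, and the upstream WAV response can then be handed back to the client without re-encoding.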
Voice language is auto-derived from the voice name (e.g.
Magpie-Multilingual.EN-US.Mia -> language=en-US) so clients don't
need to set it explicitly.
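A minimal sketch of that derivation, assuming voice names are dot-separated and contain one segment of the form XX-YY as in the example above. The helper name `languageFromVoice` and the choice to return `undefined` when no such segment exists are illustrative assumptions.

```typescript
// Hypothetical helper: pull the locale segment out of a Magpie voice name,
// e.g. "Magpie-Multilingual.EN-US.Mia" -> "en-US".
function languageFromVoice(voice: string): string | undefined {
  for (const part of voice.split('.')) {
    // Accept only a two-letter language and two-letter region joined by "-".
    if (/^[A-Za-z]{2}-[A-Za-z]{2}$/.test(part)) {
      return `${part.slice(0, 2).toLowerCase()}-${part.slice(3).toUpperCase()}`
    }
  }
  return undefined // no locale segment; the caller decides on a default
}
```

Returning `undefined` instead of guessing leaves the fallback policy to the caller, for example using a language the client sent explicitly.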
Open WebUI / Home Assistant / Recap Relay can now all point at one
URL (https://<spark-control>.local/v1) and get LLM, STT, and TTS
behind a single identity. No shim service to deploy.
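From the client side, a call against that single base URL might be assembled as in the sketch below. The helper name `speechCall`, the placeholder model id `tts-1`, and the example host are assumptions; only the path /v1/audio/speech and the OpenAI body fields (`model`, `input`, `voice`) come from the text above and the OpenAI API.

```typescript
interface SpeechCall {
  url: string
  init: { method: 'POST'; headers: Record<string, string>; body: string }
}

// Hypothetical client helper: builds the URL and fetch options for a TTS request.
function speechCall(baseUrl: string, voice: string, input: string): SpeechCall {
  return {
    // Trim trailing slashes so ".../v1" and ".../v1/" behave the same.
    url: `${baseUrl.replace(/\/+$/, '')}/audio/speech`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // "model" is part of the OpenAI schema; 'tts-1' is only a placeholder here.
      body: JSON.stringify({ model: 'tts-1', input, voice }),
    },
  }
}

// Usage (not executed here): const { url, init } = speechCall(base, voice, text)
// followed by fetch(url, init) and reading the WAV bytes via arrayBuffer().
```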
Pure addition: no existing routes touched; the dashboard, /api/*,
download flow, deep-health, and hardware probes are all unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
14 lines
1.0 KiB
TypeScript
import { VersionInfo, IMPOSSIBLE } from '@start9labs/start-sdk'

export const v0_1_0 = VersionInfo.of({
  version: '0.9.0:0',
  releaseNotes: {
    en_US:
      'v0.9.0: OpenAI-compatible audio proxy. Spark Control now exposes /v1/audio/speech (TTS), /v1/audio/transcriptions (STT), and /v1/models on its own URL, translating OpenAI-shaped requests to Magpie (NVIDIA Riva multipart) and forwarding to Parakeet (already OpenAI-compatible). Open WebUI, Home Assistant, and any other OpenAI-compatible client can now point at https://<your-spark-control>.local/v1 and get TTS + STT + LLM all behind one identity, with no shim service to deploy and no separate URLs to remember. /v1/models lists Magpie\'s 60+ voices across en-US, es-US, fr-FR, zh-CN, it-IT, hi-IN, vi-VN, ja-JP, de-DE so client UIs auto-populate their voice pickers. Falls back gracefully if Magpie is offline (still serves STT). Pure addition: no existing routes or endpoints changed.',
  },
  migrations: {
    up: async ({ effects }) => {},
    down: IMPOSSIBLE,
  },
})