v0.9.0:0 - OpenAI-compatible audio proxy for Open WebUI / Home Assistant

Adds three new endpoints to spark-control that translate OpenAI's
audio API shapes to the Parakeet (STT) and Magpie (TTS, NVIDIA Riva)
services on the Sparks:

  GET  /v1/models               - STT model + Magpie's 60+ voices
  POST /v1/audio/speech         - OpenAI body -> Magpie multipart synthesize
                                  (returns audio/wav passthrough)
  POST /v1/audio/transcriptions - relay to Parakeet (already compatible)

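A minimal sketch of how the /v1/models response could be assembled, assuming the OpenAI list shape (`object: "list"` with a `data` array of model entries). The function name, the `owned_by` value, and the STT model id used in the usage line are hypothetical and not taken from this commit. Passing `null` for the voices models the case where Magpie is offline and only STT is listed.

```typescript
// Subset of the OpenAI model-list shape (the real objects carry more fields).
interface ModelEntry {
  id: string
  object: 'model'
  owned_by: string
}

interface ModelList {
  object: 'list'
  data: ModelEntry[]
}

// Hypothetical helper: merges the STT model id with the Magpie voice names.
// `voices` is null when the voice list could not be fetched.
function buildModelList(sttModel: string, voices: string[] | null): ModelList {
  const ids = [sttModel, ...(voices ?? [])]
  return {
    object: 'list',
    // 'spark-control' as owner is an assumption for illustration only.
    data: ids.map((id): ModelEntry => ({ id, object: 'model', owned_by: 'spark-control' })),
  }
}
```

For example, `buildModelList('parakeet', null)` would list only the (hypothetically named) STT model, while passing an array of voice names appends one entry per voice.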
Verified shapes against the live services:
- Parakeet returns OpenAI-style {"text": "..."}, or verbose_json with
  segments and words, so it already works as a drop-in for OpenAI clients.
- Magpie returns raw WAV bytes with Content-Type: audio/wav, not
  base64-wrapped JSON. The proxy only translates the request body;
  the response is passed through unchanged.
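The request-side translation can be sketched roughly as below. This is an illustration under assumptions, not the code from this commit: the multipart field names (`text`, `voice`, `language`) and the function name are hypothetical, and only a subset of the OpenAI speech body is modelled.

```typescript
// Subset of the OpenAI body for POST /v1/audio/speech.
interface OpenAISpeechRequest {
  model: string
  input: string
  voice: string
}

// Hypothetical multipart form fields for the Magpie synthesize call.
// The field names are assumptions, not confirmed by the commit.
interface MagpieSynthesizeFields {
  text: string
  voice: string
  language: string
}

// Maps the OpenAI JSON body onto the fields that would be sent as multipart
// form data. The language is supplied by the caller in this sketch.
function toMagpieFields(body: OpenAISpeechRequest, language: string): MagpieSynthesizeFields {
  if (body.input.trim() === '') {
    throw new Error('input must not be empty')
  }
  return { text: body.input, voice: body.voice, language }
}
```

Each returned field would become one part of the multipart request; the WAV bytes in the response need no mapping, since they are passed through as-is.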
Voice language is auto-derived from the voice name (e.g.
Magpie-Multilingual.EN-US.Mia -> language=en-US) so clients don't
need to set it explicitly.
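One way the language derivation could work is sketched here, assuming voice names follow the `<Model>.<LANG>-<REGION>.<Speaker>` pattern seen in the example above. The helper name and the decision to return `undefined` for names that do not match are assumptions for illustration.

```typescript
// Hypothetical helper: derives a language tag such as "en-US" from a Magpie
// voice name such as "Magpie-Multilingual.EN-US.Mia".
// Returns undefined when the name does not follow the assumed pattern.
function languageFromVoice(voice: string): string | undefined {
  const parts = voice.split('.')
  if (parts.length < 3) return undefined
  // The middle segment is expected to look like "EN-US" or "zh-CN".
  const match = /^([A-Za-z]{2,3})-([A-Za-z]{2})$/.exec(parts[1])
  if (!match) return undefined
  return `${match[1].toLowerCase()}-${match[2].toUpperCase()}`
}
```

With this sketch, `languageFromVoice('Magpie-Multilingual.EN-US.Mia')` yields `en-US`, and a name with no locale segment yields `undefined`, leaving the caller to pick a default.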
Open WebUI / Home Assistant / Recap Relay can now all point at one
URL — https://<spark-control>.local/v1 — and get LLM, STT, TTS
behind a single identity. No shim service to deploy.
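From the client side, pointing at the single base URL might look like the sketch below. The host name in the usage note is a placeholder, the helper names are hypothetical, and `tts-1` is only a placeholder model value; whether the proxy inspects `model` at all is not stated in this commit.

```typescript
// Builds the three audio-related endpoint URLs under one base URL.
function audioEndpoints(baseUrl: string): { models: string; speech: string; transcriptions: string } {
  const base = baseUrl.replace(/\/+$/, '') // drop any trailing slashes
  return {
    models: `${base}/models`,
    speech: `${base}/audio/speech`,
    transcriptions: `${base}/audio/transcriptions`,
  }
}

// Describes an OpenAI-style TTS request as a plain object, so it can be
// handed to fetch() or any other HTTP client.
function speechRequest(baseUrl: string, input: string, voice: string) {
  return {
    url: audioEndpoints(baseUrl).speech,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // 'tts-1' is a placeholder model name for illustration.
    body: JSON.stringify({ model: 'tts-1', input, voice }),
  }
}
```

A client would call `speechRequest('https://spark.example.local/v1', 'Hello', 'Magpie-Multilingual.EN-US.Mia')`, send the result with its HTTP client, and read the response body as audio/wav bytes.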
Pure addition: no existing routes touched; the dashboard, /api/*,
download flow, deep-health, and hardware probes are all unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -1,10 +1,10 @@
 import { VersionInfo, IMPOSSIBLE } from '@start9labs/start-sdk'
 
 export const v0_1_0 = VersionInfo.of({
-  version: '0.8.1:2',
+  version: '0.9.0:0',
   releaseNotes: {
     en_US:
-      'v0.8.1:2 — the primary card button now adapts to whether the model is on disk. If weights are present: green "Switch to this" (unchanged). If weights are NOT on disk: blue "Download" instead, which calls /api/download directly with the model\'s repo and the right mode (solo→Spark 1, cluster→both Sparks) — no more pasting the repo into the manual download form to re-fetch a deleted model. Re-installing a previously-deleted model is now one click + a confirmation. Builds on the disk-status pills + trash icons from 0.8.1.',
+      'v0.9.0 — OpenAI-compatible audio proxy. Spark Control now exposes /v1/audio/speech (TTS), /v1/audio/transcriptions (STT), and /v1/models on its own URL, translating OpenAI-shaped requests to Magpie (NVIDIA Riva multipart) and forwarding to Parakeet (already OpenAI-compatible). Open WebUI, Home Assistant, and any other OpenAI-compatible client can now point at https://<your-spark-control>.local/v1 and get TTS + STT + LLM all behind one identity — no shim service to deploy, no separate URLs to remember. /v1/models lists Magpie\'s 60+ voices across en-US, es-US, fr-FR, zh-CN, it-IT, hi-IN, vi-VN, ja-JP, de-DE so client UIs auto-populate their voice pickers. Falls back gracefully if Magpie is offline (still serves STT). Pure addition — no existing routes or endpoints changed.',
   },
   migrations: {
     up: async ({ effects }) => {},