05d03beeeb
- AGENTS.md: rewrite Current state lean for v0.19.0:0; drop the now-completed full-eval triage block (history lives in git log + EVALUATION.md). - docs/guides/fastapi-image.md: add two durable conventions — user values crossing into SSH must go through shellsafe; new endpoints and the csrf_guard exempt-prefix rule. - ROADMAP.md: park the remaining non-blocking P2/P3 tech debt from the eval.
43 lines
3.5 KiB
Markdown
43 lines
3.5 KiB
Markdown
# ROADMAP
|
|
|
|
Longer-term backlog, roughly ordered. An item moves to "Current state" in CLAUDE.md when picked up.
|
|
|
|
## Near term
|
|
- parakeet-asr `--memory` cap, shipped via the Reapply-patches action (guards against swap-thrash on very long audio).
|
|
- Controlled concurrency sweep of the audio endpoints in a quiet window — replace the reasoned in-flight cap (2, ceiling 3) with the measured knee.
|
|
|
|
## Audio quality
|
|
- Echo cancellation for dual-channel label-merge — removes the mic-bleed limit when the local user isn't wearing headphones.
|
|
- LLM "referee" pass for low-confidence label-merge speaker naming.
|
|
|
|
## Platform hardening
|
|
- Qdrant auth (API key) + scheduled snapshots/backups.
|
|
- Observability: request metrics + GPU-busy tracking, so load questions are answered from data instead of log archaeology.
|
|
- API-key auth on Spark Control — only if public (non-VPN) exposure is ever needed; current stance is LAN + split-tunnel VPN only.
|
|
|
|
## Throughput (only if audio load outgrows one GPU)
|
|
- Second audio worker / queueing layer; revisit which services share Spark 2.
|
|
|
|
## Dashboard
|
|
- Per-model configurable vLLM flags editable from the UI (today: edit `models.yaml` and rebuild).
|
|
- Spark host update actions (OS/driver) from the UI.
|
|
- Open WebUI link-out integration; richer per-service detail views.
|
|
|
|
## Tech debt (from the 2026-06-12 full-eval — see EVALUATION.md)
|
|
|
|
P0/P1 security findings are all fixed in v0.19.0:0. Remaining, none blocking:
|
|
|
|
**P2 — track:**
|
|
- No automated tests beyond the two redaction suites — swap state machine, proxies, SSH wrapper, and the StartOS package are untested; live-cluster paths (swap exec, audio, embeddings/search) are exercised only by hand. Biggest coverage gap; a small pytest harness for `build_launch_command` (incl. injection cases), swap transitions, and `_merge_words_with_speakers` is the highest-value start.
|
|
- Loose dependency floors permit vulnerable `python-multipart`/`starlette` (DoS CVEs) on rebuild; no lockfile; no upload size caps (`pyproject.toml`).
|
|
- Opaque HTTP 500 on `POST /api/models` / `PUT /knobs` when `MODELS_OVERRIDES` unset in dev (write to read-only `/data`) — catch the `OSError`.
|
|
- NGC API key still appears on the remote process command line (`nim.py`) — the quote-breakout risk is fixed; pass via stdin/env to also remove the process-list exposure.
|
|
- Global mutable `catalog` reassigned via `global`, shared across async requests with no snapshot (`server.py`) — latent race as concurrency grows.
|
|
- Container runs uvicorn as **root** bound to `0.0.0.0:9999` (no `USER` in Dockerfile) — amplifies any RCE blast radius.
|
|
|
|
**P3 — bulk-fix when next touching docs/packaging:**
|
|
- README Status block stale (`v0.2.3 / 0.13.0:4` → now v0.19.0:0); deprecated `@app.on_event` + hardcoded `app.version="0.1.0"`; `NimInstallBody.register` shadows `BaseModel` (rename → `register_service`); httpx class names leak into TTS/speech-models error text; one unescaped `innerHTML` sink (`app.js`) + `task_id` reflected in scrub JSON.
|
|
- Packaging: `marketingUrl`/`packageRepo`/`upstreamRepo` are `example.com` placeholders; broken `instructions.md` source link; per-service SSH users (`parakeet_user` etc.) absent from the Configure-Sparks action inputSpec (silent default-empty); `Makefile` builds only x86 though the manifest declares `aarch64`.
|
|
- Hardening misc: no body/upload size limits on `/v1/audio/*`, `/v1/chat/completions`, `/scrub`; `int(_env(...))` startup crash on bad `VLLM_PORT`; upstream error text echoed to clients.
|
|
- StartOS registry (only if ever pursuing it): source must be public + real repo URLs.
|