Add models that live as a directory on a Spark (e.g. LoRA-merged fine-tunes),
not just Hugging Face repos.
- ModelDef gains local_path; a model must set exactly one of repo / local_path.
The validator also enforces the local-path whitelist and that any
--chat-template lives inside local_path (only that dir is mounted).
- build_launch_command bind-mounts the dir into the vLLM container at the SAME
host==container path via the launch script's VLLM_SPARK_EXTRA_DOCKER_ARGS hook,
then `vllm serve <dir>`. No launch-cluster.sh change (verified the upstream
expands that var unquoted; contract noted in runbook.md).
- shellsafe.validate_local_path: absolute path, charset whitelist, no '.'/'..'.
- POST /api/models validates the full entry via ModelDef before persisting, so a
bad entry can't be written and then break catalog load; _merge_overrides skips
an invalid override entry instead of failing the whole catalog.
- disk.py size-probes a local path with du; disk-delete refused for local models.
- UI: "+ Add local model" dialog, `local` badge, path shown instead of an HF
link, delete button hidden for local models.
- Tests: local launch + injection round-trip, chat-template location, traversal,
exactly-one-source, _merge_overrides skip-invalid (94 pass). Reviewer-agent
pass; findings addressed.
Triaged from a full independent evaluation (EVALUATION.md). Addresses the
three P0/P1 code findings; the proxy/data APIs that downstream apps consume
are deliberately untouched.
- ssh command injection (P0): new shellsafe.py validates + shlex.quotes every
user-supplied value crossing into an SSH command on the Sparks (model repo,
vllm args/knobs, NIM image/container/volume/port/env, service names).
Boundary validation on POST /api/models and POST /api/nim/install; quoting at
every sink in models/download/nim/services. NGC key now quoted too.
- qdrant path injection (P1): /api/search validates the collection name against
a metacharacter-free whitelist and URL-encodes the path segment.
- csrf (P1): csrf_guard middleware enforces same-origin on state-changing
control endpoints; /v1/*, /scrub, /rehydrate, /api/search, /api/audio/* and
/api/health-event are exempt so external consumers are unaffected.
Verified: injection survives only as a single quoted token, vLLM preflight
shlex.split round-trip intact, CSRF behaviors covered via TestClient, both
offline redaction suites still pass, tsc clean, s9pk rebuilt.
Backend:
- overrides.py: read/write /data/models-overrides.yaml (knobs + custom entries)
- apply_knobs_to_args(): strip matching flags from bundled vllm_args and append knob values, so knob changes properly override bundled defaults
- extract_knobs_from_args(): seed UI knob values from bundled args so the Advanced dialog has correct starting state
- models.py: load_catalog merges overrides on top of bundled yaml
- GET /api/models returns effective_knobs per model
- PUT /api/models/{key}/knobs persists knob changes
- POST /api/models adds a custom catalog entry
- DELETE /api/models/{key} removes a custom entry (bundled models cannot be deleted)
- swap_manager.reload_catalog() called after each mutation so swaps see latest
Frontend:
- New 'Advanced' button on every card opens a modal dialog: max-model-len input, gpu-memory-utilization slider, three optimization checkboxes (fastsafetensors, prefix caching, FP8 KV cache). Save persists; Cancel discards. Custom models also have a Delete button.
- After a successful download, automatically open the 'Add to catalog' dialog pre-filled with the repo, with the same knob defaults — user just enters key, display name, and clicks Save.
- Custom catalog entries are tagged with a blue 'custom' pill on the card.
Package: bump 0.2.3:0; main.ts sets MODELS_OVERRIDES=/data/models-overrides.yaml so overrides persist on the StartOS volume.
- models.yaml: add 'description' field for all 5 models (generic, anyone-can-use)
- ModelDef gains optional description: str | None field
- UI: render description below meta tags; mute the repo line further
- escapeHtml() for safety in case descriptions/names contain HTML chars
- Update runbook: how to add a new model with description