spark-control

Author	SHA1	Message	Date
Keysat	df9f244eae	v0.26.0:0 - disk-driven model menu (scan sparks; recipes; needs-setup) The dashboard menu is now the set of models actually downloaded on the Sparks, not a hard-coded catalog. models.yaml + overrides are reframed as launch recipes matched to an on-disk model by repo; an on-disk model with no recipe is flagged needs_setup and its launch settings are inferred from its config.json for a one-time operator confirmation (discovery.py). - delete now removes weights AND the menu card (delete_from_disk sweeps all hosts; the delete endpoint resolves keys via the live menu) - new GET /api/models/suggest; /api/models returns the menu + a recipes list (download autocomplete); GET /api/models/disk-status removed - dropped the two legacy Qwen recipes (235B FP8, 2.5 72B) - tests: +test_discovery.py (cache parsing, infer_recipe, build_menu merge)	2026-06-18 11:09:56 -05:00
Keysat	26070eb191	v0.24.0:0 - configurable cluster topology (vllm container name, hide services, second-vllm monitor) Make the cluster topology configurable so an adopter wired differently (vLLM on both Sparks, port 8000, different container name, no Parakeet) can monitor without forking. Covers the OpenClaw report P4/P5/#6. - VLLM_CONTAINER override (default vllm_node), validated at the boundary and quote_arg-quoted into the swap log-tail + pre-flight validator exec. - DISABLED_SERVICES list: hidden services show no tile and are skipped by status/deep-health/connectivity probes (kills the Parakeet-on-8000 collision). - kind: vllm custom service monitors a second Spark's vLLM via the shared probe_vllm_endpoint; /api/endpoints gains a disabled flag. Swap mechanism intentionally not generalized to raw docker run (that's coordination, roadmap item 4).	2026-06-17 23:03:33 -05:00
Keysat	90394f891b	docs: v0.23.0 published, live install pending (mDNS); runbook sideload troubleshooting	2026-06-17 22:36:41 -05:00
Keysat	e783653ef0	v0.23.0:0 - local / fine-tuned model support Add models that live as a directory on a Spark (e.g. LoRA-merged fine-tunes), not just Hugging Face repos. - ModelDef gains local_path; a model must set exactly one of repo / local_path. The validator also enforces the local-path whitelist and that any --chat-template lives inside local_path (only that dir is mounted). - build_launch_command bind-mounts the dir into the vLLM container at the SAME host==container path via the launch script's VLLM_SPARK_EXTRA_DOCKER_ARGS hook, then `vllm serve <dir>`. No launch-cluster.sh change (verified the upstream expands that var unquoted; contract noted in runbook.md). - shellsafe.validate_local_path: absolute path, charset whitelist, no '.'/'..'. - POST /api/models validates the full entry via ModelDef before persisting, so a bad entry can't be written and then break catalog load; _merge_overrides skips an invalid override entry instead of failing the whole catalog. - disk.py size-probes a local path with du; disk-delete refused for local models. - UI: "+ Add local model" dialog, `local` badge, path shown instead of an HF link, delete button hidden for local models. - Tests: local launch + injection round-trip, chat-template location, traversal, exactly-one-source, _merge_overrides skip-invalid (94 pass). Reviewer-agent pass; findings addressed.	2026-06-17 22:27:41 -05:00
Keysat	c179389731	docs: trim Current state post-matrix-bridge ship; add bot-tile ops note to runbook	2026-06-15 23:18:28 -05:00
Keysat	98988057a2	v0.18.0:1 - scrub owner-specific hostnames, ips, usernames, names from tracked files Replace real cluster IPs/hosts/usernames and example names with neutral placeholders across docs, ops notes, package install text, and the offline redaction test; delete the obsolete build-time starter prompt. Closes the portability audit's single blocker. No runtime behavior change.	2026-06-12 15:07:34 -05:00
Keysat	8d839e3714	v0.13.0:4 - redaction gateway, embeddings proxy, expanded audio API - Add redaction gateway (redaction_gateway.py, redaction/ scrub + tests) - Add embeddings proxy and spark_embed service (Dockerfile + main.py) - Expand audio_proxy with speaker-aware handling; deep_health/health/server updates - Package: configureSparks action + sparkConfig model updates, manifest/main wiring - Docs: AUDIO_API, EMBEDDINGS, REDACTION_GATEWAY; HANDOFF and runbook/known-issues refresh	2026-06-11 17:45:57 -05:00
Grant	87334f85f0	Add per-model descriptions + repo-cleanup polish - models.yaml: add 'description' field for all 5 models (generic, anyone-can-use) - ModelDef gains optional description: str \| None field - UI: render description below meta tags; mute the repo line further - escapeHtml() for safety in case descriptions/names contain HTML chars - Update runbook: how to add a new model with description	2026-05-12 10:19:09 -05:00
Grant	34bdbb7aba	Add Spark prerequisites section to runbook (spark-vllm-docker is upstream + Spark-side)	2026-05-12 10:05:17 -05:00
Grant	342e150266	Add safe optimization flags to gemma4 + qwen36 (fastsafetensors, prefix-caching, fp8 kv) Aligned with sibling recipes in eugr/spark-vllm-docker. Applies on next swap to each model. First real swap gemma4 -> qwen36 succeeded in 5:30 with --moe_backend=flashinfer_cutlass.	2026-05-12 09:49:08 -05:00
Grant	ae8efa1754	Initial scaffold: image/ FastAPI app, models.yaml, docs - image/ FastAPI app: /api/status, /api/swap, /api/swap/{id}/stream, /api/test-connection - models.yaml: 5-model catalog (qwen3-vl, gemma4, qwen36, qwen3-235b-fp8, qwen25-72b) - README, runbook, known-issues - Dry-run swap verified against live Spark 1 (gemma4 currently loaded)	2026-05-12 09:29:13 -05:00

11 Commits