spark-control

grant/spark-control

Commit Graph

Author

SHA1

Message

Date

Grant

ed54f85442

known-issues: mark magpie crash loop RESOLVED with chown fix recipe

Volume magpie-model-cache was owned by root, container drops to uid 1000. Fix:
docker run --rm -v magpie-model-cache:/cache alpine chown -R 1000:1000 /cache
+ docker restart magpie-tts. After ~3 GB NGC model download, healthy on :9000.

2026-05-12 11:12:25 -05:00
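The recipe in this commit can be sketched as a small Python helper. This is an illustration only: the volume name, container name, and uid come from the commit message, while the function names and the dry-run default are hypothetical and not part of the repository.

```python
import subprocess

# Values taken from the commit message above.
VOLUME = "magpie-model-cache"
CONTAINER = "magpie-tts"
OWNER = "1000:1000"  # uid:gid the container drops to


def chown_fix_commands(volume=VOLUME, container=CONTAINER, owner=OWNER):
    """Return the two docker invocations from the recipe as argv lists."""
    return [
        # One-off alpine container that re-owns the volume contents.
        ["docker", "run", "--rm", "-v", f"{volume}:/cache",
         "alpine", "chown", "-R", owner, "/cache"],
        # Restart the service so it retries the model download.
        ["docker", "restart", container],
    ]


def apply_fix(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in chown_fix_commands():
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Calling `apply_fix()` only prints the commands; `apply_fix(dry_run=False)` would execute them and needs a working Docker daemon.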

Grant

342e150266

Add safe optimization flags to gemma4 + qwen36 (fastsafetensors, prefix-caching, fp8 kv)

Aligned with sibling recipes in eugr/spark-vllm-docker. Applies on next swap to each model.
First real swap gemma4 -> qwen36 succeeded in 5:30 with --moe_backend=flashinfer_cutlass.

2026-05-12 09:49:08 -05:00
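As a rough sketch of what these three optimizations might look like on a vLLM command line: the mapping below uses the vLLM options `--load-format fastsafetensors`, `--enable-prefix-caching`, and `--kv-cache-dtype fp8`, but the exact flags stored in models.yaml are not shown in this commit, so the mapping is an assumption. The helper name and the model path are hypothetical.

```python
# Assumed mapping from the commit's shorthand to vLLM CLI options.
# The real entries in models.yaml may be spelled differently.
OPTIMIZATION_FLAGS = {
    "fastsafetensors": ["--load-format", "fastsafetensors"],
    "prefix-caching": ["--enable-prefix-caching"],
    "fp8 kv": ["--kv-cache-dtype", "fp8"],
}


def build_serve_args(model_path, extra=()):
    """Assemble a `vllm serve` argv with the three optimizations applied."""
    args = ["vllm", "serve", model_path]
    for flags in OPTIMIZATION_FLAGS.values():
        args.extend(flags)
    # Model-specific options are appended last, unchanged.
    args.extend(extra)
    return args


if __name__ == "__main__":
    # "/models/example" is a placeholder path, not a path from the repo.
    # The extra flag is copied verbatim from the commit message.
    print(" ".join(build_serve_args(
        "/models/example", extra=["--moe_backend=flashinfer_cutlass"])))
```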

Grant

ae8efa1754

Initial scaffold: image/ FastAPI app, models.yaml, docs

- image/ FastAPI app: /api/status, /api/swap, /api/swap/{id}/stream, /api/test-connection
- models.yaml: 5-model catalog (qwen3-vl, gemma4, qwen36, qwen3-235b-fp8, qwen25-72b)
- README, runbook, known-issues
- Dry-run swap verified against live Spark 1 (gemma4 currently loaded)

2026-05-12 09:29:13 -05:00
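A minimal client sketch for the endpoints listed in this commit. Only the paths come from the commit message. The base URL, the HTTP method, the JSON response, and the reading of `{id}` as a swap identifier are all assumptions, and the helper names are hypothetical.

```python
import json
import urllib.request

# Hypothetical address; the commit does not say where the app listens.
BASE_URL = "http://localhost:8000"


def endpoint(path, base_url=BASE_URL):
    """Join the base URL and an API path without doubling the slash."""
    return base_url.rstrip("/") + path


def swap_stream_url(swap_id, base_url=BASE_URL):
    """URL for /api/swap/{id}/stream; {id} is assumed to be a swap id."""
    return endpoint(f"/api/swap/{swap_id}/stream", base_url)


def get_status(base_url=BASE_URL, timeout=10):
    """Fetch /api/status, assuming it answers GET with a JSON body."""
    url = endpoint("/api/status", base_url)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

The URL helpers run without a server; `get_status()` needs the app to be reachable at the chosen base URL.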

3 Commits