53 Commits

Author SHA1 Message Date
Grant 342e150266 Add safe optimization flags to gemma4 + qwen36 (fastsafetensors, prefix-caching, FP8 KV cache)
Aligned with sibling recipes in eugr/spark-vllm-docker. Takes effect the next time each model is swapped in.
First real swap (gemma4 -> qwen36) succeeded in 5:30 with --moe_backend=flashinfer_cutlass.
2026-05-12 09:49:08 -05:00
Grant dd9d53060b Add StartOS 0.4 package scaffold (manifest, main, interfaces, 2 actions)
- package/Makefile + s9pk.mk + package.json + tsconfig.json
- startos/manifest: dockerBuild source pointing at ../image/Dockerfile
- startos/main: reads /data/config.yaml reactively, passes env vars to container
- startos/interfaces: binds port 9999 as HTTP UI
- startos/actions: showPublicKey (read /data/ssh/id_ed25519.pub), configureSparks
- TS + JS bundle compile cleanly (tsc --noEmit, ncc build)
2026-05-12 09:36:15 -05:00
Grant ae8efa1754 Initial scaffold: image/ FastAPI app, models.yaml, docs
- image/ FastAPI app: /api/status, /api/swap, /api/swap/{id}/stream, /api/test-connection
- models.yaml: 5-model catalog (qwen3-vl, gemma4, qwen36, qwen3-235b-fp8, qwen25-72b)
- README, runbook, known-issues
- Dry-run swap verified against live Spark 1 (gemma4 currently loaded)
2026-05-12 09:29:13 -05:00