Add Spark prerequisites section to runbook (spark-vllm-docker is upstream + Spark-side)
This commit is contained in:
+12
@@ -2,6 +2,18 @@
|
||||
|
||||
Operating notes for running and maintaining the cluster via spark-control.
|
||||
|
||||
## Prerequisites (per Spark)
|
||||
|
||||
spark-control is a **controller**, not a runtime. Each Spark in your cluster must already have the upstream `eugr/spark-vllm-docker` project set up:
|
||||
|
||||
1. Clone `https://github.com/eugr/spark-vllm-docker` to `~/spark-vllm-docker` on Spark 1 (the head node).
|
||||
2. Build the vLLM container: `./build-and-copy.sh -c` (on a cluster) or `./build-and-copy.sh` (solo).
|
||||
3. Pre-download any models you want in the catalog: `./hf-download.sh <repo> -c --copy-parallel`.
|
||||
4. Verify: `./launch-cluster.sh status` returns sensibly.
|
||||
5. Set up passwordless SSH from your Start9 server's spark-control container to each Spark (use the Show Public Key action — see README.md "Post-install setup").
|
||||
|
||||
Sharing this package with someone else who has a similar dual-DGX-Spark setup: they do the same per-Spark prerequisites, then sideload the `.s9pk` on their Start9 and run the setup actions.
|
||||
|
||||
## Recent successful swaps
|
||||
|
||||
- **2026-05-12 — gemma4 → qwen36** via `POST /api/swap` from laptop dev server. ~5:30 to "Application startup complete." Inference works (`/v1/chat/completions` returns reasoning content via `reasoning` field). `--moe_backend=flashinfer_cutlass` confirmed valid by vLLM (logged "Using 'FLASHINFER_CUTLASS' NvFp4 MoE backend").
|
||||
|
||||
Reference in New Issue
Block a user