From 34bdbb7abaef2635fb4edf966bf193f08bb964d8 Mon Sep 17 00:00:00 2001
From: Grant
Date: Tue, 12 May 2026 10:05:17 -0500
Subject: [PATCH] Add Spark prerequisites section to runbook (spark-vllm-docker is upstream + Spark-side)

---
 runbook.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/runbook.md b/runbook.md
index 598e248..1e1055a 100644
--- a/runbook.md
+++ b/runbook.md
@@ -2,6 +2,18 @@
 
 Operating notes for running and maintaining the cluster via spark-control.
 
+## Prerequisites (per Spark)
+
+spark-control is a **controller**, not a runtime. Each Spark in your cluster must already have the upstream `eugr/spark-vllm-docker` project set up:
+
+1. Clone `https://github.com/eugr/spark-vllm-docker` to `~/spark-vllm-docker` on Spark 1 (the head node).
+2. Build the vLLM container: `./build-and-copy.sh -c` (on a cluster) or `./build-and-copy.sh` (solo).
+3. Pre-download any models you want in the catalog: `./hf-download.sh -c --copy-parallel`.
+4. Verify: `./launch-cluster.sh status` returns sensibly.
+5. Set up passwordless SSH from your Start9 server's spark-control container to each Spark (use the Show Public Key action — see README.md "Post-install setup").
+
+Sharing this package with someone else who has a similar dual-DGX-Spark setup: they do the same per-Spark prerequisites, then sideload the `.s9pk` on their Start9 and run the setup actions.
+
 ## Recent successful swaps
 
 - **2026-05-12 — gemma4 → qwen36** via `POST /api/swap` from laptop dev server. ~5:30 to "Application startup complete." Inference works (`/v1/chat/completions` returns reasoning content via `reasoning` field). `--moe_backend=flashinfer_cutlass` confirmed valid by vLLM (logged "Using 'FLASHINFER_CUTLASS' NvFp4 MoE backend").