v0.18.0:1 - scrub owner-specific hostnames, ips, usernames, names from tracked files
Replace real cluster IPs/hosts/usernames and example names with neutral placeholders across docs, ops notes, package install text, and the offline redaction test; delete the obsolete build-time starter prompt. Closes the portability audit's single blocker. No runtime behavior change.
+5 -5
@@ -37,7 +37,7 @@ These take effect on the **next swap to that model**. If a swap fails after this
 ## Adding a new model

 1. Add an entry to `image/models.yaml`. Required fields: `display_name`, `repo`, `size_gb`, `mode` (`solo` or `cluster`), `vllm_args`. Optional but recommended: `description` (one paragraph — what the model is, what it's good for, how it differs from others; renders below the meta tags in each card), `capabilities` (tags like `[vision, reasoning, tools]`), `expected_ready_seconds`.
-2. Confirm the weights are on the Spark: `ssh <spark-user>@<spark-1-host>.local 'ls ~/.cache/huggingface/hub/'`. If not, download with `./hf-download.sh <repo>` on Spark 1.
+2. Confirm the weights are on the Spark: `ssh <spark-user>@<spark-1-host> 'ls ~/.cache/huggingface/hub/'`. If not, download with `./hf-download.sh <repo>` on Spark 1.
 3. Rebuild + redeploy the package: `cd package && make x86 && make install`.

 If `description` is omitted, the card simply hides that section — no need to populate it for every model. Keep descriptions generic (not user-specific) so the catalog stays portable.
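Step 1 in the hunk above names five required fields and two allowed `mode` values. A minimal pre-flight check could look like the sketch below. It assumes one model entry is piped in on stdin and that each field appears as `key:` at the start of a (possibly indented) line; the real layout of `image/models.yaml` is not shown in this diff, and both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each required field missing from one model entry.
# Assumption: the entry text arrives on stdin, one "key: value" per line.
missing_fields() {
  local entry key
  entry="$(cat)"
  for key in display_name repo size_gb mode vllm_args; do
    # Match the key only at the start of a line, allowing indentation.
    if ! grep -Eq "^[[:space:]]*${key}:" <<<"$entry"; then
      printf '%s\n' "$key"
    fi
  done
}

# Hypothetical helper: succeed only for the two documented mode values.
valid_mode() {
  case "$1" in
    solo|cluster) return 0 ;;
    *) return 1 ;;
  esac
}
```

Empty output from `missing_fields` means all five required keys were found. This is a text match, not a YAML parse, so it would not catch a key nested at the wrong level.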
@@ -47,7 +47,7 @@ If `description` is omitted, the card simply hides that section — no need to p
 If the UI is unavailable and you need to swap by hand:

 ```bash
-ssh <spark-user>@<spark-1-host>.local
+ssh <spark-user>@<spark-1-host>
 cd ~/spark-vllm-docker
 ./launch-cluster.sh stop
 ./launch-cluster.sh --solo -d exec vllm serve RedHatAI/gemma-4-31B-it-NVFP4 \
@@ -64,10 +64,10 @@ docker logs -f vllm_node # wait for "Application startup complete."
 curl -s http://<spark-1-ip>:8888/v1/models | jq .

 # Cluster status (containers up?)
-ssh <spark-user>@<spark-1-host>.local 'cd ~/spark-vllm-docker && ./launch-cluster.sh status'
+ssh <spark-user>@<spark-1-host> 'cd ~/spark-vllm-docker && ./launch-cluster.sh status'

 # Tail current model's logs
-ssh <spark-user>@<spark-1-host>.local 'docker logs --tail 200 -f vllm_node'
+ssh <spark-user>@<spark-1-host> 'docker logs --tail 200 -f vllm_node'

 # Parakeet
 curl -s http://<spark-2-ip>:8000/health
@@ -81,7 +81,7 @@ curl -s http://<spark-2-ip>:8880/health
 If launch-cluster.sh gets stuck:

 ```bash
-ssh <spark-user>@<spark-1-host>.local
+ssh <spark-user>@<spark-1-host>
 cd ~/spark-vllm-docker
 ./launch-cluster.sh stop
 docker ps -aq | xargs -r docker rm -f
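The notes touched by this diff check readiness by hand: watching the logs for "Application startup complete." and then running `curl` against `/v1/models` or `/health`. One way to script that wait is a small retry loop, sketched below. `wait_until` is a hypothetical name, and the attempt count and delay in the usage comment are arbitrary placeholders rather than values taken from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a probe command until it succeeds.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    # Run the probe quietly; success ends the wait immediately.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example with the placeholder address used in the notes (-f makes curl
# return non-zero on HTTP errors):
# wait_until 60 5 curl -sf "http://<spark-1-ip>:8888/v1/models"
```

Because the probe is passed as a plain command, the same loop works for the `/health` endpoints on the second Spark or for any other check that signals readiness through its exit status.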