Compare commits
4 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| e783653ef0 | |||
| 57a893000e | |||
| 56f7ea4444 | |||
| aaad57d88f |
@@ -55,7 +55,7 @@ Subsystem guidance lives in `docs/guides/` and loads when matching files are tou
|
|||||||
|
|
||||||
## Current state
|
## Current state
|
||||||
|
|
||||||
- **Working (v0.21.0:1, installed and serving):** swap dashboard; chat / transcribe / diarize(+chunk) / TTS proxies; embeddings + rerank + hybrid search (Qdrant); `/scrub` + `/rehydrate`; label-merge incl. dual-channel; per-Spark SSH-key copy + WireGuard `VPN <ip>` hardware-card badge. Spark 2 audio stack healthy. Security hardening (v0.19.0:0 — shellsafe SSH-injection guard, Qdrant path-injection, same-origin CSRF guard) shipped and stable; evidence in `EVALUATION.md`.
|
- **Working (v0.22.0:0, installed and serving):** swap dashboard; chat / transcribe / diarize(+chunk) / TTS proxies; embeddings + rerank + hybrid search (Qdrant); `/scrub` + `/rehydrate`; label-merge incl. dual-channel; per-Spark SSH-key copy + WireGuard `VPN <ip>` hardware-card badge; configurable vLLM port (Configure Sparks field, blank ⇒ 8888). Spark 2 audio stack healthy. Security hardening (v0.19.0:0 — shellsafe SSH-injection guard, Qdrant path-injection, same-origin CSRF guard) shipped and stable; evidence in `EVALUATION.md`.
|
||||||
- **matrix-bridge bot tile (done, v0.21.0:1, verified live):** `bot`-kind service tile — status badge from docker-state only (no HTTP port), plus **Update** / Restart / Stop/Start / **View logs**. Code: `app/matrix_bridge.py` + `/api/matrix-bridge/{update,logs}` (update streams; 25-min cap; fail-loud). Driven directly as `modelo` on Spark 2 (**no `sudo -iu`** — spark2 has no passwordless sudo). User is a blank-default Configure-Sparks field (`matrix_bridge_user`); blank → tile hidden (portable). Host reuses `spark2_host` (`192.168.1.87` = the bot's box `spark-32d0`); container/dir/branch are env-overridable defaults. **Load-bearing ops dep:** Update's `git fetch` runs as `modelo`, which needs `modelo`'s `~/.ssh/config` pinning the Gitea deploy key with `IdentitiesOnly yes` — else the wrong key is offered and Gitea denies (publickey). Optional next, only if the bot dev asks: Docker `HEALTHCHECK` for running-but-disconnected detection (spec §Note).
|
- **matrix-bridge bot tile (done, v0.21.0:1, verified live):** `bot`-kind service tile — status badge from docker-state only (no HTTP port), plus **Update** / Restart / Stop/Start / **View logs**. Code: `app/matrix_bridge.py` + `/api/matrix-bridge/{update,logs}` (update streams; 25-min cap; fail-loud). Driven directly as `modelo` on Spark 2 (**no `sudo -iu`** — spark2 has no passwordless sudo). User is a blank-default Configure-Sparks field (`matrix_bridge_user`); blank → tile hidden (portable). Host reuses `spark2_host` (`192.168.1.87` = the bot's box `spark-32d0`); container/dir/branch are env-overridable defaults. **Load-bearing ops dep:** Update's `git fetch` runs as `modelo`, which needs `modelo`'s `~/.ssh/config` pinning the Gitea deploy key with `IdentitiesOnly yes` — else the wrong key is offered and Gitea denies (publickey). Optional next, only if the bot dev asks: Docker `HEALTHCHECK` for running-but-disconnected detection (spec §Note).
|
||||||
- **Tests:** offline pytest harness in `image/tests/` — `cd image && .venv/bin/python -m pytest` (70 passing). Covers `build_launch_command` (incl. the shell-injection round-trip), the transcript↔diarizer label-merge, the `shellsafe` validators, and `matrix_bridge.build_update_command` (+ phase detection). Mock-heavy swap/proxy tests deliberately skipped (low ROI). Redaction + live-audio suites remain standalone scripts.
|
- **Tests:** offline pytest harness in `image/tests/` — `cd image && .venv/bin/python -m pytest` (70 passing). Covers `build_launch_command` (incl. the shell-injection round-trip), the transcript↔diarizer label-merge, the `shellsafe` validators, and `matrix_bridge.build_update_command` (+ phase detection). Mock-heavy swap/proxy tests deliberately skipped (low ROI). Redaction + live-audio suites remain standalone scripts.
|
||||||
- **Signal Engine "flakiness":** diagnosed as *not* a server bug — transient 1–4s unresponsiveness while the single GPU is busy. Client-side remedy (in-flight cap 2 / ceiling 3 / retry-on-timeout+503) drafted and **forwarded to that dev (owner confirmed 2026-06-15)**. Awaiting whether they want the measured concurrency knee.
|
- **Signal Engine "flakiness":** diagnosed as *not* a server bug — transient 1–4s unresponsiveness while the single GPU is busy. Client-side remedy (in-flight cap 2 / ceiling 3 / retry-on-timeout+503) drafted and **forwarded to that dev (owner confirmed 2026-06-15)**. Awaiting whether they want the measured concurrency knee.
|
||||||
@@ -63,4 +63,4 @@ Subsystem guidance lives in `docs/guides/` and loads when matching files are tou
|
|||||||
- **Known limits:** `/health` blips while the GPU is busy (mitigated client-side); dual-channel can miss a quiet local word under loud remote bleed; connectivity log misses sub-5s outages between 5s polls; diarizer caps at 4 speakers; matrix-bridge badge won't visibly flip on a fast `docker restart` (status re-checked only after the command returns).
|
- **Known limits:** `/health` blips while the GPU is busy (mitigated client-side); dual-channel can miss a quiet local word under loud remote bleed; connectivity log misses sub-5s outages between 5s polls; diarizer caps at 4 speakers; matrix-bridge badge won't visibly flip on a fast `docker restart` (status re-checked only after the command returns).
|
||||||
- **Infra gotcha (safety):** passwordless sudo is NOT configured on spark2 — design unprivileged probes for any Spark feature (the badge uses `ip`, not `sudo wg show`). spark2 sits on the `starttunnel` WireGuard subnet (`10.59.211.6/24`, survives reboot). Owner declined SSH-key rotation after the 2026-06-12 history scrub (only the key *name* leaked) — don't re-flag.
|
- **Infra gotcha (safety):** passwordless sudo is NOT configured on spark2 — design unprivileged probes for any Spark feature (the badge uses `ip`, not `sudo wg show`). spark2 sits on the `starttunnel` WireGuard subnet (`10.59.211.6/24`, survives reboot). Owner declined SSH-key rotation after the 2026-06-12 history scrub (only the key *name* leaked) — don't re-flag.
|
||||||
- **Hosting:** self-hosted Gitea — remote `gitea`, branch `master`, over SSH; push after committing. (Wart: commit `8d839e3` is mislabeled `v0.13.0:4` but contains through v0.18.0:0.)
|
- **Hosting:** self-hosted Gitea — remote `gitea`, branch `master`, over SSH; push after committing. (Wart: commit `8d839e3` is mislabeled `v0.13.0:4` but contains through v0.18.0:0.)
|
||||||
- **Next — committed 2026-06-17: OpenClaw/Johnny-5 coexistence epic (full plan + design stance in `ROADMAP.md` → "Cluster coordination").** Stance: Spark Control = control plane / GPU arbiter, **not** a job runner; business cron jobs live in separate services that *call* its swap API (swaps are already API-driven via `POST /api/swap`). Sequence: (1) **configurable `VLLM_PORT`** — DONE in tree, staged as **v0.22.0:0** (Configure-Sparks field, blank ⇒ 8888; + `_env_int` hardening in `config.py` so a blank/bad port no longer crashes startup, killing a P3 tech-debt item). **Not yet built/installed/committed — awaiting go/no-go.** (2) local-path/fine-tuned models (in ROADMAP under Dashboard). (3) configurable topology (service→Spark→port map + container names). (4) coordination layer (swap lock + swap webhook + schedule visibility) — only when our own automation lands. Still-open older threads: audio concurrency sweep (only if the Signal Engine dev wants the knee; needs a quiet window); optional matrix-bridge Docker `HEALTHCHECK` if the bot dev asks; Parakeet long-audio guard deferred (rationale in ROADMAP).
|
- **Next — committed 2026-06-17: OpenClaw/Johnny-5 coexistence epic (full plan + design stance in `ROADMAP.md` → "Cluster coordination").** Stance: Spark Control = control plane / GPU arbiter, **not** a job runner; business cron jobs live in separate services that *call* its swap API (swaps are already API-driven via `POST /api/swap`). Sequence: (1) **configurable `VLLM_PORT`** — SHIPPED **v0.22.0:0** (Configure-Sparks field, blank ⇒ 8888; + `_env_int` hardening in `config.py` so a blank/bad port no longer crashes startup, killing a P3 tech-debt item). Committed `136a471`, pushed, tagged `v0.22.0`, rebuilt clean, installed, and **published to the self-hosted Gitea Releases** 2026-06-17 (`make release` → `scripts/gitea-release.sh`, takes `GITEA_URL` + a write token). **Distribution model (decided 2026-06-17):** Gitea Releases + a read-only token the adopter's agent uses to pull the latest s9pk (`GET /api/v1/repos/grant/spark-control/releases/latest` → download the `.s9pk` asset → sideload). Note: Gitea returns `browser_download_url` on its `.local` ROOT_URL, which won't resolve off-LAN — a remote adopter pulls via whatever address reaches the Gitea (the WireGuard IP). (2) **local-path/fine-tuned models** — DONE in tree, staged as **v0.23.0:0** (`ModelDef.local_path` + exactly-one-source validator; swap bind-mounts the dir at the same container path via the launch script's `VLLM_SPARK_EXTRA_DOCKER_ARGS` hook, **no `launch-cluster.sh` change**; "+ Add local model" UI form + `local` badge; `validate_local_path`; disk-delete refused for local; 94 tests pass; verified via TestClient). **Reviewer-agent pass done; findings addressed:** path validation folded into the `ModelDef` validator (so YAML/override-added local models are checked too), a chat-template-must-live-inside-`local_path` guard, `_merge_overrides` skips a bad entry instead of breaking the whole catalog, and the `VLLM_SPARK_EXTRA_DOCKER_ARGS` unquoted-expansion contract is documented in `runbook.md`. **Not yet built/installed/published — awaiting go/no-go.** Next: (3) configurable topology (service→Spark→port map + container names); (4) coordination layer (swap lock + swap webhook + schedule visibility) — only when our own automation lands. Still-open older threads: audio concurrency sweep (only if the Signal Engine dev wants the knee; needs a quiet window); optional matrix-bridge Docker `HEALTHCHECK` if the bot dev asks; Parakeet long-audio guard deferred (rationale in ROADMAP).
|
||||||
|
|||||||
+1
-2
@@ -10,7 +10,7 @@ Driven by the one other Spark Control adopter (a colleague running OpenClaw + cr
|
|||||||
|
|
||||||
Sequenced:
|
Sequenced:
|
||||||
1. **Configurable `VLLM_PORT`** — DONE, v0.22.0:0. Field in Configure Sparks (blank ⇒ 8888); numeric-setting parsing hardened so a blank/bad value falls back instead of crashing startup. Was the immediate "vLLM unreachable" bug for an adopter on port 8000.
|
1. **Configurable `VLLM_PORT`** — DONE, v0.22.0:0. Field in Configure Sparks (blank ⇒ 8888); numeric-setting parsing hardened so a blank/bad value falls back instead of crashing startup. Was the immediate "vLLM unreachable" bug for an adopter on port 8000.
|
||||||
2. **Local-path / fine-tuned model support** — see the dedicated item under "## Dashboard" below. Independently wanted; his merged `ten31-v2` (a directory, not an HF repo) is the motivating case.
|
2. **Local-path / fine-tuned model support** — DONE, v0.23.0:0. Catalog/`ModelDef` gained `local_path` (exactly one of `repo`/`local_path`); swap bind-mounts the dir into the vLLM container at the same path via the launch script's `VLLM_SPARK_EXTRA_DOCKER_ARGS` hook (no `launch-cluster.sh` change); "+ Add local model" form + `local` badge; disk-delete refused for local models; `validate_local_path` boundary check. His merged `ten31-v2` was the motivating case.
|
||||||
3. **Configurable topology** — make the service→Spark→port map and container names configurable so the package stops assuming our exact layout. Lets an adopter monitor vLLM on *both* Sparks, use a different container name, and stop the Parakeet probe from hitting a vLLM that shares its port — without forking. (Covers report P4 multi-Spark vLLM, P5 container name, and the Parakeet-port collision #6.)
|
3. **Configurable topology** — make the service→Spark→port map and container names configurable so the package stops assuming our exact layout. Lets an adopter monitor vLLM on *both* Sparks, use a different container name, and stop the Parakeet probe from hitting a vLLM that shares its port — without forking. (Covers report P4 multi-Spark vLLM, P5 container name, and the Parakeet-port collision #6.)
|
||||||
4. **Coordination layer** — build when our own automation actually lands (zero value until something other than the dashboard swaps models):
|
4. **Coordination layer** — build when our own automation actually lands (zero value until something other than the dashboard swaps models):
|
||||||
- **Swap lock** with holder + TTL (`POST` / `GET` / `DELETE /api/swap/lock`). An external scheduler acquires it before swapping; the dashboard then refuses manual swaps and shows who holds the GPU and until when. Enforced by the swap path, not advisory.
|
- **Swap lock** with holder + TTL (`POST` / `GET` / `DELETE /api/swap/lock`). An external scheduler acquires it before swapping; the dashboard then refuses manual swaps and shows who holds the GPU and until when. Enforced by the swap path, not advisory.
|
||||||
@@ -34,7 +34,6 @@ Sequenced:
|
|||||||
- Second audio worker / queueing layer; revisit which services share Spark 2.
|
- Second audio worker / queueing layer; revisit which services share Spark 2.
|
||||||
|
|
||||||
## Dashboard
|
## Dashboard
|
||||||
- Support local-path / fine-tuned models in the swap catalog. Today the catalog is static (`models.yaml` + custom overrides) and the "Add custom model" path (`POST /api/models`) only accepts an HF `org/name` repo (`shellsafe._HF_REPO_RE`), so a model that exists only as a directory on a Spark (the usual fine-tuning output) can't be registered or swapped. Needs: (a) a "local model" add form/field taking a Spark-side directory path, with its own safe validation instead of the `org/name` regex (path whitelist + `shlex.quote`, no traversal); (b) `models.build_launch_command` / `launch-cluster.sh` able to `vllm serve <path>`; (c) `disk.py` size-probe handling a path instead of deriving the HF cache dir from a repo id. Raised 2026-06-15 — a colleague's locally fine-tuned model doesn't appear because nothing scans the machine; the list is a curated catalog, not a discovery probe.
|
|
||||||
- Per-model configurable vLLM flags editable from the UI (today: edit `models.yaml` and rebuild).
|
- Per-model configurable vLLM flags editable from the UI (today: edit `models.yaml` and rebuild).
|
||||||
- Spark host update actions (OS/driver) from the UI.
|
- Spark host update actions (OS/driver) from the UI.
|
||||||
- Open WebUI link-out integration; richer per-service detail views.
|
- Open WebUI link-out integration; richer per-service detail views.
|
||||||
|
|||||||
@@ -25,6 +25,22 @@ npm run prettier # prettier --write startos (no semicolons, single quotes, tra
|
|||||||
- Version format is `X.Y.Z:N` (`:N` = revision). Bump in `package/startos/versions/v0_1_0.ts`; **replace** the release notes — never leave old notes behind under an extra key (any unknown key fails `tsc`).
|
- Version format is `X.Y.Z:N` (`:N` = revision). Bump in `package/startos/versions/v0_1_0.ts`; **replace** the release notes — never leave old notes behind under an extra key (any unknown key fails `tsc`).
|
||||||
- New external-facing endpoints get noted in release notes for downstream app developers (Recap Relay, Ten31 Transcripts, CRM, Signal Engine consume these APIs).
|
- New external-facing endpoints get noted in release notes for downstream app developers (Recap Relay, Ten31 Transcripts, CRM, Signal Engine consume these APIs).
|
||||||
|
|
||||||
|
## Releasing to Gitea
|
||||||
|
|
||||||
|
The s9pk is distributed via Gitea **Releases** (the binary is gitignored — never commit it). Adopters pull the latest asset with a read-only token. Per-version ritual:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# 1. bump version in startos/versions/v0_1_0.ts (+ replace release notes), then:
|
||||||
|
cd package && make x86 # build
|
||||||
|
# 2. commit + push the source change
|
||||||
|
git tag vX.Y.Z && git push gitea vX.Y.Z # tag — plain vX.Y.Z, NO ':' (git refs forbid it)
|
||||||
|
make install # optional: sideload to your own server (restarts it — go/no-go)
|
||||||
|
# 3. publish the s9pk as a release asset (needs a write-scoped token):
|
||||||
|
GITEA_URL=https://<gitea-host> GITEA_TOKEN=<write-token> make release
|
||||||
|
```
|
||||||
|
|
||||||
|
`make release` → `scripts/gitea-release.sh`: creates/reuses the release for the tag and uploads (replacing) the s9pk asset; idempotent, fails loud on real HTTP errors. `GITEA_INSECURE=1` skips TLS verify for a self-signed LAN cert. Hand adopters a **read-only** token (repository: Read), ideally on a dedicated reader account; their agent then `GET`s `/api/v1/repos/<owner>/spark-control/releases/latest` and downloads the `.s9pk` asset. Note Gitea returns `browser_download_url` on its configured ROOT_URL (may be a `.local` name) — an off-LAN adopter pulls via whatever address actually reaches the Gitea.
|
||||||
|
|
||||||
## Layout
|
## Layout
|
||||||
|
|
||||||
- `package/startos/` — manifest, interfaces, actions (`configureSparks`, `showPublicKey`), `versions/v0_1_0.ts` (current version string + release notes).
|
- `package/startos/` — manifest, interfaces, actions (`configureSparks`, `showPublicKey`), `versions/v0_1_0.ts` (current version string + release notes).
|
||||||
|
|||||||
+41
-4
@@ -15,6 +15,7 @@ from dataclasses import dataclass
|
|||||||
from typing import Optional
|
from typing import Optional
|
||||||
|
|
||||||
from .config import Settings
|
from .config import Settings
|
||||||
|
from .shellsafe import quote_arg
|
||||||
from .ssh import ssh_run
|
from .ssh import ssh_run
|
||||||
|
|
||||||
|
|
||||||
@@ -76,16 +77,52 @@ async def probe_host(host: str, user: str, repo: str, settings: Settings) -> Hos
|
|||||||
return HostDiskResult(host=host, on_disk=True, size_bytes=size)
|
return HostDiskResult(host=host, on_disk=True, size_bytes=size)
|
||||||
|
|
||||||
|
|
||||||
async def probe_disk(repo: str, mode: str, settings: Settings) -> DiskStatus:
|
async def probe_local_host(host: str, user: str, path: str, settings: Settings) -> HostDiskResult:
|
||||||
"""Probe one model across the relevant Sparks based on its mode (solo|cluster)."""
|
"""Return whether a local model directory exists on this host and its size.
|
||||||
|
|
||||||
|
For locally fine-tuned models (a Spark directory, not an HF cache entry). The
|
||||||
|
path is whitelisted at the API boundary (shellsafe.validate_local_path); we
|
||||||
|
shlex-quote it here in depth.
|
||||||
|
"""
|
||||||
|
if not host or not user:
|
||||||
|
return HostDiskResult(host=host or "?", on_disk=False, error="host not configured")
|
||||||
|
qp = quote_arg(path)
|
||||||
|
cmd = f"if [ -d {qp} ]; then du -sb {qp} 2>/dev/null | cut -f1; else echo MISSING; fi"
|
||||||
|
rc, out, err = await ssh_run(host, user, cmd, settings, timeout=20.0)
|
||||||
|
if rc != 0:
|
||||||
|
return HostDiskResult(host=host, on_disk=False, error=(err or out).strip() or f"rc={rc}")
|
||||||
|
raw = out.strip()
|
||||||
|
if raw == "MISSING" or raw == "":
|
||||||
|
return HostDiskResult(host=host, on_disk=False)
|
||||||
|
try:
|
||||||
|
size = int(raw.splitlines()[-1])
|
||||||
|
except ValueError:
|
||||||
|
return HostDiskResult(host=host, on_disk=False, error=f"unparsable du output: {raw!r}")
|
||||||
|
return HostDiskResult(host=host, on_disk=True, size_bytes=size)
|
||||||
|
|
||||||
|
|
||||||
|
async def probe_disk(
|
||||||
|
repo: str, mode: str, settings: Settings, *, local_path: str | None = None
|
||||||
|
) -> DiskStatus:
|
||||||
|
"""Probe one model across the relevant Sparks based on its mode (solo|cluster).
|
||||||
|
|
||||||
|
A local model (local_path set) is probed by directory; otherwise by HF cache.
|
||||||
|
"""
|
||||||
hosts: list[tuple[str, str]] = [(settings.spark1_host, settings.spark1_user)]
|
hosts: list[tuple[str, str]] = [(settings.spark1_host, settings.spark1_user)]
|
||||||
if mode == "cluster" and settings.spark2_host:
|
if mode == "cluster" and settings.spark2_host:
|
||||||
hosts.append((settings.spark2_host, settings.spark2_user))
|
hosts.append((settings.spark2_host, settings.spark2_user))
|
||||||
|
|
||||||
results = await asyncio.gather(*(probe_host(h, u, repo, settings) for h, u in hosts))
|
if local_path:
|
||||||
|
results = await asyncio.gather(
|
||||||
|
*(probe_local_host(h, u, local_path, settings) for h, u in hosts)
|
||||||
|
)
|
||||||
|
key = local_path
|
||||||
|
else:
|
||||||
|
results = await asyncio.gather(*(probe_host(h, u, repo, settings) for h, u in hosts))
|
||||||
|
key = repo
|
||||||
on_disk = any(r.on_disk for r in results)
|
on_disk = any(r.on_disk for r in results)
|
||||||
total = sum(r.size_bytes for r in results)
|
total = sum(r.size_bytes for r in results)
|
||||||
return DiskStatus(repo=repo, on_disk=on_disk, total_bytes=total, per_host=list(results))
|
return DiskStatus(repo=key, on_disk=on_disk, total_bytes=total, per_host=list(results))
|
||||||
|
|
||||||
|
|
||||||
async def delete_host(host: str, user: str, repo: str, settings: Settings) -> HostDiskResult:
|
async def delete_host(host: str, user: str, repo: str, settings: Settings) -> HostDiskResult:
|
||||||
|
|||||||
+78
-8
@@ -1,15 +1,33 @@
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
import logging
|
||||||
from typing import Literal, Optional
|
from typing import Literal, Optional
|
||||||
import yaml
|
import yaml
|
||||||
from pydantic import BaseModel, Field
|
from pydantic import BaseModel, Field, model_validator
|
||||||
|
|
||||||
from .overrides import apply_knobs_to_args, load_overrides
|
from .overrides import apply_knobs_to_args, load_overrides
|
||||||
from .shellsafe import quote_arg, quote_args
|
from .shellsafe import quote_arg, quote_args, validate_local_path
|
||||||
|
|
||||||
|
log = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
def _chat_template_path(vllm_args: list[str]) -> str | None:
|
||||||
|
"""Extract the path from a `--chat-template=<path>` arg, if present."""
|
||||||
|
for a in vllm_args:
|
||||||
|
if a.startswith("--chat-template="):
|
||||||
|
return a.split("=", 1)[1]
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _is_within(path: str, base: str) -> bool:
|
||||||
|
"""True if `path` is `base` itself or lives inside it (lexical check)."""
|
||||||
|
base = base.rstrip("/")
|
||||||
|
return path == base or path.startswith(base + "/")
|
||||||
|
|
||||||
|
|
||||||
class ModelDef(BaseModel):
|
class ModelDef(BaseModel):
|
||||||
display_name: str
|
display_name: str
|
||||||
repo: str
|
repo: str = "" # HF 'org/name'; empty for a local model
|
||||||
|
local_path: str | None = None # absolute dir on the Spark; set => local model
|
||||||
size_gb: float
|
size_gb: float
|
||||||
mode: Literal["solo", "cluster"]
|
mode: Literal["solo", "cluster"]
|
||||||
capabilities: list[str] = Field(default_factory=list)
|
capabilities: list[str] = Field(default_factory=list)
|
||||||
@@ -19,6 +37,38 @@ class ModelDef(BaseModel):
|
|||||||
knobs: dict | None = None # user-customized; merged at launch time
|
knobs: dict | None = None # user-customized; merged at launch time
|
||||||
custom: bool = False # True if this came from /data overrides
|
custom: bool = False # True if this came from /data overrides
|
||||||
|
|
||||||
|
@model_validator(mode="after")
|
||||||
|
def _validate_source(self) -> "ModelDef":
|
||||||
|
if bool(self.repo) == bool(self.local_path):
|
||||||
|
raise ValueError(
|
||||||
|
f"model {self.display_name!r} must set exactly one of 'repo' (HF) "
|
||||||
|
f"or 'local_path' (Spark directory)"
|
||||||
|
)
|
||||||
|
if self.local_path:
|
||||||
|
# Single place that enforces the path whitelist, so YAML/override
|
||||||
|
# entries get the same boundary check as the API. The quote_arg sink
|
||||||
|
# is still defense-in-depth.
|
||||||
|
validate_local_path(self.local_path)
|
||||||
|
# Only local_path is bind-mounted into the vLLM container, so any
|
||||||
|
# --chat-template path must live inside it or vLLM can't find it.
|
||||||
|
tmpl = _chat_template_path(self.vllm_args)
|
||||||
|
if tmpl is not None and not _is_within(tmpl, self.local_path):
|
||||||
|
raise ValueError(
|
||||||
|
f"--chat-template path {tmpl!r} must be inside the model "
|
||||||
|
f"directory {self.local_path!r} (only that directory is mounted "
|
||||||
|
f"into the container)"
|
||||||
|
)
|
||||||
|
return self
|
||||||
|
|
||||||
|
@property
|
||||||
|
def is_local(self) -> bool:
|
||||||
|
return bool(self.local_path)
|
||||||
|
|
||||||
|
@property
|
||||||
|
def source(self) -> str:
|
||||||
|
"""What `vllm serve` is pointed at: the local dir if set, else the HF repo."""
|
||||||
|
return self.local_path if self.local_path else self.repo
|
||||||
|
|
||||||
|
|
||||||
class Defaults(BaseModel):
|
class Defaults(BaseModel):
|
||||||
port: int = 8888
|
port: int = 8888
|
||||||
@@ -47,7 +97,8 @@ def _merge_overrides(catalog: Catalog) -> Catalog:
|
|||||||
continue
|
continue
|
||||||
defaults_dump = {
|
defaults_dump = {
|
||||||
"display_name": entry.get("display_name", key),
|
"display_name": entry.get("display_name", key),
|
||||||
"repo": entry["repo"],
|
"repo": entry.get("repo", ""),
|
||||||
|
"local_path": entry.get("local_path"),
|
||||||
"size_gb": float(entry.get("size_gb", 0)),
|
"size_gb": float(entry.get("size_gb", 0)),
|
||||||
"mode": entry.get("mode", "solo"),
|
"mode": entry.get("mode", "solo"),
|
||||||
"capabilities": entry.get("capabilities") or [],
|
"capabilities": entry.get("capabilities") or [],
|
||||||
@@ -57,7 +108,12 @@ def _merge_overrides(catalog: Catalog) -> Catalog:
|
|||||||
"knobs": entry.get("knobs"),
|
"knobs": entry.get("knobs"),
|
||||||
"custom": True,
|
"custom": True,
|
||||||
}
|
}
|
||||||
new_models[key] = ModelDef.model_validate(defaults_dump)
|
# A single malformed override entry (bad path, missing source, etc.) must
|
||||||
|
# not take down the whole catalog — skip it and keep the rest loadable.
|
||||||
|
try:
|
||||||
|
new_models[key] = ModelDef.model_validate(defaults_dump)
|
||||||
|
except Exception as e:
|
||||||
|
log.warning("skipping invalid custom model %r: %s", key, e)
|
||||||
|
|
||||||
return Catalog(defaults=catalog.defaults, models=new_models)
|
return Catalog(defaults=catalog.defaults, models=new_models)
|
||||||
|
|
||||||
@@ -78,7 +134,21 @@ def build_launch_command(key: str, model: ModelDef, defaults: Defaults) -> str:
|
|||||||
solo = "--solo " if model.mode == "solo" else ""
|
solo = "--solo " if model.mode == "solo" else ""
|
||||||
base_args = apply_knobs_to_args(list(model.vllm_args), model.knobs)
|
base_args = apply_knobs_to_args(list(model.vllm_args), model.knobs)
|
||||||
args = [f"--port={defaults.port}", f"--host={defaults.host}", *base_args]
|
args = [f"--port={defaults.port}", f"--host={defaults.host}", *base_args]
|
||||||
# repo + args are user-controlled (custom models, knobs); shlex.quote each so
|
# source + args are user-controlled (custom models, knobs); shlex.quote each
|
||||||
# they cannot break out of the SSH shell command. shlex.split (used by the
|
# so they cannot break out of the SSH shell command. shlex.split (used by the
|
||||||
# vLLM pre-flight validator) cleanly reverses this quoting.
|
# vLLM pre-flight validator) cleanly reverses this quoting.
|
||||||
return f"./launch-cluster.sh {solo}-d exec vllm serve {quote_arg(model.repo)} {quote_args(args)}"
|
prefix = ""
|
||||||
|
if model.local_path:
|
||||||
|
# A local model's directory isn't in the HF cache the launch script
|
||||||
|
# already mounts, so bind-mount it at the SAME path inside the vllm
|
||||||
|
# container via the script's VLLM_SPARK_EXTRA_DOCKER_ARGS hook. Same
|
||||||
|
# path inside and out means `vllm serve <dir>` and any
|
||||||
|
# `--chat-template=<dir>/...` arg both resolve. No launch-cluster.sh
|
||||||
|
# change needed. (The env assignment sits before the script, so the
|
||||||
|
# validator's `serve`-keyed shlex round-trip is unaffected.)
|
||||||
|
mount = quote_arg(f"-v {model.local_path}:{model.local_path}")
|
||||||
|
prefix = f"VLLM_SPARK_EXTRA_DOCKER_ARGS={mount} "
|
||||||
|
return (
|
||||||
|
f"{prefix}./launch-cluster.sh {solo}-d exec vllm serve "
|
||||||
|
f"{quote_arg(model.source)} {quote_args(args)}"
|
||||||
|
)
|
||||||
|
|||||||
@@ -14,7 +14,7 @@ Shape:
|
|||||||
custom:
|
custom:
|
||||||
- key: my-new-model
|
- key: my-new-model
|
||||||
display_name: My New Model (from download)
|
display_name: My New Model (from download)
|
||||||
repo: my-org/my-model
|
repo: my-org/my-model # an HF repo; OR set local_path instead (exactly one)
|
||||||
size_gb: 20
|
size_gb: 20
|
||||||
mode: solo
|
mode: solo
|
||||||
description: null
|
description: null
|
||||||
@@ -25,6 +25,12 @@ Shape:
|
|||||||
fastsafetensors: true
|
fastsafetensors: true
|
||||||
prefix_caching: true
|
prefix_caching: true
|
||||||
kv_cache_dtype: fp8
|
kv_cache_dtype: fp8
|
||||||
|
- key: my-finetune # a local/fine-tuned model (a directory on the Spark)
|
||||||
|
display_name: My Fine-tune
|
||||||
|
local_path: /home/you/models/my-finetune
|
||||||
|
size_gb: 59
|
||||||
|
mode: solo
|
||||||
|
vllm_args: [--chat-template=/home/you/models/my-finetune/chat_template.jinja]
|
||||||
"""
|
"""
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
import os
|
import os
|
||||||
|
|||||||
+29
-5
@@ -6,7 +6,7 @@ from pathlib import Path
|
|||||||
from fastapi import FastAPI, HTTPException, Query, Request
|
from fastapi import FastAPI, HTTPException, Query, Request
|
||||||
from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
|
from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
|
||||||
from fastapi.staticfiles import StaticFiles
|
from fastapi.staticfiles import StaticFiles
|
||||||
from pydantic import BaseModel
|
from pydantic import BaseModel, ValidationError
|
||||||
from typing import Literal
|
from typing import Literal
|
||||||
|
|
||||||
from .config import Settings
|
from .config import Settings
|
||||||
@@ -22,7 +22,7 @@ from .redaction_gateway import build_router as build_redaction_router, MapStore
|
|||||||
from .hardware import HardwareProbe
|
from .hardware import HardwareProbe
|
||||||
from .health import check_kokoro, check_parakeet, check_vllm, check_embeddings, check_qdrant
|
from .health import check_kokoro, check_parakeet, check_vllm, check_embeddings, check_qdrant
|
||||||
from .matrix_bridge import MatrixBridgeManager
|
from .matrix_bridge import MatrixBridgeManager
|
||||||
from .models import load_catalog
|
from .models import ModelDef, load_catalog
|
||||||
from .nim import SUGGESTED_NIMS, CATALOG_URL, NimManager
|
from .nim import SUGGESTED_NIMS, CATALOG_URL, NimManager
|
||||||
from .overrides import add_custom, delete_custom, extract_knobs_from_args, load_overrides, set_knobs
|
from .overrides import add_custom, delete_custom, extract_knobs_from_args, load_overrides, set_knobs
|
||||||
from .services import docker_state, run_action, services_from_settings
|
from .services import docker_state, run_action, services_from_settings
|
||||||
@@ -183,7 +183,8 @@ async def put_model_knobs(key: str, body: KnobsBody) -> dict:
|
|||||||
class CustomModelBody(BaseModel):
|
class CustomModelBody(BaseModel):
|
||||||
key: str
|
key: str
|
||||||
display_name: str
|
display_name: str
|
||||||
repo: str
|
repo: str = ""
|
||||||
|
local_path: str | None = None
|
||||||
size_gb: float = 0
|
size_gb: float = 0
|
||||||
mode: Literal["solo", "cluster"] = "solo"
|
mode: Literal["solo", "cluster"] = "solo"
|
||||||
description: str | None = None
|
description: str | None = None
|
||||||
@@ -196,8 +197,17 @@ class CustomModelBody(BaseModel):
|
|||||||
async def post_model(body: CustomModelBody) -> dict:
|
async def post_model(body: CustomModelBody) -> dict:
|
||||||
if not body.key or not body.key.replace("-", "").replace("_", "").isalnum():
|
if not body.key or not body.key.replace("-", "").replace("_", "").isalnum():
|
||||||
raise HTTPException(400, "key must be alphanumeric/-/_ only")
|
raise HTTPException(400, "key must be alphanumeric/-/_ only")
|
||||||
|
# Validate the full entry BEFORE persisting (exactly-one source, local-path
|
||||||
|
# whitelist, chat-template location). Doing it via ModelDef means the API and
|
||||||
|
# the YAML-override path share one set of rules, and a bad entry can't be
|
||||||
|
# written to /data and then break catalog load.
|
||||||
try:
|
try:
|
||||||
validate_repo(body.repo)
|
ModelDef.model_validate(body.model_dump())
|
||||||
|
if body.repo:
|
||||||
|
validate_repo(body.repo) # HF charset (the model only validates local paths)
|
||||||
|
except ValidationError as e:
|
||||||
|
msg = e.errors()[0]["msg"] if e.errors() else str(e)
|
||||||
|
raise HTTPException(400, msg.removeprefix("Value error, "))
|
||||||
except ValueError as e:
|
except ValueError as e:
|
||||||
raise HTTPException(400, str(e))
|
raise HTTPException(400, str(e))
|
||||||
if body.key in catalog.models and not catalog.models[body.key].custom:
|
if body.key in catalog.models and not catalog.models[body.key].custom:
|
||||||
@@ -229,7 +239,13 @@ async def get_models_disk_status() -> dict:
|
|||||||
return {"configured": False, "models": {}}
|
return {"configured": False, "models": {}}
|
||||||
keys = list(catalog.models.keys())
|
keys = list(catalog.models.keys())
|
||||||
statuses = await asyncio.gather(*(
|
statuses = await asyncio.gather(*(
|
||||||
probe_disk(catalog.models[k].repo, catalog.models[k].mode, settings) for k in keys
|
probe_disk(
|
||||||
|
catalog.models[k].repo,
|
||||||
|
catalog.models[k].mode,
|
||||||
|
settings,
|
||||||
|
local_path=catalog.models[k].local_path,
|
||||||
|
)
|
||||||
|
for k in keys
|
||||||
), return_exceptions=True)
|
), return_exceptions=True)
|
||||||
out: dict[str, dict] = {}
|
out: dict[str, dict] = {}
|
||||||
for k, s in zip(keys, statuses):
|
for k, s in zip(keys, statuses):
|
||||||
@@ -260,6 +276,14 @@ async def del_model_disk(key: str) -> dict:
|
|||||||
raise HTTPException(404, f"unknown model: {key}")
|
raise HTTPException(404, f"unknown model: {key}")
|
||||||
m = catalog.models[key]
|
m = catalog.models[key]
|
||||||
|
|
||||||
|
# Never rm a local fine-tune directory from the dashboard — it's irreplaceable
|
||||||
|
# training output the user placed by hand, not a re-downloadable HF cache.
|
||||||
|
if m.local_path:
|
||||||
|
raise HTTPException(
|
||||||
|
400,
|
||||||
|
"this is a local model; its directory must be managed on the Spark, not deleted from here",
|
||||||
|
)
|
||||||
|
|
||||||
# Refuse if currently loaded
|
# Refuse if currently loaded
|
||||||
try:
|
try:
|
||||||
vllm = await check_vllm(settings)
|
vllm = await check_vllm(settings)
|
||||||
|
|||||||
@@ -28,6 +28,12 @@ _IMAGE_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._:/@-]*$")
|
|||||||
# Docker container / volume name (Docker's own rule).
|
# Docker container / volume name (Docker's own rule).
|
||||||
_CONTAINER_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_.-]*$")
|
_CONTAINER_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_.-]*$")
|
||||||
|
|
||||||
|
# Absolute filesystem path to a local model directory on a Spark. Conservative
|
||||||
|
# charset (letters, digits, and safe path punctuation) with a required leading
|
||||||
|
# '/', so it carries no shell metacharacters and no whitespace. Traversal ('.'
|
||||||
|
# and '..' segments) is rejected separately in validate_local_path.
|
||||||
|
_LOCAL_PATH_RE = re.compile(r"^/[A-Za-z0-9._+/-]+$")
|
||||||
|
|
||||||
|
|
||||||
def validate_repo(repo: str) -> str:
|
def validate_repo(repo: str) -> str:
|
||||||
"""Return `repo` if it is a well-formed 'org/name'; else raise ValueError."""
|
"""Return `repo` if it is a well-formed 'org/name'; else raise ValueError."""
|
||||||
@@ -50,6 +56,25 @@ def validate_container(name: str) -> str:
|
|||||||
return name
|
return name
|
||||||
|
|
||||||
|
|
||||||
|
def validate_local_path(path: str) -> str:
|
||||||
|
"""Return `path` if it is a safe absolute model directory path; else ValueError.
|
||||||
|
|
||||||
|
For locally fine-tuned models served by directory (not an HF repo). Requires
|
||||||
|
an absolute path, a metacharacter-free charset, and no '.'/'..' segments so a
|
||||||
|
caller cannot traverse out of an intended models directory. The `quote_arg`
|
||||||
|
sink still quotes it in depth — this is the boundary check.
|
||||||
|
"""
|
||||||
|
p = path or ""
|
||||||
|
if len(p) > 512 or not _LOCAL_PATH_RE.fullmatch(p):
|
||||||
|
raise ValueError(
|
||||||
|
f"invalid local model path (expected an absolute path, no spaces or "
|
||||||
|
f"shell metacharacters): {path!r}"
|
||||||
|
)
|
||||||
|
if any(seg in (".", "..") for seg in p.split("/")):
|
||||||
|
raise ValueError(f"local model path must not contain '.' or '..' segments: {path!r}")
|
||||||
|
return p
|
||||||
|
|
||||||
|
|
||||||
def quote_arg(value: object) -> str:
|
def quote_arg(value: object) -> str:
|
||||||
"""shlex.quote a single token for safe embedding in a shell command string."""
|
"""shlex.quote a single token for safe embedding in a shell command string."""
|
||||||
return shlex.quote(str(value))
|
return shlex.quote(str(value))
|
||||||
|
|||||||
+67
-2
@@ -60,6 +60,7 @@ function renderCards() {
|
|||||||
? `<div class="desc">${escapeHtml(m.description)}</div>`
|
? `<div class="desc">${escapeHtml(m.description)}</div>`
|
||||||
: '';
|
: '';
|
||||||
const customPill = m.custom ? `<span class="tag custom-pill">custom</span>` : '';
|
const customPill = m.custom ? `<span class="tag custom-pill">custom</span>` : '';
|
||||||
|
const localPill = m.local_path ? `<span class="tag local-pill" title="Served from a directory on the Spark, not Hugging Face">local</span>` : '';
|
||||||
// Disk-presence pill + trash button. Until /api/models/disk-status comes back,
|
// Disk-presence pill + trash button. Until /api/models/disk-status comes back,
|
||||||
// we don't know — render a neutral placeholder.
|
// we don't know — render a neutral placeholder.
|
||||||
const disk = state.disk_status[key];
|
const disk = state.disk_status[key];
|
||||||
@@ -73,8 +74,10 @@ function renderCards() {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
// Trash button — hidden if not on disk; disabled (with tooltip) if currently loaded.
|
// Trash button — hidden if not on disk; disabled (with tooltip) if currently loaded.
|
||||||
|
// Never offered for local models: their directory is hand-placed training output,
|
||||||
|
// not a re-downloadable HF cache (the server refuses the delete too).
|
||||||
let trashBtn = '';
|
let trashBtn = '';
|
||||||
if (state.disk_status_loaded && disk && disk.on_disk) {
|
if (state.disk_status_loaded && disk && disk.on_disk && !m.local_path) {
|
||||||
const disabled = isActive || isSwapping;
|
const disabled = isActive || isSwapping;
|
||||||
const tip = isActive
|
const tip = isActive
|
||||||
? 'Currently loaded — switch to another model first'
|
? 'Currently loaded — switch to another model first'
|
||||||
@@ -92,6 +95,9 @@ function renderCards() {
|
|||||||
primaryBtn = `<button class="btn" disabled>Current</button>`;
|
primaryBtn = `<button class="btn" disabled>Current</button>`;
|
||||||
} else if (isOnDisk) {
|
} else if (isOnDisk) {
|
||||||
primaryBtn = `<button class="btn primary" data-swap-key="${key}" ${isSwapping ? 'disabled' : ''}>Switch to this</button>`;
|
primaryBtn = `<button class="btn primary" data-swap-key="${key}" ${isSwapping ? 'disabled' : ''}>Switch to this</button>`;
|
||||||
|
} else if (m.local_path) {
|
||||||
|
// A local model can't be "downloaded" — its directory has to exist on the Spark.
|
||||||
|
primaryBtn = `<button class="btn" disabled title="Directory not found on the Spark — create it there, then refresh">Not found on Spark</button>`;
|
||||||
} else {
|
} else {
|
||||||
const tip = dlInFlight ? 'A download is already in progress' : 'Download weights to the Spark(s)';
|
const tip = dlInFlight ? 'A download is already in progress' : 'Download weights to the Spark(s)';
|
||||||
primaryBtn = `<button class="btn info" data-download-key="${key}" title="${escapeHtml(tip)}" ${dlInFlight ? 'disabled' : ''}>Download</button>`;
|
primaryBtn = `<button class="btn info" data-download-key="${key}" title="${escapeHtml(tip)}" ${dlInFlight ? 'disabled' : ''}>Download</button>`;
|
||||||
@@ -102,12 +108,15 @@ function renderCards() {
|
|||||||
<span class="tag mode-${m.mode}">${m.mode}</span>
|
<span class="tag mode-${m.mode}">${m.mode}</span>
|
||||||
<span class="tag">${m.size_gb} GB</span>
|
<span class="tag">${m.size_gb} GB</span>
|
||||||
${customPill}
|
${customPill}
|
||||||
|
${localPill}
|
||||||
${diskPill}
|
${diskPill}
|
||||||
${(m.capabilities || []).map(c => `<span class="tag cap">${escapeHtml(c)}</span>`).join('')}
|
${(m.capabilities || []).map(c => `<span class="tag cap">${escapeHtml(c)}</span>`).join('')}
|
||||||
</div>
|
</div>
|
||||||
${desc}
|
${desc}
|
||||||
<div class="muted small repo">
|
<div class="muted small repo">
|
||||||
<a href="https://huggingface.co/${encodeURIComponent(m.repo)}" target="_blank" rel="noopener" title="View on Hugging Face">${escapeHtml(m.repo)} <span class="hf-icon">↗</span></a>
|
${m.local_path
|
||||||
|
? `<span class="local-path" title="Local model directory on the Spark">${escapeHtml(m.local_path)}</span>`
|
||||||
|
: `<a href="https://huggingface.co/${encodeURIComponent(m.repo)}" target="_blank" rel="noopener" title="View on Hugging Face">${escapeHtml(m.repo)} <span class="hf-icon">↗</span></a>`}
|
||||||
</div>
|
</div>
|
||||||
<div class="spacer"></div>
|
<div class="spacer"></div>
|
||||||
<div class="card-actions">
|
<div class="card-actions">
|
||||||
@@ -1671,6 +1680,60 @@ function setupAdvancedDialog() {
|
|||||||
el('#adv-gmu').addEventListener('input', (e) => { el('#adv-gmu-out').value = parseFloat(e.target.value).toFixed(2); });
|
el('#adv-gmu').addEventListener('input', (e) => { el('#adv-gmu-out').value = parseFloat(e.target.value).toFixed(2); });
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function openLocalModelDialog() {
|
||||||
|
const dlg = el('#local-model-dialog');
|
||||||
|
el('#lm-key').value = '';
|
||||||
|
el('#lm-name').value = '';
|
||||||
|
el('#lm-path').value = '';
|
||||||
|
el('#lm-chat').value = '';
|
||||||
|
el('#lm-size').value = '';
|
||||||
|
el('#lm-mode').value = 'solo';
|
||||||
|
el('#lm-desc').value = '';
|
||||||
|
el('#lm-mml').value = 32768;
|
||||||
|
el('#lm-gmu').value = 0.85;
|
||||||
|
el('#lm-gmu-out').value = '0.85';
|
||||||
|
el('#lm-fst').checked = true;
|
||||||
|
el('#lm-pcache').checked = true;
|
||||||
|
el('#lm-fp8').checked = true;
|
||||||
|
dlg.showModal();
|
||||||
|
}
|
||||||
|
|
||||||
|
function setupLocalModelDialog() {
|
||||||
|
el('#lm-cancel').addEventListener('click', () => el('#local-model-dialog').close());
|
||||||
|
el('#lm-gmu').addEventListener('input', (e) => { el('#lm-gmu-out').value = parseFloat(e.target.value).toFixed(2); });
|
||||||
|
el('#local-model-form').addEventListener('submit', async (e) => {
|
||||||
|
e.preventDefault();
|
||||||
|
const chat = el('#lm-chat').value.trim();
|
||||||
|
const body = {
|
||||||
|
key: el('#lm-key').value.trim(),
|
||||||
|
display_name: el('#lm-name').value.trim(),
|
||||||
|
local_path: el('#lm-path').value.trim(),
|
||||||
|
size_gb: parseFloat(el('#lm-size').value) || 0,
|
||||||
|
mode: el('#lm-mode').value,
|
||||||
|
description: el('#lm-desc').value.trim() || null,
|
||||||
|
// A fine-tune's chat template (if any) rides along as a launch flag.
|
||||||
|
vllm_args: chat ? [`--chat-template=${chat}`] : [],
|
||||||
|
knobs: {
|
||||||
|
max_model_len: parseInt(el('#lm-mml').value, 10) || 32768,
|
||||||
|
gpu_memory_utilization: parseFloat(el('#lm-gmu').value),
|
||||||
|
fastsafetensors: el('#lm-fst').checked,
|
||||||
|
prefix_caching: el('#lm-pcache').checked,
|
||||||
|
kv_cache_dtype: el('#lm-fp8').checked ? 'fp8' : 'auto',
|
||||||
|
},
|
||||||
|
};
|
||||||
|
try {
|
||||||
|
await fetchJSON('/api/models', {
|
||||||
|
method: 'POST',
|
||||||
|
headers: { 'content-type': 'application/json' },
|
||||||
|
body: JSON.stringify(body),
|
||||||
|
});
|
||||||
|
el('#local-model-dialog').close();
|
||||||
|
await loadModels();
|
||||||
|
pollStatus();
|
||||||
|
} catch (e) { alert('Add local model failed: ' + e.message); }
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
// ===================== NIM installer =====================
|
// ===================== NIM installer =====================
|
||||||
|
|
||||||
const nimState = {
|
const nimState = {
|
||||||
@@ -2034,8 +2097,10 @@ async function init() {
|
|||||||
if (kbtn) { copySparkSshKey(kbtn.dataset.sshKey, kbtn); return; }
|
if (kbtn) { copySparkSshKey(kbtn.dataset.sshKey, kbtn); return; }
|
||||||
});
|
});
|
||||||
el('#sshkey-close').addEventListener('click', () => el('#sshkey-dialog').close());
|
el('#sshkey-close').addEventListener('click', () => el('#sshkey-dialog').close());
|
||||||
|
el('#open-local').addEventListener('click', openLocalModelDialog);
|
||||||
setupCatalogDialog();
|
setupCatalogDialog();
|
||||||
setupAdvancedDialog();
|
setupAdvancedDialog();
|
||||||
|
setupLocalModelDialog();
|
||||||
// Open WebUI link from /api/config
|
// Open WebUI link from /api/config
|
||||||
try {
|
try {
|
||||||
state.config = await fetchJSON('/api/config');
|
state.config = await fetchJSON('/api/config');
|
||||||
|
|||||||
@@ -229,6 +229,7 @@
|
|||||||
<div class="section-header">
|
<div class="section-header">
|
||||||
<h2 class="section-title">LLM swap</h2>
|
<h2 class="section-title">LLM swap</h2>
|
||||||
<button id="open-download" class="btn small-btn">+ Download a new model</button>
|
<button id="open-download" class="btn small-btn">+ Download a new model</button>
|
||||||
|
<button id="open-local" class="btn small-btn">+ Add local model</button>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<dialog id="catalog-dialog" class="modal">
|
<dialog id="catalog-dialog" class="modal">
|
||||||
@@ -261,6 +262,37 @@
|
|||||||
</form>
|
</form>
|
||||||
</dialog>
|
</dialog>
|
||||||
|
|
||||||
|
<dialog id="local-model-dialog" class="modal">
|
||||||
|
<form method="dialog" class="modal-form" id="local-model-form">
|
||||||
|
<h3>Add a local / fine-tuned model</h3>
|
||||||
|
<p class="muted small">For a model that lives as a directory on a Spark (e.g. a fine-tune), not a Hugging Face repo. The directory is bind-mounted into the vLLM container at the same path when you swap to it. It must already exist on the Spark.</p>
|
||||||
|
<label class="modal-row"><span>Key (URL-safe id)</span><input type="text" id="lm-key" required pattern="[a-zA-Z0-9_-]+"></label>
|
||||||
|
<label class="modal-row"><span>Display name</span><input type="text" id="lm-name" required></label>
|
||||||
|
<label class="modal-row"><span>Model directory (absolute path on the Spark)</span><input type="text" id="lm-path" required placeholder="e.g. /home/you/models/my-finetune"></label>
|
||||||
|
<label class="modal-row"><span>Chat template path (optional)</span><input type="text" id="lm-chat" placeholder="e.g. /home/you/models/my-finetune/chat_template.jinja"></label>
|
||||||
|
<label class="modal-row"><span>Size (GB)</span><input type="number" id="lm-size" step="0.1" min="0"></label>
|
||||||
|
<label class="modal-row"><span>Mode</span>
|
||||||
|
<select id="lm-mode">
|
||||||
|
<option value="solo">solo (Spark 1 only)</option>
|
||||||
|
<option value="cluster">cluster (both Sparks via Ray)</option>
|
||||||
|
</select>
|
||||||
|
</label>
|
||||||
|
<label class="modal-row"><span>Description (optional)</span><textarea id="lm-desc" rows="3"></textarea></label>
|
||||||
|
<fieldset class="modal-fieldset">
|
||||||
|
<legend>Default launch knobs</legend>
|
||||||
|
<label class="modal-row"><span>Max context (tokens)</span><input type="number" id="lm-mml" step="1024" min="1024" value="32768"></label>
|
||||||
|
<label class="modal-row"><span>GPU memory %</span><input type="range" id="lm-gmu" min="0.5" max="0.95" step="0.01" value="0.85"> <output id="lm-gmu-out">0.85</output></label>
|
||||||
|
<label class="modal-row inline"><input type="checkbox" id="lm-fst" checked> Fast safetensors loading</label>
|
||||||
|
<label class="modal-row inline"><input type="checkbox" id="lm-pcache" checked> Prefix caching</label>
|
||||||
|
<label class="modal-row inline"><input type="checkbox" id="lm-fp8" checked> FP8 KV cache</label>
|
||||||
|
</fieldset>
|
||||||
|
<div class="modal-actions">
|
||||||
|
<button type="button" id="lm-cancel" class="btn">Cancel</button>
|
||||||
|
<button type="submit" class="btn primary">Add local model</button>
|
||||||
|
</div>
|
||||||
|
</form>
|
||||||
|
</dialog>
|
||||||
|
|
||||||
<dialog id="disk-delete-dialog" class="modal">
|
<dialog id="disk-delete-dialog" class="modal">
|
||||||
<form method="dialog" class="modal-form">
|
<form method="dialog" class="modal-form">
|
||||||
<h3>Delete model weights from disk?</h3>
|
<h3>Delete model weights from disk?</h3>
|
||||||
|
|||||||
@@ -694,6 +694,7 @@ main {
|
|||||||
.card .repo a { color: inherit; text-decoration: none; }
|
.card .repo a { color: inherit; text-decoration: none; }
|
||||||
.card .repo a:hover { color: var(--info); text-decoration: underline; }
|
.card .repo a:hover { color: var(--info); text-decoration: underline; }
|
||||||
.card .repo .hf-icon { font-size: 13px; opacity: 0.7; }
|
.card .repo .hf-icon { font-size: 13px; opacity: 0.7; }
|
||||||
|
.card .repo .local-path { font-family: var(--mono, ui-monospace, monospace); opacity: 0.85; }
|
||||||
.tag {
|
.tag {
|
||||||
background: var(--surface-2);
|
background: var(--surface-2);
|
||||||
border: 1px solid var(--border);
|
border: 1px solid var(--border);
|
||||||
@@ -738,6 +739,7 @@ main {
|
|||||||
.card .adv-btn,
|
.card .adv-btn,
|
||||||
.card .test-btn { padding: 8px 12px; font-size: 12px; }
|
.card .test-btn { padding: 8px 12px; font-size: 12px; }
|
||||||
.card .custom-pill { color: var(--info); border-color: rgba(96, 165, 250, 0.4); }
|
.card .custom-pill { color: var(--info); border-color: rgba(96, 165, 250, 0.4); }
|
||||||
|
.card .local-pill { color: var(--warn); border-color: rgba(245, 158, 11, 0.4); }
|
||||||
.tag.on-disk { color: var(--accent); border-color: rgba(74, 222, 128, 0.4); }
|
.tag.on-disk { color: var(--accent); border-color: rgba(74, 222, 128, 0.4); }
|
||||||
.tag.not-on-disk { color: var(--muted); border-color: var(--border); opacity: 0.7; }
|
.tag.not-on-disk { color: var(--muted); border-color: var(--border); opacity: 0.7; }
|
||||||
.card-actions .icon-btn.danger { color: var(--error); border-color: rgba(239, 68, 68, 0.3); margin-left: auto; }
|
.card-actions .icon-btn.danger { color: var(--error); border-color: rgba(239, 68, 68, 0.3); margin-left: auto; }
|
||||||
|
|||||||
@@ -7,6 +7,9 @@ the command back into the exact token list. The vLLM pre-flight validator
|
|||||||
"""
|
"""
|
||||||
import shlex
|
import shlex
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from pydantic import ValidationError
|
||||||
|
|
||||||
from app.models import Defaults, ModelDef, build_launch_command
|
from app.models import Defaults, ModelDef, build_launch_command
|
||||||
|
|
||||||
DEFAULTS = Defaults(port=8888, host="0.0.0.0")
|
DEFAULTS = Defaults(port=8888, host="0.0.0.0")
|
||||||
@@ -65,3 +68,81 @@ def test_injection_via_vllm_arg_stays_literal():
|
|||||||
payload = "--foo=$(touch /tmp/pwned)"
|
payload = "--foo=$(touch /tmp/pwned)"
|
||||||
cmd = build_launch_command("k", _model(vllm_args=[payload]), DEFAULTS)
|
cmd = build_launch_command("k", _model(vllm_args=[payload]), DEFAULTS)
|
||||||
assert payload in shlex.split(cmd) # preserved as one inert token
|
assert payload in shlex.split(cmd) # preserved as one inert token
|
||||||
|
|
||||||
|
|
||||||
|
# ---- local / fine-tuned models (served by directory, not HF repo) ----
|
||||||
|
|
||||||
|
def test_local_model_bind_mounts_dir_and_serves_the_path():
|
||||||
|
m = _model(repo="", local_path="/home/u/models/ft-v2", vllm_args=["--max-model-len=2048"])
|
||||||
|
cmd = build_launch_command("k", m, DEFAULTS)
|
||||||
|
tokens = shlex.split(cmd)
|
||||||
|
# The launch script's hook bind-mounts the host dir at the SAME container path.
|
||||||
|
assert tokens[0] == (
|
||||||
|
"VLLM_SPARK_EXTRA_DOCKER_ARGS=-v /home/u/models/ft-v2:/home/u/models/ft-v2"
|
||||||
|
)
|
||||||
|
# vLLM is pointed at the directory, not an HF repo id.
|
||||||
|
i = tokens.index("serve")
|
||||||
|
assert tokens[i + 1] == "/home/u/models/ft-v2"
|
||||||
|
assert "--max-model-len=2048" in tokens
|
||||||
|
|
||||||
|
|
||||||
|
def test_local_model_chat_template_arg_survives_round_trip():
|
||||||
|
m = _model(
|
||||||
|
repo="",
|
||||||
|
local_path="/m/ft",
|
||||||
|
vllm_args=["--chat-template=/m/ft/chat_template.jinja"],
|
||||||
|
)
|
||||||
|
cmd = build_launch_command("k", m, DEFAULTS)
|
||||||
|
assert "--chat-template=/m/ft/chat_template.jinja" in shlex.split(cmd)
|
||||||
|
|
||||||
|
|
||||||
|
def test_local_path_with_metacharacters_is_quoted_not_executed():
|
||||||
|
# The validator rejects a hostile path at the boundary; bypass it with
|
||||||
|
# model_construct to prove the quote_arg sink is safe in depth even if a bad
|
||||||
|
# value somehow reaches build_launch_command.
|
||||||
|
evil = "/m/ft; rm -rf ~"
|
||||||
|
m = ModelDef.model_construct(
|
||||||
|
display_name="X", repo="", local_path=evil, size_gb=1.0, mode="solo",
|
||||||
|
vllm_args=[], knobs=None, custom=False, capabilities=[],
|
||||||
|
expected_ready_seconds=300, description=None,
|
||||||
|
)
|
||||||
|
cmd = build_launch_command("k", m, DEFAULTS)
|
||||||
|
tokens = shlex.split(cmd)
|
||||||
|
i = tokens.index("serve")
|
||||||
|
assert tokens[i + 1] == evil # recovered as one literal token, not executed
|
||||||
|
assert tokens[0] == f"VLLM_SPARK_EXTRA_DOCKER_ARGS=-v {evil}:{evil}"
|
||||||
|
|
||||||
|
|
||||||
|
def test_model_requires_exactly_one_source():
|
||||||
|
with pytest.raises(ValidationError):
|
||||||
|
ModelDef(display_name="x", size_gb=1, mode="solo") # neither repo nor local_path
|
||||||
|
with pytest.raises(ValidationError):
|
||||||
|
ModelDef(display_name="x", repo="o/n", local_path="/p", size_gb=1, mode="solo") # both
|
||||||
|
|
||||||
|
|
||||||
|
def test_local_model_rejects_chat_template_outside_dir():
|
||||||
|
# Only local_path is mounted into the container, so a chat-template elsewhere
|
||||||
|
# would silently 404 inside vLLM — reject it up front.
|
||||||
|
with pytest.raises(ValidationError):
|
||||||
|
ModelDef(
|
||||||
|
display_name="x", repo="", local_path="/m/ft", size_gb=1, mode="solo",
|
||||||
|
vllm_args=["--chat-template=/other/dir/t.jinja"],
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def test_invalid_local_path_rejected_by_model():
|
||||||
|
with pytest.raises(ValidationError):
|
||||||
|
ModelDef(display_name="x", repo="", local_path="/m/../etc", size_gb=1, mode="solo")
|
||||||
|
|
||||||
|
|
||||||
|
def test_merge_overrides_loads_local_and_skips_invalid(monkeypatch):
|
||||||
|
# YAML/override-added local models get the same validation as the API; a single
|
||||||
|
# bad entry is skipped (logged) rather than breaking the whole catalog load.
|
||||||
|
from app import models as M
|
||||||
|
monkeypatch.setattr(M, "load_overrides", lambda: {"knobs": {}, "custom": [
|
||||||
|
{"key": "good", "display_name": "G", "local_path": "/home/u/m", "size_gb": 1, "mode": "solo"},
|
||||||
|
{"key": "bad", "display_name": "B", "local_path": "/home/u/../etc", "size_gb": 1, "mode": "solo"},
|
||||||
|
]})
|
||||||
|
cat = M._merge_overrides(M.Catalog(models={}))
|
||||||
|
assert cat.models["good"].is_local and cat.models["good"].source == "/home/u/m"
|
||||||
|
assert "bad" not in cat.models # traversal path skipped, not catalog-fatal
|
||||||
|
|||||||
@@ -6,7 +6,12 @@ use `validate_x(v)` inline.
|
|||||||
"""
|
"""
|
||||||
import pytest
|
import pytest
|
||||||
|
|
||||||
from app.shellsafe import validate_container, validate_image, validate_repo
|
from app.shellsafe import (
|
||||||
|
validate_container,
|
||||||
|
validate_image,
|
||||||
|
validate_local_path,
|
||||||
|
validate_repo,
|
||||||
|
)
|
||||||
|
|
||||||
# Shell metacharacters that must never survive any validator — these are the
|
# Shell metacharacters that must never survive any validator — these are the
|
||||||
# actual injection vectors. (Path traversal like "../" is NOT in scope here:
|
# actual injection vectors. (Path traversal like "../" is NOT in scope here:
|
||||||
@@ -96,3 +101,27 @@ def test_container_valid_passes_through_unchanged(name):
|
|||||||
def test_container_rejects_malformed_and_hostile(name):
|
def test_container_rejects_malformed_and_hostile(name):
|
||||||
with pytest.raises(ValueError):
|
with pytest.raises(ValueError):
|
||||||
validate_container(name)
|
validate_container(name)
|
||||||
|
|
||||||
|
|
||||||
|
# ---- validate_local_path: absolute model dir, no traversal/metacharacters ----
|
||||||
|
|
||||||
|
@pytest.mark.parametrize("path", [
|
||||||
|
"/home/modelo/models/gemma-4-31B-ten31-v2",
|
||||||
|
"/data/models/ft.v2_1",
|
||||||
|
"/srv/m/a-b/c",
|
||||||
|
])
|
||||||
|
def test_local_path_valid_passes_through_unchanged(path):
|
||||||
|
assert validate_local_path(path) == path
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.parametrize("path", [
|
||||||
|
"",
|
||||||
|
"relative/path", # must be absolute
|
||||||
|
"~/models/x", # no ~ expansion
|
||||||
|
"/models/../etc/shadow", # '..' traversal
|
||||||
|
"/models/./x", # '.' segment
|
||||||
|
"/a" * 300, # over the 512 cap (600 chars)
|
||||||
|
] + [f"/models/x{h}" for h in HOSTILE])
|
||||||
|
def test_local_path_rejects_relative_traversal_and_hostile(path):
|
||||||
|
with pytest.raises(ValueError):
|
||||||
|
validate_local_path(path)
|
||||||
|
|||||||
@@ -1,10 +1,10 @@
|
|||||||
import { VersionInfo, IMPOSSIBLE } from '@start9labs/start-sdk'
|
import { VersionInfo, IMPOSSIBLE } from '@start9labs/start-sdk'
|
||||||
|
|
||||||
export const v0_1_0 = VersionInfo.of({
|
export const v0_1_0 = VersionInfo.of({
|
||||||
version: '0.22.0:0',
|
version: '0.23.0:0',
|
||||||
releaseNotes: {
|
releaseNotes: {
|
||||||
en_US:
|
en_US:
|
||||||
"v0.22.0:0 — configurable vLLM port. The port Spark Control uses to reach vLLM on Spark 1 (the health check and the chat proxy) is now a field in the Configure Sparks action, so you can point it at a vLLM that listens on a non-default port without rebuilding the package. Leave it blank to keep the previous default of 8888 — what the bundled launch-cluster.sh wrapper uses; set it to 8000 (vLLM's own default) or any other port if your vLLM listens elsewhere. Also hardened numeric-setting parsing so a blank or malformed port value falls back to its default instead of crashing daemon startup.",
|
"v0.23.0:0 — local / fine-tuned model support. You can now add a model that lives as a directory on a Spark (e.g. a LoRA-merged fine-tune), not just a Hugging Face repo. Use the new \"+ Add local model\" button under LLM swap: give it the model's absolute path on the Spark, an optional chat-template path, and the usual launch knobs. On swap, Spark Control bind-mounts that directory into the vLLM container at the same path (via the launch script's existing VLLM_SPARK_EXTRA_DOCKER_ARGS hook — nothing to change on the Spark) and runs `vllm serve <dir>`. Local models show a \"local\" badge and their path instead of a Hugging Face link, and their weights are never offered for dashboard deletion (that directory is your own training output, not a re-downloadable cache). API: POST /api/models now accepts `local_path` (set exactly one of `repo` or `local_path`), validated against a strict path whitelist with no traversal.",
|
||||||
},
|
},
|
||||||
migrations: {
|
migrations: {
|
||||||
up: async ({ effects }) => {},
|
up: async ({ effects }) => {},
|
||||||
|
|||||||
@@ -60,6 +60,12 @@ The **Update** button runs `git fetch && git reset --hard origin/<branch> && doc
|
|||||||
|
|
||||||
If `description` is omitted, the card simply hides that section — no need to populate it for every model. Keep descriptions generic (not user-specific) so the catalog stays portable.
|
If `description` is omitted, the card simply hides that section — no need to populate it for every model. Keep descriptions generic (not user-specific) so the catalog stays portable.
|
||||||
|
|
||||||
|
### Local / fine-tuned models (v0.23.0+)
|
||||||
|
|
||||||
|
A model that lives as a directory on a Spark (e.g. a LoRA-merged fine-tune) instead of an HF repo: use the **"+ Add local model"** button under LLM swap (or a `custom:` entry with `local_path` instead of `repo` in the override YAML). The directory must already exist on the Spark; only its parent dir is mounted, so a `--chat-template` must live **inside** `local_path`.
|
||||||
|
|
||||||
|
**Load-bearing contract:** on swap, spark-control prefixes the launch with `VLLM_SPARK_EXTRA_DOCKER_ARGS="-v <path>:<path>"` so `launch-cluster.sh` bind-mounts the dir into the vLLM container at the same path. This relies on the upstream `eugr/spark-vllm-docker` `launch-cluster.sh` expanding `$VLLM_SPARK_EXTRA_DOCKER_ARGS` **unquoted** into its `docker run` (verified against the on-Spark script 2026-06-17: line ~11 appends it to `DOCKER_ARGS`, used unquoted in `docker run`). If a future upstream version quotes that variable, local-model mounts would silently fail — re-check this before pulling launch-cluster.sh updates.
|
||||||
|
|
||||||
## Manual swap fallback
|
## Manual swap fallback
|
||||||
|
|
||||||
If the UI is unavailable and you need to swap by hand:
|
If the UI is unavailable and you need to swap by hand:
|
||||||
|
|||||||
+32
-12
@@ -8,38 +8,58 @@
|
|||||||
# The git tag (vX.Y.Z, derived from the version) must already exist and be pushed
|
# The git tag (vX.Y.Z, derived from the version) must already exist and be pushed
|
||||||
# (`git tag v0.22.0 && git push gitea v0.22.0`). Re-running is idempotent: it
|
# (`git tag v0.22.0 && git push gitea v0.22.0`). Re-running is idempotent: it
|
||||||
# reuses an existing release for the tag and replaces a same-named asset.
|
# reuses an existing release for the tag and replaces a same-named asset.
|
||||||
|
# Set GITEA_INSECURE=1 to skip TLS verification (self-signed cert on a LAN box).
|
||||||
set -euo pipefail
|
set -euo pipefail
|
||||||
|
|
||||||
VERSION="${1:-}"; S9PK="${2:-}"
|
VERSION="${1:-}"; S9PK="${2:-}"
|
||||||
[ -n "$VERSION" ] && [ -n "$S9PK" ] || {
|
[ -n "$VERSION" ] && [ -n "$S9PK" ] || {
|
||||||
echo "usage: GITEA_URL=.. GITEA_TOKEN=.. $0 <version e.g. 0.22.0:0> <s9pk path>" >&2; exit 2; }
|
echo "usage: GITEA_URL=.. GITEA_TOKEN=.. $0 <version e.g. 0.22.0:0> <s9pk path>" >&2; exit 2; }
|
||||||
: "${GITEA_URL:?set GITEA_URL to your Gitea base URL, e.g. https://gitea.lan:3000}"
|
: "${GITEA_URL:?set GITEA_URL to your Gitea base URL, e.g. https://gitea.lan:3000}"
|
||||||
: "${GITEA_TOKEN:?set GITEA_TOKEN to a token with repository write access}"
|
: "${GITEA_TOKEN:?set GITEA_TOKEN to a token with repository read+write access}"
|
||||||
[ -f "$S9PK" ] || { echo "s9pk not found: $S9PK" >&2; exit 1; }
|
[ -f "$S9PK" ] || { echo "s9pk not found: $S9PK" >&2; exit 1; }
|
||||||
|
|
||||||
TAG="v${VERSION%%:*}" # 0.22.0:0 -> v0.22.0
|
TAG="v${VERSION%%:*}" # 0.22.0:0 -> v0.22.0
|
||||||
ASSET="$(basename "$S9PK")"
|
ASSET="$(basename "$S9PK")"
|
||||||
SLUG="$(git remote get-url gitea | sed -E 's#.*[:/]([^/:]+/[^/]+)\.git$#\1#')" # grant/spark-control
|
SLUG="$(git remote get-url gitea | sed -E 's#.*[:/]([^/:]+/[^/]+)\.git$#\1#')" # grant/spark-control
|
||||||
API="${GITEA_URL%/}/api/v1/repos/${SLUG}"
|
API="${GITEA_URL%/}/api/v1/repos/${SLUG}"
|
||||||
AUTH=(-H "Authorization: token ${GITEA_TOKEN}")
|
CURL=(curl -sS) # no -f: we inspect HTTP codes ourselves
|
||||||
|
[ "${GITEA_INSECURE:-}" = "1" ] && CURL+=(-k)
|
||||||
|
|
||||||
echo "repo ${SLUG} | tag ${TAG} | asset ${ASSET} | ${GITEA_URL}"
|
echo "repo ${SLUG} | tag ${TAG} | asset ${ASSET} | ${GITEA_URL}"
|
||||||
|
|
||||||
|
# api METHOD URL [extra curl args...] -> sets globals HTTP_CODE and BODY
|
||||||
|
api() {
|
||||||
|
local method="$1" url="$2"; shift 2
|
||||||
|
local out
|
||||||
|
out="$("${CURL[@]}" -X "$method" -H "Authorization: token ${GITEA_TOKEN}" "$@" \
|
||||||
|
-w $'\n%{http_code}' "$url")"
|
||||||
|
HTTP_CODE="${out##*$'\n'}"
|
||||||
|
BODY="${out%$'\n'*}"
|
||||||
|
}
|
||||||
|
|
||||||
# Reuse an existing release for this tag, otherwise create one.
|
# Reuse an existing release for this tag, otherwise create one.
|
||||||
id="$(curl -fsS "${AUTH[@]}" "$API/releases/tags/$TAG" 2>/dev/null | jq -r '.id // empty')"
|
api GET "$API/releases/tags/$TAG"
|
||||||
if [ -z "$id" ]; then
|
if [ "$HTTP_CODE" = 200 ]; then
|
||||||
id="$(curl -fsS -X POST "${AUTH[@]}" -H 'Content-Type: application/json' \
|
id="$(printf '%s' "$BODY" | jq -r '.id')"
|
||||||
|
elif [ "$HTTP_CODE" = 404 ]; then
|
||||||
|
api POST "$API/releases" -H 'Content-Type: application/json' \
|
||||||
--data "$(jq -n --arg t "$TAG" --arg n "$VERSION" \
|
--data "$(jq -n --arg t "$TAG" --arg n "$VERSION" \
|
||||||
'{tag_name:$t, name:$n, body:("Spark Control "+$n+". See AGENTS.md / release notes.")}')" \
|
'{tag_name:$t, name:$n, body:("Spark Control "+$n+". See AGENTS.md / release notes.")}')"
|
||||||
"$API/releases" | jq -r '.id')"
|
[ "$HTTP_CODE" = 201 ] || { echo "create release failed (HTTP $HTTP_CODE): $BODY" >&2; exit 1; }
|
||||||
|
id="$(printf '%s' "$BODY" | jq -r '.id')"
|
||||||
|
else
|
||||||
|
echo "release lookup failed (HTTP $HTTP_CODE) — check GITEA_URL and the token's scope: $BODY" >&2
|
||||||
|
exit 1
|
||||||
fi
|
fi
|
||||||
[ -n "$id" ] && [ "$id" != null ] || { echo "could not obtain release id (check URL/token/tag)" >&2; exit 1; }
|
[ -n "$id" ] && [ "$id" != null ] || { echo "could not parse release id: $BODY" >&2; exit 1; }
|
||||||
|
|
||||||
# Replace a same-named asset so re-runs don't 409.
|
# Replace a same-named asset so re-runs don't 409.
|
||||||
old="$(curl -fsS "${AUTH[@]}" "$API/releases/$id/assets" | jq -r --arg n "$ASSET" '.[] | select(.name==$n) | .id')"
|
api GET "$API/releases/$id/assets"
|
||||||
[ -n "$old" ] && curl -fsS -X DELETE "${AUTH[@]}" "$API/releases/$id/assets/$old" >/dev/null || true
|
old="$(printf '%s' "$BODY" | jq -r --arg n "$ASSET" '.[]? | select(.name==$n) | .id')"
|
||||||
|
[ -n "$old" ] && { api DELETE "$API/releases/$id/assets/$old"; }
|
||||||
|
|
||||||
curl -fsS -X POST "${AUTH[@]}" -F "attachment=@${S9PK};type=application/octet-stream" \
|
api POST "$API/releases/$id/assets?name=$ASSET" \
|
||||||
"$API/releases/$id/assets?name=$ASSET" >/dev/null
|
-F "attachment=@${S9PK};type=application/octet-stream"
|
||||||
|
[ "$HTTP_CODE" = 201 ] || { echo "asset upload failed (HTTP $HTTP_CODE): $BODY" >&2; exit 1; }
|
||||||
|
|
||||||
echo "published: ${GITEA_URL%/}/${SLUG}/releases/tag/${TAG}"
|
echo "published: ${GITEA_URL%/}/${SLUG}/releases/tag/${TAG}"
|
||||||
|
|||||||
Reference in New Issue
Block a user