From e307a08f05fce6c43f27eb293a6383ce465874d7 Mon Sep 17 00:00:00 2001 From: Keysat Date: Mon, 15 Jun 2026 18:32:57 -0500 Subject: [PATCH] =?UTF-8?q?docs:=20refresh=20Current=20state=20for=20hando?= =?UTF-8?q?ff=20=E2=80=94=20harness=20shipped,=20parakeet=20deferred,=20fi?= =?UTF-8?q?nished=20narrative=20pruned?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- AGENTS.md | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index a7a88a3..0d37483 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -55,13 +55,11 @@ Subsystem guidance lives in `docs/guides/` and loads when matching files are tou ## Current state -- **Working (v0.20.0:0, installed and serving):** swap dashboard; chat / transcribe / diarize(+chunk) / TTS proxies; embeddings + rerank + hybrid search (Qdrant); `/scrub` + `/rehydrate`; label-merge incl. dual-channel mode. Spark 2 audio stack healthy (11k+ requests/12h, all 200). -- **Security hardening shipped (v0.19.0:0, 2026-06-12):** closed an SSH command-injection path (`shellsafe.py` validates + `shlex.quote`s every user value crossing into a Spark command), a Qdrant collection path-injection, and added a same-origin (CSRF) guard on control endpoints (proxy/data API exempt, consumers unaffected). Full evidence in `EVALUATION.md`; remaining non-blocking P2/P3 debt now lives in `ROADMAP.md`. -- **Git history scrubbed (2026-06-12):** owner-specific IPs/hosts/user/key-name/personal-names purged from all commits/tags/messages via `git filter-repo`, force-pushed to `gitea` (every SHA changed); 0 hits across all refs. Pre-rewrite backup bundle: `../spark-control-prehistory-rewrite.bundle`. Owner declined SSH-key rotation (only the key *name* leaked, never the material) — don't re-flag.- **Shipped — Spark connectivity helpers (v0.20.0:0, built + installed 2026-06-15):** two read-mostly hardware-card additions. (a) **SSH-key copy:** small copy icon top-right of each reachable card → `POST /api/spark/{name}/ssh-key` (generate-if-missing + return the Spark's *outbound* pubkey; non-destructive; CSRF-guarded; no request input reaches the command so no shellsafe). UI pops `#sshkey-dialog` (key + paste-on-Mac one-liner) since plain-HTTP blocks `navigator.clipboard`. Opposite direction from the StartOS `showPublicKey` action (that grants the *dashboard* access to the Sparks). (b) **WireGuard status badge:** the `hardware.py` probe now also reports `wg_iface`/`wg_addr` via unprivileged `ip -o link show type wireguard` (no root/sudo, ends in a pipe to awk so it can't trip the probe's `set -e`); `renderHardware` shows a `VPN ` badge in the meta line when a tunnel is up. Reflects interface presence, not live peer reachability (true handshake age would need `sudo wg show`). Verified: clean `make x86` + `start-cli package install` exit 0, the real `ip ... type wireguard` output on spark2 matches the parser, and — **confirmed in-browser** — the SSH-key icon works. That also closes the long-open v0.19.0 question: the same-origin CSRF guard does NOT false-block control endpoints behind the StartOS proxy (the SSH-key POST goes through it). The `VPN 10.59.211.6` badge render is confirmed in-browser too — feature fully verified. -- **spark2 joined the `starttunnel` WireGuard subnet (2026-06-15):** config installed at `/etc/wireguard/starttunnel.conf`, interface `starttunnel` up at `10.59.211.6/24`, `wg-quick@starttunnel` enabled (survives reboot). Split tunnel (`AllowedIPs = 10.59.211.0/24`) so the Spark keeps its LAN route — the dashboard's SSH is unaffected. Purpose: let a bot on spark2 reach the owner's Mac off-LAN. **Finding:** passwordless sudo is NOT configured on spark2 (`sudo wg show` → "a password is required") — the earlier assumption was wrong; harmless here since the badge is sudo-free, but note it before designing any dashboard feature that needs root on a Spark. -- **In progress — Signal Engine "flakiness":** diagnosed, not a server bug — transient 1–4s unresponsiveness while the single GPU is continuously busy. Client-side remedy drafted (in-flight cap 2, hard ceiling 3 across audio endpoints, retry-with-backoff on timeout/503), with the owner to forward to that dev. -- **Decided, not implemented:** no public interface / no API token auth — LAN + WireGuard/Tailscale split-tunnel only (the CSRF guard now covers the browser-driven vector). An empirical audio concurrency sweep is offered but needs the owner's OK in a quiet window. -- **Known limits:** `/health` blips while the GPU is busy (mitigated client-side); dual-channel can miss a quiet local word under loud remote bleed; the connectivity log misses sub-5s outages between 5s polls; diarizer caps at 4 speakers. -- **Repo wart:** commit `8d839e3` (was `367d986` pre-rewrite) is labeled `v0.13.0:4` but contains everything through v0.18.0:0 — per-version commits for v0.14–v0.18 don't exist. Keep commit messages accurate. -- **Hosting:** pushes to the owner's self-hosted Gitea — remote `gitea`, branch `master`, over SSH. Push after committing. -- **Next:** (1) owner forwards the concurrency note to the Signal Engine dev; (2) concurrency sweep if the dev wants the measured knee; (3) parakeet-asr `--memory` cap via Reapply-patches; (4) start the `ROADMAP.md` tech-debt list (a pytest harness first). +- **Working (v0.20.0:0, installed and serving):** swap dashboard; chat / transcribe / diarize(+chunk) / TTS proxies; embeddings + rerank + hybrid search (Qdrant); `/scrub` + `/rehydrate`; label-merge incl. dual-channel; per-Spark SSH-key copy + WireGuard `VPN ` hardware-card badge. Spark 2 audio stack healthy. Security hardening (v0.19.0:0 — shellsafe SSH-injection guard, Qdrant path-injection, same-origin CSRF guard) shipped and stable; evidence in `EVALUATION.md`. +- **Tests:** offline pytest harness in `image/tests/` — `cd image && .venv/bin/python -m pytest` (65 passing). Covers `build_launch_command` (incl. the shell-injection round-trip), the transcript↔diarizer label-merge, and the `shellsafe` validators. Mock-heavy swap/proxy tests deliberately skipped (low ROI). Redaction + live-audio suites remain standalone scripts. +- **Signal Engine "flakiness":** diagnosed as *not* a server bug — transient 1–4s unresponsiveness while the single GPU is busy. Client-side remedy (in-flight cap 2 / ceiling 3 / retry-on-timeout+503) drafted and **forwarded to that dev (owner confirmed 2026-06-15)**. Awaiting whether they want the measured concurrency knee. +- **Stance (decided, not built):** no public interface / no API-token auth — LAN + WireGuard/Tailscale split-tunnel only; the CSRF guard covers the browser-driven vector. +- **Known limits:** `/health` blips while the GPU is busy (mitigated client-side); dual-channel can miss a quiet local word under loud remote bleed; connectivity log misses sub-5s outages between 5s polls; diarizer caps at 4 speakers. +- **Infra gotcha (safety):** passwordless sudo is NOT configured on spark2 — design unprivileged probes for any Spark feature (the badge uses `ip`, not `sudo wg show`). spark2 sits on the `starttunnel` WireGuard subnet (`10.59.211.6/24`, survives reboot). Owner declined SSH-key rotation after the 2026-06-12 history scrub (only the key *name* leaked) — don't re-flag. +- **Hosting:** self-hosted Gitea — remote `gitea`, branch `master`, over SSH; push after committing. (Wart: commit `8d839e3` is mislabeled `v0.13.0:4` but contains through v0.18.0:0.) +- **Next:** (1) audio concurrency sweep — only if the Signal Engine dev wants the measured knee; needs owner OK in a quiet window. (2) Otherwise pull from `ROADMAP.md`: local-path/fine-tuned model support (new) or P2 tech-debt. Parakeet long-audio guard is deferred (rationale in ROADMAP).