docs: refresh Current state after P1/P2 security pass; move P3+ to ROADMAP

This commit is contained in:
Keysat
2026-06-13 18:28:27 -05:00
parent 3e33728013
commit e4c6c30ee3
2 changed files with 23 additions and 38 deletions
+13 -38
View File
@@ -43,7 +43,8 @@ server/
chunked-analyze.js windowed analyze (planWindowsByDuration, runPipelinedAnalysis, …) chunked-analyze.js windowed analyze (planWindowsByDuration, runPipelinedAnalysis, …)
config.js getConfigSnapshot() + relay_* config defaults config.js getConfigSnapshot() + relay_* config defaults
hardware-config.js resolveHardwareConfig() → Spark Control endpoint discovery hardware-config.js resolveHardwareConfig() → Spark Control endpoint discovery
test/ node --test files (speaker-clustering, meeting-speaker-edits, credits) safe-url.js SSRF guard: assertPublicHttpUrl + safeFetch for caller-supplied URLs
test/ node --test *.test.js (speaker tools, billing/credits, SSRF, path-traversal, …)
public/dashboard.html operator dashboard (meetings detail view + speaker tools) public/dashboard.html operator dashboard (meetings detail view + speaker tools)
startos/versions/<vN>.ts one file per version + index.ts graph startos/versions/<vN>.ts one file per version + index.ts graph
docs/issues-backlog.md detailed issue log docs/issues-backlog.md detailed issue log
@@ -120,6 +121,7 @@ this. When unsure whether a change is contract-affecting, assume it is and check
- **Before editing the internal-meetings / diarization / speaker subsystem, read `docs/guides/internal-meetings.md`** — the diarize→cluster→polish pipeline, the four-places speaker-label sync rule, the clustering-threshold knobs, and the post-hoc speaker-edit (merge / recluster / repolish) semantics live there. Scoped to `server/{speaker-clustering,post-cluster-polish,meeting-extras,meeting-speaker-edits,chunked-analyze}.js`, `server/routes/internal-meetings.js`, `server/backends/hardware.js`. - **Before editing the internal-meetings / diarization / speaker subsystem, read `docs/guides/internal-meetings.md`** — the diarize→cluster→polish pipeline, the four-places speaker-label sync rule, the clustering-threshold knobs, and the post-hoc speaker-edit (merge / recluster / repolish) semantics live there. Scoped to `server/{speaker-clustering,post-cluster-polish,meeting-extras,meeting-speaker-edits,chunked-analyze}.js`, `server/routes/internal-meetings.js`, `server/backends/hardware.js`.
- **Doc layout**: `AGENTS.md` is canonical; `CLAUDE.md` is a symlink to it (don't overwrite it). Subsystem guides are real files in `docs/guides/<topic>.md` (with `paths:` frontmatter); `.claude/rules/<topic>.md` are relative symlinks into them (`.gitignore` carves out `!.claude/rules/` so the symlinks commit). New guide = add `docs/guides/<topic>.md`, symlink it from `.claude/rules/`, add an index line above. - **Doc layout**: `AGENTS.md` is canonical; `CLAUDE.md` is a symlink to it (don't overwrite it). Subsystem guides are real files in `docs/guides/<topic>.md` (with `paths:` frontmatter); `.claude/rules/<topic>.md` are relative symlinks into them (`.gitignore` carves out `!.claude/rules/` so the symlinks commit). New guide = add `docs/guides/<topic>.md`, symlink it from `.claude/rules/`, add an index line above.
- **Fetching a caller-supplied URL? Go through `server/safe-url.js`** (`safeFetch` / `assertPublicHttpUrl`) — the SSRF guard that rejects non-http(s) schemes and hosts resolving to private/loopback/link-local/reserved ranges, and re-validates every redirect hop. `downloadDirect` (the transcribe-url/summarize-url/admin-test-run download path) already routes through it; never raw-`fetch` an untrusted URL. Calls to the operator's OWN hardware/LAN use `lan-fetch.js` instead — those URLs are config-set and intentionally private.
- **`make install` correctness**: see [Always]. Honest reports; failing test/build is a failure. Comments explain WHY. Write tests alongside (`server/test/*.test.js`, `node --test`). - **`make install` correctness**: see [Always]. Honest reports; failing test/build is a failure. Comments explain WHY. Write tests alongside (`server/test/*.test.js`, `node --test`).
## Always ## Always
@@ -136,41 +138,14 @@ this. When unsure whether a change is contract-affecting, assume it is and check
- **Never edit a `startos/versions/<v>.ts` that's already been built/installed** — add a new version file. - **Never edit a `startos/versions/<v>.ts` that's already been built/installed** — add a new version file.
- **Don't push to GitHub by default** — remote is self-hosted Gitea. - **Don't push to GitHub by default** — remote is self-hosted Gitea.
## Current state — full eval done (2026-06-13); findings triaged below ## Current state — post-eval security pass landed (2026-06-13)
- **Box, local tree, and git aligned at relay `0.2.124`** (app at `0.2.155`). `startos/versions/index.ts` `current: v_0_2_124`. Git history is local-only (no remote). Working tree is clean apart from an untracked `server/package-lock.json` left by the eval's `npm install` — a generated artifact, intentionally NOT committed. - **Box, local tree, git aligned at relay `0.2.124`** (app `0.2.155`); `current: v_0_2_124`. Git is local-only (no remote). Working tree clean. **Suite green at 60 tests** (`cd server && npm test`); server boots clean.
- **Full independent evaluation run 2026-06-13** (evaluator + security-auditor + exerciser + doc-auditor + start9-spec-checker). Report committed at `EVALUATION.md` (`b08e836`); it's overwritten in place each run so re-running gives a reviewable diff. 47/47 tests still pass; server boots clean. Findings triaged into the three buckets below. - **Full independent eval done** (evaluator + security-auditor + exerciser + doc-auditor + start9-spec-checker) `EVALUATION.md` (overwritten in place each run, so re-running diffs cleanly).
- **Post-hoc speaker tools remain live**: `meeting-speaker-edits.js` (merge / recluster / repolish + backfill) + the `PATCH/POST /admin/internal-meetings/:id/{merge-speakers,recluster,repolish}` routes; dashboard exposes the controls. - **All P0/P1 fixed** this session (commits `8ad7c54`/`d2caa98`/`3a601e1`): SSRF guard on caller-supplied media URLs (new `server/safe-url.js`), the early-renewal credit-reset money-leak (`extendUserTier`/`setUserTier` `resetCycle`), and the `multer``^2.0.1` DoS bump. None touch the `../recap` client contract.
- **Three P2 fixed** (commits `cbd9748`/`da1bba2`/`693d724`): meeting-`:id` path-traversal guard (`meetingPath()`), constant-time operator-key compare, and a JSON error handler that closes the malformed-body stack-trace leak.
### Work queue — P0/P1 — DONE (2026-06-13, commits `8ad7c54`/`d2caa98`/`3a601e1`) - **Next (open P2), in priority order:**
1. Persist webhook dedup so a restart can't double-credit/double-extend — `routes/credits.js:63`, `zaprite-webhook.js:27`.
All three P1 items are fixed and committed; suite is green at 57 tests (was 47). None touch the `../recap` client contract (a blocked URL is just a failed-download job; credit accounting is internal; multer is a dep-only bump). 2. **Needs operator decision:** is BTCPay a hard requirement or truly optional? It's `optional:false`/`kind:'running'` despite "optional" comments, so StartOS won't start the relay without BTCPay co-installed — `startos/manifest/index.ts:38-49` + `dependencies.ts`. Then make manifest/deps/comment agree.
3. Money-path unit tests (`commitCredit`/`refundCredit`/`applyTierPromotion`/grant handlers); scope `cors()` off `/admin/*` (`index.js`); split the 2225-line `routes/internal-meetings.js`; fix the two AGENTS.md auth-doc drifts (Stack-line `/relay/*` auth; missing `/admin/btcpay/callback` exempt path).
1.**SSRF on `/relay/transcribe-url` + `/relay/summarize-url`** — added `server/safe-url.js` (`assertPublicHttpUrl` rejects non-http(s) + private/loopback/link-local/reserved hosts; `safeFetch` follows redirects manually and re-validates each hop). `downloadDirect` routes through it — covers transcribe-url, summarize-url, admin-test-run. Tests: `server/test/safe-url.test.js`. *Residual:* DNS-rebinding TOCTOU (resolve-then-connect) not closed — acceptable for the private box; revisit if exposed. - **Risks/notes:** SSRF guard leaves a DNS-rebinding TOCTOU open (acceptable for a private box; revisit if exposed). P3+ deferred tail + pre-existing speaker-tool/empty-section backlog → `ROADMAP.md` / `docs/issues-backlog.md`.
2.**Billing money-leak: renewal reset `monthly_consumed = 0`**`setUserTier` gained a `resetCycle` flag (default true = operator-grant behavior); `extendUserTier` passes `resetCycle:false` for an in-force sub (preserve counter), `true` only for new/lapsed. Monthly resets still happen via `ensureRenewalRollover`. Regression tests added.
3.**`multer` DoS CVEs** — bumped `^1.4.5-lts.1``^2.0.1` (installed 2.1.1); smoke-tested that a malformed multipart now yields a clean 4xx instead of crashing. Committed `server/package-lock.json` so the Dockerfile `npm ci` path pins it.
### Known debt — P2 (accepted for now; fix opportunistically)
**Done 2026-06-13** (commits `cbd9748`/`da1bba2`/`693d724`; suite green at 60 tests):
- ✅ Path traversal on internal-meetings `:id` — centralized `meetingPath()` in `routes/internal-meetings.js` sanitizes to `[A-Za-z0-9_-]` (mirrors `output-store.js`); `saveMeeting`/`loadMeeting`/`deleteMeeting` all route through it. Test: `test/meeting-path.test.js`.
- ✅ Non-constant-time operator-key compare — `identity.js` now uses a `timingSafeEqual`-based `constantTimeEqual` in `resolveIdentity` + `verifyOperatorKey`.
- ✅ Malformed-body stack-trace leak — added a final error-handling middleware in `index.js` (JSON 400 for `entity.parse.failed`, 413 for too-large, generic 500 otherwise). Verified end-to-end: the exerciser's repro now returns `{"error":"invalid JSON body"}` with no stack.
Still open:
- In-memory webhook dedup Set lost on restart → double-credit/double-extend — `routes/credits.js:63`, `zaprite-webhook.js:27`; persist processed invoice/order ids. *(security-auditor)*
- BTCPay declared `optional:false`/`kind:'running'` despite "optional" comments → StartOS won't start the relay without BTCPay co-installed — `startos/manifest/index.ts:38-49`, `startos/dependencies.ts`. Decide, then make manifest + dependencies + comment agree. *(start9-spec-checker)*
- No money-path unit tests (`commitCredit`/`refundCredit`/`applyTierPromotion`/`planBackend`/grant handlers) — why the P1 billing bug ships green. *(evaluator)*
- `routes/internal-meetings.js` is 2225 lines; extract the MD/HTML formatters + storage/backfill layer. *(evaluator)*
- Fully-open `cors()` incl. `/admin/*` — scope origins — `server/index.js:54`. *(evaluator)*
- Doc drift: AGENTS.md "Stack" line mis-states `/relay/*` auth (most routes are per-call header auth; only `routes/user-tier.js` needs the operator key); the admin-exempt list omits `/admin/btcpay/callback` (`admin-auth.js:70`). *(doc-auditor)*
### Deferred — P3+ (later decision or bulk cleanup)
- Security hardening: no `/relay/*` rate limiting; container likely runs as root (entrypoint `chown`s uid 1001 but no `USER` directive); dashboard `innerHTML` stored-XSS surface; `lan-fetch` TLS verify off (admin-set URL only); debug/error fields leaked to clients. *(security-auditor + evaluator)*
- Packaging/ops: prune the 126 `startos/versions/*.ts` files; pin `yt-dlp` in the Dockerfile; Dockerfile per-subdir `COPY` footgun; manifest polish (SPDX license, `docsUrls`, real repo URLs, icon format); no `README.md` (blocks public-registry submission only — moot for this private box). *(start9-spec-checker + evaluator)*
- `/relay/health` reports stale `0.2.11``server/package.json` never bumped past 0.2.11; bump it to track the StartOS version. *(exerciser + doc-auditor)*
- Doc fixes (bulk): the `test/` layout lists 3 of 6 files; `server/index.js:3-6` "two endpoints" header comment is stale; `POST /admin/logout` undocumented. *(doc-auditor)*
- Untested blind spot: the live upload → merge → recluster → repolish pipeline (admin-gated + needs Spark Control) has only unit coverage; the dependency audit ran offline — re-run `npm audit`/`osv-scanner` with network to confirm the multer finding and catch transitive CVEs. *(all agents)*
**Pre-existing backlog** (separate from the eval): speaker-tool follow-ups and the empty-analysis-section issue — see `ROADMAP.md` / `docs/issues-backlog.md`.
+10
View File
@@ -14,6 +14,16 @@ Longer-term backlog for the relay. Near-term in-flight work + known box/local st
- **Empty analysis section at a window boundary** (observed v0.2.77 smoke test). Likely the LLM returning an empty `{title:"",summary:""}` section the stitcher accepts, or a window-merge boundary hole. Low priority. Full triage path in `docs/issues-backlog.md`. - **Empty analysis section at a window boundary** (observed v0.2.77 smoke test). Likely the LLM returning an empty `{title:"",summary:""}` section the stitcher accepts, or a window-merge boundary hole. Low priority. Full triage path in `docs/issues-backlog.md`.
## Post-eval P3+ backlog (full eval 2026-06-13 — deferred, low risk for the private box)
From `EVALUATION.md`. P1 + three P2 items already fixed (see git log `8ad7c54``693d724`); these are the deferred tail.
- **Security hardening:** no `/relay/*` rate limiting; container likely runs as root (entrypoint `chown`s uid 1001 but no `USER` directive); dashboard `innerHTML` stored-XSS surface; `lan-fetch` TLS verify off (admin-set URL only); debug/error fields leaked to clients.
- **Packaging/ops:** prune the 126 `startos/versions/*.ts` files; pin `yt-dlp` in the Dockerfile; the Dockerfile per-subdir `COPY` footgun; manifest polish (SPDX license, `docsUrls`, real repo URLs, icon format); no `README.md` (blocks public-registry submission only — moot for this private box).
- **`/relay/health` reports stale `0.2.11`** — `server/package.json` version never bumped past 0.2.11; bump to track the StartOS version.
- **Doc fixes (bulk):** the `test/` layout line; `server/index.js:3-6` "two endpoints" header comment is stale; `POST /admin/logout` undocumented.
- **Untested blind spot:** the live upload → merge → recluster → repolish pipeline (admin-gated + needs Spark Control) has only unit coverage; re-run `npm audit`/`osv-scanner` with network to catch transitive CVEs the offline audit missed.
## Adjacent (lives in `../recap`) ## Adjacent (lives in `../recap`)
The app surfaces relay features but owns its own roadmap. Relay-side items the app is waiting on, or that change app behavior, belong in `../recap/ROADMAP.md` under its "Adjacent" section — keep them cross-referenced, not duplicated. The app surfaces relay features but owns its own roadmap. Relay-side items the app is waiting on, or that change app behavior, belong in `../recap/ROADMAP.md` under its "Adjacent" section — keep them cross-referenced, not duplicated.