Files
recap-relay/EVALUATION.md
T
2026-06-13 15:52:05 -05:00

13 KiB
Raw Blame History

Evaluation — recap-relay — 2026-06-13

Intent: An operator-side, credit-metered relay that fronts Gemini and the operator's local AI hardware ("Spark Control"), owning transcription/diarization/analysis routing, the cloud Pro/Max tier + expiry, self-serve billing settlement (BTCPay/Zaprite), and the internal-meetings pipeline — a private Start9 s9pk that ships only to the operator's own box.

Agents run: evaluator, security-auditor, exerciser, doc-auditor, start9-spec-checker. (reviewer skipped — working tree is clean, no diff to review.)

Verdict

Recap Relay substantially achieves its stated intent: the pipeline works end-to-end, 47/47 tests pass, the server boots clean, webhook signatures verify correctly, and secret hygiene (on disk, over the wire, in git history) is genuinely careful with unusually good "why" comments. The headline risk is a server-side request forgery on /relay/transcribe-url and /relay/summarize-url — three agents independently landed on it: downloadDirect fetches any caller-supplied media_url with redirect-following and no scheme/host allowlist, and the route is reachable by anyone presenting a self-chosen X-Recap-Install-Id, so it is unauthenticated in practice and sits on the operator's LAN beside other StartOS services. Close behind are a billing money-leak (mid-cycle renewal resets the monthly credit counter to zero) and a known-vulnerable multer@1.4.5 DoS line. None of these block a single-operator private sideload today, but the SSRF and the billing reset should be fixed before this relay sees any broader or adversarial traffic. Packaging is well-formed for sideload; the only public-registry blocker (a missing README) is moot for a package that intentionally never leaves the operator's box.

Cross-referenced findings

  • SSRF via media_url — corroborated by three agents. evaluator (P1) and security-auditor (P1) both pin server/routes/transcribe-url.js:99-159 downloadDirect doing fetch(url, {redirect:"follow"}) with no validation; the exerciser independently observed file:///javascript:///empty-host URLs accepted at input (failing only downstream). security-auditor's added insight: any client can self-assign an install-id (auto-creates a Core row, no allowlist), so it's effectively unauthenticated. One P1, three kinds of evidence.
  • Path traversal on internal-meetings :id — security-auditor (P2) and exerciser (P2) both reproduced path.join(meetingsDir, id + '.json') resolving outside the directory (e.g. ../../../etc/passwd.json). Admin-cookie gated, so self-limited on a one-operator box, but a latent traversal. output-store.js:52 already shows the correct sanitize pattern the meetings store omits. One P2, two reproductions.
  • Non-constant-time operator-key compare — evaluator (P2) and security-auditor (P2) both flag server/identity.js:43/84 using !== on the high-value shared relay_cloud_operator_key, inconsistent with the timingSafeEqual used on the admin/webhook paths. One P2.
  • Vulnerable multer@1.4.5-lts.2 — security-auditor (P1, CVE-2025-47944/47935/48997/7338 DoS) and exerciser (npm install warning; worth tracking v2). One P1.
  • Stale reported version 0.2.11 — exerciser (/relay/health reports 0.2.11) and doc-auditor (server/package.json lags the StartOS 0.2.124) describe the same root cause from two angles. One P3.
  • 126 startos/versions/*.ts files — evaluator (P3, dead weight) and start9-spec-checker (Surprise: ~one revision per deploy) both noted the version-graph bulk; the graph is well-formed, just heavy. One P3.
  • Billing-grant integrity — evaluator's renewal-reset (P1) and security-auditor's webhook-replay-double-credit (P2) are distinct bugs in the same money path that compound: a webhook double-fire across a restart both re-grants and resets the monthly counter. Kept separate below, cross-referenced.

Priority queue

  • [P1] SSRF: downloadDirect fetches any media_url with redirect-follow, no scheme/host allowlist; reachable via self-chosen install-id — server/routes/transcribe-url.js:99-159, summarize-url.js — evaluator, security-auditor, exerciser
  • [P1] Mid-cycle/early renewal resets monthly_consumed = 0, handing back a full monthly credit allotment for free — server/credits.js:770 extendUserTiersetUserTier :641 — evaluator
  • [P1] multer@1.4.5-lts.2 DoS CVEs (malformed multipart crashes the process); upgrade to ^2.0.1server/package.json:15, used at routes/internal-meetings.js:1669 — security-auditor, exerciser
  • [P2] Path traversal on internal-meetings :id (admin-gated); validate against ^[A-Za-z0-9_-]+$ before path.joinroutes/internal-meetings.js:84,91,242 — security-auditor, exerciser
  • [P2] Non-constant-time operator-key comparison (timing oracle on a high-value key) — server/identity.js:43,84 — evaluator, security-auditor
  • [P2] Webhook replay double-credit/double-extend: dedup Set is in-memory, lost on restart; persist processed invoice/order ids — routes/credits.js:63, zaprite-webhook.js:27 — security-auditor
  • [P2] Malformed JSON body returns a full Node stack trace (local FS paths) — no Express error handler for entity.parse.failed — any JSON-body route, repro on /relay/analyze — exerciser
  • [P2] BTCPay declared optional:false / kind:'running' despite comments calling it optional → StartOS refuses to start the relay unless BTCPay is co-installed and running — startos/manifest/index.ts:38-49, startos/dependencies.ts — start9-spec-checker
  • [P2] No unit tests on the money path (commitCredit, refundCredit, applyTierPromotion, planBackend, webhook grant handlers) — this is why the P1 renewal bug ships green — server/test/ — evaluator
  • [P2] routes/internal-meetings.js is 2225 lines mixing storage, two backfill migrations, MD/HTML formatters, the upload pipeline, and ~14 handlers; extract formatters + storage — evaluator
  • [P2] cors() is fully open and applies to /admin/* too; scope origins, don't wildcard admin — server/index.js:54 — evaluator
  • [P2] AGENTS.md:8 says all /relay/* sit "behind the operator key" — wrong; most are per-call header auth, only routes/user-tier.js needs the operator key — doc-auditor
  • [P2] AGENTS.md:61 omits the third admin-exempt path /admin/btcpay/callback (state-token, not cookie) — server/admin-auth.js:70 — doc-auditor
  • [P3] No request rate limiting on public /relay/*; expensive transcribe/SSE paths can be hammered — evaluator
  • [P3] Container likely runs as root: entrypoint chowns /data to uid 1001 but the image declares no such user and no USER directive — docker_entrypoint.sh, Dockerfile — security-auditor
  • [P3] Stored XSS surface: dashboard renders user-supplied title/media_url via innerHTML; prefer textContent/escape (HTML export already escapes) — public/dashboard.html — security-auditor
  • [P3] lan-fetch.js disables TLS verification (rejectUnauthorized:false); safe only because the Spark Control URL is admin-set — document that invariant — security-auditor
  • [P3] _ln_debug/_debug and verbose error message fields returned to clients on credits/tier flows; trim before any broad exposure — security-auditor
  • [P3] 126 startos/versions/*.ts files for a one-box package; prune to the last installed baseline — evaluator, start9-spec-checker
  • [P3] /relay/health reports 0.2.11 (server/package.json never bumped past 0.2.11; StartOS is at 0.2.124) — misleading to monitoring — exerciser, doc-auditor
  • [P3] No README.md at repo root (would block public-registry submission only; non-blocking for the private sideload) — start9-spec-checker
  • [P3] yt-dlp installed unpinned in the Dockerfile (pip3 install ... yt-dlp) — pin a version for reproducible builds — start9-spec-checker
  • [P3] AGENTS.md:46 test/ layout lists 3 files; six exist (adds tier-expiry, zaprite, polish-speaker-labels) — doc-auditor
  • [P3] server/index.js:3-6 header comment still says "Two public endpoints" — file now mounts 17 relay routes + admin groups — doc-auditor, evaluator
  • [P3] POST /admin/logout (admin-auth.js:131) is undocumented in the AGENTS.md endpoint contract — doc-auditor
  • [P3] Dockerfile per-subdir COPY footgun: a new server/<subdir>/ is silently omitted from the image (comment warns, but no static check) — start9-spec-checker
  • [P3] Manifest polish (informational, registry-only): license:'Proprietary' not SPDX, docsUrls:[], bare-domain repo URLs, icon.png vs .svg, Dockerfile Node 20 vs documented Node 22 — start9-spec-checker

Scorecard

Lens Score /5 Notes (synthesis)
Architecture 4 Clean per-route auth, decoupled identity/credits/jobs layers, sound planBackend routing. Scale pain: one 2225-line route file, 126 packaging version files. (evaluator)
Security 3 Strong webhook HMAC verify, masked secrets, 0o600 files, server-UUID ids. Gaps corroborated by the auditor: SSRF (P1), multer CVE (P1), timing compare, open CORS, likely container-root, no rate limiting. (evaluator + security-auditor concur)
Performance 4 Realistic load is one operator; serial JSON-ledger writes adequate and acknowledged; bounded O(N³) clustering; audit log rotates at 64MB; minor ledger TOCTOU. (evaluator)
Testing 3 47 pass, clustering/edits-heavy. No money-path or route-level tests — the P1 billing bug ships green because tier-expiry.test.js:63 asserts only the expiry date, not the credit counter. (evaluator, exerciser)
Code quality 4 Consistent style, excellent why-comments, honest TODOs. Dead-weight risk in the 2225-line file; SSRF/timing gaps suggest no security pass; no linter config. (evaluator)
Documentation 3.5 Adjusted down from evaluator's 4. AGENTS.md + the subsystem guide are largely accurate, but doc-auditor found moderate drift — notably AGENTS.md:8 mis-stating /relay/* auth, a missing exempt path, and a stale test list. No README. (evaluator, sharpened by doc-auditor)

Disagreements & gaps

  • Apparent disagreement on the SSRF severity — the exerciser saw hostile media_urls "fail gracefully downstream" while evaluator and security-auditor rate it P1. Not a real conflict: graceful download failure doesn't stop the request from reaching internal hosts, and the failure timing/messages are themselves the SSRF signal. The exerciser simply had no internal target to hit from its sandbox. Resolution: P1 stands.
  • Shared blind spot — the live upload/analyze pipeline. Every agent skipped or only sampled the deep internals of chunked-analyze.js, post-cluster-polish.js, meeting-extras.js, gemini.js, and hardware.js, and none could exercise POST /admin/internal-meetings/upload end-to-end (admin-gated + needs live Spark Control). The most complex state-mutating path (upload → merge → recluster → repolish) is unverified outside its unit tests.
  • Dependency audit ran offline. npm audit had no network in-sandbox (0 vulns is unreliable); the multer finding was confirmed via web advisories, but transitive CVEs could be missed. A live npm audit/osv-scanner would raise confidence.

Suggested order of work

  1. Close the SSRF (P1) — in downloadDirect, reject non-http(s) schemes, DNS-resolve and reject private/loopback/link-local hosts, and re-validate on each redirect hop (or stop following redirects to internal ranges). This is the one finding with adversarial reach.
  2. Fix the billing money-path (P1) and lock it with tests — make extendUserTier preserve monthly_consumed/last_renewal_at and only roll them on a true period boundary; persist webhook dedup so a restart can't double-grant; add unit tests for commitCredit/refundCredit/applyTierPromotion and the grant handlers (do this before trusting any green run on billing).
  3. Upgrade multer to ^2.0.1 (P1) and re-test the file upload field.
  4. Add input-validation guards (P2) — validate :id against the UUID shape before path.join, and register an Express error handler that turns entity.parse.failed into a JSON 400 instead of a stack trace.
  5. Resolve the BTCPay dependency contradiction (P2) — decide optional vs required, then make the manifest (optional), dependencies.ts (kind), and the code comment agree; this governs whether the relay can even start without BTCPay.
  6. Tighten the small security gaps and refresh the docs (P2P3) — constant-time operator-key compare, scope CORS off /admin/*, then correct AGENTS.md (the /relay/* auth statement, the /admin/btcpay/callback exempt path, the test list) and bump server/package.json so /relay/health stops reporting 0.2.11.