Independent six-lens evaluation (evaluator, security-auditor, exerciser, doc-auditor, start9-spec-checker). Surfaces three P0s on the cloud surface and a registry-submission block; full priority queue in the file.
16 KiB
Evaluation — recap (Recaps) — 2026-06-13
Intent: Recaps is a YouTube + podcast summarizer and personal library, served as a single-page vanilla-JS app from a Node.js (ES module) backend — shipping both as a single-mode StartOS .s9pk self-host package and as the multi-tenant recaps.cc cloud (SQLite, credit-metered, talking to a sibling "relay" service).
Agents run: evaluator, security-auditor, exerciser, doc-auditor, start9-spec-checker. Reviewer skipped (working tree clean — no diff to review). All five completed; none failed.
Verdict
The pure-logic core is genuinely well-built — 103 passing tests, parameterized SQL everywhere, scrypt + timingSafeEqual auth, hashed single-use magic-link tokens, server-derived tenant identity that resists X-Recap-User-Id spoofing, and a correctly server-to-server billing/settlement path. But three independent agents surfaced three P0 issues that the test suite can't see because the riskiest files are untested: an arbitrary file write via /api/library/import (reproduced live — ../../ in a session key escapes the scope dir), an SSRF-with-read-back in the podcast download path (reachable by an anonymous trial), and a live, still-active Gemini API key committed to git history. On top of that the multi-tenant claim is undercut by a process-global lock that serializes every concurrent cloud summarize to one-at-a-time. None of these is hard to fix, but the cloud is not ready for untrusted users until the three P0s are closed and the key is rotated. Self-host single-mode is in much better shape (the file-write and SSRF matter far less with one trusted operator), but a StartOS community-registry submission is BLOCKED on packaging gaps regardless.
Cross-referenced findings (corroboration / merges)
- SSE error-boundary leak — TWO agents, ONE finding. The evaluator (
index.js:3432,:3003,:4246) and the security-auditor (relay.js:135-144→index.js:~4246) independently flagged that relay backend errors are forwarded to cloud users verbatim, violating the AGENTS.md "scrub operator-internal language" contract. The evaluator's surprise sharpens it: the pipeline's own regex atindex.js:3419detects "Parakeet/Gemma/CUDA" in the error string and then forwards that exact string two lines later. One P2, two evidence kinds. - The file-write P0 and a doc finding share a root cause. The exerciser reproduced arbitrary file write in
library.jsbecause the session key is used as a filename withoutsafeFilename(). The doc-auditor independently found thatsafeFilename()is module-private inhistory.js, not an exported shared util — which is whylibrary.jscouldn't reuse it. AGENTS.md tells contributors "safeFilename()already exists — use it," but it isn't importable. Fix the code and the doc together. - The concurrency lock cuts both ways. The evaluator's P1 (global
currentFreeJoblock serializes all tenants) and the security-auditor's P2 (concurrent credit over-spend when the lock is absent, i.e. on licensed installs) are two faces of the sameisFreeUser()/currentFreeJobmechanism: when the lock applies it breaks multi-tenant concurrency; when it doesn't, it lets a 1-credit user fan out N parallel requests before any debit lands. - "Riskiest files are untested" — evaluator + exerciser agree. Both note zero coverage on
/api/processgating,relay.js,tenant-auth.js, and billing. Every P0/P1 code bug above lives in an untested file. The exerciser further notes the real summarize→save→debit happy path could not be run end-to-end (no Gemini key / relay credits) — a shared blind spot, below. - Secrets — evaluator + security-auditor. Both flag
cookies.txtas sensitive plaintext in the repo root; the security-auditor escalates with the far more serious committed-and-active Gemini key in historyd5046a0:.env. Working tree itself is clean. - Version-file sprawl — evaluator + spec-checker. Both independently noted 175 StartOS version files with only ~3 carrying real migrations.
Priority queue
P0 — block the cloud; fix before any untrusted use
- [P0] Arbitrary file write via
/api/library/import—../../in a session key escapes scope; reproduced writing/tmp/rce_test.json&recap/evil.json—server/library.js:131-139(key used verbatim, nosafeFilename()) — exerciser (confirmed in single-mode; confirm multi-mode tenant reachability — it's per-scope so any authed tenant should reach it) - [P0] SSRF with read-back via podcast URL — anon trial can POST
{type:"podcast", url:"http://169.254.169.254/..."}; unguardedhttp.getfollows redirects to any host and the body is transcribed back to the attacker —server/audio.js:78-97(downloadPodcastAudio), gateserver/index.js:2747, reached:3455— security-auditor - [P0] Live Gemini API key committed to git history and still the active key —
git show d5046a0:.env, pushed toorigin/master— security-auditor (rotate now; purge from history)
P1 — correctness / availability / submission blockers
- [P1]
require("crypto")inside an ESM module throwsReferenceErroron the anon license-purchase settle path —server/license-purchase.js:423(called from:353-354) — evaluator - [P1] Global single-flight
currentFreeJoblock serializes the entire multi-tenant cloud (a 2nd concurrent/api/processfrom any user gets 409);isFreeUser()returns true for every tenant in multi-mode —server/index.js:2621,server/license-middleware.js:143— evaluator - [P1] Trial IP-cap + magic-link rate-limit bypass via spoofed
X-Forwarded-For(notrust proxy/ XFF normalization) — unlimited free trial credits if the edge proxy appends rather than replaces XFF —server/anon-trial.js:50-57,server/auth-routes.js:209— security-auditor (deployment-dependent on the recaps.cc proxy) - [P1] StartOS community-registry submission BLOCKED (3 blockers): missing root
instructions.md;packageRepo/upstreamRepopoint tohttps://ten31.xyz(a homepage, not a source repo);license: 'Proprietary'fails the "source available" gate —startos/manifest/index.ts, repo root — start9-spec-checker (only blocks registry submission; does NOT affectmake install)
P2 — real risk / quality, not release-blocking for self-host
- [P2] Operator-internal language (Parakeet/Gemma/CUDA/LAN IPs/
*.local) leaks to cloud users at the SSE error boundary; no scrub exists —server/index.js:3432,3003,4246+server/providers/relay.js:135-144— evaluator + security-auditor - [P2] Concurrent credit over-spend (TOCTOU) on licensed installs — N parallel requests pass the
total>0check before any blinddebitOnelands — gateserver/index.js:2497-2550vs debit:3158,:4197(anon-trial.js:299) — security-auditor - [P2] Multi-mode tenant can spend the operator's server Gemini key directly (bypasses relay metering) by selecting
transcriptionProvider:"gemini"with an empty key —server/providers/index.js:104-114— security-auditor - [P2]
GET /api/historyreads +JSON.parses every full session file (transcript+summary, MB each) just to list ~8 metadata fields —server/history.js:418-437— evaluator - [P2] Dependency CVEs: nodemailer 6.10.1 (high — SMTP injection/DoS, low practical reach here), ws 8.20.0, qs/express, protobufjs 7.5.6 (moderate) —
npm auditinserver/— security-auditor - [P2] Unsanitized IDs persisted into
_meta.jsonvia array-form import andPUT /api/history/move(corrupts the listing; no direct file-path escape) —server/library.js, history move handler — exerciser - [P2]
PUT /api/history/metaaccepts arbitrary JSON shapes with no schema (a wrong-typed body, e.g.{"folders":"not-an-array"}, can break the listing) — exerciser - [P2]
server/index.jsis 4351 lines mixing routing, the full/api/processpipeline, yt-dlp orchestration, and SSE (:2463-4270) — evaluator - [P2] Zero tests on the highest-risk files (
/api/processgating,relay.js,tenant-auth.js, billing) — every code P0/P1 above lives here — evaluator + exerciser
P2/P3 — documentation drift (doc-auditor)
- [P2] Documented credit-gate order in AGENTS.md omits the "paid cloud user" bypass state (between license and free-tenant) —
AGENTS.md:77vsserver/index.js:2464-2472— doc-auditor - [P2] Operator-facing copy is stale Gemini-first / Gemini-only; relay is the default provider —
startos-registry/packages/recap/INSTRUCTIONS.md:5,23,assets/ABOUT.md:3,18— doc-auditor - [P3] AGENTS.md directory layout omits ~25 server modules (audio, library, credits-purchase, tts-routes, relay-default/-capabilities/-state, url-resolver, ytdlp, …) —
AGENTS.md:34-60— doc-auditor - [P3]
relay-client.mdAuthorization header missing theBearerscheme — saysAuthorization: <license>, code sendsAuthorization: Bearer <license_key>—docs/guides/relay-client.md:17(==.claude/rules/relay-client.md:17) — doc-auditor - [P3]
index.htmlline count stated~10k, actually 12,515 —AGENTS.md:51— doc-auditor - [P3] AGENTS.md
safeFilename()convention implies an importable util, but it's module-private inhistory.js—AGENTS.md:79— doc-auditor (see P0 file-write)
P3 — hardening / hygiene
- [P3] 100MB JSON body limit on all routes is a cheap memory-exhaustion lever for unauthenticated
/api/process—server/index.js:203— evaluator - [P3]
/api/credits/claim: a leaked anon BTCPay invoice ID lets any signed-in account claim those credits —server/credits-purchase.js:342— security-auditor - [P3]
downloadPodcastAudiohas no size/time cap (disk/memory fill) —server/audio.js— security-auditor - [P3] Container runs as root (no
USERinDockerfile) — acceptable under StartOS isolation, add non-root for the cloud image — security-auditor - [P3] In-memory auth rate-limit buckets reset on restart —
server/auth-routes.js:106— security-auditor + exerciser - [P3] Stale
youtube-summarizer_x86_64.s9pk(~223MB) + rootpackage.jsonstill namedyoutube-summarizer-startos— leftover from the package-ID rename — start9-spec-checker - [P3] Manifest
docsUrls: []empty; multi-tenant cloud actions exposed in the single-mode StartOS menu must be verified to run cleanly (enableMultiTenantModeet al.) —startos/actions/index.ts— start9-spec-checker - [P3]
cookies.txt(119KB, sensitive yt-dlp plaintext) sits in the repo root and is expiring imminently (/api/healthalready reportsfileExpiring:true) — correctly gitignored — evaluator + exerciser - [P3] 175 StartOS version files, only ~3 with real migrations (172 empty
up/downstubs) — maintenance surface — evaluator + start9-spec-checker
Scorecard
The evaluator's six-lens table, with two lenses adjusted on cross-agent evidence:
| Lens | Evaluator | Adjusted | Why adjusted |
|---|---|---|---|
| Architecture | 3 | 3 | — |
| Security | 4 | 2 | The evaluator sampled index.js/relay.js via grep and found no P0; the exerciser (file write, reproduced) and security-auditor (SSRF read-back + committed live key) each found a P0 the evaluator's sample missed. Three P0s on the cloud surface contradict a 4. |
| Performance | 3 | 3 | — |
| Testing | 3 | 3 | Corroborated by the exerciser; every code P0/P1 lives in an untested file. Borderline 2. |
| Code quality | 3 | 3 | — |
| Documentation | 4 | 3 | The doc-auditor found moderate drift the evaluator's "candid docs" read didn't weigh: a missing gate-order state, a directory layout missing half the server, and stale Gemini-first operator instructions. |
Not a lens, but tracked separately: StartOS packaging = BLOCKED for registry submission (3 hard blockers), fine for make install.
Disagreements & gaps
- Disagreement on Security posture. The evaluator rated Security 4 ("fundamentals done right"); the security-auditor and exerciser found three P0s. Both are partly right — the fundamentals (SQL, auth, billing trust model) genuinely are strong, but the user-controlled-input → side-effect paths (
library.jsfilename,audio.jsURL fetch) were the evaluator's blind spot because it sampled the big files by grep rather than readinglibrary.js/audio.jsin full. Adjusted Security to 2 to reflect the reproduced exploit. - Shared blind spot — the real pipeline was never run end-to-end. No agent had a Gemini key or relay credits, so the actual summarize →
saveToHistory→ credit-debit happy path under concurrency is unverified by anyone. The credit-TOCTOU (P2) and the concurrency lock (P1) are reasoned from code, not observed. This is the single highest-value gap to close with a live integration test. - Deployment-dependent finding. The
X-Forwarded-Forbypass (P1) can't be confirmed from this repo — it hinges on whether the recaps.cc edge proxy appends or overwrites XFF. Self-hosted StartOS (tunnel sets XFF) is unaffected. - Spec-checker couldn't fully verify two things read-only: whether the local
start-clipack hard-fails withoutinstructions.md(itslist-ingredientsomits it — ambiguous), and whether the multi-tenant actions run cleanly in single mode (needs an on-device run).
Surprises (carried forward from every agent)
- The pipeline detects relay errors containing "Parakeet/Gemma/CUDA" (
index.js:3419) and then forwards that exact string to cloud users two lines later — it knows the secret and leaks it anyway. (evaluator) - The SSRF guard the codebase clearly knows it needs — it strips
::ffff:and validates IPs for the trial cap — is entirely absent from the one place that fetches a user-controlled URL. (security-auditor) - The billing/settlement trust model (a common place for funds bugs) is correctly server-to-server, relay-verified, and idempotent — a positive surprise. (security-auditor)
- The first
/api/processafter boot triggers a yt-dlp + Homebrew self-update, adding 5–10s latency before the request completes. (exerciser) - AGENTS.md's server directory layout documents only a skeleton — more than half the
server/*.jsfiles are unlisted; it reads like it was written when the server was much smaller. (doc-auditor) start-cli s9pk list-ingredientsdoes not includeinstructions.mdeven though the docs say the build fails without it, and the rootpackage.jsonis still namedyoutube-summarizer-startos. (start9-spec-checker)
Suggested order of work
- Rotate the leaked Gemini key now, then purge
d5046a0:.envfrom history (BFG/filter-repo) and force-push. Independent of all code — do it first. (P0 secret) - Close the two reachable code P0s: add an SSRF guard to
downloadPodcastAudio(reject private/link-local/loopback IPs, https-only, re-validate on redirect, size/time cap — folds in the P3 cap), and validate the import session key. ExportsafeFilename()as a shared util and call it fromlibrary.js+ the history-move/import paths (this also closes the P2_meta.jsonpollution). (P0 file-write + SSRF) - Fix the two P1 correctness bugs: the ESM
require()inlicense-purchase.js, and scope the concurrency lock per-identity (or skip it entirely) in multi-mode. (P1) - Add a live integration test for
/api/processgating + library import + a happy-path summarize-with-debit (with a stub/real key) — this is the shared blind spot and the regression net for steps 2–3. Then close the credit TOCTOU and the SSE error-boundary scrub against that test. (P2) - Patch dependency CVEs (
npm audit fix; nodemailer is a major bump) and the multi-mode server-key fallback. (P2) - Reconcile the docs: add the "paid cloud user" gate-order state, rewrite the operator INSTRUCTIONS.md / ABOUT.md as relay-default (not Gemini-first), fix the directory layout and the Bearer/line-count/
safeFilenamenits. (P2/P3) - Only if submitting to the registry: add
instructions.md, pointpackageRepo/upstreamRepoat a public source repo, choose a source-available license, and remove the staleyoutube-summarizer_x86_64.s9pk. Otherwise these are non-issues formake install. (P1 conditional)