Files
premier-gunner/EVALUATION.md
T
2026-06-15 12:32:00 -05:00

13 KiB
Raw Blame History

Evaluation — premier-gunner — 2026-06-14

Intent: A kid-friendly soccer training tracker PWA for one player ("Gunner") — daily logging, weekly planning, goals/thermometer, and a dashboard — packaged as a StartOS 0.4.x .s9pk service.

Agents run: evaluator, security-auditor, exerciser, doc-auditor, start9-spec-checker. Skipped: reviewer (working tree is clean — no diff to review).

Verdict

For a one-kid personal app this is genuinely well-built: ~2.4k LOC of clean, dependency-light code that boots cleanly, does exactly what the README claims, and survives restarts, with an unusually polished StartOS packaging layer (full 5-language i18n, random first-boot password, tsc --noEmit green). It achieves its stated intent. The headline risk for the live deployment is that the single-password login has no rate-limiting (src/routes/auth.js:16) while being clearnet-exposed via StartTunnel and permitting a 4-character password (s9pk/startos/actions/index.ts:13) — an unthrottled online brute-force target, flagged independently by two agents. The headline correctness risk is that the personal-best record only ever ratchets up and is never recomputed when the record-setting entry is edited down or deleted (src/routes/entries.js:27), so a marquee feature can silently show a wrong number. Separately, the package is BLOCKED for community-registry submission (broken/private repo URLs, empty assets/, no CI) — but that only matters if submission is a goal; the private StartOS deploy works today.

Cross-referenced findings (corroborated by ≥2 agents)

  • Unthrottled login brute-force — evaluator (P1) + security-auditor (P1) both flag the same path. Compounded by the security-auditor's and evaluator's shared note that the set-password action allows a 4-char minimum. Given the app is clearnet-exposed via StartTunnel (not Tor-only), I elevated this to P0 for the running instance — see Disagreements for the judgment call.
  • No CSRF token beyond SameSite=Lax — evaluator (P2) + security-auditor (P2). Same conclusion: narrow practical window today, worth an origin check. Merged → one P2.
  • Weak password minimum (4 chars; bcrypt truncates >72 bytes) — evaluator (P3) + security-auditor (P3). Merged; folded into the P0 remediation since it compounds the brute-force risk.
  • Manifest repo URLs point to a 404 GitHub repo while the real remote is Gitea — evaluator (P3) + doc-auditor (#7) + start9-spec-checker (BLOCKER ×2). The manifest is objectively wrong/broken regardless of intent; it becomes a submission blocker only if publishing. Merged → one P1.
  • Makefile:4 arch comment contradicts ARCHES := x86 — evaluator (P3) + doc-auditor (surprise). Merged → P3.
  • README "Phase 2 — not yet built" / password "config field" drift — evaluator (P3) + doc-auditor (#1). The s9pk is built and deployed; README still calls it future work. Merged → P3.
  • aarch64 claimed in s9pk/README.md/TODO.md but build is x86-only — doc-auditor (#25) + start9-spec-checker (warning #4). Merged → P3.
  • Phantom CI/CD in CONTRIBUTING.md (no .github/ exists) — doc-auditor (#6) + start9-spec-checker (FAIL). Merged → P3.
  • s9pk/package.json name is template leftover hello-world-startos — doc-auditor (surprise) + start9-spec-checker (WARN). Merged → P3.
  • No automated tests — evaluator (P2) + exerciser (coverage gaps confirm it). Merged → P2.

Priority queue

Every finding from every agent, once each.

  • [P0] Unthrottled brute-force on the single-password login, on a clearnet-exposed instance, with a 4-char password minimum — src/routes/auth.js:16, s9pk/startos/actions/index.ts:13, src/routes/auth.js:39 — evaluator + security-auditor (elevated from P1; see Disagreements)
  • [P1] Personal-best record never recomputed downward; Stats card reads the stale column (src/routes/stats.js:84, public/js/dashboard.js:35) — only write site src/routes/entries.js:27 — evaluator
  • [P1] Shipped manifest packageRepo/upstreamRepo/marketingUrl all point to a 404 GitHub URL; real remote is Gitea — s9pk/startos/manifest/index.ts:8-10 — evaluator + doc-auditor + start9-spec-checker (hard blocker only if submitting to the community registry)
  • [P2] @fastify/static 8.3.0 has known path-traversal/route-bypass advisories (GHSA-x428-ghpx-8j92, GHSA-pr96-94w5-mx2h); no concrete exploit path found here, fixed in ≥9.1.3 — package.json:15, src/server.js:31 — security-auditor
  • [P2] Default password gunner silently used if started outside StartOS without PG_PASSWORD/PG_PASSWORD_HASH; binds 0.0.0.0:3000src/auth.js:45-48, src/config.js:17-18 — security-auditor
  • [P2] No CSRF token; relies solely on SameSite=Laxsrc/routes/auth.js:6-13, src/server.js:35-44 — evaluator + security-auditor
  • [P2] Non-existent metric_id on entry create/edit → 500 leaking raw SQLITE_CONSTRAINT_FOREIGNKEY to the client — POST/PUT /api/entries — exerciser
  • [P2] Invalid metric kind (e.g. "BOGUS") accepted and stored on metric create/update; should be count|duration|score|decimalPOST /api/categories/:id/metrics, PUT /api/metrics/:id — exerciser
  • [P2] Impossible calendar dates (e.g. 2026-13-99) pass the \d{4}-\d{2}-\d{2} regex and persist on /api/day, /api/entries, /api/plans — exerciser
  • [P2] No automated tests anywhere; streak math, record-bump direction, migration idempotency can regress silently — src/routes/stats.js:24-43 et al. — evaluator + exerciser
  • [P2] store.ts schema z.object({ password: z.string() }) has no .catch() default → latent crash if store.json is missing/malformed before the migration runs (main.ts:12 reads it) — s9pk/startos/fileModels/store.ts:8-13 — start9-spec-checker
  • [P2] s9pk/assets/ contains only a 0-byte .gitkeep; docs require ≥1 real file (submission risk) — s9pk/assets/.gitkeep — start9-spec-checker
  • [P3] Cross-category metric reference silently accepted on entry create/edit (metric from a different category stored against the entry) — POST/PUT /api/entries — exerciser
  • [P3] POST /api/logout requires a valid session (401 without one); can't idempotently log out an expired session via API — exerciser
  • [P3] DELETE /api/{metrics,goals,entries}/:id return {ok:true} for non-existent IDs (silent no-op) while category/plan routes 404 — inconsistent — exerciser
  • [P3] Category color stored unvalidated and interpolated into a style attribute (public/js/app.js:98) → CSS-injection by the sole authenticated user — src/routes/categories.js:29-30,50-52 — security-auditor
  • [P3] README "Phase 2 — not yet built" describes the now-shipped s9pk as future work; password described as a "config field" but implemented as an action — README.md:53-64, README.md:58 — evaluator + doc-auditor
  • [P3] README "default password is gunner" omits that StartOS generates a random 16-char password on first install — README.md:33 vs s9pk/startos/versions/current.ts:18-23 — doc-auditor
  • [P3] s9pk/README.md (lines 45, 129, 140) and s9pk/TODO.md:7 claim aarch64; build is x86-only — vs s9pk/Makefile:1, s9pk/startos/manifest/index.ts:17 — doc-auditor + start9-spec-checker
  • [P3] s9pk/Makefile:4 comment "Build x86_64 + aarch64 only" contradicts ARCHES := x86 on line 1 — evaluator + doc-auditor
  • [P3] s9pk/CONTRIBUTING.md:31-36 documents three GitHub Actions workflows; no .github/ directory exists — doc-auditor + start9-spec-checker
  • [P3] s9pk/package.json name is the template default hello-world-startoss9pk/package.json:2-3 — doc-auditor + start9-spec-checker
  • [P3] Root package.json version 0.1.0 lags the s9pk version 0.1.6:0 (intentionally independent, but the root is an unmaintained version source) — evaluator

Scorecard

The evaluator's lens table, with one adjustment from the doc-auditor's evidence.

Lens Score /5 Notes
Architecture 4 Clean configdbauthroutes layering; no-build frontend is a deliberate, well-executed choice (src/server.js:46-51).
Security 3 Solid basics (bcrypt-10, httpOnly+signed cookies, server-side sessions, parameterized SQL) but no login throttling, no CSRF token, a shipped static CVE, and a known-default fallback. The P0 elevation sits inside this lens.
Performance 5 Synchronous better-sqlite3 is correct for one user; queries indexed (src/schema.sql:42-53); no realistic hot path.
Testing 1 Zero tests configured; corroborated by both evaluator and exerciser.
Code quality 4 Consistent style, transactional multi-writes, comments explain why. Counterweight: the exerciser's input-validation gaps (kind, dates, FK 500) are real robustness holes; held at 4 because core paths are correct and the gaps are additive validation, not broken logic.
Documentation 3 (was 4) Adjusted down on the doc-auditor's evidence: the drift is broader than the evaluator sampled — the whole README "Phase 2" section is stale, aarch64 is claimed across three files, and CONTRIBUTING.md documents phantom CI. Still navigable and mostly true, hence 3 not lower.

Disagreements & gaps

  • Severity judgment, not a conflict: the security-auditor explicitly called the missing rate-limiting "not strictly release-blocking for a Tor-only personal deploy." I elevated it to P0 because the documented deployment is clearnet via StartTunnel and the password minimum is 4 chars — the deployment context, not a disagreement about the code, is what tips it. If the instance is in fact Tor-only, drop it back to P1.
  • Submission blockers are conditional: the start9-spec-checker returns an unconditional "BLOCKED," but that is scoped to community-registry submission. The app is already privately deployed and working, so those items (repo URLs, assets/, CI, store.ts .catch()) are P1/P2 here rather than release-blocking — surfaced honestly, severity reflects "blocks publishing, not blocks running."
  • Shared blind spot — no live/browser verification. Every agent stopped at the same edge: the evaluator and security-auditor reasoned from source without running it; the exerciser drove the HTTP API but did not exercise the PWA in a browser (so the kind/calendar-date gaps are unconfirmed as visible UI breakage, and service-worker cache-busting is untested); the start9-spec-checker could not run start-sdk verify (toolchain absent) so the 67 MB .s9pk artifact is unverified; nobody ran make. The vendored public/vendor/chart.umd.min.js was inspected by no one (third-party, unmodified). Treat the personal-best-record bug as reasoned-from-code, not reproduced.

Suggested order of work

  1. Decide the goal first: community-registry submission, or private deploy only? This is the gate — it determines whether the manifest-URL / assets/ / CI / store.ts items are P0 blockers or can wait. (Resolves the conditional severities above.)
  2. Harden the live login now (independent of #1): add @fastify/rate-limit to /api/login (and /api/password), raise the password minimum in both src/routes/auth.js:39 and s9pk/startos/actions/index.ts:13. This protects the running clearnet instance regardless of everything else.
  3. Fix the personal-best record recompute (src/routes/entries.js on edit/delete) and land the first node:test cases in the same pass — cover record-bump direction, streak math, and migration idempotency so the fix is guarded and the Testing lens stops being a 1.
  4. Add server-side input validation — metric kind enum, calendar-date semantics, FK-existence → 400 (not a 500 leak), and the cross-category metric check. Cheap, and clears the entire exerciser P2/P3 input cluster at once.
  5. Upgrade @fastify/static to ≥9.1.3 and re-test static serving (major bump).
  6. Documentation cleanup pass — README Phase 2 + default-password caveat, aarch64 claims, Makefile:4 comment, CONTRIBUTING.md CI section, s9pk/package.json name, root version.
  7. Only if submitting (from #1): publish a public repo, fix the three manifest URLs, populate assets/, add .catch() defaults to store.ts, wire CI, then run start-sdk verify on a real StartOS box and walk the spec-checker's manual pre-submission checklist.

Reminder per AGENTS.md: any code change here also needs a s9pk/startos/versions/current.ts version bump and a public/sw.js CACHE bump; DB changes need schema.sql + an idempotent src/db.js migration + seed.js. Not findings — process the fixes must follow.