Add full-eval report (EVALUATION.md)

This commit is contained in:
Keysat
2026-06-15 12:32:00 -05:00
parent 3fb7a47469
commit bbddebc3d6
+81
View File
@@ -0,0 +1,81 @@
# Evaluation — premier-gunner — 2026-06-14
Intent: A kid-friendly soccer training tracker PWA for one player ("Gunner") — daily logging, weekly planning, goals/thermometer, and a dashboard — packaged as a StartOS 0.4.x `.s9pk` service.
Agents run: evaluator, security-auditor, exerciser, doc-auditor, start9-spec-checker. Skipped: reviewer (working tree is clean — no diff to review).
## Verdict
For a one-kid personal app this is genuinely well-built: ~2.4k LOC of clean, dependency-light code that boots cleanly, does exactly what the README claims, and survives restarts, with an unusually polished StartOS packaging layer (full 5-language i18n, random first-boot password, `tsc --noEmit` green). It achieves its stated intent. The headline risk for the **live** deployment is that the single-password login has **no rate-limiting** (`src/routes/auth.js:16`) while being clearnet-exposed via StartTunnel and permitting a **4-character** password (`s9pk/startos/actions/index.ts:13`) — an unthrottled online brute-force target, flagged independently by two agents. The headline **correctness** risk is that the personal-best `record` only ever ratchets up and is never recomputed when the record-setting entry is edited down or deleted (`src/routes/entries.js:27`), so a marquee feature can silently show a wrong number. Separately, the package is **BLOCKED for community-registry submission** (broken/private repo URLs, empty `assets/`, no CI) — but that only matters if submission is a goal; the private StartOS deploy works today.
## Cross-referenced findings (corroborated by ≥2 agents)
- **Unthrottled login brute-force** — evaluator (P1) + security-auditor (P1) both flag the same path. Compounded by the security-auditor's and evaluator's shared note that the set-password action allows a 4-char minimum. Given the app is clearnet-exposed via StartTunnel (not Tor-only), I **elevated this to P0** for the running instance — see Disagreements for the judgment call.
- **No CSRF token beyond `SameSite=Lax`** — evaluator (P2) + security-auditor (P2). Same conclusion: narrow practical window today, worth an origin check. Merged → one P2.
- **Weak password minimum (4 chars; bcrypt truncates >72 bytes)** — evaluator (P3) + security-auditor (P3). Merged; folded into the P0 remediation since it compounds the brute-force risk.
- **Manifest repo URLs point to a 404 GitHub repo while the real remote is Gitea** — evaluator (P3) + doc-auditor (#7) + start9-spec-checker (BLOCKER ×2). The manifest is objectively wrong/broken regardless of intent; it becomes a submission blocker only if publishing. Merged → one P1.
- **`Makefile:4` arch comment contradicts `ARCHES := x86`** — evaluator (P3) + doc-auditor (surprise). Merged → P3.
- **README "Phase 2 — not yet built" / password "config field" drift** — evaluator (P3) + doc-auditor (#1). The s9pk is built and deployed; README still calls it future work. Merged → P3.
- **aarch64 claimed in `s9pk/README.md`/`TODO.md` but build is x86-only** — doc-auditor (#25) + start9-spec-checker (warning #4). Merged → P3.
- **Phantom CI/CD in `CONTRIBUTING.md`** (no `.github/` exists) — doc-auditor (#6) + start9-spec-checker (FAIL). Merged → P3.
- **`s9pk/package.json` name is template leftover `hello-world-startos`** — doc-auditor (surprise) + start9-spec-checker (WARN). Merged → P3.
- **No automated tests** — evaluator (P2) + exerciser (coverage gaps confirm it). Merged → P2.
## Priority queue
Every finding from every agent, once each.
- **[P0]** Unthrottled brute-force on the single-password login, on a clearnet-exposed instance, with a 4-char password minimum — `src/routes/auth.js:16`, `s9pk/startos/actions/index.ts:13`, `src/routes/auth.js:39` — evaluator + security-auditor (elevated from P1; see Disagreements)
- **[P1]** Personal-best `record` never recomputed downward; Stats card reads the stale column (`src/routes/stats.js:84`, `public/js/dashboard.js:35`) — only write site `src/routes/entries.js:27` — evaluator
- **[P1]** Shipped manifest `packageRepo`/`upstreamRepo`/`marketingUrl` all point to a 404 GitHub URL; real remote is Gitea — `s9pk/startos/manifest/index.ts:8-10` — evaluator + doc-auditor + start9-spec-checker (hard blocker *only if* submitting to the community registry)
- **[P2]** `@fastify/static` 8.3.0 has known path-traversal/route-bypass advisories (GHSA-x428-ghpx-8j92, GHSA-pr96-94w5-mx2h); no concrete exploit path found here, fixed in ≥9.1.3 — `package.json:15`, `src/server.js:31` — security-auditor
- **[P2]** Default password `gunner` silently used if started outside StartOS without `PG_PASSWORD`/`PG_PASSWORD_HASH`; binds `0.0.0.0:3000``src/auth.js:45-48`, `src/config.js:17-18` — security-auditor
- **[P2]** No CSRF token; relies solely on `SameSite=Lax``src/routes/auth.js:6-13`, `src/server.js:35-44` — evaluator + security-auditor
- **[P2]** Non-existent `metric_id` on entry create/edit → 500 leaking raw `SQLITE_CONSTRAINT_FOREIGNKEY` to the client — `POST/PUT /api/entries` — exerciser
- **[P2]** Invalid metric `kind` (e.g. `"BOGUS"`) accepted and stored on metric create/update; should be `count|duration|score|decimal``POST /api/categories/:id/metrics`, `PUT /api/metrics/:id` — exerciser
- **[P2]** Impossible calendar dates (e.g. `2026-13-99`) pass the `\d{4}-\d{2}-\d{2}` regex and persist on `/api/day`, `/api/entries`, `/api/plans` — exerciser
- **[P2]** No automated tests anywhere; streak math, record-bump direction, migration idempotency can regress silently — `src/routes/stats.js:24-43` et al. — evaluator + exerciser
- **[P2]** `store.ts` schema `z.object({ password: z.string() })` has no `.catch()` default → latent crash if `store.json` is missing/malformed before the migration runs (`main.ts:12` reads it) — `s9pk/startos/fileModels/store.ts:8-13` — start9-spec-checker
- **[P2]** `s9pk/assets/` contains only a 0-byte `.gitkeep`; docs require ≥1 real file (submission risk) — `s9pk/assets/.gitkeep` — start9-spec-checker
- **[P3]** Cross-category metric reference silently accepted on entry create/edit (metric from a different category stored against the entry) — `POST/PUT /api/entries` — exerciser
- **[P3]** `POST /api/logout` requires a valid session (401 without one); can't idempotently log out an expired session via API — exerciser
- **[P3]** `DELETE /api/{metrics,goals,entries}/:id` return `{ok:true}` for non-existent IDs (silent no-op) while category/plan routes 404 — inconsistent — exerciser
- **[P3]** Category `color` stored unvalidated and interpolated into a `style` attribute (`public/js/app.js:98`) → CSS-injection by the sole authenticated user — `src/routes/categories.js:29-30,50-52` — security-auditor
- **[P3]** README "Phase 2 — not yet built" describes the now-shipped s9pk as future work; password described as a "config field" but implemented as an action — `README.md:53-64`, `README.md:58` — evaluator + doc-auditor
- **[P3]** README "default password is gunner" omits that StartOS generates a random 16-char password on first install — `README.md:33` vs `s9pk/startos/versions/current.ts:18-23` — doc-auditor
- **[P3]** `s9pk/README.md` (lines 45, 129, 140) and `s9pk/TODO.md:7` claim aarch64; build is x86-only — vs `s9pk/Makefile:1`, `s9pk/startos/manifest/index.ts:17` — doc-auditor + start9-spec-checker
- **[P3]** `s9pk/Makefile:4` comment "Build x86_64 + aarch64 only" contradicts `ARCHES := x86` on line 1 — evaluator + doc-auditor
- **[P3]** `s9pk/CONTRIBUTING.md:31-36` documents three GitHub Actions workflows; no `.github/` directory exists — doc-auditor + start9-spec-checker
- **[P3]** `s9pk/package.json` name is the template default `hello-world-startos``s9pk/package.json:2-3` — doc-auditor + start9-spec-checker
- **[P3]** Root `package.json` version `0.1.0` lags the s9pk version `0.1.6:0` (intentionally independent, but the root is an unmaintained version source) — evaluator
## Scorecard
The evaluator's lens table, with one adjustment from the doc-auditor's evidence.
| Lens | Score /5 | Notes |
|------|----------|-------|
| Architecture | 4 | Clean `config``db``auth``routes` layering; no-build frontend is a deliberate, well-executed choice (`src/server.js:46-51`). |
| Security | 3 | Solid basics (bcrypt-10, httpOnly+signed cookies, server-side sessions, parameterized SQL) but no login throttling, no CSRF token, a shipped static CVE, and a known-default fallback. The P0 elevation sits inside this lens. |
| Performance | 5 | Synchronous better-sqlite3 is correct for one user; queries indexed (`src/schema.sql:42-53`); no realistic hot path. |
| Testing | 1 | Zero tests configured; corroborated by both evaluator and exerciser. |
| Code quality | 4 | Consistent style, transactional multi-writes, comments explain *why*. Counterweight: the exerciser's input-validation gaps (kind, dates, FK 500) are real robustness holes; held at 4 because core paths are correct and the gaps are additive validation, not broken logic. |
| Documentation | **3** *(was 4)* | **Adjusted down** on the doc-auditor's evidence: the drift is broader than the evaluator sampled — the whole README "Phase 2" section is stale, aarch64 is claimed across three files, and `CONTRIBUTING.md` documents phantom CI. Still navigable and mostly true, hence 3 not lower. |
## Disagreements & gaps
- **Severity judgment, not a conflict:** the security-auditor explicitly called the missing rate-limiting "not strictly release-blocking for a Tor-only personal deploy." I elevated it to **P0** because the documented deployment is clearnet via StartTunnel and the password minimum is 4 chars — the deployment context, not a disagreement about the code, is what tips it. If the instance is in fact Tor-only, drop it back to P1.
- **Submission blockers are conditional:** the start9-spec-checker returns an unconditional "BLOCKED," but that is scoped to *community-registry submission*. The app is already privately deployed and working, so those items (repo URLs, `assets/`, CI, `store.ts` `.catch()`) are P1/P2 *here* rather than release-blocking — surfaced honestly, severity reflects "blocks publishing, not blocks running."
- **Shared blind spot — no live/browser verification.** Every agent stopped at the same edge: the evaluator and security-auditor reasoned from source without running it; the exerciser drove the HTTP API but did **not** exercise the PWA in a browser (so the `kind`/calendar-date gaps are unconfirmed as *visible* UI breakage, and service-worker cache-busting is untested); the start9-spec-checker could not run `start-sdk verify` (toolchain absent) so the 67 MB `.s9pk` artifact is unverified; nobody ran `make`. The vendored `public/vendor/chart.umd.min.js` was inspected by no one (third-party, unmodified). Treat the personal-best-record bug as reasoned-from-code, not reproduced.
## Suggested order of work
1. **Decide the goal first: community-registry submission, or private deploy only?** This is the gate — it determines whether the manifest-URL / `assets/` / CI / `store.ts` items are P0 blockers or can wait. (Resolves the conditional severities above.)
2. **Harden the live login now** (independent of #1): add `@fastify/rate-limit` to `/api/login` (and `/api/password`), raise the password minimum in both `src/routes/auth.js:39` and `s9pk/startos/actions/index.ts:13`. This protects the running clearnet instance regardless of everything else.
3. **Fix the personal-best `record` recompute** (`src/routes/entries.js` on edit/delete) **and land the first `node:test` cases in the same pass** — cover record-bump direction, streak math, and migration idempotency so the fix is guarded and the Testing lens stops being a 1.
4. **Add server-side input validation** — metric `kind` enum, calendar-date semantics, FK-existence → 400 (not a 500 leak), and the cross-category metric check. Cheap, and clears the entire exerciser P2/P3 input cluster at once.
5. **Upgrade `@fastify/static` to ≥9.1.3** and re-test static serving (major bump).
6. **Documentation cleanup pass** — README Phase 2 + default-password caveat, aarch64 claims, `Makefile:4` comment, `CONTRIBUTING.md` CI section, `s9pk/package.json` name, root version.
7. **Only if submitting (from #1):** publish a public repo, fix the three manifest URLs, populate `assets/`, add `.catch()` defaults to `store.ts`, wire CI, then run `start-sdk verify` on a real StartOS box and walk the spec-checker's manual pre-submission checklist.
> Reminder per AGENTS.md: any code change here also needs a `s9pk/startos/versions/current.ts` version bump and a `public/sw.js` `CACHE` bump; DB changes need `schema.sql` + an idempotent `src/db.js` migration + `seed.js`. Not findings — process the fixes must follow.