diff --git a/AGENTS.md b/AGENTS.md
index 20204b1..25a805c 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -77,5 +77,5 @@ Live on StartOS (deploy host set in `~/.startos/config.yaml` `host:`, not in thi
 - **In progress**: none — all requested features are built, committed, and deployed.
 - **Decided, not yet done**: reconcile in-app password change with the StartOS action (env wins on restart); optional "log another" for a second same-category session in a day. See `ROADMAP.md`.
 - **Known issues**: changing the password from the app's own Settings reverts on restart under StartOS — use the action.
-- **Eval backlog**: a full evaluation lives in `EVALUATION.md` — remaining items include the `@fastify/static` upgrade, input-validation gaps (metric `kind`, calendar dates, FK 500), CSRF, and no test suite. Registry-submission blockers are intentionally parked (not publishing).
-- **Next steps**: (1) set a real login password via the "Set Login Password" action; (2) confirm speed unit (`mph` vs `km/h`); (3) work the `EVALUATION.md` P2 backlog if desired.
+- **Eval backlog**: deferred P2/P3 items from `EVALUATION.md` are catalogued in `ROADMAP.md` → Evaluation backlog (registry-submission blockers parked — not publishing).
+- **Next steps**: (1) set a real login password via the "Set Login Password" action; (2) confirm speed unit (`mph` vs `km/h`); (3) work the ROADMAP eval backlog if desired.
diff --git a/ROADMAP.md b/ROADMAP.md
index 7511966..cbff359 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -9,6 +9,17 @@ Longer-term backlog and deferred decisions. Near-term status lives in `AGENTS.md
 - Per-category drill ideas on demand.
 - Config via env-var names (endpoint URL, model); no keys in the repo.
 
+## Evaluation backlog
+
+A full independent evaluation lives in `EVALUATION.md` (committed; re-runnable via `/full-eval`). Deferred items, by priority:
+
+- **P2 — dependency**: upgrade `@fastify/static` 8.3.0 → ≥9.1.3 (known path-traversal advisories; no concrete exploit path here) and re-test static serving.
+- **P2 — input validation**: reject unknown metric `kind` (not `count|duration|score|decimal`); validate calendar-date semantics (the `\d{4}-\d{2}-\d{2}` regex accepts `2026-13-99`); return 400 instead of a raw `SQLITE_CONSTRAINT_FOREIGNKEY` 500 on a bad `metric_id`.
+- **P2 — tests**: no automated suite yet; cover record-recompute direction, streak math, and migration idempotency against a temp DB.
+- **P3**: CSRF token beyond `SameSite=Lax`; cross-category metric guard on entry write; logout without a session; consistent 404s on delete; validate category `color`.
+
+Registry-submission blockers (private repo URLs, empty `assets/`, no CI) are intentionally **not** being worked — publishing to the community registry is not a goal.
+
 ## Product backlog
 
 - **"Log another"**: allow multiple sessions of the same category in one day (the category pill currently edits the existing entry instead of creating a second).
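
Review note on the calendar-date item in the P2 input-validation bullet: the shape regex alone accepts `2026-13-99`, so a semantic check has to run after it. The sketch below shows one possible way to do that. `isValidCalendarDate` and `daysInMonth` are hypothetical names, not functions from this repository, and the sketch assumes dates arrive as `YYYY-MM-DD` strings in the proleptic Gregorian calendar.

```typescript
// Hypothetical helpers for illustration; nothing here is taken from the repo.

// Same shape the ROADMAP entry mentions, anchored so partial matches fail.
const DATE_SHAPE = /^(\d{4})-(\d{2})-(\d{2})$/;

// Number of days in a month (1-12) of the given year, Gregorian leap rule.
function daysInMonth(year: number, month: number): number {
  if (month === 2) {
    const leap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
    return leap ? 29 : 28;
  }
  return [4, 6, 9, 11].includes(month) ? 30 : 31;
}

// True only when the string has the YYYY-MM-DD shape AND names a real day.
function isValidCalendarDate(value: string): boolean {
  const match = DATE_SHAPE.exec(value);
  if (match === null) return false;

  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);

  if (month < 1 || month > 12) return false;
  return day >= 1 && day <= daysInMonth(year, month);
}
```

A route handler could call such a check before touching the database and answer with a 400 when it returns `false`. The exact wiring depends on how the existing routes validate input, which this diff does not show.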