Ship 0.2.162; handoff: record relay analyze-hang outage

Wire v0.2.162 into the version graph (the 890d671 relay-client dedup,
now built and installed on the box).

Rewrite Current state around the active relay outage: a Gemini analyze
call black-holed and jammed the relay's single in-memory hardware FIFO
slot, blocking all manual + subscription jobs. Root cause is relay-side;
the full diagnosis + 2-part fix is captured to the standards inbox for a
separate ../recap-relay /triage session. Records the recap-side open
question (background sub-check entitlement gate may also skip silently
post-core-decoupling).

Move the standing operator-actions list from Current state to ROADMAP.
This commit is contained in:
Keysat
2026-06-20 06:44:38 -05:00
parent 890d671bf2
commit 9de315276a
4 changed files with 32 additions and 14 deletions
+6 -12
View File
@@ -136,19 +136,13 @@ unsure whether a change is contract-affecting, assume it is and check.
## Current state
**Live on the operator's StartOS box** — app **0.2.161** + relay **0.2.126**. Tests: `cd server && npm test` → **144 pass**.
**Live on the operator's StartOS box** — app **0.2.162** + relay **0.2.126**. Tests: `cd server && npm test` → **158 pass**.
**Done & live:** self-serve Pro/Max purchase (Bitcoin inline-Lightning + Zaprite card, prepaid, relay owns tier/expiry), core-decoupling, per-tenant subscriptions, expiry-reminder emails (`POST /api/admin/reminders/run {test_email}`), **opt-in Daily Digest** (0.2.158, `b4fa5d7`): off-by-default daily email of a user's last ~24h of library recaps, each synthesized via `/relay/analyze` (operator-absorbed); `daily-digest.js` scan at `SEND_HOUR=8`, per-user watermark dedup, public tokenized unsubscribe, admin trigger `POST /api/admin/digest/run`; **YouTube `/live/` + `/shorts/` URL support** (0.2.159, `cb961cd`): `extractVideoId` now accepts those forms (was rejecting them as "Invalid YouTube URL"); and **self-contained shareable HTML export** (0.2.160, `621af7c`; first installed in 0.2.161): the Export menu offers a standalone `.html` with the embedded video + expandable timestamped summaries baked in, no account needed (native share sheet on mobile, download on desktop). Plans in `docs/*-plan.md`.
**⚠ Active incident (2026-06-20) — YouTube processing is down.** A relay Gemini analyze call black-holed and permanently jammed the relay's single in-memory hardware FIFO slot, blocking ALL manual + background-subscription jobs (operator reports ~a week of no subscription processing). Root cause is **relay-side** (no `AbortSignal` on the relay's analyze/transcribe `generateContent` calls + no dead-holder release on its queue) — full diagnosis + 2-part fix captured to the standards inbox as `(recap-relay) [bug][P1]`; the fix lands in a separate `../recap-relay` session via `/triage`. Operator is restarting recap-relay to unblock (clears the in-memory jam; **recurs until the fix ships**).
- **Recap-side open question:** confirm whether the week-long *subscription* silence is fully explained by that jam, or whether the background sub-check's Keysat-license entitlement gate (`server/index.js:1400`) is also skipping silently post-core-decoupling — check `https://recaps.cc/api/sub-check-log` (needs sign-in).
**Design system — DONE & live (0.2.161).** The `design/` contract + both conformance-cleanup phases are installed: Phase 1 (canonical `:root` token block; stylesheet + `auth.html` on `var()`; drift fixed) and Phase 2 (var-ified the inline `style=` hexes — 346 + 7 `#475569` — and snapped off-scale fonts/radii to the scale). Verified: 144 tests, both pages serve 200, every `var()` resolves, no off-scale residue. The var-ify scoping rule + the `SHARE_PAGE_*` literal-hex exception now live in **Conventions**; only the Style-Dictionary `palette.css` stretch goal remains (`ROADMAP.md`).
**Just shipped (0.2.162, `890d671`):** relay-client dedup — `providers/relay.js` collapsed to `handleRelayResponse` (pingBalance left separate, different error-recording contract) + `operatorPost`/`operatorGet` (write-throws/read-null split preserved); first fetch-mock harness `server/test/relay.test.js` (+14). Internal-only.
**Only loose end:** the Daily Digest's relay-synthesis + SMTP path can't be exercised off-box, so it's installed but **not yet smoke-tested** — that's operator action #5 below. Everything else (schema/upgrade, scheduler boot, unsubscribe flow) is verified.
**Prior live features** (self-serve Pro/Max purchase, core-decoupling, per-tenant subs, expiry reminders, opt-in Daily Digest, `/live`+`/shorts` URLs, shareable HTML export, design-system token pass) are in the git log + `docs/*-plan.md`. Daily Digest is installed but **not yet smoke-tested off-box** (operator action, `ROADMAP.md`).
**Pending operator actions:**
1. **Verify the mobile can't-scroll-to-top fix on the iPad** — UNVERIFIED in 0.2.157 (iOS-layout-specific, not reproducible off-device); send a screen recording if it persists. Inbox item kept open + annotated.
2. (optional) Rotate the still-live Gemini key in AI Studio, then `rm /Users/macpro/Projects/recap-keyleak-purge-backup.bundle`.
3. Real-world cloud tests: first Bitcoin purchase; enable Zaprite cards (relay "Set Zaprite Connection" + webhook); eyeball a reminder email.
4. If recaps.cc ever gains a CDN/LB hop, set `RECAP_TRUSTED_PROXY_HOPS` or the trial-cap bypass reopens.
5. **Smoke-test the Daily Digest (0.2.158):** (a) `POST /api/admin/digest/run {test_email}` to eyeball the sample render; (b) toggle it on in Settings, add a recap, then `POST /api/admin/digest/run` (no body) to force a real scan — confirms relay synthesis + SMTP send + the unsubscribe link end-to-end. Needs System-SMTP configured.
**Backlog** in `ROADMAP.md`: eval **P2** known-debt (SSE error-string scrub, credit-debit TOCTOU, multi-tenant gemini-key bypass, `GET /api/history` perf, dependency CVEs, integration tests, doc drift) + **P3** cleanup, and standing decisions (Zaprite recurring, "take Recaps home" broken for relay-tier users, cloud paid-only, no CI lint/type-check).
**Next steps, in order:** (1) operator restart recap-relay → restore service; (2) the recap-relay analyze-hang fix (separate session, `/triage` the inbox item); (3) answer the sub-check question above; (4) deferred refactor-survey items + pending operator actions + P2/P3 backlog all live in `ROADMAP.md`.