Add opt-in Daily Digest (daily email of last 24h of library recaps)

Multi-mode, off by default. Each new recap is synthesized into a 1-2
paragraph overview via the relay (operator-absorbed) and cached onto the
session JSON; a daily 08:00 scan emails opted-in users their fresh
recaps, deduped by a per-user watermark that never skips a failed or
over-cap recap. One-click tokenized unsubscribe; settings-modal toggle;
admin test trigger. Bumps to 0.2.158.
Keysat
2026-06-15 19:50:48 -05:00
parent 962423ca10
commit b4fa5d7be8
14 changed files with 1144 additions and 17 deletions
+64 -12
@@ -74,20 +74,72 @@ the user's credits. Confirm.
## Open questions (defaults chosen; confirm or adjust)
1. **Synthesis cost owner** — operator-absorbed (default) vs user credits?
1. **Synthesis cost owner** ~~operator-absorbed (default) vs user credits?~~
**RESOLVED 2026-06-15: operator-absorbed, zero operator action.** The synthesis
provider is built with `resolveProviderOpts("relay", { req: null })` → the operator's
install identity, the *same* relay credit pool free signed-in users' summaries already
draw from (`providers/index.js` `pickRelayIdentity`). No comped system user-id needed.
Flipping to user-billing later = pass the recipient's cloud identity at the marked line
in `daily-digest.js` `buildSynthesisProvider()`.
2. **Send hour** — 08:00 server time (default)?
3. **Single-mode operator digest** — defer to a follow-on (default: multi-mode only v1)?
4. **Relay contract** — does an existing relay endpoint (`/relay/analyze`) fit the
"summarize these topic summaries into 2 paragraphs" call, or is a small new relay
capability/prompt-mode needed? If new, update `../recap-relay` + both repos'
`AGENTS.md`/`ROADMAP.md` per the cross-repo rule. **Resolve before phase 2.**
4. **Relay contract** ~~does an existing relay endpoint (`/relay/analyze`) fit~~
**RESOLVED 2026-06-15: `/relay/analyze` fits as-is, no new relay capability.** The
route (`recap-relay/server/routes/analyze.js`) takes a free-form `{ prompt: string }`
and returns `{ result: { text } }`; the client already wraps it as
`relay.js` `analyzeText({ prompt }) → result.text`. "Topic sections JSON" is only what
today's `chunked-analyze.js` caller asks for in *its* prompt — the endpoint is generic.
Synthesis = build a "summarize these summaries into 1-2 paragraphs" prompt, read
`result.text`. **No cross-repo change.** (Aside: relay `AGENTS.md:78` still describes
this endpoint as `{ transcript, … } → topic sections JSON` — stale; flag for that repo.)
Billing: each standalone analyze charges 1 credit on the call's credit key unless it
shares an `X-Recap-Job-Id` — that's the Q1 (cost-owner) mechanism, decided at phase 2.
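
Because the endpoint takes a free-form prompt, synthesis reduces to building a string and reading back text. The following is a minimal sketch of that shape, not the code in `daily-digest.js`: the chunk fields (`title`, `summary`), the prompt wording, and the `synthesizeOverview` helper are assumptions for illustration. `analyzeText` is injected so the sketch runs without a relay.

```javascript
// Hypothetical sketch; the real buildOverviewPrompt may differ.
// Assumed chunk shape: { title?: string, summary: string }.
function buildOverviewPrompt(episodeTitle, chunks) {
  const topics = chunks
    .filter((c) => c && typeof c.summary === "string" && c.summary.trim() !== "")
    .map((c, i) => `${i + 1}. ${c.title ? c.title + ": " : ""}${c.summary.trim()}`);
  if (topics.length === 0) return null; // nothing to summarize
  return [
    `Summarize these topic summaries from "${episodeTitle}" into 1-2 paragraphs.`,
    "Write plain prose with no headings or lists.",
    "",
    ...topics,
  ].join("\n");
}

// analyzeText is passed in; per the notes above, the relay client exposes
// analyzeText({ prompt }) resolving to result.text.
async function synthesizeOverview(analyzeText, episodeTitle, chunks) {
  const prompt = buildOverviewPrompt(episodeTitle, chunks);
  if (prompt === null) return null;
  const text = await analyzeText({ prompt });
  // Treat an empty or non-string reply as a failed synthesis.
  return typeof text === "string" && text.trim() !== "" ? text.trim() : null;
}
```

Keeping the prompt builder pure means it can be tested without any network access, while the relay call stays behind one injectable function.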
## Build phases
1. Schema + opt-in toggle (migration, account endpoint, settings UI).
2. Synthesis + cache (relay call + write-back + operator-string scrub). Resolve the
relay-contract question first.
3. Email template + scan loop + scheduler + watermark dedup + overflow cap.
4. Operator test trigger.
5. Tests — pure-function coverage (episode selection vs watermark, cap/overflow, empty
→ skip), in the `subscription-reminders` test style.
1. **BUILT 2026-06-15.** Schema + opt-in toggle. `db.js`: `users.digest_enabled`
(default 0) + `users.last_digest_at` (ms, nullable) via SCHEMA_SQL +
`migrateUserDigestPrefs`. `account-routes.js`: `GET`/`POST /api/account/digest`
(enabling stamps `last_digest_at = now` so the first send isn't a backlog dump).
`public/index.html`: settings-modal toggle (`renderDigestBlock` + `loadMyDigest` /
`setDigestEnabled`, optimistic with revert).
2. **BUILT 2026-06-15.** Synthesis + cache. `server/daily-digest.js`:
`buildOverviewPrompt` (pure), `scrubOperatorStrings` (conservative backstop — infra
proper nouns + LAN/private hosts; dropped CUDA to avoid mangling legit tech content),
`synthesizeEpisodeOverview` (relay `analyzeText`, operator-absorbed identity, stable
per-episode jobId), `getOrCreateEpisodeOverview` (`digestOverview` cache + best-effort
`patchSession` write-back). NOT wired into a scheduler yet — dormant until phase 3.
Tests: `test/daily-digest.test.js` (12, pass). Note: chunks carry a `summary` text per
topic (not bullets — the Data section's "bullet summaries" wording was loose).
3. **BUILT 2026-06-15.** Email + scan + scheduler + dedup + overflow cap.
`email-template.js` `renderDigestEmail` (minimal inline style, per-episode title→source
link + overview, overflow line, one-click unsubscribe). `daily-digest.js`:
`selectDigestEpisodes` (pure: watermark filter + cap + overflow), `runDigestScan`
(hourly tick, acts at `SEND_HOUR=8`; per-user `MIN_RESEND_MS=20h` + watermark dedup;
skips empty; advances watermark only on successful send; never throws),
`startDigestScheduler`, `setupDigestRoutes` (public `GET /api/digest/unsubscribe?token=`).
`history.js` `listScopeSessions`. `db.js` adds `users.digest_unsub_token` (minted lazily
on first send). Wired in `index.js` (multi-mode) + `tenant-auth.js` public path.
4. **BUILT 2026-06-15.** `POST /api/admin/digest/run`: a `{test_email}` body sends a sample
render; a bare body forces a real scan now (bypasses the hour gate, not the resend gate).
Mirrors `/api/admin/reminders/run`.
5. **DONE.** `test/daily-digest.test.js` — 19 tests (prompt, scrub, synth/cache,
`selectDigestEpisodes` watermark/cap/overflow/empty, `scopeForUser`, email render).
Full suite **138 pass**. Verified on a real multi-mode boot: migrations apply, scheduler
starts, and the unsubscribe route (400/404/200 + flips `digest_enabled`) works end-to-end.
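
The selection step in phase 3 (watermark filter, cap, overflow count) can be sketched as a pure function. This is an illustration under stated assumptions, not the shipped `selectDigestEpisodes`: the `createdAt` field name, the argument order, and the oldest-first ordering are all hypothetical.

```javascript
// Hypothetical sketch of a pure selection step.
// Assumed session shape: { createdAt: number (ms), ... }.
function selectDigestEpisodes(sessions, watermarkMs, cap) {
  const fresh = sessions
    .filter((s) => typeof s.createdAt === "number" && s.createdAt > (watermarkMs ?? 0))
    .sort((a, b) => a.createdAt - b.createdAt); // oldest first
  return {
    episodes: fresh.slice(0, cap),                   // what goes in this email
    overflow: Math.max(0, fresh.length - cap),       // count for the overflow line
  };
}

// Usage: four sessions, watermark at 10, cap of 2.
const picked = selectDigestEpisodes(
  [{ createdAt: 40 }, { createdAt: 10 }, { createdAt: 30 }, { createdAt: 20 }],
  10,
  2
);
console.log(picked.episodes.map((s) => s.createdAt), picked.overflow);
```

Sorting oldest first is one way to make over-cap recaps the newest ones, so a watermark that only advances to the newest sent recap leaves them eligible for the next digest.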
## Status: feature-complete, awaiting on-box smoke test
Built end-to-end but **not yet installed** (no version bump). The relay synthesis call and
SMTP send can only be exercised on the operator's box. Operator smoke test:
`POST /api/admin/digest/run {test_email}` to eyeball the render; then opt in, add a recap,
and force a scan (or wait for 08:00) to see a real synthesized digest.
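
The difference between forcing a scan and waiting for 08:00 comes down to two gates. A minimal sketch, assuming a hypothetical `shouldSendDigest` helper and a per-user `lastSentAt` timestamp in ms (neither name is taken from the source); `SEND_HOUR` and the 20h resend window are the values stated in the build notes.

```javascript
const SEND_HOUR = 8;                        // server-local hour, per the notes
const MIN_RESEND_MS = 20 * 60 * 60 * 1000;  // 20h per-user resend window

// Hypothetical helper: decides whether one user is eligible on this tick.
function shouldSendDigest({ now, lastSentAt, force = false }) {
  const atSendHour = new Date(now).getHours() === SEND_HOUR;
  // force bypasses the hour gate only.
  if (!force && !atSendHour) return false;
  // The resend gate applies even to forced runs.
  if (lastSentAt != null && now - lastSentAt < MIN_RESEND_MS) return false;
  return true;
}
```

With this shape, an admin-forced run at 15:00 still skips a user who received a digest that morning, which matches the "bypasses the hour gate, not the resend gate" behavior described for the admin trigger.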
**Fresh-eyes review applied (2026-06-15).** Three correctness fixes after a reviewer pass:
(1) the watermark now advances to the newest *sent* recap but never past a failed/deferred
one (`nextDigestWatermark`) — the old `now` stamp silently dropped both synthesis-failures
and over-cap overflow recaps forever; (2) `force` no longer bypasses the in-progress lock,
so an operator force-run during the scheduled tick can't double-send; (3) `idx_users_unsub_token`
is created in the migration, not `SCHEMA_SQL` (the latter runs before the column exists on
upgraded DBs → would crash boot). Existing-DB upgrade verified on a realistic pre-digest
schema. Also added an index on the unauthenticated token lookup + a null-scope guard.
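
Fix (1) can be illustrated with a small pure function. This is a sketch of the rule as described, not the actual `nextDigestWatermark`: the result shape (`createdAt`, `status`) and the status strings are assumptions.

```javascript
// Hypothetical sketch. Assumed input: one entry per fresh recap considered
// in this run, e.g. { createdAt: 1718500000000, status: "sent" }.
// status is "sent", "failed" (synthesis error), or "deferred" (over cap).
function nextDigestWatermark(currentWatermark, results) {
  const ordered = [...results].sort((a, b) => a.createdAt - b.createdAt);
  let next = currentWatermark ?? 0;
  for (const r of ordered) {
    if (r.status !== "sent") break; // never move past a failed/deferred recap
    next = Math.max(next, r.createdAt);
  }
  return next;
}

// Usage: the recap at 30 failed, so the watermark stops at 20.
console.log(
  nextDigestWatermark(5, [
    { createdAt: 10, status: "sent" },
    { createdAt: 20, status: "sent" },
    { createdAt: 30, status: "failed" },
    { createdAt: 40, status: "sent" },
  ])
); // prints 20
```

In this sketch a recap sent after a failed one (the entry at 40) would be picked up again on the next run. Whether the shipped code accepts that repeat or tracks sent recaps separately is not stated in the notes.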