Multi-mode, off by default. Each new recap is synthesized into a 1-2 paragraph overview via the relay (operator-absorbed) and cached onto the session JSON; a daily 08:00 scan emails opted-in users their fresh recaps, deduped by a per-user watermark that never skips a failed or over-cap recap. One-click tokenized unsubscribe; settings-modal toggle; admin test trigger. Bumps to 0.2.158.
# Daily Digest — plan

Status: **proposed** (awaiting go-ahead). Captures the design agreed with Grant on
2026-06-15. Build only after sign-off.

## Goal

An **opt-in** (off by default) daily "wake-up" email to recaps.cc users: the recaps
added to their library in the last ~24 hours, each shown as a **synthesized 1–2
paragraph overview** generated from that recap's existing per-topic summaries. Turns
passive subscriptions into a daily touchpoint without making the user open the app.

## Decisions (locked 2026-06-15)

- **Content** — "overnight recaps": library additions since the user's last digest.
- **Audience / opt-in** — multi-mode (recaps.cc) first; **off by default**; per-user toggle.
- **Per-episode depth** — a 1–2 paragraph overview *synthesized from the stored topic
  summaries* (`chunks`). NOT raw full text (too long, Gmail clips >~102 KB), NOT a
  one-sentence blurb (too thin). This is Grant's call and it's what bounds email size.
- **Volume** — per-episode size is bounded by the 2-paragraph synthesis. Still cap at
  ~10 episodes per email with an "and N more in your library →" overflow link for
  extreme days.
- **Cadence** — once per user per ~24h at a fixed server-time hour (default 08:00).
  Timezone-aware send is a v2. **Skip the email entirely when nothing is new.**
- **Dedup** — a per-user `last_digest_at` watermark; each digest covers recaps created
  since that instant, so nothing repeats and nothing is missed.

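The volume and dedup decisions above combine into one selection step. The following is a minimal sketch, assuming sessions carry an ISO `createdAt` and the watermark is epoch milliseconds; the function shape, the `MAX_EPISODES` constant, and the oldest-first ordering are illustrative assumptions, not the shipped implementation.

```javascript
// Hypothetical sketch of watermark filtering plus the per-email cap.
const MAX_EPISODES = 10; // the "~10 episodes per email" cap

function selectDigestEpisodes(sessions, lastDigestAt, cap = MAX_EPISODES) {
  // Keep only recaps created after the watermark (a null watermark means "everything").
  const fresh = sessions
    .filter((s) => Date.parse(s.createdAt) > (lastDigestAt ?? 0))
    // Oldest first (an assumption), so any overflow is the newest material.
    .sort((a, b) => Date.parse(a.createdAt) - Date.parse(b.createdAt));

  return {
    episodes: fresh.slice(0, cap),
    overflow: Math.max(0, fresh.length - cap), // drives the "and N more" line
  };
}

// Usage: one recap predates the watermark, two are new, and the cap is 1.
const sessions = [
  { id: "a", createdAt: "2026-06-14T07:00:00.000Z" },
  { id: "b", createdAt: "2026-06-15T01:00:00.000Z" },
  { id: "c", createdAt: "2026-06-15T03:00:00.000Z" },
];
const watermark = Date.parse("2026-06-14T08:00:00.000Z");
const picked = selectDigestEpisodes(sessions, watermark, 1);
// picked.episodes holds only "b"; picked.overflow is 1
```

Sorting oldest first keeps the capped-out recaps contiguous at the newest end, which makes it straightforward to leave the watermark behind them so they are picked up by the next digest.
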
## Data (grounded in code)

- Saved recap record (`server/history.js` `saveToHistory`): `id`, `title`, `type`,
  `url`, `createdAt` (ISO), `topicCount`, `chunks` (topics, each with bullet
  summaries), `entries` (transcript), `speakers`/`speakerNames`. **No top-level
  summary is stored** → the 1–2 paragraph overview must be synthesized.
- Multi-mode users live in the `users` table (`id`, `email`, …); a user's library
  scope is their user id.

## Architecture

Mirror `server/subscription-reminders.js` (the proven daily-scan-plus-email pattern:
self-gating, deduped, never throws).

- **`server/daily-digest.js`** (new)
  - `runDigestScan({ force })`: gate on `isSmtpReady()` + public URL set. For each
    opted-in user, list sessions with `createdAt > last_digest_at`; if none, skip. For
    each new recap, get-or-generate its overview (see below), render the email,
    `sendMail`, then advance the watermark. Returns a `{sent, skipped}` summary; never
    throws.
  - `startDigestScheduler()`: boot delay + interval, fires near the target hour.
    Idempotent; safe to start unconditionally in multi mode.
- **Synthesis** — `synthesizeEpisodeOverview(record)`: send the recap's topic titles +
  bullet summaries to the relay LLM with a "write a 1–2 paragraph overview" prompt.
  **Cache** the result back onto the session JSON (e.g. `digestOverview`) so it's
  generated once and could later power an in-app episode overview. **Sanitize
  operator-internal strings at this boundary** (Parakeet/CUDA/LAN IPs etc. must not
  reach cloud users — existing repo convention).
- **Email** — `renderDigestEmail({ brandName, episodes, manageUrl, unsubscribeUrl })`
  in `server/email-template.js`, matching the existing reminder/magic-link templates.
- **Opt-in storage** — migration in `server/db.js`: add `users.digest_enabled`
  (default 0) and `users.last_digest_at` (ms, nullable). Toggle endpoint in
  `server/account-routes.js` (requires session). Settings-modal toggle in
  `public/index.html`.
- **Unsubscribe** — a one-click tokenized GET link in every email that flips
  `digest_enabled = 0` without requiring login (signed token), plus the in-app toggle.
  Consent + deliverability hygiene on the young recaps.cc domain.
- **Operator test trigger** — `POST /api/admin/digest/run { test_email }`, mirroring
  the reminders test hook, so it can be smoke-tested without waiting a day.

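The get-or-generate step in the scan can be sketched as a small cache-aside helper. This is an illustration only: `synthesize` and `patchSession` are injected stand-ins whose signatures are assumptions, and the real module may structure the write-back differently.

```javascript
// Hypothetical sketch: return the cached overview, or synthesize it once and cache it.
async function getOrCreateEpisodeOverview(record, { synthesize, patchSession }) {
  // Cache hit: an overview was already generated for this recap.
  if (typeof record.digestOverview === "string" && record.digestOverview.trim()) {
    return record.digestOverview;
  }

  // Cache miss: one relay call. A failure propagates so the caller can defer this recap.
  const overview = await synthesize(record);

  try {
    // Best-effort write-back onto the session JSON.
    await patchSession(record.id, { digestOverview: overview });
  } catch (err) {
    // A failed cache write should not block the email; it only costs a re-synthesis later.
  }
  return overview;
}
```

Swallowing only the write-back error, while letting synthesis errors propagate, keeps the two failure modes distinct: a recap with no overview is deferred, while a recap with an uncached overview is still sent.
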
## Cost / credits

The synthesis is one small relay LLM call per new recap per opted-in user, run once and
cached. Bounded by (opted-in users × new recaps/day). **Recommend operator-absorbed**
(it's a retention feature, input is already-short topic summaries) rather than drawing
the user's credits. Confirm.

## Open questions (defaults chosen; confirm or adjust)

1. **Synthesis cost owner** — ~~operator-absorbed (default) vs user credits?~~
   **RESOLVED 2026-06-15: operator-absorbed, zero operator action.** The synthesis
   provider is built with `resolveProviderOpts("relay", { req: null })` → the operator's
   install identity, the *same* relay credit pool free signed-in users' summaries already
   draw from (`providers/index.js` `pickRelayIdentity`). No comped system user-id needed.
   Flipping to user-billing later = pass the recipient's cloud identity at the marked line
   in `daily-digest.js` `buildSynthesisProvider()`.
2. **Send hour** — 08:00 server time (default)?
3. **Single-mode operator digest** — defer to a follow-on (default: multi-mode only v1)?
4. **Relay contract** — ~~does an existing relay endpoint (`/relay/analyze`) fit?~~
   **RESOLVED 2026-06-15: `/relay/analyze` fits as-is, no new relay capability.** The
   route (`recap-relay/server/routes/analyze.js`) takes a free-form `{ prompt: string }`
   and returns `{ result: { text } }`; the client already wraps it as
   `relay.js` `analyzeText({ prompt }) → result.text`. "Topic sections JSON" is only what
   today's `chunked-analyze.js` caller asks for in *its* prompt — the endpoint is generic.
   Synthesis = build a "summarize these summaries into 1–2 paragraphs" prompt, read
   `result.text`. **No cross-repo change.** (Aside: relay `AGENTS.md:78` still describes
   this endpoint as `{ transcript, … } → topic sections JSON` — stale; flag for that repo.)
   Billing: each standalone analyze charges 1 credit on the call's credit key unless it
   shares an `X-Recap-Job-Id` — that's the Q1 (cost-owner) mechanism, decided at phase 2.

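Because the endpoint takes a free-form prompt, synthesis reduces to prompt construction plus one call. A rough sketch follows; the prompt wording and the per-chunk field names (`title`, `summary`) are assumptions for illustration, and `analyzeText` is passed in as a stand-in for the `relay.js` wrapper described above rather than imported.

```javascript
// Hypothetical sketch: build a "summarize these summaries" prompt from stored topics.
function buildOverviewPrompt(record) {
  const topics = (record.chunks || [])
    .filter((c) => c && (c.title || c.summary))
    .map((c, i) => `${i + 1}. ${c.title || "Untitled topic"}\n${c.summary || ""}`.trim())
    .join("\n\n");

  return [
    `Write a 1-2 paragraph overview of the recap titled "${record.title}".`,
    "Use only the topic summaries below. Plain prose, no headings or bullet lists.",
    "",
    topics,
  ].join("\n");
}

// analyzeText is injected; it is assumed to resolve to the relay's result.text.
async function synthesizeEpisodeOverview(record, analyzeText) {
  const text = await analyzeText({ prompt: buildOverviewPrompt(record) });
  return String(text || "").trim();
}
```

Keeping `buildOverviewPrompt` pure means the prompt can be unit-tested without a relay, and the network call stays a thin wrapper around it.
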
## Build phases

1. **BUILT 2026-06-15.** Schema + opt-in toggle. `db.js`: `users.digest_enabled`
   (default 0) + `users.last_digest_at` (ms, nullable) via SCHEMA_SQL +
   `migrateUserDigestPrefs`. `account-routes.js`: `GET`/`POST /api/account/digest`
   (enabling stamps `last_digest_at = now` so the first send isn't a backlog dump).
   `public/index.html`: settings-modal toggle (`renderDigestBlock` + `loadMyDigest` /
   `setDigestEnabled`, optimistic with revert).
2. **BUILT 2026-06-15.** Synthesis + cache → `server/daily-digest.js`:
   `buildOverviewPrompt` (pure), `scrubOperatorStrings` (conservative backstop — infra
   proper nouns + LAN/private hosts; dropped CUDA to avoid mangling legit tech content),
   `synthesizeEpisodeOverview` (relay `analyzeText`, operator-absorbed identity, stable
   per-episode jobId), `getOrCreateEpisodeOverview` (`digestOverview` cache + best-effort
   `patchSession` write-back). NOT wired into a scheduler yet — dormant until phase 3.
   Tests: `test/daily-digest.test.js` (12, pass). Note: chunks carry a `summary` text per
   topic (not bullets — the Data section's "bullet summaries" wording was loose).
3. **BUILT 2026-06-15.** Email + scan + scheduler + dedup + overflow cap.
   `email-template.js` `renderDigestEmail` (minimal inline style, per-episode title→source
   link + overview, overflow line, one-click unsubscribe). `daily-digest.js`:
   `selectDigestEpisodes` (pure: watermark filter + cap + overflow), `runDigestScan`
   (hourly tick, acts at `SEND_HOUR=8`; per-user `MIN_RESEND_MS=20h` + watermark dedup;
   skips empty; advances watermark only on successful send; never throws),
   `startDigestScheduler`, `setupDigestRoutes` (public `GET /api/digest/unsubscribe?token=`).
   `history.js` `listScopeSessions`. `db.js` adds `users.digest_unsub_token` (minted lazily
   on first send). Wired in `index.js` (multi-mode) + `tenant-auth.js` public path.
4. **BUILT 2026-06-15.** `POST /api/admin/digest/run` — `{test_email}` sends a sample
   render; bare body forces a real scan now (bypasses the hour gate, not the resend gate).
   Mirrors `/api/admin/reminders/run`.
5. **DONE.** `test/daily-digest.test.js` — 19 tests (prompt, scrub, synth/cache,
   `selectDigestEpisodes` watermark/cap/overflow/empty, `scopeForUser`, email render).
   Full suite **138 pass**. Verified on a real multi-mode boot: migrations apply, scheduler
   starts, and the unsubscribe route (400/404/200 + flips `digest_enabled`) works end-to-end.

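The interplay of the hour gate, the resend gate, and `force` in phases 3 and 4 can be sketched as a single predicate. This is a sketch under assumptions: `shouldSendDigest` and the `lastSentAt` field are hypothetical names (the plan does not say which timestamp the resend gate compares against), and the hour is read in server-local time.

```javascript
// Hypothetical sketch of the per-user send gates.
const SEND_HOUR = 8; // 08:00 server time
const MIN_RESEND_MS = 20 * 60 * 60 * 1000; // 20h

function shouldSendDigest(user, now, { force = false } = {}) {
  if (!user.digest_enabled) return false;

  // Hour gate: the hourly tick only acts at SEND_HOUR. A forced run skips this check.
  if (!force && new Date(now).getHours() !== SEND_HOUR) return false;

  // Resend gate: applies even to forced runs, so force cannot double-send.
  // lastSentAt (epoch ms) is an assumed field name.
  if (user.lastSentAt != null && now - user.lastSentAt < MIN_RESEND_MS) return false;

  return true;
}
```

A 20-hour minimum, rather than a strict 24, leaves slack for a tick that fires a little early or late without skipping a day.
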
## Status: feature-complete, awaiting on-box smoke test

Built end-to-end but **not yet installed** (no version bump). The relay synthesis call and
SMTP send can only be exercised on the operator's box. Operator smoke test:
`POST /api/admin/digest/run {test_email}` to eyeball the render; then opt in, add a recap,
and force a scan (or wait for 08:00) to see a real synthesized digest.

**Fresh-eyes review applied (2026-06-15).** Three correctness fixes after a reviewer pass:
(1) the watermark now advances to the newest *sent* recap but never past a failed/deferred
one (`nextDigestWatermark`) — the old `now` stamp silently dropped both synthesis-failures
and over-cap overflow recaps forever; (2) `force` no longer bypasses the in-progress lock,
so an operator force-run during the scheduled tick can't double-send; (3) `idx_users_unsub_token`
is created in the migration, not `SCHEMA_SQL` (the latter runs before the column exists on
upgraded DBs → would crash boot). Existing-DB upgrade verified on a realistic pre-digest
schema. Also added an index on the unauthenticated token lookup + a null-scope guard.
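The watermark rule from fix (1) can be illustrated with a short sketch. The signature below is an assumption, not the shipped `nextDigestWatermark`: it takes the current watermark (epoch ms), the fresh recaps considered in this run, and the set of ids that were actually sent.

```javascript
// Hypothetical sketch: advance to the newest sent recap, but never past an unsent one.
function nextDigestWatermark(current, fresh, sentIds) {
  const oldestFirst = [...fresh].sort(
    (a, b) => Date.parse(a.createdAt) - Date.parse(b.createdAt)
  );

  let next = current ?? 0;
  for (const rec of oldestFirst) {
    // Stop at the first failed or deferred (over-cap) recap so it is retried next run.
    if (!sentIds.has(rec.id)) break;
    next = Math.max(next, Date.parse(rec.createdAt));
  }
  return next;
}
```

In this sketch, a recap that was sent but is newer than a failed one stays above the watermark and would be included again in the next digest. That trades a possible repeat for the guarantee that nothing is silently dropped; the real implementation may resolve this case differently.
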