# AGENTS.md — Recap Relay
Operator-side, credit-metered service that sits in front of Gemini and the operator's local AI hardware ("Spark Control": Parakeet ASR, Sortformer diarization, TitaNet voice embeddings, a vLLM/Gemma analyze endpoint). The Recaps app (`../recap`) is the client; this repo owns transcription/diarization/analysis routing, the cloud Pro/Max tier + expiry, self-serve billing settlement, and the **internal-meetings** feature (upload audio → transcribe → diarize → cluster → analyze → polish → operator dashboard). **Private. Ships to the operator's own Start9 box via `make install` only — NEVER to the public registry.**
> **Inbox check:** At session start, if `~/Projects/standards/INBOX.md` exists, scan it for
> items tagged `(recap-relay)` and surface them before proposing next steps; triage with `/triage`.
## Stack
- **Server**: Node.js (`type: module`, ES modules). Same dev box as the app (`v25.6.1`); container runtime is whatever the `Dockerfile` pins.
- **HTTP**: `express` + `multer` (audio upload). Admin routes under `/admin/*` behind an admin-session-cookie gate. `/relay/*` uses per-call header auth — install-id/license, or operator-key + user-id for the cloud control plane (a few routes like `health`/`policy`/`capabilities` are public). See the Auth model under Endpoints. `cors()` is scoped to `/relay/*` only.
- **Dashboard**: `public/dashboard.html` — single-file vanilla JS, render-string-into-innerHTML, same shape as the app's `index.html`.
- **Packaging**: `@start9labs/start-sdk` under `startos/` — version graph at `startos/versions/index.ts`.
- **Storage**: filesystem under the StartOS data dir (`/data`). No SQLite — flat JSON files: credit ledger `/data/credits.json`, payment-webhook dedup `/data/processed-webhooks.json`, internal meetings `/data/internal-meetings/<id>.json`.
- **Upstreams**: Gemini (`@google/genai`); operator hardware via "Spark Control" HTTP (Parakeet transcribe, `/api/audio/diarize-chunk` for Sortformer+TitaNet, a vLLM/Gemma OpenAI-shape analyze endpoint).
## Commands
Run from repo root unless noted.
| Action | Command |
|---|---|
| Run all tests | `cd server && npm test` (built-in `node --test`) |
| Run one test file | `cd server && node --test test/<file>.test.js` |
| Build `.s9pk` (x86) | `make x86` |
| Bump version (interactive) | `make bump` |
| Install to operator's Start9 box | `make install` *(bump FIRST — see Always)* |
| Deploy to registry | `make deploy` / `make redeploy`. **NEVER run these here** (private package) |
- `make install` picks the **newest `*.s9pk` by mtime in the cwd** (`ls -t *.s9pk | head -1`) — it does NOT build. Always `make x86` after a change, and run from this repo's root (the shell cwd can drift to `../recap`, where install would grab the *app's* `.s9pk` instead).
- Host comes from the `host:` field in `~/.startos/config.yaml` (a `<relay-host>.local` mDNS name). Never edit that file without authorization.
## Directory layout (key files)
```
server/
routes/internal-meetings.js       upload → pipeline → save; the /admin/internal-meetings/* API,
                                  including the post-hoc speaker-edit + download endpoints
speaker-clustering.js             cross-chunk voice clustering (agglomerative, cosine sim) +
                                  assignSpeakersToSegments + small-cluster suppression
post-cluster-polish.js            Stage 1 runNameInference + Stage 2 runSummaryPolish (per-window)
meeting-extras.js                 decisions / action items / open questions / key quotes extraction
meeting-speaker-edits.js          post-hoc record edits: mergeSpeakersInRecord,
                                  reclusterMeetingRecord, applyPolishedSummaries, backfillEntrySpeakers
backends/hardware.js              Parakeet transcribe + /api/audio/diarize-chunk + chunking + vLLM analyze
chunked-analyze.js                windowed analyze (planWindowsByDuration, runPipelinedAnalysis, …)
config.js                         getConfigSnapshot() + relay_* config defaults
hardware-config.js                resolveHardwareConfig() → Spark Control endpoint discovery
safe-url.js                       SSRF guard: assertPublicHttpUrl + safeFetch for caller-supplied URLs
webhook-dedup.js                  persistent payment-webhook dedup (BTCPay + Zaprite share it);
                                  initWebhookDedup/isWebhookProcessed/markWebhookProcessed
test/                             node --test *.test.js (speaker tools, billing/credits, SSRF, path-traversal, …)
public/dashboard.html             operator dashboard (Overview / Jobs / Users / Internal Meetings / Settings)
startos/versions/<vN>.ts          one file per version + index.ts graph
docs/issues-backlog.md            detailed issue log
docs/guides/internal-meetings.md  diarization / speaker subsystem guide (path-scoped; lazy-loads via .claude/rules/)
```
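The layout above describes `speaker-clustering.js` as agglomerative clustering on cosine similarity. As a rough illustration of that general technique (not the repo's implementation), the sketch below merges the two most similar clusters by centroid until no pair reaches a threshold. The function names, the centroid linkage, and the `0.7` default are all assumptions made for the example.

```javascript
// Hypothetical sketch of agglomerative clustering over voice embeddings.
// Function names, centroid linkage, and the 0.7 threshold are illustrative.

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Mean vector of a non-empty list of equal-length vectors.
function centroid(vectors) {
  const sum = new Array(vectors[0].length).fill(0);
  for (const v of vectors) {
    for (let i = 0; i < v.length; i++) sum[i] += v[i];
  }
  return sum.map((x) => x / vectors.length);
}

// Start with one cluster per embedding, then repeatedly merge the pair of
// clusters whose centroids are most similar. Stop when the best remaining
// pair falls below the threshold. Returns arrays of embedding indices.
function clusterEmbeddings(embeddings, threshold = 0.7) {
  let clusters = embeddings.map((vec, index) => ({ members: [index], vectors: [vec] }));
  while (clusters.length > 1) {
    let best = { sim: -Infinity, i: -1, j: -1 };
    for (let i = 0; i < clusters.length; i++) {
      for (let j = i + 1; j < clusters.length; j++) {
        const sim = cosineSimilarity(
          centroid(clusters[i].vectors),
          centroid(clusters[j].vectors),
        );
        if (sim > best.sim) best = { sim, i, j };
      }
    }
    if (best.sim < threshold) break;
    const merged = {
      members: [...clusters[best.i].members, ...clusters[best.j].members],
      vectors: [...clusters[best.i].vectors, ...clusters[best.j].vectors],
    };
    clusters = clusters.filter((_, k) => k !== best.i && k !== best.j);
    clusters.push(merged);
  }
  return clusters.map((c) => c.members.sort((x, y) => x - y));
}
```

A lower threshold merges more aggressively (fewer speakers), a higher one splits more readily. The real threshold knobs are documented in `docs/guides/internal-meetings.md`.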
## Endpoints (server-side contract)
All routes mount in `server/index.js`. Public paths sit under `/relay/*`; operator paths under `/admin/*`.
### Auth model
- **`X-Recap-Operator-Key`** + **`X-Recap-User-Id`** → "cloud" path. The Recaps cloud server (`recaps.cc`) authenticates once with a shared operator key (`relay_cloud_operator_key`) and names the acting user. Credit pool keyed `user:<id>`, tier comes from the relay's stored row, NOT a per-user license. See `server/identity.js`.
- **`X-Recap-Install-Id`** (+ optional `Authorization: <license>`) → "license" path. Self-hosted installs and the operator's single-mode app. Credits/tier come from the resolved Keysat license + install id.
- **Admin session cookie** → `/admin/*`. Cookie issued by `POST /admin/login`; `/admin/login`, `/admin/status`, and `/admin/btcpay/callback` are exempt inside `setupAdminAuthMiddleware`.
- **Webhook signature** → `POST /relay/btcpay/webhook` validates `BTCPay-Sig` against `relay_btcpay_webhook_secret`. Zaprite's webhook re-fetches the order through the Zaprite API to verify, so no shared-secret signing.
- **`X-Recap-Job-Id`** is a billing key, not auth: the first call with a given id charges one credit; later calls with the same id are free (so transcribe + analyze for one summary = one credit total).
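The charge-once behaviour of `X-Recap-Job-Id` can be sketched as follows. This is a minimal in-memory illustration, assuming a ledger keyed by credit key; `createLedger`, `chargeForJob`, and the `insufficient_credits` error string are hypothetical names, and the relay's real ledger is the JSON file described under Storage.

```javascript
// Hypothetical in-memory ledger showing "first call with a job id charges,
// later calls with the same id are free". Not the relay's actual code.
function createLedger(initialBalances) {
  const balances = new Map(Object.entries(initialBalances));
  const chargedJobs = new Set(); // "<creditKey>|<jobId>" pairs already billed

  function chargeForJob(creditKey, jobId) {
    const marker = `${creditKey}|${jobId}`;
    const balance = balances.get(creditKey) ?? 0;
    if (chargedJobs.has(marker)) {
      // Same job id seen before: no further charge.
      return { charged: false, balance };
    }
    if (balance < 1) {
      return { charged: false, balance, error: "insufficient_credits" };
    }
    balances.set(creditKey, balance - 1);
    chargedJobs.add(marker);
    return { charged: true, balance: balance - 1 };
  }

  return { chargeForJob };
}

// Usage: transcribe + analyze for one summary share a job id.
const ledger = createLedger({ "user:42": 3 });
const afterTranscribe = ledger.chargeForJob("user:42", "job-a"); // charges one credit
const afterAnalyze = ledger.chargeForJob("user:42", "job-a");    // free, same job id
```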
### `/relay/*` (public; per-call header auth)
- `GET /relay/health` — liveness; tolerates partial config. (`routes/health.js`)
- `GET /relay/policy` → `{ tiers, core_total_credits, core_gemini_credits }`; no auth. (`routes/policy.js`)
- `GET /relay/capabilities` — operator-wide feature flags (hardware ready, TTS backend choice, etc). `X-Recap-Install-Id` optional. (`routes/capabilities.js`)
- `GET /relay/balance` — caller's credit balance (`routes/balance.js`).
- `POST /relay/transcribe` — multipart audio → `{ text, segments, duration_seconds, model, ... }`. Body fields: `mime_type`, `title`, `channel`, `description`. (`routes/transcribe.js`)
- `POST /relay/transcribe-url` — async; `{ media_url, type, mime_type, title, channel, description, chapters }` → `{ job_id }` then poll `GET /relay/jobs/:id`. (`routes/transcribe-url.js`)
- `POST /relay/summarize-url` — async; same body shape, full transcribe+analyze pipeline → `{ job_id }` then stream `GET /relay/summarize-url/:jobId/events` (SSE). (`routes/summarize-url.js`)
- `POST /relay/analyze` `{ transcript, … }` → topic sections JSON. (`routes/analyze.js`)
- `POST /relay/tts` — text → audio; gated by `capabilities.has_tts`. (`routes/tts.js`)
- `GET /relay/credits/packages`, `POST /relay/credits/buy`, `GET /relay/credits/invoice/:id` — à-la-carte credit purchase (BTCPay). (`routes/credits.js`)
- `POST /relay/btcpay/webhook` — BTCPay settle → either `extendUserTier` (subscription) or credit grant (à-la-carte). HMAC validated. (`routes/credits.js`)
- `POST /relay/zaprite/webhook` — Zaprite settle → `extendUserTier` only. Re-fetches order to verify. (`routes/zaprite-webhook.js`)
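For reference, HMAC validation of a BTCPay webhook generally looks like the sketch below. It assumes BTCPay's `BTCPay-Sig: sha256=<hex>` header format (an HMAC-SHA256 of the raw request body) and should be checked against the BTCPay documentation; `isValidBtcpaySignature` is a hypothetical name, not a function from `routes/credits.js`.

```javascript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper. rawBody must be the exact bytes received, captured
// before any JSON parsing, or the digest will not match.
function isValidBtcpaySignature(rawBody, signatureHeader, secret) {
  const prefix = "sha256=";
  if (typeof signatureHeader !== "string" || !signatureHeader.startsWith(prefix)) {
    return false;
  }
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHeader.slice(prefix.length), "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The constant-time comparison avoids leaking how many leading bytes of a forged signature matched. The secret itself would come from the `relay_btcpay_webhook_secret` config value.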
### `/relay/*` (operator-key only — cloud → relay control plane)
All require a valid `X-Recap-Operator-Key`. Defined in `routes/user-tier.js`.
- `POST /relay/user-tier` `{ user_id, tier: "core"|"pro"|"max", expires_at? }` → sets the cloud user's stored tier (operator comp grants live here).
- `POST /relay/tier-invoice` `{ user_id, tier: "pro"|"max", return_url }` → mints a BTCPay tier-purchase invoice (Lightning QR).
- `POST /relay/tier-zaprite-order` — same idea on the card rail.
- `GET /relay/tier-plans` → `{ ok, period_days, plans: [{tier, sats, fiat_amount, fiat_currency, credits_per_period}], card_available }`. `credits_per_period: null` → "Unlimited"; never hardcode this label.
- `GET /relay/expiring-subscriptions?within_days=7&lapsed_days=3` → `{ ok, now, subscriptions: [{user_id, tier, expires_at, expired, days_left}] }`. The Recaps server maps user_id → email and sends the reminder; the relay never sees email.
- `GET /relay/user-tier/:userId` — read the stored row.
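One way a consumer might honor the `credits_per_period: null` rule from `/relay/tier-plans` is to derive the label from the field instead of from the tier name. The helper name and the non-null wording below are assumptions for illustration only.

```javascript
// Hypothetical client-side helper: derive the display label from the plan
// data rather than hardcoding "Unlimited" for a particular tier.
function creditsLabel(plan) {
  return plan.credits_per_period === null
    ? "Unlimited"
    : `${plan.credits_per_period} credits`;
}
```

If the operator later gives a tier a finite allowance, or makes another tier unlimited, the label follows the response without a client change.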
### `/admin/*` (operator dashboard; cookie-gated)
`routes/admin.js`: `GET /admin/{usage,credits,config,license-cache,hardware-queue,jobs,jobs-history,job-output/:id,job/:id/details,output-store-stats,output-store-ids,dashboard,dashboard.csv,settings}`, `POST /admin/{quotas,credits/grant,wipe-all,settings/promote-prompt}`, `PUT /admin/settings`, `DELETE /admin/job-outputs`. (`GET /admin/credits` = ledger rows enriched with type + computed balance for the dashboard Users tab; `POST /admin/credits/grant` `{ credit_key, amount }` adds free top-up credits to an existing row.) `routes/admin-test-run.js`: `POST /admin/{test-run,test-run-suite}`. BTCPay setup wizard under `/admin/btcpay/*` (`routes/btcpay-setup.js`).
### `/admin/internal-meetings/*` (cookie-gated; `routes/internal-meetings.js`)
- `POST /upload` — multipart audio; runs the full pipeline (chunk → diarize → cluster → analyze → polish → extras → save). Audio is deleted after.
- `GET /` → `{ meetings: [...] }`; `GET /:id` → full saved record (`rec`).
- `GET /:id/markdown`, `GET /:id/html`, `GET /:id/download` — exports.
- `GET /jobs/:id`, `GET /jobs/:id/stream` (SSE) — progress for a running upload.
- `PATCH /:id/speakers` — rename a cluster (display-name only).
- `PATCH /:id/entries` — per-line `speaker_override`.
- `PATCH /:id/merge-speakers` — fold cluster(s) into one (split-as-two). Offline, no LLM.
- `POST /:id/recluster` — re-run clustering at a new threshold (merged-as-one). Offline, uses `rec.diarization` fingerprints. Resets `speaker_names`, per-line overrides, and extras attributions. 400 if no fingerprints.
- `POST /:id/repolish` — re-runs `runSummaryPolish` with the CURRENT names (no re-inference). Synchronous; needs hardware analyze online; 400 if no named speakers.
- `DELETE /:id`.
### Cross-repo changes (sibling: `../recap`)
This repo and the Recaps app (`../recap`) share a live client/server contract — the
`/relay/*` endpoints, the `X-Recap-*` headers, request/response shapes, and tier/credit
semantics. **Before finishing any change that touches that boundary, check whether
`../recap` needs a matching change.** If you add/rename/remove an endpoint, alter a payload
shape or header, or shift tier/credit/billing behavior, update the consumer side too — and
reflect it in BOTH repos' `AGENTS.md` (the contract docs) and `ROADMAP.md` (if it's staged
work). Purely internal changes (diarization tuning, dashboard layout, packaging) don't need
this. When unsure whether a change is contract-affecting, assume it is and check.
## Conventions for this codebase specifically
- **Before editing the internal-meetings / diarization / speaker subsystem, read `docs/guides/internal-meetings.md`** — the diarize→cluster→polish pipeline, the four-places speaker-label sync rule, the clustering-threshold knobs, and the post-hoc speaker-edit (merge / recluster / repolish) semantics live there. Scoped to `server/{speaker-clustering,post-cluster-polish,meeting-extras,meeting-speaker-edits,chunked-analyze}.js`, `server/routes/internal-meetings.js`, `server/backends/hardware.js`.
- **Doc layout**: `AGENTS.md` is canonical; `CLAUDE.md` is a symlink to it (don't overwrite it). Subsystem guides are real files in `docs/guides/<topic>.md` (with `paths:` frontmatter); `.claude/rules/<topic>.md` are relative symlinks into them (`.gitignore` carves out `!.claude/rules/` so the symlinks commit). New guide = add `docs/guides/<topic>.md`, symlink it from `.claude/rules/`, add an index line above.
- **Fetching a caller-supplied URL? Go through `server/safe-url.js`** (`safeFetch` / `assertPublicHttpUrl`) — the SSRF guard that rejects non-http(s) schemes and hosts resolving to private/loopback/link-local/reserved ranges, and re-validates every redirect hop. `downloadDirect` (the transcribe-url/summarize-url/admin-test-run download path) already routes through it; never raw-`fetch` an untrusted URL. Calls to the operator's OWN hardware/LAN use `lan-fetch.js` instead — those URLs are config-set and intentionally private.
- **`make install` correctness**: see [Always]. Honest reports; failing test/build is a failure. Comments explain WHY. Write tests alongside (`server/test/*.test.js`, `node --test`).
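To make the SSRF bullet above concrete, here is a reduced sketch of the two checks it names: rejecting non-http(s) schemes and rejecting private, loopback, link-local, and reserved IPv4 ranges. The helper names are hypothetical, and the sketch only inspects IPv4 literals; per the description above, the real guard in `server/safe-url.js` also covers resolved hosts and redirect hops.

```javascript
// Hypothetical helpers illustrating the kind of checks an SSRF guard makes.
// Not the code in server/safe-url.js; no DNS resolution or redirect handling.

// True when a dotted-quad IPv4 literal must not be fetched for a caller.
function isBlockedIPv4(ip) {
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(ip)) return true; // malformed: fail closed
  const parts = ip.split(".").map(Number);
  if (parts.some((n) => n > 255)) return true;
  const [a, b] = parts;
  return (
    a === 0 ||                              // "this network" 0.0.0.0/8
    a === 10 ||                             // private 10.0.0.0/8
    a === 127 ||                            // loopback 127.0.0.0/8
    (a === 100 && b >= 64 && b <= 127) ||   // carrier-grade NAT 100.64.0.0/10
    (a === 169 && b === 254) ||             // link-local 169.254.0.0/16
    (a === 172 && b >= 16 && b <= 31) ||    // private 172.16.0.0/12
    (a === 192 && b === 168) ||             // private 192.168.0.0/16
    a >= 224                                // multicast and reserved
  );
}

// Parses the URL and rejects anything that is not http or https.
function checkUrlScheme(rawUrl) {
  const url = new URL(rawUrl); // throws on unparseable input
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`blocked scheme: ${url.protocol}`);
  }
  return url;
}
```

A literal-only check like this is not sufficient on its own, since a public hostname can resolve to a private address. That is why the guard described above validates resolved hosts and each redirect hop.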
## Always
- **Bump the version before EVERY `make install`** — StartOS dedupes sideloads by version string, so an unbumped reinstall (even one line changed) silently no-ops. `make bump` → `make x86` → `make install`. See memory `bump-before-install` (applies to this repo AND `../recap`).
- **Add new version files to BOTH the import block AND the `other:` list** in `startos/versions/index.ts`, and point `current:` at the new constant. `make bump` does this for you.
- **Build freely; ask before anything that leaves this machine.** `make x86` / `make install` (to the operator's own box) are fine. `make deploy` / `make redeploy` are NOT.
- **Reference env-var / config names, never values.** Relay secrets (operator key, Gemini key, SMTP, Zaprite, BTCPay) live in gitignored env; docs name them only.
## Never
- **Never `make deploy` / `make redeploy` / upload to the registry.** This package is private to the operator's box. (Memory: `feedback_relay_never_to_registry`.)
- **No "Co-Authored-By" / no "Claude" mentions** in commits or source.
- **Never edit a `startos/versions/<v>.ts` that's already been built/installed** — add a new version file.
- **Don't push to GitHub by default** — remote is self-hosted Gitea.
## Current state — Users tab + webhook-dedup/P2 batch landed (2026-06-15)
- **Box, local tree, git aligned at relay `0.2.126`** (app `0.2.155`); `current: v_0_2_126`. Gitea remote `origin` (`ssh://git@immense-voyage.local:59916/grant/recap-relay.git`); `master` tracks `origin/master`. Working tree clean. **Suite green at 79 tests** (`cd server && npm test`); server boots clean.
- **Users dashboard tab** (`0.2.125`): new cookie-gated tab — every credit-ledger row (typed cloud/license/install) with computed remaining/total balances, key filter, and a per-row "grant free credits" action. `GET /admin/credits` (enriched read) + `POST /admin/credits/grant {credit_key, amount}` (free top-up via `addPurchasedCredits`, guards: positive int ≤1M, must be an existing row). Admin-only; no `../recap` contract change.
- **Webhook dedup now persistent** (`0.2.126`): new `server/webhook-dedup.js` (JSON store at `/data/processed-webhooks.json`, atomic writes, 180-day prune) replaces the in-memory Sets in `routes/credits.js` + `zaprite-webhook.js` (and the rescan path) — a duplicate delivery straddling a restart can no longer double-credit/double-extend. Keys namespaced `<storeId>|<invoiceId>` vs `zaprite:<orderId>`.
- **BTCPay is REQUIRED** (operator decision, 2026-06-15): config was already `optional:false`/`kind:'running'`; corrected the contradictory "optional" comment in `startos/manifest/index.ts`. It's the only paid rail, so the relay shouldn't run without it.
- **CORS scoped to `/relay/*`** (`index.js`) — off `/admin/*` + dashboard (same-origin). Plus money-path unit tests (`commitCredit`/`refundCredit`/`applyTierPromotion`) and the two AGENTS.md auth-doc drift fixes.
- **Next (open P2 / deferred):**
  1. Split the 2225-line `routes/internal-meetings.js`: **deferred as likely overkill** for a private service; do only if it becomes painful to work in.
  2. P3+ deferred tail (no `/relay/*` rate limiting, container-as-root, dashboard `innerHTML` XSS surface, prune 126 version files, `/relay/health` stale `0.2.11`, etc.) + speaker-tool/empty-section backlog → `ROADMAP.md` / `docs/issues-backlog.md`.
- **Risks/notes:** webhook dedup keeps the pre-existing check-then-mark race for *truly simultaneous* duplicate deliveries (vanishingly rare on a private box; would need locking). SSRF guard leaves a DNS-rebinding TOCTOU open (acceptable for a private box). Full prior eval → `EVALUATION.md`.
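The persistent webhook dedup described above (JSON store, atomic writes, 180-day prune, namespaced keys) could be approximated as in the sketch below. It is an illustration under stated assumptions, not the contents of `server/webhook-dedup.js`: `createWebhookDedup`, `btcpayKey`, and `zapriteKey` are hypothetical names, the store is assumed to map key to a processed-at timestamp, and the example writes to a temp directory in place of `/data`.

```javascript
import { existsSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const RETENTION_MS = 180 * 24 * 60 * 60 * 1000; // 180-day prune window

// Hypothetical store: { "<key>": <processed-at ms> } persisted as JSON.
function createWebhookDedup(filePath, now = () => Date.now()) {
  const seen = existsSync(filePath) ? JSON.parse(readFileSync(filePath, "utf8")) : {};

  function persist() {
    // Write to a temp file, then rename over the target. A rename within one
    // filesystem is atomic, so a crash cannot leave a half-written store.
    const tmp = `${filePath}.tmp`;
    writeFileSync(tmp, JSON.stringify(seen));
    renameSync(tmp, filePath);
  }

  return {
    isProcessed: (key) => Object.hasOwn(seen, key),
    markProcessed(key) {
      const cutoff = now() - RETENTION_MS;
      for (const [k, ts] of Object.entries(seen)) {
        if (ts < cutoff) delete seen[k]; // prune entries older than 180 days
      }
      seen[key] = now();
      persist();
    },
  };
}

// Key namespacing as described above.
const btcpayKey = (storeId, invoiceId) => `${storeId}|${invoiceId}`;
const zapriteKey = (orderId) => `zaprite:${orderId}`;

// Usage: a second instance (simulating a restart) still sees the key.
const file = join(mkdtempSync(join(tmpdir(), "dedup-")), "processed-webhooks.json");
createWebhookDedup(file).markProcessed(btcpayKey("store1", "inv1"));
console.log(createWebhookDedup(file).isProcessed(btcpayKey("store1", "inv1"))); // true
```

Like the shipped module, this sketch is check-then-mark: two truly simultaneous deliveries could both pass `isProcessed` before either marks, which would need locking to close.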