# AGENTS.md — Recap Relay
Operator-side, credit-metered service that sits in front of Gemini and the operator's local AI hardware ("Spark Control": Parakeet ASR, Sortformer diarization, TitaNet voice embeddings, a vLLM/Gemma analyze endpoint). The Recaps app (`../recap`) is the client; this repo owns transcription/diarization/analysis routing, the cloud Pro/Max tier + expiry, self-serve billing settlement, and the **internal-meetings** feature (upload audio → transcribe → diarize → cluster → analyze → polish → operator dashboard). **Private. Ships to the operator's own Start9 box via `make install` only — NEVER to the public registry.**
## Stack
- **Server**: Node.js (`type: module`, ES modules). Same dev box as the app (`v25.6.1`); container runtime is whatever the `Dockerfile` pins.
- **HTTP**: `express` + `multer` (audio upload). Admin routes under `/admin/*` behind an admin-session-cookie gate; relay-to-relay routes under `/relay/*` behind the operator key.
- **Dashboard**: `public/dashboard.html` — single-file vanilla JS, render-string-into-innerHTML, same shape as the app's `index.html`.
- **Packaging**: `@start9labs/start-sdk` under `startos/` — version graph at `startos/versions/index.ts`.
- **Storage**: filesystem under the StartOS data dir (`/data`). Internal meetings persist as `/data/internal-meetings/<id>.json`. No SQLite here.
- **Upstreams**: Gemini (`@google/genai`); operator hardware via "Spark Control" HTTP (Parakeet transcribe, `/api/audio/diarize-chunk` for Sortformer+TitaNet, a vLLM/Gemma OpenAI-shape analyze endpoint).
## Commands
Run from repo root unless noted.
| Action | Command |
|---|---|
| Run all tests | `cd server && npm test` (built-in `node --test`) |
| Run one test file | `cd server && node --test test/<file>.test.js` |
| Build `.s9pk` (x86) | `make x86` |
| Bump version (interactive) | `make bump` |
| Install to operator's Start9 box | `make install` *(bump FIRST — see Always)* |
| Deploy to registry | `make deploy` / `make redeploy`. **NEVER run these here** (private package) |
- `make install` picks the **newest `*.s9pk` by mtime in the cwd** (`ls -t *.s9pk | head -1`) — it does NOT build. Always `make x86` after a change, and run from this repo's root (the shell cwd can drift to `../recap`, where install would grab the *app's* `.s9pk` instead).
- Host comes from the `host:` field in `~/.startos/config.yaml` (a `<relay-host>.local` mDNS name). Never edit that file without authorization.
## Directory layout (key files)
```
server/
  routes/internal-meetings.js    upload → pipeline → save; the /admin/internal-meetings/* API,
                                 including the post-hoc speaker-edit + download endpoints
  speaker-clustering.js          cross-chunk voice clustering (agglomerative, cosine sim) +
                                 assignSpeakersToSegments + small-cluster suppression
  post-cluster-polish.js         Stage 1 runNameInference + Stage 2 runSummaryPolish (per-window)
  meeting-extras.js              decisions / action items / open questions / key quotes extraction
  meeting-speaker-edits.js       post-hoc record edits: mergeSpeakersInRecord,
                                 reclusterMeetingRecord, applyPolishedSummaries, backfillEntrySpeakers
  backends/hardware.js           Parakeet transcribe + /api/audio/diarize-chunk + chunking + vLLM analyze
  chunked-analyze.js             windowed analyze (planWindowsByDuration, runPipelinedAnalysis, …)
  config.js                      getConfigSnapshot() + relay_* config defaults
  hardware-config.js             resolveHardwareConfig() → Spark Control endpoint discovery
  test/                          node --test files (speaker-clustering, meeting-speaker-edits, credits)
public/dashboard.html            operator dashboard (meetings detail view + speaker tools)
startos/versions/<vN>.ts         one file per version + index.ts graph
docs/issues-backlog.md           detailed issue log
docs/guides/internal-meetings.md diarization / speaker subsystem guide (path-scoped; lazy-loads via .claude/rules/)
```
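
The layout above describes `speaker-clustering.js` as agglomerative clustering over cosine similarity. The following is a minimal sketch of that general technique, not the repo's implementation: the function names, the average-linkage rule, and the `0.7` threshold are all assumptions chosen for illustration.

```javascript
// Illustrative only: average-linkage agglomerative clustering over voice
// embeddings. Names, linkage rule, and threshold are assumptions.

function cosineSim(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Mean pairwise similarity between two clusters (average linkage).
function clusterSim(ca, cb, embeddings) {
  let sum = 0;
  for (const i of ca) for (const j of cb) sum += cosineSim(embeddings[i], embeddings[j]);
  return sum / (ca.length * cb.length);
}

// Start with one cluster per embedding; repeatedly merge the most similar
// pair until no pair reaches `threshold`. Returns arrays of input indices.
function clusterEmbeddings(embeddings, threshold = 0.7) {
  const clusters = embeddings.map((_, i) => [i]);
  for (;;) {
    let best = { sim: -Infinity, a: -1, b: -1 };
    for (let a = 0; a < clusters.length; a++) {
      for (let b = a + 1; b < clusters.length; b++) {
        const sim = clusterSim(clusters[a], clusters[b], embeddings);
        if (sim > best.sim) best = { sim, a, b };
      }
    }
    if (best.sim < threshold) return clusters;
    clusters[best.a] = clusters[best.a].concat(clusters[best.b]);
    clusters.splice(best.b, 1);
  }
}

// Two "voices": indices 0 and 1 point roughly one way, 2 and 3 the other.
const demo = clusterEmbeddings([[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.95]]);
console.log(demo);
```

Raising the threshold makes merges harder and tends to split one speaker into several clusters; lowering it tends to fold distinct speakers together. That trade-off is what the merge and recluster tools described later exist to correct.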
## Endpoints (server-side contract)
All routes mount in `server/index.js`. Public paths sit under `/relay/*`; operator paths under `/admin/*`.
### Auth model
- **`X-Recap-Operator-Key`** + **`X-Recap-User-Id`** → "cloud" path. The Recaps cloud server (`recaps.cc`) authenticates once with a shared operator key (`relay_cloud_operator_key`) and names the acting user. Credit pool keyed `user:<id>`, tier comes from the relay's stored row, NOT a per-user license. See `server/identity.js`.
- **`X-Recap-Install-Id`** (+ optional `Authorization: <license>`) → "license" path. Self-hosted installs and the operator's single-mode app. Credits/tier come from the resolved Keysat license + install id.
- **Admin session cookie** → `/admin/*`. Cookie issued by `POST /admin/login`; `/admin/login` and `/admin/status` are exempt inside `setupAdminAuthMiddleware`.
- **Webhook signature** → `POST /relay/btcpay/webhook` validates `BTCPay-Sig` against `relay_btcpay_webhook_secret`. Zaprite's webhook re-fetches the order through the Zaprite API to verify, so no shared-secret signing.
- **`X-Recap-Job-Id`** is a billing key, not auth: the first call with a given id charges one credit; later calls with the same id are free (so transcribe + analyze for one summary = one credit total).
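
The `X-Recap-Job-Id` rule above (first call charges, repeats are free) can be sketched as an idempotent charge keyed by job id. This is an in-memory illustration under assumed names (`createLedger`, `chargeForJob`); it says nothing about how the relay actually stores balances or charged ids.

```javascript
// Illustrative only: an in-memory ledger showing the "first call with a job id
// charges, later calls are free" rule. `createLedger` and `chargeForJob` are
// hypothetical names, not functions from this repo.

function createLedger(initialCredits) {
  const state = { credits: initialCredits, chargedJobs: new Set() };

  // Returns { charged, credits }. Throws if a new job would overdraw.
  function chargeForJob(jobId) {
    if (state.chargedJobs.has(jobId)) {
      return { charged: false, credits: state.credits };
    }
    if (state.credits < 1) throw new Error('insufficient credits');
    state.credits -= 1;
    state.chargedJobs.add(jobId);
    return { charged: true, credits: state.credits };
  }

  return { chargeForJob };
}

const ledger = createLedger(3);
const first = ledger.chargeForJob('job-abc');  // e.g. the transcribe call
const second = ledger.chargeForJob('job-abc'); // e.g. the analyze call, same job
console.log(first, second);
```

Both calls share one job id, so the balance drops by one credit in total.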
### `/relay/*` (public; per-call header auth)
- `GET /relay/health` — liveness; tolerates partial config. (`routes/health.js`)
- `GET /relay/policy` → `{ tiers, core_total_credits, core_gemini_credits }`; no auth. (`routes/policy.js`)
- `GET /relay/capabilities` — operator-wide feature flags (hardware ready, TTS backend choice, etc). `X-Recap-Install-Id` optional. (`routes/capabilities.js`)
- `GET /relay/balance` — caller's credit balance (`routes/balance.js`).
- `POST /relay/transcribe` — multipart audio → `{ text, segments, duration_seconds, model, ... }`. Body fields: `mime_type`, `title`, `channel`, `description`. (`routes/transcribe.js`)
- `POST /relay/transcribe-url` — async; `{ media_url, type, mime_type, title, channel, description, chapters }` → `{ job_id }` then poll `GET /relay/jobs/:id`. (`routes/transcribe-url.js`)
- `POST /relay/summarize-url` — async; same body shape, full transcribe+analyze pipeline → `{ job_id }` then stream `GET /relay/summarize-url/:jobId/events` (SSE). (`routes/summarize-url.js`)
- `POST /relay/analyze`: `{ transcript, … }` → topic sections JSON. (`routes/analyze.js`)
- `POST /relay/tts` — text → audio; gated by `capabilities.has_tts`. (`routes/tts.js`)
- `GET /relay/credits/packages`, `POST /relay/credits/buy`, `GET /relay/credits/invoice/:id` — à-la-carte credit purchase (BTCPay). (`routes/credits.js`)
- `POST /relay/btcpay/webhook` — BTCPay settle → either `extendUserTier` (subscription) or credit grant (à-la-carte). HMAC validated. (`routes/credits.js`)
- `POST /relay/zaprite/webhook` — Zaprite settle → `extendUserTier` only. Re-fetches order to verify. (`routes/zaprite-webhook.js`)
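
The BTCPay webhook above is described as HMAC validated against `relay_btcpay_webhook_secret`. A minimal sketch of that kind of check follows. It assumes the `BTCPay-Sig` header carries `sha256=<hex>` computed over the raw request body, which is the format BTCPay Server documents; `verifyBtcpaySig` is a hypothetical helper name, and the secret shown is a placeholder.

```javascript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Illustrative only. `verifyBtcpaySig` is a hypothetical helper; it assumes the
// header looks like "sha256=<hex HMAC-SHA256 of the raw request body>".
function verifyBtcpaySig(rawBody, sigHeader, secret) {
  if (typeof sigHeader !== 'string' || !sigHeader.startsWith('sha256=')) return false;
  const given = Buffer.from(sigHeader.slice('sha256='.length), 'hex');
  const expected = createHmac('sha256', secret).update(rawBody).digest();
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}

const body = Buffer.from('{"type":"InvoiceSettled","invoiceId":"demo"}');
const secret = 'example-secret-not-a-real-value';
const goodSig = 'sha256=' + createHmac('sha256', secret).update(body).digest('hex');

console.log(verifyBtcpaySig(body, goodSig, secret));
console.log(verifyBtcpaySig(body, 'sha256=' + '00'.repeat(32), secret));
```

The signature has to be computed over the exact bytes received, so a check like this needs the raw body rather than a re-serialized parsed object.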
### `/relay/*` (operator-key only — cloud → relay control plane)
All require a valid `X-Recap-Operator-Key`. Defined in `routes/user-tier.js`.
- `POST /relay/user-tier`: `{ user_id, tier: "core"|"pro"|"max", expires_at? }` → sets the cloud user's stored tier (operator comp grants live here).
- `POST /relay/tier-invoice`: `{ user_id, tier: "pro"|"max", return_url }` → mints a BTCPay tier-purchase invoice (Lightning QR).
- `POST /relay/tier-zaprite-order` — same idea on the card rail.
- `GET /relay/tier-plans` → `{ ok, period_days, plans: [{tier, sats, fiat_amount, fiat_currency, credits_per_period}], card_available }`. `credits_per_period: null` → "Unlimited"; never hardcode this label.
- `GET /relay/expiring-subscriptions?within_days=7&lapsed_days=3` → `{ ok, now, subscriptions: [{user_id, tier, expires_at, expired, days_left}] }`. The Recaps server maps user_id → email and sends the reminder; the relay never sees email.
- `GET /relay/user-tier/:userId` — read the stored row.
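
To make the `expiring-subscriptions` response shape concrete, here is a sketch of how `expired` and `days_left` could be derived from `expires_at`. The selection rule (expiring within `within_days`, or lapsed no more than `lapsed_days` ago), the rounding of `days_left`, and the name `selectExpiring` are assumptions; the route's real logic may differ.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Illustrative only: `selectExpiring` and the row shape are assumptions.
// Keeps rows expiring within `withinDays` from now, plus rows that lapsed
// no more than `lapsedDays` ago.
function selectExpiring(rows, nowMs, withinDays = 7, lapsedDays = 3) {
  const out = [];
  for (const row of rows) {
    const expiresMs = Date.parse(row.expires_at);
    if (Number.isNaN(expiresMs)) continue; // no usable expiry, nothing to remind
    const deltaMs = expiresMs - nowMs;
    if (deltaMs > withinDays * DAY_MS || deltaMs < -lapsedDays * DAY_MS) continue;
    out.push({
      user_id: row.user_id,
      tier: row.tier,
      expires_at: row.expires_at,
      expired: deltaMs <= 0,
      days_left: Math.ceil(deltaMs / DAY_MS), // rounding is an assumption; negative once lapsed
    });
  }
  return out;
}

const now = Date.parse('2026-06-13T00:00:00Z');
const result = selectExpiring([
  { user_id: 'u1', tier: 'pro', expires_at: '2026-06-16T00:00:00Z' }, // 3 days out
  { user_id: 'u2', tier: 'max', expires_at: '2026-06-11T00:00:00Z' }, // lapsed 2 days ago
  { user_id: 'u3', tier: 'pro', expires_at: '2026-07-13T00:00:00Z' }, // too far out
  { user_id: 'u4', tier: 'pro', expires_at: '2026-06-01T00:00:00Z' }, // lapsed too long ago
], now);
console.log(result);
```

Only `user_id` appears in the output, matching the note above that the relay never sees email.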
### `/admin/*` (operator dashboard; cookie-gated)
`routes/admin.js`: `GET /admin/{usage,config,license-cache,hardware-queue,jobs,jobs-history,job-output/:id,job/:id/details,output-store-stats,output-store-ids,dashboard,dashboard.csv,settings}`, `POST /admin/{quotas,wipe-all,settings/promote-prompt}`, `PUT /admin/settings`, `DELETE /admin/job-outputs`. `routes/admin-test-run.js`: `POST /admin/{test-run,test-run-suite}`. BTCPay setup wizard under `/admin/btcpay/*` (`routes/btcpay-setup.js`).
### `/admin/internal-meetings/*` (cookie-gated; `routes/internal-meetings.js`)
- `POST /upload` — multipart audio; runs the full pipeline (chunk → diarize → cluster → analyze → polish → extras → save). Audio is deleted after.
- `GET /` → `{ meetings: [...] }`; `GET /:id` → full saved record (`rec`).
- `GET /:id/markdown`, `GET /:id/html`, `GET /:id/download` — exports.
- `GET /jobs/:id`, `GET /jobs/:id/stream` (SSE) — progress for a running upload.
- `PATCH /:id/speakers` — rename a cluster (display-name only).
- `PATCH /:id/entries` — per-line `speaker_override`.
- `PATCH /:id/merge-speakers` — fold cluster(s) into one (fixes one speaker split as two). Offline, no LLM.
- `POST /:id/recluster` — re-run clustering at a new threshold (fixes two speakers merged as one). Offline, uses `rec.diarization` fingerprints. Resets `speaker_names`, per-line overrides, and extras attributions. 400 if no fingerprints.
- `POST /:id/repolish` — re-runs `runSummaryPolish` with the CURRENT names (no re-inference). Synchronous; needs hardware analyze online; 400 if no named speakers.
- `DELETE /:id`.
### Cross-repo changes (sibling: `../recap`)
This repo and the Recaps app (`../recap`) share a live client/server contract — the
`/relay/*` endpoints, the `X-Recap-*` headers, request/response shapes, and tier/credit
semantics. **Before finishing any change that touches that boundary, check whether
`../recap` needs a matching change.** If you add/rename/remove an endpoint, alter a payload
shape or header, or shift tier/credit/billing behavior, update the consumer side too — and
reflect it in BOTH repos' `AGENTS.md` (the contract docs) and `ROADMAP.md` (if it's staged
work). Purely internal changes (diarization tuning, dashboard layout, packaging) don't need
this. When unsure whether a change is contract-affecting, assume it is and check.
## Conventions for this codebase specifically
- **Before editing the internal-meetings / diarization / speaker subsystem, read `docs/guides/internal-meetings.md`** — the diarize→cluster→polish pipeline, the four-places speaker-label sync rule, the clustering-threshold knobs, and the post-hoc speaker-edit (merge / recluster / repolish) semantics live there. Scoped to `server/{speaker-clustering,post-cluster-polish,meeting-extras,meeting-speaker-edits,chunked-analyze}.js`, `server/routes/internal-meetings.js`, `server/backends/hardware.js`.
- **Doc layout**: `AGENTS.md` is canonical; `CLAUDE.md` is a symlink to it (don't overwrite it). Subsystem guides are real files in `docs/guides/<topic>.md` (with `paths:` frontmatter); `.claude/rules/<topic>.md` are relative symlinks into them (`.gitignore` carves out `!.claude/rules/` so the symlinks commit). New guide = add `docs/guides/<topic>.md`, symlink it from `.claude/rules/`, add an index line above.
- **`make install` correctness**: see the Always section below. Report results honestly: a failing test or build is a failure. Comments explain WHY. Write tests alongside the code (`server/test/*.test.js`, `node --test`).
## Always
- **Bump the version before EVERY `make install`** — StartOS dedupes sideloads by version string, so an unbumped reinstall (even one line changed) silently no-ops. `make bump` → `make x86` → `make install`. See memory `bump-before-install` (applies to this repo AND `../recap`).
- **Add new version files to BOTH the import block AND the `other:` list** in `startos/versions/index.ts`, and point `current:` at the new constant. `make bump` does this for you.
- **Build freely; ask before anything that leaves this machine.** `make x86` / `make install` (to the operator's own box) are fine. `make deploy` / `make redeploy` are NOT.
- **Reference env-var / config names, never values.** Relay secrets (operator key, Gemini key, SMTP, Zaprite, BTCPay) live in gitignored env; docs name them only.
## Never
- **Never `make deploy` / `make redeploy` / upload to the registry.** This package is private to the operator's box. (Memory: `feedback_relay_never_to_registry`.)
- **No "Co-Authored-By" / no "Claude" mentions** in commits or source.
- **Never edit a `startos/versions/<v>.ts` that's already been built/installed** — add a new version file.
- **Don't push to GitHub by default** — remote is self-hosted Gitea.
## Current state — full eval done (2026-06-13); findings triaged below
- **Box, local tree, and git aligned at relay `0.2.124`** (app at `0.2.155`). `startos/versions/index.ts` `current: v_0_2_124`. Git history is local-only (no remote). Working tree is clean apart from an untracked `server/package-lock.json` left by the eval's `npm install` — a generated artifact, intentionally NOT committed.
- **Full independent evaluation run 2026-06-13** (evaluator + security-auditor + exerciser + doc-auditor + start9-spec-checker). Report committed at `EVALUATION.md` (`b08e836`); it's overwritten in place each run so re-running gives a reviewable diff. 47/47 tests still pass; server boots clean. Findings triaged into the three buckets below.
- **Post-hoc speaker tools remain live**: `meeting-speaker-edits.js` (merge / recluster / repolish + backfill) + the `PATCH/POST /admin/internal-meetings/:id/{merge-speakers,recluster,repolish}` routes; dashboard exposes the controls.
### Work queue — P0/P1 — DONE (2026-06-13, commits `8ad7c54`/`d2caa98`/`3a601e1`)
All three P1 items are fixed and committed; suite is green at 57 tests (was 47). None touch the `../recap` client contract (a blocked URL is just a failed-download job; credit accounting is internal; multer is a dep-only bump).
1. **SSRF on `/relay/transcribe-url` + `/relay/summarize-url`** — added `server/safe-url.js` (`assertPublicHttpUrl` rejects non-http(s) + private/loopback/link-local/reserved hosts; `safeFetch` follows redirects manually and re-validates each hop). `downloadDirect` routes through it — covers transcribe-url, summarize-url, admin-test-run. Tests: `server/test/safe-url.test.js`. *Residual:* DNS-rebinding TOCTOU (resolve-then-connect) not closed — acceptable for the private box; revisit if exposed.
2. **Billing money-leak: renewal reset `monthly_consumed = 0`**: `setUserTier` gained a `resetCycle` flag (default true = operator-grant behavior); `extendUserTier` passes `resetCycle:false` for an in-force sub (preserve counter), `true` only for new/lapsed. Monthly resets still happen via `ensureRenewalRollover`. Regression tests added.
3. **`multer` DoS CVEs** — bumped `^1.4.5-lts.1` → `^2.0.1` (installed 2.1.1); smoke-tested that a malformed multipart now yields a clean 4xx instead of crashing. Committed `server/package-lock.json` so the Dockerfile `npm ci` path pins it.
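
For readers unfamiliar with the SSRF fix in item 1, the sketch below shows the general shape of a "public http(s) URL only" check on literal hosts. It is not the contents of `server/safe-url.js`: the function names are hypothetical, the address ranges are a common baseline rather than the repo's list, and it performs no DNS resolution or redirect re-validation.

```javascript
// Illustrative only: a literal-host check in the spirit of the SSRF fix.
// This is NOT the contents of server/safe-url.js; it does no DNS resolution,
// so a hostname that resolves to a private address would still pass here.

function isPrivateIpv4(host) {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || a === 10 || a === 127 ||    // "this network", private, loopback
    (a === 169 && b === 254) ||            // link-local
    (a === 172 && b >= 16 && b <= 31) ||   // private
    (a === 192 && b === 168) ||            // private
    (a === 100 && b >= 64 && b <= 127) ||  // carrier-grade NAT
    a >= 224                               // multicast + reserved
  );
}

function isPrivateIpv6(host) {
  // URL.hostname keeps the square brackets around IPv6 literals.
  const h = host.replace(/^\[|\]$/g, '').toLowerCase();
  if (!h.includes(':')) return false;
  return h === '::' || h === '::1' ||
    /^f[cd][0-9a-f]{2}:/.test(h) ||   // unique local, fc00::/7
    /^fe[89ab][0-9a-f]:/.test(h) ||   // link-local, fe80::/10
    h.startsWith('::ffff:');          // IPv4-mapped: reject rather than unwrap
}

function assertPublicHttpUrlSketch(input) {
  const url = new URL(input); // throws TypeError on malformed input
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`blocked scheme: ${url.protocol}`);
  }
  const host = url.hostname;
  if (host === 'localhost' || host.endsWith('.localhost') ||
      isPrivateIpv4(host) || isPrivateIpv6(host)) {
    throw new Error(`blocked host: ${host}`);
  }
  return url;
}

// Small helper for trying the check without a try/catch at each call site.
function isBlocked(input) {
  try { assertPublicHttpUrlSketch(input); return false; } catch { return true; }
}

console.log(isBlocked('http://169.254.169.254/latest'), isBlocked('https://example.com/a.mp3'));
```

Parsing with `new URL` first matters because it normalizes alternate IPv4 spellings (for example a bare integer host) into dotted form before the range check runs. The residual DNS-rebinding gap noted in item 1 applies to any check of this kind that validates a name and connects separately.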
### Known debt — P2 (accepted for now; fix opportunistically)
- Path traversal on internal-meetings `:id` (admin-gated): validate `^[A-Za-z0-9_-]+$` before `path.join` in `routes/internal-meetings.js:84,91,242` (`output-store.js:52` shows the pattern). *(security-auditor + exerciser)*
- Non-constant-time operator-key compare (`!==`) on `relay_cloud_operator_key` in `server/identity.js:43,84`; use `timingSafeEqual` like the admin path. *(evaluator + security-auditor)*
- In-memory webhook dedup Set lost on restart → double-credit/double-extend — `routes/credits.js:63`, `zaprite-webhook.js:27`; persist processed invoice/order ids. *(security-auditor)*
- Malformed JSON body → full Node stack trace (FS paths) — add an Express `entity.parse.failed` → JSON-400 handler. *(exerciser)*
- BTCPay declared `optional:false`/`kind:'running'` despite "optional" comments → StartOS won't start the relay without BTCPay co-installed — `startos/manifest/index.ts:38-49`, `startos/dependencies.ts`. Decide, then make manifest + dependencies + comment agree. *(start9-spec-checker)*
- No money-path unit tests (`commitCredit`/`refundCredit`/`applyTierPromotion`/`planBackend`/grant handlers), which is why the P1 billing bug shipped green. *(evaluator)*
- `routes/internal-meetings.js` is 2225 lines; extract the MD/HTML formatters + storage/backfill layer. *(evaluator)*
- Fully-open `cors()` incl. `/admin/*` — scope origins — `server/index.js:54`. *(evaluator)*
- Doc drift: AGENTS.md "Stack" line mis-states `/relay/*` auth (most routes are per-call header auth; only `routes/user-tier.js` needs the operator key); the admin-exempt list omits `/admin/btcpay/callback` (`admin-auth.js:70`). *(doc-auditor)*
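
The path-traversal item at the top of this list suggests validating `:id` against `^[A-Za-z0-9_-]+$` before `path.join`. A minimal sketch of that guard follows; `meetingPath` and `MEETINGS_DIR` are hypothetical names, and the directory is taken from the Storage note in the Stack section.

```javascript
import path from 'node:path';

// Illustrative only: `meetingPath` and MEETINGS_DIR are hypothetical names.
const MEETINGS_DIR = '/data/internal-meetings';
const SAFE_ID = /^[A-Za-z0-9_-]+$/;

function meetingPath(id) {
  if (typeof id !== 'string' || !SAFE_ID.test(id)) {
    const err = new Error('invalid meeting id');
    err.status = 400; // lets an Express error handler answer 400 instead of 500
    throw err;
  }
  return path.join(MEETINGS_DIR, `${id}.json`);
}

console.log(meetingPath('mtg_2026-06-13'));
```

Because the allowed character set excludes `.` and `/`, an id such as `../../etc/passwd` is rejected before it ever reaches `path.join`.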
### Deferred — P3+ (later decision or bulk cleanup)
- Security hardening: no `/relay/*` rate limiting; container likely runs as root (entrypoint `chown`s uid 1001 but no `USER` directive); dashboard `innerHTML` stored-XSS surface; `lan-fetch` TLS verify off (admin-set URL only); debug/error fields leaked to clients. *(security-auditor + evaluator)*
- Packaging/ops: prune the 126 `startos/versions/*.ts` files; pin `yt-dlp` in the Dockerfile; Dockerfile per-subdir `COPY` footgun; manifest polish (SPDX license, `docsUrls`, real repo URLs, icon format); no `README.md` (blocks public-registry submission only — moot for this private box). *(start9-spec-checker + evaluator)*
- `/relay/health` reports stale `0.2.11`: `server/package.json` never bumped past 0.2.11; bump it to track the StartOS version. *(exerciser + doc-auditor)*
- Doc fixes (bulk): the `test/` layout lists 3 of 6 files; `server/index.js:3-6` "two endpoints" header comment is stale; `POST /admin/logout` undocumented. *(doc-auditor)*
- Untested blind spot: the live upload → merge → recluster → repolish pipeline (admin-gated + needs Spark Control) has only unit coverage; the dependency audit ran offline — re-run `npm audit`/`osv-scanner` with network to confirm the multer finding and catch transitive CVEs. *(all agents)*
**Pre-existing backlog** (separate from the eval): speaker-tool follow-ups and the empty-analysis-section issue — see `ROADMAP.md` / `docs/issues-backlog.md`.