Files
recap-relay/AGENTS.md
T
2026-06-15 12:26:56 -05:00

155 lines
15 KiB
Markdown

# AGENTS.md — Recap Relay
Operator-side, credit-metered service that sits in front of Gemini and the operator's local AI hardware ("Spark Control": Parakeet ASR, Sortformer diarization, TitaNet voice embeddings, a vLLM/Gemma analyze endpoint). The Recaps app (`../recap`) is the client; this repo owns transcription/diarization/analysis routing, the cloud Pro/Max tier + expiry, self-serve billing settlement, and the **internal-meetings** feature (upload audio → transcribe → diarize → cluster → analyze → polish → operator dashboard). **Private. Ships to the operator's own Start9 box via `make install` only — NEVER to the public registry.**
> **Inbox check:** At session start, if `~/Projects/standards/INBOX.md` exists, scan it for
> items tagged `(recap-relay)` and surface them before proposing next steps; triage with `/triage`.
## Stack
- **Server**: Node.js (`type: module`, ES modules). Same dev box as the app (`v25.6.1`); container runtime is whatever the `Dockerfile` pins.
- **HTTP**: `express` + `multer` (audio upload). Admin routes under `/admin/*` behind an admin-session-cookie gate; relay-to-relay routes under `/relay/*` behind the operator key.
- **Dashboard**: `public/dashboard.html` — single-file vanilla JS, render-string-into-innerHTML, same shape as the app's `index.html`.
- **Packaging**: `@start9labs/start-sdk` under `startos/` — version graph at `startos/versions/index.ts`.
- **Storage**: filesystem under the StartOS data dir (`/data`). Internal meetings persist as `/data/internal-meetings/<id>.json`. No SQLite here.
- **Upstreams**: Gemini (`@google/genai`); operator hardware via "Spark Control" HTTP (Parakeet transcribe, `/api/audio/diarize-chunk` for Sortformer+TitaNet, a vLLM/Gemma OpenAI-shape analyze endpoint).
## Commands
Run from repo root unless noted.
| Action | Command |
|---|---|
| Run all tests | `cd server && npm test` (built-in `node --test`) |
| Run one test file | `cd server && node --test test/<file>.test.js` |
| Build `.s9pk` (x86) | `make x86` |
| Bump version (interactive) | `make bump` |
| Install to operator's Start9 box | `make install` *(bump FIRST — see Always)* |
| Deploy to registry | `make deploy` / `make redeploy`**NEVER run these here** (private package) |
- `make install` picks the **newest `*.s9pk` by mtime in the cwd** (`ls -t *.s9pk | head -1`) — it does NOT build. Always `make x86` after a change, and run from this repo's root (the shell cwd can drift to `../recap`, where install would grab the *app's* `.s9pk` instead).
- Host comes from the `host:` field in `~/.startos/config.yaml` (a `<relay-host>.local` mDNS name). Never edit that file without authorization.
## Directory layout (key files)
```
server/
routes/internal-meetings.js upload → pipeline → save; the /admin/internal-meetings/* API,
including the post-hoc speaker-edit + download endpoints
speaker-clustering.js cross-chunk voice clustering (agglomerative, cosine sim) +
assignSpeakersToSegments + small-cluster suppression
post-cluster-polish.js Stage 1 runNameInference + Stage 2 runSummaryPolish (per-window)
meeting-extras.js decisions / action items / open questions / key quotes extraction
meeting-speaker-edits.js post-hoc record edits: mergeSpeakersInRecord,
reclusterMeetingRecord, applyPolishedSummaries, backfillEntrySpeakers
backends/hardware.js Parakeet transcribe + /api/audio/diarize-chunk + chunking + vLLM analyze
chunked-analyze.js windowed analyze (planWindowsByDuration, runPipelinedAnalysis, …)
config.js getConfigSnapshot() + relay_* config defaults
hardware-config.js resolveHardwareConfig() → Spark Control endpoint discovery
safe-url.js SSRF guard: assertPublicHttpUrl + safeFetch for caller-supplied URLs
test/ node --test *.test.js (speaker tools, billing/credits, SSRF, path-traversal, …)
public/dashboard.html operator dashboard (meetings detail view + speaker tools)
startos/versions/<vN>.ts one file per version + index.ts graph
docs/issues-backlog.md detailed issue log
docs/guides/internal-meetings.md diarization / speaker subsystem guide (path-scoped; lazy-loads via .claude/rules/)
```
## Endpoints (server-side contract)
All routes mount in `server/index.js`. Public paths sit under `/relay/*`; operator paths under `/admin/*`.
### Auth model
- **`X-Recap-Operator-Key`** + **`X-Recap-User-Id`** → "cloud" path. The Recaps cloud server (`recaps.cc`) authenticates once with a shared operator key (`relay_cloud_operator_key`) and names the acting user. Credit pool keyed `user:<id>`, tier comes from the relay's stored row, NOT a per-user license. See `server/identity.js`.
- **`X-Recap-Install-Id`** (+ optional `Authorization: <license>`) → "license" path. Self-hosted installs and the operator's single-mode app. Credits/tier come from the resolved Keysat license + install id.
- **Admin session cookie** → `/admin/*`. Cookie issued by `POST /admin/login`; `/admin/login` and `/admin/status` are exempt inside `setupAdminAuthMiddleware`.
- **Webhook signature** → `POST /relay/btcpay/webhook` validates `BTCPay-Sig` against `relay_btcpay_webhook_secret`. Zaprite's webhook re-fetches the order through the Zaprite API to verify, so no shared-secret signing.
- **`X-Recap-Job-Id`** is a billing key, not auth: the first call with a given id charges one credit; later calls with the same id are free (so transcribe + analyze for one summary = one credit total).
### `/relay/*` (public; per-call header auth)
- `GET /relay/health` — liveness; tolerates partial config. (`routes/health.js`)
- `GET /relay/policy``{ tiers, core_total_credits, core_gemini_credits }`; no auth. (`routes/policy.js`)
- `GET /relay/capabilities` — operator-wide feature flags (hardware ready, TTS backend choice, etc). `X-Recap-Install-Id` optional. (`routes/capabilities.js`)
- `GET /relay/balance` — caller's credit balance (`routes/balance.js`).
- `POST /relay/transcribe` — multipart audio → `{ text, segments, duration_seconds, model, ... }`. Body fields: `mime_type`, `title`, `channel`, `description`. (`routes/transcribe.js`)
- `POST /relay/transcribe-url` — async; `{ media_url, type, mime_type, title, channel, description, chapters }``{ job_id }` then poll `GET /relay/jobs/:id`. (`routes/transcribe-url.js`)
- `POST /relay/summarize-url` — async; same body shape, full transcribe+analyze pipeline → `{ job_id }` then stream `GET /relay/summarize-url/:jobId/events` (SSE). (`routes/summarize-url.js`)
- `POST /relay/analyze``{ transcript, … }` → topic sections JSON. (`routes/analyze.js`)
- `POST /relay/tts` — text → audio; gated by `capabilities.has_tts`. (`routes/tts.js`)
- `GET /relay/credits/packages`, `POST /relay/credits/buy`, `GET /relay/credits/invoice/:id` — à-la-carte credit purchase (BTCPay). (`routes/credits.js`)
- `POST /relay/btcpay/webhook` — BTCPay settle → either `extendUserTier` (subscription) or credit grant (à-la-carte). HMAC validated. (`routes/credits.js`)
- `POST /relay/zaprite/webhook` — Zaprite settle → `extendUserTier` only. Re-fetches order to verify. (`routes/zaprite-webhook.js`)
### `/relay/*` (operator-key only — cloud → relay control plane)
All require a valid `X-Recap-Operator-Key`. Defined in `routes/user-tier.js`.
- `POST /relay/user-tier``{ user_id, tier: "core"|"pro"|"max", expires_at? }` → sets the cloud user's stored tier (operator comp grants live here).
- `POST /relay/tier-invoice``{ user_id, tier: "pro"|"max", return_url }` → mints a BTCPay tier-purchase invoice (Lightning QR).
- `POST /relay/tier-zaprite-order` — same idea on the card rail.
- `GET /relay/tier-plans``{ ok, period_days, plans: [{tier, sats, fiat_amount, fiat_currency, credits_per_period}], card_available }`. `credits_per_period: null` → "Unlimited"; never hardcode this label.
- `GET /relay/expiring-subscriptions?within_days=7&lapsed_days=3``{ ok, now, subscriptions: [{user_id, tier, expires_at, expired, days_left}] }`. The Recaps server maps user_id → email and sends the reminder; the relay never sees email.
- `GET /relay/user-tier/:userId` — read the stored row.
### `/admin/*` (operator dashboard; cookie-gated)
`routes/admin.js`: `GET /admin/{usage,config,license-cache,hardware-queue,jobs,jobs-history,job-output/:id,job/:id/details,output-store-stats,output-store-ids,dashboard,dashboard.csv,settings}`, `POST /admin/{quotas,wipe-all,settings/promote-prompt}`, `PUT /admin/settings`, `DELETE /admin/job-outputs`. `routes/admin-test-run.js`: `POST /admin/{test-run,test-run-suite}`. BTCPay setup wizard under `/admin/btcpay/*` (`routes/btcpay-setup.js`).
### `/admin/internal-meetings/*` (cookie-gated; `routes/internal-meetings.js`)
- `POST /upload` — multipart audio; runs the full pipeline (chunk → diarize → cluster → analyze → polish → extras → save). Audio is deleted after.
- `GET /``{ meetings: [...] }`; `GET /:id` → full saved record (`rec`).
- `GET /:id/markdown`, `GET /:id/html`, `GET /:id/download` — exports.
- `GET /jobs/:id`, `GET /jobs/:id/stream` (SSE) — progress for a running upload.
- `PATCH /:id/speakers` — rename a cluster (display-name only).
- `PATCH /:id/entries` — per-line `speaker_override`.
- `PATCH /:id/merge-speakers` — fold cluster(s) into one (split-as-two). Offline, no LLM.
- `POST /:id/recluster` — re-run clustering at a new threshold (merged-as-one). Offline, uses `rec.diarization` fingerprints. Resets `speaker_names`, per-line overrides, and extras attributions. 400 if no fingerprints.
- `POST /:id/repolish` — re-runs `runSummaryPolish` with the CURRENT names (no re-inference). Synchronous; needs hardware analyze online; 400 if no named speakers.
- `DELETE /:id`.
### Cross-repo changes (sibling: `../recap`)
This repo and the Recaps app (`../recap`) share a live client/server contract — the
`/relay/*` endpoints, the `X-Recap-*` headers, request/response shapes, and tier/credit
semantics. **Before finishing any change that touches that boundary, check whether
`../recap` needs a matching change.** If you add/rename/remove an endpoint, alter a payload
shape or header, or shift tier/credit/billing behavior, update the consumer side too — and
reflect it in BOTH repos' `AGENTS.md` (the contract docs) and `ROADMAP.md` (if it's staged
work). Purely internal changes (diarization tuning, dashboard layout, packaging) don't need
this. When unsure whether a change is contract-affecting, assume it is and check.
## Conventions for this codebase specifically
- **Before editing the internal-meetings / diarization / speaker subsystem, read `docs/guides/internal-meetings.md`** — the diarize→cluster→polish pipeline, the four-places speaker-label sync rule, the clustering-threshold knobs, and the post-hoc speaker-edit (merge / recluster / repolish) semantics live there. Scoped to `server/{speaker-clustering,post-cluster-polish,meeting-extras,meeting-speaker-edits,chunked-analyze}.js`, `server/routes/internal-meetings.js`, `server/backends/hardware.js`.
- **Doc layout**: `AGENTS.md` is canonical; `CLAUDE.md` is a symlink to it (don't overwrite it). Subsystem guides are real files in `docs/guides/<topic>.md` (with `paths:` frontmatter); `.claude/rules/<topic>.md` are relative symlinks into them (`.gitignore` carves out `!.claude/rules/` so the symlinks commit). New guide = add `docs/guides/<topic>.md`, symlink it from `.claude/rules/`, add an index line above.
- **Fetching a caller-supplied URL? Go through `server/safe-url.js`** (`safeFetch` / `assertPublicHttpUrl`) — the SSRF guard that rejects non-http(s) schemes and hosts resolving to private/loopback/link-local/reserved ranges, and re-validates every redirect hop. `downloadDirect` (the transcribe-url/summarize-url/admin-test-run download path) already routes through it; never raw-`fetch` an untrusted URL. Calls to the operator's OWN hardware/LAN use `lan-fetch.js` instead — those URLs are config-set and intentionally private.
- **`make install` correctness**: see [Always]. Honest reports; failing test/build is a failure. Comments explain WHY. Write tests alongside (`server/test/*.test.js`, `node --test`).
## Always
- **Bump the version before EVERY `make install`** — StartOS dedupes sideloads by version string, so an unbumped reinstall (even one line changed) silently no-ops. `make bump``make x86``make install`. See memory `bump-before-install` (applies to this repo AND `../recap`).
- **Add new version files to BOTH the import block AND the `other:` list** in `startos/versions/index.ts`, and point `current:` at the new constant. `make bump` does this for you.
- **Build freely; ask before anything that leaves this machine.** `make x86` / `make install` (to the operator's own box) are fine. `make deploy` / `make redeploy` are NOT.
- **Reference env-var / config names, never values.** Relay secrets (operator key, Gemini key, SMTP, Zaprite, BTCPay) live in gitignored env; docs name them only.
## Never
- **Never `make deploy` / `make redeploy` / upload to the registry.** This package is private to the operator's box. (Memory: `feedback_relay_never_to_registry`.)
- **No "Co-Authored-By" / no "Claude" mentions** in commits or source.
- **Never edit a `startos/versions/<v>.ts` that's already been built/installed** — add a new version file.
- **Don't push to GitHub by default** — remote is self-hosted Gitea.
## Current state — post-eval security pass landed (2026-06-13)
- **Box, local tree, git aligned at relay `0.2.124`** (app `0.2.155`); `current: v_0_2_124`. Gitea remote `origin` now set up (`ssh://git@immense-voyage.local:59916/grant/recap-relay.git`); `master` pushed and tracking `origin/master`. Working tree clean. **Suite green at 60 tests** (`cd server && npm test`); server boots clean.
- **Full independent eval done** (evaluator + security-auditor + exerciser + doc-auditor + start9-spec-checker) → `EVALUATION.md` (overwritten in place each run, so re-running diffs cleanly).
- **All P0/P1 fixed** this session (commits `8ad7c54`/`d2caa98`/`3a601e1`): SSRF guard on caller-supplied media URLs (new `server/safe-url.js`), the early-renewal credit-reset money-leak (`extendUserTier`/`setUserTier` `resetCycle`), and the `multer``^2.0.1` DoS bump. None touch the `../recap` client contract.
- **Three P2 fixed** (commits `cbd9748`/`da1bba2`/`693d724`): meeting-`:id` path-traversal guard (`meetingPath()`), constant-time operator-key compare, and a JSON error handler that closes the malformed-body stack-trace leak.
- **Next (open P2), in priority order:**
1. Persist webhook dedup so a restart can't double-credit/double-extend — `routes/credits.js:63`, `zaprite-webhook.js:27`.
2. **Needs operator decision:** is BTCPay a hard requirement or truly optional? It's `optional:false`/`kind:'running'` despite "optional" comments, so StartOS won't start the relay without BTCPay co-installed — `startos/manifest/index.ts:38-49` + `dependencies.ts`. Then make manifest/deps/comment agree.
3. Money-path unit tests (`commitCredit`/`refundCredit`/`applyTierPromotion`/grant handlers); scope `cors()` off `/admin/*` (`index.js`); split the 2225-line `routes/internal-meetings.js`; fix the two AGENTS.md auth-doc drifts (Stack-line `/relay/*` auth; missing `/admin/btcpay/callback` exempt path).
- **Risks/notes:** SSRF guard leaves a DNS-rebinding TOCTOU open (acceptable for a private box; revisit if exposed). P3+ deferred tail + pre-existing speaker-tool/empty-section backlog → `ROADMAP.md` / `docs/issues-backlog.md`.