# AGENTS.md — Recap Relay
Operator-side, credit-metered service that sits in front of Gemini and the operator's local AI hardware ("Spark Control": Parakeet ASR, Sortformer diarization, TitaNet voice embeddings, a vLLM/Gemma analyze endpoint). The Recaps app (`../recap`) is the client; this repo owns transcription/diarization/analysis routing, the cloud Pro/Max tier + expiry, self-serve billing settlement, and the **internal-meetings** feature (upload audio → transcribe → diarize → cluster → analyze → polish → operator dashboard). **Private. Ships to the operator's own Start9 box via `make install` only — NEVER to the public registry.**
## Stack
- **Server**: Node.js (`type: module`, ES modules). The dev box is the same one the app uses (Node `v25.6.1`); the container runtime is whatever the `Dockerfile` pins.
- **HTTP**: `express` + `multer` (audio upload). Admin routes under `/admin/*` behind an admin-session-cookie gate; relay-to-relay routes under `/relay/*` behind the operator key.
- **Dashboard**: `public/dashboard.html` — single-file vanilla JS, render-string-into-innerHTML, same shape as the app's `index.html`.
- **Packaging**: `@start9labs/start-sdk` under `startos/` — version graph at `startos/versions/index.ts`.
- **Storage**: filesystem under the StartOS data dir (`/data`). Internal meetings persist as `/data/internal-meetings/<id>.json`. No SQLite here.
- **Upstreams**: Gemini (`@google/genai`); operator hardware via "Spark Control" HTTP (Parakeet transcribe, `/api/audio/diarize-chunk` for Sortformer+TitaNet, a vLLM/Gemma OpenAI-shape analyze endpoint).
## Commands
Run from repo root unless noted.
| Action | Command |
|---|---|
| Run all tests | `cd server && npm test` (built-in `node --test`) |
| Run one test file | `cd server && node --test test/<file>.test.js` |
| Build `.s9pk` (x86) | `make x86` |
| Bump version (interactive) | `make bump` |
| Install to operator's Start9 box | `make install` *(bump FIRST — see Always)* |
| Deploy to registry | `make deploy` / `make redeploy`: **NEVER run these here** (private package) |
- `make install` picks the **newest `*.s9pk` by mtime in the cwd** (`ls -t *.s9pk | head -1`) — it does NOT build. Always `make x86` after a change, and run from this repo's root (the shell cwd can drift to `../recap`, where install would grab the *app's* `.s9pk` instead).
- Host comes from the `host:` field in `~/.startos/config.yaml` (a `<relay-host>.local` mDNS name). Never edit that file without authorization.
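
The selection rule that `make install` uses can be mirrored in Node as a pre-flight check. This is a minimal sketch, not part of the Makefile: `pickNewest` and `newestPackage` are hypothetical helper names, and the only behavior taken from this document is "newest `*.s9pk` by mtime in the cwd".

```javascript
import { readdirSync, statSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: given [{ name, mtimeMs }], return the name with the
// newest mtime, or null when the list is empty. Does not mutate its input.
function pickNewest(entries) {
  const sorted = [...entries].sort((a, b) => b.mtimeMs - a.mtimeMs);
  return sorted.length > 0 ? sorted[0].name : null;
}

// Hypothetical helper: mirror `ls -t *.s9pk | head -1` for one directory.
function newestPackage(dir) {
  const entries = readdirSync(dir)
    .filter((name) => name.endsWith(".s9pk"))
    .map((name) => ({ name, mtimeMs: statSync(join(dir, name)).mtimeMs }));
  return pickNewest(entries);
}
```

Calling `newestPackage(process.cwd())` before an install and comparing the result with the package just built would surface both failure modes described above: a stale build and a cwd that has drifted to `../recap`.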
## Directory layout (key files)
```
server/
  routes/internal-meetings.js     upload → pipeline → save; the /admin/internal-meetings/* API,
                                  including the post-hoc speaker-edit + download endpoints
  speaker-clustering.js           cross-chunk voice clustering (agglomerative, cosine sim) +
                                  assignSpeakersToSegments + small-cluster suppression
  post-cluster-polish.js          Stage 1 runNameInference + Stage 2 runSummaryPolish (per-window)
  meeting-extras.js               decisions / action items / open questions / key quotes extraction
  meeting-speaker-edits.js        post-hoc record edits: mergeSpeakersInRecord,
                                  reclusterMeetingRecord, applyPolishedSummaries, backfillEntrySpeakers
  backends/hardware.js            Parakeet transcribe + /api/audio/diarize-chunk + chunking + vLLM analyze
  chunked-analyze.js              windowed analyze (planWindowsByDuration, runPipelinedAnalysis, …)
  config.js                       getConfigSnapshot() + relay_* config defaults
  hardware-config.js              resolveHardwareConfig() → Spark Control endpoint discovery
  test/                           node --test files (speaker-clustering, meeting-speaker-edits, credits)
public/dashboard.html             operator dashboard (meetings detail view + speaker tools)
startos/versions/<vN>.ts          one file per version + index.ts graph
docs/issues-backlog.md            detailed issue log
docs/guides/internal-meetings.md  diarization / speaker subsystem guide (path-scoped; lazy-loads via .claude/rules/)
```
## Endpoints (server-side contract)
All routes mount in `server/index.js`. Public paths sit under `/relay/*`; operator paths under `/admin/*`.
### Auth model
- **`X-Recap-Operator-Key`** + **`X-Recap-User-Id`** → "cloud" path. The Recaps cloud server (`recaps.cc`) authenticates once with a shared operator key (`relay_cloud_operator_key`) and names the acting user. Credit pool keyed `user:<id>`, tier comes from the relay's stored row, NOT a per-user license. See `server/identity.js`.
- **`X-Recap-Install-Id`** (+ optional `Authorization: <license>`) → "license" path. Self-hosted installs and the operator's single-mode app. Credits/tier come from the resolved Keysat license + install id.
- **Admin session cookie** → `/admin/*`. Cookie issued by `POST /admin/login`; `/admin/login` and `/admin/status` are exempt inside `setupAdminAuthMiddleware`.
- **Webhook signature** → `POST /relay/btcpay/webhook` validates `BTCPay-Sig` against `relay_btcpay_webhook_secret`. Zaprite's webhook re-fetches the order through the Zaprite API to verify, so no shared-secret signing.
- **`X-Recap-Job-Id`** is a billing key, not auth: the first call with a given id charges one credit; later calls with the same id are free (so transcribe + analyze for one summary = one credit total).
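
The `X-Recap-Job-Id` rule (first call charges, repeats are free) amounts to an idempotent charge. A minimal in-memory sketch follows; `createLedger` and `chargeOnce` are hypothetical names, the relay's real credit store is not shown in this document, and scoping the job id per credit pool is an assumption.

```javascript
// Hypothetical in-memory ledger keyed by credit pool (e.g. "user:<id>").
function createLedger(initialCredits) {
  const balances = new Map(Object.entries(initialCredits));
  const chargedJobs = new Set();

  // Charge one credit the first time a (pool, jobId) pair is seen.
  // Later calls with the same pair are free and leave the balance alone.
  function chargeOnce(poolKey, jobId) {
    const seenKey = `${poolKey}\u0000${jobId}`;
    const balance = balances.get(poolKey) ?? 0;
    if (chargedJobs.has(seenKey)) return { charged: false, balance };
    if (balance < 1) throw new Error("insufficient credits");
    balances.set(poolKey, balance - 1);
    chargedJobs.add(seenKey);
    return { charged: true, balance: balance - 1 };
  }

  return { chargeOnce };
}
```

Under this sketch, a transcribe call followed by an analyze call that share one job id would cost one credit in total, which is the behavior described above.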
### `/relay/*` (public; per-call header auth)
- `GET /relay/health` — liveness; tolerates partial config. (`routes/health.js`)
- `GET /relay/policy` → `{ tiers, core_total_credits, core_gemini_credits }`; no auth. (`routes/policy.js`)
- `GET /relay/capabilities` — operator-wide feature flags (hardware ready, TTS backend choice, etc). `X-Recap-Install-Id` optional. (`routes/capabilities.js`)
- `GET /relay/balance` — caller's credit balance (`routes/balance.js`).
- `POST /relay/transcribe` — multipart audio → `{ text, segments, duration_seconds, model, ... }`. Body fields: `mime_type`, `title`, `channel`, `description`. (`routes/transcribe.js`)
- `POST /relay/transcribe-url` — async; `{ media_url, type, mime_type, title, channel, description, chapters }` → `{ job_id }`, then poll `GET /relay/jobs/:id`. (`routes/transcribe-url.js`)
- `POST /relay/summarize-url` — async; same body shape, full transcribe+analyze pipeline → `{ job_id }` then stream `GET /relay/summarize-url/:jobId/events` (SSE). (`routes/summarize-url.js`)
- `POST /relay/analyze`: `{ transcript, … }` → topic sections JSON. (`routes/analyze.js`)
- `POST /relay/tts` — text → audio; gated by `capabilities.has_tts`. (`routes/tts.js`)
- `GET /relay/credits/packages`, `POST /relay/credits/buy`, `GET /relay/credits/invoice/:id` — à-la-carte credit purchase (BTCPay). (`routes/credits.js`)
- `POST /relay/btcpay/webhook` — BTCPay settle → either `extendUserTier` (subscription) or credit grant (à-la-carte). HMAC validated. (`routes/credits.js`)
- `POST /relay/zaprite/webhook` — Zaprite settle → `extendUserTier` only. Re-fetches order to verify. (`routes/zaprite-webhook.js`)
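
The HMAC check on the BTCPay webhook can be sketched as below. This is not the code in `routes/credits.js`; `verifyBtcpaySig` is a hypothetical helper, and it assumes the `BTCPay-Sig` header carries `sha256=<hex>` computed over the raw request body with the webhook secret.

```javascript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: return true only when sigHeader is a valid
// "sha256=<hex>" HMAC-SHA256 of rawBody under the shared secret.
function verifyBtcpaySig(rawBody, sigHeader, secret) {
  const prefix = "sha256=";
  if (typeof sigHeader !== "string" || !sigHeader.startsWith(prefix)) return false;
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(sigHeader.slice(prefix.length), "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The signature has to be checked against the exact bytes received, so in an Express app this generally means capturing the raw body (for example with `express.raw()`) before any JSON parsing on that route.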
### `/relay/*` (operator-key only — cloud → relay control plane)
All require a valid `X-Recap-Operator-Key`. Defined in `routes/user-tier.js`.
- `POST /relay/user-tier`: `{ user_id, tier: "core"|"pro"|"max", expires_at? }` → sets the cloud user's stored tier (operator comp grants live here).
- `POST /relay/tier-invoice`: `{ user_id, tier: "pro"|"max", return_url }` → mints a BTCPay tier-purchase invoice (Lightning QR).
- `POST /relay/tier-zaprite-order` — same idea on the card rail.
- `GET /relay/tier-plans` → `{ ok, period_days, plans: [{tier, sats, fiat_amount, fiat_currency, credits_per_period}], card_available }`. `credits_per_period: null` → "Unlimited"; never hardcode this label.
- `GET /relay/expiring-subscriptions?within_days=7&lapsed_days=3` → `{ ok, now, subscriptions: [{user_id, tier, expires_at, expired, days_left}] }`. The Recaps server maps user_id → email and sends the reminder; the relay never sees email.
- `GET /relay/user-tier/:userId` — read the stored row.
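
For consumers of `GET /relay/tier-plans`, the "never hardcode" rule means deriving the label from `credits_per_period` rather than from the tier name. A minimal sketch, assuming the response shape listed above; `describePlans` and the label wording are hypothetical.

```javascript
// Hypothetical helper: build one display line per plan from a
// /relay/tier-plans response. "Unlimited" is derived from a null
// credits_per_period, never from the tier name.
function describePlans(response) {
  return response.plans.map((plan) => {
    const credits = plan.credits_per_period === null
      ? "Unlimited"
      : `${plan.credits_per_period} credits`;
    return `${plan.tier}: ${credits} / ${response.period_days} days`;
  });
}
```

With this shape, changing which tier is unlimited (or giving it a finite allowance) only requires a relay config change; the client follows the data.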
### `/admin/*` (operator dashboard; cookie-gated)
`routes/admin.js`: `GET /admin/{usage,config,license-cache,hardware-queue,jobs,jobs-history,job-output/:id,job/:id/details,output-store-stats,output-store-ids,dashboard,dashboard.csv,settings}`, `POST /admin/{quotas,wipe-all,settings/promote-prompt}`, `PUT /admin/settings`, `DELETE /admin/job-outputs`. `routes/admin-test-run.js`: `POST /admin/{test-run,test-run-suite}`. BTCPay setup wizard under `/admin/btcpay/*` (`routes/btcpay-setup.js`).
### `/admin/internal-meetings/*` (cookie-gated; `routes/internal-meetings.js`)
- `POST /upload` — multipart audio; runs the full pipeline (chunk → diarize → cluster → analyze → polish → extras → save). Audio is deleted after.
- `GET /` → `{ meetings: [...] }`; `GET /:id` → full saved record (`rec`).
- `GET /:id/markdown`, `GET /:id/html`, `GET /:id/download` — exports.
- `GET /jobs/:id`, `GET /jobs/:id/stream` (SSE) — progress for a running upload.
- `PATCH /:id/speakers` — rename a cluster (display-name only).
- `PATCH /:id/entries` — per-line `speaker_override`.
- `PATCH /:id/merge-speakers` — fold cluster(s) into one (the fix for one speaker split across two clusters). Offline, no LLM.
- `POST /:id/recluster` — re-run clustering at a new threshold (the fix for two speakers merged into one cluster). Offline, uses `rec.diarization` fingerprints. Resets `speaker_names`, per-line overrides, and extras attributions. Returns 400 if there are no fingerprints.
- `POST /:id/repolish` — re-runs `runSummaryPolish` with the CURRENT names (no re-inference). Synchronous; needs hardware analyze online; 400 if no named speakers.
- `DELETE /:id`.
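
To make the recluster threshold concrete, here is a minimal sketch of agglomerative clustering over voice embeddings with cosine similarity. It is not the code in `speaker-clustering.js`: the function names are hypothetical, and average linkage is an assumption, since this file does not state which linkage rule the relay uses.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Average-linkage similarity between two clusters of embedding indices
// (assumed linkage; the relay's actual rule may differ).
function averageLinkage(embeddings, left, right) {
  let total = 0;
  for (const i of left) {
    for (const j of right) total += cosineSimilarity(embeddings[i], embeddings[j]);
  }
  return total / (left.length * right.length);
}

// Repeatedly merge the most similar pair of clusters until no pair
// reaches `threshold`. Returns clusters as arrays of embedding indices.
function clusterEmbeddings(embeddings, threshold) {
  const clusters = embeddings.map((_, i) => [i]);
  for (;;) {
    let best = { sim: -Infinity, a: -1, b: -1 };
    for (let a = 0; a < clusters.length; a++) {
      for (let b = a + 1; b < clusters.length; b++) {
        const sim = averageLinkage(embeddings, clusters[a], clusters[b]);
        if (sim > best.sim) best = { sim, a, b };
      }
    }
    if (best.a === -1 || best.sim < threshold) return clusters;
    clusters[best.a] = clusters[best.a].concat(clusters[best.b]);
    clusters.splice(best.b, 1);
  }
}
```

In this sketch a lower threshold merges more aggressively (fewer speakers) and a higher one keeps clusters apart (more speakers), which is the knob a recluster request would turn.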
### Cross-repo changes (sibling: `../recap`)
This repo and the Recaps app (`../recap`) share a live client/server contract — the
`/relay/*` endpoints, the `X-Recap-*` headers, request/response shapes, and tier/credit
semantics. **Before finishing any change that touches that boundary, check whether
`../recap` needs a matching change.** If you add/rename/remove an endpoint, alter a payload
shape or header, or shift tier/credit/billing behavior, update the consumer side too — and
reflect it in BOTH repos' `AGENTS.md` (the contract docs) and `ROADMAP.md` (if it's staged
work). Purely internal changes (diarization tuning, dashboard layout, packaging) don't need
this. When unsure whether a change is contract-affecting, assume it is and check.
## Conventions for this codebase specifically
- **Before editing the internal-meetings / diarization / speaker subsystem, read `docs/guides/internal-meetings.md`** — the diarize→cluster→polish pipeline, the four-places speaker-label sync rule, the clustering-threshold knobs, and the post-hoc speaker-edit (merge / recluster / repolish) semantics live there. Scoped to `server/{speaker-clustering,post-cluster-polish,meeting-extras,meeting-speaker-edits,chunked-analyze}.js`, `server/routes/internal-meetings.js`, `server/backends/hardware.js`.
- **Doc layout**: `AGENTS.md` is canonical; `CLAUDE.md` is a symlink to it (don't overwrite it). Subsystem guides are real files in `docs/guides/<topic>.md` (with `paths:` frontmatter); `.claude/rules/<topic>.md` are relative symlinks into them (`.gitignore` carves out `!.claude/rules/` so the symlinks commit). New guide = add `docs/guides/<topic>.md`, symlink it from `.claude/rules/`, add an index line above.
- **`make install` correctness**: see [Always]. Honest reports; failing test/build is a failure. Comments explain WHY. Write tests alongside (`server/test/*.test.js`, `node --test`).
## Always
- **Bump the version before EVERY `make install`** — StartOS dedupes sideloads by version string, so an unbumped reinstall (even one line changed) silently no-ops. `make bump` → `make x86` → `make install`. See memory `bump-before-install` (applies to this repo AND `../recap`).
- **Add new version files to BOTH the import block AND the `other:` list** in `startos/versions/index.ts`, and point `current:` at the new constant. `make bump` does this for you.
- **Build freely; ask before anything that leaves this machine.** `make x86` / `make install` (to the operator's own box) are fine. `make deploy` / `make redeploy` are NOT.
- **Reference env-var / config names, never values.** Relay secrets (operator key, Gemini key, SMTP, Zaprite, BTCPay) live in gitignored env; docs name them only.
## Never
- **Never `make deploy` / `make redeploy` / upload to the registry.** This package is private to the operator's box. (Memory: `feedback_relay_never_to_registry`.)
- **No "Co-Authored-By" / no "Claude" mentions** in commits or source.
- **Never edit a `startos/versions/<v>.ts` that's already been built/installed** — add a new version file.
- **Don't push to GitHub by default** — remote is self-hosted Gitea.
## Current state — box AND working tree at `0.2.124`; git is the gap
- **Box AND local working tree are both at relay `0.2.124`** (app at `0.2.155`). `startos/versions/index.ts` `current: v_0_2_124`; the StartOS dashboard reflects the same.
- **Version files `v0.2.117` → `v0.2.124` are present in the working tree** (untracked). A concurrent 2026-06-13 session continued from this session's 0.2.117, bumped through 0.2.124, and shipped to the box — re-read the tree before assuming what's there.
- **Post-hoc speaker tools are live**: `meeting-speaker-edits.js` (merge / recluster / repolish + backfill) and the matching `PATCH/POST /admin/internal-meetings/:id/{merge-speakers,recluster,repolish}` routes are present; the dashboard exposes the controls. Tests pass via `cd server && npm test`.
- **The real gap is git, not versions.** The last committed **code** is `b7f7590 v0.2.11 /relay/capabilities + /relay/transcribe-url`; the commit(s) stacked on top of it are docs-only (the AGENTS/ROADMAP consolidation). So everything from `v0.2.12` → `v0.2.124` — the entire internal-meetings feature, diarization, speaker-edit tools, billing, the user-tier control plane — is uncommitted. Working-tree counts: **28 modified, 150 untracked, 5 deleted (183 total)** as of this read. "Catching up git" = committing this tree (see ROADMAP).