# AGENTS.md — Recap Relay
Operator-side, credit-metered service that sits in front of Gemini and the operator's local AI hardware ("Spark Control": Parakeet ASR, Sortformer diarization, TitaNet voice embeddings, a vLLM/Gemma analyze endpoint). The Recaps app (`../recap`) is the client; this repo owns transcription/diarization/analysis routing, the cloud Pro/Max tier + expiry, self-serve billing settlement, and the **internal-meetings** feature (upload audio → transcribe → diarize → cluster → analyze → polish → operator dashboard). **Private. Ships to the operator's own Start9 box via `make install` only — NEVER to the public registry.**
## Stack
- **Server**: Node.js (`type: module`, ES modules). Developed on the same box as the app (Node `v25.6.1`); the container runtime is whatever the `Dockerfile` pins.
- **HTTP**: `express` + `multer` (audio upload). Admin routes under `/admin/*` behind an admin-session-cookie gate; relay-to-relay routes under `/relay/*` behind the operator key.
- **Dashboard**: `public/dashboard.html` — single-file vanilla JS, render-string-into-innerHTML, same shape as the app's `index.html`.
- **Packaging**: `@start9labs/start-sdk` under `startos/` — version graph at `startos/versions/index.ts`.
- **Storage**: filesystem under the StartOS data dir (`/data`). Internal meetings persist as `/data/internal-meetings/<id>.json`. No SQLite here.
- **Upstreams**: Gemini (`@google/genai`); operator hardware via "Spark Control" HTTP (Parakeet transcribe, `/api/audio/diarize-chunk` for Sortformer+TitaNet, a vLLM/Gemma OpenAI-shape analyze endpoint).
## Commands
Run from repo root unless noted.
| Action | Command |
|---|---|
| Run all tests | `cd server && npm test` (built-in `node --test`) |
| Run one test file | `cd server && node --test test/<file>.test.js` |
| Build `.s9pk` (x86) | `make x86` |
| Bump version (interactive) | `make bump` |
| Install to operator's Start9 box | `make install` *(bump FIRST — see Always)* |
| Deploy to registry | `make deploy` / `make redeploy` — **NEVER run these here** (private package) |
- `make install` picks the **newest `*.s9pk` by mtime in the cwd** (`ls -t *.s9pk | head -1`) — it does NOT build. Always `make x86` after a change, and run from this repo's root (the shell cwd can drift to `../recap`, where install would grab the *app's* `.s9pk` instead).
- Host comes from the `host:` field in `~/.startos/config.yaml` (a `<relay-host>.local` mDNS name). Never edit that file without authorization.
## Directory layout (key files)
```
server/
  routes/internal-meetings.js   upload → pipeline → save; the /admin/internal-meetings/* API,
                                including the post-hoc speaker-edit + download endpoints
  speaker-clustering.js         cross-chunk voice clustering (agglomerative, cosine sim) +
                                assignSpeakersToSegments + small-cluster suppression
  post-cluster-polish.js        Stage 1 runNameInference + Stage 2 runSummaryPolish (per-window)
  meeting-extras.js             decisions / action items / open questions / key quotes extraction
  meeting-speaker-edits.js      post-hoc record edits: mergeSpeakersInRecord,
                                reclusterMeetingRecord, applyPolishedSummaries, backfillEntrySpeakers
  backends/hardware.js          Parakeet transcribe + /api/audio/diarize-chunk + chunking + vLLM analyze
  chunked-analyze.js            windowed analyze (planWindowsByDuration, runPipelinedAnalysis, …)
  config.js                     getConfigSnapshot() + relay_* config defaults
  hardware-config.js            resolveHardwareConfig() → Spark Control endpoint discovery
  test/                         node --test files (speaker-clustering, meeting-speaker-edits, credits)
public/dashboard.html           operator dashboard (meetings detail view + speaker tools)
startos/versions/<vN>.ts        one file per version + index.ts graph
docs/issues-backlog.md          detailed issue log
docs/guides/internal-meetings.md  diarization / speaker subsystem guide (path-scoped; lazy-loads via .claude/rules/)
```
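
The layout describes `speaker-clustering.js` as agglomerative clustering over cosine similarity, followed by small-cluster suppression. The sketch below illustrates that general idea only. It assumes average linkage, a threshold of `0.7`, and a minimum cluster size of `2`, none of which are stated in this file, and the function names are hypothetical rather than the module's real exports.

```javascript
// Hypothetical sketch: names, linkage rule, and defaults are assumptions,
// not the actual API of server/speaker-clustering.js.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Average-linkage similarity between two clusters of embedding indices.
function linkage(embeddings, c1, c2) {
  let sum = 0;
  for (const i of c1) for (const j of c2) sum += cosine(embeddings[i], embeddings[j]);
  return sum / (c1.length * c2.length);
}

// Repeatedly merge the most similar pair until no pair reaches the threshold.
function clusterEmbeddings(embeddings, threshold = 0.7) {
  const clusters = embeddings.map((_, i) => [i]);
  for (;;) {
    let best = { sim: -Infinity, a: -1, b: -1 };
    for (let a = 0; a < clusters.length; a++) {
      for (let b = a + 1; b < clusters.length; b++) {
        const sim = linkage(embeddings, clusters[a], clusters[b]);
        if (sim > best.sim) best = { sim, a, b };
      }
    }
    if (best.a < 0 || best.sim < threshold) return clusters;
    clusters[best.a] = clusters[best.a].concat(clusters[best.b]);
    clusters.splice(best.b, 1);
  }
}

// Small-cluster suppression: drop clusters with fewer than minSize members.
function suppressSmall(clusters, minSize = 2) {
  return clusters.filter((c) => c.length >= minSize);
}

// Toy 2-D "voice embeddings": two clear speakers plus one stray chunk.
const voices = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9], [-1, -1]];
console.log(suppressSmall(clusterEmbeddings(voices, 0.7)));
// → [[0, 1], [2, 3]]
```

Raising the threshold splits speakers more aggressively and lowering it merges them, which is the trade-off the recluster endpoint exposes. The real knobs are documented in `docs/guides/internal-meetings.md`.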
## Endpoints (server-side contract)
All routes mount in `server/index.js`. Client-facing and operator-key control-plane paths sit under `/relay/*`; operator dashboard paths sit under `/admin/*`.
### Auth model
- **`X-Recap-Operator-Key`** + **`X-Recap-User-Id`** → "cloud" path. The Recaps cloud server (`recaps.cc`) authenticates once with a shared operator key (`relay_cloud_operator_key`) and names the acting user. Credit pool keyed `user:<id>`, tier comes from the relay's stored row, NOT a per-user license. See `server/identity.js`.
- **`X-Recap-Install-Id`** (+ optional `Authorization: <license>`) → "license" path. Self-hosted installs and the operator's single-mode app. Credits/tier come from the resolved Keysat license + install id.
- **Admin session cookie** → `/admin/*`. Cookie issued by `POST /admin/login`; `/admin/login` and `/admin/status` are exempt inside `setupAdminAuthMiddleware`.
- **Webhook signature** → `POST /relay/btcpay/webhook` validates `BTCPay-Sig` against `relay_btcpay_webhook_secret`. Zaprite's webhook re-fetches the order through the Zaprite API to verify, so no shared-secret signing.
- **`X-Recap-Job-Id`** is a billing key, not auth: the first call with a given id charges one credit; later calls with the same id are free (so transcribe + analyze for one summary = one credit total).
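
The two header-based paths and the job-id billing rule above can be sketched as follows. This is an illustration under assumptions, not the code in `server/identity.js`: `resolveCaller` and `chargeForJob` are hypothetical names, the return shapes are invented, and how the relay treats a mismatched operator key or a missing job id is not stated here, so the sketch leaves those cases open.

```javascript
// Hypothetical sketch of the auth paths; not the code in server/identity.js.
// Header keys are lowercase because Node's HTTP layer normalizes them.
function resolveCaller(headers, expectedOperatorKey) {
  const operatorKey = headers['x-recap-operator-key'];
  const userId = headers['x-recap-user-id'];
  const installId = headers['x-recap-install-id'];
  // A production check would use a constant-time comparison here.
  if (operatorKey && userId && operatorKey === expectedOperatorKey) {
    // Cloud path: the credit pool is keyed `user:<id>`.
    return { path: 'cloud', pool: `user:${userId}` };
  }
  if (installId) {
    // License path: credits and tier follow the license + install id.
    return { path: 'license', installId, license: headers['authorization'] ?? null };
  }
  return null; // no recognised identity headers
}

// The first call with a given job id costs one credit; repeats are free.
function chargeForJob(chargedJobs, jobId) {
  if (chargedJobs.has(jobId)) return 0;
  chargedJobs.add(jobId);
  return 1;
}

const charged = new Set();
const total = chargeForJob(charged, 'job-1') + chargeForJob(charged, 'job-1');
console.log(total); // → 1 (transcribe + analyze share one job id)
```

A client that wants a transcribe call and an analyze call billed as one summary sends the same `X-Recap-Job-Id` on both.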
### `/relay/*` (public; per-call header auth)
- `GET /relay/health` — liveness; tolerates partial config. (`routes/health.js`)
- `GET /relay/policy` — `{ tiers, core_total_credits, core_gemini_credits }`; no auth. (`routes/policy.js`)
- `GET /relay/capabilities` — operator-wide feature flags (hardware ready, TTS backend choice, etc). `X-Recap-Install-Id` optional. (`routes/capabilities.js`)
- `GET /relay/balance` — caller's credit balance (`routes/balance.js`).
- `POST /relay/transcribe` — multipart audio → `{ text, segments, duration_seconds, model, ... }`. Body fields: `mime_type`, `title`, `channel`, `description`. (`routes/transcribe.js`)
- `POST /relay/transcribe-url` — async; `{ media_url, type, mime_type, title, channel, description, chapters }` → `{ job_id }` then poll `GET /relay/jobs/:id`. (`routes/transcribe-url.js`)
- `POST /relay/summarize-url` — async; same body shape, full transcribe+analyze pipeline → `{ job_id }` then stream `GET /relay/summarize-url/:jobId/events` (SSE). (`routes/summarize-url.js`)
- `POST /relay/analyze` — `{ transcript, … }` → topic sections JSON. (`routes/analyze.js`)
- `POST /relay/tts` — text → audio; gated by `capabilities.has_tts`. (`routes/tts.js`)
- `GET /relay/credits/packages`, `POST /relay/credits/buy`, `GET /relay/credits/invoice/:id` — à-la-carte credit purchase (BTCPay). (`routes/credits.js`)
- `POST /relay/btcpay/webhook` — BTCPay settle → either `extendUserTier` (subscription) or credit grant (à-la-carte). HMAC validated. (`routes/credits.js`)
- `POST /relay/zaprite/webhook` — Zaprite settle → `extendUserTier` only. Re-fetches order to verify. (`routes/zaprite-webhook.js`)
### `/relay/*` (operator-key only — cloud → relay control plane)
All require a valid `X-Recap-Operator-Key`. Defined in `routes/user-tier.js`.
- `POST /relay/user-tier` — `{ user_id, tier: "core"|"pro"|"max", expires_at? }` → sets the cloud user's stored tier (operator comp grants live here).
- `POST /relay/tier-invoice` — `{ user_id, tier: "pro"|"max", return_url }` → mints a BTCPay tier-purchase invoice (Lightning QR).
- `POST /relay/tier-zaprite-order` — same idea on the card rail.
- `GET /relay/tier-plans` — `{ ok, period_days, plans: [{tier, sats, fiat_amount, fiat_currency, credits_per_period}], card_available }`. `credits_per_period: null` → "Unlimited"; never hardcode this label.
- `GET /relay/expiring-subscriptions?within_days=7&lapsed_days=3` — `{ ok, now, subscriptions: [{user_id, tier, expires_at, expired, days_left}] }`. The Recaps server maps user_id → email and sends the reminder; the relay never sees email.
- `GET /relay/user-tier/:userId` — read the stored row.
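
On the consumer side, two of the rules above are easy to get wrong: deriving the "Unlimited" label from `credits_per_period: null` instead of from the tier name, and building the `expiring-subscriptions` query. The helpers below are a sketch with hypothetical names (`creditsLabel`, `expiringSubscriptionsUrl`). The plan values and base URL are sample data, the non-null label wording is an assumption, and the defaults simply mirror the example query shown above.

```javascript
// Hypothetical client-side helpers; names are illustrative, not from ../recap.
function creditsLabel(plan) {
  // null means no cap; derive the label from the payload, not the tier name.
  return plan.credits_per_period === null
    ? 'Unlimited'
    : `${plan.credits_per_period} credits`;
}

function expiringSubscriptionsUrl(baseUrl, withinDays = 7, lapsedDays = 3) {
  const url = new URL('/relay/expiring-subscriptions', baseUrl);
  url.searchParams.set('within_days', String(withinDays));
  url.searchParams.set('lapsed_days', String(lapsedDays));
  return url.toString();
}

const samplePlans = [
  { tier: 'pro', credits_per_period: 100 }, // sample values, not real pricing
  { tier: 'max', credits_per_period: null },
];
console.log(samplePlans.map(creditsLabel));
// → ['100 credits', 'Unlimited']
console.log(expiringSubscriptionsUrl('http://relay.example'));
// → http://relay.example/relay/expiring-subscriptions?within_days=7&lapsed_days=3
```

Every request to these routes still needs a valid `X-Recap-Operator-Key` header, which the sketch leaves to the caller.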
### `/admin/*` (operator dashboard; cookie-gated)
`routes/admin.js`: `GET /admin/{usage,config,license-cache,hardware-queue,jobs,jobs-history,job-output/:id,job/:id/details,output-store-stats,output-store-ids,dashboard,dashboard.csv,settings}`, `POST /admin/{quotas,wipe-all,settings/promote-prompt}`, `PUT /admin/settings`, `DELETE /admin/job-outputs`. `routes/admin-test-run.js`: `POST /admin/{test-run,test-run-suite}`. BTCPay setup wizard under `/admin/btcpay/*` (`routes/btcpay-setup.js`).
### `/admin/internal-meetings/*` (cookie-gated; `routes/internal-meetings.js`)
- `POST /upload` — multipart audio; runs the full pipeline (chunk → diarize → cluster → analyze → polish → extras → save). Audio is deleted after.
- `GET /` → `{ meetings: [...] }`; `GET /:id` → full saved record (`rec`).
- `GET /:id/markdown`, `GET /:id/html`, `GET /:id/download` — exports.
- `GET /jobs/:id`, `GET /jobs/:id/stream` (SSE) — progress for a running upload.
- `PATCH /:id/speakers` — rename a cluster (display-name only).
- `PATCH /:id/entries` — per-line `speaker_override`.
- `PATCH /:id/merge-speakers` — fold cluster(s) into one (split-as-two). Offline, no LLM.
- `POST /:id/recluster` — re-run clustering at a new threshold (merged-as-one). Offline, uses `rec.diarization` fingerprints. Resets `speaker_names`, per-line overrides, and extras attributions. 400 if no fingerprints.
- `POST /:id/repolish` — re-runs `runSummaryPolish` with the CURRENT names (no re-inference). Synchronous; needs hardware analyze online; 400 if no named speakers.
- `DELETE /:id`.
### Cross-repo changes (sibling: `../recap`)
This repo and the Recaps app (`../recap`) share a live client/server contract — the
`/relay/*` endpoints, the `X-Recap-*` headers, request/response shapes, and tier/credit
semantics. **Before finishing any change that touches that boundary, check whether
`../recap` needs a matching change.** If you add/rename/remove an endpoint, alter a payload
shape or header, or shift tier/credit/billing behavior, update the consumer side too — and
reflect it in BOTH repos' `AGENTS.md` (the contract docs) and `ROADMAP.md` (if it's staged
work). Purely internal changes (diarization tuning, dashboard layout, packaging) don't need
this. When unsure whether a change is contract-affecting, assume it is and check.
## Conventions for this codebase specifically
- **Before editing the internal-meetings / diarization / speaker subsystem, read `docs/guides/internal-meetings.md`** — the diarize→cluster→polish pipeline, the four-places speaker-label sync rule, the clustering-threshold knobs, and the post-hoc speaker-edit (merge / recluster / repolish) semantics live there. Scoped to `server/{speaker-clustering,post-cluster-polish,meeting-extras,meeting-speaker-edits,chunked-analyze}.js`, `server/routes/internal-meetings.js`, `server/backends/hardware.js`.
- **Doc layout**: `AGENTS.md` is canonical; `CLAUDE.md` is a symlink to it (don't overwrite it). Subsystem guides are real files in `docs/guides/<topic>.md` (with `paths:` frontmatter); `.claude/rules/<topic>.md` are relative symlinks into them (`.gitignore` carves out `!.claude/rules/` so the symlinks commit). New guide = add `docs/guides/<topic>.md`, symlink it from `.claude/rules/`, add an index line above.
- **`make install` correctness**: see [Always](#always). Report honestly: a failing test or build is a failure. Comments explain WHY. Write tests alongside the code (`server/test/*.test.js`, `node --test`).
## Always
- **Bump the version before EVERY `make install`** — StartOS dedupes sideloads by version string, so an unbumped reinstall (even one line changed) silently no-ops. `make bump` → `make x86` → `make install`. See memory `bump-before-install` (applies to this repo AND `../recap`).
- **Add new version files to BOTH the import block AND the `other:` list** in `startos/versions/index.ts`, and point `current:` at the new constant. `make bump` does this for you.
- **Build freely; ask before anything that leaves this machine.** `make x86` / `make install` (to the operator's own box) are fine. `make deploy` / `make redeploy` are NOT.
- **Reference env-var / config names, never values.** Relay secrets (operator key, Gemini key, SMTP, Zaprite, BTCPay) live in gitignored env; docs name them only.
## Never
- **Never `make deploy` / `make redeploy` / upload to the registry.** This package is private to the operator's box. (Memory: `feedback_relay_never_to_registry`.)
- **No "Co-Authored-By" / no "Claude" mentions** in commits or source.
- **Never edit a `startos/versions/<v>.ts` that's already been built/installed** — add a new version file.
- **Don't push to GitHub by default** — remote is self-hosted Gitea.
## Current state — git caught up; box, tree, and git aligned at `0.2.124`
- **Box, local tree, and git are now aligned at relay `0.2.124`** (app at `0.2.155`). `startos/versions/index.ts` `current: v_0_2_124`; the StartOS dashboard reflects the same.
- **Git caught up (2026-06-13).** Everything from `v0.2.12` → `v0.2.124` — previously uncommitted on top of `b7f7590 v0.2.11` — is now committed as 7 logical chunks (`705807e`…`fb11dd6`): internal-meetings pipeline + speaker tools, Spark Control hardware backend, billing (tiers / credits / BTCPay / Zaprite), TTS backends, core routing / identity / dashboard, the v0.2.12→0.2.124 packaging + version graph, and the agent-docs split. **Working tree is clean.** No git remote is configured (local history only).
- **Post-hoc speaker tools are live**: `meeting-speaker-edits.js` (merge / recluster / repolish + backfill) and the matching `PATCH/POST /admin/internal-meetings/:id/{merge-speakers,recluster,repolish}` routes; the dashboard exposes the controls. Tests pass via `cd server && npm test` (47 tests).
- **Next:** the ROADMAP "commit the uncommitted tree" item is DONE; remaining work is the speaker-tool follow-ups and the empty-analysis-section issue (see ROADMAP.md / docs/issues-backlog.md).