AGENTS.md — Recap Relay
Operator-side, credit-metered service that sits in front of Gemini and the operator's local AI hardware ("Spark Control": Parakeet ASR, Sortformer diarization, TitaNet voice embeddings, a vLLM/Gemma analyze endpoint). The Recaps app (../recap) is the client; this repo owns transcription/diarization/analysis routing, the cloud Pro/Max tier + expiry, self-serve billing settlement, and the internal-meetings feature (upload audio → transcribe → diarize → cluster → analyze → polish → operator dashboard). Private. Ships to the operator's own Start9 box via make install only — NEVER to the public registry.
Stack
- Server: Node.js (`type: module`, ES modules). Same dev box as the app (v25.6.1); container runtime is whatever the `Dockerfile` pins.
- HTTP: `express` + `multer` (audio upload). Admin routes under `/admin/*` behind an admin-session-cookie gate; relay-to-relay routes under `/relay/*` behind the operator key.
- Dashboard: `public/dashboard.html` (single-file vanilla JS, render-string-into-innerHTML, same shape as the app's `index.html`).
- Packaging: `@start9labs/start-sdk` under `startos/`; version graph at `startos/versions/index.ts`.
- Storage: filesystem under the StartOS data dir (`/data`). Internal meetings persist as `/data/internal-meetings/<id>.json`. No SQLite here.
- Upstreams: Gemini (`@google/genai`); operator hardware via "Spark Control" HTTP (Parakeet transcribe, `/api/audio/diarize-chunk` for Sortformer+TitaNet, a vLLM/Gemma OpenAI-shape analyze endpoint).
Commands
Run from repo root unless noted.
| Action | Command |
|---|---|
| Run all tests | cd server && npm test (built-in node --test) |
| Run one test file | cd server && node --test test/<file>.test.js |
| Build .s9pk (x86) | make x86 |
| Bump version (interactive) | make bump |
| Install to operator's Start9 box | make install (bump FIRST — see Always) |
| Deploy to registry | make deploy / make redeploy — NEVER run these here (private package) |

- `make install` picks the newest `*.s9pk` by mtime in the cwd (`ls -t *.s9pk | head -1`); it does NOT build. Always `make x86` after a change, and run from this repo's root (the shell cwd can drift to `../recap`, where install would grab the app's `.s9pk` instead).
- Host comes from the `host:` field in `~/.startos/config.yaml` (a `<relay-host>.local` mDNS name). Never edit that file without authorization.
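
The newest-by-mtime rule is why a stale or foreign package can get installed. The sketch below mirrors `ls -t *.s9pk | head -1` over an in-memory directory listing; `pickNewestPackage` and the file names are hypothetical, used only to illustrate the selection rule.

```javascript
// Hypothetical helper: same selection rule as `ls -t *.s9pk | head -1`.
// Each entry is { name, mtimeMs }, e.g. built from fs.statSync() results.
function pickNewestPackage(entries) {
  const packages = entries.filter((e) => e.name.endsWith(".s9pk"));
  if (packages.length === 0) return null;
  // The most recently modified package wins, whatever its version string says.
  return packages.reduce((newest, e) => (e.mtimeMs > newest.mtimeMs ? e : newest)).name;
}

// Illustrative listing: the stale 0.2.123 build was touched last, so it is picked.
const listing = [
  { name: "relay_0.2.124.s9pk", mtimeMs: 1000 },
  { name: "relay_0.2.123.s9pk", mtimeMs: 2000 },
  { name: "notes.txt", mtimeMs: 3000 },
];
console.log(pickNewestPackage(listing));
```

The version in the file name plays no part in the choice, which is why rebuilding with `make x86` right before `make install` matters.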
Directory layout (key files)
server/
routes/internal-meetings.js upload → pipeline → save; the /admin/internal-meetings/* API,
including the post-hoc speaker-edit + download endpoints
speaker-clustering.js cross-chunk voice clustering (agglomerative, cosine sim) +
assignSpeakersToSegments + small-cluster suppression
post-cluster-polish.js Stage 1 runNameInference + Stage 2 runSummaryPolish (per-window)
meeting-extras.js decisions / action items / open questions / key quotes extraction
meeting-speaker-edits.js post-hoc record edits: mergeSpeakersInRecord,
reclusterMeetingRecord, applyPolishedSummaries, backfillEntrySpeakers
backends/hardware.js Parakeet transcribe + /api/audio/diarize-chunk + chunking + vLLM analyze
chunked-analyze.js windowed analyze (planWindowsByDuration, runPipelinedAnalysis, …)
config.js getConfigSnapshot() + relay_* config defaults
hardware-config.js resolveHardwareConfig() → Spark Control endpoint discovery
test/ node --test files (speaker-clustering, meeting-speaker-edits, credits)
public/dashboard.html operator dashboard (meetings detail view + speaker tools)
startos/versions/<vN>.ts one file per version + index.ts graph
docs/issues-backlog.md detailed issue log
docs/guides/internal-meetings.md diarization / speaker subsystem guide (path-scoped; lazy-loads via .claude/rules/)
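
For orientation, the "agglomerative, cosine sim" clustering named for `speaker-clustering.js` can be sketched generically as follows. This is a textbook average-linkage version under assumed inputs (plain number arrays as voice embeddings, a single similarity threshold); it is not the repo's implementation, and `clusterEmbeddings` is a hypothetical name.

```javascript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical sketch: average-linkage agglomerative clustering.
// Returns clusters as arrays of indices into `embeddings`.
function clusterEmbeddings(embeddings, threshold) {
  let clusters = embeddings.map((_, i) => [i]); // start with one cluster per embedding

  // Average pairwise similarity between the members of two clusters.
  const linkage = (c1, c2) => {
    let sum = 0;
    for (const i of c1) for (const j of c2) sum += cosineSimilarity(embeddings[i], embeddings[j]);
    return sum / (c1.length * c2.length);
  };

  while (clusters.length > 1) {
    let best = { sim: -Infinity, i: -1, j: -1 };
    for (let i = 0; i < clusters.length; i++) {
      for (let j = i + 1; j < clusters.length; j++) {
        const sim = linkage(clusters[i], clusters[j]);
        if (sim > best.sim) best = { sim, i, j };
      }
    }
    if (best.sim < threshold) break; // nothing left that is similar enough to merge
    clusters[best.i] = clusters[best.i].concat(clusters[best.j]);
    clusters.splice(best.j, 1);
  }
  return clusters;
}

// Two "voices": indices 0 and 1 point one way, 2 and 3 the other.
const voices = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]];
console.log(JSON.stringify(clusterEmbeddings(voices, 0.8)));
```

Raising the threshold yields more, smaller clusters (one speaker may split in two); lowering it merges more aggressively (two speakers may collapse into one).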
Endpoints (server-side contract)
All routes mount in server/index.js. Public paths sit under /relay/*; operator paths under /admin/*.
Auth model
- `X-Recap-Operator-Key` + `X-Recap-User-Id` → "cloud" path. The Recaps cloud server (recaps.cc) authenticates once with a shared operator key (`relay_cloud_operator_key`) and names the acting user. Credit pool keyed `user:<id>`, tier comes from the relay's stored row, NOT a per-user license. See `server/identity.js`.
- `X-Recap-Install-Id` (+ optional `Authorization: <license>`) → "license" path. Self-hosted installs and the operator's single-mode app. Credits/tier come from the resolved Keysat license + install id.
- Admin session cookie → `/admin/*`. Cookie issued by `POST /admin/login`; `/admin/login` and `/admin/status` are exempt inside `setupAdminAuthMiddleware`.
- Webhook signature → `POST /relay/btcpay/webhook` validates `BTCPay-Sig` against `relay_btcpay_webhook_secret`. Zaprite's webhook re-fetches the order through the Zaprite API to verify, so no shared-secret signing.
- `X-Recap-Job-Id` is a billing key, not auth: the first call with a given id charges one credit; later calls with the same id are free (so transcribe + analyze for one summary = one credit total).
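
A minimal sketch of the two header paths and the job-id billing rule, assuming an in-memory ledger. `resolveAuthPath`, `chargeOnce`, the `install:<id>` pool key, and the behavior when no job id is sent are illustrative assumptions; only the `user:<id>` key and the charge-once rule come from the text above, and the real logic lives in `server/identity.js`.

```javascript
// Hypothetical: classify a request by its X-Recap-* headers (lower-cased, as Node delivers them).
function resolveAuthPath(headers, operatorKey) {
  const key = headers["x-recap-operator-key"];
  const userId = headers["x-recap-user-id"];
  const installId = headers["x-recap-install-id"];
  if (key !== undefined) {
    // Cloud path: shared operator key plus the acting user.
    // A real server should compare keys in constant time.
    if (key !== operatorKey || !userId) return null;
    return { path: "cloud", pool: `user:${userId}` };
  }
  // License path. The "install:" pool key format is an assumption for this sketch.
  if (installId) return { path: "license", pool: `install:${installId}` };
  return null;
}

// Hypothetical: charge one credit per job id; repeat calls with the same id are free.
function chargeOnce(ledger, pool, jobId) {
  const seen = ledger.jobs.get(pool) ?? new Set();
  const balance = ledger.balances.get(pool) ?? 0;
  if (jobId && seen.has(jobId)) return { charged: false, balance };
  if (balance < 1) throw new Error("insufficient credits");
  ledger.balances.set(pool, balance - 1);
  if (jobId) ledger.jobs.set(pool, seen.add(jobId));
  return { charged: true, balance: balance - 1 };
}

const ledger = { balances: new Map([["user:42", 3]]), jobs: new Map() };
const who = resolveAuthPath(
  { "x-recap-operator-key": "example-key", "x-recap-user-id": "42" },
  "example-key",
);
console.log(who.path, chargeOnce(ledger, who.pool, "job-1")); // transcribe: charged
console.log(who.path, chargeOnce(ledger, who.pool, "job-1")); // analyze, same job: free
```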
/relay/* (public; per-call header auth)
- `GET /relay/health`: liveness; tolerates partial config. (`routes/health.js`)
- `GET /relay/policy`: `{ tiers, core_total_credits, core_gemini_credits }`; no auth. (`routes/policy.js`)
- `GET /relay/capabilities`: operator-wide feature flags (hardware ready, TTS backend choice, etc.). `X-Recap-Install-Id` optional. (`routes/capabilities.js`)
- `GET /relay/balance`: caller's credit balance. (`routes/balance.js`)
- `POST /relay/transcribe`: multipart audio → `{ text, segments, duration_seconds, model, ... }`. Body fields: `mime_type`, `title`, `channel`, `description`. (`routes/transcribe.js`)
- `POST /relay/transcribe-url`: async; `{ media_url, type, mime_type, title, channel, description, chapters }` → `{ job_id }`, then poll `GET /relay/jobs/:id`. (`routes/transcribe-url.js`)
- `POST /relay/summarize-url`: async; same body shape, full transcribe+analyze pipeline → `{ job_id }`, then stream `GET /relay/summarize-url/:jobId/events` (SSE). (`routes/summarize-url.js`)
- `POST /relay/analyze`: `{ transcript, … }` → topic sections JSON. (`routes/analyze.js`)
- `POST /relay/tts`: text → audio; gated by `capabilities.has_tts`. (`routes/tts.js`)
- `GET /relay/credits/packages`, `POST /relay/credits/buy`, `GET /relay/credits/invoice/:id`: à-la-carte credit purchase (BTCPay). (`routes/credits.js`)
- `POST /relay/btcpay/webhook`: BTCPay settle → either `extendUserTier` (subscription) or credit grant (à-la-carte). HMAC validated. (`routes/credits.js`)
- `POST /relay/zaprite/webhook`: Zaprite settle → `extendUserTier` only. Re-fetches order to verify. (`routes/zaprite-webhook.js`)
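
The HMAC check on the BTCPay webhook can be sketched as below. It assumes BTCPay's header format of `sha256=` followed by the hex HMAC-SHA256 of the raw request body; `verifyBtcpaySignature` is a hypothetical name, and the code in `routes/credits.js` may differ. The secret shown is a placeholder standing in for `relay_btcpay_webhook_secret`.

```javascript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: returns true only when the BTCPay-Sig header matches
// the HMAC of the raw (unparsed) request body.
function verifyBtcpaySignature(rawBody, sigHeader, secret) {
  if (typeof sigHeader !== "string" || !sigHeader.startsWith("sha256=")) return false;
  const expected = "sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex");
  const got = Buffer.from(sigHeader);
  const want = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return got.length === want.length && timingSafeEqual(got, want);
}

const secret = "placeholder-secret";
const body = Buffer.from('{"type":"InvoiceSettled"}');
const goodSig = "sha256=" + createHmac("sha256", secret).update(body).digest("hex");

console.log(verifyBtcpaySignature(body, goodSig, secret));
console.log(verifyBtcpaySignature(body, "sha256=deadbeef", secret));
```

The signature covers the exact bytes received, so the check has to run on the raw body before any JSON parsing re-serializes it.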
/relay/* (operator-key only — cloud → relay control plane)
All require a valid X-Recap-Operator-Key. Defined in routes/user-tier.js.
- `POST /relay/user-tier`: `{ user_id, tier: "core"|"pro"|"max", expires_at? }` → sets the cloud user's stored tier (operator comp grants live here).
- `POST /relay/tier-invoice`: `{ user_id, tier: "pro"|"max", return_url }` → mints a BTCPay tier-purchase invoice (Lightning QR).
- `POST /relay/tier-zaprite-order`: same idea on the card rail.
- `GET /relay/tier-plans`: `{ ok, period_days, plans: [{tier, sats, fiat_amount, fiat_currency, credits_per_period}], card_available }`. `credits_per_period: null` → "Unlimited"; never hardcode this label.
- `GET /relay/expiring-subscriptions?within_days=7&lapsed_days=3`: `{ ok, now, subscriptions: [{user_id, tier, expires_at, expired, days_left}] }`. The Recaps server maps user_id → email and sends the reminder; the relay never sees email.
- `GET /relay/user-tier/:userId`: read the stored row.
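
On the consumer side, the "never hardcode this label" rule for `/relay/tier-plans` might look like the sketch below: the label is derived from `credits_per_period` rather than from the tier name. `creditsLabel`, `describePlans`, and every number in the sample response are made up for illustration.

```javascript
// Hypothetical formatter: "Unlimited" is driven by the null value, not by the tier name.
function creditsLabel(plan) {
  return plan.credits_per_period === null
    ? "Unlimited"
    : `${plan.credits_per_period} credits`;
}

// One display line per plan in a /relay/tier-plans response.
function describePlans(response) {
  return response.plans.map(
    (plan) => `${plan.tier}: ${creditsLabel(plan)} per ${response.period_days} days`,
  );
}

// Sample response with invented values, shaped like the documented payload.
const sample = {
  ok: true,
  period_days: 30,
  plans: [
    { tier: "pro", sats: 1000, fiat_amount: 1, fiat_currency: "USD", credits_per_period: 100 },
    { tier: "max", sats: 2000, fiat_amount: 2, fiat_currency: "USD", credits_per_period: null },
  ],
  card_available: false,
};
console.log(describePlans(sample).join("\n"));
```

If the operator later caps a tier, the label changes with the payload and no client code needs editing.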
/admin/* (operator dashboard; cookie-gated)
routes/admin.js: GET /admin/{usage,config,license-cache,hardware-queue,jobs,jobs-history,job-output/:id,job/:id/details,output-store-stats,output-store-ids,dashboard,dashboard.csv,settings}, POST /admin/{quotas,wipe-all,settings/promote-prompt}, PUT /admin/settings, DELETE /admin/job-outputs. routes/admin-test-run.js: POST /admin/{test-run,test-run-suite}. BTCPay setup wizard under /admin/btcpay/* (routes/btcpay-setup.js).
/admin/internal-meetings/* (cookie-gated; routes/internal-meetings.js)
- `POST /upload`: multipart audio; runs the full pipeline (chunk → diarize → cluster → analyze → polish → extras → save). Audio is deleted after.
- `GET /` → `{ meetings: [...] }`; `GET /:id` → full saved record (`rec`).
- `GET /:id/markdown`, `GET /:id/html`, `GET /:id/download`: exports.
- `GET /jobs/:id`, `GET /jobs/:id/stream` (SSE): progress for a running upload.
- `PATCH /:id/speakers`: rename a cluster (display-name only).
- `PATCH /:id/entries`: per-line `speaker_override`.
- `PATCH /:id/merge-speakers`: fold cluster(s) into one (split-as-two). Offline, no LLM.
- `POST /:id/recluster`: re-run clustering at a new threshold (merged-as-one). Offline, uses `rec.diarization` fingerprints. Resets `speaker_names`, per-line overrides, and extras attributions. 400 if no fingerprints.
- `POST /:id/repolish`: re-runs `runSummaryPolish` with the CURRENT names (no re-inference). Synchronous; needs hardware analyze online; 400 if no named speakers.
- `DELETE /:id`.
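
The progress stream at `GET /jobs/:id/stream` is Server-Sent Events, so a consumer needs little more than a text parser. The sketch below handles the standard `event:` and `data:` fields for input that contains whole events; `parseSseText`, the `progress` event name, and the `stage` payload are hypothetical, since the actual event names and payloads are not listed here.

```javascript
// Hypothetical minimal SSE parser: events are separated by a blank line.
function parseSseText(text) {
  const events = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no "event:" field is present
    const data = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment line, often used as keep-alive
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
    }
    // Blocks without data (comments, trailing blank) produce no event.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

// Invented stream content for illustration.
const streamText =
  'event: progress\ndata: {"stage":"diarize"}\n\n' +
  ": keep-alive\n\n" +
  "data: done\n\n";
const sseEvents = parseSseText(streamText);
console.log(sseEvents);
```

A real client would also buffer partial chunks until the blank-line terminator arrives; this sketch skips that step.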
Cross-repo changes (sibling: ../recap)
This repo and the Recaps app (../recap) share a live client/server contract — the
/relay/* endpoints, the X-Recap-* headers, request/response shapes, and tier/credit
semantics. Before finishing any change that touches that boundary, check whether
../recap needs a matching change. If you add/rename/remove an endpoint, alter a payload
shape or header, or shift tier/credit/billing behavior, update the consumer side too — and
reflect it in BOTH repos' AGENTS.md (the contract docs) and ROADMAP.md (if it's staged
work). Purely internal changes (diarization tuning, dashboard layout, packaging) don't need
this. When unsure whether a change is contract-affecting, assume it is and check.
Conventions for this codebase specifically
- Before editing the internal-meetings / diarization / speaker subsystem, read `docs/guides/internal-meetings.md`: the diarize→cluster→polish pipeline, the four-places speaker-label sync rule, the clustering-threshold knobs, and the post-hoc speaker-edit (merge / recluster / repolish) semantics live there. Scoped to `server/{speaker-clustering,post-cluster-polish,meeting-extras,meeting-speaker-edits,chunked-analyze}.js`, `server/routes/internal-meetings.js`, `server/backends/hardware.js`.
- Doc layout: `AGENTS.md` is canonical; `CLAUDE.md` is a symlink to it (don't overwrite it). Subsystem guides are real files in `docs/guides/<topic>.md` (with `paths:` frontmatter); `.claude/rules/<topic>.md` are relative symlinks into them (`.gitignore` carves out `!.claude/rules/` so the symlinks commit). New guide = add `docs/guides/<topic>.md`, symlink it from `.claude/rules/`, add an index line above.
- `make install` correctness: see [Always]. Honest reports; failing test/build is a failure. Comments explain WHY. Write tests alongside (`server/test/*.test.js`, `node --test`).
Always
- Bump the version before EVERY `make install`: StartOS dedupes sideloads by version string, so an unbumped reinstall (even one line changed) silently no-ops. `make bump` → `make x86` → `make install`. See memory `bump-before-install` (applies to this repo AND `../recap`).
- Add new version files to BOTH the import block AND the `other:` list in `startos/versions/index.ts`, and point `current:` at the new constant. `make bump` does this for you.
- Build freely; ask before anything that leaves this machine. `make x86` / `make install` (to the operator's own box) are fine. `make deploy` / `make redeploy` are NOT.
- Reference env-var / config names, never values. Relay secrets (operator key, Gemini key, SMTP, Zaprite, BTCPay) live in gitignored env; docs name them only.
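
The bump-before-install rule reduces to a string comparison, which the sketch below spells out together with the version-to-constant naming seen in `current: v_0_2_124`. All three helpers are hypothetical and assume plain `x.y.z` version strings; they do not describe what `make bump` actually runs.

```javascript
// Hypothetical: "0.2.124" -> "v_0_2_124", the constant name style used in the version graph.
function versionConstant(version) {
  if (!/^\d+\.\d+\.\d+$/.test(version)) {
    throw new Error(`expected x.y.z, got "${version}"`);
  }
  return "v_" + version.split(".").join("_");
}

// Hypothetical: next patch version.
function bumpPatch(version) {
  const [major, minor, patch] = version.split(".").map(Number);
  return `${major}.${minor}.${patch + 1}`;
}

// Hypothetical pre-install guard: an identical version string means the sideload is a no-op.
function assertSafeToInstall(installedVersion, packageVersion) {
  if (installedVersion === packageVersion) {
    throw new Error(`version ${packageVersion} is already installed; bump first`);
  }
}

const next = bumpPatch("0.2.124");
console.log(next, versionConstant(next));
assertSafeToInstall("0.2.124", next); // passes: versions differ
```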
Never
- Never `make deploy` / `make redeploy` / upload to the registry. This package is private to the operator's box. (Memory: `feedback_relay_never_to_registry`.)
- No "Co-Authored-By" / no "Claude" mentions in commits or source.
- Never edit a `startos/versions/<v>.ts` that's already been built/installed; add a new version file.
- Don't push to GitHub by default; remote is self-hosted Gitea.
Current state — box AND working tree at 0.2.124; git is the gap
- Box AND local working tree are both at relay `0.2.124` (app at `0.2.155`). `startos/versions/index.ts` has `current: v_0_2_124`; the StartOS dashboard reflects the same.
- Version files `v0.2.117`–`v0.2.124` are present in the working tree (untracked). A concurrent 2026-06-13 session continued from this session's 0.2.117, bumped through 0.2.124, and shipped to the box; re-read the tree before assuming what's there.
- Post-hoc speaker tools are live: `meeting-speaker-edits.js` (merge / recluster / repolish + backfill) and the matching `PATCH/POST /admin/internal-meetings/:id/{merge-speakers,recluster,repolish}` routes are present; the dashboard exposes the controls. Tests pass via `cd server && npm test`.
- The real gap is git, not versions. The last committed code is `b7f7590 v0.2.11 /relay/capabilities + /relay/transcribe-url`; the commit(s) stacked on top of it are docs-only (the AGENTS/ROADMAP consolidation). So everything from `v0.2.12` → `v0.2.124` (the entire internal-meetings feature, diarization, speaker-edit tools, billing, the user-tier control plane) is uncommitted. Working-tree counts: 28 modified, 150 untracked, 5 deleted (183 total) as of this read. "Catching up git" = committing this tree (see ROADMAP).