recap-relay/AGENTS.md

# AGENTS.md — Recap Relay

Operator-side, credit-metered service that sits in front of Gemini and the operator's local AI hardware ("Spark Control": Parakeet ASR, Sortformer diarization, TitaNet voice embeddings, a vLLM/Gemma analyze endpoint). The Recaps app (`../recap`) is the client; this repo owns transcription/diarization/analysis routing, the cloud Pro/Max tier + expiry, self-serve billing settlement, and the **internal-meetings** feature (upload audio → transcribe → diarize → cluster → analyze → polish → operator dashboard). **Private. Ships to the operator's own Start9 box via `make install` only — NEVER to the public registry.**

## Stack

- **Server**: Node.js (`type: module`, ES modules). Developed on the same box as the app (Node `v25.6.1`); the container runtime is whatever the `Dockerfile` pins.
- **HTTP**: `express` + `multer` (audio upload). Admin routes under `/admin/*` behind an admin-session-cookie gate; relay-to-relay routes under `/relay/*` behind the operator key.
- **Dashboard**: `public/dashboard.html` — single-file vanilla JS, render-string-into-innerHTML, same shape as the app's `index.html`.
- **Packaging**: `@start9labs/start-sdk` under `startos/` — version graph at `startos/versions/index.ts`.
- **Storage**: filesystem under the StartOS data dir (`/data`). Internal meetings persist as `/data/internal-meetings/<id>.json`. No SQLite here.
- **Upstreams**: Gemini (`@google/genai`); operator hardware via "Spark Control" HTTP (Parakeet transcribe, `/api/audio/diarize-chunk` for Sortformer+TitaNet, a vLLM/Gemma OpenAI-shape analyze endpoint).

## Commands

Run from repo root unless noted.

| Action | Command |
|---|---|
| Run all tests | `cd server && npm test` (built-in `node --test`) |
| Run one test file | `cd server && node --test test/<file>.test.js` |
| Build `.s9pk` (x86) | `make x86` |
| Bump version (interactive) | `make bump` |
| Install to operator's Start9 box | `make install` *(bump FIRST — see Always)* |
| Deploy to registry | `make deploy` / `make redeploy` — **NEVER run these here** (private package) |

- `make install` does NOT build; it picks the **newest `*.s9pk` by mtime in the cwd** (`ls -t *.s9pk | head -1`). Always run `make x86` after a change, and run both commands from this repo's root (the shell cwd can drift to `../recap`, where install would grab the *app's* `.s9pk` instead).
- Host comes from the `host:` field in `~/.startos/config.yaml` (a `<relay-host>.local` mDNS name). Never edit that file without authorization.
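
The "newest by mtime" selection is easy to get wrong when the cwd drifts, so a preflight check can help. The following is a minimal sketch, not part of the repo's `Makefile`: `pickNewestPackage` and `checkInstallTarget` are hypothetical names, the `recap-relay` filename prefix is an assumption about how the package is named, and the caller is assumed to build `entries` from the directory listing (for example with `fs.readdirSync` and `fs.statSync(...).mtimeMs`).

```javascript
// Hypothetical preflight helpers; names and the filename prefix are assumptions.
// `entries` is a list of { name, mtimeMs } objects describing files in the cwd.
function pickNewestPackage(entries) {
  const packages = entries.filter((e) => e.name.endsWith('.s9pk'));
  if (packages.length === 0) return null;
  // Same intent as `ls -t *.s9pk | head -1`: keep the entry with the newest mtime.
  return packages.reduce((best, e) => (e.mtimeMs > best.mtimeMs ? e : best));
}

// Refuse to proceed when the newest package does not look like the relay's.
function checkInstallTarget(entries, expectedPrefix = 'recap-relay') {
  const newest = pickNewestPackage(entries);
  if (!newest) {
    return { ok: false, reason: 'no .s9pk in cwd; run make x86 first' };
  }
  if (!newest.name.startsWith(expectedPrefix)) {
    return { ok: false, reason: `newest package is ${newest.name}; wrong cwd?` };
  }
  return { ok: true, file: newest.name };
}
```

If the check fails, the likely causes are a missing `make x86` or a shell sitting in `../recap`.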

## Directory layout (what this session touched / verified)

```
server/
  routes/internal-meetings.js   upload → pipeline → save; the /admin/internal-meetings/* API,
                                including the post-hoc speaker-edit + download endpoints
  speaker-clustering.js         cross-chunk voice clustering (agglomerative, cosine sim) +
                                assignSpeakersToSegments + small-cluster suppression
  post-cluster-polish.js        Stage 1 runNameInference + Stage 2 runSummaryPolish (per-window)
  meeting-extras.js             decisions / action items / open questions / key quotes extraction
  meeting-speaker-edits.js      post-hoc record edits: mergeSpeakersInRecord,
                                reclusterMeetingRecord, applyPolishedSummaries, backfillEntrySpeakers
  backends/hardware.js          Parakeet transcribe + /api/audio/diarize-chunk + chunking + vLLM analyze
  chunked-analyze.js            windowed analyze (planWindowsByDuration, runPipelinedAnalysis, …)
  config.js                     getConfigSnapshot() + relay_* config defaults
  hardware-config.js            resolveHardwareConfig() → Spark Control endpoint discovery
  test/                         node --test files (speaker-clustering, meeting-speaker-edits, credits)
public/dashboard.html           operator dashboard (meetings detail view + speaker tools)
startos/versions/<vN>.ts        one file per version + index.ts graph
docs/issues-backlog.md          detailed issue log
```

## Internal-meetings pipeline (how speakers are produced)

1. **Chunk** audio into ~5-min pieces (`relay_hardware_tx_chunk_minutes`) with a few seconds of overlap.
2. **Per-chunk diarize** at Spark Control `/api/audio/diarize-chunk`: **Sortformer** emits chunk-local labels (`Speaker_0/1`), **TitaNet** emits a 192-dim voice fingerprint per local speaker. Labels are meaningless across chunks; fingerprints are not.
3. **Cross-chunk cluster** (`speaker-clustering.js`, `clusterSpeakers`): average-linkage agglomerative clustering over all fingerprints by cosine similarity → global `Speaker_A/B/…`. Then a **small-cluster suppression** pass folds brief clusters into anchors or `Speaker_Unknown`.
4. **Analyze** (windowed, `chunked-analyze.js`) → sections shaped `{title, summary, startIndex, endIndex}`.
5. **Polish** (`post-cluster-polish.js`): `runNameInference` infers real names from the transcript, then `runSummaryPolish` rewrites each section summary to attribute statements to those names.
6. **Extras** (`meeting-extras.js`): extract decisions, action items, open questions, and key quotes.
7. **Audio is deleted after processing** (success or failure) — the relay never retains uploaded audio.
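
Step 3 is the part that is hardest to picture, so here is a minimal sketch of average-linkage agglomerative clustering by cosine similarity. It is an illustration only and not the code in `speaker-clustering.js`: the function names are hypothetical, the threshold is expressed as a 0..1 similarity (assuming a config value of 70 corresponds to 0.70), and small-cluster suppression is left out.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Average linkage: mean pairwise similarity between members of two clusters.
function averageLinkage(clusterA, clusterB, sim) {
  let total = 0;
  for (const i of clusterA) for (const j of clusterB) total += sim[i][j];
  return total / (clusterA.length * clusterB.length);
}

// Repeatedly merge the most similar pair of clusters until no pair reaches
// `threshold` (0..1). Returns clusters as arrays of fingerprint indices.
function clusterFingerprints(fingerprints, threshold) {
  const sim = fingerprints.map((a) => fingerprints.map((b) => cosineSimilarity(a, b)));
  let clusters = fingerprints.map((_, i) => [i]);
  while (clusters.length > 1) {
    let best = { score: -Infinity, a: -1, b: -1 };
    for (let a = 0; a < clusters.length; a++) {
      for (let b = a + 1; b < clusters.length; b++) {
        const score = averageLinkage(clusters[a], clusters[b], sim);
        if (score > best.score) best = { score, a, b };
      }
    }
    if (best.score < threshold) break; // nothing left that is similar enough
    const merged = clusters[best.a].concat(clusters[best.b]);
    clusters = clusters.filter((_, idx) => idx !== best.a && idx !== best.b);
    clusters.push(merged);
  }
  return clusters;
}

// Map a cluster index to a global label (sketch: valid for up to 26 clusters).
function globalLabel(index) {
  return `Speaker_${String.fromCharCode(65 + index)}`;
}
```

In this sketch a higher threshold stops merging earlier and so yields more, smaller clusters, which matches the advice to raise the threshold when two people were clustered as one.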

## Conventions for this codebase specifically

- **A saved meeting record stores the per-chunk TitaNet fingerprints in `rec.diarization`.** Because the audio is gone, this is what makes re-clustering possible *offline* — no re-upload, no Spark Control round-trip.
- **Speaker labels live in FOUR places that every edit must keep in sync:** `rec.transcript_segments[].speaker`, `rec.chunks[].entries[].speaker` (+ `.speaker_override`), `rec.speakers` (per-cluster stats), and `rec.extras` (`tldr.primary_speakers`, `decisions[].agreed_by`, `action_items[].owner`, `key_quotes[].speaker`). Display names are a separate map: `rec.speaker_names`.
- **Over-merging (two people clustered as one) is tuned by `relay_hardware_voice_clustering_threshold`** (raise it, e.g. 70→80, to split similar voices) plus the suppression knobs `relay_hardware_anchor_min_speaking_sec` / `relay_hardware_small_cluster_max_speaking_sec` / `relay_hardware_uncertain_margin_pct`. All operator-config-driven; never hardcode.
- **Post-hoc speaker-edit endpoints** (operator dashboard, added this session — `server/meeting-speaker-edits.js`):
  - `PATCH /admin/internal-meetings/:id/speakers` — rename a cluster (display name only; pre-existing).
  - `PATCH /admin/internal-meetings/:id/entries` — per-line `speaker_override` (pre-existing).
  - `PATCH /admin/internal-meetings/:id/merge-speakers` — fold cluster(s) into one (ONE person split as two). Pure, offline, no LLM.
  - `POST  /admin/internal-meetings/:id/recluster` — re-run clustering at a new threshold (TWO people merged as one). Pure, offline (uses `rec.diarization` fingerprints); **resets** `speaker_names`, per-line overrides, and extras attributions — operator re-labels afterward. 400 if no fingerprints saved.
  - `POST  /admin/internal-meetings/:id/repolish` — re-run `runSummaryPolish` with the **current** names (no re-inference) so topic summaries re-attribute after a rename/merge. The ONLY LLM-backed edit; needs the analyze hardware online; 400 if no named speakers.
- **`make install` correctness**: see the **Always** section below.
- **Report honestly**: a failing test or build is a failure.
- **Comments explain WHY.**
- **Write tests alongside the change** (`server/test/*.test.js`, run with `node --test`).
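
To make the "four places" rule concrete, here is a minimal sketch of a merge that touches all of them. It is not the actual `mergeSpeakersInRecord`: the name `mergeSpeakerLabels` is hypothetical, and the field shapes are assumptions (`rec.speakers` as a map of label to `{ speaking_sec }`, `primary_speakers` and `agreed_by` as arrays of labels). Check the real record shape before borrowing any of it.

```javascript
// Hypothetical sketch; field shapes are assumptions, see the note above.
// Rewrites every occurrence of the `from` labels to `into` and returns a new
// record, leaving the input untouched.
function mergeSpeakerLabels(rec, from, into) {
  const fromSet = new Set(from);
  const relabel = (label) => (fromSet.has(label) ? into : label);
  const dedupe = (labels) => [...new Set(labels.map(relabel))];
  const out = JSON.parse(JSON.stringify(rec)); // records are plain JSON on disk

  // 1. Flat transcript segments.
  for (const seg of out.transcript_segments ?? []) seg.speaker = relabel(seg.speaker);

  // 2. Per-chunk entries, including any per-line override.
  for (const chunk of out.chunks ?? []) {
    for (const entry of chunk.entries ?? []) {
      entry.speaker = relabel(entry.speaker);
      if (entry.speaker_override) entry.speaker_override = relabel(entry.speaker_override);
    }
  }

  // 3. Per-cluster stats: fold speaking time into the surviving cluster.
  const stats = out.speakers ?? {};
  for (const label of from) {
    if (label === into || !stats[label]) continue;
    stats[into] = stats[into] ?? { speaking_sec: 0 };
    stats[into].speaking_sec += stats[label].speaking_sec ?? 0;
    delete stats[label];
  }

  // 4. Extras attributions.
  const extras = out.extras ?? {};
  if (extras.tldr?.primary_speakers) {
    extras.tldr.primary_speakers = dedupe(extras.tldr.primary_speakers);
  }
  for (const d of extras.decisions ?? []) {
    if (Array.isArray(d.agreed_by)) d.agreed_by = dedupe(d.agreed_by);
  }
  for (const a of extras.action_items ?? []) a.owner = relabel(a.owner);
  for (const q of extras.key_quotes ?? []) q.speaker = relabel(q.speaker);

  // Display names are a separate map: drop names for labels that no longer exist.
  for (const label of from) {
    if (label !== into && out.speaker_names) delete out.speaker_names[label];
  }
  return out;
}
```

Returning a copy rather than mutating in place keeps the operation pure, which is what lets an edit like this run offline with no LLM involved.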

## Always

- **Bump the version before EVERY `make install`** — StartOS dedupes sideloads by version string, so an unbumped reinstall (even one line changed) silently no-ops. `make bump` → `make x86` → `make install`. See memory `bump-before-install` (applies to this repo AND `../recap`).
- **Add new version files to BOTH the import block AND the `other:` list** in `startos/versions/index.ts`, and point `current:` at the new constant. `make bump` does this for you.
- **Build freely; ask before anything that leaves this machine.** `make x86` / `make install` (to the operator's own box) are fine. `make deploy` / `make redeploy` are NOT.
- **Reference env-var / config names, never values.** Relay secrets (operator key, Gemini key, SMTP, Zaprite, BTCPay) live in gitignored env; docs name them only.

## Never

- **Never `make deploy` / `make redeploy` / upload to the registry.** This package is private to the operator's box. (Memory: `feedback_relay_never_to_registry`.)
- **No "Co-Authored-By" / no "Claude" mentions** in commits or source.
- **Never edit a `startos/versions/<v>.ts` that's already been built/installed** — add a new version file.
- **Don't push to GitHub by default** — remote is self-hosted Gitea.

## Current state (2026-06-13) — at `0.2.124`; only git commits lag

- **Box AND local working tree are both at relay `0.2.124`** (app `0.2.155`). Confirmed on the StartOS UI (version + the Merge/Re-polish controls visible on the dashboard).
- **The version files `v0.2.117`–`v0.2.124` are all in this working tree** (untracked). v0.2.124's note is a billing change ("tier Bitcoin invoices return the Lightning BOLT11 + per-period credit allotment"). A **concurrent chat session** during 2026-06-13 continued from this session's 0.2.117, bumped through 0.2.124, and built+installed it to the box — so the working tree matches the box. (Heads-up: more than one session may be editing this tree; re-read before assuming.)
- **The post-hoc speaker tools are present and live**: `meeting-speaker-edits.js` (merge/recluster/repolish + backfill) and the matching `/admin/internal-meetings/:id/{merge-speakers,recluster,repolish}` routes; the dashboard shows the controls. Tests pass (32, `npm test`).
- **The real gap is git, not versions.** Committed HEAD is `v0.2.11`; everything since — v0.2.12→v0.2.124, the entire internal-meetings feature, diarization, speaker-edit tools, billing — is **uncommitted** (≈28 modified + 153 untracked). "Catching up local git" = committing this large working tree (see ROADMAP). The 0.2.117 this session installed was superseded by the concurrent 0.2.124 — **no box downgrade occurred.**