From fb11dd6a040be0ea94d93eed2281b07207208cb3 Mon Sep 17 00:00:00 2001 From: Keysat Date: Sat, 13 Jun 2026 13:36:46 -0500 Subject: [PATCH] Trim AGENTS.md; extract internal-meetings guide + lazy-load wiring --- .claude/rules/internal-meetings.md | 1 + .gitignore | 6 ++-- AGENTS.md | 24 +++------------ docs/guides/internal-meetings.md | 47 ++++++++++++++++++++++++++++++ docs/issues-backlog.md | 33 +++++++++++++++++++++ 5 files changed, 89 insertions(+), 22 deletions(-) create mode 120000 .claude/rules/internal-meetings.md create mode 100644 docs/guides/internal-meetings.md create mode 100644 docs/issues-backlog.md diff --git a/.claude/rules/internal-meetings.md b/.claude/rules/internal-meetings.md new file mode 120000 index 0000000..4a85db0 --- /dev/null +++ b/.claude/rules/internal-meetings.md @@ -0,0 +1 @@ +../../docs/guides/internal-meetings.md \ No newline at end of file diff --git a/.gitignore b/.gitignore index 487a5b5..5f15e2b 100644 --- a/.gitignore +++ b/.gitignore @@ -27,5 +27,7 @@ ytdlp-cache/ # Local dev secrets .env -# Claude Code state (worktrees, plans, etc.) -.claude/ +# Claude Code state (worktrees, plans, etc.) — but commit the lazy-load +# rule symlinks under .claude/rules/ (they point into docs/guides/). +.claude/* +!.claude/rules/ diff --git a/AGENTS.md b/AGENTS.md index ec55a46..c1fb1a9 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -27,7 +27,7 @@ Run from repo root unless noted. - `make install` picks the **newest `*.s9pk` by mtime in the cwd** (`ls -t *.s9pk | head -1`) — it does NOT build. Always `make x86` after a change, and run from this repo's root (the shell cwd can drift to `../recap`, where install would grab the *app's* `.s9pk` instead). - Host comes from the `host:` field in `~/.startos/config.yaml` (a `.local` mDNS name). Never edit that file without authorization. 
-## Directory layout (what this session touched / verified) +## Directory layout (key files) ``` server/ @@ -47,18 +47,9 @@ server/ public/dashboard.html operator dashboard (meetings detail view + speaker tools) startos/versions/.ts one file per version + index.ts graph docs/issues-backlog.md detailed issue log +docs/guides/internal-meetings.md diarization / speaker subsystem guide (path-scoped; lazy-loads via .claude/rules/) ``` -## Internal-meetings pipeline (how speakers are produced) - -1. **Chunk** audio into ~5-min pieces (`relay_hardware_tx_chunk_minutes`) with a few seconds overlap. -2. **Per-chunk diarize** at Spark Control `/api/audio/diarize-chunk`: **Sortformer** emits chunk-local labels (`Speaker_0/1`), **TitaNet** emits a 192-dim voice fingerprint per local speaker. Labels are meaningless across chunks; fingerprints are not. -3. **Cross-chunk cluster** (`speaker-clustering.js`, `clusterSpeakers`): average-linkage agglomerative clustering over all fingerprints by cosine similarity → global `Speaker_A/B/…`. Then a **small-cluster suppression** pass folds brief clusters into anchors or `Speaker_Unknown`. -4. **Analyze** (windowed) → section `{title, summary, startIndex, endIndex}`. -5. **Polish** (`post-cluster-polish.js`): `runNameInference` infers real names from the transcript, then `runSummaryPolish` rewrites each section summary to attribute statements to those names. -6. **Extras** (`meeting-extras.js`). -7. **Audio is deleted after processing** (success or failure) — the relay never retains uploaded audio. - ## Endpoints (server-side contract) All routes mount in `server/index.js`. Public paths sit under `/relay/*`; operator paths under `/admin/*`. @@ -127,15 +118,8 @@ this. 
When unsure whether a change is contract-affecting, assume it is and check ## Conventions for this codebase specifically -- **A saved meeting record stores the per-chunk TitaNet fingerprints in `rec.diarization`.** Because the audio is gone, this is what makes re-clustering possible *offline* — no re-upload, no Spark Control round-trip. -- **Speaker labels live in FOUR places that every edit must keep in sync:** `rec.transcript_segments[].speaker`, `rec.chunks[].entries[].speaker` (+ `.speaker_override`), `rec.speakers` (per-cluster stats), and `rec.extras` (`tldr.primary_speakers`, `decisions[].agreed_by`, `action_items[].owner`, `key_quotes[].speaker`). Display names are a separate map: `rec.speaker_names`. -- **Over-merging (two people clustered as one) is tuned by `relay_hardware_voice_clustering_threshold`** (raise it, e.g. 70→80, to split similar voices) plus the suppression knobs `relay_hardware_anchor_min_speaking_sec` / `relay_hardware_small_cluster_max_speaking_sec` / `relay_hardware_uncertain_margin_pct`. All operator-config-driven; never hardcode. -- **Post-hoc speaker-edit endpoints** (operator dashboard, added this session — `server/meeting-speaker-edits.js`): - - `PATCH /admin/internal-meetings/:id/speakers` — rename a cluster (display name only; pre-existing). - - `PATCH /admin/internal-meetings/:id/entries` — per-line `speaker_override` (pre-existing). - - `PATCH /admin/internal-meetings/:id/merge-speakers` — fold cluster(s) into one (ONE person split as two). Pure, offline, no LLM. - - `POST /admin/internal-meetings/:id/recluster` — re-run clustering at a new threshold (TWO people merged as one). Pure, offline (uses `rec.diarization` fingerprints); **resets** `speaker_names`, per-line overrides, and extras attributions — operator re-labels afterward. 400 if no fingerprints saved. 
- - `POST /admin/internal-meetings/:id/repolish` — re-run `runSummaryPolish` with the **current** names (no re-inference) so topic summaries re-attribute after a rename/merge. The ONLY LLM-backed edit; needs the analyze hardware online; 400 if no named speakers. +- **Before editing the internal-meetings / diarization / speaker subsystem, read `docs/guides/internal-meetings.md`** — the diarize→cluster→polish pipeline, the four-places speaker-label sync rule, the clustering-threshold knobs, and the post-hoc speaker-edit (merge / recluster / repolish) semantics live there. Scoped to `server/{speaker-clustering,post-cluster-polish,meeting-extras,meeting-speaker-edits,chunked-analyze}.js`, `server/routes/internal-meetings.js`, `server/backends/hardware.js`. +- **Doc layout**: `AGENTS.md` is canonical; `CLAUDE.md` is a symlink to it (don't overwrite it). Subsystem guides are real files in `docs/guides/.md` (with `paths:` frontmatter); `.claude/rules/.md` are relative symlinks into them (`.gitignore` carves out `!.claude/rules/` so the symlinks commit). New guide = add `docs/guides/.md`, symlink it from `.claude/rules/`, add an index line above. - **`make install` correctness**: see [Always]. Honest reports; failing test/build is a failure. Comments explain WHY. Write tests alongside (`server/test/*.test.js`, `node --test`). 
## Always diff --git a/docs/guides/internal-meetings.md b/docs/guides/internal-meetings.md new file mode 100644 index 0000000..ec837c7 --- /dev/null +++ b/docs/guides/internal-meetings.md @@ -0,0 +1,47 @@ +--- +paths: + - "server/routes/internal-meetings.js" + - "server/speaker-clustering.js" + - "server/post-cluster-polish.js" + - "server/meeting-extras.js" + - "server/meeting-speaker-edits.js" + - "server/chunked-analyze.js" + - "server/backends/hardware.js" +--- + +# Internal-meetings / diarization / speaker subsystem + +Subsystem guide for the internal-meetings feature: the upload → transcribe → diarize → +cluster → analyze → polish pipeline, and the post-hoc speaker-edit tools the operator +dashboard exposes. Whole-repo facts (stack, commands, endpoint contract, tier/billing) +live in `../../AGENTS.md`; this file lazy-loads when you edit the files it's scoped to. + +## Pipeline (how speakers are produced) + +1. **Chunk** audio into ~5-min pieces (`relay_hardware_tx_chunk_minutes`) with a few seconds overlap. +2. **Per-chunk diarize** at Spark Control `/api/audio/diarize-chunk`: **Sortformer** emits chunk-local labels (`Speaker_0/1`), **TitaNet** emits a 192-dim voice fingerprint per local speaker. Labels are meaningless across chunks; fingerprints are not. +3. **Cross-chunk cluster** (`speaker-clustering.js`, `clusterSpeakers`): average-linkage agglomerative clustering over all fingerprints by cosine similarity → global `Speaker_A/B/…`. Then a **small-cluster suppression** pass folds brief clusters into anchors or `Speaker_Unknown`. +4. **Analyze** (windowed, `chunked-analyze.js`) → section `{title, summary, startIndex, endIndex}`. +5. **Polish** (`post-cluster-polish.js`): `runNameInference` infers real names from the transcript, then `runSummaryPolish` rewrites each section summary to attribute statements to those names. +6. **Extras** (`meeting-extras.js`): decisions / action items / open questions / key quotes. +7. 
**Audio is deleted after processing** (success or failure) — the relay never retains uploaded audio.
+
+## Conventions
+
+- **A saved meeting record stores the per-chunk TitaNet fingerprints in `rec.diarization`.** Because the audio is gone, this is what makes re-clustering possible *offline* — no re-upload, no Spark Control round-trip.
+- **Speaker labels live in FOUR places that every edit must keep in sync:** `rec.transcript_segments[].speaker`, `rec.chunks[].entries[].speaker` (+ `.speaker_override`), `rec.speakers` (per-cluster stats), and `rec.extras` (`tldr.primary_speakers`, `decisions[].agreed_by`, `action_items[].owner`, `key_quotes[].speaker`). Display names are a separate map: `rec.speaker_names`.
+- **Over-merging (two people clustered as one) is tuned by `relay_hardware_voice_clustering_threshold`** (raise it, e.g. 70→80, to split similar voices) plus the suppression knobs `relay_hardware_anchor_min_speaking_sec` / `relay_hardware_small_cluster_max_speaking_sec` / `relay_hardware_uncertain_margin_pct`. All operator-config-driven; never hardcode.
+
+## Post-hoc speaker-edit endpoints (`server/meeting-speaker-edits.js`)
+
+Operator-dashboard edits to a saved record, mounted under `/admin/internal-meetings/:id/*`
+(routing in `server/routes/internal-meetings.js`). Every edit must keep the four label
+locations above in sync.
+
+- `PATCH /admin/internal-meetings/:id/speakers` — rename a cluster (display name only; pre-existing).
+- `PATCH /admin/internal-meetings/:id/entries` — per-line `speaker_override` (pre-existing).
+- `PATCH /admin/internal-meetings/:id/merge-speakers` — fold cluster(s) into one (ONE person split as two). Pure, offline, no LLM.
+- `POST /admin/internal-meetings/:id/recluster` — re-run clustering at a new threshold (TWO people merged as one). Pure, offline (uses `rec.diarization` fingerprints); **resets** `speaker_names`, per-line overrides, and extras attributions — operator re-labels afterward. 400 if no fingerprints saved.
+- `POST /admin/internal-meetings/:id/repolish` — re-run `runSummaryPolish` with the **current** names (no re-inference) so topic summaries re-attribute after a rename/merge. The ONLY LLM-backed edit; needs the analyze hardware online; 400 if no named speakers.
+
+Test coverage: `server/test/speaker-clustering.test.js`, `server/test/meeting-speaker-edits.test.js`, `server/test/polish-speaker-labels.test.js` (`node --test`).
diff --git a/docs/issues-backlog.md b/docs/issues-backlog.md
new file mode 100644
index 0000000..f501ddb
--- /dev/null
+++ b/docs/issues-backlog.md
@@ -0,0 +1,33 @@
+# Recap Relay — Issues Backlog
+
+Things to come back to. Each entry: what was observed, why it's queued, possible causes.
+
+---
+
+## Empty analysis section on chunked-analyze output
+
+**Observed:** 2026-05-19, during the v0.2.77 Phase 1A smoke test.
+- 94-min YouTube video processed end-to-end on hardware path
+- 6 analyze windows completed cleanly per the relay logs
+- 15 sections rendered in the Recaps UI; ONE section (timestamp 1:07:40 → 1:10:25) had no title and no description
+- All other sections normal
+
+**Why not a Phase 1A regression:**
+- Empty section sits at an analyze-window boundary (~1 hour in), NOT at an audio-chunk overlap boundary (those are at multiples of 270 seconds)
+- Phase 1A only changed how audio is split for transcription, didn't touch the analyze step
+
+**Possible causes:**
+
+1. The hardware LLM (`RedHatAI/Qwen3.6-35B-A3B` at the time) returned a section in its JSON with empty `title` and `summary` strings. The chunked-analyze stitcher currently accepts that as a valid section. Some LLMs hallucinate empty sections at boundaries where they're uncertain.
+2. The stitcher's window-merge logic created a degenerate section spanning the gap between two windows' claimed coverage. The window-overlap math may have a hole.
+
+**Triage path when picked up:**
+
+- In the relay's Jobs detail view, find the v0.2.77 smoke-test job for the 94-min Sovreign podcast
+- Inspect the raw JSON each of the 6 analyze windows returned
+- If window 4 or 5's JSON contains `{"title": "", "summary": "", "startIndex": …, "endIndex": …}`, it's cause #1 — fix by filtering empty sections in the stitcher
+- If the windows' JSON looks clean but the stitched output has a gap, it's cause #2 — fix the window-merge boundary math
+
+**Priority:** Low. 1-in-15 sections affected, content still readable, doesn't block release. Worth fixing before broader hardware-path rollout but not blocking diarization work.
+
+**Status:** queued. Picking up after Phase 1D ships (diarization complete).
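
Reviewer sketch (not part of the patch): the cause #1 fix named in the triage path, "filtering empty sections in the stitcher", might look roughly like the following. It assumes the stitcher holds an array of section objects shaped `{title, summary, startIndex, endIndex}` as described in the guide. `dropEmptySections` and `isBlank` are hypothetical names for illustration, not functions in `chunked-analyze.js`.

```javascript
// Hypothetical helpers for illustration only; neither exists in this patch.
// Assumed section shape: {title, summary, startIndex, endIndex}.

// Treat missing, non-string, or whitespace-only values as blank.
function isBlank(value) {
  return typeof value !== "string" || value.trim() === "";
}

// Split sections into those worth keeping and those with BOTH a blank
// title and a blank summary. A section with only one blank field is kept.
function dropEmptySections(sections) {
  const kept = [];
  const dropped = [];
  for (const section of sections) {
    if (isBlank(section.title) && isBlank(section.summary)) {
      dropped.push(section);
    } else {
      kept.push(section);
    }
  }
  return { kept, dropped };
}

// Example input: three sections, the middle one empty.
const example = dropEmptySections([
  { title: "Intro", summary: "Hosts introduce the topic.", startIndex: 0, endIndex: 41 },
  { title: "", summary: "", startIndex: 42, endIndex: 57 },
  { title: "Wrap-up", summary: "Closing remarks.", startIndex: 58, endIndex: 90 },
]);
console.log(example.kept.length, example.dropped.length);
```

Returning the dropped sections as well as the kept ones would let the caller log each dropped `startIndex`/`endIndex` range, which may help tell cause #1 from cause #2 in the relay logs. The sketch leaves open whether a dropped range should be folded into a neighbouring section.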