Reconcile docs/ specs with the shipped app

Document the dual-channel label-merge path (mic_file/system_file/self_name/self_vad) and the recap phase (transcript.md + recap.html via the backend LLM) across docs/01-03; correct docs/02 §2.10 to the UI actually shipped; mark docs/01 §7 open items as settled; remove the dead AUDIO_API.md references; note the manifest sha256 fields are not emitted; mark docs/04 as a complete/historical build log. Also drop the last stale "Phase 0" UI string in MenuBarView and retire the now-done doc-debt items in ROADMAP.
2026-06-16 22:09:04 -05:00
parent 85ea8fde45
commit dda4322de7
6 changed files with 106 additions and 56 deletions
@@ -64,6 +64,9 @@ pattern, the macOS APIs, and the SparkControl integration (now fully specified).
                       └────────────────┘  └────────────────────┘
 ```

+(After `speakers.json`, a recap phase renders `transcript.md` + `recap.html` via
+the backend LLM — see §2.11.)
+
 ## 2. Modules

 ### 2.1 `CallDetector`
@@ -176,8 +179,10 @@ Write the session folder and, if the call is longer than ~3 min, produce a
 ```

 ### 2.7 `SparkControlClient`
-Deliver to SparkControl. **Primary path = `POST /api/audio/label-merge`** with
-`file`, `timeline`, `known_voiceprints`, `transcribe=true`.
+Deliver to SparkControl. **Primary path = `POST /api/audio/label-merge`**. Sends
+**dual-channel** (`mic_file` + `system_file` + `self_name` + `self_vad`) when the
+system track is healthy, else the **mono** `file`; always with `timeline`,
+`known_voiceprints`, `transcribe=true`.
 - **Sequential only** — one audio request in flight (parallel ⇒ `503 + Retry-After`).
 - **Self-signed TLS** — skip verification (`URLSession` delegate trusting the
  Start9 cert) or trust the Root CA. **No auth on the LAN.**
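
The dual-channel versus mono choice described in §2.7 can be sketched as a small pure function. This is only an illustration under stated assumptions: the `LabelMergeMode` enum, the function names, and the single `systemTrackHealthy` flag are hypothetical and not taken from the app's source. Only the multipart field names come from the spec text.

```swift
// Hypothetical sketch: which multipart fields a label-merge request would carry.
// Only the field names come from the spec; the types and function names are assumed.
enum LabelMergeMode {
    case dualChannel
    case mono
}

// Assumption: a single boolean summarizes whether the system track is healthy.
func labelMergeMode(systemTrackHealthy: Bool) -> LabelMergeMode {
    return systemTrackHealthy ? .dualChannel : .mono
}

func labelMergeFieldNames(systemTrackHealthy: Bool) -> [String] {
    let audioFields: [String]
    switch labelMergeMode(systemTrackHealthy: systemTrackHealthy) {
    case .dualChannel:
        // Separate mic and system tracks, plus the self-identification fields.
        audioFields = ["mic_file", "system_file", "self_name", "self_vad"]
    case .mono:
        // Fallback: a single mixed-down file.
        audioFields = ["file"]
    }
    // These three fields are sent in both modes.
    return audioFields + ["timeline", "known_voiceprints", "transcribe"]
}
```

Keeping the decision in one function means the mono fallback and the shared fields cannot drift apart between the two code paths.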
@@ -210,10 +215,22 @@ Local persistence of named voiceprints — the compounding-identity layer.
 - Editable/clearable from the menu-bar UI (rename, delete a person, reset).

 ### 2.10 `MenuBarUI` (SwiftUI, `LSUIElement`)
-Status (idle / detected / recording / uploading), manual start/stop, recent
-sessions (open folder, resend, delete), adapter toggles, **backend host + a
-health check** (`GET /api/status`), output folder, voiceprint manager, and a
-permissions checklist (Screen Recording, Microphone, Accessibility).
+Status (idle / detected / recording / finishing), manual start/stop with live
+mic/system level meters, and the **last session** — reveal in Finder, resend
+("Send to backend"), open recap, and edit speakers — plus "Open saved session…"
+to reprocess an existing folder. Also a **backend host + health check**
+(`GET /api/status`), adapter toggles, output folder, and a permissions checklist
+(Microphone, Screen Recording, Accessibility). (No multi-session list or
+voiceprint-manager UI yet — those are in `ROADMAP.md`.)
+
+### 2.11 Recap (`RecapAnalyzer`, `RecapRenderer`)
+After `speakers.json`, the recap phase turns the named transcript into the
+human-readable deliverables. `RecapAnalyzer` calls the backend LLM
+(`POST /v1/chat/completions`, Qwen3) for topics + meeting extras; `RecapRenderer`
+writes `transcript.md` (one line per diarized utterance) and `recap.html` (+ a
+`recap.json` sidecar). The in-app speaker editor (`SpeakerEditing` /
+`RecapEditModel`) rewrites names across all outputs after the fact. All
+language-model work stays on the backend; the app orchestrates and renders.
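
As a rough illustration of the `RecapAnalyzer` call in §2.11, the request to `POST /v1/chat/completions` might be built as below. This is a sketch, not the app's code: the `ChatMessage` and `ChatRequest` types, the `makeRecapRequest` helper, the prompt wording, and the exact model identifier are assumptions. The body shape follows the common OpenAI-compatible chat-completions format, which the endpoint path suggests but the spec does not confirm.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on non-Apple platforms
#endif

// Hypothetical request types, modeled on the OpenAI-compatible chat format.
struct ChatMessage: Codable, Equatable {
    let role: String
    let content: String
}

struct ChatRequest: Codable, Equatable {
    let model: String
    let messages: [ChatMessage]
}

// Builds (but does not send) the recap request. `model` is passed in because
// the spec only says "Qwen3"; the exact identifier is an assumption.
func makeRecapRequest(baseURL: URL, model: String, transcript: String) throws -> URLRequest {
    var request = URLRequest(url: baseURL.appendingPathComponent("v1/chat/completions"))
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")

    let body = ChatRequest(
        model: model,
        messages: [
            // Assumed prompt wording, for illustration only.
            ChatMessage(role: "system", content: "Summarize the call into topics and meeting extras."),
            ChatMessage(role: "user", content: transcript),
        ]
    )
    request.httpBody = try JSONEncoder().encode(body)
    return request
}
```

Because the helper only constructs the request, it could be handed to the same sequential, self-signed-TLS session that §2.7 describes, so the app keeps orchestrating while all language-model work stays on the backend.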

 ## 3. macOS frameworks & permissions