Speaker corrections: rename / merge / reassign + voice learning

Native editor to fix speaker-ID errors after transcription (modeled on recap-relay's
correction UX): rename a speaker in the legend, merge two speakers, or reassign an
individual transcript line. Saving rewrites speakers.json, re-renders transcript.md +
recap.html, and updates the voiceprint memory — so a correction compounds: naming an
"Unknown" speaker teaches that voice for future calls.
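
A minimal sketch of how a saved correction might be reconciled against the voiceprint memory, assuming an in-memory stand-in for the store. `MiniVoiceprints`, `Correction`, `reconcile`, and the body of `isUnknown` are hypothetical names and logic for illustration; only the `enroll`/`merge` shapes mirror the diff in this commit.

```swift
import Foundation

// Hypothetical in-memory stand-in for the voiceprint memory. The real
// VoiceprintStore persists to disk and guards its state with a lock.
struct MiniVoiceprints {
    var prints: [String: [Float]] = [:]

    // Assumption: placeholder speakers are labeled "Unknown", "Unknown 1", ...
    static func isUnknown(_ name: String) -> Bool {
        name.lowercased().hasPrefix("unknown")
    }

    // Learn (or refresh) a voice under a real name; placeholders are never enrolled.
    mutating func enroll(name: String, vector: [Float]) {
        guard !name.isEmpty, !Self.isUnknown(name), !vector.isEmpty else { return }
        prints[name] = vector
    }

    // The user said both labels are one person: keep the survivor's print.
    mutating func merge(_ absorbed: String, into survivor: String) {
        guard absorbed != survivor else { return }
        prints.removeValue(forKey: absorbed)
    }
}

// Hypothetical correction record: speaker `from` was relabeled `to` in the editor.
struct Correction { let from: String; let to: String }

func reconcile(_ ops: [Correction],
               clusterFingerprints: [String: [Float]],
               store: inout MiniVoiceprints) {
    for op in ops where op.from != op.to {
        if store.prints[op.to] != nil {
            // The target name already has a print: treat the edit as a merge.
            store.merge(op.from, into: op.to)
        } else if let fingerprint = clusterFingerprints[op.from] {
            // New name: enroll this call's cluster fingerprint under it.
            store.enroll(name: op.to, vector: fingerprint)
        }
    }
}
```

Under these assumptions, renaming "Unknown 1" to "Sam" enrolls that cluster's fingerprint as "Sam", while relabeling a cluster as an already-known speaker leaves the known speaker's print untouched.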

- SpeakerEditing (pure, tested): replaceSpeaker (rename = merge-onto-existing),
  reassign, netNameMap (compose ops), and remap (apply a name map to a recap's
  structured fields + whole-word free text, so summaries/extras update without re-LLM).
- RecapEditModel (@MainActor): loads speakers.json (+ optional recap.json +
  cluster_fingerprints.json); on save writes the resolved speakers.json, re-renders,
  and reconciles voiceprints — merge keeps the survivor's print; rename/name-an-Unknown
  enrolls the cluster's fingerprint under the new name.
- TranscriptEditorView (SwiftUI) + EditorWindow (AppKit window for the LSUIElement app);
  menu gains "Edit speakers".
- Pipeline now persists cluster_fingerprints.json (every cluster incl. Unknown) and
  recap.json (RecapFile) so the editor can learn voices + re-render offline.
- RecapModels made Codable; TranscriptAssembler exposes allFingerprints;
  VoiceprintStore gains enroll() + merge().
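
The name-map composition and whole-word remap described for SpeakerEditing could look roughly like the sketch below. The `Rename` type, both signatures, and the single-pass regex approach are assumptions for illustration, not the code in this commit.

```swift
import Foundation

// Hypothetical rename op: every speaker currently named `from` becomes `to`.
struct Rename { let from: String; let to: String }

// Compose ops in order into one net map from original name to final name.
func netNameMap(_ ops: [Rename]) -> [String: String] {
    var map: [String: String] = [:]
    for op in ops where op.from != op.to {
        // Originals that currently resolve to `from` now resolve to `to`.
        for (original, current) in map where current == op.from {
            map[original] = op.to
        }
        // An original still carrying the name `from` is renamed as well.
        if map[op.from] == nil { map[op.from] = op.to }
    }
    // Drop entries that ended up mapping a name to itself.
    return map.filter { $0.key != $0.value }
}

// Apply a name map to free text, matching whole words only. All names are
// replaced in a single pass so one replacement never feeds into another.
func remapText(_ text: String, using map: [String: String]) -> String {
    guard !map.isEmpty else { return text }
    // Longest names first so "Ann Lee" wins over "Ann" in the alternation.
    let names = map.keys.sorted { $0.count > $1.count }
    let alternation = names
        .map { NSRegularExpression.escapedPattern(for: $0) }
        .joined(separator: "|")
    guard let regex = try? NSRegularExpression(pattern: "\\b(?:" + alternation + ")\\b") else {
        return text
    }
    var out = text
    let matches = regex.matches(in: text, range: NSRange(text.startIndex..., in: text))
    // Replace back to front so the ranges of earlier matches stay valid.
    for match in matches.reversed() {
        guard let range = Range(match.range, in: out),
              let replacement = map[String(out[range])] else { continue }
        out.replaceSubrange(range, with: replacement)
    }
    return out
}
```

With the ops "Unknown 1" to "Sam" followed by "Sam" to "Samir", the net map sends both original names to "Samir", and a summary mentioning "Samantha" is left alone because the match is anchored on word boundaries.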

52/52 XCTest (6 new, incl. a full rename→artifacts→voiceprint round-trip on disk).
Grant Gilliam
2026-06-06 15:12:23 -05:00
parent 85bfdf2b56
commit 4c086251d9
11 changed files with 569 additions and 16 deletions
+24 -1
@@ -66,11 +66,34 @@ final class VoiceprintStore {
     func rename(_ old: String, to new: String) {
         lock.lock(); defer { lock.unlock() }
-        guard let e = entriesStore.removeValue(forKey: old) else { return }
+        guard old != new, let e = entriesStore.removeValue(forKey: old) else { return }
         entriesStore[new] = e
         save()
     }

+    /// Enroll/refresh a voiceprint under `name` (e.g. after the user renames an
+    /// "Unknown" speaker to a real name we learn that voice for future calls).
+    func enroll(name: String, vector: [Float]) {
+        guard !name.isEmpty, !Self.isUnknown(name), !vector.isEmpty else { return }
+        lock.lock(); defer { lock.unlock() }
+        let now = ISO8601DateFormatter().string(from: Date())
+        var entry = entriesStore[name] ?? Entry(vector: vector, updated: now, calls: 0)
+        entry.vector = vector
+        entry.updated = now
+        entry.calls += 1
+        entriesStore[name] = entry
+        save()
+    }
+
+    /// Merge `absorbed` into `survivor`: drop the absorbed entry, keep the survivor's
+    /// print (the user said they're the same person).
+    func merge(_ absorbed: String, into survivor: String) {
+        lock.lock(); defer { lock.unlock() }
+        guard absorbed != survivor else { return }
+        entriesStore.removeValue(forKey: absorbed)
+        save()
+    }
+
     func remove(_ name: String) {
         lock.lock(); defer { lock.unlock() }
         entriesStore.removeValue(forKey: name)