ten31-transcripts

Author	SHA1	Message	Date
Grant Gilliam	6d0c8be8c9	Speaker reconciliation + open/re-process any saved session. Reconciliation (the marry-the-signals layer): after transcription and before the recap, SpeakerReconciler (1) MERGES non-self clusters whose voiceprints are highly similar (cosine >= 0.82), which fixes a person split across chunks (the real 1-on-1 failure: one remote speaker came back as 'MH' + 'Unknown_0'); and (2) NAMES the remaining non-self clusters from transcript CONTENT via the gateway LLM (people addressed by name / self-intros), conservative and confidence-gated, keeping the placeholder when the name is not revealed. The mic-channel self is protected and never reassigned. Voice does the segmentation; the fingerprint merge fixes splits; the LLM adds the content signal that the visual and voiceprint signals lack. - SpeakerReconciler: pure cosine merge (tested) + LLM content-naming pass; rewrites speakers.json before recap. SessionController.finishBackend shares one model lookup for reconcile + recap. Gated by settings.reconcileSpeakers (default on). - Open saved session: menu 'Open saved session…' → folder picker. Edits the session if it is already transcribed; otherwise reconstructs inputs from disk (visual_timeline vision segments + channel self-spans) and runs transcribe → reconcile → recap, then opens the editor. Lets you evaluate/correct ANY past call, not just the in-memory last one. Note (from real Signal data): visual naming is unreliable on Signal (sparse, misread initials, lowercase/center names), so reconciliation + the editor (which teaches voiceprints on confirm) carry it; the editor remains the human arbiter. 59/59 XCTest.	2026-06-08 11:54:41 -05:00
Grant Gilliam	c539b78a58	Configurable recap templates (categories per meeting type, in Settings). Takeaways categories are no longer hardcoded; they're editable templates. A template = the always-on TLDR + an ordered list of sections, each with a title, a type (attributed items / bulleted list / paragraph), and an instruction (the prompt text for that category). The analyzer assembles the LLM prompt FROM the template and parses generically, so adding/removing/renaming a category needs zero code and the output always renders. - RecapTemplate / TemplateSection / SectionKind + TopicGranularity; built-in defaults (Internal Meeting, 1:1, Company/Sales Call), all editable. - Generic extras: RecapExtras{tldr, primarySpeakers, sections:[RenderedSection]} + RecapItem{text,who,when,note} replace the fixed MeetingExtras. Analyzer builds per-section sec_N fields + parses by kind; renderer + remap are generic. - Topic granularity (coarse/auto/fine) answers 'should chunking be configurable': it scales the target topic count; raw window sizes stay as tuned defaults. - AppSettings persists templates + defaultTemplateId (seeded once). Settings gets a default-template picker + 'Manage…' → TemplatesView (CRUD, edit sections/instructions, set default, Preview prompt for full transparency). - Recap editor gains a template picker; Regenerate uses the chosen template. Auto recap uses the default template. 54/54 XCTest (template prompt build, generic parse/remap/render updated).	2026-06-06 19:26:03 -05:00
Grant Gilliam	85bfdf2b56	Recap: readable transcript + topic sections + meeting extras (gateway LLM). New 'Recap' phase: turns speakers.json into a human-readable recap, leveraging recap-relay's proven logic/prompts but calling the Spark gateway's OpenAI-compatible /v1/chat/completions directly (same host/TLS as label-merge; Qwen3-35B). We start from already-named speakers (label-merge), so recap-relay's speaker clustering + name-inference are skipped entirely. - GatewayLLMClient: /v1/chat/completions (JSON mode), model discovery via /api/endpoints, TLS-skip reuse, 503 retry, sequential. - RecapAnalyzer: speakers.json → numbered [N] (MM:SS) Name: text transcript → time-windowed analyze (single window for short calls, 18min windows / 2min overlap for long ones) → stitch/dedup topic sections → meeting extras (TLDR/decisions/action_items/open_questions/key_quotes). Defensive JSON parsing of LLM output. - RecapRenderer: writes transcript.md + a self-contained dark-theme recap.html (topic sections w/ collapsible transcripts, extras panels, speaker color chips, full timestamped speaker-attributed transcript, print styles). - SessionController.buildRecap: best-effort after speakers.json (gated by settings.recapEnabled); surfaces recapURL → menu 'Open recap'. Skips silently if the gateway has no LLM. Settings toggle added. Validated END-TO-END on the real Meet session against the live gateway: dual-channel transcription → 3 topic sections + accurate TLDR + key quotes; 'Go Bitcoin' correctly attributed to the remote speaker. 46/46 XCTest (10 new).	2026-06-06 14:36:18 -05:00
Grant Gilliam	863136aeec	Phases 2-6: detection, visual timeline, backend hand-off, voiceprints. Phase 2 (call detection): CallDetector using CoreAudio per-process mic attribution (anarlog technique), with robust start+stop for Zoom/Teams/Signal/Meet, ignoring our own recording; auto-record toggle. Built; pending live multi-app confirmation by the user. Phase 3 (visual timeline foundation): AppAdapter protocol + SpeakerObservation, TimelineBuilder (hysteresis/overlap/self-merge/aliases), VisualTimeline (schema 1.1), TextRecognizer (Vision OCR), FrameSampler + GridCallAnalyzer (name OCR + saturated-highlight active-speaker attribution), SignalAdapter, VisualObserver (window capture; frames released, never saved; minimized->visual_gap, idle != gap). Synthetic-frame tested; adapter geometry pending real Signal fixtures + live VisualObserver validation. Phase 5 (backend hand-off): SparkControlClient (multipart label-merge, sequential, TLS-skip, 503 Retry-After/413), SessionPackager (chunk plan + WAV slice + timeline slice/rebase), TranscriptAssembler + SpeakersFile, TranscriptPipeline. Validated END-TO-END against the live backend (chunk -> label-merge -> speakers.json). Phase 6 (voiceprints): VoiceprintStore (known_voiceprints, persist named fingerprints, skip Unknown). Wired: 'Send to backend' button + transcript status, auto-send toggle (default off) + self-name setting. All adversarial-review findings fixed. App + XCTest suite build; tests pass.	2026-06-06 00:15:49 -05:00
Grant Gilliam	b2ae3a62b9	Phase 0: menu-bar scaffold, permissions, backend health check. Native SwiftUI menu-bar app (LSUIElement, macOS 13+), generated from project.yml via XcodeGen. Includes: - PermissionsManager (Microphone / Screen Recording / Accessibility) + UI - SparkControlHealth: GET /api/status over self-signed TLS (InsecureTrustDelegate) - AppSettings persistence (host, TLS-skip, output folder, adapter toggles) - Menu-bar panel + Settings; app sandbox & hardened runtime off (LAN tool)	2026-06-05 19:33:53 -05:00
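
Commit 6d0c8be8c9 describes a "pure cosine merge" of non-self clusters at cosine >= 0.82. The log does not show the code, so the following is only a minimal Swift sketch of that idea under stated assumptions: the `SpeakerCluster` type, the `mergeMap` function, and the rule that the earliest cluster's label survives a merge are all hypothetical and are not taken from SpeakerReconciler. Only the 0.82 threshold and the "self is never reassigned" rule come from the commit message.

```swift
import Foundation

/// Hypothetical cluster shape; the real SpeakerReconciler types are not shown in the log.
struct SpeakerCluster {
    let label: String
    let voiceprint: [Double]
    let isSelf: Bool
}

/// Cosine similarity; returns 0 for empty, mismatched, or zero-norm vectors.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    guard a.count == b.count, !a.isEmpty else { return 0 }
    var dot = 0.0, normA = 0.0, normB = 0.0
    for i in a.indices {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA.squareRoot() * normB.squareRoot())
}

/// Maps every cluster label to the label it should be merged into.
/// Self clusters are skipped, so the mic-channel speaker is never reassigned.
/// Assumption: the earliest cluster in the input keeps its label.
func mergeMap(for clusters: [SpeakerCluster], threshold: Double = 0.82) -> [String: String] {
    var parent = Array(clusters.indices)  // union-find forest, one node per cluster
    func find(_ i: Int) -> Int {
        var root = i
        while parent[root] != root { root = parent[root] }
        return root
    }
    for i in clusters.indices {
        for j in clusters.indices where j > i {
            if clusters[i].isSelf || clusters[j].isSelf { continue }
            let similarity = cosineSimilarity(clusters[i].voiceprint, clusters[j].voiceprint)
            if similarity >= threshold {
                let ri = find(i), rj = find(j)
                if ri != rj { parent[max(ri, rj)] = min(ri, rj) }
            }
        }
    }
    var result: [String: String] = [:]
    for i in clusters.indices { result[clusters[i].label] = clusters[find(i)].label }
    return result
}

// Usage with made-up voiceprints that mirror the 'MH' + 'Unknown_0' split.
let clusters = [
    SpeakerCluster(label: "Me", voiceprint: [0.0, 1.0, 0.0], isSelf: true),
    SpeakerCluster(label: "MH", voiceprint: [0.0, 1.0, 0.0], isSelf: false),
    SpeakerCluster(label: "Unknown_0", voiceprint: [0.1, 0.99, 0.0], isSelf: false),
    SpeakerCluster(label: "Unknown_1", voiceprint: [0.0, 0.0, 1.0], isSelf: false),
]
let merged = mergeMap(for: clusters)
```

In this sketch 'Unknown_0' folds into 'MH', while 'Me' stays separate even though its vector matches 'MH' exactly, because self clusters are excluded from the comparison. A real implementation might instead prefer a named label over an 'Unknown_N' placeholder when choosing which label survives.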

5 Commits
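
Commit 85bfdf2b56 mentions time-windowed analysis: a single window for short calls, and 18-minute windows with a 2-minute overlap for long ones. A rough Swift sketch of how such a window plan could be computed is shown here. The `AnalysisWindow` type and `planWindows` function are hypothetical names, and treating "short" as "no longer than one window" is an assumption; the log does not state RecapAnalyzer's actual cutoff.

```swift
import Foundation

/// Hypothetical window type; times are seconds from the start of the call.
struct AnalysisWindow: Equatable {
    let start: TimeInterval
    let end: TimeInterval
}

/// Plans overlapping analysis windows over a call of the given duration.
/// Assumption: a call no longer than one window counts as "short".
func planWindows(duration: TimeInterval,
                 window: TimeInterval = 18 * 60,
                 overlap: TimeInterval = 2 * 60) -> [AnalysisWindow] {
    guard duration > 0, window > overlap else { return [] }
    if duration <= window {
        return [AnalysisWindow(start: 0, end: duration)]
    }
    let step = window - overlap  // each window starts 16 minutes after the previous one
    var windows: [AnalysisWindow] = []
    var start: TimeInterval = 0
    while true {
        let end = min(start + window, duration)
        windows.append(AnalysisWindow(start: start, end: end))
        if end >= duration { break }
        start += step
    }
    return windows
}

// Usage: a 10-minute call and a 40-minute call.
let shortPlan = planWindows(duration: 10 * 60)
let longPlan = planWindows(duration: 40 * 60)
```

With these defaults the 40-minute call yields three windows (0-18 min, 16-34 min, 32-40 min). Topics that straddle a boundary appear in both neighboring windows, which is why the commit follows this step with a stitch/dedup pass.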