Commit Graph

4 Commits

Author SHA1 Message Date
Grant Gilliam c539b78a58 Configurable recap templates (categories per meeting type, in Settings)
Takeaways categories are no longer hardcoded — they're editable templates. A
template = the always-on TLDR + an ordered list of sections, each with a title, a
type (attributed items / bulleted list / paragraph), and an instruction (the prompt
text for that category). The analyzer assembles the LLM prompt from the template
and parses the response generically by section kind, so adding, removing, or
renaming a category needs no code changes and the output still renders.
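
A minimal sketch of how a prompt could be assembled from a template. The field
names, the prompt wording, and the zero-based sec_N numbering are assumptions for
illustration; the commit names the types but does not show their definitions.

```swift
// Hypothetical shapes: the commit names these types but not their fields.
enum SectionKind {
    case attributedItems, bulletedList, paragraph
}

struct TemplateSection {
    let title: String
    let kind: SectionKind
    let instruction: String
}

struct RecapTemplate {
    let name: String
    let sections: [TemplateSection]
}

/// Builds prompt text: the always-on TLDR plus one sec_N field per section.
func buildPrompt(from template: RecapTemplate) -> String {
    var lines = [
        "Return a JSON object with these fields:",
        "- tldr (string): a short summary of the meeting."
    ]
    for (index, section) in template.sections.enumerated() {
        // The expected JSON shape depends only on the section kind.
        let shape: String
        switch section.kind {
        case .attributedItems: shape = "array of {text, who, when, note}"
        case .bulletedList: shape = "array of strings"
        case .paragraph: shape = "string"
        }
        lines.append("- sec_\(index) (\(shape)): \(section.title). \(section.instruction)")
    }
    return lines.joined(separator: "\n")
}

let oneOnOne = RecapTemplate(name: "1:1", sections: [
    TemplateSection(title: "Action items", kind: .attributedItems,
                    instruction: "List each commitment and its owner."),
    TemplateSection(title: "Follow-ups", kind: .bulletedList,
                    instruction: "List topics to revisit.")
])
print(buildPrompt(from: oneOnOne))
```

Because the prompt is derived only from the section list, renaming or reordering a
section changes the prompt without touching the builder.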

- RecapTemplate / TemplateSection / SectionKind + TopicGranularity; built-in
  defaults (Internal Meeting, 1:1, Company/Sales Call), all editable.
- Generic extras: RecapExtras{tldr, primarySpeakers, sections:[RenderedSection]} +
  RecapItem{text,who,when,note} replaces the fixed MeetingExtras. Analyzer builds
  per-section sec_N fields + parses by kind; renderer + remap are generic.
- Topic granularity (coarse/auto/fine) answers 'should chunking be configurable' —
  it scales the target topic count; raw window sizes stay as tuned defaults.
- AppSettings persists templates + defaultTemplateId (seeded once). Settings gets a
  default-template picker + 'Manage…' → TemplatesView (CRUD, edit sections/
  instructions, set default, **Preview prompt** for full transparency).
- Recap editor gains a template picker; Regenerate uses the chosen template. Auto
  recap uses the default template.
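
One way the granularity setting could scale the target topic count, as a sketch.
The multipliers and the one-topic-per-ten-minutes baseline are illustrative
assumptions, not values taken from the app.

```swift
// Hypothetical multipliers; the commit only says granularity scales the target.
enum TopicGranularity: Double {
    case coarse = 0.5
    case auto = 1.0
    case fine = 2.0
}

/// Returns a target topic count for a meeting of the given length.
func targetTopicCount(durationMinutes: Double, granularity: TopicGranularity) -> Int {
    let baseline = durationMinutes / 10.0            // assumed baseline density
    let scaled = (baseline * granularity.rawValue).rounded()
    return max(1, Int(scaled))                       // always at least one topic
}

print(targetTopicCount(durationMinutes: 60, granularity: .coarse))
print(targetTopicCount(durationMinutes: 60, granularity: .fine))
```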

54/54 XCTest passing (template prompt build; generic parse/remap/render tests updated).
2026-06-06 19:26:03 -05:00
Grant Gilliam 85bfdf2b56 Recap: readable transcript + topic sections + meeting extras (gateway LLM)
New 'Recap' phase — turns speakers.json into a human-readable recap, leveraging
recap-relay's proven logic/prompts but calling the Spark gateway's OpenAI-compatible
/v1/chat/completions directly (same host/TLS as label-merge; Qwen3-35B). We start
from already-named speakers (label-merge), so recap-relay's speaker clustering +
name-inference are skipped entirely.
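
A sketch of what a JSON-mode request to an OpenAI-compatible
/v1/chat/completions endpoint might look like. The type names, the helper, and the
base URL and model used to call it are hypothetical; response_format with
json_object follows the OpenAI API, and whether the gateway honors it is an
assumption.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

struct ChatMessage: Codable {
    let role: String
    let content: String
}

struct ResponseFormat: Codable {
    let type: String
}

struct ChatRequest: Codable {
    let model: String
    let messages: [ChatMessage]
    let responseFormat: ResponseFormat

    enum CodingKeys: String, CodingKey {
        case model, messages
        case responseFormat = "response_format"
    }
}

/// Builds a POST request asking the model to reply with a JSON object.
func makeChatRequest(baseURL: URL, model: String,
                     system: String, user: String) throws -> URLRequest {
    var request = URLRequest(url: baseURL.appendingPathComponent("v1/chat/completions"))
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body = ChatRequest(
        model: model,
        messages: [ChatMessage(role: "system", content: system),
                   ChatMessage(role: "user", content: user)],
        responseFormat: ResponseFormat(type: "json_object"))
    request.httpBody = try JSONEncoder().encode(body)
    return request
}
```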

- GatewayLLMClient: /v1/chat/completions (JSON mode), model discovery via
  /api/endpoints, TLS-skip reuse, 503 retry, sequential.
- RecapAnalyzer: speakers.json → numbered [N] (MM:SS) Name: text transcript →
  time-windowed analyze (single window for short calls, 18min/2min overlap for long)
  → stitch/dedup topic sections → meeting extras (TLDR/decisions/action_items/
  open_questions/key_quotes). Defensive JSON parsing of LLM output.
- RecapRenderer: writes transcript.md + a self-contained dark-theme recap.html
  (topic sections w/ collapsible transcripts, extras panels, speaker color chips,
  full timestamped speaker-attributed transcript, print styles).
- SessionController.buildRecap: best-effort after speakers.json (gated by
  settings.recapEnabled); surfaces recapURL → menu 'Open recap'. Skips silently if
  the gateway has no LLM. Settings toggle added.
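
The windowing rule above can be sketched as follows. This assumes "18min/2min
overlap" means 18-minute windows advancing by 16 minutes, and that any call no
longer than one window gets a single window; the names are hypothetical.

```swift
struct TimeWindow: Equatable {
    let start: Double   // seconds from the start of the call
    let end: Double
}

/// Plans analysis windows: one window for short calls, overlapping windows otherwise.
func planWindows(duration: Double,
                 window: Double = 18 * 60,
                 overlap: Double = 2 * 60) -> [TimeWindow] {
    precondition(window > overlap, "window must be longer than overlap")
    guard duration > window else {
        return [TimeWindow(start: 0, end: max(duration, 0))]
    }
    var windows: [TimeWindow] = []
    let step = window - overlap
    var start = 0.0
    while start < duration {
        let end = min(start + window, duration)
        windows.append(TimeWindow(start: start, end: end))
        if end >= duration { break }   // the last window reaches the end of the call
        start += step
    }
    return windows
}

// A 40-minute call yields three windows starting at 0, 16, and 32 minutes.
for w in planWindows(duration: 40 * 60) {
    print(w.start / 60, w.end / 60)
}
```

Sections found in the overlapping two minutes would appear in both neighboring
windows, which is why a stitch/dedup step follows.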

Validated END-TO-END on the real Meet session against the live gateway: dual-channel
transcription → 3 topic sections + accurate TLDR + key quotes; 'Go Bitcoin'
correctly attributed to the remote speaker. 46/46 XCTest (10 new).
2026-06-06 14:36:18 -05:00
Grant Gilliam 863136aeec Phases 2-6: detection, visual timeline, backend hand-off, voiceprints
Phase 2 (call detection): CallDetector using CoreAudio per-process mic
attribution (anarlog technique) — robust start+stop for Zoom/Teams/Signal/Meet,
ignoring our own recording; auto-record toggle. Built; pending live multi-app
confirmation by the user.

Phase 3 (visual timeline foundation): AppAdapter protocol + SpeakerObservation,
TimelineBuilder (hysteresis/overlap/self-merge/aliases), VisualTimeline (schema
1.1), TextRecognizer (Vision OCR), FrameSampler + GridCallAnalyzer (name OCR +
saturated-highlight active-speaker attribution), SignalAdapter, VisualObserver
(window capture; frames released, never saved; minimized->visual_gap, idle != gap).
Synthetic-frame tested; adapter geometry pending real Signal fixtures + live
VisualObserver validation.
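
For readers unfamiliar with hysteresis in this context, a generic sketch: the
reported active speaker only changes after a new name has been observed for
several consecutive samples, which suppresses single-frame flicker. The function
and the threshold are hypothetical; the commit does not describe TimelineBuilder's
actual rule.

```swift
/// Smooths a per-sample speaker sequence so brief flickers do not switch speakers.
func applyHysteresis(_ samples: [String], minRun: Int = 3) -> [String] {
    guard var stable = samples.first else { return [] }
    var candidate: String? = nil
    var run = 0
    var output: [String] = []
    for name in samples {
        if name == stable {
            // Back on the current speaker: discard any pending switch.
            candidate = nil
            run = 0
        } else {
            if name == candidate {
                run += 1
            } else {
                candidate = name
                run = 1
            }
            if run >= minRun {
                stable = name
                candidate = nil
                run = 0
            }
        }
        output.append(stable)
    }
    return output
}

// The lone "B" at index 2 is ignored; the later run of "B" takes over on its third sample.
print(applyHysteresis(["A", "A", "B", "A", "B", "B", "B", "B"]))
```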

Phase 5 (backend hand-off): SparkControlClient (multipart label-merge, sequential,
TLS-skip, 503 Retry-After/413), SessionPackager (chunk plan + WAV slice + timeline
slice/rebase), TranscriptAssembler + SpeakersFile, TranscriptPipeline. Validated
END-TO-END against the live backend (chunk -> label-merge -> speakers.json).
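
A sketch of how the 503 Retry-After and 413 responses might be classified before
deciding whether to resend a chunk. The enum, the default delay, and treating 413
as non-retryable are assumptions; only the delta-seconds form of Retry-After is
handled here, not the HTTP-date form.

```swift
import Foundation

enum UploadOutcome: Equatable {
    case success
    case retry(after: Double)   // seconds to wait before resending
    case tooLarge               // 413: resending the same chunk would not help
    case failure(Int)
}

/// Maps an HTTP status and optional Retry-After header value to an outcome.
func classify(status: Int, retryAfter: String?, defaultDelay: Double = 5) -> UploadOutcome {
    switch status {
    case 200..<300:
        return .success
    case 503:
        let seconds = retryAfter.flatMap {
            Double($0.trimmingCharacters(in: .whitespaces))
        }
        return .retry(after: max(0, seconds ?? defaultDelay))
    case 413:
        return .tooLarge
    default:
        return .failure(status)
    }
}

print(classify(status: 503, retryAfter: "12"))
```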

Phase 6 (voiceprints): VoiceprintStore (known_voiceprints, persist named
fingerprints, skip Unknown). Wired: 'Send to backend' button + transcript status,
auto-send toggle (default off) + self-name setting.

All adversarial-review findings fixed. App + XCTest suite build; tests pass.
2026-06-06 00:15:49 -05:00
Grant Gilliam b2ae3a62b9 Phase 0: menu-bar scaffold, permissions, backend health check
Native SwiftUI menu-bar app (LSUIElement, macOS 13+), generated from project.yml
via XcodeGen. Includes:
- PermissionsManager (Microphone / Screen Recording / Accessibility) + UI
- SparkControlHealth: GET /api/status over self-signed TLS (InsecureTrustDelegate)
- AppSettings persistence (host, TLS-skip, output folder, adapter toggles)
- Menu-bar panel + Settings; app sandbox and hardened runtime disabled (LAN tool)
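
For context, a delegate that accepts a self-signed certificate on Apple platforms
typically looks like the sketch below. Restricting it to one configured host is
an added assumption, not something the commit states; the class body is
illustrative and not the app's actual InsecureTrustDelegate.

```swift
import Foundation

/// Accepts the server's certificate without validation, but only for one host.
final class InsecureTrustDelegate: NSObject, URLSessionDelegate {
    let allowedHost: String

    init(allowedHost: String) {
        self.allowedHost = allowedHost
    }

    func urlSession(_ session: URLSession,
                    didReceive challenge: URLAuthenticationChallenge,
                    completionHandler: @escaping (URLSession.AuthChallengeDisposition,
                                                  URLCredential?) -> Void) {
        let space = challenge.protectionSpace
        guard space.authenticationMethod == NSURLAuthenticationMethodServerTrust,
              space.host == allowedHost,
              let trust = space.serverTrust else {
            // Anything else falls back to normal system validation.
            completionHandler(.performDefaultHandling, nil)
            return
        }
        completionHandler(.useCredential, URLCredential(trust: trust))
    }
}
```

A session created with URLSession(configuration:delegate:delegateQueue:) and this
delegate would then skip certificate validation only for the configured host.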
2026-06-05 19:33:53 -05:00