Grant Gilliam 39beccf7f4 Fix Meet visual: reject solid avatar tiles + screen-share OCR
Root cause of the "4 people → 2 speakers" Meet call: the colored-border detector
read solid camera-off avatar tiles (orange "J", magenta "G") as active speakers
for the ENTIRE call. Those whole-call phantom spans dominated backend name
attribution, collapsing every remote voice onto one name — and the giant filled
bbox also swallowed screen-share text (WERUNBTC.COM ×49) as a speaker.

Validated against 9 real fixtures (harness over the real MeetAdapter):

Detection:
- FrameSampler.thinColoredPoints: coloured counterpart of thinWhitePoints — keeps
  thin border/ring/pill edges, drops solid colour fills.
- GridCallAnalyzer.isHollow: reject a highlight component whose interior is filled
  (a solid tile) vs a hollow ring (a real border). Config.maxInteriorFill (0.2 default).
- MeetAdapter: detect thin BLUE edges only (hue 180–240°, measured from the
  fixtures), maxInteriorFill 0.3 (real Meet rings ≈0.2–0.3, solid tiles ≈0.36).
- Result on fixtures: John Arnold/Grant Gilliam (solid tiles) now NEVER detected;
  Matt Odell/Mark detected when their blue cue is present. Sparse but never wrong —
  correct for a naming hint over audio diarization.

OCR name hygiene:
- isLikelyName rejects domain-like screen-share text ("WERUNBTC.COM", OCR'd ".GOM").
- cleaned() strips trailing punctuation ("Mark." → "Mark").
- TimelineBuilder.canonicalizeByFrequency folds rare OCR misspellings into a
  dominant near-twin name ("Matt Odel"/"MattOdell" → "Matt Odell", "Mare" → "Mark").

Tests: hollow-ring, extended OCR filter, fuzzy-merge. 65 pass.
2026-06-08 16:18:52 -05:00

Ten31 Transcripts

Native macOS menu-bar app that auto-detects conference calls, records local audio, builds a visual-derived speaker timeline, and hands audio + timeline to the SparkControl backend for naming/transcription. See docs/ for the full spec.

This repo is at Phase 0 (scaffold, permissions, backend health check).

One-time setup

  1. Install Xcode from the Mac App Store (free; ~40 GB). Open it once and accept the license prompt.
  2. Install XcodeGen (generates the Xcode project from project.yml):
    brew install xcodegen
    
  3. Generate the project:
    xcodegen generate
    
    This creates Ten31Transcripts.xcodeproj (git-ignored — regenerate any time).
  4. Open it:
    open Ten31Transcripts.xcodeproj
    
  5. Signing is preconfigured: project.yml sets DEVELOPMENT_TEAM to the free personal team BK4Y6CXN35 with automatic signing, so Signing & Capabilities should already show the team — no manual selection needed. (If you ever sign with a different Apple ID, update DEVELOPMENT_TEAM in project.yml, not in Xcode — xcodegen generate overwrites Xcode-side changes.)
  6. Press Run (⌘R).

Note: after adding files in a new phase, re-run xcodegen generate and let Xcode reload the project. The signing team persists because it lives in project.yml, so macOS permissions stay granted across rebuilds.

What Phase 0 does

  • Launches as a menu-bar-only app (no Dock icon).
  • Menu panel shows live status for the three permissions it needs — Microphone, Screen Recording, Accessibility — with Grant / Open Settings buttons.
  • Shows a backend health check (GET /api/status) against the configured host.
  • Settings: backend base URL, skip-TLS toggle (on by default for the self-signed cert), output folder, and adapter toggles (inert this phase).

No audio capture, call detection, screen reading, or backend hand-off yet — those arrive in Phases 16 (docs/04_BUILD_PLAN.md).

Project layout

project.yml                     # XcodeGen recipe → generates the .xcodeproj
Ten31Transcripts/
  App/        Ten31TranscriptsApp.swift, AppDelegate.swift
  UI/         MenuBarView, SettingsView, PermissionRow
  Permissions/PermissionsManager.swift
  Backend/    SparkControlHealth.swift, InsecureTrustDelegate.swift
  Settings/   AppSettings.swift
  Support/    Info.plist, Ten31Transcripts.entitlements
Ten31TranscriptsTests/          # placeholder; real tests land in Phase 3

Notes

  • App Sandbox is off and Hardened Runtime is off — this is a personal, LAN-only tool that must observe other apps. Revisit only if distributing.
  • The default backend host is https://your-spark-backend.local:62419 (editable in Settings).
S
Description
No description provided
Readme 890 KiB
Languages
Swift 100%