880b56e426c05d712b4a6f7800e8682bd76cedb5
Visual capture now runs alongside audio: on call start the session picks the
app's adapter, captures the call window on the SAME monotonic clock as the audio
(AudioRecorder.sharedT0Host), and on stop writes visual_timeline.json and hands
the backend the visual segments with mic-VAD self-spans merged. Any visual
failure (no adapter, no window, Screen Recording denied) leaves the session
recording audio-only — the proven path is never blocked or broken.
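As a rough illustration of why the shared clock matters, the sketch below converts a frame's host timestamp into seconds since the audio t0. It assumes the value behind AudioRecorder.sharedT0Host is a mach host-tick count and that the timebase ratio is passed in; `secondsSinceT0` and its parameters are hypothetical names, not the repo's API.

```swift
/// Seconds elapsed between the shared t0 and a frame's host timestamp.
/// ASSUMPTION: both timestamps are mach host ticks; `numer`/`denom` stand in
/// for the ratio that mach_timebase_info reports on the running machine.
func secondsSinceT0(frameHost: UInt64, t0Host: UInt64,
                    numer: UInt32, denom: UInt32) -> Double {
    // Subtract in integer space first so large tick counts keep full precision,
    // and keep the sign so a frame stamped before t0 comes out negative
    // instead of wrapping around.
    let ticks: Double = frameHost >= t0Host
        ? Double(frameHost - t0Host)
        : -Double(t0Host - frameHost)
    let nanoseconds = ticks * Double(numer) / Double(denom)
    return nanoseconds / 1_000_000_000
}

// Example timebase ratio of 125/3 (each tick is about 41.67 ns).
let offset = secondsSinceT0(frameHost: 24_001_000, t0Host: 1_000,
                            numer: 125, denom: 3)
print(offset)
```

In the app itself the ratio would presumably be read once from mach_timebase_info, and every visual frame offset would then be comparable to audio sample positions without drift between two clocks.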
- CallDetector now emits DetectedCall{app, bundleID, windowID}: windowID is the
exact CGWindowID of the matched Meet browser window. Native apps report nil, so
capture falls back to the largest window.
- VisualCapture wraps VisualObserver + AdapterRegistry, writes visual_timeline.json.
- AudioRecorder.sharedT0Host() exposes the shared t0 for frame alignment.
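The window-selection rule described above (exact window ID first, largest area as the fallback) might look roughly like this. The commit names a pickWindowIndex function but does not show it, so the signature, the `WindowInfo` type, and its fields are assumptions made for illustration only.

```swift
/// Hypothetical stand-in for whatever window description the observer enumerates.
/// `windowID` plays the role of a CGWindowID, which is a UInt32 on macOS.
struct WindowInfo {
    let windowID: UInt32
    let width: Double
    let height: Double
    var area: Double { width * height }
}

/// Returns the index of the window to capture, or nil when there are no windows.
/// ASSUMPTION: this signature is illustrative; the real pickWindowIndex may differ.
func pickWindowIndex(in windows: [WindowInfo], preferring targetID: UInt32?) -> Int? {
    // 1. Prefer the exact window the detector matched, if it is still present.
    if let targetID = targetID,
       let exact = windows.firstIndex(where: { $0.windowID == targetID }) {
        return exact
    }
    // 2. Otherwise (native apps pass nil, or the window vanished) fall back
    //    to the window with the largest area.
    return windows.indices.max(by: { windows[$0].area < windows[$1].area })
}

let windows = [
    WindowInfo(windowID: 1, width: 100, height: 100),
    WindowInfo(windowID: 2, width: 800, height: 600),
    WindowInfo(windowID: 3, width: 400, height: 300),
]
print(pickWindowIndex(in: windows, preferring: 3) as Any)   // exact match wins
print(pickWindowIndex(in: windows, preferring: nil) as Any) // largest-area fallback
```

Keeping the rule in a pure function like this is what makes it easy to unit-test without a live window server.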
Hardened per a 3-lens adversarial review (concurrency / failure-isolation /
data-flow); all 6 confirmed findings are fixed:
- P0 (critical): startVisual could adopt a stale capture into a DIFFERENT session
(cross-session SCStream leak + visual_timeline.json written to the wrong
folder). Now gated on session identity — generation + recorder ===, still
.recording — with fail-closed adoption; otherwise the stream is cancelled.
- P1: observer captured the browser's largest window, not the detected Meet
window. Now targets the exact CGWindowID (pickWindowIndex, unit-tested),
largest-area only as fallback.
- P2: a startVisual orphaned by a concurrent stop could leak a stream on quit.
inFlightVisual is registered before the await and drained in prepareForTermination.
- P3: trailing visual gap/segment ends could exceed duration_sec. Clamped in
VisualCapture (clampSegments/clampGaps, unit-tested).
- P4: capture pixel size used NSScreen.main scale; now uses the scale of the
display actually hosting the window (OCR clarity on secondary displays).
- VisualObserver.stop() bounds stopCapture() with a 3s timeout (mirrors audio) so
a wedged stream can't hang finalization.
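The session-identity gate from the P0 fix can be sketched as follows. This is a simplified model under stated assumptions, not the app's code: `SessionController`, `Recorder`, the string standing in for a capture handle, and the method names are all hypothetical. It only shows the idea of adopting a finished async start when the generation, the recorder instance, and the recording state all still match.

```swift
/// Hypothetical stand-in for the audio recorder; only its identity matters here.
final class Recorder {}

enum SessionState { case idle, recording }

final class SessionController {
    private(set) var generation = 0
    private(set) var state = SessionState.idle
    private(set) var recorder: Recorder?
    private(set) var adoptedCapture: String?

    /// Starts a session and returns the identity token an async visual start
    /// would carry across its await.
    @discardableResult
    func start() -> (generation: Int, recorder: Recorder) {
        generation += 1
        let newRecorder = Recorder()
        recorder = newRecorder
        state = .recording
        adoptedCapture = nil
        return (generation, newRecorder)
    }

    func stop() {
        state = .idle
        recorder = nil
        adoptedCapture = nil
    }

    /// Fail-closed adoption: returns false unless every identity check passes.
    /// On false, the caller is expected to cancel the capture it just started.
    func adopt(capture: String,
               startedFor token: (generation: Int, recorder: Recorder)) -> Bool {
        guard state == .recording,
              token.generation == generation,
              token.recorder === recorder else {
            return false
        }
        adoptedCapture = capture
        return true
    }
}

let controller = SessionController()
let stale = controller.start()   // session A begins, visual start is in flight
controller.stop()                // session A ends before the start completes
let current = controller.start() // session B begins
print(controller.adopt(capture: "from-A", startedFor: stale))   // rejected
print(controller.adopt(capture: "from-B", startedFor: current)) // adopted
```

Checking the recorder by identity (===) in addition to the generation counter guards against a counter that is reset or reused, at the cost of carrying one extra reference in the token.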
25/25 XCTest pass. Live validation on real calls still pending.
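For the P3 clamp, a minimal sketch of the kind of pure function that lends itself to unit tests is shown below. The commit names clampSegments but does not show it, so the `Segment` type, the signature, and the choice to drop segments that collapse to zero length are assumptions.

```swift
/// Hypothetical segment shape: start and end in seconds from t0.
struct Segment: Equatable {
    var start: Double
    var end: Double
}

/// Clamps every segment into 0...duration and drops any that become empty.
/// ASSUMPTION: illustrative only; the real clampSegments may keep
/// zero-length segments or carry extra fields.
func clampSegments(_ segments: [Segment], to duration: Double) -> [Segment] {
    return segments.compactMap { segment -> Segment? in
        let start = max(0, min(segment.start, duration))
        let end = max(0, min(segment.end, duration))
        // A segment that lay entirely past the end of the recording
        // collapses to start == end and is discarded.
        return end > start ? Segment(start: start, end: end) : nil
    }
}

let clamped = clampSegments(
    [Segment(start: 0, end: 5), Segment(start: 8, end: 12), Segment(start: 11, end: 13)],
    to: 10
)
print(clamped)
```

With duration_sec equal to 10, the second segment is trimmed to end at 10 and the third, which starts after the recording ends, is removed.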
Ten31 Transcripts
Native macOS menu-bar app that auto-detects conference calls, records local audio,
builds a visual-derived speaker timeline, and hands audio + timeline to the
SparkControl backend for naming/transcription. See docs/ for the full spec.
This repo is at Phase 0 (scaffold, permissions, backend health check).
One-time setup
- Install Xcode from the Mac App Store (free; ~40 GB). Open it once and accept the license prompt.
- Install XcodeGen (generates the Xcode project from project.yml):
    brew install xcodegen
- Generate the project:
    xcodegen generate
  This creates Ten31Transcripts.xcodeproj (git-ignored; regenerate any time).
- Open it:
    open Ten31Transcripts.xcodeproj
- Signing is preconfigured: project.yml sets DEVELOPMENT_TEAM to the free personal team BK4Y6CXN35 with automatic signing, so Signing & Capabilities should already show the team, with no manual selection needed. (If you ever sign with a different Apple ID, update DEVELOPMENT_TEAM in project.yml, not in Xcode, because xcodegen generate overwrites Xcode-side changes.)
- Press Run (⌘R).
Note: after adding files in a new phase, re-run xcodegen generate and let Xcode reload the project. The signing team persists because it lives in project.yml, so macOS permissions stay granted across rebuilds.
What Phase 0 does
- Launches as a menu-bar-only app (no Dock icon).
- Menu panel shows live status for the three permissions it needs — Microphone, Screen Recording, Accessibility — with Grant / Open Settings buttons.
- Shows a backend health check (GET /api/status) against the configured host.
- Settings: backend base URL, skip-TLS toggle (on by default for the self-signed cert), output folder, and adapter toggles (inert this phase).
No audio capture, call detection, screen reading, or backend hand-off yet — those
arrive in Phases 1–6 (docs/04_BUILD_PLAN.md).
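The health check listed above could be built along these lines. This is a sketch, not the contents of SparkControlHealth.swift: `statusRequest`, `isHealthy`, the 5-second timeout, and the rule that any 2xx status counts as healthy are assumptions. Only the GET /api/status path and the configurable base URL come from the README.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLRequest lives here on non-Apple platforms
#endif

/// Builds the GET /api/status request for a user-configured base URL.
/// Returns nil when the setting is not a usable absolute URL.
func statusRequest(baseURL: String) -> URLRequest? {
    guard let base = URL(string: baseURL),
          base.scheme != nil,
          base.host != nil else {
        return nil
    }
    var request = URLRequest(url: base.appendingPathComponent("api/status"))
    request.httpMethod = "GET"
    request.timeoutInterval = 5 // ASSUMPTION: keep the menu panel responsive
    return request
}

/// ASSUMPTION: any 2xx response is treated as "backend reachable".
func isHealthy(statusCode: Int) -> Bool {
    return (200..<300).contains(statusCode)
}

if let request = statusRequest(baseURL: "https://your-spark-backend.local:62419") {
    print(request.httpMethod ?? "", request.url?.absoluteString ?? "")
}
```

Sending the request against a self-signed certificate would additionally need a URLSession delegate that accepts the server trust, which is presumably the role of InsecureTrustDelegate.swift when the skip-TLS toggle is on.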
Project layout
project.yml                      # XcodeGen recipe → generates the .xcodeproj
Ten31Transcripts/
  App/          Ten31TranscriptsApp.swift, AppDelegate.swift
  UI/           MenuBarView, SettingsView, PermissionRow
  Permissions/  PermissionsManager.swift
  Backend/      SparkControlHealth.swift, InsecureTrustDelegate.swift
  Settings/     AppSettings.swift
  Support/      Info.plist, Ten31Transcripts.entitlements
Ten31TranscriptsTests/           # placeholder; real tests land in Phase 3
Notes
- App Sandbox is off and Hardened Runtime is off — this is a personal, LAN-only tool that must observe other apps. Revisit only if distributing.
- The default backend host is https://your-spark-backend.local:62419 (editable in Settings).