ten31-transcripts

Files

Grant Gilliam 2191486506 Channel-verified self identity: the mic track is you

Grant's insight, proven on real session audio: we capture self (mic) and others
(system) as separate tracks, then throw the separation away by mixing to mono, so
the backend has to re-guess who's who. Analysis of a real call showed the channels
are cleanly separated (envelope correlation 0.015, no echo); Caitlyn's 'Go Bitcoin'
was 11.8x louder in system than in mic, yet the mono mix plus the noisy visual
signal named it 'Grant'.
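
A minimal sketch of how an envelope correlation like the one quoted above could be measured. The commit does not say how the figure was computed, so everything here is an assumption: the helper names (`amplitudeEnvelope`, `pearsonCorrelation`), the mean-absolute-amplitude envelope, and the window size are hypothetical. A value near 0 means the two tracks rise and fall independently; a value near 1 would suggest the mic is picking up the system audio (echo).

```swift
import Foundation

/// Mean absolute amplitude of each fixed-size window (hypothetical helper).
func amplitudeEnvelope(_ samples: [Float], windowSize: Int) -> [Double] {
    guard windowSize > 0 else { return [] }
    return stride(from: 0, to: samples.count, by: windowSize).map { start in
        let window = samples[start..<min(start + windowSize, samples.count)]
        let total = window.reduce(0.0) { $0 + Double(abs($1)) }
        return total / Double(window.count)
    }
}

/// Pearson correlation of two envelopes; nil if either one is constant
/// or there are fewer than two windows to compare.
func pearsonCorrelation(_ a: [Double], _ b: [Double]) -> Double? {
    let n = min(a.count, b.count)
    guard n > 1 else { return nil }
    let meanA = a.prefix(n).reduce(0, +) / Double(n)
    let meanB = b.prefix(n).reduce(0, +) / Double(n)
    var covariance = 0.0, varianceA = 0.0, varianceB = 0.0
    for i in 0..<n {
        let da = a[i] - meanA
        let db = b[i] - meanB
        covariance += da * db
        varianceA += da * da
        varianceB += db * db
    }
    let denominator = (varianceA * varianceB).squareRoot()
    return denominator > 0 ? covariance / denominator : nil
}
```

In use, both tracks would be decoded to samples at the same rate, passed through `amplitudeEnvelope` with the same window size, and the two envelopes handed to `pearsonCorrelation`.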

ChannelSelfVAD marks self-speech as windows where the mic is active AND louder than
system (mic > 1.5 x system). Benefits: (1) self is identified by CHANNEL, not by the
on-screen name: set one name in Settings, no per-platform matching; (2) a remote
speaker (or room echo) can never be mislabeled as self. Spans are computed at
finalize from the two finished WAVs; the live capture path is untouched. Falls back
to mic-VAD if the tracks can't be read. SessionController feeds these spans to the
backend timeline.
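
The rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in ChannelSelfVAD.swift: only the 1.5x ratio comes from the commit message, while the names (`SelfSpan`, `windowRMS`, `channelSelfSpans`), the RMS loudness measure, the 100 ms window, and the 0.01 activity floor are hypothetical. WAV decoding and the mic-VAD fallback are left out.

```swift
import Foundation

/// A half-open time span in seconds judged to be the local user speaking (hypothetical type).
struct SelfSpan: Equatable {
    let start: Double
    let end: Double
}

/// Root-mean-square level of each fixed-size window of samples.
func windowRMS(_ samples: [Float], windowSize: Int) -> [Float] {
    guard windowSize > 0 else { return [] }
    return stride(from: 0, to: samples.count, by: windowSize).map { start in
        let window = samples[start..<min(start + windowSize, samples.count)]
        let sumOfSquares = window.reduce(Float(0)) { $0 + $1 * $1 }
        return (sumOfSquares / Float(window.count)).squareRoot()
    }
}

/// Marks windows where the mic is active AND louder than `ratio` x system,
/// then merges consecutive marked windows into spans.
func channelSelfSpans(mic: [Float],
                      system: [Float],
                      sampleRate: Double = 16_000,   // matches the 16 kHz WAVs
                      windowSeconds: Double = 0.1,   // assumed window length
                      activityFloor: Float = 0.01,   // assumed "mic is active" level
                      ratio: Float = 1.5) -> [SelfSpan] {
    let windowSize = max(1, Int(sampleRate * windowSeconds))
    let micLevels = windowRMS(mic, windowSize: windowSize)
    let systemLevels = windowRMS(system, windowSize: windowSize)

    // Start time of window `index`, derived from sample counts.
    func time(_ index: Int) -> Double { Double(index * windowSize) / sampleRate }

    var spans: [SelfSpan] = []
    var openStart: Double? = nil
    for (index, micLevel) in micLevels.enumerated() {
        // Treat a missing system window as silence.
        let systemLevel = index < systemLevels.count ? systemLevels[index] : 0
        let isSelf = micLevel > activityFloor && micLevel > systemLevel * ratio
        if isSelf {
            if openStart == nil { openStart = time(index) }
        } else if let start = openStart {
            spans.append(SelfSpan(start: start, end: time(index)))
            openStart = nil
        }
    }
    if let start = openStart {
        spans.append(SelfSpan(start: start, end: time(micLevels.count)))
    }
    return spans
}
```

Because the comparison is per channel, a loud remote speaker leaking faintly into the mic fails the ratio test and is excluded, which is the behaviour described for 'Go Bitcoin'.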

Validated on the real session: 16 self spans; 'Go Bitcoin' (72-74s) correctly
EXCLUDED, Grant's 49.9-53.3s and 62.6-64s correctly INCLUDED. 33/33 XCTest tests
pass (5 new).

2026-06-06 12:24:29 -05:00

AudioMixer.swift

Phase 1: dual-track audio capture → mixed-mono 16 kHz WAV + mic VAD

2026-06-05 21:30:11 -05:00

AudioRecorder.swift

Wire visual capture into the recording lifecycle (failure-isolated)

2026-06-06 10:18:52 -05:00

ChannelSelfVAD.swift

Channel-verified self identity: the mic track is you

2026-06-06 12:24:29 -05:00

MicVAD.swift

Phase 1: dual-track audio capture → mixed-mono 16 kHz WAV + mic VAD