Signal's active-speaker cue is a 3px #ffffff rounded border (saturation ≈ 0),
which the saturation-based highlight detector could never see. Per the
Signal-Desktop source review:
- FrameSampler.thinWhitePoints: grid-sample near-white pixels that sit on a
THIN structure (a non-white pixel within edgeGap on some axis) so a border/
ring counts but a solid white blob (face, bright video) does not.
- GridCallAnalyzer: combine coloured (saturated) + white (thin) highlight
pixels; exclude name-text regions so the white footer name can't be mistaken
for the border; estimate the tile UP from the name footer (nameAtBottom);
attribute each highlight pixel to exactly one tile by containment (nearest
centre as tiebreak) so a border can't bleed into an adjacent tile.
- SignalAdapter: white border on, coloured off, name-at-bottom geometry.
Synthetic 4-tile harness now isolates each speaker with no adjacent-tile bleed;
all 15 XCTest cases pass. Real-screenshot geometry calibration still pending.
AudioRecorder captures system audio (ScreenCaptureKit) + mic (AVAudioEngine) on a
single serial ioQueue, one shared monotonic t0, time-driven writers (pad gaps /
trim overlaps) so tracks stay aligned, and an energy mic-VAD for 'self' spans.
AudioMixer sums the aligned tracks into mixed_mono_16k.wav. SessionController
drives a serialized start/stop state machine, writes the session folder +
self_vad.json, exposes live level meters, and finalizes on quit.
Hardening from review: ioQueue single-domain (no races), stop() never hangs
(mic-first teardown + bounded stopCapture), layout-agnostic mic deep-copy,
discard-only video output to keep SCStream alive, VAD lockstep on committed
frames, stable signing team in project.yml, single-instance enforcement.