Filter OCR to participant-name labels (kill visual-timeline noise)

Real Meet capture revealed the visual pipeline was treating ALL on-screen text as
participant names: meeting URL, clock, 'Add others' button, lobby 'Your meeting's
ready' dialog, 'Joined as …@gmail.com', etc. 46 of 52 'visual segments' in a real
session were phantom speakers. (The backend was unaffected — it diarizes from audio
and ignores names that match no voice cluster — but the visual_timeline.json and the
segment count were junk.)

GridCallAnalyzer.isLikelyName now gates OCR strings to things shaped like a name:
2–30 chars, 1–3 Title-Cased alphabetic words, no digits/URL/email/glyph punctuation.
Errs toward dropping (a missed name just loses a hint; audio diarization still runs).
Unit-tested against the EXACT 19 OCR strings from the real session: keeps the 5
real names, drops all 14 chrome strings. 28/28 XCTest.
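
For illustration, the shape rule described above could be sketched roughly as follows. This is not the code from this commit: the function name isLikelyNameSketch is hypothetical, and the exact character rules (apostrophes allowed, every letter after the first must be lowercase) are assumptions chosen to match the listed test strings.

```swift
import Foundation

/// Hypothetical sketch of a name-shape gate; NOT the actual
/// GridCallAnalyzer.isLikelyName implementation.
func isLikelyNameSketch(_ raw: String) -> Bool {
    let text = raw.trimmingCharacters(in: .whitespacesAndNewlines)

    // Length gate: 2 to 30 characters.
    guard (2...30).contains(text.count) else { return false }

    // Word-count gate: 1 to 3 space-separated words.
    let words = text.split(separator: " ", omittingEmptySubsequences: true)
    guard (1...3).contains(words.count) else { return false }

    for word in words {
        // Title case: the first character must be an uppercase letter.
        guard let first = word.first, first.isLetter, first.isUppercase else {
            return false
        }
        // Assumption: the rest may only be lowercase letters or an apostrophe,
        // which rejects digits, URLs, emails, and glyph punctuation.
        for ch in word.dropFirst() {
            let isApostrophe = ch == "'" || ch == "\u{2019}"
            guard (ch.isLetter && ch.isLowercase) || isApostrophe else {
                return false
            }
        }
    }
    return true
}

// Usage: count how many candidate OCR strings survive the gate.
let candidates = ["Grant Gilliam", "Cait's Phone", "meet.google.com/rvo-rmjg-rdq", "G"]
let kept = candidates.filter(isLikelyNameSketch)
print(kept)
```

A gate this strict also drops real names with internal capitals or hyphens (for example "McDonald"), which is consistent with erring toward dropping: a missed name only loses a hint.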
Grant Gilliam
2026-06-06 12:01:57 -05:00
parent f2856bc363
commit 7f16b29f56
2 changed files with 37 additions and 3 deletions
@@ -135,6 +135,19 @@ final class GridCallAnalyzerTests: XCTestCase {
        XCTAssertTrue(obs.filter { $0.speaking }.isEmpty)
    }

    func testNameFilterAgainstRealMeetOCR() {
        // The exact strings OCR pulled from a real Meet session. Only the first
        // group are participants; the rest are UI chrome that must NOT become speakers.
        let names = ["Grant Gilliam", "Caitlyn Viggiano", "Cait's Phone", "Grant", "Me"]
        let junk = ["11:43 AM | rvo-rmjg-rdq", "@ Embassy Er", "Admit 1 guest",
                    "Joined as grant.gilliam@gmail.com", "Others may see your video differently",
                    "Others might still see your full video.", "Your meeting's ready", "efforot",
                    "g* Add others", "g+ Add others", "meet.google.com/rvo-rmjg-rdq",
                    "permission before they can join.", "the meeting", "G"]
        for n in names { XCTAssertTrue(GridCallAnalyzer.isLikelyName(n), "should keep name: \(n)") }
        for j in junk { XCTAssertFalse(GridCallAnalyzer.isLikelyName(j), "should drop junk: \(j)") }
    }

    func testWhiteBorderDetectorIgnoresColouredBorder() {
        // Signal looks only for the white border, so a coloured (Meet) border must
        // not register as a Signal speaker.