# AGENTS.md — Ten31 Transcripts

Native macOS **menu-bar app** that detects video calls, records dual-track audio + watches the call window for active-speaker cues, and sends audio + a visual timeline to a self-hosted **SparkControl** backend that does transcription/diarization/naming — producing named transcripts and recaps.

> **Inbox check:** At session start, if `~/Projects/standards/INBOX.md` exists, scan it for items tagged `(ten31-transcripts)` and surface them before proposing next steps; triage with `/triage`.

## Stack (versions that matter)

- **Swift 5.0**, **SwiftUI** + AppKit, macOS **13.0** deployment target. `LSUIElement` (menu-bar only, no Dock icon).
- Project is generated by **XcodeGen** from `project.yml` (`brew install xcodegen`). `*.xcodeproj` is **gitignored** — regenerate, don't edit.
- Full Xcode lives at `/Applications/Xcode.app`, but `xcode-select` points at CommandLineTools → **set `DEVELOPER_DIR` for every `xcodebuild`**.
- Bundle id `xyz.ten31.transcripts`; `DEVELOPMENT_TEAM` (Apple Team ID) is set in a **gitignored `Config/Signing.xcconfig`** (copy `Config/Signing.xcconfig.example` and set your team). Keep it stable — a constant signing identity is what preserves TCC grants across rebuilds.
- Backend: SparkControl gateway at `$SPARK_BACKEND_URL` (a private LAN backend — IP or `.local` host; Start9 self-signed cert. Install the StartOS Root CA in the System keychain so normal TLS validation succeeds; skip-TLS is an opt-in, **host-scoped** escape hatch, **off by default** — see `InsecureTrustDelegate`). Resolution order: a value saved in **Settings → SparkControl backend** (UserDefaults) wins, else the `SPARK_BACKEND_URL` env var, else the placeholder default in `AppSettings.swift`. Diarization = Sortformer/TitaNet (**mono-only**, ~4 speakers/chunk); LLM = Qwen3 via OpenAI-compatible `/v1/chat/completions`; audio via `/api/audio/label-merge`.
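
The backend URL resolution order above can be sketched as a small pure function. This is a minimal illustration, not the code in `AppSettings.swift`: the function name `resolveBackendURL`, the `https://` scheme on the placeholder, and treating a blank value as unset are all assumptions. Only the precedence (Settings, then `SPARK_BACKEND_URL`, then the placeholder) comes from this file.

```swift
import Foundation

// Assumed placeholder value; the committed default lives in AppSettings.swift.
let placeholderBackendURL = "https://your-spark-backend.local"

// Hypothetical helper: resolves the backend URL in the documented order.
func resolveBackendURL(saved: String?, environment: [String: String]) -> String {
    // 1. A non-blank value saved in Settings (UserDefaults) wins.
    if let saved = saved?.trimmingCharacters(in: .whitespacesAndNewlines), !saved.isEmpty {
        return saved
    }
    // 2. Otherwise fall back to the SPARK_BACKEND_URL environment variable.
    if let env = environment["SPARK_BACKEND_URL"]?
        .trimmingCharacters(in: .whitespacesAndNewlines), !env.isEmpty {
        return env
    }
    // 3. Otherwise use the placeholder default.
    return placeholderBackendURL
}
```

In the app this would presumably be called with `UserDefaults.standard.string(forKey: "backendBaseURL")` and `ProcessInfo.processInfo.environment`. Keeping the function pure lets the precedence be unit-tested offline with documentation IPs such as `192.0.2.1`.
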
## Commands

First time on a machine — create the local signing config (else `xcodegen generate`/signing won't find a team):

```
cp Config/Signing.xcconfig.example Config/Signing.xcconfig   # then set DEVELOPMENT_TEAM
```

Regenerate the Xcode project (after adding/removing/renaming any source file):

```
xcodegen generate
```

Build + run all tests:

```
DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer xcodebuild test \
  -project Ten31Transcripts.xcodeproj -scheme Ten31Transcripts \
  -destination 'platform=macOS' -derivedDataPath /tmp/ten31-dd
```

Run a **single** test (target/class/method):

```
DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer xcodebuild test \
  -project Ten31Transcripts.xcodeproj -scheme Ten31Transcripts \
  -destination 'platform=macOS' -derivedDataPath /tmp/ten31-dd \
  -only-testing:Ten31TranscriptsTests/SpeakerReconcilerTests/testCosine
```

Build only: replace `test` with `build`. **Lint/format:** none configured (no SwiftLint/SwiftFormat/Makefile); adding one is tracked in `ROADMAP.md`.

Build a standalone app and install/run it (Xcode does **not** need to stay open):

```
DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer xcodebuild \
  -project Ten31Transcripts.xcodeproj -scheme Ten31Transcripts \
  -configuration Release -derivedDataPath /tmp/ten31-release build
ditto /tmp/ten31-release/Build/Products/Release/Ten31Transcripts.app /Applications/Ten31Transcripts.app
open /Applications/Ten31Transcripts.app
```

**Fast validation harness** (preferred for visual/backend logic): compile the specific `Ten31Transcripts/**.swift` files plus a `main.swift` with `xcrun --sdk macosx swiftc -O ... main.swift -o x` and run against real fixtures (`example-screenshots/`) or saved sessions. Top-level code must live in the file literally named `main.swift`.

## Layout (day one)

- `Ten31Transcripts/App/` — `@main` entry + `AppDelegate`.
- `Ten31Transcripts/Session/` — `SessionController` (state machine), `TranscriptPipeline`, `SessionPackager` (chunking), `TranscriptAssembler`, `SpeakerReconciler`, `ChunkPlan` (`ChunkMode`), `SpeakersFile`, `SessionNaming` (pure folder-name + recap-title logic).
- `Ten31Transcripts/Visual/` — `VisualCapture`/`VisualObserver` (ScreenCaptureKit, ~3fps), `GridCallAnalyzer` (+ `FrameSampler`, `TextRecognizer`, `TimelineBuilder`, `VisualTimeline`, `SpeakerObservation`).
- `Ten31Transcripts/Adapters/` — per-app screen-readers (`MeetAdapter`, `ZoomAdapter`, `TeamsAdapter`, `SignalAdapter`) + `AdapterRegistry`.
- `Ten31Transcripts/Audio/` — `AudioRecorder`, `MicVAD`, `ChannelSelfVAD`, `AudioMixer`, `MonoTrackWriter`, `Resampler`.
- `Ten31Transcripts/Backend/` — `SparkControlClient`, `GatewayLLMClient`, `VoiceprintStore`, `SparkControlHealth`, `InsecureTrustDelegate` (TLS skip).
- `Ten31Transcripts/Recap/` — `RecapAnalyzer`, `RecapRenderer` (writes `transcript.md` + `recap.html`), `RecapModels`, `RecapTemplate`, `SpeakerEditing`, `RecapEditModel`.
- `Ten31Transcripts/{Detection,Permissions,Settings,UI,Support}/` — `CallDetector`/`AudioInputProcesses`/`MicActivityMonitor`; `PermissionsManager`; `AppSettings` (UserDefaults); SwiftUI views + AppKit window hosts; `Info.plist` + entitlements.
- `Ten31TranscriptsTests/` — XCTest. `example-screenshots/` — real fixtures (gitignored). `docs/`, `README.md`.
- **Runtime output** (default `~/Ten31Transcripts/sessions/_/`, configurable in Settings): `mic.wav`, `system.wav`, `mixed_mono_16k.wav`, `self_vad.json`, `visual_timeline.json`, `speakers.json` (output), `cluster_fingerprints.json`, `recap.{html,json}`, `transcript.md`. The folder is created at session start as `_`; on stop the user can name the meeting and it's renamed to `__` (skipping keeps the auto stamp).

## Conventions

- Match the surrounding file's style; small reviewable diffs; comments explain **why**, not what.
- Write/extend XCTest alongside non-trivial changes; pure logic (chunking, reconciliation, analyzer math) is unit-tested offline.
- Commits: imperative mood, concise; authored by Grant. Push to the self-hosted Gitea remote `origin` (branch `main`, over SSH) after committing, with my approval; the remote URL lives in `.git/config`, kept out of source. Work on `main` — don't create feature branches unless I ask.
- **Gitea push gotcha:** `origin`'s URL uses a raw `.local` mDNS host that intermittently fails to resolve (`Could not resolve hostname`, or a push that connects then stalls). The `gitea-home` SSH alias (in `~/.ssh/config`) points at the **same** Gitea server (port 59916, user `git`) via a reliable HostName — the sibling `standards` repo uses it. Reliable fallback: `git push gitea-home:grant/ten31-transcripts.git main` then `git update-ref refs/remotes/origin/main main`. Repointing `origin` to the alias would make this permanent (not yet done).
- Never commit recordings, transcripts, screenshots, or the generated `*.xcodeproj`.
- No API keys/tokens/passwords in the repo. The backend host (`$SPARK_BACKEND_URL`) and the Apple Team ID (`Config/Signing.xcconfig`, gitignored) are kept out of source — real values live in Settings/UserDefaults and the local xcconfig. Build env vars: `DEVELOPER_DIR` (required) and optional `SPARK_BACKEND_URL`.
- **Git history scrubbed (2026-06-13):** the private backend host + LAN IP were purged from all commits via `git filter-repo` (replaced with the `your-spark-backend.local` placeholder) and force-pushed; 0 hits across refs. Pre-rewrite backup bundle: `../ten31-transcripts-prehistory-rewrite.bundle`. A **second rewrite the same day** purged two backend LAN IPs that had slipped into a docs/test commit, replacing them with RFC 5737 documentation IPs (`192.0.2.1`/`192.0.2.2`) and force-pushing; 0 hits across refs; backup bundle `../ten31-transcripts-pre-ip-scrub.bundle`.
  The Apple Team ID was intentionally **not** scrubbed (it's public in every signed binary) — don't re-flag it.

## Always

- Set `DEVELOPER_DIR=/Applications/Xcode.app/Contents/Developer` on every `xcodebuild`.
- Run `xcodegen generate` after adding/removing/renaming source files.
- Treat the backend as the owner of transcription, diarization, and speaker naming; the app only records, watches, packages, and reconciles hints.
- Identify **self by the mic channel** + the single name in Settings → Your name, and keep that name reserved so the LLM never assigns it to another speaker.
- Treat visual active-speaker cues as **naming hints over audio diarization** (the backbone): prefer sparse-but-correct detection over dense-but-wrong.
- Send the backend dual-channel (`mic_file` + `system_file`) when the system track is healthy, else the mono `mixed_mono_16k.wav`; keep backend calls **sequential** (one in flight).
- After any code change, rebuild Release + `ditto` to `/Applications` — the installed copy does **not** auto-update.

## Never

- **Never write video frames to disk** — analyze in-memory and release immediately (privacy non-negotiable).
- **Never add Co-Authored-By / "Generated with" / any AI or tool attribution** to commits or PRs.
- Never commit secrets, recordings, transcripts, or `example-screenshots/` (faces + contact names).
- Never do per-platform display-name matching for self (Zoom/Meet/Signal names differ) — channel + one canonical name only.
- Never treat a solid camera-off avatar tile (Meet's orange/magenta fill) as an active speaker — the real cue is a thin **hollow** coloured ring; require thin-edge + hue gate (see `GridCallAnalyzer.isHollow`, `FrameSampler.thinColoredPoints`).
- Never collapse adjacent same-speaker transcript segments (reverted by request) — one line per diarized utterance.
- Never let a session-folder name put the meeting name where the app label is parsed from: the app must stay the **last** `_`-segment (`SessionController.appLabel(from:)` reads `.split("_").last`; `SessionNaming` enforces this and disambiguates collisions on the name segment). Renames happen at `finish()`-time after files are closed — re-derive track URLs from the (possibly moved) folder, never from `RecordingResult`'s start-time paths.
- Never send call audio to a raw IP the user didn't configure. Offline backend checks: a `.local` mDNS host can't be resolved by a plain `swiftc`/URLSession binary (`-1009`) — use the **real app** or `curl`; but a **configured raw IP _is_ reachable from a plain swiftc URLSession binary** (that's how the TLS fix was verified offline).
- Never force-push a shared branch, and never push without my approval. (Work on `main` — don't create feature branches unless I ask.)

## Current state

Present tense; overwritten each session.

`main` clean and pushed (HEAD `a5c227e`, pushed via the `gitea-home` alias — origin's `.local` host wouldn't resolve); `/Applications/Ten31Transcripts.app` rebuilt + installed from HEAD. **Full suite re-run: 91 pass** (was 73; +18 `SessionNamingTests`).

- **This session (2026-06-17) — meeting-name prompt + folder rename:** on stop, an NSAlert asks for a meeting name (Save/Skip) and the session folder is renamed `_` → `__` (HH-MM-SS dropped; Skip/blank keeps the stamp). Pure logic in `SessionNaming` (sanitize, leaf compose, `recapTitle` for both forms); app label stays the last `_`-segment; collisions disambiguate on the name segment; `finish()` re-derives track URLs post-rename; quit never prompts and aborts an open prompt. Reviewer-reviewed; its P1 (quit-during-modal) + two P2s fixed.
- **Backend connected end-to-end:** real LAN URL saved in Settings → SparkControl backend (off-repo: `defaults read xyz.ten31.transcripts backendBaseURL`); committed default stays the placeholder.
- **Working:** backend hand-off (live), call detection (Meet/Zoom/Teams/Signal), dual-track capture, dual-channel + chunked send, speaker reconciliation, recap, speaker editor, configurable chunk length, standalone Settings, meeting-name prompt + readable folders.
- **Verify next (real app):** the naming prompt + rename is unit-tested + builds but **not yet exercised on a live stop** — run a real recording, stop, name it, confirm the folder renames and backend output lands in the renamed folder.
- **Next up:** (a) repoint `origin` to `gitea-home` so pushes stop hitting the flaky `.local` host (see Conventions); (b) **backend URL primary→fallback** + the `mmss()` NaN/∞ guard freebie (sketch first; keep real IPs out of source — use `192.0.2.x`).
- **In progress / unverified:** the Meet visual fix (reject solid camera-off tiles) still has no clean end-to-end run — re-process the saved Meet session + a fresh Meet call (needs real app + backend).
- **Known bugs / loose end:** sparse Meet speaking-detection (faint blue border); sub-second junk "self" mic fragments; desktop-mic vs phone doesn't unify by voiceprint. Doc loose end: `docs/01 §5`/`docs/02 §2.4` still list "AppleScript" as a Meet name source though the code uses window titles.
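
The folder-name invariant referenced in this file (the app label is always the last `_`-segment) can be illustrated with a short sketch. The free function `appLabel(fromFolderName:)` below is a hypothetical stand-in for `SessionController.appLabel(from:)`, whose real signature may differ, and `stamp` stands in for the auto-generated stamp whose exact format is not spelled out here.

```swift
// Hypothetical stand-in for SessionController.appLabel(from:).
// Returns the last "_"-separated segment of a session folder name.
func appLabel(fromFolderName name: String) -> String? {
    // split(separator:) omits empty subsequences, so an empty name yields nil.
    return name.split(separator: "_").last.map { String($0) }
}

// "stamp" is a placeholder for the auto stamp; the folder names are made up.
print(appLabel(fromFolderName: "stamp_Zoom") ?? "nil")                // Zoom
print(appLabel(fromFolderName: "stamp_Budget_review_Zoom") ?? "nil")  // Zoom
print(appLabel(fromFolderName: "stamp_Zoom_Budget") ?? "nil")         // Budget
```

The third call shows why the meeting name must sit before the app label: with the order reversed, the parser returns the meeting name instead of the app. A meeting name that itself contains `_` is harmless as long as the app label stays last.
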