Commit Graph

11 Commits

Author SHA1 Message Date
Keysat 29bca8b387 Tighten Current state; document two-sided net-value interpretation 2026-06-16 12:53:14 -05:00
Keysat 0cba2626d3 Record Strike test conditional pass; track reflexivity follow-up
Full STRIKE2022 pipeline ran clean: 63 filing extracts (3,330 claims,
56,008 total), embed-claims indexed all into Qdrant, two-sided live/test.
Engine refuses the false positive (net=+0.25, capped single bitcoin
cluster, far below the 2.0 firing bar) but the own_network-drop
reflexivity demo is unexercised (own_net=0, live==test) because
RHR/CD/Bitcoin.Review were deferred at transcription 2026-06-08.
Accepted as a conditional pass; un-defer follow-up tracked in ROADMAP.
2026-06-16 12:39:13 -05:00
Keysat 0b001b49d5 Handoff: record Gemini timeout lesson; Strike extraction near complete
Fold the hang lesson into the Gemini operational rule (disable thinking AND set a timeout) and refresh Current state for the in-progress Gemini extraction (~68 docs left, 52.7k claims) and the gating Strike test.
2026-06-16 08:45:12 -05:00
Keysat 87b6b05d67 Add request timeout and retry to Gemini extraction backend
A timeout-less generate_content call hung the single-threaded extract worker for ~50 min mid-batch. Set an HTTP timeout (120s) plus 4 retries with backoff, mirroring SparkControl._post; transient 504/read-timeouts now self-heal instead of freezing the run.
2026-06-16 08:45:12 -05:00
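For readers of this log, the retry half of the fix above can be sketched roughly as follows. This is an illustration only, not the code from the commit: `call_with_retry` and its parameters are hypothetical names, the set of retryable exceptions is an assumption, and the wrapped callable is assumed to enforce its own HTTP timeout (120s in the commit) so that a hung request surfaces as an exception instead of blocking the worker.

```python
import random
import time


def call_with_retry(fn, retries=4, base_delay=1.0,
                    retry_on=(TimeoutError, ConnectionError),
                    sleep=time.sleep):
    """Call fn(); on a transient error, back off exponentially and retry.

    Hypothetical helper: `retries` counts attempts after the first call,
    so retries=4 means at most 5 calls in total.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error to the caller
            # Exponential backoff with a little jitter: ~1s, 2s, 4s, 8s.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
```

The `sleep` parameter is injectable only so the helper can be exercised without real waiting; in a worker it would stay at `time.sleep`.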
Keysat 6f4698a98c Handoff: durable chunker/Gemini rules; Strike extraction in progress
Record two recurrence-prone gotchas in Key operational rules (transcript chunking must cap every chunk; Gemini backend must disable thinking) and rewrite Current state for the in-progress Gemini extraction batch and the gating Strike test.
2026-06-15 22:47:20 -05:00
Keysat e8d50efdf4 Disable Gemini thinking budget in extraction backend
gemini-2.5-flash thinks by default and spent ~3.8k of the 4k output budget on reasoning, hitting MAX_TOKENS with a truncated JSON body -> 0 claims parsed. Set thinking_budget=0 so the full budget goes to the answer (mirrors the local path's enable_thinking=False). On the validation chunk this went from 0 -> 11 claims.
2026-06-15 22:28:12 -05:00
Keysat 5deffddb17 Fix transcript chunker context overflow; full-coverage extraction defaults
chunk_text split only on "\n\n", but ASR transcripts have none (speaker turns are joined by a single "\n"), so whole 2-3h episodes (~250K chars) went to the extractor in one call and 400'd on context overflow. Fall through paragraph -> line -> sentence -> word -> hard char-slice so no chunk exceeds the cap regardless of punctuation; guard max_chars < 1.

Default extraction to recall-first full coverage (chunk_chars 12K, max_chunks 999) and expose both as run-extract --chunk-chars / --max-chunks.
2026-06-15 22:28:12 -05:00
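The fall-through described in the commit above (paragraph -> line -> sentence -> word -> hard char-slice) might look something like the sketch below. Only the name `chunk_text`, the 12K default, and the ordering of the fallbacks come from the commit message; the regular expressions, the `_split_oversized` helper, and the choice to re-join packed pieces with a single "\n" are assumptions made for illustration.

```python
import re

# Separators tried in order, coarse to fine: paragraph, line, sentence, word.
_SPLITTERS = [
    re.compile(r"\n\n+"),
    re.compile(r"\n"),
    re.compile(r"(?<=[.!?])\s+"),
    re.compile(r"\s+"),
]


def _split_oversized(piece, max_chars, level):
    """Break `piece` into parts no longer than max_chars (hypothetical helper)."""
    if len(piece) <= max_chars:
        return [piece]
    if level >= len(_SPLITTERS):
        # Last resort: no separator helped, so slice by character count.
        return [piece[i:i + max_chars] for i in range(0, len(piece), max_chars)]
    parts = []
    for part in _SPLITTERS[level].split(piece):
        if part.strip():
            parts.extend(_split_oversized(part, max_chars, level + 1))
    return parts


def chunk_text(text, max_chars=12_000):
    """Split text into chunks that never exceed max_chars."""
    if max_chars < 1:
        raise ValueError("max_chars must be >= 1")
    chunks, current = [], ""
    for piece in _split_oversized(text.strip(), max_chars, 0):
        # Greedily pack adjacent pieces into one chunk while under the cap.
        candidate = f"{current}\n{piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks
```

Because every piece is at most `max_chars` long before packing, the cap holds even for input with no whitespace or punctuation at all, which is the failure mode the commit describes for ASR transcripts.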
Keysat cabb8a3d6c Handoff: mark Strike test stalled, document resume steps 2026-06-15 12:11:49 -05:00
Keysat 19375dcdfb Update Current state: Strike in extraction phase; audio fix landed 2026-06-15 11:13:09 -05:00
Keysat 5bd8758ab8 Add portability retrofit: AGENTS.md + CLAUDE.md symlink, scoring-brain guide, ROADMAP, .env.example 2026-06-15 11:10:59 -05:00
Keysat a6aec77506 Initial commit: Ten31 Signal Engine (ingest, scoring brain, corpus seeds) 2026-06-15 09:24:29 -05:00