Handoff: record Gemini timeout lesson; Strike extraction near complete

Fold the hang lesson into the Gemini operational rule (disable thinking AND set a timeout) and refresh Current state for the in-progress Gemini extraction (~68 docs left, 52.7k claims) and the gating Strike test.
2026-06-16 08:45:12 -05:00
parent 87b6b05d67
commit 0b001b49d5
1 changed files with 20 additions and 20 deletions
@@ -94,10 +94,12 @@ ASR + diarizer, **bge-m3** embeddings, **Qdrant** on `:87`. The gateway is the o
  `\n\n`→`\n`→sentence→word→hard-slice; splitting only on `\n\n` (the old bug) sent whole 2–3 h episodes
  in ONE call → context-overflow 400s. Extraction defaults to full coverage at 12K chars/chunk
  (`run-extract --chunk-chars/--max-chunks`); bigger chunks risk lost-in-the-middle recall loss.
- **Gemini extraction backend disables thinking.** `gemini-2.5-flash` thinks by default and burns the
+- **Gemini extraction backend: disable thinking AND set a timeout.** `gemini-2.5-flash` thinks by
-  output-token budget on reasoning → MAX_TOKENS → truncated JSON → 0 claims; the backend sets
+  default and burns the output-token budget on reasoning → MAX_TOKENS → truncated JSON → 0 claims, so
-  `thinking_budget=0` (mirrors the local path's `enable_thinking=False`). Gemini = overflow for PUBLIC
+  the backend sets `thinking_budget=0` (mirrors local `enable_thinking=False`). It also sets an HTTP
-  data only; keep `EXTRACTION_BACKEND=local` in `.env`, flip it inline per-run when overflowing.
+  **timeout (120 s) + 4 retries** — a timeout-less call once hung the single-threaded worker ~50 min
  (transient 504/read-timeouts then self-heal). Gemini = overflow for PUBLIC data only; keep
  `EXTRACTION_BACKEND=local` in `.env`, flip it inline per-run when overflowing.
 - **Scoring-brain internals are scoped to a guide.** Before editing `signal_engine/signals/`, read
  **`docs/guides/scoring-brain.md`** — the classifier invariants (REALIZED-ONLY, ROLE-MATCH, claim_type
  hard-evidence guard, max_tokens budget, claim_id bracket-strip), the EISC cluster-cap, and the
@@ -114,19 +116,17 @@ IP appears only as an env-var default.
 ## Current state (snapshot — overwrite each session; longer-term backlog → `ROADMAP.md`)
- **Strike adversarial test: extraction running — this is the gating step.** Root-caused & FIXED the
+- **Strike adversarial test: extraction ~complete — this is the gating step.** Chunker 400 bug FIXED
-  long-form 400s (the old `\n\n`-only chunker sent whole 2–3 h episodes in one call → context overflow).
+  (old `\n\n`-only split sent whole 2–3 h episodes in one call → context overflow); extraction is now
-  Now recall-first full coverage (12K chars). Draining the ~700-doc / ~5.7k-chunk extract backlog through
+  recall-first full coverage (12K chars). Drained the ~700-doc backlog via the **Gemini backend**
-  the **Gemini backend** (one-time PUBLIC overflow, `EXTRACTION_BACKEND=gemini` inline) to finish faster
+  (one-time PUBLIC overflow), hardened with thinking-off + 120 s timeout/4 retries after a timeout-less
-  (~6–7 h serial); validated live — dense (~7.5 claims/chunk), zero failures; 27 prior 400-failures
+  call hung the worker ~50 min. At session end **~68 docs left** (the SEC/FMP filings tail); **52.7k
-  requeued. **NEXT when it finishes:** `embed-claims` → `two-sided --conviction STRIKE2022 --modes
+  claims** in the DB, 0 failures, 3 transient timeouts. **NEXT once extraction finishes:** `embed-claims`
-  live,test` (PASS = quiet in live, fires in test).
+  → `two-sided --conviction STRIKE2022 --modes live,test` (PASS = quiet in live, fires in test).
- **Battery adversarial test: PASSES** (unchanged — demand-net rises, supply stays flat at 0.0).
+- **If extraction stopped early** (session ended): resume with `EXTRACTION_BACKEND=gemini run-extract
-  **§7.1 power-infra backtest:** qualified YES (corpus-gated; caveats in `DESIGN_v2.md`).
+  --limit 800` (drop the env var for the local Qwen path). Pending claims still need `embed-claims` after.
- **2 commits made, UNPUSHED** — push to `main` was blocked by the permission classifier (enforcing the
+- **Battery test PASSES; §7.1 power-infra qualified YES** (both unchanged).
-  old no-push-to-main rule); awaiting approval (`git push origin main`). Commits: chunker fix +
+- **3 commits local + `backends.py` uncommitted (Gemini timeout/retry) — ALL UNPUSHED.** Push to `main`
-  recall-first defaults; Gemini thinking-budget fix.
+  is blocked by the permission classifier; run `git push origin main` (your updated rule allows main).
- **Open decisions for Grant:** (a) push the 2 commits; (b) speed-up approach — recommended real-time
+- Corpus: bitcoin podcasts, SEC/FMP filings (+`banks` cluster), Battery corpus, River research; EISC
-  concurrency over the async Batch API (serial Gemini runs as the fallback meanwhile).
+  edges seeded for the bitcoin cluster.
 - Corpus spans bitcoin podcasts, SEC/FMP filings (+`banks` cluster), the Battery corpus, River research;
  EISC edges seeded for the bitcoin cluster.