Handoff: record Gemini timeout lesson; Strike extraction near complete
Fold the hang lesson into the Gemini operational rule (disable thinking AND set a timeout) and refresh Current state for the in-progress Gemini extraction (~68 docs left, 52.7k claims) and the gating Strike test.
@@ -94,10 +94,12 @@ ASR + diarizer, **bge-m3** embeddings, **Qdrant** on `:87`. The gateway is the o
 `\n\n`→`\n`→sentence→word→hard-slice; splitting only on `\n\n` (the old bug) sent whole 2–3 h episodes
 in ONE call → context-overflow 400s. Extraction defaults to full coverage at 12K chars/chunk
 (`run-extract --chunk-chars/--max-chunks`); bigger chunks risk lost-in-the-middle recall loss.
-- **Gemini extraction backend disables thinking.** `gemini-2.5-flash` thinks by default and burns the
-output-token budget on reasoning → MAX_TOKENS → truncated JSON → 0 claims; the backend sets
-`thinking_budget=0` (mirrors the local path's `enable_thinking=False`). Gemini = overflow for PUBLIC
-data only; keep `EXTRACTION_BACKEND=local` in `.env`, flip it inline per-run when overflowing.
+- **Gemini extraction backend: disable thinking AND set a timeout.** `gemini-2.5-flash` thinks by
+default and burns the output-token budget on reasoning → MAX_TOKENS → truncated JSON → 0 claims, so
+the backend sets `thinking_budget=0` (mirrors local `enable_thinking=False`). It also sets an HTTP
+**timeout (120 s) + 4 retries** — a timeout-less call once hung the single-threaded worker ~50 min
+(transient 504/read-timeouts then self-heal). Gemini = overflow for PUBLIC data only; keep
+`EXTRACTION_BACKEND=local` in `.env`, flip it inline per-run when overflowing.
 - **Scoring-brain internals are scoped to a guide.** Before editing `signal_engine/signals/`, read
 **`docs/guides/scoring-brain.md`** — the classifier invariants (REALIZED-ONLY, ROLE-MATCH, claim_type
 hard-evidence guard, max_tokens budget, claim_id bracket-strip), the EISC cluster-cap, and the
@@ -114,19 +116,17 @@ IP appears only as an env-var default.

 ## Current state (snapshot — overwrite each session; longer-term backlog → `ROADMAP.md`)

-- **Strike adversarial test: extraction running — this is the gating step.** Root-caused & FIXED the
-long-form 400s (the old `\n\n`-only chunker sent whole 2–3 h episodes in one call → context overflow).
-Now recall-first full coverage (12K chars). Draining the ~700-doc / ~5.7k-chunk extract backlog through
-the **Gemini backend** (one-time PUBLIC overflow, `EXTRACTION_BACKEND=gemini` inline) to finish faster
-(~6–7 h serial); validated live — dense (~7.5 claims/chunk), zero failures; 27 prior 400-failures
-requeued. **NEXT when it finishes:** `embed-claims` → `two-sided --conviction STRIKE2022 --modes
-live,test` (PASS = quiet in live, fires in test).
-- **Battery adversarial test: PASSES** (unchanged — demand-net rises, supply stays flat at 0.0).
-**§7.1 power-infra backtest:** qualified YES (corpus-gated; caveats in `DESIGN_v2.md`).
-- **2 commits made, UNPUSHED** — push to `main` was blocked by the permission classifier (enforcing the
-old no-push-to-main rule); awaiting approval (`git push origin main`). Commits: chunker fix +
-recall-first defaults; Gemini thinking-budget fix.
-- **Open decisions for Grant:** (a) push the 2 commits; (b) speed-up approach — recommended real-time
-concurrency over the async Batch API (serial Gemini runs as the fallback meanwhile).
-- Corpus spans bitcoin podcasts, SEC/FMP filings (+`banks` cluster), the Battery corpus, River research;
-EISC edges seeded for the bitcoin cluster.
+- **Strike adversarial test: extraction ~complete — this is the gating step.** Chunker 400 bug FIXED
+(old `\n\n`-only split sent whole 2–3 h episodes in one call → context overflow); extraction is now
+recall-first full coverage (12K chars). Drained the ~700-doc backlog via the **Gemini backend**
+(one-time PUBLIC overflow), hardened with thinking-off + 120 s timeout/4 retries after a timeout-less
+call hung the worker ~50 min. At session end **~68 docs left** (the SEC/FMP filings tail); **52.7k
+claims** in the DB, 0 failures, 3 transient timeouts. **NEXT once extraction finishes:** `embed-claims`
+→ `two-sided --conviction STRIKE2022 --modes live,test` (PASS = quiet in live, fires in test).
+- **If extraction stopped early** (session ended): resume with `EXTRACTION_BACKEND=gemini run-extract
+--limit 800` (drop the env var for the local Qwen path). Pending claims still need `embed-claims` after.
+- **Battery test PASSES; §7.1 power-infra qualified YES** (both unchanged).
+- **3 commits local + `backends.py` uncommitted (Gemini timeout/retry) — ALL UNPUSHED.** Push to `main`
+is blocked by the permission classifier; run `git push origin main` (your updated rule allows main).
+- Corpus: bitcoin podcasts, SEC/FMP filings (+`banks` cluster), Battery corpus, River research; EISC
+edges seeded for the bitcoin cluster.
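
The timeout lesson recorded in this commit (a 120 s timeout plus 4 retries, so that a hung call fails fast instead of stalling a single-threaded worker) can be illustrated with a small standard-library sketch. This is not the code in `backends.py`, which the diff does not show. `call_with_retries`, `TransientError`, the exponential backoff, and the reading of "4 retries" as four attempts after the first are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as a 504."""


def call_with_retries(
    request: Callable[[float], T],
    timeout_s: float = 120.0,
    retries: int = 4,
    backoff_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(timeout_s), retrying transient failures up to `retries` times.

    `request` must itself honour the timeout it is given (for example by
    passing it to its HTTP client); this wrapper only bounds the retries.
    """
    for attempt in range(retries + 1):
        try:
            return request(timeout_s)
        except (TransientError, TimeoutError):
            if attempt == retries:
                raise  # out of retries: surface the failure to the caller
            # Exponential backoff between attempts (an assumed policy).
            sleep(backoff_s * (2 ** attempt))
    raise AssertionError("unreachable")
```

With the defaults, a call that keeps timing out gives up after five attempts and four backoff sleeps rather than blocking indefinitely, which is the failure mode the handoff note describes.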
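
The chunker fallback order quoted in the first hunk (`\n\n`→`\n`→sentence→word→hard-slice, at 12K chars per chunk) can be sketched as a recursive splitter followed by a packing pass. This is a minimal illustration and not the project's chunker: `split_text`, `chunk`, the sentence regex, and the choice to rejoin pieces with a single space are assumptions.

```python
import re

# Separator patterns tried in order, coarse to fine:
# paragraph break, line break, sentence end, any whitespace (words).
SEPARATORS = [r"\n\n", r"\n", r"(?<=[.!?])\s+", r"\s+"]


def split_text(text: str, max_chars: int = 12_000, level: int = 0) -> list[str]:
    """Split text into pieces of at most max_chars, preferring coarse boundaries."""
    if not text.strip():
        return []
    if len(text) <= max_chars:
        return [text]
    if level >= len(SEPARATORS):
        # Last resort: hard-slice at fixed offsets.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    parts = [p for p in re.split(SEPARATORS[level], text) if p.strip()]
    if len(parts) <= 1:
        # This separator did not help; fall through to the next finer one.
        return split_text(text, max_chars, level + 1)
    pieces: list[str] = []
    for part in parts:
        pieces.extend(split_text(part, max_chars, level + 1))
    return pieces


def chunk(text: str, max_chars: int = 12_000) -> list[str]:
    """Greedily pack split pieces into chunks no longer than max_chars.

    Pieces are rejoined with one space, so original separators are not kept.
    """
    chunks: list[str] = []
    current = ""
    for piece in split_text(text, max_chars):
        if current and len(current) + 1 + len(piece) > max_chars:
            chunks.append(current)
            current = piece
        else:
            current = f"{current} {piece}" if current else piece
    if current:
        chunks.append(current)
    return chunks
```

A transcript with no blank lines, the case behind the old bug, falls through to the finer separators here instead of being sent as a single oversized chunk.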