Handoff: record Gemini timeout lesson; Strike extraction near complete
Fold the hang lesson into the Gemini operational rule (disable thinking AND set a timeout) and refresh Current state for the in-progress Gemini extraction (~68 docs left, 52.7k claims) and the gating Strike test.
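The timeout-plus-retries rule in this message can be illustrated with a minimal sketch. It is not the code in `backends.py`: `call_with_retries`, `TransientError`, and the backoff values are hypothetical names and numbers chosen for illustration, and "4 retries" is assumed to mean four additional attempts after the first call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures such as a 504 or a read timeout."""


def call_with_retries(
    send: Callable[[float], T],
    timeout_s: float = 120.0,   # per-call timeout, so a single call cannot hang forever
    retries: int = 4,           # assumed: retries on top of the first attempt
    backoff_s: float = 2.0,     # assumed base delay; doubles after each failure
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(timeout_s), retrying transient failures with exponential backoff."""
    attempts = retries + 1
    for attempt in range(attempts):
        try:
            return send(timeout_s)
        except (TransientError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure instead of hiding it
            sleep(backoff_s * 2 ** attempt)
    raise AssertionError("unreachable")
```

The caller passes in `send`, which must forward `timeout_s` to whatever HTTP client it wraps. The timeout is what lets a single-threaded worker move on instead of blocking on one stuck request.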
@@ -94,10 +94,12 @@ ASR + diarizer, **bge-m3** embeddings, **Qdrant** on `:87`. The gateway is the o
 `\n\n`→`\n`→sentence→word→hard-slice; splitting only on `\n\n` (the old bug) sent whole 2–3 h episodes
 in ONE call → context-overflow 400s. Extraction defaults to full coverage at 12K chars/chunk
 (`run-extract --chunk-chars/--max-chunks`); bigger chunks risk lost-in-the-middle recall loss.
-- **Gemini extraction backend disables thinking.** `gemini-2.5-flash` thinks by default and burns the
-output-token budget on reasoning → MAX_TOKENS → truncated JSON → 0 claims; the backend sets
-`thinking_budget=0` (mirrors the local path's `enable_thinking=False`). Gemini = overflow for PUBLIC
-data only; keep `EXTRACTION_BACKEND=local` in `.env`, flip it inline per-run when overflowing.
+- **Gemini extraction backend: disable thinking AND set a timeout.** `gemini-2.5-flash` thinks by
+default and burns the output-token budget on reasoning → MAX_TOKENS → truncated JSON → 0 claims, so
+the backend sets `thinking_budget=0` (mirrors local `enable_thinking=False`). It also sets an HTTP
+**timeout (120 s) + 4 retries** — a timeout-less call once hung the single-threaded worker ~50 min
+(transient 504/read-timeouts then self-heal). Gemini = overflow for PUBLIC data only; keep
+`EXTRACTION_BACKEND=local` in `.env`, flip it inline per-run when overflowing.
 - **Scoring-brain internals are scoped to a guide.** Before editing `signal_engine/signals/`, read
 **`docs/guides/scoring-brain.md`** — the classifier invariants (REALIZED-ONLY, ROLE-MATCH, claim_type
 hard-evidence guard, max_tokens budget, claim_id bracket-strip), the EISC cluster-cap, and the
@@ -114,19 +116,17 @@ IP appears only as an env-var default.
 
 ## Current state (snapshot — overwrite each session; longer-term backlog → `ROADMAP.md`)
 
-- **Strike adversarial test: extraction running — this is the gating step.** Root-caused & FIXED the
-long-form 400s (the old `\n\n`-only chunker sent whole 2–3 h episodes in one call → context overflow).
-Now recall-first full coverage (12K chars). Draining the ~700-doc / ~5.7k-chunk extract backlog through
-the **Gemini backend** (one-time PUBLIC overflow, `EXTRACTION_BACKEND=gemini` inline) to finish faster
-(~6–7 h serial); validated live — dense (~7.5 claims/chunk), zero failures; 27 prior 400-failures
-requeued. **NEXT when it finishes:** `embed-claims` → `two-sided --conviction STRIKE2022 --modes
-live,test` (PASS = quiet in live, fires in test).
-- **Battery adversarial test: PASSES** (unchanged — demand-net rises, supply stays flat at 0.0).
-**§7.1 power-infra backtest:** qualified YES (corpus-gated; caveats in `DESIGN_v2.md`).
-- **2 commits made, UNPUSHED** — push to `main` was blocked by the permission classifier (enforcing the
-old no-push-to-main rule); awaiting approval (`git push origin main`). Commits: chunker fix +
-recall-first defaults; Gemini thinking-budget fix.
-- **Open decisions for Grant:** (a) push the 2 commits; (b) speed-up approach — recommended real-time
-concurrency over the async Batch API (serial Gemini runs as the fallback meanwhile).
-- Corpus spans bitcoin podcasts, SEC/FMP filings (+`banks` cluster), the Battery corpus, River research;
-EISC edges seeded for the bitcoin cluster.
+- **Strike adversarial test: extraction ~complete — this is the gating step.** Chunker 400 bug FIXED
+(old `\n\n`-only split sent whole 2–3 h episodes in one call → context overflow); extraction is now
+recall-first full coverage (12K chars). Drained the ~700-doc backlog via the **Gemini backend**
+(one-time PUBLIC overflow), hardened with thinking-off + 120 s timeout/4 retries after a timeout-less
+call hung the worker ~50 min. At session end **~68 docs left** (the SEC/FMP filings tail); **52.7k
+claims** in the DB, 0 failures, 3 transient timeouts. **NEXT once extraction finishes:** `embed-claims`
+→ `two-sided --conviction STRIKE2022 --modes live,test` (PASS = quiet in live, fires in test).
+- **If extraction stopped early** (session ended): resume with `EXTRACTION_BACKEND=gemini run-extract
+--limit 800` (drop the env var for the local Qwen path). Pending claims still need `embed-claims` after.
+- **Battery test PASSES; §7.1 power-infra qualified YES** (both unchanged).
+- **3 commits local + `backends.py` uncommitted (Gemini timeout/retry) — ALL UNPUSHED.** Push to `main`
+is blocked by the permission classifier; run `git push origin main` (your updated rule allows main).
+- Corpus: bitcoin podcasts, SEC/FMP filings (+`banks` cluster), Battery corpus, River research; EISC
+edges seeded for the bitcoin cluster.
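The chunking hierarchy named in the first hunk (`\n\n`→`\n`→sentence→word→hard-slice, 12K chars per chunk) can be sketched as a recursive splitter. This is an illustrative reconstruction under assumptions, not the project's chunker: `chunk_text`, `SEPARATORS`, and the regexes for "sentence" and "word" boundaries are hypothetical.

```python
import re
from typing import List

# Split points from coarsest to finest. Each pattern is a zero-width
# lookbehind, so the separator stays attached to the preceding piece and
# no characters are lost (requires Python 3.7+ for empty-match splitting).
SEPARATORS = [
    r"(?<=\n\n)",       # paragraph break
    r"(?<=\n)",         # line break
    r"(?<=[.!?]\s)",    # assumed sentence boundary
    r"(?<=\s)",         # word boundary
]


def chunk_text(text: str, max_chars: int = 12_000, level: int = 0) -> List[str]:
    """Split text into chunks of at most max_chars, preferring coarse boundaries."""
    if len(text) <= max_chars:
        return [text] if text else []
    if level == len(SEPARATORS):
        # Last resort: hard-slice at fixed offsets.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    chunks: List[str] = []
    buf = ""
    for piece in re.split(SEPARATORS[level], text):
        if len(piece) > max_chars:
            # This piece cannot fit at the current level: flush, then go finer.
            if buf:
                chunks.append(buf)
                buf = ""
            chunks.extend(chunk_text(piece, max_chars, level + 1))
        elif len(buf) + len(piece) > max_chars:
            chunks.append(buf)
            buf = piece
        else:
            buf += piece
    if buf:
        chunks.append(buf)
    return chunks
```

A splitter that stops at the first level would pass an oversized paragraph through unchanged, which is the failure the hunk describes. Falling through to finer separators and finally to a hard slice guarantees that every chunk respects the size cap.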