Ten31 Signal Engine — AGENTS.md
Inbox check: At session start, if ~/Projects/standards/INBOX.md exists, scan it for items tagged (ten31-signal-engine) and surface them before proposing next steps; triage with /triage.
A recurring pipeline that ingests a growing corpus of audio (podcasts, YouTube) and text (SEC filings, earnings calls, policy/lender/research docs), extracts structured propositions ("claims"), and surfaces signal over time through Ten31's investment thesis as a relevance lens — logging every surfaced signal as a falsifiable prediction scored against reality.
Source of truth (in order): ten31-signal-engine-handoff.md (the spec — wins on any conflict; §refs
point into it) › DESIGN_v2.md (the living decision/falsification log — read before changing scoring) ›
this file's Current state. README.md is the user-facing intro.
The spine — NON-NEGOTIABLE guardrails (never violate)
- Nominate-then-judge. Statistics & graph structure NOMINATE candidates; the frontier model only JUDGES / FANS OUT a pre-filtered shortlist. The frontier never nominates from the raw corpus.
- Propositions, not vibes. Extract atomic claims; separate topic from stance.
- Discount convergence by connectedness. Independence is earned, not counted — the EISC graph (source edges + voiceprints) downweights echo. Bitcoin is one capped cluster: within-cluster agreement can NOT masquerade as independent corroboration; cross-cluster earns the multiplier.
- Thesis is a LENS, not a gate on truth. The engine must surface signals against Ten31's thesis, not just for it.
- Dual-evaluation ledger from day one — precision AND recall; every signal is a logged prediction.
- ~95% local compute via Spark Control. Call the gateway's HTTP endpoints; do NOT stand up your own vLLM / Whisper / Qdrant. Gemini is an explicit overflow lever for PUBLIC data only.
- Sovereignty boundary (hard). Exposure/positioning/conviction data and the Strike/Battery
investment memos NEVER go to the frontier. Route sensitive frontier calls through
/scrub → frontier → /rehydrate (scrub identities, not substance). Read the memos LOCALLY only.
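To make the "discount convergence by connectedness" guardrail concrete, here is a minimal sketch of cluster-capped counting. It is not the EISC implementation (which works on source edges and voiceprints); the function name `capped_support` and the `per_cluster_cap` parameter are hypothetical, and the cluster labels are plain strings chosen for illustration.

```python
from collections import Counter


def capped_support(source_clusters: list[str], per_cluster_cap: int = 1) -> int:
    """Count corroborating sources, letting each cluster contribute
    at most `per_cluster_cap` votes (hypothetical simplification)."""
    counts = Counter(source_clusters)
    return sum(min(n, per_cluster_cap) for n in counts.values())


# Five agreeing sources from the same cluster are echo, not corroboration.
echo = ["bitcoin"] * 5
# One source from a different cluster adds a genuinely independent vote.
mixed = ["bitcoin"] * 5 + ["banks"]

echo_support = capped_support(echo)
mixed_support = capped_support(mixed)
print(echo_support, mixed_support)
```

Under this simplification, within-cluster agreement saturates at the cap, so only cross-cluster agreement can raise the count.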
Two jobs: A — Discovery (emergent themes via independent cross-cluster convergence scored on acceleration; contrarian stances; their intersection). B — Conviction-action gap (fan held convictions to 2nd/3rd-order derivatives, catch early corroboration — the countermeasure to the 2023 "power is the binding constraint on AI/compute" miss: right on the root, late to the derivatives).
Architecture
signal_engine/ (Python package, run as python -m signal_engine <cmd>):
- config.py: env-driven Config (+ .env loader).
- spark/client.py: the SINGLE gateway chokepoint (no other module knows the gateway URL); scrub/rehydrate live here.
- ingest/: edgar (SEC), earnings (FMP REST), feeds + podcasts (RSS), download, chunker, transcribe_worker (local Parakeet), gemini_transcribe (bulk overflow), docs (HTML/PDF/RSS text fetcher for policy/lender/research), identify, speaker_stitch.
- extract/: claims + worker (proposition extraction), backends (LocalQwen | Gemini), prompt, html_text.
- embedstore/: embedder + qdrant_store (hybrid dense+BM25).
- signals/ (the scoring brain): independence (EISC), asof (look-ahead guard), windows, under_acted (Job B), bar (two-tier gate), two_sided (affirms−denies net-corroboration), llm_helpers (derivative_relevance), confusion (precision/recall), external (price/outcome fetcher), ledger_writer (§6.6 prediction ledger), resolver (stub), run.
- store/: db (SQLite + idempotent migrations), schema.sql, seed, sources.
- backfill/queue.py (the job queue).
- ui/app.py (FastAPI corpus/eval UI).
- util.py.
- Data lands in data/ (gitignored): signal.db, transcripts/, docs/, audio-cache/.
Flow: seed sources/convictions/fanout → ingest (→ documents + transcribe/extract jobs) →
run-transcribe / run-extract drain the queue → claims → embed-claims (Qdrant) → scorers
(backtest, two-sided) read the proposition store as-of a date.
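The as-of read at the end of this flow is what keeps a backtest from seeing the future. A minimal sketch of such a look-ahead guard follows, assuming a hypothetical `claims` table with an ISO-8601 `published_at` column; the real schema lives in schema.sql and may differ.

```python
import sqlite3


def claims_as_of(conn: sqlite3.Connection, as_of: str) -> list[tuple]:
    """Return only claims published on or before `as_of` (ISO date string).

    ISO-8601 dates sort lexicographically, so a plain string comparison
    is enough for this illustration.
    """
    rows = conn.execute(
        "SELECT id, text FROM claims WHERE published_at <= ? ORDER BY published_at",
        (as_of,),
    )
    return rows.fetchall()


# Hypothetical in-memory table standing in for the proposition store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (id INTEGER PRIMARY KEY, text TEXT, published_at TEXT)"
)
conn.executemany(
    "INSERT INTO claims (text, published_at) VALUES (?, ?)",
    [("early claim", "2023-01-10"), ("later claim", "2023-06-01")],
)

visible = claims_as_of(conn, "2023-03-01")
print(visible)
```

Putting the date filter inside the query helper, rather than in each scorer, means no scorer can forget to apply it.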
Build / run
- Setup: virtualenv at .venv (Python 3.14). .venv/bin/pip install -r requirements.txt.
- Invoke: .venv/bin/python -m signal_engine <cmd>. --help is authoritative; the rest is a map: init-db; seeding: seed-sources / seed-convictions / seed-fanout / seed-edges / load-feeds; ingest: ingest-edgar / ingest-earnings / ingest-podcast / ingest-doc / ingest-doc-manifest / ingest-feed-text; queue drain: run-transcribe / run-transcribe-gemini / run-extract; index: embed-claims / search; score: backtest / two-sided / confusion-matrix; inspect: queue-status / spark-status / feed-peek / provenance / db-tables; serve (UI).
- DB: python -m signal_engine init-db (idempotent; re-creates schema + runs additive migrations).
- Tests: ⚠️ no automated test suite yet (no tests/, no pytest). Verification is by running commands against the live gateway. Adding a test harness is on the ROADMAP.
- Lint/format: none configured. Match the surrounding style (dense, §-referenced docstrings).
Spark Control infra (SPARK_CONTROL_URL, self-signed TLS → SPARK_VERIFY_TLS=false)
One gateway fronts two DGX Sparks: vLLM RedHatAI/Qwen3.6-35B-A3B-NVFP4 on :103; Parakeet
ASR + diarizer, bge-m3 embeddings, Qdrant on :87. The gateway is the only URL anything calls.
- AUDIO concurrency (learned 2026-06-09): single serial GPU shared with the operator's production
meeting app. Cap 2 in-flight (ceiling 3), GLOBAL across both audio endpoints — a process-wide
BoundedSemaphore (AUDIO_CONCURRENCY env, default 2). Going wider buys zero throughput. Transient 1–4s "busy blips" (broken-pipe/503/timeout) are NOT failures → short retry-backoff. The transcribe_worker runs a 2-wide chunk pool; the old size-1 lock was ~2.5× slower.
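The concurrency rule above can be sketched as one process-wide semaphore wrapped around every audio call, with a short backoff for busy blips. This is an illustration only: `call_audio`, `TransientBusy`, the retry count, and the linear delay schedule are assumptions, not the project's actual client code.

```python
import os
import threading
import time

# One semaphore for the whole process, shared by both audio endpoints.
AUDIO_SLOTS = threading.BoundedSemaphore(int(os.environ.get("AUDIO_CONCURRENCY", "2")))


class TransientBusy(Exception):
    """Hypothetical marker for broken-pipe/503/timeout busy blips."""


def call_audio(fn, *args, retries=4, base_delay=1.0, sleep=time.sleep):
    """Run `fn` under the global audio cap, retrying transient blips."""
    for attempt in range(retries + 1):
        with AUDIO_SLOTS:
            try:
                return fn(*args)
            except TransientBusy:
                if attempt == retries:
                    raise
        # Back off outside the `with` so the slot is free while waiting.
        sleep(base_delay * (attempt + 1))


# Demo: a fake endpoint that is busy twice, then succeeds.
attempts = {"n": 0}


def flaky_transcribe(chunk):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientBusy("503")
    return f"text for {chunk}"


result = call_audio(flaky_transcribe, "chunk-0", sleep=lambda s: None)
print(result, attempts["n"])
```

Sleeping after releasing the semaphore matters here: with only two slots, a caller that waits while holding one would halve throughput during every blip.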
Key operational rules (learned this build — easy to get wrong)
- own_network quarantine is MATERIALITY-driven, not "any investment." Quarantine (drop in live scoring, keep in test) only for MATERIAL ties where the source is part of Ten31's voice: the partners' own shows (TFTC, Citadel Dispatch, Rabbit Hole Recap), the Battery partnership, material portfolio leads. Immaterial passive stakes → INDEPENDENT (River and Swan/Cafe Bitcoin were corrected to independent). Unconfirmed: Unchained, Debifi, Coinkite (held quarantined pending Grant's materiality call).
- Gemini quota is a rolling ~24h window (~291 hour-long episodes / ~51M tokens), not a calendar-day reset. Bulk transcription overflows there; expect 429 RESOURCE_EXHAUSTED past the window.
- Transcript chunking is recall-first and MUST cap every chunk. ASR transcripts have NO blank-line paragraphs (speaker turns joined by a single \n), so extract.claims.chunk_text falls through \n\n → \n → sentence → word → hard-slice; splitting only on \n\n (the old bug) sent whole 2–3 h episodes in ONE call → context-overflow 400s. Extraction defaults to full coverage at 12K chars/chunk (run-extract --chunk-chars / --max-chunks); bigger chunks risk lost-in-the-middle recall loss.
- Gemini extraction backend: disable thinking AND set a timeout. gemini-2.5-flash thinks by default and burns the output-token budget on reasoning → MAX_TOKENS → truncated JSON → 0 claims, so the backend sets thinking_budget=0 (mirrors local enable_thinking=False). It also sets an HTTP timeout (120 s) + 4 retries, because a timeout-less call once hung the single-threaded worker ~50 min (transient 504/read-timeouts then self-heal). Gemini = overflow for PUBLIC data only; keep EXTRACTION_BACKEND=local in .env, flip it inline per-run when overflowing.
- Scoring-brain internals are scoped to a guide. Before editing signal_engine/signals/, read docs/guides/scoring-brain.md: the classifier invariants (REALIZED-ONLY, ROLE-MATCH, claim_type hard-evidence guard, max_tokens budget, claim_id bracket-strip), the EISC cluster-cap, and the Battery/Strike adversarial-test PASS criteria. Don't regress those invariants (they're what make Battery pass). Full decision log: DESIGN_v2.md.
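The chunking fallback chain described above can be sketched as follows. This is a simplified stand-in, not the real extract.claims.chunk_text: the names `split_to_cap` and `chunk_text_sketch` are hypothetical, and rejoining pieces with a single newline is an assumption made to keep the example short.

```python
import re

# Fallback order: blank-line paragraphs, single newlines, sentences, words.
SPLITTERS = [
    lambda t: t.split("\n\n"),
    lambda t: t.split("\n"),
    lambda t: re.split(r"(?<=[.!?])\s+", t),
    lambda t: t.split(" "),
]


def split_to_cap(text, max_chars, level=0):
    """Break `text` into pieces no longer than `max_chars`, trying each
    splitter in turn and hard-slicing only as a last resort."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if level == len(SPLITTERS):
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    pieces = []
    for part in SPLITTERS[level](text):
        pieces.extend(split_to_cap(part, max_chars, level + 1))
    return pieces


def chunk_text_sketch(text, max_chars=12_000):
    """Greedily pack pieces into chunks; every chunk respects the cap."""
    chunks, current = [], ""
    for piece in split_to_cap(text, max_chars):
        candidate = f"{current}\n{piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


# An ASR-style transcript: speaker turns joined by single newlines only.
transcript = "\n".join(f"Speaker {i % 2}: " + "word " * 50 for i in range(400))
chunks = chunk_text_sketch(transcript)
print(len(chunks), max(len(c) for c in chunks))
```

Because the final fallback is a hard slice, the cap holds even for input with no whitespace at all, which is the property the old \n\n-only split lacked.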
Secrets / env
Real values live in .env (gitignored). .env.example lists the names. Keys used: SPARK_CONTROL_URL,
SPARK_VERIFY_TLS, LOCAL_LLM_MODEL, EMBED_MODEL, TRANSCRIBE_MODEL, AUDIO_CONCURRENCY,
EXTRACTION_BACKEND, GEMINI_API_KEY, GEMINI_MODEL, ANTHROPIC_API_KEY, FMP_API_KEY,
EDGAR_USER_AGENT, DATA_DIR, UI_PORT, LOG_LEVEL. Never commit key values; the private LAN gateway
IP appears only as an env-var default.
Current state (snapshot — overwrite each session; longer-term backlog → ROADMAP.md)
- Strike adversarial test: extraction ~complete; this is the gating step. Chunker 400 bug FIXED (old \n\n-only split sent whole 2–3 h episodes in one call → context overflow); extraction is now recall-first full coverage (12K chars). Drained the ~700-doc backlog via the Gemini backend (one-time PUBLIC overflow), hardened with thinking-off + 120 s timeout/4 retries after a timeout-less call hung the worker ~50 min. At session end ~68 docs left (the SEC/FMP filings tail); 52.7k claims in the DB, 0 failures, 3 transient timeouts. NEXT once extraction finishes: embed-claims → two-sided --conviction STRIKE2022 --modes live,test (PASS = quiet in live, fires in test).
- If extraction stopped early (session ended): resume with EXTRACTION_BACKEND=gemini run-extract --limit 800 (drop the env var for the local Qwen path). Pending claims still need embed-claims after.
- Battery test PASSES; §7.1 power-infra qualified YES (both unchanged).
- 3 commits local + backends.py uncommitted (Gemini timeout/retry), ALL UNPUSHED. Push to main is blocked by the permission classifier; run git push origin main (your updated rule allows main).
- Corpus: bitcoin podcasts, SEC/FMP filings (+ banks cluster), Battery corpus, River research; EISC edges seeded for the bitcoin cluster.