Scoring brain — operating guide

The "don't get it wrong" manual for signal_engine/signals/. The full decision/falsification log and hypotheses H1–H6 live in DESIGN_v2.md; the spine guardrails are in AGENTS.md. This file is the subsystem detail an agent editing the scorers needs.

The instruments

independence.py — EISC (Effective Independent Source Count): a noisy-OR connectedness matrix (source edges + voiceprints + cluster coupling) with inverse-row-sum. Returns eisc_adj (= eisc_raw × xcluster_mult), eisc_raw, k_eff, xcluster_mult, per_source_contrib. mode='live' DROPS own_network sources; mode='test' keeps them.
two_sided.py — net corroboration = independence-weighted affirms − denies over time (classify_corpus → net_at → trajectory). The instrument for the adversarial cases (NOT runway).
asof.py — look-ahead guard (only claims dated ≤ as-of are visible). windows.py — windowed acceleration / window bounds (match window to cadence: ~90d quarterly filings, wider for podcasts).
bar.py / under_acted.py — the two-tier gate (evidence→ledger vs promotion→judge) and Job B scorer.
llm_helpers.derivative_relevance — the bounded LLM classifier over PRE-FILTERED candidates (search hits), never a nominator. _REL_SYS is its system prompt.
run.py — orchestration / run_backtest. resolver.py — outcome resolver (currently a stub).
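
The look-ahead guard in asof.py comes down to a date filter. The snippet below is a minimal sketch of that idea, not the module's actual code: the Claim dataclass, its field names, and visible_as_of are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    # Hypothetical minimal claim record; the real schema may differ.
    claim_id: str
    dated: date


def visible_as_of(claims, as_of):
    """Return only the claims dated on or before the as-of date."""
    return [c for c in claims if c.dated <= as_of]


claims = [
    Claim("c1", date(2022, 3, 1)),
    Claim("c2", date(2022, 6, 30)),
    Claim("c3", date(2022, 7, 1)),  # dated after the as-of: must stay hidden
]
visible = visible_as_of(claims, date(2022, 6, 30))
print([c.claim_id for c in visible])  # ['c1', 'c2']
```

The comparison is inclusive (≤), so a claim dated exactly on the as-of date stays visible.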

Classifier invariants that MUST NOT regress

These five are what make the Battery adversarial case pass; each was a real bug found by running it.

max_tokens sized to the batch. A fixed 3000 truncated the JSON mid-array on ~60-claim batches → empty parse → a whole node silently scored 0. derivative_relevance now sets budget = max(3000, 120*len(claims)+500).
Strip [] from echoed claim_ids. The listing presents ids as - [{id}] ...; the model inconsistently echoes them back as [id], which misses the lookup keyed on the bare id, so every claim comes back as (missing). Normalize with str(id).strip().strip("[]").strip().
REALIZED-ONLY (_REL_SYS): announcements / plans / intent / "may·will·expects·poised·up-to" are NOT corroboration — only deployed/closed FACTS affirm. ("$2B announced" ≠ capital deployed.)
ROLE-MATCH (_REL_SYS): the actor must occupy the role the hypothesis is about. For a capital-provider hypothesis, a borrower posting collateral is the wrong side → tangential, not affirms.
Hard-evidence guard (net_at, require_hard_evidence=True): a source only counts on a side if it carries a descriptive/reactive (realized-fact) claim there; predictive/interpretive (forecast/opinion/intent) alone don't qualify it. Reports hard_affirm_src / soft_affirm_src_dropped.
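
The first two invariants are pure arithmetic and string handling, so they are easy to pin down in isolation. The sketch below restates the budget formula and the id normalization quoted above; the helper names token_budget and normalize_claim_id, and the sample ids, are hypothetical and are not taken from llm_helpers.

```python
def token_budget(n_claims: int) -> int:
    """max_tokens sized to the batch: never below the old fixed 3000."""
    return max(3000, 120 * n_claims + 500)


def normalize_claim_id(raw) -> str:
    """Strip whitespace and any echoed square brackets from a claim id."""
    return str(raw).strip().strip("[]").strip()


# A ~60-claim batch needs far more than the old fixed budget.
print(token_budget(10))  # 3000
print(token_budget(60))  # 7700

# Lookup keyed on the bare id; the model may echo the id with brackets.
claims_by_id = {"abc-12": "claim text"}
echoed = "[abc-12]"
print(claims_by_id.get(echoed, "(missing)"))                      # (missing)
print(claims_by_id.get(normalize_claim_id(echoed), "(missing)"))  # claim text
```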

EISC / independence rules

Bitcoin is one CAPPED cluster (cluster_capped_low, CAP_VALUE): within-cluster agreement can contribute at most ~0.25 of a voice — it can NOT masquerade as independent corroboration. Real corroboration of a bitcoin thesis must come from OUTSIDE the cluster (e.g. the banks cluster).
Cross-cluster earns the multiplier (xcluster_mult, gated by k_eff = clusters contributing ≥0.5 of a voice). One guest doing the rounds collapses to ~1; it does not earn the gold multiplier.
own_network quarantine is MATERIALITY-driven (see AGENTS.md): live mode drops materially-tied Ten31 sources; test mode keeps them. Validated: own_network-only affirms → live eisc≈0, test > 0.
Seed edges in sorted([a,b]) order (matches transcribe_worker's sorted()+weight+=1 upsert) so auto-detected and seeded edges share a PK. A reversed-order row DOUBLE-COUNTS (the math is frozenset-undirected but the table PK is ordered). κ: shared_guest 0.85, citation 0.45, community 0.60.
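
To make the edge-ordering rule and the noisy-OR / inverse-row-sum idea concrete, here is a toy sketch. It is an assumption-laden simplification, not independence.py: it ignores voiceprints, cluster coupling, the cluster cap, xcluster_mult and edge weights, assumes each source is fully connected to itself, and the function names (edge_key, noisy_or, build_edges, eisc_raw_sketch) are hypothetical. Only the κ values come from this guide.

```python
from math import prod

# κ per edge type, as listed in this guide.
KAPPA = {"shared_guest": 0.85, "citation": 0.45, "community": 0.60}


def edge_key(a, b):
    """Canonical undirected key: always stored in sorted order."""
    return tuple(sorted([a, b]))


def noisy_or(kappas):
    """Combine connection strengths: 1 - prod(1 - k). Empty input gives 0."""
    return 1.0 - prod(1.0 - k for k in kappas)


def build_edges(rows):
    """Collapse (a, b, kind) rows onto canonical keys so reversed rows merge."""
    edges = {}
    for a, b, kind in rows:
        edges.setdefault(edge_key(a, b), set()).add(kind)
    return edges


def eisc_raw_sketch(sources, edges):
    """Inverse-row-sum: each source contributes 1 / (1 + sum of its connectedness)."""
    contrib = {}
    for s in sources:
        row_sum = 1.0  # assumed self-connectedness of 1
        for t in sources:
            if t != s:
                kinds = edges.get(edge_key(s, t), ())
                row_sum += noisy_or(KAPPA[k] for k in kinds)
        contrib[s] = 1.0 / row_sum
    return sum(contrib.values()), contrib


# The same pair arrives once reversed; both rows land on the key ("A", "B").
edges = build_edges([("B", "A", "shared_guest"), ("A", "B", "citation")])
total, contrib = eisc_raw_sketch(["A", "B", "C"], edges)
print(sorted(edges))    # [('A', 'B')]
print(round(total, 3))  # A and B are tightly coupled, C is independent
```

In this toy setup two tightly coupled sources plus one independent source count for roughly two voices rather than three, which is the behaviour the rules above describe.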

The adversarial cases (the validation harness)

Pre-registered failed convictions used to test the engine against its target failure mode. Seeds: seeds/conviction_log.adversarial.seed.yaml, seeds/fanout.{STRIKE,BATTERY}2022.seed.yaml, seeds/resolution.{STRIKE,BATTERY}2022.yaml, seeds/resolution_outcomes.adversarial.yaml.

BATTERY2022 (timing/disconfirmation). BTC-collateralized lending: demand rose, institutional supply failed. PASS = demand-net rises while supply-net stays flat (≈0). Run: two-sided --conviction BATTERY2022 --nodes demand,supply --modes live. Supply resolves ONLY on committed/DEPLOYED capital; policy/regulation is CONTEXT (the custody-policy node), never supply (S1).
STRIKE2022 (reflexivity/false-positive). Lightning-retail-payments thesis FAILED. PASS = net stays quiet in live (own_network dropped) while it would fire in test: the engine refusing the intra-cluster echo. Run: two-sided --conviction STRIKE2022 --modes live,test. The REALIZED-ONLY rule is load-bearing here (speculative "Lightning will revolutionize payments" is predictive, not signal). Reading the output: a single capped bitcoin cluster nets eisc≈0.25, already sub-bar vs EISC_FLOOR=2.0, so a +0.25 "quiet in live" can be the cluster cap refusing the false positive, NOT the own_network drop. Check own_net. If it is 0, live==test and the reflexivity mechanism is unexercised (the affirmers are independent), so a quiet live does not by itself prove the echo-drop; test can only fire above live when own_network affirms are present (own_net>0).
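
The reading rule for STRIKE2022 output can be written as a small decision helper. This is only a sketch of the reasoning above: read_strike, its parameters, and its return labels are hypothetical, it assumes the net is compared directly against EISC_FLOOR, and it does not parse real two-sided output.

```python
EISC_FLOOR = 2.0  # bar quoted in this guide


def read_strike(live_net, test_net, own_net, floor=EISC_FLOOR):
    """Classify a live/test pair according to the reading rule above."""
    if live_net >= floor:
        return "live-fired"          # not quiet: the false positive got through
    if own_net == 0:
        return "quiet-unexercised"   # live == test; the cap, not the echo-drop, kept it quiet
    if test_net > live_net:
        return "quiet-echo-dropped"  # own_network affirms present and test exceeds live
    return "quiet-inconclusive"


print(read_strike(0.25, 0.25, 0))  # quiet-unexercised
print(read_strike(0.25, 2.4, 3))   # quiet-echo-dropped
```

A "quiet-unexercised" result is still a quiet live, but it says nothing about the own_network quarantine.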

Standing rule S1: derivatives resolve on OUTCOME (scaled substance), never milestones or enablers. An announced program / a regulatory unblock / a single bank's toe-in is CONTEXT, not corroboration.
