Scoring brain — operating guide
The "don't get it wrong" manual for signal_engine/signals/. The full decision/falsification log and
hypotheses H1–H6 live in DESIGN_v2.md; the spine guardrails are in AGENTS.md. This file is the
subsystem detail an agent editing the scorers needs.
The instruments
- `independence.py`: EISC (Effective Independent Source Count), a noisy-OR connectedness matrix (source edges + voiceprints + cluster coupling) with inverse-row-sum. Returns `eisc_adj` (= `eisc_raw` × `xcluster_mult`), `eisc_raw`, `k_eff`, `xcluster_mult`, `per_source_contrib`. `mode='live'` DROPS `own_network` sources; `mode='test'` keeps them.
- `two_sided.py`: net corroboration = independence-weighted affirms − denies over time (`classify_corpus` → `net_at` → `trajectory`). The instrument for the adversarial cases (NOT runway).
- `asof.py`: look-ahead guard (only claims dated ≤ as-of are visible).
- `windows.py`: windowed acceleration / window bounds (match window to cadence: ~90d quarterly filings, wider for podcasts).
- `bar.py` / `under_acted.py`: the two-tier gate (evidence→ledger vs promotion→judge) and Job B scorer.
- `llm_helpers.derivative_relevance`: the bounded LLM classifier over PRE-FILTERED candidates (search hits), never a nominator. `_REL_SYS` is its system prompt.
- `run.py`: orchestration / `run_backtest`.
- `resolver.py`: outcome resolver (currently a stub).
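To make the "noisy-OR connectedness with inverse-row-sum" idea concrete, here is a minimal sketch. It is one plausible reading of that description, not the code in `independence.py`: the function names, the `coupling` input shape, and the choice to give each source a self-connection of 1 are all assumptions made for illustration.

```python
from math import prod


def noisy_or(*probs: float) -> float:
    """Combine independent coupling signals: 1 - prod(1 - p). No signals -> 0."""
    return 1.0 - prod(1.0 - p for p in probs)


def eisc_raw_sketch(sources, coupling):
    """Hypothetical helper: effective independent source count.

    `coupling` maps frozenset({a, b}) to a list of coupling strengths in [0, 1]
    (e.g. one per signal type: source edge, voiceprint, cluster coupling).
    Each source contributes 1 / (row sum of its connectedness row), where the
    row includes an assumed self-connection of 1.
    """
    contrib = {}
    for s in sources:
        row_sum = 1.0  # assumed self-connection
        for t in sources:
            if t != s:
                row_sum += noisy_or(*coupling.get(frozenset((s, t)), []))
        contrib[s] = 1.0 / row_sum
    return sum(contrib.values()), contrib


# Illustrative input: a and b are coupled by two signals of 0.5 each; c is unconnected.
total, per_source = eisc_raw_sketch(
    ["a", "b", "c"],
    {frozenset(("a", "b")): [0.5, 0.5]},
)
```

Under these assumptions, fully unconnected sources each count as one voice, a perfectly coupled pair collapses to a single voice, and partial coupling lands in between. The adjusted score would then be `eisc_raw` × `xcluster_mult`, as the list above states.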
Classifier invariants that MUST NOT regress
These five are what make the Battery adversarial case pass; each was a real bug found by running it.
- `max_tokens` sized to the batch. A fixed 3000 truncated the JSON mid-array on ~60-claim batches → empty parse → a whole node silently scored 0. `derivative_relevance` now sets `budget = max(3000, 120*len(claims)+500)`.
- Strip `[]` from echoed claim_ids. The listing presents ids as `- [{id}] ...`; the model inconsistently echoes them back as `[id]`, which misses the bracket-less lookup → all `(missing)`. Normalize with `str(id).strip().strip("[]").strip()`.
- REALIZED-ONLY (`_REL_SYS`): announcements / plans / intent / "may·will·expects·poised·up-to" are NOT corroboration; only deployed/closed FACTS affirm. ("$2B announced" ≠ capital deployed.)
- ROLE-MATCH (`_REL_SYS`): the actor must occupy the role the hypothesis is about. For a capital-provider hypothesis, a borrower posting collateral is the wrong side → `tangential`, not `affirms`.
- Hard-evidence guard (`net_at`, `require_hard_evidence=True`): a source only counts on a side if it carries a `descriptive`/`reactive` (realized-fact) claim there; `predictive`/`interpretive` (forecast/opinion/intent) alone don't qualify it. Reports `hard_affirm_src` / `soft_affirm_src_dropped`.
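The three mechanical invariants above (token budget, id normalization, hard-evidence guard) can be sketched as follows. The budget formula and the normalization expression are taken from the text; the function names, the claim dict fields (`source`, `stance`, `kind`), and the return shape of the guard are hypothetical and only illustrate the rule.

```python
def token_budget(claims: list) -> int:
    """Scale max_tokens with batch size so the JSON array is not truncated."""
    return max(3000, 120 * len(claims) + 500)


def normalize_claim_id(raw) -> str:
    """Strip whitespace and echoed square brackets so '[c12]' matches 'c12'."""
    return str(raw).strip().strip("[]").strip()


# Claim kinds that count as realized fact, per the hard-evidence rule.
HARD_KINDS = {"descriptive", "reactive"}


def split_sources_by_evidence(claims: list[dict], side: str) -> tuple[set, set]:
    """Return (hard sources, soft-only sources dropped) for one side.

    A source qualifies on `side` only if at least one of its claims there
    is descriptive or reactive; predictive/interpretive alone do not count.
    """
    on_side = [c for c in claims if c["stance"] == side]
    all_sources = {c["source"] for c in on_side}
    hard = {c["source"] for c in on_side if c["kind"] in HARD_KINDS}
    return hard, all_sources - hard


# Illustrative claims (made up for the example, not real data).
example = [
    {"source": "s1", "stance": "affirms", "kind": "descriptive"},
    {"source": "s1", "stance": "affirms", "kind": "predictive"},
    {"source": "s2", "stance": "affirms", "kind": "interpretive"},
    {"source": "s3", "stance": "denies", "kind": "reactive"},
]
hard_affirm, soft_dropped = split_sources_by_evidence(example, "affirms")
```

Here `s1` qualifies on the affirm side because it carries one realized-fact claim, while `s2` is dropped because its only affirming claim is an opinion.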
EISC / independence rules
- Bitcoin is one CAPPED cluster (`cluster_capped_low`, `CAP_VALUE`): within-cluster agreement can contribute at most ~0.25 of a voice, so it can NOT masquerade as independent corroboration. Real corroboration of a bitcoin thesis must come from OUTSIDE the cluster (e.g. the `banks` cluster).
- Cross-cluster earns the multiplier (`xcluster_mult`, gated by `k_eff` = clusters contributing ≥0.5 of a voice). One guest doing the rounds collapses to ~1; it does not earn the gold multiplier.
- `own_network` quarantine is MATERIALITY-driven (see AGENTS.md): live mode drops materially-tied Ten31 sources; test mode keeps them. Validated: own_network-only affirms → live `eisc` ≈ 0, test > 0.
- Seed edges in `sorted([a,b])` order (matches `transcribe_worker`'s `sorted()` + `weight+=1` upsert) so auto-detected and seeded edges share a PK. A reversed-order row DOUBLE-COUNTS (the math is frozenset-undirected but the table PK is ordered). κ: shared_guest 0.85, citation 0.45, community 0.60.
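The seed-edge ordering rule is easy to get wrong, so here is a small sketch of why the sorted key matters. It uses an in-memory `Counter` as a stand-in for the edge table; `edge_key` and `upsert_edge` are hypothetical names, not functions from `transcribe_worker`.

```python
from collections import Counter


def edge_key(a: str, b: str) -> tuple[str, str]:
    """Canonical ordered key: the same pair always maps to the same row."""
    lo, hi = sorted([a, b])
    return (lo, hi)


def upsert_edge(table: Counter, a: str, b: str, weight: int = 1) -> None:
    """Insert the edge or bump its weight, always under the canonical key."""
    table[edge_key(a, b)] += weight


# Canonical keys: an auto-detected edge and a seeded edge land on one row.
canonical = Counter()
upsert_edge(canonical, "pod_b", "pod_a")  # detected in one order
upsert_edge(canonical, "pod_a", "pod_b")  # seeded in the other order

# Raw ordered keys: the reversed seed creates a second row for the same pair.
naive = Counter()
naive[("pod_b", "pod_a")] += 1
naive[("pod_a", "pod_b")] += 1
```

With canonical keys the pair is stored once with an accumulated weight; with raw ordered keys the same undirected pair occupies two rows, which is the double-count the rule guards against.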
The adversarial cases (the validation harness)
Pre-registered failed convictions used to test the engine against its target failure mode. Seeds:
seeds/conviction_log.adversarial.seed.yaml, seeds/fanout.{STRIKE,BATTERY}2022.seed.yaml,
seeds/resolution.{STRIKE,BATTERY}2022.yaml, seeds/resolution_outcomes.adversarial.yaml.
- BATTERY2022 (timing/disconfirmation). BTC-collateralized lending: demand rose, institutional supply failed. PASS = demand-net rises while supply-net stays flat (≈0). Run: `two-sided --conviction BATTERY2022 --nodes demand,supply --modes live`. Supply resolves ONLY on committed/DEPLOYED capital; policy/regulation is CONTEXT (the `custody-policy` node), never supply (S1).
- STRIKE2022 (reflexivity/false-positive). Lightning-retail-payments thesis FAILED. PASS = net stays quiet in `live` (own_network dropped) while it would fire in `test`, i.e. the engine refusing the intra-cluster echo. Run: `two-sided --conviction STRIKE2022 --modes live,test`. The REALIZED-ONLY rule is load-bearing here (speculative "Lightning will revolutionize payments" is `predictive`, not signal).
Standing rule S1: derivatives resolve on OUTCOME (scaled substance), never milestones or enablers. An announced program / a regulatory unblock / a single bank's toe-in is CONTEXT, not corroboration.
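The PASS criteria for the two adversarial cases can be written as simple checks over net-corroboration trajectories. This is a sketch only: the function names, the tolerance `flat_tol`, and the `fire_threshold` are hypothetical placeholders, and the harness may define "flat", "quiet", and "fire" differently.

```python
def battery_pass(demand_net: list[float], supply_net: list[float],
                 flat_tol: float = 0.5) -> bool:
    """BATTERY2022: demand-net rises while supply-net stays near zero."""
    demand_rises = demand_net[-1] > demand_net[0]
    supply_flat = all(abs(x) <= flat_tol for x in supply_net)
    return demand_rises and supply_flat


def strike_pass(live_net: list[float], test_net: list[float],
                fire_threshold: float = 1.0) -> bool:
    """STRIKE2022: quiet in live mode, but would fire in test mode."""
    live_quiet = max(abs(x) for x in live_net) < fire_threshold
    test_fires = max(test_net) >= fire_threshold
    return live_quiet and test_fires


# Illustrative trajectories (invented for the example, not engine output).
battery_ok = battery_pass([0.2, 1.1, 2.4], [0.0, 0.1, 0.0])
battery_bad = battery_pass([0.2, 1.1, 2.4], [0.0, 1.5, 2.0])
strike_ok = strike_pass([0.0, 0.1, 0.2], [0.5, 1.4, 2.2])
strike_bad = strike_pass([0.0, 1.2], [0.5, 1.4])
```

Encoding the criteria this way makes the failure direction explicit: BATTERY fails if supply-net moves on anything other than deployed capital, and STRIKE fails if the live trajectory fires on the intra-cluster echo.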