# Ten31 Signal Engine — ROADMAP
Longer-term backlog (the near-term snapshot lives in `AGENTS.md` → Current state). Rationale and the
falsification hypotheses (H1–H6) are in `DESIGN_v2.md`.
## Scoring brain — the real validation
- **Frontier-fan-out test (H6) — the untested half, = the actual §1.1 miss.** Seed a 2023 conviction,
give the model 2023-ONLY context, let it PROPOSE the derivatives, then score that tree's precision/
recall against what actually repriced. The §7.1 backtest hand-wrote the tree (hindsight leakage); this
is the part that matters and isn't tested yet.
- **Estimator rework (H4).** Replace the fragile 2nd-derivative *acceleration* with a persistence /
level-crossing test on the corroboration arrival rate, with per-source-type window cadence.
- **Build the real resolver.** `signals/resolver.py` is a stub. Settle the lead-time-vs-actual-repricing
debate empirically against structured outcomes (price, FERC interconnection queue, PPAs, capex, policy).
- **Extend claim-type weighting to the §7.1 power-infra tree** (it currently only gates the bitcoin
adversarial cases; descriptive-deployed > predictive-intent should apply everywhere).
- **Job A scorers** (emergence / stance / intersection) for the forward Discovery pilot.
- **MD&A targeting** for filings — extract Item 7, not front-matter boilerplate.
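
The estimator rework (H4) could take roughly the following shape. This is a minimal sketch under stated assumptions, not the engine's estimator: the function names, the `WINDOW_DAYS` cadences, and the `level` / `persistence` thresholds are all hypothetical and would need to be set from `DESIGN_v2.md`. It buckets corroboration arrivals into fixed-width windows and fires only when the per-window count holds at or above a level for several consecutive windows, rather than differentiating the series twice.

```python
from datetime import date, timedelta

# Hypothetical per-source-type window cadence, in days (illustrative values only).
WINDOW_DAYS = {"podcast": 14, "filing": 90}


def arrival_counts(arrivals, start, end, window_days):
    """Count corroboration arrivals in consecutive fixed-width windows over [start, end)."""
    n_windows = max(0, (end - start).days // window_days)
    counts = [0] * n_windows
    for d in arrivals:
        idx = (d - start).days // window_days
        if 0 <= idx < n_windows:  # ignore arrivals outside the observed span
            counts[idx] += 1
    return counts


def level_crossing_fires(counts, level, persistence):
    """True once the count has stayed >= level for `persistence` consecutive windows."""
    run = 0
    for c in counts:
        run = run + 1 if c >= level else 0
        if run >= persistence:
            return True
    return False


# Usage: six 14-day windows; a sustained rise fires, a one-window spike does not.
start = date(2023, 1, 1)
end = start + timedelta(days=6 * WINDOW_DAYS["podcast"])
sustained = [start + timedelta(days=n) for n in (3, 20, 30, 31, 45, 50, 60, 61, 65)]
spike = [start + timedelta(days=n) for n in (30, 31, 32, 33)]

sustained_counts = arrival_counts(sustained, start, end, WINDOW_DAYS["podcast"])
spike_counts = arrival_counts(spike, start, end, WINDOW_DAYS["podcast"])
print(sustained_counts, level_crossing_fires(sustained_counts, level=2, persistence=3))
print(spike_counts, level_crossing_fires(spike_counts, level=2, persistence=3))
```

The persistence requirement is what makes this less fragile than an acceleration estimate: a single noisy window resets the run instead of flipping the sign of a second difference.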
## Corpus & independence
- **Complete the Strike reflexivity demonstration (deferred 2026-06-16).** The STRIKE2022 adversarial test
is a *conditional* pass: the engine refuses the false positive via the capped single-bitcoin-cluster
guard (`net=+0.25` < `EISC_FLOOR=2.0`), but the own_network-drop mechanism (live quiet < test fires) is
unexercised because RHR (80) / CD (77) / Bitcoin.Review (12) were deferred at transcription on 2026-06-08
(`own_net=0`, live==test). To finish: un-defer + transcribe the **RHR/CD 2022–23 Lightning-retail window**
→ `run-extract` → re-run `two-sided --conviction STRIKE2022 --modes live,test`; PASS = test fires while
live stays quiet. Costs constrained audio-GPU time, hence deferred.
- **Confirm materiality** of the remaining `own_network`-flagged sources with Grant: Unchained, Debifi,
Coinkite (Bitcoin.Review). Immaterial → flip to independent (the River/Swan precedent).
- **BTC Sessions (Ben Perrin)** — strongest still-missing independent high-Strike merchant/wallet-adoption
show; resolve feed + ingest (a task chip exists for it).
- **River image-PDF reports** — the 2022 Lightning report + 2025/2026 adoption reports have no text layer;
add an OCR / page-rasterization+vision path to ingest them.
- Broad, **lineage-aware** corpus expansion toward independent vantage points (not more correlated
sell-side / trade-press voices).
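
The STRIKE2022 pass criterion above can be written down as a small decision rule. This is one reading of the roadmap item, not the engine's code: `fires`, `reflexivity_verdict`, and the choice to fire at or above the floor are assumptions. Only `EISC_FLOOR=2.0` and the observed `net=+0.25` come from the text.

```python
EISC_FLOOR = 2.0  # value quoted in the roadmap item


def fires(net):
    """Assumed firing rule: the engine flags a conviction when net reaches the floor."""
    return net >= EISC_FLOOR


def reflexivity_verdict(live_net, test_net):
    """Classify a two-sided run given the net score in live mode and in test mode."""
    if fires(live_net):
        return "FAIL"          # live mode emitted the false positive
    if fires(test_net):
        return "PASS"          # test fires while live stays quiet
    return "CONDITIONAL"       # live is quiet, but the own_network drop was never exercised


# Current state per the roadmap: own_net=0, so live == test at net=+0.25.
print(reflexivity_verdict(0.25, 0.25))
```

Under this reading, the run stays a conditional pass until the deferred transcripts give the test mode something that the live mode drops.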
## Infrastructure & ops
- **Add an automated test suite** (none today) — start with the scoring primitives (EISC, two_sided,
as-of harness) and the queue.
- **Episode-pipelining** in `transcribe_worker` — download/chunk the next episode while transcribing the
current one, to close the inter-episode GPU idle gap (the per-chunk 2-in-flight path is already done).
- **Corpus-management UI** — add to the corpus over time and see the full corpus selection.
- **Expose pipeline tunables in the UI (with the UI topic).** Extraction chunk size + per-doc chunk cap,
audio chunk length, audio concurrency, etc. are currently hardcoded defaults (now also CLI flags on
`run-extract`: `--chunk-chars`, `--max-chunks`). Surface them in the UI so they're visible/adjustable,
not black-box assumptions we forget about. Tie to the corpus-management UI work.
- **Daily activity digest email.** A `daily-digest` CLI command rendering a "last 24h" report —
corpus throughput (`documents.ingested_at`/`processed_at`, `claims.extracted_at` by kind/cluster),
queue health (`backfill.queue.stats()` — surface **failed/stuck** jobs), Qdrant index lag, infra
(`spark-status`), and a **key-findings** section (new `ledger` rows by `date_logged`; `candidate_scores`
that `cleared_evidence_bar`). All timestamps already default to `datetime('now')`, so the window is a
one-liner; the activity half is buildable today. Deliver via **SMTP** (stdlib `smtplib`+`email`, no new
dep, configurable per service — `SMTP_HOST/PORT/USER/PASS`, `DIGEST_TO`); ship a `--stdout` dry-run
mode; schedule via launchd on the Mac. Two dependencies: (1) the findings section is only real once the
**Job A discovery scorers** run on a schedule — until then it's stubbed/echoes manual adversarial runs;
(2) sovereignty (guardrail #7) — SMTP through your own/ten31 server keeps it inside the boundary; do NOT
route through a third-party email API if findings ever carry Battery/Strike or positioning substance.
- **Forward live operation** — the only real test: scoring un-pre-selected signals as they arrive, with
the dual-evaluation ledger as arbiter.
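
The activity half of the daily digest could be sketched as follows. This is a hedged illustration, not the planned `daily-digest` command: it assumes the store is SQLite (suggested by the `datetime('now')` defaults) and that `documents.ingested_at` and `claims.extracted_at` exist as named above. The helper names `activity_counts`, `build_digest`, and `deliver`, the subject line, and the default port 587 with STARTTLS are hypothetical.

```python
import os
import smtplib
import sqlite3
from email.message import EmailMessage

# Last-24h window expressed with SQLite's own date arithmetic.
QUERIES = {
    "documents ingested": (
        "SELECT COUNT(*) FROM documents "
        "WHERE ingested_at >= datetime('now', '-1 day')"
    ),
    "claims extracted": (
        "SELECT COUNT(*) FROM claims "
        "WHERE extracted_at >= datetime('now', '-1 day')"
    ),
}


def activity_counts(conn):
    """Return {label: row count} for rows timestamped within the last 24 hours."""
    return {label: conn.execute(sql).fetchone()[0] for label, sql in QUERIES.items()}


def build_digest(counts, sender, recipient):
    """Render the counts as a plain-text email message."""
    msg = EmailMessage()
    msg["Subject"] = "Ten31 Signal Engine: last 24h"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(f"{label}: {n}" for label, n in counts.items()))
    return msg


def deliver(msg, stdout=False):
    """Print the body for a dry run, or send through the configured SMTP server."""
    if stdout:
        print(msg.get_content())
        return
    host = os.environ["SMTP_HOST"]
    port = int(os.environ.get("SMTP_PORT", "587"))  # assumed default
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()  # assumes the server supports STARTTLS
        smtp.login(os.environ["SMTP_USER"], os.environ["SMTP_PASS"])
        smtp.send_message(msg)
```

A dry run would call `deliver(build_digest(activity_counts(conn), sender, os.environ["DIGEST_TO"]), stdout=True)`. Keeping the send path on stdlib `smtplib` against your own server is what satisfies the sovereignty constraint; nothing in the sketch touches a third-party API.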
## Packaging / deploy
- **Start9 `s9pk` packaging.** Build with `make x86` then `make install` → `immense-voyage.local`.
Bump the package version in the manifest BEFORE building (Start9 0.4.x won't recognize an un-bumped
rebuild). See the placement standard for infra conventions.