
# ROADMAP — Keysat

Longer-term backlog. Near-term state lives in `AGENTS.md` → Current state.

## Payments & subscriptions

- Rail-preference editing UI — only matters when two providers on one profile both serve the same rail; settable today via `PUT /v1/admin/merchant-profiles/:id/rail-preferences/:rail`.
- **Auto-charge silently lapses a subscription on a 200-with-failure response (money-path bug; elevated above the other parked payments items).** `try_auto_charge_zaprite` returns `Ok(true)` on *any* HTTP 2xx (`subscriptions.rs:1403-1410`), reading the order `status` only for a log line. If Zaprite returns 200 carrying a `FAILED`/`DECLINED`/`EXPIRED` order status, the daemon fires `auto_charge_initiated` and then waits for an `order.paid` webhook that never arrives — the subscription silently lapses, no error surfaced, the customer churns. Safe fix (no production data needed): treat any non-`PAID` terminal order status as not-success and fall through to the manual-pay path — a conservative fail-safe, ~10 lines + a mock-provider test. (Found during the 2026-06-17 adjudication; it replaces the old "harden Zaprite failure-body shapes" item, which was already satisfied for non-2xx responses — those route correctly to WARN + `auto_charge_failed` audit + webhook + manual-pay fallback.)
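
The fail-safe for the auto-charge item could look roughly like the sketch below. It is illustrative only: `ChargeOutcome` and `classify_auto_charge` are hypothetical names, not the existing `try_auto_charge_zaprite` code, and the terminal-status list is just the three values named in the item. Keeping today's behavior for a missing or unrecognized status is an assumption.

```rust
/// Outcome of an auto-charge attempt as seen by the renewal loop.
/// (Hypothetical type, not taken from `subscriptions.rs`.)
#[derive(Debug, PartialEq, Eq)]
pub enum ChargeOutcome {
    /// Order is paid or still in flight; wait for the `order.paid` webhook.
    Initiated,
    /// The order is already dead (or the call failed); use the manual-pay path.
    FallBackToManualPay,
}

/// Classify an auto-charge response by HTTP status *and* order status.
/// Assumption: FAILED / DECLINED / EXPIRED are the terminal failure statuses;
/// a missing or unknown status keeps the current "initiated" behavior.
pub fn classify_auto_charge(http_status: u16, order_status: Option<&str>) -> ChargeOutcome {
    // Non-2xx responses already route to the failure path today.
    if !(200..300).contains(&http_status) {
        return ChargeOutcome::FallBackToManualPay;
    }
    match order_status.map(|s| s.to_ascii_uppercase()) {
        // 2xx carrying a terminal failure status: treat as not-success.
        Some(s) if matches!(s.as_str(), "FAILED" | "DECLINED" | "EXPIRED") => {
            ChargeOutcome::FallBackToManualPay
        }
        _ => ChargeOutcome::Initiated,
    }
}
```

A mock-provider test would then assert that a 200 response with `status: "FAILED"` yields the manual-pay fallback rather than `auto_charge_initiated`.
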

## Agent compatibility & scoped API keys

- **Agent-delegable payment-provider connect** (approved, not urgent — see
  `plans/agent-payment-connect-scope.md`). Add an à-la-carte `payment_providers:write` scope
  (never bundled into `merchant-onboard`), gated by a daemon-level **sandbox-mode flag** as the
  outer gate (production daemons reject scoped connect entirely) with a **network gate** inner
  defense (regtest/testnet/signet only; fail-closed, so an undeterminable network is treated as
  mainnet and rejected). BTCPay network is derived from an on-chain address prefix (no
  `server/info` field exists).
- **Onboarding doc-harness — Stage 2 (Path 2, regtest buyer-pays).** Gated on slices 3–5 above.
  Stage 1 (Path 1, no payments) shipped `completed-clean` this session — harness at
  `licensing-service-startos/onboarding-harness/`, record in its `STAGE1-RESULT.md`. Stage 2
  reuses the harness but boots the fixture with `KEYSAT_SANDBOX_MODE` on, stands up a Dockerized
  BTCPay regtest stack (bitcoind regtest + NBXplorer + Postgres + BTCPay) as additional
  disposable infra, and grants the agent `merchant-onboard` + `payment_providers:write`. Goal:
  the agent connects BTCPay (regtest) over the API and drives a test buyer payment that activates
  a license, with zero master-key steps. The walkthrough must be explicitly labeled
  regtest/test-network and must state that connecting a real mainnet wallet is the one
  operator-reserved step **by design** (a key that can redirect funds stays with the human) — a
  security feature, not a gap.
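
A minimal sketch of the two gates described in the payment-connect item, assuming the network is classified purely from standard Bitcoin address prefixes. `Network`, `network_from_address`, and `scoped_connect_allowed` are hypothetical names, and the sketch deliberately skips checksum validation. Anything it cannot recognize is classified as mainnet, so the gate fails closed.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Network {
    Mainnet,
    /// Testnet or signet (they share address prefixes).
    TestLike,
    Regtest,
}

/// Classify an on-chain address by prefix only (no checksum validation).
/// Unknown or empty input is reported as `Mainnet`, i.e. fail closed.
pub fn network_from_address(addr: &str) -> Network {
    let a = addr.trim();
    // Bech32 addresses are case-insensitive, so compare the HRP in lowercase.
    let lower = a.to_ascii_lowercase();
    if lower.starts_with("bcrt1") {
        return Network::Regtest;
    }
    if lower.starts_with("tb1") {
        return Network::TestLike;
    }
    if lower.starts_with("bc1") {
        return Network::Mainnet;
    }
    // Base58 addresses are case-sensitive; regtest legacy addresses reuse
    // the testnet version bytes, so they also land in `TestLike`.
    match a.chars().next() {
        Some('m') | Some('n') | Some('2') => Network::TestLike,
        _ => Network::Mainnet, // '1', '3', empty, or unrecognized
    }
}

/// Outer gate: sandbox mode must be on. Inner gate: network must not be mainnet.
pub fn scoped_connect_allowed(sandbox_mode: bool, addr: &str) -> bool {
    sandbox_mode && network_from_address(addr) != Network::Mainnet
}
```

With this shape a production daemon (`sandbox_mode == false`) rejects every scoped connect regardless of address, and a sandbox daemon still rejects a mainnet or unparseable address.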

## Packaging & distribution

- **Start9 Community Registry submission.** Mechanism (researched 2026-06-18): **email-based, not a PR or
  form.** Mail `submissions@start9labs.com` (the 0.3.5.x docs say `submissions@start9.com` — addresses are
  inconsistent) a link to the public wrapper repo (+ detailed README); both wrapper and upstream source must
  be public. Start9 snapshots the repo, **builds from source on a clean Debian box** (`prepare.sh` + `make`; a
  failed first build bounces the submission), installs + tests on real hardware (metadata, install/uninstall,
  interfaces, health, backup/restore, low-resource device), lands it in Community **Beta**, and promotes to
  production when you reply asking. Updates follow the same loop. `start-cli s9pk publish` is **self-hosted-registry
  only** — unrelated to community intake. `prepare.sh` shipped this session (`licensing-service-startos/prepare.sh`).
  **Clear with Start9 before submitting:** (1) is the custom source-available `LicenseRef-Keysat-1.0` acceptable
  (docs conflict: "source available" vs "Open Source License") — highest-leverage; a hard No blocks regardless of
  build-readiness; (2) does the 0.4.x build flow still invoke `prepare.sh` (a 0.3.5.x concept, absent from 0.4.x
  docs). Then the on-box manual verification. Functional criteria otherwise pass (2026-06-17 spec check).

## Operability & alerts

- **Surface internal failure conditions to the operator via StartOS-native notifications / health checks** —
  NOT a bespoke email/SMTP subsystem. The need is real and not covered by the webhook-delegation model: when
  the failure IS the webhook path (a dead-lettered endpoint, or the operator's receiver being down), a webhook
  can't report it, and the operator's own email pipeline is downstream of the same webhooks. So Keysat must own
  an out-of-band operator-visible channel — and on StartOS that channel is the OS notification/health surface,
  not a mailer Keysat ships itself. Conditions worth alerting on (the surviving kernel of the dropped email plan):
  payment-provider auth dead (repeated 401s ⇒ key revoked/rotated), a webhook endpoint hitting dead-letter,
  master self-license expiring-soon / expired, and (optional) renewals failing across the pool. The health-check
  path is already wired; verify the `start-sdk` 1.3.2 notification API before committing to a delivery mechanism.
  Background + the original alert catalog (Tier 1/2/3, throttling) live in the superseded
  `plans/keysat-smtp-emails.md`. **Buyer-facing email and the per-profile SMTP send path are dropped** (decided
  2026-06-18): operators selling via Keysat already own their buyer relationship and email pipeline through their
  own app + the existing webhooks, so Keysat emailing buyers is redundant and a branding/double-send liability.
  The dormant `merchant_profiles.smtp_*` columns (migration 0020) are now dead weight — left in place (a removal
  migration isn't worth it) and flagged in `src/merchant_profiles.rs`.

## Security & hardening (2026-06-18 full-eval P2s; EVALUATION.md has full detail but is overwritten each run, so the durable list lives here)

- **X-Forwarded-For rate-limit bypass.** Login/recover/validate buckets key off the raw first XFF value
  (`api/auth.rs:137`, `api/recover.rs:65`, `api/admin.rs:63`, `api/validate.rs:95`); rotating XFF defeats the
  throttle. First confirm whether the StartOS front proxy overwrites XFF (decides real-world reachability), then
  derive client IP from the trusted-proxy connection with a peer-socket fallback.
- **Dependency advisories** (mechanical, low-risk): bump `sqlx 0.7.4` → ≥0.8.1 (RUSTSEC-2024-0363),
  `rustls-webpki 0.101.7` → ≥0.103.13 (RUSTSEC-2026-0098/0099/0104), update start-sdk for the wrapper's
  `fast-xml-parser` (GHSA-5wm8-gmm8-39j9); re-run `cargo audit` / `npm audit`.
- **Admin UI co-located with the public API** on the single `:8080` interface (`startos/interfaces.ts`) — operator
  can't network-isolate admin. Split the admin SPA + `/v1/admin/*` onto their own port/interface.
- **Webhook-endpoint registration accepts `file://` / loopback URLs** (`webhooks.rs:106`, admin-gated) — add a
  scheme + host allowlist (reject non-http(s), loopback, link-local).
- **Runtime-prepared SQL** in `db/repo.rs` + `subscriptions.rs` (no compile-time column check; this class already
  500'd every paid purchase once on `:52`) — migrate the money-path queries to compile-checked `sqlx::query!`.
- **`rate_buckets` grows unbounded** (`rate_limit.rs:63`, one row per client IP on a public endpoint, no reaper) —
  add a reaper mirroring the session/redemption reapers in `main.rs`.
- **No CI.** Stand up one job (`cargo test && cargo clippy && tsc --noEmit`, ideally `cargo fmt --check`); the suite
  is good but unenforced, so green depends on the operator remembering to run it before `publish.sh`.
- Doc-drift P3 cluster (each a one-liner, see EVALUATION.md): BUILDING.md Node 20→22 / Rust 1.75→1.88, broken docs
  `/changelog` footer link, README "Show credentials" → "Show admin API key", PORTING_SDK stale (Python/Go shipped;
  crate renamed), testing.md stale test counts, and the `unlimited_merchant_profiles` guide/code-comment ("still
  needs adding") vs AGENTS ("confirmed live") contradiction; resolve it with a live `GET /v1/products/keysat/policies`.
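
For the X-Forwarded-For item, the "trusted proxy with peer-socket fallback" derivation might be sketched as follows. This uses only the standard library; `client_ip` and the `trusted_proxies` list are hypothetical, and whether the StartOS front proxy appends to or overwrites XFF still needs to be confirmed first, as the item says.

```rust
use std::net::{IpAddr, SocketAddr};

/// Pick the IP address to key rate-limit buckets on.
/// `trusted_proxies` is an assumed config value listing the front proxies.
pub fn client_ip(peer: SocketAddr, xff: Option<&str>, trusted_proxies: &[IpAddr]) -> IpAddr {
    // Connection did not come through a trusted proxy: the header is
    // attacker-controlled, so ignore it and use the socket peer.
    if !trusted_proxies.contains(&peer.ip()) {
        return peer.ip();
    }
    // Walk the header right to left. Rightmost entries were appended by our
    // own proxies; the first non-trusted address is the real client hop.
    for hop in xff.unwrap_or("").rsplit(',') {
        match hop.trim().parse::<IpAddr>() {
            Ok(ip) if trusted_proxies.contains(&ip) => continue,
            Ok(ip) => return ip,
            Err(_) => break, // malformed or empty entry: stop trusting the header
        }
    }
    // Peer-socket fallback.
    peer.ip()
}
```

Because the leftmost XFF values are never consulted unless every hop to their right is a trusted proxy, rotating the header from the client side no longer changes the bucket key.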

## Licensing model

- Evaluate Elastic License v2 vs the current custom `LicenseRef-Keysat-1.0` (parked decision).

## Validation

- Re-test `KEYSAT_INTEGRATION.md` against a fresh downstream app to confirm a clean one-shot SDK integration.
- **Add an automated regression test for multi-profile webhook routing** (adjudicated 2026-06-17 → DO, low blast radius — replaces the parked "manual Zaprite sandbox pass"). The routing is a deterministic provider-id→profile primary-key lookup with an anti-forgery re-fetch backstop, so the manual sandbox ceremony isn't worth it — but the path-keyed route (`/v1/{provider}/webhook/:provider_id` → `handle_for_provider`) currently has zero automated coverage on the money path. Plan: in `tests/api.rs`, reuse the two-provider fixture (~:3958), POST a Settled webhook to `/v1/zaprite/webhook/{provider-A-id}`, assert only profile A settles (B untouched; an unknown path-id 404s). Existing mock seam, no external account, runs in `cargo test`. Effort S.
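
The three assertions in the routing-test plan can be illustrated with a toy model. This is not the real `tests/api.rs` fixture or `handle_for_provider`: `Router`, `route_webhook`, and `WebhookResult` are hypothetical stand-ins that only mirror the provider-id to profile lookup described above.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum WebhookResult {
    Settled { profile_id: u32 },
    /// Stands in for the HTTP 404 on an unknown path id.
    NotFound,
}

/// Toy stand-in for the path-keyed webhook route.
pub struct Router {
    provider_to_profile: HashMap<String, u32>,
    /// Profiles that have recorded a settlement, in arrival order.
    pub settled: Vec<u32>,
}

impl Router {
    pub fn new(pairs: &[(&str, u32)]) -> Self {
        Router {
            provider_to_profile: pairs.iter().map(|(p, id)| (p.to_string(), *id)).collect(),
            settled: Vec::new(),
        }
    }

    /// Deterministic provider-id lookup; unknown ids touch no profile.
    pub fn route_webhook(&mut self, provider_id: &str) -> WebhookResult {
        match self.provider_to_profile.get(provider_id) {
            Some(&profile_id) => {
                self.settled.push(profile_id);
                WebhookResult::Settled { profile_id }
            }
            None => WebhookResult::NotFound,
        }
    }
}
```

The real test would make the same three checks against the HTTP route: posting to provider A's path settles only profile A, profile B stays untouched, and an unknown path id returns 404.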