# ROADMAP — Keysat

Longer-term backlog. Near-term state lives in `AGENTS.md` → Current state.

## Payments & subscriptions

- Rail-preference editing UI. Only matters when two providers on one profile both serve the same rail; settable today via `PUT /v1/admin/merchant-profiles/:id/rail-preferences/:rail`.
- **Auto-charge silently lapses a subscription on a 200-with-failure response (money-path bug; elevated above the other parked payments items).** `try_auto_charge_zaprite` returns `Ok(true)` on *any* HTTP 2xx (`subscriptions.rs:1403-1410`) and reads the order `status` only for a log line. If Zaprite returns 200 carrying a `FAILED`/`DECLINED`/`EXPIRED` order status, the daemon fires `auto_charge_initiated` and then waits for an `order.paid` webhook that never arrives: the subscription silently lapses, no error is surfaced, and the customer churns. Safe fix (no production data needed): treat any non-`PAID` terminal order status as not-success and fall through to the manual-pay path. This is a conservative fail-safe, about 10 lines plus a mock-provider test. (Found during the 2026-06-17 adjudication; it replaces the old "harden Zaprite failure-body shapes" item, which was already satisfied for non-2xx responses, since those route correctly to WARN + `auto_charge_failed` audit + webhook + manual-pay fallback.)
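
The fail-safe in the auto-charge item above can be sketched as a pure classification step. This is a minimal sketch, not the daemon's code: `ChargeOutcome` and `classify_auto_charge` are hypothetical names, the status strings are the ones quoted in the item, and treating a missing or unrecognized status as still pending is an assumption.

```rust
/// Result of one auto-charge attempt as the renewal loop would see it.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
enum ChargeOutcome {
    /// The order already reports PAID.
    Paid,
    /// 2xx with a non-terminal status: keep waiting for the `order.paid` webhook.
    Pending,
    /// Non-2xx, or 2xx carrying a failed terminal status: fall through to manual pay.
    Failed,
}

/// Classify a provider response from its HTTP status and the order `status` field.
fn classify_auto_charge(http_status: u16, order_status: Option<&str>) -> ChargeOutcome {
    // Non-2xx responses already route to the failure path today.
    if !(200..300).contains(&http_status) {
        return ChargeOutcome::Failed;
    }
    // Compare case-insensitively; the exact casing the provider sends is an assumption.
    let upper = order_status.map(|s| s.to_ascii_uppercase());
    match upper.as_deref() {
        Some("PAID") => ChargeOutcome::Paid,
        // The fix: a failed terminal status inside a 2xx body is not a success.
        Some("FAILED") | Some("DECLINED") | Some("EXPIRED") => ChargeOutcome::Failed,
        // Assumption: anything else (including a missing status) is still in flight.
        _ => ChargeOutcome::Pending,
    }
}
```

In this shape the caller would emit `auto_charge_initiated` only for `Paid` or `Pending`, and take the existing `auto_charge_failed` plus manual-pay path for `Failed`.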

## Agent compatibility & scoped API keys

- **Agent-delegable payment-provider connect** (approved, not urgent; see
  `plans/agent-payment-connect-scope.md`). Add an à-la-carte `payment_providers:write` scope
  (never bundled into `merchant-onboard`), gated by a daemon-level **sandbox-mode flag** as the
  outer gate (production daemons reject scoped connect entirely) with a **network gate** inner
  defense (regtest/testnet/signet only, fail-closed to mainnet). BTCPay network is derived from
  an on-chain address prefix (no `server/info` field exists).
- **Onboarding doc-harness, Stage 2 (Path 2, regtest buyer-pays).** Gated on slices 3–5 above.
  Stage 1 (Path 1, no payments) shipped `completed-clean` this session; the harness is at
  `licensing-service-startos/onboarding-harness/`, with the record in its `STAGE1-RESULT.md`. Stage 2
  reuses the harness but boots the fixture with `KEYSAT_SANDBOX_MODE` on, stands up a Dockerized
  BTCPay regtest stack (bitcoind regtest + NBXplorer + Postgres + BTCPay) as additional
  disposable infra, and grants the agent `merchant-onboard` + `payment_providers:write`. Goal:
  the agent connects BTCPay (regtest) over the API and drives a test buyer payment that activates
  a license, with zero master-key steps. The walkthrough must be explicitly labeled
  regtest/test-network and must state that connecting a real mainnet wallet is the one
  operator-reserved step **by design** (a key that can redirect funds stays with the human): a
  security feature, not a gap.
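
The two-layer gate on scoped connect described above (sandbox flag outside, network check inside) might look roughly like this. It is a sketch under stated assumptions: `Network`, `network_from_address`, and `scoped_connect_allowed` are hypothetical names, and "fail-closed to mainnet" is read here as "any unrecognized prefix is classified as mainnet and therefore rejected". The prefixes are the standard Bitcoin ones (`bcrt1` for regtest, `tb1` for testnet/signet, `bc1` for mainnet; legacy test addresses start with `m`, `n`, or `2`). The function classifies by prefix only and does not validate the address.

```rust
/// Coarse network classification (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Network {
    Mainnet,
    /// Testnet or signet; regtest legacy (base58) addresses share these prefixes.
    Test,
    Regtest,
}

/// Classify an on-chain address by prefix. Anything unrecognized is Mainnet,
/// so an unexpected input can never open the gate.
fn network_from_address(addr: &str) -> Network {
    let addr = addr.trim();
    // Only lowercase bech32 prefixes are matched; an uppercase or mixed-case
    // address falls through to the base58 check and fails closed unless it
    // starts with a test prefix.
    if addr.starts_with("bcrt1") {
        return Network::Regtest;
    }
    if addr.starts_with("tb1") {
        return Network::Test;
    }
    if addr.starts_with("bc1") {
        return Network::Mainnet;
    }
    // Base58 prefixes are case-sensitive, so the address is not lowercased.
    match addr.chars().next() {
        Some('m') | Some('n') | Some('2') => Network::Test,
        _ => Network::Mainnet,
    }
}

/// Outer gate: the daemon-level sandbox flag. Inner gate: non-mainnet only.
fn scoped_connect_allowed(sandbox_mode: bool, addr: &str) -> bool {
    sandbox_mode && network_from_address(addr) != Network::Mainnet
}
```

Keeping the sandbox flag as the first operand means a production daemon rejects the request without the address ever being inspected.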

## Packaging & distribution

- **Start9 Community Registry submission.** Mechanism (researched 2026-06-18): **email-based, not a PR or
  form.** Mail `submissions@start9labs.com` a link to the public wrapper repo (plus a detailed README). The
  0.3.5.x docs say `submissions@start9.com`, so the documented addresses are inconsistent. Both the wrapper
  and the upstream source must be public. Start9 snapshots the repo, **builds from source on a clean Debian
  box** (`prepare.sh` + `make`; a failed first build bounces the submission), installs and tests on real
  hardware (metadata, install/uninstall, interfaces, health, backup/restore, low-resource device), lands it
  in Community **Beta**, and promotes to production when you reply asking. Updates follow the same loop.
  `start-cli s9pk publish` is **self-hosted-registry only** and unrelated to community intake. `prepare.sh`
  shipped this session (`licensing-service-startos/prepare.sh`).
  **Clear with Start9 before submitting:** (1) whether the custom source-available `LicenseRef-Keysat-1.0`
  is acceptable (the docs conflict: "source available" vs "Open Source License"). This is the
  highest-leverage question, since a hard No blocks the submission regardless of build-readiness.
  (2) whether the 0.4.x build flow still invokes `prepare.sh` (a 0.3.5.x concept, absent from the 0.4.x
  docs). Then do the on-box manual verification. Functional criteria otherwise pass (2026-06-17 spec check).

## Operability & alerts

- **Surface internal failure conditions to the operator via StartOS-native notifications / health checks**,
  NOT a bespoke email/SMTP subsystem. The need is real and not covered by the webhook-delegation model: when
  the failure IS the webhook path (a dead-lettered endpoint, or the operator's receiver being down), a webhook
  can't report it, and the operator's own email pipeline is downstream of the same webhooks. So Keysat must own
  an out-of-band operator-visible channel, and on StartOS that channel is the OS notification/health surface,
  not a mailer Keysat ships itself. Conditions worth alerting on (the surviving kernel of the dropped email plan):
  payment-provider auth dead (repeated 401s ⇒ key revoked/rotated), a webhook endpoint hitting dead-letter,
  master self-license expiring-soon / expired, and (optional) renewals failing across the pool. The health-check
  path is already wired; verify the `start-sdk` 1.3.2 notification API before committing to a delivery mechanism.
  Background and the original alert catalog (Tier 1/2/3, throttling) live in the superseded
  `plans/keysat-smtp-emails.md`. **Buyer-facing email and the per-profile SMTP send path are dropped** (decided
  2026-06-18): operators selling via Keysat already own their buyer relationship and email pipeline through their
  own app + the existing webhooks, so Keysat emailing buyers is redundant and a branding/double-send liability.
  The dormant `merchant_profiles.smtp_*` columns (migration 0020) are now dead weight. They are left in place
  (a removal migration isn't worth it) and flagged in `src/merchant_profiles.rs`.

## Security & hardening (2026-06-18 full-eval P2s; EVALUATION.md has full detail but is overwritten each run, so the durable list lives here)

- **X-Forwarded-For rate-limit bypass.** Login/recover/validate buckets key off the raw first XFF value
  (`api/auth.rs:137`, `api/recover.rs:65`, `api/admin.rs:63`, `api/validate.rs:95`); rotating XFF defeats the
  throttle. First confirm whether the StartOS front proxy overwrites XFF (decides real-world reachability), then
  derive client IP from the trusted-proxy connection with a peer-socket fallback.
- **Dependency advisories** (mechanical, low-risk): bump `sqlx 0.7.4` → ≥0.8.1 (RUSTSEC-2024-0363),
  `rustls-webpki 0.101.7` → ≥0.103.13 (RUSTSEC-2026-0098/0099/0104), update start-sdk for the wrapper's
  `fast-xml-parser` (GHSA-5wm8-gmm8-39j9); re-run `cargo audit` / `npm audit`.
- **Admin UI co-located with the public API** on the single `:8080` interface (`startos/interfaces.ts`), so the
  operator can't network-isolate admin. Split the admin SPA + `/v1/admin/*` onto their own port/interface.
- **Webhook-endpoint registration accepts `file://` / loopback URLs** (`webhooks.rs:106`, admin-gated). Add a
  scheme + host allowlist (reject non-http(s), loopback, link-local).
- **Runtime-prepared SQL** in `db/repo.rs` + `subscriptions.rs` (no compile-time column check; this class already
  500'd every paid purchase once on `:52`). Migrate the money-path queries to compile-checked `sqlx::query!`.
- **`rate_buckets` grows unbounded** (`rate_limit.rs:63`, one row per client IP on a public endpoint, no reaper).
  Add a reaper mirroring the session/redemption reapers in `main.rs`.
- **No CI.** Stand up one job (`cargo test && cargo clippy && tsc --noEmit`, ideally `cargo fmt --check`); the suite
  is good but unenforced, so green depends on the operator remembering to run it before `publish.sh`.
- Doc-drift P3 cluster (each a one-liner, see EVALUATION.md): BUILDING.md Node 20→22 / Rust 1.75→1.88, broken docs
  `/changelog` footer link, README "Show credentials" → "Show admin API key", PORTING_SDK stale (Python/Go shipped;
  crate renamed), testing.md stale test counts, and the `unlimited_merchant_profiles` guide/code-comment ("still
  needs adding") vs AGENTS ("confirmed live") contradiction, to be resolved with a live `GET /v1/products/keysat/policies`.
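
For the X-Forwarded-For item above, one common way to derive the client IP is to trust the header only when the socket peer is a known proxy, and then take the rightmost entry that is not itself a trusted proxy. The sketch below assumes that approach. `client_ip` and the `trusted_proxies` list are hypothetical, and whether the StartOS front proxy appends to or overwrites XFF still needs to be confirmed as the item says.

```rust
use std::net::{IpAddr, SocketAddr};

/// Pick the address to key rate-limit buckets on.
/// `peer` is the TCP peer of the connection; `xff` is the raw header value, if any.
fn client_ip(peer: SocketAddr, xff: Option<&str>, trusted_proxies: &[IpAddr]) -> IpAddr {
    // A direct (untrusted) peer can put anything in XFF, so ignore the header.
    if !trusted_proxies.contains(&peer.ip()) {
        return peer.ip();
    }
    let header = match xff {
        Some(h) => h,
        None => return peer.ip(), // peer-socket fallback
    };
    // Walk right to left: entries appended by trusted proxies are skipped, and the
    // first untrusted entry is the client as the trusted proxy saw it. Values
    // further left are attacker-controlled and never consulted.
    for part in header.rsplit(',') {
        match part.trim().parse::<IpAddr>() {
            Ok(ip) if trusted_proxies.contains(&ip) => continue,
            Ok(ip) => return ip,
            Err(_) => return peer.ip(), // malformed header: peer-socket fallback
        }
    }
    peer.ip()
}
```

With this shape, rotating the leftmost XFF value no longer changes the bucket key, because only the address recorded by the trusted hop is used.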

## Licensing model

- Evaluate Elastic License v2 vs the current custom `LicenseRef-Keysat-1.0` (parked decision).

## Validation

- Re-test `KEYSAT_INTEGRATION.md` against a fresh downstream app to confirm a clean one-shot SDK integration.
- **Add an automated regression test for multi-profile webhook routing** (adjudicated 2026-06-17 → DO, low blast radius; replaces the parked "manual Zaprite sandbox pass"). The routing is a deterministic provider-id→profile primary-key lookup with an anti-forgery re-fetch backstop, so the manual sandbox ceremony isn't worth it. However, the path-keyed route (`/v1/{provider}/webhook/:provider_id` → `handle_for_provider`) currently has zero automated coverage on the money path. Plan: in `tests/api.rs`, reuse the two-provider fixture (~:3958), POST a Settled webhook to `/v1/zaprite/webhook/{provider-A-id}`, and assert that only profile A settles (B is untouched; an unknown path-id returns 404). Uses the existing mock seam, needs no external account, and runs in `cargo test`. Effort S.