Files
keysat-root/EVALUATION.md
T

73 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Evaluation — Keysat (licensing-service-startos) — 2026-06-13
Intent: a self-hosted, Bitcoin-native software-licensing service that runs as a StartOS 0.4.x package — a Rust/axum/sqlx+SQLite daemon with Ed25519 license signing plus a TypeScript StartOS wrapper — recently extended with a multi-merchant-profile, multi-provider (BTCPay required, Zaprite optional) payment model.
Evaluation target: `licensing-service-startos/` (the daemon + StartOS wrapper). The SDK repos, public sites, and `plans/` were out of scope.
Agents run: evaluator, security-auditor, exerciser, start9-spec-checker (all completed). reviewer **skipped** — the guide runs it only on uncommitted changes, and the daemon tree was clean at evaluation time.
## Verdict
Keysat substantially achieves its intent: the cryptographic licensing core (offline-verifiable LIC1 keys, Ed25519, cross-language-tested) is reference-grade, the payment-provider trait is well-factored, the SQL surface is injection-clear despite runtime-prepared queries, and the security fundamentals (Argon2id, constant-time HMAC/compare, 256-bit sessions, generic-error recovery, FK enforcement confirmed) are above the bar for a project this young. The one release-blocking defect is the **Zaprite settle-webhook: it performs no authentication of any kind and issues a signed license off a buyer-visible order id — anyone who starts a real Zaprite checkout can forge a "paid" callback and mint licenses without paying** (the `externalUniqId` "trust anchor" the code comments describe is never actually read in the webhook path). Behind that sits a systemic test gap — the production purchase/settle path has no integration-test seam, which is exactly how the `:52` 500-on-every-purchase bug shipped — and a cluster of API-hygiene and StartOS-submission issues. None of the deeper architecture is rotten; the work is hardening a sound core, not rebuilding it.
## Cross-referenced findings
- **Zaprite webhook forgery (P0).** Flagged by BOTH the evaluator (as P2) and the security-auditor (escalated to **P0** after tracing it end-to-end: public route → no signature → `externalUniqId` never read → no amount check → no `pending`-state precondition → `webhook.rs:194` issues a signed license). One finding, two agents, severity resolved to P0 (funds loss). `payment/zaprite/provider.rs:234-362`, `api/webhook.rs:62-196`, route `api/mod.rs:429-430`.
- **Untested production purchase path (P1).** Evaluator named it the #1 issue (mock injects into the bypassed legacy `state.payment` singleton at `tests/api.rs:457`, not the real `resolve_provider_for_profile_rail` at `purchase.rs:482`); the exerciser independently reported it could not exercise the live payment/settle path at all (medium confidence). Same gap, two kinds of evidence. This is also the root of the two `paid_purchase_*` red tests.
- **FK enforcement (resolved, not a finding).** A prior "latent/unchecked" worry; the security-auditor and daemon-architecture both confirm the sqlx pool sets `foreign_keys(true)` per connection (`db/mod.rs`). Closed.
## Priority queue
- **[P0]** Zaprite settle-webhook is unauthenticated — forged `order.change`/`status=PAID` with a buyer-visible order id mints a free signed license; no amount or state check — `payment/zaprite/provider.rs:234-362` + `api/webhook.rs:62-196` — security-auditor (P0) + evaluator (P2)
- **[P1]** Production purchase/settle path has no integration-test seam; mock targets the dead `state.payment` singleton, not `resolve_provider_for_profile_rail``payment/mod.rs:177`, `tests/api.rs:457`, `purchase.rs:482` — evaluator + exerciser
- **[P1]** Settle handler issues without verifying paid amount/currency or requiring `pending` state (underpaid/replayed invoices honored at full entitlement; BTCPay forgery blocked by HMAC but replay is a no-op only after first issuance) — `api/webhook.rs:121-194` — security-auditor
- **[P1]** Scoped API keys (`ks_…`) are advertised + issuable but return 403 on every admin endpoint — the `require_admin``require_scope` migration was never done — `api/api_keys.rs:14` — exerciser
- **[P2]** `/v1/purchase` and `/v1/redeem` have no rate limiting (invoice-spam / upstream-provider exhaustion; unthrottled `free_license` code guessing) — `api/mod.rs:343-345` — security-auditor
- **[P2]** Per-IP rate-limit bucket keys on attacker-controllable `X-Forwarded-For`; bypassable if the daemon's shared `0.0.0.0:8080` listener is reachable without the StartOS proxy rewriting XFF (unverified) — `recover.rs:65-70`, `validate.rs:95-111`, `admin.rs:56-67` — security-auditor
- **[P2]** All `422`/`415` responses return naked axum plain-text rejection strings instead of the JSON error envelope — SDK consumers that `JSON.parse()` a 422 throw — every JSON endpoint — exerciser
- **[P2]** Product `slug` has no validation — empty string, 300-char, and SQL-meta characters all stored as valid slugs; empty-slug product pollutes the public catalog — `POST /v1/admin/products` — exerciser
- **[P2]** `GET /v1/admin/products` returns 405 though the served OpenAPI documents it (`products:read`); only `post` is wired — `api/mod.rs:431` — exerciser
- **[P2]** Dependency advisories: `sqlx 0.7.4` (RUSTSEC-2024-0363, fixed ≥0.8.1), `rustls-webpki 0.101.7` (CRL panic / name-constraint bypass, fixed ≥0.103.12); `rsa` reachable only via unused MySQL backend; `paste`/`rustls-pemfile` unmaintained — `cargo audit` — security-auditor
- **[P2]** StartOS submission blocker: `instructions.md` absent (spec says build should fail; `start-cli` built silently anyway) — start9-spec-checker
- **[P2]** StartOS submission blocker: `packageRepo` is a dead link — manifest points to `…/keysat-startos` (404); real repo is `…/keysat``startos/manifest/index.ts` — start9-spec-checker
- **[P2]** StartOS submission blocker: `docsUrls` dead link — points to `/docs/INTEGRATION.md`; file is at `/licensing-service/docs/INTEGRATION.md``startos/manifest/index.ts` — start9-spec-checker
- **[P2]** StartOS submission blocker: manifest declares `aarch64` but only x86_64 is published (Start9 review tests on Pi-class hardware) — `publish.sh` / manifest `arch` — start9-spec-checker (corroborates existing AGENTS.md TODO)
- **[P2]** No CI; `cargo test`/`clippy`/`fmt` all unenforced (56 files differ from `fmt`); the only release gate is `publish.sh` — for a payment daemon the untested suite is the load-bearing weakness behind the P0/P1s — `docs/guides/testing.md` — evaluator
- **[P3]** Dead test `payment_provider_preference_round_trip` inserts into the dropped `btcpay_config`/`zaprite_config` tables — delete it — `tests/api.rs:1224` — evaluator + exerciser
- **[P3]** `/v1/purchase` (400) and `/v1/btcpay/webhook` (503) return different status/error codes for the same "no provider configured" condition — exerciser
- **[P3]** `POST /v1/admin/discount-codes` requires an undocumented `kind` field → opaque 422 — exerciser
- **[P3]** Field-naming inconsistencies across related endpoints: `license_id` vs `id`; machines use `key` vs documented `license_key`; `redeem`/`purchase` use `product` vs `validate`'s `product_slug` — exerciser (Surprises)
- **[P3]** Migration self-heal auto-deletes `_sqlx_migrations` rows on checksum drift, gated only by a hand-maintained allowlist — foot-gun as the list grows — `db/mod.rs` / migration 0009 — evaluator (Surprise) + security-auditor
- **[P3]** Zaprite `validate_webhook` WARN-logs up to 2 KB of raw unauthenticated payload — log-flood / log-injection from a public endpoint — `provider.rs:262-278` — security-auditor
- **[P3]** Outbound webhook delivery POSTs to operator-supplied URLs with no SSRF allow-list (operator-only input; a stolen admin token could pivot to internal hosts) — `webhooks.rs:150-156` — security-auditor
- **[P3]** Stale comment in `startos/versions/v0.2.0.ts:3-4` says "NOT YET WIRED INTO versions/index.ts" but it is wired as `current` at `:53` — evaluator
## Scorecard
The evaluator's lens table, with one adjustment from cross-agent evidence:
| Lens | Score /5 | Note |
|------|----------|------|
| Architecture | 4 | Clean trait + per-profile resolver; legacy `state.payment` singleton coexists and is what tests inject into — the fork behind the test gap. |
| Security | **3** (was 4) | **Adjusted down** by the security-auditor's P0 forgery + P1 missing-amount-check (funds loss). Fundamentals remain 45 (Argon2id, HMAC, injection-clear, FK-enforced); the Zaprite P0 is the pull. Zaprite is optional, which bounds blast radius. |
| Performance | 4 | WAL + 8-conn pool, indexed lookups, bounded workers; no load test run. |
| Testing | 3 | Cross-language LIC1 fixtures are a real strength; the production purchase/settle path is untested — corroborated by the exerciser's inability to reach it and the shipped-but-broken scoped-API-key feature. |
| Code quality | 5 | Disciplined dynamic SQL (all `.bind()`, const-only interpolation); honest comments. Caveat (not score-changing): the exerciser found absent input validation (slug) and plain-text error leaks. |
| Documentation | 5 | Scoped guides + wire-format spec are reference-grade and candid (testing.md names its own known failures). |
## Disagreements & gaps
- **Severity disagreement, resolved:** evaluator rated the Zaprite webhook P2; security-auditor traced it end-to-end and rated it P0. Resolved to **P0** — it is unauthenticated, public, and mints signed licenses (direct funds loss for any operator running the optional Zaprite provider — which includes the master operator).
- **Shared blind spot (all agents):** none could exercise the live BTCPay/Zaprite settlement→issuance path without real provider credentials. The evaluator and exerciser both labeled their payment-path confidence "medium." The single highest-value verification missing is a full purchase cycle against a real BTCPay/Zaprite sandbox — which also exposes the P0/P1 settle-path logic to a real test.
- **Quality tension:** evaluator scored code-quality 5 on SQL discipline; the exerciser found input-validation gaps (slug) and framework error leaks. Both are true at different layers — the score holds, the caveat is noted.
## Suggested order of work
1. **Fix the Zaprite P0 with its test alongside.** On any settle event, re-fetch `get_invoice_status` from the provider and require `pending` local state + matching paid amount/currency before issuing — this closes both the P0 forgery and the P1 missing-amount-check in one change. (The reconciler already does the safe re-fetch; mirror it in the webhook handler.)
2. **Add the provider-injection test seam first** (an always-compiled `Option<Arc<dyn PaymentProvider>>` override on `AppState`, checked in `resolve_provider_for_profile_rail`). This is a prerequisite for trusting any payment-path test result — it greens the two `paid_purchase_*` red tests AND gives step 1 a regression harness. Delete the dead `payment_provider_preference_round_trip` in the same pass.
3. **Decide scoped API keys:** either finish the `require_scope` migration or stop advertising/issuing `ks_…` keys until it lands — a shipped-but-403 feature is worse than an absent one.
4. **Quick-win hardening batch:** rate-limit `/v1/purchase` + `/v1/redeem`; JSON-ify 422/415; add slug validation; wire `GET /v1/admin/products`; bump `sqlx` ≥0.8.1 and `rustls`/`rustls-webpki`.
5. **StartOS submission fixes** (before any registry submission): the two manifest dead links and the aarch64-vs-manifest decision are trivial; add `instructions.md`; confirm with Start9 whether the custom `LicenseRef-Keysat-1.0` satisfies their "source available" bar.
6. **Add a CI gate** (Gitea Actions/Woodpecker running `cargo test` + `clippy`) so the honor-system test gap — the meta-cause behind the shipped `:52` and Zaprite defects — closes permanently.
7. **Then resume feature work:** the product→merchant-profile picker (already scoped at the top of the AGENTS.md roadmap; the write path is the gating piece that makes multi-profile usable end-to-end) and the remaining deferred UIs.