Files
keysat-root/EVALUATION.md
T

87 lines
16 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Evaluation — Keysat — 2026-06-18
Intent: a self-hosted, Bitcoin-native software-licensing service that runs as a StartOS 0.4.x package (Rust/axum/sqlx+SQLite daemon with Ed25519 license signing + a TypeScript StartOS wrapper), with four wire-compatible client SDKs (TS, Rust, Python, Go), a public landing/docs site, and BTCPay Server (required) + Zaprite (optional) Bitcoin payment providers.
Agents run: evaluator, security-auditor, exerciser, doc-auditor, start9-spec-checker. **reviewer skipped** — working tree is clean (no uncommitted diff to review).
## Verdict
Keysat substantially achieves its stated intent and is mature, security-conscious work well above typical solo-project quality on the surfaces that handle money and crypto. The daemon builds clean, the full test suite is green (≈117131 tests across 8 suites), the LIC1 wire format is genuinely implemented byte-identically across the daemon and all four SDKs, and the payment/settle path is built fail-closed with real anti-forgery thinking (webhooks re-fetch authoritative provider status before issuing anything — this defeats Zaprite's unsigned-webhook forgery). No agent found a P0; the security auditor found no P0/P1. The headline risks are all below release-blocking: two shipped developer-experience breakages (the Rust SDK's published examples/README import the wrong crate name so they don't compile, and the cross-language crosscheck harness has a hardcoded machine-specific path), a cluster of P2 hardening/hygiene items (X-Forwarded-For rate-limit bypass, unpatched dependency advisories, admin UI co-located with the public API, runtime-prepared SQL with no compile-time safety net), and the still-missing `prepare.sh` that blocks Start9 community-registry submission. The live `0.2.0:60` deployment is sound; the work below is polish, hardening, and unblocking the registry milestone.
## Cross-referenced findings
- **Rust SDK wrong crate name** — corroborated by two agents. The **exerciser** proved it breaks the build (`cargo build --examples``E0432: unresolved import 'licensing_client'`; crate is `keysat-licensing-client` → module `keysat_licensing_client`). The **doc-auditor** found the same wrong import propagated through `licensing-client-rust/README.md:26,39,56`, `src/lib.rs:20`, both `examples/*.rs`, and the public docs page `keysat-docs/integrate.html:130`. ONE finding, two kinds of evidence (build break + doc propagation) → **P1**.
- **`unlimited_merchant_profiles` entitlement status** — the **doc-auditor** found `docs/guides/licensing-tiers.md:43-45` AND the code comment `src/api/tier.rs:234-237` both say it "still needs adding" to Pro/Patron policies, while AGENTS.md Current state says it's "confirmed live." Internal contradiction that needs a live `GET /v1/products/keysat/policies` check to resolve → **P2**.
- **Health-check / readiness** — the **start9-spec-checker** noted the StartOS health check is port-listening only (flips green when 8080 binds, before migrations finish); the **exerciser** noted the daemon actually exposes a richer `GET /healthz` (and `/health` 404s). Same underlying surface → one **P3**.
- **StartOS actions surface** — the **start9-spec-checker** found 21 registered actions vs the **doc-auditor**'s confirmation that `instructions.md`/README describe only ~4. Doc drift + extra test surface for Start9 review → **P3**.
- **Test-count drift** — evaluator counted ≈131, exerciser ran 117, doc-auditor found the `api` suite at 65 vs the documented 54. All green; the *documented* counts in `docs/guides/testing.md:57` are stale → **P3** (doc only).
## Priority queue
**P0** — none.
- [P1] Rust SDK examples + README + `integrate.html` import `licensing_client` but the crate is `keysat-licensing-client`; examples don't compile — `licensing-client-rust/{README.md:26,39,56, src/lib.rs:20, examples/offline_verify.rs:10, examples/online_validate.rs:9}`, `keysat-docs/integrate.html:130` — exerciser + doc-auditor
- [P1] Cross-language crosscheck harness has a hardcoded dev-machine path `/sessions/hopeful-determined-edison/...`, breaks on every other machine — `tests/crosscheck/run_ts.mjs:9` — exerciser
- [P1] No `prepare.sh` for the clean-Debian first build → blocks Start9 community-registry submission (already tracked in AGENTS.md/ROADMAP; not blocking the live deploy) — repo root of `licensing-service-startos/` — start9-spec-checker
- [P2] `X-Forwarded-For` is trusted verbatim as the rate-limit bucket key on brute-forceable endpoints; rotating XFF defeats login/recover/validate throttles (exploitability hinges on whether the StartOS front proxy overwrites XFF) — `api/auth.rs:137`, `api/recover.rs:65-70`, `api/admin.rs:63-68`, `api/validate.rs:95-98` — security-auditor
- [P2] Dependency advisories with fixes available: `sqlx 0.7.4` (RUSTSEC-2024-0363), `rustls-webpki 0.101.7` (RUSTSEC-2026-0098/0099/0104), wrapper `fast-xml-parser` via start-sdk (GHSA-5wm8-gmm8-39j9 high) — `licensing-service/Cargo.lock`, wrapper `package-lock.json` — security-auditor
- [P2] Admin UI + `/v1/admin/*` share the single `:8080` interface with the public `/v1/validate` + `/buy` surfaces (path-routed only); operator can't network-isolate the admin surface — `startos/interfaces.ts:22-77`, `Dockerfile:95` — security-auditor
- [P2] Admin webhook-endpoint registration accepts arbitrary URLs incl. loopback IPs and `file://` scheme (no allowlist on the `reqwest` worker); admin-gated but `file://` should never be accepted — `webhooks.rs:106` — exerciser
- [P2] `db/repo.rs` (+ `subscriptions.rs`) use runtime-prepared `sqlx::query(&format!(...))` with no compile-time column checking; this class already shipped a prod regression (every paid purchase 500'd on `:52`) — `db/repo.rs:662,718`, `subscriptions.rs:180` — evaluator
- [P2] `rate_buckets` grows unbounded — one row per distinct client IP on a public endpoint, with no reaper (sessions/redemptions have reapers; rate buckets don't) — `rate_limit.rs:63-78`, cf. `main.rs:151,174` — evaluator
- [P2] No CI enforces anything; tree has never been `cargo fmt`'d (56 daemon files differ, 37 wrapper files differ from prettier); green depends on the operator remembering to run the suite before `publish.sh``docs/guides/testing.md:22-30` — evaluator
- [P2] `unlimited_merchant_profiles` live-status contradiction between docs/code ("still needs adding") and AGENTS.md ("confirmed live") — `docs/guides/licensing-tiers.md:43-45`, `src/api/tier.rs:234-237` vs AGENTS.md — doc-auditor
- [P3] Go SDK alone accepts trailing payload bytes; daemon + Rust + TS + Python all reject them — divergence in the "wire-compatible" contract (not a forgery hole) — `licensing-client-go/keysat.go:192-193` — evaluator
- [P3] Fingerprint-bound payload check runs *after* machine-row activation + `machine.activated` webhook dispatch; a bound-mismatch creates a row + fires a webhook before rejecting — `validate.rs:412` vs `:326-345` — evaluator
- [P3] Rate-limit consume is read-then-write (non-atomic TOCTOU); concurrent requests to one bucket can both pass — `rate_limit.rs:31-78` — evaluator
- [P3] Buyer-controlled `req.redirect_url` passed unvalidated to BTCPay as post-purchase `redirectURL` (self-targeted open redirect) — `api/purchase.rs:513` — security-auditor
- [P3] `html_escape` omits `'` (safe only because templates use double-quoted attributes) — `api/buy_page.rs:1536` — security-auditor
- [P3] `audit:read` still folded into the blanket `:read` scope (a read-only key can read the full audit log) — `api/api_keys.rs` (already in AGENTS.md TODOs) — security-auditor
- [P3] Container runs as root; justified by StartOS LXC isolation but a non-root user + chowned volume is stronger — `Dockerfile:70-95` — security-auditor
- [P3] `run_migrations_with_self_heal` deletes `_sqlx_migrations` rows on checksum mismatch to avoid crash-loops — pragmatic but quietly self-modifying; worth a second look — `db/mod.rs:44` — evaluator (Surprise)
- [P3] Health check is port-listening only and flips green before migrations finish; daemon exposes richer `GET /healthz` (and `/health` 404s, could confuse probes) — `startos/main.ts:237` — start9-spec-checker + exerciser
- [P3] 21 registered StartOS actions vs ~4 documented in `instructions.md`/README; extra (legacy) surface Start9 will exercise during review — `startos/actions/`, `instructions.md` — start9-spec-checker + doc-auditor
- [P3] Source-available license (`LicenseRef-Keysat-1.0`, not OSI) may trigger a Start9 policy question; source is public but redistribution is restricted — confirm with submissions@start9.com — start9-spec-checker
- [P3] Universal s9pk publish path never exercised end-to-end ("confirm the registry index lists both arches on first publish") — `docs/guides/startos-packaging.md:63` — start9-spec-checker
- [P3] Built manifest emits `donationUrl: null`; confirm Start9 registry ingest tolerates a null vs an omitted field — start9-spec-checker
- [P3] ROADMAP.md:8 still lists the Zaprite auto-charge silent-lapse as open; fix shipped in `:60``ROADMAP.md:8` vs `src/subscriptions.rs:1408-1428` — doc-auditor
- [P3] BUILDING.md stale: "rust:1.75-slim-bookworm" (now `1.88`) and "Node 20+" (now Node 22) — `BUILDING.md:15,20` vs `Dockerfile`/`startos-packaging.md:14`/`package.json` — doc-auditor
- [P3] Landing-page footer links `docs.keysat.xyz/changelog` but no `changelog.html` exists — `keysat-xyz-landing/index.html:1072` — doc-auditor
- [P3] README action mislabeled "Show credentials"; actual action name is "Show admin API key" — `licensing-service-startos/README.md:160` vs `startos/actions/showCredentials.ts:21` — doc-auditor
- [P3] `PORTING_SDK_TO_NEW_LANGUAGES.md` describes Python/Go as "v0.2 goals" (both shipped) and calls the Rust crate `licensing-client` (renamed) — `PORTING_SDK_TO_NEW_LANGUAGES.md:1-13` — doc-auditor
- [P3] `docs/guides/testing.md:57` test counts stale (54 vs actual 65 in `api` suite) — doc-auditor
- [P3] `ZAPRITE_INTEGRATION_SPEC.md:401` references receipt emails (email plan dropped); add a SUPERSEDED banner — doc-auditor
- [P3] `MASTER_KEYPAIR_PROCEDURE.md:199-210` references a "v0.3 rolling-rotation flow" with no documented timeline; verify still planned or move to ROADMAP — doc-auditor
- [P3] Max int64 price (9.2e18 sats) accepted/stored with no upper bound — exerciser (Surprise)
- [P3] `validate` silently ignores an unknown `product_id` body field (only `product_slug` is checked) — a dev sending `product_id` gets a false `ok:true``api/validate.rs` — exerciser (Surprise)
- [P3] Deprecated `SETTING_ACTIVE_PROVIDER` constant emits a build warning on every compile — evaluator + exerciser
- [P3] CORS `allow *` on all endpoints incl. admin (by design for SDK callers; relies on key auth + TLS) — exerciser
## Scorecard
| Lens | Score /5 | Notes (adjustments from cross-evidence) |
|---|---|---|
| Architecture | 4 | Clean module separation, provider abstraction with a test seam, bounded background workers. 1 for runtime-prepared SQL throughout `db/repo.rs`. (evaluator) |
| Security | **4** | Adjusted down from evaluator's 5. Core crypto/auth/SQL surfaces are genuinely 5-grade (constant-time compares, parameterized SQL, fail-closed webhook re-fetch). But a dedicated auditor surfaced **three independent P2s** (XFF bypass, dependency advisories, admin/public co-location) plus the exerciser's `file://` webhook-registration P2 — more hardening debt than a 5 carries until addressed. |
| Performance | 4 | WAL + 8-conn pool + FK + busy_timeout; reconcile loop bounded. 1 for unbounded `rate_buckets` and the read-then-write limiter. (evaluator) |
| Testing | 4 | ≈117131 tests, 8 suites, all green; crosscheck verifies v1+v2 across all SDKs. No CI enforces it; harness has a portability bug (P1) and Rust examples don't build (P1). (evaluator + exerciser) |
| Code quality | 4 | Comments explain *why*; idempotency where it matters. `cargo fmt` never run; a few very long handlers. (evaluator) |
| Documentation | **4** | Adjusted down from evaluator's 5. Subsystem guides the evaluator read are accurate, but the doc-auditor's broader sweep found real drift: a published SDK README import that breaks compilation, a broken changelog link, stale Node/Rust versions, a contradicted merchant-profile entitlement status, and stale test counts. Solid but not pristine. |
## Disagreements & gaps
- **Security lens (5 vs 4):** the evaluator scored Security 5 from the crypto/auth/SQL code; the security-auditor (whose whole job is the adversarial pass) surfaced three P2 hygiene/hardening items. Resolved by adjusting to 4 with the core surfaces explicitly credited — no factual conflict, just depth of coverage.
- **Documentation lens (5 vs 4):** the evaluator judged the subsystem guides it read (accurate → 5); the doc-auditor swept the SDK READMEs and public sites and found drift the evaluator didn't reach. Adjusted to 4.
- **Shared blind spot — the live payment path:** every runtime-exercising agent stopped at the same wall. The exerciser could not test `POST /v1/purchase`, recover, upgrade, subscription-cancel, or BTCPay connect/disconnect (no live BTCPay/Zaprite); the security-auditor could not confirm whether the StartOS front proxy strips client-supplied XFF (which decides if the P2 rate-limit bypass is remotely reachable); the start9-spec-checker could not run on-box install/backup/restore. The single highest-value next test, named independently by multiple agents: run the `onboarding-harness` Stage 1+2 against a **regtest BTCPay** to exercise purchase → webhook-settle → license-issue end-to-end.
- **No 0.4.x submission docs exist publicly:** the start9-spec-checker could only verify against the 0.3.5.x submission page (the only one Start9 links). Whether `prepare.sh` is still required for the 0.4.x SDK era is unconfirmable from public docs (the 0.4.x `hello-world` template also lacks one) — treated as a blocker pending Start9 confirmation.
## Suggested order of work
1. **Fix the two shipped P1 DX breakages first** (cheap, adopter-facing): correct every `licensing_client``keysat_licensing_client` import across the Rust SDK + `integrate.html`, and make `tests/crosscheck/run_ts.mjs` resolve the TS SDK build relative to its sibling dir instead of the hardcoded `/sessions/...` path. Add a crosscheck case feeding trailing bytes to all four SDKs (also closes the Go P3).
2. **Patch the dependency advisories** (`sqlx ≥0.8.1`, pull `rustls-webpki ≥0.103.13` via the reqwest/rustls bump, update start-sdk for `fast-xml-parser`); re-run `cargo audit`/`npm audit`. Low-risk, mechanical, and removes the only externally-known CVEs.
3. **Resolve the XFF rate-limit bypass** — first confirm whether the StartOS front proxy overwrites `X-Forwarded-For` (this decides reachability); then derive the client IP from the trusted-proxy connection / configured allowlist with a peer-socket fallback rather than raw XFF.
4. **Stand up minimal CI** (one job: `cargo test && cargo clippy && tsc --noEmit`, ideally `cargo fmt --check`). This makes the already-good suite *trustworthy* and is a precondition for relying on the green status in everything below.
5. **Retire the runtime-prepared-SQL risk class** on the money paths — migrate the purchase/settle/resolution queries in `db/repo.rs` + `subscriptions.rs` to compile-checked `sqlx::query!`, and add a `rate_buckets` reaper mirroring the session/redemption reapers.
6. **Clear the doc drift in one pass** — resolve the `unlimited_merchant_profiles` contradiction with a live `GET /v1/products/keysat/policies` (then fix whichever of the guide/code-comment/AGENTS.md is wrong), and fix the ROADMAP/BUILDING/README/PORTING/testing-count/changelog-link items together.
7. **Unblock the registry milestone** — author `prepare.sh` for the clean-Debian build, then do the on-box manual checklist (install → connect BTCPay → backup → uninstall → restore → verify key+licenses survive) and email submissions@start9.com about the source-available license before submitting.