Webhook DLQ — list failed deliveries and manually retry

Closes the silent-loss hole in outbound webhook delivery. The worker
in src/webhooks.rs retries failed deliveries with exponential backoff
up to 10 attempts, then sets next_attempt_at = NULL and walks away.
Before this commit, those "dead-lettered" rows sat in webhook_deliveries
indefinitely with no surface for the operator to discover, inspect, or
recover them. A subscriber that was down for >6h during a
license-issuance burst would silently lose those events.
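
The retry-then-dead-letter behavior described above can be sketched
roughly as follows. This is an illustration only: the 10-attempt cap
comes from this message, but BASE_DELAY_SECS, the doubling schedule, and
the function name are assumptions, not the actual code in src/webhooks.rs.

```rust
/// Attempt cap before a row is dead-lettered (stated in the commit message).
const MAX_ATTEMPTS: u32 = 10;
/// Placeholder base delay in seconds (assumed; the real value is not shown here).
const BASE_DELAY_SECS: u64 = 30;

/// Given the number of attempts made so far (including the one that just
/// failed), returns the next attempt time in Unix seconds, or None once the
/// row should be dead-lettered (i.e. next_attempt_at = NULL).
fn next_attempt_at(attempt_count: u32, now_secs: u64) -> Option<u64> {
    if attempt_count >= MAX_ATTEMPTS {
        return None;
    }
    // Exponential backoff: the delay doubles after each failed attempt.
    let delay = BASE_DELAY_SECS << attempt_count.saturating_sub(1);
    Some(now_secs + delay)
}
```

With these placeholder values the first failure schedules a retry 30s
out, and the tenth failure returns None, which is the state the new
endpoints surface.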

What's new:

- repo::DeliveryStatusFilter — enum with parse() so query strings
  map cleanly to SQL predicates.
- repo::list_deliveries — endpoint_id + status + limit, newest first.
- repo::requeue_delivery — resets attempt_count=0, clears delivered_at
  and last_error, sets next_attempt_at=now. The worker picks it up on
  the next 5s tick.

- src/api/webhook_deliveries.rs — admin module with two handlers:
  - GET /v1/admin/webhook-deliveries?endpoint_id=…&status=…&limit=…
  - POST /v1/admin/webhook-deliveries/:id/retry  (audit-logged as
    webhook_delivery.retry; 404 on missing id)
- Routes registered in src/api/mod.rs alongside the existing
  webhook_endpoints CRUD.

- tests/api.rs gains webhook_dlq_lists_failed_and_retry_requeues:
  seeds three deliveries directly via SQL (one each: delivered,
  pending, dead-lettered), exercises the list filter, runs the retry,
  and asserts that the row moves from failed to pending, that an audit
  row is written, and that a bad id returns 404 and a bad status filter
  returns 400.
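
For readers without the repo open, a status filter of the kind described
above might look like the sketch below. The variant names, the
sql_predicate helper, and the exact column predicates are assumptions
inferred from this message (dead-lettered means next_attempt_at is NULL
and the row was never delivered); the real repo::DeliveryStatusFilter may
differ.

```rust
/// Hypothetical sketch of a query-string status filter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeliveryStatusFilter {
    Delivered,
    Pending,
    Failed,
}

impl DeliveryStatusFilter {
    /// Parses a query-string value; None lets the handler answer 400.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "delivered" => Some(Self::Delivered),
            "pending" => Some(Self::Pending),
            "failed" => Some(Self::Failed),
            _ => None,
        }
    }

    /// Maps the filter to a WHERE-clause fragment (assumed column semantics).
    pub fn sql_predicate(self) -> &'static str {
        match self {
            Self::Delivered => "delivered_at IS NOT NULL",
            Self::Pending => "delivered_at IS NULL AND next_attempt_at IS NOT NULL",
            Self::Failed => "delivered_at IS NULL AND next_attempt_at IS NULL",
        }
    }
}
```

Returning a fixed &'static str per variant keeps user input out of the
SQL text entirely; only endpoint_id and limit would need to be bound as
parameters.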

Worker code is unchanged. The DLQ is operator-actionable infrastructure
on top of the existing retry semantics.
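
The requeue reset can be modeled in memory to show why the worker needs
no changes: after the reset, the row is indistinguishable from a fresh
pending delivery. The struct below is a hypothetical stand-in for a
webhook_deliveries row (field types and the is_dead_lettered helper are
assumptions); the real repo::requeue_delivery operates on the database.

```rust
/// Hypothetical in-memory stand-in for a webhook_deliveries row.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Delivery {
    attempt_count: u32,
    delivered_at: Option<u64>,
    next_attempt_at: Option<u64>,
    last_error: Option<String>,
}

impl Delivery {
    /// Dead-lettered: never delivered and no further attempt scheduled.
    fn is_dead_lettered(&self) -> bool {
        self.delivered_at.is_none() && self.next_attempt_at.is_none()
    }

    /// Mirrors the reset described for requeue_delivery: zero the attempt
    /// counter, clear delivered_at and last_error, schedule for "now".
    fn requeue(&mut self, now_secs: u64) {
        self.attempt_count = 0;
        self.delivered_at = None;
        self.last_error = None;
        self.next_attempt_at = Some(now_secs);
    }
}
```

Because attempt_count goes back to 0, a requeued delivery gets the full
retry budget again rather than a single extra attempt.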

Test count: 23 (9 unit + 4 migration + 10 API), up from 22.
Grant
2026-05-08 09:38:58 -05:00
parent e2b296ce29
commit f9ef1a854c
4 changed files with 387 additions and 0 deletions
src/api/mod.rs (+11)
@@ -70,6 +70,7 @@ pub mod session_layer;
 pub mod tier;
 pub mod validate;
 pub mod webhook;
+pub mod webhook_deliveries;
 pub mod webhook_endpoints;
 
 use crate::btcpay::client::BtcpayClient;
@@ -304,6 +305,16 @@ pub fn router(state: AppState) -> Router {
             "/v1/admin/webhook-endpoints/:id",
             axum::routing::delete(webhook_endpoints::delete),
         )
+        // Webhook delivery history (the dead-letter inspection +
+        // manual-retry surface; see webhook_deliveries.rs for why).
+        .route(
+            "/v1/admin/webhook-deliveries",
+            get(webhook_deliveries::list),
+        )
+        .route(
+            "/v1/admin/webhook-deliveries/:id/retry",
+            post(webhook_deliveries::retry),
+        )
         // Discount / referral codes.
         .route(
             "/v1/admin/discount-codes",