Add /adjudicate command: debate low-priority backlog to a verdict

Parked P2/P3 technical items accumulate faster than I can judge their
necessity. /adjudicate runs a grounded per-item debate (investigator →
build/drop advocates → judge) over a repo's ROADMAP and routes each to
DROP / DO / ESCALATE, so I ratify decisions instead of researching them.

Recommend-only in v1; verdict autonomy is gated by blast radius, not
priority. ROADMAP-only input — nudges /triage rather than reading the
raw inbox.
Keysat
2026-06-17 22:42:32 -05:00
parent 23b83f5a4c
commit 46298e047f
4 changed files with 184 additions and 4 deletions
+14 -4
@@ -17,7 +17,8 @@ The global layer lives here and is wired into `~/.claude` by **directory symlink
file added under `adapters/` is live immediately — no per-file linking:
- `~/.claude/commands` → `adapters/claude/commands/` — global slash commands (`/retrofit`,
-`/handoff`, `/full-eval`, `/capture`, `/triage`, `/roundup`, `/new-project`, `/design`).
+`/handoff`, `/full-eval`, `/capture`, `/triage`, `/roundup`, `/new-project`, `/design`,
+`/adjudicate`).
- `~/.claude/agents` → `adapters/claude/agents/` — global subagents (reviewer, evaluator,
security-auditor, doc-auditor, exerciser, researcher, janitor, portability-checker,
start9-spec-checker, design-checker, onboarding-tester).
@@ -88,9 +89,18 @@ should carry this so any vendor's agent surfaces pending items at session start:
## Current state
- **Fleet built and live** — commands `/capture /triage /roundup /new-project /handoff /retrofit
-/full-eval /design`; subagents incl. `design-checker` + `onboarding-tester` (substance in
-`guides/`, thin wrappers in `adapters/claude/`, symlinked into `~/.claude`). Dogfoods its own
-standard. Latest `/roundup`: `STATUS.md` 2026-06-16.
+/full-eval /design /adjudicate`; subagents incl. `design-checker` + `onboarding-tester`
+(substance in `guides/`, thin wrappers in `adapters/claude/`, symlinked into `~/.claude`).
+Dogfoods its own standard. Latest `/roundup`: `STATUS.md` 2026-06-16.
+- **`/adjudicate` built this session (ROADMAP item 10), live.** Debates each parked P2/P3 backlog
+item on a repo's ROADMAP to a verdict so the owner ratifies instead of researching: per item,
+investigator (grounds it in the code + classifies blast radius) → build- ∥ drop-advocate → judge
+(rubric = `how-i-work.md` + repo `AGENTS.md`, biased to DROP on ties). Verdicts: **DROP** (auto,
+ratified in one batch), **DO** (low blast radius → annotated plan, recommend-only), **ESCALATE**
+(HIGH blast radius / low confidence → balanced brief for the owner). Autonomy gated by blast
+radius, not priority; ROADMAP-only (nudges `/triage` first, never reads raw inbox). **v1 is
+recommend-only** — never executes; v2 (narrow auto-execution of the safe DO class) deferred until
+trust is built.
- **`onboarding-tester` built this session (ROADMAP item 9), live.** Docs-only adopter agent: walks
a product's published docs as a literal newcomer (never reading source), reports doc gaps, and on
a fully clean run emits a publishable "all it took was X, Y, Z" walkthrough. First target: keysat
+31
@@ -214,3 +214,34 @@ Qdrant, and it hosts matrix-bridge); **don't hardcode a model — query the Spar
gateway** for the live one (daily driver Qwen3.6, hot-swappable); networking reduced to LAN /
WireGuard / StartTunnel (Proton VPN + Tor were legacy, dropped). UNVERIFIED banner replaced
with a "verified 2026-06-15" note; decision steps 4 and 6 aligned. Commit `ee5c8bb`.
## 10. `adjudicate` — debate low-priority backlog items to a verdict ✅ BUILT (2026-06-17)
Built and live: `guides/adjudicate.md` + `adapters/claude/commands/adjudicate.md` (the
`/adjudicate` command). Solves backlog clutter the owner can't easily judge: low-priority
(P2/P3) technical/backend items that may be necessary or may be bells-and-whistles, and that
he shouldn't spend expertise on *because* they're low priority. Run inside a repo, it
adjudicates that repo's ROADMAP items via a grounded debate and routes each to a verdict the
owner ratifies instead of researching.
- **Pipeline (per item):** investigator (read-only — does the problem exist? already handled?
what would it touch? + blast-radius classification) → build-advocate ∥ drop-advocate (argue
from the investigator's findings, not speculation) → judge (rubric = `how-i-work.md` + repo
`AGENTS.md`; **biased to DROP on ties / low confidence**, since these are already low-priority).
- **Three verdicts:** **DROP** (the only autonomously-applied call — ratified in one batch, owner
needn't understand the tech), **DO** (worth it + LOW blast radius → annotated with a ready plan,
recommend-only, not executed), **ESCALATE** (worth it but HIGH blast radius / low confidence /
an epic → balanced brief for the owner's call).
- **Autonomy is gated by blast radius, not priority** — HIGH = touches data/auth/money/external
surface or changes observable behavior (unclear ⇒ HIGH). It may auto-recommend *dropping* a HIGH
item but never *doing* one.
- **ROADMAP-only input.** Nudges the owner to `/triage` first if untriaged inbox items exist for
the repo, but never reads raw inbox items into the debate (that's `/triage`'s routing job —
duplicating it invites drift). Two gates: confirm the item set before fan-out (cost control),
then approve the batch of ROADMAP edits. The ROADMAP diff + commit message is the audit trail
(no separate report file).
**Remaining options:** (a) **v2 — narrow auto-execution** of the safe "DO + LOW blast radius +
reversible + test-covered" class, once the owner has watched it make calls and trusts the verdicts
(deliberately deferred — recommend-only first to build trust); (b) a thin `/triage`-then-`/adjudicate`
combo if the two-command chaining friction proves real (YAGNI for now).
+20
@@ -0,0 +1,20 @@
---
description: Debate each low-priority (P2/P3) backlog item on this repo's ROADMAP to a DROP/DO/ESCALATE verdict — recommend-only, applied on your approval
argument-hint: [optional scope, e.g. a ROADMAP item number or "P3"]
---
Adjudicate the low-priority technical backlog of the repository in the current working
directory. Scope, if any: $ARGUMENTS
Your complete orchestration guide — phases, the per-item investigate→debate→judge pipeline,
the three verdicts (DROP / DO / ESCALATE), and the report + approval flow — is at:
~/Projects/standards/guides/adjudicate.md
Read it in full first, then follow it exactly. If you cannot read that file, stop and report
precisely that — do not improvise the adjudication.
Claude Code specifics for Phase 2: per item, launch the investigator first, then the build- and
drop-advocates as a single parallel batch, then the judge; run items concurrently in batches to
keep the fan-out manageable. These are read-only role agents — the only write is the ROADMAP
edit in Phase 4, after the owner approves.
+119
@@ -0,0 +1,119 @@
# Adjudicate — debate low-priority backlog items to a verdict
*Substance file per the portability protocol. Vendor wrappers (e.g.
`adapters/claude/commands/adjudicate.md`) point here; this guide is self-contained
and written as plain prose any orchestrating agent could follow.*
You are running inside one project repo. Low-priority technical/backend items pile up on its
`ROADMAP.md`, and the owner can't easily judge whether they are necessary. Nor should the owner
have to spend expertise on them, precisely *because* they're low priority. Your job is to run a
grounded debate over each eligible item and reach a verdict, so the owner ratifies decisions
instead of researching them.
**Recommend-only.** You never execute, build, or ship anything here. Your output is verdicts
and a single batch of ROADMAP edits the owner approves. The most you change is the backlog
itself.
**Autonomy is gated by blast radius, not priority.** A low-priority item can still be
dangerous (it touches data, auth, money, an external surface, or changes observable app
behavior). You may autonomously recommend *dropping* such an item, but you may never give it a
DO verdict: anything above the blast-radius line goes to the owner as an ESCALATE brief.
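The blast-radius line can be sketched as a small classifier. This is an illustration only, assuming the investigator reports its findings as a record: the `Findings` type, its field names, and the surface labels are hypothetical and not part of this guide. The LOW conditions mirror the ones the investigator step uses in Phase 2, and anything unconfirmed falls to HIGH.

```python
from dataclasses import dataclass
from typing import Optional

# Surfaces the guide treats as dangerous to touch (labels are hypothetical).
SENSITIVE_SURFACES = {"data", "auth", "money", "external"}


@dataclass
class Findings:
    """Hypothetical investigator output; None means 'could not determine'."""
    touches: Optional[set] = None
    changes_observable_behavior: Optional[bool] = None
    reversible: Optional[bool] = None
    test_covered: Optional[bool] = None


def classify_blast_radius(f: Findings) -> str:
    """Return "LOW" only when every LOW condition is positively confirmed."""
    if f.touches is None or f.touches & SENSITIVE_SURFACES:
        return "HIGH"  # unknown footprint, or a sensitive surface is touched
    if f.changes_observable_behavior is not False:
        return "HIGH"  # behavior changes, or we could not rule that out
    if f.reversible is not True or f.test_covered is not True:
        return "HIGH"  # unconfirmed reversibility or coverage counts as HIGH
    return "LOW"
```

Defaulting every field to `None` makes "unsure" the starting state, so an investigator that reports nothing yields HIGH rather than LOW.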
## Phase 1 — Orient & select (no fan-out yet)
1. Read this repo's `ROADMAP.md` and `AGENTS.md` (especially `## Current state`) for context.
2. **Inbox nudge (don't triage).** Do the session-start inbox-check: if
`~/Projects/standards/INBOX.md` has unchecked items tagged for this repo, tell the owner
*"N untriaged inbox items for this repo — run `/triage` first to land them on the ROADMAP,
or proceed with just what's there."* You operate on ROADMAP only; never read raw inbox items
into the debate — that is `/triage`'s routing job, and duplicating its rules invites drift.
3. **Select candidates.** Eligible = parked, low-priority backlog items: P2/P3 where items
carry an explicit priority; otherwise items that read as nice-to-have / deferred.
**Exclude:** P0/P1 or clearly-active items, anything already marked done/built, and
`(new:…)`-style new-repo seeds. If `$ARGUMENTS` names specific items (e.g. a ROADMAP number
or `P3`), scope to those.
4. **Confirm the set before spending agents.** Show the owner the list you intend to adjudicate
(one line each) and let them trim or confirm. A full run is ~4 subagents per item — this gate
controls cost and catches any item that's more important than its placement suggests.
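The selection rule in step 3 could look roughly like the sketch below. It assumes ROADMAP items have already been parsed into records; the `RoadmapItem` type and all of its fields are hypothetical, and `reads_as_deferred` stands in for the judgment call the guide leaves to you when an item carries no explicit priority. Only the two scope forms named above (an item number or a priority) are handled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RoadmapItem:
    """Hypothetical parsed ROADMAP entry."""
    number: int
    title: str
    priority: Optional[str] = None    # "P0".."P3", or None if unlabelled
    done: bool = False                # already marked done/built
    active: bool = False              # clearly being worked on
    reads_as_deferred: bool = False   # nice-to-have / deferred wording


def is_eligible(item: RoadmapItem) -> bool:
    # Exclusions first: finished work, active work, new-repo seeds.
    if item.done or item.active or "(new:" in item.title:
        return False
    if item.priority is not None:
        return item.priority in {"P2", "P3"}
    return item.reads_as_deferred


def select_candidates(items: List[RoadmapItem],
                      scope: Optional[str] = None) -> List[RoadmapItem]:
    eligible = [i for i in items if is_eligible(i)]
    if not scope:
        return eligible
    scope = scope.strip()
    if scope.upper() in {"P2", "P3"}:
        return [i for i in eligible if i.priority == scope.upper()]
    if scope.isdigit():
        return [i for i in eligible if i.number == int(scope)]
    raise ValueError(f"unrecognised scope: {scope!r}")
```

The result is still only a proposal: step 4 shows it to the owner before any subagent is launched.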
## Phase 2 — Per item: investigate → debate → judge
For each confirmed item, run this pipeline (items may run in parallel where your tooling
allows; within an item the stages are sequential):
1. **Investigator** (read-only). Grounds the debate in reality so it isn't two models
speculating. Reads the actual code and reports: does the problem this item describes actually
exist, or is it already handled? What would the change touch (files, surfaces)? **Classify
blast radius:** LOW (reversible, internal, test-covered, no observable behavior change) or
HIGH (touches data/auth/money/an external surface, or changes observable app behavior). When
unsure, classify HIGH.
2. **Build-advocate** and **Drop-advocate** (in parallel). Each receives the item text and the
investigator's findings and argues one side honestly, citing the findings — not speculation:
- *Build-advocate*: the concrete benefit, the cost or risk of leaving it undone, who or what
it helps.
- *Drop-advocate*: YAGNI, added complexity and maintenance, opportunity cost, whether it's
bells-and-whistles for its own sake.
3. **Judge.** Receives the item, the investigator's findings (incl. blast radius), and both
briefs. Decides against the rubric = `how-i-work.md` + this repo's `AGENTS.md`. **Bias to
DROP on a tie or low confidence** — these items are already low-priority, so death is the
default unless a clear case is made. Emits a structured verdict (next section).
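The ordering constraints above (investigator first, advocates in parallel, judge last, items batched) can be sketched with `asyncio`. This is a minimal illustration, not the orchestration itself: the four role callables are hypothetical stand-ins for however your tooling launches a subagent, and `batch_size` is an assumed knob for keeping fan-out manageable.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence, Tuple

# Hypothetical role signatures; each one wraps a read-only subagent call.
Roles = Tuple[
    Callable[[Any], Awaitable[Any]],                 # investigate(item)
    Callable[[Any, Any], Awaitable[Any]],            # argue_build(item, findings)
    Callable[[Any, Any], Awaitable[Any]],            # argue_drop(item, findings)
    Callable[[Any, Any, Any, Any], Awaitable[Any]],  # judge(item, findings, build, drop)
]


async def adjudicate_item(item: Any, roles: Roles) -> Any:
    investigate, argue_build, argue_drop, judge = roles
    # Stage 1: ground the debate before anyone argues.
    findings = await investigate(item)
    # Stage 2: both advocates see the same findings and run concurrently.
    build_brief, drop_brief = await asyncio.gather(
        argue_build(item, findings),
        argue_drop(item, findings),
    )
    # Stage 3: the judge sees everything.
    return await judge(item, findings, build_brief, drop_brief)


async def adjudicate_all(items: Sequence[Any], roles: Roles,
                         batch_size: int = 3) -> List[Any]:
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    verdicts: List[Any] = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        # Items within a batch run concurrently; batches run one after another.
        verdicts.extend(await asyncio.gather(
            *(adjudicate_item(item, roles) for item in batch)
        ))
    return verdicts
```

Each item costs four role calls here, which matches the "~4 subagents per item" estimate in Phase 1.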
## The three verdicts
- **DROP** — not worth doing. The only autonomously-applied call. (Still ratified in one batch
by the owner per Phase 4 — "autonomous" means the owner needn't understand the tech, not that
files change unseen.)
- **DO** — worth doing **and** blast radius LOW. Annotate the ROADMAP item with the decision and
a short ready-to-act plan; surface it for the owner's go-ahead to schedule. You do **not**
execute it (recommend-only).
- **ESCALATE** — worth doing **but** blast radius HIGH, **or** the judge's confidence is low,
**or** the item is really an epic that should be split first. Produce a balanced brief: the
build case, the drop case, the judge's lean, and why it's above the line. This is the owner's
real judgment call — made cheap because they're ratifying reasoning, not generating it.
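As a rough sketch, the routing between the three verdicts reduces to a few checks. The function and parameter names are hypothetical. One reading is assumed and worth flagging: the judge's bias to DROP on a tie is expressed as `worth_doing=False`, while an item the judge leans toward building but with low confidence goes to ESCALATE.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    DROP = "DROP"
    DO = "DO"
    ESCALATE = "ESCALATE"


def route(worth_doing: bool, blast_radius: Optional[str],
          confident: bool = True, is_epic: bool = False) -> Verdict:
    """Map a judge's conclusion to a verdict (illustrative, not normative)."""
    if not worth_doing:
        # Dropping is allowed at any blast radius.
        return Verdict.DROP
    if blast_radius != "LOW":
        # HIGH, missing, or unrecognised all count as above the line.
        return Verdict.ESCALATE
    if not confident or is_epic:
        return Verdict.ESCALATE
    return Verdict.DO
```

Note that DO is reachable only on the last line: every doubt along the way lands on ESCALATE, never on DO.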
## Phase 3 — Report (inline, no file written)
Show the owner one report. No new tracked artifact — the ROADMAP diff and the commit message
are the durable record (same convention as `/triage`).
```
# Adjudication — <repo> — <date>
Adjudicated N of M eligible items.
## DROP (ratify to remove)
- <item> — one-line why-not + judge confidence
## DO (low blast radius — your go-ahead to schedule)
- <item> — one-line why + the ready plan
## ESCALATE (your call — balanced brief)
- <item> — build case / drop case / judge's lean / why it crosses the line
```
## Phase 4 — Approve, apply, commit
1. **One approval gate.** Wait for the owner to confirm the batch. Never edit `ROADMAP.md`
before they approve — it's a durable file (same rule as `/triage`).
2. **Apply** the approved changes to `ROADMAP.md`: delete DROP items outright (git history is
the record — don't leave tombstones); annotate DO items with the decision + plan; annotate
ESCALATE items with the judge's lean so the brief isn't lost.
3. **Commit.** Present the proposed message and wait for confirmation (one approval covers
commit + push, per `how-i-work.md`). The message records the verdicts and the why for each
drop — that *is* the audit trail. No AI-attribution trailer.
4. **Report** what was dropped, what's queued as DO, and what's waiting on the owner as
ESCALATE.
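The apply step might be sketched as below, assuming the ROADMAP has been reduced to a mapping from item title to item text. That representation, the function name, and the annotation wording are all hypothetical; this guide does not prescribe an annotation format. The point of the sketch is the gate: nothing changes without approval, DROP items vanish, and DO and ESCALATE items keep their text plus a note.

```python
from typing import Dict, Tuple


def apply_verdicts(roadmap: Dict[str, str],
                   verdicts: Dict[str, Tuple[str, str]],
                   approved: bool) -> Dict[str, str]:
    """Return an updated copy of the roadmap; the input is never mutated.

    roadmap:  item title -> item text
    verdicts: item title -> (verdict, note), where note is the ready plan
              for DO and the judge's lean for ESCALATE
    """
    if not approved:
        raise PermissionError("owner has not approved the batch")
    updated: Dict[str, str] = {}
    for title, text in roadmap.items():
        verdict, note = verdicts.get(title, ("", ""))
        if verdict == "DROP":
            continue  # deleted outright; git history is the record
        if verdict == "DO":
            updated[title] = f"{text}\n  Adjudicated DO. Plan: {note}"
        elif verdict == "ESCALATE":
            updated[title] = f"{text}\n  Adjudicated ESCALATE. Judge's lean: {note}"
        else:
            updated[title] = text  # not adjudicated this run; left untouched
    return updated
```

Returning a copy rather than editing in place keeps the original available for the diff you show the owner alongside the commit message.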
## Rules
- Recommend-only. Never execute, build, or ship — your single write is the ROADMAP edit, after
approval.
- Never auto-recommend *doing* a HIGH-blast-radius item; route it to ESCALATE. When blast radius
is unclear, treat it as HIGH.
- Ground every argument in the investigator's findings. If the investigator can't read the code
or the item is too vague to investigate, say so and ESCALATE it rather than debating in a
vacuum.
- Don't read raw inbox items into the debate — nudge the owner to `/triage` first. ROADMAP is
the only input.
- Preserve the owner's judgment as the gate: propose verdicts, apply only on approval, and
surface anything consequential rather than deciding it.
- If blocked at any point, report exactly what blocked you — never fabricate a verdict.