f3fae958ef
Keep the investigation and the judge's decision rigorous and fact-based, but render everything shown to the owner — both debate sides and the rationale — in plain language. ESCALATE now surfaces an explicit For it / Against it / Judge's lean pair.
144 lines
8.7 KiB
Markdown
144 lines
8.7 KiB
Markdown
# Adjudicate — debate low-priority backlog items to a verdict
|
|
|
|
*Substance file per the portability protocol. Vendor wrappers (e.g.
|
|
`adapters/claude/commands/adjudicate.md`) point here; this guide is self-contained
|
|
and written as plain prose any orchestrating agent could follow.*
|
|
|
|
You are running inside one project repo. Low-priority technical/backend items pile up on its
|
|
`ROADMAP.md` that the owner can't easily judge the necessity of — and shouldn't have to spend
|
|
expertise on, precisely *because* they're low priority. Your job is to run a grounded debate
|
|
over each eligible item and reach a verdict, so the owner ratifies decisions instead of
|
|
researching them.
|
|
|
|
**Recommend-only.** You never execute, build, or ship anything here. Your output is verdicts
|
|
and a single batch of ROADMAP edits the owner approves. The most you change is the backlog
|
|
itself.
|
|
|
|
**Autonomy is gated by blast radius, not priority.** A low-priority item can still be
|
|
dangerous (it touches data, auth, money, an external surface, or changes observable app
|
|
behavior). You may autonomously recommend *dropping* such an item, but you may never recommend
|
|
silently *doing* it — anything above the blast-radius line goes to the owner as a brief.
|
|
|
|
**Decide on the facts; present in plain terms.** The investigation and the judge's reasoning
|
|
are rigorous and technical — grounded in what the code actually does, so the recommendation
|
|
rests on real facts. But everything *shown to the owner* — both sides of each debate and the
|
|
verdict's rationale — must be in plain language a non-specialist can act on: what the item
|
|
really is, what doing it gets you, what skipping it costs. Don't assume jargon; when a
|
|
technical fact is load-bearing, explain it in a phrase. The owner is judging trade-offs, not
|
|
reading a tech spec. (The technical detail stays in the agents' analysis and can be surfaced on
|
|
request — it's just not the default presentation.)
|
|
|
|
## Phase 1 — Orient & select (no fan-out yet)
|
|
|
|
1. Read this repo's `ROADMAP.md` and `AGENTS.md` (especially `## Current state`) for context.
|
|
2. **Inbox nudge (don't triage).** Do the session-start inbox-check: if
|
|
`~/Projects/standards/INBOX.md` has unchecked items tagged for this repo, tell the owner
|
|
*"N untriaged inbox items for this repo — run `/triage` first to land them on the ROADMAP,
|
|
or proceed with just what's there."* You operate on ROADMAP only; never read raw inbox items
|
|
into the debate — that is `/triage`'s routing job, and duplicating its rules invites drift.
|
|
3. **Select candidates.** Eligible = parked, low-priority backlog items: P2/P3 where items
|
|
carry an explicit priority; otherwise items that read as nice-to-have / deferred.
|
|
**Exclude:** P0/P1 or clearly-active items, anything already marked done/built, and
|
|
`(new:…)`-style new-repo seeds. If `$ARGUMENTS` names specific items (e.g. a ROADMAP number
|
|
or `P3`), scope to those.
|
|
4. **Confirm the set before spending agents.** Show the owner the list you intend to adjudicate
|
|
(one line each) and let them trim or confirm. A full run is ~4 subagents per item — this gate
|
|
controls cost and catches any item that's more important than its placement suggests.
|
|
|
|
## Phase 2 — Per item: investigate → debate → judge
|
|
|
|
For each confirmed item, run this pipeline (items may run in parallel where your tooling
|
|
allows; within an item the stages are sequential):
|
|
|
|
1. **Investigator** (read-only). Grounds the debate in reality so it isn't two models
|
|
speculating. Reads the actual code and reports: does the problem this item describes actually
|
|
exist, or is it already handled? What would the change touch (files, surfaces)? **Classify
|
|
blast radius:** LOW (reversible, internal, test-covered, no observable behavior change) or
|
|
HIGH (touches data/auth/money/an external surface, or changes observable app behavior). When
|
|
unsure, classify HIGH.
|
|
2. **Build-advocate** and **Drop-advocate** (in parallel). Each receives the item text and the
|
|
investigator's findings and argues one side honestly, citing the findings — not speculation.
|
|
Reason from the technical facts, but **write the case for a non-specialist**: lead with the
|
|
practical stakes and translate any jargon the argument depends on.
|
|
- *Build-advocate*: the concrete benefit, the cost or risk of leaving it undone, who or what
|
|
it helps.
|
|
- *Drop-advocate*: YAGNI, added complexity and maintenance, opportunity cost, whether it's
|
|
bells-and-whistles for its own sake.
|
|
3. **Judge.** Receives the item, the investigator's findings (incl. blast radius), and both
|
|
briefs. Decides against the rubric = `how-i-work.md` + this repo's `AGENTS.md`. **Bias to
|
|
DROP on a tie or low confidence** — these items are already low-priority, so death is the
|
|
default unless a clear case is made. Decide on the technical merits, but **write the
|
|
rationale it emits in plain terms** (per the principle above). Emits a structured verdict
|
|
(next section).
|
|
|
|
## The three verdicts
|
|
|
|
- **DROP** — not worth doing. The only autonomously-applied call. (Still ratified in one batch
|
|
by the owner per Phase 4 — "autonomous" means the owner needn't understand the tech, not that
|
|
files change unseen.)
|
|
- **DO** — worth doing **and** blast radius LOW. Annotate the ROADMAP item with the decision and
|
|
a short ready-to-act plan; surface it for the owner's go-ahead to schedule. You do **not**
|
|
execute it (recommend-only).
|
|
- **ESCALATE** — worth doing **but** blast radius HIGH, **or** the judge's confidence is low,
|
|
**or** the item is really an epic that should be split first. Produce a balanced brief: the
|
|
build case, the drop case, the judge's lean, and why it's above the line. This is the owner's
|
|
real judgment call — made cheap because they're ratifying reasoning, not generating it.
|
|
|
|
## Phase 3 — Report (inline, no file written)
|
|
|
|
Show the owner one report. **Write every line in plain terms** (per the principle above) — no
|
|
unexplained jargon, and never let a raw file path or code symbol stand in for the explanation;
|
|
say what it means in practice. No new tracked artifact — the ROADMAP diff and the commit message
|
|
are the durable record (same convention as `/triage`).
|
|
|
|
```
|
|
# Adjudication — <repo> — <date>
|
|
Adjudicated N of M eligible items.
|
|
|
|
## DROP — not worth doing (remove on your OK)
|
|
- <item, in plain words> — why it's not worth it, in one plain sentence (judge confidence)
|
|
|
|
## DO — worth doing and low-risk (your go-ahead to schedule)
|
|
- <item, in plain words> — what you'd gain, in one plain sentence + the ready plan
|
|
|
|
## ESCALATE — your call (touches something that matters)
|
|
- <item, in plain words>
|
|
For it: <the strongest plain-language case to do it>
|
|
Against it: <the strongest plain-language case to skip it>
|
|
Judge's lean: <which way, and why, in plain terms>
|
|
Why it's yours: <what makes it consequential — e.g. changes app behavior, touches data/money>
|
|
```
|
|
|
|
The plain-language "For it / Against it" pair is the heart of an ESCALATE — it's the easy-to-read
|
|
two sides the owner weighs. Keep each to a few plain sentences.
|
|
|
|
## Phase 4 — Approve, apply, commit
|
|
|
|
1. **One approval gate.** Wait for the owner to confirm the batch. Never edit `ROADMAP.md`
|
|
before they approve — it's a durable file (same rule as `/triage`).
|
|
2. **Apply** the approved changes to `ROADMAP.md`: delete DROP items outright (git history is
|
|
the record — don't leave tombstones); annotate DO items with the decision + plan; annotate
|
|
ESCALATE items with the judge's lean so the brief isn't lost.
|
|
3. **Commit.** Present the proposed message and wait for confirmation (one approval covers
|
|
commit + push, per `how-i-work.md`). The message records the verdicts and the why for each
|
|
drop — that *is* the audit trail. No AI-attribution trailer.
|
|
4. **Report** what was dropped, what's queued as DO, and what's waiting on the owner as
|
|
ESCALATE.
|
|
|
|
## Rules
|
|
|
|
- Recommend-only. Never execute, build, or ship — your single write is the ROADMAP edit, after
|
|
approval.
|
|
- Never auto-recommend *doing* a HIGH-blast-radius item; route it to ESCALATE. When blast radius
|
|
is unclear, treat it as HIGH.
|
|
- Ground every argument in the investigator's findings. If the investigator can't read the code
|
|
or the item is too vague to investigate, say so and ESCALATE it rather than debating in a
|
|
vacuum.
|
|
- Present in plain terms. The report and both sides of every debate must read for a
|
|
non-specialist; the technical rigor lives in the decision, not the prose shown to the owner.
|
|
- Don't read raw inbox items into the debate — nudge the owner to `/triage` first. ROADMAP is
|
|
the only input.
|
|
- Preserve the owner's judgment as the gate: propose verdicts, apply only on approval, and
|
|
surface anything consequential rather than deciding it.
|
|
- If blocked at any point, report exactly what blocked you — never fabricate a verdict.
|