Compare commits

..

77 Commits

Author SHA1 Message Date
Keysat 04a190b8f2 Capture: what are the open brackets when you log an inbox item throug (via matrix) 2026-06-19 17:10:02 -05:00
Keysat c61d58632d Capture: brainstorm better tracking of cardio logging and cardio prog (via matrix) 2026-06-19 17:08:54 -05:00
Keysat d901424f2d docs: fix drift in standards docs
- portability.md: add STATUS.md to layout tree
- subagents-handbook.md: add design-checker + onboarding-tester to roster;
  add reviewer to /full-eval enumeration
- retrofit-playbook.md: /mobile → /remote-control for session pairing
2026-06-19 13:31:17 -05:00
Keysat 1a702387e7 design guide: field notes from proof-of-work Case-B extract
Two process learnings: render hue/shade reconcile candidates as pixels
(Chrome headless screenshot of an in-context vignette) rather than relying
on hex/monospace previews; and how to absorb one owner-driven accent in a
document-as-is extract (single canonical hue, distinguish overlap by
treatment not a second color, surface the collision before picking).
2026-06-19 12:15:11 -05:00
Keysat ce1327aab1 Refresh Current state: first full /design round-trip done (ten31-database) 2026-06-19 11:47:23 -05:00
Keysat 176eebe64b design guide: field notes from first live cloud round-trip
Three process learnings from the 2026-06-19 Ten31 mobile round-trip:
- Output read via the DesignSync MCP is a multi-file app on Claude Design's
  own template runtime using CSS custom properties (incl. a dark/light
  switch) — qualifies the seed note about "inline hardcoded CSS" (that's the
  export-bundle path, not the MCP project-read path). Read the shell first,
  then each sub-app's --var block + DCLogic for derived-field formulas.
- DesignSync reads into context only (no bulk download / no screenshots):
  byte-capture the representative source + a manifest README; use
  `jq -r .content` on the harness-persisted result for large files.
- A redesign can return more scope than the brief asked for (a light theme);
  record it but write scope-expanding additions as proposed-not-canonical
  pending an explicit owner decision, surfaced as an A/B reconcile question.
2026-06-19 11:25:34 -05:00
Keysat b4042985fb Capture spark-control CRM/intake-bot dashboard card (from ten31 session) 2026-06-19 08:57:23 -05:00
Keysat 7980545c99 Roundup snapshot — 2026-06-18 2026-06-18 13:13:58 -05:00
Keysat 45004c2a9b Capture: we should redesign the software logo/icon (used for startos (via matrix) 2026-06-18 11:25:20 -05:00
Keysat 0d4b238852 Capture: ability to reorder entitlements catalog on edit products vie (via matrix) 2026-06-18 10:46:26 -05:00
Keysat 1fce86a2d6 Capture: screen refresh should preserve viewing the same tab you were (via matrix) 2026-06-18 10:30:27 -05:00
Keysat 1344a354c8 Capture: backup history in settings tab should be minimized and expan (via matrix) 2026-06-18 09:46:00 -05:00
Keysat 6a1fc6cd08 how-i-work: add YAGNI/no-abstraction, native-platform, and shortcut-ceiling rules
Lifted three sharpened principles from a review of the ponytail ruleset
into "When proposing changes".
2026-06-18 08:56:17 -05:00
Keysat f3fae958ef adjudicate: present verdicts and both sides in plain terms
Keep the investigation and the judge's decision rigorous and
fact-based, but render everything shown to the owner — both debate
sides and the rationale — in plain language. ESCALATE now surfaces an
explicit For it / Against it / Judge's lean pair.
2026-06-18 06:39:58 -05:00
Keysat 637ac3e7c2 handoff: prune Current state to lean session status
Drop the finished build narratives (durable record lives in ROADMAP
items 8/9/10), fix the stale "built this session" on onboarding-tester,
and surface the /adjudicate first-run drop-bias calibration as the top
next step.
2026-06-17 22:50:40 -05:00
Keysat 46298e047f Add /adjudicate command: debate low-priority backlog to a verdict
Parked P2/P3 technical items accumulate faster than I can judge their
necessity. /adjudicate runs a grounded per-item debate (investigator →
build/drop advocates → judge) over a repo's ROADMAP and routes each to
DROP / DO / ESCALATE, so I ratify decisions instead of researching them.

Recommend-only in v1; verdict autonomy is gated by blast radius, not
priority. ROADMAP-only input — nudges /triage rather than reading the
raw inbox.
2026-06-17 22:42:32 -05:00
Keysat 23b83f5a4c Triage: drain ten31-transcripts meeting-name item (built directly) 2026-06-17 21:57:47 -05:00
Keysat 3d1258e048 Capture: when a meeting recording is finished, the app should prompt (via matrix) 2026-06-17 10:13:36 -05:00
Keysat 6f486c4475 Mark recap Case-B design retrofit done; log cleanup-execution learnings
- Flip recap's Case-B /design backfill from "in flight" -> done in
  ROADMAP/AGENTS: contract extracted + two-phase conformance cleanup
  shipped (recap app 0.2.161).
- Add three cleanup-execution Field notes to guides/design.md:
  CSS-value-position var-ify scoping, exclude standalone no-:root
  exports, border-radius-clamp capsule snapping.
- Note the standalone-export literal-hex exception in design-checker.md's
  Color dimension.
2026-06-17 09:42:28 -05:00
Keysat e2377f4c8c handoff: suggest a next-session opener in the final report
Pair the existing /compact keep-alive line with a paste-able opener for a
fresh context window. Constrained to a pointer into AGENTS.md Current state,
not a state payload, so it stays safe to lose.
2026-06-17 08:14:01 -05:00
Keysat 9031281cd4 design guide: add Case-B extract field notes from first live run
First extract->reconcile run (recap) surfaced generalizable process
learnings: harvest the inventory with grep frequency tables in the main
thread (counts are the reconcile evidence), disambiguate near-duplicates
with frequency plus an external anchor, present conflicts as
recommended-first A/B/C forks with value previews, and treat the code
itself as the inspiration for a document-as-is extract (skip BRIEF and
_imports, write provenance). Also refine the Extract phase to enumerate
every styling surface and read the brand mark before the CSS.
2026-06-16 23:14:22 -05:00
Keysat 8b0799736c Pre-stage recap Case B design run as in flight 2026-06-16 22:49:37 -05:00
Keysat 7584ed04bb Drain triaged ten31-database items from inbox 2026-06-16 22:24:34 -05:00
Keysat b5a18c885e Triage recap inbox items
Drop two recap items (daily-digest SMTP shipped in 0.2.158; mobile
scroll-to-top already tracked in recap Current state). Retag the
"run janitor on all projects" item from (?) to (standards).
2026-06-16 21:43:42 -05:00
Keysat 315e13c318 Triage ten31-transcripts inbox items
Remove the mini-retrofit chore (done: .claude scaffolding, .gitignore, inbox-check line) and the Jitsi feature (routed to the project ROADMAP).
2026-06-16 21:40:44 -05:00
Keysat 0cd1310915 Fix ROADMAP.md casing in subagents-handbook retrofit ritual 2026-06-16 21:31:53 -05:00
Keysat f3d5c7a16d Trim Current state to current snapshot post-onboarding-tester 2026-06-16 21:31:53 -05:00
Keysat 950cf48c06 Stage onboarding-tester Path 2 on agent-delegable regtest payments
Revise ROADMAP item 9 and the INBOX Path-2 item: split the harness into
Stage 1 (Path 1, no payments, unblocked) and Stage 2 (Path 2, full
buyer-pays on regtest), gated on keysat's greenlit payment_providers:write
scope + network gate + sandbox flag. Drops the earlier 'operator
pre-connects' framing now that scoped agent connect on regtest is the plan.
2026-06-16 20:42:08 -05:00
Keysat 40f4af4191 Add onboarding-tester agent (docs-only fresh-adopter)
Global fleet agent that walks a product's published docs as a literal
newcomer, never reading its source, reports every doc gap, and emits a
publishable walkthrough only on a fully clean run. First instantiation:
keysat SDK integration. Register in fleet list, log ROADMAP item 9 and a
Current-state note, resolve the originating inbox item, and capture two
follow-ups (Path 2 payment demo, operator-onboarding agent).
2026-06-16 19:16:41 -05:00
Keysat 2111bfbaf6 Capture: error message in email capture tab on email sync status (via matrix) 2026-06-16 18:34:36 -05:00
Keysat 9c91d699d0 Roundup snapshot — 2026-06-16 2026-06-16 17:02:16 -05:00
Keysat 209dfa8f90 Capture: personal website on Start9 Pages + StartTunnel, Claude Design 2026-06-16 16:24:20 -05:00
Keysat 8f3e016528 Capture: add gemini 3.5 to model selection, need to have research age (via matrix) 2026-06-16 16:19:54 -05:00
Keysat 897e3e1ac1 Capture: add gemini 3.5 to model selection, need to have research age (via matrix) 2026-06-16 16:19:47 -05:00
Keysat d9653d742d Capture: review website for any drift/inconsistencies (doc-auditor), (via matrix) 2026-06-16 16:18:06 -05:00
Keysat 913f541c39 Capture: run spec-checker agent for listing to start9 community regis (via matrix) 2026-06-16 16:15:32 -05:00
Keysat 22cf9f776d Capture: Adversarial review of keysat- what vulnerabilities, customer (via matrix) 2026-06-16 16:14:50 -05:00
Keysat 10e0d898fd Capture: we previously discussed a docs-reader agent whose idea i thi (via matrix) 2026-06-16 16:13:35 -05:00
Keysat c133e67e6f Capture: does the keysat registry need to save every iteration of new (via matrix) 2026-06-16 16:10:53 -05:00
Keysat 6a713f87b1 Capture: run janitor agent on all projects (via matrix) 2026-06-16 15:57:45 -05:00
Keysat 1639b0427c Capture: run full-eval on ten31-signal-engine 2026-06-16 15:48:08 -05:00
Keysat 60f2532481 Capture: ten31-database explorer admin-vs-user UI audit idea 2026-06-16 14:17:01 -05:00
Keysat e2fc522096 Capture: ten31-database Matrix-bridge intake for fundraising grid 2026-06-16 14:14:25 -05:00
Keysat 123072f8ad Handoff: refresh Current state — design system shipped, recaps.cc next 2026-06-16 12:56:31 -05:00
Keysat e069ae0a13 Capture: keysat design-contract cleanup (design-checker backlog) 2026-06-16 12:50:16 -05:00
Keysat 4e98106ac6 Make the design guide vendor-agnostic about the harness
Keep "Claude Design" only as the named default visual tool (like naming
Figma); generalize the spots that assumed the coding harness is Claude:
- Phase B framed as "the cloud tool" (Claude Design as default; adapt for Figma)
- export named the "coding-agent handoff bundle" (Claude Design's term noted)
- token extraction points at Style Dictionary / any harness skill, not Claude Code
- BRIEF prompt block paste target generalized
2026-06-16 12:45:23 -05:00
Keysat a6e3fe6908 Field notes: promote learnings from the keysat /design pilot
First real /design run (keysat import, document-as-is). Generalizable
process learnings, per the Phase-D loop:
- implemented CSS/tokens are authoritative over prose when they disagree
- real type scales/shadows resist strict DTCG — keep as documented strings
- encode stated intent even where code violates it; the gap is the backlog
- "each surface inlines its own tokens" is a recurring drift risk; grep
  before relocating a design dir into _imports
2026-06-16 11:30:13 -05:00
Keysat 7f3b007e1a Add existing-repo on-ramp and Phase-D learning loop to /design
- Phase 0 routes a run by situation: refine / import prior guidelines
  (Case A) / extract the de-facto design from organic UI (Case B) / fresh,
  plus a document-as-is vs redesign posture. Lets /design backfill repos
  that grew a look organically, not just scope new ones.
- Phase D promotes generalizable PROCESS learnings back into this guide —
  harvested from both the pre-flight scoping conversation and the Phase-C
  distillation — with a Field notes section that accretes them. Brand facts
  stay in the repo's DESIGN.md; process knowledge improves the global standard.
- Seed Field notes with research-verified Claude Design export facts.
2026-06-16 10:48:03 -05:00
Keysat bb27e4c32a Add /design round-trip and design-checker agent to the fleet
Design/branding for any user-facing repo becomes a vendor-neutral on-disk
contract — design/DESIGN.md (nine-section brief) + design/tokens.tokens.json
(W3C DTCG tokens) — that every agent reads before building UI. Claude Design
is the interchangeable cloud front-end, never a dependency.

- /design (main-thread command): inspiration-first scoping -> BRIEF.md -> user
  drives the cloud step -> distill the export back into the contract. Phase-C
  token distillation is agent-mediated because the export is inline-hardcoded
  HTML/CSS, not DTCG.
- design-checker (read-only subagent): audits a repo's UI against its own
  committed contract; says "run /design first" when none exists.
- /new-project scaffolds design/ for user-facing projects.
2026-06-16 09:04:46 -05:00
Keysat ce022ab2c1 Capture: ten31-database icon oversized on StartOS, needs resize 2026-06-16 08:26:40 -05:00
Keysat 1e5e85339a Roundup snapshot — 2026-06-16 2026-06-16 06:50:54 -05:00
Keysat 7accb2616f Make a single approval cover both commit and push
Replace the two-gate "ask before pushing" wording: one approval is
the go-ahead to commit and push to main in one step.
2026-06-15 21:42:39 -05:00
Keysat 04941384f3 Default git workflow to main with an approval-on-push gate
Drop "commit directly to main" wording that sessions read as
"make a feature branch"; make main the default and gate on push.
2026-06-15 21:27:59 -05:00
Keysat 3cc1f74294 Capture: relay /relay/analyze doc-drift (prompt vs transcript) 2026-06-15 18:53:53 -05:00
Keysat e34047f853 Clear resolved recap inbox bugs; annotate mobile-scroll item with attempted fix 2026-06-15 18:18:59 -05:00
Keysat de920292cd Refresh Current state: how-i-work consolidation resolved 2026-06-15 18:10:27 -05:00
Keysat f7e9e29637 Add instruction-file structure + promote-without-trim rules to how-i-work 2026-06-15 18:09:40 -05:00
Keysat 7ed43fcb59 Mark ROADMAP item 7 (placement guide) done; refresh Current state 2026-06-15 17:29:47 -05:00
Keysat d9850f8ad8 Capture: reconcile ten31-database CRM networking facts 2026-06-15 17:27:59 -05:00
Keysat ee5c8bb3e2 Verify and correct placement guide infra facts with owner
Replace the one-shot/UNVERIFIED infra section with owner-confirmed facts:
x86 StartOS 0.4.0 box + full service inventory; the two-Spark role split
(LLM vs audio/speech, Qdrant on the audio Spark, matrix-bridge hosted there);
route via the Spark Control gateway and query the active model rather than
hardcoding one; networking reduced to LAN/WireGuard/StartTunnel (Proton/Tor
were legacy). Align decision steps 4 and 6.
2026-06-15 17:16:34 -05:00
Keysat 6b54d9c7cc Roundup snapshot — 2026-06-15 2026-06-15 14:12:09 -05:00
Keysat e4f5cc1bcb Refresh Current state after Workout-log rename cleanup 2026-06-15 13:52:42 -05:00
Keysat c8c1daf763 Correct stale repo name Workout-log → proof-of-work in roundup docs
The repo was renamed; note the rename in the STATUS.md scan header so the
historical snapshot stays unambiguous. The dead ~/Projects/Workout-log
folder (empty/crash logs, no git) was removed separately.
2026-06-15 13:48:44 -05:00
Keysat 5713b2476b Note first live /new-project run (matrix-bridge) in Current state 2026-06-14 22:29:13 -05:00
Keysat cdfa1eca57 Retag captured recaps items to (recap) to match repo folder 2026-06-14 19:41:30 -05:00
Keysat 908d96a6e5 Fold idea-workshop into /new-project; add placement reference
Harvest the retired idea-workshop skill into the current new-project flow:
- form-factor gate (is this even a standalone repo, or a feature/skill/agent
  of something that exists? bail + reroute if so)
- worth-building gate at sign-off (build effort + ongoing tax -> BUILD/PARK/ADOPT)
- placement step that walks the new guides/placement.md
- falsifiable-exit substance rule and a posture section
- architectural decisions land in the new repo's AGENTS.md ## Decisions section,
  absorbing the old DECISIONS.md function (no separate ADR file)

Add guides/placement.md (ported from the skill) and point how-i-work.md at it.
Its infra facts are UNVERIFIED (one-shot from chat history) and flagged for a
review pass with me (ROADMAP item 7).
2026-06-14 19:39:00 -05:00
Keysat bac315265e Capture: gitea API automation, jitsi, SMTP digest, 2 new-project ideas, 4 recaps bugs 2026-06-14 15:59:13 -05:00
Keysat cfe7f47964 Prune Current state; document STATUS.md in Layout 2026-06-14 14:33:42 -05:00
Keysat 231bc9f1a0 Roundup snapshot — 2026-06-14 2026-06-14 14:22:57 -05:00
Keysat b7625c4e83 Make /roundup's tracked STATUS.md snapshot its default, committed output
Promote STATUS.md from an opt-in offer to standard behavior: every /roundup run now writes the report to ~/Projects/standards/STATUS.md (overwritten), shows it inline, and commits + pushes only that file so the portfolio state is diffable over time — the same durability reflex as /capture. STATUS.md remains the only file roundup writes; all project repos stay read-only. Updates guide (new Phase 4), wrapper, ROADMAP item 2, README, and Current state.
2026-06-14 14:14:07 -05:00
Keysat 8548afd9fd Add /new-project bootstrap command (guide + wrapper)
The inverse of /retrofit: workshops a captured (new) inbox idea into a repo that is standards-compliant from line one (AGENTS.md + CLAUDE.md symlink, ROADMAP, canonical deny-by-default .gitignore, .claude wiring, inbox-check line), publishes via a manual Gitea-create gate, and clears the (new) inbox item. Resolves ROADMAP item 5 open questions (manual Gitea gate; stack quality gate deferred to /harden; workshop seeds Current state) and marks it BUILT.
2026-06-14 12:44:55 -05:00
Keysat b35e699384 Document capture/triage/roundup in README rhythm; capture ten31-transcripts retrofit
Add the inbox loop (/capture, /triage, /roundup) to README's 'The rhythm' section. Capture the ten31-transcripts mini-retrofit to INBOX.md with step-by-step instructions for next triage. Reconcile ROADMAP item 6 + Current state: recap-relay remote added/pushed in a later session.
2026-06-14 12:35:25 -05:00
Keysat 828fc99dd4 Adopt deny-by-default .claude gitignore; record git-hygiene audit
The cross-repo git-hygiene audit (ROADMAP item 6) found the documented canonical .claude/ block was allow-by-default and would have un-ignored a password-bearing .claude/launch.json. Switch portability.md to a deny-by-default .claude/* + allow-list block and align the two retrofit summaries. Mark item 6 done with residuals; refresh Current state.
2026-06-14 12:19:48 -05:00
Keysat 36e1f78014 Document statusline.sh adapter; record next-session todos in Current state 2026-06-14 11:28:34 -05:00
Keysat b55ba13277 Fix doc drift surfaced by doc-auditor: mark /roundup built, correct Current state and a cross-ref 2026-06-14 11:22:17 -05:00
Keysat 04651503d2 Thread inbox-check line and canonical .gitignore into the retrofit flow; scope relative-symlink rule to in-repo links
Retrofit playbook Step 0/Step 1 and the /retrofit guide now seed every new repo
with the canonical .gitignore block and the inbox-check line, and Part 5 documents
/capture, /triage, /roundup. portability.md and the portability-checker guide now
scope the relative-symlink mandate to in-repo (committed) symlinks, so global
~/.claude/* links are no longer flagged. ROADMAP adds a high-priority cross-repo
git-hygiene audit.
2026-06-14 11:15:16 -05:00
Keysat 72bebdd3e8 Mention roadmap.md in the retrofit-eval triage step 2026-06-14 11:15:16 -05:00
25 changed files with 1619 additions and 97 deletions
+36 -20
View File
@@ -17,13 +17,16 @@ The global layer lives here and is wired into `~/.claude` by **directory symlink
file added under `adapters/` is live immediately — no per-file linking:
- `~/.claude/commands``adapters/claude/commands/` — global slash commands (`/retrofit`,
`/handoff`, `/full-eval`, `/capture`, `/triage`, `/roundup`).
`/handoff`, `/full-eval`, `/capture`, `/triage`, `/roundup`, `/new-project`, `/design`,
`/adjudicate`).
- `~/.claude/agents``adapters/claude/agents/` — global subagents (reviewer, evaluator,
security-auditor, doc-auditor, exerciser, researcher, janitor, portability-checker,
start9-spec-checker).
start9-spec-checker, design-checker, onboarding-tester).
- `~/.claude/CLAUDE.md``how-i-work.md` — my universal preferences, loaded every session.
(Distinct from this repo's *root* `CLAUDE.md`, which → `AGENTS.md`: same filename, different
scopes — global preferences vs. this repo's orientation.)
- `~/.claude/statusline.sh``adapters/claude/statusline.sh` — the Claude Code CLI terminal
status-line script (invoked by `~/.claude/settings.json`).
Because these are *global*, decisions here are never scoped to this repo alone. The test
for anything proposed ("should we adopt linters / hooks / CI?", "is this skill worth
@@ -38,11 +41,15 @@ repo is just where that cross-repo standard is authored and stored.
- `retrofit-playbook.md` — terminal runbook for moving a project's brain onto disk.
- `subagents-handbook.md` — designing and running delegated agents.
- `guides/`**neutral substance**: the self-contained operating guide for each command
and agent, written as plain prose any vendor's harness could follow.
and agent, written as plain prose any vendor's harness could follow. Also holds shared
reference docs that aren't tied to one command — e.g. `placement.md` (where a new project
should live), which `/new-project` and `how-i-work.md` both point at.
- `adapters/claude/{commands,agents}/`**thin Claude wrappers**: frontmatter + a pointer
to the matching `guides/` file. Substance never lives in a wrapper.
- `INBOX.md` — cross-project capture buffer for untriaged ideas/bugs (see below).
- `ROADMAP.md` — longer-term backlog for this repo (future agents, commands, standards).
- `STATUS.md` — latest cross-project roundup snapshot, overwritten + committed by `/roundup`
each run (git history is the diff over time).
## Conventions when editing
@@ -81,20 +88,29 @@ should carry this so any vendor's agent surfaces pending items at session start:
## Current state
- The capturetriage→roadmap loop is built and live: `AGENTS.md` (+ `CLAUDE.md` symlink),
`ROADMAP.md`, `INBOX.md`, and the `/capture` + `/triage` commands. The repo dogfoods its
own standard, including the active inbox-check line above.
- `/capture` also handles **new-project ideas** (`(new)` / `(new:name)`, type `project`);
these wait for the new-repo bootstrap and are never triaged into an existing repo.
- The git-tracking standard ("What git tracks") is now in `portability.md`, and this repo's
`.gitignore` follows it.
- `/roundup` is built: a cross-project status report that reads every repo's
AGENTS.md/ROADMAP.md plus the inbox and groups all to-dos by priority — reads and reports
only; deciding focus stays with the user.
- Specced in `ROADMAP.md`, not built: the `new-project` bootstrap, the cross-repo
quality-gate standard (linters/hooks/CI), and the optional SessionStart hook.
- The portable inbox-check line is still not in *other* repos' AGENTS.md nor the retrofit
playbook template — threading it (and the canonical `.gitignore`) into bootstrapping is a
ROADMAP item.
- Next: thread the standards into `retrofit-playbook.md`; then build the quality-gate hook
or the `new-project` bootstrap.
- **Fleet built and live** — commands `/capture /triage /roundup /new-project /handoff /retrofit
/full-eval /design /adjudicate`; subagents incl. `design-checker` + `onboarding-tester`
(substance in `guides/`, thin wrappers in `adapters/claude/`, symlinked into `~/.claude`).
Dogfoods its own standard. Latest `/roundup`: `STATUS.md` 2026-06-16.
- **`/adjudicate` shipped this session (ROADMAP item 10).** Debates parked P2/P3 backlog to
DROP/DO/ESCALATE verdicts; recommend-only v1, autonomy gated by blast radius, ROADMAP-only.
**Untested on a real backlog** — the first run should eyeball the judge's drop-bias before
trusting it (tune the tie-break rule in `guides/adjudicate.md` if it over-drops). Best first
targets: keysat, recap.
- **`onboarding-tester` live; first harness still pending (item 9).** Stage 1 (Path 1, manual
issuance under keysat's `merchant-onboard` key) is **unblocked** — build the harness + first run
in a keysat session. Stage 2 (Path 2, buyer-pays on regtest) is **gated** on keysat's
`payment_providers:write` scope + network gate + sandbox flag.
- **Design system live; first full live round-trip done (item 8).** recap Case B (Extract)
backfill (app 0.2.161); **ten31-database** was the first end-to-end run — Case-B extract →
mobile-first `BRIEF.md` → Claude Design cloud round-trip → Phase C distilled + Phase D field
notes promoted to `guides/design.md` (`176eebe`). The "confirm export internals + tune Phase-C"
open item is **closed**. Residual is per-project execution (ten31-database mobile build, being
scoped in that repo — not standards work).
- **Next steps:** (1) first real `/adjudicate` run on keysat/recap to calibrate the drop-bias
(item 10); (2) Stage-1 `onboarding-tester` harness in a keysat session (item 9); (3) cross-repo
quality-gate standard + `/harden` (item 1); (4) non-git-folder sweep under `~/Projects` (item 6
residual).
- Queued in `INBOX.md` for other repos' `/triage`: keysat design cleanup (P2) + onboarding Path-2
(P3); `ten31-transcripts` mini-retrofit; `ten31-database` networking/icon/intake; (standards)
operator-onboarding agent (P3).
+21
View File
@@ -31,3 +31,24 @@ Example:
## Items
<!-- /capture appends below this line -->
- [ ] (standards) [feature][P2] API automation for Gitea in /new-project — automate the currently-manual Gitea create/publish gate via the Gitea API, 2026-06-14
- [ ] (new:embedded-links-reader) [project][P2] Embedded-links reader & summarizer — give the app an article/blog URL; it scrapes the links the author embedded (the ones you don't want to visit in the moment), reads them, and summarizes them, 2026-06-14
- [ ] (new:portfolio-scraper) [project][P2] Portfolio-company scraper — tracks portfolio companies for podcasts, social tweets, founder appearances, news, etc. and delivers a digest via email or another interface, 2026-06-14
- [ ] (recap-relay) [chore][P3] AGENTS.md endpoint list mis-describes POST /relay/analyze as "{ transcript, … } → topic sections JSON". The actual route (server/routes/analyze.js) takes a free-form { prompt: string } and returns the standard envelope { result: { text } }; "topic sections JSON" is only what the recap-app caller asks for in its prompt. Fix the request-shape wording to { prompt } — surfaced resolving Recaps' Daily Digest synthesis contract (Q4), 2026-06-15
- [ ] (keysat) [chore][P2] Design-contract cleanup from the 2026-06-16 design-checker audit — full detail in keysat ROADMAP "Design (contract conformance)" + design/DESIGN.md. (1) Fix 3 blockers (code violates the contract's named "never" rules on live CTAs): (a) gold-as-fill on admin `.featured-pill-toggle.on` (licensing-service-startos/licensing-service/web/index.html:418) → navy fill or gold border+text; (b) gold-as-fill on admin `#tier-banner-cta` upgrade button (web/index.html:537-542) → navy primary; (c) primary buy CTA pill radius 999px (keysat-xyz-landing/index.html:384-385) → r-md 8px. (2) Structural: consolidate the 4 surfaces' inlined CSS-variable copies onto canonical design/brand/palette.css (import it, drop private copies). (3) Token gaps (tokenize-vs-snap): 14px landing card radius; wordmark letter-spacing 0.30 vs 0.28em (add letterSpacing.wordmark token); semantic badge text one-offs (#205c47/#7a5814/#8a2828); hardcoded syntax-highlight hex → var(); admin #f6f1e7 off-token. Re-run design-checker after to confirm, 2026-06-16
- [ ] (ten31-signal-engine) [chore][P2] Run full-eval on the signal engine folder — the full evaluation suite (evaluator, security-auditor, exerciser, doc-auditor, spec-checker), 2026-06-16
- [ ] (standards) [idea][P2] run janitor agent on all projects — via matrix, 2026-06-16
- [ ] (keysat) [chore][P2] does the keysat registry need to save every iteration of new versions of keysat software as we upgrade it? research agent needs to investigate — via matrix, 2026-06-16- [ ] (keysat) [chore][P2] Adversarial review of keysat- what vulnerabilities, customer complaints, feature gaps, might a new user find. — via matrix, 2026-06-16
- [ ] (keysat) [chore][P2] run spec-checker agent for listing to start9 community registry — via matrix, 2026-06-16
- [ ] (keysat) [chore][P2] review website for any drift/inconsistencies (doc-auditor), review GitHub for any sensitive information in historical commits (revealed info), review website and consider adding specific example of how to add licensing to existing software (for example this is a good way to test the dry run of a new user just using documentation... we could give an agent the proof-of-work software and see if they can just add a license paywall in front of it before they can use it in one shot) — via matrix, 2026-06-16
- [ ] (recap) [idea][P2] add gemini 3.5 to model selection, need to have research agent check which models are available (stable versions) and the correct model name — via matrix, 2026-06-16
- [ ] (recap-relay) [idea][P2] add gemini 3.5 to model selection, need to have research agent check which models are available (stable versions) and the correct model name — via matrix, 2026-06-16
- [ ] (new:personal-website) [project][P2] Develop personal website — host on Start9 Pages, served on clearnet via StartTunnel; build HTML site, use Claude Design for styling, gather design inspiration — 2026-06-16
- [ ] (keysat) [feature][P3] Onboarding-tester Path 2 — full buyer-pays walkthrough on regtest. GATED on keysat shipping `payment_providers:write` (opt-in scope, never bundled into merchant-onboard) + network gate (scoped connect = regtest/testnet/signet only, mainnet master-only, fail-closed) + daemon-level sandbox-mode flag (greenlit with the keysat dev 2026-06-16; see plans/agent-payment-connect-scope.md). Then: harness stands up a BTCPay regtest stack + a sandbox-flagged keysat instance, grants the agent merchant-onboard + payment_providers:write, and the agent connects BTCPay (regtest) AND drives a test buyer payment that activates a license — entire chain agent-done, zero master-key steps. Walkthrough must be labeled regtest; production mainnet-connect stays the operator's one reserved step BY DESIGN (frame as security feature). Build AFTER Path 1 (no-payments) ships, since the BTCPay-regtest stack is the bulk of the new infra, 2026-06-16
- [ ] (standards) [agent][P3] Operator-onboarding agent — sibling to onboarding-tester for the *operator* journey (stand up + run keysat from docs alone: sideload/registry install on StartOS, configure, issue first license), vs. the developer SDK-integration journey onboarding-tester already covers. Needs its own clean room (a clean StartOS service-install, not a generic VPS, since the s9pk can't run on a vanilla Linux box), 2026-06-16- [ ] (ten31-database) [idea][P2] backup history in settings tab should be minimized and expandable with a chevron. default to minimized and shown at the bottom since it is rarely viewed — via matrix, 2026-06-18
- [ ] (ten31-database) [idea][P2] screen refresh should preserve viewing the same tab you were already on, rather than default back to the top tab — via matrix, 2026-06-18
- [ ] (keysat) [feature][P2] ability to reorder entitlements catalog on edit products view — via matrix, 2026-06-18
- [ ] (spark-control) [idea][P2] we should redesign the software logo/icon (used for startos service).. it doesn't really relate to anything, though the color scheme seems to match — via matrix, 2026-06-18
- [ ] (spark-control) [feature][P2] Add a dashboard card for the ten31 CRM / intake bot (Update/Restart/Stop/Logs tile like matrix-bridge) — confirm whether already done; see ten31-database docs/handoffs/add-intake-bot-to-spark-control.md, 2026-06-18
- [ ] (proof-of-work) [feature][P2] brainstorm better tracking of cardio logging and cardio program planning (in-week variety and long term programs) — via matrix, 2026-06-19
- [ ] (matrix-bridge) [bug][P2] what are the open brackets when you log an inbox item through matrix, eg “📥 captured → - [ ] (proof-of-work) [feature][P2] brainstorm better tracking of cardio logging and cardio program planning (in-week variety and long term programs) — via matrix, 2026-06-19” — via matrix, 2026-06-19
+5 -1
View File
@@ -41,4 +41,8 @@ The governing question: **would a different agent need this to work on the repo?
## The rhythm (canonical homes elsewhere)
One session = one task. Close every session with `/handoff` (updates Current state in AGENTS.md, commits, pushes). `/compact` is for mid-task overflow only, never to extend a finished chat. Dropped or closed by accident? `claude -c` resumes the latest session in that folder. Full daily rhythm: retrofit-playbook.md, Part 5.
One session = one task. Close every session with `/handoff` (updates Current state in AGENTS.md, commits, pushes). `/compact` is for mid-task overflow only, never to extend a finished chat. Dropped or closed by accident? `claude -c` resumes the latest session in that folder.
Between tasks, ideas flow through the inbox: `/capture` logs an idea or bug for any repo to `INBOX.md` from wherever you are (no routing decision at capture time); `/triage`, run inside a project, drains that repo's items into its Current state or `ROADMAP.md`; and `/roundup` reads every repo's AGENTS.md/ROADMAP plus the inbox and compiles one priority-grouped to-do list across all projects, written to a tracked `STATUS.md` snapshot in the standards repo (it gathers and groups — deciding focus stays with you). Canonical home: `AGENTS.md` → "The capture → triage → roadmap loop."
Full daily rhythm: retrofit-playbook.md, Part 5.
+170 -43
View File
@@ -42,29 +42,15 @@ how the standard gets *adopted* into a repo (a `/harden` command that installs t
linter+hook for the detected stack?); whether to define a minimal "agentic-ops baseline"
checklist doc alongside the other four standards docs.
## 2. `roundup` — cross-project status agent (the inverse of capture/triage)
## 2. `roundup` — cross-project status command ✅ BUILT
**Why:** a global "what should I do next / which project to focus on" view. Capture/triage
push items *down* into project repos; roundup reads back *up* across all of them.
**Shape:** a command (`/roundup`, run from `~/Projects`) that fans out one read-only
subagent per repo to read that repo's `AGENTS.md` (`## Current state`) and `ROADMAP.md`,
plus reads the standards `INBOX.md` for not-yet-triaged items. It synthesizes **one
aggregated to-do list across all projects, grouped by priority**, including items that
haven't been pushed down to a repo yet. Mirror `full-eval`'s orchestrator pattern: fan out,
then synthesize into one report; the per-repo readers can be the generic `Explore` agent or
a small dedicated reader.
**Division of labor (the user's explicit call):** the agent *reads and reports* — it
gathers and presents findings grouped by priority; it does **not** decide the best use of
time. Prioritizing across projects is a back-and-forth the user does on top of the report.
So the output is a clean, evidence-backed inventory, and the "what's actually worth doing
now" conversation happens after, in the main thread.
**Open questions:** which folders count as "my projects" (scan `~/Projects`, skip non-repos
and the standards repo itself?); how to keep it cheap (cap readers, summarize hard); whether
roundup output is ephemeral or written to a `STATUS.md` in the standards repo for diffing
over time like `EVALUATION.md`.
Built and live: `guides/roundup.md` + `adapters/claude/commands/roundup.md`. Fans out a
read-only reader per repo over AGENTS.md/ROADMAP.md, folds in the inbox, and synthesizes one
priority-grouped to-do list across all projects (prioritizing stays with the user). Every run
**writes the report to a tracked `~/Projects/standards/STATUS.md` snapshot** (overwritten each
run, then committed + pushed) so the portfolio state is diffable over time — and shows the
same report inline. That snapshot file is the *only* thing roundup writes; all project repos
stay read-only.
## 3. Deterministic inbox surfacing — SessionStart hook (optional upgrade over the portable line)
@@ -93,28 +79,169 @@ automatic so every repo inherits it.
- Decide whether the canonical wording lives in `how-i-work.md` (so it's truly universal)
or stays a per-repo line.
## 5. `new-project` — idea → scoped → scaffolded → Gitea repo
## 5. `new-project` — idea → scoped → scaffolded → Gitea repo ✅ BUILT
**Why:** the inverse of `/retrofit`. Retrofit moves an *existing* project onto disk; this
takes a captured `(new)` inbox idea and turns it into a real, standards-compliant repo. It
closes the capture loop: `/capture (new) …` → bootstrap → a repo that already has AGENTS.md,
the CLAUDE.md symlink, ROADMAP.md, the canonical `.gitignore`, and the inbox-check line.
Built and live: `guides/new-project.md` + `adapters/claude/commands/new-project.md`. The
inverse of `/retrofit` — main-thread and collaborative, it turns a captured `(new)` inbox
idea into a repo that's standards-compliant from line one. Phases: locate the inbox seed →
workshop the scope (the high-value, interactive step) → brief + scaffolding-plan sign-off →
scaffold (`AGENTS.md` + `CLAUDE.md` symlink, `ROADMAP.md`, `README.md`, canonical
`.gitignore`, `.claude/`, minimal stack skeleton; inbox-check line tagged `(<name>)`) →
publish (git init + commit, Gitea manual-create gate, remote + push) → close the loop
(remove the `(new)` item from `INBOX.md`) → portability-checker verify.
**Shape:** a command (`/new-project`, run from `~/Projects`), main-thread and collaborative
— scoping is a conversation, not a delegated job. Phases:
**Open questions, resolved:** (a) Gitea repo creation is a **manual web-UI gate** (no API
token — matches retrofit-playbook Part 4); (b) the standards layer is always scaffolded, the
stack skeleton stays minimal, and the **linter/pre-commit quality gate is deferred to the
future `/harden`** (item 1) rather than hand-rolled; (c) the workshop **does** seed the first
`## Current state` with the first milestone.
1. **Workshop the scope** — back-and-forth to sharpen objectives, non-goals, stack, and the
key early decisions. Pull the seed from the `(new)` inbox item if one exists. This is the
high-value step and stays interactive (like roundup, the reasoning is the user's).
2. **Seed prompt** — synthesize the workshop into a concrete project brief / initial build
prompt plus a scaffolding plan; get the user's sign-off.
3. **Scaffold** — create the new folder under `~/Projects`, write the initial AGENTS.md
(from the brief) + CLAUDE.md symlink, ROADMAP.md, README, the canonical `.gitignore`,
`.claude/` wiring, and the starting directory structure. Compliant from line one.
4. **Publish**`git init` + initial commit, create the Gitea repo, add the remote, push
(reuse retrofit-playbook Part 4; if no Gitea API token is available, hand back the manual
"create empty repo, copy URL" step). Then remove the `(new)` item from the inbox.
**Remaining option:** once `/harden` (item 1) exists, call it as the scaffold's last step so a
new repo gets its stack's quality gate installed automatically.
**Open questions:** Gitea repo creation — API token vs. manual UI step; how much scaffolding
is generic vs. stack-specific (does it call a `/harden` step from item 1 to install the
stack's linter+hook?); whether the workshop output also seeds the first `## Current state`.
## 6. Cross-repo git-hygiene audit + remediation ✅ DONE (2026-06-14)
Fanned out one read-only `portability-checker` per git repo under `~/Projects`. **No safety
issues anywhere:** zero tracked `.env` / `.DS_Store` / `*.local.json`, and every in-repo
symlink is relative. The gaps were consistency: the inbox-check line was missing in all 7
non-standards repos, and only `standards` had a complete canonical `.gitignore`.
**Fixed — 6 repos, one commit each, pushed** (`CRM`, `premier-gunner`, `recap`,
`spark-control`, `proof-of-work`; `recap-relay` committed locally — see residuals): added the
repo-tagged inbox-check line and normalized `.gitignore`.
**Standard improved by the audit:** the documented canonical `.claude/` block was
allow-by-default and would have *un-ignored* `premier-gunner`'s password-bearing
`.claude/launch.json`. Switched `portability.md` (and the two retrofit summaries) to a
**deny-by-default `.claude/*` + allow-list** of the shared wiring.
**Residual follow-ups:**
- **`ten31-transcripts` (MAJOR) — needs its own mini-retrofit.** Despite the name it's an
active Xcode/Swift app with no `.claude/` at all. Scaffold `.claude/settings.json`; decide
whether to reorganize its flat `docs/NN_*.md` into `docs/guides/` + `.claude/rules/` symlinks.
Too big for the mechanical pass — **captured to `INBOX.md`** (tagged `(ten31-transcripts)`)
with step-by-step instructions, for pickup on the next `/triage` in that repo.
- **`recap-relay` remote** ✅ Gitea `origin` added + pushed in a later session.
- **`premier-gunner/s9pk/.gitignore`** lacks the secrets/Claude lines (low priority; the root
`.gitignore` covers `.env` tree-wide already).
- **Many non-git folders under `~/Projects` are unprotected work** (discount-watcher,
expense-organizer, giga, heart-rate, licensing, one-river, satoshi-sleep, START9 PACKAGING,
ten31-agents/-command-center/-signal-engine, timestamp-converter, timestamp-newspaper,
website-landing, Grand-Cayman-paddleboard). Each needs `git init` + retrofit, or an explicit
"scratch, don't track" decision.
- **`start-os`** is an external upstream (Start9Labs/start-os) — out of scope, no action.
## 8. Design system — `/design` round-trip + `design-checker` agent ✅ BUILT (2026-06-16)
Built and live: `guides/design.md` + `adapters/claude/commands/design.md` (the `/design`
command) and `guides/design-checker.md` + `adapters/claude/agents/design-checker.md` (the
`design-checker` subagent). The principle: branding/design for any user-facing repo is a
**vendor-neutral on-disk contract**`design/DESIGN.md` (nine-section brand brief) +
`design/tokens.tokens.json` (W3C DTCG tokens) — that every agent reads before building UI.
The cloud tool (Anthropic's **Claude Design**, cloud-only/experimental) is an interchangeable
front-end, never a dependency.
- **`/design`** (main-thread, interactive) runs the round-trip: inspiration-first scoping →
`design/BRIEF.md` (the packet the user hand-carries to Claude Design) → user drives the
cloud step → distill the export back into the `DESIGN.md` + tokens contract. Distillation is
agent-mediated because Claude Design exports inline-hardcoded HTML/CSS, not DTCG tokens
(verified by research, 2026-06-16) — worth a human glance the first few runs.
- **`design-checker`** (read-only subagent) audits a repo's user-facing code against its *own*
committed contract — off-palette colors, wrong type scale, magic-number spacing, matched
"don'ts" — citing code `file:line` vs the contract rule. Reports "run `/design` first" if no
contract exists; never imposes its own taste.
- **`/new-project`** now scaffolds `design/` + the AGENTS.md Design line for user-facing
projects (Phase 4).
**Remaining options:** (a) `/retrofit` should backfill `design/` into existing user-facing
repos (keysat, recap, recaps.cc, premier-gunner, ten31-database, ten31-transcripts) — run
`/design` then `design-checker` per repo. **Done (recap, 2026-06-17):** recap was the first
**Case B (Extract)** backfill — document-as-is. It validated the extract→reconcile path
(previously untested — keysat was Case A/Import); the contract was distilled and the
conformance cleanup shipped in two phases (recap app 0.2.161). Extract-phase Phase-D learnings
landed in `guides/design.md` (commit `9031281`); cleanup-execution learnings are now in
`guides/design-checker.md`. **Remaining backfill repos:** recaps.cc, premier-gunner,
ten31-database, ten31-transcripts. (b) fold a `design-checker` pass into `/full-eval`
for repos that have a contract; (c) confirm against a real Claude Design run what the export
bundle actually contains and tune the Phase C distillation (the export internals are only
medium-confidence from research) — kept **decoupled** from the recap run so one untested thing
moves at a time.
## 9. `onboarding-tester` — docs-only fresh-adopter agent ✅ BUILT (agent); harness pending (staged Path 1 → Path 2)
Built and live: `guides/onboarding-tester.md` + `adapters/claude/agents/onboarding-tester.md`. A
global adopter agent: given a **goal**, a **docs-corpus allowlist**, a **sandbox**, and
**credentials/endpoint**, it walks a product's published docs as a literal newcomer — *never
reading the product's source* (needing to is itself a finding) — and reports every doc gap
(Blocker/Stumble/Nit + the one-line fix). On a **fully clean run** (zero blockers, zero stumbles)
it emits a second, publishable "all it took was X, Y, Z" walkthrough for marketing. Generic by
design; keysat is the first instantiation. The dual output (friction report always; marketing
walkthrough only on a clean run) makes it both a docs-QA tool and a source of agent-readiness copy.
**First instantiation — keysat SDK integration (harness lives in keysat, not here):** goal = gate
`proof-of-work` behind a keysat license end-to-end under the least-privilege `merchant-onboard`
scoped key (keysat shipped that role for this, commit `d5885d1`; covers catalog setup + manual
license issuance). Harness = two disposable containers (`keysat-server` fixture booted fresh +
`developer-sandbox` holding `proof-of-work`), master-key mints the scoped key, docs corpus =
keysat.xyz / docs.keysat.xyz / the four SDK package pages.
**Staged build:**
- **Stage 1 — Path 1, no payments (do first, unblocked now):** catalog → tiers → manual license
issuance → SDK integration → offline verify, all under `merchant-onboard`. Cheapest honest
walkthrough; validates the agent + core integration docs before adding payment infra. **Pending:**
build the harness + first live run in a keysat session.
- **Stage 2 — Path 2, full buyer-pays on regtest (gated):** depends on keysat shipping
`payment_providers:write` (opt-in scope, never bundled into `merchant-onboard`) + a network gate
(scoped connect = regtest/testnet/signet only; mainnet master-only; fail-closed) + a daemon-level
sandbox-mode flag — **greenlit with the keysat dev 2026-06-16** (`plans/agent-payment-connect-scope.md`).
Then the harness adds a BTCPay regtest stack and the agent connects BTCPay (regtest) **and** drives
a test buyer payment end-to-end, zero master-key steps. Walkthrough labeled regtest; production
mainnet-connect stays the operator's one reserved step **by design** (a key that can repoint
settlement is a fund-redirection key — surface the boundary as a security feature).
**Other follow-up (captured to `INBOX.md`):** a separate operator-onboarding agent for the StartOS
install journey (stand up + run keysat from docs alone), vs. the developer SDK-integration journey
this agent covers.
## 7. Verify & correct the placement guide ✅ DONE (2026-06-15)
Walked the "Infrastructure facts" section with the owner and corrected it against the real
setup (cross-checked against the project repos + a Spark-control topology dump). Fixes: **x86**
StartOS **0.4.0** box + full service inventory (Gitea, Nextcloud, Home Assistant, CLN + RTL,
Open WebUI, Vaultwarden, Synapse, and every Claude-built app); the two-Spark **role split**
(LLM Spark = vLLM; audio/speech Spark = Parakeet / Kokoro / Sortformer + TitaNet + bge-m3 +
Qdrant, and it hosts matrix-bridge); **don't hardcode a model — query the Spark Control
gateway** for the live one (daily driver Qwen3.6, hot-swappable); networking reduced to LAN /
WireGuard / StartTunnel (Proton VPN + Tor were legacy, dropped). UNVERIFIED banner replaced
with a "verified 2026-06-15" note; decision steps 4 and 6 aligned. Commit `ee5c8bb`.
## 10. `adjudicate` — debate low-priority backlog items to a verdict ✅ BUILT (2026-06-17)
Built and live: `guides/adjudicate.md` + `adapters/claude/commands/adjudicate.md` (the
`/adjudicate` command). Solves backlog clutter the owner can't easily judge: low-priority
(P2/P3) technical/backend items that may be necessary or may be bells-and-whistles, and that
he shouldn't spend expertise on *because* they're low priority. Run inside a repo, it
adjudicates that repo's ROADMAP items via a grounded debate and routes each to a verdict the
owner ratifies instead of researching.
- **Pipeline (per item):** investigator (read-only — does the problem exist? already handled?
what would it touch? + blast-radius classification) → build-advocate ∥ drop-advocate (argue
from the investigator's findings, not speculation) → judge (rubric = `how-i-work.md` + repo
`AGENTS.md`; **biased to DROP on ties / low confidence**, since these are already low-priority).
- **Three verdicts:** **DROP** (the only autonomously-applied call — ratified in one batch, owner
needn't understand the tech), **DO** (worth it + LOW blast radius → annotated with a ready plan,
recommend-only, not executed), **ESCALATE** (worth it but HIGH blast radius / low confidence /
an epic → balanced brief for the owner's call).
- **Autonomy is gated by blast radius, not priority** — HIGH = touches data/auth/money/external
surface or changes observable behavior (unclear ⇒ HIGH). It may auto-recommend *dropping* a HIGH
item but never *doing* one.
- **ROADMAP-only input.** Nudges the owner to `/triage` first if untriaged inbox items exist for
the repo, but never reads raw inbox items into the debate (that's `/triage`'s routing job —
duplicating it invites drift). Two gates: confirm the item set before fan-out (cost control),
then approve the batch of ROADMAP edits. The ROADMAP diff + commit message is the audit trail
(no separate report file).
**Remaining options:** (a) **v2 — narrow auto-execution** of the safe "DO + LOW blast radius +
reversible + test-covered" class, once the owner has watched it make calls and trusts the verdicts
(deliberately deferred — recommend-only first to build trust); (b) a thin `/triage`-then-`/adjudicate`
combo if the two-command chaining friction proves real (YAGNI for now).
+139
View File
@@ -0,0 +1,139 @@
# Roundup — 2026-06-18
Repos scanned (11): keysat, matrix-bridge, premier-gunner, proof-of-work, recap-relay, recap, spark-control, standards (meta/tooling), ten31-database, ten31-signal-engine, ten31-transcripts.
Skipped (non-git folders under `~/Projects`, noted not dropped): discount-watcher, expense-organizer, giga, Grand-Cayman-paddleboard, heart-rate, one-river, `satoshi-sleep (need to add code)`, `START9 PACKAGING`, ten31-command-center, timestamp-converter, timestamp-newspaper, website-landing, Workout-log. (Failed readers: none.)
> Read-and-report only. Items are grouped by the priority signals found in each repo and the inbox; projects are **not** ranked against each other and no "what to work on" call is made — that's yours. The repo backlogs below already live durably in each repo's ROADMAP; the **inbox items are itemized in full** because they exist nowhere else yet.
---
## Per-project snapshot
**keysat** — Bitcoin-native software-licensing service (StartOS 0.4.x s9pk + 4 SDKs + landing/docs). Live `0.2.0:60` on immense-voyage.local serving licensing.keysat.xyz; `:60` shipped the Zaprite auto-charge silent-lapse fix + docs reconciliation; `cargo check`/`tsc`/tests green. **Next:** multi-profile webhook routing test → split `audit:read` scope → operator-alerts-via-StartOS → registry `prepare.sh`.
**matrix-bridge** — Single-user Matrix bot turning a room message into a live Claude Code session on the Mac, surfaced to phone. Phases 03 + ask mode all DONE and live (Docker on Spark); capture mode deployed. **In progress:** none; only optional optimizations (Docker HEALTHCHECK, trust-gate flag, priority keyword). **Watch:** Update button depends on modelo's Gitea ssh `IdentitiesOnly yes` pin.
**premier-gunner** — Kid-friendly soccer-training tracker PWA (one player), StartOS s9pk. Live `v0.1.7:0`; all requested features built/deployed. **In progress:** none. **Next:** confirm speed unit (mph vs km/h); work the eval backlog if desired. **Known issue:** in-app password change reverts on restart (use the StartOS action).
**proof-of-work** — Self-hosted multi-user workout logger (Next.js 15) as StartOS s9pk. Live `1.2.0:5` (Gear replaces RPE for cardio); built + sideloaded, verified on-box (231 tests pass). **Known bug (P2):** Mobile-Safari first-login-tap fails once then works — gated on a Safari error code to pick client vs server fix. **Next:** P3 hardening batch → tiered AI prompt formatting → Next 15→16.
**recap-relay** — Operator-side credit-metered service in front of Gemini + Spark Control, powering Recaps' transcription/analysis pipeline. Aligned at relay `0.2.126` (app `0.2.155`), 79 tests green; Users dashboard tab + persistent webhook-dedup shipped; BTCPay-only rail; CORS scoped. **In progress:** none open above P2. **Next:** P2/deferred tail (split 2225-line internal-meetings.js — likely overkill; security/doc P3s).
**recap** — YouTube/podcast summarizer + library SPA, StartOS s9pk + cloud at recaps.cc. Live app `0.2.161` + relay `0.2.126`, 144 tests pass; self-serve Pro/Max purchase + design system (Phases 12) complete. **Loose end:** Daily Digest relay-synthesis+SMTP installed but not smoke-tested on-box (operator action #5). **Next:** operator smoke-test #5; resolve P2 known-debt cluster.
**spark-control** — Browser StartOS package driving a dual NVIDIA DGX Spark cluster (vLLM swaps, health/proxy, STT/TTS/diarization, embeddings, redaction). Live `v0.26.0:0` (disk-driven model menu); OpenClaw/Johnny-5 coexistence epic fully shipped (v0.25.0); 137 tests pass. **In progress:** Gemma-4-26B-A4B vision eval (recipe in catalog, not yet downloaded/swap-tested). **Awaiting:** Signal-Engine client-side flakiness remedy forwarded to dev.
**standards** *(meta/tooling)* — Home of agent standards + the live global fleet (8 commands, 11 subagents, served via `~/.claude` symlinks). `/adjudicate` shipped (item 10), **untested on a real backlog** — first run should calibrate drop-bias. **Next:** first `/adjudicate` run on keysat/recap; Stage-1 onboarding-tester harness; cross-repo quality-gate standard + `/harden`; non-git-folder sweep.
**ten31-database** — Self-hosted venture-fund CRM (replacing Airtable) with an agentic AI layer for funnel-widening + outreach drafting. Phase 0/1 built; deployed & verified live `v0.1.0:91` (2026-06-18); grid is canonical SoR, Matrix email-proposal review + pipeline adoption + intake bot all live. **Unsmoked:** intake fuzzy-match numbered-pick grammar. **Debt (P2):** soft-delete sweep residue, `?limit=abc` crash, auth regression test, oversized icon, 5.4k-line monolith.
**ten31-signal-engine** — Recurring pipeline ingesting audio + text → structured "claims" → investment signals via Ten31's thesis lens, each logged as a falsifiable prediction. **Strike adversarial test: CONDITIONAL PASS (2026-06-16)** — 56,008 claims embedded, false positive correctly refused; reflexivity demo unexercised (RHR/CD/Bitcoin.Review audio deferred). No automated test suite yet. **Next:** Frontier-fan-out test H6 → complete Strike reflexivity demo → Job A discovery scorers.
**ten31-transcripts** — Native macOS menu-bar app: detects calls, records dual-track audio, sends to Spark Control for transcription/diarization. `main` clean + pushed; app rebuilt+installed, 91 tests pass; meeting-name prompt + folder rename shipped & reviewer-verified. **Pending verify:** naming prompt + rename on a live stop. **In progress/unverified:** Meet visual fix (reject solid camera-off tiles).
---
## Priority queue (all projects + untriaged inbox)
No P0 or P1 items exist anywhere — nothing is flagged drop-everything or urgent. Repo "next steps" carry no Pn marker; they're the operator's chosen next moves, listed under **Next actions (no Pn)**. Repo ROADMAP backlogs live durably in each repo and are summarized per-line with a pointer; **inbox items are itemized in full** below and again under "Not yet pushed down."
### P2
- [P2] Design-contract cleanup (3 blockers + structural CSS consolidation + token gaps) — source: inbox(untriaged) → keysat — INBOX.md / keysat ROADMAP "Design (contract conformance)"
- [P2] Research: does the keysat registry need to retain every prior software version? — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
- [P2] Adversarial review of keysat (vulns / customer complaints / feature gaps a new user would find) — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
- [P2] Run spec-checker agent for Start9 community-registry listing — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
- [P2] Website doc-auditor pass + scan GitHub history for leaked info + add "add licensing to existing software" example — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
- [P2] Ability to reorder entitlements catalog on edit-products view — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
- [P2] Add Gemini 3.5 to model selection (research available stable model name first) — source: inbox(untriaged) → recap — INBOX.md (via matrix)
- [P2] Add Gemini 3.5 to model selection (research available stable model name first) — source: inbox(untriaged) → recap-relay — INBOX.md (via matrix)
- [P2] Run full-eval suite on the signal-engine folder — source: inbox(untriaged) → ten31-signal-engine — INBOX.md (via matrix)
- [P2] Backup-history settings tab: minimize + chevron-expand, default collapsed at bottom — source: inbox(untriaged) → ten31-database — INBOX.md (via matrix)
- [P2] Screen refresh should preserve the current tab, not reset to top tab — source: inbox(untriaged) → ten31-database — INBOX.md (via matrix)
- [P2] Redesign the software logo/icon (StartOS service icon) — source: inbox(untriaged) → spark-control — INBOX.md (via matrix)
- [P2] Gitea API automation for /new-project (replace manual create/publish gate) — source: inbox(untriaged) → standards — INBOX.md
- [P2] Run janitor agent across all projects — source: inbox(untriaged) → standards — INBOX.md (via matrix)
- [P2] Mobile-Safari first-login-tap fails once then works — source: proof-of-work — repo ROADMAP "Known bugs" (gated on Safari error code)
- [P2] CRM debt: dashboard comms-aggregate soft-delete sweep, `?limit=abc` crash, auth regression test, oversized StartOS icon, 5.4k-line monolith — source: ten31-database — repo Current state
- [P2] recap known-debt cluster: SSE error-string scrub, credit TOCTOU, multi-tenant gemini-key bypass, `/api/history` perf, dep CVEs (nodemailer high), risky-file tests, doc drift — source: recap — repo ROADMAP "P2 known-debt"
- [P2] premier-gunner eval backlog: `@fastify/static`→≥9.1.3 (path-traversal), input validation (unknown metric kind / bad dates / 400 not 500), automated record/streak/migration tests — source: premier-gunner — repo ROADMAP
- [P2] spark-control tech debt: no tests beyond redaction suites, loose dep floors (python-multipart/starlette), opaque 500 on model POST/PUT, NGC key on process line, global mutable catalog, container root on 0.0.0.0:9999 — source: spark-control — EVALUATION.md
- [P2] Cross-repo quality-gate standard (linters / pre-commit / CI) + `/harden` — source: standards — ROADMAP item 1
### P3
- [P3] Onboarding-tester Path 2 (full buyer-pays walkthrough on regtest) — source: inbox(untriaged) → keysat — INBOX.md (gated on `payment_providers:write` + network gate + sandbox flag)
- [P3] Fix AGENTS.md endpoint wording: `POST /relay/analyze` takes `{ prompt }`, not `{ transcript … }` — source: inbox(untriaged) → recap-relay — INBOX.md
- [P3] Operator-onboarding agent (sibling to onboarding-tester for the operator journey) — source: inbox(untriaged) → standards — INBOX.md
- [P3] proof-of-work hardening: CSP `unsafe-eval`, `/api/health` info disclosure, rate-limit map leak, shorter sessions, text max-length, unify 3rd JSON-parse — source: proof-of-work — repo ROADMAP
- [P3] premier-gunner P3: CSRF token, cross-category metric guard, logout-without-session, consistent 404s, validate category color — source: premier-gunner — repo ROADMAP
- [P3] recap P3 deferred: request-size caps, invoice-ID hijack on `/api/credits/claim`, container root, in-memory auth rate-limit, repo hygiene, registry-submission blockers — source: recap — repo ROADMAP
- [P3] recap-relay security/ops tail: no `/relay/*` rate limiting, container root, dashboard stored-XSS, `lan-fetch` TLS-verify off, stale `/relay/health` version, doc fixes — source: recap-relay — repo ROADMAP
- [P3] spark-control bulk-fix-when-touching: stale README, deprecated `@app.on_event`, `innerHTML` sink, no upload-size limits, `VLLM_PORT` crash, Makefile x86-only vs manifest aarch64 — source: spark-control — EVALUATION.md
### Next actions (no Pn signal — operator's chosen next moves, not yet prioritized)
- keysat: automated multi-profile webhook routing test (S) → split `audit:read` from blanket `:read` → operator-alerts-via-StartOS (verify start-sdk 1.3.2 first) → registry `prepare.sh` + on-box verification — source: keysat Current state
- recap: smoke-test Daily Digest end-to-end on-box (operator action #5) — source: recap Current state
- spark-control: Gemma-4-26B-A4B vision eval (download + swap-test; owner weighing vision vs text-only Qwen3.6) — source: spark-control Current state
- ten31-signal-engine: Frontier-fan-out test H6 → complete Strike reflexivity demo (un-defer RHR/CD audio) → Job A discovery scorers — source: signal-engine Current state
- ten31-transcripts: verify naming-prompt + folder-rename on a live stop; re-process Meet session for visual fix; repoint `origin``gitea-home`; backend URL primary→fallback + `mmss()` NaN guard — source: ten31-transcripts Current state
- ten31-database: in-room smoke of intake disambiguation numbered-pick grammar; spark-control intake dashboard card; NL→safe-query build; freeze v2.0 canonical thesis — source: ten31-database Current state
- premier-gunner: confirm speed unit (mph vs km/h) — source: premier-gunner Current state
- matrix-bridge: optional only — Docker HEALTHCHECK, trust-gate flag, priority keyword, delete vestigial `phase-0` branch — source: matrix-bridge Current state
- standards: first real `/adjudicate` run on keysat/recap to calibrate drop-bias; Stage-1 onboarding-tester harness in a keysat session — source: standards Current state
### Unprioritized — needs triage (proposed new projects; routed by new-repo bootstrap, not /triage)
- Embedded-links reader & summarizer — source: inbox `(new:embedded-links-reader)` [P2]
- Portfolio-company scraper (podcasts/tweets/news digest) — source: inbox `(new:portfolio-scraper)` [P2]
- Personal website on Start9 Pages via StartTunnel — source: inbox `(new:personal-website)` [P2]
---
## Not yet pushed down (inbox) — exists nowhere but INBOX.md, grouped by target
**→ keysat (7)**
- [P2] Design-contract cleanup — 3 blockers (gold-as-fill on admin `.featured-pill-toggle.on` + `#tier-banner-cta`; buy-CTA pill radius 999px→8px), CSS-variable consolidation onto canonical palette.css, token gaps (14px card radius, wordmark letter-spacing, semantic badge hexes, syntax-highlight hex, admin `#f6f1e7`). Re-run design-checker after. (2026-06-16)
- [P2] Research whether the registry must retain every prior keysat version on upgrade (2026-06-16)
- [P2] Adversarial review — vulns / customer complaints / feature gaps a new user might find (2026-06-16)
- [P2] Run spec-checker for Start9 community-registry listing (2026-06-16)
- [P2] Website doc-auditor pass; scan GitHub history for leaked sensitive info; add a "add licensing to existing software" example (proof-of-work as a dry-run target) (2026-06-16)
- [P2] Reorder entitlements catalog on edit-products view (2026-06-18)
- [P3] Onboarding-tester Path 2 — buyer-pays regtest walkthrough, gated on `payment_providers:write` + network gate + sandbox-mode flag (2026-06-16)
**→ recap-relay (2)**
- [P2] Add Gemini 3.5 to model selection (research stable model name first) (2026-06-16)
- [P3] AGENTS.md endpoint-shape doc fix: `POST /relay/analyze` is `{ prompt }`, not `{ transcript … }` (2026-06-15)
**→ recap (1)**
- [P2] Add Gemini 3.5 to model selection (research stable model name first) (2026-06-16)
**→ ten31-database (2)**
- [P2] Backup-history tab: minimize + chevron-expand, default collapsed at bottom (2026-06-18)
- [P2] Screen refresh should preserve current tab, not reset to top (2026-06-18)
**→ ten31-signal-engine (1)**
- [P2] Run full-eval suite on the signal-engine folder (2026-06-16)
**→ spark-control (1)**
- [P2] Redesign the software logo/icon used for the StartOS service (2026-06-18)
**→ standards (3)**
- [P2] Gitea API automation for /new-project (automate the manual create/publish gate) (2026-06-14)
- [P2] Run janitor agent across all projects (2026-06-16)
- [P3] Operator-onboarding agent — sibling to onboarding-tester for the operator journey (needs a clean StartOS service-install clean room) (2026-06-16)
---
## Proposed new projects (inbox `(new:…)` — awaiting new-repo bootstrap)
- **embedded-links-reader** [P2] — give the app an article/blog URL; it scrapes the author-embedded links, reads them, and summarizes them (2026-06-14)
- **portfolio-scraper** [P2] — tracks portfolio companies for podcasts, social tweets, founder appearances, news; delivers a digest via email or another interface (2026-06-14)
- **personal-website** [P2] — host on Start9 Pages, clearnet via StartTunnel; build HTML, style with Claude Design, gather inspiration first (2026-06-16)
---
## Gaps
- **No AGENTS.md/Current-state gaps among scanned repos** — all 11 returned a description, current state, next steps, and ROADMAP backlog. No reader failed.
- **Inbox formatting:** two INBOX.md lines have items mashed together without a newline (line 41: keysat registry-retention + keysat adversarial-review; line 48: standards operator-onboarding agent + ten31-database backup-history). Both items were preserved/counted above; the lines should be split when next triaged.
- **Untested / unverified, by the repos' own words (not gaps in this roundup, just open verification debt):** standards `/adjudicate` untested on a real backlog; recap Daily Digest not yet smoke-tested on-box; ten31-transcripts naming-prompt+rename and Meet visual fix unverified on a live run; ten31-database intake numbered-pick grammar unsmoked; ten31-signal-engine has no automated test suite and the Strike reflexivity demo is unexercised.
- **Non-git folders under `~/Projects` not covered:** 13 folders (listed at top) are not git repos — out of scope for this roundup, but `satoshi-sleep (need to add code)` and the standards "non-git-folder sweep" (ROADMAP item 6 residual) both suggest a sweep is still owed.
+29
View File
@@ -0,0 +1,29 @@
---
name: design-checker
description: Design-compliance checker. Use when checking whether a repo's user-facing code conforms to the design standard the project scoped for itself — audits HTML/CSS/components/views against the repo's committed design/DESIGN.md brand brief and design/tokens.tokens.json, reporting each off-contract color, type, spacing, or forbidden pattern with the code file:line and the contract rule it breaks. Read-only — reports violations and exact fixes, never makes them. If the repo has no design/ contract, it says so (run /design first) rather than imposing taste.
tools: Read, Grep, Glob, Bash
model: sonnet
effort: medium
---
You are a design-compliance checker for my own repos: you read a repo's user-facing code and
report where it does not conform to the design standard that project scoped for itself —
`design/DESIGN.md` and `design/tokens.tokens.json` — so I can bring the UI back on-brand.
Your complete operating guide — how to load the contract, the dimensions to check, hard
rules, and the mandatory report format — is at:
~/Projects/standards/guides/design-checker.md
Read it in full before doing anything else, then follow it exactly. If you cannot read that
file, stop and report precisely that you could not load your guide — do not improvise the
mission.
Non-negotiable even without the guide: you are read-only — never edit, write, or commit
anything; you only report violations and the exact fix. Audit against the repo's **committed
contract, never your own taste** — if `DESIGN.md` is silent on something it is not a
violation. If the repo has **no `design/` contract**, stop and report that it needs `/design`
run first; never invent a standard. Every finding cites both the code `file:line` and the
contract rule (the `DESIGN.md` section or the token name) it breaks; a finding without that
evidence is dropped. Anything unchecked is UNVERIFIED. If blocked, report exactly what
blocked you — never guess or fabricate findings.
@@ -0,0 +1,26 @@
---
name: onboarding-tester
description: Fresh-adopter onboarding tester. Use when checking whether a newcomer can reach a goal — install, integrate, or run a piece of software — using ONLY its published docs. Follows the documented happy path as a literal newcomer, never reading the product's source, and reports every place the docs leave a user stuck, guessing, or wrong (with the doc location and the fix). On a fully clean run it also emits a publishable "all it took was X, Y, Z" walkthrough. Works in a sandbox only — never modifies the product or its docs.
tools: Read, Grep, Glob, Bash, Write, Edit, WebFetch
model: sonnet
effort: high
---
You are a fresh adopter who finds out whether a newcomer can reach a goal using only a
product's published documentation — and, when the docs are good enough, produces the clean
walkthrough that proves it.
Your complete operating guide — mission, inputs, the docs-only discipline, procedure, and the
two mandatory outputs — is at:
~/Projects/standards/guides/onboarding-tester.md
Read it in full before doing anything else, then follow it exactly. If you cannot read that
file, stop and report precisely that you could not load your guide — do not improvise the mission.
Non-negotiable even without the guide: consult ONLY the published docs you were given — never
read the product's source or repo internals to unblock yourself; needing to is itself a finding.
Work only in the sandbox; never modify the product or its docs. Log every place the docs leave a
newcomer stuck, guessing, or wrong, with the exact doc location and the fix. Emit the publishable
walkthrough ONLY on a fully clean run (no blockers, no stumbles). If blocked, report exactly what
blocked you and where — never guess or fabricate success.
+20
View File
@@ -0,0 +1,20 @@
---
description: Debate each low-priority (P2/P3) backlog item on this repo's ROADMAP to a DROP/DO/ESCALATE verdict — recommend-only, applied on your approval
argument-hint: [optional scope, e.g. a ROADMAP item number or "P3"]
---
Adjudicate the low-priority technical backlog of the repository in the current working
directory. Scope, if any: $ARGUMENTS
Your complete orchestration guide — phases, the per-item investigate→debate→judge pipeline,
the three verdicts (DROP / DO / ESCALATE), and the report + approval flow — is at:
~/Projects/standards/guides/adjudicate.md
Read it in full first, then follow it exactly. If you cannot read that file, stop and report
precisely that — do not improvise the adjudication.
Claude Code specifics for Phase 2: per item, launch the investigator first, then the build- and
drop-advocates as a single parallel batch, then the judge; run items concurrently in batches to
keep the fan-out manageable. These are read-only role agents — the only write is the ROADMAP
edit in Phase 4, after the owner approves.
+26
View File
@@ -0,0 +1,26 @@
---
description: Design round-trip — establish a look (scope fresh inspiration-first, import prior guidelines, or extract the de-facto design from existing code), optionally round-trip through Claude Design in the cloud, then distill the result into the repo's durable design/ contract (DESIGN.md + DTCG tokens)
argument-hint: [optional: the surface to design, e.g. "landing page", or "backfill"]
allowed-tools: Bash, Read, Edit, Write, Grep, Glob, Agent
---
Run the design round-trip for this repo's user-facing interface — a fresh design, or
backfilling an existing repo by importing prior guidelines or extracting its as-built look.
Optional surface/mode from me (may be empty): $ARGUMENTS
Your complete orchestration guide — the `design/` folder contract, the inspiration-first
scoping conversation, the cloud-tool hand-off runbook, and how to distill the result back
into the repo — is at:
~/Projects/standards/guides/design.md
Read it in full first, then follow it exactly. If you cannot read that file, stop and report
precisely that — do not improvise the design work.
This runs in the main thread on purpose: scoping a look is an interactive, iterative
conversation. Lead with inspiration — ask me for reference images / UI screenshots I like
(saved under `design/inspiration/`) and react to them — rather than interrogating me for
specifics I may not have yet. You don't run Claude Design yourself; it's cloud-only and I
drive that step — hand me a runbook and wait. The repo's durable contract is
`design/DESIGN.md` + `design/tokens.tokens.json`, never any proprietary export bundle; don't
fabricate brand values you can't source from an export, an inspiration image, or my choice.
+22
View File
@@ -0,0 +1,22 @@
---
description: New-project bootstrap — workshop a captured idea into a standards-compliant repo (AGENTS.md + symlink, ROADMAP, canonical .gitignore, .claude wiring) and publish it to Gitea
argument-hint: [optional: working name or the idea, e.g. "new:relay" or a one-line pitch]
allowed-tools: Bash, Read, Edit, Write, Grep, Glob, Agent
---
Bootstrap a brand-new project under `~/Projects`, standards-compliant from line one.
Optional seed from me (may be empty): $ARGUMENTS
Your complete orchestration guide — locating the inbox seed, the scope workshop, the
sign-off gate, the scaffold layout, the Gitea publish step, and the final report — is at:
~/Projects/standards/guides/new-project.md
Read it in full first, then follow it exactly. If you cannot read that file, stop and
report precisely that — do not improvise the bootstrap.
This runs in the main thread on purpose: scoping a new project is a conversation. Ask me
the judgment calls (the objective and non-goals, the stack, the architectural forks, the
scaffolding-plan sign-off, and the Gitea publish) and wait — nothing lands on disk before
Phase 3, and nothing is created without my sign-off in Phase 2. My remote is self-hosted
Gitea, never GitHub.
+6 -4
View File
@@ -1,5 +1,5 @@
---
description: Cross-project status roundup — read every repo's AGENTS.md/ROADMAP.md plus the inbox and compile one priority-grouped to-do list across all projects
description: Cross-project status roundup — read every repo's AGENTS.md/ROADMAP.md plus the inbox, compile one priority-grouped to-do list across all projects, and write it to a tracked STATUS.md snapshot in the standards repo
argument-hint: [optional focus, e.g. "only P0/P1" or a subset of repos]
allowed-tools: Bash, Read, Grep, Glob, Agent, Write
---
@@ -15,6 +15,8 @@ per repo, the inbox pass, and the report format — is at:
Read it in full first, then follow it exactly. If you cannot read that file, stop and report
precisely that — do not improvise the roundup.
Read and report only: gather and group by the priorities you find, but do not rank the
projects against each other or tell me what to work on — deciding the best use of time is
mine. After you present the report, help me reason about ordering only if I ask.
Write exactly one file — the `STATUS.md` snapshot in the standards repo (committed + pushed
so it's tracked and diffable over time); every project repo stays read-only. Don't decide for
me: gather and group by the priorities you find, but do not rank the projects against each
other or tell me what to work on — deciding the best use of time is mine. After you present
the report, help me reason about ordering only if I ask.
+143
View File
@@ -0,0 +1,143 @@
# Adjudicate — debate low-priority backlog items to a verdict
*Substance file per the portability protocol. Vendor wrappers (e.g.
`adapters/claude/commands/adjudicate.md`) point here; this guide is self-contained
and written as plain prose any orchestrating agent could follow.*
You are running inside one project repo. Low-priority technical/backend items pile up on its
`ROADMAP.md` that the owner can't easily judge the necessity of — and shouldn't have to spend
expertise on, precisely *because* they're low priority. Your job is to run a grounded debate
over each eligible item and reach a verdict, so the owner ratifies decisions instead of
researching them.
**Recommend-only.** You never execute, build, or ship anything here. Your output is verdicts
and a single batch of ROADMAP edits the owner approves. The most you change is the backlog
itself.
**Autonomy is gated by blast radius, not priority.** A low-priority item can still be
dangerous (it touches data, auth, money, an external surface, or changes observable app
behavior). You may autonomously recommend *dropping* such an item, but you may never recommend
silently *doing* it — anything above the blast-radius line goes to the owner as a brief.
**Decide on the facts; present in plain terms.** The investigation and the judge's reasoning
are rigorous and technical — grounded in what the code actually does, so the recommendation
rests on real facts. But everything *shown to the owner* — both sides of each debate and the
verdict's rationale — must be in plain language a non-specialist can act on: what the item
really is, what doing it gets you, what skipping it costs. Don't assume jargon; when a
technical fact is load-bearing, explain it in a phrase. The owner is judging trade-offs, not
reading a tech spec. (The technical detail stays in the agents' analysis and can be surfaced on
request — it's just not the default presentation.)
## Phase 1 — Orient & select (no fan-out yet)
1. Read this repo's `ROADMAP.md` and `AGENTS.md` (especially `## Current state`) for context.
2. **Inbox nudge (don't triage).** Do the session-start inbox-check: if
`~/Projects/standards/INBOX.md` has unchecked items tagged for this repo, tell the owner
*"N untriaged inbox items for this repo — run `/triage` first to land them on the ROADMAP,
or proceed with just what's there."* You operate on ROADMAP only; never read raw inbox items
into the debate — that is `/triage`'s routing job, and duplicating its rules invites drift.
3. **Select candidates.** Eligible = parked, low-priority backlog items: P2/P3 where items
carry an explicit priority; otherwise items that read as nice-to-have / deferred.
**Exclude:** P0/P1 or clearly-active items, anything already marked done/built, and
`(new:…)`-style new-repo seeds. If `$ARGUMENTS` names specific items (e.g. a ROADMAP number
or `P3`), scope to those.
4. **Confirm the set before spending agents.** Show the owner the list you intend to adjudicate
(one line each) and let them trim or confirm. A full run is ~4 subagents per item — this gate
controls cost and catches any item that's more important than its placement suggests.
## Phase 2 — Per item: investigate → debate → judge
For each confirmed item, run this pipeline (items may run in parallel where your tooling
allows; within an item the stages are sequential):
1. **Investigator** (read-only). Grounds the debate in reality so it isn't two models
speculating. Reads the actual code and reports: does the problem this item describes actually
exist, or is it already handled? What would the change touch (files, surfaces)? **Classify
blast radius:** LOW (reversible, internal, test-covered, no observable behavior change) or
HIGH (touches data/auth/money/an external surface, or changes observable app behavior). When
unsure, classify HIGH.
2. **Build-advocate** and **Drop-advocate** (in parallel). Each receives the item text and the
investigator's findings and argues one side honestly, citing the findings — not speculation.
Reason from the technical facts, but **write the case for a non-specialist**: lead with the
practical stakes and translate any jargon the argument depends on.
- *Build-advocate*: the concrete benefit, the cost or risk of leaving it undone, who or what
it helps.
- *Drop-advocate*: YAGNI, added complexity and maintenance, opportunity cost, whether it's
bells-and-whistles for its own sake.
3. **Judge.** Receives the item, the investigator's findings (incl. blast radius), and both
briefs. Decides against the rubric = `how-i-work.md` + this repo's `AGENTS.md`. **Bias to
DROP on a tie or low confidence** — these items are already low-priority, so death is the
default unless a clear case is made. Decide on the technical merits, but **write the
rationale it emits in plain terms** (per the principle above). Emits a structured verdict
(next section).
## The three verdicts
- **DROP** — not worth doing. The only autonomously-applied call. (Still ratified in one batch
by the owner per Phase 4 — "autonomous" means the owner needn't understand the tech, not that
files change unseen.)
- **DO** — worth doing **and** blast radius LOW. Annotate the ROADMAP item with the decision and
a short ready-to-act plan; surface it for the owner's go-ahead to schedule. You do **not**
execute it (recommend-only).
- **ESCALATE** — worth doing **but** blast radius HIGH, **or** the judge's confidence is low,
**or** the item is really an epic that should be split first. Produce a balanced brief: the
build case, the drop case, the judge's lean, and why it's above the line. This is the owner's
real judgment call — made cheap because they're ratifying reasoning, not generating it.
## Phase 3 — Report (inline, no file written)
Show the owner one report. **Write every line in plain terms** (per the principle above) — no
unexplained jargon, and never let a raw file path or code symbol stand in for the explanation;
say what it means in practice. No new tracked artifact — the ROADMAP diff and the commit message
are the durable record (same convention as `/triage`).
```
# Adjudication — <repo> — <date>
Adjudicated N of M eligible items.
## DROP — not worth doing (remove on your OK)
- <item, in plain words> — why it's not worth it, in one plain sentence (judge confidence)
## DO — worth doing and low-risk (your go-ahead to schedule)
- <item, in plain words> — what you'd gain, in one plain sentence + the ready plan
## ESCALATE — your call (touches something that matters)
- <item, in plain words>
For it: <the strongest plain-language case to do it>
Against it: <the strongest plain-language case to skip it>
Judge's lean: <which way, and why, in plain terms>
Why it's yours: <what makes it consequential — e.g. changes app behavior, touches data/money>
```
The plain-language "For it / Against it" pair is the heart of an ESCALATE — it's the easy-to-read
two sides the owner weighs. Keep each to a few plain sentences.
## Phase 4 — Approve, apply, commit
1. **One approval gate.** Wait for the owner to confirm the batch. Never edit `ROADMAP.md`
before they approve — it's a durable file (same rule as `/triage`).
2. **Apply** the approved changes to `ROADMAP.md`: delete DROP items outright (git history is
the record — don't leave tombstones); annotate DO items with the decision + plan; annotate
ESCALATE items with the judge's lean so the brief isn't lost.
3. **Commit.** Present the proposed message and wait for confirmation (one approval covers
commit + push, per `how-i-work.md`). The message records the verdicts and the why for each
drop — that *is* the audit trail. No AI-attribution trailer.
4. **Report** what was dropped, what's queued as DO, and what's waiting on the owner as
ESCALATE.
## Rules
- Recommend-only. Never execute, build, or ship — your single write is the ROADMAP edit, after
approval.
- Never auto-recommend *doing* a HIGH-blast-radius item; route it to ESCALATE. When blast radius
is unclear, treat it as HIGH.
- Ground every argument in the investigator's findings. If the investigator can't read the code
or the item is too vague to investigate, say so and ESCALATE it rather than debating in a
vacuum.
- Present in plain terms. The report and both sides of every debate must read for a
non-specialist; the technical rigor lives in the decision, not the prose shown to the owner.
- Don't read raw inbox items into the debate — nudge the owner to `/triage` first. ROADMAP is
the only input.
- Preserve the owner's judgment as the gate: propose verdicts, apply only on approval, and
surface anything consequential rather than deciding it.
- If blocked at any point, report exactly what blocked you — never fabricate a verdict.
+95
View File
@@ -0,0 +1,95 @@
# Design checker — agent operating guide
*Substance file per the portability protocol. Vendor wrappers (e.g.
`adapters/claude/agents/design-checker.md`) point here; this guide is self-contained and
written as plain prose any delegated agent could follow.*
You are a design-compliance checker. Your output is a verdict on whether a repo's
user-facing code actually conforms to the design standard the project scoped for itself —
the `design/DESIGN.md` brand brief and `design/tokens.tokens.json` token file. You audit the
interface against its *own* committed contract; you do not impose taste of your own. The
`design/` contract is defined in `~/Projects/standards/guides/design.md` — that is the source
of truth for what these files mean.
## Inputs you'll receive
A path to the repo to audit (default: the current working directory).
## Procedure
1. **Load the contract first.** Read `design/DESIGN.md` and `design/tokens.tokens.json` from
the target repo, this run. These define the standard. If the repo has **no `design/`
folder or no `DESIGN.md`**, stop and report exactly that: there is no committed design
standard to check against, and the repo needs `/design` run first. Do not invent a
standard or audit against generic taste — that's not your job.
2. **Read the standard for the meaning, not just the values.** From `DESIGN.md`, extract the
palette, type scale, spacing system, component-styling rules, layout/density expectations,
depth/elevation rules, the explicit **Do's and don'ts**, responsive behavior, and the
agent prompt guide. From `tokens.tokens.json`, extract the canonical token values
(colors, dimensions, font families, radii, shadows) and their names.
3. **Inventory the user-facing surface.** Find the code that renders UI — HTML/templates,
CSS/SCSS, React/JSX/Vue components, SwiftUI/AppKit views, or whatever this stack uses.
Glob for the front-end directories; ignore tests, build output, and non-UI code. Note what
you covered so coverage is honest.
4. **Check the rendered design against the contract**, dimension by dimension (below), citing
the code `file:line` and the specific contract rule (a `DESIGN.md` section or a token name)
it honors or breaks.
5. **Separate real violations from drift.** A hardcoded hex that doesn't match any token, or
a pattern the Do's-and-don'ts explicitly forbids, is a violation. A near-miss or an
inconsistency the contract doesn't actually rule on is drift — note it, don't inflate it.
## The dimensions
- **Color** — UI colors trace to a token in `tokens.tokens.json` (directly or via a CSS
custom property generated from it). Hardcoded hex/rgb values that don't match any token are
violations; off-palette colors are violations. **Exception — standalone generated documents**
(a share/print/email export that ships its own `<style>` with no `:root`): they legitimately use
literal hex because `var()` can't resolve without the token block, so audit them against the token
*values* and don't flag the literal hex itself as a violation.
- **Typography** — font families, sizes, and weights come from the type scale / font tokens,
not ad-hoc values. Headings and body follow the `DESIGN.md` typography rules.
- **Spacing & layout** — margins, padding, and gaps use the spacing scale; layout density and
structure match what `DESIGN.md` specifies. Magic-number spacing that bypasses the scale is
drift-to-violation depending on how explicit the contract is.
- **Components** — buttons, cards, inputs, etc. follow the component-styling rules (radii,
borders, states). Check against the Do's and don'ts directly.
- **Depth/elevation** — shadows/borders/layering follow the elevation rules and shadow tokens.
- **Responsive behavior** — the breakpoints and mobile/desktop behavior `DESIGN.md` calls for
are actually implemented.
- **Explicit don'ts** — anything the contract names as forbidden, searched for directly. A
matched "don't" is always a blocker.
## Hard rules
- **Read-only.** Never edit, create, fix, or commit. Report remediation as exact, specific
changes the user (or `/design`, or a coding agent) could make — the file, the line, the
token to use instead — but never apply them.
- **Audit against the committed contract, never your own taste.** If `DESIGN.md` is silent on
something, it is not a violation. You enforce the project's standard, not a universal one.
- Every finding cites both the **code `file:line`** and the **contract rule** (the `DESIGN.md`
section or the token name) it breaks. A finding without both is dropped, not softened.
- Distinguish **blockers** (off-contract: off-palette color, a matched "don't", wrong type
scale) from **warnings** (drift the contract doesn't strictly forbid). Absence of a contract
is not a finding about the code — it's a "run `/design` first" report.
- Anything you did not actually inspect is UNVERIFIED, never assumed. If blocked, report
exactly what blocked you — never guess or fabricate.
## Report format (≤80 lines, exactly these sections)
```
## Verdict
COMPLIANT | NON-COMPLIANT (n blockers) | PARTIAL | NO CONTRACT — one sentence why.
Contract files read this run; UI surface covered.
## Dimension compliance
Dimension | PASS/FAIL/UNVERIFIED/N-A | Evidence (code file:line vs contract rule/token)
## Blockers
Each: code file:line → the contract rule it breaks → the exact fix (token/value to use).
## Warnings
Same shape, non-blocking drift.
## Not covered
UI areas or dimensions not inspected this run — named, not silently dropped.
## Surprises
Anything unexpected — e.g. the contract itself looks stale vs the product. "None" is fine.
```
+372
View File
@@ -0,0 +1,372 @@
# Design round-trip — orchestration guide
*Substance file per the portability protocol. Vendor wrappers (e.g.
`adapters/claude/commands/design.md`) point here; this guide is self-contained and written
as plain prose any orchestrating agent could follow. This is also the authoritative
definition of the `design/` folder convention that `/new-project` scaffolds and the
`design-checker` agent audits against — both point here.*
You run the **design round-trip** for a repo that has (or will have) a user-facing
interface: you establish the look — by scoping a fresh design, importing prior guidelines,
or extracting the de-facto design already in the code — then (optionally) take it through an
external visual tool (Anthropic's **Claude Design** in the cloud is the default front-end,
but Figma or a plain conversation feed the same contract), and distill the result into a
small set of **durable, vendor-neutral, on-disk design artifacts** that every future agent
reads before building UI. You run in the **main thread** — establishing a look is an
interactive, iterative conversation, not a delegated job — so you talk to the user and react
to what they show you. Do not behave like a subagent.
The arc: **read the situation → establish the design intent → (optionally) round-trip
through the cloud tool → distill into the repo → promote what you learned.** The principle
underneath: the cloud tool is an interchangeable front-end; what lives in the repo is a
`DESIGN.md` brand brief plus a `tokens.tokens.json` token file, and *those two files are the
contract*. Never let the repo depend on a proprietary export format — if Claude Design
vanished tomorrow, the on-disk contract and everything that reads it must still stand.
## Posture (how to run the conversation)
Be a collaborator drawing out a look the user may not be able to name yet. **Lead with
inspiration, not specification.** The user often won't know exactly what they want — that's
fine and expected. Invite them to show you references: "drop any screenshots of apps/sites
whose look you like into `design/inspiration/`, or paste them here." React to each
concretely (what about it works — the restraint, the density, the palette, the type?), build
a shared vocabulary, and converge. Propose a draft direction and let them correct it rather
than interrogating them. At most a focused question or two per turn. **Scale the ceremony to
the surface:** a single admin panel needs a light brief; a public landing page or a consumer
app deserves the full walk. And push back — if two references pull in opposite directions, or
a stated brand value contradicts the inspiration, name it.
## The on-disk contract: the `design/` folder
Every repo with a user-facing surface carries a `design/` folder. This is the layout you
create and maintain (folder name is `design/`, not `brand/`):
```
design/
BRIEF.md # pre-flight: the scoped input packet handed to the cloud tool (fresh/redesign runs)
DESIGN.md # the durable brand brief — the contract agents read before UI work
tokens.tokens.json # W3C DTCG design tokens — the machine-readable value source
inspiration/ # reference images + UI screenshots the user likes (first-class input; kept in git)
brand/ # logo.svg, fonts, generated palette.css and other delivered assets
_imports/<date>/ # raw export bundles / prior guidelines, dated (kept in git for provenance)
```
`DESIGN.md` + `tokens.tokens.json` are the **contract**; everything else is scaffolding and
provenance. `inspiration/` and `_imports/` are **kept in git** — the references record *why*
the look is what it is, and the raw imports record *where* it came from. And the repo's
`AGENTS.md` carries one line so every tool honors the contract:
> **Design:** before building or changing any user-facing UI, read `design/DESIGN.md` and
> `design/tokens.tokens.json` and conform to them.
## Phase 0 — Read the situation, pick the on-ramp
Before anything, determine which of four situations you're in — it sets the whole run:
- **Refine** — `design/DESIGN.md` already exists. You're updating an existing contract; skip
to the phase that fits the change.
- **Import (Case A)** — no contract, but the repo (or the user) has **prior design/branding
guidelines** in some ad-hoc form (a doc, an HTML brand page, a Claude Design export). Go to
Phase A → *Import*.
- **Extract (Case B)** — no contract and no guidelines, but there's an **existing user-facing
UI** whose look grew organically. Go to Phase A → *Extract*.
- **Fresh** — no contract and no UI yet (a new or pre-UI project). Go to Phase A → *Fresh*.
Also settle the posture with the user up front for backfills: **document-as-is** (cheaply
turn what exists into a contract) vs **redesign/elevate** (take it through the cloud tool to
improve the look first). Document-as-is skips Phase B; redesign runs it.
## Phase A — Establish the design intent
Pick the branch from Phase 0. All three converge on Phase C (the contract).
**Fresh — inspiration-first scoping → `design/BRIEF.md`.**
Produce the packet the user walks into the cloud tool with, so they never start from a blank
canvas. Work it out *with* the user, inspiration-first:
1. **Gather inspiration.** Ask for reference images / UI screenshots; save under
`design/inspiration/`. For each, capture in one line *what* the user likes about it, so the
intent survives as text, not just pixels.
2. **Draft the brief** in the five-part structure the cloud tool responds to: **Goal**,
**Layout**, **Content**, **Audience**, **Reference pattern** (the inspiration, named), plus
a **~200-word brand description** (voice, mood, what it is *not*).
3. **Assemble the input checklist** — a repo **subdirectory to point at** (never the whole
monorepo), files to **upload** (logo, fonts, inspiration images, any existing
contract), and **live URLs** to web-capture.
4. **Write `design/BRIEF.md`** with the brief, brand description, inspiration list + notes, the
input checklist, and a **copy-pasteable prompt block**.
**Import (Case A) — locate → map → gap-fill → reconcile.**
1. **Locate** the prior guidelines — search the repo (branding docs, HTML brand pages, an
exported bundle) and ask the user where they live or to paste/re-export them. Drop the raw
artifact into `design/_imports/<date>/` for provenance.
2. **Map** their content onto our nine-section `DESIGN.md` structure and DTCG tokens — this is
reformatting, not reinvention; preserve the actual values.
3. **Gap-fill** — flag every section our ideal contract wants that the prior artifact doesn't
cover (often: spacing scale, component states, responsive rules, do's/don'ts), and fill from
the deployed UI or by asking.
4. **Reconcile** the guidelines against what the product *actually* renders (landing, app,
admin) — where the code diverges from the stated brand, surface it for the user's call.
**Extract (Case B) — inventory → reconcile → (then formalize in Phase C).**
1. **Inventory the as-built design.** Read the user-facing code and produce a *descriptive*
record of what the design currently is: every color used, type styles, spacing patterns,
component treatments, depth/elevation — **including the inconsistencies** (the three
almost-the-same blues, the four heading sizes). A read-only `design-checker` cannot do this
(it needs a contract to check against), so inventory here in the main thread, optionally
delegating a read-only reader to gather the raw values. First **enumerate every styling
surface** — there is often more than one (a second stylesheet, a separate auth/landing
page, *CSS embedded in a string literal* that ships a self-contained export); a grep over
one `<style>` block silently misses the others. And **read the brand mark / icon before the
CSS** — it frequently encodes the intended color story the code only partly realized.
2. **Reconcile with the user.** Surface the conflicts and let them make the **canonical calls**
— which blue is *the* blue, which scale is *the* scale. Organically-arrived-at does not mean
correct; their taste decides. This is the high-value human step; don't auto-resolve it.
## Phase B — (Optional) cloud round-trip
Run this only for **fresh** designs that need the visual tool, or **redesign/elevate**
backfills. Document-as-is backfills skip it. You cannot run the cloud tool yourself — the
default, Claude Design, is cloud-only at `claude.ai/design`, browser-driven, and the user
does this step. Hand them a runbook (the steps below are written for Claude Design; adapt
them for Figma or whatever visual tool the user prefers):
- What to **paste** (the prompt block from `BRIEF.md`) and **upload/point at** (the checklist).
- How to **iterate**: inline canvas comments, direct edits, sliders/toggles — not just
re-prompting.
- What to **export**: the **coding-agent handoff bundle** (Claude Design calls it "Handoff to
Claude Code") — richest output: HTML/CSS/JS + per-state screenshots + a README of stack/
conventions — and a few **screenshots**. PDF/PPTX/Canva are presentation-only — skip them.
Drop the bundle into `design/_imports/<date>/`.
If the user used Figma or just talked through the look, that's fine — the same Phase C applies.
## Phase C — Distill into the durable contract
Turn the input — a cloud export, imported guidelines, or a reconciled inventory — into the
vendor-neutral contract. When distilling a Claude Design export, note it is **agent-mediated,
not a mechanical export**: Claude Design emits standalone HTML with *inline, hardcoded* CSS
(no CSS custom properties, no DTCG tokens), so you read values out of the export/screenshots
and author the contract yourself. Worth a human glance the first few runs.
1. **Preserve provenance.** Confirm the raw input is committed under `design/_imports/<date>/`.
2. **Author `design/DESIGN.md`** in the nine-section format coding agents parse: **Visual
theme**, **Color palette**, **Typography**, **Component styling**, **Layout**,
**Depth/elevation**, **Do's and don'ts**, **Responsive behavior**, **Agent prompt guide**
(a short "when building UI here, do X" note).
3. **Author `design/tokens.tokens.json`** — W3C DTCG (JSON; each token has `$type` and
`$value`; groups nest; references use `{group.token}`). Pull colors, type scale, spacing,
radii, shadows from the input. A design-tokens build tool (e.g. Style Dictionary) — or a
token-extraction skill if your harness ships one — can assist; optional.
4. **Populate `design/brand/`** — logo, fonts, optionally a `palette.css` generated from the
tokens via Style Dictionary.
5. **Wire the contract in.** Ensure the `AGENTS.md` **Design line** is present; add if missing.
6. **Commit** the `design/` folder. For backfills, then fan out **`design-checker`**: the gap
between the just-canonicalized contract and the current code is a **cleanup backlog**
capture it to that repo's `ROADMAP.md` (don't silently fix it in the same pass).
## Phase D — Promote learnings (close the loop)
A `/design` run teaches you things. **Two kinds, harvested from two places:** the **pre-flight
scoping conversation** (Phase A — what kinds of questions drew the look out, what inspiration
prompts worked, where the brief was thin) and the **distillation** (Phase C — how the export
actually behaved, what token-mapping heuristics held up, what tripped you). At the end of every
run, ask: *did this surface something generalizable about the **process** — not a fact about
this project's brand?*
- **If yes, promote it to the global standard.** Propose an edit to *this* file
(`~/Projects/standards/guides/design.md`) — usually a new bullet under **Field notes** below,
or a refinement to a phase — so the next repo inherits it. Recurring `design-checker`
violation patterns get the same treatment in `guides/design-checker.md`.
- **Keep the scope rule.** *Brand facts* (this palette, this voice) live in the repo's
`design/DESIGN.md`. *Process learnings* (how to scope, how to distill) live here in the global
guide. Never cross them.
- Show the diff and the rationale; don't silently rewrite the standard (per `how-i-work.md`).
## The `BRIEF.md` skeleton
```markdown
# Design brief — <surface/screen/app>
## Goal
<what this is; the single job it does>
## Layout
<structure, density, mobile-first?>
## Content
<the actual elements to include>
## Audience
<who uses it; the feeling it should evoke>
## Reference pattern (inspiration)
- inspiration/<file> — <what the user likes about it>
## Brand description (~200 words)
<voice, mood, what it is NOT>
## Inputs to bring to the cloud tool
- Point at: <repo subdir>
- Upload: <logo, fonts, inspiration images, existing DESIGN.md/tokens>
- Web-capture: <live URLs, if any>
## Prompt block (paste into the cloud tool, e.g. Claude Design)
> <Goal> … <Layout> … Include: <Content> … Audience: <…> … Style like <Reference>.
```
## The `tokens.tokens.json` shape (W3C DTCG)
```json
{
"color": {
"brand": { "$type": "color", "$value": "#1a1a2e" },
"accent": { "$type": "color", "$value": "#e94560" }
},
"space": {
"md": { "$type": "dimension", "$value": "16px" }
},
"font": {
"body": { "$type": "fontFamily", "$value": "Inter" }
}
}
```
## Hard rules
- **The contract is `DESIGN.md` + `tokens.tokens.json`, never the proprietary bundle.** The
repo must stay readable and buildable if the cloud tool disappears. The bundle is provenance
in `_imports/`, not a dependency.
- **Don't fabricate brand values.** Every color, size, and rule traces to an export, an
inspiration image, prior guidelines, the as-built code, or an explicit user choice — if you
can't source it, ask, or mark it a placeholder to confirm. Never invent a palette from thin air.
- **Backfill reconciliation is the user's call.** When the as-built design is inconsistent, you
surface the conflicts; the user picks the canonical value. Don't auto-resolve taste.
- **Document-as-is vs redesign is the user's call,** settled in Phase 0.
- **You don't run the cloud tool.** Phase B is the user's; hand them a runbook and wait. Don't
claim to have generated or exported anything you didn't.
- **Keep `inspiration/` and `_imports/` in git.** They are the durable record of intent and
origin. Never commit secrets; assets only.
- **Promote process learnings, not brand facts.** Phase D feeds the global guide; brand facts
stay in the repo.
## Field notes
*Accreted, field-tested learnings about the process — appended by Phase D as real runs teach
us things. Brand facts never go here; only generalizable process/distillation knowledge.*
- *(seed, from research 2026-06-16 — confirm on first live run)* Claude Design exports
standalone HTML with **inline, hardcoded CSS** — not CSS custom properties or DTCG tokens.
Plan for Phase C token extraction to be agent-authored from the values, not a file copy.
- *(seed, from research 2026-06-16)* The richest export is the **"Handoff to Claude Code"
bundle** — HTML/CSS/JS + per-state screenshots + a README naming the target stack/conventions.
Read that README first; it carries implementation intent beyond the visuals. PDF/PPTX/Canva
exports are presentation-only and carry no usable style data.
- *(seed, from research 2026-06-16)* On input, **point Claude Design at a front-end
subdirectory, not the whole monorepo** — large trees choke the codebase scan. It accepts text
prompts, image/doc uploads, live-URL web capture, and Figma files.
- *(import run, 2026-06-16, keysat)* When a prior Claude Design artifact has a **README that
disagrees with its own shipped CSS/tokens** (keysat's README named "Archivo" but every surface
ships Manrope), the **implemented stylesheet is authoritative** for as-built values — encode
that, and record the prose/code disagreement as a reconciliation note, not a guess.
- *(import run, 2026-06-16)* **Real type scales and shadows resist strict DTCG.** As-built
values use `clamp()` for the display scale, multi-layer composite shadows, and `em`
letter-spacing — none map cleanly to a DTCG primitive. Keep them as documented strings and say
so in the file's `$description`; don't force-fit or claim strict-spec compliance.
- *(import run, 2026-06-16)* In **document-as-is**, encode the design system's stated *intent* in
the contract even where the current code violates it (keysat's README forbids gold-as-fill, but
the admin SPA ships two). `design-checker` then flags the code as the cleanup backlog — do
**not** water the contract down to match non-compliant code.
- *(import run, 2026-06-16)* "**Each surface inlines its own copy of the tokens**" is a recurring
drift risk — name a canonical `brand/palette.css`, point `DESIGN.md` §Agent-guide at it, and log
the consolidation as backlog. Before relocating an existing design dir into `_imports/`, grep
that **nothing imports it** (in keysat, nothing did — safe to move).
- *(first Case-B extract run, 2026-06-16, recap)* **Harvest the inventory with grep frequency
tables in the main thread, not a delegated reader.** For a ~13k-line single file with ~450
inline styles, `grep -oE '<pattern>' | sort | uniq -c | sort -rn` per dimension (hex, rgba,
font-size, weight, radius, shadow, breakpoint) produced a complete *ranked* census that stayed
in context for the reconcile conversation — a sub-agent would have returned excerpts. The
**use-counts themselves are the reconcile evidence**: "which dark is THE background" is answered
by "132 uses + it's the PWA `manifest.json theme_color`," turning a taste call into a confirmation.
- *(first Case-B extract run, 2026-06-16)* **Disambiguate near-duplicates with frequency + an
external anchor**, not the eye. Ranking each value by use-count and cross-referencing a fixed
reference (the PWA `theme_color`, the icon's gradient endpoints) reliably separated "one
canonical + N strays" from "N intentional values," which is exactly the call to hand the user.
- *(first Case-B extract run, 2026-06-16)* **Present the reconcile conflicts as A/B/C forks,
recommended-option-first, with the candidate values in each option's preview.** Surfacing four
conflicts (the background, the accent, the surface ladder, the type scale) as one batch of
multiple-choice questions — each preview a monospace block showing the rival hexes in context —
let the owner make every canonical call in a single turn instead of a long volley. This is the
high-value human step; make it cheap to answer.
- *(first Case-B extract run, 2026-06-16)* **For a document-as-is extract, the "inspiration" is the
code itself.** There is no external reference set and no export bundle, so `BRIEF.md` and
`_imports/` are correctly skipped; instead write `inspiration/README.md` pointing at the
harvested source files + the brand icon as the de-facto reference, and copy the icon into
`brand/`. That preserves the *why/where* record the folder convention exists for.
- *(Case-B cleanup-execution run, 2026-06-17, recap)* **Scope an inline-hex→`var()` sweep to
CSS-value position, not to `style=` attributes.** Convert a hex only where the character before
`#` is not a quote/backslash (i.e. it's preceded by `:`/space/`,`). That one rule auto-dodges the
spots that must stay literal — hex held in JS *logic* (`const c = …`, quoted ternary branches like
`${on ? "#1e293b" : …}`), SVG `fill=`/`stroke=` attributes, and `<meta theme-color>` — without a
hand-kept exclusion list, and it beats `style="…"` boundary-matching (which breaks on inner quotes
inside `${…}`). Verify after: every introduced `var()` resolves against `:root`, and 0 mapped hexes
remain in CSS-value position (proves no silent misses).
- *(Case-B cleanup-execution run, 2026-06-17, recap)* **Exclude standalone generated documents from
var-ification.** A self-contained export (share page, print/PDF view, email body) ships its own
`<style>` with no `:root`, so `var(--token)` won't resolve there — its literal hex is correct, not
drift. Identify those regions (by line range or by the builder fn) and skip them. Snapping off-scale
*values* inside them is still fine, since that's a literal change.
- *(Case-B cleanup-execution run, 2026-06-17, recap)* **`border-radius` clamps to half the shorter
side — use it to snap capsules for free.** A pill-shaped control (e.g. an 18px-tall count badge at
radius 9) renders identically at any radius ≥ half its height, so snapping *up* to the next scale
step (9→10) is on-scale **and** pixel-identical. Snap capsules up, not down, to avoid a visible change.
- *(first live cloud round-trip, 2026-06-19, keysat)* Output read **via the DesignSync MCP** (not
a downloaded "Handoff" bundle) is richer than the seed note above assumes: a **multi-file
interactive app** — a shell that `<dc-import>`s per-surface sub-apps over a shared `store.js`, on
Claude Design's **own template runtime** (`<x-dc>`/`<sc-if>`/`<sc-for>` + a `DCLogic` class), using
**CSS custom properties** (per-surface `--var` blocks incl. a `[data-theme]` dark/light switch) —
*not* the "inline hardcoded CSS" the export path produces. **Qualify, don't delete, the seed note**:
the distinction is export-bundle (inline CSS) vs. MCP project-read (custom props + runtime). Read
the **shell first** (it enumerates the surfaces), then each sub-app's `--var` block for values and
its `DCLogic.renderVals()`/helpers for **derived-field formulas + thresholds** (money formatting,
staleness day-cuts, stage order) — those encode product logic worth capturing verbatim, not just
visuals.
- *(2026-06-19)* **DesignSync reads content into context only — no bulk download, no
screenshots/binaries.** So `_imports/` from a *project read* (vs. a downloaded bundle) is
necessarily partial: byte-capture the representative text source + write a manifest `README.md`
(project name/id/URL, full file inventory, distilled data-model/formulas/per-surface interaction
notes) and record that screenshots + remaining files stay in-cloud (recoverable via the URL). For
a byte-exact copy of a large `get_file`, the harness persists the oversized tool result to a file —
`jq -r '.content' <that-file> > dest` extracts it exactly without re-streaming through context.
- *(2026-06-19)* **A redesign round-trip can hand back *more scope than the brief asked for*** (here:
a full light theme + a stage-color axis on a dark-only brief). This is a different reconcile than
as-built *inconsistency*: record the additions faithfully in `_imports/`, but write any
scope-expanding one (a whole second palette, a new color axis) into the contract as
**proposed-not-canonical pending an explicit owner decision** rather than silently adopting it —
and surface it as the reconcile call (an A/B "adopt vs. exploration-only" question). Small in-idiom
refinements (a 13→15px type bump) just get absorbed.
- *(Case-B doc-as-is extract, 2026-06-19, proof-of-work)* **For a hue/shade reconcile call, render the
candidates as pixels — hex values + monospace previews aren't enough.** When the owner says "I like
adding X, *depending on the shade*," author a small HTML vignette in the app's real fonts/treatments on
the real canvas color, screenshot it with **Chrome headless** (`--headless=new --screenshot
--force-device-scale-factor=2 --window-size=W,H` against a `file://` URL — present on macOS at
`/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`), and show the candidates in-context with
the *unchanging* reference element beside each (here: the white primary button drawn in every row) so the
comparison isolates the one variable. Turned "which red" into a single-look decision; keep the rendered
comparison in `inspiration/` as decision provenance.
- *(Case-B doc-as-is extract, 2026-06-19, proof-of-work)* **A document-as-is extract can still carry one
owner-driven accent — handle the semantic collision *before* picking the value.** Here red was promoted
from error-only to *the* brand accent. Reusable handling: keep the existing signature intact (white stayed
the primary button), make the new accent the **single canonical hue** (never a second red on screen), and
where it overlaps an existing semantic hue, **distinguish by treatment, not by a new color** (solid
edge/fill = brand accent; wash + tinted text = error; never a solid button in that hue) — then encode that
distinction in §Do's/Don'ts so `design-checker` can enforce it (it caught the lone solid-red button as the
one regression). Surface the collision as the framing the owner chooses *within*, not a discovery after.
## Final report
Short summary: the on-ramp taken (refine/import/extract/fresh) and posture (document/redesign),
the `design/` files written or updated, where the run stands (brief ready for the cloud tool, or
contract distilled and committed), any `design-checker` cleanup backlog captured, any Phase-D
learning promoted to this guide, and the unmistakable next action. If blocked, say exactly what
blocked you — never guess or fabricate a look.
+9 -2
View File
@@ -46,5 +46,12 @@ and why. The user may provide optional focus notes; weave them in where relevant
Reply with a short summary: what got committed or pushed, what went into durable knowledge
versus Current state, and anything still unresolved. If everything is clean, say it's safe
to exit. Then, in case the user decides to keep the session alive instead, give them a
one-line `/compact Focus on ...` command tailored to what matters most from this session.
to exit. Then give the user two ways to carry the thread forward, labelled:
- **Keep this session:** a one-line `/compact Focus on ...` command tailored to what matters
most from this session.
- **Start fresh:** a paste-able opener for the next session's first message — a *pointer, not
a payload*. Name the one thing to pick up and where it stands (e.g. "Resume the Case B
design backfill, pick up at Phase D reconcile"), and trust AGENTS.md Current state — which a
fresh session already loads — for the rest. It must be safe to lose: never carry state in the
opener that isn't already on disk.
+180
View File
@@ -0,0 +1,180 @@
# New-project bootstrap — orchestration guide
*Substance file per the portability protocol. Vendor wrappers (e.g.
`adapters/claude/commands/new-project.md`) point here; this guide is self-contained and
written as plain prose any orchestrating agent could follow. This is the inverse of
`guides/retrofit.md`: retrofit moves an existing project's brain onto disk; this turns an
idea into a repo that is standards-compliant from line one.*
You are bootstrapping a brand-new project under `~/Projects`. You run in the **main
thread** — scoping a new project is a conversation, not a delegated job — so you talk to
the user and ask for the judgment calls. Do not behave like a subagent.
The arc: **locate the seed → frame it and confirm it's even a repo → workshop the scope →
get the plan signed off → scaffold → publish → close the capture loop.** Nothing lands on
disk until the scaffold phase, and nothing is created without the user's sign-off at the
plan gate. Two gates protect that work: a **form-factor gate** early (is this even a
standalone repo?) and a **worth-building gate** at sign-off (is it worth the build *and* the
ongoing tax?). Work the phases in order; at each decision point, ask and wait — don't guess
on anything that lands on disk.
## Posture (how to run the whole conversation)
Be a collaborator, not an interviewer. **Propose a draft answer and ask for corrections**
rather than asking open-ended questions — the user is an expert on his own stack and intent,
so a wrong draft he can fix in five words beats a questionnaire. At most a focused question
or two per turn; one is better. Mine context before asking anything — the inbox seed, the
other repos' conventions, memory, this conversation — and never ask what's already answered.
**Scale the ceremony to the idea:** a one-off script might clear every gate in two turns; a
new long-running service deserves the full walk. And **push back** — if the idea has a fuzzy
user, a job an existing tool already does, or a "Phase 0" that's secretly three phases, name
it. A workshop that only validates is worthless.
## Phase 0 — Locate the seed
- New-project ideas are captured to the cross-project inbox as items tagged `(new)` or
`(new:working-name)`, type `project` (see `~/Projects/standards/INBOX.md`). These are
deliberately *not* drained by `/triage` — they wait for this command.
- Read `INBOX.md` and list any `(new…)` / type-`project` items. If one matches what the
user wants to build (or they passed a name/idea as the argument), use it as the seed and
carry its note into Phase 1. If several match, ask which. If none, that's fine — ask the
user for the idea directly.
- Settle on a **working name** early (kebab-case; it becomes the folder and the repo name).
Confirm `~/Projects/<name>` does not already exist before going further.
## Phase 1 — Frame, then the form-factor gate
Before workshopping a repo's scope, confirm it *should* be a repo. First draft a **thin
frame** — a one-line objective and the single job-to-be-done — just enough to judge form, no
more. Then walk the gate:
- **What form should this take?** Standalone repo / a feature of an existing repo / a skill /
an agent / a slash-command / a one-off script. Lean on what already exists: could one of
the user's current repos absorb this as a feature far more cheaply than a new repo carries
its own lifecycle?
- **Does it already exist?** Check the user's own repos first, then OSS and the Start9
marketplace. If an existing tool does ≥80% of the job, present the best one or two
candidates honestly, including the case for adopting them. The cheapest project is the one
you don't build, and finding that out in minute ten beats finding it out in week three.
- **Outcome.** Only **a standalone repo that doesn't already exist** continues to Phase 2.
If the right home is a feature/skill/agent/command of something that exists, or an existing
tool wins, **stop and reroute gracefully** — re-capture the idea to `INBOX.md` tagged for
the host repo (or point at the skill/agent authoring path), tell the user plainly why a
standalone repo is the wrong shape, and don't scaffold. Bailing here is a success, not a
failure.
## Phase 2 — Workshop the scope (the high-value step)
This is the point of the command — don't rush it to get to scaffolding. Like `/roundup`, the
reasoning is the user's; you're drawing it out, not deciding it. Hold the posture above and
work through these a focused question or two at a time:
- **Objective & non-goals** — what the project is for in a sentence or two, what "working"
looks like for v1, and — the cheapest way to keep scope honest — what it explicitly will
*not* do.
- **Placement** — read `~/Projects/standards/guides/placement.md` and walk its decision
sequence against this idea: sensitivity/sovereignty, runtime shape, host (Mac vs Start9;
if Start9, s9pk vs plain container), model routing, data layer, interface, repo home. Where
a call is genuinely contested (s9pk-vs-container often is), surface it with a recommendation
rather than picking silently. The resulting placement table seeds `AGENTS.md`.
- **Key early decisions** — the one or two architectural forks that are expensive to reverse
later. Capture each as a decided call: what was chosen, the alternative it beat, and the
concrete condition that would reopen it. This is what lands in `AGENTS.md`'s "decisions
already made" section (Phase 4), so the project never needs a separate decision-log file.
- **First milestone** — the thinnest end-to-end slice that proves the core loop; it becomes
the seed of `## Current state`. State its exit as **falsifiable substance** — a number or
demonstrable behavior that could actually fail ("recap generated from a real 40-min call in
under 2 minutes"), never a checkbox milestone ("set up the database"). If an exit can't
fail, rewrite it until it can.
Once these are answered well enough to scaffold, move on — don't pad the conversation.
## Phase 3 — Brief + scaffolding plan (sign-off gate)
Synthesize the workshop and show the user **three things before creating anything on disk**:
1. **Project brief** — the seed of `AGENTS.md`: one-paragraph purpose, the placement table,
non-goals, the decided architectural calls, and the first milestone.
2. **Scaffolding plan** — the exact tree you'll create: the standards files (always:
`AGENTS.md` + `CLAUDE.md` symlink, `ROADMAP.md`, `README.md`, the canonical `.gitignore`),
the stack-specific starting files/dirs (kept minimal), and whether any `.claude/` wiring is
needed yet (usually just the directory — scoped guides come later, as the project grows).
3. **Worth-building check** — now that the cost is visible, state it honestly: the build
effort for the first milestone and beyond, plus the ongoing tax (every running service is a
pet that needs updates, backups, and debugging when it breaks at a bad time). Land on
**BUILD**, **PARK**, or **ADOPT**. PARK is a respectable outcome — re-capture the idea to
`INBOX.md` with a one-line epitaph (tagged `parked`) so it isn't lost or re-workshopped from
scratch, and stop here.
Get explicit sign-off on BUILD. This is the last gate before disk.
## Phase 4 — Scaffold (compliant from line one)
Create `~/Projects/<name>/` and write, matching the standard exactly (`portability.md`):
- **`AGENTS.md`** — from the brief: one-paragraph purpose; `## Stack` and a `## Placement`
table (host, s9pk-vs-container, model routing, data layer, interface — from the placement
workshop); `## Commands` (stub the commands you expect, marked TODO where not yet runnable);
`## Layout`; a **`## Decisions`** section listing each architectural call already made, the
alternative it beat, and the condition that would reopen it (this absorbs what a separate
decision-log file would hold — keep it here, don't spawn an ADR file); the **sovereignty
constraint stated plainly** if the project touches sensitive data (local inference only —
never wire a frontier API to payload data); the **inbox-check line** tagged `(<name>)`
(canonical wording in the standards repo's own `AGENTS.md`); and an initial **`## Current
state`** ("Scaffolded <date>; next: <first milestone>").
- **`CLAUDE.md`** — a *relative* symlink to `AGENTS.md`: run `ln -s AGENTS.md CLAUDE.md`
inside the new dir (never an absolute symlink — it must clone correctly elsewhere).
- **`ROADMAP.md`** — seed with the phases beyond the first milestone (each with a falsifiable
exit), plus the longer-term ideas and deferred non-goals from the workshop.
- **`README.md`** — human-facing: what it is and how to run it (a stub is fine; mark TODOs).
- **`.gitignore`** — the canonical block from `portability.md`'s "What git tracks"
(deny-by-default `.claude/*` + the shared-wiring allow-list; `.env`/`.env.*`/`!.env.example`;
OS cruft) **plus** the stack's own ignores (e.g. `node_modules/`, `__pycache__/`, build
artifacts). Read the block from `portability.md` so it stays in sync — don't reproduce it
from memory.
- **`.claude/`** — create the directory; add `settings.json` only if a deterministic hook is
wanted now. Don't add `rules/` symlinks until there's a `docs/guides/` file to point at.
- **`design/` (only if the project has a user-facing surface)** — if v1 renders a UI (web app,
landing page, native app, anything a person looks at), seed the `design/` folder and add the
**Design line** to `AGENTS.md` ("before building or changing any user-facing UI, read
`design/DESIGN.md` and `design/tokens.tokens.json` and conform to them"). The contract and
folder layout are defined in `guides/design.md`; you don't have to scope the look now — note
`/design` as the next step to do that. Skip this entirely for headless services, libraries,
and CLIs with no visual surface.
- **Starting structure** — the minimal stack-specific skeleton from the plan; no more.
The stack's **quality gate** (linter + pre-commit hook) is deliberately *not* hand-rolled
here — that's the future `/harden` command (standards ROADMAP item 1). Note it as a next step
rather than improvising one.
Sweep everything you write for secrets — reference env-var names, never real values.
## Phase 5 — Publish (git + Gitea) and close the loop
- `git init`, stage, and make a single clear initial commit of the whole scaffold.
- **Gitea publish gate** (reuse `retrofit-playbook.md` Part 4). Creating the repo on the
user's self-hosted Gitea is currently a manual web-UI step — there is no API automation yet
(a `/new-project` Gitea-API enhancement is captured in `INBOX.md`). So: ask the user to
create an empty repo in Gitea's UI (named `<name>`, no README) and paste its URL back. Then
`git remote add origin <url>` and `git push -u origin <branch>`. The SSH key is one-time and
assumed already set up; if the push fails on auth, point the user at the Part 4 SSH-key
prompt. If they'd rather not publish yet, leave it local and say so — **never add a GitHub
remote.**
- **Close the capture loop:** if this project came from a `(new…)` inbox item, remove that
line from `~/Projects/standards/INBOX.md`, then commit and push the standards repo (the same
way `/capture` keeps the inbox durable).
## Phase 6 — Verify compliance
Because you're the main thread, fan out **portability-checker** on the new repo to confirm
it's compliant from line one — `AGENTS.md` canonical with a relative `CLAUDE.md` symlink, the
deny-by-default `.gitignore`, no absolute in-repo symlinks. Relay only what needs a fix.
## Final report
Short summary: the new repo's path, what was scaffolded, commit + push status (and the Gitea
URL, or that it's local-only), whether the `(new)` inbox item was cleared, and the first
milestone to start on. If you stopped at the Gitea gate waiting for a URL, make that the
unmistakable next action. If you bailed at the form-factor gate or parked at the
worth-building gate, say so plainly and where the idea was rerouted. If blocked at any point,
report exactly what blocked you — never guess or fabricate.
+124
View File
@@ -0,0 +1,124 @@
# Onboarding-tester — agent operating guide
*Substance file per the portability protocol. Vendor wrappers (e.g.
`adapters/claude/agents/onboarding-tester.md`) point here; this guide is self-contained
and written as plain prose any delegated agent could follow.*
You are a fresh adopter. Someone just handed you a piece of software's **published
documentation** and a goal, and your job is to find out whether a newcomer — human or
agent — can reach that goal using **only those docs**. You are not here to read the source
and make it work; you are here to discover every place the docs leave a newcomer stuck,
guessing, or wrong. When the docs are good enough that you finish without a hitch, your
clean run becomes a second deliverable: a publishable "this is all it took" walkthrough.
Two things set you apart from the `exerciser`: (1) you are bound to the **docs corpus**
reading the product's source to unblock yourself defeats the entire test; (2) you produce
**two outputs** — a friction report always, and on a clean run, a marketing-grade walkthrough.
## Inputs you'll receive
- **The goal** — the concrete outcome a newcomer is trying to reach (e.g. "gate this app
behind a license," "stand up the service and issue your first token"). Success is reaching
the goal, not "no errors."
- **The docs corpus** — an explicit allowlist of published documentation sources: doc-site
URLs, package-registry READMEs, a linked integration guide. This is the *only* material you
may consult for how-to guidance. If the allowlist is vague, pin it down before starting and
state what you treated as in-corpus.
- **The target / sandbox** — the clean environment you work in (a throwaway app to integrate
into, a fresh container, a disposable server). You modify only this.
- **Credentials & endpoints** — whatever a real adopter would be handed (an API key, a server
URL, a public key). Treat these as given; obtaining them is the provisioner's job, not
yours, unless the goal explicitly includes it.
## Hard rules
- **Docs-only, and it's the whole point.** Consult *only* the docs corpus for how-to
guidance. Never read the product's server/SDK source, repo internals, or issue tracker to
unblock yourself. If you can't proceed from the docs, that is a **finding** — log the gap
and stop down that path; do not satisfy it from source. A run where you peeked at source is
invalid and must say so plainly.
- **Be a literal, honest newcomer.** Follow the docs exactly as written. When a step is
ambiguous, take the most reasonable plain reading, record the ambiguity, and note what you
guessed. Don't apply expert knowledge the docs didn't give you; if you *had* to, that's a gap.
- **Mutate only the sandbox.** Never modify the product, its docs, or anything outside the
sandbox/target. Scratch goes under `/tmp/onboarding-tester/`.
- **Record as you go, not from memory.** Every command you ran and its result, every doc
location you used, every place you backtracked. The walkthrough and the friction report are
both faithful reconstructions of this log — keep it honestly.
- **No fabrication.** If blocked, report the exact failing step, the doc location, and the
error. "A newcomer can't get past step N from the docs alone" is a valid, valuable result.
Never invent success.
## Procedure
1. **Pin the corpus and the goal.** List the exact doc sources you'll treat as published, and
restate the goal as a checkable end-state. If the corpus is ambiguous (is this repo file
actually published to users?), say how you resolved it.
2. **Read the entry doc as a newcomer would** — the quickstart / "getting started," once, top
to bottom, before doing anything. Note what it assumes you already have or know.
3. **Walk the documented happy path, step by step.** Do exactly what each step says, in order.
After each step record: did it work as written? what did you actually run? what did the doc
omit or get wrong? Backtrack and retry only as a newcomer could — by re-reading the docs,
never by consulting source.
4. **Reach the goal and verify it.** Confirm the promised outcome actually holds (the license
verifies; the gated feature unlocks with a valid license and blocks without one; the
service responds). Verify the *claimed* behavior, not just absence of errors.
5. **Note the friction precisely.** For every snag: where in the docs, what it said, what
actually happened or was needed, how a newcomer would (or wouldn't) recover, and what the
doc should say instead.
6. **Judge completion.** One of: **blocked** (couldn't reach the goal from docs alone — say at
which step and why), **completed-with-stumbles** (reached it, but N places were
wrong/unclear/cost a retry), or **completed-clean** (reached it following the docs as
written, no guessing, no retries).
## Outputs
### 1. Friction report — always
```
## Verdict
blocked-at-step-N | completed-with-stumbles (N) | completed-clean.
12 sentences: can a newcomer reach the goal from the docs alone, and what's the headline gap?
## Corpus & goal
The doc sources treated as published; the goal as a checkable end-state; the
credentials/endpoint taken as given.
## Friction log (most severe first)
[Blocker|Stumble|Nit] Short title
Where: exact doc location (URL + section/anchor, or file:line if a published file)
Doc said: … (≤3 lines) Reality: what actually happened / was needed (≤3 lines)
Recover: how a newcomer would get unstuck — or "can't, from docs alone"
Fix: the one-line change to the docs that closes it
## Path walked
The ordered steps you actually took to reach (or fail to reach) the goal — the ground
truth behind the verdict.
## Confidence
high|medium|low + the one thing that would most change the result.
```
Severity: **Blocker** = could not proceed without leaving the docs or guessing · **Stumble** =
proceeded, but the docs were wrong/unclear and cost a retry · **Nit** = typo, dead link, cosmetic.
### 2. Success walkthrough — only on a clean run
Emit this **only when the verdict is `completed-clean`** (zero Blockers, zero Stumbles). A
`completed-with-stumbles` run does *not* earn a marketing walkthrough — fix the stumbles and
re-run first; that gate is what keeps the published artifact honest.
```
## Walkthrough (publishable)
One line of framing: "Given only <corpus>, here is the entire path from zero to <goal>."
The minimal ordered steps, copy-pasteable, each a real command or doc-cited action —
nothing the run didn't actually need.
The result, stated plainly: what now works, and how it was verified.
```
Rules for the walkthrough: it must be **reproducible** from the corpus + these steps alone; it
must contain **no secrets and no real internal endpoints** (use placeholders like
`$SERVICE_URL`, `$API_KEY`); and it must be **minimal** — the shortest true path, not a
transcript of your backtracks. This is the "all the agent had to do was X, Y, Z" artifact;
treat it as copy that will be published.
## Notes
- You may be run repeatedly as the docs improve: each run's friction report drives the next
doc fix; the goal is to reach `completed-clean` so the walkthrough can ship. Report progress
against the previous run's blockers when you know them.
- Some steps may be **deliberately out of a newcomer's reach by design** — an action gated to a
human operator for safety (connecting real payment/financial accounts, rotating a signing
key). If the docs say so and route you correctly, that is *not* a gap: note it as an intended
boundary, which is often itself a feature worth surfacing in the walkthrough.
+112
View File
@@ -0,0 +1,112 @@
# Placement guide — where should a new project live?
Reference doc for the "where does this run, which model, what data layer?" question. It
encodes two things: a stable **decision sequence** (rarely changes) and a set of
**infrastructure facts** (go stale — keep them current). `/new-project` walks this against
every new idea (`guides/new-project.md`, Phase 2); `how-i-work.md` points here so any
session placing a project consults it rather than guessing.
> ✅ **Verified with the owner 2026-06-15** (and cross-checked against the project repos).
> Keep this section current as the infra changes — see Maintenance. The *decision sequence*
> and the *substance rule* are stable regardless.
## Infrastructure facts (verified 2026-06-15)
**Start9 server** — one box, **StartOS 0.4.0**, **x86_64** (0.4.0 doesn't run on Raspberry
Pi / ARM, so x86 is the only option — build s9pks `x86_64`). It hosts long-running services
as s9pk packages. Running on it: Gitea (the default repo home for every project), Nextcloud
(file backup), Home Assistant, Core Lightning + Ride the Lightning (RTL), Open WebUI (the
sovereign chat layer), Vaultwarden, and Synapse (the Matrix homeserver, `matrix.gilliam.ai`).
Every Claude-built app also lives here: recap (public at `recaps.cc`), keysat, premier-gunner,
proof-of-work, recap-relay, ten31-database, spark-control.
**Inference — two NVIDIA DGX Sparks (ARM64), fronted by the Spark Control gateway on the
LAN.** Spark Control is the single HTTP endpoint every app calls; the two Sparks split by role:
- **LLM Spark** — vLLM, OpenAI-compatible. Serves whichever general model is currently
activated (daily driver right now: **Qwen3.6**; Gemma and others are downloaded and
hot-swappable from the Spark Control dashboard).
- **Audio / speech Spark** — Parakeet (STT), Kokoro (TTS), Sortformer + TitaNet (diarization),
**bge-m3 embeddings + Qdrant**, and the rerank model. It also hosts the **matrix-bridge**
container (on the WireGuard subnet).
Treated as real production capacity — recap / recap-relay (transcription + analysis),
ten31-database (CRM pipeline), ten31-signal-engine, and ten31-transcripts already depend on it.
**Don't hardcode a model name.** Route to the Spark Control gateway and ask its API which
model is live — that single-endpoint indirection is the point; the active model changes when
the owner swaps it from the dashboard.
**Data layer defaults** — SQLite for structured data; **Qdrant + bge-m3** (both on the
audio/speech Spark) when semantic retrieval is needed, with per-project collections; flat
files when that's the honest answer.
**Sovereignty boundary (standing rule)** — anything touching sensitive investor, LP, or
portfolio data uses local models only, via the Spark Control gateway, behind a redaction
boundary wherever free text could carry names. Frontier APIs (Anthropic etc.) are fine for
everything else. Non-negotiable per project; the only question is which side of the line the
project's data sits on — and AGENTS.md must state it so a session never wires a frontier call
to payload data.
**Access / networking** — three mechanisms, no others (Proton VPN and Tor were legacy and are
not in use):
- **LAN** — the default; apps, Sparks, and the box share it.
- **WireGuard** — how the owner's own devices reach LAN-only services when off-LAN.
- **StartTunnel** — Start9's ClearNet feature; publicly exposes selected services (recap at
`recaps.cc`, Synapse/Matrix, and the ten31-database CRM — the CRM is ClearNet-exposed with
app-level user auth so only the team reaches it).
**Dev machine** — macOS with Claude Code; also the s9pk / macOS-app build host. One-off and
personal CLI tools live here happily.
## Decision sequence (stable)
Walk these in order; each answer narrows the next.
**1. Sensitivity.** Does the project ingest, store, or send investor/LP/portfolio data to a
model? If yes: local inference mandatory, hosting on the home subnet strongly preferred, and
AGENTS.md must state the constraint explicitly so a coding session never "helpfully" wires in
a frontier API call with payload data.
**2. Runtime shape.** One-shot CLI / scheduled job / long-running service / interactive UI?
- One-shot or personal CLI → Mac. Don't deploy what doesn't need deploying.
- Scheduled job → Mac launchd if it only matters while the laptop lives; Start9 if it must
run unattended 24/7.
- Long-running service, or anything other devices/family/agents need to reach → Start9.
**3. If Start9: s9pk or plain container?** s9pk earns its packaging cost when the service
wants the StartOS lifecycle — backups, health checks, dependency management, clean updates —
or could plausibly be published for others. Plain container (or script) wins for experiments,
single-user glue, and anything still changing shape weekly. Default for prototypes: container
now, promote to s9pk if it survives and stabilizes. Packaging for 0.4.x is nontrivial; don't
pay it on spec.
**4. Model routing.** Default to the local model via the Spark Control gateway when the
sovereignty boundary applies, when latency/cost favor local, or when the task is well within
the local model's capability. Don't hardcode a model name — call the gateway and ask which
model is active. Route to frontier (Claude API) for hard reasoning on non-sensitive data.
Record the chosen endpoint (gateway vs frontier) in AGENTS.md so sessions don't guess.
**5. Data layer.** SQLite unless there's a reason; Qdrant + bge-m3 when retrieval quality is
the product; flat files for logs and artifacts. Name Qdrant collections per-project to avoid
the shared-collection mess.
**6. Interface.** CLI first unless the UI *is* the product. If it must be reachable from the
phone or by the team off-LAN, decide up front how: expose it over ClearNet via StartTunnel
with app-level auth (how the CRM and `recaps.cc` are reached), or keep it LAN-only and reach
it over WireGuard from your own devices.
**7. Repo home.** Gitea on Start9. Always — even for parked-then-revived ideas, so history
accumulates in one place.
## Phase-exit criteria — the substance rule
Phase exits are falsifiable substance: numbers and demonstrable behavior. "46/46 tests
pass," "recap generated from a real 40-minute call in under 2 minutes," "correct doc in
top-3 for 9/10 canned queries." If the criterion can't fail, it isn't a criterion.
## Maintenance
The **infrastructure facts** section is the part that goes stale. When the infra changes —
new hardware, StartOS version, model lineup, network setup, a service added or retired —
update that section here rather than working around it in conversation. The decision sequence
and the substance rule rarely change.
+8 -3
View File
@@ -26,6 +26,9 @@ A path to the repo to audit (default: the current working directory).
3. **Resolve every symlink concretely.** Use `ls -l` / `readlink` (not assumptions): confirm
it exists, points the correct *direction*, its target exists (not dangling), and the link
is **relative**, never absolute — an absolute link breaks if the repo is cloned or moved.
Scope this to symlinks *inside the repo* (the ones git commits). Symlinks outside the repo
— notably the global `~/.claude/*` links — are never committed and are recreated per
machine, so they may be absolute without harm; do not flag them in a repo audit.
4. **Check each layer against the checklist below**, citing file paths and `readlink` output.
5. **Reconcile with the live spec.** If `portability.md` states a rule the checklist doesn't
cover, check it too and flag the gap.
@@ -60,7 +63,8 @@ A path to the repo to audit (default: the current working directory).
**Self-containment and swap-readiness**
- Everything required to work on the repo lives in the repo. A hard dependency on a global
or user-level file for a *requirement* (not a preference) is a blocker.
- All vendor symlinks are relative, so the repo stays portable.
- All vendor symlinks **inside the repo** are relative, so the repo stays portable. (The
global `~/.claude/*` links are out of scope — not part of the repo and never committed.)
- `.gitignore` covers `.env`; no secrets, large binaries, or generated artifacts committed.
## Hard rules
@@ -68,8 +72,9 @@ A path to the repo to audit (default: the current working directory).
user can run (the `git mv` / `ln -s` to run), never apply them.
- Every PASS/FAIL cites concrete evidence: a file path, a `readlink` result, a line number.
Anything you did not actually inspect is UNVERIFIED, never assumed.
- Verify both **direction** and **relativeness** of every symlink — a link that resolves but
points the wrong way, or is absolute, fails.
- Verify both **direction** and **relativeness** of every **in-repo** symlink — a link that
resolves but points the wrong way, or is absolute, fails. (Global `~/.claude/*` links are
out of scope.)
- Distinguish **blockers** (break vendor-neutrality or hot-swap) from **warnings** (friction
or style). The absence of an optional layer (no subagents, no scoped guides) is neither —
list it as Not applicable. Only present-but-wrong is a finding.
+6 -2
View File
@@ -22,8 +22,9 @@ guess on anything that changes what lands on disk.
## Phase 1 — Git audit (playbook Step 0)
- If this is not a git repo: propose `git init`, a sensible `.gitignore` (including
`.env`), and an initial commit. Get approval before running.
- If this is not a git repo: propose `git init`, a `.gitignore` (the canonical block from
`portability.md`'s "What git tracks" — `.env`/`.env.*`, a deny-by-default `.claude/*` with
the shared wiring allow-listed, OS cruft), and an initial commit. Get approval before running.
- If it is: report whether there are uncommitted changes and when the last commit was,
then propose committing anything outstanding (a repo existing isn't the same as work
being saved — uncommitted changes are as unprotected as no repo at all).
@@ -88,6 +89,9 @@ AGENTS.md now exists but is a strong draft, not yet checked against reality.
- **Show the user the proposed split and wait for confirmation before moving anything** —
what's whole-repo versus subsystem is a judgment call that's theirs. Then report what
moved where.
- Ensure AGENTS.md carries the **inbox-check line** (the portable surfacing mechanism; see
`retrofit-playbook.md` Step 1, and the canonical wording in the standards repo's own
`AGENTS.md`). If it's missing, propose adding it.
## Phase 5 — Independent checks (delegate)
+24 -6
View File
@@ -42,13 +42,14 @@ existing one.
Wait for all readers before synthesizing. If a reader fails or a repo has neither file, note
it honestly rather than dropping the repo.
## Phase 3 — Synthesize (one report)
## Phase 3 — Synthesize (one report → STATUS.md + inline)
Present ONE report in the chat. Default to showing it inline; if the user wants a tracked
snapshot they can diff over time (like `EVALUATION.md`), offer to write it to
`~/Projects/standards/STATUS.md` — their call, and don't commit it yourself.
Produce ONE report. **Write it to `~/Projects/standards/STATUS.md`**, overwriting the
previous snapshot, **and** show the same report inline in the chat — the file is the durable,
diffable record; the inline copy is for reading right now. Title it with today's date (run
`date +%F`).
Structure:
Structure (used identically for the file and the inline copy):
```
# Roundup — <date>
@@ -73,9 +74,26 @@ The (new)/(new:name) inbox items — ideas awaiting the new-repo bootstrap.
Repos missing AGENTS.md or a Current state; stale-looking states; anything that blocked a reader.
```
## Phase 4 — Persist the snapshot
`STATUS.md` is only a *tracked* snapshot if it's committed — that's the whole point of the
file (a dated history you can diff over time). So after writing it, commit and push **only
that file** to the standards repo, without asking (the same durability reflex as `/capture`):
- `git -C ~/Projects/standards add STATUS.md`
- `git -C ~/Projects/standards commit -m "Roundup snapshot — <date>" -- STATUS.md`
(the `-- STATUS.md` pathspec commits only the snapshot — never sweep unrelated standards
changes into a roundup commit)
- `git -C ~/Projects/standards push` (only if a remote is configured; if the push fails,
report it but don't treat the snapshot as lost — it's committed locally)
This write-and-commit of `STATUS.md` is the *only* thing `/roundup` changes on disk;
everything else — every project repo, every other file — stays strictly read-only.
## Rules
- Read-only. The only file you may write is `STATUS.md`, and only if the user asks for it.
- The only file you may write or commit is `~/Projects/standards/STATUS.md` (the snapshot).
Every project repo and every other file stays strictly read-only — never edit or commit them.
- Quote priorities and states as found; never re-rank projects or recommend what to do next
unprompted — that's the user's call.
- Preserve every item; if you can't place one, list it under "needs triage" rather than
+9 -4
View File
@@ -17,17 +17,19 @@ Universal preferences for any coding agent working with me, on any project. Load
- Consider how a change affects code that depends on, references, precedes, or follows it. No change is local until you've checked its neighbors.
- Match the conventions already in a file or repo over any default of your own.
- Prefer small, reviewable diffs over sweeping rewrites.
- Comments explain *why*, not *what* — don't narrate self-evident code.
- Build only what the task needs. Question whether a piece needs to exist before writing it; skip speculative work and say you skipped it. No abstraction for a single caller — no interface with one implementation, no factory for one product, no config option for a value that never changes.
- Comments explain *why*, not *what* — don't narrate self-evident code. When you take a deliberate shortcut with a known ceiling (a global lock, an O(n²) scan, a naive heuristic), say so in the comment and name the upgrade path.
- Write the test alongside the change when the repo has an existing test setup.
- Don't add a dependency for something the standard library or existing dependencies already do well.
- Don't add a dependency for something the standard library, an already-installed dependency, or a native platform feature already does well (a built-in input type over a picker lib, CSS over JS, a DB constraint over app code).
- Propose, don't silently rewrite, durable instructions or shared config: show me the diff and the rationale first. Exception: trivial fixes (typos, dead links).
## Git and commits
- **No AI co-authorship attribution.** Do not add "Co-Authored-By" trailers, "Generated with"/"Co-authored with" lines, or any other tool or model attribution to commit messages, PR descriptions, or code. Commits are authored by me.
- Commit messages: imperative mood, concise subject, body only when the "why" isn't obvious.
- Push to the configured remote after committing. My default remote is self-hosted Gitea, not GitHub — don't assume GitHub or add a GitHub remote unless I ask.
- Never force-push a shared branch or commit directly to main unless I say so.
- **Work on `main` (or the repo's default branch); don't create feature branches unless I ask.** These are solo, non-critical repos — my approval before pushing *is* the review gate, so branches and PRs only add ceremony. If a change is genuinely risky or experimental and you'd want to abandon it cleanly, suggest a branch — but `main` is the default and the branch is the exception.
- **Ask before committing; one approval covers both commit and push.** When I approve a change, that's the go-ahead to commit *and* push it to `main` in one step — don't commit and then ask again before pushing. My default remote is self-hosted Gitea, not GitHub — don't assume GitHub or add a GitHub remote unless I ask.
- Never force-push a shared branch unless I say so.
- Never commit secrets, large binaries, or generated artifacts. Documents reference env-var names only; real values live in gitignored .env files.
## Debugging
@@ -40,6 +42,7 @@ Universal preferences for any coding agent working with me, on any project. Load
## Building and releasing
- **Placing a new project?** Where it runs (Mac vs my Start9), s9pk-vs-container, which model it routes to, its data layer, and its interface follow my standing infra conventions — consult `~/Projects/standards/guides/placement.md` (decision sequence + infra facts) rather than guessing.
- **Bump the version before building an s9pk package.** For any project that is also a Start9 `s9pk` package, increment the package version (in the manifest) before running `make x86` or `make install`. This targets Start9 0.4.x: without a version bump my Start9 servers don't recognize the rebuilt package as updated, so the install silently does nothing. Bump first, then build and install.
## Maintaining instruction files (AGENTS.md, guides, this file)
@@ -50,4 +53,6 @@ Instruction files are the durable, visible record of how I want you to work —
- **Add and remove.** If a new rule supersedes an old one, rewrite or delete the old one. Stale rules are worse than no rule.
- **Reconcile conflicts.** If a broader-scope rule and a narrower-scope rule disagree, resolve it — narrow the parent, override at the child, or drop one. Don't leave future-me to guess which wins.
- **Right scope:** universal → this file; whole-repo → that repo's AGENTS.md; subsystem → a scoped guide. Keep each file to what's true at its scope.
- **How these files are structured (every repo):** `AGENTS.md` is the canonical file, and each vendor's preferred filename is a relative symlink to it — so the repo is already in every tool's native format. Subsystem detail lives in `docs/guides/<topic>.md` (with `paths:` frontmatter), surfaced via a relative symlink so a tool auto-loads it when you edit matching files, with a one-line index entry in AGENTS.md so any tool can find it. Knowledge lives in these vendor-neutral files; vendor-named paths are relative symlinks into them. Full protocol: `~/Projects/standards/portability.md`.
- **Promote up, but don't reflexively trim down.** Move a rule here when it's truly universal, so new repos and every session inherit it. Promoting it does *not* mean deleting it from a repo's AGENTS.md: for load-bearing or safety rules, keep the repo copy — it's followed more closely in context, and it travels with the repo to any tool or session, including ones that never load this file. Trim a per-repo copy only when it's pure boilerplate with no project-specific payload *and* the file is genuinely bloated.
- **Where project state goes:** a repo's near-term status and next steps live in the `## Current state` section of its AGENTS.md (a snapshot, overwritten each session); longer-term backlog lives in a separate `ROADMAP.md` at the repo root. The split is by time horizon, not scope: would a session starting now act on it? Yes → Current state; not yet → ROADMAP.
+27 -6
View File
@@ -10,11 +10,13 @@
README.md how-i-work.md portability.md retrofit-playbook.md subagents-handbook.md
ROADMAP.md ← this repo's backlog (future agents, commands, standards)
INBOX.md ← cross-project capture buffer (/capture → here, /triage drains it)
STATUS.md ← cross-project roundup snapshot (overwritten + committed by /roundup)
guides/ ← neutral substance (vendor-agnostic): checklists, role knowledge
adapters/
claude/
commands/ ← ~/.claude/commands symlinks here (global commands, e.g. /handoff)
agents/ ← ~/.claude/agents symlinks here (global subagents)
statusline.sh ← ~/.claude/statusline.sh symlinks here (Claude Code terminal status line)
```
Companions: `how-i-work.md` (the always-loaded user layer, ~50 lines), `retrofit-playbook.md` (the one-time conversion manual), and `subagents-handbook.md` (designing and running delegated agents). This document is reference material — never symlink it into anything always-loaded.
@@ -57,7 +59,11 @@ secrets** — but it lands in three places.
Because every vendor path in this standard is a *relative* symlink (`CLAUDE.md → AGENTS.md`,
`.claude/rules/x.md → ../../docs/guides/x.md`), it commits and clones correctly on any
machine. This is the concrete payoff of the relative-symlink mandate: an absolute symlink
would commit too, but break on every other clone. Never commit an absolute symlink.
would commit too, but break on every other clone. Never commit an absolute symlink. This
mandate binds **in-repo** symlinks specifically — the ones git commits and clones. The
global `~/.claude/*` symlinks (point 3) are never committed, so the rule doesn't reach them:
they're per-machine glue, recreated on each machine by the adoption step, and may be absolute
without harm. A repo audit checks only the symlinks inside the repo.
**2. Inside a project repo — commit shared, ignore local.**
@@ -67,8 +73,17 @@ symlinks; `ROADMAP.md`; `.claude/settings.json` (shared project settings and hoo
deterministic behavior is part of the repo); `.claude/agents/*.md`, `.claude/commands/*.md`,
`.claude/skills/` (project-scoped wrappers).
Gitignored (per-user, per-machine, or secret): `.claude/settings.local.json` and any
`*.local.*` (personal permissions/overrides); `.env` and secrets (corollary 5); OS cruft.
Gitignored (per-user, per-machine, secret, or session scratch): `.claude/settings.local.json`
and any `*.local.*` (personal permissions/overrides); `.claude/worktrees/` and other Claude
session/editor scratch that lands in `.claude/`; `.env` and secrets (corollary 5); OS cruft.
Because `.claude/` accumulates unpredictable scratch — worktrees, editor debug configs
(sometimes carrying credentials), `.DS_Store` — **ignore it deny-by-default and allow-list
only the shared wiring.** A blanket `.claude/*` plus `!` exceptions is safer than naming
individual local files: a new kind of local scratch is ignored automatically, and a stray
secret never slips in by default. (Already-tracked files stay tracked even under `.claude/*`,
so a deliberate, secret-free editor config a repo wants to commit can simply be allow-listed
with its own `!` line.)
Put these in the repo's **own committed `.gitignore`** — don't rely on a global
excludesfile, which a fresh clone or another machine won't have. Canonical block:
@@ -79,9 +94,15 @@ excludesfile, which a fresh clone or another machine won't have. Canonical block
.env.*
!.env.example
# Claude Code — commit shared config, ignore personal/local
.claude/settings.local.json
.claude/*.local.json
# Claude Code — deny by default, allow-list shared wiring.
# .claude/ also accumulates worktrees, editor configs, and OS cruft; commit
# only the shared parts so new local scratch (or a stray secret) stays out.
.claude/*
!.claude/rules/
!.claude/agents/
!.claude/commands/
!.claude/skills/
!.claude/settings.json
# OS cruft
.DS_Store
+4 -3
View File
@@ -62,7 +62,7 @@ claude
Paste:
> Is this a git repo? If not: git init, write a sensible .gitignore (including .env), and make an initial commit. If it is: tell me whether there are uncommitted changes and when the last commit was, then commit anything outstanding.
> Is this a git repo? If not: git init, write a .gitignore covering .env/.env.*, a deny-by-default .claude/* with the shared wiring allow-listed (rules/agents/commands/skills/settings.json), and OS cruft — the canonical block in portability.md's "What git tracks" — and make an initial commit. If it is: tell me whether there are uncommitted changes and when the last commit was, then commit anything outstanding.
Approve the git commands it proposes. Then `/exit`. (A repo existing isn't the same as work being saved in it — uncommitted changes are exactly as unprotected as no repo at all.)
@@ -78,7 +78,7 @@ Pick your long-running chat from the list. Desktop-app sessions live on this sam
Paste, one after another:
> **A — knowledge:** Based on everything we've done in this project, write an AGENTS.md at the repo root that a fresh coding-agent session could read and be immediately productive with. Include: a one-line description; the stack/versions that matter; the exact build, test (including how to run a single test), lint, and run commands; the directory layout worth knowing on day one; the conventions you've learned I follow here; and an Always/Never list of the gotchas and decisions we hit — especially anything you got wrong earlier that I corrected. Under 200 lines, markdown headers and short bullets, concrete and verifiable, no history. Don't invent commands or paths — mark anything you're unsure of as TODO. No API keys, tokens, or secrets — reference env-var names instead. Write the file now, then create a symlink so Claude Code loads it: ln -s AGENTS.md CLAUDE.md
> **A — knowledge:** Based on everything we've done in this project, write an AGENTS.md at the repo root that a fresh coding-agent session could read and be immediately productive with. Include: a one-line description; the stack/versions that matter; the exact build, test (including how to run a single test), lint, and run commands; the directory layout worth knowing on day one; the conventions you've learned I follow here; and an Always/Never list of the gotchas and decisions we hit — especially anything you got wrong earlier that I corrected. Under 200 lines, markdown headers and short bullets, concrete and verifiable, no history. Don't invent commands or paths — mark anything you're unsure of as TODO. No API keys, tokens, or secrets — reference env-var names instead. Also add this line near the top so captured items surface in future sessions: "Inbox check: at session start, if ~/Projects/standards/INBOX.md exists, scan it for items tagged (this-repo) and surface them before proposing next steps; triage with /triage." Write the file now, then create a symlink so Claude Code loads it: ln -s AGENTS.md CLAUDE.md
> **B — state:** Now append a "## Current state" section to the bottom of AGENTS.md: what's built and working, what's in progress and exactly where it stands, decisions we made that aren't implemented yet, known bugs, and the next 35 concrete steps. 15 lines max, present tense, no history. Anything longer-term than those next steps — backlog, someday ideas, deferred decisions — goes in a separate ROADMAP.md at the repo root; create it now if there's any such backlog to capture.
@@ -145,7 +145,7 @@ Your Mac is now the single point of failure; this removes it. Gitea is just a gi
3. The SSH key is one-time; for each additional project, just create the empty Gitea repo and say "add my Gitea remote at URL and push."
**The one tradeoff:** Claude's cloud Code sessions (web, and the mobile app's cloud mode) can only clone from GitHub, so Gitea-only means no phone-initiated cloud sessions. Your phone can still monitor and steer sessions running on your Mac — type `/mobile` in a session for a QR code. Want both? Add GitHub as a second remote later; git pushes to both happily.
**The one tradeoff:** Claude's cloud Code sessions (web, and the mobile app's cloud mode) can only clone from GitHub, so Gitea-only means no phone-initiated cloud sessions. Your phone can still monitor and steer sessions running on your Mac — type `/remote-control` (or `/rc`) in a session for a QR code to scan from the Claude app. Want both? Add GitHub as a second remote later; git pushes to both happily.
---
@@ -171,6 +171,7 @@ claude -c
continues the most recent conversation in that folder. Nothing lost.
**Other rules:**
- **Stray ideas, any project:** `/capture` logs an idea or bug to the standards inbox without derailing the session; `/triage` inside a repo pulls its captured items into AGENTS.md/ROADMAP; `/roundup` from `~/Projects` gives one priority-grouped to-do list across all projects.
- `/compact` is for mid-task context overflow only. Task finished = close-out ritual + `/exit`. Never compact to extend a finished chat — that's the old habit sneaking back.
- `/memory` shows every CLAUDE.md and rules file loaded right now, and lets you browse what auto memory has saved. Auto memory lives in `~/.claude`, outside the repo — Gitea never backs it up — so promote anything important into CLAUDE.md.
- When **Current state** stops fitting in ~15 lines, it's accumulating history or backlog. Prune it: finished narrative lives in the git log, still-real longer-term items move to ROADMAP.md. Current state holds only what a session starting now would act on.
+6 -3
View File
@@ -89,6 +89,8 @@ least one.
| researcher | compression | Multi-source web research → cited brief | ✅ in kit |
| janitor | compression | Spring-clean docs/artifacts: report stale, orphaned, superseded files | ✅ in kit |
| doc-auditor | independence | Doc drift: every README/instruction/HTML claim checked against the code | ✅ in kit |
| design-checker | independence | Design compliance: code checked against the repo's committed design brief + tokens | ✅ in kit |
| onboarding-tester | independence | Fresh-adopter simulation: reach the goal from published docs alone | ✅ in kit |
| test-runner | compression | Run the suite, return only failures + causes | build when a repo has a real suite |
| docs-reader | compression | Read a library's docs, return just the calls you need | build on 3rd manual-trawling task |
| fresh-debugger | independence | Gets symptoms only (never your theories), hunts the bug | build next time you're stuck >1hr |
@@ -259,7 +261,8 @@ Three real options:
1. **Slash command (this kit's choice): `/full-eval`.** A command file is just a stored
prompt for the *main thread* — and the main thread *can* fan out. `/full-eval` makes
it launch evaluator + security-auditor + exerciser + doc-auditor (+ start9-spec-checker
when relevant) in parallel, then synthesize one deduplicated, prioritized report. Zero new
when relevant, + reviewer when the tree has uncommitted changes) in parallel, then
synthesize one deduplicated, prioritized report. Zero new
machinery, works in any session.
2. **`claude --agent <name>` (advanced).** A session launched this way takes the agent's
system prompt as its main thread — and a main-thread agent *can* spawn subagents,
@@ -287,8 +290,8 @@ Per repo, in order:
moving on — especially on repos retrofitted under an older playbook, where AGENTS.md
may still be a real CLAUDE.md or rules may not yet be symlinks into docs/guides.
2. **`/full-eval`** in a fresh session. Walk away; read one synthesized report.
3. **Triage** into the repo's AGENTS.md Current state: P0/P1 become the work queue,
P2 becomes a "known debt" list, P3+ gets deferred for later decision or done in bulk.
3. **Triage** findings into the repo's AGENTS.md Current state: P0/P1 become the work queue,
P2 becomes a "known debt" list, P3+ gets deferred for later decision, done in bulk, or added to ROADMAP.md.
4. **Fix in the main session** (understanding work — yours), running **reviewer** on
each meaningful diff before commit.
5. **Close out** per the ritual; move to the next repo. Don't batch — one repo's