Compare commits
75 Commits
04651503d2
...
master
| Author | SHA1 | Date | |
|---|---|---|---|
| 04a190b8f2 | |||
| c61d58632d | |||
| d901424f2d | |||
| 1a702387e7 | |||
| ce1327aab1 | |||
| 176eebe64b | |||
| b4042985fb | |||
| 7980545c99 | |||
| 45004c2a9b | |||
| 0d4b238852 | |||
| 1fce86a2d6 | |||
| 1344a354c8 | |||
| 6a1fc6cd08 | |||
| f3fae958ef | |||
| 637ac3e7c2 | |||
| 46298e047f | |||
| 23b83f5a4c | |||
| 3d1258e048 | |||
| 6f486c4475 | |||
| e2377f4c8c | |||
| 9031281cd4 | |||
| 8b0799736c | |||
| 7584ed04bb | |||
| b5a18c885e | |||
| 315e13c318 | |||
| 0cd1310915 | |||
| f3d5c7a16d | |||
| 950cf48c06 | |||
| 40f4af4191 | |||
| 2111bfbaf6 | |||
| 9c91d699d0 | |||
| 209dfa8f90 | |||
| 8f3e016528 | |||
| 897e3e1ac1 | |||
| d9653d742d | |||
| 913f541c39 | |||
| 22cf9f776d | |||
| 10e0d898fd | |||
| c133e67e6f | |||
| 6a713f87b1 | |||
| 1639b0427c | |||
| 60f2532481 | |||
| e2fc522096 | |||
| 123072f8ad | |||
| e069ae0a13 | |||
| 4e98106ac6 | |||
| a6e3fe6908 | |||
| 7f3b007e1a | |||
| bb27e4c32a | |||
| ce022ab2c1 | |||
| 1e5e85339a | |||
| 7accb2616f | |||
| 04941384f3 | |||
| 3cc1f74294 | |||
| e34047f853 | |||
| de920292cd | |||
| f7e9e29637 | |||
| 7ed43fcb59 | |||
| d9850f8ad8 | |||
| ee5c8bb3e2 | |||
| 6b54d9c7cc | |||
| e4f5cc1bcb | |||
| c8c1daf763 | |||
| 5713b2476b | |||
| cdfa1eca57 | |||
| 908d96a6e5 | |||
| bac315265e | |||
| cfe7f47964 | |||
| 231bc9f1a0 | |||
| b7625c4e83 | |||
| 8548afd9fd | |||
| b35e699384 | |||
| 828fc99dd4 | |||
| 36e1f78014 | |||
| b55ba13277 |
@@ -17,13 +17,16 @@ The global layer lives here and is wired into `~/.claude` by **directory symlink
|
||||
file added under `adapters/` is live immediately — no per-file linking:
|
||||
|
||||
- `~/.claude/commands` → `adapters/claude/commands/` — global slash commands (`/retrofit`,
|
||||
`/handoff`, `/full-eval`, `/capture`, `/triage`, `/roundup`).
|
||||
`/handoff`, `/full-eval`, `/capture`, `/triage`, `/roundup`, `/new-project`, `/design`,
|
||||
`/adjudicate`).
|
||||
- `~/.claude/agents` → `adapters/claude/agents/` — global subagents (reviewer, evaluator,
|
||||
security-auditor, doc-auditor, exerciser, researcher, janitor, portability-checker,
|
||||
start9-spec-checker).
|
||||
start9-spec-checker, design-checker, onboarding-tester).
|
||||
- `~/.claude/CLAUDE.md` → `how-i-work.md` — my universal preferences, loaded every session.
|
||||
(Distinct from this repo's *root* `CLAUDE.md`, which → `AGENTS.md`: same filename, different
|
||||
scopes — global preferences vs. this repo's orientation.)
|
||||
- `~/.claude/statusline.sh` → `adapters/claude/statusline.sh` — the Claude Code CLI terminal
|
||||
status-line script (invoked by `~/.claude/settings.json`).
|
||||
|
||||
Because these are *global*, decisions here are never scoped to this repo alone. The test
|
||||
for anything proposed ("should we adopt linters / hooks / CI?", "is this skill worth
|
||||
@@ -38,11 +41,15 @@ repo is just where that cross-repo standard is authored and stored.
|
||||
- `retrofit-playbook.md` — terminal runbook for moving a project's brain onto disk.
|
||||
- `subagents-handbook.md` — designing and running delegated agents.
|
||||
- `guides/` — **neutral substance**: the self-contained operating guide for each command
|
||||
and agent, written as plain prose any vendor's harness could follow.
|
||||
and agent, written as plain prose any vendor's harness could follow. Also holds shared
|
||||
reference docs that aren't tied to one command — e.g. `placement.md` (where a new project
|
||||
should live), which `/new-project` and `how-i-work.md` both point at.
|
||||
- `adapters/claude/{commands,agents}/` — **thin Claude wrappers**: frontmatter + a pointer
|
||||
to the matching `guides/` file. Substance never lives in a wrapper.
|
||||
- `INBOX.md` — cross-project capture buffer for untriaged ideas/bugs (see below).
|
||||
- `ROADMAP.md` — longer-term backlog for this repo (future agents, commands, standards).
|
||||
- `STATUS.md` — latest cross-project roundup snapshot, overwritten + committed by `/roundup`
|
||||
each run (git history is the diff over time).
|
||||
|
||||
## Conventions when editing
|
||||
|
||||
@@ -81,20 +88,29 @@ should carry this so any vendor's agent surfaces pending items at session start:
|
||||
|
||||
## Current state
|
||||
|
||||
- The capture→triage→roadmap loop is built and live: `AGENTS.md` (+ `CLAUDE.md` symlink),
|
||||
`ROADMAP.md`, `INBOX.md`, and the `/capture` + `/triage` commands. The repo dogfoods its
|
||||
own standard, including the active inbox-check line above.
|
||||
- `/capture` also handles **new-project ideas** (`(new)` / `(new:name)`, type `project`);
|
||||
these wait for the new-repo bootstrap and are never triaged into an existing repo.
|
||||
- The git-tracking standard ("What git tracks") is now in `portability.md`, and this repo's
|
||||
`.gitignore` follows it.
|
||||
- `/roundup` is built: a cross-project status report that reads every repo's
|
||||
AGENTS.md/ROADMAP.md plus the inbox and groups all to-dos by priority — reads and reports
|
||||
only; deciding focus stays with the user.
|
||||
- Specced in `ROADMAP.md`, not built: the `new-project` bootstrap, the cross-repo
|
||||
quality-gate standard (linters/hooks/CI), and the optional SessionStart hook.
|
||||
- The portable inbox-check line is still not in *other* repos' AGENTS.md nor the retrofit
|
||||
playbook template — threading it (and the canonical `.gitignore`) into bootstrapping is a
|
||||
ROADMAP item.
|
||||
- Next: thread the standards into `retrofit-playbook.md`; then build the quality-gate hook
|
||||
or the `new-project` bootstrap.
|
||||
- **Fleet built and live** — commands `/capture /triage /roundup /new-project /handoff /retrofit
|
||||
/full-eval /design /adjudicate`; subagents incl. `design-checker` + `onboarding-tester`
|
||||
(substance in `guides/`, thin wrappers in `adapters/claude/`, symlinked into `~/.claude`).
|
||||
Dogfoods its own standard. Latest `/roundup`: `STATUS.md` 2026-06-16.
|
||||
- **`/adjudicate` shipped this session (ROADMAP item 10).** Debates parked P2/P3 backlog to
|
||||
DROP/DO/ESCALATE verdicts; recommend-only v1, autonomy gated by blast radius, ROADMAP-only.
|
||||
**Untested on a real backlog** — the first run should eyeball the judge's drop-bias before
|
||||
trusting it (tune the tie-break rule in `guides/adjudicate.md` if it over-drops). Best first
|
||||
targets: keysat, recap.
|
||||
- **`onboarding-tester` live; first harness still pending (item 9).** Stage 1 (Path 1, manual
|
||||
issuance under keysat's `merchant-onboard` key) is **unblocked** — build the harness + first run
|
||||
in a keysat session. Stage 2 (Path 2, buyer-pays on regtest) is **gated** on keysat's
|
||||
`payment_providers:write` scope + network gate + sandbox flag.
|
||||
- **Design system live; first full live round-trip done (item 8).** recap Case B (Extract)
|
||||
backfill (app 0.2.161); **ten31-database** was the first end-to-end run — Case-B extract →
|
||||
mobile-first `BRIEF.md` → Claude Design cloud round-trip → Phase C distilled + Phase D field
|
||||
notes promoted to `guides/design.md` (`176eebe`). The "confirm export internals + tune Phase-C"
|
||||
open item is **closed**. Residual is per-project execution (ten31-database mobile build, being
|
||||
scoped in that repo — not standards work).
|
||||
- **Next steps:** (1) first real `/adjudicate` run on keysat/recap to calibrate the drop-bias
|
||||
(item 10); (2) Stage-1 `onboarding-tester` harness in a keysat session (item 9); (3) cross-repo
|
||||
quality-gate standard + `/harden` (item 1); (4) non-git-folder sweep under `~/Projects` (item 6
|
||||
residual).
|
||||
- Queued in `INBOX.md` for other repos' `/triage`: keysat design cleanup (P2) + onboarding Path-2
|
||||
(P3); `ten31-transcripts` mini-retrofit; `ten31-database` networking/icon/intake; (standards)
|
||||
operator-onboarding agent (P3).
|
||||
|
||||
@@ -31,3 +31,24 @@ Example:
|
||||
## Items
|
||||
|
||||
<!-- /capture appends below this line -->
|
||||
- [ ] (standards) [feature][P2] API automation for Gitea in /new-project — automate the currently-manual Gitea create/publish gate via the Gitea API, 2026-06-14
|
||||
- [ ] (new:embedded-links-reader) [project][P2] Embedded-links reader & summarizer — give the app an article/blog URL; it scrapes the links the author embedded (the ones you don't want to visit in the moment), reads them, and summarizes them, 2026-06-14
|
||||
- [ ] (new:portfolio-scraper) [project][P2] Portfolio-company scraper — tracks portfolio companies for podcasts, social tweets, founder appearances, news, etc. and delivers a digest via email or another interface, 2026-06-14
|
||||
- [ ] (recap-relay) [chore][P3] AGENTS.md endpoint list mis-describes POST /relay/analyze as "{ transcript, … } → topic sections JSON". The actual route (server/routes/analyze.js) takes a free-form { prompt: string } and returns the standard envelope { result: { text } }; "topic sections JSON" is only what the recap-app caller asks for in its prompt. Fix the request-shape wording to { prompt } — surfaced resolving Recaps' Daily Digest synthesis contract (Q4), 2026-06-15
|
||||
- [ ] (keysat) [chore][P2] Design-contract cleanup from the 2026-06-16 design-checker audit — full detail in keysat ROADMAP "Design (contract conformance)" + design/DESIGN.md. (1) Fix 3 blockers (code violates the contract's named "never" rules on live CTAs): (a) gold-as-fill on admin `.featured-pill-toggle.on` (licensing-service-startos/licensing-service/web/index.html:418) → navy fill or gold border+text; (b) gold-as-fill on admin `#tier-banner-cta` upgrade button (web/index.html:537-542) → navy primary; (c) primary buy CTA pill radius 999px (keysat-xyz-landing/index.html:384-385) → r-md 8px. (2) Structural: consolidate the 4 surfaces' inlined CSS-variable copies onto canonical design/brand/palette.css (import it, drop private copies). (3) Token gaps (tokenize-vs-snap): 14px landing card radius; wordmark letter-spacing 0.30 vs 0.28em (add letterSpacing.wordmark token); semantic badge text one-offs (#205c47/#7a5814/#8a2828); hardcoded syntax-highlight hex → var(); admin #f6f1e7 off-token. Re-run design-checker after to confirm, 2026-06-16
|
||||
- [ ] (ten31-signal-engine) [chore][P2] Run full-eval on the signal engine folder — the full evaluation suite (evaluator, security-auditor, exerciser, doc-auditor, spec-checker), 2026-06-16
|
||||
- [ ] (standards) [idea][P2] run janitor agent on all projects — via matrix, 2026-06-16
|
||||
- [ ] (keysat) [chore][P2] does the keysat registry need to save every iteration of new versions of keysat software as we upgrade it? research agent needs to investigate — via matrix, 2026-06-16- [ ] (keysat) [chore][P2] Adversarial review of keysat- what vulnerabilities, customer complaints, feature gaps, might a new user find. — via matrix, 2026-06-16
|
||||
- [ ] (keysat) [chore][P2] run spec-checker agent for listing to start9 community registry — via matrix, 2026-06-16
|
||||
- [ ] (keysat) [chore][P2] review website for any drift/inconsistencies (doc-auditor), review GitHub for any sensitive information in historical commits (revealed info), review website and consider adding specific example of how to add licensing to existing software (for example this is a good way to test the dry run of a new user just using documentation... we could give an agent the proof-of-work software and see if they can just add a license paywall in front of it before they can use it in one shot) — via matrix, 2026-06-16
|
||||
- [ ] (recap) [idea][P2] add gemini 3.5 to model selection, need to have research agent check which models are available (stable versions) and the correct model name — via matrix, 2026-06-16
|
||||
- [ ] (recap-relay) [idea][P2] add gemini 3.5 to model selection, need to have research agent check which models are available (stable versions) and the correct model name — via matrix, 2026-06-16
|
||||
- [ ] (new:personal-website) [project][P2] Develop personal website — host on Start9 Pages, served on clearnet via StartTunnel; build HTML site, use Claude Design for styling, gather design inspiration — 2026-06-16
|
||||
- [ ] (keysat) [feature][P3] Onboarding-tester Path 2 — full buyer-pays walkthrough on regtest. GATED on keysat shipping `payment_providers:write` (opt-in scope, never bundled into merchant-onboard) + network gate (scoped connect = regtest/testnet/signet only, mainnet master-only, fail-closed) + daemon-level sandbox-mode flag (greenlit with the keysat dev 2026-06-16; see plans/agent-payment-connect-scope.md). Then: harness stands up a BTCPay regtest stack + a sandbox-flagged keysat instance, grants the agent merchant-onboard + payment_providers:write, and the agent connects BTCPay (regtest) AND drives a test buyer payment that activates a license — entire chain agent-done, zero master-key steps. Walkthrough must be labeled regtest; production mainnet-connect stays the operator's one reserved step BY DESIGN (frame as security feature). Build AFTER Path 1 (no-payments) ships, since the BTCPay-regtest stack is the bulk of the new infra, 2026-06-16
|
||||
- [ ] (standards) [agent][P3] Operator-onboarding agent — sibling to onboarding-tester for the *operator* journey (stand up + run keysat from docs alone: sideload/registry install on StartOS, configure, issue first license), vs. the developer SDK-integration journey onboarding-tester already covers. Needs its own clean room (a clean StartOS service-install, not a generic VPS, since the s9pk can't run on a vanilla Linux box), 2026-06-16- [ ] (ten31-database) [idea][P2] backup history in settings tab should be minimized and expandable with a chevron. default to minimized and shown at the bottom since it is rarely viewed — via matrix, 2026-06-18
|
||||
- [ ] (ten31-database) [idea][P2] screen refresh should preserve viewing the same tab you were already on, rather than default back to the top tab — via matrix, 2026-06-18
|
||||
- [ ] (keysat) [feature][P2] ability to reorder entitlements catalog on edit products view — via matrix, 2026-06-18
|
||||
- [ ] (spark-control) [idea][P2] we should redesign the software logo/icon (used for startos service).. it doesn't really relate to anything, though the color scheme seems to match — via matrix, 2026-06-18
|
||||
- [ ] (spark-control) [feature][P2] Add a dashboard card for the ten31 CRM / intake bot (Update/Restart/Stop/Logs tile like matrix-bridge) — confirm whether already done; see ten31-database docs/handoffs/add-intake-bot-to-spark-control.md, 2026-06-18
|
||||
- [ ] (proof-of-work) [feature][P2] brainstorm better tracking of cardio logging and cardio program planning (in-week variety and long term programs) — via matrix, 2026-06-19
|
||||
- [ ] (matrix-bridge) [bug][P2] what are the open brackets when you log an inbox item through matrix, eg “📥 captured → - [ ] (proof-of-work) [feature][P2] brainstorm better tracking of cardio logging and cardio program planning (in-week variety and long term programs) — via matrix, 2026-06-19” — via matrix, 2026-06-19
|
||||
|
||||
@@ -41,4 +41,8 @@ The governing question: **would a different agent need this to work on the repo?
|
||||
|
||||
## The rhythm (canonical homes elsewhere)
|
||||
|
||||
One session = one task. Close every session with `/handoff` (updates Current state in AGENTS.md, commits, pushes). `/compact` is for mid-task overflow only, never to extend a finished chat. Dropped or closed by accident? `claude -c` resumes the latest session in that folder. Full daily rhythm: retrofit-playbook.md, Part 5.
|
||||
One session = one task. Close every session with `/handoff` (updates Current state in AGENTS.md, commits, pushes). `/compact` is for mid-task overflow only, never to extend a finished chat. Dropped or closed by accident? `claude -c` resumes the latest session in that folder.
|
||||
|
||||
Between tasks, ideas flow through the inbox: `/capture` logs an idea or bug for any repo to `INBOX.md` from wherever you are (no routing decision at capture time); `/triage`, run inside a project, drains that repo's items into its Current state or `ROADMAP.md`; and `/roundup` reads every repo's AGENTS.md/ROADMAP plus the inbox and compiles one priority-grouped to-do list across all projects, written to a tracked `STATUS.md` snapshot in the standards repo (it gathers and groups — deciding focus stays with you). Canonical home: `AGENTS.md` → "The capture → triage → roadmap loop."
|
||||
|
||||
Full daily rhythm: retrofit-playbook.md, Part 5.
|
||||
|
||||
+166
-60
@@ -42,29 +42,15 @@ how the standard gets *adopted* into a repo (a `/harden` command that installs t
|
||||
linter+hook for the detected stack?); whether to define a minimal "agentic-ops baseline"
|
||||
checklist doc alongside the other four standards docs.
|
||||
|
||||
## 2. `roundup` — cross-project status agent (the inverse of capture/triage)
|
||||
## 2. `roundup` — cross-project status command ✅ BUILT
|
||||
|
||||
**Why:** a global "what should I do next / which project to focus on" view. Capture/triage
|
||||
push items *down* into project repos; roundup reads back *up* across all of them.
|
||||
|
||||
**Shape:** a command (`/roundup`, run from `~/Projects`) that fans out one read-only
|
||||
subagent per repo to read that repo's `AGENTS.md` (`## Current state`) and `ROADMAP.md`,
|
||||
plus reads the standards `INBOX.md` for not-yet-triaged items. It synthesizes **one
|
||||
aggregated to-do list across all projects, grouped by priority**, including items that
|
||||
haven't been pushed down to a repo yet. Mirror `full-eval`'s orchestrator pattern: fan out,
|
||||
then synthesize into one report; the per-repo readers can be the generic `Explore` agent or
|
||||
a small dedicated reader.
|
||||
|
||||
**Division of labor (the user's explicit call):** the agent *reads and reports* — it
|
||||
gathers and presents findings grouped by priority; it does **not** decide the best use of
|
||||
time. Prioritizing across projects is a back-and-forth the user does on top of the report.
|
||||
So the output is a clean, evidence-backed inventory, and the "what's actually worth doing
|
||||
now" conversation happens after, in the main thread.
|
||||
|
||||
**Open questions:** which folders count as "my projects" (scan `~/Projects`, skip non-repos
|
||||
and the standards repo itself?); how to keep it cheap (cap readers, summarize hard); whether
|
||||
roundup output is ephemeral or written to a `STATUS.md` in the standards repo for diffing
|
||||
over time like `EVALUATION.md`.
|
||||
Built and live: `guides/roundup.md` + `adapters/claude/commands/roundup.md`. Fans out a
|
||||
read-only reader per repo over AGENTS.md/ROADMAP.md, folds in the inbox, and synthesizes one
|
||||
priority-grouped to-do list across all projects (prioritizing stays with the user). Every run
|
||||
**writes the report to a tracked `~/Projects/standards/STATUS.md` snapshot** (overwritten each
|
||||
run, then committed + pushed) so the portfolio state is diffable over time — and shows the
|
||||
same report inline. That snapshot file is the *only* thing roundup writes; all project repos
|
||||
stay read-only.
|
||||
|
||||
## 3. Deterministic inbox surfacing — SessionStart hook (optional upgrade over the portable line)
|
||||
|
||||
@@ -93,49 +79,169 @@ automatic so every repo inherits it.
|
||||
- Decide whether the canonical wording lives in `how-i-work.md` (so it's truly universal)
|
||||
or stays a per-repo line.
|
||||
|
||||
## 5. `new-project` — idea → scoped → scaffolded → Gitea repo
|
||||
## 5. `new-project` — idea → scoped → scaffolded → Gitea repo ✅ BUILT
|
||||
|
||||
**Why:** the inverse of `/retrofit`. Retrofit moves an *existing* project onto disk; this
|
||||
takes a captured `(new)` inbox idea and turns it into a real, standards-compliant repo. It
|
||||
closes the capture loop: `/capture (new) …` → bootstrap → a repo that already has AGENTS.md,
|
||||
the CLAUDE.md symlink, ROADMAP.md, the canonical `.gitignore`, and the inbox-check line.
|
||||
Built and live: `guides/new-project.md` + `adapters/claude/commands/new-project.md`. The
|
||||
inverse of `/retrofit` — main-thread and collaborative, it turns a captured `(new)` inbox
|
||||
idea into a repo that's standards-compliant from line one. Phases: locate the inbox seed →
|
||||
workshop the scope (the high-value, interactive step) → brief + scaffolding-plan sign-off →
|
||||
scaffold (`AGENTS.md` + `CLAUDE.md` symlink, `ROADMAP.md`, `README.md`, canonical
|
||||
`.gitignore`, `.claude/`, minimal stack skeleton; inbox-check line tagged `(<name>)`) →
|
||||
publish (git init + commit, Gitea manual-create gate, remote + push) → close the loop
|
||||
(remove the `(new)` item from `INBOX.md`) → portability-checker verify.
|
||||
|
||||
**Shape:** a command (`/new-project`, run from `~/Projects`), main-thread and collaborative
|
||||
— scoping is a conversation, not a delegated job. Phases:
|
||||
**Open questions, resolved:** (a) Gitea repo creation is a **manual web-UI gate** (no API
|
||||
token — matches retrofit-playbook Part 4); (b) the standards layer is always scaffolded, the
|
||||
stack skeleton stays minimal, and the **linter/pre-commit quality gate is deferred to the
|
||||
future `/harden`** (item 1) rather than hand-rolled; (c) the workshop **does** seed the first
|
||||
`## Current state` with the first milestone.
|
||||
|
||||
1. **Workshop the scope** — back-and-forth to sharpen objectives, non-goals, stack, and the
|
||||
key early decisions. Pull the seed from the `(new)` inbox item if one exists. This is the
|
||||
high-value step and stays interactive (like roundup, the reasoning is the user's).
|
||||
2. **Seed prompt** — synthesize the workshop into a concrete project brief / initial build
|
||||
prompt plus a scaffolding plan; get the user's sign-off.
|
||||
3. **Scaffold** — create the new folder under `~/Projects`, write the initial AGENTS.md
|
||||
(from the brief) + CLAUDE.md symlink, ROADMAP.md, README, the canonical `.gitignore`,
|
||||
`.claude/` wiring, and the starting directory structure. Compliant from line one.
|
||||
4. **Publish** — `git init` + initial commit, create the Gitea repo, add the remote, push
|
||||
(reuse retrofit-playbook Part 4; if no Gitea API token is available, hand back the manual
|
||||
"create empty repo, copy URL" step). Then remove the `(new)` item from the inbox.
|
||||
**Remaining option:** once `/harden` (item 1) exists, call it as the scaffold's last step so a
|
||||
new repo gets its stack's quality gate installed automatically.
|
||||
|
||||
**Open questions:** Gitea repo creation — API token vs. manual UI step; how much scaffolding
|
||||
is generic vs. stack-specific (does it call a `/harden` step from item 1 to install the
|
||||
stack's linter+hook?); whether the workshop output also seeds the first `## Current state`.
|
||||
## 6. Cross-repo git-hygiene audit + remediation ✅ DONE (2026-06-14)
|
||||
|
||||
## 6. Cross-repo git-hygiene audit + remediation — HIGH PRIORITY
|
||||
Fanned out one read-only `portability-checker` per git repo under `~/Projects`. **No safety
|
||||
issues anywhere:** zero tracked `.env` / `.DS_Store` / `*.local.json`, and every in-repo
|
||||
symlink is relative. The gaps were consistency: the inbox-check line was missing in all 7
|
||||
non-standards repos, and only `standards` had a complete canonical `.gitignore`.
|
||||
|
||||
**Why:** a shallow scan of `~/Projects` (2026-06-14) shows the `.claude`/git setup is *not*
|
||||
consistent across repos. Git repos with the full AGENTS.md + `.claude` + `.gitignore` setup:
|
||||
`CRM`, `premier-gunner`, `recap-relay`, `recap`, `spark-control`, `standards`, `Workout-log`.
|
||||
Outliers: `ten31-transcripts` has a `CLAUDE.md` but **no `.claude/` dir** (possible real file
|
||||
instead of an AGENTS.md symlink — the stale-retrofit failure); `start-os` has neither (likely
|
||||
an external/upstream repo). Plus many non-git folders (unprotected work). We don't yet know,
|
||||
per repo, what inside `.claude/` is committed vs gitignored, or whether in-repo symlinks are
|
||||
relative.
|
||||
**Fixed — 6 repos, one commit each, pushed** (`CRM`, `premier-gunner`, `recap`,
|
||||
`spark-control`, `proof-of-work`; `recap-relay` committed locally — see residuals): added the
|
||||
repo-tagged inbox-check line and normalized `.gitignore`.
|
||||
|
||||
**Do:** fan out one read-only `portability-checker` (or `Explore`) per git repo under
|
||||
`~/Projects`, each reporting: is `CLAUDE.md` a relative symlink to `AGENTS.md` or a real
|
||||
file; what's in `.claude/` and which of it is tracked vs gitignored (esp. `settings.local.json`
|
||||
committed by mistake, or shared `settings.json`/rules symlinks missing); whether `.gitignore`
|
||||
carries the canonical block; any absolute in-repo symlinks. Synthesize one compliance matrix +
|
||||
a prioritized remediation list, then a follow-up pass fixes each repo (its own commit).
|
||||
**Standard improved by the audit:** the documented canonical `.claude/` block was
|
||||
allow-by-default and would have *un-ignored* `premier-gunner`'s password-bearing
|
||||
`.claude/launch.json`. Switched `portability.md` (and the two retrofit summaries) to a
|
||||
**deny-by-default `.claude/*` + allow-list** of the shared wiring.
|
||||
|
||||
**Open questions:** treat non-git folders (flag for retrofit) vs. external upstreams
|
||||
(`start-os`?) differently; report-only first vs. auto-fix.
|
||||
**Residual follow-ups:**
|
||||
- **`ten31-transcripts` (MAJOR) — needs its own mini-retrofit.** Despite the name it's an
|
||||
active Xcode/Swift app with no `.claude/` at all. Scaffold `.claude/settings.json`; decide
|
||||
whether to reorganize its flat `docs/NN_*.md` into `docs/guides/` + `.claude/rules/` symlinks.
|
||||
Too big for the mechanical pass — **captured to `INBOX.md`** (tagged `(ten31-transcripts)`)
|
||||
with step-by-step instructions, for pickup on the next `/triage` in that repo.
|
||||
- **`recap-relay` remote** ✅ Gitea `origin` added + pushed in a later session.
|
||||
- **`premier-gunner/s9pk/.gitignore`** lacks the secrets/Claude lines (low priority; the root
|
||||
`.gitignore` covers `.env` tree-wide already).
|
||||
- **Many non-git folders under `~/Projects` are unprotected work** (discount-watcher,
|
||||
expense-organizer, giga, heart-rate, licensing, one-river, satoshi-sleep, START9 PACKAGING,
|
||||
ten31-agents/-command-center/-signal-engine, timestamp-converter, timestamp-newspaper,
|
||||
website-landing, Grand-Cayman-paddleboard). Each needs `git init` + retrofit, or an explicit
|
||||
"scratch, don't track" decision.
|
||||
- **`start-os`** is an external upstream (Start9Labs/start-os) — out of scope, no action.
|
||||
|
||||
## 8. Design system — `/design` round-trip + `design-checker` agent ✅ BUILT (2026-06-16)
|
||||
|
||||
Built and live: `guides/design.md` + `adapters/claude/commands/design.md` (the `/design`
|
||||
command) and `guides/design-checker.md` + `adapters/claude/agents/design-checker.md` (the
|
||||
`design-checker` subagent). The principle: branding/design for any user-facing repo is a
|
||||
**vendor-neutral on-disk contract** — `design/DESIGN.md` (nine-section brand brief) +
|
||||
`design/tokens.tokens.json` (W3C DTCG tokens) — that every agent reads before building UI.
|
||||
The cloud tool (Anthropic's **Claude Design**, cloud-only/experimental) is an interchangeable
|
||||
front-end, never a dependency.
|
||||
|
||||
- **`/design`** (main-thread, interactive) runs the round-trip: inspiration-first scoping →
|
||||
`design/BRIEF.md` (the packet the user hand-carries to Claude Design) → user drives the
|
||||
cloud step → distill the export back into the `DESIGN.md` + tokens contract. Distillation is
|
||||
agent-mediated because Claude Design exports inline-hardcoded HTML/CSS, not DTCG tokens
|
||||
(verified by research, 2026-06-16) — worth a human glance the first few runs.
|
||||
- **`design-checker`** (read-only subagent) audits a repo's user-facing code against its *own*
|
||||
committed contract — off-palette colors, wrong type scale, magic-number spacing, matched
|
||||
"don'ts" — citing code `file:line` vs the contract rule. Reports "run `/design` first" if no
|
||||
contract exists; never imposes its own taste.
|
||||
- **`/new-project`** now scaffolds `design/` + the AGENTS.md Design line for user-facing
|
||||
projects (Phase 4).
|
||||
|
||||
**Remaining options:** (a) `/retrofit` should backfill `design/` into existing user-facing
|
||||
repos (keysat, recap, recaps.cc, premier-gunner, ten31-database, ten31-transcripts) — run
|
||||
`/design` then `design-checker` per repo. **Done (recap, 2026-06-17):** recap was the first
|
||||
**Case B (Extract)** backfill — document-as-is. It validated the extract→reconcile path
|
||||
(previously untested — keysat was Case A/Import); the contract was distilled and the
|
||||
conformance cleanup shipped in two phases (recap app 0.2.161). Extract-phase Phase-D learnings
|
||||
landed in `guides/design.md` (commit `9031281`); cleanup-execution learnings are now in
|
||||
`guides/design-checker.md`. **Remaining backfill repos:** recaps.cc, premier-gunner,
|
||||
ten31-database, ten31-transcripts. (b) fold a `design-checker` pass into `/full-eval`
|
||||
for repos that have a contract; (c) confirm against a real Claude Design run what the export
|
||||
bundle actually contains and tune the Phase C distillation (the export internals are only
|
||||
medium-confidence from research) — kept **decoupled** from the recap run so one untested thing
|
||||
moves at a time.
|
||||
|
||||
## 9. `onboarding-tester` — docs-only fresh-adopter agent ✅ BUILT (agent); harness pending (staged Path 1 → Path 2)
|
||||
|
||||
Built and live: `guides/onboarding-tester.md` + `adapters/claude/agents/onboarding-tester.md`. A
|
||||
global adopter agent: given a **goal**, a **docs-corpus allowlist**, a **sandbox**, and
|
||||
**credentials/endpoint**, it walks a product's published docs as a literal newcomer — *never
|
||||
reading the product's source* (needing to is itself a finding) — and reports every doc gap
|
||||
(Blocker/Stumble/Nit + the one-line fix). On a **fully clean run** (zero blockers, zero stumbles)
|
||||
it emits a second, publishable "all it took was X, Y, Z" walkthrough for marketing. Generic by
|
||||
design; keysat is the first instantiation. The dual output (friction report always; marketing
|
||||
walkthrough only on a clean run) makes it both a docs-QA tool and a source of agent-readiness copy.
|
||||
|
||||
**First instantiation — keysat SDK integration (harness lives in keysat, not here):** goal = gate
|
||||
`proof-of-work` behind a keysat license end-to-end under the least-privilege `merchant-onboard`
|
||||
scoped key (keysat shipped that role for this, commit `d5885d1`; covers catalog setup + manual
|
||||
license issuance). Harness = two disposable containers (`keysat-server` fixture booted fresh +
|
||||
`developer-sandbox` holding `proof-of-work`), master-key mints the scoped key, docs corpus =
|
||||
keysat.xyz / docs.keysat.xyz / the four SDK package pages.
|
||||
|
||||
**Staged build:**
|
||||
- **Stage 1 — Path 1, no payments (do first, unblocked now):** catalog → tiers → manual license
|
||||
issuance → SDK integration → offline verify, all under `merchant-onboard`. Cheapest honest
|
||||
walkthrough; validates the agent + core integration docs before adding payment infra. **Pending:**
|
||||
build the harness + first live run in a keysat session.
|
||||
- **Stage 2 — Path 2, full buyer-pays on regtest (gated):** depends on keysat shipping
|
||||
`payment_providers:write` (opt-in scope, never bundled into `merchant-onboard`) + a network gate
|
||||
(scoped connect = regtest/testnet/signet only; mainnet master-only; fail-closed) + a daemon-level
|
||||
sandbox-mode flag — **greenlit with the keysat dev 2026-06-16** (`plans/agent-payment-connect-scope.md`).
|
||||
Then the harness adds a BTCPay regtest stack and the agent connects BTCPay (regtest) **and** drives
|
||||
a test buyer payment end-to-end, zero master-key steps. Walkthrough labeled regtest; production
|
||||
mainnet-connect stays the operator's one reserved step **by design** (a key that can repoint
|
||||
settlement is a fund-redirection key — surface the boundary as a security feature).
|
||||
|
||||
**Other follow-up (captured to `INBOX.md`):** a separate operator-onboarding agent for the StartOS
|
||||
install journey (stand up + run keysat from docs alone), vs. the developer SDK-integration journey
|
||||
this agent covers.
|
||||
|
||||
## 7. Verify & correct the placement guide ✅ DONE (2026-06-15)
|
||||
|
||||
Walked the "Infrastructure facts" section with the owner and corrected it against the real
|
||||
setup (cross-checked against the project repos + a Spark-control topology dump). Fixes: **x86**
|
||||
StartOS **0.4.0** box + full service inventory (Gitea, Nextcloud, Home Assistant, CLN + RTL,
|
||||
Open WebUI, Vaultwarden, Synapse, and every Claude-built app); the two-Spark **role split**
|
||||
(LLM Spark = vLLM; audio/speech Spark = Parakeet / Kokoro / Sortformer + TitaNet + bge-m3 +
|
||||
Qdrant, and it hosts matrix-bridge); **don't hardcode a model — query the Spark Control
|
||||
gateway** for the live one (daily driver Qwen3.6, hot-swappable); networking reduced to LAN /
|
||||
WireGuard / StartTunnel (Proton VPN + Tor were legacy, dropped). UNVERIFIED banner replaced
|
||||
with a "verified 2026-06-15" note; decision steps 4 and 6 aligned. Commit `ee5c8bb`.
|
||||
|
||||
## 10. `adjudicate` — debate low-priority backlog items to a verdict ✅ BUILT (2026-06-17)
|
||||
|
||||
Built and live: `guides/adjudicate.md` + `adapters/claude/commands/adjudicate.md` (the
|
||||
`/adjudicate` command). Solves backlog clutter the owner can't easily judge: low-priority
|
||||
(P2/P3) technical/backend items that may be necessary or may be bells-and-whistles, and that
|
||||
he shouldn't spend expertise on *because* they're low priority. Run inside a repo, it
|
||||
adjudicates that repo's ROADMAP items via a grounded debate and routes each to a verdict the
|
||||
owner ratifies instead of researching.
|
||||
|
||||
- **Pipeline (per item):** investigator (read-only — does the problem exist? already handled?
|
||||
what would it touch? + blast-radius classification) → build-advocate ∥ drop-advocate (argue
|
||||
from the investigator's findings, not speculation) → judge (rubric = `how-i-work.md` + repo
|
||||
`AGENTS.md`; **biased to DROP on ties / low confidence**, since these are already low-priority).
|
||||
- **Three verdicts:** **DROP** (the only autonomously-applied call — ratified in one batch, owner
|
||||
needn't understand the tech), **DO** (worth it + LOW blast radius → annotated with a ready plan,
|
||||
recommend-only, not executed), **ESCALATE** (worth it but HIGH blast radius / low confidence /
|
||||
an epic → balanced brief for the owner's call).
|
||||
- **Autonomy is gated by blast radius, not priority** — HIGH = touches data/auth/money/external
|
||||
surface or changes observable behavior (unclear ⇒ HIGH). It may auto-recommend *dropping* a HIGH
|
||||
item but never *doing* one.
|
||||
- **ROADMAP-only input.** Nudges the owner to `/triage` first if untriaged inbox items exist for
|
||||
the repo, but never reads raw inbox items into the debate (that's `/triage`'s routing job —
|
||||
duplicating it invites drift). Two gates: confirm the item set before fan-out (cost control),
|
||||
then approve the batch of ROADMAP edits. The ROADMAP diff + commit message is the audit trail
|
||||
(no separate report file).
|
||||
|
||||
**Remaining options:** (a) **v2 — narrow auto-execution** of the safe "DO + LOW blast radius +
|
||||
reversible + test-covered" class, once the owner has watched it make calls and trusts the verdicts
|
||||
(deliberately deferred — recommend-only first to build trust); (b) a thin `/triage`-then-`/adjudicate`
|
||||
combo if the two-command chaining friction proves real (YAGNI for now).
|
||||
|
||||
@@ -0,0 +1,139 @@
|
||||
# Roundup — 2026-06-18
|
||||
|
||||
Repos scanned (11): keysat, matrix-bridge, premier-gunner, proof-of-work, recap-relay, recap, spark-control, standards (meta/tooling), ten31-database, ten31-signal-engine, ten31-transcripts.
|
||||
|
||||
Skipped (non-git folders under `~/Projects`, noted not dropped): discount-watcher, expense-organizer, giga, Grand-Cayman-paddleboard, heart-rate, one-river, `satoshi-sleep (need to add code)`, `START9 PACKAGING`, ten31-command-center, timestamp-converter, timestamp-newspaper, website-landing, Workout-log. (Failed readers: none.)
|
||||
|
||||
> Read-and-report only. Items are grouped by the priority signals found in each repo and the inbox; projects are **not** ranked against each other and no "what to work on" call is made — that's yours. The repo backlogs below already live durably in each repo's ROADMAP; the **inbox items are itemized in full** because they exist nowhere else yet.
|
||||
|
||||
---
|
||||
|
||||
## Per-project snapshot
|
||||
|
||||
**keysat** — Bitcoin-native software-licensing service (StartOS 0.4.x s9pk + 4 SDKs + landing/docs). Live `0.2.0:60` on immense-voyage.local serving licensing.keysat.xyz; `:60` shipped the Zaprite auto-charge silent-lapse fix + docs reconciliation; `cargo check`/`tsc`/tests green. **Next:** multi-profile webhook routing test → split `audit:read` scope → operator-alerts-via-StartOS → registry `prepare.sh`.
|
||||
|
||||
**matrix-bridge** — Single-user Matrix bot turning a room message into a live Claude Code session on the Mac, surfaced to phone. Phases 0–3 + ask mode all DONE and live (Docker on Spark); capture mode deployed. **In progress:** none; only optional optimizations (Docker HEALTHCHECK, trust-gate flag, priority keyword). **Watch:** Update button depends on modelo's Gitea ssh `IdentitiesOnly yes` pin.
|
||||
|
||||
**premier-gunner** — Kid-friendly soccer-training tracker PWA (one player), StartOS s9pk. Live `v0.1.7:0`; all requested features built/deployed. **In progress:** none. **Next:** confirm speed unit (mph vs km/h); work the eval backlog if desired. **Known issue:** in-app password change reverts on restart (use the StartOS action).
|
||||
|
||||
**proof-of-work** — Self-hosted multi-user workout logger (Next.js 15) as StartOS s9pk. Live `1.2.0:5` (Gear replaces RPE for cardio); built + sideloaded, verified on-box (231 tests pass). **Known bug (P2):** Mobile-Safari first-login-tap fails once then works — gated on a Safari error code to pick client vs server fix. **Next:** P3 hardening batch → tiered AI prompt formatting → Next 15→16.
|
||||
|
||||
**recap-relay** — Operator-side credit-metered service in front of Gemini + Spark Control, powering Recaps' transcription/analysis pipeline. Aligned at relay `0.2.126` (app `0.2.155`), 79 tests green; Users dashboard tab + persistent webhook-dedup shipped; BTCPay-only rail; CORS scoped. **In progress:** none open above P2. **Next:** P2/deferred tail (split 2225-line internal-meetings.js — likely overkill; security/doc P3s).
|
||||
|
||||
**recap** — YouTube/podcast summarizer + library SPA, StartOS s9pk + cloud at recaps.cc. Live app `0.2.161` + relay `0.2.126`, 144 tests pass; self-serve Pro/Max purchase + design system (Phases 1–2) complete. **Loose end:** Daily Digest relay-synthesis+SMTP installed but not smoke-tested on-box (operator action #5). **Next:** operator smoke-test #5; resolve P2 known-debt cluster.
|
||||
|
||||
**spark-control** — Browser StartOS package driving a dual NVIDIA DGX Spark cluster (vLLM swaps, health/proxy, STT/TTS/diarization, embeddings, redaction). Live `v0.26.0:0` (disk-driven model menu); OpenClaw/Johnny-5 coexistence epic fully shipped (v0.25.0); 137 tests pass. **In progress:** Gemma-4-26B-A4B vision eval (recipe in catalog, not yet downloaded/swap-tested). **Awaiting:** Signal-Engine client-side flakiness remedy forwarded to dev.
|
||||
|
||||
**standards** *(meta/tooling)* — Home of agent standards + the live global fleet (8 commands, 11 subagents, served via `~/.claude` symlinks). `/adjudicate` shipped (item 10), **untested on a real backlog** — first run should calibrate drop-bias. **Next:** first `/adjudicate` run on keysat/recap; Stage-1 onboarding-tester harness; cross-repo quality-gate standard + `/harden`; non-git-folder sweep.
|
||||
|
||||
**ten31-database** — Self-hosted venture-fund CRM (replacing Airtable) with an agentic AI layer for funnel-widening + outreach drafting. Phase 0/1 built; deployed & verified live `v0.1.0:91` (2026-06-18); grid is canonical SoR, Matrix email-proposal review + pipeline adoption + intake bot all live. **Unsmoked:** intake fuzzy-match numbered-pick grammar. **Debt (P2):** soft-delete sweep residue, `?limit=abc` crash, auth regression test, oversized icon, 5.4k-line monolith.
|
||||
|
||||
**ten31-signal-engine** — Recurring pipeline ingesting audio + text → structured "claims" → investment signals via Ten31's thesis lens, each logged as a falsifiable prediction. **Strike adversarial test: CONDITIONAL PASS (2026-06-16)** — 56,008 claims embedded, false positive correctly refused; reflexivity demo unexercised (RHR/CD/Bitcoin.Review audio deferred). No automated test suite yet. **Next:** Frontier-fan-out test H6 → complete Strike reflexivity demo → Job A discovery scorers.
|
||||
|
||||
**ten31-transcripts** — Native macOS menu-bar app: detects calls, records dual-track audio, sends to Spark Control for transcription/diarization. `main` clean + pushed; app rebuilt+installed, 91 tests pass; meeting-name prompt + folder rename shipped & reviewer-verified. **Pending verify:** naming prompt + rename on a live stop. **In progress/unverified:** Meet visual fix (reject solid camera-off tiles).
|
||||
|
||||
---
|
||||
|
||||
## Priority queue (all projects + untriaged inbox)
|
||||
|
||||
No P0 or P1 items exist anywhere — nothing is flagged drop-everything or urgent. Repo "next steps" carry no Pn marker; they're the operator's chosen next moves, listed under **Next actions (no Pn)**. Repo ROADMAP backlogs live durably in each repo and are summarized per-line with a pointer; **inbox items are itemized in full** below and again under "Not yet pushed down."
|
||||
|
||||
### P2
|
||||
- [P2] Design-contract cleanup (3 blockers + structural CSS consolidation + token gaps) — source: inbox(untriaged) → keysat — INBOX.md / keysat ROADMAP "Design (contract conformance)"
|
||||
- [P2] Research: does the keysat registry need to retain every prior software version? — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
|
||||
- [P2] Adversarial review of keysat (vulns / customer complaints / feature gaps a new user would find) — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
|
||||
- [P2] Run spec-checker agent for Start9 community-registry listing — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
|
||||
- [P2] Website doc-auditor pass + scan GitHub history for leaked info + add "add licensing to existing software" example — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
|
||||
- [P2] Ability to reorder entitlements catalog on edit-products view — source: inbox(untriaged) → keysat — INBOX.md (via matrix)
|
||||
- [P2] Add Gemini 3.5 to model selection (research available stable model name first) — source: inbox(untriaged) → recap — INBOX.md (via matrix)
|
||||
- [P2] Add Gemini 3.5 to model selection (research available stable model name first) — source: inbox(untriaged) → recap-relay — INBOX.md (via matrix)
|
||||
- [P2] Run full-eval suite on the signal-engine folder — source: inbox(untriaged) → ten31-signal-engine — INBOX.md (via matrix)
|
||||
- [P2] Backup-history settings tab: minimize + chevron-expand, default collapsed at bottom — source: inbox(untriaged) → ten31-database — INBOX.md (via matrix)
|
||||
- [P2] Screen refresh should preserve the current tab, not reset to top tab — source: inbox(untriaged) → ten31-database — INBOX.md (via matrix)
|
||||
- [P2] Redesign the software logo/icon (StartOS service icon) — source: inbox(untriaged) → spark-control — INBOX.md (via matrix)
|
||||
- [P2] Gitea API automation for /new-project (replace manual create/publish gate) — source: inbox(untriaged) → standards — INBOX.md
|
||||
- [P2] Run janitor agent across all projects — source: inbox(untriaged) → standards — INBOX.md (via matrix)
|
||||
- [P2] Mobile-Safari first-login-tap fails once then works — source: proof-of-work — repo ROADMAP "Known bugs" (gated on Safari error code)
|
||||
- [P2] CRM debt: dashboard comms-aggregate soft-delete sweep, `?limit=abc` crash, auth regression test, oversized StartOS icon, 5.4k-line monolith — source: ten31-database — repo Current state
|
||||
- [P2] recap known-debt cluster: SSE error-string scrub, credit TOCTOU, multi-tenant gemini-key bypass, `/api/history` perf, dep CVEs (nodemailer high), risky-file tests, doc drift — source: recap — repo ROADMAP "P2 known-debt"
|
||||
- [P2] premier-gunner eval backlog: `@fastify/static`→≥9.1.3 (path-traversal), input validation (unknown metric kind / bad dates / 400 not 500), automated record/streak/migration tests — source: premier-gunner — repo ROADMAP
|
||||
- [P2] spark-control tech debt: no tests beyond redaction suites, loose dep floors (python-multipart/starlette), opaque 500 on model POST/PUT, NGC key on process line, global mutable catalog, container root on 0.0.0.0:9999 — source: spark-control — EVALUATION.md
|
||||
- [P2] Cross-repo quality-gate standard (linters / pre-commit / CI) + `/harden` — source: standards — ROADMAP item 1
|
||||
|
||||
### P3
|
||||
- [P3] Onboarding-tester Path 2 (full buyer-pays walkthrough on regtest) — source: inbox(untriaged) → keysat — INBOX.md (gated on `payment_providers:write` + network gate + sandbox flag)
|
||||
- [P3] Fix AGENTS.md endpoint wording: `POST /relay/analyze` takes `{ prompt }`, not `{ transcript … }` — source: inbox(untriaged) → recap-relay — INBOX.md
|
||||
- [P3] Operator-onboarding agent (sibling to onboarding-tester for the operator journey) — source: inbox(untriaged) → standards — INBOX.md
|
||||
- [P3] proof-of-work hardening: CSP `unsafe-eval`, `/api/health` info disclosure, rate-limit map leak, shorter sessions, text max-length, unify 3rd JSON-parse — source: proof-of-work — repo ROADMAP
|
||||
- [P3] premier-gunner P3: CSRF token, cross-category metric guard, logout-without-session, consistent 404s, validate category color — source: premier-gunner — repo ROADMAP
|
||||
- [P3] recap P3 deferred: request-size caps, invoice-ID hijack on `/api/credits/claim`, container root, in-memory auth rate-limit, repo hygiene, registry-submission blockers — source: recap — repo ROADMAP
|
||||
- [P3] recap-relay security/ops tail: no `/relay/*` rate limiting, container root, dashboard stored-XSS, `lan-fetch` TLS-verify off, stale `/relay/health` version, doc fixes — source: recap-relay — repo ROADMAP
|
||||
- [P3] spark-control bulk-fix-when-touching: stale README, deprecated `@app.on_event`, `innerHTML` sink, no upload-size limits, `VLLM_PORT` crash, Makefile x86-only vs manifest aarch64 — source: spark-control — EVALUATION.md
|
||||
|
||||
### Next actions (no Pn signal — operator's chosen next moves, not yet prioritized)
|
||||
- keysat: automated multi-profile webhook routing test (S) → split `audit:read` from blanket `:read` → operator-alerts-via-StartOS (verify start-sdk 1.3.2 first) → registry `prepare.sh` + on-box verification — source: keysat Current state
|
||||
- recap: smoke-test Daily Digest end-to-end on-box (operator action #5) — source: recap Current state
|
||||
- spark-control: Gemma-4-26B-A4B vision eval (download + swap-test; owner weighing vision vs text-only Qwen3.6) — source: spark-control Current state
|
||||
- ten31-signal-engine: Frontier-fan-out test H6 → complete Strike reflexivity demo (un-defer RHR/CD audio) → Job A discovery scorers — source: signal-engine Current state
|
||||
- ten31-transcripts: verify naming-prompt + folder-rename on a live stop; re-process Meet session for visual fix; repoint `origin`→`gitea-home`; backend URL primary→fallback + `mmss()` NaN guard — source: ten31-transcripts Current state
|
||||
- ten31-database: in-room smoke of intake disambiguation numbered-pick grammar; spark-control intake dashboard card; NL→safe-query build; freeze v2.0 canonical thesis — source: ten31-database Current state
|
||||
- premier-gunner: confirm speed unit (mph vs km/h) — source: premier-gunner Current state
|
||||
- matrix-bridge: optional only — Docker HEALTHCHECK, trust-gate flag, priority keyword, delete vestigial `phase-0` branch — source: matrix-bridge Current state
|
||||
- standards: first real `/adjudicate` run on keysat/recap to calibrate drop-bias; Stage-1 onboarding-tester harness in a keysat session — source: standards Current state
|
||||
|
||||
### Unprioritized — needs triage (proposed new projects; routed by new-repo bootstrap, not /triage)
|
||||
- Embedded-links reader & summarizer — source: inbox `(new:embedded-links-reader)` [P2]
|
||||
- Portfolio-company scraper (podcasts/tweets/news digest) — source: inbox `(new:portfolio-scraper)` [P2]
|
||||
- Personal website on Start9 Pages via StartTunnel — source: inbox `(new:personal-website)` [P2]
|
||||
|
||||
---
|
||||
|
||||
## Not yet pushed down (inbox) — exists nowhere but INBOX.md, grouped by target
|
||||
|
||||
**→ keysat (7)**
|
||||
- [P2] Design-contract cleanup — 3 blockers (gold-as-fill on admin `.featured-pill-toggle.on` + `#tier-banner-cta`; buy-CTA pill radius 999px→8px), CSS-variable consolidation onto canonical palette.css, token gaps (14px card radius, wordmark letter-spacing, semantic badge hexes, syntax-highlight hex, admin `#f6f1e7`). Re-run design-checker after. (2026-06-16)
|
||||
- [P2] Research whether the registry must retain every prior keysat version on upgrade (2026-06-16)
|
||||
- [P2] Adversarial review — vulns / customer complaints / feature gaps a new user might find (2026-06-16)
|
||||
- [P2] Run spec-checker for Start9 community-registry listing (2026-06-16)
|
||||
- [P2] Website doc-auditor pass; scan GitHub history for leaked sensitive info; add a "add licensing to existing software" example (proof-of-work as a dry-run target) (2026-06-16)
|
||||
- [P2] Reorder entitlements catalog on edit-products view (2026-06-18)
|
||||
- [P3] Onboarding-tester Path 2 — buyer-pays regtest walkthrough, gated on `payment_providers:write` + network gate + sandbox-mode flag (2026-06-16)
|
||||
|
||||
**→ recap-relay (2)**
|
||||
- [P2] Add Gemini 3.5 to model selection (research stable model name first) (2026-06-16)
|
||||
- [P3] AGENTS.md endpoint-shape doc fix: `POST /relay/analyze` is `{ prompt }`, not `{ transcript … }` (2026-06-15)
|
||||
|
||||
**→ recap (1)**
|
||||
- [P2] Add Gemini 3.5 to model selection (research stable model name first) (2026-06-16)
|
||||
|
||||
**→ ten31-database (2)**
|
||||
- [P2] Backup-history tab: minimize + chevron-expand, default collapsed at bottom (2026-06-18)
|
||||
- [P2] Screen refresh should preserve current tab, not reset to top (2026-06-18)
|
||||
|
||||
**→ ten31-signal-engine (1)**
|
||||
- [P2] Run full-eval suite on the signal-engine folder (2026-06-16)
|
||||
|
||||
**→ spark-control (1)**
|
||||
- [P2] Redesign the software logo/icon used for the StartOS service (2026-06-18)
|
||||
|
||||
**→ standards (3)**
|
||||
- [P2] Gitea API automation for /new-project (automate the manual create/publish gate) (2026-06-14)
|
||||
- [P2] Run janitor agent across all projects (2026-06-16)
|
||||
- [P3] Operator-onboarding agent — sibling to onboarding-tester for the operator journey (needs a clean StartOS service-install clean room) (2026-06-16)
|
||||
|
||||
---
|
||||
|
||||
## Proposed new projects (inbox `(new:…)` — awaiting new-repo bootstrap)
|
||||
|
||||
- **embedded-links-reader** [P2] — give the app an article/blog URL; it scrapes the author-embedded links, reads them, and summarizes them (2026-06-14)
|
||||
- **portfolio-scraper** [P2] — tracks portfolio companies for podcasts, social tweets, founder appearances, news; delivers a digest via email or another interface (2026-06-14)
|
||||
- **personal-website** [P2] — host on Start9 Pages, clearnet via StartTunnel; build HTML, style with Claude Design, gather inspiration first (2026-06-16)
|
||||
|
||||
---
|
||||
|
||||
## Gaps
|
||||
|
||||
- **No AGENTS.md/Current-state gaps among scanned repos** — all 11 returned a description, current state, next steps, and ROADMAP backlog. No reader failed.
|
||||
- **Inbox formatting:** two INBOX.md lines have items mashed together without a newline (line 41: keysat registry-retention + keysat adversarial-review; line 48: standards operator-onboarding agent + ten31-database backup-history). Both items were preserved/counted above; the lines should be split when next triaged.
|
||||
- **Untested / unverified, by the repos' own words (not gaps in this roundup, just open verification debt):** standards `/adjudicate` untested on a real backlog; recap Daily Digest not yet smoke-tested on-box; ten31-transcripts naming-prompt+rename and Meet visual fix unverified on a live run; ten31-database intake numbered-pick grammar unsmoked; ten31-signal-engine has no automated test suite and the Strike reflexivity demo is unexercised.
|
||||
- **Non-git folders under `~/Projects` not covered:** 13 folders (listed at top) are not git repos — out of scope for this roundup, but `satoshi-sleep (need to add code)` and the standards "non-git-folder sweep" (ROADMAP item 6 residual) both suggest a sweep is still owed.
|
||||
@@ -0,0 +1,29 @@
|
||||
---
|
||||
name: design-checker
|
||||
description: Design-compliance checker. Use when checking whether a repo's user-facing code conforms to the design standard the project scoped for itself — audits HTML/CSS/components/views against the repo's committed design/DESIGN.md brand brief and design/tokens.tokens.json, reporting each off-contract color, type, spacing, or forbidden pattern with the code file:line and the contract rule it breaks. Read-only — reports violations and exact fixes, never makes them. If the repo has no design/ contract, it says so (run /design first) rather than imposing taste.
|
||||
tools: Read, Grep, Glob, Bash
|
||||
model: sonnet
|
||||
effort: medium
|
||||
---
|
||||
|
||||
You are a design-compliance checker for my own repos: you read a repo's user-facing code and
|
||||
report where it does not conform to the design standard that project scoped for itself —
|
||||
`design/DESIGN.md` and `design/tokens.tokens.json` — so I can bring the UI back on-brand.
|
||||
|
||||
Your complete operating guide — how to load the contract, the dimensions to check, hard
|
||||
rules, and the mandatory report format — is at:
|
||||
|
||||
~/Projects/standards/guides/design-checker.md
|
||||
|
||||
Read it in full before doing anything else, then follow it exactly. If you cannot read that
|
||||
file, stop and report precisely that you could not load your guide — do not improvise the
|
||||
mission.
|
||||
|
||||
Non-negotiable even without the guide: you are read-only — never edit, write, or commit
|
||||
anything; you only report violations and the exact fix. Audit against the repo's **committed
|
||||
contract, never your own taste** — if `DESIGN.md` is silent on something it is not a
|
||||
violation. If the repo has **no `design/` contract**, stop and report that it needs `/design`
|
||||
run first; never invent a standard. Every finding cites both the code `file:line` and the
|
||||
contract rule (the `DESIGN.md` section or the token name) it breaks; a finding without that
|
||||
evidence is dropped. Anything unchecked is UNVERIFIED. If blocked, report exactly what
|
||||
blocked you — never guess or fabricate findings.
|
||||
@@ -0,0 +1,26 @@
|
||||
---
|
||||
name: onboarding-tester
|
||||
description: Fresh-adopter onboarding tester. Use when checking whether a newcomer can reach a goal — install, integrate, or run a piece of software — using ONLY its published docs. Follows the documented happy path as a literal newcomer, never reading the product's source, and reports every place the docs leave a user stuck, guessing, or wrong (with the doc location and the fix). On a fully clean run it also emits a publishable "all it took was X, Y, Z" walkthrough. Works in a sandbox only — never modifies the product or its docs.
|
||||
tools: Read, Grep, Glob, Bash, Write, Edit, WebFetch
|
||||
model: sonnet
|
||||
effort: high
|
||||
---
|
||||
|
||||
You are a fresh adopter who finds out whether a newcomer can reach a goal using only a
|
||||
product's published documentation — and, when the docs are good enough, produces the clean
|
||||
walkthrough that proves it.
|
||||
|
||||
Your complete operating guide — mission, inputs, the docs-only discipline, procedure, and the
|
||||
two mandatory outputs — is at:
|
||||
|
||||
~/Projects/standards/guides/onboarding-tester.md
|
||||
|
||||
Read it in full before doing anything else, then follow it exactly. If you cannot read that
|
||||
file, stop and report precisely that you could not load your guide — do not improvise the mission.
|
||||
|
||||
Non-negotiable even without the guide: consult ONLY the published docs you were given — never
|
||||
read the product's source or repo internals to unblock yourself; needing to is itself a finding.
|
||||
Work only in the sandbox; never modify the product or its docs. Log every place the docs leave a
|
||||
newcomer stuck, guessing, or wrong, with the exact doc location and the fix. Emit the publishable
|
||||
walkthrough ONLY on a fully clean run (no blockers, no stumbles). If blocked, report exactly what
|
||||
blocked you and where — never guess or fabricate success.
|
||||
@@ -0,0 +1,20 @@
|
||||
---
|
||||
description: Debate each low-priority (P2/P3) backlog item on this repo's ROADMAP to a DROP/DO/ESCALATE verdict — recommend-only, applied on your approval
|
||||
argument-hint: [optional scope, e.g. a ROADMAP item number or "P3"]
|
||||
---
|
||||
|
||||
Adjudicate the low-priority technical backlog of the repository in the current working
|
||||
directory. Scope, if any: $ARGUMENTS
|
||||
|
||||
Your complete orchestration guide — phases, the per-item investigate→debate→judge pipeline,
|
||||
the three verdicts (DROP / DO / ESCALATE), and the report + approval flow — is at:
|
||||
|
||||
~/Projects/standards/guides/adjudicate.md
|
||||
|
||||
Read it in full first, then follow it exactly. If you cannot read that file, stop and report
|
||||
precisely that — do not improvise the adjudication.
|
||||
|
||||
Claude Code specifics for Phase 2: per item, launch the investigator first, then the build- and
|
||||
drop-advocates as a single parallel batch, then the judge; run items concurrently in batches to
|
||||
keep the fan-out manageable. These are read-only role agents — the only write is the ROADMAP
|
||||
edit in Phase 4, after the owner approves.
|
||||
@@ -0,0 +1,26 @@
|
||||
---
|
||||
description: Design round-trip — establish a look (scope fresh inspiration-first, import prior guidelines, or extract the de-facto design from existing code), optionally round-trip through Claude Design in the cloud, then distill the result into the repo's durable design/ contract (DESIGN.md + DTCG tokens)
|
||||
argument-hint: [optional: the surface to design, e.g. "landing page", or "backfill"]
|
||||
allowed-tools: Bash, Read, Edit, Write, Grep, Glob, Agent
|
||||
---
|
||||
|
||||
Run the design round-trip for this repo's user-facing interface — a fresh design, or
|
||||
backfilling an existing repo by importing prior guidelines or extracting its as-built look.
|
||||
Optional surface/mode from me (may be empty): $ARGUMENTS
|
||||
|
||||
Your complete orchestration guide — the `design/` folder contract, the inspiration-first
|
||||
scoping conversation, the cloud-tool hand-off runbook, and how to distill the result back
|
||||
into the repo — is at:
|
||||
|
||||
~/Projects/standards/guides/design.md
|
||||
|
||||
Read it in full first, then follow it exactly. If you cannot read that file, stop and report
|
||||
precisely that — do not improvise the design work.
|
||||
|
||||
This runs in the main thread on purpose: scoping a look is an interactive, iterative
|
||||
conversation. Lead with inspiration — ask me for reference images / UI screenshots I like
|
||||
(saved under `design/inspiration/`) and react to them — rather than interrogating me for
|
||||
specifics I may not have yet. You don't run Claude Design yourself; it's cloud-only and I
|
||||
drive that step — hand me a runbook and wait. The repo's durable contract is
|
||||
`design/DESIGN.md` + `design/tokens.tokens.json`, never any proprietary export bundle; don't
|
||||
fabricate brand values you can't source from an export, an inspiration image, or my choice.
|
||||
@@ -0,0 +1,22 @@
|
||||
---
|
||||
description: New-project bootstrap — workshop a captured idea into a standards-compliant repo (AGENTS.md + symlink, ROADMAP, canonical .gitignore, .claude wiring) and publish it to Gitea
|
||||
argument-hint: [optional: working name or the idea, e.g. "new:relay" or a one-line pitch]
|
||||
allowed-tools: Bash, Read, Edit, Write, Grep, Glob, Agent
|
||||
---
|
||||
|
||||
Bootstrap a brand-new project under `~/Projects`, standards-compliant from line one.
|
||||
Optional seed from me (may be empty): $ARGUMENTS
|
||||
|
||||
Your complete orchestration guide — locating the inbox seed, the scope workshop, the
|
||||
sign-off gate, the scaffold layout, the Gitea publish step, and the final report — is at:
|
||||
|
||||
~/Projects/standards/guides/new-project.md
|
||||
|
||||
Read it in full first, then follow it exactly. If you cannot read that file, stop and
|
||||
report precisely that — do not improvise the bootstrap.
|
||||
|
||||
This runs in the main thread on purpose: scoping a new project is a conversation. Ask me
|
||||
the judgment calls (the objective and non-goals, the stack, the architectural forks, the
|
||||
scaffolding-plan sign-off, and the Gitea publish) and wait — nothing lands on disk before
|
||||
Phase 3, and nothing is created without my sign-off in Phase 2. My remote is self-hosted
|
||||
Gitea, never GitHub.
|
||||
@@ -1,5 +1,5 @@
|
||||
---
|
||||
description: Cross-project status roundup — read every repo's AGENTS.md/ROADMAP.md plus the inbox and compile one priority-grouped to-do list across all projects
|
||||
description: Cross-project status roundup — read every repo's AGENTS.md/ROADMAP.md plus the inbox, compile one priority-grouped to-do list across all projects, and write it to a tracked STATUS.md snapshot in the standards repo
|
||||
argument-hint: [optional focus, e.g. "only P0/P1" or a subset of repos]
|
||||
allowed-tools: Bash, Read, Grep, Glob, Agent, Write
|
||||
---
|
||||
@@ -15,6 +15,8 @@ per repo, the inbox pass, and the report format — is at:
|
||||
Read it in full first, then follow it exactly. If you cannot read that file, stop and report
|
||||
precisely that — do not improvise the roundup.
|
||||
|
||||
Read and report only: gather and group by the priorities you find, but do not rank the
|
||||
projects against each other or tell me what to work on — deciding the best use of time is
|
||||
mine. After you present the report, help me reason about ordering only if I ask.
|
||||
Write exactly one file — the `STATUS.md` snapshot in the standards repo (committed + pushed
|
||||
so it's tracked and diffable over time); every project repo stays read-only. Don't decide for
|
||||
me: gather and group by the priorities you find, but do not rank the projects against each
|
||||
other or tell me what to work on — deciding the best use of time is mine. After you present
|
||||
the report, help me reason about ordering only if I ask.
|
||||
|
||||
@@ -0,0 +1,143 @@
|
||||
# Adjudicate — debate low-priority backlog items to a verdict
|
||||
|
||||
*Substance file per the portability protocol. Vendor wrappers (e.g.
|
||||
`adapters/claude/commands/adjudicate.md`) point here; this guide is self-contained
|
||||
and written as plain prose any orchestrating agent could follow.*
|
||||
|
||||
You are running inside one project repo. Low-priority technical/backend items pile up on its
|
||||
`ROADMAP.md` that the owner can't easily judge the necessity of — and shouldn't have to spend
|
||||
expertise on, precisely *because* they're low priority. Your job is to run a grounded debate
|
||||
over each eligible item and reach a verdict, so the owner ratifies decisions instead of
|
||||
researching them.
|
||||
|
||||
**Recommend-only.** You never execute, build, or ship anything here. Your output is verdicts
|
||||
and a single batch of ROADMAP edits the owner approves. The most you change is the backlog
|
||||
itself.
|
||||
|
||||
**Autonomy is gated by blast radius, not priority.** A low-priority item can still be
|
||||
dangerous (it touches data, auth, money, an external surface, or changes observable app
|
||||
behavior). You may autonomously recommend *dropping* such an item, but you may never recommend
|
||||
silently *doing* it — anything above the blast-radius line goes to the owner as a brief.
|
||||
|
||||
**Decide on the facts; present in plain terms.** The investigation and the judge's reasoning
|
||||
are rigorous and technical — grounded in what the code actually does, so the recommendation
|
||||
rests on real facts. But everything *shown to the owner* — both sides of each debate and the
|
||||
verdict's rationale — must be in plain language a non-specialist can act on: what the item
|
||||
really is, what doing it gets you, what skipping it costs. Don't assume jargon; when a
|
||||
technical fact is load-bearing, explain it in a phrase. The owner is judging trade-offs, not
|
||||
reading a tech spec. (The technical detail stays in the agents' analysis and can be surfaced on
|
||||
request — it's just not the default presentation.)
|
||||
|
||||
## Phase 1 — Orient & select (no fan-out yet)
|
||||
|
||||
1. Read this repo's `ROADMAP.md` and `AGENTS.md` (especially `## Current state`) for context.
|
||||
2. **Inbox nudge (don't triage).** Do the session-start inbox-check: if
|
||||
`~/Projects/standards/INBOX.md` has unchecked items tagged for this repo, tell the owner
|
||||
*"N untriaged inbox items for this repo — run `/triage` first to land them on the ROADMAP,
|
||||
or proceed with just what's there."* You operate on ROADMAP only; never read raw inbox items
|
||||
into the debate — that is `/triage`'s routing job, and duplicating its rules invites drift.
|
||||
3. **Select candidates.** Eligible = parked, low-priority backlog items: P2/P3 where items
|
||||
carry an explicit priority; otherwise items that read as nice-to-have / deferred.
|
||||
**Exclude:** P0/P1 or clearly-active items, anything already marked done/built, and
|
||||
`(new:…)`-style new-repo seeds. If `$ARGUMENTS` names specific items (e.g. a ROADMAP number
|
||||
or `P3`), scope to those.
|
||||
4. **Confirm the set before spending agents.** Show the owner the list you intend to adjudicate
|
||||
(one line each) and let them trim or confirm. A full run is ~4 subagents per item — this gate
|
||||
controls cost and catches any item that's more important than its placement suggests.
|
||||
|
||||
## Phase 2 — Per item: investigate → debate → judge
|
||||
|
||||
For each confirmed item, run this pipeline (items may run in parallel where your tooling
|
||||
allows; within an item the stages are sequential):
|
||||
|
||||
1. **Investigator** (read-only). Grounds the debate in reality so it isn't two models
|
||||
speculating. Reads the actual code and reports: does the problem this item describes actually
|
||||
exist, or is it already handled? What would the change touch (files, surfaces)? **Classify
|
||||
blast radius:** LOW (reversible, internal, test-covered, no observable behavior change) or
|
||||
HIGH (touches data/auth/money/an external surface, or changes observable app behavior). When
|
||||
unsure, classify HIGH.
|
||||
2. **Build-advocate** and **Drop-advocate** (in parallel). Each receives the item text and the
|
||||
investigator's findings and argues one side honestly, citing the findings — not speculation.
|
||||
Reason from the technical facts, but **write the case for a non-specialist**: lead with the
|
||||
practical stakes and translate any jargon the argument depends on.
|
||||
- *Build-advocate*: the concrete benefit, the cost or risk of leaving it undone, who or what
|
||||
it helps.
|
||||
- *Drop-advocate*: YAGNI, added complexity and maintenance, opportunity cost, whether it's
|
||||
bells-and-whistles for its own sake.
|
||||
3. **Judge.** Receives the item, the investigator's findings (incl. blast radius), and both
|
||||
briefs. Decides against the rubric = `how-i-work.md` + this repo's `AGENTS.md`. **Bias to
|
||||
DROP on a tie or low confidence** — these items are already low-priority, so death is the
|
||||
default unless a clear case is made. Decide on the technical merits, but **write the
|
||||
rationale it emits in plain terms** (per the principle above). Emits a structured verdict
|
||||
(next section).
|
||||
|
||||
## The three verdicts
|
||||
|
||||
- **DROP** — not worth doing. The only autonomously-applied call. (Still ratified in one batch
|
||||
by the owner per Phase 4 — "autonomous" means the owner needn't understand the tech, not that
|
||||
files change unseen.)
|
||||
- **DO** — worth doing **and** blast radius LOW. Annotate the ROADMAP item with the decision and
|
||||
a short ready-to-act plan; surface it for the owner's go-ahead to schedule. You do **not**
|
||||
execute it (recommend-only).
|
||||
- **ESCALATE** — worth doing **but** blast radius HIGH, **or** the judge's confidence is low,
|
||||
**or** the item is really an epic that should be split first. Produce a balanced brief: the
|
||||
build case, the drop case, the judge's lean, and why it's above the line. This is the owner's
|
||||
real judgment call — made cheap because they're ratifying reasoning, not generating it.
|
||||
|
||||
## Phase 3 — Report (inline, no file written)
|
||||
|
||||
Show the owner one report. **Write every line in plain terms** (per the principle above) — no
|
||||
unexplained jargon, and never let a raw file path or code symbol stand in for the explanation;
|
||||
say what it means in practice. No new tracked artifact — the ROADMAP diff and the commit message
|
||||
are the durable record (same convention as `/triage`).
|
||||
|
||||
```
|
||||
# Adjudication — <repo> — <date>
|
||||
Adjudicated N of M eligible items.
|
||||
|
||||
## DROP — not worth doing (remove on your OK)
|
||||
- <item, in plain words> — why it's not worth it, in one plain sentence (judge confidence)
|
||||
|
||||
## DO — worth doing and low-risk (your go-ahead to schedule)
|
||||
- <item, in plain words> — what you'd gain, in one plain sentence + the ready plan
|
||||
|
||||
## ESCALATE — your call (touches something that matters)
|
||||
- <item, in plain words>
|
||||
For it: <the strongest plain-language case to do it>
|
||||
Against it: <the strongest plain-language case to skip it>
|
||||
Judge's lean: <which way, and why, in plain terms>
|
||||
Why it's yours: <what makes it consequential — e.g. changes app behavior, touches data/money>
|
||||
```
|
||||
|
||||
The plain-language "For it / Against it" pair is the heart of an ESCALATE — it's the easy-to-read
|
||||
two sides the owner weighs. Keep each to a few plain sentences.
|
||||
|
||||
## Phase 4 — Approve, apply, commit
|
||||
|
||||
1. **One approval gate.** Wait for the owner to confirm the batch. Never edit `ROADMAP.md`
|
||||
before they approve — it's a durable file (same rule as `/triage`).
|
||||
2. **Apply** the approved changes to `ROADMAP.md`: delete DROP items outright (git history is
|
||||
the record — don't leave tombstones); annotate DO items with the decision + plan; annotate
|
||||
ESCALATE items with the judge's lean so the brief isn't lost.
|
||||
3. **Commit.** Present the proposed message and wait for confirmation (one approval covers
|
||||
commit + push, per `how-i-work.md`). The message records the verdicts and the why for each
|
||||
drop — that *is* the audit trail. No AI-attribution trailer.
|
||||
4. **Report** what was dropped, what's queued as DO, and what's waiting on the owner as
|
||||
ESCALATE.
|
||||
|
||||
## Rules
|
||||
|
||||
- Recommend-only. Never execute, build, or ship — your single write is the ROADMAP edit, after
|
||||
approval.
|
||||
- Never auto-recommend *doing* a HIGH-blast-radius item; route it to ESCALATE. When blast radius
|
||||
is unclear, treat it as HIGH.
|
||||
- Ground every argument in the investigator's findings. If the investigator can't read the code
|
||||
or the item is too vague to investigate, say so and ESCALATE it rather than debating in a
|
||||
vacuum.
|
||||
- Present in plain terms. The report and both sides of every debate must read for a
|
||||
non-specialist; the technical rigor lives in the decision, not the prose shown to the owner.
|
||||
- Don't read raw inbox items into the debate — nudge the owner to `/triage` first. ROADMAP is
|
||||
the only input.
|
||||
- Preserve the owner's judgment as the gate: propose verdicts, apply only on approval, and
|
||||
surface anything consequential rather than deciding it.
|
||||
- If blocked at any point, report exactly what blocked you — never fabricate a verdict.
|
||||
@@ -0,0 +1,95 @@
|
||||
# Design checker — agent operating guide
|
||||
|
||||
*Substance file per the portability protocol. Vendor wrappers (e.g.
|
||||
`adapters/claude/agents/design-checker.md`) point here; this guide is self-contained and
|
||||
written as plain prose any delegated agent could follow.*
|
||||
|
||||
You are a design-compliance checker. Your output is a verdict on whether a repo's
|
||||
user-facing code actually conforms to the design standard the project scoped for itself —
|
||||
the `design/DESIGN.md` brand brief and `design/tokens.tokens.json` token file. You audit the
|
||||
interface against its *own* committed contract; you do not impose taste of your own. The
|
||||
`design/` contract is defined in `~/Projects/standards/guides/design.md` — that is the source
|
||||
of truth for what these files mean.
|
||||
|
||||
## Inputs you'll receive
|
||||
A path to the repo to audit (default: the current working directory).
|
||||
|
||||
## Procedure
|
||||
1. **Load the contract first.** Read `design/DESIGN.md` and `design/tokens.tokens.json` from
|
||||
the target repo, this run. These define the standard. If the repo has **no `design/`
|
||||
folder or no `DESIGN.md`**, stop and report exactly that: there is no committed design
|
||||
standard to check against, and the repo needs `/design` run first. Do not invent a
|
||||
standard or audit against generic taste — that's not your job.
|
||||
2. **Read the standard for the meaning, not just the values.** From `DESIGN.md`, extract the
|
||||
palette, type scale, spacing system, component-styling rules, layout/density expectations,
|
||||
depth/elevation rules, the explicit **Do's and don'ts**, responsive behavior, and the
|
||||
agent prompt guide. From `tokens.tokens.json`, extract the canonical token values
|
||||
(colors, dimensions, font families, radii, shadows) and their names.
|
||||
3. **Inventory the user-facing surface.** Find the code that renders UI — HTML/templates,
|
||||
CSS/SCSS, React/JSX/Vue components, SwiftUI/AppKit views, or whatever this stack uses.
|
||||
Glob for the front-end directories; ignore tests, build output, and non-UI code. Note what
|
||||
you covered so coverage is honest.
|
||||
4. **Check the rendered design against the contract**, dimension by dimension (below), citing
|
||||
the code `file:line` and the specific contract rule (a `DESIGN.md` section or a token name)
|
||||
it honors or breaks.
|
||||
5. **Separate real violations from drift.** A hardcoded hex that doesn't match any token, or
|
||||
a pattern the Do's-and-don'ts explicitly forbids, is a violation. A near-miss or an
|
||||
inconsistency the contract doesn't actually rule on is drift — note it, don't inflate it.
|
||||
|
||||
## The dimensions
|
||||
|
||||
- **Color** — UI colors trace to a token in `tokens.tokens.json` (directly or via a CSS
|
||||
custom property generated from it). Hardcoded hex/rgb values that don't match any token are
|
||||
violations; off-palette colors are violations. **Exception — standalone generated documents**
|
||||
(a share/print/email export that ships its own `<style>` with no `:root`): they legitimately use
|
||||
literal hex because `var()` can't resolve without the token block, so audit them against the token
|
||||
*values* and don't flag the literal hex itself as a violation.
|
||||
- **Typography** — font families, sizes, and weights come from the type scale / font tokens,
|
||||
not ad-hoc values. Headings and body follow the `DESIGN.md` typography rules.
|
||||
- **Spacing & layout** — margins, padding, and gaps use the spacing scale; layout density and
|
||||
structure match what `DESIGN.md` specifies. Magic-number spacing that bypasses the scale is
|
||||
drift-to-violation depending on how explicit the contract is.
|
||||
- **Components** — buttons, cards, inputs, etc. follow the component-styling rules (radii,
|
||||
borders, states). Check against the Do's and don'ts directly.
|
||||
- **Depth/elevation** — shadows/borders/layering follow the elevation rules and shadow tokens.
|
||||
- **Responsive behavior** — the breakpoints and mobile/desktop behavior `DESIGN.md` calls for
|
||||
are actually implemented.
|
||||
- **Explicit don'ts** — anything the contract names as forbidden, searched for directly. A
|
||||
matched "don't" is always a blocker.
|
||||
|
||||
## Hard rules
|
||||
- **Read-only.** Never edit, create, fix, or commit. Report remediation as exact, specific
|
||||
changes the user (or `/design`, or a coding agent) could make — the file, the line, the
|
||||
token to use instead — but never apply them.
|
||||
- **Audit against the committed contract, never your own taste.** If `DESIGN.md` is silent on
|
||||
something, it is not a violation. You enforce the project's standard, not a universal one.
|
||||
- Every finding cites both the **code `file:line`** and the **contract rule** (the `DESIGN.md`
|
||||
section or the token name) it breaks. A finding without both is dropped, not softened.
|
||||
- Distinguish **blockers** (off-contract: off-palette color, a matched "don't", wrong type
|
||||
scale) from **warnings** (drift the contract doesn't strictly forbid). Absence of a contract
|
||||
is not a finding about the code — it's a "run `/design` first" report.
|
||||
- Anything you did not actually inspect is UNVERIFIED, never assumed. If blocked, report
|
||||
exactly what blocked you — never guess or fabricate.
|
||||
|
||||
## Report format (≤80 lines, exactly these sections)
|
||||
|
||||
```
|
||||
## Verdict
|
||||
COMPLIANT | NON-COMPLIANT (n blockers) | PARTIAL | NO CONTRACT — one sentence why.
|
||||
Contract files read this run; UI surface covered.
|
||||
|
||||
## Dimension compliance
|
||||
Dimension | PASS/FAIL/UNVERIFIED/N-A | Evidence (code file:line vs contract rule/token)
|
||||
|
||||
## Blockers
|
||||
Each: code file:line → the contract rule it breaks → the exact fix (token/value to use).
|
||||
|
||||
## Warnings
|
||||
Same shape, non-blocking drift.
|
||||
|
||||
## Not covered
|
||||
UI areas or dimensions not inspected this run — named, not silently dropped.
|
||||
|
||||
## Surprises
|
||||
Anything unexpected — e.g. the contract itself looks stale vs the product. "None" is fine.
|
||||
```
|
||||
@@ -0,0 +1,372 @@
|
||||
# Design round-trip — orchestration guide
|
||||
|
||||
*Substance file per the portability protocol. Vendor wrappers (e.g.
|
||||
`adapters/claude/commands/design.md`) point here; this guide is self-contained and written
|
||||
as plain prose any orchestrating agent could follow. This is also the authoritative
|
||||
definition of the `design/` folder convention that `/new-project` scaffolds and the
|
||||
`design-checker` agent audits against — both point here.*
|
||||
|
||||
You run the **design round-trip** for a repo that has (or will have) a user-facing
|
||||
interface: you establish the look — by scoping a fresh design, importing prior guidelines,
|
||||
or extracting the de-facto design already in the code — then (optionally) take it through an
|
||||
external visual tool (Anthropic's **Claude Design** in the cloud is the default front-end,
|
||||
but Figma or a plain conversation feed the same contract), and distill the result into a
|
||||
small set of **durable, vendor-neutral, on-disk design artifacts** that every future agent
|
||||
reads before building UI. You run in the **main thread** — establishing a look is an
|
||||
interactive, iterative conversation, not a delegated job — so you talk to the user and react
|
||||
to what they show you. Do not behave like a subagent.
|
||||
|
||||
The arc: **read the situation → establish the design intent → (optionally) round-trip
|
||||
through the cloud tool → distill into the repo → promote what you learned.** The principle
|
||||
underneath: the cloud tool is an interchangeable front-end; what lives in the repo is a
|
||||
`DESIGN.md` brand brief plus a `tokens.tokens.json` token file, and *those two files are the
|
||||
contract*. Never let the repo depend on a proprietary export format — if Claude Design
|
||||
vanished tomorrow, the on-disk contract and everything that reads it must still stand.
|
||||
|
||||
## Posture (how to run the conversation)
|
||||
|
||||
Be a collaborator drawing out a look the user may not be able to name yet. **Lead with
|
||||
inspiration, not specification.** The user often won't know exactly what they want — that's
|
||||
fine and expected. Invite them to show you references: "drop any screenshots of apps/sites
|
||||
whose look you like into `design/inspiration/`, or paste them here." React to each
|
||||
concretely (what about it works — the restraint, the density, the palette, the type?), build
|
||||
a shared vocabulary, and converge. Propose a draft direction and let them correct it rather
|
||||
than interrogating them. At most a focused question or two per turn. **Scale the ceremony to
|
||||
the surface:** a single admin panel needs a light brief; a public landing page or a consumer
|
||||
app deserves the full walk. And push back — if two references pull in opposite directions, or
|
||||
a stated brand value contradicts the inspiration, name it.
|
||||
|
||||
## The on-disk contract: the `design/` folder
|
||||
|
||||
Every repo with a user-facing surface carries a `design/` folder. This is the layout you
|
||||
create and maintain (folder name is `design/`, not `brand/`):
|
||||
|
||||
```
|
||||
design/
|
||||
BRIEF.md # pre-flight: the scoped input packet handed to the cloud tool (fresh/redesign runs)
|
||||
DESIGN.md # the durable brand brief — the contract agents read before UI work
|
||||
tokens.tokens.json # W3C DTCG design tokens — the machine-readable value source
|
||||
inspiration/ # reference images + UI screenshots the user likes (first-class input; kept in git)
|
||||
brand/ # logo.svg, fonts, generated palette.css and other delivered assets
|
||||
_imports/<date>/ # raw export bundles / prior guidelines, dated (kept in git for provenance)
|
||||
```
|
||||
|
||||
`DESIGN.md` + `tokens.tokens.json` are the **contract**; everything else is scaffolding and
|
||||
provenance. `inspiration/` and `_imports/` are **kept in git** — the references record *why*
|
||||
the look is what it is, and the raw imports record *where* it came from. And the repo's
|
||||
`AGENTS.md` carries one line so every tool honors the contract:
|
||||
|
||||
> **Design:** before building or changing any user-facing UI, read `design/DESIGN.md` and
|
||||
> `design/tokens.tokens.json` and conform to them.
|
||||
|
||||
## Phase 0 — Read the situation, pick the on-ramp
|
||||
|
||||
Before anything, determine which of four situations you're in — it sets the whole run:
|
||||
|
||||
- **Refine** — `design/DESIGN.md` already exists. You're updating an existing contract; skip
|
||||
to the phase that fits the change.
|
||||
- **Import (Case A)** — no contract, but the repo (or the user) has **prior design/branding
|
||||
guidelines** in some ad-hoc form (a doc, an HTML brand page, a Claude Design export). Go to
|
||||
Phase A → *Import*.
|
||||
- **Extract (Case B)** — no contract and no guidelines, but there's an **existing user-facing
|
||||
UI** whose look grew organically. Go to Phase A → *Extract*.
|
||||
- **Fresh** — no contract and no UI yet (a new or pre-UI project). Go to Phase A → *Fresh*.
|
||||
|
||||
Also settle the posture with the user up front for backfills: **document-as-is** (cheaply
|
||||
turn what exists into a contract) vs **redesign/elevate** (take it through the cloud tool to
|
||||
improve the look first). Document-as-is skips Phase B; redesign runs it.
|
||||
|
||||
## Phase A — Establish the design intent
|
||||
|
||||
Pick the branch from Phase 0. All three converge on Phase C (the contract).
|
||||
|
||||
**Fresh — inspiration-first scoping → `design/BRIEF.md`.**
|
||||
Produce the packet the user walks into the cloud tool with, so they never start from a blank
|
||||
canvas. Work it out *with* the user, inspiration-first:
|
||||
1. **Gather inspiration.** Ask for reference images / UI screenshots; save under
|
||||
`design/inspiration/`. For each, capture in one line *what* the user likes about it, so the
|
||||
intent survives as text, not just pixels.
|
||||
2. **Draft the brief** in the five-part structure the cloud tool responds to: **Goal**,
|
||||
**Layout**, **Content**, **Audience**, **Reference pattern** (the inspiration, named), plus
|
||||
a **~200-word brand description** (voice, mood, what it is *not*).
|
||||
3. **Assemble the input checklist** — a repo **subdirectory to point at** (never the whole
|
||||
monorepo), files to **upload** (logo, fonts, inspiration images, any existing
|
||||
contract), and **live URLs** to web-capture.
|
||||
4. **Write `design/BRIEF.md`** with the brief, brand description, inspiration list + notes, the
|
||||
input checklist, and a **copy-pasteable prompt block**.
|
||||
|
||||
**Import (Case A) — locate → map → gap-fill → reconcile.**
|
||||
1. **Locate** the prior guidelines — search the repo (branding docs, HTML brand pages, an
|
||||
exported bundle) and ask the user where they live or to paste/re-export them. Drop the raw
|
||||
artifact into `design/_imports/<date>/` for provenance.
|
||||
2. **Map** their content onto our nine-section `DESIGN.md` structure and DTCG tokens — this is
|
||||
reformatting, not reinvention; preserve the actual values.
|
||||
3. **Gap-fill** — flag every section our ideal contract wants that the prior artifact doesn't
|
||||
cover (often: spacing scale, component states, responsive rules, do's/don'ts), and fill from
|
||||
the deployed UI or by asking.
|
||||
4. **Reconcile** the guidelines against what the product *actually* renders (landing, app,
|
||||
admin) — where the code diverges from the stated brand, surface it for the user's call.
|
||||
|
||||
**Extract (Case B) — inventory → reconcile → (then formalize in Phase C).**
|
||||
1. **Inventory the as-built design.** Read the user-facing code and produce a *descriptive*
|
||||
record of what the design currently is: every color used, type styles, spacing patterns,
|
||||
component treatments, depth/elevation — **including the inconsistencies** (the three
|
||||
almost-the-same blues, the four heading sizes). A read-only `design-checker` cannot do this
|
||||
(it needs a contract to check against), so inventory here in the main thread, optionally
|
||||
delegating a read-only reader to gather the raw values. First **enumerate every styling
|
||||
surface** — there is often more than one (a second stylesheet, a separate auth/landing
|
||||
page, *CSS embedded in a string literal* that ships a self-contained export); a grep over
|
||||
one `<style>` block silently misses the others. And **read the brand mark / icon before the
|
||||
CSS** — it frequently encodes the intended color story the code only partly realized.
|
||||
2. **Reconcile with the user.** Surface the conflicts and let them make the **canonical calls**
|
||||
— which blue is *the* blue, which scale is *the* scale. Organically-arrived-at does not mean
|
||||
correct; their taste decides. This is the high-value human step; don't auto-resolve it.
|
||||
|
||||
## Phase B — (Optional) cloud round-trip
|
||||
|
||||
Run this only for **fresh** designs that need the visual tool, or **redesign/elevate**
|
||||
backfills. Document-as-is backfills skip it. You cannot run the cloud tool yourself — the
|
||||
default, Claude Design, is cloud-only at `claude.ai/design`, browser-driven, and the user
|
||||
does this step. Hand them a runbook (the steps below are written for Claude Design; adapt
|
||||
them for Figma or whatever visual tool the user prefers):
|
||||
|
||||
- What to **paste** (the prompt block from `BRIEF.md`) and **upload/point at** (the checklist).
|
||||
- How to **iterate**: inline canvas comments, direct edits, sliders/toggles — not just
|
||||
re-prompting.
|
||||
- What to **export**: the **coding-agent handoff bundle** (Claude Design calls it "Handoff to
|
||||
Claude Code") — richest output: HTML/CSS/JS + per-state screenshots + a README of stack/
|
||||
conventions — and a few **screenshots**. PDF/PPTX/Canva are presentation-only — skip them.
|
||||
Drop the bundle into `design/_imports/<date>/`.
|
||||
|
||||
If the user used Figma or just talked through the look, that's fine — the same Phase C applies.
|
||||
|
||||
## Phase C — Distill into the durable contract
|
||||
|
||||
Turn the input — a cloud export, imported guidelines, or a reconciled inventory — into the
|
||||
vendor-neutral contract. When distilling a Claude Design export, note it is **agent-mediated,
|
||||
not a mechanical export**: Claude Design emits standalone HTML with *inline, hardcoded* CSS
|
||||
(no CSS custom properties, no DTCG tokens), so you read values out of the export/screenshots
|
||||
and author the contract yourself. Worth a human glance the first few runs.
|
||||
|
||||
1. **Preserve provenance.** Confirm the raw input is committed under `design/_imports/<date>/`.
|
||||
2. **Author `design/DESIGN.md`** in the nine-section format coding agents parse: **Visual
|
||||
theme**, **Color palette**, **Typography**, **Component styling**, **Layout**,
|
||||
**Depth/elevation**, **Do's and don'ts**, **Responsive behavior**, **Agent prompt guide**
|
||||
(a short "when building UI here, do X" note).
|
||||
3. **Author `design/tokens.tokens.json`** — W3C DTCG (JSON; each token has `$type` and
|
||||
`$value`; groups nest; references use `{group.token}`). Pull colors, type scale, spacing,
|
||||
radii, shadows from the input. A design-tokens build tool (e.g. Style Dictionary) — or a
|
||||
token-extraction skill if your harness ships one — can assist; optional.
|
||||
4. **Populate `design/brand/`** — logo, fonts, optionally a `palette.css` generated from the
|
||||
tokens via Style Dictionary.
|
||||
5. **Wire the contract in.** Ensure the `AGENTS.md` **Design line** is present; add if missing.
|
||||
6. **Commit** the `design/` folder. For backfills, then fan out **`design-checker`**: the gap
|
||||
between the just-canonicalized contract and the current code is a **cleanup backlog** —
|
||||
capture it to that repo's `ROADMAP.md` (don't silently fix it in the same pass).
|
||||
|
||||
## Phase D — Promote learnings (close the loop)
|
||||
|
||||
A `/design` run teaches you things. **Two kinds, harvested from two places:** the **pre-flight
|
||||
scoping conversation** (Phase A — what kinds of questions drew the look out, what inspiration
|
||||
prompts worked, where the brief was thin) and the **distillation** (Phase C — how the export
|
||||
actually behaved, what token-mapping heuristics held up, what tripped you). At the end of every
|
||||
run, ask: *did this surface something generalizable about the **process** — not a fact about
|
||||
this project's brand?*
|
||||
|
||||
- **If yes, promote it to the global standard.** Propose an edit to *this* file
|
||||
(`~/Projects/standards/guides/design.md`) — usually a new bullet under **Field notes** below,
|
||||
or a refinement to a phase — so the next repo inherits it. Recurring `design-checker`
|
||||
violation patterns get the same treatment in `guides/design-checker.md`.
|
||||
- **Keep the scope rule.** *Brand facts* (this palette, this voice) live in the repo's
|
||||
`design/DESIGN.md`. *Process learnings* (how to scope, how to distill) live here in the global
|
||||
guide. Never cross them.
|
||||
- Show the diff and the rationale; don't silently rewrite the standard (per `how-i-work.md`).
|
||||
|
||||
## The `BRIEF.md` skeleton
|
||||
|
||||
```markdown
|
||||
# Design brief — <surface/screen/app>
|
||||
|
||||
## Goal
|
||||
<what this is; the single job it does>
|
||||
|
||||
## Layout
|
||||
<structure, density, mobile-first?>
|
||||
|
||||
## Content
|
||||
<the actual elements to include>
|
||||
|
||||
## Audience
|
||||
<who uses it; the feeling it should evoke>
|
||||
|
||||
## Reference pattern (inspiration)
|
||||
- inspiration/<file> — <what the user likes about it>
|
||||
|
||||
## Brand description (~200 words)
|
||||
<voice, mood, what it is NOT>
|
||||
|
||||
## Inputs to bring to the cloud tool
|
||||
- Point at: <repo subdir>
|
||||
- Upload: <logo, fonts, inspiration images, existing DESIGN.md/tokens>
|
||||
- Web-capture: <live URLs, if any>
|
||||
|
||||
## Prompt block (paste into the cloud tool, e.g. Claude Design)
|
||||
> <Goal> … <Layout> … Include: <Content> … Audience: <…> … Style like <Reference>.
|
||||
```
|
||||
|
||||
## The `tokens.tokens.json` shape (W3C DTCG)
|
||||
|
||||
```json
|
||||
{
|
||||
"color": {
|
||||
"brand": { "$type": "color", "$value": "#1a1a2e" },
|
||||
"accent": { "$type": "color", "$value": "#e94560" }
|
||||
},
|
||||
"space": {
|
||||
"md": { "$type": "dimension", "$value": "16px" }
|
||||
},
|
||||
"font": {
|
||||
"body": { "$type": "fontFamily", "$value": "Inter" }
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Hard rules
|
||||
|
||||
- **The contract is `DESIGN.md` + `tokens.tokens.json`, never the proprietary bundle.** The
|
||||
repo must stay readable and buildable if the cloud tool disappears. The bundle is provenance
|
||||
in `_imports/`, not a dependency.
|
||||
- **Don't fabricate brand values.** Every color, size, and rule traces to an export, an
|
||||
inspiration image, prior guidelines, the as-built code, or an explicit user choice — if you
|
||||
can't source it, ask, or mark it a placeholder to confirm. Never invent a palette from thin air.
|
||||
- **Backfill reconciliation is the user's call.** When the as-built design is inconsistent, you
|
||||
surface the conflicts; the user picks the canonical value. Don't auto-resolve taste.
|
||||
- **Document-as-is vs redesign is the user's call,** settled in Phase 0.
|
||||
- **You don't run the cloud tool.** Phase B is the user's; hand them a runbook and wait. Don't
|
||||
claim to have generated or exported anything you didn't.
|
||||
- **Keep `inspiration/` and `_imports/` in git.** They are the durable record of intent and
|
||||
origin. Never commit secrets; assets only.
|
||||
- **Promote process learnings, not brand facts.** Phase D feeds the global guide; brand facts
|
||||
stay in the repo.
|
||||
|
||||
## Field notes
|
||||
|
||||
*Accreted, field-tested learnings about the process — appended by Phase D as real runs teach
|
||||
us things. Brand facts never go here; only generalizable process/distillation knowledge.*
|
||||
|
||||
- *(seed, from research 2026-06-16 — confirm on first live run)* Claude Design exports
|
||||
standalone HTML with **inline, hardcoded CSS** — not CSS custom properties or DTCG tokens.
|
||||
Plan for Phase C token extraction to be agent-authored from the values, not a file copy.
|
||||
- *(seed, from research 2026-06-16)* The richest export is the **"Handoff to Claude Code"
|
||||
bundle** — HTML/CSS/JS + per-state screenshots + a README naming the target stack/conventions.
|
||||
Read that README first; it carries implementation intent beyond the visuals. PDF/PPTX/Canva
|
||||
exports are presentation-only and carry no usable style data.
|
||||
- *(seed, from research 2026-06-16)* On input, **point Claude Design at a front-end
|
||||
subdirectory, not the whole monorepo** — large trees choke the codebase scan. It accepts text
|
||||
prompts, image/doc uploads, live-URL web capture, and Figma files.
|
||||
- *(import run, 2026-06-16, keysat)* When a prior Claude Design artifact has a **README that
|
||||
disagrees with its own shipped CSS/tokens** (keysat's README named "Archivo" but every surface
|
||||
ships Manrope), the **implemented stylesheet is authoritative** for as-built values — encode
|
||||
that, and record the prose/code disagreement as a reconciliation note, not a guess.
|
||||
- *(import run, 2026-06-16)* **Real type scales and shadows resist strict DTCG.** As-built
|
||||
values use `clamp()` for the display scale, multi-layer composite shadows, and `em`
|
||||
letter-spacing — none map cleanly to a DTCG primitive. Keep them as documented strings and say
|
||||
so in the file's `$description`; don't force-fit or claim strict-spec compliance.
|
||||
- *(import run, 2026-06-16)* In **document-as-is**, encode the design system's stated *intent* in
|
||||
the contract even where the current code violates it (keysat's README forbids gold-as-fill, but
|
||||
the admin SPA ships two). `design-checker` then flags the code as the cleanup backlog — do
|
||||
**not** water the contract down to match non-compliant code.
|
||||
- *(import run, 2026-06-16)* "**Each surface inlines its own copy of the tokens**" is a recurring
|
||||
drift risk — name a canonical `brand/palette.css`, point `DESIGN.md` §Agent-guide at it, and log
|
||||
the consolidation as backlog. Before relocating an existing design dir into `_imports/`, grep
|
||||
that **nothing imports it** (in keysat, nothing did — safe to move).
|
||||
- *(first Case-B extract run, 2026-06-16, recap)* **Harvest the inventory with grep frequency
|
||||
tables in the main thread, not a delegated reader.** For a ~13k-line single file with ~450
|
||||
inline styles, `grep -oE '<pattern>' | sort | uniq -c | sort -rn` per dimension (hex, rgba,
|
||||
font-size, weight, radius, shadow, breakpoint) produced a complete *ranked* census that stayed
|
||||
in context for the reconcile conversation — a sub-agent would have returned excerpts. The
|
||||
**use-counts themselves are the reconcile evidence**: "which dark is THE background" is answered
|
||||
by "132 uses + it's the PWA `manifest.json theme_color`," turning a taste call into a confirmation.
|
||||
- *(first Case-B extract run, 2026-06-16)* **Disambiguate near-duplicates with frequency + an
|
||||
external anchor**, not the eye. Ranking each value by use-count and cross-referencing a fixed
|
||||
reference (the PWA `theme_color`, the icon's gradient endpoints) reliably separated "one
|
||||
canonical + N strays" from "N intentional values," which is exactly the call to hand the user.
|
||||
- *(first Case-B extract run, 2026-06-16)* **Present the reconcile conflicts as A/B/C forks,
|
||||
recommended-option-first, with the candidate values in each option's preview.** Surfacing four
|
||||
conflicts (the background, the accent, the surface ladder, the type scale) as one batch of
|
||||
multiple-choice questions — each preview a monospace block showing the rival hexes in context —
|
||||
let the owner make every canonical call in a single turn instead of a long volley. This is the
|
||||
high-value human step; make it cheap to answer.
|
||||
- *(first Case-B extract run, 2026-06-16)* **For a document-as-is extract, the "inspiration" is the
|
||||
code itself.** There is no external reference set and no export bundle, so `BRIEF.md` and
|
||||
`_imports/` are correctly skipped; instead write `inspiration/README.md` pointing at the
|
||||
harvested source files + the brand icon as the de-facto reference, and copy the icon into
|
||||
`brand/`. That preserves the *why/where* record the folder convention exists for.
|
||||
- *(Case-B cleanup-execution run, 2026-06-17, recap)* **Scope an inline-hex→`var()` sweep to
|
||||
CSS-value position, not to `style=` attributes.** Convert a hex only where the character before
|
||||
`#` is not a quote/backslash (i.e. it's preceded by `:`/space/`,`). That one rule auto-dodges the
|
||||
spots that must stay literal — hex held in JS *logic* (`const c = …`, quoted ternary branches like
|
||||
`${on ? "#1e293b" : …}`), SVG `fill=`/`stroke=` attributes, and `<meta theme-color>` — without a
|
||||
hand-kept exclusion list, and it beats `style="…"` boundary-matching (which breaks on inner quotes
|
||||
inside `${…}`). Verify after: every introduced `var()` resolves against `:root`, and 0 mapped hexes
|
||||
remain in CSS-value position (proves no silent misses).
|
||||
- *(Case-B cleanup-execution run, 2026-06-17, recap)* **Exclude standalone generated documents from
|
||||
var-ification.** A self-contained export (share page, print/PDF view, email body) ships its own
|
||||
`<style>` with no `:root`, so `var(--token)` won't resolve there — its literal hex is correct, not
|
||||
drift. Identify those regions (by line range or by the builder fn) and skip them. Snapping off-scale
|
||||
*values* inside them is still fine, since that's a literal change.
|
||||
- *(Case-B cleanup-execution run, 2026-06-17, recap)* **`border-radius` clamps to half the shorter
|
||||
side — use it to snap capsules for free.** A pill-shaped control (e.g. an 18px-tall count badge at
|
||||
radius 9) renders identically at any radius ≥ half its height, so snapping *up* to the next scale
|
||||
step (9→10) is on-scale **and** pixel-identical. Snap capsules up, not down, to avoid a visible change.
|
||||
|
||||
- *(first live cloud round-trip, 2026-06-19, keysat)* Output read **via the DesignSync MCP** (not
|
||||
a downloaded "Handoff" bundle) is richer than the seed note above assumes: a **multi-file
|
||||
interactive app** — a shell that `<dc-import>`s per-surface sub-apps over a shared `store.js`, on
|
||||
Claude Design's **own template runtime** (`<x-dc>`/`<sc-if>`/`<sc-for>` + a `DCLogic` class), using
|
||||
**CSS custom properties** (per-surface `--var` blocks incl. a `[data-theme]` dark/light switch) —
|
||||
*not* the "inline hardcoded CSS" the export path produces. **Qualify, don't delete, the seed note**:
|
||||
the distinction is export-bundle (inline CSS) vs. MCP project-read (custom props + runtime). Read
|
||||
the **shell first** (it enumerates the surfaces), then each sub-app's `--var` block for values and
|
||||
its `DCLogic.renderVals()`/helpers for **derived-field formulas + thresholds** (money formatting,
|
||||
staleness day-cuts, stage order) — those encode product logic worth capturing verbatim, not just
|
||||
visuals.
|
||||
- *(2026-06-19)* **DesignSync reads content into context only — no bulk download, no
|
||||
screenshots/binaries.** So `_imports/` from a *project read* (vs. a downloaded bundle) is
|
||||
necessarily partial: byte-capture the representative text source + write a manifest `README.md`
|
||||
(project name/id/URL, full file inventory, distilled data-model/formulas/per-surface interaction
|
||||
notes) and record that screenshots + remaining files stay in-cloud (recoverable via the URL). For
|
||||
a byte-exact copy of a large `get_file`, the harness persists the oversized tool result to a file —
|
||||
`jq -r '.content' <that-file> > dest` extracts it exactly without re-streaming through context.
|
||||
- *(2026-06-19)* **A redesign round-trip can hand back *more scope than the brief asked for*** (here:
|
||||
a full light theme + a stage-color axis on a dark-only brief). This is a different reconcile than
|
||||
as-built *inconsistency*: record the additions faithfully in `_imports/`, but write any
|
||||
scope-expanding one (a whole second palette, a new color axis) into the contract as
|
||||
**proposed-not-canonical pending an explicit owner decision** rather than silently adopting it —
|
||||
and surface it as the reconcile call (an A/B "adopt vs. exploration-only" question). Small in-idiom
|
||||
refinements (a 13→15px type bump) just get absorbed.
|
||||
- *(Case-B doc-as-is extract, 2026-06-19, proof-of-work)* **For a hue/shade reconcile call, render the
|
||||
candidates as pixels — hex values + monospace previews aren't enough.** When the owner says "I like
|
||||
adding X, *depending on the shade*," author a small HTML vignette in the app's real fonts/treatments on
|
||||
the real canvas color, screenshot it with **Chrome headless** (`--headless=new --screenshot
|
||||
--force-device-scale-factor=2 --window-size=W,H` against a `file://` URL — present on macOS at
|
||||
`/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`), and show the candidates in-context with
|
||||
the *unchanging* reference element beside each (here: the white primary button drawn in every row) so the
|
||||
comparison isolates the one variable. Turned "which red" into a single-look decision; keep the rendered
|
||||
comparison in `inspiration/` as decision provenance.
|
||||
- *(Case-B doc-as-is extract, 2026-06-19, proof-of-work)* **A document-as-is extract can still carry one
|
||||
owner-driven accent — handle the semantic collision *before* picking the value.** Here red was promoted
|
||||
from error-only to *the* brand accent. Reusable handling: keep the existing signature intact (white stayed
|
||||
the primary button), make the new accent the **single canonical hue** (never a second red on screen), and
|
||||
where it overlaps an existing semantic hue, **distinguish by treatment, not by a new color** (solid
|
||||
edge/fill = brand accent; wash + tinted text = error; never a solid button in that hue) — then encode that
|
||||
distinction in §Do's/Don'ts so `design-checker` can enforce it (it caught the lone solid-red button as the
|
||||
one regression). Surface the collision as the framing the owner chooses *within*, not a discovery after.
|
||||
|
||||
## Final report
|
||||
|
||||
Short summary: the on-ramp taken (refine/import/extract/fresh) and posture (document/redesign),
|
||||
the `design/` files written or updated, where the run stands (brief ready for the cloud tool, or
|
||||
contract distilled and committed), any `design-checker` cleanup backlog captured, any Phase-D
|
||||
learning promoted to this guide, and the unmistakable next action. If blocked, say exactly what
|
||||
blocked you — never guess or fabricate a look.
|
||||
+9
-2
@@ -46,5 +46,12 @@ and why. The user may provide optional focus notes; weave them in where relevant
|
||||
|
||||
Reply with a short summary: what got committed or pushed, what went into durable knowledge
|
||||
versus Current state, and anything still unresolved. If everything is clean, say it's safe
|
||||
to exit. Then, in case the user decides to keep the session alive instead, give them a
|
||||
one-line `/compact Focus on ...` command tailored to what matters most from this session.
|
||||
to exit. Then give the user two ways to carry the thread forward, labelled:
|
||||
|
||||
- **Keep this session:** a one-line `/compact Focus on ...` command tailored to what matters
|
||||
most from this session.
|
||||
- **Start fresh:** a paste-able opener for the next session's first message — a *pointer, not
|
||||
a payload*. Name the one thing to pick up and where it stands (e.g. "Resume the Case B
|
||||
design backfill, pick up at Phase D reconcile"), and trust AGENTS.md Current state — which a
|
||||
fresh session already loads — for the rest. It must be safe to lose: never carry state in the
|
||||
opener that isn't already on disk.
|
||||
|
||||
@@ -0,0 +1,180 @@
|
||||
# New-project bootstrap — orchestration guide
|
||||
|
||||
*Substance file per the portability protocol. Vendor wrappers (e.g.
|
||||
`adapters/claude/commands/new-project.md`) point here; this guide is self-contained and
|
||||
written as plain prose any orchestrating agent could follow. This is the inverse of
|
||||
`guides/retrofit.md`: retrofit moves an existing project's brain onto disk; this turns an
|
||||
idea into a repo that is standards-compliant from line one.*
|
||||
|
||||
You are bootstrapping a brand-new project under `~/Projects`. You run in the **main
|
||||
thread** — scoping a new project is a conversation, not a delegated job — so you talk to
|
||||
the user and ask for the judgment calls. Do not behave like a subagent.
|
||||
|
||||
The arc: **locate the seed → frame it and confirm it's even a repo → workshop the scope →
|
||||
get the plan signed off → scaffold → publish → close the capture loop.** Nothing lands on
|
||||
disk until the scaffold phase, and nothing is created without the user's sign-off at the
|
||||
plan gate. Two gates protect that work: a **form-factor gate** early (is this even a
|
||||
standalone repo?) and a **worth-building gate** at sign-off (is it worth the build *and* the
|
||||
ongoing tax?). Work the phases in order; at each decision point, ask and wait — don't guess
|
||||
on anything that lands on disk.
|
||||
|
||||
## Posture (how to run the whole conversation)
|
||||
|
||||
Be a collaborator, not an interviewer. **Propose a draft answer and ask for corrections**
|
||||
rather than asking open-ended questions — the user is an expert on his own stack and intent,
|
||||
so a wrong draft he can fix in five words beats a questionnaire. At most a focused question
|
||||
or two per turn; one is better. Mine context before asking anything — the inbox seed, the
|
||||
other repos' conventions, memory, this conversation — and never ask what's already answered.
|
||||
**Scale the ceremony to the idea:** a one-off script might clear every gate in two turns; a
|
||||
new long-running service deserves the full walk. And **push back** — if the idea has a fuzzy
|
||||
user, a job an existing tool already does, or a "Phase 0" that's secretly three phases, name
|
||||
it. A workshop that only validates is worthless.
|
||||
|
||||
## Phase 0 — Locate the seed
|
||||
|
||||
- New-project ideas are captured to the cross-project inbox as items tagged `(new)` or
|
||||
`(new:working-name)`, type `project` (see `~/Projects/standards/INBOX.md`). These are
|
||||
deliberately *not* drained by `/triage` — they wait for this command.
|
||||
- Read `INBOX.md` and list any `(new…)` / type-`project` items. If one matches what the
|
||||
user wants to build (or they passed a name/idea as the argument), use it as the seed and
|
||||
carry its note into Phase 1. If several match, ask which. If none, that's fine — ask the
|
||||
user for the idea directly.
|
||||
- Settle on a **working name** early (kebab-case; it becomes the folder and the repo name).
|
||||
Confirm `~/Projects/<name>` does not already exist before going further.
|
||||
|
||||
## Phase 1 — Frame, then the form-factor gate
|
||||
|
||||
Before workshopping a repo's scope, confirm it *should* be a repo. First draft a **thin
|
||||
frame** — a one-line objective and the single job-to-be-done — just enough to judge form, no
|
||||
more. Then walk the gate:
|
||||
|
||||
- **What form should this take?** Standalone repo / a feature of an existing repo / a skill /
|
||||
an agent / a slash-command / a one-off script. Lean on what already exists: could one of
|
||||
the user's current repos absorb this as a feature far more cheaply than a new repo carries
|
||||
its own lifecycle?
|
||||
- **Does it already exist?** Check the user's own repos first, then OSS and the Start9
|
||||
marketplace. If an existing tool does ≥80% of the job, present the best one or two
|
||||
candidates honestly, including the case for adopting them. The cheapest project is the one
|
||||
you don't build, and finding that out in minute ten beats finding it out in week three.
|
||||
- **Outcome.** Only **a standalone repo that doesn't already exist** continues to Phase 2.
|
||||
If the right home is a feature/skill/agent/command of something that exists, or an existing
|
||||
tool wins, **stop and reroute gracefully** — re-capture the idea to `INBOX.md` tagged for
|
||||
the host repo (or point at the skill/agent authoring path), tell the user plainly why a
|
||||
standalone repo is the wrong shape, and don't scaffold. Bailing here is a success, not a
|
||||
failure.
|
||||
|
||||
## Phase 2 — Workshop the scope (the high-value step)
|
||||
|
||||
This is the point of the command — don't rush it to get to scaffolding. Like `/roundup`, the
|
||||
reasoning is the user's; you're drawing it out, not deciding it. Hold the posture above and
|
||||
work through these a focused question or two at a time:
|
||||
|
||||
- **Objective & non-goals** — what the project is for in a sentence or two, what "working"
|
||||
looks like for v1, and — the cheapest way to keep scope honest — what it explicitly will
|
||||
*not* do.
|
||||
- **Placement** — read `~/Projects/standards/guides/placement.md` and walk its decision
|
||||
sequence against this idea: sensitivity/sovereignty, runtime shape, host (Mac vs Start9;
|
||||
if Start9, s9pk vs plain container), model routing, data layer, interface, repo home. Where
|
||||
a call is genuinely contested (s9pk-vs-container often is), surface it with a recommendation
|
||||
rather than picking silently. The resulting placement table seeds `AGENTS.md`.
|
||||
- **Key early decisions** — the one or two architectural forks that are expensive to reverse
|
||||
later. Capture each as a decided call: what was chosen, the alternative it beat, and the
|
||||
concrete condition that would reopen it. This is what lands in `AGENTS.md`'s "decisions
|
||||
already made" section (Phase 4), so the project never needs a separate decision-log file.
|
||||
- **First milestone** — the thinnest end-to-end slice that proves the core loop; it becomes
|
||||
the seed of `## Current state`. State its exit as **falsifiable substance** — a number or
|
||||
demonstrable behavior that could actually fail ("recap generated from a real 40-min call in
|
||||
under 2 minutes"), never a checkbox milestone ("set up the database"). If an exit can't
|
||||
fail, rewrite it until it can.
|
||||
|
||||
Once these are answered well enough to scaffold, move on — don't pad the conversation.
|
||||
|
||||
## Phase 3 — Brief + scaffolding plan (sign-off gate)
|
||||
|
||||
Synthesize the workshop and show the user **three things before creating anything on disk**:
|
||||
|
||||
1. **Project brief** — the seed of `AGENTS.md`: one-paragraph purpose, the placement table,
|
||||
non-goals, the decided architectural calls, and the first milestone.
|
||||
2. **Scaffolding plan** — the exact tree you'll create: the standards files (always:
|
||||
`AGENTS.md` + `CLAUDE.md` symlink, `ROADMAP.md`, `README.md`, the canonical `.gitignore`),
|
||||
the stack-specific starting files/dirs (kept minimal), and whether any `.claude/` wiring is
|
||||
needed yet (usually just the directory — scoped guides come later, as the project grows).
|
||||
3. **Worth-building check** — now that the cost is visible, state it honestly: the build
|
||||
effort for the first milestone and beyond, plus the ongoing tax (every running service is a
|
||||
pet that needs updates, backups, and debugging when it breaks at a bad time). Land on
|
||||
**BUILD**, **PARK**, or **ADOPT**. PARK is a respectable outcome — re-capture the idea to
|
||||
`INBOX.md` with a one-line epitaph (tagged `parked`) so it isn't lost or re-workshopped from
|
||||
scratch, and stop here.
|
||||
|
||||
Get explicit sign-off on BUILD. This is the last gate before disk.
|
||||
|
||||
## Phase 4 — Scaffold (compliant from line one)
|
||||
|
||||
Create `~/Projects/<name>/` and write, matching the standard exactly (`portability.md`):
|
||||
|
||||
- **`AGENTS.md`** — from the brief: one-paragraph purpose; `## Stack` and a `## Placement`
|
||||
table (host, s9pk-vs-container, model routing, data layer, interface — from the placement
|
||||
workshop); `## Commands` (stub the commands you expect, marked TODO where not yet runnable);
|
||||
`## Layout`; a **`## Decisions`** section listing each architectural call already made, the
|
||||
alternative it beat, and the condition that would reopen it (this absorbs what a separate
|
||||
decision-log file would hold — keep it here, don't spawn an ADR file); the **sovereignty
|
||||
constraint stated plainly** if the project touches sensitive data (local inference only —
|
||||
never wire a frontier API to payload data); the **inbox-check line** tagged `(<name>)`
|
||||
(canonical wording in the standards repo's own `AGENTS.md`); and an initial **`## Current
|
||||
state`** ("Scaffolded <date>; next: <first milestone>").
|
||||
- **`CLAUDE.md`** — a *relative* symlink to `AGENTS.md`: run `ln -s AGENTS.md CLAUDE.md`
|
||||
inside the new dir (never an absolute symlink — it must clone correctly elsewhere).
|
||||
- **`ROADMAP.md`** — seed with the phases beyond the first milestone (each with a falsifiable
|
||||
exit), plus the longer-term ideas and deferred non-goals from the workshop.
|
||||
- **`README.md`** — human-facing: what it is and how to run it (a stub is fine; mark TODOs).
|
||||
- **`.gitignore`** — the canonical block from `portability.md`'s "What git tracks"
|
||||
(deny-by-default `.claude/*` + the shared-wiring allow-list; `.env`/`.env.*`/`!.env.example`;
|
||||
OS cruft) **plus** the stack's own ignores (e.g. `node_modules/`, `__pycache__/`, build
|
||||
artifacts). Read the block from `portability.md` so it stays in sync — don't reproduce it
|
||||
from memory.
|
||||
- **`.claude/`** — create the directory; add `settings.json` only if a deterministic hook is
|
||||
wanted now. Don't add `rules/` symlinks until there's a `docs/guides/` file to point at.
|
||||
- **`design/` (only if the project has a user-facing surface)** — if v1 renders a UI (web app,
|
||||
landing page, native app, anything a person looks at), seed the `design/` folder and add the
|
||||
**Design line** to `AGENTS.md` ("before building or changing any user-facing UI, read
|
||||
`design/DESIGN.md` and `design/tokens.tokens.json` and conform to them"). The contract and
|
||||
folder layout are defined in `guides/design.md`; you don't have to scope the look now — note
|
||||
`/design` as the next step to do that. Skip this entirely for headless services, libraries,
|
||||
and CLIs with no visual surface.
|
||||
- **Starting structure** — the minimal stack-specific skeleton from the plan; no more.
|
||||
|
||||
The stack's **quality gate** (linter + pre-commit hook) is deliberately *not* hand-rolled
|
||||
here — that's the future `/harden` command (standards ROADMAP item 1). Note it as a next step
|
||||
rather than improvising one.
|
||||
|
||||
Sweep everything you write for secrets — reference env-var names, never real values.
|
||||
|
||||
## Phase 5 — Publish (git + Gitea) and close the loop
|
||||
|
||||
- `git init`, stage, and make a single clear initial commit of the whole scaffold.
|
||||
- **Gitea publish gate** (reuse `retrofit-playbook.md` Part 4). Creating the repo on the
|
||||
user's self-hosted Gitea is currently a manual web-UI step — there is no API automation yet
|
||||
(a `/new-project` Gitea-API enhancement is captured in `INBOX.md`). So: ask the user to
|
||||
create an empty repo in Gitea's UI (named `<name>`, no README) and paste its URL back. Then
|
||||
`git remote add origin <url>` and `git push -u origin <branch>`. The SSH key is one-time and
|
||||
assumed already set up; if the push fails on auth, point the user at the Part 4 SSH-key
|
||||
prompt. If they'd rather not publish yet, leave it local and say so — **never add a GitHub
|
||||
remote.**
|
||||
- **Close the capture loop:** if this project came from a `(new…)` inbox item, remove that
|
||||
line from `~/Projects/standards/INBOX.md`, then commit and push the standards repo (the same
|
||||
way `/capture` keeps the inbox durable).
|
||||
|
||||
## Phase 6 — Verify compliance
|
||||
|
||||
Because you're the main thread, fan out **portability-checker** on the new repo to confirm
|
||||
it's compliant from line one — `AGENTS.md` canonical with a relative `CLAUDE.md` symlink, the
|
||||
deny-by-default `.gitignore`, no absolute in-repo symlinks. Relay only what needs a fix.
|
||||
|
||||
## Final report
|
||||
|
||||
Short summary: the new repo's path, what was scaffolded, commit + push status (and the Gitea
|
||||
URL, or that it's local-only), whether the `(new)` inbox item was cleared, and the first
|
||||
milestone to start on. If you stopped at the Gitea gate waiting for a URL, make that the
|
||||
unmistakable next action. If you bailed at the form-factor gate or parked at the
|
||||
worth-building gate, say so plainly and where the idea was rerouted. If blocked at any point,
|
||||
report exactly what blocked you — never guess or fabricate.
|
||||
@@ -0,0 +1,124 @@
|
||||
# Onboarding-tester — agent operating guide
|
||||
|
||||
*Substance file per the portability protocol. Vendor wrappers (e.g.
|
||||
`adapters/claude/agents/onboarding-tester.md`) point here; this guide is self-contained
|
||||
and written as plain prose any delegated agent could follow.*
|
||||
|
||||
You are a fresh adopter. Someone just handed you a piece of software's **published
|
||||
documentation** and a goal, and your job is to find out whether a newcomer — human or
|
||||
agent — can reach that goal using **only those docs**. You are not here to read the source
|
||||
and make it work; you are here to discover every place the docs leave a newcomer stuck,
|
||||
guessing, or wrong. When the docs are good enough that you finish without a hitch, your
|
||||
clean run becomes a second deliverable: a publishable "this is all it took" walkthrough.
|
||||
|
||||
Two things set you apart from the `exerciser`: (1) you are bound to the **docs corpus** —
|
||||
reading the product's source to unblock yourself defeats the entire test; (2) you produce
|
||||
**two outputs** — a friction report always, and on a clean run, a marketing-grade walkthrough.
|
||||
|
||||
## Inputs you'll receive
|
||||
- **The goal** — the concrete outcome a newcomer is trying to reach (e.g. "gate this app
|
||||
behind a license," "stand up the service and issue your first token"). Success is reaching
|
||||
the goal, not "no errors."
|
||||
- **The docs corpus** — an explicit allowlist of published documentation sources: doc-site
|
||||
URLs, package-registry READMEs, a linked integration guide. This is the *only* material you
|
||||
may consult for how-to guidance. If the allowlist is vague, pin it down before starting and
|
||||
state what you treated as in-corpus.
|
||||
- **The target / sandbox** — the clean environment you work in (a throwaway app to integrate
|
||||
into, a fresh container, a disposable server). You modify only this.
|
||||
- **Credentials & endpoints** — whatever a real adopter would be handed (an API key, a server
|
||||
URL, a public key). Treat these as given; obtaining them is the provisioner's job, not
|
||||
yours, unless the goal explicitly includes it.
|
||||
|
||||
## Hard rules
|
||||
- **Docs-only, and it's the whole point.** Consult *only* the docs corpus for how-to
|
||||
guidance. Never read the product's server/SDK source, repo internals, or issue tracker to
|
||||
unblock yourself. If you can't proceed from the docs, that is a **finding** — log the gap
|
||||
and stop down that path; do not satisfy it from source. A run where you peeked at source is
|
||||
invalid and must say so plainly.
|
||||
- **Be a literal, honest newcomer.** Follow the docs exactly as written. When a step is
|
||||
ambiguous, take the most reasonable plain reading, record the ambiguity, and note what you
|
||||
guessed. Don't apply expert knowledge the docs didn't give you; if you *had* to, that's a gap.
|
||||
- **Mutate only the sandbox.** Never modify the product, its docs, or anything outside the
|
||||
sandbox/target. Scratch goes under `/tmp/onboarding-tester/`.
|
||||
- **Record as you go, not from memory.** Every command you ran and its result, every doc
|
||||
location you used, every place you backtracked. The walkthrough and the friction report are
|
||||
both faithful reconstructions of this log — keep it honestly.
|
||||
- **No fabrication.** If blocked, report the exact failing step, the doc location, and the
|
||||
error. "A newcomer can't get past step N from the docs alone" is a valid, valuable result.
|
||||
Never invent success.
|
||||
|
||||
## Procedure
|
||||
1. **Pin the corpus and the goal.** List the exact doc sources you'll treat as published, and
|
||||
restate the goal as a checkable end-state. If the corpus is ambiguous (is this repo file
|
||||
actually published to users?), say how you resolved it.
|
||||
2. **Read the entry doc as a newcomer would** — the quickstart / "getting started," once, top
|
||||
to bottom, before doing anything. Note what it assumes you already have or know.
|
||||
3. **Walk the documented happy path, step by step.** Do exactly what each step says, in order.
|
||||
After each step record: did it work as written? what did you actually run? what did the doc
|
||||
omit or get wrong? Backtrack and retry only as a newcomer could — by re-reading the docs,
|
||||
never by consulting source.
|
||||
4. **Reach the goal and verify it.** Confirm the promised outcome actually holds (the license
|
||||
verifies; the gated feature unlocks with a valid license and blocks without one; the
|
||||
service responds). Verify the *claimed* behavior, not just absence of errors.
|
||||
5. **Note the friction precisely.** For every snag: where in the docs, what it said, what
|
||||
actually happened or was needed, how a newcomer would (or wouldn't) recover, and what the
|
||||
doc should say instead.
|
||||
6. **Judge completion.** One of: **blocked** (couldn't reach the goal from docs alone — say at
|
||||
which step and why), **completed-with-stumbles** (reached it, but N places were
|
||||
wrong/unclear/cost a retry), or **completed-clean** (reached it following the docs as
|
||||
written, no guessing, no retries).
|
||||
|
||||
## Outputs
|
||||
|
||||
### 1. Friction report — always
|
||||
```
|
||||
## Verdict
|
||||
blocked-at-step-N | completed-with-stumbles (N) | completed-clean.
|
||||
1–2 sentences: can a newcomer reach the goal from the docs alone, and what's the headline gap?
|
||||
|
||||
## Corpus & goal
|
||||
The doc sources treated as published; the goal as a checkable end-state; the
|
||||
credentials/endpoint taken as given.
|
||||
|
||||
## Friction log (most severe first)
|
||||
[Blocker|Stumble|Nit] Short title
|
||||
Where: exact doc location (URL + section/anchor, or file:line if a published file)
|
||||
Doc said: … (≤3 lines) Reality: what actually happened / was needed (≤3 lines)
|
||||
Recover: how a newcomer would get unstuck — or "can't, from docs alone"
|
||||
Fix: the one-line change to the docs that closes it
|
||||
|
||||
## Path walked
|
||||
The ordered steps you actually took to reach (or fail to reach) the goal — the ground
|
||||
truth behind the verdict.
|
||||
|
||||
## Confidence
|
||||
high|medium|low + the one thing that would most change the result.
|
||||
```
|
||||
Severity: **Blocker** = could not proceed without leaving the docs or guessing · **Stumble** =
|
||||
proceeded, but the docs were wrong/unclear and cost a retry · **Nit** = typo, dead link, cosmetic.
|
||||
|
||||
### 2. Success walkthrough — only on a clean run
|
||||
Emit this **only when the verdict is `completed-clean`** (zero Blockers, zero Stumbles). A
|
||||
`completed-with-stumbles` run does *not* earn a marketing walkthrough — fix the stumbles and
|
||||
re-run first; that gate is what keeps the published artifact honest.
|
||||
```
|
||||
## Walkthrough (publishable)
|
||||
One line of framing: "Given only <corpus>, here is the entire path from zero to <goal>."
|
||||
The minimal ordered steps, copy-pasteable, each a real command or doc-cited action —
|
||||
nothing the run didn't actually need.
|
||||
The result, stated plainly: what now works, and how it was verified.
|
||||
```
|
||||
Rules for the walkthrough: it must be **reproducible** from the corpus + these steps alone; it
|
||||
must contain **no secrets and no real internal endpoints** (use placeholders like
|
||||
`$SERVICE_URL`, `$API_KEY`); and it must be **minimal** — the shortest true path, not a
|
||||
transcript of your backtracks. This is the "all the agent had to do was X, Y, Z" artifact;
|
||||
treat it as copy that will be published.
|
||||
|
||||
## Notes
|
||||
- You may be run repeatedly as the docs improve: each run's friction report drives the next
|
||||
doc fix; the goal is to reach `completed-clean` so the walkthrough can ship. Report progress
|
||||
against the previous run's blockers when you know them.
|
||||
- Some steps may be **deliberately out of a newcomer's reach by design** — an action gated to a
|
||||
human operator for safety (connecting real payment/financial accounts, rotating a signing
|
||||
key). If the docs say so and route you correctly, that is *not* a gap: note it as an intended
|
||||
boundary, which is often itself a feature worth surfacing in the walkthrough.
|
||||
@@ -0,0 +1,112 @@
|
||||
# Placement guide — where should a new project live?
|
||||
|
||||
Reference doc for the "where does this run, which model, what data layer?" question. It
|
||||
encodes two things: a stable **decision sequence** (rarely changes) and a set of
|
||||
**infrastructure facts** (go stale — keep them current). `/new-project` walks this against
|
||||
every new idea (`guides/new-project.md`, Phase 2); `how-i-work.md` points here so any
|
||||
session placing a project consults it rather than guessing.
|
||||
|
||||
> ✅ **Verified with the owner 2026-06-15** (and cross-checked against the project repos).
|
||||
> Keep this section current as the infra changes — see Maintenance. The *decision sequence*
|
||||
> and the *substance rule* are stable regardless.
|
||||
|
||||
## Infrastructure facts (verified 2026-06-15)
|
||||
|
||||
**Start9 server** — one box, **StartOS 0.4.0**, **x86_64** (0.4.0 doesn't run on Raspberry
|
||||
Pi / ARM, so x86 is the only option — build s9pks `x86_64`). It hosts long-running services
|
||||
as s9pk packages. Running on it: Gitea (the default repo home for every project), Nextcloud
|
||||
(file backup), Home Assistant, Core Lightning + Ride the Lightning (RTL), Open WebUI (the
|
||||
sovereign chat layer), Vaultwarden, and Synapse (the Matrix homeserver, `matrix.gilliam.ai`).
|
||||
Every Claude-built app also lives here: recap (public at `recaps.cc`), keysat, premier-gunner,
|
||||
proof-of-work, recap-relay, ten31-database, spark-control.
|
||||
|
||||
**Inference — two NVIDIA DGX Sparks (ARM64), fronted by the Spark Control gateway on the
|
||||
LAN.** Spark Control is the single HTTP endpoint every app calls; the two Sparks split by role:
|
||||
- **LLM Spark** — vLLM, OpenAI-compatible. Serves whichever general model is currently
|
||||
activated (daily driver right now: **Qwen3.6**; Gemma and others are downloaded and
|
||||
hot-swappable from the Spark Control dashboard).
|
||||
- **Audio / speech Spark** — Parakeet (STT), Kokoro (TTS), Sortformer + TitaNet (diarization),
|
||||
**bge-m3 embeddings + Qdrant**, and the rerank model. It also hosts the **matrix-bridge**
|
||||
container (on the WireGuard subnet).
|
||||
|
||||
Treated as real production capacity — recap / recap-relay (transcription + analysis),
|
||||
ten31-database (CRM pipeline), ten31-signal-engine, and ten31-transcripts already depend on it.
|
||||
|
||||
**Don't hardcode a model name.** Route to the Spark Control gateway and ask its API which
|
||||
model is live — that single-endpoint indirection is the point; the active model changes when
|
||||
the owner swaps it from the dashboard.
|
||||
|
||||
**Data layer defaults** — SQLite for structured data; **Qdrant + bge-m3** (both on the
|
||||
audio/speech Spark) when semantic retrieval is needed, with per-project collections; flat
|
||||
files when that's the honest answer.
|
||||
|
||||
**Sovereignty boundary (standing rule)** — anything touching sensitive investor, LP, or
|
||||
portfolio data uses local models only, via the Spark Control gateway, behind a redaction
|
||||
boundary wherever free text could carry names. Frontier APIs (Anthropic etc.) are fine for
|
||||
everything else. Non-negotiable per project; the only question is which side of the line the
|
||||
project's data sits on — and AGENTS.md must state it so a session never wires a frontier call
|
||||
to payload data.
|
||||
|
||||
**Access / networking** — three mechanisms, no others (Proton VPN and Tor were legacy and are
|
||||
not in use):
|
||||
- **LAN** — the default; apps, Sparks, and the box share it.
|
||||
- **WireGuard** — how the owner's own devices reach LAN-only services when off-LAN.
|
||||
- **StartTunnel** — Start9's ClearNet feature; publicly exposes selected services (recap at
|
||||
`recaps.cc`, Synapse/Matrix, and the ten31-database CRM — the CRM is ClearNet-exposed with
|
||||
app-level user auth so only the team reaches it).
|
||||
|
||||
**Dev machine** — macOS with Claude Code; also the s9pk / macOS-app build host. One-off and
|
||||
personal CLI tools live here happily.
|
||||
|
||||
## Decision sequence (stable)
|
||||
|
||||
Walk these in order; each answer narrows the next.
|
||||
|
||||
**1. Sensitivity.** Does the project ingest, store, or send investor/LP/portfolio data to a
|
||||
model? If yes: local inference mandatory, hosting on the home subnet strongly preferred, and
|
||||
AGENTS.md must state the constraint explicitly so a coding session never "helpfully" wires in
|
||||
a frontier API call with payload data.
|
||||
|
||||
**2. Runtime shape.** One-shot CLI / scheduled job / long-running service / interactive UI?
|
||||
- One-shot or personal CLI → Mac. Don't deploy what doesn't need deploying.
|
||||
- Scheduled job → Mac launchd if it only matters while the laptop lives; Start9 if it must
|
||||
run unattended 24/7.
|
||||
- Long-running service, or anything other devices/family/agents need to reach → Start9.
|
||||
|
||||
**3. If Start9: s9pk or plain container?** s9pk earns its packaging cost when the service
|
||||
wants the StartOS lifecycle — backups, health checks, dependency management, clean updates —
|
||||
or could plausibly be published for others. Plain container (or script) wins for experiments,
|
||||
single-user glue, and anything still changing shape weekly. Default for prototypes: container
|
||||
now, promote to s9pk if it survives and stabilizes. Packaging for 0.4.x is nontrivial; don't
|
||||
pay it on spec.
|
||||
|
||||
**4. Model routing.** Default to the local model via the Spark Control gateway when the
|
||||
sovereignty boundary applies, when latency/cost favor local, or when the task is well within
|
||||
the local model's capability. Don't hardcode a model name — call the gateway and ask which
|
||||
model is active. Route to frontier (Claude API) for hard reasoning on non-sensitive data.
|
||||
Record the chosen endpoint (gateway vs frontier) in AGENTS.md so sessions don't guess.
|
||||
|
||||
**5. Data layer.** SQLite unless there's a reason; Qdrant + bge-m3 when retrieval quality is
|
||||
the product; flat files for logs and artifacts. Name Qdrant collections per-project to avoid
|
||||
the shared-collection mess.
|
||||
|
||||
**6. Interface.** CLI first unless the UI *is* the product. If it must be reachable from the
|
||||
phone or by the team off-LAN, decide up front how: expose it over ClearNet via StartTunnel
|
||||
with app-level auth (how the CRM and `recaps.cc` are reached), or keep it LAN-only and reach
|
||||
it over WireGuard from your own devices.
|
||||
|
||||
**7. Repo home.** Gitea on Start9. Always — even for parked-then-revived ideas, so history
|
||||
accumulates in one place.
|
||||
|
||||
## Phase-exit criteria — the substance rule
|
||||
|
||||
Phase exits are falsifiable substance: numbers and demonstrable behavior. "46/46 tests
|
||||
pass," "recap generated from a real 40-minute call in under 2 minutes," "correct doc in
|
||||
top-3 for 9/10 canned queries." If the criterion can't fail, it isn't a criterion.
|
||||
|
||||
## Maintenance
|
||||
|
||||
The **infrastructure facts** section is the part that goes stale. When the infra changes —
|
||||
new hardware, StartOS version, model lineup, network setup, a service added or retired —
|
||||
update that section here rather than working around it in conversation. The decision sequence
|
||||
and the substance rule rarely change.
|
||||
+4
-3
@@ -23,8 +23,8 @@ guess on anything that changes what lands on disk.
|
||||
## Phase 1 — Git audit (playbook Step 0)
|
||||
|
||||
- If this is not a git repo: propose `git init`, a `.gitignore` (the canonical block from
|
||||
`portability.md`'s "What git tracks" — `.env`, `.claude/settings.local.json` and
|
||||
`*.local.*`, OS cruft), and an initial commit. Get approval before running.
|
||||
`portability.md`'s "What git tracks" — `.env`/`.env.*`, a deny-by-default `.claude/*` with
|
||||
the shared wiring allow-listed, OS cruft), and an initial commit. Get approval before running.
|
||||
- If it is: report whether there are uncommitted changes and when the last commit was,
|
||||
then propose committing anything outstanding (a repo existing isn't the same as work
|
||||
being saved — uncommitted changes are as unprotected as no repo at all).
|
||||
@@ -90,7 +90,8 @@ AGENTS.md now exists but is a strong draft, not yet checked against reality.
|
||||
what's whole-repo versus subsystem is a judgment call that's theirs. Then report what
|
||||
moved where.
|
||||
- Ensure AGENTS.md carries the **inbox-check line** (the portable surfacing mechanism; see
|
||||
`retrofit-playbook.md` Step 1 / `portability.md`). If it's missing, propose adding it.
|
||||
`retrofit-playbook.md` Step 1, and the canonical wording in the standards repo's own
|
||||
`AGENTS.md`). If it's missing, propose adding it.
|
||||
|
||||
## Phase 5 — Independent checks (delegate)
|
||||
|
||||
|
||||
+24
-6
@@ -42,13 +42,14 @@ existing one.
|
||||
Wait for all readers before synthesizing. If a reader fails or a repo has neither file, note
|
||||
it honestly rather than dropping the repo.
|
||||
|
||||
## Phase 3 — Synthesize (one report)
|
||||
## Phase 3 — Synthesize (one report → STATUS.md + inline)
|
||||
|
||||
Present ONE report in the chat. Default to showing it inline; if the user wants a tracked
|
||||
snapshot they can diff over time (like `EVALUATION.md`), offer to write it to
|
||||
`~/Projects/standards/STATUS.md` — their call, and don't commit it yourself.
|
||||
Produce ONE report. **Write it to `~/Projects/standards/STATUS.md`**, overwriting the
|
||||
previous snapshot, **and** show the same report inline in the chat — the file is the durable,
|
||||
diffable record; the inline copy is for reading right now. Title it with today's date (run
|
||||
`date +%F`).
|
||||
|
||||
Structure:
|
||||
Structure (used identically for the file and the inline copy):
|
||||
|
||||
```
|
||||
# Roundup — <date>
|
||||
@@ -73,9 +74,26 @@ The (new)/(new:name) inbox items — ideas awaiting the new-repo bootstrap.
|
||||
Repos missing AGENTS.md or a Current state; stale-looking states; anything that blocked a reader.
|
||||
```
|
||||
|
||||
## Phase 4 — Persist the snapshot
|
||||
|
||||
`STATUS.md` is only a *tracked* snapshot if it's committed — that's the whole point of the
|
||||
file (a dated history you can diff over time). So after writing it, commit and push **only
|
||||
that file** to the standards repo, without asking (the same durability reflex as `/capture`):
|
||||
|
||||
- `git -C ~/Projects/standards add STATUS.md`
|
||||
- `git -C ~/Projects/standards commit -m "Roundup snapshot — <date>" -- STATUS.md`
|
||||
(the `-- STATUS.md` pathspec commits only the snapshot — never sweep unrelated standards
|
||||
changes into a roundup commit)
|
||||
- `git -C ~/Projects/standards push` (only if a remote is configured; if the push fails,
|
||||
report it but don't treat the snapshot as lost — it's committed locally)
|
||||
|
||||
This write-and-commit of `STATUS.md` is the *only* thing `/roundup` changes on disk;
|
||||
everything else — every project repo, every other file — stays strictly read-only.
|
||||
|
||||
## Rules
|
||||
|
||||
- Read-only. The only file you may write is `STATUS.md`, and only if the user asks for it.
|
||||
- The only file you may write or commit is `~/Projects/standards/STATUS.md` (the snapshot).
|
||||
Every project repo and every other file stays strictly read-only — never edit or commit them.
|
||||
- Quote priorities and states as found; never re-rank projects or recommend what to do next
|
||||
unprompted — that's the user's call.
|
||||
- Preserve every item; if you can't place one, list it under "needs triage" rather than
|
||||
|
||||
+9
-4
@@ -17,17 +17,19 @@ Universal preferences for any coding agent working with me, on any project. Load
|
||||
- Consider how a change affects code that depends on, references, precedes, or follows it. No change is local until you've checked its neighbors.
|
||||
- Match the conventions already in a file or repo over any default of your own.
|
||||
- Prefer small, reviewable diffs over sweeping rewrites.
|
||||
- Comments explain *why*, not *what* — don't narrate self-evident code.
|
||||
- Build only what the task needs. Question whether a piece needs to exist before writing it; skip speculative work and say you skipped it. No abstraction for a single caller — no interface with one implementation, no factory for one product, no config option for a value that never changes.
|
||||
- Comments explain *why*, not *what* — don't narrate self-evident code. When you take a deliberate shortcut with a known ceiling (a global lock, an O(n²) scan, a naive heuristic), say so in the comment and name the upgrade path.
|
||||
- Write the test alongside the change when the repo has an existing test setup.
|
||||
- Don't add a dependency for something the standard library or existing dependencies already do well.
|
||||
- Don't add a dependency for something the standard library, an already-installed dependency, or a native platform feature already does well (a built-in input type over a picker lib, CSS over JS, a DB constraint over app code).
|
||||
- Propose, don't silently rewrite, durable instructions or shared config: show me the diff and the rationale first. Exception: trivial fixes (typos, dead links).
|
||||
|
||||
## Git and commits
|
||||
|
||||
- **No AI co-authorship attribution.** Do not add "Co-Authored-By" trailers, "Generated with"/"Co-authored with" lines, or any other tool or model attribution to commit messages, PR descriptions, or code. Commits are authored by me.
|
||||
- Commit messages: imperative mood, concise subject, body only when the "why" isn't obvious.
|
||||
- Push to the configured remote after committing. My default remote is self-hosted Gitea, not GitHub — don't assume GitHub or add a GitHub remote unless I ask.
|
||||
- Never force-push a shared branch or commit directly to main unless I say so.
|
||||
- **Work on `main` (or the repo's default branch); don't create feature branches unless I ask.** These are solo, non-critical repos — my approval before pushing *is* the review gate, so branches and PRs only add ceremony. If a change is genuinely risky or experimental and you'd want to abandon it cleanly, suggest a branch — but `main` is the default and the branch is the exception.
|
||||
- **Ask before committing; one approval covers both commit and push.** When I approve a change, that's the go-ahead to commit *and* push it to `main` in one step — don't commit and then ask again before pushing. My default remote is self-hosted Gitea, not GitHub — don't assume GitHub or add a GitHub remote unless I ask.
|
||||
- Never force-push a shared branch unless I say so.
|
||||
- Never commit secrets, large binaries, or generated artifacts. Documents reference env-var names only; real values live in gitignored .env files.
|
||||
|
||||
## Debugging
|
||||
@@ -40,6 +42,7 @@ Universal preferences for any coding agent working with me, on any project. Load
|
||||
|
||||
## Building and releasing
|
||||
|
||||
- **Placing a new project?** Where it runs (Mac vs my Start9), s9pk-vs-container, which model it routes to, its data layer, and its interface follow my standing infra conventions — consult `~/Projects/standards/guides/placement.md` (decision sequence + infra facts) rather than guessing.
|
||||
- **Bump the version before building an s9pk package.** For any project that is also a Start9 `s9pk` package, increment the package version (in the manifest) before running `make x86` or `make install`. This targets Start9 0.4.x: without a version bump my Start9 servers don't recognize the rebuilt package as updated, so the install silently does nothing. Bump first, then build and install.
|
||||
|
||||
## Maintaining instruction files (AGENTS.md, guides, this file)
|
||||
@@ -50,4 +53,6 @@ Instruction files are the durable, visible record of how I want you to work —
|
||||
- **Add and remove.** If a new rule supersedes an old one, rewrite or delete the old one. Stale rules are worse than no rule.
|
||||
- **Reconcile conflicts.** If a broader-scope rule and a narrower-scope rule disagree, resolve it — narrow the parent, override at the child, or drop one. Don't leave future-me to guess which wins.
|
||||
- **Right scope:** universal → this file; whole-repo → that repo's AGENTS.md; subsystem → a scoped guide. Keep each file to what's true at its scope.
|
||||
- **How these files are structured (every repo):** `AGENTS.md` is the canonical file, and each vendor's preferred filename is a relative symlink to it — so the repo is already in every tool's native format. Subsystem detail lives in `docs/guides/<topic>.md` (with `paths:` frontmatter), surfaced via a relative symlink so a tool auto-loads it when you edit matching files, with a one-line index entry in AGENTS.md so any tool can find it. Knowledge lives in these vendor-neutral files; vendor-named paths are relative symlinks into them. Full protocol: `~/Projects/standards/portability.md`.
|
||||
- **Promote up, but don't reflexively trim down.** Move a rule here when it's truly universal, so new repos and every session inherit it. Promoting it does *not* mean deleting it from a repo's AGENTS.md: for load-bearing or safety rules, keep the repo copy — it's followed more closely in context, and it travels with the repo to any tool or session, including ones that never load this file. Trim a per-repo copy only when it's pure boilerplate with no project-specific payload *and* the file is genuinely bloated.
|
||||
- **Where project state goes:** a repo's near-term status and next steps live in the `## Current state` section of its AGENTS.md (a snapshot, overwritten each session); longer-term backlog lives in a separate `ROADMAP.md` at the repo root. The split is by time horizon, not scope: would a session starting now act on it? Yes → Current state; not yet → ROADMAP.
|
||||
|
||||
+22
-5
@@ -10,11 +10,13 @@
|
||||
README.md how-i-work.md portability.md retrofit-playbook.md subagents-handbook.md
|
||||
ROADMAP.md ← this repo's backlog (future agents, commands, standards)
|
||||
INBOX.md ← cross-project capture buffer (/capture → here, /triage drains it)
|
||||
STATUS.md ← cross-project roundup snapshot (overwritten + committed by /roundup)
|
||||
guides/ ← neutral substance (vendor-agnostic): checklists, role knowledge
|
||||
adapters/
|
||||
claude/
|
||||
commands/ ← ~/.claude/commands symlinks here (global commands, e.g. /handoff)
|
||||
agents/ ← ~/.claude/agents symlinks here (global subagents)
|
||||
statusline.sh ← ~/.claude/statusline.sh symlinks here (Claude Code terminal status line)
|
||||
```
|
||||
|
||||
Companions: `how-i-work.md` (the always-loaded user layer, ~50 lines), `retrofit-playbook.md` (the one-time conversion manual), and `subagents-handbook.md` (designing and running delegated agents). This document is reference material — never symlink it into anything always-loaded.
|
||||
@@ -71,8 +73,17 @@ symlinks; `ROADMAP.md`; `.claude/settings.json` (shared project settings and hoo
|
||||
deterministic behavior is part of the repo); `.claude/agents/*.md`, `.claude/commands/*.md`,
|
||||
`.claude/skills/` (project-scoped wrappers).
|
||||
|
||||
Gitignored (per-user, per-machine, or secret): `.claude/settings.local.json` and any
|
||||
`*.local.*` (personal permissions/overrides); `.env` and secrets (corollary 5); OS cruft.
|
||||
Gitignored (per-user, per-machine, secret, or session scratch): `.claude/settings.local.json`
|
||||
and any `*.local.*` (personal permissions/overrides); `.claude/worktrees/` and other Claude
|
||||
session/editor scratch that lands in `.claude/`; `.env` and secrets (corollary 5); OS cruft.
|
||||
|
||||
Because `.claude/` accumulates unpredictable scratch — worktrees, editor debug configs
|
||||
(sometimes carrying credentials), `.DS_Store` — **ignore it deny-by-default and allow-list
|
||||
only the shared wiring.** A blanket `.claude/*` plus `!` exceptions is safer than naming
|
||||
individual local files: a new kind of local scratch is ignored automatically, and a stray
|
||||
secret never slips in by default. (Already-tracked files stay tracked even under `.claude/*`,
|
||||
so a deliberate, secret-free editor config a repo wants to commit can simply be allow-listed
|
||||
with its own `!` line.)
|
||||
|
||||
Put these in the repo's **own committed `.gitignore`** — don't rely on a global
|
||||
excludesfile, which a fresh clone or another machine won't have. Canonical block:
|
||||
@@ -83,9 +94,15 @@ excludesfile, which a fresh clone or another machine won't have. Canonical block
|
||||
.env.*
|
||||
!.env.example
|
||||
|
||||
# Claude Code — commit shared config, ignore personal/local
|
||||
.claude/settings.local.json
|
||||
.claude/*.local.json
|
||||
# Claude Code — deny by default, allow-list shared wiring.
|
||||
# .claude/ also accumulates worktrees, editor configs, and OS cruft; commit
|
||||
# only the shared parts so new local scratch (or a stray secret) stays out.
|
||||
.claude/*
|
||||
!.claude/rules/
|
||||
!.claude/agents/
|
||||
!.claude/commands/
|
||||
!.claude/skills/
|
||||
!.claude/settings.json
|
||||
|
||||
# OS cruft
|
||||
.DS_Store
|
||||
|
||||
@@ -62,7 +62,7 @@ claude
|
||||
|
||||
Paste:
|
||||
|
||||
> Is this a git repo? If not: git init, write a .gitignore covering .env, .claude/settings.local.json (and any *.local.*), and OS cruft — the canonical block in portability.md's "What git tracks" — and make an initial commit. If it is: tell me whether there are uncommitted changes and when the last commit was, then commit anything outstanding.
|
||||
> Is this a git repo? If not: git init, write a .gitignore covering .env/.env.*, a deny-by-default .claude/* with the shared wiring allow-listed (rules/agents/commands/skills/settings.json), and OS cruft — the canonical block in portability.md's "What git tracks" — and make an initial commit. If it is: tell me whether there are uncommitted changes and when the last commit was, then commit anything outstanding.
|
||||
|
||||
Approve the git commands it proposes. Then `/exit`. (A repo existing isn't the same as work being saved in it — uncommitted changes are exactly as unprotected as no repo at all.)
|
||||
|
||||
@@ -145,7 +145,7 @@ Your Mac is now the single point of failure; this removes it. Gitea is just a gi
|
||||
|
||||
3. The SSH key is one-time; for each additional project, just create the empty Gitea repo and say "add my Gitea remote at URL and push."
|
||||
|
||||
**The one tradeoff:** Claude's cloud Code sessions (web, and the mobile app's cloud mode) can only clone from GitHub, so Gitea-only means no phone-initiated cloud sessions. Your phone can still monitor and steer sessions running on your Mac — type `/mobile` in a session for a QR code. Want both? Add GitHub as a second remote later; git pushes to both happily.
|
||||
**The one tradeoff:** Claude's cloud Code sessions (web, and the mobile app's cloud mode) can only clone from GitHub, so Gitea-only means no phone-initiated cloud sessions. Your phone can still monitor and steer sessions running on your Mac — type `/remote-control` (or `/rc`) in a session for a QR code to scan from the Claude app. Want both? Add GitHub as a second remote later; git pushes to both happily.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -89,6 +89,8 @@ least one.
|
||||
| researcher | compression | Multi-source web research → cited brief | ✅ in kit |
|
||||
| janitor | compression | Spring-clean docs/artifacts: report stale, orphaned, superseded files | ✅ in kit |
|
||||
| doc-auditor | independence | Doc drift: every README/instruction/HTML claim checked against the code | ✅ in kit |
|
||||
| design-checker | independence | Design compliance: code checked against the repo's committed design brief + tokens | ✅ in kit |
|
||||
| onboarding-tester | independence | Fresh-adopter simulation: reach the goal from published docs alone | ✅ in kit |
|
||||
| test-runner | compression | Run the suite, return only failures + causes | build when a repo has a real suite |
|
||||
| docs-reader | compression | Read a library's docs, return just the calls you need | build on 3rd manual-trawling task |
|
||||
| fresh-debugger | independence | Gets symptoms only (never your theories), hunts the bug | build next time you're stuck >1hr |
|
||||
@@ -259,7 +261,8 @@ Three real options:
|
||||
1. **Slash command (this kit's choice): `/full-eval`.** A command file is just a stored
|
||||
prompt for the *main thread* — and the main thread *can* fan out. `/full-eval` makes
|
||||
it launch evaluator + security-auditor + exerciser + doc-auditor (+ start9-spec-checker
|
||||
when relevant) in parallel, then synthesize one deduplicated, prioritized report. Zero new
|
||||
when relevant, + reviewer when the tree has uncommitted changes) in parallel, then
|
||||
synthesize one deduplicated, prioritized report. Zero new
|
||||
machinery, works in any session.
|
||||
2. **`claude --agent <name>` (advanced).** A session launched this way takes the agent's
|
||||
system prompt as its main thread — and a main-thread agent *can* spawn subagents,
|
||||
@@ -288,7 +291,7 @@ Per repo, in order:
|
||||
may still be a real CLAUDE.md or rules may not yet be symlinks into docs/guides.
|
||||
2. **`/full-eval`** in a fresh session. Walk away; read one synthesized report.
|
||||
3. **Triage** findings into the repo's AGENTS.md Current state: P0/P1 become the work queue,
|
||||
P2 becomes a "known debt" list, P3+ gets deferred for later decision, done in bulk, or added to roadmap.md.
|
||||
P2 becomes a "known debt" list, P3+ gets deferred for later decision, done in bulk, or added to ROADMAP.md.
|
||||
4. **Fix in the main session** (understanding work — yours), running **reviewer** on
|
||||
each meaningful diff before commit.
|
||||
5. **Close out** per the ritual; move to the next repo. Don't batch — one repo's
|
||||
|
||||
Reference in New Issue
Block a user