Files
stack/scratchpads/contract-thin-core.md
jason.woltje 59b611ba8a
Some checks failed
ci/woodpecker/push/ci Pipeline was canceled
ci/woodpecker/push/publish Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
refactor(framework): thin-core prompt diet — cut injected contract ~53% (#529)
2026-06-11 18:10:42 +00:00

2.3 KiB
Raw Blame History

Scratchpad — Thin-core prompt diet (#528)

Branch: feat/contract-thin-core · Issue: #528 · Mode: Delivery

Objective

Cut the always-injected contract (defaults/AGENTS.md + defaults/TOOLS.md + runtime/claude/RUNTIME.md, inlined every turn by the launcher) without losing any hard gate. Restore the original "thin core + on-demand guides" intent.

Change

  • defaults/AGENTS.md → thin core: 12 hard gates verbatim, 37 operating rules condensed to ~15 bullets (detail already in guides/E2E-DELIVERY.md), Superpowers condensed, load order made on-demand (no halt-on-missing for STANDARDS), conditional guide-loading index retained.
  • defaults/TOOLS.md → index; full catalog moved to new guides/TOOLS-REFERENCE.md (read on demand).
  • runtime/claude/RUNTIME.md → slimmed (dedup tier table, terser pointers).

Method (autoresearch-style validation)

  1. Built a 9-probe role battery (backend/deploy/review/orchestrate/secrets/docs/simple-trap/no-stop-at-PR/agent-work) + a deterministic 18-signature gate-checklist.
  2. Headless interactive runs (Claude Max subscription, tmux — no API), scored by per-probe rubric.
  3. Keep-or-discard hill-climb (token cost gated by per-probe fidelity) proved the method; final design re-derived against THIS repo's content (diet-only, no drifted-deployment content imported).

Validation evidence

  • Gate-checklist: ALL gates + critical rules + mode lines + sequential-thinking + OpenBrain + Superpowers present.
  • A/B on real repo content: thin 7/9 vs monolith 5/9 probes; strictly better on deploy/review/simple-task; composed 8,827 → 4,122 tok (53%).
  • p11 (don't-stop-at-PR): 3→2/3 on one rubric line — verified a scorer/phrasing artifact (answer correctly cites gates §5/§9 + close-issue; gate verbatim-present). Variance: thin 2/2/3, v0 3/3/3.

Decisions / risks

  • Diet-only vs repo content (user decision). Did NOT import web1's Gate 13-15 / federated memory / OpenViking — canonical repo is behind those deployments; flagged for separate reconciliation.
  • AGENTS/TOOLS are shared across runtimes → diet benefits codex/pi/opencode too; RUNTIME change is claude-only.
  • p11 accepted as-is (user decision) — not gaming the rubric.

Status

PR open, paused for maintainer merge ratification (fleet-governing change). mosaic upgrade will propagate on merge.