stack/docs/design/framework-constitution/debate/rebuttal-contrarian.md
Jason Woltje c70b217a5c
docs(design): mosaic framework constitution — expert conference output
Conference of 7 experts (architect/moonshot/contrarian/coder/aiml/devex/steward)
debated layering, sanitization, upgrade-safety, cross-harness robustness.
Artifacts: BRIEF, 7 positions, 7 rebuttals, synthesis-v1, 3 red-team passes,
canonical DESIGN.md, OPEN-QUESTIONS.md, MISSION.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 23:47:49 -05:00

# Rebuttal — The Contrarian Skeptic
**Lens:** Distrust complexity and clever abstractions. Hunt failure modes, over-engineering, and rules that look good on a page but degrade real agent behavior. I verified the load-bearing claims against the tree before writing (see §0); I am not taking anyone's grep counts on faith.

---

## 0. What I re-verified before arguing (because half this debate runs on un-rechecked greps)
Every paper cites the same handful of facts. I re-ran them so the rebuttal stands on the tree, not on six papers quoting each other:
- **`rails/` vs `tools/` path drift is real and worse than reported.** `grep -rln 'mosaic/rails/' templates/` returns **not one file but a whole family**: `templates/agent/AGENTS.md.template`, `CLAUDE.md.template`, and every project variant under `templates/agent/projects/{typescript,nestjs-nextjs,python-fastapi,python-library}/`. Meanwhile `install.sh:192-194` actively `rm -f`s the `rails` symlink. So **a dozen shipped templates emit a queue-guard command that points at a path the installer deletes.** Any agent that obeys the template gets "no such file." This is the single most concrete "rule that degrades real behavior" in the repo, and it is in the *project-scaffolding* path — the first thing a new user touches.
- **`credentials.sh:19` AND `detect-platform.sh:89` both hardcode `$HOME/src/jarvis-brain/credentials.json`** as the default. Steward and Architect both flagged this; confirmed in two files, not one.
- **`PRESERVE_PATHS` (install.sh:24) contains both `AGENTS.md` and `STANDARDS.md`** — i.e. today's law files are upgrade-frozen. `FRAMEWORK_VERSION=2`.
- **Non-TTY install defaults to `keep` (install.sh:99).** So a CI/headless re-install silently preserves a user's stale law file. The drift bug is live, today, automatically.
These four are the disease. Hold them in mind, because most of this debate proposes cures for a different, more glamorous illness.

---

## 1. The strongest ideas from other personas worth keeping
I came in hostile to "add a Constitution layer." Three ideas survived contact and I'll defend them.
### 1a. "Prose rules are advisory; only mechanical enforcement is a gate." (DevEx §DQ4.4, Architect CI guards, Steward S5/S10, Moonshot mitigation)
This is the best idea in the entire debate and it is **mine by temperament but DevEx stated it most sharply**, grounding it in a fact already in the repo: `runtime/claude/RUNTIME.md:30-32` literally says of the memory rule *"the rule alone proved insufficient — the hook is the hard gate."* The framework already learned this lesson once and wrote it down. DevEx's move — **promote "hookable gates MUST be hooked" to doctrine** — is exactly right and it is the one proposal that attacks the *real* disease (drift and contamination re-accreting) rather than the imagined one (missing layers). Every persona independently converged on "add a CI grep for personal data." That convergence is signal. **Keep it, and make it the load-bearing deliverable, not a footnote.** A precedence diagram without this CI gate is theater; the CI gate without a precedence diagram still prevents the next 55-leak regression.
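
A minimal sketch of what such a blocking gate might look like, assuming a Python-based CI step. The names `FORBIDDEN`, `find_violations`, and `main` are hypothetical. Only the two strings `jarvis-brain` and `mosaic/rails/` come from the §0 findings; a real gate would need the project's own pattern inventory.

```python
import re
from pathlib import Path

# Hypothetical pattern list. Only these two strings are taken from the §0 findings.
FORBIDDEN = {
    "personal-path": re.compile(r"jarvis-brain"),
    "dead-rails-path": re.compile(r"mosaic/rails/"),
}


def find_violations(root: Path) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, rule name) for every forbidden match."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            for rule, pattern in FORBIDDEN.items():
                if pattern.search(line):
                    hits.append((path.relative_to(root).as_posix(), lineno, rule))
    return hits


def main(argv: list[str]) -> int:
    """Print each hit and return a non-zero status so the CI step blocks."""
    root = Path(argv[1]) if len(argv) > 1 else Path(".")
    violations = find_violations(root)
    for rel, lineno, rule in violations:
        print(f"{rel}:{lineno}: {rule}")
    return 1 if violations else 0
```

A CI step would call `sys.exit(main(sys.argv))` and point it at the shipped tree (for example the templates directory) rather than the repository root, since the gate's own pattern list would otherwise match itself.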
### 1b. Architect's "tighten-only" precedence rule, stated as one invariant
Architect (§DQ1) and DevEx both land on: *a lower layer may further constrain a higher layer but may never relax, suspend, or contradict it.* This is the correct precedence model and it is **one sentence**, not a four-layer lattice. It generalizes the good instinct already half-present at `SOUL.md:48` (injected reminders never expand permissions) and `SOUL.md:32` (user formatting wins). I'll defend this verbatim because it is subtraction disguised as structure: it replaces an entire imagined "precedence engine" with a single rule a model can actually hold in context. Keep the sentence. Reject anything that needs a diagram to explain it.
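
To show how little machinery the invariant needs, here is a hedged sketch that models each layer as a set of forbidden actions plus a set of attempted allowances. The `Layer` type, `compose`, and the action names in the usage note are illustrative assumptions, not anything in the repo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """Hypothetical layer model: a name plus the actions it forbids or tries to allow."""
    name: str
    forbids: frozenset[str] = frozenset()
    allows: frozenset[str] = frozenset()


def compose(layers: list[Layer]) -> frozenset[str]:
    """Fold layers from highest to lowest precedence under the tighten-only rule.

    Returns the effective set of forbidden actions. Raises ValueError if a lower
    layer tries to allow something a higher layer already forbids.
    """
    forbidden: frozenset[str] = frozenset()
    for layer in layers:
        relaxed = layer.allows & forbidden
        if relaxed:
            raise ValueError(f"{layer.name} relaxes higher-layer rules: {sorted(relaxed)}")
        forbidden = forbidden | layer.forbids  # constraints only accumulate
    return forbidden
```

With a law layer that forbids `"force-push"` and a persona layer that forbids `"emoji"`, `compose` returns both; a persona layer that lists `"force-push"` under `allows` is rejected outright. The whole "precedence engine" is one set intersection.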
### 1c. Coder's "self-bootstrapping Constitution" defense against injection asymmetry
Coder's single strongest recommendation (§biggest risk) is the most operationally honest thing said about cross-harness: **the launcher composition logic lives in `packages/mosaic/src/` — not visible in the framework files — so "it's already injected" is an unverifiable promise.** Coder's fix: `AGENTS.md` says *"if `CONSTITUTION.md` is not already in context, read it now"* — making the law self-loading rather than injection-dependent. This is cheap, defensive, and correct, and it directly kills the false claim at `defaults/AGENTS.md:11` ("already in your context... do not re-read") that **is provably false on a direct `claude` launch**. Belt-and-suspenders beats a trust-the-launcher invariant every time. Keep it.

---

## 2. The weakest / riskiest proposals — with concrete failure modes
Here is where the debate's enthusiasm becomes the threat my lens exists to catch. Three proposals look sophisticated and will degrade real behavior.
### 2a. Architect's per-layer version stamps + 3-way merge engine (and DevEx's `mosaic-reconcile`) — over-engineering that creates the bug it claims to fix
Architect §DQ3 proposes `constitution.version` / `standards.version` / `user-schema.version` plus a `git merge-file`-style 3-way merge with `base`/`theirs`/`mine` and conflict surfacing in `mosaic doctor`. DevEx §DQ3 proposes the same with per-file `<!-- mosaic:template-version: N -->` markers and a new `mosaic-reconcile` script. Moonshot adds a `migrations/v1.0.0-v1.1.0.md` directory and an interactive `[Y/n]` auto-merge prompt.
**Concrete failure modes:**
1. **The 3-way merge needs a `base` that does not exist for any current install.** A 3-way merge requires the *original template the user's file was generated from*. Today's deployed `SOUL.md` files were hand-edited and seeded across multiple `FRAMEWORK_VERSION` bumps with no stamped base. So the very first upgrade after this lands has **no base to diff against** — the merge degrades to a 2-way conflict dump on every section, for every existing user, exactly at the alpha boundary the BRIEF says must not break. The machinery is most fragile precisely when first used.
2. **Interactive merge prompts hang headless launches.** Moonshot's `[Y/n]` auto-merge prompt and DevEx's `mosaic-reconcile` are interactive by implication. This very environment forbids TTY-blocking calls; `mosaic-init` is already `read -r`-interactive and the install path already had to add `--non-interactive`. A merge engine in the upgrade path is a new hang surface on every CI re-install.
3. **Per-file version matrices are the combinatorial blowup I named in my position paper.** Three independent version integers = a state space of `(constitution vN × standards vM × user-schema vK)` that nobody will test. The Architect's own "Biggest Risk" section *admits* the migration is the most likely thing to "break existing deployments catastrophically" — and then proposes the most complex possible migration.
**The cheaper design that wins:** physical directory separation (which all three also propose and which I endorse) **already makes 3-way merge unnecessary.** If framework-owned content lives in `constitution/` (clobbered wholesale) and user content lives at root (never touched), there is **nothing to merge** — that is the entire point of the split. The override mechanism for the rare user who must tune a standard is an **additive `STANDARDS.local.md` include** (my position §DQ3), not a merge of the framework file. You get upgrade safety with `rsync --delete` on one directory and `rsync --exclude` on the other. One integer version, linear migrations (already built, `install.sh:160-202`), no merge engine. **The 3-way merge solves a problem the directory split already deleted.**
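
A rough sketch of that upgrade path, assuming a Python installer step in place of the `rsync` calls. The function names are hypothetical, and the placement of `STANDARDS.md` inside `constitution/` and `STANDARDS.local.md` at the install root is an assumption made for illustration.

```python
import shutil
from pathlib import Path


def upgrade(source: Path, target: Path, law_dir: str = "constitution") -> None:
    """Replace the framework-owned directory wholesale and touch nothing else.

    `source` is the new framework release, `target` is the user's install.
    Roughly the effect of `rsync --delete` scoped to one directory.
    """
    old_law = target / law_dir
    if old_law.exists():
        shutil.rmtree(old_law)  # user edits to law do not survive, by design
    shutil.copytree(source / law_dir, old_law)  # files at the target root are never read or written


def effective_standards(target: Path, law_dir: str = "constitution") -> str:
    """Framework standards followed by the user's additive local include, if present."""
    text = (target / law_dir / "STANDARDS.md").read_text(encoding="utf-8")
    local = target / "STANDARDS.local.md"  # assumed location of the user's delta
    if local.exists():
        text += "\n" + local.read_text(encoding="utf-8")
    return text
```

There is no merge step anywhere in this path: the framework file is replaced, the local include is appended, and neither operation needs a `base`.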
### 2b. Moonshot's YAML front-matter + content-hash "launcher refuses to start" enforcement — a brittle wall in front of an open door
Moonshot §DQ1 proposes `mosaic-layer: 0 / mosaic-owner: framework / mosaic-override: forbidden` front matter, and a launcher that **"refuses to start if a layer-0 file has been structurally overridden (content-hash check)."** Steward §DQ3 echoes a softer version (`mosaic doctor --check-constitution` against `.checksums`).
**Concrete failure modes:**
1. **It enforces the wrong invariant at the wrong layer.** The threat is not "user edited CONSTITUTION.md." The threat is "user *never receives* a CONSTITUTION update because it is preserved." A content-hash check that *blocks startup* on a modified law file will **brick the agent for the one user who customized their gates** — while doing nothing for the 99% whose problem is staleness, not modification. You have built a lock for a door nobody walks through and left the actual hole (silent non-upgrade) open.
2. **Hash-check-on-launch is a new hard failure mode on the hot path.** A corrupted line ending, a CRLF normalization on Windows (which DevEx correctly notes is already a symlink minefield), or a trailing-newline diff now **prevents the agent from starting at all.** You have converted a cosmetic drift into a total outage. The cure is more dangerous than the disease.
3. **Front-matter `mosaic-override: forbidden` is a rule that asks the model to police itself** — exactly the "prose gate" pattern this debate (correctly, per §1a) agreed is advisory-only. A YAML key that says "forbidden" enforces nothing unless the launcher reads it, and if the launcher reads it, the YAML is redundant with the launcher's own logic. It is ceremony.
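
The line-ending failure in item 2 is easy to reproduce. The snippet below is an illustration only: the file content is invented, and SHA-256 stands in for whatever hash a launcher check would actually use.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 hex digest, standing in for whatever hash a launcher check would use."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical file content; only the line endings differ between the two copies.
lf_copy = b"# Constitution\n\nGates are mandatory.\n"
crlf_copy = lf_copy.replace(b"\n", b"\r\n")

same_text = crlf_copy.replace(b"\r\n", b"\n") == lf_copy  # True
same_hash = digest(lf_copy) == digest(crlf_copy)          # False
```

The two copies say the same thing to a model, yet a raw content-hash gate would treat the second as a modified law file and refuse to start.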
**The cheaper design that wins:** Make CONSTITUTION.md **overwrite-always** (not in `PRESERVE_PATHS`). That is it. If it is clobbered on every upgrade, "user modified it" becomes a non-event — their edit simply doesn't survive, which is the *correct* behavior for immutable law. No hash check, no startup gate, no front-matter. The directory split (§2a) does the enforcement structurally. **Subtraction beats a hash-verification subsystem.**
### 2c. The five-layer model (Architect) and DevEx's `adapters/<h>.capabilities.json` manifests — taxonomy inflation
Architect §DQ1 argues for **five** layers (Constitution / Standards / Persona / Operator+Policy / Deployment). DevEx §DQ4 proposes per-harness JSON capability manifests (`structured_reasoning.gate: true/false`, `subagent_spawn.model_param`, etc.). Moonshot proposes a `COMPLIANCE.md` harness×gate matrix plus `schema.json` JSON Schema validation of SOUL fields.
**Concrete failure modes:**
1. **Five layers means five files to keep non-duplicative: the exact failure we are fixing, with a higher file count.** The disease is duplication and drift across (today) four restatements of the gates. Architect's response is to add layers 2 (Standards), 4 (Operator Policy), and 5 (Deployment) as *distinct* artifacts. Splitting "Standards" from "Constitution" sounds clean, but it re-creates the `AGENTS.md`/`STANDARDS.md` overlap that already exists and already drifts (both currently restate secrets/git/multi-agent rules). **You cannot fix duplication by formalizing more documents to duplicate across.** The honest count is: one immutable law file (L0), one user persona (SOUL), one user profile (USER). "Standards" is either law (→ L0) or a tunable default (→ a `.local` include), not a third sovereign layer. "Operator policy" like the `(Policy: Jason, 2026-06-11)` line is a *one-line edit* (delete the attribution, keep the mechanism), not a new `policy/*.md` subsystem.
2. **`capabilities.json` is a config format invented for a four-row table.** There are four harnesses and roughly three capability axes that differ. DevEx's own manifest example encodes what a **four-line markdown table** already conveys. A JSON schema for four harnesses is a maintenance artifact (now you need a validator, a schema, and CI for the schema) standing in for prose that fits on a screen. The Pi-vs-others sequential-thinking exception is *one sentence* ("structured reasoning required; Pi satisfies it natively"), not a `gate: false` field in a bespoke manifest format.
3. **JSON Schema validation of SOUL fields (Moonshot) presumes SOUL is structured data. It is prose.** SOUL.md is a behavioral contract written for a *model* to read, not a form. Imposing `schema.json` validation turns a flexible persona doc into a typed form with required fields — and the first user who writes a freeform communication-style paragraph fails validation. You are adding a compiler for a poem.
**The cheaper design that wins:** Three layers (L0 immutable law, L2 persona, L3 profile — I'm using the debate's numbering). Cross-harness differences live in a **single markdown table** in the adapter docs, in capability-verb language ("use structured reasoning"), not a JSON manifest. The "compliance matrix" is fine *as a doc* (Moonshot's instinct is good there) — just don't make it machine-read-and-enforced.

---

## 3. The key disagreement, sharpened — and how to resolve it
### The disagreement
Strip away the agreements (everyone wants a named Constitution; everyone wants the persona sanitized; everyone wants a CI grep; everyone wants directory separation). The live fault line is:
> **Does upgrade-safe customization require a reconciliation *engine* (per-layer versions + 3-way merge + hash checks + front-matter + capability manifests), or does it require *deletion + one structural split + one CI gate*?**
Architect, DevEx, and Moonshot are on the "build the engine" side (versioned merge, hash-enforced immutability, JSON manifests, migration directories). Coder, Steward, and I are closer to the "structure + subtraction" side. This is the **minimalism axis** and it is exactly my lens.
My contention: **the engine is a solution to a problem the directory split already eliminates, and every component of the engine introduces a new hot-path failure mode (merge hang, hash-brick, schema-reject) in exchange for handling an edge case (user wants to tune a framework standard) that an additive `.local` include handles with zero new machinery.**
The proof is in the tree. The papers treat drift as evidence that we need *more* reconciliation. But drift's actual root cause is two lines:
- `PRESERVE_PATHS` includes `STANDARDS.md` and `AGENTS.md` (law is frozen), and
- non-TTY installs default to `keep` (freeze happens silently).
Neither is fixed by a 3-way merge engine. Both are fixed by **moving law into an overwrite-always `constitution/` directory.** The merge engine would sit *on top of* an already-correct split, adding risk for no marginal safety.
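
The interaction of those two lines can be shown with a toy model. This is not a transcription of `install.sh`: the function `upgrade_action`, its parameters, and the proposed preserve set are assumptions, and only the two law file names and the `keep` default come from §0.

```python
# Toy model only: the real install.sh logic is not reproduced here.
PRESERVE_TODAY = {"AGENTS.md", "STANDARDS.md"}   # law files named in §0 (other entries omitted)
PRESERVE_PROPOSED = {"SOUL.md"}                  # hypothetical: user-owned files only


def upgrade_action(rel_path: str, preserve: set[str], is_tty: bool, answer: str = "keep") -> str:
    """Return "keep" or "overwrite" for one file during a re-install."""
    if rel_path not in preserve:
        return "overwrite"
    # Preserved file: an interactive run may ask, a headless run takes the default.
    return answer if is_tty else "keep"


# Today: a headless re-install silently freezes the law file.
frozen = upgrade_action("STANDARDS.md", PRESERVE_TODAY, is_tty=False)  # "keep"

# After the split: law lives outside the preserve list, so it always updates.
updated = upgrade_action("constitution/STANDARDS.md", PRESERVE_PROPOSED, is_tty=False)  # "overwrite"
```

Nothing in this model needs a merge: the outcome flips as soon as the law path leaves the preserve set.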
### How to resolve it — a falsifiable test, not a vote
Don't resolve this by which paper is most elegant. Resolve it with a **migration test matrix** (Architect proposed this; I'm making it the *decider*, not a mitigation). Before the alpha is tagged, the implementation must pass three scenarios on real fixtures:
1. **Fresh install** → correct three-layer deploy, CI grep green.
2. **Legacy-flat install** (today's `~/.config/mosaic/` with `AGENTS.md`+`STANDARDS.md` at root, user-edited) → law moves to `constitution/`, user files survive untouched, **no interactive prompt, no hang**.
3. **User-tuned-standard install** (user changed a value in `STANDARDS.md`) → their change survives as a `STANDARDS.local.md` delta, the framework `STANDARDS.md` updates.
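
Scenario 2 could be pinned down as a fixture check along these lines. This is a sketch under stated assumptions: `build_legacy_fixture` and `check_scenario_2` are hypothetical names, `USER.md` is an assumed file name for the user profile, and `migrate` stands for whichever candidate design is under test.

```python
import tempfile
from pathlib import Path


def build_legacy_fixture(root: Path) -> dict[str, str]:
    """Create a legacy-flat install and return the user files that must survive."""
    (root / "AGENTS.md").write_text("old law, user-edited\n", encoding="utf-8")
    (root / "STANDARDS.md").write_text("old standards, user-edited\n", encoding="utf-8")
    user_files = {"SOUL.md": "my persona\n", "USER.md": "my profile\n"}  # USER.md is assumed
    for name, body in user_files.items():
        (root / name).write_text(body, encoding="utf-8")
    return user_files


def check_scenario_2(migrate, release: Path) -> list[str]:
    """Run `migrate(install, release)` on a legacy fixture and list every failed expectation."""
    failures = []
    with tempfile.TemporaryDirectory() as tmp:
        install = Path(tmp)
        user_files = build_legacy_fixture(install)
        migrate(install, release)
        law = install / "constitution"
        if not law.is_dir() or not any(law.iterdir()):
            failures.append("law did not land in constitution/")
        for name, body in user_files.items():
            path = install / name
            if not path.exists() or path.read_text(encoding="utf-8") != body:
                failures.append(f"user file changed or missing: {name}")
    return failures
```

An empty list means the candidate passed. The "no prompt, no hang" clause is not covered here; one way to cover it would be to run the real installer in CI with stdin redirected from `/dev/null` under a timeout.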
**The resolution rule:** *whichever design passes all three with the fewest moving parts wins.* My claim is that the directory-split + `.local` include + overwrite-always-law design passes all three with **zero new subsystems** (it reuses `rsync --exclude`, the existing linear migration runner, and a 10-line CI grep). The 3-way-merge/hash-check/manifest design must *also* pass all three, and it carries a merge engine, a hash subsystem, a version matrix, and a JSON schema validator that must all themselves be tested. If both pass scenarios 1-3, the BRIEF's own non-negotiable ("not bloated, contradictory, or model-degrading") and constraint ("backward-compatible enough to land as an alpha") break the tie toward the smaller design.
That is the whole resolution: **make backward-compat a test fixture, make minimalism the tie-breaker, and let the engine justify each subsystem by a scenario only it can pass. It cannot — so it shouldn't ship in the alpha.**

---

## 4. The one thing I'd die on (restated against the debate, not the repo)
In my position paper I said *subtraction before structure.* Having read the other six, I'll sharpen it into a warning about *this debate's* trajectory:
**The collective instinct is to answer "we have four contradicting copies of the law" with "let's add a fifth canonical document, three version stamps, a merge engine, content-hash enforcement, JSON capability manifests, and a schema validator."** That is the over-engineering reflex this lens exists to stop. The framework's measured defects — confirmed in §0 — are a dead path in a dozen templates, two hardcoded home directories, a frozen law file, and a silent `keep` default. **None of those is fixed by abstraction. All of them are fixed by deletion + one directory split + one CI grep.**
Ship the *subtraction* (delete `defaults/SOUL.md`, the jarvis-loop overlay, the dead `rails/` paths, the two hardcoded credentials paths, and the `STANDARDS.md` entry in `PRESERVE_PATHS`) and the *one* structural move (law → overwrite-always `constitution/`) and the *one* enforcement (blocking CI grep for PII + dead paths). That is a defensible alpha. Everything else in this debate is a v1.1 feature wearing an alpha costume, and most of it is a hot-path failure mode wearing a feature costume.
If we ship the merge engine and the hash-gate and the manifests, we will have spent the alpha building subsystems to manage complexity we chose to add, while a dozen templates still tell users to run a command that doesn't exist.