Some checks failed
ci/woodpecker/push/ci Pipeline failed
Conference of 7 experts (architect/moonshot/contrarian/coder/aiml/devex/steward) debated layering, sanitization, upgrade-safety, cross-harness robustness. Artifacts: BRIEF, 7 positions, 7 rebuttals, synthesis-v1, 3 red-team passes, canonical DESIGN.md, OPEN-QUESTIONS.md, MISSION.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
121 lines
17 KiB
Markdown
121 lines
17 KiB
Markdown
# Rebuttal — The Contrarian Skeptic
|
||
|
||
**Lens:** Distrust complexity and clever abstractions. Hunt failure modes, over-engineering, and rules that look good on a page but degrade real agent behavior. I verified the load-bearing claims against the tree before writing (see §0); I am not taking anyone's grep counts on faith.
|
||
|
||
---
|
||
|
||
## 0. What I re-verified before arguing (because half this debate runs on un-rechecked greps)
|
||
|
||
Every paper cites the same handful of facts. I re-ran them so the rebuttal stands on the tree, not on six papers quoting each other:
|
||
|
||
- **`rails/` vs `tools/` path drift is real and worse than reported.** `grep -rln 'mosaic/rails/' templates/` returns **not one file but a whole family**: `templates/agent/AGENTS.md.template`, `CLAUDE.md.template`, and every project variant under `templates/agent/projects/{typescript,nestjs-nextjs,python-fastapi,python-library}/`. Meanwhile `install.sh:192-194` actively `rm -f`s the `rails` symlink. So **a dozen shipped templates emit a queue-guard command that points at a path the installer deletes.** Any agent that obeys the template gets "no such file." This is the single most concrete "rule that degrades real behavior" in the repo, and it is in the *project-scaffolding* path — the first thing a new user touches.
|
||
- **`credentials.sh:19` AND `detect-platform.sh:89` both hardcode `$HOME/src/jarvis-brain/credentials.json`** as the default. Steward and Architect both flagged this; confirmed in two files, not one.
|
||
- **`PRESERVE_PATHS` (install.sh:24) contains both `AGENTS.md` and `STANDARDS.md`** — i.e. today's law files are upgrade-frozen. `FRAMEWORK_VERSION=2`.
|
||
- **Non-TTY install defaults to `keep` (install.sh:99).** So a CI/headless re-install silently preserves a user's stale law file. The drift bug is live, today, automatically.
|
||
|
||
These four are the disease. Hold them in mind, because most of this debate proposes cures for a different, more glamorous illness.
|
||
|
||
---
|
||
|
||
## 1. The strongest ideas from other personas worth keeping
|
||
|
||
I came in hostile to "add a Constitution layer." Three ideas survived contact and I'll defend them.
|
||
|
||
### 1a. "Prose rules are advisory; only mechanical enforcement is a gate." (DevEx §DQ4.4, Architect CI guards, Steward S5/S10, Moonshot mitigation)
|
||
|
||
This is the best idea in the entire debate and it is **mine by temperament but DevEx stated it most sharply**, grounding it in a fact already in the repo: `runtime/claude/RUNTIME.md:30-32` literally says of the memory rule *"the rule alone proved insufficient — the hook is the hard gate."* The framework already learned this lesson once and wrote it down. DevEx's move — **promote "hookable gates MUST be hooked" to doctrine** — is exactly right and it is the one proposal that attacks the *real* disease (drift and contamination re-accreting) rather than the imagined one (missing layers). Every persona independently converged on "add a CI grep for personal data." That convergence is signal. **Keep it, and make it the load-bearing deliverable, not a footnote.** A precedence diagram without this CI gate is theater; the CI gate without a precedence diagram still prevents the next 55-leak regression.
|
||
|
||
### 1b. Architect's "tighten-only" precedence rule, stated as one invariant
|
||
|
||
Architect (§DQ1) and DevEx both land on: *a lower layer may further constrain a higher layer but may never relax, suspend, or contradict it.* This is the correct precedence model and it is **one sentence**, not a four-layer lattice. It generalizes the good instinct already half-present at `SOUL.md:48` (injected reminders never expand permissions) and `SOUL.md:32` (user formatting wins). I'll defend this verbatim because it is subtraction disguised as structure: it replaces an entire imagined "precedence engine" with a single rule a model can actually hold in context. Keep the sentence. Reject anything that needs a diagram to explain it.
|
||
|
||
### 1c. Coder's "self-bootstrapping Constitution" defense against injection asymmetry
|
||
|
||
Coder's single strongest recommendation (§biggest risk) is the most operationally honest thing said about cross-harness: **the launcher composition logic lives in `packages/mosaic/src/` — not visible in the framework files — so "it's already injected" is an unverifiable promise.** Coder's fix: `AGENTS.md` says *"if `CONSTITUTION.md` is not already in context, read it now"* — making the law self-loading rather than injection-dependent. This is cheap, defensive, and correct, and it directly kills the false claim at `defaults/AGENTS.md:11` ("already in your context... do not re-read") that **is provably false on a direct `claude` launch**. Belt-and-suspenders beats a trust-the-launcher invariant every time. Keep it.
|
||
|
||
---
|
||
|
||
## 2. The weakest / riskiest proposals — with concrete failure modes
|
||
|
||
Here is where the debate's enthusiasm becomes the threat my lens exists to catch. Three proposals look sophisticated and will degrade real behavior.
|
||
|
||
### 2a. Architect's per-layer version stamps + 3-way merge engine (and DevEx's `mosaic-reconcile`) — over-engineering that creates the bug it claims to fix
|
||
|
||
Architect §DQ3 proposes `constitution.version` / `standards.version` / `user-schema.version` plus a `git merge-file`-style 3-way merge with `base`/`theirs`/`mine` and conflict surfacing in `mosaic doctor`. DevEx §DQ3 proposes the same with per-file `<!-- mosaic:template-version: N -->` markers and a new `mosaic-reconcile` script. Moonshot adds a `migrations/v1.0.0-v1.1.0.md` directory and an interactive `[Y/n]` auto-merge prompt.
|
||
|
||
**Concrete failure modes:**
|
||
|
||
1. **The 3-way merge needs a `base` that does not exist for any current install.** A 3-way merge requires the *original template the user's file was generated from*. Today's deployed `SOUL.md` files were hand-edited and seeded across multiple `FRAMEWORK_VERSION` bumps with no stamped base. So the very first upgrade after this lands has **no base to diff against** — the merge degrades to a 2-way conflict dump on every section, for every existing user, exactly at the alpha boundary the BRIEF says must not break. The machinery is most fragile precisely when first used.
|
||
2. **Interactive merge prompts hang headless launches.** Moonshot's `[Y/n]` auto-merge prompt and DevEx's `mosaic-reconcile` are interactive by implication. This very environment forbids TTY-blocking calls; `mosaic-init` is already `read -r`-interactive and the install path already had to add `--non-interactive`. A merge engine in the upgrade path is a new hang surface on every CI re-install.
|
||
3. **Per-file version matrices are the combinatorial blowup I named in my position paper.** Three independent version integers = a state space of `(constitution vN × standards vM × user-schema vK)` that nobody will test. The Architect's own "Biggest Risk" section *admits* the migration is the most likely thing to "break existing deployments catastrophically" — and then proposes the most complex possible migration.
|
||
|
||
**The cheaper design that wins:** physical directory separation (which all three also propose and which I endorse) **already makes 3-way merge unnecessary.** If framework-owned content lives in `constitution/` (clobbered wholesale) and user content lives at root (never touched), there is **nothing to merge** — that is the entire point of the split. The override mechanism for the rare user who must tune a standard is an **additive `STANDARDS.local.md` include** (my position §DQ3), not a merge of the framework file. You get upgrade safety with `rsync --delete` on one directory and `rsync --exclude` on the other. One integer version, linear migrations (already built, `install.sh:160-202`), no merge engine. **The 3-way merge solves a problem the directory split already deleted.**
|
||
|
||
### 2b. Moonshot's YAML front-matter + content-hash "launcher refuses to start" enforcement — a brittle wall in front of an open door
|
||
|
||
Moonshot §DQ1 proposes `mosaic-layer: 0 / mosaic-owner: framework / mosaic-override: forbidden` front matter, and a launcher that **"refuses to start if a layer-0 file has been structurally overridden (content-hash check)."** Steward §DQ3 echoes a softer version (`mosaic doctor --check-constitution` against `.checksums`).
|
||
|
||
**Concrete failure modes:**
|
||
|
||
1. **It enforces the wrong invariant at the wrong layer.** The threat is not "user edited CONSTITUTION.md." The threat is "user *never receives* a CONSTITUTION update because it is preserved." A content-hash check that *blocks startup* on a modified law file will **brick the agent for the one user who customized their gates** — while doing nothing for the 99% whose problem is staleness, not modification. You have built a lock for a door nobody walks through and left the actual hole (silent non-upgrade) open.
|
||
2. **Hash-check-on-launch is a new hard failure mode on the hot path.** A corrupted line ending, a CRLF normalization on Windows (which DevEx correctly notes is already a symlink minefield), or a trailing-newline diff now **prevents the agent from starting at all.** You have converted a cosmetic drift into a total outage. The cure is more dangerous than the disease.
|
||
3. **Front-matter `mosaic-override: forbidden` is a rule that asks the model to police itself** — exactly the "prose gate" pattern this debate (correctly, per §1a) agreed is advisory-only. A YAML key that says "forbidden" enforces nothing unless the launcher reads it, and if the launcher reads it, the YAML is redundant with the launcher's own logic. It is ceremony.
|
||
|
||
**The cheaper design that wins:** Make CONSTITUTION.md **overwrite-always** (not in `PRESERVE_PATHS`). That is it. If it is clobbered on every upgrade, "user modified it" becomes a non-event — their edit simply doesn't survive, which is the *correct* behavior for immutable law. No hash check, no startup gate, no front-matter. The directory split (§2a) does the enforcement structurally. **Subtraction beats a hash-verification subsystem.**
|
||
|
||
### 2c. The five-layer model (Architect) and DevEx's `adapters/<h>.capabilities.json` manifests — taxonomy inflation
|
||
|
||
Architect §DQ1 argues for **five** layers (Constitution / Standards / Persona / Operator+Policy / Deployment). DevEx §DQ4 proposes per-harness JSON capability manifests (`structured_reasoning.gate: true/false`, `subagent_spawn.model_param`, etc.). Moonshot proposes a `COMPLIANCE.md` harness×gate matrix plus `schema.json` JSON Schema validation of SOUL fields.
|
||
|
||
**Concrete failure modes:**
|
||
|
||
1. **Five layers means five files to keep non-duplicative — the exact failure we are fixing, with a higher file count.** The disease is duplication-and-drift across (today) four restatements of the gates. Architect's response is to add layers 2 (Standards) and 4 (Operator Policy) and 5 (Deployment) as *distinct* artifacts. Splitting "Standards" from "Constitution" sounds clean, but it re-creates the `AGENTS.md`/`STANDARDS.md` overlap that already exists and already drifts (both currently restate secrets/git/multi-agent rules). **You cannot fix duplication by formalizing more documents to duplicate across.** The honest count is: one immutable law file (L0), one user persona (SOUL), one user profile (USER). "Standards" is either law (→ L0) or a tunable default (→ a `.local` include), not a third sovereign layer. "Operator policy" like the `(Policy: Jason, 2026-06-11)` line is a *one-line edit* (delete the attribution, keep the mechanism), not a new `policy/*.md` subsystem.
|
||
2. **`capabilities.json` is a config format invented for a four-row table.** There are four harnesses and roughly three capability axes that differ. DevEx's own manifest example encodes what a **four-line markdown table** already conveys. A JSON schema for four harnesses is a maintenance artifact (now you need a validator, a schema, and CI for the schema) standing in for prose that fits on a screen. The Pi-vs-others sequential-thinking exception is *one sentence* ("structured reasoning required; Pi satisfies it natively"), not a `gate: false` field in a bespoke manifest format.
|
||
3. **JSON Schema validation of SOUL fields (Moonshot) presumes SOUL is structured data. It is prose.** SOUL.md is a behavioral contract written for a *model* to read, not a form. Imposing `schema.json` validation turns a flexible persona doc into a typed form with required fields — and the first user who writes a freeform communication-style paragraph fails validation. You are adding a compiler for a poem.
|
||
|
||
**The cheaper design that wins:** Three layers (L0 immutable law, L2 persona, L3 profile — I'm using the debate's numbering). Cross-harness differences live in a **single markdown table** in the adapter docs, in capability-verb language ("use structured reasoning"), not a JSON manifest. The "compliance matrix" is fine *as a doc* (Moonshot's instinct is good there) — just don't make it machine-read-and-enforced.
|
||
|
||
---
|
||
|
||
## 3. The key disagreement, sharpened — and how to resolve it
|
||
|
||
### The disagreement
|
||
|
||
Strip away the agreements (everyone wants a named Constitution; everyone wants the persona sanitized; everyone wants a CI grep; everyone wants directory separation). The live fault line is:
|
||
|
||
> **Does upgrade-safe customization require a reconciliation *engine* (per-layer versions + 3-way merge + hash checks + front-matter + capability manifests), or does it require *deletion + one structural split + one CI gate*?**
|
||
|
||
Architect, DevEx, and Moonshot are on the "build the engine" side (versioned merge, hash-enforced immutability, JSON manifests, migration directories). Coder, Steward, and I are closer to the "structure + subtraction" side. This is the **minimalism axis** and it is exactly my lens.
|
||
|
||
My contention: **the engine is a solution to a problem the directory split already eliminates, and every component of the engine introduces a new hot-path failure mode (merge hang, hash-brick, schema-reject) in exchange for handling an edge case (user wants to tune a framework standard) that an additive `.local` include handles with zero new machinery.**
|
||
|
||
The proof is in the tree. The papers treat drift as evidence that we need *more* reconciliation. But drift's actual root cause is two lines:
|
||
- `PRESERVE_PATHS` includes `STANDARDS.md` and `AGENTS.md` (law is frozen), and
|
||
- non-TTY installs default to `keep` (freeze happens silently).
|
||
|
||
Neither is fixed by a 3-way merge engine. Both are fixed by **moving law into an overwrite-always `constitution/` directory.** The merge engine would sit *on top of* an already-correct split, adding risk for no marginal safety.
|
||
|
||
### How to resolve it — a falsifiable test, not a vote
|
||
|
||
Don't resolve this by which paper is most elegant. Resolve it with a **migration test matrix** (Architect proposed this; I'm making it the *decider*, not a mitigation). Before the alpha tags, the implementation must pass three scenarios on real fixtures:
|
||
|
||
1. **Fresh install** → correct three-layer deploy, CI grep green.
|
||
2. **Legacy-flat install** (today's `~/.config/mosaic/` with `AGENTS.md`+`STANDARDS.md` at root, user-edited) → law moves to `constitution/`, user files survive untouched, **no interactive prompt, no hang**.
|
||
3. **User-tuned-standard install** (user changed a value in `STANDARDS.md`) → their change survives as a `STANDARDS.local.md` delta, the framework `STANDARDS.md` updates.
|
||
|
||
**The resolution rule:** *whichever design passes all three with the fewest moving parts wins.* My claim is that the directory-split + `.local` include + overwrite-always-law passes all three with **zero new subsystems** (it reuses `rsync --exclude`, the existing linear migration runner, and a 10-line CI grep). The 3-way-merge/hash-check/manifest design must *also* pass all three — and it carries a merge engine, a hash subsystem, a version matrix, and a JSON schema validator that all must themselves be tested. If both pass scenario 1-3, the BRIEF's own non-negotiable ("not bloated, contradictory, or model-degrading") and constraint ("backward-compatible enough to land as an alpha") break the tie toward the smaller design.
|
||
|
||
That is the whole resolution: **make backward-compat a test fixture, make minimalism the tie-breaker, and let the engine justify each subsystem by a scenario only it can pass. It cannot — so it shouldn't ship in the alpha.**
|
||
|
||
---
|
||
|
||
## 4. The one thing I'd die on (restated against the debate, not the repo)
|
||
|
||
In my position paper I said *subtraction before structure.* Having read the other six, I'll sharpen it into a warning about *this debate's* trajectory:
|
||
|
||
**The collective instinct is to answer "we have four contradicting copies of the law" with "let's add a fifth canonical document, three version stamps, a merge engine, content-hash enforcement, JSON capability manifests, and a schema validator."** That is the over-engineering reflex this lens exists to stop. The framework's measured defects — confirmed in §0 — are a dead path in a dozen templates, two hardcoded home directories, a frozen law file, and a silent `keep` default. **None of those is fixed by abstraction. All of them are fixed by deletion + one directory split + one CI grep.**
|
||
|
||
Ship the *subtraction* (delete `defaults/SOUL.md`, the jarvis-loop overlay, the dead `rails/` paths, the two hardcoded creds paths, the `STANDARDS.md`-from-preserve-list) and the *one* structural move (law → overwrite-always `constitution/`) and the *one* enforcement (blocking CI grep for PII + dead paths). That is a defensible alpha. Everything else in this debate is a v1.1 feature wearing an alpha costume — and most of it is a hot-path failure mode wearing a feature costume.
|
||
|
||
If we ship the merge engine and the hash-gate and the manifests, we will have spent the alpha building subsystems to manage complexity we chose to add, while a dozen templates still tell users to run a command that doesn't exist.
|