Files

ci/woodpecker/push/ci Pipeline failed

Details

docs(design): mosaic framework constitution — expert conference output

Conference of 7 experts (architect/moonshot/contrarian/coder/aiml/devex/steward)
debated layering, sanitization, upgrade-safety, cross-harness robustness.
Artifacts: BRIEF, 7 positions, 7 rebuttals, synthesis-v1, 3 red-team passes,
canonical DESIGN.md, OPEN-QUESTIONS.md, MISSION.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-15 23:47:49 -05:00

20 KiB

Raw Blame History

Rebuttal — Cross-Harness DevEx Lens

Author lens: Cross-Harness DevEx Expert — Claude Code / Codex / Pi / OpenCode injection + tool differences; owns portability and the end-user customization experience.

Method: Read all seven position papers (position-{aiml,architect,coder,contrarian,devex,moonshot,steward}.md) and re-verified every load-path / injection / install claim against the real tree under packages/mosaic/framework/. This rebuttal does not restate my opening paper; it adjudicates the others from the one seat that actually has to make the contract land identically on four harnesses that inject context in three incompatible ways.

The conference has near-unanimous consensus on the easy 80% (split out an L0 Constitution by ownership/mutability; delete defaults/SOUL.md; ship a CI PII gate; kill the rails/→tools/ path drift; budget the resident core). I will not relitigate that — it's settled, and I agree. The remaining 20% is where the design actually lives or dies, and it is almost entirely my lane: how does L0 reach the model, and what happens when it doesn't. Three of the six other papers get that question subtly but dangerously wrong.

1. The 2–3 strongest ideas from other personas worth keeping

1a. Coder's self-bootstrapping Constitution — the single best idea in the room, because it is the only one that survives a harness we don't control

position-coder.md §"Biggest Risk" and §"Single Strongest Recommendation" name the failure mode the governance-first papers all skate past:

"If mosaic claude composes a --append-system-prompt that includes AGENTS.md but not constitution/CORE.md, the hard gates are silently absent... The Constitution must not rely on the launcher getting the injection order right; it must be a file the agent is instructed to read regardless."

This is correct and it is load-bearing for my entire lens. Ground truth: today defaults/AGENTS.md:11 literally asserts "The core contract is ALREADY in your context (injected by mosaic launch). Do not re-read it." — and that claim is false on a bare claude launch, where the only artifact is the thin ~/.claude/CLAUDE.md pointer (runtime/claude/CLAUDE.md:12-13 admits it is "only a fallback for direct claude launches"). An agent that trusts a false "already loaded" assertion skips the read and runs ungoverned. The contrarian (position-contrarian.md DQ4 point 1) independently flags the same line as "a behavior-degrading rule." Two lenses converging on the same concrete bug means it's real.

Keep: L0 must be both injected by value and self-loadable by file-read instruction, and the pointer must never claim residency it can't guarantee. Belt and suspenders, because on the harnesses I own, the suspenders (injection) are not always wearable.

1b. AI/ML's resident-token budget as a CI-enforced wall, and the "physics" framing that justifies it

position-aiml.md is the only paper that treats what the model can actually weight as a first-class constraint rather than an afterthought. Its DQ5 diagnosis — ~300+ resident lines / ~3–4K tokens of "dense, imperative, partially-redundant, partially-contradictory law... including for list the files in this dir" — is exactly right, and its mechanism (a non-advisory line-count assertion in mosaic-doctor + framework CI) is the only proposal that stops the new CONSTITUTION.md from re-bloating into the old 155-line AGENTS.md. Its closing line — "Ship the budget gate in the same alpha as the Constitution, or don't ship the Constitution" — should be adopted verbatim as alpha DoD.

From the DevEx seat this matters doubly: the weakest-context harness sets the ceiling for everyone. A budget that fits Pi's --append-system-prompt and Claude's window must also survive Codex/OpenCode writing the same bytes to an instructions file the model may only skim. Budget discipline is portability discipline.

1c. Steward's license + `credentials.sh` findings — the only papers-killing-shipping-blockers nobody else surfaced

position-steward.md §"The Missing License" and finding S6 (tools/_lib/credentials.sh:19 hardcodes $HOME/src/jarvis-brain/credentials.json as a default) are the two findings that, if missed, make the first public push itself a hygiene incident regardless of how clean the layering is. No LICENSE = not legally open source (Berne default: all rights reserved). A hardcoded private credential path shipped as a default is worse than the SOUL contamination everyone fixated on, because it's executable and it's in the tooling layer, not the persona layer. These are unglamorous and correct. Keep both as alpha blockers.

2. The 2–3 weakest / riskiest proposals, with concrete failure modes

2a. Moonshot's machine-readable front-matter + "launcher refuses to start on hash mismatch" — over-engineered enforcement that breaks exactly the harnesses I own

position-moonshot.md DQ1 proposes YAML front-matter (mosaic-layer: 0, mosaic-override: forbidden) on each deployed file, and a launcher that "reads these headers and refuses to start if a layer-0 file has been structurally overridden (content-hash check against installed version)." position-steward.md S11 proposes the cousin: mosaic doctor --check-constitution treating any deployed-file diff as "an error, not a warning, because it means the hard gates may be compromised."

Concrete failure modes from the cross-harness seat:

YAML front-matter is not free in resident context — it's noise injected into the prompt. These files are concatenated into the system prompt (--append-system-prompt on Pi/Claude; an instructions file on Codex/OpenCode). A ---\nmosaic-layer: 0\nmosaic-owner: framework\nmosaic-override: forbidden\n--- block at the top of the highest-primacy position in the whole stack spends the most valuable attention real estate (aiml behavior #1, primacy) on metadata the model must parse and then ignore. The framework's own best instinct is the opposite: defaults/USER.md leads with human-readable prose, not machine front-matter. Machine-readable layer tags belong in a manifest the launcher reads, never in the text the model reads. This is precisely the adapter-capability-manifest split I argued for — keep machine metadata out of the model's eyes.
"Launcher refuses to start on hash mismatch" makes the framework hostile and is trivially bypassed on 3 of 4 harnesses anyway. A user who adds one clarifying line to their deployed contract now cannot launch. Worse: the hash check only runs inside mosaic <harness>. A bare claude, codex, or opencode launch — which the framework explicitly supports via thin pointers — never invokes the launcher, so the "refuse to start" gate is absent on every direct launch. You get a control that punishes compliant mosaic-launch users and is invisible to exactly the unmanaged launches where drift is most likely. That is enforcement theater with a usability tax.
Treating any constitution-file diff as a violation collides with the upgrade model. During a v2→v3 migration the deployed file legitimately differs from "installed version" for a window. A checksum-as-violation check will false-positive every mid-upgrade state and every legitimate MOSAIC_NO_SYMLINK copy that differs by a trailing newline.

Resolution: enforce L0 immutability structurally (it lives in a framework-owned dir that is overwritten wholesale on upgrade — architect/coder/steward all converge here) and socially (CI PII + dead-path grep on the source repo). Drop the runtime hash-refusal. Detection (mosaic doctor reporting drift as an advisory) is fine; refusing to launch is not.

2b. The proliferation of three mutually-incompatible "user overlay" schemes — a portability landmine if any one ships as-is

Three papers invented three different customization-survival mechanisms, and nobody noticed they conflict:

position-coder.md DQ3: per-guide E2E-DELIVERY.local.md siblings, with AGENTS.md instructing "after loading any guide, check for a .local.md variant and merge-read it."
position-aiml.md DQ3 + position-devex.md (mine): per-layer SOUL.local.md / USER.local.md, loaded last-within-layer.
position-contrarian.md DQ3: an  directive embedded in the shipped file.

These are not interchangeable and the difference is my problem, because each implies a different injection-time composition step and the four harnesses compose differently:

The coder's "agent checks for a .local.md after each guide load" assumes the agent does the merge at read time. That works on a file-reading harness but is redundant/confusing when the launcher already injected a pre-composed blob (Pi, mosaic claude) — now the agent is told to go re-read and merge files that are already in its system prompt, doubling tokens and risking contradiction between the injected copy and the freshly-read copy.
The contrarian's  directive assumes a composer that understands the directive. Markdown comments are inert; nothing in the current tree processes them. Ship that directive without building the processor (mosaic compose-contract, which only the architect actually specs) and the "override" is a no-op comment the model ignores — silent failure of the user's customization.

Concrete failure mode: a user reads the contrarian's docs, adds STANDARDS.local.md, and it is never loaded because the Claude path injects STANDARDS.md verbatim with the comment treated as text. The user believes their tightened secret-handling rule is active; it isn't. That's a security regression dressed as a customization feature.

Resolution (my lane to call): pick one overlay mechanism, and make the launcher/composer own it, not the agent. Exactly one composition step (mosaic compose-contract <harness>, per position-architect.md DQ4) resolves base + .local overlays before injection, on every harness, so the model receives one already-merged blob and never runs a read-merge ritual that's redundant on injected harnesses and the only path on pointer harnesses. Overlay granularity = per-layer (SOUL.local.md, USER.local.md, STANDARDS.local.md), not per-guide — guides are L1 framework-owned and should be referenced, not forked.

2c. Architect's + my own 3-way-merge reconciliation engine — the contrarian's attack lands, and I concede part of it

position-architect.md DQ3 and my own opening paper both propose per-file template versioning plus a git merge-file-style 3-way merge (mosaic-reconcile) for user-seeded files on upgrade. position-contrarian.md DQ3 attacks this directly: "Reject version-pinning per-file. Per-file pins create a combinatorial matrix of (framework vN, user pinned vM) states that no one will test."

He's right about the test matrix, and from a DevEx standpoint an interactive merge-conflict resolution flow is a terrible first-run/upgrade experience — it drops a non-expert user into <<<<<<< theirs markers in a config file they didn't know they were editing. For an alpha, that is too much machinery for too little payoff.

Resolution / concession: for the alpha, adopt the overlay model (2b) instead of 3-way merge. Overlays sidestep merge entirely: framework files are overwritten wholesale (no merge needed), user deltas live in never-touched .local.md files (no merge needed). 3-way merge is only required for the one genuinely-hand-tuned-generated file, TOOLS.md — and even there, the alpha can ship "we regenerate TOOLS.md from template; your old one is backed up to TOOLS.md.bak.<ts>" (machinery install.sh already has) rather than a conflict UI. Defer real reconciliation to post-alpha. The contrarian's "subtraction before structure" applies to the upgrade mechanism too.

3. The key disagreement most relevant to my lens, sharpened — and how to resolve it

The fault line: inject-by-value (byte-for-byte, launcher-composed) vs. self-load-by-instruction (agent reads files). Both camps are half-right, and the framework needs both with a defined boundary — which no paper draws.

Inject-by-value camp (position-aiml.md DQ4: "L0 must be injected as system-prompt text on every harness, identically, byte-for-byte"; position-moonshot.md DQ4; position-steward.md DQ4): correct that injection at primacy position is strictly stronger than a deferred "go read this." A system-prompt-resident gate is non-removable for the turn; a "please read AGENTS.md" pointer is a request the model can skip under load.
Self-load camp (position-coder.md): correct that the launcher cannot be trusted as the sole delivery path, because bare claude/codex/opencode launches bypass mosaic compose-contract entirely and get only the thin pointer.

Here is the fact both camps under-weight, and it is the central fact of my lens: the four harnesses do not offer the same injection channel, so "byte-for-byte identical injection everywhere" is not currently achievable as stated. Ground truth:

Pi: full contract via --append-system-prompt + --skill + --extension (adapters/pi.md:14-16). Tier-1 injection, strongest. And Pi has no permission backstop (runtime/pi/RUNTIME.md:20), so resident-text fidelity is the only enforcement — aiml's point 4 is right: keep L0 tiny precisely because Pi has no hook wall behind it.
Claude: mosaic claude can --append-system-prompt (Tier 1), but bare claude gets only ~/.claude/CLAUDE.md (Tier 3 pointer). And Claude's own harness injects competing <system-reminder> mandatory-read blocks — this very session's reminder demonstrates the harness will inject its own "read these files first" instructions that compete with ours for primacy.
Codex / OpenCode: write to an instructions file (~/.codex/instructions.md, ~/.config/opencode/AGENTS.md) — between Tier 1 and Tier 3; resident-ish but the model may skim.

So "byte-for-byte everywhere" is an aspiration, not a switch you flip. The honest design is a tiered injection contract that names the strength per harness and degrades safely, which is exactly the per-harness capability manifest I proposed (position-devex.md DQ4) — and which the inject-by-value papers asserted as if all four harnesses were symmetric. They are not. position-aiml.md even half-concedes this in its own point 4 (Pi special case) without following the thread to its conclusion: if Pi is special, the contract is not byte-for-byte uniform, it's capability-resolved.

Proposed resolution — a single, testable injection contract that both camps can sign

L0 is delivered by the strongest channel each harness offers (manifest-declared), AND is self-loadable as a fallback. The two are not alternatives — they are tiered.
- Tier 1 (system-prompt append: Pi, mosaic claude, mosaic codex where supported): launcher injects the composed L0 by value at primacy. The pointer/AGENTS index then says: "The Constitution is resident above. If it is NOT in your context, read ~/.config/mosaic/CONSTITUTION.md now." — conditional, not the false unconditional "already loaded; do not re-read" of defaults/AGENTS.md:11.
- Tier 3 (bare-launch pointer: direct claude/codex/opencode): the pointer carries the 5-bullet irreducible-gate summary inline (aiml DQ4 point 3) and the instruction to read the full CONSTITUTION.md. Even a model that skips the read has the irreducible law resident.
Per-harness capability manifest (adapters/<h>.capabilities.json) is the single source for: which injection tier this harness gets, and how abstract capability-verbs in L0 map to concrete tools. This is what collapses the four near-duplicate "sequential-thinking required (except Pi)" stanzas (runtime/{claude,codex,opencode}/RUNTIME.md require it; runtime/pi/RUNTIME.md:59-61 exempts it). The Constitution says "use structured multi-step reasoning before planning" (capability verb); the manifest resolves it to mcp:sequential-thinking (gate=true) on Claude/Codex/OpenCode and native-thinking (gate=false) on Pi. position-moonshot.md DQ4 reached the same "behavior requirement, not tool requirement" conclusion for sequential-thinking — generalize it to all capability references via the manifest rather than prose carve-outs scattered across runtime files.
Back every hookable gate with a hook where the harness has hooks; track parity as an open gap, not a silent inconsistency. This is repo-proven doctrine, not theory: runtime/claude/RUNTIME.md:30-32 says the prose memory rule "proved insufficient — the hook is the hard gate." Promote that to Constitution doctrine. The contrarian (DQ5 point 4) and I agree here. The manifest is also where "Codex/OpenCode have no prevent-memory-write equivalent yet" gets recorded as a tracked gap — which is the honest version of moonshot's COMPLIANCE matrix, minus the launch-refusal enforcement I rejected in 2a.
Resolve the inject-vs-self-load tension by making the launcher own composition and the agent own verification. Launcher composes + injects (mosaic compose-contract, architect DQ4). Agent runs a one-line self-check: "if CONSTITUTION not resident, read it." This satisfies the inject-by-value camp (strongest channel used) and the self-load camp (never trusts the launcher blindly) with a single defined seam, and it is testable: a CI smoke test launches each harness path (Pi append, mosaic claude append, bare-claude pointer, Codex instructions-file) and asserts the 7 irreducible gates are present in the effective context. That smoke test — not a hash-refusal, not front-matter — is the mechanical control that makes "the Constitution is enforced across harnesses" a true statement instead of an aspirational one.

The disagreement dissolves once you stop pretending the four harnesses are symmetric. They aren't; the manifest names the asymmetry; the tiered contract degrades safely across it; the smoke test proves it.

20 KiB Raw Blame History Unescape Escape