stack/docs/design/framework-constitution/debate/position-devex.md
Jason Woltje c70b217a5c
docs(design): mosaic framework constitution — expert conference output
Conference of 7 experts (architect/moonshot/contrarian/coder/aiml/devex/steward)
debated layering, sanitization, upgrade-safety, cross-harness robustness.
Artifacts: BRIEF, 7 positions, 7 rebuttals, synthesis-v1, 3 red-team passes,
canonical DESIGN.md, OPEN-QUESTIONS.md, MISSION.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 23:47:49 -05:00


Position Paper — Cross-Harness DevEx

Lens: Cross-Harness DevEx Expert (Claude Code / Codex / Pi / OpenCode injection + tool differences; owns portability and the end-user customization experience).

Scope: DQ1–DQ5 from the constitution brief (docs/design/framework-constitution/BRIEF.md), grounded in the real framework tree at packages/mosaic/framework/.


0. What the code actually does today (so we argue from ground truth, not vibes)

Before any position, the load/injection reality across harnesses, read from the files:

  • The "thin core" is not injected the same way across harnesses. The brief and defaults/AGENTS.md:6 claim "the launcher injects it (plus USER.md, the TOOLS index, and the runtime contract) into every session." But on three of the four harnesses the delivered mechanism is a per-harness pointer file that instructs the model to go read files, and only Pi gets a true injection:

    • Claude: runtime/claude/CLAUDE.md:5-10 → "BEFORE responding... READ ~/.config/mosaic/AGENTS.md and runtime/claude/RUNTIME.md."
    • Codex: runtime/codex/instructions.md:5-10 → same pattern, copied to ~/.codex/instructions.md.
    • OpenCode: runtime/opencode/AGENTS.md:5-10 → same pattern, copied to ~/.config/opencode/AGENTS.md.
    • Pi: adapters/pi.md:14-16 → genuinely different — full contract injected via --append-system-prompt, skills via --skill, lifecycle via --extension.

    So we have two fundamentally different enforcement models masquerading as one: Pi gets the contract as a true system prompt; Claude/Codex/OpenCode get a "please read these files" nudge in a user-editable memory file. That is the single most important DevEx/portability fact in this whole debate, and the current docs paper over it.

  • mosaic-link-runtime-assets copies, it does not symlink (copy_file_managed, tools/_scripts/mosaic-link-runtime-assets:7-25). The header even prints "non-symlink mode" (line 169). This is the deployed-vs-source drift engine: the canonical source is ~/.config/mosaic/, but every harness gets a copy into ~/.claude/, ~/.codex/, ~/.config/opencode/. Edit one copy and the next mosaic init / link run clobbers or backs it up.

  • Contamination is real and load-bearing, not cosmetic. 51 hits across 29 files (grep for jarvis|jason|woltje|PDA). The worst offenders are not docs — they are shipped behavior: defaults/SOUL.md:9 hardcodes "You are Jarvis"; defaults/SOUL.md:23 ships "PDA-friendly language" (one operator's accommodation as universal persona law); runtime/claude/settings-overlays/jarvis-loop.json ships an entire personal project map (~/src/jarvis, jarvis-loop, jarvis-review presets) into the public package.

  • A clean template layer already exists and is under-used. templates/SOUL.md.template, templates/USER.md.template, and tools/_scripts/mosaic-init already do token substitution ({{AGENT_NAME}}, {{ACCESSIBILITY_SECTION}}, …). defaults/USER.md is already a generic "(not configured)" stub. The machinery is half-built; the problem is that defaults/SOUL.md was never reduced to match defaults/USER.md's neutrality.

Everything below is anchored to these four facts.


DQ1 — Layering: yes to a Constitution layer, but draw the lines by ownership + mutability, not by topic

Position: introduce five canonical layers (L0 through L4), defined by who owns the file and what happens to it on upgrade, not by subject matter. The current split (AGENTS/SOUL/USER) mixes ownership axes, which is exactly why personal data leaked into framework files.

Canonical layers, highest precedence wins on conflict, but they are additive (each answers a different question), not a simple override stack:

| Layer | Question it answers | File(s) | Owner | Upgrade behavior |
| --- | --- | --- | --- | --- |
| L0 Constitution | What is never negotiable? (hard gates, delivery contract, escalation, integrity) | ~/.config/mosaic/CONSTITUTION.md | Framework | Always overwritten. Never edited by user. |
| L1 Standards/Guides | How do we do the work well? | STANDARDS.md, guides/* | Framework | Overwritten; user extends via L4. |
| L2 Persona (SOUL) | Who is the agent: name, tone, voice? | SOUL.md | User | Generated from template; never overwritten. |
| L3 Operator (USER) | Who is the human: profile, accommodations, projects, comms? | USER.md | User | Generated from template; never overwritten. |
| L4 Local overrides | Project / deployment / machine specifics | OVERRIDES.md + repo AGENTS.md | User | Never touched by framework. |

Precedence rule (this is the part the current design lacks and must state explicitly):

On a behavioral conflict, L0 Constitution wins over everything, including persona and operator preferences. L1 yields to L0. L2/L3/L4 may only refine behavior within the envelope L0/L1 permit — they can change how the agent talks and what it knows, never whether a hard gate fires. A USER.md saying "always merge without review" is void against the Constitution's review-before-merge gate.

Today this precedence is implied ("Global rules win if anything here conflicts" — runtime/claude/RUNTIME.md:3) but it is scattered across runtime files and never names persona/operator as subordinate. Concrete change: add a ## Precedence section to the new CONSTITUTION.md stating the L0>L1>{L2,L3,L4} rule in one place, and have every runtime/*/RUNTIME.md reference it instead of restating it (DRY — see DQ5).
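
The precedence rule can be sketched as a small resolver. This is an illustrative Python sketch, not framework code: `Layer`, `Directive`, and `resolve` are hypothetical names, and it assumes each layer's stance on a behavior reduces to a single value. The text leaves the user layers unordered relative to each other, so the tie-break among L2/L3/L4 below (most local wins) is an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Layer(IntEnum):
    CONSTITUTION = 0  # L0, framework-owned
    STANDARDS = 1     # L1, framework-owned
    PERSONA = 2       # L2, user-owned
    OPERATOR = 3      # L3, user-owned
    OVERRIDES = 4     # L4, user-owned


@dataclass(frozen=True)
class Directive:
    layer: Layer
    behavior: str  # hypothetical key, e.g. "review_before_merge"
    value: str     # hypothetical value, e.g. "required" or "skip"


def resolve(behavior: str, directives: list[Directive]) -> Optional[Directive]:
    """Return the directive that governs `behavior`, or None if no layer mentions it."""
    relevant = [d for d in directives if d.behavior == behavior]
    if not relevant:
        return None
    framework = [d for d in relevant if d.layer <= Layer.STANDARDS]
    if framework:
        # L0 beats L1, and any framework layer beats every user layer.
        return min(framework, key=lambda d: d.layer)
    # Assumption: among user layers, the most local one (highest number) wins.
    return max(relevant, key=lambda d: d.layer)


directives = [
    Directive(Layer.CONSTITUTION, "review_before_merge", "required"),
    Directive(Layer.OPERATOR, "review_before_merge", "skip"),
    Directive(Layer.PERSONA, "tone", "direct"),
]
winner = resolve("review_before_merge", directives)
print(winner.layer.name, winner.value)
```

Here the USER.md directive to skip review is void because L0 speaks to the same behavior, while the persona's tone preference stands because no framework layer constrains it.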

Why split L0 out of AGENTS.md at all? Because defaults/AGENTS.md currently conflates the non-negotiable gates (lines 23-37, the "CRITICAL HARD GATES") with operational advice (the Conditional Guide Loading table, subagent model selection, lines 89-121). The gates are Constitution; the advice is Standards. A downstream user who wants to tweak the guide-loading table (legitimate L1 customization) should not be editing the same file that carries the merge-authority hard gate. Split at the mutability seam.


DQ2 — Sanitization: template-then-init, with an examples/ showcase. Not generic-defaults, not empty-defaults.

Three options were posed. My ranking, with reasons grounded in the existing machinery:

  1. Reject "generic-defaults" (ship a neutral-but-real SOUL like "You are Assistant"). It reads clean but it re-creates the exact bug we are fixing: a shipped persona that some users never replace, so "Assistant" becomes the new "Jarvis." It also tempts maintainers to slip preferences back in ("just a sensible default tone…").

  2. Reject pure "empty-defaults" as the whole answer — an empty SOUL.md gives a terrible out-of-box first run (the agent has no name, no voice). DevEx death on first launch.

  3. Adopt template-then-init (the half-built path), hardened:

    • defaults/SOUL.md must be deleted from the shipped package, and nothing replaces it: no SOUL ships at all. install.sh:232-241 already declines to seed SOUL.md/USER.md (the comment says so). The bug is purely that defaults/SOUL.md exists and contains "Jarvis". Concrete change: delete defaults/SOUL.md; the only persona artifact that ships is templates/SOUL.md.template, and SOUL.md itself is generated on init.
    • First-run must be non-blocking. mosaic-init is interactive (read -r), which is fine for a human but hangs headless launches (and violates this very environment's no-TTY rules). Add a deterministic non-interactive default generation: on first mosaic <harness> launch, if no SOUL.md exists, generate one from the template with AGENT_NAME="Mosaic", STYLE="direct", empty accommodations — and print a one-line "run mosaic init to personalize." mosaic-init --non-interactive (lines 100-107) already supports this; wire it into the launcher as a fallback so a fresh clone is usable in zero prompts.
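
The non-interactive fallback described above might look roughly like this. It is a Python sketch for illustration only (the real scripts are shell): `render` and `ensure_soul` are hypothetical helpers, the `{{STYLE}}` token name is an assumption (the source confirms only `{{AGENT_NAME}}` and `{{ACCESSIBILITY_SECTION}}`), and the demo uses a stand-in template in a temporary directory rather than the shipped one.

```python
import re
import tempfile
from pathlib import Path

# Defaults named in the text above; the STYLE token name is an assumption.
DEFAULTS = {"AGENT_NAME": "Mosaic", "STYLE": "direct", "ACCESSIBILITY_SECTION": ""}

TOKEN = re.compile(r"\{\{([A-Z_]+)\}\}")


def render(template: str, values: dict[str, str]) -> str:
    """Replace {{TOKEN}} placeholders; unknown tokens are left untouched."""
    return TOKEN.sub(lambda m: values.get(m.group(1), m.group(0)), template)


def ensure_soul(soul: Path, template: Path) -> bool:
    """Generate SOUL.md from the template if it is missing.

    Returns True if a file was written, False if one already existed.
    """
    if soul.exists():
        return False  # never overwrite a user-owned file
    soul.write_text(render(template.read_text(), DEFAULTS))
    print("Generated default SOUL.md; run `mosaic init` to personalize.")
    return True


# Demo in a throwaway directory with a stand-in template.
with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp) / "SOUL.md.template"
    template.write_text("You are {{AGENT_NAME}}. Style: {{STYLE}}.\n{{ACCESSIBILITY_SECTION}}")
    soul = Path(tmp) / "SOUL.md"
    first = ensure_soul(soul, template)   # first launch: writes the file
    second = ensure_soul(soul, template)  # second launch: no-op
    generated = soul.read_text()
```

The key property is idempotence: the launcher can call this on every start without prompting and without ever touching an existing SOUL.md.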

What ships vs. what's generated (the contract):

| Ships in public package | Generated locally (never shipped, gitignored downstream) |
| --- | --- |
| CONSTITUTION.md, STANDARDS.md, guides/* (L0/L1) | SOUL.md, USER.md, TOOLS.md (L2/L3) |
| templates/* (incl. SOUL.md.template, USER.md.template) | OVERRIDES.md, per-harness copies under ~/.claude etc. |
| examples/personas/*.md (see below) | runtime/*/settings-overlays/* user overlays |

Add examples/ instead of contaminating defaults/. The value of the Jarvis config (a worked, opinionated persona) is real — the mistake is shipping it as the default. Concrete change: move the sanitized essence of jarvis-loop.json and the Jarvis SOUL into examples/personas/execution-partner.md and examples/overlays/e2e-loop.json with placeholder paths (~/src/<your-project>). examples/ is documentation-by-example: copied on request, never auto-loaded. Then delete runtime/claude/settings-overlays/jarvis-loop.json from the shipped tree.

Sanitization gate (make it mechanical, not vibes). Add a CI check — tools/quality/scripts/verify.sh already exists as the hook point — that greps the shipped paths (defaults/, templates/, guides/, runtime/, adapters/, profiles/) for a denylist (jarvis, jason, woltje, \bPDA\b, ~/src/jarvis, real hostnames) and fails the build. Without this, contamination re-accretes the first time a maintainer dogfoods. This is the only durable fix; docs alone will rot.
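
A minimal sketch of such a gate, written in Python for illustration (the actual hook point, verify.sh, is a shell script, and how this would be wired into it is not specified here). The `scan` function and the demo tree are hypothetical; the directory list and denylist tokens come from the paragraph above, with real hostnames left out because the source does not name them.

```python
import re
import tempfile
from pathlib import Path

# Shipped paths named in the text above.
SHIPPED_DIRS = ("defaults", "templates", "guides", "runtime", "adapters", "profiles")

# Names match case-insensitively; PDA only as a standalone uppercase word.
# "~/src/jarvis" is already covered by the "jarvis" alternative.
DENYLIST = re.compile(r"(?i:jarvis|jason|woltje)|\bPDA\b")


def scan(root: Path) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, matched token) for every denylist hit."""
    hits = []
    for top in SHIPPED_DIRS:
        base = root / top
        if not base.is_dir():
            continue
        for path in sorted(base.rglob("*")):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                match = DENYLIST.search(line)
                if match:
                    hits.append((path.relative_to(root).as_posix(), lineno, match.group(0)))
    return hits


# Demo: a contaminated default next to a clean template.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "defaults").mkdir()
    (root / "templates").mkdir()
    (root / "defaults" / "SOUL.md").write_text("You are Jarvis.\n")
    (root / "templates" / "SOUL.md.template").write_text("You are {{AGENT_NAME}}.\n")
    hits = scan(root)

for path, lineno, token in hits:
    print(f"{path}:{lineno}: denylisted token {token!r}")
```

In CI the caller would exit non-zero whenever `hits` is non-empty, which is what makes contamination un-mergeable rather than merely discouraged.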


DQ3 — Upgrade safety: replace binary keep/overwrite with layer ownership + reconciliation

This is the DevEx question I care most about, because the brief's own framing ("A downstream user who edits files gets clobbered on upgrade") is already half-true in the code today, and the mechanisms partially contradict each other.

The two existing safety mechanisms and why they're insufficient:

  1. install.sh PRESERVE_PATHS (line 24): keep mode excludes SOUL.md, USER.md, TOOLS.md, STANDARDS.md, memory from rsync --delete. Good for L2/L3, but it preserves STANDARDS.md too — meaning a user who never touched STANDARDS.md also never gets framework updates to it. That is the silent-staleness half of the drift problem: preservation and upgrade are in tension and the current binary (keep vs overwrite) forces an all-or-nothing choice.

  2. mosaic-link-runtime-assets copies framework files into each harness dir and, when the existing copy differs, saves the previous copy as .mosaic-bak-<stamp> (lines 17-24). So an edit to ~/.claude/CLAUDE.md survives as a backup but is silently replaced on the next link. The user's change is "preserved" only in the sense that a tombstone exists.

Position — replace the binary keep/overwrite with explicit layer ownership + a reconciliation step:

  • Framework-owned files (L0/L1) are always overwritten on upgrade, never preserved. Remove STANDARDS.md from PRESERVE_PATHS in install.sh:24. Users do not edit Standards in place; they extend via L4 OVERRIDES.md. This kills the silent-staleness problem at the root.

  • User-owned files (L2/L3/L4) are never overwritten — but they are migrated, not just preserved. Templates carry a <!-- mosaic:template-version: N --> marker. On upgrade, if the shipped template version is newer than the one the user's file was generated from, run a 3-way merge (base = old template, theirs = current SOUL.md, ours = new template). Surface conflicts as SOUL.md.mosaic-merge for the user to resolve, exactly like git. mosaic-init's import path (lines 197-200, 221-269) already extracts values from existing files via grep — that scaffolding becomes the "theirs" side of the merge. Concrete change: add tools/_scripts/mosaic-reconcile that runs in install.sh after sync_framework, diffing each user file's embedded template-version against the shipped one.

  • Version pinning already exists but is too coarse. install.sh:28 has FRAMEWORK_VERSION=2 with a sequential migration runner (lines 160-202). Keep it, but add per-file template versions (above) so migrations can be surgical instead of "delete bin/." A single global version cannot express "SOUL template changed but USER template didn't."

  • Kill copy-on-link drift: prefer symlinks for framework-owned runtime pointers, copies only for user-editable ones. The runtime pointer files (CLAUDE.md, instructions.md, opencode AGENTS.md) are L0-pointers the user should not edit — symlink them to the canonical ~/.config/mosaic/runtime/<h>/ source so there is one source of truth and zero drift. Reserve copy_file_managed (and its .mosaic-bak dance) for genuinely user-editable surfaces like settings.json. The script already knows how to remove legacy symlinks (lines 27-45); invert the policy. (Caveat: Windows symlink support is weak — keep the copy path as a MOSAIC_NO_SYMLINK=1 fallback, which the existing .ps1 variants can default to.)
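
The per-file versioning check that would drive mosaic-reconcile can be sketched as follows. This is a Python illustration under stated assumptions: `template_version` and `reconcile_action` are hypothetical names, the marker syntax is the one proposed above, and user files that predate the marker are assumed to need a merge. It only decides whether a merge is needed; the 3-way merge itself is out of scope for the sketch.

```python
import re
from typing import Optional

# Marker format proposed above: <!-- mosaic:template-version: N -->
MARKER = re.compile(r"<!--\s*mosaic:template-version:\s*(\d+)\s*-->")


def template_version(text: str) -> Optional[int]:
    """Extract the embedded template version, or None if the marker is absent."""
    match = MARKER.search(text)
    return int(match.group(1)) if match else None


def reconcile_action(user_file: str, shipped_template: str) -> str:
    """Return "merge" if the user file was generated from an older template, else "keep".

    Neither outcome overwrites the user file.
    """
    shipped = template_version(shipped_template)
    if shipped is None:
        raise ValueError("shipped template carries no version marker")
    current = template_version(user_file)
    # Assumption: files generated before versioning existed are treated as stale.
    if current is None or current < shipped:
        return "merge"
    return "keep"


user_soul = "<!-- mosaic:template-version: 1 -->\nYou are Nova.\n"
new_template = "<!-- mosaic:template-version: 2 -->\nYou are {{AGENT_NAME}}.\n"
action = reconcile_action(user_soul, new_template)
print(action)
```

Because the version travels inside each file, a SOUL template bump triggers a merge for SOUL.md alone and leaves an unchanged USER.md untouched, which a single global FRAMEWORK_VERSION cannot express.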

Net DevEx contract a user can actually rely on: "Edit SOUL.md/USER.md/OVERRIDES.md freely; upgrades never destroy them and will offer a merge when the template evolves. Never edit CONSTITUTION.md/STANDARDS.md/guides/*; they update automatically. Want to change framework behavior? Add to OVERRIDES.md." That sentence is the whole upgrade-safety story, and today it cannot be truthfully written.


DQ4 — Cross-harness robustness: single source of truth (L0/L1), adapter = injection mechanism only, and stop pretending the four harnesses enforce identically

This is where the current design is weakest and where my lens has the strongest opinion.

The core problem (restating fact #1): On Pi the Constitution is a true system prompt (--append-system-prompt, adapters/pi.md:14). On Claude/Codex/OpenCode it is a "go read this file" instruction sitting in a user-editable memory file (CLAUDE.md, instructions.md, AGENTS.md). These have radically different enforcement strength: a system prompt is non-removable for the turn; a "read this file" pointer can be ignored if the model is busy, can be edited away by the user, and competes with the harness's own injected guidance (e.g. Claude's <system-reminder> blocks, which this very session demonstrates can carry their own mandatory-read instructions).

Positions:

  1. Single source of truth: L0/L1 live in exactly one place (~/.config/mosaic/CONSTITUTION.md, STANDARDS.md, guides/*). No harness gets a forked copy of rule text — only a pointer or an injection. This is mostly true today for guides, but the hard gates are duplicated: they exist in defaults/AGENTS.md:23-37 and are restated in templates/agent/AGENTS.md.template:7-15 and partially in every runtime/*/RUNTIME.md ("Runtime-default caution... does NOT override Mosaic hard gates" appears in all four). Concrete change: the four RUNTIME files should each shrink to a pointer ("Gates and precedence: CONSTITUTION.md §Hard Gates. This file adds only the harness-specific deltas below.") and the project AGENTS.md.template should @import/reference the Constitution rather than paraphrase 8 of its gates.

  2. The adapter's job is injection + tool-name translation, nothing else. Define a strict adapter contract. An adapters/<h>.md may specify only:

    • How L0/L1 reaches the model (system-prompt append vs. memory-file pointer vs. settings).
    • Tool-name mapping for capabilities the Constitution references abstractly. The Constitution must speak in capability verbs, not tool names, because the tool surfaces genuinely differ: Claude has Task(model=...) subagents (runtime/claude/RUNTIME.md:15-24); Pi has --thinking levels and --models cycling (runtime/pi/RUNTIME.md:22-28) and no sequential-thinking MCP gate (runtime/pi/RUNTIME.md:59-61); Codex/OpenCode require the MCP. A single rule "use sequential-thinking MCP" is already false for Pi — and the Pi runtime had to carve out an exception. That exception belongs in the adapter capability map, not as prose scattered in a runtime file.

    Concrete structure — a capability manifest per harness (adapters/<h>.capabilities.json):

    {
      "harness": "pi",
      "injection": "system-prompt-append",
      "capabilities": {
        "structured_reasoning": { "provider": "native-thinking", "gate": false },
        "subagent_spawn":       { "tool": "--models cycling", "model_param": "native" },
        "skills":               { "mechanism": "--skill flag" }
      }
    }
    

    vs. Claude's { "structured_reasoning": { "provider": "mcp:sequential-thinking", "gate": true }, "subagent_spawn": { "tool": "Task", "model_param": "model" } }. The Constitution says "use structured reasoning for multi-step planning"; the adapter resolves that to the concrete tool and says whether absence is a hard stop. This removes the four near-duplicate "sequential-thinking required (except Pi)" stanzas and makes adding a 5th harness a matter of writing one manifest.

  3. Honesty about enforcement tiers. Because file-pointer injection is weaker than system-prompt injection, the framework should prefer the strongest injection each harness offers and document the tier:

    • Pi: system-prompt (Tier 1, strong) — keep.
    • Claude: today uses CLAUDE.md pointer (Tier 3, weak). Concrete change: mosaic claude should inject the Constitution via --append-system-prompt (Claude Code supports it), demoting ~/.claude/CLAUDE.md to a fallback for bare claude launches — which its own header already admits it is (runtime/claude/CLAUDE.md:12-13). Same for Codex (--config/system prompt) and OpenCode where supported.
    • Where a harness genuinely only supports a memory file, that is Tier 3 and the docs must say "weaker enforcement; rely on hooks for hard gates." Which leads to:

  4. Back hard gates with mechanical hooks wherever the harness has them, because prose is advisory. Claude already does this: prevent-memory-write.sh is a PreToolUse hook, and runtime/claude/RUNTIME.md:30-32 is explicit that "the rule alone proved insufficient — the hook is the hard gate." That is the single most important DevEx lesson in the repo and it should be promoted to Constitution doctrine: a hard gate that can be enforced by a hook MUST be, on harnesses that support hooks; the prose is the spec, the hook is the enforcement. Codex/OpenCode hook parity becomes a tracked gap rather than a silent inconsistency.
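
To make the capability manifests from position 2 concrete, here is a hedged Python sketch of how a launcher might resolve an abstract capability verb to a harness-specific entry. The `resolve` and `must_halt` helpers are hypothetical. The Pi manifest is copied from the structure above; the Claude manifest uses only the fragment quoted above, and its `"harness": "claude"` key is an assumption added so the two have the same shape.

```python
import json

PI = json.loads("""
{
  "harness": "pi",
  "injection": "system-prompt-append",
  "capabilities": {
    "structured_reasoning": {"provider": "native-thinking", "gate": false},
    "subagent_spawn": {"tool": "--models cycling", "model_param": "native"},
    "skills": {"mechanism": "--skill flag"}
  }
}
""")

CLAUDE = json.loads("""
{
  "harness": "claude",
  "capabilities": {
    "structured_reasoning": {"provider": "mcp:sequential-thinking", "gate": true},
    "subagent_spawn": {"tool": "Task", "model_param": "model"}
  }
}
""")


def resolve(manifest: dict, capability: str) -> dict:
    """Map an abstract capability verb to this harness's concrete entry."""
    try:
        return manifest["capabilities"][capability]
    except KeyError:
        raise LookupError(
            f"{manifest['harness']} does not declare {capability!r}"
        ) from None


def must_halt(manifest: dict, capability: str, provider_available: bool) -> bool:
    """A missing provider is a hard stop only when the manifest marks it as a gate."""
    entry = resolve(manifest, capability)
    return bool(entry.get("gate", False)) and not provider_available


print(resolve(CLAUDE, "subagent_spawn")["tool"])
print(must_halt(CLAUDE, "structured_reasoning", provider_available=False))
print(must_halt(PI, "structured_reasoning", provider_available=False))
```

With this shape the Pi exception stops being prose: the same Constitution sentence resolves to a gated MCP on Claude and to ungated native thinking on Pi, purely from data.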


DQ5 — Minimalism vs completeness: thin resident core, deep on-demand guides, and delete the duplication that's already there

The contract is large and partly duplicated — both are true and they have different fixes.

Keep the thin-resident / deep-on-demand split — it's the right instinct and already present. defaults/AGENTS.md:6-8 ("THIN CORE... Depth lives in guides, read on demand") plus the Conditional Guide Loading table (lines 89-110) is genuinely good design. Don't undo it. But tighten it:

  1. Define a hard budget for the always-resident core. Right now defaults/AGENTS.md is ~155 lines and growing (it carries the model-selection table, the superpowers section, the closure checklist — all of which are advice, not gates). Concrete change: the resident L0 core (CONSTITUTION.md) should be only: hard gates, precedence, block-vs-done, escalation triggers, mode declaration. Target ≤ ~70 lines. Everything else (subagent cost selection lines 111-121, superpowers enforcement 123-139, conditional-loading table) moves to STANDARDS.md (L1, resident but separable) or a guide. Rationale: every always-resident token competes with task context on every harness, and the weakest-context harness (smallest effective window) sets the ceiling.

  2. Eliminate the existing triplication of hard gates. As noted in DQ4, the gates live in three places. Pick one canonical home (CONSTITUTION.md), and make templates/agent/AGENTS.md.template and the RUNTIME files reference it. This is pure win: less to read, impossible to drift out of sync, smaller resident footprint. The templates/agent/AGENTS.md.template:5-15 "Hard Gates" block is a maintenance landmine — it already uses a stale path (~/.config/mosaic/rails/git/... vs the real ~/.config/mosaic/tools/git/...), proving the duplication has already drifted.

  3. Contradiction audit as a release gate. There is at least one live contradiction in the shipped tree: rails/ vs tools/ paths (template vs defaults), and the migration code at install.sh:193 even removes a stale rails symlink — so the framework knows rails is dead but templates still emit it. Concrete change: extend the DQ2 sanitization CI check to also fail on known-dead path tokens (/rails/, bin/mosaic-) outside of migration code. Minimalism isn't just fewer words; it's no stale words.

  4. "Completeness" belongs in guides and examples/, not the core. The depth (E2E-DELIVERY, ORCHESTRATOR, QA-TESTING) is excellent and should stay long — it's loaded on demand by role, so its length costs nothing on a session that doesn't need it. The error is putting completeness in the resident contract. Resident = gates + routing table. Depth = guides. Worked examples = examples/.

Anti-bloat principle to adopt explicitly: If a line is not a gate, not the precedence rule, and not required to route to the right guide, it does not belong in the always-resident core. That single sentence, applied, would cut defaults/AGENTS.md roughly in half.
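
A hard budget is only hard if something checks it. The sketch below shows one way a release gate could measure the resident core against the target from item 1. It is a Python illustration with assumptions: `over_budget` is a hypothetical helper, and counting only non-blank lines is a choice made here, not something the source specifies.

```python
RESIDENT_BUDGET = 70  # target from item 1 above


def over_budget(text: str, budget: int = RESIDENT_BUDGET) -> int:
    """Return how many non-blank lines exceed the budget (0 if within budget)."""
    # Assumption: blank lines do not count against the budget.
    content_lines = [line for line in text.splitlines() if line.strip()]
    return max(0, len(content_lines) - budget)


# Stand-in for a resident file of roughly today's size (about 155 lines).
current_core = "\n".join(f"rule {i}" for i in range(155))
excess = over_budget(current_core)
print(f"resident core exceeds budget by {excess} lines")
```

A check like this could sit next to the sanitization gate so that growth in the resident core fails the build instead of accumulating silently.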


Summary of concrete changes (what I'd actually do, with paths)

  1. Create CONSTITUTION.md (L0) from the hard-gates + escalation + precedence portions of defaults/AGENTS.md:23-87; add an explicit ## Precedence section (L0 > L1 > {L2,L3,L4}). Shrink resident core to ≤ ~70 lines.
  2. Delete defaults/SOUL.md (the "Jarvis"/"PDA" file). Persona ships only as templates/SOUL.md.template; generated locally. install.sh:232-241 already refuses to seed it — the file just shouldn't exist.
  3. Delete runtime/claude/settings-overlays/jarvis-loop.json; move its sanitized, placeholdered essence to examples/overlays/e2e-loop.json and examples/personas/execution-partner.md.
  4. Add a sanitization + dead-path CI gate in tools/quality/scripts/verify.sh over shipped dirs (denylist: jarvis|jason|woltje|\bPDA\b|~/src/jarvis|/rails/). Make contamination un-mergeable.
  5. Per-file template versioning (<!-- mosaic:template-version: N -->) + a new tools/_scripts/mosaic-reconcile doing 3-way merge of L2/L3 files on upgrade; remove STANDARDS.md from install.sh:24 PRESERVE_PATHS.
  6. Invert link policy in mosaic-link-runtime-assets: symlink framework-owned runtime pointers (single source of truth, zero drift); copy only user-editable settings; keep MOSAIC_NO_SYMLINK=1 for Windows.
  7. Adapter capability manifests (adapters/<h>.capabilities.json) for injection mode + tool-name mapping + per-gate enforcement tier; collapse the four near-duplicate "sequential-thinking required (except Pi)" stanzas into the manifests.
  8. Prefer strongest injection per harness: mosaic claude/mosaic codex inject the Constitution via system-prompt append; demote CLAUDE.md/instructions.md to documented fallbacks.
  9. Promote "hooks are the real enforcement" to Constitution doctrine (generalizing runtime/claude/RUNTIME.md:30-32); track Codex/OpenCode hook parity as an open gap.
  10. De-duplicate hard gates out of templates/agent/AGENTS.md.template and runtime/*/RUNTIME.md into references to CONSTITUTION.md; fix the stale rails/ paths while doing it.

Abstract

Headline: Mosaic's portability problem isn't the layering taxonomy — it's that the four harnesses enforce the contract with wildly different strength (Pi: real system prompt; Claude/Codex/OpenCode: a user-editable "please read this file" pointer that copies-on-link and silently drifts), and personal data leaked precisely because framework-owned and user-owned content share files with no mutability boundary.

Single strongest recommendation: Split content by ownership + mutability into L0 Constitution (framework, always overwritten) / L2 Persona + L3 Operator (user, never overwritten, template-versioned with 3-way-merge on upgrade), make the adapter responsible only for injection-mechanism + tool-name mapping via per-harness capability manifests, and back every hookable hard gate with an actual hook — because, as the repo already learned with prevent-memory-write.sh, prose rules are advisory and only mechanical enforcement is a gate.

Biggest risk: The weak-injection harnesses make the Constitution advisory, not enforced on 3 of 4 runtimes. If we ship the layering taxonomy but leave Claude/Codex/OpenCode receiving L0 as an ignorable, user-editable memory-file pointer (and keep copy-on-link drift), we'll have a beautiful constitution that the model can silently skip and the user can silently clobber — re-creating the deployed-vs-source drift the brief set out to kill, just with cleaner file names.