stack/docs/design/framework-constitution/debate/redteam-devex.md
Jason Woltje c70b217a5c
docs(design): mosaic framework constitution — expert conference output
Conference of 7 experts (architect/moonshot/contrarian/coder/aiml/devex/steward)
debated layering, sanitization, upgrade-safety, cross-harness robustness.
Artifacts: BRIEF, 7 positions, 7 rebuttals, synthesis-v1, 3 red-team passes,
canonical DESIGN.md, OPEN-QUESTIONS.md, MISSION.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 23:47:49 -05:00


# Red Team — Cross-Harness DevEx
**Lens:** Cross-Harness DevEx Expert (Claude Code / Codex / Pi / OpenCode injection & tool
differences; portability; end-user customization & upgrade experience).

**Target:** `synthesis-v1.md` (Chief Architect ruling) against the real tree at
`packages/mosaic/framework/`.

**Method:** I re-ran the greps rather than trusting the papers. Every claim below cites a file I read.

I am not re-litigating the settled 80% (Constitution layer, delete `defaults/SOUL.md`, CI grep,
LICENSE, credential fast-fail, `PRESERVE_PATHS` removal). Those are right. Below is where I can
break the design *as written*, ordered by severity.

---
## BLOCKERS
### B1 — The customization mechanism the whole design rests on (`mosaic init`) is interactive by default and will hang every headless launch path
The synthesis stakes upgrade-safety and sanitization on "L2/L3 ship as templates only, generated at
init" (D6, §4) and "Generated at `mosaic init`" (§4). It treats `mosaic init` as a solved
primitive. It is not solved for the way Mosaic actually runs.

`tools/_scripts/mosaic-init` is **interactive by default** (line 50: "Interactive by default";
lines 113/138/184/287: bare `read -r`). The framework's own headless surfaces are numerous: the
Discord bridge runs with **"no human at this terminal"** (project `CLAUDE.md`, Discord Bridge
Protocol), the orchestrator spawns workers via `claude -p`/`codex exec` (`guides/ORCHESTRATOR.md:6`),
and the BRIEF's own migration constraint is **"no interactive prompt, no hang"**
(synthesis §5.5, fixtures 1-3).

Failure mode: a fresh container/CI/Discord deployment installs the framework (`install.sh` does
**not** seed `SOUL.md`/`USER.md` — confirmed `install.sh:231`, `install.sh:301`), an agent launches,
no `SOUL.md`/`USER.md` exists, and either (a) the launcher tries `mosaic init` and blocks on
`read -r` forever, or (b) the agent boots with the **"missing core file → stop and report"** gate
(`defaults/AGENTS.md:144`) firing on every cold start. The synthesis never specifies who runs init,
when, or in what mode on an unattended host.

**Mitigation (must be in the alpha DoD, not deferred):** Define a deterministic non-interactive
bootstrap. `install.sh` MUST, after rsync, run `mosaic-init --non-interactive` (the flag exists,
line 61) with documented defaults so a valid `SOUL.md`/`USER.md` always exists post-install. Add a
4th migration fixture to §5.5: *"unattended install (no TTY) → valid resident SOUL.md/USER.md exist,
zero `read` calls."* Until that fixture is green, the alpha cannot tag — this is the same falsifiable
gate the synthesis already applies to migration.
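
A minimal sketch of what the proposed fourth fixture might check, assuming the resident files land in a single target directory. The helper names `assert_resident_files` and `run_unattended_fixture` are hypothetical; only `mosaic-init --non-interactive` comes from the tree. Detaching stdin guarantees that a stray `read` sees EOF instead of hanging, but it does not by itself prove there are zero `read` calls.

```sh
#!/bin/sh
# Sketch of the proposed "unattended install" fixture (helper names are hypothetical).

# Succeeds only if both resident files exist and are non-empty in the given directory.
assert_resident_files() {
  dir="$1"
  for f in SOUL.md USER.md; do
    if [ ! -s "$dir/$f" ]; then
      echo "FAIL: $dir/$f is missing or empty" >&2
      return 1
    fi
  done
  echo "OK: resident files present in $dir"
}

# Runs a bootstrap command with stdin detached from any terminal, then checks
# that the resident files exist.
# Usage: run_unattended_fixture <target-dir> <command> [args...]
run_unattended_fixture() {
  target="$1"
  shift
  "$@" </dev/null || { echo "FAIL: bootstrap exited non-zero" >&2; return 1; }
  assert_resident_files "$target"
}
```

Under these assumptions the fixture could be invoked as `run_unattended_fixture "$HOME/.config/mosaic" mosaic-init --non-interactive`, with the target directory adjusted to wherever init actually writes.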
### B2 — The non-interactive default regenerates the exact bug D6 claims to reject ("Assistant" is the new "Jarvis")
D6 explicitly *rejects* "Generic-defaults for persona (recreates the bug — 'Assistant' becomes the
new 'Jarvis')." But the only persona-generation mechanism in the tree does exactly that:
`tools/_scripts/mosaic-init:277` is `prompt_if_empty AGENT_NAME "What name should agents use" "Assistant"`,
with **"Assistant"** as the default. In `--non-interactive` mode (which B1 shows is the *only* viable mode for Mosaic's
headless fleet), `prompt_if_empty` takes the default — so every unattended deployment ships an agent
literally named **"Assistant"** with role "execution partner and visibility engine" (line 278, copied
verbatim from the Jarvis `defaults/SOUL.md:11`).

So the design's stated anti-pattern is the design's actual default. Worse: the role string is still
the operator's old role description, meaning a sliver of Jarvis persona survives sanitization through
the init defaults — invisible to `verify-sanitized.sh` because it lives in the *generator*, not in
`defaults/`.

**Mitigation:** Pick one and make it real: (a) make non-interactive init **fail closed** on persona
unless `--agent-name` is supplied (forces deployers to choose, no silent "Assistant"), or (b) accept
a generic persona as a *conscious* alpha decision and **strike the contradictory rejection from D6**.
You cannot both reject generic-default-persona and ship it. Either way, extend `verify-sanitized.sh`
to scan `tools/_scripts/mosaic-init` for operator-derived default strings (the role line is one).
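
Option (a) could look roughly like the sketch below. The function name `resolve_agent_name` and its mode strings are hypothetical, and `--agent-name` is the flag this finding proposes, not one confirmed to exist in `mosaic-init` today. The point is only that the unattended path returns an error instead of silently falling back to a default name.

```sh
# Sketch of option (a): fail closed when no persona name is supplied in unattended mode.
# resolve_agent_name <mode> [name]
#   mode: "interactive" or "non-interactive" (hypothetical values for this sketch)
resolve_agent_name() {
  mode="$1"
  name="${2:-}"
  if [ -n "$name" ]; then
    printf '%s\n' "$name"
    return 0
  fi
  if [ "$mode" = "non-interactive" ]; then
    echo "error: --agent-name is required with --non-interactive" >&2
    return 1
  fi
  # Interactive mode: ask, and still refuse an empty answer rather than defaulting.
  printf 'What name should agents use? ' >&2
  read -r name || return 1
  [ -n "$name" ] || { echo "error: agent name must not be empty" >&2; return 1; }
  printf '%s\n' "$name"
}
```

A caller would then abort the install on failure, for example `AGENT_NAME=$(resolve_agent_name non-interactive "$name_from_flag") || exit 1`, where `name_from_flag` stands in for whatever variable holds the parsed flag value.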
### B3 — The sanitization fix list misses 5+ contaminated files; the CI grep as scoped will *fail the build on day one* or silently miss them
The synthesis "verified live facts" names exactly two files with the private credential path
(`tools/_lib/credentials.sh:19`, `tools/git/detect-platform.sh:89`) and D8 fixes "both." My grep
found **at least six**:
```
tools/_lib/credentials.sh:19
tools/git/detect-platform.sh:89
tools/health/stack-health.sh:23
tools/coolify/README.md:8
tools/glpi/README.md:8
tools/authentik/README.md:8
tools/woodpecker/README.md:8 (+ likely more tool READMEs)
```
Two independent breakages follow:

1. **Incomplete fix.** D8 patches 2 of 6+; `stack-health.sh:23` keeps the hardcoded private path as
   an *executable* default, the exact class D8 calls "worse than persona contamination… runnable."
2. **CI grep paradox.** D6 scopes `verify-sanitized.sh` over `defaults/ guides/ templates/ runtime/
   adapters/` and **excludes examples/**, but says nothing about `tools/`. So the blocking grep that
   is supposed to be the "only durable control" **does not even look in the directory where the
   runnable contamination lives.** If you widen scope to `tools/`, the build goes red on the README
   tokens immediately; if you don't, the credential leak ships. The synthesis has not reconciled this.

Also note: the synthesis's premise that this is a `${VAR:-default}` violation is half-right. The code
is *already* `${MOSAIC_CREDENTIALS_FILE:-$HOME/src/jarvis-brain/...}`, i.e. already overridable. The
defect is purely the *leaked private default*, not missing env support. The fix is to drop the default
(`${MOSAIC_CREDENTIALS_FILE:?...}`), and it must land in **all** call sites.
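
As an illustration of the `:?` form, here is a minimal sketch of a shared accessor that call sites could use instead of repeating the expansion. The function name `credentials_file` and the error text are assumptions; only the variable name `MOSAIC_CREDENTIALS_FILE` comes from the tree.

```sh
# Sketch: resolve the credentials path with no baked-in default.
# The body runs in a subshell (parentheses, not braces) so that a failed
# ${VAR:?} expansion aborts only this call, not the calling script.
credentials_file() (
  : "${MOSAIC_CREDENTIALS_FILE:?set MOSAIC_CREDENTIALS_FILE to your credentials file}"
  printf '%s\n' "$MOSAIC_CREDENTIALS_FILE"
)

# Typical use at a call site (fails fast when the variable is unset or empty):
#   creds=$(credentials_file) || exit 1
```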
**Mitigation:** Enumerate the real contamination set with `grep -rn "jarvis-brain\|/home/jwoltje"
tools/` before writing the fix list; fix every hit; scope `verify-sanitized.sh` to include `tools/`
(README prose can use a placeholder like `$MOSAIC_CREDENTIALS_FILE` to pass the grep). Make the grep's
own scope a reviewed artifact: an under-scoped denylist is indistinguishable from no denylist.

---
## MAJOR
### M4 — Tiered injection legitimizes a real cross-harness drift: the bare-`claude` Tier-3 path silently runs on a *different, weaker* law text
The honesty of D5's tier table (`Pi=Tier1 by-value`, `bare claude=Tier3 pointer + ≤5-bullet inline`)
is the right instinct, but it ships **two different constitutions** to two users who both believe they
are "running Mosaic." Tier-1 gets all 13 gates by value; Tier-3 gets a 5-bullet summary plus a
*conditional* "READ CONSTITUTION.md if not resident." On a bare `claude` launch the model is already
mid-task with competing harness `<system-reminder>`s (I am reading several right now in this very
session) — the conditional read is the *weakest* tier by the synthesis's own ladder, and nothing
guarantees it fires. So gate #12 ("complexity trap"), gate #10 ("no manual docker build"), gate #6
("queue guard") — none resident on Tier-3 — are simply absent for that user. Two harnesses, two
behaviors, same "Mosaic" label. That is the cross-harness inconsistency the BRIEF (DQ4) exists to
kill, re-introduced as an accepted design property.

The current tree already has this disease and the synthesis under-counts it: `defaults/AGENTS.md:11`
asserts "The core contract is ALREADY in your context… Do not re-read it" — **provably false on bare
`claude`** (the synthesis catches this, consensus item 9, good) — but the *fix* (Tier-3 inline
summary) is itself a lossy re-statement of L0, which is the very "paraphrased law is the drift vector"
sin D7 rails against. You cannot simultaneously (a) forbid paraphrasing gates and (b) ship a 5-bullet
paraphrase of the gates as the Tier-3 payload.

**Mitigation:** The ≤5-bullet Tier-3 anchor must be **a literal substring of L0** (the same bytes,
not a summary) — pick the 5 truly irreducible *stop-condition* gates and inject those exact lines, so
Tier-3 is a strict subset of Tier-1, never a divergent paraphrase. And the CI smoke test (D5) must
assert **byte-equality** of that anchor against the L0 source, not mere "gates present." Otherwise the
smoke test passes while the texts drift.
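
A byte-equality assertion of this kind could be sketched as follows. This is a simplification that assumes the anchor is stored one gate per line and that each gate occupies a single line in L0, so "literal substring" reduces to a whole-line fixed-string match. The function name and both file arguments are hypothetical.

```sh
# Sketch: every non-blank line of the Tier-3 anchor must appear, byte for byte,
# as a whole line of the L0 source. File paths are supplied by the caller.
anchor_is_literal_subset() {
  anchor="$1"
  l0="$2"
  rc=0
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue
    # -F: fixed string, -x: whole-line match, -q: quiet, --: end of options
    if ! grep -Fxq -- "$line" "$l0"; then
      echo "DRIFT: anchor line not found verbatim in L0: $line" >&2
      rc=1
    fi
  done < "$anchor"
  return "$rc"
}
```

Because the match is on fixed strings, a reworded gate fails the check even when it would still satisfy a looser "gates present" test.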
### M5 — Removing `AGENTS.md`/`STANDARDS.md` from `PRESERVE_PATHS` will clobber real user edits on the first upgrade, because today those files are user-editable and edited
The single highest-value change (consensus item 7; §5.1) is "Remove `AGENTS.md` and `STANDARDS.md`
from `PRESERVE_PATHS`." Confirmed today: `install.sh:24` lists both as preserved. The drift bug is
real. But the migration is more dangerous than the synthesis admits.

`PRESERVE_PATHS` has protected `AGENTS.md`/`STANDARDS.md` since `FRAMEWORK_VERSION=2`
(`install.sh:28`). That means **every existing install may have a locally-modified
`AGENTS.md`/`STANDARDS.md`** — that was the *sanctioned* customization surface until now. The moment
v3 removes them from preserve and `rsync --delete` runs (`install.sh:116`), those edits are
**destroyed with no capture into `.local.md`**. The synthesis's fixture 3 ("user-tuned-standard →
survives as `STANDARDS.local.md`") *assumes* the migration first extracts the user delta into an
overlay — but §5.4 only describes snapshotting to `.backup-v2/` and installing new files. It never
specifies the **delta-extraction step** that turns a legacy edited `STANDARDS.md` into
`STANDARDS.local.md`. A `.backup-v2/` tarball the user never looks at is not "your change survived."

**Mitigation:** The v2→v3 migration MUST, for `AGENTS.md` and `STANDARDS.md`, diff the installed file
against the v2 *shipped* baseline; if they differ, write the diff (or the whole old file) to
`<name>.local.md` **before** overwriting, and print a one-line notice. This needs the v2 baseline
shipped inside the migration (the synthesis correctly notes "no current install has a base" for 3-way
merge — same problem here; solve it by vendoring the v2 baseline into the migration script, not by
hoping). Fixture 3 must assert the *content* landed in `.local.md`, not just that a backup exists.
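
The delta-extraction step might be sketched like this, taking the "whole old file" variant for simplicity. The function name `preserve_user_edits` is hypothetical, and the sketch assumes the vendored v2 baseline is available as a plain file next to the migration script.

```sh
# Sketch: before overwriting a legacy file, keep the user's version as an overlay
# if it differs from the vendored v2 baseline.
# preserve_user_edits <installed-file> <v2-baseline-file>
preserve_user_edits() {
  installed="$1"
  baseline="$2"
  [ -f "$installed" ] || return 0              # nothing installed, nothing to save
  if cmp -s "$installed" "$baseline"; then
    return 0                                   # unmodified: safe to overwrite
  fi
  overlay="${installed%.md}.local.md"          # STANDARDS.md -> STANDARDS.local.md
  if [ -e "$overlay" ]; then
    echo "ERROR: $overlay already exists; refusing to overwrite it" >&2
    return 1
  fi
  cp -p "$installed" "$overlay" || return 1
  echo "NOTICE: local edits in $installed preserved as $overlay"
}
```

A missing baseline makes `cmp` fail, so the file is treated as modified and preserved. That errs on the side of keeping user content, which seems the safer default for a one-way migration.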
### M6 — `.local.md` overlays only work if the launcher composes them; three of four harnesses have no such composer today
D4/§5.2 mandate "additive overlays, launcher-composed" via `mosaic compose-contract <harness>`. I
grepped: **no `compose-contract` exists** (only `prdy-init.sh`, `prdy-update.sh`, `adapters/pi.md`,
`README.md` mention "compose"). So the central upgrade-safety promise — "edit `*.local.md` freely" —
is backed by a command that isn't written. More portability-specific: the four harnesses inject
differently and only Pi clearly supports by-value append (`adapters/pi.md:14`
`--append-system-prompt`). Codex/OpenCode read an **instructions file** (`runtime/codex/RUNTIME.md:8`
`~/.codex/instructions.md`; `runtime/opencode/RUNTIME.md:8` `~/.config/opencode/AGENTS.md`), and bare
`claude` reads `~/.config/mosaic/` by self-load. For `.local.md` to take effect on Codex/OpenCode,
*something* must concatenate base+overlay into that instructions file at the right moment. The
synthesis assigns this to "the launcher" but never says the launcher writes the instructions file, nor
what happens for **bare** `claude`/`codex`/`opencode` launches that bypass `mosaic` entirely (the
exact Tier-3 path that exists *because users do this*). On those paths the overlay is simply never
composed and silently no-ops — the failure mode devex §2b (quoted in D4) supposedly already ruled out,
re-appearing for the non-`mosaic` launch.

**Mitigation:** (1) `compose-contract` is alpha-blocking, not assumed; spec it per harness:
Pi=append-prompt, Codex/OpenCode=write-merged-instructions-file, Claude=write into the self-loaded
`~/.config/mosaic/AGENTS.md` chain. (2) For **bare** launches that bypass `mosaic`, the self-load
fallback in `AGENTS.md` MUST also pull `*.local.md` (the dispatcher reads overlays too), or document
loudly that overlays require `mosaic <harness>` and bare launches get base-only. Pick one; don't leave
it implicit.
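
For the instructions-file harnesses, the composer could be as small as the sketch below. Everything here is an assumption for illustration: the function name `compose_contract`, the three-argument interface, and the HTML comment used to mark where the overlay begins. No such command exists in the tree today, as noted above.

```sh
# Sketch of a composer for instructions-file harnesses (Codex/OpenCode style):
# writes base + optional overlay into the file the harness actually reads.
# compose_contract <base-file> <overlay-file> <output-file>
compose_contract() {
  base="$1"
  overlay="$2"
  out="$3"
  [ -f "$base" ] || { echo "ERROR: base contract $base not found" >&2; return 1; }
  tmp="$out.tmp.$$"
  {
    cat "$base"
    if [ -s "$overlay" ]; then
      # Marker line is a hypothetical convention, not part of any spec.
      printf '\n<!-- local overlay: %s -->\n' "$overlay"
      cat "$overlay"
    fi
  } > "$tmp" || { rm -f "$tmp"; return 1; }
  mv "$tmp" "$out"                     # replace the target in one step
}
```

A missing or empty overlay yields a byte-identical copy of the base, so running the composer unconditionally is safe. It only helps, of course, on launch paths that actually run it, which is the bare-launch gap this finding describes.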
### M7 — The Pi/sequential-thinking capability split fixes one contradiction and leaves the inverse one live
D5 correctly kills the `defaults/AGENTS.md:143` ("sequential-thinking REQUIRED, else stop") vs
`adapters/pi.md` ("native thinking replaces it") contradiction via capability verbs. But the live tree
has the contradiction in **four** places, not one: `runtime/codex/RUNTIME.md:3`,
`runtime/opencode/RUNTIME.md:3`, and `runtime/claude/RUNTIME.md:3` all say "sequential-thinking MCP is
required," while `runtime/pi/RUNTIME.md:61` says "The Mosaic launcher does NOT gate on
sequential-thinking MCP for Pi." If L0 states the gate as a hard "else stop" (as `AGENTS.md:143` does
today) and only the *adapter* downgrades it for Pi, then a Pi agent that self-loads L0 on a bare `pi`
launch reads "REQUIRED, else stop" from the resident constitution and the "not gated" relief only from
the non-resident adapter — i.e. the *stronger* statement is the resident one and Pi agents will
spuriously halt. The capability-verb abstraction only resolves this if L0 is authored in verbs from
the start ("use structured reasoning") with **zero** tool-specific "else stop," and the gate-vs-no-gate
binding lives *only* in the adapter. The synthesis says this but the migration plan never rewrites the
four RUNTIME.md "required" lines; §2b only touches "restated policy," and a reader could leave the
contradictory line in.
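
L0 only stays tool-neutral if something checks it. A minimal sketch of such a check follows, using the pattern that this finding's mitigation proposes; the function name and the single-file interface are assumptions.

```sh
# Sketch: fail if an L0 file names a specific tool in a hard-stop.
# check_no_tool_hard_stop <l0-file>
check_no_tool_hard_stop() {
  l0_file="$1"
  pattern='sequential-thinking|MCP.*REQUIRED|else stop'
  if [ ! -r "$l0_file" ]; then
    echo "ERROR: cannot read $l0_file" >&2
    return 2
  fi
  # -n: show line numbers, -E: extended regex; matches are reported on stderr
  if grep -nE -- "$pattern" "$l0_file" >&2; then
    echo "FAIL: tool-named hard-stop found in $l0_file" >&2
    return 1
  fi
  echo "OK: no tool-named hard-stop in $l0_file"
}
```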
**Mitigation:** Make "no tool-named hard-stop in L0" an explicit `verify-sanitized.sh` rule
(grep L0 for `sequential-thinking|MCP.*REQUIRED|else stop` → fail). Rewrite all four RUNTIME.md
capability lines in the same PR; add a smoke-test assertion that a bare `pi` launch does not emit the
sequential-thinking halt.

---
## MINOR
### m8 — Resident line-count budget without a per-harness baseline is a foot-gun for the weakest harness
D7 enforces a "resident line-count ceiling" over the resident set. Good. But the synthesis notes Pi's
"resident fidelity is Pi's *only* enforcement" (§6 table) — Pi has **no hook backstop**. A single
global line budget tuned for Claude (hooks + plugins absorb load) is mis-sized for Pi
(which needs *everything* resident because it has no mechanical net), and a single number can't tell the
difference. **Mitigation:** budget per residency-tier, and document that on hook-less harnesses (Pi,
and Codex/OpenCode until hook parity — a "tracked gap" per §6) more of L0/L1 must stay resident; the
budget number is per-harness, not global.
### m9 — `mosaic doctor` drift advisory is the only drift detection, and it's opt-in on the paths where drift happens
D3/§5.6 make drift detection a *non-blocking advisory* in `mosaic doctor`. But drift happens precisely
on **bare** `claude`/`codex` launches that never invoke `mosaic` (hence never run `doctor`). So the
one detector is absent exactly where the disease lives — the same structural flaw the synthesis
correctly used to *reject* hash-refusal-on-launch (D3) applies to its own chosen replacement.
**Mitigation:** accept it as a known alpha limitation **in writing** (CONTRIBUTING/COMPLIANCE doc),
and have the `AGENTS.md` self-load fallback emit a one-line "run `mosaic doctor`" nudge when it detects
it was loaded outside a `mosaic` launcher. Don't claim drift is "detected" when it's only detected for
users who opt into the tool.
### m10 — `templates/agent/` ships 12 files with `rails/git/`; the dispatcher-replacement risks leaving CLAUDE.md siblings behind
Confirmed: `rails/git` / `/rails/` appears across `templates/agent/AGENTS.md.template` **and** the
`CLAUDE.md.template` siblings + all `projects/*` (django/typescript/nestjs-nextjs/python-*). §2b's fix
list names "`templates/agent/AGENTS.md.template` (+ 11 sibling/project templates)" but the grep shows
the `CLAUDE.md.template` variants carry the same dead `rails/` path and the same restated hard-gates
block. If the PR fixes the `AGENTS.md.template` set but not the `CLAUDE.md.template` set, Claude-first
projects (which read `CLAUDE.md`) keep emitting commands at a path `install.sh:192` deletes.
**Mitigation:** the `rails/`→`tools/` and gate-block-removal edits must target `templates/agent/**`
(both `AGENTS.md.template` and `CLAUDE.md.template`), enforced by the same `verify-sanitized.sh`
`/rails/` rule over `templates/`.
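
A dead-path rule of this shape could be sketched as a generic denylist scan. The function name `scan_denylist` is hypothetical, and `-r` and `-I` are GNU/BSD `grep` options rather than POSIX ones. Because it recurses over directories, it covers `AGENTS.md.template` and `CLAUDE.md.template` alike without naming either.

```sh
# Sketch: fail if any file under the given directories matches a denylist pattern.
# scan_denylist <extended-regex> <dir> [dir...]
scan_denylist() {
  pattern="$1"
  shift
  rc=0
  # -r: recurse, -n: line numbers, -I: skip binary files, -E: extended regex
  grep -rnIE -- "$pattern" "$@" || rc=$?
  case "$rc" in
    0) echo "FAIL: denylisted pattern '$pattern' found" >&2; return 1 ;;
    1) echo "OK: no match for '$pattern'" ;;
    *) echo "ERROR: grep failed (unreadable or missing path?)" >&2; return 2 ;;
  esac
}
```

For this finding the call would be something like `scan_denylist 'rails/' templates`. The same function, pointed at other directories with a different pattern, would serve the wider dead-path and credential-path rules discussed elsewhere in this review.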
### m11 — "Master/slave" is not the only legacy-terminology / dead-path landmine; sanitize the class
§2b drops the "Master/slave model" framing at `STANDARDS.md:5` (confirmed present). Fine, but it's a
one-off fix for a class problem: `STANDARDS.md:42-44` also references `scripts/agent/session-start.sh`
lifecycle scripts and `adapters/claude.md:16` references `~/.config/mosaic/rails` ("linked into
`~/.claude`"). These are the same drift family (stale paths/terms in resident or near-resident files).
**Mitigation:** the CI grep's dead-path rule should cover `rails`, `scripts/agent/` (if those are
deprecated), and a small terminology denylist — close the class, per the synthesis's own D6 "close the
class, not the tokens" principle, which it applies to PII but not to dead paths/terms.

---
## Summary table
| ID | Severity | One-line risk | Core mitigation |
|----|----------|---------------|-----------------|
| B1 | blocker | `mosaic init` is interactive by default → hangs/blocks every headless (Discord/orchestrator/CI) cold start | `install.sh` runs `mosaic-init --non-interactive`; add unattended-install migration fixture |
| B2 | blocker | Non-interactive default ships agent named "Assistant" + Jarvis role string — the bug D6 *rejects* | Fail-closed on persona, or strike D6's rejection; grep init defaults in CI |
| B3 | blocker | Credential leak is in 6+ files (synthesis names 2); CI grep doesn't scope `tools/` | Enumerate real set; fix all; scope grep to `tools/` |
| M4 | major | Tier-3 bare-`claude` runs a divergent 5-bullet paraphrase of L0 → two "Mosaics" | Tier-3 anchor must be literal L0 substring; smoke test asserts byte-equality |
| M5 | major | Pulling `AGENTS.md`/`STANDARDS.md` from PRESERVE clobbers existing user edits | Migration extracts delta → `.local.md` before overwrite; vendor v2 baseline |
| M6 | major | `compose-contract` doesn't exist; overlays no-op on Codex/OpenCode + all bare launches | Spec composer per harness; define bare-launch overlay behavior |
| M7 | major | sequential-thinking hard-stop contradiction lives in 4 RUNTIME files; L0-resident "else stop" halts Pi | L0 in capability verbs only; CI rule bans tool-named hard-stops in L0 |
| m8 | minor | Global line budget ignores Pi's no-hook "resident is the only enforcement" | Per-harness residency budget |
| m9 | minor | `mosaic doctor` drift advisory absent on the bare launches where drift occurs | Document limitation; self-load nudge |
| m10 | minor | `CLAUDE.md.template` siblings keep `rails/git` + restated gates | Fix both template families; CI `/rails/` rule over `templates/` |
| m11 | minor | Dead-path/legacy-term sanitization is one-off, not class-closing | Extend CI grep to dead paths + term denylist |

**Bottom line:** the layer model and the "subtraction not addition" doctrine are sound. The design
breaks at the **seam between the spec and the mechanisms it assumes already exist** — `mosaic init`
(interactive, generic-default), `compose-contract` (absent), the migration's delta-extraction step
(unspecified), and a CI grep scoped to miss the runnable contamination. Every blocker is a case of the
synthesis describing a control as done when the tree shows it isn't. None of them weaken the hard gates
on paper; **B1, M4, M6 weaken them in practice** by letting an agent launch with the gates absent,
paraphrased, or un-composed — which is the one outcome the BRIEF's non-negotiables forbid.