Durable tmux Fleet Installation Plan
For Mosaic/Hermes: This is an implementation plan for making the tmux-backed Mosaic software-factory fleet durable on this server and reusable in generic Mosaic Stack installs. Keep local USC/Mosaic defaults in profiles; keep framework behavior customizable.
Goal: Add a supported Mosaic tmux-fleet installation path: holder-owned tmux server, per-agent reusable sessions, reliable send/reset/status tools, local roster customization, and a documented cutover for this server.
Architecture: Mosaic should ship generic tmux fleet primitives in the framework, then layer local rosters through configuration. The holder service owns the tmux socket; each agent service joins the holder-owned server and runs mosaic yolo <runtime>. The orchestrator addresses agents through mosaic agent ... abstractions so tmux can later be replaced by Matrix-backed agent comms without changing mission flow.
Reference: AI Guide playbooks/tmux-fleet.md at commit 2a0b0b5 documents the organization-neutral holder-service pattern, exact-match =<name> stop targets, and coupled-server cutover/verification sequence. The Stack implementation should treat that as the lifecycle model and keep concrete Mosaic unit/tooling details here.
Tech Stack: Bash, tmux, user systemd units, Mosaic CLI/framework installer, JSON/YAML roster config, existing packages/mosaic/framework/tools/tmux/{agent-send.sh,send-message.sh}.
Current evidence from this server
Checked 2026-06-19:
- Host: W-jarvis
- User: jarvis
- tmux: /usr/bin/tmux, version 3.4
- user systemd: active
- existing tmux sessions: ai-bma-0, dyor-1, melaniewoltje-3, sage-2
- existing Mosaic runtime: /home/jarvis/.npm-global/bin/mosaic, version 0.0.31
- installed ~/.config/mosaic/tools/tmux was not present even though the stack repo contains packages/mosaic/framework/tools/tmux/
Implication: do not kill the current tmux server casually. This server has active ad-hoc/service sessions. The durable fleet cutover must be planned, with either a separate socket first or a scheduled fleet recycle.
Design decisions
1. Generic framework, local profile
The Mosaic framework should ship:
- systemd unit templates;
- tmux fleet CLI wrappers;
- roster schema and examples;
- install/enable/status/reset commands;
- docs and verification scripts.
Local environments should provide:
- agent names;
- runtime per slot (claude, pi, codex, etc.);
- default role class;
- launch directory;
- optional kickstart prompt;
- model/provider hints;
- transport selection (tmux now, matrix later).
Do not bake the USC roster into generic install code. Ship it as an example profile.
2. Durable sessions, disposable task context
Session names are durable operational addresses. Task persona is disposable. Reusable worker slots should be reset with /clear or /new and then receive a fresh task kickstart.
Persistent/semi-persistent personas:
- lead orchestrator;
- final/adversarial reviewer;
- architecture/enhancement lane.
Disposable slots:
- implementers;
- ordinary reviewers;
- security reviewers unless actively holding a security mission.
3. Transport abstraction now
Add commands around tmux instead of calling tmux directly from orchestration:
mosaic agent send <agent> --message "..."
mosaic agent status [--json]
mosaic agent reset <agent> [--clear|--new]
mosaic agent roster [--json]
mosaic fleet install|start|stop|restart|status|verify
Today these call tmux/systemd. Later the same command surface can target Matrix or per-agent gateways.
4. Avoid shared-server ownership bug
Use the AI Guide holder pattern:
mosaic-tmux-holder.service owns the tmux server/socket
mosaic-agent@<name>.service joins the existing holder-owned socket
ExecStop kills only session =<name>
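Use exact tmux targets: =<session>. Without the = prefix, tmux also accepts prefix and pattern matches, so a stop aimed at a missing session agent could hit agent0 instead.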
5. Prefer separate named socket for Mosaic factory
To avoid disturbing existing tmux work, the default fleet should use a named socket such as:
$XDG_RUNTIME_DIR/mosaic-factory.tmux
or tmux socket name:
tmux -L mosaic-factory ...
This keeps fleet sessions off the default server that a plain tmux ls shows. The send tools need socket support to reach them.
Target USC-style roster example
Ship as example only, not default:
version: 1
transport: tmux
tmux:
  socket_name: mosaic-factory
  holder_session: _holder
  working_directory: ~/src
agents:
  - name: mos-claude
    runtime: claude
    class: orchestrator
    model_hint: Claude Opus
    persistent_persona: true
  - name: coder0
    runtime: claude
    class: implementer
    model_hint: Claude Opus
    reset_between_tasks: true
  - name: coder1
    runtime: claude
    class: implementer
    model_hint: Claude Opus
    reset_between_tasks: true
  - name: coder2
    runtime: pi
    class: implementer
    model_hint: Pi GPT-5.5
    reset_between_tasks: true
  - name: coder3
    runtime: pi
    class: implementer
    model_hint: Pi GPT-5.5
    reset_between_tasks: true
  - name: coder4
    runtime: claude
    class: implementer
    model_hint: Claude Opus
    reset_between_tasks: true
  - name: coder5
    runtime: claude
    class: implementer
    model_hint: Claude Opus
    reset_between_tasks: true
  - name: enhance
    runtime: claude
    class: enhancer
    model_hint: Claude Opus
    persistent_persona: semi
  - name: rev0
    runtime: pi
    class: reviewer
    model_hint: Pi GPT-5.5
    reset_between_tasks: true
  - name: rev1
    runtime: pi
    class: reviewer
    model_hint: Pi GPT-5.5
    reset_between_tasks: true
  - name: secrev0
    runtime: pi
    class: security_reviewer
    model_hint: Pi GPT-5.5
    reset_between_tasks: true
  - name: secrev1
    runtime: pi
    class: security_reviewer
    model_hint: Pi GPT-5.5
    reset_between_tasks: true
  - name: ultron
    runtime: pi
    class: final_reviewer
    model_hint: Pi GPT-5.5
    persistent_persona: semi
Phase 0 — Confirm install surfaces
Task 0.1: Inspect installer copy behavior
Objective: Confirm how framework files under packages/mosaic/framework/ become installed under ~/.config/mosaic/.
Files:
- Read: tools/install.sh
- Read: packages/mosaic/framework/install.sh
- Read: packages/mosaic/src/runtime/install-manifest.ts
Steps:
- Verify packages/mosaic/framework/install.sh rsyncs tools/tmux.
- Verify whether npm-packaged installs include framework/tools/tmux.
- Confirm whether installed hosts should run mosaic update, bash tools/install.sh, or packages/mosaic/framework/install.sh to receive new tmux tools.
- Record exact propagation command in docs.
Verification:
bash packages/mosaic/framework/install.sh --help || true
npm pack --dry-run --json | jq '.[0].files[].path' | grep 'framework/tools/tmux'
Expected: tmux tools are included in installable package or packaging fix is identified.
Task 0.2: Inspect current yolo launch semantics
Objective: Confirm mosaic yolo claude and mosaic yolo pi accept optional initial prompt text and behave well under systemd/tmux.
Files:
- Read: packages/mosaic/src/**
- Read: packages/mosaic/framework/runtime/claude/RUNTIME.md
- Read: packages/mosaic/framework/runtime/pi/RUNTIME.md
Verification commands:
mosaic yolo claude --help
mosaic yolo pi --help
Expected: a systemd ExecStart can launch the runtime either with no prompt or with a kickstart prompt file/string.
Phase 1 — Framework tmux primitives
Task 1.1: Add socket support to send tools
Objective: Allow agent-send.sh and send-message.sh to target a named Mosaic tmux socket without affecting default tmux sessions.
Files:
- Modify: packages/mosaic/framework/tools/tmux/send-message.sh
- Modify: packages/mosaic/framework/tools/tmux/agent-send.sh
- Modify: packages/mosaic/framework/tools/tmux/README.md
- Test: packages/mosaic/framework/tools/tmux/test-send-message.sh (new)
Design:
Add optional flags:
-L SOCKET_NAME # tmux -L socket name
-SOCKET PATH # optional later if needed; avoid conflict with existing -S source label in agent-send
Because agent-send.sh already uses -S for the source label, use -L for the socket name. Add a socket-path option (-T, or --socket-path once long-option parsing exists) only if it is needed.
Implementation notes:
- Build a tmux command array:
tmux_cmd=(tmux)
if [ -n "$SOCKET_NAME" ]; then tmux_cmd+=( -L "$SOCKET_NAME" ); fi
- Replace raw tmux ... calls with "${tmux_cmd[@]}" ....
- Pass -L through remote ssh invocation.
- Include socket name in verbose output.
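The notes above can be sketched as a small option parser. This is a minimal illustration, not the current send-message.sh code: parse_send_args and the global variable names are assumptions, and only the -L, -t, and -m flags come from the verification commands in this task.

```shell
#!/usr/bin/env bash
# Sketch only: parse_send_args and the globals below are illustrative names,
# not the current send-message.sh implementation.

SOCKET_NAME=""
TARGET=""
MESSAGE=""
tmux_cmd=(tmux)

parse_send_args() {
  local opt OPTIND=1
  SOCKET_NAME="" TARGET="" MESSAGE=""
  # The leading ':' puts getopts in silent mode so errors are reported here.
  while getopts ":L:t:m:" opt; do
    case "$opt" in
      L) SOCKET_NAME="$OPTARG" ;;
      t) TARGET="$OPTARG" ;;
      m) MESSAGE="$OPTARG" ;;
      :) echo "option -$OPTARG requires a value" >&2; return 2 ;;
      *) echo "unknown option -$OPTARG" >&2; return 2 ;;
    esac
  done
  # Build the command prefix once so every later tmux call shares it.
  tmux_cmd=(tmux)
  if [ -n "$SOCKET_NAME" ]; then
    tmux_cmd+=(-L "$SOCKET_NAME")
  fi
}

# Later calls would then take the form:
#   "${tmux_cmd[@]}" send-keys -t "$TARGET" -l -- "$MESSAGE"
```

Keeping the prefix in an array rather than a string means a socket name is always passed as a single argument, and a call without -L falls back to the default tmux server unchanged.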
Verification:
tmux -L mosaic-test new-session -d -s target 'cat'
packages/mosaic/framework/tools/tmux/send-message.sh -L mosaic-test -t target -m 'hello'
tmux -L mosaic-test capture-pane -t target -p | grep hello
tmux -L mosaic-test kill-server
Expected: message lands in the named socket session; default tmux ls is untouched.
Task 1.2: Add exact target validation helper
Objective: Prevent accidental prefix targeting in all tmux fleet operations.
Files:
- Create: packages/mosaic/framework/tools/tmux/_lib.sh
- Modify: send-message.sh
- Modify: agent-send.sh
Behavior:
- For session-only agent names, normalize target to =<name> before kill/status/reset operations.
- For explicit pane targets like session:window.pane, allow as advanced path but document the risk.
Verification:
Create sessions agent and agent0; verify killing/resetting agent does not affect agent0.
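One possible shape for the normalization helper is sketched below. The function name exact_target is hypothetical. It relies on documented tmux behavior: a session target prefixed with = accepts only an exact name match, while an unprefixed target may also match by prefix or pattern.

```shell
# Hypothetical helper for _lib.sh; the function name is an assumption.
exact_target() {
  local name="${1:-}"
  case "$name" in
    "")      echo "exact_target: empty target" >&2; return 1 ;;
    =*)      printf '%s\n' "$name" ;;   # already marked exact
    *:*|*.*) printf '%s\n' "$name" ;;   # explicit window/pane target: passed through unchanged
    *)       printf '=%s\n' "$name" ;;  # bare session name: force exact match
  esac
}

# Example:
#   tmux -L mosaic-factory kill-session -t "$(exact_target agent)"
# targets only the session named exactly "agent", never "agent0".
```

Passing explicit window or pane targets through unchanged matches the "advanced path" rule above; callers that use them accept the prefix-matching risk.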
Phase 2 — systemd unit templates
Task 2.1: Add holder service template
Objective: Ship a user systemd unit template that owns the Mosaic factory tmux server.
Files:
- Create: packages/mosaic/framework/systemd/user/mosaic-tmux-holder.service
- Create: packages/mosaic/framework/tools/fleet/install-user-units.sh
Unit shape:
[Unit]
Description=Mosaic tmux fleet holder
Documentation=https://git.mosaicstack.dev/mosaicstack/aiguide
[Service]
Type=oneshot
RemainAfterExit=yes
Environment=MOSAIC_TMUX_SOCKET=mosaic-factory
ExecStart=/usr/bin/tmux -L ${MOSAIC_TMUX_SOCKET} new-session -d -s _holder 'while true; do sleep 3600; done'
ExecStop=-/usr/bin/tmux -L ${MOSAIC_TMUX_SOCKET} kill-server
[Install]
WantedBy=default.target
Important: systemd does not run ExecStart through a shell and expands only ${VAR} and whole-word $VAR references. Verify the syntax; if specifier (%E and similar) or environment expansion is awkward, generate concrete units from config instead of relying on dynamic expansion.
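If concrete units are generated, the generator can be as small as a here-document. The sketch below assumes a hypothetical render_holder_unit function and an allowed character set for socket names; the unit body mirrors the shape shown above with the socket name written in literally.

```shell
# Sketch: render_holder_unit is a hypothetical generator, not an existing tool.
render_holder_unit() {
  local socket="${1:-}" tmux_bin="${2:-/usr/bin/tmux}"
  # Assumed rule: keep socket names to characters that need no unit-file quoting.
  case "$socket" in
    ""|*[!A-Za-z0-9_.-]*) echo "invalid socket name: '$socket'" >&2; return 1 ;;
  esac
  cat <<EOF
[Unit]
Description=Mosaic tmux fleet holder

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=${tmux_bin} -L ${socket} new-session -d -s _holder 'while true; do sleep 3600; done'
ExecStop=-${tmux_bin} -L ${socket} kill-server

[Install]
WantedBy=default.target
EOF
}

# Example:
#   render_holder_unit mosaic-factory > ~/.config/systemd/user/mosaic-tmux-holder.service
```

The rendered file contains no variable references, so systemd-analyze verifies exactly the command line that will run.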
Verification:
systemd-analyze --user verify ~/.config/systemd/user/mosaic-tmux-holder.service
systemctl --user daemon-reload
systemctl --user start mosaic-tmux-holder.service
tmux -L mosaic-factory ls | grep _holder
Task 2.2: Add agent service template
Objective: Ship a user systemd template that starts one configured agent slot.
Files:
- Create: packages/mosaic/framework/systemd/user/mosaic-agent@.service
- Modify: packages/mosaic/framework/tools/fleet/install-user-units.sh
Unit shape:
[Unit]
Description=Mosaic agent session %i
Requires=mosaic-tmux-holder.service
After=mosaic-tmux-holder.service
PartOf=mosaic-tmux-holder.service
[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=%h/src
Environment=MOSAIC_TMUX_SOCKET=mosaic-factory
ExecStart=/bin/bash -lc 'tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "%i" "mosaic yolo $(mosaic fleet runtime %i)"'
ExecStop=-/usr/bin/tmux -L ${MOSAIC_TMUX_SOCKET} kill-session -t '=%i'
[Install]
WantedBy=default.target
Design warning: command substitution in unit files can become brittle. Prefer a generated per-agent EnvironmentFile:
~/.config/mosaic/fleet/agents/coder0.env
with:
MOSAIC_AGENT_NAME=coder0
MOSAIC_AGENT_RUNTIME=claude
MOSAIC_AGENT_WORKDIR=/home/jarvis/src
MOSAIC_TMUX_SOCKET=mosaic-factory
Then ExecStart calls a wrapper:
~/.config/mosaic/tools/fleet/start-agent-session.sh
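A generator for the per-agent environment file might look like the following sketch. write_agent_env is a hypothetical helper; the four variable names are the ones listed above, and the default socket name is taken from this plan.

```shell
# Sketch: write_agent_env is a hypothetical helper; the variable names match
# the per-agent environment file described in this task.
write_agent_env() {
  local dir="${1:-}" name="${2:-}" runtime="${3:-}" workdir="${4:-}"
  local socket="${5:-mosaic-factory}"
  if [ -z "$dir" ] || [ -z "$name" ] || [ -z "$runtime" ]; then
    echo "usage: write_agent_env <dir> <name> <runtime> <workdir> [socket]" >&2
    return 1
  fi
  mkdir -p "$dir" || return 1
  # systemd EnvironmentFile= expects plain KEY=VALUE lines.
  cat > "${dir}/${name}.env" <<EOF
MOSAIC_AGENT_NAME=${name}
MOSAIC_AGENT_RUNTIME=${runtime}
MOSAIC_AGENT_WORKDIR=${workdir}
MOSAIC_TMUX_SOCKET=${socket}
EOF
}

# Example:
#   write_agent_env "$HOME/.config/mosaic/fleet/agents" coder0 claude "$HOME/src"
```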
Verification:
systemd-analyze --user verify ~/.config/systemd/user/mosaic-agent@.service
systemctl --user start mosaic-agent@coder0.service
tmux -L mosaic-factory has-session -t '=coder0'
systemctl --user restart mosaic-agent@coder0.service
Expected: holder server PID remains unchanged; only coder0 session recycles.
Task 2.3: Add start-agent wrapper
Objective: Keep systemd units simple by moving config lookup and launch command construction into a script.
Files:
- Create: packages/mosaic/framework/tools/fleet/start-agent-session.sh
Behavior:
Inputs:
start-agent-session.sh <agent-name>
Reads:
$MOSAIC_HOME/fleet/agents/<agent-name>.env
Starts:
tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "$MOSAIC_AGENT_NAME" -c "$MOSAIC_AGENT_WORKDIR" "mosaic yolo $MOSAIC_AGENT_RUNTIME"
Guardrails:
- fail if runtime is empty;
- fail if workdir does not exist;
- no duplicate sessions unless --replace is passed;
- exact session names only.
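The behavior and guardrails above could be combined roughly as follows. This is a sketch under stated assumptions: the function name, the MOSAIC_HOME default of ~/.config/mosaic, and the return codes are illustrative, while the env file location and the tmux command line follow this task.

```shell
# Sketch of the wrapper logic. The MOSAIC_HOME default and the return codes
# are assumptions; the env file layout follows the plan above.
start_agent_session() {
  local agent="${1:-}" replace="${2:-}"
  local env_file="${MOSAIC_HOME:-$HOME/.config/mosaic}/fleet/agents/${agent}.env"

  [ -n "$agent" ] || { echo "usage: start_agent_session <agent-name> [--replace]" >&2; return 2; }
  [ -r "$env_file" ] || { echo "missing env file: $env_file" >&2; return 1; }

  # Clear stale per-agent values before loading this agent's settings.
  unset MOSAIC_AGENT_NAME MOSAIC_AGENT_RUNTIME MOSAIC_AGENT_WORKDIR
  # shellcheck disable=SC1090
  . "$env_file"

  local name="${MOSAIC_AGENT_NAME:-$agent}"
  [ -n "${MOSAIC_AGENT_RUNTIME:-}" ] || { echo "runtime is empty for $name" >&2; return 1; }
  [ -d "${MOSAIC_AGENT_WORKDIR:-}" ] || { echo "workdir does not exist for $name" >&2; return 1; }

  local tmux_cmd=(tmux -L "${MOSAIC_TMUX_SOCKET:-mosaic-factory}")

  # "=name" forces an exact session match, so agent never matches agent0.
  if "${tmux_cmd[@]}" has-session -t "=${name}" 2>/dev/null; then
    [ "$replace" = "--replace" ] || { echo "session $name already exists" >&2; return 1; }
    "${tmux_cmd[@]}" kill-session -t "=${name}"
  fi

  "${tmux_cmd[@]}" new-session -d -s "$name" -c "$MOSAIC_AGENT_WORKDIR" \
    "mosaic yolo $MOSAIC_AGENT_RUNTIME"
}
```

Every failure path returns before any tmux state changes, so a misconfigured agent cannot replace a healthy session.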
Phase 3 — roster config and CLI wrappers
Task 3.1: Add fleet config schema and examples
Objective: Define customizable install-time roster without hardcoding USC.
Files:
- Create: packages/mosaic/framework/fleet/roster.schema.json
- Create: packages/mosaic/framework/fleet/examples/minimal.yaml
- Create: packages/mosaic/framework/fleet/examples/usc-software-factory.yaml
- Create: packages/mosaic/framework/fleet/README.md
Schema concepts:
- transport: tmux now; matrix later.
- tmux.socket_name
- tmux.holder_session
- defaults.working_directory
- agents[].name
- agents[].runtime
- agents[].class
- agents[].model_hint
- agents[].persistent_persona
- agents[].reset_between_tasks
- agents[].kickstart_template
Verification:
Use jq for JSON examples or add a small Python/YAML validator if YAML is chosen. If no YAML parser is guaranteed, store examples as JSON or support both with Python stdlib JSON first.
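Whatever parser is chosen, agent names deserve their own check, because : and . are separators in tmux target syntax (session:window.pane) and a name containing them is ambiguous as a target. The sketch below is illustrative: both function names are hypothetical and the allowed character set is an assumption, not a rule from the schema.

```shell
# Sketch: valid_agent_name and check_roster_names are hypothetical checks,
# and the allowed character set is an assumption.
valid_agent_name() {
  local name="${1:-}"
  [[ "$name" =~ ^[A-Za-z0-9][A-Za-z0-9_-]*$ ]]
}

check_roster_names() {
  # Reads one agent name per line on stdin; reports invalid and duplicate names.
  local name rc=0 seen=$'\n'
  while IFS= read -r name; do
    if ! valid_agent_name "$name"; then
      echo "invalid agent name: '$name'" >&2
      rc=1
    elif [[ "$seen" == *$'\n'"$name"$'\n'* ]]; then
      echo "duplicate agent name: '$name'" >&2
      rc=1
    fi
    seen+="$name"$'\n'
  done
  return "$rc"
}

# Example with a JSON roster:
#   jq -r '.agents[].name' roster.json | check_roster_names
```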
Task 3.2: Add mosaic fleet commands
Objective: Provide operator-safe commands for install/status/start/stop/restart/verify.
Files:
- Modify: packages/mosaic/src/cli.ts or the current commander entrypoint.
- Create scripts under: packages/mosaic/framework/tools/fleet/
Commands:
mosaic fleet init --profile minimal|usc --write
mosaic fleet install-systemd
mosaic fleet start [agent]
mosaic fleet stop [agent]
mosaic fleet restart [agent]
mosaic fleet status --json
mosaic fleet verify
Implementation path:
Start by wrapping framework shell scripts from the TypeScript CLI. Do not overbuild a TypeScript service manager in the first pass.
Task 3.3: Add mosaic agent commands
Objective: Provide transport-stable per-agent operations.
Files:
- Modify: Mosaic CLI entrypoint.
- Create: packages/mosaic/framework/tools/agent/ or reuse tools/tmux + tools/fleet.
Commands:
mosaic agent roster [--json]
mosaic agent status [agent] [--json]
mosaic agent send <agent> --message "..."
mosaic agent reset <agent> --clear|--new
mosaic agent tail <agent> [-n 80]
Reset behavior:
For tmux transport, reset --clear sends /clear then Enter through send-message.sh.
For Claude/Pi differences, keep reset command configurable per runtime:
runtimes:
  claude:
    reset_command: /clear
  pi:
    reset_command: /new
If a runtime does not support a known reset command, restart the service and send a fresh kickstart.
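The reset decision can be sketched as a lookup with a fallback. The two function names and the action strings printed here are assumptions for illustration; the claude and pi mappings come from the runtimes config above, and a real implementation would read them from the roster rather than hardcode them.

```shell
# Sketch: reset_command_for and plan_reset are hypothetical names. The
# mappings are hardcoded here only to keep the example self-contained.
reset_command_for() {
  case "${1:-}" in
    claude) printf '%s\n' "/clear" ;;
    pi)     printf '%s\n' "/new" ;;
    *)      return 1 ;;   # no known reset command for this runtime
  esac
}

# Prints the action to take instead of executing it, so the decision can be
# inspected or tested without touching tmux or systemd.
plan_reset() {
  local agent="$1" runtime="$2" cmd
  if cmd="$(reset_command_for "$runtime")"; then
    printf 'send:%s:%s\n' "$agent" "$cmd"
  else
    printf 'restart:mosaic-agent@%s.service\n' "$agent"
  fi
}
```

In both branches the caller would follow up with a fresh kickstart message, as described above.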
Phase 4 — this-server rollout strategy
Task 4.1: Install on separate socket first
Objective: Prove the holder pattern without disturbing existing sessions.
Commands after implementation lands locally:
mosaic fleet init --profile minimal --write
mosaic fleet install-systemd
systemctl --user daemon-reload
systemctl --user start mosaic-tmux-holder.service
mosaic fleet verify
Expected:
- tmux -L mosaic-factory ls shows _holder.
- normal tmux ls still shows existing sessions unchanged.
Task 4.2: Start one canary agent
Objective: Validate single-agent start/restart isolation.
Use a harmless canary first, not the full fleet.
Example roster addition:
- name: canary-pi
  runtime: pi
  class: canary
  working_directory: /home/jarvis/src
Commands:
systemctl --user start mosaic-agent@canary-pi.service
SRV=$(tmux -L mosaic-factory display-message -p '#{pid}')
systemctl --user restart mosaic-agent@canary-pi.service
test "$SRV" = "$(tmux -L mosaic-factory display-message -p '#{pid}')"
tmux -L mosaic-factory ls
Expected: holder PID unchanged; _holder remains; canary-pi recreated.
Task 4.3: Configure local Mosaic factory roster
Objective: Create the actual local roster for this server after canary passes.
Do not assume USC exact roster is desired here. Create a local profile such as:
~/.config/mosaic/fleet/roster.yaml
Initial local recommendation:
- mos-claude orchestrator
- coder0/coder1 implementers
- rev0 reviewer
- secrev0 security reviewer
- ultron final/adversarial reviewer
Scale to full USC-style pool only after resource/budget behavior is understood.
Task 4.4: Cut over existing ad-hoc tmux sessions only if desired
Objective: Avoid data loss.
Existing sessions on this server are not on the proposed mosaic-factory socket. They can remain untouched. If we later want them under Mosaic fleet control:
- list sessions;
- capture logs/handoffs;
- stop old processes intentionally;
- recreate as configured mosaic-agent@... services;
- verify comms and state.
Do not run tmux kill-server on the default socket unless Jason explicitly approves that outage.
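The "list sessions" and "capture logs" steps can be done read-only before anything is stopped. The sketch below assumes a hypothetical capture_sessions helper; it saves only the active pane of each session's current window, so multi-window sessions would need an extra loop.

```shell
# Sketch: capture_sessions is a hypothetical helper. It only reads from tmux.
capture_sessions() {
  local out_dir="${1:-}" socket="${2:-}" s
  [ -n "$out_dir" ] || { echo "usage: capture_sessions <out-dir> [socket-name]" >&2; return 2; }
  mkdir -p "$out_dir" || return 1

  local tmux_cmd=(tmux)
  [ -z "$socket" ] || tmux_cmd+=(-L "$socket")

  "${tmux_cmd[@]}" list-sessions -F '#{session_name}' | while IFS= read -r s; do
    # -S - starts at the beginning of the scrollback; "=name:" is an exact
    # session match resolved to that session's active pane.
    "${tmux_cmd[@]}" capture-pane -p -S - -t "=${s}:" > "${out_dir}/${s}.log"
  done
}

# Example (default socket, where the existing ad-hoc sessions live):
#   capture_sessions "$HOME/tmux-handoff"
```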
Phase 5 — docs and AI Guide backfill
Task 5.1: Stack docs
Objective: Document install and customization for Mosaic Stack users.
Files:
- Create: docs/fleet/tmux-fleet.md or packages/mosaic/framework/tools/fleet/README.md
- Modify: top-level README.md if appropriate.
Must cover:
- what problem holder service solves;
- install commands;
- customization file;
- example rosters;
- reset/reuse lifecycle;
- exact-target safety;
- separate socket default;
- Matrix migration path.
Task 5.2: AI Guide docs
Objective: Keep generic guidance in AI Guide and implementation details in Stack.
Files in mosaicstack/aiguide:
- Update: playbooks/tmux-fleet.md with named socket, roster/profile, and resettable-slot pattern.
- Add or update: reference/agent-role-matrix.md if PR #5 lands.
Do not put Mosaic install commands as the only path in AI Guide. Present them as one implementation profile.
Phase 6 — Matrix migration seam
Task 6.1: Add transport enum but implement tmux only
Objective: Avoid hardcoding tmux into orchestration semantics.
Roster:
transport: tmux
Future:
transport: matrix
matrix:
  homeserver: https://matrix.example
  room_prefix: mosaic-factory
Task 6.2: Define transport interface docs
Objective: Make Matrix plugin work a transport swap, not a rewrite.
Minimum operations:
send(agent, message)
reset(agent, mode)
status(agent)
tail(agent)
listAgents()
Any tmux-specific concept must stay below this interface, inside the tmux transport implementation.
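One way to keep that boundary visible in shell is a dispatcher that is the only code aware of the transport value. This is a sketch, not the planned implementation: the function names and the MOSAIC_SEND_TOOL variable are assumptions, and the -L flag on the send tool is the one proposed in Task 1.1 rather than an existing option.

```shell
# Sketch: function names and MOSAIC_SEND_TOOL are assumptions. The -L flag on
# the send tool is the one proposed in Task 1.1, not an existing option.
transport_tmux_send() {
  local agent="$1" message="$2"
  "${MOSAIC_SEND_TOOL:-send-message.sh}" \
    -L "${MOSAIC_TMUX_SOCKET:-mosaic-factory}" -t "$agent" -m "$message"
}

agent_send() {
  local transport="$1" agent="$2" message="$3"
  case "$transport" in
    tmux)   transport_tmux_send "$agent" "$message" ;;
    matrix) echo "matrix transport is not implemented yet" >&2; return 3 ;;
    *)      echo "unknown transport: $transport" >&2; return 2 ;;
  esac
}
```

Callers pass only an agent name and a message; socket names and session targets never appear above transport_tmux_send, which is what lets a Matrix backend slot in as another case branch.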
Acceptance criteria
The implementation is complete when:
- mosaic fleet init can write a minimal roster.
- mosaic fleet install-systemd installs holder and agent units without hand editing.
- mosaic fleet start starts the holder and configured agents on a named tmux socket.
- Restarting one mosaic-agent@name.service does not change holder server PID or kill sibling sessions.
- mosaic agent send can deliver a message to a named agent with a self-identifying preamble.
- mosaic agent reset can clear/new a reusable slot and send a fresh kickstart.
- mosaic fleet verify proves holder ownership, exact-target safety, and per-agent restart isolation.
- Existing default tmux sessions on this server are not disturbed by default install.
- Docs explain generic customization and include USC-style roster only as an example.
- AI Guide remains generic; Mosaic Stack docs carry the concrete install path.
Risks and mitigations
| Risk | Mitigation |
|---|---|
| Killing existing tmux sessions | Use named mosaic-factory socket; no default tmux kill-server. |
| systemd unit quoting/env expansion bugs | Move logic into shell wrappers; verify with systemd-analyze --user verify. |
| Runtime reset command mismatch | Make reset command runtime-configurable; fallback to service restart + kickstart. |
| Tool install drift | Ensure npm package includes framework tmux/fleet tools; add packaging test. |
| Mosaic-specific assumptions leak into generic guide | Keep USC roster as example profile; AI Guide documents pattern/options. |
| Matrix migration blocked by tmux coupling | Add mosaic agent abstraction now; keep tmux details below transport layer. |
Suggested first PR split
- PR A: tmux tool hardening
  - socket support;
  - exact target helpers;
  - tests/docs.
- PR B: fleet systemd primitives
  - holder unit;
  - agent unit;
  - start-agent wrapper;
  - install-user-units script;
  - verify script.
- PR C: roster and CLI
  - roster schema/examples;
  - mosaic fleet ... commands;
  - mosaic agent ... commands.
- PR D: local rollout and docs
  - local roster for this server;
  - run canary;
  - document verification evidence;
  - update AI Guide with generic lessons.
Immediate next action
Implement PR A first. It is low-risk, improves existing tools, and is required for a safe named-socket rollout on this server.