feat(fleet): add durable tmux fleet poc
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful

This commit is contained in:
Jarvis
2026-06-19 15:23:14 -05:00
parent 250d3da12d
commit 757f5e6998
11 changed files with 313 additions and 31 deletions

View File

@@ -0,0 +1,57 @@
# Mosaic tmux Fleet PoC
This directory contains the first durable tmux-backed fleet primitives for the
Mosaic software-factory model.
The lifecycle model follows the organization-neutral AI Guide playbook
`mosaicstack/aiguide:playbooks/tmux-fleet.md` (commit `2a0b0b5`): a dedicated
holder owns the tmux server/socket; agent units join it and stop only their own
exact-match session.
## Layout
- `mosaic-tmux-holder.service` — user-mode holder that owns the named tmux server.
- `mosaic-agent@.service` — user-mode template for one reusable agent session.
- `test-fleet-units.sh` — validates unit syntax and required relationships.
The agent template calls:
```text
~/.config/mosaic/tools/fleet/start-agent-session.sh <agent-name>
```
which starts or reuses a tmux session on `MOSAIC_TMUX_SOCKET`.
## Local customization
Per-agent overrides live outside the package in:
```text
~/.config/mosaic/fleet/agents/<agent>.env
```
Example:
```dotenv
MOSAIC_TMUX_SOCKET=mosaic-factory
MOSAIC_AGENT_RUNTIME=claude
MOSAIC_AGENT_WORKDIR=/home/jarvis/src/mosaic-stack
# Optional escape hatch for PoC/canary agents:
# MOSAIC_AGENT_COMMAND=mosaic yolo claude
```
## Manual canary sequence
```bash
mkdir -p ~/.config/systemd/user ~/.config/mosaic/tools/fleet ~/.config/mosaic/fleet/agents
cp packages/mosaic/framework/systemd/user/mosaic-*.service ~/.config/systemd/user/
cp packages/mosaic/framework/tools/fleet/start-agent-session.sh ~/.config/mosaic/tools/fleet/
chmod +x ~/.config/mosaic/tools/fleet/start-agent-session.sh
systemctl --user daemon-reload
systemctl --user start mosaic-tmux-holder.service
systemctl --user start mosaic-agent@canary.service
tmux -L mosaic-factory ls
```
Do not use `tmux kill-server` without `-L mosaic-factory`; this pattern is meant
to avoid disturbing the user's default tmux server.

View File

@@ -0,0 +1,20 @@
[Unit]
Description=Mosaic tmux fleet agent %i
Documentation=https://git.mosaicstack.dev/mosaicstack/stack
Requires=mosaic-tmux-holder.service
After=mosaic-tmux-holder.service
PartOf=mosaic-tmux-holder.service
[Service]
Type=oneshot
RemainAfterExit=yes
Environment=MOSAIC_TMUX_SOCKET=mosaic-factory
Environment=MOSAIC_AGENT_NAME=%i
Environment=MOSAIC_AGENT_RUNTIME=pi
Environment=MOSAIC_AGENT_WORKDIR=%h
EnvironmentFile=-%h/.config/mosaic/fleet/agents/%i.env
ExecStart=/bin/bash %h/.config/mosaic/tools/fleet/start-agent-session.sh %i
ExecStop=-/bin/bash -lc 'tmux -L "${MOSAIC_TMUX_SOCKET:-mosaic-factory}" kill-session -t "=%i"'
[Install]
WantedBy=default.target

View File

@@ -0,0 +1,15 @@
[Unit]
Description=Mosaic tmux fleet holder
Documentation=https://git.mosaicstack.dev/mosaicstack/stack
After=default.target
[Service]
Type=oneshot
RemainAfterExit=yes
Environment=MOSAIC_TMUX_SOCKET=mosaic-factory
Environment=MOSAIC_TMUX_HOLDER=_holder
ExecStart=/bin/bash -lc 'tmux -L "$MOSAIC_TMUX_SOCKET" has-session -t "=${MOSAIC_TMUX_HOLDER}:0.0" 2>/dev/null || tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "$MOSAIC_TMUX_HOLDER" "while true; do sleep 3600; done"'
ExecStop=-/bin/bash -lc 'tmux -L "$MOSAIC_TMUX_SOCKET" kill-server'
[Install]
WantedBy=default.target

View File

@@ -0,0 +1,30 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR=$(cd -- "$(dirname -- "$0")" && pwd)
HOLDER="$SCRIPT_DIR/mosaic-tmux-holder.service"
AGENT="$SCRIPT_DIR/mosaic-agent@.service"
fail() {
echo "FAIL: $*" >&2
exit 1
}
[ -f "$HOLDER" ] || fail "missing mosaic-tmux-holder.service"
[ -f "$AGENT" ] || fail "missing mosaic-agent@.service"
grep -qF 'ExecStart=' "$HOLDER" || fail "holder has no ExecStart"
grep -qF 'tmux -L' "$HOLDER" || fail "holder does not use named tmux socket"
grep -qF '_holder' "$HOLDER" || fail "holder session is not explicit"
grep -qF 'Requires=mosaic-tmux-holder.service' "$AGENT" || fail "agent does not require holder"
grep -qF 'start-agent-session.sh' "$AGENT" || fail "agent unit does not call start-agent-session.sh"
grep -qF 'kill-session -t "=%i"' "$AGENT" || fail "agent stop does not exact-match its session"
if command -v systemd-analyze >/dev/null 2>&1; then
systemd-analyze verify --user "$HOLDER" "$AGENT" >/tmp/mosaic-fleet-systemd-verify.log 2>&1 || {
cat /tmp/mosaic-fleet-systemd-verify.log >&2
fail "systemd-analyze verify failed"
}
fi
echo "ok - fleet systemd unit templates"

View File

@@ -0,0 +1,30 @@
#!/usr/bin/env bash
set -euo pipefail
AGENT_NAME=${1:-${MOSAIC_AGENT_NAME:-}}
MOSAIC_TMUX_SOCKET=${MOSAIC_TMUX_SOCKET:-mosaic-factory}
MOSAIC_AGENT_RUNTIME=${MOSAIC_AGENT_RUNTIME:-pi}
MOSAIC_AGENT_WORKDIR=${MOSAIC_AGENT_WORKDIR:-$HOME}
MOSAIC_AGENT_COMMAND=${MOSAIC_AGENT_COMMAND:-}
if [ -z "$AGENT_NAME" ]; then
echo "ERROR: agent name argument or MOSAIC_AGENT_NAME is required" >&2
exit 64
fi
if ! command -v tmux >/dev/null 2>&1; then
echo "ERROR: tmux is required" >&2
exit 69
fi
if tmux -L "$MOSAIC_TMUX_SOCKET" has-session -t "=${AGENT_NAME}:0.0" 2>/dev/null; then
echo "Mosaic agent session already running: $AGENT_NAME on socket $MOSAIC_TMUX_SOCKET"
exit 0
fi
if [ -z "$MOSAIC_AGENT_COMMAND" ]; then
MOSAIC_AGENT_COMMAND="mosaic yolo $MOSAIC_AGENT_RUNTIME"
fi
mkdir -p "$MOSAIC_AGENT_WORKDIR"
exec tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "$AGENT_NAME" -c "$MOSAIC_AGENT_WORKDIR" "$MOSAIC_AGENT_COMMAND"

View File

@@ -0,0 +1,32 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR=$(cd -- "$(dirname -- "$0")" && pwd)
START="$SCRIPT_DIR/start-agent-session.sh"
SOCKET="mosaic-agent-test-$RANDOM-$$"
AGENT="agent-$RANDOM"
WORKDIR=$(mktemp -d)
trap 'tmux -L "$SOCKET" kill-server >/dev/null 2>&1 || true; rm -rf "$WORKDIR"' EXIT
fail() {
echo "FAIL: $*" >&2
exit 1
}
MOSAIC_TMUX_SOCKET="$SOCKET" \
MOSAIC_AGENT_WORKDIR="$WORKDIR" \
MOSAIC_AGENT_COMMAND='bash --noprofile --norc -i' \
"$START" "$AGENT"
tmux -L "$SOCKET" has-session -t "=$AGENT:0.0" || fail "agent session was not created"
actual_dir=$(tmux -L "$SOCKET" display-message -p -t "=$AGENT:0.0" '#{pane_current_path}')
[ "$actual_dir" = "$WORKDIR" ] || fail "agent workdir mismatch: $actual_dir"
MOSAIC_TMUX_SOCKET="$SOCKET" \
MOSAIC_AGENT_WORKDIR="$WORKDIR" \
MOSAIC_AGENT_COMMAND='bash --noprofile --norc -i' \
"$START" "$AGENT" >/tmp/mosaic-start-agent-idempotent.out
grep -qF 'already running' /tmp/mosaic-start-agent-idempotent.out || fail "duplicate start was not idempotent"
echo "ok - start-agent-session"

View File

@@ -31,9 +31,12 @@ Prepends the preamble automatically (auto-detecting your own `host:session`) and
delivers reliably to local OR remote panes.
```bash
# Local target (same host)
# Local target (same host, default tmux server)
agent-send.sh -s <dst_session> -m "message"
# Local target on a Mosaic fleet socket
agent-send.sh -L mosaic-factory -s '=coder0' -m "message"
# Remote target (over ssh)
agent-send.sh -H user@host -s <dst_session> -m "message"
@@ -42,10 +45,27 @@ agent-send.sh -H user@host -s <dst_session> -f msg.txt
echo "msg" | agent-send.sh -s <dst_session>
```
Key flags: `-s` dst session (required) · `-H` ssh target for remote · `-n` dst
Key flags: `-L` named tmux socket · `-s` dst session (required) · `-H` ssh target for remote · `-n` dst
hostname for the preamble (else auto-resolved) · `-m`/`-f`/stdin body · `-S`
override source label · `-v` verbose · `-r N` Enter-flush attempts.
For durable fleet use, prefer exact tmux targets such as `=coder0`. The helper
normalizes exact session targets to pane-qualified targets internally so pane
commands do not fall back to tmux's prefix matching behavior.
## Named socket isolation
Durable Mosaic fleets should use a dedicated tmux socket, for example:
```bash
tmux -L mosaic-factory ls
agent-send.sh -L mosaic-factory -s '=coder0' -m "status?"
send-message.sh -L mosaic-factory -t '=coder0' -m "raw pane message"
```
This keeps fleet operations away from the user's default tmux server. It is the
safe rollout path on hosts that already have manual tmux sessions.
## Why a helper exists (the submission gotcha)
Pasting into an interactive REPL via raw `tmux send-keys` is unreliable: a
@@ -67,6 +87,7 @@ message crosses the wire as base64 (`-b`) to avoid all shell-quoting hazards.
- `agent-send.sh` — inter-agent wrapper (preamble + local/remote dispatch).
- `send-message.sh` — low-level reliable single-pane submitter (`-b` base64 input).
- `test-send-message-socket.sh` — smoke test for named-socket isolation.
## Distribution

View File

@@ -23,12 +23,13 @@
# the remote host; only bash + tmux + base64 (standard).
#
# USAGE
# agent-send.sh -s <dst_session> -m "message" # local target
# agent-send.sh -H user@host -s <dst_session> -m "message" # remote target
# agent-send.sh -H user@host -n <dst_hostname> -s <sess> -f msg.txt
# echo "msg" | agent-send.sh -H user@host -s <dst_session>
# agent-send.sh [-L socket] -s <dst_session> -m "message" # local target
# agent-send.sh [-L socket] -H user@host -s <dst_session> -m "message" # remote target
# agent-send.sh [-L socket] -H user@host -n <dst_hostname> -s <sess> -f msg.txt
# echo "msg" | agent-send.sh [-L socket] -H user@host -s <dst_session>
#
# OPTIONS
# -L NAME tmux socket name passed to `tmux -L NAME` on the target host
# -s DST_SESSION target tmux session (or session:window.pane) [required]
# -H SSH_TARGET ssh target (user@host) for a remote pane; omit for local
# -n DST_HOST hostname to show in the preamble for the target.
@@ -47,12 +48,13 @@ set -uo pipefail
SELF_DIR=$(cd -- "$(dirname -- "$0")" && pwd)
SENDER="$SELF_DIR/send-message.sh"
DST_SESSION=""; SSH_TARGET=""; DST_HOST=""; MSG=""; FILE=""
DST_SESSION=""; SSH_TARGET=""; DST_HOST=""; MSG=""; FILE=""; SOCKET_NAME=""
SRC_LABEL=""; RETRIES=2; VERBOSE=0
usage() { sed -n '2,44p' "$0"; exit "${1:-3}"; }
while getopts "s:H:n:m:f:S:r:vh" o; do
while getopts "L:s:H:n:m:f:S:r:vh" o; do
case "$o" in
L) SOCKET_NAME=$OPTARG ;;
s) DST_SESSION=$OPTARG ;; H) SSH_TARGET=$OPTARG ;; n) DST_HOST=$OPTARG ;;
m) MSG=$OPTARG ;; f) FILE=$OPTARG ;; S) SRC_LABEL=$OPTARG ;;
r) RETRIES=$OPTARG ;; v) VERBOSE=1 ;; h) usage 0 ;; *) usage 3 ;;
@@ -70,8 +72,12 @@ fi
# Source label: this agent's host:session (auto-detected, overridable).
if [ -z "$SRC_LABEL" ]; then
tmux_cmd=(tmux)
if [ -n "$SOCKET_NAME" ]; then
tmux_cmd+=(-L "$SOCKET_NAME")
fi
src_host=$(hostname -s 2>/dev/null || echo "?")
src_sess=$(tmux display-message -p '#S' 2>/dev/null || echo "?")
src_sess=$("${tmux_cmd[@]}" display-message -p '#S' 2>/dev/null || echo "?")
SRC_LABEL="${src_host}:${src_sess}"
fi
@@ -89,12 +95,16 @@ FULL="${PREAMBLE} ${MSG}"
B64=$(printf '%s' "$FULL" | base64 -w0)
vflag=""; [ "$VERBOSE" = 1 ] && vflag="-v"
socket_args=()
if [ -n "$SOCKET_NAME" ]; then
socket_args=(-L "$SOCKET_NAME")
fi
if [ -z "$SSH_TARGET" ]; then
# Local pane: call the canonical sender directly.
exec "$SENDER" -t "$DST_SESSION" -b "$B64" -r "$RETRIES" $vflag
exec "$SENDER" "${socket_args[@]}" -t "$DST_SESSION" -b "$B64" -r "$RETRIES" $vflag
else
# Remote pane: ship the sender over ssh and run it local to the target.
ssh -o ConnectTimeout=10 "$SSH_TARGET" \
"bash -s -- -t '$DST_SESSION' -b '$B64' -r '$RETRIES' $vflag" < "$SENDER"
"bash -s -- ${socket_args[*]@Q} -t '$DST_SESSION' -b '$B64' -r '$RETRIES' $vflag" < "$SENDER"
fi

View File

@@ -13,12 +13,13 @@
# no-op in Claude Code, so the double-Enter is safe.
#
# USAGE
# send-message.sh -t <target> -m "message"
# send-message.sh -t <target> -f <file>
# echo "message" | send-message.sh -t <target>
# ssh host bash -s -- -t <target> -b "$(base64 -w0 <<<msg)" < send-message.sh
# send-message.sh [-L socket_name] -t <target> -m "message"
# send-message.sh [-L socket_name] -t <target> -f <file>
# echo "message" | send-message.sh [-L socket_name] -t <target>
# ssh host bash -s -- -L socket -t <target> -b "$(base64 -w0 <<<msg)" < send-message.sh
#
# OPTIONS
# -L NAME tmux socket name passed to `tmux -L NAME` (optional)
# -t TARGET tmux target: session, or session:window.pane [required]
# -m MESSAGE message text (single- or multi-line)
# -f FILE read message from FILE instead of -m
@@ -34,11 +35,12 @@
# 3 usage error
set -uo pipefail
TARGET=""; MSG=""; FILE=""; B64=""; RETRIES=2; VERBOSE=0
SOCKET_NAME=""; TARGET=""; MSG=""; FILE=""; B64=""; RETRIES=2; VERBOSE=0
usage() { sed -n '2,34p' "$0"; exit "${1:-3}"; }
while getopts "t:m:f:b:r:vh" o; do
while getopts "L:t:m:f:b:r:vh" o; do
case "$o" in
L) SOCKET_NAME=$OPTARG ;;
t) TARGET=$OPTARG ;; m) MSG=$OPTARG ;; f) FILE=$OPTARG ;; b) B64=$OPTARG ;;
r) RETRIES=$OPTARG ;; v) VERBOSE=1 ;; h) usage 0 ;; *) usage 3 ;;
esac
@@ -51,8 +53,21 @@ elif [ -z "$MSG" ] && [ ! -t 0 ]; then MSG=$(cat)
fi
[ -n "$MSG" ] || { echo "ERROR: empty message (use -m, -f, or stdin)" >&2; exit 3; }
tmux_cmd=(tmux)
if [ -n "$SOCKET_NAME" ]; then
tmux_cmd+=(-L "$SOCKET_NAME")
fi
# tmux accepts `=session` for some commands, but pane-level commands such as
# capture-pane require a pane-qualified target. Keep exact-session addressing
# convenient while avoiding accidental prefix matches.
EFFECTIVE_TARGET=$TARGET
if [[ "$TARGET" == =* && "$TARGET" != *:* ]]; then
EFFECTIVE_TARGET="${TARGET}:0.0"
fi
# Target must resolve to a live pane.
if ! tmux list-panes -t "$TARGET" >/dev/null 2>&1; then
if ! "${tmux_cmd[@]}" list-panes -t "$EFFECTIVE_TARGET" >/dev/null 2>&1; then
echo "ERROR: tmux target not found: $TARGET" >&2; exit 1
fi
@@ -62,18 +77,18 @@ snippet=$(printf '%s' "$MSG" | tr '\n' ' ' | tr -s ' ' | sed 's/[^[:print:]]//g'
# 1) Paste the body as a bracketed paste so multi-line content does not submit
# line-by-line. load-buffer/paste-buffer is far safer than `send-keys -l`.
printf '%s' "$MSG" | tmux load-buffer -b __mosaic_send -
printf '%s' "$MSG" | "${tmux_cmd[@]}" load-buffer -b __mosaic_send -
# -p = bracketed paste when the client supports it; fall back if not.
tmux paste-buffer -d -p -b __mosaic_send -t "$TARGET" 2>/dev/null \
|| tmux paste-buffer -d -b __mosaic_send -t "$TARGET"
"${tmux_cmd[@]}" paste-buffer -d -p -b __mosaic_send -t "$EFFECTIVE_TARGET" 2>/dev/null \
|| "${tmux_cmd[@]}" paste-buffer -d -b __mosaic_send -t "$EFFECTIVE_TARGET"
sleep 0.5
# 2) Submit, then verify; flush with another Enter if it is still a draft.
status="sent"
for attempt in $(seq 1 $((RETRIES + 1))); do
tmux send-keys -t "$TARGET" Enter
"${tmux_cmd[@]}" send-keys -t "$EFFECTIVE_TARGET" Enter
sleep 1.2
pane=$(tmux capture-pane -t "$TARGET" -p 2>/dev/null)
pane=$("${tmux_cmd[@]}" capture-pane -t "$EFFECTIVE_TARGET" -p 2>/dev/null)
if printf '%s' "$pane" | grep -qF "$QUEUED_RE"; then
status="queued"; break

View File

@@ -0,0 +1,50 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR=$(cd -- "$(dirname -- "$0")" && pwd)
SEND_MESSAGE="$SCRIPT_DIR/send-message.sh"
AGENT_SEND="$SCRIPT_DIR/agent-send.sh"
SOCKET="mosaic-test-$RANDOM-$$"
TARGET="target-$RANDOM"
DEFAULT_TARGET="default-target-$RANDOM"
TMPDIR=$(mktemp -d)
trap 'tmux -L "$SOCKET" kill-server >/dev/null 2>&1 || true; tmux kill-session -t "$DEFAULT_TARGET" >/dev/null 2>&1 || true; rm -rf "$TMPDIR"' EXIT
fail() {
echo "FAIL: $*" >&2
exit 1
}
require_tmux() {
command -v tmux >/dev/null 2>&1 || fail "tmux is required"
}
capture_named() {
tmux -L "$SOCKET" capture-pane -t "=$TARGET:0.0" -p
}
capture_default() {
tmux capture-pane -t "=$DEFAULT_TARGET:0.0" -p
}
require_tmux
tmux -L "$SOCKET" new-session -d -s "$TARGET" -c "$TMPDIR" 'bash --noprofile --norc -i'
tmux new-session -d -s "$DEFAULT_TARGET" -c "$TMPDIR" 'bash --noprofile --norc -i'
"$SEND_MESSAGE" -L "$SOCKET" -t "=$TARGET" -m "named socket hello" >/tmp/send-message-named.out
sleep 0.2
capture_named | grep -qF "named socket hello" || fail "send-message.sh did not deliver to named socket"
if capture_default | grep -qF "named socket hello"; then
fail "send-message.sh leaked named-socket message to default tmux server"
fi
"$AGENT_SEND" -L "$SOCKET" -S "tester:source" -s "=$TARGET" -m "agent socket hello" >/tmp/agent-send-named.out
sleep 0.2
capture_named | grep -qF "[tester:source ->" || fail "agent-send.sh did not include preamble"
capture_named | grep -qF "agent socket hello" || fail "agent-send.sh did not deliver to named socket"
if capture_default | grep -qF "agent socket hello"; then
fail "agent-send.sh leaked named-socket message to default tmux server"
fi
echo "ok - named tmux socket send tools"

View File

@@ -6,6 +6,8 @@
**Architecture:** Mosaic should ship generic tmux fleet primitives in the framework, then layer local rosters through configuration. The holder service owns the tmux socket; each agent service joins the holder-owned server and runs `mosaic yolo <runtime>`. The orchestrator addresses agents through `mosaic agent ...` abstractions so tmux can later be replaced by Matrix-backed agent comms without changing mission flow.
**Reference:** AI Guide `playbooks/tmux-fleet.md` at commit `2a0b0b5` documents the organization-neutral holder-service pattern, exact-match `=<name>` stop targets, and coupled-server cutover/verification sequence. The Stack implementation should treat that as the lifecycle model and keep concrete Mosaic unit/tooling details here.
**Tech Stack:** Bash, tmux, user systemd units, Mosaic CLI/framework installer, JSON/YAML roster config, existing `packages/mosaic/framework/tools/tmux/{agent-send.sh,send-message.sh}`.
---
@@ -715,7 +717,7 @@ The implementation is complete when:
## Risks and mitigations
| Risk | Mitigation |
|---|---|
| --------------------------------------------------- | --------------------------------------------------------------------------------- |
| Killing existing tmux sessions | Use named `mosaic-factory` socket; no default `tmux kill-server`. |
| systemd unit quoting/env expansion bugs | Move logic into shell wrappers; verify with `systemd-analyze --user verify`. |
| Runtime reset command mismatch | Make reset command runtime-configurable; fallback to service restart + kickstart. |