Compare commits
2 Commits
feb0d8a58b
...
plan/tmux-
| Author | SHA1 | Date | |
|---|---|---|---|
| | 757f5e6998 | | |
| | 250d3da12d | | |
3
.gitignore
vendored
@@ -12,6 +12,3 @@ docs/reports/

# Step-CA dev password — real file is gitignored; commit only the .example
infra/step-ca/dev-password

# Scratch dirs created by the framework git-wrapper shell test harnesses
.mosaic-test-work/
@@ -1,87 +0,0 @@
# Wrapper hardening fold-in: #559 (eval removal) + #560 (host-derived login)

**Branch:** `fix/wrapper-hardening-tls-credpath-cicwait` (PR #551)
**Worker:** coderlite0 (Sonnet lane) · coordinated by mos-claude
**Date:** 2026-06-20
**Scope:** `packages/mosaic/framework/tools/git/*.sh` only

## What the issues asked for vs. what was already landed

Both issues were largely satisfied by prior merged work; this fold-in closes the
remaining gaps (regression tests + a loud diagnostic + one residual word-split site)
rather than re-implementing finished functionality.

### #559 — remove `eval` from issue-create.sh (and siblings)

- `eval`-based command construction was already removed across the wrapper surface
  (landed in #549). A full scan of `tools/git/*.sh` finds **zero** `eval` usages.
- `issue-create.sh`, `pr-create.sh`, `issue-edit.sh`, `issue-assign.sh` already build
  their `tea`/`gh` invocations as argv arrays (`CMD=(...)`, `"${CMD[@]}"`), so Markdown
  bodies pass through verbatim.
- **Residual found & fixed:** `issue-comment.sh` still used unquoted
  `$(get_gitea_repo_args)` word-splitting (the comment body itself was already safely
  quoted, so no injection bug — but it was the inconsistent, fragile pattern #559 targets,
  and it failed silently when no login resolved). Converted to an argv array with an
  explicit, loud login-resolution error.
- **Added regression test:** `test-issue-create-body-safety.sh` — feeds a hostile
  Markdown body (`$(touch SENTINEL)`, backticks, single/double quotes, `$HOME`/`${PATH}`,
  pipes/`&&`/`;`) through `issue-create.sh` and asserts (1) no command substitution
  executes (sentinel file never created) and (2) the `--description` `tea` receives is
  byte-for-byte the original body.
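
The argv-array pattern described in the bullets above can be sketched roughly as follows. This is an illustration only, not the wrapper code: `show_args` is a hypothetical stand-in for `tea`/`gh` that prints each argument it receives, and the body text is made up for the demo.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the real CLI: prints every argument on its own line.
show_args() {
  printf '<%s>\n' "$@"
}

# A body full of shell metacharacters. Single quotes keep it literal here.
body='Run `ls` then $(touch SENTINEL); cost is $5 && "done"'

# Build the invocation as an array: each element stays exactly one argument,
# and nothing in the body is re-split or evaluated when the array is expanded.
cmd=(show_args issue create --title "Body safety" --description "$body")
"${cmd[@]}"
```

Because the array is expanded with `"${cmd[@]}"`, the body arrives as a single argument and the embedded `$(touch SENTINEL)` never runs. Building the same command as a string and passing it to `eval` would evaluate it.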

### #560 — auto-detect Gitea `--login` from repo origin host

- Centralized host→login resolution already exists in `detect-platform.sh`
  (`get_gitea_login_for_host` → `find_tea_login_for_host`, matching `urlparse(url).hostname`).
  Every wrapper routes through it (or `get_gitea_login` / `get_gitea_login_for_repo_override`);
  **no wrapper hardcodes `${GITEA_LOGIN:-mosaicstack}`**. Explicit `GITEA_LOGIN` wins only
  when it matches the host (`tea_login_matches_host`), so stale overrides are rejected.
- **Gap fixed — silent failure → loud diagnostic:** the failure path of
  `get_gitea_login_for_host` returned non-zero with no message. Added
  `print_gitea_login_diagnostic`, emitted to **stderr** on resolution failure: names the
  unresolved host, lists available tea logins (name + host), and gives the `GITEA_LOGIN`
  override + `tea login add` fix. Stderr-only, so it never contaminates stdout (the
  resolved login name) or the log-grep assertions in the existing harnesses. Callers with
  an API fallback (pr-merge, issue-close, pr-create, issue-create) still follow with their
  own "using API fallback" line, giving a clear "no login → fallback" trail.
- **Extended test:** `test-gitea-login-resolution.sh` now also asserts (a) the loud
  diagnostic fires and lists available logins for an unresolved host, (b) login is derived
  from origin host for **both** instances (mosaicstack + usc) via a scoped second `tea`
  mock, and (c) a valid `GITEA_LOGIN` override is honored. The scoped mock keeps the
  existing API-fallback assertions (which require mosaicstack to have _no_ tea login) valid.
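
As a rough illustration of the host→login idea above, here is a minimal pure-bash sketch. It is not the code in `detect-platform.sh` (which, per the notes above, matches on `urlparse(url).hostname`). The `LOGINS` table and the `url_host` and `resolve_login` helpers are hypothetical names introduced for this example, and the login data mirrors the two instances mentioned in the test notes.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical login table ("name url"); the real wrappers query `tea login list`.
LOGINS=(
  "mosaicstack https://git.mosaicstack.dev"
  "usc https://git.uscllc.com"
)

# Extract the host from an https:// or scp-style git URL (illustrative only).
url_host() {
  local url="$1"
  url="${url#*://}"   # drop the scheme, if any
  url="${url%%/*}"    # drop the path
  url="${url#*@}"     # drop userinfo, if any
  url="${url%%:*}"    # drop a port or scp-style path separator
  printf '%s\n' "$url"
}

# Print the login whose URL host equals the wanted host.
# On failure: nothing on stdout, a diagnostic on stderr, non-zero return.
resolve_login() {
  local want="$1" entry name url
  for entry in "${LOGINS[@]}"; do
    name="${entry%% *}"
    url="${entry#* }"
    if [[ "$(url_host "$url")" == "$want" ]]; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  echo "Error: no login matches host '$want'." >&2
  return 1
}

resolve_login "$(url_host "https://git.uscllc.com/USC/uconnect.git")"
```

Keeping the diagnostic on stderr means a caller can capture the login name with `$(...)` and still show the error to the operator, which is the property the notes above rely on.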

## Files changed (wrapper surface only)

- `detect-platform.sh` — add `print_gitea_login_diagnostic`; call it on the
  `get_gitea_login_for_host` failure path.
- `issue-comment.sh` — argv array + loud login-resolution error (was unquoted
  `$(get_gitea_repo_args)`).
- `test-issue-create-body-safety.sh` — **new** (#559 regression).
- `test-gitea-login-resolution.sh` — extended (#560 diagnostic + both-host + override).

## Verification

All wrapper harnesses pass locally:

- `test-issue-create-body-safety.sh` — PASS
- `test-gitea-login-resolution.sh` — PASS
- `test-pr-merge-gitea-empty-uid.sh` — PASS
- `test-pr-metadata-gitea.sh` — PASS
- `test-lane-brief-pr-linkage.sh` — PASS

## Open items flagged to mos-claude (orchestrator decisions)

1. **CHANGELOG absent.** The task said "update CHANGELOG (append-only), keep the existing
   #550/#551 entry." No CHANGELOG file exists anywhere in the repo, and #550/#551 are not
   recorded in one. **ASSUMPTION:** documenting #559/#560 in this scratchpad + the PR
   description (`Closes #559 Closes #560`) follows the repo's actual convention
   (`docs/scratchpads/`). Did not invent a new CHANGELOG structure.
2. **`docs/TASKS.md` is orchestrator single-writer.** It carries a "Workers read but never
   modify" banner. As a worker I did **not** edit it; task tracking is via the linked Gitea
   issues #559/#560 + this scratchpad. Orchestrator may add a rollup row if desired.
3. **Wrapper `test-*.sh` are not CI-wired.** `.woodpecker/ci.yml` runs `pnpm
   typecheck/lint/format:check/test` (`turbo run test`); the framework dir has no
   `package.json`, so these shell harnesses run **locally/manually only** — they do not gate
   the PR in Woodpecker. **ASSUMPTION:** out of scope to wire a shell-test step into CI in
   this PR (would broaden the diff beyond the wrapper surface). Flagging for a follow-up if
   the fleet wants these gated.
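
Until such a step exists, the manual run mentioned in item 3 could look something like the sketch below. The `run_harnesses` function is a hypothetical helper, not something in the repo, and the directory passed to it is whatever holds the `test-*.sh` files.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Run every test-*.sh harness in a directory and report PASS/FAIL per file.
# Returns non-zero if any harness fails. Name and interface are illustrative.
run_harnesses() {
  local dir="$1" failed=0 t
  for t in "$dir"/test-*.sh; do
    [ -e "$t" ] || continue   # the glob matched nothing
    if bash "$t" >/dev/null 2>&1; then
      echo "PASS $(basename "$t")"
    else
      echo "FAIL $(basename "$t")"
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as `run_harnesses packages/mosaic/framework/tools/git` would then give one line per harness and a single exit status that a CI step could gate on later.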
57
packages/mosaic/framework/systemd/user/README.md
Normal file
@@ -0,0 +1,57 @@
# Mosaic tmux Fleet PoC

This directory contains the first durable tmux-backed fleet primitives for the
Mosaic software-factory model.

The lifecycle model follows the organization-neutral AI Guide playbook
`mosaicstack/aiguide:playbooks/tmux-fleet.md` (commit `2a0b0b5`): a dedicated
holder owns the tmux server/socket; agent units join it and stop only their own
exact-match session.

## Layout

- `mosaic-tmux-holder.service` — user-mode holder that owns the named tmux server.
- `mosaic-agent@.service` — user-mode template for one reusable agent session.
- `test-fleet-units.sh` — validates unit syntax and required relationships.

The agent template calls:

```text
~/.config/mosaic/tools/fleet/start-agent-session.sh <agent-name>
```

which starts or reuses a tmux session on `MOSAIC_TMUX_SOCKET`.

## Local customization

Per-agent overrides live outside the package in:

```text
~/.config/mosaic/fleet/agents/<agent>.env
```

Example:

```dotenv
MOSAIC_TMUX_SOCKET=mosaic-factory
MOSAIC_AGENT_RUNTIME=claude
MOSAIC_AGENT_WORKDIR=/home/jarvis/src/mosaic-stack
# Optional escape hatch for PoC/canary agents:
# MOSAIC_AGENT_COMMAND=mosaic yolo claude
```

## Manual canary sequence

```bash
mkdir -p ~/.config/systemd/user ~/.config/mosaic/tools/fleet ~/.config/mosaic/fleet/agents
cp packages/mosaic/framework/systemd/user/mosaic-*.service ~/.config/systemd/user/
cp packages/mosaic/framework/tools/fleet/start-agent-session.sh ~/.config/mosaic/tools/fleet/
chmod +x ~/.config/mosaic/tools/fleet/start-agent-session.sh
systemctl --user daemon-reload
systemctl --user start mosaic-tmux-holder.service
systemctl --user start mosaic-agent@canary.service
tmux -L mosaic-factory ls
```

Do not use `tmux kill-server` without `-L mosaic-factory`; this pattern is meant
to avoid disturbing the user's default tmux server.
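
One way to make the socket rule hard to forget is to route every manual tmux call through a small wrapper. The sketch below is an assumption-laden illustration rather than part of the package: `fleet_tmux` and `stop_agent` are hypothetical helper names, and the socket default simply mirrors the unit files.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Socket name, defaulting to the value the units in this directory use.
SOCKET="${MOSAIC_TMUX_SOCKET:-mosaic-factory}"

# Hypothetical helper: every tmux call is scoped with -L "$SOCKET",
# so the user's default tmux server is never addressed.
fleet_tmux() {
  tmux -L "$SOCKET" "$@"
}

# Stop one agent by exact session name. The "=" prefix asks tmux for an
# exact match, so "canary" does not also match "canary-2".
stop_agent() {
  fleet_tmux kill-session -t "=$1"
}
```

With these defined in a shell, `fleet_tmux ls` lists only fleet sessions and `stop_agent canary` removes a single agent session while leaving the holder and the other agents running.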
20
packages/mosaic/framework/systemd/user/mosaic-agent@.service
Normal file
@@ -0,0 +1,20 @@
[Unit]
Description=Mosaic tmux fleet agent %i
Documentation=https://git.mosaicstack.dev/mosaicstack/stack
Requires=mosaic-tmux-holder.service
After=mosaic-tmux-holder.service
PartOf=mosaic-tmux-holder.service

[Service]
Type=oneshot
RemainAfterExit=yes
Environment=MOSAIC_TMUX_SOCKET=mosaic-factory
Environment=MOSAIC_AGENT_NAME=%i
Environment=MOSAIC_AGENT_RUNTIME=pi
Environment=MOSAIC_AGENT_WORKDIR=%h
EnvironmentFile=-%h/.config/mosaic/fleet/agents/%i.env
ExecStart=/bin/bash %h/.config/mosaic/tools/fleet/start-agent-session.sh %i
ExecStop=-/bin/bash -lc 'tmux -L "${MOSAIC_TMUX_SOCKET:-mosaic-factory}" kill-session -t "=%i"'

[Install]
WantedBy=default.target
@@ -0,0 +1,15 @@
[Unit]
Description=Mosaic tmux fleet holder
Documentation=https://git.mosaicstack.dev/mosaicstack/stack
After=default.target

[Service]
Type=oneshot
RemainAfterExit=yes
Environment=MOSAIC_TMUX_SOCKET=mosaic-factory
Environment=MOSAIC_TMUX_HOLDER=_holder
ExecStart=/bin/bash -lc 'tmux -L "$MOSAIC_TMUX_SOCKET" has-session -t "=${MOSAIC_TMUX_HOLDER}:0.0" 2>/dev/null || tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "$MOSAIC_TMUX_HOLDER" "while true; do sleep 3600; done"'
ExecStop=-/bin/bash -lc 'tmux -L "$MOSAIC_TMUX_SOCKET" kill-server'

[Install]
WantedBy=default.target
30
packages/mosaic/framework/systemd/user/test-fleet-units.sh
Executable file
@@ -0,0 +1,30 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR=$(cd -- "$(dirname -- "$0")" && pwd)
HOLDER="$SCRIPT_DIR/mosaic-tmux-holder.service"
AGENT="$SCRIPT_DIR/mosaic-agent@.service"

fail() {
  echo "FAIL: $*" >&2
  exit 1
}

[ -f "$HOLDER" ] || fail "missing mosaic-tmux-holder.service"
[ -f "$AGENT" ] || fail "missing mosaic-agent@.service"

grep -qF 'ExecStart=' "$HOLDER" || fail "holder has no ExecStart"
grep -qF 'tmux -L' "$HOLDER" || fail "holder does not use named tmux socket"
grep -qF '_holder' "$HOLDER" || fail "holder session is not explicit"
grep -qF 'Requires=mosaic-tmux-holder.service' "$AGENT" || fail "agent does not require holder"
grep -qF 'start-agent-session.sh' "$AGENT" || fail "agent unit does not call start-agent-session.sh"
grep -qF 'kill-session -t "=%i"' "$AGENT" || fail "agent stop does not exact-match its session"

if command -v systemd-analyze >/dev/null 2>&1; then
  systemd-analyze verify --user "$HOLDER" "$AGENT" >/tmp/mosaic-fleet-systemd-verify.log 2>&1 || {
    cat /tmp/mosaic-fleet-systemd-verify.log >&2
    fail "systemd-analyze verify failed"
  }
fi

echo "ok - fleet systemd unit templates"
@@ -16,12 +16,7 @@
# After loading, service-specific env vars are exported.
# Run `load_credentials --help` for details.

if [[ -z "${MOSAIC_CREDENTIALS_FILE:-}" ]]; then
  for _cand in "$HOME/.config/mosaic/credentials.json" "$HOME/src/jarvis-brain/credentials.json"; do
    if [[ -f "$_cand" ]]; then MOSAIC_CREDENTIALS_FILE="$_cand"; break; fi
  done
  : "${MOSAIC_CREDENTIALS_FILE:=$HOME/src/jarvis-brain/credentials.json}"
fi
MOSAIC_CREDENTIALS_FILE="${MOSAIC_CREDENTIALS_FILE:-$HOME/src/jarvis-brain/credentials.json}"

_mosaic_require_jq() {
  if ! command -v jq &>/dev/null; then
@@ -39,19 +34,6 @@ _mosaic_read_cred() {
  jq -r "$jq_path // empty" "$MOSAIC_CREDENTIALS_FILE"
}

# Decide curl TLS flag for a target URL: validate public hosts (MITM matters on
# WAN); allow self-signed only for private-network IP literals (trusted LAN) or an
# explicit $MOSAIC_INSECURE_TLS opt-in. Echoes "-k" or "" (empty).
_mosaic_tls_opt() {
  local url="$1" host
  [[ -n "${MOSAIC_INSECURE_TLS:-}" ]] && { echo "-k"; return; }
  host=$(printf '%s' "$url" | sed -E 's#^[a-zA-Z]+://([^/:]+).*#\1#')
  if [[ "$host" =~ ^(10\.|127\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.) ]]; then
    echo "-k"; return
  fi
  echo ""
}

# Sync Woodpecker credentials to ~/.woodpecker/<instance>.env
# Only writes when values differ to avoid unnecessary disk writes.
_mosaic_sync_woodpecker_env() {
@@ -279,8 +261,7 @@ mosaic_http() {
  local base_url="${4:-}"

  local response
  local _tls; _tls=$(_mosaic_tls_opt "${base_url}${endpoint}")
  response=$(curl -sS $_tls -w "\n%{http_code}" -X "$method" \
  response=$(curl -sk -w "\n%{http_code}" -X "$method" \
    -H "$auth_header" \
    -H "Content-Type: application/json" \
    "${base_url}${endpoint}")
@@ -298,8 +279,7 @@ mosaic_http_post() {
  local base_url="${4:-}"

  local response
  local _tls; _tls=$(_mosaic_tls_opt "${base_url}${endpoint}")
  response=$(curl -sS $_tls -w "\n%{http_code}" -X POST \
  response=$(curl -sk -w "\n%{http_code}" -X POST \
    -H "$auth_header" \
    -H "Content-Type: application/json" \
    -d "$data" \
@@ -317,8 +297,7 @@ mosaic_http_patch() {
  local base_url="${4:-}"

  local response
  local _tls; _tls=$(_mosaic_tls_opt "${base_url}${endpoint}")
  response=$(curl -sS $_tls -w "\n%{http_code}" -X PATCH \
  response=$(curl -sk -w "\n%{http_code}" -X PATCH \
    -H "$auth_header" \
    -H "Content-Type: application/json" \
    -d "$data" \
30
packages/mosaic/framework/tools/fleet/start-agent-session.sh
Executable file
@@ -0,0 +1,30 @@
#!/usr/bin/env bash
set -euo pipefail

AGENT_NAME=${1:-${MOSAIC_AGENT_NAME:-}}
MOSAIC_TMUX_SOCKET=${MOSAIC_TMUX_SOCKET:-mosaic-factory}
MOSAIC_AGENT_RUNTIME=${MOSAIC_AGENT_RUNTIME:-pi}
MOSAIC_AGENT_WORKDIR=${MOSAIC_AGENT_WORKDIR:-$HOME}
MOSAIC_AGENT_COMMAND=${MOSAIC_AGENT_COMMAND:-}

if [ -z "$AGENT_NAME" ]; then
  echo "ERROR: agent name argument or MOSAIC_AGENT_NAME is required" >&2
  exit 64
fi

if ! command -v tmux >/dev/null 2>&1; then
  echo "ERROR: tmux is required" >&2
  exit 69
fi

if tmux -L "$MOSAIC_TMUX_SOCKET" has-session -t "=${AGENT_NAME}:0.0" 2>/dev/null; then
  echo "Mosaic agent session already running: $AGENT_NAME on socket $MOSAIC_TMUX_SOCKET"
  exit 0
fi

if [ -z "$MOSAIC_AGENT_COMMAND" ]; then
  MOSAIC_AGENT_COMMAND="mosaic yolo $MOSAIC_AGENT_RUNTIME"
fi

mkdir -p "$MOSAIC_AGENT_WORKDIR"
exec tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "$AGENT_NAME" -c "$MOSAIC_AGENT_WORKDIR" "$MOSAIC_AGENT_COMMAND"
32
packages/mosaic/framework/tools/fleet/test-start-agent-session.sh
Executable file
@@ -0,0 +1,32 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR=$(cd -- "$(dirname -- "$0")" && pwd)
START="$SCRIPT_DIR/start-agent-session.sh"
SOCKET="mosaic-agent-test-$RANDOM-$$"
AGENT="agent-$RANDOM"
WORKDIR=$(mktemp -d)
trap 'tmux -L "$SOCKET" kill-server >/dev/null 2>&1 || true; rm -rf "$WORKDIR"' EXIT

fail() {
  echo "FAIL: $*" >&2
  exit 1
}

MOSAIC_TMUX_SOCKET="$SOCKET" \
  MOSAIC_AGENT_WORKDIR="$WORKDIR" \
  MOSAIC_AGENT_COMMAND='bash --noprofile --norc -i' \
  "$START" "$AGENT"

tmux -L "$SOCKET" has-session -t "=$AGENT:0.0" || fail "agent session was not created"
actual_dir=$(tmux -L "$SOCKET" display-message -p -t "=$AGENT:0.0" '#{pane_current_path}')
[ "$actual_dir" = "$WORKDIR" ] || fail "agent workdir mismatch: $actual_dir"

MOSAIC_TMUX_SOCKET="$SOCKET" \
  MOSAIC_AGENT_WORKDIR="$WORKDIR" \
  MOSAIC_AGENT_COMMAND='bash --noprofile --norc -i' \
  "$START" "$AGENT" >/tmp/mosaic-start-agent-idempotent.out

grep -qF 'already running' /tmp/mosaic-start-agent-idempotent.out || fail "duplicate start was not idempotent"

echo "ok - start-agent-session"
@@ -169,43 +169,6 @@ raise SystemExit(1)
PY
}

# Emit an actionable diagnostic to stderr when no tea login resolves for a host.
# Callers that have a working API fallback may ignore the non-zero return of
# get_gitea_login_for_host; this turns the previously SILENT failure into a loud,
# greppable hint (available logins + override + add-login instructions). Printed to
# stderr only, so it never contaminates stdout (the resolved login name) or log
# assertions that capture tea/curl invocations.
print_gitea_login_diagnostic() {
  local host="${1:-<unknown>}"
  local available
  available=$(
    command -v tea >/dev/null 2>&1 || { echo "(tea CLI not installed)"; exit 0; }
    logins_json=$(tea login list --output json 2>/dev/null) || { echo "(could not query tea login list)"; exit 0; }
    TEA_LOGINS_JSON="$logins_json" python3 - <<'PY'
import json, os
from urllib.parse import urlparse
try:
    logins = json.loads(os.environ.get("TEA_LOGINS_JSON", "[]"))
except Exception:
    logins = []
rows = []
for login in logins if isinstance(logins, list) else []:
    name = str(login.get("name") or login.get("Name") or "")
    url = str(login.get("url") or login.get("URL") or "")
    host = urlparse(url).hostname or "?"
    if name:
        rows.append(f"{name} (host: {host})")
print("; ".join(rows) if rows else "(none configured)")
PY
  )
  {
    echo "Error: no Gitea tea login matches host '$host'."
    echo "  Available tea logins: ${available}"
    echo "  Fix: set GITEA_LOGIN to a login whose URL host is '$host',"
    echo "  or add one: tea login add --name <name> --url https://$host --token <token>"
  } >&2
}

get_gitea_login_for_host() {
  local host="${1:-}"
  local login
@@ -227,7 +190,6 @@ get_gitea_login_for_host() {
    return 0
  fi

  print_gitea_login_diagnostic "$host"
  return 1
}

@@ -53,15 +53,7 @@ if [[ "$PLATFORM" == "github" ]]; then
  gh issue comment "$ISSUE_NUMBER" --body "$COMMENT"
  echo "Added comment to GitHub issue #$ISSUE_NUMBER"
elif [[ "$PLATFORM" == "gitea" ]]; then
  # Build the invocation as an argv array (not unquoted $(get_gitea_repo_args)
  # word-splitting) so the comment body — including Markdown backticks, $(...),
  # and quotes — is passed verbatim and never re-split or shell-evaluated.
  REPO_SLUG=$(get_repo_slug)
  GITEA_LOGIN_NAME=$(get_gitea_login) || {
    echo "Error: could not resolve a Gitea login for this repo; cannot comment on issue #$ISSUE_NUMBER." >&2
    exit 1
  }
  tea issue comment "$ISSUE_NUMBER" "$COMMENT" --repo "$REPO_SLUG" --login "$GITEA_LOGIN_NAME"
  tea issue comment "$ISSUE_NUMBER" "$COMMENT" $(get_gitea_repo_args)
  echo "Added comment to Gitea issue #$ISSUE_NUMBER"
else
  echo "Error: Unknown platform"
@@ -72,11 +72,6 @@ elif values and all(v == "success" for v in values):
    print("success")
elif any(v in {"pending", "running", "queued", "waiting"} for v in values):
    print("pending")
elif not values and not state:
    # No pipeline/status of any kind reported for this commit. Distinct from
    # "unknown" (an ambiguous/unrecognized status that should keep polling):
    # this signals a repo/commit that simply has no CI configured.
    print("no-status")
else:
    print("unknown")
PY
@@ -147,21 +142,6 @@ gitea_get_commit_status_json() {
  curl -fsSL -H "User-Agent: curl/8" -H "Authorization: token ${token}" "$url"
}

gitea_get_default_branch() {
  local host="$1"
  local repo="$2"
  local token="$3"
  local url="https://${host}/api/v1/repos/${repo}"
  curl -fsSL -H "User-Agent: curl/8" -H "Authorization: token ${token}" "$url" | python3 -c '
import json, sys
print((json.load(sys.stdin) or {}).get("default_branch", ""))
'
}

github_get_default_branch() {
  gh api "repos/${OWNER}/${REPO}" --jq '.default_branch'
}

while [[ $# -gt 0 ]]; do
  case "$1" in
    -n|--number)
@@ -265,51 +245,6 @@ else
  exit 1
fi

# No-CI determination is TWO-TIER (primary: CI history; secondary: empty-poll streak).
#
# PRIMARY — "does this repo run CI at all?" Probed once, up front, from the DEFAULT
# BRANCH's commit status. A repo whose default branch carries CI statuses
# demonstrably runs CI, so an EMPTY status on the PR head means the pipeline simply
# has not registered YET (webhook/queue lag) — NOT that the repo is CI-less. In that
# case we must NEVER fast-green; we keep polling until the pipeline registers or the
# timeout fires (both safe). This closes the webhook-lag false-green: a slow-to-
# register pipeline feeding a merge gate can no longer be mistaken for "no CI".
#
# SECONDARY — the empty-poll streak below applies ONLY to genuinely CI-less repos
# (default branch also has no CI history, e.g. device-imaging class), where burning
# the full timeout would be pure waste. There, NO_CI_MAX empty polls => fast-exit 0.
#
# Probe failure is treated conservatively as REPO_HAS_CI=1 (assume CI present): we
# would rather wait-then-timeout than risk a false-green, per the merge-gate priority.
REPO_HAS_CI=1
detect_repo_ci() {
  local def_branch def_status
  # Every early exit returns 0: a probe miss must leave the conservative
  # REPO_HAS_CI=1 default in place, never abort the caller under `set -e`.
  if [[ "$PLATFORM" == "github" ]]; then
    def_branch=$(github_get_default_branch 2>/dev/null) || {
      echo "[pr-ci-wait] WARN: default-branch probe failed; assuming CI-enabled (will not fast-green on empty status)."; return 0; }
    [[ -n "$def_branch" ]] || return 0
    def_status=$(github_get_commit_status_json "$OWNER" "$REPO" "$def_branch" 2>/dev/null | extract_state_from_status_json) || return 0
  else
    def_branch=$(gitea_get_default_branch "$HOST" "$OWNER/$REPO" "$TOKEN" 2>/dev/null) || {
      echo "[pr-ci-wait] WARN: default-branch probe failed; assuming CI-enabled (will not fast-green on empty status)."; return 0; }
    [[ -n "$def_branch" ]] || return 0
    def_status=$(gitea_get_commit_status_json "$HOST" "$OWNER/$REPO" "$TOKEN" "$def_branch" 2>/dev/null | extract_state_from_status_json) || return 0
  fi
  if [[ "$def_status" == "no-status" || -z "$def_status" ]]; then
    REPO_HAS_CI=0
    echo "[pr-ci-wait] default branch '${def_branch}' has no CI status history — treating repo as CI-less (empty-poll fast-exit enabled)."
  else
    REPO_HAS_CI=1
    echo "[pr-ci-wait] default branch '${def_branch}' has CI history (state=${def_status}) — repo runs CI; empty status on PR head => awaiting registration, will not fast-green."
  fi
}
detect_repo_ci || true

NO_CI_STREAK=0
NO_CI_MAX=3

while true; do
  NOW_TS=$(date +%s)
  if (( NOW_TS > DEADLINE_TS )); then
@@ -337,35 +272,11 @@ while true; do
      echo "Error: CI reported ${STATE} for PR #$PR_NUMBER." >&2
      exit 1
      ;;
    no-status)
      if [[ "$REPO_HAS_CI" == "1" ]]; then
        # PRIMARY tier: repo demonstrably runs CI but this commit's pipeline
        # has not registered yet (webhook/queue lag). Do NOT fast-green — keep
        # polling until it registers or the timeout fires. Reset the streak so
        # a later genuine CI-less misread can't accumulate across this state.
        NO_CI_STREAK=0
        echo "[pr-ci-wait] empty status on PR head but repo runs CI — awaiting pipeline registration (webhook lag), not fast-greening."
      else
        # SECONDARY tier: genuinely CI-less repo (default branch has no CI
        # history either). Empty polls => fast-exit green after NO_CI_MAX.
        NO_CI_STREAK=$((NO_CI_STREAK + 1))
        if (( NO_CI_STREAK >= NO_CI_MAX )); then
          echo "[INFO] no CI configured for this repo/commit (PR #$PR_NUMBER, ${NO_CI_STREAK} consecutive empty polls, default branch also CI-less); treating as green."
          exit 0
        fi
      fi
      sleep "$INTERVAL_SEC"
      ;;
    pending|unknown)
      # A pipeline exists but hasn't reached a terminal state (or is
      # transiently ambiguous) — keep waiting, and reset the no-CI streak
      # since this commit is not in the "no CI at all" condition.
      NO_CI_STREAK=0
      sleep "$INTERVAL_SEC"
      ;;
    *)
      echo "[pr-ci-wait] Unrecognized state '${STATE}', continuing to poll..."
      NO_CI_STREAK=0
      sleep "$INTERVAL_SEC"
      ;;
  esac
@@ -230,81 +230,4 @@ if grep -q -- 'tea issue close 536 .*--login mosaicstack' "$LOG_FILE"; then
  exit 1
fi

# ---------------------------------------------------------------------------
# #560: loud diagnostic + host-derived login for BOTH instances + override-wins
# ---------------------------------------------------------------------------

# Loud diagnostic: a host with no matching tea login must emit an actionable
# error to stderr (the previous behavior was a SILENT failure). The original
# mock defines only usc/evil-usc logins, so mosaicstack resolution fails here.
git -C "$REPO_DIR" remote set-url origin https://git.mosaicstack.dev/mosaicstack/stack.git
diag_stderr=$(run_in_repo bash -c '
  source "'"$SCRIPT_DIR"'/detect-platform.sh"
  get_gitea_login_for_host git.mosaicstack.dev
' 2>&1 1>/dev/null || true)
if ! grep -q "no Gitea tea login matches host 'git.mosaicstack.dev'" <<<"$diag_stderr"; then
  echo "Expected loud diagnostic naming the unresolved host; got: $diag_stderr" >&2
  exit 1
fi
if ! grep -q "Available tea logins:" <<<"$diag_stderr"; then
  echo "Expected diagnostic to list available tea logins; got: $diag_stderr" >&2
  exit 1
fi

# Both-instance host derivation + override-wins, using a mock that DOES define a
# mosaicstack login. Scoped to this section so the API-fallback assertions above
# (which rely on mosaicstack having NO tea login) remain valid.
BIN_DIR2="$WORK_DIR/bin2"
mkdir -p "$BIN_DIR2"
cp "$BIN_DIR/curl" "$BIN_DIR2/curl"
cat > "$BIN_DIR2/tea" <<'SH'
#!/usr/bin/env bash
set -euo pipefail
if [[ "$*" == "login list --output json" ]]; then
  cat <<'JSON'
[
  {"name":"mosaicstack","url":"https://git.mosaicstack.dev","user":"jason.woltje"},
  {"name":"usc","url":"https://git.uscllc.com","user":"jason.woltje"}
]
JSON
  exit 0
fi
printf 'tea %s\n' "$*" >> "$MOSAIC_TEST_LOG"
exit 0
SH
chmod +x "$BIN_DIR2/tea"

run_in_repo2() {
  (
    cd "$REPO_DIR"
    PATH="$BIN_DIR2:$PATH" \
      MOSAIC_CREDENTIALS_FILE="$CREDENTIALS_FILE" \
      MOSAIC_TEST_LOG="$LOG_FILE" \
      "$@"
  )
}

git -C "$REPO_DIR" remote set-url origin https://git.mosaicstack.dev/mosaicstack/stack.git
mosaic_login=$(run_in_repo2 bash -c 'source "'"$SCRIPT_DIR"'/detect-platform.sh"; get_gitea_login')
if [[ "$mosaic_login" != "mosaicstack" ]]; then
  echo "Expected mosaicstack origin to derive login 'mosaicstack'; got '$mosaic_login'" >&2
  exit 1
fi

git -C "$REPO_DIR" remote set-url origin https://git.uscllc.com/USC/uconnect.git
usc_login_derived=$(run_in_repo2 bash -c 'source "'"$SCRIPT_DIR"'/detect-platform.sh"; get_gitea_login')
if [[ "$usc_login_derived" != "usc" ]]; then
  echo "Expected usc origin to derive login 'usc'; got '$usc_login_derived'" >&2
  exit 1
fi

# Explicit GITEA_LOGIN override is honored when it matches the host.
git -C "$REPO_DIR" remote set-url origin https://git.mosaicstack.dev/mosaicstack/stack.git
override_wins=$(run_in_repo2 bash -c 'export GITEA_LOGIN=mosaicstack; source "'"$SCRIPT_DIR"'/detect-platform.sh"; get_gitea_login')
if [[ "$override_wins" != "mosaicstack" ]]; then
  echo "Expected valid GITEA_LOGIN override to win on mosaicstack host; got '$override_wins'" >&2
  exit 1
fi
git -C "$REPO_DIR" remote set-url origin https://git.uscllc.com/USC/uconnect.git

echo "Gitea login resolution regression harness passed"
@@ -1,102 +0,0 @@
#!/usr/bin/env bash
# Regression harness for issue-create.sh Markdown-body safety (#559).
#
# Guards against reintroduction of eval-based command construction. The wrapper
# builds its tea/gh invocation as an argv array, so a body containing command
# substitution ($(...)), backticks, quotes, and dollar signs MUST reach tea
# verbatim and MUST NOT be shell-evaluated. This test asserts both:
#   1. No command-substitution side effect (an injected `touch SENTINEL` never runs).
#   2. The --description value tea receives is byte-for-byte the original body.

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
WORK_DIR="${MOSAIC_TEST_WORK_DIR:-$PWD/.mosaic-test-work/issue-create-body-safety}"
REPO_DIR="$WORK_DIR/repo"
BIN_DIR="$WORK_DIR/bin"
SENTINEL="$WORK_DIR/INJECTION_SENTINEL"
BODY_FILE="$WORK_DIR/body.txt"
RECEIVED_FILE="$WORK_DIR/received-description.txt"

rm -rf "$WORK_DIR"
mkdir -p "$REPO_DIR" "$BIN_DIR"

git -C "$REPO_DIR" init -q
git -C "$REPO_DIR" remote add origin https://git.mosaicstack.dev/mosaicstack/stack.git

# Hostile Markdown body. The unquoted heredoc expands $SENTINEL (a real path we
# want embedded) but every shell metacharacter we care about is backslash-escaped
# so the TEST shell writes them literally into the file — the bytes the wrapper
# must then preserve.
cat > "$BODY_FILE" <<EOF
# Release notes

Inline code: \`rm -rf /\` must stay literal.
Command sub attempt: \$(touch $SENTINEL)
Backtick cmd attempt: \`touch $SENTINEL\`
Dollars: \$HOME \${PATH} \$5.00 and 100% done
Quotes: "double" and 'single' and \`mixed\`
Trailing pipe-ish: foo | bar && baz ; qux
EOF

BODY="$(cat "$BODY_FILE")"

# Mock tea: resolve a mosaicstack login, then capture the --description verbatim.
cat > "$BIN_DIR/tea" <<'SH'
#!/usr/bin/env bash
set -euo pipefail

if [[ "$*" == "login list --output json" ]]; then
  cat <<'JSON'
[
  {"name":"mosaicstack","url":"https://git.mosaicstack.dev","user":"jason.woltje"}
]
JSON
  exit 0
fi

if [[ "${1:-}" == "issue" && "${2:-}" == "create" ]]; then
  desc=""
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --description) desc="$2"; shift 2 ;;
      *) shift ;;
    esac
  done
  printf '%s' "$desc" > "$MOSAIC_TEST_RECEIVED"
  echo "#1 created"
  exit 0
fi

exit 0
SH
chmod +x "$BIN_DIR/tea"

(
  cd "$REPO_DIR"
  PATH="$BIN_DIR:$PATH" \
    MOSAIC_TEST_RECEIVED="$RECEIVED_FILE" \
    "$SCRIPT_DIR/issue-create.sh" -t "Body safety test" -b "$BODY"
) >/dev/null

# 1. No command substitution executed anywhere in the pipeline.
if [[ -e "$SENTINEL" ]]; then
  echo "FAIL: injected command substitution executed (sentinel file created): $SENTINEL" >&2
  exit 1
fi

# 2. tea actually received the body (issue create path taken, not silently dropped).
if [[ ! -f "$RECEIVED_FILE" ]]; then
  echo "FAIL: tea issue create was never invoked with a --description" >&2
  exit 1
fi

# 3. The description tea received is byte-for-byte the original body.
if [[ "$(cat "$RECEIVED_FILE")" != "$BODY" ]]; then
  echo "FAIL: body was not preserved verbatim through issue-create.sh" >&2
  echo "--- expected ---" >&2; printf '%s\n' "$BODY" >&2
  echo "--- received ---" >&2; cat "$RECEIVED_FILE" >&2
  exit 1
fi

echo "issue-create.sh Markdown body-safety regression harness passed"
|
||||
@@ -31,9 +31,12 @@ Prepends the preamble automatically (auto-detecting your own `host:session`) and
|
||||
delivers reliably to local OR remote panes.
|
||||
|
||||
```bash
|
||||
# Local target (same host)
|
||||
# Local target (same host, default tmux server)
|
||||
agent-send.sh -s <dst_session> -m "message"
|
||||
|
||||
# Local target on a Mosaic fleet socket
|
||||
agent-send.sh -L mosaic-factory -s '=coder0' -m "message"
|
||||
|
||||
# Remote target (over ssh)
|
||||
agent-send.sh -H user@host -s <dst_session> -m "message"
|
||||
|
||||
@@ -42,10 +45,27 @@ agent-send.sh -H user@host -s <dst_session> -f msg.txt
|
||||
echo "msg" | agent-send.sh -s <dst_session>
|
||||
```
|
||||
|
||||
Key flags: `-s` dst session (required) · `-H` ssh target for remote · `-n` dst
|
||||
Key flags: `-L` named tmux socket · `-s` dst session (required) · `-H` ssh target for remote · `-n` dst
|
||||
hostname for the preamble (else auto-resolved) · `-m`/`-f`/stdin body · `-S`
|
||||
override source label · `-v` verbose · `-r N` Enter-flush attempts.
|
||||
|
||||
For durable fleet use, prefer exact tmux targets such as `=coder0`. The helper
|
||||
normalizes exact session targets to pane-qualified targets internally so pane
|
||||
commands do not fall back to tmux's prefix matching behavior.
|
||||
|
||||
## Named socket isolation
|
||||
|
||||
Durable Mosaic fleets should use a dedicated tmux socket, for example:
|
||||
|
||||
```bash
|
||||
tmux -L mosaic-factory ls
|
||||
agent-send.sh -L mosaic-factory -s '=coder0' -m "status?"
|
||||
send-message.sh -L mosaic-factory -t '=coder0' -m "raw pane message"
|
||||
```
|
||||
|
||||
This keeps fleet operations away from the user's default tmux server. It is the
|
||||
safe rollout path on hosts that already have manual tmux sessions.
|
||||
|
||||
## Why a helper exists (the submission gotcha)
|
||||
|
||||
Pasting into an interactive REPL via raw `tmux send-keys` is unreliable: a
|
||||
@@ -67,6 +87,7 @@ message crosses the wire as base64 (`-b`) to avoid all shell-quoting hazards.
|
||||
|
||||
- `agent-send.sh` — inter-agent wrapper (preamble + local/remote dispatch).
|
||||
- `send-message.sh` — low-level reliable single-pane submitter (`-b` base64 input).
|
||||
- `test-send-message-socket.sh` — smoke test for named-socket isolation.
|
||||
|
||||
## Distribution
|
||||
|
||||
|
||||
@@ -23,12 +23,13 @@
|
||||
# the remote host; only bash + tmux + base64 (standard).
|
||||
#
|
||||
# USAGE
|
||||
# agent-send.sh -s <dst_session> -m "message" # local target
|
||||
# agent-send.sh -H user@host -s <dst_session> -m "message" # remote target
|
||||
# agent-send.sh -H user@host -n <dst_hostname> -s <sess> -f msg.txt
|
||||
# echo "msg" | agent-send.sh -H user@host -s <dst_session>
|
||||
# agent-send.sh [-L socket] -s <dst_session> -m "message" # local target
|
||||
# agent-send.sh [-L socket] -H user@host -s <dst_session> -m "message" # remote target
|
||||
# agent-send.sh [-L socket] -H user@host -n <dst_hostname> -s <sess> -f msg.txt
|
||||
# echo "msg" | agent-send.sh [-L socket] -H user@host -s <dst_session>
|
||||
#
|
||||
# OPTIONS
|
||||
# -L NAME tmux socket name passed to `tmux -L NAME` on the target host
|
||||
# -s DST_SESSION target tmux session (or session:window.pane) [required]
|
||||
# -H SSH_TARGET ssh target (user@host) for a remote pane; omit for local
|
||||
# -n DST_HOST hostname to show in the preamble for the target.
|
||||
@@ -47,12 +48,13 @@ set -uo pipefail
|
||||
SELF_DIR=$(cd -- "$(dirname -- "$0")" && pwd)
|
||||
SENDER="$SELF_DIR/send-message.sh"
|
||||
|
||||
DST_SESSION=""; SSH_TARGET=""; DST_HOST=""; MSG=""; FILE=""
|
||||
DST_SESSION=""; SSH_TARGET=""; DST_HOST=""; MSG=""; FILE=""; SOCKET_NAME=""
|
||||
SRC_LABEL=""; RETRIES=2; VERBOSE=0
|
||||
usage() { sed -n '2,44p' "$0"; exit "${1:-3}"; }
|
||||
|
||||
while getopts "s:H:n:m:f:S:r:vh" o; do
|
||||
while getopts "L:s:H:n:m:f:S:r:vh" o; do
|
||||
case "$o" in
|
||||
L) SOCKET_NAME=$OPTARG ;;
|
||||
s) DST_SESSION=$OPTARG ;; H) SSH_TARGET=$OPTARG ;; n) DST_HOST=$OPTARG ;;
|
||||
m) MSG=$OPTARG ;; f) FILE=$OPTARG ;; S) SRC_LABEL=$OPTARG ;;
|
||||
r) RETRIES=$OPTARG ;; v) VERBOSE=1 ;; h) usage 0 ;; *) usage 3 ;;
|
||||
@@ -70,8 +72,12 @@ fi
|
||||
|
||||
# Source label: this agent's host:session (auto-detected, overridable).
|
||||
if [ -z "$SRC_LABEL" ]; then
|
||||
tmux_cmd=(tmux)
|
||||
if [ -n "$SOCKET_NAME" ]; then
|
||||
tmux_cmd+=(-L "$SOCKET_NAME")
|
||||
fi
|
||||
src_host=$(hostname -s 2>/dev/null || echo "?")
|
||||
src_sess=$(tmux display-message -p '#S' 2>/dev/null || echo "?")
|
||||
src_sess=$("${tmux_cmd[@]}" display-message -p '#S' 2>/dev/null || echo "?")
|
||||
SRC_LABEL="${src_host}:${src_sess}"
|
||||
fi
|
||||
|
||||
@@ -89,12 +95,16 @@ FULL="${PREAMBLE} ${MSG}"
|
||||
B64=$(printf '%s' "$FULL" | base64 -w0)
|
||||
|
||||
vflag=""; [ "$VERBOSE" = 1 ] && vflag="-v"
|
||||
socket_args=()
|
||||
if [ -n "$SOCKET_NAME" ]; then
|
||||
socket_args=(-L "$SOCKET_NAME")
|
||||
fi
|
||||
|
||||
if [ -z "$SSH_TARGET" ]; then
|
||||
# Local pane: call the canonical sender directly.
|
||||
exec "$SENDER" -t "$DST_SESSION" -b "$B64" -r "$RETRIES" $vflag
|
||||
exec "$SENDER" "${socket_args[@]}" -t "$DST_SESSION" -b "$B64" -r "$RETRIES" $vflag
|
||||
else
|
||||
# Remote pane: ship the sender over ssh and run it local to the target.
|
||||
ssh -o ConnectTimeout=10 "$SSH_TARGET" \
|
||||
"bash -s -- -t '$DST_SESSION' -b '$B64' -r '$RETRIES' $vflag" < "$SENDER"
|
||||
"bash -s -- ${socket_args[*]@Q} -t '$DST_SESSION' -b '$B64' -r '$RETRIES' $vflag" < "$SENDER"
|
||||
fi
|
||||
|
||||
@@ -13,12 +13,13 @@
|
||||
# no-op in Claude Code, so the double-Enter is safe.
|
||||
#
|
||||
# USAGE
|
||||
# send-message.sh -t <target> -m "message"
|
||||
# send-message.sh -t <target> -f <file>
|
||||
# echo "message" | send-message.sh -t <target>
|
||||
# ssh host bash -s -- -t <target> -b "$(base64 -w0 <<<msg)" < send-message.sh
|
||||
# send-message.sh [-L socket_name] -t <target> -m "message"
|
||||
# send-message.sh [-L socket_name] -t <target> -f <file>
|
||||
# echo "message" | send-message.sh [-L socket_name] -t <target>
|
||||
# ssh host bash -s -- -L socket -t <target> -b "$(base64 -w0 <<<msg)" < send-message.sh
|
||||
#
|
||||
# OPTIONS
|
||||
# -L NAME tmux socket name passed to `tmux -L NAME` (optional)
|
||||
# -t TARGET tmux target: session, or session:window.pane [required]
|
||||
# -m MESSAGE message text (single- or multi-line)
|
||||
# -f FILE read message from FILE instead of -m
|
||||
@@ -34,11 +35,12 @@
|
||||
# 3 usage error
|
||||
set -uo pipefail
|
||||
|
||||
TARGET=""; MSG=""; FILE=""; B64=""; RETRIES=2; VERBOSE=0
|
||||
SOCKET_NAME=""; TARGET=""; MSG=""; FILE=""; B64=""; RETRIES=2; VERBOSE=0
|
||||
usage() { sed -n '2,34p' "$0"; exit "${1:-3}"; }
|
||||
|
||||
while getopts "t:m:f:b:r:vh" o; do
|
||||
while getopts "L:t:m:f:b:r:vh" o; do
|
||||
case "$o" in
|
||||
L) SOCKET_NAME=$OPTARG ;;
|
||||
t) TARGET=$OPTARG ;; m) MSG=$OPTARG ;; f) FILE=$OPTARG ;; b) B64=$OPTARG ;;
|
||||
r) RETRIES=$OPTARG ;; v) VERBOSE=1 ;; h) usage 0 ;; *) usage 3 ;;
|
||||
esac
|
||||
@@ -51,8 +53,21 @@ elif [ -z "$MSG" ] && [ ! -t 0 ]; then MSG=$(cat)
|
||||
fi
|
||||
[ -n "$MSG" ] || { echo "ERROR: empty message (use -m, -f, or stdin)" >&2; exit 3; }
|
||||
|
||||
tmux_cmd=(tmux)
|
||||
if [ -n "$SOCKET_NAME" ]; then
|
||||
tmux_cmd+=(-L "$SOCKET_NAME")
|
||||
fi
|
||||
|
||||
# tmux accepts `=session` for some commands, but pane-level commands such as
|
||||
# capture-pane require a pane-qualified target. Keep exact-session addressing
|
||||
# convenient while avoiding accidental prefix matches.
|
||||
EFFECTIVE_TARGET=$TARGET
|
||||
if [[ "$TARGET" == =* && "$TARGET" != *:* ]]; then
|
||||
EFFECTIVE_TARGET="${TARGET}:0.0"
|
||||
fi
|
||||
|
||||
# Target must resolve to a live pane.
|
||||
if ! tmux list-panes -t "$TARGET" >/dev/null 2>&1; then
|
||||
if ! "${tmux_cmd[@]}" list-panes -t "$EFFECTIVE_TARGET" >/dev/null 2>&1; then
|
||||
echo "ERROR: tmux target not found: $TARGET" >&2; exit 1
|
||||
fi
|
||||
|
||||
@@ -62,18 +77,18 @@ snippet=$(printf '%s' "$MSG" | tr '\n' ' ' | tr -s ' ' | sed 's/[^[:print:]]//g'
|
||||
|
||||
# 1) Paste the body as a bracketed paste so multi-line content does not submit
|
||||
# line-by-line. load-buffer/paste-buffer is far safer than `send-keys -l`.
|
||||
printf '%s' "$MSG" | tmux load-buffer -b __mosaic_send -
|
||||
printf '%s' "$MSG" | "${tmux_cmd[@]}" load-buffer -b __mosaic_send -
|
||||
# -p = bracketed paste when the client supports it; fall back if not.
|
||||
tmux paste-buffer -d -p -b __mosaic_send -t "$TARGET" 2>/dev/null \
|
||||
|| tmux paste-buffer -d -b __mosaic_send -t "$TARGET"
|
||||
"${tmux_cmd[@]}" paste-buffer -d -p -b __mosaic_send -t "$EFFECTIVE_TARGET" 2>/dev/null \
|
||||
|| "${tmux_cmd[@]}" paste-buffer -d -b __mosaic_send -t "$EFFECTIVE_TARGET"
|
||||
sleep 0.5
|
||||
|
||||
# 2) Submit, then verify; flush with another Enter if it is still a draft.
|
||||
status="sent"
|
||||
for attempt in $(seq 1 $((RETRIES + 1))); do
|
||||
tmux send-keys -t "$TARGET" Enter
|
||||
"${tmux_cmd[@]}" send-keys -t "$EFFECTIVE_TARGET" Enter
|
||||
sleep 1.2
|
||||
pane=$(tmux capture-pane -t "$TARGET" -p 2>/dev/null)
|
||||
pane=$("${tmux_cmd[@]}" capture-pane -t "$EFFECTIVE_TARGET" -p 2>/dev/null)
|
||||
|
||||
if printf '%s' "$pane" | grep -qF "$QUEUED_RE"; then
|
||||
status="queued"; break
|
||||
|
||||
50
packages/mosaic/framework/tools/tmux/test-send-message-socket.sh
Executable file
@@ -0,0 +1,50 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
SCRIPT_DIR=$(cd -- "$(dirname -- "$0")" && pwd)
|
||||
SEND_MESSAGE="$SCRIPT_DIR/send-message.sh"
|
||||
AGENT_SEND="$SCRIPT_DIR/agent-send.sh"
|
||||
SOCKET="mosaic-test-$RANDOM-$$"
|
||||
TARGET="target-$RANDOM"
|
||||
DEFAULT_TARGET="default-target-$RANDOM"
|
||||
TMPDIR=$(mktemp -d)
|
||||
trap 'tmux -L "$SOCKET" kill-server >/dev/null 2>&1 || true; tmux kill-session -t "$DEFAULT_TARGET" >/dev/null 2>&1 || true; rm -rf "$TMPDIR"' EXIT
|
||||
|
||||
fail() {
|
||||
echo "FAIL: $*" >&2
|
||||
exit 1
|
||||
}
|
||||
|
||||
require_tmux() {
|
||||
command -v tmux >/dev/null 2>&1 || fail "tmux is required"
|
||||
}
|
||||
|
||||
capture_named() {
|
||||
tmux -L "$SOCKET" capture-pane -t "=$TARGET:0.0" -p
|
||||
}
|
||||
|
||||
capture_default() {
|
||||
tmux capture-pane -t "=$DEFAULT_TARGET:0.0" -p
|
||||
}
|
||||
|
||||
require_tmux
|
||||
|
||||
tmux -L "$SOCKET" new-session -d -s "$TARGET" -c "$TMPDIR" 'bash --noprofile --norc -i'
|
||||
tmux new-session -d -s "$DEFAULT_TARGET" -c "$TMPDIR" 'bash --noprofile --norc -i'
|
||||
|
||||
"$SEND_MESSAGE" -L "$SOCKET" -t "=$TARGET" -m "named socket hello" >/tmp/send-message-named.out
|
||||
sleep 0.2
|
||||
capture_named | grep -qF "named socket hello" || fail "send-message.sh did not deliver to named socket"
|
||||
if capture_default | grep -qF "named socket hello"; then
|
||||
fail "send-message.sh leaked named-socket message to default tmux server"
|
||||
fi
|
||||
|
||||
"$AGENT_SEND" -L "$SOCKET" -S "tester:source" -s "=$TARGET" -m "agent socket hello" >/tmp/agent-send-named.out
|
||||
sleep 0.2
|
||||
capture_named | grep -qF "[tester:source ->" || fail "agent-send.sh did not include preamble"
|
||||
capture_named | grep -qF "agent socket hello" || fail "agent-send.sh did not deliver to named socket"
|
||||
if capture_default | grep -qF "agent socket hello"; then
|
||||
fail "agent-send.sh leaked named-socket message to default tmux server"
|
||||
fi
|
||||
|
||||
echo "ok - named tmux socket send tools"
|
||||
@@ -12,7 +12,7 @@ wp_resolve_repo_id() {
|
||||
local full_name="$1"
|
||||
local response http_code body repo_id
|
||||
|
||||
response=$(curl -sS -w "\n%{http_code}" \
|
||||
response=$(curl -sk -w "\n%{http_code}" \
|
||||
-H "Authorization: Bearer $WOODPECKER_TOKEN" \
|
||||
"${WOODPECKER_URL}/api/repos/lookup/${full_name}")
|
||||
|
||||
|
||||
@@ -48,7 +48,7 @@ fi
|
||||
# Resolve owner/repo to numeric ID (Woodpecker v3 API)
|
||||
REPO_ID=$(wp_resolve_repo_id "$REPO") || exit 1
|
||||
|
||||
response=$(curl -sS -w "\n%{http_code}" \
|
||||
response=$(curl -sk -w "\n%{http_code}" \
|
||||
-H "Authorization: Bearer $WOODPECKER_TOKEN" \
|
||||
"${WOODPECKER_URL}/api/repos/${REPO_ID}/pipelines?perPage=${LIMIT}")
|
||||
|
||||
|
||||
@@ -50,7 +50,7 @@ REPO_ID=$(wp_resolve_repo_id "$REPO") || exit 1
|
||||
_wp_fetch() {
|
||||
local ep="$1"
|
||||
local resp http_code body
|
||||
resp=$(curl -sS -w "\n%{http_code}" \
|
||||
resp=$(curl -sk -w "\n%{http_code}" \
|
||||
-H "Authorization: Bearer $WOODPECKER_TOKEN" \
|
||||
"$ep")
|
||||
http_code=$(echo "$resp" | tail -n1)
|
||||
|
||||
@@ -46,7 +46,7 @@ REPO_ID=$(wp_resolve_repo_id "$REPO") || exit 1
|
||||
|
||||
echo "Triggering pipeline for $REPO on branch $BRANCH..."
|
||||
|
||||
response=$(curl -sS -w "\n%{http_code}" -X POST \
|
||||
response=$(curl -sk -w "\n%{http_code}" -X POST \
|
||||
-H "Authorization: Bearer $WOODPECKER_TOKEN" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d "$(jq -n --arg b "$BRANCH" '{branch: $b}')" \
|
||||
|
||||
755
scratchpads/2026-06-19-tmux-fleet-durable-install-plan.md
Normal file
@@ -0,0 +1,755 @@
|
||||
# Durable tmux Fleet Installation Plan
|
||||
|
||||
> **For Mosaic/Hermes:** This is an implementation plan for making the tmux-backed Mosaic software-factory fleet durable on this server and reusable in generic Mosaic Stack installs. Keep local USC/Mosaic defaults in profiles; keep framework behavior customizable.
|
||||
|
||||
**Goal:** Add a supported Mosaic tmux-fleet installation path: holder-owned tmux server, per-agent reusable sessions, reliable send/reset/status tools, local roster customization, and a documented cutover for this server.
|
||||
|
||||
**Architecture:** Mosaic should ship generic tmux fleet primitives in the framework, then layer local rosters through configuration. The holder service owns the tmux socket; each agent service joins the holder-owned server and runs `mosaic yolo <runtime>`. The orchestrator addresses agents through `mosaic agent ...` abstractions so tmux can later be replaced by Matrix-backed agent comms without changing mission flow.
|
||||
|
||||
**Reference:** AI Guide `playbooks/tmux-fleet.md` at commit `2a0b0b5` documents the organization-neutral holder-service pattern, exact-match `=<name>` stop targets, and coupled-server cutover/verification sequence. The Stack implementation should treat that as the lifecycle model and keep concrete Mosaic unit/tooling details here.
|
||||
|
||||
**Tech Stack:** Bash, tmux, user systemd units, Mosaic CLI/framework installer, JSON/YAML roster config, existing `packages/mosaic/framework/tools/tmux/{agent-send.sh,send-message.sh}`.
|
||||
|
||||
---
|
||||
|
||||
## Current evidence from this server
|
||||
|
||||
Checked 2026-06-19:
|
||||
|
||||
- Host: `W-jarvis`
|
||||
- User: `jarvis`
|
||||
- tmux: `/usr/bin/tmux`, version `3.4`
|
||||
- user systemd: active
|
||||
- existing tmux sessions: `ai-bma-0`, `dyor-1`, `melaniewoltje-3`, `sage-2`
|
||||
- existing Mosaic runtime: `/home/jarvis/.npm-global/bin/mosaic`, version `0.0.31`
|
||||
- installed `~/.config/mosaic/tools/tmux` was not present even though the stack repo contains `packages/mosaic/framework/tools/tmux/`
|
||||
|
||||
Implication: do not kill the current tmux server casually. This server has active ad-hoc/service sessions. The durable fleet cutover must be planned, with either a separate socket first or a scheduled fleet recycle.
|
||||
|
||||
## Design decisions
|
||||
|
||||
### 1. Generic framework, local profile
|
||||
|
||||
The Mosaic framework should ship:
|
||||
|
||||
- systemd unit templates;
|
||||
- tmux fleet CLI wrappers;
|
||||
- roster schema and examples;
|
||||
- install/enable/status/reset commands;
|
||||
- docs and verification scripts.
|
||||
|
||||
Local environments should provide:
|
||||
|
||||
- agent names;
|
||||
- runtime per slot (`claude`, `pi`, `codex`, etc.);
|
||||
- default role class;
|
||||
- launch directory;
|
||||
- optional kickstart prompt;
|
||||
- model/provider hints;
|
||||
- transport selection (`tmux` now, `matrix` later).
|
||||
|
||||
Do not bake the USC roster into generic install code. Ship it as an example profile.
|
||||
|
||||
### 2. Durable sessions, disposable task context
|
||||
|
||||
Session names are durable operational addresses. Task persona is disposable. Reusable worker slots should be reset with `/clear` or `/new` and then receive a fresh task kickstart.
|
||||
|
||||
Persistent/semi-persistent personas:
|
||||
|
||||
- lead orchestrator;
|
||||
- final/adversarial reviewer;
|
||||
- architecture/enhancement lane.
|
||||
|
||||
Disposable slots:
|
||||
|
||||
- implementers;
|
||||
- ordinary reviewers;
|
||||
- security reviewers unless actively holding a security mission.
|
||||
|
||||
### 3. Transport abstraction now
|
||||
|
||||
Add commands around tmux instead of calling tmux directly from orchestration:
|
||||
|
||||
```bash
|
||||
mosaic agent send <agent> --message "..."
|
||||
mosaic agent status [--json]
|
||||
mosaic agent reset <agent> [--clear|--new]
|
||||
mosaic agent roster [--json]
|
||||
mosaic fleet install|start|stop|restart|status|verify
|
||||
```
|
||||
|
||||
Today these call tmux/systemd. Later the same command surface can target Matrix or per-agent gateways.
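One way to keep that command surface stable is to isolate the transport choice in a single dispatch function. The following is a minimal sketch, not existing code: the function name `agent_send_argv` and its argument order are hypothetical. Only the `agent-send.sh` flags `-L`, `-s`, and `-m` come from the existing tools.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: print (one argument per line) the command that
# `mosaic agent send` could run for a given transport. Names are illustrative.
agent_send_argv() {
  local transport="$1" socket="$2" agent="$3" message="$4"
  case "$transport" in
    tmux)
      # Exact-match target (=name) so "coder1" never prefix-matches "coder10".
      printf '%s\n' agent-send.sh -L "$socket" -s "=$agent" -m "$message"
      ;;
    matrix)
      # Placeholder: the Matrix transport is planned, not implemented.
      echo "matrix transport is not implemented yet" >&2
      return 2
      ;;
    *)
      echo "unknown transport: $transport" >&2
      return 3
      ;;
  esac
}
```

Orchestration code would call only this layer, so swapping tmux for Matrix later should change one `case` arm rather than every caller.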
|
||||
|
||||
### 4. Avoid shared-server ownership bug
|
||||
|
||||
Use the AI Guide holder pattern:
|
||||
|
||||
```text
|
||||
mosaic-tmux-holder.service owns the tmux server/socket
|
||||
mosaic-agent@<name>.service joins the existing holder-owned socket
|
||||
ExecStop kills only session =<name>
|
||||
```
|
||||
|
||||
Use exact tmux targets: `=<session>`.
|
||||
|
||||
### 5. Prefer separate named socket for Mosaic factory
|
||||
|
||||
To avoid disturbing existing tmux work, the default fleet should use a named socket such as:
|
||||
|
||||
```text
|
||||
$XDG_RUNTIME_DIR/mosaic-factory.tmux
|
||||
```
|
||||
|
||||
or tmux socket name:
|
||||
|
||||
```bash
|
||||
tmux -L mosaic-factory ...
|
||||
```
|
||||
|
||||
This avoids collision with ordinary `tmux ls` sessions. The send tools need socket support.
|
||||
|
||||
---
|
||||
|
||||
## Target USC-style roster example
|
||||
|
||||
Ship as example only, not default:
|
||||
|
||||
```yaml
|
||||
version: 1
|
||||
transport: tmux
|
||||
tmux:
|
||||
socket_name: mosaic-factory
|
||||
holder_session: _holder
|
||||
working_directory: ~/src
|
||||
agents:
|
||||
- name: mos-claude
|
||||
runtime: claude
|
||||
class: orchestrator
|
||||
model_hint: Claude Opus
|
||||
persistent_persona: true
|
||||
- name: coder0
|
||||
runtime: claude
|
||||
class: implementer
|
||||
model_hint: Claude Opus
|
||||
reset_between_tasks: true
|
||||
- name: coder1
|
||||
runtime: claude
|
||||
class: implementer
|
||||
model_hint: Claude Opus
|
||||
reset_between_tasks: true
|
||||
- name: coder2
|
||||
runtime: pi
|
||||
class: implementer
|
||||
model_hint: Pi GPT-5.5
|
||||
reset_between_tasks: true
|
||||
- name: coder3
|
||||
runtime: pi
|
||||
class: implementer
|
||||
model_hint: Pi GPT-5.5
|
||||
reset_between_tasks: true
|
||||
- name: coder4
|
||||
runtime: claude
|
||||
class: implementer
|
||||
model_hint: Claude Opus
|
||||
reset_between_tasks: true
|
||||
- name: coder5
|
||||
runtime: claude
|
||||
class: implementer
|
||||
model_hint: Claude Opus
|
||||
reset_between_tasks: true
|
||||
- name: enhance
|
||||
runtime: claude
|
||||
class: enhancer
|
||||
model_hint: Claude Opus
|
||||
persistent_persona: semi
|
||||
- name: rev0
|
||||
runtime: pi
|
||||
class: reviewer
|
||||
model_hint: Pi GPT-5.5
|
||||
reset_between_tasks: true
|
||||
- name: rev1
|
||||
runtime: pi
|
||||
class: reviewer
|
||||
model_hint: Pi GPT-5.5
|
||||
reset_between_tasks: true
|
||||
- name: secrev0
|
||||
runtime: pi
|
||||
class: security_reviewer
|
||||
model_hint: Pi GPT-5.5
|
||||
reset_between_tasks: true
|
||||
- name: secrev1
|
||||
runtime: pi
|
||||
class: security_reviewer
|
||||
model_hint: Pi GPT-5.5
|
||||
reset_between_tasks: true
|
||||
- name: ultron
|
||||
runtime: pi
|
||||
class: final_reviewer
|
||||
model_hint: Pi GPT-5.5
|
||||
persistent_persona: semi
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 0 — Confirm install surfaces
|
||||
|
||||
### Task 0.1: Inspect installer copy behavior
|
||||
|
||||
**Objective:** Confirm how framework files under `packages/mosaic/framework/` become installed under `~/.config/mosaic/`.
|
||||
|
||||
**Files:**
|
||||
|
||||
- Read: `tools/install.sh`
|
||||
- Read: `packages/mosaic/framework/install.sh`
|
||||
- Read: `packages/mosaic/src/runtime/install-manifest.ts`
|
||||
|
||||
**Steps:**
|
||||
|
||||
1. Verify `packages/mosaic/framework/install.sh` rsyncs `tools/tmux`.
|
||||
2. Verify whether npm-packaged installs include `framework/tools/tmux`.
|
||||
3. Confirm whether installed hosts should run `mosaic update`, `bash tools/install.sh`, or `packages/mosaic/framework/install.sh` to receive new tmux tools.
|
||||
4. Record exact propagation command in docs.
|
||||
|
||||
**Verification:**
|
||||
|
||||
```bash
|
||||
bash packages/mosaic/framework/install.sh --help || true
|
||||
npm pack --dry-run --json | jq '.[0].files[].path' | grep 'framework/tools/tmux'
|
||||
```
|
||||
|
||||
Expected: the tmux tools are included in the installable package, or a packaging fix is identified.
|
||||
|
||||
### Task 0.2: Inspect current yolo launch semantics
|
||||
|
||||
**Objective:** Confirm `mosaic yolo claude` and `mosaic yolo pi` accept optional initial prompt text and behave well under systemd/tmux.
|
||||
|
||||
**Files:**
|
||||
|
||||
- Read: `packages/mosaic/src/**`
|
||||
- Read: `packages/mosaic/framework/runtime/claude/RUNTIME.md`
|
||||
- Read: `packages/mosaic/framework/runtime/pi/RUNTIME.md`
|
||||
|
||||
**Verification commands:**
|
||||
|
||||
```bash
|
||||
mosaic yolo claude --help
|
||||
mosaic yolo pi --help
|
||||
```
|
||||
|
||||
Expected: a systemd `ExecStart` can launch the runtime either with no prompt or with a kickstart prompt file/string.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 — Framework tmux primitives
|
||||
|
||||
### Task 1.1: Add socket support to send tools
|
||||
|
||||
**Objective:** Allow `agent-send.sh` and `send-message.sh` to target a named Mosaic tmux socket without affecting default tmux sessions.
|
||||
|
||||
**Files:**
|
||||
|
||||
- Modify: `packages/mosaic/framework/tools/tmux/send-message.sh`
|
||||
- Modify: `packages/mosaic/framework/tools/tmux/agent-send.sh`
|
||||
- Modify: `packages/mosaic/framework/tools/tmux/README.md`
|
||||
- Test: `packages/mosaic/framework/tools/tmux/test-send-message-socket.sh` (new)
|
||||
|
||||
**Design:**
|
||||
|
||||
Add optional flags:
|
||||
|
||||
```bash
|
||||
-L SOCKET_NAME # tmux -L socket name
|
||||
-T PATH # optional socket path (tmux -S), later if needed; -S itself is already the source label in agent-send.sh
|
||||
```
|
||||
|
||||
Because `agent-send.sh` already uses `-S` for the source label, use `-L` for the socket name. Add a socket-path option (`-T`, or `--socket-path` if long-option parsing is added) only if it is needed later.
|
||||
|
||||
**Implementation notes:**
|
||||
|
||||
- Build a tmux command array:
|
||||
|
||||
```bash
|
||||
tmux_cmd=(tmux)
|
||||
if [ -n "$SOCKET_NAME" ]; then tmux_cmd+=( -L "$SOCKET_NAME" ); fi
|
||||
```
|
||||
|
||||
- Replace raw `tmux ...` calls with `"${tmux_cmd[@]}" ...`.
|
||||
- Pass `-L` through remote ssh invocation.
|
||||
- Include socket name in verbose output.
|
||||
|
||||
**Verification:**
|
||||
|
||||
```bash
|
||||
tmux -L mosaic-test new-session -d -s target 'cat'
|
||||
packages/mosaic/framework/tools/tmux/send-message.sh -L mosaic-test -t target -m 'hello'
|
||||
tmux -L mosaic-test capture-pane -t target -p | grep hello
|
||||
tmux -L mosaic-test kill-server
|
||||
```
|
||||
|
||||
Expected: message lands in the named socket session; default `tmux ls` is untouched.
|
||||
|
||||
### Task 1.2: Add exact target validation helper
|
||||
|
||||
**Objective:** Prevent accidental prefix targeting in all tmux fleet operations.
|
||||
|
||||
**Files:**
|
||||
|
||||
- Create: `packages/mosaic/framework/tools/tmux/_lib.sh`
|
||||
- Modify: `send-message.sh`
|
||||
- Modify: `agent-send.sh`
|
||||
|
||||
**Behavior:**
|
||||
|
||||
- For session-only agent names, normalize target to `=<name>` before kill/status/reset operations.
|
||||
- For explicit pane targets like `session:window.pane`, allow as advanced path but document the risk.
|
||||
|
||||
**Verification:**
|
||||
|
||||
Create sessions `agent` and `agent0`; verify killing/resetting `agent` does not affect `agent0`.
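The normalization rule described under Behavior could be sketched as a small `_lib.sh` function. This is an assumption about how the helper might look, not the shipped implementation; the name `mosaic_exact_target` is hypothetical.

```bash
# Hypothetical _lib.sh helper (name and exit codes are assumptions).
# Bare session names become exact-match targets; anything already exact or
# pane-qualified is passed through unchanged.
mosaic_exact_target() {
  local target="$1"
  case "$target" in
    '')  echo "ERROR: empty tmux target" >&2; return 3 ;;
    =*)  printf '%s\n' "$target" ;;    # already exact, e.g. =coder0
    *:*) printf '%s\n' "$target" ;;    # explicit session:window.pane (advanced path)
    *)   printf '=%s\n' "$target" ;;   # bare name: agent -> =agent
  esac
}
```

With sessions `agent` and `agent0` both present, `tmux kill-session -t "$(mosaic_exact_target agent)"` addresses `=agent`, which tmux matches exactly rather than by prefix.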
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 — systemd unit templates
|
||||
|
||||
### Task 2.1: Add holder service template
|
||||
|
||||
**Objective:** Ship a user systemd unit template that owns the Mosaic factory tmux server.
|
||||
|
||||
**Files:**
|
||||
|
||||
- Create: `packages/mosaic/framework/systemd/user/mosaic-tmux-holder.service`
|
||||
- Create: `packages/mosaic/framework/tools/fleet/install-user-units.sh`
|
||||
|
||||
**Unit shape:**
|
||||
|
||||
```ini
|
||||
[Unit]
|
||||
Description=Mosaic tmux fleet holder
|
||||
Documentation=https://git.mosaicstack.dev/mosaicstack/aiguide
|
||||
|
||||
[Service]
|
||||
Type=oneshot
|
||||
RemainAfterExit=yes
|
||||
Environment=MOSAIC_TMUX_SOCKET=mosaic-factory
|
||||
ExecStart=/usr/bin/tmux -L ${MOSAIC_TMUX_SOCKET} new-session -d -s _holder 'while true; do sleep 3600; done'
|
||||
ExecStop=-/usr/bin/tmux -L ${MOSAIC_TMUX_SOCKET} kill-server
|
||||
|
||||
[Install]
|
||||
WantedBy=default.target
|
||||
```
|
||||
|
||||
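**Important:** systemd performs only limited variable substitution in `ExecStart` (`$VAR` and `${VAR}` in arguments, with no shell features such as command substitution). Verify the syntax with `systemd-analyze`; if environment expansion proves awkward, generate concrete units from config instead of relying on dynamic expansion.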
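Generating concrete units from config, as suggested above, could look like the sketch below. It assumes a hypothetical `render_holder_unit` function that bakes the socket name into the unit text, so nothing depends on systemd expanding variables at runtime. The unit body mirrors the shape shown earlier.

```bash
# Hypothetical generator: emit a concrete holder unit for one socket name.
render_holder_unit() {
  local socket="${1:-mosaic-factory}"
  # Keep the name to characters that need no quoting in a unit file.
  case "$socket" in
    *[!A-Za-z0-9_.-]*) echo "ERROR: invalid socket name: $socket" >&2; return 3 ;;
  esac
  cat <<EOF
[Unit]
Description=Mosaic tmux fleet holder

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/tmux -L ${socket} new-session -d -s _holder 'while true; do sleep 3600; done'
ExecStop=-/usr/bin/tmux -L ${socket} kill-server

[Install]
WantedBy=default.target
EOF
}
```

An installer could redirect the output to `~/.config/systemd/user/mosaic-tmux-holder.service` and then run the `systemd-analyze --user verify` check from the verification step.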
|
||||
|
||||
**Verification:**
|
||||
|
||||
```bash
|
||||
systemd-analyze --user verify ~/.config/systemd/user/mosaic-tmux-holder.service
|
||||
systemctl --user daemon-reload
|
||||
systemctl --user start mosaic-tmux-holder.service
|
||||
tmux -L mosaic-factory ls | grep _holder
|
||||
```
|
||||
|
||||
### Task 2.2: Add agent service template
|
||||
|
||||
**Objective:** Ship a user systemd template that starts one configured agent slot.
|
||||
|
||||
**Files:**
|
||||
|
||||
- Create: `packages/mosaic/framework/systemd/user/mosaic-agent@.service`
|
||||
- Modify: `packages/mosaic/framework/tools/fleet/install-user-units.sh`
|
||||
|
||||
**Unit shape:**
|
||||
|
||||
```ini
|
||||
[Unit]
|
||||
Description=Mosaic agent session %i
|
||||
Requires=mosaic-tmux-holder.service
|
||||
After=mosaic-tmux-holder.service
|
||||
PartOf=mosaic-tmux-holder.service
|
||||
|
||||
[Service]
|
||||
Type=oneshot
|
||||
RemainAfterExit=yes
|
||||
WorkingDirectory=%h/src
|
||||
Environment=MOSAIC_TMUX_SOCKET=mosaic-factory
|
||||
ExecStart=/bin/bash -lc 'tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "%i" "mosaic yolo $(mosaic fleet runtime %i)"'
|
||||
ExecStop=-/usr/bin/tmux -L ${MOSAIC_TMUX_SOCKET} kill-session -t '=%i'
|
||||
|
||||
[Install]
|
||||
WantedBy=default.target
|
||||
```
|
||||
|
||||
**Design warning:** command substitution in unit files can become brittle. Prefer a generated per-agent EnvironmentFile:
|
||||
|
||||
```text
|
||||
~/.config/mosaic/fleet/agents/coder0.env
|
||||
```
|
||||
|
||||
with:
|
||||
|
||||
```bash
|
||||
MOSAIC_AGENT_NAME=coder0
|
||||
MOSAIC_AGENT_RUNTIME=claude
|
||||
MOSAIC_AGENT_WORKDIR=/home/jarvis/src
|
||||
MOSAIC_TMUX_SOCKET=mosaic-factory
|
||||
```
|
||||
|
||||
Then `ExecStart` calls a wrapper:
|
||||
|
||||
```bash
|
||||
~/.config/mosaic/tools/fleet/start-agent-session.sh
|
||||
```
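The per-agent env file shown above could be produced by the installer from the roster. A minimal sketch, assuming a hypothetical `write_agent_env` function; the key names are the ones listed above, everything else is illustrative.

```bash
# Hypothetical generator for <dir>/<name>.env.
# Arguments: output directory, agent name, runtime, workdir, socket name (optional).
write_agent_env() {
  local dir="$1" name="$2" runtime="$3" workdir="$4" socket="${5:-mosaic-factory}"
  mkdir -p "$dir" || return 1
  # systemd EnvironmentFile syntax: plain KEY=VALUE lines, no shell expansion.
  printf '%s\n' \
    "MOSAIC_AGENT_NAME=$name" \
    "MOSAIC_AGENT_RUNTIME=$runtime" \
    "MOSAIC_AGENT_WORKDIR=$workdir" \
    "MOSAIC_TMUX_SOCKET=$socket" > "$dir/$name.env"
}
```

For example, `write_agent_env ~/.config/mosaic/fleet/agents coder0 claude /home/jarvis/src` would write the four lines shown above to `coder0.env`.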
|
||||
|
||||
**Verification:**
|
||||
|
||||
```bash
|
||||
systemd-analyze --user verify ~/.config/systemd/user/mosaic-agent@.service
|
||||
systemctl --user start mosaic-agent@coder0.service
|
||||
tmux -L mosaic-factory has-session -t '=coder0'
|
||||
systemctl --user restart mosaic-agent@coder0.service
|
||||
```
|
||||
|
||||
Expected: holder server PID remains unchanged; only `coder0` session recycles.
|
||||
|
||||
### Task 2.3: Add start-agent wrapper
|
||||
|
||||
**Objective:** Keep systemd units simple by moving config lookup and launch command construction into a script.
|
||||
|
||||
**Files:**
|
||||
|
||||
- Create: `packages/mosaic/framework/tools/fleet/start-agent-session.sh`
|
||||
|
||||
**Behavior:**
|
||||
|
||||
Inputs:
|
||||
|
||||
```bash
|
||||
start-agent-session.sh <agent-name>
|
||||
```
|
||||
|
||||
Reads:
|
||||
|
||||
```text
|
||||
$MOSAIC_HOME/fleet/agents/<agent-name>.env
|
||||
```
|
||||
|
||||
Starts:
|
||||
|
||||
```bash
|
||||
tmux -L "$MOSAIC_TMUX_SOCKET" new-session -d -s "$MOSAIC_AGENT_NAME" -c "$MOSAIC_AGENT_WORKDIR" "mosaic yolo $MOSAIC_AGENT_RUNTIME"
|
||||
```
|
||||
|
||||
Guardrails:
|
||||
|
||||
- fail if runtime is empty;
|
||||
- fail if workdir does not exist;
|
||||
- no duplicate sessions unless `--replace` is passed;
|
||||
- exact session names only.
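The guardrails above could be sketched as two small functions inside `start-agent-session.sh`. Both function names are hypothetical, and the exit codes are an assumption modelled on the existing send tools (3 for usage errors, 1 for a missing target).

```bash
# Hypothetical guardrail check; call it before creating the tmux session.
check_agent_launch() {
  local name="$1" runtime="$2" workdir="$3"
  case "$name" in
    ''|*[!A-Za-z0-9_-]*) echo "ERROR: invalid agent name: '$name'" >&2; return 3 ;;
  esac
  [ -n "$runtime" ] || { echo "ERROR: runtime is empty for $name" >&2; return 3; }
  [ -d "$workdir" ] || { echo "ERROR: workdir does not exist: $workdir" >&2; return 1; }
}

# True when the session already exists on the given socket. The =name form
# forces an exact match. Requires tmux on PATH.
session_exists() {
  tmux -L "$1" has-session -t "=$2" 2>/dev/null
}
```

The wrapper would refuse to start when `session_exists "$MOSAIC_TMUX_SOCKET" "$MOSAIC_AGENT_NAME"` succeeds, unless `--replace` was passed, in which case it would kill the exact-match session first.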
|
||||
|
||||
---
|
||||
|
||||
## Phase 3 — roster config and CLI wrappers

### Task 3.1: Add fleet config schema and examples

**Objective:** Define a customizable install-time roster without hardcoding USC.

**Files:**

- Create: `packages/mosaic/framework/fleet/roster.schema.json`
- Create: `packages/mosaic/framework/fleet/examples/minimal.yaml`
- Create: `packages/mosaic/framework/fleet/examples/usc-software-factory.yaml`
- Create: `packages/mosaic/framework/fleet/README.md`

**Schema concepts:**

- `transport`: `tmux` now; `matrix` later.
- `tmux.socket_name`
- `tmux.holder_session`
- `defaults.working_directory`
- `agents[].name`
- `agents[].runtime`
- `agents[].class`
- `agents[].model_hint`
- `agents[].persistent_persona`
- `agents[].reset_between_tasks`
- `agents[].kickstart_template`

**Verification:**

Validate JSON examples with `jq`, or add a small Python validator if YAML is chosen. If a YAML parser is not guaranteed on target hosts, store the examples as JSON, or support both formats and implement JSON first using the Python standard library.

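If the roster is stored as JSON, a structural check with `jq` might look like the sketch below. It assumes `jq` 1.5 or later is installed. The choice of required fields (`transport`, `tmux.socket_name`, and a non-empty `agents` array with unique `name` values and a `runtime` each) is an assumption for illustration; the authoritative rules belong in `roster.schema.json`. The function name `validate_roster_json` is hypothetical.

```bash
# Return 0 when the JSON roster passes basic structural checks,
# non-zero otherwise (including unreadable or malformed input).
validate_roster_json() {
  jq -e '
    def nonempty_string: type == "string" and . != "";
    if (.agents | type) != "array" then false
    else
      (.transport == "tmux")
      and (.tmux.socket_name | nonempty_string)
      and (.agents | length > 0)
      and all(.agents[]; (.name | nonempty_string) and (.runtime | nonempty_string))
      and ((.agents | map(.name) | unique | length) == (.agents | length))
    end
  ' "$1" >/dev/null 2>&1
}
```

A wrapper script could call it as `validate_roster_json roster.json || echo "invalid roster" >&2` before any units are installed.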
### Task 3.2: Add `mosaic fleet` commands

**Objective:** Provide operator-safe commands for install/status/start/stop/restart/verify.

**Files:**

- Modify: `packages/mosaic/src/cli.ts` or the current commander entrypoint.
- Create scripts under: `packages/mosaic/framework/tools/fleet/`

**Commands:**

```bash
mosaic fleet init --profile minimal|usc --write
mosaic fleet install-systemd
mosaic fleet start [agent]
mosaic fleet stop [agent]
mosaic fleet restart [agent]
mosaic fleet status --json
mosaic fleet verify
```

**Implementation path:**

Start by wrapping framework shell scripts from the TypeScript CLI. Do not overbuild a TypeScript service manager in the first pass.

### Task 3.3: Add `mosaic agent` commands

**Objective:** Provide transport-stable per-agent operations.

**Files:**

- Modify: Mosaic CLI entrypoint.
- Create: `packages/mosaic/framework/tools/agent/` or reuse `tools/tmux` + `tools/fleet`.

**Commands:**

```bash
mosaic agent roster [--json]
mosaic agent status [agent] [--json]
mosaic agent send <agent> --message "..."
mosaic agent reset <agent> --clear|--new
mosaic agent tail <agent> [-n 80]
```

**Reset behavior:**

For tmux transport, `reset --clear` sends `/clear` then Enter through `send-message.sh`.

Because Claude and Pi use different reset commands, keep the reset command configurable per runtime:

```yaml
runtimes:
  claude:
    reset_command: /clear
  pi:
    reset_command: /new
```

If a runtime does not support a known reset command, restart the service and send a fresh kickstart.

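The fallback rule can be expressed as a small dispatcher. In this sketch the `case` table stands in for the `runtimes` config, and plain `tmux send-keys` stands in for `send-message.sh`, whose interface is not shown in this plan. The names `reset_command_for` and `reset_agent` and the default socket name are assumptions for illustration, and the kickstart step is left as a comment.

```bash
# Print the reset command for a runtime; return 1 if none is known.
reset_command_for() {
  case "$1" in
    claude) printf '%s\n' "/clear" ;;
    pi)     printf '%s\n' "/new" ;;
    *)      return 1 ;;
  esac
}

# reset_agent <agent> <runtime> [socket]
# Uses the in-band reset command when one is known, else restarts the unit.
reset_agent() {
  agent=$1 runtime=$2 socket=${3:-mosaic-factory}
  if reset_cmd=$(reset_command_for "$runtime"); then
    # -l sends the text literally; Enter is sent as a separate key.
    tmux -L "$socket" send-keys -t "=$agent:" -l "$reset_cmd"
    tmux -L "$socket" send-keys -t "=$agent:" Enter
  else
    systemctl --user restart "mosaic-agent@$agent.service"
  fi
  # A fresh kickstart message would be sent here in either case.
}
```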

---

## Phase 4 — this-server rollout strategy

### Task 4.1: Install on separate socket first

**Objective:** Prove the holder pattern without disturbing existing sessions.

**Commands after implementation lands locally:**

```bash
mosaic fleet init --profile minimal --write
mosaic fleet install-systemd
systemctl --user daemon-reload
systemctl --user start mosaic-tmux-holder.service
mosaic fleet verify
```

Expected:

- `tmux -L mosaic-factory ls` shows `_holder`.
- normal `tmux ls` still shows existing sessions unchanged.

### Task 4.2: Start one canary agent

**Objective:** Validate single-agent start/restart isolation.

Use a harmless canary first, not the full fleet.

Example roster addition:

```yaml
- name: canary-pi
  runtime: pi
  class: canary
  working_directory: /home/jarvis/src
```

Commands:

```bash
systemctl --user start mosaic-agent@canary-pi.service
# Record the tmux server PID before restarting the canary.
SRV=$(tmux -L mosaic-factory display-message -p '#{pid}')
systemctl --user restart mosaic-agent@canary-pi.service
# The server PID must be unchanged after the restart.
test "$SRV" = "$(tmux -L mosaic-factory display-message -p '#{pid}')"
tmux -L mosaic-factory ls
```

Expected: holder PID unchanged; `_holder` remains; `canary-pi` recreated.

### Task 4.3: Configure local Mosaic factory roster

**Objective:** Create the actual local roster for this server after the canary passes.

Do not assume the exact USC roster is desired here. Create a local profile such as:

```text
~/.config/mosaic/fleet/roster.yaml
```

Initial local recommendation:

- `mos-claude` orchestrator
- `coder0` / `coder1` implementers
- `rev0` reviewer
- `secrev0` security reviewer
- `ultron` final/adversarial reviewer

Scale to the full USC-style pool only after resource/budget behavior is understood.

### Task 4.4: Cut over existing ad-hoc tmux sessions only if desired

**Objective:** Avoid data loss.

Existing sessions on this server are not on the proposed `mosaic-factory` socket, so they can remain untouched. If we later want them under Mosaic fleet control:

1. list sessions;
2. capture logs/handoffs;
3. stop old processes intentionally;
4. recreate as configured `mosaic-agent@...` services;
5. verify comms and state.

Do not run `tmux kill-server` on the default socket unless Jason explicitly approves that outage.

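Steps 1 and 2 could be scripted roughly as follows before anything is stopped. This sketch only reads from the default tmux server. The function names and the output directory layout are hypothetical, and `capture-pane` here saves only the active pane of each session's current window, so multi-window sessions would need an extra loop.

```bash
# Turn a session name into a safe file name (unusual characters become '_').
safe_file_name() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9_.-' '_'
}

# Save the scrollback of every session on the default socket under "$1".
capture_default_sessions() {
  outdir=$1
  mkdir -p "$outdir" || return 1
  tmux list-sessions -F '#{session_name}' | while IFS= read -r session; do
    # -p prints to stdout; -S - starts at the beginning of the history.
    tmux capture-pane -p -S - -t "=$session:" \
      > "$outdir/$(safe_file_name "$session").log"
  done
}
```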

---

## Phase 5 — docs and AI Guide backfill

### Task 5.1: Stack docs

**Objective:** Document install and customization for Mosaic Stack users.

**Files:**

- Create: `docs/fleet/tmux-fleet.md` or `packages/mosaic/framework/tools/fleet/README.md`
- Modify: top-level `README.md` if appropriate.

Must cover:

- what problem the holder service solves;
- install commands;
- customization file;
- example rosters;
- reset/reuse lifecycle;
- exact-target safety;
- separate socket default;
- Matrix migration path.

### Task 5.2: AI Guide docs

**Objective:** Keep generic guidance in AI Guide and implementation details in Stack.

**Files in `mosaicstack/aiguide`:**

- Update: `playbooks/tmux-fleet.md` with named socket, roster/profile, and resettable-slot pattern.
- Add or update: `reference/agent-role-matrix.md` if PR #5 lands.

Do not present the Mosaic install commands as the only path in the AI Guide; present them as one implementation profile.

---

## Phase 6 — Matrix migration seam

### Task 6.1: Add transport enum but implement tmux only

**Objective:** Avoid hardcoding tmux into orchestration semantics.

Roster:

```yaml
transport: tmux
```

Future:

```yaml
transport: matrix
matrix:
  homeserver: https://matrix.example
  room_prefix: mosaic-factory
```

### Task 6.2: Define transport interface docs

**Objective:** Make Matrix plugin work a transport swap, not a rewrite.

Minimum operations:

```text
send(agent, message)
reset(agent, mode)
status(agent)
tail(agent)
listAgents()
```

Any tmux-specific concept must stay below this interface, inside the tmux transport implementation.

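One way to hold that boundary in the shell wrappers is a dispatcher that exposes only the five operations and routes them to per-transport functions. Everything here is a hypothetical sketch: `agent_op`, the `tmux_*` function names, and the `MOSAIC_TMUX_SOCKET` default are assumptions. Only two tmux backends are filled in (the others would be defined the same way), and `matrix` is deliberately left unimplemented.

```bash
# tmux backends: the only place where tmux-specific details appear.

# Lists every session on the fleet socket (this includes the holder session).
tmux_listAgents() {
  tmux -L "${MOSAIC_TMUX_SOCKET:-mosaic-factory}" list-sessions -F '#{session_name}'
}

# $1 = agent, $2 = how many history lines to reach back (default 80).
tmux_tail() {
  tmux -L "${MOSAIC_TMUX_SOCKET:-mosaic-factory}" \
    capture-pane -p -t "=$1:" -S "-${2:-80}"
}

# agent_op <transport> <operation> [args...]
agent_op() {
  transport=$1 op=$2
  shift 2
  # Only the operations named in the interface are accepted.
  case "$op" in
    send|reset|status|tail|listAgents) ;;
    *) echo "error: unknown operation: $op" >&2; return 2 ;;
  esac
  case "$transport" in
    tmux)   "tmux_$op" "$@" ;;
    matrix) echo "error: matrix transport is not implemented yet" >&2; return 3 ;;
    *)      echo "error: unknown transport: $transport" >&2; return 2 ;;
  esac
}
```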

---

## Acceptance criteria

The implementation is complete when:

- `mosaic fleet init` can write a minimal roster.
- `mosaic fleet install-systemd` installs holder and agent units without hand editing.
- `mosaic fleet start` starts the holder and configured agents on a named tmux socket.
- Restarting one `mosaic-agent@name.service` does not change the holder server PID or kill sibling sessions.
- `mosaic agent send` can deliver a message to a named agent with a self-identifying preamble.
- `mosaic agent reset` can reset a reusable slot (`--clear` or `--new`) and send a fresh kickstart.
- `mosaic fleet verify` proves holder ownership, exact-target safety, and per-agent restart isolation.
- Existing default tmux sessions on this server are not disturbed by a default install.
- Docs explain generic customization and include the USC-style roster only as an example.
- AI Guide remains generic; Mosaic Stack docs carry the concrete install path.

## Risks and mitigations

| Risk                                                | Mitigation                                                                        |
| --------------------------------------------------- | --------------------------------------------------------------------------------- |
| Killing existing tmux sessions                      | Use named `mosaic-factory` socket; no default `tmux kill-server`.                 |
| systemd unit quoting/env expansion bugs             | Move logic into shell wrappers; verify with `systemd-analyze --user verify`.      |
| Runtime reset command mismatch                      | Make reset command runtime-configurable; fallback to service restart + kickstart. |
| Tool install drift                                  | Ensure npm package includes framework tmux/fleet tools; add packaging test.       |
| Mosaic-specific assumptions leak into generic guide | Keep USC roster as example profile; AI Guide documents pattern/options.           |
| Matrix migration blocked by tmux coupling           | Add `mosaic agent` abstraction now; keep tmux details below transport layer.      |

## Suggested first PR split

1. **PR A — tmux tool hardening**
   - socket support;
   - exact target helpers;
   - tests/docs.

2. **PR B — fleet systemd primitives**
   - holder unit;
   - agent unit;
   - start-agent wrapper;
   - install-user-units script;
   - verify script.

3. **PR C — roster and CLI**
   - roster schema/examples;
   - `mosaic fleet ...` commands;
   - `mosaic agent ...` commands.

4. **PR D — local rollout and docs**
   - local roster for this server;
   - run canary;
   - document verification evidence;
   - update AI Guide with generic lessons.

## Immediate next action

Implement PR A first. It is low-risk, improves existing tools, and is required for a safe named-socket rollout on this server.