stack/docs/scratchpads/cli-unification-20260404.md
Jarvis f7b5f187c5
docs: archive stale mission state, scaffold CLI unification mission
Prior sessions left three different missions spread across the docs:
- docs/MISSION-MANIFEST.md: Harness Foundation (complete)
- docs/TASKS.md: Storage Abstraction Retrofit (P1-P4 done, P5 pending)
- docs/scratchpads/mvp-20260312.md: MVP mission (stale)

Reset the working state to a single clean mission focused on what
actually needs to happen next: unify the mosaic CLI, add first-class
commands for every sub-package, fix the gateway bootstrap token
recovery dead-end, and stitch the install UX end-to-end.

Changes:
- Move Harness Foundation manifest + PRD to docs/archive/missions/harness-20260321/
- Move Storage Abstraction TASKS.md to docs/archive/missions/storage-abstraction/
- Scaffold new docs/MISSION-MANIFEST.md for cli-unification-20260404
  with 8 milestones (M1 done via PR #398, M2 in-progress via this PR)
- Scaffold new docs/TASKS.md with per-milestone task breakdown,
  dependencies, agent assignments, and token estimates
- Scaffold docs/scratchpads/cli-unification-20260404.md with full
  planning decisions, gateway bootstrap bug root cause analysis,
  telemetry architecture notes, and open risks

Left intact:
- docs/PRD.md (v0.1.0, 1005 lines) — still the long-term target
- docs/PRD-TUI_Improvements.md — active TUI work
- docs/scratchpads/* historical task scratchpads — append-only breadcrumbs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:43:42 -05:00


Mission Scratchpad — CLI Unification & E2E First-Run

Append-only log. NEVER delete entries. NEVER overwrite sections. This is the orchestrator's working memory across sessions.

Mission ID: cli-unification-20260404
Started: 2026-04-04
Related PRDs: docs/PRD.md (v0.1.0 long-term target)

Original Mission Prompt

Original user framing (2026-04-04):

We are off the reservation right now. Working on getting the system to work via cli first, then working on the webUI. The missions are likely all wrong. The PRDs might have valid info.

E2E install to functional, with Mosaic Forge working. mosaic gateway config is broken — no token is created. Unable to configure. Installation doesn't really configure, it just installs and launches the gateway. Multiple mosaic commands are missing that should be included. Unified installer experience is not ready. UX is bad.

The various mosaic packages will need to be available within the mosaic cli: mosaic auth, mosaic brain, mosaic forge, mosaic log, mosaic macp, mosaic memory, mosaic queue, mosaic storage.

The list of commands in mosaic --help also need to be alphabetized for readability.

mosaic telemetry should also exist. Local OTEL for wide-event logging / post-mortems. Remote upload opt-in via @mosaicstack/telemetry-client-js (https://git.mosaicstack.dev/mosaicstack/telemetry-client-js) — the telemetry server will be part of the main mosaicstack.dev website. Python counterpart at https://git.mosaicstack.dev/mosaicstack/telemetry-client-py.

Planning Decisions

2026-04-04 — State discovery + prep PR

Critical finding: Two CLI packages both owned bin.mosaic: @mosaicstack/mosaic (0.0.21) and @mosaicstack/cli (0.0.17). Their src/cli.ts files were near-verbatim duplicates (424 vs 422 lines) and their src/commands/ directories overlapped, with some files silently diverging (notably gateway/install.ts, the version responsible for the broken install UX). Whichever package was linked last won the mosaic symlink.

Decision: @mosaicstack/cli dies. @mosaicstack/mosaic is the single CLI + TUI package. This was confirmed with the user ("The @mosaicstack/cli package is no longer a package. Its features were moved to @mosaicstack/mosaic instead."). Prep PR #398 executed the removal.

Decision: CLI registration pattern = register<Name>Command(parent: Command) exported by each sub-package, co-located with the library code. Proven by @mosaicstack/quality-rails → registerQualityRails(program). Avoids cross-package commander version mismatches.
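A minimal sketch of the registration pattern, under stated assumptions: CommandLike is a hypothetical stand-in for commander's Command so the example runs without dependencies, and the register functions and description strings are illustrative placeholders rather than the real sub-package exports. The sorted helper reflects the mission prompt's request for alphabetized mosaic --help output.

```typescript
// Hypothetical stand-in for the small part of commander's Command used here.
interface CommandLike {
  name: string;
  description: string;
  subcommands: CommandLike[];
}

function createCommand(name: string, description = ""): CommandLike {
  return { name, description, subcommands: [] };
}

// Each sub-package would export one register function next to its library code.
// These three names and descriptions are placeholders for illustration only.
function registerStorageCommand(parent: CommandLike): void {
  parent.subcommands.push(createCommand("storage", "placeholder description"));
}

function registerQueueCommand(parent: CommandLike): void {
  parent.subcommands.push(createCommand("queue", "placeholder description"));
}

function registerAuthCommand(parent: CommandLike): void {
  parent.subcommands.push(createCommand("auth", "placeholder description"));
}

// The CLI package only composes: it hands its own program object to each
// sub-package instead of importing command objects built elsewhere.
function buildProgram(): CommandLike {
  const program = createCommand("mosaic");
  registerStorageCommand(program); // registration order is arbitrary
  registerQueueCommand(program);
  registerAuthCommand(program);
  return program;
}

// Command names in alphabetical order, independent of registration order.
function helpNames(program: CommandLike): string[] {
  return program.subcommands
    .map((command) => command.name)
    .sort((a, b) => a.localeCompare(b));
}
```

Because the parent is passed in, a sub-package never has to construct a command object from its own copy of the CLI library, which is one plausible reading of why the pattern avoids version mismatches.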

Decision: Stale mission state (harness-20260321 manifest, storage-abstraction TASKS.md, PRD-Harness_Foundation.md) gets archived under docs/archive/missions/. Scratchpads for completed sub-missions are left in docs/scratchpads/ as historical record — they're append-only by design and valuable as breadcrumbs.

2026-04-04 — Gateway bootstrap token bug root cause

apps/gateway/src/admin/bootstrap.controller.ts:

  • GET /api/bootstrap/status returns needsSetup: true only when users table count is zero
  • POST /api/bootstrap/setup throws ForbiddenException if any user exists

packages/mosaic/src/commands/gateway/install.ts → runInstall() "explicit reinstall" branch (lines ~87–98):

  1. Clears meta.adminToken from meta.json (line 175 — preserveToken = false when regeneratedConfig = true)
  2. Calls bootstrapFirstUser()
  3. Status endpoint returns needsSetup: false because users row still exists
  4. bootstrapFirstUser prints "Admin user already exists — skipping setup. (No admin token on file — sign in via the web UI to manage tokens.)" and returns
  5. Install "succeeds" with NO token, NO CLI path to generate one, and chicken-and-egg on /api/admin/tokens which requires auth
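The dead-end in the steps above can be reproduced with a small in-memory model. This is a sketch, not the gateway code: GatewayState, Meta, and the placeholder token value are assumptions, and the functions only mirror the behavior described for the two bootstrap endpoints and the reinstall branch.

```typescript
interface GatewayState {
  userCount: number; // rows in the users table
}

interface Meta {
  adminToken?: string; // stands in for meta.adminToken in meta.json
}

// Mirrors GET /api/bootstrap/status: setup is needed only with zero users.
function bootstrapStatus(gateway: GatewayState): { needsSetup: boolean } {
  return { needsSetup: gateway.userCount === 0 };
}

// Mirrors POST /api/bootstrap/setup: refuses once any user exists.
function bootstrapSetup(gateway: GatewayState): string {
  if (gateway.userCount > 0) {
    throw new Error("Forbidden: a user already exists");
  }
  gateway.userCount += 1;
  return "placeholder-admin-token"; // illustrative value only
}

// Skips setup when a user exists, which leaves meta without a token.
function bootstrapFirstUser(gateway: GatewayState, meta: Meta): void {
  if (!bootstrapStatus(gateway).needsSetup) {
    return; // "Admin user already exists, skipping setup"
  }
  meta.adminToken = bootstrapSetup(gateway);
}

// Reinstall: a regenerated config means the stored token is not preserved,
// but the users row in the database survives.
function reinstall(gateway: GatewayState, meta: Meta, regeneratedConfig: boolean): Meta {
  const preserveToken = !regeneratedConfig;
  const next: Meta = preserveToken ? { ...meta } : {};
  bootstrapFirstUser(gateway, next);
  return next;
}
```

Running a first install followed by reinstall(gateway, meta, true) in this model ends with one user and no admin token, which is the state the recovery design has to get out of.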

Recovery design options (to decide in CU-03-01):

  1. Filesystem-signed nonce file written by the installer; recovery endpoint checks it
  2. Accept a valid BetterAuth admin session cookie → mint new admin token via authenticated API call (leans on existing auth; mosaic gateway login becomes the recovery entry point)
  3. Gateway daemon accepts --rescue flag that mints a one-shot recovery token, prints it, then exits

Current lean: option 2 (BetterAuth cookie) because it reuses existing auth and gives us mosaic gateway login as a useful command regardless. But the design spike in CU-03-01 should evaluate all three against: security, complexity, headless-environment friendliness, and disaster-recovery scenarios.
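To make the session-based option concrete, here is a hedged sketch of the authorization check such an endpoint might perform. Nothing here is BetterAuth's API: the Session shape, the role values, and the injected token generator are all hypothetical, and the real design is still to be decided in CU-03-01.

```typescript
// Hypothetical session shape; the real one would come from the auth layer.
interface Session {
  userId: string;
  role: "admin" | "member";
  expiresAt: number; // epoch milliseconds
}

// Exactly one of token or reason is set.
interface MintResult {
  token?: string;
  reason?: "expired" | "not-admin";
}

// Mints a new admin token only for a live admin session.
// The generator is injected so the sketch stays deterministic and testable.
function mintAdminToken(
  session: Session,
  now: number,
  generateToken: () => string,
): MintResult {
  if (session.expiresAt <= now) {
    return { reason: "expired" };
  }
  if (session.role !== "admin") {
    return { reason: "not-admin" };
  }
  return { token: generateToken() };
}
```

In this shape, a CLI login command would obtain the session and then call the minting endpoint, so recovery needs no unauthenticated path.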

2026-04-04 — Telemetry architecture

  • @mosaicstack/telemetry-client-js + @mosaicstack/telemetry-client-py are separate repos on Gitea — not currently consumed anywhere in this monorepo (verified via grep)
  • Telemetry server will be combined with the main mosaicstack.dev website (not built yet)
  • Local OTEL stays — apps/gateway/src/tracing.ts already wires it up for wide-event logging and post-mortem traces
  • mosaic telemetry is a thin wrapper with two command groups and two defaults:
    • mosaic telemetry local {status,tail,jaeger} → local OTEL state, Jaeger links
    • mosaic telemetry {status,opt-in,opt-out,test,upload} → remote upload path via telemetry-client-js
    • Remote disabled by default; opt-in requires explicit consent
    • test/upload ship with dry-run mode until the server endpoint is live
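The opt-in and dry-run defaults can be sketched as plain state transitions. This is an assumption-laden illustration: TelemetryConfig, UploadResult, and the injected send callback are hypothetical, and the sketch deliberately does not call telemetry-client-js, whose API is not described here.

```typescript
interface TelemetryConfig {
  remoteEnabled: boolean;
}

// Remote upload is off unless the user explicitly opts in.
const defaultTelemetryConfig: TelemetryConfig = { remoteEnabled: false };

function optIn(config: TelemetryConfig, userConsented: boolean): TelemetryConfig {
  // Without explicit consent the config is returned unchanged.
  return userConsented ? { ...config, remoteEnabled: true } : config;
}

function optOut(config: TelemetryConfig): TelemetryConfig {
  return { ...config, remoteEnabled: false };
}

type UploadStatus = "skipped-remote-disabled" | "dry-run" | "sent";

interface UploadResult {
  status: UploadStatus;
  eventCount: number;
}

// The send callback stands in for whatever the remote client exposes.
function upload(
  config: TelemetryConfig,
  events: string[],
  dryRun: boolean,
  send: (events: string[]) => void,
): UploadResult {
  if (!config.remoteEnabled) {
    return { status: "skipped-remote-disabled", eventCount: 0 };
  }
  if (dryRun) {
    // Reports what would be uploaded without contacting any server.
    return { status: "dry-run", eventCount: events.length };
  }
  send(events);
  return { status: "sent", eventCount: events.length };
}
```

Keeping the consent check ahead of the dry-run check means a dry run never reports events for a user who has not opted in.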

Session Log

| Session | Date | Milestone | Tasks Done | Outcome |
| --- | --- | --- | --- | --- |
| 1 | 2026-04-04 | cu-m01 Kill legacy CLI | CU-01-01 | PR #398 merged to main as c39433c3. 48 files deleted, 6685 LOC removed. CI green (pipeline 702). |
| 1 | 2026-04-04 | cu-m02 Archive + scaffold | CU-02-01, CU-02-02 (this file) | In progress (this PR). |

Corrections / Course Changes

(append here as they happen)

Open Risks

  • Telemetry server not live: CU-06-03 (mosaic telemetry upload) may need a dry-run stub until the server endpoint exists on mosaicstack.dev. Not blocking for this mission, but ships with reduced validation until then.
  • mosaic auth depends on gateway login: CU-05-06 is gated by CU-03-03 (mosaic gateway login). Sequencing matters — do not start CU-05-06 until M3 is done or significantly underway.
  • pr-create.sh wrapper bug: Discovered during M1. ~/.config/mosaic/tools/git/pr-create.sh line 158 uses eval "$CMD", which shell-evaluates any backticks / $(…) / ${…} in PR bodies. Workaround: strip backticks from PR bodies (use bold / italic / plain text instead), or use tea pr create directly. Captured in openbrain as a gotcha. Should be fixed upstream in the Mosaic tools repo at some point, but out of scope for this mission.
  • Mosaic coord / orchestrator session lock drift: .mosaic/orchestrator/session.lock gets re-written every session launch and shows up as a dirty working tree on branch switch. Not blocking — just noise to ignore.
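For the pr-create.sh risk above, the workaround can be applied mechanically before a PR body reaches the wrapper. This is a sketch under assumptions: sanitizePrBody is a hypothetical helper, and neutralizing $( and ${ goes beyond the stated workaround of stripping backticks. It does not cover every shell expansion (a bare $VAR would still expand), so it is a stopgap rather than a fix for the eval itself.

```typescript
// Removes the three constructs the scratchpad names as shell-evaluated.
function sanitizePrBody(body: string): string {
  return body
    .replace(/`/g, "") // backtick command substitution
    .replace(/\$\(/g, "(") // $(...) command substitution
    .replace(/\$\{/g, "{"); // ${...} parameter expansion
}

// True when none of the three risky constructs remain in the text.
function isSafeForEval(body: string): boolean {
  return !/`|\$\(|\$\{/.test(body);
}
```

A caller could run isSafeForEval on the sanitized text as a final guard and fall back to tea pr create when it fails.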

Verification Evidence

CU-01-01 (PR #398)

  • Branch: chore/remove-cli-package-duplicate
  • Commit: 7206b9411d96
  • Merge commit on main: c39433c3
  • CI pipeline: #702 (pull_request event, all 6 steps green: postgres, install, typecheck, lint, format, test)
  • Quality gates (pre-push): typecheck 38/38, lint 21/21, format clean, test 38/38