# Agent Platform Architecture — Slash Commands, Workspaces, Task Orchestration & Agent Isolation
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Date:** 2026-03-15
**Status:** Augmented (2026-03-15)
**Packages:** `packages/types`, `packages/cli`, `packages/queue`, `packages/coord`, `packages/db`, `apps/gateway`
---
## Problem Statement
The Mosaic TUI currently sends all user input directly to the agent via Socket.IO. There is no mechanism for in-session slash commands, user preferences, session overrides, or gateway-driven command discovery. Users must exit the TUI to perform operations like switching models, managing missions, or changing agent configs.
Beyond the command interface, the platform lacks: structured workspaces for multi-user project isolation, a unified task orchestration layer (DB and file-based systems are disconnected), agent sandboxing to prevent cross-user data access, session artifact garbage collection, and a gateway-owned command registry.
This plan establishes the foundational architecture for these systems.
---
## Goals
1. **Slash command system** — parse, validate, and execute `/commands` from the TUI input bar
2. **Gateway-owned command registry** — the gateway serves a typed command manifest; TUI consumes it
3. **Preference stack** — four-layer override system (platform defaults → agent config → user preferences → session `/system` overrides)
4. **Hot reload** — soft-restart the gateway to load new plugins/skills/commands without dropping connections
5. **Local primitives** — baseline commands that work even when disconnected from the gateway
6. **Workspaces** — structured, git-backed, per-user/per-project filesystem layout with chroot isolation
7. **Task orchestration** — unified `@mosaicstack/queue` layer bridging PG, workspace files, and Valkey for agent task assignment
8. **Session garbage collection** — three-tier GC (session, sweep, full cold-start) across Valkey, PG, and filesystem
---
## Architecture
### Override Precedence Stack
```
┌─────────────────────────────────────┐
│ /system (session ephemeral) │ ← highest priority, Valkey-backed
├─────────────────────────────────────┤
│ /preferences (user persistent) │ ← per-user, stored in PG via gateway
├─────────────────────────────────────┤
│ Agent config (systemPrompt) │ ← per-agent, stored in agentConfigs
├─────────────────────────────────────┤
│ Platform defaults (Mosaic ships) │ ← base layer, enforcements + defaults
└─────────────────────────────────────┘
```
Each layer is additive by default. Destructive overrides require explicit syntax (TBD in implementation).
`/system` overrides are stored in Valkey keyed by session ID (`mosaic:session:{sessionId}:system`). They survive context compaction because they are never part of the message context — the gateway re-injects them from Valkey when constructing the system prompt for each agent turn.
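As a minimal sketch of the layering described above — function and type names here are illustrative, not the actual gateway API — prompt assembly appends each non-empty layer in ascending priority order, so the `/system` override lands last:

```typescript
/**
 * Illustrative sketch of four-layer prompt assembly.
 * Layers are additive: each non-empty layer is appended in
 * ascending priority order, so the /system override reads as
 * the most recent (highest-priority) instruction.
 */
interface PromptLayers {
  platformDefaults: string;   // base layer, always present
  agentSystemPrompt?: string; // from agentConfigs
  userPreferences?: string;   // rendered from PG preferences
  sessionOverride?: string;   // read from Valkey per turn
}

function assembleSystemPrompt(layers: PromptLayers): string {
  return [
    layers.platformDefaults,
    layers.agentSystemPrompt,
    layers.userPreferences,
    layers.sessionOverride,
  ]
    .filter((layer): layer is string => Boolean(layer?.trim()))
    .join('\n\n');
}
```

Because the session override is re-read from Valkey on every turn rather than stored in message context, compaction never touches it.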
### Command Registry Flow
```
┌─────────────────────────────────────────────────┐
│ Gateway │
│ │
│ CommandRegistryService │
│ ├── core commands (always present) │
│ ├── agent-scoped commands (from agent config) │
│ ├── skill commands (from loaded skills) │
│ ├── plugin commands (discord/telegram may add) │
│ └── admin commands (system-level, RBAC-gated) │
│ │
│ On change → emit 'commands:manifest' to clients │
└──────────────┬──────────────────────────────────┘
│ Socket.IO
┌──────────────────────────────────┐
│ TUI Client │
│ │
│ Local command manifest │
│ (received from gateway) │
│ + client-only commands │
│ (/help, /stop, /clear local) │
│ │
│ Merged manifest feeds: │
│ - autocomplete on "/" prefix │
│ - /help rendering │
│ - argument validation │
└──────────────────────────────────┘
```
### Hot Reload Flow
```
Admin triggers reload
(via /reload command, POST /api/admin/reload, or SIGHUP)
┌─────────────────────────────────────────────┐
│ Gateway ReloadService │
│ │
│ 1. Acquire reload lock (prevent race) │
│ 2. Snapshot current state │
│ 3. Re-scan skills directories │
│ 4. Re-init plugin lifecycles │
│ - plugin.onUnload() → cleanup │
│ - plugin.onLoad() → re-register │
│ 5. Rebuild command registry │
│ 6. Diff old manifest vs new manifest │
│ 7. Broadcast 'system:reload' to sessions │
│ 8. Release lock │
│ │
│ On failure → rollback to snapshot │
└─────────┬───────────────────────────────────┘
│ Socket.IO broadcast
┌──────────────────────────────────┐
│ All connected TUIs │
│ │
│ 'system:reload' event │
│ { commands, skills, providers, │
│ message } │
│ │
│ TUI patches local state, shows │
│ status notification │
└──────────────────────────────────┘
```
### Plugin Lifecycle Contract (required for hot reload)
```typescript
interface MosaicPlugin {
  id: string;
  onLoad(ctx: PluginContext): Promise<void>;
  onUnload(ctx: PluginContext): Promise<void>;
  onReload?(ctx: PluginContext): Promise<void>;
}
```
Without `onUnload`, hot reload is impossible: reloading a plugin would leak event listeners, duplicate command registrations, and orphan external connections.
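As a sketch, a plugin that acquires a listener on load and releases it on unload might look like this — the `PluginContext` shape here is an assumption for illustration, not the real interface:

```typescript
import { EventEmitter } from 'node:events';

// Assumed minimal shape of PluginContext, for illustration only.
interface PluginContext {
  bus: EventEmitter;
  registerCommand(name: string): void;
  unregisterCommand(name: string): void;
}

interface MosaicPlugin {
  id: string;
  onLoad(ctx: PluginContext): Promise<void>;
  onUnload(ctx: PluginContext): Promise<void>;
  onReload?(ctx: PluginContext): Promise<void>;
}

// Hypothetical plugin: everything onLoad acquires, onUnload releases.
class ExamplePlugin implements MosaicPlugin {
  id = 'example';
  private handler?: (msg: string) => void;

  async onLoad(ctx: PluginContext): Promise<void> {
    this.handler = (msg: string) => {
      /* forward msg to the external channel */
    };
    ctx.bus.on('message', this.handler);
    ctx.registerCommand('example');
  }

  async onUnload(ctx: PluginContext): Promise<void> {
    if (this.handler) ctx.bus.off('message', this.handler);
    ctx.unregisterCommand('example');
  }
}
```

The symmetry is the contract: a reload is `onUnload` followed by `onLoad`, and any resource acquired in `onLoad` that `onUnload` does not release will accumulate on every reload.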
---
## Command List
### Session Commands
| Command | Short | Args | Execution | RBAC | Description |
| ---------- | ----- | ----------------- | -------------- | ---- | --------------------------------------------------------------------------------------------------------------------------------- |
| `/new` | `/n` | `{fresh}?` | socket | user | Start new session. `/new fresh` also garbage-collects the old session's artifacts. |
| `/clear` | — | — | socket | user | Clear current session context (server-side context reset, keep session ID). Triggers session GC (Valkey overrides, log demotion). |
| `/compact` | — | `{instructions}?` | socket | user | Trigger context compaction. Optional custom instructions appended. |
| `/rename` | — | `{name}?` | REST | user | Name/rename current session. No arg → prompt inline. |
| `/resume` | — | `{session}?` | REST + socket | user | Resume by name or ID. No arg → show session picker. |
| `/history` | — | `{n}?` | REST | user | Show last N sessions or messages. |
| `/export` | — | `{format}?` | REST | user | Export conversation (md, json). |
| `/stop` | — | — | local + socket | user | Cancel in-progress streaming. Also triggered by `Esc` while streaming. |
| `/retry` | — | — | socket | user | Re-send last user message with current settings. Distinct from up-arrow (input history recall for editing). |
| `/gc` | — | — | REST | user | Trigger sweep GC. Admin sees system-wide results; regular user scoped to their own orphans. |
### Model & Provider Commands
| Command | Short | Args | Execution | RBAC | Description |
| ----------- | ----- | ---------------------------------- | ------------- | ---- | ----------------------------------------------------- |
| `/model` | `/m` | `{name}?` | socket | user | Switch model. No arg → provider-grouped model picker. |
| `/provider` | — | `{login\|logout}? {name}?` | REST + socket | user | List providers. Subcommands for OAuth login/logout. |
| `/thinking` | `/t` | `{low\|medium\|high\|xhigh\|auto}` | socket | user | Set thinking level directly (not cycle). |
`/provider` collapses the original `/login` and `/logout` into subcommands:
- `/provider` → list available providers + connection status
- `/provider login` → show OAuth provider picker
- `/provider login anthropic` → direct OAuth login
- `/provider logout openrouter` → direct logout
### Agent & Tools Commands
| Command | Short | Args | Execution | RBAC | Description |
| ---------- | ----- | --------- | --------------------- | ---- | ------------------------------------------------------------------------- |
| `/agent` | `/a` | `{name}?` | REST + socket | user | Switch agent config. No arg → agent picker. |
| `/tools` | — | — | local (from manifest) | user | List available tools and enabled/disabled state. |
| `/skill:*` | — | `{args}?` | socket | user | Invoke installed skill by name. Follows Pi/Agent Skills standard pattern. |
### Preferences & Override Commands
| Command | Short | Args | Execution | RBAC | Description |
| -------------- | ------- | -------------------- | --------- | ---- | --------------------------------------------------------------- |
| `/preferences` | `/pref` | `{show\|set\|reset}` | REST | user | Persistent user preferences. Stored in PG. Survives sessions. |
| `/system` | — | `{prompt}?` | socket | user | Session-scoped override. Stored in Valkey. Survives compaction. |
### Mission & Planning Commands
| Command | Short | Args | Execution | RBAC | Description |
| ---------- | ----- | ----------------- | ------------- | ---- | ----------------------------- |
| `/mission` | — | `{subcommand}?` | REST | user | Mission status/management. |
| `/prdy` | — | `{init\|update}?` | REST + wizard | user | Launch PRD wizard in-session. |
`/mission` subcommands:
- `/mission` → show active mission summary (name, phase, status, task count)
- `/mission list` → list missions
- `/mission tasks` → list tasks for active mission
- `/mission set {name}` → set active mission context for session. Triggers the agent to begin working on the chosen mission. Requires user confirmation before proceeding (destructive context switch). Mission context (name, description, phase, tasks) is injected into the system prompt so the agent understands its objective.
### Status & Info Commands
| Command | Short | Args | Execution | RBAC | Description |
| --------- | ----- | ------------ | --------------------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------- |
| `/status` | `/s` | — | local + REST | user | Full session status dump: model, provider, agent, connection, tokens, cost, context %, thinking level, active mission, loaded skills. |
| `/cost` | — | — | local | user | Cumulative session cost + token breakdown. |
| `/help` | `/h` | `{command}?` | local (from manifest) | user | List all commands, or show help for specific command. |
### Admin Commands
| Command | Short | Args | Execution | RBAC | Description |
| --------- | ----- | ---- | --------- | ----- | ------------------------------------------------------------------------------------------------------- |
| `/reload` | — | — | REST | admin | Soft-restart: reload skills, plugins, command registry. Maintain connections. Push updates to sessions. |
### Future Commands (Not In Scope)
| Command | Description |
| -------- | ----------------------------------------------------------------- |
| `/share` | Share conversation link (1:1, group). Similar to ChatGPT sharing. |
| `/fork` | Branch conversation from a specific message point. |
| `/copy` | Copy last assistant response to clipboard. |
---
## Type Contracts (`@mosaicstack/types`)
### CommandDef — Gateway Command Manifest Entry
```typescript
/** Argument definition for a slash command */
export interface CommandArgDef {
  name: string;
  type: 'string' | 'enum';
  optional: boolean;
  /** For enum type, the allowed values */
  values?: string[];
  description?: string;
}

/** A single command definition served by the gateway */
export interface CommandDef {
  /** Command name without slash prefix, e.g. "model" */
  name: string;
  /** Short aliases, e.g. ["m"] */
  aliases: string[];
  /** Human-readable description */
  description: string;
  /** Argument schema */
  args?: CommandArgDef[];
  /** Nested subcommands (e.g. provider → login, logout) */
  subcommands?: CommandDef[];
  /** Origin of this command */
  scope: 'core' | 'agent' | 'skill' | 'plugin' | 'admin';
  /** Where the command executes */
  execution: 'local' | 'socket' | 'rest' | 'hybrid';
  /** Whether this command is currently available (provider connected, RBAC allows, etc.) */
  available: boolean;
}

/** Full command manifest pushed from gateway to TUI */
export interface CommandManifest {
  commands: CommandDef[];
  skills: SkillCommandDef[];
  /** Manifest version — TUI compares to detect changes */
  version: number;
}

/** Skill registered as /skill:name */
export interface SkillCommandDef {
  /** Skill name (used as /skill:{name}) */
  name: string;
  description: string;
  /** Whether the skill is currently loaded and available */
  available: boolean;
}
```
### New Socket Events
```typescript
/** Payload for commands:manifest event */
export interface CommandManifestPayload {
  manifest: CommandManifest;
}

/** Payload for system:reload broadcast */
export interface SystemReloadPayload {
  commands: CommandDef[];
  skills: SkillCommandDef[];
  providers: string[];
  message: string;
}

/** Client request to execute a slash command via socket */
export interface SlashCommandPayload {
  conversationId: string;
  command: string;
  args?: string;
}

/** Server response to a slash command */
export interface SlashCommandResultPayload {
  conversationId: string;
  command: string;
  success: boolean;
  message?: string;
  data?: Record<string, unknown>;
}

// ── Add to ServerToClientEvents ──
export interface ServerToClientEvents {
  // ... existing events ...
  'commands:manifest': (payload: CommandManifestPayload) => void;
  'command:result': (payload: SlashCommandResultPayload) => void;
  'system:reload': (payload: SystemReloadPayload) => void;
}

// ── Add to ClientToServerEvents ──
export interface ClientToServerEvents {
  // ... existing events ...
  'command:execute': (data: SlashCommandPayload) => void;
}
```
---
## `/system` Override — Valkey Storage Design
### Key Schema
```
mosaic:session:{sessionId}:system → string (the override prompt text)
```
### TTL & Session Lifetime
**Problem:** Users leave TUI sessions open for days. Discord/Telegram bot sessions are effectively infinite (run until the bot stops). A fixed 24-hour TTL would silently destroy `/system` overrides.
**Solution: Activity-based TTL renewal.**
The Valkey key uses a generous base TTL (7 days) that is renewed on every interaction:
| Event | TTL Action |
| ------------------------------ | -------------------------------------- |
| `/system` override set | SET with 7-day TTL |
| Any agent prompt in session | EXPIRE renewed to 7 days |
| Socket heartbeat (every 5 min) | EXPIRE renewed to 7 days |
| `/system clear` | DEL immediately |
| `/new` (new session) | DEL immediately |
| Session destroyed (explicit) | DEL immediately |
| No activity for 7 days | Expires automatically (orphan cleanup) |
**Channel-type considerations:**
| Channel | Session Lifetime | TTL Strategy |
| --------------- | ---------------- | ----------------------------------------------------------------------- |
| TUI (websocket) | Minutes to days | Activity-renewed, cleaned on disconnect if session ends |
| Discord | Weeks to months | Activity-renewed, per-channel key (`mosaic:session:{channelId}:system`) |
| Telegram | Weeks to months | Activity-renewed, per-chat key |
| REST API | Per-request | N/A — no persistent session overrides |
For Discord/Telegram, the "session" is really the channel/chat — overrides persist as long as the bot is active in that channel. The key schema accommodates this:
```
mosaic:session:{sessionId}:system # TUI sessions
mosaic:channel:{channelType}:{channelId}:system # Plugin channel sessions
```
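A small helper can build the right key for either session type; the function name and union shape here are illustrative:

```typescript
type ChannelType = 'discord' | 'telegram';

// Builds the Valkey key for a /system override, for either a TUI
// session or a plugin channel session, per the schema above.
function systemOverrideKey(
  ref: { sessionId: string } | { channelType: ChannelType; channelId: string },
): string {
  return 'sessionId' in ref
    ? `mosaic:session:${ref.sessionId}:system`
    : `mosaic:channel:${ref.channelType}:${ref.channelId}:system`;
}
```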
### Lifecycle
1. User sends `/system Always respond in bullet points`
2. TUI emits `command:execute` with `{ command: "system", args: "Always respond in bullet points" }`
3. Gateway `CommandExecutorService` writes to Valkey: `SET mosaic:session:{sid}:system "Always respond in bullet points" EX 604800`
4. Gateway responds with `command:result` confirming the override is set
5. User later sends `/system Use numbered lists instead`
6. Gateway reads existing override from Valkey, appends new override
7. Gateway calls a condensation step: the accumulated overrides are distilled by the agent/gateway into a single coherent override using last-wins semantics (e.g., "numbered lists" supersedes "bullet points")
8. Gateway writes the condensed result back to Valkey: `SET mosaic:session:{sid}:system "<condensed>" EX 604800`
9. On every subsequent agent turn, gateway's prompt assembly reads the Valkey key and layers it as the highest-priority system prompt override (after platform defaults, after agent config, after user preferences)
10. On compaction, the override is NOT in the message context — it's re-injected from Valkey, so it survives intact
11. `/system` with no args → gateway reads the key and returns current override in `command:result`
12. `/system clear` → gateway DELs the key
13. On session end / `/new` → key is explicitly deleted
14. On every prompt, the key's TTL is refreshed via `EXPIRE`
### Override Condensation
When multiple `/system` overrides accumulate, the gateway condenses them to prevent unbounded growth:
```
User: /system Always respond in bullet points
Valkey: "Always respond in bullet points"
User: /system Use numbered lists instead
Gateway reads existing: "Always respond in bullet points"
Gateway appends: "Always respond in bullet points\n---\nUse numbered lists instead"
Gateway condenses (LLM call): "Use numbered lists for all responses"
Valkey: "Use numbered lists for all responses"
User: /system Include code examples when relevant
Gateway reads existing: "Use numbered lists for all responses"
Gateway appends + condenses: "Use numbered lists for all responses. Include code examples when relevant."
Valkey: "Use numbered lists for all responses. Include code examples when relevant."
```
The condensation step is a lightweight LLM call (cheap model, small context) that merges `$EXISTING_OVERRIDE + $NEW_OVERRIDE` into a single coherent instruction set, applying last-wins for contradictions. This prevents creep while preserving cumulative non-conflicting preferences.
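The condensation step might be factored so the LLM call is injected as a callback, keeping the accumulate-then-distill logic testable in isolation — `Condenser` and `applySystemOverride` are illustrative names, not the actual service API:

```typescript
/** Stand-in for the cheap-model LLM call that merges overrides. */
type Condenser = (combined: string) => Promise<string>;

async function applySystemOverride(
  existing: string | null,
  incoming: string,
  condense: Condenser,
): Promise<string> {
  if (!existing) return incoming; // first override: store as-is
  // Accumulate, then distill with last-wins semantics via the LLM.
  const combined = `${existing}\n---\n${incoming}`;
  return condense(combined);
}
```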
### Using Existing Queue Package
The `@mosaicstack/queue` package already provides `createQueue()` returning an ioredis handle on `redis://localhost:6380`. The `/system` storage will use the same Valkey instance directly via the redis handle — no queue semantics needed, just `SET`/`GET`/`DEL`/`EXPIRE`.
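Since only plain key operations are needed, the store can be written against a minimal slice of the ioredis surface; `SystemOverrideStore`, `KvLike`, and the constant name are illustrative, but the key schema and seven-day TTL mirror the design above:

```typescript
// Minimal slice of the ioredis API this store needs.
interface KvLike {
  set(key: string, value: string, mode: 'EX', seconds: number): Promise<unknown>;
  get(key: string): Promise<string | null>;
  del(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<number>;
}

const SEVEN_DAYS = 7 * 24 * 60 * 60; // 604800s, matching the EX in the lifecycle

class SystemOverrideStore {
  constructor(private kv: KvLike) {}

  private key(sessionId: string): string {
    return `mosaic:session:${sessionId}:system`;
  }

  set(sessionId: string, prompt: string) {
    return this.kv.set(this.key(sessionId), prompt, 'EX', SEVEN_DAYS);
  }
  get(sessionId: string) {
    return this.kv.get(this.key(sessionId));
  }
  clear(sessionId: string) {
    return this.kv.del(this.key(sessionId));
  }
  /** Called on each agent prompt / heartbeat to renew the TTL. */
  touch(sessionId: string) {
    return this.kv.expire(this.key(sessionId), SEVEN_DAYS);
  }
}
```

The ioredis handle returned by `createQueue()` satisfies this interface structurally, so no new connection is needed.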
---
## `/preferences` — Schema Design
### Storage
Postgres via `@mosaicstack/db`. The `preferences` table already exists in `packages/db/src/schema.ts` with the right shape:
```typescript
// Existing schema — already has category + key + value JSONB
export const preferences = pgTable('preferences', {
  id: uuid('id').primaryKey().defaultRandom(),
  userId: text('user_id')
    .notNull()
    .references(() => users.id, { onDelete: 'cascade' }),
  key: text('key').notNull(),
  value: jsonb('value').notNull(),
  category: text('category', {
    enum: ['communication', 'coding', 'workflow', 'appearance', 'general'],
  })
    .notNull()
    .default('general'),
  source: text('source'),
  createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
  updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
});
```
The existing category enum covers user-facing groupings. System-level categories (`session`, `safety`, `limits`) may need to be added to the enum via migration. A `mutable` boolean column should be added to distinguish user-adjustable preferences from enforcements.
The `key` is the unique identifier (e.g., `response.language`). The `category` is the display grouping. This keeps queries flat (`WHERE user_id = $1 AND key = $2`) while supporting grouped display (`GROUP BY category`).
### Platform Defaults
Mosaic ships with a seed set of preferences. Stored as defaults in code (not in DB until user mutates them). The `mutable` flag distinguishes user-adjustable preferences from platform enforcements. Enforcements are never written to the DB — they are applied in code unconditionally and cannot be overridden via `/preferences set`.
| Key | Default Value | Mutable | Category |
| -------------------------------- | ------------------------- | --------------- | -------- |
| `response.language` | `"auto"` (follows locale) | ✓ | response |
| `response.codeAnnotations` | `true` | ✓ | response |
| `safety.confirmDestructiveTools` | `true` | ✓ | safety |
| `session.autoCompactThreshold` | `0.80` | ✓ | session |
| `session.autoCompactEnabled` | `true` | ✓ | session |
| `limits.maxThinkingLevel` | (per role) | ✗ (enforcement) | limits |
| `limits.rateLimit` | (per role) | ✗ (enforcement) | limits |
`/preferences set` rejects writes to immutable keys with a clear message:
```
⚙ Cannot override "limits.maxThinkingLevel" — this is a platform enforcement. Contact your admin.
```
### Merge Logic
When constructing the effective preference set for a request:
1. Start with platform defaults (hardcoded)
2. Overlay with user preferences from DB (only mutable keys)
3. Enforcements from platform defaults are re-applied last (cannot be overridden)
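The three steps above can be sketched as a pure merge function — the `PrefDef` shape with its `mutable` flag follows the defaults table, but the names are illustrative:

```typescript
interface PrefDef {
  value: unknown;
  mutable: boolean; // false = platform enforcement
}

function effectivePreferences(
  defaults: Record<string, PrefDef>,
  userPrefs: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  // 1. Start with platform defaults
  for (const [key, def] of Object.entries(defaults)) out[key] = def.value;
  // 2. Overlay user preferences, mutable keys only
  for (const [key, value] of Object.entries(userPrefs)) {
    if (defaults[key]?.mutable !== false) out[key] = value;
  }
  // 3. Re-apply enforcements last — they can never be overridden
  for (const [key, def] of Object.entries(defaults)) {
    if (!def.mutable) out[key] = def.value;
  }
  return out;
}
```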
### Per-Project RBAC (Future)
Each project can define user-level RBAC for shared projects. This affects which preferences are visible/editable and which commands are available. The `preferences` table is user-global; project-scoped preference overrides would be a separate table (`project_user_preferences`) with a `project_id` foreign key. Deferred to the RBAC phase.
---
## Hot Reload — Reloadable vs Static Components
| Component | Hot-Reloadable | Reason |
| -------------------------- | -------------- | -------------------------------------- |
| Skills | ✓ | File-based, scan + re-register |
| Plugins (discord/telegram) | ✓ | Module-level lifecycle hooks |
| Command registry | ✓ | Derived from skills + plugins + config |
| Preference defaults | ✓ | Config/DB read |
| Provider configs | ✓ | DB/env read |
| NestJS modules/routes | ✗ | Fastify listener can't hot-swap routes |
| DB schema | ✗ | Requires migration + restart |
| Auth config | ✗ | BetterAuth bootstraps once |
### Reload Triggers
| Trigger | Use Case |
| ---------------------------- | ------------------------------------------------- |
| `/reload` slash command | Admin in TUI session |
| `POST /api/admin/reload` | REST call from CI/CD, webhook, or admin dashboard |
| `SIGHUP` signal | Ops/systemd convention |
| File watcher (dev mode only) | Watch skills directories for changes |
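The `SIGHUP` trigger is a one-liner in Node; sketched here with the reload callback injected (`registerReloadSignal` is illustrative wiring, not the actual bootstrap code):

```typescript
/** Wire SIGHUP to the reload service (ops/systemd convention). */
function registerReloadSignal(reload: () => Promise<void>): void {
  process.on('SIGHUP', () => {
    // Fire and forget: ReloadService holds its own lock and
    // rolls back to the snapshot on failure.
    void reload().catch((err) => console.error('reload failed', err));
  });
}
```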
---
## TUI-Side Command Parsing
### Input Interception
In `InputBar.handleSubmit`, before sending to socket:
```typescript
function handleSubmit(value: string) {
  const trimmed = value.trim();

  // Slash command detection
  if (trimmed.startsWith('/')) {
    const parsed = parseSlashCommand(trimmed);
    if (parsed) {
      executeCommand(parsed);
      return;
    }
    // Unknown command — show error inline, don't send to agent
    showSystemMessage(`Unknown command: ${trimmed.split(' ')[0]}`);
    return;
  }

  // Normal message — send to agent
  socket.sendMessage(value);
}
```
### Parse Function
```typescript
interface ParsedCommand {
  command: string;     // "model", "skill:brave-search"
  args: string | null; // "claude-4" or null
  raw: string;         // "/model claude-4"
}

function parseSlashCommand(input: string): ParsedCommand | null {
  const match = input.match(/^\/([a-z][a-z0-9:_-]*)\s*(.*)?$/i);
  if (!match) return null;
  return {
    command: match[1]!,
    args: match[2]?.trim() || null,
    raw: input,
  };
}
```
### Execution Routing
```typescript
function executeCommand(parsed: ParsedCommand) {
  const def = findCommand(parsed.command); // search manifest + local commands
  if (!def) {
    showSystemMessage(`Unknown command: /${parsed.command}`);
    return;
  }
  if (!def.available) {
    showSystemMessage(`Command /${parsed.command} is not available right now`);
    return;
  }
  switch (def.execution) {
    case 'local':
      executeLocal(parsed, def); // /help, /stop, /cost
      break;
    case 'socket':
      socket.emit('command:execute', {
        conversationId,
        command: parsed.command,
        args: parsed.args ?? undefined,
      });
      break;
    case 'rest':
      executeRest(parsed, def); // /rename, /preferences, /export
      break;
    case 'hybrid':
      executeHybrid(parsed, def); // /provider login, /status
      break;
  }
}
```
### Local-Only Commands (Work Offline)
These commands must function even when the gateway is disconnected:
| Command | Behavior When Offline |
| --------- | ---------------------------------------------------------- |
| `/help` | Show commands from cached manifest + local commands |
| `/stop` | Cancel local streaming state |
| `/cost` | Show locally tracked token/cost accumulator |
| `/status` | Show local state (model, connection: disconnected, tokens) |
### System Messages
Command output renders as `role: 'system'` messages — visually distinct from user and assistant messages (e.g., dimmed, no avatar, centered or left-aligned with a `⚙` prefix). This requires adding `'system'` to the `Message.role` union type in `use-socket.ts`.
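The union extension is a one-line change; sketched here with an assumed minimal shape of `Message` (the real type in `use-socket.ts` has more fields):

```typescript
// Assumed minimal shape for illustration; only the role union changes.
interface Message {
  id: string;
  role: 'user' | 'assistant' | 'system'; // 'system' is the new member
  content: string;
}

/** System messages render dimmed with a gear prefix and no avatar. */
function renderPrefix(msg: Message): string {
  return msg.role === 'system' ? `⚙ ${msg.content}` : msg.content;
}
```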
---
## Short Aliases
| Alias | Command |
| ------- | -------------- |
| `/n` | `/new` |
| `/m` | `/model` |
| `/t` | `/thinking` |
| `/a` | `/agent` |
| `/s` | `/status` |
| `/h` | `/help` |
| `/pref` | `/preferences` |
Aliases are resolved in `findCommand()` before manifest lookup.
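Alias resolution can be a flat map consulted before the manifest lookup; the `findCommand` internals here are an illustrative sketch:

```typescript
// Alias → canonical command name, from the table above.
const ALIASES: Record<string, string> = {
  n: 'new', m: 'model', t: 'thinking', a: 'agent',
  s: 'status', h: 'help', pref: 'preferences',
};

function findCommand(
  name: string,
  manifest: { name: string }[],
): { name: string } | undefined {
  const canonical = ALIASES[name] ?? name; // resolve alias first
  return manifest.find((c) => c.name === canonical);
}
```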
---
## Implementation Phases
### Phase 1: Types + Local Command Parsing (no gateway changes)
1. Add `CommandDef`, `CommandManifest`, new socket events to `@mosaicstack/types`
2. Add `parseSlashCommand()` utility to `packages/cli`
3. Add `role: 'system'` to `Message` type, render system messages in `MessageList`
4. Implement local-only commands: `/help`, `/stop`, `/cost`, `/status` (local state only)
5. Wire command parsing into `InputBar.handleSubmit` — intercept `/` prefix
6. Hardcode initial command manifest in TUI (temporary, replaced in Phase 2)
### Phase 2: Gateway Command Registry
1. Create `CommandRegistryService` in `apps/gateway`
2. Register core commands with `CommandDef` metadata
3. Emit `commands:manifest` on socket connect (alongside `session:info`)
4. Handle `command:execute` socket event — route to appropriate service
5. Implement socket-executed commands: `/model`, `/thinking`, `/new`, `/clear`, `/compact`, `/retry`
6. Implement REST-executed commands: `/rename`, `/resume`, `/history`, `/export`
7. TUI replaces hardcoded manifest with gateway-provided manifest
### Phase 3: Preferences & System Overrides
1. Extend the existing `preferences` table in `@mosaicstack/db` — add the `mutable` column and extra categories via Drizzle schema + migration
2. Create `PreferencesService` in gateway — CRUD + defaults + enforcement logic
3. Implement `/preferences` command (REST-executed)
4. Implement `/system` command — Valkey storage, session-scoped
5. Wire system override into prompt assembly (injected from Valkey on each turn)
6. Wire user preferences into prompt assembly (merged with defaults)
### Phase 4: Agent, Provider, Mission Commands
1. Implement `/agent` — switch agent config mid-session
2. Implement `/provider` — list providers, login/logout subcommands (OAuth flow)
3. Implement `/mission` subcommands — status, list, tasks, set
4. Implement `/prdy` — launch PRD wizard in-session
5. Implement `/tools` — list available tools from agent config
### Phase 5: Hot Reload
1. Define `MosaicPlugin` lifecycle interface
2. Refactor discord/telegram plugins to implement `onLoad`/`onUnload`
3. Create `ReloadService` in gateway — scan, diff, broadcast
4. Implement `/reload` admin command
5. Add `POST /api/admin/reload` REST endpoint
6. Add `SIGHUP` handler
7. Emit `system:reload` to all connected sessions
8. TUI handles `system:reload` — patch manifest, show notification
### Phase 6: Session Garbage Collection
1. Create `SessionGCService` in `apps/gateway` — three tiers: `collect(sessionId)`, `sweepOrphans()`, `fullCollect()`
2. Wire `collect()` into `AgentService.destroySession()` — clean Valkey keys, demote logs
3. Wire `collect()` into `/clear` and `/new fresh` command handlers
4. Implement `fullCollect()` in `onModuleInit` — runs on every gateway cold start
5. Add sweep GC cron job to existing `CronService` (daily at 4am default, configurable via `SESSION_GC_CRON`)
6. Implement `/gc` slash command — triggers `sweepOrphans()`, user-scoped for non-admins
7. Add `GCResult` / `GCSweepResult` / `FullGCResult` types
8. Expose last GC run stats in `/status` for admin users
9. Log all GC activity via OTEL spans
### Phase 7: Workspaces
1. Define `MOSAIC_ROOT` configuration — env var, config file, install-time prompt
2. Create `WorkspaceService` in gateway — directory creation, git init/clone, path resolution
3. Create `ProjectBootstrapService` — orchestrates the full project creation sequence (DB + workspace + agent + Discord channel)
4. Wire `sandboxDir` in `AgentService.createSession()` to resolve from workspace path instead of env var
5. Harden file/git/shell tools — strict path validation, reject resolved paths outside `sandboxDir`
6. Add `repoUrl` optional field to project creation API
7. Wire `/prdy` output to `docs/` directory in workspace
8. Add workspace cleanup to project deletion and user deletion flows
9. Port Gatekeeper service from old codebase — `isSystem: true` agent, PR review/merge authority
### Phase 8: Autocomplete & Polish
1. Autocomplete provider in TUI — triggers on `/` keystroke, opens sidebar-style command list
2. Filter commands as user types (fuzzy match on name + aliases)
3. Show arg hints for selected command
4. Skill commands (`/skill:*`) dynamically populated from manifest
5. Up-arrow input history recall in InputBar
---
## File Impact Summary
### New Files
| Path | Description |
| --------------------------------------------------------- | --------------------------------------------------------------- |
| `packages/types/src/commands/index.ts` | CommandDef, CommandManifest, socket event types |
| `packages/cli/src/tui/commands/parse.ts` | parseSlashCommand() utility |
| `packages/cli/src/tui/commands/registry.ts` | Client-side command registry + local commands |
| `packages/cli/src/tui/commands/local/*.ts` | Local command handlers (help, stop, cost, status) |
| `apps/gateway/src/commands/command-registry.service.ts` | Gateway CommandRegistryService |
| `apps/gateway/src/commands/command-executor.service.ts` | Gateway command execution routing |
| `apps/gateway/src/commands/commands.module.ts` | NestJS module |
| `apps/gateway/src/preferences/preferences.service.ts` | User preferences CRUD + defaults + enforcement |
| `apps/gateway/src/preferences/preferences.module.ts` | NestJS module |
| `apps/gateway/src/reload/reload.service.ts` | Hot reload orchestration |
| `apps/gateway/src/gc/session-gc.service.ts` | Session garbage collection (on-demand + sweep + full) |
| `apps/gateway/src/gc/gc.module.ts` | NestJS module |
| `apps/gateway/src/workspace/workspace.service.ts` | Workspace directory management, git init/clone, path resolution |
| `apps/gateway/src/workspace/workspace.module.ts` | NestJS module |
| `apps/gateway/src/workspace/project-bootstrap.service.ts` | Orchestrates full project creation sequence |
| `apps/gateway/src/gatekeeper/gatekeeper.service.ts` | PR review/merge agent service (ported from old codebase) |
| `apps/gateway/src/gatekeeper/gatekeeper.module.ts` | NestJS module |
### Modified Files
| Path | Changes |
| -------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| `packages/types/src/chat/events.ts` | Add command socket events to Server/ClientToServerEvents |
| `packages/types/src/index.ts` | Re-export commands types |
| `packages/cli/src/tui/hooks/use-socket.ts` | Add `role: 'system'` to Message, handle command/manifest events |
| `packages/cli/src/tui/components/input-bar.tsx` | Intercept `/` prefix, route to command parser, up-arrow history |
| `packages/cli/src/tui/components/message-list.tsx` | Render system messages distinctly |
| `packages/cli/src/tui/app.tsx` | Wire command manifest state, pass to input bar |
| `apps/gateway/src/chat/chat.gateway.ts` | Handle `command:execute`, emit `commands:manifest` |
| `apps/gateway/src/agent/agent.service.ts` | Call `SessionGCService.collect()` from `destroySession()`, resolve `sandboxDir` from workspace |
| `apps/gateway/src/log/cron.service.ts` | Add GC sweep cron job alongside existing summarization/tier crons |
| `apps/gateway/src/agent/tools/shell-tools.ts` | Harden path validation — reject resolved paths outside sandboxDir |
| `apps/gateway/src/agent/tools/file-tools.ts` | Harden path validation — reject resolved paths outside sandboxDir |
| `apps/gateway/src/agent/tools/git-tools.ts` | Harden path validation — reject resolved paths outside sandboxDir |
### Note: `preferences` Table Already Exists
The `preferences` table already exists in `packages/db/src/schema.ts` with `category` and `key` columns. No new migration is needed for basic preference storage — only schema adjustments if the `mutable` column or additional indexes are required.
---
## Session Garbage Collection
### Problem
When a session is terminated or deleted, artifacts accumulate across multiple stores:
| Store | Artifacts | Current Cleanup |
| --------------------------- | ---------------------------------------------------------------- | -------------------------------------------------------------- |
| **Valkey** | `/system` overrides, future session-scoped keys | TTL-based expiry only |
| **PG — agent_logs** | Session-scoped log entries (`session_id`) | Tier management (hot→warm→cold) but no deletion on session end |
| **PG — messages** | Conversation messages | CASCADE on conversation delete, but sessions ≠ conversations |
| **PG — summarization_jobs** | Completed/failed job records | None currently |
| **In-memory** | `AgentService.sessions` Map, Pi session objects, event listeners | `destroySession()` clears these |
| **Future: temp files** | Files created by agent shell/file tools in sandbox | None |
### Current State
`AgentService.destroySession()` handles in-memory cleanup (unsubscribe, dispose piSession, clear listeners/channels, delete from Map). But it does NOT clean up:
- Valkey keys for the session
- PG agent_logs tied to the session
- Temporary files in the sandbox directory
### Proposed: SessionGarbageCollector
A service that runs both **on-demand** (triggered by session termination) and **periodically** (cron sweep for orphans).
#### GC Tiers
The GC system operates at three levels of scope, all using the same underlying `SessionGCService`:
| Tier | Trigger | Scope | Description |
| -------------- | ------------------------------------------ | -------------- | -------------------------------------------------------------------------------------- |
| **Session GC** | `destroySession()`, `/clear`, `/new fresh` | Single session | Clean artifacts for one specific session |
| **Sweep GC** | Cron (daily), `/gc` command | All orphans | Scan for and collect orphaned artifacts across all stores |
| **Full GC** | Gateway cold start / restart | Everything | Aggressive pass — assumes no sessions survived the restart, cleans all stale artifacts |
#### Session GC (Single Session)
Triggered when a session is terminated, cleared, or explicitly collected. Called from:
- `AgentService.destroySession(sessionId)`
- `/clear` command handler (context reset includes artifact cleanup)
- `/new fresh` variant (new session + GC the old one)
```typescript
@Injectable()
export class SessionGCService {
constructor(
@Inject(QUEUE_HANDLE) private readonly queue: QueueHandle,
@Inject(BRAIN) private readonly brain: Brain,
) {}
/**
* Immediate cleanup for a single session.
*/
async collect(sessionId: string): Promise<GCResult> {
const result: GCResult = { sessionId, cleaned: {} };
    // 1. Valkey: delete all session-scoped keys
    // (KEYS is acceptable for one session's narrow pattern; prefer SCAN at scale)
const valkeyKeys = await this.queue.redis.keys(`mosaic:session:${sessionId}:*`);
if (valkeyKeys.length > 0) {
await this.queue.redis.del(...valkeyKeys);
result.cleaned.valkeyKeys = valkeyKeys.length;
}
// 2. PG: mark agent_logs for this session as cold tier
// (don't delete — tier management handles archival/deletion)
const logsUpdated = await this.brain.logs.markSessionCold(sessionId);
result.cleaned.logsDemoted = logsUpdated;
// 3. Future: clean sandbox temp files
// await this.cleanSandboxTempFiles(sessionId);
return result;
}
}
```
#### Sweep GC (Orphan Collection)
Scans all stores for artifacts whose owning session no longer exists. Triggered by:
- Daily cron job (configurable via `SESSION_GC_CRON`, default `0 4 * * *`)
- `/gc` slash command (admin or user — user-scoped sweep only collects their own orphans)
```typescript
// In CronService.onModuleInit()
const gcSchedule = process.env['SESSION_GC_CRON'] ?? '0 4 * * *'; // daily at 4am
this.tasks.push(
cron.schedule(gcSchedule, () => {
this.sessionGC.sweepOrphans().catch((err) => {
this.logger.error(`Session GC sweep failed: ${err}`);
});
}),
);
```
| Sweep Target | Detection | Action |
| -------------------------------------- | -------------------------------------------------------------------------------------------------------- | --------------------------- |
| Valkey `mosaic:session:*` keys | Session ID not in active sessions AND key TTL remaining < threshold | DEL |
| Valkey `mosaic:channel:*` keys | Channel no longer active in any plugin | DEL |
| PG `agent_logs` with no active session | `session_id NOT IN (active session IDs)` AND `tier = 'hot'` AND `created_at < now() - interval '7 days'` | Demote to cold |
| PG `summarization_jobs` | `status IN ('completed', 'failed')` AND `created_at < now() - interval '30 days'` | DELETE |
| Sandbox temp directories | Directory exists but session ID not in active sessions | rm -rf (with safety checks) |
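The session-key row of the sweep table can be sketched as a pure function (key pattern from this plan; the TTL-threshold check is omitted for brevity, and the helper name is illustrative):

```typescript
// Sketch of orphan detection for `mosaic:session:*` keys. Pure function so it
// can be exercised without a live Valkey connection; the sweep would feed it
// the result of a SCAN plus the active in-memory session IDs.
export function findOrphanSessionKeys(
  keys: string[],
  activeSessionIds: Set<string>,
): string[] {
  return keys.filter((key) => {
    const match = /^mosaic:session:([^:]+):/.exec(key);
    return match !== null && !activeSessionIds.has(match[1]!);
  });
}
```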
#### Full GC (Cold Start)
Runs once during gateway bootstrap (`onModuleInit`), after all services are initialized but before accepting connections. A cold start means the in-memory session map is empty — every `mosaic:session:*` key in Valkey is an orphan from the previous process lifetime.
```typescript
@Injectable()
export class SessionGCService implements OnModuleInit {
async onModuleInit(): Promise<void> {
this.logger.log('Running full GC on cold start...');
const result = await this.fullCollect();
this.logger.log(
`Full GC complete: ${result.valkeyKeys} Valkey keys, ` +
`${result.logsDemoted} logs demoted, ` +
`${result.tempFilesRemoved} temp dirs removed ` +
`(${result.duration}ms)`,
);
}
/**
* Aggressive collection — assumes no sessions survived restart.
* All session-scoped Valkey keys are orphans.
* All hot-tier logs older than threshold are demoted.
*/
async fullCollect(): Promise<FullGCResult> {
const start = Date.now();
// 1. Valkey: delete ALL session-scoped keys (no active sessions exist)
const sessionKeys = await this.queue.redis.keys('mosaic:session:*');
if (sessionKeys.length > 0) {
await this.queue.redis.del(...sessionKeys);
}
// 2. Valkey: channel keys are NOT collected on cold start
// (plugins may reconnect and resume channels)
// 3. PG: demote all hot-tier logs older than 24h
// (recent logs may still be useful for debugging the restart)
const logsDemoted = await this.brain.logs.demoteStaleHotLogs('24 hours');
// 4. PG: purge old completed/failed summarization jobs
const jobsPurged = await this.brain.summarization.purgeOldJobs('30 days');
// 5. Sandbox: clean all temp directories
const tempFilesRemoved = await this.cleanAllSandboxTempDirs();
return {
valkeyKeys: sessionKeys.length,
logsDemoted,
jobsPurged,
tempFilesRemoved,
duration: Date.now() - start,
};
}
}
```
**Channel keys are excluded from full GC** — Discord/Telegram plugins reconnect after a gateway restart and may resume using existing channel overrides. Only the sweep GC (which checks plugin state) collects dead channel keys.
#### `/clear` and `/new` GC Variants
| Command | GC Behavior |
| ------------ | --------------------------------------------------------------------------------------------------- |
| `/clear` | Session GC on current session (clean Valkey overrides + demote logs), then reset context |
| `/new` | Create new session. Old session's in-memory state is dropped but artifacts linger until sweep/TTL |
| `/new fresh` | Create new session + Session GC on the old one. Clean break — nothing left behind |
| `/gc` | Trigger a sweep GC. Admin sees system-wide results; regular user sees only their orphaned artifacts |
#### GC Result Reporting
GC results are logged via OTEL and surfaced contextually:
```typescript
interface GCResult {
sessionId: string;
cleaned: {
valkeyKeys?: number;
logsDemoted?: number;
tempFilesRemoved?: number;
};
}
interface GCSweepResult {
orphanedSessions: number;
totalCleaned: GCResult[];
duration: number;
}
interface FullGCResult {
valkeyKeys: number;
logsDemoted: number;
jobsPurged: number;
tempFilesRemoved: number;
duration: number;
}
```
| Context | Output |
| ----------------- | ----------------------------------------------------------------------------------------- |
| Cold start | Logger: `Full GC complete: 14 Valkey keys, 230 logs demoted (340ms)` |
| `/clear` | System message in TUI: `⚙ Session cleared. Cleaned 2 Valkey keys, 8 logs demoted.` |
| `/new fresh` | System message: `⚙ New session started. Previous session artifacts collected.` |
| `/gc` | System message: `⚙ GC sweep: 3 orphaned sessions, 7 Valkey keys, 42 logs demoted (120ms)` |
| Cron sweep | Logger only (no TUI output) |
| `/status` (admin) | Includes last GC run time + result summary |
---
## Resolved Design Decisions
1. **Autocomplete UX** — Use the TUI sidebar panel (same pattern as conversation sidebar). Typing `/` opens a filterable command list in the sidebar. Consistent with existing UI patterns.
2. **`/mission set`** — Yes, injects mission context into system prompt. Requires user confirmation. See `/mission` subcommands section above.
3. **`/preferences` granularity** — Flat key-value with a `category` column for display grouping. Already exists in schema as `preferences` table. See storage section above.
4. **`/system` multiple overrides** — Accumulate then condense. Gateway appends new override to existing, runs a lightweight LLM condensation pass with last-wins semantics, writes condensed result back to Valkey. Prevents unbounded growth. See "Override Condensation" section above.
5. **Up-arrow input history** — Needs implementation in InputBar. Independent of `/retry`. Ubiquitous UX expectation — must ship.
6. **RBAC granularity** — Will support roles beyond admin/user. Per-project user RBAC for shared projects (team interaction). Deferred to RBAC phase but architecture accommodates it.
## Resolved Open Questions (Round 2)
1. **Workspace-based file tracking** — Structured workspace hierarchy under `$MOSAIC_ROOT/.workspaces/` provides session-scoped directories, project-scoped repos, and user isolation. Replaces ad-hoc temp file tracking. See "Workspaces" section below.
2. **GC on conversation delete** — Aggressive. If a user deletes a conversation, they don't want the data. Important artifacts (files, DB entries) have already been persisted to workspaces/brain. GC destroys the agent session if active, cleans all Valkey keys, demotes logs. The conversation's messages cascade-delete via FK.
3. **Discord/Telegram channel GC** — On bot disconnect from channel + admin command + `/gc` invoked in the channel itself. TTL remains as a safety net for anything the explicit triggers miss.
4. **Condensation model** — Configurable via `/settings` command and web UI settings page, stored in DB as a system setting. Defaults to a fast/cheap model. Admin-adjustable.
## Workspaces
### Concept
Every user gets an isolated workspace rooted at `$MOSAIC_ROOT/.workspaces/<user_id>/`. Projects, sessions, repos, and planning artifacts are organized under this hierarchy. This replaces the current `AGENT_FILE_SANDBOX_DIR` / `process.cwd()` fallback with a structured, multi-user, project-aware filesystem layout.
All project workspaces are git repositories to enable rollback and change tracking. Mistakes happen — git history provides the safety net.
### Directory Structure
```
$MOSAIC_ROOT/ # e.g. /opt/mosaic (Linux), configurable on install
└── .workspaces/
└── <user_id>/
└── <project_id>/
├── <git_repo>/ # Full working copy (primary agent workspace)
│ └── docs/
│ ├── PRD-<name>.md # PRD content (working copy, DB is SOT for state)
│ ├── <prd_name>-TASKS.md # Task breakdown
│ ├── plans/ # Implementation plans
│ └── reports/ # Agent reports, summaries
└── <git_repo>-worktrees/ # Git worktrees for parallel agent work
├── <branch-name>/ # Worktree checkouts created by agents
└── ...
```
### Repo Type
**Full working copies** (not bare). Agents need to read/write files, which requires a checkout. Git worktrees also require a non-bare parent repo. Bare repos are for serving as remotes — that's Gitea/GitHub's job, not Mosaic's.
### `$MOSAIC_ROOT` Ownership
- Owned by the **mosaic service user** (e.g., `mosaic:mosaic`)
- Centrally located on install (prompted with platform-specific defaults):
  - Linux: `/opt/mosaic`
  - macOS: `/usr/local/mosaic` or `~/Library/Application Support/Mosaic`
  - Windows: `C:\ProgramData\Mosaic`
- Docker deployments: mounted volume / bind mount at container path `/opt/mosaic`
- The gateway process runs as the mosaic service user and has write access to all workspaces
### Project Creation Behaviors
When a project is created, the gateway executes a consistent initialization sequence:
1. **Create the project** in DB (`projects` table)
2. **Create the project workspace** — `$MOSAIC_ROOT/.workspaces/<user_id>/<project_id>/`
3. **Initialize git repo** — `git init` the project workspace (or `git clone <url>` if remote upstream provided)
4. **Create the agent config** — default project agent in DB (`agents` table), linked to the project
5. **Create docs structure** — `docs/`, `docs/plans/`, `docs/reports/`
6. **If Discord is configured** — create the Discord category/channel for the project, link the agent to the channel
7. **Set agent sandboxDir** — point the agent session's sandbox to the workspace repo
This is handled by a `ProjectBootstrapService` in the gateway — a single orchestrator that ensures consistency. Partial failures roll back cleanly.
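The rollback mechanics can be sketched as a step runner where each bootstrap step pairs its action with a compensating undo (step shape and names are illustrative, not the real service API; synchronous for brevity where the real steps would be async):

```typescript
// Each bootstrap step (DB row, workspace dir, git init, agent config, ...)
// pairs its action with a compensating undo.
export interface BootstrapStep {
  name: string;
  run(): void;
  undo(): void;
}

// Run steps in order; on failure, unwind completed steps in reverse so a
// partial project creation leaves nothing behind.
export function runBootstrap(steps: BootstrapStep[]): void {
  const done: BootstrapStep[] = [];
  for (const step of steps) {
    try {
      step.run();
      done.push(step);
    } catch (err) {
      for (const completed of [...done].reverse()) completed.undo();
      throw err;
    }
  }
}
```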
### Optional Remote Clone
Projects may optionally clone from a remote upstream on creation:
- `POST /api/projects` accepts an optional `repoUrl` field
- `/project create --repo <url>` in CLI
- If provided, `git clone <url>` replaces `git init`
- Remote tracking is configured automatically (`origin` → upstream)
- Projects without a remote are local-only. Encouraged to add one later but not required.
### Agent Working Model
| Agent | Works in | Purpose |
| ------------------------- | -------------------------------- | ---------------------------------------------------------------------------------------- |
| **Primary project agent** | `<git_repo>/` | Main development, PRD creation, task execution |
| **Parallel agents** | `<git_repo>-worktrees/<branch>/` | Independent tasks that can run concurrently without conflicts |
| **Gatekeeper agent** | Read-only access to PRs/diffs | PR review and merge decisions. `isSystem: true`. Outside project agents' trust boundary. |
The primary agent works directly in the repo. When the agent determines that parallel work is needed (e.g., multiple independent tasks), it creates worktrees via git tools. This is agent-directed — the agent decides when parallelism is appropriate based on task dependencies.
### Git Responsibility Model
Agents are given framework instructions for git operations:
| Operation | Responsible | Notes |
| --------------------- | ----------- | -------------------------------------------------------------------------- |
| `commit` | Agent | Agents commit their own work with meaningful messages |
| `push` | Agent | Push to remote when configured. Agent decides timing. |
| `pull` / `fetch` | Agent | Keep up to date with upstream before starting work |
| `branch` | Agent | Create feature branches for tasks |
| `worktree add/remove` | Agent | Manage parallel workspaces |
| **PR creation** | Agent | Agent creates the PR (via Gitea/GitHub/GitLab API tools) |
| **PR review** | Gatekeeper | Specialized agent service. Cannot be self-approved by the authoring agent. |
| **PR merge** | Gatekeeper | Only after quality gates pass (lint, typecheck, tests) |
### Gatekeeper Service
A specialized agent service responsible for PR review and merge actions. Critical design constraints:
- **Isolated trust boundary** — project agents CANNOT approve or merge their own PRs
- **System agent** — `isSystem: true`, not editable by users
- **Read-only code access** — can read PR diffs, run quality checks, but cannot modify code
- **Quality gates** — lint, typecheck, test results must pass before merge is allowed
- **Configurable strictness** — projects can define required checks, minimum review depth
- **Existed in old codebase** — to be ported/adapted for mosaic-mono-v1
### PRD/Task Files vs DB
**DB is source of truth** for structured state. Files are working copies for agent interaction.
| Layer | What it stores | Accessed by |
| -------------------------- | ------------------------------------------------------------------- | ---------------------------------------------------- |
| **DB** (source of truth) | Mission status, phase, task status, assignees, timestamps, metadata | API, dashboard, `/mission` command, `/status` |
| **Files** (working copies) | PRD prose, specifications, plans, reports — the actual content | Agents (read/write naturally), users (review in git) |
The `/prdy` wizard writes markdown files to `docs/` AND syncs structured metadata (mission name, tasks, phase) to the DB. Both representations exist but serve different purposes:
- Dashboard: "Mission Auth System — implementation phase — 4/7 tasks done" → from DB
- Agent starting work: reads `docs/PRD-auth-system.md` → from disk
- Version history of PRD changes: `git log docs/PRD-auth-system.md` → from git
### RBAC & Filesystem Security
**Gateway-enforced RBAC** — the gateway process (running as mosaic service user) owns all workspace files. RBAC is enforced at the application layer, not the OS layer.
Rationale:
- Gateway is already the single API surface (Architecture Rule #1) — it mediates all operations
- Docker deployment makes OS-level per-user file ownership impractical
- File/git/shell tools already accept `sandboxDir` and scope to it
#### Sandbox Escape Prevention
The primary risk: an agent with shell tool access running commands like `cat /opt/mosaic/.workspaces/other_user/...` bypasses gateway RBAC.
**v1 mitigations:**
- **Chroot per session** — agent tool processes are chrooted to the workspace directory. Cannot see outside. See "Chroot Agent Sandboxing" section below.
- Strict path validation in ALL file/git/shell tools — defense-in-depth alongside chroot
- Shell tool command auditing — log all commands via OTEL
- `sandboxDir` set to the project workspace on session creation
**v2 mitigations (multi-tenant hardening):**
- Container-per-session for untrusted multi-tenant deployments
- AppArmor/SELinux profiles restricting the gateway process's file access patterns
#### Permission Inheritance
Within the gateway's RBAC layer:
- Users can only access workspaces for projects they are members of
- Project membership is defined in DB (project_members table or via organization RBAC)
- File tool operations check: `user.id` is member of `project.id` that owns the target workspace path
- Admin users can access all workspaces
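The path-to-owner resolution behind that membership check can be sketched as follows (pre-teams layout `.workspaces/<user_id>/<project_id>/` assumed; the helper name is hypothetical):

```typescript
import path from 'node:path';

// Resolve a target path back to the workspace owner, or null if the path
// escapes `.workspaces/` entirely. The RBAC layer would then verify that the
// requesting user is a member of the returned project.
export function workspaceOwner(
  mosaicRoot: string,
  target: string,
): { userId: string; projectId: string } | null {
  const base = path.join(mosaicRoot, '.workspaces');
  const rel = path.relative(base, path.resolve(target));
  if (rel.startsWith('..') || path.isAbsolute(rel)) return null;
  const [userId, projectId] = rel.split(path.sep);
  return userId && projectId ? { userId, projectId } : null;
}
```

Note that `path.resolve` normalizes `..` traversal before ownership is derived, so a traversal into another user's tree resolves to that user's IDs and then fails the membership check.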
### Workspace Lifecycle
| Event | Filesystem Action |
| ----------------- | ------------------------------------------------------------------------------------------- |
| User registration | Create `$MOSAIC_ROOT/.workspaces/<user_id>/` |
| Project creation | Create `<user_id>/<project_id>/`, init git repo, create docs structure |
| Session start | Set `sandboxDir` to project workspace (no new directory needed — sessions work in the repo) |
| Worktree creation | Agent creates `<git_repo>-worktrees/<branch>/` via git tools |
| Session end | GC cleans session artifacts (Valkey keys, logs). Workspace files persist (they're in git). |
| Project deletion | Delete `<user_id>/<project_id>/` recursively. GC cleans DB artifacts. |
| User deletion | Delete `$MOSAIC_ROOT/.workspaces/<user_id>/` recursively. CASCADE handles DB. |
### Tooling
Existing tool sets in `apps/gateway/src/agent/tools/`:
- `createFileTools(sandboxDir)` — file read/write/list ✓
- `createGitTools(sandboxDir)` — git operations ✓
- `createShellTools(sandboxDir)` — shell commands ✓
- `createWebTools()` — HTTP requests ✓
- `createBrainTools(brain)` — DB queries ✓
- `createCoordTools(coord)` — task coordination ✓
- `createMemoryTools(memory)` — memory/insights ✓
Additional tool sets needed for workspace workflows:
- **Gitea/GitHub/GitLab API tools** — PR creation, review comments, merge, branch protection
- **Woodpecker/CI tools** — trigger builds, check status, read logs
- **Docker/Portainer tools** — container management, deployment
- These are registered as additional `ToolDefinition[]` sets, same pattern as existing tools
`@mosaicstack/prdy` already provides the PRD wizard tooling — the workspace structure gives it a canonical output location (`docs/PRD-<name>.md`).
### Task Queue & Orchestration
#### Current State: Two Disconnected Systems
There are currently two parallel systems for task management:
1. **`@mosaicstack/coord`** (file-based) — missions stored as `mission.json`, tasks in `TASKS.md`, file locks, session tracking, subprocess spawning. Built for single-machine orchestrator pattern.
2. **PG tables** (`tasks`, `mission_tasks`, `missions`) — DB-backed CRUD with status, priority, assignee, project/mission FKs. Exposed via REST API and Brain repos.
These are not connected. `@mosaicstack/coord` reads/writes files. The DB tables are managed via MissionsController. An agent using `coord_mission_status` gets file-based data; the dashboard shows DB data.
#### Vision: `@mosaicstack/queue` as the Unified Task Layer
`@mosaicstack/queue` becomes the task orchestration service — not just a Valkey queue primitive, but the coordinator between agents, DB, and workspace files:
```
┌──────────────────────────────────────────────┐
│ @mosaicstack/queue │
│ (Task Orchestration Service) │
│ │
│ ┌─────────────────┐ ┌──────────────────┐ │
│ │ DB (PG) │ │ Files (workspace)│ │
│ │ - tasks table │ │ - TASKS.md │ │
│ │ - missions │ │ - PRDs │ │
│ │ - mission_tasks│ │ - plans/ │ │
│ │ (source of │ │ (working copies │ │
│ │ truth) │ │ for agents) │ │
│ └────────┬────────┘ └────────┬─────────┘ │
│ │ sync │ │
│ └────────┬───────────┘ │
│ │ │
│ ┌─────────────────┴──────────────────┐ │
│ │ Valkey │ │
│ │ - Task assignment queue │ │
│ │ - Agent claim locks │ │
│ │ - Status pub/sub │ │
│ └────────────────────────────────────┘ │
└──────────────┬───────────────────────────────┘
┌──────────┼──────────┐
▼ ▼ ▼
Agent A Agent B Gatekeeper
(primary) (worktree) (PR review)
```
**Agent workflow:**
1. Mission created → tasks written to PG (`tasks` table) AND workspace file (`docs/TASKS.md`)
2. Tasks enqueued in Valkey for assignment (`mosaic:queue:project:{projectId}:tasks`)
3. Agent requests next task → queue service dequeues, returns task details from PG
4. Agent claims the task → Valkey lock + PG status → `in-progress` + file sync
5. Agent works in its workspace (repo or worktree)
6. Agent completes → updates status via queue service → PG updated + file synced + lock released
7. Gateway/orchestrator monitors progress, assigns next based on dependencies
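Step 4's claim can be sketched with the queue-key convention from step 2 (the claim-key suffix and TTL are assumptions; a Map stands in for Valkey `SET … NX EX` so the semantics are testable standalone):

```typescript
// Lock key per task, under the project queue namespace from step 2.
export function claimKey(projectId: string, taskId: string): string {
  return `mosaic:queue:project:${projectId}:claim:${taskId}`;
}

// SET key agentId NX semantics: the first claimant wins, later claimants get
// false. In the real service this would be an ioredis
// `redis.set(key, agentId, 'EX', ttl, 'NX')` followed by the PG status
// transition to `in-progress` and the file sync.
export function tryClaim(
  locks: Map<string, string>,
  key: string,
  agentId: string,
): boolean {
  if (locks.has(key)) return false;
  locks.set(key, agentId);
  return true;
}
```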
**Flatfile fallback:** If no PG configured, queue service writes to flatfiles in workspace (JSON task manifests). Preserves the `@mosaicstack/coord` file-based pattern for single-machine, no-DB deployments.
**What this replaces:**
- `@mosaicstack/coord`'s file-only task tracking → unified DB+file via queue service
- Direct PG CRUD for task status → routed through queue service for consistency
- Manual task assignment → queue-based distribution with agent claiming
**What this preserves:**
- `TASKS.md` file format — still the agent-readable working copy
- Mission structure from `@mosaicstack/coord` — creation, milestones, sessions
- `@mosaicstack/prdy` PRD workflow — writes to `docs/`, syncs metadata to DB
> **Note:** This is a significant refactor of `@mosaicstack/coord` + `@mosaicstack/queue`. Warrants its own dedicated plan alongside the Gatekeeper plan.
### Chroot Agent Sandboxing
Agents are chrooted to their workspace directory. Sweet spot between full container isolation (heavy) and path validation only (escape-prone).
**How it works:**
- Before agent tool execution, the gateway spawns tool processes inside a chroot at `sandboxDir`
- File/git/shell tools operate inside the chroot — literally cannot see outside the workspace
- The chroot environment needs minimal deps: git, shell utilities, language runtimes
- Node.js `child_process.spawn` has no built-in chroot option; tool processes are wrapped via the `chroot(8)` binary or a native `chroot(2)` binding, which requires the `CAP_SYS_CHROOT` capability (not full root)
**Docker consideration:**
- Container itself is already isolated
- Chroot inside Docker provides defense-in-depth: user A's agent can't access user B's workspace within the same container
- Alternative: Linux namespaces (`unshare`) for lighter-weight isolation without full chroot env setup
**v1 approach:**
- `chroot` to workspace directory for all agent tool processes
- Gateway process gets `CAP_SYS_CHROOT` via capabilities
- Minimal chroot environment provisioned by `WorkspaceService` on workspace creation
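A sketch of the wrapper under those assumptions, launching tools through the `chroot(8)` binary (the argv builder is split out so the wrapping is testable without privileges):

```typescript
import { spawn, type ChildProcess } from 'node:child_process';

// Build the argv for running a tool inside a chroot at the workspace root.
export function chrootArgv(sandboxDir: string, cmd: string, args: string[]): string[] {
  return ['chroot', sandboxDir, cmd, ...args];
}

// Spawn the tool process chrooted to the sandbox. Requires CAP_SYS_CHROOT on
// the gateway process; paths the tool sees are relative to the new root.
export function spawnInChroot(
  sandboxDir: string,
  cmd: string,
  args: string[],
): ChildProcess {
  const [bin, ...rest] = chrootArgv(sandboxDir, cmd, args);
  return spawn(bin!, rest, { stdio: 'pipe' });
}
```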
**What lives outside the chroot (gateway-only, not agent-accessible):**
- Valkey connection
- PG connection
- Other users' workspaces
- Gateway configuration
- OTEL collector endpoint
### Spin-Off Plans
The following topics are significant enough to warrant their own dedicated plan documents. Stubs created at:
| Plan | Stub File | Scope |
| -------------------------- | -------------------------------------- | -------------------------------------------------------------------------------------------------- |
| **Gatekeeper Service** | `docs/plans/gatekeeper-service.md` | PR review/merge agent, quality gates, CI integration, trust boundary design |
| **Task Queue Unification** | `docs/plans/task-queue-unification.md` | `@mosaicstack/queue` refactor, `@mosaicstack/coord` consolidation, DB+file sync, flatfile fallback |
| **Chroot Sandboxing** | `docs/plans/chroot-sandboxing.md` | Chroot environment provisioning, capability management, Docker integration, namespace alternatives |
---
## Teams Architecture
### Concept
Projects can be owned by a **user** (personal) or a **team** (multi-member collaboration). Teams have a designated manager who controls membership and project settings. The workspace path, RBAC checks, and agent sandbox resolution all branch on owner type.
### DB Schema — New Tables
```typescript
// teams — group identity
export const teams = pgTable('teams', {
id: uuid('id').primaryKey().defaultRandom(),
name: text('name').notNull(),
slug: text('slug').notNull().unique(),
ownerId: text('owner_id')
.notNull()
.references(() => users.id, { onDelete: 'restrict' }),
managerId: text('manager_id')
.notNull()
.references(() => users.id, { onDelete: 'restrict' }),
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
});
// team_members — membership roster
export const teamMembers = pgTable(
'team_members',
{
id: uuid('id').primaryKey().defaultRandom(),
teamId: uuid('team_id')
.notNull()
.references(() => teams.id, { onDelete: 'cascade' }),
userId: text('user_id')
.notNull()
.references(() => users.id, { onDelete: 'cascade' }),
role: text('role', { enum: ['manager', 'member'] })
.notNull()
.default('member'),
invitedBy: text('invited_by').references(() => users.id, { onDelete: 'set null' }),
joinedAt: timestamp('joined_at', { withTimezone: true }).notNull().defaultNow(),
},
(t) => ({
uniq: uniqueIndex('team_members_team_user_idx').on(t.teamId, t.userId),
}),
);
```
### DB Schema — Modified `projects` Table
```typescript
// Add to existing projects table:
teamId: uuid('team_id').references(() => teams.id, { onDelete: 'cascade' }),
ownerType: text('owner_type', { enum: ['user', 'team'] }).notNull().default('user'),
```
- **Solo project:** `teamId = null`, `ownerType = 'user'`
- **Team project:** `teamId = <uuid>`, `ownerType = 'team'`
### Workspace Path Resolution
| Owner Type | Path |
| ---------- | -------------------------------------------------------- |
| Solo | `$MOSAIC_ROOT/.workspaces/users/<user_id>/<project_id>/` |
| Team | `$MOSAIC_ROOT/.workspaces/teams/<team_id>/<project_id>/` |
### RBAC
| Role | Access |
| -------------- | ----------------------------------------------------------- |
| Team manager | Full project access, manage members, update settings |
| Team member | Project workspace access, create sessions, read/write files |
| Non-member | No access |
| Platform admin | Cross-team read access (audit/support) |
Access check in gateway middleware:
```typescript
async function canAccessProject(userId: string, projectId: string): Promise<boolean> {
  const project = await brain.projects.findById(projectId);
  if (!project) return false;
  if (project.ownerType === 'user') return project.userId === userId;
if (project.ownerType === 'team') return brain.teams.isMember(project.teamId, userId);
return false;
}
```
### WorkspaceService Path Resolution
```typescript
resolvePath(project: Project): string {
const root = process.env['MOSAIC_ROOT'] ?? '/opt/mosaic';
if (project.ownerType === 'team') {
return path.join(root, '.workspaces', 'teams', project.teamId!, project.id);
}
return path.join(root, '.workspaces', 'users', project.userId, project.id);
}
```
---
## REST Route Specifications
Commands with `execution: 'rest'` or `'hybrid'` map to these gateway endpoints. All require authentication.
| Command | Method | Route | Notes |
| -------------------- | -------- | ----------------------------------------------- | -------------------------------------------------------------------------------------- |
| `/rename` | `PATCH` | `/api/conversations/:id` | Body: `{ name: string }` |
| `/resume` | `GET` | `/api/conversations` | Returns list; TUI shows picker; socket reconnect is client-side |
| `/history` | `GET` | `/api/conversations/:id/messages?limit=N` | N defaults to 50 |
| `/export` | `GET` | `/api/conversations/:id/export?format=md\|json` | Streams file download |
| `/gc` | `POST` | `/api/sessions/gc` | Body: `{ scope: 'user' \| 'system' }`. `system` is admin-only. Returns `GCSweepResult` |
| `/preferences show` | `GET` | `/api/preferences` | Returns merged effective preferences (defaults + user mutations) |
| `/preferences set` | `POST` | `/api/preferences` | Body: `{ key: string, value: unknown }`. Rejects immutable keys. |
| `/preferences reset` | `DELETE` | `/api/preferences/:key` | Removes user override, reverts to platform default |
| `/provider` (list) | `GET` | `/api/providers` | Existing endpoint |
| `/provider login` | `GET` | `/api/auth/provider/:name/url` | Returns `{ url: string, expiresAt: string, pollToken: string }` |
| `/provider logout`   | `DELETE` | `/api/auth/provider/:name/session`              | Revokes stored token                                                                   |
| `/agent` (list) | `GET` | `/api/agents` | Existing endpoint (P8-005) |
| `/mission` (status) | `GET` | `/api/missions?active=true` | Existing endpoint |
| `/mission set` | `POST` | `/api/sessions/:sessionId/mission` | Body: `{ missionId: string }` |
| `/status` | `GET` | `/api/sessions/:sessionId/status` | Session metadata; admin sees last GC stats |
| `/reload` | `POST` | `/api/admin/reload` | RBAC: admin only |
---
## `/provider` OAuth Flow (TUI)
A TUI session has no embedded browser. The mechanism follows the same pattern as Pi agent: generate URL, copy to clipboard, poll for completion.
### Flow
```
User types: /provider login anthropic
TUI emits: command:execute { command: 'provider', args: 'login anthropic' }
Gateway: GET /api/auth/provider/anthropic/url
→ Generates OAuth authorization URL
→ Creates poll token in Valkey: mosaic:auth:poll:<token> TTL=5min
→ Returns { url, expiresAt, pollToken }
Gateway emits: command:result { success: true, data: { url, expiresAt } }
TUI:
1. Writes URL to clipboard (clipboardy)
2. Shows system message: "⚙ Authorization URL copied to clipboard. Open in browser to authorize Anthropic."
3. Starts polling: GET /api/auth/provider/anthropic/status?token=<pollToken>
4. Poll interval: 3s, max timeout: 5min
User opens browser, completes OAuth
Gateway receives callback, stores token, marks poll token as completed
TUI poll returns: { status: 'completed', provider: 'anthropic' }
TUI shows: "⚙ Anthropic connected successfully."
TUI requests fresh commands:manifest (reflects new provider availability)
```
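The client side of this flow can be sketched as a small polling loop. This is illustrative, not the actual `@mosaicstack/cli` implementation: `pollProviderStatus` and the injected `fetchStatus` are hypothetical names, with the fetcher injected so the loop stays testable without a live gateway.

```typescript
type PollStatus = 'pending' | 'completed' | 'failed' | 'expired';

// Polls the gateway status endpoint every `intervalMs` until a terminal
// status arrives or `timeoutMs` elapses (3s interval, 5-min cap per the
// flow above).
async function pollProviderStatus(
  fetchStatus: () => Promise<PollStatus>,
  intervalMs = 3_000,
  timeoutMs = 5 * 60_000,
): Promise<PollStatus> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const status = await fetchStatus();
    if (status !== 'pending') return status; // completed | failed | expired
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return 'expired'; // client-side timeout mirrors the 5-min poll-token TTL
}
```

A terminal `failed` or `expired` result maps to a system message in the TUI; only `completed` triggers the manifest refresh.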
### Poll Status Endpoint
`GET /api/auth/provider/:name/status?token=<pollToken>``{ status: 'pending' | 'completed' | 'failed' | 'expired' }`
### Implementation Notes
- Gateway stores poll state in Valkey: `mosaic:auth:poll:<pollToken>` with 5-min TTL
- `clipboardy` used for clipboard write in TUI (add as dep to `@mosaicstack/cli` if not already present)
- On success, gateway emits a fresh `commands:manifest` via socket (reflects provider now connected)
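The gateway-side poll state can be modeled like this. A minimal sketch using an in-memory `Map` as a stand-in for Valkey; the real gateway would use `SET ... EX 300` and let Valkey expire the key, and all function names here are illustrative.

```typescript
type PollState = { status: 'pending' | 'completed' | 'failed'; expiresAt: number };

// Stand-in for the Valkey keys `mosaic:auth:poll:<token>` with a 5-min TTL.
const pollStore = new Map<string, PollState>();
const POLL_TTL_MS = 5 * 60_000;

function createPollToken(token: string): void {
  pollStore.set(token, { status: 'pending', expiresAt: Date.now() + POLL_TTL_MS });
}

// Called from the OAuth callback handler once the provider token is stored.
function completePoll(token: string): void {
  const state = pollStore.get(token);
  if (state) state.status = 'completed';
}

// Mirrors GET /api/auth/provider/:name/status — expiry wins over stale state.
function getPollStatus(token: string): 'pending' | 'completed' | 'failed' | 'expired' {
  const state = pollStore.get(token);
  if (!state || Date.now() > state.expiresAt) return 'expired';
  return state.status;
}
```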
---
## Preferences `mutable` Column Migration
Add to `packages/db/src/schema.ts` in the `preferences` table definition:
```typescript
mutable: boolean('mutable').notNull().default(true),
```
Generate and apply:
```bash
pnpm --filter @mosaicstack/db db:generate # generates migration SQL
pnpm --filter @mosaicstack/db db:migrate # applies to PG
```
Platform enforcement keys (seeded with `mutable = false` by gateway `PreferencesService.onModuleInit()`):
| Key | Category | Reason |
| ------------------------- | -------- | ----------------------------------------- |
| `limits.maxThinkingLevel` | limits | Admin-controlled ceiling; role-dependent |
| `limits.rateLimit` | limits | Admin-controlled rate cap; role-dependent |
| `safety.contentFiltering` | safety | Cannot be disabled by users |
`/preferences set` calls `PreferencesService.set()` which reads the `mutable` flag and rejects immutable keys with:
```
⚙ Cannot override "limits.maxThinkingLevel" — this is a platform enforcement. Contact your admin.
```
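The mutable check inside `PreferencesService.set()` reduces to a guard like the following. A pure sketch under the assumption that `rows` stands in for the Drizzle-backed `preferences` table; the real service reads the row from PG before applying the override.

```typescript
type PreferenceRow = { value: unknown; mutable: boolean };

// Rejects writes to platform-enforcement keys; unknown keys default to mutable.
function setPreference(
  rows: Map<string, PreferenceRow>,
  key: string,
  value: unknown,
): void {
  const existing = rows.get(key);
  if (existing && !existing.mutable) {
    throw new Error(
      `⚙ Cannot override "${key}" — this is a platform enforcement. Contact your admin.`,
    );
  }
  rows.set(key, { value, mutable: existing?.mutable ?? true });
}
```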
---
## Test Strategy
### Test Files Per Task
| Task | Test File(s) | Test Types |
| ------------------------- | ---------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| P8-007 (DB) | `packages/db/src/__tests__/teams.test.ts` | Schema validation, FK constraint checks |
| P8-008 (Types) | — | Compile-time only; covered by typecheck gate |
| P8-009 (TUI parsing) | `packages/cli/src/__tests__/parse-command.test.ts` | Unit: parseSlashCommand, alias resolution, unknown commands, skill:name syntax |
| P8-010 (Gateway registry) | `apps/gateway/src/commands/__tests__/command-registry.service.spec.ts` | Unit: manifest build, RBAC filter; Integration: socket command:execute round-trip |
| P8-011 (Preferences) | `apps/gateway/src/preferences/__tests__/preferences.service.spec.ts` | Unit: merge logic, mutable enforcement; Integration: /preferences REST + /system Valkey |
| P8-012 (Commands P4) | `apps/gateway/src/commands/__tests__/command-executor.service.spec.ts` | Integration: /agent, /mission, /provider URL gen + poll state |
| P8-013 (Hot Reload) | `apps/gateway/src/reload/__tests__/reload.service.spec.ts` | Integration: plugin unload/reload cycle, manifest diff, system:reload broadcast |
| P8-014 (Session GC) | `apps/gateway/src/gc/__tests__/session-gc.service.spec.ts` | Unit: collect, sweepOrphans, fullCollect — all three tiers |
| P8-015 (Workspaces) | `apps/gateway/src/workspace/__tests__/workspace.service.spec.ts` | Unit: path resolution (solo vs team); Integration: project creation sequence, cleanup |
| P8-016 (Tool hardening) | `apps/gateway/src/agent/tools/__tests__/path-validation.spec.ts` | Unit: sandbox escape attempt rejection (path traversal, symlink) |
| P8-017 (Autocomplete) | `packages/cli/src/__tests__/autocomplete.test.ts` | Component: fuzzy match, keyboard nav, arg hints |
### Key Test Cases Per Phase
**P8-009 (TUI parsing):**
- `parseSlashCommand('/help')``{ command: 'help', args: null }`
- `parseSlashCommand('/skill:brave-search query')``{ command: 'skill:brave-search', args: 'query' }`
- `parseSlashCommand('not a command')``null`
- Alias resolution: `/m claude-4` → resolves to `/model`
- Unknown command: shows inline error, does NOT emit to socket
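A parser satisfying these cases can be sketched in a few lines. This is illustrative, not the actual `packages/cli` implementation, which additionally resolves aliases (e.g. `/m``/model`) against the gateway manifest:

```typescript
type ParsedCommand = { command: string; args: string | null };

// Returns null for non-commands; supports the `skill:name` command syntax.
function parseSlashCommand(input: string): ParsedCommand | null {
  if (!input.startsWith('/')) return null;
  const body = input.slice(1).trim();
  // Command token: word chars and hyphens, with an optional single colon
  // segment for skills; everything after the first whitespace is args.
  const match = /^([\w-]+(?::[\w-]+)?)(?:\s+(.*))?$/.exec(body);
  if (!match) return null;
  return { command: match[1], args: match[2]?.trim() || null };
}
```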
**P8-010 (gateway registry):**
- `commands:manifest` pushed on socket connect (alongside `session:info`)
- Admin command not in manifest for non-admin user
- `command:execute` for unknown command → `command:result { success: false }`
**P8-011 (preferences):**
- Platform default → user override → enforcement re-applied (enforcements always win)
- `/system` condensation: mock LLM call, verify condensed output written to Valkey
- TTL renewal: verify `EXPIRE` called on each agent turn
**P8-014 (Session GC):**
- `collect(sessionId)`: mock Valkey + DB, verify correct keys deleted and logs demoted
- `sweepOrphans()`: seed orphaned Valkey keys, verify detection and deletion
- `fullCollect()` (cold start): seed stale keys across all namespaces, verify all cleared
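The orphan-detection step in `sweepOrphans()` is essentially a set difference over key namespaces. A pure sketch, assuming session keys follow a `mosaic:session:<id>:*` shape (the exact key layout is an assumption here):

```typescript
// Any Valkey key in the session namespace whose session id is no longer
// live in the DB is scheduled for deletion; keys outside the namespace
// are left untouched.
function findOrphanKeys(valkeyKeys: string[], liveSessionIds: Set<string>): string[] {
  return valkeyKeys.filter((key) => {
    const match = /^mosaic:session:([^:]+):/.exec(key);
    return match !== null && !liveSessionIds.has(match[1]);
  });
}
```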
**P8-015 (Workspaces):**
- Solo project: workspace created at `users/<user_id>/<project_id>/`
- Team project: workspace created at `teams/<team_id>/<project_id>/`
- Non-member access attempt: `canAccessProject()` returns false
- `repoUrl` provided: workspace created via `git clone`, not `git init`
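The solo-vs-team path resolution under test reduces to a single branch. A minimal sketch whose workspace root parameter and `Project` shape are illustrative; the path segments mirror the expectations above:

```typescript
type Project = { id: string; ownerUserId: string; teamId: string | null };

// Team projects live under teams/<team_id>/, solo projects under
// users/<user_id>/ — both keyed by project id at the leaf.
function resolveWorkspacePath(root: string, project: Project): string {
  return project.teamId !== null
    ? `${root}/teams/${project.teamId}/${project.id}`
    : `${root}/users/${project.ownerUserId}/${project.id}`;
}
```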
---
## Phase Execution Order
This is the authoritative dependency and parallelism plan for implementation.
### Dependency Graph
```
P8-007 (DB migrations) ──────────────────────────────────► P8-015 (Workspaces)
P8-008 (Types) ──────────► P8-009 (TUI local cmds)
└────────► P8-010 (Gateway registry) ───► P8-011 (Preferences)
├──► P8-012 (Commands P4)
├──► P8-013 (Hot Reload)
└──► P8-014 (Session GC)
P8-008 + P8-010 ─────────────────────────────────────────► P8-017 (Autocomplete)
P8-007 ─────────────────────────────────────────────────► P8-016 (Tool hardening) [independent]
(all above) ─────────────────────────────────────────────► P8-019 (Verify)
```
### Wave Execution Plan
| Wave | Tasks | Parallelism |
| ------ | ------------------------------------------------- | ---------------------- |
| Wave 1 | P8-007 (DB migrations) + P8-008 (Types) | 2 workers |
| Wave 2 | P8-009 (TUI local cmds) + P8-016 (Tool hardening) | 2 workers |
| Wave 3 | P8-010 (Gateway command registry) | 1 worker — gating wave |
| Wave 4 | P8-011 (Preferences) + P8-012 (Commands P4) | 2 workers |
| Wave 5 | P8-013 (Hot Reload) + P8-014 (Session GC) | 2 workers |
| Wave 6 | P8-015 (Workspaces + Teams) | 1 worker |
| Wave 7 | P8-017 (Autocomplete) | 1 worker |
| Wave 8 | P8-019 (Verify) | 1 worker |
**P8-018** (spin-off plan stubs) is documentation-only — completed during plan preparation, not in execution waves.