# PRD: Mosaic Stack v0.1.0

## Metadata

- **Owner:** Jason Woltje
- **Date:** 2026-03-12
- **Status:** draft
- **Best-Guess Mode:** true
- Repo (target): `git.mosaicstack.dev/mosaic/mosaic-stack`
- Baseline: `~/src/jarvis-old` (jarvis v0.2.0)
- Package source: `~/src/mosaic-mono-v0` (@mosaic/\* packages)
- Agent harness: [pi](https://github.com/badlogic/pi-mono) (v0.57.1)
- Remote control reference: [OpenClaw](https://github.com/openclaw/openclaw) (upstream, canonical)

---

## Problem Statement

Jarvis (v0.2.0) is a self-hosted AI assistant with a Python FastAPI backend and Next.js frontend. It handles chat, projects, tasks, and LLM routing but lacks orchestration depth, agent coordination, shared memory, and remote access.

The Mosaic framework (`~/.config/mosaic`) provides agent guides, shell-based orchestration tools, and quality rails — but these are loose scripts, not an integrated platform. The `@mosaic/*` packages in mosaic-mono-v0 began consolidating these into TypeScript packages (brain, queue, coord, cli, prdy, quality-rails) but have no UI, no auth, and no agent runtime integration.

**The gap:** Three codebases with overlapping concerns, no unified runtime, no remote control surface (Discord/Telegram), no gateway orchestrator, and a Python backend that doesn't align with the target TypeScript-everywhere stack.

**What Mosaic Stack solves:** A single monorepo that brings together many pieces to make one beautiful picture — a self-hosted, multi-user AI agent platform with web dashboard, TUI, remote control, shared memory, mission orchestration, and extensible skill/plugin architecture. All TypeScript. Pi as the agent harness. Brain as the knowledge layer. Queue as the coordination backbone.

---

## Objectives

1. **Unified TypeScript monorepo** — One repo, one language, one build pipeline for all Mosaic Stack components
2. **Pi-powered agent runtime** — Pi SDK embedded as the core agent loop; Pi TUI as the terminal interface
3. **Web + TUI + Remote** — Next.js dashboard for visual management, Pi TUI for terminal work, Discord/Telegram for remote control
4. **Gateway orchestrator** — Central routing layer that dispatches tasks to appropriate agents based on capability, cost, and context
5. **Shared memory** — PostgreSQL canonical store + vector DB for semantic search + tiered log summarization to prevent context creep
6. **Multi-user with SSO** — BetterAuth with Authentik/WorkOS/Keycloak SSO, RBAC for family/team/business use
7. **Full @mosaic/\* package integration** — brain, queue, coord, mosaic, prdy, quality-rails, cli all integrated
8. **Extensible** — MCP capability, skill import interface, plugin architecture for LLM providers and remote channels

---

## Scope

### In Scope (v0.1.0 Beta)

1. Chat/conversation UI (web) — carry forward from jarvis-old, rewrite frontend to work with new backend
2. Pi TUI integration — terminal-based agent interaction using Pi SDK
3. Web dashboard — settings, task management, projects, PRDs, missions, agent status
4. Gateway orchestrator (`@mosaic/gateway`) — central dispatch for agent tasks with routing logic
5. Task management — CRUD, kanban, mission-scoped tasks, dependency tracking
6. Project management — projects, milestones, PRDs linked to missions
7. Shared memory system — learned preferences, behaviors, defaults; tiered storage with summarization
8. User management — RBAC (admin, member, viewer), multi-user capable
9. SSO — BetterAuth with Authentik/WorkOS/Keycloak adapter
10. Remote control — Discord plugin (high priority), Telegram plugin
11. LLM provider support — Anthropic subs, Codex subs, Z.ai subs, other API-based, Ollama, LM Studio, llama.cpp
12. Agent routing — task-based model/provider selection (cost/capability matrix)
13. MCP capability — server and client, tool registration
14. Skill import interface — browse, install, manage agent skills
15. `@mosaic/brain` — structured data layer (migrated to PG + vector DB backend)
16. `@mosaic/queue` — Valkey-backed task queue with MCP tools
17. `@mosaic/coord` — mission coordination engine
18. `@mosaic/mosaic` — install wizard / bootstrap
19. `@mosaic/prdy` — PRD wizard
20. `@mosaic/quality-rails` — code quality scaffolder
21. `@mosaic/cli` — unified `mosaic` CLI
22. Docker Compose deployment + bare-metal capability
23. Agent log service — ingest, parse, tier, summarize agent interaction logs

### Out of Scope (v0.1.0)

1. SaaS / multi-tenant revenue model — this is a personal/family/team tool
2. Mobile native apps — web responsive is sufficient
3. Public npm registry publishing — Gitea registry only
4. Video/voice agent interaction
5. Full OpenClaw feature parity — we take inspiration, not wholesale migration
6. Calendar integration (deferred — brain tracks events, but no gcal sync yet)
7. GLPI/helpdesk ticket sync (deferred)
8. Woodpecker CI integration tooling (deferred — focus on core platform first)

---

## Architecture

### High-Level System Diagram

```
┌─────────────────────────────────────────────────────────────────┐
│                          Mosaic Stack                           │
│                                                                 │
│  ┌──────────┐  ┌──────────┐  ┌─────────────┐  ┌──────────────┐  │
│  │ Next.js  │  │  Pi TUI  │  │   Discord   │  │   Telegram   │  │
│  │ Web App  │  │ Terminal │  │   Plugin    │  │   Plugin     │  │
│  └────┬─────┘  └────┬─────┘  └──────┬──────┘  └──────┬───────┘  │
│       │             │               │                │          │
│       └─────────────┴───────┬───────┴────────────────┘          │
│                             │                                   │
│                   ┌─────────▼──────────┐                        │
│                   │  @mosaic/gateway   │ ← Central Orchestrator │
│                   │  (NestJS+Fastify)  │                        │
│                   └────┬────┬────┬─────┘                        │
│                        │    │    │                              │
│         ┌──────────────┤    │    ├──────────────┐               │
│         │              │    │    │              │               │
│ ┌───────▼──────┐  ┌────▼────▼──┐ │  ┌───────────▼────────┐      │
│ │ @mosaic/brain│  │ @mosaic/   │ │  │ Agent Pool         │      │
│ │ (Data Layer) │  │   queue    │ │  │ (Pi SDK sessions)  │      │
│ └───────┬──────┘  └────────────┘ │  │ - Anthropic        │      │
│         │                        │  │ - Codex            │      │
│ ┌───────▼──────────────────┐     │  │ - Z.ai             │      │
│ │ PostgreSQL   │ VectorDB  │     │  │ - Ollama           │      │
│ │ (canonical)  │ (semantic)│     │  │ - LM Studio        │      │
│ └──────────────┴───────────┘     │  │ - llama.cpp        │      │
│                                  │  └────────────────────┘      │
│                    ┌─────────────▼──────┐                       │
│                    │  @mosaic/coord     │                       │
│                    │ Mission lifecycle  │                       │
│                    └────────────────────┘                       │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐       │
│  │ @mosaic/cli  │  │ @mosaic/prdy │  │ @mosaic/         │       │
│  │              │  │              │  │ quality-rails    │       │
│  └──────────────┘  └──────────────┘  └──────────────────┘       │
│                                                                 │
│  ┌──────────────────────────────────────────────────────┐       │
│  │  Valkey (queue backend)   │   BetterAuth (SSO/RBAC)  │       │
│  └──────────────────────────────────────────────────────┘       │
└─────────────────────────────────────────────────────────────────┘
```

### Technology Decisions

| Layer | Technology | Rationale |
| ------------------ | ------------------------------------ | ------------------------------------------------------------ |
| **Web Frontend** | Next.js 16 + React 19 + Tailwind CSS | SSR, RSC; design tokens from @mosaic/design-tokens (mosaic-stack-website) |
| **API / Gateway** | NestJS + Fastify adapter | Module system, DI, guards/interceptors for complex gateway; Fastify performance underneath |
| **Agent Runtime** | Pi SDK (embedded) | Extensible harness with tools, skills, session management |
| **TUI** | Pi interactive mode | Native terminal agent interaction |
| **Auth** | BetterAuth + SSO adapters | Multi-user RBAC with Authentik/WorkOS/Keycloak |
| **Database** | PostgreSQL 17 + pgvector | Canonical store; pgvector for embedding search |
| **Vector DB** | pgvector + VectorStore interface | pgvector for v0.1.0; `VectorStore` abstraction in @mosaic/memory makes Qdrant a drop-in later |
| **Cache / Queue** | Valkey 8 | Redis-compatible; proven in @mosaic/queue |
| **ORM** | Drizzle ORM | TypeScript-native, lightweight, good migration story |
| **Validation** | Zod | Already used across @mosaic/\* packages |
| **Build** | pnpm workspaces + Turborepo | Proven in both jarvis-old and mosaic-mono-v0 |
| **Testing** | Vitest + Playwright | Unit/integration via Vitest, E2E via Playwright |
| **Remote Control** | Discord.js + Telegraf | Inspired by OpenClaw plugin architecture |
| **MCP** | @modelcontextprotocol/sdk | Already used in @mosaic/brain and @mosaic/queue |
| **Container** | Docker Compose | Self-hosted; bare-metal also supported |
| **CI** | Woodpecker CI | Existing infrastructure at git.mosaicstack.dev |
| **Observability** | OpenTelemetry + SigNoz | Wide-event logging from day one; OTEL auto-instrumentation for NestJS/PG/HTTP; SigNoz as all-in-one backend |
| **Log Processing** | Custom ingest service | Parse agent logs → tiered storage → summarization |

### Key Architecture Decisions

**AD-1: TypeScript everywhere (no Python backend)**

The jarvis-old FastAPI backend is not carried forward as code. Its domain logic (conversation management, LLM routing, task/project CRUD, auth) is reimplemented in TypeScript. The Python plugin system is replaced by Pi's extension/skill system and MCP tool registration.

**AD-2: Pi SDK as the agent runtime**

Instead of a custom LLM provider abstraction (jarvis-old's `BaseLLMProvider`), Pi SDK manages agent sessions. Pi handles model selection, tool calling, context management, and compaction. The gateway dispatches work to Pi sessions configured with appropriate providers.

**AD-3: Gateway as the central nervous system (NestJS + Fastify adapter)**

`@mosaic/gateway` is the single API surface. The web app, TUI, Discord, and Telegram all talk to the gateway. The gateway routes to brain (data), queue (coordination), agent pool (LLM work), and coord (mission lifecycle). This replaces the direct FastAPI-to-DB pattern from jarvis-old.

NestJS was chosen over raw Fastify because the gateway is inherently complex — it hosts channel plugins, agent pool management, routing engine, WebSocket hub, MCP server, auth middleware, and integrates brain, queue, memory, and log services. NestJS provides the module system, dependency injection, guards, and interceptors needed to organize this cleanly.
NestJS uses Fastify as its HTTP adapter, so Fastify's performance is preserved. This also aligns with the stated stack preference in USER.md ("NestJS API + Next.js web"). @mosaic/brain's existing Fastify code migrates naturally into a NestJS module with Fastify adapter.

**AD-4: Brain migrates from JSON files to PostgreSQL**

`@mosaic/brain` currently uses a JSON file store. For Mosaic Stack, brain's data model (tasks, projects, events, agents, missions, tickets) moves to PostgreSQL via Drizzle ORM. Brain's REST + MCP interface is preserved — only the storage backend changes.

**AD-5: Tiered memory with summarization**

Agent interaction logs are ingested into a log service. Raw logs are stored short-term. A summarization pipeline (using a cheap LLM) periodically compresses logs into structured insights stored in the vector DB. This prevents unbounded log growth while preserving searchable context.

**AD-6: Remote control via plugin architecture**

Discord and Telegram plugins follow a channel plugin pattern inspired by OpenClaw (https://github.com/openclaw/openclaw). Each plugin registers as a channel with the gateway, receives messages, and dispatches them through the same routing pipeline as web/TUI messages.

**AD-7: Gateway state persistence via Valkey (restart resilience)**

The gateway persists its orchestration state (active sessions, pending dispatches, routing context, agent assignments) to Valkey. On restart, the gateway reads Valkey state and resumes operations — active agent sessions are reconnected or gracefully recovered. `mosaic gateway restart --fresh` is the nuclear option: clears the Valkey queue and all in-flight state, starting with a clean slate. This prevents context/focus/direction loss that would otherwise occur on every restart.

**AD-8: Multi-session agent architecture**

Each agent operates in a distinct session. Multiple authorized input channels (TUI, web UI, Discord) can connect to the same agent session simultaneously.
This means a user can start a conversation in Discord, continue in the web UI, and monitor via TUI — all feeding into the same agent context. OpenClaw has this concept; Mosaic Stack evolves it with proper session authorization and channel multiplexing at the gateway level.

**AD-9: Discord channel-to-agent binding**

Discord channels pair to specific agent/session combinations via channel ID binding. This provides data segregation — messages in #project-alpha route to the project-alpha agent session, messages in #general route to a general-purpose session. Prevents cross-contamination between contexts and provides clear boundaries for multi-channel use.

**AD-10: Agent session barge-in via tmux**

Each agent session runs in a dedicated, named tmux session (e.g., `mosaic-agent-project-alpha`). This enables barge-in — a user can attach to any active agent's tmux session to observe, interrupt, or redirect. `mosaic agent attach ` connects to the tmux session. This provides direct low-level access when the normal channel interfaces are insufficient.

**AD-11: Cron-based scheduled jobs**

The gateway includes a cron scheduler for recurring tasks: log summarization runs, stale task detection, memory decay, provider health checks, scheduled agent dispatches. Uses node-cron or similar; schedules are configurable via web dashboard and stored in PG. Each cron job is a gateway-dispatched task that goes through the normal routing pipeline.

**AD-12: Web search tool (DuckDuckGo MCP)**

Agent sessions include a web search tool for information retrieval. DuckDuckGo via MCP server is the primary option (privacy-respecting, no API key required). Falls back to other search MCP providers if configured. Registered as a standard MCP tool available to all agent sessions.

**AD-13: Design system from @mosaic/design-tokens**

The web dashboard uses the Mosaic Stack design system established in `mosaic-stack-website`.
The `@mosaic/design-tokens` package provides CSS custom properties, Tailwind preset, and TS color/font/radius exports. Dark theme default with light theme support. Fonts: Outfit (sans), Fira Code (mono). Color palette: deep blue-grays with blue/purple/teal accents.

**AD-14: Multi-tier deployment readiness**

Code is structured assuming eventual multi-node deployment with dedicated roles (gateway nodes, agent worker nodes, brain/DB nodes). Packages communicate via well-defined APIs (HTTP/WS/MCP), not in-process calls where avoidable. Service boundaries are clean: gateway is stateless (state in PG/Valkey), agent pool can scale independently, brain is a separate service. v0.1.0 runs single-node; the architecture doesn't fight horizontal scaling later.

---

## Package Structure

### Monorepo Layout

```
mosaic-mono-v1/
├── apps/
│   ├── web/                    Next.js 16 web dashboard
│   └── gateway/                @mosaic/gateway — NestJS API + WebSocket
├── packages/
│   ├── types/                  @mosaic/types — shared type contracts
│   ├── brain/                  @mosaic/brain — data layer (PG-backed)
│   ├── queue/                  @mosaic/queue — Valkey task queue + MCP
│   ├── coord/                  @mosaic/coord — mission coordination
│   ├── mosaic/                 @mosaic/mosaic — install wizard
│   ├── prdy/                   @mosaic/prdy — PRD wizard
│   ├── quality-rails/          @mosaic/quality-rails — code quality scaffolder
│   ├── cli/                    @mosaic/cli — unified CLI
│   ├── auth/                   @mosaic/auth — BetterAuth config + SSO adapters
│   ├── db/                     @mosaic/db — Drizzle schema, migrations, connection
│   ├── agent/                  @mosaic/agent — Pi SDK integration, agent pool manager
│   ├── memory/                 @mosaic/memory — tiered memory + summarization service
│   ├── log/                    @mosaic/log — agent log ingest + processing
│   └── design-tokens/          @mosaic/design-tokens — CSS vars, Tailwind preset, colors
├── plugins/
│   ├── discord/                @mosaic/discord-plugin — Discord channel
│   └── telegram/               @mosaic/telegram-plugin — Telegram channel
├── docker/
│   ├── gateway.Dockerfile
│   ├── web.Dockerfile
│   └── init-db.sql
├── docs/
│   ├── PRD.md                  (this file)
│   ├── TASKS.md
│   └── scratchpads/
├── docker-compose.yml
├── pnpm-workspace.yaml
├── turbo.json
├── tsconfig.base.json
├── vitest.workspace.ts
├── AGENTS.md
├── CLAUDE.md
└── README.md
```

### Package Responsibilities

#### `apps/gateway` — @mosaic/gateway (NEW — critical path)

The central nervous system. All clients connect here. Built with NestJS (Fastify adapter).

- **NestJS modules** — Each concern (chat, brain, agent, auth, queue, memory, plugins) is a module with clear boundaries
- **Fastify adapter** — Fastify performance under NestJS's organizational structure
- **WebSocket gateway** — NestJS built-in WebSocket support for chat streaming, agent status, notifications
- **Agent routing engine** — Routes tasks to appropriate LLM provider/model based on task type, cost tier, capability requirements
- **Session management** — Tracks active conversations, agent sessions, user contexts
- **MCP server** — Exposes Mosaic capabilities as MCP tools
- **Plugin host** — Loads and manages channel plugins (Discord, Telegram)
- **Auth middleware** — BetterAuth session validation, RBAC enforcement

Key routes:

```
POST   /api/chat                    Send message, get streamed response
GET    /api/conversations           List conversations
POST   /api/conversations           Create conversation
GET    /api/conversations/:id       Get conversation with messages
DELETE /api/conversations/:id       Delete conversation

POST   /api/tasks                   Create task (brain-backed)
GET    /api/tasks                   List/filter tasks
PATCH  /api/tasks/:id               Update task
GET    /api/projects                List projects
POST   /api/projects                Create project
GET    /api/missions                List missions
POST   /api/missions                Create mission
GET    /api/missions/:id            Mission summary with tasks

POST   /api/agents/dispatch         Dispatch work to agent pool
GET    /api/agents/status           Active agent sessions

GET    /api/memory/search           Semantic search across memory
POST   /api/memory/preferences      Store learned preference

GET    /api/skills                  List available skills
POST   /api/skills/install          Install a skill

GET    /api/providers               List configured LLM providers
POST   /api/providers               Configure LLM provider

GET    /api/admin/users             User management (admin)
POST   /api/admin/users             Create user (admin)

WS     /ws/chat/:conversationId     Streaming chat via WebSocket
WS     /ws/agents                   Agent status stream

GET    /mcp                         MCP endpoint (streamable HTTP)
```

#### `apps/web` — Next.js Web Dashboard

Carried forward from jarvis-old with significant refactoring.

- Chat/conversation UI (primary interaction surface)
- Settings management (providers, integrations, profile)
- Task management (list, kanban, detail views)
- Project management (list, detail, linked missions)
- Mission dashboard (status, progress, task breakdown)
- PRD viewer/editor
- Agent status dashboard (active sessions, routing stats)
- Skill browser and installer
- User management (admin RBAC panel)
- Auth pages (login, SSO redirect, registration)

#### `packages/types` — @mosaic/types

Migrated from mosaic-mono-v0. Extended with:

- Gateway types (routing, dispatch, agent pool)
- Auth types (user, role, permission)
- Conversation/message types (from jarvis-old domain)
- Memory types (preference, insight, summary)
- Plugin channel types (Discord, Telegram message mapping)

#### `packages/brain` — @mosaic/brain

Migrated from mosaic-mono-v0. **Storage backend changes from JSON to PostgreSQL.**

- REST API preserved (mounted as gateway sub-router or standalone)
- MCP tools preserved
- Collections layer rewritten to use Drizzle ORM queries instead of JSON file I/O
- Same entity model: tasks, projects, events, agents, missions, mission-tasks, tickets
- New: computed endpoints (today, stale, stats, search, audit) run against PG
- New: appreciation collection preserved for family use

#### `packages/queue` — @mosaic/queue

Migrated from mosaic-mono-v0 with minimal changes.

- Valkey-backed task queue with atomic WATCH/MULTI/EXEC
- MCP server with 8 tools
- Used by gateway for agent task dispatch and coordination

#### `packages/coord` — @mosaic/coord

Migrated from mosaic-mono-v0.
- Mission lifecycle: init, run, resume, status, drain
- TASKS.md parsing and management
- Session lock management
- Continuation prompt generation
- Integration with gateway for mission-driven orchestration

#### `packages/db` — @mosaic/db (NEW)

Shared database package.

- Drizzle ORM schema definitions (all tables)
- Migration management
- Connection pool configuration
- Shared by gateway, brain, auth, memory

#### `packages/auth` — @mosaic/auth (NEW)

Authentication and authorization.

- BetterAuth configuration
- SSO adapters: Authentik, WorkOS, Keycloak
- RBAC: roles (admin, member, viewer), permissions
- API key generation for brain/MCP access
- Session management middleware

#### `packages/agent` — @mosaic/agent (NEW — critical path)

Pi SDK integration layer.

- Agent pool manager — spawns and manages Pi agent sessions
- Provider configuration — Anthropic, Codex, Z.ai, Ollama, LM Studio, llama.cpp
- Agent routing logic — selects provider/model based on task characteristics
- Tool registration — registers Mosaic-specific tools (brain access, queue ops, memory search)
- Skill management — loads and configures Pi skills for agent sessions
- Session lifecycle — create, monitor, complete, fail, timeout

#### `packages/memory` — @mosaic/memory (NEW)

Tiered memory system.

- Preference store — learned user preferences, behaviors, defaults (PG)
- Insight store — distilled knowledge from agent interactions (PG + vector)
- Semantic search — query across memory using pgvector embeddings
- Summarization pipeline — compress raw logs into structured insights
- Memory API — used by gateway and agent sessions

#### `packages/log` — @mosaic/log (NEW)

Agent log service.
- Log ingest — receives structured logs from agent sessions
- Log parsing — extracts decisions, learnings, tool usage patterns
- Tiered storage — hot (recent, full detail), warm (summarized), cold (archived)
- Summarization trigger — invokes cheap LLM to compress aging logs
- Retention policy — configurable TTLs per tier

#### `packages/mosaic` — @mosaic/mosaic

Migrated from mosaic-mono-v0, updated for v1.

- Install wizard for Mosaic Stack setup
- Detects existing installations, offers upgrade path
- Configures `~/.config/mosaic/` with guides, tools, runtime configs

#### `packages/prdy` — @mosaic/prdy

Migrated from mosaic-mono-v0.

- PRD generation wizard
- Template-based PRD creation with Zod validation
- CLI integration via `mosaic prdy`

#### `packages/quality-rails` — @mosaic/quality-rails

Migrated from mosaic-mono-v0.

- TypeScript scaffolder for project quality config
- Generates ESLint, tsconfig, Woodpecker, husky, lint-staged configs
- Supports project types: monorepo, typescript-node, nextjs

#### `packages/cli` — @mosaic/cli

Migrated from mosaic-mono-v0, extended.

- Unified `mosaic` binary
- Subcommands: `mosaic coord`, `mosaic prdy`, `mosaic queue`, `mosaic quality`, `mosaic gateway`, `mosaic brain`
- Plugin discovery for installed @mosaic/\* packages

#### `plugins/discord` — @mosaic/discord-plugin (NEW — high priority)

Discord remote control channel. Architecture inspired by OpenClaw (https://github.com/openclaw/openclaw).
- Channel plugin that registers with the gateway as a NestJS dynamic module
- Single-guild binding only (v0.1.0) — prevents data leaks between servers
- Receives Discord messages, dispatches through gateway routing
- Streams agent responses back to Discord (chunked for 2000-char limit)
- Supports mention-based activation, thread management for multi-turn
- Bot pairing and permission management (Discord user → Mosaic user mapping)
- DM support for private conversations

#### `plugins/telegram` — @mosaic/telegram-plugin (NEW)

Telegram remote control channel.

- Same channel plugin pattern as Discord
- Telegraf-based bot
- Message routing through gateway
- Inline keyboard for interactive responses

---

## User/Stakeholder Requirements

### US-001 Multi-Channel Chat

**As a user**, I can chat with an AI assistant via web browser, terminal (Pi TUI), Discord, or Telegram and get consistent responses regardless of channel.

### US-002 Task & Project Dashboard

**As a user**, I can manage my tasks, projects, and missions from the web dashboard with kanban and list views.

### US-003 PRD Management

**As a user**, I can view and edit PRDs for active missions from the web dashboard.

### US-004 Agent Visibility

**As a user**, I can see which agents are active, what they're working on, and their status in real-time.

### US-005 Provider Configuration

**As a user**, I can configure which LLM providers to use and set routing preferences (cost vs capability).

### US-006 Skill Management

**As a user**, I can install and manage agent skills through the web dashboard.

### US-007 Persistent Memory

**As a user**, the system remembers my preferences, learned behaviors, and past decisions across sessions.

### US-008 Semantic Search

**As a user**, I can search across my memory, conversations, and knowledge semantically.

### US-009 User Management

**As an admin**, I can manage users, assign roles, and control access.
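
The role model behind US-009 could be sketched roughly as follows. This is a minimal illustration, not the planned `@mosaic/auth` implementation: the three role names and their scopes (admin: full access, member: own resources plus shared, viewer: read-only) come from this PRD, while `Action`, `Resource`, and `canAccess` are hypothetical names, and the rule that viewers only see their own or shared resources is an assumption.

```typescript
// Illustrative only: role names (admin, member, viewer) come from this PRD;
// the Action and Resource shapes and canAccess itself are hypothetical.
type Role = "admin" | "member" | "viewer";
type Action = "read" | "write" | "manage-users";

interface User {
  id: string;
  role: Role;
}

interface Resource {
  ownerId: string;
  shared: boolean;
}

function canAccess(user: User, action: Action, resource?: Resource): boolean {
  // Admins have full access, including user management.
  if (user.role === "admin") return true;

  // User management is admin-only; other actions need a target resource.
  if (action === "manage-users" || resource === undefined) return false;

  // Assumption: members and viewers only see their own or shared resources.
  const visible = resource.ownerId === user.id || resource.shared;

  // Viewers are read-only; members may also write.
  return user.role === "viewer" ? visible && action === "read" : visible;
}

const doc: Resource = { ownerId: "alice", shared: true };
console.log(canAccess({ id: "bob", role: "viewer" }, "write", doc)); // false
console.log(canAccess({ id: "bob", role: "member" }, "write", doc)); // true
```

Keeping the check a pure function would let the same rule back both a gateway guard and the admin panel, though where enforcement actually lives is a design decision this sketch does not settle.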
### US-010 SSO Configuration

**As an admin**, I can configure SSO via Authentik, WorkOS, or Keycloak.

### US-011 Self-Hosted Deployment

**As a user**, I can run Mosaic Stack via Docker Compose or directly on bare metal.

### US-012 Intelligent Routing

**As an agent operator**, the gateway intelligently routes tasks to the cheapest capable model.

### US-013 CLI Tooling

**As a user**, I can use the `mosaic` CLI for PRD creation, quality rail setup, queue management, and mission coordination.

---

## Functional Requirements

- FR-1: Chat System
- FR-2: Gateway Orchestrator
- FR-3: Agent Pool
- FR-4: Task Management
- FR-5: Project Management
- FR-6: Mission System
- FR-7: Memory System
- FR-8: Authentication & Authorization
- FR-9: Remote Control — Discord
- FR-10: Remote Control — Telegram
- FR-11: LLM Provider Management
- FR-12: Agent Routing
- FR-13: MCP Capability
- FR-14: Skill Management
- FR-15: CLI Integration
- FR-16: Log Service
- FR-17: Gateway State Persistence
- FR-18: Multi-Session Agent Architecture
- FR-19: Cron Scheduler
- FR-20: Web Search Tool
- FR-21: Skill Import from skills.sh

### FR-1: Chat System

- Conversation CRUD (create, list, get with messages, delete)
- Real-time streaming responses via WebSocket
- Multi-provider support (route to configured LLM)
- Conversation history with search
- Project-scoped conversations
- System prompt per project/conversation
- Message rendering with markdown, code blocks, tool call display

### FR-2: Gateway Orchestrator

- Central API surface for all clients (web, TUI, Discord, Telegram)
- Agent dispatch — receive task, select provider/model, spawn Pi session, return result
- Routing engine — cost/capability matrix, user preference overrides, task-type heuristics
- Plugin host — load channel plugins at startup, manage lifecycle
- MCP server — expose Mosaic tools via MCP protocol
- WebSocket hub — real-time updates for chat, agent status, notifications
- Rate limiting and request validation

### FR-3: Agent Pool (@mosaic/agent)

- Manage concurrent Pi SDK sessions
- Provider configuration: API key management, endpoint URLs, model lists
- Support providers: Anthropic (subscription + API), OpenAI/Codex (subscription + API), Z.ai, Ollama (local), LM Studio (local), llama.cpp (local)
- Tool injection — all agent sessions get Mosaic tools (brain, queue, memory)
- Skill loading — configure skills per agent session based on task type
- Session monitoring — track active sessions, token usage, duration
- Graceful shutdown — drain active sessions on shutdown

### FR-4: Task Management

- Brain-backed task CRUD with full filter/sort
- Task statuses: backlog, scheduled, in-progress, blocked, done, cancelled
- Priority levels: critical, high, medium, low
- Domain categorization
- Dependency tracking (blocks/blocked_by)
- Project association
- Assignee tracking
- Kanban board view in web dashboard
- Due date tracking with stale detection

### FR-5: Project Management

- Project CRUD with domain, status, priority
- Link to repository, branch, current/next milestone
- Progress tracking
- Blocker tracking
- Owner assignment

### FR-6: Mission System

- Mission CRUD (linked to project and PRD)
- Mission tasks with phases, dependencies, ordering
- Mission summary with computed progress
- Mission coordination via @mosaic/coord
- Active mission dashboard in web UI

### FR-7: Memory System

- **Preferences**: Key-value store for learned user preferences (e.g., "prefers tables over paragraphs", "timezone: America/Chicago")
- **Insights**: Distilled knowledge from agent interactions, stored with embeddings
- **Semantic search**: Query across all memory using natural language
- **Auto-capture**: Agent sessions automatically log decisions and learnings
- **Summarization**: Periodic compression of raw logs into structured insights
- **Decay**: Old, unused insights decay in relevance score over time

### FR-8: Authentication & Authorization

- BetterAuth integration with Next.js
- Email/password registration
  and login
- SSO via OIDC/SAML: Authentik, WorkOS, Keycloak
- RBAC roles: admin (full access), member (own resources + shared), viewer (read-only)
- API key generation for programmatic/MCP access
- Session management (web + API)

### FR-9: Remote Control — Discord

- Discord bot that connects to the gateway
- Mention-based activation in channels
- DM support for private conversations
- Thread creation for multi-turn conversations
- Chunked message delivery (Discord 2000-char limit)
- Bot configuration via web dashboard
- Permission management (which Discord users/roles can interact)

### FR-10: Remote Control — Telegram

- Telegram bot via Telegraf
- Private and group chat support
- Command-based interaction (`/ask`, `/task`, `/status`)
- Inline keyboard for task management
- Message routing through gateway

### FR-11: LLM Provider Management

- Provider configuration UI in web dashboard
- Per-provider: API key/endpoint, enabled models, cost per token
- Subscription-based providers: detect available models from subscription
- Local providers: Ollama model list, LM Studio endpoint, llama.cpp binary path
- Provider health monitoring
- Usage tracking per provider/model

### FR-12: Agent Routing

- Task-type to model-tier mapping (from AGENTS.md cost matrix)
- User preference overrides (e.g., "always use Claude for code review")
- Fallback chains (if primary provider unavailable, try next)
- Cost tracking and budget enforcement
- Routing transparency — user can see why a particular model was chosen

### FR-13: MCP Capability

- Gateway exposes MCP server (streamable HTTP transport)
- Brain tools registered as MCP tools
- Queue tools registered as MCP tools
- Memory search registered as MCP tool
- Agent sessions can call MCP tools from other services
- External MCP server connectivity (agent can use third-party MCP servers)

### FR-14: Skill Management

- Skill catalog — list available skills from configured sources
- Skill install — install skill to `~/.config/mosaic/skills/` or
  project-local
- Skill configuration — per-skill settings
- Skill status — installed, available, update available
- Web UI for browsing and managing skills

### FR-15: CLI Integration

- `mosaic gateway start` — start the gateway server
- `mosaic brain` — brain data management
- `mosaic queue` — queue operations
- `mosaic coord` — mission coordination
- `mosaic prdy` — PRD wizard
- `mosaic quality` — quality rail management
- `mosaic tui` — launch Pi TUI connected to gateway

### FR-16: Log Service

- Structured log ingest from agent sessions
- Parse logs for: decisions made, tools used, errors encountered, learnings captured
- Tier management: hot (7 days, full detail), warm (30 days, summarized), cold (90 days, key facts only)
- Summarization pipeline: cheap LLM compresses aging logs on schedule
- Query interface for log search

### FR-17: Gateway State Persistence

- Orchestration state persisted to Valkey (active sessions, pending dispatches, routing context)
- On restart, gateway reads Valkey state and resumes — reconnects to active agent sessions
- `mosaic gateway restart --fresh` clears Valkey queue and all in-flight state (nuclear option)
- Session recovery: detect orphaned agent sessions, offer reconnect or cleanup

### FR-18: Multi-Session Agent Architecture

- Each agent has a distinct session with dedicated context
- Multiple input channels (TUI, web, Discord, Telegram) can connect to same agent session
- Channel multiplexing at gateway level with proper authorization
- Discord channel ID paired to specific agent/session (prevents cross-contamination)
- Agent session runs in named tmux session for barge-in capability
- `mosaic agent attach ` connects to agent's tmux session
- `mosaic agent list` shows active sessions with connected channels

### FR-19: Cron Scheduler

- Built-in cron scheduler in gateway for recurring tasks
- Default schedules: log summarization, stale task detection, memory decay, provider health checks
- Custom schedules: user-defined agent
dispatches on cron expressions
- Schedule management via web dashboard and CLI
- Cron jobs dispatched through normal gateway routing pipeline
- Persistence: schedules stored in PG, survive gateway restart

### FR-20: Web Search Tool

- DuckDuckGo web search via MCP server (primary — privacy-respecting, no API key)
- Registered as standard MCP tool available to all agent sessions
- Configurable: can swap to other search providers (Brave, SearXNG, Tavily)
- Results formatted for agent consumption (title, snippet, URL)

### FR-21: Skill Import from skills.sh

- Browse skills from https://skills.sh directory via API
- Import skills into `~/.config/mosaic/skills/` or project-local `.mosaic/skills/`
- Vetting workflow: imported skills marked as "unvetted" until admin approves
- Skill review interface in web dashboard (view skill content before approval)
- Vetted skills auto-available to agent sessions; unvetted require explicit enable
- `mosaic skill import ` CLI command
- Track installed skills, versions, update availability

---

## Non-Functional Requirements

### Security

- No hardcoded secrets — all secrets via environment variables or vault
- API key rotation capability
- RBAC enforcement at gateway level
- Input validation (Zod) on all API endpoints
- Rate limiting on public endpoints
- CORS configuration for web app
- Secure WebSocket connections
- SSO token validation
- Database connection encryption (SSL)

### Performance

- Chat response streaming latency < 200ms TTFB (gateway overhead, not LLM latency)
- Dashboard page loads < 2s
- Brain query responses < 100ms for filtered reads
- Semantic search < 500ms
- Support 10+ concurrent agent sessions
- WebSocket connection handling for 50+ concurrent users

### Reliability

- Graceful degradation when LLM provider is unavailable (fallback chain)
- Queue persistence — tasks survive gateway restart
- Database connection pooling with retry
- Health check endpoints for all services
- Structured error responses with correlation
IDs

### Observability (Wide-Event Logging — Required from Phase 0)

- **OpenTelemetry instrumentation** across all services from day one
- `@opentelemetry/sdk-node` + `@opentelemetry/auto-instrumentations-node` for auto-instrumentation (HTTP, PG, Fastify/NestJS)
- NestJS interceptors for custom spans on agent dispatch, routing decisions, memory writes, summarization runs
- Every significant operation emits a structured event with rich context (wide events, not just request/response)
- **SigNoz** as OTEL backend (single Docker service: traces, metrics, logs, built-in UI)
- Request tracing with correlation IDs (trace-id propagated across gateway → agent → brain → queue)
- Agent session metrics (duration, tokens, cost, success/failure, model used, routing reason)
- Provider availability monitoring (health check spans)
- Queue depth monitoring (periodic gauge metrics)
- Memory usage metrics (embedding count, search latency, summarization runs)
- Migrate to Grafana stack (Tempo + Loki + Grafana) post-beta if more customization is needed

### Scalability (Multi-Tier Readiness)

- Single-node deployment is the MVP target for v0.1.0
- Code structured with assumption that multi-tiered deployment will follow: dedicated gateway nodes, agent worker nodes, brain/DB nodes
- Service boundaries communicate via HTTP/WS/MCP APIs, not in-process calls where avoidable
- Gateway is stateless (all state in PG/Valkey) to enable horizontal scaling
- Agent pool designed as independently scalable service
- Database migrations support forward-only schema evolution
- Hierarchical deployment with dedicated roles/specialties is the post-beta target

---

## Acceptance Criteria

### AC-1: Core Chat Flow

- [ ] User can log in via web UI, send a message, and receive a streamed response
- [ ] Conversation persists across page refreshes
- [ ] User can create, list, search, and delete conversations
- [ ] Conversations can be scoped to projects

### AC-2: TUI Integration

- [ ] `mosaic tui` launches Pi
interactive mode connected to gateway
- [ ] User can chat with same conversation context as web UI
- [ ] Agent has access to brain, queue, and memory tools

### AC-3: Discord Remote Control

- [ ] Discord bot connects and responds to mentions
- [ ] Messages route through gateway to agent pool
- [ ] Responses stream back to Discord (chunked)
- [ ] Thread creation for multi-turn conversations

### AC-4: Gateway Orchestration

- [ ] Gateway dispatches tasks to appropriate provider/model
- [ ] Routing decision logged and inspectable
- [ ] Fallback when primary provider unavailable
- [ ] Multiple concurrent agent sessions managed correctly

### AC-5: Task & Project Management

- [ ] CRUD operations for tasks, projects, missions via web dashboard
- [ ] Kanban board view for tasks
- [ ] Mission progress tracking with computed stats
- [ ] Brain MCP tools accessible from agent sessions

### AC-6: Memory System

- [ ] Agent sessions auto-capture decisions and learnings
- [ ] Semantic search returns relevant past context
- [ ] Learned preferences are applied in new sessions
- [ ] Log summarization runs on schedule, old logs compressed

### AC-7: Authentication & RBAC

- [ ] Email/password login works
- [ ] At least one SSO provider (Authentik) works end-to-end
- [ ] Admin can create users and assign roles
- [ ] RBAC enforced on API endpoints

### AC-8: Multi-Provider LLM Support

- [ ] At least 3 providers configured and routing correctly (e.g., Anthropic + Ollama + Z.ai)
- [ ] Agent routing selects appropriate model for task type
- [ ] Provider configuration manageable from web UI

### AC-9: MCP

- [ ] Gateway exposes MCP endpoint
- [ ] Brain and queue tools callable via MCP
- [ ] Agent sessions can connect to external MCP servers

### AC-10: Deployment

- [ ] `docker compose up` starts full stack from clean state
- [ ] `mosaic` CLI installable and functional on bare metal
- [ ] Database migrations run automatically on first start
- [ ] `.env.example` documents all required configuration
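
The last AC-10 item implies that a service should refuse to start when a documented setting is absent. A minimal sketch of such a fail-fast check follows. The variable names in `REQUIRED_ENV` and the `loadConfig` helper are hypothetical illustrations, not names defined by this PRD; the real list would come from `.env.example`.

```typescript
// Hypothetical variable names for illustration only; the authoritative
// list is whatever `.env.example` ends up documenting.
const REQUIRED_ENV = ["DATABASE_URL", "VALKEY_URL", "BETTER_AUTH_SECRET"] as const;

type RequiredKey = (typeof REQUIRED_ENV)[number];
type AppConfig = Record<RequiredKey, string>;

/**
 * Returns a fully populated config object, or throws a single error
 * that names every missing or blank key.
 */
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_ENV.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });

  if (missing.length > 0) {
    throw new Error(`Missing required configuration: ${missing.join(", ")}`);
  }

  const config = {} as AppConfig;
  for (const key of REQUIRED_ENV) {
    config[key] = env[key] as string;
  }
  return config;
}
```

In a gateway entry point this could be called once with `process.env` before any server is created, so a clean `docker compose up` fails with one readable message instead of a later connection error. The Security section calls for Zod on API input; the same schema approach could replace this hand-rolled check.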

### AC-11: @mosaic/\* Packages

- [ ] All 7 migrated packages build, pass tests, and integrate with gateway
- [ ] `mosaic` CLI provides subcommands for each package
- [ ] Types package is the single source of shared interfaces

---

## Constraints and Dependencies

1. **Pi SDK** — Core dependency; any Pi breaking changes affect the agent layer. Pin to known-good version.
2. **BetterAuth** — Auth framework; must support SSO adapters. Verify Authentik/WorkOS/Keycloak support before committing.
3. **Drizzle ORM** — Database layer; must support PostgreSQL + pgvector extension.
4. **Discord API** — Rate limits, intent requirements, message size limits (2000 chars).
5. **Valkey** — Queue backend; must be available for queue and caching.
6. **Gitea registry** — Package publishing target; `.npmrc` must be configured.
7. **OpenClaw** — Reference architecture for Discord/Telegram plugin pattern (https://github.com/openclaw/openclaw). Inspiration only, not a dependency.

---

## Risks and Open Questions

### Risks

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| Pi SDK API instability (pre-1.0) | Medium | High | Pin version, abstract behind @mosaic/agent interface |
| Brain PG migration complexity | Medium | Medium | Preserve Brain REST/MCP API contract; only storage changes |
| Discord plugin complexity (OpenClaw has ~60 files) | Medium | Medium | Start minimal (DM + mention in channel), single-guild only; expand iteratively post-beta |
| LLM provider subscription auth varies by provider | Medium | Medium | Abstract behind provider interface; implement per-provider adapters |
| Drizzle + pgvector extension compatibility | Low | Medium | Validate in Phase 0 with spike |
| Agent log volume overwhelming storage | Medium | High | Tiered storage with aggressive summarization; configurable retention |
| Scope creep from jarvis-old feature surface | High | High | Strict v0.1.0 scope; features not listed above are post-beta |

### Open Questions

| # | Question | Priority | Status |
| --- | --- | --- | --- |
| 1 | Pi SDK version to pin for v0.1.0? | High | ✅ Resolved — Pin `@mariozechner/pi-coding-agent@~0.57.1` (current stable). Abstract behind `@mosaic/agent` interface to insulate from breaking changes. Bump deliberately after testing. |
| 2 | Authentik vs WorkOS vs Keycloak — which SSO provider to implement first? | Medium | ✅ Resolved — Authentik first (already in Jason's infrastructure) |
| 3 | Vector DB: pgvector sufficient or need Qdrant from the start? | Medium | ✅ Resolved — pgvector with VectorStore interface abstraction. Qdrant drops in later if needed. |
| 4 | Summarization LLM: which model for log compression? | Medium | ✅ Resolved — Haiku-tier default with structured output guardrails, configurable via routing engine. |
| 5 | LM Studio and llama.cpp — provider adapters exist in Pi or need custom? | Medium | ✅ Resolved — Pi handles both natively. LM Studio and llama.cpp (server mode) expose OpenAI-compatible APIs; configure via Pi's `models.json` with `openai-completions` API type. No custom adapters needed. |
| 6 | Discord bot — single guild or multi-guild from day one? | Medium | ✅ Resolved — Single-guild only for v0.1.0 to prevent data leaks. Bot binds to one guild. Multi-guild with tenant isolation is a post-beta feature requiring explicit data boundary design. |
| 7 | Bare-metal install — systemd units or just docs? | Low | ASSUMPTION: Docs + CLI launch commands; systemd units post-beta |

---

## Testing and Verification Expectations

1. **Baseline checks**: `pnpm typecheck && pnpm lint && pnpm test` must pass across all packages
2. **Unit tests**: Vitest for all packages; mocked dependencies for isolation
3. **Integration tests**: Gateway + Brain + Queue with test PG + Valkey (Docker services in CI)
4. **E2E tests**: Playwright for web dashboard critical paths (login, chat, task CRUD)
5. **Agent tests**: Pi SDK session tests with mock provider (verify tool registration, routing)
6. **Evidence format**: CI pipeline green + test count report per package

---

## Milestone / Delivery Intent

All work is **alpha** (< 0.1.0) until Jason approves 0.1.0 beta release.

### Phase 0: Foundation (v0.0.1)

- Scaffold monorepo (pnpm + turbo + tsconfig + eslint + vitest)
- `@mosaic/types` — migrate and extend from v0
- `@mosaic/db` — Drizzle schema, PG connection, migrations
- `@mosaic/auth` — BetterAuth setup with email/password
- OTEL foundation — `@opentelemetry/sdk-node` setup, SigNoz in docker-compose, trace propagation wired
- Docker Compose (PG 17 + Valkey + SigNoz)
- CI pipeline (Woodpecker)
- AGENTS.md, CLAUDE.md, README.md

### Phase 1: Core API (v0.0.2)

- `apps/gateway` — NestJS server (Fastify adapter), auth middleware, health endpoints
- `@mosaic/brain` — migrate from v0, swap JSON store for PG via @mosaic/db
- `@mosaic/queue` — migrate from v0 (minimal changes)
- Gateway routes: conversations, tasks, projects, missions
- WebSocket server for chat streaming
- Basic agent dispatch (single provider, no routing)

### Phase 2: Agent Layer (v0.0.3)

- `@mosaic/agent` — Pi SDK integration, agent pool manager
- Multi-provider support (Anthropic + Ollama minimum)
- Agent routing engine (cost/capability matrix)
- Tool registration (brain, queue, memory tools injected into agent sessions)
- `@mosaic/coord` — migrate from v0, integrate with gateway

### Phase 3: Web Dashboard (v0.0.4)

- `apps/web` — Next.js app with BetterAuth
- Chat UI (conversation list, message display, streaming input)
- Task management (list +
kanban)
- Project and mission views
- Settings (provider config, profile)
- Admin panel (user management, RBAC)

### Phase 4: Memory & Intelligence (v0.0.5)

- `@mosaic/memory` — preference store, insight store, semantic search
- `@mosaic/log` — log ingest, parsing, tiered storage
- Summarization pipeline
- Memory integration into agent sessions
- Skill management interface (web UI + CLI)

### Phase 5: Remote Control (v0.0.6)

- `@mosaic/discord-plugin` — Discord channel plugin
- `@mosaic/telegram-plugin` — Telegram channel plugin
- Plugin host in gateway
- SSO configuration (Authentik)

### Phase 6: CLI & Tools (v0.0.7)

- `@mosaic/cli` — unified CLI with all subcommands
- `@mosaic/prdy` — migrate from v0
- `@mosaic/quality-rails` — migrate from v0
- `@mosaic/mosaic` — install wizard updated for v1
- Pi TUI integration (`mosaic tui`)

### Phase 7: Polish & Beta (v0.0.8 → v0.1.0)

- MCP endpoint hardening
- Additional SSO providers (WorkOS/Keycloak)
- Additional LLM providers (Codex, Z.ai, LM Studio, llama.cpp)
- Bare-metal deployment documentation
- E2E test suite
- Performance optimization
- Documentation: user guide, admin guide, developer guide
- **Jason approval gate → v0.1.0 beta release**

---

## Assumptions

1. RESOLVED: **pgvector is sufficient** for semantic search at v0.1.0 scale (personal/family/team = thousands to low hundreds-of-thousands of vectors). `@mosaic/memory` defines a `VectorStore` interface with pgvector as the default adapter. The interface boundary makes Qdrant a drop-in migration if PG resource contention or scale demands it later. Zero additional infrastructure for v0.1.0. Rationale: Reduces ops burden; pgvector HNSW indexes are fast at this scale; interface abstraction costs almost nothing now.
2. RESOLVED: **Authentik is the first SSO provider** — confirmed, already running in Jason's infrastructure. WorkOS and Keycloak adapters follow in Phase 7.
3. RESOLVED: **NestJS with Fastify adapter for the gateway.** The gateway's complexity (plugin host, agent pool, routing engine, WebSocket hub, MCP server, auth, brain/queue/memory/log integration) warrants NestJS's module system, DI, and guards. Fastify performance preserved via adapter. Aligns with USER.md stated stack ("NestJS API + Next.js web"). @mosaic/brain's Fastify code migrates into a NestJS module.
4. RESOLVED: **OpenTelemetry from Phase 0.** Wide-event logging is required from the start. OTEL auto-instrumentation for NestJS/PG/HTTP via `@opentelemetry/sdk-node`. SigNoz as the all-in-one OTEL backend (single Docker service). Every significant operation emits structured events with rich context. Custom spans for agent dispatch, routing decisions, memory writes. Rationale: Retrofitting observability is painful; baking it in from day one means consistent instrumentation across all services.
5. ASSUMPTION: **Single-node deployment for v0.1.0**, but code structured for multi-tier. No Kubernetes yet. Docker Compose + bare metal. Service boundaries use HTTP/WS/MCP APIs (not in-process) so gateway, agent pool, and brain can split to separate nodes later. Rationale: Ship single-node MVP; the architecture doesn't fight horizontal scaling when needed.
6. ASSUMPTION: **Log summarization uses Haiku-tier LLM by default, configurable.** Haiku is well-suited for summarization (compression, not generation — source material is in context). Guardrails: structured output via Zod schema (force extraction of decisions/tools/outcomes/errors as discrete fields), chunked per-session processing (no bulk conflation), extraction-focused prompts. Raw logs stay in hot tier (7 days) as safety net. Users can override the summarization model via routing engine config if they want higher fidelity. Rationale: Haiku is 10-20x cheaper than Sonnet; log summarization runs on schedule against large volumes where cost matters.
7. ASSUMPTION: **Discord plugin starts minimal and single-guild only** — DM support, mention-based channel activation, thread management, chunked responses. Single guild binding to prevent data leaks between servers. Advanced features (voice, components, slash commands, multi-guild) are post-beta. Rationale: Proven pattern from OpenClaw; ship core interaction first; data isolation is non-negotiable.
8. ASSUMPTION: **Telegram plugin is lower priority than Discord** and may ship as v0.0.7 or later if Discord takes longer than expected. Rationale: Jason indicated Discord as the high-priority remote channel.
9. ASSUMPTION: **Brain's REST API is preserved** as a gateway sub-router (mounted at `/api/brain/*` or similar). Existing MCP tools continue to work. Only the storage backend changes. Rationale: Minimize migration risk; brain's API contract is proven.
10. ASSUMPTION: **Conversations and messages get their own PG tables** (not stored in brain's entity model). They follow a chat-specific schema with proper foreign keys to users and projects. Rationale: Chat has different access patterns (streaming, pagination, search) than brain entities.
11. RESOLVED: **Pi handles all target LLM providers natively.** Anthropic, OpenAI/Codex, Z.ai, Ollama, LM Studio, and llama.cpp are all supported via Pi's built-in providers or `models.json` configuration with `openai-completions` API type. No custom provider adapters needed in @mosaic/agent — only configuration management.
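
The "chunked responses" item in Assumption 7 (and the 2000-character limit noted in FR-9 and the Discord API constraint) can be illustrated with a small splitting routine. This is a sketch under stated assumptions, not the plugin's actual implementation: `chunkMessage` and `DISCORD_MESSAGE_LIMIT` are hypothetical names, and the choice to break on newlines first, then spaces, then hard-cut is one reasonable strategy among several.

```typescript
// Limit taken from the PRD (FR-9); the constant name is hypothetical.
const DISCORD_MESSAGE_LIMIT = 2000;

/**
 * Splits `text` into pieces of at most `limit` characters.
 * Prefers to break at the last newline in range, then the last space,
 * and falls back to a hard cut when a run has no whitespace at all.
 */
function chunkMessage(text: string, limit: number = DISCORD_MESSAGE_LIMIT): string[] {
  if (limit <= 0) {
    throw new Error("limit must be positive");
  }

  const chunks: string[] = [];
  let rest = text;

  while (rest.length > limit) {
    // Look one character past the limit so a break exactly at `limit`
    // still yields a chunk of `limit` characters.
    const candidate = rest.slice(0, limit + 1);
    let cut = candidate.lastIndexOf("\n");
    if (cut <= 0) cut = candidate.lastIndexOf(" ");
    if (cut <= 0) cut = limit; // no usable break point: hard cut

    chunks.push(rest.slice(0, cut));
    // Drop the whitespace we broke on so the next chunk starts cleanly.
    rest = rest.slice(cut).replace(/^\s+/, "");
  }

  if (rest.length > 0) {
    chunks.push(rest);
  }
  return chunks;
}
```

For example, `chunkMessage("aaa bbb ccc", 7)` splits at the second space and `chunkMessage("abcdefghij", 4)` falls back to hard cuts. The sketch strips leading whitespace from each following chunk and does not keep Markdown code fences balanced across chunks; a production plugin would likely need to handle both.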