PRD: Mosaic Stack v0.1.0

Metadata

Owner: Jason Woltje
Date: 2026-03-12
Status: draft
Best-Guess Mode: true
Repo (target): git.mosaicstack.dev/mosaic/mosaic-stack
Baseline: ~/src/jarvis-old (jarvis v0.2.0)
Package source: ~/src/mosaic-mono-v0 (@mosaic/* packages)
Agent harness: pi (v0.57.1)
Remote control reference: OpenClaw (upstream, canonical)

Problem Statement

Jarvis (v0.2.0) is a self-hosted AI assistant with a Python FastAPI backend and Next.js frontend. It handles chat, projects, tasks, and LLM routing but lacks orchestration depth, agent coordination, shared memory, and remote access. The Mosaic framework (~/.config/mosaic) provides agent guides, shell-based orchestration tools, and quality rails — but these are loose scripts, not an integrated platform. The @mosaic/* packages in mosaic-mono-v0 began consolidating these into TypeScript packages (brain, queue, coord, cli, prdy, quality-rails) but have no UI, no auth, and no agent runtime integration.

The gap: Three codebases with overlapping concerns, no unified runtime, no remote control surface (Discord/Telegram), no gateway orchestrator, and a Python backend that doesn't align with the target TypeScript-everywhere stack.

What Mosaic Stack solves: A single monorepo that brings together many pieces to make one beautiful picture — a self-hosted, multi-user AI agent platform with web dashboard, TUI, remote control, shared memory, mission orchestration, and extensible skill/plugin architecture. All TypeScript. Pi as the agent harness. Brain as the knowledge layer. Queue as the coordination backbone.

Objectives

Unified TypeScript monorepo — One repo, one language, one build pipeline for all Mosaic Stack components
Pi-powered agent runtime — Pi SDK embedded as the core agent loop; Pi TUI as the terminal interface
Web + TUI + Remote — Next.js dashboard for visual management, Pi TUI for terminal work, Discord/Telegram for remote control
Gateway orchestrator — Central routing layer that dispatches tasks to appropriate agents based on capability, cost, and context
Shared memory — PostgreSQL canonical store + vector DB for semantic search + tiered log summarization to prevent context creep
Multi-user with SSO — BetterAuth with Authentik/WorkOS/Keycloak SSO, RBAC for family/team/business use
Full @mosaic/* package integration — brain, queue, coord, mosaic, prdy, quality-rails, cli all integrated
Extensible — MCP capability, skill import interface, plugin architecture for LLM providers and remote channels

Scope

In Scope (v0.1.0 Beta)

Chat/conversation UI (web) — carry forward from jarvis-old, rewrite frontend to work with new backend
Pi TUI integration — terminal-based agent interaction using Pi SDK
Web dashboard — settings, task management, projects, PRDs, missions, agent status
Gateway orchestrator (@mosaic/gateway) — central dispatch for agent tasks with routing logic
Task management — CRUD, kanban, mission-scoped tasks, dependency tracking
Project management — projects, milestones, PRDs linked to missions
Shared memory system — learned preferences, behaviors, defaults; tiered storage with summarization
User management — RBAC (admin, member, viewer), multi-user capable
SSO — BetterAuth with Authentik/WorkOS/Keycloak adapter
Remote control — Discord plugin (high priority), Telegram plugin
LLM provider support: Anthropic, Codex, and Z.ai subscriptions; other API-based providers; local models via Ollama, LM Studio, and llama.cpp
Agent routing — task-based model/provider selection (cost/capability matrix)
MCP capability — server and client, tool registration
Skill import interface — browse, install, manage agent skills
@mosaic/brain — structured data layer (migrated to PG + vector DB backend)
@mosaic/queue — Valkey-backed task queue with MCP tools
@mosaic/coord — mission coordination engine
@mosaic/mosaic — install wizard / bootstrap
@mosaic/prdy — PRD wizard
@mosaic/quality-rails — code quality scaffolder
@mosaic/cli — unified mosaic CLI
Docker Compose deployment + bare-metal capability
Agent log service — ingest, parse, tier, summarize agent interaction logs

Out of Scope (v0.1.0)

SaaS / multi-tenant revenue model — this is a personal/family/team tool
Mobile native apps — web responsive is sufficient
Public npm registry publishing — Gitea registry only
Video/voice agent interaction
Full OpenClaw feature parity — we take inspiration, not wholesale migration
Calendar integration (deferred: brain tracks events, but there is no Google Calendar sync yet)
GLPI/helpdesk ticket sync (deferred)
Woodpecker CI integration tooling (deferred — focus on core platform first)

Architecture

High-Level System Diagram

┌─────────────────────────────────────────────────────────────────┐
│                        Mosaic Stack                              │
│                                                                  │
│  ┌──────────┐  ┌──────────┐  ┌─────────────┐  ┌──────────────┐ │
│  │ Next.js  │  │ Pi TUI   │  │  Discord    │  │  Telegram    │ │
│  │ Web App  │  │ Terminal │  │  Plugin     │  │  Plugin      │ │
│  └────┬─────┘  └────┬─────┘  └──────┬──────┘  └──────┬───────┘ │
│       │              │               │                │          │
│       └──────────────┴───────┬───────┴────────────────┘          │
│                              │                                   │
│                    ┌─────────▼──────────┐                        │
│                    │  @mosaic/gateway   │  ← Central Orchestrator│
│                    │  (NestJS+Fastify)  │                        │
│                    └────┬────┬────┬─────┘                        │
│                         │    │    │                               │
│          ┌──────────────┤    │    ├──────────────┐               │
│          │              │    │    │              │               │
│  ┌───────▼──────┐ ┌────▼────▼──┐ │  ┌───────────▼────────┐     │
│  │ @mosaic/brain│ │ @mosaic/   │ │  │ Agent Pool         │     │
│  │ (Data Layer) │ │ queue      │ │  │ (Pi SDK sessions)  │     │
│  └───────┬──────┘ └────────────┘ │  │ - Anthropic        │     │
│          │                       │  │ - Codex            │     │
│  ┌───────▼──────────────────┐    │  │ - Z.ai             │     │
│  │  PostgreSQL  │  VectorDB │    │  │ - Ollama           │     │
│  │  (canonical) │ (semantic)│    │  │ - LM Studio        │     │
│  └──────────────┴───────────┘    │  │ - llama.cpp        │     │
│                                  │  └────────────────────┘     │
│                    ┌─────────────▼──────┐                       │
│                    │  @mosaic/coord     │                        │
│                    │  Mission lifecycle │                        │
│                    └────────────────────┘                        │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐      │
│  │ @mosaic/cli  │  │ @mosaic/prdy │  │ @mosaic/         │      │
│  │              │  │              │  │ quality-rails    │      │
│  └──────────────┘  └──────────────┘  └──────────────────┘      │
│                                                                  │
│  ┌──────────────────────────────────────────────────────┐       │
│  │  Valkey (queue backend)  │  BetterAuth (SSO/RBAC)   │       │
│  └──────────────────────────────────────────────────────┘       │
└─────────────────────────────────────────────────────────────────┘

Technology Decisions

Layer	Technology	Rationale
Web Frontend	Next.js 16 + React 19 + Tailwind CSS	SSR, RSC; design tokens from @mosaic/design-tokens (mosaic-stack-website)
API / Gateway	NestJS + Fastify adapter	Module system, DI, guards/interceptors for complex gateway; Fastify performance underneath
Agent Runtime	Pi SDK (embedded)	Extensible harness with tools, skills, session management
TUI	Pi interactive mode	Native terminal agent interaction
Auth	BetterAuth + SSO adapters	Multi-user RBAC with Authentik/WorkOS/Keycloak
Database	PostgreSQL 17 + pgvector	Canonical store; pgvector for embedding search
Vector DB	pgvector + VectorStore interface	pgvector for v0.1.0; `VectorStore` abstraction in @mosaic/memory makes Qdrant a drop-in later
Cache / Queue	Valkey 8	Redis-compatible; proven in @mosaic/queue
ORM	Drizzle ORM	TypeScript-native, lightweight, good migration story
Validation	Zod	Already used across @mosaic/* packages
Build	pnpm workspaces + Turborepo	Proven in both jarvis-old and mosaic-mono-v0
Testing	Vitest + Playwright	Unit/integration via Vitest, E2E via Playwright
Remote Control	Discord.js + Telegraf	Inspired by OpenClaw plugin architecture
MCP	@modelcontextprotocol/sdk	Already used in @mosaic/brain and @mosaic/queue
Container	Docker Compose	Self-hosted; bare-metal also supported
CI	Woodpecker CI	Existing infrastructure at git.mosaicstack.dev
Observability	OpenTelemetry + SigNoz	Wide-event logging from day one; OTEL auto-instrumentation for NestJS/PG/HTTP; SigNoz as all-in-one backend
Log Processing	Custom ingest service	Parse agent logs → tiered storage → summarization
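
The `VectorStore` abstraction named in the table could look roughly like the sketch below. Only the name `VectorStore` comes from this PRD; the method names, record fields, and the `InMemoryVectorStore` stand-in are assumptions made for illustration, not the actual @mosaic/memory API.

```typescript
// Hypothetical shape of the VectorStore abstraction. Only the name
// `VectorStore` appears in this PRD; every method and field is an assumption.
interface VectorRecord {
  id: string;
  embedding: number[];
  text: string;
}

interface VectorMatch {
  id: string;
  text: string;
  score: number; // cosine similarity, higher is closer
}

interface VectorStore {
  upsert(record: VectorRecord): Promise<void>;
  search(query: number[], limit: number): Promise<VectorMatch[]>;
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// In-memory stand-in, useful for tests. A pgvector- or Qdrant-backed class
// would implement the same two methods.
class InMemoryVectorStore implements VectorStore {
  private readonly records = new Map<string, VectorRecord>();

  async upsert(record: VectorRecord): Promise<void> {
    this.records.set(record.id, record);
  }

  async search(query: number[], limit: number): Promise<VectorMatch[]> {
    return Array.from(this.records.values())
      .map((r) => ({ id: r.id, text: r.text, score: cosineSimilarity(query, r.embedding) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, limit);
  }
}
```

Callers depend only on the interface, so swapping the backend later would mean adding another class rather than touching the memory API consumers.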

Key Architecture Decisions

AD-1: TypeScript everywhere (no Python backend)

The jarvis-old FastAPI backend is not carried forward as code. Its domain logic (conversation management, LLM routing, task/project CRUD, auth) is reimplemented in TypeScript. The Python plugin system is replaced by Pi's extension/skill system and MCP tool registration.

AD-2: Pi SDK as the agent runtime

Instead of a custom LLM provider abstraction (jarvis-old's `BaseLLMProvider`), Pi SDK manages agent sessions. Pi handles model selection, tool calling, context management, and compaction. The gateway dispatches work to Pi sessions configured with appropriate providers.

AD-3: Gateway as the central nervous system (NestJS + Fastify adapter)

@mosaic/gateway is the single API surface. The web app, TUI, Discord, and Telegram all talk to the gateway. The gateway routes to brain (data), queue (coordination), agent pool (LLM work), and coord (mission lifecycle). This replaces the direct FastAPI-to-DB pattern from jarvis-old.

NestJS was chosen over raw Fastify because the gateway is inherently complex: it hosts channel plugins, agent pool management, routing engine, WebSocket hub, MCP server, auth middleware, and integrates brain, queue, memory, and log services. NestJS provides the module system, dependency injection, guards, and interceptors needed to organize this cleanly. NestJS uses Fastify as its HTTP adapter, so Fastify's performance is preserved. This also aligns with the stated stack preference in USER.md ("NestJS API + Next.js web"). @mosaic/brain's existing Fastify code migrates naturally into a NestJS module with Fastify adapter.

AD-4: Brain migrates from JSON files to PostgreSQL

@mosaic/brain currently uses a JSON file store. For Mosaic Stack, brain's data model (tasks, projects, events, agents, missions, tickets) moves to PostgreSQL via Drizzle ORM. Brain's REST + MCP interface is preserved; only the storage backend changes.

AD-5: Tiered memory with summarization

Agent interaction logs are ingested into a log service. Raw logs are stored short-term. A summarization pipeline (using a cheap LLM) periodically compresses logs into structured insights stored in the vector DB. This prevents unbounded log growth while preserving searchable context.

AD-6: Remote control via plugin architecture

Discord and Telegram plugins follow a channel plugin pattern inspired by OpenClaw (https://github.com/openclaw/openclaw). Each plugin registers as a channel with the gateway, receives messages, and dispatches them through the same routing pipeline as web/TUI messages.

AD-7: Gateway state persistence via Valkey (restart resilience)

The gateway persists its orchestration state (active sessions, pending dispatches, routing context, agent assignments) to Valkey. On restart, the gateway reads Valkey state and resumes operations: active agent sessions are reconnected or gracefully recovered. `mosaic gateway restart --fresh` is the nuclear option: it clears the Valkey queue and all in-flight state, starting with a clean slate. This prevents context/focus/direction loss that would otherwise occur on every restart.

AD-8: Multi-session agent architecture

Each agent operates in a distinct session. Multiple authorized input channels (TUI, web UI, Discord) can connect to the same agent session simultaneously. This means a user can start a conversation in Discord, continue in the web UI, and monitor via TUI, all feeding into the same agent context. OpenClaw has this concept; Mosaic Stack evolves it with proper session authorization and channel multiplexing at the gateway level.

AD-9: Discord channel-to-agent binding

Discord channels pair to specific agent/session combinations via channel ID binding. This provides data segregation: messages in #project-alpha route to the project-alpha agent session, and messages in #general route to a general-purpose session. The binding prevents cross-contamination between contexts and provides clear boundaries for multi-channel use.
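
A minimal sketch of channel ID binding, assuming a simple in-memory table. The channel IDs, session names, and function names are placeholders invented for illustration; the real binding would presumably be stored in PG and configured through the dashboard.

```typescript
// Hypothetical binding table: Discord channel ID -> agent session name.
// The IDs and session names below are placeholders, not real values.
const channelBindings = new Map<string, string>([
  ["100000000000000001", "project-alpha"],
  ["100000000000000002", "general"],
]);

// Returns the bound session, or undefined so the caller can ignore
// messages from channels that were never paired.
function resolveSessionForChannel(channelId: string): string | undefined {
  return channelBindings.get(channelId);
}

// Binds a channel to exactly one session. Rebinding to a different session
// is rejected so two contexts never share a channel by accident.
function bindChannel(channelId: string, sessionName: string): void {
  const existing = channelBindings.get(channelId);
  if (existing !== undefined && existing !== sessionName) {
    throw new Error(`channel ${channelId} is already bound to ${existing}`);
  }
  channelBindings.set(channelId, sessionName);
}
```

Channel IDs are kept as strings because Discord snowflake IDs exceed the range JavaScript numbers represent exactly.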

AD-10: Agent session barge-in via tmux

Each agent session runs in a dedicated, named tmux session (e.g., `mosaic-agent-project-alpha`). This enables barge-in: a user can attach to any active agent's tmux session to observe, interrupt, or redirect. `mosaic agent attach <session-name>` connects to the tmux session. This provides direct low-level access when the normal channel interfaces are insufficient.
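
The naming convention can be sketched as a pair of helpers that build tmux arguments. The function names and the sanitizing rule are assumptions; the `mosaic-agent-` prefix comes from the example above, and `new-session -d -s` and `attach-session -t` are standard tmux commands.

```typescript
// Assumption: agent names are normalized so the tmux session name contains
// only lowercase letters, digits, "-" and "_". This avoids "." and ":",
// which tmux treats specially in target names.
function tmuxSessionName(agentName: string): string {
  const safe = agentName
    .toLowerCase()
    .replace(/[^a-z0-9_-]+/g, "-")
    .replace(/^-+|-+$/g, "");
  if (safe.length === 0) throw new Error("agent name has no usable characters");
  return `mosaic-agent-${safe}`;
}

// Arguments for `tmux new-session -d -s <name>` (create detached session).
function tmuxNewSessionArgs(agentName: string): string[] {
  return ["new-session", "-d", "-s", tmuxSessionName(agentName)];
}

// Arguments for `tmux attach-session -t <name>` (barge in).
function tmuxAttachArgs(agentName: string): string[] {
  return ["attach-session", "-t", tmuxSessionName(agentName)];
}
```

Returning argument arrays rather than a shell string lets the caller pass them to a process spawner without shell quoting concerns.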

AD-11: Cron-based scheduled jobs

The gateway includes a cron scheduler for recurring tasks: log summarization runs, stale task detection, memory decay, provider health checks, scheduled agent dispatches. It uses node-cron or similar; schedules are configurable via web dashboard and stored in PG. Each cron job is a gateway-dispatched task that goes through the normal routing pipeline.

AD-12: Web search tool (DuckDuckGo MCP)

Agent sessions include a web search tool for information retrieval. DuckDuckGo via MCP server is the primary option (privacy-respecting, no API key required). The tool falls back to other search MCP providers if configured. It is registered as a standard MCP tool available to all agent sessions.

AD-13: Design system from @mosaic/design-tokens

The web dashboard uses the Mosaic Stack design system established in mosaic-stack-website. The @mosaic/design-tokens package provides CSS custom properties, Tailwind preset, and TS color/font/radius exports. Dark theme default with light theme support. Fonts: Outfit (sans), Fira Code (mono). Color palette: deep blue-grays with blue/purple/teal accents.

AD-14: Multi-tier deployment readiness

Code is structured assuming eventual multi-node deployment with dedicated roles (gateway nodes, agent worker nodes, brain/DB nodes). Packages communicate via well-defined APIs (HTTP/WS/MCP), not in-process calls where avoidable. Service boundaries are clean: gateway is stateless (state in PG/Valkey), agent pool can scale independently, brain is a separate service. v0.1.0 runs single-node; the architecture doesn't fight horizontal scaling later.

Package Structure

Monorepo Layout

mosaic-mono-v1/
├── apps/
│   ├── web/                    Next.js 16 web dashboard
│   └── gateway/                @mosaic/gateway — NestJS API + WebSocket
├── packages/
│   ├── types/                  @mosaic/types — shared type contracts
│   ├── brain/                  @mosaic/brain — data layer (PG-backed)
│   ├── queue/                  @mosaic/queue — Valkey task queue + MCP
│   ├── coord/                  @mosaic/coord — mission coordination
│   ├── mosaic/                 @mosaic/mosaic — install wizard
│   ├── prdy/                   @mosaic/prdy — PRD wizard
│   ├── quality-rails/          @mosaic/quality-rails — code quality scaffolder
│   ├── cli/                    @mosaic/cli — unified CLI
│   ├── auth/                   @mosaic/auth — BetterAuth config + SSO adapters
│   ├── db/                     @mosaic/db — Drizzle schema, migrations, connection
│   ├── agent/                  @mosaic/agent — Pi SDK integration, agent pool manager
│   ├── memory/                 @mosaic/memory — tiered memory + summarization service
│   ├── log/                    @mosaic/log — agent log ingest + processing
│   └── design-tokens/          @mosaic/design-tokens — CSS vars, Tailwind preset, colors
├── plugins/
│   ├── discord/                @mosaic/discord-plugin — Discord channel
│   └── telegram/               @mosaic/telegram-plugin — Telegram channel
├── docker/
│   ├── gateway.Dockerfile
│   ├── web.Dockerfile
│   └── init-db.sql
├── docs/
│   ├── PRD.md                  (this file)
│   ├── TASKS.md
│   └── scratchpads/
├── docker-compose.yml
├── pnpm-workspace.yaml
├── turbo.json
├── tsconfig.base.json
├── vitest.workspace.ts
├── AGENTS.md
├── CLAUDE.md
└── README.md

Package Responsibilities

`apps/gateway` — @mosaic/gateway (NEW — critical path)

The central nervous system. All clients connect here. Built with NestJS (Fastify adapter).

NestJS modules — Each concern (chat, brain, agent, auth, queue, memory, plugins) is a module with clear boundaries
Fastify adapter — Fastify performance under NestJS's organizational structure
WebSocket gateway — NestJS built-in WebSocket support for chat streaming, agent status, notifications
Agent routing engine — Routes tasks to appropriate LLM provider/model based on task type, cost tier, capability requirements
Session management — Tracks active conversations, agent sessions, user contexts
MCP server — Exposes Mosaic capabilities as MCP tools
Plugin host — Loads and manages channel plugins (Discord, Telegram)
Auth middleware — BetterAuth session validation, RBAC enforcement

Key routes:

POST   /api/chat                    Send message, get streamed response
GET    /api/conversations           List conversations
POST   /api/conversations           Create conversation
GET    /api/conversations/:id       Get conversation with messages
DELETE /api/conversations/:id       Delete conversation
POST   /api/tasks                   Create task (brain-backed)
GET    /api/tasks                   List/filter tasks
PATCH  /api/tasks/:id               Update task
GET    /api/projects                List projects
POST   /api/projects                Create project
GET    /api/missions                List missions
POST   /api/missions                Create mission
GET    /api/missions/:id            Mission summary with tasks
POST   /api/agents/dispatch         Dispatch work to agent pool
GET    /api/agents/status           Active agent sessions
GET    /api/memory/search           Semantic search across memory
POST   /api/memory/preferences      Store learned preference
GET    /api/skills                  List available skills
POST   /api/skills/install          Install a skill
GET    /api/providers               List configured LLM providers
POST   /api/providers               Configure LLM provider
GET    /api/admin/users             User management (admin)
POST   /api/admin/users             Create user (admin)
WS     /ws/chat/:conversationId     Streaming chat via WebSocket
WS     /ws/agents                   Agent status stream
GET    /mcp                         MCP endpoint (streamable HTTP)

`apps/web` — Next.js Web Dashboard

Carried forward from jarvis-old with significant refactoring.

Chat/conversation UI (primary interaction surface)
Settings management (providers, integrations, profile)
Task management (list, kanban, detail views)
Project management (list, detail, linked missions)
Mission dashboard (status, progress, task breakdown)
PRD viewer/editor
Agent status dashboard (active sessions, routing stats)
Skill browser and installer
User management (admin RBAC panel)
Auth pages (login, SSO redirect, registration)

`packages/types` — @mosaic/types

Migrated from mosaic-mono-v0. Extended with:

Gateway types (routing, dispatch, agent pool)
Auth types (user, role, permission)
Conversation/message types (from jarvis-old domain)
Memory types (preference, insight, summary)
Plugin channel types (Discord, Telegram message mapping)

`packages/brain` — @mosaic/brain

Migrated from mosaic-mono-v0. Storage backend changes from JSON to PostgreSQL.

REST API preserved (mounted as gateway sub-router or standalone)
MCP tools preserved
Collections layer rewritten to use Drizzle ORM queries instead of JSON file I/O
Same entity model: tasks, projects, events, agents, missions, mission-tasks, tickets
New: computed endpoints (today, stale, stats, search, audit) run against PG
New: appreciation collection preserved for family use

`packages/queue` — @mosaic/queue

Migrated from mosaic-mono-v0 with minimal changes.

Valkey-backed task queue with atomic WATCH/MULTI/EXEC
MCP server with 8 tools
Used by gateway for agent task dispatch and coordination

`packages/coord` — @mosaic/coord

Migrated from mosaic-mono-v0.

Mission lifecycle: init, run, resume, status, drain
TASKS.md parsing and management
Session lock management
Continuation prompt generation
Integration with gateway for mission-driven orchestration

`packages/db` — @mosaic/db (NEW)

Shared database package.

Drizzle ORM schema definitions (all tables)
Migration management
Connection pool configuration
Shared by gateway, brain, auth, memory

`packages/auth` — @mosaic/auth (NEW)

Authentication and authorization.

BetterAuth configuration
SSO adapters: Authentik, WorkOS, Keycloak
RBAC: roles (admin, member, viewer), permissions
API key generation for brain/MCP access
Session management middleware

`packages/agent` — @mosaic/agent (NEW — critical path)

Pi SDK integration layer.

Agent pool manager — spawns and manages Pi agent sessions
Provider configuration — Anthropic, Codex, Z.ai, Ollama, LM Studio, llama.cpp
Agent routing logic — selects provider/model based on task characteristics
Tool registration — registers Mosaic-specific tools (brain access, queue ops, memory search)
Skill management — loads and configures Pi skills for agent sessions
Session lifecycle — create, monitor, complete, fail, timeout

`packages/memory` — @mosaic/memory (NEW)

Tiered memory system.

Preference store — learned user preferences, behaviors, defaults (PG)
Insight store — distilled knowledge from agent interactions (PG + vector)
Semantic search — query across memory using pgvector embeddings
Summarization pipeline — compress raw logs into structured insights
Memory API — used by gateway and agent sessions

`packages/log` — @mosaic/log (NEW)

Agent log service.

Log ingest — receives structured logs from agent sessions
Log parsing — extracts decisions, learnings, tool usage patterns
Tiered storage — hot (recent, full detail), warm (summarized), cold (archived)
Summarization trigger — invokes cheap LLM to compress aging logs
Retention policy — configurable TTLs per tier

`packages/mosaic` — @mosaic/mosaic

Migrated from mosaic-mono-v0, updated for v1.

Install wizard for Mosaic Stack setup
Detects existing installations, offers upgrade path
Configures ~/.config/mosaic/ with guides, tools, runtime configs

`packages/prdy` — @mosaic/prdy

Migrated from mosaic-mono-v0.

PRD generation wizard
Template-based PRD creation with Zod validation
CLI integration via `mosaic prdy`

`packages/quality-rails` — @mosaic/quality-rails

Migrated from mosaic-mono-v0.

TypeScript scaffolder for project quality config
Generates ESLint, tsconfig, Woodpecker, husky, lint-staged configs
Supports project types: monorepo, typescript-node, nextjs

`packages/cli` — @mosaic/cli

Migrated from mosaic-mono-v0, extended.

Unified mosaic binary
Subcommands: `mosaic coord`, `mosaic prdy`, `mosaic queue`, `mosaic quality`, `mosaic gateway`, `mosaic brain`
Plugin discovery for installed @mosaic/* packages

`plugins/discord` — @mosaic/discord-plugin (NEW — high priority)

Discord remote control channel. Architecture inspired by OpenClaw (https://github.com/openclaw/openclaw).

Channel plugin that registers with the gateway as a NestJS dynamic module
Single-guild binding only (v0.1.0) — prevents data leaks between servers
Receives Discord messages, dispatches through gateway routing
Streams agent responses back to Discord (chunked for 2000-char limit)
Supports mention-based activation, thread management for multi-turn
Bot pairing and permission management (Discord user → Mosaic user mapping)
DM support for private conversations

`plugins/telegram` — @mosaic/telegram-plugin (NEW)

Telegram remote control channel.

Same channel plugin pattern as Discord
Telegraf-based bot
Message routing through gateway
Inline keyboard for interactive responses

User/Stakeholder Requirements

US-001 Multi-Channel Chat

As a user, I can chat with an AI assistant via web browser, terminal (Pi TUI), Discord, or Telegram and get consistent responses regardless of channel.

US-002 Task & Project Dashboard

As a user, I can manage my tasks, projects, and missions from the web dashboard with kanban and list views.

US-003 PRD Management

As a user, I can view and edit PRDs for active missions from the web dashboard.

US-004 Agent Visibility

As a user, I can see which agents are active, what they're working on, and their status in real-time.

US-005 Provider Configuration

As a user, I can configure which LLM providers to use and set routing preferences (cost vs capability).

US-006 Skill Management

As a user, I can install and manage agent skills through the web dashboard.

US-007 Persistent Memory

As a user, the system remembers my preferences, learned behaviors, and past decisions across sessions.

US-008 Semantic Search

As a user, I can search across my memory, conversations, and knowledge semantically.

US-009 User Management

As an admin, I can manage users, assign roles, and control access.

US-010 SSO Configuration

As an admin, I can configure SSO via Authentik, WorkOS, or Keycloak.

US-011 Self-Hosted Deployment

As a user, I can run Mosaic Stack via Docker Compose or directly on bare metal.

US-012 Intelligent Routing

As an agent operator, the gateway intelligently routes tasks to the cheapest capable model.

US-013 CLI Tooling

As a user, I can use the mosaic CLI for PRD creation, quality rail setup, queue management, and mission coordination.

Functional Requirements

FR-1: Chat System
FR-2: Gateway Orchestrator
FR-3: Agent Pool
FR-4: Task Management
FR-5: Project Management
FR-6: Mission System
FR-7: Memory System
FR-8: Authentication & Authorization
FR-9: Remote Control — Discord
FR-10: Remote Control — Telegram
FR-11: LLM Provider Management
FR-12: Agent Routing
FR-13: MCP Capability
FR-14: Skill Management
FR-15: CLI Integration
FR-16: Log Service
FR-17: Gateway State Persistence
FR-18: Multi-Session Agent Architecture
FR-19: Cron Scheduler
FR-20: Web Search Tool
FR-21: Skill Import from skills.sh

FR-1: Chat System

Conversation CRUD (create, list, get with messages, delete)
Real-time streaming responses via WebSocket
Multi-provider support (route to configured LLM)
Conversation history with search
Project-scoped conversations
System prompt per project/conversation
Message rendering with markdown, code blocks, tool call display

FR-2: Gateway Orchestrator

Central API surface for all clients (web, TUI, Discord, Telegram)
Agent dispatch — receive task, select provider/model, spawn Pi session, return result
Routing engine — cost/capability matrix, user preference overrides, task-type heuristics
Plugin host — load channel plugins at startup, manage lifecycle
MCP server — expose Mosaic tools via MCP protocol
WebSocket hub — real-time updates for chat, agent status, notifications
Rate limiting and request validation

FR-3: Agent Pool (@mosaic/agent)

Manage concurrent Pi SDK sessions
Provider configuration: API key management, endpoint URLs, model lists
Support providers: Anthropic (subscription + API), OpenAI/Codex (subscription + API), Z.ai, Ollama (local), LM Studio (local), llama.cpp (local)
Tool injection — all agent sessions get Mosaic tools (brain, queue, memory)
Skill loading — configure skills per agent session based on task type
Session monitoring — track active sessions, token usage, duration
Graceful shutdown — drain active sessions on shutdown

FR-4: Task Management

Brain-backed task CRUD with full filter/sort
Task statuses: backlog, scheduled, in-progress, blocked, done, cancelled
Priority levels: critical, high, medium, low
Domain categorization
Dependency tracking (blocks/blocked_by)
Project association
Assignee tracking
Kanban board view in web dashboard
Due date tracking with stale detection
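
Dependency tracking and stale detection could be expressed as two small predicates. The status values come from the list above; the record shape, the field names, and the definition of "stale" (past due and still open) are assumptions, since the PRD does not define them.

```typescript
type TaskStatus = "backlog" | "scheduled" | "in-progress" | "blocked" | "done" | "cancelled";

// Hypothetical task record; field names are assumptions.
interface TaskRecord {
  id: string;
  status: TaskStatus;
  blockedBy: string[];
  dueDate?: string; // ISO 8601 date
}

// IDs of dependencies that are not yet done. Unknown IDs count as unresolved.
function unresolvedBlockers(task: TaskRecord, tasksById: Map<string, TaskRecord>): string[] {
  return task.blockedBy.filter((id) => tasksById.get(id)?.status !== "done");
}

// Assumed definition of "stale": past its due date and still open.
function isStale(task: TaskRecord, now: Date): boolean {
  if (task.dueDate === undefined) return false;
  const open = task.status !== "done" && task.status !== "cancelled";
  return open && new Date(task.dueDate).getTime() < now.getTime();
}
```

In this sketch a cancelled dependency still counts as unresolved; whether cancelling a blocker should unblock its dependents is a product decision the PRD leaves open.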

FR-5: Project Management

Project CRUD with domain, status, priority
Link to repository, branch, current/next milestone
Progress tracking
Blocker tracking
Owner assignment

FR-6: Mission System

Mission CRUD (linked to project and PRD)
Mission tasks with phases, dependencies, ordering
Mission summary with computed progress
Mission coordination via @mosaic/coord
Active mission dashboard in web UI

FR-7: Memory System

Preferences: Key-value store for learned user preferences (e.g., "prefers tables over paragraphs", "timezone: America/Chicago")
Insights: Distilled knowledge from agent interactions, stored with embeddings
Semantic search: Query across all memory using natural language
Auto-capture: Agent sessions automatically log decisions and learnings
Summarization: Periodic compression of raw logs into structured insights
Decay: Old, unused insights decay in relevance score over time
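
The PRD does not specify how relevance decays. One plausible model, shown here purely as an assumption, is exponential decay with a configurable half-life: an insight unused for one half-life keeps half its score. The function and parameter names are hypothetical.

```typescript
// Assumed decay model: exponential with a configurable half-life.
// score(t) = baseScore * 0.5 ^ (daysSinceLastUse / halfLifeDays)
function decayedRelevance(
  baseScore: number,
  daysSinceLastUse: number,
  halfLifeDays: number,
): number {
  if (halfLifeDays <= 0) throw new Error("halfLifeDays must be positive");
  const elapsed = Math.max(0, daysSinceLastUse); // clock skew guard
  return baseScore * Math.pow(0.5, elapsed / halfLifeDays);
}

// With a 30-day half-life, a score of 1 becomes 0.5 after 30 days
// and 0.25 after 60 days.
const afterOneHalfLife = decayedRelevance(1, 30, 30);
const afterTwoHalfLives = decayedRelevance(1, 60, 30);
```

Resetting `daysSinceLastUse` whenever an insight is retrieved would make frequently used insights resist decay, which matches the "old, unused" wording above.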

FR-8: Authentication & Authorization

BetterAuth integration with Next.js
Email/password registration and login
SSO via OIDC/SAML: Authentik, WorkOS, Keycloak
RBAC roles: admin (full access), member (own resources + shared), viewer (read-only)
API key generation for programmatic/MCP access
Session management (web + API)
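
The three RBAC roles can be sketched as a single access check. The role names and their one-line descriptions come from the list above; the type names, the `shared` flag, and the choice to let members write to shared resources are assumptions, and this is not BetterAuth's API.

```typescript
type MosaicRole = "admin" | "member" | "viewer";
type AccessAction = "read" | "write";

// Hypothetical actor and resource shapes.
interface Actor {
  id: string;
  role: MosaicRole;
}

interface OwnedResource {
  ownerId: string;
  shared: boolean;
}

function canAccess(actor: Actor, action: AccessAction, resource: OwnedResource): boolean {
  switch (actor.role) {
    case "admin":
      return true; // full access
    case "viewer":
      return action === "read"; // read-only
    case "member":
      // own resources plus shared ones (assumed to include writes)
      return resource.ownerId === actor.id || resource.shared;
  }
}
```

A guard in the gateway could call a check like this after session validation, so every channel goes through the same authorization path.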

FR-9: Remote Control — Discord

Discord bot that connects to the gateway
Mention-based activation in channels
DM support for private conversations
Thread creation for multi-turn conversations
Chunked message delivery (Discord 2000-char limit)
Bot configuration via web dashboard
Permission management (which Discord users/roles can interact)
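
Chunked delivery for the 2000-character limit might be implemented along these lines. This is a sketch, not the plugin's actual code: the function name is hypothetical, and the preference for breaking at a newline is an assumption meant to keep chunks readable.

```typescript
const DISCORD_MAX_CHARS = 2000;

// Splits text into pieces no longer than `limit`, preferring to break just
// after the last newline inside each window and hard-cutting otherwise.
function chunkMessage(text: string, limit: number = DISCORD_MAX_CHARS): string[] {
  if (limit < 1) throw new Error("limit must be at least 1");
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const head = rest.slice(0, limit);
    const newline = head.lastIndexOf("\n");
    const cut = newline > 0 ? newline + 1 : limit;
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut);
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Joining the chunks reproduces the original text exactly. Two caveats a fuller version would handle: a hard cut can split a fenced code block across messages, and `length` counts UTF-16 code units, so a cut can land inside a surrogate pair.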

FR-10: Remote Control — Telegram

Telegram bot via Telegraf
Private and group chat support
Command-based interaction (/ask, /task, /status)
Inline keyboard for task management
Message routing through gateway

FR-11: LLM Provider Management

Provider configuration UI in web dashboard
Per-provider: API key/endpoint, enabled models, cost per token
Subscription-based providers: detect available models from subscription
Local providers: Ollama model list, LM Studio endpoint, llama.cpp binary path
Provider health monitoring
Usage tracking per provider/model

FR-12: Agent Routing

Task-type to model-tier mapping (from AGENTS.md cost matrix)
User preference overrides (e.g., "always use Claude for code review")
Fallback chains (if primary provider unavailable, try next)
Cost tracking and budget enforcement
Routing transparency — user can see why a particular model was chosen
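
The routing rules above can be sketched as one selection function. Everything here is illustrative: the catalog shape, the abstract capability score, and the model names in the usage are invented, and the real cost matrix lives in AGENTS.md. Budget enforcement is left out.

```typescript
// Hypothetical catalog entry; capability is an abstract score.
interface ModelOption {
  provider: string;
  model: string;
  capability: number;
  costPerMillionTokens: number;
  available: boolean;
}

interface RoutingDecision {
  choice: ModelOption;
  reason: string; // surfaced to the user for routing transparency
}

function routeTask(
  requiredCapability: number,
  catalog: ModelOption[],
  preferredModel?: string,
): RoutingDecision | undefined {
  // Unavailable providers drop out here, which gives the fallback behavior.
  const candidates = catalog.filter(
    (m) => m.available && m.capability >= requiredCapability,
  );
  const preferred = candidates.find((m) => m.model === preferredModel);
  if (preferred !== undefined) {
    return { choice: preferred, reason: "user preference override" };
  }
  const cheapest = candidates
    .slice()
    .sort((a, b) => a.costPerMillionTokens - b.costPerMillionTokens)[0];
  if (cheapest === undefined) return undefined;
  return {
    choice: cheapest,
    reason: "cheapest available model meeting the capability requirement",
  };
}

// Placeholder catalog; names and prices are invented.
const exampleCatalog: ModelOption[] = [
  { provider: "local", model: "small-local", capability: 2, costPerMillionTokens: 0, available: true },
  { provider: "cloud-a", model: "mid-cloud", capability: 3, costPerMillionTokens: 1, available: false },
  { provider: "cloud-b", model: "big-cloud", capability: 5, costPerMillionTokens: 10, available: true },
];
```

With this catalog, a task needing capability 3 routes to `big-cloud`, because `mid-cloud` is marked unavailable and `small-local` is not capable enough.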

FR-13: MCP Capability

Gateway exposes MCP server (streamable HTTP transport)
Brain tools registered as MCP tools
Queue tools registered as MCP tools
Memory search registered as MCP tool
Agent sessions can call MCP tools from other services
External MCP server connectivity (agent can use third-party MCP servers)

FR-14: Skill Management

Skill catalog — list available skills from configured sources
Skill install — install skill to ~/.config/mosaic/skills/ or project-local
Skill configuration — per-skill settings
Skill status — installed, available, update available
Web UI for browsing and managing skills

FR-15: CLI Integration

`mosaic gateway start`: start the gateway server
`mosaic brain`: brain data management
`mosaic queue`: queue operations
`mosaic coord`: mission coordination
`mosaic prdy`: PRD wizard
`mosaic quality`: quality rail management
`mosaic tui`: launch Pi TUI connected to gateway

FR-16: Log Service

Structured log ingest from agent sessions
Parse logs for: decisions made, tools used, errors encountered, learnings captured
Tier management: hot (7 days, full detail), warm (30 days, summarized), cold (90 days, key facts only)
Summarization pipeline: cheap LLM compresses aging logs on schedule
Query interface for log search
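The tier boundaries above can be expressed as a simple age check. This sketch assumes the day counts are cumulative ages (hot until day 7, warm until day 30, cold until day 90). The `expired` tier is a hypothetical addition, since the requirement does not say what happens to logs older than 90 days.

```typescript
// "expired" is an assumption; the PRD does not define behavior past 90 days.
type LogTier = "hot" | "warm" | "cold" | "expired";

const DAY_MS = 24 * 60 * 60 * 1000;

// Classify a log entry by its age relative to `now`.
function tierForAge(createdAt: Date, now: Date): LogTier {
  const ageDays = (now.getTime() - createdAt.getTime()) / DAY_MS;
  if (ageDays < 7) return "hot"; // full detail
  if (ageDays < 30) return "warm"; // summarized
  if (ageDays < 90) return "cold"; // key facts only
  return "expired";
}
```

A scheduled summarization job could use a function like this to select entries whose stored tier no longer matches their age.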

FR-17: Gateway State Persistence

Orchestration state persisted to Valkey (active sessions, pending dispatches, routing context)
On restart, gateway reads Valkey state and resumes — reconnects to active agent sessions
mosaic gateway restart --fresh clears Valkey queue and all in-flight state (nuclear option)
Session recovery: detect orphaned agent sessions, offer reconnect or cleanup

FR-18: Multi-Session Agent Architecture

Each agent has a distinct session with dedicated context
Multiple input channels (TUI, web, Discord, Telegram) can connect to same agent session
Channel multiplexing at gateway level with proper authorization
Discord channel ID paired to specific agent/session (prevents cross-contamination)
Agent session runs in named tmux session for barge-in capability
mosaic agent attach <session> connects to agent's tmux session
mosaic agent list shows active sessions with connected channels
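The pairing rule above (several channels may share one session, but a given channel belongs to exactly one session) could be enforced by a small registry such as the following. The class and method names are hypothetical, and authorization checks are assumed to happen before `attach` is called.

```typescript
type ChannelKind = "tui" | "web" | "discord" | "telegram";

interface ChannelRef {
  kind: ChannelKind;
  id: string; // for example a Discord channel ID or a WebSocket connection ID
}

// Hypothetical in-memory registry; a real gateway would persist this state.
class SessionChannelRegistry {
  private readonly channelToSession = new Map<string, string>();

  private static key(channel: ChannelRef): string {
    return `${channel.kind}:${channel.id}`;
  }

  // Bind a channel to a session; refuse if it is bound to a different one.
  attach(sessionId: string, channel: ChannelRef): void {
    const key = SessionChannelRegistry.key(channel);
    const existing = this.channelToSession.get(key);
    if (existing !== undefined && existing !== sessionId) {
      throw new Error(`${key} is already bound to session ${existing}`);
    }
    this.channelToSession.set(key, sessionId);
  }

  detach(channel: ChannelRef): void {
    this.channelToSession.delete(SessionChannelRegistry.key(channel));
  }

  sessionFor(channel: ChannelRef): string | undefined {
    return this.channelToSession.get(SessionChannelRegistry.key(channel));
  }

  // Connected channels for one session, as shown by a listing command.
  channelsFor(sessionId: string): string[] {
    const keys: string[] = [];
    this.channelToSession.forEach((boundSession, key) => {
      if (boundSession === sessionId) keys.push(key);
    });
    return keys;
  }
}
```

Rejecting a second binding outright, rather than silently rebinding, is what prevents one Discord channel from leaking into another agent's context.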

FR-19: Cron Scheduler

Built-in cron scheduler in gateway for recurring tasks
Default schedules: log summarization, stale task detection, memory decay, provider health checks
Custom schedules: user-defined agent dispatches on cron expressions
Schedule management via web dashboard and CLI
Cron jobs dispatched through normal gateway routing pipeline
Persistence: schedules stored in PG, survive gateway restart

FR-20: Web Search Tool

DuckDuckGo web search via MCP server (primary — privacy-respecting, no API key)
Registered as standard MCP tool available to all agent sessions
Configurable: can swap to other search providers (Brave, SearXNG, Tavily)
Results formatted for agent consumption (title, snippet, URL)
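To make the provider swap and the result format concrete, here is a minimal sketch. The `SearchProvider` interface and `formatResults` function are hypothetical names, and the numbered plain-text layout is only one plausible reading of "formatted for agent consumption".

```typescript
interface SearchResult {
  title: string;
  snippet: string;
  url: string;
}

// Hypothetical seam for swapping DuckDuckGo with Brave, SearXNG, or Tavily.
interface SearchProvider {
  name: string;
  search(query: string, limit: number): Promise<SearchResult[]>;
}

// Render results as a compact numbered list: title, snippet, URL.
function formatResults(results: SearchResult[], limit: number = 5): string {
  if (results.length === 0) return "No results.";
  return results
    .slice(0, limit)
    .map((r, i) => `${i + 1}. ${r.title}\n   ${r.snippet}\n   ${r.url}`)
    .join("\n\n");
}
```

Because the formatter only depends on `SearchResult`, changing the configured provider would not change what the agent sees.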

FR-21: Skill Import from skills.sh

Browse skills from https://skills.sh directory via API
Import skills into ~/.config/mosaic/skills/ or project-local .mosaic/skills/
Vetting workflow: imported skills marked as "unvetted" until admin approves
Skill review interface in web dashboard (view skill content before approval)
Vetted skills auto-available to agent sessions; unvetted require explicit enable
mosaic skill import <source/skillId> CLI command
Track installed skills, versions, update availability
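The vetting rule above reduces to a short filter. In this sketch the record shape and the `explicitlyEnabled` flag are hypothetical; only the rule itself (vetted skills are available automatically, unvetted ones need an explicit enable) comes from the requirement.

```typescript
type VettingStatus = "vetted" | "unvetted";

// Hypothetical record for an installed skill.
interface InstalledSkill {
  id: string;
  status: VettingStatus;
  explicitlyEnabled: boolean;
}

// Skills an agent session may load: vetted ones, plus unvetted ones
// that have been explicitly enabled.
function availableSkills(skills: InstalledSkill[]): InstalledSkill[] {
  return skills.filter((s) => s.status === "vetted" || s.explicitlyEnabled);
}
```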

Non-Functional Requirements

Security

No hardcoded secrets — all secrets via environment variables or vault
API key rotation capability
RBAC enforcement at gateway level
Input validation (Zod) on all API endpoints
Rate limiting on public endpoints
CORS configuration for web app
Secure WebSocket connections
SSO token validation
Database connection encryption (SSL)

Performance

Chat response streaming latency < 200ms TTFB (gateway overhead, not LLM latency)
Dashboard page loads < 2s
Brain query responses < 100ms for filtered reads
Semantic search < 500ms
Support 10+ concurrent agent sessions
WebSocket connection handling for 50+ concurrent users

Reliability

Graceful degradation when LLM provider is unavailable (fallback chain)
Queue persistence — tasks survive gateway restart
Database connection pooling with retry
Health check endpoints for all services
Structured error responses with correlation IDs
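A structured error response with a correlation ID might look like the sketch below. The header name `x-correlation-id`, the body shape, and both function names are assumptions for illustration. In practice the ID may simply be the OpenTelemetry trace ID.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical error body; field names are assumptions.
interface ErrorBody {
  error: {
    code: string;
    message: string;
    correlationId: string;
  };
}

// Reuse the caller's correlation ID when present, otherwise mint one.
function resolveCorrelationId(headers: Record<string, string | undefined>): string {
  const incoming = headers["x-correlation-id"];
  return incoming && incoming.trim() !== "" ? incoming : randomUUID();
}

function toErrorBody(code: string, message: string, correlationId: string): ErrorBody {
  return { error: { code, message, correlationId } };
}
```

Returning the same ID in the response that appears in logs and traces lets a user report be matched to a specific request.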

Observability (Wide-Event Logging — Required from Phase 0)

OpenTelemetry instrumentation across all services from day one
- @opentelemetry/sdk-node + @opentelemetry/auto-instrumentations-node for auto-instrumentation (HTTP, PG, Fastify/NestJS)
- NestJS interceptors for custom spans on agent dispatch, routing decisions, memory writes, summarization runs
- Every significant operation emits a structured event with rich context (wide events, not just request/response)
SigNoz as OTEL backend (single Docker service: traces, metrics, logs, built-in UI)
Request tracing with correlation IDs (trace-id propagated across gateway → agent → brain → queue)
Agent session metrics (duration, tokens, cost, success/failure, model used, routing reason)
Provider availability monitoring (health check spans)
Queue depth monitoring (periodic gauge metrics)
Memory usage metrics (embedding count, search latency, summarization runs)
Migrate to Grafana stack (Tempo + Loki + Grafana) post-beta if more customization is needed

Scalability (Multi-Tier Readiness)

Single-node deployment is the MVP target for v0.1.0
Code is structured on the assumption that multi-tier deployment will follow: dedicated gateway nodes, agent worker nodes, brain/DB nodes
Service boundaries communicate via HTTP/WS/MCP APIs, not in-process calls where avoidable
Gateway is stateless (all state in PG/Valkey) to enable horizontal scaling
Agent pool designed as independently scalable service
Database migrations support forward-only schema evolution
Hierarchical deployment with dedicated roles/specialties is the post-beta target

Acceptance Criteria

AC-1: Core Chat Flow

User can log in via web UI, send a message, and receive a streamed response
Conversation persists across page refreshes
User can create, list, search, and delete conversations
Conversations can be scoped to projects

AC-2: TUI Integration

mosaic tui launches Pi interactive mode connected to gateway
User can chat with same conversation context as web UI
Agent has access to brain, queue, and memory tools

AC-3: Discord Remote Control

Discord bot connects and responds to mentions
Messages route through gateway to agent pool
Responses stream back to Discord (chunked)
Thread creation for multi-turn conversations

AC-4: Gateway Orchestration

Gateway dispatches tasks to appropriate provider/model
Routing decision logged and inspectable
Fallback when primary provider unavailable
Multiple concurrent agent sessions managed correctly

AC-5: Task & Project Management

CRUD operations for tasks, projects, missions via web dashboard
Kanban board view for tasks
Mission progress tracking with computed stats
Brain MCP tools accessible from agent sessions

AC-6: Memory System

Agent sessions auto-capture decisions and learnings
Semantic search returns relevant past context
Learned preferences are applied in new sessions
Log summarization runs on schedule, old logs compressed

AC-7: Authentication & RBAC

Email/password login works
At least one SSO provider (Authentik) works end-to-end
Admin can create users and assign roles
RBAC enforced on API endpoints

AC-8: Multi-Provider LLM Support

At least 3 providers configured and routing correctly (e.g., Anthropic + Ollama + Z.ai)
Agent routing selects appropriate model for task type
Provider configuration manageable from web UI

AC-9: MCP

Gateway exposes MCP endpoint
Brain and queue tools callable via MCP
Agent sessions can connect to external MCP servers

AC-10: Deployment

docker compose up starts full stack from clean state
mosaic CLI installable and functional on bare metal
Database migrations run automatically on first start
.env.example documents all required configuration

AC-11: @mosaic/* Packages

All 7 migrated packages build, pass tests, and integrate with gateway
mosaic CLI provides subcommands for each package
Types package is the single source of shared interfaces

Constraints and Dependencies

Pi SDK — Core dependency; any Pi breaking changes affect the agent layer. Pin to known-good version.
BetterAuth — Auth framework; must support SSO adapters. Verify Authentik/WorkOS/Keycloak support before committing.
Drizzle ORM — Database layer; must support PostgreSQL + pgvector extension.
Discord API — Rate limits, intent requirements, message size limits (2000 chars).
Valkey — Queue backend; must be available for queue and caching.
Gitea registry — Package publishing target; .npmrc must be configured.
OpenClaw — Reference architecture for Discord/Telegram plugin pattern (https://github.com/openclaw/openclaw). Inspiration only, not a dependency.

Risks and Open Questions

Risks

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| Pi SDK API instability (pre-1.0) | Medium | High | Pin version, abstract behind @mosaic/agent interface |
| Brain PG migration complexity | Medium | Medium | Preserve Brain REST/MCP API contract; only storage changes |
| Discord plugin complexity (OpenClaw has ~60 files) | Medium | Medium | Start minimal (DM + mention in channel), single-guild only; expand iteratively post-beta |
| LLM provider subscription auth varies by provider | Medium | Medium | Abstract behind provider interface; implement per-provider adapters |
| Drizzle + pgvector extension compatibility | Low | Medium | Validate in Phase 0 with spike |
| Agent log volume overwhelming storage | Medium | High | Tiered storage with aggressive summarization; configurable retention |
| Scope creep from jarvis-old feature surface | High | High | Strict v0.1.0 scope; features not listed above are post-beta |

Open Questions

| # | Question | Priority | Status |
| --- | --- | --- | --- |
| 1 | Pi SDK version to pin for v0.1.0? | High | ✅ Resolved: Pin `@mariozechner/pi-coding-agent@~0.57.1` (current stable). Abstract behind `@mosaic/agent` interface to insulate from breaking changes. Bump deliberately after testing. |
| 2 | Authentik vs WorkOS vs Keycloak: which SSO provider to implement first? | Medium | ✅ Resolved: Authentik first (already in Jason's infrastructure) |
| 3 | Vector DB: pgvector sufficient or need Qdrant from the start? | Medium | ✅ Resolved: pgvector with VectorStore interface abstraction. Qdrant drops in later if needed. |
| 4 | Summarization LLM: which model for log compression? | Medium | ✅ Resolved: Haiku-tier default with structured output guardrails, configurable via routing engine. |
| 5 | LM Studio and llama.cpp: provider adapters exist in Pi or need custom? | Medium | ✅ Resolved: Pi handles both natively. LM Studio and llama.cpp (server mode) expose OpenAI-compatible APIs; configure via Pi's `models.json` with `openai-completions` API type. No custom adapters needed. |
| 6 | Discord bot: single guild or multi-guild from day one? | Medium | ✅ Resolved: Single-guild only for v0.1.0 to prevent data leaks. Bot binds to one guild. Multi-guild with tenant isolation is a post-beta feature requiring explicit data boundary design. |
| 7 | Bare-metal install: systemd units or just docs? | Low | ASSUMPTION: Docs + CLI launch commands; systemd units post-beta |

Testing and Verification Expectations

Baseline checks: pnpm typecheck && pnpm lint && pnpm test must pass across all packages
Unit tests: Vitest for all packages; mocked dependencies for isolation
Integration tests: Gateway + Brain + Queue with test PG + Valkey (Docker services in CI)
E2E tests: Playwright for web dashboard critical paths (login, chat, task CRUD)
Agent tests: Pi SDK session tests with mock provider (verify tool registration, routing)
Evidence format: CI pipeline green + test count report per package

Milestone / Delivery Intent

All work is alpha (< 0.1.0) until Jason approves 0.1.0 beta release.

Phase 0: Foundation (v0.0.1)

Scaffold monorepo (pnpm + turbo + tsconfig + eslint + vitest)
@mosaic/types — migrate and extend from v0
@mosaic/db — Drizzle schema, PG connection, migrations
@mosaic/auth — BetterAuth setup with email/password
OTEL foundation — @opentelemetry/sdk-node setup, SigNoz in docker-compose, trace propagation wired
Docker Compose (PG 17 + Valkey + SigNoz)
CI pipeline (Woodpecker)
AGENTS.md, CLAUDE.md, README.md

Phase 1: Core API (v0.0.2)

apps/gateway — NestJS server (Fastify adapter), auth middleware, health endpoints
@mosaic/brain — migrate from v0, swap JSON store for PG via @mosaic/db
@mosaic/queue — migrate from v0 (minimal changes)
Gateway routes: conversations, tasks, projects, missions
WebSocket server for chat streaming
Basic agent dispatch (single provider, no routing)

Phase 2: Agent Layer (v0.0.3)

@mosaic/agent — Pi SDK integration, agent pool manager
Multi-provider support (Anthropic + Ollama minimum)
Agent routing engine (cost/capability matrix)
Tool registration (brain, queue, memory tools injected into agent sessions)
@mosaic/coord — migrate from v0, integrate with gateway

Phase 3: Web Dashboard (v0.0.4)

apps/web — Next.js app with BetterAuth
Chat UI (conversation list, message display, streaming input)
Task management (list + kanban)
Project and mission views
Settings (provider config, profile)
Admin panel (user management, RBAC)

Phase 4: Memory & Intelligence (v0.0.5)

@mosaic/memory — preference store, insight store, semantic search
@mosaic/log — log ingest, parsing, tiered storage
Summarization pipeline
Memory integration into agent sessions
Skill management interface (web UI + CLI)

Phase 5: Remote Control (v0.0.6)

@mosaic/discord-plugin — Discord channel plugin
@mosaic/telegram-plugin — Telegram channel plugin
Plugin host in gateway
SSO configuration (Authentik)

Phase 6: CLI & Tools (v0.0.7)

@mosaic/cli — unified CLI with all subcommands
@mosaic/prdy — migrate from v0
@mosaic/quality-rails — migrate from v0
@mosaic/mosaic — install wizard updated for v1
Pi TUI integration (mosaic tui)

Phase 7: Polish & Beta (v0.0.8 → v0.1.0)

MCP endpoint hardening
Additional SSO providers (WorkOS/Keycloak)
Additional LLM providers (Codex, Z.ai, LM Studio, llama.cpp)
Bare-metal deployment documentation
E2E test suite
Performance optimization
Documentation: user guide, admin guide, developer guide
Jason approval gate → v0.1.0 beta release

Assumptions

RESOLVED: pgvector is sufficient for semantic search at v0.1.0 scale (personal/family/team = thousands to low hundreds-of-thousands of vectors). @mosaic/memory defines a VectorStore interface with pgvector as the default adapter. The interface boundary makes Qdrant a drop-in migration if PG resource contention or scale demands it later. Zero additional infrastructure for v0.1.0. Rationale: Reduces ops burden; pgvector HNSW indexes are fast at this scale; interface abstraction costs almost nothing now.
RESOLVED: Authentik is the first SSO provider — confirmed, already running in Jason's infrastructure. WorkOS and Keycloak adapters follow in Phase 7.
RESOLVED: NestJS with Fastify adapter for the gateway. The gateway's complexity (plugin host, agent pool, routing engine, WebSocket hub, MCP server, auth, brain/queue/memory/log integration) warrants NestJS's module system, DI, and guards. Fastify performance preserved via adapter. Aligns with USER.md stated stack ("NestJS API + Next.js web"). @mosaic/brain's Fastify code migrates into a NestJS module.
RESOLVED: OpenTelemetry from Phase 0. Wide-event logging is required from the start. OTEL auto-instrumentation for NestJS/PG/HTTP via @opentelemetry/sdk-node. SigNoz as the all-in-one OTEL backend (single Docker service). Every significant operation emits structured events with rich context. Custom spans for agent dispatch, routing decisions, memory writes. Rationale: Retrofitting observability is painful; baking it in from day one means consistent instrumentation across all services.
ASSUMPTION: Single-node deployment for v0.1.0, but code structured for multi-tier. No Kubernetes yet. Docker Compose + bare metal. Service boundaries use HTTP/WS/MCP APIs (not in-process) so gateway, agent pool, and brain can split to separate nodes later. Rationale: Ship single-node MVP; the architecture doesn't fight horizontal scaling when needed.
ASSUMPTION: Log summarization uses Haiku-tier LLM by default, configurable. Haiku is well-suited for summarization (compression, not generation — source material is in context). Guardrails: structured output via Zod schema (force extraction of decisions/tools/outcomes/errors as discrete fields), chunked per-session processing (no bulk conflation), extraction-focused prompts. Raw logs stay in hot tier (7 days) as safety net. Users can override the summarization model via routing engine config if they want higher fidelity. Rationale: Haiku is 10-20x cheaper than Sonnet; log summarization runs on schedule against large volumes where cost matters.
ASSUMPTION: Discord plugin starts minimal and single-guild only — DM support, mention-based channel activation, thread management, chunked responses. Single guild binding to prevent data leaks between servers. Advanced features (voice, components, slash commands, multi-guild) are post-beta. Rationale: Proven pattern from OpenClaw; ship core interaction first; data isolation is non-negotiable.
ASSUMPTION: Telegram plugin is lower priority than Discord and may ship as v0.0.7 or later if Discord takes longer than expected. Rationale: Jason indicated Discord as the high-priority remote channel.
ASSUMPTION: Brain's REST API is preserved as a gateway sub-router (mounted at /api/brain/* or similar). Existing MCP tools continue to work. Only the storage backend changes. Rationale: Minimize migration risk; brain's API contract is proven.
ASSUMPTION: Conversations and messages get their own PG tables (not stored in brain's entity model). They follow a chat-specific schema with proper foreign keys to users and projects. Rationale: Chat has different access patterns (streaming, pagination, search) than brain entities.
RESOLVED: Pi handles all target LLM providers natively. Anthropic, OpenAI/Codex, Z.ai, Ollama, LM Studio, and llama.cpp are all supported via Pi's built-in providers or models.json configuration with openai-completions API type. No custom provider adapters needed in @mosaic/agent — only configuration management.
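The VectorStore boundary described in the first assumption above could be as small as the sketch below. The method names and record shapes are hypothetical, not the actual @mosaic/memory API. The in-memory adapter stands in for the pgvector adapter so that the example runs without a database, and it uses brute-force cosine similarity rather than an HNSW index.

```typescript
interface VectorRecord {
  id: string;
  embedding: number[];
  text: string;
}

interface VectorMatch {
  id: string;
  text: string;
  score: number; // cosine similarity, higher is closer
}

// Hypothetical interface; a pgvector or Qdrant adapter would implement the same two methods.
interface VectorStore {
  upsert(record: VectorRecord): Promise<void>;
  search(query: number[], topK: number): Promise<VectorMatch[]>;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Test double that scans every record; fine for unit tests, not for production scale.
class InMemoryVectorStore implements VectorStore {
  private readonly records = new Map<string, VectorRecord>();

  async upsert(record: VectorRecord): Promise<void> {
    this.records.set(record.id, record);
  }

  async search(query: number[], topK: number): Promise<VectorMatch[]> {
    const matches: VectorMatch[] = [];
    this.records.forEach((r) => {
      matches.push({ id: r.id, text: r.text, score: cosineSimilarity(query, r.embedding) });
    });
    return matches.sort((x, y) => y.score - x.score).slice(0, topK);
  }
}
```

Code that depends only on `VectorStore` would not need to change if the backing adapter were swapped later, which is the property the assumption relies on.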
