
# Mosaic Stack — Fast-Track Completion Plan

**Date:** 2026-03-01
**Goal:** Make Mosaic Stack usable for daily agent orchestration in hours, not weeks.

This prioritized build plan is based on a review of 9 community dashboards (openclaw-dashboard, clawd-control, claw-dashboard, ai-maestro, clawview, clawde-dashboard, agent-web-ui, cogni-flow, openclaw-panel).

---

## What Mosaic Stack Already Has (Strengths)

- ✅ Better Auth with CSRF protection, plus a bearer-token bypass for API agents
- ✅ NestJS API with PostgreSQL (Prisma), full RBAC
- ✅ Next.js 15 web app: dashboard widgets, projects, kanban, calendar, tasks, knowledge, files, logs, terminal (xterm.js+WebSocket), usage tracking, settings
- ✅ Agent fleet: agents table, orchestrator endpoint, container lifecycle
- ✅ Fleet settings: LLM provider config, agent config

## What's Missing (Gaps)

- ❌ Chat page is a stub — not connected to any backend
- ❌ No memory/file viewer for agent workspace files
- ❌ No cron/automation visibility
- ❌ No agent creation wizard — must use DB directly
- ❌ Fleet overview lacks real-time status and health indicators
- ❌ No rate limiting or audit logging
- ❌ No agent-to-agent messaging

---

## P0 — Do Today (≤ 2h each, unblocks daily use)

### 1. Connect Chat to Backend
- **Why:** Chat page exists but does nothing. This is the #1 interaction surface for agents. Without it, Mosaic Stack is a dashboard you look at, not a tool you use.
- **Effort:** 2h
- **Inspired by:** ai-maestro (agent inbox), clawview (embedded chat)
- **Approach:** Wire existing chat UI to WebSocket endpoint. Send messages to agent, display responses. Use existing auth context for user identity. Store messages in PostgreSQL.
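
Before messages are stored in PostgreSQL, each incoming WebSocket frame needs to be validated. The sketch below shows one way to do that; the `ChatMessage` shape and the `parseChatFrame` name are assumptions for illustration, not an existing Mosaic Stack contract.

```typescript
// Hypothetical wire format for one chat frame; every field name is an assumption.
interface ChatMessage {
  agentId: string;
  role: "user" | "agent";
  body: string;
  sentAt: string; // ISO 8601 timestamp
}

// Parse a raw WebSocket text frame. Returns null for anything malformed,
// so the server can drop the frame instead of storing it.
function parseChatFrame(raw: string): ChatMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const fields = data as Record<string, unknown>;
  const agentId = fields["agentId"];
  const body = fields["body"];
  const sentAt = fields["sentAt"];
  const role: "user" | "agent" | null =
    fields["role"] === "user" ? "user" : fields["role"] === "agent" ? "agent" : null;
  if (typeof agentId !== "string" || agentId.length === 0) return null;
  if (role === null) return null;
  if (typeof body !== "string" || body.trim().length === 0) return null;
  if (typeof sentAt !== "string" || Number.isNaN(Date.parse(sentAt))) return null;
  return { agentId, role, body, sentAt };
}
```

Returning `null` rather than throwing keeps the socket handler simple: a bad frame is dropped, and the connection stays open.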

### 2. Fleet Overview with Live Status
- **Why:** Can't tell which agents are running, idle, or broken. Every dashboard researched puts this front and center.
- **Effort:** 2h
- **Inspired by:** clawd-control (card grid), openclaw-dashboard (sparklines)
- **Approach:** Agent card grid on fleet page. Each card: name, emoji, status dot (green/yellow/red), last activity, session count. Poll agent health endpoint every 10s. Use existing agents table.
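
The status dot can be derived from two inputs the health poll would already return. This is a minimal sketch; the `healthy` flag, the function name, and both thresholds are illustrative assumptions and would need tuning against real agent behavior.

```typescript
type StatusDot = "green" | "yellow" | "red";

// Illustrative thresholds, not values from the plan.
const IDLE_AFTER_MS = 60 * 1000;      // quiet for 1 minute  -> yellow
const STALE_AFTER_MS = 5 * 60 * 1000; // quiet for 5 minutes -> red

// Map a health-check result plus the last activity time to a card color.
function statusDot(healthy: boolean, lastActivityMs: number, nowMs: number): StatusDot {
  if (!healthy) return "red";
  const quietFor = nowMs - lastActivityMs;
  if (quietFor >= STALE_AFTER_MS) return "red";
  if (quietFor >= IDLE_AFTER_MS) return "yellow";
  return "green";
}
```

Passing `nowMs` in, rather than calling `Date.now()` inside, keeps the function deterministic and easy to unit test.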

### 3. Agent Memory/File Viewer
- **Why:** Debugging agents requires reading MEMORY.md, HEARTBEAT.md, daily logs. Without this, you SSH into the server every time.
- **Effort:** 1-2h
- **Inspired by:** openclaw-dashboard (memory viewer with markdown rendering)
- **Approach:** NestJS endpoint reads files from agent workspace dir. Path traversal protection. Next.js page: file tree sidebar + markdown preview panel. Read-only initially.
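
Path traversal protection usually comes down to resolving the requested path and checking that it still sits under the workspace root. The helper below is a sketch of that check using Node's `path` module; `resolveInWorkspace` is a hypothetical name, and symlinks are deliberately not handled here (that would need an extra `fs.realpath` step).

```typescript
import * as path from "node:path";

// Resolve `requested` against the workspace root and return the absolute path,
// or null if the result would escape the root.
function resolveInWorkspace(workspaceRoot: string, requested: string): string | null {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);
  // Escapes the root if the relative path climbs upward, or is absolute
  // (which can happen on Windows when the drive differs).
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    return null;
  }
  return target;
}
```

The endpoint would return a 400 or 404 whenever this helper yields `null`, and only then read the file.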

### 4. Rate Limiting + Security Headers
- **Why:** Any exposed web app without rate limiting is a brute-force target. About 30 minutes of work shuts down password guessing against the login endpoint and adds standard browser hardening headers.
- **Effort:** 30min
- **Inspired by:** openclaw-dashboard (5-attempt lockout, HSTS, CSP)
- **Approach:** Add `@nestjs/throttler` to auth endpoints (5 req/min for login). Add `helmet` middleware for security headers.
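
To make the "5 req/min" policy concrete, here is a standalone fixed-window limiter. It is a stand-in that illustrates the behavior, not the `@nestjs/throttler` API; the class name and the choice of a fixed (rather than sliding) window are assumptions.

```typescript
// Conceptual illustration only: allows at most `limit` requests per key per window.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { startedAt: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request identified by `key` (for example a client IP) may proceed.
  allow(key: string, nowMs: number): boolean {
    const current = this.windows.get(key);
    if (current === undefined || nowMs - current.startedAt >= this.windowMs) {
      // First request, or the previous window has expired: start a new window.
      this.windows.set(key, { startedAt: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// The login policy from the plan: 5 requests per minute.
const loginLimiter = new FixedWindowLimiter(5, 60 * 1000);
```

In the real app the library would own this state; the sketch is mainly useful for agreeing on what a sixth login attempt within a minute should see.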

### 5. Activity Feed / Recent Events
- **Why:** "What happened while I was away?" is the first question every morning. Every dashboard has this.
- **Effort:** 1h
- **Inspired by:** openclaw-dashboard (live feed via SSE), clawd-control (fleet activity)
- **Approach:** Query recent log entries from DB. Display as reverse-chronological list on dashboard. Agent name + action + timestamp. Auto-refresh every 30s.
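
The feed itself is a sort, a cap, and a format step. The sketch below does this in memory for clarity; in practice the ordering and limit would more likely be pushed into the database query. The `ActivityEvent` shape is an assumption.

```typescript
// Assumed shape of one log row as the dashboard sees it.
interface ActivityEvent {
  agentName: string;
  action: string;
  at: Date;
}

// Newest first, capped at `limit`, rendered as "timestamp  agent: action".
function recentActivity(events: ActivityEvent[], limit: number): string[] {
  return events
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.at.getTime() - a.at.getTime())
    .slice(0, limit)
    .map((e) => `${e.at.toISOString()}  ${e.agentName}: ${e.action}`);
}
```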

---

## P1 — Do This Week (2-8h each, major features)

### 6. Agent Creation Wizard
- **Why:** Creating agents currently requires direct DB manipulation. Friction kills adoption.
- **Effort:** 3-4h
- **Inspired by:** clawd-control (guided wizard), ai-maestro (UI-based agent creation)
- **Approach:** Dialog/wizard in fleet settings: name, emoji, model, connection details (host/port/token), workspace path. Writes to agents table. Could be single-page form (faster) or multi-step (nicer UX).
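
Whether the wizard ends up single-page or multi-step, the same validation has to run before the row is written. A minimal sketch, assuming the field names below (they mirror the list above, but the exact column names are not confirmed) and assuming workspace paths are absolute POSIX paths:

```typescript
// Assumed form shape; field names follow the plan's list, not a confirmed schema.
interface AgentForm {
  name: string;
  emoji: string;
  model: string;
  host: string;
  port: number;
  token: string;
  workspacePath: string;
}

// Returns a list of human-readable problems; an empty list means the form is valid.
function validateAgentForm(form: AgentForm): string[] {
  const errors: string[] = [];
  if (form.name.trim().length === 0) errors.push("name is required");
  if (form.model.trim().length === 0) errors.push("model is required");
  if (form.host.trim().length === 0) errors.push("host is required");
  if (!Number.isInteger(form.port) || form.port < 1 || form.port > 65535) {
    errors.push("port must be an integer between 1 and 65535");
  }
  if (form.token.length === 0) errors.push("token is required");
  if (!form.workspacePath.startsWith("/")) {
    errors.push("workspacePath must be an absolute path");
  }
  return errors;
}
```

Collecting all errors at once, instead of failing on the first, lets the form highlight every problem field in one pass.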

### 7. Cron/Automation Management
- **Why:** Scheduled tasks are invisible — you don't know what's running, when, or if it failed.
- **Effort:** 2-3h
- **Inspired by:** openclaw-dashboard (cron list with toggle/trigger)
- **Approach:** NestJS reads scheduled jobs (from `@nestjs/schedule` or config). API: list, toggle, trigger. Frontend: table with Name | Schedule | Status | Last Run | Actions.
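
The three API operations (list, toggle, trigger) can be prototyped against an in-memory registry before wiring them to the real scheduler. `JobRegistry` below is a hypothetical stand-in, not the `@nestjs/schedule` registry, and the rule that a manual trigger ignores the enabled flag is an assumption.

```typescript
interface JobRow {
  name: string;
  schedule: string; // cron expression, shown as-is in the table
  enabled: boolean;
  lastRun: Date | null;
}

class JobRegistry {
  private readonly jobs = new Map<string, { row: JobRow; run: () => void }>();

  register(name: string, schedule: string, run: () => void): void {
    this.jobs.set(name, { row: { name, schedule, enabled: true, lastRun: null }, run });
  }

  // Rows for the Name | Schedule | Status | Last Run table (copies, not live references).
  list(): JobRow[] {
    return Array.from(this.jobs.values(), (job) => ({ ...job.row }));
  }

  // Flip the enabled flag and return the new state.
  toggle(name: string): boolean {
    const job = this.mustGet(name);
    job.row.enabled = !job.row.enabled;
    return job.row.enabled;
  }

  // Manual trigger: run now and record the time, regardless of the toggle.
  trigger(name: string, now: Date): void {
    const job = this.mustGet(name);
    job.run();
    job.row.lastRun = now;
  }

  private mustGet(name: string): { row: JobRow; run: () => void } {
    const job = this.jobs.get(name);
    if (job === undefined) throw new Error(`unknown job: ${name}`);
    return job;
  }
}
```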

### 8. Audit Logging
- **Why:** Security compliance and debugging. "Who did what, when?" is unanswerable without this.
- **Effort:** 2-3h
- **Inspired by:** openclaw-dashboard (audit.log with auto-rotation)
- **Approach:** NestJS middleware logs auth events, destructive actions, config changes to audit_logs table. View in Settings > Security.
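
The middleware needs a rule for which requests are worth a row. One simple heuristic, sketched below, is to audit everything under the auth routes plus every mutating HTTP method. The `/auth/` prefix, the `AuditEntry` shape, and both function names are assumptions, not the existing route layout or schema.

```typescript
// Assumed row shape for the audit_logs table.
interface AuditEntry {
  actorId: string | null; // null for unauthenticated attempts such as failed logins
  action: string;         // HTTP method, upper-cased
  path: string;
  at: string;             // ISO 8601 timestamp
}

const MUTATING_METHODS = ["POST", "PUT", "PATCH", "DELETE"];

// Audit all auth traffic, plus anything that can change or destroy state.
function shouldAudit(method: string, path: string): boolean {
  return path.startsWith("/auth/") || MUTATING_METHODS.indexOf(method.toUpperCase()) !== -1;
}

function toAuditEntry(actorId: string | null, method: string, path: string, now: Date): AuditEntry {
  return { actorId, action: method.toUpperCase(), path, at: now.toISOString() };
}
```

This over-captures slightly (every POST is logged, not only destructive ones), which is usually the safer default for an audit trail.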

### 9. Agent-to-Agent Simple Messaging
- **Why:** Orchestrating multiple agents requires passing context between them. Without messaging, the human is the bottleneck.
- **Effort:** 4-6h
- **Inspired by:** ai-maestro (AMP protocol — simplified)
- **Approach:** `messages` table in PostgreSQL: fromAgentId, toAgentId, type, priority, subject, body, threadId, readAt. API endpoints for send/list/read. Agent inbox UI. Skip cryptographic signing and multi-machine for now.
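
The send/list/read operations can be sketched against an in-memory store that mirrors the proposed columns. The `id` field, the `Priority` value set, and the `InMemoryInbox` class are assumptions added for illustration; the real implementation would sit on the `messages` table.

```typescript
type Priority = "low" | "normal" | "high"; // assumed value set

interface AgentMessage {
  id: number; // assumed surrogate key, not in the plan's column list
  fromAgentId: string;
  toAgentId: string;
  type: string;
  priority: Priority;
  subject: string;
  body: string;
  threadId: string;
  readAt: Date | null; // null means unread
}

type NewMessage = Omit<AgentMessage, "id" | "readAt">;

class InMemoryInbox {
  private readonly messages: AgentMessage[] = [];

  send(msg: NewMessage): AgentMessage {
    const stored: AgentMessage = { ...msg, id: this.messages.length + 1, readAt: null };
    this.messages.push(stored);
    return stored;
  }

  // All messages addressed to an agent, optionally only the unread ones.
  list(toAgentId: string, unreadOnly: boolean = false): AgentMessage[] {
    return this.messages.filter(
      (m) => m.toAgentId === toAgentId && (!unreadOnly || m.readAt === null),
    );
  }

  // Returns false if the message does not exist or was already read.
  markRead(id: number, now: Date): boolean {
    const msg = this.messages.find((m) => m.id === id);
    if (msg === undefined || msg.readAt !== null) return false;
    msg.readAt = now;
    return true;
  }
}
```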

### 10. SSE for Real-Time Fleet Updates
- **Why:** Polling is fine initially, but SSE pushes agent state changes to the UI as they happen instead of on the next poll.
- **Effort:** 2-3h
- **Inspired by:** openclaw-dashboard, clawd-control (both use SSE)
- **Approach:** NestJS SSE endpoint streams agent status changes. Next.js EventSource client updates fleet cards in real-time.
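
On the wire, each SSE message is a few `field: value` lines terminated by a blank line, which is the format `EventSource` parses. The helper below is a sketch of that framing; the `agent-status` event name and payload shape in the usage note are assumptions.

```typescript
// Serialize one server-sent event. JSON.stringify without indentation never
// emits raw newlines, so the payload always fits on a single `data:` line.
function formatSseEvent(event: string, data: Record<string, unknown>, id?: string): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${event}`);
  lines.push(`data: ${JSON.stringify(data)}`);
  return lines.join("\n") + "\n\n"; // the blank line terminates the event
}
```

For example, `formatSseEvent("agent-status", { agentId: "a1", status: "idle" })` produces an `event: agent-status` line followed by a `data:` line carrying the JSON payload. On the client, a listener registered for `agent-status` would parse `event.data` and update the matching fleet card.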

---

## P2 — Nice to Have (2-16h+ each, polish)

### 11. TOTP Multi-Factor Authentication
- **Effort:** 4-6h
- **Inspired by:** openclaw-dashboard
- **Approach:** Better Auth provides a two-factor plugin with TOTP support; evaluate that first. Otherwise use `otplib` + QR code generation.
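
Whichever library ends up doing the work, the code it checks is the RFC 6238 computation: an HMAC over a time-step counter, dynamically truncated to a few digits. The sketch below uses only Node's `crypto` module to show what that involves. It is an illustration of the algorithm, not a replacement for a maintained library, and the `totp` function name is hypothetical.

```typescript
import { createHmac } from "node:crypto";

// RFC 6238 TOTP with HMAC-SHA1 (the variant authenticator apps commonly default to).
function totp(secret: Buffer, unixSeconds: number, stepSeconds: number = 30, digits: number = 6): string {
  const counter = Math.floor(unixSeconds / stepSeconds);

  // Encode the counter as an 8-byte big-endian integer.
  const msg = Buffer.alloc(8);
  msg.writeUInt32BE(Math.floor(counter / 0x100000000), 0);
  msg.writeUInt32BE(counter >>> 0, 4);

  const mac = createHmac("sha1", secret).update(msg).digest();

  // Dynamic truncation (RFC 4226): the low nibble of the last byte selects
  // a 4-byte window, whose top bit is then masked off.
  const offset = mac.readUInt8(mac.length - 1) & 0x0f;
  const binary = mac.readUInt32BE(offset) & 0x7fffffff;

  return (binary % Math.pow(10, digits)).toString().padStart(digits, "0");
}
```

With the RFC 6238 test secret (`12345678901234567890` as ASCII bytes) and time 59, this yields `287082`, the last six digits of the RFC's published 8-digit value `94287082`. A real verifier would also accept the adjacent time step to tolerate clock drift and compare codes in constant time.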

### 12. Multi-Machine Agent Mesh
- **Effort:** 16h+
- **Inspired by:** ai-maestro (peer mesh, no central server)
- **Approach:** Agent discovery across machines. Network-aware routing. Defer until single-machine is solid.

### 13. Code Graph / Codebase Visualization
- **Effort:** 12h+
- **Inspired by:** ai-maestro (interactive code graph with delta indexing)
- **Approach:** Use ts-morph to parse codebase, D3.js for visualization. Cool but not urgent.

### 14. Activity Heatmap
- **Effort:** 4h
- **Inspired by:** openclaw-dashboard (30-day heatmap)
- **Approach:** GitHub-style contribution heatmap showing agent activity by hour/day.
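
The data behind the heatmap is a count per day-and-hour cell. A minimal sketch of that aggregation, assuming activity timestamps are bucketed in UTC (a local-time view would need a timezone-aware key):

```typescript
// Count events per UTC hour. Keys look like "2026-03-01T09".
function bucketByHour(timestamps: Date[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const t of timestamps) {
    const key = t.toISOString().slice(0, 13); // "YYYY-MM-DDTHH"
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

The frontend would then map each count to a color intensity, leaving cells with no entry blank.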

### 15. Agent Personality Profiles
- **Effort:** 2-3h
- **Inspired by:** ai-maestro (avatars, personality, visual identity)
- **Approach:** Add personality/system-prompt field to agent config. Avatar upload. Nice for team feel.

---

## Execution Order (Recommended)

```
Day 1 (Today):
  Morning:  #4 Rate limiting (30min) → #2 Fleet overview (2h)
  Afternoon: #1 Connect chat (2h) → #3 Memory viewer (1.5h)
  Evening:  #5 Activity feed (1h)

Day 2-3:
  #6 Agent creation wizard (3h)
  #7 Cron management (2h)
  #8 Audit logging (2h)

Day 4-5:
  #9 Agent messaging (5h)
  #10 SSE real-time (2h)

Week 2+:
  P2 items as time permits
```

## Total Effort to "Usable Daily"

| Priority | Items | Total Hours |
|----------|-------|-------------|
| P0 | 5 items | ~7h |
| P1 | 5 items | ~15h |
| P2 | 5 items | ~40h+ |

**Bottom line:** ~7 hours of focused work today gets Mosaic Stack from "demo" to "daily driver." Another 15 hours this week makes it genuinely powerful. The P2 items are polish — nice but not blocking daily use.

---

## Key Design Principles (Learned from Research)

1. **Simplicity first** (clawd-control) — No build tools for simple features. Use what's already there.
2. **Single-screen overview** (all dashboards) — Users want one page that answers "is everything OK?"
3. **Read before write** (openclaw-dashboard) — Memory viewer is read-only first, edit later.
4. **Progressive enhancement** — Polling → SSE → WebSocket. Don't over-engineer day one.
5. **Existing infra** — PostgreSQL, NestJS, Next.js are already set up. Don't add new databases or frameworks.