Mosaic Component Architecture Design
Strategic Decision
OpenClaw as execution engine, Mosaic as control layer.
- Now (M1-M2): Wrapper approach - use OpenClaw, add Mosaic controls
- After M2: Evaluate - is OpenClaw working for us?
- If needed: Fork or rebuild with lessons learned
Why: 355+ contributors maintain OpenClaw. We maintain only the wrapper. Ship faster, pivot later if needed.
Philosophy
Mosaic = pieces combining to create a beautiful, larger picture.
Each component has a dedicated function (single responsibility). Focused tasks = agents stay on rails. If an agent only does one thing, it can't wander off-track.
Overview
Establish the pattern for how Mosaic's control layer wraps OpenClaw's execution layer, with full job step tracking and event logging.
Component Naming
| Component | Dedicated Function | Rails |
|---|---|---|
| @mosaic | Gitea bot user - triggers workflow on issue assignment/mention | Webhook receiver only |
| mosaic-stitcher | Orchestrates workflow, sequences jobs, manages priorities | Control plane only, no execution |
| mosaic-bridge | Chat integrations (Discord, Mattermost, Slack) - commands in, status out | I/O only, no execution |
| mosaic-runner | Fetches information, gathers context, reads repos | Read-only operations |
| mosaic-weaver | Implements code changes, writes files | Write operations, scoped to worktree |
| mosaic-inspector | Runs quality gates (build, lint, test) | Validation only, no modifications |
| mosaic-herald | Reports status, creates PR comments, notifications | Output/reporting only |
Why this works: Each component has exactly ONE job. Can't go off rails if there's only one rail.
Note: Names are placeholders. Components are modular plugins—names can change later.
Architecture
┌─────────────────┐              ┌─────────────────┐
│     @mosaic     │              │  mosaic-bridge  │
│   (Gitea Bot)   │              │   (Chat I/O)    │
│ Webhook Trigger │              │ Discord/MM/etc  │
└────────┬────────┘              └────────┬────────┘
         │ Issue assigned                 │ Commands
         └───────────────┬────────────────┘
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                MOSAIC STACK (Control Layer)                 │
│                                                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │              MOSAIC-STITCHER (Wrapper)               │   │
│  │  ┌───────────┐  ┌───────────┐  ┌───────────────────┐ │   │
│  │  │   Guard   │  │  Quality  │  │   Job Tracking    │ │   │
│  │  │   Rails   │  │   Rails   │  │  (Events/Steps)   │ │   │
│  │  │  (perms)  │  │  (gates)  │  │                   │ │   │
│  │  └───────────┘  └───────────┘  └───────────────────┘ │   │
│  └──────────────────────────┬───────────────────────────┘   │
│                             │                               │
└─────────────────────────────┼───────────────────────────────┘
                              │ Dispatch with constraints
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                 OPENCLAW (Execution Layer)                  │
│                 355+ contributors maintain                  │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐ │
│  │   Agent   │  │  Session  │  │ Multi-LLM │  │  Discord  │ │
│  │ Spawning  │  │  Manager  │  │  Support  │  │  Integr.  │ │
│  └───────────┘  └───────────┘  └───────────┘  └───────────┘ │
│                                                             │
│  Agent Profiles (Mosaic-defined constraints):               │
│  ┌─────────┐  ┌─────────┐  ┌───────────┐  ┌─────────┐       │
│  │ RUNNER  │  │ WEAVER  │  │ INSPECTOR │  │ HERALD  │       │
│  │ (read)  │  │ (write) │  │ (validate)│  │ (report)│       │
│  └─────────┘  └─────────┘  └───────────┘  └─────────┘       │
└─────────────────────────────────────────────────────────────┘
Key insight: Agent profiles (runner, weaver, etc.) are constraints passed to OpenClaw, not separate containers. OpenClaw spawns agents, Mosaic controls what they're allowed to do.
Relationship to Non-AI Coordinator (M4.1)
This architecture complements the Non-AI Coordinator Pattern:
| Layer | Responsibility | Milestone |
|---|---|---|
| Non-AI Coordinator | Orchestration logic (when to assign, context monitoring, quality gate enforcement) | M4.1 |
| Mosaic Component Architecture | Execution infrastructure (job tracking, OpenClaw integration, chat commands) | M4.2 |
The Non-AI Coordinator uses this infrastructure to dispatch and monitor jobs.
Chat Integration (mosaic-bridge)
Control Mosaic Stack via Discord, Mattermost, Slack, etc.
#mosaic-control
├── User: "@mosaic fix issue #42"
├── Mosaic: "🚀 Started job #123 for issue #42" [link to thread]
│
└── Thread: "Job #123: Fix issue #42"
    ├── 📖 Runner: Gathering context... ✓
    ├── 🧵 Weaver: Implementing... ✓
    ├── 🔍 Inspector: Running tests... ✓
    ├── 📢 Herald: PR created → #456
    └── [Full event log: /api/jobs/123/events]
Noise Management Strategy
| Channel | Purpose | Verbosity |
|---|---|---|
| #mosaic-control | Commands + summaries | Low (milestones only) |
| Job threads | Per-job activity | Medium (step completions) |
| /api/jobs/{id}/events | Full audit log | High (everything) |
| DMs (optional) | Private updates to triggering user | Configurable |
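The verbosity tiers above could be applied by a small routing function. This is a minimal sketch, not the project's implementation: the sink names and the assignment of each event type to a tier are assumptions made for illustration.

```typescript
// Destinations from the table above. Which event types count as
// "milestones" or "step completions" is an assumption of this sketch.
type NoticeSink = "control-channel" | "job-thread" | "audit-log";

const MILESTONE_EVENTS = ["job.started", "job.completed", "job.failed"];
const STEP_LEVEL_EVENTS = ["step.completed", "gate.passed", "gate.failed"];

function sinksForEvent(eventType: string): NoticeSink[] {
  // The audit log is the high-verbosity sink: it receives every event.
  const sinks: NoticeSink[] = ["audit-log"];
  const isMilestone = MILESTONE_EVENTS.indexOf(eventType) !== -1;
  // Job threads show step-level outcomes plus the milestones.
  if (isMilestone || STEP_LEVEL_EVENTS.indexOf(eventType) !== -1) {
    sinks.push("job-thread");
  }
  // The control channel only sees milestones.
  if (isMilestone) {
    sinks.push("control-channel");
  }
  return sinks;
}
```

With this mapping, a token-usage event reaches only the audit log, while a failed job is reported in all three places.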
Commands (via chat)
@mosaic fix <issue> # Start job for issue
@mosaic status <job> # Get job status
@mosaic cancel <job> # Cancel running job
@mosaic verbose <job> # Stream full logs to thread
@mosaic quiet # Reduce notifications
@mosaic help # Show commands
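A parser for these commands might look like the sketch below. The `ChatCommand` type and the `parseChatCommand` and `parseId` helpers are hypothetical names; the sketch also assumes that issue and job references are numeric and may be written as `42`, `#42`, or `issue #42`, as in the chat example earlier.

```typescript
// Hypothetical command shapes mirroring the chat commands listed above.
type ChatCommand =
  | { kind: "fix"; issue: number }
  | { kind: "status" | "cancel" | "verbose"; job: number }
  | { kind: "quiet" | "help" };

// Return the first argument that looks like a numeric reference ("42" or "#42").
function parseId(args: string[]): number | null {
  for (const arg of args) {
    const match = /^#?(\d+)$/.exec(arg);
    if (match) return Number(match[1]);
  }
  return null;
}

// Parse one chat message; null means "not a valid @mosaic command".
function parseChatCommand(message: string): ChatCommand | null {
  const [mention, verb, ...args] = message.trim().split(/\s+/);
  if (mention !== "@mosaic" || verb === undefined) return null;
  switch (verb) {
    case "quiet":
    case "help":
      return { kind: verb };
    case "fix": {
      const issue = parseId(args);
      return issue === null ? null : { kind: "fix", issue };
    }
    case "status":
    case "cancel":
    case "verbose": {
      const job = parseId(args);
      return job === null ? null : { kind: verb, job };
    }
    default:
      return null;
  }
}
```

Returning `null` for anything unrecognized keeps the bridge I/O-only: it forwards well-formed commands and ignores everything else.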
Integration lives at the Mosaic layer, not in OpenClaw:
- mosaic-bridge handles Discord/Mattermost/Slack APIs
- mosaic-stitcher receives commands, dispatches jobs
- mosaic-herald sends status updates back through bridge
- OpenClaw has NO direct chat access (stays focused on execution)
Key Components
1. Mosaic-Stitcher (The Wrapper)
The control layer that wraps OpenClaw:
- Receives webhooks from @mosaic bot
- Applies Guard Rails (capability permissions)
- Applies Quality Rails (mandatory gates)
- Tracks all job steps and events
- Dispatches work to OpenClaw with constraints
2. OpenClaw (Execution Engine)
Community-maintained agent swarm (355+ contributors):
- Spawns and manages AI agent sessions
- Multi-LLM support (Claude, GPT, Ollama, etc.)
- Session management and recovery
- We use it as-is, wrapped by Mosaic-Stitcher
3. Agent Profiles (Constraints for OpenClaw)
Mosaic-defined capability constraints passed to OpenClaw agents:
- runner - read-only: fetch context, read files, query APIs
- weaver - write: implement code, scoped to git worktree
- inspector - validate: run gates, no modifications
- herald - report: PR comments, notifications, status updates
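One way to represent these profiles is as plain data that the stitcher attaches to each dispatch. The sketch below is illustrative only: the type names, the one-capability-per-profile mapping, and the path-prefix check for worktree scoping are assumptions, and the format OpenClaw actually accepts for constraints is not specified here.

```typescript
type MosaicCapability = "read" | "write" | "validate" | "report";
type AgentProfileName = "runner" | "weaver" | "inspector" | "herald";

// Assumption: each profile carries exactly one capability, as in the list above.
const PROFILE_CAPABILITY: Record<AgentProfileName, MosaicCapability> = {
  runner: "read",
  weaver: "write",
  inspector: "validate",
  herald: "report",
};

interface DispatchConstraints {
  profile: AgentProfileName;
  capability: MosaicCapability;
  worktree?: string; // only set for weaver, whose writes are scoped
}

function buildConstraints(profile: AgentProfileName, worktree?: string): DispatchConstraints {
  if (profile === "weaver") {
    if (!worktree) throw new Error("weaver requires a worktree to scope writes");
    return { profile, capability: PROFILE_CAPABILITY[profile], worktree };
  }
  return { profile, capability: PROFILE_CAPABILITY[profile] };
}

// A write is allowed only for a write-capable profile and only inside its worktree.
function isWriteAllowed(constraints: DispatchConstraints, path: string): boolean {
  if (constraints.capability !== "write" || constraints.worktree === undefined) return false;
  const root = constraints.worktree.replace(/\/+$/, "") + "/";
  const hasTraversal = path.split("/").indexOf("..") !== -1;
  return !hasTraversal && path.indexOf(root) === 0;
}
```

Because the profiles are data rather than services, adding or renaming one changes a lookup table, not the deployment.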
4. Job Structure
Every job contains granular steps:
| Phase | Steps |
|---|---|
| SETUP | Clone repo, create worktree, install deps |
| EXECUTION | Read requirements, analyze code, implement, write tests |
| VALIDATION | Lint gate, typecheck gate, test gate, coverage gate |
| CLEANUP | Stage, commit, push, create PR |
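The phase table above can be flattened into an ordered step plan when a job is created. The following is a sketch under stated assumptions: `planJobSteps` and `PlannedStep` are hypothetical names, and the lowercase phase values follow the comment on the `phase` column in the schema.

```typescript
type JobPhase = "setup" | "execution" | "validation" | "cleanup";

interface PlannedStep {
  ordinal: number; // 1-based position across the whole job
  phase: JobPhase;
  name: string;
}

// Phases and steps exactly as listed in the table above.
const PHASE_STEPS: ReadonlyArray<[JobPhase, string[]]> = [
  ["setup", ["Clone repo", "Create worktree", "Install deps"]],
  ["execution", ["Read requirements", "Analyze code", "Implement", "Write tests"]],
  ["validation", ["Lint gate", "Typecheck gate", "Test gate", "Coverage gate"]],
  ["cleanup", ["Stage", "Commit", "Push", "Create PR"]],
];

// Flatten the phases into one ordered list of steps.
function planJobSteps(): PlannedStep[] {
  const steps: PlannedStep[] = [];
  for (const [phase, stepNames] of PHASE_STEPS) {
    for (const stepName of stepNames) {
      steps.push({ ordinal: steps.length + 1, phase, name: stepName });
    }
  }
  return steps;
}
```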
5. Event Logging (Event Sourcing)
Every action emits an event:
- job.created, job.queued, job.started, job.completed, job.failed
- step.started, step.progress, step.output, step.completed
- ai.tool_called, ai.tokens_used, ai.artifact_created
- gate.started, gate.passed, gate.failed
Storage:
- PostgreSQL: Immutable audit log (permanent)
- Valkey Streams: Recent events (last 1000 per job)
- Valkey Pub/Sub: Real-time streaming
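The "last 1000 per job" retention rule can be illustrated with an in-memory stand-in. This sketch does not talk to Valkey; `RecentEventBuffer` and `StoredJobEvent` are hypothetical names used only to show the trimming behavior. In Valkey Streams a cap of this kind is usually applied when appending, via the `MAXLEN` option of `XADD`.

```typescript
interface StoredJobEvent {
  jobId: string;
  type: string; // e.g. "step.completed"
  timestamp: string; // ISO 8601
  payload: unknown;
}

// In-memory illustration of a per-job capped event list.
class RecentEventBuffer {
  private readonly eventsByJob = new Map<string, StoredJobEvent[]>();
  private readonly maxPerJob: number;

  constructor(maxPerJob: number = 1000) {
    this.maxPerJob = maxPerJob;
  }

  append(event: StoredJobEvent): void {
    const events = this.eventsByJob.get(event.jobId) ?? [];
    events.push(event);
    // Trim from the front so only the newest maxPerJob events remain.
    if (events.length > this.maxPerJob) {
      events.splice(0, events.length - this.maxPerJob);
    }
    this.eventsByJob.set(event.jobId, events);
  }

  recent(jobId: string): readonly StoredJobEvent[] {
    return this.eventsByJob.get(jobId) ?? [];
  }
}
```

The permanent record still lives in PostgreSQL; a buffer like this only serves quick "what just happened" queries.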
6. Queue Architecture
We use BullMQ rather than the plain ValkeyService because it provides:
- Job progress tracking (0-100%)
- Automatic retry with exponential backoff
- Rate limiting
- Job dependencies
- Rich lifecycle events
BullMQ reuses the Valkey instance that is already configured.
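To make the retry and progress features concrete, here is a dependency-free sketch. The option field names (`attempts`, `backoff`, `priority`) follow the job options BullMQ documents, but the concrete values are placeholders rather than settings from this design, and `retryDelayMs` simply assumes the delay doubles on each attempt.

```typescript
// Field names follow BullMQ's documented job options; values are placeholders.
interface RunnerJobOptions {
  attempts: number;
  backoff: { type: "exponential"; delay: number };
  priority: number;
}

function buildRunnerJobOptions(priority: number): RunnerJobOptions {
  return { attempts: 3, backoff: { type: "exponential", delay: 1000 }, priority };
}

// Delay before retry number `attemptsMade`, assuming the delay doubles each attempt.
function retryDelayMs(baseDelayMs: number, attemptsMade: number): number {
  return baseDelayMs * Math.pow(2, attemptsMade - 1);
}

// Clamp a reported progress value into the 0-100 range used for job progress.
function clampProgress(percent: number): number {
  return Math.max(0, Math.min(100, Math.round(percent)));
}
```

Under these assumptions, a job with a 1000 ms base delay would wait 1000 ms, 2000 ms, then 4000 ms between successive retries.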
Database Schema
-- Runner jobs (links to existing agent_tasks)
CREATE TABLE runner_jobs (
    id UUID PRIMARY KEY,
    workspace_id UUID NOT NULL,
    agent_task_id UUID REFERENCES agent_tasks(id),
    type VARCHAR(100),        -- 'git-status', 'code-task', 'priority-calc'
    status VARCHAR(50),       -- PENDING → QUEUED → RUNNING → COMPLETED/FAILED
    priority INT,
    progress_percent INT,
    result JSONB,
    error TEXT,
    created_at TIMESTAMPTZ,
    started_at TIMESTAMPTZ,
    completed_at TIMESTAMPTZ
);

-- Job steps (granular tracking)
CREATE TABLE job_steps (
    id UUID PRIMARY KEY,
    job_id UUID REFERENCES runner_jobs(id),
    ordinal INT,
    phase VARCHAR(50),        -- setup, execution, validation, cleanup
    name VARCHAR(255),
    type VARCHAR(50),         -- command, ai-action, gate, artifact
    status VARCHAR(50),
    output TEXT,
    tokens_input INT,
    tokens_output INT,
    started_at TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    duration_ms INT
);

-- Job events (immutable audit log)
CREATE TABLE job_events (
    id UUID PRIMARY KEY,
    job_id UUID REFERENCES runner_jobs(id),
    step_id UUID REFERENCES job_steps(id),
    type VARCHAR(100),
    timestamp TIMESTAMPTZ,
    actor VARCHAR(100),
    payload JSONB
);
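The status lifecycle noted in the `runner_jobs.status` comment (PENDING → QUEUED → RUNNING → COMPLETED/FAILED) can be enforced in application code before a row is updated. This is a minimal sketch with hypothetical names; it models only the transitions written in that comment, so cancellation and retry re-queueing are deliberately left out.

```typescript
type RunnerJobStatus = "PENDING" | "QUEUED" | "RUNNING" | "COMPLETED" | "FAILED";

// Allowed next states, taken from the status comment in the schema.
const NEXT_STATUS: Record<RunnerJobStatus, readonly RunnerJobStatus[]> = {
  PENDING: ["QUEUED"],
  QUEUED: ["RUNNING"],
  RUNNING: ["COMPLETED", "FAILED"],
  COMPLETED: [], // terminal
  FAILED: [], // terminal
};

function canTransition(from: RunnerJobStatus, to: RunnerJobStatus): boolean {
  return NEXT_STATUS[from].indexOf(to) !== -1;
}

// Throws instead of silently writing an invalid status.
function assertTransition(from: RunnerJobStatus, to: RunnerJobStatus): void {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid job status transition: ${from} -> ${to}`);
  }
}
```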
Deployment Model
Mosaic wrapper + OpenClaw instance:
docker-compose.yml:
mosaic-stitcher: # Control layer (our code)
mosaic-bridge: # Chat integrations (Discord, Mattermost, Slack)
openclaw: # Execution layer (community code)
valkey: # Queue + cache
postgres: # Job store, events
NOT separate containers per agent type. Runner/weaver/inspector are agent profiles (constraints), not services. OpenClaw spawns agents with the profile constraints we define.
All services:
- Share Valkey (BullMQ queues)
- Share PostgreSQL (job store, events)
- Communicate via queue (stitcher → openclaw)
New Modules (in API for now, extract to containers later)
apps/api/src/
├── stitcher/ # Workflow engine, job creation
├── runner-jobs/ # Job CRUD, queue submission
├── job-steps/ # Step tracking
├── job-events/ # Event logging, WebSocket gateway
└── workers/ # BullMQ processors (one per component type)
Implementation Phases
1. Core Infrastructure - BullMQ setup, database migrations
2. Coordinator Service - Job submission, status polling, cancel/retry
3. Runner Worker - Claude Code integration, step-by-step execution
4. Real-time Status - WebSocket gateway, SSE for CLI
5. Integration Testing - End-to-end tests
Files to Modify
- apps/api/src/app.module.ts - Import new modules
- apps/api/src/valkey/valkey.service.ts - Share connection with BullMQ
- apps/api/src/quality-orchestrator/ - Integrate with runner for gates
- package.json - Add @nestjs/bullmq, bullmq
Verification
- Create a test job via API
- Verify job appears in BullMQ queue
- Runner picks up and executes with step events
- WebSocket receives real-time updates
- All events persisted to PostgreSQL
- Quality gates run before completion
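The last check, that quality gates run before completion, could be asserted against a job's recorded event types. The helper below is a hypothetical sketch: it takes event types in the order they were logged and, as a simplifying assumption, treats any `gate.failed` before completion as disqualifying (gate retries are not modeled).

```typescript
// True when job.completed was preceded by at least one gate.passed
// and by no gate.failed. Event types are given in logged order.
function gatesPassedBeforeCompletion(eventTypes: readonly string[]): boolean {
  const completedAt = eventTypes.indexOf("job.completed");
  if (completedAt === -1) return false; // job never completed
  const before = eventTypes.slice(0, completedAt);
  return before.indexOf("gate.passed") !== -1 && before.indexOf("gate.failed") === -1;
}
```

An end-to-end test could fetch the events for a test job from /api/jobs/{id}/events, map them to their types, and assert that this function returns true.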