# Mosaic Component Architecture Design

## Strategic Decision

**OpenClaw as execution engine, Mosaic as control layer.**

- **Now (M1-M2):** Wrapper approach - use OpenClaw, add Mosaic controls
- **After M2:** Evaluate - is OpenClaw working for us?
- **If needed:** Fork or rebuild with lessons learned

**Why:** 355+ contributors maintain OpenClaw. We maintain only the wrapper. Ship faster, pivot later if needed.

## Philosophy

**Mosaic** = pieces combining to create a beautiful, larger picture.

Each component has a **dedicated function** (single responsibility). Focused tasks = agents stay on rails. If an agent only does one thing, it can't wander off-track.

## Overview

Establish the pattern for how Mosaic's control layer wraps OpenClaw's execution layer, with full job step tracking and event logging.

## Component Naming

| Component            | Dedicated Function                                                       | Rails                                |
| -------------------- | ------------------------------------------------------------------------ | ------------------------------------ |
| **@mosaic**          | Gitea bot user - triggers workflow on issue assignment/mention           | Webhook receiver only                |
| **mosaic-stitcher**  | Orchestrates workflow, sequences jobs, manages priorities                | Control plane only, no execution     |
| **mosaic-bridge**    | Chat integrations (Discord, Mattermost, Slack) - commands in, status out | I/O only, no execution               |
| **mosaic-runner**    | Fetches information, gathers context, reads repos                        | Read-only operations                 |
| **mosaic-weaver**    | Implements code changes, writes files                                    | Write operations, scoped to worktree |
| **mosaic-inspector** | Runs quality gates (build, lint, test)                                   | Validation only, no modifications    |
| **mosaic-herald**    | Reports status, creates PR comments, notifications                       | Output/reporting only                |

**Why this works:** Each component has exactly ONE job. Can't go off rails if there's only one rail.

**Note:** Names are placeholders. Components are modular plugins; names can change later.

## Architecture

```
┌─────────────────┐              ┌─────────────────┐
│    @mosaic      │              │  mosaic-bridge  │
│   (Gitea Bot)   │              │   (Chat I/O)    │
│ Webhook Trigger │              │ Discord/MM/etc  │
└────────┬────────┘              └────────┬────────┘
         │ Issue assigned                 │ Commands
         └───────────────┬────────────────┘
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                MOSAIC STACK (Control Layer)                 │
│                                                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │              MOSAIC-STITCHER (Wrapper)               │   │
│  │  ┌───────────┐  ┌───────────┐  ┌───────────────────┐ │   │
│  │  │   Guard   │  │  Quality  │  │   Job Tracking    │ │   │
│  │  │   Rails   │  │   Rails   │  │  (Events/Steps)   │ │   │
│  │  │  (perms)  │  │  (gates)  │  │                   │ │   │
│  │  └───────────┘  └───────────┘  └───────────────────┘ │   │
│  └──────────────────────────┬───────────────────────────┘   │
│                             │                               │
└─────────────────────────────┼───────────────────────────────┘
                              │ Dispatch with constraints
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                 OPENCLAW (Execution Layer)                  │
│                 355+ contributors maintain                  │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐ │
│  │   Agent   │  │  Session  │  │ Multi-LLM │  │  Discord  │ │
│  │ Spawning  │  │  Manager  │  │  Support  │  │  Integr.  │ │
│  └───────────┘  └───────────┘  └───────────┘  └───────────┘ │
│                                                             │
│  Agent Profiles (Mosaic-defined constraints):               │
│  ┌─────────┐  ┌─────────┐  ┌───────────┐  ┌─────────┐       │
│  │ RUNNER  │  │ WEAVER  │  │ INSPECTOR │  │ HERALD  │       │
│  │ (read)  │  │ (write) │  │ (validate)│  │ (report)│       │
│  └─────────┘  └─────────┘  └───────────┘  └─────────┘       │
└─────────────────────────────────────────────────────────────┘
```

**Key insight:** Agent profiles (runner, weaver, etc.) are **constraints passed to OpenClaw**, not separate containers. OpenClaw spawns agents, Mosaic controls what they're allowed to do.

## Relationship to Non-AI Coordinator (M4.1)

This architecture **complements** the Non-AI Coordinator Pattern:

| Layer                             | Responsibility                                                                      | Milestone |
| --------------------------------- | ----------------------------------------------------------------------------------- | --------- |
| **Non-AI Coordinator**            | Orchestration logic (when to assign, context monitoring, quality gates enforcement) | M4.1      |
| **Mosaic Component Architecture** | Execution infrastructure (job tracking, OpenClaw integration, chat commands)        | M4.2      |

The Non-AI Coordinator uses this infrastructure to dispatch and monitor jobs.

## Chat Integration (mosaic-bridge)

**Control Mosaic Stack via Discord, Mattermost, Slack, etc.**

```
#mosaic-control
├── User: "@mosaic fix issue #42"
├── Mosaic: "🚀 Started job #123 for issue #42" [link to thread]
│
└── Thread: "Job #123: Fix issue #42"
    ├── 📖 Runner: Gathering context... ✓
    ├── 🧵 Weaver: Implementing... ✓
    ├── 🔍 Inspector: Running tests... ✓
    ├── 📢 Herald: PR created → #456
    └── [Full event log: /api/jobs/123/events]
```

### Noise Management Strategy

| Channel                 | Purpose                            | Verbosity                 |
| ----------------------- | ---------------------------------- | ------------------------- |
| `#mosaic-control`       | Commands + summaries               | Low (milestones only)     |
| Job threads             | Per-job activity                   | Medium (step completions) |
| `/api/jobs/{id}/events` | Full audit log                     | High (everything)         |
| DMs (optional)          | Private updates to triggering user | Configurable              |

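The verbosity tiers above amount to a routing rule from event type to destination. The sketch below is one possible reading of that rule. The `Sink` names, the `sinksFor` function, and the choice of which events count as "milestones" are assumptions for illustration, not part of the design; DMs are left out because their verbosity is configurable.

```typescript
// Hypothetical sink names; only the three verbosity tiers come from the table.
type Sink = "control-channel" | "job-thread" | "event-api";

// Assumption: job start/completion/failure are the "milestones" shown in
// #mosaic-control, and step completions are what job threads show.
function sinksFor(eventType: string): Sink[] {
  // High verbosity: every event lands in the full audit log.
  const sinks: Sink[] = ["event-api"];

  const isMilestone =
    eventType === "job.started" ||
    eventType === "job.completed" ||
    eventType === "job.failed";
  const isStepCompletion = eventType === "step.completed";

  // Medium verbosity: the job thread sees step completions and milestones.
  if (isMilestone || isStepCompletion) sinks.push("job-thread");
  // Low verbosity: the control channel sees milestones only.
  if (isMilestone) sinks.push("control-channel");

  return sinks;
}
```

With this rule, a chatty event such as `step.progress` never leaves the API log, while `job.failed` reaches all three destinations.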
### Commands (via chat)

```
@mosaic fix <issue>     # Start job for issue
@mosaic status <job>    # Get job status
@mosaic cancel <job>    # Cancel running job
@mosaic verbose <job>   # Stream full logs to thread
@mosaic quiet           # Reduce notifications
@mosaic help            # Show commands
```

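To make the command grammar concrete, here is a minimal parsing sketch. The `Command` type and `parseCommand` function are hypothetical names, and the accepted reference formats (`42`, `#42`, `issue #42`, `job #42`) are an assumption based on the chat example earlier in this document.

```typescript
// Hypothetical parsed-command shape; one variant per command listed above.
type Command =
  | { kind: "fix"; issue: number }
  | { kind: "status"; job: number }
  | { kind: "cancel"; job: number }
  | { kind: "verbose"; job: number }
  | { kind: "quiet" }
  | { kind: "help" };

function parseCommand(message: string): Command | null {
  // "@mosaic <verb> [argument...]"
  const match = /^@mosaic\s+(\w+)(?:\s+(.*))?$/.exec(message.trim());
  if (!match) return null;

  const verb = match[1].toLowerCase();
  const rest = (match[2] ?? "").trim();

  // Assumed reference formats: "42", "#42", "issue #42", "job #42".
  const idMatch = /^(?:issue\s+|job\s+)?#?(\d+)$/i.exec(rest);
  const id = idMatch ? Number(idMatch[1]) : null;

  switch (verb) {
    case "fix":
      return id === null ? null : { kind: "fix", issue: id };
    case "status":
      return id === null ? null : { kind: "status", job: id };
    case "cancel":
      return id === null ? null : { kind: "cancel", job: id };
    case "verbose":
      return id === null ? null : { kind: "verbose", job: id };
    case "quiet":
      return { kind: "quiet" };
    case "help":
      return { kind: "help" };
    default:
      return null; // Unknown verb: let the caller reply with help text.
  }
}
```

Returning `null` for anything unrecognised keeps the bridge I/O-only: it never guesses at intent, it only forwards well-formed commands to the stitcher.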
### Integration lives at Mosaic layer, not OpenClaw

- **mosaic-bridge** handles Discord/Mattermost/Slack APIs
- **mosaic-stitcher** receives commands, dispatches jobs
- **mosaic-herald** sends status updates back through bridge
- OpenClaw has NO direct chat access (stays focused on execution)

## Key Components

### 1. Mosaic-Stitcher (The Wrapper)

The control layer that wraps OpenClaw:

- Receives webhooks from @mosaic bot
- Applies Guard Rails (capability permissions)
- Applies Quality Rails (mandatory gates)
- Tracks all job steps and events
- Dispatches work to OpenClaw with constraints

### 2. OpenClaw (Execution Engine)

Community-maintained agent swarm (355+ contributors):

- Spawns and manages AI agent sessions
- Multi-LLM support (Claude, GPT, Ollama, etc.)
- Session management and recovery
- Used as-is, wrapped by Mosaic-Stitcher

### 3. Agent Profiles (Constraints for OpenClaw)

Mosaic-defined capability constraints passed to OpenClaw agents:

- **runner** - read-only: fetch context, read files, query APIs
- **weaver** - write: implement code, scoped to git worktree
- **inspector** - validate: run gates, no modifications
- **herald** - report: PR comments, notifications, status updates

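One way to picture "profiles as constraints" is as plain data that the stitcher checks before dispatching work. The sketch below is illustrative only: the `Capability` names, the `PROFILES` table, and `canWrite` are hypothetical, and the assumption that weaver and inspector also hold `read` is mine. The actual permission model is defined in the Guard Rails document.

```typescript
// Hypothetical capability vocabulary derived from the four profiles above.
type Capability = "read" | "write" | "run-gates" | "report";
type ProfileName = "runner" | "weaver" | "inspector" | "herald";

interface AgentProfile {
  capabilities: ReadonlySet<Capability>;
  // Where writes may land; only the weaver gets a writable scope.
  writeScope: "worktree" | "none";
}

// Assumption: weaver and inspector need "read" to do their jobs.
const PROFILES: Record<ProfileName, AgentProfile> = {
  runner: { capabilities: new Set<Capability>(["read"]), writeScope: "none" },
  weaver: { capabilities: new Set<Capability>(["read", "write"]), writeScope: "worktree" },
  inspector: { capabilities: new Set<Capability>(["read", "run-gates"]), writeScope: "none" },
  herald: { capabilities: new Set<Capability>(["report"]), writeScope: "none" },
};

// True only if the profile may write and the target sits inside the worktree.
function canWrite(profile: ProfileName, targetPath: string, worktreeRoot: string): boolean {
  const p = PROFILES[profile];
  if (!p.capabilities.has("write") || p.writeScope !== "worktree") return false;

  // Reject traversal segments outright instead of trying to normalise them.
  if (targetPath.split("/").includes("..")) return false;

  // Trailing slash prevents "/work/job-123" from matching "/work/job-1234".
  const root = worktreeRoot.endsWith("/") ? worktreeRoot : worktreeRoot + "/";
  return targetPath.startsWith(root);
}
```

Because the profiles are data rather than services, adding or renaming one is a table change, which fits the note that component names are placeholders.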
### 4. Job Structure

Every job contains granular steps:

| Phase      | Steps                                                   |
| ---------- | ------------------------------------------------------- |
| SETUP      | Clone repo, create worktree, install deps               |
| EXECUTION  | Read requirements, analyze code, implement, write tests |
| VALIDATION | Lint gate, typecheck gate, test gate, coverage gate     |
| CLEANUP    | Stage, commit, push, create PR                          |

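The phase table can be expanded into an ordered step plan when a job is created. The following sketch shows one way to do that. `PHASE_STEPS`, `PlannedStep`, and `planSteps` are hypothetical names, and the `ordinal`, `phase`, and `name` fields are chosen to line up with the `job_steps` columns in the schema later in this document.

```typescript
type Phase = "setup" | "execution" | "validation" | "cleanup";

// Step names taken from the table above, in execution order.
const PHASE_STEPS: ReadonlyArray<readonly [Phase, readonly string[]]> = [
  ["setup", ["Clone repo", "Create worktree", "Install deps"]],
  ["execution", ["Read requirements", "Analyze code", "Implement", "Write tests"]],
  ["validation", ["Lint gate", "Typecheck gate", "Test gate", "Coverage gate"]],
  ["cleanup", ["Stage", "Commit", "Push", "Create PR"]],
];

// Hypothetical in-memory shape for a step that has not started yet.
interface PlannedStep {
  ordinal: number; // 1-based position within the job
  phase: Phase;
  name: string;
  status: "pending";
}

// Flattens the phase table into a single ordered list of steps.
function planSteps(): PlannedStep[] {
  const steps: PlannedStep[] = [];
  for (const [phase, names] of PHASE_STEPS) {
    for (const name of names) {
      steps.push({ ordinal: steps.length + 1, phase, name, status: "pending" });
    }
  }
  return steps;
}
```

Numbering steps across the whole job, rather than per phase, means a single `ORDER BY ordinal` is enough to replay a job in order.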
### 5. Event Logging (Event Sourcing)

Every action emits an event:

- `job.created`, `job.queued`, `job.started`, `job.completed`, `job.failed`
- `step.started`, `step.progress`, `step.output`, `step.completed`
- `ai.tool_called`, `ai.tokens_used`, `ai.artifact_created`
- `gate.started`, `gate.passed`, `gate.failed`

Storage:

- PostgreSQL: Immutable audit log (permanent)
- Valkey Streams: Recent events (last 1000 per job)
- Valkey Pub/Sub: Real-time streaming

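The "last 1000 per job" retention maps naturally onto the `MAXLEN` option of the `XADD` stream command. The sketch below only builds the command arguments, so it runs without a Valkey client. The `JobEvent` shape mirrors the `job_events` table in the schema section, while the `job:<id>:events` key format and the field names are assumptions.

```typescript
// Event envelope; fields follow the job_events table later in this document.
interface JobEvent {
  jobId: string;
  stepId: string | null;
  type: string; // e.g. "step.completed"
  timestamp: string; // ISO 8601
  actor: string;
  payload: Record<string, unknown>;
}

// Hypothetical key naming: one stream per job.
function streamKey(jobId: string): string {
  return `job:${jobId}:events`;
}

// Builds the argument list for XADD with a capped stream length.
function xaddArgs(event: JobEvent, maxLen: number = 1000): string[] {
  return [
    "XADD",
    streamKey(event.jobId),
    "MAXLEN",
    String(maxLen), // trim the stream to the newest maxLen entries
    "*", // let the server assign the entry ID
    "type",
    event.type,
    "timestamp",
    event.timestamp,
    "actor",
    event.actor,
    "step_id",
    event.stepId ?? "",
    "payload",
    JSON.stringify(event.payload),
  ];
}
```

Exact trimming matches the stated limit. If write throughput matters more than an exact cap, `MAXLEN ~ 1000` trades precision for cheaper trimming. Since PostgreSQL holds the permanent log, losing old stream entries costs nothing.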
### 6. Queue Architecture

**BullMQ** is preferred over the plain ValkeyService because it provides:

- Job progress tracking (0-100%)
- Automatic retry with exponential backoff
- Rate limiting
- Job dependencies
- Rich lifecycle events

It uses the same Valkey instance that is already configured.

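To illustrate the retry behaviour, the sketch below models the `attempts` and `backoff` job options and computes the resulting wait times. The `RetryOptions` interface is a local mirror of those two options so the block runs without installing `bullmq`. The concrete values (4 attempts, 2 s base delay) are placeholders, and the doubling formula reflects my understanding of BullMQ's built-in exponential strategy rather than a guarantee.

```typescript
// Local stand-in for the subset of BullMQ job options used here.
interface RetryOptions {
  attempts: number; // total tries, including the first one
  backoff: { type: "exponential"; delay: number }; // base delay in ms
}

// Placeholder values for a hypothetical "code-task" job.
const codeTaskOptions: RetryOptions = {
  attempts: 4,
  backoff: { type: "exponential", delay: 2000 },
};

// Delay before the n-th retry (1-based): base delay doubled each time.
function retryDelayMs(opts: RetryOptions, retry: number): number {
  return Math.round(opts.backoff.delay * Math.pow(2, retry - 1));
}

// Every wait a job would see before it is finally marked as failed.
function retrySchedule(opts: RetryOptions): number[] {
  const retries = Math.max(opts.attempts - 1, 0);
  return Array.from({ length: retries }, (_, i) => retryDelayMs(opts, i + 1));
}
```

With the real library, an object of this shape would be passed as the options argument of `queue.add(name, data, opts)`. Under the placeholder values a failing job waits 2 s, 4 s, then 8 s before giving up.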
## Database Schema

```sql
-- Runner jobs (links to existing agent_tasks)
CREATE TABLE runner_jobs (
  id               UUID PRIMARY KEY,
  workspace_id     UUID NOT NULL,
  agent_task_id    UUID REFERENCES agent_tasks(id),
  type             VARCHAR(100),  -- 'git-status', 'code-task', 'priority-calc'
  status           VARCHAR(50),   -- PENDING → QUEUED → RUNNING → COMPLETED/FAILED
  priority         INT,
  progress_percent INT,
  result           JSONB,
  error            TEXT,
  created_at       TIMESTAMPTZ,
  started_at       TIMESTAMPTZ,
  completed_at     TIMESTAMPTZ
);

-- Job steps (granular tracking)
CREATE TABLE job_steps (
  id            UUID PRIMARY KEY,
  job_id        UUID REFERENCES runner_jobs(id),
  ordinal       INT,
  phase         VARCHAR(50),   -- setup, execution, validation, cleanup
  name          VARCHAR(255),
  type          VARCHAR(50),   -- command, ai-action, gate, artifact
  status        VARCHAR(50),
  output        TEXT,
  tokens_input  INT,
  tokens_output INT,
  started_at    TIMESTAMPTZ,
  completed_at  TIMESTAMPTZ,
  duration_ms   INT
);

-- Job events (immutable audit log)
CREATE TABLE job_events (
  id        UUID PRIMARY KEY,
  job_id    UUID REFERENCES runner_jobs(id),
  step_id   UUID REFERENCES job_steps(id),
  type      VARCHAR(100),
  timestamp TIMESTAMPTZ,
  actor     VARCHAR(100),
  payload   JSONB
);
```

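The `status` column comment describes a small state machine (PENDING → QUEUED → RUNNING → COMPLETED/FAILED). Since the column is a free-form `VARCHAR`, the application has to enforce the allowed transitions. The sketch below shows one way to do that. `NEXT`, `canTransition`, and `transition` are hypothetical names, and cancellation and retry are deliberately not modelled because the schema comment does not define states for them.

```typescript
type JobStatus = "PENDING" | "QUEUED" | "RUNNING" | "COMPLETED" | "FAILED";

// Allowed next states, read directly off the schema comment.
const NEXT: Record<JobStatus, readonly JobStatus[]> = {
  PENDING: ["QUEUED"],
  QUEUED: ["RUNNING"],
  RUNNING: ["COMPLETED", "FAILED"],
  COMPLETED: [], // terminal
  FAILED: [], // terminal
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return NEXT[from].includes(to);
}

// Returns the new status, or throws if the move is not allowed.
function transition(from: JobStatus, to: JobStatus): JobStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal job status transition: ${from} -> ${to}`);
  }
  return to;
}
```

Checking transitions in one place also gives a natural hook for emitting the matching `job.queued`, `job.started`, `job.completed`, or `job.failed` event.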
## Deployment Model

**Mosaic wrapper + OpenClaw instance:**

```
docker-compose.yml:
  mosaic-stitcher:     # Control layer (our code)
  mosaic-bridge:       # Chat integrations (Discord, Mattermost, Slack)
  openclaw:            # Execution layer (community code)
  valkey:              # Queue + cache
  postgres:            # Job store, events
```

**NOT separate containers per agent type.** Runner/weaver/inspector are **agent profiles** (constraints), not services. OpenClaw spawns agents with the profile constraints we define.

All services:

- Share Valkey (BullMQ queues)
- Share PostgreSQL (job store, events)
- Communicate via queue (stitcher → openclaw)

## New Modules (in API for now, extract to containers later)

```
apps/api/src/
├── stitcher/           # Workflow engine, job creation
├── runner-jobs/        # Job CRUD, queue submission
├── job-steps/          # Step tracking
├── job-events/         # Event logging, WebSocket gateway
└── workers/            # BullMQ processors (one per component type)
```

## Implementation Phases

1. **Core Infrastructure** - BullMQ setup, database migrations
2. **Coordinator Service** - Job submission, status polling, cancel/retry
3. **Runner Worker** - Claude Code integration, step-by-step execution
4. **Real-time Status** - WebSocket gateway, SSE for CLI
5. **Integration Testing** - End-to-end tests

## Files to Modify

- `apps/api/src/app.module.ts` - Import new modules
- `apps/api/src/valkey/valkey.service.ts` - Share connection with BullMQ
- `apps/api/src/quality-orchestrator/` - Integrate with runner for gates
- `package.json` - Add `@nestjs/bullmq`, `bullmq`

## Verification

1. Create a test job via the API
2. Verify the job appears in the BullMQ queue
3. Confirm the runner picks it up and executes it, emitting step events
4. Confirm the WebSocket receives real-time updates
5. Confirm all events are persisted to PostgreSQL
6. Confirm quality gates run before completion

## Related Documentation

- [Guard Rails: Capability-Based Permission System](./guard-rails-capability-permissions.md)
- [Quality Rails Architecture](./quality-rails-orchestration-architecture.md)
- [Non-AI Coordinator Pattern](./non-ai-coordinator-architecture.md)