docs: agent platform architecture plan — augmentation + task breakdown (#173)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>
This commit was merged in pull request #173.
1572
docs/plans/2026-03-15-agent-platform-architecture.md
Normal file
File diff suppressed because it is too large
60
docs/plans/chroot-sandboxing.md
Normal file
@@ -0,0 +1,60 @@
# Chroot Agent Sandboxing — Process Isolation for Agent Tool Execution

> **Status:** Stub — deferred. Referenced from `2026-03-15-agent-platform-architecture.md` (Phase 7 Workspaces → Chroot Agent Sandboxing).
> Implement after Workspaces (P8-015) is complete. Requires workspace directory structure and `WorkspaceService` to be operational.

**Date:** 2026-03-15
**Packages:** `apps/gateway`

---

## Problem Statement

Agent sessions can use file, git, and shell tools. Path validation inside those tools is a useful defense-in-depth layer but is insufficient on its own: an agent with shell access can run `cat /opt/mosaic/.workspaces/other_user/...` and bypass gateway RBAC.

Chroot provides OS-level enforcement: tool processes cannot see anything outside their workspace directory.

---

## Design (Sweet Spot)

Chroot strikes a balance between full container isolation (too heavy per session) and path validation alone (escape-prone):

- Gateway spawns tool processes inside a chroot rooted at the session's `sandboxDir`
- Requires `CAP_SYS_CHROOT` capability on the gateway process (not full root)
- Chroot environment provisioned by `WorkspaceService` on workspace creation (minimal deps: git, shell utils, language runtimes as needed)
- Alternative for Docker deployments: Linux `unshare` namespaces (lighter, no chroot env setup)
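
The two spawning strategies above could be sketched as a small argv builder. This is a minimal sketch, not the gateway's implementation: `SandboxMode`, `SandboxCommand`, and `buildSandboxCommand` are hypothetical names, and TypeScript is assumed as the gateway language. The `chroot` form follows the coreutils usage `chroot NEWROOT COMMAND...`; the `unshare` form uses the namespace flags named in this plan plus `--fork`. The result would typically be handed to a process spawner such as Node's `child_process.spawn(command, args)`.

```typescript
// Hypothetical helper: none of these names come from the gateway codebase.
type SandboxMode = "chroot" | "unshare";

interface SandboxCommand {
  command: string;
  args: string[];
}

/**
 * Build the argv for running a tool command inside a session sandbox.
 * - "chroot": coreutils `chroot NEWROOT COMMAND...`; needs CAP_SYS_CHROOT,
 *   and the program must exist inside the provisioned chroot environment.
 * - "unshare": util-linux `unshare` with new mount, PID, and user namespaces.
 *   These flags isolate namespaces only; confining the filesystem view needs
 *   additional setup that this sketch does not cover.
 */
function buildSandboxCommand(
  mode: SandboxMode,
  sandboxDir: string,
  toolArgv: string[],
): SandboxCommand {
  if (toolArgv.length === 0) {
    throw new Error("toolArgv must contain at least the program to run");
  }
  if (!sandboxDir.startsWith("/")) {
    throw new Error("sandboxDir must be an absolute path");
  }
  if (mode === "chroot") {
    return { command: "chroot", args: [sandboxDir, ...toolArgv] };
  }
  // --fork makes the tool a child of unshare, so it becomes PID 1
  // of the new PID namespace.
  return {
    command: "unshare",
    args: ["--mount", "--pid", "--user", "--fork", ...toolArgv],
  };
}
```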

---

## Scope (To Be Designed)

- [ ] Chroot environment provisioning — `WorkspaceService.provisionChroot(workspacePath)` on project creation
- [ ] Minimal chroot deps — identify required binaries/libs per tool type (file: none; git: git binary; shell: bash, common utils)
- [ ] Gateway capability — document `CAP_SYS_CHROOT` requirement; Dockerfile and docker-compose.yml changes
- [ ] Tool process spawning — modify `createShellTools`, `createFileTools`, `createGitTools` to spawn via chroot wrapper
- [ ] Docker alternative — `unshare --mount --pid --user` namespace wrapper as fallback for environments without chroot capability
- [ ] Defense-in-depth layering — chroot + path validation both active; neither alone is sufficient
- [ ] Chroot cleanup — integrate with `SessionGCService` / workspace deletion
- [ ] AppArmor/SELinux profiles (v2) — restrict gateway process file access patterns for multi-tenant hardening
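
For the path-validation half of the defense-in-depth item, a containment check might look like the sketch below. It is an illustration under stated assumptions, not the existing tool code: `normalizeSegments` and `isInsideWorkspace` are hypothetical names, paths are treated as POSIX strings, and symlinks are not resolved. That last gap is one reason such a check is paired with OS-level isolation. Comparing whole path segments avoids the prefix pitfall where a sibling directory whose name merely starts with the workspace name would pass a plain `startsWith` test.

```typescript
// Hypothetical helpers; not the gateway's actual validation code.

/** Collapse "." and ".." segments of an absolute POSIX path into a segment list. */
function normalizeSegments(absPath: string): string[] {
  if (!absPath.startsWith("/")) {
    throw new Error(`expected an absolute path, got: ${absPath}`);
  }
  const out: string[] = [];
  for (const segment of absPath.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      out.pop(); // ".." at the root stays at the root
      continue;
    }
    out.push(segment);
  }
  return out;
}

/**
 * True when `requested` (absolute, or relative to the workspace root)
 * stays inside `workspaceRoot` after "." and ".." are collapsed.
 * String-level only: symlinks are NOT resolved here.
 */
function isInsideWorkspace(workspaceRoot: string, requested: string): boolean {
  const root = normalizeSegments(workspaceRoot);
  const target = normalizeSegments(
    requested.startsWith("/") ? requested : `${workspaceRoot}/${requested}`,
  );
  if (target.length < root.length) return false;
  // Compare segment by segment so a sibling such as "alice-evil"
  // does not count as being inside "alice".
  return root.every((segment, i) => target[i] === segment);
}
```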

---

## Security Constraints

- What lives **inside** the chroot (agent-accessible): workspace files, git repo, language runtimes
- What lives **outside** the chroot (gateway-only, never agent-accessible): Valkey connection, PG connection, other users' workspaces, gateway config, OTEL endpoint, credentials

---

## Dependencies

- Workspaces (P8-015) — chroot is rooted at workspace directory; workspace must exist first
- Tool hardening (P8-016) — path validation stays active as defense-in-depth alongside chroot

---

## References

- Original design context: `docs/plans/2026-03-15-agent-platform-architecture.md` → "Chroot Agent Sandboxing" section
- Current tool implementations: `apps/gateway/src/agent/tools/`
53
docs/plans/gatekeeper-service.md
Normal file
@@ -0,0 +1,53 @@
# Gatekeeper Service — PR Review, Quality Gates & Merge Authority

> **Status:** Stub — deferred. Referenced from `2026-03-15-agent-platform-architecture.md` (Phase 7 Workspaces).
> Implement after Workspaces (P8-015) is complete and the workspace/git infrastructure is operational.

**Date:** 2026-03-15
**Packages:** `apps/gateway`, `packages/types`, `packages/agent`

---

## Problem Statement

Project agents create PRs but cannot review or merge their own work. A separate, isolated agent service with read-only code access and quality gate enforcement is needed to act as the merge authority.

The Gatekeeper existed in the old Mosaic codebase and must be ported/redesigned for mosaic-mono-v1.

---

## Key Design Constraints

- **Isolated trust boundary** — project agents cannot invoke Gatekeeper directly; it listens for PR events from the git provider
- **`isSystem: true`** — system agent, not editable by users
- **Read-only code access** — reads diffs and runs checks; cannot commit or push
- **Quality gates required before merge** — lint, typecheck, test results must pass
- **Cannot self-approve** — the agent that authored the PR cannot be the Gatekeeper for that PR
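
Two of the constraints above, required quality gates and no self-approval, can be expressed as a pure decision function. The sketch below is one possible shape, assuming TypeScript; `PullRequestState`, `Decision`, and `decideMerge` are hypothetical names, and the gate list simply mirrors the lint, typecheck, and test gates named in this plan. Keeping the decision pure (no network calls) would make it straightforward to unit test and to log as an audit record.

```typescript
// Hypothetical types; the real Gatekeeper data model is still to be designed.
type GateName = "lint" | "typecheck" | "test";
type GateStatus = "passed" | "failed" | "pending";

interface PullRequestState {
  authorAgentId: string;
  gates: Record<GateName, GateStatus>;
}

type Decision =
  | { action: "merge" }
  | { action: "wait"; pending: GateName[] }
  | { action: "reject"; reasons: string[] };

/** Decide what the Gatekeeper should do with a PR, given current gate results. */
function decideMerge(pr: PullRequestState, gatekeeperAgentId: string): Decision {
  // Cannot self-approve: the author may never act as Gatekeeper for its own PR.
  if (pr.authorAgentId === gatekeeperAgentId) {
    return { action: "reject", reasons: ["self-approval is not allowed"] };
  }
  const names = Object.keys(pr.gates) as GateName[];
  // Any failed gate blocks the merge and is reported back as a reason.
  const failed = names.filter((name) => pr.gates[name] === "failed");
  if (failed.length > 0) {
    return { action: "reject", reasons: failed.map((name) => `${name} failed`) };
  }
  // Gates still running: neither merge nor reject yet.
  const pending = names.filter((name) => pr.gates[name] === "pending");
  if (pending.length > 0) {
    return { action: "wait", pending };
  }
  return { action: "merge" };
}
```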

---

## Scope (To Be Designed)

- [ ] Gatekeeper agent bootstrap — system agent config, tool set, prompt engineering
- [ ] PR event listener — Gitea/GitHub webhook integration (PR opened/updated/ready)
- [ ] Quality gate runner — trigger CI checks, poll for results, enforce pass criteria
- [ ] Review generation — LLM-driven code review comment generation
- [ ] Merge execution — approve + merge when gates pass; reject with comments when they fail
- [ ] Configurable strictness — per-project required checks, review depth
- [ ] Trust boundary enforcement — gateway rejects Gatekeeper tool calls that exceed read-only scope
- [ ] Audit trail — OTEL spans for all Gatekeeper decisions (approve/reject/merge)
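
The trust boundary enforcement item could start from an allowlist check in the gateway, as in the sketch below. Everything here is an assumption made for illustration: the tool names (`git_diff`, `git_log`, `file_read`, `ci_status`) are placeholders, since the Gatekeeper tool set is still to be designed, and `authorizeGatekeeperToolCall` is a hypothetical function. An allowlist is used rather than a denylist so that a newly added tool is rejected by default until someone classifies it as read-only.

```typescript
// Placeholder tool names; the real Gatekeeper tool set is still to be designed.
const GATEKEEPER_READ_ONLY_TOOLS: ReadonlySet<string> = new Set([
  "git_diff",
  "git_log",
  "file_read",
  "ci_status",
]);

type Authorization =
  | { allowed: true }
  | { allowed: false; reason: string };

/**
 * Gateway-side check for a tool call issued by the Gatekeeper agent.
 * Anything not explicitly listed as read-only is rejected.
 */
function authorizeGatekeeperToolCall(tool: string): Authorization {
  if (GATEKEEPER_READ_ONLY_TOOLS.has(tool)) {
    return { allowed: true };
  }
  return {
    allowed: false,
    reason: `tool "${tool}" exceeds the Gatekeeper's read-only scope`,
  };
}
```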

---

## Dependencies

- Workspaces (P8-015) — Gatekeeper needs project workspace layout to locate code
- Git provider API tools — PR creation/review/merge API (Gitea/GitHub/GitLab)
- CI/CD tool integration — Woodpecker pipeline status polling

---

## References

- Original design context: `docs/plans/2026-03-15-agent-platform-architecture.md` → "Gatekeeper Service" section
- Workspace RBAC and agent trust model: same document → "RBAC & Filesystem Security"
60
docs/plans/task-queue-unification.md
Normal file
@@ -0,0 +1,60 @@
# Task Queue Unification — @mosaic/queue as Unified Orchestration Layer

> **Status:** Stub — deferred. Referenced from `2026-03-15-agent-platform-architecture.md` (Task Queue & Orchestration section).
> Implement after Workspaces (P8-015) is complete. Requires workspace file structure to be in place.

**Date:** 2026-03-15
**Packages:** `packages/queue`, `packages/coord`, `packages/db`, `apps/gateway`

---

## Problem Statement

Two disconnected task systems exist:

1. **`@mosaic/coord`** — file-based missions (`mission.json`, `TASKS.md`), file locks, subprocess spawning. Single-machine orchestrator pattern.
2. **PG tables** (`tasks`, `mission_tasks`, `missions`) — DB-backed CRUD, REST API, Brain repos.

An agent calling `coord_mission_status` gets file data, while the dashboard shows DB data. Nothing keeps the two in sync.

---

## Vision

`@mosaic/queue` becomes the unified task orchestration service bridging PG, workspace files, and Valkey:

- DB is source of truth for structured state (status, assignees, timestamps)
- Workspace files (`TASKS.md`, PRDs) are working copies for agent interaction
- Valkey handles real-time assignment queues and agent claim locks
- Flatfile fallback for no-DB single-machine deployments (preserves `@mosaic/coord` pattern)
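
The agent claim locks mentioned above can be illustrated with a set-if-absent operation plus a TTL, using the key pattern `mosaic:queue:project:{id}:lock:{taskId}` listed in the Scope section of this plan. The sketch below is an assumption-heavy illustration in TypeScript: `LockStore`, `claimLockKey`, `claimTask`, and `InMemoryLockStore` are hypothetical names, not the `@mosaic/queue` API. The in-memory store only stands in for Valkey so the sketch runs on its own; against Valkey, the same semantics correspond to `SET key value NX PX ttl`. The TTL means a crashed agent's claim expires instead of blocking the task indefinitely.

```typescript
/** Minimal slice of a key-value client; an assumption, not the @mosaic/queue API. */
interface LockStore {
  /** Set `key` only if absent, expiring after `ttlMs`. Resolves true when the key was set. */
  setIfAbsent(key: string, value: string, ttlMs: number): Promise<boolean>;
}

/** Key pattern taken from this plan: mosaic:queue:project:{id}:lock:{taskId}. */
function claimLockKey(projectId: string, taskId: string): string {
  return `mosaic:queue:project:${projectId}:lock:${taskId}`;
}

/** Try to claim a task for an agent. Resolves false when another agent holds the lock. */
async function claimTask(
  store: LockStore,
  projectId: string,
  taskId: string,
  agentId: string,
  ttlMs: number = 30_000,
): Promise<boolean> {
  return store.setIfAbsent(claimLockKey(projectId, taskId), agentId, ttlMs);
}

/** In-memory stand-in so the sketch runs without Valkey. */
class InMemoryLockStore implements LockStore {
  private readonly entries = new Map<string, { owner: string; expiresAt: number }>();
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async setIfAbsent(key: string, value: string, ttlMs: number): Promise<boolean> {
    const existing = this.entries.get(key);
    // A lock that has not expired yet blocks new claims.
    if (existing !== undefined && existing.expiresAt > this.now()) {
      return false;
    }
    this.entries.set(key, { owner: value, expiresAt: this.now() + ttlMs });
    return true;
  }
}
```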

---

## Scope (To Be Designed)

- [ ] `@mosaic/queue` refactor — elevate from ioredis primitive to task orchestration service
- [ ] DB ↔ file sync layer — writes to PG propagate to `TASKS.md`; file edits by agents sync back
- [ ] Task assignment queue — Valkey-backed RPUSH/BLPOP for agent task claiming
- [ ] Agent claim locks — `mosaic:queue:project:{id}:lock:{taskId}` with TTL
- [ ] `@mosaic/coord` consolidation — file-based ops ported into queue service; `@mosaic/coord` becomes thin adapter or deprecated
- [ ] Flatfile fallback — queue service writes JSON manifests when PG unavailable
- [ ] Status pub/sub — real-time task status updates via Valkey pub/sub
- [ ] Dependency resolution — block task assignment until dependencies are met
- [ ] Orchestrator monitor — gateway process watches task queue, assigns next based on dependency graph
- [ ] API surface — queue service exposes typed interface used by agents, gateway, and CLI
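
The dependency resolution item amounts to filtering for tasks whose dependencies are all complete. A minimal sketch follows, assuming TypeScript and a hypothetical `Task` shape (`id`, `status`, `dependsOn`) that is not taken from the existing PG schema. A dependency on an id that is missing from the list is treated as unmet, so a task is held back rather than assigned when its graph is incomplete.

```typescript
// Hypothetical task shape; the real schema lives in the PG tables named above.
type TaskStatus = "todo" | "in_progress" | "done";

interface Task {
  id: string;
  status: TaskStatus;
  dependsOn: string[];
}

/**
 * Return the tasks that may be assigned now: still "todo" and with every
 * dependency already "done". Unknown dependency ids count as unmet.
 */
function assignableTasks(tasks: Task[]): Task[] {
  const doneIds = new Set(
    tasks.filter((task) => task.status === "done").map((task) => task.id),
  );
  return tasks.filter(
    (task) =>
      task.status === "todo" && task.dependsOn.every((dep) => doneIds.has(dep)),
  );
}
```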

---

## Dependencies

- Workspaces (P8-015) — file sync targets the workspace directory structure
- Teams architecture (P8-007) — project ownership determines queue namespacing
- DB schema stable — task/mission tables must not change mid-unification

---

## References

- Original design context: `docs/plans/2026-03-15-agent-platform-architecture.md` → "Task Queue & Orchestration" section
- Current `@mosaic/coord` implementation: `packages/coord/src/`
- Current `@mosaic/queue` implementation: `packages/queue/src/`