Detailed comparison showing: - Existing doc addresses L-015 (premature completion) - New doc addresses context exhaustion (multi-issue orchestration) - ~20% overlap (both use non-AI coordinator, mechanical gates) - 80% complementary (different problems, different solutions) Recommends merging into comprehensive document (already done). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
# Non-AI Coordinator Pattern - Overlap Analysis

**Date:** 2026-01-31
**Purpose:** Identify overlaps and differences between two complementary architecture documents

---
## Documents Compared

### Document A: Mosaic Stack Non-AI Coordinator Pattern

**Location:** `/home/jwoltje/src/mosaic-stack/docs/3-architecture/non-ai-coordinator-pattern.md`
**Length:** 903 lines
**Problem Space:** L-015 Agent Premature Completion
**Focus:** Single-agent quality enforcement

### Document B: Quality-Rails Orchestration Architecture

**Location:** `/home/jwoltje/src/jarvis-brain/docs/work/quality-rails-orchestration-architecture.md`
**Length:** ~600 lines
**Problem Space:** Context exhaustion in multi-issue orchestration
**Focus:** Multi-agent lifecycle management at scale

---
## Summary Table

| Aspect                     | Document A (Existing)                       | Document B (New)                         | Overlap?           |
| -------------------------- | ------------------------------------------- | ---------------------------------------- | ------------------ |
| **Primary Problem**        | Agents claim "done" prematurely             | Agents pause at 95% context              | Different          |
| **Coordinator Type**       | Non-AI (TypeScript/NestJS)                  | Non-AI (Python/Node.js)                  | ✅ Overlap         |
| **Quality Gates**          | BuildGate, LintGate, TestGate, CoverageGate | Mechanical gates (lint, typecheck, test) | ✅ Overlap         |
| **Agent Scope**            | Single agent per issue                      | Multi-agent orchestration                | Different          |
| **Context Management**     | Not addressed                               | Core feature (80% compact, 95% rotate)   | Different          |
| **Model Assignment**       | Not addressed                               | Agent profiles + difficulty matching     | Different          |
| **Issue Sizing**           | Not addressed                               | 50% rule, epic decomposition             | Different          |
| **Implementation Status**  | Full TypeScript code                        | Python pseudocode + PoC plan             | Different          |
| **Forced Continuation**    | Yes (rejection loop)                        | No (preventive via context mgmt)         | Different approach |
| **Non-negotiable Quality** | Yes                                         | Yes                                      | ✅ Overlap         |

---
## Unique to Document A (Existing Mosaic Stack Pattern)

### 1. **Premature Completion Problem**

- **Problem:** Agents claim work is "done" when tests fail, files are missing, or requirements are incomplete
- **Root cause:** Agent interprets partial completion as success
- **Example:** Agent implements feature, tests fail, agent says "done" anyway

### 2. **Rejection Loop & Forced Continuation**

```typescript
// CompletionVerificationEngine
if (!allGatesPassed) {
  return this.forcedContinuationService.generateContinuationPrompt({
    failedGates,
    tone: "non-negotiable",
  });
}
```

**Key innovation:** When an agent claims "done" but gates fail, the coordinator injects a prompt that forces continuation:

```
COMPLETION REJECTED. The following quality gates have failed:
- Build Gate: Compilation errors detected
- Test Gate: 3/15 tests failing

You must continue working until ALL quality gates pass.
This is not optional. Do not claim completion until gates pass.
```
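As a rough illustration of how such a prompt could be assembled from gate results, here is a minimal Python sketch. `GateResult` and `build_continuation_prompt` are hypothetical names introduced for this example; they are not the API of Doc A's `forcedContinuationService`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GateResult:
    """Outcome of one quality gate (hypothetical shape, assumed for this sketch)."""
    name: str
    passed: bool
    detail: str = ""


def build_continuation_prompt(results: List[GateResult]) -> Optional[str]:
    """Return a rejection prompt listing failed gates, or None if all passed."""
    failed = [r for r in results if not r.passed]
    if not failed:
        return None  # Nothing failed, so the completion claim can stand
    lines = ["COMPLETION REJECTED. The following quality gates have failed:"]
    lines += [f"- {r.name}: {r.detail}" for r in failed]
    lines += [
        "",
        "You must continue working until ALL quality gates pass.",
        "This is not optional. Do not claim completion until gates pass.",
    ]
    return "\n".join(lines)
```

Returning `None` when every gate passes keeps the accept and reject paths distinct for the caller, which only injects a prompt when there is something to reject.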

### 3. **State Machine for Completion Claims**

```
Agent Working → Claims Done → Run Gates → Pass/Reject
                                               ↓
                                            Reject → Force Continue → Agent Working
```
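The transitions in the diagram can be written as a small pure function. This is a sketch under assumptions: the state names are chosen here for illustration, and the "Run Gates" step is folded into the transition out of the claimed-done state.

```python
from enum import Enum


class AgentState(Enum):
    """Illustrative state names; Doc A may name these differently."""
    WORKING = "working"
    CLAIMED_DONE = "claimed_done"
    DONE = "done"


def next_state(state: AgentState, gates_passed: bool = False) -> AgentState:
    """Advance the completion state machine by one step."""
    if state is AgentState.WORKING:
        return AgentState.CLAIMED_DONE  # Agent claims completion
    if state is AgentState.CLAIMED_DONE:
        # Gates run at this point: pass -> done, reject -> forced back to work
        return AgentState.DONE if gates_passed else AgentState.WORKING
    return AgentState.DONE  # DONE is terminal
```

The only way to reach `DONE` is through a passing gate run, which is the property the rejection loop is meant to guarantee.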

### 4. **TypeScript/NestJS Implementation**

- Full production-ready service code
- QualityOrchestrator service
- Gate interfaces and implementations
- Dependency injection architecture

### 5. **CompletionVerificationEngine**

- Intercepts agent completion claims
- Runs all gates synchronously
- Blocks "done" status until gates pass

---
## Unique to Document B (New Quality-Rails Orchestration)

### 1. **Context Exhaustion Problem**

- **Problem:** AI orchestrators pause at 95% context usage, losing autonomy
- **Root cause:** Linear context growth without compaction
- **Example:** M4 session completed 11 issues, paused at 95%, required manual restart

### 2. **50% Rule for Issue Sizing**

```
Issue context estimate MUST NOT exceed 50% of target agent's context limit.

Example:
- Sonnet agent: 200K context limit
- Maximum issue estimate: 100K tokens
- Reasoning: Leaves 100K for system prompts, conversation, safety buffer
```
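Expressed as code, the rule is a one-line check. The function name below is an assumption made for this sketch, not something defined in Doc B.

```python
def fits_fifty_percent_rule(estimated_tokens: int, context_limit: int) -> bool:
    """True if the issue estimate is at most half of the agent's context limit."""
    # Integer arithmetic avoids any floating-point rounding at the boundary
    return estimated_tokens * 2 <= context_limit


# The Sonnet example above: a 100K estimate is the largest that fits in 200K
print(fits_fifty_percent_rule(100_000, 200_000))
print(fits_fifty_percent_rule(100_001, 200_000))
```

An issue that fails this check would be a candidate for decomposition into smaller sub-issues.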

### 3. **Agent Profiles & Model Assignment**

```python
AGENT_PROFILES = {
    'opus': {
        'context_limit': 200000,
        'cost_per_mtok': 15.00,
        'capabilities': ['high', 'medium', 'low']
    },
    'sonnet': {
        'context_limit': 200000,
        'cost_per_mtok': 3.00,
        'capabilities': ['medium', 'low']
    },
    'glm': {
        'context_limit': 128000,
        'cost_per_mtok': 0.00,  # Self-hosted
        'capabilities': ['medium', 'low']
    }
}
```

**Assignment logic:** Choose the cheapest capable agent based on:

- Estimated context usage
- Difficulty level
- Agent capabilities
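One way the assignment logic above could look is sketched below. The profile values are copied from the snippet above; the function name `assign_agent` and the choice to apply the 50% rule as the context filter are assumptions for illustration.

```python
from typing import Optional

AGENT_PROFILES = {
    "opus": {"context_limit": 200000, "cost_per_mtok": 15.00,
             "capabilities": ["high", "medium", "low"]},
    "sonnet": {"context_limit": 200000, "cost_per_mtok": 3.00,
               "capabilities": ["medium", "low"]},
    "glm": {"context_limit": 128000, "cost_per_mtok": 0.00,  # Self-hosted
            "capabilities": ["medium", "low"]},
}


def assign_agent(estimated_context: int, difficulty: str) -> Optional[str]:
    """Pick the cheapest profile that handles the difficulty and fits the issue."""
    candidates = [
        name
        for name, profile in AGENT_PROFILES.items()
        if difficulty in profile["capabilities"]
        # Assumed filter: the estimate must fit in half the context limit
        and estimated_context * 2 <= profile["context_limit"]
    ]
    if not candidates:
        return None  # No agent fits; the issue likely needs decomposition
    return min(candidates, key=lambda name: AGENT_PROFILES[name]["cost_per_mtok"])
```

With these profiles, a 50K-token medium issue would go to the self-hosted `glm` profile, while a 90K-token medium issue exceeds half of its 128K limit and falls through to `sonnet`.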

### 4. **Context Monitoring & Session Rotation**

```python
def monitor_agent_context(agent_id: str) -> ContextAction:
    usage = get_context_usage(agent_id)

    if usage > 0.95:
        return ContextAction.ROTATE_SESSION  # Start fresh agent
    elif usage > 0.80:
        return ContextAction.COMPACT  # Summarize completed work
    else:
        return ContextAction.CONTINUE  # Keep working
```
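The snippet above depends on `ContextAction` and `get_context_usage`, which are not defined in this analysis. A self-contained variant might look like the following, where the enum members mirror the names used above and usage is computed from token counts supplied by the caller (an assumption, since the source does not say how usage is measured).

```python
from enum import Enum, auto


class ContextAction(Enum):
    CONTINUE = auto()
    COMPACT = auto()
    ROTATE_SESSION = auto()


def usage_fraction(tokens_used: int, context_limit: int) -> float:
    """Fraction of the context window consumed so far."""
    return tokens_used / context_limit


def action_for_usage(usage: float) -> ContextAction:
    """Map a usage fraction to an action, using the same thresholds as above."""
    if usage > 0.95:
        return ContextAction.ROTATE_SESSION  # Start fresh agent
    if usage > 0.80:
        return ContextAction.COMPACT  # Summarize completed work
    return ContextAction.CONTINUE  # Keep working
```

Separating the threshold logic from the measurement makes the policy testable without a live agent session.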

### 5. **Context Estimation Formula**

```python
def estimate_context(issue: Issue) -> int:
    base = (
        issue.files_to_modify * 7000 +  # Average file size
        issue.implementation_complexity * 20000 +  # Code writing
        issue.test_requirements * 10000 +  # Test writing
        issue.documentation * 3000  # Docs
    )

    buffer = base * 1.3  # 30% safety margin
    return int(buffer)
```
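To make the arithmetic concrete, here is a runnable version with a worked example. The `Issue` dataclass is a hypothetical stand-in, and treating each field as a small integer count or score is an assumption; Doc B's actual scales are not given in this analysis.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """Hypothetical issue shape; field scales are assumed for illustration."""
    files_to_modify: int
    implementation_complexity: int
    test_requirements: int
    documentation: int


def estimate_context(issue: Issue) -> int:
    base = (
        issue.files_to_modify * 7000            # Average file size
        + issue.implementation_complexity * 20000  # Code writing
        + issue.test_requirements * 10000       # Test writing
        + issue.documentation * 3000            # Docs
    )
    return int(base * 1.3)  # 30% safety margin


# 3 files, one unit each of complexity, tests, and docs:
# base = 21000 + 20000 + 10000 + 3000 = 54000, then scaled by 1.3
example = Issue(files_to_modify=3, implementation_complexity=1,
                test_requirements=1, documentation=1)
print(estimate_context(example))
```

An estimate of roughly 70K tokens for this example would pass the 50% rule for a 200K-context agent but not for a 128K-context one.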

### 6. **Epic Decomposition Workflow**

```
User creates Epic → Coordinator analyzes scope → Decomposes into sub-issues
                                                              ↓
                                            Each issue ≤ 50% agent context limit
                                                              ↓
                                      Assigns metadata: estimated_context, difficulty
```

### 7. **Multi-Model Support**

- Supports Opus, Sonnet, Haiku, GLM, MiniMax, Cogito
- Cost optimization through model selection
- Self-hosted model preference when capable

### 8. **Proactive Context Management**

- Prevents context exhaustion BEFORE it happens
- No manual intervention needed
- Maintains autonomy through entire queue

---
## Overlaps (Both Documents)

### 1. **Non-AI Coordinator Pattern** ✅

Both use deterministic code (not AI) as the orchestrator:

- **Doc A:** TypeScript/NestJS service
- **Doc B:** Python/Node.js coordinator
- **Rationale:** Avoid AI orchestrator context limits and inconsistency

### 2. **Mechanical Quality Gates** ✅

Both enforce quality through automated checks:

**Doc A gates:**

- BuildGate (compilation)
- LintGate (code style)
- TestGate (unit/integration tests)
- CoverageGate (test coverage threshold)

**Doc B gates:**

- lint (code quality)
- typecheck (type safety)
- test (functionality)
- coverage (same as Doc A)

### 3. **Programmatic Enforcement** ✅

Both prevent agents from bypassing quality checks:

- **Doc A:** Rejection loop blocks completion until gates pass
- **Doc B:** Coordinator enforces gates before allowing next issue
- **Shared principle:** Quality is a requirement, not a suggestion

### 4. **Non-Negotiable Quality Standards** ✅

Both use firm language about quality requirements:

- **Doc A:** "This is not optional. Do not claim completion until gates pass."
- **Doc B:** "Quality gates are mechanical blockers, not suggestions."

### 5. **State Management** ✅

Both track work state programmatically:

- **Doc A:** Agent state machine (working → claimed done → verified → actual done)
- **Doc B:** Issue state in tracking system (pending → in-progress → gate-check → completed)

### 6. **Validation Before Progression** ✅

Both prevent moving forward with broken code:

- **Doc A:** Cannot claim "done" until gates pass
- **Doc B:** Cannot start next issue until current issue passes gates

---
## Complementary Nature

These documents solve **different problems in the same architectural pattern**:

### Document A (Existing): Quality Enforcement

**Problem:** "How do we prevent an agent from claiming work is done when it's not?"
**Solution:** Rejection loop with forced continuation
**Scope:** Single agent working on single issue
**Lifecycle stage:** Task completion verification

### Document B (New): Orchestration at Scale

**Problem:** "How do we manage multiple agents working through dozens of issues without context exhaustion?"
**Solution:** Proactive context management + intelligent agent assignment
**Scope:** Multi-agent orchestration across entire milestone
**Lifecycle stage:** Agent selection, session management, queue progression

### Together They Form:

```
┌─────────────────────────────────────────────────────────┐
│ Non-AI Coordinator (Document B)                         │
│ - Monitors context usage across all agents              │
│ - Assigns issues based on context estimates             │
│ - Rotates agents at 95% context                         │
│ - Enforces 50% rule during issue creation               │
└─────────────────────────┬───────────────────────────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ▼                 ▼                 ▼
     Agent 1           Agent 2           Agent 3
    Issue #42         Issue #57         Issue #89
        │                 │                 │
        └─────────────────┴─────────────────┘
                          │
                          ▼
 ┌─────────────────────────────────────────────────┐
 │ Quality Orchestrator (Document A)               │
 │ - Intercepts completion claims                  │
 │ - Runs quality gates                            │
 │ - Forces continuation if gates fail             │
 │ - Only allows "done" when gates pass            │
 └─────────────────────────────────────────────────┘
```
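The layering in the diagram can be sketched as a single deterministic loop. Everything here is hypothetical scaffolding: `run_queue` and its callback parameters are invented for illustration, issues are processed sequentially rather than across parallel agents, and the `max_attempts` cap is an added assumption (neither document is stated to limit retries).

```python
from collections import deque
from typing import Callable, Deque, List


def run_queue(
    issues: List[str],
    work: Callable[[str], None],
    gates_pass: Callable[[str], bool],
    context_usage: Callable[[], float],
    rotate_session: Callable[[], None],
    max_attempts: int = 5,  # Assumed safety cap, not from either document
) -> List[str]:
    """Process issues in order and return those that passed all gates."""
    completed: List[str] = []
    queue: Deque[str] = deque(issues)
    while queue:
        issue = queue.popleft()
        for _ in range(max_attempts):
            if context_usage() > 0.95:
                rotate_session()  # Orchestration layer (Doc B)
            work(issue)  # Agent works, then claims "done"
            if gates_pass(issue):  # Quality layer (Doc A)
                completed.append(issue)
                break
            # Gates failed: loop again, which models forced continuation
    return completed
```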

**Document B (new)** manages the **agent lifecycle and orchestration**.
**Document A (existing)** manages the **quality enforcement per agent**.

---
## Integration Recommendations

### Option 1: Merge into Single Document (Recommended)

**Reason:** They're parts of the same system

**Structure:**

```markdown
# Non-AI Coordinator Pattern Architecture

## Part 1: Multi-Agent Orchestration (from Doc B)

- Context management
- Agent assignment
- Session rotation
- 50% rule
- Epic decomposition

## Part 2: Quality Enforcement (from Doc A)

- Premature completion problem
- Quality gates
- Rejection loop
- Forced continuation
- CompletionVerificationEngine

## Part 3: Implementation

- TypeScript/NestJS orchestrator (from Doc A)
- Python coordinator enhancements (from Doc B)
- Integration points
```

### Option 2: Keep Separate, Create Integration Doc

**Reason:** Different audiences (orchestration vs quality enforcement)

**Documents:**

1. `orchestration-architecture.md` (Doc B) - For understanding multi-agent coordination
2. `quality-enforcement-architecture.md` (Doc A) - For understanding quality gates
3. `non-ai-coordinator-integration.md` (NEW) - How they work together

### Option 3: Hierarchical Documentation

**Reason:** Layers of abstraction

```
non-ai-coordinator-pattern.md (Overview)
├── orchestration-layer.md (Doc B content)
└── quality-layer.md (Doc A content)
```

---
## Action Items

Based on this overlap analysis, the recommended actions are:

1. **Merge the documents** into a comprehensive architecture guide
   - Use Doc A's problem statement for quality enforcement
   - Use Doc B's problem statement for context exhaustion
   - Show how both problems require a non-AI coordinator
   - Integrate TypeScript implementation with context monitoring

2. **Update Mosaic Stack issue #140**
   - Current: "Document Non-AI Coordinator Pattern Architecture"
   - Expand scope: Include both quality enforcement AND orchestration
   - Reference both problem spaces (L-015 + context exhaustion)

3. **Create unified PoC plan**
   - Phase 1: Context monitoring (from Doc B)
   - Phase 2: Agent assignment logic (from Doc B)
   - Phase 3: Quality gate integration (from Doc A)
   - Phase 4: Forced continuation (from Doc A)

4. **Preserve unique innovations from each**
   - Doc A: Rejection loop, forced continuation prompts
   - Doc B: 50% rule, agent profiles, context estimation formula

---
## Conclusion

**These documents are highly complementary, not duplicative.**

- **~20% overlap:** Both use non-AI coordinator, mechanical gates, non-negotiable quality
- **~80% unique value:** Doc A solves premature completion, Doc B solves context exhaustion

**Best path forward:** Merge into a single comprehensive architecture document that addresses both problems within the unified non-AI coordinator pattern.

The pattern is:

1. Non-AI coordinator assigns issues based on context estimates (Doc B)
2. Agent works on issue
3. Quality gates enforce completion standards (Doc A)
4. Context monitoring prevents exhaustion (Doc B)
5. Forced continuation prevents premature "done" (Doc A)
6. Next issue assigned when ready (Doc B)

Together they create a **robust, autonomous, quality-enforcing orchestration system** that scales beyond single-agent, single-issue scenarios.

---
**Next Steps:**

1. User review of this analysis
2. Decision on integration approach (Option 1, 2, or 3)
3. Update Mosaic Stack documentation accordingly
4. Proceed with PoC implementation