
Quality-Rails Orchestration Architecture

**Version:** 1.0
**Date:** 2026-01-31
**Status:** Proposed - Proof of Concept Required
**Authors:** Jason Woltje + Claude


Executive Summary

A non-AI coordinator pattern for autonomous agent swarm orchestration with mechanical quality enforcement and intelligent context management.

Key Innovation: Separate coordination logic (deterministic code) from execution (AI agents), enabling infinite runtime, cost optimization, and guaranteed quality through mechanical gates.

Core Principles:

  1. Non-AI coordinator - No context limit, runs forever
  2. Mechanical quality gates - Lint, typecheck, test (not AI-judged)
  3. Context monitoring - Track and manage AI agent capacity
  4. Model flexibility - Assign right model for each task
  5. 50% rule - Issues never exceed 50% of agent context limit

Problem Statement

Current State: AI-Orchestrated Agents

AI Orchestrator (Opus/Sonnet)
├── Has context limit (200K tokens)
├── Context grows linearly during multi-issue work
├── At 95% usage: Pauses for confirmation (loses autonomy)
├── Manual intervention required (defeats automation)
└── Cannot work through large issue queues unattended

Result: Autonomous orchestration fails at scale

Observed behavior (M4 milestone):

  • 11 issues completed in 97 minutes
  • Agent paused at 95% context usage
  • Asked "Should I continue?" (lost autonomy)
  • 10 issues remained incomplete (48% of the 21-issue queue)
  • No compaction occurred
  • Manual restart required

Root Causes

  1. Context accumulation - No automatic compaction
  2. AI risk aversion - Conservative pause at high context
  3. Monolithic design - Coordinator has same limits as workers
  4. No capacity planning - Issues not sized for agent limits
  5. Model inflexibility - One model for all tasks (waste)

Solution: Non-AI Coordinator Architecture

System Architecture

┌─────────────────────────────────────────────────────────┐
│ Non-AI Coordinator (Python/Node.js)                     │
├─────────────────────────────────────────────────────────┤
│ • No context limit (it's just code)                     │
│ • Reads issue queue                                     │
│ • Assigns agents based on context + difficulty          │
│ • Monitors agent context usage                          │
│ • Enforces mechanical quality gates                     │
│ • Triggers compaction at threshold                      │
│ • Rotates agents when exhausted                         │
│ • Infinite runtime capability                           │
└─────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│ AI Swarm Controller (OpenClaw Session)                  │
├─────────────────────────────────────────────────────────┤
│ • Coordinates subagent work                             │
│ • Context monitored externally                          │
│ • Receives compaction commands                          │
│ • Replaceable/rotatable                                 │
│ • Just an executor (not decision-maker)                 │
└─────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│ Subagents (OpenClaw Workers)                            │
├─────────────────────────────────────────────────────────┤
│ • Execute individual issues                             │
│ • Report to swarm controller                            │
│ • Quality-gated by coordinator                          │
│ • Model-specific (Opus, Sonnet, Haiku, etc.)           │
└─────────────────────────────────────────────────────────┘

Separation of Concerns

| Concern | Non-AI Coordinator | AI Swarm Controller | Subagents |
| --- | --- | --- | --- |
| Context limit | None (immortal) | 200K tokens | 200K tokens |
| Lifespan | Entire milestone | Rotatable | Per-issue |
| Decision-making | Model assignment, quality enforcement | Work coordination | Task execution |
| Quality gates | Enforces mechanically | N/A | N/A |
| State management | Persistent | Can be rotated | Ephemeral |
| Cost | Minimal (code execution) | Per-token | Per-token |

The 50% Rule

Issue Size Constraint

Rule: Each issue must consume no more than 50% of the assigned agent's context limit.

Rationale:

Agent context limit: 200,000 tokens

Overhead consumption:
├── System prompts: 10-20K tokens
├── Project context: 20-30K tokens
├── Code reading: 20-40K tokens
├── Execution buffer: 10-20K tokens
└── Total overhead: 60-110K tokens (30-55%)

Available for issue: 90-140K tokens
Safe limit (50%): 100K tokens

This allows:
- Room for overhead
- Iteration and debugging
- Unexpected complexity
- No mid-task exhaustion

Enforcement:

  • Issue creation MUST include context estimate
  • Coordinator validates estimate before assignment
  • If estimate > 50% of target agent: Reject or decompose
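
The enforcement step is a simple arithmetic check. A minimal sketch (function name is illustrative, not from the coordinator code):

```python
def fits_50_percent_rule(estimated_context: int, agent_context_limit: int) -> bool:
    """True if the issue consumes no more than half the agent's context limit."""
    return estimated_context <= agent_context_limit * 0.5

# A 25K-token issue fits a 200K-limit agent; a 120K-token issue does not.
print(fits_50_percent_rule(25_000, 200_000))   # True
print(fits_50_percent_rule(120_000, 200_000))  # False
```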

Epic Decomposition

Large epics must be split:

Epic: Authentication System
Estimated context: 300K tokens total
Target agent: Sonnet (200K limit)
Issue size limit: 100K tokens (50% rule)

Decomposition required:
├── Issue 1: Auth middleware [20K ctx | Medium]
├── Issue 2: JWT implementation [25K ctx | Medium]
├── Issue 3: User sessions [30K ctx | Medium]
├── Issue 4: Login endpoints [25K ctx | Low]
├── Issue 5: RBAC permissions [20K ctx | Medium]
└── Total: 120K ctx across 5 issues

Each issue < 100K ✅
Epic fits within multiple agent sessions ✅

Agent Profiles

Model Capabilities Matrix

{
  "agents": {
    "opus": {
      "model": "claude-opus-4-5",
      "context_limit": 200000,
      "difficulty_levels": ["high", "medium", "low"],
      "cost_per_1k_input": 0.015,
      "cost_per_1k_output": 0.075,
      "speed": "slow",
      "use_cases": [
        "Complex refactoring",
        "Architecture design",
        "Difficult debugging",
        "Novel algorithms"
      ]
    },
    "sonnet": {
      "model": "claude-sonnet-4-5",
      "context_limit": 200000,
      "difficulty_levels": ["medium", "low"],
      "cost_per_1k_input": 0.003,
      "cost_per_1k_output": 0.015,
      "speed": "medium",
      "use_cases": ["API endpoints", "Business logic", "Standard features", "Test writing"]
    },
    "haiku": {
      "model": "claude-haiku-4",
      "context_limit": 200000,
      "difficulty_levels": ["low"],
      "cost_per_1k_input": 0.00025,
      "cost_per_1k_output": 0.00125,
      "speed": "fast",
      "use_cases": ["CRUD operations", "Config changes", "Documentation", "Simple fixes"]
    },
    "glm": {
      "model": "glm-4-plus",
      "context_limit": 128000,
      "difficulty_levels": ["medium", "low"],
      "cost_per_1k_input": 0.001,
      "cost_per_1k_output": 0.001,
      "speed": "fast",
      "use_cases": ["Standard features (lower cost)", "International projects", "High-volume tasks"]
    },
    "minimax": {
      "model": "minimax-01",
      "context_limit": 128000,
      "difficulty_levels": ["low"],
      "cost_per_1k_input": 0.0005,
      "cost_per_1k_output": 0.0005,
      "speed": "fast",
      "use_cases": ["Simple tasks (very low cost)", "Bulk operations", "Non-critical work"]
    }
  }
}

Difficulty Levels Defined

Low Difficulty:

  • CRUD operations (create, read, update, delete)
  • Configuration changes
  • Documentation updates
  • Simple bug fixes
  • UI text changes
  • Adding logging/comments

Criteria:

  • Well-established patterns
  • No complex logic
  • Minimal dependencies
  • Low risk of regressions

Medium Difficulty:

  • API endpoint implementation
  • Business logic features
  • Database schema changes
  • Integration with external services
  • Standard refactoring
  • Test suite additions

Criteria:

  • Moderate complexity
  • Some novel logic required
  • Multiple file changes
  • Moderate risk of side effects

High Difficulty:

  • Architecture changes
  • Complex algorithms
  • Performance optimization
  • Security-critical features
  • Large-scale refactoring
  • Novel problem-solving

Criteria:

  • High complexity
  • Requires deep understanding
  • Cross-cutting concerns
  • High risk of regressions

Issue Metadata Schema

Required Fields

{
  "issue": {
    "id": 123,
    "title": "Add JWT authentication [25K | Medium]",
    "description": "Implement JWT token-based authentication...",

    "metadata": {
      "estimated_context": 25000,
      "difficulty": "medium",
      "epic": "auth-system",
      "dependencies": [122],
      "quality_gates": ["lint", "typecheck", "test", "security-scan"],

      "assignment": {
        "suggested_models": ["sonnet", "opus"],
        "assigned_model": null,
        "assigned_agent_id": null
      },

      "tracking": {
        "created_at": "2026-01-31T10:00:00Z",
        "started_at": null,
        "completed_at": null,
        "actual_context_used": null,
        "duration_minutes": null
      }
    }
  }
}

Issue Title Format

Template: [Feature name] [Context estimate | Difficulty]

Examples:

✅ "Add JWT authentication [25K | Medium]"
✅ "Fix typo in README [2K | Low]"
✅ "Refactor auth system [80K | High]"
✅ "Implement rate limiting [30K | Medium]"
✅ "Add OpenAPI docs [15K | Low]"

❌ "Add authentication" (missing metadata)
❌ "Refactor auth [High]" (missing context estimate)
❌ "Fix bug [20K]" (missing difficulty)
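
Because the format is fixed, titles can be validated mechanically. A hedged sketch of a parser (regex and function name are assumptions, not part of the coordinator code):

```python
import re

# Matches "Feature name [25K | Medium]"; rejects titles missing either field.
TITLE_PATTERN = re.compile(
    r"^(?P<name>.+?)\s*\[(?P<context>\d+)K\s*\|\s*(?P<difficulty>Low|Medium|High)\]$"
)

def parse_issue_title(title: str):
    """Extract the context estimate (tokens) and difficulty from an issue title.

    Returns None if the title is missing either piece of metadata.
    """
    match = TITLE_PATTERN.match(title)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "estimated_context": int(match.group("context")) * 1000,
        "difficulty": match.group("difficulty").lower(),
    }

parse_issue_title("Add JWT authentication [25K | Medium]")
# → {'name': 'Add JWT authentication', 'estimated_context': 25000, 'difficulty': 'medium'}
parse_issue_title("Add authentication")  # → None (rejected: missing metadata)
```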

Issue Body Template

## Context Estimate

**Estimated tokens:** 25,000 (12.5% of 200K limit)

## Difficulty

**Level:** Medium

**Rationale:**

- Requires understanding JWT spec
- Integration with existing auth middleware
- Security considerations (token signing, validation)
- Test coverage for auth flows

## Suggested Models

- Primary: Sonnet (cost-effective for medium difficulty)
- Fallback: Opus (if complexity increases)

## Dependencies

- #122 (Auth middleware must be complete first)

## Quality Gates

- [x] Lint (ESLint + Prettier)
- [x] Typecheck (TypeScript strict mode)
- [x] Tests (Unit + Integration, 80%+ coverage)
- [x] Security scan (No hardcoded secrets, safe crypto)

## Task Description

[Detailed description of work to be done...]

## Acceptance Criteria

- [ ] JWT tokens generated on login
- [ ] Tokens validated on protected routes
- [ ] Token refresh mechanism implemented
- [ ] Tests cover happy path + edge cases
- [ ] Documentation updated

## Context Breakdown

| Activity                          | Estimated Tokens |
| --------------------------------- | ---------------- |
| Read existing auth code           | 5,000            |
| Implement JWT library integration | 8,000            |
| Write middleware logic            | 6,000            |
| Add tests                         | 4,000            |
| Update documentation              | 2,000            |
| **Total**                         | **25,000**       |

Context Estimation Guidelines

Estimation Formula

Estimated Context = (
    Files to read × 5-10K per file +
    Implementation complexity × 10-30K +
    Test writing × 5-15K +
    Documentation × 2-5K +
    Buffer for iteration × 20-50%
)
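
The formula translates directly into code. A sketch, with the per-file figure and buffer treated as tunable mid-range defaults rather than fixed constants:

```python
def estimate_context(files_to_read: int, implementation: int, tests: int,
                     docs: int, buffer_pct: float = 0.3,
                     tokens_per_file: int = 8_000) -> int:
    """Estimate issue context in tokens.

    tokens_per_file and buffer_pct default to mid-range values (5-10K per
    file, 20-50% buffer); calibrate them against logged actuals over time.
    """
    base = files_to_read * tokens_per_file + implementation + tests + docs
    return round(base * (1 + buffer_pct))

# Medium example below: 3 files, 15K impl, 10K tests, 3K docs, 30% buffer
estimate_context(3, 15_000, 10_000, 3_000)  # → 67600
```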

Examples

Simple (Low Difficulty):

Task: Fix typo in README.md

Files to read: 1 × 5K = 5K
Implementation: Minimal = 1K
Tests: None = 0K
Docs: None = 0K
Buffer: 20% = 1.2K
Total: ~7K tokens

Rounded estimate: 10K tokens (conservative)

Medium (Medium Difficulty):

Task: Add API endpoint for user profile

Files to read: 3 × 8K = 24K
Implementation: Standard endpoint = 15K
Tests: Unit + integration = 10K
Docs: API spec update = 3K
Buffer: 30% = 15.6K
Total: ~67.6K tokens

Rounded estimate: 70K tokens

Complex (High Difficulty):

Task: Refactor authentication system

Files to read: 8 × 10K = 80K
Implementation: Complex refactor = 30K
Tests: Extensive = 15K
Docs: Architecture guide = 5K
Buffer: 50% = 65K
Total: ~195K tokens

⚠️ Exceeds 50% rule (100K limit)!
Action: Split into 2-3 smaller issues

Estimation Accuracy Tracking

After each issue, measure variance:

variance = actual_context - estimated_context
variance_pct = (variance / estimated_context) * 100

# Log for calibration (flag both over- and under-estimates)
if abs(variance_pct) > 20:
    print(f"⚠️ Estimate off by {variance_pct:.0f}%")
    print(f"Estimated: {estimated_context}")
    print(f"Actual: {actual_context}")
    print("Review estimation guidelines")

Over time, refine estimation formula based on historical data.


Coordinator Implementation

Core Algorithm

import json
from datetime import datetime


class QualityRailsCoordinator:
    """Non-AI coordinator for agent swarm orchestration."""

    def __init__(self, issue_queue, agent_profiles, quality_gates):
        self.issues = issue_queue
        self.agents = agent_profiles
        self.gates = quality_gates
        self.current_controller = None

    def run(self):
        """Main orchestration loop."""

        # Validate all issues
        self.validate_issues()

        # Sort by dependencies and priority
        self.issues = self.topological_sort(self.issues)

        # Start AI swarm controller
        self.start_swarm_controller()

        # Process queue
        for issue in self.issues:
            print(f"\n{'='*60}")
            print(f"Starting issue #{issue['id']}: {issue['title']}")
            print(f"{'='*60}\n")

            # Assign optimal agent
            agent = self.assign_agent(issue)

            # Monitor and execute
            self.execute_issue(issue, agent)

            # Log metrics
            self.log_metrics(issue, agent)

        print("\n✅ All issues complete. Queue empty.")

    def validate_issues(self):
        """Ensure all issues have required metadata."""
        for issue in self.issues:
            if not issue.get("estimated_context"):
                raise ValueError(
                    f"Issue {issue['id']} missing context estimate"
                )

            if not issue.get("difficulty"):
                raise ValueError(
                    f"Issue {issue['id']} missing difficulty rating"
                )

            # Validate 50% rule
            max_context = max(
                agent["context_limit"]
                for agent in self.agents.values()
            )

            if issue["estimated_context"] > (max_context * 0.5):
                raise ValueError(
                    f"Issue {issue['id']} exceeds 50% rule: "
                    f"{issue['estimated_context']} > {max_context * 0.5}"
                )

    def assign_agent(self, issue):
        """Assign optimal agent based on context + difficulty."""
        context_est = issue["estimated_context"]
        difficulty = issue["difficulty"]

        # Filter models that can handle this issue
        candidates = []

        for model_name, profile in self.agents.items():
            # Check context capacity (50% rule)
            if context_est <= (profile["context_limit"] * 0.5):
                # Check difficulty match
                if difficulty in profile["difficulty_levels"]:
                    # Calculate cost
                    cost = (
                        context_est * profile["cost_per_1k_input"] / 1000
                    )

                    candidates.append({
                        "model": model_name,
                        "profile": profile,
                        "cost": cost
                    })

        if not candidates:
            raise ValueError(
                f"No model can handle issue {issue['id']}: "
                f"{context_est} tokens, {difficulty} difficulty"
            )

        # Optimize for cost (prefer cheaper models when capable)
        candidates.sort(key=lambda x: x["cost"])
        selected = candidates[0]

        print(f"📋 Assigned {selected['model']} to issue {issue['id']}")
        print(f"   Context: {context_est:,} tokens")
        print(f"   Difficulty: {difficulty}")
        print(f"   Estimated cost: ${selected['cost']:.4f}")

        return selected

    def execute_issue(self, issue, agent):
        """Execute issue with assigned agent."""

        # Start agent session
        session = self.start_agent_session(agent["profile"])

        # Track context
        session_context = 0
        context_limit = agent["profile"]["context_limit"]

        # Execution loop
        iteration = 0
        while not issue.get("complete"):
            iteration += 1

            # Check context health
            if session_context > (context_limit * 0.80):
                print(f"⚠️ Context at 80% ({session_context}/{context_limit})")
                print("   Triggering compaction...")
                session_context = self.compact_session(session)
                print(f"   ✓ Compacted to {session_context} tokens")

            if session_context > (context_limit * 0.95):
                print(f"🔄 Context at 95% - rotating agent session")
                state = session.save_state()
                session.terminate()
                session = self.start_agent_session(agent["profile"])
                session.load_state(state)
                session_context = session.current_context()

            # Agent executes step
            print(f"   Iteration {iteration}...")
            result = session.execute_step(issue)

            # Update context tracking
            session_context += result["context_used"]

            # Check if agent claims completion
            if result.get("claims_complete"):
                print("   Agent claims completion. Running quality gates...")

                # Enforce quality gates
                gate_results = self.gates.validate(result)

                if gate_results["passed"]:
                    print("   ✅ All quality gates passed")
                    issue["complete"] = True
                    issue["actual_context_used"] = session_context
                else:
                    print("   ❌ Quality gates failed:")
                    for gate, errors in gate_results["failures"].items():
                        print(f"      {gate}: {errors}")

                    # Send feedback to agent
                    session.send_feedback(gate_results["failures"])

        # Clean up
        session.terminate()

    def start_swarm_controller(self):
        """Start AI swarm controller (OpenClaw session)."""
        # Initialize OpenClaw swarm controller
        # This coordinates subagents but is managed by this coordinator
        pass

    def start_agent_session(self, agent_profile):
        """Start individual agent session."""
        # Start agent with specific model
        # Return session handle
        pass

    def compact_session(self, session):
        """Trigger compaction in agent session."""
        summary = session.send_message(
            "Summarize all completed work concisely. "
            "Keep only essential context for continuation."
        )

        session.reset_history_with_summary(summary)

        return session.current_context()

    def topological_sort(self, issues):
        """Sort issues by dependencies."""
        # Implement dependency graph sorting
        # Ensures dependencies complete before dependents
        pass

    def log_metrics(self, issue, agent):
        """Log issue completion metrics."""
        metrics = {
            "issue_id": issue["id"],
            "title": issue["title"],
            "estimated_context": issue["estimated_context"],
            "actual_context": issue.get("actual_context_used"),
            "variance": (
                issue.get("actual_context_used", 0) -
                issue["estimated_context"]
            ),
            "model": agent["model"],
            "difficulty": issue["difficulty"],
            "timestamp": datetime.now().isoformat()
        }

        # Write to metrics file
        with open("orchestrator-metrics.jsonl", "a") as f:
            f.write(json.dumps(metrics) + "\n")
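
The `topological_sort` stub could be filled in with Kahn's algorithm over each issue's `dependencies` field. A minimal sketch (assumes each issue dict carries `id` and `dependencies`, as in the metadata schema):

```python
from collections import deque

def topological_sort(issues):
    """Order issues so every dependency precedes its dependents (Kahn's algorithm)."""
    by_id = {issue["id"]: issue for issue in issues}
    indegree = {issue["id"]: 0 for issue in issues}
    dependents = {issue["id"]: [] for issue in issues}

    for issue in issues:
        for dep in issue.get("dependencies", []):
            if dep in by_id:  # ignore dependencies outside this queue
                indegree[issue["id"]] += 1
                dependents[dep].append(issue["id"])

    ready = deque(iid for iid, deg in indegree.items() if deg == 0)
    ordered = []
    while ready:
        iid = ready.popleft()
        ordered.append(by_id[iid])
        for dependent in dependents[iid]:
            indegree[dependent] -= 1
            if indegree[dependent] == 0:
                ready.append(dependent)

    if len(ordered) != len(issues):
        raise ValueError("Dependency cycle detected in issue queue")
    return ordered
```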

Quality Gates Implementation

class QualityGates:
    """Mechanical quality enforcement."""

    def validate(self, result):
        """Run all quality gates."""

        gates = {
            "lint": self.run_lint,
            "typecheck": self.run_typecheck,
            "test": self.run_tests,
            "security": self.run_security_scan
        }

        failures = {}

        for gate_name, gate_fn in gates.items():
            gate_result = gate_fn(result)

            if not gate_result["passed"]:
                failures[gate_name] = gate_result["errors"]

        return {
            "passed": len(failures) == 0,
            "failures": failures
        }

    def run_lint(self, result):
        """Run linting (ESLint, Prettier, etc.)."""
        # Execute: pnpm turbo run lint
        # Parse output
        # Return pass/fail + errors
        pass

    def run_typecheck(self, result):
        """Run TypeScript type checking."""
        # Execute: pnpm turbo run typecheck
        # Parse output
        # Return pass/fail + errors
        pass

    def run_tests(self, result):
        """Run test suite."""
        # Execute: pnpm turbo run test
        # Check coverage threshold
        # Return pass/fail + errors
        pass

    def run_security_scan(self, result):
        """Run security checks."""
        # Execute: detect-secrets scan
        # Check for vulnerabilities
        # Return pass/fail + errors
        pass
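
Each gate stub reduces to running a shell command and checking its exit code. A hedged sketch of the shared mechanics (the command and output parsing are assumptions about the target repo, not a fixed interface):

```python
import subprocess

def run_command_gate(command: list[str]) -> dict:
    """Run a gate command; pass/fail is the exit code, errors are its output."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return {
        "passed": proc.returncode == 0,
        # Keep only the tail of the output as actionable feedback for the agent
        "errors": [] if proc.returncode == 0 else proc.stdout.splitlines()[-20:],
    }

# e.g. run_lint could delegate to:
# run_command_gate(["pnpm", "turbo", "run", "lint"])
```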

Issue Creation Process

Workflow

1. Epic Planning Agent
   ├── Receives epic description
   ├── Estimates total context required
   ├── Checks against agent limits
   └── Decomposes into issues if needed

2. Issue Creation
   ├── For each sub-issue:
   │   ├── Estimate context (formula + buffer)
   │   ├── Assign difficulty level
   │   ├── Validate 50% rule
   │   └── Create issue with metadata

3. Validation
   ├── Coordinator validates all issues
   ├── Checks for missing metadata
   └── Rejects oversized issues

4. Execution
   ├── Coordinator assigns agents
   ├── Monitors context usage
   ├── Enforces quality gates
   └── Logs metrics for calibration

Epic Planning Agent Prompt

You are an Epic Planning Agent. Your job is to decompose epics into
properly-sized issues for autonomous execution.

## Guidelines

1. **Estimate total context:**
   - Read all related code files
   - Estimate implementation complexity
   - Account for tests and documentation
   - Add 30% buffer for iteration

2. **Apply 50% rule:**
   - Target agent context limit: 200K tokens
   - Maximum issue size: 100K tokens
   - If epic exceeds 100K: Split into multiple issues

3. **Assign difficulty:**
   - Low: CRUD, config, docs, simple fixes
   - Medium: APIs, business logic, integrations
   - High: Architecture, complex algorithms, refactors

4. **Create issues with metadata:**
   ```json
   {
     "title": "[Feature] [Context | Difficulty]",
     "estimated_context": 25000,
     "difficulty": "medium",
     "epic": "epic-name",
     "dependencies": [],
     "quality_gates": ["lint", "typecheck", "test"]
   }
   ```
5. **Validate:**
   - Each issue < 100K tokens ✓
   - Dependencies are explicit ✓
   - Difficulty matches complexity ✓
   - Quality gates defined ✓

Output Format

Create a JSON array of issues:

[
  {
    "id": 1,
    "title": "Add auth middleware [20K | Medium]",
    "estimated_context": 20000,
    "difficulty": "medium",
    ...
  },
  ...
]

---

## Proof of Concept Plan

### PoC Goals

1. **Validate non-AI coordinator pattern** - Prove it can manage agent lifecycle
2. **Test context monitoring** - Verify we can track and react to context usage
3. **Validate quality gates** - Ensure mechanical enforcement works
4. **Test agent assignment** - Confirm model selection logic
5. **Measure metrics** - Collect data on estimate accuracy

### PoC Scope

**Small test project:**
- 5-10 simple issues
- Mix of difficulty levels
- Use Haiku + Sonnet (cheap)
- Real quality gates (lint, typecheck, test)

**What we'll build:**

```
poc/
├── coordinator.py        # Non-AI coordinator
├── agent_profiles.json   # Model capabilities
├── issues.json           # Test issue queue
├── quality_gates.py      # Mechanical gates
└── metrics.jsonl         # Results log
```


**Test cases:**
1. Low difficulty issue → Haiku (cheap, fast)
2. Medium difficulty issue → Sonnet (balanced)
3. Oversized issue → Should reject (50% rule)
4. Issue with failed quality gate → Agent retries
5. High context issue → Triggers compaction

### PoC Success Criteria

- [ ] Coordinator completes all issues without human intervention
- [ ] Quality gates enforce standards (at least 1 failure caught + fixed)
- [ ] Context monitoring works (log shows tracking)
- [ ] Agent assignment is optimal (cheapest capable model chosen)
- [ ] Metrics collected for all issues
- [ ] No agent exhaustion (50% rule enforced)

### PoC Timeline

**Week 1: Foundation**
- [ ] Build coordinator skeleton
- [ ] Implement agent profiles
- [ ] Create test issue queue
- [ ] Set up quality gates

**Week 2: Integration**
- [ ] Connect to Claude API
- [ ] Implement context monitoring
- [ ] Test agent lifecycle
- [ ] Validate quality gates

**Week 3: Testing**
- [ ] Run full PoC
- [ ] Collect metrics
- [ ] Analyze results
- [ ] Document findings

**Week 4: Refinement**
- [ ] Fix issues discovered
- [ ] Optimize assignment logic
- [ ] Update documentation
- [ ] Prepare for production

---

## Production Deployment (Post-PoC)

### Integration with Mosaic Stack

**Phase 1: Core Implementation**
- Implement coordinator in Mosaic Stack codebase
- Add agent profiles to configuration
- Integrate with existing OpenClaw infrastructure
- Add quality gates to CI/CD

**Phase 2: Issue Management**
- Update issue templates with metadata fields
- Train team on estimation guidelines
- Build issue validation tools
- Create epic planning workflows

**Phase 3: Monitoring**
- Add coordinator metrics dashboard
- Track estimate accuracy over time
- Monitor cost optimization
- Alert on failures

**Phase 4: Scale**
- Expand to all milestones
- Add more agent types (GLM, MiniMax)
- Optimize for multi-epic orchestration
- Build self-learning estimation

---

## Open Questions (To Resolve in PoC)

1. **Compaction effectiveness:** How much context does summarization actually free?
2. **Estimation accuracy:** How close are initial estimates to reality?
3. **Model selection:** Is cost-optimized assignment actually optimal, or should we prioritize speed/quality?
4. **Quality gate timing:** Should gates run after each commit, or only at issue completion?
5. **Session rotation overhead:** What's the cost of rotating agents vs compaction?
6. **Dependency handling:** How to ensure dependencies are truly complete before starting dependent issues?

---

## Success Metrics

### PoC Metrics

- **Autonomy:** % of issues completed without human intervention
- **Quality:** % of commits passing all quality gates on first try
- **Cost:** Total cost vs baseline (all-Opus)
- **Accuracy:** Context estimate variance (target: <20%)
- **Efficiency:** Issues per hour

### Production Metrics

- **Throughput:** Issues completed per day
- **Quality rate:** % passing all gates first try
- **Context efficiency:** Avg context used vs estimated
- **Cost savings:** % saved vs all-Opus baseline
- **Agent utilization:** % of time agents are productive (not waiting)

---

## Appendix: Agent Skill Definitions

### Agent Skills Schema

```json
{
  "skills": {
    "backend-api": {
      "description": "Build RESTful APIs and endpoints",
      "difficulty": "medium",
      "typical_context": "20-40K",
      "quality_gates": ["lint", "typecheck", "test", "api-spec"]
    },
    "frontend-ui": {
      "description": "Build UI components and pages",
      "difficulty": "medium",
      "typical_context": "15-35K",
      "quality_gates": ["lint", "typecheck", "test", "a11y"]
    },
    "database-schema": {
      "description": "Design and migrate database schemas",
      "difficulty": "high",
      "typical_context": "30-50K",
      "quality_gates": ["typecheck", "test", "migration-validate"]
    },
    "documentation": {
      "description": "Write technical documentation",
      "difficulty": "low",
      "typical_context": "5-15K",
      "quality_gates": ["spelling", "markdown-lint"]
    },
    "refactoring": {
      "description": "Refactor existing code",
      "difficulty": "high",
      "typical_context": "40-80K",
      "quality_gates": ["lint", "typecheck", "test", "no-behavior-change"]
    },
    "bug-fix": {
      "description": "Fix reported bugs",
      "difficulty": "low-medium",
      "typical_context": "10-30K",
      "quality_gates": ["lint", "typecheck", "test", "regression-test"]
    }
  }
}
```

Usage:

  • Issues can reference skills: "skills": ["backend-api", "database-schema"]
  • Coordinator uses skill metadata to inform estimates
  • Helps with consistent difficulty assignment
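
Skill metadata can seed a first-pass context estimate before a human refines it. A sketch, with the `typical_context` ranges copied from the schema above (the summing heuristic itself is an assumption):

```python
# typical_context ranges (tokens) from the skills schema above
SKILL_CONTEXT = {
    "backend-api": (20_000, 40_000),
    "frontend-ui": (15_000, 35_000),
    "database-schema": (30_000, 50_000),
    "documentation": (5_000, 15_000),
    "refactoring": (40_000, 80_000),
    "bug-fix": (10_000, 30_000),
}

def seed_estimate(skills):
    """Sum the typical range of each referenced skill into a starting estimate."""
    low = sum(SKILL_CONTEXT[s][0] for s in skills)
    high = sum(SKILL_CONTEXT[s][1] for s in skills)
    return low, high

seed_estimate(["backend-api", "database-schema"])  # → (50000, 90000)
```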

Document Status

**Version:** 1.0 - Proposed Architecture
**Next Steps:** Build Proof of Concept
**Approval Required:** After successful PoC


End of Architecture Document