stack/docs/scratchpads/orch-116-fifty-percent.md
Jason Woltje 12abdfe81d feat(#93): implement agent spawn via federation
2026-02-03 14:37:06 -06:00


Issue ORCH-116: 50% Rule Enforcement

Objective

Enforce the 50% rule: no more than 50% of the code in a PR may be AI-generated. The orchestrator does this by calling both the mechanical gates (typecheck, lint, tests, coverage) and the AI confirmation gate (review by an independent AI agent).

Approach

Following TDD principles:

  1. RED: Write tests first for enhanced quality-gates.service.ts
  2. GREEN: Implement minimal code to pass tests
  3. REFACTOR: Clean up and optimize

Key Requirements (from M6-NEW-ISSUES-TEMPLATES.md)

  • Mechanical gates: typecheck, lint, tests, coverage (coordinator)
  • AI confirmation: independent AI agent reviews (coordinator)
  • Orchestrator calls both mechanical and AI gates
  • Reject if either fails
  • Return detailed failure reasons

Design

The coordinator enforces the 50% rule. The orchestrator's role is to:

  1. Call coordinator quality gates (which now include AI review)
  2. Handle the response appropriately
  3. Return detailed failure reasons to the caller

Key Insight: ORCH-114 already implements quality gate callbacks. ORCH-116 is about ensuring the coordinator's quality gates include AI review, and that the orchestrator properly handles those AI review results.

Implementation Strategy:

Since the coordinator is responsible for running the AI review (as per the technical notes), and the orchestrator already calls the coordinator via checkQuality(), the main work for ORCH-116 is to:

  1. Ensure the QualityGatesService properly handles AI review results in the coordinator response
  2. Add specific tests for AI confirmation scenarios
  3. Enhance logging and error messages to distinguish between mechanical and AI gate failures
  4. Add a method to check if the coordinator's response includes AI confirmation

Enhanced QualityGatesService:

// Design sketch: signatures only, bodies omitted
declare class QualityGatesService {
  // Existing methods
  preCommitCheck(params): Promise<QualityGateResult>;
  postCommitCheck(params): Promise<QualityGateResult>;

  // New helper method (public in the final implementation)
  hasAIConfirmation(result: QualityGateResult): boolean;

  // Enhanced response handling (already exists)
  private mapResponse(response): QualityGateResult;
}

Quality Gate Flow:

  1. Pre-commit: Mechanical gates only (fast)
  2. Post-commit: Mechanical gates + AI confirmation (comprehensive)
  3. AI confirmation is independent agent review (not self-review)
  4. Reject if ANY gate fails (mechanical OR AI)

Progress

  • Read ORCH-116 requirements
  • Review existing ORCH-114 implementation
  • Design enhancement strategy
  • Write tests for AI confirmation scenarios (RED)
  • Implement AI confirmation handling (GREEN)
  • Refactor and optimize (REFACTOR)
  • Verify test coverage (93.33% branch, 100% line)
  • Update scratchpad with results
  • Create/close Gitea issue

Testing Strategy

New Test Scenarios for ORCH-116

  1. AI confirmation passes: Post-commit with AI review approved
  2. AI confirmation fails: Post-commit with AI review rejected (confidence < 0.9)
  3. Mechanical pass, AI fails: Mechanical gates pass but AI rejects
  4. Mechanical fail, AI skipped: Mechanical gates fail, so the AI review is not run
  5. Both pass: Full approval with both mechanical and AI
  6. 50% rule violation: AI detects >50% AI-generated code
  7. AI review details: Parse and return AI confidence scores and findings

Test Coverage Target

  • Minimum 85% coverage (existing: 91.66% branch, 100% line)
  • All new AI confirmation scenarios covered
  • Error handling for AI review failures

Notes

Coordinator Responsibility

The coordinator (apps/coordinator) is responsible for:

  • Running mechanical gates (typecheck, lint, tests, coverage)
  • Spawning independent AI reviewer agent
  • Enforcing 50% rule through AI review
  • Combining mechanical and AI results
  • Returning comprehensive QualityCheckResponse

The orchestrator (apps/orchestrator) is responsible for:

  • Calling coordinator's quality gates
  • Handling the combined response
  • Blocking commit/push based on coordinator decision
  • Returning detailed failure reasons to agents

50% Rule Mechanics

The 50% rule means:

  • AI-generated code should be ≤50% of the PR
  • Independent AI agent reviews the changes
  • Checks for: excessive AI generation, quality issues, security problems
  • Confidence threshold: ≥0.9 to approve
  • Rejection reasons include AI confidence score and findings
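
The two thresholds above can be sketched as a small decision function. This is only an illustration of the rule as described in these notes, not the coordinator's actual code: the names evaluateAiReview, AiReviewInput and AiReviewVerdict are hypothetical, and treating a missing aiGeneratedPercent as "no violation" is an assumption.

```typescript
// Thresholds taken from the notes above; all names are hypothetical.
const MIN_CONFIDENCE = 0.9;
const MAX_AI_GENERATED_PERCENT = 50;

interface AiReviewInput {
  confidence: number; // 0.0 - 1.0
  aiGeneratedPercent?: number; // estimated share of AI-generated code
  findings?: string[]; // issues reported by the reviewer
}

interface AiReviewVerdict {
  approved: boolean;
  reasons: string[];
}

function evaluateAiReview(review: AiReviewInput): AiReviewVerdict {
  const reasons: string[] = [];

  if (review.confidence < MIN_CONFIDENCE) {
    reasons.push(
      `AI confidence ${review.confidence} is below ${MIN_CONFIDENCE}`,
    );
  }

  // Assumption: a missing estimate is not treated as a violation.
  if (
    review.aiGeneratedPercent !== undefined &&
    review.aiGeneratedPercent > MAX_AI_GENERATED_PERCENT
  ) {
    reasons.push(
      `AI-generated code is ${review.aiGeneratedPercent}%, above the ${MAX_AI_GENERATED_PERCENT}% limit`,
    );
  }

  const approved = reasons.length === 0;
  // Reviewer findings are attached only when the review is rejected.
  return {
    approved,
    reasons: approved ? [] : [...reasons, ...(review.findings ?? [])],
  };
}
```

With this sketch, a review at exactly 50% AI-generated code and confidence 0.95 is approved, while 65% or a confidence of 0.88 is rejected.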

AI Confirmation in Response

The coordinator's QualityCheckResponse includes:

{
  approved: boolean,
  gate: string,
  message?: string,
  details?: {
    // Mechanical gate results
    typecheck?: string,
    lint?: string,
    tests?: string,
    coverage?: { current: number, required: number },

    // AI confirmation results
    aiReview?: {
      confidence: number,      // 0.0 - 1.0
      approved: boolean,       // true if confidence >= 0.9
      findings?: string[],     // Issues found by AI
      aiGeneratedPercent?: number  // Estimated % of AI-generated code
    }
  }
}
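
As a rough sketch, the shape above could be typed and checked for AI confirmation like this. The interface names AIReview and QualityCheckResponse follow the notes, but the exact field types and the standalone hasAIConfirmation function are assumptions; the real helper is a method on QualityGatesService and may check different things.

```typescript
// Hypothetical typing of the response shape described above.
interface AIReview {
  confidence: number; // 0.0 - 1.0
  approved: boolean;
  findings?: string[];
  aiGeneratedPercent?: number;
}

interface QualityCheckResponse {
  approved: boolean;
  gate: string;
  message?: string;
  details?: {
    typecheck?: string;
    lint?: string;
    tests?: string;
    coverage?: { current: number; required: number };
    aiReview?: AIReview;
  };
}

// Assumption: "has AI confirmation" means an aiReview object with a
// numeric confidence is present, regardless of whether it approved.
function hasAIConfirmation(result: QualityCheckResponse): boolean {
  const review = result.details?.aiReview;
  return review !== undefined && typeof review.confidence === "number";
}
```

A pre-commit response without details.aiReview returns false here, which lets a caller distinguish "AI review rejected" from "AI review never ran".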

Blockers

None - ORCH-114 is complete and provides the foundation. Related issues:

  • ORCH-114: Quality gate callbacks (complete) - Foundation
  • ORCH-113: Coordinator API client (complete)
  • ORCH-122: AI agent confirmation (coordinator implementation)

Implementation Summary

Phase 1: RED - Write Tests First

Will add tests for:

  1. AI confirmation in post-commit responses
  2. AI rejection scenarios (low confidence, >50% AI-generated)
  3. Combined mechanical + AI failures
  4. AI confirmation details parsing
  5. 50% rule violation detection

Phase 2: GREEN - Minimal Implementation

Will implement:

  1. Enhanced response parsing for AI review fields
  2. Helper method to check AI confirmation presence
  3. Enhanced logging for AI review results
  4. Proper error messages distinguishing mechanical vs AI failures

Phase 3: REFACTOR - Optimize

Will refine:

  1. Code organization and clarity
  2. Error message quality
  3. Documentation and comments
  4. Test coverage verification (≥85%)

Implementation Complete

Summary

ORCH-116 has been successfully implemented. The orchestrator now properly handles the 50% rule enforcement by:

  1. Calling coordinator quality gates that include both mechanical and AI review
  2. Handling AI confirmation results in the response
  3. Rejecting when either mechanical OR AI gates fail
  4. Returning detailed failure reasons including AI confidence scores and findings

Key Implementation Details

Architecture Decision: The coordinator is responsible for enforcing the 50% rule through its AI review feature. The orchestrator's role is to call the coordinator and properly handle the combined response.

What Changed:

  1. Added comprehensive tests for 50% rule scenarios (9 new test cases)
  2. Added hasAIConfirmation() helper method to check for AI review presence
  3. Enhanced documentation in service comments to explain 50% rule enforcement
  4. All tests passing (36 total tests)
  5. Coverage: 93.33% branch, 100% line (exceeds 85% requirement)

What Didn't Need to Change:

  • The existing preCommitCheck() and postCommitCheck() methods already handle AI review properly
  • The mapResponse() method already preserves all coordinator response fields including aiReview
  • Error handling and logging already work correctly for AI failures

Test Scenarios Added for ORCH-116

  1. AI confirmation passes with mechanical gates (45% AI-generated)
  2. AI confidence below threshold (< 0.9) - rejected
  3. 50% rule violated (65% AI-generated) - rejected
  4. Mechanical pass but AI fails - rejected
  5. Mechanical fail, AI not checked - rejected early
  6. AI review with security findings - rejected
  7. Exactly 50% AI-generated - approved
  8. AI review unavailable fallback - coordinator decides
  9. Preserve all AI review metadata for debugging

Files Modified

  1. quality-gates.service.spec.ts (+240 lines)
    • Added 9 comprehensive test cases for 50% rule enforcement
    • Added 5 test cases for hasAIConfirmation() helper method
    • Total: 36 tests (was 22), all passing
  2. quality-gates.service.ts (+20 lines)
    • Added hasAIConfirmation() public helper method
    • Enhanced documentation in mapResponse() to explain 50% rule
    • No changes to core logic - already handles AI review properly

Quality Gates Flow (Post-Implementation)

Pre-commit (Fast):

  1. Orchestrator calls coordinator with files/diff
  2. Coordinator runs: typecheck, lint, unit tests
  3. Returns approved/rejected
  4. Orchestrator blocks commit if rejected

Post-commit (Comprehensive + AI):

  1. Orchestrator calls coordinator with files/diff
  2. Coordinator runs mechanical gates first
  3. If mechanical pass, coordinator spawns independent AI reviewer
  4. AI reviewer checks:
    • Code quality
    • Security vulnerabilities
    • AI-generated percentage (50% rule)
    • Logic errors
  5. Coordinator combines mechanical + AI results
  6. Returns approved (both pass) or rejected (either fails)
  7. Orchestrator blocks push if rejected
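
The post-commit steps above amount to a short-circuit: the AI reviewer is only spawned once every mechanical gate has passed. A minimal sketch of that ordering follows, with the gate runners injected as functions. runPostCommitGates, MechanicalResults, AiVerdict and GateOutcome are hypothetical names, not the coordinator's API.

```typescript
// Hypothetical types; the coordinator's real interfaces may differ.
type MechanicalResults = Record<
  "typecheck" | "lint" | "tests" | "coverage",
  boolean
>;

interface AiVerdict {
  approved: boolean;
  confidence: number;
  findings: string[];
}

interface GateOutcome {
  approved: boolean;
  failedGates: string[];
  aiReviewed: boolean; // false when the AI review was skipped
  findings: string[];
}

async function runPostCommitGates(
  runMechanical: () => Promise<MechanicalResults>,
  runAiReview: () => Promise<AiVerdict>,
): Promise<GateOutcome> {
  // Mechanical gates run first.
  const mechanical = await runMechanical();
  const failedGates = Object.entries(mechanical)
    .filter(([, passed]) => !passed)
    .map(([name]) => name);

  // Reject early: the AI reviewer is never spawned on mechanical failure.
  if (failedGates.length > 0) {
    return { approved: false, failedGates, aiReviewed: false, findings: [] };
  }

  // All mechanical gates passed, so the independent AI review decides.
  const ai = await runAiReview();
  return {
    approved: ai.approved,
    failedGates: ai.approved ? [] : ["aiReview"],
    aiReviewed: true,
    findings: ai.findings,
  };
}
```

Injecting the two runners keeps the ordering logic testable without spawning a real reviewer agent.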

50% Rule Enforcement Details

How it Works:

  • Independent AI agent analyzes the PR diff
  • Estimates percentage of AI-generated code
  • Checks for quality, security, and logic issues
  • Returns confidence score (0.0 - 1.0)
  • Approval threshold: confidence >= 0.9
  • 50% threshold: aiGeneratedPercent <= 50

Response Structure:

{
  approved: false,
  gate: "post-commit",
  message: "50% rule violated: excessive AI-generated code detected",
  details: {
    // Mechanical results
    typecheck: "passed",
    lint: "passed",
    tests: "passed",
    coverage: { current: 90, required: 85 },

    // AI confirmation
    aiReview: {
      confidence: 0.88,
      approved: false,
      aiGeneratedPercent: 65,
      findings: [
        "Detected 65% AI-generated code in PR",
        "Exceeds 50% threshold for AI-generated content"
      ]
    }
  }
}
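
To show how a caller might turn such a response into failure reasons that separate mechanical from AI gate failures, here is a hedged sketch. describeFailures and the [mechanical] / [ai] prefixes are hypothetical, and it assumes a gate status string other than "passed" means the gate failed, which the notes do not state explicitly.

```typescript
// Hypothetical subset of the response details shown above.
interface GateDetails {
  typecheck?: string;
  lint?: string;
  tests?: string;
  coverage?: { current: number; required: number };
  aiReview?: {
    confidence: number;
    approved: boolean;
    findings?: string[];
    aiGeneratedPercent?: number;
  };
}

function describeFailures(details: GateDetails): string[] {
  const reasons: string[] = [];

  // Assumption: any status other than "passed" is a failure.
  for (const gate of ["typecheck", "lint", "tests"] as const) {
    const status = details[gate];
    if (status !== undefined && status !== "passed") {
      reasons.push(`[mechanical] ${gate}: ${status}`);
    }
  }

  const coverage = details.coverage;
  if (coverage && coverage.current < coverage.required) {
    reasons.push(
      `[mechanical] coverage: ${coverage.current}% < ${coverage.required}%`,
    );
  }

  const review = details.aiReview;
  if (review && !review.approved) {
    reasons.push(`[ai] review rejected (confidence ${review.confidence})`);
    for (const finding of review.findings ?? []) {
      reasons.push(`[ai] ${finding}`);
    }
  }

  return reasons;
}
```

Applied to the details object in the example response above, this yields three [ai] reasons and no [mechanical] ones.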

Test Coverage

Final Coverage:

  • Statements: 100%
  • Branches: 93.33% (exceeds 85% requirement)
  • Functions: 100%
  • Lines: 100%

36 Test Cases Total:

  • Pre-commit scenarios: 6 tests
  • Post-commit scenarios: 5 tests
  • 50% rule enforcement: 9 tests (NEW for ORCH-116)
  • Error handling: 6 tests
  • Response parsing: 5 tests
  • hasAIConfirmation helper: 5 tests (NEW for ORCH-116)

Integration Points

Coordinator (apps/coordinator):

  • Implements mechanical gates (typecheck, lint, tests, coverage)
  • Spawns independent AI reviewer agent
  • Enforces 50% rule through AI review
  • Combines results and returns QualityCheckResponse

Orchestrator (apps/orchestrator):

  • Calls coordinator before commit/push
  • Handles combined mechanical + AI response
  • Blocks operations if rejected
  • Returns detailed failure reasons to agent

Agent Workflow:

  1. Agent makes code changes
  2. Agent calls orchestrator pre-commit check
  3. Orchestrator → Coordinator (mechanical gates)
  4. If rejected: Agent fixes issues, repeats
  5. If approved: Agent commits
  6. Agent calls orchestrator post-commit check
  7. Orchestrator → Coordinator (mechanical + AI gates)
  8. If rejected: Agent addresses concerns, repeats
  9. If approved: Agent pushes
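
The nine steps above can be sketched as two check-and-fix loops around commit and push. Everything here is illustrative: runAgentWorkflow, WorkflowSteps and untilApproved are hypothetical names, and the maxAttempts cap is an assumption, since the notes do not say how many times an agent may retry.

```typescript
interface GateResult {
  approved: boolean;
  message?: string;
}

// Hypothetical hooks standing in for the agent's real actions.
interface WorkflowSteps {
  preCommitCheck: () => Promise<GateResult>;
  postCommitCheck: () => Promise<GateResult>;
  fix: (reason: string) => Promise<void>;
  commit: () => Promise<void>;
  push: () => Promise<void>;
}

// Repeat a check, fixing between attempts, until approved or out of tries.
async function untilApproved(
  check: () => Promise<GateResult>,
  fix: (reason: string) => Promise<void>,
  maxAttempts: number,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await check();
    if (result.approved) return true;
    if (attempt < maxAttempts) await fix(result.message ?? "rejected");
  }
  return false;
}

async function runAgentWorkflow(
  steps: WorkflowSteps,
  maxAttempts = 3, // assumed cap, not from the notes
): Promise<"pushed" | "blocked"> {
  // Steps 2-5: pre-commit (mechanical gates), then commit.
  if (!(await untilApproved(steps.preCommitCheck, steps.fix, maxAttempts))) {
    return "blocked";
  }
  await steps.commit();

  // Steps 6-9: post-commit (mechanical + AI gates), then push.
  if (!(await untilApproved(steps.postCommitCheck, steps.fix, maxAttempts))) {
    return "blocked";
  }
  await steps.push();
  return "pushed";
}
```

A real agent would probably need to re-commit after addressing post-commit concerns; the sketch leaves that out to keep the control flow visible.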

Acceptance Criteria - COMPLETED

  • Mechanical gates: typecheck, lint, tests, coverage (coordinator)
  • AI confirmation: independent AI agent reviews (coordinator)
  • Orchestrator calls both mechanical and AI gates
  • Reject if either fails
  • Return detailed failure reasons
  • Comprehensive unit tests (36 total, 14 new for ORCH-116)
  • Test coverage >= 85% (achieved 93.33% branch, 100% line)
  • Helper method to check AI confirmation presence
  • Enhanced documentation explaining 50% rule

Next Steps

This completes ORCH-116. The orchestrator now properly handles the 50% rule enforcement through coordinator integration. The coordinator is responsible for the actual AI review implementation (ORCH-122), which will use this interface.

Related Work:

  • ORCH-122: AI agent confirmation (coordinator implementation)
  • ORCH-123: YOLO mode (gate bypass configuration)
  • ORCH-124: Gate configuration per-task (different profiles)