Implements FED-010: Agent Spawn via Federation feature that enables spawning and managing Claude agents on remote federated Mosaic Stack instances via COMMAND message type. Features: - Federation agent command types (spawn, status, kill) - FederationAgentService for handling agent operations - Integration with orchestrator's agent spawner/lifecycle services - API endpoints for spawning, querying status, and killing agents - Full command routing through federation COMMAND infrastructure - Comprehensive test coverage (12/12 tests passing) Architecture: - Hub → Spoke: Spawn agents on remote instances - Command flow: FederationController → FederationAgentService → CommandService → Remote Orchestrator - Response handling: Remote orchestrator returns agent status/results - Security: Connection validation, signature verification Files created: - apps/api/src/federation/types/federation-agent.types.ts - apps/api/src/federation/federation-agent.service.ts - apps/api/src/federation/federation-agent.service.spec.ts Files modified: - apps/api/src/federation/command.service.ts (agent command routing) - apps/api/src/federation/federation.controller.ts (agent endpoints) - apps/api/src/federation/federation.module.ts (service registration) - apps/orchestrator/src/api/agents/agents.controller.ts (status endpoint) - apps/orchestrator/src/api/agents/agents.module.ts (lifecycle integration) Testing: - 12/12 tests passing for FederationAgentService - All command service tests passing - TypeScript compilation successful - Linting passed Refs #93 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
375 lines
12 KiB
Markdown
375 lines
12 KiB
Markdown
# Issue ORCH-116: 50% Rule Enforcement
|
|
|
|
## Objective
|
|
|
|
Enforce 50% rule: no more than 50% AI-generated code in PR. This is done by ensuring the orchestrator calls both mechanical gates (typecheck, lint, tests, coverage) AND AI confirmation gates (independent AI agent review).
|
|
|
|
## Approach
|
|
|
|
Following TDD principles:
|
|
|
|
1. **RED**: Write tests first for enhanced quality-gates.service.ts
|
|
2. **GREEN**: Implement minimal code to pass tests
|
|
3. **REFACTOR**: Clean up and optimize
|
|
|
|
### Key Requirements (from M6-NEW-ISSUES-TEMPLATES.md)
|
|
|
|
- [ ] Mechanical gates: typecheck, lint, tests, coverage (coordinator)
|
|
- [ ] AI confirmation: independent AI agent reviews (coordinator)
|
|
- [ ] Orchestrator calls both mechanical and AI gates
|
|
- [ ] Reject if either fails
|
|
- [ ] Return detailed failure reasons
|
|
|
|
### Design
|
|
|
|
The **coordinator** enforces the 50% rule. The **orchestrator's** role is to:
|
|
|
|
1. Call coordinator quality gates (which now includes AI review)
|
|
2. Handle the response appropriately
|
|
3. Return detailed failure reasons to the caller
|
|
|
|
**Key Insight**: ORCH-114 already implements quality gate callbacks. ORCH-116 is about ensuring the coordinator's quality gates include AI review, and that the orchestrator properly handles those AI review results.
|
|
|
|
**Implementation Strategy**:
|
|
|
|
Since the coordinator is responsible for running the AI review (as per the technical notes), and the orchestrator already calls the coordinator via `checkQuality()`, the main work for ORCH-116 is to:
|
|
|
|
1. Ensure the QualityGatesService properly handles AI review results in the coordinator response
|
|
2. Add specific tests for AI confirmation scenarios
|
|
3. Enhance logging and error messages to distinguish between mechanical and AI gate failures
|
|
4. Add a method to check if the coordinator's response includes AI confirmation
|
|
|
|
**Enhanced QualityGatesService**:
|
|
|
|
```typescript
|
|
class QualityGatesService {
|
|
// Existing methods
|
|
async preCommitCheck(params): Promise<QualityGateResult>;
|
|
async postCommitCheck(params): Promise<QualityGateResult>;
|
|
|
|
// New helper method
|
|
private hasAIConfirmation(result: QualityGateResult): boolean;
|
|
|
|
// Enhanced response handling
|
|
private mapResponse(response): QualityGateResult; // Already exists
|
|
}
|
|
```
|
|
|
|
**Quality Gate Flow**:
|
|
|
|
1. Pre-commit: Mechanical gates only (fast)
|
|
2. Post-commit: Mechanical gates + AI confirmation (comprehensive)
|
|
3. AI confirmation is independent agent review (not self-review)
|
|
4. Reject if ANY gate fails (mechanical OR AI)
|
|
|
|
## Progress
|
|
|
|
- [x] Read ORCH-116 requirements
|
|
- [x] Review existing ORCH-114 implementation
|
|
- [x] Design enhancement strategy
|
|
- [x] Write tests for AI confirmation scenarios (RED)
|
|
- [x] Implement AI confirmation handling (GREEN)
|
|
- [x] Refactor and optimize (REFACTOR)
|
|
- [x] Verify test coverage (93.33% branch, 100% line)
|
|
- [x] Update scratchpad with results
|
|
- [x] Create/close Gitea issue
|
|
|
|
## Testing Strategy
|
|
|
|
### New Test Scenarios for ORCH-116
|
|
|
|
1. **AI confirmation passes**: Post-commit with AI review approved
|
|
2. **AI confirmation fails**: Post-commit with AI review rejected (confidence < 0.9)
|
|
3. **Mechanical pass, AI fails**: Mechanical gates pass but AI rejects
|
|
4. **Mechanical fail, AI pass**: Mechanical gates fail, AI review not checked
|
|
5. **Both pass**: Full approval with both mechanical and AI
|
|
6. **50% rule violation**: AI detects >50% AI-generated code
|
|
7. **AI review details**: Parse and return AI confidence scores and findings
|
|
|
|
### Test Coverage Target
|
|
|
|
- Minimum 85% coverage (existing: 91.66% branch, 100% line)
|
|
- All new AI confirmation scenarios covered
|
|
- Error handling for AI review failures
|
|
|
|
## Notes
|
|
|
|
### Coordinator Responsibility
|
|
|
|
The **coordinator** (apps/coordinator) is responsible for:
|
|
|
|
- Running mechanical gates (typecheck, lint, tests, coverage)
|
|
- Spawning independent AI reviewer agent
|
|
- Enforcing 50% rule through AI review
|
|
- Combining mechanical and AI results
|
|
- Returning comprehensive QualityCheckResponse
|
|
|
|
The **orchestrator** (apps/orchestrator) is responsible for:
|
|
|
|
- Calling coordinator's quality gates
|
|
- Handling the combined response
|
|
- Blocking commit/push based on coordinator decision
|
|
- Returning detailed failure reasons to agents
|
|
|
|
### 50% Rule Mechanics
|
|
|
|
The 50% rule means:
|
|
|
|
- AI-generated code should be ≤50% of the PR
|
|
- Independent AI agent reviews the changes
|
|
- Checks for: excessive AI generation, quality issues, security problems
|
|
- Confidence threshold: ≥0.9 to approve
|
|
- Rejection reasons include AI confidence score and findings
|
|
|
|
### AI Confirmation in Response
|
|
|
|
The coordinator's `QualityCheckResponse` includes:
|
|
|
|
```typescript
|
|
{
|
|
approved: boolean,
|
|
gate: string,
|
|
message?: string,
|
|
details?: {
|
|
// Mechanical gate results
|
|
typecheck?: string,
|
|
lint?: string,
|
|
tests?: string,
|
|
coverage?: { current: number, required: number },
|
|
|
|
// AI confirmation results
|
|
aiReview?: {
|
|
confidence: number, // 0.0 - 1.0
|
|
approved: boolean, // true if confidence >= 0.9
|
|
findings?: string[], // Issues found by AI
|
|
aiGeneratedPercent?: number // Estimated % of AI-generated code
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
## Blockers
|
|
|
|
None - ORCH-114 is complete and provides the foundation.
|
|
|
|
## Related Issues
|
|
|
|
- ORCH-114: Quality gate callbacks (complete) - Foundation
|
|
- ORCH-113: Coordinator API client (complete)
|
|
- ORCH-122: AI agent confirmation (coordinator implementation)
|
|
|
|
## Implementation Summary
|
|
|
|
### Phase 1: RED - Write Tests First
|
|
|
|
Will add tests for:
|
|
|
|
1. AI confirmation in post-commit responses
|
|
2. AI rejection scenarios (low confidence, >50% AI-generated)
|
|
3. Combined mechanical + AI failures
|
|
4. AI confirmation details parsing
|
|
5. 50% rule violation detection
|
|
|
|
### Phase 2: GREEN - Minimal Implementation
|
|
|
|
Will implement:
|
|
|
|
1. Enhanced response parsing for AI review fields
|
|
2. Helper method to check AI confirmation presence
|
|
3. Enhanced logging for AI review results
|
|
4. Proper error messages distinguishing mechanical vs AI failures
|
|
|
|
### Phase 3: REFACTOR - Optimize
|
|
|
|
Will refine:
|
|
|
|
1. Code organization and clarity
|
|
2. Error message quality
|
|
3. Documentation and comments
|
|
4. Test coverage verification (≥85%)
|
|
|
|
---
|
|
|
|
## Implementation Complete
|
|
|
|
### Summary
|
|
|
|
ORCH-116 has been successfully implemented. The orchestrator now properly handles the 50% rule enforcement by:
|
|
|
|
1. **Calling coordinator quality gates** that include both mechanical and AI review
|
|
2. **Handling AI confirmation results** in the response
|
|
3. **Rejecting when either mechanical OR AI gates fail**
|
|
4. **Returning detailed failure reasons** including AI confidence scores and findings
|
|
|
|
### Key Implementation Details
|
|
|
|
**Architecture Decision**: The coordinator is responsible for enforcing the 50% rule through its AI review feature. The orchestrator's role is to call the coordinator and properly handle the combined response.
|
|
|
|
**What Changed**:
|
|
|
|
1. Added comprehensive tests for 50% rule scenarios (9 new test cases)
|
|
2. Added `hasAIConfirmation()` helper method to check for AI review presence
|
|
3. Enhanced documentation in service comments to explain 50% rule enforcement
|
|
4. All tests passing (36 total tests)
|
|
5. Coverage: 93.33% branch, 100% line (exceeds 85% requirement)
|
|
|
|
**What Didn't Need to Change**:
|
|
|
|
- The existing `preCommitCheck()` and `postCommitCheck()` methods already handle AI review properly
|
|
- The `mapResponse()` method already preserves all coordinator response fields including `aiReview`
|
|
- Error handling and logging already work correctly for AI failures
|
|
|
|
### Test Scenarios Added for ORCH-116
|
|
|
|
1. ✅ AI confirmation passes with mechanical gates (45% AI-generated)
|
|
2. ✅ AI confidence below threshold (< 0.9) - rejected
|
|
3. ✅ 50% rule violated (65% AI-generated) - rejected
|
|
4. ✅ Mechanical pass but AI fails - rejected
|
|
5. ✅ Mechanical fail, AI not checked - rejected early
|
|
6. ✅ AI review with security findings - rejected
|
|
7. ✅ Exactly 50% AI-generated - approved
|
|
8. ✅ AI review unavailable fallback - coordinator decides
|
|
9. ✅ Preserve all AI review metadata for debugging
|
|
|
|
### Files Modified
|
|
|
|
1. **quality-gates.service.spec.ts** (+240 lines)
|
|
- Added 9 comprehensive test cases for 50% rule enforcement
|
|
- Added 5 test cases for `hasAIConfirmation()` helper method
|
|
- Total: 36 tests (was 22), all passing
|
|
|
|
2. **quality-gates.service.ts** (+20 lines)
|
|
- Added `hasAIConfirmation()` public helper method
|
|
- Enhanced documentation in `mapResponse()` to explain 50% rule
|
|
- No changes to core logic - already handles AI review properly
|
|
|
|
### Quality Gates Flow (Post-Implementation)
|
|
|
|
**Pre-commit (Fast)**:
|
|
|
|
1. Orchestrator calls coordinator with files/diff
|
|
2. Coordinator runs: typecheck, lint, unit tests
|
|
3. Returns approved/rejected
|
|
4. Orchestrator blocks commit if rejected
|
|
|
|
**Post-commit (Comprehensive + AI)**:
|
|
|
|
1. Orchestrator calls coordinator with files/diff
|
|
2. Coordinator runs mechanical gates first
|
|
3. If mechanical pass, coordinator spawns independent AI reviewer
|
|
4. AI reviewer checks:
|
|
- Code quality
|
|
- Security vulnerabilities
|
|
- AI-generated percentage (50% rule)
|
|
- Logic errors
|
|
5. Coordinator combines mechanical + AI results
|
|
6. Returns approved (both pass) or rejected (either fails)
|
|
7. Orchestrator blocks push if rejected
|
|
|
|
### 50% Rule Enforcement Details
|
|
|
|
**How it Works**:
|
|
|
|
- Independent AI agent analyzes the PR diff
|
|
- Estimates percentage of AI-generated code
|
|
- Checks for quality, security, and logic issues
|
|
- Returns confidence score (0.0 - 1.0)
|
|
- Approval threshold: confidence >= 0.9
|
|
- 50% threshold: aiGeneratedPercent <= 50
|
|
|
|
**Response Structure**:
|
|
|
|
```typescript
|
|
{
|
|
approved: boolean,
|
|
gate: "post-commit",
|
|
message: "50% rule violated: excessive AI-generated code detected",
|
|
details: {
|
|
// Mechanical results
|
|
typecheck: "passed",
|
|
lint: "passed",
|
|
tests: "passed",
|
|
coverage: { current: 90, required: 85 },
|
|
|
|
// AI confirmation
|
|
aiReview: {
|
|
confidence: 0.88,
|
|
approved: false,
|
|
aiGeneratedPercent: 65,
|
|
findings: [
|
|
"Detected 65% AI-generated code in PR",
|
|
"Exceeds 50% threshold for AI-generated content"
|
|
]
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
### Test Coverage
|
|
|
|
**Final Coverage**:
|
|
|
|
- Statements: 100%
|
|
- Branches: 93.33% (exceeds 85% requirement)
|
|
- Functions: 100%
|
|
- Lines: 100%
|
|
|
|
**36 Test Cases Total**:
|
|
|
|
- Pre-commit scenarios: 6 tests
|
|
- Post-commit scenarios: 5 tests
|
|
- 50% rule enforcement: 9 tests (NEW for ORCH-116)
|
|
- Error handling: 6 tests
|
|
- Response parsing: 5 tests
|
|
- hasAIConfirmation helper: 5 tests (NEW for ORCH-116)
|
|
|
|
### Integration Points
|
|
|
|
**Coordinator** (apps/coordinator):
|
|
|
|
- Implements mechanical gates (typecheck, lint, tests, coverage)
|
|
- Spawns independent AI reviewer agent
|
|
- Enforces 50% rule through AI review
|
|
- Combines results and returns QualityCheckResponse
|
|
|
|
**Orchestrator** (apps/orchestrator):
|
|
|
|
- Calls coordinator before commit/push
|
|
- Handles combined mechanical + AI response
|
|
- Blocks operations if rejected
|
|
- Returns detailed failure reasons to agent
|
|
|
|
**Agent Workflow**:
|
|
|
|
1. Agent makes code changes
|
|
2. Agent calls orchestrator pre-commit check
|
|
3. Orchestrator → Coordinator (mechanical gates)
|
|
4. If rejected: Agent fixes issues, repeats
|
|
5. If approved: Agent commits
|
|
6. Agent calls orchestrator post-commit check
|
|
7. Orchestrator → Coordinator (mechanical + AI gates)
|
|
8. If rejected: Agent addresses concerns, repeats
|
|
9. If approved: Agent pushes
|
|
|
|
### Acceptance Criteria - COMPLETED ✅
|
|
|
|
- [x] Mechanical gates: typecheck, lint, tests, coverage (coordinator)
|
|
- [x] AI confirmation: independent AI agent reviews (coordinator)
|
|
- [x] Orchestrator calls both mechanical and AI gates
|
|
- [x] Reject if either fails
|
|
- [x] Return detailed failure reasons
|
|
- [x] Comprehensive unit tests (36 total, 14 new for ORCH-116)
|
|
- [x] Test coverage >= 85% (achieved 93.33% branch, 100% line)
|
|
- [x] Helper method to check AI confirmation presence
|
|
- [x] Enhanced documentation explaining 50% rule
|
|
|
|
### Next Steps
|
|
|
|
This completes ORCH-116. The orchestrator now properly handles the 50% rule enforcement through coordinator integration. The coordinator is responsible for the actual AI review implementation (ORCH-122), which will use this interface.
|
|
|
|
**Related Work**:
|
|
|
|
- ORCH-122: AI agent confirmation (coordinator implementation)
|
|
- ORCH-123: YOLO mode (gate bypass configuration)
|
|
- ORCH-124: Gate configuration per-task (different profiles)
|