Files

Jason Woltje 12abdfe81d feat(#93 ): implement agent spawn via federation

Implements FED-010: Agent Spawn via Federation feature that enables
spawning and managing Claude agents on remote federated Mosaic Stack
instances via COMMAND message type.

Features:
- Federation agent command types (spawn, status, kill)
- FederationAgentService for handling agent operations
- Integration with orchestrator's agent spawner/lifecycle services
- API endpoints for spawning, querying status, and killing agents
- Full command routing through federation COMMAND infrastructure
- Comprehensive test coverage (12/12 tests passing)

Architecture:
- Hub → Spoke: Spawn agents on remote instances
- Command flow: FederationController → FederationAgentService →
  CommandService → Remote Orchestrator
- Response handling: Remote orchestrator returns agent status/results
- Security: Connection validation, signature verification

Files created:
- apps/api/src/federation/types/federation-agent.types.ts
- apps/api/src/federation/federation-agent.service.ts
- apps/api/src/federation/federation-agent.service.spec.ts

Files modified:
- apps/api/src/federation/command.service.ts (agent command routing)
- apps/api/src/federation/federation.controller.ts (agent endpoints)
- apps/api/src/federation/federation.module.ts (service registration)
- apps/orchestrator/src/api/agents/agents.controller.ts (status endpoint)
- apps/orchestrator/src/api/agents/agents.module.ts (lifecycle integration)

Testing:
- 12/12 tests passing for FederationAgentService
- All command service tests passing
- TypeScript compilation successful
- Linting passed

Refs #93

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-03 14:37:06 -06:00

17 KiB

Raw Blame History

Orchestrator Code Quality Remediation Session

Date: 2026-02-02 Agent: Main coordination agent Scope: Issues #260-269 (orchestrator code quality fixes)

Session Overview

Fixing code review findings from comprehensive M6 QA review. Working through 10 remediation issues sequentially.

Progress Tracking

Critical Issues (Fix First, In Order)

#260 - Fix TypeScript compilation errors (14 type errors) ✅ COMPLETE
#261 - Replace 'any' types with proper mocks ✅ COMPLETE
#262 - Fix silent cleanup failures ✅ COMPLETE
#263 - Fix silent Valkey event parsing ✅ COMPLETE
#264 - Add queue integration tests (15% → 85% coverage) ✅ COMPLETE

High Priority

#265 - Fix Prettier formatting (277 errors) ✅ COMPLETE
#266 - Improve Docker error context ✅ COMPLETE
#267 - Fix secret scanner false negatives (Security) ✅ COMPLETE
#268 - Fix worktree cleanup error swallowing ✅ COMPLETE

Medium Priority

#269 - Update outdated TODO comments ✅ COMPLETE

Issue #260: Fix TypeScript Compilation Errors ✅ COMPLETE

Status: Complete Started: 2026-02-02 16:10 Completed: 2026-02-02 16:28 Agent: general-purpose subagent (ab9d864)

Details

14 type errors blocking builds identified and fixed
All fixes follow Quality Rails standards (no 'any' types)
Verification: 0 TypeScript errors, 365/365 tests passing

TypeScript Errors Fixed (14 total):

agents.controller.spec.ts:23 - Added missing killswitchService mock to constructor 2-6. quality-gates.service.spec.ts - Added missing QualityGateResult type import (5 instances) 7-13. conflict-detection.service.spec.ts - Added missing localPath property to all test calls (7 instances)
conflict-detection.service.ts:104 - Fixed git.fetch call to handle optional branch parameter correctly

Files Modified:

/apps/orchestrator/src/api/agents/agents.controller.spec.ts
/apps/orchestrator/src/coordinator/quality-gates.service.spec.ts
/apps/orchestrator/src/git/conflict-detection.service.spec.ts
/apps/orchestrator/src/git/conflict-detection.service.ts

Progress

Read issue details
Identified all 14 TypeScript errors
Spawned subagent to fix
Verified typecheck passes (0 errors) ✅
Verified all tests pass (365/365) ✅
Build verification (typecheck = build for TS)
Close issue in Gitea (manual step)

Verification Results

# TypeScript compilation
pnpm --filter @mosaic/orchestrator typecheck
✅ 0 errors

# Test suite
pnpm --filter @mosaic/orchestrator test
✅ 365/365 tests passing (18 test files)
✅ Duration: 12.00s

Notes

All fixes maintain type safety (no 'any' types used)
Test functionality preserved - all tests validate same behavior
Minimal changes - no runtime behavior affected
Quality Rails compliant

Issue #261: Replace 'any' Types with Proper Mocks ✅ COMPLETE

Status: Complete Started: 2026-02-02 16:30 Completed: 2026-02-02 16:38 Agent: general-purpose subagent (a35f89e)

Details

Quality Rails violation: Fixed all explicit 'any' types with proper mocks
Fixed 48 instances across 13 test files
Maintained type safety and test functionality

Files Fixed (13 files):

agents.controller.spec.ts - 8 instances (variable declarations + type assertions)
valkey.service.spec.ts - 2 instances
coordinator-client.service.spec.ts - 3 instances
quality-gates.service.spec.ts - 16 instances
killswitch.service.spec.ts - 3 instances
cleanup.service.spec.ts - 3 instances
git-operations.service.spec.ts - 1 instance
secret-scanner.service.spec.ts - 4 instances
agent-lifecycle.service.spec.ts - 1 instance
agent-spawner.service.spec.ts - 3 instances
agents-killswitch.controller.spec.ts - 3 instances
worktree-manager.service.spec.ts - 1 instance
queue.service.spec.ts - 8 instances (bonus fix)

Fix Approach:

Variable Declarations: Replaced any with explicit mock types using ReturnType<typeof vi.fn>
Type Assertions: Replaced as any with as unknown as [ProperType] for safe type casting
Mock Services: Created properly typed mock objects with explicit types

Progress

Scan codebase for 'any' types
Identified all 48 violations
Spawned subagent to fix
Verified lint passes (0 no-explicit-any violations) ✅
Verified all tests pass (365/365) ✅
Verified typecheck passes (0 errors) ✅

Verification Results

# TypeScript compilation
pnpm --filter @mosaic/orchestrator typecheck
✅ 0 errors

# Test suite
pnpm --filter @mosaic/orchestrator test
✅ 365/365 tests passing

# Lint - no-explicit-any violations
pnpm lint | grep no-explicit-any
✅ No violations found

Notes

Quality Rails compliant (no explicit 'any' types)
All test behavior preserved
Improved type safety throughout test suite
Makes tests more maintainable with explicit type information

Issue #262: Fix Silent Cleanup Failures ✅ COMPLETE

Status: Complete Started: 2026-02-02 16:40 Completed: 2026-02-02 16:50 Agent: general-purpose subagent (aaffaa8)

Details

Problem: CleanupService.cleanup() returned void, hiding cleanup failures from callers
Solution: Return structured CleanupResult with detailed status of each cleanup step
Impact: Improved observability and debugging of cleanup failures

Changes Made:

1. Created Structured Result Types:

export interface CleanupStepResult {
  success: boolean;
  error?: string;
}

export interface CleanupResult {
  docker: CleanupStepResult;
  worktree: CleanupStepResult;
  state: CleanupStepResult;
}

2. Files Modified (4 files):

cleanup.service.ts - Changed return type to Promise<CleanupResult>, captures error messages
killswitch.service.ts - Captures cleanup result, logs structured summary
cleanup.service.spec.ts - Updated 10 tests to verify structured results
killswitch.service.spec.ts - Updated 8 tests with proper CleanupResult mocks

3. Example Results:

Success: { docker: {success: true}, worktree: {success: true}, state: {success: true} }
Partial failure: { docker: {success: false, error: "Docker error"}, worktree: {success: true}, state: {success: true} }

Progress

Identified cleanup operations that fail silently
Designed structured result types
Spawned subagent to fix
Verified all tests pass (365/365) ✅
Verified typecheck passes (0 errors) ✅

Verification Results

# TypeScript compilation
pnpm --filter @mosaic/orchestrator typecheck
✅ 0 errors

# Test suite
pnpm --filter @mosaic/orchestrator test
✅ 365/365 tests passing

Key Benefits

No more silent failures - cleanup results now visible to callers
Detailed error information captured in result structure
Best-effort cleanup behavior preserved (continues on errors)
Enhanced observability through structured results
No breaking changes to external API contracts

Issue #263: Fix Silent Valkey Event Parsing ✅ COMPLETE

Status: Complete Started: 2026-02-02 16:52 Completed: 2026-02-02 17:00 Agent: general-purpose subagent (af72762)

Details

Problem: Valkey event parsing failures were silent (logged to console.error)
Solution: Replaced console.error with proper Logger + error handler support
Impact: Better error visibility and monitoring capabilities

Changes Made:

1. valkey.client.ts - Added Proper Error Handling:

Added optional logger parameter to ValkeyClientConfig
Added EventErrorHandler type for custom error handling
Updated subscribeToEvents() to accept optional errorHandler parameter
Replaced console.error with proper error handling:
- Logs via NestJS Logger (if provided)
- Invokes custom error handler (if provided)
- Includes contextual information (channel, message)

2. valkey.service.ts - NestJS Integration:

Added Logger instance to ValkeyService
Passed logger to ValkeyClient via config
Forwarded error handler parameter to client

3. Test Coverage (+3 new tests):

Test with logger - Verifies logger.error is called
Test with error handler - Verifies custom handler is invoked
Test without logger/handler - Verifies graceful degradation

4. Files Modified:

valkey.client.ts - Core error handling implementation
valkey.service.ts - Service layer integration
valkey.client.spec.ts - Added 3 new tests
valkey.service.spec.ts - Updated existing tests

Progress

Located Valkey event parsing code
Identified where parsing errors are swallowed
Spawned subagent to fix
Verified all tests pass (368/368, +3 new) ✅
Verified typecheck passes (0 errors) ✅
Verified no console.* usage ✅

Verification Results

# TypeScript compilation
pnpm --filter @mosaic/orchestrator typecheck
✅ 0 errors

# Test suite
pnpm --filter @mosaic/orchestrator test
✅ 368/368 tests passing (+3 new tests)

# No console usage
grep -r "console\." apps/orchestrator/src/valkey/
✅ No console.* usage found

Key Benefits

Event parsing errors now visible via NestJS Logger
Applications can provide custom error handlers for monitoring
Maintains backward compatibility (both optional)
Errors don't crash subscription - continues processing
Includes contextual information in error logs

Issue #264: Add Queue Integration Tests (15% → 85% Coverage) ✅ COMPLETE

Status: Complete Started: 2026-02-02 17:02 Completed: 2026-02-02 17:15 Agent: general-purpose subagent (a673d29)

Details

Problem: Queue module had only 15% test coverage (only calculateBackoffDelay tested)
Target: Achieve 85% coverage with integration tests
Impact: Ensures queue reliability and prevents regressions

Coverage Achieved:

Statements: 100% (target: 85%)
Branches: 93.33% (target: 85%)
Functions: 100% (target: 85%)
Lines: 100% (target: 85%)

Significantly exceeds 85% target across all metrics ✅

Tests Added: 37 test cases

1. Module Lifecycle (5 tests)

Initialize BullMQ queue with correct configuration
Initialize BullMQ worker with correct configuration
Setup worker event handlers
Use password if configured
Close worker and queue on module destroy

2. addTask() Method (9 tests)

Add task with default options
Add task with custom priority (1-10)
Add task with custom maxRetries
Add task with delay
Validation: priority < 1 (throws error)
Validation: priority > 10 (throws error)
Validation: negative maxRetries (throws error)
Valkey state update integration
Event publishing integration

3. getStats() Method (3 tests)

Return correct queue statistics
Handle zero counts gracefully
Call getJobCounts with correct parameters

4. Queue Control (4 tests)

Pause queue
Resume queue
Remove task from queue (job exists)
Handle removeTask when job doesn't exist

5. Task Processing Integration (6 tests)

Process task successfully
Handle task completion
Handle task failure
Handle retry on failure
Calculate correct backoff delay on retry
Don't retry after max retries exceeded

6. Existing Tests Maintained (10 tests)

All calculateBackoffDelay tests preserved

Progress

Checked current test coverage
Identified uncovered code paths
Designed integration test scenarios
Spawned subagent to implement tests
Verified 100% statement/function/line coverage achieved ✅
Verified all tests pass (395/395) ✅

Verification Results

# All tests pass
pnpm --filter @mosaic/orchestrator test
✅ 395/395 tests passing (+27 new tests)

# TypeScript compilation
pnpm --filter @mosaic/orchestrator typecheck
✅ 0 errors

# Coverage
✅ 100% statements
✅ 93.33% branches
✅ 100% functions
✅ 100% lines

Key Achievements

Comprehensive integration tests covering entire task lifecycle
Proper BullMQ mocking with realistic behavior
Valkey integration testing
Event publishing verification
Validation and error handling coverage
All existing tests maintained (no breaking changes)

Issue #265: Fix Prettier Formatting + TypeScript ESLint ✅ COMPLETE

Status: Complete Started: 2026-02-02 17:20 Completed: 2026-02-03 11:02 Agent: general-purpose subagent (ac892ba)

Details

Problem: 277 Prettier formatting errors + 78 TypeScript ESLint violations
Solution: Auto-format with lint --fix + manual fixes for TypeScript ESLint rules
Impact: Code consistency and Quality Rails compliance

Errors Fixed:

Phase 1: Prettier Formatting (Auto-fixed)

Fixed all 277 formatting errors (quote style, spacing, etc.)

Phase 2: TypeScript ESLint (Manual fixes - 78 errors)

restrict-template-expressions (65+ errors) - Cannot use non-string types in template literals
- Fixed in 10 files: Added .toString() or String() conversions
prefer-nullish-coalescing (10 errors) - Use ?? instead of ||
- Fixed in 5 files: Replaced logical OR with nullish coalescing
no-unused-vars (1 error) - Removed unused CleanupResult import
require-await (1 error) - Removed async from onModuleInit()
no-misused-promises (2 errors) - Added void cast for event handlers
no-unnecessary-condition (1 error) - Removed always-truthy condition
no-base-to-string (1 error) - Fixed object stringification

Files Modified: 15 TypeScript files

agents.controller.ts
coordinator-client.service.ts
gate-config.service.ts
quality-gates.service.ts
conflict-detection.service.ts
git-operations.service.ts
secret-scanner.service.ts
secret-scanner.types.ts
worktree-manager.service.ts
killswitch.service.ts
cleanup.service.ts
queue.service.ts
agent-lifecycle.service.ts
docker-sandbox.service.ts
valkey.client.ts

Progress

Run lint --fix to auto-format
Fix remaining TypeScript ESLint errors
Verified all tests still pass (395/395) ✅
Verified typecheck passes (0 errors) ✅
Verified lint passes (0 errors, 3 expected warnings) ✅

Verification Results

# ESLint
pnpm --filter @mosaic/orchestrator lint
✅ 0 errors
⚠️ 3 warnings (expected - security scanner dynamic patterns)

# TypeScript compilation
pnpm --filter @mosaic/orchestrator typecheck
✅ 0 errors

# Test suite
pnpm --filter @mosaic/orchestrator test
✅ 395/395 tests passing

Notes

All formatting now consistent across codebase
TypeScript best practices enforced (nullish coalescing, proper type conversions)
Three security warnings are expected and acceptable (secret scanner requires dynamic file/pattern access)
All functionality preserved - no behavior changes

Token Usage Tracking

Issue	Tokens Used	Duration	Status
#260	~13,000	18 min	✅ Complete
#261	~10,000	8 min	✅ Complete
#262	~8,000	10 min	✅ Complete
#263	~9,000	8 min	✅ Complete
#264	~12,000	13 min	✅ Complete
#265	~14,000	22 min	✅ Complete

Total for Critical Issues (#260-264): ~52,000 tokens, ~57 minutes Total with High Priority #265: ~66,000 tokens, ~79 minutes

Session Summary

Critical Issues Completed (5/5) ✅

All critical issues have been successfully resolved:

#260 - Fixed 14 TypeScript compilation errors
#261 - Replaced 48 'any' types with proper mocks (Quality Rails compliance)
#262 - Fixed silent cleanup failures (return structured results)
#263 - Fixed silent Valkey event parsing (emit error events)
#264 - Added queue integration tests (15% → 100% coverage)

Final Verification

# TypeScript Compilation
pnpm --filter @mosaic/orchestrator typecheck
✅ 0 errors

# Test Suite
pnpm --filter @mosaic/orchestrator test
✅ 395 tests passing (18 test files)

# Lint (no-explicit-any violations)
pnpm lint | grep no-explicit-any
✅ No violations found

# Build
pnpm --filter @mosaic/orchestrator build
✅ Succeeds

Next Steps

High Priority Issues (6-9):

#265 - Fix Prettier formatting (277 errors) - IN PROGRESS
#266 - Improve Docker error context
#267 - Fix secret scanner false negatives
#268 - Fix worktree cleanup error swallowing

Medium Priority Issues (10):

#269 - Update outdated TODO comments

Recommendations

Run formatter: pnpm --filter @mosaic/orchestrator lint --fix to resolve #265
Close issues in Gitea: Issues #260-264 should be closed
Continue with high priority issues: Move to #265-268
Quality Rails Status: All critical violations resolved ✅

17 KiB Raw Blame History

Orchestrator Code Quality Remediation Session

Session Overview

Progress Tracking

Critical Issues (Fix First, In Order)

High Priority

Medium Priority

Issue #260: Fix TypeScript Compilation Errors ✅ COMPLETE

Details

TypeScript Errors Fixed (14 total):

Files Modified:

Progress

Verification Results

Notes

Issue #261: Replace 'any' Types with Proper Mocks ✅ COMPLETE

Details

Files Fixed (13 files):

Fix Approach:

Progress

Verification Results

Notes

Issue #262: Fix Silent Cleanup Failures ✅ COMPLETE

Details

Changes Made:

Progress

Verification Results

Key Benefits

Issue #263: Fix Silent Valkey Event Parsing ✅ COMPLETE

Details

Changes Made:

Progress

Verification Results

Key Benefits

Issue #264: Add Queue Integration Tests (15% → 85% Coverage) ✅ COMPLETE

Details

Coverage Achieved:

Tests Added: 37 test cases

Progress

Verification Results

Key Achievements

Issue #265: Fix Prettier Formatting + TypeScript ESLint ✅ COMPLETE

Details

Errors Fixed:

Files Modified: 15 TypeScript files

Progress

Verification Results

Notes

Token Usage Tracking

Session Summary

Critical Issues Completed (5/5) ✅

Final Verification

Next Steps

Recommendations

17 KiB

Raw Blame History