[CRITICAL] Fix silent cleanup failures - return structured results #262

Closed
opened 2026-02-02 23:16:09 +00:00 by jason.woltje · 1 comment
Owner

Priority: CRITICAL - Silent failures

Problem:
CleanupService catches all errors and only logs them, never notifying callers of partial failures. Users see "Agent killed successfully" even when cleanup failed, leaving orphaned resources.

File: apps/orchestrator/src/killswitch/cleanup.service.ts:62-133

Hidden Errors:

  • Docker container cleanup failures (permissions, daemon crashes)
  • Git worktree removal failures (filesystem issues, locks)
  • Valkey state deletion failures (network issues)
  • Event publication failures

Impact:

  • Orphaned Docker containers consuming memory
  • Orphaned git worktrees consuming disk space
  • Inconsistent system state
  • Users unaware cleanup failed

Acceptance Criteria:

  • CleanupService returns structured CleanupResult type
  • Result indicates success/failure for each step (docker, worktree, state)
  • Callers check results and inform users of partial failures
  • Tests verify partial failure scenarios
  • Documentation updated with cleanup behavior

Recommended Approach:
Return typed result instead of void:

interface CleanupResult {
  docker: { success: boolean; error?: string };
  worktree: { success: boolean; error?: string };
  state: { success: boolean; error?: string };
  allSuccessful(): boolean;
  getFailures(): string[];
}
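
One way the proposed shape could be produced is sketched below. It is not the code from cleanup.service.ts: `StepResult`, `runStep`, `makeCleanupResult`, `cleanup`, and the step callbacks are hypothetical names chosen for illustration. The idea is that each step's error is captured rather than only logged, and later steps still run after an earlier one fails.

```typescript
// Sketch under assumptions: every name except CleanupResult and its members
// is hypothetical and does not come from the actual CleanupService.
type StepResult = { success: boolean; error?: string };

interface CleanupResult {
  docker: StepResult;
  worktree: StepResult;
  state: StepResult;
  allSuccessful(): boolean;
  getFailures(): string[];
}

// Run one cleanup step and record its outcome instead of swallowing the error.
async function runStep(step: () => Promise<void>): Promise<StepResult> {
  try {
    await step();
    return { success: true };
  } catch (err) {
    return { success: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Combine the per-step outcomes into the result shape proposed in this issue.
function makeCleanupResult(
  docker: StepResult,
  worktree: StepResult,
  state: StepResult
): CleanupResult {
  const steps: Array<[string, StepResult]> = [
    ["docker", docker],
    ["worktree", worktree],
    ["state", state],
  ];
  return {
    docker,
    worktree,
    state,
    allSuccessful: () => steps.every(([, s]) => s.success),
    getFailures: () =>
      steps
        .filter(([, s]) => !s.success)
        .map(([name, s]) => `${name}: ${s.error ?? "unknown error"}`),
  };
}

// Steps run in order, and each one runs even if an earlier step failed,
// so a Docker failure cannot hide a worktree or state failure.
async function cleanup(steps: {
  docker: () => Promise<void>;
  worktree: () => Promise<void>;
  state: () => Promise<void>;
}): Promise<CleanupResult> {
  const docker = await runStep(steps.docker);
  const worktree = await runStep(steps.worktree);
  const state = await runStep(steps.state);
  return makeCleanupResult(docker, worktree, state);
}
```

Passing the steps in as callbacks also makes the partial-failure tests from the acceptance criteria easy to write, since a test can supply a step that throws.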

Code Review Confidence: 90%
Found by: pr-review-toolkit:silent-failure-hunter

jason.woltje added this to the M6-AgentOrchestration (0.0.6) milestone 2026-02-02 23:16:09 +00:00
jason.woltje added the orchestrator label 2026-02-02 23:16:09 +00:00
Author
Owner

Fixed: Added CleanupResult interface with structured results for each cleanup step (docker, worktree, state). File: apps/orchestrator/src/killswitch/cleanup.service.ts
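
On the caller side, the acceptance criterion "Callers check results and inform users of partial failures" might look roughly like the sketch below. `killMessage` and its wording for the partial-failure case are hypothetical; only the `CleanupResult` shape and the "Agent killed successfully" message come from this issue.

```typescript
// Sketch under assumptions: killMessage is a hypothetical helper,
// not a function from the orchestrator codebase.
interface CleanupResult {
  docker: { success: boolean; error?: string };
  worktree: { success: boolean; error?: string };
  state: { success: boolean; error?: string };
  allSuccessful(): boolean;
  getFailures(): string[];
}

// Build the user-facing message from the cleanup result, so a partial
// failure is reported instead of an unconditional success message.
function killMessage(result: CleanupResult): string {
  if (result.allSuccessful()) {
    return "Agent killed successfully";
  }
  return `Agent killed, but cleanup was incomplete: ${result.getFailures().join("; ")}`;
}
```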
Reference: mosaic/stack#262