[ORCH-005] ClawdBot Failure Handling #100
Phase 3: Failure Handling
Handle failures reported by the Orchestrator service (`apps/orchestrator/`). This component reacts to reported failures; it does not detect them itself.
Deliverables
[ ] Failure callback handler (Orchestrator reports task failed)
[ ] Retry configuration per task type (max retries, backoff)
[ ] Automatic retry dispatch (if retries remaining)
[ ] Escalation logic (notify user after max retries)
[ ] Checkpoint preservation (save last known state for potential resume)
[ ] Failure audit trail (full history in AgentTaskLog)
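As a rough illustration of how these deliverables could fit together, the sketch below models a per-task-type retry policy with exponential backoff and a failure handler that either schedules a retry or escalates. It is not the project's implementation: `RetryPolicy`, `TaskRecord`, `handle_failure`, the task type names, and all numeric values are hypothetical, and the in-memory `log` list only stands in for AgentTaskLog.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical per-task-type retry settings (max retries, backoff)."""
    max_retries: int
    base_delay_s: float
    backoff_factor: float = 2.0
    max_delay_s: float = 300.0

    def delay_for(self, attempt: int) -> float:
        """Delay before retry number `attempt` (1-based), capped at max_delay_s."""
        raw = self.base_delay_s * self.backoff_factor ** (attempt - 1)
        return min(raw, self.max_delay_s)


# Assumed task types and values, for illustration only.
RETRY_POLICIES = {
    "code_review": RetryPolicy(max_retries=3, base_delay_s=5.0),
    "deploy": RetryPolicy(max_retries=1, base_delay_s=30.0),
}
DEFAULT_POLICY = RetryPolicy(max_retries=2, base_delay_s=10.0)


@dataclass
class TaskRecord:
    task_id: str
    task_type: str
    retries_used: int = 0
    checkpoint: Optional[dict] = None          # last known state, kept for a potential resume
    log: list = field(default_factory=list)    # stand-in for AgentTaskLog rows


def handle_failure(task: TaskRecord, reason: str,
                   last_state: Optional[dict] = None) -> dict:
    """Decide what to do when the Orchestrator reports that a task failed."""
    policy = RETRY_POLICIES.get(task.task_type, DEFAULT_POLICY)

    # Checkpoint preservation: keep the most recent state we were given.
    if last_state is not None:
        task.checkpoint = last_state

    if task.retries_used < policy.max_retries:
        # Retries remaining: schedule another dispatch after a backoff delay.
        task.retries_used += 1
        decision = {
            "action": "retry",
            "attempt": task.retries_used,
            "delay_s": policy.delay_for(task.retries_used),
        }
    else:
        # Max retries reached: escalate so the user can be notified.
        decision = {"action": "escalate", "attempt": task.retries_used, "delay_s": None}

    # Failure audit trail: record every failure together with the decision taken.
    task.log.append({"event": "failure", "reason": reason, **decision})
    return decision
```

In this sketch the handler only returns a decision; actually re-dispatching the task (presumably through the Task Dispatcher Service from #99) and notifying the user would be done by the caller.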
Failure Flow
Removed (handled by Orchestrator service)
• Stale agent detection: Orchestrator monitors its own agents
• Direct health monitoring: Orchestrator handles heartbeats
• Agent-level recovery: Orchestrator restarts failed agents
Dependencies
• #99 Task Dispatcher Service
Related
• #95 Agent Orchestration EPIC
• #114 Kill Authority Implementation
• ORCH-118 Orchestrator resource cleanup
Title changed from "[ORCH-005] Agent Failure Recovery" to "[ORCH-005] ClawdBot Failure Handling"