feat(#176): Integrate M4.2 infrastructure with M4.1 coordinator

Add CoordinatorIntegrationModule providing REST API endpoints for the Python
coordinator to communicate with the NestJS API infrastructure:

- POST /coordinator/jobs - Create job from coordinator webhook events
- PATCH /coordinator/jobs/:id/status - Update job status (PENDING -> RUNNING)
- PATCH /coordinator/jobs/:id/progress - Update job progress percentage
- POST /coordinator/jobs/:id/complete - Mark job complete with results
- POST /coordinator/jobs/:id/fail - Mark job failed with gate results
- GET /coordinator/jobs/:id - Get job details with events and steps
- GET /coordinator/health - Integration health check

Integration features:
- Job creation dispatches to BullMQ queues
- Status updates emit JobEvents for audit logging
- Completion/failure events broadcast via Herald to Discord
- Status transition validation (PENDING -> QUEUED -> RUNNING -> COMPLETED/FAILED)
- Health check includes BullMQ connection status and queue counts

Also adds JOB_PROGRESS event type to event-types.ts for progress tracking.

Fixes #176

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 21:54:34 -06:00
parent 3cdcbf6774
commit 5a51ee8c30
17 changed files with 1262 additions and 0 deletions

@@ -0,0 +1,102 @@
# Issue #176: Coordinator Integration
## Objective
Integrate the M4.2 infrastructure (NestJS API) with the M4.1 coordinator (Python FastAPI) so that jobs handled by the coordinator are created, tracked, and broadcast through the NestJS API.
## Architecture Analysis
### M4.1 Coordinator (Python)
- FastAPI application at `apps/coordinator`
- Handles Gitea webhooks, queue management, agent orchestration
- Uses file-based JSON queue for persistence
- Has QueueManager, Coordinator, and OrchestrationLoop classes
- Exposes `/webhook/gitea` and `/health` endpoints
### M4.2 Infrastructure (NestJS)
- StitcherModule: Workflow orchestration, webhook handling, job dispatch
- RunnerJobsModule: CRUD for RunnerJob entities, BullMQ integration
- JobEventsModule: Event tracking and audit logging
- JobStepsModule: Step tracking for jobs
- HeraldModule: Status broadcasting to Discord
- BullMqModule: Queue infrastructure with Valkey backend
- BridgeModule: Discord integration
## Integration Design
### Flow 1: Webhook -> Job Creation
```
Gitea -> Coordinator (Python) -> NestJS API -> RunnerJob + BullMQ
                              ^
                              | HTTP POST /api/coordinator/jobs
```
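
A minimal sketch of Flow 1, assuming the service persists a RunnerJob and then enqueues it. The request fields (`type`, `repository`, `issueNumber`), the `InMemoryQueue` stand-in for BullMQ, and the function name `createJobFromCoordinator` are all hypothetical and only illustrate the shape of the flow; the real DTOs live in `dto/`.

```typescript
// Hypothetical request body sent by the Python coordinator.
interface CreateJobRequest {
  type: string;
  repository: string;
  issueNumber: number;
}

// Hypothetical, trimmed-down RunnerJob record.
interface RunnerJob {
  id: string;
  type: string;
  status: "PENDING" | "QUEUED";
  data: CreateJobRequest;
}

// In-memory stand-in for a BullMQ queue (assumption: not the real BullMQ API).
class InMemoryQueue {
  readonly items: { name: string; data: unknown }[] = [];

  add(name: string, data: unknown): void {
    this.items.push({ name, data });
  }
}

// Create the job record, dispatch it to the queue, then mark it QUEUED.
function createJobFromCoordinator(
  req: CreateJobRequest,
  queue: InMemoryQueue,
  id: string,
): RunnerJob {
  const job: RunnerJob = { id, type: req.type, status: "PENDING", data: req };
  queue.add(job.type, { jobId: job.id });
  job.status = "QUEUED";
  return job;
}
```

Passing the queue in as a parameter keeps the sketch testable without a running Valkey instance.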
### Flow 2: Job Status Updates
```
Coordinator (Python) -> NestJS API -> JobEvent -> Herald -> Discord
                     ^
                     | HTTP PATCH /api/coordinator/jobs/:id/status
```
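
The commit message lists status transition validation along the chain PENDING -> QUEUED -> RUNNING -> COMPLETED/FAILED. A minimal sketch of such a check follows, assuming a strict linear chain with two terminal states; the names `ALLOWED_TRANSITIONS`, `canTransition`, and `assertTransition` are hypothetical, and the actual service may permit additional transitions.

```typescript
// Status names taken from the chain in the commit message.
type JobStatus = "PENDING" | "QUEUED" | "RUNNING" | "COMPLETED" | "FAILED";

// Allowed next states per status (assumption: strict chain, terminal states allow none).
const ALLOWED_TRANSITIONS: Record<JobStatus, readonly JobStatus[]> = {
  PENDING: ["QUEUED"],
  QUEUED: ["RUNNING"],
  RUNNING: ["COMPLETED", "FAILED"],
  COMPLETED: [],
  FAILED: [],
};

// True when moving from `from` to `to` is allowed by the table above.
function canTransition(from: JobStatus, to: JobStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// Throwing variant, suitable for calling before a status update is persisted.
function assertTransition(from: JobStatus, to: JobStatus): void {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid status transition: ${from} -> ${to}`);
  }
}
```

A lookup table keeps the rules in one place, so allowing an extra transition is a one-line change.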
### Flow 3: Job Completion
```
Coordinator (Python) -> NestJS API -> Complete RunnerJob -> Herald broadcast
                     ^
                     | HTTP POST /api/coordinator/jobs/:id/complete
```
## Implementation Plan
### 1. Create Coordinator Integration Module
- `apps/api/src/coordinator-integration/`
  - `coordinator-integration.module.ts` - NestJS module
  - `coordinator-integration.controller.ts` - REST endpoints for Python coordinator
  - `coordinator-integration.service.ts` - Business logic
  - `dto/` - DTOs for coordinator communication
  - `interfaces/` - Type definitions
### 2. Endpoints for Python Coordinator
- `POST /api/coordinator/jobs` - Create job from coordinator
- `PATCH /api/coordinator/jobs/:id/status` - Update job status
- `POST /api/coordinator/jobs/:id/complete` - Mark job complete
- `POST /api/coordinator/jobs/:id/fail` - Mark job failed
- `GET /api/coordinator/health` - Integration health check
### 3. Event Bridging
- When coordinator reports progress -> emit JobEvent
- When coordinator completes -> update RunnerJob + emit completion event
- Herald subscribes and broadcasts to Discord
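
The bridging steps above can be sketched with a tiny in-process publish/subscribe bus. This is an illustration only: `JobEventBus`, `HeraldStub`, and `CoordinatorBridge` are hypothetical stand-ins for the real modules, and `JOB_COMPLETED` is an assumed event name (only `JOB_PROGRESS` is named in the commit message).

```typescript
// JOB_PROGRESS comes from the commit message; JOB_COMPLETED is an assumed name.
type JobEventType = "JOB_PROGRESS" | "JOB_COMPLETED";

interface JobEvent {
  jobId: string;
  type: JobEventType;
  payload: Record<string, unknown>;
}

type Listener = (event: JobEvent) => void;

// Minimal synchronous event bus.
class JobEventBus {
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  emit(event: JobEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}

// Stand-in for Herald: records a message instead of posting to Discord.
class HeraldStub {
  readonly broadcasts: string[] = [];

  constructor(bus: JobEventBus) {
    bus.subscribe((event) => {
      if (event.type === "JOB_COMPLETED") {
        this.broadcasts.push(`Job ${event.jobId} completed`);
      }
    });
  }
}

// Stand-in for the integration service.
class CoordinatorBridge {
  readonly jobStatus = new Map<string, string>();
  private readonly bus: JobEventBus;

  constructor(bus: JobEventBus) {
    this.bus = bus;
  }

  // Progress report -> emit a JobEvent, with the percentage clamped to 0..100.
  reportProgress(jobId: string, percent: number): void {
    const clamped = Math.min(100, Math.max(0, percent));
    this.bus.emit({ jobId, type: "JOB_PROGRESS", payload: { percent: clamped } });
  }

  // Completion -> update the job record, then emit the completion event.
  complete(jobId: string, result: Record<string, unknown>): void {
    this.jobStatus.set(jobId, "COMPLETED");
    this.bus.emit({ jobId, type: "JOB_COMPLETED", payload: result });
  }
}
```

Updating the record before emitting means a subscriber that looks up the job while handling the event sees the final status.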
## TDD Approach
1. Write tests for CoordinatorIntegrationService
2. Write tests for CoordinatorIntegrationController
3. Implement minimal code to pass tests
4. Refactor
## Progress
- [x] Analyze coordinator structure
- [x] Analyze M4.2 infrastructure
- [x] Design integration layer
- [x] Write failing tests for service
- [x] Implement service
- [x] Write failing tests for controller
- [x] Implement controller
- [x] Add DTOs and interfaces
- [x] Run quality gates
- [x] Commit
## Notes
- The Python coordinator uses httpx.AsyncClient for HTTP calls
- API auth can be handled via shared secret (API key)
- Events follow established patterns from job-events module
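
As a rough illustration of the shared-secret note, the check below compares an API key taken from a request header against the expected value. The header name `x-api-key` and the function names are assumptions, not part of this commit; in a Node runtime, `crypto.timingSafeEqual` would be the usual choice for the comparison, and the hand-written loop here only keeps the sketch free of imports.

```typescript
// Hypothetical header name carrying the shared secret.
const API_KEY_HEADER = "x-api-key";

// Compare two strings without returning early on the first mismatch.
function constantTimeEquals(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of the string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Reject requests with a missing key, and refuse to run with an empty secret.
function isAuthorized(
  headers: Record<string, string | undefined>,
  expectedKey: string,
): boolean {
  const provided = headers[API_KEY_HEADER];
  if (provided === undefined || expectedKey.length === 0) return false;
  return constantTimeEquals(provided, expectedKey);
}
```

Treating an empty configured secret as "always unauthorized" avoids accidentally accepting every request when the key is not set.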