18 tasks across 7 phases for TTS & STT integration. Estimated total: ~322K tokens. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5.0 KiB
5.0 KiB
Tasks — M13-SpeechServices (0.0.13)
Orchestrator: Claude Code Started: 2026-02-15 Branch: feature/m13-speech-services Milestone: M13-SpeechServices (0.0.13) Epic: #388
Phase 1: Foundation (Config + Module + Providers)
| id | status | description | issue | repo | branch | depends_on | blocks | agent | started_at | completed_at | estimate | used | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SP-CFG-001 | not-started | #401: Speech services environment variables and ConfigModule integration | #401 | api | feature/m13-speech-services | SP-MOD-001,SP-DOC-001 | 15K | ||||||
| SP-MOD-001 | not-started | #389: Create SpeechModule with provider abstraction layer | #389 | api | feature/m13-speech-services | SP-CFG-001 | SP-STT-001,SP-TTS-001,SP-MID-001 | 25K |
Phase 2: Providers (STT + TTS)
| id | status | description | issue | repo | branch | depends_on | blocks | agent | started_at | completed_at | estimate | used | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SP-STT-001 | not-started | #390: Implement STT provider with Speaches/faster-whisper integration | #390 | api | feature/m13-speech-services | SP-MOD-001 | SP-EP-001,SP-WS-001 | 20K | |||||
| SP-TTS-001 | not-started | #391: Implement tiered TTS provider architecture | #391 | api | feature/m13-speech-services | SP-MOD-001 | SP-TTS-002,SP-TTS-003,SP-TTS-004,SP-EP-002 | 20K | |||||
| SP-TTS-002 | not-started | #393: Implement Kokoro-FastAPI TTS provider (default tier) | #393 | api | feature/m13-speech-services | SP-TTS-001 | SP-EP-002 | 15K | |||||
| SP-TTS-003 | not-started | #394: Implement Chatterbox TTS provider (premium tier, voice cloning) | #394 | api | feature/m13-speech-services | SP-TTS-001 | SP-EP-002 | 15K | |||||
| SP-TTS-004 | not-started | #395: Implement Piper TTS provider via OpenedAI Speech (fallback tier) | #395 | api | feature/m13-speech-services | SP-TTS-001 | SP-EP-002 | 12K |
Phase 3: Middleware + REST Endpoints
| id | status | description | issue | repo | branch | depends_on | blocks | agent | started_at | completed_at | estimate | used | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SP-MID-001 | not-started | #398: Audio format validation and preprocessing middleware | #398 | api | feature/m13-speech-services | SP-MOD-001 | SP-EP-001,SP-EP-002 | 15K | |||||
| SP-EP-001 | not-started | #392: Create /api/speech/transcribe REST endpoint | #392 | api | feature/m13-speech-services | SP-STT-001,SP-MID-001 | SP-WS-001,SP-FE-001 | 20K | |||||
| SP-EP-002 | not-started | #396: Create /api/speech/synthesize REST endpoint | #396 | api | feature/m13-speech-services | SP-TTS-002,SP-TTS-003,SP-TTS-004,SP-MID-001 | SP-FE-002 | 20K |
Phase 4: WebSocket Streaming
| id | status | description | issue | repo | branch | depends_on | blocks | agent | started_at | completed_at | estimate | used | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SP-WS-001 | not-started | #397: Implement WebSocket streaming transcription endpoint | #397 | api | feature/m13-speech-services | SP-STT-001,SP-EP-001 | SP-FE-001 | 20K |
Phase 5: Docker/DevOps
| id | status | description | issue | repo | branch | depends_on | blocks | agent | started_at | completed_at | estimate | used | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SP-DOC-001 | not-started | #399: Docker Compose dev overlay for speech services | #399 | devops | feature/m13-speech-services | SP-CFG-001 | SP-DOC-002 | 10K | |||||
| SP-DOC-002 | not-started | #400: Docker Compose swarm/prod deployment for speech services | #400 | devops | feature/m13-speech-services | SP-DOC-001 | 10K |
Phase 6: Frontend
| id | status | description | issue | repo | branch | depends_on | blocks | agent | started_at | completed_at | estimate | used | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SP-FE-001 | not-started | #402: Frontend voice input component (microphone capture + transcription) | #402 | web | feature/m13-speech-services | SP-EP-001,SP-WS-001 | SP-FE-003 | 25K | |||||
| SP-FE-002 | not-started | #403: Frontend audio playback component for TTS output | #403 | web | feature/m13-speech-services | SP-EP-002 | SP-FE-003 | 20K | |||||
| SP-FE-003 | not-started | #404: Frontend speech settings page (provider selection, voice config) | #404 | web | feature/m13-speech-services | SP-FE-001,SP-FE-002 | SP-E2E-001 | 20K |
Phase 7: Testing + Documentation
| id | status | description | issue | repo | branch | depends_on | blocks | agent | started_at | completed_at | estimate | used | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SP-E2E-001 | not-started | #405: E2E integration tests for speech services | #405 | api | feature/m13-speech-services | SP-EP-001,SP-EP-002,SP-WS-001,SP-FE-003 | SP-DOCS-001 | 25K | |||||
| SP-DOCS-001 | not-started | #406: Documentation - Speech services architecture, API, and deployment | #406 | docs | feature/m13-speech-services | SP-E2E-001 | 15K |