Commit Graph

4 Commits

Author SHA1 Message Date
28c9e6fe65 feat(#397): implement WebSocket streaming transcription gateway
All checks were successful
ci/woodpecker/push/api Pipeline was successful
Add SpeechGateway with Socket.IO namespace /speech for real-time
streaming transcription. Supports start-transcription, audio-chunk,
and stop-transcription events with session management, authentication,
and buffer size rate limiting. Includes 29 unit tests covering
authentication, session lifecycle, error handling, cleanup, and
client isolation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:54:41 -06:00
527262af38 feat(#392): create /api/speech/transcribe REST endpoint
All checks were successful
ci/woodpecker/push/api Pipeline was successful
Add SpeechController with POST /api/speech/transcribe for audio
transcription and GET /api/speech/health for provider status.
Uses AudioValidationPipe for file upload validation and returns
results in standard { data: T } envelope.

Includes 10 unit tests covering transcribe with options, error
propagation, and all health status combinations.

Fixes #392

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:47:52 -06:00
3ae9e53bcc feat(#391): implement tiered TTS provider architecture with base class
Add abstract BaseTTSProvider class that implements common OpenAI-compatible
TTS logic using the OpenAI SDK with configurable baseURL. Includes synthesize(),
listVoices(), and isHealthy() methods. Create TTS provider factory that
dynamically registers Kokoro (default), Chatterbox (premium), and Piper
(fallback) providers based on configuration. Update SpeechModule to use
the factory for TTS_PROVIDERS injection token.

Also fixes lint error in speaches-stt.provider.ts (Array<T> -> T[]).

30 tests added (22 base provider + 8 factory), all passing.

Fixes #391
2026-02-15 02:19:46 -06:00
c40373fa3b feat(#389): create SpeechModule with provider abstraction layer
All checks were successful
ci/woodpecker/push/api Pipeline was successful
Add SpeechModule with provider interfaces and service skeleton for
multi-tier TTS fallback (premium -> default -> fallback) and STT
transcription support. Includes 27 unit tests covering provider
selection, fallback logic, and availability checks.

- ISTTProvider interface with transcribe/isHealthy methods
- ITTSProvider interface with synthesize/listVoices/isHealthy methods
- Shared types: SpeechTier, TranscriptionResult, SynthesisResult, etc.
- SpeechService with graceful TTS fallback chain
- NestJS injection tokens (STT_PROVIDER, TTS_PROVIDERS)
- SpeechModule registered in AppModule
- ConfigModule integration via speechConfig registerAs factory

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:09:45 -06:00