Critical fixes:
- Fix FormData field name mismatch (audio -> file) to match backend FileInterceptor
- Add /speech namespace to WebSocket connection URL
- Pass auth token in WebSocket handshake options
- Wrap audio.play() in try-catch for NotAllowedError and DOMException handling
- Replace bare catch block with named error parameter and descriptive message
- Add connect_error and disconnect event handlers to WebSocket
- Update JSDoc to accurately describe batch transcription (not real-time partial)
Important fixes:
- Emit transcription-error before disconnect in gateway auth failures
- Capture MediaRecorder error details and clean up media tracks on error
- Change TtsDefaultConfig.format type from string to AudioFormat
- Define canonical SPEECH_TIERS and AUDIO_FORMATS arrays as single source of truth
- Fix voice count from 54 to 53 in provider, AGENTS.md, and docs
- Fix inaccurate comments (Piper formats, tier prop, SpeachesProvider, TextValidationPipe)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add SpeechGateway with Socket.IO namespace /speech for real-time
streaming transcription. Supports start-transcription, audio-chunk,
and stop-transcription events with session management, authentication,
and buffer size rate limiting. Includes 29 unit tests covering
authentication, session lifecycle, error handling, cleanup, and
client isolation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>