Implement Chatterbox TTS provider (premium tier, voice cloning) #394

Closed
opened 2026-02-15 07:34:28 +00:00 by jason.woltje · 1 comment
Owner

Description

Implement the premium-tier TTS provider using Chatterbox (Resemble AI) for voice cloning and maximum quality.

Details

  • MIT license
  • Requires a GPU with 8-16 GB of VRAM (an RTX 3060 12 GB fits)
  • 63.75% preferred over ElevenLabs in blind tests
  • Zero-shot voice cloning from seconds of reference audio
  • Emotion exaggeration control
  • 23 languages, cross-language voice transfer
  • Sub-200ms streaming latency
  • OpenAI-compatible /v1/audio/speech endpoint via Chatterbox-TTS-Server
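
As a rough illustration of the OpenAI-compatible endpoint and the emotion exaggeration control listed above, the sketch below builds a JSON body for `POST /v1/audio/speech`. The `model`, `input`, `voice`, and `response_format` fields follow the OpenAI speech API shape. The `exaggeration` field name, the `"chatterbox"` model value, and the `buildSpeechRequest` helper are assumptions made for illustration and are not confirmed by this issue or by Chatterbox-TTS-Server.

```typescript
// Request body for an OpenAI-compatible /v1/audio/speech endpoint.
// `exaggeration` is an ASSUMED extension field, not part of the OpenAI API.
interface SpeechRequest {
  model: string;
  input: string;
  voice: string;
  response_format: "mp3" | "wav";
  exaggeration?: number;
}

// Hypothetical options accepted by the provider.
interface SpeechOptions {
  voice: string;          // assumed: identifier of a reference voice on the server
  exaggeration?: number;  // assumed: non-negative emotion intensity
}

// Builds the request body; no network call is made here.
function buildSpeechRequest(text: string, options: SpeechOptions): SpeechRequest {
  if (text.trim().length === 0) {
    throw new Error("input text must not be empty");
  }
  const request: SpeechRequest = {
    model: "chatterbox", // placeholder model name (assumption)
    input: text,
    voice: options.voice,
    response_format: "wav",
  };
  if (options.exaggeration !== undefined) {
    // Reject values that cannot be a meaningful intensity.
    if (!Number.isFinite(options.exaggeration) || options.exaggeration < 0) {
      throw new Error("exaggeration must be a finite, non-negative number");
    }
    request.exaggeration = options.exaggeration;
  }
  return request;
}
```

The resulting object would be serialized with `JSON.stringify` and sent to the server's `/v1/audio/speech` route. Omitting `exaggeration` leaves the field out entirely, so the server default applies.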

Voice Cloning

Acceptance Criteria

  • ChatterboxTtsProvider registered in SpeechModule
  • Voice cloning from reference audio
  • Emotion exaggeration parameter support
  • Graceful degradation when GPU unavailable
  • Provider marked as optional (premium tier)
  • Unit tests with mocked client
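
One possible reading of the "graceful degradation" and "mocked client" criteria is sketched below: an optional premium client is tried first, and a fallback client is used when the premium one reports itself unhealthy or throws. The `TtsClient` interface, the `OptionalTtsProvider` class, and `mockClient` are hypothetical names for illustration. They are not the actual `ChatterboxTtsProvider` implementation from commit d37c78f.

```typescript
// Hypothetical minimal client contract (assumption, not the project's API).
interface TtsClient {
  isHealthy(): Promise<boolean>;
  synthesize(text: string): Promise<Uint8Array>;
}

interface SynthesisResult {
  audio: Uint8Array;
  tier: "premium" | "fallback";
}

// Tries the premium client first and degrades to the fallback client
// when the premium one is unhealthy or fails.
class OptionalTtsProvider {
  private readonly premium: TtsClient;
  private readonly fallback: TtsClient;

  constructor(premium: TtsClient, fallback: TtsClient) {
    this.premium = premium;
    this.fallback = fallback;
  }

  async synthesize(text: string): Promise<SynthesisResult> {
    try {
      if (await this.premium.isHealthy()) {
        return { audio: await this.premium.synthesize(text), tier: "premium" };
      }
    } catch {
      // Premium tier failed (for example, no GPU); fall through to the fallback.
    }
    return { audio: await this.fallback.synthesize(text), tier: "fallback" };
  }
}

// Mock client for unit tests: returns a one-byte marker instead of real audio.
function mockClient(healthy: boolean, marker: number): TtsClient {
  return {
    isHealthy: async () => healthy,
    synthesize: async () => new Uint8Array([marker]),
  };
}
```

In a test, `new OptionalTtsProvider(mockClient(false, 1), mockClient(true, 2))` stands in for a host without a usable GPU, and the returned `tier` shows which client produced the audio.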
jason.woltje added this to the M13-SpeechServices (0.0.13) milestone 2026-02-15 07:34:28 +00:00
Author
Owner

Completed as part of the M13-SpeechServices milestone on branch feature/m13-speech-services. SP-TTS-003: Chatterbox TTS provider with voice cloning (commit d37c78f, 26 tests). All quality gates passed (lint, typecheck, tests, security).


Reference: mosaic/stack#394