Audio format validation and preprocessing middleware #398

Closed
opened 2026-02-15 07:34:50 +00:00 by jason.woltje · 1 comment
Owner

Description

Create middleware/pipe for validating and preprocessing audio uploads before they reach the speech providers.

Validation Rules

  • Supported MIME types: audio/wav, audio/mp3, audio/mpeg, audio/webm, audio/ogg, audio/flac, audio/x-m4a
  • Max file size: configurable via SPEECH_MAX_UPLOAD_SIZE (default 25MB)
  • Duration limit: configurable via SPEECH_MAX_DURATION_SECONDS (default 600s / 10 min)
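
As a rough illustration of the rules above, here is a minimal TypeScript sketch of the checks. The MIME list, the two environment variable names, and the defaults come from the issue. Everything else is an assumption: the function and interface names are hypothetical, the size variable is assumed to hold a byte count, and "25MB" is read as 25 MiB. It is not the code from commit 7b4fda6.

```typescript
// Sketch only. The MIME list, env variable names, and defaults come from the
// issue; every identifier below is hypothetical.
const SUPPORTED_MIME_TYPES: ReadonlySet<string> = new Set([
  "audio/wav", "audio/mp3", "audio/mpeg", "audio/webm",
  "audio/ogg", "audio/flac", "audio/x-m4a",
]);

interface AudioUpload {
  mimetype: string;
  size: number; // bytes
  durationSeconds?: number; // only known once the file has been probed
}

interface AudioLimits {
  maxUploadBytes: number;
  maxDurationSeconds: number;
}

// Parse a positive number from an env value, falling back to the default.
function positiveOr(raw: string | undefined, fallback: number): number {
  const parsed = Number(raw);
  return raw !== undefined && Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function loadLimits(env: Record<string, string | undefined>): AudioLimits {
  return {
    // Assumption: the variable holds a byte count and "25MB" means 25 MiB.
    maxUploadBytes: positiveOr(env["SPEECH_MAX_UPLOAD_SIZE"], 25 * 1024 * 1024),
    maxDurationSeconds: positiveOr(env["SPEECH_MAX_DURATION_SECONDS"], 600),
  };
}

// Returns a list of human-readable problems; an empty list means the upload passes.
function validateAudio(file: AudioUpload, limits: AudioLimits): string[] {
  const errors: string[] = [];
  // Drop parameters such as ";codecs=opus" before comparing.
  const baseType = (file.mimetype.split(";")[0] ?? "").trim().toLowerCase();
  if (!SUPPORTED_MIME_TYPES.has(baseType)) {
    errors.push(`Unsupported MIME type: ${baseType}`);
  }
  if (file.size > limits.maxUploadBytes) {
    errors.push(`File exceeds ${limits.maxUploadBytes} bytes`);
  }
  if (file.durationSeconds !== undefined && file.durationSeconds > limits.maxDurationSeconds) {
    errors.push(`Audio exceeds ${limits.maxDurationSeconds} seconds`);
  }
  return errors;
}
```

In a Node service the limits would typically be loaded once with `loadLimits(process.env)`, and a pipe could reject the request whenever `validateAudio` returns a non-empty list.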

Preprocessing

  • Convert non-standard formats to WAV if needed (via ffmpeg or similar)
  • Sample rate normalization (16kHz for Whisper)
  • Channel normalization (mono)
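
The preprocessing steps above could be expressed as a single ffmpeg invocation. The sketch below only builds the argument list. The function name is hypothetical, and the choice of 16-bit PCM for the WAV output is an assumption, since the issue specifies only WAV, 16kHz, and mono.

```typescript
// Sketch only: builds ffmpeg arguments for WAV / 16 kHz / mono output.
// buildFfmpegArgs is a hypothetical helper, not taken from the repository.
function buildFfmpegArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",                // overwrite the output file if it already exists
    "-i", inputPath,     // input file in any format ffmpeg can decode
    "-vn",               // ignore any video stream in the container
    "-ar", "16000",      // resample to 16 kHz (the rate the issue gives for Whisper)
    "-ac", "1",          // downmix to a single (mono) channel
    "-c:a", "pcm_s16le", // assumption: 16-bit little-endian PCM inside the WAV
    outputPath,          // a .wav extension lets ffmpeg pick the WAV container
  ];
}
```

The resulting array can be passed to `execFile("ffmpeg", args)` from `node:child_process`. Passing arguments as an array, with no shell string involved, avoids quoting problems when upload paths contain spaces or special characters.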

Acceptance Criteria

  • AudioValidationPipe created
  • MIME type validation
  • File size validation
  • Format conversion utility (optional, if non-standard input needed)
  • Unit tests
jason.woltje added this to the M13-SpeechServices (0.0.13) milestone 2026-02-15 07:34:50 +00:00
Author
Owner

Completed as part of M13-SpeechServices milestone on branch feature/m13-speech-services. SP-MID-001: Audio format validation and preprocessing middleware (commit 7b4fda6, 36 tests). All quality gates passed (lint, typecheck, tests, security).

Dependencies

No dependencies set.

Reference: mosaic/stack#398