fix(#199): implement rate limiting on webhook endpoints
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed

Implements comprehensive rate limiting on all webhook and coordinator endpoints
to prevent DoS attacks. Follows TDD protocol with 14 passing tests.

Implementation:
- Added @nestjs/throttler package for rate limiting
- Created ThrottlerApiKeyGuard for per-API-key rate limiting
- Created ThrottlerValkeyStorageService for distributed rate limiting via Redis
- Configured rate limits on stitcher endpoints (60 req/min)
- Configured rate limits on coordinator endpoints (100 req/min)
- Higher limits for health endpoints (300 req/min for monitoring)
- Added environment variables for rate limit configuration
- Rate limiting logs violations for security monitoring

Rate Limits:
- Stitcher webhooks: 60 requests/minute per API key
- Coordinator endpoints: 100 requests/minute per API key
- Health endpoints: 300 requests/minute (higher for monitoring)

Storage:
- Uses Valkey (Redis) for distributed rate limiting across API instances
- Falls back to in-memory storage if Redis unavailable

Testing:
- 14 comprehensive rate limiting tests (all passing)
- Tests verify: rate limit enforcement, Retry-After headers, per-API-key isolation
- TDD approach: RED (failing tests) → GREEN (implementation) → REFACTOR

Additional improvements:
- Type safety improvements in websocket gateway
- Array type notation standardization in coordinator service

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
Jason Woltje
2026-02-02 13:07:16 -06:00
parent 210b3d2e8f
commit 41d56dadf0
14 changed files with 990 additions and 11 deletions

View File

@@ -170,6 +170,30 @@ GITEA_WEBHOOK_SECRET=REPLACE_WITH_RANDOM_WEBHOOK_SECRET
# The coordinator service uses this key to authenticate with the API
COORDINATOR_API_KEY=REPLACE_WITH_RANDOM_API_KEY_MINIMUM_32_CHARS
# ======================
# Rate Limiting
# ======================
# Rate limiting prevents DoS attacks on webhook and API endpoints
# TTL is in seconds, limits are per TTL window
# Global rate limit (applies to all endpoints unless overridden)
RATE_LIMIT_TTL=60 # Time window in seconds
RATE_LIMIT_GLOBAL_LIMIT=100 # Requests per window
# Webhook endpoints (/stitcher/webhook, /stitcher/dispatch)
RATE_LIMIT_WEBHOOK_LIMIT=60 # Requests per minute
# Coordinator endpoints (/coordinator/*)
RATE_LIMIT_COORDINATOR_LIMIT=100 # Requests per minute
# Health check endpoints (/coordinator/health)
RATE_LIMIT_HEALTH_LIMIT=300 # Requests per minute (higher for monitoring)
# Storage backend for rate limiting (redis or memory)
# redis: Uses Valkey for distributed rate limiting (recommended for production)
# memory: Uses in-memory storage (single instance only, for development)
RATE_LIMIT_STORAGE=redis
# ======================
# Discord Bridge (Optional)
# ======================