# Issue #199: Implement rate limiting on webhook endpoints

## Objective

Implement rate limiting on webhook and public-facing API endpoints to prevent DoS attacks and ensure system stability under high load conditions.

## Approach

### TDD Implementation Plan

1. **RED**: Write failing tests for rate limiting
   - Test rate limit enforcement (429 status)
   - Test Retry-After header inclusion
   - Test per-IP rate limiting
   - Test per-API-key rate limiting
   - Test that legitimate requests are not blocked
   - Test storage mechanism (Redis/in-memory)
2. **GREEN**: Implement NestJS throttler
   - Install @nestjs/throttler package
   - Configure global rate limits
   - Configure per-endpoint rate limits
   - Add custom guards for per-API-key limiting
   - Integrate with Valkey (Redis) for distributed limiting
   - Add Retry-After headers to 429 responses
3. **REFACTOR**: Optimize and document
   - Extract configuration to environment variables
   - Add documentation
   - Ensure code quality

### Identified Webhook Endpoints

**Stitcher Module** (`apps/api/src/stitcher/stitcher.controller.ts`):

- `POST /stitcher/webhook` - Webhook endpoint for @mosaic bot
- `POST /stitcher/dispatch` - Manual job dispatch endpoint

**Coordinator Integration Module** (`apps/api/src/coordinator-integration/coordinator-integration.controller.ts`):

- `POST /coordinator/jobs` - Create a job from coordinator
- `PATCH /coordinator/jobs/:id/status` - Update job status
- `PATCH /coordinator/jobs/:id/progress` - Update job progress
- `POST /coordinator/jobs/:id/complete` - Mark job as complete
- `POST /coordinator/jobs/:id/fail` - Mark job as failed
- `GET /coordinator/jobs/:id` - Get job details
- `GET /coordinator/health` - Integration health check

### Rate Limit Configuration

**Proposed limits**:

- Global default: 100 requests per minute
- Webhook endpoints: 60 requests per minute per IP
- Coordinator endpoints: 100 requests per minute per API key
- Health endpoints: 300 requests per minute (higher for monitoring)

**Storage**: Use Valkey (Redis-compatible) for distributed rate limiting across multiple API instances.

### Technology Stack

- `@nestjs/throttler` - NestJS rate limiting module
- Valkey (already in project) - Redis-compatible cache for distributed rate limiting
- Custom guards for per-API-key limiting

## Progress

- [x] Create scratchpad
- [x] Identify webhook endpoints requiring rate limiting
- [x] Define rate limit configuration strategy
- [x] Write failing tests for rate limiting (RED phase - TDD)
- [x] Install @nestjs/throttler package
- [x] Implement ThrottlerModule configuration
- [x] Implement custom guards for per-API-key limiting
- [x] Implement ThrottlerValkeyStorageService for distributed rate limiting
- [x] Add rate limiting decorators to endpoints (GREEN phase - TDD)
- [x] Add environment variables for rate limiting configuration
- [x] Verify all tests pass (14/14 tests pass)
- [x] Commit changes
- [ ] Update issue #199

## Testing Plan

### Unit Tests

1. **Rate limit enforcement**
   - Verify 429 status code after exceeding limit
   - Verify requests within limit are allowed
2. **Retry-After header**
   - Verify header is present in 429 responses
   - Verify header value is correct
3. **Per-IP limiting**
   - Verify different IPs have independent limits
   - Verify same IP is rate limited
4. **Per-API-key limiting**
   - Verify different API keys have independent limits
   - Verify same API key is rate limited
5. **Storage mechanism**
   - Verify Redis/Valkey integration works
   - Verify fallback to in-memory if Redis unavailable

### Integration Tests

1. **E2E rate limiting**
   - Test actual HTTP requests hitting rate limits
   - Test rate limits reset after time window

## Environment Variables

```bash
# Rate limiting configuration
RATE_LIMIT_TTL=60                 # Time window in seconds
RATE_LIMIT_GLOBAL_LIMIT=100       # Global requests per window
RATE_LIMIT_WEBHOOK_LIMIT=60       # Webhook endpoint limit
RATE_LIMIT_COORDINATOR_LIMIT=100  # Coordinator endpoint limit
RATE_LIMIT_HEALTH_LIMIT=300       # Health endpoint limit
RATE_LIMIT_STORAGE=redis          # redis or memory
```

## Implementation Summary

### Files Created

1. `/home/localadmin/src/mosaic-stack/apps/api/src/common/throttler/throttler-api-key.guard.ts` - Custom guard for API-key based rate limiting
2. `/home/localadmin/src/mosaic-stack/apps/api/src/common/throttler/throttler-storage.service.ts` - Valkey/Redis storage for distributed rate limiting
3. `/home/localadmin/src/mosaic-stack/apps/api/src/common/throttler/index.ts` - Export barrel file
4. `/home/localadmin/src/mosaic-stack/apps/api/src/stitcher/stitcher.rate-limit.spec.ts` - Rate limiting tests for stitcher endpoints (6 tests)
5. `/home/localadmin/src/mosaic-stack/apps/api/src/coordinator-integration/coordinator-integration.rate-limit.spec.ts` - Rate limiting tests for coordinator endpoints (8 tests)

### Files Modified

1. `/home/localadmin/src/mosaic-stack/apps/api/src/app.module.ts` - Added ThrottlerModule and ThrottlerApiKeyGuard
2. `/home/localadmin/src/mosaic-stack/apps/api/src/stitcher/stitcher.controller.ts` - Added @Throttle decorators (60 req/min)
3. `/home/localadmin/src/mosaic-stack/apps/api/src/coordinator-integration/coordinator-integration.controller.ts` - Added @Throttle decorators (100 req/min, health: 300 req/min)
4. `/home/localadmin/src/mosaic-stack/.env.example` - Added rate limiting environment variables
5. `/home/localadmin/src/mosaic-stack/.env` - Added rate limiting environment variables
6. `/home/localadmin/src/mosaic-stack/apps/api/package.json` - Added @nestjs/throttler dependency

### Test Results

- All 14 rate limiting tests pass (6 stitcher + 8 coordinator)
- Tests verify: rate limit enforcement, Retry-After headers, per-API-key limiting, independent API key tracking
- TDD approach followed: RED (failing tests) → GREEN (implementation) → REFACTOR

### Rate Limits Configured

- Stitcher endpoints: 60 requests/minute per API key
- Coordinator endpoints: 100 requests/minute per API key
- Health endpoint: 300 requests/minute per API key (higher for monitoring)
- Storage: Valkey (Redis) for distributed limiting with in-memory fallback

## Notes

### Why @nestjs/throttler?

- Official NestJS package with good TypeScript support
- Supports Redis for distributed rate limiting
- Flexible per-route configuration
- Built-in guard system
- Active maintenance

### Security Considerations

- Rate limiting by IP can be bypassed by rotating IPs
- Implement per-API-key limiting as primary defense
- Log rate limit violations for monitoring
- Consider implementing progressive delays for repeated violations
- Ensure rate limiting doesn't block legitimate traffic

### Implementation Details

- Use `@Throttle()` decorator for per-endpoint limits
- Use `@SkipThrottle()` to exclude specific endpoints
- Custom ThrottlerGuard to extract API key from X-API-Key header
- Use Valkey connection from existing ValkeyModule

## References

- [NestJS Throttler Documentation](https://docs.nestjs.com/security/rate-limiting)
- [OWASP Rate Limiting Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Denial_of_Service_Cheat_Sheet.html)
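
## Example Sketch: Per-API-Key Tracking and Retry-After

The notes above describe keying limits on the `X-API-Key` header and returning `Retry-After` on 429 responses. The following is a minimal, dependency-free sketch of that idea, not the project's `ThrottlerApiKeyGuard` and not how `@nestjs/throttler` works internally. It assumes a fixed-window counter, an in-memory `Map` standing in for Valkey, and a fallback to the client IP when no API key is sent. The names `RequestLike`, `trackerFor`, and `FixedWindowLimiter` are hypothetical.

```typescript
// Hypothetical sketch: names and the fixed-window strategy are assumptions,
// not taken from the project or from @nestjs/throttler.

interface RequestLike {
  headers: Record<string, string | undefined>; // header names assumed lower-cased
  ip: string;
}

interface Decision {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when the request is allowed
}

// Prefer the API key as the tracking identity; fall back to the client IP.
function trackerFor(req: RequestLike): string {
  const apiKey = req.headers["x-api-key"];
  return apiKey ? `apikey:${apiKey}` : `ip:${req.ip}`;
}

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly ttlSeconds: number;
  // In-memory stand-in for a shared store such as Valkey.
  private readonly windows = new Map<string, { count: number; resetAt: number }>();

  constructor(limit: number, ttlSeconds: number) {
    this.limit = limit;
    this.ttlSeconds = ttlSeconds;
  }

  // nowMs is passed in so the behaviour is deterministic in tests.
  check(tracker: string, nowMs: number): Decision {
    const current = this.windows.get(tracker);

    // No window yet, or the previous window has expired: start a new one.
    if (current === undefined || nowMs >= current.resetAt) {
      this.windows.set(tracker, { count: 1, resetAt: nowMs + this.ttlSeconds * 1000 });
      return { allowed: true, retryAfterSeconds: 0 };
    }

    if (current.count < this.limit) {
      current.count += 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }

    // Over the limit: report whole seconds until the window resets,
    // which is the value a 429 response would carry in Retry-After.
    const retryAfterSeconds = Math.ceil((current.resetAt - nowMs) / 1000);
    return { allowed: false, retryAfterSeconds };
  }
}

// Usage: a limit of 2 requests per 60-second window for one API key.
const limiter = new FixedWindowLimiter(2, 60);
const request: RequestLike = { headers: { "x-api-key": "key-a" }, ip: "203.0.113.7" };
const tracker = trackerFor(request);

limiter.check(tracker, 0);
limiter.check(tracker, 1_000);
const third = limiter.check(tracker, 15_000);
console.log(third); // { allowed: false, retryAfterSeconds: 45 }
```

Passing the clock in as `nowMs` keeps the window-reset behaviour testable without timers. A distributed version would replace the `Map` with atomic increment-and-expire operations in the shared store, which this sketch does not attempt.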