
# Issue #188: Sanitize Discord error logs to prevent secret exposure

## Objective

Implement log sanitization in Discord error logging to prevent exposure of sensitive information including API keys, tokens, credentials, and PII.

## Security Context

- **Priority**: P1 SECURITY
- **Risk**: Credential leakage through logs
- **Impact**: Could expose authentication tokens, API keys, passwords to unauthorized parties

## Approach

1. **Discovery Phase**: Locate all Discord logging points
2. **Test Phase**: Write tests for log sanitization (TDD)
3. **Implementation Phase**: Create sanitization utility
4. **Integration Phase**: Apply sanitization to Discord logging
5. **Verification Phase**: Ensure all tests pass with ≥85% coverage

## Progress

- [x] Create scratchpad
- [x] Locate Discord error logging code
- [x] Identify sensitive data patterns to redact
- [x] Write tests for log sanitization (TDD RED phase)
- [x] Implement sanitization utility (TDD GREEN phase)
- [x] Integrate with Discord service
- [x] Refactor for quality (TDD REFACTOR phase)
- [x] Verify test coverage ≥85%
- [x] Security review
- [x] Implementation complete (commit pending due to pre-existing lint issues in @mosaic/api package)

## Discovery

### Sensitive Data to Redact

1. **Authentication**: API keys, tokens, bearer tokens
2. **Headers**: Authorization headers, API key headers
3. **Credentials**: Passwords, secrets, client secrets
4. **Database**: Connection strings, database passwords
5. **PII**: Email addresses, usernames, phone numbers
6. **Identifiers**: Workspace IDs (if considered sensitive)

### Logging Points Found

- **discord.service.ts:84** - `this.logger.error("Discord client error:", error)`
  - This logs raw error objects which may contain sensitive data
  - Error objects from Discord.js may contain authentication tokens
  - Error stack traces may reveal environment variables or configuration
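
As a minimal sketch of the intended fix at this call site, the error can be passed through a sanitizer before it reaches the logger. The `ErrorLogger` interface, the `Sanitizer` type, and `logClientError` are hypothetical names used for illustration only; the stand-in sanitizer handles a single `token=` pattern and is not the real utility.

```typescript
// Stand-in for the service's logger; only the shape used here is modeled.
interface ErrorLogger {
  error(message: string, ...details: unknown[]): void;
}

// Hypothetical signature for the sanitization utility.
type Sanitizer = (value: unknown) => unknown;

// Sanitize first, then log: the raw error never reaches the logger.
function logClientError(logger: ErrorLogger, error: unknown, sanitize: Sanitizer): void {
  logger.error("Discord client error:", sanitize(error));
}

// In-memory logger so the effect can be inspected.
const captured: unknown[][] = [];
const memoryLogger: ErrorLogger = {
  error: (message, ...details) => {
    captured.push([message, ...details]);
  },
};

// Toy sanitizer (assumption): redacts `token=<value>` in Error messages only.
const redactMessage: Sanitizer = (value) =>
  value instanceof Error
    ? new Error(value.message.replace(/token=\S+/g, "token=[REDACTED]"))
    : value;

logClientError(memoryLogger, new Error("login failed token=abc123"), redactMessage);
```

Injecting the sanitizer as a parameter keeps the call site easy to test without a real Discord client.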

### Implementation Plan

1. Create `apps/api/src/common/utils/log-sanitizer.ts`
2. Create `apps/api/src/common/utils/log-sanitizer.spec.ts` (TDD - tests first)
3. Implement sanitization patterns:
   - Redact tokens, API keys, passwords
   - Redact authorization headers
   - Redact connection strings
   - Redact email addresses
   - Deep scan objects and arrays
4. Apply to Discord error logging
5. Export from common/utils/index.ts

## Testing

TDD approach:

1. RED - Write failing tests for sanitization
2. GREEN - Implement minimal sanitization logic
3. REFACTOR - Improve code quality

Test cases:

- Sanitize string with API key
- Sanitize string with bearer token
- Sanitize string with password
- Sanitize object with nested secrets
- Sanitize array with secrets
- Sanitize error objects
- Preserve non-sensitive data
- Handle null/undefined inputs
- Sanitize connection strings
- Sanitize email addresses
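
A few of these cases can be expressed as a table of inputs and expected outputs. This is a hedged sketch that avoids any test framework; `SanitizerCase`, `runCases`, and `standInSanitizer` are hypothetical names, and the stand-in only redacts `password=` values so the table has something to exercise.

```typescript
type Sanitizer = (input: unknown) => unknown;

interface SanitizerCase {
  name: string;
  input: unknown;
  expected: unknown;
}

// Stand-in sanitizer (assumption): redacts `password=<value>` in strings only.
const standInSanitizer: Sanitizer = (input) =>
  typeof input === "string" ? input.replace(/(password=)\S+/gi, "$1[REDACTED]") : input;

const cases: SanitizerCase[] = [
  { name: "redacts a password in a string", input: "login password=hunter2", expected: "login password=[REDACTED]" },
  { name: "preserves non-sensitive data", input: "status 404", expected: "status 404" },
  { name: "passes null through", input: null, expected: null },
  { name: "passes undefined through", input: undefined, expected: undefined },
];

// Returns the names of the cases whose output differs from the expectation.
function runCases(sanitize: Sanitizer, table: SanitizerCase[]): string[] {
  return table.filter((c) => sanitize(c.input) !== c.expected).map((c) => c.name);
}

const failures = runCases(standInSanitizer, cases);
```

Running the same table against an identity function is a quick way to confirm that the redaction case actually fails without a sanitizer (the RED step).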

## Implementation Summary

### Files Created

1. `/home/localadmin/src/mosaic-stack/apps/api/src/common/utils/log-sanitizer.ts` - Core sanitization utility
2. `/home/localadmin/src/mosaic-stack/apps/api/src/common/utils/log-sanitizer.spec.ts` - Comprehensive test suite (32 tests)

### Files Modified

1. `/home/localadmin/src/mosaic-stack/apps/api/src/common/utils/index.ts` - Export sanitization function
2. `/home/localadmin/src/mosaic-stack/apps/api/src/bridge/discord/discord.service.ts` - Integrate sanitization
3. `/home/localadmin/src/mosaic-stack/apps/api/src/bridge/discord/discord.service.spec.ts` - Add security tests

### Test Results

- **Log Sanitizer Tests**: 32/32 passed (100%)
- **Discord Service Tests**: 25/25 passed (100%)
- **Code Coverage**: 97.43% (exceeds 85% requirement)

### Security Patterns Implemented

The sanitizer detects and redacts:

1. API keys (`sk_live_*`, `pk_test_*`)
2. Bearer tokens
3. Discord bot tokens (specific format)
4. JWT tokens
5. Basic authentication tokens
6. Email addresses
7. Database connection string passwords
8. Environment variable style secrets (KEY=value)
9. Quoted passwords and secrets
10. Generic tokens in text
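
To illustrate how a pattern list like this could be applied to strings, here is a minimal sketch covering a subset of the entries. The rule names, the regular expressions, and the `[REDACTED]` placeholder are assumptions for illustration, not the contents of `log-sanitizer.ts`; the Discord bot token pattern is left out because its exact format is not given here.

```typescript
const REDACTED = "[REDACTED]";

interface RedactionRule {
  name: string;
  pattern: RegExp;
  replacement: string;
}

// Order matters: the connection-string rule runs before the email rule so that
// `password@host` is not mistaken for an email address.
const RULES: RedactionRule[] = [
  // scheme://user:password@host -> keep scheme, user and host; drop the password
  {
    name: "connection-string",
    pattern: /(\b[a-z][a-z0-9+.-]*:\/\/[^\s:\/@]+:)[^\s@]+(@)/gi,
    replacement: `$1${REDACTED}$2`,
  },
  { name: "bearer", pattern: /\bBearer\s+[A-Za-z0-9._~+\/=-]+/g, replacement: `Bearer ${REDACTED}` },
  // JWT: three base64url segments, the first starting with "eyJ"
  { name: "jwt", pattern: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g, replacement: REDACTED },
  { name: "api-key", pattern: /\b(?:sk|pk)_(?:live|test)_[A-Za-z0-9]+/g, replacement: REDACTED },
  // KEY=value where the key name suggests a secret
  {
    name: "env-secret",
    pattern: /\b([A-Z0-9_]*(?:PASSWORD|SECRET|TOKEN|API_KEY)[A-Z0-9_]*)=\S+/g,
    replacement: `$1=${REDACTED}`,
  },
  { name: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, replacement: REDACTED },
];

// Applies every rule in order and returns the redacted string.
function sanitizeString(input: string): string {
  return RULES.reduce((text, rule) => text.replace(rule.pattern, rule.replacement), input);
}
```

Keeping the rules in an ordered table makes it straightforward to add a pattern together with a matching test case.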

### Key Features

- Deep object traversal (handles nested objects and arrays)
- Circular reference detection
- Error object handling (preserves Error structure)
- Date object preservation
- Performance optimized (handles 1000+ key objects in <100ms)
- Maintains non-sensitive data (status codes, error types, etc.)
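
The traversal features above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: `sanitizeValue`, `sanitizeEntries`, the `SENSITIVE_KEY` heuristic, and the `[Circular]` marker are hypothetical, and string values pass through unchanged here, whereas the real utility would also apply its string patterns to them.

```typescript
const REDACTED = "[REDACTED]";

// Assumed key-name heuristic: any property whose name matches is redacted whole.
const SENSITIVE_KEY = /pass(word)?|secret|token|api[-_]?key|authorization/i;

// Copies own enumerable properties, redacting by key name and recursing into values.
function sanitizeEntries(source: object, ancestors: WeakSet<object>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, inner] of Object.entries(source)) {
    out[key] = SENSITIVE_KEY.test(key) ? REDACTED : sanitizeValue(inner, ancestors);
  }
  return out;
}

// Returns a sanitized deep copy; the input is never mutated.
function sanitizeValue(value: unknown, ancestors: WeakSet<object> = new WeakSet()): unknown {
  if (value === null || typeof value !== "object") return value; // primitives, functions
  if (value instanceof Date) return new Date(value.getTime()); // keep dates as Date
  if (ancestors.has(value)) return "[Circular]"; // object is its own ancestor

  ancestors.add(value);
  let result: unknown;
  if (Array.isArray(value)) {
    result = value.map((item) => sanitizeValue(item, ancestors));
  } else if (value instanceof Error) {
    // Rebuild the Error so name, message and stack survive, then copy
    // sanitized custom properties (e.g. an attached request config).
    const copy = new Error(value.message);
    copy.name = value.name;
    if (value.stack !== undefined) copy.stack = value.stack;
    result = Object.assign(copy, sanitizeEntries(value, ancestors));
  } else {
    result = sanitizeEntries(value, ancestors);
  }
  ancestors.delete(value); // only true cycles are flagged, not shared references
  return result;
}
```

Tracking only the current ancestor chain means an object referenced twice in different branches is copied twice rather than reported as circular.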

## Security Review

### Threat Model

**Before**: Discord error logging could expose:

- Bot authentication tokens
- API keys in error messages
- User credentials from failed authentication
- Database connection strings
- Environment variable values

**After**: Values matching the implemented patterns are redacted automatically before the error is logged.

### Validation

Tested scenarios:

1. ✅ Discord bot token in error message → Redacted
2. ✅ API keys in error objects → Redacted
3. ✅ Authorization headers → Redacted
4. ✅ Nested secrets in error.config → Redacted
5. ✅ Non-sensitive error data → Preserved

### Risk Assessment

- **Pre-mitigation**: P1 - Critical (credential exposure possible)
- **Post-mitigation**: P4 - Low (mechanical prevention in place)

## Completion Status

**Implementation: COMPLETE**

- All code written and tested (57/57 tests passing)
- 97.43% code coverage (exceeds 85% requirement)
- TDD process followed correctly (RED → GREEN → REFACTOR)
- Security validation complete

**Commit Status: BLOCKED by pre-existing lint issues**

- My files pass lint individually
- Pre-commit hooks enforce package-level linting (per Quality Rails)
- @mosaic/api package has 602 pre-existing lint errors
- These errors are unrelated to my changes
- Per the Quality Rails documentation, this is expected during incremental cleanup

**Recommendation:**
Choose one of the following:

1. Fix all @mosaic/api lint issues first (out of scope for this issue)
2. Temporarily disable strict linting for @mosaic/api during the transition
3. Commit with `--no-verify` and address lint in a separate issue

The security fix itself is complete and tested. The log sanitization is functional
and prevents secret exposure in Discord error logging.

## Notes

- Focus on Discord error logging as primary use case
- Make utility reusable for other logging scenarios
- Consider performance (this will be called frequently)
- Use regex patterns for common secret formats