feat: rename rails/ to tools/ and add service tool suites (#4)
Co-authored-by: Jason Woltje <jason@diversecanvas.com> Co-committed-by: Jason Woltje <jason@diversecanvas.com>
This commit was merged in pull request #4.
189
tools/quality/PHILOSOPHY.md
Normal file
@@ -0,0 +1,189 @@
# Why Hard Rails Matter

## The Problem We Discovered

In AI-assisted development, we found:

1. **Process adherence fails** - Agents claim to do code review but miss critical issues
2. **Manual review is insufficient** - Even AI-assisted review missed hardcoded passwords and SQL injection
3. **Scale breaks quality** - 50 issues in a single patch release despite explicit QA processes

### Real-World Case Study

**Production patch validation:**

After explicit code review and QA processes, we discovered **50 issues**:

**Security Issues (9):**
- 4 hardcoded passwords committed to repository
- 1 SQL injection vulnerability
- World-readable .env files
- XSS vulnerabilities (CSP unsafe-inline)

**Type Safety Issues (11):**
- TypeScript strict mode DISABLED (`"strict": false`)
- ESLint explicitly ALLOWING any types (`no-explicit-any: 'off'`)
- Missing return types
- Type assertion overuse
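
To make "type assertion overuse" concrete, here is a minimal sketch of the usual alternative: a runtime type guard instead of a blanket `as` cast. The `User` shape and the `isUser` and `parseUser` helpers are hypothetical names chosen for illustration, not code from the audited release.

```typescript
// Hypothetical payload shape, used only for this illustration.
interface User {
  id: number;
  name: string;
}

// Type guard: verifies the shape at runtime instead of asserting it with `as User`.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  // One narrow, local cast so the fields can be inspected; each field is still checked.
  const candidate = value as Record<string, unknown>;
  return typeof candidate["id"] === "number" && typeof candidate["name"] === "string";
}

// Parses untrusted JSON and fails loudly when the shape is wrong.
function parseUser(json: string): User {
  const parsed: unknown = JSON.parse(json);
  if (!isUser(parsed)) {
    throw new TypeError("payload is not a User");
  }
  return parsed;
}
```

With strict mode enabled, `JSON.parse` output held as `unknown` cannot be used as a `User` until the guard has run, so a malformed payload is rejected where it enters the program.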

**Silent Failures (9):**
- Errors swallowed in try/catch blocks
- Functions returning wrong types on error
- No error logging
- Network failures returned as `false` instead of surfaced as errors
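
The last item in this list is easy to picture in code. The sketch below is one possible shape, not the fix that was applied: the `Reachability` type and `checkReachable` function are hypothetical, and the probe is kept synchronous and injected so the example runs without a network.

```typescript
// Three outcomes instead of a boolean: "down" and "could not find out" stay distinct.
type Reachability =
  | { kind: "up" }
  | { kind: "down" }
  | { kind: "unknown"; error: Error };

// `probe` stands in for any fallible lookup (hypothetical; injected for testability).
function checkReachable(probe: () => boolean): Reachability {
  try {
    return probe() ? { kind: "up" } : { kind: "down" };
  } catch (cause) {
    // Keep the failure and its cause rather than swallowing it and returning `false`.
    const error = cause instanceof Error ? cause : new Error(String(cause));
    return { kind: "unknown", error };
  }
}
```

A caller that receives `unknown` has to decide what to do with the error, which is the point: the failure can no longer pass for a legitimate "no".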

**Test Coverage Gaps (10):**
- No test coverage requirements
- No testing framework setup
- Code shipped with 0% coverage

**Build Failures (2):**
- Code committed that doesn't compile
- Tests committed that fail

**Dependency Issues (6):**
- Critical CVEs not caught
- Version conflicts between packages

## The Solution: Mechanical Enforcement

Don't **ask** agents with requests like:
- "Please do code review"
- "Make sure to run tests"
- "Check for security issues"

Instead, **BLOCK** commits that:
- Have type errors
- Contain hardcoded secrets
- Don't pass tests
- Have security vulnerabilities

### Why This Works

**Example: Type Safety**

❌ **Process-based (fails):**
```
Human: "Please avoid using 'any' types"
Agent: "I'll make sure to use proper types"
*Agent uses any types anyway*
```

✅ **Mechanically enforced (works):**
```
Agent writes: const x: any = 123;
Git hook runs: ❌ Error: no-explicit-any
Commit blocked
Agent must fix to proceed
```

The agent doesn't get to **claim** it followed the process. The automated gate **determines** if code is acceptable.
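
The idea that the gate, not the agent, decides can be sketched as a small runner. This is an illustrative outline under assumptions: `GateResult`, `Gate`, and `runGates` are hypothetical names, and real gates would wrap tools such as the type checker or linter. It relies on one known behavior: a Git pre-commit hook that exits non-zero aborts the commit.

```typescript
// Verdict produced by one automated check (hypothetical shape).
interface GateResult {
  gate: string;
  passed: boolean;
  detail?: string;
}

// A gate is any function that runs a check and reports a verdict.
type Gate = () => GateResult;

// Runs every gate and derives the exit code a hook script would return:
// 0 lets the commit through, 1 blocks it.
function runGates(gates: Gate[]): { exitCode: 0 | 1; failures: GateResult[] } {
  const failures = gates.map((gate) => gate()).filter((result) => !result.passed);
  return { exitCode: failures.length === 0 ? 0 : 1, failures };
}
```

Nothing in the runner consults what the author says about the change; only the verdicts count.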

## Design Principles

### 1. Fail Fast

Detect issues at the earliest gate that can catch them: at commit time where possible, then in CI, and never as late as code review or production.

**Timeline:**
- ⚡ Commit time: Type errors, lint errors, secrets → **BLOCKED**
- 🔄 CI time: Build failures, test failures, CVEs → **BLOCKED**
- 👀 Code review: Architecture, design, business logic
- 🚀 Production: (Issues should never reach here)

### 2. Non-Negotiable

No agent can bypass enforcement. No "skip hooks" flag. No emergency override.

If the code doesn't pass gates, it doesn't get committed. Period.

### 3. Portable

Same enforcement across:
- All projects
- All developers (human + AI)
- All environments (local, CI, production)

### 4. Minimal Friction

Auto-fix where possible:
- Prettier formats code automatically
- ESLint --fix corrects simple issues
- Block only when an issue can't be auto-fixed
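
The fix-first, block-second order described above can be sketched as follows. The `Finding` shape and `applyFixesThenGate` are hypothetical; in practice the fixes would come from tools such as Prettier or ESLint's `--fix` mode rather than from inline callbacks.

```typescript
// One reported problem; `fix` is present only when a tool can repair it automatically.
interface Finding {
  rule: string;
  file: string;
  line: number;
  fix?: () => void;
}

// Applies every available fix, then returns only the findings that still need attention.
function applyFixesThenGate(findings: Finding[]): { fixed: number; blocking: Finding[] } {
  let fixed = 0;
  const blocking: Finding[] = [];
  for (const finding of findings) {
    if (finding.fix) {
      finding.fix();
      fixed += 1;
    } else {
      blocking.push(finding);
    }
  }
  return { fixed, blocking };
}
```

The commit is blocked only if `blocking` is non-empty, so formatting noise never stops work while real defects still do.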

### 5. Clear Feedback

When enforcement blocks a commit, tell the agent:
- ❌ What's wrong (type error, lint violation, etc.)
- 📍 Where it is (file:line)
- ✅ How to fix it (expected type, remove 'any', etc.)
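
A message that carries all three parts might be assembled like this. The `Violation` fields and `formatViolation` are hypothetical names for illustration; actual tools emit their own formats.

```typescript
// Everything an agent needs to act on a blocked commit (hypothetical shape).
interface Violation {
  rule: string;
  message: string;
  file: string;
  line: number;
  hint: string;
}

// Renders one violation as three lines: what is wrong, where it is, how to fix it.
function formatViolation(v: Violation): string {
  return [
    `❌ ${v.rule}: ${v.message}`,
    `📍 ${v.file}:${v.line}`,
    `✅ ${v.hint}`,
  ].join("\n");
}
```

Keeping the location in `file:line` form lets the agent jump straight to the offending code, and the hint turns the rejection into a next step.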

## Impact Prediction

Based on the 50-issue production analysis:

| Phase | Enforcement | Issues Prevented (cumulative) |
|-------|-------------|-------------------------------|
| **Phase 1** | Pre-commit + strict mode + ESLint | 25 of 50 (50%) |
| **Phase 2** | + CI expansion + npm audit | 35 of 50 (70%) |
| **Phase 3** | + OWASP + coverage gates | 45 of 50 (90%) |

**The remaining 10%** (5 of 50) require human judgment:
- Architecture decisions
- Business logic correctness
- User experience
- Performance optimization

## Agent Behavior Evolution

### Before Quality Rails
```
Agent: "I've completed the feature and run all tests"
Reality: Code has type errors, no tests written, hardcoded password
Result: 50 issues discovered after code review and QA
```

### After Quality Rails
```
Agent writes code with 'any' type
Git hook: ❌ no-explicit-any
Agent rewrites with proper type
Git hook: ✅ Pass

Agent writes code with hardcoded password
Git hook: ❌ Secret detected
Agent moves to environment variable
Git hook: ✅ Pass

Agent commits without tests
CI: ❌ Coverage below 80%
Agent writes tests
CI: ✅ Pass
```

**The agent learns:** Good code passes gates, bad code is rejected.
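
As a rough illustration of the "Secret detected" step, the sketch below flags lines that assign a quoted literal to a password-like name. It is deliberately naive and purely hypothetical: `SECRET_PATTERN` and `findHardcodedSecrets` are invented for this example, and a dedicated secret scanner would cover far more cases.

```typescript
// Naive pattern (assumption for illustration): a password-like name assigned a quoted literal.
const SECRET_PATTERN = /\b(password|passwd|secret|api[_-]?key)\b\s*[:=]\s*["'][^"']+["']/i;

// Returns the 1-based line numbers that look like hardcoded secrets.
function findHardcodedSecrets(source: string): number[] {
  const flagged: number[] = [];
  const lines = source.split("\n");
  for (let index = 0; index < lines.length; index += 1) {
    if (SECRET_PATTERN.test(lines[index] ?? "")) {
      flagged.push(index + 1);
    }
  }
  return flagged;
}
```

A line such as `const password = "hunter2";` is flagged, while `const password = env.DB_PASSWORD;` passes, which mirrors the move to an environment variable shown in the transcript.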

## Why This Matters for AI Development

AI agents are **consistently bad** at self-enforcement:
- They claim to follow processes
- They **believe** they're following processes
- Output proves otherwise

But AI agents are **good** at responding to mechanical feedback:
- Clear error messages
- Specific line numbers
- Concrete fix requirements

Quality Rails exploits this strength and avoids the weakness.

## Conclusion

**Process compliance:** Agents claim → Output fails
**Mechanical enforcement:** Gates determine → Output succeeds

This is not philosophical. This is pragmatic, based on 50 real issues from production code.

Quality Rails exists because **process-based quality doesn't work at scale with AI agents.**

Mechanical enforcement does.