# Issue #158: Implement issue parser agent
## Objective
Create an AI agent using Anthropic's Sonnet model that parses Gitea issue markdown bodies to extract structured metadata for autonomous task coordination.
## Approach
### 1. Dependencies
- Add `anthropic` package to pyproject.toml
- Add `ANTHROPIC_API_KEY` to config.py
### 2. Data Models (src/models.py)
- `IssueMetadata`: Pydantic model for parsed metadata
  - `estimated_context`: int (tokens)
  - `difficulty`: str (easy/medium/hard)
  - `assigned_agent`: str (sonnet/haiku/opus/glm)
  - `blocks`: list[int] (issue numbers this blocks)
  - `blocked_by`: list[int] (issue numbers blocking this)
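The model above could be sketched roughly as follows (assuming Pydantic, with the defaults taken from the Notes section below; the exact field constraints in the real `src/models.py` may differ):

```python
from pydantic import BaseModel, Field


class IssueMetadata(BaseModel):
    """Structured metadata parsed from a Gitea issue body (sketch)."""

    estimated_context: int = 50000  # tokens; default for a medium issue
    difficulty: str = "medium"      # easy / medium / hard
    assigned_agent: str = "sonnet"  # sonnet / haiku / opus / glm
    blocks: list[int] = Field(default_factory=list)      # issues this one blocks
    blocked_by: list[int] = Field(default_factory=list)  # issues blocking this one
```

Using `Field(default_factory=list)` avoids a shared mutable default across instances.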
### 3. Parser Agent (src/parser.py)
- `parse_issue_metadata(issue_body: str, issue_number: int) -> IssueMetadata`
- Uses Anthropic API with claude-sonnet-4.5 model
- Structured JSON extraction via prompt
- Cache results using simple in-memory dict (issue_number -> metadata)
- Graceful fallback to defaults on parse failure
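A minimal sketch of the parse/cache/fallback flow. To stay runnable without the Anthropic SDK, this version injects the model call as a callable and returns a plain dict rather than `IssueMetadata`; the real `src/parser.py` presumably calls the API directly:

```python
import json
import logging

logger = logging.getLogger(__name__)

# Simple in-memory cache: issue_number -> metadata dict
_cache: dict[int, dict] = {}

DEFAULTS = {
    "estimated_context": 50000,
    "difficulty": "medium",
    "assigned_agent": "sonnet",
    "blocks": [],
    "blocked_by": [],
}


def parse_issue_metadata(issue_body: str, issue_number: int, call_model) -> dict:
    """Parse issue markdown into metadata, with caching and graceful fallback.

    `call_model` takes the issue body and returns the model's raw JSON reply
    (injected here so the sketch is testable without the Anthropic SDK).
    """
    if issue_number in _cache:  # cache hit: no API call
        return _cache[issue_number]
    try:
        raw = call_model(issue_body)
        data = {**DEFAULTS, **json.loads(raw)}  # model output over defaults
    except Exception as exc:  # API or JSON failure -> defaults, error logged
        logger.error("parse failed for issue #%d: %s", issue_number, exc)
        data = dict(DEFAULTS)
    _cache[issue_number] = data
    return data
```

Note this sketch also caches the fallback result, so a transient API failure is not retried on the next call; whether that is desirable is a design choice.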
### 4. Integration
- Update `webhook.py` to call parser in `handle_assigned_event()`
- Log parsed metadata
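The integration point might look like this sketch, with the parser injected for testability; the payload field names follow Gitea's issue-webhook shape and the handler signature here is an assumption, not the actual `webhook.py` code:

```python
import logging

logger = logging.getLogger(__name__)


def handle_assigned_event(payload: dict, parse) -> None:
    """On issue assignment, parse the issue body and log the metadata.

    `parse` is the parser entry point (e.g. parse_issue_metadata),
    injected so the handler can be tested without an API key.
    """
    issue = payload["issue"]
    meta = parse(issue["body"], issue["number"])
    logger.info("issue #%d metadata: %s", issue["number"], meta)
```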
## Progress
- [x] Create scratchpad
- [x] Update pyproject.toml with anthropic dependency
- [x] Create models.py with IssueMetadata (TEST FIRST)
- [x] Create parser.py with parse function (TEST FIRST)
- [x] Update config.py with ANTHROPIC_API_KEY
- [x] Write comprehensive tests (9 test cases)
- [x] Run quality gates (mypy, ruff, pytest)
- [x] Verify 95% coverage (exceeds 85% requirement)
- [x] Create .env.example
- [x] Update README.md
- [x] All quality gates pass
- [x] Commit changes
## Testing
### Unit Tests (test_parser.py)
- Test parsing complete issue body → valid metadata
- Test parsing minimal issue body → defaults used
- Test parsing malformed markdown → graceful failure
- Test caching (same issue parsed twice = 1 API call)
- Test different difficulty levels
- Test blocks/blocked_by extraction
- Mock Anthropic API for unit tests
- Integration test with real API (optional, can be skipped if no key)
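Mocking the client keeps unit tests free of API keys. A sketch with `unittest.mock`, where `extract` is a hypothetical stand-in for the parser's API call (the real code lives in `src/parser.py`); the response shape mirrors the Anthropic Messages API, where the reply text sits in the first content block:

```python
import json
from unittest.mock import MagicMock

# Fake client: messages.create returns an object whose first content
# block carries the model's JSON reply.
mock_client = MagicMock()
mock_client.messages.create.return_value = MagicMock(
    content=[MagicMock(text='{"difficulty": "hard", "assigned_agent": "opus"}')]
)


def extract(client, issue_body: str) -> dict:
    """Hypothetical helper: one API round-trip, JSON reply parsed to a dict."""
    resp = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=1024,
        temperature=0,
        messages=[{"role": "user", "content": issue_body}],
    )
    return json.loads(resp.content[0].text)


meta = extract(mock_client, "## Difficulty\nhard")
assert meta["difficulty"] == "hard"
mock_client.messages.create.assert_called_once()  # exactly one "API" call
```

The same `assert_called_once()` pattern verifies the caching behavior: parse the same issue twice, then check the mock still saw only one call.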
### Test Cases
1. **Complete issue body** - All fields present
2. **Minimal issue body** - Only required fields
3. **Missing Context Estimate** - Default to reasonable value
4. **Missing Difficulty** - Default to "medium"
5. **Missing Agent** - Default to "sonnet"
6. **Malformed blocks/blocked_by** - Empty lists
7. **API failure** - Return defaults with error logged
8. **Cache hit** - Second parse returns cached result
## Notes
### Default Values
- estimated_context: 50000 (reasonable default for medium issues)
- difficulty: "medium"
- assigned_agent: "sonnet"
- blocks: []
- blocked_by: []
### Prompt Strategy
Use structured output with clear instructions to extract from markdown sections:
- "Context Estimate" section → estimated_context
- "Difficulty" section → difficulty
- "Dependencies" section → blocks, blocked_by
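A prompt along these lines could implement the mapping above; the wording is illustrative, not the project's actual prompt:

```python
# Hypothetical extraction prompt; exact wording is an assumption.
PARSE_PROMPT = """\
Extract metadata from the Gitea issue below. Reply with JSON only,
using exactly these keys:
  "estimated_context": int, from the "Context Estimate" section
  "difficulty": "easy" | "medium" | "hard", from the "Difficulty" section
  "assigned_agent": "sonnet" | "haiku" | "opus" | "glm"
  "blocks": [int] and "blocked_by": [int], from the "Dependencies" section
If a section is missing, fall back to the documented default.

Issue body:
{issue_body}
"""

prompt = PARSE_PROMPT.format(issue_body="## Difficulty\nhard")
```

Pinning the output to an exact key set makes the JSON reply easy to validate with the `IssueMetadata` model.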
### Performance Target
- Average parse time < 2 seconds
- Cache to avoid redundant API calls
- Log token usage for cost tracking
### API Integration
- Use `anthropic.Anthropic()` client
- Model: `claude-sonnet-4.5-20250929`
- Max tokens: 1024 (responses are small)
- Temperature: 0 (deterministic parsing)
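Collected as request parameters, the settings above look like this; the call itself is left commented because it needs the `anthropic` package and an API key:

```python
# Request settings from this section; the prompt variable is assumed
# to hold the rendered extraction prompt.
request = {
    "model": "claude-sonnet-4.5-20250929",  # pinned model snapshot
    "max_tokens": 1024,                     # metadata replies are small
    "temperature": 0,                       # deterministic parsing
}

# client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
# resp = client.messages.create(
#     **request, messages=[{"role": "user", "content": prompt}]
# )
# resp.usage.input_tokens / resp.usage.output_tokens -> log for cost tracking
```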
## Token Tracking
- Estimated: 46,800 tokens
- Actual: TBD after implementation