feat(#158): Implement issue parser agent

Add AI-powered issue metadata parser using Anthropic Sonnet model.

- Parse issue markdown to extract: estimated_context, difficulty, assigned_agent, blocks, blocked_by
- Implement in-memory caching to avoid duplicate API calls
- Graceful fallback to defaults on parse failures
- Add comprehensive test suite (9 test cases)
- 95% test coverage (exceeds 85% requirement)
- Add ANTHROPIC_API_KEY to config
- Update documentation and add .env.example

Fixes #158

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-01 17:50:35 -06:00
parent d54c65360a
commit dad4b68f66
8 changed files with 689 additions and 10 deletions
--- a/docs/scratchpads/158-issue-parser.md
+++ b/docs/scratchpads/158-issue-parser.md
@@ -0,0 +1,109 @@
+# Issue #158: Implement issue parser agent
+
+## Objective
+
+Create an AI agent using Anthropic's Sonnet model that parses Gitea issue markdown bodies to extract structured metadata for autonomous task coordination.
+
+## Approach
+
+### 1. Dependencies
+
+- Add `anthropic` package to pyproject.toml
+- Add `ANTHROPIC_API_KEY` to config.py
+
+### 2. Data Models (src/models.py)
+
+- `IssueMetadata`: Pydantic model for parsed metadata
+  - `estimated_context`: int (tokens)
+  - `difficulty`: str (easy/medium/hard)
+  - `assigned_agent`: str (sonnet/haiku/opus/glm)
+  - `blocks`: list[int] (issue numbers this blocks)
+  - `blocked_by`: list[int] (issue numbers blocking this)
+
+### 3. Parser Agent (src/parser.py)
+
+- `parse_issue_metadata(issue_body: str, issue_number: int) -> IssueMetadata`
+- Uses the Anthropic API with the Claude Sonnet 4.5 model
+- Structured JSON extraction via prompt
+- Cache results in a simple in-memory dict (issue_number -> metadata)
+- Graceful fallback to defaults on parse failure
+
+### 4. Integration
+
+- Update `webhook.py` to call parser in `handle_assigned_event()`
+- Log parsed metadata
+
+## Progress
+
+- [x] Create scratchpad
+- [x] Update pyproject.toml with anthropic dependency
+- [x] Create models.py with IssueMetadata (TEST FIRST)
+- [x] Create parser.py with parse function (TEST FIRST)
+- [x] Update config.py with ANTHROPIC_API_KEY
+- [x] Write comprehensive tests (9 test cases)
+- [x] Run quality gates (mypy, ruff, pytest)
+- [x] Verify 95% coverage (exceeds 85% requirement)
+- [x] Create .env.example
+- [x] Update README.md
+- [x] All quality gates pass
+- [ ] Commit changes
+
+## Testing
+
+### Unit Tests (test_parser.py)
+
+- Test parsing complete issue body → valid metadata
+- Test parsing minimal issue body → defaults used
+- Test parsing malformed markdown → graceful failure
+- Test caching (same issue parsed twice = 1 API call)
+- Test different difficulty levels
+- Test blocks/blocked_by extraction
+- Mock Anthropic API for unit tests
+- Integration test with real API (optional, can be skipped if no key)
+
+### Test Cases
+
+1. **Complete issue body** - All fields present
+2. **Minimal issue body** - Only required fields
+3. **Missing Context Estimate** - Default to 50000 tokens
+4. **Missing Difficulty** - Default to "medium"
+5. **Missing Agent** - Default to "sonnet"
+6. **Malformed blocks/blocked_by** - Empty lists
+7. **API failure** - Return defaults with error logged
+8. **Cache hit** - Second parse returns cached result
+
+## Notes
+
+### Default Values
+
+- estimated_context: 50000 (reasonable default for medium issues)
+- difficulty: "medium"
+- assigned_agent: "sonnet"
+- blocks: []
+- blocked_by: []
+
+### Prompt Strategy
+
+Use structured output with clear instructions to extract from markdown sections:
+
+- "Context Estimate" section → estimated_context
+- "Difficulty" section → difficulty
+- "Dependencies" section → blocks, blocked_by
+
+### Performance Target
+
+- Average parse time < 2 seconds
+- Cache to avoid redundant API calls
+- Log token usage for cost tracking
+
+### API Integration
+
+- Use `anthropic.Anthropic()` client
+- Model: `claude-sonnet-4-5-20250929`
+- Max tokens: 1024 (responses are small)
+- Temperature: 0 (deterministic parsing)
+
+## Token Tracking
+
+- Estimated: 46,800 tokens
+- Actual: TBD after implementation