# Issue #158: Implement issue parser agent
## Objective
Create an AI agent using Anthropic's Sonnet model that parses Gitea issue markdown bodies to extract structured metadata for autonomous task coordination.
## Approach
### 1. Dependencies
- Add `anthropic` package to pyproject.toml
- Add `ANTHROPIC_API_KEY` to config.py
### 2. Data Models (src/models.py)
- `IssueMetadata`: Pydantic model for parsed metadata
  - `estimated_context`: int (tokens)
  - `difficulty`: str (easy/medium/hard)
  - `assigned_agent`: str (sonnet/haiku/opus/glm)
  - `blocks`: list[int] (issue numbers this blocks)
  - `blocked_by`: list[int] (issue numbers blocking this)
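The model above could be sketched roughly as follows (assuming Pydantic, with the defaults taken from the Notes section below; the exact field constraints in the real `src/models.py` may differ):

```python
from pydantic import BaseModel, Field


class IssueMetadata(BaseModel):
    """Structured metadata parsed from a Gitea issue body (sketch)."""

    estimated_context: int = 50000  # tokens; default for a medium issue
    difficulty: str = "medium"      # easy / medium / hard
    assigned_agent: str = "sonnet"  # sonnet / haiku / opus / glm
    blocks: list[int] = Field(default_factory=list)      # issues this one blocks
    blocked_by: list[int] = Field(default_factory=list)  # issues blocking this one
```

Using `Field(default_factory=list)` avoids a shared mutable default across instances.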
### 3. Parser Agent (src/parser.py)
- `parse_issue_metadata(issue_body: str, issue_number: int) -> IssueMetadata`
- Uses Anthropic API with claude-sonnet-4.5 model
- Structured JSON extraction via prompt
- Cache results using simple in-memory dict (issue_number -> metadata)
- Graceful fallback to defaults on parse failure
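A minimal sketch of the parse/cache/fallback flow. To stay runnable without the Anthropic SDK, this version injects the model call as a callable and returns a plain dict rather than `IssueMetadata`; the real `src/parser.py` presumably calls the API directly:

```python
import json
import logging

logger = logging.getLogger(__name__)

# Simple in-memory cache: issue_number -> metadata dict
_cache: dict[int, dict] = {}

DEFAULTS = {
    "estimated_context": 50000,
    "difficulty": "medium",
    "assigned_agent": "sonnet",
    "blocks": [],
    "blocked_by": [],
}


def parse_issue_metadata(issue_body: str, issue_number: int, call_model) -> dict:
    """Parse issue markdown into metadata, with caching and graceful fallback.

    `call_model` takes the issue body and returns the model's raw JSON reply
    (injected here so the sketch is testable without the Anthropic SDK).
    """
    if issue_number in _cache:  # cache hit: no API call
        return _cache[issue_number]
    try:
        raw = call_model(issue_body)
        data = {**DEFAULTS, **json.loads(raw)}  # model output over defaults
    except Exception as exc:  # API or JSON failure -> defaults, error logged
        logger.error("parse failed for issue #%d: %s", issue_number, exc)
        data = dict(DEFAULTS)
    _cache[issue_number] = data
    return data
```

Note this sketch also caches the fallback result, so a transient API failure is not retried on the next call; whether that is desirable is a design choice.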
### 4. Integration
- Update `webhook.py` to call parser in `handle_assigned_event()`
- Log parsed metadata
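The integration point might look like this sketch, with the parser injected for testability; the payload field names follow Gitea's issue-webhook shape and the handler signature here is an assumption, not the actual `webhook.py` code:

```python
import logging

logger = logging.getLogger(__name__)


def handle_assigned_event(payload: dict, parse) -> None:
    """On issue assignment, parse the issue body and log the metadata.

    `parse` is the parser entry point (e.g. parse_issue_metadata),
    injected so the handler can be tested without an API key.
    """
    issue = payload["issue"]
    meta = parse(issue["body"], issue["number"])
    logger.info("issue #%d metadata: %s", issue["number"], meta)
```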
## Progress
- [x] Create scratchpad
- [x] Update pyproject.toml with anthropic dependency
- [x] Create models.py with IssueMetadata (TEST FIRST)
- [x] Create parser.py with parse function (TEST FIRST)
- [x] Update config.py with ANTHROPIC_API_KEY
- [x] Write comprehensive tests (9 test cases)
- [x] Run quality gates (mypy, ruff, pytest)
- [x] Verify 95% coverage (exceeds 85% requirement)
- [x] Create .env.example
- [x] Update README.md
- [x] All quality gates pass
- [x] Commit changes
## Testing
### Unit Tests (test_parser.py)
- Test parsing complete issue body → valid metadata
- Test parsing minimal issue body → defaults used
- Test parsing malformed markdown → graceful failure
- Test caching (same issue parsed twice = 1 API call)
- Test different difficulty levels
- Test blocks/blocked_by extraction
- Mock Anthropic API for unit tests
- Integration test with real API (optional, can be skipped if no key)
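Mocking the client keeps unit tests free of API keys. A sketch with `unittest.mock`, where `extract` is a hypothetical stand-in for the parser's API call (the real code lives in `src/parser.py`); the response shape mirrors the Anthropic Messages API, where the reply text sits in the first content block:

```python
import json
from unittest.mock import MagicMock

# Fake client: messages.create returns an object whose first content
# block carries the model's JSON reply.
mock_client = MagicMock()
mock_client.messages.create.return_value = MagicMock(
    content=[MagicMock(text='{"difficulty": "hard", "assigned_agent": "opus"}')]
)


def extract(client, issue_body: str) -> dict:
    """Hypothetical helper: one API round-trip, JSON reply parsed to a dict."""
    resp = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=1024,
        temperature=0,
        messages=[{"role": "user", "content": issue_body}],
    )
    return json.loads(resp.content[0].text)


meta = extract(mock_client, "## Difficulty\nhard")
assert meta["difficulty"] == "hard"
mock_client.messages.create.assert_called_once()  # exactly one "API" call
```

The same `assert_called_once()` pattern verifies the caching behavior: parse the same issue twice, then check the mock still saw only one call.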
### Test Cases
1. **Complete issue body** - All fields present
2. **Minimal issue body** - Only required fields
3. **Missing Context Estimate** - Default to reasonable value
4. **Missing Difficulty** - Default to "medium"
5. **Missing Agent** - Default to "sonnet"
6. **Malformed blocks/blocked_by** - Empty lists
7. **API failure** - Return defaults with error logged
8. **Cache hit** - Second parse returns cached result
## Notes
### Default Values
- estimated_context: 50000 (reasonable default for medium issues)
- difficulty: "medium"
- assigned_agent: "sonnet"
- blocks: []
- blocked_by: []
### Prompt Strategy
Use structured output with clear instructions to extract from markdown sections:
- "Context Estimate" section → estimated_context
- "Difficulty" section → difficulty
- "Dependencies" section → blocks, blocked_by
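A prompt along these lines could implement the mapping above; the wording is illustrative, not the project's actual prompt:

```python
# Hypothetical extraction prompt; exact wording is an assumption.
PARSE_PROMPT = """\
Extract metadata from the Gitea issue below. Reply with JSON only,
using exactly these keys:
  "estimated_context": int, from the "Context Estimate" section
  "difficulty": "easy" | "medium" | "hard", from the "Difficulty" section
  "assigned_agent": "sonnet" | "haiku" | "opus" | "glm"
  "blocks": [int] and "blocked_by": [int], from the "Dependencies" section
If a section is missing, fall back to the documented default.

Issue body:
{issue_body}
"""

prompt = PARSE_PROMPT.format(issue_body="## Difficulty\nhard")
```

Pinning the output to an exact key set makes the JSON reply easy to validate with the `IssueMetadata` model.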
### Performance Target
- Average parse time < 2 seconds
- Cache to avoid redundant API calls
- Log token usage for cost tracking
### API Integration
- Use `anthropic.Anthropic()` client
- Model: `claude-sonnet-4.5-20250929`
- Max tokens: 1024 (responses are small)
- Temperature: 0 (deterministic parsing)
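Collected as request parameters, the settings above look like this; the call itself is left commented because it needs the `anthropic` package and an API key:

```python
# Request settings from this section; the prompt variable is assumed
# to hold the rendered extraction prompt.
request = {
    "model": "claude-sonnet-4.5-20250929",  # pinned model snapshot
    "max_tokens": 1024,                     # metadata replies are small
    "temperature": 0,                       # deterministic parsing
}

# client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
# resp = client.messages.create(
#     **request, messages=[{"role": "user", "content": prompt}]
# )
# resp.usage.input_tokens / resp.usage.output_tokens -> log for cost tracking
```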
## Token Tracking
- Estimated: 46,800 tokens
- Actual: TBD after implementation