# Issue Parser Estimation Strategy

**Status:** Proposed (Phase 0 Enhancement)
**Related Issues:** COORD-002 (Issue Parser Agent)
**Priority:** Critical (P0) - Required for real-world usage
## Problem Statement

Not all issues will follow the formatted metadata structure used in COORD-XXX issues. The issue parser must handle:

- **Unformatted issues** - Just a title and description, no metadata
- **Incomplete metadata** - Some fields present, others missing
- **Oversized issues** - Exceed the 50% rule, need decomposition
- **Varying formats** - Different teams use different templates

The parser must make intelligent estimates when metadata is missing.
## Estimation Strategy

### Level 1: Structured Metadata (Best Case)

When an issue has formatted metadata:

```markdown
## Context Estimate
- Files to modify: 3
- Implementation complexity: medium (20000 tokens)
- **Total estimated: 46800 tokens**
- **Recommended agent: glm**

## Difficulty
medium
```

Action:
- Extract directly from markdown
- Confidence: HIGH (95%+)
- Use values as-is
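In practice, Level 1 can be a plain regex pass over the issue body. A minimal sketch, assuming the exact bold-markdown markers shown above; `extract_structured_metadata` and its return shape are illustrative, not part of COORD-002:

```python
import re
from typing import Optional

def extract_structured_metadata(body: str) -> Optional[dict]:
    """Level 1: extract metadata from a formatted issue body.

    Returns None when the expected markers are absent, signalling that
    the parser should fall back to content analysis (Level 2).
    """
    total = re.search(r'\*\*Total estimated:\s*([\d,]+)\s*tokens\*\*', body)
    agent = re.search(r'\*\*Recommended agent:\s*(\w+)\*\*', body)
    difficulty = re.search(r'## Difficulty\s*\n\s*(low|medium|high)', body)
    if not (total and difficulty):
        return None
    return {
        'estimated_context': int(total.group(1).replace(',', '')),
        'difficulty': difficulty.group(1),
        'assigned_agent': agent.group(1) if agent else None,
        'confidence': 95,                 # structured metadata is trusted
        'source': 'structured_metadata',
    }
```

A missing `## Difficulty` block is treated the same as no metadata at all, so partially formatted issues drop through to Level 2 rather than producing a half-populated record.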
### Level 2: Content Analysis (Common Case)

When metadata is missing, analyze the issue content:

#### 2.1 Analyze Title and Description
```python
import json

import anthropic

async def estimate_from_content(issue: dict) -> dict:
    """Estimate metadata from issue content using AI."""
    client = anthropic.AsyncAnthropic()  # async client, to match the awaits below

    response = await client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=2000,
        messages=[{
            "role": "user",
            "content": f"""Analyze this issue and estimate resource requirements.

Title: {issue['title']}

Description:
{issue['body']}

Estimate:

1. **Files to modify**: How many files will likely be touched?
   - Count mentions of specific files, modules, components
   - Look for scope indicators (refactor, add feature, fix bug)

2. **Implementation complexity**:
   - Low: Simple CRUD, config changes, one-file fixes
   - Medium: Multi-file changes, business logic, API development
   - High: Architecture changes, complex refactoring, new systems

3. **Context estimate**:
   - Use formula: (files × 7000) + complexity + tests + docs
   - Low: ~20-40K tokens
   - Medium: ~40-80K tokens
   - High: ~80-150K tokens

4. **Difficulty**: low/medium/high

5. **Confidence**: 0-100% (based on clarity of issue description)

Return JSON:
{{
    "estimated_context": <integer>,
    "difficulty": "low" | "medium" | "high",
    "assigned_agent": "haiku" | "sonnet" | "glm" | "opus",
    "confidence": <integer 0-100>,
    "reasoning": "Brief explanation of estimates"
}}
"""
        }]
    )

    metadata = json.loads(response.content[0].text)
    metadata['source'] = 'content_analysis'  # record where the estimate came from
    return metadata
```
Confidence factors:
| Factor | High Confidence | Low Confidence |
|---|---|---|
| Description length | >500 chars, detailed | <100 chars, vague |
| Specific mentions | Files, modules, APIs named | Generic "fix the thing" |
| Acceptance criteria | Clear checklist | None provided |
| Technical details | Stack traces, logs, examples | "It's broken" |
| Scope clarity | Well-defined boundaries | Open-ended |
Confidence scoring:
```python
import re

def calculate_confidence(issue: dict, analysis: dict) -> int:
    """Calculate confidence score 0-100."""
    score = 50  # start at neutral

    # Description length
    if len(issue['body']) > 500:
        score += 15
    elif len(issue['body']) < 100:
        score -= 20

    # Specific file/module mentions
    code_patterns = r'(`[^`]+`|\.ts|\.py|\.js|src/|components/)'
    mentions = len(re.findall(code_patterns, issue['body']))
    score += min(mentions * 5, 20)

    # Acceptance criteria
    if '- [ ]' in issue['body'] or '- [x]' in issue['body']:
        score += 10

    # Technical details (stack traces, logs, code blocks)
    if '```' in issue['body']:
        score += 10

    # Scope keywords
    scope_keywords = ['refactor', 'implement', 'add', 'fix', 'update']
    if any(kw in issue['title'].lower() for kw in scope_keywords):
        score += 5

    return max(0, min(100, score))
```
Action:
- Use AI to estimate from content
- Confidence: MEDIUM (50-95%)
- Comment the confidence on the issue
Example comment:

```markdown
🤖 Estimated metadata (confidence: 65%):

- Estimated context: 52,000 tokens
- Difficulty: medium
- Recommended agent: glm

📊 Reasoning:
- Mentions 3 components (UserService, AuthMiddleware, tests)
- Requires API changes (medium complexity)
- Has acceptance criteria (+confidence)
- Description is detailed (+confidence)

Note: These are estimates. Actual usage may vary.
```
### Level 3: Minimal Information (Worst Case)

When the issue is very vague:

```
Title: Fix the login bug
Body: Login doesn't work
```

Action:
- Use conservative defaults
- Confidence: LOW (<50%)
- Warn user, suggest more details
Default estimates:

```python
DEFAULT_ESTIMATES = {
    'estimated_context': 50000,   # conservative default
    'difficulty': 'medium',
    'assigned_agent': 'sonnet',   # safe middle-tier agent
    'confidence': 30,
    'reasoning': 'Minimal information provided, using defaults',
}
```
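One way to wire these defaults in is a small merge helper that fills whatever content analysis could not estimate. A sketch under assumptions: `apply_defaults` and its cap-confidence-at-30% policy are illustrative, not specified by COORD-002:

```python
from typing import Optional

def apply_defaults(partial: dict, defaults: Optional[dict] = None) -> dict:
    """Fill missing estimate fields from conservative defaults (Level 3).

    Any core field that had to be defaulted caps confidence at 30%,
    since the estimate is no better than a guess for that dimension.
    """
    defaults = defaults or {
        'estimated_context': 50000, 'difficulty': 'medium',
        'assigned_agent': 'sonnet', 'confidence': 30,
        'reasoning': 'Minimal information provided, using defaults',
    }
    merged = {**defaults, **{k: v for k, v in partial.items() if v is not None}}
    defaulted = [k for k in ('estimated_context', 'difficulty', 'assigned_agent')
                 if partial.get(k) is None]
    if defaulted:
        merged['confidence'] = min(merged['confidence'], 30)
        merged['reasoning'] = f"Defaulted fields: {', '.join(defaulted)}"
    return merged
```

This keeps the low-confidence path mechanical: Level 2 returns whatever it could infer, and the merge decides whether the result still deserves its claimed confidence.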
Example comment:

```markdown
⚠️ Low confidence estimate (30%):

- Estimated context: 50,000 tokens
- Difficulty: medium
- Recommended agent: sonnet

📝 Suggestion: For better estimates, please add:
- Which files/components are affected
- Expected scope (one file? multiple modules?)
- Acceptance criteria or definition of "done"
- Any relevant logs, stack traces, or examples
```
## Oversized Issue Detection & Decomposition

### 50% Rule Enforcement

Before queuing, check whether the issue exceeds 50% of the target agent's context limit:
```python
from typing import List

async def check_and_decompose(issue: dict, metadata: dict) -> List[dict]:
    """Check if issue exceeds 50% rule. If so, decompose."""
    # Get the target agent's context limit
    agent = metadata['assigned_agent']
    agent_limit = AGENT_PROFILES[agent]['context_limit']
    max_issue_size = agent_limit * 0.5

    # Check if oversized
    if metadata['estimated_context'] > max_issue_size:
        logger.warning(
            f"Issue #{issue['number']} exceeds 50% rule: "
            f"{metadata['estimated_context']} > {max_issue_size}"
        )

        # Decompose into sub-issues
        sub_issues = await decompose_epic(issue, metadata)

        # Comment on parent issue
        await gitea_client.comment_on_issue(
            issue['number'],
            f"⚠️ Issue exceeds 50% rule ({metadata['estimated_context']:,} tokens)\n\n"
            f"Auto-decomposing into {len(sub_issues)} sub-issues...\n\n"
            f"This issue will be converted to an EPIC tracking the sub-issues."
        )

        # Label as epic
        await gitea_client.add_label(issue['number'], 'epic')
        return sub_issues
    else:
        # Single issue, good to go
        return [issue]
```
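`check_and_decompose` assumes an `AGENT_PROFILES` table defined elsewhere. A minimal sketch of its assumed shape - the context limits here are placeholders, except that a 200K Opus window is consistent with the "100K limit" (50% of 200K) cited in the decomposition example later in this document:

```python
# Illustrative agent profile table assumed by check_and_decompose().
# All numbers are placeholders, not confirmed values.
AGENT_PROFILES = {
    'haiku':  {'context_limit': 200_000, 'cost_tier': 'low'},
    'sonnet': {'context_limit': 200_000, 'cost_tier': 'medium'},
    'glm':    {'context_limit': 128_000, 'cost_tier': 'low'},
    'opus':   {'context_limit': 200_000, 'cost_tier': 'high'},
}

def max_issue_tokens(agent: str) -> int:
    """50% rule: an issue may use at most half the agent's context window."""
    return AGENT_PROFILES[agent]['context_limit'] // 2
```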
### Automatic Epic Decomposition

When an issue is oversized, use AI to break it down:
```python
import json

import anthropic
from typing import List

async def decompose_epic(issue: dict, metadata: dict) -> List[dict]:
    """Decompose oversized issue into sub-issues."""
    client = anthropic.AsyncAnthropic()  # async client, to match the awaits below

    # Get max issue size for target agent
    agent = metadata['assigned_agent']
    max_size = AGENT_PROFILES[agent]['context_limit'] // 2

    response = await client.messages.create(
        model="claude-opus-4-5",  # use Opus for decomposition
        max_tokens=4000,
        messages=[{
            "role": "user",
            "content": f"""This issue is too large ({metadata['estimated_context']:,} tokens)
and must be broken into smaller sub-issues.

**Original Issue:**
Title: {issue['title']}
Body: {issue['body']}

**Constraints:**
- Each sub-issue must be ≤ {max_size:,} tokens
- Sub-issues should be independently completable
- Maintain logical order (dependencies)
- Cover all aspects of original issue

**Instructions:**
1. Identify logical breakdown points
2. Create 3-6 sub-issues
3. Estimate context for each
4. Define dependencies (what must come first)

Return JSON array:
[
    {{
        "title": "Sub-issue title",
        "description": "Detailed description",
        "estimated_context": <integer>,
        "difficulty": "low" | "medium" | "high",
        "depends_on": [<array of titles this depends on>]
    }},
    ...
]

Ensure NO sub-issue exceeds {max_size:,} tokens.
"""
        }]
    )

    sub_issues = json.loads(response.content[0].text)

    # Validate all sub-issues fit the 50% rule
    for sub in sub_issues:
        if sub['estimated_context'] > max_size:
            raise ValueError(
                f"Sub-issue '{sub['title']}' still exceeds limit: "
                f"{sub['estimated_context']} > {max_size}"
            )

    # Create sub-issues in Gitea
    created_issues = []
    issue_map = {}  # title -> issue number
    for sub in sub_issues:
        new_issue = await gitea_client.create_issue(
            title=f"[SUB] {sub['title']}",
            body=f"""**Parent Epic:** #{issue['number']} - {issue['title']}

## Objective
{sub['description']}

## Context Estimate
- **Total estimated: {sub['estimated_context']:,} tokens**
- Difficulty: {sub['difficulty']}

## Dependencies
{format_dependencies(sub['depends_on'], issue_map)}

## Notes
Auto-generated from epic decomposition.
""",
            labels=['sub-issue', f"p{issue.get('priority', 1)}"],
            milestone=issue.get('milestone')
        )
        created_issues.append(new_issue)
        issue_map[sub['title']] = new_issue['number']

    # Update parent issue to reference sub-issues
    sub_issue_list = '\n'.join(
        f"- #{i['number']} {i['title']}"
        for i in created_issues
    )
    await gitea_client.comment_on_issue(
        issue['number'],
        f"## Sub-Issues Created\n\n{sub_issue_list}\n\n"
        f"This issue is now an EPIC. Close this when all sub-issues complete."
    )

    return created_issues
```
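`format_dependencies`, used in the sub-issue body above, is not defined in this document. A plausible sketch, assuming titles created earlier in the loop resolve to issue numbers via `issue_map` and later titles are listed verbatim:

```python
def format_dependencies(depends_on: list, issue_map: dict) -> str:
    """Render a sub-issue's dependency list as markdown.

    Titles already created in this batch resolve to issue numbers;
    titles not yet created (later in the loop) are listed by name.
    """
    if not depends_on:
        return "None - can start immediately."
    lines = []
    for title in depends_on:
        if title in issue_map:
            lines.append(f"- Blocked by #{issue_map[title]} ({title})")
        else:
            lines.append(f"- Blocked by: {title} (not yet created)")
    return '\n'.join(lines)
```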
Example decomposition:

```
Original Issue #200: "Refactor authentication system"
Estimated: 180,000 tokens (EXCEEDS 50% rule for Opus: 100K limit)

Auto-decomposed into:
├─ #201 [SUB] Extract auth middleware (45K tokens) → Ready
├─ #202 [SUB] Implement JWT service (38K tokens) → Blocked by #201
├─ #203 [SUB] Add token refresh logic (32K tokens) → Blocked by #202
├─ #204 [SUB] Update auth guards (28K tokens) → Blocked by #202
└─ #205 [SUB] Add integration tests (35K tokens) → Blocked by #201,#202,#203,#204

Total: 178K tokens across 5 sub-issues
Each sub-issue: ≤50K tokens ✅
```
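Because sub-issues carry `depends_on` lists, they can be queued in an order that respects the blocking chain with a standard topological sort. A sketch using the stdlib `graphlib` (Python 3.9+); `queue_order` is an illustrative helper, not part of the spec:

```python
from graphlib import TopologicalSorter

def queue_order(sub_issues: list) -> list:
    """Return sub-issue titles in an order that respects depends_on."""
    # graphlib expects node -> set of predecessors (things that must come first)
    graph = {s['title']: set(s.get('depends_on', [])) for s in sub_issues}
    return list(TopologicalSorter(graph).static_order())
```

`TopologicalSorter` also raises `CycleError` on circular dependencies, which is a useful sanity check on the AI-generated `depends_on` lists before any sub-issue is queued.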
## Confidence-Based Workflow

### High Confidence (95%+)

- Source: Structured metadata in issue body
- Action: Use values directly, queue immediately
- Comment: "✅ Metadata detected, high confidence"

### Medium Confidence (50-95%)

- Source: Content analysis
- Action: Use estimates, queue with note
- Comment: "📊 Estimated from content (confidence: X%)"

### Low Confidence (<50%)

- Source: Minimal info, using defaults
- Action: Use defaults, warn user
- Comment: "⚠️ Low confidence - please add details"
- Optional: Wait for user to update issue before queuing
### Confidence Thresholds

```python
class ConfidenceThresholds:
    """Confidence-based behavior thresholds."""
    AUTO_QUEUE = 60       # >= 60% confidence: queue automatically
    WARN_USER = 30        # 30-59% confidence: queue with a warning
    WAIT_FOR_UPDATE = 30  # < 30% confidence: don't queue, wait for update
```
Workflow:

```python
async def handle_issue_assignment(issue: dict):
    """Handle issue assigned to @mosaic."""
    # Parse metadata (structured or estimated)
    metadata = await parse_issue_metadata(issue)

    # Check confidence
    if metadata['confidence'] >= ConfidenceThresholds.AUTO_QUEUE:
        # High/medium confidence - queue it
        await queue_manager.enqueue(issue, metadata)
        await gitea_client.comment_on_issue(
            issue['number'],
            f"🤖 Added to coordinator queue\n\n"
            f"**Metadata** (confidence: {metadata['confidence']}%):\n"
            f"- Estimated context: {metadata['estimated_context']:,} tokens\n"
            f"- Difficulty: {metadata['difficulty']}\n"
            f"- Assigned agent: {metadata['assigned_agent']}\n\n"
            f"{metadata.get('reasoning', '')}"
        )
    elif metadata['confidence'] >= ConfidenceThresholds.WARN_USER:
        # Low confidence - queue but warn
        await queue_manager.enqueue(issue, metadata)
        await gitea_client.comment_on_issue(
            issue['number'],
            f"⚠️ Low confidence estimate ({metadata['confidence']}%)\n\n"
            f"Using best-guess estimates:\n"
            f"- Estimated context: {metadata['estimated_context']:,} tokens\n"
            f"- Difficulty: {metadata['difficulty']}\n"
            f"- Assigned agent: {metadata['assigned_agent']}\n\n"
            f"💡 For better estimates, please add:\n"
            f"- Which files/components are affected\n"
            f"- Expected scope\n"
            f"- Acceptance criteria\n\n"
            f"Queued anyway - work will proceed with these estimates."
        )
    else:
        # Very low confidence - don't queue
        await gitea_client.comment_on_issue(
            issue['number'],
            f"❌ Cannot queue - insufficient information ({metadata['confidence']}%)\n\n"
            f"Please add more details:\n"
            f"- What files/components need changes?\n"
            f"- What is the expected scope?\n"
            f"- What are the acceptance criteria?\n\n"
            f"Re-assign to @mosaic when ready."
        )
        # Unassign from coordinator
        await gitea_client.unassign_issue(issue['number'], 'mosaic')
```
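The branching above reduces to a small pure function, which is easier to unit-test than the full async handler. A sketch; `queue_decision` is illustrative, with defaults matching the documented 60% / 30% thresholds:

```python
def queue_decision(confidence: int,
                   auto_queue: int = 60,
                   wait_for_update: int = 30) -> str:
    """Pure decision core of handle_issue_assignment().

    Returns one of 'queue', 'queue_with_warning', 'reject'.
    """
    if confidence >= auto_queue:
        return 'queue'
    if confidence >= wait_for_update:
        return 'queue_with_warning'
    return 'reject'
```

Separating the decision from the side effects (enqueue, comment, unassign) also makes it trivial to log or replay decisions when tuning the thresholds.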
## Edge Cases

### Case 1: Issue Updated After Queuing

User adds details after low-confidence queuing:
```python
@app.post('/webhook/gitea')
async def handle_webhook(payload: dict):
    """Handle Gitea webhooks."""
    if payload['action'] == 'edited':
        issue = payload['issue']

        # Check if already in queue
        if queue_manager.has_issue(issue['number']):
            # Re-parse with updated content
            new_metadata = await parse_issue_metadata(issue)

            # Update queue
            queue_manager.update_metadata(issue['number'], new_metadata)

            await gitea_client.comment_on_issue(
                issue['number'],
                f"🔄 Issue updated - re-estimated metadata:\n"
                f"- Estimated context: {new_metadata['estimated_context']:,} tokens\n"
                f"- Difficulty: {new_metadata['difficulty']}\n"
                f"- Confidence: {new_metadata['confidence']}%"
            )
```
### Case 2: Decomposition Creates More Oversized Issues

If a decomposed sub-issue still exceeds the 50% rule:
```python
# Recursive decomposition
async def decompose_epic(issue: dict, metadata: dict, depth: int = 0) -> List[dict]:
    """Decompose with recursion limit."""
    if depth > 2:
        raise ValueError(
            f"Issue #{issue['number']} cannot be decomposed enough. "
            f"Manual intervention required."
        )

    # Recompute the 50% limit for the target agent
    agent = metadata['assigned_agent']
    max_size = AGENT_PROFILES[agent]['context_limit'] // 2

    sub_issues = await ai_decompose(issue, metadata)

    # Check if any sub-issue is still too large
    oversized = [s for s in sub_issues if s['estimated_context'] > max_size]
    if oversized:
        # Recursively decompose oversized sub-issues
        final_issues = []
        for sub in sub_issues:
            if sub['estimated_context'] > max_size:
                # Decompose further
                sub_sub_issues = await decompose_epic(sub, sub, depth + 1)
                final_issues.extend(sub_sub_issues)
            else:
                final_issues.append(sub)
        return final_issues

    return sub_issues
```
### Case 3: No Clear Decomposition

If the AI can't find good breakdown points:
```python
# Comment on the issue, then unassign from the coordinator
await gitea_client.comment_on_issue(
    issue['number'],
    f"❌ Cannot auto-decompose this issue.\n\n"
    f"Estimated at {metadata['estimated_context']:,} tokens "
    f"(exceeds {max_size:,} limit), but no clear breakdown found.\n\n"
    f"**Manual action needed:**\n"
    f"1. Break this into smaller sub-issues manually\n"
    f"2. Assign sub-issues to @mosaic\n"
    f"3. This issue can become an EPIC tracking sub-issues\n\n"
    f"Unassigning from coordinator."
)
await gitea_client.unassign_issue(issue['number'], 'mosaic')
```
## Implementation Checklist

Phase 0 (COORD-002) must include:

- [ ] Structured metadata extraction (existing plan)
- [ ] Content analysis estimation (NEW)
- [ ] Confidence scoring (NEW)
- [ ] Best-guess defaults (NEW)
- [ ] 50% rule validation (NEW)
- [ ] Automatic epic decomposition (NEW)
- [ ] Recursive decomposition handling (NEW)
- [ ] Confidence-based workflow (NEW)
- [ ] Updated-issue handling (NEW)
## Success Criteria

The parser handles all issue types:

- ✅ Formatted issues → High confidence, extract directly
- ✅ Unformatted issues → Medium confidence, estimate from content
- ✅ Vague issues → Low confidence, use defaults + warn
- ✅ Oversized issues → Auto-decompose, create sub-issues
- ✅ Updated issues → Re-parse, update queue

No manual intervention needed for:

- Well-formatted issues
- Clear descriptions (even without metadata)
- Oversized issues (auto-decompose)

Manual intervention only for:

- Very vague issues (<30% confidence)
- Issues that can't be decomposed
- Edge cases requiring human judgment
## Example Scenarios

### Scenario 1: Well-Formatted Issue

```markdown
Issue #300: [COORD-020] Implement user profile caching

## Context Estimate
- Files: 4
- Total: 52,000 tokens
- Agent: glm

## Difficulty
medium
```

Result:
- ✅ Extract directly
- Confidence: 95%
- Queue immediately
### Scenario 2: Clear But Unformatted Issue

```markdown
Issue #301: Add caching to user profile API

Need to cache user profiles to reduce database load.

Files affected:
- src/api/users/users.service.ts
- src/cache/cache.service.ts
- src/api/users/users.controller.ts
- tests/users.service.spec.ts

Acceptance criteria:
- [ ] Cache GET /users/:id requests
- [ ] 5 minute TTL
- [ ] Invalidate on update/delete
- [ ] Add tests
```

Result:
- 📊 Estimate from content
- Files: 4 → 28K base
- Clear scope → Medium complexity (20K)
- Tests mentioned → 10K
- Total: ~58K tokens
- Confidence: 75%
- Queue with note
### Scenario 3: Vague Issue

```markdown
Issue #302: Fix user thing

Users are complaining
```

Result:
- ⚠️ Minimal info
- Use defaults (50K, medium, sonnet)
- Confidence: 25%
- Comment: "Please add details"
- Don't queue (<30% threshold)
- Unassign from @mosaic
### Scenario 4: Oversized Issue

```markdown
Issue #303: Refactor entire authentication system

We need to modernize our auth:
- Replace session-based auth with JWT
- Add OAuth2 support
- Implement refresh tokens
- Add MFA
- Update all protected routes
- Migration for existing users
```

Result:
- 📊 Estimate: 180K tokens
- ⚠️ Exceeds 50% rule (>100K)
- Auto-decompose into sub-issues:
  - #304: Extract JWT service (35K)
  - #305: Add OAuth2 integration (40K)
  - #306: Implement refresh tokens (28K)
  - #307: Add MFA support (32K)
  - #308: Update route guards (22K)
  - #309: User migration script (18K)
- Label #303 as EPIC
- Queue sub-issues
## Conclusion

The issue parser must be robust and intelligent to handle real-world issues:
- ✅ Extract structured metadata when available
- ✅ Estimate from content when missing
- ✅ Use confidence scores to guide behavior
- ✅ Auto-decompose oversized issues
- ✅ Warn users on low confidence
- ✅ Handle edge cases gracefully
This makes the coordinator truly autonomous - it can handle whatever issues users throw at it.
---

**Document Version:** 1.0
**Created:** 2026-01-31
**Status:** Proposed - Update COORD-002
**Priority:** Critical (P0) - Required for real-world usage