docs(design): add Knowledge Module design and implementation plan

- Full design document with architecture, data model, API specs
- 28 implementation issues across 5 phases (~127h total)
- Wiki-link syntax, semantic search, graph visualization
- Integration points for agent access

Ref: memory/2025-01-29-agent-orchestration.md
Jason Woltje
2026-01-29 15:38:50 -06:00
parent f47dd8bc92
commit 91399f597f
3 changed files with 1501 additions and 0 deletions

docs/design/README.md

@@ -0,0 +1,73 @@
# Design Documents
Technical design documents for major Mosaic Stack features.
## Purpose
Design documents serve as:
- **Blueprints** for implementation
- **Reference** for architectural decisions
- **Communication** between team members
- **Historical record** of design evolution
## Document Structure
Each design document should include:
1. **Problem Statement** — What are we solving?
2. **Architecture Overview** — High-level design with diagrams
3. **Database Schema** — Tables, indexes, relationships
4. **API Specifications** — Endpoints, request/response formats
5. **Implementation Plan** — Phased rollout with milestones
6. **Security & Performance** — Considerations and constraints
## Documents
### [Agent Orchestration Layer](./agent-orchestration.md)
**Status:** Design Phase
**Version:** 1.0
**Date:** 2025-01-29
Infrastructure for persistent task management and autonomous agent coordination. Enables long-running background work independent of user sessions.
**Key Features:**
- Task queue with priority scheduling
- Agent health monitoring and automatic recovery
- Checkpoint-based resumption for interrupted work
- Multi-workspace coordination with row-level security
---
### [Knowledge Module](./knowledge-module.md)
**Status:** Design Phase
**Version:** 1.0
**Date:** 2025-01-29
**Issues:** [Implementation Tracker](./knowledge-module-issues.md)
Native knowledge management with wiki-style linking, semantic search, and graph visualization. Enables teams and agents to capture, connect, and query organizational knowledge.
**Key Features:**
- Wiki-style `[[links]]` between entries
- Full-text and semantic (vector) search
- Interactive knowledge graph visualization
- Version history with diff view
- Tag-based organization
---
## Contributing
When creating a new design document:
1. Copy the structure from an existing document
2. Use ASCII diagrams for architecture (keep them simple)
3. Include code examples in TypeScript
4. Specify database schema in SQL (PostgreSQL dialect)
5. Add implementation phases with clear milestones
6. Update this README with a summary
---
**Last Updated:** 2025-01-29


@@ -0,0 +1,670 @@
# Knowledge Module - Implementation Issues
> **Epic:** Knowledge Module
> **Design Doc:** [knowledge-module.md](./knowledge-module.md)
> **Target:** 6 weeks
> **Created:** 2025-01-29
---
## Epic Overview
Build a native knowledge management module for Mosaic Stack with wiki-style linking, semantic search, and graph visualization.
**Labels:** `epic`, `feature`, `knowledge-module`
---
## Phase 1: Foundation (Week 1-2)
### KNOW-001: Database Schema for Knowledge Module
**Priority:** P0
**Estimate:** 4h
**Labels:** `database`, `schema`, `phase-1`
**Description:**
Create Prisma schema and migrations for the Knowledge module.
**Acceptance Criteria:**
- [ ] `KnowledgeEntry` model with all fields
- [ ] `KnowledgeEntryVersion` model for history
- [ ] `KnowledgeLink` model for wiki-links
- [ ] `KnowledgeTag` and `KnowledgeEntryTag` models
- [ ] `KnowledgeEmbedding` model (pgvector ready)
- [ ] All indexes defined
- [ ] Migration runs cleanly
- [ ] Seed data for testing
**Technical Notes:**
- Reference design doc for full schema
- Ensure `@@unique([workspaceId, slug])` constraint
- Add `search_vector` column for full-text search
- pgvector extension may need separate setup
---
### KNOW-002: Entry CRUD API Endpoints
**Priority:** P0
**Estimate:** 6h
**Labels:** `api`, `phase-1`
**Description:**
Implement RESTful API for knowledge entry management.
**Acceptance Criteria:**
- [ ] `POST /api/knowledge/entries` - Create entry
- [ ] `GET /api/knowledge/entries` - List entries (paginated, filterable)
- [ ] `GET /api/knowledge/entries/:slug` - Get single entry
- [ ] `PUT /api/knowledge/entries/:slug` - Update entry
- [ ] `DELETE /api/knowledge/entries/:slug` - Archive entry (soft delete)
- [ ] Workspace isolation enforced
- [ ] Input validation with class-validator
- [ ] OpenAPI/Swagger documentation
**Technical Notes:**
- Follow existing Mosaic API patterns
- Use `@WorkspaceGuard()` for tenant isolation
- Slug generation from title with collision handling
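One way to read "slug generation from title with collision handling" is sketched below. The helper names `slugify` and `uniqueSlug` and the `-2`, `-3` suffix scheme are assumptions for illustration, not part of the design.

```typescript
// Hypothetical helpers; the issue only asks for "slug generation from title
// with collision handling" and does not prescribe a suffix scheme.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse each run of other characters into one hyphen
    .replace(/^-+|-+$/g, ''); // trim leading and trailing hyphens
}

function uniqueSlug(title: string, taken: ReadonlySet<string>): string {
  const base = slugify(title) || 'entry'; // fallback when the title has no usable characters
  if (!taken.has(base)) return base;
  let suffix = 2;
  while (taken.has(`${base}-${suffix}`)) suffix += 1;
  return `${base}-${suffix}`;
}
```

Two concurrent creates can still pick the same slug, so the `@@unique([workspaceId, slug])` constraint remains the final arbiter; the API would retry with the next suffix on a constraint violation.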
---
### KNOW-003: Tag Management API
**Priority:** P0
**Estimate:** 3h
**Labels:** `api`, `phase-1`
**Description:**
Implement tag CRUD and entry-tag associations.
**Acceptance Criteria:**
- [ ] `GET /api/knowledge/tags` - List workspace tags
- [ ] `POST /api/knowledge/tags` - Create tag
- [ ] `PUT /api/knowledge/tags/:slug` - Update tag
- [ ] `DELETE /api/knowledge/tags/:slug` - Delete tag
- [ ] `GET /api/knowledge/tags/:slug/entries` - Entries with tag
- [ ] Entry creation/update accepts tag slugs
- [ ] Auto-create tags if `autoCreateTags: true`
---
### KNOW-004: Basic Markdown Rendering
**Priority:** P0
**Estimate:** 4h
**Labels:** `api`, `rendering`, `phase-1`
**Description:**
Render markdown content to HTML with caching.
**Acceptance Criteria:**
- [ ] Markdown-to-HTML conversion on entry save
- [ ] Support GFM (tables, task lists, strikethrough)
- [ ] Code syntax highlighting (highlight.js or Shiki)
- [ ] Sanitize HTML output (XSS prevention)
- [ ] Cache rendered HTML in `contentHtml` field
- [ ] Invalidate cache on content update
**Technical Notes:**
- Use `marked` or `remark` for parsing
- Wiki-links (`[[...]]`) parsed but not resolved yet (Phase 2)
---
### KNOW-005: Entry List Page UI
**Priority:** P0
**Estimate:** 6h
**Labels:** `frontend`, `phase-1`
**Description:**
Build the knowledge entry list page in the web UI.
**Acceptance Criteria:**
- [ ] List view with title, summary, tags, updated date
- [ ] Filter by status (draft/published/archived)
- [ ] Filter by tag
- [ ] Sort by updated/created/title
- [ ] Pagination
- [ ] Quick search (client-side filter)
- [ ] Create new entry button → editor
- [ ] Responsive design
---
### KNOW-006: Entry Detail/Editor Page UI
**Priority:** P0
**Estimate:** 8h
**Labels:** `frontend`, `editor`, `phase-1`
**Description:**
Build the entry view and edit page.
**Acceptance Criteria:**
- [ ] View mode with rendered markdown
- [ ] Edit mode with markdown editor
- [ ] Split view option (edit + preview)
- [ ] Title editing
- [ ] Tag selection/creation
- [ ] Status dropdown
- [ ] Save/cancel/delete actions
- [ ] Unsaved changes warning
- [ ] Keyboard shortcuts (Cmd+S to save)
**Technical Notes:**
- Consider CodeMirror or Monaco for editor
- May use existing rich-text patterns from Mosaic
---
## Phase 2: Linking (Week 2-3)
### KNOW-007: Wiki-Link Parser
**Priority:** P0
**Estimate:** 4h
**Labels:** `api`, `parsing`, `phase-2`
**Description:**
Parse `[[wiki-link]]` syntax from markdown content.
**Acceptance Criteria:**
- [ ] Extract all `[[...]]` patterns from content
- [ ] Support `[[slug]]` basic syntax
- [ ] Support `[[slug|display text]]` aliased links
- [ ] Support `[[slug#header]]` section links
- [ ] Return structured link data with positions
- [ ] Handle edge cases (nested brackets, escaping)
**Technical Notes:**
```typescript
interface ParsedLink {
  raw: string; // "[[design|Design Doc]]"
  target: string; // "design"
  display: string; // "Design Doc"
  section?: string; // "header" if [[design#header]]
  position: { start: number; end: number };
}
```
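A minimal regex-based sketch of a parser that produces the `ParsedLink` shape above. The function name `parseWikiLinks` is an assumption, and the sketch deliberately skips the harder edge cases listed in the acceptance criteria (nested brackets, escaping, links inside code blocks).

```typescript
interface ParsedLink {
  raw: string;
  target: string;
  display: string;
  section?: string;
  position: { start: number; end: number };
}

// Hypothetical function name. Matches [[target]], [[target#section]],
// [[target|display]] and [[target#section|display]].
function parseWikiLinks(content: string): ParsedLink[] {
  const pattern = /\[\[([^\[\]|#]+)(?:#([^\[\]|]+))?(?:\|([^\[\]]+))?\]\]/g;
  const links: ParsedLink[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(content)) !== null) {
    const target = (match[1] || '').trim();
    const link: ParsedLink = {
      raw: match[0],
      target,
      display: match[3] ? match[3].trim() : target, // default display text is the target
      position: { start: match.index, end: match.index + match[0].length },
    };
    if (match[2]) link.section = match[2].trim();
    links.push(link);
  }
  return links;
}
```

For `See [[design|Design Doc]]`, this yields one link with `target` set to `design` and `display` set to `Design Doc`.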
---
### KNOW-008: Link Resolution Service
**Priority:** P0
**Estimate:** 4h
**Labels:** `api`, `phase-2`
**Description:**
Resolve parsed wiki-links to actual entries.
**Acceptance Criteria:**
- [ ] Resolve by exact slug match
- [ ] Resolve by title match (case-insensitive)
- [ ] Fuzzy match fallback (optional)
- [ ] Mark unresolved links as broken
- [ ] Extract surrounding context for link record
- [ ] Batch resolution for efficiency
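The resolution order above (exact slug, then case-insensitive title, otherwise broken) could look roughly like this when done as a batch. The types `EntryRef` and `ResolvedLink` and the function name are assumptions; the optional fuzzy fallback is left out.

```typescript
// Hypothetical shapes for illustration; not taken from the design doc.
interface EntryRef {
  id: string;
  slug: string;
  title: string;
}

interface ResolvedLink {
  target: string;
  entryId: string | null; // null marks a broken link
  matchedBy: 'slug' | 'title' | 'none';
}

// Resolve every target against one preloaded list of workspace entries,
// so a save needs a single lookup query instead of one per link.
function resolveLinkTargets(targets: string[], entries: EntryRef[]): ResolvedLink[] {
  const bySlug = new Map<string, EntryRef>();
  const byTitle = new Map<string, EntryRef>();
  entries.forEach((entry) => {
    bySlug.set(entry.slug, entry);
    byTitle.set(entry.title.toLowerCase(), entry);
  });

  return targets.map((target): ResolvedLink => {
    const key = target.trim();
    const slugHit = bySlug.get(key);
    if (slugHit) return { target, entryId: slugHit.id, matchedBy: 'slug' };
    const titleHit = byTitle.get(key.toLowerCase());
    if (titleHit) return { target, entryId: titleHit.id, matchedBy: 'title' };
    return { target, entryId: null, matchedBy: 'none' };
  });
}
```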
---
### KNOW-009: Link Storage and Sync
**Priority:** P0
**Estimate:** 4h
**Labels:** `api`, `database`, `phase-2`
**Description:**
Store links in database and keep in sync with content.
**Acceptance Criteria:**
- [ ] On entry save: parse → resolve → store links
- [ ] Remove stale links on update
- [ ] `GET /api/knowledge/entries/:slug/links/outgoing`
- [ ] `GET /api/knowledge/entries/:slug/links/incoming` (backlinks)
- [ ] `GET /api/knowledge/entries/:slug/links/broken`
- [ ] Broken link report for workspace
---
### KNOW-010: Backlinks Display UI
**Priority:** P1
**Estimate:** 3h
**Labels:** `frontend`, `phase-2`
**Description:**
Show incoming links (backlinks) on entry pages.
**Acceptance Criteria:**
- [ ] Backlinks section on entry detail page
- [ ] Show linking entry title + context snippet
- [ ] Click to navigate to linking entry
- [ ] Count badge in sidebar/header
- [ ] Empty state when no backlinks
---
### KNOW-011: Link Autocomplete in Editor
**Priority:** P1
**Estimate:** 6h
**Labels:** `frontend`, `editor`, `phase-2`
**Description:**
Autocomplete suggestions when typing `[[`.
**Acceptance Criteria:**
- [ ] Trigger on `[[` typed in editor
- [ ] Show dropdown with matching entries
- [ ] Search by title and slug
- [ ] Show recent entries first, then alphabetical
- [ ] Insert selected entry as `[[slug]]` or `[[slug|title]]`
- [ ] Keyboard navigation (arrows, enter, escape)
- [ ] Debounced API calls
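The "recent entries first, then alphabetical" ordering can be sketched as a pure function. The name `rankSuggestions` and the idea of passing recent slugs as an ordered list are assumptions; in practice the filtering would more likely happen server-side.

```typescript
interface Suggestion {
  slug: string;
  title: string;
}

// Hypothetical helper: filter by title or slug substring, then order
// recently used entries first (in recency order), then the rest by title.
function rankSuggestions(query: string, entries: Suggestion[], recentSlugs: string[]): Suggestion[] {
  const q = query.trim().toLowerCase();
  const recentRank = new Map<string, number>();
  recentSlugs.forEach((slug, index) => recentRank.set(slug, index));

  return entries
    .filter((e) => e.title.toLowerCase().indexOf(q) !== -1 || e.slug.toLowerCase().indexOf(q) !== -1)
    .sort((a, b) => {
      const ra = recentRank.get(a.slug);
      const rb = recentRank.get(b.slug);
      if (ra !== undefined && rb !== undefined) return ra - rb;
      if (ra !== undefined) return -1;
      if (rb !== undefined) return 1;
      return a.title.localeCompare(b.title);
    });
}
```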
---
### KNOW-012: Render Links in View Mode
**Priority:** P0
**Estimate:** 3h
**Labels:** `frontend`, `rendering`, `phase-2`
**Description:**
Render wiki-links as clickable links in entry view.
**Acceptance Criteria:**
- [ ] `[[slug]]` renders as link to `/knowledge/slug`
- [ ] `[[slug|text]]` shows custom text
- [ ] Broken links styled differently (red, dashed underline)
- [ ] Hover preview (optional, stretch goal)
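A simplified sketch of the rendering rule, working directly on text rather than as a `marked` or `remark` extension. The function names and the CSS class names `wiki-link` and `wiki-link--broken` are assumptions, and `[[slug#header]]` section links are not handled here.

```typescript
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical renderer: replaces [[slug]] and [[slug|text]] with anchors
// pointing at /knowledge/<slug>, tagging unknown slugs as broken.
function renderWikiLinks(content: string, existingSlugs: ReadonlySet<string>): string {
  const pattern = /\[\[([^\[\]|#]+)(?:\|([^\[\]]+))?\]\]/g;
  return content.replace(pattern, (_raw: string, slug: string, text?: string) => {
    const target = slug.trim();
    const label = escapeHtml((text || target).trim());
    const href = `/knowledge/${encodeURIComponent(target)}`;
    const cls = existingSlugs.has(target) ? 'wiki-link' : 'wiki-link wiki-link--broken';
    return `<a class="${cls}" href="${href}">${label}</a>`;
  });
}
```

The broken-link styling (red, dashed underline) would then hang off the second class in the stylesheet.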
---
## Phase 3: Search (Week 3-4)
### KNOW-013: Full-Text Search Setup
**Priority:** P0
**Estimate:** 4h
**Labels:** `database`, `search`, `phase-3`
**Description:**
Set up PostgreSQL full-text search for entries.
**Acceptance Criteria:**
- [ ] Add `tsvector` column to entries table
- [ ] Create GIN index on search vector
- [ ] Weight title (A), summary (B), content (C)
- [ ] Keep the vector current on insert/update (generated column, as in the design doc, or a trigger)
- [ ] Verify search latency against the < 200ms target (NFR1) with test data
---
### KNOW-014: Search API Endpoint
**Priority:** P0
**Estimate:** 4h
**Labels:** `api`, `search`, `phase-3`
**Description:**
Implement search API with full-text search.
**Acceptance Criteria:**
- [ ] `GET /api/knowledge/search?q=...`
- [ ] Return ranked results with snippets
- [ ] Highlight matching terms in snippets
- [ ] Filter by tags, status
- [ ] Pagination
- [ ] Response time < 200ms
---
### KNOW-015: Search UI
**Priority:** P0
**Estimate:** 6h
**Labels:** `frontend`, `search`, `phase-3`
**Description:**
Build search interface in web UI.
**Acceptance Criteria:**
- [ ] Search input in knowledge module header
- [ ] Search results page
- [ ] Highlighted snippets
- [ ] Filter sidebar (tags, status)
- [ ] "No results" state with suggestions
- [ ] Search as you type (debounced)
- [ ] Keyboard shortcut (Cmd+K) to focus search
---
### KNOW-016: pgvector Setup
**Priority:** P1
**Estimate:** 4h
**Labels:** `database`, `vector`, `phase-3`
**Description:**
Set up pgvector extension for semantic search.
**Acceptance Criteria:**
- [ ] Enable pgvector extension in PostgreSQL
- [ ] Create embeddings table with vector column
- [ ] HNSW index for fast similarity search
- [ ] Verify extension works in dev and prod
**Technical Notes:**
- HNSW indexes require pgvector 0.5.0 or later; check which PostgreSQL versions that release supports (PostgreSQL 15+ is the working assumption)
- Consider managed options (Supabase, Neon) if self-hosting is complex
---
### KNOW-017: Embedding Generation Pipeline
**Priority:** P1
**Estimate:** 6h
**Labels:** `api`, `vector`, `phase-3`
**Description:**
Generate embeddings for entries using OpenAI or local model.
**Acceptance Criteria:**
- [ ] Service to generate embeddings from text
- [ ] On entry create/update: queue embedding job
- [ ] Background worker processes queue
- [ ] Store embedding in database
- [ ] Handle API rate limits and errors
- [ ] Config for embedding model selection
**Technical Notes:**
- Start with OpenAI `text-embedding-ada-002`
- Consider local options (sentence-transformers) for cost/privacy
---
### KNOW-018: Semantic Search API
**Priority:** P1
**Estimate:** 4h
**Labels:** `api`, `search`, `vector`, `phase-3`
**Description:**
Implement semantic (vector) search endpoint.
**Acceptance Criteria:**
- [ ] `POST /api/knowledge/search/semantic`
- [ ] Accept natural language query
- [ ] Generate query embedding
- [ ] Find similar entries by cosine similarity
- [ ] Return results with similarity scores
- [ ] Configurable similarity threshold
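For reference, the ranking the endpoint is expected to perform can be written out in plain TypeScript. In the design this work is done by pgvector, whose `<=>` operator returns cosine distance (1 minus the similarity below); the function names and the in-memory candidate list are assumptions for illustration.

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same dimension');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface ScoredEntry {
  slug: string;
  similarity: number;
}

// Hypothetical in-memory equivalent of the semantic search query:
// score, drop results at or below the threshold, sort descending, cap the count.
function rankBySimilarity(
  query: number[],
  candidates: Array<{ slug: string; embedding: number[] }>,
  threshold = 0.7,
  limit = 10,
): ScoredEntry[] {
  return candidates
    .map((c) => ({ slug: c.slug, similarity: cosineSimilarity(query, c.embedding) }))
    .filter((c) => c.similarity > threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```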
---
## Phase 4: Graph (Week 4-5)
### KNOW-019: Graph Data API
**Priority:** P1
**Estimate:** 4h
**Labels:** `api`, `graph`, `phase-4`
**Description:**
API to retrieve knowledge graph data.
**Acceptance Criteria:**
- [ ] `GET /api/knowledge/graph` - Full graph (nodes + edges)
- [ ] `GET /api/knowledge/graph/:slug` - Subgraph centered on entry
- [ ] `GET /api/knowledge/graph/stats` - Graph statistics
- [ ] Include orphan detection
- [ ] Filter by tag, status
- [ ] Limit node count option
---
### KNOW-020: Graph Visualization Component
**Priority:** P1
**Estimate:** 8h
**Labels:** `frontend`, `graph`, `phase-4`
**Description:**
Interactive knowledge graph visualization.
**Acceptance Criteria:**
- [ ] Force-directed graph layout
- [ ] Nodes sized by connection count
- [ ] Nodes colored by status
- [ ] Click node to navigate or show details
- [ ] Zoom and pan controls
- [ ] Layout toggle (force, hierarchical, radial)
- [ ] Renders and stays interactive with 500+ nodes (NFR3 budget: < 500ms for graphs under 1000 nodes)
**Technical Notes:**
- Use D3.js or Cytoscape.js
- Consider WebGL renderer for large graphs
---
### KNOW-021: Entry-Centered Graph View
**Priority:** P2
**Estimate:** 4h
**Labels:** `frontend`, `graph`, `phase-4`
**Description:**
Show mini-graph on entry detail page.
**Acceptance Criteria:**
- [ ] Small graph showing entry + direct connections
- [ ] 1-2 hop neighbors
- [ ] Click to expand or navigate
- [ ] Toggle to show/hide on entry page
---
### KNOW-022: Graph Statistics Dashboard
**Priority:** P2
**Estimate:** 3h
**Labels:** `frontend`, `graph`, `phase-4`
**Description:**
Dashboard showing knowledge base health.
**Acceptance Criteria:**
- [ ] Total entries, links, tags
- [ ] Orphan entry count (no links)
- [ ] Broken link count
- [ ] Average connections per entry
- [ ] Most connected entries
- [ ] Recently updated
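The dashboard numbers follow from the entry and link lists. The sketch below assumes a broken link is represented by a `null` target, which the current schema does not define, and counts "connections" as incoming plus outgoing links; both are assumptions.

```typescript
interface GraphStats {
  nodeCount: number;
  edgeCount: number;
  orphanCount: number; // entries with no links
  brokenLinkCount: number;
  avgConnections: number;
}

// Assumption: targetId === null means the link could not be resolved.
interface LinkRow {
  sourceId: string;
  targetId: string | null;
}

function computeGraphStats(entryIds: string[], links: LinkRow[]): GraphStats {
  const degree = new Map<string, number>();
  entryIds.forEach((id) => degree.set(id, 0));

  let edgeCount = 0;
  let brokenLinkCount = 0;
  links.forEach((link) => {
    if (link.targetId === null) {
      brokenLinkCount += 1;
      return;
    }
    // Ignore links whose endpoints are outside the node set (e.g. archived entries)
    if (!degree.has(link.sourceId) || !degree.has(link.targetId)) return;
    edgeCount += 1;
    degree.set(link.sourceId, (degree.get(link.sourceId) || 0) + 1);
    degree.set(link.targetId, (degree.get(link.targetId) || 0) + 1);
  });

  let orphanCount = 0;
  degree.forEach((count) => {
    if (count === 0) orphanCount += 1;
  });

  const nodeCount = degree.size;
  // Each edge contributes one connection to each of its two endpoints
  const avgConnections = nodeCount === 0 ? 0 : (2 * edgeCount) / nodeCount;
  return { nodeCount, edgeCount, orphanCount, brokenLinkCount, avgConnections };
}
```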
---
## Phase 5: Polish (Week 5-6)
### KNOW-023: Version History API
**Priority:** P1
**Estimate:** 4h
**Labels:** `api`, `versioning`, `phase-5`
**Description:**
API for entry version history.
**Acceptance Criteria:**
- [ ] Create version on each save
- [ ] `GET /api/knowledge/entries/:slug/versions`
- [ ] `GET /api/knowledge/entries/:slug/versions/:v`
- [ ] `POST /api/knowledge/entries/:slug/restore/:v`
- [ ] Version limit per entry (configurable, default 50)
- [ ] Prune old versions
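Pruning reduces to choosing which version numbers fall outside the newest N. A minimal sketch, assuming version numbers increase with each save; the function name is hypothetical.

```typescript
// Returns the version numbers to delete so that only the newest `limit` remain.
function versionsToPrune(versions: number[], limit = 50): number[] {
  if (limit < 1) throw new Error('limit must be at least 1');
  const newestFirst = versions.slice().sort((a, b) => b - a);
  return newestFirst.slice(limit).sort((a, b) => a - b);
}
```

With versions 1 to 5 and a limit of 3, versions 1 and 2 are selected for deletion.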
---
### KNOW-024: Version History UI
**Priority:** P1
**Estimate:** 6h
**Labels:** `frontend`, `versioning`, `phase-5`
**Description:**
UI to browse and restore versions.
**Acceptance Criteria:**
- [ ] Version list sidebar/panel
- [ ] Show version date, author, change note
- [ ] Click to view historical version
- [ ] Diff view between versions
- [ ] Restore button with confirmation
- [ ] Compare any two versions
**Technical Notes:**
- Use diff library for content comparison
- Highlight additions/deletions
---
### KNOW-025: Markdown Import
**Priority:** P2
**Estimate:** 4h
**Labels:** `api`, `import`, `phase-5`
**Description:**
Import existing markdown files into knowledge base.
**Acceptance Criteria:**
- [ ] Upload `.md` file(s)
- [ ] Parse frontmatter for metadata
- [ ] Generate slug from filename or title
- [ ] Resolve wiki-links if target entries exist
- [ ] Report on import results
- [ ] Bulk import from folder/zip
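The first import step is separating the frontmatter block from the markdown body. The sketch below only splits the two; turning the YAML text into metadata would be left to a YAML library. The function name is an assumption.

```typescript
interface SplitDocument {
  frontmatter: string | null; // raw YAML text, or null when the file has none
  body: string;
}

// Hypothetical helper: a frontmatter block must start on the first line
// and is delimited by lines containing only "---".
function splitFrontmatter(source: string): SplitDocument {
  const match = /^---[ \t]*\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)/.exec(source);
  if (!match) return { frontmatter: null, body: source };
  return { frontmatter: match[1] || '', body: source.slice(match[0].length) };
}
```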
---
### KNOW-026: Export Functionality
**Priority:** P2
**Estimate:** 3h
**Labels:** `api`, `export`, `phase-5`
**Description:**
Export entries to markdown/PDF.
**Acceptance Criteria:**
- [ ] Export single entry as markdown
- [ ] Export single entry as PDF
- [ ] Bulk export (all or filtered)
- [ ] Include frontmatter in markdown export
- [ ] Preserve wiki-links in export
---
### KNOW-027: Caching Layer
**Priority:** P1
**Estimate:** 4h
**Labels:** `api`, `performance`, `phase-5`
**Description:**
Implement Valkey caching for knowledge module.
**Acceptance Criteria:**
- [ ] Cache entry JSON
- [ ] Cache rendered HTML
- [ ] Cache graph data
- [ ] Cache search results (short TTL)
- [ ] Proper invalidation on updates
- [ ] Cache hit metrics
---
### KNOW-028: Documentation
**Priority:** P1
**Estimate:** 4h
**Labels:** `docs`, `phase-5`
**Description:**
Document the knowledge module.
**Acceptance Criteria:**
- [ ] User guide for knowledge module
- [ ] API reference (OpenAPI already in place)
- [ ] Wiki-link syntax reference
- [ ] Admin/config documentation
- [ ] Architecture overview for developers
---
## Stretch Goals / Future
### KNOW-029: Real-Time Collaboration
**Priority:** P3
**Labels:** `future`, `collaboration`
**Description:**
Multiple users editing same entry simultaneously.
**Notes:**
- Would require CRDT or OT implementation
- Significant complexity
- Evaluate need before committing
---
### KNOW-030: Entry Templates
**Priority:** P3
**Labels:** `future`, `templates`
**Description:**
Pre-defined templates for common entry types.
**Notes:**
- ADR template
- Design doc template
- Meeting notes template
- Custom templates per workspace
---
### KNOW-031: Attachments
**Priority:** P3
**Labels:** `future`, `attachments`
**Description:**
Upload and embed images/files in entries.
**Notes:**
- S3/compatible storage backend
- Image optimization
- Paste images into editor
---
## Summary
| Phase | Issues | Est. Hours | Focus |
|-------|--------|------------|-------|
| 1 | KNOW-001 to KNOW-006 | 31h | CRUD + Basic UI |
| 2 | KNOW-007 to KNOW-012 | 24h | Wiki-links |
| 3 | KNOW-013 to KNOW-018 | 28h | Search |
| 4 | KNOW-019 to KNOW-022 | 19h | Graph |
| 5 | KNOW-023 to KNOW-028 | 25h | Polish |
| **Total** | 28 issues | ~127h | ~3-4 dev weeks |
---
*Generated by Jarvis • 2025-01-29*


@@ -0,0 +1,758 @@
# Knowledge Module - Design Document
> **Status:** Draft
> **Author:** Agent (Jarvis)
> **Created:** 2025-01-29
> **Related:** [Agent Orchestration](./agent-orchestration.md)
## Problem Statement
Development teams and AI agents working on complex projects need a way to:
1. **Capture decisions** — Why was X chosen over Y?
2. **Track connections** — How does component A relate to concept B?
3. **Search contextually** — Find relevant context without knowing exact keywords
4. **Evolve understanding** — Knowledge changes; track that evolution
5. **Share across boundaries** — Human and agent access to the same knowledge base
### Current Pain Points
- **Scattered documentation** — README, comments, Slack threads, memory files
- **No explicit linking** — Connections exist but aren't captured
- **Agent amnesia** — Each session starts fresh, relies on file search
- **No decision archaeology** — Hard to find *why* something was decided
- **Human/agent mismatch** — Humans browse, agents grep
## Requirements
### Functional Requirements
| ID | Requirement | Priority |
|----|-------------|----------|
| FR1 | Create, read, update, delete knowledge entries | P0 |
| FR2 | Wiki-style linking between entries (`[[link]]` syntax) | P0 |
| FR3 | Tagging and categorization | P0 |
| FR4 | Full-text search | P0 |
| FR5 | Semantic/vector search for agents | P1 |
| FR6 | Graph visualization of connections | P1 |
| FR7 | Version history and diff view | P1 |
| FR8 | Timeline view of changes | P2 |
| FR9 | Import from markdown files | P2 |
| FR10 | Export to markdown/PDF | P2 |
### Non-Functional Requirements
| ID | Requirement | Target |
|----|-------------|--------|
| NFR1 | Search response time | < 200ms |
| NFR2 | Entry render time | < 100ms |
| NFR3 | Graph render (< 1000 nodes) | < 500ms |
| NFR4 | Multi-tenant isolation | Complete |
| NFR5 | API-first design | All features via API |
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────────┐
│                          Mosaic Web UI                          │
├─────────────────────────────────────────────────────────────────┤
│  Knowledge Browser  │  Graph View  │  Search  │  Timeline       │
└─────────┬───────────┴──────┬───────┴────┬─────┴────┬────────────┘
          │                  │            │          │
          ▼                  ▼            ▼          ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Knowledge API (NestJS)                      │
├─────────────────────────────────────────────────────────────────┤
│  EntryController  │  SearchController  │  GraphController       │
│  TagController    │  LinkController    │  VersionController     │
└─────────┬─────────┴─────────┬──────────┴──────────┬─────────────┘
          │                   │                     │
          ▼                   ▼                     ▼
┌──────────────────┐ ┌──────────────────┐  ┌──────────────────────┐
│    PostgreSQL    │ │      Valkey      │  │     Vector Store     │
│                  │ │                  │  │      (pgvector)      │
│ - entries        │ │ - search cache   │  │                      │
│ - entry_versions │ │ - graph cache    │  │ - embeddings         │
│ - entry_links    │ │ - hot entries    │  │ - semantic index     │
│ - tags           │ │                  │  │                      │
└──────────────────┘ └──────────────────┘  └──────────────────────┘
```
## Data Model
### Core Entities
```prisma
// Entry - A single knowledge entry (document/page)
model KnowledgeEntry {
id String @id @default(cuid())
workspaceId String
workspace Workspace @relation(fields: [workspaceId], references: [id])
slug String // URL-friendly identifier
title String
content String @db.Text // Markdown content
contentHtml String? @db.Text // Rendered HTML (cached)
summary String? // Auto-generated or manual summary
status EntryStatus @default(DRAFT)
visibility Visibility @default(PRIVATE)
// Metadata
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
createdBy String
updatedBy String
// Relations
tags KnowledgeEntryTag[]
outgoingLinks KnowledgeLink[] @relation("SourceEntry")
incomingLinks KnowledgeLink[] @relation("TargetEntry")
versions KnowledgeEntryVersion[]
embedding KnowledgeEmbedding?
@@unique([workspaceId, slug])
@@index([workspaceId, status])
@@index([workspaceId, updatedAt])
}
enum EntryStatus {
DRAFT
PUBLISHED
ARCHIVED
}
enum Visibility {
PRIVATE // Only creator
WORKSPACE // All workspace members
PUBLIC // Anyone with link
}
// Version history
model KnowledgeEntryVersion {
id String @id @default(cuid())
entryId String
entry KnowledgeEntry @relation(fields: [entryId], references: [id], onDelete: Cascade)
version Int
title String
content String @db.Text
summary String?
createdAt DateTime @default(now())
createdBy String
changeNote String? // Optional commit message
@@unique([entryId, version]) // the unique constraint already provides the lookup index
}
// Wiki-style links between entries
model KnowledgeLink {
id String @id @default(cuid())
sourceId String
source KnowledgeEntry @relation("SourceEntry", fields: [sourceId], references: [id], onDelete: Cascade)
targetId String
target KnowledgeEntry @relation("TargetEntry", fields: [targetId], references: [id], onDelete: Cascade)
// Link metadata
linkText String // The text used in [[link|display text]]
context String? // Surrounding text for context
createdAt DateTime @default(now())
@@unique([sourceId, targetId])
@@index([sourceId])
@@index([targetId])
}
// Tags for categorization
model KnowledgeTag {
id String @id @default(cuid())
workspaceId String
workspace Workspace @relation(fields: [workspaceId], references: [id])
name String
slug String
color String? // Hex color for UI
description String?
entries KnowledgeEntryTag[]
@@unique([workspaceId, slug])
}
model KnowledgeEntryTag {
entryId String
entry KnowledgeEntry @relation(fields: [entryId], references: [id], onDelete: Cascade)
tagId String
tag KnowledgeTag @relation(fields: [tagId], references: [id], onDelete: Cascade)
@@id([entryId, tagId])
}
// Vector embeddings for semantic search
model KnowledgeEmbedding {
id String @id @default(cuid())
entryId String @unique
entry KnowledgeEntry @relation(fields: [entryId], references: [id], onDelete: Cascade)
embedding Unsupported("vector(1536)") // OpenAI ada-002 dimension
model String // Which model generated this
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
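// Prisma's @@index has no HNSW type, so create the index in a raw SQL migration
// (index name below is illustrative):
// CREATE INDEX idx_knowledge_embedding_hnsw ON knowledge_embeddings USING hnsw (embedding vector_cosine_ops);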
}
```
### Frontmatter Schema
Entries support YAML frontmatter for structured metadata:
```yaml
---
title: Agent Orchestration Design
status: published
tags: [architecture, agents, orchestration]
created: 2025-01-29
updated: 2025-01-29
author: jarvis
related:
- "[[task-queues]]"
- "[[valkey-patterns]]"
decision:
status: accepted
date: 2025-01-29
participants: [jason, jarvis]
supersedes: null
---
```
## API Endpoints
### Entry Management
```
POST /api/knowledge/entries Create entry
GET /api/knowledge/entries List entries (paginated)
GET /api/knowledge/entries/:slug Get entry by slug
PUT /api/knowledge/entries/:slug Update entry
DELETE /api/knowledge/entries/:slug Delete entry (soft delete → archive)
GET /api/knowledge/entries/:slug/versions List versions
GET /api/knowledge/entries/:slug/versions/:v Get specific version
POST /api/knowledge/entries/:slug/restore/:v Restore to version
```
### Search
```
GET /api/knowledge/search?q=... Full-text search
POST /api/knowledge/search/semantic Semantic search (vector)
GET /api/knowledge/search/suggestions Autocomplete suggestions
```
### Graph
```
GET /api/knowledge/graph Full graph (nodes + edges)
GET /api/knowledge/graph/:slug Subgraph centered on entry
GET /api/knowledge/graph/stats Graph statistics
```
### Tags
```
GET /api/knowledge/tags List all tags
POST /api/knowledge/tags Create tag
PUT /api/knowledge/tags/:slug Update tag
DELETE /api/knowledge/tags/:slug Delete tag
GET /api/knowledge/tags/:slug/entries Entries with tag
```
### Links
```
GET /api/knowledge/entries/:slug/links/outgoing Outgoing links
GET /api/knowledge/entries/:slug/links/incoming Incoming links (backlinks)
GET /api/knowledge/entries/:slug/links/broken Broken links
POST /api/knowledge/links/resolve Resolve [[link]] to entry
```
## Link Processing
### Wiki-Link Syntax
The module supports Obsidian-compatible wiki-link syntax:
```markdown
Basic link: [[entry-slug]]
Display text: [[entry-slug|Custom Display Text]]
Header link: [[entry-slug#section-header]]
Block link: [[entry-slug#^block-id]]
```
### Link Resolution Flow
```
┌─────────────────┐
│ Entry Content   │
│ "See [[design]] │
│  for details"   │
└────────┬────────┘
         │ Parse
┌─────────────────┐
│ Extract Links   │
│ [[design]]      │
└────────┬────────┘
         │ Resolve
┌─────────────────┐
│ Find Target     │
│ slug: "design"  │
│ OR title match  │
│ OR fuzzy match  │
└────────┬────────┘
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌───────────┐
│ Found │ │ Not Found │
│       │ │ (broken)  │
└───┬───┘ └─────┬─────┘
    │           │
    ▼           ▼
┌───────────────────────┐
│ Create/Update Link    │
│ Record in entry_links │
│ Mark broken if needed │
└───────────────────────┘
```
### Automatic Link Detection
On entry save:
1. Parse content for `[[...]]` patterns
2. Resolve each link to target entry
3. Update `KnowledgeLink` records
4. Flag broken links for UI warning
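Step 3 amounts to diffing the links parsed from the new content against the rows already stored for the entry. A minimal sketch, keyed by target because the schema allows one `KnowledgeLink` per source and target pair; the type and function names are assumptions.

```typescript
interface LinkInput {
  targetId: string;
  linkText: string;
}

interface LinkDiff {
  toCreate: LinkInput[];
  toUpdate: LinkInput[];
  toDeleteTargetIds: string[];
}

// Hypothetical helper: compare stored link rows for one source entry
// with the resolved links found in its new content.
function diffLinks(stored: LinkInput[], parsed: LinkInput[]): LinkDiff {
  const storedByTarget = new Map<string, LinkInput>();
  stored.forEach((link) => storedByTarget.set(link.targetId, link));

  const seen = new Set<string>();
  const diff: LinkDiff = { toCreate: [], toUpdate: [], toDeleteTargetIds: [] };

  parsed.forEach((link) => {
    if (seen.has(link.targetId)) return; // one row per target; the first mention wins
    seen.add(link.targetId);
    const current = storedByTarget.get(link.targetId);
    if (!current) diff.toCreate.push(link);
    else if (current.linkText !== link.linkText) diff.toUpdate.push(link);
  });

  // Anything stored but no longer mentioned is stale
  stored.forEach((link) => {
    if (!seen.has(link.targetId)) diff.toDeleteTargetIds.push(link.targetId);
  });
  return diff;
}
```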
## Search Implementation
### Full-Text Search (PostgreSQL)
```sql
-- Create search index
ALTER TABLE knowledge_entries
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(summary, '')), 'B') ||
    setweight(to_tsvector('english', coalesce(content, '')), 'C')
  ) STORED;

CREATE INDEX idx_knowledge_search ON knowledge_entries USING GIN(search_vector);

-- Search query
SELECT id, slug, title,
       ts_rank(search_vector, query) AS rank,
       ts_headline('english', content, query) AS snippet
FROM knowledge_entries, plainto_tsquery('english', $1) query
WHERE search_vector @@ query
  AND workspace_id = $2
ORDER BY rank DESC
LIMIT 20;
```
### Semantic Search (pgvector)
```sql
-- Semantic search query
SELECT e.id, e.slug, e.title, e.summary,
       1 - (emb.embedding <=> $1::vector) AS similarity
FROM knowledge_entries e
JOIN knowledge_embeddings emb ON e.id = emb.entry_id
WHERE e.workspace_id = $2
  AND 1 - (emb.embedding <=> $1::vector) > 0.7 -- similarity threshold
ORDER BY emb.embedding <=> $1::vector
LIMIT 10;
```
### Embedding Generation
```typescript
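import OpenAI from 'openai';
import type { KnowledgeEntry } from '@prisma/client'; // assumes the generated Prisma client type

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function generateEmbedding(entry: KnowledgeEntry): Promise<number[]> {
  const text = `${entry.title}\n\n${entry.summary || ''}\n\n${entry.content}`;
  // Use OpenAI or local model
  const response = await openai.embeddings.create({
    model: 'text-embedding-ada-002',
    // Character cap as a rough guard; the model's limit is counted in tokens, not characters
    input: text.slice(0, 8000),
  });
  return response.data[0].embedding;
}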
```
## Graph Visualization
### Data Structure
```typescript
interface KnowledgeGraph {
  nodes: GraphNode[];
  edges: GraphEdge[];
  stats: GraphStats;
}

interface GraphNode {
  id: string;
  slug: string;
  title: string;
  type: 'entry' | 'tag' | 'external';
  status: EntryStatus;
  linkCount: number; // in + out
  tags: string[];
  updatedAt: string;
}

interface GraphEdge {
  id: string;
  source: string; // node id
  target: string; // node id
  type: 'link' | 'tag';
  label?: string;
}

interface GraphStats {
  nodeCount: number;
  edgeCount: number;
  orphanCount: number; // entries with no links
  brokenLinkCount: number;
  avgConnections: number;
}
```
### Graph Query
```sql
-- Get full graph for workspace
WITH nodes AS (
  SELECT
    e.id, e.slug, e.title, 'entry' AS type, e.status,
    (SELECT COUNT(*) FROM knowledge_links l
      WHERE l.source_id = e.id OR l.target_id = e.id) AS link_count,
    e.updated_at
  FROM knowledge_entries e
  WHERE e.workspace_id = $1 AND e.status != 'ARCHIVED'
),
edges AS (
  -- Keep only edges whose endpoints are both in the node set,
  -- so links to archived entries do not dangle
  SELECT
    l.id, l.source_id AS source, l.target_id AS target, 'link' AS type, l.link_text AS label
  FROM knowledge_links l
  JOIN nodes s ON l.source_id = s.id
  JOIN nodes t ON l.target_id = t.id
)
SELECT
  json_build_object(
    -- json_agg returns NULL for an empty set, so fall back to an empty array
    'nodes', (SELECT coalesce(json_agg(nodes), '[]'::json) FROM nodes),
    'edges', (SELECT coalesce(json_agg(edges), '[]'::json) FROM edges)
  ) AS graph;
```
### Frontend Rendering
Use D3.js force-directed graph or Cytoscape.js:
```typescript
// Graph component configuration
const graphConfig = {
  layout: 'force-directed',
  physics: {
    repulsion: 100,
    springLength: 150,
    springStrength: 0.05,
  },
  // Square-root scaling keeps heavily linked nodes from dominating the canvas
  nodeSize: (node: GraphNode) => Math.sqrt(node.linkCount) * 10 + 20,
  nodeColor: (node: GraphNode) => {
    switch (node.status) {
      case 'PUBLISHED': return '#22c55e';
      case 'DRAFT': return '#f59e0b';
      case 'ARCHIVED': return '#6b7280';
      default: return '#94a3b8'; // fallback for tag/external nodes without an entry status
    }
  },
  edgeStyle: {
    color: '#94a3b8',
    width: 1,
    arrows: 'to',
  },
};
```
## Caching Strategy
### Valkey Key Patterns
```
knowledge:{workspaceId}:entry:{slug} Entry cache (JSON)
knowledge:{workspaceId}:entry:{slug}:html Rendered HTML cache
knowledge:{workspaceId}:graph Full graph cache
knowledge:{workspaceId}:graph:{slug} Subgraph cache
knowledge:{workspaceId}:search:{hash} Search result cache
knowledge:{workspaceId}:tags Tag list cache
knowledge:{workspaceId}:recent Recent entries list
```
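Centralizing the key patterns in one helper keeps readers and invalidation code from drifting apart. The sketch below is one possible shape: the `knowledgeKeys` object, the parameter canonicalization, and the small FNV-1a style hash for the `{hash}` segment are all assumptions, since the design does not say how the search hash is derived.

```typescript
// Non-cryptographic hash (FNV-1a over UTF-16 code units), used only to turn
// search parameters into a short key segment. Any stable hash would do.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i += 1) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ('00000000' + hash.toString(16)).slice(-8);
}

const knowledgeKeys = {
  entry: (workspaceId: string, slug: string): string => `knowledge:${workspaceId}:entry:${slug}`,
  entryHtml: (workspaceId: string, slug: string): string => `knowledge:${workspaceId}:entry:${slug}:html`,
  graph: (workspaceId: string): string => `knowledge:${workspaceId}:graph`,
  subgraph: (workspaceId: string, slug: string): string => `knowledge:${workspaceId}:graph:${slug}`,
  tags: (workspaceId: string): string => `knowledge:${workspaceId}:tags`,
  recent: (workspaceId: string): string => `knowledge:${workspaceId}:recent`,
  // Sort parameter names so the same search always maps to the same key
  search: (workspaceId: string, params: Record<string, string | number>): string => {
    const canonical = Object.keys(params)
      .sort()
      .map((key) => `${key}=${String(params[key])}`)
      .join('&');
    return `knowledge:${workspaceId}:search:${fnv1a(canonical)}`;
  },
};
```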
### Cache Invalidation
```typescript
async function invalidateEntryCache(workspaceId: string, slug: string) {
  const keys = [
    `knowledge:${workspaceId}:entry:${slug}`,
    `knowledge:${workspaceId}:entry:${slug}:html`,
    `knowledge:${workspaceId}:graph`, // Full graph affected
    `knowledge:${workspaceId}:graph:${slug}`,
    `knowledge:${workspaceId}:recent`,
  ];
  // Also invalidate subgraphs for linked entries
  const linkedSlugs = await getLinkedEntrySlugs(workspaceId, slug);
  for (const linked of linkedSlugs) {
    keys.push(`knowledge:${workspaceId}:graph:${linked}`);
  }
  await valkey.del(...keys);
  // Invalidate search caches (pattern delete).
  // KEYS walks the whole keyspace and blocks the server while it runs;
  // prefer an incremental SCAN once the cache grows beyond a small key count.
  const searchKeys = await valkey.keys(`knowledge:${workspaceId}:search:*`);
  if (searchKeys.length) await valkey.del(...searchKeys);
}
```
## UI Components
### Entry Editor
```
┌────────────────────────────────────────────────────────────────┐
│ [📄] Agent Orchestration Design [Save] [···]│
├────────────────────────────────────────────────────────────────┤
│ Status: [Published ▼] Tags: [architecture] [agents] [+] │
├────────────────────────────────────────────────────────────────┤
│ │
│ # Problem Statement │
│ │
│ Development teams and AI agents working on complex projects │
│ need a way to [[capture-decisions|capture decisions]]... │
│ │
│ See also: [[task-queues]] and [[valkey-patterns]] │
│ │
│ ─────────────────────────────────────────────────────────────│
│ Backlinks (3): │
│ • [[mosaic-roadmap]] - "...implements agent orchestration..." │
│ • [[design-index]] - "Core designs: [[agent-orchestration]]" │
│ • [[jarvis-memory]] - "Created orchestration design..." │
│ │
└────────────────────────────────────────────────────────────────┘
```
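The editor mock-up uses two link forms: `[[slug]]` and `[[slug|display text]]`. A minimal parser for just these two forms could look like the sketch below. The names `WikiLink`, `parseWikiLinks`, and `findBrokenLinks` are hypothetical, and the sketch deliberately ignores edge cases such as links inside fenced code blocks.

```typescript
interface WikiLink {
  target: string;  // slug before the "|"
  display: string; // text shown to the reader; defaults to the slug
  start: number;   // index of the opening "[["
  end: number;     // index just past the closing "]]"
}

function parseWikiLinks(markdown: string): WikiLink[] {
  const links: WikiLink[] = [];
  // [[target]] or [[target|display]]; brackets are not allowed inside either part
  const pattern = /\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const target = (match[1] ?? '').trim();
    const display = (match[2] ?? target).trim();
    links.push({
      target,
      display,
      start: match.index,
      end: match.index + match[0].length,
    });
  }
  return links;
}

// Broken link detection: targets that do not resolve to an existing entry slug.
function findBrokenLinks(links: WikiLink[], existingSlugs: Set<string>): string[] {
  const broken = new Set<string>();
  for (const link of links) {
    if (!existingSlugs.has(link.target)) broken.add(link.target);
  }
  return Array.from(broken);
}
```

Recording `start` and `end` offsets lets the renderer replace each link in place and lets the editor underline broken links without re-parsing.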
### Graph View
```
┌────────────────────────────────────────────────────────────────┐
│ Knowledge Graph [Filter ▼] [Layout ▼] │
├────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ │
│ │ valkey │ │
│ │patterns │ │
│ └────┬────┘ │
│ │ │
│ ┌────────────┼────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────┐ ┌────────┐ ┌────────┐ │
│ │cache │ │ task │ │ agent │◄─────┐ │
│ │layer │ │ queues │ │ orch │ │ │
│ └──────┘ └────────┘ └───┬────┘ │ │
│ │ │ │
│ ▼ │ │
│ ┌────────┐ ┌────┴───┐ │
│ │recovery│ │ mosaic │ │
│ │patterns│ │roadmap │ │
│ └────────┘ └────────┘ │
│ │
│ 🟢 Published (6) 🟡 Draft (2) ⚪ Orphan (0) │
└────────────────────────────────────────────────────────────────┘
```
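An entry-centered view only needs the nodes within a few hops of the selected entry. One way to compute that set is a breadth-first search that treats links as undirected, so both outgoing links and backlinks are followed. The sketch below assumes edges arrive as `{ source, target }` slug pairs; `subgraphAround` is a hypothetical helper name.

```typescript
interface Edge {
  source: string;
  target: string;
}

// Returns the slugs reachable from `center` in at most `maxDepth` hops.
function subgraphAround(edges: Edge[], center: string, maxDepth: number): Set<string> {
  // Build an undirected adjacency list
  const neighbors = new Map<string, string[]>();
  const connect = (a: string, b: string) => {
    const list = neighbors.get(a);
    if (list) list.push(b);
    else neighbors.set(a, [b]);
  };
  for (const e of edges) {
    connect(e.source, e.target);
    connect(e.target, e.source);
  }

  const visited = new Set<string>([center]);
  let frontier = [center];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const slug of frontier) {
      for (const n of neighbors.get(slug) ?? []) {
        if (!visited.has(n)) {
          visited.add(n);
          next.push(n);
        }
      }
    }
    frontier = next;
  }
  return visited;
}
```

With the edges drawn in the mock-up, a depth of 1 around `agent-orch` returns that entry plus `valkey-patterns`, `recovery-patterns`, and `mosaic-roadmap`.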
### Search Results
```
┌────────────────────────────────────────────────────────────────┐
│ 🔍 [agent recovery ] [Search]│
├────────────────────────────────────────────────────────────────┤
│ │
│ 📄 Agent Orchestration - Recovery Patterns │
│ ...automatic **recovery** when an **agent** fails or the... │
│ Tags: architecture, agents • Updated 2 hours ago │
│ │
│ 📄 Agent Health Monitoring │
│ ...heartbeat monitoring enables **recovery** of stale... │
│ Tags: agents, monitoring • Updated 1 day ago │
│ │
│ 📄 Task Queue Design │
│ ...retry logic with exponential backoff for **agent**... │
│ Tags: architecture, queues • Updated 3 days ago │
│ │
│ ───────────────────────────────────────────────────────────── │
│ Also try: Semantic search for conceptually related entries │
│ │
└────────────────────────────────────────────────────────────────┘
```
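The snippets in the mock-up show a short excerpt around the first match with each query term wrapped in `**...**`. PostgreSQL's `ts_headline` can produce this kind of excerpt server-side; the sketch below is a simplified, client-side illustration of the same idea. `buildSnippet` is a hypothetical helper, and it matches plain substrings rather than stemmed lexemes.

```typescript
// Returns an excerpt around the first matching term, or null when nothing matches.
// Assumes lowercasing does not change string length (true for ASCII text).
function buildSnippet(content: string, query: string, radius = 40): string | null {
  const terms = query
    .toLowerCase()
    .split(/\s+/)
    .filter((t) => t.length > 0);
  if (terms.length === 0) return null;

  // Find the earliest position of any term
  const lower = content.toLowerCase();
  let first = -1;
  for (const term of terms) {
    const idx = lower.indexOf(term);
    if (idx !== -1 && (first === -1 || idx < first)) first = idx;
  }
  if (first === -1) return null;

  const start = Math.max(0, first - radius);
  const end = Math.min(content.length, first + radius);
  const excerpt = content.slice(start, end);

  // Escape regex metacharacters in the terms, then wrap each match in **...**
  const escaped = terms.map((t) => t.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  const highlighted = excerpt.replace(new RegExp(`(${escaped.join('|')})`, 'gi'), '**$1**');

  const prefix = start > 0 ? '...' : '';
  const suffix = end < content.length ? '...' : '';
  return `${prefix}${highlighted}${suffix}`;
}
```

Because the excerpt is sliced before highlighting, a term cut off at the excerpt boundary is left unhighlighted rather than partially wrapped.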
## Implementation Phases
### Phase 1: Foundation (Week 1-2)
**Goal:** Basic CRUD + storage working

- [ ] Database schema + migrations
- [ ] Entry CRUD API endpoints
- [ ] Basic markdown rendering
- [ ] Tag management
- [ ] Entry list/detail pages

**Deliverables:**

- Can create, edit, view, delete entries
- Tags work
- Basic search (title/slug match)
### Phase 2: Linking (Week 2-3)
**Goal:** Wiki-link functionality

- [ ] Link parser (`[[...]]` syntax)
- [ ] Link resolution logic
- [ ] Broken link detection
- [ ] Backlinks display
- [ ] Link autocomplete in editor

**Deliverables:**

- Links between entries work
- Backlinks show on entry pages
- Editor suggests links as you type
### Phase 3: Search (Week 3-4)
**Goal:** Full-text + semantic search

- [ ] PostgreSQL full-text search setup
- [ ] Search API endpoint
- [ ] Search UI with highlighting
- [ ] pgvector extension setup
- [ ] Embedding generation pipeline
- [ ] Semantic search API

**Deliverables:**

- Fast full-text search
- Semantic search for "fuzzy" queries
- Search results with snippets
### Phase 4: Graph (Week 4-5)
**Goal:** Visual knowledge graph

- [ ] Graph data API
- [ ] D3.js/Cytoscape integration
- [ ] Interactive graph view
- [ ] Subgraph (entry-centered) view
- [ ] Graph statistics

**Deliverables:**

- Can view full knowledge graph
- Can explore from any entry
- Visual indicators for status/orphans
### Phase 5: Polish (Week 5-6)
**Goal:** Production-ready

- [ ] Version history UI
- [ ] Diff view between versions
- [ ] Import from markdown files
- [ ] Export functionality
- [ ] Performance optimization
- [ ] Caching implementation
- [ ] Documentation

**Deliverables:**

- Version history works
- Can import existing docs
- Performance meets the targets listed under Success Metrics
- Module is documented
## Integration Points
### Agent Access
The Knowledge module should be accessible to agents via API:
```typescript
// Agent tool for knowledge access
interface KnowledgeTools {
  // Search
  searchKnowledge(query: string): Promise<SearchResult[]>;
  semanticSearch(query: string): Promise<SearchResult[]>;

  // CRUD
  getEntry(slug: string): Promise<KnowledgeEntry>;
  createEntry(data: CreateEntryInput): Promise<KnowledgeEntry>;
  updateEntry(slug: string, data: UpdateEntryInput): Promise<KnowledgeEntry>;

  // Graph
  getRelatedEntries(slug: string): Promise<KnowledgeEntry[]>;
  getBacklinks(slug: string): Promise<KnowledgeEntry[]>;
}
```
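As an illustration of how an agent might combine the two search tools, the sketch below tries full-text search first and only falls back to semantic search when too few results come back. The trimmed-down `SearchResult` shape, the `KnowledgeSearchTools` subset, and the `findContext` helper are assumptions made for this example, not the types defined elsewhere in this document.

```typescript
// Simplified stand-ins for the real types (assumptions for illustration).
interface SearchResult {
  slug: string;
  title: string;
  snippet: string;
}

// Only the search half of KnowledgeTools is needed here.
interface KnowledgeSearchTools {
  searchKnowledge(query: string): Promise<SearchResult[]>;
  semanticSearch(query: string): Promise<SearchResult[]>;
}

// Full-text first; top up with semantic matches when there are fewer than `limit` hits.
async function findContext(
  tools: KnowledgeSearchTools,
  query: string,
  limit = 5,
): Promise<SearchResult[]> {
  const exact = await tools.searchKnowledge(query);
  if (exact.length >= limit) return exact.slice(0, limit);

  const fuzzy = await tools.semanticSearch(query);
  const seen = new Set(exact.map((r) => r.slug));
  const merged = exact.slice();
  for (const result of fuzzy) {
    if (merged.length >= limit) break;
    if (!seen.has(result.slug)) {
      seen.add(result.slug);
      merged.push(result);
    }
  }
  return merged;
}
```

Running full-text search first keeps the common case cheap, since semantic search needs an embedding for the query before it can run.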
### Clawdbot Integration
For Clawdbot specifically, the Knowledge module could:
1. Sync with `memory/*.md` files
2. Provide semantic search for `memory_search` tool
3. Generate embeddings for memory entries
4. Visualize agent memory as a knowledge graph
## Success Metrics
| Metric | Target | Measurement |
|--------|--------|-------------|
| Entry creation time | < 200ms | API response time |
| Search latency (full-text) | < 100ms | p95 response time |
| Search latency (semantic) | < 300ms | p95 response time |
| Graph render (100 nodes) | < 200ms | Client-side time |
| Graph render (1000 nodes) | < 1s | Client-side time |
| Adoption | 50+ entries/workspace | Entry count per workspace after 1 month |
| Link density | > 2 links/entry avg | Graph statistics |
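The link density target can be checked from the graph data itself. The sketch below reads "links/entry avg" as total links divided by total entries, which is one plausible interpretation rather than a definition taken from this document; `graphStats` and its result fields are hypothetical names. It also reports orphans, meaning entries with no incoming or outgoing links.

```typescript
interface Edge {
  source: string;
  target: string;
}

interface GraphStats {
  entryCount: number;
  linkCount: number;
  linksPerEntry: number;
  orphans: string[];
  meetsLinkDensityTarget: boolean;
}

function graphStats(slugs: string[], edges: Edge[], targetDensity = 2): GraphStats {
  // Any entry that appears on either end of a link is not an orphan
  const connected = new Set<string>();
  for (const e of edges) {
    connected.add(e.source);
    connected.add(e.target);
  }
  const orphans = slugs.filter((s) => !connected.has(s));
  const linksPerEntry = slugs.length === 0 ? 0 : edges.length / slugs.length;
  return {
    entryCount: slugs.length,
    linkCount: edges.length,
    linksPerEntry,
    orphans,
    meetsLinkDensityTarget: linksPerEntry > targetDensity,
  };
}
```

The same numbers could feed the legend counts in the graph view and the graph statistics task in Phase 4.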
## Open Questions
1. **Embedding model** — Use OpenAI embeddings or self-hosted? (Cost vs privacy)
2. **Real-time collab** — Do we need multiplayer editing? (CRDT complexity)
3. **Permissions** — Entry-level permissions or workspace-level only?
4. **Templates** — Support entry templates (ADR, design doc, etc.)?
5. **Attachments** — Allow images/files in entries?
## References
- [Obsidian](https://obsidian.md/) — Wiki-link syntax inspiration
- [Roam Research](https://roamresearch.com/) — Block-level linking
- [pgvector](https://github.com/pgvector/pgvector) — PostgreSQL vector extension
- [Mosaic Agent Orchestration](./agent-orchestration.md) — Related design