openbrain/docs/PRD.md
Jason Woltje 5771ec5260 feat: initial alpha scaffold — FastAPI + MCP + pgvector
Implements v0.0.1 of OpenBrain:

- FastAPI REST API (capture, search, recent, stats) with Bearer auth
- MCP server (streamable HTTP at /mcp) exposing all 4 tools
- pgvector schema (vector(1024) for bge-m3)
- asyncpg connection pool with lazy init + graceful close
- Ollama embedding client with fallback (stores thought without vector if Ollama unreachable)
- Woodpecker CI pipeline (lint + kaniko build + push to Gitea registry)
- Portainer/Swarm deployment compose
- Mosaic framework files: AGENTS.md, PRD.md, TASKS.md, scratchpad

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 18:25:07 -06:00


OpenBrain — Product Requirements Document

Version: 0.0.1
Status: Active
Owner: Jason Woltje


Problem

AI agents and tools have no shared persistent memory. Every session starts from zero. Platform memory (Claude, ChatGPT, etc.) is siloed — each tool can't see what the others know. This forces constant context re-injection, burns tokens, and prevents compounding knowledge.

Goal

A self-hosted, agent-readable semantic brain that any AI tool can plug into via MCP. One database. Standard protocol. Owned infrastructure. No SaaS middlemen.

Users

  1. Jason — primary human user, captures thoughts from any AI tool
  2. AI agents — Claude Code, Codex, Claude Desktop, any MCP-compatible client
  3. Future: Mosaic Stack integration as the knowledge layer for the agent fleet

Requirements

v0.0.1 (Alpha — Current)

ID    Requirement                                                         Priority
R1    Capture a thought with content, source, and metadata                Must
R2    Generate vector embedding via Ollama (bge-m3)                       Must
R3    Semantic search by meaning (cosine similarity)                      Must
R4    List recent thoughts                                                Must
R5    Usage stats (total, embedded, by source)                            Must
R6    REST API with Bearer token auth                                     Must
R7    MCP server (streamable HTTP) exposing all 4 tools                   Must
R8    Deployable as Portainer/Swarm stack                                 Must
R9    CI/CD via Woodpecker (lint + build + push)                          Must
R10   Graceful embedding fallback (store without vector if Ollama down)   Must
R11   Public repo — zero secrets in code                                  Must

v0.1.0 (Future)

  • Thought tagging and tag-based filtering
  • Batch import (ingest jarvis-brain data, Claude memory, etc.)
  • Scheduled re-embedding for thoughts stored without vectors
  • Webhook capture endpoint (ingest from any tool without MCP)
  • Usage dashboard (thoughts/day, source breakdown)
  • Mosaic Stack integration (knowledge module backend)
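The "scheduled re-embedding" item above closes the loop on R10: thoughts stored without vectors get retried later. A sketch under stated assumptions (in-memory dicts stand in for database rows, and `embed_fn` stands in for the real Ollama client; all names here are hypothetical):

```python
def reembed_missing(thoughts: list[dict], embed_fn) -> int:
    """Retry embedding for thoughts stored without a vector.

    Returns the number of thoughts repaired; thoughts that still fail to
    embed are left untouched for the next scheduled sweep.
    """
    repaired = 0
    for thought in thoughts:
        if thought.get("embedding") is None:
            vector = embed_fn(thought["content"])
            if vector is not None:
                thought["embedding"] = vector
                thought["embedded"] = True
                repaired += 1
    return repaired
```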

Acceptance Criteria (v0.0.1)

  1. POST /v1/thoughts stores a thought and returns it with embedded=true when Ollama is reachable
  2. POST /v1/search with a natural-language query returns semantically relevant results
  3. GET /v1/thoughts/recent returns the last N thoughts in reverse chronological order
  4. GET /v1/stats returns total count, embedded count, and source breakdown
  5. MCP server at /mcp exposes all 4 tools (capture, search, recent, stats)
  6. Claude Code can connect to the MCP server and execute all 4 tools
  7. Portainer stack deploys both brain-db and brain-api successfully
  8. CI pipeline runs on push to main and produces a tagged image
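Criterion 2's "semantically relevant results" come from cosine similarity over stored vectors (R3). In the service that ordering happens inside Postgres via pgvector (typically its cosine-distance operator); the pure-Python sketch below, with illustrative names, shows the ranking being computed:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec: list[float], thoughts: list[tuple]) -> list[str]:
    """thoughts: (content, vector) pairs; returns contents ranked by similarity.

    Thoughts stored without a vector (the R10 fallback path) are skipped,
    since they cannot be compared until re-embedded.
    """
    scored = [
        (cosine_similarity(query_vec, vec), content)
        for content, vec in thoughts
        if vec is not None
    ]
    return [content for _, content in sorted(scored, reverse=True)]
```

With real bge-m3 vectors (1024 dimensions, per the schema) the same ranking puts thoughts whose meaning is closest to the query first, regardless of shared keywords.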

Out of Scope (v0.0.1)

  • User accounts / multi-user
  • Workspace isolation
  • Web UI
  • Rate limiting
  • Mosaic Stack integration