feat: initial alpha scaffold — FastAPI + MCP + pgvector
Implements v0.0.1 of OpenBrain:

- FastAPI REST API (capture, search, recent, stats) with Bearer auth
- MCP server (streamable HTTP at /mcp) exposing all 4 tools
- pgvector schema (vector(1024) for bge-m3)
- asyncpg connection pool with lazy init + graceful close
- Ollama embedding client with fallback (stores thought without vector if Ollama unreachable)
- Woodpecker CI pipeline (lint + kaniko build + push to Gitea registry)
- Portainer/Swarm deployment compose
- Mosaic framework files: AGENTS.md, PRD.md, TASKS.md, scratchpad

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
70
docs/PRD.md
Normal file
@@ -0,0 +1,70 @@
# OpenBrain: Product Requirements Document

**Version**: 0.0.1
**Status**: Active
**Owner**: Jason Woltje

---

## Problem

AI agents and tools have no shared persistent memory, so every session starts from zero.
Platform memory (Claude, ChatGPT, etc.) is siloed: no tool can see what the others know.
This forces constant context re-injection, burns tokens, and prevents knowledge from compounding.

## Goal

A self-hosted, agent-readable semantic brain that any AI tool can plug into via MCP.
One database. Standard protocol. Owned infrastructure. No SaaS middlemen.

## Users

1. **Jason**: primary human user, captures thoughts from any AI tool
2. **AI agents**: Claude Code, Codex, Claude Desktop, any MCP-compatible client
3. **Future**: Mosaic Stack integration as the knowledge layer for the agent fleet

## Requirements

### v0.0.1 (Alpha, Current)

| ID | Requirement | Priority |
|----|-------------|----------|
| R1 | Capture a thought with content, source, and metadata | Must |
| R2 | Generate vector embedding via Ollama (bge-m3) | Must |
| R3 | Semantic search by meaning (cosine similarity) | Must |
| R4 | List recent thoughts | Must |
| R5 | Usage stats (total, embedded, by source) | Must |
| R6 | REST API with Bearer token auth | Must |
| R7 | MCP server (streamable HTTP) exposing all 4 tools | Must |
| R8 | Deployable as Portainer/Swarm stack | Must |
| R9 | CI/CD via Woodpecker (lint + build + push) | Must |
| R10 | Graceful embedding fallback (store without vector if Ollama down) | Must |
| R11 | Public repo: zero secrets in code | Must |

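As an illustration of R3, the following is a minimal sketch of cosine-similarity ranking in plain Python. It is not the service's implementation: in a pgvector deployment the ranking would presumably run inside the database, where the `<=>` operator returns cosine distance (1 minus similarity). The function names `cosine_similarity` and `rank_by_similarity` are hypothetical, and thoughts stored without a vector (R10) are simply skipped.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    candidates: Sequence[tuple[str, Optional[Sequence[float]]]],
) -> list[tuple[str, float]]:
    """Score (id, vector) pairs against the query, most similar first.

    Candidates whose vector is None (stored without an embedding) are skipped.
    """
    scored = [
        (thought_id, cosine_similarity(query, vector))
        for thought_id, vector in candidates
        if vector is not None
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Identical directions score 1.0 and orthogonal directions score 0.0, so a higher score means a closer semantic match regardless of vector length.
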
### v0.1.0 (Future)

- Thought tagging and tag-based filtering
- Batch import (ingest jarvis-brain data, Claude memory, etc.)
- Scheduled re-embedding for thoughts stored without vectors
- Webhook capture endpoint (ingest from any tool without MCP)
- Usage dashboard (thoughts/day, source breakdown)
- Mosaic Stack integration (knowledge module backend)

## Acceptance Criteria (v0.0.1)

1. `POST /v1/thoughts` stores a thought and returns it with `embedded=true` when Ollama is reachable
2. `POST /v1/search` with a natural-language query returns semantically relevant results
3. `GET /v1/thoughts/recent` returns the last N thoughts in reverse chronological order
4. `GET /v1/stats` returns total count, embedded count, and source breakdown
5. MCP server at `/mcp` exposes all 4 tools (capture, search, recent, stats)
6. Claude Code can connect to the MCP server and execute all 4 tools
7. Portainer stack deploys both `brain-db` and `brain-api` successfully
8. CI pipeline runs on push to main and produces a tagged image

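To make criterion 1 concrete, here is a hedged sketch of how a client might build the capture request with only the standard library. The path `POST /v1/thoughts` and the Bearer scheme come from this document; the JSON field names (`content`, `source`, `metadata`, taken from R1), the helper name `build_capture_request`, and the example host are assumptions, not the confirmed request schema.

```python
import json
import urllib.request
from typing import Any, Optional


def build_capture_request(
    base_url: str,
    token: str,
    content: str,
    source: str,
    metadata: Optional[dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/thoughts request with Bearer auth."""
    # Field names are assumed from R1; the real schema may differ.
    body = json.dumps(
        {"content": content, "source": source, "metadata": metadata or {}}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/thoughts",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical host and token, for illustration only.
request = build_capture_request(
    "https://brain.example.com/", "not-a-real-token", "Remember pgvector", "claude-code"
)
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the headers and body easy to inspect, and the token stays out of the code (R11) when it is read from the environment.
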
## Out of Scope (v0.0.1)

- User accounts / multi-user
- Workspace isolation
- Web UI
- Rate limiting
- Mosaic Stack integration
28
docs/TASKS.md
Normal file
@@ -0,0 +1,28 @@
# OpenBrain: Tasks

**Project**: openbrain
**Provider**: https://git.mosaicstack.dev/mosaic/openbrain

---

## Active

| ID | Title | Status | Notes |
|----|-------|--------|-------|
| T1 | Scaffold repo + core service | in-progress | Building now |
| T2 | CI/CD pipeline (Woodpecker) | in-progress | Building now |
| T3 | Portainer deployment | pending | Follows T1, T2 |
| T4 | Copy init.sql to host, deploy stack | pending | Requires server access |
| T5 | Configure MCP in Claude Code settings | pending | Follows T3 |
| T6 | Smoke test: capture + search via MCP | pending | Follows T5 |

## Backlog

| ID | Title | Notes |
|----|-------|-------|
| T10 | Woodpecker CI secrets setup (GITEA_USERNAME, GITEA_TOKEN) | Required for build pipeline |
| T11 | DNS: brain.woltje.com → Swarm ingress | Required for HTTPS access |
| T12 | Traefik TLS cert for brain.woltje.com | Required for HTTPS MCP |
| T20 | Batch import: ingest jarvis-brain JSON data | v0.1.0 |
| T21 | Scheduled re-embedding for non-embedded thoughts | v0.1.0 |
| T22 | Mosaic Stack knowledge module integration | v0.1.0+ |
41
docs/scratchpads/v001-build.md
Normal file
@@ -0,0 +1,41 @@
# Scratchpad: v0.0.1 Build

**Date**: 2026-03-02
**Objective**: Build and deploy alpha OpenBrain service

## Plan

1. [x] Scaffold project structure
2. [x] Core brain operations (capture, search, recent, stats)
3. [x] FastAPI REST + MCP server (single process)
4. [x] pgvector schema
5. [x] Dockerfile
6. [x] Portainer compose
7. [x] Woodpecker CI pipeline
8. [x] Mosaic framework files (AGENTS.md, PRD.md, TASKS.md)
9. [ ] Initial commit + push
10. [ ] Woodpecker secrets verified
11. [ ] DNS + Traefik config for brain.woltje.com
12. [ ] Host init.sql copy + Portainer stack deploy
13. [ ] Smoke test via MCP

## Decisions

- Single Python process for REST + MCP (avoids 2-container overhead for alpha)
- Streamable HTTP MCP transport (not stdio: a deployed service needs HTTP)
- bge-m3 via existing Ollama at 10.1.1.42 (verified live)
- vector(1024): bge-m3's native dimension, no padding
- Graceful fallback: thoughts are stored without an embedding if Ollama is unreachable
- pgvector/pgvector:pg17 official image: no custom build needed

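The graceful-fallback decision above can be sketched as follows. This is an illustration under stated assumptions rather than the code in this commit: the embedding call is injected as a plain callable so the sketch stays independent of any Ollama client, and the names `embed_or_none`, `capture`, and the record fields are hypothetical. Only the 1024 dimension and the store-without-vector behaviour come from this document.

```python
from typing import Callable, Optional, Sequence

EMBEDDING_DIM = 1024  # bge-m3 dimension, matching the vector(1024) column


def embed_or_none(
    text: str, embed: Callable[[str], Sequence[float]]
) -> Optional[list[float]]:
    """Return the embedding, or None if the embedding backend is unreachable."""
    try:
        vector = list(embed(text))
    except OSError:
        # Connection failures and timeouts (including urllib's URLError)
        # derive from OSError; treat them as "backend unavailable".
        return None
    if len(vector) != EMBEDDING_DIM:
        # A wrong dimension points to a misconfigured model, not an outage.
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")
    return vector


def capture(content: str, source: str, embed: Callable[[str], Sequence[float]]) -> dict:
    """Build the record to store; the thought is kept even without a vector."""
    vector = embed_or_none(content, embed)
    return {
        "content": content,
        "source": source,
        "embedding": vector,
        "embedded": vector is not None,
    }
```

Catching only `OSError` is a deliberate choice in this sketch: network outages degrade to `embedded=False`, while programming or configuration errors still surface instead of being silently stored as missing vectors.
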
## Blockers / Notes

- Woodpecker CI secrets (GITEA_USERNAME, GITEA_TOKEN) must be set for the build pipeline
- The DNS record for brain.woltje.com needs to be created
- Init SQL must be on the host at /opt/openbrain/init.sql before the first Portainer deploy
- MCP auth: headers are passed via Claude Code settings; confirm that the MCP SDK accepts headers on streamable HTTP

## Risks

- The MCP streamable HTTP transport is a newer spec; need to verify that Claude Code supports it
  - Fallback: switch to SSE transport (`mcp.server.sse.SseServerTransport`)