feat: initial alpha scaffold — FastAPI + MCP + pgvector
Implements v0.0.1 of OpenBrain:
- FastAPI REST API (capture, search, recent, stats) with Bearer auth
- MCP server (streamable HTTP at /mcp) exposing all 4 tools
- pgvector schema (vector(1024) for bge-m3)
- asyncpg connection pool with lazy init + graceful close
- Ollama embedding client with fallback (stores thought without vector if Ollama unreachable)
- Woodpecker CI pipeline (lint + kaniko build + push to Gitea registry)
- Portainer/Swarm deployment compose
- Mosaic framework files: AGENTS.md, PRD.md, TASKS.md, scratchpad

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
.env.example — new file, 12 lines
@@ -0,0 +1,12 @@
# Database — update host/credentials for your deployment
DATABASE_URL=postgresql://openbrain:changeme@brain-db:5432/openbrain

# Auth — generate a strong random key: openssl rand -hex 32
API_KEY=your-secret-key-here

# Ollama — point at your Ollama instance
OLLAMA_URL=http://your-ollama-host:11434
OLLAMA_EMBEDDING_MODEL=bge-m3:latest

# Service
LOG_LEVEL=info
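The `API_KEY` comment above suggests `openssl rand -hex 32`. For reference, a stdlib Python equivalent (an illustrative sketch, not part of the repo):

```python
# Stdlib equivalent of `openssl rand -hex 32`: 32 random bytes, hex-encoded.
import secrets

api_key = secrets.token_hex(32)
print(len(api_key))  # 64 hex characters = 256 bits of entropy
```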
.gitignore — new file (vendored), 11 lines
@@ -0,0 +1,11 @@
.env
.env.local
__pycache__/
*.pyc
*.pyo
.pytest_cache/
.ruff_cache/
dist/
*.egg-info/
.venv/
venv/
.woodpecker/build.yml — new file, 49 lines
@@ -0,0 +1,49 @@
when:
  - event: push
    branch: main
  - event: tag

variables:
  - &registry git.mosaicstack.dev
  - &image git.mosaicstack.dev/mosaic/openbrain

steps:
  lint:
    image: python:3.12-slim
    commands:
      - pip install ruff --quiet
      - ruff check src/
      - ruff format --check src/

  build:
    image: plugins/kaniko
    settings:
      registry: *registry
      repo: *image
      tags:
        - sha-${CI_COMMIT_SHA:0:8}
        - latest
      username:
        from_secret: GITEA_USERNAME
      password:
        from_secret: GITEA_TOKEN
      build_args:
        - BUILDKIT_INLINE_CACHE=1
    when:
      - event: push
        branch: main

  build-tag:
    image: plugins/kaniko
    settings:
      registry: *registry
      repo: *image
      tags:
        - ${CI_COMMIT_TAG}
        - sha-${CI_COMMIT_SHA:0:8}
      username:
        from_secret: GITEA_USERNAME
      password:
        from_secret: GITEA_TOKEN
    when:
      - event: tag
AGENTS.md — new file, 87 lines
@@ -0,0 +1,87 @@
# OpenBrain — Agent Guidelines

> **Purpose**: Self-hosted semantic brain — pgvector + MCP server for any AI agent
> **SSOT**: https://git.mosaicstack.dev/mosaic/openbrain
> **Status**: Alpha (0.0.1)

---

## Stack

| Layer | Tech |
|-------|------|
| Language | Python 3.12 |
| API | FastAPI + uvicorn |
| MCP | `mcp[cli]` Python SDK (streamable HTTP transport) |
| Database | PostgreSQL 17 + pgvector |
| Embeddings | Ollama (`bge-m3:latest`, 1024-dim) |
| CI/CD | Woodpecker → Gitea registry |
| Deployment | Docker Swarm via Portainer |

## Structure

```
src/
  config.py      — env-based settings (pydantic-settings)
  db.py          — asyncpg connection pool
  embeddings.py  — Ollama embedding client
  models.py      — Pydantic request/response models
  brain.py       — core operations (capture, search, recent, stats)
  main.py        — FastAPI app + MCP server mount
docker/
  postgres/init.sql — schema + pgvector setup
.woodpecker/
  build.yml — lint → kaniko build → push
```

## Key Rules

1. **Never hardcode secrets, IPs, or internal hostnames.** All config via env vars.
2. **Public repo.** `.env` is gitignored. `.env.example` has placeholders only.
3. **MCP transport is Streamable HTTP** mounted at `/mcp`. Not stdio.
4. **REST + MCP live in one process** (`src/main.py`). No separate MCP container.
5. **Schema is append-only** in alpha. Migrations via new SQL files in `docker/postgres/`.
6. **Embeddings are best-effort**: if Ollama is unreachable, the thought is stored without an embedding.

## Auth

All REST endpoints require: `Authorization: Bearer <API_KEY>`

The MCP server at `/mcp` uses the same key via MCP client config headers.

## Local Dev

```bash
cp .env.example .env
# Fill in DATABASE_URL, API_KEY, OLLAMA_URL

uv pip install -e ".[dev]"
uvicorn src.main:app --reload
```

## CI/CD

Push to `main` → Woodpecker lints + builds image → pushes `sha-<hash>` + `latest` tags.
Tag a release → pushes `v0.0.x` + `sha-<hash>` tags.

## Deployment

Use `docker-compose.portainer.yml` as a Portainer stack.
Required env vars: `POSTGRES_PASSWORD`, `API_KEY`, `OLLAMA_URL`, `IMAGE_TAG`.
Init SQL must be copied to the host at `/opt/openbrain/init.sql` before first deploy.

## MCP Client Config (Claude Code)

```json
{
  "mcpServers": {
    "openbrain": {
      "type": "http",
      "url": "https://brain.woltje.com/mcp",
      "headers": {
        "Authorization": "Bearer <API_KEY>"
      }
    }
  }
}
```
Dockerfile — new file, 20 lines
@@ -0,0 +1,20 @@
FROM python:3.12-slim

WORKDIR /app

# Install uv for fast dependency installation
RUN pip install uv --no-cache-dir

# Copy dependency spec first for layer caching
COPY pyproject.toml .
RUN uv pip install --system --no-cache .

# Copy source
COPY src/ ./src/

ENV PYTHONUNBUFFERED=1
ENV PYTHONPATH=/app

EXPOSE 8000

CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
docker-compose.portainer.yml — new file, 65 lines
@@ -0,0 +1,65 @@
# OpenBrain — Portainer / Docker Swarm deployment
#
# Required environment variables (set in Portainer stack env):
#   POSTGRES_PASSWORD — postgres user password
#   API_KEY           — secret key for API/MCP auth
#   OLLAMA_URL        — Ollama endpoint (e.g. http://10.x.x.x:11434)
#   IMAGE_TAG         — image tag to deploy (e.g. sha-abc1234 or 0.0.1)
#
# Optional:
#   OLLAMA_EMBEDDING_MODEL — default: bge-m3:latest
#   LOG_LEVEL              — default: info

services:
  brain-db:
    image: pgvector/pgvector:pg17
    environment:
      POSTGRES_USER: openbrain
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: openbrain
    volumes:
      - brain_db_data:/var/lib/postgresql/data
      - /opt/openbrain/init.sql:/docker-entrypoint-initdb.d/init.sql:ro
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U openbrain -d openbrain"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - brain-internal
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure

  brain-api:
    image: git.mosaicstack.dev/mosaic/openbrain:${IMAGE_TAG:-latest}
    environment:
      DATABASE_URL: postgresql://openbrain:${POSTGRES_PASSWORD}@brain-db:5432/openbrain
      API_KEY: ${API_KEY}
      OLLAMA_URL: ${OLLAMA_URL}
      OLLAMA_EMBEDDING_MODEL: ${OLLAMA_EMBEDDING_MODEL:-bge-m3:latest}
      LOG_LEVEL: ${LOG_LEVEL:-info}
    ports:
      - "8765:8000"
    depends_on:
      - brain-db
    networks:
      - brain-internal
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.openbrain.rule=Host(`brain.woltje.com`)"
        - "traefik.http.routers.openbrain.entrypoints=websecure"
        - "traefik.http.routers.openbrain.tls=true"
        - "traefik.http.services.openbrain.loadbalancer.server.port=8000"

volumes:
  brain_db_data:

networks:
  brain-internal:
    driver: overlay
docker/postgres/init.sql — new file, 28 lines
@@ -0,0 +1,28 @@
-- OpenBrain — Database Initialization
-- Runs once on first container start

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS thoughts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    embedding vector(1024),  -- bge-m3 native dimension
    source VARCHAR(100) NOT NULL DEFAULT 'unknown',
    metadata JSONB NOT NULL DEFAULT '{}',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Vector similarity search index (cosine)
CREATE INDEX IF NOT EXISTS thoughts_embedding_idx
    ON thoughts USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

-- Recent queries
CREATE INDEX IF NOT EXISTS thoughts_created_at_idx
    ON thoughts (created_at DESC);

-- Filter by source
CREATE INDEX IF NOT EXISTS thoughts_source_idx
    ON thoughts (source);
docs/PRD.md — new file, 70 lines
@@ -0,0 +1,70 @@
# OpenBrain — Product Requirements Document

**Version**: 0.0.1
**Status**: Active
**Owner**: Jason Woltje

---

## Problem

AI agents and tools have no shared persistent memory. Every session starts from zero.
Platform memory (Claude, ChatGPT, etc.) is siloed — each tool can't see what the others know.
This forces constant context re-injection, burns tokens, and prevents compounding knowledge.

## Goal

A self-hosted, agent-readable semantic brain that any AI tool can plug into via MCP.
One database. Standard protocol. Owned infrastructure. No SaaS middlemen.

## Users

1. **Jason** — primary human user, captures thoughts from any AI tool
2. **AI agents** — Claude Code, Codex, Claude Desktop, any MCP-compatible client
3. **Future**: Mosaic Stack integration as the knowledge layer for the agent fleet

## Requirements

### v0.0.1 (Alpha — Current)

| ID | Requirement | Priority |
|----|-------------|----------|
| R1 | Capture a thought with content, source, and metadata | Must |
| R2 | Generate vector embedding via Ollama (bge-m3) | Must |
| R3 | Semantic search by meaning (cosine similarity) | Must |
| R4 | List recent thoughts | Must |
| R5 | Usage stats (total, embedded, by source) | Must |
| R6 | REST API with Bearer token auth | Must |
| R7 | MCP server (streamable HTTP) exposing all 4 tools | Must |
| R8 | Deployable as Portainer/Swarm stack | Must |
| R9 | CI/CD via Woodpecker (lint + build + push) | Must |
| R10 | Graceful embedding fallback (store without vector if Ollama down) | Must |
| R11 | Public repo — zero secrets in code | Must |

### v0.1.0 (Future)

- Thought tagging and tag-based filtering
- Batch import (ingest jarvis-brain data, Claude memory, etc.)
- Scheduled re-embedding for thoughts stored without vectors
- Webhook capture endpoint (ingest from any tool without MCP)
- Usage dashboard (thoughts/day, source breakdown)
- Mosaic Stack integration (knowledge module backend)

## Acceptance Criteria (v0.0.1)

1. `POST /v1/thoughts` stores a thought and returns it with embedded=true when Ollama is reachable
2. `POST /v1/search` with a natural-language query returns semantically relevant results
3. `GET /v1/thoughts/recent` returns the last N thoughts in reverse chronological order
4. `GET /v1/stats` returns total count, embedded count, and source breakdown
5. MCP server at `/mcp` exposes all 4 tools (capture, search, recent, stats)
6. Claude Code can connect to the MCP server and execute all 4 tools
7. Portainer stack deploys both brain-db and brain-api successfully
8. CI pipeline runs on push to main and produces a tagged image

## Out of Scope (v0.0.1)

- User accounts / multi-user
- Workspace isolation
- Web UI
- Rate limiting
- Mosaic Stack integration
docs/TASKS.md — new file, 28 lines
@@ -0,0 +1,28 @@
# OpenBrain — Tasks

**Project**: openbrain
**Provider**: https://git.mosaicstack.dev/mosaic/openbrain

---

## Active

| ID | Title | Status | Notes |
|----|-------|--------|-------|
| T1 | Scaffold repo + core service | in-progress | Building now |
| T2 | CI/CD pipeline (Woodpecker) | in-progress | Building now |
| T3 | Portainer deployment | pending | Follows T1, T2 |
| T4 | Copy init.sql to host, deploy stack | pending | Requires server access |
| T5 | Configure MCP in Claude Code settings | pending | Follows T3 |
| T6 | Smoke test: capture + search via MCP | pending | Follows T5 |

## Backlog

| ID | Title | Notes |
|----|-------|-------|
| T10 | Woodpecker CI secrets setup (GITEA_USERNAME, GITEA_TOKEN) | Required for build pipeline |
| T11 | DNS: brain.woltje.com → Swarm ingress | Required for HTTPS access |
| T12 | Traefik TLS cert for brain.woltje.com | Required for HTTPS MCP |
| T20 | Batch import: ingest jarvis-brain JSON data | v0.1.0 |
| T21 | Scheduled re-embedding for non-embedded thoughts | v0.1.0 |
| T22 | Mosaic Stack knowledge module integration | v0.1.0+ |
docs/scratchpads/v001-build.md — new file, 41 lines
@@ -0,0 +1,41 @@
# Scratchpad: v0.0.1 Build

**Date**: 2026-03-02
**Objective**: Build and deploy alpha OpenBrain service

## Plan

1. [x] Scaffold project structure
2. [x] Core brain operations (capture, search, recent, stats)
3. [x] FastAPI REST + MCP server (single process)
4. [x] pgvector schema
5. [x] Dockerfile
6. [x] Portainer compose
7. [x] Woodpecker CI pipeline
8. [x] Mosaic framework files (AGENTS.md, PRD.md, TASKS.md)
9. [ ] Initial commit + push
10. [ ] Woodpecker secrets verified
11. [ ] DNS + Traefik config for brain.woltje.com
12. [ ] Host init.sql copy + Portainer stack deploy
13. [ ] Smoke test via MCP

## Decisions

- Single Python process for REST + MCP (avoids 2-container overhead for alpha)
- Streamable HTTP MCP transport (not stdio — deployed service, needs HTTP)
- bge-m3 via existing Ollama at 10.1.1.42 (verified live)
- vector(1024) — bge-m3 native, no padding
- Graceful fallback: thoughts stored without embedding if Ollama unreachable
- pgvector/pgvector:pg17 official image — no custom build needed

## Blockers / Notes

- Woodpecker CI secrets (GITEA_USERNAME, GITEA_TOKEN) must be set for build pipeline
- DNS record for brain.woltje.com needs to be created
- Init SQL must be on host at /opt/openbrain/init.sql before first Portainer deploy
- MCP auth: headers passed via Claude Code settings — confirm MCP SDK accepts headers on streamable HTTP

## Risks

- MCP streamable HTTP transport is newer spec — need to verify Claude Code supports it
- Fallback: switch to SSE transport (mcp.server.sse.SseServerTransport)
pyproject.toml — new file, 32 lines
@@ -0,0 +1,32 @@
[project]
name = "openbrain"
version = "0.0.1"
description = "Self-hosted semantic brain — pgvector + MCP server for any AI agent"
requires-python = ">=3.12"
dependencies = [
    "fastapi>=0.115.0",
    "uvicorn[standard]>=0.32.0",
    "asyncpg>=0.30.0",
    "httpx>=0.28.0",
    "pydantic>=2.10.0",
    "pydantic-settings>=2.7.0",
    "mcp[cli]>=1.6.0",
    "python-multipart>=0.0.20",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.24.0",
    "ruff>=0.8.0",
]

[tool.ruff]
line-length = 100
target-version = "py312"

[tool.ruff.lint]
select = ["E", "F", "I"]

[tool.pytest.ini_options]
asyncio_mode = "auto"
src/__init__.py — new file, 0 lines (empty)
src/brain.py — new file, 127 lines
@@ -0,0 +1,127 @@
"""Core brain operations — capture, search, recent, stats."""
import json

from src import db, embeddings
from src.models import CaptureRequest, SearchRequest, SearchResult, Stats, Thought


async def capture(req: CaptureRequest) -> Thought:
    pool = await db.get_pool()
    embedding = await embeddings.embed(req.content)

    async with pool.acquire() as conn:
        if embedding is not None:
            vec = f"[{','.join(str(v) for v in embedding)}]"
            row = await conn.fetchrow(
                """
                INSERT INTO thoughts (content, embedding, source, metadata)
                VALUES ($1, $2::vector, $3, $4::jsonb)
                RETURNING id::text, content, source, metadata, created_at,
                          embedding IS NOT NULL AS embedded
                """,
                req.content, vec, req.source, json.dumps(req.metadata),
            )
        else:
            row = await conn.fetchrow(
                """
                INSERT INTO thoughts (content, source, metadata)
                VALUES ($1, $2, $3::jsonb)
                RETURNING id::text, content, source, metadata, created_at,
                          embedding IS NOT NULL AS embedded
                """,
                req.content, req.source, json.dumps(req.metadata),
            )

    return Thought(
        id=row["id"],
        content=row["content"],
        source=row["source"],
        metadata=json.loads(row["metadata"]) if isinstance(row["metadata"], str) else row["metadata"],
        created_at=row["created_at"],
        embedded=row["embedded"],
    )


async def search(req: SearchRequest) -> list[SearchResult]:
    embedding = await embeddings.embed(req.query)
    if embedding is None:
        return []

    pool = await db.get_pool()
    vec = f"[{','.join(str(v) for v in embedding)}]"

    async with pool.acquire() as conn:
        if req.source:
            rows = await conn.fetch(
                """
                SELECT id::text, content, source, metadata, created_at,
                       1 - (embedding <=> $1::vector) AS similarity
                FROM thoughts
                WHERE embedding IS NOT NULL AND source = $2
                ORDER BY embedding <=> $1::vector
                LIMIT $3
                """,
                vec, req.source, req.limit,
            )
        else:
            rows = await conn.fetch(
                """
                SELECT id::text, content, source, metadata, created_at,
                       1 - (embedding <=> $1::vector) AS similarity
                FROM thoughts
                WHERE embedding IS NOT NULL
                ORDER BY embedding <=> $1::vector
                LIMIT $2
                """,
                vec, req.limit,
            )

    return [
        SearchResult(
            id=r["id"],
            content=r["content"],
            source=r["source"],
            similarity=float(r["similarity"]),
            created_at=r["created_at"],
            metadata=json.loads(r["metadata"]) if isinstance(r["metadata"], str) else r["metadata"],
        )
        for r in rows
    ]


async def recent(limit: int = 20) -> list[Thought]:
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """
            SELECT id::text, content, source, metadata, created_at,
                   embedding IS NOT NULL AS embedded
            FROM thoughts
            ORDER BY created_at DESC
            LIMIT $1
            """,
            limit,
        )
    return [
        Thought(
            id=r["id"],
            content=r["content"],
            source=r["source"],
            metadata=json.loads(r["metadata"]) if isinstance(r["metadata"], str) else r["metadata"],
            created_at=r["created_at"],
            embedded=r["embedded"],
        )
        for r in rows
    ]


async def stats() -> Stats:
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        total = await conn.fetchval("SELECT COUNT(*) FROM thoughts")
        embedded = await conn.fetchval("SELECT COUNT(*) FROM thoughts WHERE embedding IS NOT NULL")
        sources = await conn.fetch(
            "SELECT source, COUNT(*) AS count FROM thoughts GROUP BY source ORDER BY count DESC"
        )
    return Stats(
        total_thoughts=total,
        embedded_count=embedded,
        sources=[{"source": r["source"], "count": r["count"]} for r in sources],
    )
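The `1 - (embedding <=> $1::vector)` expression in src/brain.py converts pgvector's `<=>` operator, which returns cosine *distance*, into a similarity score. A plain-Python check of that identity (illustrative only, not part of the repo):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    # Same quantity pgvector's <=> operator computes: 1 - cos(angle between a and b).
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def similarity(a: list[float], b: list[float]) -> float:
    # Mirrors the SQL: 1 - (embedding <=> query).
    return 1.0 - cosine_distance(a, b)

print(round(similarity([1.0, 0.0], [1.0, 0.0]), 6))  # 1.0 (identical direction)
print(round(similarity([1.0, 0.0], [0.0, 1.0]), 6))  # 0.0 (orthogonal)
```

Ordering by `embedding <=> $1::vector` ascending therefore returns the most similar rows first, which is why the code sorts by distance but reports similarity.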
src/config.py — new file, 23 lines
@@ -0,0 +1,23 @@
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")

    # Database
    database_url: str = "postgresql://openbrain:openbrain@localhost:5432/openbrain"

    # Auth
    api_key: str  # Required — no default, must be set

    # Ollama
    ollama_url: str = "http://localhost:11434"
    ollama_embedding_model: str = "bge-m3:latest"

    # Service
    host: str = "0.0.0.0"
    port: int = 8000
    log_level: str = "info"


settings = Settings()
src/db.py — new file, 18 lines
@@ -0,0 +1,18 @@
import asyncpg

from src.config import settings

_pool: asyncpg.Pool | None = None


async def get_pool() -> asyncpg.Pool:
    global _pool
    if _pool is None:
        _pool = await asyncpg.create_pool(settings.database_url, min_size=2, max_size=10)
    return _pool


async def close_pool() -> None:
    global _pool
    if _pool:
        await _pool.close()
        _pool = None
src/embeddings.py — new file, 16 lines
@@ -0,0 +1,16 @@
import httpx

from src.config import settings


async def embed(text: str) -> list[float] | None:
    """Generate embedding via Ollama. Returns None if Ollama is unreachable."""
    try:
        async with httpx.AsyncClient(timeout=30.0) as client:
            response = await client.post(
                f"{settings.ollama_url}/api/embeddings",
                json={"model": settings.ollama_embedding_model, "prompt": text},
            )
            response.raise_for_status()
            return response.json()["embedding"]
    except Exception:
        return None
src/main.py — new file, 126 lines
@@ -0,0 +1,126 @@
"""OpenBrain — FastAPI REST + MCP server (single process)."""
import contextlib
import logging

from fastapi import Depends, FastAPI, HTTPException, Security
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from mcp.server.fastmcp import FastMCP

from src import brain, db
from src.config import settings
from src.models import CaptureRequest, SearchRequest, SearchResult, Stats, Thought

logging.basicConfig(level=settings.log_level.upper())
logger = logging.getLogger("openbrain")

# ---------------------------------------------------------------------------
# Auth
# ---------------------------------------------------------------------------
bearer = HTTPBearer()


def require_api_key(credentials: HTTPAuthorizationCredentials = Security(bearer)) -> str:
    if credentials.credentials != settings.api_key:
        raise HTTPException(status_code=401, detail="Invalid API key")
    return credentials.credentials


# ---------------------------------------------------------------------------
# MCP server
# ---------------------------------------------------------------------------
mcp = FastMCP("openbrain", stateless_http=True)


@mcp.tool()
async def capture(content: str, source: str = "unknown", metadata: dict | None = None) -> dict:
    """Store a thought or piece of information in your brain.

    Args:
        content: The text to remember
        source: Which agent or tool is capturing this (e.g. 'claude-code', 'codex')
        metadata: Optional key/value pairs (tags, project, etc.)
    """
    thought = await brain.capture(
        CaptureRequest(content=content, source=source, metadata=metadata or {})
    )
    return thought.model_dump(mode="json")


@mcp.tool()
async def search(query: str, limit: int = 10, source: str | None = None) -> list[dict]:
    """Search your brain by meaning (semantic search).

    Args:
        query: What you're looking for — describe it naturally
        limit: Max results to return (default 10)
        source: Optional — filter to a specific agent/tool
    """
    results = await brain.search(SearchRequest(query=query, limit=limit, source=source))
    return [r.model_dump(mode="json") for r in results]


@mcp.tool()
async def recent(limit: int = 20) -> list[dict]:
    """Get recently captured thoughts.

    Args:
        limit: How many to return (default 20)
    """
    thoughts = await brain.recent(limit=limit)
    return [t.model_dump(mode="json") for t in thoughts]


@mcp.tool()
async def stats() -> dict:
    """Get statistics about your brain — total thoughts, embedding coverage, sources."""
    s = await brain.stats()
    return s.model_dump(mode="json")


# ---------------------------------------------------------------------------
# FastAPI app
# ---------------------------------------------------------------------------
@contextlib.asynccontextmanager
async def lifespan(app: FastAPI):
    logger.info("OpenBrain starting up")
    await db.get_pool()  # Warm the connection pool
    yield
    await db.close_pool()
    logger.info("OpenBrain shut down")


app = FastAPI(
    title="OpenBrain",
    description="Self-hosted semantic brain — pgvector + MCP for any AI agent",
    version="0.0.1",
    lifespan=lifespan,
)

# Mount MCP server at /mcp (HTTP streamable transport)
app.mount("/mcp", mcp.streamable_http_app())


# ---------------------------------------------------------------------------
# REST endpoints (for direct API access and health checks)
# ---------------------------------------------------------------------------
@app.get("/health")
async def health() -> dict:
    return {"status": "ok", "version": "0.0.1"}


@app.post("/v1/thoughts", response_model=Thought)
async def api_capture(req: CaptureRequest, _: str = Depends(require_api_key)) -> Thought:
    return await brain.capture(req)


@app.post("/v1/search", response_model=list[SearchResult])
async def api_search(req: SearchRequest, _: str = Depends(require_api_key)) -> list[SearchResult]:
    return await brain.search(req)


@app.get("/v1/thoughts/recent", response_model=list[Thought])
async def api_recent(limit: int = 20, _: str = Depends(require_api_key)) -> list[Thought]:
    return await brain.recent(limit=limit)


@app.get("/v1/stats", response_model=Stats)
async def api_stats(_: str = Depends(require_api_key)) -> Stats:
    return await brain.stats()
src/models.py — new file, 39 lines
@@ -0,0 +1,39 @@
from datetime import datetime
from typing import Any

from pydantic import BaseModel


class CaptureRequest(BaseModel):
    content: str
    source: str = "unknown"
    metadata: dict[str, Any] = {}


class Thought(BaseModel):
    id: str
    content: str
    source: str
    metadata: dict[str, Any]
    created_at: datetime
    embedded: bool


class SearchRequest(BaseModel):
    query: str
    limit: int = 10
    source: str | None = None


class SearchResult(BaseModel):
    id: str
    content: str
    source: str
    similarity: float
    created_at: datetime
    metadata: dict[str, Any]


class Stats(BaseModel):
    total_thoughts: int
    embedded_count: int
    sources: list[dict[str, Any]]