Commit Graph

2 Commits

Author SHA1 Message Date
Garry Tan
3e21e9b69b feat: GBrain v0.6.0 — Remote MCP Server + 12 Bug Fixes (#28)
* fix: 7 bug fixes from Issue #9 and #22

- fix(mcp): use ListToolsRequestSchema/CallToolRequestSchema instead of string literals (Issue #9, PR #25)
- fix(mcp): handleToolCall reads dry_run from params instead of hardcoding false (#22 Bug #11)
- fix(search): keyword search returns best chunk per page via DISTINCT ON, not all chunks (#22 Bug #8)
- fix(search): dedup layer 1 keeps top 3 chunks per page instead of collapsing to 1 (#22 Bug #12)
- fix(engine): transaction uses scoped engine via Object.create, no shared state mutation (#22 Bug #2)
- fix(engine): upsertChunks uses UPSERT instead of DELETE+INSERT, preserves existing embeddings (#22 Bug #1)
- fix(slugs): validateSlug normalizes to lowercase, pathToSlug lowercases consistently (#22 Bug #4)
- schema: add unique index on content_chunks(page_id, chunk_index) for UPSERT support
- schema: add access_tokens and mcp_request_log tables via migration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: embed schema.sql at build time, remove fs dependency from initSchema

initSchema() previously read schema.sql from disk at runtime via readFileSync,
which broke in compiled Bun binaries and Deno Edge Functions. Now uses a
generated schema-embedded.ts constant (run `bun run build:schema` to regenerate).

- Removes fs and path imports from postgres-engine.ts and db.ts
- Adds scripts/build-schema.sh for one-source-of-truth generation
- Adds build:schema npm script

Fixes Issue #22 Bug #6.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 5 more bug fixes from Issue #22

- fix(file_upload): call storage.upload() in all 3 paths (operation, CLI upload, CLI sync) with rollback semantics (#22 Bug #9)
- fix(import): use atomic index counter for parallel queue instead of array.shift() race, preserve checkpoint on errors (#22 Bug #3)
- fix(s3): replace unsigned fetch with @aws-sdk/client-s3 for proper SigV4 auth, supports R2/MinIO via forcePathStyle (#22 Bug #10)
- fix(redirect): verify remote file exists before deleting local copy, skip files not found in storage (#22 Bug #5)
- deps: add @aws-sdk/client-s3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: remote MCP server via Supabase Edge Functions

Deploy GBrain as a serverless remote MCP endpoint on your existing Supabase
instance. One brain, accessible from Claude Desktop, Claude Code, Cowork,
Perplexity Computer, and any MCP client. Zero new infrastructure.

New files:
- supabase/functions/gbrain-mcp/index.ts — Edge Function with Hono + MCP SDK
- supabase/functions/gbrain-mcp/deno.json — Deno import map
- src/edge-entry.ts — curated bundle entry point (excludes fs-dependent modules)
- src/commands/auth.ts — standalone token management (create/list/revoke/test)
- scripts/deploy-remote.sh — one-script deployment
- .env.production.example — 3-value config template

Changes:
- config.ts: lazy-evaluate CONFIG_DIR (no homedir() at module scope)
- schema.sql: add access_tokens + mcp_request_log tables
- package.json: add build:edge script

Auth: bearer tokens via access_tokens table (SHA-256 hashed, per-client, revocable)
Transport: WebStandardStreamableHTTPServerTransport (stateless, Streamable HTTP)
Health: /health endpoint (unauth: 200/503, auth: postgres/pgvector/openai checks)
Excluded from remote: sync_brain, file_upload (may exceed 60s timeout)

Setup: clone, fill .env.production, run scripts/deploy-remote.sh, create token, done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: per-client MCP setup guides

- docs/mcp/DEPLOY.md — deployment walkthrough, auth, troubleshooting, latency table
- docs/mcp/CLAUDE_CODE.md — claude mcp add command
- docs/mcp/CLAUDE_DESKTOP.md — Settings > Integrations (NOT JSON config!)
- docs/mcp/CLAUDE_COWORK.md — remote + local bridge paths
- docs/mcp/PERPLEXITY.md — Perplexity Computer connector setup
- docs/mcp/CHATGPT.md — coming soon (requires OAuth 2.1, P0 TODO)
- docs/mcp/ALTERNATIVES.md — Tailscale Funnel + ngrok self-hosted options

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.6.0)

GBrain v0.6.0: Remote MCP server via Supabase Edge Functions + 12 bug fixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Remote MCP Server section to README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: make document-release mandatory in CLAUDE.md, add MCP key files

Post-ship requirements section: document-release is NOT optional. Lists every
file that must be checked on every ship. A ship without updated docs is incomplete.

Also adds remote MCP server files to Key files section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: batch upsertChunks into single statement to prevent deadlocks

The per-chunk UPSERT loop caused deadlocks under parallel workers because
each INSERT ON CONFLICT acquired row-level locks sequentially. Multiple
workers upserting different pages could deadlock on the shared unique index.

Fix: batch all chunks into a single multi-row INSERT ON CONFLICT statement.
One round-trip, one lock acquisition. COALESCE preserves existing embeddings
when the new value is NULL.

Fixes CI failure: "E2E: Parallel Import > parallel import with --workers 4"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: advisory lock in initSchema() prevents deadlock on concurrent DDL

When multiple processes call initSchema() concurrently (e.g., test setup +
CLI subprocess, or parallel workers during E2E tests), the schema SQL's
DROP TRIGGER + CREATE TRIGGER statements acquire AccessExclusiveLock on
different tables, causing deadlocks.

Fix: pg_advisory_lock(42) serializes all initSchema() calls within the
same database. The lock is session-scoped and released in a finally block.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add explicit test timeouts for CLI subprocess E2E tests

CLI subprocess tests (Setup Journey, Doctor Command, Parallel Import)
spawn `bun run src/cli.ts` which takes several seconds to JIT compile +
connect. The Bun test framework default 5000ms per-test timeout is too
tight for CI. Added 30-60s timeouts matching each subprocess's own
timeout to prevent false failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: infinite recursion in config.ts exported getConfigDir/getConfigPath

The replace_all refactor created recursive functions: the exported
getConfigDir() called the private getConfigDir() which called itself.
Renamed exports to configDir()/configPath() to avoid shadowing.

Also adds scripts/smoke-test-mcp.ts — verified all 8 MCP tool calls
work against a real Postgres database.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 15:23:00 -10:00
Garry Tan
b22cbd349a feat: GBrain v0.1.0 — Postgres-native personal knowledge brain (#1)
* chore: add CLAUDE.md with project context and gstack skill routing rules

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: initialize project with Bun + TypeScript

package.json with dependencies (postgres, pgvector, openai, anthropic,
MCP SDK, gray-matter). TypeScript config targeting ESNext with bundler
module resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add foundation layer — engine interface, Postgres engine, schema

BrainEngine pluggable interface with full PostgresEngine: CRUD, search
(keyword + vector), links, tags, timeline, versions, stats, health,
ingest log, config. Trigger-based tsvector spanning pages +
timeline_entries. Markdown parser with frontmatter, compiled_truth /
timeline splitting, and round-trip serialization. 19 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add 3-tier chunking and embedding service

Recursive delimiter-aware chunker (5-level hierarchy, 300-word chunks,
50-word overlap). Semantic chunker with Savitzky-Golay boundary detection
and recursive fallback. LLM-guided chunker via Claude Haiku with sliding
window topic detection. OpenAI embedding service with batch support,
exponential backoff, and rate limit handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add hybrid search with RRF fusion, expansion, and 4-layer dedup

Hybrid search merges vector (pgvector HNSW) + keyword (tsvector) via
Reciprocal Rank Fusion. Multi-query expansion via Claude Haiku generates
2 alternative phrasings. 4-layer dedup pipeline: by source, cosine
similarity, type diversity (60% cap), per-page cap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add GBRAIN_V0 spec, pluggable engine architecture, SQLite engine plan

GBRAIN_V0.md: full product spec with architecture decisions, CLI commands,
schema, search architecture, chunking strategies, first-time experience,
and future plans. ENGINES.md: pluggable engine interface, capability matrix,
how to add new backends. SQLITE_ENGINE.md: complete SQLite implementation
plan with schema, FTS5 setup, vector search options, and contributor guide.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add CLI with all commands

Full CLI dispatcher with 25+ commands: init (Supabase wizard), get, put,
delete, list, search, query (hybrid RRF), import (bulk with progress bar),
export (round-trip), embed, stats, health, tag/untag/tags, link/unlink/
backlinks/graph, timeline/timeline-add, history/revert, config, upgrade,
serve, call. Smart slug resolution on reads. Version snapshots on updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add MCP stdio server with all brain tools

20 MCP tools mirroring CLI operations: get/put/delete/list pages,
search (keyword), query (hybrid RRF + expansion), tags, links with
graph traversal, timeline, stats, health, version history, and revert.
Auto-chunks and embeds on put_page. CLI and MCP share the same engine.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add 6 skill files and ClawHub manifest

Fat markdown skills for AI agents: ingest (meetings/docs/articles with
timeline merge), query (3-layer search + synthesis + citations), maintain
(health checks, stale detection, orphan audit), enrich (external API
enrichment), briefing (daily briefing compilation), migrate (universal
migration from Obsidian/Notion/Logseq/markdown/CSV/JSON/Roam).
ClawHub manifest for skill distribution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add README, CONTRIBUTING, update CLAUDE.md test references

README with quickstart, commands, architecture, library usage, MCP setup,
and links to design docs. CONTRIBUTING with setup, project structure,
and guides for adding commands and engines. CLAUDE.md updated to reference
actual test files instead of planned-but-unwritten import test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address adversarial review findings — 5 critical/high fixes

- revertToVersion: add page_id check to prevent cross-page data corruption
- traverseGraph: use UNION instead of UNION ALL for cycle safety
- embedAll: preserve all chunks when embedding stale subset only
- embedding: throw on retry exhaustion instead of returning zero vectors
- putPage: validate slugs to prevent path traversal on export

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.1.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: expand README with schema, install, search architecture, and motivation

Why it exists, how search works (with ASCII diagram), full database schema
with all 9 tables and index details, chunking strategies explained, storage
estimates, setup wizard walkthrough, knowledge model with example page,
library usage with more examples, expanded skills table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: add MIT license (Copyright 2026 Garry Tan)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add OpenClaw install flow as primary option in README

OpenClaw users just say "install gbrain" and the orchestrator handles
everything: package install, Supabase setup wizard, skill registration.
Shows the conversational interface for querying, ingesting, and briefings.
ClawHub and standalone CLI paths follow as alternatives.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add prerequisites and explicit OpenClaw install instructions

Prerequisites table listing Supabase, OpenAI, and Anthropic dependencies
with links. Environment variable setup. Explicit step-by-step prompt for
OpenClaw users showing exactly what to tell the orchestrator. Note that
search degrades gracefully without API keys (keyword-only without OpenAI,
no expansion without Anthropic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: scrub named references, add PG essay demo section to README

Replace all Pedro/Brex/Jensen Huang/River AI examples with Paul Graham
essay examples using the kindling corpus. Add "Try it" section to README
showing the power of hybrid search on PG essays in 90 seconds. Update
test fixtures to use concept pages instead of person pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 12:48:10 -07:00