gbrain/CHANGELOG.md
Garry Tan 3e21e9b69b feat: GBrain v0.6.0 — Remote MCP Server + 12 Bug Fixes (#28)
* fix: 7 bug fixes from Issue #9 and #22

- fix(mcp): use ListToolsRequestSchema/CallToolRequestSchema instead of string literals (Issue #9, PR #25)
- fix(mcp): handleToolCall reads dry_run from params instead of hardcoding false (#22 Bug #11)
- fix(search): keyword search returns best chunk per page via DISTINCT ON, not all chunks (#22 Bug #8)
- fix(search): dedup layer 1 keeps top 3 chunks per page instead of collapsing to 1 (#22 Bug #12)
- fix(engine): transaction uses scoped engine via Object.create, no shared state mutation (#22 Bug #2)
- fix(engine): upsertChunks uses UPSERT instead of DELETE+INSERT, preserves existing embeddings (#22 Bug #1)
- fix(slugs): validateSlug normalizes to lowercase, pathToSlug lowercases consistently (#22 Bug #4)
- schema: add unique index on content_chunks(page_id, chunk_index) for UPSERT support
- schema: add access_tokens and mcp_request_log tables via migration
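The layer 1 dedup change listed above (keep the top 3 chunks per page instead of 1) can be illustrated with a small sketch. This is not GBrain's actual code: the `ScoredChunk` shape and the `keepTopPerPage` name are hypothetical, chosen only for the example.

```typescript
// Hypothetical shape for a scored search hit; field names are assumptions.
interface ScoredChunk {
  slug: string;
  chunkIndex: number;
  score: number;
}

// Keep at most `limit` highest-scoring chunks per page slug.
function keepTopPerPage(chunks: ScoredChunk[], limit = 3): ScoredChunk[] {
  const kept = new Map<string, number>();
  const result: ScoredChunk[] = [];
  // Sort a copy by descending score so the best chunks are seen first.
  for (const chunk of [...chunks].sort((a, b) => b.score - a.score)) {
    const count = kept.get(chunk.slug) ?? 0;
    if (count < limit) {
      kept.set(chunk.slug, count + 1);
      result.push(chunk);
    }
  }
  return result;
}
```

With `limit = 1` this reduces to the older one-chunk-per-page behavior, which leaves later dedup layers nothing to choose from.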

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: embed schema.sql at build time, remove fs dependency from initSchema

initSchema() previously read schema.sql from disk at runtime via readFileSync,
which broke in compiled Bun binaries and Deno Edge Functions. Now uses a
generated schema-embedded.ts constant (run `bun run build:schema` to regenerate).

- Removes fs and path imports from postgres-engine.ts and db.ts
- Adds scripts/build-schema.sh for one-source-of-truth generation
- Adds build:schema npm script

Fixes Issue #22 Bug #6.
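The commit describes a shell script (scripts/build-schema.sh) that produces schema-embedded.ts. As a rough illustration of the same idea in TypeScript, the sketch below renders a module that exports the SQL as a string constant. The function name `renderSchemaModule` and the constant name `SCHEMA_SQL` are assumptions, not names taken from the repository.

```typescript
// Render a TypeScript module that exports the schema as a string constant.
// JSON.stringify takes care of escaping quotes, backslashes, and newlines.
function renderSchemaModule(sql: string): string {
  return [
    "// AUTO-GENERATED from schema.sql. Do not edit by hand.",
    `export const SCHEMA_SQL = ${JSON.stringify(sql)};`,
    "",
  ].join("\n");
}
```

Because the schema becomes an ordinary import, nothing touches the filesystem at runtime, which is the property compiled binaries and Edge Functions need.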

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 5 more bug fixes from Issue #22

- fix(file_upload): call storage.upload() in all 3 paths (operation, CLI upload, CLI sync) with rollback semantics (#22 Bug #9)
- fix(import): use atomic index counter for parallel queue instead of array.shift() race, preserve checkpoint on errors (#22 Bug #3)
- fix(s3): replace unsigned fetch with @aws-sdk/client-s3 for proper SigV4 auth, supports R2/MinIO via forcePathStyle (#22 Bug #10)
- fix(redirect): verify remote file exists before deleting local copy, skip files not found in storage (#22 Bug #5)
- deps: add @aws-sdk/client-s3
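The index-counter approach mentioned in the import fix can be sketched as a small worker pool. This is an illustration under assumptions, not the project's implementation: `runPool` and its signature are hypothetical.

```typescript
// Process `items` with `workers` concurrent async workers.
// Each worker claims the next index synchronously (`next++`) before awaiting,
// so no two workers can claim the same item.
async function runPool<T, R>(
  items: readonly T[],
  workers: number,
  handler: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (true) {
      const index = next++; // the claim happens before any await
      if (index >= items.length) return;
      results[index] = await handler(items[index], index);
    }
  }
  await Promise.all(Array.from({ length: Math.max(1, workers) }, worker));
  return results;
}
```

A counter also makes checkpointing straightforward, since the highest completed index can be recorded without inspecting a mutated array.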

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: remote MCP server via Supabase Edge Functions

Deploy GBrain as a serverless remote MCP endpoint on your existing Supabase
instance. One brain, accessible from Claude Desktop, Claude Code, Cowork,
Perplexity Computer, and any MCP client. Zero new infrastructure.

New files:
- supabase/functions/gbrain-mcp/index.ts — Edge Function with Hono + MCP SDK
- supabase/functions/gbrain-mcp/deno.json — Deno import map
- src/edge-entry.ts — curated bundle entry point (excludes fs-dependent modules)
- src/commands/auth.ts — standalone token management (create/list/revoke/test)
- scripts/deploy-remote.sh — one-script deployment
- .env.production.example — 3-value config template

Changes:
- config.ts: lazy-evaluate CONFIG_DIR (no homedir() at module scope)
- schema.sql: add access_tokens + mcp_request_log tables
- package.json: add build:edge script

Auth: bearer tokens via access_tokens table (SHA-256 hashed, per-client, revocable)
Transport: WebStandardStreamableHTTPServerTransport (stateless, Streamable HTTP)
Health: /health endpoint (unauth: 200/503, auth: postgres/pgvector/openai checks)
Excluded from remote: sync_brain, file_upload (may exceed 60s timeout)

Setup: clone, fill .env.production, run scripts/deploy-remote.sh, create token, done.
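To illustrate the auth model above (bearer tokens stored only as SHA-256 hashes), here is a minimal sketch using Node-compatible crypto. The function names, the `gbrain_` token prefix, and the in-memory `Set` standing in for the access_tokens table are all assumptions made for the example.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hash a bearer token so that only the digest needs to be stored.
function hashToken(token: string): string {
  return createHash("sha256").update(token, "utf8").digest("hex");
}

// Generate a new random token; the "gbrain_" prefix is a made-up convention.
function createToken(): { token: string; hash: string } {
  const token = "gbrain_" + randomBytes(32).toString("hex");
  return { token, hash: hashToken(token) };
}

// Extract the token from an Authorization header and check it against
// the stored hashes (a Set stands in for a database lookup here).
function isAuthorized(header: string | null, storedHashes: Set<string>): boolean {
  const match = /^Bearer\s+(.+)$/i.exec(header ?? "");
  return match !== null && storedHashes.has(hashToken(match[1].trim()));
}
```

Storing the digest rather than the token means a leaked table does not directly expose usable credentials, and revoking a client is a single row change.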

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: per-client MCP setup guides

- docs/mcp/DEPLOY.md — deployment walkthrough, auth, troubleshooting, latency table
- docs/mcp/CLAUDE_CODE.md — claude mcp add command
- docs/mcp/CLAUDE_DESKTOP.md — Settings > Integrations (NOT JSON config!)
- docs/mcp/CLAUDE_COWORK.md — remote + local bridge paths
- docs/mcp/PERPLEXITY.md — Perplexity Computer connector setup
- docs/mcp/CHATGPT.md — coming soon (requires OAuth 2.1, P0 TODO)
- docs/mcp/ALTERNATIVES.md — Tailscale Funnel + ngrok self-hosted options

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.6.0)

GBrain v0.6.0: Remote MCP server via Supabase Edge Functions + 12 bug fixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Remote MCP Server section to README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: make document-release mandatory in CLAUDE.md, add MCP key files

Post-ship requirements section: document-release is NOT optional. Lists every
file that must be checked on every ship. A ship without updated docs is incomplete.

Also adds remote MCP server files to Key files section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: batch upsertChunks into single statement to prevent deadlocks

The per-chunk UPSERT loop caused deadlocks under parallel workers because
each INSERT ON CONFLICT acquired row-level locks sequentially. Multiple
workers upserting different pages could deadlock on the shared unique index.

Fix: batch all chunks into a single multi-row INSERT ON CONFLICT statement.
One round-trip, one lock acquisition. COALESCE preserves existing embeddings
when the new value is NULL.

Fixes CI failure: "E2E: Parallel Import > parallel import with --workers 4"
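The single-statement upsert described above might look roughly like the following query builder. It is a sketch, not the project's code: only the table name and the `(page_id, chunk_index)` conflict target come from the changelog, while the `chunk_text` and `embedding` column names, the `ChunkRow` shape, and the function name are assumptions.

```typescript
// Hypothetical chunk shape for the example.
interface ChunkRow {
  chunkIndex: number;
  text: string;
  embedding: string | null; // pgvector literal such as "[0.1,0.2]", or null
}

// Build one multi-row INSERT ... ON CONFLICT statement with positional params.
function buildUpsertChunks(
  pageId: number,
  chunks: ChunkRow[],
): { sql: string; params: unknown[] } {
  if (chunks.length === 0) throw new Error("no chunks to upsert");
  const params: unknown[] = [];
  const tuples = chunks.map((c) => {
    params.push(pageId, c.chunkIndex, c.text, c.embedding);
    const n = params.length;
    return `($${n - 3}, $${n - 2}, $${n - 1}, $${n}::vector)`;
  });
  const sql =
    `INSERT INTO content_chunks (page_id, chunk_index, chunk_text, embedding)\n` +
    `VALUES ${tuples.join(", ")}\n` +
    `ON CONFLICT (page_id, chunk_index) DO UPDATE SET\n` +
    `  chunk_text = EXCLUDED.chunk_text,\n` +
    // Keep the stored embedding when the incoming one is NULL.
    `  embedding = COALESCE(EXCLUDED.embedding, content_chunks.embedding)`;
  return { sql, params };
}
```

The `COALESCE` in the `DO UPDATE` clause is what preserves an existing embedding when the new row carries NULL.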

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: advisory lock in initSchema() prevents deadlock on concurrent DDL

When multiple processes call initSchema() concurrently (e.g., test setup +
CLI subprocess, or parallel workers during E2E tests), the schema SQL's
DROP TRIGGER + CREATE TRIGGER statements acquire AccessExclusiveLock on
different tables, causing deadlocks.

Fix: pg_advisory_lock(42) serializes all initSchema() calls within the
same database. The lock is session-scoped and released in a finally block.
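The lock-then-release pattern described above can be sketched with a small wrapper. The `SqlClient` interface and `withAdvisoryLock` name are hypothetical; the sketch assumes the client represents one dedicated connection, because session-level advisory locks must be released on the session that took them.

```typescript
// Minimal client interface for the example; an adapter over a real
// Postgres driver's single connection is assumed.
interface SqlClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

// Run `fn` while holding a session-level advisory lock, and always
// release the lock afterwards, even if `fn` throws.
async function withAdvisoryLock<T>(
  client: SqlClient,
  key: number,
  fn: () => Promise<T>,
): Promise<T> {
  await client.query("SELECT pg_advisory_lock($1)", [key]);
  try {
    return await fn();
  } finally {
    await client.query("SELECT pg_advisory_unlock($1)", [key]);
  }
}
```

A call such as `withAdvisoryLock(client, 42, runSchemaDdl)` (with `runSchemaDdl` standing in for the DDL step) would make a second process wait at `pg_advisory_lock` until the first finishes.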

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add explicit test timeouts for CLI subprocess E2E tests

CLI subprocess tests (Setup Journey, Doctor Command, Parallel Import)
spawn `bun run src/cli.ts` which takes several seconds to JIT compile +
connect. The Bun test framework default 5000ms per-test timeout is too
tight for CI. Added 30-60s timeouts matching each subprocess's own
timeout to prevent false failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: infinite recursion in config.ts exported getConfigDir/getConfigPath

The replace_all refactor created recursive functions: the exported
getConfigDir() called the private getConfigDir() which called itself.
Renamed exports to configDir()/configPath() to avoid shadowing.

Also adds scripts/smoke-test-mcp.ts — verified all 8 MCP tool calls
work against a real Postgres database.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 15:23:00 -10:00


Changelog

All notable changes to GBrain will be documented in this file.

[0.6.0] - 2026-04-10

Added

  • Access your brain from any AI client. Deploy GBrain as a serverless remote MCP endpoint on your existing Supabase instance. Works with Claude Desktop, Claude Code, Cowork, and Perplexity Computer. One URL, bearer token auth, zero new infrastructure. Clone the repo, fill in 3 env vars, run scripts/deploy-remote.sh, done.
  • Per-client setup guides in docs/mcp/ for Claude Code, Claude Desktop, Cowork, Perplexity, and ChatGPT (coming soon, requires OAuth 2.1). Also documents Tailscale Funnel and ngrok as self-hosted alternatives.
  • Token management via standalone src/commands/auth.ts. Create, list, and revoke per-client bearer tokens. Includes a smoke test: auth.ts test <url> --token <token> verifies the full pipeline (initialize + tools/list + get_stats) in 3 seconds.
  • Usage logging via mcp_request_log table. Every remote tool call logs token name, operation, latency, and status for debugging and security auditing.
  • Hardened health endpoint at /health. Unauthenticated: 200/503 only (no info disclosure). Authenticated: checks postgres, pgvector, and OpenAI API key status.

Fixed

  • MCP server actually connects now. Handler registration used string literals ('tools/list' as any) instead of SDK typed schemas. Replaced with ListToolsRequestSchema and CallToolRequestSchema. Without this fix, gbrain serve silently failed to register handlers. (Issue #9)
  • Search results no longer flooded by one large page. Keyword search returned ALL chunks from matching pages. Now returns one best chunk per page via DISTINCT ON. (Issue #22)
  • Search dedup no longer collapses to one chunk per page. Layer 1 kept only the single highest-scoring chunk per slug. Now keeps top 3, letting later dedup layers (text similarity, cap per page) do their job. (Issue #22)
  • Transactions no longer corrupt shared state. Both PostgresEngine.transaction() and db.withTransaction() swapped the shared connection reference, breaking under concurrent use. Now uses scoped engine via Object.create with no shared state mutation. (Issue #22)
  • embed --stale no longer wipes valid embeddings. upsertChunks() deleted all chunks then re-inserted, writing NULL for chunks without new embeddings. Now uses UPSERT (INSERT ON CONFLICT UPDATE) with COALESCE to preserve existing embeddings. (Issue #22)
  • Slug normalization is consistent. pathToSlug() preserved case while inferSlug() lowercased. Now validateSlug() enforces lowercase at the validation layer, covering all entry points. (Issue #22)
  • initSchema no longer reads from disk at runtime. Both schema loaders used readFileSync with import.meta.url, which broke in compiled binaries and Deno Edge Functions. Schema is now embedded at build time via scripts/build-schema.sh. (Issue #22)
  • file_upload actually uploads content. The operation wrote DB metadata but never called the storage backend. Fixed in all 3 paths (operation, CLI upload, CLI sync) with rollback semantics. (Issue #22)
  • S3 storage backend authenticates requests. signedFetch() was just unsigned fetch(). Replaced with @aws-sdk/client-s3 for proper SigV4 signing. Supports R2/MinIO via forcePathStyle. (Issue #22)
  • Parallel import uses thread-safe queue. queue.shift() had race conditions under parallel workers. Now uses an atomic index counter. Checkpoint preserved on errors for safe resume. (Issue #22)
  • redirect verifies remote existence before deleting local files. Previously deleted local files unconditionally. Now checks storage backend before removing. (Issue #22)
  • gbrain call respects dry_run. handleToolCall() hardcoded dryRun: false. Now reads from params. (Issue #22)
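The transaction fix above relies on `Object.create` to build a scoped engine. A minimal sketch of that technique follows; the `Engine` class, its `conn` field, and the method names are simplified stand-ins, not the real PostgresEngine.

```typescript
// Hypothetical connection handle; only a name is needed for the demo.
interface Conn {
  name: string;
}

class Engine {
  constructor(public conn: Conn) {}

  describe(): string {
    return `using ${this.conn.name}`;
  }

  // Run `fn` against a scoped engine that inherits every method from `this`
  // but carries its own `conn`. The shared instance is never mutated.
  async transaction<T>(txConn: Conn, fn: (tx: Engine) => Promise<T>): Promise<T> {
    const scoped: Engine = Object.create(this);
    scoped.conn = txConn; // own property on `scoped`, shadows the shared one
    return fn(scoped);
  }
}
```

Because the assignment creates a property on the derived object only, concurrent callers that still hold the original engine keep seeing the pooled connection.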

Changed

  • Added @aws-sdk/client-s3 as a dependency for authenticated S3 operations.
  • Schema migration v2: unique index on content_chunks(page_id, chunk_index) for UPSERT support.
  • Schema migration v3: access_tokens and mcp_request_log tables for remote MCP auth.

[0.5.1] - 2026-04-10

Fixed

  • Apple Notes and files with spaces just work. Paths like Apple Notes/2017-05-03 ohmygreen.md now auto-slugify to clean slugs (apple-notes/2017-05-03-ohmygreen). Spaces become hyphens, parens and special characters are stripped, accented characters normalize to ASCII. All 5,861+ Apple Notes files import cleanly without manual renaming.
  • Existing brains auto-migrate. On first run after upgrade, a one-time migration renames all existing slugs with spaces or special characters to their clean form. Links are rewritten automatically. No manual cleanup needed.
  • Import and sync produce identical slugs. Both pipelines now use the same slugifyPath() function, eliminating the mismatch where sync preserved case but import lowercased.
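The slug rules described in this release (spaces to hyphens, special characters stripped, accents normalized to ASCII, lowercase throughout) can be approximated as below. This is a sketch of those stated rules only; the real slugifyPath() may differ in details such as which punctuation it keeps, and retaining `.` and `_` here is an assumption.

```typescript
// Sketch of path-to-slug normalization following the rules listed above.
function slugifyPath(path: string): string {
  return path
    .replace(/\.md$/i, "") // drop the markdown extension
    .split("/")
    .map((segment) =>
      segment
        .normalize("NFKD") // split accented letters into base + mark
        .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
        .toLowerCase()
        .replace(/\s+/g, "-") // spaces become hyphens
        .replace(/[^a-z0-9._-]/g, "") // strip parens and other specials
        .replace(/-{2,}/g, "-") // collapse runs of hyphens
        .replace(/^-+|-+$/g, ""), // trim leading and trailing hyphens
    )
    .filter((segment) => segment.length > 0)
    .join("/");
}
```

Routing both import and sync through one function like this is what guarantees the two pipelines agree on every slug.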

[0.5.0] - 2026-04-10

Added

  • Your brain never falls behind. Live sync keeps the vector DB current with your brain repo automatically. Set up a cron, use --watch, hook into GitHub webhooks, or use git hooks. Your agent picks whatever fits its environment. Edit a markdown file, push, and within minutes it's searchable. No more stale embeddings serving wrong answers.
  • Know your install actually works. New verification runbook (docs/GBRAIN_VERIFY.md) catches the silent failures that used to go unnoticed: the pooler bug that skips pages, missing embeddings, stale sync. The real test: push a correction, wait, search for it. If the old text comes back, sync is broken and the runbook tells you exactly why.
  • New installs set up live sync automatically. The setup skill now includes live sync (Phase H) and full verification (Phase I) as mandatory steps. Agents that install GBrain will configure automatic sync and verify it works before declaring setup complete.
  • Fixes the silent page-skip bug. If your Supabase connection uses the Transaction mode pooler, sync silently skips most pages. The new docs call this out as a hard prerequisite with a clear fix (switch to Session mode). The verification runbook catches it by comparing page count against file count.

[0.4.2] - 2026-04-10

Changed

  • All GitHub Actions pinned to commit SHAs across test, e2e, and release workflows. Prevents supply chain attacks via mutable version tags.
  • Workflow permissions hardened: contents: read on test and e2e workflows limits GITHUB_TOKEN blast radius.
  • OpenClaw CI install pinned to v2026.4.9 instead of pulling latest.

Added

  • Gitleaks secret scanning CI job runs on every push and PR. Catches accidentally committed API keys, tokens, and credentials.
  • .gitleaks.toml config with allowlists for test fixtures and example files.
  • GitHub Actions SHA maintenance rule in CLAUDE.md so pins stay fresh on every /ship and /review.
  • S3 Sig V4 TODO for future implementation when S3 storage becomes a deployment path.

[0.4.1] - 2026-04-09

Added

  • gbrain check-update command with --json output. Checks GitHub Releases for new versions, compares semver (minor+ only, skips patches), fetches and parses changelog diffs. Fail-silent on network errors.
  • SKILLPACK Section 17: Auto-Update Notifications. Full agent playbook for the update lifecycle: check, notify, consent, upgrade, skills refresh, schema sync, report. Never auto-upgrades without user permission.
  • Standalone SKILLPACK self-update for users who load the skillpack directly without the gbrain CLI. Version markers in SKILLPACK and RECOMMENDED_SCHEMA headers, with raw GitHub URL fetching.
  • Step 7 in the OpenClaw install paste: daily update checks, default-on. User opts into being notified about updates, not into automatic installs.
  • Setup skill Phase G: conditional auto-update offer for manual install users.
  • Schema state tracking via ~/.gbrain/update-state.json. Tracks which recommended schema directories the user adopted, declined, or added custom. Future upgrades suggest new additions without re-suggesting declined items.
  • skills/migrations/ directory convention for version-specific post-upgrade agent directives.
  • 20 unit tests and 5 E2E tests for the check-update command, covering version comparison, changelog extraction, CLI wiring, and real GitHub API interaction.
  • E2E test DB lifecycle documentation in CLAUDE.md: spin up, run tests, tear down. No orphaned containers.
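The "minor+ only, skips patches" comparison used by check-update can be sketched as follows. The function names are hypothetical, and the treatment of malformed input (return false, in the spirit of fail-silent) is an assumption.

```typescript
// Parse "v1.2.3" or "1.2.3" into numeric parts; returns null if malformed.
function parseVersion(v: string): [number, number, number] | null {
  const m = /^v?(\d+)\.(\d+)\.(\d+)/.exec(v.trim());
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

// True only when `latest` is ahead of `current` by a minor or major version.
function isNotableUpdate(current: string, latest: string): boolean {
  const a = parseVersion(current);
  const b = parseVersion(latest);
  if (!a || !b) return false; // stay silent on unparseable input
  if (b[0] !== a[0]) return b[0] > a[0]; // major versions decide first
  return b[1] > a[1]; // same major: only a minor bump counts
}
```

Comparing the major component before the minor one is the guard whose absence is described under Fixed in this release.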

Changed

  • detectInstallMethod() exported from upgrade.ts for reuse by check-update.

Fixed

  • Semver comparison in changelog extraction was missing major-version guard, causing incorrect changelog entries to appear when crossing major version boundaries.

[0.4.0] - 2026-04-09

Added

  • gbrain doctor command with --json output. Checks pgvector extension, RLS policies, schema version, embedding coverage, and connection health. Agents can self-diagnose issues.
  • Pluggable storage backends: S3, Supabase Storage, and local filesystem. Choose where binary files live independently of the database. Configured via gbrain init or environment variables.
  • Parallel import with per-worker engine instances. Large brain imports now use multiple database connections concurrently instead of a single serial pipeline.
  • Import resume checkpoints. If gbrain import is interrupted, it picks up where it left off instead of re-importing everything.
  • Automatic schema migration runner. On connect, gbrain detects the current schema version and applies any pending migrations without manual intervention.
  • Row-Level Security (RLS) enabled on all tables with BYPASSRLS safety check. Every query goes through RLS policies.
  • --json flag on gbrain init and gbrain import for machine-readable output. Agents can parse structured results instead of scraping CLI text.
  • File migration CLI (gbrain files migrate) for moving files between storage backends. Two-way-door: test with --dry-run, migrate incrementally.
  • Bulk chunk INSERT for faster page writes. Chunks are inserted in a single statement instead of one-at-a-time.
  • Supabase smart URL parsing: automatically detects and converts IPv6-only pooler URLs to the correct connection format.
  • 56 new unit tests covering doctor, storage backends, file migration, import resume, slug validation, setup branching, Supabase admin, and YAML parsing. Test suite grew from 9 to 19 test files.
  • E2E tests for parallel import concurrency and all new features.

Fixed

  • validateSlug now accepts any filename characters (spaces, unicode, special chars) instead of rejecting non-alphanumeric slugs. Apple Notes and other real-world filenames import cleanly.
  • Import resilience: files over 5MB are skipped with a warning instead of crashing the pipeline. Errors in individual files no longer abort the entire import.
  • gbrain init detects IPv6-only Supabase URLs and adds the required pgvector check during setup.
  • E2E test fixture counts, CLI argument parsing, and doctor exit codes cleaned up.

Changed

  • Setup skill and README rewritten for agent-first developer experience.
  • Maintain skill updated with RLS verification, schema health checks, and nohup hints for large embedding jobs.

[0.3.0] - 2026-04-08

Added

  • Contract-first architecture: single operations.ts defines ~30 shared operations. CLI, MCP, and tools-json all generated from the same source. Zero drift.
  • OperationError type with structured error codes (page_not_found, invalid_params, embedding_failed, etc.). Agents can self-correct.
  • dry_run parameter on all mutating operations. Agents preview before committing.
  • importFromContent() split from importFile(). Both share the same chunk+embed+tag pipeline, but importFromContent works from strings (used by put_page). Wrapped in engine.transaction().
  • Idempotency hash now includes ALL fields (title, type, frontmatter, tags), not just compiled_truth + timeline. Metadata-only edits no longer silently skipped.
  • get_page now supports optional fuzzy: true for slug resolution. Returns resolved_slug so callers know what happened.
  • query operation now supports expand toggle (default true). Both CLI and MCP get the same control.
  • New operations wired up: put_raw_data, get_raw_data, resolve_slugs, get_chunks, log_ingest, get_ingest_log, file_list, file_upload, file_url.
  • OpenClaw bundle plugin manifest (openclaw.plugin.json) with config schema, MCP server config, and skill listing.
  • GitHub Actions CI: test on push/PR, multi-platform release builds (macOS arm64 + Linux x64) on version tags.
  • gbrain init --non-interactive flag for plugin mode (accepts config via flags/env vars, no TTY required).
  • Post-upgrade version verification in gbrain upgrade.
  • Parity test (test/parity.test.ts) verifies structural contract between operations, CLI, and MCP.
  • New setup skill replacing install: auto-provision Supabase via CLI, AGENTS.md injection, target TTHW < 2 min.
  • E2E test suite against real Postgres+pgvector. 13 realistic fixtures (miniature brain with people, companies, deals, meetings, concepts), 14 test suites covering all operations, search quality benchmarks, idempotency stress tests, schema validation, and full setup journey verification.
  • GitHub Actions E2E workflow: Tier 1 (mechanical) on every PR, Tier 2 (LLM skills via OpenClaw) nightly.
  • docker-compose.test.yml and .env.testing.example for local E2E development.

Fixed

  • Schema loader in db.ts broke on PL/pgSQL trigger functions containing semicolons inside $$ blocks. Replaced per-statement execution with single conn.unsafe() call.
  • traverseGraph query failed with "could not identify equality operator for type json" when using SELECT DISTINCT with json_agg. Changed to jsonb_agg.

Changed

  • src/mcp/server.ts rewritten from ~233 to ~80 lines. Tool definitions and dispatch generated from operations[].
  • src/cli.ts rewritten. Shared operations auto-registered from operations[]. CLI-only commands (init, upgrade, import, export, files, embed) kept as manual registrations.
  • tools-json output now generated FROM operations[]. Third contract surface eliminated.
  • All 7 skills rewritten with tool-agnostic language. Works with both CLI and MCP plugin contexts.
  • File schema: storage_url column dropped, storage_path is the only identifier. URLs generated on demand via file_url operation.
  • Config loading: env vars (GBRAIN_DATABASE_URL, DATABASE_URL, OPENAI_API_KEY) override config file values. Plugin config injected via env vars.
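The env-over-file precedence described in the last item can be sketched like this. The config field names and the `resolveConfig` function are hypothetical, and the order between GBRAIN_DATABASE_URL and DATABASE_URL is an assumption; the changelog only states that env vars override file values.

```typescript
// Hypothetical shapes for the on-disk config and the resolved result.
interface FileConfig {
  database_url?: string;
  openai_api_key?: string;
}
interface ResolvedConfig {
  databaseUrl?: string;
  openaiApiKey?: string;
}

// Environment variables win over config file values.
function resolveConfig(
  file: FileConfig,
  env: Record<string, string | undefined>,
): ResolvedConfig {
  return {
    // Assumed order: the GBrain-specific variable first, then the generic one.
    databaseUrl: env.GBRAIN_DATABASE_URL ?? env.DATABASE_URL ?? file.database_url,
    openaiApiKey: env.OPENAI_API_KEY ?? file.openai_api_key,
  };
}
```

Passing `env` in as a parameter rather than reading a global keeps the function easy to test and lets a plugin host inject its own values.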

Removed

  • 12 command files migrated to operations.ts: get.ts, put.ts, delete.ts, list.ts, search.ts, query.ts, health.ts, stats.ts, tags.ts, link.ts, timeline.ts, version.ts.
  • storage_url column from files table.

[0.2.0.2] - 2026-04-07

Changed

  • Rewrote recommended brain schema doc with expanded architecture: database layer (entity registry, event ledger, fact store, relationship graph) presented as the core architecture, entity identity and deduplication, enrichment source ordering, epistemic discipline rules, worked examples showing full ingestion chains, concurrency guidance, and browser budget. Smoothed language for open-source readability.

[0.2.0.1] - 2026-04-07

Added

  • Recommended brain schema doc (docs/GBRAIN_RECOMMENDED_SCHEMA.md): full MECE directory structure, compiled truth + timeline pages, enrichment pipeline, resolver decision tree, skill architecture, and cron job recommendations. The OpenClaw paste now links to this as step 5.

Changed

  • First-time experience rewritten. "Try it" section shows your own data, not fictional PG essays. OpenClaw paste references the GitHub repo, includes bun install fallback, and has the agent pick a dynamic query based on what it imported.
  • Removed all references to data/kindling/ (a demo corpus directory that never existed).

[0.2.0] - 2026-04-05

Added

  • You can now keep your brain current with gbrain sync, which uses git's own diff machinery to process only what changed. No more 30-second full directory walks when 3 files changed.
  • Watch mode (gbrain sync --watch) polls for changes and syncs automatically. Set it and forget it.
  • Binary file management with gbrain files commands (list, upload, sync, verify). Store images, PDFs, and audio in Supabase Storage instead of clogging your git repo.
  • Install skill (skills/install/SKILL.md) that walks you through setup from scratch, including Supabase CLI magic path for zero-copy-paste onboarding.
  • Import and sync now share a checkpoint. Run gbrain import, then gbrain sync, and it picks up right where import left off. Zero gap.
  • Tag reconciliation on reimport. If you remove a tag from your markdown, it actually gets removed from the database now.
  • gbrain config show redacts database passwords so you can safely share your config.
  • updateSlug engine method preserves page identity (page_id, chunks, embeddings) across renames. Zero re-embedding cost.
  • sync_brain MCP tool returns structured results so agents know exactly what changed.
  • 20 new sync tests (39 total across 3 test files).

[0.1.0] - 2026-04-05

Added

  • Pluggable engine interface (BrainEngine) with full Postgres + pgvector implementation
  • 25+ CLI commands: init, get, put, delete, list, search, query, import, export, embed, stats, health, link/unlink/backlinks/graph, tag/untag/tags, timeline/timeline-add, history/revert, config, upgrade, serve, call
  • MCP stdio server with 20 tools mirroring all CLI operations
  • 3-tier chunking: recursive (delimiter-aware), semantic (Savitzky-Golay boundary detection), LLM-guided (Claude Haiku topic shifts)
  • Hybrid search with Reciprocal Rank Fusion merging vector + keyword results
  • Multi-query expansion via Claude Haiku (2 alternative phrasings per query)
  • 4-layer dedup pipeline: by source, cosine similarity, type diversity, per-page cap
  • OpenAI embedding service (text-embedding-3-large, 1536 dims) with batch support and exponential backoff
  • Postgres schema with pgvector HNSW, tsvector (trigger-based, spans timeline_entries), pg_trgm fuzzy slug matching
  • Smart slug resolution for reads (fuzzy match via pg_trgm)
  • Page version control with snapshot, history, and revert
  • Typed links with recursive CTE graph traversal (max depth configurable)
  • Brain health dashboard (embed coverage, stale pages, orphans, dead links)
  • Stale alert annotations in search results
  • Supabase init wizard with CLI auto-provision fallback
  • Slug validation to prevent path traversal on export
  • 6 fat markdown skills: ingest, query, maintain, enrich, briefing, migrate
  • ClawHub manifest for skill distribution
  • Full design docs: GBRAIN_V0 spec, pluggable engine architecture, SQLite engine plan
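For readers unfamiliar with Reciprocal Rank Fusion, which the hybrid search item above names, here is a minimal sketch of the technique. It is a generic illustration, not GBrain's implementation: the function name is hypothetical and `k = 60` is the commonly used default constant, assumed here.

```typescript
// Merge several ranked lists of ids with Reciprocal Rank Fusion.
// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
function reciprocalRankFusion(lists: string[][], k = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}
```

Because only ranks are used, vector distances and keyword scores never need to be put on a common scale; an item that appears in both lists simply accumulates two contributions.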