feat: PGLite engine — local brain, zero infrastructure (v0.7.0) (#41)

* refactor: extract shared utils, add runMigration + getChunksWithEmbeddings to BrainEngine

Extract validateSlug, contentHash, rowToPage, rowToChunk, rowToSearchResult
from postgres-engine.ts into shared utils.ts. Add rowToChunk includeEmbedding
parameter for migration support.

Add two new methods to BrainEngine interface:
- runMigration(version, sql) — replaces internal eng.sql access in migrate.ts
- getChunksWithEmbeddings(slug) — returns chunks with embedding data for migration

Replace 'sqlite' with 'pglite' in EngineConfig and GBrainConfig types.
Fix loadConfig to infer engine from database_path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: pluggable engine factory + hybridSearch keyword-only fallback

Add createEngine() factory with dynamic imports so PGLite WASM is never
loaded for Postgres users. Wire CLI to use factory instead of hardcoded
PostgresEngine.

Force workers=1 for PGLite imports (single-connection architecture).

Fix hybridSearch to check OPENAI_API_KEY before calling embed(). When
unset, returns keyword-only results instead of throwing. Critical for
local PGLite users who don't need vector search.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: PGLiteEngine — embedded Postgres 17.5 via WASM, same SQL everywhere

Full BrainEngine implementation (37 methods) using @electric-sql/pglite.
Same SQL as PostgresEngine — tsvector triggers, pgvector HNSW, pg_trgm
fuzzy matching, recursive CTEs, JSONB. Only the driver call syntax differs
(parameterized queries instead of tagged templates).

PGLite schema is the Postgres schema minus RLS, advisory locks, and
remote auth tables (access_tokens, mcp_request_log, files).

No server. No subscription. One directory. Works offline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: smart init (PGLite default) + bidirectional engine migration

gbrain init now defaults to PGLite — brain ready in 2 seconds, no
server needed. Scans target directory: <1000 .md files = PGLite,
>=1000 = suggests Supabase. --supabase and --pglite flags override.

gbrain migrate --to supabase/pglite transfers all data between engines
with manifest-based resume. Copies pages, chunks (with embeddings),
tags, timeline, raw data, links, and config. --force overwrites
non-empty target.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: 60 new tests for PGLite engine, utils, and factory

41 PGLite engine tests covering all 37 BrainEngine methods: CRUD,
tsvector keyword search, pg_trgm fuzzy matching, chunk upsert with
COALESCE, graph traversal via recursive CTE, transactions, cascade
deletes, stats/health, and embedding round-trip.

14 shared utility tests (validateSlug, contentHash, row mappers).
5 engine factory tests (dispatch, error messages).

All run in-memory — zero Docker, zero DATABASE_URL, instant in CI.

Add P0 TODO: submit Bun PR for WASM embedding in bun build --compile
(oven-sh/bun#15032).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.7.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.7.0 PGLite engine

- CLAUDE.md: add PGLite key files, update architecture, add migrate command, add 3 test files
- README.md: PGLite as default init, zero-config getting started, migration path to Supabase
- docs/ENGINES.md: PGLiteEngine shipped (v0.7), capability matrix, migration docs
- docs/SQLITE_ENGINE.md: marked superseded by PGLite

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove stale v0.4 README update prompt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove SQLITE_ENGINE.md (superseded by PGLite)

PGLite uses the same SQL as Postgres, making a separate SQLite
engine unnecessary. docs/ENGINES.md covers PGLiteEngine.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update README step 2 to default to PGLite

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add schema setup step and install-all-integrations step to README

Step 3 now tells agents to read GBRAIN_RECOMMENDED_SCHEMA.md and set up
the MECE directory structure before importing. Step 7 tells agents to
install every available integration recipe, not just list them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update install goal to match full opinionated setup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add 'Need an AI agent first?' section with one-click deploy links

New users who don't have OpenClaw or Hermes Agent get pointed to
AlphaClaw on Render and the Hermes Agent Railway template. One click
each. Claude Code mentioned for users who already have it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add migrate to CLI_ONLY + help output, fix standalone example

- migrate command was missing from CLI_ONLY set (errored as "Unknown command")
- migrate now shows in --help under SETUP
- init help line shows --pglite flag
- standalone CLI example uses gbrain init (not --supabase)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: set realistic time expectation (~30 min to working brain)

DB is 2 seconds. But schema + import + embeddings + integrations
is 15-30 minutes. The agent does the work, you answer API key
questions. Don't oversell time-to-value.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: fix AlphaClaw Render requirement (8GB+ RAM, not free tier)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: final README polish for launch

- GOAL line: "Garry Tan's exact setup" (not Claude Code specific)
- Remove markdown links from code block (won't render)
- STEP 2 renamed from "START HERE" to "DATABASE"
- Tighten Supabase fallback text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove duplicate old install block from README

The v0.5-era "With OpenClaw or Hermes Agent" paste block was
superseded by the top-level "Start here" block. Having both
confused users and the old one still said --supabase as step 2.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clean up README consistency and remove duplicated content

- Remove duplicate "Try it" section (old 4-act walkthrough that
  repeated the install flow and contradicted "~30 min" with "90 sec")
- Remove duplicate Setup section (third repetition of gbrain init)
- Fix brain.db → brain.pglite (actual default path)
- Fix "coming in v0.7" → "not yet implemented" (we ARE v0.7)
- Remove "You don't need Postgres" (confusing since PGLite IS Postgres)
- Deduplicate "competitive dynamics" query (appeared 3 times)
- Collapse redundant standalone CLI section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-04-11 00:01:09 -10:00
committed by GitHub
parent ce15062694
commit 6c7d2ed30b
26 changed files with 2287 additions and 744 deletions

View File

@@ -6,7 +6,12 @@ All notable changes to GBrain will be documented in this file.
### Added
- **Your brain gets new senses automatically.** Integration recipes teach your agent how to wire up voice calls, email, Twitter, and calendar into your brain. Run `gbrain integrations` to see what's available. Your agent reads the recipe, asks for API keys, validates each one, and sets everything up. Markdown is code — the recipe IS the installer.
- **Your brain now runs locally with zero infrastructure.** PGLite (Postgres 17.5 compiled to WASM) gives you the exact same search quality as Supabase, same pgvector HNSW, same pg_trgm fuzzy matching, same tsvector full-text search. No server, no subscription, no API keys needed for keyword search. `gbrain init` and you're running in 2 seconds.
- **Smart init defaults to local.** `gbrain init` now creates a PGLite brain by default. If your repo has 1000+ markdown files, it suggests Supabase for scale. `--supabase` and `--pglite` flags let you choose explicitly.
- **Migrate between engines anytime.** `gbrain migrate --to supabase` transfers your entire brain (pages, chunks, embeddings, tags, links, timeline) to remote Postgres with manifest-based resume. `gbrain migrate --to pglite` goes the other way. Embeddings copy directly, no re-embedding needed.
- **Pluggable engine factory.** `createEngine()` dynamically loads the right engine from config. PGLite WASM is never loaded for Postgres users.
- **Search works without OpenAI.** `hybridSearch` now checks for `OPENAI_API_KEY` before attempting embeddings. No key = keyword-only search. No more crashes when you just want to search your local brain.
- **Your brain gets new senses automatically.** Integration recipes teach your agent how to wire up voice calls, email, Twitter, and calendar into your brain. Run `gbrain integrations` to see what's available. Your agent reads the recipe, asks for API keys, validates each one, and sets everything up. Markdown is code -- the recipe IS the installer.
- **Voice-to-brain: phone calls create brain pages.** The first recipe: Twilio + OpenAI Realtime voice agent. Call a number, talk, and a structured brain page appears with entity detection, cross-references, and a summary posted to your messaging app. Opinionated defaults: caller screening, brain-first lookup, quiet hours, thinking sounds. The smoke test calls YOU (outbound) so you experience the magic immediately.
- **`gbrain integrations` command.** Six subcommands for managing integration recipes: `list` (dashboard of senses + reflexes), `show` (recipe details), `status` (credential checks with direct links to get missing keys), `doctor` (health checks), `stats` (signal analytics), `test` (recipe validation). `--json` on every subcommand for agent-parseable output. No database connection needed.
- **Health heartbeat.** Integrations log events to `~/.gbrain/integrations/<id>/heartbeat.jsonl`. Status checks detect stale integrations and include diagnostic steps.
@@ -14,6 +19,13 @@ All notable changes to GBrain will be documented in this file.
- **"Getting Data In" documentation.** New `docs/integrations/` with a landing page, recipe format documentation, credential gateway guide, and meeting webhook guide. Explains the deterministic collector pattern: code for data, LLMs for judgment.
- **Architecture and philosophy docs.** `docs/architecture/infra-layer.md` documents the shared foundation (import, chunk, embed, search). `docs/ethos/THIN_HARNESS_FAT_SKILLS.md` is Garry's essay on the architecture philosophy with an agent decision guide. `docs/designs/HOMEBREW_FOR_PERSONAL_AI.md` maps the 10-star vision.
### Changed
- **Engine interface expanded.** Added `runMigration()` (replaces internal driver access for schema migrations) and `getChunksWithEmbeddings()` (loads embedding data for cross-engine migration).
- **Shared utilities extracted.** `validateSlug`, `contentHash`, and row mappers moved from `postgres-engine.ts` to `src/core/utils.ts`. Both engines share them.
- **Config infers engine type.** If `database_path` is set but `engine` is missing, config now infers `pglite` instead of defaulting to `postgres`.
- **Import serializes on PGLite.** Parallel workers are Postgres-only. PGLite uses sequential import (single-connection architecture).
## [0.6.1] - 2026-04-10
### Fixed