source-id-write-path-v0.18.2
2 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
6c7d2ed30b |
feat: PGLite engine — local brain, zero infrastructure (v0.7.0) (#41)
* refactor: extract shared utils, add runMigration + getChunksWithEmbeddings to BrainEngine
  Extract validateSlug, contentHash, rowToPage, rowToChunk, rowToSearchResult from postgres-engine.ts into shared utils.ts. Add rowToChunk includeEmbedding parameter for migration support.
  Add two new methods to BrainEngine interface:
  - runMigration(version, sql) — replaces internal eng.sql access in migrate.ts
  - getChunksWithEmbeddings(slug) — returns chunks with embedding data for migration
  Replace 'sqlite' with 'pglite' in EngineConfig and GBrainConfig types. Fix loadConfig to infer engine from database_path.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: pluggable engine factory + hybridSearch keyword-only fallback
  Add createEngine() factory with dynamic imports so PGLite WASM is never loaded for Postgres users. Wire CLI to use factory instead of hardcoded PostgresEngine. Force workers=1 for PGLite imports (single-connection architecture).
  Fix hybridSearch to check OPENAI_API_KEY before calling embed(). When unset, returns keyword-only results instead of throwing. Critical for local PGLite users who don't need vector search.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: PGLiteEngine — embedded Postgres 17.5 via WASM, same SQL everywhere
  Full BrainEngine implementation (37 methods) using @electric-sql/pglite. Same SQL as PostgresEngine — tsvector triggers, pgvector HNSW, pg_trgm fuzzy matching, recursive CTEs, JSONB. Only the driver call syntax differs (parameterized queries instead of tagged templates).
  PGLite schema is the Postgres schema minus RLS, advisory locks, and remote auth tables (access_tokens, mcp_request_log, files).
  No server. No subscription. One directory. Works offline.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: smart init (PGLite default) + bidirectional engine migration
  gbrain init now defaults to PGLite — brain ready in 2 seconds, no server needed. Scans target directory: <1000 .md files = PGLite, >=1000 = suggests Supabase. --supabase and --pglite flags override.
  gbrain migrate --to supabase/pglite transfers all data between engines with manifest-based resume. Copies pages, chunks (with embeddings), tags, timeline, raw data, links, and config. --force overwrites non-empty target.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: 60 new tests for PGLite engine, utils, and factory
  41 PGLite engine tests covering all 37 BrainEngine methods: CRUD, tsvector keyword search, pg_trgm fuzzy matching, chunk upsert with COALESCE, graph traversal via recursive CTE, transactions, cascade deletes, stats/health, and embedding round-trip.
  14 shared utility tests (validateSlug, contentHash, row mappers). 5 engine factory tests (dispatch, error messages).
  All run in-memory — zero Docker, zero DATABASE_URL, instant in CI.
  Add P0 TODO: submit Bun PR for WASM embedding in bun build --compile (oven-sh/bun#15032).
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.7.0)
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update project documentation for v0.7.0 PGLite engine
  - CLAUDE.md: add PGLite key files, update architecture, add migrate command, add 3 test files
  - README.md: PGLite as default init, zero-config getting started, migration path to Supabase
  - docs/ENGINES.md: PGLiteEngine shipped (v0.7), capability matrix, migration docs
  - docs/SQLITE_ENGINE.md: marked superseded by PGLite
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove stale v0.4 README update prompt
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: remove SQLITE_ENGINE.md (superseded by PGLite)
  PGLite uses the same SQL as Postgres, making a separate SQLite engine unnecessary. docs/ENGINES.md covers PGLiteEngine.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update README step 2 to default to PGLite
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add schema setup step and install-all-integrations step to README
  Step 3 now tells agents to read GBRAIN_RECOMMENDED_SCHEMA.md and set up the MECE directory structure before importing. Step 7 tells agents to install every available integration recipe, not just list them.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update install goal to match full opinionated setup
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add 'Need an AI agent first?' section with one-click deploy links
  New users who don't have OpenClaw or Hermes Agent get pointed to AlphaClaw on Render and the Hermes Agent Railway template. One click each. Claude Code mentioned for users who already have it.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add migrate to CLI_ONLY + help output, fix standalone example
  - migrate command was missing from CLI_ONLY set (errored as "Unknown command")
  - migrate now shows in --help under SETUP
  - init help line shows --pglite flag
  - standalone CLI example uses gbrain init (not --supabase)
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: set realistic time expectation (~30 min to working brain)
  DB is 2 seconds. But schema + import + embeddings + integrations is 15-30 minutes. The agent does the work, you answer API key questions. Don't oversell time-to-value.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: fix AlphaClaw Render requirement (8GB+ RAM, not free tier)
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: final README polish for launch
  - GOAL line: "Garry Tan's exact setup" (not Claude Code specific)
  - Remove markdown links from code block (won't render)
  - STEP 2 renamed from "START HERE" to "DATABASE"
  - Tighten Supabase fallback text
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: remove duplicate old install block from README
  The v0.5-era "With OpenClaw or Hermes Agent" paste block was superseded by the top-level "Start here" block. Having both confused users and the old one still said --supabase as step 2.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: clean up README consistency and remove duplicated content
  - Remove duplicate "Try it" section (old 4-act walkthrough that repeated the install flow and contradicted "~30 min" with "90 sec")
  - Remove duplicate Setup section (third repetition of gbrain init)
  - Fix brain.db → brain.pglite (actual default path)
  - Fix "coming in v0.7" → "not yet implemented" (we ARE v0.7)
  - Remove "You don't need Postgres" (confusing since PGLite IS Postgres)
  - Deduplicate "competitive dynamics" query (appeared 3 times)
  - Collapse redundant standalone CLI section
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
b22cbd349a |
feat: GBrain v0.1.0 — Postgres-native personal knowledge brain (#1)
* chore: add CLAUDE.md with project context and gstack skill routing rules
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: initialize project with Bun + TypeScript
  package.json with dependencies (postgres, pgvector, openai, anthropic, MCP SDK, gray-matter). TypeScript config targeting ESNext with bundler module resolution.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add foundation layer — engine interface, Postgres engine, schema
  BrainEngine pluggable interface with full PostgresEngine: CRUD, search (keyword + vector), links, tags, timeline, versions, stats, health, ingest log, config. Trigger-based tsvector spanning pages + timeline_entries. Markdown parser with frontmatter, compiled_truth / timeline splitting, and round-trip serialization. 19 tests passing.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add 3-tier chunking and embedding service
  Recursive delimiter-aware chunker (5-level hierarchy, 300-word chunks, 50-word overlap). Semantic chunker with Savitzky-Golay boundary detection and recursive fallback. LLM-guided chunker via Claude Haiku with sliding window topic detection. OpenAI embedding service with batch support, exponential backoff, and rate limit handling.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add hybrid search with RRF fusion, expansion, and 4-layer dedup
  Hybrid search merges vector (pgvector HNSW) + keyword (tsvector) via Reciprocal Rank Fusion. Multi-query expansion via Claude Haiku generates 2 alternative phrasings. 4-layer dedup pipeline: by source, cosine similarity, type diversity (60% cap), per-page cap.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add GBRAIN_V0 spec, pluggable engine architecture, SQLite engine plan
  - GBRAIN_V0.md: full product spec with architecture decisions, CLI commands, schema, search architecture, chunking strategies, first-time experience, and future plans.
  - ENGINES.md: pluggable engine interface, capability matrix, how to add new backends.
  - SQLITE_ENGINE.md: complete SQLite implementation plan with schema, FTS5 setup, vector search options, and contributor guide.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add CLI with all commands
  Full CLI dispatcher with 25+ commands: init (Supabase wizard), get, put, delete, list, search, query (hybrid RRF), import (bulk with progress bar), export (round-trip), embed, stats, health, tag/untag/tags, link/unlink/backlinks/graph, timeline/timeline-add, history/revert, config, upgrade, serve, call. Smart slug resolution on reads. Version snapshots on updates.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add MCP stdio server with all brain tools
  20 MCP tools mirroring CLI operations: get/put/delete/list pages, search (keyword), query (hybrid RRF + expansion), tags, links with graph traversal, timeline, stats, health, version history, and revert. Auto-chunks and embeds on put_page. CLI and MCP share the same engine.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add 6 skill files and ClawHub manifest
  Fat markdown skills for AI agents: ingest (meetings/docs/articles with timeline merge), query (3-layer search + synthesis + citations), maintain (health checks, stale detection, orphan audit), enrich (external API enrichment), briefing (daily briefing compilation), migrate (universal migration from Obsidian/Notion/Logseq/markdown/CSV/JSON/Roam). ClawHub manifest for skill distribution.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add README, CONTRIBUTING, update CLAUDE.md test references
  README with quickstart, commands, architecture, library usage, MCP setup, and links to design docs. CONTRIBUTING with setup, project structure, and guides for adding commands and engines. CLAUDE.md updated to reference actual test files instead of planned-but-unwritten import test.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address adversarial review findings — 5 critical/high fixes
  - revertToVersion: add page_id check to prevent cross-page data corruption
  - traverseGraph: use UNION instead of UNION ALL for cycle safety
  - embedAll: preserve all chunks when embedding stale subset only
  - embedding: throw on retry exhaustion instead of returning zero vectors
  - putPage: validate slugs to prevent path traversal on export
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v0.1.0)
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: expand README with schema, install, search architecture, and motivation
  Why it exists, how search works (with ASCII diagram), full database schema with all 9 tables and index details, chunking strategies explained, storage estimates, setup wizard walkthrough, knowledge model with example page, library usage with more examples, expanded skills table.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: add MIT license (Copyright 2026 Garry Tan)
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add OpenClaw install flow as primary option in README
  OpenClaw users just say "install gbrain" and the orchestrator handles everything: package install, Supabase setup wizard, skill registration. Shows the conversational interface for querying, ingesting, and briefings. ClawHub and standalone CLI paths follow as alternatives.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add prerequisites and explicit OpenClaw install instructions
  Prerequisites table listing Supabase, OpenAI, and Anthropic dependencies with links. Environment variable setup. Explicit step-by-step prompt for OpenClaw users showing exactly what to tell the orchestrator. Note that search degrades gracefully without API keys (keyword-only without OpenAI, no expansion without Anthropic).
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: scrub named references, add PG essay demo section to README
  Replace all Pedro/Brex/Jensen Huang/River AI examples with Paul Graham essay examples using the kindling corpus. Add "Try it" section to README showing the power of hybrid search on PG essays in 90 seconds. Update test fixtures to use concept pages instead of person pages.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
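
The v0.1.0 commit above says hybrid search merges vector and keyword results via Reciprocal Rank Fusion, but the log does not show how the fusion is computed. The following is a minimal sketch of generic RRF, not the repository's actual code: the function name `rrfFuse`, the plain string ids, and the constant `k = 60` (the value commonly used for RRF) are all assumptions made for illustration.

```typescript
// Hypothetical illustration of Reciprocal Rank Fusion (RRF).
// Each input list holds result ids ordered best-first, e.g. one list from
// vector search and one from keyword search.
// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
function rrfFuse(
  lists: string[][],
  k = 60, // assumed smoothing constant; the source does not state the value used
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Made-up ids standing in for chunk or page identifiers.
const vectorHits = ["a", "b", "c"];
const keywordHits = ["b", "d", "a"];
const fused = rrfFuse([vectorHits, keywordHits]);
console.log(fused.map((r) => r.id)); // [ 'b', 'a', 'd', 'c' ]
```

Because RRF uses only ranks, the two retrievers' raw scores (cosine distance and text-search rank) never need to be normalized against each other. Ids found by both retrievers accumulate two terms, so `b` and `a` rise above the single-list hits.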