feat: GBrain v0.2.0 — incremental sync, file storage, install skill (#2)

* refactor: extract importFile from import.ts + add tag reconciliation Shared single-file import function used by both import and sync. Adds tag reconciliation (removes stale tags on reimport), >1MB file skip, and import->sync checkpoint continuity (writes git HEAD to config table after import so sync picks up seamlessly). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add sync pure functions, updateSlug engine method, and sync tests - buildSyncManifest: parses git diff --name-status -M output - isSyncable: filters to .md pages, excludes hidden/ops/.raw/skip-list - pathToSlug: converts file paths to page slugs with optional prefix - updateSlug: renames page slug in-place (preserves page_id, chunks, embeddings) - rewriteLinks: stub for v0.2 (FKs use page_id, already correct) - 20 new tests, all passing (39 total across 3 files) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add gbrain sync command with CLI, MCP, and watch mode 18-step sync protocol: read config, git pull, ancestry validation, git diff --name-status -M for net changes, isSyncable filter, process deletes/renames/adds/modifies via importFile, batch optimization, sync state checkpoint in Postgres config table. Watch mode with polling and consecutive error counter. MCP sync_brain tool returns structured SyncResult. Stale page deletion for un-syncable files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add files table, gbrain files commands, and config show redaction - files table: page_slug FK with ON DELETE SET NULL + ON UPDATE CASCADE, storage_path, storage_url, mime_type, content_hash for dedup - gbrain files list/upload/sync/verify commands for Supabase Storage - gbrain config show redacts postgresql:// passwords and secret keys - CLI help updated with FILES section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add install skill for GBrain onboarding 6-phase install workflow: environment discovery, Supabase setup (magic path via CLI OAuth or fallback 2-copy-paste), init + import, ongoing sync cron, optional file migration with mandatory verification, and agent teaching (AGENTS.md rules). Every error gets what + why + fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: update project documentation for v0.2.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add v0.2 features to README (sync, files, install skill) README.md: added sync command to IMPORT/EXPORT section, added FILES section with 4 commands, added files table to schema diagram, added install skill to skills table, updated MCP tools count from 20 to 21 (sync_brain added). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: OpenClaw DX improvements (skill count, upgrade docs, config show help) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: consolidate version to single source of truth Create src/version.ts that reads from package.json via static import (safe for bun compiled binaries). Update mcp/server.ts from hardcoded '0.1.0' to use shared VERSION. Bump skills/manifest.json to 0.2.0. * fix: upgrade detection order, npm→bun naming, clawhub false positives Reorder detection: node_modules first, binary second, clawhub last. Rename 'npm' install method to 'bun'. Use 'clawhub --version' instead of 'which clawhub' to avoid false positives from dangling symlinks. Add 120s timeout to execSync calls to prevent hanging. Add --help flag. * feat: per-command --help, unknown command check before DB connection Add COMMAND_HELP map covering all 28 commands. Check --help before init/upgrade dispatch and before connectEngine() so help works without a database. Use COMMAND_HELP keys as known-command set to catch unknown commands before wasting a DB round-trip. * docs: standardize npm references to bun, add Upgrade section to README Fix init.ts: npx→bunx, npm→bun for supabase CLI guidance. Fix README: npm install→bun add for standalone CLI install. Add ## Upgrade section to README with all three install methods. Update install skill Upgrading section to list bun, ClawHub, and binary. * test: full coverage audit — CLI dispatch, upgrade detection, config, edge cases New test files: - test/cli.test.ts: COMMAND_HELP ↔ switch consistency, version from package.json, per-command --help, unknown command handling, global help - test/upgrade.test.ts: detection order verification, npm→bun naming, clawhub --version (not which), timeout presence - test/config.test.ts: redactUrl for postgresql URLs, edge cases Extended existing tests: - test/sync.test.ts: empty string pathToSlug, uppercase .MD rejection, deeply nested files, multiple renames, unknown status codes - test/markdown.test.ts: multiple --- separators, missing frontmatter, no frontmatter at all, empty string, type inference from paths Tests: 39 → 83 (+44 new). All pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: 100% coverage — import-file mock engine, files utils, chunker edge cases New test files: - test/import-file.test.ts (9 tests): mock BrainEngine to test importFile without DB — MAX_FILE_SIZE skip, content_hash dedup, tag reconciliation (remove stale + add new), compiled_truth/timeline chunking, noEmbed flag, sequential chunk_index - test/files.test.ts (22 tests): getMimeType for all extensions + uppercase + unknown + no-extension, fileHash consistency + different content + empty, collectFiles pattern (skip .md, skip hidden dirs, recurse, sorted output) Extended: - test/chunkers/recursive.test.ts (+6 tests): single newline splits, word-only text, clause delimiters, lossless preservation, default options, mixed delimiter hierarchy Tests: 83 → 118 (+35 new). All pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:50:15 -07:00
parent b22cbd349a
commit ecebd5552a
29 changed files with 2365 additions and 145 deletions
--- a/README.md
+++ b/README.md
@@ -96,7 +96,7 @@ You: "Install gbrain and set up my knowledge brain.
      4. Read the skill files in skills/ so you know how to use the brain"
 ```

-OpenClaw will install the package, walk through the Supabase connection wizard, import demo data, and learn the 6 brain skills (ingest, query, maintain, enrich, briefing, migrate).
+OpenClaw will install the package, walk through the Supabase connection wizard, import demo data, and learn the 7 brain skills (ingest, query, maintain, enrich, briefing, migrate, install).

 After setup, you talk to your brain through OpenClaw:

@@ -109,18 +109,20 @@ You: "Import my Obsidian vault into the brain"

 OpenClaw reads the skill files in `skills/`, figures out which gbrain commands to run, and does the work. You never touch the CLI directly unless you want to.

+GBrain keeps your brain current automatically. After setup, `gbrain sync --watch` polls your git repo and imports only what changed. Binary files (images, PDFs, audio) can be moved to Supabase Storage with `gbrain files sync` to slim down your git repo.
+
 ### With ClawHub

 ```bash
 clawhub install gbrain
 ```

-This installs the npm package, copies the skill files, and runs `gbrain init --supabase` on first use.
+This installs the package, copies the skill files, and runs `gbrain init --supabase` on first use.

 ### Standalone CLI

 ```bash
-npm install -g gbrain
+bun add -g gbrain
 ```

 ### As a library
@@ -135,6 +137,23 @@ import { PostgresEngine } from 'gbrain';

 All paths require a Postgres database with pgvector. Supabase Pro ($25/mo) is the recommended zero-ops option.

+## Upgrade
+
+Upgrade depends on how you installed:
+
+```bash
+# Installed via bun (standalone or library)
+bun update gbrain
+
+# Installed via ClawHub
+clawhub update gbrain
+
+# Compiled binary
+# Download the latest from https://github.com/garrytan/gbrain/releases
+```
+
+After upgrading, run `gbrain init` again to apply any schema migrations (idempotent, safe to re-run).
+
 ## Setup

 After installing via CLI or library path, run the setup wizard:
@@ -269,9 +288,13 @@ page_versions            Snapshot history for compiled_truth
 raw_data                 Sidecar JSON from external APIs
  page_id, source, data (JSONB)

+files                    Binary attachments in Supabase Storage
+  page_slug (FK)         Links to pages (ON UPDATE CASCADE)
+  storage_path, storage_url, content_hash, mime_type, metadata (JSONB)
+
 ingest_log               Audit trail of import/ingest operations

-config                   Brain-level settings (embedding model, chunk strategy)
+config                   Brain-level settings (embedding model, chunk strategy, sync state)
 ```

 Indexes: B-tree on slug/type, GIN on frontmatter/search_vector, HNSW on embeddings, pg_trgm on title for fuzzy slug resolution.
@@ -305,8 +328,15 @@ SEARCH

 IMPORT/EXPORT
  gbrain import <dir> [--no-embed]          Import markdown directory (idempotent)
+  gbrain sync [--repo <path>] [flags]       Git-to-brain incremental sync
  gbrain export [--dir ./out/]              Export to markdown (round-trip)

+FILES
+  gbrain files list [slug]                  List stored files
+  gbrain files upload <file> --page <slug>  Upload file to storage
+  gbrain files sync <dir>                   Bulk upload directory
+  gbrain files verify                       Verify all uploads
+
 EMBEDDINGS
  gbrain embed [<slug>|--all|--stale]       Generate/refresh embeddings

@@ -386,7 +416,7 @@ Add to your Claude Code or Cursor MCP config:
 }
 ```

-20 tools: get_page, put_page, delete_page, list_pages, search, query, add_tag, remove_tag, get_tags, add_link, remove_link, get_links, get_backlinks, traverse_graph, add_timeline_entry, get_timeline, get_stats, get_health, get_versions, revert_version.
+21 tools: get_page, put_page, delete_page, list_pages, search, query, add_tag, remove_tag, get_tags, add_link, remove_link, get_links, get_backlinks, traverse_graph, add_timeline_entry, get_timeline, get_stats, get_health, get_versions, revert_version, sync_brain.

 Every tool mirrors a CLI command. Drift tests verify identical behavior.

@@ -402,6 +432,7 @@ Fat markdown files that tell AI agents HOW to use gbrain. No skill logic in the
 | **enrich** | Enrich pages from external APIs. Raw data stored separately, distilled highlights go to compiled truth. |
 | **briefing** | Daily briefing: today's meetings with participant context, active deals with deadlines, time-sensitive threads, recent changes. |
 | **migrate** | Universal migration from Obsidian (wikilinks to gbrain links), Notion (stripped UUIDs), Logseq (block refs), plain markdown, CSV, JSON, Roam. |
+| **install** | Set up GBrain from scratch: Supabase setup (magic path via CLI or 2-copy-paste fallback), import, sync cron, optional file migration, agent teaching. |

 ## Architecture