Changelog
All notable changes to GBrain will be documented in this file.
[0.5.0] - 2026-04-10
Added
- Your brain never falls behind. Live sync keeps the vector DB current with your brain repo automatically. Set up a cron, use `--watch`, hook into GitHub webhooks, or use git hooks. Your agent picks whatever fits its environment. Edit a markdown file, push, and within minutes it's searchable. No more stale embeddings serving wrong answers.
- Know your install actually works. New verification runbook (`docs/GBRAIN_VERIFY.md`) catches the silent failures that used to go unnoticed: the pooler bug that skips pages, missing embeddings, stale sync. The real test: push a correction, wait, search for it. If the old text comes back, sync is broken and the runbook tells you exactly why.
- New installs set up live sync automatically. The setup skill now includes live sync (Phase H) and full verification (Phase I) as mandatory steps. Agents that install GBrain will configure automatic sync and verify it works before declaring setup complete.
- Fixes the silent page-skip bug. If your Supabase connection uses the Transaction mode pooler, sync silently skips most pages. The new docs call this out as a hard prerequisite with a clear fix (switch to Session mode). The verification runbook catches it by comparing page count against file count.
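
The page-count-versus-file-count comparison mentioned above can be sketched roughly as follows. This is an illustration only, not GBrain's implementation: `isSyncableSketch`, its exclusion rules, and `coverageGap` are hypothetical names and assumptions made for the example.

```typescript
// Hypothetical sketch: detect silently skipped pages by comparing the number
// of files that should sync against the number of pages in the database.

// Assumed exclusion rules, for illustration only.
function isSyncableSketch(path: string): boolean {
  if (!path.endsWith(".md")) return false;
  const segments = path.split("/");
  // Skip anything inside a dot-directory such as .raw/ (assumption).
  if (segments.some((s) => s.startsWith("."))) return false;
  // Skip README files at any level (assumption).
  const name = segments[segments.length - 1] ?? "";
  if (name.toLowerCase() === "readme.md") return false;
  return true;
}

interface CoverageReport {
  expected: number; // syncable files found in the repo
  actual: number; // pages reported by the database
  missing: number; // expected - actual, floored at 0
  ok: boolean;
}

function coverageGap(repoFiles: string[], pageCount: number): CoverageReport {
  const expected = repoFiles.filter(isSyncableSketch).length;
  const missing = Math.max(0, expected - pageCount);
  return { expected, actual: pageCount, missing, ok: missing === 0 };
}
```

A non-zero `missing` value is the signal worth investigating; it does not by itself prove the pooler mode is the cause.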
[0.4.2] - 2026-04-10
Changed
- All GitHub Actions pinned to commit SHAs across test, e2e, and release workflows. Prevents supply chain attacks via mutable version tags.
- Workflow permissions hardened: `contents: read` on test and e2e workflows limits GITHUB_TOKEN blast radius.
- OpenClaw CI install pinned to v2026.4.9 instead of pulling latest.
Added
- Gitleaks secret scanning CI job runs on every push and PR. Catches accidentally committed API keys, tokens, and credentials.
- `.gitleaks.toml` config with allowlists for test fixtures and example files.
- GitHub Actions SHA maintenance rule in CLAUDE.md so pins stay fresh on every `/ship` and `/review`.
- S3 Sig V4 TODO for future implementation when S3 storage becomes a deployment path.
[0.4.1] - 2026-04-09
Added
- `gbrain check-update` command with `--json` output. Checks GitHub Releases for new versions, compares semver (minor+ only, skips patches), fetches and parses changelog diffs. Fail-silent on network errors.
- SKILLPACK Section 17: Auto-Update Notifications. Full agent playbook for the update lifecycle: check, notify, consent, upgrade, skills refresh, schema sync, report. Never auto-upgrades without user permission.
- Standalone SKILLPACK self-update for users who load the skillpack directly without the gbrain CLI. Version markers in SKILLPACK and RECOMMENDED_SCHEMA headers, with raw GitHub URL fetching.
- Step 7 in the OpenClaw install paste: daily update checks, default-on. User opts into being notified about updates, not into automatic installs.
- Setup skill Phase G: conditional auto-update offer for manual install users.
- Schema state tracking via `~/.gbrain/update-state.json`. Tracks which recommended schema directories the user adopted, declined, or added custom. Future upgrades suggest new additions without re-suggesting declined items.
- `skills/migrations/` directory convention for version-specific post-upgrade agent directives.
- 20 unit tests and 5 E2E tests for the check-update command, covering version comparison, changelog extraction, CLI wiring, and real GitHub API interaction.
- E2E test DB lifecycle documentation in CLAUDE.md: spin up, run tests, tear down. No orphaned containers.
Changed
- `detectInstallMethod()` exported from `upgrade.ts` for reuse by `check-update`.
Fixed
- Semver comparison in changelog extraction was missing major-version guard, causing incorrect changelog entries to appear when crossing major version boundaries.
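
A "minor+ only" comparison with a major-version guard might look like the sketch below. The function names are hypothetical and the parsing is deliberately simple; it only illustrates why the major component has to be compared before the minor one.

```typescript
// Hypothetical sketch of "minor+ only" update detection with a major guard.
type Version = [number, number, number];

function parseVersion(v: string): Version | null {
  const m = /^v?(\d+)\.(\d+)\.(\d+)/.exec(v.trim());
  if (!m) return null;
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// True when `latest` is a newer major or minor release than `current`.
// Patch-only differences are ignored.
function isNotableUpgrade(current: string, latest: string): boolean {
  const a = parseVersion(current);
  const b = parseVersion(latest);
  if (!a || !b) return false; // stay silent on unparseable input
  // Major guard: decide on the major component first. Without this,
  // 2.0.0 -> 1.9.0 would look like an upgrade because 9 > 0.
  if (b[0] !== a[0]) return b[0] > a[0];
  return b[1] > a[1];
}
```

For example, `isNotableUpgrade("0.4.1", "0.5.0")` is true, while `isNotableUpgrade("0.4.1", "0.4.2")` is false because only the patch component changed.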
[0.4.0] - 2026-04-09
Added
- `gbrain doctor` command with `--json` output. Checks pgvector extension, RLS policies, schema version, embedding coverage, and connection health. Agents can self-diagnose issues.
- Pluggable storage backends: S3, Supabase Storage, and local filesystem. Choose where binary files live independently of the database. Configured via `gbrain init` or environment variables.
- Parallel import with per-worker engine instances. Large brain imports now use multiple database connections concurrently instead of a single serial pipeline.
- Import resume checkpoints. If `gbrain import` is interrupted, it picks up where it left off instead of re-importing everything.
- Automatic schema migration runner. On connect, gbrain detects the current schema version and applies any pending migrations without manual intervention.
- Row-Level Security (RLS) enabled on all tables with `BYPASSRLS` safety check. Every query goes through RLS policies.
- `--json` flag on `gbrain init` and `gbrain import` for machine-readable output. Agents can parse structured results instead of scraping CLI text.
- File migration CLI (`gbrain files migrate`) for moving files between storage backends. Two-way-door: test with `--dry-run`, migrate incrementally.
- Bulk chunk INSERT for faster page writes. Chunks are inserted in a single statement instead of one-at-a-time.
- Supabase smart URL parsing: automatically detects and converts IPv6-only pooler URLs to the correct connection format.
- 56 new unit tests covering doctor, storage backends, file migration, import resume, slug validation, setup branching, Supabase admin, and YAML parsing. Test suite grew from 9 to 19 test files.
- E2E tests for parallel import concurrency and all new features.
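
The bulk chunk INSERT idea can be illustrated with a small statement builder. This is a sketch under assumptions: the table name `chunks`, its columns, and `buildBulkInsert` are invented for the example and are not taken from GBrain's schema.

```typescript
// Hypothetical sketch: build one multi-row parameterized INSERT for all
// chunks of a page instead of issuing one INSERT per chunk.
interface ChunkRow {
  pageId: number;
  index: number;
  text: string;
}

function buildBulkInsert(rows: ChunkRow[]): { sql: string; params: unknown[] } {
  if (rows.length === 0) throw new Error("no rows to insert");
  const params: unknown[] = [];
  const tuples = rows.map((row) => {
    params.push(row.pageId, row.index, row.text);
    const n = params.length; // number of the last placeholder for this row
    return `($${n - 2}, $${n - 1}, $${n})`;
  });
  // Table and column names are assumptions for illustration.
  const sql =
    "INSERT INTO chunks (page_id, chunk_index, chunk_text) VALUES " +
    tuples.join(", ");
  return { sql, params };
}
```

The result is a single round trip per page: one SQL string with numbered placeholders and one flat parameter array to pass alongside it.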
Fixed
- `validateSlug` now accepts any filename characters (spaces, unicode, special chars) instead of rejecting non-alphanumeric slugs. Apple Notes and other real-world filenames import cleanly.
- Import resilience: files over 5MB are skipped with a warning instead of crashing the pipeline. Errors in individual files no longer abort the entire import.
- `gbrain init` detects IPv6-only Supabase URLs and adds the required `pgvector` check during setup.
- E2E test fixture counts, CLI argument parsing, and doctor exit codes cleaned up.
Changed
- Setup skill and README rewritten for agent-first developer experience.
- Maintain skill updated with RLS verification, schema health checks, and `nohup` hints for large embedding jobs.
[0.3.0] - 2026-04-08
Added
- Contract-first architecture: single `operations.ts` defines ~30 shared operations. CLI, MCP, and tools-json all generated from the same source. Zero drift.
- `OperationError` type with structured error codes (`page_not_found`, `invalid_params`, `embedding_failed`, etc.). Agents can self-correct.
- `dry_run` parameter on all mutating operations. Agents preview before committing.
- `importFromContent()` split from `importFile()`. Both share the same chunk+embed+tag pipeline, but `importFromContent` works from strings (used by `put_page`). Wrapped in `engine.transaction()`.
- Idempotency hash now includes ALL fields (title, type, frontmatter, tags), not just compiled_truth + timeline. Metadata-only edits no longer silently skipped.
- `get_page` now supports optional `fuzzy: true` for slug resolution. Returns `resolved_slug` so callers know what happened.
- `query` operation now supports `expand` toggle (default true). Both CLI and MCP get the same control.
- 10 new operations wired up: `put_raw_data`, `get_raw_data`, `resolve_slugs`, `get_chunks`, `log_ingest`, `get_ingest_log`, `file_list`, `file_upload`, `file_url`.
- OpenClaw bundle plugin manifest (`openclaw.plugin.json`) with config schema, MCP server config, and skill listing.
- GitHub Actions CI: test on push/PR, multi-platform release builds (macOS arm64 + Linux x64) on version tags.
- `gbrain init --non-interactive` flag for plugin mode (accepts config via flags/env vars, no TTY required).
- Post-upgrade version verification in `gbrain upgrade`.
- Parity test (`test/parity.test.ts`) verifies structural contract between operations, CLI, and MCP.
- New `setup` skill replacing `install`: auto-provision Supabase via CLI, AGENTS.md injection, target TTHW < 2 min.
- E2E test suite against real Postgres+pgvector. 13 realistic fixtures (miniature brain with people, companies, deals, meetings, concepts), 14 test suites covering all operations, search quality benchmarks, idempotency stress tests, schema validation, and full setup journey verification.
- GitHub Actions E2E workflow: Tier 1 (mechanical) on every PR, Tier 2 (LLM skills via OpenClaw) nightly.
- `docker-compose.test.yml` and `.env.testing.example` for local E2E development.
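
To see why covering every field matters for the idempotency hash, consider the following sketch. It builds only the canonical string that a hash would be computed over; the digest step is left out to keep the example dependency-free. `PageFields`, `canonicalize`, and `fingerprintInput` are hypothetical names, and treating tags as an unordered set is an assumption.

```typescript
// Hypothetical sketch: derive a change-detection key from every page field,
// so a metadata-only edit still produces a different key.
interface PageFields {
  title: string;
  type: string;
  frontmatter: Record<string, unknown>;
  tags: string[];
  compiled_truth: string;
  timeline: string;
}

// Serialize with sorted object keys so property order never changes the result.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value) ?? "null";
}

function fingerprintInput(page: PageFields): string {
  // Tags are treated as a set here (assumption), so they are sorted first.
  return canonicalize({ ...page, tags: page.tags.slice().sort() });
}
```

If only `compiled_truth` and `timeline` fed the key, removing a tag or renaming a page would leave it unchanged and the edit would be skipped as a no-op.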
Fixed
- Schema loader in `db.ts` broke on PL/pgSQL trigger functions containing semicolons inside `$$` blocks. Replaced per-statement execution with single `conn.unsafe()` call.
- `traverseGraph` query failed with "could not identify equality operator for type json" when using `SELECT DISTINCT` with `json_agg`. Changed to `jsonb_agg`.
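
The schema loader failure is easy to reproduce in isolation. The sketch below uses a made-up schema snippet and a naive splitter (both assumptions, not the code from `db.ts`) to show how splitting on semicolons cuts a dollar-quoted function body into invalid fragments.

```typescript
// Sketch of why per-statement execution breaks on dollar-quoted bodies.
// The schema text below is invented for the example.
const schemaSql = `
CREATE TABLE pages (id serial PRIMARY KEY, body text);
CREATE FUNCTION touch() RETURNS trigger AS $$
BEGIN
  NEW.body := trim(NEW.body);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
`;

// Naive approach: split on every semicolon.
function naiveSplit(sql: string): string[] {
  return sql
    .split(";")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// A fragment is broken if it contains an odd number of $$ delimiters,
// meaning a dollar-quoted body was opened or closed but not both.
function hasUnbalancedDollarQuote(fragment: string): boolean {
  return (fragment.split("$$").length - 1) % 2 === 1;
}

const fragments = naiveSplit(schemaSql);
const broken = fragments.filter(hasUnbalancedDollarQuote);
```

Here the two statements become five fragments, two of which contain a dangling `$$`. Sending the whole file in one call, as the fix does, sidesteps the need for a SQL-aware splitter.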
Changed
- `src/mcp/server.ts` rewritten from ~233 to ~80 lines. Tool definitions and dispatch generated from `operations[]`.
- `src/cli.ts` rewritten. Shared operations auto-registered from `operations[]`. CLI-only commands (init, upgrade, import, export, files, embed) kept as manual registrations.
- `tools-json` output now generated FROM `operations[]`. Third contract surface eliminated.
- All 7 skills rewritten with tool-agnostic language. Works with both CLI and MCP plugin contexts.
- File schema: `storage_url` column dropped, `storage_path` is the only identifier. URLs generated on demand via `file_url` operation.
- Config loading: env vars (`GBRAIN_DATABASE_URL`, `DATABASE_URL`, `OPENAI_API_KEY`) override config file values. Plugin config injected via env vars.
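
The env-over-file precedence can be sketched as below. The config file field names and the `resolveConfig` helper are hypothetical, and the order between `GBRAIN_DATABASE_URL` and `DATABASE_URL` is an assumption; the changelog only states that env vars win over the config file.

```typescript
// Hypothetical sketch of "env vars override config file values".
interface FileConfig {
  database_url?: string;
  openai_api_key?: string;
}

type Env = Record<string, string | undefined>;

interface ResolvedConfig {
  databaseUrl: string | undefined;
  openaiApiKey: string | undefined;
}

function resolveConfig(env: Env, file: FileConfig): ResolvedConfig {
  return {
    // Assumed order: the GBrain-specific variable is checked before the
    // generic DATABASE_URL, and the config file is the last resort.
    databaseUrl:
      env["GBRAIN_DATABASE_URL"] || env["DATABASE_URL"] || file.database_url,
    openaiApiKey: env["OPENAI_API_KEY"] || file.openai_api_key,
  };
}
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the precedence logic easy to test and lets a plugin host inject its own values.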
Removed
- 12 command files migrated to operations.ts: get.ts, put.ts, delete.ts, list.ts, search.ts, query.ts, health.ts, stats.ts, tags.ts, link.ts, timeline.ts, version.ts.
- `storage_url` column from files table.
[0.2.0.2] - 2026-04-07
Changed
- Rewrote recommended brain schema doc with expanded architecture: database layer (entity registry, event ledger, fact store, relationship graph) presented as the core architecture, entity identity and deduplication, enrichment source ordering, epistemic discipline rules, worked examples showing full ingestion chains, concurrency guidance, and browser budget. Smoothed language for open-source readability.
[0.2.0.1] - 2026-04-07
Added
- Recommended brain schema doc (`docs/GBRAIN_RECOMMENDED_SCHEMA.md`): full MECE directory structure, compiled truth + timeline pages, enrichment pipeline, resolver decision tree, skill architecture, and cron job recommendations. The OpenClaw paste now links to this as step 5.
Changed
- First-time experience rewritten. "Try it" section shows your own data, not fictional PG essays. OpenClaw paste references the GitHub repo, includes bun install fallback, and has the agent pick a dynamic query based on what it imported.
- Removed all references to `data/kindling/` (a demo corpus directory that never existed).
[0.2.0] - 2026-04-05
Added
- You can now keep your brain current with `gbrain sync`, which uses git's own diff machinery to process only what changed. No more 30-second full directory walks when 3 files changed.
- Watch mode (`gbrain sync --watch`) polls for changes and syncs automatically. Set it and forget it.
- Binary file management with `gbrain files` commands (list, upload, sync, verify). Store images, PDFs, and audio in Supabase Storage instead of clogging your git repo.
- Install skill (`skills/install/SKILL.md`) that walks you through setup from scratch, including Supabase CLI magic path for zero-copy-paste onboarding.
- Import and sync now share a checkpoint. Run `gbrain import`, then `gbrain sync`, and it picks up right where import left off. Zero gap.
- Tag reconciliation on reimport. If you remove a tag from your markdown, it actually gets removed from the database now.
- `gbrain config show` redacts database passwords so you can safely share your config.
- `updateSlug` engine method preserves page identity (page_id, chunks, embeddings) across renames. Zero re-embedding cost.
- `sync_brain` MCP tool returns structured results so agents know exactly what changed.
- 20 new sync tests (39 total across 3 test files)
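
One way to picture diff-driven sync is a parser over `git diff --name-status` output, which lists one tab-separated status and path per line. Whether `gbrain sync` uses this exact git invocation is an assumption; the `Change` type and `parseNameStatus` are hypothetical names for the sketch.

```typescript
// Hypothetical sketch: turn `git diff --name-status` output into sync actions.
type Change =
  | { kind: "upsert"; path: string }
  | { kind: "delete"; path: string }
  | { kind: "rename"; from: string; to: string };

function parseNameStatus(output: string): Change[] {
  const changes: Change[] = [];
  for (const line of output.split("\n")) {
    if (line.trim() === "") continue;
    const [status = "", a = "", b = ""] = line.split("\t");
    if (status.startsWith("R")) {
      // Renames carry a similarity score and two paths, e.g. "R100 old new".
      changes.push({ kind: "rename", from: a, to: b });
    } else if (status === "D") {
      changes.push({ kind: "delete", path: a });
    } else if (status === "A" || status === "M") {
      changes.push({ kind: "upsert", path: a });
    }
    // Other statuses (C, T, U, ...) are ignored in this sketch.
  }
  return changes;
}
```

Keeping renames as their own action is what makes an identity-preserving rename possible: the page can be moved to its new slug instead of being deleted and re-embedded.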
[0.1.0] - 2026-04-05
Added
- Pluggable engine interface (`BrainEngine`) with full Postgres + pgvector implementation
- 25+ CLI commands: init, get, put, delete, list, search, query, import, export, embed, stats, health, link/unlink/backlinks/graph, tag/untag/tags, timeline/timeline-add, history/revert, config, upgrade, serve, call
- MCP stdio server with 20 tools mirroring all CLI operations
- 3-tier chunking: recursive (delimiter-aware), semantic (Savitzky-Golay boundary detection), LLM-guided (Claude Haiku topic shifts)
- Hybrid search with Reciprocal Rank Fusion merging vector + keyword results
- Multi-query expansion via Claude Haiku (2 alternative phrasings per query)
- 4-layer dedup pipeline: by source, cosine similarity, type diversity, per-page cap
- OpenAI embedding service (text-embedding-3-large, 1536 dims) with batch support and exponential backoff
- Postgres schema with pgvector HNSW, tsvector (trigger-based, spans timeline_entries), pg_trgm fuzzy slug matching
- Smart slug resolution for reads (fuzzy match via pg_trgm)
- Page version control with snapshot, history, and revert
- Typed links with recursive CTE graph traversal (max depth configurable)
- Brain health dashboard (embed coverage, stale pages, orphans, dead links)
- Stale alert annotations in search results
- Supabase init wizard with CLI auto-provision fallback
- Slug validation to prevent path traversal on export
- 6 fat markdown skills: ingest, query, maintain, enrich, briefing, migrate
- ClawHub manifest for skill distribution
- Full design docs: GBRAIN_V0 spec, pluggable engine architecture, SQLite engine plan
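
For readers unfamiliar with the Reciprocal Rank Fusion used by hybrid search, here is a minimal sketch of the general technique. Each ranked list contributes 1 / (k + rank) for every document it contains, and the contributions are summed. The constant k = 60 is the value commonly used in the literature; whether GBrain uses that value, or this exact shape, is not stated in this changelog.

```typescript
// Sketch of Reciprocal Rank Fusion over any number of ranked ID lists.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      const rank = i + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((x, y) => y[1] - x[1]);
}

// Usage: fuse a vector-search ranking with a keyword-search ranking.
const vectorHits = ["a", "b", "c"];
const keywordHits = ["b", "d", "a"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
```

Documents that appear in both lists ("a" and "b" here) rise above documents found by only one retriever, without needing the raw vector and keyword scores to be on comparable scales.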