Setup GBrain
Set up GBrain from scratch. Target: working brain in under 5 minutes.
Install (if not already installed)
bun add github:garrytan/gbrain
How GBrain connects
GBrain connects directly to Postgres over the wire protocol, NOT through the
Supabase REST API. You need the database connection string (a postgresql:// URI),
not the project URL or anon key. The password is embedded in the connection string.
Use the Shared Pooler connection string (port 6543), not the direct connection (port 5432). The direct hostname resolves to IPv6 only, which many environments can't reach. Find it: go to the project, click Get Connected next to the project URL, then Direct Connection String > Session Pooler, and copy the Shared Pooler connection string.
Do NOT ask for the Supabase anon key. GBrain doesn't use it.
Why Supabase
Supabase gives you managed Postgres + pgvector (vector search built in) for $25/mo:
- 8GB database + 100GB storage on Pro tier
- No server to manage, automatic backups, dashboard for debugging
- pgvector pre-installed, just works
- Alternative: any Postgres with pgvector extension (self-hosted, Neon, Railway, etc.)
Prerequisites
- A Supabase account (Pro tier recommended, $25/mo) OR any Postgres with pgvector
- An OpenAI API key (for semantic search embeddings, ~$4-5 for 7,500 pages)
- A git-backed markdown knowledge base (or start fresh)
Available init options
- gbrain init --supabase -- interactive wizard (prompts for connection string)
- gbrain init --url <connection_string> -- direct, no prompts
- gbrain init --non-interactive --url <connection_string> -- for scripts/agents
- gbrain doctor --json -- health check after init
There is no --local, --sqlite, or offline mode. GBrain requires Postgres + pgvector.
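For scripted setups, the non-interactive path can be wrapped so that a missing connection string fails early and the health check always follows init. This is a minimal sketch, not part of gbrain: init_brain is a hypothetical helper, and GBRAIN_DATABASE_URL is an assumed variable name chosen for this example.

```bash
# init_brain: hypothetical wrapper around the documented init + doctor commands.
# GBRAIN_DATABASE_URL is an assumed variable name, not one gbrain defines.
init_brain() {
  if [ -z "${GBRAIN_DATABASE_URL:-}" ]; then
    echo "GBRAIN_DATABASE_URL is not set (expected a postgresql:// URI)" >&2
    return 1
  fi
  # Stop before the health check if init itself fails.
  gbrain init --non-interactive --url "$GBRAIN_DATABASE_URL" || return 1
  gbrain doctor --json
}
```

Usage: export the variable in the agent's environment, then call init_brain once.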
Phase A: Supabase Setup (recommended)
Guide the user through creating a Supabase project:
1. "Go to https://supabase.com and sign up or log in."
2. "Click 'New Project' in the top left."
   - Name: gbrain
   - Region: pick the one closest to you
   - Database password: generate a strong one and save it
3. "Wait about 2 minutes for the project to initialize."
4. "Find the connection string: go to your project, click Get Connected next to the project URL, then Direct Connection String > Session Pooler, and copy the Shared Pooler connection string (port 6543)."
5. Initialize gbrain:
   gbrain init --non-interactive --url "postgresql://postgres.[ref]:[password]@aws-0-[region].pooler.supabase.com:6543/postgres"
6. Verify:
   gbrain doctor --json
OpenClaw/Hermes agent instructions: Store the Supabase access token in your persistent
env as SUPABASE_ACCESS_TOKEN. gbrain doesn't store it, but you need it for future
gbrain doctor runs. Generate at: https://supabase.com/dashboard/account/tokens
Phase B: BYO Postgres (alternative)
If the user already has Postgres with pgvector:
- Get the connection string from the user.
- Run:
  gbrain init --non-interactive --url "<connection_string>"
- Verify:
  gbrain doctor --json
If the connection fails with ECONNREFUSED and the URL contains supabase.co,
the user probably pasted the direct connection (IPv6 only). Guide them to the
Session pooler string instead (see Phase A step 4).
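The check described above can be sketched as a small classifier. One pitfall it avoids: a plain substring test for supabase.co also matches pooler.supabase.com, so the sketch compares the end of the hostname instead. classify_gbrain_url and its three labels are hypothetical names for illustration, not part of gbrain.

```bash
# classify_gbrain_url: hypothetical helper that labels a connection string
# by its hostname. Prints one of: supabase-pooler, supabase-direct,
# other-postgres, not-a-connection-string.
classify_gbrain_url() {
  local url="$1" rest hostport host
  case "$url" in
    postgresql://*|postgres://*) ;;
    *) echo "not-a-connection-string"; return 1 ;;
  esac
  rest="${url#*://}"           # drop the scheme
  hostport="${rest##*@}"       # drop user:password (up to the last @)
  hostport="${hostport%%/*}"   # drop /database and anything after it
  hostport="${hostport%%\?*}"  # drop query parameters if there was no path
  host="${hostport%%:*}"       # drop the port
  case "$host" in
    *.pooler.supabase.com) echo "supabase-pooler" ;;
    *.supabase.co)         echo "supabase-direct" ;;
    *)                     echo "other-postgres" ;;
  esac
}
```

If init fails with ECONNREFUSED and the helper prints supabase-direct, that points at the IPv6-only direct hostname described above.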
Phase C: First Import
- Discover markdown repos. Scan the environment for git repos with markdown content.
  echo "=== GBrain Environment Discovery ==="
  # A redirection is not valid inside a for word list, so none is used here.
  # Unmatched globs stay literal and are filtered out by the -d test.
  for dir in /data/* ~/git/* ~/Documents/*; do
    if [ -d "$dir/.git" ]; then
      # Count markdown files, ignoring dependencies and git internals.
      md_count=$(find "$dir" -name "*.md" -not -path "*/node_modules/*" -not -path "*/.git/*" 2>/dev/null | wc -l | tr -d ' ')
      if [ "$md_count" -gt 10 ]; then
        total_size=$(du -sh "$dir" 2>/dev/null | cut -f1)
        echo "  $dir ($total_size, $md_count .md files)"
      fi
    fi
  done
  echo "=== Discovery Complete ==="
- Import the best candidate. For large imports (>1000 files), use nohup to survive session timeouts:
  nohup gbrain import <dir> --no-embed --workers 4 > /tmp/gbrain-import.log 2>&1 &
  Then check progress:
  tail -1 /tmp/gbrain-import.log
  For smaller imports, run directly:
  gbrain import <dir> --no-embed
- Prove search works. Pick a query based on what you imported:
  gbrain search "<topic from the imported data>"
  This is the magical moment: the user sees search finding things grep couldn't.
- Start embeddings. Refresh stale embeddings (runs in background). Keyword search works NOW; semantic search improves as embeddings complete.
- Offer file migration. If the repo has binary files (.raw/ directories with images, PDFs, audio):
  "You have N binary files (X GB) in your brain repo. Want to move them to cloud storage? Your git repo will drop from X GB to Y MB. All links keep working."
If no markdown repos are found, create a starter brain with a few template pages (a person page, a company page, a concept page) from docs/GBRAIN_RECOMMENDED_SCHEMA.md.
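The choice between a background and a direct import can be sketched as one function that counts markdown files first. count_md and import_repo are hypothetical helper names. The 1000-file threshold, the log path, and the gbrain flags come from the steps above.

```bash
# count_md: number of .md files under a directory, skipping node_modules and .git.
count_md() {
  find "$1" -name '*.md' -not -path '*/node_modules/*' -not -path '*/.git/*' 2>/dev/null \
    | wc -l | tr -d ' '
}

# import_repo: hypothetical wrapper that picks the import mode by file count.
import_repo() {
  local dir="$1" n
  n=$(count_md "$dir")
  if [ "$n" -gt 1000 ]; then
    # Large import: detach so it survives session timeouts.
    nohup gbrain import "$dir" --no-embed --workers 4 > /tmp/gbrain-import.log 2>&1 &
    echo "mode=background files=$n"
  else
    gbrain import "$dir" --no-embed && echo "mode=direct files=$n"
  fi
}
```

For a background import, progress is still checked with tail -1 /tmp/gbrain-import.log as described above.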
Phase D: Brain-First Lookup Protocol
Inject the brain-first lookup protocol into the project's AGENTS.md (or equivalent). This replaces grep-based knowledge lookups with structured gbrain queries.
BEFORE (grep) vs AFTER (gbrain)
| Task | Before (grep) | After (gbrain) |
|---|---|---|
| Find a person | grep -r "Pedro" brain/ | gbrain search "Pedro" |
| Understand a topic | grep -rl "deal" brain/ \| head -5 && cat ... | gbrain query "what's the status of the deal" |
| Read a known page | cat brain/people/pedro.md | gbrain get people/pedro |
| Find connections | grep -rl "Brex" brain/ \| xargs grep "Pedro" | gbrain query "Pedro Brex relationship" |
Lookup sequence (MANDATORY for every entity question)
1. gbrain search "name" -- keyword match, fast, works without embeddings
2. gbrain query "what do we know about name" -- hybrid search, needs embeddings
3. gbrain get <slug> -- direct page read when you know the slug from steps 1-2
4. grep fallback -- only if gbrain returns zero results AND the file may exist outside the indexed brain
Stop at the first step that gives you what you need. Most lookups resolve at step 1.
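The fallback order can be sketched as a single function that stops at the first step producing output. brain_lookup is a hypothetical helper. It assumes that "zero results" shows up as empty standard output from gbrain, which is an assumption about the CLI's output rather than documented behavior. The gbrain get step is left out because it needs a slug chosen from the earlier results.

```bash
# brain_lookup: hypothetical helper implementing search -> query -> grep.
# Assumption: gbrain prints nothing on stdout when there are no results.
brain_lookup() {
  local name="$1" repo="${2:-.}" out
  # Step 1: keyword search, works without embeddings.
  out=$(gbrain search "$name" 2>/dev/null)
  if [ -n "$out" ]; then printf '%s\n' "$out"; return 0; fi
  # Step 2: hybrid query, needs embeddings.
  out=$(gbrain query "what do we know about $name" 2>/dev/null)
  if [ -n "$out" ]; then printf '%s\n' "$out"; return 0; fi
  # Last resort: grep the repo for files outside the indexed brain.
  grep -rl -- "$name" "$repo"
}
```

Usage: brain_lookup "Pedro" /data/brain prints the first non-empty result set, or matching file paths if gbrain finds nothing.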
Sync-after-write rule
After creating or updating any brain page in the repo, sync immediately so the index stays current:
gbrain sync --no-pull --no-embed
This indexes new/changed files without pulling from git or regenerating embeddings.
Embeddings can be refreshed later in batch (gbrain embed --stale).
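One way to make the rule hard to forget is to route every page write through a helper that syncs as its last step. save_page is a hypothetical name. The sketch reads the page body from standard input and then runs the documented sync command.

```bash
# save_page: hypothetical helper that writes a page, then syncs the index.
# Reads the page content from stdin; $1 is the destination path in the repo.
save_page() {
  local path="$1"
  mkdir -p "$(dirname "$path")" || return 1
  cat > "$path" || return 1
  # Index the change without pulling from git or regenerating embeddings.
  gbrain sync --no-pull --no-embed
}
```

Usage: printf '# Pedro\n' | save_page brain/people/pedro.md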
gbrain vs memory_search
| Layer | What it stores | When to use |
|---|---|---|
| gbrain | World knowledge: people, companies, deals, meetings, concepts, media | "Who is Pedro?", "What happened at the board meeting?" |
| memory_search | Agent operational state: preferences, decisions, session context | "How does the user like formatting?", "What did we decide about X?" |
Both should be checked. gbrain for facts about the world. memory_search for how the agent should behave.
Phase E: Load the Production Agent Guide
Read docs/GBRAIN_SKILLPACK.md. This is the reference architecture for how a
production agent uses gbrain: the brain-agent loop, entity detection, enrichment
pipeline, meeting ingestion, cron schedules, and the five operational disciplines.
Inject the key patterns into the agent's system context or AGENTS.md:
- Brain-agent loop (Section 2): read before responding, write after learning
- Entity detection (Section 3): spawn on every message, capture people/companies/ideas
- Source attribution (Section 7): every fact needs [Source: ...]
- Iron law back-linking (Section 15.4): every mention links back to the entity page
Tell the user: "The production agent guide is at docs/GBRAIN_SKILLPACK.md. It covers the brain-agent loop, entity detection, enrichment, meeting ingestion, and cron schedules. Read it when you're ready to go from 'search works' to 'the brain maintains itself.'"
Phase F: Health Check
Run gbrain doctor --json and report the results. Every check should be OK.
If any check fails, the doctor output tells you exactly what's wrong and how to fix it.
Error Recovery
If any gbrain command fails, run gbrain doctor --json first. Report the full
output. It checks connection, pgvector, RLS, schema version, and embeddings.
| What You See | Why | Fix |
|---|---|---|
| Connection refused | Supabase project paused, IPv6, or wrong URL | Use Session pooler (port 6543), or supabase.com/dashboard > Restore |
| Password authentication failed | Wrong password | Project Settings > Database > Reset password |
| pgvector not available | Extension not enabled | Run CREATE EXTENSION vector; in SQL Editor |
| OpenAI key invalid | Expired or wrong key | platform.openai.com/api-keys > Create new |
| No pages found | Query before import | Import files into gbrain first |
| RLS not enabled | Security gap | Run gbrain init again (auto-enables RLS) |
Phase G: Auto-Update Check (if not already configured)
If the user's install did NOT include setting up auto-update checks (e.g., they used the manual install path or an older version of the OpenClaw/Hermes paste), offer it:
"Would you like daily GBrain update checks? I'll let you know when there's a new version worth upgrading to — including new skills and schema recommendations. You'll always be asked before anything is installed."
If they agree:
- Test: gbrain check-update --json
- Register daily cron (see GBRAIN_SKILLPACK.md Section 17)
If already configured or user declines, skip.
Phase H: Live Sync Setup (MUST ADD)
The brain repo is the source of truth. If sync doesn't run automatically, the vector DB falls behind and gbrain returns stale answers. This phase is not optional.
Read docs/GBRAIN_SKILLPACK.md Section 18 for the full reference. Key points:
- Check the connection pooler first. Sync uses transactions on every import. If DATABASE_URL uses Supabase's Transaction mode pooler, sync will throw ".begin() is not a function" and silently skip most pages. Verify the connection string uses Session mode (port 6543) or the direct connection (port 5432).
- Set up automatic sync. Choose the approach that fits your environment:
  - Cron (recommended for agents): register a cron every 5-30 minutes:
    gbrain sync --repo /data/brain && gbrain embed --stale
  - Watch mode: gbrain sync --watch --repo /data/brain under a process manager. Pair with a cron fallback (watch exits after 5 consecutive failures).
  - Webhook or git hook: if available in your environment.
- Verify sync works. Don't just check that the command ran. Check that it worked:
  - gbrain stats should show page count close to syncable file count in the repo.
  - If page count is way too low, the pooler bug is silently skipping pages.
  - Push a test change and confirm it appears in gbrain search.
- Chain sync + embed. Always run both: gbrain sync --repo <path> && gbrain embed --stale. For small syncs, embeddings are generated inline. The embed --stale step is a safety net for any stale chunks.
Tell the user: "Live sync is configured. The brain will stay current automatically. I'll verify it's working in the next phase."
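Two parts of this phase lend themselves to small helpers: building the cron entry and judging whether the page count is "close" to the file count. Both functions below are hypothetical sketches. The output assumes a standard crontab, which may differ from how OpenClaw/Hermes register crons. The 90% default threshold is an assumed value, not one the guide specifies. The two counts are passed in as numbers because the output format of gbrain stats is not shown here.

```bash
# sync_cron_line: print a crontab entry that chains sync + embed.
# $1 = interval in minutes (the guide suggests 5-30), $2 = brain repo path.
sync_cron_line() {
  printf '*/%s * * * * gbrain sync --repo %s && gbrain embed --stale\n' "$1" "$2"
}

# coverage_ok: succeed when indexed pages cover enough of the syncable files.
# $1 = page count, $2 = syncable file count, $3 = minimum percent (assumed default 90).
coverage_ok() {
  local pages="$1" files="$2" min_pct="${3:-90}"
  [ "$files" -gt 0 ] || return 1
  [ $(( pages * 100 / files )) -ge "$min_pct" ]
}
```

Usage: sync_cron_line 15 /data/brain prints an entry that runs every 15 minutes. Cron jobs often run with a minimal PATH, so the entry may need the full path to gbrain.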
Phase I: Full Verification
Run the full verification runbook to confirm the entire installation is working.
- Read docs/GBRAIN_VERIFY.md
- Execute each check in order
- Report results to the user
- Fix any failures before declaring setup complete
Every check in the runbook should pass. The most important one is check 4 (live sync actually works): push a change, wait for sync, search for the corrected text. "Sync ran" is not the same as "sync worked."
Tell the user: "I've verified the full GBrain installation. Here's the status of each check: [list results]. Everything is working / [specific item] needs attention."
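The "sync ran" versus "sync worked" distinction can be tested with a unique marker: write it to a page, commit, sync, then search for it. verify_live_sync is a hypothetical helper, not the runbook itself. It runs sync directly instead of waiting for the cron, and it assumes gbrain search prints the matching text, which is an assumption about the CLI's output. It also leaves the marker commit in the repo, so use a scratch page.

```bash
# verify_live_sync: hypothetical end-to-end check for one page.
# $1 = brain repo path, $2 = page path relative to the repo.
# Assumption: gbrain search echoes the matched text in its output.
verify_live_sync() {
  local repo="$1" page="$2" marker
  marker="gbrain-verify-$(date +%s)"
  # Append a unique marker that no earlier version of the page contains.
  printf '\n%s\n' "$marker" >> "$repo/$page" || return 1
  git -C "$repo" add "$page" || return 1
  git -C "$repo" commit -q -m "verify: $marker" || return 1
  # Same chain the cron runs.
  gbrain sync --repo "$repo" || return 1
  gbrain embed --stale || return 1
  # Sync only "worked" if the new text is searchable.
  gbrain search "$marker" | grep -q -- "$marker"
}
```

Usage: verify_live_sync /data/brain scratch/verify.md && echo "live sync OK"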
If already configured or user declines, skip.
Schema State Tracking
After presenting the recommended directories (Phase C/E) and the user selects which
ones to create, write ~/.gbrain/update-state.json recording:
- schema_version_applied: current gbrain version
- skillpack_version_applied: current gbrain version
- schema_choices.adopted: directories the user created
- schema_choices.declined: directories the user explicitly skipped
- schema_choices.custom: directories the user added that aren't in the recommended schema
This file enables future upgrades to suggest new schema additions without re-suggesting things the user already declined.
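A minimal sketch of writing that file follows. The nested schema_choices object is one reading of the dotted field names above, not a confirmed layout. write_update_state and the GBRAIN_STATE_DIR override are hypothetical. Each directory list is passed as a ready-made JSON array literal, and the sketch does not escape or validate it.

```bash
# write_update_state: hypothetical helper that records schema choices.
# $1 = gbrain version; $2, $3, $4 = JSON array literals for
# adopted, declined, and custom directories.
# GBRAIN_STATE_DIR is an assumed override; the default is ~/.gbrain.
write_update_state() {
  local version="$1" adopted="$2" declined="$3" custom="$4"
  local state_dir="${GBRAIN_STATE_DIR:-$HOME/.gbrain}"
  mkdir -p "$state_dir" || return 1
  cat > "$state_dir/update-state.json" <<EOF
{
  "schema_version_applied": "$version",
  "skillpack_version_applied": "$version",
  "schema_choices": {
    "adopted": $adopted,
    "declined": $declined,
    "custom": $custom
  }
}
EOF
}
```

Usage, with sample values: write_update_state "0.5.0" '["people","companies"]' '["media"]' '[]'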
Tools Used
- gbrain init --non-interactive --url ... -- create brain
- gbrain import <dir> --no-embed [--workers N] -- import files
- gbrain search <query> -- search brain
- gbrain doctor --json -- health check
- gbrain check-update --json -- check for updates
- gbrain embed refresh -- generate embeddings
- gbrain embed --stale -- backfill missing embeddings
- gbrain sync --repo <path> -- one-shot sync from brain repo
- gbrain sync --watch --repo <path> -- continuous sync polling
- gbrain config get sync.last_run -- check last sync timestamp
- gbrain stats -- page count + embed coverage