docs: live sync setup + verification runbook + API key loading (#24)

* docs: add SKILLPACK Section 18 — Live Sync (MUST ADD) Contract-first guide for keeping the vector DB in sync with the brain repo. Documents the pooler prerequisite (Session mode required for transactions), sync + embed primitives, four example approaches (cron, --watch, webhook, git hook), isSyncable exclusions, silent skip warning, and OpenClaw/Hermes cron registration examples. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add GBRAIN_VERIFY.md installation verification runbook Six-check runbook: schema (doctor), skillpack loaded, auto-update, live sync (coverage check + embed check + end-to-end push-and-search test), embedding coverage, brain-first lookup protocol. Emphasizes "sync ran" != "sync worked" — the real test is searching for corrected text after a push. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add setup Phases H (Live Sync) and I (Verification) Phase H: MUST ADD live sync setup — pooler prerequisite check, automatic sync configuration (agent picks approach), sync+embed chaining, coverage verification. Phase I: run GBRAIN_VERIFY.md end-to-end before declaring setup complete. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add install steps 8-9 (live sync + verification) Step 8: set up automatic sync with SKILLPACK Section 18 reference. Step 9: run GBRAIN_VERIFY.md runbook. Add GBRAIN_VERIFY.md to docs section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add API key loading instructions to CLAUDE.md Source ~/.zshrc before running Tier 2 tests so OPENAI_API_KEY and ANTHROPIC_API_KEY are available. Without this, embedding and skills tests skip silently. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version to v0.5.0 Live sync, verification runbook, API key loading instructions. Version markers updated in SKILLPACK and RECOMMENDED_SCHEMA. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add anti-hand-roll rule to skill routing in CLAUDE.md Explicitly prohibit manually running git commit + push + gh pr create when /ship is available. /ship handles VERSION, CHANGELOG, document-release, reviews, and coverage audit. Hand-rolling skips all of these. Added "commit and ship" / "push and ship" variants to the ship routing rule. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: changelog voice rule + rewrite 0.5.0 changelog to sell the upgrade CLAUDE.md: add changelog voice guidance — lead with benefits, not implementation details. Make users want to upgrade. CHANGELOG: rewrite 0.5.0 entries from dry feature descriptions to capability-focused bullets ("your brain never falls behind" not "SKILLPACK Section 18 added"). SKILLPACK Section 17: update the auto-update message template to instruct agents to sell the upgrade, not just summarize the diff. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add v0.5.0 migration directive for live sync + verification Agents upgrading from v0.4.x will automatically: check their pooler connection string, set up automatic sync, and run the verification runbook. Without this migration file, upgrading agents would learn about live sync (by re-reading Section 18) but wouldn't set it up. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: sharpen migration file guidance in CLAUDE.md Replace vague "requires agent action" with concrete trigger list: new setup steps existing users don't have, MUST ADD skillpack sections, schema changes, deprecated commands, new verification steps, new crons. Add the key test: "if an existing user upgrades and does nothing else, will their brain work worse?" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: make Section 17 upgrade flow work for direct user requests Section 17 was structured as a cron-initiated flow only. An agent handling "upgrade gbrain" might just run the command and stop, missing the post-upgrade steps where the value is (re-read skills, run migrations, schema sync). Added explicit entry point for direct upgrade requests. Made Steps 2-4 more concrete about where to find files and why migrations can't be skipped. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add E2E sync tests — git-to-DB pipeline (11 tests) Tests the full sync lifecycle against real Postgres+pgvector: - First sync imports all pages from a git repo - Second sync with no changes returns up_to_date - Incremental sync picks up new files (add → commit → sync → verify) - Incremental sync picks up modifications — THE CRITICAL TEST: corrected text appears in DB and keyword search after sync - Incremental sync handles deletes - Non-syncable files are excluded (README, .raw/, ops/) - Sync state (last_commit, last_run) persisted to config - Sync logged to ingest_log - --full reimports everything - --dry-run shows changes without applying Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: strengthen CLAUDE.md to always run ALL test tiers Replace passive "source zshrc" suggestion with ALWAYS directive. Explicitly state that "run all tests" means ALL tiers including Tier 2 with API keys. Do not skip Tier 2 just because keys need loading. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Tier 2 E2E tests — correct openclaw CLI invocation The tests used `openclaw -p` which doesn't exist. The correct command is `openclaw agent --local --agent <id> --message <prompt>`. Also fixed JSON output parsing (structured JSON goes to stderr, not stdout — use non-JSON mode instead). Fixed ingest test to assert on agent response text rather than test DB state (the agent writes to its own configured DB, not the ephemeral test DB). 82 tests pass, 0 fail, 0 skip across all 5 E2E files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:23:59 -10:00
parent eb218a96ad
commit e9f3c9c24d
12 changed files with 1005 additions and 87 deletions
--- a/docs/GBRAIN_SKILLPACK.md
+++ b/docs/GBRAIN_SKILLPACK.md
@@ -1,4 +1,4 @@
-<!-- skillpack-version: 0.4.1 -->
+<!-- skillpack-version: 0.5.0 -->
 <!-- source: https://raw.githubusercontent.com/garrytan/gbrain/master/docs/GBRAIN_SKILLPACK.md -->
 # GBrain Skillpack: Reference Architecture for AI Agents

@@ -1008,30 +1008,42 @@ concatenation but 10x better at understanding what an email means. Use both.

 ---

-## 17. Auto-Update Notifications
+## 17. Upgrades and Auto-Update Notifications

-GBrain ships updates frequently. The auto-update cron keeps users current by
-checking for new versions and messaging them when there's something worth upgrading to.
+GBrain ships updates frequently. There are two ways an upgrade happens:

-The check is automatic. The upgrade is always manual — never install without
-the user's explicit permission.
+**User says "upgrade gbrain":** Run `gbrain check-update --json` to see what's new,
+then run the Full Upgrade Flow below (Steps 1-6). Do NOT just run `gbrain upgrade`
+and stop. The post-upgrade steps (re-read skills, run migrations, schema sync) are
+where the value is. Without them, the agent has new code but old behavior.

-### The Check
+**Cron finds an update:** The auto-update cron checks for new versions and messages
+the user. The user decides whether to upgrade. If yes, run the same Full Upgrade
+Flow (Steps 1-6).
+
+The upgrade is always manual. Never install without the user's explicit permission.
+
+### The Check (cron-initiated)

 Run `gbrain check-update --json`. If `update_available` is false, stay completely
 silent — do nothing. If true, message the user on their preferred channel.

 ### The Message

-Summarize the `changelog_diff` into 2-3 human-friendly bullets. No raw markdown.
-Lead with the most impactful change.
+Sell the upgrade. The user should feel "hell yeah, I want that." Lead with what
+they can DO now that they couldn't before, not what files changed. Frame as
+capabilities and benefits, not implementation details. Make them excited that
+GBrain keeps getting better. 2-3 punchy bullets, no raw markdown, no file names.

 > **GBrain v0.5.0 is available** (you're on v0.4.0)
 >
 > What's new:
-> - Semantic chunking now 3x faster with batched embedding
-> - New `gbrain files mirror` command for cloud storage migration
-> - Doctor command catches RLS misconfigurations
+> - Your brain never falls behind. Live sync keeps the vector DB current
+>   automatically, so edits show up in search within minutes, not "whenever
+>   someone remembers to run sync"
+> - New verification runbook catches silent failures: the pooler bug that
+>   skips pages, missing embeddings, stale search results
+> - New installs set up live sync automatically. No more manual setup step
 >
 > Want me to upgrade? I'll update everything and refresh my playbook.
 >
@@ -1057,15 +1069,24 @@ Acceptable "yes": any clearly affirmative response. When in doubt, ask again.
 Run `gbrain upgrade`. This updates the CLI and all shipped files (skills, docs, migrations).

 **Step 2: Re-read all updated skills.**
-Re-read every skill file in `skills/*/SKILL.md` to pick up new patterns and workflows.
-Updated skills = better agent behavior. The user gets this for free.
+Find the gbrain package directory (`bun pm ls 2>/dev/null | grep gbrain` or check
+`node_modules/gbrain/`). Re-read every skill file in `skills/*/SKILL.md` to pick up
+new patterns and workflows. Updated skills = better agent behavior. The user gets
+this for free.

 **Step 3: Re-read the production reference docs.**
-Read `docs/GBRAIN_SKILLPACK.md` and `docs/GBRAIN_RECOMMENDED_SCHEMA.md` fresh.
+Read `docs/GBRAIN_SKILLPACK.md` and `docs/GBRAIN_RECOMMENDED_SCHEMA.md` fresh from
+the gbrain package directory. These contain the latest patterns, cron schedules,
+and integration guides. This is how the agent learns about new capabilities like
+live sync (Section 18).

 **Step 4: Check for version-specific migration directives.**
 Look for `skills/migrations/v[version].md` files between the old and new version.
-If they exist, read and execute them in order.
+If they exist, read and execute them **in order**. These are the post-upgrade
+actions that make the new version actually work for existing users (e.g., v0.5.0
+migration sets up live sync and runs the verification runbook). Do NOT skip this
+step. Without migrations, the agent has new code but the user's environment hasn't
+changed.

 **Step 5: Schema sync — suggest new recommendations without undoing user choices.**
 Read `~/.gbrain/update-state.json` to see what the user previously adopted, declined,
@@ -1104,7 +1125,7 @@ installing gbrain, you can still stay current. Both this file and
 GBRAIN_RECOMMENDED_SCHEMA.md have version markers in their first lines:

 ```
-<!-- skillpack-version: 0.4.1 -->
+<!-- skillpack-version: 0.5.0 -->
 <!-- source: https://raw.githubusercontent.com/garrytan/gbrain/master/docs/GBRAIN_SKILLPACK.md -->
 ```

@@ -1129,3 +1150,132 @@ If both are current, stay silent."

 This way standalone users get new patterns, workflows, and schema recommendations
 without installing the gbrain CLI. The agent fetches, diffs, and updates its own playbook.
+
+---
+
+## 18. Live Sync -- Keep the Index Current (MUST ADD)
+
+The brain repo is the source of truth. The vector DB is a derived index. If sync
+doesn't run, gbrain returns stale answers. This is not theoretical: a hallucination
+was corrected in a brain page but the vector DB kept serving the old text because
+nobody ran `gbrain sync`. This section is **mandatory** during agent install.
+
+### The Contract
+
+After setup, the vector DB must stay current with the brain repo automatically. No
+human should need to run `gbrain sync` manually. The agent chooses HOW based on its
+environment. The only hard requirement: verify it actually works (see
+`docs/GBRAIN_VERIFY.md`).
+
+### Prerequisite: Session Mode Pooler
+
+Sync uses `engine.transaction()` on every import. If `DATABASE_URL` points to
+Supabase's **Transaction mode** pooler, sync will throw `.begin() is not a function`
+and **silently skip most pages**. This is the #1 cause of "sync ran but nothing
+happened."
+
+Fix: use the **Session mode** pooler string (port 6543, Session mode) or the direct
+connection (port 5432, IPv6-only). Verify by running `gbrain sync` and checking that
+the page count in `gbrain stats` matches the syncable file count in the repo. If
+they diverge, your connection string is wrong.
+
+### The Primitives
+
+Always chain sync + embed:
+
+```bash
+gbrain sync --repo /path/to/brain && gbrain embed --stale
+```
+
+- `gbrain sync --repo <path>` -- one-shot incremental sync. Detects changes via
+  `git diff`, imports only what changed. For small changesets (<= 100 files),
+  embeddings are generated inline during import.
+- `gbrain embed --stale` -- backfill embeddings for any chunks that don't have them.
+  This is a safety net: it catches chunks from large syncs (>100 files, where
+  embeddings are deferred) or prior `--no-embed` runs.
+- `gbrain sync --watch --repo <path>` -- foreground polling loop, every 60s
+  (configurable with `--interval N`). Embeds inline for small changesets. **Exits
+  after 5 consecutive failures**, so run under a process manager (systemd
+  `Restart=on-failure`, pm2) or pair with a cron fallback.
+
+### Example Approaches (pick what fits your environment)
+
+**Cron job** (recommended for agents): run every 5-30 minutes.
+
+```bash
+gbrain sync --repo /data/brain && gbrain embed --stale
+```
+
+Works with any cron scheduler: OpenClaw, Hermes, system crontab.
+
+**Long-lived watcher**: for near-instant sync (60s polling).
+
+```bash
+gbrain sync --watch --repo /data/brain
+```
+
+Run under a process manager that auto-restarts on exit. Pair with a cron fallback
+since `--watch` exits on repeated failures.
+
+**GitHub webhook**: triggers sync on push events. Optional, for instant sync (<5s).
+Set up the webhook to call `gbrain sync --repo /data/brain && gbrain embed --stale`.
+If using webhooks, verify `X-Hub-Signature-256` against a shared secret.
+
+**Git post-receive hook**: if the brain repo is on the same machine.
+
+### What Gets Synced
+
+Sync only indexes "syncable" markdown files. These are excluded by design:
+
+- Hidden paths (`.git/`, `.raw/`, etc.)
+- The `ops/` directory
+- Meta files: `README.md`, `index.md`, `schema.md`, `log.md`
+
+This is intentional. The brain is the `.md` files in the directory structure, not
+every file in the repo.
+
+### Sync is Idempotent
+
+Concurrent runs are safe. Two syncs on the same commit no-op because content hashes
+match. If both a cron and `--watch` fire simultaneously, no conflict.
+
+### Silent Skip Warning
+
+If a file fails to import (malformed frontmatter, binary content in a `.md` file,
+connection error on a single page), sync logs a warning but still advances
+`sync.last_commit`. That file is silently skipped until it changes again or someone
+runs `gbrain sync --full`.
+
+The verify step in `docs/GBRAIN_VERIFY.md` catches this by comparing page count in
+the DB against syncable file count in the repo. If they diverge, `--full` reconciles.
+
+### OpenClaw
+
+```
+Name: gbrain-auto-sync
+Schedule: */15 * * * *
+Prompt: "Run: gbrain sync --repo /data/brain && gbrain embed --stale
+  Log the result. If sync fails with .begin() is not a function,
+  the DATABASE_URL is using Transaction mode pooler. Switch to
+  Session mode (see GBRAIN_SKILLPACK.md Section 18)."
+```
+
+Add to `/data/.openclaw/cron/jobs.json`.
+
+### Hermes
+
+```
+/cron add "*/15 * * * *" "Run gbrain sync --repo /data/brain &&
+  gbrain embed --stale. Log the result." --name "gbrain-auto-sync"
+```
+
+### What "Done" Looks Like
+
+1. Edit a brain file, commit, push
+2. Wait for the next sync cycle (cron interval or `--watch` poll)
+3. `gbrain search "<text from the edit>"` returns the updated content
+4. `gbrain stats` shows page count matching syncable file count
+5. Embedded chunk count is close to total chunk count
+
+If step 3 returns old content, sync failed silently. Run the full verification
+from `docs/GBRAIN_VERIFY.md`.