feat: GBrain v0.2.0 — incremental sync, file storage, install skill (#2)
* refactor: extract importFile from import.ts + add tag reconciliation

  Shared single-file import function used by both import and sync. Adds tag reconciliation (removes stale tags on reimport), >1MB file skip, and import->sync checkpoint continuity (writes git HEAD to config table after import so sync picks up seamlessly).

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add sync pure functions, updateSlug engine method, and sync tests

  - buildSyncManifest: parses git diff --name-status -M output
  - isSyncable: filters to .md pages, excludes hidden/ops/.raw/skip-list
  - pathToSlug: converts file paths to page slugs with optional prefix
  - updateSlug: renames page slug in-place (preserves page_id, chunks, embeddings)
  - rewriteLinks: stub for v0.2 (FKs use page_id, already correct)
  - 20 new tests, all passing (39 total across 3 files)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add gbrain sync command with CLI, MCP, and watch mode

  18-step sync protocol: read config, git pull, ancestry validation, git diff --name-status -M for net changes, isSyncable filter, process deletes/renames/adds/modifies via importFile, batch optimization, sync state checkpoint in Postgres config table. Watch mode with polling and consecutive error counter. MCP sync_brain tool returns structured SyncResult. Stale page deletion for un-syncable files.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add files table, gbrain files commands, and config show redaction

  - files table: page_slug FK with ON DELETE SET NULL + ON UPDATE CASCADE, storage_path, storage_url, mime_type, content_hash for dedup
  - gbrain files list/upload/sync/verify commands for Supabase Storage
  - gbrain config show redacts postgresql:// passwords and secret keys
  - CLI help updated with FILES section

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add install skill for GBrain onboarding

  6-phase install workflow: environment discovery, Supabase setup (magic path via CLI OAuth or fallback 2-copy-paste), init + import, ongoing sync cron, optional file migration with mandatory verification, and agent teaching (AGENTS.md rules). Every error gets what + why + fix.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.2.0

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add v0.2 features to README (sync, files, install skill)

  README.md: added sync command to IMPORT/EXPORT section, added FILES section with 4 commands, added files table to schema diagram, added install skill to skills table, updated MCP tools count from 20 to 21 (sync_brain added).

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: OpenClaw DX improvements (skill count, upgrade docs, config show help)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: consolidate version to single source of truth

  Create src/version.ts that reads from package.json via static import (safe for bun compiled binaries). Update mcp/server.ts from hardcoded '0.1.0' to use shared VERSION. Bump skills/manifest.json to 0.2.0.

* fix: upgrade detection order, npm→bun naming, clawhub false positives

  Reorder detection: node_modules first, binary second, clawhub last. Rename 'npm' install method to 'bun'. Use 'clawhub --version' instead of 'which clawhub' to avoid false positives from dangling symlinks. Add 120s timeout to execSync calls to prevent hanging. Add --help flag.

* feat: per-command --help, unknown command check before DB connection

  Add COMMAND_HELP map covering all 28 commands. Check --help before init/upgrade dispatch and before connectEngine() so help works without a database. Use COMMAND_HELP keys as known-command set to catch unknown commands before wasting a DB round-trip.

* docs: standardize npm references to bun, add Upgrade section to README

  Fix init.ts: npx→bunx, npm→bun for supabase CLI guidance. Fix README: npm install→bun add for standalone CLI install. Add ## Upgrade section to README with all three install methods. Update install skill Upgrading section to list bun, ClawHub, and binary.

* test: full coverage audit - CLI dispatch, upgrade detection, config, edge cases

  New test files:
  - test/cli.test.ts: COMMAND_HELP ↔ switch consistency, version from package.json, per-command --help, unknown command handling, global help
  - test/upgrade.test.ts: detection order verification, npm→bun naming, clawhub --version (not which), timeout presence
  - test/config.test.ts: redactUrl for postgresql URLs, edge cases

  Extended existing tests:
  - test/sync.test.ts: empty string pathToSlug, uppercase .MD rejection, deeply nested files, multiple renames, unknown status codes
  - test/markdown.test.ts: multiple --- separators, missing frontmatter, no frontmatter at all, empty string, type inference from paths

  Tests: 39 → 83 (+44 new). All pass.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: 100% coverage - import-file mock engine, files utils, chunker edge cases

  New test files:
  - test/import-file.test.ts (9 tests): mock BrainEngine to test importFile without DB: MAX_FILE_SIZE skip, content_hash dedup, tag reconciliation (remove stale + add new), compiled_truth/timeline chunking, noEmbed flag, sequential chunk_index
  - test/files.test.ts (22 tests): getMimeType for all extensions + uppercase + unknown + no-extension, fileHash consistency + different content + empty, collectFiles pattern (skip .md, skip hidden dirs, recurse, sorted output)

  Extended:
  - test/chunkers/recursive.test.ts (+6 tests): single newline splits, word-only text, clause delimiters, lossless preservation, default options, mixed delimiter hierarchy

  Tests: 83 → 118 (+35 new). All pass.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
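For readers who want a feel for the sync pure functions named in the commit message, here is a minimal sketch. It is not the code from this commit: the `SyncManifest` shape, the way the prefix is joined, and the decision to model only the hidden-path exclusion (leaving out the ops and skip-list rules) are all assumptions made for illustration.

```typescript
// Hypothetical manifest shape; the real type in the repository may differ.
interface SyncManifest {
  added: string[];
  modified: string[];
  deleted: string[];
  renamed: { from: string; to: string }[];
}

// Parse the output of `git diff --name-status -M`.
// Each line is tab-separated: STATUS<TAB>path, or Rnnn<TAB>old<TAB>new for renames.
function buildSyncManifest(diffOutput: string): SyncManifest {
  const manifest: SyncManifest = { added: [], modified: [], deleted: [], renamed: [] };
  for (const line of diffOutput.split("\n")) {
    if (line.trim() === "") continue;
    const [status, first, second] = line.split("\t");
    if (!status || !first) continue;
    switch (status[0]) {
      case "A":
        manifest.added.push(first);
        break;
      case "M":
        manifest.modified.push(first);
        break;
      case "D":
        manifest.deleted.push(first);
        break;
      case "R":
        if (second) manifest.renamed.push({ from: first, to: second });
        break;
      default:
        break; // unknown status codes are ignored in this sketch
    }
  }
  return manifest;
}

// Assumption: only lowercase .md files outside hidden path segments count as pages.
function isSyncable(path: string): boolean {
  if (!path.endsWith(".md")) return false;
  return !path.split("/").some((segment) => segment.startsWith("."));
}

// Assumption: the slug is the path without its .md extension, optionally prefixed.
function pathToSlug(path: string, prefix = ""): string {
  const slug = path.replace(/\.md$/, "");
  return prefix ? `${prefix}/${slug}` : slug;
}
```

Keeping these as pure functions over strings is what lets the commit test them without a git checkout or a database.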
210
skills/install/SKILL.md
Normal file
@@ -0,0 +1,210 @@
# Install GBrain

Set up GBrain from scratch. The agent drives the process; the human provides secrets and approvals.

## Prerequisites

- A Supabase account (Pro tier recommended: $25/mo for 8GB DB + 100GB storage)
- An OpenAI API key (for semantic search embeddings, ~$4-5 for 7,500 pages)
- A git-backed markdown knowledge base (or start fresh)

## Phase 1: Environment Discovery

Scan the environment to understand what we're working with.

```bash
# Find all git repos with markdown content
echo "=== GBrain Environment Discovery ==="
# Unmatched globs stay literal and are skipped by the .git check below
for dir in /data/* ~/git/* ~/Documents/*; do
  if [ -d "$dir/.git" ]; then
    md_count=$(find "$dir" -name "*.md" -not -path "*/node_modules/*" -not -path "*/.git/*" 2>/dev/null | wc -l | tr -d ' ')
    if [ "$md_count" -gt 10 ]; then
      total_size=$(du -sh "$dir" 2>/dev/null | cut -f1)
      binary_count=$(find "$dir" -not -name "*.md" -not -path "*/node_modules/*" -not -path "*/.git/*" -type f \( -name "*.jpg" -o -name "*.png" -o -name "*.pdf" -o -name "*.mp4" -o -name "*.m4a" -o -name "*.heic" -o -name "*.tiff" -o -name "*.dng" \) 2>/dev/null | wc -l | tr -d ' ')
      echo ""
      echo "  $dir ($total_size, $md_count .md files, $binary_count binary files)"
      # Detect knowledge base type
      if [ -d "$dir/.obsidian" ]; then
        echo "    Type: Obsidian vault (detected, wikilink conversion needed in future release)"
      elif [ -d "$dir/logseq" ]; then
        echo "    Type: Logseq (detected, block-ref conversion needed in future release)"
      else
        echo "    Type: Plain markdown (ready for import)"
      fi
    fi
  fi
done
echo ""
echo "=== Discovery Complete ==="
```

Present findings to the human. Recommend which repos to import.

## Phase 2: Supabase Setup

### Magic Path (zero copy-pastes)

Check if the Supabase CLI is available:

```bash
which supabase 2>/dev/null || npx supabase --version 2>/dev/null
```

If available, use the magic path:

1. Tell the human: "I'll set up Supabase for you. Click 'Authorize' when your browser opens."
2. Run `supabase login` (opens browser for OAuth)
3. Run `supabase projects create gbrain --region us-east-1` (the project name is a positional argument)
4. Extract credentials from `supabase projects api-keys --project-ref <project_ref>`
5. Proceed to Phase 3 automatically

### Fallback Path (2 copy-pastes)

If the Supabase CLI is not available, tell the human exactly what to do:

1. "Log into Supabase and add a credit card: https://supabase.com/dashboard/account/billing"
2. "Create a new project: https://supabase.com/dashboard/new/_"
   - Name: gbrain
   - Region: closest to you
   - Generate a strong password
3. "Go to Project Settings > Database and copy the connection string (URI format)"
   - Paste it here
4. "Go to Project Settings > API and copy the service_role key"
   - Paste it here

That's it. Two copy-pastes. The agent does everything else.

## Phase 3: Initialize GBrain

```bash
gbrain init \
  --url "<database_url>" \
  --repo "<repo_path>"
```

This runs:
1. Connection test (`SELECT 1`)
2. pgvector extension check (`CREATE EXTENSION IF NOT EXISTS vector`)
3. Schema migration (idempotent, safe to re-run)
4. Text import (all .md files, no embeddings yet)
5. Sync checkpoint (writes git HEAD for seamless `gbrain sync`)

### First Search Result

After import completes, run a sample query to prove it works:

```bash
# Query the most recently modified page's topic.
# Only leading spaces are trimmed, so multi-word titles stay intact.
gbrain query "$(ls -t <repo_path>/*.md <repo_path>/**/*.md 2>/dev/null | head -1 | xargs head -5 | grep -i -m1 'title:' | cut -d: -f2- | sed 's/^ *//')"
```

Show results to the human immediately. This is the magic moment.
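
The shell pipeline above pulls a `title:` line out of the first few lines of the newest page. As a rough illustration of the same extraction, assuming pages begin with YAML frontmatter delimited by `---`, a small helper could look like this. `extractTitle` is a hypothetical name and is not part of GBrain.

```typescript
// Return the frontmatter `title:` value of a markdown document, or null if none.
// Assumption: frontmatter is a block delimited by `---` lines at the top of the file.
function extractTitle(markdown: string): string | null {
  const lines = markdown.split(/\r?\n/);
  if (lines[0]?.trim() !== "---") return null; // no frontmatter block
  for (const line of lines.slice(1)) {
    if (line.trim() === "---") break; // end of frontmatter
    const match = /^title:\s*(.*)$/i.exec(line);
    if (match) {
      // Strip optional surrounding quotes; keep inner spaces and colons.
      return (match[1] ?? "").trim().replace(/^["']|["']$/g, "");
    }
  }
  return null;
}
```

Unlike a `cut`-based one-liner, this keeps titles that contain colons or several words, which usually makes for a better first query.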

### Start Embeddings

```bash
gbrain embed --stale &
```

Embeddings run in background. Keyword search works NOW. Semantic search improves as embeddings complete. Check progress with `gbrain embed --status`.

## Phase 4: Set Up Ongoing Sync

```bash
# Add to cron (every 5 minutes)
(crontab -l 2>/dev/null; echo "*/5 * * * * gbrain sync --no-pull 2>&1 | tail -1 >> /tmp/gbrain-sync.log") | crontab -
```

Or for agents that push to the brain repo, trigger sync after writes:
```bash
gbrain sync --no-pull
```

## Phase 5: Optional File Migration

If the repo has >100MB of binary files:

1. **Tell the human what will happen:**
   "Your repo has X binary files (Y MB). I can move them to Supabase Storage to slim down git. Files stay in git history permanently. Want me to proceed?"

2. **If approved:**
   ```bash
   gbrain health                          # verify everything is connected
   gbrain files sync <repo>/attachments/  # upload all files
   gbrain files verify                    # mandatory 100% verification
   # STOP: ask human for approval before git rm
   ```

3. **After human approves git rm:**
   ```bash
   cd <repo>
   echo "attachments/" >> .gitignore
   git add .gitignore                     # stage the ignore rule so it lands in the same commit
   git rm -r --cached attachments/
   git commit -m "Move attachments to Supabase Storage"
   git push
   ```
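
The verification step is the gate before anything is removed from git. How `gbrain files verify` works internally is not shown in this diff; the sketch below only illustrates the idea of a 100% check, assuming each file is identified by a path and a content hash (the `files` table does carry a `content_hash` column). `FileRecord`, `VerifyResult`, and `verifyUploads` are hypothetical names.

```typescript
// Hypothetical record shape: one entry per file, keyed by path.
interface FileRecord {
  path: string;
  contentHash: string;
}

interface VerifyResult {
  ok: boolean;          // true only if every local file is present and identical
  missing: string[];    // local files with no uploaded counterpart
  mismatched: string[]; // uploaded, but the content hash differs
}

// Compare the local file list against what storage reports.
function verifyUploads(local: FileRecord[], uploaded: FileRecord[]): VerifyResult {
  const remote = new Map(uploaded.map((f) => [f.path, f.contentHash] as const));
  const missing: string[] = [];
  const mismatched: string[] = [];
  for (const file of local) {
    const hash = remote.get(file.path);
    if (hash === undefined) missing.push(file.path);
    else if (hash !== file.contentHash) mismatched.push(file.path);
  }
  return { ok: missing.length === 0 && mismatched.length === 0, missing, mismatched };
}
```

An agent following this skill would proceed to the `git rm` approval step only when `ok` is true; a single missing or mismatched file is enough to stop.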

## Phase 6: Teach the Agent

Add GBrain rules to AGENTS.md (or equivalent):

```markdown
## GBrain (Knowledge Search)

GBrain indexes your knowledge base for fast search. Always search before answering
questions about people, companies, deals, or anything in the brain.

### Commands
- `gbrain query "search terms"` -- Search the knowledge base (keyword + semantic)
- `gbrain sync` -- Sync latest changes from git to GBrain
- `gbrain files upload <path> --page <slug>` -- Upload a file to storage
- `gbrain health` -- Check GBrain status
- `gbrain stats` -- Show page count, embedding coverage, last sync

### Rules
1. **Search the brain first.** Before answering any question about people, companies,
   deals, meetings, or strategy, run `gbrain query`. Your memory of file contents
   goes stale; the database doesn't.
2. **Never commit binaries to git.** Use `gbrain files upload` instead.
3. **After writing to the brain repo,** trigger `gbrain sync --no-pull` to update
   the search index immediately.
```

## Error Handling

Every error tells you what happened, why, and how to fix it:

| What You See | Why | Fix |
|---|---|---|
| Connection refused | Supabase project paused or wrong URL | supabase.com/dashboard > Restore |
| Password authentication failed | Wrong password | Project Settings > Database > Reset password |
| pgvector not available | Extension not enabled | Run `CREATE EXTENSION vector` in SQL Editor |
| OpenAI key invalid | Expired or wrong key | platform.openai.com/api-keys > Create new |
| Sync anchor missing | Force push removed the commit | `gbrain sync --full` |
| No pages found | Query before import | `gbrain import <dir>` first |
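
The what / why / fix pattern in the table maps naturally onto a lookup structure. The following is a minimal sketch of that idea, not GBrain's actual error handling: `ErrorHelp`, `ERROR_HELP`, and `explainError` are hypothetical names, and matching by case-insensitive substring is an assumption.

```typescript
// One row of the table above: what the user sees, why, and how to fix it.
interface ErrorHelp {
  what: string;
  why: string;
  fix: string;
}

const ERROR_HELP: ErrorHelp[] = [
  { what: "Connection refused", why: "Supabase project paused or wrong URL", fix: "supabase.com/dashboard > Restore" },
  { what: "Password authentication failed", why: "Wrong password", fix: "Project Settings > Database > Reset password" },
  { what: "pgvector not available", why: "Extension not enabled", fix: "Run CREATE EXTENSION vector in SQL Editor" },
  { what: "OpenAI key invalid", why: "Expired or wrong key", fix: "platform.openai.com/api-keys > Create new" },
  { what: "Sync anchor missing", why: "Force push removed the commit", fix: "gbrain sync --full" },
  { what: "No pages found", why: "Query before import", fix: "gbrain import <dir> first" },
];

// Find the table row whose "what" text appears in a raw error message.
// Returns undefined when no row matches.
function explainError(message: string): ErrorHelp | undefined {
  const lower = message.toLowerCase();
  return ERROR_HELP.find((entry) => lower.includes(entry.what.toLowerCase()));
}
```

Keeping the rows as data means adding a new failure mode is a one-line change, and an agent can quote the `why` and `fix` fields to the human verbatim.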

## Upgrading

Upgrade depends on how you installed:
- **bun (standalone or library):** `bun update gbrain`
- **ClawHub:** `clawhub update gbrain`
- **Compiled binary:** Download the latest from [GitHub Releases](https://github.com/garrytan/gbrain/releases)

After upgrading:
- Run `gbrain init` again to apply schema migrations (idempotent, safe to re-run)
- The new `files` table gets created automatically on next init
- Sync state is preserved across upgrades

## Health Check

Run `gbrain health` at any time to verify all connections:

```
ok Database: connected
ok pgvector: extension loaded
ok Schema: up to date
ok Sync: last run N min ago
ok Embeddings: X/Y pages embedded
```

Every unhealthy line includes WHY and FIX.

skills/manifest.json
@@ -1,6 +1,6 @@
 {
   "name": "gbrain",
-  "version": "0.1.0",
+  "version": "0.2.0",
   "description": "Personal knowledge brain with hybrid RAG search",
   "skills": [
     {
@@ -32,6 +32,11 @@
       "name": "migrate",
       "path": "migrate/SKILL.md",
       "description": "Universal migration from Obsidian, Notion, Logseq, markdown, CSV, JSON, Roam"
+    },
+    {
+      "name": "install",
+      "path": "install/SKILL.md",
+      "description": "Set up GBrain from scratch: Supabase, import, sync, file migration"
     }
   ],
   "dependencies": {