feat: GBrain v0.3.0 — contract-first architecture + ClawHub plugin (#7)
* feat: contract-first operations.ts with OperationError, dry_run, importFromContent

30 shared operations as single source of truth for CLI and MCP.
- OperationError with typed error codes (page_not_found, invalid_params, etc.)
- dry_run support on all mutating operations
- importFromContent split from importFile with transaction wrapping
- Idempotency hash now includes ALL fields (title, type, frontmatter, tags)
- Config env var fallback: GBRAIN_DATABASE_URL > DATABASE_URL > config file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: rewrite MCP server + CLI + tools-json from operations

server.ts: 233 -> ~80 lines. Tool definitions and dispatch generated from operations[].
cli.ts: shared operations auto-registered, CLI-only commands kept as manual dispatch.
tools-json: generated FROM operations[], eliminating the third contract surface.
Parity test verifies structural contract between operations, CLI, and MCP.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: delete 12 command files migrated to operations.ts

Handler logic for get, put, delete, list, search, query, health, stats, tags, link, timeline, and version now lives in operations.ts.
Kept: init, upgrade, import, export, files, embed, sync, serve, call, config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: init --non-interactive, upgrade verification, schema migration

- gbrain init --non-interactive --url <url> for plugin mode (no TTY required)
- Post-upgrade version verification in gbrain upgrade
- Drop storage_url from files table (storage_path is the only identifier)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: tool-agnostic skills + new setup skill

All 7 skills rewritten with intent-based language instead of CLI commands. Works with both CLI and MCP plugin contexts.
New setup skill replaces install: auto-provision Supabase via CLI, AGENTS.md injection, target TTHW < 2 min.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: ClawHub bundle plugin, CI workflows, v0.3.0

- openclaw.plugin.json with configSchema, MCP server config, skill listing
- GitHub Actions: test on push/PR, multi-platform release (macOS arm64 + Linux x64)
- Version bump 0.3.0, CHANGELOG, README ClawHub section, CLAUDE.md updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: idempotency hash mismatch + MCP dry_run passthrough

importFromContent now passes its all-fields hash through putPage via content_hash on PageInput, so the stored hash matches the computed hash. Previously the skip-if-unchanged check never fired because the hash formulas differed.
MCP server now passes dry_run from tool params to OperationContext.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.3.0.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: schema loader handles PL/pgSQL $$ blocks

Delete the semicolon-based SQL splitter in db.ts which broke on PL/pgSQL trigger functions containing semicolons inside $$ delimiter blocks. Use single conn.unsafe(schemaSql) call instead — the postgres driver handles multi-statement SQL natively. schema.sql already uses IF NOT EXISTS / CREATE OR REPLACE for idempotency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: E2E test infrastructure + realistic brain fixtures

Add test infrastructure for running E2E tests against real Postgres+pgvector. Includes:
- test/e2e/helpers.ts: DB lifecycle, fixture import, timing, diagnostics
- 13 fixture files as a miniature realistic brain (people, companies, deals, meetings, concepts, projects, sources) following the compiled truth + timeline format from GBRAIN_RECOMMENDED_SCHEMA.md
- docker-compose.test.yml: local pgvector convenience (port 5433)
- .env.testing.example: template for test credentials
- package.json: add test:e2e script

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: E2E test suites + CI workflow

Tier 1 (mechanical.test.ts): 14 test suites covering all operations against real Postgres — page CRUD, search with quality scoring, links, tags, timeline, versions, admin, chunks, resolution, ingest log, raw data, files, idempotency stress, setup journey (full CLI flow), init edge cases, schema idempotency, schema diff guard, performance baselines.
Tier 1 (mcp.test.ts): MCP protocol test — spawns server, sends JSON-RPC, verifies tools/list matches operations count.
Tier 2 (skills.test.ts): OpenClaw skill tests — ingest, query, health. Skips gracefully when dependencies missing.
CI (.github/workflows/e2e.yml): Tier 1 on every PR (pgvector service), Tier 2 nightly/manual with API key secrets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: E2E test fixes + traverseGraph jsonb cast

- Fix traverseGraph query: cast json_agg to jsonb_agg so SELECT DISTINCT works
- Fix put_page tests to use importFromContent with noEmbed (no OpenAI key in Tier 1)
- Fix get_health assertion (page_count not total_pages)
- Fix raw_data test to handle JSONB string/object return
- Simplify MCP test to verify tool generation directly
- Add timeouts to CLI subprocess tests
- Use port 5434 for docker-compose (5433 often in use)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update all project docs for E2E test suite

- CLAUDE.md: updated test count (9 unit + 3 E2E), added E2E test instructions, fixed skill count to 8
- CONTRIBUTING.md: updated project structure with test/e2e/, added E2E test instructions, rewrote "Adding a new command" to reflect contract-first architecture (add to operations.ts, done)
- README.md: fixed table count (10 not 9), added recommended schema doc to Docs section, added E2E instructions to Contributing section
- CHANGELOG.md: added E2E test suite, docker-compose, schema loader fix, and traverseGraph jsonb fix to v0.3.0 entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
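The commit message describes operations.ts as the single source of truth from which MCP tool definitions and dispatch are generated, with typed `OperationError` codes and `dry_run` on mutating operations. The following is only a minimal sketch of that idea, not the contents of the real file: `Operation`, `OperationContext`, `Params`, `requireExistingSlug`, `toolDefinitions`, `dispatch`, the `delete_page` operation, and the in-memory `pages` map are all hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch: these names are illustrative, not the real operations.ts.
type ErrorCode = "page_not_found" | "invalid_params";

class OperationError extends Error {
  readonly code: ErrorCode;
  constructor(code: ErrorCode, message: string) {
    super(message);
    this.name = "OperationError";
    this.code = code;
  }
}

interface OperationContext {
  dryRun: boolean; // when true, mutating handlers only report what they would do
  pages: Map<string, string>; // in-memory stand-in for the database
}

type Params = Record<string, string | undefined>;

interface Operation {
  name: string;
  description: string;
  mutating: boolean;
  handler: (ctx: OperationContext, params: Params) => unknown;
}

// Shared validation: a missing slug or unknown page maps to a typed error code.
function requireExistingSlug(ctx: OperationContext, params: Params): string {
  const slug = params["slug"];
  if (!slug) throw new OperationError("invalid_params", "slug is required");
  if (!ctx.pages.has(slug)) throw new OperationError("page_not_found", `no page: ${slug}`);
  return slug;
}

const operations: Operation[] = [
  {
    name: "get_page",
    description: "Read a page by slug",
    mutating: false,
    handler: (ctx, params) => {
      const slug = requireExistingSlug(ctx, params);
      return { slug, body: ctx.pages.get(slug) };
    },
  },
  {
    name: "delete_page",
    description: "Delete a page by slug",
    mutating: true,
    handler: (ctx, params) => {
      const slug = requireExistingSlug(ctx, params);
      if (ctx.dryRun) return { dry_run: true, would_delete: slug };
      ctx.pages.delete(slug);
      return { deleted: slug };
    },
  },
];

// Tool listing and dispatch both read the same array, so they cannot drift apart.
function toolDefinitions(): { name: string; description: string }[] {
  return operations.map(({ name, description }) => ({ name, description }));
}

function dispatch(name: string, ctx: OperationContext, params: Params): unknown {
  const op = operations.find((candidate) => candidate.name === name);
  if (!op) throw new OperationError("invalid_params", `unknown operation: ${name}`);
  return op.handler(ctx, params);
}
```

In this sketch a CLI could register subcommands by iterating the same `operations` array, which is one plausible reading of "shared operations auto-registered"; the actual registration code is not visible in this commit view.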
@@ -5,27 +5,27 @@ Compile a daily briefing from brain context.
 ## Workflow

 1. **Today's meetings.** For each meeting on the calendar:
-- Look up all participants via `gbrain query <name>`
-- Read their pages for compiled_truth context
+- Search gbrain for each participant by name
+- Read their pages from gbrain for compiled_truth context
 - Summarize: who they are, recent timeline, relationship to you
-2. **Active deals.** `gbrain list --type deal` filtered to active status:
+2. **Active deals.** List deal pages in gbrain filtered to active status:
 - Deadlines approaching in the next 7 days
 - Recent timeline entries (last 7 days)
 3. **Time-sensitive threads.** Open items from timeline entries:
 - Items with deadlines in the next 48 hours
 - Follow-ups that are overdue
 4. **Recent changes.** Pages updated in the last 24 hours:
-- What changed and why (read timeline entries)
-5. **People in play.** `gbrain list --type person` sorted by recency:
+- What changed and why (read timeline entries from gbrain)
+5. **People in play.** List person pages in gbrain sorted by recency:
 - Updated in last 7 days
 - Have high activity (many recent timeline entries)
-6. **Stale alerts.** From `gbrain health`:
+6. **Stale alerts.** From gbrain health check:
 - Pages flagged as stale that are relevant to today's meetings

 ## Output Format

 ```
-DAILY BRIEFING — [date]
+DAILY BRIEFING -- [date]
 ========================

 MEETINGS TODAY
@@ -33,26 +33,23 @@ MEETINGS TODAY
 Participants: [name] (slug: people/name, [key context])

 ACTIVE DEALS
-- [deal name] — [status], deadline: [date]
+- [deal name] -- [status], deadline: [date]
 Recent: [latest timeline entry]

 ACTION ITEMS
-- [item] — due [date], related to [slug]
+- [item] -- due [date], related to [slug]

 RECENT CHANGES (24h)
-- [slug] — [what changed]
+- [slug] -- [what changed]

 PEOPLE IN PLAY
-- [name] — [why they're active]
+- [name] -- [why they're active]
 ```

-## Commands Used
+## Tools Used

-```
-gbrain query <name>
-gbrain get <slug>
-gbrain list --type deal
-gbrain list --type person
-gbrain health
-gbrain timeline <slug>
-```
+- Search gbrain by name (query)
+- Read a page from gbrain (get_page)
+- List pages in gbrain by type (list_pages)
+- Check gbrain health (get_health)
+- View timeline entries in gbrain (get_timeline)
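The briefing workflow sorts items by time window: overdue follow-ups, deadlines in the next 48 hours, and deadlines in the next 7 days. A small sketch of that bucketing, assuming deadlines are available as ISO 8601 strings; `Urgency` and `classifyDeadline` are hypothetical names, and the skill itself leaves this judgment to the agent rather than prescribing code.

```typescript
// Hypothetical helper; the window sizes (48 hours, 7 days) come from the briefing workflow.
type Urgency = "overdue" | "next_48h" | "next_7d" | "later";

function classifyDeadline(deadlineIso: string, now: Date): Urgency {
  const msPerHour = 3600000;
  const hoursLeft = (Date.parse(deadlineIso) - now.getTime()) / msPerHour;
  if (hoursLeft < 0) return "overdue";
  if (hoursLeft <= 48) return "next_48h";
  if (hoursLeft <= 24 * 7) return "next_7d";
  return "later";
}
```

Passing `now` explicitly keeps the function deterministic, which makes a briefing reproducible for a given date.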
@@ -11,16 +11,17 @@ Enrich person and company pages from external APIs.
 | Exa | Web mentions, articles | REST API |

 Note: enrichment requires separate API credentials for each service. No client
-integrations ship in v1. This skill guides Claude Code to make API calls directly.
+integrations ship in v1. This skill guides the agent to make API calls directly.

 ## Workflow

-1. **Select target pages.** `gbrain list --type person` or `gbrain list --type company`
+1. **Select target pages.** List person or company pages in gbrain.
 2. **For each page:**
-- Read current compiled_truth to understand what we already know
+- Read the page from gbrain to understand what we already know
 - Call external APIs for fresh data
-- Store raw API responses: the raw JSON goes into `gbrain call put_raw_data`
+- Store raw API responses in gbrain (put_raw_data) to preserve provenance
 - Distill highlights into compiled_truth updates
+- Store the updated page in gbrain
 3. **Validation rules:**
 - Connection count < 20 on LinkedIn = likely wrong person, skip
 - Name mismatch between brain and API = skip, flag for manual review
@@ -28,18 +29,17 @@ integrations ship in v1. This skill guides Claude Code to make API calls directl

 ## Quality Rules

-- Raw data goes to raw_data table (preserves provenance)
+- Raw data goes to gbrain's raw_data store (preserves provenance)
 - Only distilled, useful info goes to compiled_truth
-- Always add a timeline entry: "Enriched from [source] on [date]"
+- Always add a timeline entry in gbrain: "Enriched from [source] on [date]"
 - Don't enrich the same page more than once per week unless requested
 - Rate limit: respect API rate limits, use exponential backoff

-## Commands Used
+## Tools Used

-```
-gbrain get <slug>
-gbrain put <slug>
-gbrain timeline-add <slug> <date> "Enriched from <source>"
-gbrain list --type person
-gbrain list --type company
-```
+- Read a page from gbrain (get_page)
+- Store/update a page in gbrain (put_page)
+- Add a timeline entry in gbrain (add_timeline_entry)
+- List pages in gbrain by type (list_pages)
+- Store raw API data in gbrain (put_raw_data)
+- Retrieve raw data from gbrain (get_raw_data)
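The enrich skill states two validation rules (fewer than 20 LinkedIn connections means skip; a name mismatch means skip and flag for review) and asks for exponential backoff on rate limits. A rough sketch of both follows, under stated assumptions: the 500 ms base delay, the 30 s cap, the case-insensitive name comparison, and every identifier here (`backoffDelayMs`, `validateProfile`, `ExternalProfile`) are illustrative choices, not something the skill specifies.

```typescript
// Hypothetical helpers; base delay, cap, and field names are assumptions for illustration.
function backoffDelayMs(attempt: number, baseMs: number = 500, capMs: number = 30000): number {
  // attempt 0 waits baseMs, each further attempt doubles the wait, capped at capMs
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

interface ExternalProfile {
  name: string;
  connectionCount: number;
}

interface Verdict {
  enrich: boolean;
  flagForReview: boolean;
  reason: string;
}

function validateProfile(brainName: string, profile: ExternalProfile): Verdict {
  if (profile.connectionCount < 20) {
    return { enrich: false, flagForReview: false, reason: "connection count < 20, likely wrong person" };
  }
  const normalize = (s: string): string => s.trim().toLowerCase();
  if (normalize(brainName) !== normalize(profile.name)) {
    return { enrich: false, flagForReview: true, reason: "name mismatch between brain and API" };
  }
  return { enrich: true, flagForReview: false, reason: "ok" };
}
```

Production retry loops usually add random jitter to the delay so that parallel clients do not retry in lockstep; it is omitted here to keep the function deterministic.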
@@ -6,11 +6,11 @@ Ingest meetings, articles, documents, and conversations into the brain.

 1. **Parse the source.** Extract people, companies, dates, and events from the input.
 2. **For each entity mentioned:**
-- `gbrain get <slug>` to check if page exists
+- Read the entity's page from gbrain to check if it exists
 - If exists: update compiled_truth (rewrite State section with new info, don't append)
-- If new: `gbrain put <slug>` to create the page
-3. **Append to timeline.** `gbrain timeline-add <slug> <date> <summary>` for each event.
-4. **Create cross-reference links.** `gbrain link <from> <to> --type <relationship>` for every entity pair mentioned together.
+- If new: store the page in gbrain with the appropriate type and slug
+3. **Append to timeline.** Add a timeline entry in gbrain for each event, with date, summary, and source.
+4. **Create cross-reference links.** Link entities in gbrain for every entity pair mentioned together, using the appropriate relationship type.
 5. **Timeline merge.** The same event appears on ALL mentioned entities' timelines. If Alice met Bob at Acme Corp, the event goes on Alice's page, Bob's page, and Acme Corp's page.

 ## Quality Rules
@@ -22,13 +22,11 @@ Ingest meetings, articles, documents, and conversations into the brain.
 - Link types: knows, works_at, invested_in, founded, met_at, discussed
 - Source attribution: every timeline entry includes the source (meeting, article, email, etc.)

-## Commands Used
+## Tools Used

-```
-gbrain get <slug>
-gbrain put <slug> < content.md
-gbrain timeline-add <slug> <date> <summary>
-gbrain link <from> <to> --type <type>
-gbrain tags <slug>
-gbrain tag <slug> <tag>
-```
+- Read a page from gbrain (get_page)
+- Store/update a page in gbrain (put_page)
+- Add a timeline entry in gbrain (add_timeline_entry)
+- Link entities in gbrain (add_link)
+- List tags for a page (get_tags)
+- Tag a page in gbrain (add_tag)
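The ingest workflow's timeline-merge and cross-reference steps amount to a fan-out: one event becomes a timeline entry on every mentioned entity, and every pair of co-mentioned entities gets a link. A sketch of that bookkeeping, assuming entities are already resolved to slugs; `fanOutEvent`, `entityPairs`, and the example slugs are hypothetical, and the resulting records would still have to be written through whatever add_timeline_entry and add_link calls the agent has available.

```typescript
// Hypothetical planning helpers; they only compute what to write, they do not call gbrain.
interface TimelineEntry {
  slug: string;
  date: string;
  summary: string;
  source: string; // the quality rules require a source on every entry
}

function fanOutEvent(slugs: string[], date: string, summary: string, source: string): TimelineEntry[] {
  const unique = Array.from(new Set(slugs)); // one entry per entity, even if mentioned twice
  return unique.map((slug) => ({ slug, date, summary, source }));
}

function entityPairs(slugs: string[]): [string, string][] {
  const unique = Array.from(new Set(slugs));
  const pairs: [string, string][] = [];
  unique.forEach((from, i) => {
    unique.slice(i + 1).forEach((to) => {
      pairs.push([from, to]);
    });
  });
  return pairs;
}
```

For the "Alice met Bob at Acme Corp" case this yields three timeline entries and three unordered pairs; choosing the relationship type for each pair (knows, works_at, met_at, and so on) is left to the caller.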
@@ -1,210 +1,9 @@
-# Install GBrain
+# Install GBrain (Deprecated)

-Set up GBrain from scratch. The agent drives the process, the human provides secrets and approvals.
+This skill has been replaced by the **setup** skill. See `skills/setup/SKILL.md`.

-## Prerequisites
-
-- A Supabase account (Pro tier recommended: $25/mo for 8GB DB + 100GB storage)
-- An OpenAI API key (for semantic search embeddings, ~$4-5 for 7,500 pages)
-- A git-backed markdown knowledge base (or start fresh)
-
-## Phase 1: Environment Discovery
-
-Scan the environment to understand what we're working with.
-
-```bash
-# Find all git repos with markdown content
-echo "=== GBrain Environment Discovery ==="
-for dir in /data/* ~/git/* ~/Documents/* 2>/dev/null; do
-if [ -d "$dir/.git" ]; then
-md_count=$(find "$dir" -name "*.md" -not -path "*/node_modules/*" -not -path "*/.git/*" 2>/dev/null | wc -l | tr -d ' ')
-if [ "$md_count" -gt 10 ]; then
-total_size=$(du -sh "$dir" 2>/dev/null | cut -f1)
-binary_count=$(find "$dir" -not -name "*.md" -not -path "*/node_modules/*" -not -path "*/.git/*" -type f \( -name "*.jpg" -o -name "*.png" -o -name "*.pdf" -o -name "*.mp4" -o -name "*.m4a" -o -name "*.heic" -o -name "*.tiff" -o -name "*.dng" \) 2>/dev/null | wc -l | tr -d ' ')
-echo ""
-echo " $dir ($total_size, $md_count .md files, $binary_count binary files)"
-# Detect knowledge base type
-if [ -d "$dir/.obsidian" ]; then
-echo " Type: Obsidian vault (detected, wikilink conversion needed in future release)"
-elif [ -d "$dir/logseq" ]; then
-echo " Type: Logseq (detected, block-ref conversion needed in future release)"
-else
-echo " Type: Plain markdown (ready for import)"
-fi
-fi
-fi
-done
-echo ""
-echo "=== Discovery Complete ==="
-```
-
-Present findings to the human. Recommend which repos to import.
-
-## Phase 2: Supabase Setup
-
-### Magic Path (zero copy-pastes)
-
-Check if the Supabase CLI is available:
-
-```bash
-which supabase 2>/dev/null || npx supabase --version 2>/dev/null
-```
-
-If available, use the magic path:
-
-1. Tell the human: "I'll set up Supabase for you. Click 'Authorize' when your browser opens."
-2. Run `supabase login` (opens browser for OAuth)
-3. Run `supabase projects create --name gbrain --region us-east-1`
-4. Extract credentials from `supabase projects api-keys`
-5. Proceed to Phase 3 automatically
-
-### Fallback Path (2 copy-pastes)
-
-If the Supabase CLI is not available, tell the human exactly what to do:
-
-1. "Log into Supabase and add a credit card: https://supabase.com/dashboard/account/billing"
-2. "Create a new project: https://supabase.com/dashboard/new/_"
-- Name: gbrain
-- Region: closest to you
-- Generate a strong password
-3. "Go to Project Settings > Database and copy the connection string (URI format)"
-- Paste it here
-4. "Go to Project Settings > API and copy the service_role key"
-- Paste it here
-
-That's it. Two copy-pastes. The agent does everything else.
-
-## Phase 3: Initialize GBrain
-
-```bash
-gbrain init \
---url "<database_url>" \
---repo "<repo_path>"
-```
-
-This runs:
-1. Connection test (SELECT 1)
-2. pgvector extension check (CREATE EXTENSION IF NOT EXISTS vector)
-3. Schema migration (idempotent, safe to re-run)
-4. Text import (all .md files, no embeddings yet)
-5. Sync checkpoint (writes git HEAD for seamless gbrain sync)
-
-### First Search Result
-
-After import completes, run a sample query to prove it works:
-
-```bash
-# Query the most recently modified page's topic
-gbrain query "$(ls -t <repo_path>/*.md <repo_path>/**/*.md 2>/dev/null | head -1 | xargs head -5 | grep -i 'title:' | cut -d: -f2 | tr -d ' ')"
-```
-
-Show results to the human immediately. This is the magic moment.
-
-### Start Embeddings
-
-```bash
-gbrain embed --stale &
-```
-
-Embeddings run in background. Keyword search works NOW. Semantic search improves as embeddings complete. Check progress with `gbrain embed --status`.
-
-## Phase 4: Set Up Ongoing Sync
-
-```bash
-# Add to cron (every 5 minutes)
-(crontab -l 2>/dev/null; echo "*/5 * * * * gbrain sync --no-pull 2>&1 | tail -1 >> /tmp/gbrain-sync.log") | crontab -
-```
-
-Or for agents that push to the brain repo, trigger sync after writes:
-```bash
-gbrain sync --no-pull
-```
-
-## Phase 5: Optional File Migration
-
-If the repo has >100MB of binary files:
-
-1. **Tell the human what will happen:**
-"Your repo has X binary files (Y MB). I can move them to Supabase Storage to slim down git. Files stay in git history permanently. Want me to proceed?"
-
-2. **If approved:**
-```bash
-gbrain health # verify everything is connected
-gbrain files sync <repo>/attachments/ # upload all files
-gbrain files verify # mandatory 100% verification
-# STOP: ask human for approval before git rm
-```
-
-3. **After human approves git rm:**
-```bash
-cd <repo>
-echo "attachments/" >> .gitignore
-git rm -r --cached attachments/
-git commit -m "Move attachments to Supabase Storage"
-git push
-```
-
-## Phase 6: Teach the Agent
-
-Add GBrain rules to AGENTS.md (or equivalent):
-
-```markdown
-## GBrain (Knowledge Search)
-
-GBrain indexes your knowledge base for fast search. Always search before answering
-questions about people, companies, deals, or anything in the brain.
-
-### Commands
-- `gbrain query "search terms"` -- Search the knowledge base (keyword + semantic)
-- `gbrain sync` -- Sync latest changes from git to GBrain
-- `gbrain files upload <path> --page <slug>` -- Upload a file to storage
-- `gbrain health` -- Check GBrain status
-- `gbrain stats` -- Show page count, embedding coverage, last sync
-
-### Rules
-1. **Search the brain first.** Before answering any question about people, companies,
-deals, meetings, or strategy, run `gbrain query`. Your memory of file contents
-goes stale; the database doesn't.
-2. **Never commit binaries to git.** Use `gbrain files upload` instead.
-3. **After writing to the brain repo,** trigger `gbrain sync --no-pull` to update
-the search index immediately.
-```
-
-## Error Handling
-
-Every error tells you what happened, why, and how to fix it:
-
-| What You See | Why | Fix |
-|---|---|---|
-| Connection refused | Supabase project paused or wrong URL | supabase.com/dashboard > Restore |
-| Password authentication failed | Wrong password | Project Settings > Database > Reset password |
-| pgvector not available | Extension not enabled | Run CREATE EXTENSION vector in SQL Editor |
-| OpenAI key invalid | Expired or wrong key | platform.openai.com/api-keys > Create new |
-| Sync anchor missing | Force push removed the commit | `gbrain sync --full` |
-| No pages found | Query before import | `gbrain import <dir>` first |
-
-## Upgrading
-
-Upgrade depends on how you installed:
-- **bun (standalone or library):** `bun update gbrain`
-- **ClawHub:** `clawhub update gbrain`
-- **Compiled binary:** Download the latest from [GitHub Releases](https://github.com/garrytan/gbrain/releases)
-
-After upgrading:
-- Run `gbrain init` again to apply schema migrations (idempotent, safe to re-run)
-- The new `files` table gets created automatically on next init
-- Sync state is preserved across upgrades
-
-## Health Check
-
-Run `gbrain health` at any time to verify all connections:
-
-```
-ok Database: connected
-ok pgvector: extension loaded
-ok Schema: up to date
-ok Sync: last run N min ago
-ok Embeddings: X/Y pages embedded
-```
-
-Every unhealthy line includes WHY and FIX.
+The setup skill provides:
+- Auto-provision Supabase via CLI (< 2 min TTHW)
+- Manual fallback with non-interactive init
+- AGENTS.md auto-injection (upgrade-safe)
+- First import and health verification
@@ -4,34 +4,34 @@ Periodic brain health checks and cleanup.

 ## Workflow

-1. **Run health check.** `gbrain health` to get the dashboard.
+1. **Run health check.** Check gbrain health to get the dashboard.
 2. **Check each dimension:**

 ### Stale pages
 Pages where compiled_truth is older than the latest timeline entry. The assessment hasn't been updated to reflect recent evidence.
-- `gbrain query "stale pages"` or check health output
-- For each stale page: read timeline, determine if compiled_truth needs rewriting
+- Check the health output for stale page count
+- For each stale page: read the page from gbrain, review timeline, determine if compiled_truth needs rewriting

 ### Orphan pages
 Pages with zero inbound links. Nobody references them.
 - Review orphans: are they genuinely isolated or just missing links?
-- Add links from related pages or flag for deletion
+- Add links in gbrain from related pages or flag for deletion

 ### Dead links
 Links pointing to pages that don't exist.
-- Remove dead links with `gbrain unlink`
+- Remove dead links in gbrain

 ### Missing cross-references
 Pages that mention entity names but don't have formal links.
-- Read compiled_truth, extract entity mentions, create links
+- Read compiled_truth from gbrain, extract entity mentions, create links in gbrain

 ### Tag consistency
 Inconsistent tagging (e.g., "vc" vs "venture-capital", "ai" vs "artificial-intelligence").
-- Standardize to the most common variant
+- Standardize to the most common variant using gbrain tag operations

 ### Embedding freshness
 Chunks without embeddings, or chunks embedded with an old model.
-- `gbrain embed --stale` to backfill
+- Refresh stale embeddings in gbrain

 ### Open threads
 Timeline items older than 30 days with unresolved action items.
@@ -41,19 +41,16 @@ Timeline items older than 30 days with unresolved action items.

 - Never delete pages without confirmation
 - Log all changes via timeline entries
-- Run `gbrain health` before and after to show improvement
+- Check gbrain health before and after to show improvement

-## Commands Used
+## Tools Used

-```
-gbrain health
-gbrain list [--type T]
-gbrain get <slug>
-gbrain backlinks <slug>
-gbrain link <from> <to> --type <type>
-gbrain unlink <from> <to>
-gbrain tag <slug> <tag>
-gbrain untag <slug> <tag>
-gbrain embed --stale
-gbrain timeline <slug>
-```
+- Check gbrain health (get_health)
+- List pages in gbrain with filters (list_pages)
+- Read a page from gbrain (get_page)
+- Check backlinks in gbrain (get_backlinks)
+- Link entities in gbrain (add_link)
+- Remove links in gbrain (remove_link)
+- Tag a page in gbrain (add_tag)
+- Remove a tag in gbrain (remove_tag)
+- View timeline in gbrain (get_timeline)
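Three of the maintenance dimensions have precise definitions: a stale page has compiled_truth older than its latest timeline entry, an orphan has zero inbound links, and a dead link points to a page that does not exist. The sketch below restates those definitions as predicates. It assumes page and link data have already been fetched into plain objects; `PageSnapshot`, `Link`, `isStale`, `findOrphans`, and `findDeadLinks` are hypothetical names, and gbrain's own health check may compute these differently.

```typescript
// Hypothetical data shapes; dates are ISO 8601 strings.
interface PageSnapshot {
  slug: string;
  compiledTruthUpdatedAt: string;
  timelineDates: string[];
}

interface Link {
  from: string;
  to: string;
}

// Stale: compiled_truth is older than the latest timeline entry.
function isStale(page: PageSnapshot): boolean {
  if (page.timelineDates.length === 0) return false;
  const latest = Math.max(...page.timelineDates.map((d) => Date.parse(d)));
  return Date.parse(page.compiledTruthUpdatedAt) < latest;
}

// Orphan: no link anywhere points at the page.
function findOrphans(slugs: string[], links: Link[]): string[] {
  const linkedTo = new Set(links.map((link) => link.to));
  return slugs.filter((slug) => !linkedTo.has(slug));
}

// Dead link: the target slug is not a known page.
function findDeadLinks(slugs: string[], links: Link[]): Link[] {
  const known = new Set(slugs);
  return links.filter((link) => !known.has(link.to));
}
```

Running the same predicates before and after a cleanup pass is one way to produce the "before and after" comparison that the maintenance rules ask for.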
@@ -1,6 +1,6 @@
 {
 "name": "gbrain",
-"version": "0.2.0",
+"version": "0.3.0",
 "description": "Personal knowledge brain with hybrid RAG search",
 "skills": [
 {
@@ -34,9 +34,9 @@
 "description": "Universal migration from Obsidian, Notion, Logseq, markdown, CSV, JSON, Roam"
 },
 {
-"name": "install",
-"path": "install/SKILL.md",
-"description": "Set up GBrain from scratch: Supabase, import, sync, file migration"
+"name": "setup",
+"path": "setup/SKILL.md",
+"description": "Set up GBrain: auto-provision Supabase, AGENTS.md injection, first import"
 }
 ],
 "dependencies": {
@@ -44,7 +44,7 @@
 "package": "gbrain"
 },
 "setup": {
-"command": "gbrain init --supabase",
-"description": "Initialize brain with Supabase (guided wizard)"
+"skill": "setup",
+"description": "Auto-provision Supabase and configure GBrain (< 2 min)"
 }
 }
@@ -9,7 +9,7 @@ Universal migration from any wiki, note tool, or brain system into GBrain.
 | Obsidian | Markdown + `[[wikilinks]]` | Direct import, convert wikilinks to gbrain links |
 | Notion | Exported markdown or CSV | Parse Notion's export structure |
 | Logseq | Markdown with `((block refs))` | Convert block refs to page links |
-| Plain markdown | Any .md directory | `gbrain import <dir>` directly |
+| Plain markdown | Any .md directory | Import directory into gbrain directly |
 | CSV | Tabular data | Map columns to frontmatter fields |
 | JSON | Structured data | Map keys to page fields |
 | Roam | JSON export | Convert block structure to pages |
@@ -18,31 +18,23 @@ Universal migration from any wiki, note tool, or brain system into GBrain.

 1. **Assess the source.** What format? How many files? What structure?
 2. **Plan the mapping.** How do source fields map to gbrain fields (type, title, tags, compiled_truth, timeline)?
-3. **Test with a sample.** Import 5-10 files, verify with `gbrain get` and `gbrain export`.
-4. **Bulk import.** Run the full migration.
-5. **Verify.** `gbrain health` + `gbrain stats` + spot-check pages.
-6. **Build links.** Extract cross-references from content and create typed links.
+3. **Test with a sample.** Import 5-10 files, verify by reading them back from gbrain and exporting.
+4. **Bulk import.** Import the full directory into gbrain.
+5. **Verify.** Check gbrain health and statistics, spot-check pages.
+6. **Build links.** Extract cross-references from content and create typed links in gbrain.

 ## Obsidian Migration

-```bash
-# 1. Direct import (obsidian vaults are markdown directories)
-gbrain import /path/to/vault/
-
-# 2. Convert [[wikilinks]] to gbrain links
-# The skill reads each page's compiled_truth, finds [[Name]] patterns,
-# resolves them to slugs, and creates links:
-gbrain get <slug> # read content
-# For each [[Name]] found:
-gbrain link <current-slug> <resolved-slug> --type references
-```
+1. Import the vault directory into gbrain (Obsidian vaults are markdown directories)
+2. Convert `[[wikilinks]]` to gbrain links:
+- Read each page from gbrain
+- For each `[[Name]]` found, resolve to a slug and create a link in gbrain
+- `[[Name|alias]]` uses the alias for context

 Obsidian-specific:
-- `[[Name]]` becomes `gbrain link`
-- `[[Name|alias]]` uses the alias for context
-- Tags (`#tag`) become `gbrain tag`
+- Tags (`#tag`) become gbrain tags
 - Frontmatter properties map to gbrain frontmatter
-- Attachments (images, PDFs) are noted but not imported (future work)
+- Attachments (images, PDFs) are noted but handled separately via file storage

 ## Notion Migration

@@ -50,38 +42,31 @@ Obsidian-specific:
 2. Notion exports nested directories with UUIDs in filenames
 3. Strip UUIDs from filenames for clean slugs
 4. Map Notion's database properties to frontmatter
-5. `gbrain import` the cleaned directory
+5. Import the cleaned directory into gbrain

 ## CSV Migration

 For tabular data (e.g., CRM exports, contact lists):
-
-```bash
-# For each row in the CSV:
-# 1. Create a page with column values as frontmatter
-# 2. Use a designated column as the slug (e.g., name)
-# 3. Use another column as compiled_truth (e.g., notes)
-gbrain put <slug> < generated.md
-```
+1. For each row in the CSV, create a page with column values as frontmatter
+2. Use a designated column as the slug (e.g., name)
+3. Use another column as compiled_truth (e.g., notes)
+4. Store each page in gbrain

 ## Verification

 After any migration:
-1. `gbrain stats` — check page count matches source
-2. `gbrain health` — check for orphans, missing embeddings
-3. `gbrain export --dir /tmp/verify/` — round-trip test
-4. Spot-check 5-10 pages with `gbrain get`
-5. Test search: `gbrain query "someone you know is in the data"`
+1. Check gbrain statistics to verify page count matches source
+2. Check gbrain health for orphans and missing embeddings
+3. Export pages from gbrain for round-trip verification
+4. Spot-check 5-10 pages by reading them from gbrain
+5. Test search: search gbrain for "someone you know is in the data"

-## Commands Used
+## Tools Used

-```
-gbrain import <dir> [--no-embed]
-gbrain get <slug>
-gbrain put <slug>
-gbrain link <from> <to> --type <type>
-gbrain tag <slug> <tag>
-gbrain stats
-gbrain health
-gbrain export [--dir ./verify/]
-```
+- Store/update pages in gbrain (put_page)
+- Read pages from gbrain (get_page)
+- Link entities in gbrain (add_link)
+- Tag pages in gbrain (add_tag)
+- Get gbrain statistics (get_stats)
+- Check gbrain health (get_health)
+- Search gbrain (query)
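The Obsidian steps (find each `[[Name]]`, resolve it to a slug, keep the alias from `[[Name|alias]]` for context) and the Notion step (strip UUIDs from filenames) are both string transformations. The sketch below shows one way to do them. Several points are assumptions: the slug convention in `toSlug`, the guess that Notion's suffix is a space followed by 32 hex characters, and all function names. Heading links such as `[[Name#Section]]` are deliberately not handled.

```typescript
// Hypothetical helpers for the migration steps; none of these names come from gbrain.
interface WikiLink {
  target: string;
  alias: string | null;
}

// Finds [[Name]] and [[Name|alias]] occurrences in a markdown string.
function extractWikiLinks(markdown: string): WikiLink[] {
  const pattern = /\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/g;
  const found: WikiLink[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const target = (match[1] ?? "").trim();
    const alias = match[2] ? match[2].trim() : null;
    found.push({ target, alias });
  }
  return found;
}

// Assumed slug convention: lowercase, non-alphanumerics collapsed to single hyphens.
function toSlug(name: string): string {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Assumes the export suffix is whitespace plus 32 hex characters before the extension.
function stripNotionId(filename: string): string {
  return filename.replace(/\s+[0-9a-f]{32}(?=\.[^.]+$|$)/i, "");
}
```

A real migration would also need to decide which type prefix (for example `people/` or `companies/`) each resolved name belongs under, which this sketch leaves out.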
@@ -9,10 +9,10 @@ Answer questions using the brain's knowledge with 3-layer search and synthesis.
   - Semantic query for conceptual questions
   - Structured queries (list by type, backlinks) for relational questions
2. **Execute searches:**
   - `gbrain search <keywords>` for FTS matches
   - `gbrain query <question>` for hybrid semantic+keyword with expansion
   - `gbrain list --type <type>` or `gbrain backlinks <slug>` for structural queries
3. **Read top results.** `gbrain get <slug>` for the top 3-5 pages to get full context.
   - Keyword search gbrain for FTS matches (search)
   - Hybrid search gbrain for semantic+keyword with expansion (query)
   - List pages in gbrain by type or check backlinks for structural queries
3. **Read top results.** Read the top 3-5 pages from gbrain to get full context.
4. **Synthesize answer** with citations. Every claim traces back to a specific page slug.
5. **Flag gaps.** If the brain doesn't have info, say "the brain doesn't have information on X" rather than hallucinating.

@@ -25,14 +25,12 @@ Answer questions using the brain's knowledge with 3-layer search and synthesis.
- For "what happened" questions, use timeline entries
- For "what do we know" questions, read compiled_truth directly

## Commands Used
## Tools Used

```
gbrain search <query>
gbrain query <question>
gbrain get <slug>
gbrain list [--type T] [--tag T]
gbrain backlinks <slug>
gbrain graph <slug> [--depth N]
gbrain timeline <slug>
```
- Keyword search gbrain (search)
- Hybrid search gbrain (query)
- Read a page from gbrain (get_page)
- List pages in gbrain with filters (list_pages)
- Check backlinks in gbrain (get_backlinks)
- Traverse the link graph in gbrain (traverse_graph)
- View timeline entries in gbrain (get_timeline)

111
skills/setup/SKILL.md
Normal file
@@ -0,0 +1,111 @@
# Setup GBrain

Set up GBrain from scratch. Target: working brain in under 2 minutes.

## Prerequisites

- A Supabase account (Pro tier recommended: $25/mo for 8GB DB + 100GB storage)
- An OpenAI API key (for semantic search embeddings, ~$4-5 for 7,500 pages)
- A git-backed markdown knowledge base (or start fresh)

## Phase A: Auto-Provision (Supabase CLI)

Check if the Supabase CLI is available. If it is, use the fast path:

1. Tell the user: "I'll set up Supabase for you. Click 'Authorize' when your browser opens."
2. Run `supabase login` (opens browser for OAuth)
3. Run `supabase projects create --name gbrain --region us-east-1`
4. Extract the database connection URL from `supabase projects api-keys`
5. Initialize gbrain with the connection URL in non-interactive mode
6. Proceed to Phase C automatically

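The availability check that chooses between the fast path and the manual fallback might look like the following. A minimal sketch: `pick_setup_phase` is a hypothetical helper name, and it only tests whether an executable is on `PATH`, not whether the user is logged in.

```bash
#!/usr/bin/env bash
# Sketch: choose the setup path based on whether a CLI is installed.

# Hypothetical helper: pick_setup_phase [cli_name]
# Prints "A" when the CLI is on PATH, "B" otherwise.
pick_setup_phase() {
  local cli="${1:-supabase}"
  if command -v "$cli" >/dev/null 2>&1; then
    echo "A"   # CLI found: auto-provision path
  else
    echo "B"   # CLI missing: manual fallback
  fi
}
```

Calling `pick_setup_phase` with no argument checks for `supabase`. The optional argument exists only so the check can be exercised against other command names.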
## Phase B: Manual Fallback

If the Supabase CLI is not available, guide the user:

1. "Log into Supabase and add a credit card: https://supabase.com/dashboard/account/billing"
2. "Create a new project: https://supabase.com/dashboard/new/_"
   - Name: gbrain
   - Region: closest to you
   - Generate a strong password
3. "Go to Project Settings > Database and copy the connection string (URI format)"
   - Paste it here
4. Initialize gbrain with the provided URL in non-interactive mode

That's it. One copy-paste. The agent does everything else.

## Phase C: First Import

1. **Discover markdown repos.** Scan the environment for git repos with markdown content.

```bash
echo "=== GBrain Environment Discovery ==="
# Unmatched globs stay literal and simply fail the -d test below.
for dir in /data/* ~/git/* ~/Documents/*; do
  if [ -d "$dir/.git" ]; then
    # Count markdown files, skipping dependencies and git internals.
    md_count=$(find "$dir" -name "*.md" -not -path "*/node_modules/*" -not -path "*/.git/*" 2>/dev/null | wc -l | tr -d ' ')
    if [ "$md_count" -gt 10 ]; then
      total_size=$(du -sh "$dir" 2>/dev/null | cut -f1)
      echo " $dir ($total_size, $md_count .md files)"
    fi
  fi
done
echo "=== Discovery Complete ==="
```

2. **Import the best candidate.** Import the recommended directory into gbrain.
3. **Prove search works.** Search gbrain for a topic from the imported data. Show results immediately.
4. **Start embeddings.** Refresh stale embeddings in gbrain (runs in background). Keyword search works NOW; semantic search improves as embeddings complete.

## Phase D: AGENTS.md Injection

Auto-inject gbrain instructions into the project's AGENTS.md (or equivalent). Use a delimited managed block that's upgrade-safe:

```markdown
<!-- gbrain:start -->
## GBrain (Knowledge Search)

GBrain indexes your knowledge base for fast search. Always search before answering
questions about people, companies, deals, or anything in the brain.

### How to use
- Search gbrain for any topic before answering questions
- After writing new content, sync the repository to gbrain
- Upload binary files to gbrain storage instead of committing to git
- Check gbrain health periodically

### Rules
1. **Search the brain first.** Before answering any question about people, companies,
   deals, meetings, or strategy, search gbrain. Your memory of file contents goes
   stale; the database doesn't.
2. **Never commit binaries to git.** Upload to gbrain file storage instead.
3. **After writing to the brain repo,** sync to gbrain immediately.
<!-- gbrain:end -->
```

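One way the upgrade-safe injection could work is sketched below: if the markers already exist, everything between them is replaced; otherwise the block is appended. This is an illustrative sketch, not the project's implementation. `inject_managed_block` is a hypothetical helper, and it assumes the block file contains both marker lines and that the end marker follows the start marker in the target.

```bash
#!/usr/bin/env bash
# Sketch: insert or refresh a delimited managed block in a file.
# The block file is expected to contain both marker lines itself.

# Hypothetical helper: inject_managed_block <target_file> <block_file>
inject_managed_block() {
  local target="$1" block="$2"
  local start='<!-- gbrain:start -->' end='<!-- gbrain:end -->'
  touch "$target"
  if grep -qxF -- "$start" "$target"; then
    # Markers present: swap everything between them (inclusive) for the new block.
    awk -v start="$start" -v end="$end" -v block="$block" '
      $0 == start { while ((getline line < block) > 0) print line; close(block); skipping = 1; next }
      skipping && $0 == end { skipping = 0; next }
      !skipping { print }
    ' "$target" > "$target.tmp" && mv "$target.tmp" "$target"
  else
    # No markers yet: append the block, separated by a blank line if needed.
    if [ -s "$target" ]; then echo >> "$target"; fi
    cat "$block" >> "$target"
  fi
}
```

Because content outside the markers is never touched, running the function again with an updated block leaves user-written sections of AGENTS.md intact, and running it twice with the same block produces the same file.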
## Phase E: Health Check

After setup is complete, check gbrain health. Every dimension should be healthy.
Report the final state to the user:
- Page count and statistics
- Embedding coverage
- Search verification (run a sample query)

## Error Handling

Every error tells you what happened, why, and how to fix it:

| What You See | Why | Fix |
|---|---|---|
| Connection refused | Supabase project paused or wrong URL | supabase.com/dashboard > Restore |
| Password authentication failed | Wrong password | Project Settings > Database > Reset password |
| pgvector not available | Extension not enabled | Run CREATE EXTENSION vector in SQL Editor |
| OpenAI key invalid | Expired or wrong key | platform.openai.com/api-keys > Create new |
| No pages found | Query before import | Import files into gbrain first |

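The table above maps naturally onto a lookup from error text to a suggested fix. A minimal sketch, assuming the error text contains the phrases in the "What You See" column (real error messages may be worded differently); `suggest_fix` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: map an error message to the fix listed in the table.
# Assumes messages contain the "What You See" phrases; matching is case-insensitive.

# Hypothetical helper: suggest_fix "<error message>"
# Prints the fix and returns 0, or returns 1 when nothing matches.
suggest_fix() {
  local msg
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"connection refused"*)             echo "supabase.com/dashboard > Restore" ;;
    *"password authentication failed"*) echo "Project Settings > Database > Reset password" ;;
    *"pgvector not available"*)         echo "Run CREATE EXTENSION vector in SQL Editor" ;;
    *"openai key invalid"*)             echo "platform.openai.com/api-keys > Create new" ;;
    *"no pages found"*)                 echo "Import files into gbrain first" ;;
    *)                                  echo "No known fix for this error"; return 1 ;;
  esac
}
```

For example, `suggest_fix "FATAL: password authentication failed for user postgres"` would print the reset-password hint, while an unrecognized message returns a non-zero status so the caller can fall back to showing the raw error.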
## Tools Used

- Initialize gbrain (via CLI: gbrain init --non-interactive --url ...)
- Import files into gbrain (via CLI: gbrain import)
- Search gbrain (query)
- Check gbrain health (get_health)
- Get gbrain statistics (get_stats)