feat: GBrain v0.3.0 — contract-first architecture + ClawHub plugin (#7)

* feat: contract-first operations.ts with OperationError, dry_run, importFromContent

30 shared operations as single source of truth for CLI and MCP.
- OperationError with typed error codes (page_not_found, invalid_params, etc.)
- dry_run support on all mutating operations
- importFromContent split from importFile with transaction wrapping
- Idempotency hash now includes ALL fields (title, type, frontmatter, tags)
- Config env var fallback: GBRAIN_DATABASE_URL > DATABASE_URL > config file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: rewrite MCP server + CLI + tools-json from operations

server.ts: 233 -> ~80 lines. Tool definitions and dispatch generated from operations[].
cli.ts: shared operations auto-registered, CLI-only commands kept as manual dispatch.
tools-json: generated FROM operations[], eliminating the third contract surface.
Parity test verifies structural contract between operations, CLI, and MCP.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: delete 12 command files migrated to operations.ts

Handler logic for get, put, delete, list, search, query, health, stats,
tags, link, timeline, and version now lives in operations.ts.
Kept: init, upgrade, import, export, files, embed, sync, serve, call, config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: init --non-interactive, upgrade verification, schema migration

- gbrain init --non-interactive --url <url> for plugin mode (no TTY required)
- Post-upgrade version verification in gbrain upgrade
- Drop storage_url from files table (storage_path is the only identifier)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: tool-agnostic skills + new setup skill

All 7 skills rewritten with intent-based language instead of CLI commands.
Works with both CLI and MCP plugin contexts.
New setup skill replaces install: auto-provision Supabase via CLI,
AGENTS.md injection, target TTHW < 2 min.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: ClawHub bundle plugin, CI workflows, v0.3.0

- openclaw.plugin.json with configSchema, MCP server config, skill listing
- GitHub Actions: test on push/PR, multi-platform release (macOS arm64 + Linux x64)
- Version bump 0.3.0, CHANGELOG, README ClawHub section, CLAUDE.md updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: idempotency hash mismatch + MCP dry_run passthrough

importFromContent now passes its all-fields hash through putPage via
content_hash on PageInput, so the stored hash matches the computed hash.
Previously the skip-if-unchanged check never fired because the hash
formulas differed.

MCP server now passes dry_run from tool params to OperationContext.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.3.0.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: schema loader handles PL/pgSQL $$ blocks

Delete the semicolon-based SQL splitter in db.ts which broke on
PL/pgSQL trigger functions containing semicolons inside $$ delimiter
blocks. Use single conn.unsafe(schemaSql) call instead — the postgres
driver handles multi-statement SQL natively. schema.sql already uses
IF NOT EXISTS / CREATE OR REPLACE for idempotency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: E2E test infrastructure + realistic brain fixtures

Add test infrastructure for running E2E tests against real
Postgres+pgvector. Includes:
- test/e2e/helpers.ts: DB lifecycle, fixture import, timing, diagnostics
- 13 fixture files as a miniature realistic brain (people, companies,
  deals, meetings, concepts, projects, sources) following the
  compiled truth + timeline format from GBRAIN_RECOMMENDED_SCHEMA.md
- docker-compose.test.yml: local pgvector convenience (port 5433)
- .env.testing.example: template for test credentials
- package.json: add test:e2e script

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: E2E test suites + CI workflow

Tier 1 (mechanical.test.ts): 14 test suites covering all operations
against real Postgres — page CRUD, search with quality scoring, links,
tags, timeline, versions, admin, chunks, resolution, ingest log, raw
data, files, idempotency stress, setup journey (full CLI flow), init
edge cases, schema idempotency, schema diff guard, performance baselines.

Tier 1 (mcp.test.ts): MCP protocol test — spawns server, sends JSON-RPC,
verifies tools/list matches operations count.

Tier 2 (skills.test.ts): OpenClaw skill tests — ingest, query, health.
Skips gracefully when dependencies missing.

CI (.github/workflows/e2e.yml): Tier 1 on every PR (pgvector service),
Tier 2 nightly/manual with API key secrets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: E2E test fixes + traverseGraph jsonb cast

- Fix traverseGraph query: cast json_agg to jsonb_agg so SELECT DISTINCT works
- Fix put_page tests to use importFromContent with noEmbed (no OpenAI key in Tier 1)
- Fix get_health assertion (page_count not total_pages)
- Fix raw_data test to handle JSONB string/object return
- Simplify MCP test to verify tool generation directly
- Add timeouts to CLI subprocess tests
- Use port 5434 for docker-compose (5433 often in use)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: update all project docs for E2E test suite

- CLAUDE.md: updated test count (9 unit + 3 E2E), added E2E test
  instructions, fixed skill count to 8
- CONTRIBUTING.md: updated project structure with test/e2e/, added E2E
  test instructions, rewrote "Adding a new command" to reflect
  contract-first architecture (add to operations.ts, done)
- README.md: fixed table count (10 not 9), added recommended schema doc
  to Docs section, added E2E instructions to Contributing section
- CHANGELOG.md: added E2E test suite, docker-compose, schema loader fix,
  and traverseGraph jsonb fix to v0.3.0 entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-04-08 23:26:11 -10:00
committed by GitHub
parent ee9e6689ad
commit a86f995883
67 changed files with 3827 additions and 1394 deletions

111
skills/setup/SKILL.md Normal file
View File

@@ -0,0 +1,111 @@
# Setup GBrain
Set up GBrain from scratch. Target: working brain in under 2 minutes.
## Prerequisites
- A Supabase account (Pro tier recommended: $25/mo for 8GB DB + 100GB storage)
- An OpenAI API key (for semantic search embeddings, ~$4-5 for 7,500 pages)
- A git-backed markdown knowledge base (or start fresh)
## Phase A: Auto-Provision (Supabase CLI)
Check if the Supabase CLI is available. If it is, use the fast path:
1. Tell the user: "I'll set up Supabase for you. Click 'Authorize' when your browser opens."
2. Run `supabase login` (opens browser for OAuth)
3. Run `supabase projects create --name gbrain --region us-east-1`
4. Extract the database connection URL from `supabase projects api-keys`
5. Initialize gbrain with the connection URL in non-interactive mode
6. Proceed to Phase C automatically
## Phase B: Manual Fallback
If the Supabase CLI is not available, guide the user:
1. "Log into Supabase and add a credit card: https://supabase.com/dashboard/account/billing"
2. "Create a new project: https://supabase.com/dashboard/new/_"
- Name: gbrain
- Region: closest to you
- Generate a strong password
3. "Go to Project Settings > Database and copy the connection string (URI format)"
- Paste it here
4. Initialize gbrain with the provided URL in non-interactive mode
That's it. One copy-paste. The agent does everything else.
## Phase C: First Import
1. **Discover markdown repos.** Scan the environment for git repos with markdown content.
```bash
echo "=== GBrain Environment Discovery ==="
for dir in /data/* ~/git/* ~/Documents/* 2>/dev/null; do
if [ -d "$dir/.git" ]; then
md_count=$(find "$dir" -name "*.md" -not -path "*/node_modules/*" -not -path "*/.git/*" 2>/dev/null | wc -l | tr -d ' ')
if [ "$md_count" -gt 10 ]; then
total_size=$(du -sh "$dir" 2>/dev/null | cut -f1)
echo " $dir ($total_size, $md_count .md files)"
fi
fi
done
echo "=== Discovery Complete ==="
```
2. **Import the best candidate.** Import the recommended directory into gbrain.
3. **Prove search works.** Search gbrain for a topic from the imported data. Show results immediately.
4. **Start embeddings.** Refresh stale embeddings in gbrain (runs in background). Keyword search works NOW, semantic search improves as embeddings complete.
## Phase D: AGENTS.md Injection
Auto-inject gbrain instructions into the project's AGENTS.md (or equivalent). Use a delimited managed block that's upgrade-safe:
```markdown
<!-- gbrain:start -->
## GBrain (Knowledge Search)
GBrain indexes your knowledge base for fast search. Always search before answering
questions about people, companies, deals, or anything in the brain.
### How to use
- Search gbrain for any topic before answering questions
- After writing new content, sync the repository to gbrain
- Upload binary files to gbrain storage instead of committing to git
- Check gbrain health periodically
### Rules
1. **Search the brain first.** Before answering any question about people, companies,
deals, meetings, or strategy, search gbrain. Your memory of file contents goes
stale; the database doesn't.
2. **Never commit binaries to git.** Upload to gbrain file storage instead.
3. **After writing to the brain repo,** sync to gbrain immediately.
<!-- gbrain:end -->
```
## Phase E: Health Check
After setup is complete, check gbrain health. Every dimension should be healthy.
Report the final state to the user:
- Page count and statistics
- Embedding coverage
- Search verification (run a sample query)
## Error Handling
Every error tells you what happened, why, and how to fix it:
| What You See | Why | Fix |
|---|---|---|
| Connection refused | Supabase project paused or wrong URL | supabase.com/dashboard > Restore |
| Password authentication failed | Wrong password | Project Settings > Database > Reset password |
| pgvector not available | Extension not enabled | Run CREATE EXTENSION vector in SQL Editor |
| OpenAI key invalid | Expired or wrong key | platform.openai.com/api-keys > Create new |
| No pages found | Query before import | Import files into gbrain first |
## Tools Used
- Initialize gbrain (via CLI: gbrain init --non-interactive --url ...)
- Import files into gbrain (via CLI: gbrain import)
- Search gbrain (query)
- Check gbrain health (get_health)
- Get gbrain statistics (get_stats)