* feat: battle-tested skill patterns from production deployment

  Backport production-learned brain-operations patterns:
  - Iron Law of Back-Linking (mandatory bidirectional linking)
  - Brain filing rules (file by primary subject, not format)
  - Enrichment protocol (7-step pipeline, 3-tier system, person/company templates)
  - Media ingest workflows (articles, videos, podcasts, PDFs, screenshots)
  - Citation requirements (mandatory [Source: ...] on every fact)
  - Test Before Bulk operating principle
  - Voice recipe: unicode crash fix, PII scrub, identity-first prompt, DIY STT+LLM+TTS
  - X-to-Brain recipe: image OCR, Filtered Stream, tweet rating rubric, cron stagger

* chore: bump version and changelog (v0.8.1)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add _brain-filing-rules.md to CLAUDE.md key files

* feat: smart file upload with TUS resumable and .redirect.yaml pointers

  - Supabase Storage auto-selects upload method by file size: < 100 MB standard POST, >= 100 MB TUS resumable (6 MB chunks + retry)
  - Signed URL generation for private bucket access (1-hour expiry)
  - New `upload-raw` command with size routing: small text stays in git, large/media files go to cloud with .redirect.yaml pointer
  - New `signed-url` command for generating access links
  - File resolver supports both .redirect.yaml (v0.9+) and .redirect (legacy)
  - Redirect format upgraded: 10 fields with full metadata
  - All migration commands (mirror, redirect, restore, clean) handle both formats

* feat: skills reference actual gbrain file commands

  - Filing rules document upload-raw, signed-url, and .redirect.yaml format
  - Ingest skill uses gbrain files upload-raw for raw source preservation
  - Maintain skill adds file storage health checks
  - Setup skill adds storage configuration phase with migration guidance
  - Voice recipe uses upload-raw for call audio storage
  - Migration v0.9.0 with complete storage setup instructions

* chore: bump version and changelog (v0.9.0)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: gbrain publish -- shareable HTML with password protection

  First code+skill pair: deterministic code does the work (strip private data, encrypt with AES-256-GCM, generate self-contained HTML), the skill tells the agent when and how to use it. 34 new tests.

  See: https://x.com/garrytan/status/2042925773300908103

* feat: backlinks check/fix, page lint, and report commands

  Three new deterministic tools (zero LLM calls):
  - gbrain backlinks check/fix -- scans brain for entity mentions without back-links, creates them. Enforces the Iron Law from the skills.
  - gbrain lint [--fix] -- catches LLM preambles, code fence wrapping, placeholder dates, missing frontmatter, broken citations, empty sections. --fix auto-strips fixable artifacts.
  - gbrain report --type <name> -- saves timestamped reports to brain/reports/{type}/YYYY-MM-DD-HHMM.md for audit trails.

  33 new tests (409 total, 0 fail).

* feat: v0.9.0 migration tells agents to swap scripts for built-in commands

  Migration file now:
  - Lists all 5 new deterministic commands with usage examples
  - Includes a script-to-command replacement table (old -> new)
  - Tells the agent to find custom script references in AGENTS.md, skills, and cron jobs and replace with gbrain commands
  - Adds recommended cron jobs for daily backlink fix + weekly lint
  - References the Thin Harness, Fat Skills thread

* fix: CLI routing bugs found during DX review

  - Fixed subArgs reference error in handleCliOnly (used wrong variable name)
  - Renamed gbrain backlinks check/fix to gbrain check-backlinks to avoid conflict with existing backlinks operation (per-page incoming links)
  - Added TOOLS section to --help output showing publish, check-backlinks, lint, report
  - Added upload-raw and signed-url to FILES section in --help
  - Updated all docs/migration references to use check-backlinks

* fix: security hardening from adversarial review

  - XSS: sanitize marked.parse() output (strip script/iframe/on* attrs)
  - Path traversal: validate report --type against [a-z0-9-] pattern
  - TUS: HEAD request before retry to get server's actual offset (TUS spec)
  - Pointer: upload-raw now includes pointer content in JSON output
  - Symlinks: use lstatSync in all walkers to prevent directory escape

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

300 lines
16 KiB
Markdown
# CLAUDE.md

GBrain is a personal knowledge brain. Pluggable engines: PGLite (embedded Postgres
via WASM, zero-config default) or Postgres + pgvector + hybrid search in a managed
Supabase instance. `gbrain init` defaults to PGLite; suggests Supabase for 1000+ files.

## Architecture

Contract-first: `src/core/operations.ts` defines ~30 shared operations. CLI and MCP
server are both generated from this single source. Engine factory (`src/core/engine-factory.ts`)
dynamically imports the configured engine (`'pglite'` or `'postgres'`). Skills are fat
markdown files (tool-agnostic, work with both CLI and plugin contexts).

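
The contract-first idea can be illustrated with a minimal sketch. It is not the contents of `src/core/operations.ts`: the `Operation` shape, the two sample operations, and the `toCliCommands` / `toMcpTools` helpers are hypothetical names chosen for the example. The point is only that both surfaces are derived from one list of definitions.

```typescript
type ParamType = "string" | "number" | "boolean";

// Hypothetical shape of one shared operation. The real definitions live in
// src/core/operations.ts and may look different.
interface Operation {
  name: string;
  description: string;
  params: Record<string, ParamType>;
}

// Two made-up operations standing in for the ~30 real ones.
const sampleOperations: Operation[] = [
  { name: "get_page", description: "Fetch one page by slug", params: { slug: "string" } },
  { name: "search", description: "Search pages", params: { query: "string", limit: "number" } },
];

// CLI surface derived from the contract: one command per operation.
function toCliCommands(ops: Operation[]): string[] {
  return ops.map((op) => {
    const flags = Object.keys(op.params).map((p) => "--" + p + " <" + op.params[p] + ">");
    return [op.name.replace(/_/g, "-")].concat(flags).join(" ");
  });
}

interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: ParamType }>;
    required: string[];
  };
}

// MCP-style tool listing derived from the same contract.
function toMcpTools(ops: Operation[]): ToolDefinition[] {
  return ops.map((op) => {
    const properties: Record<string, { type: ParamType }> = {};
    Object.keys(op.params).forEach((key) => {
      properties[key] = { type: op.params[key] };
    });
    return {
      name: op.name,
      description: op.description,
      inputSchema: { type: "object" as const, properties, required: Object.keys(op.params) },
    };
  });
}

console.log(toCliCommands(sampleOperations));
// → [ "get-page --slug <string>", "search --query <string> --limit <number>" ]
console.log(toMcpTools(sampleOperations).map((tool) => tool.name));
// → [ "get_page", "search" ]
```

Adding an operation to the list changes both outputs at once, which is the property the parity test (`test/parity.test.ts`) is described as guarding.
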
## Key files

- `src/core/operations.ts` — Contract-first operation definitions (the foundation)
- `src/core/engine.ts` — Pluggable engine interface (BrainEngine)
- `src/core/engine-factory.ts` — Engine factory with dynamic imports (`'pglite'` | `'postgres'`)
- `src/core/pglite-engine.ts` — PGLite (embedded Postgres 17.5 via WASM) implementation, all 37 BrainEngine methods
- `src/core/pglite-schema.ts` — PGLite-specific DDL (pgvector, pg_trgm, triggers)
- `src/core/postgres-engine.ts` — Postgres + pgvector implementation (Supabase / self-hosted)
- `src/core/utils.ts` — Shared SQL utilities extracted from postgres-engine.ts
- `src/core/db.ts` — Connection management, schema initialization
- `src/commands/migrate-engine.ts` — Bidirectional engine migration (`gbrain migrate --to supabase/pglite`)
- `src/core/import-file.ts` — importFromFile + importFromContent (chunk + embed + tags)
- `src/core/sync.ts` — Pure sync functions (manifest parsing, filtering, slug conversion)
- `src/core/storage.ts` — Pluggable storage interface (S3, Supabase Storage, local)
- `src/core/supabase-admin.ts` — Supabase admin API (project discovery, pgvector check)
- `src/core/file-resolver.ts` — File resolution with fallback chain (local -> .redirect.yaml -> .redirect -> .supabase)
- `src/core/chunkers/` — 3-tier chunking (recursive, semantic, LLM-guided)
- `src/core/search/` — Hybrid search: vector + keyword + RRF + multi-query expansion + dedup
- `src/core/embedding.ts` — OpenAI text-embedding-3-large, batch, retry, backoff
- `src/mcp/server.ts` — MCP stdio server (generated from operations)
- `src/commands/auth.ts` — Standalone token management (create/list/revoke/test)
- `src/commands/upgrade.ts` — Self-update CLI with post-upgrade feature discovery
- `src/core/schema-embedded.ts` — AUTO-GENERATED from schema.sql (run `bun run build:schema`)
- `src/schema.sql` — Full Postgres + pgvector DDL (source of truth, generates schema-embedded.ts)
- `src/commands/integrations.ts` — Standalone integration recipe management (no DB needed)
- `recipes/` — Integration recipe files (YAML frontmatter + markdown setup instructions)
- `docs/guides/` — Individual SKILLPACK guides (broken out from monolith)
- `docs/integrations/` — "Getting Data In" guides and integration docs
- `docs/architecture/infra-layer.md` — Shared infrastructure documentation
- `docs/ethos/THIN_HARNESS_FAT_SKILLS.md` — Architecture philosophy essay
- `docs/ethos/MARKDOWN_SKILLS_AS_RECIPES.md` — "Homebrew for Personal AI" essay
- `docs/guides/repo-architecture.md` — Two-repo pattern (agent vs brain)
- `docs/guides/sub-agent-routing.md` — Model routing table for sub-agents
- `docs/guides/skill-development.md` — 5-step skill development cycle + MECE
- `docs/guides/idea-capture.md` — Originality distribution, depth test, cross-linking
- `docs/guides/quiet-hours.md` — Notification hold + timezone-aware delivery
- `docs/guides/diligence-ingestion.md` — Data room to brain pages pipeline
- `docs/designs/HOMEBREW_FOR_PERSONAL_AI.md` — 10-star vision for integration system
- `docs/mcp/` — Per-client setup guides (Claude Desktop, Code, Cowork, Perplexity)
- `skills/_brain-filing-rules.md` — Cross-cutting brain filing rules (referenced by all brain-writing skills)
- `skills/migrations/` — Version migration files with feature_pitch YAML frontmatter
- `src/commands/publish.ts` — Deterministic brain page publisher (code+skill pair, zero LLM calls)
- `src/commands/backlinks.ts` — Back-link checker and fixer (enforces Iron Law)
- `src/commands/lint.ts` — Page quality linter (catches LLM artifacts, placeholder dates)
- `src/commands/report.ts` — Structured report saver (audit trail for maintenance/enrichment)
- `openclaw.plugin.json` — ClawHub bundle plugin manifest

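
For readers unfamiliar with the RRF step named in the `src/core/search/` entry, here is a minimal sketch of reciprocal rank fusion over two ranked result lists. It is a generic illustration, not the code in `src/core/search/`: the function name, the page IDs, and the constant `k = 60` (a commonly used default) are assumptions.

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank in list).
// k = 60 is a commonly used constant, not necessarily the value GBrain uses.
// Each list is assumed to contain an ID at most once.
function reciprocalRankFusion(rankings: string[][], k: number = 60): string[] {
  const scores: Record<string, number> = {};
  rankings.forEach((ranking) => {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores[id] = (scores[id] || 0) + 1 / (k + rank);
    });
  });
  // Each ID appears once in the output, so fusing also deduplicates.
  return Object.keys(scores).sort((a, b) => scores[b] - scores[a]);
}

// Hypothetical results from a vector search and a keyword search.
const vectorHits = ["page-a", "page-b", "page-c"];
const keywordHits = ["page-b", "page-d", "page-a"];

const fusedHits = reciprocalRankFusion([vectorHits, keywordHits]);
console.log(fusedHits);
// → [ "page-b", "page-a", "page-d", "page-c" ]
```

`page-b` wins because it ranks near the top of both lists (1/62 + 1/61), while `page-a` is first in one list but third in the other (1/61 + 1/63).
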
## Commands

Run `gbrain --help` or `gbrain --tools-json` for full command reference.

Key commands added in v0.7:
- `gbrain init` — defaults to PGLite (no Supabase needed), scans repo size, suggests Supabase for 1000+ files
- `gbrain migrate --to supabase` / `gbrain migrate --to pglite` — bidirectional engine migration

## Testing

`bun test` runs all tests (23 unit test files + 4 E2E test files). Unit tests run
without a database. E2E tests skip gracefully when `DATABASE_URL` is not set.

Unit tests: `test/markdown.test.ts` (frontmatter parsing), `test/chunkers/recursive.test.ts`
(chunking), `test/sync.test.ts` (sync logic), `test/parity.test.ts` (operations contract
parity), `test/cli.test.ts` (CLI structure), `test/config.test.ts` (config redaction),
`test/files.test.ts` (MIME/hash), `test/import-file.test.ts` (import pipeline),
`test/upgrade.test.ts` (schema migrations), `test/doctor.test.ts` (doctor command),
`test/file-migration.test.ts` (file migration), `test/file-resolver.test.ts` (file resolution),
`test/import-resume.test.ts` (import checkpoints), `test/migrate.test.ts` (migration),
`test/setup-branching.test.ts` (setup flow), `test/slug-validation.test.ts` (slug validation),
`test/storage.test.ts` (storage backends), `test/supabase-admin.test.ts` (Supabase admin),
`test/yaml-lite.test.ts` (YAML parsing), `test/check-update.test.ts` (version check + update CLI),
`test/pglite-engine.test.ts` (PGLite engine, all 37 BrainEngine methods),
`test/utils.test.ts` (shared SQL utilities), `test/engine-factory.test.ts` (engine factory + dynamic imports),
`test/integrations.test.ts` (recipe parsing, CLI routing, recipe validation),
`test/publish.test.ts` (content stripping, encryption, password generation, HTML output),
`test/backlinks.test.ts` (entity extraction, back-link detection, timeline entry generation),
`test/lint.test.ts` (LLM artifact detection, code fence stripping, frontmatter validation),
`test/report.test.ts` (report format, directory structure).

E2E tests (`test/e2e/`): Run against real Postgres+pgvector. Require `DATABASE_URL`.
- `bun run test:e2e` runs Tier 1 (mechanical, all operations, no API keys)
- `test/e2e/upgrade.test.ts` runs check-update E2E against real GitHub API (network required)
- Tier 2 (`skills.test.ts`) requires OpenClaw + API keys, runs nightly in CI
- If `.env.testing` doesn't exist in this directory, check sibling worktrees for one:
  `find ../ -maxdepth 2 -name .env.testing -print -quit` and copy it here if found.
- Always run E2E tests when they exist. Do not skip them just because DATABASE_URL
  is not set. Start the test DB, run the tests, then tear it down.

### API keys and running ALL tests

ALWAYS source the user's shell profile before running tests:

```bash
source ~/.zshrc 2>/dev/null || true
```

This loads `OPENAI_API_KEY` and `ANTHROPIC_API_KEY`. Without these, Tier 2 tests
skip silently. Do NOT skip Tier 2 tests just because they require API keys — load
the keys and run them.

When asked to "run all E2E tests" or "run tests", that means ALL tiers:
- Tier 1: `bun run test:e2e` (mechanical, sync, upgrade — no API keys needed)
- Tier 2: `test/e2e/skills.test.ts` (requires OpenAI + Anthropic + openclaw CLI)
- Always spin up the test DB, source zshrc, run everything, tear down.

### E2E test DB lifecycle (ALWAYS follow this)

You are responsible for spinning up and tearing down the test Postgres container.
Do not leave containers running after tests. Do not skip E2E tests.

1. **Check for `.env.testing`** — if missing, copy from sibling worktree.
   Read it to get the DATABASE_URL (it has the port number).
2. **Check if the port is free:**
   `docker ps --filter "publish=PORT"` — if another container is on that port,
   pick a different port (try 5435, 5436, 5437) and start on that one instead.
3. **Start the test DB:**
   ```bash
   docker run -d --name gbrain-test-pg \
     -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres \
     -e POSTGRES_DB=gbrain_test \
     -p PORT:5432 pgvector/pgvector:pg16
   ```
   Wait for ready: `docker exec gbrain-test-pg pg_isready -U postgres`
4. **Run E2E tests:**
   `DATABASE_URL=postgresql://postgres:postgres@localhost:PORT/gbrain_test bun run test:e2e`
5. **Tear down immediately after tests finish (pass or fail):**
   `docker stop gbrain-test-pg && docker rm gbrain-test-pg`

Never leave `gbrain-test-pg` running. If you find a stale one from a previous run,
stop and remove it before starting a new one.

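
Steps 2 through 5 of the lifecycle can be sketched as two pure functions: one picks a port, the other fills `PORT` into the commands listed in those steps. This is an illustrative helper, not something in the repo. `pickPort`, `lifecycleCommands`, and the preferred port `5434` are assumptions, and the list of ports in use would come from `docker ps` in practice.

```typescript
// Pick the configured port if it is free, otherwise the first free fallback.
// The fallback list mirrors the ports suggested in step 2.
function pickPort(preferred: number, portsInUse: number[]): number {
  const candidates = [preferred, 5435, 5436, 5437];
  for (let i = 0; i < candidates.length; i++) {
    if (portsInUse.indexOf(candidates[i]) === -1) return candidates[i];
  }
  throw new Error("no free port among " + candidates.join(", "));
}

interface LifecycleCommands {
  start: string;
  ready: string;
  test: string;
  teardown: string;
}

// Substitute the chosen port into the commands from steps 3 to 5.
function lifecycleCommands(port: number): LifecycleCommands {
  const url = "postgresql://postgres:postgres@localhost:" + port + "/gbrain_test";
  return {
    start:
      "docker run -d --name gbrain-test-pg " +
      "-e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres " +
      "-e POSTGRES_DB=gbrain_test " +
      "-p " + port + ":5432 pgvector/pgvector:pg16",
    ready: "docker exec gbrain-test-pg pg_isready -U postgres",
    test: "DATABASE_URL=" + url + " bun run test:e2e",
    teardown: "docker stop gbrain-test-pg && docker rm gbrain-test-pg",
  };
}

// Hypothetical situation: .env.testing asks for 5434, but 5434 and 5435 are taken.
const chosenPort = pickPort(5434, [5434, 5435]);
const e2eCommands = lifecycleCommands(chosenPort);
console.log(chosenPort); // → 5436
console.log(e2eCommands.test);
```

Whatever runs these commands should run `teardown` in a `finally`-style step so the container is removed whether the tests pass or fail.
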
## Skills

Read the skill files in `skills/` before doing brain operations. They contain the
workflows, heuristics, and quality rules for ingestion, querying, maintenance,
enrichment, and setup. 7 skills: ingest, query, maintain, enrich, briefing,
migrate, setup.

## Build

`bun build --compile --outfile bin/gbrain src/cli.ts`

## Pre-ship requirements

Before shipping (/ship) or reviewing (/review), always run the full test suite:
- `bun test` — unit tests (no database required)
- Follow the "E2E test DB lifecycle" steps above to spin up the test DB,
  run `bun run test:e2e`, then tear it down.

Both must pass. Do not ship with failing E2E tests. Do not skip E2E tests.

## Post-ship requirements (MANDATORY)

After EVERY /ship, you MUST run /document-release. This is NOT optional. Do NOT
skip it. Do NOT say "docs look fine" without running it. The skill reads every .md
file in the project, cross-references the diff, and updates anything that drifted.

If /ship's Step 8.5 triggers document-release automatically, that counts. But if
it gets skipped for ANY reason (timeout, error, oversight), you MUST run it manually
before considering the ship complete.

Files that MUST be checked on every ship:
- README.md — does it reflect new features, commands, or setup steps?
- CLAUDE.md — does it reflect new files, test files, or architecture changes?
- CHANGELOG.md — does it cover every commit?
- TODOS.md — are completed items marked done?
- docs/ — do any guides need updating?

A ship without updated docs is an incomplete ship. Period.

## CHANGELOG voice

CHANGELOG.md is read by agents during auto-update (Section 17). The agent summarizes
the changelog to convince the user to upgrade. Write changelog entries that sell the
upgrade, not document the implementation.

- Lead with what the user can now DO that they couldn't before
- Frame as benefits and capabilities, not files changed or code written
- Make the user think "hell yeah, I want that"
- Bad: "Added GBRAIN_VERIFY.md installation verification runbook"
- Good: "Your agent now verifies the entire GBrain installation end-to-end, catching
  silent sync failures and stale embeddings before they bite you"
- Bad: "Setup skill Phase H and Phase I added"
- Good: "New installs automatically set up live sync so your brain never falls behind"

## Version migrations

Create a migration file at `skills/migrations/v[version].md` when a release
includes changes that existing users need to act on. The auto-update agent
reads these files post-upgrade (Section 17, Step 4) and executes them.

**You need a migration file when:**
- New setup step that existing installs don't have (e.g., v0.5.0 added live sync,
  existing users need to set it up, not just new installs)
- New SKILLPACK section with a MUST ADD setup requirement
- Schema changes that require `gbrain init` or manual SQL
- Changed defaults that affect existing behavior
- Deprecated commands or flags that need replacement
- New verification steps that should run on existing installs
- New cron jobs or background processes that should be registered

**You do NOT need a migration file when:**
- Bug fixes with no behavior changes
- Documentation-only improvements (the agent re-reads docs automatically)
- New optional features that don't affect existing setups
- Performance improvements that are transparent

**The key test:** if an existing user upgrades and does nothing else, will their
brain work worse than before? If yes, migration file. If no, skip it.

Write migration files as agent instructions, not technical notes. Tell the agent
what to do, step by step, with exact commands. See `skills/migrations/v0.5.0.md`
for the pattern.

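
The two checklists above amount to a simple decision rule, sketched below. The `ChangeKind` labels and function names are hypothetical and exist only to restate the lists in code. The real decision is still the judgment call described by the key test.

```typescript
// Hypothetical labels for the kinds of change named in the two lists above.
type ChangeKind =
  | "new-setup-step"
  | "skillpack-must-add"
  | "schema-change"
  | "changed-default"
  | "deprecated-command-or-flag"
  | "new-verification-step"
  | "new-cron-job"
  | "bug-fix"
  | "docs-only"
  | "optional-feature"
  | "transparent-performance";

// Kinds from the "You need a migration file when" list.
const REQUIRES_MIGRATION: ChangeKind[] = [
  "new-setup-step",
  "skillpack-must-add",
  "schema-change",
  "changed-default",
  "deprecated-command-or-flag",
  "new-verification-step",
  "new-cron-job",
];

// A release needs a migration file if any of its changes is on that list.
function needsMigrationFile(changes: ChangeKind[]): boolean {
  return changes.some((kind) => REQUIRES_MIGRATION.indexOf(kind) !== -1);
}

// Path convention from the text: skills/migrations/v[version].md
function migrationPath(version: string): string {
  return "skills/migrations/v" + version + ".md";
}

console.log(needsMigrationFile(["bug-fix", "docs-only"])); // → false
console.log(needsMigrationFile(["bug-fix", "new-cron-job"])); // → true
console.log(migrationPath("0.5.0")); // → skills/migrations/v0.5.0.md
```
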
## Schema state tracking

`~/.gbrain/update-state.json` tracks which recommended schema directories the user
adopted or declined, and which custom directories they added. The auto-update agent
(SKILLPACK Section 17) reads this during upgrades to suggest new schema additions
without re-suggesting things the user already declined. The setup skill writes the
initial state during Phase C/E. Never modify a user's custom directories or
re-suggest declined ones.

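
The "suggest only what is new" rule can be sketched as a filter. The `UpdateState` field names and the directory names are assumptions made for illustration. The actual layout of `update-state.json` is not shown in this file.

```typescript
// Hypothetical shape; the real ~/.gbrain/update-state.json layout may differ.
interface UpdateState {
  adopted: string[];
  declined: string[];
  custom: string[];
}

// Suggest only recommended directories the user has not already adopted,
// declined, or created themselves. The state itself is never modified.
function directoriesToSuggest(recommended: string[], state: UpdateState): string[] {
  const alreadySeen = state.adopted.concat(state.declined, state.custom);
  return recommended.filter((dir) => alreadySeen.indexOf(dir) === -1);
}

// Made-up example state and recommendation list.
const exampleState: UpdateState = {
  adopted: ["people", "companies"],
  declined: ["recipes"],
  custom: ["garden"],
};

const newSuggestions = directoriesToSuggest(
  ["people", "companies", "recipes", "meetings"],
  exampleState,
);
console.log(newSuggestions); // → [ "meetings" ]
```
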
## GitHub Actions SHA maintenance

All GitHub Actions in `.github/workflows/` are pinned to commit SHAs. Before shipping
(`/ship`) or reviewing (`/review`), check for stale pins and update them:

```bash
for action in actions/checkout oven-sh/setup-bun actions/upload-artifact actions/download-artifact softprops/action-gh-release gitleaks/gitleaks-action; do
  tag=$(grep -r "$action@" .github/workflows/ | head -1 | grep -o '#.*' | tr -d '# ')
  [ -n "$tag" ] && echo "$action@$tag: $(gh api repos/$action/git/ref/tags/$tag --jq .object.sha 2>/dev/null)"
done
```

If any SHA differs from what's in the workflow files, update the pin and version comment.

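
The comparison step (pinned SHA versus the SHA the tag currently resolves to) can be sketched as follows. It assumes pins are written as `uses: owner/repo@<40-hex-sha> # tag`, which is what the shell loop above relies on. The function names, the placeholder SHAs, and the hand-written `resolvedTags` map are illustrative: in practice the resolved SHAs would come from the `gh api` call. Actions with a subpath (`owner/repo/path@sha`) are not handled by this pattern.

```typescript
interface Pin {
  action: string;
  sha: string;
  tag: string;
}

// Parse lines such as:  uses: actions/checkout@<sha> # v4
function parsePins(workflowText: string): Pin[] {
  const pins: Pin[] = [];
  const pattern = /uses:\s*([\w.-]+\/[\w.-]+)@([0-9a-f]{40})\s*#\s*(\S+)/;
  workflowText.split("\n").forEach((line) => {
    const match = pattern.exec(line);
    if (match) pins.push({ action: match[1], sha: match[2], tag: match[3] });
  });
  return pins;
}

// A pin is stale when its tag now resolves to a different SHA.
// Pins whose tag was not resolved are left alone rather than reported.
function stalePins(pins: Pin[], resolved: Record<string, string>): Pin[] {
  return pins.filter((pin) => {
    const current = resolved[pin.action + "@" + pin.tag];
    return current !== undefined && current !== pin.sha;
  });
}

// Placeholder SHAs, not real commits.
const SHA_OLD = "1111111111" + "1111111111" + "1111111111" + "1111111111";
const SHA_NEW = "2222222222" + "2222222222" + "2222222222" + "2222222222";

const exampleWorkflow = [
  "steps:",
  "  - uses: actions/checkout@" + SHA_OLD + " # v4",
  "  - uses: oven-sh/setup-bun@" + SHA_NEW + " # v2",
].join("\n");

const resolvedTags: Record<string, string> = {
  "actions/checkout@v4": SHA_NEW,
  "oven-sh/setup-bun@v2": SHA_NEW,
};

const staleList = stalePins(parsePins(exampleWorkflow), resolvedTags);
console.log(staleList.map((pin) => pin.action)); // → [ "actions/checkout" ]
```
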
## Community PR wave process

Never merge external PRs directly into master. Instead, use the "fix wave" workflow:

1. **Categorize** — group PRs by theme (bug fixes, features, infra, docs)
2. **Deduplicate** — if two PRs fix the same thing, pick the one that changes fewer
   lines. Close the other with a note pointing to the winner.
3. **Collector branch** — create a feature branch (e.g. `garrytan/fix-wave-N`), cherry-pick
   or manually re-implement the best fixes from each PR. Do NOT merge PR branches directly —
   read the diff, understand the fix, and write it yourself if needed.
4. **Test the wave** — verify with `bun test && bun run test:e2e` (full E2E lifecycle).
   Every fix in the wave must have test coverage.
5. **Close with context** — every closed PR gets a comment explaining why and what (if
   anything) supersedes it. Contributors did real work; respect that with clear communication
   and thank them.
6. **Ship as one PR** — single PR to master with all attributions preserved via
   `Co-Authored-By:` trailers. Include a summary of what merged and what closed.

**Community PR guardrails:**
- Always AskUserQuestion before accepting commits that touch voice, tone, or
  promotional material (README intro, CHANGELOG voice, skill templates).
- Never auto-merge PRs that remove YC references or "neutralize" the founder perspective.
- Preserve contributor attribution in commit messages.

## Skill routing

When the user's request matches an available skill, ALWAYS invoke it using the Skill
tool as your FIRST action. Do NOT answer directly, do NOT use other tools first.
The skill has specialized workflows that produce better results than ad-hoc answers.

**NEVER hand-roll ship operations.** Do not manually run git commit + push + gh pr
create when /ship is available. /ship handles VERSION bump, CHANGELOG, document-release,
pre-landing review, test coverage audit, and adversarial review. Manually creating a PR
skips all of these. If the user says "commit and ship", "push and ship", "bisect and
ship", or any combination that ends with shipping — invoke /ship and let it handle
everything including the commits. If the branch name contains a version (e.g.
`v0.5-live-sync`), /ship should use that version for the bump.

Key routing rules:
- Product ideas, "is this worth building", brainstorming → invoke office-hours
- Bugs, errors, "why is this broken", 500 errors → invoke investigate
- Ship, deploy, push, create PR, "commit and ship", "push and ship" → invoke ship
- QA, test the site, find bugs → invoke qa
- Code review, check my diff → invoke review
- Update docs after shipping → invoke document-release
- Weekly retro → invoke retro
- Design system, brand → invoke design-consultation
- Visual audit, design polish → invoke design-review
- Architecture review → invoke plan-eng-review
- Save progress, checkpoint, resume → invoke checkpoint
- Code quality, health check → invoke health

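
As a rough mental model, the routing rules behave like an ordered lookup table where the first match wins. The sketch below covers a subset of the rules. Substring matching, the keyword choices, and the ordering are assumptions of this example and not how the Skill tool actually decides. Plain substring matching is crude (for instance, "ship" also matches "relationship"), which is why more specific phrases are listed first.

```typescript
interface SkillRoute {
  skill: string;
  keywords: string[];
}

// Subset of the routing rules above, ordered so specific phrases win:
// "update docs after shipping" must reach document-release before "ship",
// and "find bugs" must reach qa before the broader "bug" in investigate.
const skillRoutes: SkillRoute[] = [
  { skill: "document-release", keywords: ["update docs"] },
  { skill: "ship", keywords: ["commit and ship", "push and ship", "create pr", "deploy", "ship"] },
  { skill: "qa", keywords: ["test the site", "find bugs"] },
  { skill: "investigate", keywords: ["why is this broken", "500 error", "bug", "error"] },
  { skill: "review", keywords: ["code review", "check my diff"] },
  { skill: "retro", keywords: ["weekly retro"] },
  { skill: "office-hours", keywords: ["is this worth building", "brainstorm", "product idea"] },
];

// Return the first skill whose keyword appears in the request, or null.
function routeRequest(request: string): string | null {
  const text = request.toLowerCase();
  for (let i = 0; i < skillRoutes.length; i++) {
    const route = skillRoutes[i];
    for (let j = 0; j < route.keywords.length; j++) {
      if (text.indexOf(route.keywords[j]) !== -1) return route.skill;
    }
  }
  return null;
}

console.log(routeRequest("Please commit and ship this branch")); // → ship
console.log(routeRequest("Can you find bugs on the staging site?")); // → qa
console.log(routeRequest("Update docs after shipping")); // → document-release
console.log(routeRequest("What's the weather?")); // → null
```
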