Garry Tan e5a9f0126a feat: GStackBrain — 16 new skills, resolver, conventions, identity layer (v0.10.0) (#120)

* feat: migrate 8 existing skills to conformance format

Add YAML frontmatter (name, version, description, triggers, tools, mutating),
Contract, Anti-Patterns, and Output Format sections to all existing skills.
Rename Workflow to Phases. Ingest becomes thin router delegating to specialized
ingestion skills (Phase 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add RESOLVER.md, conventions directory, and output rules

RESOLVER.md is the skill dispatcher modeled on Wintermute's AGENTS.md.
Categorized routing table: Always-on, Brain ops, Ingestion, Thinking,
Operational, Setup, Identity. Conventions directory extracts cross-cutting
rules (quality, brain-first lookup, model routing, test-before-bulk).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add skills conformance and resolver validation tests

skills-conformance.test.ts validates every skill has YAML frontmatter with
required fields, Contract, Anti-Patterns, and Output Format sections, and
manifest.json coverage. resolver.test.ts validates routing table categories,
skill path existence, and manifest-to-resolver coverage. 50 new tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add 9 brain skills from Wintermute (Phase 2)

Generalized from Wintermute's battle-tested skills:
- signal-detector: always-on idea+entity capture on every message
- brain-ops: brain-first lookup, read-enrich-write loop, source attribution
- idea-ingest: links/articles/tweets with author people page mandatory
- media-ingest: video/audio/PDF/book with entity extraction (absorbs video/youtube/book)
- meeting-ingestion: transcripts with attendee enrichment chaining
- citation-fixer: audit and fix citation formatting
- repo-architecture: filing rules by primary subject
- skill-creator: create skills with conformance standard + MECE check
- daily-task-manager: task lifecycle with priority levels

All Garry-specific references generalized. Core workflows preserved.
Updated RESOLVER.md and manifest.json.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add operational infrastructure + identity layer (Phase 3)

Operational skills:
- daily-task-prep: morning prep with calendar context and open threads
- cross-modal-review: quality gate via second model with refusal routing
- cron-scheduler: schedule staggering, quiet hours, wake-up override, idempotency
- reports: timestamped reports with keyword routing
- testing: skill validation framework (conformance checks)
- soul-audit: 6-phase interview generating SOUL.md, USER.md, ACCESS_POLICY.md, HEARTBEAT.md
- webhook-transforms: external events to brain signals with dead-letter queue

Identity layer:
- SOUL.md template (agent identity, generated by soul-audit)
- USER.md template (user profile, generated by soul-audit)
- ACCESS_POLICY.md template (4-tier access control)
- HEARTBEAT.md template (operational cadence)
- cross-modal.yaml convention (review pairs, refusal routing chain)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CLAUDE.md with 24 skills, RESOLVER.md, conventions, templates

GBrain is now a GStack mod for agent platforms. Updated architecture description,
key files listing (16 new skill files, RESOLVER.md, conventions, templates), skills
section (24 skills organized by resolver categories), and testing section (new
conformance and resolver tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add GStack detection + mod status to gbrain init (Phase 4)

After brain initialization, gbrain init now reports:
- Number of skills loaded (from manifest.json)
- GStack detection (checks known host paths, uses gstack-global-discover if available)
- GStack install instructions if not found
- Resolver and soul-audit pointers

Also adds installDefaultTemplates() for SOUL.md/USER.md/ACCESS_POLICY.md/HEARTBEAT.md
deployment, and detectGStack() using gstack-global-discover with fallback to known paths
(DRY: doesn't reimplement GStack's host detection logic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: v0.10.0 release documentation

- CHANGELOG: 24 skills, signal detector, RESOLVER.md, soul-audit, access control,
  conventions, conformance standard, GStack detection in init
- README: updated skill section with 24 skills, resolver, conventions
- TODOS: added runtime MCP access control (P1)
- VERSION: 0.9.2 → 0.10.0
- package.json + manifest.json version bumped

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add skill table to CHANGELOG v0.10.0

16-row table detailing every new skill, what it does, and why it matters.
Written to sell the upgrade, not document the implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore package.json version after merge conflict resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: zero-based README rewrite for GStackBrain v0.10.0

Lead with GStack mod identity. 24 skills table organized by category.
Install block references RESOLVER.md and soul-audit. GBrain+GStack
relationship explained. Removed redundancy (733 -> 406 lines).
All essential content preserved: install, recipes, architecture,
search, commands, engines, voice, knowledge model.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: extract install block to INSTALL_FOR_AGENTS.md, simplify README

The 30-line copy-paste install block becomes one line:
"Retrieve and follow INSTALL_FOR_AGENTS.md"

Benefits: agent always gets latest instructions (no stale copy-paste),
README stays clean, install details live where agents read them.

README now leads with what GBrain does ("gives your agent a brain")
instead of GStack relationship. Removed "requires frontier model" note.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 3 bugs in init.ts from merge conflict resolution

1. llstatSync typo (merge corruption) → lstatSync
2. __dirname undefined in ESM module → fileURLToPath polyfill
3. require('fs') in ESM → use imported readFileSync

All three would crash gbrain init at runtime. Caught by /review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add checkResolvable shared core function for resolver validation

Shared function at src/core/check-resolvable.ts validates that all skills
are reachable from RESOLVER.md, detects MECE overlaps (with whitelist for
always-on/router skills), finds gaps in frontmatter triggers, and scans
for DRY violations. Returns structured ResolvableIssue objects with
machine-parseable fix objects alongside human-readable action strings.

Three call sites: bun test, gbrain doctor, skill-creator skill.

Cleans up test/resolver.test.ts: removes stale 9-line skip list, imports
from production check-resolvable.ts instead of reimplementing parsing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: expand doctor with resolver validation, filesystem-first architecture

Doctor now runs filesystem checks (resolver health, skill conformance) before
connecting to DB. New --fast flag skips DB checks. Falls back to filesystem-only
when DB is unavailable. Adds schema_version: 2 to JSON output, composite health
score (0-100), and structured issues array with action strings for agent parsing.

Resolver health check calls checkResolvable() and surfaces actionable fix
instructions. Link integrity check uses engine.getHealth() dead_links count.

CLI routing split: doctor dispatched before connectEngine() so filesystem
checks always run. Fixes Codex-identified blocker where doctor required DB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add adaptive load-aware throttling and fail-improve loop

backoff.ts: System load checking (CPU via os.loadavg, memory via os.freemem),
exponential backoff with 20-attempt max guard, active hours multiplier (2x
slower during waking hours), concurrent process limit (max 2). Windows-safe:
defaults to "proceed" when os.loadavg returns zeros.

fail-improve.ts: Deterministic-first, LLM-fallback pattern with JSONL failure
logging. Cascade failure handling: when both paths fail, throws LLM error and
logs both. Log rotation at 1000 entries. Call count tracking for deterministic
hit rate metrics. Auto-generates test cases from successful LLM fallbacks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add transcription service and enrichment-as-a-service

transcription.ts: Groq Whisper (default) with OpenAI fallback. Files >25MB
segmented via ffmpeg. Provider auto-detection from env vars. Clear error
messages for missing API keys and unsupported formats.

enrichment-service.ts: Global enrichment service callable from any ingest
pathway. Entity slug generation (people/jane-doe, companies/acme-corp),
mention counting via searchKeyword, tier auto-escalation (Tier 3→2→1 based
on mention frequency and source diversity), batch enrichment with backoff
throttling, regex-based entity extraction from text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add data-research skill with recipe system, extraction, dedup, tracker

New skill: data-research — one parameterized pipeline for any email-to-
structured-data workflow (investor updates, donations, company metrics).
7-phase pipeline: define recipe, search, classify, extract (with extraction
integrity rule), archive, deduplicate, update tracker.

data-research.ts: Recipe validation, MRR/ARR/runway/headcount regex
extraction (battle-tested patterns), dedup with configurable tolerance,
markdown tracker parsing/appending, quarterly/monthly date windowing,
6-phase HTML email stripping with 500KB ReDoS cap.

Registers data-research in manifest.json (25th skill) and RESOLVER.md.
Fixes backoff test robustness for high-load systems.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.10.0 infrastructure additions

CLAUDE.md: added 6 new core files (check-resolvable, backoff, fail-improve,
transcription, enrichment-service, data-research), 6 new test files, updated
skill count to 25, test file count to 34.

README.md: updated skill count to 25, added data-research to skills table.

CHANGELOG.md: added Infrastructure section documenting resolver validation,
doctor expansion, adaptive throttling, fail-improve loop, voice transcription,
enrichment service, and data-research skill.

TODOS.md: anonymized personal references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: doctor.ts use ES module imports, harden backoff test

Replace require('fs') with ES module import in doctor.ts for consistency
with the rest of the file. Backoff test made resilient to parallel test
execution leaking module-level state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README rewrite with production brain stats, sample output, new infrastructure

Lead with the flex: 17,888 pages, 4,383 people, 723 companies, 526 meeting
transcripts built in 12 days. Show sample query output so readers see what
they'll get. Document self-improving infrastructure (tier auto-escalation,
fail-improve loop, doctor trajectory). Add data-research recipes to Getting
Data In. Update commands section with doctor --fix, transcribe, research
init/list. Fix stale "24" references to "25".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README lead with YC President origin and production agent deployments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README lead with skill philosophy and link to Thin Harness Fat Skills

Skills section now explains: skill files are code, they encode entire
workflows, they call deterministic TypeScript for the parts that shouldn't
be LLM judgment. Links to the tweet and the architecture essay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: link GStack repo, add 70K stars and 30K daily users

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove meeting transcript count from README (sensitive)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README lead with YC President origin and production agent deployments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rename political-donations recipe to expense-tracker (sensitivity)

Renamed the built-in data-research recipe from political-donations to
expense-tracker across README, CHANGELOG, SKILL.md, and reports routing.
Same extraction patterns (amounts, dates, recipients), neutral framing.
Also renamed social-radar keyword route to social-mentions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-14 19:41:34 -10:00

GBrain Installation Guide for AI Agents

Read this entire file, then follow the steps. Ask the user for API keys when needed. Target: ~30 minutes to a fully working brain.

Step 1: Install GBrain

git clone https://github.com/garrytan/gbrain.git ~/gbrain && cd ~/gbrain
curl -fsSL https://bun.sh/install | bash
export PATH="$HOME/.bun/bin:$PATH"
bun install && bun link

Verify: gbrain --version should print a version number. If gbrain is not found, restart the shell or add the PATH export to the shell profile.
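As a minimal sketch of the "add the PATH export to the shell profile" fix, the helper below appends the export line only if it is not already present. The function name `add_bun_path` and the choice of `~/.bashrc` in the usage comment are assumptions for illustration; use whichever profile file the user's shell actually reads.

```shell
# The exact export line from Step 1, kept literal (single quotes) so that
# $HOME and $PATH are expanded when the profile is sourced, not now.
EXPORT_LINE='export PATH="$HOME/.bun/bin:$PATH"'

# add_bun_path is a hypothetical helper, not part of gbrain.
# $1: profile file to update (for example ~/.bashrc or ~/.zshrc)
add_bun_path() {
  touch "$1"
  # -F matches the text literally, -x requires a whole-line match,
  # so running this twice never duplicates the line.
  if ! grep -qxF "$EXPORT_LINE" "$1"; then
    printf '%s\n' "$EXPORT_LINE" >> "$1"
  fi
}

# Usage (only touch the profile when gbrain is really missing):
#   command -v gbrain >/dev/null 2>&1 || add_bun_path "$HOME/.bashrc"
```

After the profile is updated, a new shell is still needed before gbrain --version resolves.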

Step 2: API Keys

Ask the user for these:

export OPENAI_API_KEY=sk-...          # required for vector search
export ANTHROPIC_API_KEY=sk-ant-...   # optional, improves search quality

Save them to the shell profile or a .env file. Without the OpenAI key, keyword search still works. Without the Anthropic key, search works but skips query expansion.
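Before moving on, it can help to report back to the user which search features the current keys enable. The sketch below only inspects the two environment variables named above; the function name `check_keys` and the wording of its messages are assumptions, and it does not call gbrain or validate the keys.

```shell
# check_keys is a hypothetical helper: it reports what each key unlocks,
# based only on whether the variable is set and non-empty.
check_keys() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "vector search: available"
  else
    echo "vector search: unavailable (keyword search still works)"
  fi
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "query expansion: available"
  else
    echo "query expansion: skipped"
  fi
}

# Usage:
#   check_keys
```

A set variable does not prove the key is valid, so a failed embed or query later is still worth surfacing to the user.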

Step 3: Create the Brain

gbrain init                           # PGLite, no server needed
gbrain doctor --json                  # verify all checks pass

The user's markdown files (notes, docs, brain repo) are SEPARATE from this tool repo. Ask the user where their files are, or create a new brain repo:

mkdir -p ~/brain && cd ~/brain && git init

Read ~/gbrain/docs/GBRAIN_RECOMMENDED_SCHEMA.md and set up the MECE directory structure (people/, companies/, concepts/, etc.) inside the user's brain repo, NOT inside ~/gbrain.
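A minimal sketch of the directory setup follows. It creates only the three directories named above; the rest of the structure should come from GBRAIN_RECOMMENDED_SCHEMA.md, which is not reproduced here. The function name `init_brain_dirs` is hypothetical, and the guard against ~/gbrain is a plain string check, so it does not catch symlinks or relative paths.

```shell
# init_brain_dirs is a hypothetical helper, not a gbrain command.
# $1: root of the user's brain repo (for example ~/brain)
init_brain_dirs() {
  # Refuse to scaffold inside the tool repo; the brain lives elsewhere.
  case "$1" in
    "$HOME/gbrain"|"$HOME/gbrain"/*)
      echo "refusing: $1 is inside the gbrain tool repo" >&2
      return 1
      ;;
  esac
  # Only the directories named in this guide; add the others after
  # reading the recommended schema.
  for dir in people companies concepts; do
    mkdir -p "$1/$dir"
  done
}

# Usage:
#   init_brain_dirs "$HOME/brain"
```

Git does not track empty directories, so these show up in commits only once they contain files.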

Step 4: Import and Index

gbrain import ~/brain/ --no-embed     # import markdown files
gbrain embed --stale                  # generate vector embeddings
gbrain query "key themes across these documents?"

Step 5: Load Skills

Read ~/gbrain/skills/RESOLVER.md. This is the skill dispatcher. It tells you which skill to read for any task. Save this to your memory permanently.

The three most important skills to adopt immediately:

Signal detector (skills/signal-detector/SKILL.md) — fire this on EVERY inbound message. It captures ideas and entities in parallel. The brain compounds.
Brain-ops (skills/brain-ops/SKILL.md) — brain-first lookup on every response. Check the brain before any external API call.
Conventions (skills/conventions/quality.md) — citation format, back-linking iron law, source attribution. These are non-negotiable quality rules.
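Before reading these files, a quick existence check catches a partial or outdated checkout. The sketch below tests only the four paths mentioned in this step; the function name `check_skill_files` is an assumption, and the check says nothing about whether the files conform to the skill format.

```shell
# check_skill_files is a hypothetical helper.
# $1: root of the gbrain checkout (for example ~/gbrain)
# Prints one "missing: <path>" line per absent file; returns 1 if any.
check_skill_files() {
  missing=0
  for f in skills/RESOLVER.md \
           skills/signal-detector/SKILL.md \
           skills/brain-ops/SKILL.md \
           skills/conventions/quality.md; do
    if [ ! -f "$1/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_skill_files "$HOME/gbrain" || echo "re-run the Upgrade steps"
```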

Step 6: Identity (optional)

Run the soul-audit skill to customize the agent's identity:

Read skills/soul-audit/SKILL.md and follow it.

This generates SOUL.md (agent identity), USER.md (user profile), ACCESS_POLICY.md (who sees what), and HEARTBEAT.md (operational cadence) from the user's answers.

If skipped, minimal defaults are installed automatically.

Step 7: Recurring Jobs

Set up using your platform's scheduler (OpenClaw cron, Railway cron, crontab):

Live sync (every 15 min): gbrain sync --repo ~/brain && gbrain embed --stale
Auto-update (daily): gbrain check-update --json (tell user, never auto-install)
Dream cycle (nightly): read docs/guides/cron-schedule.md for the full protocol. It covers the entity sweep, citation fixes, and memory consolidation. This is what makes the brain compound. Do not skip it.
Weekly: gbrain doctor --json && gbrain embed --stale
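For hosts that use plain crontab, the jobs above might translate into entries like the ones this sketch prints. The 15-minute interval comes from the list; the daily and weekly times (09:00 and Sunday 03:00) are arbitrary placeholders, and the function name `print_gbrain_cron` is hypothetical. The dream cycle is left out on purpose because its schedule is defined in docs/guides/cron-schedule.md.

```shell
# print_gbrain_cron is a hypothetical helper that only prints entries;
# it does not install them.
# $1: path to the user's brain repo
print_gbrain_cron() {
  # Unquoted heredoc: $1 is expanded, the * fields are left untouched.
  cat <<EOF
*/15 * * * * gbrain sync --repo $1 && gbrain embed --stale
0 9 * * * gbrain check-update --json
0 3 * * 0 gbrain doctor --json && gbrain embed --stale
EOF
}

# Usage (review the output before adding it via `crontab -e`):
#   print_gbrain_cron "$HOME/brain"
```

Cron runs with a minimal environment, so gbrain and the API keys need to be reachable from it, and the check-update job only reports; telling the user about an available update remains the agent's job.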

Step 8: Integrations

Run gbrain integrations list. Each recipe in ~/gbrain/recipes/ is a self-contained installer. It tells you which credentials to ask for, how to validate them, and which cron job to register. Ask the user which integrations they want (email, calendar, voice, Twitter).

Verify: gbrain integrations doctor (after at least one is configured)

Step 9: Verify

Read docs/GBRAIN_VERIFY.md and run all 6 verification checks. Check #4 (live sync actually works) is the most important.

Upgrade

cd ~/gbrain && git pull origin main && bun install

Then run gbrain init to apply any schema migrations (idempotent, safe to re-run).
