# Brain Filing Rules -- MANDATORY for all skills that write to the brain

## The Rule

The PRIMARY SUBJECT of the content determines where it goes. Not the format,
not the source, not the skill that's running.

## Decision Protocol

1. Identify the primary subject (a person? company? concept? policy issue?)
2. File in the directory that matches the subject
3. Cross-link from related directories
4. When in doubt: what would you search for to find this page again?

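Steps 1 and 2 of the protocol can be sketched as a small lookup. This is an illustration only: the function name `subject_dir` and the subject labels it accepts are hypothetical, and the directory names are the ones used elsewhere in these rules, not the output of any gbrain command.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a primary-subject label to a brain directory.
# The labels (person, company, concept, policy) are assumptions for this sketch.
subject_dir() {
  case "$1" in
    person)  echo "people/" ;;
    company) echo "companies/" ;;
    concept) echo "concepts/" ;;
    policy)  echo "civic/" ;;   # these rules also allow concepts/ for policy content
    *)       echo "unknown subject: $1" >&2; return 1 ;;
  esac
}

subject_dir person   # prints: people/
```

An unrecognized label fails loudly instead of falling back to `sources/`, which mirrors the rule that `sources/` is never a default destination.
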
## Common Misfiling Patterns -- DO NOT DO THESE

| Wrong | Right | Why |
|-------|-------|-----|
| Analysis of a topic -> `sources/` | -> appropriate subject directory | sources/ is for raw data only |
| Article about a person -> `sources/` | -> `people/` | Primary subject is a person |
| Meeting-derived company info -> `meetings/` only | -> ALSO update `companies/` | Entity propagation is mandatory |
| Research about a company -> `sources/` | -> `companies/` | Primary subject is a company |
| Reusable framework/thesis -> `sources/` | -> `concepts/` | It's a mental model |
| Tweet thread about policy -> `media/` | -> `civic/` or `concepts/` | media/ is for content ops |

## What `sources/` Is Actually For

`sources/` is ONLY for:
- Bulk data imports (API dumps, CSV exports, snapshots)
- Raw data that feeds multiple brain pages (e.g., a guest export, contact sync)
- Periodic captures (quarterly snapshots, sync exports)

If the content has a clear primary subject (a person, company, concept, policy
issue), it does NOT go in sources/. Period.

## Notability Gate

Not everything deserves a brain page. Before creating a new entity page:
- **People:** Will you interact with them again? Are they relevant to your work?
- **Companies:** Are they relevant to your work or interests?
- **Concepts:** Is this a reusable mental model worth referencing later?
- **When in doubt, DON'T create.** A missing page can be created later.
  A junk page wastes attention and degrades search quality.

## Iron Law: Back-Linking (MANDATORY)

Every mention of a person or company with a brain page MUST create a back-link
FROM that entity's page TO the page mentioning them. This is bidirectional:
the new page links to the entity, AND the entity's page links back.

Format for back-links (append to Timeline or See Also):
```
- **YYYY-MM-DD** | Referenced in [page title](path/to/page.md) -- brief context
```

An unlinked mention is a broken brain. The graph is the intelligence.

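As a rough sketch, the back-link format can be produced by a small formatter so every entry has the same shape. The function name `backlink_line` is hypothetical, and the date, title, path, and context in the sample call are made-up values for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: format one back-link entry in the documented style.
# Arguments: date (YYYY-MM-DD), page title, relative path, brief context.
backlink_line() {
  local date="$1" title="$2" path="$3" context="$4"
  # "--" stops printf from reading the leading "-" of the format as an option.
  printf -- '- **%s** | Referenced in [%s](%s) -- %s\n' \
    "$date" "$title" "$path" "$context"
}

# Sample call with invented values:
backlink_line "2026-04-11" "Weekly sync" "meetings/weekly-sync.md" "discussed hiring plan"
```

The helper only formats the line. Placing it under the entity page's Timeline or See Also section is left to the caller.
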
## Citation Requirements (MANDATORY)

Every fact written to a brain page must carry an inline `[Source: ...]` citation.

Three formats:
- **Direct attribution:** `[Source: User, {context}, YYYY-MM-DD]`
- **API/external:** `[Source: {provider}, YYYY-MM-DD]` or `[Source: {publication}, {URL}]`
- **Synthesis:** `[Source: compiled from {list of sources}]`

Source precedence (highest to lowest):
1. User's direct statements (highest authority)
2. Compiled truth (pre-existing brain synthesis)
3. Timeline entries (raw evidence)
4. External sources (API enrichment, web search -- lowest)

When sources conflict, note the contradiction with both citations. Don't
silently pick one.

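A quick heuristic for spotting uncited facts is sketched below. It assumes facts are written as `- ` bullets, so prose sentences are not covered, and it will also flag bullets that are not facts (such as back-link entries). The function name `uncited_bullets` is hypothetical and this is not a gbrain command.

```bash
#!/usr/bin/env bash
# Hypothetical check: print bullet lines in a page that lack a "[Source: " tag.
# Output is "<line number>:<line>" for each uncited bullet; empty if none.
uncited_bullets() {
  # First grep keeps bullet lines (with line numbers); second drops cited ones.
  # "|| true" keeps the exit status 0 when nothing is left to report.
  grep -n '^- ' "$1" | grep -v -F '[Source: ' || true
}
```

Run it as `uncited_bullets path/to/page.md` and treat any output as lines to review by hand.
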
## Raw Source Preservation

Every ingested item should have its raw source preserved for provenance.

**Size routing (automatic via `gbrain files upload-raw`):**
- **< 100 MB text/PDF**: stays in the brain repo (git-tracked) in a `.raw/`
  sidecar directory alongside the brain page
- **>= 100 MB OR media files** (video, audio, images): uploaded to cloud
  storage (Supabase Storage, S3, etc.) with a `.redirect.yaml` pointer left
  in the brain repo. Files >= 100 MB use TUS resumable upload (6 MB chunks
  with retry) for reliability.

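The routing rule can be sketched as follows. This is not the gbrain implementation: `route_storage` and `THRESHOLD_BYTES` are hypothetical names, and the sketch assumes 1 MB = 1048576 bytes, which is consistent with the `size: 524288000` / `size_human: 500 MB` pair in the `.redirect.yaml` example in this document.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the size-routing rule; not the gbrain implementation.
# Assumption: 1 MB = 1048576 bytes.
THRESHOLD_BYTES=$((100 * 1024 * 1024))

# Arguments: file size in bytes, MIME type. Prints "git" or "cloud".
route_storage() {
  local size="$1" mime="$2"
  # Media files go to cloud storage regardless of size.
  case "$mime" in
    video/*|audio/*|image/*) echo "cloud"; return ;;
  esac
  # Everything else is routed by size alone.
  if [ "$size" -ge "$THRESHOLD_BYTES" ]; then
    echo "cloud"
  else
    echo "git"
  fi
}

route_storage 2048 text/plain             # prints: git
route_storage 2048 audio/mpeg             # prints: cloud
route_storage 524288000 application/pdf   # prints: cloud
```

Checking the media type before the size keeps the two conditions of the ">= 100 MB OR media" rule independent, so a small audio clip still lands in cloud storage.
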
**Upload command:**
```bash
gbrain files upload-raw <file> --page <page-slug> --type <type>
```
Returns JSON: `{storage: "git"}` for small files, `{storage: "supabase", storagePath, reference}` for cloud.

**The `.redirect.yaml` pointer format:**
```yaml
target: supabase://brain-files/page-slug/filename.mp4
bucket: brain-files
storage_path: page-slug/filename.mp4
size: 524288000
size_human: 500 MB
hash: sha256:abc123...
mime: video/mp4
uploaded: 2026-04-11T...
type: transcript
```

**Accessing stored files:**
```bash
gbrain files signed-url <storage-path>  # Generate 1-hour signed URL
gbrain files restore <dir>              # Download back to local
```

This ensures any derived brain page can be traced back to its original source,
and large files don't bloat the git repo.