# Query Skill

Answer questions using the brain's knowledge with 3-layer search and synthesis.

## Workflow

1. **Decompose the question** into search strategies:
   - Keyword search for specific names, dates, terms
   - Semantic query for conceptual questions
   - Structured queries (list by type, backlinks) for relational questions
2. **Execute searches:**
   - Keyword search in gbrain for full-text (FTS) matches (search)
   - Hybrid search in gbrain for semantic+keyword matches with expansion (query)
   - List pages in gbrain by type, or check backlinks, for structural queries
3. **Read top results.** Read the top 3-5 pages from gbrain to get full context.
4. **Synthesize answer** with citations. Every claim traces back to a specific page slug.
5. **Flag gaps.** If the brain doesn't have the information, say "the brain doesn't have information on X" rather than hallucinating.

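Steps 4 and 5 can be sketched as a small function. This is a minimal illustration, not gbrain code: `Hit` and `synthesize` are hypothetical names, and the hits are assumed to have been collected already by the searches in step 2.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical search hit: a page slug plus the matching excerpt."""
    slug: str
    excerpt: str


def synthesize(topic: str, hits: list[Hit]) -> str:
    """Build an answer in which every claim names its page slug."""
    if not hits:
        # Step 5: report the gap instead of guessing.
        return f"The brain doesn't have information on {topic}."
    # Step 4: one cited line per supporting excerpt.
    return "\n".join(f"According to {h.slug}: {h.excerpt}" for h in hits)
```

With no hits the function returns the gap message verbatim, so the agent never fills the hole with invented content.
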
## Quality Rules

- Never hallucinate. Only answer from brain content.
- Cite sources: "According to concepts/do-things-that-dont-scale..."
- Flag stale results: if a search result shows [STALE], note that the info may be outdated
- For "who" questions, use backlinks and typed links to find connections
- For "what happened" questions, use timeline entries
- For "what do we know" questions, read compiled_truth directly

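The question-type rules and the stale flag above could be expressed as a lookup, sketched below. The routing table, the prefix matching, and the fallback to hybrid search are assumptions made for illustration. The tool names come from the Tools Used list, and mapping "what do we know" to `get_page` assumes compiled_truth is read from the page itself.

```python
# Hypothetical routing table: question opener -> tool to start with.
ROUTES = {
    "who": "get_backlinks",
    "what happened": "get_timeline",
    "what do we know": "get_page",  # assumption: compiled_truth lives on the page
}


def pick_tool(question: str) -> str:
    """Pick a starting tool from the question's opening words."""
    q = question.strip().lower()
    for opener, tool in ROUTES.items():
        if q.startswith(opener):
            return tool
    return "query"  # assumption: fall back to hybrid search


def with_stale_note(snippet: str) -> str:
    """Append a warning when a search result carries the [STALE] marker."""
    if "[STALE]" in snippet:
        return snippet + " (note: this information may be outdated)"
    return snippet
```

Prefix matching is deliberately crude; a real agent would judge the question type itself, and the table only records which tool each rule points to.
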
## Token-Budget Awareness

Search returns **chunks**, not full pages. Read the excerpts before deciding
whether to load a full page.

- `gbrain search` / `gbrain query` return ranked chunks with context snippets.
  These are often enough to answer the question directly.
- Only use `gbrain get <slug>` to load the full page when a chunk confirms the
  page is relevant and you need more context (e.g., compiled truth, timeline).
- **"Tell me about X"** -- get the full page (the user wants the complete picture).
- **"Did anyone mention Y?"** -- search results are enough (the user wants a yes/no with evidence).

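The decision above can be sketched as a single predicate. The function name, its parameters, and the phrase matching are hypothetical; the two booleans stand in for judgments the agent makes after reading the excerpts.

```python
def should_load_full_page(
    question: str,
    chunk_confirms_relevance: bool,
    needs_more_context: bool,
) -> bool:
    """Decide whether to spend tokens on a full page load."""
    q = question.strip().lower()
    if q.startswith("tell me about"):
        return True  # the user wants the complete picture
    if q.startswith("did anyone mention"):
        return False  # a yes/no with evidence: snippets are enough
    # Otherwise load only when a chunk confirmed the page is relevant
    # and the excerpts alone do not carry enough context.
    return chunk_confirms_relevance and needs_more_context
```
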
### Source precedence

When multiple sources provide conflicting information, follow this precedence:

1. **User's direct statements** (highest authority -- what the user told you directly)
2. **Compiled truth** (the brain's synthesized, cited understanding)
3. **Timeline entries** (raw evidence, reverse-chronological)
4. **External sources** (web search, API enrichment -- lowest authority)

When sources conflict, note the contradiction with both citations. Don't silently
pick one.

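A minimal sketch of the precedence order, assuming each claim is tagged with one of four hypothetical source types. `Claim`, `PRECEDENCE`, and `resolve` are illustrative names, and the input list is assumed to be non-empty.

```python
from dataclasses import dataclass

# Highest authority first, in the order listed above.
PRECEDENCE = ["user_statement", "compiled_truth", "timeline", "external"]


@dataclass
class Claim:
    source_type: str  # one of PRECEDENCE
    citation: str     # where the claim came from
    value: str        # what the source says


def resolve(claims: list[Claim]) -> tuple[Claim, list[str]]:
    """Return the highest-precedence claim plus a note for every conflict."""
    ranked = sorted(claims, key=lambda c: PRECEDENCE.index(c.source_type))
    preferred = ranked[0]
    notes = [
        f"Conflict: {preferred.citation} says {preferred.value!r}; "
        f"{other.citation} says {other.value!r}"
        for other in ranked[1:]
        if other.value != preferred.value
    ]
    return preferred, notes
```

The conflict notes are returned alongside the preferred claim so that the answer can surface the contradiction with both citations rather than silently picking one.
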
## Citation in Answers

When referencing brain pages in your answer, propagate inline citations:
- Cite the page: "According to [Source: people/jane-doe, compiled truth]..."
- When brain pages have inline `[Source: ...]` citations, propagate them so
  the user can trace facts to their origin
- When you synthesize across multiple pages, cite all sources

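Citation propagation can be sketched with a regular expression over the `[Source: ...]` pattern shown above. The helper names are hypothetical, and the pattern assumes citations contain no nested square brackets.

```python
import re

# Matches inline citations such as "[Source: people/jane-doe, compiled truth]".
SOURCE_RE = re.compile(r"\[Source: [^\]]+\]")


def cite_page(slug: str, section: str) -> str:
    """Format a citation for a brain page and the section that was read."""
    return f"[Source: {slug}, {section}]"


def extract_citations(page_text: str) -> list[str]:
    """Return the page's inline citations in order, without duplicates."""
    return list(dict.fromkeys(SOURCE_RE.findall(page_text)))
```

When synthesizing across several pages, running `extract_citations` on each page and concatenating the results keeps every origin traceable.
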
## Search Quality Awareness

If search results seem off (wrong results, missing known pages, irrelevant hits):
- Run `gbrain doctor --json` to check index health
- Check embedding coverage -- partial embeddings degrade hybrid search
- Compare keyword search (`gbrain search`) vs hybrid search (`gbrain query`)
  for the same query to isolate whether the issue is embedding-related
- Report search quality issues in the maintain workflow (see maintain skill)

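The keyword-vs-hybrid comparison can be sketched as a set difference over result slugs. It assumes the slugs have already been collected from each search for the same query; `compare_results` is a hypothetical helper, not a gbrain command.

```python
def compare_results(
    keyword_slugs: list[str], hybrid_slugs: list[str]
) -> dict[str, list[str]]:
    """Split result slugs into shared, keyword-only, and hybrid-only groups."""
    kw, hy = set(keyword_slugs), set(hybrid_slugs)
    return {
        "both": sorted(kw & hy),
        "keyword_only": sorted(kw - hy),
        "hybrid_only": sorted(hy - kw),
    }
```

Known pages that show up under `keyword_only` may point to missing embeddings, though ranking cutoffs can produce the same pattern, so treat it as a hint to check coverage rather than a diagnosis.
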
## Tools Used

- Keyword search gbrain (search)
- Hybrid search gbrain (query)
- Read a page from gbrain (get_page)
- List pages in gbrain with filters (list_pages)
- Check backlinks in gbrain (get_backlinks)
- Traverse the link graph in gbrain (traverse_graph)
- View timeline entries in gbrain (get_timeline)