docs: add Hermes alternatives in SKILLPACK, remove duplicate Section 16

- Section 13: agent memory table shows both OpenClaw memory_search
  and Hermes memory()/session_search()
- Section 14a: credential gateway covers both ClawVisor (OpenClaw)
  and Hermes built-in gateway
- Removed duplicate Section 16 (Deterministic Collectors was
  copy-pasted twice)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Garry Tan
2026-04-09 20:50:17 -10:00
parent 01a2844fef
commit c68a4ccbbb


@@ -663,7 +663,7 @@ serve different purposes.
| Layer | What It Stores | Examples | How to Access |
|-------|---------------|----------|---------------|
| **GBrain** | World knowledge -- facts about people, companies, deals, meetings, concepts, ideas | Pedro's company page, meeting transcripts, original theses, deal terms | `gbrain search`, `gbrain query`, `gbrain get` |
| **Agent memory** (`memory_search`) | Operational state -- preferences, architecture decisions, tool config, session continuity | "User prefers concise formatting", "Deploy to staging before prod", "ClawVisor task IDs" | `memory_search`, file reads |
| **Agent memory** | Operational state -- preferences, architecture decisions, tool config, session continuity | "User prefers concise formatting", "Deploy to staging before prod" | OpenClaw: `memory_search`. Hermes: `memory(action="read")` + `session_search()` |
| **Session context** | Current conversation window -- what was just said, what the user just asked | The last 20 messages, current task, immediate context | Already in context |
### When to Use Each
@@ -685,11 +685,14 @@ store people dossiers in agent memory.
Three integrations that make the agent real. Without these, the brain is a static
database. With them, it's alive.
### 14a. ClawVisor -- Secure Gateway to Google and iMessage
### 14a. Credential Gateway (ClawVisor / Hermes Gateway)
[ClawVisor](https://clawvisor.com) is a credential vaulting and authorization gateway.
The agent never holds API keys directly. ClawVisor enforces policies, manages
task-scoped authorization, and injects credentials at request time.
The EA workflow needs Gmail, Calendar, Contacts, and messaging access. The agent
should never hold API keys directly. Use a credential gateway that enforces policies
and injects credentials at request time.
**OpenClaw: ClawVisor.** [ClawVisor](https://clawvisor.com) is a credential vaulting
and authorization gateway with task-scoped authorization.
**Services:** Gmail (list, read, send, draft), Google Calendar (CRUD), Google Drive
(list, search, read), Google Contacts (list, search), Apple iMessage (list, read,
@@ -718,6 +721,13 @@ tracking threads" works. "Email triage" gets rejected. The intent verification m
uses the purpose to judge whether each request is consistent -- if your purpose is
narrow, legitimate requests fail verification.
**Hermes Agent: Built-in gateway.** Hermes has multi-platform messaging (Telegram,
Discord, Slack, WhatsApp, Signal, Email) and tool access built into its gateway. Use
`config.yaml` to configure API credentials. The gateway daemon manages connections
and routes webhooks to agent sessions. For Google services, configure OAuth credentials
in the gateway config. Hermes's scheduled automations can run the same EA workflows
(email triage, calendar prep, contact enrichment) through the gateway's tool system.
### 14b. Circleback -- Meeting Ingestion via Webhooks
[Circleback](https://circleback.ai) records meetings, generates transcripts with
@@ -998,158 +1008,6 @@ concatenation but 10x better at understanding what an email means. Use both.
---
## 16. Deterministic Collectors -- Code for Data, LLMs for Judgment
When your agent keeps failing at a mechanical task despite repeated prompt fixes, stop
fighting the LLM. Move the mechanical work to code.
### The Pattern That Broke
We built an email triage system. The agent swept Gmail, classified emails by urgency,
and posted a digest to the user. One rule: every email item must include a clickable
`[Open in Gmail]` link so the user can act on it with one tap.
We put the rule in the skill file. We put it in MEMORY.md. We put it in the cron
prompt. We wrote "NO EXCEPTIONS" in all caps. We wrote "ZERO TOLERANCE" after the
fourth failure. The agent still dropped links -- on carry-forward reminders, on FYI
items, on "still awaiting" sections. The user asked five times. Each time we added
stronger language to the prompt.
The failure mode is probabilistic. The LLM understands the rule. It follows it for the
first 10 items. Then it gets sloppy on item 11, especially on items that are
re-surfaced from state rather than freshly pulled from the API. No amount of prompt
engineering fixes a 90%-reliable formatting task, because 90% reliability over 20 items
per sweep means you fail visibly about twice per day. That's enough to destroy trust.
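To make the arithmetic concrete, here is a small sketch. The 90% reliability and 20 items per sweep come from the paragraph above; treating each item as an independent trial, and the function names, are assumptions made for illustration.

```python
def expected_failures(items: int, reliability: float) -> float:
    """Expected number of items per sweep that miss the required link."""
    return items * (1.0 - reliability)


def prob_any_failure(items: int, reliability: float) -> float:
    """Probability that at least one item in a sweep misses the link,
    assuming items fail independently of each other."""
    return 1.0 - reliability ** items


if __name__ == "__main__":
    print(f"expected dropped links per sweep: {expected_failures(20, 0.9):.1f}")
    print(f"chance of at least one drop:      {prob_any_failure(20, 0.9):.0%}")
```

Under these assumptions a sweep drops about two links on average and has roughly an 88% chance of at least one visible miss.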
### The Fix: Separate Deterministic from Analytical
```
┌─────────────────────────────┐     ┌──────────────────────────────┐
│  Deterministic Collector    │────▶│  LLM Agent                   │
│  (Node.js / Python script)  │     │                              │
│                             │     │  • Read the pre-formatted    │
│  • Pull data from API       │     │    digest                    │
│  • Store structured JSON    │     │  • Classify items            │
│  • Generate links/URLs      │     │  • Add commentary            │
│  • Detect patterns (regex)  │     │  • Run brain enrichment      │
│  • Track state (seen/new)   │     │  • Draft replies             │
│  • Output markdown digest   │     │  • Surface to user           │
│                             │     │                              │
│  CODE -- deterministic,     │     │  AI -- judgment, context,    │
│  never forgets              │     │  creativity                  │
└─────────────────────────────┘     └──────────────────────────────┘
```
The collector handles everything mechanical:
- Pulling emails from Gmail (via credential gateway)
- Generating `[Open in Gmail](URL)` from message IDs -- **by code, not by LLM**
- Detecting signature requests (DocuSign/Dropbox Sign regex patterns)
- Tracking which messages are new vs. already seen (state file)
- Storing structured JSON with full metadata
- Generating a pre-formatted markdown digest with every link already embedded
The LLM reads the pre-formatted digest and does what LLMs are good at:
- Classifying urgency (requires understanding relationships, deadlines, context)
- Writing commentary ("this is the $110M acquisition thread, 7 days dropped")
- Running brain enrichment on notable entities (`gbrain search` + page updates)
- Drafting replies
- Deciding what to surface vs. filter
**The links are in the source data. The LLM can't forget them because it doesn't
generate them.**
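Two of the collector's mechanical jobs, state tracking and signature detection, can be sketched in a few lines. This is an illustrative Python sketch, not the actual `email-collector.mjs`: the regex patterns, function names, and the shape of the known-ID set are all assumptions.

```python
import re

# Hypothetical patterns; real DocuSign / Dropbox Sign mail may need different ones.
SIGNATURE_PATTERNS = [
    re.compile(r"docusign", re.IGNORECASE),
    re.compile(r"dropbox\s*sign", re.IGNORECASE),
]


def is_signature_request(message: dict) -> bool:
    """Flag a message as an e-signature request using plain regex matching."""
    text = " ".join(
        str(message.get(field, "")) for field in ("from", "subject", "snippet")
    )
    return any(pattern.search(text) for pattern in SIGNATURE_PATTERNS)


def split_new_and_seen(messages: list, known_ids: set) -> tuple:
    """Partition messages into (new, seen) based on IDs recorded in state."""
    new, seen = [], []
    for message in messages:
        (seen if message["id"] in known_ids else new).append(message)
    return new, seen
```

Because both functions are pure and deterministic, the same inbox and the same state file always produce the same flags and the same new/seen split.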
### Implementation
The email collector follows the same architecture as the X/Twitter collector (a
deterministic data pipeline for social media monitoring):
```
scripts/email-collector/
├── email-collector.mjs      # No LLM calls, no external deps
├── data/
│   ├── state.json           # Last pull timestamp, known IDs, pending signatures
│   ├── messages/            # Structured JSON per day
│   │   └── 2026-04-09.json
│   └── digests/             # Pre-formatted markdown
│       └── 2026-04-09.md
```
Every stored message includes:
```json
{
  "id": "19d74109a811b9e7",
  "account": "work",
  "authuser": "user@example.com",
  "from": "Alex Smith",
  "subject": "Re: Next Steps",
  "snippet": "Hey, wanted to follow up on...",
  "timestamp": "2026-04-09T08:56:09Z",
  "is_unread": true,
  "is_noise": false,
  "is_signature": false,
  "gmail_link": "https://mail.google.com/mail/u/?authuser=user@example.com#inbox/19d74109a811b9e7",
  "gmail_markdown": "[Open in Gmail](https://mail.google.com/mail/u/?authuser=user@example.com#inbox/19d74109a811b9e7)"
}
```
The `gmail_link` and `gmail_markdown` fields are computed from `id` + `authuser` at
collection time. Three lines of code. Never wrong.
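A minimal Python sketch of that computation, assuming the URL shape shown in the JSON example above. The real collector is a Node.js script, so the function names here are illustrative rather than taken from its source.

```python
GMAIL_BASE = "https://mail.google.com/mail/u/"


def gmail_link(message_id: str, authuser: str) -> str:
    """Build the Gmail deep link from the message ID and account address."""
    return f"{GMAIL_BASE}?authuser={authuser}#inbox/{message_id}"


def gmail_markdown(message_id: str, authuser: str) -> str:
    """Wrap the deep link in the markdown form the digest embeds."""
    return f"[Open in Gmail]({gmail_link(message_id, authuser)})"
```

Since the link is pure string formatting over fields the API already returned, it is identical on every run for the same message.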
### Cron Integration
The email monitoring cron runs the collector first, then invokes the LLM:
```
1. node email-collector.mjs collect → deterministic API pull, store JSON
2. node email-collector.mjs digest → generate markdown with links baked in
3. node email-collector.mjs signatures → list pending e-signature items
4. LLM reads digest + signatures → classifies, enriches, posts to user
```
The collector runs in under 10 seconds. The LLM analysis takes 30-60 seconds. Total:
under 90 seconds for a full inbox sweep with brain enrichment.
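Step 2 above (the digest) might look like the following sketch. It assumes messages shaped like the stored JSON shown earlier; the digest layout and the function name are hypothetical, not the collector's actual output format.

```python
def render_digest(messages: list) -> str:
    """Render a markdown digest in which every item carries its Gmail link.

    The link is copied from the stored `gmail_markdown` field, so the LLM
    that later reads the digest never has to generate it.
    """
    lines = ["# Email digest", ""]
    for message in messages:
        if message.get("is_noise"):
            continue  # noise is filtered by code, before the LLM sees anything
        flag = " (signature request)" if message.get("is_signature") else ""
        lines.append(
            f"- **{message['from']}**: {message['subject']}{flag} "
            f"{message['gmail_markdown']}"
        )
    return "\n".join(lines)
```

The LLM step then receives this text as input and adds classification and commentary on top of it.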
### Where Else This Pattern Applies
The deterministic-collector pattern works for any recurring data pull where the LLM
was previously responsible for both fetching AND formatting:
| Signal Source | Collector Generates | LLM Adds |
|--------------|-------------------|----------|
| **Email** | Gmail links, sender metadata, signature detection | Urgency classification, enrichment, reply drafts |
| **X/Twitter** | Tweet links, engagement metrics, deletion detection | Sentiment analysis, narrative detection, content ideas |
| **Calendar** | Event links, attendee lists, conflict detection | Prep briefings, meeting context from brain |
| **Slack** | Channel links, thread links, mention detection | Priority classification, action item extraction |
| **GitHub** | PR/issue links, diff stats, CI status | Code review context, priority assessment |
The principle: if a piece of output MUST be present and MUST be formatted correctly
every time, generate it in code. If a piece of output requires judgment, context, or
creativity, generate it with the LLM. Don't ask the LLM to do both in the same pass.
### The Lesson
When an LLM keeps failing at a mechanical task despite repeated prompt fixes:
1. **Stop adding more prompt language.** You've already written "NO EXCEPTIONS" and
"ZERO TOLERANCE." The LLM read it. The failure is probabilistic, not comprehension.
2. **Identify what's mechanical vs. analytical.** URL generation is mechanical.
Classification is analytical. State tracking is mechanical. Commentary is analytical.
3. **Move the mechanical work to a script.** Node.js, Python, bash -- anything
deterministic. No LLM calls, no external dependencies if possible.
4. **Feed the LLM pre-formatted data.** The script's output becomes the LLM's input.
Links are already there. Metadata is already structured. The LLM just adds judgment.
5. **Wire it into your cron.** Script runs first (fast, cheap, reliable), then LLM
reads the output (slower, expensive, creative).
This is not about the LLM being bad. It's about using the right tool for the right
job. Code is 100% reliable at string concatenation. LLMs are 90% reliable at string
concatenation but 10x better at understanding what an email means. Use both.
---
## 17. Auto-Update Notifications
GBrain ships updates frequently. The auto-update cron keeps users current by