alameenpd | 19 days ago
```json
{
  "agents": {
    "defaults": {
      "model": "gemini/flash-lite",
      "fallbacks": ["anthropic/claude-3-opus"],
      "contextTokens": 150000
    },
    "heartbeat": {
      "model": "deepseek/v3.2",
      "interval": "45m"
    }
  }
}
```
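For anyone unfamiliar with the `fallbacks` field: the behavior (as I understand it) is "try the default, walk the fallback list on failure." A minimal sketch of that pattern — `ModelCall` and `routeWithFallbacks` are my own names, not OpenClaw internals:

```ts
// Hypothetical sketch of default-then-fallbacks routing; not OpenClaw's actual code.
type ModelCall = (model: string, prompt: string) => Promise<string>;

async function routeWithFallbacks(
  models: string[], // e.g. ["gemini/flash-lite", "anthropic/claude-3-opus"]
  prompt: string,
  call: ModelCall,
): Promise<string> {
  let lastErr: unknown;
  for (const model of models) {
    try {
      return await call(model, prompt); // first model that succeeds wins
    } catch (err) {
      lastErr = err; // remember the failure, try the next model
    }
  }
  throw lastErr; // every model in the list failed
}
```

Calling it with `["gemini/flash-lite", "anthropic/claude-3-opus"]` mirrors the default-plus-fallback order in the config above, so Opus only gets hit when Flash Lite errors out.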
→ Switched trivial stuff away from Opus.

**Whisper skill setup**

Created `~/.openclaw/workspace/skills/whisper/`:

`SKILL.md`:
```markdown
# Whisper Context Fetch

Pulls delta-compressed context / memories to reduce prompt size.

@whisper get "summarize auth changes in repo" --project openclaw-exp
```
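Since the headline win in the observations below is delta compression, here's the rough idea as I understand it: only resend context chunks whose content changed since the last turn. This is my own illustration (hash-per-chunk snapshotting), not Whisper's actual algorithm:

```ts
// Illustrative delta-compression sketch: resend only chunks whose content
// changed since the previous snapshot. Not Whisper's real implementation.
import { createHash } from "node:crypto";

const hashOf = (s: string) => createHash("sha256").update(s).digest("hex");

// chunk id -> content hash from the previous turn
type Snapshot = Map<string, string>;

function deltaChunks(
  chunks: Record<string, string>, // e.g. { "auth.ts": "...", "README": "..." }
  prev: Snapshot,
): { delta: Record<string, string>; next: Snapshot } {
  const delta: Record<string, string> = {};
  const next: Snapshot = new Map();
  for (const [id, content] of Object.entries(chunks)) {
    const h = hashOf(content);
    next.set(id, h);
    if (prev.get(id) !== h) delta[id] = content; // new or changed -> resend
  }
  return { delta, next };
}
```

On the first turn everything is "changed" so the full context goes out; on later turns only edited chunks do, which matches the 10–15k → 1–3k drop I'm seeing for repeat codebase overviews.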
`index.ts` (simplified):

```ts
import { Whisper } from '@usewhisper/sdk';

const whisper = new Whisper({
  apiKey: process.env.WHISPER_API_KEY,
  orgId: process.env.WHISPER_ORG_ID,
});

// project is optional so callers can rely on the default
export async function getContext(args: { query: string; project?: string }) {
  const res = await whisper.query({
    project: args.project || 'openclaw-exp',
    query: args.query,
    compress: true,
    compression_strategy: 'delta',
    include_memories: true,
  });
  return {
    context: res.context,
    memories: res.memories?.map(m => `${m.type}: ${m.content}`) || [],
    meta: res.meta,
  };
}
```
Then in agent instructions (AGENTS.md or prompt templates): "If context/history is large, call @whisper get first and prepend the result."

Wrapper example (in a custom loop or cron):

```ts
async function runAgentWithContext(msg: string) {
  let enhanced = msg;
  // Rough "long context" heuristic: ~4 chars per token, so ~1k tokens.
  if (msg.length > 4000) {
    const wc = await getContext({ query: msg, project: 'openclaw-exp' });
    enhanced = `Context diff: ${wc.context}\nMemories:\n${wc.memories.join('\n')}\n\n${msg}`;
  }
  return openclaw.run(enhanced);
}
```
**Observations so far**

- Delta compression seems to genuinely cut repeat context (e.g., codebase overviews drop from 10–15k → 1–3k tokens after the first message).
- Memory types (preferences, events, goals) help avoid re-explaining user prefs.
- Combined with routing, my rough tracking shows ~60–75% lower usage on mixed workloads (not 97% like some YouTube claims, but meaningful).
- Latency added (~200–500ms per query), but worth it for longer sessions.
- Still manual; over-indexing sources bloats the Whisper project if you're not careful.
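If you want to sanity-check your own numbers the same way I did, the percentage is just `(baseline − observed) / baseline` — a throwaway helper (the name is mine):

```ts
// Turn before/after token counts into a rounded savings percentage.
function savingsPct(baselineTokens: number, observedTokens: number): number {
  if (baselineTokens <= 0) throw new Error("baseline must be positive");
  return Math.round(((baselineTokens - observedTokens) / baselineTokens) * 100);
}
```

E.g. a 12k-token overview compressing to 2k is `savingsPct(12000, 2000)`, i.e. 83% — the per-source wins run higher than the ~60–75% I see on mixed workloads because routing and tool calls dilute them.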
**Caveats:** This is custom / early. Whisper costs $20/mo after the trial. OpenClaw itself has had security noise lately (skills gone rogue, etc.), so sandbox everything. Not production-ready for high-stakes use.

Curious from folks running OpenClaw:

- What's your biggest token sink (heartbeats, tools, long histories)?
- Tried context layers / vector DBs? Actual savings numbers?
- Better cheap fallbacks than Flash/DeepSeek for non-reasoning tasks?
Happy to share more snippets or debug if anyone's experimenting similarly.