概述

SKILL: Token Cost Intelligence — Free Primer

Source: Production agent stack running at $0.91/day (down from $8–10/session)

Domain: Token cost optimization, OpenClaw deployments

Type: Free primer

THE CORE TRUTH

> The models are not expensive. Your habits are.

Most OpenClaw operators are spending 8–10x more than they need to. This primer gives you the diagnostic framework to find out where you're leaking.

THE "STUPID BUTTON" — 6 DIAGNOSTIC QUESTIONS

Run these before every session:

Are you feeding raw PDFs/images when you only need text?

Screenshots are the worst offender. Copy-paste or convert to Markdown. A 4,500-word PDF = 100,000+ tokens raw. The same content in Markdown = 4,000–6,000 tokens. ~20x reduction.

When did you last start a fresh conversation?

Every new turn re-sends the entire conversation history. 30-turn threads don't just feel inefficient — they are. 10–15 turn cap, then summarize and start fresh.

Are you using the most expensive model for everything?

Opus for formatting and proofreading is a Ferrari to the grocery store. Haiku handles light tasks at 1/30th the cost.

Do you know what's loading in context before you type?

Each loaded plugin = silent token tax per session. Documented case: 50,000 tokens consumed before the first keystroke. Audit your connectors. Disable what you don't use.

Are you caching stable context? (API builders)

Cache hits on Opus: $0.50/M vs $5.00/M standard = 90% discount. System prompts, tool definitions, persona instructions → all cacheable. If you're not caching, you're paying full price for the same tokens every call.

How are you handling web search?

Native model web search is token-heavy. MCP-routed alternatives return structured results at a fraction of the cost. Know what you're paying per search.

COST COMPARISON (CONCRETE)

Session Type	Input Tokens	Output Tokens	Cost (Opus pricing)
---	---	---	---
Sloppy (raw PDFs, 30-turn sprawl, Opus-everything)	800K–1M	150K–200K	$8–$10
Clean (markdown, 10-turn cap, tiered models)	100K–150K	50K–80K	~$1
Reduction	~8x	~3x	8–10x

Scaled to a team of 10 for one month:

Sloppy habits: ~$2,000/month
Clean habits: ~$250/month
Same output volume.

5 AGENT COMMANDMENTS

For anyone running OpenClaw agents at any scale:

Index your references. Agents get relevant chunks, not raw document dumps. Dumping full documents per agent call is architectural waste.

Pre-process context before it hits the window. Chunk, summarize, and clean before ingestion. If the model's first tokens are spent parsing your bad preprocessing, you failed.

Cache your stable context. System prompts, tool definitions, persona instructions, reference material → all cacheable. Thousands of agent calls per day without caching is pouring money out.

Scope each agent to minimum viable context. Planning agent doesn't need the full codebase. Editing agent doesn't need the project roadmap. Passing everything to every agent is measurable waste — and models perform worse drowning in irrelevant context.

Measure what you burn. Instrument all agent calls: input tokens, output tokens, model mix, cost ratio. You cannot optimize what you don't measure.

Full framework with anti-patterns by tier, tiered model routing, and confirmed production delta available in Token Cost Intelligence on Claw Mart.

版本历史

共 1 个版本

v1.0.0 当前

2026-05-03 09:51 安全安全

安全检测

腾讯云安全 (Keen)

安全，无风险

查看报告

腾讯云安全 (Sanbu)

安全，无风险

查看报告

Stop Burning Tokens — OpenClaw Cost Diagnostic Primer

概述

SKILL: Token Cost Intelligence — Free Primer

THE CORE TRUTH

THE "STUPID BUTTON" — 6 DIAGNOSTIC QUESTIONS

COST COMPARISON (CONCRETE)

5 AGENT COMMANDMENTS

版本历史

安全检测

腾讯云安全 (Keen)

腾讯云安全 (Sanbu)

🔗 相关推荐

Free Compaction Primer

OpenClaw Security Hardening Toolkit

Free Bash Safety Primer