← 返回
未分类

Token Saver

Five-phase token audit & optimization framework for OpenClaw: Discover → Prioritize (3D matrix) → Optimize (9 category techniques) → Validate → Monitor. Univ...
五阶段代币审计与优化框架(OpenClaw):发现 → 优先级排序(3D矩阵) → 优化(9类技术) → 验证 → 监控。Univ...
youxiyin youxiyin 来源
未分类 clawhub v2.1.0 2 版本 100000 Key: 无需
★ 0
Stars
📥 392
下载
💾 0
安装
2
版本
#latest#openclaw#token-optimization

概述

Token Saver

> Universal token audit & optimization framework for OpenClaw agents.

> Based on real-world practice (2026-05-04).

Core Principles

  1. Tier your model usage — Simple tasks use cheap models; complex reasoning

uses expensive ones. Don't mix the two.

  1. Prompts say what, not why — Background rationale and philosophy are

noise to an agent. Strip them.

  1. Batch > Serial — One call for 10 results costs marginally more than

three calls for 3+3+4 results. Combine.

  1. Context = Cost — Every file loaded at session start, every tool schema

registered, every past message injected — all have a token price.

  1. Idle = Zero burn — Nighttime, weekends, and idle periods should run

nothing. Configure active hours.

Output

After each full execution, write a report (token-audit-report-YYYY-MM-DD.md)

containing: before/after comparison table, estimated weekly savings per change,

items deferred and why, recommended next step.


Quick Start

Not every audit needs the full Phase 1-5 treatment. Use these shortcuts

based on your goal:

🚀 Express Audit (15 min)

Trigger: "快速 token 审计"

  1. Run Phase 1A (enumerate cron tasks) + Phase 1D (model tier map)
  2. Skip directly to Phase 3 category table and pick the lowest-hanging fruit
  3. Apply Safe techniques only, no user confirmation needed
  4. Skip Phase 4 and Phase 5 — just log the changes

🎯 Quick Wins (<5 min)

Trigger: "快速省 token"

Go straight to these high-impact, zero-risk techniques:

  1. A3 (constrain output — add conciseness instruction)
  2. B1 (right-size each task — cheapest viable model)
  3. G1 (fixed prefix first — static prefix + dynamic suffix)
  4. H1/H2 (default to working path, fail once and switch)

Apply directly without audit preamble.

🔄 When to run full audit

IndicatorAction
-------------------
First time using this skillFull Phase 1-5 — establish baseline
Cron tasks changed significantlyPhase 1-3 — re-discover + re-optimize
New provider/API addedPhase 1B + 3 — check config + optimize
>30 days since last auditPhase 1-2 — measure drift, then re-optimize
Just want a quick checkQuick Wins or Express Audit

Phase 1: DISCOVER — Map the Full Token Landscape

1A Enumerate All Automated Tasks

Read your cron/scheduled task configuration (e.g. ~/.openclaw/cron/jobs.json).

For each task record:

  • name
  • model (or "default" if unset)
  • message / prompt length in chars
  • schedule frequency (daily / weekly / other)
  • delivery.mode (announce / none)
  • sessionTarget (isolated / main)

1B Analyze Agent Configuration

Inspect your gateway config (e.g. openclaw.json):

  • agents.defaults.heartbeat.* — interval, active hours, isolated session,

light context flag

  • agents.defaults.compaction.mode — message retention aggressiveness
  • agents.list[].tools.profile — full, coding, or custom
  • agents.list[].model — per-agent model override

1C Measure Context Load

List every file that is injected at session start (typically files in the

workspace root directory). Measure each in chars and estimate token cost

(~3 chars per token for CJK-heavy text, ~4 for English-heavy).

If LCM (Lossless Context Management) is active, note the number and average

size of compacted summary blocks injected per turn.

If tool schemas are accessible, estimate total schema chars:

(count of registered tools × average schema size in chars).

1D Map Models to Tiers

Categorize all available models into three tiers based on capability and cost:

  • 🏆 Premium (strong reasoning, high cost): e.g. deepseek-v4-pro, gpt-5.x
  • 🟡 Standard (balanced): e.g. deepseek-v4-flash, minimax-m2.7
  • 🟢 Economy (lightweight): e.g. minimax-m2.7-highspeed, ollama local

Map each task from 1A to its current model tier.

> ⚠️ Checkpoint: Before moving to Phase 2, present your Phase 1 findings

> (task inventory, file sizes, model tier map) to the user.

> Confirm that the inventory is complete and the measurements are correct.

> This prevents optimizing the wrong things.


Phase 2: PRIORITIZE — Build Your Decision Matrix

Score each finding from Phase 1 along three independent dimensions:

DimensionScaleAssessment
------------------------------
Token Impact 🎯High / Med / LowTokens per occurrence × occurrences per period
Risk ⚠️Safe / Moderate / HighCan you undo it? Does it affect core function?
Effort 🔧Easy / Med / HardSingle config change? Multi-file edit? Needs research?

How to Score

Compute a relative priority for each finding by inverting Risk and Effort:

Priority = ImpactWeight × (1 / RiskWeight) × (1 / EffortWeight)

Where each dimension maps to a simple numeric weight:

  • Impact: High=3, Med=2, Low=1
  • Risk: Safe=1, Moderate=2, High=3
  • Effort: Easy=1, Med=2, Hard=3

Focus on items scoring ≥ 1.5 first. Skip items < 1.0 unless they are

trivially easy (effort=1) and safe (risk=1).

Common High-Impact Patterns

These patterns tend to score high across most deployments:

PatternTypical ImpactTypical RiskTypical Effort
-----------------------------------------------------
Overly verbose task promptsHighSafeEasy
Heavy models on simple tasksHighSafeEasy
No active hours on heartbeatMed-HighSafeEasy
Duplicated content across bootstrap filesMed-HighSafeEasy-Med
Full tool profile on task-specific agentsHighModerateEasy
Idle-time session not configuredMedSafeEasy
Outdated tool/plugin configs still loadedLow-MedSafeEasy

> ⚠️ Checkpoint: Show your top-3 priority items to the user.

> Confirm direction before starting optimization.

> If the highest-score items seem wrong, revisit Phase 1 measurements.


Phase 3: OPTIMIZE — Apply Categorical Techniques

> ⚠️ User confirmation gate: Techniques marked Moderate or High risk

> involve config changes, profile switches, or task merging. Before applying them,

> present the proposed change using this template and get explicit approval:

>

> ```

> ## Proposed Change

> Technique: [category/technique name]

> Target: [file/config path]

> Before: [current state, chars/tokens if measurable]

> After: [proposed state, estimated savings]

> Risk: [Moderate/High]

> Rollback: [how to undo]

> ```

>

> Techniques marked Safe can be applied directly.

Each category below contains a set of techniques. Apply them in priority

order from Phase 2 — start with the highest-score items first, regardless

of which category they fall into.

Failure Recovery

If a technique causes a problem:

  • Config change: Restore the backed-up config file and reload.
  • Cron merge broken: Restore the old separate cron job from version control

or re-create it from the original prompt.

  • Profile switch issue: Revert to "full" profile, report the missing tool.
  • Prompt compression over-aggressive: Restore from the diff backup (keep

pre-optimization prompt versions in a prompts/backup/ directory).

Category Selection Guide

Match your Phase 2 findings to the best starting category:

FindingStart With
--------------------
Verbose task prompts (background context, philosophy)A Prompt Simplicity
Heavy models on simple automation tasksB Model Tiering
Bootstrap files >2K chars each, duplicated contentC Context Slimming
Full tool profile, rarely-used tools registeredD Tool Profile Optimization
Verbose agent output, too many turns per taskE Output Discipline
No active hours, co-located tasks running separatelyF Session Lifecycle
Repeated system prompts without caching structureG Provider-Side Caching
Agent retries failed approaches instead of switchingH Behavioral Discipline
Simple/complex tasks both use premium modelJ Intelligent Model Routing

Category Decision Tree

If you're not sure which category to start with, follow this tree from top

to bottom — the first match tells you your likely best starting category:

1. Is the main session slow or expensive?
   → Check B (tiering) and J (routing)
   → Also check D (too many tools loaded?)

2. Are cron jobs consuming more than expected?
   → Check A (prompts too wordy?), then B (wrong model?)
   → If F (same-tier jobs not batched?)

3. Is context getting cut off mid-task?
   → Check C (bootstrap too large?) → I (progressive disclosure?)
   → Then J3 (incremental delivery?)

4. Are agent outputs too verbose?
   → Check E (output discipline) → H (behavioral discipline)

5. Is the same heavy prompt repeated across tasks?
   → Check G (provider-side caching: fixed prefix first?)

6. Are you seeing the same errors repeatedly?
   → Check H2 (fail once, switch) → H4 (fix root cause)

7. Default (no obvious symptom):
   Run Phase 1 from scratch → Phase 2 will tell you where to go

> Pro tip: Start with G (Provider-Side Caching) if you use DeepSeek.

> Cache pricing is 0.83% of uncached — fixing prefix structure alone

> can cut token costs by 90%+.

A. Prompt Simplicity

TechniqueDescriptionRisk
------------------------------

| A1 Strip preamble | Remove background/rationale paragraphs from task prompts. Keep only: trigger, action, output format.

Before: "你是系统监控助手。每天检查服务器状态:CPU使用率>80%告警、内存>90%告警、磁盘>85%告警、SSL证书<30天告警。每个告警按严重程度分别处理:严重→立即通知值班、一般→发运维邮件、提示→记录日志。"

After: "系统监控。检查:CPU(>80%) Mem(>90%) Disk(>85%) SSL(<30d)。告警:严重→立即、一般→邮件、提示→日志。" (360→110 chars, -69%) | Safe |

A2 Bullet points > proseReplace multi-sentence descriptions with keyword checklists.Safe
A3 Constrain outputAdd "Answer concisely in ≤3 lines" or equivalent to reduce generated tokens.Safe
A4 Remove redundancyDelete "What NOT to do" sections — proper instructions make negatives implicit.Safe
A5 Reference > inlineReplace full instructions for sub-tasks with file references ("See X.md") when the referenced file is always loaded.Safe

B. Model Tiering

TechniqueDescriptionRisk
------------------------------
B1 Right-size each taskMap every automated task to the cheapest model that can do it adequately. Test borderline cases.Safe
B2 Define tier boundariesDocument which model(s) belong to each tier so new tasks are assigned correctly.Safe
B3 Batch same-tier runsSchedule same-tier tasks back-to-back to reuse the same session (single context load).Moderate

C. Context Slimming

TechniqueDescriptionRisk
------------------------------
C1 Measure every boot fileList all files loaded at session start and identify those > 2K chars for potential trimming.Safe
C2 Cross-reference dedupWhen the same content appears in 2+ files (e.g. "Core Principles" in SOUL.md and IDENTITY.md), keep it in one authoritative file and replace the others with a 详见 reference.Safe
C3 Archive aged-out contentMove old diary entries, superseded milestones, and historical promoted entries to a dedicated archive directory.Safe

| C4 Trim to one-liner | Convert verbose descriptions to single-line summaries.

Before: "This project's coding conventions were established after three code reviews revealed inconsistent patterns: use 2-space indent for HTML/CSS, 4-space for Python, tabs for Go. Prefix private methods with underscore. No Hungarian notation. Import order: stdlib, third-party, local."

After: "Coding conventions (see CONTRIBUTING.md) — 6 rules, numbered."

Actionable instructions stay; background context goes. | Safe |

D. Tool Profile Optimization

TechniqueDescriptionRisk
------------------------------
D1 Size your tool schemaCount all registered tools and estimate total schema chars. This is typically the single largest per-turn overhead.Safe (measure only)
D2 Switch profile per agentUse "coding" profile for sub-agents/cron jobs (excludes browser, canvas, media generation, feishu tools). Use "full" only where those tools are actually needed.Moderate (test on sub-agents first)
D3 Disable unused toolsIf you have disabled skills or orphaned plugin tools still registering schemas, disable or remove them from the registry. Check skills.entries and plugins.load.paths.Safe
D4 Create custom profileIf neither "full" nor "coding" fits, define a custom profile with exactly the 15-25 tools your use-case needs. Requires config reload.High

E. Output Discipline

TechniqueDescriptionRisk
------------------------------
E1 No operation narrationRemove "I'll...", "Let me check..." patterns. Do the action directly.Safe (behavioral)
E2 Lead with conclusionPut the answer first. Add explanation only when needed.Safe (behavioral)
E3 Batch turnsRead → plan → apply all changes in as few turns as possible, instead of read→think→edit→think→verify per-item. Each extra turn adds LCM context overhead.Safe (behavioral)
E4 Sub-agent concisenessWhen spawning sub-agents, specify a concise return format. Their full output is injected into context if returned.Safe

F. Session Lifecycle

TechniqueDescriptionRisk
------------------------------
F1 Set active hoursConfigure heartbeat.activeHours so no work runs during idle time (overnight, weekends).Safe
F2 Isolated sessionsSet heartbeat.isolatedSession: true so periodic checks don't accumulate in the main session.Safe
F3 Light contextSet heartbeat.lightContext: true to skip loading all bootstrap files — only HEARTBEAT.md is injected.Safe
F4 Merge co-located tasksIf two cron jobs run within minutes of each other (e.g. both at 23:xx), merge them into one session with a combined prompt. Copy both prompts into one job's message field separated by a blank line, then remove the later job. Saves one full startup context per day.Moderate
F5 Merge exampleBefore: Job A at 23:00 (System health check), Job B at 23:10 (Log cleanup). After: Single job at 23:00 with prompt "Do A then B.~A: ...~B: ..."Moderate
F6 Configure queueIf the platform supports message queue settings (debounce, collect), tune them to prevent rapid-turn accumulation during tool execution.Safe

G. Provider-Side Caching

> Impact is 10× any other category. DeepSeek V4 Pro cached price is 0.83% of

> uncached. Cache hit rates of 91-96% are achievable with proper prompt structure.

TechniqueDescriptionRisk
------------------------------

| G1 Fixed prefix first | Design all prompts as [static prefix] + [dynamic suffix]. Static prefix includes system instructions, bootstrap summary, and tool schemas. Dynamic suffix includes runtime instruction. This maximizes KV cache hits on the provider side.

Wrong: "Analyze this code for memory leaks...你是代码审查助手,审查规则如下:..."

Right: "你是代码审查助手,审查规则如下:...现在分析这段代码的内存泄漏:..." | Safe |

G2 Session contiguityDon't insert unrelated messages between consecutive calls to the same model — this breaks the KV cache prefix. Batch related calls into a single turn instead.Safe
G3 Monitor cache rateCheck provider dashboards for cache hit rate. If <80%, your prefix structure likely has variability. Fix it.Safe
G4 Route to best caching providerDifferent providers have wildly different cached prices. DeepSeek V4 Pro: 0.83% of uncached. MiniMax: ~20%. Route routine tasks to the provider with the best cache economics.Moderate

H. Behavioral Discipline

> These are zero-config, zero-cost techniques. The savings come from how you use

> the system, not how it's configured.

TechniqueDescriptionRisk
------------------------------

| H1 Default to working path | Use known-working tools before alternatives. Don't retry tools known to be broken in the current deployment — each retry is a wasted tool call + error response.

Bad: web_search (broken) → error → web_search again → error → baidu-search → works

Good: baidu-search → works (first attempt) | Safe |

H2 Fail once, switchIf a method fails, switch immediately to a known alternative. Don't retry the same approach with slightly different parameters. Each retry costs full tool-call tokens.Safe
H3 Batch > PollGather all data before acting instead of incrementally. One exec or read call that returns 10 results costs less than 5 separate calls returning 2 each.Safe
H4 Fix root causeIf a tool works inconsistently due to a known config issue (API key expired, wrong provider), fix the config. Working around it each time costs more in accumulated failed calls.Safe

I. Context Engineering (2026-05-12 新增)

> Context Engineering 是 2026 年从 Prompt Engineering 演进出的上层方法论。

> 核心原则:渐进式披露 (Progressive Disclosure) — 仅在任务需要时加载特定模块。

> 北京大学论文《Meta Context Engineering via Agentic Skill Evolution》实测:

> Token 消耗降低 60%,任务成功率提升 45%。

TechniqueDescriptionRisk
------------------------------

| I1 Progressive disclosure | 按任务类型分级加载系统能力描述。不需要的技能说明不加载。与 token-saver 的 C1-C4 互补:C 系列负责文件级裁剪,I1 负责能力级裁剪。

Wrong: Agent 启动时加载全部技能描述

Right: Agent 启动时按任务类型加载对应技能(搜索类任务只加载搜索相关技能描述) | Safe |

| I2 Semantic context scoring | 用语义重要性评分代替固定滑动窗口。计算历史交互与当前 query 的语义相似度,只保留 top-3 + 最近 1 轮。

推荐实现:Sentence-BERT 计算余弦相似度,按得分排序裁剪到 token budget。

参考开源项目:Agent Skills for Context Engineering | Safe |

| I3 Fixed prefix pattern | 设计所有 prompt 为「static prefix + dynamic suffix」。静态前缀包括系统指令、bootstrap 摘要、工具 schema 描述。动态后缀包括本次运行指令。最大化 provider 端 KV cache 命中。

Wrong: 每次运行时先写任务指令再补系统描述

Right: 系统描述在前,任务指令在后 | Safe |

| I4 Tool call circuit breaking | 在调用高成本外部工具(SQL 查询、图像生成、文件写入)前增加预检代理。校验参数合法性 + 资源预估。超限或非法直接拒绝,不触发实际调用。

适用场景:web_fetch 超长 URL、exec 高风险命令、文件写入大文件。 | Safe |

| I5 Cost-aware context pruning | 按 taskType 和 budget tokens 动态决定上下文精度。简单任务用精简 prompt,复杂任务才加载完整能力描述。

实现:在 cron payload 中嵌入 task_type 字段,根据该字段调度上下文模板。 | Safe |

J. Intelligent Model Routing (2026-05-14 新增)

> 基于 OpenSquilla(开源 Token 优化引擎)方案。

> Core insight: 路由决策本身不消耗 Token — 在本地判定任务复杂度后决定走哪个模型。

> 与 B 系列(Model Tiering)的区别:B 说"每个任务应该固定分配一个 tier",

> J 说"同一个任务的每一次调用动态判断走哪个 tier"。

TechniqueDescriptionRisk
------------------------------

| J1 Dynamic inbound grading | 对每次入站请求做轻量复杂度判定(prompt length + expected reasoning depth + tool count),自动路由到对应 tier 的模型。

判定规则示例:

  • 🟢 Economy: 简单信息查询(token 消耗 < 1K,无推理要求)→ sensenova-6.7-flash-lite / minimax-m2.7
  • 🟡 Standard: 一般分析任务(1K-5K,轻度推理)→ deepseek-v4-flash
  • 🏆 Premium: 深度推理(>5K,复杂推理/多工具)→ deepseek-v4-pro

实现:在 cron/子代理 payload 中嵌入 model 字段,或使用中间路由代理做分类。

Before: 所有每日 cron 统一用 deepseek-v4-pro,包括简单版本检查

After: 版本检查→sensenova-6.7-flash-lite,领域探针→deepseek-v4-flash,

深度分析→deepseek-v4-pro。按任务特征分级,非一刀切。 | Safe |

| J2 Local routing decision | 路由判定在本地完成(不需要发 request 到远程判断路由)。

方案:在任务 payload 中预置 routing_hint 字段(economy/standard/premium),

由编排层基于任务特征(prompt length、任务类型、预期工具调用数)决定。

Bad: 每次调用先发一个轻量请求问"这个走哪个模型?"(浪费 Token)

Good: 任务描述本身包含路由线索,编排层本地匹配 | Safe |

| J3 Incremental context delivery | OpenSquilla 核心方案之一:先发送最小必要上下文,不足时增量补充。

与 C 系列(Context Slimming)区别:C 是文件级裁剪(减少启动加载量),

J3 是调用级递进(先发主干,模型需要时再发细节)。

实现:prompt 中先给出高度压缩的摘要版本,后面用"如需详情请见"标记展开点。

结合 OpenClaw 的 lossless-claw 机制:LCM 摘要本身已经是一个渐进式实现。 | Safe |

| J4 Cache-aware routing | 统计各 provider 的 KV cache 命中率,优先路由到命中率高的 provider。

当前已知:DeepSeek V4 Pro cache price = 0.83% of uncached。

如果命中了缓存的 prompt prefix(含 bootstrap + 技能描述),成本降低 99%+。

实现:在路由时优先选择上次使用同一模型+同一 prefix 结构的 provider。 | Moderate |

| J5 Fallback chain | 当分配的路由模型不可用(API 超时/限流/报错),自动降级到备用模型。

与 H2(Fail once, switch)的区别:H2 是行为纪律(失败了就换方法),

J5 是配置级的自动 failover 链。

示例 Fallback Chain:

1️⃣ deepseek-v4-pro → 2️⃣ deepseek-v4-flash → 3️⃣ sensenova-6.7-flash-lite

实现:在 cron/子代理 payload 中使用 fallbacks 字段配置降级顺序。

Before: cron job 用 minimax-m2.7,卡死后无降级 → 任务永远失败

After: 同一 cron 配置 fallbacks: ["deepseek-v4-flash", "deepseek-v4-pro"]

minimax 卡死后自动切到 deepseek-v4-flash 执行 | Safe |

> 权衡:智能路由的收益上限取决于任务复杂度分布。如果 80% 的任务是简单查询,

> 智能路由的 Token 节省可达 60-80%。如果大部分任务已经是标准/经济 tier,

> 额外收益有限。建议做一次任务复杂度分布扫描后再选择启用哪些 J 技术。


Phase 4: VALIDATE — Confirm Results

4A Prompt Length Delta

Before/after comparison of all modified prompts and files. Include total

chars and estimated tokens saved.

4B Config Integrity

After editing JSON configuration files, validate:

python3 -c "import json; json.load(open('<config-path>')); print('OK')"

4C Functional Test

  • Verify cron tasks still start correctly (check cron action=runs or next

scheduled trigger)

  • Verify heartbeat runs in configured active window
  • Read through compressed cron prompts to ensure key instructions survive

4D Generate Report

Write token-audit-report-YYYY-MM-DD.md summarizing:

  • Changes made and per-change token savings
  • Total estimated weekly token reduction
  • Items deferred and why
  • Recommended next optimization

Log each optimization cycle in results.tsv (see skill directory for

format reference). This creates an audit trail for the quarterly deep audit (5B).


Phase 5: MONITOR — Guard Against Regrowth

5A Periodic Token Watch (Optional)

Optionally create a weekly cron (cheapest available model) that checks

prompt lengths haven't crept back:

{
  "name": "token-watch-weekly",
  "schedule": { "kind": "cron", "expr": "0 10 * * 1", "tz": "Asia/Shanghai" },
  "payload": {
    "kind": "agentTurn",
    "model": "<cheapest-model>",
    "message": "Check all cron prompt lengths. Flag any that grew >20% since last baseline.",
    "timeoutSeconds": 120
  },
  "sessionTarget": "isolated",
  "delivery": { "mode": "none" }
}

5B Quarterly Deep Audit

Run the full Phase 1-4 cycle every quarter using the cheapest available

model. Compare results against previous reports to spot regrowth trends.

The quarterly audit MUST include:

  • Compare each cron's prompt length against last audit baseline
  • Check if unused categories crept back (complacency regrowth)
  • Verify all J routing hints still reflect actual task complexity
  • Re-run the Priority Matrix from Phase 2 to catch new high-impact items

5C Real-Time Drift Alerts (Optional)

When token consumption suddenly spikes, it's usually one of three causes.

Know which one:

SymptomLikely CauseFix
---------------------------
A specific cron doubled in token countPrompt crept back (someone added preamble)A1 Strip preamble
All crons increased proportionallyModel changed (e.g. dev switched back to pro from flash)B1 Right-size, check default model
Session-level spikes, not cronTool profile expanded / new plugin registeredD1 Size tool schemas, check profile
Intermittent spikesPrefix changed → KV cache missedG1 Fixed prefix, check for variation

If you have access to provider dashboards:

  • Monitor cache hit rate — DeepSeek dashboard shows prefix cache stats
  • If hit rate drops below 80% -> inspect recent prompt changes for prefix variation
  • Track per-model token consumption weekly — a 2× week-over-week jump is a red flag

Safety Boundaries

Configs That Need Gateway Restart

Some configuration paths require a gateway restart to take effect:

  • agents.defaults.heartbeat.* (edit config file + restart)
  • agents.list[].tools.profile
  • gateway., auth.
  • plugins.* — certain sub-fields

What NOT to Compress

These core mechanisms must be preserved even in an aggressive token budget:

  • Error detection logic (consecutive errors, failure alerts)
  • Essential signal handling (high-priority alerts → auto-escalation)
  • Drift detection for recurring tasks

External References

  • OpenClaw Cron Jobs: https://docs.openclaw.ai/automation/cron.md
  • OpenClaw Standing Orders: https://docs.openclaw.ai/automation/standing-orders.md
  • OpenClaw Gateway Config: openclaw gateway config CLI
  • OpenClaw Agent Profiles: agents.list[].tools.profile in openclaw.json
  • Test Prompts: test-prompts.json in skill directory
  • Results Log: results.tsv in skill directory

Appendix: Local Deployment Configuration

This section is populated by the first execution of the Token Saver in a

specific deployment. Replace the example values below with real ones.

Configuration Paths

ItemExample Path
-------------------
Cron jobs~/.openclaw/cron/jobs.json
Gateway config~/.openclaw/openclaw.json
Workspace root~/.openclaw/workspace/
Bootstrap filesAGENTS.md, SOUL.md, USER.md, MEMORY.md, HEARTBEAT.md, IDENTITY.md, TOOLS.md, STANDING-ORDERS.md

Baseline Measurements (example: Wave 2026-05-04)

FileInitial SizeAfter First PassReductionTechniques Used
----------------------------------------------------------------
SOUL.md7,0343,521-50%C2 (cross-ref), C4 (one-liner), A2
STANDING-ORDERS.md10,9603,816-65%C2 (cross-ref), A4 (remove redundancy)
IDENTITY.md6,2284,313-31%C2 (dedup with SOUL.md), C4
AGENTS.md5,0722,691-47%C2 (ref to STANDING-ORDERS), C4
TOOLS.md8,8937,488-16%C4 (remove stale entries)
MEMORY.md30,22426,420-13%C3 (archive promoted entries)
Total68,41148,249-29%

Per-session token savings from bootstrap compression: ~6,720 tokens.

Benchmark: Compression by File Type

File TypeTypical SavingsBest Technique
-------------------------------------------
Program/Protocol (STANDING-ORDERS.md)55-65%A4 (remove boilerplate sections)
Guide/Identity (SOUL.md, IDENTITY.md)30-50%C2 (cross-reference dedup)
Instructions (AGENTS.md)40-50%C2 (replace lists with file refs)
Knowledge base (MEMORY.md)10-20%C3 (archive old entries only)
Config/state table (TOOLS.md)10-20%C4 (remove stale entries only)

Task-to-Model Map

TaskModel TierModel
------------------------
Version checkEconomyminimax-m2.7
Demand scanningStandarddeepseek-v4-pro (needs search)
Domain probeEconomyminimax-m2.7
Dreaming (memory integration)Economyminimax-m2.7
Doc maintenanceEconomyminimax-m2.7
WaveCap daily expansionStandarddeepseek-v4-pro (needs reasoning)
Weekly reviewPremiumdeepseek-v4-pro
Friday topic selectionPremiumdeepseek-v4-pro
Main sessionStandarddeepseek-v4-flash

Deferred Items

ItemReasonCondition to Revisit
-----------------------------------
Tool profile for main agentHigh risk (may break unexpected features)After sub-agent coding profile proven in production for 1 week
Cron task mergingNeeds user confirmation; may affect reliabilityNext token audit cycle
Compaction mode change (safeguard→normal)Needs config reloadWhen gateway restarted for other reasons

Deployment-Specific Constraints

  • Network: GFW blocks chatgpt.com, api.openai.com. All OpenAI/Codex models unavailable.
  • Models available: deepseek-v4-pro (premium), deepseek-v4-flash (standard),

minimax-m2.7 (economy).

  • File paths: Standard OpenClaw paths under ~/.openclaw/.
  • Git: Workspace is a git repository; all changes version-controlled.

版本历史

共 2 个版本

  • v2.1.0 当前
    2026-05-21 13:40 安全 安全
  • v2.0.0
    2026-05-08 03:30 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

dev-programming

Wave Project Evaluator

youxiyin
对项目进行结构化评估与优化,借鉴达尔文技能的质量方法论。v0.2.0(2026-05-14)新增过程级评估维度(借鉴 AgentLens 方法论)。触发词:评估项目、项目健康检查、project review、项目评分、优化项目、项目质量检
★ 0 📥 358
professional

Stock Market Pro

kys42
Yahoo Finance (yfinance) 驱动的股票分析技能:行情报价、基本面、ASCII 趋势图、高分辨率图表(RSI/MACD/BB/VWAP/ATR),以及可选的网络...
★ 164 📥 40,312
professional

A股量化 AkShare

mbpz
A股量化数据分析工具,基于AkShare库获取A股行情、财务数据、板块信息等。用于回答关于A股股票查询、行情数据、财务分析、选股等问题。
★ 196 📥 63,685