
Token Saver 75+

Automatically classifies requests to optimize cost by routing to the cheapest capable model and applies maximum output compression for 75%+ token savings.
mariovallereyes
Productivity · clawhub · v1.0.0 · 1 version · 99933.5 · Key: not required
★ 0 stars · 📥 1,502 downloads · 💾 43 installs · #latest

Overview

Token Saver 75+ with Model Routing

Core Principle

Understand fully, execute cheaply. The orchestrator must fully understand the task before routing. Never sacrifice comprehension for speed.

Request Classifier (silent, every message)

| Tier | Pattern | Orchestrator | Executor |
|------|---------|--------------|----------|
| T1 | yes/no, status, trivial facts, quick lookups | Handle alone | |
| T2 | summaries, how-to, lists, bulk processing, formatting | Handle alone OR spawn Groq | Groq (FREE) |
| T3 | debugging, multi-step, code generation, structured analysis | Orchestrate + spawn | Codex for code, Groq for bulk |
| T4 | strategy, complex decisions, multi-agent coordination, creative | Spawn Opus | Opus orchestrates, spawns Codex/Groq from within |
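
The tier table can be pictured as a small lookup. The skill asks the orchestrator itself to classify each message, so the keyword matcher below is only an illustrative sketch: the hint words in `TIER_HINTS` and the function name `classify` are assumptions, not part of the skill.

```python
# Hypothetical keyword hints per tier, checked from most to least demanding.
# A real orchestrator would classify with its own judgment, not substrings.
TIER_HINTS = [
    ("T4", ("strategy", "decide", "coordinate", "brainstorm")),
    ("T3", ("debug", "refactor", "write code", "analyze", "step by step")),
    ("T2", ("summarize", "how to", "list", "format", "rewrite")),
]


def classify(request: str) -> str:
    """Return the first tier whose hints appear in the request, else T1."""
    text = request.lower()
    for tier, hints in TIER_HINTS:
        if any(hint in text for hint in hints):
            return tier
    return "T1"  # yes/no, status, trivial facts, quick lookups
```

Checking the most demanding tier first means a request that mixes, say, a summary with a strategy question is routed upward rather than handled too cheaply.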

Model Routing Table

| Model | Use For | Cost | Spawn with |
|-------|---------|------|------------|
| groq/llama-3.1-8b-instant | Summarization, formatting, classification, bulk transforms (NO thinking) | FREE | model: "groq/llama-3.1-8b-instant" |
| openai/gpt-5.3-codex | ALL code generation, code review, refactoring | $$$ | model: "openai/gpt-5.3-codex" |
| openai/gpt-5.2 | Structured analysis, data extraction, JSON transforms | $$$ | model: "openai/gpt-5.2" |
| anthropic/claude-opus-4-6 | Strategy, complex orchestration, failure recovery (T4 only) | $$$$ | model: "anthropic/claude-opus-4-6" |
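
One way to read the routing table is "pick the cheapest model that lists the needed capability". The sketch below encodes that reading. The model IDs come from the table; the capability tags, the numeric cost ranks (FREE = 0, one point per `$`), and the name `cheapest_capable` are assumptions made for illustration.

```python
# (model id, cost rank, capability tags). Tags and ranks are hypothetical
# encodings of the "Use For" and "Cost" columns.
MODELS = [
    ("groq/llama-3.1-8b-instant", 0, {"summarize", "format", "classify", "bulk"}),
    ("openai/gpt-5.3-codex", 3, {"code", "review", "refactor"}),
    ("openai/gpt-5.2", 3, {"analysis", "extract", "json"}),
    ("anthropic/claude-opus-4-6", 4, {"strategy", "orchestrate", "recover"}),
]


def cheapest_capable(need: str) -> str:
    """Return the lowest-cost model whose tags include `need`."""
    candidates = [(cost, model) for model, cost, caps in MODELS if need in caps]
    if not candidates:
        raise ValueError(f"no model listed for {need!r}")
    return min(candidates)[1]
```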

Routing via sessions_spawn

When to spawn (MANDATORY)

  • Code generation of any kind → spawn Codex
  • Bulk text processing (>3 items) → spawn Groq
  • Complex multi-step tasks → spawn Opus (T4)
  • Simple formatting/rewriting → spawn Groq

When NOT to spawn

  • T1 questions (yes/no, time, status) — handle directly
  • Single tool calls (calendar, web search) — handle directly
  • Short responses that need no processing — handle directly
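
The spawn and no-spawn rules above can be collapsed into one decision function. This is a minimal sketch: the `kind` labels (`"code"`, `"complex"`, `"format"`, `"bulk_text"`) and the function name are hypothetical, and anything not matched is treated as "handle directly".

```python
from typing import Optional


def spawn_target(kind: str, items: int = 1) -> Optional[str]:
    """Return which executor to spawn, or None to handle the request directly."""
    if kind == "code":
        return "codex"  # code generation of any kind
    if kind == "complex":
        return "opus"  # complex multi-step tasks (T4)
    if kind == "format" or (kind == "bulk_text" and items > 3):
        return "groq"  # simple formatting, or bulk text with more than 3 items
    return None  # T1 questions, single tool calls, short responses
```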

Spawn patterns

Groq (free bulk work):

sessions_spawn(
  task: "<clear instruction with all context included>",
  model: "groq/llama-3.1-8b-instant"
)

Codex (all code):

sessions_spawn(
  task: "Write <language> code that <detailed spec>. Include comments. Output the complete file.",
  model: "openai/gpt-5.3-codex"
)

Opus (T4 strategy):

sessions_spawn(
  task: "<full context + goal>. You have full tool access. Use sessions_spawn with Codex for code and Groq for bulk subtasks.",
  model: "anthropic/claude-opus-4-6"
)

Critical spawn rules

  1. Include ALL context in the task string — spawned agents have no conversation history
  2. Be specific — vague tasks waste tokens on clarification
  3. One task per spawn — don't bundle unrelated work
  4. For code: always use Codex — never write code yourself

Output Compression (applies to ALL tiers, ALL models)

Templates

  • STATUS: OK/WARN/FAIL one-liner
  • CHOICE: A vs B → Recommend: X (1 line why)
  • CAUSE→FIX→VERIFY: 3 bullets max
  • RESULT: data/output directly, no wrap-up

Rules

  • No filler. No restating the question. Lead with the answer.
  • Bullets/tables/code > prose.
  • Do not narrate routine tool calls.
  • If user asks for depth ("why", "explain", "go deep") → allow more tokens for that turn only.

Budget by tier

| Tier | Max output |
|------|------------|
| T1 | 1-3 lines |
| T2 | 5-15 bullets |
| T3 | Structured sections, <400 words |
| T4 | Longer allowed, still dense |
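
When testing the skill, a draft reply could be checked against these budgets mechanically. The sketch below is one interpretation: it counts non-empty lines for T1, bullet lines for T2, and whitespace-separated words for T3, and only enforces the upper limits. The counting rules and the name `within_budget` are assumptions.

```python
def within_budget(tier: str, text: str) -> bool:
    """Check a reply against the per-tier upper limit (assumed counting rules)."""
    lines = [line for line in text.splitlines() if line.strip()]
    if tier == "T1":
        return len(lines) <= 3
    if tier == "T2":
        bullets = [ln for ln in lines if ln.lstrip().startswith(("-", "*", "•"))]
        return len(bullets) <= 15
    if tier == "T3":
        return len(text.split()) < 400
    if tier == "T4":
        return True  # "longer allowed"; density is not measurable here
    raise ValueError(f"unknown tier: {tier!r}")
```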

Tool Gating (before ANY tool call)

  1. Already known? → No tool.
  2. Batchable? → Parallelize.
  3. Can a spawned Groq handle it? → Spawn instead of doing it yourself.
  4. Cheapest path? → memory_search > partial read > full read > web.
  5. Needed? → Do not fetch "just in case."
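
The gating checklist amounts to "skip the tool if possible, otherwise take the cheapest source that is available". A minimal sketch, assuming the four sources are identified by the labels in `LADDER` (only `memory_search` is a name taken from the text; the others and `gate` itself are hypothetical):

```python
from typing import Optional

# Cheapest first, following: memory_search > partial read > full read > web.
LADDER = ["memory_search", "partial_read", "full_read", "web"]


def gate(already_known: bool, needed: bool, available: set[str]) -> Optional[str]:
    """Return the cheapest available source, or None when no tool call is justified."""
    if already_known or not needed:
        return None  # steps 1 and 5: no tool, no "just in case" fetches
    for source in LADDER:
        if source in available:
            return source
    return None
```

Batching (step 2) and delegating to a spawned Groq (step 3) are left out of the sketch because they depend on the runtime rather than on a simple ordering.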

Failure Protocol

  • If Groq spawn fails → retry with GPT-5.2
  • If Codex spawn fails → retry with GPT-5.2
  • If orchestrator can't handle T3 → spawn Opus (escalate to T4)
  • Never retry the same model; escalate instead.
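
The first two fallback rules can be sketched as a lookup that is followed on failure, so the same model is never tried twice. Here `spawn` is a stand-in callable for `sessions_spawn`, and the assumption that a failed spawn raises `RuntimeError` is mine; the source does not say what happens if GPT-5.2 also fails, so the sketch simply stops there.

```python
from typing import Callable, Optional

# Fallback routes taken from the failure protocol.
FALLBACK = {
    "groq/llama-3.1-8b-instant": "openai/gpt-5.2",
    "openai/gpt-5.3-codex": "openai/gpt-5.2",
}


def spawn_with_escalation(task: str, model: str, spawn: Callable[..., str]) -> str:
    """Try `model`, then its fallback; never retry a model that already failed."""
    tried: list[str] = []
    current: Optional[str] = model
    while current is not None:
        tried.append(current)
        try:
            return spawn(task=task, model=current)
        except RuntimeError:  # assumed failure signal of the stand-in spawn
            current = FALLBACK.get(current)
    raise RuntimeError(f"all routes failed: {tried}")
```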

Measurement (when asked or during testing)

Append: [~X tokens | Tier: Tn | Route: model(s) used]
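
For reference, the measurement footer could be produced by a one-line formatter. The function name and argument types are assumptions; the bracketed layout follows the template above.

```python
def footer(tokens: int, tier: str, models: list[str]) -> str:
    """Format the measurement line appended during testing."""
    return f"[~{tokens} tokens | Tier: {tier} | Route: {', '.join(models)}]"
```

For example, `footer(120, "T2", ["groq/llama-3.1-8b-instant"])` yields `[~120 tokens | Tier: T2 | Route: groq/llama-3.1-8b-instant]`.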

Version history

1 version in total

  • v1.0.0 (current)
    2026-03-29 05:53 · Safe · Safe

Security check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
