Smart Model Routing for Z.AI

Auto-route each task to the cheapest z.ai (GLM) model that handles it correctly, using a three-tier progression: Flash → Standard → Plus/32B. Classify before responding.

FLASH (default): factual Q&A, greetings, reminders, status checks, lookups, simple file ops, heartbeats, casual chat, 1–2 sentence tasks, cron jobs.

ESCALATE TO STANDARD: code >10 lines, analysis, comparisons, planning, reports, multi-step reasoning, tables, long writing >3 paragraphs, summarization, research synthesis, most user conversations.

ESCALATE TO PLUS/32B: architecture decisions, complex debugging, multi-file refactoring, strategic planning, nuanced judgment, deep research, critical production decisions.

Rules: escalate if the task needs more than 30 seconds of focused thinking; move to Plus/32B if Standard struggles with the complexity. Start cheap and escalate only as needed to cut API costs substantially.
princnl
Data Analysis · clawhub · v1.0.0 · 1 version · 99940.9 · Key: not required
★ 1 star · 📥 1,672 downloads · 💾 61 installs · #latest

Overview

Smart Model Switching

Three-tier z.ai (GLM) routing: Flash → Standard → Plus / 32B

Start with the cheapest model. Escalate only when needed. Designed to minimize API cost without sacrificing correctness.


The Golden Rule

> If a human would need more than 30 seconds of focused thinking, escalate from Flash to Standard.

> If the task involves architecture, complex tradeoffs, or deep reasoning, escalate to Plus / 32B.
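
The three-tier progression behind these rules can be pictured as a ladder that only moves upward. The following is a minimal Python sketch; the names `TIERS` and `escalate` are hypothetical and not part of any z.ai API.

```python
# Tiers ordered from cheapest to most capable, as described in this document.
TIERS = ["FLASH", "STANDARD", "PLUS / 32B"]


def escalate(current: str) -> str:
    """Return the next tier up, or the same tier if already at the top."""
    index = TIERS.index(current)
    return TIERS[min(index + 1, len(TIERS) - 1)]
```

A caller would start every task at `TIERS[0]` and call `escalate` only when the current tier proves insufficient.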


Model Reality (Relative)

| Tier       | Example Models               | Purpose                        |
|------------|------------------------------|--------------------------------|
| Flash      | GLM-4.5-Flash, GLM-4.7-Flash | Fastest & cheapest             |
| Standard   | GLM-4.6, GLM-4.7             | Strong reasoning & code        |
| Plus / 32B | GLM-4-Plus, GLM-4-32B-128K   | Heavy reasoning & architecture |
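
For illustration, the tiers and example models listed here could be held in a small lookup. This is a sketch only: `Tier`, `TIER_MODELS`, and `default_model` are hypothetical names, the model IDs are copied from this section, and returning the first listed model per tier is an arbitrary assumption.

```python
from enum import Enum


class Tier(Enum):
    FLASH = "flash"
    STANDARD = "standard"
    PLUS = "plus"


# Example model IDs copied from this section (illustrative, not exhaustive).
TIER_MODELS = {
    Tier.FLASH: ["GLM-4.5-Flash", "GLM-4.7-Flash"],
    Tier.STANDARD: ["GLM-4.6", "GLM-4.7"],
    Tier.PLUS: ["GLM-4-Plus", "GLM-4-32B-128K"],
}


def default_model(tier: Tier) -> str:
    """Return the first example model listed for a tier (an arbitrary choice)."""
    return TIER_MODELS[tier][0]
```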

Bottom line: Wrong model selection wastes money OR time. Flash for simple, Standard for normal work, Plus/32B for complex decisions.


💚 FLASH — Default for Simple Tasks

Stay on Flash for:

  • Factual Q&A — “what is X”, “who is Y”, “when did Z”
  • Quick lookups — definitions, unit conversions, short translations
  • Status checks — monitoring, file reads, session state
  • Heartbeats — periodic checks, OK responses
  • Memory & reminders
  • Casual conversation — greetings, acknowledgments
  • Simple file ops — read, list, basic writes
  • One-liner tasks — anything answerable in 1–2 sentences
  • Cron jobs (always Flash by default)

NEVER do these on Flash

  • ❌ Write code longer than 10 lines
  • ❌ Create comparison tables
  • ❌ Write more than 3 paragraphs
  • ❌ Do multi-step analysis
  • ❌ Write reports or proposals
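
The measurable limits in this list (code length, tables, paragraph count) could be checked mechanically on a draft response. The Python sketch below assumes drafts are Markdown-like text with fenced code and pipe tables. The function name `exceeds_flash_limits` and the parsing heuristics are hypothetical. It does not try to detect multi-step analysis or reports, which need judgment rather than counting.

```python
import re

MAX_FLASH_CODE_LINES = 10
MAX_FLASH_PARAGRAPHS = 3
FENCE = "`" * 3  # Markdown code fence marker


def exceeds_flash_limits(draft: str) -> bool:
    """Return True if a draft breaks the code-length, table, or paragraph limit."""
    code_lines = 0
    prose_lines = []
    in_code = False
    for line in draft.splitlines():
        if line.strip().startswith(FENCE):
            in_code = not in_code  # toggle on opening and closing fences
            continue
        if in_code:
            if line.strip():
                code_lines += 1  # count non-blank code lines only
        else:
            prose_lines.append(line)
    prose = "\n".join(prose_lines)
    paragraphs = [p for p in re.split(r"\n\s*\n", prose) if p.strip()]
    has_table = any(re.match(r"^\s*\|.*\|\s*$", line) for line in prose_lines)
    return (
        code_lines > MAX_FLASH_CODE_LINES
        or len(paragraphs) > MAX_FLASH_PARAGRAPHS
        or has_table
    )
```

A router could run such a check on a Flash draft and hand the task to Standard when it returns `True`.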

💛 STANDARD — Core Workhorse

Escalate to Standard for:

Code & Technical

  • Code generation — functions, scripts, features
  • Debugging — normal bug investigation
  • Code review — PRs, refactors
  • Documentation — README, comments, guides

Analysis & Planning

  • Comparisons and evaluations
  • Planning — roadmaps, task breakdowns
  • Research synthesis
  • Multi-step reasoning

Writing & Content

  • Long-form writing (>3 paragraphs)
  • Summaries of long documents
  • Structured output — tables, outlines

Most real user conversations belong here.


❤️ PLUS / 32B — Complex Reasoning Only

Escalate to Plus / 32B for:

Architecture & Design

  • System and service architecture
  • Database schema design
  • Distributed or multi-tenant systems
  • Major refactors across multiple files

Deep Analysis

  • Complex debugging (race conditions, subtle bugs)
  • Security reviews
  • Performance optimization strategy
  • Root cause analysis

Strategic & Judgment-Based Work

  • Strategic planning
  • Nuanced judgment and ambiguity
  • Deep or multi-source research
  • Critical production decisions

🔄 Implementation

For Subagents

// Routine monitoring
sessions_spawn(task="Check backup status", model="GLM-4.5-Flash")

// Standard code work
sessions_spawn(task="Build the REST API endpoint", model="GLM-4.7")

// Architecture decisions
sessions_spawn(task="Design the database schema for multi-tenancy", model="GLM-4-Plus")
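
The three calls above follow one pattern: pick a model from the kind of work, then pass it along with the task. A hedged Python sketch of that pattern follows. `MODEL_BY_KIND` and `spawn_args` are hypothetical helpers, and the code only builds the `task` and `model` arguments seen in the examples without calling `sessions_spawn`, whose full signature is not given here.

```python
# Hypothetical mapping from task kind to model ID, taken from the three examples above.
MODEL_BY_KIND = {
    "monitoring": "GLM-4.5-Flash",
    "code": "GLM-4.7",
    "architecture": "GLM-4-Plus",
}

# Fallback for unknown kinds (assumption: use the Standard-tier model).
FALLBACK_MODEL = "GLM-4.7"


def spawn_args(task: str, kind: str) -> dict:
    """Build the task/model arguments for a sessions_spawn-style call."""
    model = MODEL_BY_KIND.get(kind, FALLBACK_MODEL)
    return {"task": task, "model": model}
```

For example, `spawn_args("Check backup status", "monitoring")` yields the same `task` and `model` pair as the first call above.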

For Cron Jobs

{
  "payload": {
    "kind": "agentTurn",
    "model": "GLM-4.5-Flash"
  }
}

Always use Flash for cron unless the task genuinely needs reasoning.
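
One way to make the Flash default hard to forget is to generate the cron payload from a helper that requires an explicit override. This is a sketch: `cron_payload` and `CRON_DEFAULT_MODEL` are hypothetical names, and only the `kind` and `model` fields shown in this section are assumed to exist.

```python
import json
from typing import Optional

CRON_DEFAULT_MODEL = "GLM-4.5-Flash"


def cron_payload(model: Optional[str] = None) -> dict:
    """Build the cron payload; Flash is used unless a model is passed explicitly."""
    return {"payload": {"kind": "agentTurn", "model": model or CRON_DEFAULT_MODEL}}


# Serialize the default payload for a job definition.
print(json.dumps(cron_payload(), indent=2))
```

Passing a model, as in `cron_payload("GLM-4.7")`, makes the exception visible at the call site.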

📊 Quick Decision Tree

Is it a greeting, lookup, status check, or 1–2 sentence answer?
  YES → FLASH
  NO ↓

Is it code, analysis, planning, writing, or multi-step?
  YES → STANDARD
  NO ↓

Is it architecture, deep reasoning, or a critical decision?
  YES → PLUS / 32B
  NO → Default to STANDARD, escalate if struggling
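
The tree translates directly into a short function. The Python sketch below assumes the three questions have already been answered, by a person or an upstream classifier, and are passed in as flags. `TaskTraits` and `route` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TaskTraits:
    is_simple: bool = False         # greeting, lookup, status check, 1-2 sentence answer
    is_standard_work: bool = False  # code, analysis, planning, writing, multi-step
    is_complex: bool = False        # architecture, deep reasoning, critical decision


def route(traits: TaskTraits) -> str:
    """Walk the decision tree in the order the questions are asked."""
    if traits.is_simple:
        return "FLASH"
    if traits.is_standard_work:
        return "STANDARD"
    if traits.is_complex:
        return "PLUS / 32B"
    return "STANDARD"  # default; escalate later if the model struggles
```

Because the questions are asked in order, a task flagged as both standard work and complex resolves to STANDARD in this sketch. A caller who wants complex work to win could test `is_complex` before `is_standard_work`.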

📋 Quick Reference Card

┌─────────────────────────────────────────────────────────────┐
│                  SMART MODEL SWITCHING                      │
│              Flash → Standard → Plus / 32B                  │
├─────────────────────────────────────────────────────────────┤
│  💚 FLASH (cheapest)                                        │
│  • Greetings, status checks, quick lookups                  │
│  • Factual Q&A, reminders                                   │
│  • Simple file ops, 1–2 sentence answers                    │
├─────────────────────────────────────────────────────────────┤
│  💛 STANDARD (workhorse)                                    │
│  • Code > 10 lines, debugging                               │
│  • Analysis, comparisons, planning                          │
│  • Reports, long writing                                    │
├─────────────────────────────────────────────────────────────┤
│  ❤️ PLUS / 32B (complex)                                    │
│  • Architecture decisions                                   │
│  • Complex debugging, multi-file refactoring                │
│  • Strategic planning, deep research                        │
├─────────────────────────────────────────────────────────────┤
│  💡 RULE: >30 sec human thinking → escalate                 │
│  💰 START CHEAP → SCALE ONLY WHEN NEEDED                    │
└─────────────────────────────────────────────────────────────┘
Built for z.ai (GLM) setups.

Version History

1 version total

  • v1.0.0 (current)
    2026-03-28 21:42 · Safe · Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
