
Adaptive Routing

Routes LLM requests to a local model first (Ollama, LM Studio, llamafile), validates the response quality, and escalates to cloud only when the local result is unsatisfactory.
joelnishanth
AI Intelligence clawhub v1.0.0 1 version 99864.3 Key: not required

Overview

Adaptive Routing

Route requests to a local LLM first. Validate the response quality. Escalate to cloud only when the local result fails the quality check. Track every outcome in a persistent dashboard.

Quick Start

1. Check if a local LLM is running

python3 skills/adaptive-routing/scripts/check_local.py

Returns JSON: { "any_available": true, "best": { "provider": "ollama", "models": [...] } }
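
A minimal sketch of consuming that JSON from Python. The helper name pick_local_provider and the sample model list are assumptions for illustration; only the any_available, best, provider, and models keys come from the output shown above.

```python
import json

def pick_local_provider(check_output: str):
    """Return (provider, models) from check_local.py JSON, or None if nothing is running."""
    status = json.loads(check_output)
    if not status.get("any_available"):
        return None
    best = status.get("best") or {}
    return best.get("provider"), best.get("models", [])

# Sample payload; the model name is a placeholder, not real script output.
sample = '{"any_available": true, "best": {"provider": "ollama", "models": ["llama3.2"]}}'
print(pick_local_provider(sample))
```

A None result corresponds to the "no local provider" case, where the request goes to cloud.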

2. Route a request

python3 skills/adaptive-routing/scripts/route_request.py \
  --prompt "Summarize this meeting transcript" \
  --tokens 800 \
  --local-available \
  --local-provider ollama

Returns: { "decision": "local", "reason": "...", "complexity_score": -1, "complexity_threshold": 3 }

3. Execute with the chosen provider

If the decision is local, send the request to your local provider (Ollama, LM Studio, or llamafile). If it is cloud, send it to your cloud provider.

See references/local-providers.md for curl examples.

4. Validate the response

python3 skills/adaptive-routing/scripts/validate_result.py \
  --response "The meeting covered three topics..." \
  --exit-code 0

Returns: { "passed": true, "score": 1.0, "reason": "ok", "should_escalate": false }

If should_escalate is true, re-run step 3 with your cloud provider instead.

5. Log the outcome

# Local success (no escalation needed)
python3 skills/adaptive-routing/scripts/track_savings.py log \
  --kind local_success --tokens 800 --model gpt-4o

# Escalated (local failed validation, used cloud)
python3 skills/adaptive-routing/scripts/track_savings.py log \
  --kind escalated --tokens 800 --model gpt-4o
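
When logging from a Python wrapper, the same command can be assembled as an argument list. This is a sketch: build_log_command is a hypothetical helper, and it uses only the subcommand and flags shown above.

```python
from typing import List

SCRIPT = "skills/adaptive-routing/scripts/track_savings.py"

def build_log_command(kind: str, tokens: int, model: str) -> List[str]:
    """Assemble the argument list for `track_savings.py log`."""
    return [
        "python3", SCRIPT, "log",
        "--kind", kind,
        "--tokens", str(tokens),
        "--model", model,
    ]

cmd = build_log_command("local_success", 800, "gpt-4o")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems.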

6. Show the dashboard

python3 skills/adaptive-routing/scripts/dashboard.py

Full Routing Workflow

┌──────────────────────────────────────────────────────────┐
│  1. check_local.py  →  is a local provider running?      │
│                                                           │
│  2. route_request.py  →  local or cloud?                  │
│     · sensitivity check  (private data → local)          │
│     · complexity score   (high score → cloud)            │
│     · availability gate  (no local → cloud)              │
│                                                           │
│  3. Execute with local provider                          │
│                                                           │
│  4. validate_result.py  →  did the response pass?        │
│     · passed=true   → use result   (kind=local_success)  │
│     · passed=false  → re-run cloud (kind=escalated)      │
│                                                           │
│  5. track_savings.py log  →  record the outcome          │
│                                                           │
│  6. dashboard.py  →  show cumulative savings             │
└──────────────────────────────────────────────────────────┘
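
Steps 3 to 5 of the workflow can be sketched as one function. This is an illustration under assumptions, not the skill's own code: the callables local_call, cloud_call, and validate are injected placeholders, and the "cloud_direct" label is hypothetical (the source only names local_success and escalated).

```python
from typing import Callable, Tuple

def run_with_escalation(
    prompt: str,
    decision: str,
    local_call: Callable[[str], str],
    cloud_call: Callable[[str], str],
    validate: Callable[[str], bool],
) -> Tuple[str, str]:
    """Return (response, kind) following the local-first workflow."""
    if decision == "cloud":
        # Routed straight to cloud; "cloud_direct" is an assumed label.
        return cloud_call(prompt), "cloud_direct"
    response = local_call(prompt)
    if validate(response):
        return response, "local_success"
    # Local result failed validation: re-run on cloud.
    return cloud_call(prompt), "escalated"

result = run_with_escalation(
    "Summarize this meeting transcript",
    "local",
    local_call=lambda p: "",            # simulate an empty local response
    cloud_call=lambda p: "cloud answer",
    validate=lambda r: bool(r.strip()),
)
print(result)
```

The returned kind is what you would pass to track_savings.py log.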

Routing Rules (Summary)

Condition                                                               Route
-------------------------------------------------------------------------------------
No local provider available                                             ☁️ Cloud
Prompt contains sensitive data (password, secret, api key, ssn, etc.)   🏠 Local
Complexity score ≥ threshold (default 3)                                ☁️ Cloud
Complexity score < threshold                                            🏠 Local
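
The rules above can be sketched as a small function. Two things here are assumptions: the rules are applied in the order listed, and the keyword tuple contains only the four terms named in the table (the real list is longer, per the "etc."). How complexity_score is computed is covered in references/routing-logic.md and is taken as an input here.

```python
# Only the terms named in the summary table; the real list may include more.
SENSITIVE_TERMS = ("password", "secret", "api key", "ssn")

def decide_route(prompt: str, complexity_score: int,
                 local_available: bool, threshold: int = 3) -> str:
    """Return "local" or "cloud", checking rules in table order (assumed precedence)."""
    if not local_available:
        return "cloud"
    lowered = prompt.lower()
    if any(term in lowered for term in SENSITIVE_TERMS):
        return "local"
    return "cloud" if complexity_score >= threshold else "local"

print(decide_route("Summarize this meeting transcript", -1, True))
```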

After routing locally, validate_result.py applies a second gate:

Signal                         Escalate?
-------------------------------------
Empty response                 Yes
Process exit code != 0         Yes
Timed out                      Yes
Tool error                     Yes
Clean response, score ≥ 0.75   No
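
A sketch of this second gate, assuming the quality score is computed elsewhere and passed in. The function name and parameters are illustrative, and treating any score below 0.75 as an escalation is inferred from the last table row rather than stated outright.

```python
def should_escalate(response: str, exit_code: int = 0, timed_out: bool = False,
                    tool_error: bool = False, score: float = 1.0,
                    min_score: float = 0.75) -> bool:
    """Return True when the local result should be re-run on cloud."""
    if exit_code != 0 or timed_out or tool_error:
        return True
    if not response.strip():
        # Empty or whitespace-only response.
        return True
    # Assumed: a score under the threshold also triggers escalation.
    return score < min_score

print(should_escalate("The meeting covered three topics...", exit_code=0))
```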

For full scoring details, see references/routing-logic.md.


Configuration

Create ~/.openclaw/adaptive-routing/config.json to tune thresholds:

{
  "complexity_threshold": 3,
  "token_high_watermark": 4000,
  "token_low_watermark": 500,
  "redact_output": true
}

Pass --config /path/to/config.json to route_request.py to use a custom path.
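
For scripts that need the same settings, a loader might merge the file over built-in defaults. This is a sketch: load_config is a hypothetical helper, and apart from complexity_threshold (documented default 3), the default values below are simply copied from the example above and may not match the skill's real defaults.

```python
import json
from pathlib import Path

# Values copied from the example config; treat them as assumed defaults.
DEFAULTS = {
    "complexity_threshold": 3,
    "token_high_watermark": 4000,
    "token_low_watermark": 500,
    "redact_output": True,
}
DEFAULT_PATH = Path.home() / ".openclaw" / "adaptive-routing" / "config.json"

def load_config(path=DEFAULT_PATH) -> dict:
    """Return defaults overridden by any keys present in the config file."""
    config = dict(DEFAULTS)
    config_path = Path(path)
    if config_path.exists():
        config.update(json.loads(config_path.read_text(encoding="utf-8")))
    return config

print(load_config("/nonexistent/config.json")["complexity_threshold"])
```

With this approach a config file only needs to list the keys it changes.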


Executing with a Local Provider

Once route_request.py returns "decision": "local", send the request:

Ollama

curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "YOUR_PROMPT", "stream": false}'

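LM Studio / llamafile (OpenAI-compatible; LM Studio listens on port 1234 by default, llamafile on 8080, so adjust the port to match your server)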

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "local-model", "messages": [{"role": "user", "content": "YOUR_PROMPT"}]}'
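
The Ollama call can also be made from Python using only the standard library. A sketch, assuming the same endpoint and model name as the curl example above. The helper names are illustrative, and generate() needs a running Ollama server, so it is defined here but not called.

```python
import json
import urllib.request

def build_ollama_request(prompt: str, model: str = "llama3.2",
                         base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_ollama_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]

req = build_ollama_request("YOUR_PROMPT")
print(req.full_url)
```

The timeout matters here, because a timed-out local call is one of the escalation signals.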

Dashboard

The dashboard reads from ~/.openclaw/adaptive-routing/savings.json (auto-created).

┌───────────────────────────────────────────────┐
│      🔀  Adaptive Routing  ·  Dashboard       │
├───────────────────────────────────────────────┤
│  Local LLM:  ✅  ollama (llama3.2...)         │
├───────────────────────────────────────────────┤
│  Total requests:                           42  │
│  Local (passed):               31  (73.8%)    │
│  Escalated to cloud:                        4  │
│  Cloud (direct):                            7  │
│  Escalation rate:                       11.4%  │
├───────────────────────────────────────────────┤
│  Tokens (local):                       84,200  │
│  Tokens (cloud):                        9,600  │
│  Cost saved (USD):                     $0.4210 │
└───────────────────────────────────────────────┘
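
The percentages in the sample are consistent with the following arithmetic. The formulas are inferred from the sample numbers, not taken from dashboard.py: the local share is measured against all requests, and the escalation rate against requests that were first tried locally.

```python
def summarize(local_passed: int, escalated: int, cloud_direct: int) -> dict:
    """Recompute the dashboard percentages from raw counts (inferred formulas)."""
    total = local_passed + escalated + cloud_direct
    local_attempts = local_passed + escalated
    return {
        "total": total,
        "local_pct": round(100 * local_passed / total, 1) if total else 0.0,
        "escalation_rate_pct": (
            round(100 * escalated / local_attempts, 1) if local_attempts else 0.0
        ),
    }

print(summarize(31, 4, 7))
# {'total': 42, 'local_pct': 73.8, 'escalation_rate_pct': 11.4}
```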

Reset savings data:

python3 skills/adaptive-routing/scripts/track_savings.py reset


Version History

1 version in total

  • v1.0.0 (current)
    2026-03-30 05:57

