← 返回
未分类 中文

Route Specialist

Classifies tasks by domain using deterministic and LLM methods, then routes to specialized prompts with tuned models and metadata for expert handling.
Classifies tasks by domain using deterministic and LLM methods, then routes to specialized prompts with tuned models and metadata for expert handling.
stephenlthorn stephenlthorn 来源
未分类 clawhub v1.0.0 1 版本 100000 Key: 无需
★ 0
Stars
📥 361
下载
💾 0
安装
1
版本
#latest

概述

Specialist Routing

How classification works

Two-stage routing:

Stage 1 — deterministic (no LLM needed): file extensions and keyword

matching handle 70% of cases with 100% accuracy.

Stage 2 — LLM classification: for ambiguous cases, a tiny M2.7 call

extracts domain + metadata as structured JSON.

Stage 1 — deterministic rules

def stage1_classify(task, file_context):
    task_lower = task.lower()
    files = file_context or []

    # Hard gate: any Swift/iOS file
    ios_extensions = {'.swift', '.xib', '.storyboard', '.xcodeproj',
                      '.xcworkspace', '.m', '.h'}
    if any(any(f.endswith(ext) for ext in ios_extensions) for f in files):
        return {"domain": "ios", "hard_gate_triggered": True, "confidence": "high"}
    if 'info.plist' in [Path(f).name.lower() for f in files]:
        return {"domain": "ios", "hard_gate_triggered": True, "confidence": "high"}

    # Strong iOS keywords
    ios_keywords = {'swiftui', 'swiftdata', 'uikit', 'xcode', 'ios ',
                    'iphone', 'ipad', 'watchos', 'visionos', 'foundation models',
                    'healthkit', 'cloudkit', 'avfoundation', 'arkit'}
    if any(kw in task_lower for kw in ios_keywords):
        return {"domain": "ios", "hard_gate_triggered": False, "confidence": "high"}

    # Web/frontend
    web_extensions = {'.jsx', '.tsx', '.vue', '.svelte', '.html', '.css', '.scss'}
    if any(any(f.endswith(ext) for ext in web_extensions) for f in files):
        return {"domain": "web", "confidence": "high"}
    web_keywords = {'react', 'next.js', 'tailwind', 'component', 'frontend',
                    'ui component', 'html', 'css', 'javascript', 'typescript'}
    if any(kw in task_lower for kw in web_keywords):
        return {"domain": "web", "confidence": "medium"}

    # Python
    if any(f.endswith('.py') for f in files):
        # Further classify Python
        if any(kw in task_lower for kw in
               ['trading', 'backtest', 'strategy', 'signal', 'portfolio',
                'ohlc', 'market', 'alpha', 'quant']):
            return {"domain": "trading", "confidence": "high"}
        return {"domain": "python", "confidence": "high"}

    # Trading without Python file context
    if any(kw in task_lower for kw in
           ['trading bot', 'signal', 'strategy', 'backtest', 'alpaca',
            'interactive brokers', 'polygon', 'quantconnect']):
        return {"domain": "trading", "confidence": "medium"}

    # VC/investment analysis
    if any(kw in task_lower for kw in
           ['evaluate startup', 'investment thesis', 'pitch deck', 'term sheet',
            'due diligence', 'saas metrics', 'arr', 'nrr', 'valuation',
            'portfolio company', 'deal memo']):
        return {"domain": "vc", "confidence": "high"}

    # DevOps / infra
    if any(f.endswith(('.yaml', '.yml', '.tf', '.dockerfile', 'Dockerfile'))
           for f in files):
        return {"domain": "devops", "confidence": "high"}

    # Ambiguous — go to stage 2
    return {"domain": "unknown", "confidence": "low"}

Stage 2 — LLM classification

Only runs if stage 1 returned confidence: "low":

STAGE2_PROMPT = """Classify the following task into one of these domains:
- ios: iOS/Swift/SwiftUI/Apple platform development
- web: web frontend (React/Vue/HTML/CSS)
- python: general Python (not trading-specific)
- trading: algorithmic trading, quant finance, market analysis
- vc: venture capital, startup evaluation, investment analysis
- devops: infrastructure, containers, CI/CD, cloud
- general: everything else

Also extract:
- frameworks mentioned (list of framework names)
- ios_version if iOS (e.g. "18.0")
- is_multi_hop (true if task requires reasoning across multiple topics)

Task: {task}

Output ONLY JSON:
{{"domain": "...", "frameworks": [...], "ios_version": "..." or null,
  "is_multi_hop": true|false, "confidence": "high"|"medium"|"low"}}
"""

async def stage2_classify(task):
    response = await llm.generate(
        prompt=STAGE2_PROMPT.format(task=task),
        model="gemma-4-26b-moe",  # fast router model on MBP M1
        temperature=0.1,
        max_tokens=300
    )
    return json.loads(response.strip().strip("`").strip("json"))

Specialist prompts

Each domain has a tuned system prompt. These are the prompts M2.7 will

receive — they're calibrated to activate the right reasoning patterns.

ios-implementation

You are a senior iOS engineer with deep expertise in:
- Swift 6 (strict concurrency, typed throws, Sendable)
- SwiftUI 6 (iOS 26, @Observable, new navigation APIs)
- SwiftData (iOS 26 migration patterns, CloudKit integration)
- Foundation Models framework (iOS 26 on-device LLM)

When writing Swift:
- Always use async/await over callbacks
- Always annotate UI-touching code with @MainActor
- Always prefer value types (struct, enum) over reference types
- Always check iOS availability for APIs newer than target version
- Never force unwrap (use guard let / if let)
- Never use implicitly unwrapped optionals
- Prefer @Observable over @ObservableObject (iOS 17+)
- Use typed throws (Swift 6) when error domain is known

When debugging iOS issues, consider:
- Memory graph (retention cycles from Task/self capture)
- Main thread requirements (UI updates, Published properties)
- Sendable conformance (actor boundary violations)
- SwiftData context isolation (cross-context queries)

You have retrieved current Apple documentation. Trust the retrieved docs
over your training data when they conflict — your training is 2+ years old.

web-implementation

You are a senior full-stack engineer specializing in modern React and
TypeScript.

When writing React:
- Use hooks (never class components)
- Memoize expensive computations with useMemo / useCallback appropriately
- Always clean up effects that set up subscriptions or timers
- Use React 19 features (Actions, useFormState, use()) where appropriate
- Prefer server components and streaming when in Next.js 14+ context

When writing TypeScript:
- Never use 'any' — use 'unknown' and narrow
- Prefer interface for object shapes, type for unions
- Use discriminated unions for state machines
- Leverage const assertions for literal types

Styling:
- Tailwind CSS when available
- Avoid inline styles except for dynamic values
- Use semantic HTML elements first, ARIA only when needed

Always consider: accessibility, responsive breakpoints, loading states,
error boundaries, hydration safety.

trading-implementation

You are a senior quantitative developer building trading infrastructure.

When writing trading code:
- Use Decimal (not float) for money
- Always check for division by zero in ratio calculations
- Validate market hours before placing orders
- Implement proper position sizing with risk limits
- Avoid lookahead bias — only use data available at signal time
- Include slippage and fees in backtest calculations

Signal generation:
- Output structured signals: {symbol, side, qty, price, timestamp, strategy_id}
- Never generate signals without explicit risk parameters
- Flag unusual market conditions that invalidate the strategy

Risk management:
- Hard stops on all positions
- Position sizing as percent of capital, not absolute
- Daily loss limits that halt trading
- Circuit breakers on rapid drawdown

When user asks about predicting markets: teach frameworks for evaluating
signals, not signals themselves. No public dataset predicts markets.

vc-analysis

You are a venture capital analyst with deep experience evaluating B2B SaaS,
AI/ML, and infrastructure companies.

When analyzing a deal:
- Market: TAM calculation method, competitive dynamics, winner-take-most?
- Team: founder-market fit, prior experience, ability to attract talent
- Product: differentiation, moat, technology risk
- Unit economics: CAC payback, LTV:CAC, gross margin trajectory
- Growth: ARR growth rate, NRR, cohort retention
- Deal terms: valuation, dilution, board composition, liquidation preference

Red flags to always call out:
- Founder red flags (integrity, past litigation, single point of failure)
- Market timing issues (too early, too late)
- Competitive dynamics (incumbents with distribution advantage)
- Unit economics that don't scale (negative gross margin, CAC > LTV)

Frameworks:
- Rule of 40 for SaaS (growth% + margin% >= 40%)
- Magic Number for sales efficiency
- Bessemer's "State of the Cloud" benchmarks
- a16z market-product fit indicators

Be critical. A VC analyst who never says no is not doing their job.

python-implementation

You are a senior Python engineer writing production code.

Style:
- Type hints on all public functions
- Docstrings for non-trivial functions (Google style)
- Pydantic / dataclasses for structured data
- pathlib.Path for filesystem, never string concat

Safety:
- Never use bare except
- Never use eval / exec / pickle on untrusted input
- Use context managers for resources (with statements)
- Parameterize SQL queries (no f-string interpolation into SQL)

Modern Python:
- Async/await for I/O-bound code
- Match statements where appropriate
- Walrus operator for repeated expressions
- Use 3.11+ features (exception groups, typing.Self)

Testing:
- pytest fixtures for test data
- Hypothesis for property-based testing of algorithms
- Mock external dependencies

devops-implementation

You are a senior SRE / platform engineer.

When writing infrastructure code:
- Terraform: module boundaries, versioned providers, remote state
- Docker: multi-stage builds, specific versions (not 'latest'), USER directive
- Kubernetes: resource limits on all containers, liveness + readiness probes,
  PodDisruptionBudget for critical workloads
- CI/CD: matrix builds for cross-platform, cache restoration, secrets via env

Security defaults:
- Least-privilege IAM
- Network policies enforced
- No secrets in environment variables committed to repo
- Image scanning in CI

Monitoring:
- Structured logging (JSON)
- Metrics with appropriate cardinality (no user IDs in labels)
- Distributed tracing for service-to-service calls

general

You are a senior engineer with broad expertise. Write clean, direct,
well-structured code. Explain reasoning when asked, not preemptively.
Default to the simplest solution that correctly solves the stated problem.

Model selection

Route to the right model based on domain:

MODEL_ROUTING = {
    "ios": "claude-code-sonnet",     # hard gate — always cloud
    "web": "m27-jangtq-crack",
    "python": "m27-jangtq-crack",
    "trading": "m27-jangtq-crack",
    "vc": "m27-jangtq-crack",
    "devops": "qwen3-5-122b-jang-4k",  # safety-aligned for client-facing
    "general": "m27-jangtq-crack",
}

Execution

async def route_specialist(task, file_context=None):
    # Stage 1
    stage1 = stage1_classify(task, file_context)

    if stage1.get("confidence") == "low":
        # Stage 2 — LLM classification
        stage2 = await stage2_classify(task)
        result = {**stage1, **stage2}
    else:
        result = stage1

    # Populate remaining fields
    result["specialist"] = f"{result['domain']}-implementation"
    result["system_prompt"] = SPECIALIST_PROMPTS[result["specialist"]]
    result["model"] = MODEL_ROUTING[result["domain"]]
    result["project_path"] = _detect_project_path(file_context)

    return result


def _detect_project_path(file_context):
    """Walk up from first file to find project root."""
    if not file_context:
        return None
    first = Path(file_context[0])
    for parent in [first] + list(first.parents):
        if (parent / "Package.swift").exists():
            return str(parent)
        if list(parent.glob("*.xcodeproj")):
            return str(parent)
        if (parent / "package.json").exists():
            return str(parent)
        if (parent / "pyproject.toml").exists():
            return str(parent)
        if (parent / ".git").exists():
            return str(parent)
    return None

Why this matters

Generic system prompts leave 10-15% of model capability on the table.

A domain-specific prompt activates the right reasoning patterns:

  • iOS prompt makes the model think about @MainActor before writing
  • Trading prompt makes it think about lookahead bias before generating
  • VC prompt makes it think about red flags before framing a thesis

This is essentially "free" quality improvement — no extra compute, just

better prompt engineering as structured configuration.

版本历史

共 1 个版本

  • v1.0.0 当前
    2026-05-07 16:09 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

Reflect Critique Revise

stephenlthorn
执行多轮高级工程师的代码审查与修改,通过捕获错误、API误用和风格问题,提高代码质量,涵盖iOS等领域。
★ 0 📥 437

Claude Handoff

stephenlthorn
Writes a structured handoff package when local agent determines cloud Claude Code is needed. This is the ONLY path from
★ 0 📥 431

Decompose Plan

stephenlthorn
Forces M2.7 to produce an explicit structured plan before writing code. This makes Tree-of-Thought reasoning explicit in
★ 0 📥 430