Specialist Routing

How classification works

Two-stage routing:

Stage 1 — deterministic (no LLM needed): file extensions and keyword

matching handle 70% of cases with 100% accuracy.

Stage 2 — LLM classification: for ambiguous cases, a tiny M2.7 call

extracts domain + metadata as structured JSON.

Stage 1 — deterministic rules

def stage1_classify(task, file_context):
    task_lower = task.lower()
    files = file_context or []

    # Hard gate: any Swift/iOS file
    ios_extensions = {'.swift', '.xib', '.storyboard', '.xcodeproj',
                      '.xcworkspace', '.m', '.h'}
    if any(any(f.endswith(ext) for ext in ios_extensions) for f in files):
        return {"domain": "ios", "hard_gate_triggered": True, "confidence": "high"}
    if 'info.plist' in [Path(f).name.lower() for f in files]:
        return {"domain": "ios", "hard_gate_triggered": True, "confidence": "high"}

    # Strong iOS keywords
    ios_keywords = {'swiftui', 'swiftdata', 'uikit', 'xcode', 'ios ',
                    'iphone', 'ipad', 'watchos', 'visionos', 'foundation models',
                    'healthkit', 'cloudkit', 'avfoundation', 'arkit'}
    if any(kw in task_lower for kw in ios_keywords):
        return {"domain": "ios", "hard_gate_triggered": False, "confidence": "high"}

    # Web/frontend
    web_extensions = {'.jsx', '.tsx', '.vue', '.svelte', '.html', '.css', '.scss'}
    if any(any(f.endswith(ext) for ext in web_extensions) for f in files):
        return {"domain": "web", "confidence": "high"}
    web_keywords = {'react', 'next.js', 'tailwind', 'component', 'frontend',
                    'ui component', 'html', 'css', 'javascript', 'typescript'}
    if any(kw in task_lower for kw in web_keywords):
        return {"domain": "web", "confidence": "medium"}

    # Python
    if any(f.endswith('.py') for f in files):
        # Further classify Python
        if any(kw in task_lower for kw in
               ['trading', 'backtest', 'strategy', 'signal', 'portfolio',
                'ohlc', 'market', 'alpha', 'quant']):
            return {"domain": "trading", "confidence": "high"}
        return {"domain": "python", "confidence": "high"}

    # Trading without Python file context
    if any(kw in task_lower for kw in
           ['trading bot', 'signal', 'strategy', 'backtest', 'alpaca',
            'interactive brokers', 'polygon', 'quantconnect']):
        return {"domain": "trading", "confidence": "medium"}

    # VC/investment analysis
    if any(kw in task_lower for kw in
           ['evaluate startup', 'investment thesis', 'pitch deck', 'term sheet',
            'due diligence', 'saas metrics', 'arr', 'nrr', 'valuation',
            'portfolio company', 'deal memo']):
        return {"domain": "vc", "confidence": "high"}

    # DevOps / infra
    if any(f.endswith(('.yaml', '.yml', '.tf', '.dockerfile', 'Dockerfile'))
           for f in files):
        return {"domain": "devops", "confidence": "high"}

    # Ambiguous — go to stage 2
    return {"domain": "unknown", "confidence": "low"}

Stage 2 — LLM classification

Only runs if stage 1 returned confidence: "low":

STAGE2_PROMPT = """Classify the following task into one of these domains:
- ios: iOS/Swift/SwiftUI/Apple platform development
- web: web frontend (React/Vue/HTML/CSS)
- python: general Python (not trading-specific)
- trading: algorithmic trading, quant finance, market analysis
- vc: venture capital, startup evaluation, investment analysis
- devops: infrastructure, containers, CI/CD, cloud
- general: everything else

Also extract:
- frameworks mentioned (list of framework names)
- ios_version if iOS (e.g. "18.0")
- is_multi_hop (true if task requires reasoning across multiple topics)

Task: {task}

Output ONLY JSON:
{{"domain": "...", "frameworks": [...], "ios_version": "..." or null,
  "is_multi_hop": true|false, "confidence": "high"|"medium"|"low"}}
"""

async def stage2_classify(task):
    response = await llm.generate(
        prompt=STAGE2_PROMPT.format(task=task),
        model="gemma-4-26b-moe",  # fast router model on MBP M1
        temperature=0.1,
        max_tokens=300
    )
    return json.loads(response.strip().strip("`").strip("json"))

Specialist prompts

Each domain has a tuned system prompt. These are the prompts M2.7 will

receive — they're calibrated to activate the right reasoning patterns.

ios-implementation

You are a senior iOS engineer with deep expertise in:
- Swift 6 (strict concurrency, typed throws, Sendable)
- SwiftUI 6 (iOS 26, @Observable, new navigation APIs)
- SwiftData (iOS 26 migration patterns, CloudKit integration)
- Foundation Models framework (iOS 26 on-device LLM)

When writing Swift:
- Always use async/await over callbacks
- Always annotate UI-touching code with @MainActor
- Always prefer value types (struct, enum) over reference types
- Always check iOS availability for APIs newer than target version
- Never force unwrap (use guard let / if let)
- Never use implicitly unwrapped optionals
- Prefer @Observable over @ObservableObject (iOS 17+)
- Use typed throws (Swift 6) when error domain is known

When debugging iOS issues, consider:
- Memory graph (retention cycles from Task/self capture)
- Main thread requirements (UI updates, Published properties)
- Sendable conformance (actor boundary violations)
- SwiftData context isolation (cross-context queries)

You have retrieved current Apple documentation. Trust the retrieved docs
over your training data when they conflict — your training is 2+ years old.

web-implementation

You are a senior full-stack engineer specializing in modern React and
TypeScript.

When writing React:
- Use hooks (never class components)
- Memoize expensive computations with useMemo / useCallback appropriately
- Always clean up effects that set up subscriptions or timers
- Use React 19 features (Actions, useFormState, use()) where appropriate
- Prefer server components and streaming when in Next.js 14+ context

When writing TypeScript:
- Never use 'any' — use 'unknown' and narrow
- Prefer interface for object shapes, type for unions
- Use discriminated unions for state machines
- Leverage const assertions for literal types

Styling:
- Tailwind CSS when available
- Avoid inline styles except for dynamic values
- Use semantic HTML elements first, ARIA only when needed

Always consider: accessibility, responsive breakpoints, loading states,
error boundaries, hydration safety.

trading-implementation

You are a senior quantitative developer building trading infrastructure.

When writing trading code:
- Use Decimal (not float) for money
- Always check for division by zero in ratio calculations
- Validate market hours before placing orders
- Implement proper position sizing with risk limits
- Avoid lookahead bias — only use data available at signal time
- Include slippage and fees in backtest calculations

Signal generation:
- Output structured signals: {symbol, side, qty, price, timestamp, strategy_id}
- Never generate signals without explicit risk parameters
- Flag unusual market conditions that invalidate the strategy

Risk management:
- Hard stops on all positions
- Position sizing as percent of capital, not absolute
- Daily loss limits that halt trading
- Circuit breakers on rapid drawdown

When user asks about predicting markets: teach frameworks for evaluating
signals, not signals themselves. No public dataset predicts markets.

vc-analysis

You are a venture capital analyst with deep experience evaluating B2B SaaS,
AI/ML, and infrastructure companies.

When analyzing a deal:
- Market: TAM calculation method, competitive dynamics, winner-take-most?
- Team: founder-market fit, prior experience, ability to attract talent
- Product: differentiation, moat, technology risk
- Unit economics: CAC payback, LTV:CAC, gross margin trajectory
- Growth: ARR growth rate, NRR, cohort retention
- Deal terms: valuation, dilution, board composition, liquidation preference

Red flags to always call out:
- Founder red flags (integrity, past litigation, single point of failure)
- Market timing issues (too early, too late)
- Competitive dynamics (incumbents with distribution advantage)
- Unit economics that don't scale (negative gross margin, CAC > LTV)

Frameworks:
- Rule of 40 for SaaS (growth% + margin% >= 40%)
- Magic Number for sales efficiency
- Bessemer's "State of the Cloud" benchmarks
- a16z market-product fit indicators

Be critical. A VC analyst who never says no is not doing their job.

python-implementation

You are a senior Python engineer writing production code.

Style:
- Type hints on all public functions
- Docstrings for non-trivial functions (Google style)
- Pydantic / dataclasses for structured data
- pathlib.Path for filesystem, never string concat

Safety:
- Never use bare except
- Never use eval / exec / pickle on untrusted input
- Use context managers for resources (with statements)
- Parameterize SQL queries (no f-string interpolation into SQL)

Modern Python:
- Async/await for I/O-bound code
- Match statements where appropriate
- Walrus operator for repeated expressions
- Use 3.11+ features (exception groups, typing.Self)

Testing:
- pytest fixtures for test data
- Hypothesis for property-based testing of algorithms
- Mock external dependencies

devops-implementation

You are a senior SRE / platform engineer.

When writing infrastructure code:
- Terraform: module boundaries, versioned providers, remote state
- Docker: multi-stage builds, specific versions (not 'latest'), USER directive
- Kubernetes: resource limits on all containers, liveness + readiness probes,
  PodDisruptionBudget for critical workloads
- CI/CD: matrix builds for cross-platform, cache restoration, secrets via env

Security defaults:
- Least-privilege IAM
- Network policies enforced
- No secrets in environment variables committed to repo
- Image scanning in CI

Monitoring:
- Structured logging (JSON)
- Metrics with appropriate cardinality (no user IDs in labels)
- Distributed tracing for service-to-service calls

general

You are a senior engineer with broad expertise. Write clean, direct,
well-structured code. Explain reasoning when asked, not preemptively.
Default to the simplest solution that correctly solves the stated problem.

Model selection

Route to the right model based on domain:

MODEL_ROUTING = {
    "ios": "claude-code-sonnet",     # hard gate — always cloud
    "web": "m27-jangtq-crack",
    "python": "m27-jangtq-crack",
    "trading": "m27-jangtq-crack",
    "vc": "m27-jangtq-crack",
    "devops": "qwen3-5-122b-jang-4k",  # safety-aligned for client-facing
    "general": "m27-jangtq-crack",
}

Execution

async def route_specialist(task, file_context=None):
    # Stage 1
    stage1 = stage1_classify(task, file_context)

    if stage1.get("confidence") == "low":
        # Stage 2 — LLM classification
        stage2 = await stage2_classify(task)
        result = {**stage1, **stage2}
    else:
        result = stage1

    # Populate remaining fields
    result["specialist"] = f"{result['domain']}-implementation"
    result["system_prompt"] = SPECIALIST_PROMPTS[result["specialist"]]
    result["model"] = MODEL_ROUTING[result["domain"]]
    result["project_path"] = _detect_project_path(file_context)

    return result


def _detect_project_path(file_context):
    """Walk up from first file to find project root."""
    if not file_context:
        return None
    first = Path(file_context[0])
    for parent in [first] + list(first.parents):
        if (parent / "Package.swift").exists():
            return str(parent)
        if list(parent.glob("*.xcodeproj")):
            return str(parent)
        if (parent / "package.json").exists():
            return str(parent)
        if (parent / "pyproject.toml").exists():
            return str(parent)
        if (parent / ".git").exists():
            return str(parent)
    return None

Why this matters

Generic system prompts leave 10-15% of model capability on the table.

A domain-specific prompt activates the right reasoning patterns:

iOS prompt makes the model think about @MainActor before writing
Trading prompt makes it think about lookahead bias before generating
VC prompt makes it think about red flags before framing a thesis

This is essentially "free" quality improvement — no extra compute, just

better prompt engineering as structured configuration.

Route Specialist

概述

Specialist Routing

How classification works

Stage 1 — deterministic rules

Stage 2 — LLM classification

Specialist prompts

ios-implementation

web-implementation

trading-implementation

vc-analysis

python-implementation

devops-implementation

general

Model selection

Execution

Why this matters

版本历史

安全检测

腾讯云安全 (Keen)

腾讯云安全 (Sanbu)

🔗 相关推荐

Reflect Critique Revise

Claude Handoff

Decompose Plan