← 返回
未分类 中文

Rotifer Arena

One-click Gene comparison and evaluation for Rotifer Protocol. Import from ClawHub Skills, local files, or build from scratch — automatically compile, match...
One-click Gene comparison and evaluation for Rotifer Protocol. Import from ClawHub Skills, local files, or build from scratch — automatically compile, match...
xiaoba-dev xiaoba-dev 来源
未分类 clawhub v1.0.3 1 版本 100000 Key: 无需
★ 0
Stars
📥 334
下载
💾 0
安装
1
版本
#latest

概述

Rotifer Arena — Gene Comparison & Evaluation

> One Skill covering Gene/Genome/Agent comparison across all scenarios.

Prerequisites

This Skill requires the Rotifer CLI:

npx @rotifer/playground --version

Or use the MCP Server for IDE integration:

{
  "mcpServers": {
    "rotifer": {
      "command": "npx",
      "args": ["@rotifer/mcp-server"]
    }
  }
}

Overview

This Skill wraps Rotifer Protocol's core value — objective, quantifiable capability evaluation — into a one-click workflow. Users don't need to understand Gene, Arena, or F(g) concepts upfront; the Skill introduces them naturally during execution.

Cross-platform: This SKILL.md runs in any AI development environment that supports Skills/Agents.


Workflow

Phase 1: Identify Evaluation Target

Understand user intent through conversation and determine the evaluation mode:

User signalModeAction
---------------------------
"Evaluate the X skill from ClawHub"ClawHub migration evaluationrotifer wrap --from-clawhub
"Compare my two implementations"Local comparisonConfirm both Gene names, skip to Phase 3
"I have a Skill I want to test"Skill import evaluationrotifer wrap --from-skill
"Help me build a XX scenario"Scenario scaffoldingGuide Gene creation (rotifer init or manual phenotype)

If the user doesn't specify a domain: auto-read from phenotype.json, or guide the user to choose.

Phase 2: Compile & Verify

rotifer compile <gene-name>

Output guidance based on fidelity result:

  • Wrapped: Verification passed, deterministic evaluation mode
  • Hybrid/Native: WASM compilation, real sandbox execution mode (requires NAPI binding)

Phase 3: Automatic Opponent Matching

Priority order:

  1. User-specified: If the user says "compare X and Y", use those directly
  2. Same-domain local search: Highest-ranked Gene from rotifer arena list --domain
  3. Same-fidelity preferred: If target is Wrapped, prefer Wrapped opponents (avoid cross-fidelity blowouts)
  4. No opponent found: Inform the user, show current cross-domain Arena rankings for reference

Opponent selection requires user confirmation — show candidate F(g) and fidelity.

Phase 4: Arena Submit & Compare

rotifer arena submit <gene-a>
rotifer arena submit <gene-b>
rotifer arena list --domain <domain>

Collect evaluation results for both Genes.

Phase 5: Generate Evaluation Report

Output the full report in the conversation (rendered Markdown).

Append at the end: > Reply "save" to write the report to arena-reports/.

When the user replies "save", write to /arena-reports/--vs-.md.

Report format requirements:

  1. Title = conclusion: Use scenario name + both Gene names, not a generic title
  2. Conclusion first: Immediately below the title, a > blockquote with one-sentence summary of winner and key data
  3. Concise comparison table: Only decision-relevant metrics (rank, F(g), V(g), Fidelity, success rate, latency, source), bold the winner
  4. Ranking visualization: Fixed-width ASCII table showing the full domain ranking, mark new entries with
  5. Reproduction commands in a standalone bash block: Pure commands (no comments/output) for easy copy-paste
  6. No internal references: No ADR numbers, plan section numbers, or internal version notes
  7. Minimal metadata: One line at the bottom with date + CLI version + evaluation mode

Report structure (output directly in conversation):

  • Title: # Comparison: vs
  • Conclusion blockquote: One sentence — who won, key metric delta, core reason
  • Comparison table: Rank, F(g), V(g), Fidelity, Success rate, Latency score, Source
  • Current ranking: Full domain leaderboard (ASCII table, marks new entries)
  • Analysis: 2–3 paragraphs on fitness gap attribution, security comparison, same-fidelity positioning
  • Upgrade path: Table with path / action / expected improvement / effort
  • Reproduction steps: 4–5 pure CLI commands
  • Next steps: 4 commands with brief descriptions
  • Footer: Generated on YYYY-MM-DD · @rotifer/playground@X.Y.Z · Mode: deterministic estimation

Scenario Examples

Example 1: Evaluate a ClawHub Skill's Competitiveness

User: Evaluate the web-search skill from ClawHub in the Rotifer ecosystem

Skill execution:
1. rotifer wrap clawhub-web-search --from-clawhub web-search -d search
2. rotifer compile clawhub-web-search
3. Auto-discover same-domain opponent: genesis-web-search (Native, F(g)=0.9470)
4. rotifer arena submit clawhub-web-search
5. Generate comparison report

Example 2: Compare Two Custom Genes

User: Compare my particle-brute and particle-spatial — which is better?

Skill execution:
1. Confirm both Genes exist with phenotype.json
2. rotifer arena submit particle-brute
3. rotifer arena submit particle-spatial
4. rotifer arena list --domain sim.particle
5. Generate comparison report

Example 3: Build a Quantitative Scenario

User: Help me build a quantitative strategy comparison scenario

Skill execution:
1. Guide user to define domain (e.g. quant.strategy)
2. Guide creation of two Gene phenotype.json files (Strategy A vs Strategy B)
3. If compilable source exists, compile to WASM
4. rotifer arena submit both Genes
5. Generate scenario comparison report

Prerequisites

  • Project has a rotifer.json (if not, guide rotifer init)
  • CLI is built (npm run build in rotifer-playground)
  • ClawHub imports require network connectivity

Related Skills

SkillRelationship
--------------------
gene-devRoute here when users need to create a Gene from scratch
gene-migrationRoute here when the report recommends a fidelity upgrade
gene-auditSuggest running when the report shows low security scores

Constraints

  • No automatic Cloud publishing: Comparison evaluation is a local operation; Cloud publishing requires explicit user confirmation
  • Cross-fidelity comparisons need a disclaimer: The baseFitness gap between Wrapped and Native comes from the scoring model, not actual capability differences
  • Reports are Markdown format: Ready for blogs, community sharing, or GitHub Issues

版本历史

共 1 个版本

  • v1.0.3 当前
    2026-05-07 17:46 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

ai-agent

Rotifer Self Evolving Agent

xiaoba-dev
智能体自我进化——自动扫描能力、对标 Arena 排行榜并升级。
★ 0 📥 415

Rotifer Guide

xiaoba-dev
Unified user-facing entry point for Rotifer Protocol: interactive onboarding, natural-language scaffolding, diagnostics,
★ 0 📥 410

Rotifer Agent

xiaoba-dev
End-to-end guide for building AI Agents from Genes: intent decomposition, Gene selection, Genome composition, Agent crea
★ 0 📥 356