
Skill Evaluator

Evaluate Clawdbot skills for quality, reliability, and publish-readiness using a multi-framework rubric (ISO 25010, OpenSSF, Shneiderman, agent-specific heuristics). Use when asked to review, audit, evaluate, score, or assess a skill before publishing, or when checking skill quality. Runs automated structural checks and guides manual assessment across 25 criteria.
terwox
Security & Compliance · clawhub · v1.0.0 · 1 version · 99929.8 · Key: not required
★ 3 stars · 📥 2,786 downloads · 💾 137 installs · 1 version · #latest

Overview

Skill Evaluator

Evaluate skills across 25 criteria using a hybrid automated + manual approach.

Quick Start

1. Run automated checks

python3 scripts/eval-skill.py /path/to/skill
python3 scripts/eval-skill.py /path/to/skill --json    # machine-readable
python3 scripts/eval-skill.py /path/to/skill --verbose  # show all details

Checks: file structure, frontmatter, description quality, script syntax, dependency audit, credential scan, env var documentation.
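
To give a feel for what one of these checks might involve, here is a minimal sketch of a credential scan. It is not taken from eval-skill.py: the function names `scan_text` and `scan_skill` and both regular expressions are assumptions made for illustration, and the real script may use different patterns and report results differently.

```python
import re
from pathlib import Path

# Hypothetical patterns, chosen for illustration only.
SECRET_PATTERNS = [
    # name = "value" assignments where the name suggests a secret
    re.compile(r"""(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*['"][^'"]{8,}['"]"""),
    # strings shaped like an AWS access key ID
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
]


def scan_text(text):
    """Return (line_number, stripped_line) pairs that match any pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits


def scan_skill(skill_dir):
    """Scan every readable text file under skill_dir; return {path: hits}."""
    findings = {}
    for path in Path(skill_dir).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        hits = scan_text(text)
        if hits:
            findings[str(path)] = hits
    return findings
```

A line that reads a secret from the environment, such as `token = os.environ["TOKEN"]`, is not flagged by this sketch, because the first pattern only matches a quoted literal on the right-hand side.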

2. Manual assessment

Use the rubric at references/rubric.md to score 25 criteria across 8 categories (0–4 each, 100 total). Each criterion has concrete descriptions per score level.
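
The arithmetic behind the total is 25 criteria × 4 points = 100. The sketch below shows one way to validate and sum a set of manual scores. The `total_score` helper and the dictionary shape are assumptions for illustration; the rubric itself does not prescribe a data format.

```python
MAX_PER_CRITERION = 4    # each criterion is scored 0-4
EXPECTED_CRITERIA = 25   # 25 criteria, so the maximum total is 100


def total_score(scores):
    """Validate a {criterion_name: score} mapping and return the total.

    Raises ValueError if the number of criteria or any score is out of range.
    """
    if len(scores) != EXPECTED_CRITERIA:
        raise ValueError(
            f"expected {EXPECTED_CRITERIA} criteria, got {len(scores)}"
        )
    for name, value in scores.items():
        if not isinstance(value, int) or not 0 <= value <= MAX_PER_CRITERION:
            raise ValueError(f"{name}: score must be an integer 0-4, got {value!r}")
    return sum(scores.values())
```

Rejecting an incomplete score sheet, rather than treating missing criteria as zero, keeps an unfinished assessment from being mistaken for a low score.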

3. Write the evaluation

Copy assets/EVAL-TEMPLATE.md to the skill directory as EVAL.md. Fill in automated results + manual scores.
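
If you prefer to script the copy, the following is a minimal sketch. The `start_eval` helper and its `overwrite` flag are hypothetical; only the paths assets/EVAL-TEMPLATE.md and EVAL.md come from this skill.

```python
import shutil
from pathlib import Path


def start_eval(evaluator_dir, skill_dir, overwrite=False):
    """Copy the evaluator's template into skill_dir as EVAL.md.

    Refuses to replace an existing EVAL.md unless overwrite=True, so an
    evaluation in progress is not lost by accident.
    """
    template = Path(evaluator_dir) / "assets" / "EVAL-TEMPLATE.md"
    target = Path(skill_dir) / "EVAL.md"
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists")
    shutil.copyfile(str(template), str(target))
    return target
```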

Evaluation Process

  1. Run eval-skill.py — get the automated structural score
  2. Read the skill's SKILL.md — understand what it does
  3. Read/skim the scripts — assess code quality, error handling, testability
  4. Score each manual criterion using references/rubric.md — concrete criteria per level
  5. Prioritize findings as P0 (blocks publishing) / P1 (should fix) / P2 (nice to have)
  6. Write EVAL.md in the skill directory with scores + findings
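
Step 5 can be sketched as a small sorting helper. The tuple layout and the function names below are assumptions made for illustration; the skill does not define a findings format beyond the P0/P1/P2 labels.

```python
# Lower number means more urgent. The labels come from the process above.
PRIORITY_ORDER = {"P0": 0, "P1": 1, "P2": 2}


def sort_findings(findings):
    """Order (priority, description) pairs from P0 to P2.

    sorted() is stable, so findings with the same priority keep the order
    in which they were recorded.
    """
    for priority, _ in findings:
        if priority not in PRIORITY_ORDER:
            raise ValueError(f"unknown priority: {priority!r}")
    return sorted(findings, key=lambda item: PRIORITY_ORDER[item[0]])


def blocks_publishing(findings):
    """Return True if any finding is a P0."""
    return any(priority == "P0" for priority, _ in findings)
```

For example, `sort_findings([("P2", "add examples"), ("P0", "hardcoded token")])` puts the P0 item first, and `blocks_publishing` returns True for that list.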

Categories (8 categories, 25 criteria)

| # | Category | Source Framework | Criteria |
|---|----------|------------------|----------|
| 1 | Functional Suitability | ISO 25010 | Completeness, Correctness, Appropriateness |
| 2 | Reliability | ISO 25010 | Fault Tolerance, Error Reporting, Recoverability |
| 3 | Performance / Context | ISO 25010 + Agent | Token Cost, Execution Efficiency |
| 4 | Usability — AI Agent | Shneiderman, Gerhardt-Powals | Learnability, Consistency, Feedback, Error Prevention |
| 5 | Usability — Human | Tognazzini, Norman | Discoverability, Forgiveness |
| 6 | Security | ISO 25010 + OpenSSF | Credentials, Input Validation, Data Safety |
| 7 | Maintainability | ISO 25010 | Modularity, Modifiability, Testability |
| 8 | Agent-Specific | Novel | Trigger Precision, Progressive Disclosure, Composability, Idempotency, Escape Hatches |

Interpreting Scores

| Range | Verdict | Action |
|-------|---------|--------|
| 90–100 | Excellent | Publish confidently |
| 80–89 | Good | Publishable, note known issues |
| 70–79 | Acceptable | Fix P0s before publishing |
| 60–69 | Needs Work | Fix P0+P1 before publishing |
| <60 | Not Ready | Significant rework needed |
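
The score bands above translate directly into a lookup. This is a minimal sketch; the `verdict` function name is an assumption, while the thresholds, verdicts, and actions are taken from the table.

```python
# (lower bound, verdict, action), checked from the highest band down.
BANDS = [
    (90, "Excellent", "Publish confidently"),
    (80, "Good", "Publishable, note known issues"),
    (70, "Acceptable", "Fix P0s before publishing"),
    (60, "Needs Work", "Fix P0+P1 before publishing"),
]


def verdict(score):
    """Map a total score (0-100) to a (verdict, action) pair."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score!r}")
    for lower_bound, name, action in BANDS:
        if score >= lower_bound:
            return name, action
    return "Not Ready", "Significant rework needed"
```

Checking the bands from the top down means each threshold only needs its lower bound, so boundary scores such as 80 or 90 fall into the higher band, as in the table.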

Deeper Security Scanning

This evaluator covers security basics (credentials, input validation, data safety). For thorough security audits of skills under development, consider SkillLens (npx skilllens scan ). It checks for exfiltration, code execution, persistence, privilege bypass, and prompt injection, which complements the quality focus here.

Dependencies

  • Python 3.6+ (for eval-skill.py)
  • PyYAML (pip install pyyaml) — for frontmatter parsing in automated checks
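
A quick way to confirm both requirements before running the evaluator is sketched below. The helper names are hypothetical; the only facts relied on are the Python 3.6 minimum stated above and that PyYAML is imported under the module name `yaml`.

```python
import importlib.util
import sys


def python_ok(minimum=(3, 6)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.6+ is required")
    for name in missing_modules(["yaml"]):  # PyYAML installs as 'yaml'
        print(f"missing module: {name} (try: pip install pyyaml)")
```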

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-28 13:08 · Safe · Safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
