Skill: skill-scorer

Overview

A meta-skill that evaluates the quality of other skills. Given a SKILL.md file (or a complete skill folder), it performs a systematic audit across 8 dimensions, assigns a score out of 100, identifies issues by severity, and generates actionable optimization suggestions.

This skill synthesizes quality criteria from Anthropic's official skill authoring best practices, the Skill Engineering Standard (v1.4.3), and community-tested patterns from production skill ecosystems.

When to Activate

The user provides a skill and asks for any of the following:

"帮我评分/打分/检测/质检这个 skill" ("score / grade / check / QA this skill for me")
"review/audit/score/grade/lint this skill"
"这个 skill 写得怎么样？" / "is this skill any good?"
"帮我优化这个 skill" ("help me optimize this skill"; evaluate first, then suggest improvements)
The user provides a SKILL.md and expects quality feedback

Do NOT activate for: creating a new skill from scratch → use skill-creator. This skill is for evaluation, not generation.

Core Workflow

Step 0: Load the Skill Under Test

Determine what the user has provided:

| Input | Action |
|-------|--------|
| Single `SKILL.md` file | Evaluate that file |
| Skill folder (with `references/`) | Evaluate all files, cross-reference consistency |
| URL / GitHub link | Fetch and evaluate |
| Pasted markdown content | Treat as SKILL.md |
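
The input types above could be distinguished with a small dispatcher. This is a minimal sketch, not part of the skill itself: the function name `classify_input` and its return labels are hypothetical, and it assumes that multi-line input is pasted markdown rather than a path.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_input(user_input: str) -> str:
    """Map raw user input to a Step 0 input type (labels are hypothetical)."""
    text = user_input.strip()
    if not text:
        return "missing"  # nothing provided: ask the user for a skill
    try:
        parsed = urlparse(text)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            return "url"  # fetch and evaluate
    except ValueError:
        pass  # malformed URL-like text falls through to the checks below
    # Assumption: multi-line input is pasted markdown, not a filesystem path.
    if "\n" not in text:
        try:
            path = Path(text)
            if path.is_dir():
                return "folder"  # evaluate all files, cross-check consistency
            if path.is_file():
                return "file"  # evaluate that single file
        except (OSError, ValueError):
            pass  # e.g. the text is too long to be a valid path
    return "pasted"  # treat the text itself as SKILL.md
```

An empty input maps to `"missing"`, which corresponds to the case where the user is asked to supply a skill.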

If the user has not provided a skill → ask: "请提供要评估的 SKILL.md 文件或 skill 文件夹路径。" ("Please provide the SKILL.md file or skill folder path to evaluate.")

Input validation — before proceeding to Step 1, verify the input is actually a skill:

| Check | Condition | Action |
|-------|-----------|--------|
| Binary / garbled content | File is not valid text, or text is unreadable gibberish | STOP. Report: "This file does not appear to be a valid SKILL.md — it contains binary or unreadable content. Please provide a markdown-based skill file." Do NOT attempt to score. |
| No skill markers at all | Text is valid but contains zero skill indicators (no YAML frontmatter `---`, no markdown headings resembling skill sections, no workflow/instructions) | STOP. Report: "This appears to be a {detected_type} file (e.g., Python script, JSON config, plain prose), not a SKILL.md. skill-scorer evaluates SKILL.md files only." Do NOT force-fit 8 dimensions onto non-skill content. |
| Partial skill structure | Has some skill-like elements (e.g., YAML frontmatter exists but body is minimal, or has headings but no workflow) | PROCEED with caveats. Evaluate normally, but note in the report header: "⚠️ This file has incomplete skill structure — scores reflect what is present." Score missing sections as 0 in relevant dimensions rather than guessing. |
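
The three checks can be approximated with simple heuristics. The sketch below is illustrative only: `validate_skill_text`, its return labels, and the keyword list used to detect a workflow are assumptions, and the heading regex will also match `#` comments in scripts, so a real check would need more signals.

```python
import re


def validate_skill_text(raw: bytes) -> str:
    """Return 'binary', 'not_a_skill', 'partial', or 'ok' (hypothetical labels)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "binary"  # not valid text: stop, do not score
    if "\x00" in text:
        return "binary"  # NUL bytes suggest binary content
    # Three rough skill indicators, mirroring the table above.
    has_frontmatter = text.lstrip().startswith("---")
    has_headings = re.search(r"^#{1,6}\s+\S", text, flags=re.MULTILINE) is not None
    has_workflow = (
        re.search(r"\b(step|workflow|instructions?)\b", text, re.IGNORECASE)
        is not None
    )
    signals = [has_frontmatter, has_headings, has_workflow]
    if not any(signals):
        return "not_a_skill"  # stop, report the detected file type
    if not all(signals):
        return "partial"  # proceed with caveats
    return "ok"
```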

Step 1: Parse Skill Structure

Extract and inventory:

YAML frontmatter fields (name, description, version, compatibility)
Section headings and their order
References to external files (references/, scripts/, assets/)
Total line count and estimated token count of SKILL.md body
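
A rough sketch of this inventory step follows. It assumes flat `key: value` frontmatter (no nested YAML), ATX-style `#` headings, and a heuristic of about four characters per token; none of these are specified by the skill, and `inventory_skill` is a hypothetical name.

```python
import re


def inventory_skill(text: str) -> dict:
    """Extract frontmatter, headings, file references, and size estimates."""
    frontmatter = {}
    body = text
    match = re.match(r"^---\n(.*?)\n---\n?", text, flags=re.DOTALL)
    if match:
        body = text[match.end():]
        for line in match.group(1).splitlines():
            key, sep, value = line.partition(":")
            # Only top-level keys; indented lines are skipped in this sketch.
            if sep and not line.startswith((" ", "\t")):
                frontmatter[key.strip()] = value.strip()
    headings = re.findall(r"^#{1,6}\s+(.+?)\s*$", body, flags=re.MULTILINE)
    raw_refs = re.findall(r"\b(?:references|scripts|assets)/[\w./-]+", body)
    refs = sorted({ref.rstrip(".") for ref in raw_refs})
    return {
        "frontmatter": frontmatter,
        "headings": headings,  # in document order
        "references": refs,
        "line_count": len(body.splitlines()),
        "estimated_tokens": len(body) // 4,  # assumption: ~4 chars per token
    }
```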

Step 2: Run 8-Dimension Evaluation

Read references/rubric.md for the complete scoring rubric.

Evaluate the skill across these 8 dimensions (each scored 0-100, then weighted):

| # | Dimension | Weight | What It Measures |
|---|-----------|--------|------------------|
| 1 | Metadata & Triggering | 15% | Name clarity, description quality, trigger coverage |
| 2 | Structure & Architecture | 15% | File organization, section order, progressive disclosure |
| 3 | Instruction Clarity | 15% | Actionability, conciseness, examples, tone |
| 4 | Workflow & Logic | 15% | Step completeness, parameter handling, validation |
| 5 | Error Handling | 10% | Fallbacks, edge cases, failure recovery |
| 6 | Context Efficiency | 10% | Token budget, redundancy, information density |
| 7 | Portability & Compatibility | 10% | Self-containment, cross-platform support |
| 8 | Safety & Robustness | 10% | No injection risk, no hallucination traps, identity lock |
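
Since the weights sum to 100%, the overall score is a plain weighted average of the eight dimension scores. The sketch below shows that arithmetic; the dictionary keys are hypothetical identifiers for the dimensions, and rounding to one decimal place is an assumption.

```python
# Weights in percent, taken from the table above; keys are hypothetical names.
WEIGHTS = {
    "metadata_triggering": 15,
    "structure_architecture": 15,
    "instruction_clarity": 15,
    "workflow_logic": 15,
    "error_handling": 10,
    "context_efficiency": 10,
    "portability_compatibility": 10,
    "safety_robustness": 10,
}


def overall_score(dimension_scores: dict[str, float]) -> float:
    """Combine per-dimension scores (0-100) into a weighted total out of 100."""
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    total = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    return round(total / 100, 1)
```

For example, scoring 90 on the four 15% dimensions and 70 on the four 10% dimensions gives 0.6 × 90 + 0.4 × 70 = 82.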

Step 3: Identify Issues

For each issue found, classify severity:

| Severity | Meaning | Score Impact |
|----------|---------|--------------|
| 🔴 Critical | Skill will malfunction or not trigger | -10 to -15 per issue |
| 🟡 Warning | Skill works but suboptimally | -3 to -8 per issue |
| 🟢 Suggestion | Nice-to-have improvement | -1 to -2 per issue |
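
One way to keep deductions traceable is to record each issue with its penalty and reject penalties outside the range for its severity. The sketch below assumes that each dimension starts at 100 and that deductions apply to the dimension the issue belongs to; the text does not state this, and the `Issue` type and function name are hypothetical.

```python
from dataclasses import dataclass

# Allowed deduction per issue, from the severity table above.
PENALTY_RANGE = {"critical": (10, 15), "warning": (3, 8), "suggestion": (1, 2)}


@dataclass(frozen=True)
class Issue:
    dimension: str
    severity: str  # "critical", "warning", or "suggestion"
    penalty: int  # points deducted, as a positive number
    description: str


def dimension_scores(issues: list[Issue], dimensions: list[str]) -> dict[str, int]:
    """Assumption: every dimension starts at 100 and is floored at 0."""
    scores = {name: 100 for name in dimensions}
    for issue in issues:
        low, high = PENALTY_RANGE[issue.severity]
        if not low <= issue.penalty <= high:
            raise ValueError(
                f"{issue.severity} penalty must be {low}-{high}, got {issue.penalty}"
            )
        scores[issue.dimension] = max(0, scores[issue.dimension] - issue.penalty)
    return scores
```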

Step 4: Generate Report

Read references/report-template.md for the output format.

The report includes:

Score Card — Overall score + per-dimension breakdown
Issue List — All findings sorted by severity
Top 3 Quick Wins — Highest-impact fixes with before/after examples
Optimization Roadmap — Prioritized improvement plan
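
The issue list and quick wins could be derived from the same data. In this sketch, issues are plain dictionaries with hypothetical `severity` and `penalty` keys, and "highest impact" is approximated by severity first and penalty second, which is an assumption rather than the rule in references/report-template.md.

```python
SEVERITY_RANK = {"critical": 0, "warning": 1, "suggestion": 2}


def sort_by_severity(issues: list[dict]) -> list[dict]:
    """Order issues critical-first; larger penalties come first within a level."""
    return sorted(
        issues,
        key=lambda issue: (SEVERITY_RANK[issue["severity"]], -issue["penalty"]),
    )


def top_quick_wins(issues: list[dict], count: int = 3) -> list[dict]:
    """Pick the first `count` issues from the severity-ordered list."""
    return sort_by_severity(issues)[:count]
```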

Step 5: Offer Follow-Up

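After presenting the report, ask:

"需要我帮你自动修复这些问题吗？" ("Would you like me to fix these issues automatically?"; auto-fix mode)
"需要对某个维度深入分析吗？" ("Would you like a deeper analysis of a specific dimension?"; deep-dive mode)
"需要生成优化后的 SKILL.md 吗？" ("Would you like me to generate an optimized SKILL.md?"; rewrite mode)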

Output Rules

Bilingual report — Chinese first, English after, no interleaving. Always output the complete report in Chinese, then a --- separator, then the complete report in English. Never mix languages within a section. Both versions must contain identical scores, issues, and suggestions — only the language differs.
Score must be justified. Every deducted point must trace to a specific issue.
Suggestions must be actionable. Include before/after code snippets, not vague advice.
Be constructive, not destructive. Lead with what the skill does well before listing issues.
❌ Never inflate scores to be polite — honest assessment helps the user improve.
❌ Never evaluate based on domain correctness of the skill's content (e.g., whether hotel recommendations are good) — only evaluate skill engineering quality.
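
The requirement that both language versions carry identical scores can be spot-checked mechanically. The sketch below assumes scores appear in the report as `N/100`, which is a guess at the format in references/report-template.md, and that the first `---` line on its own is the language separator.

```python
import re

SCORE_PATTERN = re.compile(r"(\d{1,3})\s*/\s*100")


def split_report(report: str) -> tuple[str, str]:
    """Split a bilingual report into (Chinese, English) at the first --- line."""
    chinese, separator, english = report.partition("\n---\n")
    if not separator:
        raise ValueError("report is missing the --- separator")
    return chinese, english


def scores_match(report: str) -> bool:
    """True if both halves list the same scores in the same order."""
    chinese, english = split_report(report)
    return SCORE_PATTERN.findall(chinese) == SCORE_PATTERN.findall(english)
```

This only compares numbers, so it would not catch an issue or suggestion that appears in one language but not the other.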

References

| File | Purpose | When to read |
|------|---------|--------------|
| references/rubric.md | Detailed scoring criteria for all 8 dimensions | Step 2: scoring |
| references/report-template.md | Output format and report structure | Step 4: generating report |
| references/anti-patterns.md | Common skill mistakes and how to detect them | Step 3: finding issues |

Version History

1 version in total

v1.0.0 (current)

2026-05-03 09:33
