Overview

Auto

> Plan → [ExitPlanMode batch-approve] → Execute in Batches → Verify

Manual invocation only. Type /auto. Never auto-triggers.

Announce at start: "I'm using the auto skill to plan and execute this task."

When to Use

Type /auto. That is the only trigger.

Use for: tasks with 2+ distinct steps, multi-file changes, cross-domain work.

Skip for: single-line fixes, pure Q&A, reading files.

Modes

The user's intent determines the mode. No keyword matching — understand what they want from the request.

Mode	User Intent	Behavior
------	-------------	----------
PLAN	User wants a plan only, no execution	EnterPlanMode -> design plan -> save to `docs/plans/` -> ExitPlanMode -> STOP
BUILD	User wants to execute an existing plan	Load plan from `docs/plans/` -> critical review -> execute in batches with checkpoints
FULL	(default) Plan then execute	EnterPlanMode -> present plan -> ExitPlanMode(allowedPrompts) -> execute all steps
AUTO	User wants autonomous execution, minimal prompts	Plan internally -> ExitPlanMode(allowedPrompts) -> execute without review gates

Workflow

Step 0: Initialize Skill Index

Every auto invocation starts here. The skill index at references/skill-index.json is the single source of truth for skill matching. Keep it fresh.

Two scenarios, index absent vs. index present:

Scenario	Behavior
----------	----------
First run (index missing)	Full scan → read every SKILL.md body → classify each skill from full content
Subsequent run (index exists)	Diff scan → only classify new/changed skills via full body read → delete removed entries

Step 0a: Full Scan (First Run — No Index Exists)

This is the expensive path. It happens ONCE.

Run: python scripts/scan_skills.py --json-stdout
Read the scan output. classifications_needed will contain ALL discovered skills.
For every skill in classifications_needed:

a. Read the full SKILL.md body using the file path from the scan output. No shortcuts. No pattern pre-classification. No name+description guessing.

b. Classify using the Classification Guidelines below — ops, domain, prereqs, summary (one sentence capturing the actual behavior, not the frontmatter description), use_for (2-5 specific tasks), do_not_use_for (1-3 likely misapplications).

c. The scan_skills.py pre_classified field is a hint only — verify against the full body. Override when it disagrees.

Write all classifications to skill-index.json in v2 format: {version, scanned_at, skills: {name: {ops, domain, prereqs, summary, use_for, do_not_use_for, content_hash}}}
Save. The index now exists. Proceed to Step 1.

Constraint: Process in batches of 20-30 skills. After each batch, write partial results to the index so a crash doesn't lose all progress.
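
The batch-and-save constraint above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: `build_index`, the `classify` callback (standing in for the full-body read and classification), the `name` key on each scan entry, and the literal `version: 2` value are all assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Iterator


def batched(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_index(to_classify: list, classify: Callable[[dict], dict],
                index_path: Path, batch_size: int = 25) -> dict:
    """Classify every skill, persisting the index after each batch."""
    # Assumed v2 layout: {version, scanned_at, skills: {name: {...}}}
    index = {"version": 2, "scanned_at": None, "skills": {}}
    for batch in batched(to_classify, batch_size):
        for skill in batch:
            # `classify` is a hypothetical stand-in for reading the full SKILL.md body.
            index["skills"][skill["name"]] = classify(skill)
        index["scanned_at"] = datetime.now(timezone.utc).isoformat()
        # Write to a temp file, then rename, so a crash never leaves a half-written index.
        tmp_path = index_path.with_suffix(".tmp")
        tmp_path.write_text(json.dumps(index, indent=2), encoding="utf-8")
        tmp_path.replace(index_path)
    return index
```

With 45 skills and `batch_size=20`, the index is written three times (after 20, 40, and 45 skills), so at most one batch of work is lost on a crash.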

Step 0b: Incremental Update (Subsequent Runs — Index Exists)

This is the cheap path. It happens on every subsequent /auto invocation.

Run: python scripts/scan_skills.py --json-stdout
Read the scan output:

classifications_needed has only new + changed skills
deleted has removed skills

If classifications_needed is non-empty:

a. For each skill: read the full SKILL.md body before classifying (same classification rules as first run)

b. Merge into index: add/update entries in skills, update scanned_at

If deleted is non-empty: Remove those keys from skills in the index.
If both are empty: Index is fresh. Proceed to Step 1.
Save the updated index.
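
The incremental merge above might look like the sketch below. `apply_scan` is a hypothetical helper; refreshing scanned_at on deletions as well as on additions is an assumption, since the steps only state it for the merge.

```python
from datetime import datetime, timezone


def apply_scan(index: dict, classified: dict, deleted: list) -> bool:
    """Merge new/changed entries, drop deleted ones; return True if the index changed."""
    changed = False
    for name, entry in classified.items():
        index["skills"][name] = entry  # add or overwrite
        changed = True
    for name in deleted:
        if index["skills"].pop(name, None) is not None:
            changed = True
    if changed:
        # Assumption: scanned_at is refreshed on any change, including deletions.
        index["scanned_at"] = datetime.now(timezone.utc).isoformat()
    return changed
```

A return value of False corresponds to the "both are empty" case: the index is fresh and there is nothing to save.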

Fallback

If scan_skills.py fails (Python not available, etc.):

Read skill-index.json directly and proceed with whatever data is available
Warn: "Skill index may be outdated."

Index freshness rule: Re-scan if scanned_at is more than 3 days old or the user mentions installing or removing skills.
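
A minimal sketch of the freshness rule, assuming scanned_at is stored as an ISO 8601 string (the source does not specify its format). `needs_rescan` and `user_changed_skills` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_AGE = timedelta(days=3)


def needs_rescan(scanned_at: str, user_changed_skills: bool,
                 now: Optional[datetime] = None) -> bool:
    """True if the index is older than 3 days or the user installed/removed skills."""
    now = now or datetime.now(timezone.utc)
    scanned = datetime.fromisoformat(scanned_at)
    if scanned.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        scanned = scanned.replace(tzinfo=timezone.utc)
    return user_changed_skills or (now - scanned) > MAX_AGE
```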

Step 1: Plan

Parse the request into ordered, bite-sized tasks — each 2-5 minutes of work, one concrete action
Assign each task: exact file paths, tool to use, expected output
For each task, decide: invoke a skill (see Skill Matching below), or use direct tools
Present plan compactly:

Plan: <one-line summary> | Tasks: N | Mode: <mode>
-> Task 1..N: <brief sequence>
Skills: <list or "direct">
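
For illustration, the compact plan format above could be rendered like this. The `Task` dataclass and `format_plan` are hypothetical; only the three-line layout comes from the template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    title: str
    skill: Optional[str] = None  # None means "use direct tools"


def format_plan(summary: str, mode: str, tasks: list) -> str:
    """Render the three-line compact plan summary."""
    sequence = " -> ".join(task.title for task in tasks)
    skills = sorted({task.skill for task in tasks if task.skill})
    return "\n".join([
        f"Plan: {summary} | Tasks: {len(tasks)} | Mode: {mode}",
        f"-> Task 1..{len(tasks)}: {sequence}",
        f"Skills: {', '.join(skills) if skills else 'direct'}",
    ])
```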

Plan file naming (PLAN / FULL modes): docs/plans/YYYY-MM-DD-.md

Step 2: Get Approval

Use ExitPlanMode with allowedPrompts — the official Claude Code batch-approval mechanism. The user approves once, and all listed operations are pre-authorized.

In AUTO mode: still call ExitPlanMode (required by the harness). For truly zero-prompt execution, pre-configure permissions.allow in settings.json:

{
  "permissions": {
    "allow": [
      "Bash(git *)",
      "Bash(npm *)",
      "Bash(cargo *)",
      "Bash(rtk *)",
      "WebSearch",
      "WebFetch(*)"
    ]
  }
}

Use /update-config or /fewer-permission-prompts to build this list from actual usage.

Step 3: Execute in Batches

Adopted from executing-plans: batch of 3 tasks -> report -> checkpoint.

FULL mode: execute batch -> report -> auto-continue to next batch.

BUILD mode: execute batch -> report -> wait for feedback before next batch.

AUTO mode: skip review gates entirely. Continue until done or blocked.

Re-evaluate after each batch: if a later task's inputs changed due to earlier results, update it before executing.

Per-task execution:

Announce: Task N/M:
Invoke matched skill via Skill tool, or use direct tools
Run verification for the task
Mark complete with TaskUpdate

Track all tasks with TaskCreate / TaskUpdate (Claude Code's official task tracking).
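
The batch loop and its per-mode gates can be sketched as below. `run_task`, `report`, and `wait_for_feedback` are hypothetical callbacks standing in for the real tool calls, and reporting in AUTO mode is an assumption (the text only says review gates are skipped).

```python
from typing import Callable, Sequence

BATCH_SIZE = 3


def run_in_batches(tasks: Sequence[str], mode: str,
                   run_task: Callable[[str], str],
                   report: Callable[[list], None],
                   wait_for_feedback: Callable[[], bool]) -> list:
    """Execute tasks three at a time; only BUILD mode pauses between batches."""
    results = []
    for start in range(0, len(tasks), BATCH_SIZE):
        batch = tasks[start:start + BATCH_SIZE]
        batch_results = [run_task(task) for task in batch]
        results.extend(batch_results)
        report(batch_results)
        more_batches_left = start + BATCH_SIZE < len(tasks)
        if mode == "BUILD" and more_batches_left:
            # Checkpoint: stop here unless the user says to continue.
            if not wait_for_feedback():
                break
        # FULL and AUTO fall through and continue with the next batch.
    return results
```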

Step 4: Verify

Iron law (from verification-before-completion):

NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE

For each task and at final backpressure:

IDENTIFY: What command proves this claim?
RUN: Execute the full command (fresh, complete)
READ: Full output, check exit code, count failures
VERIFY: Does output confirm the claim?
ONLY THEN: Make the claim

Never use "should work", "probably", or "seems to". Run the command. Read the output. Then claim.
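
One way to enforce the RUN / READ / VERIFY steps is a small wrapper that returns both the exit status and the full output, so a claim can only follow a real run. `verify` is a hypothetical helper, and the 600-second timeout is an arbitrary assumption.

```python
import subprocess
import sys


def verify(command: list, timeout: float = 600) -> tuple:
    """Run the proving command fresh; return (passed, full_output)."""
    proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    output = proc.stdout + proc.stderr
    # Exit code 0 is necessary but not sufficient: the caller still reads `output`.
    return proc.returncode == 0, output


if __name__ == "__main__":
    passed, output = verify([sys.executable, "-c", "print('2 passed, 0 failed')"])
    print(output.strip())
    print("claim allowed" if passed else "no claim: verification failed")
```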

When to Stop

Adopted from executing-plans — STOP immediately when:

Hit a blocker (missing dependency, test fails, instruction unclear)
Verification fails after 3 attempts with different approaches (per-task counter, resets each new task)
You don't understand an instruction

Ask for clarification rather than guessing. Don't force through blockers.
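
The three-attempt rule can be sketched as a loop scoped to one task, which makes the counter reset naturally for the next task. `run_with_retries`, `Blocked`, and the idea of passing each approach as a callable that returns its verification result are assumptions for illustration.

```python
from typing import Callable, Sequence

MAX_ATTEMPTS = 3


class Blocked(Exception):
    """Raised when a task must stop and go back to the user."""


def run_with_retries(task: str, approaches: Sequence[Callable[[], bool]]) -> int:
    """Try up to three distinct approaches; return the attempt number that verified."""
    tried = approaches[:MAX_ATTEMPTS]
    for attempt, approach in enumerate(tried, start=1):
        if approach():  # each approach runs the task and its verification
            return attempt
    raise Blocked(f"{task}: verification failed after {len(tried)} attempts")
```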

Skill Matching

The skill index at references/skill-index.json is the single source of truth. Match each task against the index using the algorithm below.

Matching Algorithm

Given a task with an operation tag:

Pre-filter: If the task is a simple file read, shell command, or local file search (Glob/Grep/Read), skip skills entirely — use direct tools.

Compatibility Gate: Remove skills whose prereqs are not met in the current environment (no git repo -> remove git-prereq skills, etc.)

Operation Filter: Keep only skills where at least one ops tag matches the task's operation. Hard gates:

explore:local tasks -> never match explore:web skills
create tasks -> never match review-only skills
design tasks -> never match execute-only skills

Rank Candidates (additive scoring):

+4: exact operation tag match
+3: domain matches task domain
+3 per word: skill name keywords appear in task description (up to +6)
+1: skill's use_for entries semantically match the task
-2 per word: skill's do_not_use_for entries semantically match the task

Select: Take the highest-ranked skill with score >= 3. If no skill scores >= 3, use direct tools.

Tiebreaker (when multiple skills have equal score):

Prefer the skill with fewer words in its name (simpler = more general-purpose)
Prefer the skill with more ops tags (broader applicability)
If still tied, pick the first alphabetically
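
The gate, filter, scoring, and tiebreak steps above can be sketched as follows. Several parts are assumptions rather than the skill's actual logic: `phrase_matches` is a crude word-overlap stand-in for semantic matching, the -2 penalty is applied once per matching do_not_use_for entry, and skill names are split into keywords on non-alphanumeric characters. With exact tag matching, the hard gates follow from the operation filter. The pre-filter for simple reads and shell commands is left to the caller.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Skill:
    name: str
    ops: tuple
    domain: str
    prereqs: tuple = ()
    use_for: tuple = ()
    do_not_use_for: tuple = ()


def words(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def phrase_matches(phrase: str, description: str) -> bool:
    """Stand-in for semantic matching: every word of the phrase occurs in the task."""
    phrase_words = words(phrase)
    return bool(phrase_words) and phrase_words <= words(description)


def score(skill: Skill, op: str, domain: str, description: str) -> int:
    total = 4 if op in skill.ops else 0                                # exact operation tag
    total += 3 if skill.domain == domain else 0                        # domain match
    total += min(3 * len(words(skill.name) & words(description)), 6)   # name keywords, capped
    total += 1 if any(phrase_matches(p, description) for p in skill.use_for) else 0
    total -= 2 * sum(phrase_matches(p, description) for p in skill.do_not_use_for)
    return total


def select_skill(skills: list, op: str, domain: str, description: str,
                 available_prereqs: set) -> Optional[str]:
    """Return the best skill name, or None to signal 'use direct tools'."""
    candidates = [s for s in skills
                  if set(s.prereqs) <= available_prereqs  # compatibility gate
                  and op in s.ops]                        # operation filter
    scored = [(score(s, op, domain, description), s) for s in candidates]
    scored = [(points, s) for points, s in scored if points >= 3]
    if not scored:
        return None
    # Highest score, then fewer name words, then more ops tags, then alphabetical.
    best = min(scored, key=lambda pair: (-pair[0], len(words(pair[1].name)),
                                         -len(pair[1].ops), pair[1].name))
    return best[1].name
```

Because every candidate that survives the operation filter already earns +4, the >= 3 threshold only rejects skills whose do_not_use_for penalties outweigh their other points.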

Standardized Tags

Operation Tags (assign exactly ONE per task)

Tag	Meaning
-----	---------
`create`	Making new files, features, content from scratch
`update`	Modifying, refactoring, fixing existing things
`review`	Reading, analyzing, auditing, explaining
`design`	Planning, brainstorming, architecting, estimating
`execute:local`	Running local commands, builds, tests, scripts
`execute:remote`	Deploying, pushing, remote API calls
`explore:local`	Searching/reading local codebase
`explore:web`	Web research, external data fetching

Domain Tags

meta backend frontend devops testing docs git security ml research utility performance

Prereq Tags

git git:diff web node python pip mcp api:anthropic

Classification Guidelines

When classifying a skill from its description and source context:

ops (choose 1-4):

create if it produces new files/code/content
update if it modifies existing things
review if it reads, analyzes, audits, explains, or inspects
design if it plans, brainstorms, architects, or estimates
execute:local if it runs local commands (build, test, install, cli)
execute:remote if it deploys, pushes, or calls remote services
explore:local if it searches/reads the local codebase
explore:web if it does web searches or fetches external data

domain (exactly 1): Infer from description keywords and source context.

Plugin skills under "plugin-dev/" -> meta
CLI-anything skills -> utility
Skills mentioning React/UI/frontend/css -> frontend
Skills mentioning API/server/backend/database -> backend
Skills mentioning deploy/k8s/infra/docker -> devops
Skills mentioning test/verify/QA/TDD -> testing
Skills mentioning docs/write/presentation/README -> docs
Skills mentioning git/commit/PR/branch -> git
Skills mentioning security/vulnerability/audit -> security
Skills mentioning ML/model/training -> ml
Skills mentioning research/search/data -> research

prereqs (0-4): Infer from description. Git commands -> git. pip install -> pip. npm/node -> node. Web searches -> web. MCP tools -> mcp.

use_for (2-5 short phrases): What specific tasks does this skill handle well? Be specific.

do_not_use_for (1-3 short phrases): What common tasks would be a bad fit? Focus on likely mistakes.

Batch strategy: Classify in groups by source context for consistency (e.g., all sc- skills together, all cli-anything- together).
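
The cardinality rules in these guidelines can be checked mechanically before an entry is written to the index. A minimal sketch, assuming `domain` is stored as a single string and the other fields as lists; `validate_entry` is a hypothetical helper, and the tag sets are copied from the Standardized Tags section.

```python
OPS = {"create", "update", "review", "design",
       "execute:local", "execute:remote", "explore:local", "explore:web"}
DOMAINS = {"meta", "backend", "frontend", "devops", "testing", "docs",
           "git", "security", "ml", "research", "utility", "performance"}
PREREQS = {"git", "git:diff", "web", "node", "python", "pip", "mcp", "api:anthropic"}


def validate_entry(entry: dict) -> list:
    """Return a list of problems; an empty list means the entry follows the guidelines."""
    errors = []
    ops = entry.get("ops", [])
    if not 1 <= len(ops) <= 4:
        errors.append("ops: expected 1-4 tags")
    errors += [f"ops: unknown tag {tag!r}" for tag in ops if tag not in OPS]
    domain = entry.get("domain")
    if not isinstance(domain, str) or domain not in DOMAINS:
        errors.append("domain: expected exactly one known tag")
    prereqs = entry.get("prereqs", [])
    if len(prereqs) > 4:
        errors.append("prereqs: expected 0-4 tags")
    errors += [f"prereqs: unknown tag {tag!r}" for tag in prereqs if tag not in PREREQS]
    if not 2 <= len(entry.get("use_for", [])) <= 5:
        errors.append("use_for: expected 2-5 phrases")
    if not 1 <= len(entry.get("do_not_use_for", [])) <= 3:
        errors.append("do_not_use_for: expected 1-3 phrases")
    return errors
```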

File-Based Memory

For tasks with 5+ steps:

File	Purpose
------	---------
`docs/plans/YYYY-MM-DD-.md`	Tasks, progress, decisions
`findings.md`	Research discoveries
`progress.md`	Session log

Reboot check (after context gaps): Read plan file -> check current phase -> resume from last completed task.

Completion

After all tasks complete and verified (from finishing-a-development-branch):

Verify that the project's test command passes
Run final backpressure check (build, test, lint)
Report with evidence: All N tasks complete. .

Version History

10 versions in total

v1.0.9 Initial release (current)
2026-06-12 17:31

v1.0.8 Initial release
2026-06-07 13:45

v1.0.7 Initial release
2026-06-06 22:24

v1.0.6 Initial release
2026-06-06 16:24

v1.0.5 Initial release
2026-06-04 20:37

v1.0.4 Bug fixes
2026-06-02 15:17

v1.0.3 Embedded into the Matching Algorithm section of the auto SKILL.md.
2026-06-01 22:59

Previous problems:
- Every match required an extra Read of skill-registry.md (1 tool call)
- Table-style scan of ~100 entries, O(n) lookup
- The registry could be out of date

Current approach, a three-tier lookup system:

Priority	Path	Cost	Coverage
----------	------	------	----------
Step 0	Embedded Skill Classifier (Operation → Skills table + Domain cross-reference)	0 tool calls, O(1)	~80%
Fallback 1	references/skill-registry.md	1 tool call	~95%
Fallback 2	Infer from system prompt	0 tool calls	~100%

Added:
- 8-row Operation → Skills table, indexed by create/update/design/review/explore:web/explore:local/execute:local/execute:remote
- 9-row Domain cross-reference table: React/Backend/ML/DevOps/Security/Performance/Docs/Git/Testing/Research
- Inline prerequisite markers †git ¶py §pip *web ◊mcp ¤anthropic, so a single lookup matches operation, domain, and prerequisites

v1.0.2 Refactored the auto skill's SKILL.md, absorbing the strengths of four skills (planning-with-files, executing-plans, subagent-driven-development, ralph-loop), and fixed all 10 defects (4 original + 6 from the audit).
2026-06-01 22:01

v1.0.1 Optimized the skill-matching logic.
2026-06-01 15:14

SKILL.md improvements:
1. Operation semantic tags: each step is tagged create / update / explore:local / explore:web / execute:local / execute:remote / design / review, which avoids matching web research tools to local file exploration
2. Compatibility admission check: before matching, check the prerequisites a skill needs (MCP, git, network) and eliminate it outright if they are not met
3. Mismatch detection: scan a skill's instructions immediately after loading it and switch to the fallback as soon as a mismatch is found
4. Degradation chain: best skill → next-best skill → agent → direct execution; never retry the same skill
5. Direct-skip scenarios: steps that cost less than a skill invocation, such as single-file reads/writes and simple shell commands, go straight to direct execution

skill-registry.md improvements:
- Added three columns for each skill: Ops (operation tags), Prereqs (prerequisites), and DO NOT use for (negative matching)
- Added a "find skills by prerequisite" index table to quickly identify which skills depend on MCP, git, network, or a runtime

v1.0.0 the first version
2026-06-01 14:46
