
auto-invoke

/auto — plan & execute multi-step tasks. Uses ExitPlanMode + allowedPrompts (official batch-approval) instead of prompt-based confirm hacks.
meltme
Uncategorized | community | v1.0.9 | 10 versions | 100000 | Key: not required | ★ 2 stars | 📥 59 downloads | 💾 0 installs | #latest

Overview

Auto

> Plan → [ExitPlanMode batch-approve] → Execute in Batches → Verify

Manual invocation only. Type /auto. Never auto-triggers.

Announce at start: "I'm using the auto skill to plan and execute this task."

When to Use

Type /auto. That is the only trigger.

Use for: tasks with 2+ distinct steps, multi-file changes, cross-domain work.

Skip for: single-line fixes, pure Q&A, reading files.

Modes

The user's intent determines the mode. No keyword matching — understand what they want from the request.

| Mode | User Intent | Behavior |
|------|-------------|----------|
| PLAN | User wants a plan only, no execution | EnterPlanMode -> design plan -> save to docs/plans/ -> ExitPlanMode -> STOP |
| BUILD | User wants to execute an existing plan | Load plan from docs/plans/ -> critical review -> execute in batches with checkpoints |
| FULL (default) | Plan then execute | EnterPlanMode -> present plan -> ExitPlanMode(allowedPrompts) -> execute all steps |
| AUTO | User wants autonomous execution, minimal prompts | Plan internally -> ExitPlanMode(allowedPrompts) -> execute without review gates |

Workflow

Step 0: Initialize Skill Index

Every auto invocation starts here. The skill index at references/skill-index.json is the single source of truth for skill matching. Keep it fresh.

Two modes — index absent vs. index present:

| Scenario | Behavior |
|----------|----------|
| First run (index missing) | Full scan → read every SKILL.md body → classify each skill from full content |
| Subsequent run (index exists) | Diff scan → only classify new/changed skills via full body read → delete removed entries |

Step 0a: Full Scan (First Run — No Index Exists)

This is the expensive path. It happens ONCE.

  1. Run: python scripts/scan_skills.py --json-stdout
  2. Read the scan output. classifications_needed will contain ALL discovered skills.
  3. For every skill in classifications_needed:

a. Read the full SKILL.md body using the file path from the scan output. No shortcuts. No pattern pre-classification. No name+description guessing.

b. Classify using the Classification Guidelines below — ops, domain, prereqs, summary (one sentence capturing the actual behavior, not the frontmatter description), use_for (2-5 specific tasks), do_not_use_for (1-3 likely misapplications).

c. The scan_skills.py pre_classified field is a hint only — verify against the full body. Override when it disagrees.

  4. Write all classifications to skill-index.json v2 format: {version, scanned_at, skills: {name: {ops, domain, prereqs, summary, use_for, do_not_use_for, content_hash}}}
  5. Save. The index now exists. Proceed to Step 1.

Constraint: Process in batches of 20-30 skills. After each batch, write partial results to the index so a crash doesn't lose all progress.
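
The batching constraint above could be sketched as follows. This is a minimal illustration, not the skill's actual implementation: `write_index_in_batches` and the `classify` callback are hypothetical names, the literal `version` value of 2 is an assumption based on the "v2 format" wording, and the write-then-rename step is one possible way to avoid a half-written index.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_index_in_batches(skills, classify, index_path, batch_size=25):
    """Classify skills batch by batch, saving the index after every batch."""
    # Assumed v2 layout: {version, scanned_at, skills: {name: entry}}
    index = {"version": 2, "scanned_at": None, "skills": {}}
    for start in range(0, len(skills), batch_size):
        for skill in skills[start:start + batch_size]:
            # classify() is a hypothetical callback returning the entry dict
            # (ops, domain, prereqs, summary, use_for, do_not_use_for, content_hash).
            index["skills"][skill["name"]] = classify(skill)
        index["scanned_at"] = datetime.now(timezone.utc).isoformat()
        # Write to a temp file, then rename, so a crash mid-write
        # cannot leave a truncated index behind.
        tmp = Path(str(index_path) + ".tmp")
        tmp.write_text(json.dumps(index, indent=2), encoding="utf-8")
        tmp.replace(index_path)
    return index
```

With 60 skills and the default batch size, the index file would be rewritten three times, so at most one batch of work is lost on a crash.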

Step 0b: Incremental Update (Subsequent Runs — Index Exists)

This is the cheap path. It happens on every subsequent /auto invocation.

  1. Run: python scripts/scan_skills.py --json-stdout
  2. Read the scan output:
    • classifications_needed has only new + changed skills
    • deleted has removed skills
  3. If classifications_needed is non-empty:

a. For each skill: read the full SKILL.md body before classifying (same classification rules as first run)

b. Merge into index: add/update entries in skills, update scanned_at

  4. If deleted is non-empty: Remove those keys from skills in the index.
  5. If both are empty: Index is fresh. Proceed to Step 1.
  6. Save the updated index.
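
As a rough sketch of the merge described in these steps: `merge_scan` is a hypothetical helper, and bumping `scanned_at` on deletions as well as additions is an assumption, since the text only mentions updating it when entries are merged.

```python
from datetime import datetime, timezone


def merge_scan(index, new_entries, deleted):
    """Return an updated copy of the index after an incremental scan."""
    skills = dict(index.get("skills", {}))
    skills.update(new_entries)      # add new skills, overwrite changed ones
    for name in deleted:
        skills.pop(name, None)      # tolerate names that are already absent
    changed = bool(new_entries) or bool(deleted)
    return {
        **index,
        "skills": skills,
        # Assumption: the timestamp only moves when something changed.
        "scanned_at": (
            datetime.now(timezone.utc).isoformat()
            if changed
            else index.get("scanned_at")
        ),
    }
```

The function returns a new dict rather than mutating its input, so the caller can compare old and new before saving.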

Fallback

If scan_skills.py fails (Python not available, etc.):

  • Read skill-index.json directly and proceed with whatever data is available
  • Warn: "Skill index may be outdated."

Index freshness rule: Re-scan if scanned_at is more than 3 days old or the user mentions installing or removing skills.
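
The freshness rule can be expressed as a small check. The sketch below assumes `scanned_at` is stored as an ISO 8601 string with a UTC offset; `needs_rescan` and its parameters are hypothetical names used only for illustration.

```python
from datetime import datetime, timedelta, timezone


def needs_rescan(scanned_at, now=None, max_age=timedelta(days=3),
                 user_changed_skills=False):
    """Return True when the index should be re-scanned."""
    if user_changed_skills or not scanned_at:
        return True
    now = now or datetime.now(timezone.utc)
    scanned = datetime.fromisoformat(scanned_at)
    if scanned.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        scanned = scanned.replace(tzinfo=timezone.utc)
    return now - scanned > max_age
```

For example, `needs_rescan(index["scanned_at"], user_changed_skills=True)` forces a re-scan regardless of the timestamp.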

Step 1: Plan

  1. Parse the request into ordered, bite-sized tasks — each 2-5 minutes of work, one concrete action
  2. Assign each task: exact file paths, tool to use, expected output
  3. For each task, decide: invoke a skill (see Skill Matching below), or use direct tools
  4. Present plan compactly:
Plan: <one-line summary> | Tasks: N | Mode: <mode>
-> Task 1..N: <brief sequence>
Skills: <list or "direct">

Plan file naming (PLAN / FULL modes): docs/plans/YYYY-MM-DD-.md

Step 2: Get Approval

Use ExitPlanMode with allowedPrompts — the official Claude Code batch-approval mechanism. The user approves once, all listed operations are pre-authorized.

In AUTO mode: still call ExitPlanMode (required by the harness). For truly zero-prompt execution, pre-configure permissions.allow in settings.json:

{
  "permissions": {
    "allow": [
      "Bash(git *)",
      "Bash(npm *)",
      "Bash(cargo *)",
      "Bash(rtk *)",
      "WebSearch",
      "WebFetch(*)"
    ]
  }
}

Use /update-config or /fewer-permission-prompts to build this list from actual usage.

Step 3: Execute in Batches

Adopted from executing-plans: batch of 3 tasks -> report -> checkpoint.

FULL mode: execute batch -> report -> auto-continue to next batch.

BUILD mode: execute batch -> report -> wait for feedback before next batch.

AUTO mode: skip review gates entirely. Continue until done or blocked.

Re-evaluate after each batch: if a later task's inputs changed due to earlier results, update it before executing.
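
A minimal sketch of the batch loop, under stated assumptions: `batches`, `run_plan`, `execute`, and `on_checkpoint` are hypothetical names, and the checkpoint callback stands in for "wait for feedback" in BUILD mode. The re-evaluation of later tasks is left out for brevity.

```python
def batches(tasks, size=3):
    """Yield consecutive groups of at most `size` tasks."""
    for start in range(0, len(tasks), size):
        yield tasks[start:start + size]


def run_plan(tasks, execute, mode="FULL", on_checkpoint=None):
    """Execute tasks batch by batch and collect one report per batch."""
    reports = []
    for batch in batches(tasks):
        reports.append([execute(task) for task in batch])
        # Only BUILD pauses for feedback; FULL and AUTO continue on their own.
        if mode == "BUILD" and on_checkpoint is not None:
            on_checkpoint(reports[-1])
    return reports
```

Seven tasks would run as batches of 3, 3, and 1, with a checkpoint after each batch in BUILD mode.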

Per-task execution:

  1. Announce: Task N/M:
  2. Invoke matched skill via Skill tool, or use direct tools
  3. Run verification for the task
  4. Mark complete with TaskUpdate

Track all tasks with TaskCreate / TaskUpdate (Claude Code's official task tracking).

Step 4: Verify

Iron law (from verification-before-completion):

NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE

For each task and at final backpressure:

  1. IDENTIFY: What command proves this claim?
  2. RUN: Execute the full command (fresh, complete)
  3. READ: Full output, check exit code, count failures
  4. VERIFY: Does output confirm the claim?
  5. ONLY THEN: Make the claim

Never use "should work", "probably", or "seems to". Run the command. Read the output. Then claim.
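
One way to make the RUN and READ steps concrete is to wrap the verification command so that the exit code and full output are always captured together. This is a sketch only; `verify` is a hypothetical helper, and the command shown in the usage block is a placeholder for the project's real test command.

```python
import subprocess
import sys


def verify(command):
    """Run a verification command fresh and return (passed, output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    output = result.stdout + result.stderr
    # A claim is only supported when the command exits with status 0.
    return result.returncode == 0, output


if __name__ == "__main__":
    # Placeholder command; substitute the project's build, test, or lint command.
    passed, output = verify([sys.executable, "-c", "print('ok')"])
    print("verified" if passed else "NOT verified")
    print(output)
```

The caller still has to read the output: a zero exit code from the wrong command proves nothing about the claim.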

When to Stop

Adopted from executing-plans — STOP immediately when:

  • Hit a blocker (missing dependency, test fails, instruction unclear)
  • Verification fails after 3 attempts with different approaches (per-task counter, resets each new task)
  • You don't understand an instruction

Ask for clarification rather than guessing. Don't force through blockers.

Skill Matching

The skill index at references/skill-index.json is the single source of truth. Match each task against the index using the algorithm below.

Matching Algorithm

Given a task with an operation tag:

  1. Pre-filter: If the task is a simple file read, shell command, or local file search (Glob/Grep/Read), skip skills entirely and use direct tools.
  2. Compatibility Gate: Remove skills whose prereqs are not met in the current environment (no git repo -> remove git-prereq skills, etc.)
  3. Operation Filter: Keep only skills where at least one ops tag matches the task's operation. Hard gates:
    • explore:local tasks -> never match explore:web skills
    • create tasks -> never match review-only skills
    • design tasks -> never match execute-only skills
  4. Rank Candidates (additive scoring):
    • +4: exact operation tag match
    • +3: domain matches task domain
    • +3 per word: skill name keywords appear in task description (up to +6)
    • +1: skill's use_for entries semantically match the task
    • -2 per word: skill's do_not_use_for entries semantically match the task
  5. Select: Take the highest-ranked skill with score >= 3. If no skill scores >= 3, use direct tools.
  6. Tiebreaker (when multiple skills have equal score):
    • Prefer the skill with fewer words in its name (simpler = more general-purpose)
    • Prefer the skill with more ops tags (broader applicability)
    • If still tied, pick the first alphabetically
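
The gates, scoring, and tiebreakers above can be sketched in code. This is an illustration under several assumptions, not the skill's actual logic: `score` and `match_skill` are hypothetical names, plain word overlap stands in for "semantically match", the explore:local gate is read as excluding any skill tagged explore:web, and the pre-filter for simple tasks is omitted.

```python
import re


def _words(text):
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(name, skill, task):
    desc = _words(task["description"])
    total = 0
    if task["op"] in skill["ops"]:
        total += 4                                   # exact operation tag
    if skill["domain"] == task["domain"]:
        total += 3                                   # domain match
    total += min(6, 3 * len(_words(name) & desc))    # name keywords, capped
    if any(_words(p) & desc for p in skill.get("use_for", [])):
        total += 1                                   # crude stand-in for semantic match
    banned = set().union(*(_words(p) for p in skill.get("do_not_use_for", [])))
    total -= 2 * len(banned & desc)                  # penalty per overlapping word
    return total


def match_skill(task, skills, available_prereqs):
    """Return the best skill name, or None to fall back to direct tools."""
    candidates = []
    for name, skill in skills.items():
        if not set(skill.get("prereqs", [])) <= set(available_prereqs):
            continue                                 # compatibility gate
        if task["op"] not in skill["ops"]:
            continue                                 # operation filter
        if task["op"] == "explore:local" and "explore:web" in skill["ops"]:
            continue                                 # hard gate (assumed reading)
        s = score(name, skill, task)
        if s >= 3:
            # Sort key: higher score, fewer name words, more ops tags, then name.
            candidates.append((-s, len(_words(name)), -len(skill["ops"]), name))
    return min(candidates)[3] if candidates else None
```

The create and design hard gates need no extra code here, because a review-only or execute-only skill already fails the operation filter for those tasks.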

Standardized Tags

Operation Tags (assign exactly ONE per task)

| Tag | Meaning |
|-----|---------|
| create | Making new files, features, content from scratch |
| update | Modifying, refactoring, fixing existing things |
| review | Reading, analyzing, auditing, explaining |
| design | Planning, brainstorming, architecting, estimating |
| execute:local | Running local commands, builds, tests, scripts |
| execute:remote | Deploying, pushing, remote API calls |
| explore:local | Searching/reading local codebase |
| explore:web | Web research, external data fetching |

Domain Tags

meta backend frontend devops testing docs git security ml research utility performance

Prereq Tags

git git:diff web node python pip mcp api:anthropic

Classification Guidelines

When classifying a skill from its description and source context:

ops (choose 1-4):

  • create if it produces new files/code/content
  • update if it modifies existing things
  • review if it reads, analyzes, audits, explains, or inspects
  • design if it plans, brainstorms, architects, or estimates
  • execute:local if it runs local commands (build, test, install, cli)
  • execute:remote if it deploys, pushes, or calls remote services
  • explore:local if it searches/reads the local codebase
  • explore:web if it does web searches or fetches external data

domain (exactly 1): Infer from description keywords and source context.

  • Plugin skills under "plugin-dev/" -> meta
  • CLI-anything skills -> utility
  • Skills mentioning React/UI/frontend/css -> frontend
  • Skills mentioning API/server/backend/database -> backend
  • Skills mentioning deploy/k8s/infra/docker -> devops
  • Skills mentioning test/verify/QA/TDD -> testing
  • Skills mentioning docs/write/presentation/README -> docs
  • Skills mentioning git/commit/PR/branch -> git
  • Skills mentioning security/vulnerability/audit -> security
  • Skills mentioning ML/model/training -> ml
  • Skills mentioning research/search/data -> research

prereqs (0-4): Infer from description. Git commands -> git. pip install -> pip. npm/node -> node. Web searches -> web. MCP tools -> mcp.

use_for (2-5 short phrases): What specific tasks does this skill handle well? Be specific.

do_not_use_for (1-3 short phrases): What common tasks would be a bad fit? Focus on likely mistakes.

Batch strategy: Classify in groups by source context for consistency (e.g., all sc- skills together, all cli-anything- together).

File-Based Memory

For tasks with 5+ steps:

| File | Purpose |
|------|---------|
| docs/plans/YYYY-MM-DD-.md | Tasks, progress, decisions |
| findings.md | Research discoveries |
| progress.md | Session log |
Reboot check (after context gaps): Read plan file -> check current phase -> resume from last completed task.

Completion

After all tasks complete and verified (from finishing-a-development-branch):

  1. Verify that tests pass using the project's test command
  2. Run final backpressure check (build, test, lint)
  3. Report with evidence: All N tasks complete. .

Version history

10 versions in total

  • v1.0.9 Initial release (current)
    2026-06-12 17:31 Safe Safe
  • v1.0.8 Initial release
    2026-06-07 13:45 Safe Safe
  • v1.0.7 Initial release
    2026-06-06 22:24 Safe Safe
  • v1.0.6 Initial release
    2026-06-06 16:24 Safe Safe
  • v1.0.5 Initial release
    2026-06-04 20:37 Safe Safe
  • v1.0.4 Bug fixes
    2026-06-02 15:17 Safe Safe
  • v1.0.3 Embedded into the Matching Algorithm section of the auto SKILL.md.
    Previous problems:
    - Every match required an extra Read of skill-registry.md (1 tool call)
    - Table-style scan of ~100 entries, O(n) lookup
    - The registry could go stale
    Current approach, a three-tier lookup:

    | Priority | Path | Cost | Coverage |
    |----------|------|------|----------|
    | Step 0 | Embedded Skill Classifier (Operation → Skills table + Domain cross-reference) | 0 tool calls, O(1) | ~80% |
    | Fallback 1 | references/skill-registry.md | 1 tool call | ~95% |
    | Fallback 2 | Inferred from the system prompt | 0 tool calls | ~100% |

    Added:
    - 8-row Operation → Skills table, indexed by create/update/design/review/explore:web/explore:local/execute:local/execute:remote
    - 9-row Domain cross-reference table: React/Backend/ML/DevOps/Security/Performance/Docs/Git/Testing/Research
    - Inline prerequisite markers †git ¶py §pip *web ◊mcp ¤anthropic, so a single lookup matches operation + domain + prerequisites
    2026-06-01 22:59 Safe Safe
  • v1.0.2 Restructured the auto skill's SKILL.md, absorbing the strengths of four skills (planning-with-files, executing-plans, subagent-driven-development, ralph-loop) and fixing all 10 defects (4 original + 6 found in audit)
    2026-06-01 22:01 Safe Safe
  • v1.0.1 Optimized the skill matching logic.
    SKILL.md improvements:
    1. Operation semantic tags: each step is tagged create / update / explore:local / explore:web / execute:local / execute:remote / design / review, which avoids matching web research tools to local file exploration
    2. Compatibility admission check: before matching, check the prerequisites a skill needs (MCP, git, network) and eliminate it if they are not met
    3. Match interruption detection: scan a skill's instructions immediately after loading and switch to the fallback as soon as a mismatch is found
    4. Degradation chain: best skill → next-best skill → agent → direct execution, never retrying the same skill
    5. Direct-skip scenarios: steps cheaper than a skill invocation, such as single-file reads/writes and simple shell commands, run directly
    skill-registry.md improvements:
    - Each skill gains three columns: Ops (operation tags), Prereqs (prerequisites), and DO NOT use for (negative matching)
    - New "find skills by prerequisite" index table to quickly identify which skills depend on MCP/git/network/runtime
    2026-06-01 15:14 Safe Safe
  • v1.0.0 the first version
    2026-06-01 14:46 Safe Safe

Security scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
