Skill Auditor & Enhancer

Periodically audit all workspace skills, learnings, memory, and configuration files to recommend refactoring, new skill ideas, and workflow improvements. Tri...
Author: omaression
Version: v1.0.0-alpha (clawhub, 1 version, tagged #latest)

Overview

Skill Auditor

Automated weekly workspace health check. Evaluates skills, learnings, memory, and config files. Delivers actionable recommendations to Telegram.

Pipeline architecture

A four-phase sequential pipeline; Phase 2 runs its two evaluators in parallel:

Phase 1: Digest (opencode-go/kimi-k2.5)

Ingest all workspace files in one long-context call:

  • skills/*/SKILL.md and associated scripts/tests
  • .learnings/LEARNINGS.md, ERRORS.md, FEATURE_REQUESTS.md
  • SOUL.md, AGENTS.md, USER.md, TOOLS.md, MEMORY.md, HEARTBEAT.md
  • recent memory/*.md files (last 14 days)

Output: audit-state.json with per-file summaries, staleness scores, overlap detection, gap analysis.

Optimization: hash each watched file and compare it against the hashes stored in state.json by the previous run. Skip unchanged files to avoid spending tokens on them.
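The change-detection step can be sketched roughly as follows. This is a minimal illustration, not the skill's actual `build_audit_state.py`: the function names, the choice of SHA-256, and the flat `{path: digest}` layout of state.json are all assumptions.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(paths: list[Path], state_path: Path) -> list[Path]:
    """Return the paths whose digest differs from the stored one, then save the new digests.

    Assumed state layout (hypothetical): a flat JSON object mapping path -> digest.
    """
    previous = json.loads(state_path.read_text()) if state_path.exists() else {}
    current = {str(p): file_digest(p) for p in paths}
    changed = [p for p in paths if previous.get(str(p)) != current[str(p)]]
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps(current, indent=2, sort_keys=True))
    return changed
```

On a first run with no state file every path counts as changed; a second run over untouched files returns an empty list, so Phase 1 would have nothing new to digest.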

Also: web_search for best practices relevant to detected gaps.

Phase 2: Evaluate (parallel)

Phase 2A (opencode-go/glm-5): Score each skill on effectiveness, token efficiency, coverage, staleness, overlap, alignment with USER.md goals. Propose new skill ideas.

Phase 2B (openai-codex/gpt-5.3-codex): Score independently. Generate concrete refactor proposals. Propose new skill ideas.

Both output structured evaluation JSON.
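One way to run the two evaluators side by side is a small thread pool, sketched below. The `Evaluator` callables stand in for whatever actually invokes each model; they and `run_evaluators` are hypothetical names, not part of the skill.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical signature: an evaluator takes the audit state and returns evaluation JSON.
Evaluator = Callable[[dict], dict]


def run_evaluators(audit_state: dict, evaluators: dict[str, Evaluator]) -> dict[str, dict]:
    """Run every evaluator on the same audit state concurrently; return results keyed by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(evaluators))) as pool:
        futures = {name: pool.submit(fn, audit_state) for name, fn in evaluators.items()}
        # result() blocks until that evaluator finishes and re-raises any exception it hit.
        return {name: future.result() for name, future in futures.items()}
```

Threads fit here on the assumption that each evaluator mostly waits on a remote model call. Each evaluator receives the same input and never sees the other's output, which keeps the two scores independent.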

Phase 3: Judge (openai-codex/gpt-5.4)

Receives: audit-state.json + both evaluation outputs.

  • Cross-validate proposals, resolve conflicts
  • Filter: only recommend changes with clear ROI
  • Classify each recommendation:
      • 🟢 safe refactor: low-risk, can PR directly after approval
      • 🟡 needs review: structural change or new skill creation
      • 🔴 informational: trend or observation, no action yet
  • Confidence threshold: ≥0.7 to recommend, ≥0.85 for safe-refactor classification

Output: final-recommendations.json
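The two confidence thresholds could be enforced with a filter like the sketch below. The source does not say what happens to a green item that scores between 0.7 and 0.85; demoting it to yellow is an assumption made here for illustration, as is the function name.

```python
RECOMMEND_THRESHOLD = 0.7        # minimum confidence to recommend at all
SAFE_REFACTOR_THRESHOLD = 0.85   # minimum confidence to keep the green classification


def apply_thresholds(recommendations: list[dict]) -> list[dict]:
    """Drop low-confidence recommendations and demote under-confident green ones."""
    kept = []
    for rec in recommendations:
        if rec["confidence"] < RECOMMEND_THRESHOLD:
            continue  # below the bar for recommending anything
        rec = dict(rec)  # copy so the caller's input is not mutated
        if rec["severity"] == "green" and rec["confidence"] < SAFE_REFACTOR_THRESHOLD:
            # Assumption: demote to "needs review" instead of dropping the item.
            rec["severity"] = "yellow"
        kept.append(rec)
    return kept
```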

Phase 4: Deliver (main session)

Format recommendations as Telegram message and send. Archive to memory/audits/YYYY-MM-DD.json.

Recommendation format

Each recommendation:

{
  "id": "rec-001",
  "type": "refactor | new-skill | config-update | deprecate | merge",
  "severity": "green | yellow | red",
  "target": "skills/context-optimizer/SKILL.md",
  "title": "compress context-optimizer references section",
  "rationale": "...",
  "proposed_action": "...",
  "confidence": 0.87,
  "agreed_by": ["glm-5", "gpt-5.3-codex"]
}
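A lightweight check of this shape before delivery might look like the following sketch. The allowed values come from the example above; the 0 to 1 range for `confidence` is inferred from the thresholds rather than stated, and `validation_errors` is a hypothetical helper.

```python
ALLOWED_TYPES = {"refactor", "new-skill", "config-update", "deprecate", "merge"}
ALLOWED_SEVERITIES = {"green", "yellow", "red"}
REQUIRED_KEYS = {
    "id", "type", "severity", "target", "title",
    "rationale", "proposed_action", "confidence", "agreed_by",
}


def validation_errors(rec: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record looks valid."""
    errors = []
    missing = REQUIRED_KEYS - set(rec)
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    if rec.get("type") not in ALLOWED_TYPES:
        errors.append(f"unknown type: {rec.get('type')!r}")
    if rec.get("severity") not in ALLOWED_SEVERITIES:
        errors.append(f"unknown severity: {rec.get('severity')!r}")
    confidence = rec.get("confidence")
    # Assumption: confidence is a number between 0 and 1 inclusive.
    if not isinstance(confidence, (int, float)) or not 0 <= confidence <= 1:
        errors.append(f"confidence out of range: {confidence!r}")
    if not isinstance(rec.get("agreed_by"), list):
        errors.append("agreed_by must be a list")
    return errors
```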

Telegram delivery format

📋 Weekly Skill Audit — YYYY-MM-DD

🟢 Safe refactors (N):
  1. [title] → [one-line action]

🟡 Needs review (N):
  2. [title]

🔴 Informational (N):
  3. [title]

Reply with a number for details, or "approve 1,2" to greenlight.

If no strong recommendations: send "no action needed this week" one-liner.

If quality score is low across all recommendations: send nothing.
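The message layout above could be produced by a formatter along these lines. It is only a sketch of what `format_telegram.py` might do: the function name is hypothetical, empty sections are omitted by assumption, and the "send nothing" case for low overall quality is left to the caller.

```python
SECTIONS = [
    ("green", "🟢 Safe refactors"),
    ("yellow", "🟡 Needs review"),
    ("red", "🔴 Informational"),
]
HEADER_TEMPLATE = "📋 Weekly Skill Audit \u2014 {date}"  # \u2014 is the dash used in the template
FOOTER = 'Reply with a number for details, or "approve 1,2" to greenlight.'


def format_audit_message(recommendations: list[dict], date: str) -> str:
    """Group recommendations by severity and number them continuously across sections."""
    if not recommendations:
        return "no action needed this week"
    lines = [HEADER_TEMPLATE.format(date=date)]
    number = 1
    for severity, heading in SECTIONS:
        group = [r for r in recommendations if r["severity"] == severity]
        if not group:
            continue  # assumption: skip sections with no items
        lines.append("")
        lines.append(f"{heading} ({len(group)}):")
        for rec in group:
            entry = f"  {number}. {rec['title']}"
            if severity == "green":
                entry += f" → {rec['proposed_action']}"  # only green items show the action
            lines.append(entry)
            number += 1
    lines += ["", FOOTER]
    return "\n".join(lines)
```

Numbering runs continuously across the three sections so that a reply such as "approve 1,2" refers to exactly one recommendation per number.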

Scheduling

Primary: OpenClaw cron, every 7 days (Sunday 10:00 AM ET):

openclaw cron add --schedule "0 10 * * 0" --model openai-codex/gpt-5.4 --label skill-auditor-weekly --prompt "Read skills/skill-auditor/SKILL.md and execute the full audit pipeline. Deliver results to Telegram."

State tracking: memory/audits/last-run.json records last execution timestamp. Heartbeat checks if last run was >10 days ago and alerts.
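The heartbeat's staleness check might be as small as the sketch below. The layout of last-run.json is not documented, so the `timestamp` field (an ISO 8601 string with a UTC offset) is an assumption, as is treating a missing file as overdue.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional

MAX_AGE = timedelta(days=10)  # alert threshold from the scheduling notes


def audit_is_overdue(last_run_path: Path, now: Optional[datetime] = None) -> bool:
    """Return True if the last recorded audit is more than 10 days old or was never recorded."""
    now = now or datetime.now(timezone.utc)
    if not last_run_path.exists():
        return True  # assumption: no record means the audit has never run
    record = json.loads(last_run_path.read_text())
    # Assumed field: {"timestamp": "2026-05-03T14:00:00+00:00"} (timezone-aware).
    last_run = datetime.fromisoformat(record["timestamp"])
    return now - last_run > MAX_AGE
```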

Manual trigger: User says "audit skills" or "review workflow".

Evaluation criteria

Each file/skill scored on:

  1. Effectiveness — achieves stated purpose? (1-5)
  2. Token cost — bloated? shorter without losing value? (1-5)
  3. Coverage — workflow gaps not addressed by any skill? (binary + description)
  4. Freshness — last meaningful update vs relevance decay
  5. Overlap — duplicates content in another file/skill? (list pairs)
  6. Alignment — matches USER.md goals and SOUL.md persona? (1-5)

Safety rules

  • No automatic file edits. Recommendations are advisory until approved.
  • Green recommendations produce diff previews; actual changes require explicit "approve" reply.
  • Respect all workspace GitHub handling rules — no repo-visible changes without Omar's approval.
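The explicit-approval rule suggests a strict parser for replies, sketched here under assumptions: `parse_approval` is a hypothetical helper, and anything that is not literally "approve" followed by a comma-separated list of numbers is treated as no approval at all.

```python
import re

APPROVAL_PATTERN = re.compile(r"\s*approve\s+(\d+(?:\s*,\s*\d+)*)\s*", re.IGNORECASE)


def parse_approval(reply: str, valid_numbers: set[int]) -> list[int]:
    """Return the approved recommendation numbers, or [] if the reply is not an explicit approval."""
    match = APPROVAL_PATTERN.fullmatch(reply)
    if match is None:
        return []  # a bare number or free text never counts as approval
    numbers = [int(part) for part in match.group(1).split(",")]
    # Ignore numbers that do not correspond to a delivered recommendation.
    return [n for n in numbers if n in valid_numbers]
```

Because the whole reply must match, a message such as "1" (a request for details) or "looks good" can never trigger a file change by accident.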

File structure

skills/skill-auditor/
├── SKILL.md
├── scripts/
│   ├── build_audit_state.py
│   ├── merge_evaluations.py
│   └── format_telegram.py
└── tests/
    ├── test_build_audit_state.py
    ├── test_merge_evaluations.py
    └── test_format_telegram.py

Runtime artifacts (not tracked in repo):

memory/audits/
├── last-run.json
├── YYYY-MM-DD.json
└── state.json (file hashes for change detection)

Validation checklist

  1. All 3 helper scripts exist and pass unit tests.
  2. Dry-run mode completes full pipeline without sending messages.
  3. At least one real audit cycle delivers a well-formatted Telegram message.
  4. Recommendations are advisory-only (no auto-edits without approval).
  5. Unchanged files are skipped via hash comparison.
  6. Confidence thresholds are enforced.

Version history

1 version in total

  • v1.0.0-alpha (current)
    2026-05-02 09:14 · security scans: safe
