
Autoresearch

Automatically improve OpenClaw skills, prompts, or articles through iterative mutation-testing loops. Inspired by Karpathy's autoresearch. Use when user says...
0xcjl
Uncategorized clawhub v1.0.0 1 version 100000 Key: not required

autoresearch-pro

Overview

Automatically improve any OpenClaw skill, prompt, or article through iterative mutation-testing: small edits → run test cases → score with checklist → keep improvements, discard regressions.

Inspired by Karpathy/autoresearch.

Supports three optimization modes:

| Mode | Input | Output |
| --- | --- | --- |
| Skill | Path to a skill directory | Improved SKILL.md |
| Prompt | A prompt text string | Improved prompt |
| Article | An article/document text | Improved article |

Workflow

Step 1 — Identify Mode and Input

Ask the user to confirm:

  • Mode 1 — Skill: User says "optimize [skill-name]" or provides a skill path
  • Mode 2 — Prompt: User says "optimize this prompt" or pastes a prompt
  • Mode 3 — Article: User says "improve this article" or pastes article text

For Skill mode, resolve the skill path to ~/.openclaw/skills//SKILL.md.

For Prompt/Article mode, keep the text in context (do not write to disk unless needed).

Step 2 — Generate Checklist (10 Questions)

Read the target content first. Then generate 10 diverse, specific yes/no checklist questions relevant to the content type:

For Skill mode:

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Description clarity | Is the frontmatter description precise and actionable? |
| 2 | Trigger coverage | Does it cover the main real-world use cases? |
| 3 | Workflow structure | Are steps clearly sequenced and unambiguous? |
| 4 | Error guidance | Does it handle error states and edge cases? |
| 5 | Tool usage accuracy | Are tool names and parameters correct for OpenClaw? |
| 6 | Example quality | Do examples reflect real usage patterns? |
| 7 | Conciseness | Is content free of redundant repetition? |
| 8 | Freedom calibration | Is instruction specificity appropriate? |
| 9 | Reference quality | Are references and links accurate? |
| 10 | Completeness | Are all sections filled with real content? |

For Prompt mode (10 tailored questions):

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Goal clarity | Does the prompt state a clear, specific goal? |
| 2 | Role/tone | Is the desired role or tone specified? |
| 3 | Input format | Is the input format clearly described? |
| 4 | Output format | Is the expected output format specified? |
| 5 | Constraints | Are key constraints and boundaries stated? |
| 6 | Context sufficiency | Is enough context provided to avoid hallucination? |
| 7 | Edge cases | Does it handle ambiguous or edge case inputs? |
| 8 | Conciseness | Is it free of redundant or contradictory instructions? |
| 9 | Actionability | Are instructions concrete and actionable vs. vague? |
| 10 | Completeness | Are all necessary elements for the task present? |

For Article mode (10 tailored questions):

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Title quality | Does the title clearly convey the main value? |
| 2 | Opening hook | Does the opening grab attention and set expectations? |
| 3 | Logical structure | Are ideas logically organized (not random)? |
| 4 | Argument clarity | Are claims supported with evidence or reasoning? |
| 5 | Conciseness | Is unnecessary padding or repetition removed? |
| 6 | Transition flow | Do paragraphs/sections flow smoothly? |
| 7 | Closing strength | Does the conclusion summarize and inspire action? |
| 8 | Tone consistency | Is the tone consistent throughout? |
| 9 | Readability | Is sentence/paragraph length varied appropriately? |
| 10 | Audience match | Does language match the target audience level? |

Present the 10 questions, numbered 1-10. Ask the user to select which ones to activate (e.g., "use questions 1, 3, 5, 7"). Default: use all 10 if user doesn't specify.
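The selection step above can be sketched in Python. This is a minimal illustration, not part of the skill itself: `parse_selection` is a hypothetical helper name, and it assumes the user's reply contains the question numbers as plain digits.

```python
import re


def parse_selection(reply: str, total: int = 10) -> list[int]:
    """Return the checklist question numbers activated by the user's reply.

    Hypothetical helper: numbers outside 1..total are ignored, and a reply
    that names no valid question falls back to all questions (the default).
    """
    numbers = {int(n) for n in re.findall(r"\d+", reply)}
    picked = sorted(n for n in numbers if 1 <= n <= total)
    return picked or list(range(1, total + 1))


print(parse_selection("use questions 1, 3, 5, 7"))
print(parse_selection("just go ahead"))
```

A reply such as "just go ahead" contains no numbers, so the sketch activates all 10 questions, matching the stated default.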

Step 3 — Prepare Test Cases

  • Skill mode: Generate 3-5 realistic prompts a user would send when using the skill
  • Prompt mode: Generate 3-5 test inputs that the prompt would process
  • Article mode: Generate 3-5 ways the article might be read or consumed

Store test cases in context — do not write to disk.

Step 4 — Run Autoresearch Loop

Loop configuration:

  • Rounds per batch: 30
  • Max total rounds: 100
  • Pause: After every 30 rounds, show summary and ask user to continue or stop
  • Stop conditions: User says stop, OR 100 rounds completed
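To make the batch boundaries concrete, the configuration above can be expressed as a small generator. This is a sketch under the stated numbers only; the names `ROUNDS_PER_BATCH`, `MAX_TOTAL_ROUNDS`, and `batch_ranges` are illustrative and do not come from the skill.

```python
from typing import Iterator

ROUNDS_PER_BATCH = 30   # rounds between pauses
MAX_TOTAL_ROUNDS = 100  # hard stop


def batch_ranges(per_batch: int = ROUNDS_PER_BATCH,
                 max_rounds: int = MAX_TOTAL_ROUNDS) -> Iterator[tuple[int, int]]:
    """Yield inclusive (first_round, last_round) pairs, one per batch."""
    start = 1
    while start <= max_rounds:
        end = min(start + per_batch - 1, max_rounds)
        yield start, end
        start = end + 1


for first, last in batch_ranges():
    print(f"rounds {first}-{last}")
```

With these values the last batch is shorter than the others (rounds 91-100), because 100 is not a multiple of 30. The user prompt to continue or stop would sit between batches.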

Per-round procedure:

  1. Mutate: Make ONE small edit to the target content:
    • Skill mode: edit SKILL.md
    • Prompt mode: edit the prompt string
    • Article mode: edit the article text
  2. Test: For each test case, simulate what output the content would produce.
  3. Score: Apply each active checklist question (0 or 1 per question). Score = (passed / total) × 100.
  4. Decide: If new score ≥ best score → keep the mutation. If lower → revert.
  5. Log: Round number, mutation type, score, keep/revert decision.
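The per-round procedure can be sketched as a single function. This is an illustration under assumptions: in the skill, mutation and checklist evaluation are performed by the agent, so `mutate` and `evaluate` here are hypothetical callbacks, and `RoundLog`, `checklist_score`, and `run_round` are names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RoundLog:
    round_no: int
    mutation_type: str
    score: float
    kept: bool


def checklist_score(answers: list[bool]) -> float:
    """Score = (passed / total) × 100 over the active checklist questions."""
    return 100.0 * sum(answers) / len(answers)


def run_round(round_no: int, content: str, best_score: float, mutation_type: str,
              mutate: Callable[[str, str], str],
              evaluate: Callable[[str], list[bool]]) -> tuple[str, float, RoundLog]:
    """Apply one mutation, score it, and keep it only if it does not regress."""
    candidate = mutate(content, mutation_type)
    score = checklist_score(evaluate(candidate))
    kept = score >= best_score  # ties are kept, as in the Decide step
    log = RoundLog(round_no, mutation_type, score, kept)
    return (candidate, score, log) if kept else (content, best_score, log)


# Toy stand-ins for the agent's work (assumptions for the demo only).
def toy_mutate(text: str, mutation_type: str) -> str:
    return text.replace("try to", "must") if mutation_type == "D" else text


def toy_evaluate(text: str) -> list[bool]:
    return ["must" in text, len(text) < 50]


content, best, log = run_round(1, "try to answer briefly", 50.0, "D",
                               toy_mutate, toy_evaluate)
print(content, best, log)
```

Because the comparison is `>=`, a mutation that leaves the score unchanged is still kept, so the content can drift even when the score plateaus.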

Mutation types (pick one per round):

| Type | Description |
| --- | --- |
| A | Add a constraint rule |
| B | Strengthen trigger/coverage |
| C | Add a concrete example |
| D | Tighten vague language |
| E | Improve error/edge case handling |
| F | Remove redundant content |
| G | Improve transitions |
| H | Expand a thin section |
| I | Add cross-reference |
| J | Adjust degree-of-freedom |
Step 5 — Report Results

After each batch (30 rounds):

Batch N (rounds X-Y):
  Best score: XX%
  Mutations kept: N  |  Reverted: N
  Most effective types: [list top 2-3]
Accumulated improvements: [summary]
Continue? (yes/stop)

After full completion:

  • Original score vs. final score
  • Top 3 most impactful mutations
  • Final improved content (inline or diff)
  • File path (skill mode only)
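The batch summary template above can be filled from the round log. The sketch below is one possible approach, with two assumptions: log entries are `(round_no, mutation_type, score, kept)` tuples, and "most effective types" is read as the mutation types kept most often. `batch_summary` is a hypothetical name.

```python
from collections import Counter


def batch_summary(batch_no: int, log: list[tuple[int, str, float, bool]],
                  best_score: float) -> str:
    """Render the per-batch report from (round_no, type, score, kept) entries."""
    kept = [entry for entry in log if entry[3]]
    # Assumption: "most effective" means kept most often within the batch.
    top_types = [t for t, _ in Counter(e[1] for e in kept).most_common(3)]
    return "\n".join([
        f"Batch {batch_no} (rounds {log[0][0]}-{log[-1][0]}):",
        f"  Best score: {best_score:.0f}%",
        f"  Mutations kept: {len(kept)}  |  Reverted: {len(log) - len(kept)}",
        f"  Most effective types: {', '.join(top_types) or 'none'}",
    ])


demo_log = [(1, "D", 60.0, True), (2, "A", 50.0, False),
            (3, "D", 70.0, True), (4, "C", 70.0, True)]
print(batch_summary(1, demo_log, best_score=70.0))
```

The "Accumulated improvements" line and the continue/stop question are left out, since they depend on a free-text summary written by the agent.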

Mutation Strategy Reference

High-impact, low-risk changes:

  • Adding explicit constraints where the content is vague
  • Expanding coverage to cover edge cases
  • Adding concrete examples to abstract instructions
  • Tightening soft language ("try to" → "must")

Avoid in one round:

  • Large rewrites of entire sections
  • Multiple unrelated changes at once
  • Changing fundamental scope or purpose

See references/mutation_strategies.md for the full strategy guide.


Mode Selection Quick Reference

| User says | Mode |
| --- | --- |
| "optimize [skill]" / "autoresearch [skill]" | Skill |
| "optimize this prompt" / "improve my prompt" | Prompt |
| "polish this article" / "improve this article" | Article |
| "optimize this document" | Article |

Default to Prompt mode if the input is a text string without a skill path.
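As a rough illustration, the quick reference and its default can be approximated with keyword checks. This is only a heuristic sketch: `select_mode` and the `has_skill_path` flag are hypothetical, and in practice the agent still confirms the mode with the user as described in Step 1.

```python
def select_mode(message: str, has_skill_path: bool = False) -> str:
    """Guess the optimization mode from the user's message (keyword heuristic)."""
    text = message.lower().strip()
    if has_skill_path:
        return "skill"
    if "prompt" in text:
        return "prompt"
    if "article" in text or "document" in text:
        return "article"
    # "optimize [skill]" / "autoresearch [skill]" with no other cue
    if text.startswith(("optimize", "autoresearch")):
        return "skill"
    # Default: a bare text string without a skill path is treated as a prompt.
    return "prompt"


for msg in ["optimize this prompt", "polish this article",
            "optimize this document", "autoresearch my-skill",
            "You are a helpful assistant."]:
    print(f"{msg!r} -> {select_mode(msg)}")
```

Keyword matching misfires on pasted text that happens to contain words like "article", which is one reason the confirmation step matters.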

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-03 10:15, security status: safe

Security Checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
