Overview

Prompt A/B Lab

Purpose

Design, log, compare, and score prompt experiments so users can systematically improve outputs instead of guessing.

Trigger phrases

比较两个提示词 (compare two prompts)
prompt ab test
提示词实验 (prompt experiment)
哪个 prompt 更好 (which prompt is better)
建一个评测表 (build an evaluation table)

Ask for these inputs

prompt A and B
task
evaluation criteria
test set
weights if any

Workflow

Define what success looks like before comparing prompts.
Generate an evaluation rubric and structured test table.
Log outputs per test case and compute weighted scores.
Summarize tradeoffs instead of declaring a winner too early.
Recommend the next experiment iteration.
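
The scoring step in the workflow above can be sketched as follows. This is a minimal illustration, not the logic of the bundled script: the function names, the criteria, the weights, and the 1-5 scores are all hypothetical, and weights are assumed to be normalized by their sum.

```python
from collections import defaultdict


def weighted_score(scores, weights):
    """Return the weighted mean of per-criterion scores.

    Weights are divided by their sum, so they need not add up to 1.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for criteria: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[c] * w for c, w in weights.items()) / total_weight


def compare_prompts(results, weights):
    """Average each variant's weighted score over all test cases.

    `results` is a list of (case_id, variant, scores) tuples.
    """
    per_variant = defaultdict(list)
    for _case_id, variant, scores in results:
        per_variant[variant].append(weighted_score(scores, weights))
    return {v: sum(vals) / len(vals) for v, vals in per_variant.items()}


# Hypothetical rubric weights and human-assigned scores, for illustration only.
weights = {"accuracy": 0.5, "clarity": 0.3, "brevity": 0.2}
results = [
    ("case-1", "A", {"accuracy": 4, "clarity": 3, "brevity": 5}),
    ("case-1", "B", {"accuracy": 5, "clarity": 4, "brevity": 2}),
    ("case-2", "A", {"accuracy": 3, "clarity": 4, "brevity": 4}),
    ("case-2", "B", {"accuracy": 4, "clarity": 4, "brevity": 3}),
]
summary = compare_prompts(results, weights)
for variant, score in sorted(summary.items()):
    print(f"{variant}: {score:.2f}")
```

In this made-up data, B scores higher overall but lower on brevity, which is the kind of tradeoff worth reporting alongside the totals rather than collapsing into a single winner.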

Output contract

experiment plan
scored comparison table
rubric
next-iteration suggestions

Files in this skill

Script: {baseDir}/scripts/prompt_experiment_logger.py
Resource: {baseDir}/resources/eval_rubric.md

Operating rules

Be concrete and action-oriented.
Prefer preview / draft / simulation mode before destructive changes.
If information is missing, ask only for the minimum needed to proceed.
Never fabricate metrics, legal certainty, receipts, credentials, or evidence.
Keep assumptions explicit.

Suggested prompts

比较两个提示词 (compare two prompts)
prompt ab test
提示词实验 (prompt experiment)

Use of script and resources

Use the bundled script when it helps the user produce a structured file, manifest, CSV, or first-pass draft.

Use the resource file as the default schema, checklist, or preset when the user does not provide one.
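
As a rough idea of what a structured CSV log of an experiment could look like, here is a short sketch using only the Python standard library. It does not reproduce scripts/prompt_experiment_logger.py, whose interface is not described here; the column names and the `log_scores` and `read_scores` helpers are assumptions made for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column layout; the bundled script may use a different one.
FIELDS = ["case_id", "variant", "criterion", "score", "notes"]


def log_scores(path, rows):
    """Append score rows to a CSV file, writing the header only once."""
    path = Path(path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)


def read_scores(path):
    """Read the log back as a list of dicts (all values are strings)."""
    with Path(path).open(newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


if __name__ == "__main__":
    # Write to a temporary directory so the demo leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "experiment_log.csv"
        log_scores(log_path, [
            {"case_id": "case-1", "variant": "A", "criterion": "accuracy",
             "score": 4, "notes": "missed one edge case"},
            {"case_id": "case-1", "variant": "B", "criterion": "accuracy",
             "score": 5, "notes": ""},
        ])
        for row in read_scores(log_path):
            print(row["case_id"], row["variant"], row["criterion"], row["score"])
```

One row per test case, variant, and criterion keeps the log append-only and easy to aggregate later; scores come back as strings and need converting before any arithmetic.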

Boundaries

This skill supports planning, structuring, and first-pass artifacts.
It should not claim that files were modified, messages were sent, or legal/financial decisions were finalized unless the user actually performed those actions.

Compatibility notes

Directory-based AgentSkills/OpenClaw skill.
Runtime dependency declared through metadata.openclaw.requires.
Helper script is local and auditable: scripts/prompt_experiment_logger.py.
Bundled resource is local and referenced by the instructions: resources/eval_rubric.md.

Version history

1 version in total

v1.0.0 (current)

2026-03-30 03:30 Safe

Security scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)